view COG/cddid.tbl @ 9:24676ef2945d draft

Uploaded
author dereeper
date Thu, 30 May 2024 16:18:04 +0000
parents e42d30da7a74

214330	CHL00001	rpoB	RNA polymerase beta subunit	1070
214331	CHL00002	matK	maturase K	504
176948	CHL00003	psbA	photosystem II protein D1	338
176949	CHL00004	psbD	photosystem II protein D2	353
176950	CHL00005	rps16	ribosomal protein S16	82
176951	CHL00008	petG	cytochrome b6/f complex subunit V	37
176952	CHL00009	petN	cytochrome b6/f complex subunit VIII	29
214332	CHL00010	infA	translation initiation factor 1	78
176954	CHL00011	ndhD	NADH dehydrogenase subunit 4	498
176955	CHL00012	ndhJ	NADH dehydrogenase subunit J	158
214333	CHL00013	rpoA	RNA polymerase alpha subunit	327
214334	CHL00014	ndhI	NADH dehydrogenase subunit I	167
176958	CHL00015	ndhE	NADH dehydrogenase subunit 4L	101
214335	CHL00016	ndhG	NADH dehydrogenase subunit 6	182
176960	CHL00017	ndhH	NADH dehydrogenase subunit 7	393
214336	CHL00018	rpoC1	RNA polymerase beta' subunit	663
176962	CHL00019	atpF	ATP synthase CF0 B subunit	184
176963	CHL00020	psbN	photosystem II protein N	43
176964	CHL00022	ndhC	NADH dehydrogenase subunit 3	120
214337	CHL00023	ndhK	NADH dehydrogenase subunit K	225
176966	CHL00024	psbI	photosystem II protein I	36
214338	CHL00025	ndhF	NADH dehydrogenase subunit 5	741
214339	CHL00027	rps15	ribosomal protein S15	90
214340	CHL00028	clpP	ATP-dependent Clp protease proteolytic subunit	200
176970	CHL00029	rpl36	ribosomal protein L36	26
176971	CHL00030	rpl23	ribosomal protein L23	93
176972	CHL00031	psbT	photosystem II protein T	33
214341	CHL00032	ndhA	NADH dehydrogenase subunit 1	363
176974	CHL00033	ycf3	photosystem I assembly protein Ycf3	168
214342	CHL00034	rpl22	ribosomal protein L22	117
214343	CHL00035	psbC	photosystem II 44 kDa protein	473
176977	CHL00036	ycf4	photosystem I assembly protein Ycf4	184
176978	CHL00037	petA	cytochrome f	320
176979	CHL00038	psbL	photosystem II protein L	38
176980	CHL00039	psbF	photosystem II protein VI	39
176981	CHL00040	rbcL	ribulose-1,5-bisphosphate carboxylase/oxygenase large subunit	475
176982	CHL00041	rps11	ribosomal protein S11	116
214344	CHL00042	rps8	ribosomal protein S8	132
214345	CHL00043	cemA	envelope membrane protein	261
176985	CHL00044	rpl16	ribosomal protein L16	135
214346	CHL00045	ccsA	cytochrome c biogenesis protein	319
176987	CHL00046	atpI	ATP synthase CF0 A subunit	228
214347	CHL00047	psbK	photosystem II protein K	58
214348	CHL00048	rps3	ribosomal protein S3	214
176990	CHL00049	ndhB	NADH dehydrogenase subunit 2	494
176991	CHL00050	rps19	ribosomal protein S19	92
176992	CHL00051	rps12	ribosomal protein S12	123
176993	CHL00052	rpl2	ribosomal protein L2	273
176994	CHL00053	rps7	ribosomal protein S7	155
176995	CHL00054	psaB	photosystem I P700 chlorophyll a apoprotein A2	734
176996	CHL00056	psaA	photosystem I P700 chlorophyll a apoprotein A1	750
176997	CHL00057	rpl14	ribosomal protein L14	122
176998	CHL00058	petD	cytochrome b6/f complex subunit IV	160
176999	CHL00059	atpA	ATP synthase CF1 alpha subunit	485
214349	CHL00060	atpB	ATP synthase CF1 beta subunit	494
177001	CHL00061	atpH	ATP synthase CF0 C subunit	81
214350	CHL00062	psbB	photosystem II 47 kDa protein	504
214351	CHL00063	atpE	ATP synthase CF1 epsilon subunit	134
177004	CHL00064	psbE	photosystem II protein V	83
177005	CHL00065	psaC	photosystem I subunit VII	81
177006	CHL00066	psbH	photosystem II protein H	73
177007	CHL00067	rps2	ribosomal protein S2	230
214352	CHL00068	rpl20	ribosomal protein L20	115
177009	CHL00070	petB	cytochrome b6	215
177010	CHL00071	tufA	elongation factor Tu	409
177011	CHL00072	chlL	protochlorophyllide reductase subunit L	290
214353	CHL00073	chlN	protochlorophyllide reductase subunit N	457
214354	CHL00074	rps14	ribosomal protein S14	100
177014	CHL00075	rpl21	ribosomal protein L21	108
214355	CHL00076	chlB	protochlorophyllide reductase subunit B	513
177016	CHL00077	rps18	ribosomal protein S18	86
214356	CHL00078	rpl5	ribosomal protein L5	181
214357	CHL00079	rps9	ribosomal protein S9	130
177019	CHL00080	psbM	photosystem II protein M	34
177020	CHL00081	chlI	Mg-protoporphyrin IX chelatase	350
177021	CHL00082	psbZ	photosystem II protein Z	62
214358	CHL00083	rpl12	ribosomal protein L12	131
177023	CHL00084	rpl19	ribosomal protein L19	117
214359	CHL00085	ycf24	putative ABC transporter	485
164492	CHL00086	apcA	allophycocyanin alpha subunit	161
164493	CHL00088	apcB	allophycocyanin beta subunit	161
100206	CHL00089	apcF	allophycocyanin beta 18 subunit	169
164494	CHL00090	apcD	allophycocyanin gamma subunit	161
164495	CHL00091	apcE	phycobilisome linker protein	877
177025	CHL00093	groEL	chaperonin GroEL	529
214360	CHL00094	dnaK	heat shock protein 70	621
214361	CHL00095	clpC	Clp protease ATP binding subunit	821
214362	CHL00098	tsf	elongation factor Ts	200
214363	CHL00099	ilvB	acetohydroxyacid synthase large subunit	585
214364	CHL00100	ilvH	acetohydroxyacid synthase small subunit	174
214365	CHL00101	trpG	anthranilate synthase component 2	190
214366	CHL00102	rps20	ribosomal protein S20	93
214367	CHL00103	rpl35	ribosomal protein L35	65
177033	CHL00104	rpl33	ribosomal protein L33	66
177034	CHL00105	psaJ	photosystem I subunit IX	42
177035	CHL00106	petL	cytochrome b6/f complex subunit VI	31
177036	CHL00108	psbJ	photosystem II protein J	40
177037	CHL00112	rpl28	ribosomal protein L28; Provisional	63
177038	CHL00113	rps4	ribosomal protein S4; Reviewed	201
100224	CHL00114	psbX	photosystem II protein X; Reviewed	39
177039	CHL00115	rpl34	ribosomal protein L34; Reviewed	46
214368	CHL00117	rpoC2	RNA polymerase beta'' subunit; Reviewed	1364
214369	CHL00118	atpG	ATP synthase CF0 B' subunit; Validated	156
177042	CHL00119	atpD	ATP synthase CF1 delta subunit; Validated	184
177043	CHL00120	psaL	photosystem I subunit XI; Validated	143
214370	CHL00121	rpl27	ribosomal protein L27; Reviewed	86
214371	CHL00122	secA	preprotein translocase subunit SecA; Validated	870
177046	CHL00123	rps6	ribosomal protein S6; Validated	97
177047	CHL00124	acpP	acyl carrier protein; Validated	82
177048	CHL00125	psaE	photosystem I subunit IV; Reviewed	64
177049	CHL00127	rpl11	ribosomal protein L11; Validated	140
177050	CHL00128	psbW	photosystem II protein W; Reviewed	113
177051	CHL00129	rpl1	ribosomal protein L1; Reviewed	229
177052	CHL00130	rbcS	ribulose-1,5-bisphosphate carboxylase/oxygenase small subunit; Reviewed	138
214372	CHL00131	ycf16	sulfate ABC transporter protein; Validated	252
177054	CHL00132	psaF	photosystem I subunit III; Validated	185
177055	CHL00133	psbV	photosystem II cytochrome c550; Validated	163
177056	CHL00134	petF	ferredoxin; Validated	99
177057	CHL00135	rps10	ribosomal protein S10; Validated	101
177058	CHL00136	rpl31	ribosomal protein L31; Validated	68
177059	CHL00137	rps13	ribosomal protein S13; Validated	122
177060	CHL00138	rps5	ribosomal protein S5; Validated	143
214373	CHL00139	rpl18	ribosomal protein L18; Validated	109
177062	CHL00140	rpl6	ribosomal protein L6; Validated	178
214374	CHL00141	rpl24	ribosomal protein L24; Validated	83
177064	CHL00142	rps17	ribosomal protein S17; Validated	84
177065	CHL00143	rpl3	ribosomal protein L3; Validated	207
177066	CHL00144	odpB	pyruvate dehydrogenase E1 component beta subunit; Validated	327
177067	CHL00145	psaD	photosystem I subunit II; Validated	139
214375	CHL00147	rpl4	ribosomal protein L4; Validated	215
214376	CHL00148	orf27	Ycf27; Reviewed	240
177069	CHL00149	odpA	pyruvate dehydrogenase E1 component alpha subunit; Reviewed	341
164542	CHL00151	preA	prenyl transferase; Reviewed	323
214377	CHL00152	rpl32	ribosomal protein L32; Validated	53
177071	CHL00154	rpl29	ribosomal protein L29; Validated	67
177072	CHL00159	rpl13	ribosomal protein L13; Validated	143
214378	CHL00160	rpl9	ribosomal protein L9; Provisional	153
214379	CHL00161	secY	preprotein translocase subunit SecY; Validated	417
214380	CHL00162	thiG	thiamin biosynthesis protein G; Validated	267
214381	CHL00163	ycf65	putative ribosomal protein 3; Validated	99
164550	CHL00164	psaK	photosystem I subunit X; Validated	86
214382	CHL00165	ftrB	ferredoxin thioreductase subunit beta; Validated	116
214383	CHL00168	pbsA	heme oxygenase; Provisional	238
100270	CHL00170	cpcA	phycocyanin alpha subunit; Reviewed	162
100271	CHL00171	cpcB	phycocyanin beta subunit; Reviewed	172
133617	CHL00172	cpeB	phycoerythrin beta subunit; Provisional	177
100273	CHL00173	cpeA	phycoerythrin alpha subunit; Provisional	164
214384	CHL00174	accD	acetyl-CoA carboxylase beta subunit; Reviewed	296
214385	CHL00175	minD	septum-site determining protein; Validated	281
214386	CHL00176	ftsH	cell division protein; Validated	638
214387	CHL00177	ccs1	c-type cytochrome biogenesis protein; Validated	426
177082	CHL00180	rbcR	LysR transcriptional regulator; Provisional	305
177083	CHL00181	cbbX	CbbX; Provisional	287
177084	CHL00182	tatC	Sec-independent translocase component C; Provisional	249
177085	CHL00183	petJ	cytochrome c553; Provisional	108
177086	CHL00184	ycf12	Ycf12; Provisional	33
177087	CHL00185	ycf59	magnesium-protoporphyrin IX monomethyl ester cyclase; Provisional	351
177088	CHL00186	psaI	photosystem I subunit VIII; Validated	36
214388	CHL00187	cysT	sulfate transport protein; Provisional	237
214389	CHL00188	hisH	imidazole glycerol phosphate synthase subunit hisH; Provisional	210
177089	CHL00189	infB	translation initiation factor 2; Provisional	742
177090	CHL00190	psaM	photosystem I subunit XII; Provisional	30
214390	CHL00191	ycf61	DNA-directed RNA polymerase subunit omega; Provisional	76
214391	CHL00192	syfB	phenylalanyl-tRNA synthetase beta chain; Provisional	704
177092	CHL00193	ycf35	Ycf35; Provisional	128
177093	CHL00194	ycf39	Ycf39; Provisional	317
177094	CHL00195	ycf46	Ycf46; Provisional	489
177095	CHL00196	psbY	photosystem II protein Y; Provisional	36
214392	CHL00197	carA	carbamoyl-phosphate synthase arginine-specific small subunit; Provisional	382
214393	CHL00198	accA	acetyl-CoA carboxylase carboxyltransferase alpha subunit; Provisional	322
164575	CHL00199	infC	translation initiation factor 3; Provisional	182
214394	CHL00200	trpA	tryptophan synthase alpha subunit; Provisional	263
164576	CHL00201	syh	histidine-tRNA synthetase; Provisional	430
133644	CHL00202	argB	acetylglutamate kinase; Provisional	284
164577	CHL00203	fabH	3-oxoacyl-acyl-carrier-protein synthase 3; Provisional	326
214395	CHL00204	ycf1	Ycf1; Provisional	1832
214396	CHL00206	ycf2	Ycf2; Provisional	2281
214397	CHL00207	rpoB	RNA polymerase beta subunit; Provisional	1077
223080	COG0001	HemL	Glutamate-1-semialdehyde aminotransferase [Coenzyme transport and metabolism]. 	432
223081	COG0002	ArgC	N-acetyl-gamma-glutamylphosphate reductase [Amino acid transport and metabolism]. 	349
223082	COG0003	ArsA	Anion-transporting ATPase, ArsA/GET3 family [Inorganic ion transport and metabolism]. 	322
223083	COG0004	AmtB	Ammonia channel protein AmtB [Inorganic ion transport and metabolism]. 	409
223084	COG0005	XapA	Purine nucleoside phosphorylase [Nucleotide transport and metabolism]. 	262
223085	COG0006	PepP	Xaa-Pro aminopeptidase [Amino acid transport and metabolism]. 	384
223086	COG0007	CysG	Uroporphyrinogen-III methylase (siroheme synthase) [Coenzyme transport and metabolism]. 	244
223087	COG0008	GlnS	Glutamyl- or glutaminyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 	472
223088	COG0009	SUA5	tRNA A37 threonylcarbamoyladenosine synthetase subunit TsaC/SUA5/YrdC [Translation, ribosomal structure and biogenesis]. 	211
223089	COG0010	SpeB	Arginase family enzyme [Amino acid transport and metabolism]. 	305
223090	COG0011	YqgV	Uncharacterized conserved protein YqgV, UPF0045/DUF77 family [Function unknown]. 	100
223091	COG0012	GTP1	Ribosome-binding ATPase YchF, GTP1/OBG family [Translation, ribosomal structure and biogenesis]. 	372
223092	COG0013	AlaS	Alanyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 	879
223093	COG0014	ProA	Gamma-glutamyl phosphate reductase [Amino acid transport and metabolism]. 	417
223094	COG0015	PurB	Adenylosuccinate lyase [Nucleotide transport and metabolism]. 	438
223095	COG0016	PheS	Phenylalanyl-tRNA synthetase alpha subunit [Translation, ribosomal structure and biogenesis]. 	335
223096	COG0017	AsnS	Aspartyl/asparaginyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 	435
223097	COG0018	ArgS	Arginyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 	577
223098	COG0019	LysA	Diaminopimelate decarboxylase [Amino acid transport and metabolism]. 	394
223099	COG0020	UppS	Undecaprenyl pyrophosphate synthase [Lipid transport and metabolism]. 	245
223100	COG0021	TktA	Transketolase [Carbohydrate transport and metabolism]. 	663
223101	COG0022	AcoB	Pyruvate/2-oxoglutarate/acetoin dehydrogenase complex, dehydrogenase (E1) component [Energy production and conversion]. 	324
223102	COG0023	SUI1	Translation initiation factor 1 (eIF-1/SUI1) [Translation, ribosomal structure and biogenesis]. 	104
223103	COG0024	Map	Methionine aminopeptidase [Translation, ribosomal structure and biogenesis]. 	255
223104	COG0025	NhaP	NhaP-type Na+/H+ or K+/H+ antiporter [Inorganic ion transport and metabolism]. 	429
223105	COG0026	PurK	Phosphoribosylaminoimidazole carboxylase (NCAIR synthetase) [Nucleotide transport and metabolism]. 	375
223106	COG0027	PurT	Formate-dependent phosphoribosylglycinamide formyltransferase (GAR transformylase) [Nucleotide transport and metabolism]. 	394
223107	COG0028	IlvB	Acetolactate synthase large subunit or other thiamine pyrophosphate-requiring enzyme [Amino acid transport and metabolism, Coenzyme transport and metabolism]. 	550
223108	COG0029	NadB	Aspartate oxidase [Coenzyme transport and metabolism]. 	518
223109	COG0030	RsmA	16S rRNA A1518 and A1519 N6-dimethyltransferase RsmA/KsgA/DIM1 (may also have DNA glycosylase/AP lyase activity)  [Translation, ribosomal structure and biogenesis]. 	259
223110	COG0031	CysK	Cysteine synthase [Amino acid transport and metabolism]. 	300
223111	COG0033	Pgm	Phosphoglucomutase [Carbohydrate transport and metabolism]. 	524
223112	COG0034	PurF	Glutamine phosphoribosylpyrophosphate amidotransferase [Nucleotide transport and metabolism]. 	470
223113	COG0035	Upp	Uracil phosphoribosyltransferase [Nucleotide transport and metabolism]. 	210
223114	COG0036	Rpe	Pentose-5-phosphate-3-epimerase [Carbohydrate transport and metabolism]. 	220
223115	COG0037	TilS	tRNA(Ile)-lysidine synthase TilS/MesJ [Translation, ribosomal structure and biogenesis]. 	298
223116	COG0038	ClcA	H+/Cl- antiporter ClcA [Inorganic ion transport and metabolism]. 	443
223117	COG0039	Mdh	Malate/lactate dehydrogenase [Energy production and conversion]. 	313
223118	COG0040	HisG	ATP phosphoribosyltransferase [Amino acid transport and metabolism]. 	290
223119	COG0041	PurE	Phosphoribosylcarboxyaminoimidazole (NCAIR) mutase [Nucleotide transport and metabolism]. 	162
223120	COG0042	DusA	tRNA-dihydrouridine synthase [Translation, ribosomal structure and biogenesis]. 	323
223121	COG0043	UbiD	3-polyprenyl-4-hydroxybenzoate decarboxylase [Coenzyme transport and metabolism]. 	477
223122	COG0044	AllB	Dihydroorotase or related cyclic amidohydrolase [Nucleotide transport and metabolism]. 	430
223123	COG0045	SucC	Succinyl-CoA synthetase, beta subunit [Energy production and conversion]. 	387
223124	COG0046	PurL1	Phosphoribosylformylglycinamidine (FGAM) synthase, synthetase domain [Nucleotide transport and metabolism]. 	743
223125	COG0047	PurL2	Phosphoribosylformylglycinamidine (FGAM) synthase, glutamine amidotransferase domain [Nucleotide transport and metabolism]. 	231
223126	COG0048	RpsL	Ribosomal protein S12 [Translation, ribosomal structure and biogenesis]. 	129
223127	COG0049	RpsG	Ribosomal protein S7 [Translation, ribosomal structure and biogenesis]. 	148
223128	COG0050	TufB	Translation elongation factor EF-Tu, a GTPase [Translation, ribosomal structure and biogenesis]. 	394
223129	COG0051	RpsJ	Ribosomal protein S10 [Translation, ribosomal structure and biogenesis]. 	104
223130	COG0052	RpsB	Ribosomal protein S2 [Translation, ribosomal structure and biogenesis]. 	252
223131	COG0053	FieF	Divalent metal cation (Fe/Co/Zn/Cd) transporter [Inorganic ion transport and metabolism]. 	304
223132	COG0054	RibE	6,7-dimethyl-8-ribityllumazine synthase (Riboflavin synthase beta chain) [Coenzyme transport and metabolism]. 	152
223133	COG0055	AtpD	FoF1-type ATP synthase, beta subunit [Energy production and conversion]. 	468
223134	COG0056	AtpA	FoF1-type ATP synthase, alpha subunit [Energy production and conversion]. 	504
223135	COG0057	GapA	Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase [Carbohydrate transport and metabolism]. 	335
223136	COG0058	GlgP	Glucan phosphorylase [Carbohydrate transport and metabolism]. 	750
223137	COG0059	IlvC	Ketol-acid reductoisomerase [Amino acid transport and metabolism, Coenzyme transport and metabolism]. 	338
223138	COG0060	IleS	Isoleucyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 	933
223139	COG0061	NadF	NAD kinase [Nucleotide transport and metabolism]. 	281
223140	COG0062	Nnr1	NAD(P)H-hydrate repair enzyme Nnr, NAD(P)H-hydrate epimerase domain [Nucleotide transport and metabolism]. 	203
223141	COG0063	Nnr2	NAD(P)H-hydrate repair enzyme Nnr, NAD(P)H-hydrate dehydratase domain  [Nucleotide transport and metabolism]. 	284
223142	COG0064	GatB	Asp-tRNAAsn/Glu-tRNAGln amidotransferase B subunit [Translation, ribosomal structure and biogenesis]. 	483
223143	COG0065	LeuC	Homoaconitase/3-isopropylmalate dehydratase large subunit [Amino acid transport and metabolism]. 	423
223144	COG0066	LeuD	3-isopropylmalate dehydratase small subunit [Amino acid transport and metabolism]. 	191
223145	COG0067	GltB1	Glutamate synthase domain 1 [Amino acid transport and metabolism]. 	371
223146	COG0068	HypF	Hydrogenase maturation factor HypF (carbamoyltransferase) [Posttranslational modification, protein turnover, chaperones]. 	750
223147	COG0069	GltB2	Glutamate synthase domain 2 [Amino acid transport and metabolism]. 	485
223148	COG0070	GltB3	Glutamate synthase domain 3 [Amino acid transport and metabolism]. 	301
223149	COG0071	IbpA	Molecular chaperone IbpA, HSP20 family [Posttranslational modification, protein turnover, chaperones]. 	146
223150	COG0072	PheT	Phenylalanyl-tRNA synthetase beta subunit [Translation, ribosomal structure and biogenesis]. 	650
223151	COG0073	EMAP	tRNA-binding EMAP/Myf domain [Translation, ribosomal structure and biogenesis]. 	123
223152	COG0074	SucD	Succinyl-CoA synthetase, alpha subunit [Energy production and conversion]. 	293
223153	COG0075	PucG	Archaeal aspartate aminotransferase or a related aminotransferase, includes purine catabolism protein PucG [Amino acid transport and metabolism, Nucleotide transport and metabolism]. 	383
223154	COG0076	GadA	Glutamate or tyrosine decarboxylase or a related PLP-dependent protein [Amino acid transport and metabolism]. 	460
223155	COG0077	PheA2	Prephenate dehydratase [Amino acid transport and metabolism]. 	279
223156	COG0078	ArgF	Ornithine carbamoyltransferase [Amino acid transport and metabolism]. 	310
223157	COG0079	HisC	Histidinol-phosphate/aromatic aminotransferase or cobyric acid decarboxylase [Amino acid transport and metabolism]. 	356
223158	COG0080	RplK	Ribosomal protein L11 [Translation, ribosomal structure and biogenesis]. 	141
223159	COG0081	RplA	Ribosomal protein L1 [Translation, ribosomal structure and biogenesis]. 	228
223160	COG0082	AroC	Chorismate synthase [Amino acid transport and metabolism]. 	369
223161	COG0083	ThrB	Homoserine kinase [Amino acid transport and metabolism]. 	299
223162	COG0084	TatD	Tat protein secretion system quality control protein TatD (DNase activity)  [Cell motility]. 	256
223163	COG0085	RpoB	DNA-directed RNA polymerase, beta subunit/140 kD subunit [Transcription]. 	1060
223164	COG0086	RpoC	DNA-directed RNA polymerase, beta' subunit/160 kD subunit [Transcription]. 	808
223165	COG0087	RplC	Ribosomal protein L3 [Translation, ribosomal structure and biogenesis]. 	218
223166	COG0088	RplD	Ribosomal protein L4 [Translation, ribosomal structure and biogenesis]. 	214
223167	COG0089	RplW	Ribosomal protein L23 [Translation, ribosomal structure and biogenesis]. 	94
223168	COG0090	RplB	Ribosomal protein L2 [Translation, ribosomal structure and biogenesis]. 	275
223169	COG0091	RplV	Ribosomal protein L22 [Translation, ribosomal structure and biogenesis]. 	120
223170	COG0092	RpsC	Ribosomal protein S3 [Translation, ribosomal structure and biogenesis]. 	233
223171	COG0093	RplN	Ribosomal protein L14 [Translation, ribosomal structure and biogenesis]. 	122
223172	COG0094	RplE	Ribosomal protein L5 [Translation, ribosomal structure and biogenesis]. 	180
223173	COG0095	LplA	Lipoate-protein ligase A [Coenzyme transport and metabolism]. 	248
223174	COG0096	RpsH	Ribosomal protein S8 [Translation, ribosomal structure and biogenesis]. 	132
223175	COG0097	RplF	Ribosomal protein L6P/L9E [Translation, ribosomal structure and biogenesis]. 	178
223176	COG0098	RpsE	Ribosomal protein S5 [Translation, ribosomal structure and biogenesis]. 	181
223177	COG0099	RpsM	Ribosomal protein S13 [Translation, ribosomal structure and biogenesis]. 	121
223178	COG0100	RpsK	Ribosomal protein S11 [Translation, ribosomal structure and biogenesis]. 	129
223179	COG0101	TruA	tRNA U38,U39,U40 pseudouridine synthase TruA [Translation, ribosomal structure and biogenesis]. 	266
223180	COG0102	RplM	Ribosomal protein L13 [Translation, ribosomal structure and biogenesis]. 	148
223181	COG0103	RpsI	Ribosomal protein S9 [Translation, ribosomal structure and biogenesis]. 	130
223182	COG0104	PurA	Adenylosuccinate synthase [Nucleotide transport and metabolism]. 	430
223183	COG0105	Ndk	Nucleoside diphosphate kinase [Nucleotide transport and metabolism]. 	135
223184	COG0106	HisA	Phosphoribosylformimino-5-aminoimidazole carboxamide ribonucleotide (ProFAR) isomerase [Amino acid transport and metabolism]. 	241
223185	COG0107	HisF	Imidazole glycerol phosphate synthase subunit HisF [Amino acid transport and metabolism]. 	256
223186	COG0108	RibB	3,4-dihydroxy-2-butanone 4-phosphate synthase [Coenzyme transport and metabolism]. 	203
223187	COG0109	CyoE	Polyprenyltransferase (heme O synthase) [Coenzyme transport and metabolism, Lipid transport and metabolism]. 	304
223188	COG0110	WbbJ	Acetyltransferase (isoleucine patch superfamily) [General function prediction only]. 	190
223189	COG0111	SerA	Phosphoglycerate dehydrogenase or related dehydrogenase [Coenzyme transport and metabolism, General function prediction only]. 	324
223190	COG0112	GlyA	Glycine/serine hydroxymethyltransferase [Amino acid transport and metabolism]. 	413
223191	COG0113	HemB	Delta-aminolevulinic acid dehydratase, porphobilinogen synthase [Coenzyme transport and metabolism]. 	330
223192	COG0114	FumC	Fumarate hydratase class II [Energy production and conversion]. 	462
223193	COG0115	IlvE	Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase [Amino acid transport and metabolism, Coenzyme transport and metabolism]. 	284
223194	COG0116	RlmL	23S rRNA G2445 N2-methylase RlmL [Translation, ribosomal structure and biogenesis]. 	381
223195	COG0117	RibD1	Pyrimidine deaminase domain of riboflavin biosynthesis protein RibD [Coenzyme transport and metabolism]. 	146
223196	COG0118	HisH	Imidazoleglycerol phosphate synthase glutamine amidotransferase subunit HisH [Amino acid transport and metabolism]. 	204
223197	COG0119	LeuA	Isopropylmalate/homocitrate/citramalate synthases [Amino acid transport and metabolism]. 	409
223198	COG0120	RpiA	Ribose 5-phosphate isomerase [Carbohydrate transport and metabolism]. 	227
223199	COG0121	YafJ	Predicted glutamine amidotransferase [General function prediction only]. 	252
223200	COG0122	AlkA	3-methyladenine DNA glycosylase/8-oxoguanine DNA glycosylase [Replication, recombination and repair]. 	285
223201	COG0123	AcuC	Acetoin utilization deacetylase AcuC or a related deacetylase [Chromatin structure and dynamics, Secondary metabolites biosynthesis, transport and catabolism]. 	340
223202	COG0124	HisS	Histidyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 	429
223203	COG0125	Tmk	Thymidylate kinase [Nucleotide transport and metabolism]. 	208
223204	COG0126	Pgk	3-phosphoglycerate kinase [Carbohydrate transport and metabolism]. 	395
223205	COG0127	RdgB	Inosine/xanthosine triphosphate pyrophosphatase, all-alpha NTP-PPase family [Nucleotide transport and metabolism]. 	194
223206	COG0128	AroA	5-enolpyruvylshikimate-3-phosphate synthase [Amino acid transport and metabolism]. 	428
223207	COG0129	IlvD	Dihydroxyacid dehydratase/phosphogluconate dehydratase [Amino acid transport and metabolism, Carbohydrate transport and metabolism]. 	575
223208	COG0130	TruB	tRNA U55 pseudouridine synthase TruB, may also work on U342 of tmRNA [Translation, ribosomal structure and biogenesis]. 	271
223209	COG0131	HisB2	Imidazoleglycerol phosphate dehydratase HisB [Amino acid transport and metabolism]. 	195
223210	COG0132	BioD	Dethiobiotin synthetase [Coenzyme transport and metabolism]. 	223
223211	COG0133	TrpB	Tryptophan synthase beta chain [Amino acid transport and metabolism]. 	396
223212	COG0134	TrpC	Indole-3-glycerol phosphate synthase [Amino acid transport and metabolism]. 	254
223213	COG0135	TrpF	Phosphoribosylanthranilate isomerase [Amino acid transport and metabolism]. 	208
223214	COG0136	Asd	Aspartate-semialdehyde dehydrogenase [Amino acid transport and metabolism]. 	334
223215	COG0137	ArgG	Argininosuccinate synthase [Amino acid transport and metabolism]. 	403
223216	COG0138	PurH	AICAR transformylase/IMP cyclohydrolase PurH [Nucleotide transport and metabolism]. 	515
223217	COG0139	HisI1	Phosphoribosyl-AMP cyclohydrolase [Amino acid transport and metabolism]. 	111
223218	COG0140	HisI2	Phosphoribosyl-ATP pyrophosphohydrolase [Amino acid transport and metabolism]. 	92
223219	COG0141	HisD	Histidinol dehydrogenase [Amino acid transport and metabolism]. 	425
223220	COG0142	IspA	Geranylgeranyl pyrophosphate synthase [Coenzyme transport and metabolism]. 	322
223221	COG0143	MetG	Methionyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 	558
223222	COG0144	RsmB	16S rRNA C967 or C1407 C5-methylase, RsmB/RsmF family [Translation, ribosomal structure and biogenesis]. 	355
223223	COG0145	HyuA	N-methylhydantoinase A/oxoprolinase/acetone carboxylase, beta subunit  [Amino acid transport and metabolism, Secondary metabolites biosynthesis, transport and catabolism]. 	674
223224	COG0146	HyuB	N-methylhydantoinase B/oxoprolinase/acetone carboxylase, alpha subunit  [Amino acid transport and metabolism, Secondary metabolites biosynthesis, transport and catabolism]. 	563
223225	COG0147	TrpE	Anthranilate/para-aminobenzoate synthases component I [Amino acid transport and metabolism, Coenzyme transport and metabolism]. 	462
223226	COG0148	Eno	Enolase [Carbohydrate transport and metabolism]. 	423
223227	COG0149	TpiA	Triosephosphate isomerase [Carbohydrate transport and metabolism]. 	251
223228	COG0150	PurM	Phosphoribosylaminoimidazole (AIR) synthetase [Nucleotide transport and metabolism]. 	345
223229	COG0151	PurD	Phosphoribosylamine-glycine ligase [Nucleotide transport and metabolism]. 	428
223230	COG0152	PurC	Phosphoribosylaminoimidazole-succinocarboxamide synthase [Nucleotide transport and metabolism]. 	247
223231	COG0153	GalK	Galactokinase [Carbohydrate transport and metabolism]. 	390
223232	COG0154	GatA	Asp-tRNAAsn/Glu-tRNAGln amidotransferase A subunit or related amidase [Translation, ribosomal structure and biogenesis]. 	475
223233	COG0155	CysI	Sulfite reductase, beta subunit (hemoprotein) [Inorganic ion transport and metabolism]. 	510
223234	COG0156	BioF	7-keto-8-aminopelargonate synthetase or related enzyme [Coenzyme transport and metabolism]. 	388
223235	COG0157	NadC	Nicotinate-nucleotide pyrophosphorylase [Coenzyme transport and metabolism]. 	280
223236	COG0158	Fbp	Fructose-1,6-bisphosphatase [Carbohydrate transport and metabolism]. 	326
223237	COG0159	TrpA	Tryptophan synthase alpha chain [Amino acid transport and metabolism]. 	265
223238	COG0160	GabT	4-aminobutyrate aminotransferase or related aminotransferase [Amino acid transport and metabolism]. 	447
223239	COG0161	BioA	Adenosylmethionine-8-amino-7-oxononanoate aminotransferase [Coenzyme transport and metabolism]. 	449
223240	COG0162	TyrS	Tyrosyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 	401
223241	COG0163	UbiX	UbiX family flavin prenyltransferase [Coenzyme transport and metabolism]. 	191
223242	COG0164	RnhB	Ribonuclease HII [Replication, recombination and repair]. 	199
223243	COG0165	ArgH	Argininosuccinate lyase [Amino acid transport and metabolism]. 	459
223244	COG0166	Pgi	Glucose-6-phosphate isomerase [Carbohydrate transport and metabolism]. 	446
223245	COG0167	PyrD	Dihydroorotate dehydrogenase [Nucleotide transport and metabolism]. 	310
223246	COG0168	TrkG	Trk-type K+ transport system, membrane component [Inorganic ion transport and metabolism]. 	499
223247	COG0169	AroE	Shikimate 5-dehydrogenase [Amino acid transport and metabolism]. 	283
223248	COG0170	SEC59	Dolichol kinase [Posttranslational modification, protein turnover, chaperones]. 	216
223249	COG0171	NadE	NH3-dependent NAD+ synthetase [Coenzyme transport and metabolism]. 	268
223250	COG0172	SerS	Seryl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 	429
223251	COG0173	AspS	Aspartyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 	585
223252	COG0174	GlnA	Glutamine synthetase [Amino acid transport and metabolism]. 	443
223253	COG0175	CysH	3'-phosphoadenosine 5'-phosphosulfate sulfotransferase (PAPS reductase)/FAD synthetase or related enzyme [Amino acid transport and metabolism, Coenzyme transport and metabolism]. 	261
223254	COG0176	TalA	Transaldolase [Carbohydrate transport and metabolism]. 	239
223255	COG0177	Nth	Endonuclease III [Replication, recombination and repair]. 	211
223256	COG0178	UvrA	Excinuclease UvrABC ATPase subunit [Replication, recombination and repair]. 	935
223257	COG0179	MhpD	2-keto-4-pentenoate hydratase/2-oxohepta-3-ene-1,7-dioic acid hydratase (catechol pathway) [Secondary metabolites biosynthesis, transport and catabolism]. 	266
223258	COG0180	TrpS	Tryptophanyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 	314
223259	COG0181	HemC	Porphobilinogen deaminase [Coenzyme transport and metabolism]. 	307
223260	COG0182	MtnA	Methylthioribose-1-phosphate isomerase (methionine salvage pathway), a paralog of eIF-2B alpha subunit [Amino acid transport and metabolism]. 	346
223261	COG0183	PaaJ	Acetyl-CoA acetyltransferase [Lipid transport and metabolism]. 	392
223262	COG0184	RpsO	Ribosomal protein S15P/S13E [Translation, ribosomal structure and biogenesis]. 	89
223263	COG0185	RpsS	Ribosomal protein S19 [Translation, ribosomal structure and biogenesis]. 	93
223264	COG0186	RpsQ	Ribosomal protein S17 [Translation, ribosomal structure and biogenesis]. 	87
223265	COG0187	GyrB	DNA gyrase/topoisomerase IV, subunit B [Replication, recombination and repair]. 	635
223266	COG0188	GyrA	DNA gyrase/topoisomerase IV, subunit A [Replication, recombination and repair]. 	804
223267	COG0189	RimK	Glutathione synthase/RimK-type ligase, ATP-grasp superfamily [Coenzyme transport and metabolism, Translation, ribosomal structure and biogenesis]. 	318
223268	COG0190	FolD	5,10-methylene-tetrahydrofolate dehydrogenase/Methenyl tetrahydrofolate cyclohydrolase [Coenzyme transport and metabolism]. 	283
223269	COG0191	Fba	Fructose/tagatose bisphosphate aldolase [Carbohydrate transport and metabolism]. 	286
223270	COG0192	MetK	S-adenosylmethionine synthetase [Coenzyme transport and metabolism]. 	388
223271	COG0193	Pth	Peptidyl-tRNA hydrolase [Translation, ribosomal structure and biogenesis]. 	190
223272	COG0194	Gmk	Guanylate kinase [Nucleotide transport and metabolism]. 	191
223273	COG0195	NusA	Transcription antitermination factor NusA, contains S1 and KH domains [Transcription]. 	190
223274	COG0196	RibF	FAD synthase [Coenzyme transport and metabolism]. 	304
223275	COG0197	RplP	Ribosomal protein L16/L10AE [Translation, ribosomal structure and biogenesis]. 	146
223276	COG0198	RplX	Ribosomal protein L24 [Translation, ribosomal structure and biogenesis]. 	104
223277	COG0199	RpsN	Ribosomal protein S14 [Translation, ribosomal structure and biogenesis]. 	61
223278	COG0200	RplO	Ribosomal protein L15 [Translation, ribosomal structure and biogenesis]. 	152
223279	COG0201	SecY	Preprotein translocase subunit SecY [Intracellular trafficking, secretion, and vesicular transport]. 	436
223280	COG0202	RpoA	DNA-directed RNA polymerase, alpha subunit/40 kD subunit [Transcription]. 	317
223281	COG0203	RplQ	Ribosomal protein L17 [Translation, ribosomal structure and biogenesis]. 	116
223282	COG0204	PlsC	1-acyl-sn-glycerol-3-phosphate acyltransferase [Lipid transport and metabolism]. 	255
223283	COG0205	PfkA	6-phosphofructokinase [Carbohydrate transport and metabolism]. 	347
223284	COG0206	FtsZ	Cell division GTPase FtsZ [Cell cycle control, cell division, chromosome partitioning]. 	338
223285	COG0207	ThyA	Thymidylate synthase [Nucleotide transport and metabolism]. 	268
223286	COG0208	NrdF	Ribonucleotide reductase beta subunit, ferritin-like domain [Nucleotide transport and metabolism]. 	348
223287	COG0209	NrdA	Ribonucleotide reductase alpha subunit [Nucleotide transport and metabolism]. 	651
223288	COG0210	UvrD	Superfamily I DNA or RNA helicase [Replication, recombination and repair]. 	655
223289	COG0211	RpmA	Ribosomal protein L27 [Translation, ribosomal structure and biogenesis]. 	87
223290	COG0212	FAU1	5-formyltetrahydrofolate cyclo-ligase [Coenzyme transport and metabolism]. 	191
223291	COG0213	DeoA	Thymidine phosphorylase [Nucleotide transport and metabolism]. 	435
223292	COG0214	PdxS	Pyridoxal biosynthesis lyase PdxS [Coenzyme transport and metabolism]. 	296
223293	COG0215	CysS	Cysteinyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 	464
223294	COG0216	PrfA	Protein chain release factor A [Translation, ribosomal structure and biogenesis]. 	363
223295	COG0217	TACO1	Transcriptional and/or translational regulatory protein YebC/TACO1 [Transcription, Translation, ribosomal structure and biogenesis]. 	241
223296	COG0218	EngB	GTP-binding protein EngB required for normal cell division [Cell cycle control, cell division, chromosome partitioning]. 	200
223297	COG0219	TrmL	tRNA(Leu) C34 or U34 (ribose-2'-O)-methylase TrmL, contains SPOUT domain [Translation, ribosomal structure and biogenesis]. 	155
223298	COG0220	TrmB	tRNA G46 methylase TrmB [Translation, ribosomal structure and biogenesis]. 	227
223299	COG0221	Ppa	Inorganic pyrophosphatase [Energy production and conversion, Inorganic ion transport and metabolism]. 	171
223300	COG0222	RplL	Ribosomal protein L7/L12 [Translation, ribosomal structure and biogenesis]. 	124
223301	COG0223	Fmt	Methionyl-tRNA formyltransferase [Translation, ribosomal structure and biogenesis]. 	307
223302	COG0224	AtpG	FoF1-type ATP synthase, gamma subunit [Energy production and conversion]. 	287
223303	COG0225	MsrA	Peptide methionine sulfoxide reductase MsrA [Posttranslational modification, protein turnover, chaperones]. 	174
223304	COG0226	PstS	ABC-type phosphate transport system, periplasmic component [Inorganic ion transport and metabolism]. 	318
223305	COG0227	RpmB	Ribosomal protein L28 [Translation, ribosomal structure and biogenesis]. 	77
223306	COG0228	RpsP	Ribosomal protein S16 [Translation, ribosomal structure and biogenesis]. 	87
223307	COG0229	MsrB	Peptide methionine sulfoxide reductase MsrB [Posttranslational modification, protein turnover, chaperones]. 	140
223308	COG0230	RpmH	Ribosomal protein L34 [Translation, ribosomal structure and biogenesis]. 	44
223309	COG0231	Efp	Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) [Translation, ribosomal structure and biogenesis]. 	131
223310	COG0232	Dgt	dGTP triphosphohydrolase [Nucleotide transport and metabolism]. 	412
223311	COG0233	Frr	Ribosome recycling factor [Translation, ribosomal structure and biogenesis]. 	187
223312	COG0234	GroES	Co-chaperonin GroES (HSP10) [Posttranslational modification, protein turnover, chaperones]. 	96
223313	COG0235	AraD	Ribulose-5-phosphate 4-epimerase/Fuculose-1-phosphate aldolase [Carbohydrate transport and metabolism]. 	219
223314	COG0236	AcpP	Acyl carrier protein [Lipid transport and metabolism, Secondary metabolites biosynthesis, transport and catabolism]. 	80
223315	COG0237	CoaE	Dephospho-CoA kinase [Coenzyme transport and metabolism]. 	201
223316	COG0238	RpsR	Ribosomal protein S18 [Translation, ribosomal structure and biogenesis]. 	75
223317	COG0239	CrcB	Fluoride ion exporter CrcB/FEX, affects chromosome condensation [Cell cycle control, cell division, chromosome partitioning, Inorganic ion transport and metabolism]. 	126
223318	COG0240	GpsA	Glycerol-3-phosphate dehydrogenase [Energy production and conversion]. 	329
223319	COG0241	HisB1	Histidinol phosphatase or a related phosphatase [Amino acid transport and metabolism]. 	181
223320	COG0242	Def	Peptide deformylase [Translation, ribosomal structure and biogenesis]. 	168
223321	COG0243	BisC	Anaerobic selenocysteine-containing dehydrogenase [Energy production and conversion]. 	765
223322	COG0244	RplJ	Ribosomal protein L10 [Translation, ribosomal structure and biogenesis]. 	175
223323	COG0245	IspF	2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase [Lipid transport and metabolism]. 	159
223324	COG0246	MtlD	Mannitol-1-phosphate/altronate dehydrogenases [Carbohydrate transport and metabolism]. 	473
223325	COG0247	GlpC	Fe-S oxidoreductase [Energy production and conversion]. 	388
223326	COG0248	GppA	Exopolyphosphatase/pppGpp-phosphohydrolase [Nucleotide transport and metabolism, Signal transduction mechanisms, Inorganic ion transport and metabolism]. 	492
223327	COG0249	MutS	DNA mismatch repair ATPase MutS [Replication, recombination and repair]. 	843
223328	COG0250	NusG	Transcription antitermination factor NusG [Transcription]. 	178
223329	COG0251	RidA	Enamine deaminase RidA, house cleaning of reactive enamine intermediates, YjgF/YER057c/UK114 family [Defense mechanisms]. 	130
223330	COG0252	AnsA	L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D [Translation, ribosomal structure and biogenesis, Intracellular trafficking, secretion, and vesicular transport]. 	351
223331	COG0253	DapF	Diaminopimelate epimerase [Amino acid transport and metabolism]. 	272
223332	COG0254	RpmE	Ribosomal protein L31 [Translation, ribosomal structure and biogenesis]. 	75
223333	COG0255	RpmC	Ribosomal protein L29 [Translation, ribosomal structure and biogenesis]. 	69
223334	COG0256	RplR	Ribosomal protein L18 [Translation, ribosomal structure and biogenesis]. 	125
223335	COG0257	RpmJ	Ribosomal protein L36 [Translation, ribosomal structure and biogenesis]. 	38
223336	COG0258	Exo	5'-3' exonuclease [Replication, recombination and repair]. 	310
223337	COG0259	PdxH	Pyridoxine/pyridoxamine 5'-phosphate oxidase [Coenzyme transport and metabolism]. 	214
223338	COG0260	PepB	Leucyl aminopeptidase [Amino acid transport and metabolism]. 	485
223339	COG0261	RplU	Ribosomal protein L21 [Translation, ribosomal structure and biogenesis]. 	103
223340	COG0262	FolA	Dihydrofolate reductase [Coenzyme transport and metabolism]. 	167
223341	COG0263	ProB	Glutamate 5-kinase [Amino acid transport and metabolism]. 	369
223342	COG0264	Tsf	Translation elongation factor EF-Ts [Translation, ribosomal structure and biogenesis]. 	296
223343	COG0265	DegQ	Periplasmic serine protease, S1-C subfamily, contains C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]. 	347
223344	COG0266	Nei	Formamidopyrimidine-DNA glycosylase [Replication, recombination and repair]. 	273
223345	COG0267	RpmG	Ribosomal protein L33 [Translation, ribosomal structure and biogenesis]. 	50
223346	COG0268	RpsT	Ribosomal protein S20 [Translation, ribosomal structure and biogenesis]. 	88
223347	COG0269	SgbH	3-keto-L-gulonate-6-phosphate decarboxylase [Carbohydrate transport and metabolism]. 	217
223348	COG0270	Dcm	Site-specific DNA-cytosine methylase [Replication, recombination and repair]. 	328
223349	COG0271	BolA	Stress-induced morphogen (activity unknown) [Signal transduction mechanisms]. 	90
223350	COG0272	Lig	NAD-dependent DNA ligase [Replication, recombination and repair]. 	667
223351	COG0274	DeoC	Deoxyribose-phosphate aldolase [Nucleotide transport and metabolism]. 	228
223352	COG0275	RmsH	16S rRNA C1402 N4-methylase RsmH [Translation, ribosomal structure and biogenesis]. 	314
223353	COG0276	HemH	Protoheme ferro-lyase (ferrochelatase) [Coenzyme transport and metabolism]. 	320
223354	COG0277	GlcD	FAD/FMN-containing dehydrogenase [Energy production and conversion]. 	459
223355	COG0278	GrxD	Glutaredoxin-related protein [Posttranslational modification, protein turnover, chaperones]. 	105
223356	COG0279	GmhA	Phosphoheptose isomerase [Carbohydrate transport and metabolism]. 	176
223357	COG0280	Pta	Phosphotransacetylase [Energy production and conversion]. 	327
223358	COG0281	SfcA	Malic enzyme [Energy production and conversion]. 	432
223359	COG0282	AckA	Acetate kinase [Energy production and conversion]. 	396
223360	COG0283	Cmk	Cytidylate kinase [Nucleotide transport and metabolism]. 	222
223361	COG0284	PyrF	Orotidine-5'-phosphate decarboxylase [Nucleotide transport and metabolism]. 	240
223362	COG0285	FolC	Folylpolyglutamate synthase/Dihydropteroate synthase [Coenzyme transport and metabolism]. 	427
223363	COG0286	HsdM	Type I restriction-modification system, DNA methylase subunit [Defense mechanisms]. 	489
223364	COG0287	TyrA	Prephenate dehydrogenase [Amino acid transport and metabolism]. 	279
223365	COG0288	CynT	Carbonic anhydrase [Inorganic ion transport and metabolism]. 	207
223366	COG0289	DapB	Dihydrodipicolinate reductase [Amino acid transport and metabolism]. 	266
223367	COG0290	InfC	Translation initiation factor IF-3 [Translation, ribosomal structure and biogenesis]. 	176
223368	COG0291	RpmI	Ribosomal protein L35 [Translation, ribosomal structure and biogenesis]. 	65
223369	COG0292	RplT	Ribosomal protein L20 [Translation, ribosomal structure and biogenesis]. 	118
223370	COG0293	RlmE	23S rRNA U2552 (ribose-2'-O)-methylase RlmE/FtsJ [Translation, ribosomal structure and biogenesis]. 	205
223371	COG0294	FolP	Dihydropteroate synthase [Coenzyme transport and metabolism]. 	274
223372	COG0295	Cdd	Cytidine deaminase [Nucleotide transport and metabolism]. 	134
223373	COG0296	GlgB	1,4-alpha-glucan branching enzyme [Carbohydrate transport and metabolism]. 	628
223374	COG0297	GlgA	Glycogen synthase [Carbohydrate transport and metabolism]. 	487
223375	COG0298	HypC	Hydrogenase maturation factor [Posttranslational modification, protein turnover, chaperones]. 	82
223376	COG0299	PurN	Folate-dependent phosphoribosylglycinamide formyltransferase PurN [Nucleotide transport and metabolism]. 	200
223377	COG0300	DltE	Short-chain dehydrogenase [General function prediction only]. 	265
223378	COG0301	ThiI	Adenylyl- and sulfurtransferase ThiI, participates in tRNA 4-thiouridine and thiamine biosynthesis [Coenzyme transport and metabolism, Translation, ribosomal structure and biogenesis]. 	383
223379	COG0302	FolE	GTP cyclohydrolase I [Coenzyme transport and metabolism]. 	195
223380	COG0303	MoeA	Molybdopterin biosynthesis enzyme [Coenzyme transport and metabolism]. 	404
223381	COG0304	FabB	3-oxoacyl-(acyl-carrier-protein) synthase [Lipid transport and metabolism, Secondary metabolites biosynthesis, transport and catabolism]. 	412
223382	COG0305	DnaB	Replicative DNA helicase [Replication, recombination and repair]. 	435
223383	COG0306	PitA	Phosphate/sulfate permease [Inorganic ion transport and metabolism]. 	326
223384	COG0307	RibC	Riboflavin synthase alpha chain [Coenzyme transport and metabolism]. 	204
223385	COG0308	PepN	Aminopeptidase N [Amino acid transport and metabolism]. 	859
223386	COG0309	HypE	Hydrogenase maturation factor [Posttranslational modification, protein turnover, chaperones]. 	339
223387	COG0310	CbiM	ABC-type Co2+ transport system, permease component [Inorganic ion transport and metabolism]. 	204
223388	COG0311	PdxT	Glutamine amidotransferase PdxT (pyridoxal biosynthesis) [Coenzyme transport and metabolism]. 	194
223389	COG0312	TldD	Predicted Zn-dependent protease or its inactivated homolog [General function prediction only]. 	454
223390	COG0313	RsmI	16S rRNA C1402 (ribose-2'-O) methylase RsmI [Translation, ribosomal structure and biogenesis]. 	275
223391	COG0314	MoaE	Molybdopterin synthase catalytic subunit [Coenzyme transport and metabolism]. 	149
223392	COG0315	MoaC	Molybdenum cofactor biosynthesis enzyme [Coenzyme transport and metabolism]. 	157
223393	COG0316	IscA	Fe-S cluster assembly iron-binding protein IscA [Posttranslational modification, protein turnover, chaperones]. 	110
223394	COG0317	SpoT	(p)ppGpp synthase/hydrolase, HD superfamily [Signal transduction mechanisms, Transcription]. 	701
223395	COG0318	CaiC	Acyl-CoA synthetase (AMP-forming)/AMP-acid ligase II [Lipid transport and metabolism, Secondary metabolites biosynthesis, transport and catabolism]. 	534
223396	COG0319	YbeY	ssRNA-specific RNase YbeY, 16S rRNA maturation enzyme [Translation, ribosomal structure and biogenesis]. 	153
223397	COG0320	LipA	Lipoate synthase [Coenzyme transport and metabolism]. 	306
223398	COG0321	LipB	Lipoate-protein ligase B [Coenzyme transport and metabolism]. 	221
223399	COG0322	UvrC	Excinuclease UvrABC, nuclease subunit [Replication, recombination and repair]. 	581
223400	COG0323	MutL	DNA mismatch repair ATPase MutL [Replication, recombination and repair]. 	638
223401	COG0324	MiaA	tRNA A37 N6-isopentenyltransferase MiaA [Translation, ribosomal structure and biogenesis]. 	308
223402	COG0325	YggS	Uncharacterized pyridoxal phosphate-containing protein, affects Ilv metabolism, UPF0001 family [General function prediction only]. 	228
223403	COG0326	HtpG	Molecular chaperone, HSP90 family [Posttranslational modification, protein turnover, chaperones]. 	623
223404	COG0327	NIF3	Putative GTP cyclohydrolase 1 type 2, NIF3 family [Coenzyme transport and metabolism]. 	250
223405	COG0328	RnhA	Ribonuclease HI [Replication, recombination and repair]. 	154
223406	COG0329	DapA	Dihydrodipicolinate synthase/N-acetylneuraminate lyase [Amino acid transport and metabolism, Cell wall/membrane/envelope biogenesis]. 	299
223407	COG0330	HflC	Regulator of protease activity HflC, stomatin/prohibitin superfamily [Posttranslational modification, protein turnover, chaperones]. 	291
223408	COG0331	FabD	Malonyl CoA-acyl carrier protein transacylase [Lipid transport and metabolism]. 	310
223409	COG0332	FabH	3-oxoacyl-[acyl-carrier-protein] synthase III [Lipid transport and metabolism]. 	323
223410	COG0333	RpmF	Ribosomal protein L32 [Translation, ribosomal structure and biogenesis]. 	57
223411	COG0334	GdhA	Glutamate dehydrogenase/leucine dehydrogenase [Amino acid transport and metabolism]. 	411
223412	COG0335	RplS	Ribosomal protein L19 [Translation, ribosomal structure and biogenesis]. 	115
223413	COG0336	TrmD	tRNA G37 N-methylase TrmD [Translation, ribosomal structure and biogenesis]. 	240
223414	COG0337	AroB	3-dehydroquinate synthetase [Amino acid transport and metabolism]. 	360
223415	COG0338	Dam	Site-specific DNA-adenine methylase [Replication, recombination and repair]. 	274
223416	COG0339	Dcp	Zn-dependent oligopeptidase [Posttranslational modification, protein turnover, chaperones]. 	683
223417	COG0340	BirA2	Biotin-(acetyl-CoA carboxylase) ligase [Coenzyme transport and metabolism]. 	238
223418	COG0341	SecF	Preprotein translocase subunit SecF [Intracellular trafficking, secretion, and vesicular transport]. 	305
223419	COG0342	SecD	Preprotein translocase subunit SecD [Intracellular trafficking, secretion, and vesicular transport]. 	506
223420	COG0343	Tgt	Queuine/archaeosine tRNA-ribosyltransferase [Translation, ribosomal structure and biogenesis]. 	372
223421	COG0344	PlsY	Phospholipid biosynthesis protein PlsY, probable glycerol-3-phosphate acyltransferase [Lipid transport and metabolism]. 	200
223422	COG0345	ProC	Pyrroline-5-carboxylate reductase [Amino acid transport and metabolism]. 	266
223423	COG0346	GloA	Catechol 2,3-dioxygenase or other lactoylglutathione lyase family enzyme [Secondary metabolites biosynthesis, transport and catabolism]. 	138
223424	COG0347	GlnK	Nitrogen regulatory protein PII [Signal transduction mechanisms, Amino acid transport and metabolism]. 	112
223425	COG0348	NapH	Polyferredoxin [Energy production and conversion]. 	386
223426	COG0349	Rnd	Ribonuclease D [Translation, ribosomal structure and biogenesis]. 	361
223427	COG0350	AdaB	O6-methylguanine-DNA--protein-cysteine methyltransferase [Replication, recombination and repair]. 	168
223428	COG0351	ThiD	Hydroxymethylpyrimidine/phosphomethylpyrimidine kinase [Coenzyme transport and metabolism]. 	263
223429	COG0352	ThiE	Thiamine monophosphate synthase [Coenzyme transport and metabolism]. 	211
223430	COG0353	RecR	Recombinational DNA repair protein RecR [Replication, recombination and repair]. 	198
223431	COG0354	YgfZ	Folate-binding Fe-S cluster repair protein YgfZ, possible role in tRNA modification [Posttranslational modification, protein turnover, chaperones]. 	305
223432	COG0355	AtpC	FoF1-type ATP synthase, epsilon subunit [Energy production and conversion]. 	135
223433	COG0356	AtpB	FoF1-type ATP synthase, membrane subunit a [Energy production and conversion]. 	246
223434	COG0357	RsmG	16S rRNA G527 N7-methylase RsmG (former glucose-inhibited division protein B) [Translation, ribosomal structure and biogenesis]. 	215
223435	COG0358	DnaG	DNA primase (bacterial type) [Replication, recombination and repair]. 	568
223436	COG0359	RplI	Ribosomal protein L9 [Translation, ribosomal structure and biogenesis]. 	148
223437	COG0360	RpsF	Ribosomal protein S6 [Translation, ribosomal structure and biogenesis]. 	112
223438	COG0361	InfA	Translation initiation factor IF-1 [Translation, ribosomal structure and biogenesis]. 	75
223439	COG0362	Gnd	6-phosphogluconate dehydrogenase [Carbohydrate transport and metabolism]. 	473
223440	COG0363	NagB	6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase [Carbohydrate transport and metabolism]. 	238
223441	COG0364	Zwf	Glucose-6-phosphate 1-dehydrogenase [Carbohydrate transport and metabolism]. 	483
223442	COG0365	Acs	Acyl-coenzyme A synthetase/AMP-(fatty) acid ligase [Lipid transport and metabolism]. 	528
223443	COG0366	AmyA	Glycosidase [Carbohydrate transport and metabolism]. 	505
223444	COG0367	AsnB	Asparagine synthetase B (glutamine-hydrolyzing) [Amino acid transport and metabolism]. 	542
223445	COG0368	CobS	Cobalamin synthase [Coenzyme transport and metabolism]. 	246
223446	COG0369	CysJ	Sulfite reductase, alpha subunit (flavoprotein) [Inorganic ion transport and metabolism]. 	587
223447	COG0370	FeoB	Fe2+ transport system protein B [Inorganic ion transport and metabolism]. 	653
223448	COG0371	GldA	Glycerol dehydrogenase or related enzyme, iron-containing ADH family [Energy production and conversion]. 	360
223449	COG0372	GltA	Citrate synthase [Energy production and conversion]. 	390
223450	COG0373	HemA	Glutamyl-tRNA reductase [Coenzyme transport and metabolism]. 	414
223451	COG0374	HyaB	Ni,Fe-hydrogenase I large subunit [Energy production and conversion]. 	545
223452	COG0375	HybF	Hydrogenase maturation metallochaperone HypA/HybF, involved in Ni insertion [Posttranslational modification, protein turnover, chaperones]. 	115
223453	COG0376	KatG	Catalase (peroxidase I) [Inorganic ion transport and metabolism]. 	730
223454	COG0377	NuoB	NADH:ubiquinone oxidoreductase 20 kD subunit (chain B) or related Fe-S oxidoreductase [Energy production and conversion]. 	194
223455	COG0378	HypB	Ni2+-binding GTPase involved in regulation of expression and maturation of urease and hydrogenase [Posttranslational modification, protein turnover, chaperones]. 	202
223456	COG0379	NadA	Quinolinate synthase [Coenzyme transport and metabolism]. 	324
223457	COG0380	OtsA	Trehalose-6-phosphate synthase [Carbohydrate transport and metabolism]. 	486
223458	COG0381	WecB	UDP-N-acetylglucosamine 2-epimerase [Cell wall/membrane/envelope biogenesis]. 	383
223459	COG0382	UbiA	4-hydroxybenzoate polyprenyltransferase [Coenzyme transport and metabolism]. 	289
223460	COG0383	AMS1	Alpha-mannosidase [Carbohydrate transport and metabolism]. 	943
223461	COG0384	YHI9	Predicted epimerase YddE/YHI9, PhzF superfamily [General function prediction only]. 	291
223462	COG0385	YfeH	Predicted Na+-dependent transporter [General function prediction only]. 	319
223463	COG0386	BtuE	Glutathione peroxidase, house-cleaning role in reducing lipid peroxides [Defense mechanisms, Lipid transport and metabolism]. 	162
223464	COG0387	ChaA	Ca2+/H+ antiporter [Inorganic ion transport and metabolism]. 	368
223465	COG0388	YafV	Predicted amidohydrolase [General function prediction only]. 	274
223466	COG0389	DinP	Nucleotidyltransferase/DNA polymerase involved in DNA repair [Replication, recombination and repair]. 	354
223467	COG0390	FetB	ABC-type iron transport system FetAB, permease component [Inorganic ion transport and metabolism]. 	256
223468	COG0391	CofD	Archaeal 2-phospho-L-lactate transferase/Bacterial gluconeogenesis factor, CofD/UPF0052 family [Coenzyme transport and metabolism, Carbohydrate transport and metabolism]. 	323
223469	COG0392	AglD2	Uncharacterized membrane protein YbhN, UPF0104 family [Function unknown]. 	322
223470	COG0393	YbjQ	Uncharacterized conserved protein YbjQ, UPF0145 family [Function unknown]. 	108
223471	COG0394	Wzb	Protein-tyrosine-phosphatase [Signal transduction mechanisms]. 	139
223472	COG0395	UgpE	ABC-type glycerol-3-phosphate transport system, permease component [Carbohydrate transport and metabolism]. 	281
223473	COG0396	SufC	Fe-S cluster assembly ATPase SufC [Posttranslational modification, protein turnover, chaperones]. 	251
223474	COG0397	YdiU	Uncharacterized conserved protein YdiU, UPF0061 family [Function unknown]. 	488
223475	COG0398	TVP38	Uncharacterized membrane protein YdjX, TVP38/TMEM64 family, SNARE-associated domain [Function unknown]. 	223
223476	COG0399	WecE	dTDP-4-amino-4,6-dideoxygalactose transaminase [Cell wall/membrane/envelope biogenesis]. 	374
223477	COG0400	YpfH	Predicted esterase [General function prediction only]. 	207
223478	COG0401	YqaE	Uncharacterized membrane protein YqaE, homolog of Blt101, UPF0057 family [Function unknown]. 	56
223479	COG0402	SsnA	Cytosine/adenosine deaminase or related metal-dependent hydrolase [Nucleotide transport and metabolism, General function prediction only]. 	421
223480	COG0403	GcvP1	Glycine cleavage system protein P (pyridoxal-binding), N-terminal domain [Amino acid transport and metabolism]. 	450
223481	COG0404	GcvT	Glycine cleavage system T protein (aminomethyltransferase) [Amino acid transport and metabolism]. 	379
223482	COG0405	Ggt	Gamma-glutamyltranspeptidase [Amino acid transport and metabolism]. 	539
223483	COG0406	PhoE	Broad specificity phosphatase PhoE [Carbohydrate transport and metabolism]. 	208
223484	COG0407	HemE	Uroporphyrinogen-III decarboxylase [Coenzyme transport and metabolism]. 	352
223485	COG0408	HemF	Coproporphyrinogen III oxidase [Coenzyme transport and metabolism]. 	303
223486	COG0409	HypD	Hydrogenase maturation factor [Posttranslational modification, protein turnover, chaperones]. 	364
223487	COG0410	LivF	ABC-type branched-chain amino acid transport system, ATPase component [Amino acid transport and metabolism]. 	237
223488	COG0411	LivG	ABC-type branched-chain amino acid transport system, ATPase component [Amino acid transport and metabolism]. 	250
223489	COG0412	DLH	Dienelactone hydrolase [Secondary metabolites biosynthesis, transport and catabolism]. 	236
223490	COG0413	PanB	Ketopantoate hydroxymethyltransferase [Coenzyme transport and metabolism]. 	268
223491	COG0414	PanC	Pantothenate synthetase [Coenzyme transport and metabolism]. 	285
223492	COG0415	PhrB	Deoxyribodipyrimidine photolyase [Replication, recombination and repair]. 	461
223493	COG0416	PlsX	Fatty acid/phospholipid biosynthesis enzyme [Lipid transport and metabolism]. 	338
223494	COG0417	PolB	DNA polymerase elongation subunit (family B) [Replication, recombination and repair]. 	792
223495	COG0418	PyrC	Dihydroorotase [Nucleotide transport and metabolism]. 	344
223496	COG0419	SbcC	DNA repair exonuclease SbcCD ATPase subunit [Replication, recombination and repair]. 	908
223497	COG0420	SbcD	DNA repair exonuclease SbcCD nuclease subunit [Replication, recombination and repair]. 	390
223498	COG0421	SpeE	Spermidine synthase [Amino acid transport and metabolism]. 	282
223499	COG0422	ThiC	Thiamine biosynthesis protein ThiC [Coenzyme transport and metabolism]. 	432
223500	COG0423	GRS1	Glycyl-tRNA synthetase (class II) [Translation, ribosomal structure and biogenesis]. 	558
223501	COG0424	Maf	Predicted house-cleaning NTP pyrophosphatase, Maf/HAM1 superfamily [Secondary metabolites biosynthesis, transport and catabolism]. 	193
223502	COG0425	TusA	TusA-related sulfurtransferase [Posttranslational modification, protein turnover, chaperones]. 	78
223503	COG0426	NorV	Flavorubredoxin [Energy production and conversion]. 	388
223504	COG0427	ACH1	Acyl-CoA hydrolase [Energy production and conversion]. 	501
223505	COG0428	ZupT	Zinc transporter ZupT [Inorganic ion transport and metabolism]. 	266
223506	COG0429	YheT	Predicted hydrolase of the alpha/beta-hydrolase fold [General function prediction only]. 	345
223507	COG0430	RCL1	RNA 3'-terminal phosphate cyclase [RNA processing and modification]. 	341
223508	COG0431	SsuE	NAD(P)H-dependent FMN reductase [Energy production and conversion]. 	184
223509	COG0432	YjbQ	Thiamin phosphate synthase YjbQ, UPF0047 family [Coenzyme transport and metabolism]. 	137
223510	COG0433	YjgR	Archaeal DNA helicase HerA or a related bacterial ATPase, contains HAS-barrel and ATPase domains [Replication, recombination and repair]. 	520
223511	COG0434	SgcQ	Predicted TIM-barrel enzyme [General function prediction only]. 	263
223512	COG0435	ECM4	Glutathionyl-hydroquinone reductase [Energy production and conversion]. 	324
223513	COG0436	AspB	Aspartate/methionine/tyrosine aminotransferase [Amino acid transport and metabolism]. 	393
223514	COG0437	HybA	Fe-S-cluster-containing dehydrogenase component [Energy production and conversion]. 	203
223515	COG0438	RfaB	Glycosyltransferase involved in cell wall biosynthesis [Cell wall/membrane/envelope biogenesis]. 	381
223516	COG0439	AccC	Biotin carboxylase [Lipid transport and metabolism]. 	449
223517	COG0440	IlvH	Acetolactate synthase, small subunit [Amino acid transport and metabolism]. 	163
223518	COG0441	ThrS	Threonyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 	589
223519	COG0442	ProS	Prolyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 	500
223520	COG0443	DnaK	Molecular chaperone DnaK (HSP70) [Posttranslational modification, protein turnover, chaperones]. 	579
223521	COG0444	DppD	ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component [Amino acid transport and metabolism, Inorganic ion transport and metabolism]. 	316
223522	COG0445	MnmG	tRNA U34 5-carboxymethylaminomethyl modifying enzyme MnmG/GidA [Translation, ribosomal structure and biogenesis]. 	621
223523	COG0446	FadH2	NADPH-dependent 2,4-dienoyl-CoA reductase, sulfur reductase, or a related oxidoreductase [Lipid transport and metabolism]. 	415
223524	COG0447	MenB	1,4-Dihydroxy-2-naphthoyl-CoA synthase [Coenzyme transport and metabolism]. 	282
223525	COG0448	GlgC	ADP-glucose pyrophosphorylase [Carbohydrate transport and metabolism]. 	393
223526	COG0449	GlmS	Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains [Cell wall/membrane/envelope biogenesis]. 	597
223527	COG0450	AhpC	Alkyl hydroperoxide reductase subunit AhpC (peroxiredoxin) [Defense mechanisms]. 	194
223528	COG0451	WcaG	Nucleoside-diphosphate-sugar epimerase [Cell wall/membrane/envelope biogenesis]. 	314
223529	COG0452	CoaBC	Phosphopantothenoylcysteine synthetase/decarboxylase [Coenzyme transport and metabolism]. 	392
223530	COG0454	PhnO	N-acetyltransferase, GNAT superfamily (includes histone acetyltransferase HPA2) [Transcription, General function prediction only]. 	156
223531	COG0455	FlhG	MinD-like ATPase involved in chromosome partitioning or flagellar assembly [Cell cycle control, cell division, chromosome partitioning, Cell motility]. 	262
223532	COG0456	RimI	Ribosomal protein S18 acetylase RimI and related acetyltransferases [Translation, ribosomal structure and biogenesis]. 	177
223533	COG0457	TPR	Tetratricopeptide (TPR) repeat [General function prediction only]. 	291
223534	COG0458	CarB	Carbamoylphosphate synthase large subunit [Amino acid transport and metabolism, Nucleotide transport and metabolism]. 	400
223535	COG0459	GroEL	Chaperonin GroEL (HSP60 family) [Posttranslational modification, protein turnover, chaperones]. 	524
223536	COG0460	ThrA	Homoserine dehydrogenase [Amino acid transport and metabolism]. 	333
223537	COG0461	PyrE	Orotate phosphoribosyltransferase [Nucleotide transport and metabolism]. 	201
223538	COG0462	PrsA	Phosphoribosylpyrophosphate synthetase [Nucleotide transport and metabolism, Amino acid transport and metabolism]. 	314
223539	COG0463	WcaA	Glycosyltransferase involved in cell wall biosynthesis [Cell wall/membrane/envelope biogenesis]. 	291
223540	COG0464	SpoVK	AAA+-type ATPase, SpoVK/Ycf46/Vps4 family [Cell wall/membrane/envelope biogenesis, Cell cycle control, cell division, chromosome partitioning, Signal transduction mechanisms]. 	494
223541	COG0465	HflB	ATP-dependent Zn proteases [Posttranslational modification, protein turnover, chaperones]. 	596
223542	COG0466	Lon	ATP-dependent Lon protease, bacterial type [Posttranslational modification, protein turnover, chaperones]. 	782
223543	COG0467	RAD55	RecA-superfamily ATPase, KaiC/GvpD/RAD55 family [Signal transduction mechanisms]. 	260
223544	COG0468	RecA	RecA/RadA recombinase [Replication, recombination and repair]. 	279
223545	COG0469	PykF	Pyruvate kinase [Carbohydrate transport and metabolism]. 	477
223546	COG0470	HolB	DNA polymerase III, delta prime subunit [Replication, recombination and repair]. 	230
223547	COG0471	CitT	Di- and tricarboxylate transporter [Carbohydrate transport and metabolism]. 	461
223548	COG0472	Rfe	UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase [Cell wall/membrane/envelope biogenesis]. 	319
223549	COG0473	LeuB	Isocitrate/isopropylmalate dehydrogenase [Energy production and conversion, Amino acid transport and metabolism]. 	348
223550	COG0474	MgtA	Magnesium-transporting ATPase (P-type) [Inorganic ion transport and metabolism]. 	917
223551	COG0475	KefB	Kef-type K+ transport system, membrane component KefB [Inorganic ion transport and metabolism]. 	397
223552	COG0476	ThiF	Molybdopterin or thiamine biosynthesis adenylyltransferase [Coenzyme transport and metabolism]. 	254
223553	COG0477	ProP	MFS family permease [Carbohydrate transport and metabolism, Amino acid transport and metabolism, Inorganic ion transport and metabolism, General function prediction only]. 	338
223554	COG0478	RIO2	RIO-like serine/threonine protein kinase fused to N-terminal HTH domain [Signal transduction mechanisms]. 	304
223555	COG0479	FrdB	Succinate dehydrogenase/fumarate reductase, Fe-S protein subunit [Energy production and conversion]. 	234
223556	COG0480	FusA	Translation elongation factor EF-G, a GTPase [Translation, ribosomal structure and biogenesis]. 	697
223557	COG0481	LepA	Translation elongation factor EF-4, membrane-bound GTPase [Translation, ribosomal structure and biogenesis]. 	603
223558	COG0482	MnmA	tRNA U34 2-thiouridine synthase MnmA/TrmU, contains the PP-loop ATPase domain [Translation, ribosomal structure and biogenesis]. 	356
223559	COG0483	SuhB	Archaeal fructose-1,6-bisphosphatase or related enzyme of inositol monophosphatase family [Carbohydrate transport and metabolism]. 	260
223560	COG0484	DnaJ	DnaJ-class molecular chaperone with C-terminal Zn finger domain [Posttranslational modification, protein turnover, chaperones]. 	371
223561	COG0486	MnmE	tRNA U34 5-carboxymethylaminomethyl modifying GTPase MnmE/TrmE [Translation, ribosomal structure and biogenesis]. 	454
223562	COG0488	Uup	ATPase components of ABC transporters with duplicated ATPase domains [General function prediction only]. 	530
223563	COG0489	Mrp	Chromosome partitioning ATPase, Mrp family, contains Fe-S cluster [Cell cycle control, cell division, chromosome partitioning]. 	265
223564	COG0490	KhtT	K+/H+ antiporter YhaU, regulatory subunit KhtT [Inorganic ion transport and metabolism]. 	162
223565	COG0491	GloB	Glyoxylase or a related metal-dependent hydrolase, beta-lactamase superfamily II [General function prediction only]. 	252
223566	COG0492	TrxB	Thioredoxin reductase [Posttranslational modification, protein turnover, chaperones]. 	305
223567	COG0493	GltD	NADPH-dependent glutamate synthase beta chain or related oxidoreductase [Amino acid transport and metabolism, General function prediction only]. 	457
223568	COG0494	MutT	8-oxo-dGTP pyrophosphatase MutT and related house-cleaning NTP pyrophosphohydrolases, NUDIX family [Defense mechanisms]. 	161
223569	COG0495	LeuS	Leucyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 	814
223570	COG0496	SurE	Broad specificity polyphosphatase and 5'/3'-nucleotidase SurE [Replication, recombination and repair]. 	252
223571	COG0497	RecN	DNA repair ATPase RecN [Replication, recombination and repair]. 	557
223572	COG0498	ThrC	Threonine synthase [Amino acid transport and metabolism]. 	411
223573	COG0499	SAM1	S-adenosylhomocysteine hydrolase  [Coenzyme transport and metabolism]. 	420
223574	COG0500	SmtA	SAM-dependent methyltransferase [Secondary metabolites biosynthesis, transport and catabolism, General function prediction only]. 	257
223575	COG0501	HtpX	Zn-dependent protease with chaperone function [Posttranslational modification, protein turnover, chaperones]. 	302
223576	COG0502	BioB	Biotin synthase or related enzyme [Coenzyme transport and metabolism]. 	335
223577	COG0503	Apt	Adenine/guanine phosphoribosyltransferase or related PRPP-binding protein [Nucleotide transport and metabolism]. 	179
223578	COG0504	PyrG	CTP synthase (UTP-ammonia lyase) [Nucleotide transport and metabolism]. 	533
223579	COG0505	CarA	Carbamoylphosphate synthase small subunit [Amino acid transport and metabolism, Nucleotide transport and metabolism]. 	368
223580	COG0506	PutA	Proline dehydrogenase [Amino acid transport and metabolism]. 	391
223581	COG0507	RecD	ATP-dependent exoDNAse (exonuclease V), alpha subunit, helicase superfamily I  [Replication, recombination and repair]. 	696
223582	COG0508	AceF	Pyruvate/2-oxoglutarate dehydrogenase complex, dihydrolipoamide acyltransferase (E2) component [Energy production and conversion]. 	404
223583	COG0509	GcvH	Glycine cleavage system H protein (lipoate-binding) [Amino acid transport and metabolism]. 	131
223584	COG0510	CotS	Thiamine kinase and related kinases [Coenzyme transport and metabolism]. 	269
223585	COG0511	AccB	Biotin carboxyl carrier protein [Coenzyme transport and metabolism, Lipid transport and metabolism]. 	140
223586	COG0512	PabA	Anthranilate/para-aminobenzoate synthase component II [Amino acid transport and metabolism, Coenzyme transport and metabolism]. 	191
223587	COG0513	SrmB	Superfamily II DNA and RNA helicase [Replication, recombination and repair]. 	513
223588	COG0514	RecQ	Superfamily II DNA helicase RecQ [Replication, recombination and repair]. 	590
223589	COG0515	SPS1	Serine/threonine protein kinase  [Signal transduction mechanisms]. 	384
223590	COG0516	GuaB	IMP dehydrogenase/GMP reductase [Nucleotide transport and metabolism]. 	170
223591	COG0517	CBS	CBS domain [Signal transduction mechanisms]. 	117
223592	COG0518	GuaA1	GMP synthase - Glutamine amidotransferase domain [Nucleotide transport and metabolism]. 	198
223593	COG0519	GuaA2	GMP synthase, PP-ATPase domain/subunit [Nucleotide transport and metabolism]. 	315
223594	COG0520	CsdA	Selenocysteine lyase/Cysteine desulfurase [Amino acid transport and metabolism]. 	405
223595	COG0521	MoaB	Molybdopterin biosynthesis enzyme MoaB [Coenzyme transport and metabolism]. 	169
223596	COG0522	RpsD	Ribosomal protein S4 or related protein [Translation, ribosomal structure and biogenesis]. 	205
223597	COG0523	YejR	GTPase, G3E family [General function prediction only]. 	323
223598	COG0524	RbsK	Sugar or nucleoside kinase, ribokinase family [Carbohydrate transport and metabolism]. 	311
223599	COG0525	ValS	Valyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]. 	877
223600	COG0526	TrxA	Thiol-disulfide isomerase or thioredoxin [Posttranslational modification, protein turnover, chaperones]. 	127
223601	COG0527	LysC	Aspartokinase [Amino acid transport and metabolism]. 	447
223602	COG0528	PyrH	Uridylate kinase [Nucleotide transport and metabolism]. 	238
223603	COG0529	CysC	Adenylylsulfate kinase or related kinase [Inorganic ion transport and metabolism]. 	197
223604	COG0530	ECM27	Ca2+/Na+ antiporter [Inorganic ion transport and metabolism]. 	320
223605	COG0531	PotE	Amino acid transporter [Amino acid transport and metabolism]. 	466
223606	COG0532	InfB	Translation initiation factor IF-2, a GTPase [Translation, ribosomal structure and biogenesis]. 	509
223607	COG0533	TsaD	tRNA A37 threonylcarbamoyltransferase TsaD [Translation, ribosomal structure and biogenesis]. 	342
223608	COG0534	NorM	Na+-driven multidrug efflux pump [Defense mechanisms]. 	455
223609	COG0535	SkfB	Radical SAM superfamily enzyme, MoaA/NifB/PqqE/SkfB family [General function prediction only]. 	347
223610	COG0536	Obg	GTPase involved in cell partitioning and DNA repair [Cell cycle control, cell division, chromosome partitioning, Replication, recombination and repair]. 	369
223611	COG0537	Hit	Diadenosine tetraphosphate (Ap4A) hydrolase or other HIT family hydrolase [Nucleotide transport and metabolism, Carbohydrate transport and metabolism, General function prediction only]. 	138
223612	COG0538	Icd	Isocitrate dehydrogenase [Energy production and conversion]. 	407
223613	COG0539	RpsA	Ribosomal protein S1 [Translation, ribosomal structure and biogenesis]. 	541
223614	COG0540	PyrB	Aspartate carbamoyltransferase, catalytic chain [Nucleotide transport and metabolism]. 	316
223615	COG0541	Ffh	Signal recognition particle GTPase [Intracellular trafficking, secretion, and vesicular transport]. 	451
223616	COG0542	ClpA	ATP-dependent Clp protease ATP-binding subunit ClpA [Posttranslational modification, protein turnover, chaperones]. 	786
223617	COG0543	Mcr1	NAD(P)H-flavin reductase [Coenzyme transport and metabolism, Energy production and conversion]. 	252
223618	COG0544	Tig	FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) [Posttranslational modification, protein turnover, chaperones]. 	441
223619	COG0545	FkpA	FKBP-type peptidyl-prolyl cis-trans isomerase [Posttranslational modification, protein turnover, chaperones]. 	205
223620	COG0546	Gph	Phosphoglycolate phosphatase, HAD superfamily [Energy production and conversion]. 	220
223621	COG0547	TrpD	Anthranilate phosphoribosyltransferase [Amino acid transport and metabolism]. 	338
223622	COG0548	ArgB	Acetylglutamate kinase [Amino acid transport and metabolism]. 	265
223623	COG0549	ArcC	Carbamate kinase [Amino acid transport and metabolism]. 	312
223624	COG0550	TopA	DNA topoisomerase IA [Replication, recombination and repair]. 	570
223625	COG0551	YrdD	ssDNA-binding Zn-finger and Zn-ribbon domains of topoisomerase 1 [Replication, recombination and repair]. 	140
223626	COG0552	FtsY	Signal recognition particle GTPase [Intracellular trafficking, secretion, and vesicular transport]. 	340
223627	COG0553	HepA	Superfamily II DNA or RNA helicase, SNF2 family [Transcription, Replication, recombination and repair]. 	866
223628	COG0554	GlpK	Glycerol kinase [Energy production and conversion]. 	499
223629	COG0555	CysU	ABC-type sulfate transport system, permease component [Inorganic ion transport and metabolism]. 	274
223630	COG0556	UvrB	Excinuclease UvrABC helicase subunit UvrB [Replication, recombination and repair]. 	663
223631	COG0557	VacB	Exoribonuclease R [Transcription]. 	706
223632	COG0558	PgsA	Phosphatidylglycerophosphate synthase [Lipid transport and metabolism]. 	192
223633	COG0559	LivH	Branched-chain amino acid ABC-type transport system, permease component [Amino acid transport and metabolism]. 	297
223634	COG0560	SerB	Phosphoserine phosphatase [Amino acid transport and metabolism]. 	212
223635	COG0561	Cof	Hydroxymethylpyrimidine pyrophosphatase and other HAD family phosphatases [Coenzyme transport and metabolism, General function prediction only]. 	264
223636	COG0562	Glf	UDP-galactopyranose mutase [Cell wall/membrane/envelope biogenesis]. 	374
223637	COG0563	Adk	Adenylate kinase or related kinase [Nucleotide transport and metabolism]. 	178
223638	COG0564	RluA	Pseudouridylate synthase, 23S rRNA- or tRNA-specific [Translation, ribosomal structure and biogenesis]. 	289
223639	COG0565	TrmJ	tRNA C32,U32 (ribose-2'-O)-methylase TrmJ or a related methyltransferase [Translation, ribosomal structure and biogenesis]. 	242
223640	COG0566	SpoU	tRNA G18 (ribose-2'-O)-methylase SpoU [Translation, ribosomal structure and biogenesis]. 	260
223641	COG0567	SucA	2-oxoglutarate dehydrogenase complex, dehydrogenase (E1) component, and related enzymes [Energy production and conversion]. 	906
223642	COG0568	RpoD	DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) [Transcription]. 	342
223643	COG0569	TrkA	Trk K+ transport system, NAD-binding component [Inorganic ion transport and metabolism]. 	225
223644	COG0571	Rnc	dsRNA-specific ribonuclease [Transcription]. 	235
223645	COG0572	Udk	Uridine kinase [Nucleotide transport and metabolism]. 	218
223646	COG0573	PstC	ABC-type phosphate transport system, permease component [Inorganic ion transport and metabolism]. 	310
223647	COG0574	PpsA	Phosphoenolpyruvate synthase/pyruvate phosphate dikinase [Carbohydrate transport and metabolism]. 	740
223648	COG0575	CdsA	CDP-diglyceride synthetase [Lipid transport and metabolism]. 	265
223649	COG0576	GrpE	Molecular chaperone GrpE (heat shock protein) [Posttranslational modification, protein turnover, chaperones]. 	193
223650	COG0577	SalY	ABC-type antimicrobial peptide transport system, permease component [Defense mechanisms]. 	419
223651	COG0578	GlpA	Glycerol-3-phosphate dehydrogenase [Energy production and conversion]. 	532
223652	COG0579	LhgO	L-2-hydroxyglutarate oxidase LhgO [Carbohydrate transport and metabolism]. 	429
223653	COG0580	GlpF	Glycerol uptake facilitator and related aquaporins (Major Intrinsic Protein Family) [Carbohydrate transport and metabolism]. 	241
223654	COG0581	PstA	ABC-type phosphate transport system, permease component [Inorganic ion transport and metabolism]. 	292
223655	COG0582	XerC	Integrase [Replication, recombination and repair, Mobilome: prophages, transposons]. 	309
223656	COG0583	LysR	DNA-binding transcriptional regulator, LysR family [Transcription]. 	297
223657	COG0584	UgpQ	Glycerophosphoryl diester phosphodiesterase [Lipid transport and metabolism]. 	257
223658	COG0585	TruD	tRNA(Glu) U13 pseudouridine synthase TruD [Translation, ribosomal structure and biogenesis]. 	406
223659	COG0586	DedA	Uncharacterized membrane protein DedA, SNARE-associated domain [Function unknown]. 	208
223660	COG0587	DnaE	DNA polymerase III, alpha subunit [Replication, recombination and repair]. 	1139
223661	COG0588	GpmA	Phosphoglycerate mutase (BPG-dependent) [Carbohydrate transport and metabolism]. 	230
223662	COG0589	UspA	Nucleotide-binding universal stress protein,  UspA family [Signal transduction mechanisms]. 	154
223663	COG0590	TadA	tRNA(Arg) A34 adenosine deaminase TadA [Translation, ribosomal structure and biogenesis]. 	152
223664	COG0591	PutP	Na+/proline symporter [Amino acid transport and metabolism]. 	493
223665	COG0592	DnaN	DNA polymerase III sliding clamp (beta) subunit, PCNA homolog [Replication, recombination and repair]. 	364
223666	COG0593	DnaA	Chromosomal replication initiation ATPase DnaA [Replication, recombination and repair]. 	408
223667	COG0594	RnpA	RNase P protein component [Translation, ribosomal structure and biogenesis]. 	117
223668	COG0595	RnjA	mRNA degradation ribonuclease J1/J2 [Translation, ribosomal structure and biogenesis]. 	555
223669	COG0596	MhpC	Pimeloyl-ACP methyl ester carboxylesterase [Coenzyme transport and metabolism, General function prediction only]. 	282
223670	COG0597	LspA	Lipoprotein signal peptidase [Cell wall/membrane/envelope biogenesis, Intracellular trafficking, secretion, and vesicular transport]. 	167
223671	COG0598	CorA	Mg2+ and Co2+ transporter CorA [Inorganic ion transport and metabolism]. 	322
223672	COG0599	YurZ	Uncharacterized conserved protein YurZ, alkylhydroperoxidase/carboxymuconolactone decarboxylase family  [General function prediction only]. 	124
223673	COG0600	TauC	ABC-type nitrate/sulfonate/bicarbonate transport system, permease component [Inorganic ion transport and metabolism]. 	258
223674	COG0601	DppB	ABC-type dipeptide/oligopeptide/nickel transport system, permease component [Amino acid transport and metabolism, Inorganic ion transport and metabolism]. 	317
223675	COG0602	NrdG	Organic radical activating enzyme [General function prediction only]. 	212
223676	COG0603	QueC	7-cyano-7-deazaguanine synthase (queuosine biosynthesis) [Translation, ribosomal structure and biogenesis]. 	222
223677	COG0604	Qor	NADPH:quinone reductase or related Zn-dependent oxidoreductase [Energy production and conversion, General function prediction only]. 	326
223678	COG0605	SodA	Superoxide dismutase [Inorganic ion transport and metabolism]. 	204
223679	COG0606	YifB	Predicted ATPase with chaperone activity [Posttranslational modification, protein turnover, chaperones]. 	490
223680	COG0607	PspE	Rhodanese-related sulfurtransferase [Inorganic ion transport and metabolism]. 	110
223681	COG0608	RecJ	Single-stranded DNA-specific exonuclease, DHH superfamily, may be involved in archaeal DNA replication initiation [Replication, recombination and repair]. 	491
223682	COG0609	FepD	ABC-type Fe3+-siderophore transport system, permease component [Inorganic ion transport and metabolism]. 	334
223683	COG0610	COG0610	Type I site-specific restriction-modification system, R (restriction) subunit and related helicases ... [Defense mechanisms]. 	962
223684	COG0611	ThiL	Thiamine monophosphate kinase [Coenzyme transport and metabolism]. 	317
223685	COG0612	PqqL	Predicted Zn-dependent peptidase [General function prediction only]. 	438
223686	COG0613	YciV	Predicted metal-dependent phosphoesterase TrpH, contains PHP domain [General function prediction only]. 	258
223687	COG0614	FepB	ABC-type Fe3+-hydroxamate transport system, periplasmic component [Inorganic ion transport and metabolism]. 	319
223688	COG0615	TagD	Glycerol-3-phosphate cytidylyltransferase, cytidylyltransferase family  [Cell wall/membrane/envelope biogenesis]. 	140
223689	COG0616	SppA	Periplasmic serine protease, ClpP class [Posttranslational modification, protein turnover, chaperones]. 	317
223690	COG0617	PcnB	tRNA nucleotidyltransferase/poly(A) polymerase [Translation, ribosomal structure and biogenesis]. 	412
223691	COG0618	NrnA	nanoRNase/pAp phosphatase, hydrolyzes c-di-AMP and oligoRNAs   [Nucleotide transport and metabolism]. 	332
223692	COG0619	EcfT	Energy-coupling factor transporter transmembrane protein EcfT [Coenzyme transport and metabolism]. 	252
223693	COG0620	MetE	Methionine synthase II (cobalamin-independent) [Amino acid transport and metabolism]. 	330
223694	COG0621	MiaB	tRNA A37 methylthiotransferase MiaB [Translation, ribosomal structure and biogenesis]. 	437
223695	COG0622	YfcE	Predicted phosphodiesterase [General function prediction only]. 	172
223696	COG0623	FabI	Enoyl-[acyl-carrier-protein] reductase (NADH) [Lipid transport and metabolism]. 	259
223697	COG0624	ArgE	Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase or related deacylase [Amino acid transport and metabolism]. 	409
223698	COG0625	GstA	Glutathione S-transferase [Posttranslational modification, protein turnover, chaperones]. 	211
223699	COG0626	MetC	Cystathionine beta-lyase/cystathionine gamma-synthase [Amino acid transport and metabolism]. 	396
223700	COG0627	FrmB	S-formylglutathione hydrolase FrmB [Defense mechanisms]. 	316
223701	COG0628	PerM	Predicted PurR-regulated permease PerM [General function prediction only]. 	355
223702	COG0629	Ssb	Single-stranded DNA-binding protein [Replication, recombination and repair]. 	167
223703	COG0630	VirB11	Type IV secretory pathway ATPase VirB11/Archaellum biosynthesis ATPase [Intracellular trafficking, secretion, and vesicular transport]. 	312
223704	COG0631	PTC1	Serine/threonine protein phosphatase PrpC [Signal transduction mechanisms]. 	262
223705	COG0632	RuvA	Holliday junction resolvasome RuvABC DNA-binding subunit [Replication, recombination and repair]. 	201
223706	COG0633	Fdx	Ferredoxin [Energy production and conversion]. 	102
223707	COG0634	HptA	Hypoxanthine-guanine phosphoribosyltransferase [Nucleotide transport and metabolism]. 	178
223708	COG0635	HemN	Coproporphyrinogen III oxidase or related Fe-S oxidoreductase [Coenzyme transport and metabolism]. 	416
223709	COG0636	AtpE	FoF1-type ATP synthase, membrane subunit c/Archaeal/vacuolar-type H+-ATPase, subunit K [Energy production and conversion]. 	79
223710	COG0637	YcjU	Beta-phosphoglucomutase or related phosphatase, HAD superfamily  [Carbohydrate transport and metabolism, General function prediction only]. 	221
223711	COG0638	PRE1	20S proteasome, alpha and beta subunits  [Posttranslational modification, protein turnover, chaperones]. 	236
223712	COG0639	ApaH	Diadenosine tetraphosphatase ApaH/serine/threonine protein phosphatase, PP2A family [Signal transduction mechanisms]. 	155
223713	COG0640	ArsR	DNA-binding transcriptional regulator, ArsR family [Transcription]. 	110
223714	COG0641	AslB	Sulfatase maturation enzyme AslB, radical SAM superfamily  [Posttranslational modification, protein turnover, chaperones]. 	378
223715	COG0642	BaeS	Signal transduction histidine kinase [Signal transduction mechanisms]. 	336
223716	COG0643	CheA	Chemotaxis protein histidine kinase CheA [Cell motility, Signal transduction mechanisms]. 	716
223717	COG0644	FixC	Dehydrogenase (flavoprotein) [Energy production and conversion]. 	396
223718	COG0645	COG0645	Predicted kinase  [General function prediction only]. 	170
223719	COG0646	MetH1	Methionine synthase I (cobalamin-dependent), methyltransferase domain [Amino acid transport and metabolism]. 	311
223720	COG0647	NagD	Ribonucleotide monophosphatase NagD, HAD superfamily [Nucleotide transport and metabolism]. 	269
223721	COG0648	Nfo	Endonuclease IV [Replication, recombination and repair]. 	280
223722	COG0649	NuoD	NADH:ubiquinone oxidoreductase 49 kD subunit (chain D) [Energy production and conversion]. 	398
223723	COG0650	HyfC	Formate hydrogenlyase subunit 4 [Energy production and conversion]. 	309
223724	COG0651	HyfB	Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter, MnhD subunit [Energy production and conversion, Inorganic ion transport and metabolism]. 	504
223725	COG0652	PpiB	Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family [Posttranslational modification, protein turnover, chaperones]. 	158
223726	COG0653	SecA	Preprotein translocase subunit SecA (ATPase, RNA helicase) [Intracellular trafficking, secretion, and vesicular transport]. 	822
223727	COG0654	UbiH	2-polyprenyl-6-methoxyphenol hydroxylase and related FAD-dependent oxidoreductases [Coenzyme transport and metabolism, Energy production and conversion]. 	387
223728	COG0655	WrbA	Multimeric flavodoxin WrbA [Energy production and conversion]. 	207
223729	COG0656	ARA1	Aldo/keto reductase, related to diketogulonate reductase [Secondary metabolites biosynthesis, transport and catabolism]. 	280
223730	COG0657	Aes	Acetyl esterase/lipase [Lipid transport and metabolism]. 	312
223731	COG0658	ComEC	Predicted membrane metal-binding protein [General function prediction only]. 	453
223732	COG0659	SUL1	Sulfate permease or related transporter, MFS superfamily [Inorganic ion transport and metabolism]. 	554
223733	COG0661	AarF	Predicted unusual protein kinase regulating ubiquinone biosynthesis, AarF/ABC1/UbiB family [Coenzyme transport and metabolism, Signal transduction mechanisms]. 	517
223734	COG0662	ManC	Mannose-6-phosphate isomerase, cupin superfamily [Carbohydrate transport and metabolism]. 	127
223735	COG0663	PaaY	Carbonic anhydrase or acetyltransferase, isoleucine patch superfamily [General function prediction only]. 	176
223736	COG0664	Crp	cAMP-binding domain of CRP or a regulatory subunit of cAMP-dependent protein kinases [Signal transduction mechanisms]. 	214
223737	COG0665	DadA	Glycine/D-amino acid oxidase (deaminating) [Amino acid transport and metabolism]. 	387
223738	COG0666	ANKYR	Ankyrin repeat [Signal transduction mechanisms]. 	235
223739	COG0667	Tas	Predicted oxidoreductase (related to aryl-alcohol dehydrogenase) [General function prediction only]. 	316
223740	COG0668	MscS	Small-conductance mechanosensitive channel [Cell wall/membrane/envelope biogenesis]. 	316
223741	COG0669	CoaD	Phosphopantetheine adenylyltransferase [Coenzyme transport and metabolism]. 	159
223742	COG0670	YbhL	Integral membrane protein, interacts with FtsH [General function prediction only]. 	233
223743	COG0671	PgpB	Membrane-associated phospholipid phosphatase [Lipid transport and metabolism]. 	232
223744	COG0672	FTR1	High-affinity Fe2+/Pb2+ permease [Inorganic ion transport and metabolism]. 	383
223745	COG0673	MviM	Predicted dehydrogenase [General function prediction only]. 	342
223746	COG0674	PorA	Pyruvate:ferredoxin oxidoreductase or related 2-oxoacid:ferredoxin oxidoreductase, alpha subunit [Energy production and conversion]. 	365
223747	COG0675	InsQ	Transposase [Mobilome: prophages, transposons]. 	364
223748	COG0676	YeaD	D-hexose-6-phosphate mutarotase [Carbohydrate transport and metabolism]. 	287
223749	COG0677	WecC	UDP-N-acetyl-D-mannosaminuronate dehydrogenase [Cell wall/membrane/envelope biogenesis]. 	436
223750	COG0678	AHP1	Peroxiredoxin  [Posttranslational modification, protein turnover, chaperones]. 	165
223751	COG0679	YfdV	Predicted permease [General function prediction only]. 	311
223752	COG0680	HyaD	Ni,Fe-hydrogenase maturation factor [Energy production and conversion]. 	160
223753	COG0681	LepB	Signal peptidase I [Intracellular trafficking, secretion, and vesicular transport]. 	166
223754	COG0682	Lgt	Prolipoprotein diacylglyceryltransferase [Cell wall/membrane/envelope biogenesis]. 	287
223755	COG0683	LivK	ABC-type branched-chain amino acid transport system, periplasmic component [Amino acid transport and metabolism]. 	366
223756	COG0684	RraA	Regulator of RNase E activity RraA [Translation, ribosomal structure and biogenesis]. 	210
223757	COG0685	MetF	5,10-methylenetetrahydrofolate reductase [Amino acid transport and metabolism]. 	291
223758	COG0686	Ald	Alanine dehydrogenase  [Amino acid transport and metabolism]. 	371
223759	COG0687	PotD	Spermidine/putrescine-binding periplasmic protein [Amino acid transport and metabolism]. 	363
223760	COG0688	Psd	Phosphatidylserine decarboxylase [Lipid transport and metabolism]. 	239
223761	COG0689	Rph	Ribonuclease PH [Translation, ribosomal structure and biogenesis]. 	230
223762	COG0690	SecE	Preprotein translocase subunit SecE [Intracellular trafficking, secretion, and vesicular transport]. 	73
223763	COG0691	SmpB	tmRNA-binding protein [Posttranslational modification, protein turnover, chaperones]. 	153
223764	COG0692	Ung	Uracil DNA glycosylase [Replication, recombination and repair]. 	223
223765	COG0693	ThiJ	Putative intracellular protease/amidase [General function prediction only]. 	188
223766	COG0694	NifU	Fe-S cluster biogenesis protein NfuA, 4Fe-4S-binding domain [Posttranslational modification, protein turnover, chaperones]. 	93
223767	COG0695	GrxC	Glutaredoxin [Posttranslational modification, protein turnover, chaperones]. 	80
223768	COG0696	GpmI	Phosphoglycerate mutase (BPG-independent, AlkP superfamily) [Carbohydrate transport and metabolism]. 	509
223769	COG0697	RhaT	Permease of the drug/metabolite transporter (DMT) superfamily [Carbohydrate transport and metabolism, Amino acid transport and metabolism, General function prediction only]. 	292
223770	COG0698	RpiB	Ribose 5-phosphate isomerase RpiB [Carbohydrate transport and metabolism]. 	151
223771	COG0699	CrfC	Replication fork clamp-binding protein CrfC (dynamin-like GTPase family) [Replication, recombination and repair]. 	546
223772	COG0700	SpmB	Spore maturation protein SpmB (function unknown) [Function unknown]. 	162
223773	COG0701	YraQ	Uncharacterized membrane protein YraQ, UPF0718 family [Function unknown]. 	317
223774	COG0702	YbjT	Uncharacterized conserved protein YbjT, contains NAD(P)-binding and DUF2867 domains [General function prediction only]. 	275
223775	COG0703	AroK	Shikimate kinase [Amino acid transport and metabolism]. 	172
223776	COG0704	PhoU	Phosphate uptake regulator [Inorganic ion transport and metabolism]. 	240
223777	COG0705	GlpG	Membrane associated serine protease, rhomboid family  [Posttranslational modification, protein turnover, chaperones]. 	228
223778	COG0706	YidC	Membrane protein insertase Oxa1/YidC/SpoIIIJ, required for the localization of integral membrane proteins [Cell wall/membrane/envelope biogenesis]. 	314
223779	COG0707	MurG	UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase [Cell wall/membrane/envelope biogenesis]. 	357
223780	COG0708	XthA	Exonuclease III [Replication, recombination and repair]. 	261
223781	COG0709	SelD	Selenophosphate synthase [Amino acid transport and metabolism]. 	346
223782	COG0710	AroD	3-dehydroquinate dehydratase [Amino acid transport and metabolism]. 	231
223783	COG0711	AtpF	FoF1-type ATP synthase, membrane subunit b or b' [Energy production and conversion]. 	161
223784	COG0712	AtpH	FoF1-type ATP synthase, delta subunit [Energy production and conversion]. 	178
223785	COG0713	NuoK	NADH:ubiquinone oxidoreductase subunit 11 or 4L (chain K) [Energy production and conversion]. 	100
223786	COG0714	MoxR	MoxR-like ATPase [General function prediction only]. 	329
223787	COG0715	TauA	ABC-type nitrate/sulfonate/bicarbonate transport system, periplasmic component [Inorganic ion transport and metabolism]. 	335
223788	COG0716	FldA	Flavodoxin [Energy production and conversion]. 	151
223789	COG0717	Dcd	Deoxycytidine triphosphate deaminase [Nucleotide transport and metabolism]. 	183
223790	COG0718	YbaB	Conserved DNA-binding protein YbaB (function unknown) [General function prediction only]. 	105
223791	COG0719	SufB	Fe-S cluster assembly scaffold protein SufB [Posttranslational modification, protein turnover, chaperones]. 	412
223792	COG0720	QueD	6-pyruvoyl-tetrahydropterin synthase [Coenzyme transport and metabolism]. 	127
223793	COG0721	GatC	Asp-tRNAAsn/Glu-tRNAGln amidotransferase C subunit  [Translation, ribosomal structure and biogenesis]. 	96
223794	COG0722	AroG1	3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase [Amino acid transport and metabolism]. 	351
223795	COG0723	QcrA	Rieske Fe-S protein  [Energy production and conversion]. 	177
223796	COG0724	RRM	RNA recognition motif (RRM) domain [Translation, ribosomal structure and biogenesis]. 	306
223797	COG0725	ModA	ABC-type molybdate transport system, periplasmic component [Inorganic ion transport and metabolism]. 	258
223798	COG0726	CDA1	Peptidoglycan/xylan/chitin deacetylase, PgdA/CDA1 family [Carbohydrate transport and metabolism, Cell wall/membrane/envelope biogenesis]. 	267
223799	COG0727	YkgJ	Fe-S-cluster containing protein [General function prediction only]. 	132
223800	COG0728	MviN	Peptidoglycan biosynthesis protein MviN/MurJ, putative lipid II flippase [Cell wall/membrane/envelope biogenesis]. 	518
223801	COG0729	TamA	Outer membrane translocation and assembly module TamA [Cell wall/membrane/envelope biogenesis]. 	594
223802	COG0730	YfcA	Uncharacterized membrane protein YfcA [Function unknown]. 	258
223803	COG0731	Tyw1	Wyosine [tRNA(Phe)-imidazoG37] synthetase, radical SAM superfamily  [Translation, ribosomal structure and biogenesis]. 	296
223804	COG0732	HsdS	Restriction endonuclease S subunit [Defense mechanisms]. 	391
223805	COG0733	YocR	Na+-dependent transporter, SNF family  [General function prediction only]. 	439
223806	COG0735	Fur	Fe2+ or Zn2+ uptake regulation protein [Inorganic ion transport and metabolism]. 	145
223807	COG0736	AcpS	Phosphopantetheinyl transferase (holo-ACP synthase) [Lipid transport and metabolism]. 	127
223808	COG0737	UshA	2',3'-cyclic-nucleotide 2'-phosphodiesterase/5'- or 3'-nucleotidase, 5'-nucleotidase family [Nucleotide transport and metabolism, Defense mechanisms]. 	517
223809	COG0738	FucP	Fucose permease [Carbohydrate transport and metabolism]. 	422
223810	COG0739	NlpD	Murein DD-endopeptidase MepM and murein hydrolase activator NlpD, contain LysM domain [Cell wall/membrane/envelope biogenesis]. 	277
223811	COG0740	ClpP	ATP-dependent protease ClpP, protease subunit [Posttranslational modification, protein turnover, chaperones]. 	200
223812	COG0741	MltE	Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) [Cell wall/membrane/envelope biogenesis]. 	296
223813	COG0742	RsmD	16S rRNA G966 N2-methylase RsmD [Translation, ribosomal structure and biogenesis]. 	187
223814	COG0743	Dxr	1-deoxy-D-xylulose 5-phosphate reductoisomerase [Lipid transport and metabolism]. 	385
223815	COG0744	MrcB	Membrane carboxypeptidase (penicillin-binding protein) [Cell wall/membrane/envelope biogenesis]. 	661
223816	COG0745	OmpR	DNA-binding response regulator, OmpR family, contains REC and winged-helix (wHTH) domain [Signal transduction mechanisms, Transcription]. 	229
223817	COG0746	MobA	Molybdopterin-guanine dinucleotide biosynthesis protein A [Coenzyme transport and metabolism]. 	192
223818	COG0747	DdpA	ABC-type transport system, periplasmic component [Amino acid transport and metabolism]. 	556
223819	COG0748	HugZ	Putative heme iron utilization protein  [Inorganic ion transport and metabolism]. 	245
223820	COG0749	PolA	DNA polymerase I - 3'-5' exonuclease and polymerase domains [Replication, recombination and repair]. 	593
223821	COG0750	RseP	Membrane-associated protease RseP, regulator of RpoE activity  [Posttranslational modification, protein turnover, chaperones, Transcription]. 	375
223822	COG0751	GlyS	Glycyl-tRNA synthetase, beta subunit [Translation, ribosomal structure and biogenesis]. 	691
223823	COG0752	GlyQ	Glycyl-tRNA synthetase, alpha subunit [Translation, ribosomal structure and biogenesis]. 	298
223824	COG0753	KatE	Catalase [Inorganic ion transport and metabolism]. 	496
223825	COG0754	Gsp	Glutathionylspermidine synthase [Amino acid transport and metabolism]. 	387
223826	COG0755	CcmC	ABC-type transport system involved in cytochrome c biogenesis, permease component [Posttranslational modification, protein turnover, chaperones]. 	281
223827	COG0756	Dut	dUTPase [Nucleotide transport and metabolism, Defense mechanisms]. 	148
223828	COG0757	AroQ	3-dehydroquinate dehydratase [Amino acid transport and metabolism]. 	146
223829	COG0758	Smf	Predicted Rossmann fold nucleotide-binding protein DprA/Smf involved in DNA uptake [Replication, recombination and repair]. 	350
223830	COG0759	YidD	Membrane-anchored protein YidD, putative component of membrane protein insertase Oxa1/YidC/SpoIIIJ [Cell wall/membrane/envelope biogenesis]. 	92
223831	COG0760	SurA	Parvulin-like peptidyl-prolyl isomerase [Posttranslational modification, protein turnover, chaperones]. 	320
223832	COG0761	IspH	4-Hydroxy-3-methylbut-2-enyl diphosphate reductase IspH [Lipid transport and metabolism]. 	294
223833	COG0762	Ycf19	Uncharacterized conserved protein YggT, Ycf19 family [Function unknown]. 	96
223834	COG0763	LpxB	Lipid A disaccharide synthetase [Cell wall/membrane/envelope biogenesis]. 	381
223835	COG0764	FabA	3-hydroxymyristoyl/3-hydroxydecanoyl-(acyl carrier protein) dehydratase [Lipid transport and metabolism]. 	147
223836	COG0765	HisM	ABC-type amino acid transport system, permease component [Amino acid transport and metabolism]. 	222
223837	COG0766	MurA	UDP-N-acetylglucosamine enolpyruvyl transferase [Cell wall/membrane/envelope biogenesis]. 	421
223838	COG0767	MlaE	ABC-type transporter Mla maintaining outer membrane lipid asymmetry, permease component MlaE [Cell wall/membrane/envelope biogenesis]. 	267
223839	COG0768	FtsI	Cell division protein FtsI/penicillin-binding protein 2 [Cell cycle control, cell division, chromosome partitioning, Cell wall/membrane/envelope biogenesis]. 	599
223840	COG0769	MurE	UDP-N-acetylmuramyl tripeptide synthase [Cell wall/membrane/envelope biogenesis]. 	475
223841	COG0770	MurF	UDP-N-acetylmuramyl pentapeptide synthase [Cell wall/membrane/envelope biogenesis]. 	451
223842	COG0771	MurD	UDP-N-acetylmuramoylalanine-D-glutamate ligase [Cell wall/membrane/envelope biogenesis]. 	448
223843	COG0772	FtsW	Bacterial cell division protein FtsW, lipid II flippase [Cell cycle control, cell division, chromosome partitioning]. 	381
223844	COG0773	MurC	UDP-N-acetylmuramate-alanine ligase [Cell wall/membrane/envelope biogenesis]. 	459
223845	COG0774	LpxC	UDP-3-O-acyl-N-acetylglucosamine deacetylase [Cell wall/membrane/envelope biogenesis]. 	300
223846	COG0775	Pfs	Nucleoside phosphorylase [Nucleotide transport and metabolism]. 	234
223847	COG0776	HimA	Bacterial nucleoid DNA-binding protein [Replication, recombination and repair]. 	94
223848	COG0777	AccD	Acetyl-CoA carboxylase beta subunit [Lipid transport and metabolism]. 	294
223849	COG0778	NfnB	Nitroreductase [Energy production and conversion]. 	207
223850	COG0779	RimP	Ribosome maturation factor RimP [Translation, ribosomal structure and biogenesis]. 	153
223851	COG0780	QueFC	NADPH-dependent 7-cyano-7-deazaguanine reductase QueF, C-terminal domain, T-fold superfamily [Translation, ribosomal structure and biogenesis]. 	149
223852	COG0781	NusB	Transcription termination factor NusB [Transcription]. 	151
223853	COG0782	GreA	Transcription elongation factor, GreA/GreB family [Transcription]. 	151
223854	COG0783	Dps	DNA-binding ferritin-like protein (oxidative damage protectant) [Inorganic ion transport and metabolism, Defense mechanisms]. 	156
223855	COG0784	CheY	CheY chemotaxis protein or a CheY-like REC (receiver) domain [Signal transduction mechanisms]. 	130
223856	COG0785	CcdA	Cytochrome c biogenesis protein CcdA [Energy production and conversion, Posttranslational modification, protein turnover, chaperones]. 	220
223857	COG0786	GltS	Na+/glutamate symporter [Amino acid transport and metabolism]. 	404
223858	COG0787	Alr	Alanine racemase [Cell wall/membrane/envelope biogenesis]. 	360
223859	COG0788	PurU	Formyltetrahydrofolate hydrolase [Nucleotide transport and metabolism]. 	287
223860	COG0789	SoxR	DNA-binding transcriptional regulator, MerR family [Transcription]. 	124
223861	COG0790	TPR	TPR repeat [Signal transduction mechanisms]. 	292
223862	COG0791	Spr	Cell wall-associated hydrolase, NlpC family [Cell wall/membrane/envelope biogenesis]. 	197
223863	COG0792	YraN	Predicted endonuclease distantly related to archaeal Holliday junction resolvase [Replication, recombination and repair]. 	114
223864	COG0793	CtpA	C-terminal processing protease CtpA/Prc, contains a PDZ domain [Posttranslational modification, protein turnover, chaperones]. 	406
223865	COG0794	GutQ	D-arabinose 5-phosphate isomerase GutQ [Carbohydrate transport and metabolism, Cell wall/membrane/envelope biogenesis]. 	202
223866	COG0795	LptF	Lipopolysaccharide export LptBFGC system, permease protein LptF [Cell wall/membrane/envelope biogenesis, Cell motility]. 	364
223867	COG0796	MurI	Glutamate racemase [Cell wall/membrane/envelope biogenesis]. 	269
223868	COG0797	RlpA	Rare lipoprotein A, peptidoglycan hydrolase digesting "naked" glycans, contains C-terminal SPOR domain [Cell wall/membrane/envelope biogenesis]. 	233
223869	COG0798	ACR3	Arsenite efflux pump ArsB, ACR3 family [Inorganic ion transport and metabolism]. 	342
223870	COG0799	RsfS	Ribosomal silencing factor RsfS, regulates association of 30S and 50S subunits [Translation, ribosomal structure and biogenesis]. 	115
223871	COG0800	Eda	2-keto-3-deoxy-6-phosphogluconate aldolase [Carbohydrate transport and metabolism]. 	211
223872	COG0801	FolK	7,8-dihydro-6-hydroxymethylpterin-pyrophosphokinase [Coenzyme transport and metabolism]. 	160
223873	COG0802	TsaE	tRNA A37 threonylcarbamoyladenosine biosynthesis protein TsaE [Translation, ribosomal structure and biogenesis]. 	149
223874	COG0803	ZnuA	ABC-type Zn uptake system ZnuABC, Zn-binding component ZnuA [Inorganic ion transport and metabolism]. 	303
223875	COG0804	UreC	Urease alpha subunit  [Amino acid transport and metabolism]. 	568
223876	COG0805	TatC	Sec-independent protein secretion pathway component TatC [Intracellular trafficking, secretion, and vesicular transport]. 	255
223877	COG0806	RimM	Ribosomal 30S subunit maturation factor RimM, required for 16S rRNA processing [Translation, ribosomal structure and biogenesis]. 	174
223878	COG0807	RibA	GTP cyclohydrolase II [Coenzyme transport and metabolism]. 	193
223879	COG0809	QueA	S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase) [Translation, ribosomal structure and biogenesis]. 	348
223880	COG0810	TonB	Periplasmic protein TonB, links inner and outer membranes [Cell wall/membrane/envelope biogenesis]. 	244
223881	COG0811	TolQ	Biopolymer transport protein ExbB/TolQ [Intracellular trafficking, secretion, and vesicular transport]. 	216
223882	COG0812	MurB	UDP-N-acetylenolpyruvoylglucosamine reductase [Cell wall/membrane/envelope biogenesis]. 	291
223883	COG0813	DeoD	Purine-nucleoside phosphorylase [Nucleotide transport and metabolism]. 	236
223884	COG0814	SdaC	Amino acid permease [Amino acid transport and metabolism]. 	415
223885	COG0815	Lnt	Apolipoprotein N-acyltransferase [Cell wall/membrane/envelope biogenesis]. 	518
223886	COG0816	YqgF	RNase H-fold protein, predicted Holliday junction resolvase in Firmicutes and mycoplasmas, involved in anti-termination at Rho-dependent terminators [Transcription]. 	141
223887	COG0817	RuvC	Holliday junction resolvasome RuvABC endonuclease subunit [Replication, recombination and repair]. 	160
223888	COG0818	DgkA	Diacylglycerol kinase [Lipid transport and metabolism]. 	123
223889	COG0819	TenA	Thiaminase [Coenzyme transport and metabolism]. 	218
223890	COG0820	RlmN	Adenine C2-methylase RlmN of 23S rRNA A2503 and tRNA A37 [Translation, ribosomal structure and biogenesis]. 	349
223891	COG0821	IspG	4-hydroxy-3-methylbut-2-en-1-yl diphosphate synthase IspG/GcpE [Lipid transport and metabolism]. 	361
223892	COG0822	IscU	NifU homolog involved in Fe-S cluster formation [Posttranslational modification, protein turnover, chaperones]. 	150
223893	COG0823	TolB	Periplasmic component of the Tol biopolymer transport system [Intracellular trafficking, secretion, and vesicular transport]. 	425
223894	COG0824	FadM	Acyl-CoA thioesterase FadM [Lipid transport and metabolism]. 	137
223895	COG0825	AccA	Acetyl-CoA carboxylase alpha subunit [Lipid transport and metabolism]. 	317
223896	COG0826	PrtC	Collagenase-like protease, PrtC family [Posttranslational modification, protein turnover, chaperones]. 	347
223897	COG0827	YtxK	Adenine-specific DNA methylase  [Replication, recombination and repair]. 	381
223898	COG0828	RpsU	Ribosomal protein S21 [Translation, ribosomal structure and biogenesis]. 	67
223899	COG0829	UreH	Urease accessory protein UreH  [Posttranslational modification, protein turnover, chaperones]. 	269
223900	COG0830	UreF	Urease accessory protein UreF  [Posttranslational modification, protein turnover, chaperones]. 	229
223901	COG0831	UreA	Urease gamma subunit  [Amino acid transport and metabolism]. 	100
223902	COG0832	UreB	Urease beta subunit  [Amino acid transport and metabolism]. 	106
223903	COG0833	LysP	Amino acid permease [Amino acid transport and metabolism]. 	541
223904	COG0834	HisJ	ABC-type amino acid transport/signal transduction system, periplasmic component/domain [Amino acid transport and metabolism, Signal transduction mechanisms]. 	275
223905	COG0835	CheW	Chemotaxis signal transduction protein [Cell motility, Signal transduction mechanisms]. 	165
223906	COG0836	CpsB	Mannose-1-phosphate guanylyltransferase [Cell wall/membrane/envelope biogenesis]. 	333
223907	COG0837	Glk	Glucokinase [Carbohydrate transport and metabolism]. 	320
223908	COG0838	NuoA	NADH:ubiquinone oxidoreductase subunit 3 (chain A) [Energy production and conversion]. 	123
223909	COG0839	NuoJ	NADH:ubiquinone oxidoreductase subunit 6 (chain J) [Energy production and conversion]. 	166
223910	COG0840	Tar	Methyl-accepting chemotaxis protein [Cell motility, Signal transduction mechanisms]. 	408
223911	COG0841	AcrB	Multidrug efflux pump subunit AcrB [Defense mechanisms]. 	1009
223912	COG0842	YadH	ABC-type multidrug transport system, permease component [Defense mechanisms]. 	286
223913	COG0843	CyoB	Heme/copper-type cytochrome/quinol oxidase, subunit 1 [Energy production and conversion]. 	566
223914	COG0845	AcrA	Multidrug efflux pump subunit AcrA (membrane-fusion protein) [Cell wall/membrane/envelope biogenesis, Defense mechanisms]. 	372
223915	COG0846	SIR2	NAD-dependent protein deacetylase, SIR2 family [Posttranslational modification, protein turnover, chaperones]. 	250
223916	COG0847	DnaQ	DNA polymerase III, epsilon subunit or related 3'-5' exonuclease [Replication, recombination and repair]. 	243
223917	COG0848	ExbD	Biopolymer transport protein ExbD [Intracellular trafficking, secretion, and vesicular transport]. 	137
223918	COG0849	FtsA	Cell division ATPase FtsA [Cell cycle control, cell division, chromosome partitioning]. 	418
223919	COG0850	MinC	Septum formation inhibitor MinC [Cell cycle control, cell division, chromosome partitioning]. 	219
223920	COG0851	MinE	Septum formation topological specificity factor MinE [Cell cycle control, cell division, chromosome partitioning]. 	88
223921	COG0852	NuoC	NADH:ubiquinone oxidoreductase 27 kD subunit (chain C) [Energy production and conversion]. 	176
223922	COG0853	PanD	Aspartate 1-decarboxylase [Coenzyme transport and metabolism]. 	126
223923	COG0854	PdxJ	Pyridoxine 5'-phosphate synthase PdxJ [Coenzyme transport and metabolism]. 	243
223924	COG0855	Ppk	Polyphosphate kinase [Inorganic ion transport and metabolism]. 	696
223925	COG0856	PyrE2	Orotate phosphoribosyltransferase homolog [Nucleotide transport and metabolism]. 	203
223926	COG0857	PtaN	BioD-like N-terminal domain of phosphotransacetylase [General function prediction only]. 	354
223927	COG0858	RbfA	Ribosome-binding factor A [Translation, ribosomal structure and biogenesis]. 	118
223928	COG0859	RfaF	ADP-heptose:LPS heptosyltransferase [Cell wall/membrane/envelope biogenesis]. 	334
223929	COG0860	AmiC	N-acetylmuramoyl-L-alanine amidase [Cell wall/membrane/envelope biogenesis]. 	231
223930	COG0861	TerC	Membrane protein TerC, possibly involved in tellurium resistance [Inorganic ion transport and metabolism]. 	254
223931	COG0863	YhdJ	DNA modification methylase [Replication, recombination and repair]. 	302
223932	COG0864	NikR	Metal-responsive transcriptional regulator, contains CopG/Arc/MetJ DNA-binding domain [Transcription]. 	136
223933	COG1001	AdeC	Adenine deaminase [Nucleotide transport and metabolism]. 	584
223934	COG1002	YeeA	Type II restriction/modification system, DNA methylase subunit YeeA  [Defense mechanisms]. 	786
223935	COG1003	GcvP2	Glycine cleavage system protein P (pyridoxal-binding), C-terminal domain [Amino acid transport and metabolism]. 	496
223936	COG1004	Ugd	UDP-glucose 6-dehydrogenase [Cell wall/membrane/envelope biogenesis]. 	414
223937	COG1005	NuoH	NADH:ubiquinone oxidoreductase subunit 1 (chain H) [Energy production and conversion]. 	332
223938	COG1006	MnhC	Multisubunit Na+/H+ antiporter, MnhC subunit  [Inorganic ion transport and metabolism]. 	115
223939	COG1007	NuoN	NADH:ubiquinone oxidoreductase subunit 2 (chain N) [Energy production and conversion]. 	475
223940	COG1008	NuoM	NADH:ubiquinone oxidoreductase subunit 4 (chain M) [Energy production and conversion]. 	497
223941	COG1009	NuoL	NADH:ubiquinone oxidoreductase subunit 5 (chain L)/Multisubunit Na+/H+ antiporter, MnhA subunit [Energy production and conversion, Inorganic ion transport and metabolism]. 	606
223942	COG1010	CobJ	Precorrin-3B methylase  [Coenzyme transport and metabolism]. 	249
223943	COG1011	YigB	FMN phosphatase YigB, HAD superfamily [Coenzyme transport and metabolism]. 	229
223944	COG1012	AdhE	Acyl-CoA reductase or other NAD-dependent aldehyde dehydrogenase [Energy production and conversion]. 	472
223945	COG1013	PorB	Pyruvate:ferredoxin oxidoreductase or related 2-oxoacid:ferredoxin oxidoreductase, beta subunit [Energy production and conversion]. 	294
223946	COG1014	PorG	Pyruvate:ferredoxin oxidoreductase or related 2-oxoacid:ferredoxin oxidoreductase, gamma subunit [Energy production and conversion]. 	203
223947	COG1015	DeoB	Phosphopentomutase [Carbohydrate transport and metabolism]. 	397
223948	COG1017	Hmp	Hemoglobin-like flavoprotein [Energy production and conversion]. 	150
223949	COG1018	Fpr	Ferredoxin-NADP reductase [Energy production and conversion]. 	266
223950	COG1019	CAB4	Phosphopantetheine adenylyltransferase [Coenzyme transport and metabolism]. 	158
223951	COG1020	EntF	Non-ribosomal peptide synthetase component F [Secondary metabolites biosynthesis, transport and catabolism]. 	642
223952	COG1021	EntE	Non-ribosomal peptide synthetase component E (peptide arylation enzyme) [Secondary metabolites biosynthesis, transport and catabolism]. 	542
223953	COG1022	FAA1	Long-chain acyl-CoA synthetase (AMP-forming)  [Lipid transport and metabolism]. 	613
223954	COG1023	YqeC	6-phosphogluconate dehydrogenase (decarboxylating) [Carbohydrate transport and metabolism]. 	300
223955	COG1024	CaiD	Enoyl-CoA hydratase/carnitine racemase [Lipid transport and metabolism]. 	257
223956	COG1025	Ptr	Secreted/periplasmic Zn-dependent peptidases, insulinase-like [Posttranslational modification, protein turnover, chaperones]. 	937
223957	COG1026	Cym1	Zn-dependent peptidase, M16 (insulinase) family [Posttranslational modification, protein turnover, chaperones]. 	978
223958	COG1027	AspA	Aspartate ammonia-lyase [Amino acid transport and metabolism]. 	471
223959	COG1028	FabG	NAD(P)-dependent dehydrogenase, short-chain alcohol dehydrogenase family [Lipid transport and metabolism, Secondary metabolites biosynthesis, transport and catabolism, General function prediction only]. 	251
223960	COG1029	FwdB	Formylmethanofuran dehydrogenase subunit B  [Energy production and conversion]. 	429
223961	COG1030	NfeD	Membrane-bound serine protease (ClpP class)  [Posttranslational modification, protein turnover, chaperones]. 	436
223962	COG1031	TM1601	Radical SAM superfamily enzyme with C-terminal helix-hairpin-helix motif [General function prediction only]. 	560
223963	COG1032	YgiQ	Radical SAM superfamily enzyme YgiQ, UPF0313 family [General function prediction only]. 	490
223964	COG1033	COG1033	Predicted exporter protein, RND superfamily  [General function prediction only]. 	727
223965	COG1034	NuoG	NADH dehydrogenase/NADH:ubiquinone oxidoreductase 75 kD subunit (chain G) [Energy production and conversion]. 	693
223966	COG1035	FrhB	Coenzyme F420-reducing hydrogenase, beta subunit  [Energy production and conversion]. 	332
223967	COG1036	COG1036	Archaeal flavoprotein  [Energy production and conversion]. 	187
223968	COG1038	PycA	Pyruvate carboxylase  [Energy production and conversion]. 	1149
223969	COG1039	RnhC	Ribonuclease HIII  [Replication, recombination and repair]. 	297
223970	COG1040	ComFC	Predicted amidophosphoribosyltransferases [General function prediction only]. 	225
223971	COG1041	Trm11	tRNA G10  N-methylase Trm11 [Translation, ribosomal structure and biogenesis]. 	347
223972	COG1042	ACCS	Acyl-CoA synthetase (NDP forming) [Energy production and conversion]. 	598
223973	COG1043	LpxA	Acyl-[acyl carrier protein]--UDP-N-acetylglucosamine O-acyltransferase [Cell wall/membrane/envelope biogenesis]. 	260
223974	COG1044	LpxD	UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase [Cell wall/membrane/envelope biogenesis]. 	338
223975	COG1045	CysE	Serine acetyltransferase [Amino acid transport and metabolism]. 	194
223976	COG1047	SlpA	FKBP-type peptidyl-prolyl cis-trans isomerase 2 [Posttranslational modification, protein turnover, chaperones]. 	174
223977	COG1048	AcnA	Aconitase A [Energy production and conversion]. 	861
223978	COG1049	AcnB	Aconitase B [Energy production and conversion]. 	852
223979	COG1051	YjhB	ADP-ribose pyrophosphatase YjhB, NUDIX family  [Nucleotide transport and metabolism]. 	145
223980	COG1052	LdhA	Lactate dehydrogenase or related 2-hydroxyacid dehydrogenase [Energy production and conversion, Coenzyme transport and metabolism, General function prediction only]. 	324
223981	COG1053	SdhA	Succinate dehydrogenase/fumarate reductase, flavoprotein subunit [Energy production and conversion]. 	562
223982	COG1054	YceA	Predicted sulfurtransferase [General function prediction only]. 	308
223983	COG1055	ArsB	Na+/H+ antiporter NhaD or related arsenite permease [Inorganic ion transport and metabolism]. 	424
223984	COG1056	NadR	Nicotinamide mononucleotide adenylyltransferase [Coenzyme transport and metabolism]. 	172
223985	COG1057	NadD	Nicotinic acid mononucleotide adenylyltransferase [Coenzyme transport and metabolism]. 	197
223986	COG1058	CinA	Predicted nucleotide-utilizing enzyme related to molybdopterin-biosynthesis enzyme MoeA [General function prediction only]. 	255
223987	COG1059	ENDO3c	Thermostable 8-oxoguanine DNA glycosylase  [Replication, recombination and repair, Defense mechanisms]. 	210
223988	COG1060	ThiH	2-iminoacetate synthase ThiH/Menaquinone biosynthesis enzyme MqnC [Coenzyme transport and metabolism]. 	370
223989	COG1061	SSL2	Superfamily II DNA or RNA helicase [Transcription, Replication, recombination and repair]. 	442
223990	COG1062	FrmA	Zn-dependent alcohol dehydrogenase [General function prediction only]. 	366
223991	COG1063	Tdh	Threonine dehydrogenase or related Zn-dependent dehydrogenase [Amino acid transport and metabolism, General function prediction only]. 	350
223992	COG1064	AdhP	D-arabinose 1-dehydrogenase, Zn-dependent alcohol dehydrogenase family [Carbohydrate transport and metabolism]. 	339
223993	COG1066	Sms	Predicted ATP-dependent serine protease [Posttranslational modification, protein turnover, chaperones]. 	456
223994	COG1067	LonB	Predicted ATP-dependent protease [Posttranslational modification, protein turnover, chaperones]. 	647
223995	COG1069	AraB	Ribulose kinase [Carbohydrate transport and metabolism]. 	544
223996	COG1070	XylB	Sugar (pentulose or hexulose) kinase [Carbohydrate transport and metabolism]. 	502
223997	COG1071	AcoA	TPP-dependent pyruvate or acetoin dehydrogenase subunit alpha [Energy production and conversion]. 	358
223998	COG1072	CoaA	Pantothenate kinase [Coenzyme transport and metabolism]. 	283
223999	COG1073	FrsA	Fermentation-respiration switch protein FrsA, has esterase activity, DUF1100 family [Signal transduction mechanisms]. 	299
224000	COG1074	RecB	ATP-dependent exoDNAse (exonuclease V) beta subunit (contains helicase and exonuclease domains) [Replication, recombination and repair]. 	1139
224001	COG1075	EstA	Triacylglycerol esterase/lipase EstA, alpha/beta hydrolase fold  [Lipid transport and metabolism]. 	336
224002	COG1076	DjlA	DnaJ-domain-containing proteins 1 [Posttranslational modification, protein turnover, chaperones]. 	174
224003	COG1077	MreB	Actin-like ATPase involved in cell morphogenesis [Cell cycle control, cell division, chromosome partitioning]. 	342
224004	COG1078	YdhJ	HD superfamily phosphohydrolase [General function prediction only]. 	421
224005	COG1079	YufQ	ABC-type uncharacterized transport system, permease component [General function prediction only]. 	304
224006	COG1080	PtsA	Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) [Carbohydrate transport and metabolism]. 	574
224007	COG1082	YcjR	Sugar phosphate isomerase/epimerase [Carbohydrate transport and metabolism]. 	274
224008	COG1083	NeuA	CMP-N-acetylneuraminic acid synthetase  [Cell wall/membrane/envelope biogenesis]. 	228
224009	COG1084	Nog1	GTP-binding protein, GTP1/Obg family  [General function prediction only]. 	346
224010	COG1085	GalT	Galactose-1-phosphate uridylyltransferase [Carbohydrate transport and metabolism]. 	338
224011	COG1086	FlaA1	NDP-sugar epimerase, includes UDP-GlcNAc-inverting 4,6-dehydratase FlaA1 and capsular polysaccharide biosynthesis protein EpsC [Cell wall/membrane/envelope biogenesis, Posttranslational modification, protein turnover, chaperones]. 	588
224012	COG1087	GalE	UDP-glucose 4-epimerase [Cell wall/membrane/envelope biogenesis]. 	329
224013	COG1088	RfbB	dTDP-D-glucose 4,6-dehydratase [Cell wall/membrane/envelope biogenesis]. 	340
224014	COG1089	Gmd	GDP-D-mannose dehydratase [Cell wall/membrane/envelope biogenesis]. 	345
224015	COG1090	YfcH	NAD dependent epimerase/dehydratase family enzyme [General function prediction only]. 	297
224016	COG1091	RfbD	dTDP-4-dehydrorhamnose reductase [Cell wall/membrane/envelope biogenesis]. 	281
224017	COG1092	RlmK	23S rRNA G2069 N7-methylase RlmK or C1962 C5-methylase RlmI [Translation, ribosomal structure and biogenesis]. 	393
224018	COG1093	SUI2	Translation initiation factor 2, alpha subunit (eIF-2alpha)  [Translation, ribosomal structure and biogenesis]. 	269
224019	COG1094	Krr1	rRNA processing protein Krr1/Pno1, contains KH domain [Translation, ribosomal structure and biogenesis]. 	194
224020	COG1095	RPB7	DNA-directed RNA polymerase, subunit E'/Rpb7 [Transcription]. 	183
224021	COG1096	Csl4	Exosome complex RNA-binding protein Csl4, contains S1 and Zn-ribbon domains  [Translation, ribosomal structure and biogenesis]. 	188
224022	COG1097	Rrp4	Exosome complex RNA-binding protein Rrp4, contains S1 and KH domains  [Translation, ribosomal structure and biogenesis]. 	239
224023	COG1098	YabR	Predicted RNA-binding protein, contains ribosomal protein S1 (RPS1) domain  [General function prediction only]. 	129
224024	COG1099	COG1099	Predicted metal-dependent hydrolase, TIM-barrel fold  [General function prediction only]. 	254
224025	COG1100	Gem1	GTPase SAR1 family domain [General function prediction only]. 	219
224026	COG1101	PhnK	ABC-type uncharacterized transport system, ATPase component  [General function prediction only]. 	263
224027	COG1102	CmkB	Cytidylate kinase  [Nucleotide transport and metabolism]. 	179
224028	COG1103	COG1103	Archaeal Cys-tRNA synthase (O-phospho-L-seryl-tRNA:Cys-tRNA synthase) [Translation, ribosomal structure and biogenesis]. 	382
224029	COG1104	NifS	Cysteine sulfinate desulfinase/cysteine desulfurase or related enzyme [Amino acid transport and metabolism]. 	386
224030	COG1105	FruK	Fructose-1-phosphate kinase or kinase (PfkB) [Carbohydrate transport and metabolism]. 	310
224031	COG1106	AAA15	ATPase/GTPase, AAA15 family [General function prediction only]. 	371
224032	COG1107	COG1107	Archaea-specific RecJ-like exonuclease, contains DnaJ-type Zn finger domain  [Replication, recombination and repair]. 	715
224033	COG1108	ZnuB	ABC-type Mn2+/Zn2+ transport system, permease component [Inorganic ion transport and metabolism]. 	274
224034	COG1109	ManB	Phosphomannomutase [Carbohydrate transport and metabolism]. 	464
224035	COG1110	TopG2	Reverse gyrase  [Replication, recombination and repair]. 	1187
224036	COG1111	MPH1	ERCC4-related helicase [Replication, recombination and repair]. 	542
224037	COG1112	DNA2	Superfamily I DNA and/or RNA helicase [Replication, recombination and repair]. 	767
224038	COG1113	AnsP	L-asparagine transporter and related permeases [Amino acid transport and metabolism]. 	462
224039	COG1114	BrnQ	Branched-chain amino acid permeases [Amino acid transport and metabolism]. 	431
224040	COG1115	AlsT	Na+/alanine symporter [Amino acid transport and metabolism]. 	452
224041	COG1116	TauB	ABC-type nitrate/sulfonate/bicarbonate transport system, ATPase component [Inorganic ion transport and metabolism]. 	248
224042	COG1117	PstB	ABC-type phosphate transport system, ATPase component [Inorganic ion transport and metabolism]. 	253
224043	COG1118	CysA	ABC-type sulfate/molybdate transport systems, ATPase component [Inorganic ion transport and metabolism]. 	345
224044	COG1119	ModF	ABC-type molybdenum transport system, ATPase component/photorepair protein PhrA [Inorganic ion transport and metabolism]. 	257
224045	COG1120	FepC	ABC-type cobalamin/Fe3+-siderophores transport system, ATPase component [Inorganic ion transport and metabolism, Coenzyme transport and metabolism]. 	258
224046	COG1121	ZnuC	ABC-type Mn2+/Zn2+ transport system, ATPase component [Inorganic ion transport and metabolism]. 	254
224047	COG1122	EcfA2	Energy-coupling factor transporter ATP-binding protein EcfA2 [Inorganic ion transport and metabolism, General function prediction only]. 	235
224048	COG1123	GsiA	ABC-type glutathione transport system ATPase component, contains duplicated ATPase domain [Posttranslational modification, protein turnover, chaperones]. 	539
224049	COG1124	DppF	ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component [Amino acid transport and metabolism, Inorganic ion transport and metabolism]. 	252
224050	COG1125	OpuBA	ABC-type proline/glycine betaine transport system, ATPase component [Amino acid transport and metabolism]. 	309
224051	COG1126	GlnQ	ABC-type polar amino acid transport system, ATPase component [Amino acid transport and metabolism]. 	240
224052	COG1127	MlaF	ABC-type transporter Mla maintaining outer membrane lipid asymmetry, ATPase component MlaF [Cell wall/membrane/envelope biogenesis]. 	263
224053	COG1129	MglA	ABC-type sugar transport system, ATPase component [Carbohydrate transport and metabolism]. 	500
224054	COG1131	CcmA	ABC-type multidrug transport system, ATPase component [Defense mechanisms]. 	293
224055	COG1132	MdlB	ABC-type multidrug transport system, ATPase and permease component [Defense mechanisms]. 	567
224056	COG1133	SbmA	ABC-type long-chain fatty acid transport system, fused permease and ATPase components [Lipid transport and metabolism]. 	405
224057	COG1134	TagH	ABC-type polysaccharide/polyol phosphate transport system, ATPase component [Carbohydrate transport and metabolism, Cell wall/membrane/envelope biogenesis]. 	249
224058	COG1135	AbcC	ABC-type methionine transport system, ATPase component [Amino acid transport and metabolism]. 	339
224059	COG1136	LolD	ABC-type lipoprotein export system, ATPase component [Cell wall/membrane/envelope biogenesis]. 	226
224060	COG1137	LptB	ABC-type lipopolysaccharide export system, ATPase component [Cell wall/membrane/envelope biogenesis]. 	243
224061	COG1138	CcmF	Cytochrome c biogenesis factor [Energy production and conversion, Posttranslational modification, protein turnover, chaperones]. 	648
224062	COG1139	LutB	L-lactate utilization protein LutB, contains a ferredoxin-type domain [Energy production and conversion]. 	459
224063	COG1140	NarY	Nitrate reductase beta subunit [Energy production and conversion, Inorganic ion transport and metabolism]. 	513
224064	COG1141	Fer	Ferredoxin  [Energy production and conversion]. 	68
224065	COG1142	HycB	Fe-S-cluster-containing hydrogenase component 2 [Energy production and conversion]. 	165
224066	COG1143	NuoI	Formate hydrogenlyase subunit 6/NADH:ubiquinone oxidoreductase 23 kD subunit (chain I) [Energy production and conversion]. 	172
224067	COG1144	PorD	Pyruvate:ferredoxin oxidoreductase or related 2-oxoacid:ferredoxin oxidoreductase, delta subunit [Energy production and conversion]. 	91
224068	COG1145	NapF	Ferredoxin [Energy production and conversion]. 	99
224069	COG1146	PreA	NAD-dependent dihydropyrimidine dehydrogenase, PreA subunit [Nucleotide transport and metabolism]. 	68
224070	COG1148	HdrA	Heterodisulfide reductase, subunit A (polyferredoxin)  [Energy production and conversion]. 	622
224071	COG1149	COG1149	MinD superfamily P-loop ATPase, contains an inserted ferredoxin domain  [General function prediction only]. 	284
224072	COG1150	HdrC	Heterodisulfide reductase, subunit C [Energy production and conversion]. 	195
224073	COG1151	Hcp	Hydroxylamine reductase (hybrid-cluster protein) [Inorganic ion transport and metabolism, Energy production and conversion]. 	576
224074	COG1152	CdhA	CO dehydrogenase/acetyl-CoA synthase alpha subunit  [Energy production and conversion]. 	772
224075	COG1153	FwdD	Formylmethanofuran dehydrogenase subunit D  [Energy production and conversion]. 	128
224076	COG1154	Dxs	Deoxyxylulose-5-phosphate synthase [Coenzyme transport and metabolism, Lipid transport and metabolism]. 	627
224077	COG1155	NtpA	Archaeal/vacuolar-type H+-ATPase catalytic subunit A/Vma1 [Energy production and conversion]. 	588
224078	COG1156	NtpB	Archaeal/vacuolar-type H+-ATPase subunit B/Vma2 [Energy production and conversion]. 	463
224079	COG1157	FliI	Flagellar biosynthesis/type III secretory pathway ATPase [Cell motility, Intracellular trafficking, secretion, and vesicular transport]. 	441
224080	COG1158	Rho	Transcription termination factor Rho [Transcription]. 	422
224081	COG1159	Era	GTPase Era, involved in 16S rRNA processing [Translation, ribosomal structure and biogenesis]. 	298
224082	COG1160	Der	Predicted GTPases [General function prediction only]. 	444
224083	COG1161	RbgA	Ribosome biogenesis GTPase A [Translation, ribosomal structure and biogenesis]. 	322
224084	COG1162	RsgA	Putative ribosome biogenesis GTPase RsgA [Translation, ribosomal structure and biogenesis]. 	301
224085	COG1163	Rbg1	Ribosome-interacting GTPase 1 [Translation, ribosomal structure and biogenesis]. 	365
224086	COG1164	PepF	Oligoendopeptidase F [Amino acid transport and metabolism]. 	598
224087	COG1165	MenD	2-succinyl-5-enolpyruvyl-6-hydroxy-3-cyclohexene-1-carboxylate synthase [Coenzyme transport and metabolism]. 	566
224088	COG1166	SpeA	Arginine decarboxylase (spermidine biosynthesis) [Amino acid transport and metabolism]. 	652
224089	COG1167	ARO8	DNA-binding transcriptional regulator, MocR family, contains an aminotransferase domain [Transcription, Amino acid transport and metabolism]. 	459
224090	COG1168	MalY	Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities [Amino acid transport and metabolism, General function prediction only]. 	388
224091	COG1169	MenF	Isochorismate synthase EntC [Coenzyme transport and metabolism, Secondary metabolites biosynthesis, transport and catabolism]. 	423
224092	COG1171	IlvA	Threonine dehydratase [Amino acid transport and metabolism]. 	347
224093	COG1172	AraH	Ribose/xylose/arabinose/galactoside ABC-type transport system, permease component [Carbohydrate transport and metabolism]. 	316
224094	COG1173	DppC	ABC-type dipeptide/oligopeptide/nickel transport system, permease component [Amino acid transport and metabolism, Inorganic ion transport and metabolism]. 	289
224095	COG1174	OpuBB	ABC-type proline/glycine betaine transport system, permease component [Amino acid transport and metabolism]. 	221
224096	COG1175	UgpA	ABC-type sugar transport system, permease component [Carbohydrate transport and metabolism]. 	295
224097	COG1176	PotB	ABC-type spermidine/putrescine transport system, permease component I [Amino acid transport and metabolism]. 	287
224098	COG1177	PotC	ABC-type spermidine/putrescine transport system, permease component II [Amino acid transport and metabolism]. 	267
224099	COG1178	FbpB	ABC-type Fe3+ transport system, permease component [Inorganic ion transport and metabolism]. 	540
224100	COG1179	TcdA	tRNA A37 threonylcarbamoyladenosine dehydratase [Translation, ribosomal structure and biogenesis]. 	263
224101	COG1180	PflA	Pyruvate-formate lyase-activating enzyme [Posttranslational modification, protein turnover, chaperones]. 	260
224102	COG1181	DdlA	D-alanine-D-alanine ligase and related ATP-grasp enzymes [Cell wall/membrane/envelope biogenesis, General function prediction only]. 	317
224103	COG1182	AzoR	FMN-dependent NADH-azoreductase [Energy production and conversion]. 	202
224104	COG1183	PssA	Phosphatidylserine synthase [Lipid transport and metabolism]. 	234
224105	COG1184	GCD2	Translation initiation factor 2B subunit, eIF-2B alpha/beta/delta family [Translation, ribosomal structure and biogenesis]. 	301
224106	COG1185	Pnp	Polyribonucleotide nucleotidyltransferase (polynucleotide phosphorylase) [Translation, ribosomal structure and biogenesis]. 	692
224107	COG1186	PrfB	Protein chain release factor B [Translation, ribosomal structure and biogenesis]. 	239
224108	COG1187	RsuA	16S rRNA U516 pseudouridylate synthase RsuA and related 23S rRNA U2605 pseudouridylate synthases [Translation, ribosomal structure and biogenesis]. 	248
224109	COG1188	HslR	Ribosomal 50S subunit-recycling heat shock protein, contains S4 domain [Translation, ribosomal structure and biogenesis]. 	100
224110	COG1189	YqxC	Predicted rRNA methylase YqxC, contains S4 and FtsJ domains   [Translation, ribosomal structure and biogenesis]. 	245
224111	COG1190	LysU	Lysyl-tRNA synthetase (class II) [Translation, ribosomal structure and biogenesis]. 	502
224112	COG1191	FliA	DNA-directed RNA polymerase specialized sigma subunit [Transcription]. 	247
224113	COG1192	BcsQ	Cellulose biosynthesis protein BcsQ [Cell motility]. 	259
224114	COG1193	MutS2	dsDNA-specific endonuclease/ATPase MutS2 [Replication, recombination and repair]. 	753
224115	COG1194	MutY	Adenine-specific DNA glycosylase, acts on AG and A-oxoG pairs [Replication, recombination and repair]. 	342
224116	COG1195	RecF	Recombinational DNA repair ATPase RecF [Replication, recombination and repair]. 	363
224117	COG1196	Smc	Chromosome segregation ATPase [Cell cycle control, cell division, chromosome partitioning]. 	1163
224118	COG1197	Mfd	Transcription-repair coupling factor (superfamily II helicase) [Replication, recombination and repair, Transcription]. 	1139
224119	COG1198	PriA	Primosomal protein N' (replication factor Y) - superfamily II helicase [Replication, recombination and repair]. 	730
224120	COG1199	DinG	Rad3-related DNA helicase [Replication, recombination and repair]. 	654
224121	COG1200	RecG	RecG-like helicase [Replication, recombination and repair]. 	677
224122	COG1201	Lhr	Lhr-like helicase [Replication, recombination and repair]. 	814
224123	COG1202	COG1202	Superfamily II helicase, archaea-specific [Replication, recombination and repair]. 	830
224124	COG1203	Cas3	CRISPR/Cas system-associated endonuclease/helicase Cas3 [Defense mechanisms]. 	733
224125	COG1204	BRR2	Replicative superfamily II helicase [Replication, recombination and repair]. 	766
224126	COG1205	YprA	ATP-dependent helicase YprA,  contains C-terminal metal-binding DUF1998 domain [Replication, recombination and repair]. 	851
224127	COG1206	TrmFO	Folate-dependent tRNA-U54 methylase TrmFO/GidA [Translation, ribosomal structure and biogenesis]. 	439
224128	COG1207	GlmU	Bifunctional protein GlmU, N-acetylglucosamine-1-phosphate-uridyltransferase/glucosamine-1-phosphate-acetyltransferase [Cell wall/membrane/envelope biogenesis]. 	460
224129	COG1208	GCD1	NDP-sugar pyrophosphorylase, includes eIF-2Bgamma, eIF-2Bepsilon, and LPS biosynthesis proteins [Translation, ribosomal structure and biogenesis, Cell wall/membrane/envelope biogenesis]. 	358
224130	COG1209	RmlA1	dTDP-glucose pyrophosphorylase [Cell wall/membrane/envelope biogenesis]. 	286
224131	COG1210	GalU	UTP-glucose-1-phosphate uridylyltransferase [Cell wall/membrane/envelope biogenesis]. 	291
224132	COG1211	IspD	2-C-methyl-D-erythritol 4-phosphate cytidylyltransferase [Lipid transport and metabolism]. 	230
224133	COG1212	KdsB	CMP-2-keto-3-deoxyoctulosonic acid synthetase [Cell wall/membrane/envelope biogenesis]. 	247
224134	COG1213	COG1213	Choline kinase [Lipid transport and metabolism]. 	239
224135	COG1214	TsaB	tRNA A37 threonylcarbamoyladenosine modification protein TsaB [Translation, ribosomal structure and biogenesis]. 	220
224136	COG1215	BcsA	Glycosyltransferase, catalytic subunit of cellulose synthase and poly-beta-1,6-N-acetylglucosamine synthase [Cell motility]. 	439
224137	COG1216	GT2	Glycosyltransferase, GT2 family [Carbohydrate transport and metabolism]. 	305
224138	COG1217	TypA	Predicted membrane GTPase involved in stress response [Signal transduction mechanisms]. 	603
224139	COG1218	CysQ	3'-Phosphoadenosine 5'-phosphosulfate (PAPS) 3'-phosphatase [Inorganic ion transport and metabolism]. 	276
224140	COG1219	ClpX	ATP-dependent protease Clp, ATPase subunit [Posttranslational modification, protein turnover, chaperones]. 	408
224141	COG1220	HslU	ATP-dependent protease HslVU (ClpYQ), ATPase subunit [Posttranslational modification, protein turnover, chaperones]. 	444
224142	COG1221	PspF	Transcriptional regulators containing an AAA-type ATPase domain and a DNA-binding domain [Transcription, Signal transduction mechanisms]. 	403
224143	COG1222	RPT1	ATP-dependent 26S proteasome regulatory subunit [Posttranslational modification, protein turnover, chaperones]. 	406
224144	COG1223	COG1223	Predicted ATPase, AAA+ superfamily [General function prediction only]. 	368
224145	COG1224	TIP49	DNA helicase TIP49, TBP-interacting protein [Transcription]. 	450
224146	COG1225	Bcp	Peroxiredoxin [Posttranslational modification, protein turnover, chaperones]. 	157
224147	COG1226	Kch	Voltage-gated potassium channel Kch [Inorganic ion transport and metabolism]. 	212
224148	COG1227	PPX1	Inorganic pyrophosphatase/exopolyphosphatase [Energy production and conversion, Inorganic ion transport and metabolism]. 	311
224149	COG1228	HutI	Imidazolonepropionase or related amidohydrolase [Secondary metabolites biosynthesis, transport and catabolism]. 	406
224150	COG1229	FwdA	Formylmethanofuran dehydrogenase subunit A  [Energy production and conversion]. 	575
224151	COG1230	CzcD	Co/Zn/Cd efflux system component [Inorganic ion transport and metabolism]. 	296
224152	COG1231	YobN	Monoamine oxidase  [Amino acid transport and metabolism]. 	450
224153	COG1232	HemY	Protoporphyrinogen oxidase [Coenzyme transport and metabolism]. 	444
224154	COG1233	COG1233	Phytoene dehydrogenase-related protein  [Secondary metabolites biosynthesis, transport and catabolism]. 	487
224155	COG1234	ElaC	Ribonuclease BN, tRNA processing enzyme [Translation, ribosomal structure and biogenesis]. 	292
224156	COG1235	PhnP	Phosphoribosyl 1,2-cyclic phosphodiesterase [Inorganic ion transport and metabolism]. 	269
224157	COG1236	YSH1	RNA processing exonuclease, beta-lactamase fold, Cft2 family [Translation, ribosomal structure and biogenesis]. 	427
224158	COG1237	COG1237	Metal-dependent hydrolase, beta-lactamase superfamily II  [General function prediction only]. 	259
224159	COG1238	YgaA	Uncharacterized membrane protein YqaA, SNARE-associated domain [Function unknown]. 	161
224160	COG1239	ChlI	Mg-chelatase subunit ChlI [Coenzyme transport and metabolism]. 	423
224161	COG1240	ChlD	Mg-chelatase subunit ChlD [Coenzyme transport and metabolism]. 	261
224162	COG1241	Mcm2	DNA replicative helicase MCM subunit Mcm2, Cdc46/Mcm family  [Replication, recombination and repair]. 	682
224163	COG1242	YhcC	Radical SAM superfamily enzyme [General function prediction only]. 	312
224164	COG1243	ELP3	Histone acetyltransferase, component of the RNA polymerase elongator complex  [Transcription, Chromatin structure and dynamics]. 	515
224165	COG1244	COG1244	Uncharacterized Fe-S cluster-containing protein, MiaB family [General function prediction only]. 	358
224166	COG1245	Rli1	Translation initiation factor RLI1, contains Fe-S and AAA+ ATPase domains [Translation, ribosomal structure and biogenesis]. 	591
224167	COG1246	ArgA	N-acetylglutamate synthase or related acetyltransferase, GNAT family [Amino acid transport and metabolism]. 	153
224168	COG1247	YncA	L-amino acid N-acyltransferase YncA [Amino acid transport and metabolism]. 	169
224169	COG1249	Lpd	Pyruvate/2-oxoglutarate dehydrogenase complex, dihydrolipoamide dehydrogenase (E3) component or related enzyme [Energy production and conversion]. 	454
224170	COG1250	FadB	3-hydroxyacyl-CoA dehydrogenase [Lipid transport and metabolism]. 	307
224171	COG1251	NirB	NAD(P)H-nitrite reductase, large subunit [Energy production and conversion]. 	793
224172	COG1252	Ndh	NADH dehydrogenase, FAD-containing subunit [Energy production and conversion]. 	405
224173	COG1253	TlyC	Hemolysin or related protein, contains CBS domains [General function prediction only]. 	429
224174	COG1254	AcyP	Acylphosphatase [Energy production and conversion]. 	92
224175	COG1255	COG1255	Uncharacterized protein, UPF0146 family [Function unknown]. 	129
224176	COG1256	FlgK	Flagellar hook-associated protein FlgK [Cell motility]. 	552
224177	COG1257	HMG1	Hydroxymethylglutaryl-CoA reductase [Lipid transport and metabolism]. 	436
224178	COG1258	Pus10	tRNA U54 and U55 pseudouridine synthase Pus10 [Translation, ribosomal structure and biogenesis]. 	398
224179	COG1259	COG1259	Bifunctional DNase/RNase [General function prediction only]. 	151
224180	COG1260	INO1	Myo-inositol-1-phosphate synthase [Lipid transport and metabolism]. 	362
224181	COG1261	FlgA	Flagella basal body P-ring formation protein FlgA [Cell motility]. 	220
224182	COG1262	YfmG	Formylglycine-generating enzyme, required for sulfatase activity, contains SUMF1/FGE domain [Posttranslational modification, protein turnover, chaperones]. 	314
224183	COG1263	PtsG1	Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific [Carbohydrate transport and metabolism]. 	393
224184	COG1264	PtsG2	Phosphotransferase system IIB components [Carbohydrate transport and metabolism]. 	88
224185	COG1266	YdiL	Membrane protease YdiL, CAAX protease family  [Posttranslational modification, protein turnover, chaperones]. 	226
224186	COG1267	PgpA	Phosphatidylglycerophosphatase A [Lipid transport and metabolism]. 	160
224187	COG1268	BioY	Biotin transporter BioY [Coenzyme transport and metabolism]. 	184
224188	COG1269	NtpI	Archaeal/vacuolar-type H+-ATPase subunit I/STV1 [Energy production and conversion]. 	660
224189	COG1270	CbiB	Cobalamin biosynthesis protein CobD/CbiB [Coenzyme transport and metabolism]. 	320
224190	COG1271	AppC	Cytochrome bd-type quinol oxidase, subunit 1 [Energy production and conversion]. 	457
224191	COG1272	YqfA	Predicted membrane channel-forming protein YqfA, hemolysin III family [Intracellular trafficking, secretion, and vesicular transport]. 	226
224192	COG1273	YkoV	Non-homologous end joining protein Ku, dsDNA break repair [Replication, recombination and repair]. 	278
224193	COG1274	PepCK	Phosphoenolpyruvate carboxykinase, GTP-dependent [Energy production and conversion]. 	608
224194	COG1275	TehA	Tellurite resistance protein TehA and related permeases [Defense mechanisms]. 	329
224195	COG1276	PcoD	Putative copper export protein [Inorganic ion transport and metabolism]. 	289
224196	COG1277	NosY	ABC-type transport system involved in multi-copper enzyme maturation, permease component [Posttranslational modification, protein turnover, chaperones]. 	278
224197	COG1278	CspC	Cold shock protein, CspA family [Transcription]. 	67
224198	COG1279	ArgO	Arginine exporter protein ArgO [Amino acid transport and metabolism]. 	202
224199	COG1280	RhtB	Threonine/homoserine/homoserine lactone efflux protein [Amino acid transport and metabolism]. 	208
224200	COG1281	HslO	Redox-regulated molecular chaperone, HSP33 family [Posttranslational modification, protein turnover, chaperones]. 	286
224201	COG1282	PntB	NAD/NADP transhydrogenase beta subunit [Energy production and conversion]. 	463
224202	COG1283	NptA	Na+/phosphate symporter [Inorganic ion transport and metabolism]. 	533
224203	COG1284	YitT	Uncharacterized membrane-anchored protein YitT, contains DUF161 and DUF2179 domains [Function unknown]. 	289
224204	COG1285	SapB	Uncharacterized membrane protein YhiD, involved in acid resistance [Function unknown]. 	221
224205	COG1286	CvpA	Uncharacterized membrane protein, required for colicin V production [Function unknown]. 	182
224206	COG1287	Stt3	Asparagine N-glycosylation enzyme, membrane subunit Stt3 [Posttranslational modification, protein turnover, chaperones]. 	773
224207	COG1288	YfcC	Uncharacterized membrane protein YfcC, ion transporter superfamily  [General function prediction only]. 	481
224208	COG1289	YccC	Uncharacterized membrane protein YccC [Function unknown]. 	674
224209	COG1290	QcrB	Cytochrome b subunit of the bc complex  [Energy production and conversion]. 	381
224210	COG1291	MotA	Flagellar motor component MotA [Cell motility]. 	266
224211	COG1292	BetT	Choline-glycine betaine transporter [Cell wall/membrane/envelope biogenesis]. 	537
224212	COG1293	YloA	Predicted component of the ribosome quality control (RQC) complex, YloA/Tae2 family, contains fibronectin-binding (FbpA) and DUF814 domains [Translation, ribosomal structure and biogenesis]. 	564
224213	COG1294	AppB	Cytochrome bd-type quinol oxidase, subunit 2 [Energy production and conversion]. 	346
224214	COG1295	BrkB	Uncharacterized membrane protein, BrkB/YihY/UPF0761 family (not an RNase) [Function unknown]. 	303
224215	COG1296	AzlC	Predicted branched-chain amino acid permease (azaleucine resistance) [Amino acid transport and metabolism]. 	238
224216	COG1297	OPT	Uncharacterized membrane protein, oligopeptide transporter (OPT) family [Function unknown]. 	624
224217	COG1298	FlhA	Flagellar biosynthesis pathway, component FlhA [Cell motility]. 	696
224218	COG1299	FrwC	Phosphotransferase system, fructose-specific IIC component [Carbohydrate transport and metabolism]. 	343
224219	COG1300	SpoIIM	Uncharacterized membrane protein SpoIIM, required for sporulation [Cell cycle control, cell division, chromosome partitioning]. 	207
224220	COG1301	GltP	Na+/H+-dicarboxylate symporter [Energy production and conversion]. 	415
224221	COG1302	YloU	Uncharacterized conserved protein YloU, alkaline shock protein (Asp23) family  [Function unknown]. 	131
224222	COG1303	COG1303	Predicted rRNA methylase, SpoU family [General function prediction only]. 	179
224223	COG1304	LldD	FMN-dependent dehydrogenase, includes L-lactate dehydrogenase and type II isopentenyl diphosphate isomerase  [Energy production and conversion, Lipid transport and metabolism, General function prediction only]. 	360
224224	COG1305	YebA	Transglutaminase-like enzyme, putative cysteine protease  [Posttranslational modification, protein turnover, chaperones]. 	319
224225	COG1306	COG1306	Predicted glycosyl hydrolase, alpha amylase family [General function prediction only]. 	400
224226	COG1307	DegV	Fatty acid-binding protein DegV (function unknown) [Lipid transport and metabolism]. 	282
224227	COG1308	EGD2	Transcription factor homologous to NACalpha-BTF3 [Transcription]. 	122
224228	COG1309	AcrR	DNA-binding transcriptional regulator, AcrR family [Transcription]. 	201
224229	COG1310	Rri1	Proteasome lid subunit RPN8/RPN11, contains Jab1/MPN domain metalloenzyme (JAMM) motif [Posttranslational modification, protein turnover, chaperones]. 	134
224230	COG1311	HYS2	Archaeal DNA polymerase II, small subunit/DNA polymerase delta, subunit B [Replication, recombination and repair]. 	481
224231	COG1312	UxuA	D-mannonate dehydratase [Carbohydrate transport and metabolism]. 	362
224232	COG1313	PflX	Uncharacterized Fe-S protein PflX, radical SAM superfamily [General function prediction only]. 	335
224233	COG1314	SecG	Preprotein translocase subunit SecG [Intracellular trafficking, secretion, and vesicular transport]. 	86
224234	COG1315	COG1315	Uncharacterized conserved protein, DUF342 family [Function unknown]. 	543
224235	COG1316	Cps2a	Anionic cell wall polymer biosynthesis enzyme,  LytR-Cps2A-Psr (LCP) family  [Cell wall/membrane/envelope biogenesis]. 	307
224236	COG1317	FliH	Flagellar biosynthesis/type III secretory pathway protein FliH [Cell motility, Intracellular trafficking, secretion, and vesicular transport]. 	234
224237	COG1318	COG1318	Predicted transcriptional regulator [Transcription]. 	182
224238	COG1319	CoxM	CO or xanthine dehydrogenase, FAD-binding subunit [Energy production and conversion]. 	284
224239	COG1320	MnhG	Multisubunit Na+/H+ antiporter, MnhG subunit [Inorganic ion transport and metabolism]. 	113
224240	COG1321	MntR	Mn-dependent transcriptional regulator, DtxR family [Transcription]. 	154
224241	COG1322	RmuC	DNA anti-recombination protein (rearrangement mutator) RmuC [Replication, recombination and repair]. 	448
224242	COG1323	YlbM	Predicted nucleotidyltransferase [General function prediction only]. 	358
224243	COG1324	CutA	Uncharacterized protein involved in tolerance to divalent cations [Inorganic ion transport and metabolism]. 	104
224244	COG1325	COG1325	Exosome subunit, RNA binding protein with dsRBD fold [Translation, ribosomal structure and biogenesis]. 	149
224245	COG1326	COG1326	Uncharacterized archaeal Zn-finger protein [General function prediction only]. 	201
224246	COG1327	NrdR	Transcriptional regulator NrdR, contains Zn-ribbon and ATP-cone domains [Transcription]. 	156
224247	COG1328	NrdD	Anaerobic ribonucleoside-triphosphate reductase [Nucleotide transport and metabolism]. 	700
224248	COG1329	CdnL	RNA polymerase-interacting regulator, CarD/CdnL/TRCF family [Transcription]. 	166
224249	COG1330	RecC	Exonuclease V gamma subunit [Replication, recombination and repair]. 	1078
224250	COG1331	YyaL	Uncharacterized conserved protein YyaL, SSP411 family, contains thioredoxin and six-hairpin glycosidase-like domains   [General function prediction only]. 	667
224251	COG1332	Csm5	CRISPR/Cas system CSM-associated protein Csm5, group 7 of RAMP superfamily [Defense mechanisms]. 	369
224252	COG1333	ResB	Cytochrome c biogenesis protein ResB [Energy production and conversion, Posttranslational modification, protein turnover, chaperones]. 	478
224253	COG1334	FlaG	Uncharacterized conserved protein, FlaG/YvyC family  [General function prediction only]. 	120
224254	COG1335	PncA	Nicotinamidase-related amidase [Coenzyme transport and metabolism, General function prediction only]. 	205
224255	COG1336	Cmr4	CRISPR/Cas system CMR subunit Cmr4, Cas7 group, RAMP superfamily [Defense mechanisms]. 	298
224256	COG1337	Csm3	CRISPR/Cas system CSM-associated protein Csm3, group 7 of RAMP superfamily [Defense mechanisms]. 	249
224257	COG1338	FliP	Flagellar biosynthetic protein FliP [Cell motility]. 	248
224258	COG1339	Rfk	Archaeal CTP-dependent riboflavin kinase [Coenzyme transport and metabolism]. 	214
224259	COG1340	COG1340	Uncharacterized coiled-coil protein, contains DUF342 domain [Function unknown]. 	294
224260	COG1341	Grc3	Polynucleotide 5'-kinase, involved in rRNA processing [Translation, ribosomal structure and biogenesis]. 	398
224261	COG1342	COG1342	Predicted DNA-binding protein, UPF0251 family [General function prediction only]. 	99
224262	COG1343	Cas2	CRISPR/Cas system-associated endoribonuclease Cas2 [Defense mechanisms]. 	89
224263	COG1344	FlgL	Flagellin and related hook-associated protein FlgL [Cell motility]. 	360
224264	COG1345	FliD	Flagellar capping protein FliD [Cell motility]. 	483
224265	COG1346	LrgB	Putative effector of murein hydrolase [Cell wall/membrane/envelope biogenesis]. 	230
224266	COG1347	NqrD	Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrD  [Energy production and conversion]. 	208
224267	COG1348	NifH	Nitrogenase subunit NifH, an ATPase [Inorganic ion transport and metabolism]. 	278
224268	COG1349	GlpR	DNA-binding transcriptional regulator of sugar metabolism, DeoR/GlpR family [Transcription, Carbohydrate transport and metabolism]. 	253
224269	COG1350	COG1350	Predicted alternative tryptophan synthase beta-subunit (paralog of TrpB) [Amino acid transport and metabolism]. 	432
224270	COG1351	ThyX	Thymidylate synthase ThyX [Nucleotide transport and metabolism]. 	273
224271	COG1352	CheR	Methylase of chemotaxis methyl-accepting proteins [Cell motility, Signal transduction mechanisms]. 	268
224272	COG1353	Cas10	CRISPR/Cas system-associated protein Cas10, large subunit of type III CRISPR-Cas systems, contains HD superfamily nuclease domain [Defense mechanisms]. 	799
224273	COG1354	ScpA	Chromatin segregation and condensation protein Rec8/ScpA/Scc1, kleisin family  [Replication, recombination and repair]. 	248
224274	COG1355	Mho1	Predicted class III extradiol dioxygenase, MEMO1 family [General function prediction only]. 	279
224275	COG1356	COG1356	Transcriptional regulator [Transcription]. 	143
224276	COG1357	YjbI	Uncharacterized protein YjbI, contains pentapeptide repeats [Function unknown]. 	238
224277	COG1358	Rpl7Ae	Ribosomal protein L7Ae or related RNA K-turn-binding protein [Translation, ribosomal structure and biogenesis]. 	116
224278	COG1359	YgiN	Quinol monooxygenase YgiN [Energy production and conversion]. 	100
224279	COG1360	MotB	Flagellar motor protein MotB [Cell motility]. 	244
224280	COG1361	COG1361	Uncharacterized conserved protein [Function unknown]. 	500
224281	COG1362	LAP4	Aspartyl aminopeptidase  [Amino acid transport and metabolism]. 	437
224282	COG1363	FrvX	Putative aminopeptidase FrvX [Amino acid transport and metabolism, Carbohydrate transport and metabolism]. 	355
224283	COG1364	ArgJ	N-acetylglutamate synthase (N-acetylornithine aminotransferase) [Amino acid transport and metabolism]. 	404
224284	COG1365	COG1365	Predicted ATPase, PP-loop superfamily [General function prediction only]. 	255
224285	COG1366	SpoIIAA	Anti-anti-sigma regulatory factor (antagonist of anti-sigma factor) [Signal transduction mechanisms]. 	117
224286	COG1367	Cmr1	CRISPR/Cas system CMR-associated protein Cmr1, group 7 of RAMP superfamily [Defense mechanisms]. 	393
224287	COG1368	MdoB	Phosphoglycerol transferase MdoB or a related enzyme of AlkP superfamily [Cell wall/membrane/envelope biogenesis]. 	650
224288	COG1369	POP5	RNase P/RNase MRP subunit POP5 [Translation, ribosomal structure and biogenesis]. 	124
224289	COG1370	COG1370	tRNA-guanine transglycosylase, archaeosine-15-forming [Translation, ribosomal structure and biogenesis]. 	155
224290	COG1371	Archease	Archease protein family (MTH1598/TM1083) [General function prediction only]. This archease family of proteins has two SHS2 domains, with one inserted into the other. It is predicted to be an enzyme. It is predicted to act as a chaperone in DNA/RNA metabolism.	137
224291	COG1372	Hop	Intein/homing endonuclease [Replication, recombination and repair, Mobilome: prophages, transposons]. 	420
224292	COG1373	COG1373	Predicted ATPase, AAA+ superfamily [General function prediction only]. 	398
224293	COG1374	NIP7	Ribosome biogenesis protein Nip4, contains PUA domain  [Translation, ribosomal structure and biogenesis]. 	176
224294	COG1376	ErfK	Lipoprotein-anchoring transpeptidase ErfK/SrfK [Cell wall/membrane/envelope biogenesis]. 	232
224295	COG1377	FlhB	Flagellar biosynthesis protein FlhB [Cell motility]. 	363
224296	COG1378	TrmB	Sugar-specific transcriptional regulator TrmB [Transcription]. 	247
224297	COG1379	YqxK	PHP family phosphoesterase with a Zn ribbon [General function prediction only]. 	403
224298	COG1380	YohJ	Putative effector of murein hydrolase LrgA, UPF0299 family [General function prediction only]. 	128
224299	COG1381	RecO	Recombinational DNA repair protein (RecF pathway) [Replication, recombination and repair]. 	251
224300	COG1382	GimC	Prefoldin, chaperonin cofactor [Posttranslational modification, protein turnover, chaperones]. 	119
224301	COG1383	RPS17A	Ribosomal protein S17E [Translation, ribosomal structure and biogenesis]. 	74
224302	COG1384	LysS	Lysyl-tRNA synthetase, class I [Translation, ribosomal structure and biogenesis]. 	521
224303	COG1385	RsmE	16S rRNA U1498 N3-methylase RsmE [Translation, ribosomal structure and biogenesis]. 	246
224304	COG1386	ScpB	Chromosome segregation and condensation protein ScpB [Transcription]. 	184
224305	COG1387	HIS2	Histidinol phosphatase or related hydrolase of the PHP family [Amino acid transport and metabolism, General function prediction only]. 	237
224306	COG1388	LysM	LysM repeat  [Cell wall/membrane/envelope biogenesis]. 	124
224307	COG1389	COG1389	DNA topoisomerase VI, subunit B [Replication, recombination and repair]. 	538
224308	COG1390	NtpE	Archaeal/vacuolar-type H+-ATPase subunit E/Vma4 [Energy production and conversion]. 	194
224309	COG1391	GlnE	Glutamine synthetase adenylyltransferase [Posttranslational modification, protein turnover, chaperones]. 	963
224310	COG1392	YkaA	Uncharacterized conserved protein YkaA, distantly related to PhoU, UPF0111/DUF47 family [Function unknown]. 	217
224311	COG1393	ArsC	Arsenate reductase and related proteins, glutaredoxin family [Inorganic ion transport and metabolism]. 	117
224312	COG1394	NtpD	Archaeal/vacuolar-type H+-ATPase subunit D/Vma8 [Energy production and conversion]. 	211
224313	COG1395	COG1395	Predicted transcriptional regulator [Transcription]. 	313
224314	COG1396	HipB	Transcriptional regulator, contains XRE-family HTH domain [Transcription]. 	120
224315	COG1397	DraG	ADP-ribosylglycohydrolase [Posttranslational modification, protein turnover, chaperones]. 	314
224316	COG1398	OLE1	Fatty-acid desaturase  [Lipid transport and metabolism]. 	289
224317	COG1399	YceD	Uncharacterized metal-binding protein YceD, DUF177 family [Function unknown]. 	176
224318	COG1400	SEC65	Signal recognition particle subunit SEC65 [Intracellular trafficking, secretion, and vesicular transport]. 	93
224319	COG1401	McrB	5-methylcytosine-specific restriction endonuclease McrBC, GTP-binding regulatory subunit McrB [Defense mechanisms]. 	601
224320	COG1402	ArfB	Creatinine amidohydrolase/Fe(II)-dependent formamide hydrolase involved in riboflavin and F420 biosynthesis [Coenzyme transport and metabolism, Secondary metabolites biosynthesis, transport and catabolism]. 	250
224321	COG1403	McrA	5-methylcytosine-specific restriction endonuclease McrA [Defense mechanisms]. 	146
224322	COG1404	AprE	Serine protease, subtilisin family  [Posttranslational modification, protein turnover, chaperones]. 	508
224323	COG1405	SUA7	Transcription initiation factor TFIIIB, Brf1 subunit/Transcription initiation factor TFIIB [Transcription]. 	285
224324	COG1406	CheX	Chemotaxis protein CheX, a CheY~P-specific phosphatase [Cell motility]. 	153
224325	COG1407	COG1407	Metallophosphoesterase superfamily enzyme [General function prediction only]. 	235
224326	COG1408	YaeI	Predicted phosphohydrolase, MPP superfamily [General function prediction only]. 	284
224327	COG1409	CpdA	3',5'-cyclic AMP phosphodiesterase CpdA [Signal transduction mechanisms]. 	301
224328	COG1410	MetH2	Methionine synthase I, cobalamin-binding domain [Amino acid transport and metabolism]. 	842
224329	COG1411	COG1411	Uncharacterized protein related to proFAR isomerase (HisA) [General function prediction only]. 	229
224330	COG1412	Fcf1	rRNA-processing protein FCF1 [Translation, ribosomal structure and biogenesis]. 	136
224331	COG1413	HEAT	HEAT repeat [General function prediction only]. 	335
224332	COG1414	IclR	DNA-binding transcriptional regulator, IclR family [Transcription]. 	246
224333	COG1415	COG1415	Uncharacterized protein [Function unknown]. 	373
224334	COG1416	COG1416	Intracellular sulfur oxidation protein, DsrE/DsrF family [Inorganic ion transport and metabolism]. 	112
224335	COG1417	COG1417	Uncharacterized protein [Function unknown]. 	288
224336	COG1418	RnaY	HD superfamily phosphodiesterase, includes HD domain of RNase Y [Translation, ribosomal structure and biogenesis, General function prediction only]. 	222
224337	COG1419	FlhF	Flagellar biosynthesis GTPase FlhF  [Cell motility]. 	407
224338	COG1420	HrcA	Transcriptional regulator of heat shock response [Transcription]. 	346
224339	COG1421	Csm2	CRISPR/Cas system CSM-associated protein Csm2, small subunit [Defense mechanisms]. 	137
224340	COG1422	COG1422	Uncharacterized archaeal membrane protein, DUF106 family, distantly related to YidC/Oxa1   [Function unknown]. 	201
224341	COG1423	COG1423	ATP-dependent RNA circularization protein, DNA/RNA ligase (PAB1020)  family    [Replication, recombination and repair]. 	382
224342	COG1424	BioW	Pimeloyl-CoA synthetase [Coenzyme transport and metabolism]. 	239
224343	COG1426	RodZ	Cytoskeletal protein RodZ, contains Xre-like HTH and DUF4115 domains [Cell cycle control, cell division, chromosome partitioning]. 	284
224344	COG1427	MqnA	Menaquinone biosynthesis enzyme MqnA [Coenzyme transport and metabolism]. 	252
224345	COG1428	Dck	Deoxyadenosine/deoxycytidine kinase [Nucleotide transport and metabolism]. 	216
224346	COG1429	CobN	Cobalamin biosynthesis protein CobN, Mg-chelatase  [Coenzyme transport and metabolism]. 	1388
224347	COG1430	COG1430	Uncharacterized conserved membrane protein, UPF0127 family [Function unknown]. 	126
224348	COG1431	COG1431	Argonaute homolog, implicated in RNA metabolism and viral defense [Translation, ribosomal structure and biogenesis, Defense mechanisms]. 	685
224349	COG1432	LabA	Uncharacterized conserved protein, LabA/DUF88 family [Function unknown]. 	181
224350	COG1433	NifX	Predicted Fe-Mo cluster-binding protein, NifX family [Posttranslational modification, protein turnover, chaperones]. 	121
224351	COG1434	YdcF	Uncharacterized SAM-binding protein YdcF, DUF218 family [General function prediction only]. 	223
224352	COG1435	Tdk	Thymidine kinase [Nucleotide transport and metabolism]. 	201
224353	COG1436	NtpF	Archaeal/vacuolar-type H+-ATPase subunit F/Vma7 [Energy production and conversion]. 	104
224354	COG1437	CyaB	Adenylate cyclase class IV, CYTH domain (includes archaeal enzymes of unknown function) [Signal transduction mechanisms, General function prediction only]. 	178
224355	COG1438	ArgR	Arginine repressor [Transcription]. 	150
224356	COG1439	Nob1	rRNA maturation endonuclease Nob1 [Translation, ribosomal structure and biogenesis]. 	177
224357	COG1440	CelA	Phosphotransferase system cellobiose-specific component IIB [Carbohydrate transport and metabolism]. 	102
224358	COG1441	MenC	O-succinylbenzoate synthase [Coenzyme transport and metabolism]. 	321
224359	COG1442	RfaJ	Lipopolysaccharide biosynthesis protein, LPS:glycosyltransferase [Cell wall/membrane/envelope biogenesis]. 	325
224360	COG1443	Idi	Isopentenyldiphosphate isomerase [Lipid transport and metabolism]. 	185
224361	COG1444	TmcA	tRNA(Met) C34 N-acetyltransferase TmcA [Translation, ribosomal structure and biogenesis]. 	758
224362	COG1445	FrwB	Phosphotransferase system fructose-specific component IIB [Carbohydrate transport and metabolism]. 	122
224363	COG1446	IaaA	Isoaspartyl peptidase or L-asparaginase, Ntn-hydrolase superfamily  [Amino acid transport and metabolism]. 	307
224364	COG1447	CelC	Phosphotransferase system cellobiose-specific component IIA [Carbohydrate transport and metabolism]. 	105
224365	COG1448	TyrB	Aspartate/tyrosine/aromatic aminotransferase [Amino acid transport and metabolism]. 	396
224366	COG1449	COG1449	Alpha-amylase/alpha-mannosidase, GH57 family  [Carbohydrate transport and metabolism]. 	615
224367	COG1450	PulD	Type II secretory pathway component GspD/PulD (secretin) [Intracellular trafficking, secretion, and vesicular transport]. 	587
224368	COG1451	YgjP	Predicted metal-dependent hydrolase [General function prediction only]. 	223
224369	COG1452	LptD	LPS assembly outer membrane protein LptD (organic solvent tolerance protein OstA) [Cell wall/membrane/envelope biogenesis]. 	784
224370	COG1453	COG1453	Predicted oxidoreductase of the aldo/keto reductase family [General function prediction only]. 	391
224371	COG1454	EutG	Alcohol dehydrogenase, class IV [Energy production and conversion]. 	377
224372	COG1455	CelB	Phosphotransferase system cellobiose-specific component IIC [Carbohydrate transport and metabolism]. 	432
224373	COG1456	CdhE	CO dehydrogenase/acetyl-CoA synthase gamma subunit (corrinoid Fe-S protein)  [Energy production and conversion]. 	467
224374	COG1457	CodB	Purine-cytosine permease or related protein [Nucleotide transport and metabolism]. 	442
224375	COG1458	COG1458	Predicted DNA-binding protein containing PIN domain, UPF0278 family [General function prediction only]. 	221
224376	COG1459	PulF	Type II secretory pathway, component PulF [Cell motility, Intracellular trafficking, secretion, and vesicular transport, Extracellular structures]. 	397
224377	COG1460	RpoF	DNA-directed RNA polymerase, subunit F [Transcription]. 	114
224378	COG1461	YloV	Predicted kinase related to dihydroxyacetone kinase  [General function prediction only]. 	542
224379	COG1462	CsgG	Curli biogenesis system outer membrane secretion channel CsgG [Cell wall/membrane/envelope biogenesis]. 	252
224380	COG1463	MlaD	ABC-type transporter Mla maintaining outer membrane lipid asymmetry, periplasmic component MlaD [Cell wall/membrane/envelope biogenesis]. 	359
224381	COG1464	NlpA	ABC-type metal ion transport system, periplasmic component/surface antigen [Inorganic ion transport and metabolism]. 	268
224382	COG1465	AroB2	3-dehydroquinate synthase, class II  [Amino acid transport and metabolism]. 	376
224383	COG1466	HolA	DNA polymerase III, delta subunit [Replication, recombination and repair]. 	334
224384	COG1467	PRI1	Eukaryotic-type DNA primase, catalytic (small) subunit  [Replication, recombination and repair]. 	341
224385	COG1468	Cas4	CRISPR/Cas system-associated exonuclease Cas4, RecB family [Defense mechanisms]. 	190
224386	COG1469	FolE2	GTP cyclohydrolase FolE2 [Coenzyme transport and metabolism]. 	289
224387	COG1470	COG1470	Uncharacterized membrane protein  [Function unknown]. 	513
224388	COG1471	RPS4A	Ribosomal protein S4E  [Translation, ribosomal structure and biogenesis]. 	241
224389	COG1472	BglX	Periplasmic beta-glucosidase and related glycosidases [Carbohydrate transport and metabolism]. 	397
224390	COG1473	AbgB	Metal-dependent amidase/aminoacylase/carboxypeptidase [General function prediction only]. 	392
224391	COG1474	CDC6	Cdc6-related protein, AAA superfamily ATPase  [Replication, recombination and repair]. 	366
224392	COG1475	Spo0J	Chromosome segregation protein Spo0J, contains ParB-like nuclease domain   [Cell cycle control, cell division, chromosome partitioning]. 	240
224393	COG1476	XRE	DNA-binding transcriptional regulator, XRE-family HTH domain  [Transcription]. 	68
224394	COG1477	ApbE	Thiamine biosynthesis lipoprotein ApbE [Coenzyme transport and metabolism]. 	337
224395	COG1478	CofE	F420-0:Gamma-glutamyl ligase (F420 biosynthesis) [Coenzyme transport and metabolism]. 	257
224396	COG1479	COG1479	Uncharacterized conserved protein, contains ParB-like and HNH nuclease domains [Function unknown]. 	409
224397	COG1480	YqfF	Membrane-associated HD superfamily phosphohydrolase  [General function prediction only]. 	700
224398	COG1481	WhiA	DNA-binding transcriptional regulator WhiA, involved in cell division  [Transcription]. 	308
224399	COG1482	ManA	Mannose-6-phosphate isomerase, class I [Carbohydrate transport and metabolism]. 	312
224400	COG1483	COG1483	Predicted ATPase, AAA+ superfamily [General function prediction only]. 	774
224401	COG1484	DnaC	DNA replication protein DnaC [Replication, recombination and repair]. 	254
224402	COG1485	YhcM	Predicted ATPase [General function prediction only]. 	367
224403	COG1486	CelF	Alpha-galactosidase/6-phospho-beta-glucosidase, family 4 of glycosyl hydrolase [Carbohydrate transport and metabolism]. 	442
224404	COG1487	VapC	Predicted nucleic acid-binding protein, contains PIN domain  [General function prediction only]. 	133
224405	COG1488	PncB	Nicotinic acid phosphoribosyltransferase [Coenzyme transport and metabolism]. 	405
224406	COG1489	SfsA	DNA-binding protein, stimulates sugar fermentation [Carbohydrate transport and metabolism, Signal transduction mechanisms]. 	235
224407	COG1490	Dtd	D-Tyr-tRNAtyr deacylase [Translation, ribosomal structure and biogenesis]. 	145
224408	COG1491	COG1491	Predicted nucleic acid-binding OB-fold protein [General function prediction only]. 	202
224409	COG1492	CobQ	Cobyric acid synthase  [Coenzyme transport and metabolism]. 	486
224410	COG1493	HprK	Serine kinase of the HPr protein, regulates carbohydrate metabolism  [Signal transduction mechanisms]. 	308
224411	COG1494	GlpX	Fructose-1,6-bisphosphatase/sedoheptulose 1,7-bisphosphatase or related protein [Carbohydrate transport and metabolism]. 	332
224412	COG1495	DsbB	Disulfide bond formation protein DsbB [Posttranslational modification, protein turnover, chaperones]. 	170
224413	COG1496	YfiH	Copper oxidase (laccase) domain [Inorganic ion transport and metabolism]. 	249
224414	COG1497	COG1497	Predicted transcriptional regulator  [Transcription]. 	260
224415	COG1498	SIK1	RNA processing factor Prp31, contains Nop domain [Translation, ribosomal structure and biogenesis]. 	395
224416	COG1499	NMD3	NMD protein affecting ribosome stability and mRNA decay  [Translation, ribosomal structure and biogenesis]. 	355
224417	COG1500	Sdo1	Ribosome maturation protein Sdo1 [Translation, ribosomal structure and biogenesis]. 	234
224418	COG1501	YicI	Alpha-glucosidase, glycosyl hydrolase family GH31 [Carbohydrate transport and metabolism]. 	772
224419	COG1502	Cls	Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthase or related enzyme [Lipid transport and metabolism]. 	438
224420	COG1503	eRF1	Peptide chain release factor 1 (eRF1)  [Translation, ribosomal structure and biogenesis]. 	411
224421	COG1504	COG1504	Uncharacterized protein [Function unknown]. 	121
224422	COG1505	PreP	Prolyl oligopeptidase PreP, S9A serine peptidase family [Amino acid transport and metabolism]. 	648
224423	COG1506	DAP2	Dipeptidyl aminopeptidase/acylaminoacyl peptidase  [Amino acid transport and metabolism]. 	620
224424	COG1507	COG1507	Uncharacterized protein, DUF501 family [Function unknown]. 	167
224425	COG1508	RpoN	DNA-directed RNA polymerase specialized sigma subunit, sigma54 homolog [Transcription]. 	444
224426	COG1509	EpmB	L-lysine 2,3-aminomutase (EF-P beta-lysylation pathway) [Amino acid transport and metabolism]. 	369
224427	COG1510	GbsR	DNA-binding transcriptional regulator GbsR, MarR family [Transcription]. 	177
224428	COG1511	YhgE	Uncharacterized membrane protein YhgE, phage infection protein (PIP) family [Function unknown]. 	780
224429	COG1512	YgcG	Uncharacterized membrane protein YgcG, contains a TPM-fold domain [Function unknown]. 	271
224430	COG1513	CynS	Cyanate lyase [Inorganic ion transport and metabolism]. 	151
224431	COG1514	LigT	2'-5' RNA ligase [Translation, ribosomal structure and biogenesis]. 	180
224432	COG1515	Nfi	Deoxyinosine 3'endonuclease (endonuclease V) [Replication, recombination and repair]. 	212
224433	COG1516	FliS	Flagellin-specific chaperone FliS [Cell motility, Intracellular trafficking, secretion, and vesicular transport]. 	132
224434	COG1517	Csx1	CRISPR/Cas system-associated protein Csx1, contains CARF domain [Defense mechanisms]. 	406
224435	COG1518	Cas1	CRISPR/Cas system-associated endonuclease Cas1 [Defense mechanisms]. 	327
224436	COG1519	KdtA	3-deoxy-D-manno-octulosonic-acid transferase [Cell wall/membrane/envelope biogenesis]. 	419
224437	COG1520	PQQ	Outer membrane protein assembly factor BamB, contains PQQ-like beta-propeller repeat [Cell wall/membrane/envelope biogenesis]. 	370
224438	COG1521	CoaX	Pantothenate kinase type III [Coenzyme transport and metabolism]. 	251
224439	COG1522	Lrp	DNA-binding transcriptional regulator, Lrp family [Transcription]. 	154
224440	COG1523	PulA	Pullulanase/glycogen debranching enzyme [Carbohydrate transport and metabolism]. 	697
224441	COG1524	Npp1	Predicted pyrophosphatase or phosphodiesterase, AlkP superfamily [General function prediction only]. 	450
224442	COG1525	YncB	Endonuclease YncB, thermonuclease family [Replication, recombination and repair]. 	192
224443	COG1526	FdhD	Formate dehydrogenase assembly factor FdhD [Energy production and conversion]. 	266
224444	COG1527	NtpC	Archaeal/vacuolar-type H+-ATPase subunit C/Vma6 [Energy production and conversion]. 	346
224445	COG1528	Ftn	Ferritin [Inorganic ion transport and metabolism]. 	167
224446	COG1529	CoxL	CO or xanthine dehydrogenase, Mo-binding subunit [Energy production and conversion]. 	731
224447	COG1530	CafA	Ribonuclease G or E [Translation, ribosomal structure and biogenesis]. 	487
224448	COG1531	COG1531	Uncharacterized protein, UPF0248 family [Function unknown]. 	77
224449	COG1532	COG1532	CooT family nickel-binding protein [General function prediction only]. 	57
224451	COG1534	YhbY	RNA-binding protein YhbY [Translation, ribosomal structure and biogenesis]. 	97
224452	COG1535	EntB	Isochorismate hydrolase [Secondary metabolites biosynthesis, transport and catabolism]. 	218
224453	COG1536	FliG	Flagellar motor switch protein FliG [Cell motility]. 	339
224454	COG1537	PelA	Stalled ribosome rescue protein Dom34, pelota family  [Translation, ribosomal structure and biogenesis]. 	352
224455	COG1538	TolC	Outer membrane protein TolC [Cell wall/membrane/envelope biogenesis]. 	457
224456	COG1539	FolB	Dihydroneopterin aldolase [Coenzyme transport and metabolism]. 	121
224457	COG1540	YbgL	Lactam utilization protein B (function unknown) [General function prediction only]. 	252
224458	COG1541	PaaK	Phenylacetate-coenzyme A ligase PaaK, adenylate-forming domain family [Coenzyme transport and metabolism]. 	438
224459	COG1542	COG1542	Uncharacterized protein [Function unknown]. 	593
224460	COG1543	COG1543	Predicted glycosyl hydrolase, contains GH57 and DUF1957 domains [Carbohydrate transport and metabolism]. 	504
224461	COG1544	RaiA	Ribosome-associated translation inhibitor RaiA [Translation, ribosomal structure and biogenesis]. 	110
224462	COG1545	COG1545	Uncharacterized OB-fold protein, contains Zn-ribbon domain [General function prediction only]. 	140
224463	COG1546	PncC	Nicotinamide mononucleotide (NMN) deamidase PncC [Coenzyme transport and metabolism]. 	162
224464	COG1547	YpuF	Predicted metal-dependent hydrolase  [Function unknown]. 	156
224465	COG1548	COG1548	Uncharacterized protein, hydantoinase/oxoprolinase family [Function unknown]. 	330
224466	COG1549	COG1549	Archaeosine tRNA-guanine transglycosylase, contains uracil-DNA-glycosylase and PUA domains [Translation, ribosomal structure and biogenesis]. 	519
224467	COG1550	YlxP	Uncharacterized conserved protein YlxP, DUF503 family [Function unknown]. 	95
224468	COG1551	CsrA	sRNA-binding carbon storage regulator CsrA [Signal transduction mechanisms]. 	73
224469	COG1552	RPL40A	Ribosomal protein L40E  [Translation, ribosomal structure and biogenesis]. 	50
224470	COG1553	DsrE	Sulfur relay (sulfurtransferase) complex TusBCD TusD component, DsrE family  [Inorganic ion transport and metabolism]. 	126
224471	COG1554	ATH1	Trehalose and maltose hydrolase (possible phosphorylase) [Carbohydrate transport and metabolism]. 	772
224472	COG1555	ComEA	DNA uptake protein ComE and related DNA-binding proteins [Replication, recombination and repair]. 	149
224473	COG1556	LutC	L-lactate utilization protein LutC, contains LUD domain [Energy production and conversion]. 	218
224474	COG1558	FlgC	Flagellar basal body rod protein FlgC [Cell motility]. 	137
224475	COG1559	YceG	Cell division protein YceG, involved in septum cleavage [Cell cycle control, cell division, chromosome partitioning]. 	342
224476	COG1560	HtrB	Lauroyl/myristoyl acyltransferase [Lipid transport and metabolism]. 	308
224477	COG1561	YicC	Uncharacterized conserved protein YicC, UPF0701 family [Function unknown]. 	290
224478	COG1562	ERG9	Phytoene/squalene synthetase  [Lipid transport and metabolism]. 	288
224479	COG1563	COG1563	Uncharacterized MnhB-related membrane protein [General function prediction only]. 	87
224480	COG1564	ThiN	Thiamine pyrophosphokinase  [Coenzyme transport and metabolism]. 	212
224481	COG1565	MidA	SAM-dependent methyltransferase, MidA family [General function prediction only]. 	370
224482	COG1566	EmrA	Multidrug resistance efflux pump [Defense mechanisms]. 	352
224483	COG1567	Csm4	CRISPR/Cas system CSM-associated protein Csm4, group 5 of RAMP superfamily [Defense mechanisms]. 	313
224484	COG1568	COG1568	Predicted methyltransferase  [General function prediction only]. 	354
224485	COG1569	COG1569	Predicted nucleic acid-binding protein, contains PIN domain  [General function prediction only]. 	142
224486	COG1570	XseA	Exonuclease VII, large subunit [Replication, recombination and repair]. 	440
224487	COG1571	TiaS	tRNA(Ile2) C34 agmatinyltransferase TiaS [Translation, ribosomal structure and biogenesis]. 	421
224489	COG1573	Udg4	Uracil-DNA glycosylase  [Replication, recombination and repair]. 	202
224490	COG1574	YtcJ	Predicted amidohydrolase YtcJ [General function prediction only]. 	535
224491	COG1575	MenA	1,4-dihydroxy-2-naphthoate octaprenyltransferase [Coenzyme transport and metabolism]. 	303
224492	COG1576	RlmH	23S rRNA pseudoU1915 N3-methylase RlmH [Translation, ribosomal structure and biogenesis]. 	155
224493	COG1577	ERG12	Mevalonate kinase  [Lipid transport and metabolism]. 	307
224494	COG1578	COG1578	Uncharacterized conserved protein, contains ATP-grasp and redox domains [Function unknown]. 	285
224495	COG1579	COG1579	Predicted nucleic acid-binding protein, contains Zn-ribbon domain [General function prediction only]. 	239
224496	COG1580	FliL	Flagellar basal body-associated protein FliL [Cell motility]. 	159
224497	COG1581	AlbA	Archaeal DNA-binding protein  [Transcription]. 	91
224498	COG1582	FlgEa	Uncharacterized protein YlzI, FlbEa/FlbD family [General function prediction only]. 	67
224499	COG1583	Cas6	CRISPR/Cas system endoribonuclease Cas6, RAMP superfamily [Defense mechanisms]. 	240
224500	COG1584	SatP	Succinate-acetate transporter protein [Energy production and conversion]. 	207
224501	COG1585	YbbJ	Membrane protein implicated in regulation of membrane protease activity [Posttranslational modification, protein turnover, chaperones]. 	140
224502	COG1586	SpeD	S-adenosylmethionine decarboxylase or arginine decarboxylase [Amino acid transport and metabolism]. 	136
224503	COG1587	HemD	Uroporphyrinogen-III synthase [Coenzyme transport and metabolism]. 	248
224504	COG1588	POP4	RNase P/RNase MRP subunit p29  [Translation, ribosomal structure and biogenesis]. 	95
224505	COG1589	FtsQ	Cell division septal protein FtsQ [Cell cycle control, cell division, chromosome partitioning]. 	269
224506	COG1590	Tyw3	tRNA(Phe) wybutosine-synthesizing methylase Tyw3 [Translation, ribosomal structure and biogenesis]. 	208
224507	COG1591	COG1591	Holliday junction resolvase, archaeal type  [Replication, recombination and repair]. 	137
224508	COG1592	YotD	Rubrerythrin  [Energy production and conversion]. 	166
224509	COG1593	DctQ	TRAP-type C4-dicarboxylate transport system, large permease component [Carbohydrate transport and metabolism]. 	379
224510	COG1594	RPB9	DNA-directed RNA polymerase, subunit M/Transcription elongation factor TFIIS [Transcription]. 	113
224511	COG1595	RpoE	DNA-directed RNA polymerase specialized sigma subunit, sigma24 family [Transcription]. 	182
224512	COG1596	Wza	Periplasmic protein involved in polysaccharide export, contains SLBB domain of the beta-grasp fold [Cell wall/membrane/envelope biogenesis]. 	239
224513	COG1597	LCB5	Diacylglycerol kinase family enzyme [Lipid transport and metabolism, General function prediction only]. 	301
224514	COG1598	HicB	Predicted nuclease of the RNAse H fold, HicB family [Defense mechanisms]. 	73
224515	COG1599	RFA1	ssDNA-binding replication factor A, large subunit [Replication, recombination and repair]. 	407
224516	COG1600	QueG	Epoxyqueuosine reductase  QueG (queuosine biosynthesis) [Translation, ribosomal structure and biogenesis]. 	337
224517	COG1601	GCD7	Translation initiation factor 2, beta subunit (eIF-2beta)/eIF-5 N-terminal domain  [Translation, ribosomal structure and biogenesis]. 	151
224518	COG1602	COG1602	Uncharacterized protein [Function unknown]. 	402
224519	COG1603	RPP1	RNase P/RNase MRP subunit p30  [Translation, ribosomal structure and biogenesis]. 	229
224520	COG1604	Cmr6	CRISPR/Cas system CMR subunit Cmr6, Cas7 group, RAMP superfamily [Defense mechanisms]. 	257
224521	COG1605	PheA	Chorismate mutase [Amino acid transport and metabolism]. 	101
224522	COG1606	COG1606	ATP-utilizing enzyme, PP-loop superfamily  [General function prediction only]. 	269
224523	COG1607	YciA	Acyl-CoA hydrolase [Lipid transport and metabolism]. 	157
224524	COG1608	COG1608	Isopentenyl phosphate kinase [Lipid transport and metabolism]. 	252
224525	COG1609	PurR	DNA-binding transcriptional regulator, LacI/PurR family [Transcription]. 	333
224526	COG1610	YqeY	Uncharacterized conserved protein YqeY [Function unknown]. 	148
224527	COG1611	YgdH	Predicted Rossmann fold nucleotide-binding protein [General function prediction only]. 	205
224528	COG1612	CtaA	Heme A synthase [Coenzyme transport and metabolism]. 	323
224529	COG1613	Sbp	ABC-type sulfate transport system, periplasmic component [Inorganic ion transport and metabolism]. 	348
224530	COG1614	CdhC	CO dehydrogenase/acetyl-CoA synthase beta subunit  [Energy production and conversion]. 	470
224531	COG1615	COG1615	Uncharacterized membrane protein, UPF0182 family [Function unknown]. 	885
224532	COG1617	Cgi121	tRNA threonylcarbamoyladenosine modification (KEOPS) complex,  Cgi121 subunit [Translation, ribosomal structure and biogenesis]. 	158
224533	COG1618	THEP1	Nucleoside-triphosphatase THEP1 [Nucleotide transport and metabolism]. 	179
224534	COG1619	LdcA	Muramoyltetrapeptide carboxypeptidase LdcA (peptidoglycan recycling) [Cell wall/membrane/envelope biogenesis]. 	313
224535	COG1620	LldP	L-lactate permease [Energy production and conversion]. 	522
224536	COG1621	SacC	Sucrose-6-phosphate hydrolase SacC, GH32 family [Carbohydrate transport and metabolism]. 	486
224537	COG1622	CyoA	Heme/copper-type cytochrome/quinol oxidase, subunit 2 [Energy production and conversion]. 	247
224538	COG1623	DisA	Diadenylate cyclase (c-di-AMP synthetase), DNA integrity scanning protein DisA [Signal transduction mechanisms]. 	349
224539	COG1624	DisA_N	Diadenylate cyclase (c-di-AMP synthetase), DisA_N domain [Signal transduction mechanisms]. 	247
224540	COG1625	NifB	Fe-S oxidoreductase, related to NifB/MoaA family  [Energy production and conversion]. 	414
224541	COG1626	TreA	Neutral trehalase [Carbohydrate transport and metabolism]. 	558
224542	COG1627	COG1627	Uncharacterized protein [Function unknown]. 	419
224543	COG1628	COG1628	Endonuclease V homolog, UPF0215 family [General function prediction only]. 	185
224544	COG1629	CirA	Outer membrane receptor proteins, mostly Fe transport [Inorganic ion transport and metabolism]. 	768
224545	COG1630	NurA	NurA 5'-3' nuclease  [Replication, recombination and repair]. 	379
224546	COG1631	RPL42A	Ribosomal protein L44E  [Translation, ribosomal structure and biogenesis]. 	94
224547	COG1632	RPL15A	Ribosomal protein L15E  [Translation, ribosomal structure and biogenesis]. 	195
224548	COG1633	YhjR	Rubrerythrin [Inorganic ion transport and metabolism]. 	176
224549	COG1634	COG1634	Uncharacterized Rossmann fold enzyme  [Function unknown]. 	232
224550	COG1635	THI4	Archaeal ribulose 1,5-bisphosphate synthetase/yeast thiazole synthase [Coenzyme transport and metabolism]. 	262
224551	COG1636	COG1636	Predicted ATPase, Adenine nucleotide alpha hydrolases (AANH) superfamily  [General function prediction only]. 	204
224552	COG1637	NucS	Endonuclease NucS, RecB family [Replication, recombination and repair]. 	253
224553	COG1638	DctP	TRAP-type C4-dicarboxylate transport system, periplasmic component [Carbohydrate transport and metabolism]. 	332
224554	COG1639	HDOD	HD-like signal output (HDOD) domain, no enzymatic activity [Signal transduction mechanisms]. 	289
224555	COG1640	MalQ	4-alpha-glucanotransferase [Carbohydrate transport and metabolism]. 	520
224556	COG1641	COG1641	Uncharacterized conserved protein, DUF111 family [Function unknown]. 	387
224557	COG1643	HrpA	HrpA-like RNA helicase [Translation, ribosomal structure and biogenesis]. 	845
224558	COG1644	RPB10	DNA-directed RNA polymerase, subunit N (RpoN/RPB10)  [Transcription]. 	63
224559	COG1645	COG1645	Uncharacterized Zn-finger containing protein, UPF0148 family  [General function prediction only]. 	131
224560	COG1646	PcrB	Heptaprenylglyceryl phosphate synthase [Lipid transport and metabolism]. 	240
224561	COG1647	YvaK	Esterase/lipase  [Secondary metabolites biosynthesis, transport and catabolism]. 	243
224562	COG1648	CysG2	Siroheme synthase (precorrin-2 oxidase/ferrochelatase domain) [Coenzyme transport and metabolism]. 	210
224563	COG1649	YddW	Uncharacterized lipoprotein YddW, UPF0748 family [Function unknown]. 	418
224564	COG1650	COG1650	D-tyrosyl-tRNA(Tyr) deacylase [Translation, ribosomal structure and biogenesis]. 	266
224565	COG1651	DsbG	Protein-disulfide isomerase [Posttranslational modification, protein turnover, chaperones]. 	244
224566	COG1652	XkdP	Nucleoid-associated protein YgaU, contains BON and LysM domains [Function unknown]. 	269
224567	COG1653	UgpB	ABC-type glycerol-3-phosphate transport system, periplasmic component [Carbohydrate transport and metabolism]. 	433
224568	COG1654	BirA	Biotin operon repressor [Transcription]. 	79
224569	COG1655	COG1655	Uncharacterized protein, DUF2225 family [Function unknown]. 	267
224570	COG1656	COG1656	Uncharacterized conserved protein, contains PIN domain [Function unknown]. 	165
224571	COG1657	SqhC	Squalene cyclase  [Lipid transport and metabolism]. 	517
224572	COG1658	RnmV	5S rRNA maturation endonuclease (Ribonuclease M5), contains TOPRIM domain [Translation, ribosomal structure and biogenesis]. 	127
224573	COG1659	COG1659	Uncharacterized protein, linocin/CFP29 family [Function unknown]. 	267
224574	COG1660	RapZ	RNase adaptor protein for sRNA GlmZ degradation, contains a P-loop ATPase domain [Signal transduction mechanisms]. 	286
224575	COG1661	COG1661	Predicted DNA-binding protein with PD1-like DNA-binding motif  [General function prediction only]. 	141
224576	COG1662	InsB	Transposase and inactivated derivatives, IS1 family [Mobilome: prophages, transposons]. 	121
224577	COG1663	LpxK	Tetraacyldisaccharide-1-P 4'-kinase [Cell wall/membrane/envelope biogenesis]. 	336
224578	COG1664	CcmA	Cytoskeletal protein CcmA, bactofilin family [Cytoskeleton]. 	146
224579	COG1665	COG1665	Predicted nucleotidyltransferase  [General function prediction only]. 	315
224580	COG1666	YajQ	Uncharacterized conserved protein YajQ, UPF0234 family [Function unknown]. 	165
224581	COG1667	COG1667	Uncharacterized protein [Function unknown]. 	254
224582	COG1668	NatB	ABC-type Na+ efflux pump, permease component  [Energy production and conversion, Inorganic ion transport and metabolism]. 	407
224583	COG1669	COG1669	Predicted nucleotidyltransferase  [General function prediction only]. 	97
224584	COG1670	RimL	Protein N-acetyltransferase, RimJ/RimL family [Translation, ribosomal structure and biogenesis, Posttranslational modification, protein turnover, chaperones]. 	187
224585	COG1671	YaiI	Uncharacterized conserved protein YaiI, UPF0178 family [Function unknown]. 	150
224586	COG1672	AAAA	Predicted ATPase, archaeal AAA+ ATPase superfamily  [General function prediction only]. 	359
224587	COG1673	COG1673	Predicted RNA-binding protein, contains PUA-like EVE domain [General function prediction only]. 	151
224588	COG1674	FtsK	DNA segregation ATPase FtsK/SpoIIIE and related proteins [Cell cycle control, cell division, chromosome partitioning]. 	858
224589	COG1675	TFA1	Transcription initiation factor IIE, alpha subunit  [Transcription]. 	176
224590	COG1676	SEN2	tRNA splicing endonuclease  [Translation, ribosomal structure and biogenesis]. 	181
224591	COG1677	FliE	Flagellar hook-basal body complex protein FliE [Cell motility]. 	105
224592	COG1678	AlgH	Putative transcriptional regulator, AlgH/UPF0301 family [Transcription]. 	194
224593	COG1679	COG1679	Predicted aconitase  [Energy production and conversion]. 	403
224594	COG1680	AmpC	CubicO group peptidase, beta-lactamase class C family [Defense mechanisms]. 	390
224595	COG1681	FlaB	Archaellin (archaeal flagellin)  [Cell motility]. 	209
224596	COG1682	TagG	ABC-type polysaccharide/polyol phosphate export permease [Carbohydrate transport and metabolism, Cell wall/membrane/envelope biogenesis]. 	263
224597	COG1683	YbbK	Uncharacterized conserved protein YbbK, DUF523 family [Function unknown]. 	156
224598	COG1684	FliR	Flagellar biosynthesis protein FliR [Cell motility]. 	258
224599	COG1685	AroK2	Archaeal shikimate kinase  [Amino acid transport and metabolism]. 	278
224600	COG1686	DacC	D-alanyl-D-alanine carboxypeptidase [Cell wall/membrane/envelope biogenesis]. 	389
224601	COG1687	AzlD	Branched-chain amino acid transport protein AzlD [Amino acid transport and metabolism]. 	106
224602	COG1688	Cas5	CRISPR/Cas system-associated protein Cas5, RAMP superfamily [Defense mechanisms]. 	240
224603	COG1689	COG1689	Uncharacterized protein [Function unknown]. 	274
224604	COG1690	RtcB	RNA-splicing ligase RtcB, repairs tRNA damage [Translation, ribosomal structure and biogenesis]. 	432
224605	COG1691	COG1691	NCAIR mutase (PurE)-related protein [Nucleotide transport and metabolism]. 	254
224606	COG1692	YmdB	Calcineurin-like phosphoesterase  [General function prediction only]. 	266
224607	COG1693	COG1693	Repressor of nif and glnA expression  [Transcription]. 	325
224608	COG1694	MazG	NTP pyrophosphatase, house-cleaning of non-canonical NTPs [Defense mechanisms]. 	102
224609	COG1695	PadR	DNA-binding transcriptional regulator, PadR family [Transcription]. 	138
224610	COG1696	DltB	D-alanyl-lipoteichoic acid acyltransferase DltB, MBOAT superfamily [Cell wall/membrane/envelope biogenesis]. 	425
224611	COG1697	Spo11	DNA topoisomerase VI, subunit A  [Replication, recombination and repair]. 	356
224612	COG1698	COG1698	Uncharacterized protein, UPF0147 family [Function unknown]. 	93
224613	COG1699	FliW	Flagellar assembly factor FliW [Cell motility]. 	146
224614	COG1700	COG1700	Predicted component of virus defense system, contains PD-(D/E)xK nuclease domain, DUF524 [Defense mechanisms]. 	503
224615	COG1701	COG1701	Archaeal phosphopantothenate synthetase [Coenzyme transport and metabolism]. 	256
411689	COG1702	PhoH	Phosphate starvation-inducible protein PhoH, predicted ATPase [Signal transduction mechanisms]. 	260
224617	COG1703	ArgK	Putative periplasmic protein kinase ArgK or related GTPase of G3E family [Posttranslational modification, protein turnover, chaperones]. 	323
224618	COG1704	LemA	Uncharacterized conserved protein [Function unknown]. 	185
224619	COG1705	FlgJ	Flagellum-specific peptidoglycan hydrolase FlgJ [Cell wall/membrane/envelope biogenesis, Cell motility]. 	201
224620	COG1706	FlgI	Flagellar basal body P-ring protein FlgI [Cell motility]. 	365
224621	COG1707	COG1707	Uncharacterized protein, contains ACT and thioredoxin-like domains [General function prediction only]. 	218
224622	COG1708	COG1708	Predicted nucleotidyltransferase  [General function prediction only]. 	128
224623	COG1709	COG1709	Predicted transcriptional regulator  [Transcription]. 	241
224624	COG1710	COG1710	Uncharacterized protein [Function unknown]. 	139
224625	COG1711	COG1711	DNA replication initiation complex subunit, GINS family [Replication, recombination and repair]. 	223
224626	COG1712	COG1712	Predicted dinucleotide-utilizing enzyme  [General function prediction only]. 	255
224627	COG1713	YqeK	HD superfamily phosphohydrolase YqeK (fused to NMNAT in mycoplasmas)  [General function prediction only]. 	187
224628	COG1714	YckC	Uncharacterized membrane protein YckC, RDD family [Function unknown]. 	172
224629	COG1715	Mrr	Restriction endonuclease Mrr [Defense mechanisms]. 	308
224630	COG1716	FHA	Forkhead associated (FHA) domain, binds pSer, pThr, pTyr [Signal transduction mechanisms]. 	191
224631	COG1717	Rpl32e	Ribosomal protein L32E  [Translation, ribosomal structure and biogenesis]. 	133
224632	COG1718	RIO1	Serine/threonine-protein kinase RIO1 [Signal transduction mechanisms]. 	268
224633	COG1719	COG1719	Predicted hydrocarbon binding protein, contains 4VR domain [General function prediction only]. 	158
224634	COG1720	TsaA	tRNA (Thr-GGU) A37 N-methylase [Translation, ribosomal structure and biogenesis]. 	156
224635	COG1721	YeaD2	Uncharacterized conserved protein, DUF58 family, contains vWF domain [Function unknown]. 	416
224636	COG1722	XseB	Exonuclease VII small subunit [Replication, recombination and repair]. 	81
224637	COG1723	Rmd1	Uncharacterized protein, Rmd1/YagE family [Function unknown]. 	331
224638	COG1724	YcfA	Predicted RNA binding protein YcfA, dsRBD-like fold, HicA-like mRNA interferase family [General function prediction only]. 	66
224639	COG1725	YhcF	DNA-binding transcriptional regulator YhcF, GntR family [Transcription]. 	125
224640	COG1726	NqrA	Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrA  [Energy production and conversion]. 	447
224641	COG1727	RPL18A	Ribosomal protein L18E  [Translation, ribosomal structure and biogenesis]. 	122
224642	COG1728	YaaR	Uncharacterized protein YaaR, TM1646/DUF327 family [Function unknown]. 	151
224643	COG1729	YbgF	Periplasmic TolA-binding protein (function unknown) [General function prediction only]. 	262
224644	COG1730	GIM5	Prefoldin subunit 5 [Posttranslational modification, protein turnover, chaperones]. 	145
224645	COG1731	RibC2	Archaeal riboflavin synthase  [Coenzyme transport and metabolism]. 	154
224646	COG1732	OsmF	Periplasmic glycine betaine/choline-binding (lipo)protein of an ABC-type transport system (osmoprotectant binding protein) [Cell wall/membrane/envelope biogenesis]. 	300
224647	COG1733	HxlR	DNA-binding transcriptional regulator, HxlR family [Transcription]. 	120
224648	COG1734	DksA	RNA polymerase-binding transcription factor DksA [Translation, ribosomal structure and biogenesis]. 	120
224649	COG1735	Php	Predicted metal-dependent hydrolase, phosphotriesterase family [General function prediction only]. 	316
224650	COG1736	DPH2	Diphthamide synthase subunit DPH2  [Translation, ribosomal structure and biogenesis]. 	347
224651	COG1737	RpiR	DNA-binding transcriptional regulator, MurR/RpiR family, contains HTH and SIS domains [Transcription]. 	281
224652	COG1738	YhhQ	Uncharacterized PurR-regulated membrane protein YhhQ, DUF165 family [Function unknown]. 	233
224653	COG1739	YIH1	Putative translation regulator, IMPACT (imprinted ancient) protein family [General function prediction only]. 	203
224654	COG1740	HyaA	Ni,Fe-hydrogenase I small subunit [Energy production and conversion]. 	355
224655	COG1741	YhaK	Redox-sensitive bicupin YhaK, pirin superfamily [General function prediction only]. 	276
224656	COG1742	YnfA	Uncharacterized inner membrane protein YnfA, drug/metabolite transporter superfamily [General function prediction only]. 	109
224657	COG1743	COG1743	Adenine-specific DNA methylase, contains a Zn-ribbon domain [Replication, recombination and repair]. 	875
224658	COG1744	Med	Basic membrane lipoprotein Med, periplasmic binding protein (PBP1-ABC) superfamily [Cell wall/membrane/envelope biogenesis]. 	345
224659	COG1745	COG1745	Uncharacterized euryarchaeal protein, UPF0058 family [Function unknown]. 	94
224660	COG1746	CCA1	tRNA nucleotidyltransferase (CCA-adding enzyme)  [Translation, ribosomal structure and biogenesis]. 	443
224661	COG1747	COG1747	Uncharacterized N-terminal domain of the transcription elongation factor GreA  [Function unknown]. 	711
224662	COG1748	Lys9	Saccharopine dehydrogenase, NADP-dependent  [Amino acid transport and metabolism]. 	389
224663	COG1749	FlgE	Flagellar hook protein FlgE [Cell motility]. 	423
224664	COG1750	COG1750	Predicted archaeal serine protease, S18 family  [General function prediction only]. 	579
224665	COG1751	COG1751	Uncharacterized protein [Function unknown]. 	186
224666	COG1752	RssA	Predicted acylesterase/phospholipase RssA, contains patatin domain [General function prediction only]. 	306
224667	COG1753	VapB3	Predicted antitoxin, CopG family  [Defense mechanisms]. 	74
224668	COG1754	COG1754	Uncharacterized C-terminal domain of topoisomerase IA  [Function unknown]. 	298
224669	COG1755	YpbQ	Uncharacterized protein YpbQ, isoprenylcysteine carboxyl methyltransferase (ICMT) family [Function unknown]. 	172
224670	COG1756	Emg1	rRNA pseudouridine-1189 N-methylase Emg1, Nep1/Mra1 family [Translation, ribosomal structure and biogenesis]. 	223
224671	COG1757	NhaC	Na+/H+ antiporter NhaC [Energy production and conversion]. 	485
224672	COG1758	RpoZ	DNA-directed RNA polymerase, subunit K/omega [Transcription]. 	74
224673	COG1759	PurP	5-formaminoimidazole-4-carboxamide-1-beta-D-ribofuranosyl 5'-monophosphate synthetase (purine biosynthesis) [Nucleotide transport and metabolism]. 	361
224674	COG1760	SdaA	L-serine deaminase [Amino acid transport and metabolism]. 	262
224675	COG1761	RPB11	DNA-directed RNA polymerase, subunit L  [Transcription]. 	99
224676	COG1762	PtsN	Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) [Carbohydrate transport and metabolism, Signal transduction mechanisms]. 	152
224677	COG1763	MobB	Molybdopterin-guanine dinucleotide biosynthesis protein [Coenzyme transport and metabolism]. 	161
224678	COG1764	OsmC	Organic hydroperoxide reductase OsmC/OhrA [Defense mechanisms]. 	143
224679	COG1765	YhfA	Uncharacterized OsmC-related protein [General function prediction only]. 	137
224680	COG1766	FliF	Flagellar biosynthesis/type III secretory pathway M-ring protein FliF/YscJ [Cell motility, Intracellular trafficking, secretion, and vesicular transport]. 	545
224681	COG1767	CitG	Triphosphoribosyl-dephospho-CoA synthetase [Coenzyme transport and metabolism]. 	288
224682	COG1768	COG1768	Predicted phosphohydrolase  [General function prediction only]. 	230
224683	COG1769	Cmr3	CRISPR/Cas system CMR-associated protein Cmr3, group 5 of RAMP superfamily [Defense mechanisms]. 	335
224684	COG1770	PtrB	Protease II [Amino acid transport and metabolism]. 	682
224685	COG1771	COG1771	Uncharacterized protein, contains N-terminal Zn-finger domain [Function unknown]. 	471
224686	COG1772	COG1772	Uncharacterized protein [Function unknown]. 	178
224687	COG1773	YgaK	Rubredoxin [Energy production and conversion]. 	55
224688	COG1774	YaaT	Cell fate regulator YaaT, PSP1 superfamily (controls sporulation, competence, biofilm development) [Signal transduction mechanisms]. 	265
224689	COG1775	HgdB	Benzoyl-CoA reductase/2-hydroxyglutaryl-CoA dehydratase subunit, BcrC/BadD/HgdB [Secondary metabolites biosynthesis, transport and catabolism]. 	379
224690	COG1776	CheC	Chemotaxis protein CheY-P-specific phosphatase CheC [Signal transduction mechanisms]. 	203
224691	COG1777	COG1777	Predicted transcriptional regulator [Transcription]. 	217
224692	COG1778	KdsC	3-deoxy-D-manno-octulosonate 8-phosphate phosphatase KdsC and related HAD superfamily phosphatases [Cell wall/membrane/envelope biogenesis, General function prediction only]. 	170
224693	COG1779	Zpr1	C4-type Zn-finger protein  [General function prediction only]. 	201
224694	COG1780	NrdI	Protein involved in ribonucleotide reduction [Nucleotide transport and metabolism]. 	141
224695	COG1781	PyrI	Aspartate carbamoyltransferase, regulatory subunit [Nucleotide transport and metabolism]. 	153
224696	COG1782	COG1782	Predicted metal-dependent RNase, contains metallo-beta-lactamase and KH domains [General function prediction only]. 	637
224697	COG1783	XtmB	Phage terminase large subunit  [Mobilome: prophages, transposons]. 	414
224698	COG1784	COG1784	TctA family transporter [General function prediction only]. 	395
224699	COG1785	PhoA	Alkaline phosphatase [Inorganic ion transport and metabolism, General function prediction only]. 	482
224700	COG1786	COG1786	Swiveling domain associated with predicted aconitase  [General function prediction only]. 	131
224701	COG1787	COG1787	Endonuclease, HJR/Mrr/RecB family [Defense mechanisms]. 	217
224702	COG1788	AtoD	Acyl CoA:acetate/3-ketoacid CoA transferase, alpha subunit [Lipid transport and metabolism]. 	220
224703	COG1790	COG1790	Uncharacterized protein [Function unknown]. 	209
224704	COG1791	Adi1	Acireductone dioxygenase (methionine salvage), cupin superfamily [Amino acid transport and metabolism]. 	181
224705	COG1792	MreC	Cell shape-determining protein MreC [Cell cycle control, cell division, chromosome partitioning]. 	284
224706	COG1793	CDC9	ATP-dependent DNA ligase  [Replication, recombination and repair]. 	444
224707	COG1794	RacX	Aspartate/glutamate racemase [Cell wall/membrane/envelope biogenesis]. 	230
224708	COG1795	COG1795	Formaldehyde-activating enzyme necessary for methanogenesis  [Energy production and conversion]. 	170
224709	COG1796	PolX	DNA polymerase/3'-5' exonuclease PolX [Replication, recombination and repair]. 	326
224710	COG1797	CobB	Cobyrinic acid a,c-diamide synthase  [Coenzyme transport and metabolism]. 	451
224711	COG1798	DPH5	Diphthamide biosynthesis methyltransferase  [Translation, ribosomal structure and biogenesis]. 	260
224712	COG1799	YlmF	FtsZ-interacting cell division protein YlmF [Cell cycle control, cell division, chromosome partitioning]. 	167
224713	COG1800	COG1800	Predicted transglutaminase-like protease [General function prediction only]. 	335
224714	COG1801	YecE	Uncharacterized conserved protein YecE, DUF72 family [Function unknown]. 	263
224715	COG1802	GntR	DNA-binding transcriptional regulator, GntR family [Transcription]. 	230
224716	COG1803	MgsA	Methylglyoxal synthase [Carbohydrate transport and metabolism]. 	142
224717	COG1804	CaiB	Crotonobetainyl-CoA:carnitine CoA-transferase CaiB and related acyl-CoA transferases [Lipid transport and metabolism]. 	396
224718	COG1805	NqrB	Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrB  [Energy production and conversion]. 	400
224719	COG1806	PpsR	Regulator of PEP synthase PpsR, kinase-PPPase family (combines ADP:protein kinase and phosphorylase activities) [Signal transduction mechanisms]. 	273
224720	COG1807	ArnT	4-amino-4-deoxy-L-arabinose transferase or related glycosyltransferase of PMT family [Cell wall/membrane/envelope biogenesis]. 	535
224721	COG1808	COG1808	Uncharacterized membrane protein  [Function unknown]. 	334
224722	COG1809	ComA	Phosphosulfolactate synthase, CoM biosynthesis protein A  [Coenzyme transport and metabolism]. 	258
224723	COG1810	COG1810	Uncharacterized conserved protein [Function unknown]. 	224
224724	COG1811	YqgA	Uncharacterized membrane protein YqgA, affects biofilm formation [Function unknown]. 	228
224725	COG1812	MetK2	Archaeal S-adenosylmethionine synthetase  [Coenzyme transport and metabolism]. 	400
224726	COG1813	aMBF1	Archaeal ribosome-binding protein aMBF1, putative translation factor, contains Zn-ribbon and HTH domains [Translation, ribosomal structure and biogenesis]. 	165
224727	COG1814	Ccc1	Predicted Fe2+/Mn2+ transporter, VIT1/CCC1 family [Inorganic ion transport and metabolism]. 	229
224728	COG1815	FlgB	Flagellar basal body rod protein FlgB [Cell motility]. 	133
224729	COG1816	Add	Adenosine deaminase [Nucleotide transport and metabolism]. 	345
224730	COG1817	COG1817	Predicted glycosyltransferase [General function prediction only]. 	346
224731	COG1818	Tan1	tRNA(Ser,Leu) C12 N-acetylase TAN1, contains THUMP domain [Translation, ribosomal structure and biogenesis]. 	175
224732	COG1819	YjiC	UDP:flavonoid glycosyltransferase YjiC, YdhE family [Carbohydrate transport and metabolism]. 	406
224733	COG1820	NagA	N-acetylglucosamine-6-phosphate deacetylase [Carbohydrate transport and metabolism]. 	380
224734	COG1821	COG1821	Predicted ATP-dependent carboligase, ATP-grasp superfamily [General function prediction only]. 	307
224735	COG1822	COG1822	Uncharacterized membrane protein  [Function unknown]. 	349
224736	COG1823	TcyP	L-cystine uptake protein TcyP, sodium:dicarboxylate symporter family  [Amino acid transport and metabolism]. 	458
224737	COG1824	MgtE2	Permease, similar to cation transporters  [Inorganic ion transport and metabolism]. 	203
224738	COG1825	RplY	Ribosomal protein L25 (general stress protein Ctc) [Translation, ribosomal structure and biogenesis]. 	93
224739	COG1826	TatA	Sec-independent protein translocase protein TatA [Intracellular trafficking, secretion, and vesicular transport]. 	94
224740	COG1827	NiaR	Transcriptional regulator of NAD metabolism, contains HTH and 3H domains  [Transcription, Coenzyme transport and metabolism]. 	168
224741	COG1828	PurS	Phosphoribosylformylglycinamidine (FGAM) synthase, PurS component  [Nucleotide transport and metabolism]. 	83
224742	COG1829	COG1829	Archaeal pantoate kinase [Coenzyme transport and metabolism]. 	283
224743	COG1830	FbaB	Fructose-bisphosphate aldolase class Ia, DhnA family [Carbohydrate transport and metabolism]. 	265
224744	COG1831	COG1831	Predicted metal-dependent hydrolase, urease superfamily [General function prediction only]. 	285
224745	COG1832	YccU	Predicted CoA-binding protein [General function prediction only]. 	140
224746	COG1833	COG1833	Uri superfamily endonuclease  [General function prediction only]. 	132
224747	COG1834	DdaH	N-Dimethylarginine dimethylaminohydrolase  [Amino acid transport and metabolism]. 	267
224748	COG1835	OafA	Peptidoglycan/LPS O-acetylase OafA/YrhL, contains acyltransferase and SGNH-hydrolase domains  [Cell wall/membrane/envelope biogenesis]. 	386
224749	COG1836	COG1836	Uncharacterized membrane protein  [Function unknown]. 	247
224750	COG1837	YlqC	Predicted RNA-binding protein YlqC, contains KH domain, UPF0109 family [General function prediction only]. 	76
224751	COG1838	FumA	Tartrate dehydratase beta subunit/Fumarate hydratase class I, C-terminal domain [Energy production and conversion]. 	184
224752	COG1839	COG1839	Adenosine/AMP kinase [Nucleotide transport and metabolism]. 	162
224753	COG1840	AfuA	ABC-type Fe3+ transport system, periplasmic component  [Inorganic ion transport and metabolism]. 	299
224754	COG1841	RpmD	Ribosomal protein L30/L7E [Translation, ribosomal structure and biogenesis]. 	55
224755	COG1842	PspA	Phage shock protein A [Transcription, Signal transduction mechanisms]. 	225
224756	COG1843	FlgD	Flagellar hook assembly protein FlgD [Cell motility]. 	222
224757	COG1844	COG1844	Uncharacterized protein [Function unknown]. 	125
224758	COG1845	CyoC	Heme/copper-type cytochrome/quinol oxidase, subunit 3 [Energy production and conversion]. 	209
224759	COG1846	MarR	DNA-binding transcriptional regulator, MarR family [Transcription]. 	126
224760	COG1847	Jag	Predicted RNA-binding protein Jag, contains KH and R3H domains [General function prediction only]. 	208
224761	COG1848	COG1848	Predicted nucleic acid-binding protein, contains PIN domain  [General function prediction only]. 	140
224762	COG1849	COG1849	Uncharacterized protein [Function unknown]. 	90
224763	COG1850	RbcL	Ribulose 1,5-bisphosphate carboxylase, large subunit, or a RuBisCO-like protein [Carbohydrate transport and metabolism]. 	429
224764	COG1851	COG1851	Uncharacterized protein, UPF0128 family [Function unknown]. 	229
224765	COG1852	COG1852	Uncharacterized protein, DUF116 family [Function unknown]. 	209
224766	COG1853	RutF	NADH-FMN oxidoreductase RutF, flavin reductase (DIM6/NTAB) family [Energy production and conversion]. 	176
224767	COG1854	LuxS	S-ribosylhomocysteine lyase LuxS, autoinducer biosynthesis [Signal transduction mechanisms]. 	161
224768	COG1855	COG1855	Predicted ATPase, PilT family  [General function prediction only]. 	604
224769	COG1856	COG1856	Uncharacterized protein, radical SAM superfamily [General function prediction only]. 	275
224770	COG1857	Cas7	CRISPR/Cas system-associated protein Cas7,  RAMP superfamily [Defense mechanisms]. 	334
224771	COG1858	MauG	Cytochrome c peroxidase [Posttranslational modification, protein turnover, chaperones]. 	364
224772	COG1859	KptA	RNA:NAD 2'-phosphotransferase, TPT1/KptA family [Translation, ribosomal structure and biogenesis]. 	211
224773	COG1860	COG1860	Uncharacterized conserved protein, UPF0179 family [Nucleotide transport and metabolism, Replication, recombination and repair]. 	147
224774	COG1861	SpsF	Spore coat polysaccharide biosynthesis protein SpsF, cytidylyltransferase family [Cell wall/membrane/envelope biogenesis]. 	241
224775	COG1862	YajC	Preprotein translocase subunit YajC [Intracellular trafficking, secretion, and vesicular transport]. 	97
224776	COG1863	MnhE	Multisubunit Na+/H+ antiporter, MnhE subunit  [Inorganic ion transport and metabolism]. 	158
224777	COG1864	NUC1	DNA/RNA endonuclease G, NUC1  [Nucleotide transport and metabolism]. 	281
224778	COG1865	CbiZ	Adenosylcobinamide amidohydrolase  [Coenzyme transport and metabolism]. 	200
224779	COG1866	PckA	Phosphoenolpyruvate carboxykinase, ATP-dependent [Energy production and conversion]. 	529
224780	COG1867	TRM1	tRNA G26 N,N-dimethylase Trm1 [Translation, ribosomal structure and biogenesis]. 	380
224781	COG1868	FliM	Flagellar motor switch protein FliM [Cell motility]. 	332
224782	COG1869	RbsD	D-ribose pyranose/furanose isomerase RbsD [Carbohydrate transport and metabolism]. 	135
224783	COG1871	CheD	Chemotaxis receptor (MCP) glutamine deamidase CheD [Cell motility, Signal transduction mechanisms]. 	164
224784	COG1872	YggU	Uncharacterized conserved protein YggU, UPF0235/DUF167 family [Function unknown]. 	102
224785	COG1873	YlmC	Sporulation protein YlmC, PRC-barrel domain family [General function prediction only]. 	87
224786	COG1874	GanA	Beta-galactosidase GanA [Carbohydrate transport and metabolism]. 	673
224787	COG1875	YlaK	Predicted ribonuclease YlaK, contains NYN-type RNase and PhoH-family ATPase domains  [General function prediction only]. 	436
224788	COG1876	LdcB	LD-carboxypeptidase LdcB, LAS superfamily  [Cell wall/membrane/envelope biogenesis]. 	241
224789	COG1877	OtsB	Trehalose-6-phosphatase [Carbohydrate transport and metabolism]. 	266
224790	COG1878	COG1878	Kynurenine formamidase  [Amino acid transport and metabolism]. 	218
224791	COG1879	RbsB	ABC-type sugar transport system, periplasmic component, contains N-terminal xre family HTH domain [Carbohydrate transport and metabolism]. 	322
224792	COG1880	CdhB	CO dehydrogenase/acetyl-CoA synthase epsilon subunit  [Energy production and conversion]. 	170
224793	COG1881	PEBP	Uncharacterized conserved protein, phosphatidylethanolamine-binding protein (PEBP) family [General function prediction only]. 	174
224794	COG1882	PflD	Pyruvate-formate lyase [Energy production and conversion]. 	755
224795	COG1883	OadB	Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit  [Energy production and conversion]. 	375
224796	COG1884	Sbm	Methylmalonyl-CoA mutase, N-terminal domain/subunit [Lipid transport and metabolism]. 	548
224797	COG1885	COG1885	Uncharacterized protein, UPF0212 family [Function unknown]. 	115
224798	COG1886	FliN	Flagellar motor switch/type III secretory pathway protein FliN [Cell motility, Intracellular trafficking, secretion, and vesicular transport]. 	136
224799	COG1887	TagB	CDP-glycerol glycerophosphotransferase, TagB/SpsB family  [Cell wall/membrane/envelope biogenesis, Lipid transport and metabolism]. 	388
224800	COG1888	COG1888	Uncharacterized protein [Function unknown]. 	97
224801	COG1889	NOP1	Fibrillarin-like rRNA methylase  [Translation, ribosomal structure and biogenesis]. 	231
224802	COG1890	RPS3A	Ribosomal protein S3AE  [Translation, ribosomal structure and biogenesis]. 	214
224803	COG1891	COG1891	Uncharacterized protein, UPF0264 family [Function unknown]. 	235
224804	COG1892	PpcA	Phosphoenolpyruvate carboxylase  [Carbohydrate transport and metabolism]. 	488
224805	COG1893	PanE	Ketopantoate reductase [Coenzyme transport and metabolism]. 	307
224806	COG1894	NuoF	NADH:ubiquinone oxidoreductase, NADH-binding 51 kD subunit (chain F) [Energy production and conversion]. 	424
224807	COG1895	COG1895	Uncharacterized protein, contains HEPN domain, UPF0332 family  [Function unknown]. 	129
224808	COG1896	YfbR	5'-deoxynucleotidase YfbR and related HD superfamily hydrolases [Nucleotide transport and metabolism, General function prediction only]. 	193
224809	COG1897	MetA	Homoserine trans-succinylase [Amino acid transport and metabolism]. 	307
224810	COG1898	RfbC	dTDP-4-dehydrorhamnose 3,5-epimerase or related enzyme [Cell wall/membrane/envelope biogenesis]. 	173
224811	COG1899	DYS1	Deoxyhypusine synthase  [Posttranslational modification, protein turnover, chaperones, Translation, ribosomal structure and biogenesis]. 	318
224812	COG1900	COG1900	Uncharacterized conserved protein, DUF39 family [Function unknown]. 	365
224813	COG1901	COG1901	tRNA pseudouridine-54 N-methylase [Translation, ribosomal structure and biogenesis]. 	197
224814	COG1902	FadH	2,4-dienoyl-CoA reductase or related NADH-dependent reductase, Old Yellow Enzyme (OYE) family [Energy production and conversion]. 	363
224815	COG1903	CbiD	Cobalamin biosynthesis protein CbiD  [Coenzyme transport and metabolism]. 	367
224816	COG1904	UxaC	Glucuronate isomerase [Carbohydrate transport and metabolism]. 	463
224817	COG1905	NuoE	NADH:ubiquinone oxidoreductase 24 kD subunit (chain E) [Energy production and conversion]. 	160
224818	COG1906	COG1906	Uncharacterized protein [Function unknown]. 	388
224819	COG1907	COG1907	Predicted archaeal sugar kinase [General function prediction only]. 	312
224820	COG1908	FrhD	Coenzyme F420-reducing hydrogenase, delta subunit  [Energy production and conversion]. 	132
224821	COG1909	COG1909	Uncharacterized protein, UPF0218 family [Function unknown]. 	167
224822	COG1910	YvgK	Periplasmic molybdate-binding protein/domain  [Inorganic ion transport and metabolism]. 	223
224823	COG1911	RPL30E	Ribosomal protein L30E  [Translation, ribosomal structure and biogenesis]. 	100
224824	COG1912	COG1912	S-adenosylmethionine hydrolase (SAM-hydroxide adenosyltransferase) [Coenzyme transport and metabolism]. 	268
224825	COG1913	COG1913	Predicted Zn-dependent protease  [General function prediction only]. 	181
224826	COG1914	MntH	Mn2+ and Fe2+ transporters of the NRAMP family [Inorganic ion transport and metabolism]. 	416
224827	COG1915	COG1915	Uncharacterized conserved protein, contains Saccharopine dehydrogenase N-terminal (SDHN) domain [Function unknown]. 	415
224828	COG1916	COG1916	Pheromone shutdown protein TraB, contains GTxH motif (function unknown) [Function unknown]. 	388
224829	COG1917	QdoI	Cupin domain protein related to quercetin dioxygenase [General function prediction only]. 	131
224830	COG1918	FeoA	Fe2+ transport system protein FeoA [Inorganic ion transport and metabolism]. 	75
224831	COG1920	COG1920	2-phospho-L-lactate guanylyltransferase, coenzyme F420 biosynthesis enzyme, CobY/MobA/RfbA family [Coenzyme transport and metabolism]. 	210
224832	COG1921	SelA	Seryl-tRNA(Sec) selenium transferase [Translation, ribosomal structure and biogenesis]. 	395
224833	COG1922	WecG	UDP-N-acetyl-D-mannosaminuronic acid transferase, WecB/TagA/CpsF family  [Cell wall/membrane/envelope biogenesis]. 	253
224834	COG1923	Hfq	sRNA-binding regulator protein Hfq [Signal transduction mechanisms]. 	77
224835	COG1924	YjiL	Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) [Lipid transport and metabolism]. 	396
224836	COG1925	PtsH	Phosphotransferase system, HPr and related phosphotransfer proteins [Signal transduction mechanisms, Carbohydrate transport and metabolism]. 	88
224837	COG1926	COG1926	Predicted phosphoribosyltransferase [General function prediction only]. 	220
224838	COG1927	Mtd	F420-dependent methylenetetrahydromethanopterin dehydrogenase [Energy production and conversion]. 	277
224839	COG1928	PMT1	Dolichyl-phosphate-mannose--protein O-mannosyl transferase  [Posttranslational modification, protein turnover, chaperones]. 	699
224840	COG1929	GlxK	Glycerate kinase [Carbohydrate transport and metabolism]. 	378
224841	COG1930	CbiN	ABC-type cobalt transport system, periplasmic component  [Inorganic ion transport and metabolism]. 	97
224842	COG1931	COG1931	Predicted RNA binding protein with dsRBD fold, UPF0201 family [General function prediction only]. 	140
224843	COG1932	SerC	Phosphoserine aminotransferase [Coenzyme transport and metabolism, Amino acid transport and metabolism]. 	365
224844	COG1933	PolC	Archaeal DNA polymerase II, large subunit  [Replication, recombination and repair]. 	253
224845	COG1934	LptA	Lipopolysaccharide export system protein LptA [Cell wall/membrane/envelope biogenesis]. 	173
224846	COG1935	COG1935	Uncharacterized protein [Function unknown]. 	122
224847	COG1936	Fap7	Broad-specificity NMP kinase [Nucleotide transport and metabolism]. 	180
224848	COG1937	FrmR	DNA-binding transcriptional regulator, FrmR family [Transcription]. 	89
224849	COG1938	COG1938	Predicted ATP-dependent carboligase, ATP-grasp superfamily [General function prediction only]. 	244
224850	COG1939	MrnC	23S rRNA maturation mini-RNase III [Translation, ribosomal structure and biogenesis]. 	132
224851	COG1940	NagC	Sugar kinase of the NBD/HSP70 family, may contain an N-terminal HTH domain  [Transcription, Carbohydrate transport and metabolism]. 	314
224852	COG1941	FrhG	Coenzyme F420-reducing hydrogenase, gamma subunit  [Energy production and conversion]. 	247
224853	COG1942	PptA	Phenylpyruvate tautomerase PptA, 4-oxalocrotonate tautomerase family [Secondary metabolites biosynthesis, transport and catabolism]. 	69
224854	COG1943	RAYT	REP element-mobilizing transposase RayT [Mobilome: prophages, transposons]. 	136
224855	COG1944	YcaO	Ribosomal protein S12 methylthiotransferase accessory factor YcaO [Translation, ribosomal structure and biogenesis]. 	398
224856	COG1945	PdaD	Pyruvoyl-dependent arginine decarboxylase (PvlArgDC)  [Amino acid transport and metabolism]. 	163
224857	COG1946	TesB	Acyl-CoA thioesterase [Lipid transport and metabolism]. 	289
224858	COG1947	IspE	4-diphosphocytidyl-2C-methyl-D-erythritol kinase [Lipid transport and metabolism]. 	289
224859	COG1948	MUS81	ERCC4-type nuclease  [Replication, recombination and repair]. 	254
224860	COG1949	Orn	Oligoribonuclease (3'-5' exoribonuclease) [RNA processing and modification]. 	184
224861	COG1950	YvlD	Uncharacterized membrane protein YvlD, DUF360 family [Function unknown]. 	120
224862	COG1951	TtdA	Tartrate dehydratase alpha subunit/Fumarate hydratase class I, N-terminal domain [Energy production and conversion]. 	297
224863	COG1952	SecB	Preprotein translocase subunit SecB [Intracellular trafficking, secretion, and vesicular transport]. 	157
224864	COG1953	FUI1	Cytosine/uracil/thiamine/allantoin permease [Nucleotide transport and metabolism, Coenzyme transport and metabolism]. 	497
224865	COG1954	GlpP	Glycerol-3-phosphate responsive antiterminator (mRNA-binding) [Transcription]. 	181
224866	COG1955	FlaJ	Archaellum biogenesis protein FlaJ, TadC family [Cell motility]. 	527
224867	COG1956	GAF	GAF domain-containing protein, putative methionine-R-sulfoxide reductase [Defense mechanisms, Signal transduction mechanisms]. 	163
224868	COG1957	URH1	Inosine-uridine nucleoside N-ribohydrolase [Nucleotide transport and metabolism]. 	311
224869	COG1958	LSM1	Small nuclear ribonucleoprotein (snRNP) homolog  [Transcription]. 	79
224870	COG1959	IscR	DNA-binding transcriptional regulator, IscR family [Transcription]. 	150
224871	COG1960	CaiA	Acyl-CoA dehydrogenase related to the alkylation response protein AidB [Lipid transport and metabolism]. 	393
224872	COG1961	PinE	Site-specific DNA recombinase related to the DNA invertase Pin [Replication, recombination and repair]. 	222
224873	COG1962	MtrH	Tetrahydromethanopterin S-methyltransferase, subunit H  [Coenzyme transport and metabolism]. 	313
224874	COG1963	YuiD	Acid phosphatase family membrane protein YuiD [General function prediction only]. 	153
224875	COG1964	COG1964	Uncharacterized Fe-S cluster-containing enzyme, radical SAM superfamily [General function prediction only]. 	475
224876	COG1965	CyaY	Iron-binding protein CyaY, frataxin homolog [Inorganic ion transport and metabolism]. 	106
224877	COG1966	CstA	Carbon starvation protein CstA [Signal transduction mechanisms]. 	575
224878	COG1967	COG1967	Uncharacterized membrane protein  [Function unknown]. 	271
224879	COG1968	UppP	Undecaprenyl pyrophosphate phosphatase [Lipid transport and metabolism]. 	270
224880	COG1969	HyaC	Ni,Fe-hydrogenase I cytochrome b subunit [Energy production and conversion]. 	227
224881	COG1970	MscL	Large-conductance mechanosensitive channel [Cell wall/membrane/envelope biogenesis]. 	130
224882	COG1971	MntP	Putative Mn2+ efflux pump MntP [Inorganic ion transport and metabolism]. 	190
224883	COG1972	NupC	Nucleoside permease NupC [Nucleotide transport and metabolism]. 	404
224884	COG1973	HypE	Hydrogenase maturation factor HypE [Posttranslational modification, protein turnover, chaperones]. 	449
224885	COG1974	LexA	SOS-response transcriptional repressor LexA (RecA-mediated autopeptidase) [Transcription, Signal transduction mechanisms]. 	201
224886	COG1975	XdhC	Xanthine and CO dehydrogenase maturation factor, XdhC/CoxF family [Posttranslational modification, protein turnover, chaperones]. 	278
224887	COG1976	TIF6	Translation initiation factor 6 (eIF-6)  [Translation, ribosomal structure and biogenesis]. 	222
224888	COG1977	MoaD	Molybdopterin converting factor, small subunit [Coenzyme transport and metabolism]. 	84
224889	COG1978	YkuK	Predicted RNase H-related nuclease YkuK, DUF458 family  [General function prediction only]. 	152
224890	COG1979	YqhD	Alcohol dehydrogenase YqhD, Fe-dependent ADH family [Energy production and conversion]. 	384
224891	COG1980	COG1980	Archaeal fructose 1,6-bisphosphatase  [Carbohydrate transport and metabolism]. 	369
224892	COG1981	COG1981	Uncharacterized membrane protein  [Function unknown]. 	149
224893	COG1982	LdcC	Arginine/lysine/ornithine decarboxylase [Amino acid transport and metabolism]. 	557
224894	COG1983	PspC	Phage shock protein PspC (stress-responsive transcriptional regulator) [Transcription, Signal transduction mechanisms]. 	70
224895	COG1984	DUR1B	Allophanate hydrolase subunit 2 [Amino acid transport and metabolism]. 	314
224896	COG1985	RibD	Pyrimidine reductase, riboflavin biosynthesis [Coenzyme transport and metabolism]. 	218
224897	COG1986	YjjX	Non-canonical (house-cleaning) NTP pyrophosphatase, all-alpha NTP-PPase family [Nucleotide transport and metabolism, Defense mechanisms]. 	175
224898	COG1987	FliQ	Flagellar biosynthesis protein FliQ [Cell motility]. 	89
224899	COG1988	YbcI	Membrane-bound metal-dependent hydrolase YbcI, DUF457 family [General function prediction only]. 	190
224900	COG1989	PulO	Prepilin signal peptidase PulO (type II secretory pathway) or related peptidase [Cell motility, Intracellular trafficking, secretion, and vesicular transport]. 	254
224901	COG1990	Pth2	Peptidyl-tRNA hydrolase  [Translation, ribosomal structure and biogenesis]. 	122
224902	COG1991	COG1991	Uncharacterized protein, UPF0333 family [Function unknown]. 	131
224903	COG1992	COG1992	Predicted transcriptional regulator fused phosphomethylpyrimidine kinase (thiamin biosynthesis) [General function prediction only]. 	181
224904	COG1993	COG1993	PII-like signaling protein  [Signal transduction mechanisms]. 	109
224905	COG1994	SpoIVFB	Zn-dependent protease (includes SpoIVFB) [Posttranslational modification, protein turnover, chaperones]. 	230
224906	COG1995	PdxA	4-hydroxy-L-threonine phosphate dehydrogenase PdxA [Coenzyme transport and metabolism]. 	332
224907	COG1996	RPC10	DNA-directed RNA polymerase, subunit RPC12/RpoP, contains C4-type Zn-finger [Transcription]. 	49
224908	COG1997	RPL43A	Ribosomal protein L37AE/L43A  [Translation, ribosomal structure and biogenesis]. 	89
224909	COG1998	RPS27AE	Ribosomal protein S27AE  [Translation, ribosomal structure and biogenesis]. 	51
224910	COG1999	Sco1	Cytochrome oxidase Cu insertion factor, SCO1/SenC/PrrC family [Posttranslational modification, protein turnover, chaperones]. 	207
224911	COG2000	COG2000	Uncharacterized Fe-S cluster-containing protein [General function prediction only]. 	226
224912	COG2001	MraZ	MraZ, DNA-binding transcriptional regulator and inhibitor of RsmH methyltransferase activity [Translation, ribosomal structure and biogenesis]. 	146
224913	COG2002	AbrB	Bifunctional DNA-binding transcriptional regulator of stationary/sporulation/toxin gene expression and antitoxin component of the YhaV-PrlF toxin-antitoxin module [Transcription, Defense mechanisms]. 	89
224914	COG2003	RadC	DNA repair protein RadC, contains a helix-hairpin-helix DNA-binding motif  [Replication, recombination and repair]. 	224
224915	COG2004	RPS24A	Ribosomal protein S24E  [Translation, ribosomal structure and biogenesis]. 	107
224916	COG2005	ModE	DNA-binding transcriptional regulator ModE (molybdenum-dependent) [Transcription]. 	130
224917	COG2006	COG2006	Uncharacterized conserved protein, DUF362 family [Function unknown]. 	293
224918	COG2007	RPS8A	Ribosomal protein S8E  [Translation, ribosomal structure and biogenesis]. 	127
224919	COG2008	GLY1	Threonine aldolase [Amino acid transport and metabolism]. 	342
224920	COG2009	SdhC	Succinate dehydrogenase/fumarate reductase, cytochrome b subunit [Energy production and conversion]. 	132
224921	COG2010	CccA	Cytochrome c, mono- and diheme variants  [Energy production and conversion]. 	150
224922	COG2011	MetP	ABC-type methionine transport system, permease component [Amino acid transport and metabolism]. 	222
224923	COG2012	RPB5	DNA-directed RNA polymerase, subunit H, RpoH/RPB5  [Transcription]. 	80
224924	COG2013	AIM24	Uncharacterized conserved protein, AIM24 family [Function unknown]. 	227
224925	COG2014	COG2014	Uncharacterized conserved protein, contains DUF4213 and DUF364 domains [Function unknown]. 	250
224926	COG2015	BDS1	Alkyl sulfatase BDS1 and related hydrolases, metallo-beta-lactamase superfamily  [Secondary metabolites biosynthesis, transport and catabolism]. 	655
224927	COG2016	Tma20	Predicted ribosome-associated RNA-binding protein Tma20, contains PUA domain  [Translation, ribosomal structure and biogenesis]. 	161
224928	COG2017	GalM	Galactose mutarotase or related enzyme [Carbohydrate transport and metabolism]. 	308
224929	COG2018	COG2018	Predicted regulator of Ras-like GTPase activity, Roadblock/LC7/MglB family [Signal transduction mechanisms]. 	119
224930	COG2019	AdkA	Archaeal adenylate kinase  [Nucleotide transport and metabolism]. 	189
224931	COG2020	STE14	Protein-S-isoprenylcysteine O-methyltransferase Ste14 [Posttranslational modification, protein turnover, chaperones]. 	187
224932	COG2021	MET2	Homoserine acetyltransferase  [Amino acid transport and metabolism]. 	368
224933	COG2022	ThiG	Thiamin biosynthesis thiazole synthase ThiGH, ThiG subunit [Coenzyme transport and metabolism]. 	262
224934	COG2023	RPR2	RNase P subunit RPR2  [Translation, ribosomal structure and biogenesis]. 	105
224935	COG2024	SepRS	O-phosphoseryl-tRNA(Cys) synthetase [Translation, ribosomal structure and biogenesis]. 	536
224936	COG2025	FixB	Electron transfer flavoprotein, alpha subunit [Energy production and conversion]. 	313
224937	COG2026	RelE	mRNA-degrading endonuclease RelE, toxin component of the RelBE toxin-antitoxin system [Defense mechanisms]. 	90
224938	COG2027	DacB	D-alanyl-D-alanine carboxypeptidase [Cell wall/membrane/envelope biogenesis]. 	470
224939	COG2028	COG2028	Uncharacterized protein [Function unknown]. 	145
224940	COG2029	COG2029	Uncharacterized protein [Function unknown]. 	189
224941	COG2030	MaoC	Acyl dehydratase [Lipid transport and metabolism]. 	159
224942	COG2031	AtoE	Short chain fatty acids transporter [Lipid transport and metabolism]. 	446
224943	COG2032	SodC	Cu/Zn superoxide dismutase [Inorganic ion transport and metabolism]. 	179
224944	COG2033	SORL	Desulfoferrodoxin, superoxide reductase-like (SORL) domain  [Energy production and conversion]. 	126
224945	COG2034	COG2034	Uncharacterized membrane protein  [Function unknown]. 	85
224946	COG2035	COG2035	Uncharacterized membrane protein  [Function unknown]. 	276
224947	COG2036	HHT1	Archaeal histone H3/H4 [Chromatin structure and dynamics]. 	91
224948	COG2037	Ftr	Formylmethanofuran:tetrahydromethanopterin formyltransferase  [Energy production and conversion]. 	297
224949	COG2038	CobT	NaMN:DMB phosphoribosyltransferase [Coenzyme transport and metabolism]. 	347
224950	COG2039	Pcp	Pyrrolidone-carboxylate peptidase (N-terminal pyroglutamyl peptidase)  [Posttranslational modification, protein turnover, chaperones]. 	207
224951	COG2040	MHT1	Homocysteine/selenocysteine methylase (S-methylmethionine-dependent) [Amino acid transport and metabolism]. 	300
224952	COG2041	YedY	Periplasmic DMSO/TMAO reductase YedYZ, molybdopterin-dependent catalytic subunit [Energy production and conversion]. 	271
224953	COG2042	Tsr3	Ribosome biogenesis protein Tsr3 (rRNA maturation) [Translation, ribosomal structure and biogenesis]. 	179
224954	COG2043	COG2043	Uncharacterized conserved protein, DUF169 family [Function unknown]. 	237
224955	COG2044	COG2044	Predicted peroxiredoxin [General function prediction only]. 	120
224956	COG2045	ComB	Phosphosulfolactate phosphohydrolase or related enzyme [Coenzyme transport and metabolism, General function prediction only]. 	230
224957	COG2046	MET3	ATP sulfurylase (sulfate adenylyltransferase)  [Inorganic ion transport and metabolism]. 	397
224958	COG2047	COG2047	Proteasome assembly chaperone (PAC2) family protein [General function prediction only]. 	258
224959	COG2048	HdrB	Heterodisulfide reductase, subunit B  [Energy production and conversion]. 	293
224960	COG2049	DUR1A	Allophanate hydrolase subunit 1 [Amino acid transport and metabolism]. 	223
224961	COG2050	PaaI	Acyl-coenzyme A thioesterase PaaI, contains HGG motif [Secondary metabolites biosynthesis, transport and catabolism]. 	141
224962	COG2051	RPS27A	Ribosomal protein S27E  [Translation, ribosomal structure and biogenesis]. 	67
224963	COG2052	RemA	Regulator of extracellular matrix RemA, YlzA/DUF370 family [Cell wall/membrane/envelope biogenesis]. 	89
224964	COG2053	RPS28A	Ribosomal protein S28E/S33  [Translation, ribosomal structure and biogenesis]. 	69
224965	COG2054	COG2054	Uncharacterized archaeal kinase related to aspartokinase [General function prediction only]. 	212
224966	COG2055	AllD	Malate/lactate/ureidoglycolate dehydrogenase, LDH2 family [Energy production and conversion]. 	349
224967	COG2056	YuiF	Predicted histidine transporter YuiF, NhaC family [Amino acid transport and metabolism]. 	444
224968	COG2057	AtoA	Acyl CoA:acetate/3-ketoacid CoA transferase, beta subunit [Lipid transport and metabolism]. 	225
224969	COG2058	RPP1A	Ribosomal protein L12E/L44/L45/RPP1/RPP2  [Translation, ribosomal structure and biogenesis]. 	109
224970	COG2059	ChrA	Chromate transport protein ChrA  [Inorganic ion transport and metabolism]. 	195
224971	COG2060	KdpA	K+-transporting ATPase, A chain [Inorganic ion transport and metabolism]. 	560
224972	COG2061	COG2061	Uncharacterized conserved protein, contains ACT domain [General function prediction only]. 	170
224973	COG2062	SixA	Phosphohistidine phosphatase SixA [Signal transduction mechanisms]. 	163
224974	COG2063	FlgH	Flagellar basal body L-ring protein FlgH [Cell motility]. 	230
224975	COG2064	TadC	Pilus assembly protein TadC [Extracellular structures]. 	320
224976	COG2065	PyrR	Pyrimidine operon attenuation protein/uracil phosphoribosyltransferase  [Nucleotide transport and metabolism]. 	179
224977	COG2066	GlsA	Glutaminase [Amino acid transport and metabolism]. 	309
224978	COG2067	FadL	Long-chain fatty acid transport protein [Lipid transport and metabolism]. 	440
224979	COG2068	MocA	CTP:molybdopterin cytidylyltransferase MocA [Coenzyme transport and metabolism]. 	199
224980	COG2069	CdhD	CO dehydrogenase/acetyl-CoA synthase delta subunit (corrinoid Fe-S protein)  [Energy production and conversion]. 	403
224981	COG2070	YrpB	NAD(P)H-dependent flavin oxidoreductase YrpB, nitropropane dioxygenase family [General function prediction only]. 	336
224982	COG2071	PuuD	Gamma-glutamyl-gamma-aminobutyrate hydrolase PuuD (putrescine degradation), contains GATase1-like domain [Amino acid transport and metabolism]. 	243
224983	COG2072	CzcO	Predicted flavoprotein CzcO associated with the cation diffusion facilitator CzcD  [Inorganic ion transport and metabolism]. 	443
224984	COG2073	CbiG	Cobalamin biosynthesis protein CbiG  [Coenzyme transport and metabolism]. 	298
224985	COG2074	Pgk2	2-phosphoglycerate kinase  [Carbohydrate transport and metabolism]. 	299
224986	COG2075	RPL24A	Ribosomal protein L24E  [Translation, ribosomal structure and biogenesis]. 	66
224987	COG2076	EmrE	Multidrug transporter EmrE and related cation transporters [Defense mechanisms]. 	106
224988	COG2077	Tpx	Peroxiredoxin [Posttranslational modification, protein turnover, chaperones]. 	158
224989	COG2078	AMMECR1	Uncharacterized conserved protein, AMMECR1 domain [Function unknown]. 	203
224990	COG2079	PrpD	2-methylcitrate dehydratase PrpD [Carbohydrate transport and metabolism]. 	453
224991	COG2080	CoxS	Aerobic-type carbon monoxide dehydrogenase, small subunit, CoxS/CutS family [Energy production and conversion]. 	156
224992	COG2081	YhiN	Predicted flavoprotein YhiN [General function prediction only]. 	408
224993	COG2082	CobH	Precorrin isomerase  [Coenzyme transport and metabolism]. 	210
224994	COG2083	COG2083	Uncharacterized protein, UPF0216 family [Function unknown]. 	140
224995	COG2084	MmsB	3-hydroxyisobutyrate dehydrogenase or related beta-hydroxyacid dehydrogenase [Lipid transport and metabolism]. 	286
224996	COG2085	COG2085	Predicted dinucleotide-binding enzyme [General function prediction only]. 	211
224997	COG2086	FixA	Electron transfer flavoprotein, alpha and beta subunits [Energy production and conversion]. 	260
224998	COG2087	CobU	Adenosyl cobinamide kinase/adenosyl cobinamide phosphate guanylyltransferase [Coenzyme transport and metabolism]. 	175
224999	COG2088	SpoVG	DNA-binding protein SpoVG, cell septation regulator [Cell cycle control, cell division, chromosome partitioning]. 	95
225000	COG2089	SpsE	Sialic acid synthase SpsE, contains C-terminal SAF domain [Cell wall/membrane/envelope biogenesis]. 	347
225001	COG2090	COG2090	Uncharacterized protein [Function unknown]. 	141
225002	COG2091	Sfp	Phosphopantetheinyl transferase [Coenzyme transport and metabolism]. 	223
225003	COG2092	EFB1	Translation elongation factor EF-1beta  [Translation, ribosomal structure and biogenesis]. 	88
225004	COG2093	Spt4	RNA polymerase subunit RPABC4/transcription elongation factor Spt4 [Transcription]. 	64
225005	COG2094	Mpg	3-methyladenine DNA glycosylase Mpg [Replication, recombination and repair]. 	200
225006	COG2095	MarC	Small neutral amino acid transporter SnatA, MarC family [Amino acid transport and metabolism]. 	203
225007	COG2096	PduO	Cob(I)alamin adenosyltransferase  [Coenzyme transport and metabolism]. 	184
225008	COG2097	RPL31A	Ribosomal protein L31E  [Translation, ribosomal structure and biogenesis]. 	89
225009	COG2098	COG2098	Uncharacterized protein [Function unknown]. 	116
225010	COG2099	CobK	Precorrin-6x reductase  [Coenzyme transport and metabolism]. 	257
225011	COG2100	COG2100	Uncharacterized Fe-S cluster-containing enzyme, radical SAM superfamily [General function prediction only]. 	414
225012	COG2101	SPT15	TATA-box binding protein (TBP), component of TFIID and TFIIIB  [Transcription]. 	185
225013	COG2102	Dph6	Diphthamide synthase (EF-2-diphthine--ammonia ligase) [Translation, ribosomal structure and biogenesis]. 	223
225014	COG2103	MurQ	N-acetylmuramic acid 6-phosphate (MurNAc-6-P) etherase [Cell wall/membrane/envelope biogenesis]. 	298
225015	COG2104	ThiS	Sulfur carrier protein ThiS (thiamine biosynthesis) [Coenzyme transport and metabolism]. 	68
225016	COG2105	YtfP	Uncharacterized conserved protein YtfP, gamma-glutamylcyclotransferase (GGCT)/AIG2-like family [General function prediction only]. 	120
225017	COG2106	MTH1	Predicted RNA methylase MTH1, SPOUT superfamily [General function prediction only]. 	272
225018	COG2107	MqnD	Menaquinone biosynthesis enzyme MqnD [Coenzyme transport and metabolism]. 	272
225019	COG2108	COG2108	Uncharacterized conserved protein related to pyruvate formate-lyase activating enzyme  [Function unknown]. 	353
225020	COG2109	BtuR	ATP:corrinoid adenosyltransferase [Coenzyme transport and metabolism]. 	198
225021	COG2110	YmdB	O-acetyl-ADP-ribose deacetylase (regulator of RNase III), contains Macro domain  [Translation, ribosomal structure and biogenesis]. 	179
225022	COG2111	MnhB	Multisubunit Na+/H+ antiporter, MnhB subunit  [Inorganic ion transport and metabolism]. 	162
225023	COG2112	COG2112	Predicted Ser/Thr protein kinase  [Signal transduction mechanisms]. 	201
225024	COG2113	ProX	ABC-type proline/glycine betaine transport system, periplasmic component [Amino acid transport and metabolism]. 	302
225025	COG2114	AcyC	Adenylate cyclase, class 3 [Signal transduction mechanisms]. 	227
225026	COG2115	XylA	Xylose isomerase [Carbohydrate transport and metabolism]. 	438
225027	COG2116	FocA	Formate/nitrite transporter FocA, FNT family [Inorganic ion transport and metabolism]. 	265
225028	COG2117	COG2117	Predicted subunit of tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain [Translation, ribosomal structure and biogenesis]. 	198
225029	COG2118	PDCD5	DNA-binding TFAR19-related protein, PDCD5 family [General function prediction only]. 	116
225030	COG2119	Gdt1	Putative Ca2+/H+ antiporter, TMEM165/GDT1 family  [General function prediction only]. 	190
225031	COG2120	LmbE	N-acetylglucosaminyl deacetylase, LmbE family  [Carbohydrate transport and metabolism]. 	237
225032	COG2121	COG2121	Uncharacterized conserved protein, lysophospholipid acyltransferase (LPLAT) superfamily  [Function unknown]. 	214
225033	COG2122	COG2122	Uncharacterized protein, UPF0280 family, ApbE superfamily [Function unknown]. 	256
225034	COG2123	Rrp42	Exosome complex RNA-binding protein Rrp42, RNase PH superfamily [Translation, ribosomal structure and biogenesis]. 	272
225035	COG2124	CypX	Cytochrome P450  [Secondary metabolites biosynthesis, transport and catabolism, Defense mechanisms]. 	411
225036	COG2125	RPS6A	Ribosomal protein S6E (S10)  [Translation, ribosomal structure and biogenesis]. 	120
225037	COG2126	RPL37A	Ribosomal protein L37E  [Translation, ribosomal structure and biogenesis]. 	61
225038	COG2127	ClpS	ATP-dependent Clp protease adapter protein ClpS [Posttranslational modification, protein turnover, chaperones]. 	107
225039	COG2128	YciW	Alkylhydroperoxidase family enzyme, contains CxxC motif [Inorganic ion transport and metabolism]. 	177
225040	COG2129	COG2129	Predicted phosphoesterase, related to the Icc protein  [General function prediction only]. 	226
225041	COG2130	CurA	NADPH-dependent curcumin reductase CurA [Secondary metabolites biosynthesis, transport and catabolism, General function prediction only]. 	340
225042	COG2131	ComEB	Deoxycytidylate deaminase  [Nucleotide transport and metabolism]. 	164
225043	COG2132	SufI	Multicopper oxidase with three cupredoxin domains (includes cell division protein FtsP and spore coat protein CotA)   [Cell cycle control, cell division, chromosome partitioning, Inorganic ion transport and metabolism, Cell wall/membrane/envelope biogenesis]. 	451
225044	COG2133	YliI	Glucose/arabinose dehydrogenase, beta-propeller fold [Carbohydrate transport and metabolism]. 	399
225045	COG2134	Cdh	CDP-diacylglycerol pyrophosphatase [Lipid transport and metabolism]. 	252
225046	COG2135	SRAP	Putative SOS response-associated peptidase YedK [Posttranslational modification, protein turnover, chaperones]. 	226
225047	COG2136	IMP4	rRNA maturation protein Rpf1, contains Brix/IMP4 (anticodon-binding) domain [Translation, ribosomal structure and biogenesis]. 	191
225048	COG2137	RecX	SOS response regulatory protein OraA/RecX, interacts with RecA [Posttranslational modification, protein turnover, chaperones]. 	174
225049	COG2138	SirB	Sirohydrochlorin ferrochelatase  [Coenzyme transport and metabolism]. 	245
225050	COG2139	RPL21A	Ribosomal protein L21E  [Translation, ribosomal structure and biogenesis]. 	98
225051	COG2140	OxdD	Oxalate decarboxylase/archaeal phosphoglucose isomerase, cupin superfamily [Carbohydrate transport and metabolism]. 	209
225052	COG2141	SsuD	Flavin-dependent oxidoreductase, luciferase family (includes alkanesulfonate monooxygenase SsuD and methylene tetrahydromethanopterin reductase) [Coenzyme transport and metabolism, General function prediction only]. 	336
225053	COG2142	SdhD	Succinate dehydrogenase, hydrophobic anchor subunit [Energy production and conversion]. 	117
225054	COG2143	SoxW	Thioredoxin-related protein  [Posttranslational modification, protein turnover, chaperones]. 	182
225055	COG2144	COG2144	Selenophosphate synthetase-related protein [General function prediction only]. 	324
225056	COG2145	ThiM	Hydroxyethylthiazole kinase, sugar kinase family [Coenzyme transport and metabolism]. 	265
225057	COG2146	NirD	Ferredoxin subunit of nitrite reductase or a ring-hydroxylating dioxygenase [Inorganic ion transport and metabolism, Secondary metabolites biosynthesis, transport and catabolism]. 	106
225058	COG2147	RPL19A	Ribosomal protein L19E  [Translation, ribosomal structure and biogenesis]. 	150
225059	COG2148	WcaJ	Sugar transferase involved in LPS biosynthesis (colanic, teichoic acid) [Cell wall/membrane/envelope biogenesis]. 	226
225060	COG2149	YidH	Uncharacterized membrane protein YidH, DUF202 family [Function unknown]. 	120
225061	COG2150	COG2150	Predicted regulator of amino acid metabolism, contains ACT domain  [General function prediction only]. 	167
225062	COG2151	PaaD	Metal-sulfur cluster biosynthetic enzyme [Posttranslational modification, protein turnover, chaperones]. 	111
225063	COG2152	COG2152	Predicted glycosyl hydrolase, GH43/DUF377 family  [Carbohydrate transport and metabolism]. 	314
225064	COG2153	ElaA	Predicted N-acyltransferase, GNAT family [General function prediction only]. 	155
225065	COG2154	PhhB	Pterin-4a-carbinolamine dehydratase  [Coenzyme transport and metabolism]. 	101
225066	COG2155	YuzA	Uncharacterized membrane protein YuzA, DUF378 family [Function unknown]. 	79
225067	COG2156	KdpC	K+-transporting ATPase, c chain [Inorganic ion transport and metabolism]. 	190
225068	COG2157	RPL20A	Ribosomal protein L20A (L18A)  [Translation, ribosomal structure and biogenesis]. 	85
225069	COG2158	COG2158	Uncharacterized protein, contains a Zn-finger-like domain  [General function prediction only]. 	112
225070	COG2159	COG2159	Predicted metal-dependent hydrolase, TIM-barrel fold  [General function prediction only]. 	293
225071	COG2160	AraA	L-arabinose isomerase [Carbohydrate transport and metabolism]. 	497
225072	COG2161	StbD	Antitoxin component YafN of the YafNO toxin-antitoxin module, PHD/YefM family [Defense mechanisms]. 	86
225073	COG2162	NhoA	Arylamine N-acetyltransferase [Secondary metabolites biosynthesis, transport and catabolism]. 	275
225074	COG2163	RPL14A	Ribosomal protein L14E/L6E/L27E  [Translation, ribosomal structure and biogenesis]. 	125
225075	COG2164	COG2164	Uncharacterized protein [Function unknown]. 	126
225076	COG2165	PulG	Type II secretory pathway, pseudopilin PulG [Cell motility, Intracellular trafficking, secretion, and vesicular transport, Extracellular structures]. 	149
225077	COG2166	SufE	Sulfur transfer protein SufE, Fe-S cluster assembly [Posttranslational modification, protein turnover, chaperones]. 	144
225078	COG2167	RPL39	Ribosomal protein L39E  [Translation, ribosomal structure and biogenesis]. 	51
225079	COG2168	DsrH	Sulfur transfer complex TusBCD TusB component, DsrH family  [Posttranslational modification, protein turnover, chaperones]. 	96
225080	COG2169	AdaA	Methylphosphotriester-DNA--protein-cysteine methyltransferase (N-terminal fragment of Ada), contains Zn-binding and two AraC-type DNA-binding domains [Replication, recombination and repair]. 	187
225081	COG2170	YbdK	Gamma-glutamyl:cysteine ligase YbdK, ATP-grasp superfamily  [Posttranslational modification, protein turnover, chaperones]. 	369
225082	COG2171	DapD	Tetrahydrodipicolinate N-succinyltransferase [Amino acid transport and metabolism]. 	271
225083	COG2172	RsbW	Anti-sigma regulatory factor (Ser/Thr protein kinase)  [Signal transduction mechanisms]. 	146
225084	COG2173	DdpX	D-alanyl-D-alanine dipeptidase [Cell wall/membrane/envelope biogenesis]. 	211
225085	COG2174	RPL34A	Ribosomal protein L34E  [Translation, ribosomal structure and biogenesis]. 	93
225086	COG2175	TauD	Taurine dioxygenase, alpha-ketoglutarate-dependent [Secondary metabolites biosynthesis, transport and catabolism]. 	286
225087	COG2176	PolC	DNA polymerase III, alpha subunit (gram-positive type)  [Replication, recombination and repair]. 	1444
225088	COG2177	FtsX	Cell division protein FtsX [Cell cycle control, cell division, chromosome partitioning]. 	297
225089	COG2178	COG2178	Predicted RNA- or ssDNA-binding protein, translin family  [General function prediction only]. 	204
225090	COG2179	YqeG	Predicted phosphohydrolase YqeG, HAD superfamily [General function prediction only]. 	175
225091	COG2180	NarJ	Nitrate reductase assembly protein NarJ, required for insertion of molybdenum cofactor  [Energy production and conversion, Inorganic ion transport and metabolism, Posttranslational modification, protein turnover, chaperones]. 	179
225092	COG2181	NarI	Nitrate reductase gamma subunit [Energy production and conversion, Inorganic ion transport and metabolism]. 	228
225093	COG2182	MalE	Maltose-binding periplasmic protein MalE [Carbohydrate transport and metabolism]. 	420
225094	COG2183	Tex	Transcriptional accessory protein Tex/SPT6 [Transcription]. 	780
225095	COG2184	FIDO	Fido, protein-threonine AMPylation domain [Signal transduction mechanisms]. 	201
225096	COG2185	Sbm	Methylmalonyl-CoA mutase, C-terminal domain/subunit (cobalamin-binding) [Lipid transport and metabolism]. 	143
225097	COG2186	FadR	DNA-binding transcriptional regulator, FadR family [Transcription]. 	241
225098	COG2187	COG2187	Aminoglycoside phosphotransferase family enzyme [General function prediction only]. 	337
225099	COG2188	MngR	DNA-binding transcriptional regulator, GntR family [Transcription]. 	236
225100	COG2189	Mod	Adenine specific DNA methylase Mod  [Replication, recombination and repair]. 	590
225101	COG2190	NagE	Phosphotransferase system IIA component [Carbohydrate transport and metabolism]. 	156
225102	COG2191	FwdE	Formylmethanofuran dehydrogenase subunit E  [Energy production and conversion]. 	206
225103	COG2192	COG2192	Predicted carbamoyl transferase, NodU family  [General function prediction only]. 	555
225104	COG2193	Bfr	Bacterioferritin (cytochrome b1) [Inorganic ion transport and metabolism]. 	157
225105	COG2194	OpgE	Phosphoethanolamine transferase for periplasmic glucans (OPG), alkaline phosphatase superfamily [Cell wall/membrane/envelope biogenesis]. 	555
225106	COG2195	PepD2	Di- or tripeptidase [Amino acid transport and metabolism]. 	414
225107	COG2197	CitB	DNA-binding response regulator, NarL/FixJ family, contains REC and HTH domains [Signal transduction mechanisms, Transcription]. 	211
225108	COG2198	HPtr	HPt (histidine-containing phosphotransfer) domain [Signal transduction mechanisms]. 	122
225109	COG2199	GGDEF	GGDEF domain, diguanylate cyclase (c-di-GMP synthetase) or its enzymatically inactive variants  [Signal transduction mechanisms]. 	181
225110	COG2200	EAL	EAL domain, c-di-GMP-specific phosphodiesterase class I (or its enzymatically inactive variant) [Signal transduction mechanisms]. 	256
225111	COG2201	CheB	Chemotaxis response regulator CheB, contains REC and protein-glutamate methylesterase domains [Cell motility, Signal transduction mechanisms]. 	350
225112	COG2202	PAS	PAS domain [Signal transduction mechanisms]. 	232
225113	COG2203	FhlA	GAF domain [Signal transduction mechanisms]. 	175
225114	COG2204	AtoC	DNA-binding transcriptional response regulator, NtrC family, contains REC, AAA-type ATPase, and a Fis-type DNA-binding domains [Signal transduction mechanisms]. 	464
225115	COG2205	KdpD	K+-sensing histidine kinase KdpD [Signal transduction mechanisms]. 	890
225116	COG2206	HDGYP	HD-GYP domain, c-di-GMP phosphodiesterase class II (or its inactivated variant)  [Signal transduction mechanisms]. 	344
225117	COG2207	AraC	AraC-type DNA-binding domain and AraC-containing proteins [Transcription]. 	127
225118	COG2208	RsbU	Serine phosphatase RsbU, regulator of sigma subunit  [Signal transduction mechanisms, Transcription]. 	367
225119	COG2209	NqrE	Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrE  [Energy production and conversion]. 	198
225120	COG2210	YrkE	Peroxiredoxin family protein  [Energy production and conversion]. 	137
225121	COG2211	MelB	Na+/melibiose symporter or related transporter [Carbohydrate transport and metabolism]. 	467
225122	COG2212	MnhF	Multisubunit Na+/H+ antiporter, MnhF subunit  [Inorganic ion transport and metabolism]. 	89
225123	COG2213	MtlA	Phosphotransferase system, mannitol-specific IIBC component [Carbohydrate transport and metabolism]. 	472
225124	COG2214	CbpA	Curved DNA-binding protein CbpA, contains a DnaJ-like domain [Transcription]. 	237
225125	COG2215	RcnA	ABC-type nickel/cobalt efflux system, permease component RcnA [Inorganic ion transport and metabolism]. 	303
225126	COG2216	KdpB	High-affinity K+ transport system, ATPase chain B [Inorganic ion transport and metabolism]. 	681
225127	COG2217	ZntA	Cation transport ATPase [Inorganic ion transport and metabolism]. 	713
225128	COG2218	FwdC	Formylmethanofuran dehydrogenase subunit C  [Energy production and conversion]. 	264
225129	COG2219	PRI2	Eukaryotic-type DNA primase, large subunit  [Replication, recombination and repair]. 	363
225130	COG2220	UlaG	L-ascorbate metabolism protein UlaG, beta-lactamase superfamily [Carbohydrate transport and metabolism]. 	258
225131	COG2221	DsrA	Dissimilatory sulfite reductase (desulfoviridin), alpha and beta subunits  [Inorganic ion transport and metabolism]. 	317
225132	COG2222	AgaS	Fructoselysine-6-P-deglycase FrlB and related proteins with duplicated sugar isomerase (SIS) domain [Cell wall/membrane/envelope biogenesis]. 	340
225133	COG2223	NarK	Nitrate/nitrite transporter NarK [Inorganic ion transport and metabolism]. 	417
225134	COG2224	AceA	Isocitrate lyase [Energy production and conversion]. 	433
225135	COG2225	AceB	Malate synthase [Energy production and conversion]. 	545
225136	COG2226	UbiE	Ubiquinone/menaquinone biosynthesis C-methylase UbiE [Coenzyme transport and metabolism]. 	238
225137	COG2227	UbiG	2-polyprenyl-3-methyl-5-hydroxy-6-methoxy-1,4-benzoquinol methylase [Coenzyme transport and metabolism]. 	243
225138	COG2229	Srp102	Signal recognition particle receptor subunit beta, a GTPase [Intracellular trafficking, secretion, and vesicular transport]. 	187
225139	COG2230	Cfa	Cyclopropane fatty-acyl-phospholipid synthase and related methyltransferases [Lipid transport and metabolism]. 	283
225140	COG2231	COG2231	Uncharacterized protein related to Endonuclease III  [General function prediction only]. 	215
225141	COG2232	COG2232	Predicted ATP-dependent carboligase, ATP-grasp superfamily [General function prediction only]. 	389
225142	COG2233	UraA	Xanthine/uracil permease [Nucleotide transport and metabolism]. 	451
225143	COG2234	Iap	Zn-dependent amino- or carboxypeptidase, M28 family [Posttranslational modification, protein turnover, chaperones, Amino acid transport and metabolism]. 	435
225144	COG2235	ArcA	Arginine deiminase  [Amino acid transport and metabolism]. 	409
225145	COG2236	Hpt1	Hypoxanthine phosphoribosyltransferase  [Coenzyme transport and metabolism]. 	192
225146	COG2237	COG2237	Uncharacterized membrane protein [Function unknown]. 	364
225147	COG2238	RPS19A	Ribosomal protein S19E (S16A)  [Translation, ribosomal structure and biogenesis]. 	147
225148	COG2239	MgtE	Mg/Co/Ni transporter MgtE (contains CBS domain)  [Inorganic ion transport and metabolism]. 	451
225149	COG2240	PdxK	Pyridoxal/pyridoxine/pyridoxamine kinase [Coenzyme transport and metabolism]. 	281
225150	COG2241	CobL	Precorrin-6B methylase 1  [Coenzyme transport and metabolism]. 	210
225151	COG2242	CobL	Precorrin-6B methylase 2  [Coenzyme transport and metabolism]. 	187
225152	COG2243	CobF	Precorrin-2 methylase  [Coenzyme transport and metabolism]. 	234
225153	COG2244	RfbX	Membrane protein involved in the export of O-antigen and teichoic acid [Cell wall/membrane/envelope biogenesis]. 	480
225154	COG2245	COG2245	Uncharacterized membrane protein  [Function unknown]. 	182
225155	COG2246	GtrA	Putative flippase GtrA (transmembrane translocase of bactoprenol-linked glucose) [Lipid transport and metabolism]. 	139
225156	COG2247	LytB	Putative cell wall-binding domain  [Cell wall/membrane/envelope biogenesis]. 	337
225157	COG2248	COG2248	Predicted hydrolase, metallo-beta-lactamase superfamily [General function prediction only]. 	304
225158	COG2249	MdaB	Putative NADPH-quinone reductase (modulator of drug activity B) [General function prediction only]. 	189
225159	COG2250	HEPN	HEPN domain [Function unknown]. 	132
225160	COG2251	COG2251	Predicted nuclease, RecB family  [General function prediction only]. 	474
225161	COG2252	AzgA	Xanthine/uracil/vitamin C permease, AzgA family [Nucleotide transport and metabolism]. 	436
225162	COG2253	COG2253	Predicted nucleotidyltransferase component of viral defense system [Defense mechanisms]. 	258
225163	COG2254	Cas3	CRISPR/Cas system-associated endonuclease Cas3-HD [Defense mechanisms]. 	230
225164	COG2255	RuvB	Holliday junction resolvasome RuvABC, ATP-dependent DNA helicase subunit [Replication, recombination and repair]. 	332
225165	COG2256	RarA	Replication-associated recombination protein RarA (DNA-dependent ATPase) [Replication, recombination and repair]. 	436
225166	COG2257	YlqH	Type III secretion system substrate exporter, FlhB-like [Intracellular trafficking, secretion, and vesicular transport]. 	92
225167	COG2258	YiiM	Uncharacterized conserved protein YiiM, contains MOSC domain [Function unknown]. 	210
225168	COG2259	DoxX	Uncharacterized membrane protein YphA, DoxX/SURF4 family [Function unknown]. 	142
225169	COG2260	Nop10	rRNA maturation protein Nop10, contains Zn-ribbon domain [Translation, ribosomal structure and biogenesis]. 	59
225170	COG2261	YeaQ	Uncharacterized membrane protein YeaQ/YmgE, transglycosylase-associated protein family [General function prediction only]. 	82
225171	COG2262	HflX	50S ribosomal subunit-associated GTPase HflX [Translation, ribosomal structure and biogenesis]. 	411
225172	COG2263	COG2263	Predicted RNA methylase  [General function prediction only]. 	198
225173	COG2264	PrmA	Ribosomal protein L11 methylase PrmA [Translation, ribosomal structure and biogenesis]. 	300
225174	COG2265	TrmA	tRNA/tmRNA/rRNA uracil-C5-methylase, TrmA/RlmC/RlmD family [Translation, ribosomal structure and biogenesis]. 	432
225175	COG2266	COG2266	GTP:adenosylcobinamide-phosphate guanylyltransferase  [Coenzyme transport and metabolism]. 	177
225176	COG2267	PldB	Lysophospholipase, alpha-beta hydrolase superfamily [Lipid transport and metabolism]. 	298
225177	COG2268	YqiK	Uncharacterized membrane protein YqiK, contains Band7/PHB/SPFH domain [Function unknown]. 	548
225178	COG2269	EpmA	Elongation factor P--beta-lysine ligase (EF-P beta-lysylation pathway) [Translation, ribosomal structure and biogenesis]. 	322
225179	COG2270	BtlA	MFS-type transporter involved in bile tolerance, Atg22 family [General function prediction only]. 	438
225180	COG2271	UhpC	Sugar phosphate permease [Carbohydrate transport and metabolism]. 	448
225181	COG2272	PnbA	Carboxylesterase type B  [Lipid transport and metabolism]. 	491
225182	COG2273	BglS	Beta-glucanase, GH16 family [Carbohydrate transport and metabolism]. 	355
225183	COG2274	SunT	ABC-type bacteriocin/lantibiotic exporters, contain an N-terminal double-glycine peptidase domain  [Defense mechanisms]. 	709
225184	COG2301	CitE	Citrate lyase beta subunit [Carbohydrate transport and metabolism]. 	283
225185	COG2302	YlmH	RNA-binding protein YlmH, contains S4-like domain  [General function prediction only]. 	257
225186	COG2303	BetA	Choline dehydrogenase or related flavoprotein [Lipid transport and metabolism, General function prediction only]. 	542
225187	COG2304	YfbK	Secreted protein containing bacterial Ig-like domain and vWFA domain [General function prediction only]. 	399
225188	COG2306	COG2306	Predicted RNA-binding protein, associated with RNAse of E/G family [General function prediction only]. 	183
225189	COG2307	COG2307	Uncharacterized conserved protein, Alpha-E superfamily [Function unknown]. 	313
225190	COG2308	COG2308	Uncharacterized conserved protein, circularly permuted ATPgrasp superfamily  [Function unknown]. 	488
225191	COG2309	AmpS	Leucyl aminopeptidase (aminopeptidase T)  [Amino acid transport and metabolism]. 	385
225192	COG2310	TerZ	Stress response protein SCP2 [Signal transduction mechanisms]. 	182
225193	COG2311	YeiB	Uncharacterized membrane protein YeiB [Function unknown]. 	394
225194	COG2312	YbfO	Erythromycin esterase homolog  [Secondary metabolites biosynthesis, transport and catabolism]. 	405
225195	COG2313	PsuG	Pseudouridine-5'-phosphate glycosidase (pseudoU degradation) [Nucleotide transport and metabolism]. 	310
225196	COG2314	TM2	Uncharacterized membrane protein YozV, TM2 domain [Function unknown]. 	95
225197	COG2315	MmcQ	Predicted DNA-binding protein with double-wing structural motif, MmcQ/YjbR family [Transcription]. 	118
225198	COG2316	COG2316	Predicted hydrolase, HD superfamily  [General function prediction only]. 	212
225199	COG2317	YpwA	Zn-dependent carboxypeptidase, M32 family [Posttranslational modification, protein turnover, chaperones]. 	497
225200	COG2318	DinB	Uncharacterized damage-inducible protein DinB (forms a four-helix bundle) [Function unknown]. 	172
225201	COG2319	WD40	WD40 repeat  [General function prediction only]. 	466
225202	COG2320	GrpB	GrpB domain, predicted nucleotidyltransferase, UPF0157 family [General function prediction only]. 	185
225203	COG2321	YpfJ	Predicted metalloprotease [General function prediction only]. 	295
225204	COG2322	YozB	Uncharacterized membrane protein YozB, DUF420 family [Function unknown]. 	177
225205	COG2323	YcaP	Uncharacterized membrane protein YcaP, DUF421 family [Function unknown]. 	224
225206	COG2324	COG2324	Uncharacterized membrane protein  [Function unknown]. 	281
225207	COG2326	COG2326	Polyphosphate kinase 2, PPK2 family [Energy production and conversion]. 	270
225208	COG2327	WcaK	Polysaccharide pyruvyl transferase family protein WcaK [Cell wall/membrane/envelope biogenesis]. 	385
225209	COG2329	HmoA	Heme-degrading monooxygenase HmoA and related ABM domain proteins [Coenzyme transport and metabolism]. 	105
225210	COG2331	COG2331	Predicted nucleic acid-binding protein, contains Zn-ribbon domain [General function prediction only]. 	82
225211	COG2332	CcmE	Cytochrome c-type biogenesis protein CcmE [Energy production and conversion, Posttranslational modification, protein turnover, chaperones]. 	153
225212	COG2333	ComEC	Metal-dependent hydrolase, beta-lactamase superfamily II [General function prediction only]. 	293
225213	COG2334	SrkA	Ser/Thr protein kinase RdoA involved in Cpx stress response, MazF antagonist [Signal transduction mechanisms]. 	331
225214	COG2335	FAS1	Uncharacterized surface protein containing fasciclin (FAS1) repeats  [General function prediction only]. 	187
225215	COG2336	MazE	Antitoxin component of the MazEF toxin-antitoxin module [Signal transduction mechanisms]. 	82
225216	COG2337	MazF	mRNA-degrading endonuclease, toxin component of the MazEF toxin-antitoxin module [Defense mechanisms]. 	112
225217	COG2339	PrsW	Membrane proteinase PrsW, cleaves anti-sigma factor RsiW, M82 family [Signal transduction mechanisms]. 	274
225218	COG2340	YkwD	Uncharacterized conserved protein YkwD, contains CAP (CSP/antigen 5/PR1) domain [Function unknown]. 	207
225219	COG2342	COG2342	Endo alpha-1,4 polygalactosaminidase, GH114 family (was erroneously annotated as Cys-tRNA synthetase) [Carbohydrate transport and metabolism]. 	300
225220	COG2343	COG2343	Uncharacterized conserved protein, DUF427 family [Function unknown]. 	132
225221	COG2344	Rex	NADH/NAD ratio-sensing transcriptional regulator Rex [Transcription]. 	211
225222	COG2345	COG2345	Predicted transcriptional regulator, ArsR family [Transcription]. 	218
225223	COG2346	YjbI	Truncated hemoglobin YjbI [Inorganic ion transport and metabolism]. 	133
225224	COG2348	FmhB	Lipid II:glycine glycyltransferase (Peptidoglycan interpeptide bridge formation enzyme) [Cell wall/membrane/envelope biogenesis]. 	418
225225	COG2350	YciI	Uncharacterized conserved protein YciI, contains a putative active-site phosphohistidine  [General function prediction only]. 	92
225226	COG2351	HiuH	5-hydroxyisourate hydrolase (purine catabolism), transthyretin-related family [Nucleotide transport and metabolism]. 	124
225227	COG2352	Ppc	Phosphoenolpyruvate carboxylase [Energy production and conversion]. 	910
225228	COG2353	YceI	Polyisoprenoid-binding periplasmic protein YceI [General function prediction only]. 	192
225229	COG2354	MutK	Uncharacterized membrane protein MutK, may be involved in DNA repair  [Function unknown]. 	303
225230	COG2355	COG2355	Zn-dependent dipeptidase, microsomal dipeptidase homolog  [Posttranslational modification, protein turnover, chaperones, Amino acid transport and metabolism]. 	313
225231	COG2356	EndA	Endonuclease I [Replication, recombination and repair]. 	237
225232	COG2357	YjbM	ppGpp synthetase catalytic domain (RelA/SpoT-type nucleotidyltransferase)  [Nucleotide transport and metabolism, Signal transduction mechanisms]. 	231
225233	COG2358	Imp	TRAP-type uncharacterized transport system, periplasmic component  [General function prediction only]. 	321
225234	COG2359	SpoVS	Stage V sporulation protein SpoVS  (function unknown) [Function unknown]. 	87
225235	COG2360	Aat	Leu/Phe-tRNA-protein transferase [Posttranslational modification, protein turnover, chaperones]. 	221
225236	COG2361	COG2361	Uncharacterized conserved protein, contains HEPN domain [Function unknown]. 	117
225237	COG2362	DppA	D-aminopeptidase  [Amino acid transport and metabolism]. 	274
225238	COG2363	YgdD	Uncharacterized membrane protein YgdD, TMEM256/DUF423 family [Function unknown]. 	124
225239	COG2364	YczE	Uncharacterized membrane protein YczE [Function unknown]. 	210
225240	COG2365	Oca4	Protein tyrosine/serine phosphatase  [Signal transduction mechanisms]. 	249
225241	COG2366	PvdQ	Acyl-homoserine lactone (AHL) acylase PvdQ [Secondary metabolites biosynthesis, transport and catabolism]. 	768
225242	COG2367	PenP	Beta-lactamase class A  [Defense mechanisms]. 	329
225243	COG2368	YoaI	Aromatic ring hydroxylase  [Secondary metabolites biosynthesis, transport and catabolism]. 	493
225244	COG2369	COG2369	Uncharacterized conserved protein, contains phage Mu gpF-like domain [Function unknown]. 	432
225245	COG2370	HupE	Hydrogenase/urease accessory protein HupE [Posttranslational modification, protein turnover, chaperones]. 	201
225246	COG2371	UreE	Urease accessory protein UreE  [Posttranslational modification, protein turnover, chaperones]. 	155
225247	COG2372	CopC	Copper-binding protein CopC (methionine-rich) [Inorganic ion transport and metabolism]. 	127
225248	COG2373	YfaS	Uncharacterized conserved protein YfaS, alpha-2-macroglobulin family  [General function prediction only]. 	1621
225249	COG2374	COG2374	Predicted extracellular nuclease  [General function prediction only]. 	798
225250	COG2375	ViuB	NADPH-dependent ferric siderophore reductase, contains FAD-binding and SIP domains [Inorganic ion transport and metabolism]. 	265
225251	COG2376	DAK1	Dihydroxyacetone kinase [Carbohydrate transport and metabolism]. 	323
225252	COG2377	AnmK	1,6-Anhydro-N-acetylmuramate kinase [Cell wall/membrane/envelope biogenesis]. 	371
225253	COG2378	YafY	Predicted DNA-binding transcriptional regulator YafY, contains an HTH and WYL domains  [Transcription]. 	311
225254	COG2379	GckA	Glycerate-2-kinase  [Carbohydrate transport and metabolism]. 	422
225255	COG2380	COG2380	Uncharacterized protein [Function unknown]. 	327
225256	COG2382	Fes	Enterochelin esterase or related enzyme [Inorganic ion transport and metabolism]. 	299
225257	COG2383	COG2383	Uncharacterized membrane protein, Fun14 family [Function unknown]. 	109
225258	COG2384	TrmK	tRNA A22 N-methylase [Translation, ribosomal structure and biogenesis]. 	226
225259	COG2385	SpoIID	Peptidoglycan hydrolase (amidase) enhancer domain [Cell wall/membrane/envelope biogenesis]. 	397
225260	COG2386	CcmB	ABC-type transport system involved in cytochrome c biogenesis, permease component [Posttranslational modification, protein turnover, chaperones]. 	221
225261	COG2388	YidJ	Predicted acetyltransferase, GNAT superfamily [General function prediction only]. 	99
225262	COG2389	COG2389	Uncharacterized metal-binding protein, DUF2227 family [Function unknown]. 	179
225263	COG2390	DeoR	DNA-binding transcriptional regulator LsrR, DeoR family [Transcription]. 	321
225264	COG2391	YedE	Uncharacterized membrane protein YedE/YeeE, contains two sulfur transport domains  [General function prediction only]. 	198
225265	COG2401	MK0520	ABC-type ATPase fused to a predicted acetyltransferase domain  [General function prediction only]. 	593
225266	COG2402	COG2402	Predicted nucleic acid-binding protein, contains PIN domain  [General function prediction only]. 	135
225267	COG2403	COG2403	Predicted GTPase  [General function prediction only]. 	449
225268	COG2404	NrnB	Oligoribonuclease NrnB or cAMP/cGMP phosphodiesterase, DHH superfamily [Translation, ribosomal structure and biogenesis, Signal transduction mechanisms]. 	339
225269	COG2405	COG2405	Predicted nucleic acid-binding protein, contains PIN domain  [General function prediction only]. 	157
225270	COG2406	COG2406	Protein distantly related to bacterial ferritins  [General function prediction only]. 	172
225271	COG2407	FucI	L-fucose isomerase or related protein [Carbohydrate transport and metabolism]. 	470
225272	COG2409	YdfJ	Uncharacterized membrane protein YdfJ, MMPL/SSD domain [Function unknown]. 	937
225273	COG2410	COG2410	Predicted nuclease (RNAse H fold)  [General function prediction only]. 	178
225274	COG2411	COG2411	Uncharacterized protein [Function unknown]. 	188
225275	COG2412	COG2412	Uncharacterized protein [Function unknown]. 	101
225276	COG2413	COG2413	Predicted nucleotidyltransferase  [General function prediction only]. 	228
225277	COG2414	YdhV	Aldehyde:ferredoxin oxidoreductase [Energy production and conversion]. 	614
225278	COG2419	COG2419	Trm5-related predicted tRNA methylase [Translation, ribosomal structure and biogenesis]. 	336
225279	COG2421	FmdA	Acetamidase/formamidase  [Energy production and conversion]. 	305
225280	COG2423	OCDMu	Ornithine cyclodeaminase/archaeal alanine dehydrogenase, mu-crystallin family [Amino acid transport and metabolism]. 	330
225281	COG2425	ViaA	Uncharacterized protein, contains a von Willebrand factor type A (vWA) domain [Function unknown]. 	437
225282	COG2426	COG2426	Uncharacterized membrane protein  [Function unknown]. 	142
225283	COG2427	YjgD	Uncharacterized conserved protein YjgD, DUF1641 family [Function unknown]. 	148
225284	COG2428	Sfm1	Rps3 or RNA methylase involved in ribosome biogenesis, SPOUT family,  [Translation, ribosomal structure and biogenesis]. 	196
225285	COG2429	Gch31	Archaeal GTP cyclohydrolase III  [Nucleotide transport and metabolism]. 	250
225286	COG2430	COG2430	Uncharacterized protein [Function unknown]. 	236
225287	COG2431	YbjE	Uncharacterized membrane protein YbjE, DUF340 family [Function unknown]. 	297
225288	COG2433	COG2433	Possible nuclease of RNase H fold, RuvC/YqgF family [General function prediction only]. 	652
225289	COG2440	FixX	Ferredoxin-like protein FixX [Energy production and conversion]. 	99
225290	COG2441	COG2441	Predicted butyrate kinase, DUF1464 family [General function prediction only]. 	374
225291	COG2442	COG2442	Uncharacterized conserved protein, DUF433 family [Function unknown]. 	79
225292	COG2443	Sss1	Preprotein translocase subunit Sss1  [Intracellular trafficking, secretion, and vesicular transport]. 	65
225293	COG2445	YutE	Uncharacterized conserved protein YutE, UPF0331/DUF86 family  [Function unknown]. 	138
225294	COG2450	COG2450	Predicted archaeal cell division protein, SepF homolog, DUF552 family [Cell cycle control, cell division, chromosome partitioning]. 	124
225295	COG2451	Rpl35A	Ribosomal protein L35AE/L33A  [Translation, ribosomal structure and biogenesis]. 	100
225296	COG2452	COG2452	Predicted site-specific integrase-resolvase  [Mobilome: prophages, transposons]. 	193
225297	COG2453	CDC14	Protein-tyrosine phosphatase [Signal transduction mechanisms]. 	180
225298	COG2454	COG2454	Uncharacterized protein [Function unknown]. 	211
225299	COG2456	COG2456	Uncharacterized protein [Function unknown]. 	121
225300	COG2457	COG2457	Uncharacterized protein [Function unknown]. 	199
225301	COG2461	COG2461	Uncharacterized conserved protein, DUF438 domain, may contain hemerythrin domain [Function unknown]. 	409
225302	COG2469	COG2469	Uncharacterized protein, contains HTH domain [Function unknown]. 	284
225303	COG2501	YbcJ	Ribosome-associated protein YbcJ, S4-like RNA binding protein [Translation, ribosomal structure and biogenesis]. 	73
225304	COG2502	AsnA	Asparagine synthetase A [Amino acid transport and metabolism]. 	330
225305	COG2503	COG2503	Predicted secreted acid phosphatase  [General function prediction only]. 	274
225306	COG2508	PucR	DNA-binding transcriptional regulator, PucR family [Transcription]. 	421
225307	COG2509	COG2509	FAD-dependent dehydrogenase [General function prediction only]. 	486
225308	COG2510	COG2510	Uncharacterized membrane protein  [Function unknown]. 	140
225309	COG2511	GatE	Archaeal Glu-tRNAGln amidotransferase subunit E, contains GAD domain  [Translation, ribosomal structure and biogenesis]. 	631
225310	COG2512	COG2512	Uncharacterized membrane protein  [Function unknown]. 	258
225311	COG2513	PrpB	2-Methylisocitrate lyase and related enzymes, PEP mutase family [Carbohydrate transport and metabolism]. 	289
225312	COG2514	CatE	Catechol-2,3-dioxygenase [Secondary metabolites biosynthesis, transport and catabolism]. 	265
225313	COG2515	Acd	1-aminocyclopropane-1-carboxylate deaminase/D-cysteine desulfhydrase, PLP-dependent ACC family [Amino acid transport and metabolism]. 	323
225314	COG2516	COG2516	Biotin synthase-related protein, radical SAM superfamily [General function prediction only]. 	339
225315	COG2517	COG2517	Predicted RNA-binding protein, contains C-terminal EMAP domain  [General function prediction only]. 	219
225316	COG2518	Pcm	Protein-L-isoaspartate O-methyltransferase [Posttranslational modification, protein turnover, chaperones]. 	209
225317	COG2519	Gcd14	tRNA A58 N-methylase Trm61 [Translation, ribosomal structure and biogenesis]. 	256
225318	COG2520	Trm5	tRNA G37 N-methylase Trm5 [Translation, ribosomal structure and biogenesis]. 	341
225319	COG2521	COG2521	Predicted archaeal methyltransferase  [General function prediction only]. 	287
225320	COG2522	COG2522	Predicted transcriptional regulator  [General function prediction only]. 	119
225321	COG2524	COG2524	Predicted transcriptional regulator, contains C-terminal CBS domains  [Transcription]. 	294
225322	COG2602	YbxI	Beta-lactamase class D  [Defense mechanisms]. 	254
225323	COG2603	SelU	tRNA 2-selenouridine synthase SelU, contains rhodanese domain [Translation, ribosomal structure and biogenesis]. 	334
225324	COG2604	COG2604	Uncharacterized conserved protein [Function unknown]. 	594
225325	COG2605	COG2605	Predicted kinase related to galactokinase and mevalonate kinase  [General function prediction only]. 	333
225326	COG2606	EbsC	Cys-tRNA(Pro) deacylase, prolyl-tRNA editing enzyme YbaK/EbsC [Translation, ribosomal structure and biogenesis]. 	155
225327	COG2607	COG2607	Predicted ATPase, AAA+ superfamily  [General function prediction only]. 	287
225328	COG2608	CopZ	Copper chaperone CopZ [Inorganic ion transport and metabolism]. 	71
225329	COG2609	AceE	Pyruvate dehydrogenase complex, dehydrogenase (E1) component [Energy production and conversion]. 	887
225330	COG2610	GntT	H+/gluconate symporter or related permease [Carbohydrate transport and metabolism, General function prediction only]. 	442
225331	COG2703	COG2703	Hemerythrin  [Signal transduction mechanisms]. 	144
225332	COG2704	DcuA	Anaerobic C4-dicarboxylate transporter [Carbohydrate transport and metabolism]. 	436
225333	COG2706	Pgl	6-phosphogluconolactonase, cycloisomerase 2 family [Carbohydrate transport and metabolism]. 	346
225334	COG2707	YeaL	Uncharacterized membrane protein, DUF441 family [Function unknown]. 	151
225335	COG2710	NifD	Nitrogenase molybdenum-iron protein, alpha and beta chains  [Inorganic ion transport and metabolism]. 	456
225336	COG2715	SpmA	Spore maturation protein SpmA (function unknown) [General function prediction only]. 	206
225337	COG2716	GcvR	Glycine cleavage system regulatory protein [Amino acid transport and metabolism]. 	176
225338	COG2717	YedZ	Periplasmic DMSO/TMAO reductase YedYZ, heme-binding membrane subunit [Energy production and conversion]. 	209
225339	COG2718	YeaH	Uncharacterized conserved protein YeaH/YhbH, required for sporulation, DUF444 family [General function prediction only]. 	423
225340	COG2719	SpoVR	Stage V sporulation protein SpoVR/YcgB, involved in spore cortex formation (function unknown) [Cell cycle control, cell division, chromosome partitioning]. 	495
225341	COG2720	YoaR	Vancomycin resistance protein YoaR (function unknown), contains peptidoglycan-binding and VanW domains  [Defense mechanisms]. 	376
225342	COG2721	UxaA	Altronate dehydratase [Carbohydrate transport and metabolism]. 	381
225343	COG2723	BglB	Beta-glucosidase/6-phospho-beta-glucosidase/beta-galactosidase [Carbohydrate transport and metabolism]. 	460
225344	COG2730	BglC	Aryl-phospho-beta-D-glucosidase BglC, GH1 family [Carbohydrate transport and metabolism]. 	407
225345	COG2731	EbgC	Beta-galactosidase, beta subunit [Carbohydrate transport and metabolism]. 	154
225346	COG2732	BarS	Barstar, RNAse (barnase) inhibitor [Transcription]. 	91
225347	COG2733	YjiN	Uncharacterized membrane-anchored protein YjiN, DUF445 family [Function unknown]. 	415
225348	COG2738	YugP	Zn-dependent membrane protease YugP [Posttranslational modification, protein turnover, chaperones]. 	226
225349	COG2739	YlxM	Predicted DNA-binding protein YlxM, UPF0122 family [Transcription]. 	105
225350	COG2740	YlxR	Predicted RNA-binding protein YlxR, DUF448 family [General function prediction only]. 	95
225351	COG2746	YokD	Aminoglycoside N3'-acetyltransferase  [Defense mechanisms]. 	251
225352	COG2747	FlgM	Negative regulator of flagellin synthesis (anti-sigma28 factor) [Transcription, Cell motility]. 	93
225353	COG2755	TesA	Lysophospholipase L1 or related esterase [Amino acid transport and metabolism]. 	216
225354	COG2759	MIS1	Formyltetrahydrofolate synthetase  [Nucleotide transport and metabolism]. 	554
225355	COG2761	FrnE	Predicted dithiol-disulfide isomerase, DsbA family [Posttranslational modification, protein turnover, chaperones]. 	225
225356	COG2764	PhnB	Uncharacterized conserved protein PhnB, glyoxalase superfamily [General function prediction only]. 	136
225357	COG2766	PrkA	Predicted Ser/Thr protein kinase [Signal transduction mechanisms]. 	649
225358	COG2768	COG2768	Uncharacterized Fe-S cluster protein  [Function unknown]. 	354
225359	COG2770	HAMP	HAMP domain  [Signal transduction mechanisms]. 	83
225360	COG2771	CsgD	DNA-binding transcriptional regulator, CsgD family [Transcription]. 	65
225361	COG2801	Tra5	Transposase InsO and inactivated derivatives [Mobilome: prophages, transposons]. 	232
225362	COG2802	LON	Uncharacterized protein, LON-like domain, ASCH/PUA-like superfamily [Function unknown]. 	221
225363	COG2804	PulE	Type II secretory pathway ATPase GspE/PulE or T4P pilus assembly pathway ATPase PilB [Cell motility, Intracellular trafficking, secretion, and vesicular transport, Extracellular structures]. 	500
225364	COG2805	PilT	Tfp pilus assembly protein PilT, pilus retraction ATPase [Cell motility, Extracellular structures]. 	353
225365	COG2807	CynX	Cyanate permease [Inorganic ion transport and metabolism]. 	395
225366	COG2808	PaiB	Predicted FMN-binding regulatory protein PaiB [Signal transduction mechanisms]. 	209
225367	COG2810	COG2810	Predicted type IV restriction endonuclease  [Defense mechanisms]. 	284
225368	COG2811	NtpH	Archaeal/vacuolar-type H+-ATPase subunit H  [Energy production and conversion]. 	108
225369	COG2812	DnaX	DNA polymerase III, gamma/tau subunits [Replication, recombination and repair]. 	515
225370	COG2813	RsmC	16S rRNA G1207 methylase RsmC [Translation, ribosomal structure and biogenesis]. 	300
225371	COG2814	AraJ	Predicted arabinose efflux permease, MFS family [Carbohydrate transport and metabolism]. 	394
225372	COG2815	PASTA	PASTA domain, binds beta-lactams  [Cell wall/membrane/envelope biogenesis]. 	303
225373	COG2816	NPY1	NADH pyrophosphatase NudC, Nudix superfamily [Nucleotide transport and metabolism]. 	279
225374	COG2818	Tag	3-methyladenine DNA glycosylase Tag [Replication, recombination and repair]. 	188
225375	COG2819	YbbA	Predicted hydrolase of the alpha/beta superfamily  [General function prediction only]. 	264
225376	COG2820	Udp	Uridine phosphorylase [Nucleotide transport and metabolism]. 	248
225377	COG2821	MltA	Membrane-bound lytic murein transglycosylase [Cell wall/membrane/envelope biogenesis]. 	373
225378	COG2822	EfeO	Iron uptake system EfeUOB, periplasmic (or lipoprotein) component EfeO/EfeM [Inorganic ion transport and metabolism]. 	376
225379	COG2823	OsmY	Osmotically-inducible protein OsmY, contains BON domain [Function unknown]. 	196
225380	COG2824	PhnA	Uncharacterized Zn-ribbon-containing protein [General function prediction only]. 	112
225381	COG2825	HlpA	Periplasmic chaperone for outer membrane proteins, Skp family [Cell wall/membrane/envelope biogenesis, Posttranslational modification, protein turnover, chaperones]. 	170
225382	COG2826	Tra8	Transposase and inactivated derivatives, IS30 family [Mobilome: prophages, transposons]. 	318
225383	COG2827	YhbQ	Predicted endonuclease, GIY-YIG superfamily  [Replication, recombination and repair]. 	95
225384	COG2828	PrpF	2-Methylaconitate cis-trans-isomerase PrpF (2-methyl citrate pathway) [Energy production and conversion]. 	378
225385	COG2829	PldA	Outer membrane phospholipase A [Cell wall/membrane/envelope biogenesis]. 	317
225386	COG2830	COG2830	Uncharacterized protein [Function unknown]. 	214
225387	COG2831	FhaC	Hemolysin activation/secretion protein  [Intracellular trafficking, secretion, and vesicular transport]. 	554
225388	COG2832	YbaN	Uncharacterized membrane protein YbaN, DUF454 family [Function unknown]. 	119
225389	COG2833	COG2833	Uncharacterized conserved protein, contains ferritin-like DUF455 domain [Function unknown]. 	268
225390	COG2834	LolA	Outer membrane lipoprotein-sorting protein [Cell wall/membrane/envelope biogenesis]. 	211
225391	COG2835	YcaR	Uncharacterized conserved protein YbaR, Trm112 family [Function unknown]. 	60
225392	COG2836	TauE	Sulfite exporter TauE/SafE [Inorganic ion transport and metabolism]. 	232
225393	COG2837	EfeB	Periplasmic deferrochelatase/peroxidase EfeB [Inorganic ion transport and metabolism]. 	352
225394	COG2838	IcdM	Monomeric isocitrate dehydrogenase  [Energy production and conversion]. 	744
225395	COG2839	YqgC	Uncharacterized conserved protein YqgC, DUF456 family [Function unknown]. 	160
225396	COG2840	SmrA	DNA-nicking endonuclease, Smr domain  [Replication, recombination and repair]. 	184
225397	COG2841	YdcH	Uncharacterized conserved protein YdcH, DUF465 family [Function unknown]. 	72
225398	COG2842	COG2842	Bacteriophage DNA transposition protein, AAA+ family ATPase  [Mobilome: prophages, transposons]. 	297
225399	COG2843	COG2843	Poly-gamma-glutamate biosynthesis protein CapA/YwtB (capsule formation), metallophosphatase superfamily   [Cell wall/membrane/envelope biogenesis]. 	372
225400	COG2844	GlnD	UTP:GlnB (protein PII) uridylyltransferase [Posttranslational modification, protein turnover, chaperones, Signal transduction mechanisms]. 	867
225401	COG2845	COG2845	Uncharacterized protein [Function unknown]. 	354
225402	COG2846	RIC	Iron-sulfur cluster repair protein YtfE, RIC family, contains ScdAN and hemerythrin domains [Posttranslational modification, protein turnover, chaperones]. 	221
225403	COG2847	COG2847	Copper(I)-binding protein  [Inorganic ion transport and metabolism]. 	151
225404	COG2848	COG2848	Uncharacterized conserved protein, UPF0210 family [Cell cycle control, cell division, chromosome partitioning]. 	445
225405	COG2849	YwqK	Antitoxin component YwqK of the YwqJK toxin-antitoxin module [Defense mechanisms]. 	230
225406	COG2850	RoxA	Ribosomal protein L16 Arg81 hydroxylase, contains JmjC domain [Translation, ribosomal structure and biogenesis]. 	383
225407	COG2851	CitM	Mg2+/citrate symporter  [Energy production and conversion]. 	433
225408	COG2852	YcjD	Very-short-patch-repair endonuclease [Replication, recombination and repair]. 	129
225409	COG2853	VacJ	ABC-type transporter Mla maintaining outer membrane lipid asymmetry, lipoprotein component MlaA [Cell wall/membrane/envelope biogenesis]. 	250
225410	COG2854	MlaC	ABC-type transporter Mla maintaining outer membrane lipid asymmetry, periplasmic MlaC component [Lipid transport and metabolism]. 	202
225411	COG2855	YeiH	Uncharacterized membrane protein YadS [Function unknown]. 	334
225412	COG2856	ImmA	Zn-dependent peptidase ImmA, M78 family [Posttranslational modification, protein turnover, chaperones]. 	213
225413	COG2857	CYT1	Cytochrome c1  [Energy production and conversion]. 	250
225414	COG2859	COG2859	Uncharacterized protein [Function unknown]. 	237
225415	COG2860	YadS	Uncharacterized membrane protein YeiH [Function unknown]. 	209
225416	COG2861	YibQ	Uncharacterized conserved protein YibQ, putative polysaccharide deacetylase 2 family  [Carbohydrate transport and metabolism]. 	250
225417	COG2862	YqhA	Uncharacterized membrane protein YqhA [Function unknown]. 	169
225418	COG2863	CytC553	Cytochrome c553  [Energy production and conversion]. 	121
225419	COG2864	FdnI	Cytochrome b subunit of formate dehydrogenase [Energy production and conversion]. 	218
225420	COG2865	COG2865	Predicted transcriptional regulator, contains HTH domain [Transcription]. 	467
225421	COG2866	MpaA	Murein tripeptide amidase MpaA [Cell wall/membrane/envelope biogenesis]. 	374
225422	COG2867	PasT	Ribosome association toxin PasT (RatA) of the RatAB toxin-antitoxin module [Translation, ribosomal structure and biogenesis]. 	146
225423	COG2868	YsxB	Uncharacterized conserved protein YsxB, DUF464 family [Function unknown]. 	109
225424	COG2869	NqrC	Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrC  [Energy production and conversion]. 	264
225425	COG2870	RfaE	ADP-heptose synthase, bifunctional sugar kinase/adenylyltransferase [Cell wall/membrane/envelope biogenesis]. 	467
225426	COG2871	NqrF	Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrF  [Energy production and conversion]. 	410
225427	COG2872	AlaX	Ser-tRNA(Ala) deacylase AlaX (editing enzyme) [Translation, ribosomal structure and biogenesis]. 	241
225428	COG2873	MET17	O-acetylhomoserine/O-acetylserine sulfhydrylase, pyridoxal phosphate-dependent [Amino acid transport and metabolism]. 	426
225429	COG2874	FlaH	Archaellum biogenesis protein FlaH, an ATPase [Cell motility]. 	235
225430	COG2875	CobM	Precorrin-4 methylase  [Coenzyme transport and metabolism]. 	254
225431	COG2876	AroGA	3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase  [Amino acid transport and metabolism]. 	286
225432	COG2877	KdsA	3-deoxy-D-manno-octulosonic acid (KDO) 8-phosphate synthase [Cell wall/membrane/envelope biogenesis]. 	279
225433	COG2878	RnfB	Na+-translocating ferredoxin:NAD+ oxidoreductase RNF, RnfB subunit  [Energy production and conversion]. 	198
225434	COG2879	YbdD	Uncharacterized short protein YbdD, DUF466 family [Function unknown]. 	65
225435	COG2880	COG2880	Predicted DNA-binding protein, potential antitoxin AbrB/MazE fold [General function prediction only]. 	67
225436	COG2881	COG2881	Uncharacterized protein [Function unknown]. 	181
225437	COG2882	FliJ	Flagellar biosynthesis chaperone FliJ [Cell motility]. 	148
225438	COG2884	FtsE	ABC-type ATPase involved in cell division [Cell cycle control, cell division, chromosome partitioning]. 	223
225439	COG2885	OmpA	Outer membrane protein OmpA and related peptidoglycan-associated (lipo)proteins [Cell wall/membrane/envelope biogenesis]. 	190
225440	COG2886	COG2886	Predicted antitoxin, contains HTH domain [General function prediction only]. 	88
225441	COG2887	COG2887	RecB family exonuclease  [Replication, recombination and repair]. 	269
225442	COG2888	COG2888	Predicted RNA-binding protein involved in translation, contains  Zn-ribbon domain, DUF1610 family [General function prediction only]. 	61
225443	COG2890	HemK	Methylase of polypeptide chain release factors [Translation, ribosomal structure and biogenesis]. 	280
225444	COG2891	MreD	Cell shape-determining protein MreD [Cell wall/membrane/envelope biogenesis]. 	167
225445	COG2892	Pcc1	tRNA threonylcarbamoyladenosine modification (KEOPS) complex,  Pcc1 subunit [Translation, ribosomal structure and biogenesis]. 	82
225446	COG2893	ManX	Phosphotransferase system, mannose/fructose-specific component IIA [Carbohydrate transport and metabolism]. 	143
225447	COG2894	MinD	Septum formation inhibitor-activating ATPase MinD [Cell cycle control, cell division, chromosome partitioning]. 	272
225448	COG2895	CysN	Sulfate adenylyltransferase subunit 1, EFTu-like GTPase family  [Inorganic ion transport and metabolism]. 	431
225449	COG2896	MoaA	Molybdenum cofactor biosynthesis enzyme MoaA [Coenzyme transport and metabolism]. 	322
225450	COG2897	SseA	3-mercaptopyruvate sulfurtransferase SseA, contains two rhodanese domains [Inorganic ion transport and metabolism]. 	285
225451	COG2898	MprF	Lysylphosphatidylglycerol synthetase, C-terminal domain, DUF2156 family  [Function unknown]. 	538
225452	COG2899	COG2899	Uncharacterized protein [Function unknown]. 	346
225453	COG2900	SlyX	Uncharacterized coiled-coil protein SlyX (sensitive to lysis X)  [Function unknown]. 	72
225454	COG2901	Fis	DNA-binding protein Fis (factor for inversion stimulation) [Transcription]. 	98
225455	COG2902	Gdh2	NAD-specific glutamate dehydrogenase  [Amino acid transport and metabolism]. 	1592
225456	COG2904	QueFN	NADPH-dependent 7-cyano-7-deazaguanine reductase QueF, N-terminal domain [Translation, ribosomal structure and biogenesis]. 	137
225457	COG2905	COG2905	Signal-transduction protein containing cAMP-binding, CBS, and nucleotidyltransferase domains  [Signal transduction mechanisms]. 	610
225458	COG2906	Bfd	Bacterioferritin-associated ferredoxin [Inorganic ion transport and metabolism]. 	63
225459	COG2907	COG2907	Predicted NAD/FAD-binding protein  [General function prediction only]. 	447
225460	COG2908	LpxH	UDP-2,3-diacylglucosamine pyrophosphatase LpxH [Cell wall/membrane/envelope biogenesis]. 	237
225461	COG2909	MalT	ATP-, maltotriose- and DNA-dependent transcriptional regulator MalT [Transcription]. 	894
225462	COG2910	YwnB	Putative NADH-flavin reductase  [General function prediction only]. 	211
225463	COG2911	TamB	Autotransporter translocation and assembly factor TamB [Intracellular trafficking, secretion, and vesicular transport]. 	1278
225464	COG2912	SirB1	Regulator of sirC expression, contains transglutaminase-like and TPR domains [Signal transduction mechanisms]. 	269
225465	COG2913	BamE	Outer membrane protein assembly factor BamE, lipoprotein component of the BamABCDE complex [Cell wall/membrane/envelope biogenesis]. 	147
225466	COG2914	PasI	Putative antitoxin component PasI (RatB) of the RatAB toxin-antitoxin module, ubiquitin-RnfH superfamily [Defense mechanisms]. 	99
225467	COG2915	HflD	Regulator of phage lambda lysogenization HflD, binds to CII and stimulates its degradation [Mobilome: prophages, transposons, Signal transduction mechanisms]. 	207
225468	COG2916	Hns	DNA-binding protein H-NS [Transcription]. 	128
225469	COG2917	YciB	Intracellular septation protein A [Cell cycle control, cell division, chromosome partitioning]. 	180
225470	COG2918	GshA	Gamma-glutamylcysteine synthetase [Coenzyme transport and metabolism]. 	518
225471	COG2919	FtsB	Cell division protein FtsB [Cell cycle control, cell division, chromosome partitioning]. 	117
225472	COG2920	DsrC	Sulfur relay (sulfurtransferase) protein, DsrC/TusE family [Inorganic ion transport and metabolism]. 	111
225473	COG2921	YbeD	Putative lipoic acid-binding regulatory protein [Signal transduction mechanisms]. 	90
225474	COG2922	Smg	Uncharacterized conserved protein Smg, DUF494 family  [Function unknown]. 	157
225475	COG2923	DsrF	Sulfur relay (sulfurtransferase) complex TusC component, DsrF/TusC family  [Inorganic ion transport and metabolism]. 	118
225476	COG2924	YggX	Fe-S cluster biosynthesis and repair protein YggX  [Inorganic ion transport and metabolism, Posttranslational modification, protein turnover, chaperones]. 	90
225477	COG2925	SbcB	Exonuclease I [Replication, recombination and repair]. 	475
225478	COG2926	YeeX	Uncharacterized conserved protein YeeX, DUF496 family [Function unknown]. 	109
225479	COG2927	HolC	DNA polymerase III, chi subunit [Replication, recombination and repair]. 	144
225480	COG2928	COG2928	Uncharacterized membrane protein [Function unknown]. 	222
225481	COG2929	COG2929	Uncharacterized conserved protein, DUF497 family [Function unknown]. 	93
225482	COG2930	SYLF	Lipid-binding SYLF domain [Lipid transport and metabolism]. 	227
225483	COG2931	COG2931	Ca2+-binding protein, RTX toxin-related  [Secondary metabolites biosynthesis, transport and catabolism]. 	510
225484	COG2932	COG2932	Phage repressor protein C, contains Cro/C1-type HTH and peptidase S24 domains [Mobilome: prophages, transposons]. 	214
225485	COG2933	RlmM	23S rRNA C2498 (ribose-2'-O)-methylase RlmM [Translation, ribosomal structure and biogenesis]. 	358
225486	COG2935	Ate1	Arginyl-tRNA--protein-N-Asp/Glu arginylyltransferase  [Posttranslational modification, protein turnover, chaperones]. 	253
225487	COG2936	COG2936	Predicted acyl esterase [General function prediction only]. 	563
225488	COG2937	PlsB	Glycerol-3-phosphate O-acyltransferase [Lipid transport and metabolism]. 	810
225489	COG2938	SdhE	Succinate dehydrogenase flavin-adding protein, antitoxin component of the CptAB toxin-antitoxin module [Posttranslational modification, protein turnover, chaperones]. 	94
225490	COG2939	Kex1	Carboxypeptidase C (cathepsin A)  [Amino acid transport and metabolism]. 	498
225491	COG2940	SET	SET domain-containing protein (function unknown) [General function prediction only]. 	480
225492	COG2941	Coq7	Demethoxyubiquinone hydroxylase, CLK1/Coq7/Cat5 family [Coenzyme transport and metabolism]. 	204
225493	COG2942	YihS	Mannose or cellobiose epimerase, N-acyl-D-glucosamine 2-epimerase family [Carbohydrate transport and metabolism]. 	388
225494	COG2943	MdoH	Membrane glycosyltransferase [Cell wall/membrane/envelope biogenesis, Carbohydrate transport and metabolism]. 	736
225495	COG2944	YiaG	DNA-binding transcriptional regulator YiaG, XRE-type HTH domain [Transcription]. 	104
225496	COG2945	COG2945	Alpha/beta superfamily hydrolase [General function prediction only]. 	210
225497	COG2946	NicK	DNA relaxase NicK  [Replication, recombination and repair]. 	377
225498	COG2947	COG2947	Predicted RNA-binding protein, contains PUA-like domain [General function prediction only]. 	156
225499	COG2948	VirB10	Type IV secretory pathway, VirB10 components  [Intracellular trafficking, secretion, and vesicular transport]. 	360
225500	COG2949	SanA	Uncharacterized periplasmic protein SanA, affects membrane permeability for vancomycin [Cell wall/membrane/envelope biogenesis]. 	235
225501	COG2951	MltB	Membrane-bound lytic murein transglycosylase B [Cell wall/membrane/envelope biogenesis]. 	343
225502	COG2952	COG2952	Uncharacterized protein [Function unknown]. 	183
225503	COG2954	CYTH	CYTH domain, found in class IV adenylate cyclase and various triphosphatases [General function prediction only]. 	156
225504	COG2956	YciM	Lipopolysaccharide biosynthesis regulator YciM, contains six TPR domains and a predicted metal-binding C-terminal domain [Cell wall/membrane/envelope biogenesis]. 	389
225505	COG2957	AguA	Agmatine/peptidylarginine deiminase [Amino acid transport and metabolism]. 	346
225506	COG2958	COG2958	Uncharacterized protein [Function unknown]. 	307
225507	COG2959	HemX	Uncharacterized conserved protein HemX (no evidence of involvement in heme biosynthesis) [Function unknown]. 	391
225508	COG2960	YqiC	Uncharacterized conserved protein YqiC, BMFP domain [Function unknown]. 	103
225509	COG2961	RlmJ	23S rRNA A2030 N6-methylase RlmJ [Translation, ribosomal structure and biogenesis]. 	279
225510	COG2962	RarD	Uncharacterized membrane protein RarD, contains two EamA domains [Function unknown]. 	293
225511	COG2963	InsE	Transposase and inactivated derivatives [Mobilome: prophages, transposons]. 	116
225512	COG2964	YheO	Predicted transcriptional regulator YheO, contains PAS and DNA-binding HTH domains [Transcription]. 	220
225513	COG2965	PriB	Primosomal replication protein N [Replication, recombination and repair]. 	103
225514	COG2966	YjjP	Uncharacterized membrane protein YjjP, DUF1212 family [Function unknown]. 	250
225515	COG2967	ApaG	Uncharacterized protein affecting Mg2+/Co2+ transport [Inorganic ion transport and metabolism]. 	126
225516	COG2968	YggE	Uncharacterized conserved protein YggE, contains kinase-interacting SIMPL domain  [Function unknown]. 	243
225517	COG2969	SspB	Stringent starvation protein B, binds SsrA peptide [Posttranslational modification, protein turnover, chaperones]. 	155
225518	COG2971	BadF	BadF-type ATPase, related to human N-acetylglucosamine kinase  [Carbohydrate transport and metabolism]. 	301
225519	COG2972	YesM	Sensor histidine kinase YesM [Signal transduction mechanisms]. 	456
225520	COG2973	TrpR	Trp operon repressor [Transcription]. 	103
225521	COG2974	RdgC	DNA recombination-dependent growth factor C [Replication, recombination and repair]. 	303
225522	COG2975	IscX	Fe-S-cluster formation regulator IscX/YfhJ [Posttranslational modification, protein turnover, chaperones]. 	64
225523	COG2976	YfgM	Putative negative regulator of RcsB-dependent stress response [Signal transduction mechanisms]. 	207
225524	COG2977	EntD	4'-phosphopantetheinyl transferase EntD (siderophore biosynthesis) [Secondary metabolites biosynthesis, transport and catabolism]. 	228
225525	COG2978	AbgT	p-Aminobenzoyl-glutamate transporter AbgT [Coenzyme transport and metabolism]. 	516
225526	COG2979	YebE	Uncharacterized membrane protein YebE, DUF533 family [Function unknown]. 	225
225527	COG2980	LptE	Outer membrane lipoprotein LptE/RlpB (LPS assembly) [Cell wall/membrane/envelope biogenesis]. 	178
225528	COG2981	CysZ	Uncharacterized protein involved in cysteine biosynthesis [Amino acid transport and metabolism]. 	250
225529	COG2982	AsmA	Uncharacterized protein involved in outer membrane biogenesis [Cell wall/membrane/envelope biogenesis]. 	648
225530	COG2983	YcgN	Uncharacterized cysteine cluster protein YcgN, CxxCxxCC family [Function unknown]. 	153
225531	COG2984	COG2984	ABC-type uncharacterized transport system, periplasmic component  [General function prediction only]. 	322
225532	COG2985	YbjL	Uncharacterized membrane protein YbjL, putative transporter [General function prediction only]. 	544
225533	COG2986	HutH	Histidine ammonia-lyase  [Amino acid transport and metabolism]. 	498
225534	COG2987	HutU	Urocanate hydratase  [Amino acid transport and metabolism]. 	561
225535	COG2988	AstE	Succinylglutamate desuccinylase [Amino acid transport and metabolism]. 	324
225536	COG2989	YcbB	Murein L,D-transpeptidase YcbB/YkuD [Cell wall/membrane/envelope biogenesis]. 	561
225537	COG2990	VirK	Uncharacterized protein VirK/YbjX, DUF535 family [Function unknown]. 	300
225538	COG2991	COG2991	Uncharacterized protein [Function unknown]. 	77
225539	COG2992	Bax	Uncharacterized FlgJ-related protein [General function prediction only]. 	262
225540	COG2993	CcoO	Cbb3-type cytochrome oxidase, cytochrome c subunit  [Energy production and conversion]. 	227
225541	COG2994	HlyC	ACP:hemolysin acyltransferase (hemolysin-activating protein)  [Posttranslational modification, protein turnover, chaperones]. 	148
225542	COG2995	PqiA	Uncharacterized paraquat-inducible protein A [Function unknown]. 	418
225543	COG2996	CvfB	Predicted RNA-binding protein, contains S1 domains, virulence factor B family [General function prediction only]. 	287
225544	COG2998	TupA	tungsten ABC transporter substrate-binding protein  [Inorganic ion transport and metabolism]. 	280
225545	COG2999	GrxB	Glutaredoxin 2 [Posttranslational modification, protein turnover, chaperones]. 	215
225546	COG3000	ERG3	Sterol desaturase/sphingolipid hydroxylase, fatty acid hydroxylase superfamily [Lipid transport and metabolism]. 	271
225547	COG3001	FN3K	Fructosamine-3-kinase [Carbohydrate transport and metabolism]. 	286
225548	COG3002	YbcC	Uncharacterized conserved protein YbcC, UPF0753/DUF2309 family [Function unknown]. 	880
225549	COG3004	NhaA	Na+/H+ antiporter NhaA [Energy production and conversion, Inorganic ion transport and metabolism]. 	390
225550	COG3005	NapC	Tetraheme cytochrome c subunit of nitrate or TMAO reductase [Energy production and conversion]. 	190
225551	COG3006	MukF	Chromosome condensin MukBEF complex, kleisin-like MukF subunit [Cell cycle control, cell division, chromosome partitioning]. 	440
225552	COG3007	COG3007	Trans-2-enoyl-CoA reductase [Lipid transport and metabolism]. 	398
225553	COG3008	PqiB	Paraquat-inducible protein B (function unknown) [Function unknown]. 	553
225554	COG3009	YmbA	Uncharacterized lipoprotein YmbA [Function unknown]. 	190
225555	COG3010	NanE	Putative N-acetylmannosamine-6-phosphate epimerase [Carbohydrate transport and metabolism]. 	229
225556	COG3011	YuxK	Predicted thiol-disulfide oxidoreductase YuxK, DCC family [General function prediction only]. 	137
225557	COG3012	YchJ	Uncharacterized conserved protein YchJ, contains N- and C-terminal SEC-C domains [Function unknown]. 	151
225558	COG3013	YfbU	Uncharacterized protein YfbU, UPF0304 family [Function unknown]. 	168
225559	COG3014	COG3014	Uncharacterized protein [Function unknown]. 	449
225560	COG3015	CutF	Uncharacterized lipoprotein NlpE involved in copper resistance [Cell wall/membrane/envelope biogenesis, Defense mechanisms]. 	178
225561	COG3016	PhuW	Uncharacterized iron-regulated protein  [Function unknown]. 	295
225562	COG3017	LolB	Outer membrane lipoprotein LolB, involved in outer membrane biogenesis [Cell wall/membrane/envelope biogenesis]. 	206
225563	COG3018	COG3018	Uncharacterized protein [Function unknown]. 	115
225564	COG3019	COG3019	Uncharacterized conserved protein [Function unknown]. 	149
225565	COG3021	YafD	Uncharacterized conserved protein YafD, endonuclease/exonuclease/phosphatase (EEP) superfamily [General function prediction only]. 	309
225566	COG3022	YaaA	Cytoplasmic iron level regulating protein YaaA, DUF328/UPF0246 family [Inorganic ion transport and metabolism]. 	253
225567	COG3023	AmpD	N-acetyl-anhydromuramyl-L-alanine amidase AmpD [Cell wall/membrane/envelope biogenesis]. 	257
225568	COG3024	YacG	Endogenous inhibitor of DNA gyrase, YacG/DUF329 family [Replication, recombination and repair]. 	65
225569	COG3025	PPPi	Inorganic triphosphatase YgiF, contains CYTH and CHAD domains [Inorganic ion transport and metabolism]. 	432
225570	COG3026	RseB	Negative regulator of sigma E activity [Signal transduction mechanisms]. 	320
225571	COG3027	ZapA	Cell division protein ZapA, inhibits GTPase activity of FtsZ [Cell cycle control, cell division, chromosome partitioning]. 	105
225572	COG3028	YjgA	Ribosomal 50S subunit-associated protein YjgA (function unknown), DUF615 family  [Translation, ribosomal structure and biogenesis]. 	187
225573	COG3029	FrdC	Fumarate reductase subunit C [Energy production and conversion]. 	129
225574	COG3030	FxsA	Protein affecting phage T7 exclusion by the F plasmid, UPF0716 family [General function prediction only]. 	158
225575	COG3031	PulC	Type II secretory pathway, component PulC [Intracellular trafficking, secretion, and vesicular transport]. 	275
225576	COG3033	TnaA	Tryptophanase [Amino acid transport and metabolism]. 	471
225577	COG3034	YafK	Murein L,D-transpeptidase YafK [Cell wall/membrane/envelope biogenesis]. 	298
225578	COG3036	ArfA	Stalled ribosome alternative rescue factor ArfA [Translation, ribosomal structure and biogenesis]. 	66
225579	COG3037	UlaA	Ascorbate-specific PTS system EIIC-type component UlaA [Carbohydrate transport and metabolism]. 	481
225580	COG3038	CybB	Cytochrome b561 [Energy production and conversion]. 	181
225581	COG3039	IS5	Transposase and inactivated derivatives, IS5 family [Mobilome: prophages, transposons]. 	230
225582	COG3040	Blc	Bacterial lipocalin [Cell wall/membrane/envelope biogenesis]. 	174
225583	COG3041	YafQ	mRNA-degrading endonuclease (mRNA interferase) YafQ, toxin component of the YafQ-DinJ toxin-antitoxin module [Translation, ribosomal structure and biogenesis]. 	91
225584	COG3042	Hlx	Putative hemolysin [General function prediction only]. 	85
225585	COG3043	NapB	Nitrate reductase cytochrome c-type subunit [Energy production and conversion, Inorganic ion transport and metabolism]. 	155
225586	COG3044	COG3044	Predicted ATPase of the ABC class  [General function prediction only]. 	554
225587	COG3045	CreA	Periplasmic catabolite regulation protein CreA (function unknown) [Signal transduction mechanisms]. 	165
225588	COG3046	COG3046	Uncharacterized protein related to deoxyribodipyrimidine photolyase  [General function prediction only]. 	505
225589	COG3047	OmpW	Outer membrane protein W [Cell wall/membrane/envelope biogenesis]. 	213
225590	COG3048	DsdA	D-serine dehydratase [Amino acid transport and metabolism]. 	443
225591	COG3049	YxeI	Penicillin V acylase or related amidase, Ntn superfamily  [Cell wall/membrane/envelope biogenesis, General function prediction only]. 	353
225592	COG3050	HolD	DNA polymerase III, psi subunit [Replication, recombination and repair]. 	133
225593	COG3051	CitF	Citrate lyase, alpha subunit [Energy production and conversion]. 	513
225594	COG3052	CitD	Citrate lyase, gamma subunit [Energy production and conversion]. 	98
225595	COG3053	CitC	Citrate lyase synthetase [Energy production and conversion]. 	352
225596	COG3054	YtfJ	Predicted transcriptional regulator [General function prediction only]. 	184
225597	COG3055	NanM	N-acetylneuraminic acid mutarotase [Cell wall/membrane/envelope biogenesis]. 	381
225598	COG3056	YajG	Uncharacterized lipoprotein YajG [Function unknown]. 	204
225599	COG3057	SeqA	Negative regulator of replication initiation [Replication, recombination and repair]. 	181
225600	COG3058	FdhE	Formate dehydrogenase maturation protein FdhE [Energy production and conversion, Posttranslational modification, protein turnover, chaperones]. 	308
225601	COG3059	YkgB	Uncharacterized membrane protein YkgB [Function unknown]. 	182
225602	COG3060	MetJ	Transcriptional regulator of met regulon [Transcription, Amino acid transport and metabolism]. 	105
225603	COG3061	OapA	Cell envelope opacity-associated protein A (function unknown) [Function unknown]. 	242
225604	COG3062	NapD	Cytoplasmic chaperone NapD for the signal peptide of periplasmic nitrate reductase NapAB [Posttranslational modification, protein turnover, chaperones]. 	94
225605	COG3063	PilF	Tfp pilus assembly protein PilF  [Cell motility, Extracellular structures]. 	250
225606	COG3064	TolA	Membrane protein involved in colicin uptake [Cell wall/membrane/envelope biogenesis]. 	387
225607	COG3065	Slp	Starvation-inducible outer membrane lipoprotein [Cell wall/membrane/envelope biogenesis]. 	191
225608	COG3066	MutH	DNA mismatch repair protein MutH [Replication, recombination and repair]. 	229
225609	COG3067	NhaB	Na+/H+ antiporter NhaB [Energy production and conversion, Inorganic ion transport and metabolism]. 	516
225610	COG3068	YjaG	Uncharacterized protein YjaG, DUF416 family [Function unknown]. 	194
225611	COG3069	DcuC	C4-dicarboxylate transporter [Energy production and conversion]. 	451
225612	COG3070	TfoX	Transcriptional regulator of competence genes, TfoX/Sxy family [Transcription]. 	121
225613	COG3071	HemY	Uncharacterized conserved protein HemY, contains two TPR repeats [Function unknown]. 	400
225614	COG3072	CyaA	Adenylate cyclase [Nucleotide transport and metabolism]. 	853
225615	COG3073	RseA	Negative regulator of sigma E activity [Signal transduction mechanisms]. 	213
225616	COG3074	ZapB	Cell division protein ZapB, interacts with FtsZ [Cell cycle control, cell division, chromosome partitioning]. 	79
225617	COG3075	GlpB	Anaerobic glycerol-3-phosphate dehydrogenase [Amino acid transport and metabolism]. 	421
225618	COG3076	RraB	Regulator of RNase E activity RraB [Translation, ribosomal structure and biogenesis]. 	135
225619	COG3077	RelB	Antitoxin component of the RelBE or YafQ-DinJ toxin-antitoxin module [Defense mechanisms]. 	88
225620	COG3078	YihI	Ribosome assembly protein YihI, activator of Der GTPase  [Translation, ribosomal structure and biogenesis]. 	169
225621	COG3079	YgfB	Uncharacterized conserved protein YgfB, UPF0149 family [Function unknown]. 	186
225622	COG3080	FrdD	Fumarate reductase subunit D [Energy production and conversion]. 	118
225623	COG3081	NdpA	Nucleoid-associated protein YejK (function unknown) [Function unknown]. 	335
225624	COG3082	YejL	Uncharacterized conserved protein YejL, UPF0352 family [Function unknown]. 	74
225625	COG3083	YejM	Membrane-anchored periplasmic protein YejM, alkaline phosphatase superfamily [Cell wall/membrane/envelope biogenesis]. 	600
225626	COG3084	YihD	Uncharacterized protein YihD, DUF1040 family [Function unknown]. 	88
225627	COG3085	YifE	Uncharacterized conserved protein YifE, UPF0438 family [Function unknown]. 	112
225628	COG3086	RseC	Positive regulator of sigma E activity [Signal transduction mechanisms]. 	150
225629	COG3087	FtsN	Cell division protein FtsN [Cell cycle control, cell division, chromosome partitioning]. 	264
225630	COG3088	NrfF	Cytochrome c-type biogenesis protein CcmH/NrfF [Energy production and conversion, Posttranslational modification, protein turnover, chaperones]. 	153
225631	COG3089	YheU	Uncharacterized conserved protein YheU, UPF0270 family [Function unknown]. 	72
225632	COG3090	DctM	TRAP-type C4-dicarboxylate transport system, small permease component [Carbohydrate transport and metabolism]. 	177
225633	COG3091	SprT	Predicted Zn-dependent metalloprotease, SprT family [General function prediction only]. 	156
225634	COG3092	YfbV	Uncharacterized membrane protein YfbV, UPF0208 family [Function unknown]. 	149
225635	COG3093	VapI	Plasmid maintenance system antidote protein VapI, contains XRE-type HTH domain [Defense mechanisms]. 	104
225636	COG3094	SirB2	Uncharacterized membrane protein SirB2 [Function unknown]. 	129
225637	COG3095	MukE	Chromosome condensin MukBEF, MukE localization factor [Cell cycle control, cell division, chromosome partitioning]. 	238
225638	COG3096	MukB	Chromosome condensin MukBEF, ATPase and DNA-binding subunit MukB [Cell cycle control, cell division, chromosome partitioning]. 	1480
225639	COG3097	YqfB	Uncharacterized protein YqfB, UPF0267 family [Function unknown]. 	106
225640	COG3098	YqcC	Uncharacterized conserved protein YqcC, DUF446 family [Function unknown]. 	109
225641	COG3099	YciU	Uncharacterized conserved protein YciU, UPF0263 family [Function unknown]. 	108
225642	COG3100	YcgL	Uncharacterized conserved protein YcgL, UPF0745 family [Function unknown]. 	103
225643	COG3101	EpmC	Elongation factor P hydroxylase (EF-P beta-lysylation pathway) [Translation, ribosomal structure and biogenesis]. 	180
225644	COG3102	YecM	Uncharacterized conserved protein YecM, predicted metalloenzyme [General function prediction only]. 	185
225645	COG3103	YgiM	Uncharacterized conserved protein YgiM, contains N-terminal SH3 domain, DUF1202 family [General function prediction only]. 	205
225646	COG3104	PTR2	Dipeptide/tripeptide permease [Amino acid transport and metabolism]. 	498
225647	COG3105	YhcB	Uncharacterized membrane-anchored protein YhcB, DUF1043 family [Function unknown]. 	138
225648	COG3106	YcjX	Predicted ATPase, YcjX-like family [General function prediction only]. 	467
225649	COG3107	LpoA	Outer membrane lipoprotein LpoA, binds and activates PBP1a [Cell wall/membrane/envelope biogenesis]. 	604
225650	COG3108	YcbK	Uncharacterized conserved protein YcbK, DUF882 family [Function unknown]. 	185
225651	COG3109	ProQ	sRNA-binding protein [Signal transduction mechanisms]. 	208
225652	COG3110	YccT	Uncharacterized conserved protein YccT, UPF0319 family [Function unknown]. 	216
225653	COG3111	YdeI	Predicted periplasmic protein YdeI with OB-fold, BOF family [Function unknown]. 	128
225654	COG3112	YacL	Uncharacterized protein YacL, UPF0231 family [Function unknown]. 	121
225655	COG3113	MlaB	ABC-type transporter Mla maintaining outer membrane lipid asymmetry, MlaB component, contains STAS domain  [Cell wall/membrane/envelope biogenesis]. 	99
225656	COG3114	CcmD	Heme exporter protein D [Intracellular trafficking, secretion, and vesicular transport]. 	67
225657	COG3115	ZipA	Cell division protein ZipA, interacts with FtsZ [Cell cycle control, cell division, chromosome partitioning]. 	324
225658	COG3116	FtsL	Cell division protein FtsL, interacts with FtsB, FtsL and FtsQ [Cell cycle control, cell division, chromosome partitioning]. 	105
225659	COG3117	YrbK	Lipopolysaccharide export system protein LptC [Cell wall/membrane/envelope biogenesis]. 	188
225660	COG3118	YbbN	Negative regulator of GroEL, contains thioredoxin-like and TPR-like domains  [Posttranslational modification, protein turnover, chaperones]. 	304
225661	COG3119	AslA	Arylsulfatase A or related enzyme [Inorganic ion transport and metabolism]. 	475
225662	COG3120	MatP	Macrodomain Ter protein organizer, MatP/YcbG family [Replication, recombination and repair]. 	149
225663	COG3121	FimC	P pilus assembly protein, chaperone PapD [Extracellular structures]. 	235
225664	COG3122	YaiL	Uncharacterized conserved protein YaiL, DUF2058 family [Function unknown]. 	215
225665	COG3123	YaiE	Uncharacterized conserved protein YaiE, UPF0345 family [Function unknown]. 	94
225666	COG3124	AcpH	Acyl carrier protein phosphodiesterase [Lipid transport and metabolism]. 	193
225667	COG3125	CyoD	Heme/copper-type cytochrome/quinol oxidase, subunit 4 [Energy production and conversion]. 	111
225668	COG3126	YbaY	Uncharacterized lipoprotein YbaY [Function unknown]. 	158
225669	COG3127	YbbP	Predicted ABC-type transport system involved in lysophospholipase L1 biosynthesis, permease component [Secondary metabolites biosynthesis, transport and catabolism]. 	829
225670	COG3128	PiuC	Predicted 2-oxoglutarate- and Fe(II)-dependent dioxygenase YbiX [General function prediction only]. 	229
225671	COG3129	RlmF	23S rRNA A1618 N6-methylase RlmF [Translation, ribosomal structure and biogenesis]. 	292
225672	COG3130	Rmf	Ribosome modulation factor [Translation, ribosomal structure and biogenesis]. 	55
225673	COG3131	MdoG	Periplasmic glucans biosynthesis protein [Cell wall/membrane/envelope biogenesis]. 	534
225674	COG3132	YceH	Uncharacterized conserved protein YceH, UPF0502 family [Function unknown]. 	215
225675	COG3133	SlyB	Outer membrane lipoprotein SlyB [Cell wall/membrane/envelope biogenesis]. 	154
225676	COG3134	YcfJ	Uncharacterized conserved protein YcfJ, contains glycine zipper 2TM domain [Function unknown]. 	179
225677	COG3135	BenE	Predicted benzoate:H+ symporter BenE [Secondary metabolites biosynthesis, transport and catabolism]. 	402
225678	COG3136	GlpM	Uncharacterized membrane protein, GlpM family [Function unknown]. 	111
225679	COG3137	YdiY	Putative salt-induced outer membrane protein YdiY [Cell wall/membrane/envelope biogenesis]. 	262
225680	COG3138	AstA	Arginine/ornithine N-succinyltransferase beta subunit [Amino acid transport and metabolism]. 	336
225681	COG3139	yeaC	Uncharacterized conserved protein YeaC, DUF1315 family [Function unknown]. 	90
225682	COG3140	yoaH	Uncharacterized conserved protein YoaH, UPF0181 family [Function unknown]. 	60
225683	COG3141	YebG	dsDNA-binding SOS-regulon protein, induction by DNA damage requires cAMP [Replication, recombination and repair]. 	97
225684	COG3142	CutC	Copper homeostasis protein CutC [Inorganic ion transport and metabolism]. 	241
225685	COG3143	CheZ	Chemotaxis regulator CheZ, phosphatase of CheY~P [Cell motility, Signal transduction mechanisms]. 	217
225686	COG3144	FliK	Flagellar hook-length control protein FliK [Cell motility]. 	417
225687	COG3145	AlkB	Alkylated DNA repair dioxygenase AlkB [Replication, recombination and repair]. 	194
225688	COG3146	COG3146	Predicted N-acyltransferase  [General function prediction only]. 	387
225689	COG3147	DedD	Cell division protein DedD (periplasmic protein involved in septation) [Cell cycle control, cell division, chromosome partitioning]. 	226
225690	COG3148	YfiP	Uncharacterized conserved protein YfiP, DTW domain [Function unknown]. 	231
225691	COG3149	PulM	Type II secretory pathway, component PulM [Intracellular trafficking, secretion, and vesicular transport]. 	181
225692	COG3150	ycfP	Predicted esterase YcfP, UPF0227 family [General function prediction only]. 	191
225693	COG3151	yqiB	Uncharacterized protein YqiB, DUF1249 family [Function unknown]. 	147
225694	COG3152	yhaH	Uncharacterized membrane protein YhaH, DUF805 family [Function unknown]. 	125
225695	COG3153	yhbS	Predicted N-acetyltransferase YhbS [General function prediction only]. 	171
225696	COG3154	SCP2	Predicted lipid carrier protein YhbT, SCP2 domain [Lipid transport and metabolism]. 	168
225697	COG3155	ElbB	Enhancing lycopene biosynthesis protein 2 [Secondary metabolites biosynthesis, transport and catabolism]. 	217
225698	COG3156	PulK	Type II secretory pathway, component PulK [Intracellular trafficking, secretion, and vesicular transport]. 	323
225699	COG3157	Hcp	Type VI protein secretion system component Hcp (secreted cytotoxin) [Intracellular trafficking, secretion, and vesicular transport]. 	162
225700	COG3158	Kup	K+ transporter [Inorganic ion transport and metabolism]. 	627
225701	COG3159	YigA	Uncharacterized conserved protein YigA, DUF484 family [Function unknown]. 	218
225702	COG3160	Rsd	Regulator of sigma D [Transcription]. 	162
225703	COG3161	UbiC	4-hydroxybenzoate synthetase (chorismate-pyruvate lyase) [Coenzyme transport and metabolism]. 	174
225704	COG3162	YjcH	Uncharacterized membrane protein, DUF485 family [Function unknown]. 	102
225705	COG3164	YhdR	Uncharacterized conserved protein YhdP, contains DUF3971 and AsmA2 domains  [Function unknown]. 	1271
225706	COG3165	UbiJ	Ubiquinone biosynthesis protein UbiJ, contains SCP2 domain [Coenzyme transport and metabolism]. 	204
225707	COG3166	PilN	Tfp pilus assembly protein PilN [Cell motility, Extracellular structures]. 	206
225708	COG3167	PilO	Tfp pilus assembly protein PilO  [Cell motility, Extracellular structures]. 	211
225709	COG3168	PilP	Tfp pilus assembly protein PilP  [Cell motility, Extracellular structures]. 	170
225710	COG3169	COG3169	Uncharacterized conserved protein, DUF486 family [Function unknown]. 	116
225711	COG3170	FimV	Tfp pilus assembly protein FimV  [Cell motility, Extracellular structures]. 	755
225712	COG3171	YggL	Uncharacterized conserved protein YggL, DUF469 family [Function unknown]. 	119
225713	COG3172	NadR3	Nicotinamide riboside kinase [Coenzyme transport and metabolism]. 	187
225714	COG3173	YcbJ	Predicted kinase, aminoglycoside phosphotransferase (APT) family  [General function prediction only]. 	321
225715	COG3174	COG3174	Uncharacterized membrane protein, DUF4010 family [Function unknown]. 	371
225716	COG3175	COX11	Cytochrome c oxidase assembly protein Cox11 [Energy production and conversion, Posttranslational modification, protein turnover, chaperones]. 	195
225717	COG3176	COG3176	Putative hemolysin  [General function prediction only]. 	292
225718	COG3177	COG3177	Fic family protein  [Transcription]. 	348
225719	COG3178	COG3178	Predicted phosphotransferase, aminoglycoside/choline kinase (APH/ChoK) family [General function prediction only]. 	351
225720	COG3179	COG3179	Predicted chitinase  [General function prediction only]. 	206
225721	COG3180	AbrB	Uncharacterized membrane protein AbrB, regulator of aidB expression [General function prediction only]. 	352
225722	COG3181	TctC	Tripartite-type tricarboxylate transporter, receptor component TctC [Energy production and conversion]. 	319
225723	COG3182	PiuB	Uncharacterized iron-regulated membrane protein  [Function unknown]. 	442
225724	COG3183	COG3183	Predicted restriction endonuclease, HNH family [Defense mechanisms]. 	272
225725	COG3184	COG3184	Uncharacterized protein, contains DUF2059 domain [Function unknown]. 	183
225726	COG3185	HppD	4-hydroxyphenylpyruvate dioxygenase and related hemolysins  [Amino acid transport and metabolism, General function prediction only]. 	363
225727	COG3186	PhhA	Phenylalanine-4-hydroxylase  [Amino acid transport and metabolism]. 	291
225728	COG3187	HslJ	Heat shock protein HslJ [Posttranslational modification, protein turnover, chaperones]. 	142
225729	COG3188	FimD	Outer membrane usher protein FimD/PapC [Cell motility, Extracellular structures]. 	835
225730	COG3189	YeaO	Uncharacterized conserved protein YeaO, DUF488 family [Function unknown]. 	117
225731	COG3190	FliO	Flagellar biogenesis protein FliO [Cell motility]. 	137
225732	COG3191	DmpA	L-aminopeptidase/D-esterase  [Amino acid transport and metabolism, Secondary metabolites biosynthesis, transport and catabolism]. 	348
225733	COG3192	EutH	Ethanolamine transporter EutH, required for ethanolamine utilization at low pH [Amino acid transport and metabolism]. 	389
225734	COG3193	GlcG	Uncharacterized conserved protein GlcG, DUF336 family [Function unknown]. 	141
225735	COG3194	AllA	Ureidoglycolate hydrolase (allantoin degradation) [Nucleotide transport and metabolism]. 	168
225736	COG3195	PucL	2-oxo-4-hydroxy-4-carboxy-5-ureidoimidazoline (OHCU) decarboxylase [Nucleotide transport and metabolism]. 	176
225737	COG3196	CbrC	Uncharacterized protein CbrC, UPF0167 family [Function unknown]. 	183
225738	COG3197	FixS	Cytochrome oxidase maturation protein, CcoS/FixS family  [Posttranslational modification, protein turnover, chaperones]. 	58
225739	COG3198	COG3198	Uncharacterized protein [Function unknown]. 	172
225740	COG3199	COG3199	Predicted polyphosphate- or ATP-dependent NAD kinase  [Nucleotide transport and metabolism]. 	355
225741	COG3200	AroG2	3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase, class II  [Amino acid transport and metabolism]. 	445
225742	COG3201	PnuC	Nicotinamide riboside transporter PnuC [Coenzyme transport and metabolism]. 	222
225743	COG3202	TlcC	ATP/ADP translocase  [Energy production and conversion]. 	509
225744	COG3203	OmpC	Outer membrane protein (porin) [Cell wall/membrane/envelope biogenesis]. 	354
225745	COG3204	YjiK	Uncharacterized protein YjiK [Function unknown]. 	316
225746	COG3205	COG3205	Uncharacterized membrane protein  [Function unknown]. 	72
225747	COG3206	GumC	Uncharacterized protein involved in exopolysaccharide biosynthesis [Cell wall/membrane/envelope biogenesis]. 	458
225748	COG3207	Dit1	Pyoverdine/dityrosine biosynthesis protein Dit1 [Secondary metabolites biosynthesis, transport and catabolism]. 	330
225749	COG3208	GrsT	Surfactin synthase thioesterase subunit [Secondary metabolites biosynthesis, transport and catabolism]. 	244
225750	COG3209	RhsA	Uncharacterized conserved protein RhsA, contains 28 RHS repeats [General function prediction only]. 	796
225751	COG3210	FhaB	Large exoprotein involved in heme utilization or adhesion  [Intracellular trafficking, secretion, and vesicular transport]. 	1013
225752	COG3211	PhoX	Secreted phosphatase, PhoX family [General function prediction only]. 	616
225753	COG3212	YkoI	Uncharacterized membrane protein YkoI [Function unknown]. 	144
225754	COG3213	NnrS	Uncharacterized protein involved in response to NO  [Defense mechanisms]. 	396
225755	COG3214	YcaQ	Uncharacterized conserved protein YcaQ, contains winged helix DNA-binding domain [General function prediction only]. 	400
225756	COG3215	PilZ	Tfp pilus assembly protein PilZ  [Cell motility, Extracellular structures]. 	117
225757	COG3216	COG3216	Uncharacterized conserved protein, DUF2062 family [Function unknown]. 	184
225758	COG3217	YcbX	Uncharacterized conserved protein YcbX, contains MOSC and Fe-S domains [General function prediction only]. 	270
225759	COG3218	COG3218	ABC-type uncharacterized transport system, auxiliary component  [General function prediction only]. 	205
225760	COG3219	COG3219	Uncharacterized protein, DUF2063 family [Function unknown]. 	237
225761	COG3220	COG3220	Uncharacterized conserved protein, UPF0276 family [Function unknown]. 	282
225762	COG3221	PhnD	ABC-type phosphate/phosphonate transport system, periplasmic component [Inorganic ion transport and metabolism]. 	299
225763	COG3222	COG3222	Uncharacterized conserved protein, glycosyltransferase A (GT-A) superfamily, DUF2064 family [Function unknown]. 	211
225764	COG3223	PsiE	Phosphate starvation-inducible membrane PsiE (function unknown)  [General function prediction only]. 	138
225765	COG3224	COG3224	Antibiotic biosynthesis monooxygenase (ABM) superfamily enzyme [General function prediction only]. 	195
225766	COG3225	GldG	ABC-type uncharacterized transport system involved in gliding motility, auxiliary component  [Cell motility]. 	538
225767	COG3226	YbjK	DNA-binding transcriptional regulator YbjK [Transcription]. 	204
225768	COG3227	LasB	Zn-dependent metalloprotease [Posttranslational modification, protein turnover, chaperones]. 	507
225769	COG3228	MtfA	Mlc titration factor MtfA, regulates ptsG expression [Signal transduction mechanisms]. 	266
225770	COG3230	HemO	Heme oxygenase  [Inorganic ion transport and metabolism]. 	196
225771	COG3231	Aph	Aminoglycoside phosphotransferase  [Translation, ribosomal structure and biogenesis]. 	266
225772	COG3232	HpaF	5-carboxymethyl-2-hydroxymuconate isomerase  [Amino acid transport and metabolism]. 	127
225773	COG3233	COG3233	Predicted deacetylase  [General function prediction only]. 	233
225774	COG3234	yfaT	Uncharacterized conserved protein YfaT, DUF1175 family [Function unknown]. 	215
225775	COG3235	COG3235	Uncharacterized membrane protein [Function unknown]. 	223
225776	COG3236	ybiA	N-glycosylase of 5-amino-6-ribosylamino-2,4-pyrimidinedione 5'-phosphate (riboflavin biosynthesis damage control) [Coenzyme transport and metabolism]. 	162
225777	COG3237	yjbJ	Uncharacterized conserved protein YjbJ, UPF0337 family [Function unknown]. 	67
225778	COG3238	ydcZ	Uncharacterized membrane protein YdcZ, DUF606 family [Function unknown]. 	150
225779	COG3239	DesA	Fatty acid desaturase  [Lipid transport and metabolism]. 	343
225780	COG3240	COG3240	Phospholipase/lecithinase/hemolysin  [Lipid transport and metabolism, General function prediction only]. 	370
225781	COG3241	COG3241	Azurin  [Energy production and conversion]. 	151
225782	COG3242	yjeT	Uncharacterized conserved protein YjeT, DUF2065 family [Function unknown]. 	62
225783	COG3243	PhaC	Poly(3-hydroxyalkanoate) synthetase  [Lipid transport and metabolism]. 	445
225784	COG3245	CytC5	Cytochrome c5  [Energy production and conversion]. 	126
225785	COG3246	COG3246	Uncharacterized conserved protein, DUF849 family [Function unknown]. 	298
225786	COG3247	HdeD	Uncharacterized membrane protein HdeD, DUF308 family [Function unknown]. 	185
225787	COG3248	Tsx	Nucleoside-specific outer membrane channel protein Tsx [Cell wall/membrane/envelope biogenesis]. 	284
225788	COG3249	COG3249	Uncharacterized protein [Function unknown]. 	343
225789	COG3250	LacZ	Beta-galactosidase/beta-glucuronidase [Carbohydrate transport and metabolism]. 	808
225790	COG3251	MbtH	Uncharacterized conserved protein YbdZ, MbtH family [Function unknown]. 	71
225791	COG3252	Mch	Methenyltetrahydromethanopterin cyclohydrolase  [Coenzyme transport and metabolism]. 	314
225792	COG3253	YwfI	Chlorite dismutase [Inorganic ion transport and metabolism]. 	230
225793	COG3254	RhaM	L-rhamnose mutarotase [Cell wall/membrane/envelope biogenesis]. 	105
225794	COG3255	SCP2	Putative sterol carrier protein  [Lipid transport and metabolism]. 	134
225795	COG3256	NorB	Nitric oxide reductase large subunit  [Inorganic ion transport and metabolism]. 	717
225796	COG3257	AllE	Ureidoglycine aminohydrolase [Nucleotide transport and metabolism]. 	264
225797	COG3258	CytC	Cytochrome c  [Energy production and conversion]. 	293
225798	COG3259	FrhA	Coenzyme F420-reducing hydrogenase, alpha subunit  [Energy production and conversion]. 	441
225799	COG3260	HycG	Ni,Fe-hydrogenase III small subunit [Energy production and conversion]. 	148
225800	COG3261	HycE2	Ni,Fe-hydrogenase III large subunit [Energy production and conversion]. 	382
225801	COG3262	HycE1	Ni,Fe-hydrogenase III component G [Energy production and conversion]. 	165
225802	COG3263	NhaP2	NhaP-type Na+/H+ and K+/H+ antiporter with C-terminal TrkAC and CorC domains [Energy production and conversion, Inorganic ion transport and metabolism]. 	574
225803	COG3264	MscK	Small-conductance mechanosensitive channel [Cell wall/membrane/envelope biogenesis]. 	835
225804	COG3265	GntK	Gluconate kinase [Carbohydrate transport and metabolism]. 	161
225805	COG3266	DamX	Cell division protein DamX, binds to the septal ring, contains C-terminal SPOR domain [Cell cycle control, cell division, chromosome partitioning]. 	292
225806	COG3267	ExeA	Type II secretory pathway, component ExeA (predicted ATPase)  [Intracellular trafficking, secretion, and vesicular transport]. 	269
225807	COG3268	COG3268	Uncharacterized conserved protein, related to short-chain dehydrogenases [Function unknown]. 	382
225808	COG3269	COG3269	Predicted RNA-binding protein, contains TRAM domain  [General function prediction only]. 	73
225809	COG3270	Ncl1	Ribosome biogenesis protein, NOL1/NOP2/fmu family [Translation, ribosomal structure and biogenesis]. 	127
225810	COG3271	COG3271	Predicted double-glycine peptidase  [General function prediction only]. 	201
225811	COG3272	YbgA	Uncharacterized conserved protein YbgA, DUF1722 family [Function unknown]. 	163
225812	COG3273	COG3273	Uncharacterized conserved protein, contains PhoU and TrkA_C domains [Function unknown]. 	204
225813	COG3274	WecH	Surface polysaccharide O-acyltransferase, integral membrane enzyme  [Cell wall/membrane/envelope biogenesis]. 	332
225814	COG3275	LytS	Sensor histidine kinase, LytS/YehU family [Signal transduction mechanisms]. 	557
225815	COG3276	SelB	Selenocysteine-specific translation elongation factor [Translation, ribosomal structure and biogenesis]. 	447
225816	COG3277	GAR1	rRNA processing protein Gar1  [Translation, ribosomal structure and biogenesis]. 	98
225817	COG3278	CcoN	Cbb3-type cytochrome oxidase, subunit 1  [Energy production and conversion]. 	482
225818	COG3279	LytT	DNA-binding response regulator, LytR/AlgR family [Transcription, Signal transduction mechanisms]. 	244
225819	COG3280	TreY	Maltooligosyltrehalose synthase [Carbohydrate transport and metabolism]. 	889
225820	COG3281	Ble	Predicted trehalose synthase [Carbohydrate transport and metabolism]. 	438
225821	COG3283	TyrR	Transcriptional regulator of aromatic amino acids metabolism [Transcription, Amino acid transport and metabolism]. 	511
225822	COG3284	AcoR	Transcriptional regulator of acetoin/glycerol metabolism [Transcription]. 	606
225823	COG3285	LigD	Eukaryotic-type DNA primase  [Replication, recombination and repair]. 	299
225824	COG3286	COG3286	Uncharacterized protein [Function unknown]. 	204
225825	COG3287	COG3287	Uncharacterized conserved protein, contains FIST_N domain [Function unknown]. 	379
225826	COG3288	PntA	NAD/NADP transhydrogenase alpha subunit [Energy production and conversion]. 	356
225827	COG3290	CitA	Sensor histidine kinase regulating citrate/malate metabolism [Signal transduction mechanisms]. 	537
225828	COG3291	COG3291	PKD repeat  [Function unknown]. 	297
225829	COG3292	COG3292	Periplasmic ligand-binding sensor domain  [Signal transduction mechanisms]. 	671
225830	COG3293	COG3293	Transposase [Mobilome: prophages, transposons]. 	124
225831	COG3294	COG3294	Metal-dependent phosphatase/phosphodiesterase, HD superfamily [General function prediction only]. 	269
225832	COG3295	COG3295	Uncharacterized protein  [Function unknown]. 	213
225833	COG3296	Tic20	Uncharacterized conserved protein, Tic20 family [Function unknown]. 	143
225834	COG3297	PulL	Type II secretory pathway, component PulL [Intracellular trafficking, secretion, and vesicular transport]. 	390
225835	COG3298	COG3298	Predicted 3'-5' exonuclease related to the exonuclease domain of PolB  [Replication, recombination and repair]. 	122
225836	COG3299	JayE	Uncharacterized phage protein gp47/JayE [Mobilome: prophages, transposons]. 	353
225837	COG3300	MHYT	MHYT domain, NO-binding membrane sensor [Signal transduction mechanisms]. 	236
225838	COG3301	NrfD	Formate-dependent nitrite reductase, membrane component NrfD [Inorganic ion transport and metabolism]. 	305
225839	COG3302	DmsC	DMSO reductase anchor subunit [Energy production and conversion]. 	281
225840	COG3303	NrfA	Formate-dependent nitrite reductase, periplasmic cytochrome c552 subunit [Inorganic ion transport and metabolism]. 	501
225841	COG3304	YccF	Uncharacterized membrane protein YccF, DUF307 family [Function unknown]. 	145
225842	COG3305	COG3305	Uncharacterized membrane protein, DUF2068 family  [Function unknown]. 	152
225843	COG3306	COG3306	Glycosyltransferase involved in LPS biosynthesis, GR25 family [Cell wall/membrane/envelope biogenesis]. 	255
225844	COG3307	RfaL	O-antigen ligase [Cell wall/membrane/envelope biogenesis]. 	424
225845	COG3308	COG3308	Uncharacterized membrane protein  [Function unknown]. 	131
225846	COG3309	VapD	Virulence-associated protein VapD (function unknown) [Function unknown]. 	96
225847	COG3310	COG3310	Uncharacterized protein, DUF1415 family [Function unknown]. 	196
225848	COG3311	AlpA	Predicted DNA-binding transcriptional regulator AlpA [Transcription, Mobilome: prophages, transposons]. 	70
225849	COG3312	AtpI	FoF1-type ATP synthase assembly protein I [Energy production and conversion]. 	128
225850	COG3313	YdhL	Predicted Fe-S protein YdhL, DUF1289 family  [General function prediction only]. 	74
225851	COG3314	YjiH	Uncharacterized membrane protein YjiH, contains nucleoside recognition GATE domain [Function unknown]. 	427
225852	COG3315	YktD	O-Methyltransferase involved in polyketide biosynthesis  [Secondary metabolites biosynthesis, transport and catabolism]. 	297
225853	COG3316	Rve	Transposase (or an inactivated derivative) [Mobilome: prophages, transposons]. 	215
225854	COG3317	NlpB	Uncharacterized lipoprotein, NlpB/DapX family [Function unknown]. 	342
225855	COG3318	YecA	Uncharacterized conserved protein YecA, UPF0149 family, contains C-terminal Zn-binding SEC-C motif [Function unknown]. 	216
225856	COG3319	EntF	Thioesterase domain of type I polyketide synthase or non-ribosomal peptide synthetase [Secondary metabolites biosynthesis, transport and catabolism]. 	257
225857	COG3320	Lys2b	Thioester reductase domain of alpha aminoadipate reductase Lys2 and NRPSs [Secondary metabolites biosynthesis, transport and catabolism]. 	382
225858	COG3321	PksD	Acyl transferase domain in polyketide synthase (PKS) enzymes [Secondary metabolites biosynthesis, transport and catabolism]. 	1061
225859	COG3322	CHASE4	Extracellular (periplasmic) sensor domain CHASE (specificity unknown) [Signal transduction mechanisms]. 	295
225860	COG3323	YqfO	Uncharacterized protein YbgI, a toroidal structure with a dinuclear metal site [Function unknown]. 	109
225861	COG3324	COG3324	Predicted enzyme related to lactoylglutathione lyase  [General function prediction only]. 	127
225862	COG3325	ChiA	Chitinase, GH18 family  [Carbohydrate transport and metabolism]. 	441
225863	COG3326	YsdA	Uncharacterized membrane protein YsdA, DUF1294 family [Function unknown]. 	94
225864	COG3327	PaaX	DNA-binding transcriptional regulator PaaX (phenylacetic acid degradation) [Transcription]. 	291
225865	COG3328	IS285	Transposase (or an inactivated derivative) [Mobilome: prophages, transposons]. 	379
225866	COG3329	COG3329	Uncharacterized conserved protein [Function unknown]. 	372
225867	COG3330	COG3330	Uncharacterized conserved protein [Function unknown]. 	215
225868	COG3331	PrfA	Penicillin-binding protein-related factor A, putative recombinase  [General function prediction only]. 	177
225869	COG3332	NRDE	Uncharacterized conserved protein, contains NRDE domain [Function unknown]. 	270
225870	COG3333	COG3333	TctA family transporter [General function prediction only]. 	504
225871	COG3334	MotE	Flagellar motility protein MotE, a chaperone for MotC folding [Cell motility]. 	192
225872	COG3335	COG3335	Transposase [Mobilome: prophages, transposons]. 	132
225873	COG3336	CtaG	Cytochrome c oxidase assembly factor CtaG  [Energy production and conversion, Posttranslational modification, protein turnover, chaperones]. 	299
225874	COG3337	Cmr5	CRISPR/Cas system CMR-associated protein Cmr5, small subunit [Defense mechanisms]. 	134
225875	COG3338	Cah	Carbonic anhydrase  [Inorganic ion transport and metabolism]. 	250
225876	COG3339	YkvA	Uncharacterized membrane protein YkvA, DUF1232 family [Function unknown]. 	116
225877	COG3340	PepE	Peptidase E [Amino acid transport and metabolism]. 	224
225878	COG3341	Rnh1	RNase HI-related protein, contains viroplasmin and RNaseH domains [General function prediction only]. 	225
225879	COG3342	COG3342	Uncharacterized conserved protein, Ntn-hydrolase superfamily  [General function prediction only]. 	265
225880	COG3343	RpoE	DNA-directed RNA polymerase, delta subunit  [Transcription]. 	175
225881	COG3344	YkfC	Retron-type reverse transcriptase [Mobilome: prophages, transposons]. 	328
225882	COG3345	GalA	Alpha-galactosidase  [Carbohydrate transport and metabolism]. 	687
225883	COG3346	Shy1	Cytochrome oxidase assembly protein ShyY1 [Posttranslational modification, protein turnover, chaperones]. 	252
225884	COG3347	RhaD	Rhamnose utilisation protein RhaD, predicted bifunctional aldolase and dehydrogenase [Carbohydrate transport and metabolism]. 	404
225885	COG3349	COG3349	Uncharacterized conserved protein, contains NAD-binding domain and a Fe-S cluster [General function prediction only]. 	485
225886	COG3350	COG3350	Uncharacterized conserved protein, YHS domain  [Function unknown]. 	53
225887	COG3351	FlaD	Archaellum component FlaD/FlaE [Cell motility]. 	214
225888	COG3352	FlaC	Archaellum component FlaC [Cell motility]. 	157
225889	COG3353	FlaF	Archaellum component FlaF, FlaF/FlaG flagellin family [Cell motility]. 	137
225890	COG3354	FlaG	Archaellum component FlaG, FlaF/FlaG flagellin family [Cell motility]. 	154
225891	COG3355	COG3355	Predicted transcriptional regulator  [Transcription]. 	126
225892	COG3356	COG3356	Predicted membrane-associated lipid hydrolase, neutral ceramidase superfamily [Lipid transport and metabolism]. 	578
225893	COG3357	COG3357	Predicted transcriptional regulator containing an HTH domain fused to a Zn-ribbon  [Transcription]. 	97
225894	COG3358	COG3358	Uncharacterized conserved protein, DUF1684 family [Function unknown]. 	262
225895	COG3359	YprB	Uncharacterized conserved protein YprB, contains RNaseH-like and TPR domains [General function prediction only]. 	278
225896	COG3360	COG3360	Flavin-binding protein dodecin [General function prediction only]. 	71
225897	COG3361	YqjF	Uncharacterized protein YqjF, DUF2071 family [Function unknown]. 	240
225898	COG3363	PurO	Archaeal IMP cyclohydrolase  [Nucleotide transport and metabolism]. 	200
225899	COG3364	COG3364	Predicted  nucleic acid-binding protein, contains Zn-ribbon domain [General function prediction only]. 	112
225900	COG3365	COG3365	Uncharacterized protein, DUF2073 family [Function unknown]. 	118
225901	COG3366	COG3366	Uncharacterized protein [Function unknown]. 	311
225902	COG3367	COG3367	Uncharacterized conserved protein, NAD-dependent epimerase/dehydratase family [General function prediction only]. 	339
225903	COG3368	COG3368	Predicted permease  [General function prediction only]. 	465
225904	COG3369	COG3369	Uncharacterized protein, contains Zn-finger domain of CDGSH type [Function unknown]. 	78
225905	COG3370	COG3370	Uncharacterized protein [Function unknown]. 	113
225906	COG3371	COG3371	Uncharacterized membrane protein  [Function unknown]. 	181
225907	COG3372	COG3372	Predicted nuclease of restriction endonuclease-like (RecB) superfamily, implicated in nucleotide excision repair [General function prediction only]. 	396
225908	COG3373	COG3373	Predicted transcriptional regulator, contains HTH domain [Transcription]. 	108
225909	COG3374	COG3374	Uncharacterized membrane protein  [Function unknown]. 	197
225910	COG3375	COG3375	Predicted acetyltransferase, GNAT superfamily [General function prediction only]. 	266
225911	COG3376	HoxN	High-affinity nickel permease  [Inorganic ion transport and metabolism]. 	342
225912	COG3377	YunC	Uncharacterized protein YunC, DUF1805 family [Function unknown]. 	95
225913	COG3378	COG3378	Phage- or plasmid-associated DNA primase [Mobilome: prophages, transposons]. 	517
225914	COG3379	COG3379	Predicted phosphohydrolase or phosphomutase, AlkP superfamily [General function prediction only]. 	471
225915	COG3380	COG3380	Predicted NAD/FAD-dependent oxidoreductase  [General function prediction only]. 	331
225916	COG3381	TorD	Cytoplasmic chaperone TorD involved in molybdoenzyme TorA maturation [Posttranslational modification, protein turnover, chaperones]. 	204
225917	COG3382	B3/B4	B3/B4 domain (DNA/RNA-binding domain of Phe-tRNA-synthetase)  [General function prediction only]. 	229
225918	COG3383	YjgC	Predicted molybdopterin-dependent oxidoreductase YjgC [General function prediction only]. 	978
225919	COG3384	LigB	Aromatic ring-opening dioxygenase, catalytic subunit, LigB family [Secondary metabolites biosynthesis, transport and catabolism]. 	268
225920	COG3385	InsG	IS4 transposase [Mobilome: prophages, transposons]. 	292
225921	COG3386	YvrE	Sugar lactone lactonase YvrE [Carbohydrate transport and metabolism]. 	307
225922	COG3387	SGA1	Glucoamylase (glucan-1,4-alpha-glucosidase), GH15 family [Carbohydrate transport and metabolism]. 	612
225923	COG3388	COG3388	Predicted transcriptional regulator  [Transcription]. 	101
225924	COG3389	COG3389	Presenilin-like membrane protease, A22 family [Posttranslational modification, protein turnover, chaperones]. 	277
225925	COG3390	COG3390	Replication protein A (RPA) family protein [Replication, recombination and repair]. 	196
225926	COG3391	YncE	DNA-binding beta-propeller fold protein YncE [General function prediction only]. 	381
225927	COG3392	COG3392	Adenine-specific DNA methylase  [Replication, recombination and repair]. 	330
225928	COG3393	COG3393	Predicted acetyltransferase, GNAT family  [General function prediction only]. 	268
225929	COG3394	ChbG	Predicted glycoside hydrolase or deacetylase ChbG, UPF0249 family [Function unknown]. 	257
225930	COG3395	YgbK	Uncharacterized conserved protein YgbK, DUF1537 family [Function unknown]. 	413
225931	COG3396	YdbO	1,2-phenylacetyl-CoA epoxidase, catalytic subunit  [Secondary metabolites biosynthesis, transport and catabolism]. 	265
225932	COG3397	COG3397	Predicted carbohydrate-binding protein, contains CBM5 and CBM33 domains [General function prediction only]. 	308
225933	COG3398	COG3398	Predicted transcriptional regulator, contains two HTH domains [Transcription]. 	240
225934	COG3399	COG3399	Uncharacterized protein [Function unknown]. 	148
225935	COG3400	COG3400	Uncharacterized protein [Function unknown]. 	471
225936	COG3401	FN3	Fibronectin type 3 domain [General function prediction only]. 	343
225937	COG3402	YdbS	Uncharacterized membrane protein YdbS, contains bPH2 (bacterial pleckstrin homology) domain [Function unknown]. 	161
225938	COG3403	YcgG	Uncharacterized protein YcgG, contains conserved FPC and CPF motifs [Function unknown]. 	257
225939	COG3404	FtcD	Formiminotetrahydrofolate cyclodeaminase [Amino acid transport and metabolism]. 	208
225940	COG3405	BcsZ	Endo-1,4-beta-D-glucanase Y [Carbohydrate transport and metabolism]. 	360
225941	COG3407	MVD1	Mevalonate pyrophosphate decarboxylase  [Lipid transport and metabolism]. 	329
225942	COG3408	GDB1	Glycogen debranching enzyme (alpha-1,6-glucosidase) [Carbohydrate transport and metabolism]. 	641
225943	COG3409	PGRP	Peptidoglycan-binding (PGRP) domain of peptidoglycan hydrolases [Cell wall/membrane/envelope biogenesis]. 	185
225944	COG3410	BH3996	Uncharacterized protein, DUF2075 family [Function unknown]. 	191
225945	COG3411	2Fe2S	(2Fe-2S) ferredoxin  [Energy production and conversion]. 	64
225946	COG3412	DhaM	PTS-EIIA-like component DhaM of the dihydroxyacetone kinase DhaKLM complex [Signal transduction mechanisms]. 	129
225947	COG3413	COG3413	Predicted DNA binding protein, contains HTH domain [General function prediction only]. 	215
225948	COG3414	SgaB	Phosphotransferase system, galactitol-specific IIB component [Carbohydrate transport and metabolism]. 	93
225949	COG3415	COG3415	Transposase [Mobilome: prophages, transposons]. 	138
225950	COG3416	COG3416	Uncharacterized protein [Function unknown]. 	233
225951	COG3417	LpoB	Outer membrane lipoprotein LpoB, binds and activates PBP1b [Cell wall/membrane/envelope biogenesis]. 	200
225952	COG3418	FlgN	Flagellar biosynthesis/type III secretory pathway chaperone [Cell motility, Intracellular trafficking, secretion, and vesicular transport]. 	146
225953	COG3419	PilY1	Tfp pilus assembly protein, tip-associated adhesin PilY1  [Cell motility, Extracellular structures]. 	1036
225954	COG3420	NosD	Nitrous oxidase accessory protein NosD, contains tandem CASH domains [Inorganic ion transport and metabolism]. 	408
225955	COG3421	COG3421	Uncharacterized protein [Function unknown]. 	812
225956	COG3422	YegP	Uncharacterized conserved protein YegP, UPF0339 family [Function unknown]. 	59
225957	COG3423	SfsB	Predicted transcriptional regulator, lambda repressor-like DNA-binding domain [Transcription]. 	82
225958	COG3424	BH0617	Predicted naringenin-chalcone synthase  [Secondary metabolites biosynthesis, transport and catabolism]. 	356
225959	COG3425	PksG	3-hydroxy-3-methylglutaryl CoA synthase  [Lipid transport and metabolism]. 	377
225960	COG3426	Buk	Butyrate kinase  [Energy production and conversion]. 	358
225961	COG3427	CoxG	Carbon monoxide dehydrogenase subunit G [Energy production and conversion]. 	146
225962	COG3428	YdbT	Uncharacterized membrane protein YdbT, contains bPH2 (bacterial pleckstrin homology) domain [Function unknown]. 	494
225963	COG3429	OpcA	Glucose-6-phosphate dehydrogenase assembly protein OpcA, contains a peptidoglycan-binding domain [Carbohydrate transport and metabolism]. 	314
225964	COG3430	COG3430	Archaeal flagellin (archaellin), FlaG/FlaF family [Cell motility]. 	161
225965	COG3431	COG3431	Uncharacterized membrane protein, DUF373 family  [Function unknown]. 	142
225966	COG3432	COG3432	Predicted transcriptional regulator  [Transcription]. 	95
225967	COG3433	DhbB2	Aryl carrier domain [Secondary metabolites biosynthesis, transport and catabolism]. 	74
225968	COG3434	YuxH	c-di-GMP-related signal transduction protein, contains EAL and HDOD domains [Signal transduction mechanisms]. 	407
225969	COG3435	COG3435	Gentisate 1,2-dioxygenase  [Secondary metabolites biosynthesis, transport and catabolism]. 	351
225970	COG3436	COG3436	Transposase [Mobilome: prophages, transposons]. 	157
225971	COG3437	RpfG	Response regulator c-di-GMP phosphodiesterase, RpfG family, contains REC and HD-GYP domains [Signal transduction mechanisms]. 	360
225972	COG3439	COG3439	Uncharacterized conserved protein, DUF302 family [Function unknown]. 	137
225973	COG3440	COG3440	Predicted restriction endonuclease  [Defense mechanisms]. 	301
225974	COG3442	COG3442	Glutamine amidotransferase related to the GATase domain of CobQ [General function prediction only]. 	250
225975	COG3443	ZinT	Periplasmic Zn/Cd-binding protein ZinT [Inorganic ion transport and metabolism]. 	193
225976	COG3444	AgaB	Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIB [Carbohydrate transport and metabolism]. 	159
225977	COG3445	GrcA	Autonomous glycyl radical cofactor GrcA [Coenzyme transport and metabolism]. 	127
225978	COG3447	MASE1	Integral membrane sensor domain MASE1 [Signal transduction mechanisms]. 	308
225979	COG3448	COG3448	CBS-domain-containing membrane protein  [Signal transduction mechanisms]. 	382
225980	COG3449	SbmC	DNA gyrase inhibitor GyrI [Replication, recombination and repair]. 	154
225981	COG3450	COG3450	Predicted enzyme of the cupin superfamily  [General function prediction only]. 	116
225982	COG3451	VirB4	Type IV secretory pathway, VirB4 component [Intracellular trafficking, secretion, and vesicular transport]. 	796
225983	COG3452	CHASE	Extracellular (periplasmic) sensor domain CHASE (specificity unknown) [Signal transduction mechanisms]. 	297
225984	COG3453	COG3453	Predicted phosphohydrolase, protein tyrosine phosphatase (PTP) superfamily, DUF442 family  [General function prediction only]. 	130
225985	COG3454	PhnM	Alpha-D-ribose 1-methylphosphonate 5-triphosphate diphosphatase PhnM [Inorganic ion transport and metabolism]. 	377
225986	COG3455	COG3455	Type VI protein secretion system component VasF  [Intracellular trafficking, secretion, and vesicular transport]. 	262
225987	COG3456	COG3456	Predicted component of the type VI protein secretion system, contains a FHA domain  [Signal transduction mechanisms, Intracellular trafficking, secretion, and vesicular transport]. 	430
225988	COG3457	YhfX	Predicted amino acid racemase [Amino acid transport and metabolism]. 	353
225989	COG3458	Axe1	Cephalosporin-C deacetylase or related acetyl esterase [Secondary metabolites biosynthesis, transport and catabolism]. 	321
225990	COG3459	COG3459	Cellobiose phosphorylase  [Carbohydrate transport and metabolism]. 	1056
225991	COG3460	PaaB	1,2-phenylacetyl-CoA epoxidase, PaaB subunit [Secondary metabolites biosynthesis, transport and catabolism]. 	117
225992	COG3461	COG3461	Uncharacterized protein [Function unknown]. 	103
225993	COG3462	COG3462	Uncharacterized membrane protein  [Function unknown]. 	117
225994	COG3463	COG3463	Uncharacterized membrane protein  [Function unknown]. 	458
225995	COG3464	COG3464	Transposase [Mobilome: prophages, transposons]. 	402
225996	COG3465	YwgA	Uncharacterized protein YwgA [Function unknown]. 	171
225997	COG3466	ISA1214	Putative transposon-encoded protein  [Mobilome: prophages, transposons]. 	52
225998	COG3467	NimA	Nitroimidazol reductase NimA or a related FMN-containing flavoprotein, pyridoxamine 5'-phosphate oxidase superfamily [Defense mechanisms]. 	166
225999	COG3468	AidA	Type V secretory pathway, adhesin AidA [Cell wall/membrane/envelope biogenesis, Intracellular trafficking, secretion, and vesicular transport]. 	592
226000	COG3469	Chi1	Chitinase  [Carbohydrate transport and metabolism]. 	332
226001	COG3470	Tpd	Uncharacterized protein probably involved in high-affinity Fe2+ transport  [Cell wall/membrane/envelope biogenesis, Lipid transport and metabolism]. 	179
226002	COG3471	COG3471	Predicted secreted (periplasmic) protein  [Function unknown]. 	235
226003	COG3472	COG3472	Uncharacterized protein [Function unknown]. 	342
226004	COG3473	COG3473	Maleate cis-trans isomerase  [Secondary metabolites biosynthesis, transport and catabolism]. 	238
226005	COG3474	Cyc7	Cytochrome c2  [Energy production and conversion]. 	135
226006	COG3475	LicD	Phosphorylcholine metabolism protein LicD [Lipid transport and metabolism]. 	256
226007	COG3476	TspO	Tryptophan-rich sensory protein (mitochondrial benzodiazepine receptor homolog)  [Signal transduction mechanisms]. 	161
226008	COG3477	YagU	Uncharacterized membrane protein YagU, involved in acid resistance, DUF1440 family [Function unknown]. 	176
226009	COG3478	YpzJ	Predicted nucleic-acid-binding protein, contains Zn-ribbon domain  [General function prediction only]. 	68
226010	COG3479	PadC	Phenolic acid decarboxylase  [Secondary metabolites biosynthesis, transport and catabolism]. 	175
226011	COG3480	SdrC	Predicted secreted protein containing a PDZ domain  [Signal transduction mechanisms]. 	342
226012	COG3481	YhaM	3'-5' exoribonuclease YhaM, can participate in 23S rRNA maturation,  HD superfamily [Translation, ribosomal structure and biogenesis]. 	287
226013	COG3482	COG3482	Uncharacterized protein [Function unknown]. 	237
226014	COG3483	TDO2	Tryptophan 2,3-dioxygenase (vermilion)  [Amino acid transport and metabolism]. 	262
226015	COG3484	COG3484	Predicted proteasome-type protease  [Posttranslational modification, protein turnover, chaperones]. 	255
226016	COG3485	PcaH	Protocatechuate 3,4-dioxygenase beta subunit  [Secondary metabolites biosynthesis, transport and catabolism]. 	226
226017	COG3486	IucD	Lysine/ornithine N-monooxygenase  [Secondary metabolites biosynthesis, transport and catabolism]. 	436
226018	COG3487	IrpA	Uncharacterized iron-regulated protein  [Function unknown]. 	446
226019	COG3488	COG3488	Uncharacterized conserved protein with two CxxC motifs, DUF1111 family [General function prediction only]. 	481
226020	COG3489	COG3489	Predicted periplasmic lipoprotein  [Function unknown]. 	359
226021	COG3490	COG3490	Uncharacterized protein [Function unknown]. 	366
226022	COG3491	PcbC	Isopenicillin N synthase and related dioxygenases  [Secondary metabolites biosynthesis, transport and catabolism]. 	322
226023	COG3492	COG3492	Uncharacterized protein, DUF1244 family [Function unknown]. 	104
226024	COG3493	CitS	Na+/citrate or Na+/malate symporter  [Energy production and conversion]. 	438
226025	COG3494	COG3494	Uncharacterized conserved protein, DUF1009 family [Function unknown]. 	279
226026	COG3495	COG3495	Uncharacterized protein, DUF3299 family [Function unknown]. 	166
226027	COG3496	COG3496	Uncharacterized conserved protein, DUF1365 family [Function unknown]. 	261
226028	COG3497	COG3497	Phage tail sheath protein FI  [Mobilome: prophages, transposons]. 	394
226029	COG3498	COG3498	Phage tail tube protein FII  [Mobilome: prophages, transposons]. 	169
226030	COG3499	COG3499	Phage protein U  [Mobilome: prophages, transposons]. 	147
226031	COG3500	gpD	Phage protein D [Mobilome: prophages, transposons]. 	350
226032	COG3501	VgrG	Uncharacterized conserved protein, implicated in type VI secretion and phage assembly  [Intracellular trafficking, secretion, and vesicular transport, Mobilome: prophages, transposons, General function prediction only]. 	550
226033	COG3502	COG3502	Uncharacterized conserved protein, DUF952 family [Function unknown]. 	115
226034	COG3503	COG3503	Uncharacterized membrane protein  [Function unknown]. 	323
226035	COG3504	VirB9	Type IV secretory pathway, VirB9 components  [Intracellular trafficking, secretion, and vesicular transport]. 	265
226036	COG3505	VirD4	Type IV secretory pathway, VirD4 component, TraG/TraD family ATPase  [Intracellular trafficking, secretion, and vesicular transport]. 	596
226037	COG3506	Ree1	Regulation of enolase protein 1 (function unknown), concanavalin A-like superfamily [Function unknown]. 	189
226038	COG3507	XynB2	Beta-xylosidase [Carbohydrate transport and metabolism]. 	549
226039	COG3508	HmgA	Homogentisate 1,2-dioxygenase  [Secondary metabolites biosynthesis, transport and catabolism]. 	427
226040	COG3509	LpqC	Poly(3-hydroxybutyrate) depolymerase  [Secondary metabolites biosynthesis, transport and catabolism]. 	312
226041	COG3510	CmcI	Cephalosporin hydroxylase  [Defense mechanisms]. 	237
226042	COG3511	PlcC	Phospholipase C  [Cell wall/membrane/envelope biogenesis]. 	527
226043	COG3512	Cas2	CRISPR/Cas system-associated protein Cas2, endoribonuclease [Defense mechanisms]. 	116
226044	COG3513	Cas9	CRISPR/Cas system Type II  associated protein, contains McrA/HNH and RuvC-like nuclease domains  [Defense mechanisms]. 	1088
226045	COG3514	COG3514	Uncharacterized conserved protein, DUF4415 family [Function unknown]. 	93
226046	COG3515	COG3515	Predicted component of the type VI protein secretion system  [Intracellular trafficking, secretion, and vesicular transport]. 	346
226047	COG3516	COG3516	Predicted component of the type VI protein secretion system  [Intracellular trafficking, secretion, and vesicular transport]. 	169
226048	COG3517	COG3517	Predicted component of the type VI protein secretion system  [Intracellular trafficking, secretion, and vesicular transport]. 	495
226049	COG3518	COG3518	Predicted component of the type VI protein secretion system  [Intracellular trafficking, secretion, and vesicular transport]. 	157
226050	COG3519	COG3519	Type VI protein secretion system component VasA  [Intracellular trafficking, secretion, and vesicular transport]. 	621
226051	COG3520	COG3520	Predicted component of the type VI protein secretion system  [Intracellular trafficking, secretion, and vesicular transport]. 	335
226052	COG3521	COG3521	Predicted component of the type VI protein secretion system  [Intracellular trafficking, secretion, and vesicular transport]. 	159
226053	COG3522	COG3522	Predicted component of the type VI protein secretion system  [Intracellular trafficking, secretion, and vesicular transport]. 	446
226054	COG3523	IcmF	Type VI protein secretion system component VasK  [Intracellular trafficking, secretion, and vesicular transport]. 	1188
226055	COG3524	KpsE	Capsule polysaccharide export protein KpsE/RkpR [Cell wall/membrane/envelope biogenesis]. 	372
226056	COG3525	Chb	N-acetyl-beta-hexosaminidase  [Carbohydrate transport and metabolism]. 	732
226057	COG3526	COG3526	Predicted selenoprotein, Rdx family [Function unknown]. 	99
226058	COG3527	AlsD	Alpha-acetolactate decarboxylase  [Secondary metabolites biosynthesis, transport and catabolism]. 	234
226059	COG3528	COG3528	Uncharacterized protein, DUF2219 family [Function unknown]. 	330
226060	COG3529	COG3529	Predicted nucleic-acid-binding protein, contains Zn-ribbon domain  [General function prediction only]. 	66
226061	COG3530	COG3530	Uncharacterized conserved protein, DUF3820 family [Function unknown]. 	71
226062	COG3531	COG3531	Predicted protein-disulfide isomerase, contains CxxC motif [Posttranslational modification, protein turnover, chaperones]. 	212
226063	COG3533	COG3533	Uncharacterized conserved protein, DUF1680 family [Function unknown]. 	589
226064	COG3534	AbfA	Alpha-L-arabinofuranosidase  [Carbohydrate transport and metabolism]. 	501
226065	COG3535	COG3535	Uncharacterized conserved protein, DUF917 family [Function unknown]. 	357
226066	COG3536	COG3536	Uncharacterized conserved protein, DUF971 family [Function unknown]. 	120
226067	COG3537	COG3537	Putative alpha-1,2-mannosidase  [Carbohydrate transport and metabolism]. 	768
226068	COG3538	COG3538	Meiotically up-regulated gene 157 (Mug157) protein (function unknown) [Function unknown]. 	434
226069	COG3539	FimA	Pilin (type 1 fimbria component protein) [Cell motility]. 	184
226070	COG3540	PhoD	Phosphodiesterase/alkaline phosphatase D  [Inorganic ion transport and metabolism]. 	522
226071	COG3541	YcgL	Predicted nucleotidyltransferase  [General function prediction only]. 	248
226072	COG3542	CFF1	Predicted sugar epimerase, cupin superfamily [General function prediction only]. 	162
226073	COG3543	COG3543	Uncharacterized protein [Function unknown]. 	135
226074	COG3544	COG3544	Uncharacterized conserved protein, DUF305 family [Function unknown]. 	190
226075	COG3545	YdeN	Predicted esterase of the alpha/beta hydrolase fold  [General function prediction only]. 	181
226076	COG3546	CotJC	Mn-containing catalase (includes spore coat protein CotJC) [Inorganic ion transport and metabolism]. 	277
226077	COG3547	COG3547	Transposase [Mobilome: prophages, transposons]. 	303
226078	COG3548	COG3548	Uncharacterized membrane protein  [Function unknown]. 	197
226079	COG3549	HigB	Plasmid maintenance system killer protein  [Defense mechanisms]. 	94
226080	COG3550	HipA	Serine/threonine protein kinase HipA, toxin component of the HipAB toxin-antitoxin module [Signal transduction mechanisms]. 	392
226081	COG3551	COG3551	Uncharacterized protein [Function unknown]. 	402
226082	COG3552	CoxE	Uncharacterized conserved protein, contains von Willebrand factor type A (vWA) domain   [Function unknown]. 	395
226083	COG3553	COG3553	Uncharacterized protein [Function unknown]. 	96
226084	COG3554	COG3554	Uncharacterized protein [Function unknown]. 	190
226085	COG3555	LpxO2	Aspartyl/asparaginyl beta-hydroxylase, cupin superfamily [Posttranslational modification, protein turnover, chaperones]. 	291
226086	COG3556	COG3556	Uncharacterized membrane protein [Function unknown]. 	150
226087	COG3557	YgaC	Uncharacterized protein associated with RNAses G and E,  UPF0374/DUF402 family [Function unknown]. 	177
226088	COG3558	COG3558	Uncharacterized conserved protein, nuclear transport factor 2 (NTF2) superfamily [Function unknown]. 	154
226089	COG3559	TnrB3	Putative exporter of polyketide antibiotics  [Intracellular trafficking, secretion, and vesicular transport]. 	536
226090	COG3560	FMR2	Fatty acid repression mutant protein (predicted oxidoreductase) [General function prediction only]. 	200
226091	COG3561	COG3561	Phage anti-repressor protein  [Mobilome: prophages, transposons]. 	110
226092	COG3562	KpsS	Capsule polysaccharide modification protein KpsS [Cell wall/membrane/envelope biogenesis]. 	403
226093	COG3563	KpsC	Capsule polysaccharide export protein KpsC/LpsZ [Cell wall/membrane/envelope biogenesis]. 	671
226094	COG3564	COG3564	Uncharacterized conserved protein, DUF779 family [Function unknown]. 	116
226095	COG3565	COG3565	Predicted dioxygenase of extradiol dioxygenase family  [General function prediction only]. 	138
226096	COG3566	COG3566	Uncharacterized protein [Function unknown]. 	379
226097	COG3567	COG3567	Uncharacterized protein [Function unknown]. 	452
226098	COG3568	ElsH	Metal-dependent hydrolase, endonuclease/exonuclease/phosphatase family [General function prediction only]. 	259
226099	COG3569	Top1	DNA topoisomerase IB  [Replication, recombination and repair]. 	354
226100	COG3570	StrB	Streptomycin 6-kinase  [Defense mechanisms]. 	274
226101	COG3571	COG3571	Predicted hydrolase of the alpha/beta-hydrolase fold  [General function prediction only]. 	213
226102	COG3572	Gsh2	Gamma-glutamylcysteine synthetase  [Coenzyme transport and metabolism]. 	456
226103	COG3573	COG3573	Predicted oxidoreductase  [General function prediction only]. 	552
226104	COG3575	COG3575	Uncharacterized protein [Function unknown]. 	184
226105	COG3576	COG3576	Predicted flavin-nucleotide-binding protein, pyridoxine 5'-phosphate oxidase superfamily [General function prediction only]. 	173
226106	COG3577	COG3577	Predicted aspartyl protease  [General function prediction only]. 	215
226107	COG3579	PepC	Aminopeptidase C  [Amino acid transport and metabolism]. 	444
226108	COG3580	COG3580	Predicted nucleotide-binding protein, sugar kinase/HSP70/actin superfamily [General function prediction only]. 	351
226109	COG3581	COG3581	Predicted nucleotide-binding protein, sugar kinase/HSP70/actin superfamily [General function prediction only]. 	420
226110	COG3582	COG3582	Predicted nucleic acid binding protein containing the AN1-type Zn-finger  [General function prediction only]. 	162
226111	COG3583	YabE	Uncharacterized conserved protein YabE, contains G5 and tandem DUF348 domains [Function unknown]. 	309
226112	COG3584	3D	3D (Asp-Asp-Asp) domain [Function unknown]. 	109
226113	COG3585	MopI	Molybdopterin-binding protein  [Coenzyme transport and metabolism]. 	69
226114	COG3586	COG3586	Predicted transport protein [Function unknown]. 	101
226115	COG3587	COG3587	Restriction endonuclease  [Defense mechanisms]. 	985
226116	COG3588	Fba1	Fructose-bisphosphate aldolase class 1 [Carbohydrate transport and metabolism]. 	332
226117	COG3589	COG3589	Uncharacterized protein [Function unknown]. 	360
226118	COG3590	PepO	Predicted metalloendopeptidase  [Posttranslational modification, protein turnover, chaperones]. 	654
226119	COG3591	eMpr	V8-like Glu-specific endopeptidase [Posttranslational modification, protein turnover, chaperones]. 	251
226120	COG3592	YjdI	Uncharacterized Fe-S cluster protein YjdI [Function unknown]. 	74
226121	COG3593	YbjD	Predicted ATP-dependent endonuclease of the OLD family, contains P-loop ATPase and TOPRIM domains [Replication, recombination and repair]. 	581
226122	COG3594	NolL	Fucose 4-O-acetylase or related acetyltransferase [Carbohydrate transport and metabolism]. 	343
226123	COG3595	YvlB	Uncharacterized conserved protein YvlB, contains  DUF4097 and DUF4098 domains [Function unknown]. 	318
226124	COG3596	YeeP	Predicted GTPase [General function prediction only]. 	296
226125	COG3597	COG3597	Uncharacterized conserved protein, DUF697 family [Function unknown]. 	139
226126	COG3598	RepA	RecA-family ATPase  [Replication, recombination and repair]. 	402
226127	COG3599	DivIVA	Cell division septum initiation DivIVA, interacts with FtsZ, MinD and other proteins [Cell cycle control, cell division, chromosome partitioning]. 	212
226128	COG3600	GepA	Uncharacterized phage-associated protein  [Mobilome: prophages, transposons]. 	154
226129	COG3601	FmnP	Riboflavin transporter FmnP [Coenzyme transport and metabolism]. 	186
226130	COG3602	COG3602	Uncharacterized protein [Function unknown]. 	134
226131	COG3603	COG3603	Uncharacterized protein [Function unknown]. 	128
226132	COG3604	FhlA	Transcriptional regulator containing GAF, AAA-type ATPase, and DNA-binding Fis domains [Transcription, Signal transduction mechanisms]. 	550
226133	COG3605	PtsP	Signal transduction protein containing GAF and PtsI domains [Signal transduction mechanisms]. 	756
226134	COG3607	COG3607	Predicted lactoylglutathione lyase  [General function prediction only]. 	133
226135	COG3608	COG3608	Predicted deacylase  [General function prediction only]. 	331
226136	COG3609	ParD	Transcriptional regulator, contains Arc/MetJ-type RHH (ribbon-helix-helix) DNA-binding domain  [Transcription]. 	89
226137	COG3610	YjjB	Uncharacterized membrane protein YjjB, DUF3815 family [Function unknown]. 	156
226138	COG3611	DnaB	Replication initiation and membrane attachment protein DnaB [Replication, recombination and repair]. 	417
226139	COG3612	COG3612	Uncharacterized protein [Function unknown]. 	157
226140	COG3613	RCL	Nucleoside 2-deoxyribosyltransferase  [Nucleotide transport and metabolism]. 	172
226141	COG3614	CHASE1	Extracellular (periplasmic) sensor domain CHASE1 (specificity unknown) [Signal transduction mechanisms]. 	348
226142	COG3615	TehB	Uncharacterized protein/domain, possibly involved in tellurite resistance [Function unknown]. 	99
226143	COG3616	Dsd1	D-serine deaminase, pyridoxal phosphate-dependent [Amino acid transport and metabolism]. 	368
226144	COG3617	COG3617	Prophage antirepressor  [Mobilome: prophages, transposons]. 	176
226145	COG3618	COG3618	Predicted metal-dependent hydrolase, TIM-barrel fold  [General function prediction only]. 	279
226146	COG3619	YoaK	Uncharacterized membrane protein YoaK, UPF0700 family [Function unknown]. 	226
226147	COG3620	COG3620	Predicted transcriptional regulator with C-terminal CBS domains  [Transcription]. 	187
226148	COG3621	PATA	Patatin-like phospholipase/acyl hydrolase [General function prediction only]. 	394
226149	COG3622	Hyi	Hydroxypyruvate isomerase [Carbohydrate transport and metabolism]. 	260
226150	COG3623	SgaU	L-ribulose-5-phosphate 3-epimerase UlaE [Carbohydrate transport and metabolism]. 	287
226151	COG3624	PhnG	Alpha-D-ribose 1-methylphosphonate 5-triphosphate synthase subunit PhnG [Inorganic ion transport and metabolism]. 	151
226152	COG3625	PhnH	Alpha-D-ribose 1-methylphosphonate 5-triphosphate synthase subunit PhnH [Inorganic ion transport and metabolism]. 	196
226153	COG3626	PhnI	Alpha-D-ribose 1-methylphosphonate 5-triphosphate synthase subunit PhnI [Inorganic ion transport and metabolism]. 	367
226154	COG3627	PhnJ	Alpha-D-ribose 1-methylphosphonate 5-phosphate C-P lyase [Inorganic ion transport and metabolism]. 	291
226155	COG3628	COG3628	Phage baseplate assembly protein W  [Mobilome: prophages, transposons]. 	116
226156	COG3629	DnrI	DNA-binding transcriptional activator of the SARP family  [Signal transduction mechanisms]. 	280
226157	COG3630	OadG	Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, gamma subunit  [Energy production and conversion]. 	84
226158	COG3631	YesE	Ketosteroid isomerase-related protein  [General function prediction only]. 	133
226159	COG3633	SstT	Na+/serine symporter [Amino acid transport and metabolism]. 	407
226160	COG3634	AhpF	Alkyl hydroperoxide reductase subunit AhpF [Defense mechanisms]. 	520
226161	COG3635	ApgM	2,3-bisphosphoglycerate-independent phosphoglycerate mutase, archaeal type [Carbohydrate transport and metabolism]. 	408
226162	COG3636	COG3636	DNA-binding prophage protein  [Mobilome: prophages, transposons]. 	100
226163	COG3637	LomR	Opacity protein and related surface antigens [Cell wall/membrane/envelope biogenesis]. 	199
226164	COG3638	PhnC	ABC-type phosphate/phosphonate transport system, ATPase component [Inorganic ion transport and metabolism]. 	258
226165	COG3639	PhnE	ABC-type phosphate/phosphonate transport system, permease component [Inorganic ion transport and metabolism]. 	283
226166	COG3640	CooC	CO dehydrogenase nickel-insertion accessory protein CooC1 [Posttranslational modification, protein turnover, chaperones]. 	255
226167	COG3641	PfoR	Uncharacterized membrane protein PfoR (does not regulate perfringolysin O expression) [Function unknown]. 	348
226168	COG3642	Bud32	tRNA A-37 threonylcarbamoyl transferase component Bud32 [Translation, ribosomal structure and biogenesis]. 	204
226169	COG3643	GluFT	Glutamate formiminotransferase  [Amino acid transport and metabolism]. 	302
226170	COG3644	COG3644	Uncharacterized protein [Function unknown]. 	194
226171	COG3645	KilAC	Phage antirepressor protein YoqD, KilAC domain [Mobilome: prophages, transposons]. 	135
226172	COG3646	pRha	Phage regulatory protein Rha [Mobilome: prophages, transposons]. 	167
226173	COG3647	YjdF	Uncharacterized membrane protein YjdF [Function unknown]. 	205
226174	COG3648	UriC	Uricase (urate oxidase)  [Secondary metabolites biosynthesis, transport and catabolism]. 	299
226175	COG3649	Csh2	CRISPR/Cas system type I-B associated protein Csh2, Cas7 group, RAMP superfamily [Defense mechanisms]. 	283
226176	COG3650	COG3650	Uncharacterized membrane protein [Function unknown]. 	149
226177	COG3651	COG3651	Uncharacterized conserved protein, DUF2237 family [Function unknown]. 	125
226178	COG3652	COG3652	Predicted outer membrane protein  [Function unknown]. 	170
226179	COG3653	COG3653	N-acyl-D-aspartate/D-glutamate deacylase  [Secondary metabolites biosynthesis, transport and catabolism]. 	579
226180	COG3654	Doc	Prophage maintenance system killer protein  [Mobilome: prophages, transposons]. 	132
226181	COG3655	YozG	DNA-binding transcriptional regulator, XRE family  [Transcription]. 	73
226182	COG3656	COG3656	Predicted periplasmic protein  [Function unknown]. 	172
226183	COG3657	COG3657	Putative component of the toxin-antitoxin plasmid stabilization module [Defense mechanisms]. 	100
226184	COG3658	CytB	Cytochrome b  [Energy production and conversion]. 	192
226185	COG3659	OprB	Carbohydrate-selective porin OprB [Cell wall/membrane/envelope biogenesis]. 	439
226186	COG3660	ELM1	Mitochondrial fission protein ELM1 [Cell cycle control, cell division, chromosome partitioning]. 	329
226187	COG3661	AguA2	Alpha-glucuronidase  [Carbohydrate transport and metabolism]. 	684
226188	COG3662	COG3662	Uncharacterized conserved protein, DUF2236 family [Function unknown]. 	300
226189	COG3663	Mug	G:T/U-mismatch repair DNA glycosylase [Replication, recombination and repair]. 	169
226190	COG3664	XynB	Beta-xylosidase  [Carbohydrate transport and metabolism]. 	428
226191	COG3665	YcgI	Uncharacterized conserved protein YcgI, DUF1989 family [Function unknown]. 	264
226192	COG3666	COG3666	Transposase [Mobilome: prophages, transposons]. 	161
226193	COG3667	PcoB	Uncharacterized protein involved in copper resistance  [Inorganic ion transport and metabolism]. 	321
226194	COG3668	ParE	Plasmid stabilization system protein ParE [Mobilome: prophages, transposons]. 	98
226195	COG3669	AfuC	Alpha-L-fucosidase  [Carbohydrate transport and metabolism]. 	430
226196	COG3670	COG3670	Carotenoid cleavage dioxygenase or a related enzyme [Secondary metabolites biosynthesis, transport and catabolism]. 	490
226197	COG3671	COG3671	Uncharacterized membrane protein  [Function unknown]. 	125
226198	COG3672	COG3672	Predicted transglutaminase-like cysteine proteinase  [Posttranslational modification, protein turnover, chaperones]. 	191
226199	COG3673	COG3673	Uncharacterized protein, PA2063/DUF2235 family [Function unknown]. 	423
226200	COG3675	Lip2	Predicted lipase  [Lipid transport and metabolism]. 	332
226201	COG3676	COG3676	Transposase and inactivated derivatives  [Mobilome: prophages, transposons]. 	126
226202	COG3677	InsA	Transposase [Mobilome: prophages, transposons]. 	129
226203	COG3678	CpxP	Periplasmic protein refolding chaperone Spy/CpxP family [Posttranslational modification, protein turnover, chaperones]. 	160
226204	COG3679	YlbF	Cell fate regulator YlbF, YheA/YmcA/DUF963 family (controls sporulation, competence, biofilm development) [Signal transduction mechanisms]. 	118
226205	COG3680	COG3680	Uncharacterized protein [Function unknown]. 	259
226206	COG3681	CdsB	L-cysteine desulfidase [Amino acid transport and metabolism]. 	433
226207	COG3682	COG3682	Predicted transcriptional regulator  [Transcription]. 	123
226208	COG3683	COG3683	ABC-type uncharacterized transport system, periplasmic component  [General function prediction only]. 	213
226209	COG3684	LacD	Tagatose-1,6-bisphosphate aldolase [Carbohydrate transport and metabolism]. 	306
226210	COG3685	YciE	Ferritin-like metal-binding protein YciE [Inorganic ion transport and metabolism]. 	167
226211	COG3686	COG3686	Uncharacterized conserved protein, MAPEG superfamily [Function unknown]. 	125
226212	COG3687	COG3687	Predicted metal-dependent hydrolase  [General function prediction only]. 	280
226213	COG3688	YacP	Predicted RNA-binding protein containing a PIN domain  [General function prediction only]. 	173
226214	COG3689	YcgQ	Uncharacterized membrane protein YcgQ,  UPF0703/DUF1980 family [Function unknown]. 	271
226215	COG3691	YfcZ	Uncharacterized conserved protein YfcZ, UPF0381/DUF406 family [Function unknown]. 	98
226216	COG3692	YifN	Uncharacterized protein YifN, PemK superfamily [Function unknown]. 	142
226217	COG3693	XynA	Endo-1,4-beta-xylanase, GH35 family [Carbohydrate transport and metabolism]. 	345
226218	COG3694	COG3694	ABC-type uncharacterized transport system, permease component  [General function prediction only]. 	260
226219	COG3695	Atl1	Alkylated DNA nucleotide flippase Atl1, participates in nucleotide excision repair, Ada-like DNA-binding domain [Transcription]. 	103
226220	COG3696	CusA	Cu/Ag efflux pump CusA [Inorganic ion transport and metabolism]. 	1027
226221	COG3697	CitX	Phosphoribosyl-dephospho-CoA transferase (holo-ACP synthetase) [Coenzyme transport and metabolism, Lipid transport and metabolism]. 	182
226222	COG3698	YigE	Uncharacterized protein YigE, DUF2233 family [Function unknown]. 	250
226223	COG3700	AphA	Acid phosphatase (class B) [Inorganic ion transport and metabolism, General function prediction only]. 	237
226224	COG3701	TrbF	Type IV secretory pathway, TrbF components  [Intracellular trafficking, secretion, and vesicular transport]. 	228
226225	COG3702	VirB3	Type IV secretory pathway, VirB3 components  [Intracellular trafficking, secretion, and vesicular transport]. 	105
226226	COG3703	ChaC	Cation transport regulator ChaC [Inorganic ion transport and metabolism]. 	190
226227	COG3704	VirB6	Type IV secretory pathway, VirB6 components  [Intracellular trafficking, secretion, and vesicular transport]. 	406
226228	COG3705	HisZ	ATP phosphoribosyltransferase regulatory subunit HisZ [Amino acid transport and metabolism]. 	390
226229	COG3706	PleD	Two-component response regulator, PleD family, consists of two REC domains and a diguanylate cyclase (GGDEF) domain [Signal transduction mechanisms, Transcription]. 	435
226230	COG3707	AmiR	Two-component response regulator, AmiR/NasT family, consists of REC and RNA-binding antiterminator (ANTAR) domains [Signal transduction mechanisms, Transcription]. 	194
226231	COG3708	YdeE	Predicted transcriptional regulator YdeE, contains AraC-type DNA-binding domain [Transcription]. 	157
226232	COG3709	PhnN	Ribose 1,5-bisphosphokinase PhnN [Carbohydrate transport and metabolism]. 	192
226233	COG3710	CadC1	DNA-binding winged helix-turn-helix (wHTH) domain [Transcription]. 	148
226234	COG3711	BglG	Transcriptional antiterminator [Transcription]. 	491
226235	COG3712	FecR	Periplasmic ferric-dicitrate binding protein FecR, regulates iron transport through sigma-19 [Inorganic ion transport and metabolism, Signal transduction mechanisms]. 	322
226236	COG3713	OmpV	Outer membrane scaffolding protein for murein synthesis, MipA/OmpV family [Cell wall/membrane/envelope biogenesis]. 	258
226237	COG3714	YhhN	Uncharacterized membrane protein YhhN [Function unknown]. 	212
226238	COG3715	ManY	Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIC [Carbohydrate transport and metabolism]. 	265
226239	COG3716	ManZ	Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IID [Carbohydrate transport and metabolism]. 	269
226240	COG3717	KduI	5-keto 4-deoxyuronate isomerase [Carbohydrate transport and metabolism]. 	278
226241	COG3718	IolB	5-deoxy-D-glucuronate isomerase [Carbohydrate transport and metabolism]. 	270
226242	COG3719	RnaI	Ribonuclease I [Translation, ribosomal structure and biogenesis]. 	249
226243	COG3720	HemS	Putative heme degradation protein  [Inorganic ion transport and metabolism]. 	349
226244	COG3721	HugX	Putative heme iron utilization protein  [Inorganic ion transport and metabolism]. 	176
226245	COG3722	MtlR	DNA-binding transcriptional regulator, MltR family [Transcription]. 	174
226246	COG3723	RecT	Recombinational DNA repair protein RecT [Replication, recombination and repair]. 	276
226247	COG3724	AstB	Succinylarginine dihydrolase [Amino acid transport and metabolism]. 	442
226248	COG3725	AmpE	Membrane protein required for beta-lactamase induction [Defense mechanisms]. 	282
226249	COG3726	AhpA	Uncharacterized membrane protein affecting hemolysin expression [Function unknown]. 	214
226250	COG3727	Vsr	G:T-mismatch repair DNA endonuclease, very short patch repair protein [Replication, recombination and repair]. 	150
226251	COG3728	XtmA	Phage terminase, small subunit  [Mobilome: prophages, transposons]. 	179
226252	COG3729	GsiB	General stress protein YciG, contains tandem KGG domains [General function prediction only]. 	73
226253	COG3730	SrlA	Phosphotransferase system sorbitol-specific component IIC [Carbohydrate transport and metabolism]. 	176
226254	COG3731	SrlB	Phosphotransferase system sorbitol-specific component IIA [Carbohydrate transport and metabolism]. 	123
226255	COG3732	SrlE	Phosphotransferase system sorbitol-specific component IIBC [Carbohydrate transport and metabolism]. 	328
226256	COG3733	TynA	Cu2+-containing amine oxidase [Secondary metabolites biosynthesis, transport and catabolism]. 	654
226257	COG3734	DgoK	2-keto-3-deoxy-galactonokinase [Carbohydrate transport and metabolism]. 	306
226258	COG3735	TraB	Uncharacterized conserved protein YbaP, TraB family [Function unknown]. 	299
226259	COG3736	VirB8	Type IV secretory pathway, component VirB8  [Intracellular trafficking, secretion, and vesicular transport]. 	239
226260	COG3737	COG3737	Uncharacterized conserved protein, contains Mth938-like domain [Function unknown]. 	127
226261	COG3738	YiijF	Uncharacterized protein YijF, DUF1287 family [Function unknown]. 	200
226262	COG3739	YoaT	Uncharacterized membrane protein YoaT, DUF817 family [Function unknown]. 	263
226263	COG3740	COG3740	Phage head maturation protease  [Mobilome: prophages, transposons]. 	194
226264	COG3741	HutG	N-formylglutamate amidohydrolase  [Amino acid transport and metabolism]. 	272
226265	COG3742	COG3742	Uncharacterized protein, contains PIN domain [Function unknown]. 	131
226266	COG3743	H3TH	Predicted 5' DNA nuclease, flap endonuclease-1-like, helix-3-turn-helix (H3TH) domain [Replication, recombination and repair]. 	133
226267	COG3744	COG3744	PIN domain nuclease, a component of toxin-antitoxin system (PIN domain)  [Defense mechanisms]. 	130
226268	COG3745	CpaB	Flp pilus assembly protein CpaB  [Intracellular trafficking, secretion, and vesicular transport, Extracellular structures]. 	276
226269	COG3746	OprP	Phosphate-selective porin  [Inorganic ion transport and metabolism]. 	426
226270	COG3747	COG3747	Phage terminase, small subunit  [Mobilome: prophages, transposons]. 	160
226271	COG3748	COG3748	Uncharacterized membrane protein  [Function unknown]. 	407
226272	COG3749	COG3749	Uncharacterized conserved protein, DUF934 family [Function unknown]. 	167
226273	COG3750	COG3750	Uncharacterized conserved protein, UPF0335 family [Function unknown]. 	85
226274	COG3751	EGL9	Proline 4-hydroxylase (includes Rps23 Pro-64 3,4-dihydroxylase Tpa1), contains SM-20 domain [Translation, ribosomal structure and biogenesis, Posttranslational modification, protein turnover, chaperones]. 	252
226275	COG3752	COG3752	Steroid 5-alpha reductase family enzyme  [General function prediction only]. 	272
226276	COG3753	YidB	Uncharacterized conserved protein YidB, DUF937 family [Function unknown]. 	143
226277	COG3754	RgpF	Lipopolysaccharide biosynthesis protein  [Cell wall/membrane/envelope biogenesis]. 	595
226278	COG3755	YecT	Uncharacterized conserved protein YecT, DUF1311 family [Function unknown]. 	127
226279	COG3756	YdaU	Uncharacterized conserved protein YdaU, DUF1376 family [Function unknown]. 	153
226280	COG3757	Acm	Lysozyme M1 (1,4-beta-N-acetylmuramidase), GH25 family [Cell wall/membrane/envelope biogenesis]. 	269
226281	COG3758	Ves	Various environmental stresses-induced protein Ves (function unknown) [Function unknown]. 	193
226282	COG3759	COG3759	Uncharacterized membrane protein [Function unknown]. 	121
226283	COG3760	ProX	Predicted aminoacyl-tRNA deacylase, YbaK-like aminoacyl-tRNA editing domain [General function prediction only]. 	164
226284	COG3761	NDUFA12	NADH:ubiquinone oxidoreductase 17.2 kD subunit  [Energy production and conversion]. 	118
226285	COG3762	COG3762	Uncharacterized membrane protein [Function unknown]. 	213
226286	COG3763	YneF	Uncharacterized protein YneF, UPF0154 family [Function unknown]. 	71
226287	COG3764	SrtA	Sortase (surface protein transpeptidase)  [Cell wall/membrane/envelope biogenesis]. 	210
226288	COG3765	WzzB	LPS O-antigen chain length determinant protein, WzzB/FepE family [Cell wall/membrane/envelope biogenesis]. 	347
226289	COG3766	YjfL	Uncharacterized membrane protein YjfL, UPF0719 family [Function unknown]. 	133
226290	COG3767	COG3767	Uncharacterized low-complexity protein  [Function unknown]. 	95
226291	COG3768	YcjF	Uncharacterized membrane protein YcjF, UPF0283 family [Function unknown]. 	350
226292	COG3769	YedP	Predicted mannosyl-3-phosphoglycerate phosphatase, HAD superfamily [Carbohydrate transport and metabolism]. 	274
226293	COG3770	MepA	Murein endopeptidase [Cell wall/membrane/envelope biogenesis]. 	284
226294	COG3771	YciS	Uncharacterized membrane protein YciS, DUF1049 family [Function unknown]. 	97
226295	COG3772	RrrD	Phage-related lysozyme (muramidase), GH24 family [Cell wall/membrane/envelope biogenesis]. 	152
226296	COG3773	CwlJ	Cell wall hydrolase CwlJ, involved in spore germination  [Cell cycle control, cell division, chromosome partitioning, Cell wall/membrane/envelope biogenesis]. 	249
226297	COG3774	OCH1	Mannosyltransferase OCH1 or related enzyme [Cell wall/membrane/envelope biogenesis]. 	347
226298	COG3775	SgcC	Phosphotransferase system, galactitol-specific IIC component [Carbohydrate transport and metabolism]. 	446
226299	COG3776	YhhL	Uncharacterized conserved protein YhhL, DUF1145 family [Function unknown]. 	91
226300	COG3777	HTD2	Hydroxyacyl-ACP dehydratase HTD2, hotdog domain [Lipid transport and metabolism]. 	273
226301	COG3778	YmfQ	Uncharacterized protein YmfQ in lambdoid prophage, DUF2313 family [Mobilome: prophages, transposons]. 	188
226302	COG3779	YegJ	Uncharacterized conserved protein YegJ, DUF2314 family [Function unknown]. 	151
226303	COG3780	COG3780	DNA endonuclease related to intein-encoded endonucleases  [Replication, recombination and repair]. 	266
226304	COG3781	YneE	Predicted membrane chloride channel, bestrophin family [Inorganic ion transport and metabolism]. 	306
226305	COG3782	COG3782	Uncharacterized protein [Function unknown]. 	289
226306	COG3783	CybC	Soluble cytochrome b562 [Energy production and conversion]. 	100
226307	COG3784	YdbL	Uncharacterized conserved protein YdbL, DUF1318 family [Function unknown]. 	109
226308	COG3785	HspQ	Heat shock protein HspQ [Posttranslational modification, protein turnover, chaperones]. 	116
226309	COG3786	COG3786	L,D-peptidoglycan transpeptidase YkuD, ErfK/YbiS/YcfS/YnhG family [Cell wall/membrane/envelope biogenesis]. 	217
226310	COG3787	YhbP	Uncharacterized conserved protein YhbP, UPF0306 family [Function unknown]. 	145
226311	COG3788	YecN	Uncharacterized membrane protein YecN, MAPEG domain [Function unknown]. 	131
226312	COG3789	YjfI	Uncharacterized protein YjfI, DUF2170 family [Function unknown]. 	146
226313	COG3790	YbgE	Predicted membrane protein, encoded in cydAB operon [Function unknown]. 	97
226314	COG3791	COG3791	Uncharacterized conserved protein [Function unknown]. 	133
226315	COG3792	COG3792	Uncharacterized protein [Function unknown]. 	122
226316	COG3793	TerB	Tellurite resistance protein  [Inorganic ion transport and metabolism]. 	144
226317	COG3794	PetE	Plastocyanin  [Energy production and conversion]. 	128
226318	COG3795	COG3795	Uncharacterized conserved protein [Function unknown]. 	123
226319	COG3797	COG3797	Uncharacterized conserved protein, DUF1697 family [Function unknown]. 	178
226320	COG3798	COG3798	Uncharacterized protein [Function unknown]. 	75
226321	COG3799	Mal	Methylaspartate ammonia-lyase  [Amino acid transport and metabolism]. 	410
226322	COG3800	COG3800	Predicted transcriptional regulator  [General function prediction only]. 	332
226323	COG3801	COG3801	Uncharacterized protein [Function unknown]. 	124
226324	COG3802	GguC	Uncharacterized protein [Function unknown]. 	333
226325	COG3803	COG3803	Uncharacterized conserved protein, DUF924 family [Function unknown]. 	182
226326	COG3804	COG3804	Uncharacterized protein [Function unknown]. 	350
226327	COG3805	DodA	Aromatic ring-cleaving dioxygenase  [Secondary metabolites biosynthesis, transport and catabolism]. 	120
226328	COG3806	ChrR	Anti-sigma factor ChrR, cupin superfamily [Signal transduction mechanisms]. 	216
226329	COG3807	SH3	SH3-like domain [Function unknown]. 	171
226330	COG3808	OVP1	Na+ or H+-translocating membrane pyrophosphatase  [Energy production and conversion]. 	703
226331	COG3809	COG3809	Predicted nucleic acid-binding protein, contains Zn-finger domain [General function prediction only]. 	88
226332	COG3811	YjhX	Uncharacterized protein YjhX, UPF0386/DUF2084 family [Function unknown]. 	85
226333	COG3812	COG3812	Uncharacterized protein [Function unknown]. 	193
226334	COG3813	COG3813	Uncharacterized protein [Function unknown]. 	84
226335	COG3814	SspB2	SspB-like protein, predicted to bind SsrA peptide [Posttranslational modification, protein turnover, chaperones]. 	157
226336	COG3815	COG3815	Uncharacterized membrane protein  [Function unknown]. 	113
226337	COG3816	COG3816	Uncharacterized protein, DUF1285 family [Function unknown]. 	205
226338	COG3817	COG3817	Uncharacterized membrane protein [Function unknown]. 	313
226339	COG3818	COG3818	Predicted acetyltransferase, GNAT superfamily  [General function prediction only]. 	167
226340	COG3819	COG3819	Uncharacterized membrane protein [Function unknown]. 	229
226341	COG3820	COG3820	Uncharacterized protein, DUF1013 family [Function unknown]. 	230
226342	COG3821	COG3821	Uncharacterized membrane protein [Function unknown]. 	234
226343	COG3822	YdaE	D-lyxose ketol-isomerase  [Carbohydrate transport and metabolism]. 	225
226344	COG3823	COG3823	Glutamine cyclotransferase  [Posttranslational modification, protein turnover, chaperones]. 	262
226345	COG3824	COG3824	Predicted Zn-dependent protease, minimal metalloprotease (MMP)-like domain [Posttranslational modification, protein turnover, chaperones]. 	136
226346	COG3825	CoxE	Uncharacterized conserved protein,  contains von Willebrand factor type A (vWA) domain  [Function unknown]. 	393
226347	COG3826	COG3826	Uncharacterized protein [Function unknown]. 	236
226348	COG3827	PopZ	Cell pole-organizing protein PopZ  [Cell cycle control, cell division, chromosome partitioning]. 	231
226349	COG3828	COG3828	Type 1 glutamine amidotransferase (GATase1)-like domain [General function prediction only]. 	239
226350	COG3829	RocR	Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding Fis domains [Transcription, Signal transduction mechanisms]. 	560
226351	COG3830	ACT	ACT domain, binds amino acids and other small ligands [Signal transduction mechanisms]. 	90
226352	COG3831	WGR	WGR domain, predicted DNA-binding domain in MolR [Transcription]. 	85
226353	COG3832	YndB	Uncharacterized conserved protein YndB, AHSA1/START domain [Function unknown]. 	149
226354	COG3833	MalG	ABC-type maltose transport system, permease component [Carbohydrate transport and metabolism]. 	282
226355	COG3835	CdaR	Sugar diacid utilization regulator [Transcription, Signal transduction mechanisms]. 	376
226356	COG3836	HpcH	2-keto-3-deoxy-L-rhamnonate aldolase RhmA [Carbohydrate transport and metabolism]. 	255
226357	COG3837	COG3837	Uncharacterized conserved protein, cupin superfamily [Function unknown]. 	161
226358	COG3838	VirB2	Type IV secretory pathway, VirB2 components (pilins)  [Intracellular trafficking, secretion, and vesicular transport]. 	108
226359	COG3839	MalK	ABC-type sugar transport system, ATPase component [Carbohydrate transport and metabolism]. 	338
226360	COG3840	ThiQ	ABC-type thiamine transport system, ATPase component [Coenzyme transport and metabolism]. 	231
226361	COG3842	PotA	ABC-type Fe3+/spermidine/putrescine transport systems, ATPase components [Amino acid transport and metabolism]. 	352
226362	COG3843	VirD2	Type IV secretory pathway, VirD2 components (relaxase)  [Intracellular trafficking, secretion, and vesicular transport]. 	326
226363	COG3844	Bna5	Kynureninase  [Amino acid transport and metabolism]. 	407
226364	COG3845	YufO	ABC-type uncharacterized transport system, ATPase component [General function prediction only]. 	501
226365	COG3846	TrbL	Type IV secretory pathway, TrbL components  [Intracellular trafficking, secretion, and vesicular transport]. 	452
226366	COG3847	Flp	Flp pilus assembly protein, pilin Flp  [Intracellular trafficking, secretion, and vesicular transport, Extracellular structures]. 	58
226367	COG3848	PykA2	Phosphohistidine swiveling domain of PEP-utilizing enzymes [Signal transduction mechanisms]. 	111
226368	COG3850	NarQ	Signal transduction histidine kinase, nitrate/nitrite-specific [Signal transduction mechanisms]. 	574
226369	COG3851	UhpB	Signal transduction histidine kinase, glucose-6-phosphate specific [Signal transduction mechanisms]. 	497
226370	COG3852	NtrB	Signal transduction histidine kinase, nitrogen specific [Signal transduction mechanisms]. 	363
226371	COG3853	TelA	Uncharacterized conserved protein YaaN involved in tellurite resistance  [Defense mechanisms]. 	386
226372	COG3854	SpoIIIAA	Stage III sporulation protein SpoIIIAA [Cell cycle control, cell division, chromosome partitioning]. 	308
226373	COG3855	Fbp2	Fructose-1,6-bisphosphatase [Carbohydrate transport and metabolism]. 	648
226374	COG3856	Sbp	Small basic protein (function unknown) [Function unknown]. 	113
226375	COG3857	AddB	ATP-dependent helicase/DNAse subunit B [Replication, recombination and repair]. 	1108
226376	COG3858	YaaH	Spore germination protein YaaH [Cell cycle control, cell division, chromosome partitioning]. 	423
226377	COG3859	ThiT	Thiamine transporter ThiT [Coenzyme transport and metabolism]. 	185
226378	COG3860	COG3860	Uncharacterized protein, DUF2087 family [Function unknown]. 	89
226379	COG3861	YsnF	Stress response protein YsnF (function unknown) [Function unknown]. 	195
226380	COG3862	COG3862	Uncharacterized protein with two CxxC motifs [Function unknown]. 	117
226381	COG3863	YycO	Uncharacterized protein YycO, NlpC/P60 family [Function unknown]. 	231
226382	COG3864	COG3864	Predicted metal-dependent peptidase [General function prediction only]. 	396
226383	COG3865	COG3865	Glyoxalase superfamily enzyme, possibly 3-demethylubiquinone-9 3-methyltransferase [General function prediction only]. 	151
226384	COG3866	PelB	Pectate lyase  [Carbohydrate transport and metabolism]. 	345
226385	COG3867	GanB	Arabinogalactan endo-1,4-beta-galactosidase  [Carbohydrate transport and metabolism]. 	403
226386	COG3868	COG3868	Predicted glycosyl hydrolase, GH114 family [Carbohydrate transport and metabolism]. 	306
226387	COG3869	McsB	Protein-arginine kinase [Posttranslational modification, protein turnover, chaperones]. 	352
226388	COG3870	YaaQ	Uncharacterized protein YaaQ, DUF970 family [Function unknown]. 	109
226389	COG3871	YzzA	General stress protein 26 (function unknown) [Function unknown]. 	145
226390	COG3872	YqhQ	Uncharacterized conserved protein YqhQ [Function unknown]. 	318
226391	COG3874	YtfJ	Uncharacterized spore protein YtfJ [Function unknown]. 	138
226392	COG3875	LarA	Nickel-dependent lactate racemase [Cell wall/membrane/envelope biogenesis]. 	423
226393	COG3876	YbbC	Uncharacterized conserved protein YbbC, DUF1343 family [Function unknown]. 	409
226394	COG3877	COG3877	Uncharacterized protein, DUF2089 family [Function unknown]. 	122
226395	COG3878	YwqG	Uncharacterized protein YwqG, DUF1963 family [Function unknown]. 	261
226396	COG3879	YlxW	Uncharacterized conserved protein YlxW, UPF0749 family [Function unknown]. 	247
226397	COG3880	McsA	Protein-arginine kinase activator protein McsA [Posttranslational modification, protein turnover, chaperones]. 	176
226398	COG3881	YrrD	Uncharacterized protein YrrD, contains PRC-barrel domain [Function unknown]. 	176
226399	COG3882	FkbH	Predicted enzyme involved in methoxymalonyl-ACP biosynthesis  [Lipid transport and metabolism]. 	574
226400	COG3883	CwlO1	Uncharacterized N-terminal domain of peptidoglycan hydrolase CwlO  [Function unknown]. 	265
226401	COG3884	FatA	Acyl-ACP thioesterase  [Lipid transport and metabolism]. 	250
226402	COG3885	COG3885	Aromatic ring-opening dioxygenase, LigB subunit [Secondary metabolites biosynthesis, transport and catabolism]. 	261
226403	COG3886	COG3886	HKD family nuclease  [Replication, recombination and repair]. 	198
226404	COG3887	GdpP	c-di-AMP phosphodiesterase, consists of a GGDEF-like and DHH domains [Signal transduction mechanisms]. 	655
226405	COG3888	COG3888	Predicted transcriptional regulator  [Transcription]. 	321
226406	COG3889	COG3889	Predicted periplasmic protein [Function unknown]. 	872
226407	COG3890	ERG8	Phosphomevalonate kinase  [Lipid transport and metabolism]. 	337
226408	COG3892	IolC	Myo-inositol catabolism protein IolC [Carbohydrate transport and metabolism]. 	310
226409	COG3893	COG3893	Inactivated superfamily I helicase  [Replication, recombination and repair]. 	697
226410	COG3894	COG3894	Uncharacterized 2Fe-2S and 4Fe-4S clusters-containing protein, contains DUF4445 domain  [Function unknown]. 	614
226411	COG3895	MliC	Membrane-bound inhibitor of C-type lysozyme [Cell wall/membrane/envelope biogenesis]. 	112
226412	COG3896	COG3896	Chloramphenicol 3-O-phosphotransferase  [Defense mechanisms]. 	205
226413	COG3897	Nnt1	Predicted nicotinamide N-methylase [General function prediction only]. 	218
226414	COG3898	COG3898	Uncharacterized membrane-anchored protein  [Function unknown]. 	531
226415	COG3899	COG3899	Predicted ATPase  [General function prediction only]. 	849
226416	COG3900	COG3900	Predicted periplasmic protein  [Function unknown]. 	262
226417	COG3901	NosR	Regulator of nitric oxide reductase transcription  [Transcription]. 	482
226418	COG3903	COG3903	Predicted ATPase  [General function prediction only]. 	414
226419	COG3904	COG3904	Predicted periplasmic protein  [Function unknown]. 	245
226420	COG3905	COG3905	Predicted transcriptional regulator  [Transcription]. 	83
226421	COG3906	YrzB	Uncharacterized protein YrzB, UPF0473 family [Function unknown]. 	105
226422	COG3907	COG3907	Membrane-associated enzyme, PAP2 (acid phosphatase) superfamily [General function prediction only]. 	249
226423	COG3908	COG3908	Uncharacterized protein [Function unknown]. 	77
226424	COG3909	CytC556	Cytochrome c556  [Energy production and conversion]. 	147
226425	COG3910	COG3910	Predicted ATPase  [General function prediction only]. 	233
226426	COG3911	COG3911	Predicted ATPase  [General function prediction only]. 	183
226427	COG3913	SciT	Uncharacterized protein [Function unknown]. 	227
226428	COG3914	Spy	Predicted O-linked N-acetylglucosamine transferase, SPINDLY family  [Posttranslational modification, protein turnover, chaperones]. 	620
226429	COG3915	COG3915	Uncharacterized protein [Function unknown]. 	155
226430	COG3916	LasI	N-acyl-L-homoserine lactone synthetase  [Signal transduction mechanisms]. 	209
226431	COG3917	NahD	2-hydroxychromene-2-carboxylate isomerase  [Secondary metabolites biosynthesis, transport and catabolism]. 	203
226432	COG3918	COG3918	Uncharacterized membrane protein  [Function unknown]. 	153
226433	COG3919	COG3919	Predicted ATP-dependent carboligase, ATP-grasp superfamily [General function prediction only]. 	415
226434	COG3920	COG3920	Two-component sensor histidine kinase, HisKA and HATPase domains [Signal transduction mechanisms]. 	221
226435	COG3921	COG3921	Uncharacterized conserved protein [Function unknown]. 	300
226436	COG3923	PriC	Primosomal replication protein N'' [Replication, recombination and repair]. 	175
226437	COG3924	YhdT	Uncharacterized membrane protein YhdT [Function unknown]. 	80
226438	COG3925	FruA	N-terminal domain of the phosphotransferase system fructose-specific component IIB [Carbohydrate transport and metabolism, Signal transduction mechanisms]. 	103
226439	COG3926	ZliS	Lysozyme family protein  [General function prediction only]. 	252
226440	COG3930	COG3930	Uncharacterized protein [Function unknown]. 	434
226441	COG3931	HutG2	Predicted N-formylglutamate amidohydrolase  [Amino acid transport and metabolism]. 	263
226442	COG3932	COG3932	Uncharacterized conserved protein [Function unknown]. 	209
226443	COG3933	LevR	Transcriptional regulatory protein LevR, contains PRD, AAA+ and EIIA domains [Transcription]. 	470
226444	COG3934	COG3934	Endo-1,4-beta-mannosidase [Carbohydrate transport and metabolism]. 	587
226445	COG3935	DnaD	DNA replication protein DnaD  [Replication, recombination and repair]. 	246
226446	COG3936	YfiQ	Membrane-bound acyltransferase YfiQ, involved in biofilm formation  [Carbohydrate transport and metabolism]. 	349
226447	COG3937	PhaF	Polyhydroxyalkanoate synthesis regulator phasin [Secondary metabolites biosynthesis, transport and catabolism, Signal transduction mechanisms]. 	108
226448	COG3938	PrdF	Proline racemase  [Amino acid transport and metabolism]. 	341
226449	COG3940	COG3940	Beta-xylosidase, GH43 family [Carbohydrate transport and metabolism]. 	324
226450	COG3941	HI1514	Phage tail tape-measure protein, controls tail length   [Mobilome: prophages, transposons]. 	633
226451	COG3942	COG3942	Surface antigen  [Cell wall/membrane/envelope biogenesis]. 	173
226452	COG3943	COG3943	Uncharacterized conserved protein [Function unknown]. 	329
226453	COG3944	YveK	Capsular polysaccharide biosynthesis protein  [Cell wall/membrane/envelope biogenesis]. 	226
226454	COG3945	COG3945	Hemerythrin-like domain [General function prediction only]. 	189
226455	COG3946	VirJ	Type IV secretory pathway, VirJ component  [Intracellular trafficking, secretion, and vesicular transport]. 	456
226456	COG3947	SAPR	Two-component response regulator, SAPR family, consists of REC, wHTH and BTAD domains [Signal transduction mechanisms, Transcription]. 	361
226457	COG3948	COG3948	Phage-related baseplate assembly protein  [Mobilome: prophages, transposons]. 	306
226458	COG3949	YkvI	Uncharacterized membrane protein YkvI [Function unknown]. 	349
226459	COG3950	COG3950	Predicted ATP-binding protein involved in virulence  [General function prediction only]. 	440
226460	COG3951	FlgJ1	Rod binding protein domain [Cell motility]. 	166
226461	COG3952	LABN	Uncharacterized N-terminal domain of lipid-A-disaccharide synthase [General function prediction only]. 	113
226462	COG3953	SLT	SLT domain protein [Mobilome: prophages, transposons]. 	235
226463	COG3954	PrkB	Phosphoribulokinase [Carbohydrate transport and metabolism]. 	289
226464	COG3955	COG3955	Uncharacterized protein, DUF1919 family [Cell wall/membrane/envelope biogenesis]. 	211
226465	COG3956	YabN	Uncharacterized conserved protein YabN, contains tetrapyrrole methylase and MazG-like pyrophosphatase domain [General function prediction only]. 	488
226466	COG3957	XFP	Phosphoketolase  [Carbohydrate transport and metabolism]. 	793
226467	COG3958	TktA2	Transketolase, C-terminal subunit  [Carbohydrate transport and metabolism]. 	312
226468	COG3959	TktA1	Transketolase, N-terminal subunit  [Carbohydrate transport and metabolism]. 	243
226469	COG3960	Gcl	Glyoxylate carboligase [Secondary metabolites biosynthesis, transport and catabolism]. 	592
226470	COG3961	PDC1	TPP-dependent 2-oxoacid decarboxylase, includes indolepyruvate decarboxylase [Carbohydrate transport and metabolism, Coenzyme transport and metabolism, General function prediction only]. 	557
226471	COG3962	IolD	TPP-dependent trihydroxycyclohexane-1,2-dione (THcHDO) dehydratase, myo-inositol metabolism [Carbohydrate transport and metabolism]. 	617
226472	COG3963	COG3963	Phospholipid N-methyltransferase  [Lipid transport and metabolism]. 	194
226473	COG3964	COG3964	Predicted amidohydrolase  [General function prediction only]. 	386
226474	COG3965	COG3965	Predicted Co/Zn/Cd cation transporter, cation efflux family  [Inorganic ion transport and metabolism]. 	314
226475	COG3966	DltD	Poly D-alanine transfer protein DltD, involved in esterification of teichoic acids [Cell wall/membrane/envelope biogenesis]. 	415
226476	COG3967	DltE	Short-chain dehydrogenase involved in D-alanine esterification of teichoic acids [Cell wall/membrane/envelope biogenesis, Lipid transport and metabolism]. 	245
226477	COG3968	GlnA3	Glutamine synthetase type III [Amino acid transport and metabolism]. 	724
226478	COG3969	YbdN	Predicted phosphoadenosine phosphosulfate sulfurtransferase, contains C-terminal DUF3440 domain [General function prediction only]. 	407
226479	COG3970	COG3970	Fumarylacetoacetate (FAA) hydrolase family protein  [General function prediction only]. 	379
226480	COG3971	MhpD	2-keto-4-pentenoate hydratase [Secondary metabolites biosynthesis, transport and catabolism]. 	264
226481	COG3972	COG3972	Superfamily I DNA and RNA helicases  [Replication, recombination and repair]. 	660
226482	COG3973	HelD	DNA helicase IV [Replication, recombination and repair]. 	747
226483	COG3975	COG3975	Predicted metalloprotease, contains C-terminal PDZ domain  [General function prediction only]. 	558
226484	COG3976	COG3976	Uncharacterized protein, contains FMN-binding domain [General function prediction only]. 	135
226485	COG3977	AvtA	Alanine-alpha-ketoisovalerate (or valine-pyruvate) aminotransferase [Amino acid transport and metabolism]. 	417
226486	COG3978	IlvM	Acetolactate synthase small subunit, contains ACT domain [Energy production and conversion]. 	86
226487	COG3979	COG3979	Chitodextrinase [Carbohydrate transport and metabolism]. 	181
226488	COG3980	SpsG	Spore coat polysaccharide biosynthesis protein SpsG, predicted glycosyltransferase  [Cell wall/membrane/envelope biogenesis]. 	318
226489	COG3981	COG3981	Predicted acetyltransferase  [General function prediction only]. 	174
226490	COG4001	COG4001	Uncharacterized protein [Function unknown]. 	102
226491	COG4002	COG4002	Predicted methyltransferase MtxX, methanogen marker protein 4 [General function prediction only]. 	256
226492	COG4003	COG4003	Uncharacterized protein, DUF2095 family [Function unknown]. 	98
226493	COG4004	COG4004	Uncharacterized protein [Function unknown]. 	96
226494	COG4006	COG4006	CRISPR/Cas system-associated protein Csm6, COG1517 family [Defense mechanisms]. 	278
226495	COG4007	COG4007	Predicted dehydrogenase related to H2-forming N5,N10-methylenetetrahydromethanopterin dehydrogenase [General function prediction only]. 	340
226496	COG4008	COG4008	Predicted metal-binding transcription factor, methanogenesis marker domain 9 [Transcription]. 	153
226497	COG4009	COG4009	Uncharacterized protein [Function unknown]. 	88
226498	COG4010	COG4010	Uncharacterized protein [Function unknown]. 	170
226499	COG4012	COG4012	Uncharacterized protein, DUF1786 family [Function unknown]. 	342
226500	COG4013	COG4013	Uncharacterized protein [Function unknown]. 	91
226501	COG4014	COG4014	Uncharacterized protein [Function unknown]. 	97
226502	COG4015	COG4015	Predicted dinucleotide-utilizing enzyme of the ThiF/HesA family  [General function prediction only]. 	217
226503	COG4016	COG4016	Uncharacterized protein, UPF0254 family [Function unknown]. 	165
226504	COG4017	COG4017	Uncharacterized protein [Function unknown]. 	254
226505	COG4018	COG4018	Uncharacterized protein [Function unknown]. 	505
226506	COG4019	COG4019	Uncharacterized protein [Function unknown]. 	156
226507	COG4020	COG4020	Uncharacterized protein [Function unknown]. 	332
226508	COG4021	Thg1	tRNA(His) 5'-end guanylyltransferase [Translation, ribosomal structure and biogenesis]. 	249
226509	COG4022	COG4022	Uncharacterized protein [Function unknown]. 	286
226510	COG4023	SBH1	Preprotein translocase subunit Sec61beta  [Intracellular trafficking, secretion, and vesicular transport]. 	57
226511	COG4024	COG4024	Uncharacterized protein [Function unknown]. 	218
226512	COG4025	COG4025	Uncharacterized membrane protein  [Function unknown]. 	284
226513	COG4026	COG4026	Uncharacterized protein, contains TOPRIM domain, potential nuclease  [General function prediction only]. 	290
226514	COG4027	COG4027	3'-phosphoadenosine 5'-phosphosulfate sulfotransferase  [Nucleotide transport and metabolism]. 	194
226515	COG4028	COG4028	Predicted P-loop ATPase/GTPase  [General function prediction only]. 	271
226516	COG4029	COG4029	Uncharacterized protein [Function unknown]. 	142
226517	COG4030	COG4030	Predicted phosphohydrolase, HAD superfamily [General function prediction only]. 	315
226518	COG4031	COG4031	Uncharacterized protein, DUF2103 family [Function unknown]. 	227
226519	COG4032	COG4032	Sulfopyruvate decarboxylase, TPP-binding subunit (coenzyme M biosynthesis) [Coenzyme transport and metabolism]. 	172
226520	COG4033	COG4033	Uncharacterized protein [Function unknown]. 	102
226521	COG4034	COG4034	Uncharacterized protein [Function unknown]. 	328
226522	COG4035	COG4035	Uncharacterized membrane protein  [Function unknown]. 	108
226523	COG4036	EhaG	Energy-converting hydrogenase Eha subunit G [Energy production and conversion]. 	224
226524	COG4037	EhaF	Energy-converting hydrogenase Eha subunit F [Energy production and conversion]. 	163
226525	COG4038	EhaE	Energy-converting hydrogenase Eha subunit E [Energy production and conversion]. 	87
226526	COG4039	EhaC	Energy-converting hydrogenase Eha subunit C [Energy production and conversion]. 	86
226527	COG4040	COG4040	Uncharacterized membrane protein [Function unknown]. 	85
226528	COG4041	EhaB	Energy-converting hydrogenase Eha subunit B [Energy production and conversion]. 	171
226529	COG4042	EhaA	Energy-converting hydrogenase Eha subunit A [Energy production and conversion]. 	104
226530	COG4043	ASCH	ASC-1 homology (ASCH) domain, predicted RNA-binding domain [General function prediction only]. 	111
226531	COG4044	COG4044	Uncharacterized protein [Function unknown]. 	247
226532	COG4046	COG4046	Uncharacterized protein [Function unknown]. 	368
226533	COG4047	COG4047	N-glycosylase/DNA lyase [Replication, recombination and repair]. 	243
226534	COG4048	COG4048	Uncharacterized protein [Function unknown]. 	123
226535	COG4049	COG4049	Uncharacterized protein, contains archaeal-type C2H2 Zn-finger  [General function prediction only]. 	65
226536	COG4050	COG4050	Uncharacterized protein [Function unknown]. 	152
226537	COG4051	COG4051	Uncharacterized protein [Function unknown]. 	202
226538	COG4052	COG4052	Uncharacterized protein related to methyl coenzyme M reductase subunit C, methanogenesis marker protein 7  [General function prediction only]. 	310
226539	COG4053	COG4053	Uncharacterized protein [Function unknown]. 	244
226540	COG4054	McrB	Methyl coenzyme M reductase, beta subunit  [Coenzyme transport and metabolism]. 	447
226541	COG4055	McrD	Methyl coenzyme M reductase, subunit D  [Coenzyme transport and metabolism]. 	165
226542	COG4056	McrC	Methyl coenzyme M reductase, subunit C  [Coenzyme transport and metabolism]. 	204
226543	COG4057	McrG	Methyl coenzyme M reductase, gamma subunit  [Coenzyme transport and metabolism]. 	257
226544	COG4058	McrA	Methyl coenzyme M reductase, alpha subunit  [Coenzyme transport and metabolism]. 	553
226545	COG4059	MtrE	Tetrahydromethanopterin S-methyltransferase, subunit E  [Coenzyme transport and metabolism]. 	304
226546	COG4060	MtrD	Tetrahydromethanopterin S-methyltransferase, subunit D  [Coenzyme transport and metabolism]. 	230
226547	COG4061	MtrC	Tetrahydromethanopterin S-methyltransferase, subunit C  [Coenzyme transport and metabolism]. 	262
226548	COG4062	MtrB	Tetrahydromethanopterin S-methyltransferase, subunit B  [Coenzyme transport and metabolism]. 	108
226549	COG4063	MtrA	Tetrahydromethanopterin S-methyltransferase, subunit A  [Coenzyme transport and metabolism]. 	238
226550	COG4064	MtrG	Tetrahydromethanopterin S-methyltransferase, subunit G  [Coenzyme transport and metabolism]. 	75
226551	COG4065	COG4065	Uncharacterized protein [Function unknown]. 	480
226552	COG4066	COG4066	Uncharacterized protein, UPF0305 family [Function unknown]. 	165
226553	COG4067	COG4067	Uncharacterized conserved protein [Function unknown]. 	162
226554	COG4068	COG4068	Predicted nucleic acid-binding protein, contains Zn-ribbon domain [General function prediction only]. 	64
226555	COG4069	COG4069	Uncharacterized protein [Function unknown]. 	367
226556	COG4070	COG4070	Uncharacterized protein, methanogenesis marker protein 3, UPF0288 family [Function unknown]. 	512
226557	COG4071	COG4071	Uncharacterized protein, related to F420-0:gamma-glutamyl ligase [Function unknown]. 	278
226558	COG4072	COG4072	Uncharacterized protein [Function unknown]. 	161
226559	COG4073	COG4073	Uncharacterized protein [Function unknown]. 	198
226560	COG4074	Mth	5,10-methenyltetrahydromethanopterin hydrogenase [Energy production and conversion]. 	343
226561	COG4075	COG4075	Uncharacterized protein, distantly related to nitrogen regulatory protein PII [Function unknown]. 	110
226562	COG4076	COG4076	Predicted RNA methylase  [General function prediction only]. 	252
226563	COG4077	COG4077	Uncharacterized protein [Function unknown]. 	156
226564	COG4078	EhaH	Energy-converting hydrogenase Eha subunit H [Energy production and conversion]. 	221
226565	COG4079	COG4079	Uncharacterized protein [Function unknown]. 	293
226566	COG4080	COG4080	SpoU rRNA Methylase family enzyme  [Translation, ribosomal structure and biogenesis]. 	147
226567	COG4081	COG4081	Uncharacterized protein [Function unknown]. 	148
226568	COG4083	COG4083	Exosortase/Archaeosortase [Replication, recombination and repair]. 	239
226569	COG4084	COG4084	Energy-converting hydrogenase A subunit M [Energy production and conversion, Posttranslational modification, protein turnover, chaperones]. 	135
226570	COG4085	YhcR	DNA/RNA endonuclease YhcR, contains UshA esterase domain [RNA processing and modification]. 	204
226571	COG4086	YpuA	Uncharacterized protein YpuA, DUF1002 family [Function unknown]. 	299
226572	COG4087	COG4087	Soluble P-type ATPase  [General function prediction only]. 	152
226573	COG4088	Kti12	tRNA Uridine 5-carbamoylmethylation protein Kti12 (Killer toxin insensitivity protein) [Translation, ribosomal structure and biogenesis]. 	261
226574	COG4089	COG4089	Uncharacterized membrane protein  [Function unknown]. 	235
226575	COG4090	COG4090	Uncharacterized protein [Function unknown]. 	154
226576	COG4091	COG4091	Predicted homoserine dehydrogenase, contains C-terminal SAF domain  [Amino acid transport and metabolism]. 	438
226577	COG4092	COG4092	Predicted glycosyltransferase involved in capsule biosynthesis  [Cell wall/membrane/envelope biogenesis]. 	346
226578	COG4093	COG4093	Uncharacterized protein [Function unknown]. 	338
226579	COG4094	COG4094	Uncharacterized membrane protein [Function unknown]. 	219
226580	COG4095	SWEET	Sugar transporter, SemiSWEET family, contains PQ motif [Carbohydrate transport and metabolism]. 	89
226581	COG4096	HsdR	Type I site-specific restriction endonuclease, part of a restriction-modification system [Defense mechanisms]. 	875
226582	COG4097	COG4097	Predicted ferric reductase  [Inorganic ion transport and metabolism]. 	438
226583	COG4098	comFA	Superfamily II DNA/RNA helicase required for DNA uptake (late competence protein)  [Replication, recombination and repair]. 	441
226584	COG4099	COG4099	Predicted peptidase  [General function prediction only]. 	387
226585	COG4100	YnbB	Cystathionine beta-lyase family protein involved in aluminum resistance  [Inorganic ion transport and metabolism, General function prediction only]. 	416
226586	COG4101	RmlC	Uncharacterized protein, RmlC-like cupin domain [General function prediction only]. 	142
226587	COG4102	COG4102	Uncharacterized conserved protein, DUF1501 family [Function unknown]. 	418
226588	COG4103	TerB	Uncharacterized conserved protein, tellurite resistance protein B (TerB) family [Function unknown]. 	148
226589	COG4104	PAAR	Zn-binding Pro-Ala-Ala-Arg (PAAR) domain, involved in Type VI secretion [Intracellular trafficking, secretion, and vesicular transport]. 	98
226590	COG4105	BamD	Outer membrane protein assembly factor BamD, BamD/ComL family [Cell wall/membrane/envelope biogenesis]. 	254
226591	COG4106	Tam	Trans-aconitate methyltransferase [Energy production and conversion]. 	257
226592	COG4107	PhnK	ABC-type phosphonate transport system, ATPase component [Inorganic ion transport and metabolism]. 	258
226593	COG4108	PrfC	Peptide chain release factor RF-3 [Translation, ribosomal structure and biogenesis]. 	528
226594	COG4109	YtoI	Predicted transcriptional regulator containing CBS domains  [Transcription]. 	432
226595	COG4110	TerA	Uncharacterized protein involved in tellurium resistance [Defense mechanisms]. 	200
226596	COG4111	COG4111	Uncharacterized conserved protein  [Function unknown]. 	322
226597	COG4112	YmaB	Predicted phosphoesterase, NUDIX family  [General function prediction only]. 	203
226598	COG4113	COG4113	Predicted nucleic acid-binding protein, contains PIN domain  [General function prediction only]. 	134
226599	COG4114	FhuF	Ferric iron reductase protein FhuF, involved in iron transport [Inorganic ion transport and metabolism]. 	251
226600	COG4115	COG4115	Toxin component of the Txe-Axe toxin-antitoxin module, Txe/YoeB family   [Defense mechanisms]. 	84
226601	COG4116	YjbK	Predicted triphosphatase or cyclase YjbK, contains CYTH domain [General function prediction only]. 	193
226602	COG4117	YdhU	Thiosulfate reductase cytochrome b subunit [Inorganic ion transport and metabolism]. 	221
226603	COG4118	Phd	Antitoxin component of toxin-antitoxin stability system, DNA-binding transcriptional repressor [Defense mechanisms]. 	84
226604	COG4119	COG4119	Predicted NTP pyrophosphohydrolase, NUDIX family  [Nucleotide transport and metabolism, General function prediction only]. 	161
226605	COG4120	COG4120	ABC-type uncharacterized transport system, permease component  [General function prediction only]. 	293
226606	COG4121	MnmC	tRNA U34 5-methylaminomethyl-2-thiouridine-forming methyltransferase MnmC [Translation, ribosomal structure and biogenesis]. 	252
226607	COG4122	YrrM	Predicted O-methyltransferase YrrM [General function prediction only]. 	219
226608	COG4123	TrmN6	tRNA1(Val) A37 N6-methylase TrmN6 [Translation, ribosomal structure and biogenesis]. 	248
226609	COG4124	ManB2	Beta-mannanase  [Carbohydrate transport and metabolism]. 	355
226610	COG4125	COG4125	Uncharacterized membrane protein  [Function unknown]. 	149
226611	COG4126	Dcg1	Asp/Glu/hydantoin racemase [Amino acid transport and metabolism]. 	230
226612	COG4127	COG4127	Predicted restriction endonuclease, Mrr-cat superfamily [General function prediction only]. 	318
226613	COG4128	Zot	Zona occludens toxin, predicted ATPase [General function prediction only]. 	398
226614	COG4129	YgaE	Uncharacterized membrane protein YgaE, UPF0421/DUF939 family [Function unknown]. 	332
226615	COG4130	COG4130	Predicted sugar epimerase, xylose isomerase-like family [Carbohydrate transport and metabolism]. 	272
226616	COG4132	COG4132	ABC-type uncharacterized transport system, permease component  [General function prediction only]. 	282
226617	COG4133	CcmA	ABC-type transport system involved in cytochrome c biogenesis, ATPase component [Posttranslational modification, protein turnover, chaperones]. 	209
226618	COG4134	YnjB	ABC-type uncharacterized transport system YnjBCD, periplasmic component [General function prediction only]. 	384
226619	COG4135	YnjC	ABC-type uncharacterized transport system YnjBCD, permease component [General function prediction only]. 	551
226620	COG4136	YnjD	ABC-type uncharacterized transport system YnjBCD, ATPase component [General function prediction only]. 	213
226621	COG4137	YpjD	ABC-type uncharacterized transport system, permease component [General function prediction only]. 	265
226622	COG4138	BtuD	ABC-type cobalamin transport system, ATPase component [Coenzyme transport and metabolism]. 	248
226623	COG4139	BtuC	ABC-type cobalamin transport system, permease component [Coenzyme transport and metabolism]. 	326
226624	COG4143	TbpA	ABC-type thiamine transport system, periplasmic component [Coenzyme transport and metabolism]. 	336
226625	COG4145	PanF	Na+/pantothenate symporter [Coenzyme transport and metabolism]. 	473
226626	COG4146	YidK	Uncharacterized membrane permease YidK, sodium:solute symporter family  [General function prediction only]. 	571
226627	COG4147	ActP	Na+(or H+)/acetate symporter ActP [Energy production and conversion]. 	529
226628	COG4148	ModC	ABC-type molybdate transport system, ATPase component [Inorganic ion transport and metabolism]. 	352
226629	COG4149	ModC	ABC-type molybdate transport system, permease component [Inorganic ion transport and metabolism]. 	225
226630	COG4150	CysP	ABC-type sulfate transport system, periplasmic component [Inorganic ion transport and metabolism]. 	341
226631	COG4152	YhaQ	ABC-type uncharacterized transport system, ATPase component  [General function prediction only]. 	300
226632	COG4154	FucU	L-fucose mutarotase/ribose pyranase, RbsD/FucU family [Carbohydrate transport and metabolism]. 	144
226633	COG4158	COG4158	Predicted ABC-type sugar transport system, permease component  [General function prediction only]. 	329
226634	COG4160	ArtM	ABC-type arginine/histidine transport system, permease component [Amino acid transport and metabolism]. 	228
226635	COG4161	ArtP	ABC-type arginine transport system, ATPase component [Amino acid transport and metabolism]. 	242
226636	COG4166	OppA	ABC-type oligopeptide transport system, periplasmic component [Amino acid transport and metabolism]. 	562
226637	COG4167	SapF	ABC-type antimicrobial peptide transport system, ATPase component [Defense mechanisms]. 	267
226638	COG4168	SapB	ABC-type antimicrobial peptide transport system, permease component [Defense mechanisms]. 	321
226639	COG4170	SapD	ABC-type antimicrobial peptide transport system, ATPase component [Defense mechanisms]. 	330
226640	COG4171	SapC	ABC-type antimicrobial peptide transport system, permease component [Defense mechanisms]. 	296
226641	COG4172	YejF	ABC-type microcin C transport system, duplicated ATPase component YejF [Secondary metabolites biosynthesis, transport and catabolism]. 	534
226642	COG4174	YejB	ABC-type microcin C transport system, permease component YejB [Secondary metabolites biosynthesis, transport and catabolism]. 	364
226643	COG4175	ProV	ABC-type proline/glycine betaine transport system, ATPase component [Amino acid transport and metabolism]. 	386
226644	COG4176	ProW	ABC-type proline/glycine betaine transport system, permease component [Amino acid transport and metabolism]. 	290
226645	COG4177	LivM	ABC-type branched-chain amino acid transport system, permease component [Amino acid transport and metabolism]. 	314
226646	COG4178	YddA	ABC-type uncharacterized transport system, permease and ATPase components [General function prediction only]. 	604
226647	COG4181	YbbA	Predicted ABC-type transport system involved in lysophospholipase L1 biosynthesis, ATPase component [Secondary metabolites biosynthesis, transport and catabolism]. 	228
226648	COG4185	COG4185	Predicted ABC-type ATPase [General function prediction only]. 	187
226649	COG4186	COG4186	Calcineurin-like phosphoesterase superfamily protein [General function prediction only]. 	186
226650	COG4187	RocB	Arginine utilization protein RocB  [Amino acid transport and metabolism]. 	553
226651	COG4188	COG4188	Predicted dienelactone hydrolase  [General function prediction only]. 	365
226652	COG4189	COG4189	Predicted transcriptional regulator  [Transcription]. 	308
226653	COG4190	COG4190	Predicted transcriptional regulator  [Transcription]. 	144
226654	COG4191	COG4191	Signal transduction histidine kinase regulating C4-dicarboxylate transport system  [Signal transduction mechanisms]. 	603
226655	COG4192	COG4192	Signal transduction histidine kinase regulating phosphoglycerate transport system  [Signal transduction mechanisms]. 	673
226656	COG4193	LytD	Beta-N-acetylglucosaminidase  [Carbohydrate transport and metabolism]. 	245
226657	COG4194	COG4194	Uncharacterized membrane protein, DUF1648 family [Function unknown]. 	350
226658	COG4195	YjqB	Phage-related replication protein YjqB, UPF0714/DUF867 family [Mobilome: prophages, transposons]. 	208
226659	COG4196	COG4196	Uncharacterized conserved protein, DUF2126 family [Function unknown]. 	808
226660	COG4197	YdaS	DNA-binding transcriptional regulator YdaS, prophage-encoded, Cro superfamily [Transcription]. 	96
226661	COG4198	COG4198	Uncharacterized conserved protein, DUF1015 family [Function unknown]. 	405
226662	COG4199	RecJ	ssDNA-specific exonuclease RecJ [Replication, recombination and repair]. 	201
226663	COG4200	EfiE	Predicted lantibiotic-exporting membrane permease, EfiE/EfiG/ABC2 family  [Defense mechanisms]. 	239
226664	COG4206	BtuB	Outer membrane cobalamin receptor protein [Coenzyme transport and metabolism]. 	608
226665	COG4208	CysW	ABC-type sulfate transport system, permease component [Inorganic ion transport and metabolism]. 	287
226666	COG4209	LplB	ABC-type polysaccharide transport system, permease component  [Carbohydrate transport and metabolism]. 	309
226667	COG4211	MglC	ABC-type glucose/galactose transport system, permease component [Carbohydrate transport and metabolism]. 	336
226668	COG4213	XylF	ABC-type xylose transport system, periplasmic component [Carbohydrate transport and metabolism]. 	341
226669	COG4214	XylH	ABC-type xylose transport system, permease component [Carbohydrate transport and metabolism]. 	394
226670	COG4215	ArtQ	ABC-type arginine transport system, permease component [Amino acid transport and metabolism]. 	230
226671	COG4218	MtrF	Tetrahydromethanopterin S-methyltransferase, subunit F  [Coenzyme transport and metabolism]. 	73
226672	COG4219	MecR1	Signal transducer regulating beta-lactamase production, contains  metallopeptidase domain [Signal transduction mechanisms]. 	337
226673	COG4220	Nu1	Phage DNA packaging protein, Nu1 subunit of terminase [Mobilome: prophages, transposons]. 	174
226674	COG4221	YdfG	NADP-dependent 3-hydroxy acid dehydrogenase YdfG [Energy production and conversion]. 	246
226675	COG4222	COG4222	Uncharacterized conserved protein [Function unknown]. 	391
226676	COG4223	COG4223	Uncharacterized conserved protein [Function unknown]. 	422
226677	COG4224	YnzC	Uncharacterized protein YnzC, UPF0291/DUF896 family [Function unknown]. 	77
226678	COG4225	YesR	Rhamnogalacturonyl hydrolase YesR [Carbohydrate transport and metabolism]. 	357
226679	COG4226	HicB	Predicted nuclease of the RNAse H fold, HicB family  [General function prediction only]. 	111
226680	COG4227	ArdC	Antirestriction protein ArdC [Replication, recombination and repair]. 	316
226681	COG4228	COG4228	Mu-like prophage DNA circulation protein  [Mobilome: prophages, transposons]. 	451
226682	COG4229	Utr4	Enolase-phosphatase E1 involved in methionine salvage [Amino acid transport and metabolism]. 	229
226683	COG4230	PutA2	Delta 1-pyrroline-5-carboxylate dehydrogenase [Amino acid transport and metabolism]. 	769
226684	COG4231	IorA	TPP-dependent indolepyruvate ferredoxin oxidoreductase, alpha subunit  [Energy production and conversion]. 	640
226685	COG4232	DsbD	Thiol:disulfide interchange protein [Posttranslational modification, protein turnover, chaperones]. 	569
226686	COG4233	COG4233	Thiol-disulfide interchange protein, contains DsbC and DsbD domains [Posttranslational modification, protein turnover, chaperones, Energy production and conversion]. 	273
226687	COG4235	NrfG	Cytochrome c-type biogenesis protein CcmH/NrfG [Energy production and conversion, Posttranslational modification, protein turnover, chaperones]. 	287
226688	COG4237	HyfE	Hydrogenase-4 membrane subunit HyfE [Energy production and conversion]. 	218
226689	COG4238	Lpp	Outer membrane murein-binding lipoprotein Lpp [Cell wall/membrane/envelope biogenesis]. 	78
226690	COG4239	YejE	ABC-type microcin C transport system, permease component YejE [Secondary metabolites biosynthesis, transport and catabolism]. 	341
226691	COG4240	Tda10	Pantothenate kinase-related protein Tda10 (topoisomerase I damage affected protein) [General function prediction only]. 	300
226692	COG4241	YybS	Uncharacterized conserved protein YybS, DUF2232 family [Function unknown]. 	314
226693	COG4242	CphB	Cyanophycinase and related exopeptidases  [Secondary metabolites biosynthesis, transport and catabolism, General function prediction only]. 	293
226694	COG4243	COG4243	Uncharacterized membrane protein  [Function unknown]. 	156
226695	COG4244	COG4244	Uncharacterized membrane protein [Function unknown]. 	160
226696	COG4245	TerY	Uncharacterized conserved protein YegL, contains vWA domain of TerY type  [Function unknown]. 	207
226697	COG4246	COG4246	Uncharacterized protein [Function unknown]. 	340
226698	COG4247	Phy	3-phytase (myo-inositol-hexaphosphate 3-phosphohydrolase)  [Lipid transport and metabolism]. 	364
226699	COG4248	YegI	Uncharacterized protein with protein kinase and helix-hairpin-helix DNA-binding domains [General function prediction only]. 	637
226700	COG4249	COG4249	Uncharacterized protein, contains caspase domain  [General function prediction only]. 	380
226701	COG4250	DICT	Sensory domain  found in diguanylate cyclases and two-component systems (DICT domain) [Signal transduction mechanisms]. 	226
226702	COG4251	COG4251	Bacteriophytochrome (light-regulated signal transduction histidine kinase)  [Signal transduction mechanisms]. 	750
226703	COG4252	CHASE2	Extracellular (periplasmic) sensor domain CHASE2 (specificity unknown) [Signal transduction mechanisms]. 	400
226704	COG4253	COG4253	Uncharacterized conserved protein, DUF2345 family [Function unknown]. 	278
226705	COG4254	COG4254	Uncharacterized conserved protein, contains LysM and FecR  domains [General function prediction only]. 	339
226706	COG4255	COG4255	Uncharacterized protein [Function unknown]. 	318
226707	COG4256	HemP	Hemin uptake protein HemP [Coenzyme transport and metabolism]. 	63
226708	COG4257	Vgb	Streptogramin lyase  [Defense mechanisms]. 	353
226709	COG4258	COG4258	Predicted exporter  [General function prediction only]. 	788
226710	COG4259	COG4259	Uncharacterized protein [Function unknown]. 	121
226711	COG4260	YdjI	Membrane protease subunit, stomatin/prohibitin family, contains C-terminal Zn-ribbon domain  [Posttranslational modification, protein turnover, chaperones]. 	345
226712	COG4261	COG4261	Predicted acyltransferase, LPLAT superfamily  [General function prediction only]. 	309
226713	COG4262	COG4262	Predicted spermidine synthase with an N-terminal membrane domain  [General function prediction only]. 	508
226714	COG4263	NosZ	Nitrous oxide reductase  [Inorganic ion transport and metabolism]. 	637
226715	COG4264	RhbC	Siderophore synthetase component  [Inorganic ion transport and metabolism]. 	602
226716	COG4266	Alc	Allantoicase  [Nucleotide transport and metabolism]. 	334
226717	COG4267	COG4267	Uncharacterized membrane protein [Function unknown]. 	467
226718	COG4268	McrC	5-methylcytosine-specific restriction endonuclease McrBC, regulatory subunit McrC [Defense mechanisms]. 	439
226719	COG4269	YjgN	Uncharacterized membrane protein YjgN, DUF898 family [Function unknown]. 	364
226720	COG4270	COG4270	Uncharacterized membrane protein  [Function unknown]. 	131
226721	COG4271	COG4271	Predicted nucleotide-binding protein containing TIR -like domain  [General function prediction only]. 	233
226722	COG4272	COG4272	Uncharacterized membrane protein  [Function unknown]. 	125
226723	COG4273	COG4273	Uncharacterized protein, contains metal-binding DGC domain  [Function unknown]. 	135
226724	COG4274	COG4274	Uncharacterized protein, contains GYD domain [Function unknown]. 	104
226725	COG4275	ChrB1	Chromate resistance protein ChrB1 [Inorganic ion transport and metabolism]. 	143
226726	COG4276	SRPBCC	Ligand-binding SRPBCC domain  [General function prediction only]. 	153
226727	COG4277	COG4277	Predicted DNA-binding protein with the Helix-hairpin-helix motif  [General function prediction only]. 	404
226728	COG4278	COG4278	Uncharacterized protein [Function unknown]. 	269
226729	COG4279	COG4279	Uncharacterized conserved protein, contains Zn finger domain [Function unknown]. 	266
226730	COG4280	COG4280	Uncharacterized membrane protein  [Function unknown]. 	236
226731	COG4281	ACB	Acyl-CoA-binding protein  [Lipid transport and metabolism]. 	87
226732	COG4282	SMI1	Cell wall assembly regulator SMI1 [Cell wall/membrane/envelope biogenesis]. 	191
226733	COG4283	DinB	Uncharacterized protein DinB, DUF1706 family [Function unknown]. 	170
226734	COG4284	QRI1	UDP-N-acetylglucosamine pyrophosphorylase [Carbohydrate transport and metabolism]. 	472
226735	COG4285	COG4285	Uncharacterized conserved protein, contains N-terminal glutamine amidotransferase (GATase1)-like domain  [General function prediction only]. 	253
226736	COG4286	COG4286	Uncharacterized protein, UPF0160 family [Function unknown]. 	306
226737	COG4287	PqaA	PhoPQ-activated pathogenicity-related protein  [General function prediction only]. 	507
226738	COG4288	COG4288	Uncharacterized protein [Function unknown]. 	124
226739	COG4289	COG4289	Uncharacterized protein [Function unknown]. 	458
226740	COG4290	COG4290	Guanyl-specific ribonuclease Sa  [Nucleotide transport and metabolism]. 	152
226741	COG4291	COG4291	Uncharacterized membrane protein  [Function unknown]. 	228
226742	COG4292	LtrA	Low temperature requirement protein LtrA (function unknown) [Function unknown]. 	387
226743	COG4293	COG4293	Uncharacterized protein, DUF1802 family [Function unknown]. 	184
226744	COG4294	Uve	UV DNA damage repair endonuclease  [Replication, recombination and repair]. 	347
226745	COG4295	COG4295	Uncharacterized protein [Function unknown]. 	285
226746	COG4296	COG4296	Uncharacterized protein [Function unknown]. 	156
226747	COG4297	YjlB	Uncharacterized protein YjlB, cupin superfamily [Function unknown]. 	163
226748	COG4298	COG4298	Uncharacterized protein [Function unknown]. 	95
226749	COG4299	COG4299	Predicted acyltransferase [General function prediction only]. 	371
226750	COG4300	CadD	Cadmium resistance protein CadD, predicted permease [Inorganic ion transport and metabolism]. 	205
226751	COG4301	COG4301	Uncharacterized conserved protein, contains predicted SAM-dependent methyltransferase domain [General function prediction only]. 	321
226752	COG4302	EutC	Ethanolamine ammonia-lyase, small subunit [Amino acid transport and metabolism]. 	294
226753	COG4303	EutB	Ethanolamine ammonia-lyase, large subunit [Amino acid transport and metabolism]. 	453
226754	COG4304	COG4304	Uncharacterized protein [Function unknown]. 	166
226755	COG4305	YoaJ	Peptidoglycan-binding domain, expansin [Cell wall/membrane/envelope biogenesis]. 	232
226756	COG4306	COG4306	Uncharacterized protein [Function unknown]. 	160
226757	COG4307	COG4307	Uncharacterized protein, DUF2248 family [Function unknown]. 	349
226758	COG4308	LimA	Limonene-1,2-epoxide hydrolase  [Secondary metabolites biosynthesis, transport and catabolism]. 	130
226759	COG4309	COG4309	Uncharacterized conserved protein, DUF2249 family [Function unknown]. 	98
226760	COG4310	COG4310	Uncharacterized protein, contains an aminopeptidase-like domain  [General function prediction only]. 	435
226761	COG4311	SoxD	Sarcosine oxidase delta subunit  [Amino acid transport and metabolism]. 	97
226762	COG4312	COG4312	Predicted dithiol-disulfide oxidoreductase, DUF899 family [General function prediction only]. 	247
226763	COG4313	SphA	Uncharacterized conserved protein [Function unknown]. 	304
226764	COG4314	NosL	Nitrous oxide reductase accessory protein NosL [Inorganic ion transport and metabolism]. 	176
226765	COG4315	COG4315	Predicted lipoprotein with conserved Yx(FWY)xxD motif (function unknown) [Function unknown]. 	138
226766	COG4316	COG4316	Uncharacterized protein [Function unknown]. 	138
226767	COG4317	XapX	Xanthosine utilization system component, XapX domain [Nucleotide transport and metabolism]. 	93
226768	COG4318	COG4318	Uncharacterized protein [Function unknown]. 	221
226769	COG4319	YybH	Ketosteroid isomerase homolog  [General function prediction only]. 	137
226770	COG4320	COG4320	Uncharacterized conserved protein, DUF2252 family [Function unknown]. 	410
226771	COG4321	COG4321	Predicted DNA-binding protein, contains Ribbon-helix-helix (RHH) domain [General function prediction only]. 	102
226772	COG4322	COG4322	Uncharacterized protein [Function unknown]. 	304
226773	COG4323	COG4323	Uncharacterized protein [Function unknown]. 	105
226774	COG4324	COG4324	Predicted aminopeptidase  [General function prediction only]. 	376
226775	COG4325	COG4325	Uncharacterized membrane protein  [Function unknown]. 	464
226776	COG4326	Spo0M	Sporulation-control protein spo0M  [Cell cycle control, cell division, chromosome partitioning]. 	270
226777	COG4327	COG4327	Uncharacterized membrane protein  [Function unknown]. 	101
226778	COG4328	COG4328	Predicted nuclease (RNAse H fold)  [General function prediction only]. 	266
226779	COG4329	COG4329	Uncharacterized membrane protein  [Function unknown]. 	160
226780	COG4330	COG4330	Uncharacterized membrane protein [Function unknown]. 	211
226781	COG4331	COG4331	Uncharacterized membrane protein [Function unknown]. 	167
226782	COG4332	COG4332	Uncharacterized protein [Function unknown]. 	203
226783	COG4333	COG4333	Uncharacterized protein [Function unknown]. 	167
226784	COG4334	COG4334	Uncharacterized protein [Function unknown]. 	131
226785	COG4335	AlkC	3-methyladenine DNA glycosylase AlkC [Replication, recombination and repair]. 	167
226786	COG4336	YcsI	Uncharacterized protein YcsI, UPF0317 family [Function unknown]. 	265
226787	COG4337	COG4337	Uncharacterized protein [Function unknown]. 	206
226788	COG4338	COG4338	Uncharacterized protein, DUF2256 family [Function unknown]. 	54
226789	COG4339	COG4339	Predicted metal-dependent phosphohydrolase, HD superfamily [General function prediction only]. 	208
226790	COG4340	COG4340	Predicted dioxygenase, 2-oxoglutarate and Fe-dependent (2OG-Fe) dioxygenase superfamily [General function prediction only]. 	226
226791	COG4341	COG4341	Predicted HD phosphohydrolase  [General function prediction only]. 	186
226792	COG4342	COG4342	Integrase/Recombinase [Mobilome: prophages, transposons]. 	291
226793	COG4343	Cas4	CRISPR/Cas system-associated exonuclease Cas4, RecB family [Defense mechanisms]. 	281
226794	COG4344	COG4344	Predicted transcriptional regulator, contains HTH domain [Transcription]. 	175
226795	COG4345	COG4345	Uncharacterized protein [Function unknown]. 	181
226796	COG4346	COG4346	Predicted membrane-bound dolichyl-phosphate-mannose-protein mannosyltransferase  [Posttranslational modification, protein turnover, chaperones]. 	438
226797	COG4347	YpjA	Uncharacterized membrane protein YpjA [Function unknown]. 	200
226798	COG4352	RPL13	Ribosomal protein L13E  [Translation, ribosomal structure and biogenesis]. 	113
226799	COG4353	COG4353	Uncharacterized protein [Function unknown]. 	192
226800	COG4354	COG4354	Uncharacterized protein, contains GBA2_N and DUF608 domains  [Function unknown]. 	721
226801	COG4357	COG4357	Uncharacterized protein, contains Zn-finger domain of CHY type  [Function unknown]. 	105
226802	COG4359	MtnX	2-hydroxy-3-keto-5-methylthiopentenyl-1-phosphate phosphatase (methionine salvage) [Amino acid transport and metabolism]. 	220
226803	COG4360	APA2	ATP adenylyltransferase (5',5'''-P-1,P-4-tetraphosphate phosphorylase II)  [Nucleotide transport and metabolism]. 	298
226804	COG4362	COG4362	Nitric oxide synthase, oxygenase domain  [Inorganic ion transport and metabolism]. 	355
226805	COG4365	YllA	Uncharacterized protein YllA, UPF0747 family [Function unknown]. 	537
226806	COG4367	COG4367	Uncharacterized protein [Function unknown]. 	97
226807	COG4370	COG4370	Uncharacterized protein [Function unknown]. 	412
226808	COG4371	COG4371	Uncharacterized membrane protein [Function unknown]. 	334
226809	COG4372	COG4372	Uncharacterized conserved protein, contains DUF3084 domain [Function unknown]. 	499
226810	COG4373	COG4373	Mu-like prophage FluMu protein gp28  [Mobilome: prophages, transposons]. 	509
226811	COG4374	COG4374	PIN domain nuclease, a component of toxin-antitoxin system (PIN domain)  [Defense mechanisms]. 	130
226812	COG4377	YhfC	Uncharacterized membrane protein YhfC [Function unknown]. 	258
226813	COG4378	COG4378	Uncharacterized protein [Function unknown]. 	103
226814	COG4379	COG4379	Mu-like prophage tail protein gpP  [Mobilome: prophages, transposons]. 	386
226815	COG4380	COG4380	Uncharacterized protein [Function unknown]. 	216
226816	COG4381	gp46	Mu-like prophage protein gp46  [Mobilome: prophages, transposons]. 	135
226817	COG4382	gp16	Mu-like prophage protein gp16  [Mobilome: prophages, transposons]. 	170
226818	COG4383	gp29	Mu-like prophage protein gp29  [Mobilome: prophages, transposons]. 	517
226819	COG4384	gp45	Mu-like prophage protein gp45  [Mobilome: prophages, transposons]. 	203
226820	COG4385	gpI	Bacteriophage P2-related tail formation protein  [Mobilome: prophages, transposons]. 	206
226821	COG4386	COG4386	Mu-like prophage tail sheath protein gpL  [Mobilome: prophages, transposons]. 	487
226822	COG4387	gp436	Mu-like prophage protein gp36  [Mobilome: prophages, transposons]. 	139
226823	COG4388	COG4388	Mu-like prophage I protein  [Mobilome: prophages, transposons]. 	357
226824	COG4389	COG4389	Site-specific recombinase  [Replication, recombination and repair]. 	677
226825	COG4390	COG4390	Uncharacterized protein [Function unknown]. 	106
226826	COG4391	COG4391	Uncharacterized conserved protein, contains Zn-finger domain  [Function unknown]. 	62
226827	COG4392	AzlD2	Branched-chain amino acid transport protein [Amino acid transport and metabolism]. 	107
226828	COG4393	COG4393	Uncharacterized membrane protein  [Function unknown]. 	405
226829	COG4394	EarP	Elongation-Factor P (EF-P) rhamnosyltransferase EarP [Translation, ribosomal structure and biogenesis]. 	370
226830	COG4395	Tim44	Predicted lipid-binding transport protein, Tim44 family [Lipid transport and metabolism]. 	281
226831	COG4396	COG4396	Mu-like prophage host-nuclease inhibitor protein Gam  [Mobilome: prophages, transposons]. 	170
226832	COG4397	COG4397	Mu-like prophage major head subunit gpT  [Mobilome: prophages, transposons]. 	308
226833	COG4398	FIST	Small ligand-binding sensory domain FIST [Signal transduction mechanisms]. 	389
226834	COG4399	YheB	Uncharacterized membrane protein YheB, UPF0754 family [Function unknown]. 	376
226835	COG4401	AroH	Chorismate mutase  [Amino acid transport and metabolism]. 	125
226836	COG4402	COG4402	Uncharacterized protein [Function unknown]. 	457
226837	COG4403	LcnDR2	Lantibiotic modifying enzyme  [Defense mechanisms]. 	963
226838	COG4405	YhfF	Predicted RNA-binding protein YhfF, contains PUA-like ASCH domain [General function prediction only]. 	140
226839	COG4408	COG4408	Uncharacterized protein [Function unknown]. 	431
226840	COG4409	NanH	Neuraminidase (sialidase)  [Carbohydrate transport and metabolism, Cell wall/membrane/envelope biogenesis]. 	728
226841	COG4412	COG4412	Bacillopeptidase F, M6 metalloprotease family  [Posttranslational modification, protein turnover, chaperones]. 	760
226842	COG4413	Utp	Urea transporter  [Amino acid transport and metabolism]. 	319
226843	COG4416	Com	Mu-like prophage FluMu protein Com [Mobilome: prophages, transposons]. 	60
226844	COG4420	COG4420	Uncharacterized membrane protein [Function unknown]. 	191
226845	COG4421	COG4421	Capsular polysaccharide biosynthesis protein  [Cell wall/membrane/envelope biogenesis]. 	368
226846	COG4422	COG4422	Bacteriophage protein gp37  [Mobilome: prophages, transposons]. 	250
226847	COG4423	COG4423	Uncharacterized protein [Function unknown]. 	81
226848	COG4424	LpsS	LPS sulfotransferase NodH [Cell wall/membrane/envelope biogenesis]. 	250
226849	COG4425	COG4425	Uncharacterized membrane protein [Function unknown]. 	588
226850	COG4427	COG4427	Uncharacterized protein [Function unknown]. 	350
226851	COG4430	YdeI	Uncharacterized conserved protein YdeI, YjbR/CyaY-like superfamily, DUF1801 family [Function unknown]. 	200
226852	COG4443	COG4443	Uncharacterized protein [Function unknown]. 	72
226853	COG4445	MiaE	tRNA isopentenyl-2-thiomethyl-A-37 hydroxylase MiaE (synthesis of 2-methylthio-cis-ribozeatin)  [Translation, ribosomal structure and biogenesis]. 	203
226854	COG4446	COG4446	Uncharacterized conserved protein, DUF1499 family [Function unknown]. 	141
226855	COG4447	COG4447	Uncharacterized protein related to plant photosystem II stability/assembly factor  [General function prediction only]. 	339
226856	COG4448	AnsA2	L-asparaginase II  [Amino acid transport and metabolism]. 	339
226857	COG4449	COG4449	Predicted protease, Abi (CAAX) family  [General function prediction only]. 	827
226858	COG4451	RbcS	Ribulose bisphosphate carboxylase small subunit  [Carbohydrate transport and metabolism]. 	127
226859	COG4452	CreD	Inner membrane protein involved in colicin E2 resistance [Defense mechanisms]. 	443
226860	COG4453	COG4453	Uncharacterized conserved protein, DUF1778 family [Function unknown]. 	95
226861	COG4454	COG4454	Uncharacterized copper-binding protein, cupredoxin-like subfamily [General function prediction only]. 	158
226862	COG4455	ImpE	Protein of avirulence locus involved in temperature-dependent protein secretion  [General function prediction only]. 	273
226863	COG4456	VagC	Virulence-associated protein VagC (function unknown) [Function unknown]. 	74
226864	COG4457	SrfB	Uncharacterized protein [Function unknown]. 	1014
226865	COG4458	SrfC	Uncharacterized protein [Function unknown]. 	821
226866	COG4459	NapE	Periplasmic nitrate reductase system, NapE component  [Energy production and conversion]. 	62
226867	COG4460	COG4460	Uncharacterized protein [Function unknown]. 	130
226868	COG4461	LprI	Uncharacterized protein LprI [Function unknown]. 	185
226869	COG4463	CtsR	Transcriptional regulator CtsR [Transcription]. 	153
226870	COG4464	YwqE	Tyrosine-protein phosphatase YwqE [Signal transduction mechanisms]. 	254
226871	COG4465	CodY	GTP-sensing pleiotropic transcriptional regulator CodY [Transcription]. 	261
226872	COG4466	COG4466	Uncharacterized protein Veg, DUF1021 family [Function unknown]. 	80
226873	COG4467	YabA	Regulator of replication initiation timing  [Replication, recombination and repair]. 	114
226874	COG4468	GalT2	Galactose-1-phosphate uridylyltransferase [Carbohydrate transport and metabolism]. 	503
226875	COG4469	CoiA	Competence protein CoiA-like family, contains a predicted nuclease domain  [General function prediction only]. 	342
226876	COG4470	YutD	Uncharacterized protein YutD, DUF1027 family [Function unknown]. 	126
226877	COG4471	YlbG	Uncharacterized protein YlbG, UPF0298 family [Function unknown]. 	90
226878	COG4472	IreB-like	IreB family regulatory phosphoprotein. IreB (EF1202) was characterized in Enterococcus faecalis as a small protein, well-conserved in the Firmicutes. It belongs to a system that includes the Ser/Thr protein kinase IreK, and phosphatase IreP, undergoes phosphorylation on threonine residues, and is involved in regulating cephalosporin resistance. This family was previously named DUF965 by Pfam model pfam06135	88
226879	COG4473	EcsB	Predicted ABC-type exoprotein transport system, permease component  [Intracellular trafficking, secretion, and vesicular transport]. 	379
226880	COG4474	YoqJ	Uncharacterized SPBc2 prophage-derived protein YoqJ [Mobilome: prophages, transposons]. 	180
226881	COG4475	YwlG	Uncharacterized protein YwlG, UPF0340 family [Function unknown]. 	180
226882	COG4476	YktA	Uncharacterized protein YktA, UPF0223 family [Function unknown]. 	90
226883	COG4477	EzrA	Septation ring formation regulator EzrA [Cell cycle control, cell division, chromosome partitioning]. 	570
226884	COG4478	COG4478	Uncharacterized membrane protein [Function unknown]. 	210
226885	COG4479	YozE	Uncharacterized protein YozE, UPF0346 family [Function unknown]. 	74
226886	COG4481	COG4481	Uncharacterized protein, DUF951 family [Function unknown]. 	60
226887	COG4483	YqgQ	Uncharacterized protein YqgQ, DUF910 family [Function unknown]. 	68
226888	COG4485	YfhO	Uncharacterized membrane protein YfhO [Function unknown]. 	858
226889	COG4487	COG4487	Uncharacterized protein, contains DUF2130 domain [Function unknown]. 	438
226890	COG4492	PheB	ACT domain-containing protein  [General function prediction only]. 	150
226891	COG4493	YktB	Uncharacterized protein YktB, UPF0637 family [Function unknown]. 	209
226892	COG4495	COG4495	Uncharacterized protein [Function unknown]. 	109
226893	COG4496	YerC	Predicted DNA-binding transcriptional regulator YerC, contains ArsR-like HTH domain [General function prediction only]. 	100
226894	COG4499	YukC	Uncharacterized membrane protein YukC  [Function unknown]. 	434
226895	COG4502	YorC	5'(3')-deoxyribonucleotidase  [Nucleotide transport and metabolism]. 	180
226896	COG4506	YwiB	Uncharacterized beta-barrel protein YwiB, DUF1934 family [Function unknown]. 	143
226897	COG4508	Dut2	Dimeric dUTPase, all-alpha-NTP-PPase (MazG) superfamily [Nucleotide transport and metabolism]. 	161
226898	COG4509	SrtB	class B sortase (surface protein transpeptidase) [Cell wall/membrane/envelope biogenesis]. 	244
226899	COG4512	AgrB	Accessory gene regulator protein AgrB [Transcription, Signal transduction mechanisms]. 	198
226900	COG4517	COG4517	Uncharacterized protein [Function unknown]. 	109
226901	COG4518	gp41	Mu-like prophage FluMu protein gp41  [Mobilome: prophages, transposons]. 	122
226902	COG4519	COG4519	Uncharacterized protein [Function unknown]. 	95
226903	COG4520	LipA17	Surface antigen  [Cell wall/membrane/envelope biogenesis]. 	136
226904	COG4521	TauA	ABC-type taurine transport system, periplasmic component [Inorganic ion transport and metabolism]. 	334
226905	COG4525	TauB	ABC-type taurine transport system, ATPase component [Inorganic ion transport and metabolism]. 	259
226906	COG4529	YdhS	Uncharacterized NAD(P)/FAD-binding protein YdhS [General function prediction only]. 	474
226907	COG4530	COG4530	Uncharacterized protein [Function unknown]. 	129
226908	COG4531	ZnuA	ABC-type Zn2+ transport system, periplasmic component/surface adhesin [Inorganic ion transport and metabolism]. 	318
226909	COG4533	SgrR	DNA-binding transcriptional regulator SgrR  of sgrS sRNA, contains a MarR-type HTH domain and a periplasmic-type solute-binding domain [Transcription]. 	564
226910	COG4535	CorC	Mg2+ and Co2+ transporter CorC, contains CBS pair and CorC-HlyC domains [Inorganic ion transport and metabolism]. 	293
226911	COG4536	CorB	Mg2+ and Co2+ transporter CorB, contains DUF21, CBS pair, and CorC-HlyC domains [Inorganic ion transport and metabolism]. 	423
226912	COG4537	ComGC	Competence protein ComGC  [Mobilome: prophages, transposons]. 	107
226913	COG4538	COG4538	Uncharacterized protein [Function unknown]. 	112
226914	COG4539	COG4539	Uncharacterized membrane protein YGL010W  [Function unknown]. 	180
226915	COG4540	gpV	Phage P2 baseplate assembly protein gpV  [Mobilome: prophages, transposons]. 	184
226916	COG4541	COG4541	Uncharacterized membrane protein [Function unknown]. 	100
226917	COG4542	PduX	Protein involved in propanediol utilization, and related proteins (includes coumermycin biosynthetic... [Secondary metabolites biosynthesis, transport and catabolism]. 	293
226918	COG4544	COG4544	Uncharacterized conserved protein [Function unknown]. 	260
226919	COG4545	COG4545	Glutaredoxin-related protein  [Posttranslational modification, protein turnover, chaperones]. 	85
226920	COG4547	CobT2	Cobalamin biosynthesis protein CobT (nicotinate-mononucleotide:5, 6-dimethylbenzimidazole phosphorib... [Coenzyme transport and metabolism]. 	620
226921	COG4548	NorD	Nitric oxide reductase activation protein  [Inorganic ion transport and metabolism]. 	637
226922	COG4549	YcnI	Uncharacterized protein YcnI, contains cohesin/reeler-like domain [Function unknown]. 	178
226923	COG4550	YmcA	Cell fate regulator YmcA, YheA/YmcA/DUF963 family (controls sporulation, competence, biofilm development) [Signal transduction mechanisms]. 	120
226924	COG4551	COG4551	Predicted protein tyrosine phosphatase  [General function prediction only]. 	109
226925	COG4552	Eis	Predicted acetyltransferase [General function prediction only]. 	389
226926	COG4553	DepA	Poly-beta-hydroxyalkanoate depolymerase  [Lipid transport and metabolism]. 	415
226927	COG4555	NatA	ABC-type Na+ transport system, ATPase component NatA [Energy production and conversion, Inorganic ion transport and metabolism]. 	245
226928	COG4558	ChuT	ABC-type hemin transport system, periplasmic component  [Inorganic ion transport and metabolism]. 	300
226929	COG4559	COG4559	ABC-type hemin transport system, ATPase component  [Inorganic ion transport and metabolism]. 	259
226930	COG4564	COG4564	Signal transduction histidine kinase  [Signal transduction mechanisms]. 	459
226931	COG4565	CitB	Response regulator of citrate/malate metabolism [Transcription, Signal transduction mechanisms]. 	224
226932	COG4566	FixJ	Two-component response regulator, FixJ family, consists of REC and HTH domains [Signal transduction mechanisms, Transcription]. 	202
226933	COG4567	COG4567	Two-component response regulator, ActR/RegA family, consists of REC and Fis-type HTH domains  [Signal transduction mechanisms, Transcription]. 	182
226934	COG4568	Rof	Transcriptional antiterminator Rof (Rho-off) [Transcription]. 	84
226935	COG4569	MhpF	Acetaldehyde dehydrogenase (acetylating) [Secondary metabolites biosynthesis, transport and catabolism]. 	310
226936	COG4570	RusA	Holliday junction resolvase RusA (prophage-encoded endonuclease) [Replication, recombination and repair]. 	132
226937	COG4571	OmpT	Outer membrane protease [Cell wall/membrane/envelope biogenesis]. 	314
226938	COG4572	ChaB	Cation transport regulator ChaB [Inorganic ion transport and metabolism]. 	76
226939	COG4573	GatZ	Tagatose-1,6-bisphosphate aldolase non-catalytic subunit AgaZ/GatZ [Carbohydrate transport and metabolism]. 	426
226940	COG4574	Eco	Serine protease inhibitor ecotin [Posttranslational modification, protein turnover, chaperones]. 	162
226941	COG4575	ElaB	Membrane-anchored ribosome-binding protein, inhibits growth in stationary phase, ElaB/YqjD/DUF883 family [Translation, ribosomal structure and biogenesis]. 	104
226942	COG4576	CcmL	Carboxysome shell and ethanolamine utilization microcompartment protein CcmK/EutM [Secondary metabolites biosynthesis, transport and catabolism, Energy production and conversion]. 	89
226943	COG4577	CcmK	Carboxysome shell and ethanolamine utilization microcompartment protein CcmL/EutN [Secondary metabolites biosynthesis, transport and catabolism, Energy production and conversion]. 	150
226944	COG4578	GutM	DNA-binding transcriptional regulator of glucitol operon [Transcription]. 	128
226945	COG4579	AceK	Isocitrate dehydrogenase kinase/phosphatase [Signal transduction mechanisms]. 	578
226946	COG4580	LamB	Maltoporin (phage lambda and maltose receptor) [Carbohydrate transport and metabolism]. 	429
226947	COG4581	Dob10	Superfamily II RNA helicase  [Replication, recombination and repair]. 	1041
226948	COG4582	ZapD	Cell division protein ZapD, interacts with FtsZ  [Cell cycle control, cell division, chromosome partitioning]. 	244
226949	COG4583	SoxG	Sarcosine oxidase gamma subunit  [Amino acid transport and metabolism]. 	189
226950	COG4584	COG4584	Transposase [Mobilome: prophages, transposons]. 	278
226951	COG4585	COG4585	Signal transduction histidine kinase  [Signal transduction mechanisms]. 	365
226952	COG4586	COG4586	ABC-type uncharacterized transport system, ATPase component  [General function prediction only]. 	325
226953	COG4587	COG4587	ABC-type uncharacterized transport system, permease component  [General function prediction only]. 	268
226954	COG4588	AcfC	Accessory colonization factor AcfC, contains ABC-type periplasmic domain  [Cell wall/membrane/envelope biogenesis]. 	252
226955	COG4589	YnbB	CDP-diglyceride synthetase [Lipid transport and metabolism]. 	303
226956	COG4590	COG4590	ABC-type uncharacterized transport system, permease component  [General function prediction only]. 	733
226957	COG4591	LolE	ABC-type transport system, involved in lipoprotein release, permease component [Cell wall/membrane/envelope biogenesis]. 	408
226958	COG4592	FepB	ABC-type Fe2+-enterobactin transport system, periplasmic component [Inorganic ion transport and metabolism]. 	319
226959	COG4594	FecB	ABC-type Fe3+-citrate transport system, periplasmic component [Inorganic ion transport and metabolism]. 	310
226960	COG4597	BatB	ABC-type amino acid transport system, permease component [Amino acid transport and metabolism]. 	397
226961	COG4598	HisP	ABC-type histidine transport system, ATPase component [Amino acid transport and metabolism]. 	256
226962	COG4603	COG4603	ABC-type uncharacterized transport system, permease component  [General function prediction only]. 	356
226963	COG4604	CeuD	ABC-type enterochelin transport system, ATPase component  [Inorganic ion transport and metabolism]. 	252
226964	COG4605	CeuC	ABC-type enterochelin transport system, permease component  [Inorganic ion transport and metabolism]. 	316
226965	COG4606	CeuB	ABC-type enterochelin transport system, permease component  [Inorganic ion transport and metabolism]. 	321
226966	COG4607	CeuA	ABC-type enterochelin transport system, periplasmic component  [Inorganic ion transport and metabolism]. 	320
226967	COG4608	AppF	ABC-type oligopeptide transport system, ATPase component [Amino acid transport and metabolism]. 	268
226968	COG4615	PvdE	ABC-type siderophore export system, fused ATPase and permease components [Inorganic ion transport and metabolism]. 	546
226969	COG4618	ArpD	ABC-type protease/lipase transport system, ATPase and permease components  [Intracellular trafficking, secretion, and vesicular transport]. 	580
226970	COG4619	FetA	ABC-type iron transport system FetAB, ATPase component [Inorganic ion transport and metabolism]. 	223
226971	COG4623	MltF	Membrane-bound lytic murein transglycosylase MltF [Cell wall/membrane/envelope biogenesis, Signal transduction mechanisms]. 	473
226972	COG4624	Nar1	Iron only hydrogenase large subunit, C-terminal domain  [Energy production and conversion]. 	411
226973	COG4625	COG4625	Uncharacterized conserved protein, contains a C-terminal beta-barrel porin domain [Function unknown]. 	577
226974	COG4626	YmfN	Phage terminase-like protein, large subunit, contains N-terminal HTH domain [Mobilome: prophages, transposons]. 	546
226975	COG4627	COG4627	Predicted SAM-dependent methyltransferase [General function prediction only]. 	185
226976	COG4628	COG4628	Uncharacterized conserved protein, DUF2132 family [Function unknown]. 	136
226977	COG4630	XdhA	Xanthine dehydrogenase, Fe-S cluster and FAD-binding subunit XdhA  [Nucleotide transport and metabolism]. 	493
226978	COG4631	XdhB	Xanthine dehydrogenase, molybdopterin-binding subunit XdhB  [Nucleotide transport and metabolism]. 	781
226979	COG4632	EpsL	Exopolysaccharide biosynthesis protein related to N-acetylglucosamine-1-phosphodiester alpha-N-acety... [Carbohydrate transport and metabolism]. 	320
226980	COG4633	COG4633	Plastocyanin domain containing protein  [General function prediction only]. 	272
226981	COG4634	COG4634	Predicted nuclease, contains PIN domain, potential toxin-antitoxin system component [General function prediction only]. 	113
226982	COG4635	HemG	Protoporphyrinogen IX oxidase, menaquinone-dependent (flavodoxin domain)   [Coenzyme transport and metabolism]. 	175
226983	COG4636	Uma2	Endonuclease, Uma2 family (restriction endonuclease fold)  [General function prediction only]. 	200
226984	COG4637	COG4637	Predicted ATPase  [General function prediction only]. 	373
226985	COG4638	HcaE	Phenylpropionate dioxygenase or related ring-hydroxylating dioxygenase, large terminal subunit [Inorganic ion transport and metabolism, General function prediction only]. 	367
226986	COG4639	COG4639	Predicted kinase  [General function prediction only]. 	168
226987	COG4640	YvbJ	Uncharacterized membrane protein YvbJ [Function unknown]. 	465
226988	COG4641	COG4641	Spore maturation protein CgeB  [Cell cycle control, cell division, chromosome partitioning]. 	373
226989	COG4642	COG4642	Uncharacterized conserved protein [Function unknown]. 	139
226990	COG4643	COG4643	Uncharacterized domain associated with phage/plasmid primase [Mobilome: prophages, transposons]. 	366
226991	COG4644	COG4644	Transposase and inactivated derivatives, TnpA family  [Mobilome: prophages, transposons]. 	323
226992	COG4645	OpgC	Predicted acyltransferase [General function prediction only]. 	410
226993	COG4646	COG4646	Adenine-specific DNA methylase, N12 class  [Replication, recombination and repair]. 	637
226994	COG4647	AcxC	Acetone carboxylase, gamma subunit  [Secondary metabolites biosynthesis, transport and catabolism]. 	165
226995	COG4648	COG4648	Uncharacterized membrane protein [Function unknown]. 	201
226996	COG4649	COG4649	Uncharacterized protein [Function unknown]. 	221
226997	COG4650	RtcR	Sigma54-dependent transcription regulator containing an AAA-type ATPase domain and a DNA-binding domain [Transcription, Signal transduction mechanisms]. 	531
226998	COG4651	RosB	Predicted Kef-type K+ transport protein, K+/H+ antiporter domain [Inorganic ion transport and metabolism]. 	408
226999	COG4652	COG4652	Uncharacterized protein [Function unknown]. 	657
227000	COG4653	COG4653	Predicted phage phi-C31 gp36 major capsid-like protein  [Mobilome: prophages, transposons]. 	422
227001	COG4654	CytC552	Cytochrome c551/c552  [Energy production and conversion]. 	110
227002	COG4655	COG4655	Uncharacterized membrane protein  [Function unknown]. 	565
227003	COG4656	RnfC	Na+-translocating ferredoxin:NAD+ oxidoreductase  RNF, RnfC subunit  [Energy production and conversion]. 	529
227004	COG4657	RnfA	Na+-translocating ferredoxin:NAD+ oxidoreductase RNF, RnfA subunit  [Energy production and conversion]. 	193
227005	COG4658	RnfD	Na+-translocating ferredoxin:NAD+ oxidoreductase  RNF, RnfD subunit  [Energy production and conversion]. 	338
227006	COG4659	RnfG	Na+-translocating ferredoxin:NAD+ oxidoreductase RNF, RnfG subunit  [Energy production and conversion]. 	195
227007	COG4660	RnfE	Na+-translocating ferredoxin:NAD+ oxidoreductase  RNF, RnfE subunit  [Energy production and conversion]. 	212
227008	COG4662	TupA	ABC-type tungstate transport system, periplasmic component  [Inorganic ion transport and metabolism]. 	227
227009	COG4663	FcbT1	TRAP-type mannitol/chloroaromatic compound transport system, periplasmic component  [Secondary metabolites biosynthesis, transport and catabolism]. 	363
227010	COG4664	FcbT3	TRAP-type mannitol/chloroaromatic compound transport system, large permease component  [Secondary metabolites biosynthesis, transport and catabolism]. 	447
227011	COG4665	FcbT2	TRAP-type mannitol/chloroaromatic compound transport system, small permease component  [Secondary metabolites biosynthesis, transport and catabolism]. 	182
227012	COG4666	COG4666	TRAP-type uncharacterized transport system, fused permease components  [General function prediction only]. 	642
227013	COG4667	YjjU	Predicted phospholipase, patatin/cPLA2 family [Lipid transport and metabolism]. 	292
227014	COG4668	MtlA2	Mannitol/fructose-specific phosphotransferase system, IIA domain [Carbohydrate transport and metabolism]. 	142
227015	COG4669	EscJ	Type III secretory pathway, lipoprotein EscJ  [Intracellular trafficking, secretion, and vesicular transport]. 	246
227016	COG4670	YdiF	Acyl CoA:acetate/3-ketoacid CoA transferase [Lipid transport and metabolism]. 	527
227017	COG4671	COG4671	Predicted glycosyl transferase  [General function prediction only]. 	400
227018	COG4672	gp18	Phage-related protein  [Mobilome: prophages, transposons]. 	231
227019	COG4674	COG4674	ABC-type uncharacterized transport system, ATPase component  [General function prediction only]. 	249
227020	COG4675	MdpB	Microcystin-dependent protein  (function unknown) [Function unknown]. 	170
227021	COG4676	YfaP	Uncharacterized conserved protein YfaP, DUF2135 family [Function unknown]. 	268
227022	COG4677	PemB	Pectin methylesterase and related acyl-CoA thioesterases [Carbohydrate transport and metabolism, Lipid transport and metabolism]. 	405
227023	COG4678	COG4678	Muramidase (phage lambda lysozyme)  [Cell wall/membrane/envelope biogenesis, Mobilome: prophages, transposons]. 	180
227024	COG4679	COG4679	Phage-related protein  [Mobilome: prophages, transposons]. 	116
227025	COG4680	HigB	mRNA-degrading endonuclease (mRNA interferase) HigB, toxic component of the HigAB toxin-antitoxin module [Translation, ribosomal structure and biogenesis]. 	98
227026	COG4681	YaeQ	Uncharacterized conserved protein YaeQ, suppresses RfaH defect [Function unknown]. 	181
227027	COG4682	YiaA	Uncharacterized membrane protein YiaA [Function unknown]. 	128
227028	COG4683	COG4683	Uncharacterized protein [Function unknown]. 	120
227029	COG4684	COG4684	Uncharacterized membrane protein [Function unknown]. 	189
227030	COG4685	YfaA	Uncharacterized conserved protein YfaA, DUF2138 family [Function unknown]. 	571
227031	COG4687	COG4687	Uncharacterized protein [Function unknown]. 	122
227032	COG4688	COG4688	Uncharacterized protein [Function unknown]. 	665
227033	COG4689	Adc	Acetoacetate decarboxylase  [Secondary metabolites biosynthesis, transport and catabolism]. 	247
227034	COG4690	PepD	Dipeptidase  [Amino acid transport and metabolism]. 	464
227035	COG4691	StbC	Plasmid stability protein  [Defense mechanisms]. 	80
227036	COG4692	COG4692	Predicted neuraminidase (sialidase)  [Carbohydrate transport and metabolism, Cell wall/membrane/envelope biogenesis]. 	381
227037	COG4693	PchG	Oxidoreductase (NAD-binding), involved in siderophore biosynthesis  [Inorganic ion transport and metabolism]. 	361
227038	COG4694	RloC	Wobble nucleotide-excising tRNase [Translation, ribosomal structure and biogenesis]. 	758
227039	COG4695	BeeE	Phage portal protein BeeE [Mobilome: prophages, transposons]. 	398
227040	COG4696	COG4696	Predicted phosphohydrolase, Cof family, HAD superfamily [General function prediction only]. 	180
227041	COG4697	COG4697	Uncharacterized protein [Function unknown]. 	319
227042	COG4698	YpmS	Uncharacterized protein YpmS, DUF2140 family [Function unknown]. 	197
227043	COG4699	COG4699	Uncharacterized protein [Function unknown]. 	120
227044	COG4700	COG4700	Uncharacterized protein [Function unknown]. 	251
227045	COG4701	COG4701	Uncharacterized protein [Function unknown]. 	162
227046	COG4702	COG4702	Uncharacterized protein, UPF0303 family [Function unknown]. 	168
227047	COG4703	YkuJ	Uncharacterized protein YkuJ, DUF1797 family [Function unknown]. 	74
227048	COG4704	COG4704	Uncharacterized conserved protein, DUF2141 family [Function unknown]. 	151
227049	COG4705	COG4705	Uncharacterized membrane-anchored protein  [Function unknown]. 	258
227050	COG4706	COG4706	Predicted 3-hydroxyacyl-ACP dehydratase, HotDog domain  [Lipid transport and metabolism]. 	161
227051	COG4707	COG4707	Prophage pi2 protein 07 [Mobilome: prophages, transposons]. 	107
227052	COG4708	COG4708	Uncharacterized membrane protein  [Function unknown]. 	169
227053	COG4709	COG4709	Uncharacterized membrane protein  [Function unknown]. 	195
227054	COG4710	COG4710	Predicted DNA-binding protein with an HTH domain  [General function prediction only]. 	80
227055	COG4711	COG4711	Uncharacterized membrane protein  [Function unknown]. 	217
227056	COG4712	COG4712	Uncharacterized protein [Function unknown]. 	234
227057	COG4713	COG4713	Uncharacterized membrane protein  [Function unknown]. 	489
227058	COG4714	COG4714	Uncharacterized membrane-anchored protein  [Function unknown]. 	303
227059	COG4715	COG4715	Uncharacterized conserved protein, contains Zn finger domain [Function unknown]. 	587
227060	COG4716	COG4716	Myosin-crossreactive antigen  (function unknown) [Function unknown]. 	587
227061	COG4717	YhaN	Uncharacterized protein YhaN, contains AAA domain [Function unknown]. 	984
227062	COG4718	COG4718	Phage-related protein  [Mobilome: prophages, transposons]. 	111
227063	COG4719	COG4719	Uncharacterized protein [Function unknown]. 	176
227064	COG4720	COG4720	Uncharacterized membrane protein  [Function unknown]. 	177
227065	COG4721	YkoE	ABC-type thiamine/hydroxymethylpyrimidine transport system, permease component  [Coenzyme transport and metabolism]. 	192
227066	COG4722	YomH	Phage-related protein  [Mobilome: prophages, transposons]. 	239
227067	COG4723	COG4723	Phage-related protein, tail component  [Mobilome: prophages, transposons]. 	198
227068	COG4724	COG4724	Endo-beta-N-acetylglucosaminidase D  [Carbohydrate transport and metabolism]. 	553
227069	COG4725	IME4	N6-adenosine-specific RNA methylase IME4 [Translation, ribosomal structure and biogenesis]. 	198
227070	COG4726	PilX	Tfp pilus assembly protein PilX  [Cell motility, Extracellular structures]. 	196
227071	COG4727	COG4727	Uncharacterized protein [Function unknown]. 	287
227072	COG4728	COG4728	Uncharacterized protein, DUF1653 family [Function unknown]. 	124
227073	COG4729	COG4729	Uncharacterized protein, DUF1850 family [Function unknown]. 	156
227074	COG4731	COG4731	Uncharacterized conserved protein, DUF2147 family [Function unknown]. 	162
227075	COG4732	ThiW	Predicted membrane protein  [Function unknown]. 	177
227076	COG4733	COG4733	Phage-related protein, tail component  [Mobilome: prophages, transposons]. 	952
227077	COG4734	ArdA	Antirestriction protein  [Defense mechanisms]. 	193
227078	COG4735	YaaW	Uncharacterized protein YaaW, UPF0174 family [Function unknown]. 	211
227079	COG4736	CcoQ	Cbb3-type cytochrome oxidase, subunit 3  [Energy production and conversion]. 	60
227080	COG4737	COG4737	Uncharacterized protein [Function unknown]. 	123
227081	COG4738	COG4738	Predicted transcriptional regulator  [Transcription]. 	124
227082	COG4739	COG4739	Uncharacterized protein, contains ferredoxin domain [Function unknown]. 	182
227083	COG4740	COG4740	Predicted metalloprotease  [General function prediction only]. 	176
227084	COG4741	COG4741	Predicted secreted endonuclease distantly related to archaeal Holliday junction resolvase  [Nucleotide transport and metabolism]. 	175
227085	COG4742	COG4742	Predicted transcriptional regulator, contains HTH domain [Transcription]. 	260
227086	COG4743	COG4743	Uncharacterized membrane protein  [Function unknown]. 	316
227087	COG4744	COG4744	Uncharacterized protein [Function unknown]. 	121
227088	COG4745	COG4745	Predicted membrane-bound mannosyltransferase  [General function prediction only]. 	556
227089	COG4746	COG4746	Uncharacterized protein [Function unknown]. 	80
227090	COG4747	ACTx2	Uncharacterized conserved protein, contains tandem ACT domains [Function unknown]. 	142
227091	COG4748	COG4748	Uncharacterized protein, contains restriction enzyme R protein N terminal (HSDR_N) domain  [Function unknown]. 	365
227092	COG4749	COG4749	Uncharacterized protein [Function unknown]. 	196
227093	COG4750	LicC	CTP:phosphocholine cytidylyltransferase involved in choline phosphorylation for cell surface LPS epi... [Cell wall/membrane/envelope biogenesis, Lipid transport and metabolism]. 	231
227094	COG4752	COG4752	Uncharacterized protein [Function unknown]. 	190
227095	COG4753	YesN	Two-component response regulator, YesN/AraC family, consists of REC and AraC-type DNA-binding domains [Signal transduction mechanisms, Transcription]. 	475
227096	COG4754	COG4754	Uncharacterized protein [Function unknown]. 	157
227097	COG4755	COG4755	Uncharacterized protein [Function unknown]. 	151
227098	COG4756	COG4756	Predicted cation transporter  [General function prediction only]. 	367
227099	COG4757	COG4757	Predicted alpha/beta hydrolase  [General function prediction only]. 	281
227100	COG4758	LiaF	Predicted membrane protein  [Function unknown]. 	235
227101	COG4759	COG4759	Uncharacterized protein, contains thioredoxin-like domain [General function prediction only]. 	316
227102	COG4760	COG4760	Uncharacterized membrane protein, YccA/Bax inhibitor family [Function unknown]. 	276
227103	COG4762	COG4762	Uncharacterized protein, UPF0548 family [Function unknown]. 	168
227104	COG4763	YcfT	Uncharacterized membrane protein YcfT [Function unknown]. 	388
227105	COG4764	COG4764	Uncharacterized protein [Function unknown]. 	197
227106	COG4765	COG4765	Uncharacterized protein [Function unknown]. 	164
227107	COG4766	EutQ	Ethanolamine utilization protein EutQ, cupin superfamily (function unknown) [Amino acid transport and metabolism]. 	176
227108	COG4767	VanZ	Glycopeptide antibiotics resistance protein  [Defense mechanisms]. 	199
227109	COG4768	COG4768	Uncharacterized protein YoxC, contains an MCP-like domain [Function unknown]. 	139
227110	COG4769	COG4769	Uncharacterized membrane protein  [Function unknown]. 	181
227111	COG4770	PccA	Acetyl/propionyl-CoA carboxylase, alpha subunit  [Lipid transport and metabolism]. 	645
227112	COG4771	FepA	Outer membrane receptor for ferrienterochelin and colicins [Inorganic ion transport and metabolism]. 	699
227113	COG4772	FecA	Outer membrane receptor for Fe3+-dicitrate [Inorganic ion transport and metabolism]. 	753
227114	COG4773	FhuE	Outer membrane receptor for ferric coprogen and ferric-rhodotorulic acid [Inorganic ion transport and metabolism]. 	719
227115	COG4774	Fiu	Outer membrane receptor for monomeric catechols [Inorganic ion transport and metabolism]. 	750
227116	COG4775	BamA	Outer membrane protein assembly factor BamA [Cell wall/membrane/envelope biogenesis]. 	766
227117	COG4776	Rnb	Exoribonuclease II [Transcription]. 	645
227118	COG4778	PhnL	Alpha-D-ribose 1-methylphosphonate 5-triphosphate synthase subunit PhnL [Inorganic ion transport and metabolism]. 	235
227119	COG4779	FepG	ABC-type enterobactin transport system, permease component [Inorganic ion transport and metabolism]. 	346
227120	COG4781	UgpQ1	Membrane-anchored glycerophosphoryl diester phosphodiesterase (GDPDase), membrane domain [Lipid transport and metabolism]. 	340
227121	COG4782	COG4782	Esterase/lipase superfamily enzyme [General function prediction only]. 	377
227122	COG4783	YfgC	Putative Zn-dependent protease, contains TPR repeats [General function prediction only]. 	484
227123	COG4784	COG4784	Putative Zn-dependent protease  [General function prediction only]. 	479
227124	COG4785	NlpI	Lipoprotein NlpI, contains TPR repeats [Cell wall/membrane/envelope biogenesis]. 	297
227125	COG4786	FlgG	Flagellar basal body rod protein FlgG [Cell motility]. 	265
227126	COG4787	FlgF	Flagellar basal body rod protein FlgF [Cell motility]. 	251
227127	COG4789	EscV	Type III secretory pathway, component EscV  [Intracellular trafficking, secretion, and vesicular transport]. 	689
227128	COG4790	EscR	Type III secretory pathway, component EscR  [Intracellular trafficking, secretion, and vesicular transport]. 	214
227129	COG4791	EscT	Type III secretory pathway, component EscT  [Intracellular trafficking, secretion, and vesicular transport]. 	259
227130	COG4792	EscU	Type III secretory pathway, component EscU  [Intracellular trafficking, secretion, and vesicular transport]. 	349
227131	COG4794	EscS	Type III secretory pathway, component EscS  [Intracellular trafficking, secretion, and vesicular transport]. 	89
227132	COG4795	PulJ	Type II secretory pathway, component PulJ [Intracellular trafficking, secretion, and vesicular transport]. 	194
227133	COG4796	HofQ	Type II secretory pathway, component HofQ [Intracellular trafficking, secretion, and vesicular transport]. 	709
227134	COG4797	COG4797	Predicted regulatory domain of a methyltransferase  [General function prediction only]. 	268
227135	COG4798	COG4798	Predicted methyltransferase  [General function prediction only]. 	238
227136	COG4799	MmdA	Acetyl-CoA carboxylase, carboxyltransferase component [Lipid transport and metabolism]. 	526
227137	COG4800	COG4800	Predicted transcriptional regulator with an HTH domain  [Transcription]. 	170
227138	COG4801	COG4801	Predicted acyltransferase, contains DUF342 domain [General function prediction only]. 	277
227139	COG4802	FtrB	Ferredoxin-thioredoxin reductase, catalytic subunit  [Energy production and conversion]. 	110
227140	COG4803	COG4803	Uncharacterized membrane protein  [Function unknown]. 	170
227141	COG4804	YhcG	Predicted nuclease of restriction endonuclease-like (RecB) superfamily,  DUF1016 family [General function prediction only]. 	159
227142	COG4805	COG4805	Uncharacterized conserved protein, DUF885 family [Function unknown]. 	588
227143	COG4806	RhaA	L-rhamnose isomerase [Carbohydrate transport and metabolism]. 	419
227144	COG4807	YehS	Uncharacterized conserved protein YehS, DUF1456 family [Function unknown]. 	155
227145	COG4808	YehR	Uncharacterized lipoprotein YehR, DUF1307 family [Function unknown]. 	152
227146	COG4809	Pfk2	Archaeal ADP-dependent phosphofructokinase/glucokinase  [Carbohydrate transport and metabolism]. 	466
227147	COG4810	EutS	Ethanolamine utilization protein EutS, ethanolamine utilization microcompartment shell protein [Amino acid transport and metabolism]. 	121
227148	COG4811	YobD	Uncharacterized membrane protein YobD, UPF0266 family [Function unknown]. 	152
227149	COG4812	EutT	Ethanolamine utilization cobalamin adenosyltransferase [Amino acid transport and metabolism]. 	255
227150	COG4813	ThuA	Trehalose utilization protein  [Carbohydrate transport and metabolism]. 	261
227151	COG4814	COG4814	Uncharacterized protein with an alpha/beta hydrolase fold  [Function unknown]. 	288
227152	COG4815	COG4815	Uncharacterized protein [Function unknown]. 	145
227153	COG4816	EutL	Ethanolamine utilization protein EutL, ethanolamine utilization microcompartment shell protein [Amino acid transport and metabolism]. 	219
227154	COG4817	GINS	DNA-binding ferritin-like protein (Dps family)  [Replication, recombination and repair]. 	111
227155	COG4818	COG4818	Uncharacterized membrane protein  [Function unknown]. 	105
227156	COG4819	EutA	Ethanolamine utilization protein EutA, possible chaperonin protecting lyase from inhibition [Amino acid transport and metabolism]. 	473
227157	COG4820	EutJ	Ethanolamine utilization protein EutJ, possible chaperonin [Amino acid transport and metabolism]. 	277
227158	COG4821	COG4821	Uncharacterized protein, contains SIS (Sugar ISomerase) phosphosugar binding domain  [General function prediction only]. 	243
227159	COG4822	CbiK	Cobalamin biosynthesis protein CbiK, Co2+ chelatase  [Coenzyme transport and metabolism]. 	265
227160	COG4823	AbiF	Abortive infection bacteriophage resistance protein  [Defense mechanisms]. 	299
227161	COG4824	COG4824	Phage-related holin (Lysis protein)  [Mobilome: prophages, transposons]. 	133
227162	COG4825	COG4825	Uncharacterized membrane-anchored protein  [Function unknown]. 	395
227163	COG4826	SERPIN	Serine protease inhibitor  [Posttranslational modification, protein turnover, chaperones]. 	410
227164	COG4827	COG4827	Predicted transporter  [General function prediction only]. 	239
227165	COG4828	COG4828	Uncharacterized membrane protein  [Function unknown]. 	113
227166	COG4829	CatC1	Muconolactone delta-isomerase  [Secondary metabolites biosynthesis, transport and catabolism]. 	98
227167	COG4830	RPS26B	Ribosomal protein S26  [Translation, ribosomal structure and biogenesis]. 	108
227168	COG4831	COG4831	Roadblock/LC7 domain  [Signal transduction mechanisms]. 	109
227169	COG4832	COG4832	Uncharacterized protein [Function unknown]. 	207
227170	COG4833	COG4833	Predicted alpha-1,6-mannanase, GH76 family   [Carbohydrate transport and metabolism]. 	377
227171	COG4834	COG4834	Uncharacterized protein [Function unknown]. 	334
227172	COG4835	COG4835	Uncharacterized protein [Function unknown]. 	124
227173	COG4836	YwzB	Uncharacterized membrane protein YwzB [Function unknown]. 	77
227174	COG4837	YuzD	Disulfide oxidoreductase YuzD  [Posttranslational modification, protein turnover, chaperones]. 	106
227175	COG4838	YlaN	Uncharacterized protein YlaN, UPF0358 family [Function unknown]. 	92
227176	COG4839	FtsL2	Cell division protein FtsL [Cell cycle control, cell division, chromosome partitioning]. 	120
227177	COG4840	YfkK	Uncharacterized protein YfkK, UPF0435 family [Function unknown]. 	71
227178	COG4841	YneR	Uncharacterized protein YneR, related to HesB/YadR/YfhF family [Function unknown]. 	95
227179	COG4842	YukE	Uncharacterized conserved protein YukE [Function unknown]. 	97
227180	COG4843	YebE	Uncharacterized protein YebE, UPF0316/DUF2179 family [Function unknown]. 	179
227181	COG4844	YuzB	Uncharacterized protein YuzB, UPF0349 family [Function unknown]. 	78
227182	COG4845	CatA	Chloramphenicol O-acetyltransferase  [Defense mechanisms]. 	219
227183	COG4846	CcdC	Membrane protein CcdC involved in cytochrome C biogenesis  [Energy production and conversion, Posttranslational modification, protein turnover, chaperones]. 	163
227184	COG4847	COG4847	Uncharacterized protein [Function unknown]. 	103
227185	COG4848	YtpQ	Uncharacterized protein YtpQ, UPF0354 family [Function unknown]. 	265
227186	COG4849	COG4849	Predicted nucleotidyltransferase  [General function prediction only]. 	269
227187	COG4850	App1	Phosphatidate phosphatase APP1 [Lipid transport and metabolism]. 	373
227188	COG4851	CamS	Protein involved in sex pheromone biosynthesis  [General function prediction only]. 	382
227189	COG4852	COG4852	Uncharacterized membrane protein [Function unknown]. 	134
227190	COG4853	YycI	Two-component signal transduction system YycFG, regulatory protein YycI [Signal transduction mechanisms]. 	264
227191	COG4854	COG4854	Uncharacterized membrane protein  [Function unknown]. 	126
227192	COG4855	COG4855	Uncharacterized protein [Nucleotide transport and metabolism]. 	76
227193	COG4856	YbbR	Uncharacterized protein, YbbR domain [Function unknown]. 	403
227194	COG4857	COG4857	5-Methylthioribose kinase, methionine salvage pathway  [Amino acid transport and metabolism]. 	408
227195	COG4858	COG4858	Uncharacterized membrane-anchored protein  [Function unknown]. 	226
227196	COG4859	COG4859	Uncharacterized protein [Function unknown]. 	105
227197	COG4860	COG4860	Predicted DNA-binding transcriptional regulator, ArsR family [Transcription]. 	170
227198	COG4861	COG4861	Uncharacterized protein [Function unknown]. 	345
227199	COG4862	MecA	Negative regulator of genetic competence, sporulation and motility  [Transcription, Signal transduction mechanisms, Cell motility]. 	224
227200	COG4863	YycH	Two-component signal transduction system YycFG, regulatory protein YycH [Signal transduction mechanisms]. 	439
227201	COG4864	YqfA	Uncharacterized protein YqfA, UPF0365 family [Function unknown]. 	328
227202	COG4865	GlmE	Glutamate mutase epsilon subunit  [Amino acid transport and metabolism]. 	485
227203	COG4866	COG4866	Uncharacterized protein [Function unknown]. 	294
227204	COG4867	COG4867	Uncharacterized protein, contains von Willebrand factor type A (vWA) domain  [Function unknown]. 	652
227205	COG4868	COG4868	Uncharacterized protein, UPF0371 family [Function unknown]. 	493
227206	COG4869	PduL	Propanediol utilization protein  [Secondary metabolites biosynthesis, transport and catabolism]. 	210
227207	COG4870	COG4870	Cysteine protease, C1A family  [Posttranslational modification, protein turnover, chaperones]. 	372
227208	COG4871	COG4871	Metal-binding transcriptional regulator, contains putative Fe-S cluster and ArsR family DNA binding domain [Transcription]. 	193
227209	COG4872	COG4872	Uncharacterized membrane protein [Function unknown]. 	394
227210	COG4873	YkvS	Uncharacterized protein YkvS, DUF2187 family [Function unknown]. 	81
227211	COG4874	COG4874	Uncharacterized protein [Function unknown]. 	318
227212	COG4875	COG4875	Uncharacterized protein [Function unknown]. 	156
227213	COG4876	YdaT	Uncharacterized protein YdaT [Function unknown]. 	138
227214	COG4877	COG4877	Uncharacterized protein [Function unknown]. 	63
227215	COG4878	COG4878	Uncharacterized protein [Function unknown]. 	309
227216	COG4879	COG4879	Uncharacterized protein [Function unknown]. 	243
227217	COG4880	COG4880	Secreted protein containing C-terminal beta-propeller domain distantly related to WD-40 repeats  [General function prediction only]. 	603
227218	COG4881	COG4881	Predicted membrane protein  [Function unknown]. 	371
227219	COG4882	COG4882	Predicted aminopeptidase, Iap family  [General function prediction only]. 	486
227220	COG4883	COG4883	Uncharacterized protein [Function unknown]. 	500
227221	COG4884	YfeS	Uncharacterized conserved protein YfeS, contains WGR domain [Function unknown]. 	176
227222	COG4885	COG4885	Uncharacterized protein [Function unknown]. 	312
227223	COG4886	LRR	Leucine-rich repeat (LRR) protein  [Transcription]. 	394
227224	COG4887	COG4887	Uncharacterized metal-binding protein, DUF1847 family [Function unknown]. 	191
227225	COG4888	Elf1	Transcription elongation factor Elf1, contains Zn-ribbon domain [Transcription]. 	104
227226	COG4889	COG4889	Predicted helicase  [General function prediction only]. 	1518
227227	COG4890	COG4890	Predicted outer membrane lipoprotein  [Function unknown]. 	37
227228	COG4891	COG4891	Uncharacterized protein [Function unknown]. 	93
227229	COG4892	COG4892	Predicted heme/steroid binding protein  [General function prediction only]. 	81
227230	COG4893	COG4893	Uncharacterized protein [Function unknown]. 	123
227231	COG4894	YxjI	Uncharacterized protein YxjI, Tubby2 superfamily [Function unknown]. 	159
227232	COG4895	YwbE	Uncharacterized protein YwbE, DUF2196 family [Function unknown]. 	63
227233	COG4896	YlaI	Uncharacterized protein YlaI, DUF2197 family [Function unknown]. 	68
227234	COG4897	CsbA	General stress protein CsbA (function unknown) [Function unknown]. 	78
227235	COG4898	COG4898	Uncharacterized protein [Function unknown]. 	115
227236	COG4899	COG4899	Uncharacterized protein [Function unknown]. 	166
227237	COG4900	COG4900	Predicted metallopeptidase  [General function prediction only]. 	133
227238	COG4901	RPS25	Ribosomal protein S25  [Translation, ribosomal structure and biogenesis]. 	107
227239	COG4902	COG4902	Uncharacterized protein [Function unknown]. 	189
227240	COG4903	ComK	Competence transcription factor ComK [Transcription]. 	190
227241	COG4904	COG4904	Uncharacterized protein [Function unknown]. 	174
227242	COG4905	COG4905	Uncharacterized membrane protein [Function unknown]. 	243
227243	COG4906	COG4906	Uncharacterized membrane protein  [Function unknown]. 	696
227244	COG4907	COG4907	Uncharacterized membrane protein  [Function unknown]. 	595
227245	COG4908	COG4908	Uncharacterized protein, contains a NRPS condensation (elongation) domain  [General function prediction only]. 	439
227246	COG4909	PduC	Propanediol dehydratase, large subunit  [Secondary metabolites biosynthesis, transport and catabolism]. 	554
227247	COG4910	PduE	Propanediol dehydratase, small subunit  [Secondary metabolites biosynthesis, transport and catabolism]. 	170
227248	COG4911	COG4911	Uncharacterized protein [Function unknown]. 	123
227249	COG4912	AlkD	3-methyladenine DNA glycosylase AlkD [Replication, recombination and repair]. 	222
227250	COG4913	COG4913	Uncharacterized protein, contains a C-terminal ATPase domain [Function unknown]. 	1104
227251	COG4914	COG4914	Predicted nucleotidyltransferase  [General function prediction only]. 	190
227252	COG4915	XpaC	5-bromo-4-chloroindolyl phosphate hydrolysis protein  [Secondary metabolites biosynthesis, transport and catabolism, General function prediction only]. 	204
227253	COG4916	COG4916	Uncharacterized protein [Function unknown]. 	329
227254	COG4917	EutP	Ethanolamine utilization protein EutP, contains a P-loop NTPase domain [Amino acid transport and metabolism]. 	148
227255	COG4918	YqkB	Predicted Fe-S cluster biosynthesis protein  [General function prediction only]. 	114
227256	COG4919	RPS30	Ribosomal protein S30  [Translation, ribosomal structure and biogenesis]. 	54
227257	COG4920	COG4920	Uncharacterized membrane protein  [Function unknown]. 	249
227258	COG4921	COG4921	Uncharacterized protein [Function unknown]. 	131
227259	COG4922	COG4922	Predicted SnoaL-like aldol condensation-catalyzing enzyme [General function prediction only]. 	129
227260	COG4923	COG4923	Predicted nuclease (RNAse H fold)  [General function prediction only]. 	245
227261	COG4924	COG4924	Uncharacterized protein [Function unknown]. 	386
227262	COG4925	COG4925	Uncharacterized protein [Function unknown]. 	166
227263	COG4926	PblB	Phage-related protein  [Mobilome: prophages, transposons]. 	698
227264	COG4927	COG4927	Predicted choloylglycine hydrolase  [General function prediction only]. 	336
227265	COG4928	COG4928	Predicted P-loop ATPase, KAP-like  [General function prediction only]. 	646
227266	COG4929	COG4929	Uncharacterized membrane-anchored protein  [Function unknown]. 	190
227267	COG4930	COG4930	Predicted ATP-dependent Lon-type protease  [Posttranslational modification, protein turnover, chaperones]. 	683
227268	COG4932	COG4932	Uncharacterized surface anchored protein  [Function unknown]. 	1531
227269	COG4933	COG4933	Predicted transcriptional regulator, contains an HTH and PUA-like domains [Transcription]. 	124
227270	COG4934	COG4934	Serine protease, subtilase family  [Posttranslational modification, protein turnover, chaperones]. 	1174
227271	COG4935	COG4935	Regulatory P domain of the subtilisin-like proprotein convertases and other proteases  [Posttranslational modification, protein turnover, chaperones]. 	177
227272	COG4936	PocR	Ligand-binding sensor domain [Signal transduction mechanisms]. 	169
227273	COG4937	FDXACB	Ferredoxin-fold anticodon binding domain  [Translation, ribosomal structure and biogenesis]. 	171
227274	COG4938	COG4938	Predicted ATPase [General function prediction only]. 	374
227275	COG4939	Tpp15	Major membrane immunogen, membrane-anchored lipoprotein  [Function unknown]. 	147
227276	COG4940	ComGF	Competence protein ComGF  [Mobilome: prophages, transposons]. 	154
227277	COG4941	COG4941	Predicted RNA polymerase sigma factor, contains C-terminal TPR domain  [Transcription]. 	415
227278	COG4942	EnvC	Septal ring factor EnvC, activator of murein hydrolases AmiA and AmiB [Cell cycle control, cell division, chromosome partitioning]. 	420
227279	COG4943	YjcC	Environmental sensor c-di-GMP phosphodiesterase, contains periplasmic CSS-motif sensor and cytoplasmic EAL domain  [Signal transduction mechanisms]. 	524
227280	COG4944	COG4944	Uncharacterized protein [Function unknown]. 	213
227281	COG4945	DOMON	Carbohydrate-binding DOMON domain [Carbohydrate transport and metabolism, Signal transduction mechanisms]. 	570
227282	COG4946	COG4946	Uncharacterized N-terminal domain of tricorn protease  [Function unknown]. 	668
227283	COG4947	COG4947	Esterase/lipase superfamily enzyme  [General function prediction only]. 	227
227284	COG4948	RspA	L-alanine-DL-glutamate epimerase or related enzyme of enolase superfamily [Cell wall/membrane/envelope biogenesis, General function prediction only]. 	372
227285	COG4949	COG4949	Uncharacterized membrane-anchored protein  [Function unknown]. 	424
227286	COG4950	YciW1	N-terminal domain of uncharacterized protein YciW (function unknown) [Function unknown]. 	193
227287	COG4951	COG4951	Uncharacterized protein [Function unknown]. 	361
227288	COG4952	COG4952	L-rhamnose isomerase [Cell wall/membrane/envelope biogenesis]. 	430
227289	COG4953	PbpC	Membrane carboxypeptidase/penicillin-binding protein PbpC [Cell wall/membrane/envelope biogenesis]. 	733
227290	COG4954	COG4954	Uncharacterized protein [Function unknown]. 	135
227291	COG4955	YpbB	Uncharacterized protein YpbB, contains C-terminal HTH domain [Function unknown]. 	343
227292	COG4956	COG4956	Uncharacterized conserved protein YacL, contains PIN and TRAM domains [General function prediction only]. 	356
227293	COG4957	COG4957	Predicted transcriptional regulator  [Transcription]. 	148
227294	COG4959	TraF	Type IV secretory pathway, protease TraF  [Posttranslational modification, protein turnover, chaperones, Intracellular trafficking, secretion, and vesicular transport]. 	173
227295	COG4960	CpaA	Flp pilus assembly protein, protease CpaA  [Posttranslational modification, protein turnover, chaperones, Signal transduction mechanisms]. 	168
227296	COG4961	TadG	Flp pilus assembly protein TadG  [Intracellular trafficking, secretion, and vesicular transport, Extracellular structures]. 	185
227297	COG4962	CpaF	Pilus assembly protein, ATPase of CpaF family [Intracellular trafficking, secretion, and vesicular transport, Extracellular structures]. 	355
227298	COG4963	CpaE	Flp pilus assembly protein, ATPase CpaE  [Intracellular trafficking, secretion, and vesicular transport, Extracellular structures]. 	366
227299	COG4964	CpaC	Flp pilus assembly protein, secretin CpaC  [Intracellular trafficking, secretion, and vesicular transport, Extracellular structures]. 	455
227300	COG4965	TadB	Flp pilus assembly protein TadB  [Intracellular trafficking, secretion, and vesicular transport, Extracellular structures]. 	309
227301	COG4966	PilW	Tfp pilus assembly protein PilW  [Cell motility, Extracellular structures]. 	318
227302	COG4967	PilV	Tfp pilus assembly protein PilV [Cell motility, Extracellular structures]. 	162
227303	COG4968	PilE	Tfp pilus assembly protein PilE  [Cell motility, Extracellular structures]. 	139
227304	COG4969	PilA	Tfp pilus assembly protein, major pilin PilA [Cell motility, Extracellular structures]. 	125
227305	COG4970	FimT	Tfp pilus assembly protein FimT  [Cell motility, Extracellular structures]. 	181
227306	COG4972	PilM	Tfp pilus assembly protein, ATPase PilM  [Cell motility, Extracellular structures]. 	354
227307	COG4973	XerC	Site-specific recombinase XerC [Replication, recombination and repair]. 	299
227308	COG4974	XerD	Site-specific recombinase XerD [Replication, recombination and repair]. 	300
227309	COG4975	GlcU	Glucose uptake protein GlcU [Carbohydrate transport and metabolism]. 	288
227310	COG4976	COG4976	Predicted methyltransferase, contains TPR repeat [General function prediction only]. 	287
227311	COG4977	GlxA	Transcriptional regulator GlxA family, contains an amidase domain and an AraC-type DNA-binding HTH domain  [Transcription]. 	328
227312	COG4978	BltR2	Bacterial effector-binding domain [Signal transduction mechanisms]. 	153
227313	COG4980	GvpP	Gas vesicle protein  [General function prediction only]. 	115
227314	COG4981	COG4981	Enoyl reductase domain of yeast-type FAS1  [Lipid transport and metabolism]. 	717
227315	COG4982	FabG2	3-oxoacyl-ACP reductase domain of yeast-type FAS1 [Lipid transport and metabolism]. 	866
227316	COG4983	COG4983	Uncharacterized protein, contains Primase-polymerase (Primpol) domain [Function unknown]. 	495
227317	COG4984	COG4984	Uncharacterized membrane protein [Function unknown]. 	644
227318	COG4985	COG4985	ABC-type phosphate transport system, auxiliary component  [Inorganic ion transport and metabolism]. 	289
227319	COG4986	COG4986	ABC-type anion transport system, duplicated permease component  [Inorganic ion transport and metabolism]. 	523
227320	COG4987	CydC	ABC-type transport system involved in cytochrome bd biosynthesis, fused ATPase and permease components [Energy production and conversion, Posttranslational modification, protein turnover, chaperones]. 	573
227321	COG4988	CydD	ABC-type transport system involved in cytochrome bd biosynthesis, ATPase and permease components [Energy production and conversion, Posttranslational modification, protein turnover, chaperones]. 	559
227322	COG4989	YdhF	Predicted oxidoreductase [General function prediction only]. 	298
227323	COG4990	YvpB	Predicted cysteine peptidase, C39 family  [General function prediction only]. 	195
227324	COG4991	YraI	Uncharacterized conserved protein YraI [Function unknown]. 	155
227325	COG4992	ArgD	Acetylornithine/succinyldiaminopimelate/putrescine aminotransferase [Amino acid transport and metabolism]. 	404
227326	COG4993	Gcd	Glucose dehydrogenase [Carbohydrate transport and metabolism]. 	773
227327	COG4994	COG4994	Uncharacterized protein [Function unknown]. 	120
227328	COG4995	COG4995	Uncharacterized conserved protein, contains CHAT domain [Function unknown]. 	420
227329	COG4996	COG4996	Predicted phosphatase  [General function prediction only]. 	164
227330	COG4997	COG4997	Predicted house-cleaning noncanonical NTP pyrophosphatase, all-alpha NTP-PPase (MazG) superfamily  [General function prediction only]. 	95
227331	COG4998	RecB	Predicted endonuclease, RecB family  [Replication, recombination and repair]. 	209
227332	COG4999	BarA5	Uncharacterized domain of BarA-like signal transduction histidine kinase [Signal transduction mechanisms]. 	140
227333	COG5000	NtrY	Signal transduction histidine kinase involved in nitrogen fixation and metabolism regulation  [Signal transduction mechanisms]. 	712
227334	COG5001	COG5001	Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain  [Signal transduction mechanisms]. 	663
227335	COG5002	VicK	Signal transduction histidine kinase  [Signal transduction mechanisms]. 	459
227336	COG5003	COG5003	Mu-like prophage protein gp37  [Mobilome: prophages, transposons]. 	151
227337	COG5004	COG5004	P2-like prophage tail protein X  [Mobilome: prophages, transposons]. 	70
227338	COG5005	COG5005	Mu-like prophage protein gpG  [Mobilome: prophages, transposons]. 	140
227339	COG5006	RhtA	Threonine/homoserine efflux transporter RhtA [Amino acid transport and metabolism]. 	292
227340	COG5007	IbaG	Acid stress-induced BolA-like protein IbaG/YrbA, predicted regulator of iron metabolism [Signal transduction mechanisms]. 	80
227341	COG5008	PilU	Tfp pilus assembly protein, ATPase PilU  [Cell motility, Extracellular structures]. 	375
227342	COG5009	MrcA	Membrane carboxypeptidase/penicillin-binding protein [Cell wall/membrane/envelope biogenesis]. 	797
227343	COG5010	TadD	Flp pilus assembly protein TadD, contains TPR repeats  [Intracellular trafficking, secretion, and vesicular transport, Extracellular structures]. 	257
227344	COG5011	COG5011	Uncharacterized conserved protein, DUF2344 family [Function unknown]. 	228
227345	COG5012	MtbC1	Methanogenic corrinoid protein MtbC1  [Energy production and conversion]. 	227
227346	COG5013	NarG	Nitrate reductase alpha subunit [Energy production and conversion, Inorganic ion transport and metabolism]. 	1227
227347	COG5014	COG5014	Uncharacterized Fe-S cluster-containing protein, radical SAM superfamily [General function prediction only]. 	228
227348	COG5015	COG5015	Uncharacterized protein, pyridoxamine 5'-phosphate oxidase (PNPOx-like) family [Function unknown]. 	132
227349	COG5016	OadA1	Pyruvate/oxaloacetate carboxyltransferase  [Energy production and conversion]. 	472
227350	COG5017	COG5017	UDP-N-acetylglucosamine transferase subunit ALG13 [Carbohydrate transport and metabolism]. 	161
227351	COG5018	KapD	Inhibitor of the KinA pathway to sporulation, predicted exonuclease  [General function prediction only]. 	210
227352	COG5019	CDC3	Septin family protein  [Cell cycle control, cell division, chromosome partitioning, Cytoskeleton]. 	373
227353	COG5020	KTR1	Mannosyltransferase [Carbohydrate transport and metabolism]. 	399
227354	COG5021	HUL4	Ubiquitin-protein ligase [Posttranslational modification, protein turnover, chaperones]. 	872
227355	COG5022	COG5022	Myosin heavy chain  [General function prediction only]. 	1463
227356	COG5023	COG5023	Tubulin [Cytoskeleton]. 	443
227357	COG5024	COG5024	Cyclin [Cell division and chromosome partitioning]. 	440
227358	COG5025	COG5025	Transcription factor of the Forkhead/HNF3 family [Transcription]. 	610
227359	COG5026	COG5026	Hexokinase  [Carbohydrate transport and metabolism]. 	466
227360	COG5027	SAS2	Histone acetyltransferase (MYST family) [Chromatin structure and dynamics]. 	395
227361	COG5028	COG5028	Vesicle coat complex COPII, subunit SEC24/subunit SFB2/subunit SFB3 [Intracellular trafficking and secretion]. 	861
227362	COG5029	CAL1	Prenyltransferase, beta subunit  [Posttranslational modification, protein turnover, chaperones, Lipid transport and metabolism]. 	342
227363	COG5030	APS2	Clathrin adaptor complex, small subunit [Intracellular trafficking and secretion]. 	152
227364	COG5031	COQ4	Ubiquinone biosynthesis protein Coq4 [Coenzyme transport and metabolism]. 	235
227365	COG5032	TEL1	Phosphatidylinositol kinase or protein kinase, PI-3  family  [Signal transduction mechanisms]. 	2105
227366	COG5033	TFG3	Transcription initiation factor IIF, auxiliary subunit  [Transcription]. 	225
227367	COG5034	TNG2	Chromatin remodeling protein, contains PhD zinc finger [Chromatin structure and dynamics]. 	271
227368	COG5035	CDC50	Cell cycle control protein [Cell division and chromosome partitioning / Transcription / Signal transduction mechanisms]. 	372
227369	COG5036	COG5036	SPX domain-containing protein involved in vacuolar polyphosphate accumulation  [Inorganic ion transport and metabolism, Intracellular trafficking, secretion, and vesicular transport]. 	509
227370	COG5037	TOS9	Gluconate transport-inducing protein [Signal transduction mechanisms / Carbohydrate transport and metabolism]. 	248
227371	COG5038	COG5038	Ca2+-dependent lipid-binding protein, contains C2 domain  [General function prediction only]. 	1227
227372	COG5039	EpsI	Exopolysaccharide biosynthesis protein EpsI, predicted pyruvyl transferase  [Carbohydrate transport and metabolism, Cell wall/membrane/envelope biogenesis]. 	339
227373	COG5040	BMH1	14-3-3 family protein [Signal transduction mechanisms]. 	268
227374	COG5041	SKB2	Casein kinase II, beta subunit [Signal transduction mechanisms / Cell division and chromosome partitioning / Transcription]. 	242
227375	COG5042	NUP	Purine nucleoside permease  [Nucleotide transport and metabolism]. 	349
227376	COG5043	MRS6	Vacuolar protein sorting-associated protein [Intracellular trafficking and secretion]. 	2552
227377	COG5044	MRS6	RAB proteins geranylgeranyltransferase component A (RAB escort protein)  [Posttranslational modification, protein turnover, chaperones]. 	434
227378	COG5045	COG5045	Ribosomal protein S10E [Translation, ribosomal structure and biogenesis]. 	105
227379	COG5046	MAF1	Protein involved in Mod5 protein sorting [Posttranslational modification, protein turnover, chaperones]. 	282
227380	COG5047	SEC23	Vesicle coat complex COPII, subunit SEC23 [Intracellular trafficking and secretion]. 	755
227381	COG5048	COG5048	FOG: Zn-finger  [General function prediction only]. 	467
227382	COG5049	XRN1	5'-3' exonuclease  [Replication, recombination and repair]. 	953
227383	COG5050	EPT1	sn-1,2-diacylglycerol ethanolamine- and cholinephosphotransferases [Lipid metabolism]. 	384
227384	COG5051	RPL36A	Ribosomal protein L36E [Translation, ribosomal structure and biogenesis]. 	97
227385	COG5052	YOP1	Protein involved in membrane traffic [Intracellular trafficking and secretion]. 	186
227386	COG5053	CDC33	Translation initiation factor 4E (eIF-4E) [Translation, ribosomal structure and biogenesis]. 	217
227387	COG5054	ERV1	Mitochondrial sulfhydryl oxidase involved in the biogenesis of cytosolic Fe/S proteins [Posttranslational modification, protein turnover, chaperones]. 	181
227388	COG5055	RAD52	Recombination DNA repair protein (RAD52 pathway)  [Replication, recombination and repair]. 	375
227389	COG5056	ARE1	Acyl-CoA cholesterol acyltransferase [Lipid metabolism]. 	512
227390	COG5057	LAG1	Phosphotyrosyl phosphatase activator [Cell division and chromosome partitioning / Signal transduction mechanisms]. 	353
227391	COG5058	LAG1	Protein transporter of the TRAM (translocating chain-associating membrane) superfamily, longevity assurance factor [Intracellular trafficking and secretion]. 	395
227392	COG5059	KIP1	Kinesin-like protein [Cytoskeleton]. 	568
227393	COG5061	ERO1	Oxidoreductin, endoplasmic reticulum membrane-associated protein involved in disulfide bond formation [Posttranslational modification, protein turnover, chaperones / Intracellular trafficking and secretion]. 	425
227394	COG5062	COG5062	Uncharacterized membrane protein [Function unknown]. 	429
227395	COG5063	CTH1	CCCH-type Zn-finger protein [General function prediction only]. 	351
227396	COG5064	SRP1	Karyopherin (importin) alpha [Intracellular trafficking and secretion]. 	526
227397	COG5065	PHO88	Protein involved in inorganic phosphate transport [Inorganic ion transport and metabolism]. 	185
227398	COG5066	SCS2	VAMP-associated protein involved in inositol metabolism [Intracellular trafficking and secretion]. 	242
227399	COG5067	DBF4	Protein kinase essential for the initiation of DNA replication [DNA replication, recombination, and repair / Cell division and chromosome partitioning]. 	468
227400	COG5068	ARG80	Regulator of arginine metabolism and related MADS box-containing transcription factors [Transcription]. 	412
227401	COG5069	SAC6	Ca2+-binding actin-bundling protein fimbrin/plastin (EF-Hand superfamily) [Cytoskeleton]. 	612
227402	COG5070	VRG4	Nucleotide-sugar transporter [Carbohydrate transport and metabolism / Posttranslational modification, protein turnover, chaperones / Intracellular trafficking and secretion]. 	309
227403	COG5071	RPN5	26S proteasome regulatory complex component [Posttranslational modification, protein turnover, chaperones]. 	439
227404	COG5072	ALK1	Serine/threonine kinase of the haspin family [Cell division and chromosome partitioning]. 	488
227405	COG5073	VID24	Vacuolar import and degradation protein [Intracellular trafficking and secretion]. 	272
227406	COG5074	COG5074	t-SNARE complex subunit, syntaxin  [Intracellular trafficking, secretion, and vesicular transport]. 	280
227407	COG5075	COG5075	Uncharacterized conserved protein [Function unknown]. 	305
227408	COG5076	COG5076	Transcription factor involved in chromatin remodeling, contains bromodomain [Chromatin structure and dynamics / Transcription]. 	371
227409	COG5077	COG5077	Ubiquitin carboxyl-terminal hydrolase [Posttranslational modification, protein turnover, chaperones]. 	1089
227410	COG5078	COG5078	Ubiquitin-protein ligase  [Posttranslational modification, protein turnover, chaperones]. 	153
227411	COG5079	SAC3	Nuclear protein export factor [Intracellular trafficking and secretion / Cell division and chromosome partitioning]. 	646
227412	COG5080	YIP1	Rab GTPase interacting factor, Golgi membrane protein [Intracellular trafficking and secretion]. 	227
227413	COG5081	COG5081	Predicted membrane protein [Function unknown]. 	180
227414	COG5082	AIR1	Arginine methyltransferase-interacting protein, contains RING Zn-finger [Posttranslational modification, protein turnover, chaperones / Intracellular trafficking and secretion]. 	190
227415	COG5083	SMP2	Phosphatidate phosphatase PAH1, contains Lipin and LNS2 domains; can be involved in plasmid maintenance  [Lipid transport and metabolism]. 	580
227416	COG5084	YTH1	Cleavage and polyadenylation specificity factor (CPSF) Clipper subunit and related makorin family Zn-finger proteins [General function prediction only]. 	285
227417	COG5085	COG5085	Predicted membrane protein [Function unknown]. 	230
227418	COG5086	COG5086	Uncharacterized conserved protein [Function unknown]. 	218
227419	COG5087	RTT109	Uncharacterized conserved protein [Function unknown]. 	349
227420	COG5088	SOH1	Rad5p-binding protein [General function prediction only]. 	114
227421	COG5090	TFG2	Transcription initiation factor IIF, small subunit (RAP30) [Transcription]. 	297
227422	COG5091	SGT1	Suppressor of G2 allele of skp1 and related proteins [General function prediction only]. 	368
227423	COG5092	NMT1	N-myristoyl transferase [Lipid metabolism]. 	451
227424	COG5093	COG5093	Uncharacterized conserved protein [Function unknown]. 	185
227425	COG5094	TAF9	Transcription initiation factor TFIID, subunit TAF9 (also component of histone acetyltransferase SAGA) [Transcription]. 	145
227426	COG5095	TAF6	Transcription initiation factor TFIID, subunit TAF6 (also component of histone acetyltransferase SAGA) [Transcription]. 	450
227427	COG5096	COG5096	Vesicle coat complex, various subunits  [Intracellular trafficking, secretion, and vesicular transport]. 	757
227428	COG5097	MED6	RNA polymerase II transcriptional regulation mediator [Transcription]. 	210
227429	COG5098	COG5098	Chromosome condensation complex Condensin, subunit D2 [Chromatin structure and dynamics / Cell division and chromosome partitioning]. 	1128
227430	COG5099	COG5099	RNA-binding protein of the Puf family, translational repressor [Translation, ribosomal structure and biogenesis]. 	777
227431	COG5100	NPL4	Nuclear pore protein [Nuclear structure]. 	571
227432	COG5101	CRM1	Importin beta-related nuclear transport receptor [Nuclear structure / Intracellular trafficking and secretion]. 	1053
227433	COG5102	SFT2	Membrane protein involved in ER to Golgi transport [Intracellular trafficking and secretion]. 	201
227434	COG5103	CDC39	Cell division control protein, negative regulator of transcription [Cell division and chromosome partitioning / Transcription]. 	2005
227435	COG5104	PRP40	Splicing factor [RNA processing and modification]. 	590
227436	COG5105	MIH1	Mitotic inducer, protein phosphatase [Cell division and chromosome partitioning]. 	427
227437	COG5106	RPF2	Uncharacterized conserved protein [Function unknown]. 	316
227438	COG5107	RNA14	Pre-mRNA 3'-end processing (cleavage and polyadenylation) factor [RNA processing and modification]. 	660
227439	COG5108	RPO41	Mitochondrial DNA-directed RNA polymerase  [Transcription]. 	1117
227440	COG5109	COG5109	Uncharacterized conserved protein, contains RING Zn-finger [General function prediction only]. 	396
227441	COG5110	RPN1	26S proteasome regulatory complex component [Posttranslational modification, protein turnover, chaperones]. 	881
227442	COG5111	RPC34	DNA-directed RNA polymerase III, subunit C34 [Transcription]. 	301
227443	COG5112	UFD2	U1-like Zn-finger-containing protein [General function prediction only]. 	126
227444	COG5113	UFD2	Ubiquitin fusion degradation protein 2 [Posttranslational modification, protein turnover, chaperones]. 	929
227445	COG5114	COG5114	Histone acetyltransferase complex SAGA/ADA, subunit ADA2 [Chromatin structure and dynamics]. 	432
227446	COG5116	RPN2	26S proteasome regulatory complex component [Posttranslational modification, protein turnover, chaperones]. 	926
227447	COG5117	NOC3	Protein involved in the nuclear export of pre-ribosomes [Translation, ribosomal structure and biogenesis / Intracellular trafficking and secretion]. 	657
227448	COG5118	BDP1	Transcription initiation factor TFIIIB, Bdp1 subunit [Transcription]. 	507
227449	COG5119	COG5119	Uncharacterized protein, contains ParB-like nuclease domain  [General function prediction only]. 	119
227450	COG5120	GOT1	Membrane protein involved in Golgi transport [Intracellular trafficking and secretion]. 	129
227451	COG5122	TRS23	Transport protein particle (TRAPP) complex subunit [Intracellular trafficking and secretion]. 	134
227452	COG5123	TOA2	Transcription initiation factor IIA, gamma subunit [Transcription]. 	113
227453	COG5124	COG5124	Protein predicted to be involved in meiotic recombination [Cell division and chromosome partitioning / General function prediction only]. 	209
227454	COG5125	COG5125	Uncharacterized conserved protein [Function unknown]. 	259
227455	COG5126	FRQ1	Ca2+-binding protein, EF-hand superfamily [Signal transduction mechanisms]. 	160
227456	COG5127	COG5127	Vacuolar H+-ATPase V1 sector, subunit C [Energy production and conversion]. 	383
227457	COG5128	COG5128	Transport protein particle (TRAPP) complex subunit [Intracellular trafficking and secretion]. 	208
227458	COG5129	MAK16	Nuclear protein with HMG-like acidic region [General function prediction only]. 	303
227459	COG5130	YIP3	Prenylated rab acceptor 1 and related proteins [Intracellular trafficking and secretion / Signal transduction mechanisms]. 	169
227460	COG5131	URM1	Ubiquitin-like protein [Posttranslational modification, protein turnover, chaperones]. 	96
227461	COG5132	BUD31	Cell cycle control protein, G10 family [Transcription / Cell division and chromosome partitioning]. 	146
227462	COG5133	COG5133	Uncharacterized conserved protein [Function unknown]. 	181
227463	COG5134	COG5134	Uncharacterized conserved protein [Function unknown]. 	272
227464	COG5135	COG5135	Uncharacterized protein [Function unknown]. 	245
227465	COG5136	COG5136	U1 snRNP-specific protein C [RNA processing and modification]. 	188
227466	COG5137	COG5137	Histone chaperone involved in gene silencing [Transcription / Chromatin structure and dynamics]. 	279
227467	COG5138	COG5138	Uncharacterized conserved protein [Function unknown]. 	168
227468	COG5139	COG5139	Uncharacterized conserved protein [Function unknown]. 	397
227469	COG5140	UFD1	Ubiquitin fusion-degradation protein [Posttranslational modification, protein turnover, chaperones]. 	331
227470	COG5141	COG5141	PHD zinc finger-containing protein [General function prediction only]. 	669
227471	COG5142	OXR1	Oxidation resistance protein [DNA replication, recombination, and repair]. 	212
227472	COG5143	SNC1	Synaptobrevin/VAMP-like protein [Intracellular trafficking and secretion]. 	190
227473	COG5144	TFB2	RNA polymerase II transcription initiation/nucleotide excision repair factor TFIIH, subunit TFB2 [Transcription / DNA replication, recombination, and repair]. 	447
227474	COG5145	RAD14	DNA excision repair protein [DNA replication, recombination, and repair]. 	292
227475	COG5146	PanK	Pantothenate kinase [Coenzyme transport and metabolism]. 	342
227476	COG5147	REB1	Myb superfamily proteins, including transcription factors and mRNA splicing factors [Transcription / RNA processing and modification / Cell division and chromosome partitioning]. 	512
227477	COG5148	RPN10	26S proteasome regulatory complex, subunit RPN10/PSMD4 [Posttranslational modification, protein turnover, chaperones]. 	243
227478	COG5149	TOA1	Transcription initiation factor IIA, large chain [Transcription]. 	293
227479	COG5150	COG5150	Class 2 transcription repressor NC2, beta subunit (Dr1) [Transcription]. 	148
227480	COG5151	SSL1	RNA polymerase II transcription initiation/nucleotide excision repair factor TFIIH, subunit SSL1 [Transcription / DNA replication, recombination, and repair]. 	421
227481	COG5152	COG5152	Uncharacterized conserved protein, contains RING and CCCH-type Zn-fingers [General function prediction only]. 	259
227482	COG5153	CVT17	Putative lipase essential for disintegration of autophagic bodies inside the vacuole  [Intracellular trafficking, secretion, and vesicular transport]. 	425
227483	COG5154	BRX1	RNA-binding protein required for 60S ribosomal subunit biogenesis [Translation, ribosomal structure and biogenesis]. 	283
227484	COG5155	ESP1	Separase, a protease involved in sister chromatid separation [Cell division and chromosome partitioning / Posttranslational modification, protein turnover, chaperones]. 	1622
227485	COG5156	DOC1	Anaphase-promoting complex (APC), subunit 10 [Cell division and chromosome partitioning / Posttranslational modification, protein turnover, chaperones]. 	189
227486	COG5157	CDC73	RNA polymerase II accessory factor [Transcription]. 	362
227487	COG5158	SEC1	Proteins involved in synaptic transmission and general secretion, Sec1 family [Intracellular trafficking and secretion]. 	582
227488	COG5159	RPN6	26S proteasome regulatory complex component [Posttranslational modification, protein turnover, chaperones]. 	421
227489	COG5160	ULP1	Protease, Ulp1 family  [Posttranslational modification, protein turnover, chaperones]. 	578
227490	COG5161	SFT1	Pre-mRNA cleavage and polyadenylation specificity factor [RNA processing and modification]. 	1319
227491	COG5162	COG5162	Transcription initiation factor TFIID, subunit TAF10 (also component of histone acetyltransferase SAGA) [Transcription]. 	197
227492	COG5163	NOP7	Protein required for biogenesis of the 60S ribosomal subunit [Translation, ribosomal structure and biogenesis]. 	591
227493	COG5164	SPT5	Transcription elongation factor  [Transcription]. 	607
227494	COG5165	POB3	Nucleosome-binding factor SPN, POB3 subunit [Transcription / DNA replication, recombination, and repair / Chromatin structure and dynamics]. 	508
227495	COG5166	COG5166	Uncharacterized conserved protein [Function unknown]. 	657
227496	COG5167	VID27	Protein involved in vacuole import and degradation [Intracellular trafficking and secretion]. 	776
227497	COG5169	HSF1	Heat shock transcription factor [Transcription]. 	282
227498	COG5170	CDC55	Serine/threonine protein phosphatase 2A, regulatory subunit [Signal transduction mechanisms]. 	460
227499	COG5171	YRB1	Ran GTPase-activating protein (Ran-binding protein) [Intracellular trafficking and secretion]. 	211
227500	COG5173	SEC6	Exocyst complex subunit SEC6 [Intracellular trafficking and secretion]. 	742
227501	COG5174	TFA2	Transcription initiation factor IIE, beta subunit [Transcription]. 	285
227502	COG5175	MOT2	Transcriptional repressor [Transcription]. 	480
227503	COG5176	MSL5	Splicing factor (branch point binding protein) [RNA processing and modification]. 	269
227504	COG5177	COG5177	Uncharacterized conserved protein [Function unknown]. 	769
227505	COG5178	PRP8	U5 snRNP spliceosome subunit [RNA processing and modification]. 	2365
227506	COG5179	TAF1	Transcription initiation factor TFIID, subunit TAF1 [Transcription]. 	968
227507	COG5180	PBP1	PAB1-binding protein PBP1, interacts with poly(A)-binding protein [RNA processing and modification]. 	654
227508	COG5181	HSH155	U2 snRNP spliceosome subunit [RNA processing and modification]. 	975
227509	COG5182	CUS1	Splicing factor 3b, subunit 2 [RNA processing and modification]. 	429
227510	COG5183	SSM4	E3 ubiquitin-protein ligase DOA10 [Posttranslational modification, protein turnover, chaperones]. 	1175
227511	COG5184	ATS1	Alpha-tubulin suppressor and related RCC1 domain-containing proteins  [Cell cycle control, cell division, chromosome partitioning, Cytoskeleton]. 	476
227512	COG5185	HEC1	Protein involved in chromosome segregation, interacts with SMC proteins  [Cell cycle control, cell division, chromosome partitioning]. 	622
227513	COG5186	PAP1	Poly(A) polymerase Pap1 [RNA processing and modification]. 	552
227514	COG5187	RPN7	26S proteasome regulatory complex component, contains PCI domain [Posttranslational modification, protein turnover, chaperones]. 	412
227515	COG5188	PRP9	Splicing factor 3a, subunit 3 [RNA processing and modification]. 	470
227516	COG5189	SFP1	Putative transcriptional repressor regulating G2/M transition [Transcription / Cell division and chromosome partitioning]. 	423
227517	COG5190	FCP1	TFIIF-interacting CTD phosphatase, includes NLI-interacting factor  [Transcription]. 	390
227518	COG5191	COG5191	Uncharacterized conserved protein, contains HAT (Half-A-TPR) repeat [General function prediction only]. 	435
227519	COG5192	BMS1	GTP-binding protein required for 40S ribosome biogenesis [Translation, ribosomal structure and biogenesis]. 	1077
227520	COG5193	LHP1	La protein, small RNA-binding pol III transcript stabilizing protein and related La-motif-containing proteins involved in translation [Posttranslational modification, protein turnover, chaperones / Translation, ribosomal structure and biogenesis]. 	438
227521	COG5194	APC11	Component of SCF ubiquitin ligase and anaphase-promoting complex [Posttranslational modification, protein turnover, chaperones / Cell division and chromosome partitioning]. 	88
227522	COG5195	COG5195	Uncharacterized conserved protein [Function unknown]. 	118
227523	COG5196	ERD2	ER lumen protein retaining receptor [Intracellular trafficking and secretion]. 	214
227524	COG5197	COG5197	Predicted membrane protein [Function unknown]. 	284
227525	COG5198	Ptpl	Protein tyrosine phosphatase-like protein (contains Pro instead of catalytic Arg) [General function prediction only]. 	209
227526	COG5199	SCP1	Calponin [Cytoskeleton]. 	178
227527	COG5200	LUC7	U1 snRNP component, mediates U1 snRNP association with cap-binding complex [RNA processing and modification]. 	258
227528	COG5201	SKP1	SCF ubiquitin ligase, SKP1 component [Posttranslational modification, protein turnover, chaperones]. 	158
227529	COG5202	COG5202	Predicted membrane protein [Function unknown]. 	512
227530	COG5204	SPT4	Transcription elongation factor SPT4 [Transcription]. 	112
227531	COG5206	GPI8	Glycosylphosphatidylinositol transamidase (GPIT), subunit GPI8  [Posttranslational modification, protein turnover, chaperones]. 	382
227532	COG5207	UBP14	Uncharacterized Zn-finger protein, UBP-type [General function prediction only]. 	749
227533	COG5208	HAP5	CCAAT-binding factor, subunit C [Transcription]. 	286
227534	COG5209	RCD1	Uncharacterized protein involved in cell differentiation/sexual development [General function prediction only]. 	315
227535	COG5210	COG5210	GTPase-activating protein [General function prediction only]. 	496
227536	COG5211	SSU72	RNA polymerase II-interacting protein involved in transcription start site selection [Transcription]. 	197
227537	COG5212	PDE1	cAMP phosphodiesterase  [Signal transduction mechanisms]. 	356
227538	COG5213	FIP1	Polyadenylation factor I complex, subunit FIP1 [RNA processing and modification]. 	266
227539	COG5214	POL12	DNA polymerase alpha-primase complex, polymerase-associated subunit B [DNA replication, recombination, and repair]. 	581
227540	COG5215	KAP95	Karyopherin (importin) beta [Intracellular trafficking and secretion]. 	858
227541	COG5216	COG5216	Uncharacterized conserved protein [Function unknown]. 	67
227542	COG5217	BIM1	Microtubule-binding protein involved in cell cycle control [Cell division and chromosome partitioning / Cytoskeleton]. 	342
227543	COG5218	YCG1	Chromosome condensation complex Condensin, subunit G [Chromatin structure and dynamics / Cell division and chromosome partitioning]. 	885
227544	COG5219	COG5219	Uncharacterized conserved protein, contains RING Zn-finger [General function prediction only]. 	1525
227545	COG5220	TFB3	Cdk activating kinase (CAK)/RNA polymerase II transcription initiation/nucleotide excision repair factor TFIIH, subunit TFB3 [Cell division and chromosome partitioning / Transcription / DNA replication, recombination, and repair]. 	314
227546	COG5221	DOP1	Dopey and related predicted leucine zipper transcription factors [Transcription]. 	1618
227547	COG5222	COG5222	Uncharacterized conserved protein, contains RING Zn-finger [General function prediction only]. 	427
227548	COG5223	COG5223	Uncharacterized conserved protein [Function unknown]. 	240
227549	COG5224	HAP2	CCAAT-binding factor, subunit B [Transcription]. 	248
227550	COG5225	RRS1	Uncharacterized protein involved in ribosome biogenesis [Translation, ribosomal structure and biogenesis]. 	172
227551	COG5226	CEG1	mRNA capping enzyme, guanylyltransferase (alpha) subunit [RNA processing and modification]. 	404
227552	COG5227	SMT3	Ubiquitin-like protein (sentrin) [Posttranslational modification, protein turnover, chaperones]. 	103
227553	COG5228	POP2	mRNA deadenylase subunit [RNA processing and modification]. 	299
227554	COG5229	LOC7	Chromosome condensation complex Condensin, subunit H [Chromatin structure and dynamics / Cell division and chromosome partitioning]. 	662
227555	COG5230	COG5230	Uncharacterized conserved protein [Function unknown]. 	194
227556	COG5231	VMA13	Vacuolar H+-ATPase V1 sector, subunit H [Energy production and conversion]. 	432
227557	COG5232	SEC62	Preprotein translocase subunit Sec62 [Intracellular trafficking and secretion]. 	259
227558	COG5233	GRH1	Peripheral Golgi membrane protein [Intracellular trafficking and secretion]. 	417
227559	COG5234	CIN1	Beta-tubulin folding cofactor D [Posttranslational modification, protein turnover, chaperones / Cytoskeleton]. 	993
227560	COG5235	RFA2	Single-stranded DNA-binding replication protein A (RPA), medium (30 kD) subunit [DNA replication, recombination, and repair]. 	258
227561	COG5236	COG5236	Uncharacterized conserved protein, contains RING Zn-finger [General function prediction only]. 	493
227562	COG5237	PER1	Predicted membrane protein [Function unknown]. 	319
227563	COG5238	RNA1	Ran GTPase-activating protein (RanGAP) involved in mRNA processing and transport  [Signal transduction mechanisms, RNA processing and modification]. 	388
227564	COG5239	CCR4	mRNA deadenylase, 3'-5' endonuclease subunit Ccr4 [RNA processing and modification]. 	378
227565	COG5240	SEC21	Vesicle coat complex COPI, gamma subunit [Intracellular trafficking and secretion]. 	898
227566	COG5241	RAD10	Nucleotide excision repair endonuclease NEF1, RAD10 subunit [DNA replication, recombination, and repair]. 	224
227567	COG5242	TFB4	RNA polymerase II transcription initiation/nucleotide excision repair factor TFIIH, subunit TFB4 [Transcription / DNA replication, recombination, and repair]. 	296
227568	COG5243	HRD1	HRD ubiquitin ligase complex, ER membrane component [Posttranslational modification, protein turnover, chaperones]. 	491
227569	COG5244	NIP100	Dynactin complex subunit involved in mitotic spindle partitioning in anaphase B  [Cell cycle control, cell division, chromosome partitioning]. 	669
227570	COG5245	DYN1	Dynein, heavy chain [Cytoskeleton]. 	3164
227571	COG5246	PRP11	Splicing factor 3a, subunit 2 [RNA processing and modification]. 	222
227572	COG5247	BUR6	Class 2 transcription repressor NC2, alpha subunit (DRAP1 homolog) [Transcription]. 	113
227573	COG5248	TAF19	Transcription initiation factor TFIID, subunit TAF13 [Transcription]. 	126
227574	COG5249	RER1	Golgi protein involved in Golgi-to-ER retrieval [Intracellular trafficking and secretion]. 	180
227575	COG5250	RPB4	RNA polymerase II, fourth largest subunit [Transcription]. 	138
227576	COG5251	TAF40	Transcription initiation factor TFIID, subunit TAF11 [Transcription]. 	199
227577	COG5252	COG5252	Uncharacterized conserved protein, contains CCCH-type Zn-finger protein [General function prediction only]. 	299
227578	COG5253	MSS4	Phosphatidylinositol-4-phosphate 5-kinase [Signal transduction mechanisms]. 	612
227579	COG5254	ARV1	Predicted membrane protein [Function unknown]. 	239
227580	COG5255	COG5255	Uncharacterized protein [Function unknown]. 	239
227581	COG5256	TEF1	Translation elongation factor EF-1alpha (GTPase)  [Translation, ribosomal structure and biogenesis]. 	428
227582	COG5257	GCD11	Translation initiation factor 2, gamma subunit (eIF-2gamma; GTPase)  [Translation, ribosomal structure and biogenesis]. 	415
227583	COG5258	GTPBP1	GTPase  [General function prediction only]. 	527
227584	COG5259	RSC8	RSC chromatin remodeling complex subunit RSC8 [Chromatin structure and dynamics / Transcription]. 	531
227585	COG5260	TRF4	DNA polymerase sigma  [Replication, recombination and repair]. 	482
227586	COG5261	IQG1	Protein involved in regulation of cellular morphogenesis/cytokinesis [Cell division and chromosome partitioning / Signal transduction mechanisms]. 	1054
227587	COG5262	HTA1	Histone H2A [Chromatin structure and dynamics]. 	132
227588	COG5263	COG5263	Glucan-binding domain (YG repeat)  [Carbohydrate transport and metabolism]. 	313
227589	COG5264	VTC1	Vacuolar transporter chaperone [Posttranslational modification, protein turnover, chaperones]. 	126
227590	COG5265	ATM1	ABC-type transport system involved in Fe-S cluster assembly, permease and ATPase components  [Posttranslational modification, protein turnover, chaperones]. 	497
227591	COG5266	COG5266	Uncharacterized conserved protein, contains GH25 family domain [General function prediction only]. 	264
227592	COG5267	COG5267	Uncharacterized conserved protein, DUF1800 family [Function unknown]. 	496
227593	COG5268	TrbD	Type IV secretory pathway, TrbD component  [Intracellular trafficking, secretion, and vesicular transport]. 	93
227594	COG5269	ZUO1	Ribosome-associated chaperone zuotin [Translation, ribosomal structure and biogenesis / Posttranslational modification, protein turnover, chaperones]. 	379
227595	COG5270	PUA	PUA domain (predicted RNA-binding domain)  [Translation, ribosomal structure and biogenesis]. 	202
227596	COG5271	MDN1	Midasin, AAA ATPase with  vWA domain, involved in ribosome maturation  [Translation, ribosomal structure and biogenesis]. 	4600
319244	COG5272	UBI4	UBI4; linked to 3D-structure	74
227598	COG5273	COG5273	Uncharacterized protein containing DHHC-type Zn finger [General function prediction only]. 	309
227599	COG5274	CYB5	Cytochrome b involved in lipid metabolism  [Energy production and conversion, Lipid transport and metabolism]. 	164
227600	COG5275	COG5275	BRCT domain type II  [General function prediction only]. 	276
227601	COG5276	COG5276	Uncharacterized conserved protein [Function unknown]. 	370
227602	COG5277	COG5277	Actin-related protein [Cytoskeleton]. 	444
227603	COG5278	CHASE3	Extracellular (periplasmic) sensor domain CHASE3 (specificity unknown) [Signal transduction mechanisms]. 	207
227604	COG5279	CYK3	Cytokinesis protein 3, contains TGc (transglutaminase/protease-like) domain [Cell cycle control, cell division, chromosome partitioning]. 	521
227605	COG5280	YqbO	Phage-related minor tail protein  [Mobilome: prophages, transposons]. 	634
227606	COG5281	COG5281	Phage-related minor tail protein  [Mobilome: prophages, transposons]. 	833
227607	COG5282	COG5282	Uncharacterized conserved protein, DUF2342 family [Function unknown]. 	359
227608	COG5283	COG5283	Phage-related tail protein  [Mobilome: prophages, transposons]. 	1213
227609	COG5285	PhyH	Ectoine hydroxylase-related dioxygenase, phytanoyl-CoA dioxygenase (PhyH) family [Secondary metabolites biosynthesis, transport and catabolism]. 	299
227610	COG5290	COG5290	IkappaB kinase complex, IKAP component [Transcription]. 	1243
227611	COG5291	COG5291	Predicted membrane protein [Function unknown]. 	313
227612	COG5293	YydB	Uncharacterized protein YydD, contains DUF2326 domain [Function unknown]. 	591
227613	COG5294	YxeA	Uncharacterized protein YxeA, DUF1093 family [Function unknown]. 	113
227614	COG5295	Hia	Autotransporter adhesin  [Intracellular trafficking, secretion, and vesicular transport, Extracellular structures]. 	715
227615	COG5296	COG5296	Transcription factor involved in TATA site selection and in elongation by RNA polymerase II [Transcription]. 	521
227616	COG5297	CelA1	Cellulase/cellobiase CelA1 [Carbohydrate transport and metabolism]. 	544
227617	COG5298	YdaL	Predicted metal-dependent carbohydrate esterase YdaL, contains NodB-like catalytic (CE4) domain  [General function prediction only]. 	530
227618	COG5301	COG5301	Phage-related tail fibre protein  [Mobilome: prophages, transposons]. 	587
227619	COG5302	COG5302	Post-segregation antitoxin (ccd killing mechanism protein) encoded by the F plasmid  [Mobilome: prophages, transposons]. 	80
227620	COG5304	COG5304	Predicted DNA binding protein, CopG/RHH family [Transcription]. 	92
227621	COG5305	COG5305	Uncharacterized membrane protein  [Function unknown]. 	552
227622	COG5306	COG5306	Uncharacterized protein [Function unknown]. 	621
227623	COG5307	COG5307	Guanine-nucleotide exchange factor, contains Sec7 domain [General function prediction only]. 	1024
227624	COG5308	NUP170	Nuclear pore complex subunit [Intracellular trafficking and secretion]. 	1263
227625	COG5309	Scw11	Exo-beta-1,3-glucanase,  GH17 family [Carbohydrate transport and metabolism]. 	305
227626	COG5310	COG5310	Homospermidine synthase  [Secondary metabolites biosynthesis, transport and catabolism]. 	481
227627	COG5314	COG5314	Conjugal transfer/entry exclusion protein  [Mobilome: prophages, transposons]. 	252
227628	COG5316	COG5316	Uncharacterized protein [Function unknown]. 	421
227629	COG5317	COG5317	Uncharacterized protein [Function unknown]. 	175
227630	COG5319	COG5319	Uncharacterized protein [Function unknown]. 	142
227631	COG5321	COG5321	Uncharacterized protein [Function unknown]. 	164
227632	COG5322	COG5322	Predicted amino acid dehydrogenase  [General function prediction only]. 	351
227633	COG5323	COG5323	Large terminase phage packaging protein [Mobilome: prophages, transposons]. 	410
227634	COG5324	Trl1	tRNA splicing ligase [Translation, ribosomal structure and biogenesis]. 	758
227635	COG5325	COG5325	t-SNARE complex subunit, syntaxin [Intracellular trafficking and secretion]. 	283
227636	COG5328	COG5328	Uncharacterized protein, UPF0262 family [Function unknown]. 	160
227637	COG5329	COG5329	Phosphoinositide polyphosphatase (Sac family) [Signal transduction mechanisms]. 	570
227638	COG5330	COG5330	Uncharacterized conserved protein, DUF2336 family [Function unknown]. 	364
227639	COG5331	COG5331	Uncharacterized protein [Function unknown]. 	139
227640	COG5333	CCL1	Cdk activating kinase (CAK)/RNA polymerase II transcription initiation/nucleotide excision repair factor TFIIH/TFIIK, cyclin H subunit [Cell division and chromosome partitioning / Transcription / DNA replication, recombination, and repair]. 	297
227641	COG5336	AtpI2	FoF1-type ATP synthase assembly protein I [Energy production and conversion]. 	116
227642	COG5337	CotH	Spore coat protein CotH [Cell wall/membrane/envelope biogenesis]. 	473
227643	COG5338	COG5338	Uncharacterized protein [Function unknown]. 	468
227644	COG5339	YdgA	Uncharacterized conserved protein YdgA, DUF945 family [Function unknown]. 	479
227645	COG5340	COG5340	Transcriptional regulator, predicted component of viral defense system [Defense mechanisms]. 	269
227646	COG5341	COG5341	Uncharacterized protein [Function unknown]. 	132
227647	COG5342	IalB	Invasion protein IalB, involved in pathogenesis  [General function prediction only]. 	181
227648	COG5343	RskA	Anti-sigma-K factor RskA [Signal transduction mechanisms]. 	240
227649	COG5345	COG5345	Uncharacterized protein [Function unknown]. 	358
227650	COG5346	COG5346	Uncharacterized membrane protein [Function unknown]. 	136
227651	COG5347	COG5347	GTPase-activating protein that regulates ARFs (ADP-ribosylation factors), involved in ARF-mediated vesicular transport [Intracellular trafficking and secretion]. 	319
227652	COG5349	COG5349	Uncharacterized conserved protein, DUF983 family [Function unknown]. 	126
227653	COG5350	COG5350	Predicted protein tyrosine phosphatase  [General function prediction only]. 	172
227654	COG5351	COG5351	Uncharacterized protein [Function unknown]. 	367
227655	COG5352	COG5352	Uncharacterized protein [Function unknown]. 	169
227656	COG5353	YpmB	Uncharacterized protein YpmB, contains C-terminal PepSY domain [Function unknown]. 	161
227657	COG5354	COG5354	Uncharacterized protein, contains Trp-Asp (WD) repeat  [General function prediction only]. 	561
227658	COG5360	COG5360	Uncharacterized conserved protein, heparinase superfamily [Function unknown]. 	566
227659	COG5361	COG5361	Uncharacterized conserved protein [Mobilome: prophages, transposons]. 	458
227660	COG5362	COG5362	Phage terminase large subunit [Mobilome: prophages, transposons]. 	202
227661	COG5366	COG5366	Protein involved in propagation of M2 dsRNA satellite of L-A virus [General function prediction only]. 	531
227662	COG5368	COG5368	Uncharacterized protein [Function unknown]. 	451
227663	COG5369	COG5369	Uncharacterized conserved protein [Function unknown]. 	743
227664	COG5371	COG5371	Golgi nucleoside diphosphatase  [Nucleotide transport and metabolism]. 	549
227665	COG5373	COG5373	Uncharacterized membrane protein [Function unknown]. 	931
227666	COG5374	COG5374	Uncharacterized conserved protein [Function unknown]. 	192
227667	COG5375	COG5375	Uncharacterized protein [Function unknown]. 	216
227668	COG5377	COG5377	Phage-related protein, predicted endonuclease  [Mobilome: prophages, transposons]. 	319
227669	COG5378	COG5378	Predicted nucleic acid-binding protein, contains PIN domain  [General function prediction only]. 	175
227670	COG5379	BtaA	S-adenosylmethionine:diacylglycerol 3-amino-3-carboxypropyl transferase  [Lipid transport and metabolism]. 	414
227671	COG5380	LimK	Lipase chaperone LimK [Posttranslational modification, protein turnover, chaperones]. 	283
227672	COG5381	COG5381	Uncharacterized protein [Function unknown]. 	184
227673	COG5383	YdcJ	Uncharacterized metalloenzyme YdcJ, glyoxalase superfamily [General function prediction only]. 	295
227674	COG5384	Mpp10	U3 small nucleolar ribonucleoprotein component  [Translation, ribosomal structure and biogenesis]. 	569
227675	COG5385	COG5385	Uncharacterized protein [Function unknown]. 	214
227676	COG5386	NEAT	Heme-binding NEAT domain [Inorganic ion transport and metabolism]. 	352
227677	COG5387	Atp12	Chaperone required for the assembly of the mitochondrial F1-ATPase  [Posttranslational modification, protein turnover, chaperones]. 	264
227678	COG5388	COG5388	Uncharacterized protein [Function unknown]. 	209
227679	COG5389	COG5389	Uncharacterized protein [Function unknown]. 	181
227680	COG5391	COG5391	Phox homology (PX) domain protein [Intracellular trafficking and secretion / General function prediction only]. 	524
227681	COG5393	YqjE	Uncharacterized membrane protein YqjE [Function unknown]. 	131
227682	COG5394	COG5394	Polyhydroxyalkanoate (PHA) synthesis regulator protein, binds DNA and PHA [Secondary metabolites biosynthesis, transport and catabolism, Signal transduction mechanisms]. 	193
227683	COG5395	COG5395	Uncharacterized membrane protein [Function unknown]. 	131
227684	COG5397	COG5397	Uncharacterized protein [Function unknown]. 	349
227685	COG5398	COG5398	Heme oxygenase  [Coenzyme transport and metabolism]. 	238
227686	COG5399	COG5399	Uncharacterized protein [Function unknown]. 	139
227687	COG5400	COG5400	Uncharacterized protein [Function unknown]. 	205
227688	COG5401	GerM	Spore germination protein GerM [Cell cycle control, cell division, chromosome partitioning]. 	250
227689	COG5402	COG5402	Uncharacterized protein [Function unknown]. 	194
227690	COG5403	COG5403	Uncharacterized protein [Function unknown]. 	285
227691	COG5404	SulA	Cell division inhibitor SulA, prevents FtsZ ring assembly [Cell cycle control, cell division, chromosome partitioning]. 	169
227692	COG5405	HslV	ATP-dependent protease HslVU (ClpYQ), peptidase subunit [Posttranslational modification, protein turnover, chaperones]. 	178
227693	COG5406	COG5406	Nucleosome binding factor SPN, SPT16 subunit  [Transcription, Replication, recombination and repair, Chromatin structure and dynamics]. 	1001
227694	COG5407	SEC63	Preprotein translocase subunit Sec63  [Intracellular trafficking, secretion, and vesicular transport]. 	610
227695	COG5408	COG5408	SPX domain-containing protein [Signal transduction mechanisms]. 	296
227696	COG5409	COG5409	EXS domain-containing protein [Signal transduction mechanisms]. 	384
227697	COG5410	COG5410	Uncharacterized protein [Function unknown]. 	305
227698	COG5411	COG5411	Phosphatidylinositol 5-phosphate phosphatase [Signal transduction mechanisms]. 	460
227699	COG5412	COG5412	Phage-related protein  [Mobilome: prophages, transposons]. 	637
227700	COG5413	COG5413	Uncharacterized integral membrane protein  [Function unknown]. 	168
227701	COG5414	Taf7	TATA-binding protein-associated factor Taf7, part of the TFIID transcription initiation complex [Transcription]. 	392
227702	COG5415	COG5415	Predicted integral membrane metal-binding protein [General function prediction only]. 	251
227703	COG5416	YrvD	Uncharacterized integral membrane protein  [Function unknown]. 	98
227704	COG5417	YukD	Uncharacterized ubiquitin-like protein YukD [Function unknown]. 	81
227705	COG5418	COG5418	Predicted secreted protein  [Function unknown]. 	164
227706	COG5419	COG5419	Uncharacterized protein [Function unknown]. 	160
227707	COG5420	COG5420	Uncharacterized protein [Function unknown]. 	71
227708	COG5421	COG5421	Transposase  [Mobilome: prophages, transposons]. 	480
227709	COG5422	ROM1	RhoGEF, Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases [Signal transduction mechanisms]. 	1175
227710	COG5423	COG5423	Predicted metal-binding protein [Function unknown]. 	167
227711	COG5424	PqqC	Pyrroloquinoline quinone (PQQ) biosynthesis protein C  [Coenzyme transport and metabolism]. 	242
227712	COG5425	Usg	Usg protein (tryptophan operon, function unknown) [Function unknown]. 	90
227713	COG5426	COG5426	Uncharacterized membrane protein  [Function unknown]. 	254
227714	COG5427	COG5427	Uncharacterized membrane protein  [Function unknown]. 	684
227715	COG5428	YuzE	Uncharacterized protein YuzE, DUF2283 family [Function unknown]. 	69
227716	COG5429	COG5429	Uncharacterized protein [Function unknown]. 	261
227717	COG5430	SCPU	Spore coat protein U (SCPU) domain, function unknown  [Function unknown]. 	174
227718	COG5431	COG5431	Predicted nucleic acid-binding protein, contains Zn-finger domain [General function prediction only]. 	117
227719	COG5432	RAD18	RING-finger-containing E3 ubiquitin ligase [Signal transduction mechanisms]. 	391
227720	COG5433	YhhI	Predicted transposase YbfD/YdcC associated with H repeats [Mobilome: prophages, transposons]. 	121
227721	COG5434	Pgu1	Polygalacturonase [Carbohydrate transport and metabolism]. 	542
227722	COG5435	COG5435	Uncharacterized protein [Function unknown]. 	147
227723	COG5436	COG5436	Uncharacterized membrane protein [Function unknown]. 	182
227724	COG5437	COG5437	Predicted secreted protein  [Function unknown]. 	138
227725	COG5438	COG5438	Uncharacterized membrane protein  [Function unknown]. 	385
227726	COG5439	COG5439	Uncharacterized protein [Function unknown]. 	112
227727	COG5440	COG5440	Uncharacterized protein [Function unknown]. 	161
227728	COG5441	COG5441	Uncharacterized protein, UPF0261 family [Function unknown]. 	401
227729	COG5442	FlaF	Flagellar biosynthesis regulator FlaF  [Cell motility]. 	115
227730	COG5443	FlbT	Flagellar biosynthesis regulator FlbT  [Cell motility]. 	148
227731	COG5444	YeeF	Predicted ribonuclease, toxin component of the YeeF-YezG toxin-antitoxin module [Defense mechanisms]. 	565
227732	COG5445	YfaQ	Uncharacterized conserved protein YfaQ, DUF2300 domain [Function unknown]. 	268
227733	COG5446	CbtA	Uncharacterized membrane protein, predicted cobalt transporter CbtA  [General function prediction only]. 	233
227734	COG5447	COG5447	Uncharacterized protein [Function unknown]. 	115
227735	COG5448	COG5448	Uncharacterized protein [Function unknown]. 	184
227736	COG5449	COG5449	Uncharacterized protein [Function unknown]. 	225
227737	COG5450	VapB6	Transcription regulator of the Arc/MetJ class  [Transcription]. 	84
227738	COG5451	COG5451	Predicted secreted protein  [Function unknown]. 	128
227739	COG5452	COG5452	Uncharacterized protein [Function unknown]. 	180
227740	COG5453	COG5453	Uncharacterized protein [Function unknown]. 	96
227741	COG5454	COG5454	Predicted secreted protein  [Function unknown]. 	89
227742	COG5455	RcnB	Periplasmic regulator RcnB of Ni and Co efflux [Inorganic ion transport and metabolism]. 	129
227743	COG5456	FixH	Nitrogen fixation protein FixH [Inorganic ion transport and metabolism]. 	166
227744	COG5457	YjiS	Uncharacterized conserved protein YjiS, DUF1127 family [Function unknown]. 	63
227745	COG5458	COG5458	Uncharacterized protein [Function unknown]. 	144
227746	COG5459	Rsm22	Ribosomal protein RSM22 (predicted mitochondrial rRNA methylase) [Translation, ribosomal structure and biogenesis]. 	484
227747	COG5460	COG5460	Uncharacterized conserved protein, DUF2164 family [Function unknown]. 	82
227748	COG5461	CpaD	Type IV pilus biogenesis protein CpaD/CtpE  [Extracellular structures]. 	224
227749	COG5462	COG5462	Predicted secreted (periplasmic) protein  [Function unknown]. 	138
227750	COG5463	YgiB	Uncharacterized conserved protein YgiB, involved in biofilm formation, UPF0441/DUF1190 family [Function unknown]. 	198
227751	COG5464	YadD	Predicted transposase YdaD [Replication, recombination and repair]. 	289
227752	COG5465	COG5465	Uncharacterized protein [Function unknown]. 	166
227753	COG5466	COG5466	Predicted small metal-binding protein  [Function unknown]. 	59
227754	COG5467	COG5467	Uncharacterized protein [Function unknown]. 	104
227755	COG5468	COG5468	Predicted secreted (periplasmic) protein  [Function unknown]. 	172
227756	COG5469	COG5469	Predicted metal-binding protein [Function unknown]. 	143
227757	COG5470	COG5470	Uncharacterized conserved protein, DUF1330 family [Function unknown]. 	96
227758	COG5471	COG5471	Predicted phage recombinase, RecA/RadA family [Mobilome: prophages, transposons]. 	107
227759	COG5472	COG5472	Predicted small integral membrane protein  [Function unknown]. 	164
227760	COG5473	COG5473	Uncharacterized membrane protein  [Function unknown]. 	290
227761	COG5474	COG5474	Uncharacterized protein [Function unknown]. 	159
227762	COG5475	YodC	Uncharacterized conserved protein YodC, DUF2158 family [Function unknown]. 	60
227763	COG5476	COG5476	Microcystin degradation protein MlrC, contains DUF1485 domain [General function prediction only]. 	488
227764	COG5477	COG5477	Predicted small integral membrane protein  [Function unknown]. 	97
227765	COG5478	Fet4	Low affinity Fe/Cu permease [Inorganic ion transport and metabolism]. 	141
227766	COG5479	Psp3	Uncharacterized conserved protein, contains LGFP repeats [Function unknown]. 	556
227767	COG5480	COG5480	Uncharacterized membrane protein [Function unknown]. 	147
227768	COG5481	COG5481	Uncharacterized protein [Function unknown]. 	67
227769	COG5482	COG5482	Uncharacterized protein [Function unknown]. 	229
227770	COG5483	COG5483	Uncharacterized conserved protein, DUF488 family [Function unknown]. 	289
227771	COG5484	YjcR	Uncharacterized protein YjcR, contains N-terminal HTH domain [Function unknown]. 	279
227772	COG5485	COG5485	Predicted ester cyclase  [General function prediction only]. 	131
227773	COG5486	COG5486	Predicted metal-binding membrane protein [Function unknown]. 	283
227774	COG5487	YtjA	Uncharacterized membrane protein YtjA, UPF0391 family [Function unknown]. 	54
227775	COG5488	COG5488	Uncharacterized membrane protein  [Function unknown]. 	164
227776	COG5489	COG5489	Uncharacterized conserved protein, DUF736 family [Function unknown]. 	107
227777	COG5490	COG5490	Uncharacterized protein [Function unknown]. 	158
227778	COG5491	Did4	Archaeal division protein CdvB, Snf7/Vps24/ESCRT-III family [Cell cycle control, cell division, chromosome partitioning]. 	204
227779	COG5492	YjdB	Uncharacterized conserved protein YjdB, contains Ig-like domain  [General function prediction only]. 	329
227780	COG5493	COG5493	Uncharacterized protein [Function unknown]. 	231
227781	COG5494	COG5494	Predicted thioredoxin/glutaredoxin  [Posttranslational modification, protein turnover, chaperones]. 	265
227782	COG5495	COG5495	Predicted oxidoreductase, contains short-chain dehydrogenase (SDR) and DUF2520 domains [General function prediction only]. 	289
227783	COG5496	COG5496	Predicted thioesterase  [General function prediction only]. 	130
227784	COG5497	COG5497	Predicted secreted protein  [Function unknown]. 	228
227785	COG5498	Acf2	Endoglucanase Acf2 [Carbohydrate transport and metabolism]. 	760
227786	COG5499	HigA	Antitoxin component HigA of the HigAB toxin-antitoxin module, contains an N-terminal HTH domain [Defense mechanisms]. 	120
227787	COG5500	COG5500	Uncharacterized membrane protein  [Function unknown]. 	159
227788	COG5501	COG5501	Predicted secreted protein  [Function unknown]. 	148
227789	COG5502	COG5502	Uncharacterized conserved protein, DUF2267 family [Function unknown]. 	135
227790	COG5503	RpoEps	DNA-dependent RNA polymerase auxiliary subunit epsilon [Transcription, Defense mechanisms]. 	69
227791	COG5504	YjaZ	Predicted Zn-dependent protease YjaZ, DUF2268 family [General function prediction only]. 	280
227792	COG5505	COG5505	Uncharacterized membrane protein  [Function unknown]. 	384
227793	COG5506	YueI	Uncharacterized protein YueI, DUF2278 family [Function unknown]. 	144
227794	COG5507	YbaA	Uncharacterized  conserved protein YbaA, DUF1428 family [Function unknown]. 	117
227795	COG5508	COG5508	Uncharacterized protein [Function unknown]. 	84
227796	COG5509	COG5509	Uncharacterized small protein, DUF1192 family [Function unknown]. 	65
227797	COG5510	COG5510	Predicted small secreted protein  [Function unknown]. 	44
227798	COG5511	COG5511	Bacteriophage capsid protein  [Mobilome: prophages, transposons]. 	492
227799	COG5512	COG5512	Predicted  nucleic acid-binding protein, contains Zn-ribbon domain (includes truncated derivatives)  [General function prediction only]. 	194
227800	COG5513	COG5513	Predicted secreted protein  [Function unknown]. 	113
227801	COG5514	COG5514	Uncharacterized protein [Function unknown]. 	203
227802	COG5515	COG5515	Uncharacterized protein [Function unknown]. 	70
227803	COG5516	COG5516	Conserved protein containing a Zn-ribbon-like motif, possibly RNA-binding  [General function prediction only]. 	196
227804	COG5517	HcaF	3-phenylpropionate/cinnamic acid dioxygenase, small subunit [Secondary metabolites biosynthesis, transport and catabolism]. 	164
227805	COG5518	COG5518	Bacteriophage capsid portal protein  [Mobilome: prophages, transposons]. 	492
227806	COG5519	COG5519	Uncharacterized protein, DUF927 family [Function unknown]. 	562
227807	COG5520	XynC	O-Glycosyl hydrolase  [Cell wall/membrane/envelope biogenesis]. 	433
227808	COG5521	YvdJ	Maltodextrin utilization protein YvdJ (function unknown)  [Carbohydrate transport and metabolism]. 	275
227809	COG5522	YwaF	Uncharacterized membrane protein YwaF  [Function unknown]. 	236
227810	COG5523	COG5523	Uncharacterized membrane protein [Function unknown]. 	271
227811	COG5524	COG5524	Bacteriorhodopsin  [Energy production and conversion, Signal transduction mechanisms]. 	285
227812	COG5525	YbcX	Phage terminase, large subunit GpA [Mobilome: prophages, transposons]. 	611
227813	COG5526	COG5526	Lysozyme family protein  [General function prediction only]. 	191
227814	COG5527	COG5527	Protein involved in initiation of plasmid replication  [Mobilome: prophages, transposons]. 	342
227815	COG5528	COG5528	Uncharacterized membrane protein [Function unknown]. 	155
227816	COG5529	COG5529	Pyocin large subunit  [Secondary metabolites biosynthesis, transport and catabolism]. 	326
227817	COG5530	COG5530	Uncharacterized membrane protein  [Function unknown]. 	247
227818	COG5531	Rsc6	Chromatin remodeling complex protein RSC6, contains SWIB domain [Chromatin structure and dynamics]. 	237
227819	COG5532	yfdQ	Uncharacterized conserved protein YfdQ, DUF2303 family [Function unknown]. 	269
227820	COG5533	COG5533	Ubiquitin C-terminal hydrolase  [Posttranslational modification, protein turnover, chaperones]. 	415
227821	COG5534	RepA	Plasmid replication initiator protein [Mobilome: prophages, transposons]. 	383
227822	COG5535	RAD4	DNA repair protein RAD4 [DNA replication, recombination, and repair]. 	650
227823	COG5536	BET4	Protein prenyltransferase, alpha subunit [Posttranslational modification, protein turnover, chaperones]. 	328
227824	COG5537	IRR1	Cohesin [Cell division and chromosome partitioning]. 	740
227825	COG5538	SEC66	Endoplasmic reticulum translocation complex, subunit SEC66 [Cell motility and secretion]. 	180
227826	COG5539	COG5539	Predicted cysteine protease (OTU family) [Posttranslational modification, protein turnover, chaperones]. 	306
227827	COG5540	COG5540	RING-finger-containing ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 	374
227828	COG5541	RET3	Vesicle coat complex COPI, zeta subunit [Posttranslational modification, protein turnover, chaperones]. 	187
227829	COG5542	COG5542	Mannosyltransferase related to Gpi18 [Carbohydrate transport and metabolism]. 	420
227830	COG5543	COG5543	Uncharacterized conserved protein [Function unknown]. 	1400
227831	COG5544	yfiM	Uncharacterized conserved protein YfiM, DUF2279 family [Function unknown]. 	101
227832	COG5545	COG5545	Predicted P-loop ATPase and inactivated derivatives  [Mobilome: prophages, transposons]. 	517
227833	COG5546	COG5546	Uncharacterized membrane protein [Function unknown]. 	80
227834	COG5547	COG5547	Uncharacterized membrane protein [Function unknown]. 	62
227835	COG5548	COG5548	Uncharacterized membrane protein, UPF0136 family [Function unknown]. 	105
227836	COG5549	COG5549	Predicted Zn-dependent protease  [Posttranslational modification, protein turnover, chaperones]. 	236
227837	COG5550	COG5550	Predicted aspartyl protease  [Posttranslational modification, protein turnover, chaperones]. 	125
227838	COG5551	Cas6	CRISPR/Cas system endoribonuclease Cas6, RAMP superfamily [Defense mechanisms]. 	261
227839	COG5552	COG5552	Uncharacterized protein [Function unknown]. 	88
227840	COG5553	COG5553	Predicted metal-dependent enzyme of the double-stranded beta helix superfamily  [General function prediction only]. 	191
227841	COG5554	NifT	Nitrogen fixation protein  [Secondary metabolites biosynthesis, transport and catabolism]. 	69
227842	COG5555	COG5555	Cytolysin, a secreted calcineurin-like phosphatase  [Intracellular trafficking, secretion, and vesicular transport]. 	392
227843	COG5556	COG5556	Uncharacterized protein [Function unknown]. 	110
227844	COG5557	HybB	Ni/Fe-hydrogenase 2 integral membrane subunit HybB [Energy production and conversion]. 	401
227845	COG5558	COG5558	Transposase  [Mobilome: prophages, transposons]. 	261
227846	COG5559	COG5559	Uncharacterized conserved small protein [Function unknown]. 	65
227847	COG5560	UBP12	Ubiquitin C-terminal hydrolase [Posttranslational modification, protein turnover, chaperones]. 	823
227848	COG5561	COG5561	Predicted metal-binding protein [Function unknown]. 	101
227849	COG5562	YbcV	Prophage-encoded protein YbcV, DUF1398 family [Mobilome: prophages, transposons]. 	137
227850	COG5563	COG5563	Uncharacterized membrane protein  [Function unknown]. 	379
227851	COG5564	COG5564	Predicted TIM-barrel enzyme [Function unknown]. 	276
227852	COG5565	COG5565	Bacteriophage terminase large (ATPase) subunit and inactivated derivatives  [Mobilome: prophages, transposons]. 	79
227853	COG5566	COG5566	Transcriptional regulator, Middle operon regulator (Mor) family [Transcription]. 	137
227854	COG5567	YifL	Predicted small periplasmic lipoprotein YifL (function unknown) [Function unknown]. 	58
227855	COG5568	COG5568	Uncharacterized protein [Function unknown]. 	85
227856	COG5569	CusF	Periplasmic Cu and Ag efflux protein CusF [Inorganic ion transport and metabolism]. 	108
227857	COG5570	COG5570	Uncharacterized protein [Function unknown]. 	57
227858	COG5571	YhjY	Uncharacterized protein YhjY, contains autotransporter beta-barrel domain [General function prediction only]. 	239
227859	COG5572	COG5572	Uncharacterized membrane protein [Function unknown]. 	104
227860	COG5573	COG5573	Predicted nucleic acid-binding protein, contains PIN domain  [General function prediction only]. 	142
227861	COG5574	PEX10	RING-finger-containing E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 	271
227862	COG5575	ORC2	Origin recognition complex, subunit 2 [DNA replication, recombination, and repair]. 	535
227863	COG5576	COG5576	Homeodomain-containing transcription factor [Transcription]. 	156
227864	COG5577	CotF	Spore coat protein CotF [Cell wall/membrane/envelope biogenesis]. 	145
227865	COG5578	YesL	Uncharacterized membrane protein YesL  [Function unknown]. 	208
227866	COG5579	COG5579	Uncharacterized protein, DUF1810 family [Function unknown]. 	143
227867	COG5580	COG5580	Activator of HSP90 ATPase  [Posttranslational modification, protein turnover, chaperones]. 	272
227868	COG5581	YcgR	c-di-GMP-binding flagellar brake protein YcgR, contains PilZNR and PilZ domains [Cell motility]. 	233
227869	COG5582	YpiB	Uncharacterized protein YpiB, UPF0302 family [Function unknown]. 	182
227870	COG5583	COG5583	Uncharacterized protein [Function unknown]. 	54
227871	COG5584	COG5584	Predicted small secreted protein  [Function unknown]. 	103
227872	COG5585	COG5585	NAD+--asparagine ADP-ribosyltransferase  [Signal transduction mechanisms]. 	417
227873	COG5586	COG5586	Uncharacterized protein [Function unknown]. 	110
227874	COG5587	COG5587	Uncharacterized conserved protein, DUF2461 family [Function unknown]. 	228
227875	COG5588	COG5588	Uncharacterized protein [Function unknown]. 	207
227876	COG5589	COG5589	Uncharacterized protein [Function unknown]. 	164
227877	COG5590	COG5590	Ubiquinone biosynthesis protein COQ9 [Coenzyme transport and metabolism]. 	229
227878	COG5591	COG5591	Uncharacterized protein [Function unknown]. 	103
227879	COG5592	COG5592	Hemerythrin superfamily protein [General function prediction only]. 	171
227880	COG5593	COG5593	Nucleic-acid-binding protein possibly involved in ribosomal biogenesis [Translation, ribosomal structure and biogenesis]. 	821
227881	COG5594	COG5594	Uncharacterized integral membrane protein [Function unknown]. 	827
227882	COG5595	COG5595	Predicted  nucleic acid-binding protein, contains Zn-ribbon domain [General function prediction only]. 	256
227883	COG5596	TIM22	Mitochondrial import inner membrane translocase, subunit TIM22 [Posttranslational modification, protein turnover, chaperones]. 	191
227884	COG5597	Gnt1	Alpha-N-acetylglucosamine transferase  [Cell wall/membrane/envelope biogenesis]. 	368
227885	COG5598	MttB2	Trimethylamine:corrinoid methyltransferase  [Coenzyme transport and metabolism]. 	526
227886	COG5599	COG5599	Protein tyrosine phosphatase  [Signal transduction mechanisms]. 	302
227887	COG5600	COG5600	Transcription-associated recombination protein [DNA replication, recombination, and repair]. 	413
227888	COG5601	CDC36	General negative regulator of transcription subunit [Transcription]. 	172
227889	COG5602	Sin3	Histone deacetylase complex, regulatory component SIN3  [Chromatin structure and dynamics]. 	1163
227890	COG5603	TRS20	Subunit of TRAPP, an ER-Golgi tethering complex [Cell motility and secretion]. 	136
227891	COG5604	COG5604	Uncharacterized conserved protein [Function unknown]. 	523
227892	COG5605	COG5605	Cytochrome c oxidase subunit IV [Energy production and conversion]. 	115
227893	COG5606	COG5606	Predicted DNA-binding protein, XRE-type HTH domain [General function prediction only]. 	91
227894	COG5607	CHAD	CHAD domain (function unknown) [Function unknown]. 	283
227895	COG5608	COG5608	LEA14-like desiccation related protein  [Defense mechanisms]. 	161
227896	COG5609	YbcI	Uncharacterized protein YbcI, DUF2294 family [Function unknown]. 	124
227897	COG5610	COG5610	Predicted hydrolase, HAD superfamily [General function prediction only]. 	635
227898	COG5611	COG5611	Predicted nucleic-acid-binding protein, contains PIN domain  [General function prediction only]. 	130
227899	COG5612	COG5612	Uncharacterized membrane protein [Function unknown]. 	148
227900	COG5613	COG5613	Uncharacterized protein [Function unknown]. 	400
227901	COG5614	COG5614	Bacteriophage head-tail adaptor  [Mobilome: prophages, transposons]. 	109
227902	COG5615	COG5615	Uncharacterized membrane protein  [Function unknown]. 	161
227903	COG5616	TolBN	TolB amino-terminal domain (function unknown) [General function prediction only]. 	152
227904	COG5617	COG5617	Uncharacterized membrane protein  [Function unknown]. 	801
227905	COG5618	COG5618	Predicted periplasmic lipoprotein  [Function unknown]. 	206
227906	COG5619	COG5619	Uncharacterized protein [Function unknown]. 	224
227907	COG5620	COG5620	Uncharacterized protein [Function unknown]. 	200
227908	COG5621	COG5621	Predicted secreted hydrolase  [General function prediction only]. 	354
227909	COG5622	COG5622	Protein required for attachment to host cells  [Cell wall/membrane/envelope biogenesis]. 	139
227910	COG5623	CLP1	Predicted GTPase subunit of the pre-mRNA cleavage complex [Translation, ribosomal structure and biogenesis]. 	424
227911	COG5624	COG5624	Transcription initiation factor TFIID, subunit TAF12 [Transcription]. 	505
227912	COG5625	COG5625	Predicted DNA-binding transcriptional regulator, contains HTH domain  [Transcription]. 	113
227913	COG5626	COG5626	Uncharacterized protein [Function unknown]. 	97
227914	COG5627	COG5627	SUMO ligase MMS21, Smc5/6 complex, required for cell growth and DNA repair [Replication, recombination and repair]. 	275
227915	COG5628	COG5628	Predicted acetyltransferase  [General function prediction only]. 	143
227916	COG5629	COG5629	Predicted metal-binding protein [Function unknown]. 	321
227917	COG5630	Arg2	Acetylglutamate synthase  [Amino acid transport and metabolism]. 	495
227918	COG5631	COG5631	Predicted transcription regulator, contains HTH domain, MarR family  [Transcription]. 	199
227919	COG5632	CwlA	N-acetylmuramoyl-L-alanine amidase CwlA [Cell wall/membrane/envelope biogenesis]. 	302
227920	COG5633	YcfL	Uncharacterized conserved protein YcfL [Function unknown]. 	123
227921	COG5634	YukJ	Uncharacterized protein YukJ, DUF2278 family [Function unknown]. 	223
227922	COG5635	COG5635	Predicted NTPase, NACHT family domain [Signal transduction mechanisms]. 	824
227923	COG5636	COG5636	Uncharacterized conserved protein, contains Zn-ribbon-like motif [Function unknown]. 	284
227924	COG5637	COG5637	Uncharacterized membrane protein [Function unknown]. 	217
227925	COG5638	COG5638	Uncharacterized conserved protein [Function unknown]. 	622
227926	COG5639	COG5639	Uncharacterized protein [Function unknown]. 	77
227927	COG5640	COG5640	Secreted trypsin-like serine protease  [Posttranslational modification, protein turnover, chaperones]. 	413
227928	COG5641	GAT1	GATA Zn-finger-containing transcription factor [Transcription]. 	498
227929	COG5642	COG5642	Uncharacterized conserved protein, DUF2384 family [Function unknown]. 	149
227930	COG5643	COG5643	Protein containing a metal-binding domain shared with formylmethanofuran dehydrogenase subunit E  [General function prediction only]. 	685
227931	COG5644	COG5644	U3 small nucleolar RNA-associated protein 14 [Function unknown]. 	869
227932	COG5645	YceK	Uncharacterized conserved protein YceK [Function unknown]. 	80
227933	COG5646	YdhG	Uncharacterized conserved protein YdhG, YjbR/CyaY-like superfamily, DUF1801 family [Function unknown]. 	126
227934	COG5647	COG5647	Cullin, a subunit of E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]. 	773
227935	COG5648	NHP6B	Chromatin-associated proteins containing the HMG domain [Chromatin structure and dynamics]. 	211
227936	COG5649	COG5649	Uncharacterized protein [Function unknown]. 	132
227937	COG5650	COG5650	Uncharacterized membrane protein  [Function unknown]. 	536
227938	COG5651	COG5651	PPE-repeat protein [Function unknown]. 	490
227939	COG5652	COG5652	VanZ-like family protein (function unknown) [Function unknown]. 	148
227940	COG5653	BcsL	Acetyltransferase involved in cellulose biosynthesis, CelD/BcsL family [Cell motility]. 	406
227941	COG5654	COG5654	Uncharacterized conserved protein, contains RES domain [Function unknown]. 	163
227942	COG5655	REP	Plasmid rolling circle replication initiator protein REP and truncated derivatives  [Mobilome: prophages, transposons]. 	256
227943	COG5656	SXM1	Importin, protein involved in nuclear import [Posttranslational modification, protein turnover, chaperones]. 	970
227944	COG5657	CSE1	CAS/CSE protein involved in chromosome segregation [Cell division and chromosome partitioning]. 	947
227945	COG5658	COG5658	Uncharacterized membrane protein  [Function unknown]. 	204
227946	COG5659	COG5659	SRSO17 transposase [Mobilome: prophages, transposons]. 	385
227947	COG5660	YlaC	Predicted anti-sigma-YlaC factor YlaD, contains Zn-finger domain  [Signal transduction mechanisms]. 	238
227948	COG5661	COG5661	Predicted secreted Zn-dependent protease  [Posttranslational modification, protein turnover, chaperones]. 	210
227949	COG5662	RsiW	Transmembrane transcriptional regulator (anti-sigma factor RsiW)  [Transcription]. 	256
227950	COG5663	YqfW	Uncharacterized protein, HAD superfamily  [General function prediction only]. 	194
227951	COG5664	COG5664	Predicted secreted Zn-dependent protease  [Posttranslational modification, protein turnover, chaperones]. 	201
227952	COG5665	COG5665	CCR4-NOT transcriptional regulation complex, NOT5 subunit  [Transcription]. 	548
177099	MTH00001	ND4L	NADH dehydrogenase subunit 4L; Provisional	99
214398	MTH00004	ND5	NADH dehydrogenase subunit 5; Validated	602
164583	MTH00005	ATP6	ATP synthase F0 subunit 6; Provisional	231
133649	MTH00007	COX1	cytochrome c oxidase subunit I; Validated	511
164584	MTH00008	COX2	cytochrome c oxidase subunit II; Validated	228
177101	MTH00009	COX3	cytochrome c oxidase subunit III; Validated	259
164586	MTH00010	ND1	NADH dehydrogenase subunit 1; Validated	311
164587	MTH00011	ND2	NADH dehydrogenase subunit 2; Validated	330
164588	MTH00012	ND3	NADH dehydrogenase subunit 3; Validated	117
133655	MTH00013	ND4L	NADH dehydrogenase subunit 4L; Validated	97
214399	MTH00014	ND4	NADH dehydrogenase subunit 4; Validated	452
164590	MTH00015	ND6	NADH dehydrogenase subunit 6; Validated	155
177102	MTH00016	CYTB	cytochrome b; Validated	378
164592	MTH00018	ND3	NADH dehydrogenase subunit 3; Validated	113
214400	MTH00020	ND5	NADH dehydrogenase subunit 5; Reviewed	610
214401	MTH00021	ND6	NADH dehydrogenase subunit 6; Validated	188
164595	MTH00022	CYTB	cytochrome b; Validated	379
214402	MTH00023	COX2	cytochrome c oxidase subunit II; Validated	240
214403	MTH00024	COX3	cytochrome c oxidase subunit III; Validated	261
214404	MTH00025	ATP8	ATP synthase F0 subunit 8; Validated	70
164599	MTH00026	COX1	cytochrome c oxidase subunit I; Provisional	534
214405	MTH00027	COX2	cytochrome c oxidase subunit II; Provisional	262
214406	MTH00028	COX3	cytochrome c oxidase subunit III; Provisional	297
177108	MTH00029	ND1	NADH dehydrogenase subunit 1; Provisional	343
164603	MTH00030	ND3	NADH dehydrogenase subunit 3; Provisional	123
214407	MTH00032	ND5	NADH dehydrogenase subunit 5; Provisional	669
133672	MTH00033	CYTB	cytochrome b; Provisional	383
177109	MTH00034	CYTB	cytochrome b; Validated	379
177110	MTH00035	ATP6	ATP synthase F0 subunit 6; Validated	229
214408	MTH00036	ATP8	ATP synthase F0 subunit 8; Validated	54
177112	MTH00037	COX1	cytochrome c oxidase subunit I; Provisional	517
177113	MTH00038	COX2	cytochrome c oxidase subunit II; Provisional	229
177114	MTH00039	COX3	cytochrome c oxidase subunit III; Validated	260
214409	MTH00040	ND1	NADH dehydrogenase subunit 1; Validated	323
177116	MTH00041	ND2	NADH dehydrogenase subunit 2; Validated	349
177117	MTH00042	ND3	NADH dehydrogenase subunit 3; Validated	116
214410	MTH00043	ND4L	NADH dehydrogenase subunit 4L; Validated	98
214411	MTH00044	ND4	NADH dehydrogenase subunit 4; Validated	458
177120	MTH00045	ND6	NADH dehydrogenase subunit 6; Validated	162
177121	MTH00046	CYTB	cytochrome b; Validated	355
214412	MTH00047	COX2	cytochrome c oxidase subunit II; Provisional	194
177123	MTH00048	COX1	cytochrome c oxidase subunit I; Provisional	511
177124	MTH00049	COX3	cytochrome c oxidase subunit III; Validated	215
177125	MTH00050	ATP6	ATP synthase F0 subunit 6; Validated	170
177126	MTH00051	COX2	cytochrome c oxidase subunit II; Provisional	234
164623	MTH00052	COX3	cytochrome c oxidase subunit III; Provisional	262
164624	MTH00053	CYTB	cytochrome b; Provisional	381
177127	MTH00054	ND1	NADH dehydrogenase subunit 1; Provisional	324
177128	MTH00055	ND3	NADH dehydrogenase subunit 3; Provisional	118
177129	MTH00057	ND6	NADH dehydrogenase subunit 6; Provisional	186
177130	MTH00058	ND1	NADH dehydrogenase subunit 1; Provisional	293
177131	MTH00059	ND2	NADH dehydrogenase subunit 2; Provisional	289
177132	MTH00060	ND3	NADH dehydrogenase subunit 3; Provisional	116
177133	MTH00061	ND4L	NADH dehydrogenase subunit 4L; Provisional	86
214413	MTH00062	ND4	NADH dehydrogenase subunit 4; Provisional	417
214414	MTH00063	ND5	NADH dehydrogenase subunit 5; Provisional	522
177136	MTH00064	ND6	NADH dehydrogenase subunit 6; Provisional	151
214415	MTH00065	ND6	NADH dehydrogenase subunit 6; Provisional	172
214416	MTH00066	ND5	NADH dehydrogenase subunit 5; Provisional	598
177139	MTH00067	ND4L	NADH dehydrogenase subunit 4L; Provisional	98
214417	MTH00068	ND4	NADH dehydrogenase subunit 4; Provisional	458
177141	MTH00069	ND3	NADH dehydrogenase subunit 3; Provisional	114
177142	MTH00070	ND2	NADH dehydrogenase subunit 2; Provisional	346
214418	MTH00071	ND1	NADH dehydrogenase subunit 1; Provisional	322
164642	MTH00072	ATP8	ATP synthase F0 subunit 8; Provisional	54
177144	MTH00073	ATP6	ATP synthase F0 subunit 6; Provisional	227
177145	MTH00074	CYTB	cytochrome b; Provisional	380
177146	MTH00075	COX3	cytochrome c oxidase subunit III; Provisional	261
164646	MTH00076	COX2	cytochrome c oxidase subunit II; Provisional	228
214419	MTH00077	COX1	cytochrome c oxidase subunit I; Provisional	514
177148	MTH00079	COX1	cytochrome c oxidase subunit I; Provisional	508
177149	MTH00080	COX2	cytochrome c oxidase subunit II; Provisional	231
177150	MTH00083	COX3	cytochrome c oxidase subunit III; Provisional	256
177151	MTH00086	CYTB	cytochrome b; Provisional	355
177152	MTH00087	ATP6	ATP synthase F0 subunit 6; Provisional	195
177153	MTH00090	ND1	NADH dehydrogenase subunit 1; Provisional	284
177154	MTH00091	ND2	NADH dehydrogenase subunit 2; Provisional	273
177155	MTH00092	ND3	NADH dehydrogenase subunit 3; Provisional	111
177156	MTH00093	ND4L	NADH dehydrogenase subunit 4L; Provisional	77
177157	MTH00094	ND4	NADH dehydrogenase subunit 4; Provisional	403
177158	MTH00095	ND5	NADH dehydrogenase subunit 5; Provisional	527
177159	MTH00097	ND6	NADH dehydrogenase subunit 6; Provisional	121
177160	MTH00098	COX2	cytochrome c oxidase subunit II; Validated	227
177161	MTH00099	COX3	cytochrome c oxidase subunit III; Validated	261
177162	MTH00100	CYTB	cytochrome b; Provisional	379
177163	MTH00101	ATP6	ATP synthase F0 subunit 6; Validated	226
214420	MTH00102	ATP8	ATP synthase F0 subunit 8; Validated	67
177165	MTH00103	COX1	cytochrome c oxidase subunit I; Validated	513
177166	MTH00104	ND1	NADH dehydrogenase subunit 1; Provisional	318
177167	MTH00105	ND2	NADH dehydrogenase subunit 2; Provisional	347
177168	MTH00106	ND3	NADH dehydrogenase subunit 3; Provisional	115
177169	MTH00107	ND4L	NADH dehydrogenase subunit 4L; Provisional	98
177170	MTH00108	ND5	NADH dehydrogenase subunit 5; Provisional	602
214421	MTH00109	ND6	NADH dehydrogenase subunit 6; Provisional	175
177172	MTH00110	ND4	NADH dehydrogenase subunit 4; Provisional	459
214422	MTH00111	ND1	NADH dehydrogenase subunit 1; Provisional	323
214423	MTH00112	ND2	NADH dehydrogenase subunit 2; Provisional	346
177175	MTH00113	ND3	NADH dehydrogenase subunit 3; Provisional	114
214424	MTH00115	ND6	NADH dehydrogenase subunit 6; Provisional	174
177177	MTH00116	COX1	cytochrome c oxidase subunit I; Provisional	515
177178	MTH00117	COX2	cytochrome c oxidase subunit II; Provisional	227
177179	MTH00118	COX3	cytochrome c oxidase subunit III; Provisional	261
214425	MTH00119	CYTB	cytochrome b; Provisional	380
177181	MTH00120	ATP6	ATP synthase F0 subunit 6; Provisional	227
214426	MTH00123	ATP8	ATP synthase F0 subunit 8; Provisional	54
214427	MTH00124	ND4	NADH dehydrogenase subunit 4; Provisional	457
177184	MTH00125	ND4L	NADH dehydrogenase subunit 4L; Provisional	98
177185	MTH00126	ND4L	NADH dehydrogenase subunit 4L; Provisional	98
177186	MTH00127	ND4	NADH dehydrogenase subunit 4; Provisional	459
177187	MTH00129	COX2	cytochrome c oxidase subunit II; Provisional	230
177188	MTH00130	COX3	cytochrome c oxidase subunit III; Provisional	261
177189	MTH00131	CYTB	cytochrome b; Provisional	380
177190	MTH00132	ATP6	ATP synthase F0 subunit 6; Provisional	227
177191	MTH00133	ATP8	ATP synthase F0 subunit 8; Provisional	55
177192	MTH00134	ND1	NADH dehydrogenase subunit 1; Provisional	324
177193	MTH00135	ND2	NADH dehydrogenase subunit 2; Provisional	347
177194	MTH00136	ND3	NADH dehydrogenase subunit 3; Provisional	116
214428	MTH00137	ND5	NADH dehydrogenase subunit 5; Provisional	611
177196	MTH00138	ND6	NADH dehydrogenase subunit 6; Provisional	173
214429	MTH00139	COX2	cytochrome c oxidase subunit II; Provisional	226
214430	MTH00140	COX2	cytochrome c oxidase subunit II; Provisional	228
177199	MTH00141	COX3	cytochrome c oxidase subunit III; Provisional	259
214431	MTH00142	COX1	cytochrome c oxidase subunit I; Provisional	511
177201	MTH00143	ND1	NADH dehydrogenase subunit 1; Provisional	307
214432	MTH00144	ND2	NADH dehydrogenase subunit 2; Provisional	328
177203	MTH00145	CYTB	cytochrome b; Provisional	379
177204	MTH00147	ATP8	ATP synthase F0 subunit 8; Provisional	51
214433	MTH00148	ND3	NADH dehydrogenase subunit 3; Provisional	117
214434	MTH00149	ND4L	NADH dehydrogenase subunit 4L; Provisional	97
214435	MTH00150	ND4	NADH dehydrogenase subunit 4; Provisional	417
214436	MTH00151	ND5	NADH dehydrogenase subunit 5; Provisional	565
214437	MTH00152	ND6	NADH dehydrogenase subunit 6; Provisional	163
177210	MTH00153	COX1	cytochrome c oxidase subunit I; Provisional	511
214438	MTH00154	COX2	cytochrome c oxidase subunit II; Provisional	227
214439	MTH00155	COX3	cytochrome c oxidase subunit III; Provisional	255
214440	MTH00156	CYTB	cytochrome b; Provisional	356
214441	MTH00157	ATP6	ATP synthase F0 subunit 6; Provisional	223
177215	MTH00158	ATP8	ATP synthase F0 subunit 8; Provisional	32
214442	MTH00160	ND2	NADH dehydrogenase subunit 2; Provisional	335
177217	MTH00161	ND3	NADH dehydrogenase subunit 3; Provisional	113
177218	MTH00162	ND4L	NADH dehydrogenase subunit 4L; Provisional	89
214443	MTH00163	ND4	NADH dehydrogenase subunit 4; Provisional	445
214444	MTH00165	ND5	NADH dehydrogenase subunit 5; Provisional	573
214445	MTH00166	ND6	NADH dehydrogenase subunit 6; Provisional	160
177222	MTH00167	COX1	cytochrome c oxidase subunit I; Provisional	512
177223	MTH00168	COX2	cytochrome c oxidase subunit II; Provisional	225
214446	MTH00169	ATP8	ATP synthase F0 subunit 8; Provisional	67
177225	MTH00171	ATP8	ATP synthase F0 subunit 8; Provisional	54
214447	MTH00172	ATP6	ATP synthase F0 subunit 6; Provisional	232
214448	MTH00173	ATP6	ATP synthase F0 subunit 6; Provisional	231
133799	MTH00174	ATP6	ATP synthase F0 subunit 6; Provisional	252
177228	MTH00175	ATP6	ATP synthase F0 subunit 6; Provisional	244
214449	MTH00176	ATP6	ATP synthase F0 subunit 6; Provisional	229
177230	MTH00179	ATP6	ATP synthase F0 subunit 6; Provisional	227
177231	MTH00180	ND4L	NADH dehydrogenase subunit 4L; Provisional	99
214450	MTH00181	ND4L	NADH dehydrogenase subunit 4L; Provisional	93
214451	MTH00182	COX1	cytochrome c oxidase subunit I; Provisional	525
177234	MTH00183	COX1	cytochrome c oxidase subunit I; Provisional	516
177235	MTH00184	COX1	cytochrome c oxidase subunit I; Provisional	519
164736	MTH00185	COX2	cytochrome c oxidase subunit II; Provisional	230
177236	MTH00186	ATP8	ATP synthase F0 subunit 8; Provisional	52
177237	MTH00188	ND4L	NADH dehydrogenase subunit 4L; Provisional	97
177238	MTH00189	COX3	cytochrome c oxidase subunit III; Provisional	260
177239	MTH00191	CYTB	cytochrome b; Provisional	365
177240	MTH00192	ND4L	NADH dehydrogenase subunit 4L; Provisional	99
177241	MTH00193	ND1	NADH dehydrogenase subunit 1; Provisional	306
214452	MTH00195	ND1	NADH dehydrogenase subunit 1; Provisional	307
214453	MTH00196	ND2	NADH dehydrogenase subunit 2; Provisional	365
177244	MTH00197	ND2	NADH dehydrogenase subunit 2; Provisional	323
214454	MTH00198	ND2	NADH dehydrogenase subunit 2; Provisional	607
177245	MTH00199	ND2	NADH dehydrogenase subunit 2; Provisional	460
177246	MTH00200	ND2	NADH dehydrogenase subunit 2; Provisional	347
214455	MTH00202	ND3	NADH dehydrogenase subunit 3; Provisional	117
214456	MTH00203	ND3	NADH dehydrogenase subunit 3; Provisional	112
164750	MTH00204	ND4	NADH dehydrogenase subunit 4; Provisional	485
214457	MTH00205	ND4	NADH dehydrogenase subunit 4; Provisional	448
214458	MTH00206	ND4	NADH dehydrogenase subunit 4; Provisional	450
164753	MTH00207	ND5	NADH dehydrogenase subunit 5; Provisional	572
177251	MTH00208	ND5	NADH dehydrogenase subunit 5; Provisional	628
177252	MTH00209	ND5	NADH dehydrogenase subunit 5; Provisional	564
177253	MTH00210	ND5	NADH dehydrogenase subunit 5; Provisional	616
214459	MTH00211	ND5	NADH dehydrogenase subunit 5; Provisional	597
214460	MTH00212	ND6	NADH dehydrogenase subunit 6; Provisional	160
177256	MTH00213	ND6	NADH dehydrogenase subunit 6; Provisional	239
214461	MTH00214	ND6	NADH dehydrogenase subunit 6; Provisional	168
164761	MTH00216	ND1	NADH dehydrogenase subunit 1; Provisional	327
214462	MTH00217	ND4	NADH dehydrogenase subunit 4; Provisional	482
214463	MTH00218	ND1	NADH dehydrogenase subunit 1; Provisional	311
214464	MTH00219	COX3	cytochrome c oxidase subunit III; Provisional	262
164765	MTH00222	ATP9	ATP synthase F0 subunit 9; Provisional	77
177260	MTH00223	COX1	cytochrome c oxidase subunit I; Provisional	512
164767	MTH00224	CYTB	cytochrome b; Provisional	379
214465	MTH00225	ND1	NADH dehydrogenase subunit 1; Provisional	305
214466	MTH00226	ND4	NADH dehydrogenase subunit 4; Provisional	505
164770	MTH00260	ATP8	ATP synthase F0 subunit 8; Provisional	53
177263	MTH00261	ATP8	ATP synthase F0 subunit 8; Provisional	68
411074	NF000031	MFS_efflux_LmrA	lincomycin efflux MFS transporter Lmr(A). 	484
411075	NF000040	Tn10_TetC	tetracycline resistance-associated transcriptional repressor TetC. TetC, as found in composite transposon Tn10, is a transcriptional repressor of itself and of TetD, which is a transcriptional activator for some stress response proteins in the SoxS/MarA/Rob regulon in E. coli and which therefore contributes to antibiotic resistance.	197
411076	NF000058	Erm41	23S rRNA (adenine(2058)-N(6))-methyltransferase Erm(41). 	173
411077	NF000060	MFS_efflux_LmrB	lincomycin efflux MFS transporter Lmr(B). The lin-2 mutant, described in PMID:12499232, alters Lmr(B) expression in Bacillus subtilis and allows Lmr(B) to confer resistance to lincomycin.	479
411078	NF000106	40850658_otr	oxytetracycline efflux ABC transporter Otr(C) ATP-binding subunit. 	351
411079	NF000140	AAC_6p_Salmo	AAC(6')-Iy/Iaa family aminoglycoside 6'-N-acetyltransferase. Members of this family are chromosomal acetyltransferases from the genus Salmonella. Analysis has demonstrated a case in which a member designated AAC(6')-Iy, identical in two different strains isolated from a single patient, conferred resistance to tobramycin only in the isolate where a deletion event upstream of the gene resulted in high expression.  Members of this family are therefore considered cryptic aminoglycoside N-acetyltransferases, which may or may not confer resistance, depending on expression levels.	145
411080	NF000217	MATE_multi_FepA	multidrug efflux MATE transporter FepA. 	443
411081	NF000342	glpA_Cterm	aminoglycoside O-phosphotransferase APH(4)-Ib. The C-terminal half of CAA52372.1 shows homology to hygromycin-modifying enzyme APH(4)-Ia. The N-terminal region shows no homology to any other protein. The gene glpA was identified as one of two in a 3.0-kb DNA segment capable of conferring on E. coli the ability to degrade and use the phosphonate herbicide glyphosate as a sole carbon source. The expressed protein conferred minor (3-fold) increase in tolerance to hygromycin, but this protein probably should not be considered an aminoglycoside modifying enzyme associated with the spread of resistance in bacteria toward clinically important antimicrobials.	228
411082	NF000349	mfpA_AE000516.2	pentapeptide repeat protein MfpA. 	183
411083	NF000391	EmrB	multidrug efflux MFS transporter permease subunit EmrB. 	501
380146	NF000535	MSCRAMM_SdrC	MSCRAMM family adhesin SdrC. Features of this protein family include a YSIRK-type signal peptide at the N-terminus and a variable-length C-terminal region of Ser-Asp (SD) repeats followed by an LPXTG motif for surface immobilization by sortase.	963
333720	NF000536	YmiA	YmiA family putative membrane protein. 	42
333721	NF000537	YncL	stress response membrane protein YncL. 	30
333722	NF000539	plantaricin	plantaricin C family lantibiotic. This family describes plantaricin C-like lantibiotic precursors. The seed alignment straddles the cleavage motif (typically GG), and includes both an extended leader peptide region and a Cys-rich core peptide region. Because of the mosaic structure of lantibiotic precursors, this family can be expected to overlap other lantibiotic precursor families in the same clan.	65
380147	NF000540	alt_ValS	valine--tRNA ligase. 	827
411084	NF012135	ANT_3pp_9_crypt	aminoglycoside nucleotidyltransferase ANT(3'')/ANT(9). Known members of this family are restricted to Salmonella enterica. Activities as both a streptomycin 3''-O-adenylyltransferase and a spectinomycin 9-O-adenylyltransferase are cryptic because of a lack of expression, but detected if cells are grown on minimal rather than rich media (see PMID:21507083).	262
333724	NF012136	SecA2_Lm	accessory Sec system translocase SecA2. Members of this family are SecA2, part of a Sec-like preprotein translocase called accessory Sec. This SecA2 family is characteristic of Listeria species.	776
333725	NF012138	exosort_XrtR	exosortase R. 	160
333726	NF012139	exosort_XrtP	exosortase P. 	159
411085	NF012144	ramA_TF	RamA family antibiotic efflux transcriptional regulator. 	108
380148	NF012162	surf_Nterm_1	surface-anchored protein thioester-forming domain. This model describes a conserved region, fairly rich in insertions and deletions, located just past the signal peptide region in long, variable, and typically highly repetitive and sortase-dependent surface proteins. Members are found in a broad range of taxa, including many strains of Streptococcus pneumoniae. A conserved Cys forms a thioester bond, often to a host protein for covalent attachment.	234
411086	NF012163	BaeS_SmeS	sensor histidine kinase efflux regulator BaeS. 	457
333728	NF012164	AlbA	subtilosin maturase AlbA. AlbA is a radical SAM/SPASM domain-containing protein responsible for introducing thioether crosslinks during the maturation of bacteriocins such as subtilosin A.	442
411087	NF012168	BlaI_of_BCL	penicillinase repressor BlaI. 	121
333729	NF012179	CptA	phosphoethanolamine transferase CptA. 	556
380149	NF012181	MSCRAMM_SdrD	MSCRAMM family adhesin SdrD. Features of this protein family include a YSIRK-type signal peptide at the N-terminus and a variable-length C-terminal region of Ser-Asp (SD) repeats followed by an LPXTG motif for surface immobilization by sortase.	1379
333731	NF012182	exosortase_XrtQ	exosortase Q. 	256
380150	NF012196	Ig_like_ice	Ig-like domain. This variant form of the Ig-like domain occurs as a repeat in a number of large adhesins, including a 1.5-MDa ice-binding adhesin, the Marinomonas primoryensis antifreeze protein.	108
380151	NF012197	lonely_Cys	lonely Cys domain. This model describes an unusual domain, over 700 amino acids long, that is largely restricted to the Streptomyces (prodigious producers of natural products) and that may occur ten or more times in giant proteins. The most striking feature is an extremely low cysteine composition, one residue per domain, and that in an essentially invariant position.	706
411088	NF012198	MarA_TF	MDR efflux pump AcrAB transcriptional activator MarA. 	124
380152	NF012200	choice_anch_D	choice-of-anchor D domain. This HMM describes a repeat domain just over 100 amino acids long and usually found in tandem copies. Members appear to be extracellular proteins that have some C-terminal anchoring domain, such as type IX secretion (T9SS) or PEP-CTERM.	107
333735	NF012201	WIAG-tail	WIAG-tail domain. This 80-amino acid domain occurs in proteins in a single copy at the C-terminus.  In most proteins, the domain immediately follows a long, variable run of tandem 10-amino acid repeats. The domain is named for its C-terminal motif, WIAxGx, hence the name WIAG-tail.	80
380153	NF012204	adhes_FxxPxG	leukotoxin LktA family filamentous adhesin N-terminal domain. This model, related to TIGR01901, describes a conserved single-copy N-terminal domain found in repeat-rich, extremely long proteins such as the leukotoxin LktA of Fusobacterium necrophorum.	152
333737	NF012206	LktA_tand_53	leukotoxin LktA-type filamentous protein tandem repeat. This repeat, about 53 amino acids in length, may comprise most of the length of proteins over 3000 amino acids long. The best characterized protein with this repeat is the leukotoxin LktA of Fusobacterium necrophorum, where it is the major virulence factor.	53
411089	NF012208	SDR_dihy_bifunc	bifunctional dihydropteridine reductase/dihydrofolate reductase TmpR. Members of this family are SDR family oxidoreductases, unrelated to previously known families of dihydrofolate reductase (DHFR), one of which was demonstrated to be a bifunctional dihydropteridine reductase/dihydrofolate reductase. The DHFR activity can give a heterologously expressed protein the ability to confer resistance to trimethoprim, an inhibitor of most forms of DHFR.	233
333738	NF012209	LEPR-8K	LEPR-XLL family repeat protein signature domain. This model, just 24 amino acids long, describes an N-terminal single-copy region that contains the most highly conserved motif in a collection of repeat-filled giant proteins. Member proteins average over 8000 amino acids and include at least one longer than 35,000 in length. The signature motif is LEPRxLL.	24
380154	NF012210	PDxFFG	PDxFFG domain. This model represents the conserved N-terminal domain of a family of large proteins with signal peptides, found in Mycoplasma and Ureaplasma. A short conserved N-terminal domain and a large conserved C-terminal domain are separated by poorly conserved regions of variable length. This domain is named for its best conserved motif, PDxFFG.	269
333740	NF012211	tand_rpt_95	tandem-95 repeat. This 95-amino acid repeat occurs in tandem in proteins that may be several thousand amino acids long.	98
380155	NF012221	MARTX_Nterm	MARTX multifunctional-autoprocessing repeats-in-toxin holotoxin N-terminal region. This model describes the N-terminal 1900 amino acids of MARTX family multifunctional-autoprocessing repeats-in-toxin holotoxins, which contain both repeat regions that facilitate their entry into eukaryotic target cells, and multiple effector domains.	1848
411090	NF012226	AdeS_HK	two-component sensor histidine kinase AdeS. Mutations in this component of the two-component regulatory system for the AdeABC efflux pump can confer adaptive resistance to certain antibiotics, including tigecycline.	353
411091	NF012227	AdeR_RR	efflux system response regulator transcription factor AdeR. This protein, the DNA-binding regulator AdeR, works with its two-component system partner AdeS, to modulate expression of the AdeABC efflux pump.	239
411092	NF012228	RobA_TF	MDR efflux pump AcrAB transcriptional activator RobA. The original characterization of RobA as a Right side Origin of replication Binding protein A (robA) may be misleading. Characterizations in large numbers of papers since then treat RobA as a transcriptional activator of the AcrAB antibiotic efflux pump.	286
333742	NF012230	LWXIA_domain	LWXIA domain. This domain occurs exclusively at the C-terminus of a set of long proteins (average length 4000 residues), and is separated from the rest of the protein sequence by a Pro and Ser-rich spacer region of poorly conserved, low-complexity sequence. This domain is named for its most conserved motif, LWxIA.  Some but not all sequences in the seed alignment score well locally to the LysM domain model of PF01476, which may have a general peptidoglycan binding function.	74
333743	NF028536	PAP2_near_MCR1	PAP2 family protein. Members of this family belong to the PAP2 superfamily (see PF01569). The founding members of this family are notable for being encoded next to mcr-1, a phosphoethanolamine--lipid A transferase that confers resistance to colistin.	237
380156	NF028538	PAP2_lipid_A	PAP2 family lipid A phosphatase. All members of the seed alignment for this family belong to the PAP2 superfamily and therefore share homology with the lipid A 1-phosphatase LpxE of Helicobacter pylori.  LpxE removes one of two KDO sugar phosphates from lipid A, making it possible for a phosphoethanolamine--lipid A transferase to add the modifying group that increases resistance to colistin.  All members of the seed alignment for this model are encoded close to the gene for a phosphoethanolamine--lipid A transferase, such as MCR-1.	224
380157	NF032891	tail_200_repeat	tandem large repeat. This HMM describes a domain of nearly 200 amino acids, found in up to 14 tandem repeats in the C-terminal region of very large proteins in Vibrio parahaemolyticus and related species.	192
380158	NF032893	tail-700	PLxRFG domain. This domain, nearly 700 residues long, begins with a nearly invariant motif YxPLxRFGx[YF].  It occurs as the extreme C-terminal domain of large proteins, some over 5000 amino acids long, with an average length of nearly 3000. The function is unknown.	681
333748	NF033070	rSAM_AprD4	AprD4 family radical SAM diol-dehydratase. AprD4 is a radical SAM enzyme involved in C3-deoxygenation of the intermediate paromamine during biosynthesis of the aminoglycoside apramycin. It acts as a diol-dehydratase, and works with the partner protein, AprD3, a reductase.	456
380159	NF033071	SusD	starch-binding outer membrane lipoprotein SusD. SusD (Starch Uptake System D) is an outer membrane lipoprotein that binds starch and participates in a TonB-dependent nutrient uptake complex. Related proteins from similar TonB-dependent complexes that import other, usually multimeric nutrient substrates include RagB and NanU.	558
333750	NF033072	NanU	SusD family outer membrane lipoprotein NanU. NanU, related to SusD and RagB, is an outer membrane lipoprotein from a TonB-dependent nutrient uptake complex.	521
380160	NF033073	LPXTG_double	doubled motif LPXTG anchor domain. This unusual LPXTG-type C-terminal protein sorting domain occurs largely in the genus Clostridium and typically is separated from the main body of the protein by a glycine-rich linker sequence. In this domain, the classical sortase cleavage motif, LPXTG, has the consensus sequence VPLAxLPKTG.   Much of this motif, the sequence VPLAxLP, is repeated an average 20 amino acids upstream within this domain. This unusual structure of a sortase recognition site-containing domain suggests a specialized form of interaction with its cognate sortase.	66
380161	NF033092	HK_WalK	cell wall metabolism sensor histidine kinase WalK. This model describes WalK as found in Staphylococcus aureus (sp|Q2G2U4.1|WALK_STAA8).  A shorter version, as found in Streptococcus pneumoniae, called WalK(Spn) or VicK, is not included. WalK is part of a two-component system and works with partner protein WalR.	594
380162	NF033093	HK_VicK	cell wall metabolism sensor histidine kinase VicK. This model describes the protein VicK (or WalK) as found in Streptococcus pneumoniae. This protein is shorter than the WalK of Staphylococcus aureus, although it apparently is functionally similar. Compare to model NF033092  (HK_WalK).  VicK is a sensor histidine kinase involved in regulating cell wall metabolism. Its two-component system partner is the response regulator VicR.	448
380163	NF033113	halo_ClmS	chloramphenicol-biosynthetic FADH2-dependent halogenase CmlS. 	570
411093	NF033124	estX	alpha/beta fold putative hydrolase EstX. 	280
411094	NF033138	RND-peri-MexC	MexC family multidrug efflux RND transporter periplasmic adaptor subunit. 	375
411095	NF033143	efflux_OM_AdeK	multidrug efflux RND transporter AdeIJK outer membrane channel subunit AdeK. 	480
411096	NF033147	GXX_rpt_CTERM	Gly-Xaa-Xaa repeat protein C-terminal domain. This model often occurs at the C-terminus, and companion model N_to_GlyXaaXaa (NF033172) at the N-terminus, of proteins that in between consist largely of variable numbers of Gly-Xaa-Xaa repeats, reminiscent of collagen repeats.  Member proteins observed have been found so far only in Gram-positive bacteria.  This domain contains a motif IPxTG near its C-terminus, suggesting it is processed by some form of sortase.	132
411097	NF033153	phage_ICD_like	host cell division inhibitor Icd domain. Icd from temperate phage P1 inhibits cell division in its host. Homologous sequence is found in many other proteins, often as the C-terminal region of what appears to be a much larger protein. Putative phage proteins that contain this domain may be designated "host cell division inhibitor Icd-like protein". See PMID: 8491703 for a description of Icd. Many proteins with this domain also have the Ash domain described by PF10554, which also occurs in phage.	48
411098	NF033154	endonuc_SmrA	DNA endonuclease SmrA. YdaL is a small endonuclease with homology to the C-terminal domain found in the endonuclease MutS2, but not found in the related mismatch repair protein MutS. The biological role of this endonuclease is not yet known. As one of two Small MutS2-Related proteins in E. coli, this protein was designated SmrA by Gui, et al. (PMID:21276852).  The term SMR is much better known for describing a large family of Small Multidrug Resistance (SMR) efflux transporters, but in that context is used with three capital letters.	189
380165	NF033155	CatA_like_1	CatA-like O-acetyltransferase. Members of this family are homologs to members of the CatA family of chloramphenicol acetyltransferases, although less than 30% identical.  There is no evidence that members of this family act on or confer resistance to chloramphenicol.	209
380166	NF033157	SWFGD_domain	SWFGD domain. This small domain (29 amino acids long) is named for its most conspicuous (although not invariant) motif, SWFGD. The motif occurs primarily, although not exclusively, in protein sequences with a BON domain (PF04972), suspected of involvement in attachment to phospholipid membranes, or with a DUF2171 domain (PF09939). Two copies of the motif may be found in a single protein.	29
380167	NF033158	Myrrcad	Myrrcad domain. This domain appears at or near the C-terminus in expanded paralogous families of proteins in Mycoplasma and Candidatus Mycoplasma genomes. Proteins with this domain typically show an N-terminal sequence region followed by a repeat region, highly variable in length and similar to the leucine-rich repeat. The center region of this domain resembles alpha-helical hydrophobic transmembrane segments. The domain ends with a cluster of basic residues, suggesting an orientation in which the C-terminal residues of the domain face the cytosol. The coinage "Myrrcad", rendered without full capitalization because it is an acronym rather than an amino acid sequence motif, signifies "MYcoplasma Repeat-Rich protein C-terminal Anchor Domain".	36
380168	NF033160	lipo_LipL36	lipoprotein LipL36. Members of this family are lipoprotein LipL36, as described in Leptospira interrogans serovar Copenhageni str. Fiocruz L1-130 but found broadly in the genus Leptospira. Close homologs that are not lipoproteins by sequence are likely defective in their reported coding region.	383
380169	NF033161	lipo_LipL41	lipoprotein LipL41. Members of this family are lipoprotein LipL41, as described in Leptospira interrogans serovar Copenhageni str. Fiocruz L1-130 but found broadly in the genus Leptospira. Close homologs that are not lipoproteins by sequence are likely defective in their reported coding region.	358
380170	NF033162	lipo_LipL21	lipoprotein LipL21. Members of this family are lipoprotein LipL21, as described in Leptospira interrogans serovar Copenhageni str. Fiocruz L1-130 but found broadly in the genus Leptospira. Close homologs that are not lipoproteins by sequence are likely defective in their reported coding region.	191
380171	NF033163	lipo_LipL71	lipoprotein LipL71. Members of this family are lipoprotein LipL71, also known as LruA, as described in Leptospira interrogans but found broadly in the genus Leptospira. Close homologs that are not lipoproteins by sequence are likely defective in their reported coding region.	472
380172	NF033164	lipo_LipL46	lipoprotein LipL46. Members of this family are lipoprotein LipL46, as described in Leptospira interrogans serovar Copenhageni str. Fiocruz L1-130 but found broadly in the genus Leptospira. Close homologs that are not lipoproteins by sequence are likely defective in their reported coding region.	413
380173	NF033165	lipo_LipL45	lipoprotein LipL45. Members of this family are lipoprotein LipL45, as described in Leptospira interrogans serovar Copenhageni str. Fiocruz L1-130 but found broadly in the genus Leptospira. Close homologs that are not lipoproteins by sequence are likely defective in their reported coding region.	391
380174	NF033166	lipo_LipL31	lipoprotein LipL31. Members of this family are lipoprotein LipL31, as described in Leptospira interrogans serovar Copenhageni str. Fiocruz L1-130 but found broadly in the genus Leptospira. Close homologs that are not lipoproteins by sequence are likely defective in their reported coding region.	210
380175	NF033167	lipo_LIC11695	LIC_11695/LIC_11696 family lipoprotein. Members of this family are lipoproteins found broadly in the genus Leptospira. Two paralogs, LIC_11695 and LIC_11696 are found in Leptospira interrogans serovar Copenhageni str. Fiocruz L1-130 (a well-studied reference strain), where they are encoded by tandem genes.	186
380176	NF033168	lipo_LIC10766	LIC_10766 family lipoprotein. Members of this family are lipoproteins found broadly in the genus Leptospira.	142
380177	NF033169	lipo_LIC10494	LIC_10494 family lipoprotein. Members of this family are lipoproteins found broadly in the genus Leptospira.	213
380178	NF033170	lipo_LIC13355	LIC_13355 family lipoprotein. Members of this family are lipoproteins found broadly in the genus Leptospira.	273
411099	NF033171	lipo_LIC11139	LIC_11139 family putative lipoprotein. Members of this family are restricted to the genus Leptospira. They are putative lipoproteins with only a single Cys residue, invariant at the proposed lipoprotein signal cleavage site. Residues in the -1 position are unusual for lipoproteins in general, but consistent with observations in Leptospira, as described in PMID:26890609.	157
380180	NF033172	N_to_GlyXaaXaa	collagen-like repeat preface domain. All protein sequence used in the seed alignment for this model comes from the N-terminal region of proteins with extended collagen-like Gly-rich repeat regions, and occur in Firmicutes.	122
380181	NF033173	anticapsin_BacC	dihydroanticapsin 7-dehydrogenase. Members of this family are dihydroanticapsin 7-dehydrogenase (EC 1.1.1.385), one of seven key molecular markers for biosynthesis of the non-cognate amino acid anticapsin, a building block for the dipeptide antibiotic natural product bacilysin.	252
380182	NF033175	fuso_auto_Nterm	autotransporter-associated N-terminal domain. This domain typically is found in the genus Fusobacterium, in N-terminal regions of large proteins that are recognized as autotransporter proteins by C-terminal regions matching Pfam model  PF03797.  In paralogous families of such proteins, the N-terminal and C-terminal regions are fairly well-conserved, but the repetitive central region is poorly conserved in both length and sequence.	121
380183	NF033176	auto_AIDA-I	autotransporter adhesin AIDA-I. 	1287
380184	NF033177	auto_Ag43	autotransporter adhesin Ag43. 	948
380185	NF033178	auto_BigA	autotransporter adhesin BigA. Members of this family are the adhesin BigA, found in Salmonella. BigA is an autotransporter, meaning it has a C-terminal outer membrane beta-barrel, through which passenger regions of the protein are transported.	1875
380186	NF033179	TnsA_like_Actin	TnsA-like heteromeric transposase endonuclease subunit. The transposase of transposon Tn7 contains multiple subunits. Members of this family are largely restricted to the Actinobacteria, resemble the endonuclease subunit TnsA of the multimeric transposase of Tn7 and its relatives, and occur in genomic neighborhoods that suggest a similar role in transposition.	212
380187	NF033181	NiFeSe_hydrog	nickel-dependent hydrogenase large subunit. Members of this family are the large subunit of the periplasmic nickel-dependent hydrogenase. Some members contain a selenocysteine residue.	509
380188	NF033183	colliding_TM	low-complexity tail membrane protein. Members of this family appear typically as an unusual gene pair with members of PF11998 (DUF3493).  Strangely, members tend to have tail-to-tail overlapping regions, where the tail region from this protein is long, low-complexity, and typically rich in  Asp, Glu, Asn, Gln, Ser, and Thr. These gene pairs occur broadly in Cyanobacteria. The function of this pair of convergently-transcribed overlapping proteins is unknown.  The low-complexity region was trimmed from the seed alignment to build this HMM.	188
411100	NF033186	internalin_K	class 1 internalin InlK. Internalins, as found in the intracellular human pathogen Listeria monocytogenes, are paralogous surface-anchored proteins with an N-terminal signal peptide, leucine-rich repeats, and a C-terminal LPXTG processing and cell surface anchoring site. Members of this family are internalin K (InlK), a virulence factor.  See PMID:17764999 for a general discussion of internalins, and PMID:21829365, PMID:22082958, and PMID:23958637 for more information about internalin K.	604
380191	NF033187	internalin_J	class 1 internalin InlJ. Internalins, as found in the intracellular human pathogen Listeria monocytogenes, are paralogous surface-anchored proteins with an N-terminal signal peptide, leucine-rich repeats, and a C-terminal LPXTG processing and cell surface anchoring site. See PMID:17764999 for a general discussion of internalins. Members of this family are internalin J (InlJ).	846
380192	NF033188	internalin_H	InlH/InlC2 family class 1 internalin. Internalins, as found in the intracellular human pathogen Listeria monocytogenes, are paralogous surface or secreted proteins with an N-terminal signal peptide, leucine-rich repeats, and usually a C-terminal LPXTG processing and cell surface anchoring site. See PMID:17764999 for a general discussion of internalins. Members of this family are internalin H (InlH), or internalin C2,  two class 1 (LPXTG-type) internalins that are closely related, one apparently derived from the other through a recombination event.	548
380193	NF033189	internalin_A	class 1 internalin InlA. Internalins, as found in the intracellular human pathogen Listeria monocytogenes, are paralogous surface or secreted proteins with an N-terminal signal peptide, leucine-rich repeats, and usually a C-terminal LPXTG processing and cell surface anchoring site. See PMID:17764999 for a general discussion of internalins. Members of this family are internalin A (InlA), a class 1 (LPXTG-type) internalin.	799
411101	NF033190	inl_like_NEAT_1	NEAT domain-containing leucine-rich repeat protein. Members of this family have an N-terminal NEAT (near transporter) domain often associated with iron transport, followed by a leucine-rich repeat region with significant sequence similarity to the internalins of Listeria monocytogenes.  However, since Bacillus cereus (from which this protein was described, in PMID:16978259) is not considered an intracellular pathogen, and the function may be iron transport rather than internalization, applying the name "internalin" to this family probably would be misleading.	754
380194	NF033191	JDVT-CTERM	JDVT-CTERM protein-sorting domain. This bacterial C-terminal protein-sorting domain, superficially similar to MYXO-CTERM (TIGR03901), occurs in a variety of Proteobacteria, including Janthinobacterium, Duganella, Vibrio breoganii, and Thioalkalivibrio, hence the name JDVT. Its local genomic context in species examined so far includes a homolog of eukaryotic type II CAAX prenylation site proteases (see PF02517). The architecture of the domain consists of a run of Gly residues and an invariant Cys residue (a probable modification site), followed by a hydrophobic predicted transmembrane alpha helix and then a cluster of basic residues, mostly Arg.	34
380195	NF033192	JDVT-CAAX	JDVT-CTERM system CAAX-type protease. See NF033191.	98
380196	NF033193	lipo_NDxxF	NDxxF motif lipoprotein. Members of this family are lipoproteins, about 200 amino acids long in precursor form, found in Staphylococcus aureus, Bacillus cereus, and various other Firmicutes. The protein family is named for one of its several highly conserved motifs.	199
380197	NF033194	lipo_EMYY	EMYY motif lipoprotein. Members of this family are lipoproteins, about 300 amino acids long in precursor form, found broadly in the genus Staphylococcus and in some related species.	292
380198	NF033195	F430_CfbB	Ni-sirohydrochlorin a,c-diamide synthase. Members of this family are Ni-sirohydrochlorin a,c-diamide synthase, involved in synthesizing coenzyme F430, used in methanogens by coenzyme M reductase. Members of this family are restricted to archaeal methanogens, and resemble (and may be misannotated as) the enzyme cobyrinic acid a,c-diamide synthase, involved in cobalamin biosynthesis.	451
380199	NF033196	c_type_nonphoto	c-type cytochrome. Members of this family are apparent c-type cytochromes that resemble the photosynthetic reaction center c-type cytochrome (PF02276) but are smaller and found in non-photosynthetic organisms.	97
380200	NF033197	F430_CfbE	coenzyme F430 synthase. Members of this family are coenzyme F430 synthase, involved in synthesizing coenzyme F430, which is used in methanogens by coenzyme M reductase. Members of this family are restricted to archaeal methanogens, and resemble (and may be misannotated as) MurD, an enzyme of bacterial cell wall biosynthesis.	419
380201	NF033198	F430_CfbA	sirohydrochlorin nickelochelatase. Members of this family are sirohydrochlorin nickelochelatase, involved in synthesizing coenzyme F430, used in methanogens by coenzyme M reductase. Members of this family are restricted to archaeal methanogens, and resemble (and may be misannotated as) sirohydrochlorin cobaltochelatase, involved in cobalamin biosynthesis. Some members of this family are double in length because of a duplication.	124
380202	NF033200	F430_CfbC	Ni-sirohydrochlorin a,c-diamide reductive cyclase ATP-dependent reductase subunit. This family, very closely related to the nitrogenase iron protein, was identified as a subunit involved in biosynthesis of coenzyme F430 in archaeal methanogens and archaeal anaerobic methanotrophs.	260
380203	NF033201	Vip_LPXTG_Lm	cell invasion LPXTG protein Vip. Vip (Virulence protein), like the LPXTG-type internalins, is an LPXTG-anchored surface protein of the mammalian cell-invading pathogen Listeria monocytogenes, but absent from the related species Listeria innocua. For certain cell types, Vip is required for Listeria's ability to invade. It appears to bind the endoplasmic reticulum (ER) resident chaperone Gp96 as its receptor.	414
380204	NF033202	GW_glycos_SH3	GW domain. The GW domain of Listeria belongs to the clan of SH3-like domains. A similar but broader model (PF13457) occurs in Pfam. The GW domain occurs as repeats on surface proteins of the cell-invading pathogenic bacterium Listeria monocytogenes, and is involved in binding to glycosaminoglycans. Members of this family include the GW-type internalin InlB and several paralogs.	81
380205	NF033203	entero_EhxA	enterohemolysin EhxA. Members of this family are the RTX toxin called enterohemolysin or EhxA, because it is found in enterohemorrhagic Escherichia coli  (EHEC) strains such as O157:H7.	997
380206	NF033205	IPExxxVDY	IPExxxVDY family protein. This protein family is uncharacterized. Member proteins average about 160 amino acids in size, and feature two widely separated invariant Asn residues, as well as the larger (though less invariant) motif for which it is named. Members are found primarily in the Flavobacteriia branch of the Bacteroidetes.	152
380207	NF033206	ScyE_fam	ScyD/ScyE family protein. This family includes ScyE, a protein involved in scytonemin biosynthesis and export, and its paralog ScyD. Some members of the family contain a C-terminal PEP-CTERM domain that predicts anchoring to the outer membrane.	330
380208	NF033207	midcut_by_XrtH	midcut-by-XrtH protein. Members of this protein family occur in bacterial genomes that encode the exosortase/archaeosortase family member XrtH (exosortase H). While many targets of XrtH are C-terminal protein-processing signals described by TIGR04174, the IPTL-CTERM domain, members of this family have a version of that signal in the N-terminal half of the protein. This architecture suggests that XrtH may perform a cleavage that releases the C-terminal domain of midcut-by-XrtH proteins from the membrane, perhaps as part of some regulatory pathway.	171
411102	NF033208	choice_anch_E	choice-of-anchor E domain-containing protein. This HMM describes a  domain just over 100 amino acids long and usually found in tandem copies. Members appear to be extracellular proteins that have some C-terminal anchoring domain, usually PEP-CTERM but occasionally a type IX secretion (T9SS) recognition domain.	171
380210	NF033210	RBP7	reticulate body protein Rbp-7. Members of this family are the 7-kDa reticulate body protein found in several species of Chlamydia. The protein is often overlooked during genome structural annotation; members observed so far are 65 to 67 amino acids long. The protein has been demonstrated by mass spectroscopy, and shown to be present only in reticulate or intermediate bodies.	74
411103	NF033212	SapB_AmfS_lanti	SapB/AmfS family lanthipeptide. Members of this family are class III lanthipeptide precursors. These typically are short peptides, encoded next to the gene for the lanthipeptide synthetase that creates the characteristic lanthionine (or beta-methyl lanthionine) bridges for which these natural products are named. Members of this family include SapB and AmfS, which are considered morphogens rather than antibiotics. Members also include labionin-containing peptides such as labyrinthopeptins A1 and A2.	40
411104	NF033213	matur_PanM	aspartate 1-decarboxylase autocleavage activator PanM. Members of this family, called PanM (or PanZ), although related to the GNAT family N-acetyltransferases, have a different function. The enzyme PanD, aspartate 1-decarboxylase, has an active site modified Ser residue, created by cleavage of a precursor form.  PanM promotes the maturation of the CoA biosynthesis enzyme PanD, but also inhibits its activity in the presence of CoA. Figure 6 in PMID:26276430 identifies residues considered critical to interaction with PanD; seed alignment sequences and cutoff scores were chosen to separate proposed PanM from functionally distinct relatives.	130
380213	NF033214	ComC_Streptocco	competence-stimulating peptide ComC. Members of this family are ComC, a secreted peptide that stimulates competence for natural transformation in Streptococcus. ComC peptides fall within the broader family of PF03047, a homology family of pheromone/bacteriocin precursors that is also restricted to Streptococcus. The PF03047 HMM runs only a few residues past the GlyGly precursor peptide cleavage site, and thus does not distinguish ComC from other pheromone precursors, such as BlpC.	41
380214	NF033215	BlpC_Streptocco	quorum-sensing system pheromone BlpC. Members of this family are BlpC, a peptide pheromone that stimulates production of BLP (bacteriocin-like peptides) family class II bacteriocins. BlpC peptides fall within the broader family of PF03047, a homology family of pheromone/bacteriocin precursors that is also restricted to Streptococcus. The PF03047 HMM runs only a few residues past the GlyGly precursor peptide cleavage site, and thus does not distinguish BlpC from other pheromone precursors, such as ComC.	51
380215	NF033216	lipo_YgdI_YgdR	YgdI/YgdR family lipoprotein. Members of this family are exclusively lipoproteins of small size, including YgdI and YgdR from E. coli K-12.	71
380216	NF033217	Fur_reg_FbpC	Fur-regulated basic protein FbpC. Members of this family are FbpC, Fur-regulated basic protein C. This protein has also been described as MrgC (metal-regulated gene C). Members of this family are found so far only in the genus Bacillus, although the small size may have interfered with gene-finding.	29
411105	NF033218	anchor_AmaP	alkaline shock response membrane anchor protein AmaP. The founding member of this family, AmaP (Asp23 membrane anchoring protein), is related to Asp23 through part of its length, but includes a highly hydrophobic N-terminal region that should make it an integral membrane protein. Asp23 (alkaline shock protein of 23 kDa), described in PMID:7864904, is a cytosolic protein in Staphylococcus aureus, strongly induced by a pH shift from 7 to 10, and also recruited to the membrane. AmaP appears to be the partner protein with an integral membrane segment and the ability to anchor Asp23 to the membrane. This model was built to identify full-length homologs of AmaP, while excluding Asp23. Some but not all members of this family score above the cutoffs of Pfam model version PF03780.11, but full-length homologs of Asp23 score considerably higher. Asp23 family members previously were known as DUF322.	166
380218	NF033222	listolys_S	listeriolysin S family toxin. Members of this family include listeriolysin S from some strains of Listeria monocytogenes, and staphylolysin S. Members are encoded in biosynthetic clusters similar to those for streptolysin S (found by model TIGR03602), a precursor similar in length, architecture, and composition, but different enough to require a different HMM.	32
380219	NF033223	YHYH_alt	YHYH domain. Proteins with this form of YHYH motif-containing domain have it located near the N-terminus of a protein just after a signal peptide region.  The domain has two characteristic motifs, GxC and YH[YC]H, separated by a short spacer region of variable length. A different family of YHYH domain-containing proteins, in which the YHYH domain is more C-terminal and is repeated, is described by Pfam model PF14240.	25
380220	NF033224	PmrR	LpxT activity modulator PmrR. Members of this family are PmrR, an extremely small protein at 29 amino acids in length.	29
380221	NF033225	spore_CmpA	cortex morphogenetic protein CmpA. Members of this family are CmpA (cortex morphogenetic protein A), a small protein (37 amino acids) involved in endospore formation and frequently missed during genome annotation.	36
380222	NF033226	small_MntS	manganese accumulation protein MntS. Members of this family are MntS, a small protein of about 42 amino acids that seems to assist bacteria in accumulating manganese when iron is limiting. It may function as a manganese chaperone, or may inhibit manganese efflux transporters.	41
380223	NF033227	Fur_reg_FbpB	Fur-regulated basic protein FbpB. This model describes FbpB (Fur-regulated basic protein B), one of three paralogous small proteins recognized by Pfam model PF13040 in Bacillus subtilis.	43
380224	NF033228	div_inhib_SidA	cell division inhibitor SidA. This protein, SidA (SOS-induced inhibitor of cell division A), is found so far in Caulobacter and Phenylobacterium. It interacts with FtsW.	29
380225	NF033229	small_MgtR	protein MgtR. 	30
411106	NF033230	phage_region_01	phage region protein. Members of this family are found broadly in the Gammaproteobacteria and may be a marker of temperate phage or prophage. A member (YP_008766900.1) occurs in Shigella phage SfIV.	142
380226	NF033231	small_Blr	division septum protein Blr. Members of this family are Blr, named beta-lactam resistance protein because mutants have heightened sensitivity to beta-lactam antibiotics, but actually involved in cell division. The protein is very small.	40
380227	NF033232	small_YtzI	YtzI protein. Members of this family include YtzI from Bacillus subtilis, and homologs widely distributed in the Firmicutes. The pattern of sequence conservation suggests the protein begins with a hydrophobic stretch without any basic residue near the initiator Met residue.  Members of this family average about 53 residues in length. At the time this model was constructed, members were included in Pfam model PF12606.6 (Tumour necrosis factor receptor superfamily member 19), seemingly in error. This model was constructed to separate the two families.	41
380228	NF033233	twin_helix	twin transmembrane helix small protein. Members of the seed alignment for this family are small (average length 68 residues), strictly bacterial, and extremely hydrophobic.  Pfam model PF04588 (HIG_1_N) includes both eukaryotic proteins, including a protein from the fish Gillichthys mirabilis, and the members of this family. Similarity between those eukaryotic proteins and the members of this model may represent convergent evolution related to the similar composition of their transmembrane alpha-helical regions, rather than a common origin or common function.	58
380229	NF033376	lat_flg_LafA_1	lateral flagellin LafA. This HMM describes a rare second type of flagellin from E. coli and some closely related species, called LafA, where the familiar and common flagellin, FliC, is nearly universal and carries the H-antigen used for serotyping strains. In contrast to FliC, whose center region is highly variable, LafA shows little variability in sequence. In many E. coli strains, the Flag-2 locus either is absent or is cryptic, appearing degraded and non-functional.	304
380230	NF033377	OMA_tautomer	4-oxalomesaconate tautomerase. 	347
380231	NF033379	FrucBisAld_I	fructose-bisphosphate aldolase class I. This family consists of fructose-bisphosphate aldolase class I. All members of the seed alignment are from prokaryotes, although class I is the common form in plants and animals. The common form in prokaryotes is class II.	324
380232	NF033380	Rlm_2499C5	23S rRNA (cytosine(2499)-C(5))-methyltransferase. This model describes a 23S rRNA modification related to RlmI of E. coli, but that modifies site C2499 rather than C1962.	391
411107	NF033381	MonaBetaBRL_TX	monalysin family beta-barrel pore-forming toxin. Members of this family are secreted in a water-soluble pro-toxin form, but undergo cleavage and oligomerization to form a beta-barrel pore. The founding member of the family is monalysin from Pseudomonas entomophila. This family is built narrowly, and therefore excludes a set of pore-forming proteins (not necessarily toxins) from a eukaryote, Dictyostelium. Analogous (but perhaps not homologous) beta-type pore-forming toxins include aerolysin and leukocidin.	230
380234	NF033382	OMP_33_36	porin Omp33-36. Members of this family are outer membrane beta-barrel proteins that facilitate passive transport from the extracellular milieu into the periplasm. Known members are limited to the genus Acinetobacter, and the name, Omp33-36, reflects variability of this protein across the lineage. Note that this HMM previously was named CarO in error. Both this protein and CarO affect carbapenem transport across the outer membrane and thus carbapenem susceptibility or resistance.	293
411108	NF033383	induct_EntF	EntF family bacteriocin induction factor. Members of this family have leader sequences like bacteriocins (see TIGR01847), but characterized examples function as signaling peptides that induce production of a nearby encoded bacteriocin, rather than as bacteriocins themselves. The founding member of this family is enterocin induction factor EntF.	39
380236	NF033384	enterocin_MR10	enterocin L50 family leaderless bacteriocin. Members of this family are leaderless peptide components of bacteriocins in which the two subunits share about 74% identity with each other and are each about 43 amino acids long. Members include enterocin subunits L50A and L50B, MR10A and MR10B, etc.	43
380237	NF033385	enterocin_LsbB	LsbB family leaderless bacteriocin. Members of this family are leaderless peptide components of bacteriocins with a conserved motif KXXXGXXPWE.	35
380238	NF033388	ubiq_like_UBact	ubiquitin-like protein UBact. This HMM describes a protein family that includes most, but not all, of the proteins designated UBact (a ubiquitin-like protein) in the article first describing a biosystem related to bacterial pupylation. Protein modification by ubiquitin in eukaryotes, pupylation in many bacteria, and this system in a few other bacteria, is considered a signal that can trigger altered protein handling such as rapid degradation.  Proteins that the authors consider members of the same family, but that show very little sequence similarity other than protein size and the final two residues, include WP_008669967.1 in the genus Rhodopirellula and OHA48658.1 in Candidatus Terrybacteria.	54
411109	NF033389	scrub_typh_TSA22	major outer membrane protein TSA22. Members of this family are TSA22, one of three major outer membrane proteins, the so-called type-specific antigens, in two species of Orientia, including O. tsutsugamushi, causative agent of scrub typhus. The other type-specific antigens are TSA47 (a serine protease in the DegP/HtrA family) and the much better known TSA56, which is quite variable in size and may be used for strain typing.	202
380240	NF033390	Orientia_TSA56	type-specific antigen TSA56. This protein is the immunodominant major cell surface protein of Orientia tsutsugamushi, known as "56-kDa type-specific antigen" or TSA56. It should not be confused with unrelated proteins TSA47 (a serine protease) or TSA22. An ortholog is found in Orientia chuto, and included in the seed alignment.	525
380241	NF033391	lipid_A_LpxO	lipid A hydroxylase LpxO. Members of this family are LpxO, an enzyme that modifies one of the lipid chains in lipid A by hydroxylation, with resulting changes in resistance to the host immune response and to the antibiotic colistin. This family, as built, includes LpxO1 from Pseudomonas aeruginosa, but not its paralog LpxO2.	297
380242	NF033392	PSM_delta	PSM-delta family phenol-soluble modulin. Members of this family are phenol-soluble modulins (short peptides, usually cytolysins) with an intact N-formyl-methionine at the N-terminus.	23
380243	NF033393	TRP47_fam_Nterm	TRP47 family tandem repeat effector N-terminal domain. This HMM describes a conserved N-terminal domain of a family of proteins found, so far, only in the genus Ehrlichia. The N-terminal domain is followed by a long repeat region with a large content of acidic and serine residues, but other than in composition, the repeats themselves may be unrelated from one lineage to another.  Characterized examples, such as TRP47 from Ehrlichia chaffeensis, are glycoproteins and are immunodominant antigens.	105
411110	NF033394	capsid_maj_Podo	phage major capsid protein. A founding member of this family, AKO59007.1, was identified as the major head protein in Brucella phage 02_19 during a comparison of Brucella phage genomes. The N-terminal half appears to be the better conserved region, with fewer insertions and deletions.	311
380245	NF033395	fibronec_SfbI	fibronectin-binding protein SfbI. SfbI is a fibronectin-binding protein with a C-terminal LPXTG region that mediates processing by sortase and covalent attachment to the cell wall.  Near the N-terminus is a TED domain, which includes a Cys residue that forms a covalent thioester bond.	555
380246	NF033396	pilus_ancill_1	pilus ancillary protein 1. 	737
380247	NF033399	thiazolyl_GetA	GE37468 family thiazolyl peptide. 	47
380248	NF033400	thiazolyl_B	thiazolylpeptide-type bacteriocin. 	45
380249	NF033401	thiazolyl_BerA	thiocillin/thiostrepton family thiazolyl peptide. Members of this family include the precursor peptides for the antibiotics thiostrepton, nosiheptide, thiocillin, and berninamycin.	42
380250	NF033402	linaridin_RiPP	linaridin family RiPP. Linaridins are ribosomally translated, post-translationally modified peptide natural products, or RiPPs.  Examples include cypemycin, SGR-1832, and legonaridin.	63
380251	NF033403	linaridin_rel	linaridin-like RiPP. Members of this family share N-terminal (leader peptide) sequence with the linaridin family of ribosomally translated, post-translationally modified natural product precursors.	64
380252	NF033404	YneK	putative protein YneK. Members of this family, YneK, are found so far only in Escherichia coli and Escherichia albertii, with 68% sequence identity but in identical gene neighborhoods. In E. coli O157:H7 strain Sakai, yneK has a nonsense mutation and is therefore truncated. The function is unknown.	371
411111	NF033407	SnoaL_meth_ester	SnoaL/DnrD family polyketide biosynthesis methyl ester cyclase. This HMM represents mutually closely related methyl ester cyclases from a number of polyketide biosynthesis pathways. Examples include proteins designated SnoaL (nogalamycin biosynthesis), DnrD (doxorubicin biosynthesis), RdmA (rhodomycin biosynthesis), etc.	144
380254	NF033411	small_mem_YnhF	YnhF family membrane protein. Members of this protein family are small membrane proteins, about 29 amino acids in length. YnhF from E. coli was shown to have an intact fMet residue at the N-terminus and to be chloroform-soluble. The previously generated narrow cluster PRK14756 includes some members of this family.	29
380255	NF033412	primase_PriX	eukaryotic-type DNA primase noncatalytic subunit PriX. In most archaea, the eukaryotic-type DNA primase has catalytic subunit PriS and a regulatory subunit PriL. The proteins in this family are PriX, an essential second noncatalytic subunit found in a subset of the archaea.	98
380256	NF033413	RiPP_TM1316	Cys-rich RiPP peptide. Members of this family include the small, Cys-rich peptide TM1316 of Thermotoga maritima, encoded near the peptide-modifying radical SAM/SPASM protein TM1317 (AE000512.1). TM1316 is expressed at very high levels in stationary phase.	31
380257	NF033414	bottro_RiPP	bottromycin family RiPP peptide. Bottromycins are one of the rarer known classes of ribosomally translated, post-translationally modified peptide (RiPP) antibiotics.	44
380258	NF033415	thiovirid_RiPP	thioviridamide family RiPP peptide. Thioviridamide represents one of the rarer known classes of ribosomally translated, post-translationally modified peptide (RiPP) antibiotics.	72
380259	NF033416	YM-216391_RiPP	YM-216391 family RiPP peptide. YM-216391 represents one of the rarer known classes of ribosomally translated, post-translationally modified peptide (RiPP) antibiotics.	35
380260	NF033417	glycocin_F_RiPP	glycocin F family RiPP peptide. Glycocin F, from Lactobacillus plantarum strain KW30, represents one of the rarer known classes of ribosomally translated, post-translationally modified peptide (RiPP) antibiotics. Members of the family are glycosylated, which is uncommon among RiPP natural products.	68
380261	NF033418	T6SS_TagK	type VI secretion system-associated protein TagK. Members of this family have full-length homology to SciF, a type VI secretion system (T6SS) protein from Salmonella typhimurium  island SPI-6. Homologs occur in some but not all T6SS loci, and the broader family is now called TagK.	304
380262	NF033419	T6SS_TagK_dom	TagK family protein C-terminal domain. 	127
411112	NF033420	T6SS_PAAR_dom	type VI secretion system PAAR domain. The PAAR domain is widespread, but this model represents a narrow clade that may occur in type VI secretion systems (T6SS), either free-standing or fused to a long extension. Effector domains of T6SS may be separate proteins, or may be fused to one of the tube or spike proteins: VgrG (spike), Hcp (tube), or this PAAR family (spike tip). Members of this family that have long extensions are likely to be T6SS effectors.	94
411113	NF033422	onco_T4SS_CagA	type IV secretion system oncogenic effector CagA. CagA, an effector injected into host cells by the type IV secretion system (T4SS) apparatus of Helicobacter pylori, is an oncogenic toxin. Tyrosine phosphorylation at multiple Glu-Pro-Ile-Tyr-Ala (EPIYA) motifs creates a scaffold that interacts with multiple host signaling systems and sometimes allows neoplasias to begin in gastric epithelial cells.	1056
380266	NF033424	chlamy_CPAF	protease-like activity factor CPAF. CPAF (chlamydial protease/proteasome-like activity factor) is a serine protease secreted by various species of the intracellular pathogen Chlamydia. Early attribution of contributions of CPAF to virulence by cleavage of specific host cell substrates contains a number of errors, and the true role of CPAF, and its contributions to virulence, remain under study.	565
380267	NF033425	PSM_alpha_1_2	alpha-1/alpha-2 family phenol-soluble modulin. Members of this family are extremely short proteins, about 21 amino acids long, that are known to retain an N-formyl-methionine (fMet) at the N-terminus. These proteins, phenol-soluble modulins of the alpha class, including alpha-1 and alpha-2 from Staphylococcus aureus, are exported by an ABC transporter, and affect the state of the host immune system in a number of ways.	21
380268	NF033426	PSM_alpha_3	alpha-3 family phenol-soluble modulin. 	22
380269	NF033427	PSM_alpha_Shaem	alpha family phenol-soluble modulin. This family is based on an alpha family Staphylococcus haemolyticus phenol-soluble modulin, somewhat similar to delta-lysin.	20
380270	NF033428	PSM_epsilo	epsilon family phenol-soluble modulin. Members of this family are epsilon-family phenol-soluble modulins. Species with members include Staphylococcus epidermidis, Staphylococcus capitis, Staphylococcus lugdunensis, Staphylococcus pseudintermedius, and Staphylococcus schleiferi.	21
411114	NF033429	ImuA_translesion	translesion DNA synthesis-associated protein ImuA. A three-gene cassette encoding ImuA, ImuB, and ImuC ("inducible mutagenesis") is induced by DNA damage, is capable of DNA synthesis across DNA damage lesions, and consequently is associated with mutagenesis.  This family, ImuA (previously misnamed SulA in Pseudomonas putida) shows some homology to  SulA itself and to RecA. ImuB resembles Y-family polymerases but may be catalytically inactive. ImuC is a C-family polymerase, and catalytically active.	181
380272	NF033430	TfxA_RiPP	trifolitoxin family RiPP peptide. Trifolitoxin is a ribosomally synthesized and post-translationally modified peptide natural product (RiPP) antibiotic produced by some strains of Rhizobium and active against others. At the time of building this model, only two variant sequences from this family are detected, both 42 amino acid-long TfxA peptides that differ at only two positions.	42
380273	NF033431	cinnamycin_RiPP	cinnamycin family lantibiotic. Members of this family are RiPP precursor peptides from which the lantibiotic cinnamycin is the most heavily studied. Mature cinnamycin is 19 amino acids long with nine post-translational modifications, including lanthionine, methyllanthionine, and lysinoalanine bridge modifications.	77
380274	NF033432	ThioGly_TfuA_rel	TfuA-related McrA-glycine thioamidation protein. 	211
380275	NF033433	NisI_immun_dup	NisI/SpaI lantibiotic immunity lipoprotein domain. This HMM describes a domain that occurs twice in the nisin lantibiotic self-immunity lipoprotein NisI, and once in the subtilin lantibiotic self-immunity lipoprotein SpaI, and once or twice in numerous other known or putative lantibiotic resistance lipoproteins.	104
380276	NF033434	AzmA_fam_RiPP	azolemycin family RiPP peptide. The azolemycin precursor peptide, AzmA, contains a mature (core) peptide region derived from the sequence VVSTCTI.  Examining genomic context suggests that RiPP precursor peptides related to AzmA are much more similar in the leader peptide region than in the core region, and that the proper length for the precursor peptide is probably about 36 amino acids.	36
380277	NF033435	S-layer_Clost	S-layer protein SlpA. In Clostridioides difficile, the S-layer protein precursor, SlpA, is one member of a large paralogous family of proteins that share several cell wall-binding repeats. SlpA is cleaved into a larger and smaller protein. The S-layer protein itself is important to adhesion, and portions of it are highly variable, while the N-terminal and C-terminal regions are well-conserved.	728
380278	NF033436	SpoVM_broad	stage V sporulation protein SpoVM. Members of this family are SpoVM (stage V sporulation protein M).	26
380279	NF033437	YpdK	membrane protein YpdK. 	23
411115	NF033438	BREX_BrxD	BREX system ATP-binding protein BrxD. BrxD is an ATP-binding protein found in types 2 and 6 of BREX (bacteriophage exclusion) phage resistance systems.	423
380281	NF033439	small_mem_YoeI	membrane protein YoeI. YoeI, a hydrophobic protein of only 20 amino acids, is found in at least these genera: Escherichia, Salmonella, Citrobacter, Enterobacter, and Klebsiella. It is known to be expressed in E. coli.	20
380282	NF033440	small_YrbN	protein YrbN. YrbN, a small protein of only 26 amino acids, is found in at least these genera: Escherichia, Salmonella, Enterobacter, Klebsiella, and Yersinia. It is known to be expressed in E. coli.	26
380283	NF033441	BREX_BrxC	BREX system P-loop protein BrxC. BrxC is a P-loop-containing protein, and probable ATPase, from BREX (bacteriophage exclusion) systems of type 1.	1173
411116	NF033442	BREX_PglW	BREX system serine/threonine kinase PglW. Members of this family are PglW, a predicted serine/threonine kinase of the Pgl (phage growth limitation) system (now called BREX type 2) and the BREX type 3 system.	1387
411117	NF033443	BREX_PglZ_6	BREX-6 system phosphatase PglZ. 	958
380286	NF033444	BREX_PglZ_5	BREX-5 system phosphatase PglZ. 	704
411118	NF033445	BREX_PglZ_4	BREX-4 system phosphatase PglZ. 	734
411119	NF033446	BREX_PglZ_2	BREX-2 system phosphatase PglZ. 	890
380289	NF033447	BrxE_fam	BrxE family protein. This family is uncharacterized, but a subgroup within this family is BrxE, a protein of unknown function found in type 6 BREX phage resistance systems.	166
380290	NF033448	BREX_6_BrxE	BREX-6 system BrxE protein. Members of this family are BrxE, a protein of unknown function that is found in type 6 BREX systems of phage defense.	182
411120	NF033449	BREX_PglZ_3	BREX-3 system phosphatase PglZ. BREX is a phage defense system (BacteRiophage EXclusion), with a number of described subtypes. The first described, PGL (phage growth limitation), is now called BREX-2. This model describes one of the two core proteins universal across the first six defined BREX subtypes, the phosphatase-like PglZ domain protein, as found in BREX-3 systems.	642
380292	NF033450	BREX_PglZ_1_B	BREX-1 system phosphatase PglZ type B. BREX (bacteriophage exclusion) is a phage resistance system, in which two protein families are core but other proteins are variable. BREX subtypes are based on the PglZ domain protein, a putative phosphatase. This family is one of two major subtypes of PglZ as seen in type 1 BREX systems. Most members of this family contain an additional C-terminal domain that is not included in the seed alignment. Family TIGR02687 describes the alternative type A for PglZ of BREX-1.	672
380293	NF033451	BREX_2_MTaseX	BREX-2 system adenine-specific DNA-methyltransferase PglX. This protein, PglX, is a site-specific DNA methyltransferase associated with PGL (phage growth limitation), a type 2 BREX (bacteriophage exclusion) system. The phage resistance appears not to be restriction, but does manage to inhibit phage replication.	1188
411121	NF033452	BREX_1_MTaseX	BREX-1 system adenine-specific DNA-methyltransferase PglX. This protein, PglX, is a site-specific DNA methyltransferase associated with BREX (bacteriophage exclusion) type 1 systems. The phage resistance appears not to be through restriction-modification, as phage DNA appears not to get degraded, but it does manage to inhibit phage replication.	1187
380295	NF033453	BREX_3_BrxF	BREX-3 system P-loop-containing protein BrxF. This family of proteins that are about 150 amino acids in length includes BrxF from type 3 BREX (bacteriophage exclusion) systems. Most members have the P-loop motif GxxGxGKT, but the region is surprisingly poorly conserved in a sizable fraction of otherwise strongly similar proteins.	149
380296	NF033454	BREX_5_MTaseX	BREX-5 system adenine-specific DNA-methyltransferase PglX. 	1401
380297	NF033455	BREX_6_MTaseX	BREX-6 system adenine-specific DNA-methyltransferase PglX. 	1319
380298	NF033456	RiPP_CCRG-2	CCRG-2 family RiPP leader. This model consists of the conserved leader peptide domain of CCRG-2 family ribosomal peptide natural products (RiPPs), up to the Gly-Gly presumptive cleavage site, plus one additional residue (hydrophobic, or another Gly). Members are found almost exclusively in the genus Prochlorococcus, in multiple instances per genome.	18
380299	NF033457	elgicin_lanti	elgicin/penisin family lantibiotic. The HMM describes the elgicin family of lantipeptides active as bacteriocins. The leader domain occurs in additional proteins that lack homology in the core region, although sharing richness in Cys residues there.	64
380300	NF033458	lipid_A_LpxG	UDP-2,3-diacylglucosamine diphosphatase LpxG. Members of this family are LpxG, the lipid A biosynthesis enzyme UDP-2,3-diacylglucosamine diphosphatase (EC 3.6.1.54). This family is unrelated to the more common LpxH, or to LpxI, which share the same activity.	321
380301	NF033459	DksA_like	RNA polymerase-binding protein DksA. 	113
380302	NF033460	glycerol3P_ox_II	type 2 glycerol-3-phosphate oxidase. This FAD-dependent enzyme (EC 1.1.3.21), glycerol-3-phosphate oxidase, also called L-alpha-glycerophosphate oxidase, converts sn-glycerol 3-phosphate plus oxygen to glycerone phosphate plus hydrogen peroxide, which contributes to virulence. This form, called type 2, is shorter than type 1 and is found exclusively in Mycoplasmas and other members of the Mollicutes.	361
411122	NF033461	glycerol3P_ox_1	type 1 glycerol-3-phosphate oxidase. Glycerol-3-phosphate oxidase, also called alpha-glycerophosphate oxidase (GlpO), is an FAD-dependent enzyme related to the glycerol-3-phosphate dehydrogenase GlpD. Notably, GlpO releases hydrogen peroxide, which can contribute to virulence.	607
380304	NF033464	cyanoexo_CrtC	cyanoexosortase C. Cyanoexosortase C (CrtC) belongs to the exosortase/archaeosortase family of multiple membrane proteins that act as cysteine proteases, and probably as transpeptidases, analogous to (but unrelated to) the sortases. CrtC is known so far only in Cyanobacteria, and appears so far primarily in the lesser-studied genera: Leptolyngbya, Oscillatoriales, Scytonema, Alkalinema, Phormidesmis, etc.	278
380305	NF033465	PTPA-CTERM	PTPA-CTERM protein sorting domain. This C-terminal sorting and processing signal, called PTPA-CTERM, is a variant of the widespread PEP-CTERM domain. It is restricted to a subset of Cyanobacteria that encode the sorting enzyme named cyanoexosortase C.	23
380306	NF033471	J25_fam_lasso	acinetodin/klebsidin/J25 family lasso peptide. Members of this family include precursors of at least three lasso peptides, namely microcin J25, acinetodin and klebsidin. All members of the family are encoded as neighboring genes to homologs of the processing protein genes mcjB, mcjC, and mcjD.	39
380307	NF033474	DivGenRetAVD	diversity-generating retroelement protein Avd. Avd (accessory variability determinant) is part of a diversity-generating retroelement (DGR) system through which a portion of a protein-coding gene can be rewritten, creating diversity that can affect host range. The founding member of this family, bAvd, from a Bordetella bacteriophage, belongs to a retrohoming element called BPP-1. Members of this family are four-helix bundle proteins, related to those of family TIGR02436, some of whose members are found in long intervening sequence (IVS) regions in 23S rRNA.	104
380308	NF033477	EmaA_autotrans	collagen-binding adhesin autotransporter EmaA. EmaA (extracellular matrix protein adhesin) is an outer membrane protein first described in Aggregatibacter (Actinobacillus) actinomycetemcomitans, a member of the Gammaproteobacteria.  It is a glycoprotein, and has a C-terminal beta-barrel domain that marks it as an autotransporter. It serves as an adhesin that binds collagen, the most abundant material in the host extracellular matrix. It shares homology with the collagen-binding protein YadA of Yersinia enterocolitica.	1694
380309	NF033478	YadA_autotrans	trimeric autotransporter adhesin YadA. YadA (Yersinia adhesin A) is a type Vc secretion system autotransporter found in at least two pathogenic Yersinia species, Y. enterocolitica and Y. pseudotuberculosis. It forms trimers, with three monomers contributing to a single outer membrane beta-barrel. It binds collagen and other components of the extracellular matrix.	459
380310	NF033479	Efa1_rel_toxin	LifA/Efa1-related large cytotoxin. Members of this family are large and almost certainly multifunctional proteins found in various pathogens from genus Chlamydia, about 3000 amino acids in size and related to lymphostatin (Efa1/LifA) from enteropathogenic Escherichia coli. Roles have been suggested for Efa1 (EHEC factor for adherence) in adhesion, so some members have been annotated as adherence proteins rather than cytotoxins.	3223
411123	NF033480	bifunc_MprF	bifunctional lysylphosphatidylglycerol flippase/synthetase MprF. The C-terminal region of MprF transfers lysine from a charged tRNA onto phosphatidylglycerol to make lysylphosphatidylglycerol (EC 2.3.2.3). The N-terminal region of MprF acts as a flippase. MprF helps confer resistance to antimicrobial cationic peptides.	839
411124	NF033481	auto_Ata	trimeric autotransporter adhesin Ata. Ata (Acinetobacter trimeric autotransporter) has an architecture that consists of a long signal peptide, a repetitive passenger domain that varies in length from strain to strain, and a C-terminal domain of four transmembrane beta strands that forms one third of the pore for autotransporter activity and anchoring in the outer membrane.	1862
411125	NF033482	RiPP_thiocil	thiocillin family RiPP. Members of this family are ribosomally synthesized and post-translationally modified peptide natural product (RiPP) precursors. Nearly all contain at least one Cys residue expected to be involved in thiazolyl peptide modifications, and are expected to behave at least in part as antibiotics. Genomes of some species (Bacillus atrophaeus, Streptomyces coelicoflavus, etc.) may contain multiple paralogs.	48
411126	NF033483	PknB_PASTA_kin	Stk1 family PASTA domain-containing Ser/Thr kinase. 	563
411127	NF033484	Stp1_PP2C_phos	Stp1/IreP family PP2C-type Ser/Thr phosphatase. Many Gram-positive bacteria have a protein kinase/protein phosphatase gene pair that responds to peptidoglycan metabolites and can be instrumental in resistance to beta-lactam antibiotics. Characterized examples of the phosphatase component are Stp1 of Staphylococcus aureus and IreP of Enterococcus faecalis.	232
411128	NF033485	small_SCO1431	SCO1431 family membrane protein. Members of this family, including SCO1431 from Streptomyces coelicolor A3(2), are small and extremely hydrophobic proteins that lack an N-terminal signal peptide. Known members are restricted to the genus Streptomyces, where the protein family is widespread.	47
411129	NF033486	harvest_ssl1498	ssl1498 family light-harvesting-like protein. Members of this family appear restricted to the Cyanobacteria, and include ssl1498 from Synechocystis sp. PCC 6803. These proteins are small, usually about 56 amino acids, with an N-terminal half related to a number of light-harvesting proteins, and a highly hydrophobic C-terminal half likely to be embedded in membrane.	53
411130	NF033487	Lacal_2735_fam	Lacal_2735 family protein. This small protein is widespread but uncharacterized. Most members are shorter than 60 amino acids in length.	54
411131	NF033488	lmo0937_fam_TM	lmo0937 family membrane protein. Members of this family are very small (about 45 amino acids) and highly hydrophobic, suggesting a presence in the membrane, and have a broad phylogenetic distribution. The member protein lmo0937, from the pathogen Listeria monocytogenes, is described as up-regulated when the bacterium is in the mouse spleen, suggesting a role in stress response.	43
411132	NF033490	small_SPW0924	SPW_0924 family protein. Members of this family average less than 44 amino acids in length, and are found exclusively in the Actinobacteria (mostly Streptomyces). The N-terminal half is organized like a signal peptide, beginning Met-Arg and then continuing with a hydrophobic stretch that is unusually rich in alanine. The C-terminal region has a nearly invariant motif TSPxPLLTTVP. The function is unknown.	44
411133	NF033491	BA3454_fam	BA3454 family stress response protein. BA3454, a protein less than 45 amino acids long, is up-regulated strongly by SpxA2 during stress conditions. Related proteins are found widely in the genus Bacillus, although not in Bacillus subtilis.	43
411134	NF033492	podovir_small	putative phage replication protein. Members of this family of very small, highly hydrophobic proteins are restricted to the genus Acinetobacter, and appear to be entirely of phage origin. One member is encoded in the Podoviral Bacteriophage YMC/09/02/B1251 ABA BP (although the coding region is not predicted in GenBank record JX403940.1). Evidence suggesting the reading frame really does encode protein includes the overlap of a TGA stop codon with an ATG start codon at both ends of this protein, suggesting translational coupling with the much larger adjacent genes immediately upstream and downstream.	36
411135	NF033493	MetS_like_NSS	MetS family NSS transporter small subunit. MetS, as described in the Gram-positive bacterium Corynebacterium glutamicum, is the small subunit of MetPS, an NSS (Neurotransmitter:Sodium Symporter) transporter involved in methionine and alanine import. While MetS itself is small, only 60 amino acids, homologs in gamma proteobacteria such as Vibrio sp., similarly found next to an NSS transporter large subunit, may be barely half that length and consist almost entirely of a predicted hydrophobic region that would localize to within the plasma membrane.	30
411136	NF033494	NSS_import_MetS	methionine/alanine import NSS transporter subunit MetS. 	53
411137	NF033495	phage_BC1881	BC1881 family protein. Members of this family of very small proteins (average length is about 50 residues) include BC1881 from the phBC6A51 prophage region of the Bacillus cereus ATCC 14579 genome.	45
411138	NF033496	DUF2080_fam_acc	DUF2080 family transposase-associated protein. Members of this family appear restricted to the archaea. They tend to be encoded upstream of predicted transposase genes within insertion sequences such as ISNagr11, ISHca1, ISH36, etc. The widespread distribution suggests this protein may be more than a mere passenger gene and may participate in some transposase-associated function. See PF09853, COG3466, and arCOG03884 for alternative (currently narrow) treatments of this family.	34
411139	NF033497	rubre_like_arch	rubrerythrin-like domain. This rubrerythrin-like domain is found primarily in the archaea, occasionally as part of a larger redox-active protein. It features two CxxC motifs with a spacer of 12 to 13 amino acids.	34
411140	NF033498	YlcG_phage_expr	YlcG family protein. Members of this family include YlcG from the DLP12 prophage region of Escherichia coli K-12, and homologs from the Gifsy-1 and Gifsy-2 prophage regions of Salmonella enterica subsp. enterica serovar Typhimurium str. LT2. Members of this protein family are small, about 46 amino acids long. YlcG is known to be expressed. It is encoded immediately downstream of the Holliday junction resolvase RusA.	41
411141	NF033499	Xis_Gifsy_1	excisionase Xis. Members of this family are excisionases such as Xis from the Gifsy-1 prophage of Salmonella enterica subsp. enterica serovar Typhimurium str. LT2.	92
411142	NF033500	phi80_GamL	host nuclease inhibitor GamL. Members of this family, including GamL from phage phi80, are phage inhibitors of host nucleases such as RecBCD and SbcCD. This family has a distant relationship to Gam (see PF06064), which inhibits RecBCD.	89
411143	NF033501	ArfB_arch_rifla	2-amino-5-formylamino-6-ribosylaminopyrimidin-4(3H)-one 5'-monophosphate deformylase. MJ0116 from Methanocaldococcus jannaschii, the founding member of this family, was shown to be 2-amino-5-formylamino-6-ribosylaminopyrimidin-4(3H)-one 5'-monophosphate deformylase, catalyzing the second step in archaeal riboflavin and Fo biosynthesis.	219
411144	NF033503	LarB	nickel pincer cofactor biosynthesis protein LarB. This protein, related to AIR carboxylase, is part of a three protein system involved in producing a specialized nicotinic acid-derived, nickel-containing cofactor, as used in the nickel-dependent lactate racemase of lactic acid bacteria.	209
411145	NF033504	Ni_dep_LarA	nickel-dependent lactate racemase. LarA from Lactobacillus plantarum is a nickel-dependent lactate racemase and the founding member of a family of isomerases that depend on a nicotinic acid-derived nickel pincer cofactor. While it is not yet clear which homologs of LarA act preferentially on lactate, this model identifies one clade of architecturally similar proteins from among a broader set of LarA homologs.  Note that the crystal structure 4NAR, on deposit at PDB but not associated with any publication, represents a protein from Thermotoga maritima that falls outside the scope of this family and that is annotated in PDB as a putative uronate isomerase.	417
411146	NF033505	paceosortase	pacearchaeosortase. Members of this family, pacearchaeosortase, are archaeosortases from the uncultured (so far) Candidatus Pacearchaeota archaeon lineage and its close relatives. In most assemblies where pacearchaeosortase is found, only one protein can be found that is likely to be its target for sorting and cleavage, and it is encoded by the adjacent gene. This dedicated arrangement suggests the adjacent gene encodes a critical surface protein, most likely one that helps form an S-layer.	167
411147	NF033506	PACE-CTERM-PROT	putative S-layer protein. Assembled genomes from the Candidatus Pacearchaeota archaeon group and its close relatives, so far all uncultured, have a single archaeosortase, called pacearchaeosortase. A search for protein archaeosortase targets, with a C-terminal domain resembling other archaeosortase and exosortase sorting signal regions, found this family as the best candidate. It is nearly always encoded by a gene found adjacent to the pacearchaeosortase gene.  This dedicated arrangement, a sorting enzyme encoded next to its only predicted sorting substrate, suggests that members of this family, called PACE-CTERM, may be an important and abundant surface protein, most likely the major S-layer protein.	558
411148	NF033507	Loki-CTERM	Loki-CTERM protein-sorting domain. 	26
411149	NF033510	Ca_tandemer	Ca2+-stabilized adhesin repeat. This repeat is found in proteins such as the  biofilm-associated protein Bap of Acinetobacter baumannii (which can exceed 8000 amino acids in length), the calcium-stabilized ice-binding adhesin of the Antarctic bacterium Marinomonas primoryensis, and the giant calcium-binding adhesin SiiE of Salmonella enterica.	97
411150	NF033511	metallo_CpaA	metalloendopeptidase CpaA. 	575
411151	NF033512	T2SS_chap_CpaB	metalloprotease secretion chaperone CpaB. The cpaA and cpaB gene pair, as described in the genus Acinetobacter, consists of a metalloendopeptidase virulence factor, CpaA, and a tightly binding membrane-bound chaperone, CpaB, important for its secretion by a type II secretion system (T2SS). CpaA, in at least some Acinetobacter, is the most heavily secreted T2SS effector, and behaves as a virulence factor that cleaves factor V in blood and alters coagulation.	205
411152	NF033515	lipo_6_6_Borrel	Lp6.6 family lipoprotein. In Borrelia burgdorferi, the tiny lipoprotein Lp6.6 is plasmid-borne and was originally described as a major low-molecular-weight lipoprotein that could constitute 2% of the dry weight of defatted cells. Lp6.6 was later found to facilitate transmission from ticks to mice, and to be down-regulated after infection.	60
411153	NF033516	transpos_IS3	IS3 family transposase. 	369
411154	NF033517	transpos_IS66	IS66 family transposase. Members of this protein family are DDE transposases from the IS66 family insertion sequences, which typically consist of two accessory genes (TnpA and TnpB) and the third gene encoding the transposase.	388
411155	NF033518	transpos_IS607	IS607 family transposase. 	187
411156	NF033519	transpos_ISAzo13	ISAzo13 family transposase. 	387
411157	NF033520	transpos_IS982	IS982 family transposase. Currently, there are 46 seed sequences in this family.	243
411158	NF033521	lasso_leader_L3	lasso RiPP family leader peptide. 	20
411159	NF033522	lasso_benenodin	benenodin family lasso peptide. This family consists of both the leader (removed) and core (mature) portions of precursor peptides from the benenodin family of lasso peptides.	43
411160	NF033523	lasso_peptidase	Atxe2 family lasso peptide isopeptidase. 	637
411161	NF033524	lasso_PadeA_fam	paeninodin family lasso peptide. Members of this family are lasso peptides in the paeninodin family, mostly from the genera Bacillus, Paenibacillus, and Thermobacillus. The HMM covers the leader peptide region but only about half of the core peptide region, since it is quite diverse.	30
411162	NF033525	lasso_albusnod	albusnodin family lasso peptide. Members of this family are lasso peptides in the family of albusnodin, and appear limited so far to the Actinobacteria. Members are more strongly conserved in the core peptide region than in the leader peptide region, which is unusual for ribosomally produced, post-translationally modified natural products. The founding member of this family is not only circularized by the formation of an isopeptide bond, but also acetylated.	40
411163	NF033527	transpos_Tn3	Tn3 family transposase. 	954
411164	NF033528	lasso_cyano	lasso peptide. Members of this family are lasso peptide precursors of a type common in the Cyanobacteria and so far restricted to them. Some members have a pair of Cys residues in the core peptide region, C-terminal to the region modeled in the HMM, and therefore would form bicyclic compounds.	35
411165	NF033529	transpos_ISLre2	ISLre2 family transposase. 	378
411166	NF033530	lasso_PqqD_Strm	lasso peptide biosynthesis PqqD family chaperone. Members of this family are homologs of PqqD, a chaperone that binds RiPP peptide precursors for their modification into bioactive natural products. By context, this set is involved in the biosynthesis of threaded-lasso peptides. This model focuses on lasso peptide systems from Actinobacteria. Similar systems, with different subfamilies of PqqD-related peptides, occur in lasso peptide systems in other lineages. A characterized example is LarB1 from the lariatin system of Rhodococcus jostii.	78
411167	NF033532	lone7para_assoc	type VII secretion system-associated protein. Members of this family occur almost exclusively in the genus Streptomyces, in the context of type VII secretion systems (T7SS). Several paralogs may accompany a single T7SS. A few members of this family are large proteins with additional domains that add or remove ADP-ribosylations, suggesting that all family members may have effector activity as well, and that the longer members of the family are multifunctional effector proteins.	162
411168	NF033533	lone7_assoc_B	type VII secretion system-associated protein. Members of this family are found almost entirely in the genus Streptomyces, and are associated with a type VII secretion system (T7SS).	135
411169	NF033534	rhodolasso	lasso peptide. Members of this family are lasso peptides as found in the genus Rhodothermus. The leader peptide region shows sequence relatedness to several other groups of lasso peptide precursors. Known members of this family have a pair of Cys residues in the core peptide region, suggesting a bicyclic product, with one cross-link from the signature lasso peptide isopeptide bond and the other from the cysteine disulfide bond. This subfamily of lasso peptide appears not to have been discussed in the literature yet.	49
411170	NF033535	lass_lactam_cya	lasso peptide isopeptide bond-forming cyclase. Members of this family are the isopeptide bond-forming cyclase of lasso peptide biosynthesis systems, from a subgroup that contains primarily cyanobacterial examples. These proteins resemble the glutamine-hydrolyzing asparagine synthase AsnB (EC 6.3.5.4).	668
411171	NF033536	lasso_PqqD_Bac	lasso peptide biosynthesis PqqD family chaperone. Members of this family are homologs of PqqD, a chaperone that binds RiPP peptide precursors for their modification into bioactive natural products. By context, this set is involved in the biosynthesis of threaded-lasso peptides. This model focuses on lasso peptide systems from Firmicutes. Similar systems, with different subfamilies of PqqD-related peptides, occur in lasso peptide systems in other lineages.	88
411172	NF033537	lasso_biosyn_B2	lasso peptide biosynthesis B2 protein. 	130
411173	NF033538	transpos_IS91	IS91 family transposase. 	376
411174	NF033539	transpos_IS1380	IS1380 family transposase. Proteins of this family are DDE type transposases, which are encoded by IS1380 family elements. It was first identified and characterized in an Acetobacter pasteurianus mutant with ethanol oxidation deficiency caused by disruption of the cytochrome c gene by the IS1380 element.	417
411175	NF033540	transpos_IS701	IS701 family transposase. Members of this family are transposases in the family of that of insertion element IS701, narrowly defined. Note that a molecular phylogenetic tree of the broader sets of transposases  from IS elements classified as IS701 family or IS4 family by ISFINDER shows the two groups interleaved. This model represents an unambiguous clade that includes IS701 itself and the majority of proteins called IS701 family. The poorly conserved C-terminal region of members of this family is not included in the seed alignment.	345
411176	NF033541	transpos_ISH3	ISH3 family transposase. This family contains transposases from the insertion element ISH3, and related transposases from other mobile elements with similar transposases. This model reproduces the classification from ISFinder except for ISC1439B-like transposases, since those are extremely different.	298
411177	NF033542	transpos_IS110	IS110 family transposase. Proteins of this family are DEDD (Asp, Glu, Asp, Asp) type transposases, which are encoded by the IS110 family elements.	345
411178	NF033543	transpos_IS256	IS256 family transposase. Members of this family belong to the branch of the IS256-like family of transposases that includes the founding member. It excludes the IS1249 group.	406
411179	NF033544	transpos_IS1249	IS1249 family transposase. Members of this family belong to the IS1249 group branch of the broader IS256 family of transposases. This group differs sharply from the main branch, which includes founding member IS256 itself, by having an N-terminal region with two pairs of closely spaced Cys residues.	381
411180	NF033545	transpos_IS630	IS630 family transposase. 	298
411181	NF033546	transpos_IS21	IS21 family transposase. 	296
411182	NF033547	transpos_IS1595	IS1595 family transposase. Most transposases of the IS1595 family have an additional short N-terminal domain with a pair of CxxC motifs.	211
411183	NF033550	transpos_ISL3	ISL3 family transposase. 	369
411184	NF033551	transpos_IS1182	IS1182 family transposase. Members of this family are transposases of the IS1182 family.  About two-thirds of the members of this family have an extra domain between the middle and the C-terminal domain, about 50 amino acids in size and containing four invariant Cys residues.	437
411185	NF033553	MerP_Gpos	mercury resistance system substrate-binding protein MerP. Members of this family are MerP, a substrate-binding protein for a system in which toxic Hg(II) is transported into the cytosol, where it can be reduced to the much less toxic form Hg(0). Members of this family are found, so far, in Gram-positive bacteria. A related MerP family, as found in plasmid-borne systems in Gram-negative bacteria, is described by TIGR02052.	111
411186	NF033554	floc_PepA	flocculation-associated PEP-CTERM protein PepA. PepA was described in Zoogloea resiniphila as a PEP-CTERM protein regulated by the PrsK/PrsR two-component system. Knocking out that system blocks flocculation, after which expression of recombinant PepA can restore flocculation.	258
411187	NF033556	MerTP_fusion	mercuric transport protein MerTP. MerTP is a transport protein for the mercuric ion, Hg(II). Once imported to the cytosol, the highly toxic ion can be converted by MerA, mercury(II) reductase, to the less toxic Hg atom.	186
411188	NF033557	LLB_putidacin	putidacin L1 family lectin-like bacteriocin. Putidacin L1 is a well-described member of a family of lectin-like bacteriocins found almost exclusively in Pseudomonas. This subfamily is narrowly defined, and does not include the homolog pyocin L1.	274
411189	NF033558	transpos_IS1	IS1 family transposase. Proteins of this family are DDE transposases encoded by the IS1 family elements usually through a translational frameshift mechanism.	199
411190	NF033559	transpos_IS1634	IS1634 family transposase. Members of this protein family are DDE type transposases encoded by the IS1634 family elements, which were first identified and characterized in Mycoplasma mycoides.	463
411191	NF033561	macrolact_Ik_Al	albusnodin/ikarugamycin family macrolactam cyclase. Members of this family show homology to enzymes known to form the lactam bond of the isopeptide linkage of lasso peptides. This family includes the peptide cyclase involved in biosynthesis of the lasso peptide albusnodin. However, another member of this family belongs to the biosynthesis cassette for ikarugamycin, a macrolactam whose biosynthesis relies on a hybrid PKS/NRPS system, not a ribosomally produced peptide.	561
411192	NF033562	BH0509_fam	BH0509 family protein. This family of unknown function appears restricted to the Firmicutes. Proteomics evidence for expression was provided for the member from Bacillus cereus by Dr. Samuel Payne, Pacific Northwest National Labs. The family is named, for now, after BH0509 from Bacillus halodurans.	43
411193	NF033563	transpos_IS30	IS30 family transposase. 	267
411194	NF033564	transpos_ISAs1	ISAs1 family transposase. 	314
411195	NF033566	adhes_LIC20035	LIC20035 family adhesin. LIC20035 of Leptospira interrogans was characterized as a surface exposed adhesin that binds host extracellular matrix components. Orthologs appear restricted to the genus Leptospira. Member proteins average about 430 residues in length, much of which consists of repeats. All members are predicted lipoproteins.	428
411196	NF033567	act_recrut_TARP	type III secretion system actin-recruiting effector Tarp. The founding member of the Tarp (translocated actin-recruiting phosphoprotein) family is CT456 from Chlamydia trachomatis. Tarp is a type III secretion system effector. Orthologs are found in other Chlamydia, but are highly variable in length because many lack much of the repeat region.	761
411197	NF033570	FIB_Spiroplas	cytoskeletal motor fibril protein Fib. Fib, a 59K protein also called fibrillin, is the repeating subunit of the linear fibril that runs through the shortest path along the length of Spiroplasma cells, members of the Mollicutes that lack cell walls but have a spiral shape. The fibril is a linear contractile ribbon, and Fib is its only component.	511
411198	NF033571	motil_scm1_spiro	motility-associated protein Scm1. Scm1 (Spiroplasma citri motility gene 1) was shown by loss-of-function mutation to be involved in motility. The Scm1 family is widespread in the genus Spiroplasma, members of the Mollicutes (bacteria with no cell wall) that have a spiral shape organized around a contractile ribbon fibril made of repeating subunits of the Fib (fibril) protein.	401
411199	NF033572	transpos_ISKra4	ISKra4 family transposase. 	414
411200	NF033573	transpos_IS200	IS200/IS605 family transposase. Most IS200/IS605 family insertion sequences encode both this transposase, TnpA, about 130 amino acids long, and a larger accessory protein, TnpB, that may act as a methyltransferase.	126
411201	NF033576	mCpol	mCpol domain. The mCpol domain (minimal CRISPR polymerase) is named for its homology relationship to the catalytic domain of the CRISPR polymerases (often called Cmr2 or Cas10). It is predicted to generate cyclic nucleotides, potentially sensed by CARF domains, which in turn activate various effector domains, including HEPN RNases. CARF sensors and effectors are found in conserved genome contexts. It is part of a broader class of conflict systems reliant on the production of second messenger nucleotide or nucleotide derivatives. The putative function of the mCpol domain implies that CRISPR polymerases of the type III CRISPR/Cas systems have a nucleotide synthetase functional role.	118
411202	NF033577	transpos_IS481	IS481 family transposase. 	283
411203	NF033578	transpos_IS5_1	IS5 family transposase. 	415
411204	NF033579	transpos_IS5_2	IS5 family transposase. 	287
411205	NF033580	transpos_IS5_3	IS5 family transposase. 	257
411206	NF033581	transpos_IS5_4	IS5 family transposase. 	284
411207	NF033583	staphy_B_SbnC	staphyloferrin B biosynthesis protein SbnC. SbnC, related to siderophore biosynthesis protein IucA and IucC, is encoded in Staphylococcus aureus in the sbnABCDEFGHI locus responsible for the biosynthesis of staphyloferrin B, a carboxylate-type siderophore. SbnC is found in many species of Staphylococcus.	584
411208	NF033586	staphy_B_SbnF	staphyloferrin B biosynthesis protein SbnF. 	577
411209	NF033587	transpos_IS6	IS6 family transposase. 	203
411210	NF033588	transpos_ISC774	IS6 family transposase. ISC774 is an example of an outlier clade of IS6 family insertion sequences. Members so far appear restricted to the archaeal genus Sulfolobus.	195
411211	NF033589	staphy_B_SbnI	bifunctional transcriptional regulator/O-phospho-L-serine synthase SbnI. SbnI is a bifunctional protein involved in staphyloferrin B (staphylobactin) biosynthesis in Staphylococcus aureus and other members of the genus. The N-terminal region is heme-binding, and loses the ability to bind DNA when heme is bound. Under low iron conditions, the biosynthesis operon for staphyloferrin B, a carboxylate-type siderophore, is derepressed. The C-terminal domain is a kinase that acts on free serine, producing O-phospho-L-serine, which is used as one of the precursors of staphyloferrin B.	254
411212	NF033590	transpos_IS4_3	IS4 family transposase. 	403
411213	NF033591	transpos_IS4_2	IS4 family transposase. 	340
411214	NF033592	transpos_IS4_1	IS4 family transposase. 	332
411215	NF033593	transpos_ISNCY_1	ISNCY family transposase. The ISNCY insertion sequence family, as defined by ISFinder, encodes several apparently unrelated families of transposases. Members of this family resemble the transposases of ISNCY family elements such as ISRm17 from Sinorhizobium meliloti, ISMav9 from Mycobacterium avium, and ISNfl1 from Nostoc commune.	444
411216	NF033594	transpos_ISNCY_2	ISNCY family transposase. The ISNCY insertion sequence family, as defined by ISFinder, encodes several apparently unrelated families of transposases. Members of this family resemble the transposases of ISNCY family elements such as IS1202, ISTde1, ISKpn21, and ISCARN1.	367
411217	NF033595	denti_PrtP	dentilisin complex serine proteinase subunit PrtP. PrtP, a chymotrypsin-like protease known as dentilisin, forms a complex with PrcB and PrcA. It is found in Treponema denticola and in numerous other Treponema species. Dentilisin from T. denticola plays a significant role in pathogen-host interactions in periodontal disease.	608
411218	NF033596	denti_PrcB	dentilisin complex subunit PrcB. 	174
411219	NF033597	denti_PrcA	dentilisin complex subunit PrcA. PrcA is a lipoprotein that, together with PrcB and the serine proteinase subunit PrtP, forms a chymotrypsin-like surface complex that is also known as dentilisin, after its discovery and characterization in Treponema denticola. Dentilisin is an important virulence factor in periodontal disease.	611
411220	NF033598	elast_bind_EbpS	elastin-binding protein EbpS. The elastin-binding protein EbpS is an adhesin described in Staphylococcus aureus, with orthologs found in many additional staphylococcal species. EbpS is a membrane protein that lacks an N-terminal signal peptide region, has extensive regions of low-complexity sequence rich in Asn and Gln, and has a C-terminal LysM domain.	466
411221	NF033599	His_racem_CntK	histidine racemase CntK. CntK (cobalt and nickel transport system protein K) is a histidine racemase that performs the first step in the biosynthesis of staphylopine, a metallophore involved in the import of multiple divalent cations. It was first characterized in Staphylococcus aureus.	271
411222	NF033600	staphylopine_DH	staphylopine biosynthesis dehydrogenase. 	424
411223	NF033601	Sta_opine_CntL	staphylopine biosynthesis enzyme CntL. CntL (cobalt and nickel transporter L) is an enzyme involved in biosynthesis of staphylopine, a metallophore involved in the import of zinc, cobalt, nickel, and other divalent cations. CntL transfers aminobutyrate from S-adenosylmethionine, and is sometimes misannotated as a SAM-dependent methyltransferase. The staphylopine biosynthesis pathway was first characterized in Staphylococcus aureus.	255
411224	NF033602	campy_sm_acidic	highly acidic protein. This highly acidic protein, usually between 50 and 55 amino acids long, is found so far in Campylobacter jejuni and Campylobacter coli. A reanalysis of proteomics data, performed by Dr. Samuel Payne of Pacific Northwest National Labs, shows strong evidence for expression of AIW09513.1, founding member of the family.	50
411225	NF033603	mini-MOMP_1	mini-MOMP protein. Mini-MOMP proteins, found in several species of Campylobacter, are small proteins (about 63 amino acids long before removal of the signal peptide) with strong homology to the N-terminal region of MOMP, the major outer membrane protein that is Campylobacter's major porin.	63
411226	NF033604	epsi_CJH_07325	CJH_07325 family protein. Members of the CJH_07325 family are small proteins, shorter than 70 amino acids, expressed in members of the genera Campylobacter and Helicobacter. The function is unknown.	56
411227	NF033605	Zn_bnd_ABC_AdcA	zinc ABC transporter substrate-binding lipoprotein AdcA. 	516
411228	NF033606	heat_AAA_ClpK	heat shock survival AAA family ATPase ClpK. ClpK, a Clp family AAA ATPase, was discovered as a plasmid-encoded determinant for survival of heat shock along with other putative heat shock proteins. ClpK requires the presence of ClpP to confer heat resistance. ClpK is about  65% identical to ClpG. Note that PMID:26974352 and PMID:29263094 discuss both ClpG itself and a member of this family (ClpK) that they call ClpG-GI.	949
411229	NF033607	disagg_AAA_ClpG	AAA family protein disaggregase ClpG. ClpG, as characterized in Pseudomonas aeruginosa, is a Clp family member of the AAA+ family of ATPases. ClpG has stand-alone ability to disaggregate proteins from aggregates that result from heat stress. Both ClpG and its mobilized homolog ClpK provide increased survival of exposure to heat.	932
411230	NF033608	type_I_tox_Fst	type I toxin-antitoxin system Fst family toxin. This model represents an expansion of Pfam model PF13955, for type I toxin-antitoxin system Fst family toxins, with increased sensitivity to pick up a toxin from Streptococcus mutans described in PMID:23326602.	28
411231	NF033609	MSCRAMM_ClfA	MSCRAMM family adhesin clumping factor ClfA. Clumping factor A is an MSCRAMM (Microbial Surface Components Recognizing Adhesive Matrix Molecules). It is heavily studied in Staphylococcus aureus both for its biological role in adhesion and for its potential for vaccination. Features of the sequence, but also of other MSCRAMM adhesins, include a long run of Ser-Asp dipeptide repeats and a C-terminal cell wall anchoring LPXTG motif.	934
411232	NF033610	SLATT_3	SLATT domain. The SLATT domain contains two transmembrane helices. SLATT domains are generally predicted to function as pore-forming effectors in a class of conflict systems which are reliant on the production of second messenger nucleotide or nucleotide derivatives. SLATT domains are predicted to initiate cell suicide responses upon their activation. This SLATT family is always N-terminally fused to the SLATT_1 family, and is typically operonically linked to either inactive TIR domains or SLOG domains which could act as regulators of the SLATT channels.  The SLATT domain defined here (170 residues long) is similar to the DUF4231 domain (105 residues long) described in Pfam model PF14015.	164
411233	NF033611	SAVED	SAVED domain. The SAVED domain is predicted to function as a sensor domain, sensing nucleotides or nucleotide derivatives generated by SMODS and other nucleotide synthetase domains. The sensing of ligands by SAVED is predicted to activate effectors deployed by a class of conflict systems which are reliant on the production and sensing of the nucleotide second messengers.	260
411234	NF033615	CDF_MamM	magnetosome biogenesis CDF transporter MamM. 	290
411235	NF033616	CDF_MamB	magnetosome biogenesis CDF transporter MamB. 	284
411236	NF033617	RND_permease_2	multidrug efflux RND transporter permease subunit. 	1009
411237	NF033618	mlaB_1	lipid asymmetry maintenance protein MlaB. MlaB belongs to a system that maintains asymmetry in the outer membrane of Gram-negative bacteria, with LPS in the outer leaflet and phospholipids in the inner leaflet. Several components of the system share homology with typical ABC transporters.	94
411238	NF033619	perm_MlaE_1	lipid asymmetry maintenance ABC transporter permease subunit MlaE. 	253
411239	NF033620	pqiC	membrane integrity-associated transporter subunit PqiC. PqiC (YmbA), a lipoprotein, has been identified as part of the PqiABC system, a transporter that bridges the inner and outer membranes in species such as Escherichia coli and is important to membrane integrity.	185
411240	NF033621	de_GSH_amidase	deaminated glutathione amidase. 	260
411241	NF033622	repair_DdrC	DNA damage response protein DdrC. DdrC is a DNA-binding protein that seems restricted to the genus Deinococcus, and that plays a role in the ability of members of that genus to recover from fragmentation of their DNA. Note that the region where DdrC is found in Deinococcus radiodurans R1 originally had incorrect structural annotation, with a feature designated DR_0003 shown on the opposite strand.	223
411242	NF033623	urate_HpxO	FAD-dependent urate hydroxylase HpxO. HpxO is an FAD-dependent urate hydroxylase (EC 1.14.13.113). Like the factor-independent urate hydroxylase (EC 1.7.3.3), it consumes O2 and converts urate to 5-hydroxyisourate, which decomposes spontaneously to allantoin and CO2. However, HpxO oxidizes NADH to NAD(+), and produces H2O, while EC 1.7.3.3 produces H2O2 as a byproduct.	382
411243	NF033624	HpxX	oxalurate catabolism protein HpxX. HpxX is a small protein of unknown function, about 60 residues in length, encoded in the set of four genes, hpxWXYZ, that belong to the oxalurate metabolism portion of a complete pathway for hypoxanthine (hpx) utilization, as in Klebsiella pneumoniae.	55
411244	NF033625	HpxZ	oxalurate catabolism protein HpxZ. HpxZ is not characterized, but it is encoded in the cluster hpxWXYZ, associated with oxalurate catabolism, within the larger hpx (hypoxanthine utilization) locus of species such as Klebsiella pneumoniae, where KPN_01771 is HpxZ.	122
411245	NF033628	snapalysin	snapalysin. Snapalysin (SnpA, or Small Neutral Protease A) belongs to the metzincin family of zinc-dependent metalloendopeptidases.	207
411246	NF033629	RiPP_CPAC	RiPP peptide. 	52
411247	NF033630	SLATT_6	SLATT domain. The SLATT domain contains two transmembrane helices. SLATT domains are generally predicted to function as pore-forming effectors in a class of conflict systems which are reliant on the production of second messenger nucleotide or nucleotide derivatives. SLATT domains are predicted to initiate cell suicide responses upon their activation. This SLATT family associates with a SMODS nucleotide synthetase domain fused to the predicted AGS-C sensor domain. It is sometimes further coupled to R-M systems.	179
411248	NF033631	SLATT_5	SLATT domain. The SLATT domain contains two transmembrane helices. SLATT domains are generally predicted to function as pore-forming effectors in a class of conflict systems which are reliant on the production of second messenger nucleotide or nucleotide derivatives. SLATT domains are predicted to initiate cell suicide responses upon their activation. This SLATT family contains an additional C-terminal alpha-helix, and strictly associates with a reverse transcriptase domain, part of a predicted retroelement with diversity-generating potential.	182
411249	NF033632	SLATT_4	SLATT domain. The SLATT domain contains two transmembrane helices. SLATT domains are generally predicted to function as pore-forming effectors in a class of conflict systems which are reliant on the production of second messenger nucleotide or nucleotide derivatives. SLATT domains are predicted to initiate cell suicide responses upon their activation. This SLATT family is often coupled to the SMODS nucleotide synthetase and is sometimes further embedded in other conflict systems like CRISPR/Cas or R-M systems.	154
411250	NF033633	SLATT_2	SLATT domain. The SLATT domain contains two transmembrane helices. SLATT domains are generally predicted to function as pore-forming effectors in a class of conflict systems which are reliant on the production of second messenger nucleotide or nucleotide derivatives. SLATT domains are predicted to initiate cell suicide responses upon their activation. This SLATT family is the only prokaryotic SLATT family to exist as a standalone domain, with no as-yet discernable genome associations.	181
411251	NF033634	SLATT_1	SLATT domain. The SLATT domain contains two transmembrane helices. SLATT domains are generally predicted to function as pore-forming effectors in a class of conflict systems which are reliant on the production of second messenger nucleotide or nucleotide derivatives. SLATT domains are predicted to initiate cell suicide responses upon their activation. This SLATT family is often C-terminally fused to the SLATT_3 family, and is typically operonically linked to either inactive TIR domains or SLOG domains which could act as regulators of the SLATT channels. In relatively rare instances, it is genomically linked as a standalone domain to the RelA/SpoT nucleotide synthetase and the predicted NA37/YejK sensor domain.	135
411252	NF033635	SLATT_fungal	SLATT domain. The SLATT domain contains two transmembrane helices. SLATT domains are generally predicted to function in bacteria as pore-forming effectors in a class of conflict systems which are reliant on the production of second messenger nucleotide or nucleotide derivatives. SLATT domains are predicted to initiate cell suicide responses upon their activation. The role of this fungal family is not yet understood, although the expansion of the family in many fungal lineages points to a potential role in conflict.	125
411253	NF033638	RNase_AS	polyadenylate-specific 3'-exoribonuclease AS. RNase AS is a 3'-exoribonuclease, found in Mycobacterium tuberculosis and other Actinobacteria, that acts specifically to degrade polyadenylate sequences from the 3'-end of RNA.	155
411254	NF033640	N_Twi_rSAM	twitch domain-containing radical SAM protein. Members of this family are unusual among radical SAM proteins in several ways. First, the N-terminal region consists of an iron-sulfur cluster-binding twitch domain (half of a SPASM domain), something usually found C-terminal to the radical SAM domain. Second, the radical SAM domains in many of the members of this family score poorly vs. the Pfam HMM, PF04055 (version 19), used to identify radical SAM. Lastly, the majority of members sequenced to date come from uncultured bacteria from marine or aquifer sources rather than from conventionally cultured bacterial isolates. The function is unknown.	396
411255	NF033641	antiterm_LoaP	antiterminator LoaP. LoaP is a paralog of NusG with an extensive presence in Firmicutes. The founding member, from Bacillus amyloliquefaciens, was shown to serve as an antiterminator for the transcription of genes involved in antibiotic biosynthesis.	166
411256	NF033642	stress_AzuC	stress response protein AzuC. AzuC is a basic, extremely small protein (28 amino acids in Escherichia coli K-12) whose expression is repressed by cyclic AMP response protein (CRP) and stimulated by acidic pH.	26
411257	NF033644	antiterm_UpxY	UpxY family transcription antiterminator. The UpxY family of NusG-related transcription antiterminators was described originally from a paralogous family of eight members from Bacteroides fragilis, UpaY to UphY, each of which was associated with a distinct capsular polysaccharide biosynthesis locus. There is no UpxY protein per se.	162
411258	NF033645	pilus_FilE	putative pilus assembly protein FilE. FilE is found almost exclusively in the genus Acinetobacter, and is assigned as a putative pilus system protein from local genomic contexts that include several additional putative pilus system proteins. Note that some members of this protein family have proline-rich repeat regions for which spurious translation in another reading frame can give a false-positive match to Pfam's collagen repeat region HMM, PF01391.	411
411259	NF033647	adhesin_LEA	LEA family epithelial adhesin N-terminal domain. LEA (Lactobacillus epithelium adhesin), as characterized in an adhesive commensal strain of Lactobacillus crispatus (ST1), is a large, repetitive protein with an N-terminal YSIRK-type signal peptide and a C-terminal LPXTG site for processing by sortase and attachment to the cell surface. Family members contain variable numbers of 82-amino-acid-long repeats similar to Lactobacillus Rib/alpha-like repeats. This HMM describes the N-terminal region upstream of the repeat region, just over 600 amino acids long.	655
411260	NF033649	LipDrop_Rv1109c	lipid droplet-associated protein. RHA1_ro05869 from Rhodococcus jostii RHA1, an ortholog of Rv1109c from Mycobacterium tuberculosis, has been shown to associate specifically with lipid droplets that consist of a neutral lipid core enveloped by a phospholipid monolayer and surface proteins. Lipid droplets of triacylglycerol can be especially prominent in members of the genus Rhodococcus, but occur also in Mycobacterium tuberculosis and can support dormancy of that pathogen.	199
411261	NF033650	ANR_neg_reg	ANR family transcriptional regulator. ANR family transcriptional regulators include the AggR-activated regulator Aar. The name ANR (AraC Negative Regulators) refers to an effect on, rather than homology to, certain AraC family transcriptional regulators.	54
411262	NF033652	LbtU_sider_porin	LbtU family siderophore porin. LbtU, from Legionella pneumophila, a novel TonB-independent siderophore uptake outer membrane protein from a species that lacks TonB, is the founding member of a class of porins that may be involved generally in siderophore-mediated iron acquisition.	322
411263	NF033656	DMQ_monoox_COQ7	2-polyprenyl-3-methyl-6-methoxy-1,4-benzoquinone monooxygenase. 	205
411264	NF033657	choice_anch_F	choice-of-anchor F family protein. Choice-of-anchor F is a domain found in prokaryotic proteins with a variety of C-terminal sorting and transit domains. These include the autotransporter outer membrane beta-barrel domain, the JDVT-CTERM domain, and variant forms of PEP-CTERM domains.	296
411265	NF033662	acid_disulf_rpt	acidic double-disulfide repeat. The acidic double-disulfide repeat is an Asp-rich repeat with four nearly invariant Cys residues in a repeat length of about 35 amino acids.	32
411266	NF033663	AceI_fam_PACE	AceI family chlorhexidine efflux PACE transporter. The AceI family of proton-coupled multidrug/biocide efflux transporters includes several shown to respond to the presence of the biocide chlorhexidine and improve resistance to it, among other biocides and antibiotics. Members of the family seen so far, including founding member AceI, have been chromosomal and restricted to the genus Acinetobacter.	136
411267	NF033664	PACE_transport	PACE efflux transporter. PACE transporters, including the chlorhexidine efflux transporter AceI, average about 140 amino acids in length and consist of two tandem homologous domains described by Pfam model PF05232. PACE transporters are single component efflux transporters that sit in the plasma membrane and couple proton import to substrate export.	130
411268	NF033665	PACE_efflu_PCE	multidrug/biocide efflux PACE transporter. PACE (proteobacterial antimicrobial compound efflux) transporters are single component proton-coupled efflux pumps that help confer resistance to a number of biocides and antibiotics. The family has also been named PCE (proteobacterial chlorhexidine efflux). Members of this subfamily of the PACE transporters, distinct from the AceI-like branch, include several whose expression is increased by exposure to chlorhexidine and/or help confer increased resistance to it.	130
411269	NF033668	rSAM_PA0069	PA0069 family radical SAM protein. PA0069 from Pseudomonas aeruginosa is the founding member of a family of radical SAM enzymes of unknown function. Note that inclusion of some members of this family in COG1533, along with some spore photoproduct lyase (SPL) proteins, has led to some family members being annotated as SPL.	348
411270	NF033672	mbn_chaper_assoc	copper uptake system-associated domain. Proteins from this family may contain just an N-terminal signal peptide followed by this domain, or have a copper chaperone domain in addition, just past the signal peptide. A majority of bacteria that encode a peptide-derived methanobactin precursor encode a member of this family as well, strongly suggesting a role for this domain in copper acquisition.	104
411271	NF033674	stress_OB_fold	NirD/YgiW/YdeI family stress tolerance protein. Members of this family possess an N-terminal signal peptide, and are associated with tolerance to various toxic stresses. These include antimicrobial peptides (YdeI), hydrogen peroxide (YgiW and YdeI), and nickel (NcrY and NirD).	110
411272	NF033675	NTTRR-F1	NTTRR-F1 domain. NTTRR-F1 (N-terminal To Repetitive Region - Firmicutes 1) is a homology domain found strictly as the N-terminal non-repetitive region of otherwise highly repetitive proteins of various Firmicutes. The repetitive region that follows typically is collagen-like, with every third residue a glycine.	155
411273	NF033676	Lacb_SerRich_Nt	serine-rich glycoprotein adhesin prefix domain. Lacb_SerRich_Nt describes a Lactobacillus-restricted N-terminal non-repetitive sequence region shared by proteins with extensive serine-rich repeat regions, all likely to function as adhesins. This region contains a variant form of the KxYKxGKxW motif (see TIGR03715) followed by a region related to serine-rich glycoprotein adhesins of the Streptococci.	80
411274	NF033677	biofilm_BapA_N	BapA prefix-like domain. Two largely unrelated repetitive proteins, both named biofilm-associated protein BapA (from Salmonella enterica and from Paracoccus denitrificans) share homology domains at the two ends. Both lack a typical signal peptide for translocation by Sec, and instead depend on type I secretion for export and for contribution to biofilm formation. The conserved prefix (i.e. N-terminal) domain is shared by a number of other large, repetitive proteins of Proteobacteria thought to be associated with adhesion or biofilm formation.	64
411275	NF033678	C69_fam_dipept	C69 family dipeptidase. Members of the MEROPS C69 family (subfamily 001) are dipeptidases (EC 3.4.13.-).	463
411276	NF033679	DNRLRE_dom	DNRLRE domain. The DNRLRE domain, with a length of about 160 amino acids, appears typically in large, repetitive surface proteins of bacteria and archaea, sometimes repeated several times. It occurs, notably, three times in the C-terminal region of the enzyme disaggregatase from the archaeal species Methanosarcina mazei, each time with the motif DNRLRE, for which the domain is named.  Archaeal proteins within this family are described particularly well by the currently more narrowly defined Pfam model, PF06848. Note that the catalytic region of disaggregatase, in the N-terminal portion of the protein, is modeled by a different HMM, PF08480.	164
411277	NF033680	exonuc_ExeM-GG	extracellular exonuclease ExeM. ExeM, as described in Shewanella oneidensis, is a biofilm formation-associated exonuclease that cleaves extracellular DNA (eDNA), a biofilm component. Members of the ExeM family contain two or three pairs of Cys residues, presumed to form disulfide bonds, and a C-terminal GlyGly-CTERM membrane-anchoring segment. Strangely, engineered removal of the GlyGly-CTERM region did not result in net export from the cell and appearance of the enzyme in culture supernatants.	883
411278	NF033681	ExeM_NucH_DNase	ExeM/NucH family extracellular endonuclease. 	545
411279	NF033682	retention_LapA	retention module-containing protein. The retention module, as described for the giant adhesin LapA of Pseudomonas fluorescens and for an ice-binding giant adhesin of an Antarctic bacterium, appears at the N-terminus of a number of very large repetitive proteins, many of which have C-terminal regions that make them substrates for type I secretion systems.	145
411280	NF033683	di_4Fe-4S_YfhL	YfhL family 4Fe-4S dicluster ferredoxin. 	79
411281	NF033684	suffix_2_RND	transporter suffix domain. Members of this protein family contain a highly hydrophobic region about 70 amino acids long that usually occurs as essentially the full length of a small membrane protein, but in some cases occurs as a C-terminal suffix domain for RND efflux transporter permease subunit proteins.	69
411282	NF033685	Tet_leader_L	tetracycline resistance efflux system leader peptide. Stalling of the ribosomal translation of a tetracycline resistance system mRNA from inability to properly translate a short leader peptide can affect mRNA secondary structure, preventing early transcriptional termination and therefore inducing expression of the resistance protein. This leader peptide is found upstream of efflux transporters such as tet(L) and tet(45) (which occur in Gram-positive organisms), but on occasion may occur as a fused additional N-terminal domain of such efflux proteins.	20
411283	NF033686	leader_PheM_1	pheST operon leader peptide PheM. 	14
411284	NF033688	MG406_fam	MG406 family protein. Homologs to MG406 from Mycoplasma genitalium and MPN605 from Mycoplasma pneumoniae are about 150 amino acids long on average, highly hydrophobic, widespread in but restricted to the Mollicutes, and highly divergent there. MG406 itself appears to be an essential gene.	117
411285	NF033689	N2Fix_CO_CowN	N(2)-fixation sustaining protein CowN. Carbon monoxide inhibits the ability of Mo-nitrogenase to fix nitrogen, but expression of CowN is protective, and allows nitrogen fixation to continue.	89
411286	NF033690	ErmCL_fam_lead	ErmCL family antibiotic resistance leader peptide. 	19
411287	NF033691	immunity_MafI	MafI family immunity protein. MafI proteins, as described in Neisseria species, are small proteins encoded in modules with MafB (multiple adhesin family B) secreted toxins. Homologs are found broadly, primarily in Proteobacterial genera such as Neisseria, Cronobacter, Enterobacter, Pseudomonas, etc.	67
411288	NF033694	perox_inhi_SPIN	SPIN family peroxidase inhibitor. SPIN (staphylococcal peroxidase inhibitor) binds to and inhibits human myeloperoxidase to resist killing after phagocytosis by neutrophils.	99
411289	NF033696	CrpP_fam	CrpP family protein. CrpP, described originally as a protein encoded by Pseudomonas aeruginosa plasmid pUM505, confers elevated resistance to ciprofloxacin, but not to four other fluoroquinolone-type antibiotics tested. The apparent mechanism is phosphorylation. However, CrpP appears more closely related to the ribosome modulation factor Rmf than to any known aminoglycoside phosphotransferase.	58
411290	NF033697	leader_RseD	rpoE leader peptide RseD. RseD, as described originally in Escherichia coli, is a leader peptide translationally coupled to the extracytoplasmic stress response sigma factor RpoE. It participates in the CrsA-mediated fine-tuning of the stress response. It is found also in Salmonella enterica, Cedecea neteri, etc. The corresponding locus in Klebsiella pneumoniae appears to be split into an upstream and a downstream peptide.	31
411291	NF033701	yciY_fam	YciY family protein. Members of the YciY family are named after the gene symbol given in E. coli K-12, but members of the family are found also in Salmonella, Klebsiella, Yersinia, Erwinia, etc.	56
411292	NF033703	transcr_KstR	cholesterol catabolism transcriptional regulator KstR. KstR, a protein characterized in Mycobacterium tuberculosis (MTB) and M. smegmatis, is a TetR family transcriptional regulator that is essential for pathogenesis in MTB. It controls the expression of about 80 proteins involved in the earlier stages of cholesterol catabolism. KstR binds not to cholesterol itself, but to catabolites found early in the degradation pathway.	185
411293	NF033704	helico_Hpn_like	nickel-binding protein HpnL. The Hpn-like protein HpnL, or Hpn-2, including founding member HP1432, is a histidine and glutamine-rich metal-binding polypeptide that binds nickel, among other metals, and may help deliver sequestered nickel for the biosynthesis of urease, which is critical to the survival of Helicobacter pylori during exposures to strongly acidic conditions. This protein is a paralog to the histidine-rich nickel storage protein Hpn (e.g. HP1427). Both nickel-binding proteins are expressed in response to nickel, under control of the regulator NikR.	71
411294	NF033705	helico_Hpn	nickel storage protein Hpn. Hpn (Helicobacter pylori nickel) is a histidine-rich polypeptide, 60 amino acids in length, capable of binding nickel and several other metals. It can store a reserve of nickel for biosynthesis of active urease, critical to the ability of the bacterium to survive acid exposure during colonization, and hydrogenase. Hpn is closely related in its N-terminal region to a somewhat longer His and Gln-rich protein, called the Hpn-like protein. Both are expressed, under control of NikR, in response to nickel.	60
411295	NF033706	Ni_bind_SCO4226	SCO4226 family nickel-binding protein. Members of the SCO4226 family belong to the larger family of DUF4242 domain-containing proteins, described by Pfam model PF14026. SCO4226 itself was shown to dimerize and bind four nickel atoms per homodimer.	82
411296	NF033707	T9SS_sortase	type IX secretion system sortase PorU. PorU, part of type IX secretion systems (T9SS), is the protease responsible for both removing the C-terminal sorting signal found in substrates and for its replacement by anionic LPS, through which most T9SS substrates become attached to the cell surface after secretion.	1056
411297	NF033708	T9SS_Cterm_ChiA	T9SS sorting signal type C. The sorting signals of type IX secretion systems (T9SS) in the CFB bacteria are long, compared to other prokaryotic C-terminal sorting motif-containing signals, including LPXTG, PEP-CTERM, and GlyGly-CTERM, and they seem to contain multiple motifs. A few T9SS substrates, including ChiA, have a variant form of T9SS sorting signal that may score poorly to both TIGR04183 (type A) and TIGR04131 (type B), depend on T9SS for secretion, but are released from the cell rather than left anchored to the cell surface.	55
411298	NF033709	PorV_fam	PorV/PorQ family protein. Proteins closely related to PorV, and its paralog PorQ, are found regularly in species with type IX secretion systems (T9SS), the system associated with a type of gliding motility in many of the Bacteroidetes.	327
411299	NF033710	T9SS_OM_PorV	type IX secretion system outer membrane channel protein PorV. PorV, as characterized in oral pathogen Porphyromonas gingivalis, is a component of the type IX secretion system (T9SS) needed to process a subset of T9SS substrates. PorV is a paralog of PorQ.	368
411300	NF033711	T9SS_PorQ	type IX secretion system protein PorQ. 	330
411301	NF033712	B12_rSAM_KedN5	KedN5 family methylcobalamin-dependent radical SAM C-methyltransferase. KedN5, the founding member of a family of radical SAM enzymes with an N-terminal B12-binding domain, is a C-methyltransferase that relies on a methylcobalamin cofactor during natural product biosynthesis.	624
411302	NF033713	DbpA	DbpA. 	166
411303	NF033715	glycyl_HPDL_Lrg	4-hydroxyphenylacetate decarboxylase large subunit. 4-hydroxyphenylacetate decarboxylase, an enzyme with a glycyl radical active site, depends on a radical SAM enzyme for activation, and is found in strict anaerobes such as Clostridium difficile. It has a large and a small subunit.	901
411304	NF033716	glycyl_HPDL_Sma	4-hydroxyphenylacetate decarboxylase small subunit. 4-hydroxyphenylacetate decarboxylase, an enzyme with a glycyl radical active site, depends on a radical SAM enzyme for activation, and is found in strict anaerobes such as Clostridium difficile. It has a large and a small subunit.	79
411305	NF033717	HPDL_rSAM_activ	4-hydroxyphenylacetate decarboxylase activase. 4-hydroxyphenylacetate decarboxylase activase is a radical SAM enzyme, found in anaerobic bacteria where 4-hydroxyphenylacetate decarboxylase occurs and required to prepare the glycyl radical active site of the enzyme.	311
411306	NF033718	indole_decarb	indoleacetate decarboxylase. Indoleacetate decarboxylase is a single subunit glycyl radical enzyme that depends on a cognate radical SAM enzyme for its activation. It performs the final step in the anaerobic fermentation of tryptophan to skatole, a malodorous volatile compound.	868
411307	NF033719	ind_deCO2_activ	indoleacetate decarboxylase activase. 	302
411308	NF033720	DbpB	decorin-binding protein DbpB. 	182
411309	NF033721	P12_lipo	P12 family lipoprotein. 	287
411310	NF033723	S2_P23	S2/P23 family protein. 	179
411311	NF033724	P13_porin	P13 family porin. 	178
411312	NF033725	borfam_49	chromosome replication/partitioning protein. 	150
411313	NF033726	borfam52	P52 family lipoprotein. 	171
411314	NF033728	borfam54_1	complement regulator-acquiring protein. 	319
411315	NF033729	borfam54_2	complement regulator-acquiring protein. 	219
411316	NF033730	borfam54_3	complement regulator-acquiring protein. 	292
411317	NF033731	borfam63	fibronectin-binding protein RevA. 	160
411318	NF033732	borfam95	exported protein A EppA. 	176
411319	NF033733	MFS_ArsK	arsenite efflux MFS transporter ArsK. ArsK, a major facilitator superfamily (MFS) transporter, was shown in Agrobacterium tumefaciens to be induced by arsenite and antimonite, to reduce their accumulation, and to confer resistance when expressed heterologously.	388
411320	NF033734	MFS_ArsJ	organoarsenical efflux MFS transporter ArsJ. ArsJ regularly is encoded next to a glyceraldehyde-3-phosphate dehydrogenase that can synthesize 1-arseno-3-phosphoglycerate in the presence of arsenate, and appears to provide arsenate resistance by exporting that organoarsenical compound before it spontaneously dissociates into arsenate and 3-phosphoglycerate.	392
411321	NF033735	G3PDH_Arsen	ArsJ-associated glyceraldehyde-3-phosphate dehydrogenase. 	324
411322	NF033737	Amm_Lyn_leader	ammosamide/lymphostin biosynthesis leader domain. Precursors of the ammosamide (Amm6) and lymphostin family of natural products share both a strongly conserved leader peptide region that interacts with natural product biosynthesis enzymes, and a critical Trp residue at or near the C-terminus that is incorporated into the small molecule natural product eventually produced - a pyrroloquinoline alkaloid whose core is derived from tryptophan. An additional, shorter homolog, Xan, encoded in a less well understood natural product biosynthesis locus, lacks the critical Trp. It is thought to be a trans-acting leader peptide that does not need to be linked covalently to a core peptide, if that is the substrate of the natural product biosynthesis enzymes, in order to enable them to perform their modifications.	27
411323	NF033738	microvirid_RiPP	microviridin/marinostatin family tricyclic proteinase inhibitor. Members of the microviridin/marinostatin family are ribosomally translated peptides whose post-translational processing converts them into tricyclic depsipeptides that serve as serine proteinase inhibitors. A single precursor usually has one core peptide region near the C-terminus, with a nearly invariant TxKYPSD motif, but may instead have two or three repeats of the core region.	47
411324	NF033739	intramemb_PrsW	intramembrane metalloprotease PrsW. PrsW, an intramembrane protease, cleaves the anti-sigma factor RsiW, which regulates the activity of the ECF-type sigma factor SigW.	209
411325	NF033740	MarP_fam_protase	MarP family serine protease. The founding member of this family of membrane-spanning serine proteases, which is restricted to Actinobacteria, is the acid resistance periplasmic serine protease MarP of Mycobacterium tuberculosis. Recent work shows that MarP is required to cleave and activate the peptidoglycan hydrolase RipA, and loss of RipA activity creates a defect in progeny separation during cell division. Therefore, the requirement for MarP in order to survive acidic conditions may be a consequence of peptidoglycan hydrolysis requirements, explaining why MarP family members are distributed more broadly in the Actinobacteria than the subset of species capable of surviving intracellularly as pathogens.	390
411326	NF033741	NlpC_p60_RipA	NlpC/P60 family peptidoglycan endopeptidase RipA. 	457
411327	NF033742	NlpC_p60_RipB	NlpC/P60 family peptidoglycan endopeptidase RipB. 	206
411328	NF033743	NlpC_inact_RipD	NlpC/P60 family peptidoglycan-binding protein RipD. RipD proteins, such as founding member Rv1566c from Mycobacterium tuberculosis, are catalytically inactive paralogs of the peptidoglycan endopeptidases RipA and RipB. A catalytically important Cys and His pair is replaced by Ala-83 and Ser-132.	177
411329	NF033745	class_C_sortase	class C sortase. 	218
411330	NF033746	class_D_sortase	class D sortase. 	135
411331	NF033747	class_E_sortase	class E sortase. 	211
411332	NF033748	class_F_sortase	class F sortase. 	155
411333	NF033749	bact_hemeryth	bacteriohemerythrin. Bacteriohemerythrin, an O2-carrying protein that lacks a heme moiety, is named based on its homology to eukaryotic proteins such as myohemerythrin.	129
411334	NF033750	vWF_bind_Staph	von Willebrand factor binding protein Vwb. The von Willebrand factor binding protein Vwb, like its paralog staphylocoagulase, is a coagulase and a virulence factor. It induces clotting, not by being an enzyme, but by activating prothrombin to generate fibrin.	510
411335	NF033751	pallilysin_like	pallilysin-related adhesin. In contrast to pallilysin itself (a bifunctional adhesin and protease), members of the pallilysin-related adhesin family average twice the length, lack the HEXXH motif essential to pallilysin's metalloprotease activity, and are likely to function in virulence only as an adhesin. Typical members of this family include TDE0840 from Treponema denticola and BB0038 from Borrelia burgdorferi, which share less than 20% pairwise amino acid sequence identity.	385
411336	NF033752	linaridin_CypA	cypemycin family RiPP. The cypemycin precursor CypA belongs to the linaridin class (linear "arid" peptide, following dehydration modifications) of RiPP natural product precursors. The signature terminal motif CL[VI]C is modified by decarboxylation of the C-terminal Cys residue, followed by cyclization.	52
411337	NF033753	RiPP_decarbCypD	CypD family RiPP peptide-cysteine decarboxylase. CypD, a Cys decarboxylase flavoprotein, oxidatively removes the carboxyl moiety from the C-terminal Cys residue of CypA, the precursor of the RiPP natural product cypemycin.	182
411338	NF033754	gliding_CglC	adventurous gliding motility lipoprotein CglC. CglC (cell contact-dependent gliding, or conditional gliding, motility protein C), also called adventurous gliding motility protein AgmO, is found in delta-proteobacterial species that exhibit a taxonomically restricted form of gliding motility.	156
411339	NF033755	gliding_CglE	adventurous gliding motility protein CglE. 	172
411340	NF033756	gliding_GltC	adventurous gliding motility protein GltC. GltC is a soluble periplasmic protein required for a type of gliding motility found in certain social delta-proteobacteria, including the model species Myxococcus xanthus.	621
411341	NF033757	gliding_CglB	adventurous gliding motility lipoprotein CglB. CglB is an outer membrane lipoprotein required for A-motility in Myxococcus xanthus and other delta-proteobacteria. It is transferable between cells that have a compatible TraAB system, and was therefore named conditional (or cell contact-dependent) gliding motility protein B, or CglB.	403
411342	NF033758	gliding_GltE	adventurous gliding motility TPR repeat lipoprotein GltE. GltE (also called AglT) is a tetratricopeptide repeat protein with a lipoprotein signal peptide and a role in A-motility (adventurous gliding motility) in Myxococcus xanthus and other delta-proteobacteria.	411
411343	NF033759	exchanger_TraA	outer membrane exchange protein TraA. TraA, together with its partner TraB, mediates a large scale exchange of outer membrane lipoproteins, and lipids, between closely related strains or clonally identical cells, in certain delta-proteobacterial species such as Myxococcus xanthus. The exchange mechanism is likely to involve fusion of outer membrane, probably done to coordinate the social behaviors these bacteria display.	662
411344	NF033760	gliding_GltG	adventurous gliding motility protein GltG. GltG proteins, including the founding member MXAN_4867 from Myxococcus xanthus, occur in certain delta-proteobacteria and are involved in adventurous gliding (A-)motility. GltG has an N-terminal forkhead-associated (FHA) domain, often associated with signal transduction.	647
411345	NF033761	gliding_GltJ	adventurous gliding motility protein GltJ. Adventurous gliding motility protein GltJ, also known as AgmX, occurs in delta-proteobacteria such as Myxococcus xanthus.	671
411346	NF033762	social_mot_Tgl	social motility TPR repeat lipoprotein Tgl. Social motility in delta-proteobacterial species such as Myxococcus xanthus depends on a type IV pilus, which in turn depends on assembly of the PilQ secretin complex. Tgl, a tetratricopeptide repeat (TPR) outer membrane lipoprotein, is required for PilQ assembly.	252
411347	NF033763	exchanger_TraB	outer membrane exchange protein TraB. TraB, as described originally in the delta-proteobacterium Myxococcus xanthus, is a protein with a C-terminal OmpA-like domain, and is encoded in an operon with TraA. Together TraAB make it possible for bacterial cells with close enough kinship to exchange outer membrane lipoproteins, such that certain motility defects in mutant cells can be corrected through the exchange. This exchange most likely involves membrane fusion events, as large amounts of lipid are also exchanged, and it is restricted by a bi-directional kin recognition requirement. Among wild-type cells, these exchanges likely help coordinate various social behaviors.	522
411348	NF033764	gliding_CglF	adventurous gliding motility protein CglF. CglF, as originally described in Myxococcus xanthus, is a gliding motility protein. It has the property that motility in a cglF loss mutant can be restored by close contact with compatible kin strains where cglF is wild type. The restoration of motility depends on bidirectional kinship recognition, which is mediated by TraAB and leads to exchange of outer membrane proteins and lipids. This cell contact requirement leads to the gene symbol, cglF, meaning conditional (or, cell contact-dependent) gliding F. This protein has also been called GltF.	87
411349	NF033765	gliding_CglD	adventurous gliding motility lipoprotein CglD. 	155
411350	NF033766	choice_anch_G	choice-of-anchor G family protein. Choice-of-anchor proteins belong to homology families in which various branches carry C-terminal sorting signals known, or suspected, to be processed by transpeptidases such as sortase, exosortase, archaeosortase, rhombosortase, or the type 9 secretion system sorting enzyme. Members of this family, called choice-of-anchor G, include sortase and exosortase targets, and are likely to be found on the cell surface.	282
411351	NF033767	exosort_XrtS	exosortase S. Members of the exosortase S family occur in the high GC Gram-positive order Micrococcales (a branch of the Actinobacteria), in genera such as Arthrobacter, Microbacterium, Curtobacterium, and Paenarthrobacter.	155
411352	NF033768	myxo_SS_tail	AgmX/PglI C-terminal domain. The myxo_SS_tail domain occurs as the C-terminal domain in multiple proteins per genome for a number of species capable of surface gliding motility, e.g. 12 in Myxococcus xanthus. Member proteins include the adventurous gliding motility proteins AgmX (GltJ) and PglI in M. xanthus. The domain is about 92 amino acids long, and features a pair of Cys residues about 45 amino acids apart in almost all cases.	92
411353	NF033769	after_VWA_1	after-VIT domain. The after-VIT domain is a bacterial surface protein C-terminal domain found on some proteins that have both the Vault protein Inter-alpha-Trypsin (VIT) domain and a von Willebrand factor type A domain. Note that some after-VIT domain-containing proteins, such as members of TIGR03788, may have a known C-terminal sorting signal, such as LPXTG or PEP-CTERM, instead of the after-VIT domain. The after-VIT domain appears to be homologous to the myxo_SS_tail domain, some of whose member proteins are involved in adventurous gliding motility in Myxococcus xanthus, and it is similarly located.	90
411354	NF033770	exosort_XrtT	exosortase T. Exosortase T is a variant form of exosortase, typically found in Alphaproteobacteria in genera such as Pseudovibrio, Labrenzia, and Ruegeria. Members of this family may be dedicated enzymes, processing a single substrate encoded by a nearby gene, rather than processing multiple proteins like exosortases A and B.	472
411355	NF033771	colonize_BriC	biofilm-regulating peptide BriC. BriC (Biofilm-Regulating peptide Induced by Competence), as characterized in Streptococcus pneumoniae, is a cell-cell communication peptide, or peptide pheromone, that is induced by expression of ComE (a master regulator of competence). BriC contributes to biofilm formation, and to colonization in a mouse model.	60
411356	NF033772	pheromone_VP1	peptide pheromone VP1. VP1 (virulence peptide 1), as characterized in Streptococcus pneumoniae (a.k.a. pneumococcus), is part of a large panel of secreted regulatory peptides with paralogous leader domains, typically ending with the cleavage site GlyGly, but with highly variable core regions. VP1, along with other pneumococcal peptide pheromones such as BriC, participates in cell-cell communication to regulate pathogenic processes such as biofilm formation.	65
411357	NF033773	tellur_TrgA	TrgA family protein. TrgA, a protein associated with tellurium resistance (but less critical to the phenotype than TrgB, encoded by the adjacent gene), is the founding member of a family of hydrophobic proteins, about 150 amino acids in length, probably embedded in the membrane, and possibly involved in transport.	145
411358	NF033774	phos_trans_PitA	inorganic phosphate transporter PitA. PitA is a low-affinity transporter for inorganic phosphate. It imports phosphate complexed to divalent cations such as Mg(2+) or Ca(2+). Loss of PitA function can confer reduced sensitivity to high levels of Zn(2+). The PitA of Escherichia coli has a closely related paralog, PitB, that is functional but only minimally expressed. Note that the term PitA is used broadly. This exception-level HMM identifies a rather narrow clade of PitA transporters which, however, includes both PitA and PitB of E. coli K-12.	499
411359	NF033775	P_type_ZntA	Zn(II)/Cd(II)/Pb(II) translocating P-type ATPase ZntA. 	732
411360	NF033776	stress_YhcN	peroxide/acid stress response protein YhcN. 	87
411361	NF033777	M_group_A_cterm	M protein C-terminal domain. M protein (emm) is an important virulence protein and serology-defining surface antigen of Streptococcus pyogenes (group A Streptococcus). M protein has an amino-terminal YSIRK-type signal sequence (associated with cross-wall targeting in dividing cells), and a C-terminal LPXTG domain for processing by sortase and covalent attachment to the Gram-positive cell wall. Past the signal peptide, M protein has a hypervariable region, but this HMM describes only the well-conserved region C-terminal to the hypervariable region. It discriminates M protein from two related proteins, Enn and Mrp.	218
411362	NF033778	trans_TimA	TIM44-related membrane protein TimA. TimA was first described in Caulobacter crescentus, a member of the alpha-proteobacterial lineage that gave rise to the mitochondrion. It is notable because of homology to Tim44, a protein involved in protein translocation in both human and yeast mitochondria, although TimA itself is not considered involved in protein translocation. TimA is found localized to the plasma membrane, on the cytosolic face. The mitochondrial homolog, Tim44, is found as a peripheral protein of the inner face of the mitochondrial inner membrane, and serves as an adaptor to recruit other subunits into the translocation complex, rather than forming part of the transport channel itself.	198
411363	NF033779	Tim44_TimA_adap	Tim44/TimA family putative adaptor protein. Members of this family resemble both the eukaryotic protein Tim44, important to the assembly of a protein translocase in mitochondria, and the TimA protein of alpha-proteobacteria such as Caulobacter crescentus. TimA may assist in protein recruitment to the membrane, as Tim44, but appears not to be part of any complex associated with protein translocation.	215
411364	NF033780	exosort_XrtU_C	exosortase U C-terminal region. The XrtU family of exosortases is marked by a distinctive C-terminal region, modeled by the exosort_XrtU_C hidden Markov model. Because members of the archaeosortase and exosortase family perform cleavage of target proteins, in pathways thought to lead to new covalent attachment at the C-terminus and surface anchoring, the XrtU C-terminal domain may indicate the presence of a novel anchoring chemistry.	173
411365	NF033782	lipoprot_Omp28	Omp28 family outer membrane lipoprotein. The Omp28 family of lipoproteins is named for a founding member described in Porphyromonas gingivalis, where it has been shown across many strains to be an expressed surface antigen. All members of the family are predicted lipoproteins.	263
411366	NF033785	sulfur_OscA	sulfur starvation response protein OscA. OscA (organosulfur compound A) is a small protein, about 60 amino acids in length, in the DUF2292 family. As characterized in Pseudomonas corrugata, OscA is required during sulfur starvation for obtaining it from organosulfur compounds. The pathway is required to remediate oxidative stress from chromate, so oscA was discovered by the loss of high resistance to chromate in Pseudomonas corrugata 28 when the gene is insertionally inactivated. The oscA gene tends to be found near sulfate transporter genes.	60
411367	NF033787	HTH_BldC	BldC family transcriptional regulator. BldC, a helix-turn-helix transcription factor with homology to the mercury resistance transcriptional regulator MerR, is a DNA-binding protein. It is considered the founding member of a subfamily of regulators with an asymmetric head-to-tail oligomerization for cooperative DNA binding, rather than classic dimerization.	49
411368	NF033788	HTH_metalloreg	metalloregulator ArsR/SmtB family transcription factor. Transcriptional repressors that sense toxic heavy metals such as arsenic or cadmium, and are released from DNA so that resistance factors will be expressed, include ArsR, SmtB, ZiaR, CadC, CadX, KmtR, etc. However, some members of this family, including the sporulation delaying system autorepressor SdpR and its family (see NF033789), may lack metal-binding sites and instead regulate other cellular processes.	76
411369	NF033789	repress_SdpR	autorepressor SdpR family transcription factor. Transcription factors in the family of the sporulation delaying system autorepressor SdpR (of Bacillus subtilis) resemble metalloregulatory transcriptional repressors such as ArsR, SmtB, CadX, ZiaR, etc., but may lack the key metal-binding residues.	79
411370	NF033790	CnrY_NccY_antiS	CnrY/NccY family anti-sigma factor. 	95
411371	NF033791	ActR_PrrA_rreg	ActR/PrrA/RegA family redox response regulator transcription factor. ActR, PrrA, and RegA are examples of lineage-specific names given to a response regulator transcription factor that acts as a global regulator and belongs to a sensor-regulator pair that is highly conserved in the alpha-proteobacteria. Examples of this regulator include ActR (acid tolerance regulator) in Sinorhizobium meliloti, stationary phase response regulator SpdR in Caulobacter crescentus,	173
411372	NF033792	ActS_PrrB_HisK	ActS/PrrB/RegB family redox-sensitive histidine kinase. This redox-responsive histidine kinase, found in alpha-proteobacteria, shows strong sequence conservation, including the notable motif [VA]AAAAHELGTPxTI. It always acts as a partner to an ActR/PrrA/RegA family global response regulator transcription factor in a two-component sensory transduction system. Lineage-specific names and gene symbols given to this histidine kinase reflect downstream regulator changes such as entry into stationary phase, anaerobic expression of photosynthesis genes, and survival of exposure to low pH.	423
411373	NF033793	peri_CopK	periplasmic Cu(I)/Cu(II)-binding protein CopK. 	88
411374	NF033794	chaper_CopZ_Eh	copper chaperone CopZ. Copper chaperone CopZ, as the name is used in Enterococcus hirae and related species, is a small copper-binding protein with close homology to domains found, sometimes in multiple copies, in various copper-translocating P-type ATPases, and to distinct families of other small copper chaperones that are also named CopZ.	68
411375	NF033795	chaper_CopZ_Bs	copper chaperone CopZ. This model describes CopZ, a small copper chaperone, as found in Bacillus subtilis and related species. A number of longer proteins, such as copper-translocating P-type ATPases, contain multiple CopZ-like domains, each with its signature invariant CxxC motif. CopZ from other species may be more different in sequence from this family than some of those domains of longer proteins.	66
411376	NF033796	selen_YedE_FdhT	selenium metabolism membrane protein YedE/FdhT. Members of this family are predicted multiple membrane-spanning proteins, and therefore thought likely to be transporters. It appears that all species whose genomes encode a member of this family produce selenocysteine-containing enzymes, typically formate dehydrogenase. The family member from Campylobacter jejuni was shown to be essential for formate dehydrogenase (a selenocysteine-containing enzyme) expression and activity, and so it was named a formate dehydrogenase accessory protein, FdhT. Note that this family is related to (but distinct from) that of TIGR04112, which is similarly restricted to species with pathways for selenium incorporation.	387
411377	NF033797	phero_SHP2_SHP3	SHP2/SHP3 family peptide pheromone. 	23
411378	NF033798	biofilm_StcA	StcA family protein. StcA (streptococcal charged A protein), as described in Streptococcus pyogenes, is a small, positively charged, secreted protein that participates in a quorum-sensing system, and promotes biofilm formation. Related proteins are found in several other Streptococcus spp.	89
411379	NF033799	inhib_PhrA	PhrA family phosphatase inhibitor. 	44
411380	NF033800	quorum_NprX	quorum-signaling peptide NprX. NprX, also called NprRB, belongs to the NprR-NprX quorum-sensing system in Bacillus. The mature form of the peptide pheromone is the SKPDIVG heptapeptide.	43
411381	NF033801	NprX_fam	NprX family peptide pheromone. 	42
411382	NF033802	AimP_fam	lysogeny pheromone AimP family peptide. AimP is the quorum signaling-like peptide pheromone of a phage system, called the arbitrium system, that detects environmental evidence of predecessor phage activity, in order to direct a lysis/lysogeny decision.	43
411383	NF033803	TOMM_BorA	TOMM family putative cytolysin BorA. The BorA family of thiazole/oxazole-modified microcin (TOMM) peptides that are putative cytolysins and virulence factors have been found, so far, encoded by plasmids in members of the genus Borreliella.	37
411384	NF033804	Streccoc_I_II	antigen I/II family LPXTG-anchored adhesin. Members of the antigen I/II family are adhesins with a glucan-binding domain, two types of repetitive regions, an isopeptide bond-forming domain associated with shear resistance, and a C-terminal LPXTG motif for anchoring to the cell wall. They occur in oral Streptococci, and tend to be major cell surface adhesins. Members of this family include SspA and SspB from Streptococcus gordonii, antigen I/II from S. mutans, etc.	1552
411385	NF033805	invasion_CiaB	invasion protein CiaB. CiaB (Campylobacter invasion antigen B) is important for host cell invasion by Campylobacter. It is found as well in a number of other species capable of invasion, including Campylobacter coli, Campylobacter rectus, and Helicobacter pullorum.	600
411386	NF033806	laterosporulin	laterosporulin family class IId bacteriocin. 	52
411387	NF033807	CopL_fam	CopL family metal-binding regulatory protein. The founding member of this family was shown to be involved in the copper-responsive expression of a multicopper oxidase copA encoded downstream. The regulatory function likely involves copper-binding, but activity as a DNA-binding transcriptional regulator was not demonstrated directly.	129
411388	NF033808	copper_CopD	copper homeostasis membrane protein CopD. CopD is an inner membrane protein encoded in Gram-negative bacterial systems that provide resistance to excess copper. It is typically chromosomal, although the PcoD protein of the plasmid-borne pco copper resistance system is an exceptional member of the CopD family.	295
411389	NF033814	copper_CopC	copper homeostasis periplasmic binding protein CopC. 	109
411390	NF033816	Cj0069_fam	Cj0069 family protein. Cj0069 from Campylobacter jejuni, described as a serological marker that can show a patient was previously infected with that organism, is the founding member of an uncommon broadly distributed family of proteins. Members of that family are found also in various Helicobacter spp.,  Bradyrhizobium spp., and Corynebacterium spp.	333
411391	NF033817	Mplas_variab_LP	variable surface lipoprotein signal domain. This HMM describes a homology domain of lipoprotein signal peptides restricted to the genus Mycoplasma, and found in paralogous families that typically are associated with antigenic phase variation, as expression of some members of the family is turned on, others turned off, by means of promoter region modifications by site-specific DNA invertases. The avg family of Mycoplasma agalactiae represents one such family of lipoproteins.	30
411392	NF033819	IS66_TnpB	IS66 family insertion sequence element accessory protein TnpB. The IS66 family insertion sequence element encodes a DDE transposase TnpC, and two accessory proteins, TnpA and TnpB. It has been assumed that the TnpA, TnpB, and TnpC proteins are produced independently in appropriate amounts and form a complex, which acts as a transposase to promote the transposition of an IS66 family element.	90
411393	NF033820	STM0539_fam	STM0539 family protein. STM0539, from Salmonella enterica strain LT2, is the founding member of a family of proteins of unknown function, about 145 amino acids in length.	141
411394	NF033821	YoaK	YoaK family small membrane protein. YoaK is a small protein (about 32 amino acids) found in E. coli (from which it is named), Salmonella, Klebsiella, Pantoea, and related taxa. It associates with the inner membrane.	32
411395	NF033823	archmetzin	archaemetzincin family Zn-dependent metalloprotease. 	170
411396	NF033824	Myxo_non_zincin	non-proteolytic archaemetzincin-like protein. Members of this family resemble the archaemetzincins, a mostly archaeal family of Zn-dependent metalloproteases, but lack the critical catalytic glutamate (E) in the signature motif HEXXH, within the longer pattern HEXXHXXGX3CX4CXMX17CXXC that defines the archaemetzincins. Members of this family are found in Myxococcus xanthus and related members of the Delta-proteobacteria.	174
411397	NF033826	immun_CdiI	ribonuclease toxin immunity protein CdiI. CdiI proteins, including the founding member from Escherichia coli strain STEC_O31, serve as immunity proteins for the toxic tRNA-cleaving ribonuclease toxin CdiA. The system confers contact-dependent inhibition (cdi) between different strains of bacteria.	118
411398	NF033827	CDF_efflux_DmeF	CDF family Co(II)/Ni(II) efflux transporter DmeF. DmeF, a metal efflux transporter, belongs to the cation diffusion facilitator (CDF) family. Examples from different species have been described as primarily being induced by, or performing efflux of, different panels of metals, including Co(II) and Ni(II) for Rhizobium leguminosarum and Agrobacterium tumefaciens, and a broader spectrum for Wautersia metallidurans CH34.	308
411399	NF033828	entry_exc2_fam	Exc2 family lipoprotein. Exc2, a plasmid-encoded predicted lipoprotein of small size, was once described as a plasmid entry exclusion protein (1985). However, a more recent article (1995) says entry exclusion activity should instead be ascribed to MbeD.	131
411400	NF033829	plas_excl_MbeD	MbeD family mobilization/exclusion protein. MbeD, as found in the ColE1 plasmid, was originally described as a plasmid mobilization protein. Later, it was shown that MbeD additionally was responsible for a plasmid entry exclusion phenotype that had previously been ascribed to products of the exc1 and exc2 genes.	69
411401	NF033830	NleE_fam_methyl	NleE/OspZ family T3SS effector cysteine methyltransferase. NleE from Escherichia coli O157:H7 strain Sakai, and its homolog OspZ from Shigella, are the founding members of a family of SAM-dependent protein--cysteine methyltransferases. Both are type III secretion system (T3SS) effectors involved in virulence, and have the host protein NF-kappa-B as substrates.	212
411402	NF033831	sce7725_fam	sce7725 family protein. This family of uncharacterized proteins is named for founding member sce7725 from Sorangium cellulosum, from the Deltaproteobacteria. It belongs to a gene pair found sporadically in genera as diverse as Enterococcus, Lactobacillus, Staphylococcus, Streptococcus, Acinetobacter, and Klebsiella. The partner in each gene pair is a member of the sce7726 family.	311
411403	NF033832	sce7726_fam	sce7726 family protein. This family of uncharacterized proteins is named for founding member sce7726 from Sorangium cellulosum, from the Deltaproteobacteria. It belongs to a gene pair found sporadically in genera as diverse as Enterococcus, Lactobacillus, Staphylococcus, Streptococcus, Acinetobacter, and Klebsiella, or in phage from those lineages. The partner in each gene pair is a member of the sce7725 family.	182
411404	NF033833	rhodan_ChrE	rhodanese family chromate resistance protein ChrE. 	108
411405	NF033835	VraH_fam	VraH family protein. 	54
411406	NF033837	GarQ_core	garvicin Q family class II bacteriocin core domain. This HMM describes the core (mature) peptide region of GarQ, the class II bacteriocin garvicin (garvieacin) Q, and homologous peptides with similar core regions. Some members, such as GarQ itself, have a classical GlyGly-containing ComC/BlpC-like leader peptide, but others have N-terminal regions (probable leader peptides) of a different type.	42
411407	NF033838	PspC_subgroup_1	pneumococcal surface protein PspC, choline-binding form. The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A. The other form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site.	684
411408	NF033839	PspC_subgroup_2	pneumococcal surface protein PspC, LPXTG-anchored form. The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A.	557
411409	NF033840	PspC_relate_1	PspC-related protein choline-binding protein 1. Members of this family share C-terminal homology to the choline-binding form of the pneumococcal surface antigen PspC, but not to its allelic LPXTG-anchored forms because they lack the choline-binding repeat region. Members of this family should not be confused with PspC itself, whose identity and function reflect regions N-terminal to the choline-binding region. See Iannelli, et al. (PMID: 11891047) for information about the different allelic forms of PspC.	648
411410	NF033841	small_YshB	YshB family small membrane protein. YshB, a membrane-associated protein typically 36 to 40 amino acids in length, is found conserved in genera of the gamma-proteobacteria including Cronobacter, Enterobacter, Escherichia, Klebsiella, Salmonella, and Serratia. The gene symbol derives from E. coli K-12.	40
411411	NF033842	small_MgtS	protein MgtS. MgtS, previously called YneM, is a small inner membrane protein that modulates Mg(2+) concentrations through its effects on the P-type transporter MgtA.	30
411412	NF033843	small_YpfM	protein YpfM. 	19
411413	NF033844	small_YqgB	acid stress response protein YqgB. 	40
411414	NF033845	MSCRAMM_ClfB	MSCRAMM family adhesin clumping factor ClfB. Clumping factor B is an MSCRAMM (Microbial Surface Components Recognizing Adhesive Matrix Molecules). Features of the sequence, but also of other MSCRAMM adhesins, include a long run of Ser-Asp dipeptide repeats and a C-terminal cell wall anchoring LPXTG motif.	871
411415	NF033846	Rumino_NPXTG	NPXTG family C-terminal sorting domain. Rumino_NPXTG represents a flavor of C-terminal protein sorting signal in species related to Ruminococcus albus. In that lineage, multiple sortases per genome may be found, including multiple B-type sortases. Proteins found by this HMM (more than 12 encoded in a representative complete genome) may represent substrates of a panel of related sortases, while additional proteins found in Ruminococcus genomes with below-cutoff hits to this model may be processed by other sortases.	29
411416	NF033847	MCP_Sipho	major capsid protein, Siphoviridae type. This protein is a phage major capsid protein, as reported in primary sequence submissions of a large number of Siphoviridae, many of which have hosts in the Mycobacterium and Gordonia genera of bacteria.	549
411417	NF033848	VgrG_rel	VgrG-related protein. Members of this family resemble Vgr proteins of type VI secretion systems (T6SS) as found in various proteobacteria. However, members of this family occur instead in genera such as Streptomyces and Roseiflexus. The biological roles and molecular functions of proteins in this family appear not to have been characterized.	547
411418	NF033849	ser_rich_anae_1	serine-rich protein. This serine-rich protein belongs to a family with large size (over 1000 amino acids), with a highly serine-rich central region that averages over 300 aa in length. Species encoding members of this family of proteins tend to be anaerobic bacteria, including Gram-positive bacteria of the human gut microbiome and Chloroflexi from marine sediments.	1122
411419	NF033850	LCxxNW	mobility-associated LCxxNW protein. This protein belongs to a family of small proteins, about 65 amino acids long, with three invariant Cys residues, including one in the motif LCxxNW, for which the family is named. Member proteins are found in contexts that suggest they are accessory proteins of mobile elements. The context includes members of families PF01695, PF01610, and PF00665, all associated with transposition and/or integration.	61
411420	NF033852	fulvocin_rel	bacteriocin fulvocin C-related protein. Fulvocin C was described in 1981 as a bacteriocin from Myxococcus fulvus, 45 amino acids long with 8 cysteines. The precursor form was not described. However, the most closely related precursor-like proteins represent the founding members of a family of proteins that average over 225 amino acids in length, the majority of which have a C-terminal tail region in which the 8 Cys residues are essentially invariant. The long N-terminal region, and the sharp change in amino acid composition at the start of what appears to be the bacteriocin core peptide region, suggests that the N-terminal region may contribute directly in peptide maturation or transport, rather than merely providing for recognition by maturation proteins.	149
411421	NF033853	KPN_two_small	small membrane protein. Members of this family, including paralogs KPN_01023 and KPN_01923 from Klebsiella pneumoniae subsp. pneumoniae MGH 78578, are small proteins, typically 40 amino acids in length. A highly hydrophobic region, about 25 amino acids in length with a composition typical of transmembrane segments, is followed by a highly basic region of about 15 amino acids. Analogous proteins of similar size, with apparently similar hydrophobic regions, have been described as membrane-associated proteins expressed in response to various types of cell envelope stress.	37
411422	NF033854	esterase_BioV	pimelyl-ACP methyl ester esterase BioV. BioV, found in Helicobacter pylori and a number of related Epsilonproteobacteria, replaces BioH as a pimelyl-ACP methyl ester esterase required for biotin biosynthesis.	167
411423	NF033855	tRNA_MNMC2	tRNA (5-methylaminomethyl-2-thiouridine)(34)-methyltransferase MnmD. This HMM describes either the N-terminal region, called MNMC2, of the tRNA modification bifunctional enzyme MnmC, or a free-standing protein that performs the same methyltransferase function, in partnership with an FAD-dependent protein, or C-terminal region, called MNMC1 (see TIGR03197).	205
411424	NF033856	T4SS_effec_BID	T4SS effector BID domain. The BID domain (Bartonella intracellular delivery domain) is recognized by the type IV secretion system (T4SS) virB (not trw) of Bartonella and related taxa (e.g. Ochrobactrum), and is found in T4SS effector proteins such as BepA, BepB, BepC, etc. Multiple copies of the domain may be found in a single protein.	109
411425	NF033857	BPSL0067_fam	BPSL0067 family protein. 	115
411426	NF033858	ABC2_perm_RbbA	ribosome-associated ATPase/putative transporter RbbA. 	907
411427	NF033859	SMEK_N	SMEK domain. The SMEK domain is named for four genera in which multiple, diverse members of this uncommon family of bacterial proteins are found: Staphylococcus, Mycoplasma, Escherichia, and Klebsiella. Members of the family are highly variable in length. This domain occurs as the N-terminal region. The four scattered invariant residues in the seed alignment, which may provide a clue to function, are Glu, Asp, Gln, and Lys.	97
411428	NF033860	Wzy_O6_O28	oligosaccharide repeat unit polymerase. Members of this family are oligosaccharide repeat unit polymerases in a subfamily that includes the Wzy proteins for polymerization of the O-antigens O6, O28, O39, O59, and several others.	384
411429	NF033861	sm_mem_Ecr	Ecr family regulatory small membrane protein. Ecr, as described in the genus Enterobacter, is a small membrane protein predicted to span the inner membrane. It was found to modulate expression of PhoP, part of the PhoP-PhoQ two component system, which in turn induces the arnBCADTEF operon, leading to modification of LPS and conferring elevated resistance to colistin. The form described in PMID:31169899  (WP_048029797.1), at 72 amino acids in length, is aberrant compared with the typical length for the majority of family members, about 46 residues.	46
411430	NF033863	immun_TipC_fam	TipC family immunity protein. This family is named for founding member TipC1 (previously TipC), an immunity protein for the toxin TelC, which is a type VII secretion system (T7SS) effector lipid II phosphatase.	192
411431	NF033864	cytochrome579	cytochrome 579. Cytochrome 579, as described originally in Leptospirillum from acid mine drainage, is an abundant red cytochrome that acts as an electron transfer protein involved in Fe(II) oxidation.	178
411432	NF033865	fusolisin	autotransporter serine protease fusolisin. 	1003
411433	NF033869	viru_reg_Rsp	AraC family transcriptional regulator Rsp. Rsp (repressor of surface proteins), as described in Staphylococcus aureus, is a large protein with an AraC-like helix-turn-helix DNA-binding domain. Regulatory targets include the accessory gene regulator (agr) operon, which in turn regulates a large number of virulence factors.	701
411434	NF033870	VOMP_auto_Cterm	Vomp family autotransporter C-terminal domain. The Vomp (variably expressed outer-membrane proteins) family, as described in Bartonella, consists of autotransporter surface proteins including collagen-binding autotransporter adhesins VompA and VompC.	356
411435	NF033871	deAMP_SidD	deAMPylase SidD family protein. The founding member of this protein family, SidD from Legionella pneumophila, is a type IV secretion system effector that acts as a deAMPylase of the host protein. Homologs are found in several other Gammaproteobacteria, including several different Legionella and Coxiella species.	180
411436	NF033872	SidA_fam	T4SS effector SidA family protein. The founding member of this protein family, SidA from Legionella pneumophila, is a minimally characterized type IV secretion system substrate. Homologs are found in a wide range of Legionella species.	379
411437	NF033873	SidJ_poly_Glu	SidJ family T4SS effector polyglutamylation protein. SidJ, called a pseudokinase because its polyglutamylation activity differs from what might be expected from its kinase-like fold, is the founding member of a family of such pseudokinases. SidJ itself, as described in Legionella pneumophila, is exported by a type IV secretion system (T4SS), and modifies and modulates the activity of T4SS effector SidE.	758
411438	NF033874	SidJ_rel_pseudo	SidJ-related pseudokinase. Members of this family are uncharacterized but exhibit strong local sequence similarity to SidJ, a protein that exhibits a protein kinase fold, but that surprisingly exhibits a different activity. In the case of SidJ itself, the activity is polyglutamylation. For this family, the activity is unknown.	508
411439	NF033875	Agg_substance	LPXTG-anchored aggregation substance. Aggregation substances, as described in Enterococcus, are LPXTG-anchored large surface proteins that contribute to virulence. Several closely related paralogs may be found in a single strain.	1306
411440	NF033876	flagella_HExxH	flagellinolysin. Flagellinolysin is a variant form of bacterial flagellin in which the normally hypervariable central region contains an M9 (MEROPS classification) family metalloprotease domain, with its signature HExxH motif. The founding member of the family, from the pathogen Clostridium haemolyticum, shows EDTA-sensitive metalloprotease activity. The large count of flagellin subunits in a complete flagellum means the capacity of flagellinolysin to perform as a protease may have implications for host-pathogen relationships.	380
411441	NF033878	thiovarsolin	thiovarsolin family RiPP. The thiovarsolins, named for a founding member from Streptomyces varsoviensis, are RiPPs (ribosomally synthesized and post-translationally modified peptides). As with the thioviridamides, thiovarsolin precursors are encoded in loci that encode YcaO and TfuA family proteins, suggesting post-translational modification by thioamidation.	89
411442	NF033879	smalltalk	smalltalk protein. Smalltalk is a membrane-associated protein of very small size (less than 35 amino acids), found broadly in Bacteroides and Prevotella, both of which are prevalent in human gut microbiomes. Genomic context suggests a role in crosstalk in the gut microbiome, whether that involves toxins and immunity, signaling, or some other form of interaction. The family was identified and discussed by Sberro, et al., in a screen for overlooked small proteins encoded within human microbiomes, and named smalltalk here for its small size and cross-talk role.	29
411443	NF033880	Prli42	stressosome-associated protein Prli42. Prli42, as characterized in Listeria monocytogenes and found broadly in the Firmicutes, is a membrane protein of very small size, essential to the function of the stressosome. It appears to be related to DUF4044 (PF13253).	31
411444	NF033881	aureocin_A53	aureocin A53 family class IId bacteriocin. Members of this family include leaderless, unmodified class IId bacteriocins such as lacticin Q, BacSp222, and the founding member aureocin A53.	48
411445	NF033882	T4SS_lipo_DotD	type IVB secretion system lipoprotein DotD. Members of this family are the lipoprotein DotD from type IVB secretion systems, which are also called Dot/Icm secretion systems. DotD is related to conjugal transfer protein TraH as that term is used in IncI1 plasmid transfer regions.	143
411446	NF033883	conj_TraQ_IncI1	conjugal transfer protein TraQ. 	175
411447	NF033884	conj_TraO_IncI1	conjugal transfer protein TraO. TraO, involved in the conjugal transfer of plasmids such as IncI1 plasmids, shares homology with IcmE of type IVB secretion systems.	381
411448	NF033885	conj_TraP_IncI1	conjugal transfer protein TraP. Members of this family are the conjugal transfer protein TraP, as the term is used for the member protein from IncI1 plasmids and for their homologs. Note that the same terminology may be applied to unrelated proteins from other forms of conjugal transfer system.	217
411449	NF033886	T4SS_DotA	type IVB secretion system protein DotA. This HMM distinguishes DotA of type IVB secretion systems from TraY as the term is used in the conjugal transfer systems of IncI1 family plasmids.	777
411450	NF033887	conj_TraX	conjugal transfer protein TraX. 	163
411451	NF033888	conj_TraW	conjugal transfer protein TraW. Members of this family are the TraW protein of conjugal plasmid transfer systems, as the term is used for certain transfer systems, including that of IncI1 family plasmids. Note that an unrelated protein, also designated TraW, participates in the assembly of F-pilin subunits, involved in transfer of F-plasmids.	380
411452	NF033889	termin_lrg_T7	phage terminase large subunit. The phage terminase large subunit (TerL) is also called DNA maturase B. It oligomerizes into an ATPase that forces DNA into a pre-existing phage prohead to package the DNA. This TerL family includes members from phage T7 and phage T3, among others.	499
411453	NF033890	DotM_IcmP_IVB	type IVB secretion system coupling complex protein DotM/IcmP. 	354
411454	NF033891	surf_exc_IncI1	plasmid IncI1-type surface exclusion protein ExcA. The surface exclusion protein ExcA, as found in R64 and other IncI1 family plasmids, is not required for plasmid transfer. Instead, it is required for blocking transfer of closely related plasmids into the host cell.	210
411455	NF033892	XcbB_CpsF_sero	XcbB/CpsF family capsular polysaccharide biosynthesis protein. Two partially characterized members of this family are XcbB, as described in Neisseria meningitidis serotype X, and CpsF as described in Enterococcus faecalis serotype C. In the latter case, loss of CpsF converts capsular polysaccharide to serotype D.	291
411456	NF033893	pheromone_ipd	peptide pheromone inhibitor Ipd. The pheromone inhibitor iPD1, in mature form, is the last 8 amino acids of the product of the ipd gene. It was described in conjugative plasmids of Enterococcus faecalis.	21
411457	NF033894	Eex_IncN	EexN family lipoprotein. Members of this family are lipoproteins, typically associated with mobile elements such as phage, and related to the entry exclusion protein Eex of IncN-type plasmids. Members of this family tend to be small (shorter than 90 amino acids) with three invariant Cys residues, one of which belongs to the lipoprotein signal peptide. This family shares some similarities with TIGR04359, which includes the entry exclusion lipoprotein TrbK of IncPalpha-type plasmids.	65
411458	NF033896	MFS_LfrA	efflux MFS transporter LfrA. This efflux transporter, as characterized in Mycolicibacterium (Mycobacterium) smegmatis, provides low-level fluoroquinolone resistance (lfr) when overexpressed.	504
411459	NF033897	GUMAP_C	GUMAP protein C-terminal domain. GUMAP (Giant Ureaplasma Membrane-Anchored Protein) is a very large protein found in several species of the genus Ureaplasma, a lineage that resembles Gram-positive bacteria by ancestry but that lacks a peptidoglycan cell wall. GUMAP proteins average about 5000 amino acids in length. Near the C-terminus, these proteins have a strongly hydrophobic segment followed immediately by a cluster of basic residues, as is characteristic of proteins with a C-terminal membrane anchor. Because of the potentially large computation cost of performing database searches with a protein profile HMM that is thousands of residues long, this HMM models only about 750 amino acids of the C-terminal region of GUMAP proteins.	729
411460	NF033898	QWxxN_dom	QWxxN domain-containing protein. The QWxxN domain is about 125 amino acids long, and appears typically as the conserved core region in up to 9 tandem repeats, each about 200 amino acids long. Proteins with this domain are known so far only in the genus Enterococcus, and may reach over 3000 amino acids in length.	127
411461	NF033899	T4SS_pilin_TrwL	VirB2 family type IV secretion system major pilin TrwL. TrwL is the major pilin of Trw type IV secretion system (T4SS) of Bartonella species. It is related to VirB2 of related T4SS and to the conjugal transfer protein TrbC. The Trw system is unusual for having duplications of certain subunits, and TrwL has divergent, tandem-duplicated copies, named in series TrwL1, TrwL2, etc.	103
411462	NF033900	T4SS_IcmE_DotG	type IVB secretion system protein DotG/IcmE. 	1012
411463	NF033901	L_lactate_LldD	FMN-dependent L-lactate dehydrogenase LldD. LldD is an FMN-dependent L-lactate dehydrogenase. It occurs in E. coli, Salmonella, and as one of two L-lactate dehydrogenases in Pseudomonas aeruginosa. It is unrelated to the NAD-dependent enzyme.	377
411464	NF033902	iso_D2_wall_anc	SpaH/EbpB family LPXTG-anchored major pilin. Members of this family are pilin major subunits whose structure includes an LPXTG motif-containing signal (see TIGR01167) near the C-terminus, for processing by sortases. Most contain a recognizable D2-type fimbrial isopeptide formation domain (see TIGR04226), in which Lys-to-Asn isopeptide bond formation provides additional structural integrity to support adhesion despite shear. For proper members of this subfamily, lengths typically fall in the range of 460 to 640 amino acids. Many members of this family contribute to the virulence of certain Gram-positive pathogens, including SpaA, SpaD, and SpaH from Corynebacterium diphtheriae, and EbpB and EbpC from Enterococcus faecalis.	533
411465	NF033903	VaFE_rpt	VaFE repeat. The VaFE domain, about 121 amino acids long, typically occurs as a tandem repeat in sortase-anchored surface proteins of Gram-positive bacteria. A single protein may have from one to over fifteen VaFE domains. The domain is named for a particularly strong motif with a nearly invariant Phe-Glu residue pair. The function of the VaFE domain is unknown.	121
411466	NF033904	LlsX_fam	LlsX family protein. LlsX, as found in Listeria monocytogenes, is a small protein of unknown function, encoded in the island responsible for listeriolysin S biosynthesis and processing. Related proteins are found in additional Gram-positive lineages, such as Streptococcus sobrinus and Lactobacillus sp.	90
411467	NF033906	ExsE_fam	T3SS regulon translocated regulator ExsE family protein. ExsE, through protein-protein interaction, serves in a regulatory cascade that modulates the role of ExsA, a transcriptional activator of Pseudomonas aeruginosa's type III secretion system (T3SS) regulon. ExsE itself is a substrate for translocation (i.e. removal) by the T3SS system, providing feedback that modulates expression of secretion system genes. Homologs found in multiple species of Aeromonas and Photorhabdus may be functionally equivalent. Note that VP1702 from Vibrio parahaemolyticus, given the same gene symbol and ascribed an equivalent function, appears unrelated in sequence.	77
411468	NF033907	ExsE2_fam	T3SS regulon translocated regulator ExsE2. ExsE2, as described in two Vibrio species, is called functionally equivalent to ExsE from Pseudomonas aeruginosa, but appears to lack detectable sequence homology. In each model organism, a type III secretion system (T3SS) contains a DNA-binding transcriptional activator ExsA, an anti-activator ExsD that binds ExsA and inhibits its activity, and a secretion chaperone ExsC that can bind to and counteract ExsD. ExsE2 (in V. alginolyticus) or ExsE (in P. aeruginosa) can bind its chaperone ExsC until it exits the cell by successful T3SS translocation, providing fine tuning of T3SS gene expression through these interactions. We renamed the Vibrio family from ExsE to ExsE2 to help minimize confusion between these two very dissimilar families.	93
411469	NF033908	AcfA_fam_omp	AcfA family outer membrane beta-barrel protein. AcfA (accessory colonization factor A), as discussed in Vibrio cholerae, is a porin-like outer membrane beta-barrel protein. It is encoded in a locus with other proteins also termed accessory colonization factor, near the toxin-coregulated pilus genes, and its presence aids in intestinal colonization, but its molecular function is unknown. Members of the broader family, described by this HMM, are found in many species of Vibrio, Photobacterium, Aliivibrio, and related genera. The name AcfA is used also for a member of this family from Vibrio alginolyticus, that is no more than 40 percent identical in amino acid sequence, but it is unclear that all members of this family should be considered AcfA.	214
411470	NF033909	opacity_OapA	opacity-associated protein OapA. This family consists of full-length homologs to OapA, opacity-associated protein A as described in Haemophilus influenzae. OapA shares a C-terminal homology domain, called the OapA domain, with the Escherichia coli protein YtfB, which is now known to bind peptidoglycan through its OapA domain and to act as a cell division protein.	421
411471	NF033910	LWR_salt	LWR-salt protein. This family of uncharacterized proteins was assigned the name LWR-salt (pronounced "lower-salt") to mark a well-conserved motif LWR (part of the longer motif WxFFRDxLWRG) and its restriction to the Halobacteria, a branch of the Archaea.	118
411472	NF033911	botu_NTNH	non-toxic nonhemagglutinin NTNH. The botulinum neurotoxin (BoNT) is always encoded together with associated non-toxic proteins (ANTPs) that are co-produced with it and form a complex that protects the toxin. Often, NtnH (non-toxic nonhemagglutinin) is one of these ANTPs, with the ntnh gene lying immediately upstream of the bont gene.	1164
411473	NF033912	msc	mechanosensitive ion channel. Proteins of this subfamily are mechanosensitive channels, involved in numerous biological functions. Representative proteins of this subfamily are WP_092311603 and WP_068170239 (TCDB accession: 1.A.23.8.3 and 1.A.23.8.5, respectively).	365
411474	NF033913	fibronec_FbpA	LPXTG-anchored fibronectin-binding protein FbpA. FbpA, a fibronectin-binding protein described in Streptococcus pyogenes, has a YSIRK-type (crosswall-targeting) signal peptide and a C-terminal LPXTG motif for covalent attachment to the cell wall. It is unrelated to the PavA-like protein from Streptococcus gordonii (see BlastRule NBR009716) that was given the identical name, so the phrase LPXTG-anchored is added to the protein name for clarity.	386
411475	NF033914	antiphage_ZorA_1	anti-phage defense protein ZorA. Proteins of this subfamily are putative H+ channel proteins, but it has been reported that they are also involved in anti-phage defense.	619
411476	NF033915	antiphage_ZorA_2	anti-phage defense ZorAB system ZorA. Proteins of this subfamily are putative H+ channel proteins, but it has been reported that they are also involved in anti-phage defense.	383
411477	NF033916	antiphage_ZorA_3	anti-phage defense ZorAB system ZorA. Proteins of this subfamily are putative H+ channel proteins, but it has been reported that they are also involved in anti-phage defense.	509
411478	NF033917	antiphage_ZorA_4	anti-phage defense ZorAB system ZorA. Proteins of this subfamily are putative H+ channel proteins, but it has been reported that they are also involved in anti-phage defense.	417
411479	NF033918	LGIC_1	ligand-gated ion channel. Prokaryotic ligand-gated ion channels (LGICs) are a large group of transmembrane transporters, which might contribute to adaptation to pH change.	305
411480	NF033919	PA2779_fam	PA2779 family protein. Homologs of PA2779, an uncharacterized protein, average about 130 amino acids in length. The most distinctive feature is an extremely hydrophobic region at or near the C-terminus, consisting almost entirely of the bulkier hydrophobic residues Val, Ile, Leu, and Phe. The function is unknown.	123
411481	NF033920	C39_PA2778_fam	PA2778 family cysteine peptidase. Members of this family are MEROPS classification C39 family cysteine peptidases, a group that includes many processing enzymes for peptide natural products such as lantibiotics and other bacteriocins. This family, more specifically, includes PA2778 as found in Pseudomonas aeruginosa. All members of the defining seed alignment are encoded in the vicinity of a homolog of PA2779 (see HMM NF033919). Note that the C-terminal region consists largely of tetratricopeptide repeats (TPR), so classification using this HMM must be based on comparing the top domain score to the second gathering threshold (GA2).	255
411482	NF033921	por_somb	iron uptake porin. Proteins of this family have typical porin structures. It has been reported that Synechococcus outer membrane (Som) porins (SomA and SomB) are involved in iron uptake in cyanobacterium Synechococcus.	481
411483	NF033922	opr_porin_1	Opr family porin. Proteins hit by this HMM model are members of the Opr family porins, which are mainly found in Pseudomonas and other Gram-negative bacteria with different substrates.	328
411484	NF033923	opr_proin_2	Opr family porin. Proteins hit by this HMM model are members of the Opr family porins, which are mainly found in Pseudomonas and other Gram-negative bacteria with different substrates.	386
411485	NF033924	T3SS_LcrQ_reg	type III secretion system exported negative regulator LcrQ/YscM1. The type III secretion system (T3SS) protein called LcrQ in Yersinia pseudotuberculosis and YscM1 in Yersinia enterocolitica is a post-transcriptional regulator of T3SS effector gene expression. Successful chaperone-dependent export by the T3SS allows the translation of T3SS effector proteins to proceed.	110
411486	NF033925	pora_1	PorA family porin. This HMM hits Corynebacterial Porin A (PorA) family porins, which are short membrane proteins.	42
411487	NF033926	msp_porin	MSP porin. Members of this HMM are MSP porins (major outer sheath proteins) in Treponema. They may play a role in immune evasion and persistence.	512
411488	NF033927	alph_xenorhab_B	alpha-xenorhabdolysin family binary toxin subunit B. 	223
411489	NF033928	alph_xenorhab_A	alpha-xenorhabdolysin family binary toxin subunit A. Alpha-xenorhabdolysin was the founding member of a family of alpha-helical pore-forming binary toxins. YaxAB from Yersinia enterocolitica has been studied structurally. This HMM represents subunit A proteins such as XaxA and YaxA, capable of binding to the membrane even in the absence of the B subunit. This family is related to the Bacillus haemolytic enterotoxin family (see PF05791.9), although thresholds for this HMM are set to exclude that family.	340
411490	NF033930	pneumo_PspA	pneumococcal surface protein A. The pneumococcal surface proteins, found in Streptococcus pneumoniae, are repetitive, with patterns of localized high sequence identity across pairs of proteins given different specific names, such that recombination may be presumed. This protein, PspA, has an N-terminal region that lacks a cross-wall-targeting YSIRK type extended signal peptide, in contrast to the closely related choline-binding protein CbpA which has a similar C-terminus but a YSIRK-containing region at the N-terminus.	660
411491	NF033932	LapB_rpt_80	LapB C-terminal region repeat. This model describes a tandem repeat about 80 amino acids in length per repeat, found in at least 12 different surface-exposed proteins of the pathogen Listeria monocytogenes, and in particular found 10 times in tandem in the surface protein LapB, for which the repeat is named.	83
411492	NF033934	KCU-star	KCU-star family selenoprotein. This family is named KCU-star because nearly all member proteins end with tripeptide lysine-cysteine-selenocysteine, followed immediately by a stop codon (represented by an asterisk, or star). Members occur primarily in species of Helicobacter (although not Helicobacter pylori, in which selenocysteine incorporation capability has been lost) and Campylobacter. This small family belongs to the larger YbdD/YjiX (DUF466) family described by Pfam model PF04328.	57
411493	NF033935	inclusion_IncB	inclusion membrane protein IncB. When Chlamydia invades a cell, the host-derived membrane of the vacuole in which it resides becomes known as the inclusion membrane. The chlamydial type III secretion system (T3SS) delivers a number of effector proteins into the inclusion membrane, including this protein, IncB (inclusion membrane protein B). IncB proteins from different chlamydial species share a conserved hydrophobic C-terminal region, represented by this HMM, but their N-terminal regions vary considerably in length and sequence.	78
411494	NF033936	CuZnOut_SO0444	SO_0444 family Cu/Zn efflux transporter. Members of this family are apparent metal cation efflux transporters. Architectural features include an average length of about 400 residues, with well conserved and highly hydrophobic N-terminal and C-terminal domains. The central region is highly variable in length and sequence, and rich in both Cys and His residues, as often seen in proteins produced in response to toxic concentrations of certain metals. The founding member, SO_0444, was shown to confer resistance to high levels of Cu and Zn ions. The best conserved region of the protein is a CSCG motif in the N-terminal region, found at least twice in a selenocysteine-containing form, USCG.	336
411495	NF033937	porH_1	PorH family porin. Proteins of this HMM family form major outer membrane hetero-oligomeric pores on the cell wall of Corynebacterium with PorA family porins.	97
411496	NF033938	porH_2	PorH family porin. Proteins of this HMM family form major outer membrane hetero-oligomeric pores on the cell wall of Corynebacterium with PorA family porins.	63
411497	NF033939	DESULF_POR1	outer membrane homotrimeric porin. Proteins of this HMM family are primarily identified in sulfate-reducing Desulfovibrio, but this HMM may also hit proteins from other Gram-negative bacteria. Porins of this family form transmembrane pores for the passive transport of small molecules across the outer membranes of Gram-negative bacteria.	455
411498	NF033940	ErpA_rel	ErpA-related iron-sulfur cluster insertion protein. 	95
411499	NF033942	GjpA	outer membrane porin GjpA. GjpA was first identified in Gordonia jacobaea strain MV-1 as an outer membrane channel-forming protein. It has been reported that GjpA could be of relevance in the import and export of negatively charged molecules across the cell wall.	331
411500	NF033943	RTX_toxin	RTX family hemolysin. RTX family toxins are secreted from the bacteria and inserted into the membranes of infected cells, causing host cell rupture.	875
411501	NF033945	AcrIIA2_fam	AcrIIA2 family anti-CRISPR protein. Anti-CRISPR proteins are phage proteins that defeat CRISPR-Cas systems for immunity based on phage-derived spacers found in arrays between CRISPR repeats. The founding member of this family, AcrIIA2, works against a CRISPR-Cas class II system.	118
411502	NF033946	AcrIIA4_fam	AcrIIA4 family anti-CRISPR protein. AcrIIA4 is an anti-CRISPR protein that affects Cas9, a class II CRISPR system protein used in biotechnology applications for targeted genome editing.	86
411503	NF033947	PEP-cistern	cistern family PEP-CTERM protein. Members of this family are PEP-CTERM proteins, that is, surface proteins of Gram-negative organisms that carry a short C-terminal region used to help target proteins to their proper cellular location, hold them in position for post-translational modifications that might need to occur (such as glycosylation), and which is eventually removed by exosortase as the protein is ligated to something else. In this family the most conspicuous feature other than the PEP-CTERM sorting signal (with variants that include PEP, PAP, PTP, and SEP) is a pair of Cys residues about 6 amino acids apart from each other. The second Cys occurs in the middle of a run of amino acids that are all either small (Gly, Ser, Ala) or else Asn. The local context suggests the Cys occurs at a turn at the end of a structural feature such as alpha-helix or beta-strand, rather than in the middle of one. The word "cistern" was assigned to suggest the proposed Cys-turn feature.	198
411504	NF033948	AcrIIA3_fam	anti-CRISPR protein AcrIIA3. AcrIIA3, as found in siphophages infecting Listeria and Streptococcus, is an anti-CRISPR protein that prevents Cas9-containing CRISPR systems from protecting against phage infection.	119
411505	NF033949	Cas12b	type V CRISPR-associated protein Cas12b. 	1187
411506	NF033950	Cas12c	type V CRISPR-associated protein Cas12c. 	1246
411507	NF033951	Cas12d	type V CRISPR-associated protein Cas12d. 	1118
411508	NF033952	AcrID1_fam	AcrID1 family anti-CRISPR protein. The AcrID1 family of anti-CRISPR proteins occurs in viruses infecting the Archaea, primarily Sulfolobus. It targets and inactivates type I-D CRISPR-Cas systems.	93
411509	NF033953	AcrF10_fam	AcrF10 family anti-CRISPR protein. Members of the AcrF10 family of anti-CRISPR proteins have been found in phage from various Vibrio, Shewanella, and their relatives. AcrF10 is considered a DNA mimic protein.	94
411510	NF035921	staph_coagu	staphylocoagulase. Past residue 485, the protein consists of a variable number of 27-amino acid tandem repeats (see PF04022). This HMM omits coverage of a portion of the repetitive C-terminal region.	520
411511	NF035922	Trp_DH_ScyB	tryptophan dehydrogenase ScyB. The tryptophan dehydrogenase (EC 1.4.1.19) ScyB performs a reversible NAD(+)-dependent deamination of L-Trp to 3-indolepyruvate. ScyB occurs in Cyanobacteria that biosynthesize scytonemin, a natural sunscreen, from tryptophan.	346
411512	NF035923	TPP_ScyA	scytonemin biosynthesis protein ScyA. This HMM distinguishes ScyA itself, found within the ScyABCDEF operon, from closely related paralogous TPP-binding ScyA-related proteins encoded outside the operon.	623
411513	NF035924	scytonem_ScyC	scytonemin biosynthesis cyclase/decarboxylase ScyC. Of the various markers of scytonemin biosynthesis, ScyC appears to be the clearest, as there are few or no close homologs from outside of the set of confidently predicted scytonemin producer bacteria.	312
411514	NF035925	Geo26A_fam	geobacillin-26 family protein. 	197
411515	NF035926	scyF_NHL_VPEP	scytonemin biosynthesis PEP-CTERM protein ScyF. ScyF has multiple tandem copies of the NHL repeat, making it a beta-propeller protein. It has an N-terminal signal peptide, and a C-terminal PEP-CTERM domain for recognition and cleavage by exosortase, suggesting that ScyF is sorted to an extracellular location, perhaps the outer leaflet of the outer membrane, and attached there covalently.	376
411516	NF035927	TPP_ScyA_rel	ScyA-related TPP-binding enzyme. Members of this family are not ScyA itself, which is described in TPP_ScyA (NF035923), but instead are closely related paralogs that are likewise restricted to the Cyanobacteria. Nostoc punctiforme, for example, has ScyA itself, but also has three members of this family, all more closely related to each other than to ScyA. By homology, members of this family are expected to be thiamine pyrophosphate-binding enzymes.	551
411517	NF035928	holin_1	bacteriophage holin. Holins form transmembrane pores for releasing endolysins that hydrolyze the cell wall and induce cell death.	117
411518	NF035929	lectin_1	lectin. Lectins are important adhesin proteins, which bind carbohydrate structures on the host cell surface. The carbohydrate specificity of diverse lectins to a large extent dictates bacterial tissue tropism by mediating specific attachment to unique host sites expressing the corresponding carbohydrate receptor.	837
411519	NF035930	lectin_2	lectin. Lectins are important adhesin proteins, which bind carbohydrate structures on the host cell surface. The carbohydrate specificity of diverse lectins to a large extent dictates bacterial tissue tropism by mediating specific attachment to unique host sites expressing the corresponding carbohydrate receptor.	238
411520	NF035931	lectin_3	lectin. Lectins are important adhesin proteins, which bind carbohydrate structures on the host cell surface. The carbohydrate specificity of diverse lectins to a large extent dictates bacterial tissue tropism by mediating specific attachment to unique host sites expressing the corresponding carbohydrate receptor.	226
411521	NF035932	lectin_4	lectin. Lectins are important adhesin proteins, which bind carbohydrate structures on the host cell surface. The carbohydrate specificity of diverse lectins to a large extent dictates bacterial tissue tropism by mediating specific attachment to unique host sites expressing the corresponding carbohydrate receptor.	318
411522	NF035933	ESAT6_1	pore-forming ESAT-6 family protein. 	107
411523	NF035934	ESAT6_2	pore-forming ESAT-6 family protein. 	96
411524	NF035935	ESAT6_3	pore-forming ESAT-6 family protein. 	98
411525	NF035936	agg_sub_LPXTH	serine-rich aggregation substance UasX. Members of this protein family are repetitive, serine-rich surface proteins of the Firmicutes, found primarily in the genus Leuconostoc. The variant form of sortase signal, LPXTH, is replaced by LPXTG in members from some lineages, such as Weissella oryzae, and therefore recognizable. Some members of this family have the KxYKxGKxW type signal peptide as seen in the glycoprotein adhesin GspB, a substrate of the accessory Sec system for secretion. WOSG25_050600 from Weissella oryzae SG25 is identified in a publication as an unnamed aggregation substance, a conclusion supported by the sorting signals and composition reported here. We assign the gene symbol uasX (unnamed aggregation substance X) based on our evaluation of the family.	281
411526	NF035937	EboA_family	EboA family metabolite traffic protein. This HMM describes a narrow, cyanobacterial-only clade of members of the EboA (eustigmatophyte/bacterial operon A) family. Members of this family appear required for transport of certain secondary metabolite precursors to the periplasm, including (but not limited to) precursors of scytonemin. More than half the members of this clade belong to scytonemin producers.	219
411527	NF035938	EboA_domain	EboA domain-containing protein. EboA (eustigmatophyte/bacterial operon A) belongs to a broadly distributed system of six proteins. The ebo system in general appears to be involved in trafficking certain natural product precursors whose biosynthesis requires export at least as far as the periplasm. Scytonemin is an example of one natural product whose biosynthesis requires an Ebo system.	141
411528	NF035939	TIM_EboE	metabolite traffic protein EboE. EboE (eustigmatophyte/bacterial operon E) belongs to the TIM barrel fold superfamily by homology, as shown by Pfam model PF01261. Although its exact function is unknown, EboE is encoded as part of a widely distributed system that appears to allow export of precursor metabolites of various secondary metabolites so that biosynthesis can be completed outside of the cytosol.	375
411529	NF035940	prenyl_rel_EboC	UbiA-like protein EboC. 	292
411530	NF035941	GBS_alph_likeN	alpha-like surface protein N-terminal domain. Most Group B Streptococcus (GBS) have a member of a mosaic family of repetitive surface proteins. The founding member is the alpha C protein (bca), but other named members with complete sequences include Alp2, Alp3, and Rib. This HMM describes the shared, non-repetitive N-terminal region, including a YSIRK-like signal peptide region.	225
411531	NF035942	T3SS_eff_HopBF1	T3SS effector protein kinase HopBF1. HopBF1, found in plant pathogens such as Pseudomonas syringae and in the human pathogen Ewingella americana, is a type III secretion system effector that acts as a protein kinase. It phosphorylates the eukaryotic chaperone HSP90 on a serine residue, inhibiting its ATPase activity. The inhibition interferes with the proper folding of client proteins of HSP90 that are important to resistance to bacterial infection.	200
411532	NF035943	exosort_XrtV	exosortase V. 	262
411533	NF035944	PEPxxWA-CTERM	PEPxxWA-CTERM sorting domain. This variant form of PEP-CTERM sorting signal shows unusually strong conservation across much of the hydrophobic transmembrane segment, including an unusual Trp (W) residue at position 6 of the seed alignment. The Trp is replaced by Tyr (Y) in a number of proteins hit by the HMM. The top-scoring members of this family tend to occur in Sphingomonas and related genera, encoded by genomes that also encode the sorting enzyme XrtV (exosortase V).	24
411534	NF035945	Zn_serralysin	serralysin family metalloprotease. 	457
411535	NF035950	RumC_sactiRiPP	RumC family sactipeptide. The founding member of this family of radical SAM/SPASM-modified sactipeptide bacteriocins is ruminococcin C1, AEC03333.1 from Ruminococcus gnavus. This bacteriocin, from a human gut bacterium, is interesting because of its anti-clostridial activity. Known homologs to RumC1 average about 65 amino acids in length and have four invariant Cys residues.	63
411536	NF035951	rSAM_RumMC	RumMC family radical SAM sactipeptide maturase. The RumMC family of radical SAM/SPASM domain peptide modification enzymes creates sulfur-to-alpha-carbon (sactipeptide) linkages in RiPP peptide natural products in the family of ruminococcin C.	510
411537	NF035952	WxPxxD_TM	WxPxxD family membrane protein. This uncommon, extremely hydrophobic protein of about 240 amino acids occurs sporadically in members of the Firmicutes, including the Listeria, Bacillus, Anoxybacillus, and Terribacillus genera. The protein is named for its most distinctive motif. The function is unknown, but the size and hydrophobicity suggest a transport-related function.	234
411538	NF035953	integrity_Cei	envelope integrity protein Cei. Cei (cell envelope integrity), as described for the founding member Rv2700 from Mycobacterium tuberculosis, is a transmembrane protein with an extracellular LytR_C domain. It lacks any DNA-binding domain and is not a transcriptional regulator. It shares homology to C-terminal regions present in some members of the LytR-CpsA-Psr family, a family in which some characterized members transfer teichoic acids from carriers to mature peptidoglycan.	211
411539	NF035954	ocin_CA_C0660	CA_C0660 family putative sactipeptide bacteriocin. Members of this family are Cys-rich peptides about 65 amino acids in length, regularly found in the vicinity of the radical SAM/SPASM domain enzymes. The most closely related such radical SAM enzyme is the RiPP modification enzyme that introduces sulfur-to-alpha-carbon peptide (sactipeptide) modification in ruminococcin C, whose precursor is very similar in length and in Cys content and arrangement.	64
411540	NF037932	ocin_sys_WGxF	bacteriocin-like WGxF protein. Members of this protein family of hydrophobic proteins about 60 amino acids long are found in various members of the Firmicutes in three-gene contexts that suggest a role as a bacteriocin or an immunity protein. The protein is named for its most striking sequence feature, a nearly invariant WGxF motif. The two conserved neighboring families are TIGR01654-like (e.g. WP_149116529.1) and TIGR03608-like (e.g. WP_149116530.1).	59
411541	NF037933	EpaQ_fam	EpaQ family protein. EpaQ, as described in the Gram-positive bacterium Enterococcus faecalis, is encoded with the enterococcal polysaccharide antigen (epa) operon. It is distantly related to some O-antigen ligases of Gram-negative bacteria, and may have a similar molecular function. EpaQ contributes to biofilm formation and resistance to certain antibiotics.	377
411542	NF037934	holdfast_HfaA	holdfast anchoring protein HfaA. 	115
411543	NF037935	holdfast_HfaB	holdfast anchoring protein HfaB. HfaB, part of the holdfast anchoring complex of Caulobacter and related bacteria, is a homolog of the outer membrane protein CsgG of curli biogenesis.	264
411544	NF037936	holdfast_HfaD	holdfast anchor protein HfaD. 	373
411545	NF037937	septum_RefZ	forespore capture DNA-binding protein RefZ. RefZ (regulator of FtsZ), a DNA-binding protein in the family of TetR/AcrR family transcriptional regulators, participates in septum placement and in chromosome capture during the asymmetrical cell division in endospore formation. The five nearly palindromic DNA motifs (RBMs) to which RefZ binds affect chromosomal localization, not transcription, so RefZ is not considered a transcription factor.	195
411546	NF037938	Myr_Ysa_major	MyfA/PsaA family fimbrial adhesin. The related adhesins MyfA and PsaA are fimbrial major subunits of Myf fimbriae from Yersinia enterocolitica, and Psa (also known as pH6 antigen) fimbriae from Y. pestis.	158
411547	NF037940	PKS_MbtD	mycobactin polyketide synthase MbtD. 	973
411548	NF037941	PKS_NbtC	nocobactin polyketide synthase NbtC. 	1026
411549	NF037942	ac_ACP_DH_MbtN	mycobactin biosynthesis acyl-ACP dehydrogenase MbtN. MbtN belongs to a family of dehydrogenases that in most cases act on acyl groups carried on CoA. However, MbtN appears to act on an acyl group carried instead on the mycobactin biosynthesis acyl carrier protein MbtL.	376
411550	NF037944	holin_2	bacteriophage holin. Proteins of this family are homologs of the mycobacterial phage holin Gp29. They can cause host cell lysis to release progeny phage particles.	64
411551	NF037945	holin_3	bacteriophage holin. Bacteriophage holin can cause host cell lysis to release progeny phage particles. Proteins of this family have about 60 amino acids, and a transmembrane domain is usually found at the C-terminus.	37
411552	NF037946	terminal_TopJ	terminal organelle assembly protein TopJ. 	440
411553	NF037947	holin_4	bacteriophage holin. Bacteriophage holin can cause host cell lysis to release progeny phage particles. Proteins of this family usually have two transmembrane domains.	75
411554	NF037948	signal_int_SinM	signal integration modulator SinM. SinM (signal integration modulator) is a regulatory partner of the hybrid histidine kinase/response regulator SinK (signal integrating kinase).	387
411555	NF037949	holin_5	bacteriophage holin. Proteins of this family can cause host cell lysis to release progeny phage particles.	64
411556	NF037950	spanin2_1	bacteriophage spanin2 family protein. A number of bacteriophage proteins cause lysis of host cells. Holins and endolysins induce the disruption of inner membrane by degrading peptidoglycans. Then, spanin2 family proteins are involved in the final step in host cell lysis by disrupting the outer membrane.	130
411557	NF037951	spanin2_2	bacteriophage spanin2 family protein. A number of bacteriophage proteins cause lysis of host cells. Holins and endolysins induce the disruption of inner membrane by degrading peptidoglycans. Then, spanin2 family proteins are involved in the final step in host cell lysis by disrupting the outer membrane.	107
411558	NF037952	spanin2_3	bacteriophage spanin2 family protein. A number of bacteriophage proteins cause lysis of host cells. Holins and endolysins induce the disruption of inner membrane by degrading peptidoglycans. Then, spanin2 family proteins are involved in the final step in host cell lysis by disrupting the outer membrane.	98
411559	NF037953	frad	septal junction protein FraD. Proteins of this family are components of cyanobacterial septal junctions (microplasmodesmata) in heterocyst-forming cyanobacteria.	305
411560	NF037954	het_cyst_PatD	heterocyst frequency control protein PatD. 	116
411561	NF037955	mfs	MFS transporter. 	366
411562	NF037957	freyrasin_like	freyrasin family ranthipeptide. A ranthipeptide is a ribosomally synthesized and post-translationally modified peptide (RiPP) whose linkages from cysteine sulfur atoms are to something besides the alpha carbons of other amino acids. Thus, ranthipeptides differ from sactipeptides (sulfur-to-alpha carbon RiPP peptides). The founding member of this family is freyrasin itself, the PapA protein from Paenibacillus polymyxa (see SUA72395.1). All members so far are encoded next to radical SAM peptide maturases.	49
411563	NF037958	QH_gamma	quinohemoprotein amine dehydrogenase subunit gamma. 	100
411564	NF037959	MFS_SpdSyn	fused MFS/spermidine synthase. Proteins of this family are fusions of an N-terminal MFS (Major Facilitator Superfamily) transporter domain and a C-terminal spermidine synthase (SpdSyn)-like domain. The encoding genes are usually near the genes encoding S-adenosylmethionine decarboxylase (AdoMetDC) on many bacterial genomes. It has been shown in Shewanella oneidensis that the fused protein aminopropylates a substrate other than putrescine, and has a role outside of polyamine biosynthesis.	480
411565	NF037960	MFS_trans	MFS transporter. 	382
411566	NF037961	RodA_shape	rod shape-determining protein RodA. Proteins of this family are members of the FtsW/RodA/SpoVE superfamily. It has been reported that RodA proteins play important roles in maintaining cell shape and antibiotic resistance in bacteria.	415
411567	NF037962	arsenic_eff	arsenic efflux protein. Most proteins of this family have 8 transmembrane domains with two 4 transmembrane halves separated by a hydrophilic loop of variable sizes. It has been reported that some proteins of this family are involved in arsenate/arsenite resistance.	286
411568	NF037963	heterocyst_HetZ	heterocyst differentiation protein HetZ. The HMM distinguishes HetZ itself, a heterocyst differentiation protein, from a closely related paralog.	366
411569	NF037964	HetZ_related	HetZ-related protein. Members of this cyanobacterial protein family are paralogs of the heterocyst differentiation protein HetZ and occur in largely the same set of genomes.	370
411570	NF037965	HetZ_rel_2	HetZ-related protein 2. Members of this family are cyanobacterial proteins distantly related to heterocyst differentiation protein HetZ, which also has a much more closely related set of paralogs in heterocyst-forming species.	367
411571	NF037966	HetP_family	HetP family heterocyst commitment protein. HetP and its paralogs occur in heterocyst-forming members of the Cyanobacteria, and play a role in commitment to development into heterocysts, which specialize in nitrogen fixation rather than photosynthesis.	63
411572	NF037967	SemiSWEET_1	SemiSWEET transporter. The SWEET (Sugars Will Eventually be Exported Transporter) is a superfamily of sugar transporters found in both eukaryotes and prokaryotes. Eukaryotic SWEETs usually have seven transmembrane helices (TMHs), but most prokaryotic SWEETs (SemiSWEETs) have only three TMHs. Proteins of this family have 7 TMHs.	196
411573	NF037968	SemiSWEET_2	SemiSWEET transporter. The SWEET (Sugars Will Eventually be Exported Transporter) is a superfamily of sugar transporters found in both eukaryotes and prokaryotes. Eukaryotic SWEETs usually have seven transmembrane helices (TMHs), but most prokaryotic SWEETs (SemiSWEETs) have only three TMHs. Proteins of this family have 3 TMHs.	76
411574	NF037969	SemiSWEET_3	SemiSWEET transporter. The SWEET (Sugars Will Eventually be Exported Transporter) is a superfamily of sugar transporters found in both eukaryotes and prokaryotes. Eukaryotic SWEETs usually have seven transmembrane helices (TMHs), but most prokaryotic SWEETs (SemiSWEETs) have only three TMHs. Proteins of this family have 3 TMHs.	97
411575	NF037970	vanZ_1	VanZ family protein. VanZ was originally identified in Enterococcus faecium. VanZ increases teicoplanin resistance in Enterococcus faecium, but has no impact on vancomycin resistance. Proteins of this family are homologs of the VanZ protein. They may be involved in teicoplanin resistance.	107
411576	NF037971	lipo_BcpO	CDI system lipoprotein BcpO. BcpO is a small lipoprotein, about 74 amino acids long on average, encoded in Burkholderia bcpAIOB locus systems for two-partner secretion and contact dependent growth inhibition.	66
411577	NF037972	HuaA_fam_RiPP	huazacin family RiPP peptide. 	40
411578	NF037974	SslE_AcfD_Zn_LP	SslE/AcfD family lipoprotein zinc metalloprotease. Members of this family are surface lipoprotein zinc metalloproteases, from the family that includes accessory colonization factor AcfD from Vibrio cholerae, SslE (YghJ) from E. coli (Secreted and Surface-associated Lipoprotein from E. coli), and VPA1376 from Vibrio parahaemolyticus. Each is about 1500 amino acids long, and SslE is a known substrate of a type II secretion system (T2SS). SslE is known to have mucinase activity.	1389
411579	NF037975	pilot_rel_YacC	YacC family pilotin-like protein. Members of this family, including YacC from Escherichia coli K-12, resemble the lipoprotein GspS of type II secretion systems (T2SS), but in general are not lipoproteins. In E. coli K-12, where the T2SS is cryptic (not expressed, but able to function after manipulation to force its expression), YacC is encoded far from the locus where the main set of T2SS genes is found, and it is not clear that YacC is a true GspS.	115
411580	NF037976	gtrA_1	GtrA family protein. The GtrA family proteins have 3-4 transmembrane domains. They are involved in translocation of lipid-linked glucose across the cytoplasmic membrane.	126
411581	NF037977	Lpg0189_fam	Lpg0189 family type II secretion system effector. 	279
411582	NF037978	T2SS_GspB	type II secretion system assembly factor GspB. GspB (general secretory pathway B) occurs in type II secretion systems (T2SS) and is viewed as an accessory protein, a factor involved in the assembly process rather than integral to the completed T2SS apparatus.	158
411583	NF037979	Na_transp	sodium-dependent transporter. 	417
411584	NF037980	T2SS_GspK	type II secretion system minor pseudopilin GspK. 	311
411585	NF037981	NCS2_1	purine/pyrimidine permease. Proteins of this family usually have 14 transmembrane domains. They belong to the NCS2 superfamily of transporters. They are specific purine and/or pyrimidine permeases.	419
411586	NF037982	Nramp_1	Nramp family divalent metal transporter. Nramp (natural resistance-associated macrophage protein) family divalent metal transporters are widely conserved divalent metal transporters, which enable manganese import in bacteria and dietary iron uptake in mammals.	404
411587	NF037993	cyano_chori_ly	cyanobacterial-type chorismate lyase. This variant form of chorismate lyase is widespread in the cyanobacteria, including founding example sll1797 from Synechocystis sp. PCC6803. Previously, members of this family were named DUF98 domain-containing protein, as found by Pfam model PF01947. The product, 4-hydroxybenzoate, is next prenylated by slr0926, during plastoquinone biosynthesis.	176
411588	NF037994	DcuC_1	C4-dicarboxylate transporter DcuC. Proteins of this family usually have 11-12 transmembrane regions. They transport C4-dicarboxylates under anaerobic conditions, which plays an important role in anaerobic energy metabolism.	414
411589	NF037995	TRAP_S1	TRAP transporter substrate-binding protein DctP. Proteins of this family are members of the superfamily of Tripartite ATP-independent Periplasmic Transporter (TRAP-T). They transport hydrophobic substrates, usually lipoprotein.	271
411590	NF037996	B-4DMT	B-4DMT family transporter. Proteins of this family usually have four transmembrane regions. They are classified as a new transporter family (9.B.148) by TCDB.	139
411591	NF037997	Na_Pi_symport	Na/Pi symporter. Proteins of this family belong to the Phosphate:Na+ Symporter (PNaS) superfamily.	294
411592	NF037998	RND_1	protein translocase. 	1237
411593	NF037999	mutacin	lantibiotic mutacin. Mutacins are lantibiotics in the epidermin/gallidermin/nisin family, found in the biofilm-forming dental caries pathogen Streptococcus mutans. Named members of the family include mutacin I and mutacin 1140. This HMM separates the mutacins (MutA) from paralog MutA' encoded nearby, which lacks mutacin activity.	63
411594	NF038000	mutacin_prime	mutacin-like lantipeptide. MutA', a paralog of the nisin-like lantibiotic mutacin precursor MutA, is a lantipeptide of unknown function, encoded in Streptococcus mutans within the same region. It is not required for mutacin function.	64
411595	NF038001	HYExAFE	HYExAFE family protein. This uncharacterized protein is named for its best conserved region, the motif HY[ED]xAFE, found near the N-terminus. It appears to be limited in taxonomic range to members of the Planctomycetes.	164
411596	NF038002	bifunc_CbiS	bifunctional adenosylcobinamide hydrolase/alpha-ribazole phosphatase CbiS. 	333
411597	NF038004	darobactin_RiPP	darobactin family peptide antibiotic. Darobactin, discovered in the genus Photorhabdus, is a peptide antibiotic, made from a ribosomally translated precursor, and modified by the radical SAM/SPASM peptide maturase DarE.  Darobactin A is the founding member of a new class of antibiotic that appears to target BamA, a component of the outer membrane beta-barrel assembly machine. It is seven amino acids long, Trp1-Asn2-Trp3-Ser4-Lys5-Ser6-Phe7, with two crosslinks, one from Trp1 to Trp3, the other from Trp3 to Lys5. Homologs of the darobactin A precursor are encoded in various strains of Photorhabdus, Yersinia, and Vibrio.	45
411598	NF038005	rSAM_mat_DarE	darobactin maturation radical SAM/SPASM protein DarE. The radical SAM/SPASM protein DarE is a maturase for the ribosomally translated, post-translationally modified peptide natural product (RiPP) darobactin, including forms A, B, C, D, and E. The mature form is just seven amino acids long, with two cross-links, a Trp1-Trp3 linkage and a Trp3-Lys5 (or Arg5) linkage.	432
411599	NF038006	NhaD_1	sodium:proton antiporter NhaD. Proteins of this family usually have 10-13 transmembrane regions. They extrude Na+ or Li+ in exchange for H+. They have been identified and characterized in a number of bacterial species.	405
411600	NF038007	ABC_ATP_DarD	darobactin export ABC transporter ATP-binding protein. 	218
411601	NF038008	ABC_perm_DarB	darobactin export ABC transporter permease subunit. 	777
411602	NF038009	TatB_1	twin-arginine translocation system-associated protein. This family is suggested by TCDB to be part of the twin-arginine translocation (TAT) system, but lacks detectable sequence similarity to subunits such as TatB.	70
411603	NF038010	ABC_adapt_DarC	darobactin export ABC transporter periplasmic adaptor subunit. 	417
411604	NF038011	PelF	GT4 family glycosyltransferase PelF. Proteins of this family are components of the exopolysaccharide Pel transporter. It has been reported that PelF is a soluble glycosyltransferase that uses UDP-glucose as the substrate for the synthesis of exopolysaccharide Pel, whereas PelG is a Wzx-like and PST family exopolysaccharide transporter.	489
411605	NF038012	DMT_1	DMT family transporter. Proteins of this family belong to the drug/metabolite transporter (DMT) superfamily.	276
411606	NF038013	AceTr_1	acetate uptake transporter. Proteins of this family are acetate transporters, which usually have 6 transmembrane regions. The homologue in E. coli is YaaH.	176
411607	NF038014	Chlamy_inclu_1	inclusion-associated protein. Proteins of this family are inclusion-associated proteins in Chlamydia. It has been shown that protein CPj0783, which is identical to the HMM seed protein WP_010892266, is localized to the chlamydial inclusion. CPj0783 interacted with host Huntingtin-protein14, which may play an important role in disturbing the vesicle transport system to escape host lysosomal or autophagosomal degradation.	236
411608	NF038015	AztD	metallochaperone AztD. Proteins of this family are components of the AztABCD zinc-uptake system. AztC, AztB, and AztA are the extracellular solute-binding protein (SBP), permease, and ATP-binding protein, respectively. AztD is a zinc chaperone to AztC, and it may store zinc in the periplasm for transfer through the AztABCD transporter system.	374
411609	NF038016	sporang_Gsm	sporangiospore maturation cell wall hydrolase GsmA. The peptidoglycan-hydrolyzing enzyme GsmA occurs in some sporangia-forming members of the Actinobacteria, such as Actinoplanes missouriensis, and is required for proper separation of spores. GsmA proteins have one or two SH3 domains N-terminal to the hydrolase domain.	312
411610	NF038017	ABC_perm1	ABC transporter permease. Proteins of this family are the permease subunit of an ABC transporter complex, which may be involved in tungstate uptake.	184
411611	NF038018	qmoC	quinone-interacting membrane-bound oxidoreductase complex subunit QmoC. Proteins of this family are the transmembrane subunit of the quinone-interacting membrane-bound oxidoreductase complex, which consists of the QmoA, QmoB, and QmoC proteins. It has been reported that the QmoABC complex is essential for efficiently delivering electrons to adenosine 5'-phosphosulfate reductase AprAB, which is important in sulfate reduction in sulfate-reducing prokaryotes (SRPs).	363
411612	NF038019	PE_process_PecA	PecA family PE domain-processing aspartic protease. PecA from Mycobacterium marinum, and by homology, three related paralogs from Mycobacterium tuberculosis (PE26, PE_PGRS35, and PE_PGRS16) are all PE domain-containing proteins secreted by a type VII secretion system (T7SS, also called ESX in Mycobacterium), and all share a C-terminal aspartic protease-like domain. PecA itself is now known to be a functional aspartic protease that cleaves within the PE domain of T7SS secretion substrates that have the domain, including itself. Members of this family typically contain a long, variable, low-complexity region. This HMM represents the aspartic protease region C-terminal to the low-complexity region.	278
411613	NF038020	HeR	heliorhodopsin HeR. This HMM represents heliorhodopsins, a group of phylogenetically distinct microbial rhodopsins, which play an important role in absorbing and transferring light energy for numerous biological processes in bacteria. Heliorhodopsin was initially identified and characterized in a Gram-positive actinobacterium based on functional metagenomics and photochemical approaches. Heliorhodopsins have seven transmembrane domains and exhibit biological functions similar to those of other microbial rhodopsins. However, heliorhodopsins form a distinct cluster based on phylogenetic analyses. Most microbial rhodopsins are hit by the Pfam HMM PF01036, which does not hit heliorhodopsins.	244
411614	NF038021	mannan_LmeA	mannan chain length control protein LmeA. 	264
411615	NF038022	PorACj_fam	PorACj family cell wall channel-forming small protein. Members of this unusual protein family are small (often 40 amino acids or shorter), variable, and detected so far only in the genus Corynebacterium. Despite its small size, the founding member is reported to form homooligomeric channels in the cell wall (not the plasma membrane). This family, as built, may also include PorA subunits of PorA/PorH heterooligomeric cell wall channels.	32
411616	NF038023	S_layer_PS2	S-layer protein PS2. 	499
411617	NF038024	CRR6_slr1097	CRR6 family NdhI maturation factor. The protein Slr1097 and its functionally equivalent cyanobacterial homologs are required for proper maturation of NdhI, a subunit of NADPH dehydrogenase complexes, so that NDH-1 complexes can assemble properly. The related protein in the model plant species Arabidopsis thaliana is known as CRR6 (chlororespiratory reduction 6).	151
411618	NF038025	dapto_LiaX	daptomycin-sensing surface protein LiaX. LiaX (lipid-II-interacting antibiotics X), as described in Enterococcus faecalis, is expressed under control of the LiaR response regulator, and is involved in the process of resistance to daptomycin and to antimicrobial peptides of the innate immune response.	513
411619	NF038026	RsaX20_sORF	putative metal homeostasis protein. Members of this family average just 38 amino acids in length, but are widely conserved among species of the Gram-positive genera that include Staphylococcus, Enterococcus, Leuconostoc, and Lactobacillus. Expression of an RNA designated RsaX20, which encodes a member of this family, was studied in a Staphylococcus aureus of possible structural RNAs, and shown to be controlled by Fur-like transcription factor Zur. The broad conservation of the coding region across multiple species and protein-like pattern of amino acid substitutions in multiple sequence alignments strongly suggest that members of this family are indeed translated into functional proteins.	32
411620	NF038027	TssQ_fam	TssQ family T6SS-associated lipoprotein. The founding member of this protein family, TssQ (BB0812) from Bordetella bronchiseptica, is a lipoprotein, and possibly all other family members are as well. TssQ is encoded within a T6SS locus but its function remains unknown.	100
411621	NF038028	spiralin_repeat	spiralin repeat. The spiralin repeat is a domain that appears once in spiralin (the major lipoprotein of Spiroplasma species) and up to six times in related proteins.	88
411622	NF038029	LP_plasma	lipoprotein signal peptide. This HMM describes one of several homology families of mutually closely related lipoprotein signal peptides that are found in the Mycoplasmas and closely related genera (Mesoplasma, Spiroplasma), but absent outside that taxonomic group. Member proteins include spiralin, the most abundant lipoprotein in Spiroplasma.	24
411623	NF038030	spiralin_LP	spiralin lipoprotein. Spiralin is the major lipoprotein in multiple species of Spiroplasma, a relative of the Mycoplasmas.	239
411624	NF038031	PavB_Nterm	PavB family adhesin N-terminal domain. This HMM describes the portion of PavB from Streptococcus pneumoniae, and closely related proteins from Streptococcus mitis and Streptococcus pseudopneumoniae, N-terminal to the repetitive region with variable numbers of SSURE (Streptococcal Surface REpeats) regions (see PF11966), which bind fibronectin. The PavB region is notable, in part, for its rare variant, WSIRR, of the YSIRK motif signal peptide. Full-length versions of proteins from this family have a C-terminal LPXTG-containing region for sortase-mediated anchoring to the cell wall.	128
411625	NF038032	CehA_McbA_metalo	CehA/McbA family metallohydrolase domain. This domain, a branch of the PHP superfamily, is found in several partially characterized metallohydrolases, including McbA and CehA. Both were studied as hydrolases of carbaryl, a xenobiotic compound that does not contain a phosphate group, suggesting that presuming members of this family to be phosphoesterases (like many PHP domain-containing proteins) may be incorrect.	315
411626	NF038033	FEA1_rel_lipo	FEA1-related lipoprotein. 	406
411627	NF038034	lactGbeta_entB	lactococcin G-beta/enterocin 1071B family bacteriocin. This HMM was built to improve on Pfam model PF11632, which in version PF11632.8 had a two-member seed. It includes 12 residues of leader peptide and GlyGly cleavage motif (see TIGR01847), and has a shorter but more broadly conserved core peptide region. Characterized member proteins include lactococcin G and enterocin 1071B.	38
411628	NF038035	lactGalph_entA	lactococcin G-alpha/enterocin 1071A family bacteriocin. 	34
411629	NF038036	TCP11_Legionella	TCP11-related protein. 	927
411630	NF038037	cytob_DsrM	sulfate reduction electron transfer complex DsrMKJOP subunit DsrM. Proteins of this family are the DsrM subunit of the DsrMKJOP complex, which is a membrane-bound redox complex involved in sulfate reduction in Sulfate-reducing organisms (SROs). The dsrM gene encodes a cytochrome b reductase, which usually has six transmembrane helices and five conserved histidines.	320
411631	NF038038	cytoc_DsrJ	sulfate reduction electron transfer complex DsrMKJOP subunit DsrJ. Proteins of this family are the DsrJ subunit of the DsrMKJOP complex, which is a membrane-bound redox complex involved in sulfate reduction in Sulfate-reducing organisms (SROs). The dsrJ gene encodes a triheme periplasmic cytochrome c subunit, which contains three conserved heme c-binding sites (CXXCH) at the C-terminus.	119
411632	NF038039	WGxxGxxG-CTERM	WGxxGxxG-CTERM domain. This domain, a possible protein-sorting signal, begins with the motif that usually takes the form DWGW, followed by a hydrophobic (and probably membrane-spanning) stretch xGxxGxxGxxG (where x is hydrophobic), and then a short patch of mostly basic residues at the C-terminus. This domain has a broad taxonomic range that includes Firmicutes, Cyanobacteria, and Deinococcus. Members from the Firmicutes tend to occur together with a DUF3231 (PF11553) family protein.	19
411633	NF038040	phero_PhrK_fam	PhrK family phosphatase-inhibitory pheromone. 	40
411634	NF038041	fim_Mfa1_fam	Mfa1 family fimbria major subunit. 	483
411635	NF038042	actinodefensin	actinodefensin. The actinodefensin family is named (here) as an Actinomyces-specific branch of the (otherwise eukaryotic) arthropod defensin family described by Pfam model PF01097.	70
411636	NF038043	act_def_assoc_A	actinodefensin-associated protein A. Actinodefensin (see family NF038042) is a bacterial branch of the arthropod defensin family. Members of that family occur in the Actinomyces lineage, have a distinctive N-terminal region that may reflect how processing and transport occur, and are found in a conserved gene neighborhood. Actinodefensin-associated protein A is found exclusively in these conserved gene neighborhoods.	379
411637	NF038044	act_def_assoc_B	actinodefensin-associated protein B. Members of this family are small proteins, averaging about 70 amino acids in length, restricted to the Actinomycetes. Member proteins typically occur in the vicinity of actinodefensin, which represents an Actinomycetes-restricted branch of the arthropod defensin family. The function of this protein is unknown.	63
411638	NF038045	GEF_RalF	T4SS guanine nucleotide exchange effector RalF. 	341
411639	NF038047	not_Tcp10	AAWKG family protein. Members of this family are found primarily in Streptomyces. The family is notable in part because a region outside of the N-terminal region modeled here contains 9-residue repeats that resemble the 18-residue repeats found by PF07202 in the C-terminal region of eukaryotic T-complex protein 10. The family is uncharacterized. This model was constructed, and named for the family's most prominent motif, AAWKG, to head off any possible confusion with Tcp10.	398
411640	NF038048	DIP1984_fam	DIP1984 family protein. Members of this family, including the Corynebacterium diphtheriae protein DIP1984, which has a solved crystal structure, are uncharacterized with respect to function. Some members of this family previously have been annotated, incorrectly, as septolysin. This model was constructed to overrule and correct such errors. Note that septolysin O, and other members of the family of cholesterol-dependent cytolysins such as listeriolysin O (WP_003722731.1), are unrelated.	149
411641	NF038049	SelD_rel_HyperS	SelD-related putative sulfur metabolism protein. 	465
411642	NF038050	NrtS	nitrate/nitrite transporter NrtS. NrtS family proteins were first identified and characterized in Synechococcus sp. PCC 7002. The homologous proteins NrtS1 and NrtS2 are encoded by two neighboring genes on Synechococcus sp. PCC 7002 genome. The heteromeric transporter was shown to transport nitrite as well as nitrate. This HMM hits both NrtS1 and NrtS2 proteins, which have extremely high sequence identity and conserved motifs.	56
411643	NF038051	MamC	magnetosome protein MamC. Magnetosomes are membrane-enclosed organelles containing crystals of magnetite (Fe3O4) that cause magnetotactic bacteria to orient in magnetic fields. MamC interacts with the magnetite surface and affects the size and shape of the growing crystal.	97
411644	NF038052	histone_lik_HC2	histone H1-like DNA-binding protein Hc2. This model describes highly repetitive, Lys and Arg-rich histone H1-like protein Hc2, as found in the genus Chlamydia.	201
411645	NF038053	hist_H1_lk_Burk	histone H1-like DNA-binding protein. This model describes histone H1-like repetitive basic proteins as found in Burkholderia and related species. It excludes the Hc2 family found in Chlamydia and involved in DNA condensation for the formation of elementary bodies.	192
411646	NF038054	T3SS_SctI	type III secretion system inner rod subunit SctI. This model describes protein SctI (Secretion and Cellular Translocation I), an inner rod protein in the basal body of the type III secretion system (T3SS) as found in many pathogenic bacteria. SctI has some sequence similarity to the needle filament protein SctF. Lineage-specific names for SctI in various bacteria include BsaK, PrgJ, and MxiI.	76
411647	NF038055	T3SS_SctB_pilot	type III secretion system translocon subunit SctB. One SctB and four SctE subunits, located at the tip of the type III secretion system (T3SS) injectosome, combine to form the translocon (translocator pore) in the membrane of targeted cells. Species-specific names for this highly variable component of T3SS include YopD, EspB, IpaC, SipC, etc.	311
411648	NF038058	adhes_P110_Nter	P110/LppT family adhesin N-terminal domain. Members of this family include the multifunctional adhesin P110 (as the name is used in Mycoplasma hyopneumoniae, not Mycoplasma genitalium) and homologs presumed also to be adhesins. Homologs include LppT (lipoprotein T), which despite its name seems to lack the signal peptide region Cys residue required for conversion into a lipoprotein, and paralogs MHP683 and MHP684 from Mycoplasma hyopneumoniae.	202
411649	NF038065	Pr6Pr	Pr6Pr family membrane protein. This family is defined by TCDB as the prokaryotic 6 TMS (Pr6Pr) family of membrane proteins (http://www.tcdb.org/search/result.php?tc=9.b.302). The function of proteins in this family is not understood.	162
411650	NF038066	MptB	polyprenol phosphomannose-dependent alpha 1,6 mannosyltransferase MptB. Proteins of this family are involved in the initiation of core alpha-(1,6) mannan biosynthesis of lipomannan (LM-A) and multi-mannosylated polymer (LM-B), extending triacylated phosphatidyl-myo-inositol dimannoside (Ac1PIM2) and mannosylated glycolipid, 1,2-di-O-C16/C18:1-(alpha-D-mannopyranosyl)-(1->4)-(alpha-D-glucopyranosyluronic acid)-(1->3)-glycerol (Man1GlcAGroAc2), respectively.	554
411651	NF038067	OMP_CarO	ornithine uptake porin CarO. The outer membrane porin CarO (carbapenem resistance-associated outer membrane protein), found in Acinetobacter, is of clinical interest because of its role in allowing entry of carbapenem antibiotics and its variability from strain to strain. CarO should not be confused with Omp33-36, an essentially unrelated outer membrane protein that is similar in size and that also affects carbapenem susceptibility.	249
411652	NF038068	LaoB_over_CadC	L-arginine responsive protein LaoB. LaoB (L-arginine responsive overlapping gene) is a small, rare protein, encoded in an antisense frame to the gene for CadC in some strains of Escherichia coli. CadC, in response to acid stress and the presence of sufficient lysine, activates expression of a lysine decarboxylation and lysine/cadaverine antiport system, which provides resistance to the acidity.	41
411653	NF038070	LmbU_fam_TF	LmbU family transcriptional regulator. LmbU is a well-described member of a family of DNA-binding transcriptional activators for natural product biosynthesis in Streptomyces and related species. Besides LmbU (lincomycin), some other members include CouE (coumermycin A1), CloE (clorobiocin), and HrmB (hormaomycin).	171
411654	NF038071	lat_flg_LafA_2	lateral flagellin LafA. 	283
411655	NF038072	IcmL_DotI_only	type IVB secretion system apparatus protein IcmL/DotI. 	200
411656	NF038073	rSAM_STM4011	STM4011 family radical SAM protein. Members of this family are putative radical SAM proteins that, as many radical SAM proteins do, fall outside the scope of Pfam model PF04055 as available at the time of HMM construction (version 21). The function is unknown. Members are somewhat variable in architecture, and found mostly in Streptomyces and related species. However, the family is named for a member in Salmonella enterica, in the model strain LT2, protein STM4011, whose function is unknown.	278
411657	NF038074	fam_STM4014	STM4014 family protein. Members of this family are proteins of unknown function, regularly found in a conserved gene neighborhood that also includes two uncharacterized radical SAM proteins. The protein family is named for a founding member from the Salmonella enterica model strain LT2, although the system is rare in the Proteobacteria and relatively common in Streptomyces and related taxa.	350
411658	NF038075	fam_STM4013	STM4013/SEN3800 family hydrolase. Members of this family are sulfatase-like metal-dependent hydrolases, of unknown function, regularly found in a conserved gene neighborhood that also includes two uncharacterized radical SAM proteins. See PF00884 for information on related proteins.	261
411659	NF038076	fam_STM4015	STM4015 family protein. Members of this family are proteins of unknown function, regularly found in a conserved gene neighborhood that also includes two uncharacterized radical SAM proteins. The protein family is named for a founding member from the Salmonella enterica model strain LT2, although the system is rare in the Proteobacteria and relatively common in Streptomyces and related taxa.	286
411660	NF038077	MFS_export_MxcK	myxochelin export MFS transporter MxcK. 	405
411661	NF038078	NRPS_MxcG	myxochelin non-ribosomal peptide synthetase MxcG. 	1444
411662	NF038079	TonB_sider_MxcH	TonB-dependent siderophore myxochelin receptor MxcH. 	796
411663	NF038080	PG_bind_siph	peptidoglycan-binding domain. This domain shows apparent homology to known or putative peptidoglycan-binding domains in families such as PF01471. The domain occurs once, or twice, at the C-terminus of proteins such as cell wall amidases. In particular, member proteins can be found among putative lysins of phage of Streptomyces from the Siphoviridae family, such as phiBT1.	76
411664	NF038081	BN159_2729_fam	BN159_2729 family protein. This uncharacterized protein family occurs in Streptomyces and related species. Some members have insertions of long stretches of low-complexity sequences.	237
411665	NF038082	phiSA1p31	phiSA1p31 domain. This domain occurs in Streptomyces and related lineages, in proteins with highly variable architectures, typically at or near the C-terminus. Member proteins include at least two from known temperate phage of Streptomyces, including phiSA1p31, for which it is named, from Streptomyces phage phiSASD1.	60
411666	NF038083	CU044_5270_fam	CU044_5270 family protein. Members of this family occur largely in Streptomyces and related species, often with several members per genome. Lengths average about 340 amino acids. The function is unknown.	284
411667	NF038084	DHCW_cupin	DHCW motif cupin fold protein. Members of this uncharacterized protein family resemble other cupin superfamily small barrel proteins. This family has a signature motif, DHCW, for which the family is named.	106
411668	NF038085	MSMEG_6728_fam	MSMEG_6728 family protein. 	149
411669	NF038086	anchor_synt_A	protein sorting system archaetidylserine synthase. Members of this family are homologs of CDP-diacylglycerol--serine O-phosphatidyltransferase PssA of subclass II, as found in Gram-positive bacteria, but occur in a branch of the archaea. In Haloferax volcanii, the member of this family HVO_1143, together with the PssD-related decarboxylase HVO_0146, were both shown to be required for the archaeosortase ArtA to cause removal of the PGF-CTERM sorting signal and replacement with a prenyl-derived C-terminal lipid anchor. Based on these observations, members of this family are suggested to be CDP-2,3-di-O-geranylgeranyl-sn-glycerol:l-serine O-archaetidyltransferase, generating archaetidylserine en route to archaetidylethanolamine biosynthesis.	222
411670	NF038087	arch_ser_synth	archaetidylserine synthase. Members of this family, including founding member MTH_1027 from Methanothermobacter thermautotrophicus, resemble subclass II bacterial CDP-diacylglycerol--serine O-phosphatidyltransferase, but act instead as CDP-2,3-Di-O-geranylgeranyl-sn-glycerol:L-serine O-archaetidyltransferase (archaetidylserine synthase).	216
411671	NF038088	anchor_synt_D	protein sorting system archaetidylserine decarboxylase. Members of this family, including founding member HVO_0146 from Haloferax volcanii, are archaeal homologs of bacterial phosphatidylserine decarboxylases (PssD). HVO_0146, and the PssA homolog HVO_1143, were shown to be required for archaeosortase A (ArtA)-mediated removal of the PGF-CTERM protein-sorting signal and replacement with a large, prenyl-derived, C-terminal anchoring lipid moiety that is proposed to be archaetidylethanolamine.	196
411672	NF038090	IscA_HesB_Se	IscA/HesB family protein. Members of this family, a large fraction of which are selenoproteins, are homologous to proteins of iron-sulfur cluster biosynthesis such as IscA, and belong to the broader set of HesB-related proteins.	99
411673	NF038091	T4SS_VirB10	type IV secretion system protein VirB10. Members of this family are VirB10, an outer membrane-associated protein from the apparatus of protein type IV secretion systems (T4SS). The model attempts to exclude related TraI proteins of conjugal transfer systems as well as the ComB10 protein of a DNA-translocating competence protein of Helicobacter pylori. Because the N-terminal regions of VirB10 proteins are highly variable, the model	197
411674	NF038092	T4SS_ComB10	DNA type IV secretion system protein ComB10. ComB10, a VirB10 homolog, is an outer membrane-associated component of a DNA-translocating type IV secretion system (T4SS) involved in competence. Most T4SS translocate proteins, but both ComB10 as found in Helicobacter, and the T4SS of Agrobacterium, translocate DNA.	225
411675	NF038093	GrdX	GrdX family protein. GrdX is a small protein, of unknown function, encoded in grd operons for selenocysteine-dependent glycine reductase systems. A small number of GrdX proteins appear to be encoded with a UGA-encoded selenocysteine residue appearing close to the N-terminus, at sites that do not align with Cys residues in other members of the family. This arrangement suggests that the ability to complete translation of GrdX selenoproteins may have regulatory value, in addition to whatever may be the molecular function of GrdX itself.	118
411676	NF038094	CueP_fam	CueP family metal-binding protein. Members of this family include CueP itself, a copper-binding periplasmic metallochaperone, as found in Salmonella enterica serovar Typhimurium. Many family members, although not CueP itself, are lipoproteins. Several other members of the family are selenoproteins, including members from Bacillus selenitireducens, Bacillus beveridgei, and others.	146
411677	NF038095	met_chaper_CueP	copper-binding periplasmic metallochaperone CueP. This narrowly built model for CueP includes periplasmic proteins from Salmonella enterica, in which it contributes to an increased tolerance to copper, and from various other Gram-negative bacteria. It does not include CueP lipoproteins from species such as Corynebacterium diphtheriae.	177
411678	NF038096	thylak_slr1796	thylakoid membrane photosystem I accumulation factor. Members of this family, restricted to the Cyanobacteria and chloroplasts, show homology to thioredoxins. However, the core region of family protein alignment shows either one Cys residue or zero, suggesting family members may share the thioredoxin-like fold but lack redox capability. The founding member of the family, slr1796, was shown by proteomics to localize to the thylakoid membrane, consistent with taxonomic restriction to the Cyanobacteria. Targeted mutation of slr1796 shows a role in successful translation and insertion of photosystem I proteins into the thylakoid membrane.	157
411679	NF038097	KCGN_DNA_rpt	KCGN motif-containing spurious repeat. This AntiFam-type HMM recognizes spurious protein translations, often with the motif KCGN, of a DNA repeat widespread in the genus Leptospira.	21
411680	NF038098	GyrA_w_intein	intein-containing DNA gyrase subunit A. 	1232
411681	NF038099	AsSugarArsM	arsenosugar biosynthesis arsenite methyltransferase ArsM. This form of arsenite methyltransferase works together with the radical SAM enzyme ArsS in a pathway of arsenosugar biosynthesis. Examples of ArsM such as slr0303 from Synechocystis sp. PCC 6803 and alr3095 from Nostoc sp. PCC 7120 are encoded next to ArsS.	321
411682	NF038101	Trm112_arch	methytransferase partner Trm112. This HMM describes an archaeal branch of a small protein, Trm112, that is conserved in the three domains of life and that serves as general activator of methyltransferases for RNA or protein.	59
411683	NF038104	lipo_NF038104	NF038104 family lipoprotein. This family of small lipoproteins of unknown function, about 68 amino acids long, occurs in genera that include Acinetobacter, Moraxella, Neisseria, and Psychrobacter. The N-terminal half, including the lipoprotein signal peptide, shows significant sequence similarity to the divisome-associated lipoprotein YraP, a three-fold longer protein, as found in Escherichia coli.	62
411684	NF038105	acin_NF038105	NF038105 family protein. This family of small proteins, about 66 amino acids long, appears universal in the first 20 species of Acinetobacter examined, but absent outside the genus.	62
411685	NF038106	gamma_NF038106	PA4642 family protein. Members of this family are small (about 95 amino acids), uncharacterized, and apparently restricted to the Gammaproteobacteria. Members include PA4642 from Pseudomonas aeruginosa PAO1.	93
411686	NF038107	rSAM_NF038107	Cys-every-fifth radical SAM/SPASM peptide maturase CefB. Members of this family are radical SAM/SPASM domain proteins, most of which perform post-translational modification on RiPP peptide precursors or enzyme subunits. Target residues for modification often are Cys residues. Members of family NF038108, with a Cys Every Fifth position (CefA) over most of the short length of that family, are the putative target RiPP proteins.	444
411687	NF038108	RiPP_NF038108	Cys-every-fifth RiPP peptide CefA. Members typically are shorter than 100 residues, with from nine to eleven Cys residues spaced strictly as every fifth residue and usually with at least one adjacent Gly. Most family members occur in the vicinity of a peptide-modifying radical SAM/SPASM domain protein, marking those family members as putative RiPP (ribosomally translated, post-translationally modified peptide) precursors. Because of the small size, richness in Cys and Gly residues, and strictly repetitive nature, it may be expected that some predicted proteins, scoring above the thresholds for the model, are related by convergent evolution rather than by homology, and are not themselves RiPP precursors.	65
411688	NF038110	Lys_methyl_FliB	flagellin lysine-N-methylase. 	375
222768	PHA00002	A	DNA replication initiation protein gpA	515
222769	PHA00003	B	internal scaffolding protein	120
164773	PHA00006	D	external scaffolding protein	151
164774	PHA00007	E	cell lysis protein	91
222770	PHA00008	J	DNA packaging protein	25
164775	PHA00009	F	capsid protein	427
164776	PHA00010	G	major spike protein	179
222771	PHA00012	I	assembly protein	361
222772	PHA00019	IV	phage assembly protein	428
164777	PHA00022	VII	minor coat protein	28
106880	PHA00024	IX	minor coat protein	33
222773	PHA00025	VIII	major coat protein	76
133846	PHA00026	cp	coat protein	129
133847	PHA00027	lys	lysis protein	58
222774	PHA00028	rep	RNA replicase, beta subunit	561
222775	PHA00080	PHA00080	DksA-like zinc finger domain containing protein	72
106886	PHA00094	VI	minor coat protein	112
164779	PHA00097	K	protein K	56
222776	PHA00098	PHA00098	hypothetical protein	112
164781	PHA00099	PHA00099	minor capsid protein	147
177266	PHA00101	PHA00101	internal virion protein B	194
222777	PHA00144	PHA00144	major head protein	438
133855	PHA00147	PHA00147	upper collar protein	308
222778	PHA00148	PHA00148	lower collar protein	242
222779	PHA00149	PHA00149	DNA encapsidation protein	331
177267	PHA00159	PHA00159	endonuclease I	148
222780	PHA00198	PHA00198	nonstructural protein	86
177268	PHA00201	PHA00201	major capsid protein	343
164786	PHA00202	PHA00202	DNA replication initiation protein	388
164787	PHA00212	PHA00212	putative transcription regulator	63
222781	PHA00276	PHA00276	phage lambda Rz-like lysis protein	144
106901	PHA00280	PHA00280	putative HNH endonuclease	121
164789	PHA00327	PHA00327	minor capsid protein	187
222782	PHA00330	PHA00330	putative replication initiation protein	316
222783	PHA00350	PHA00350	putative assembly protein	399
177271	PHA00360	II	replication initiation protein	421
222784	PHA00363	PHA00363	major capsid protein	557
222785	PHA00368	PHA00368	internal virion protein D	1315
164794	PHA00369	H	minor spike protein	325
164795	PHA00370	III	attachment protein	297
222786	PHA00371	mat	maturation protein	418
133872	PHA00380	PHA00380	tail protein	599
164796	PHA00404	PHA00404	hypothetical protein	42
222787	PHA00405	PHA00405	hypothetical protein	85
164797	PHA00406	PHA00406	hypothetical protein	48
164798	PHA00407	PHA00407	phage lambda Rz1-like protein	84
222788	PHA00415	25	baseplate wedge subunit	131
133878	PHA00422	PHA00422	hypothetical protein	69
164800	PHA00425	PHA00425	DNA packaging protein, small subunit	88
164801	PHA00426	PHA00426	type II holin	67
222789	PHA00428	PHA00428	tail tubular protein A	193
222790	PHA00430	PHA00430	tail fiber protein	568
222791	PHA00431	PHA00431	internal virion protein C	746
177277	PHA00432	PHA00432	internal virion protein A	137
222792	PHA00435	PHA00435	capsid assembly protein	306
222793	PHA00437	PHA00437	tail assembly protein	94
133887	PHA00438	PHA00438	hypothetical protein	81
222794	PHA00439	PHA00439	exonuclease	286
133889	PHA00440	PHA00440	host protein H-NS-interacting protein	98
222795	PHA00441	PHA00441	hypothetical protein	89
222796	PHA00442	PHA00442	host recBCD nuclease inhibitor	59
177281	PHA00446	PHA00446	hypothetical protein	89
177282	PHA00447	PHA00447	lysozyme	142
133894	PHA00448	PHA00448	hypothetical protein	70
164812	PHA00450	PHA00450	host dGTPase inhibitor	85
177283	PHA00451	PHA00451	protein kinase	362
222797	PHA00452	PHA00452	T3/T7-like RNA polymerase	807
164815	PHA00453	PHA00453	hypothetical protein	41
222798	PHA00454	PHA00454	ATP-dependent DNA ligase	315
133900	PHA00455	PHA00455	hypothetical protein	85
164817	PHA00456	PHA00456	hypothetical protein	34
222799	PHA00457	PHA00457	inhibitor of host bacterial RNA polymerase	63
222800	PHA00458	PHA00458	single-stranded DNA-binding protein	233
222801	PHA00476	PHA00476	hypothetical protein	110
133905	PHA00489	PHA00489	scaffolding protein	101
133906	PHA00490	PHA00490	terminal protein	266
222802	PHA00497	pol	RNA-dependent RNA polymerase	673
133907	PHA00510	PHA00510	transcriptional regulator	125
222803	PHA00514	PHA00514	dsDNA binding protein	98
133909	PHA00515	PHA00515	hypothetical protein	53
222804	PHA00520	PHA00520	packaging NTPase P4	330
222805	PHA00527	PHA00527	hypothetical protein	129
133910	PHA00540	PHA00540	hypothetical protein	715
106954	PHA00542	PHA00542	putative Cro-like protein	82
164822	PHA00547	PHA00547	hypothetical protein	337
177288	PHA00616	PHA00616	hypothetical protein	44
177289	PHA00617	PHA00617	ribbon-helix-helix domain containing protein	80
177290	PHA00619	PHA00619	CRISPR-associated Cas4-like protein	201
106959	PHA00626	PHA00626	hypothetical protein	59
222806	PHA00645	PHA00645	hypothetical protein	125
133916	PHA00646	PHA00646	hypothetical protein	65
106962	PHA00649	PHA00649	hypothetical protein	83
106963	PHA00650	PHA00650	hypothetical protein	82
106964	PHA00652	PHA00652	hypothetical protein	128
164824	PHA00653	mtd	major tropism determinant	381
106966	PHA00657	PHA00657	crystallin beta/gamma motif-containing protein	2052
106967	PHA00658	PHA00658	putative lysin	720
133918	PHA00660	PHA00660	hypothetical protein	215
106970	PHA00661	PHA00661	hypothetical protein	734
222807	PHA00662	PHA00662	hypothetical protein	215
106972	PHA00663	PHA00663	hypothetical protein	68
106973	PHA00664	PHA00664	hypothetical protein	140
106974	PHA00665	PHA00665	major capsid protein	329
222808	PHA00666	PHA00666	putative protease	233
106976	PHA00667	PHA00667	hypothetical protein	158
222809	PHA00669	PHA00669	hypothetical protein	114
106978	PHA00670	PHA00670	hypothetical protein	540
106979	PHA00671	PHA00671	hypothetical protein	135
133920	PHA00672	PHA00672	hypothetical protein	152
106981	PHA00673	PHA00673	acetyltransferase domain containing protein	154
106982	PHA00675	PHA00675	hypothetical protein	78
106983	PHA00676	PHA00676	hypothetical protein	96
106984	PHA00679	PHA00679	hypothetical protein	71
106985	PHA00680	PHA00680	hypothetical protein	143
222810	PHA00684	PHA00684	hypothetical protein	128
106987	PHA00687	PHA00687	hypothetical protein	56
106988	PHA00689	PHA00689	hypothetical protein	62
106989	PHA00691	PHA00691	hypothetical protein	68
106990	PHA00692	PHA00692	hypothetical protein	74
222811	PHA00724	PHA00724	hypothetical protein	83
177293	PHA00725	PHA00725	hypothetical protein	81
177294	PHA00726	PHA00726	hypothetical protein	89
222812	PHA00727	PHA00727	hypothetical protein	278
177296	PHA00728	PHA00728	hypothetical protein	151
177297	PHA00729	PHA00729	NTP-binding motif containing protein	226
222813	PHA00730	int	integrase	337
222814	PHA00731	PHA00731	hypothetical protein	96
177300	PHA00732	PHA00732	hypothetical protein	79
177301	PHA00733	PHA00733	hypothetical protein	128
177302	PHA00734	PHA00734	hypothetical protein	95
177303	PHA00735	PHA00735	hypothetical protein	808
177304	PHA00736	PHA00736	hypothetical protein	79
177305	PHA00738	PHA00738	putative HTH transcription regulator	108
177306	PHA00739	V3	structural protein VP3	92
222815	PHA00742	PHA00742	hypothetical protein	211
177308	PHA00743	PHA00743	helix-turn-helix protein	51
164842	PHA00771	PHA00771	head assembly protein	151
107010	PHA00780	PHA00780	hypothetical protein	80
133939	PHA00781	PHA00781	hypothetical protein	59
164843	PHA00821	PHA00821	hypothetical protein	295
222816	PHA00911	21	prohead core scaffolding protein and protease	212
222817	PHA00965	PHA00965	tail protein	588
177310	PHA00979	PHA00979	putative major coat protein	77
222818	PHA01075	PHA01075	major capsid protein	408
107017	PHA01076	PHA01076	putative encapsidation protein	378
222819	PHA01077	PHA01077	putative lower collar protein	251
164848	PHA01078	PHA01078	putative upper collar protein	249
164849	PHA01079	PHA01079	hypothetical protein	48
164850	PHA01080	PHA01080	hypothetical protein	80
133945	PHA01081	PHA01081	putative minor coat protein	104
222820	PHA01082	PHA01082	putative transcription regulator	133
164851	PHA01083	PHA01083	hypothetical protein	149
107025	PHA01159	PHA01159	hypothetical protein	114
107026	PHA01160	PHA01160	nonstructural protein	40
222821	PHA01327	PHA01327	hypothetical protein	49
164853	PHA01346	PHA01346	hypothetical protein	53
107029	PHA01351	PHA01351	putative minor structural protein	1070
107030	PHA01365	PHA01365	hypothetical protein	91
222822	PHA01366	PHA01366	hypothetical protein	337
133949	PHA01399	PHA01399	membrane protein P6	242
164854	PHA01474	PHA01474	nonstructural protein	52
107034	PHA01486	PHA01486	nonstructural protein	32
107035	PHA01511	PHA01511	coat protein	430
164855	PHA01513	mnt	Mnt	82
107037	PHA01514	PHA01514	O-antigen conversion protein C	485
107038	PHA01516	PHA01516	hypothetical protein	98
107039	PHA01519	PHA01519	hypothetical protein	115
177311	PHA01547	PHA01547	putative internal virion protein A	206
222823	PHA01548	PHA01548	hypothetical protein	167
164858	PHA01622	PHA01622	CRISPR-associated Cas4-like protein	204
222824	PHA01623	PHA01623	hypothetical protein	56
222825	PHA01624	PHA01624	hypothetical protein	102
164860	PHA01625	PHA01625	hypothetical protein	249
222826	PHA01627	PHA01627	DNA binding protein	107
164861	PHA01630	PHA01630	putative group 1 glycosyl transferase	331
164862	PHA01631	PHA01631	hypothetical protein	176
133953	PHA01632	PHA01632	hypothetical protein	64
107050	PHA01633	PHA01633	putative glycosyl transferase group 1	335
133954	PHA01634	PHA01634	hypothetical protein	156
222827	PHA01635	PHA01635	hypothetical protein	231
107053	PHA01707	dut	2'-deoxyuridine 5'-triphosphatase	158
222828	PHA01732	PHA01732	proline-rich protein	94
107055	PHA01733	PHA01733	hypothetical protein	153
222829	PHA01735	PHA01735	hypothetical protein	76
133956	PHA01740	PHA01740	putative single-stranded DNA-binding protein	158
222830	PHA01745	PHA01745	hypothetical protein	306
107059	PHA01746	PHA01746	hypothetical protein	131
222831	PHA01747	PHA01747	putative ATP-dependent protease	425
222832	PHA01748	PHA01748	hypothetical protein	60
177316	PHA01749	PHA01749	coat protein	134
107063	PHA01750	PHA01750	hypothetical protein	75
222833	PHA01751	PHA01751	hypothetical protein	110
177317	PHA01752	PHA01752	hypothetical protein	488
133958	PHA01753	PHA01753	Holliday junction resolvase	121
133959	PHA01754	PHA01754	hypothetical protein	69
222834	PHA01755	PHA01755	hypothetical protein	562
107069	PHA01756	PHA01756	hypothetical protein	268
222835	PHA01757	PHA01757	hypothetical protein	98
222836	PHA01769	PHA01769	hypothetical protein	98
177318	PHA01782	PHA01782	hypothetical protein	177
164869	PHA01790	PHA01790	streptodornase	326
222837	PHA01794	PHA01794	hypothetical protein	134
177320	PHA01795	PHA01795	hypothetical protein	280
222838	PHA01806	PHA01806	hypothetical protein	200
222839	PHA01807	PHA01807	hypothetical protein	153
164872	PHA01808	PHA01808	putative structural protein	98
107079	PHA01809	PHA01809	hypothetical protein	65
177323	PHA01810	PHA01810	hypothetical protein	100
177324	PHA01811	PHA01811	hypothetical protein	78
177325	PHA01812	PHA01812	hypothetical protein	122
107083	PHA01813	PHA01813	hypothetical protein	58
107084	PHA01814	PHA01814	hypothetical protein	137
107085	PHA01815	PHA01815	hypothetical protein	55
107086	PHA01816	PHA01816	hypothetical protein	160
177326	PHA01817	PHA01817	hypothetical protein	479
107088	PHA01818	PHA01818	hypothetical protein	458
107089	PHA01819	PHA01819	hypothetical protein	129
222840	PHA01886	PHA01886	TM2 domain-containing protein	78
177328	PHA01929	PHA01929	putative scaffolding protein	306
222841	PHA01971	PHA01971	hypothetical protein	123
222842	PHA01972	PHA01972	structural protein	828
177330	PHA01976	PHA01976	helix-turn-helix protein	67
177331	PHA02004	PHA02004	capsid protein	332
222843	PHA02030	PHA02030	hypothetical protein	336
222844	PHA02031	PHA02031	putative DnaG-like primase	266
222845	PHA02046	PHA02046	hypothetical protein	99
222846	PHA02047	PHA02047	phage lambda Rz1-like protein	101
177336	PHA02053	PHA02053	hypothetical protein	115
177337	PHA02054	PHA02054	hypothetical protein	94
177338	PHA02057	PHA02057	ADP-ribosylation superfamily-like protein	319
164889	PHA02067	PHA02067	hypothetical protein	221
177339	PHA02078	PHA02078	hypothetical protein	54
164890	PHA02085	PHA02085	hypothetical protein	87
107108	PHA02086	PHA02086	hypothetical protein	88
107109	PHA02087	PHA02087	hypothetical protein	83
107110	PHA02088	PHA02088	hypothetical protein	125
177340	PHA02090	PHA02090	hypothetical protein	79
177341	PHA02091	PHA02091	hypothetical protein	72
177342	PHA02092	PHA02092	hypothetical protein	108
177343	PHA02094	PHA02094	hypothetical protein	81
107115	PHA02095	PHA02095	hypothetical protein	84
107116	PHA02096	PHA02096	hypothetical protein	103
177344	PHA02097	PHA02097	hypothetical protein	59
107118	PHA02098	PHA02098	hypothetical protein	56
107119	PHA02099	PHA02099	hypothetical protein	84
107120	PHA02100	PHA02100	hypothetical protein	112
177345	PHA02101	PHA02101	hypothetical protein	101
222847	PHA02102	PHA02102	hypothetical protein	72
222848	PHA02103	PHA02103	hypothetical protein	135
177347	PHA02104	PHA02104	hypothetical protein	89
133990	PHA02105	PHA02105	hypothetical protein	68
177348	PHA02106	PHA02106	hypothetical protein	91
164900	PHA02107	PHA02107	hypothetical protein	216
177349	PHA02108	PHA02108	hypothetical protein	48
222849	PHA02109	PHA02109	hypothetical protein	233
107130	PHA02110	PHA02110	hypothetical protein	98
107131	PHA02114	PHA02114	hypothetical protein	127
164902	PHA02115	PHA02115	hypothetical protein	105
177351	PHA02117	PHA02117	glutathionylspermidine synthase domain-containing protein	397
107134	PHA02118	PHA02118	hypothetical protein	202
107135	PHA02119	PHA02119	hypothetical protein	87
177352	PHA02122	PHA02122	hypothetical protein	65
107137	PHA02123	PHA02123	hypothetical protein	146
133998	PHA02125	PHA02125	thioredoxin-like protein	75
222850	PHA02126	PHA02126	hypothetical protein	153
107140	PHA02127	PHA02127	hypothetical protein	57
107141	PHA02128	PHA02128	hypothetical protein	151
107142	PHA02130	PHA02130	hypothetical protein	81
107143	PHA02131	PHA02131	hypothetical protein	70
107144	PHA02132	PHA02132	hypothetical protein	86
107145	PHA02135	PHA02135	hypothetical protein	122
177353	PHA02141	PHA02141	hypothetical protein	105
134000	PHA02142	PHA02142	putative RNA ligase	366
107148	PHA02145	PHA02145	hypothetical protein	230
107149	PHA02146	PHA02146	hypothetical protein	86
107150	PHA02148	PHA02148	hypothetical protein	110
134001	PHA02150	PHA02150	hypothetical protein	77
177354	PHA02151	PHA02151	hypothetical protein	217
107153	PHA02152	PHA02152	hypothetical protein	96
107154	PHA02239	PHA02239	putative protein phosphatase	235
107155	PHA02241	PHA02241	hypothetical protein	182
107156	PHA02243	PHA02243	hypothetical protein	160
107157	PHA02244	PHA02244	ATPase-like protein	383
177355	PHA02246	PHA02246	hypothetical protein	192
134004	PHA02248	PHA02248	hypothetical protein	204
177356	PHA02256	PHA02256	hypothetical protein	113
107161	PHA02264	PHA02264	hypothetical protein	152
164905	PHA02265	PHA02265	hypothetical protein	103
107163	PHA02275	PHA02275	hypothetical protein	125
107164	PHA02277	PHA02277	hypothetical protein	150
177357	PHA02278	PHA02278	thioredoxin-like protein	103
107166	PHA02283	PHA02283	hypothetical protein	210
107167	PHA02284	PHA02284	hypothetical protein	251
107168	PHA02290	PHA02290	hypothetical protein	234
177358	PHA02291	PHA02291	hypothetical protein	132
177359	PHA02310	PHA02310	hypothetical protein	130
164907	PHA02324	PHA02324	hypothetical protein	47
177360	PHA02325	PHA02325	hypothetical protein	72
164909	PHA02334	PHA02334	hypothetical protein	64
164910	PHA02335	PHA02335	hypothetical protein	118
177361	PHA02337	PHA02337	putative high light inducible protein	35
164912	PHA02357	PHA02357	hypothetical protein	81
222851	PHA02358	PHA02358	hypothetical protein	194
107178	PHA02360	PHA02360	hypothetical protein	70
107179	PHA02414	PHA02414	hypothetical protein	111
177362	PHA02415	PHA02415	DNA primase domain-containing protein	930
107181	PHA02416	PHA02416	hypothetical protein	167
164914	PHA02417	PHA02417	hypothetical protein	83
107183	PHA02436	PHA02436	hypothetical protein	52
177363	PHA02446	PHA02446	hypothetical protein	166
164916	PHA02447	PHA02447	hypothetical protein	86
107186	PHA02448	PHA02448	hypothetical protein	192
134010	PHA02450	PHA02450	hypothetical protein	53
177364	PHA02451	PHA02451	hypothetical protein	54
164918	PHA02456	PHA02456	zinc metallopeptidase motif-containing protein	141
164919	PHA02458	A	protein A*; Reviewed	341
177365	PHA02503	PHA02503	putative transcription regulator; Provisional	57
107192	PHA02508	PHA02508	putative minor coat protein; Provisional	93
222852	PHA02510	X	gene X product; Reviewed	116
177367	PHA02513	V1	structural protein V1; Reviewed	135
107197	PHA02515	PHA02515	hypothetical protein; Provisional	508
134016	PHA02516	W	baseplate wedge subunit; Provisional	103
222853	PHA02517	PHA02517	putative transposase OrfB; Reviewed	277
222854	PHA02518	PHA02518	ParA-like protein; Provisional	211
107201	PHA02519	PHA02519	plasmid partition protein SopA; Reviewed	387
164924	PHA02523	43B	DNA polymerase subunit B; Provisional	391
164925	PHA02524	43A	DNA polymerase subunit A; Provisional	498
177369	PHA02528	43	DNA polymerase; Provisional	881
222855	PHA02529	O	capsid-scaffolding protein; Provisional	278
222856	PHA02530	pseT	polynucleotide kinase; Provisional	300
222857	PHA02531	20	portal vertex protein; Provisional	514
222858	PHA02533	17	large terminase protein; Provisional	534
222859	PHA02535	P	terminase ATPase subunit; Provisional	581
222860	PHA02536	Q	portal vertex protein; Provisional	346
222861	PHA02537	M	terminase endonuclease subunit; Provisional	230
164934	PHA02538	N	capsid protein; Provisional	348
222862	PHA02539	18	tail sheath protein; Provisional	648
222863	PHA02540	61	DNA primase; Provisional	337
177376	PHA02541	23	major capsid protein; Provisional	518
222864	PHA02542	41	41 helicase; Provisional	473
222865	PHA02543	regA	translation repressor protein; Provisional	125
222866	PHA02544	44	clamp loader, small subunit; Provisional	316
177380	PHA02545	45	sliding clamp; Provisional	223
222867	PHA02546	47	endonuclease subunit; Provisional	340
222868	PHA02547	55	RNA polymerase sigma factor; Provisional	179
177383	PHA02548	24	capsid vertex protein; Provisional	412
222869	PHA02550	32	single-stranded DNA binding protein; Provisional	304
177385	PHA02551	19	tail tube protein; Provisional	163
222870	PHA02552	4	head completion protein; Provisional	151
222871	PHA02553	6	baseplate wedge subunit; Provisional	611
177388	PHA02554	13	neck protein; Provisional	311
222872	PHA02555	14	neck protein; Provisional	216
222873	PHA02556	15	tail sheath stabilizer and completion protein; Provisional	273
222874	PHA02557	22	prohead core protein; Provisional	271
222875	PHA02558	uvsW	UvsW helicase; Provisional	501
222876	PHA02559	59	59 protein; Provisional	216
164955	PHA02560	FI	major tail sheath protein; Provisional	388
222877	PHA02561	D	tail protein; Provisional	351
222878	PHA02562	46	endonuclease subunit; Provisional	562
222879	PHA02563	PHA02563	DNA polymerase; Provisional	630
222880	PHA02564	V	virion protein; Provisional	141
177395	PHA02565	49	recombination endonuclease VII; Provisional	157
222881	PHA02566	alt	ADP-ribosyltransferase; Provisional	684
222882	PHA02567	rnh	RnaseH; Provisional	304
164963	PHA02568	J	baseplate assembly protein; Provisional	300
177398	PHA02569	39	DNA topoisomerase II large subunit; Provisional	602
177399	PHA02570	dexA	exonuclease; Provisional	220
177400	PHA02571	a-gt.4	hypothetical protein; Provisional	109
222883	PHA02572	nrdA	ribonucleoside-diphosphate reductase subunit alpha; Provisional	753
222884	PHA02573	30.3	hypothetical protein; Provisional	148
177403	PHA02574	57B	hypothetical protein; Provisional	149
222885	PHA02575	1	deoxynucleoside monophosphate kinase; Provisional	227
177405	PHA02576	3	tail completion and sheath stabilizer protein; Provisional	177
222886	PHA02577	2	DNA end protector protein; Provisional	181
177407	PHA02578	53	baseplate wedge subunit; Provisional	181
177408	PHA02579	7	baseplate wedge subunit; Provisional	1030
177409	PHA02580	8	baseplate wedge subunit; Provisional	331
222887	PHA02581	9	baseplate wedge tail fiber connector; Provisional	284
222888	PHA02582	10	baseplate wedge subunit and tail pin; Provisional	604
222889	PHA02583	11	baseplate wedge subunit and tail pin; Provisional	218
222890	PHA02584	34	long tail fiber, proximal subunit; Provisional	1229
222891	PHA02585	16	small terminase protein; Provisional	161
222892	PHA02586	68	prohead core protein; Provisional	140
222893	PHA02587	30	DNA ligase; Provisional	488
222894	PHA02588	cd	deoxycytidylate deaminase; Provisional	168
222895	PHA02589	rnlA	RNA ligase A; Provisional	378
164985	PHA02590	PHA02590	hypothetical protein; Provisional	105
164986	PHA02591	PHA02591	hypothetical protein; Provisional	83
222896	PHA02592	52	DNA topoisomerase II medium subunit; Provisional	439
222897	PHA02593	62	clamp loader small subunit; Provisional	191
222898	PHA02594	nadV	nicotinamide phosphoribosyl transferase; Provisional	470
222899	PHA02595	tk.4	hypothetical protein; Provisional	154
222900	PHA02596	5	baseplate hub subunit and tail lysozyme; Provisional	576
222901	PHA02597	30.2	hypothetical protein; Provisional	197
222902	PHA02598	denA	endonuclease II; Provisional	138
222903	PHA02599	dsbA	double-stranded DNA binding protein; Provisional	91
164995	PHA02600	FII	major tail tube protein; Provisional	169
222904	PHA02601	int	integrase; Provisional	333
177427	PHA02602	56	dCTP pyrophosphatase; Provisional	172
222905	PHA02603	nrdC.11	hypothetical protein; Provisional	330
177429	PHA02604	rI.-1	hypothetical protein; Provisional	126
177430	PHA02605	54	baseplate subunit; Provisional	305
222906	PHA02606	5.1	hypothetical protein; Provisional	179
177432	PHA02607	wac	fibritin; Provisional	454
177433	PHA02608	67	prohead core protein; Provisional	80
165004	PHA02609	uvsW.1	hypothetical protein; Provisional	76
165005	PHA02610	uvsY.-2	hypothetical protein; Provisional	53
222907	PHA02611	51	baseplate hub assembly protein; Provisional	249
222908	PHA02612	27	baseplate hub subunit; Provisional	372
222909	PHA02613	48	baseplate subunit; Provisional	361
222910	PHA02614	PHA02614	Major capsid protein VP1; Provisional	363
222911	PHA02616	PHA02616	VP2/VP3; Provisional	259
177439	PHA02620	PHA02620	VP3; Provisional	353
177440	PHA02621	PHA02621	agnoprotein; Provisional	68
222912	PHA02624	PHA02624	large T antigen; Provisional	647
177442	PHA02627	PHA02627	hypothetical protein; Provisional	73
165015	PHA02629	PHA02629	A-type inclusion body protein; Provisional	61
165016	PHA02633	PHA02633	hypothetical protein; Provisional	63
165017	PHA02634	PHA02634	hypothetical protein; Provisional	49
165018	PHA02635	PHA02635	ankyrin-like protein; Provisional	61
165019	PHA02636	PHA02636	hypothetical protein; Provisional	47
222913	PHA02637	PHA02637	TNF-alpha-receptor-like protein; Provisional	127
165021	PHA02638	PHA02638	CC chemokine receptor-like protein; Provisional	417
165022	PHA02639	PHA02639	EEV host range protein; Provisional	295
165023	PHA02641	PHA02641	hypothetical protein; Provisional	188
165024	PHA02642	PHA02642	C-type lectin-like protein; Provisional	216
165025	PHA02643	PHA02643	hypothetical protein; Provisional	82
165026	PHA02644	PHA02644	hypothetical protein; Provisional	112
165027	PHA02646	PHA02646	virion protein; Provisional	156
165029	PHA02649	PHA02649	hypothetical protein; Provisional	95
165030	PHA02650	PHA02650	hypothetical protein; Provisional	81
165031	PHA02651	PHA02651	IL-1 receptor antagonist; Provisional	165
165032	PHA02652	PHA02652	hypothetical protein; Provisional	70
177443	PHA02653	PHA02653	RNA helicase NPH-II; Provisional	675
165034	PHA02655	PHA02655	hypothetical protein; Provisional	94
165035	PHA02656	PHA02656	viral TNFR II-like protein; Provisional	199
165036	PHA02657	PHA02657	hypothetical protein; Provisional	95
165037	PHA02658	PHA02658	hypothetical protein; Provisional	92
165038	PHA02659	PHA02659	endothelin precursor; Provisional	70
165039	PHA02660	PHA02660	serpin-like protein; Provisional	364
177444	PHA02661	PHA02661	vascular endothelial growth factor like protein; Provisional	146
177445	PHA02662	PHA02662	ORF131 putative membrane protein; Provisional	226
177446	PHA02663	PHA02663	hypothetical protein; Provisional	172
177447	PHA02664	PHA02664	hypothetical protein; Provisional	534
177448	PHA02665	PHA02665	hypothetical protein; Provisional	322
222914	PHA02666	PHA02666	hypothetical protein; Provisional	287
177450	PHA02668	PHA02668	GM-CSF/IL-2 inhibition factor; Provisional	265
177451	PHA02669	PHA02669	hypothetical protein; Provisional	210
222915	PHA02670	PHA02670	ORF112 putative chemokine-binding protein; Provisional	287
177453	PHA02671	PHA02671	hypothetical protein; Provisional	179
177454	PHA02672	PHA02672	ORF110 EEV glycoprotein; Provisional	166
177455	PHA02673	PHA02673	ORF109 EEV glycoprotein; Provisional	161
177456	PHA02674	PHA02674	ORF107 virion morphogenesis; Provisional	60
177457	PHA02675	PHA02675	ORF104 fusion protein; Provisional	90
177458	PHA02676	PHA02676	A-type inclusion protein; Provisional	520
222916	PHA02677	PHA02677	hypothetical protein; Provisional	108
177460	PHA02678	PHA02678	hypothetical protein; Provisional	89
177461	PHA02679	PHA02679	ORF091 IMV membrane protein; Provisional	53
177462	PHA02680	PHA02680	ORF090 IMV phosphorylated membrane protein; Provisional	91
222917	PHA02681	PHA02681	ORF089 virion membrane protein; Provisional	92
177464	PHA02682	PHA02682	ORF080 virion core protein; Provisional	280
177465	PHA02683	PHA02683	ORF078 thioredoxin-like protein; Provisional	75
177466	PHA02684	PHA02684	ORF066 virion protein; Provisional	221
177467	PHA02685	PHA02685	ORF065 virion protein; Provisional	155
177468	PHA02686	PHA02686	hypothetical protein; Provisional	138
222918	PHA02687	PHA02687	ORF061 late transcription factor VLTF-4; Provisional	231
222919	PHA02688	PHA02688	ORF059 IMV protein VP55; Provisional	323
177471	PHA02689	PHA02689	ORF051 putative membrane protein; Provisional	128
222920	PHA02690	PHA02690	hypothetical protein; Provisional	90
177473	PHA02691	PHA02691	hypothetical protein; Provisional	110
177474	PHA02692	PHA02692	hypothetical protein; Provisional	70
177475	PHA02693	PHA02693	hypothetical protein; Provisional	710
177476	PHA02694	PHA02694	hypothetical protein; Provisional	292
177477	PHA02695	PHA02695	hypothetical protein; Provisional	725
222921	PHA02696	PHA02696	hypothetical protein; Provisional	79
222922	PHA02697	PHA02697	hypothetical protein; Provisional	255
177480	PHA02698	PHA02698	hypothetical protein; Provisional	89
165075	PHA02699	PHA02699	hypothetical protein; Provisional	466
177481	PHA02700	PHA02700	ORF017 DNA-binding phosphoprotein; Provisional	106
177482	PHA02701	PHA02701	ORF020 dsRNA-binding PKR inhibitor; Provisional	183
177483	PHA02702	PHA02702	ORF033 IMV membrane protein; Provisional	78
165079	PHA02703	PHA02703	ORF007 dUTPase; Provisional	165
165080	PHA02705	PHA02705	hypothetical protein; Provisional	72
165081	PHA02706	PHA02706	hypothetical protein; Provisional	58
165082	PHA02707	PHA02707	hypothetical protein; Provisional	37
177484	PHA02708	PHA02708	hypothetical protein; Provisional	148
165084	PHA02709	PHA02709	hypothetical protein; Provisional	44
165085	PHA02711	PHA02711	Toll/IL-receptor-like protein; Provisional	190
165086	PHA02713	PHA02713	hypothetical protein; Provisional	557
165087	PHA02714	PHA02714	CD-30-like protein; Provisional	110
165088	PHA02715	PHA02715	hypothetical protein; Provisional	202
165089	PHA02716	PHA02716	CPXV016; CPX019; EVM010; Provisional	764
165090	PHA02718	PHA02718	hypothetical protein; Provisional	69
165092	PHA02723	PHA02723	hypothetical protein; Provisional	77
165093	PHA02724	PHA02724	hydrophobic IMV membrane protein; Provisional	53
165094	PHA02725	PHA02725	hypothetical protein; Provisional	170
165095	PHA02726	PHA02726	hypothetical protein; Provisional	94
165096	PHA02728	PHA02728	uncharacterized protein; Provisional	184
165097	PHA02729	PHA02729	hypothetical protein; Provisional	94
165098	PHA02730	PHA02730	ankyrin-like protein; Provisional	672
177485	PHA02731	PHA02731	putative integrase; Provisional	231
165099	PHA02732	PHA02732	hypothetical protein; Provisional	1467
165101	PHA02734	PHA02734	coat protein; Provisional	149
165102	PHA02735	PHA02735	putative DNA polymerase type B; Provisional	716
165103	PHA02736	PHA02736	Viral ankyrin protein; Provisional	154
165104	PHA02737	PHA02737	hypothetical protein; Provisional	72
222923	PHA02738	PHA02738	hypothetical protein; Provisional	320
222924	PHA02739	PHA02739	hypothetical protein; Provisional	116
165107	PHA02740	PHA02740	protein tyrosine phosphatase; Provisional	298
165108	PHA02741	PHA02741	hypothetical protein; Provisional	169
165109	PHA02742	PHA02742	protein tyrosine phosphatase; Provisional	303
222925	PHA02743	PHA02743	Viral ankyrin protein; Provisional	166
165111	PHA02744	PHA02744	hypothetical protein; Provisional	88
222926	PHA02745	PHA02745	hypothetical protein; Provisional	265
165113	PHA02746	PHA02746	protein tyrosine phosphatase; Provisional	323
165114	PHA02747	PHA02747	protein tyrosine phosphatase; Provisional	312
165115	PHA02748	PHA02748	viral innexin-like protein; Provisional	360
165116	PHA02749	PHA02749	hypothetical protein; Provisional	322
165117	PHA02750	PHA02750	hypothetical protein; Provisional	240
165118	PHA02751	PHA02751	hypothetical protein; Provisional	233
177486	PHA02752	PHA02752	hypothetical protein; Provisional	242
165120	PHA02753	PHA02753	hypothetical protein; Provisional	298
165121	PHA02754	PHA02754	hypothetical protein; Provisional	67
165122	PHA02755	PHA02755	hypothetical protein; Provisional	96
165123	PHA02756	PHA02756	hypothetical protein; Provisional	164
165124	PHA02757	PHA02757	hypothetical protein; Provisional	75
165125	PHA02758	PHA02758	hypothetical protein; Provisional	321
165126	PHA02759	PHA02759	virus coat protein VP2; Provisional	245
165127	PHA02762	PHA02762	hypothetical protein; Provisional	62
177487	PHA02763	PHA02763	hypothetical protein; Provisional	102
165129	PHA02764	PHA02764	hypothetical protein; Provisional	399
165130	PHA02765	PHA02765	hypothetical protein; Provisional	117
165131	PHA02766	PHA02766	hypothetical protein; Provisional	73
165132	PHA02767	PHA02767	hypothetical protein; Provisional	101
165133	PHA02768	PHA02768	hypothetical protein; Provisional	55
165134	PHA02769	PHA02769	hypothetical protein; Provisional	154
165135	PHA02770	PHA02770	hypothetical protein; Provisional	81
165136	PHA02771	PHA02771	hypothetical protein; Provisional	90
165137	PHA02772	PHA02772	hypothetical protein; Provisional	95
165138	PHA02773	PHA02773	hypothetical protein; Provisional	112
222927	PHA02774	PHA02774	E1; Provisional	613
165140	PHA02775	PHA02775	E6; Provisional	160
165141	PHA02776	PHA02776	E7 protein; Provisional	101
165142	PHA02777	PHA02777	major capsid L1 protein; Provisional	555
222928	PHA02778	PHA02778	major capsid L1 protein; Provisional	503
222929	PHA02779	PHA02779	E6 protein; Provisional	150
177490	PHA02780	PHA02780	hypothetical protein; Provisional	73
165146	PHA02781	PHA02781	hypothetical protein; Provisional	78
165147	PHA02782	PHA02782	hypothetical protein; Provisional	503
165148	PHA02783	PHA02783	uncharacterized protein; Provisional	181
165149	PHA02785	PHA02785	IL-beta-binding protein; Provisional	326
222930	PHA02786	PHA02786	uncharacterized protein; Provisional	192
165152	PHA02789	PHA02789	uncharacterized protein; Provisional	173
165153	PHA02790	PHA02790	Kelch-like protein; Provisional	480
165154	PHA02791	PHA02791	ankyrin-like protein; Provisional	284
165155	PHA02792	PHA02792	ankyrin-like protein; Provisional	631
165156	PHA02793	PHA02793	hypothetical protein; Provisional	66
165157	PHA02795	PHA02795	ankyrin-like protein; Provisional	437
222931	PHA02798	PHA02798	ankyrin-like protein; Provisional	489
165159	PHA02800	PHA02800	hypothetical protein; Provisional	161
165161	PHA02807	PHA02807	hypothetical protein; Provisional	155
222932	PHA02809	PHA02809	hypothetical protein; Provisional	111
165163	PHA02811	PHA02811	putative host range protein; Provisional	197
165164	PHA02813	PHA02813	hypothetical protein; Provisional	354
165165	PHA02815	PHA02815	hypothetical protein; Provisional	64
222933	PHA02816	PHA02816	hypothetical protein; Provisional	106
165167	PHA02817	PHA02817	EEV Host range protein; Provisional	225
165168	PHA02818	PHA02818	hypothetical protein; Provisional	92
165169	PHA02819	PHA02819	hypothetical protein; Provisional	71
222934	PHA02820	PHA02820	phospholipase-D-like protein; Provisional	424
222935	PHA02823	PHA02823	chemokine binding protein; Provisional	255
177491	PHA02825	PHA02825	LAP/PHD finger-like protein; Provisional	162
165173	PHA02826	PHA02826	IL-1 receptor-like protein; Provisional	227
177492	PHA02827	PHA02827	hypothetical protein; Provisional	150
165175	PHA02828	PHA02828	putative transmembrane protein; Provisional	100
165176	PHA02831	PHA02831	EEV host range protein; Provisional	268
165177	PHA02834	PHA02834	chemokine receptor-like protein; Provisional	323
165178	PHA02835	PHA02835	putative secreted protein; Provisional	186
165179	PHA02836	PHA02836	putative transmembrane protein; Provisional	153
165180	PHA02837	PHA02837	uncharacterized protein; Provisional	190
165181	PHA02838	PHA02838	hypothetical protein; Provisional	68
165182	PHA02839	PHA02839	IL-24-like protein; Provisional	156
165183	PHA02840	PHA02840	hypothetical protein; Provisional	82
165184	PHA02841	PHA02841	hypothetical protein; Provisional	103
165185	PHA02843	PHA02843	hypothetical protein; Provisional	73
165186	PHA02844	PHA02844	putative transmembrane protein; Provisional	75
165187	PHA02845	PHA02845	hypothetical protein; Provisional	91
165188	PHA02849	PHA02849	putative transmembrane protein; Provisional	82
165189	PHA02851	PHA02851	EEV glycoprotein; Provisional	223
165190	PHA02852	PHA02852	putative virion structural protein; Provisional	153
165191	PHA02854	PHA02854	putative host range protein; Provisional	178
222936	PHA02855	PHA02855	anti-apoptotic membrane protein; Provisional	180
165193	PHA02857	PHA02857	monoglyceride lipase; Provisional	276
165194	PHA02858	PHA02858	EIF2a-like PKR inhibitor; Provisional	86
165195	PHA02859	PHA02859	ankyrin repeat protein; Provisional	209
165196	PHA02861	PHA02861	uncharacterized protein; Provisional	149
165197	PHA02862	PHA02862	5L protein; Provisional	156
222937	PHA02864	PHA02864	hypothetical protein; Provisional	240
165199	PHA02865	PHA02865	MHC-like TNF binding protein; Provisional	338
165200	PHA02866	PHA02866	Hypothetical protein; Provisional	333
165201	PHA02867	PHA02867	C-type lectin protein; Provisional	167
165202	PHA02869	PHA02869	C4L/C10L-like gene family protein; Provisional	418
165203	PHA02871	PHA02871	hypothetical protein; Provisional	222
222938	PHA02872	PHA02872	EFc gene family protein; Provisional	124
165205	PHA02874	PHA02874	ankyrin repeat protein; Provisional	434
165206	PHA02875	PHA02875	ankyrin repeat protein; Provisional	413
165207	PHA02876	PHA02876	ankyrin repeat protein; Provisional	682
222939	PHA02878	PHA02878	ankyrin repeat protein; Provisional	477
222940	PHA02880	PHA02880	hypothetical protein; Provisional	189
165210	PHA02881	PHA02881	hypothetical protein; Provisional	161
165211	PHA02882	PHA02882	putative serine/threonine kinase; Provisional	294
165212	PHA02884	PHA02884	ankyrin repeat protein; Provisional	300
165213	PHA02885	PHA02885	putative interleukin binding protein; Provisional	135
165214	PHA02887	PHA02887	EGF-like protein; Provisional	126
165215	PHA02888	PHA02888	hypothetical protein; Provisional	96
165216	PHA02889	PHA02889	hypothetical protein; Provisional	241
165217	PHA02890	PHA02890	hypothetical protein; Provisional	278
165218	PHA02891	PHA02891	hypothetical protein; Provisional	120
165219	PHA02892	PHA02892	hypothetical protein; Provisional	75
165220	PHA02893	PHA02893	hypothetical protein; Provisional	88
165221	PHA02894	PHA02894	hypothetical protein; Provisional	97
165222	PHA02896	PHA02896	A-type inclusion like protein; Provisional	616
165223	PHA02898	PHA02898	virion envelope protein; Provisional	92
222941	PHA02901	PHA02901	virus redox protein; Provisional	75
165225	PHA02902	PHA02902	putative IMV membrane protein; Provisional	70
165226	PHA02907	PHA02907	hypothetical protein; Provisional	182
165227	PHA02909	PHA02909	hypothetical protein; Provisional	72
165228	PHA02910	PHA02910	hypothetical protein; Provisional	171
177496	PHA02911	PHA02911	C-type lectin-like protein; Provisional	213
177497	PHA02913	PHA02913	TGF-beta-like protein; Provisional	172
165230	PHA02914	PHA02914	Immunoglobulin-like domain protein; Provisional	500
165231	PHA02917	PHA02917	ankyrin-like protein; Provisional	661
165232	PHA02919	PHA02919	host-range protein; Provisional	150
165233	PHA02920	PHA02920	putative virulence factor; Provisional	117
165234	PHA02922	PHA02922	hypothetical protein; Provisional	153
165235	PHA02923	PHA02923	hypothetical protein; Provisional	315
222942	PHA02924	PHA02924	hypothetical protein; Provisional	156
165237	PHA02926	PHA02926	zinc finger-like protein; Provisional	242
222943	PHA02927	PHA02927	secreted complement-binding protein; Provisional	263
165239	PHA02928	PHA02928	Hypothetical protein; Provisional	214
222944	PHA02929	PHA02929	N1R/p28-like protein; Provisional	238
165241	PHA02930	PHA02930	hypothetical protein; Provisional	81
165242	PHA02931	PHA02931	hypothetical protein; Provisional	72
222945	PHA02932	PHA02932	hypothetical protein; Provisional	221
165244	PHA02933	PHA02933	uncharacterized protein; Provisional	149
165245	PHA02934	PHA02934	Hypothetical protein; Provisional	253
222946	PHA02935	PHA02935	Hypothetical protein; Provisional	349
165247	PHA02937	PHA02937	hypothetical protein; Provisional	310
165248	PHA02938	PHA02938	hypothetical protein; Provisional	361
222947	PHA02939	PHA02939	hypothetical protein; Provisional	144
165250	PHA02940	PHA02940	hypothetical protein; Provisional	315
222948	PHA02941	PHA02941	hypothetical protein; Provisional	356
165252	PHA02942	PHA02942	putative transposase; Provisional	383
165253	PHA02943	PHA02943	hypothetical protein; Provisional	165
165254	PHA02944	PHA02944	hypothetical protein; Provisional	180
165255	PHA02945	PHA02945	interferon resistance protein; Provisional	88
165256	PHA02946	PHA02946	ankyrin-like protein; Provisional	446
222949	PHA02947	PHA02947	S-S bond formation pathway protein; Provisional	215
165258	PHA02948	PHA02948	serine protease inhibitor-like protein; Provisional	373
165259	PHA02949	PHA02949	Hypothetical protein; Provisional	65
177499	PHA02951	PHA02951	Hypothetical protein; Provisional	337
222950	PHA02952	PHA02952	EEV maturation protein; Provisional	648
165262	PHA02953	PHA02953	IEV and EEV membrane glycoprotein; Provisional	170
165263	PHA02954	PHA02954	EEV membrane glycoprotein; Provisional	317
165264	PHA02955	PHA02955	hypothetical protein; Provisional	213
165265	PHA02956	PHA02956	hypothetical protein; Provisional	189
165266	PHA02957	PHA02957	hypothetical protein; Provisional	206
165267	PHA02961	PHA02961	hypothetical protein; Provisional	658
165268	PHA02962	PHA02962	hypothetical protein; Provisional	722
165269	PHA02963	PHA02963	hypothetical protein; Provisional	210
165270	PHA02965	PHA02965	hypothetical protein; Provisional	466
165271	PHA02966	PHA02966	hypothetical protein; Provisional	67
165272	PHA02967	PHA02967	hypothetical protein; Provisional	128
165273	PHA02968	PHA02968	hypothetical protein; Provisional	414
165274	PHA02969	PHA02969	hypothetical protein; Provisional	111
222951	PHA02970	PHA02970	hypothetical protein; Provisional	115
165276	PHA02972	PHA02972	hypothetical protein; Provisional	109
165277	PHA02973	PHA02973	hypothetical protein; Provisional	102
165278	PHA02974	PHA02974	putative IMV membrane protein; Provisional	81
165279	PHA02975	PHA02975	hypothetical protein; Provisional	69
165280	PHA02976	PHA02976	hypothetical protein; Provisional	181
165281	PHA02977	PHA02977	hypothetical protein; Provisional	201
165282	PHA02978	PHA02978	hypothetical protein; Provisional	135
165283	PHA02979	PHA02979	hypothetical protein; Provisional	140
165284	PHA02980	PHA02980	hypothetical protein; Provisional	160
165285	PHA02982	PHA02982	hypothetical protein; Provisional	251
222952	PHA02983	PHA02983	hypothetical protein; Provisional	180
165287	PHA02984	PHA02984	hypothetical protein; Provisional	286
165288	PHA02985	PHA02985	hypothetical protein; Provisional	271
222953	PHA02986	PHA02986	hypothetical protein; Provisional	141
165290	PHA02987	PHA02987	Ig domain OX-2-like protein; Provisional	189
165291	PHA02988	PHA02988	hypothetical protein; Provisional	283
222954	PHA02989	PHA02989	ankyrin repeat protein; Provisional	494
222955	PHA02991	PHA02991	HT motif gene family protein; Provisional	120
222956	PHA02992	PHA02992	hypothetical protein; Provisional	728
165295	PHA02993	PHA02993	hypothetical protein; Provisional	147
222957	PHA02994	PHA02994	hypothetical protein; Provisional	218
165297	PHA02995	PHA02995	DNA-binding virion core protein; Provisional	101
177503	PHA02996	PHA02996	poly(A) polymerase large subunit; Provisional	467
222958	PHA02998	PHA02998	RNA polymerase subunit; Provisional	195
222959	PHA02999	PHA02999	Hypothetical protein; Provisional	382
177505	PHA03000	PHA03000	Hypothetical protein; Provisional	566
222960	PHA03001	PHA03001	putative virion core protein; Provisional	132
165303	PHA03002	PHA03002	Hypothetical protein; Provisional	679
177506	PHA03003	PHA03003	palmitoylated EEV membrane glycoprotein; Provisional	369
177507	PHA03004	PHA03004	putative membrane protein; Provisional	270
222961	PHA03005	PHA03005	sulfhydryl oxidase; Provisional	96
165307	PHA03006	PHA03006	hypothetical protein; Provisional	323
165308	PHA03007	PHA03007	hypothetical protein; Provisional	540
165309	PHA03008	PHA03008	hypothetical protein; Provisional	234
165310	PHA03010	PHA03010	hypothetical protein; Provisional	546
165311	PHA03011	PHA03011	hypothetical protein; Provisional	120
165312	PHA03012	PHA03012	hypothetical protein; Provisional	279
165313	PHA03013	PHA03013	hypothetical protein; Provisional	109
165314	PHA03014	PHA03014	hypothetical protein; Provisional	163
165315	PHA03016	PHA03016	hypothetical protein; Provisional	441
165316	PHA03017	PHA03017	hypothetical protein; Provisional	228
165317	PHA03018	PHA03018	hypothetical protein; Provisional	174
165318	PHA03019	PHA03019	hypothetical protein; Provisional	77
165319	PHA03020	PHA03020	hypothetical protein; Provisional	352
165320	PHA03022	PHA03022	hypothetical protein; Provisional	335
165321	PHA03023	PHA03023	hypothetical protein; Provisional	112
165322	PHA03024	PHA03024	hypothetical protein; Provisional	229
165323	PHA03025	PHA03025	hypothetical protein; Provisional	68
165324	PHA03026	PHA03026	hypothetical protein; Provisional	421
165325	PHA03027	PHA03027	hypothetical protein; Provisional	325
165326	PHA03028	PHA03028	hypothetical protein; Provisional	185
165327	PHA03029	PHA03029	hypothetical protein; Provisional	92
165328	PHA03030	PHA03030	hypothetical protein; Provisional	122
165329	PHA03031	PHA03031	hypothetical protein; Provisional	449
165330	PHA03033	PHA03033	hypothetical protein; Provisional	142
165331	PHA03034	PHA03034	hypothetical protein; Provisional	145
165332	PHA03035	PHA03035	hypothetical protein; Provisional	158
222962	PHA03036	PHA03036	DNA polymerase; Provisional	1004
222963	PHA03041	PHA03041	virion core protein; Provisional	153
222964	PHA03042	PHA03042	CD47-like protein; Provisional	286
165336	PHA03043	PHA03043	hypothetical protein; Provisional	130
165337	PHA03044	PHA03044	IMV membrane protein; Provisional	74
177510	PHA03045	PHA03045	IMV membrane protein; Provisional	113
165339	PHA03046	PHA03046	Hypothetical protein; Provisional	142
165340	PHA03047	PHA03047	IMV membrane receptor-like protein; Provisional	53
165341	PHA03048	PHA03048	IMV membrane protein; Provisional	93
165342	PHA03049	PHA03049	IMV membrane protein; Provisional	68
165343	PHA03050	PHA03050	glutaredoxin; Provisional	108
165344	PHA03051	PHA03051	Hypothetical protein; Provisional	88
165345	PHA03052	PHA03052	Hypothetical protein; Provisional	69
165346	PHA03054	PHA03054	IMV membrane protein; Provisional	72
165347	PHA03055	PHA03055	Hypothetical protein; Provisional	79
165348	PHA03056	PHA03056	putative myristoylated protein; Provisional	165
222965	PHA03057	PHA03057	Hypothetical protein; Provisional	146
222966	PHA03058	PHA03058	Hypothetical protein; Provisional	124
222967	PHA03060	PHA03060	Hypothetical protein; Provisional	71
177511	PHA03061	PHA03061	putative DNA-binding virion core protein; Provisional	311
177512	PHA03062	PHA03062	putative IMV membrane protein; Provisional	78
222968	PHA03065	PHA03065	Hypothetical protein; Provisional	438
165355	PHA03066	PHA03066	Hypothetical protein; Provisional	110
222969	PHA03067	PHA03067	hypothetical protein; Provisional	383
177515	PHA03068	PHA03068	DNA-binding phosphoprotein; Provisional	270
165358	PHA03069	PHA03069	DNA-binding protein; Provisional	119
177516	PHA03070	PHA03070	DNA-binding virion core protein; Provisional	249
165360	PHA03071	PHA03071	late transcription factor VLTF-1; Provisional	260
222970	PHA03072	PHA03072	putative viral membrane protein; Provisional	190
177518	PHA03073	PHA03073	late transcription factor VLTF-2; Provisional	150
165363	PHA03074	PHA03074	late transcription factor VLTF-3; Provisional	225
177519	PHA03075	PHA03075	glutaredoxin-like protein; Provisional	123
222971	PHA03078	PHA03078	transcriptional elongation factor; Provisional	219
165366	PHA03079	PHA03079	hypothetical protein; Provisional	87
222972	PHA03080	PHA03080	putative virion core protein; Provisional	366
222973	PHA03081	PHA03081	putative metalloprotease; Provisional	595
222974	PHA03082	PHA03082	DNA-dependent RNA polymerase subunit; Provisional	63
222975	PHA03083	PHA03083	poxvirus myristoylprotein; Provisional	334
222976	PHA03087	PHA03087	G protein-coupled chemokine receptor-like protein; Provisional	335
222977	PHA03089	PHA03089	late transcription factor VLTF-4; Provisional	191
222978	PHA03091	PHA03091	putative alpha-amanitin-sensitive protein; Provisional	232
165374	PHA03092	PHA03092	semaphorin-like protein; Provisional	134
222979	PHA03093	PHA03093	EEV glycoprotein; Provisional	185
165376	PHA03094	PHA03094	dUTPase; Provisional	144
222980	PHA03095	PHA03095	ankyrin-like protein; Provisional	471
222981	PHA03096	PHA03096	p28-like protein; Provisional	284
222982	PHA03097	PHA03097	C-type lectin-like protein; Provisional	157
222983	PHA03098	PHA03098	kelch-like protein; Provisional	534
165381	PHA03099	PHA03099	epidermal growth factor-like protein (EGF-like protein); Provisional	139
222984	PHA03100	PHA03100	ankyrin repeat protein; Provisional	422
222985	PHA03101	PHA03101	DNA topoisomerase type I; Provisional	314
222986	PHA03102	PHA03102	Small T antigen; Reviewed	153
222987	PHA03103	PHA03103	double-strand RNA-binding protein; Provisional	183
222988	PHA03105	PHA03105	EEV glycoprotein; Provisional	188
165387	PHA03108	PHA03108	poly(A) polymerase small subunit; Provisional	300
222989	PHA03111	PHA03111	Ser/Thr kinase; Provisional	444
222990	PHA03112	PHA03112	IL-18 binding protein; Provisional	141
177532	PHA03115	PHA03115	hypothetical protein; Provisional	340
165391	PHA03118	PHA03118	multifunctional expression regulator; Provisional	474
222991	PHA03119	PHA03119	helicase-primase primase subunit; Provisional	1085
165393	PHA03120	PHA03120	tegument protein VP22; Provisional	310
165395	PHA03123	PHA03123	dUTPase; Provisional	402
165396	PHA03124	PHA03124	dUTPase; Provisional	418
222992	PHA03125	PHA03125	dUTPase; Provisional	376
165398	PHA03126	PHA03126	dUTPase; Provisional	326
222993	PHA03127	PHA03127	dUTPase; Provisional	322
165400	PHA03128	PHA03128	dUTPase; Provisional	376
222994	PHA03129	PHA03129	dUTPase; Provisional	436
222995	PHA03130	PHA03130	dUTPase; Provisional	368
222996	PHA03131	PHA03131	dUTPase; Provisional	286
222997	PHA03132	PHA03132	thymidine kinase; Provisional	580
165405	PHA03133	PHA03133	thymidine kinase; Provisional	368
177537	PHA03134	PHA03134	thymidine kinase; Provisional	340
165407	PHA03135	PHA03135	thymidine kinase; Provisional	343
177538	PHA03136	PHA03136	thymidine kinase; Provisional	378
165410	PHA03138	PHA03138	thymidine kinase; Provisional	340
165411	PHA03139	PHA03139	helicase-primase primase subunit; Provisional	860
222998	PHA03140	PHA03140	helicase-primase primase subunit; Provisional	772
177540	PHA03141	PHA03141	helicase-primase primase subunit; Provisional	101
222999	PHA03142	PHA03142	helicase-primase primase subunit BSLF1; Provisional	835
223000	PHA03144	PHA03144	helicase-primase primase subunit; Provisional	746
165416	PHA03145	PHA03145	helicase-primase primase subunit; Provisional	1058
177543	PHA03146	PHA03146	helicase-primase primase subunit; Provisional	1075
165418	PHA03147	PHA03147	hypothetical protein; Provisional	280
223001	PHA03148	PHA03148	hypothetical protein; Provisional	289
165420	PHA03149	PHA03149	hypothetical protein; Provisional	66
223002	PHA03150	PHA03150	hypothetical protein; Provisional	456
177546	PHA03151	PHA03151	hypothetical protein; Provisional	259
165423	PHA03152	PHA03152	hypothetical protein; Provisional	138
165425	PHA03154	PHA03154	hypothetical protein; Provisional	304
165426	PHA03155	PHA03155	hypothetical protein; Provisional	115
165427	PHA03156	PHA03156	hypothetical protein; Provisional	90
165429	PHA03158	PHA03158	hypothetical protein; Provisional	273
165430	PHA03159	PHA03159	hypothetical protein; Provisional	160
165431	PHA03160	PHA03160	hypothetical protein; Provisional	499
165432	PHA03161	PHA03161	hypothetical protein; Provisional	150
165433	PHA03162	PHA03162	hypothetical protein; Provisional	135
165434	PHA03163	PHA03163	hypothetical protein; Provisional	92
177547	PHA03164	PHA03164	hypothetical protein; Provisional	88
165436	PHA03165	PHA03165	hypothetical protein; Provisional	57
177548	PHA03166	PHA03166	hypothetical protein; Provisional	580
223003	PHA03169	PHA03169	hypothetical protein; Provisional	413
165441	PHA03170	PHA03170	UL37 tegument protein; Provisional	293
165442	PHA03171	PHA03171	UL37 tegument protein; Provisional	499
165443	PHA03172	PHA03172	UL37 tegument protein; Provisional	951
223004	PHA03173	PHA03173	UL37 tegument protein; Provisional	1028
177551	PHA03175	PHA03175	UL43 envelope protein; Provisional	413
223005	PHA03176	PHA03176	UL43 envelope protein; Provisional	420
177552	PHA03178	PHA03178	UL43 envelope protein; Provisional	403
223006	PHA03179	PHA03179	UL43 envelope protein; Provisional	387
165451	PHA03180	PHA03180	helicase-primase primase subunit; Provisional	1071
165452	PHA03181	PHA03181	helicase-primase primase subunit; Provisional	764
177553	PHA03185	PHA03185	UL14 tegument protein; Provisional	214
223007	PHA03187	PHA03187	UL14 tegument protein; Provisional	322
165458	PHA03188	PHA03188	UL14 tegument protein; Provisional	199
223008	PHA03189	PHA03189	UL14 tegument protein; Provisional	348
165460	PHA03190	PHA03190	UL14 tegument protein; Provisional	196
165461	PHA03191	PHA03191	UL14 tegument protein; Provisional	238
177555	PHA03193	PHA03193	tegument protein VP11/12; Provisional	594
177556	PHA03195	PHA03195	tegument protein VP11/12; Provisional	746
165466	PHA03199	PHA03199	uracil DNA glycosylase; Provisional	304
165467	PHA03200	PHA03200	uracil DNA glycosylase; Provisional	255
165468	PHA03201	PHA03201	uracil DNA glycosylase; Provisional	318
165469	PHA03202	PHA03202	uracil DNA glycosylase; Provisional	313
165471	PHA03204	PHA03204	uracil DNA glycosylase; Provisional	322
165473	PHA03207	PHA03207	serine/threonine kinase US3; Provisional	392
177557	PHA03209	PHA03209	serine/threonine kinase US3; Provisional	357
165476	PHA03210	PHA03210	serine/threonine kinase US3; Provisional	501
223009	PHA03211	PHA03211	serine/threonine kinase US3; Provisional	461
165478	PHA03212	PHA03212	serine/threonine kinase US3; Provisional	391
165479	PHA03214	PHA03214	nuclear protein UL24; Provisional	252
223010	PHA03215	PHA03215	nuclear protein UL24; Provisional	262
177558	PHA03216	PHA03216	nuclear protein UL24; Provisional	272
223011	PHA03218	PHA03218	nuclear protein UL24; Provisional	306
165484	PHA03219	PHA03219	nuclear protein UL24; Provisional	300
165485	PHA03222	PHA03222	single-stranded binding protein UL29; Provisional	337
165486	PHA03225	PHA03225	DNA packaging protein UL33; Provisional	125
223012	PHA03229	PHA03229	DNA packaging protein UL33; Provisional	132
223013	PHA03230	PHA03230	nuclear protein UL55; Provisional	180
223014	PHA03231	PHA03231	glycoprotein BALF4; Provisional	829
223015	PHA03232	PHA03232	DNA packaging protein UL32; Provisional	586
223016	PHA03233	PHA03233	DNA packaging protein UL32; Provisional	518
177562	PHA03234	PHA03234	DNA packaging protein UL33; Provisional	338
223017	PHA03235	PHA03235	DNA packaging protein UL33; Provisional	409
223018	PHA03236	PHA03236	DNA packaging protein UL33; Provisional	127
223019	PHA03237	PHA03237	envelope glycoprotein M; Provisional	424
177565	PHA03239	PHA03239	envelope glycoprotein M; Provisional	429
165499	PHA03240	PHA03240	envelope glycoprotein M; Provisional	258
177566	PHA03242	PHA03242	envelope glycoprotein M; Provisional	428
177567	PHA03244	PHA03244	large tegument protein UL36; Provisional	478
223020	PHA03246	PHA03246	large tegument protein UL36; Provisional	3095
223021	PHA03247	PHA03247	large tegument protein UL36; Provisional	3151
223022	PHA03248	PHA03248	DNA packaging tegument protein UL25; Provisional	583
223023	PHA03249	PHA03249	DNA packaging tegument protein UL25; Provisional	653
165509	PHA03250	PHA03250	UL35; Provisional	564
223024	PHA03252	PHA03252	DNA packaging tegument protein UL25; Provisional	589
223025	PHA03253	PHA03253	UL35; Provisional	609
165513	PHA03255	PHA03255	BDLF3; Provisional	234
165514	PHA03256	PHA03256	BDLF3; Provisional	77
177569	PHA03257	PHA03257	Capsid triplex subunit 2; Provisional	316
165516	PHA03258	PHA03258	Capsid triplex subunit 2; Provisional	304
165517	PHA03259	PHA03259	Capsid triplex subunit 2; Provisional	302
165518	PHA03260	PHA03260	Capsid triplex subunit 2; Provisional	339
223026	PHA03261	PHA03261	Capsid triplex subunit 1; Provisional	469
223027	PHA03262	PHA03262	Capsid triplex subunit 1; Provisional	264
223028	PHA03263	PHA03263	Capsid triplex subunit 1; Provisional	332
223029	PHA03264	PHA03264	envelope glycoprotein D; Provisional	416
165523	PHA03265	PHA03265	envelope glycoprotein D; Provisional	402
165527	PHA03269	PHA03269	envelope glycoprotein C; Provisional	566
165528	PHA03270	PHA03270	envelope glycoprotein C; Provisional	466
223030	PHA03271	PHA03271	envelope glycoprotein C; Provisional	490
223031	PHA03273	PHA03273	envelope glycoprotein C; Provisional	486
177573	PHA03275	PHA03275	envelope glycoprotein K; Provisional	340
165533	PHA03276	PHA03276	envelope glycoprotein K; Provisional	337
177574	PHA03278	PHA03278	envelope glycoprotein K; Provisional	347
165536	PHA03279	PHA03279	envelope glycoprotein K; Provisional	361
165538	PHA03281	PHA03281	envelope glycoprotein E; Provisional	642
165539	PHA03282	PHA03282	envelope glycoprotein E; Provisional	540
223032	PHA03283	PHA03283	envelope glycoprotein E; Provisional	542
177576	PHA03286	PHA03286	envelope glycoprotein E; Provisional	492
165546	PHA03289	PHA03289	envelope glycoprotein I; Provisional	352
165547	PHA03290	PHA03290	envelope glycoprotein I; Provisional	357
223033	PHA03291	PHA03291	envelope glycoprotein I; Provisional	401
177577	PHA03292	PHA03292	envelope glycoprotein I; Provisional	413
223034	PHA03293	PHA03293	deoxyribonuclease; Provisional	523
223035	PHA03294	PHA03294	envelope glycoprotein H; Provisional	835
223036	PHA03295	PHA03295	envelope glycoprotein H; Provisional	714
165553	PHA03296	PHA03296	envelope glycoprotein H; Provisional	814
165554	PHA03297	PHA03297	envelope glycoprotein L; Provisional	185
165555	PHA03298	PHA03298	envelope glycoprotein L; Provisional	167
165556	PHA03299	PHA03299	envelope glycoprotein L; Provisional	195
223037	PHA03301	PHA03301	envelope glycoprotein L; Provisional	226
223038	PHA03302	PHA03302	envelope glycoprotein L; Provisional	253
165560	PHA03303	PHA03303	envelope glycoprotein L; Provisional	159
223039	PHA03307	PHA03307	transcriptional regulator ICP4; Provisional	1352
165563	PHA03308	PHA03308	transcriptional regulator ICP4; Provisional	1463
165564	PHA03309	PHA03309	transcriptional regulator ICP4; Provisional	2033
223040	PHA03311	PHA03311	helicase-primase subunit BBLF4; Provisional	782
177582	PHA03312	PHA03312	helicase-primase subunit BBLF2/3; Provisional	709
223041	PHA03321	PHA03321	tegument protein VP11/12; Provisional	694
223042	PHA03322	PHA03322	tegument protein VP11/12; Provisional	674
223043	PHA03323	PHA03323	nuclear egress membrane protein UL34; Provisional	272
165570	PHA03324	PHA03324	nuclear egress membrane protein UL34; Provisional	274
223044	PHA03325	PHA03325	nuclear-egress-membrane-like protein; Provisional	418
223045	PHA03326	PHA03326	nuclear egress membrane protein; Provisional	275
223046	PHA03328	PHA03328	nuclear egress lamina protein UL31; Provisional	316
165574	PHA03330	PHA03330	putative primase; Provisional	771
223047	PHA03332	PHA03332	membrane glycoprotein; Provisional	1328
223048	PHA03333	PHA03333	putative ATPase subunit of terminase; Provisional	752
223049	PHA03334	PHA03334	putative DNA polymerase catalytic subunit; Provisional	1545
223050	PHA03335	PHA03335	hypothetical protein; Provisional	385
223051	PHA03336	PHA03336	uncharacterized protein; Provisional	462
165582	PHA03338	PHA03338	US22 family homolog; Provisional	344
165586	PHA03342	PHA03342	US22 family homolog; Provisional	511
165587	PHA03343	PHA03343	US22 family homolog; Provisional	578
165588	PHA03344	PHA03344	US22 family homolog; Provisional	672
223052	PHA03346	PHA03346	US22 family homolog; Provisional	520
177588	PHA03347	PHA03347	uracil DNA glycosylase; Provisional	252
177589	PHA03348	PHA03348	tegument protein UL21; Provisional	526
177590	PHA03349	PHA03349	tegument protein UL16; Provisional	343
177591	PHA03351	PHA03351	tegument protein UL16; Provisional	235
223053	PHA03352	PHA03352	tegument protein UL16; Provisional	340
177593	PHA03354	PHA03354	Alkaline exonuclease; Provisional	81
177594	PHA03356	PHA03356	tegument protein UL11; Provisional	93
177595	PHA03357	PHA03357	Alkaline exonuclease; Provisional	81
177596	PHA03358	PHA03358	Alkaline exonuclease; Provisional	75
223054	PHA03359	PHA03359	UL17 tegument protein; Provisional	686
177598	PHA03360	PHA03360	tegument protein; Provisional	442
223055	PHA03361	PHA03361	UL7 tegument protein; Provisional	302
223056	PHA03362	PHA03362	single-stranded binding protein UL29; Provisional	1189
223057	PHA03364	PHA03364	hypothetical protein; Provisional	264
177602	PHA03365	PHA03365	hypothetical protein; Provisional	419
223058	PHA03366	PHA03366	FGAM-synthase; Provisional	1304
223059	PHA03367	PHA03367	single-stranded DNA binding protein; Provisional	1115
223060	PHA03368	PHA03368	DNA packaging terminase subunit 1; Provisional	738
223061	PHA03369	PHA03369	capsid maturational protease; Provisional	663
177607	PHA03370	PHA03370	virion protein US2; Provisional	269
177608	PHA03371	PHA03371	circ protein; Provisional	240
177609	PHA03372	PHA03372	DNA packaging terminase subunit 1; Provisional	668
223062	PHA03373	PHA03373	tegument protein; Provisional	247
223063	PHA03374	PHA03374	hypothetical protein; Provisional	730
223064	PHA03375	PHA03375	hypothetical protein; Provisional	844
177613	PHA03376	PHA03376	BARF1; Provisional	221
177614	PHA03377	PHA03377	EBNA-3C; Provisional	1000
223065	PHA03378	PHA03378	EBNA-3B; Provisional	991
223066	PHA03379	PHA03379	EBNA-3A; Provisional	935
223067	PHA03380	PHA03380	transactivating tegument protein VP16; Provisional	432
177618	PHA03381	PHA03381	tegument protein VP22; Provisional	290
177619	PHA03383	PHA03383	PCNA-like protein; Provisional	262
223068	PHA03384	PHA03384	early DNA-binding protein E2A; Provisional	445
177621	PHA03385	IX	capsid protein IX, hexon associated protein IX; Provisional	135
177622	PHA03386	P10	fibrous body protein; Provisional	94
177623	PHA03387	gp37	spherodin-like protein; Provisional	267
177624	PHA03388	ORF1_granulin	Granulin; Provisional	248
177625	PHA03389	polh	polyhedrin; Provisional	246
223069	PHA03390	pk1	serine/threonine-protein kinase 1; Provisional	267
223070	PHA03391	p47	viral transcription regulator p47; Provisional	395
223071	PHA03392	egt	ecdysteroid UDP-glucosyltransferase; Provisional	507
223072	PHA03393	odv-e66	occlusion-derived virus envelope protein E66; Provisional	682
223073	PHA03394	lef-8	DNA-directed RNA polymerase subunit beta-like protein; Provisional	865
177631	PHA03395	p10	fibrous body protein; Provisional	87
223074	PHA03396	lef-9	late expression factor 9; Provisional	493
177633	PHA03397	vlf-1	very late expression factor 1; Provisional	363
223075	PHA03398	PHA03398	viral phosphatase superfamily protein; Provisional	303
223076	PHA03399	pif3	per os infectivity factor 3; Provisional	200
223077	PHA03402	PHA03402	hypothetical protein; Provisional	81
177637	PHA03405	PHA03405	hypothetical protein; Provisional	130
223078	PHA03410	PHA03410	hypothetical protein; Provisional	170
177639	PHA03411	PHA03411	putative methyltransferase; Provisional	279
177640	PHA03412	PHA03412	putative methyltransferase; Provisional	241
177641	PHA03413	PHA03413	putative internal core protein; Provisional	1304
177642	PHA03414	PHA03414	virion protein; Provisional	1337
177643	PHA03415	PHA03415	putative internal virion protein; Provisional	1019
177644	PHA03416	PHA03416	hypothetical E4 protein; Provisional	92
177645	PHA03417	PHA03417	E4 protein; Provisional	118
177646	PHA03418	PHA03418	hypothetical E4 protein; Provisional	230
223079	PHA03419	PHA03419	E4 protein; Provisional	200
177648	PHA03420	PHA03420	E4 protein; Provisional	137
177649	PLN00009	PLN00009	cyclin-dependent kinase A; Provisional	294
215027	PLN00010	PLN00010	cyclin-dependent kinases regulatory subunit; Provisional	86
177651	PLN00011	PLN00011	cysteine synthase	323
215028	PLN00012	PLN00012	chlorophyll synthetase; Provisional	375
177653	PLN00014	PLN00014	light-harvesting-like protein 3; Provisional	250
177654	PLN00015	PLN00015	protochlorophyllide reductase	308
215029	PLN00016	PLN00016	RNA-binding protein; Provisional	378
177656	PLN00017	PLN00017	photosystem I reaction centre subunit VI; Provisional	90
215030	PLN00019	PLN00019	photosystem I reaction center subunit III; Provisional	223
215031	PLN00020	PLN00020	ribulose bisphosphate carboxylase/oxygenase activase -RuBisCO activase (RCA); Provisional	413
177659	PLN00021	PLN00021	chlorophyllase	313
215032	PLN00022	PLN00022	electron transfer flavoprotein subunit alpha; Provisional	356
177661	PLN00023	PLN00023	GTP-binding protein; Provisional	334
215033	PLN00025	PLN00025	photosystem II light harvesting chlorophyll a/b binding protein; Provisional	262
177663	PLN00026	PLN00026	aquaporin NIP; Provisional	298
177664	PLN00027	PLN00027	aquaporin TIP; Provisional	252
177665	PLN00028	PLN00028	nitrate transmembrane transporter; Provisional	476
215034	PLN00032	PLN00032	DNA-directed RNA polymerase; Provisional	71
215035	PLN00033	PLN00033	photosystem II stability/assembly factor; Provisional	398
215036	PLN00034	PLN00034	mitogen-activated protein kinase kinase; Provisional	353
177669	PLN00035	PLN00035	histone H4; Provisional	103
177670	PLN00036	PLN00036	40S ribosomal protein S4; Provisional	261
177671	PLN00037	PLN00037	photosystem II oxygen-evolving enhancer protein 1; Provisional	313
215037	PLN00038	PLN00038	photosystem I reaction center subunit XI (PsaL); Provisional	165
177673	PLN00039	PLN00039	photosystem II reaction center Psb28 protein; Provisional	111
215038	PLN00040	PLN00040	Protein MAK16 homolog; Provisional	233
215039	PLN00041	PLN00041	photosystem I reaction center subunit II; Provisional	196
177676	PLN00042	PLN00042	photosystem II oxygen-evolving enhancer protein 2; Provisional	260
165621	PLN00043	PLN00043	elongation factor 1-alpha; Provisional	447
165622	PLN00044	PLN00044	multi-copper oxidase-related protein; Provisional	596
177677	PLN00045	PLN00045	photosystem I reaction center subunit IV; Provisional	101
215040	PLN00046	PLN00046	photosystem I reaction center subunit O; Provisional	141
177679	PLN00047	PLN00047	photosystem II biogenesis protein Psb29; Provisional	283
177680	PLN00048	PLN00048	photosystem I light harvesting chlorophyll a/b binding protein 3; Provisional	262
177681	PLN00049	PLN00049	carboxyl-terminal processing protease; Provisional	389
165628	PLN00050	PLN00050	expansin A; Provisional	247
177682	PLN00051	PLN00051	RNA-binding S4 domain-containing protein; Provisional	267
177683	PLN00052	PLN00052	prolyl 4-hydroxylase; Provisional	310
215041	PLN00053	PLN00053	photosystem II subunit R; Provisional	117
215042	PLN00054	PLN00054	photosystem I reaction center subunit N; Provisional	139
177686	PLN00055	PLN00055	photosystem II reaction center protein H; Provisional	73
177687	PLN00056	PLN00056	photosystem Q(B) protein; Provisional	353
177688	PLN00057	PLN00057	proliferating cell nuclear antigen; Provisional	263
177689	PLN00058	PLN00058	photosystem II reaction center subunit T; Provisional	103
177690	PLN00059	PLN00059	PsbP domain-containing protein 1; Provisional	286
177691	PLN00060	PLN00060	meiotic recombination protein SPO11-2; Provisional	384
215043	PLN00061	PLN00061	photosystem II protein Psb27; Provisional	150
177693	PLN00062	PLN00062	TATA-box-binding protein; Provisional	179
215044	PLN00063	PLN00063	photosystem II core complex proteins psbY; Provisional	194
215045	PLN00064	PLN00064	photosystem II protein Psb27; Provisional	166
215046	PLN00066	PLN00066	PsbP domain-containing protein 4; Provisional	262
177697	PLN00067	PLN00067	PsbP domain-containing protein 6; Provisional	263
177698	PLN00068	PLN00068	photosystem II CP47 chlorophyll A apoprotein; Provisional	508
215047	PLN00070	PLN00070	aconitate hydratase	936
177700	PLN00071	PLN00071	photosystem I subunit VII; Provisional	81
177701	PLN00072	PLN00072	3-isopropylmalate isomerase/dehydratase small subunit; Provisional	246
215048	PLN00074	PLN00074	photosystem II D2 protein (PsbD); Provisional	353
215049	PLN00075	PLN00075	Photosystem II reaction center protein K; Provisional	52
215050	PLN00077	PLN00077	photosystem II reaction centre W protein; Provisional	128
165653	PLN00078	PLN00078	photosystem I reaction center subunit N (PsaN); Provisional	122
165655	PLN00081	PLN00081	photosystem I reaction center subunit V (PsaG); Provisional	141
215051	PLN00082	PLN00082	photosystem II reaction centre W protein (PsbW); Provisional	67
177706	PLN00083	PLN00083	photosystem II subunit R; Provisional	101
177707	PLN00084	PLN00084	photosystem II subunit S (PsbS); Provisional	214
177708	PLN00085	PLN00085	photosystem II reaction center protein M (PsbM); Provisional	149
177709	PLN00088	PLN00088	predicted protein; Provisional	127
177710	PLN00089	PLN00089	fucoxanthin-chlorophyll a/c binding protein; Provisional	209
165663	PLN00090	PLN00090	photosystem II reaction center M protein; Provisional	113
215052	PLN00091	PLN00091	photosystem I reaction center subunit V (PsaG); Provisional	160
177712	PLN00092	PLN00092	photosystem I reaction center subunit V (PsaG); Provisional	137
177713	PLN00093	PLN00093	geranylgeranyl diphosphate reductase; Provisional	450
215053	PLN00094	PLN00094	aconitate hydratase 2; Provisional	938
165668	PLN00095	PLN00095	chlorophyllide a oxygenase; Provisional	394
177715	PLN00096	PLN00096	isocitrate dehydrogenase (NADP+); Provisional	393
165670	PLN00097	PLN00097	photosystem I light harvesting complex Lhca2/4, chlorophyll a/b binding; Provisional	244
177716	PLN00098	PLN00098	light-harvesting complex I chlorophyll a/b-binding protein (Lhac); Provisional	267
177717	PLN00099	PLN00099	light-harvesting complex I chlorophyll A-B binding protein Lhca1; Provisional	243
215054	PLN00100	PLN00100	light-harvesting complex chlorophyll-a/b protein of photosystem I (Lhca); Provisional	246
215055	PLN00101	PLN00101	Photosystem I light-harvesting complex type 4 protein; Provisional	250
177720	PLN00103	PLN00103	isocitrate dehydrogenase (NADP+); Provisional	410
215056	PLN00104	PLN00104	MYST-like histone acetyltransferase; Provisional	450
215057	PLN00105	PLN00105	malate/L-lactate dehydrogenase; Provisional	330
215058	PLN00106	PLN00106	malate dehydrogenase	323
165679	PLN00107	PLN00107	FAD-dependent oxidoreductase; Provisional	257
177724	PLN00108	PLN00108	unknown protein; Provisional	257
177725	PLN00110	PLN00110	flavonoid 3',5'-hydroxylase (F3'5'H); Provisional	504
215059	PLN00111	PLN00111	accumulation of photosystem one; Provisional	399
215060	PLN00112	PLN00112	malate dehydrogenase (NADP); Provisional	444
215061	PLN00113	PLN00113	leucine-rich repeat receptor-like protein kinase; Provisional	968
177729	PLN00115	PLN00115	pollen allergen group 3; Provisional	118
177730	PLN00116	PLN00116	translation elongation factor EF-2 subunit; Provisional	843
215062	PLN00118	PLN00118	isocitrate dehydrogenase (NAD+)	372
177732	PLN00119	PLN00119	endoglucanase	489
215063	PLN00120	PLN00120	fucoxanthin-chlorophyll a-c binding protein; Provisional	202
177733	PLN00121	PLN00121	histone H3; Provisional	136
215064	PLN00122	PLN00122	serine/threonine protein phosphatase 2A; Provisional	170
215065	PLN00123	PLN00123	isocitrate dehydrogenase (NAD+)	360
177736	PLN00124	PLN00124	succinyl-CoA ligase [GDP-forming] subunit beta; Provisional	422
215066	PLN00125	PLN00125	Succinyl-CoA ligase [GDP-forming] subunit alpha	300
165695	PLN00126	PLN00126	succinate dehydrogenase, cytochrome b subunit family; Provisional	129
177738	PLN00127	PLN00127	succinate dehydrogenase (ubiquinone) cytochrome b subunit; Provisional	178
177739	PLN00128	PLN00128	Succinate dehydrogenase [ubiquinone] flavoprotein subunit	635
215067	PLN00129	PLN00129	succinate dehydrogenase [ubiquinone] iron-sulfur subunit	276
177741	PLN00130	PLN00130	succinate dehydrogenase (SDH3); Provisional	213
165700	PLN00131	PLN00131	hypothetical protein; Provisional	218
215068	PLN00133	PLN00133	class I-fumarate hydratase; Provisional	576
215069	PLN00134	PLN00134	fumarate hydratase; Provisional	458
177744	PLN00135	PLN00135	malate dehydrogenase	309
215070	PLN00136	PLN00136	silicon transporter; Provisional	482
215071	PLN00137	PLN00137	NHAD transporter family protein; Provisional	424
165706	PLN00138	PLN00138	large subunit ribosomal protein LP2; Provisional	113
165707	PLN00139	PLN00139	hypothetical protein; Provisional	320
165708	PLN00140	PLN00140	alcohol acetyltransferase family protein; Provisional	444
215072	PLN00141	PLN00141	Tic62-NAD(P)-related group II protein; Provisional	251
215073	PLN00142	PLN00142	sucrose synthase	815
165711	PLN00143	PLN00143	tyrosine/nicotianamine aminotransferase; Provisional	409
177748	PLN00144	PLN00144	acetylornithine transaminase	382
215074	PLN00145	PLN00145	tyrosine/nicotianamine aminotransferase; Provisional	430
215075	PLN00146	PLN00146	40S ribosomal protein S15a; Provisional	130
215076	PLN00147	PLN00147	light-harvesting complex I chlorophyll-a/b binding protein Lhca5; Provisional	252
215077	PLN00148	PLN00148	potassium transporter; Provisional	785
177753	PLN00149	PLN00149	potassium transporter; Provisional	779
215078	PLN00150	PLN00150	potassium ion transporter family protein; Provisional	779
215079	PLN00151	PLN00151	potassium transporter; Provisional	852
177755	PLN00152	PLN00152	DNA-directed RNA polymerase; Provisional	130
165721	PLN00153	PLN00153	histone H2A; Provisional	129
177756	PLN00154	PLN00154	histone H2A; Provisional	136
165723	PLN00155	PLN00155	histone H2A; Provisional	58
215080	PLN00156	PLN00156	histone H2AX; Provisional	139
177758	PLN00157	PLN00157	histone H2A; Provisional	132
215081	PLN00158	PLN00158	histone H2B; Provisional	116
165727	PLN00160	PLN00160	histone H3; Provisional	97
215082	PLN00161	PLN00161	histone H3; Provisional	135
215083	PLN00162	PLN00162	transport protein sec23; Provisional	761
165730	PLN00163	PLN00163	histone H4; Provisional	59
215084	PLN00164	PLN00164	glucosyltransferase; Provisional	480
165732	PLN00165	PLN00165	hypothetical protein; Provisional	88
165733	PLN00166	PLN00166	aquaporin TIP2; Provisional	250
215085	PLN00167	PLN00167	aquaporin TIP5; Provisional	256
215086	PLN00168	PLN00168	Cytochrome P450; Provisional	519
177765	PLN00169	PLN00169	CETS family protein; Provisional	175
215087	PLN00170	PLN00170	photosystem II light-harvesting-Chl-binding protein Lhcb6 (CP24); Provisional	255
215088	PLN00171	PLN00171	photosystem  light-harvesting complex -chlorophyll a/b binding protein Lhcb7; Provisional	324
177768	PLN00172	PLN00172	ubiquitin conjugating enzyme; Provisional	147
177769	PLN00174	PLN00174	predicted protein; Provisional	160
215089	PLN00175	PLN00175	aminotransferase family protein; Provisional	413
215090	PLN00176	PLN00176	galactinol synthase	333
177772	PLN00177	PLN00177	sulfite oxidase; Provisional	393
177773	PLN00178	PLN00178	sulfite reductase	623
215091	PLN00179	PLN00179	acyl-[acyl-carrier protein] desaturase	390
177775	PLN00180	PLN00180	NDF6 (NDH-dependent flow 6); Provisional	180
177776	PLN00181	PLN00181	protein SPA1-RELATED; Provisional	793
165748	PLN00182	PLN00182	putative aquaporin NIP4; Provisional	283
215092	PLN00183	PLN00183	putative aquaporin NIP7; Provisional	274
177778	PLN00184	PLN00184	aquaporin NIP1; Provisional	296
177779	PLN00185	PLN00185	60S ribosomal protein L4-1; Provisional	405
215093	PLN00186	PLN00186	ribosomal protein S26; Provisional	109
177781	PLN00187	PLN00187	photosystem II light-harvesting complex II protein Lhcb4; Provisional	286
215094	PLN00188	PLN00188	enhanced disease resistance protein (EDR2); Provisional	719
177783	PLN00189	PLN00189	40S ribosomal protein S9; Provisional	194
177784	PLN00190	PLN00190	60S ribosomal protein L21; Provisional	158
215095	PLN00191	PLN00191	enolase	457
215096	PLN00192	PLN00192	aldehyde oxidase	1344
215097	PLN00193	PLN00193	expansin-A; Provisional	256
215098	PLN00194	PLN00194	aldose 1-epimerase; Provisional	337
165762	PLN00196	PLN00196	alpha-amylase; Provisional	428
215099	PLN00197	PLN00197	beta-amylase; Provisional	573
215100	PLN00198	PLN00198	anthocyanidin reductase; Provisional	338
177791	PLN00200	PLN00200	argininosuccinate synthase; Provisional	404
177792	PLN00202	PLN00202	beta-ureidopropionase	405
215101	PLN00203	PLN00203	glutamyl-tRNA reductase	519
215102	PLN00204	PLN00204	CP12 gene family protein; Provisional	126
177795	PLN00205	PLN00205	ribosomal protein L13 family protein; Provisional	191
215103	PLN00206	PLN00206	DEAD-box ATP-dependent RNA helicase; Provisional	518
215104	PLN00207	PLN00207	polyribonucleotide nucleotidyltransferase; Provisional	891
177798	PLN00208	PLN00208	translation initiation factor (eIF); Provisional	145
165774	PLN00209	PLN00209	ribosomal protein S27; Provisional	86
177799	PLN00210	PLN00210	40S ribosomal protein S16; Provisional	141
215105	PLN00211	PLN00211	predicted protein; Provisional	61
215106	PLN00212	PLN00212	glutelin; Provisional	493
165778	PLN00213	PLN00213	predicted protein; Provisional	118
177800	PLN00214	PLN00214	putative protein; Provisional	115
165780	PLN00215	PLN00215	predicted protein; Provisional	110
165781	PLN00216	PLN00216	predicted protein; Provisional	69
165782	PLN00217	PLN00217	predicted protein; Provisional	210
165783	PLN00218	PLN00218	predicted protein; Provisional	151
165784	PLN00219	PLN00219	predicted protein; Provisional	65
215107	PLN00220	PLN00220	tubulin beta chain; Provisional	447
177802	PLN00221	PLN00221	tubulin alpha chain; Provisional	450
215108	PLN00222	PLN00222	tubulin gamma chain; Provisional	454
165788	PLN00223	PLN00223	ADP-ribosylation factor; Provisional	181
215109	PLN00410	PLN00410	U5 snRNP protein, DIM1 family; Provisional	142
177805	PLN00411	PLN00411	nodulin MtN21 family protein; Provisional	358
215110	PLN00412	PLN00412	NADP-dependent glyceraldehyde-3-phosphate dehydrogenase; Provisional	496
165792	PLN00413	PLN00413	triacylglycerol lipase	479
177807	PLN00414	PLN00414	glycosyltransferase family protein	446
177808	PLN00415	PLN00415	3-ketoacyl-CoA synthase	466
177809	PLN00416	PLN00416	carbonate dehydratase	258
177810	PLN00417	PLN00417	oxidoreductase, 2OG-Fe(II) oxygenase family protein	348
177811	PLN02150	PLN02150	terpene synthase/cyclase family protein	96
177812	PLN02151	PLN02151	trehalose-phosphatase	354
177813	PLN02152	PLN02152	indole-3-acetate beta-glucosyltransferase	455
177814	PLN02153	PLN02153	epithiospecifier protein	341
215111	PLN02154	PLN02154	carbonic anhydrase	290
165802	PLN02155	PLN02155	polygalacturonase	394
177816	PLN02156	PLN02156	gibberellin 2-beta-dioxygenase	335
177817	PLN02157	PLN02157	3-hydroxyisobutyryl-CoA hydrolase-like protein	401
177818	PLN02159	PLN02159	Fe(2+) transport protein	337
177819	PLN02160	PLN02160	thiosulfate sulfurtransferase	136
177820	PLN02161	PLN02161	beta-amylase	531
177821	PLN02162	PLN02162	triacylglycerol lipase	475
177822	PLN02164	PLN02164	sulfotransferase	346
177823	PLN02165	PLN02165	adenylate isopentenyltransferase	334
165812	PLN02166	PLN02166	dTDP-glucose 4,6-dehydratase	436
215112	PLN02167	PLN02167	UDP-glycosyltransferase family protein	475
215113	PLN02168	PLN02168	copper ion binding / pectinesterase	545
177826	PLN02169	PLN02169	fatty acid (omega-1)-hydroxylase/midchain alkane hydroxylase	500
215114	PLN02170	PLN02170	probable pectinesterase/pectinesterase inhibitor	529
215115	PLN02171	PLN02171	endoglucanase	629
215116	PLN02172	PLN02172	flavin-containing monooxygenase FMO GS-OX	461
177830	PLN02173	PLN02173	UDP-glucosyl transferase family protein	449
177831	PLN02174	PLN02174	aldehyde dehydrogenase family 3 member H1	484
177832	PLN02175	PLN02175	endoglucanase	484
215117	PLN02176	PLN02176	putative pectinesterase	340
215118	PLN02177	PLN02177	glycerol-3-phosphate acyltransferase	497
177834	PLN02178	PLN02178	cinnamyl-alcohol dehydrogenase	375
177835	PLN02179	PLN02179	carbonic anhydrase	235
177836	PLN02180	PLN02180	gamma-glutamyl transpeptidase 4	639
177837	PLN02182	PLN02182	cytidine deaminase	339
165828	PLN02183	PLN02183	ferulate 5-hydroxylase	516
177838	PLN02184	PLN02184	superoxide dismutase [Fe]	212
215119	PLN02187	PLN02187	rooty/superroot1	462
215120	PLN02188	PLN02188	polygalacturonase/glycoside hydrolase family protein	404
215121	PLN02189	PLN02189	cellulose synthase	1040
215122	PLN02190	PLN02190	cellulose synthase-like protein	756
177843	PLN02191	PLN02191	L-ascorbate oxidase	574
215123	PLN02192	PLN02192	3-ketoacyl-CoA synthase	511
177844	PLN02193	PLN02193	nitrile-specifier protein	470
177845	PLN02194	PLN02194	cytochrome-c oxidase	265
215124	PLN02195	PLN02195	cellulose synthase A	977
177847	PLN02196	PLN02196	abscisic acid 8'-hydroxylase	463
177848	PLN02197	PLN02197	pectinesterase	588
177849	PLN02198	PLN02198	glutathione gamma-glutamylcysteinyltransferase	573
177850	PLN02199	PLN02199	shikimate kinase	303
215125	PLN02200	PLN02200	adenylate kinase family protein	234
177852	PLN02201	PLN02201	probable pectinesterase/pectinesterase inhibitor	520
177853	PLN02202	PLN02202	carbonate dehydratase	284
165847	PLN02203	PLN02203	aldehyde dehydrogenase	484
215126	PLN02204	PLN02204	diacylglycerol kinase	601
177855	PLN02205	PLN02205	alpha,alpha-trehalose-phosphate synthase [UDP-forming]	854
177856	PLN02206	PLN02206	UDP-glucuronate decarboxylase	442
177857	PLN02207	PLN02207	UDP-glycosyltransferase	468
177858	PLN02208	PLN02208	glycosyltransferase family protein	442
177859	PLN02209	PLN02209	serine carboxypeptidase	437
215127	PLN02210	PLN02210	UDP-glucosyl transferase	456
215128	PLN02211	PLN02211	methyl indole-3-acetate methyltransferase	273
165857	PLN02213	PLN02213	sinapoylglucose-malate O-sinapoyltransferase/carboxypeptidase	319
177862	PLN02214	PLN02214	cinnamoyl-CoA reductase	342
215129	PLN02216	PLN02216	protein SRG1	357
215130	PLN02217	PLN02217	probable pectinesterase/pectinesterase inhibitor	670
177865	PLN02218	PLN02218	polygalacturonase ADPG	431
165863	PLN02219	PLN02219	probable galactinol--sucrose galactosyltransferase 2	775
177866	PLN02220	PLN02220	delta-9 acyl-lipid desaturase	299
177867	PLN02221	PLN02221	asparaginyl-tRNA synthetase	572
177868	PLN02222	PLN02222	phosphoinositide phospholipase C 2	581
165867	PLN02223	PLN02223	phosphoinositide phospholipase C	537
177869	PLN02224	PLN02224	methionine-tRNA ligase	616
177870	PLN02225	PLN02225	1-deoxy-D-xylulose-5-phosphate synthase	701
177871	PLN02226	PLN02226	2-oxoglutarate dehydrogenase E2 component	463
177872	PLN02227	PLN02227	fructose-bisphosphate aldolase I	399
177873	PLN02228	PLN02228	Phosphoinositide phospholipase C	567
177874	PLN02229	PLN02229	alpha-galactosidase	427
177875	PLN02230	PLN02230	phosphoinositide phospholipase C 4	598
177876	PLN02231	PLN02231	alanine transaminase	534
165876	PLN02232	PLN02232	ubiquinone biosynthesis methyltransferase	160
177877	PLN02233	PLN02233	ubiquinone biosynthesis methyltransferase	261
177878	PLN02234	PLN02234	1-deoxy-D-xylulose-5-phosphate synthase	641
177879	PLN02235	PLN02235	ATP citrate (pro-S)-lyase	423
177880	PLN02236	PLN02236	choline kinase	344
215131	PLN02237	PLN02237	glyceraldehyde-3-phosphate dehydrogenase B	442
215132	PLN02238	PLN02238	hypoxanthine phosphoribosyltransferase	189
177883	PLN02240	PLN02240	UDP-glucose 4-epimerase	352
215133	PLN02241	PLN02241	glucose-1-phosphate adenylyltransferase	436
215134	PLN02242	PLN02242	methionine gamma-lyase	418
177886	PLN02243	PLN02243	S-adenosylmethionine synthase	386
215135	PLN02244	PLN02244	tocopherol O-methyltransferase	340
215136	PLN02245	PLN02245	ATP phosphoribosyl transferase	403
215137	PLN02246	PLN02246	4-coumarate--CoA ligase	537
165890	PLN02247	PLN02247	indole-3-acetic acid-amido synthetase	606
215138	PLN02248	PLN02248	cellulose synthase-like protein	1135
177891	PLN02249	PLN02249	indole-3-acetic acid-amido synthetase	597
215139	PLN02250	PLN02250	lipid phosphate phosphatase	314
215140	PLN02251	PLN02251	pyrophosphate-dependent phosphofructokinase	568
215141	PLN02252	PLN02252	nitrate reductase [NADPH]	888
177895	PLN02253	PLN02253	xanthoxin dehydrogenase	280
215142	PLN02254	PLN02254	gibberellin 3-beta-dioxygenase	358
215143	PLN02255	PLN02255	H(+)-translocating inorganic pyrophosphatase	765
215144	PLN02256	PLN02256	arogenate dehydrogenase	304
177899	PLN02257	PLN02257	phosphoribosylamine--glycine ligase	434
215145	PLN02258	PLN02258	9-cis-epoxycarotenoid dioxygenase NCED	590
177901	PLN02259	PLN02259	branched-chain-amino-acid aminotransferase 2	388
215146	PLN02260	PLN02260	probable rhamnose biosynthetic enzyme	668
215147	PLN02262	PLN02262	fructose-1,6-bisphosphatase	340
177904	PLN02263	PLN02263	serine decarboxylase	470
215148	PLN02264	PLN02264	lipoxygenase	919
215149	PLN02265	PLN02265	probable phenylalanyl-tRNA synthetase beta chain	597
215150	PLN02266	PLN02266	endoglucanase	510
215151	PLN02267	PLN02267	enoyl-CoA hydratase/isomerase family protein	239
177909	PLN02268	PLN02268	probable polyamine oxidase	435
215152	PLN02269	PLN02269	Pyruvate dehydrogenase E1 component subunit alpha	362
165912	PLN02270	PLN02270	phospholipase D alpha	808
215153	PLN02271	PLN02271	serine hydroxymethyltransferase	586
177912	PLN02272	PLN02272	glyceraldehyde-3-phosphate dehydrogenase	421
215154	PLN02274	PLN02274	inosine-5'-monophosphate dehydrogenase	505
215155	PLN02275	PLN02275	transferase, transferring glycosyl groups	371
215156	PLN02276	PLN02276	gibberellin 20-oxidase	361
177916	PLN02277	PLN02277	H(+)-translocating inorganic pyrophosphatase	730
215157	PLN02278	PLN02278	succinic semialdehyde dehydrogenase	498
177918	PLN02279	PLN02279	ent-kaur-16-ene synthase	784
215158	PLN02280	PLN02280	IAA-amino acid hydrolase	478
177920	PLN02281	PLN02281	chlorophyllide a oxygenase	536
165923	PLN02282	PLN02282	phosphoglycerate kinase	401
177921	PLN02283	PLN02283	alpha-dioxygenase	633
177922	PLN02284	PLN02284	glutamine synthetase	354
215159	PLN02285	PLN02285	methionyl-tRNA formyltransferase	334
215160	PLN02286	PLN02286	arginine-tRNA ligase	576
215161	PLN02287	PLN02287	3-ketoacyl-CoA thiolase	452
215162	PLN02288	PLN02288	mannose-6-phosphate isomerase	394
215163	PLN02289	PLN02289	ribulose-bisphosphate carboxylase small chain	176
215164	PLN02290	PLN02290	cytokinin trans-hydroxylase	516
177928	PLN02291	PLN02291	phospho-2-dehydro-3-deoxyheptonate aldolase	474
215165	PLN02292	PLN02292	ferric-chelate reductase	702
177930	PLN02293	PLN02293	adenine phosphoribosyltransferase	187
177931	PLN02294	PLN02294	cytochrome c oxidase subunit Vb	174
215166	PLN02295	PLN02295	glycerol kinase	512
215167	PLN02296	PLN02296	carbonate dehydratase	269
177934	PLN02297	PLN02297	ribose-phosphate pyrophosphokinase	326
165939	PLN02298	PLN02298	hydrolase, alpha/beta fold family protein	330
215168	PLN02299	PLN02299	1-aminocyclopropane-1-carboxylate oxidase	321
215169	PLN02300	PLN02300	lactoylglutathione lyase	286
215170	PLN02301	PLN02301	pectinesterase/pectinesterase inhibitor	548
215171	PLN02302	PLN02302	ent-kaurenoic acid oxidase	490
215172	PLN02303	PLN02303	urease	837
215173	PLN02304	PLN02304	probable pectinesterase	379
215174	PLN02305	PLN02305	lipoxygenase	918
177941	PLN02306	PLN02306	hydroxypyruvate reductase	386
177942	PLN02307	PLN02307	phosphoglucomutase	579
177943	PLN02308	PLN02308	endoglucanase	492
215175	PLN02309	PLN02309	5'-adenylylsulfate reductase	457
215176	PLN02310	PLN02310	triacylglycerol lipase	405
215177	PLN02311	PLN02311	chalcone isomerase	271
215178	PLN02312	PLN02312	acyl-CoA oxidase	680
177947	PLN02313	PLN02313	Pectinesterase/pectinesterase inhibitor	587
215179	PLN02314	PLN02314	pectinesterase	586
177949	PLN02315	PLN02315	aldehyde dehydrogenase family 7 member	508
215180	PLN02316	PLN02316	synthase/transferase	1036
215181	PLN02317	PLN02317	arogenate dehydratase	382
177952	PLN02318	PLN02318	phosphoribulokinase/uridine kinase	656
177953	PLN02319	PLN02319	aminomethyltransferase	404
177954	PLN02320	PLN02320	seryl-tRNA synthetase	502
215182	PLN02321	PLN02321	2-isopropylmalate synthase	632
177956	PLN02322	PLN02322	acyl-CoA thioesterase	154
215183	PLN02323	PLN02323	probable fructokinase	330
177958	PLN02324	PLN02324	triacylglycerol lipase	415
215184	PLN02325	PLN02325	nudix hydrolase	144
215185	PLN02326	PLN02326	3-oxoacyl-[acyl-carrier-protein] synthase III	379
215186	PLN02327	PLN02327	CTP synthase	557
215187	PLN02328	PLN02328	lysine-specific histone demethylase 1 homolog	808
215188	PLN02329	PLN02329	3-isopropylmalate dehydrogenase	409
215189	PLN02330	PLN02330	4-coumarate--CoA ligase-like 1	546
177965	PLN02331	PLN02331	phosphoribosylglycinamide formyltransferase	207
215190	PLN02332	PLN02332	membrane bound O-acyl transferase (MBOAT) family protein	465
215191	PLN02333	PLN02333	glucose-6-phosphate 1-dehydrogenase	604
215192	PLN02334	PLN02334	ribulose-phosphate 3-epimerase	229
177969	PLN02335	PLN02335	anthranilate synthase	222
177970	PLN02336	PLN02336	phosphoethanolamine N-methyltransferase	475
215193	PLN02337	PLN02337	lipoxygenase	866
177972	PLN02338	PLN02338	3-phosphoshikimate 1-carboxyvinyltransferase	443
177973	PLN02339	PLN02339	NAD+ synthase (glutamine-hydrolysing)	700
215194	PLN02340	PLN02340	endoglucanase	614
215195	PLN02341	PLN02341	pfkB-type carbohydrate kinase family protein	470
177976	PLN02342	PLN02342	ornithine carbamoyltransferase	348
177977	PLN02343	PLN02343	allene oxide cyclase	229
177978	PLN02344	PLN02344	chorismate mutase	284
177979	PLN02345	PLN02345	endoglucanase	469
215196	PLN02346	PLN02346	histidine biosynthesis bifunctional protein hisIE	271
215197	PLN02347	PLN02347	GMP synthetase	536
215198	PLN02348	PLN02348	phosphoribulokinase	395
215199	PLN02349	PLN02349	glycerol-3-phosphate acyltransferase	426
215200	PLN02350	PLN02350	phosphogluconate dehydrogenase (decarboxylating)	493
215201	PLN02351	PLN02351	cytochromes b561 family protein	242
215202	PLN02352	PLN02352	phospholipase D epsilon	758
177986	PLN02353	PLN02353	probable UDP-glucose 6-dehydrogenase	473
177987	PLN02354	PLN02354	copper ion binding / oxidoreductase	552
215203	PLN02355	PLN02355	probable galactinol--sucrose galactosyltransferase 1	758
215204	PLN02356	PLN02356	phosphoglycerate kinase	423
215205	PLN02357	PLN02357	serine acetyltransferase	360
165999	PLN02358	PLN02358	glyceraldehyde-3-phosphate dehydrogenase	338
166000	PLN02359	PLN02359	ethanolaminephosphotransferase	389
166001	PLN02360	PLN02360	probable 6-phosphogluconolactonase	268
177990	PLN02361	PLN02361	alpha-amylase	401
215206	PLN02362	PLN02362	hexokinase	509
215207	PLN02363	PLN02363	phosphoribosylanthranilate isomerase	256
166005	PLN02364	PLN02364	L-ascorbate peroxidase 1	250
177993	PLN02365	PLN02365	2-oxoglutarate-dependent dioxygenase	300
215208	PLN02366	PLN02366	spermidine synthase	308
177995	PLN02367	PLN02367	lactoylglutathione lyase	233
177996	PLN02368	PLN02368	alanine transaminase	407
215209	PLN02369	PLN02369	ribose-phosphate pyrophosphokinase	302
215210	PLN02370	PLN02370	acyl-ACP thioesterase	419
215211	PLN02371	PLN02371	phosphoglucosamine mutase family protein	583
215212	PLN02372	PLN02372	violaxanthin de-epoxidase	455
178001	PLN02373	PLN02373	soluble inorganic pyrophosphatase	188
215213	PLN02374	PLN02374	pyruvate dehydrogenase (acetyl-transferring)	433
178003	PLN02375	PLN02375	molybdopterin biosynthesis protein CNX3	270
178004	PLN02376	PLN02376	1-aminocyclopropane-1-carboxylate synthase	496
166018	PLN02377	PLN02377	3-ketoacyl-CoA synthase	502
166019	PLN02378	PLN02378	glutathione S-transferase DHAR1	213
178005	PLN02379	PLN02379	pfkB-type carbohydrate kinase family protein	367
178006	PLN02380	PLN02380	1-acyl-sn-glycerol-3-phosphate acyltransferase	376
215214	PLN02381	PLN02381	valyl-tRNA synthetase	1066
178008	PLN02382	PLN02382	probable sucrose-phosphatase	413
178009	PLN02383	PLN02383	aspartate semialdehyde dehydrogenase	344
215215	PLN02384	PLN02384	ribose-5-phosphate isomerase	264
215216	PLN02385	PLN02385	hydrolase; alpha/beta fold family protein	349
166027	PLN02386	PLN02386	superoxide dismutase [Cu-Zn]	152
215217	PLN02387	PLN02387	long-chain-fatty-acid-CoA ligase family protein	696
215218	PLN02388	PLN02388	phosphopantetheine adenylyltransferase	177
215219	PLN02389	PLN02389	biotin synthase	379
178014	PLN02390	PLN02390	molybdopterin synthase catalytic subunit	111
178015	PLN02392	PLN02392	probable steroid reductase DET2	260
215220	PLN02393	PLN02393	leucoanthocyanidin dioxygenase like protein	362
215221	PLN02394	PLN02394	trans-cinnamate 4-monooxygenase	503
166036	PLN02395	PLN02395	glutathione S-transferase	215
178018	PLN02396	PLN02396	hexaprenyldihydroxybenzoate methyltransferase	322
215222	PLN02397	PLN02397	aspartate transaminase	423
215223	PLN02398	PLN02398	hydroxyacylglutathione hydrolase	329
178021	PLN02399	PLN02399	phospholipid hydroperoxide glutathione peroxidase	236
215224	PLN02400	PLN02400	cellulose synthase	1085
215225	PLN02401	PLN02401	diacylglycerol o-acyltransferase	446
178024	PLN02402	PLN02402	cytidine deaminase	303
178025	PLN02403	PLN02403	aminocyclopropanecarboxylate oxidase	303
178026	PLN02404	PLN02404	6,7-dimethyl-8-ribityllumazine synthase	141
215226	PLN02405	PLN02405	hexokinase	497
215227	PLN02406	PLN02406	ethanolamine-phosphate cytidylyltransferase	418
178029	PLN02407	PLN02407	diphosphomevalonate decarboxylase	343
215228	PLN02408	PLN02408	phospholipase A1	365
178031	PLN02409	PLN02409	serine--glyoxylate aminotransaminase	401
178032	PLN02410	PLN02410	UDP-glucuronosyl/UDP-glucosyl transferase family protein	451
178033	PLN02411	PLN02411	12-oxophytodienoate reductase	391
166053	PLN02412	PLN02412	probable glutathione peroxidase	167
215229	PLN02413	PLN02413	choline-phosphate cytidylyltransferase	294
178035	PLN02414	PLN02414	glycine dehydrogenase (decarboxylating)	993
178036	PLN02415	PLN02415	uricase	304
178037	PLN02416	PLN02416	probable pectinesterase/pectinesterase inhibitor	541
178038	PLN02417	PLN02417	dihydrodipicolinate synthase	280
215230	PLN02418	PLN02418	delta-1-pyrroline-5-carboxylate synthase	718
166060	PLN02419	PLN02419	methylmalonate-semialdehyde dehydrogenase [acylating]	604
178040	PLN02420	PLN02420	endoglucanase	525
215231	PLN02421	PLN02421	phosphotransferase, alcohol group as acceptor/kinase	330
215232	PLN02422	PLN02422	dephospho-CoA kinase	232
178043	PLN02423	PLN02423	phosphomannomutase	245
215233	PLN02424	PLN02424	ketopantoate hydroxymethyltransferase	332
215234	PLN02425	PLN02425	probable fructose-bisphosphate aldolase	390
215235	PLN02426	PLN02426	cytochrome P450, family 94, subfamily C protein	502
178047	PLN02427	PLN02427	UDP-apiose/xylose synthase	386
215236	PLN02428	PLN02428	lipoic acid synthase	349
166070	PLN02429	PLN02429	triosephosphate isomerase	315
178049	PLN02430	PLN02430	long-chain-fatty-acid-CoA ligase	660
178050	PLN02431	PLN02431	ferredoxin--nitrite reductase	587
178051	PLN02432	PLN02432	putative pectinesterase	293
215237	PLN02433	PLN02433	uroporphyrinogen decarboxylase	345
178053	PLN02434	PLN02434	fatty acid hydroxylase	237
215238	PLN02435	PLN02435	probable UDP-N-acetylglucosamine pyrophosphorylase	493
215239	PLN02436	PLN02436	cellulose synthase A	1094
178056	PLN02437	PLN02437	ribonucleoside--diphosphate reductase large subunit	813
178057	PLN02438	PLN02438	inositol-3-phosphate synthase	510
215240	PLN02439	PLN02439	arginine decarboxylase	559
215241	PLN02440	PLN02440	amidophosphoribosyltransferase	479
215242	PLN02441	PLN02441	cytokinin dehydrogenase	525
178061	PLN02442	PLN02442	S-formylglutathione hydrolase	283
178062	PLN02443	PLN02443	acyl-coenzyme A oxidase	664
215243	PLN02444	PLN02444	HMP-P synthase	642
215244	PLN02445	PLN02445	anthranilate synthase component I	523
215245	PLN02446	PLN02446	(5-phosphoribosyl)-5-[(5-phosphoribosylamino)methylideneamino] imidazole-4-carboxamide isomerase	262
215246	PLN02447	PLN02447	1,4-alpha-glucan-branching enzyme	758
215247	PLN02448	PLN02448	UDP-glycosyltransferase family protein	459
178068	PLN02449	PLN02449	ferrochelatase	485
178069	PLN02450	PLN02450	1-aminocyclopropane-1-carboxylate synthase	468
215248	PLN02451	PLN02451	homoserine kinase	370
178071	PLN02452	PLN02452	phosphoserine transaminase	365
178072	PLN02453	PLN02453	complex I subunit	105
215249	PLN02454	PLN02454	triacylglycerol lipase	414
178074	PLN02455	PLN02455	fructose-bisphosphate aldolase	358
215250	PLN02456	PLN02456	citrate synthase	455
215251	PLN02457	PLN02457	phenylalanine ammonia-lyase	706
215252	PLN02458	PLN02458	transferase, transferring glycosyl groups	346
215253	PLN02459	PLN02459	probable adenylate kinase	261
215254	PLN02460	PLN02460	indole-3-glycerol-phosphate synthase	338
215255	PLN02461	PLN02461	Probable pyruvate kinase	511
215256	PLN02462	PLN02462	sedoheptulose-1,7-bisphosphatase	304
178082	PLN02463	PLN02463	lycopene beta cyclase	447
215257	PLN02464	PLN02464	glycerol-3-phosphate dehydrogenase	627
215258	PLN02465	PLN02465	L-galactono-1,4-lactone dehydrogenase	573
215259	PLN02466	PLN02466	aldehyde dehydrogenase family 2 member	538
215260	PLN02467	PLN02467	betaine aldehyde dehydrogenase	503
178087	PLN02468	PLN02468	putative pectinesterase/pectinesterase inhibitor	565
178088	PLN02469	PLN02469	hydroxyacylglutathione hydrolase	258
215261	PLN02470	PLN02470	acetolactate synthase	585
215262	PLN02471	PLN02471	superoxide dismutase [Mn]	231
215263	PLN02472	PLN02472	uncharacterized protein	246
166114	PLN02473	PLN02473	glutathione S-transferase	214
178092	PLN02474	PLN02474	UTP--glucose-1-phosphate uridylyltransferase	469
215264	PLN02475	PLN02475	5-methyltetrahydropteroyltriglutamate--homocysteine methyltransferase	766
178094	PLN02476	PLN02476	O-methyltransferase	278
178095	PLN02477	PLN02477	glutamate dehydrogenase	410
215265	PLN02478	PLN02478	alternative oxidase	328
178097	PLN02479	PLN02479	acetate-CoA ligase	567
178098	PLN02480	PLN02480	Probable pectinesterase	343
215266	PLN02481	PLN02481	Omega-hydroxypalmitate O-feruloyl transferase	436
178100	PLN02482	PLN02482	glutamate-1-semialdehyde 2,1-aminomutase	474
178101	PLN02483	PLN02483	serine palmitoyltransferase	489
178102	PLN02484	PLN02484	probable pectinesterase/pectinesterase inhibitor	587
215267	PLN02485	PLN02485	oxidoreductase	329
178104	PLN02486	PLN02486	aminoacyl-tRNA ligase	383
215268	PLN02487	PLN02487	zeta-carotene desaturase	569
178106	PLN02488	PLN02488	probable pectinesterase/pectinesterase inhibitor	509
215269	PLN02489	PLN02489	homocysteine S-methyltransferase	335
215270	PLN02490	PLN02490	MPBQ/MSBQ methyltransferase	340
215271	PLN02491	PLN02491	carotenoid 9,10(9',10')-cleavage dioxygenase	545
215272	PLN02492	PLN02492	ribonucleoside-diphosphate reductase	324
166134	PLN02493	PLN02493	probable peroxisomal (S)-2-hydroxy-acid oxidase	367
178111	PLN02494	PLN02494	adenosylhomocysteinase	477
215273	PLN02495	PLN02495	oxidoreductase, acting on the CH-CH group of donors	385
215274	PLN02496	PLN02496	probable phosphopantothenoylcysteine decarboxylase	209
178113	PLN02497	PLN02497	probable pectinesterase	331
215275	PLN02498	PLN02498	omega-3 fatty acid desaturase	450
178115	PLN02499	PLN02499	glycerol-3-phosphate acyltransferase	498
215276	PLN02500	PLN02500	cytochrome P450 90B1	490
215277	PLN02501	PLN02501	digalactosyldiacylglycerol synthase	794
215278	PLN02502	PLN02502	lysyl-tRNA synthetase	553
215279	PLN02503	PLN02503	fatty acyl-CoA reductase 2	605
178120	PLN02504	PLN02504	nitrilase	346
178121	PLN02505	PLN02505	omega-6 fatty acid desaturase	381
215280	PLN02506	PLN02506	putative pectinesterase/pectinesterase inhibitor	537
215281	PLN02507	PLN02507	glutathione reductase	499
178124	PLN02508	PLN02508	magnesium-protoporphyrin IX monomethyl ester [oxidative] cyclase	357
178125	PLN02509	PLN02509	cystathionine beta-lyase	464
178126	PLN02510	PLN02510	probable 1-acyl-sn-glycerol-3-phosphate acyltransferase	374
215282	PLN02511	PLN02511	hydrolase	388
178128	PLN02512	PLN02512	acetylglutamate kinase	309
178129	PLN02513	PLN02513	adenylosuccinate synthase	427
166155	PLN02514	PLN02514	cinnamyl-alcohol dehydrogenase	357
178130	PLN02515	PLN02515	naringenin,2-oxoglutarate 3-dioxygenase	358
178131	PLN02516	PLN02516	methylenetetrahydrofolate dehydrogenase (NADP+)	299
178132	PLN02517	PLN02517	phosphatidylcholine-sterol O-acyltransferase	642
215283	PLN02518	PLN02518	pheophorbide a oxygenase	539
215284	PLN02519	PLN02519	isovaleryl-CoA dehydrogenase	404
178135	PLN02520	PLN02520	bifunctional 3-dehydroquinate dehydratase/shikimate dehydrogenase	529
215285	PLN02521	PLN02521	galactokinase	497
178137	PLN02522	PLN02522	ATP citrate (pro-S)-lyase	608
215286	PLN02523	PLN02523	galacturonosyltransferase	559
215287	PLN02524	PLN02524	S-adenosylmethionine decarboxylase	355
215288	PLN02525	PLN02525	phosphatidic acid phosphatase family protein	352
178141	PLN02526	PLN02526	acyl-coenzyme A oxidase	412
178142	PLN02527	PLN02527	aspartate carbamoyltransferase	306
215289	PLN02528	PLN02528	2-oxoisovalerate dehydrogenase E2 component	416
178144	PLN02529	PLN02529	lysine-specific histone demethylase 1	738
178145	PLN02530	PLN02530	histidine-tRNA ligase	487
215290	PLN02531	PLN02531	GTP cyclohydrolase I	469
215291	PLN02532	PLN02532	asparagine-tRNA synthetase	633
215292	PLN02533	PLN02533	probable purple acid phosphatase	427
215293	PLN02534	PLN02534	UDP-glycosyltransferase	491
215294	PLN02535	PLN02535	glycolate oxidase	364
178151	PLN02536	PLN02536	diaminopimelate epimerase	267
178152	PLN02537	PLN02537	diaminopimelate decarboxylase	410
215295	PLN02538	PLN02538	2,3-bisphosphoglycerate-independent phosphoglycerate mutase	558
178154	PLN02539	PLN02539	glucose-6-phosphate 1-dehydrogenase	491
215296	PLN02540	PLN02540	methylenetetrahydrofolate reductase	565
215297	PLN02541	PLN02541	uracil phosphoribosyltransferase	244
215298	PLN02542	PLN02542	fructose-1,6-bisphosphatase	412
215299	PLN02543	PLN02543	pfkB-type carbohydrate kinase family protein	496
178159	PLN02544	PLN02544	phosphoribosylaminoimidazole-succinocarboxamide synthase	370
215300	PLN02545	PLN02545	3-hydroxybutyryl-CoA dehydrogenase	295
215301	PLN02546	PLN02546	glutathione reductase	558
215302	PLN02547	PLN02547	dUTP pyrophosphatase	157
178163	PLN02548	PLN02548	adenosine kinase	332
178164	PLN02549	PLN02549	asparagine synthase (glutamine-hydrolyzing)	578
178165	PLN02550	PLN02550	threonine dehydratase	591
178166	PLN02551	PLN02551	aspartokinase	521
215303	PLN02552	PLN02552	isopentenyl-diphosphate delta-isomerase	247
178168	PLN02553	PLN02553	inositol-phosphate phosphatase	270
215304	PLN02554	PLN02554	UDP-glycosyltransferase family protein	481
178170	PLN02555	PLN02555	limonoid glucosyltransferase	480
178171	PLN02556	PLN02556	cysteine synthase/L-3-cyanoalanine synthase	368
178172	PLN02557	PLN02557	phosphoribosylformylglycinamidine cyclo-ligase	379
166199	PLN02558	PLN02558	CDP-diacylglycerol-glycerol-3-phosphate/ 3-phosphatidyltransferase	203
178173	PLN02559	PLN02559	chalcone--flavanone isomerase	230
178174	PLN02560	PLN02560	enoyl-CoA reductase	308
178175	PLN02561	PLN02561	triosephosphate isomerase	253
215305	PLN02562	PLN02562	UDP-glycosyltransferase	448
178177	PLN02563	PLN02563	aminoacyl-tRNA ligase	963
178178	PLN02564	PLN02564	6-phosphofructokinase	484
166206	PLN02565	PLN02565	cysteine synthase	322
215306	PLN02566	PLN02566	amine oxidase (copper-containing)	646
215307	PLN02567	PLN02567	alpha,alpha-trehalase	554
215308	PLN02568	PLN02568	polyamine oxidase	539
178182	PLN02569	PLN02569	threonine synthase	484
215309	PLN02571	PLN02571	triacylglycerol lipase	413
215310	PLN02572	PLN02572	UDP-sulfoquinovose synthase	442
215311	PLN02573	PLN02573	pyruvate decarboxylase	578
215312	PLN02574	PLN02574	4-coumarate--CoA ligase-like	560
215313	PLN02575	PLN02575	haloacid dehalogenase-like hydrolase	381
215314	PLN02576	PLN02576	protoporphyrinogen oxidase	496
178189	PLN02577	PLN02577	hydroxymethylglutaryl-CoA synthase	459
215315	PLN02578	PLN02578	hydrolase	354
215316	PLN02579	PLN02579	sphingolipid delta-4 desaturase	323
215317	PLN02580	PLN02580	trehalose-phosphatase	384
215318	PLN02581	PLN02581	red chlorophyll catabolite reductase	267
178194	PLN02582	PLN02582	1-deoxy-D-xylulose-5-phosphate synthase	677
178195	PLN02583	PLN02583	cinnamoyl-CoA reductase	297
178196	PLN02584	PLN02584	5'-methylthioadenosine nucleosidase	249
215319	PLN02585	PLN02585	magnesium protoporphyrin IX methyltransferase	315
166227	PLN02586	PLN02586	probable cinnamyl alcohol dehydrogenase	360
178198	PLN02587	PLN02587	L-galactose dehydrogenase	314
215320	PLN02588	PLN02588	glycerol-3-phosphate acyltransferase	525
166230	PLN02589	PLN02589	caffeoyl-CoA O-methyltransferase	247
178200	PLN02590	PLN02590	probable tyrosine decarboxylase	539
178201	PLN02591	PLN02591	tryptophan synthase	250
215321	PLN02592	PLN02592	ent-copalyl diphosphate synthase	800
178203	PLN02593	PLN02593	adrenodoxin-like ferredoxin protein	117
215322	PLN02594	PLN02594	phosphatidate cytidylyltransferase	342
178205	PLN02595	PLN02595	cytochrome c oxidase subunit VI protein	102
178206	PLN02596	PLN02596	hexokinase-like	490
178207	PLN02597	PLN02597	phosphoenolpyruvate carboxykinase [ATP]	555
215323	PLN02598	PLN02598	omega-6 fatty acid desaturase	421
178209	PLN02599	PLN02599	dihydroorotase	364
178210	PLN02600	PLN02600	enoyl-CoA hydratase	251
178211	PLN02601	PLN02601	beta-carotene hydroxylase	303
178212	PLN02602	PLN02602	lactate dehydrogenase	350
178213	PLN02603	PLN02603	asparaginyl-tRNA synthetase	565
215324	PLN02604	PLN02604	oxidoreductase	566
215325	PLN02605	PLN02605	monogalactosyldiacylglycerol synthase	382
215326	PLN02606	PLN02606	palmitoyl-protein thioesterase	306
215327	PLN02607	PLN02607	1-aminocyclopropane-1-carboxylate synthase	447
178218	PLN02608	PLN02608	L-ascorbate peroxidase	289
215328	PLN02609	PLN02609	catalase	492
215329	PLN02610	PLN02610	probable methionyl-tRNA synthetase	801
178221	PLN02611	PLN02611	glutamate--cysteine ligase	482
215330	PLN02612	PLN02612	phytoene desaturase	567
215331	PLN02613	PLN02613	endoglucanase	498
166255	PLN02614	PLN02614	long-chain acyl-CoA synthetase	666
178224	PLN02615	PLN02615	arginase	338
215332	PLN02616	PLN02616	tetrahydrofolate dehydrogenase/cyclohydrolase, putative	364
178226	PLN02617	PLN02617	imidazole glycerol phosphate synthase hisHF	538
215333	PLN02618	PLN02618	tryptophan synthase, beta chain	410
178228	PLN02619	PLN02619	nucleoside-diphosphate kinase	238
166261	PLN02620	PLN02620	indole-3-acetic acid-amido synthetase	612
178229	PLN02621	PLN02621	nicotinamidase	197
166263	PLN02622	PLN02622	iron superoxide dismutase	261
215334	PLN02623	PLN02623	pyruvate kinase	581
215335	PLN02624	PLN02624	ornithine-delta-aminotransferase	474
178232	PLN02625	PLN02625	uroporphyrin-III C-methyltransferase	263
215336	PLN02626	PLN02626	malate synthase	551
178234	PLN02627	PLN02627	glutamyl-tRNA synthetase	535
215337	PLN02628	PLN02628	fructose-1,6-bisphosphatase family protein	351
215338	PLN02629	PLN02629	powdery mildew resistance 5	387
178237	PLN02630	PLN02630	pfkB-type carbohydrate kinase family protein	335
178238	PLN02631	PLN02631	ferric-chelate reductase	699
215339	PLN02632	PLN02632	phytoene synthase	334
178240	PLN02633	PLN02633	palmitoyl protein thioesterase family protein	314
215340	PLN02634	PLN02634	probable pectinesterase	359
215341	PLN02635	PLN02635	disproportionating enzyme	538
215342	PLN02636	PLN02636	acyl-coenzyme A oxidase	686
215343	PLN02638	PLN02638	cellulose synthase A (UDP-forming), catalytic subunit	1079
178245	PLN02639	PLN02639	oxidoreductase, 2OG-Fe(II) oxygenase family protein	337
215344	PLN02640	PLN02640	glucose-6-phosphate 1-dehydrogenase	573
215345	PLN02641	PLN02641	anthranilate phosphoribosyltransferase	343
178248	PLN02642	PLN02642	copper, zinc superoxide dismutase	164
215346	PLN02643	PLN02643	ADP-glucose phosphorylase	336
215347	PLN02644	PLN02644	acetyl-CoA C-acetyltransferase	394
178251	PLN02645	PLN02645	phosphoglycolate phosphatase	311
215348	PLN02646	PLN02646	argininosuccinate lyase	474
215349	PLN02647	PLN02647	acyl-CoA thioesterase	437
215350	PLN02648	PLN02648	allene oxide synthase	480
215351	PLN02649	PLN02649	glucose-6-phosphate isomerase	560
178256	PLN02650	PLN02650	dihydroflavonol-4-reductase	351
178257	PLN02651	PLN02651	cysteine desulfurase	364
215352	PLN02652	PLN02652	hydrolase; alpha/beta fold family protein	395
178259	PLN02653	PLN02653	GDP-mannose 4,6-dehydratase	340
215353	PLN02654	PLN02654	acetate-CoA ligase	666
215354	PLN02655	PLN02655	ent-kaurene oxidase	466
178262	PLN02656	PLN02656	tyrosine transaminase	409
178263	PLN02657	PLN02657	3,8-divinyl protochlorophyllide a 8-vinyl reductase	390
215355	PLN02658	PLN02658	homogentisate 1,2-dioxygenase	435
215356	PLN02659	PLN02659	Probable galacturonosyltransferase	534
178266	PLN02660	PLN02660	pantoate--beta-alanine ligase	284
178267	PLN02661	PLN02661	Putative thiazole synthesis	357
178268	PLN02662	PLN02662	cinnamyl-alcohol dehydrogenase family protein	322
166304	PLN02663	PLN02663	hydroxycinnamoyl-CoA:shikimate/quinate hydroxycinnamoyltransferase	431
178269	PLN02664	PLN02664	enoyl-CoA hydratase/delta3,5-delta2,4-dienoyl-CoA isomerase	275
215357	PLN02665	PLN02665	pectinesterase family protein	366
215358	PLN02666	PLN02666	5-oxoprolinase	1275
215359	PLN02667	PLN02667	inositol polyphosphate multikinase	286
178273	PLN02668	PLN02668	indole-3-acetate carboxyl methyltransferase	386
178274	PLN02669	PLN02669	xylulokinase	556
178275	PLN02670	PLN02670	transferase, transferring glycosyl groups	472
178276	PLN02671	PLN02671	pectinesterase	359
215360	PLN02672	PLN02672	methionine S-methyltransferase	1082
215361	PLN02673	PLN02673	quinolinate synthetase A	724
178279	PLN02674	PLN02674	adenylate kinase	244
215362	PLN02676	PLN02676	polyamine oxidase	487
215363	PLN02677	PLN02677	mevalonate kinase	387
215364	PLN02678	PLN02678	seryl-tRNA synthetase	448
178283	PLN02679	PLN02679	hydrolase, alpha/beta fold family protein	360
215365	PLN02680	PLN02680	carbon-monoxide oxygenase	232
215366	PLN02681	PLN02681	proline dehydrogenase	455
215367	PLN02682	PLN02682	pectinesterase family protein	369
215368	PLN02683	PLN02683	pyruvate dehydrogenase E1 component subunit beta	356
166325	PLN02684	PLN02684	Probable galactinol--sucrose galactosyltransferase	750
215369	PLN02685	PLN02685	iron superoxide dismutase	299
215370	PLN02686	PLN02686	cinnamoyl-CoA reductase	367
215371	PLN02687	PLN02687	flavonoid 3'-monooxygenase	517
178291	PLN02688	PLN02688	pyrroline-5-carboxylate reductase	266
215372	PLN02689	PLN02689	Bifunctional isoaspartyl peptidase/L-asparaginase	318
178293	PLN02690	PLN02690	Agmatine deiminase	374
215373	PLN02691	PLN02691	porphobilinogen deaminase	351
178295	PLN02692	PLN02692	alpha-galactosidase	412
178296	PLN02693	PLN02693	IAA-amino acid hydrolase	437
178297	PLN02694	PLN02694	serine O-acetyltransferase	294
178298	PLN02695	PLN02695	GDP-D-mannose-3',5'-epimerase	370
215374	PLN02696	PLN02696	1-deoxy-D-xylulose-5-phosphate reductoisomerase	454
215375	PLN02697	PLN02697	lycopene epsilon cyclase	529
178301	PLN02698	PLN02698	Probable pectinesterase/pectinesterase inhibitor	497
215376	PLN02699	PLN02699	Bifunctional molybdopterin adenylyltransferase/molybdopterin molybdenumtransferase	659
215377	PLN02700	PLN02700	homoserine dehydrogenase family protein	377
178304	PLN02701	PLN02701	alpha-mannosidase	1050
215378	PLN02702	PLN02702	L-idonate 5-dehydrogenase	364
178306	PLN02703	PLN02703	beta-fructofuranosidase	618
166345	PLN02704	PLN02704	flavonol synthase	335
178307	PLN02705	PLN02705	beta-amylase	681
178308	PLN02706	PLN02706	glucosamine 6-phosphate N-acetyltransferase	150
178309	PLN02707	PLN02707	Soluble inorganic pyrophosphatase	267
215379	PLN02708	PLN02708	Probable pectinesterase/pectinesterase inhibitor	553
178311	PLN02709	PLN02709	nudix hydrolase	222
215380	PLN02710	PLN02710	farnesyltranstransferase subunit beta	439
215381	PLN02711	PLN02711	Probable galactinol--sucrose galactosyltransferase	777
215382	PLN02712	PLN02712	arogenate dehydrogenase	667
215383	PLN02713	PLN02713	Probable pectinesterase/pectinesterase inhibitor	566
178316	PLN02714	PLN02714	thiamin pyrophosphokinase	229
178317	PLN02715	PLN02715	lipid phosphate phosphatase	327
178318	PLN02716	PLN02716	nicotinate-nucleotide diphosphorylase (carboxylating)	308
178319	PLN02717	PLN02717	uridine nucleosidase	316
178320	PLN02718	PLN02718	Probable galacturonosyltransferase	603
178321	PLN02719	PLN02719	triacylglycerol lipase	518
178322	PLN02720	PLN02720	complex II	140
178323	PLN02721	PLN02721	threonine aldolase	353
166363	PLN02722	PLN02722	indole-3-acetamide amidohydrolase	422
178324	PLN02723	PLN02723	3-mercaptopyruvate sulfurtransferase	320
215384	PLN02724	PLN02724	Molybdenum cofactor sulfurase	805
178326	PLN02725	PLN02725	GDP-4-keto-6-deoxymannose-3,5-epimerase-4-reductase	306
215385	PLN02726	PLN02726	dolichyl-phosphate beta-D-mannosyltransferase	243
215386	PLN02727	PLN02727	NAD kinase	986
215387	PLN02728	PLN02728	2-C-methyl-D-erythritol 4-phosphate cytidylyltransferase	252
215388	PLN02729	PLN02729	PSII-Q subunit	220
178331	PLN02730	PLN02730	enoyl-[acyl-carrier-protein] reductase	303
178332	PLN02731	PLN02731	Putative lipid phosphate phosphatase	333
215389	PLN02732	PLN02732	Probable NADH dehydrogenase [ubiquinone] 1 alpha subcomplex subunit	159
215390	PLN02733	PLN02733	phosphatidylcholine-sterol O-acyltransferase	440
178335	PLN02734	PLN02734	glycyl-tRNA synthetase	684
215391	PLN02735	PLN02735	carbamoyl-phosphate synthase	1102
178337	PLN02736	PLN02736	long-chain acyl-CoA synthetase	651
215392	PLN02737	PLN02737	inositol monophosphatase family protein	363
215393	PLN02738	PLN02738	carotene beta-ring hydroxylase	633
215394	PLN02739	PLN02739	serine acetyltransferase	355
178341	PLN02740	PLN02740	Alcohol dehydrogenase-like	381
178342	PLN02741	PLN02741	riboflavin synthase	194
215395	PLN02742	PLN02742	Probable galacturonosyltransferase	534
215396	PLN02743	PLN02743	nicotinamidase	239
215397	PLN02744	PLN02744	dihydrolipoyllysine-residue acetyltransferase component of pyruvate dehydrogenase complex	539
178346	PLN02745	PLN02745	Putative pectinesterase/pectinesterase inhibitor	596
178347	PLN02746	PLN02746	hydroxymethylglutaryl-CoA lyase	347
215398	PLN02747	PLN02747	N-carbamoylputrescine amidase	296
215399	PLN02748	PLN02748	tRNA dimethylallyltransferase	468
178350	PLN02749	PLN02749	Uncharacterized protein At1g47420	173
178351	PLN02750	PLN02750	oxidoreductase, 2OG-Fe(II) oxygenase family protein	345
215400	PLN02751	PLN02751	glutamyl-tRNA(Gln) amidotransferase	544
215401	PLN02752	PLN02752	[acyl-carrier protein] S-malonyltransferase	343
178354	PLN02753	PLN02753	triacylglycerol lipase	531
215402	PLN02754	PLN02754	chorismate synthase	413
178356	PLN02755	PLN02755	complex I subunit	71
166397	PLN02756	PLN02756	S-methyl-5-thioribose kinase	418
215403	PLN02757	PLN02757	sirohydrochlorin ferrochelatase	154
215404	PLN02758	PLN02758	oxidoreductase, 2OG-Fe(II) oxygenase family protein	361
178359	PLN02759	PLN02759	Formate--tetrahydrofolate ligase	637
215405	PLN02760	PLN02760	4-aminobutyrate:pyruvate transaminase	504
215406	PLN02761	PLN02761	lipase class 3 family protein	527
215407	PLN02762	PLN02762	pyruvate kinase complex alpha subunit	509
215408	PLN02763	PLN02763	hydrolase, hydrolyzing O-glycosyl compounds	978
178364	PLN02764	PLN02764	glycosyltransferase family protein	453
215409	PLN02765	PLN02765	pyruvate kinase	526
215410	PLN02766	PLN02766	coniferyl-aldehyde dehydrogenase	501
215411	PLN02768	PLN02768	AMP deaminase	835
215412	PLN02769	PLN02769	Probable galacturonosyltransferase	629
215413	PLN02770	PLN02770	haloacid dehalogenase-like hydrolase family protein	248
178370	PLN02771	PLN02771	carbamoyl-phosphate synthase (glutamine-hydrolyzing)	415
215414	PLN02772	PLN02772	guanylate kinase	398
178372	PLN02773	PLN02773	pectinesterase	317
178373	PLN02774	PLN02774	brassinosteroid-6-oxidase	463
178374	PLN02775	PLN02775	Probable dihydrodipicolinate reductase	286
215415	PLN02776	PLN02776	prenyltransferase	341
178376	PLN02777	PLN02777	photosystem I P subunit (PSI-P)	167
178377	PLN02778	PLN02778	3,5-epimerase/4-reductase	298
215416	PLN02779	PLN02779	haloacid dehalogenase-like hydrolase family protein	286
166421	PLN02780	PLN02780	ketoreductase/ oxidoreductase	320
215417	PLN02781	PLN02781	Probable caffeoyl-CoA O-methyltransferase	234
215418	PLN02782	PLN02782	Branched-chain amino acid aminotransferase	403
178380	PLN02783	PLN02783	diacylglycerol O-acyltransferase	315
215419	PLN02784	PLN02784	alpha-amylase	894
215420	PLN02785	PLN02785	Protein HOTHEAD	587
178383	PLN02786	PLN02786	isochorismate synthase	533
215421	PLN02787	PLN02787	3-oxoacyl-[acyl-carrier-protein] synthase II	540
215422	PLN02788	PLN02788	phenylalanine-tRNA synthetase	402
215423	PLN02789	PLN02789	farnesyltranstransferase	320
215424	PLN02790	PLN02790	transketolase	654
215425	PLN02791	PLN02791	Nudix hydrolase homolog	770
178389	PLN02792	PLN02792	oxidoreductase	536
215426	PLN02793	PLN02793	Probable polygalacturonase	443
178391	PLN02794	PLN02794	cardiolipin synthase	341
178392	PLN02795	PLN02795	allantoinase	505
215427	PLN02796	PLN02796	D-glycerate 3-kinase	347
178394	PLN02797	PLN02797	phosphatidyl-N-dimethylethanolamine N-methyltransferase	164
215428	PLN02798	PLN02798	nitrilase	286
215429	PLN02799	PLN02799	Molybdopterin synthase sulfur carrier subunit	82
215430	PLN02800	PLN02800	imidazoleglycerol-phosphate dehydratase	261
215431	PLN02801	PLN02801	beta-amylase	517
215432	PLN02802	PLN02802	triacylglycerol lipase	509
178400	PLN02803	PLN02803	beta-amylase	548
178401	PLN02804	PLN02804	chalcone isomerase	206
178402	PLN02805	PLN02805	D-lactate dehydrogenase [cytochrome]	555
178403	PLN02806	PLN02806	complex I subunit	81
215433	PLN02807	PLN02807	diaminohydroxyphosphoribosylaminopyrimidine deaminase	380
166449	PLN02808	PLN02808	alpha-galactosidase	386
178405	PLN02809	PLN02809	4-hydroxybenzoate nonaprenyltransferase	289
178406	PLN02810	PLN02810	carbon-monoxide oxygenase	231
178407	PLN02811	PLN02811	hydrolase	220
178408	PLN02812	PLN02812	5-formyltetrahydrofolate cyclo-ligase	211
215434	PLN02813	PLN02813	pfkB-type carbohydrate kinase family protein	426
215435	PLN02814	PLN02814	beta-glucosidase	504
215436	PLN02815	PLN02815	L-aspartate oxidase	594
215437	PLN02816	PLN02816	mannosyltransferase	546
166458	PLN02817	PLN02817	glutathione dehydrogenase (ascorbate)	265
215438	PLN02818	PLN02818	tocopherol cyclase	403
215439	PLN02819	PLN02819	lysine-ketoglutarate reductase/saccharopine dehydrogenase	1042
178415	PLN02820	PLN02820	3-methylcrotonyl-CoA carboxylase, beta chain	569
215440	PLN02821	PLN02821	1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate reductase	460
178417	PLN02822	PLN02822	serine palmitoyltransferase	481
178418	PLN02823	PLN02823	spermine synthase	336
178419	PLN02824	PLN02824	hydrolase, alpha/beta fold family protein	294
215441	PLN02825	PLN02825	amino-acid N-acetyltransferase	515
178421	PLN02826	PLN02826	dihydroorotate dehydrogenase	409
215442	PLN02827	PLN02827	Alcohol dehydrogenase-like	378
178422	PLN02828	PLN02828	formyltetrahydrofolate deformylase	268
215443	PLN02829	PLN02829	Probable galacturonosyltransferase	639
215444	PLN02830	PLN02830	UDP-sugar pyrophosphorylase	615
215445	PLN02831	PLN02831	Bifunctional GTP cyclohydrolase II/ 3,4-dihydroxy-2-butanone-4-phosphate synthase	450
215446	PLN02832	PLN02832	glutamine amidotransferase subunit of pyridoxal 5'-phosphate synthase complex	248
215447	PLN02833	PLN02833	glycerol acyltransferase family protein	376
215448	PLN02834	PLN02834	3-dehydroquinate synthase	433
178429	PLN02835	PLN02835	oxidoreductase	539
215449	PLN02836	PLN02836	3-oxoacyl-[acyl-carrier-protein] synthase	437
215450	PLN02837	PLN02837	threonine-tRNA ligase	614
166479	PLN02838	PLN02838	3-hydroxyacyl-CoA dehydratase subunit of elongase	221
178432	PLN02839	PLN02839	nudix hydrolase	372
215451	PLN02840	PLN02840	tRNA dimethylallyltransferase	421
178434	PLN02841	PLN02841	GPI mannosyltransferase	440
178435	PLN02842	PLN02842	nucleotide kinase	505
215452	PLN02843	PLN02843	isoleucyl-tRNA synthetase	974
215453	PLN02844	PLN02844	oxidoreductase/ferric-chelate reductase	722
215454	PLN02845	PLN02845	Branched-chain-amino-acid aminotransferase-like protein	336
166487	PLN02846	PLN02846	digalactosyldiacylglycerol synthase	462
178439	PLN02847	PLN02847	triacylglycerol lipase	633
178440	PLN02848	PLN02848	adenylosuccinate lyase	458
215455	PLN02849	PLN02849	beta-glucosidase	503
215456	PLN02850	PLN02850	aspartate-tRNA ligase	530
178443	PLN02851	PLN02851	3-hydroxyisobutyryl-CoA hydrolase-like protein	407
215457	PLN02852	PLN02852	ferredoxin-NADP+ reductase	491
215458	PLN02853	PLN02853	Probable phenylalanyl-tRNA synthetase alpha chain	492
215459	PLN02854	PLN02854	3-ketoacyl-CoA synthase	521
215460	PLN02855	PLN02855	Bifunctional selenocysteine lyase/cysteine desulfurase	424
215461	PLN02856	PLN02856	fumarylacetoacetase	424
215462	PLN02857	PLN02857	octaprenyl-diphosphate synthase	416
215463	PLN02858	PLN02858	fructose-bisphosphate aldolase	1378
178450	PLN02859	PLN02859	glutamine-tRNA ligase	788
215464	PLN02860	PLN02860	o-succinylbenzoate-CoA ligase	563
178452	PLN02861	PLN02861	long-chain-fatty-acid-CoA ligase	660
178453	PLN02862	PLN02862	2-C-methyl-D-erythritol 2,4-cyclodiphosphate synthase	216
215465	PLN02863	PLN02863	UDP-glucuronosyl/UDP-glucosyl transferase family protein	477
178455	PLN02864	PLN02864	enoyl-CoA hydratase	310
215466	PLN02865	PLN02865	galactokinase	423
215467	PLN02866	PLN02866	phospholipase D	1068
178458	PLN02867	PLN02867	Probable galacturonosyltransferase	535
178459	PLN02868	PLN02868	acyl-CoA thioesterase family protein	413
166510	PLN02869	PLN02869	fatty aldehyde decarbonylase	620
215468	PLN02870	PLN02870	Probable galacturonosyltransferase	533
215469	PLN02871	PLN02871	UDP-sulfoquinovose:DAG sulfoquinovosyltransferase	465
215470	PLN02872	PLN02872	triacylglycerol lipase	395
215471	PLN02873	PLN02873	coproporphyrinogen-III oxidase	274
178462	PLN02874	PLN02874	3-hydroxyisobutyryl-CoA hydrolase-like protein	379
215472	PLN02875	PLN02875	4-hydroxyphenylpyruvate dioxygenase	398
215473	PLN02876	PLN02876	acyl-CoA dehydrogenase	822
215474	PLN02877	PLN02877	alpha-amylase/limit dextrinase	970
178466	PLN02878	PLN02878	homogentisate phytyltransferase	280
178467	PLN02879	PLN02879	L-ascorbate peroxidase	251
215475	PLN02880	PLN02880	tyrosine decarboxylase	490
215476	PLN02881	PLN02881	tetrahydrofolylpolyglutamate synthase	530
215477	PLN02882	PLN02882	aminoacyl-tRNA ligase	1159
178471	PLN02883	PLN02883	Branched-chain amino acid aminotransferase	384
178472	PLN02884	PLN02884	6-phosphofructokinase	411
178473	PLN02885	PLN02885	nicotinate phosphoribosyltransferase	545
215478	PLN02886	PLN02886	aminoacyl-tRNA ligase	389
215479	PLN02887	PLN02887	hydrolase family protein	580
215480	PLN02888	PLN02888	enoyl-CoA hydratase	265
215481	PLN02889	PLN02889	oxo-acid-lyase/anthranilate synthase	918
178478	PLN02890	PLN02890	geranyl diphosphate synthase	422
178479	PLN02891	PLN02891	IMP cyclohydrolase	547
215482	PLN02892	PLN02892	isocitrate lyase	570
215483	PLN02893	PLN02893	Cellulose synthase-like protein	734
215484	PLN02894	PLN02894	hydrolase, alpha/beta fold family protein	402
215485	PLN02895	PLN02895	phosphoacetylglucosamine mutase	562
178484	PLN02896	PLN02896	cinnamyl-alcohol dehydrogenase	353
178485	PLN02897	PLN02897	tetrahydrofolate dehydrogenase/cyclohydrolase, putative	345
215486	PLN02898	PLN02898	HMP-P kinase/thiamin-monophosphate pyrophosphorylase	502
178487	PLN02899	PLN02899	alpha-galactosidase	633
215487	PLN02900	PLN02900	alanyl-tRNA synthetase	936
215488	PLN02901	PLN02901	1-acyl-sn-glycerol-3-phosphate acyltransferase	214
215489	PLN02902	PLN02902	pantothenate kinase	876
215490	PLN02903	PLN02903	aminoacyl-tRNA ligase	652
178492	PLN02904	PLN02904	oxidoreductase	357
178493	PLN02905	PLN02905	beta-amylase	702
215491	PLN02906	PLN02906	xanthine dehydrogenase	1319
215492	PLN02907	PLN02907	glutamate-tRNA ligase	722
178496	PLN02908	PLN02908	threonyl-tRNA synthetase	686
178497	PLN02909	PLN02909	Endoglucanase	486
215493	PLN02910	PLN02910	polygalacturonate 4-alpha-galacturonosyltransferase	657
178499	PLN02911	PLN02911	inositol-phosphate phosphatase	296
178500	PLN02912	PLN02912	oxidoreductase, 2OG-Fe(II) oxygenase family protein	348
178501	PLN02913	PLN02913	dihydrofolate synthetase	510
178502	PLN02914	PLN02914	hexokinase	490
215494	PLN02915	PLN02915	cellulose synthase A [UDP-forming], catalytic subunit	1044
178504	PLN02916	PLN02916	pectinesterase family protein	502
215495	PLN02917	PLN02917	CMP-KDO synthetase	293
215496	PLN02918	PLN02918	pyridoxine (pyridoxamine) 5'-phosphate oxidase	544
215497	PLN02919	PLN02919	haloacid dehalogenase-like hydrolase family protein	1057
215498	PLN02920	PLN02920	pantothenate kinase 1	398
178509	PLN02921	PLN02921	naphthoate synthase	327
215499	PLN02922	PLN02922	prenyltransferase	315
178511	PLN02923	PLN02923	xylose isomerase	478
178512	PLN02924	PLN02924	thymidylate kinase	220
178513	PLN02925	PLN02925	4-hydroxy-3-methylbut-2-en-1-yl diphosphate synthase	733
215500	PLN02926	PLN02926	histidinol dehydrogenase	431
178515	PLN02927	PLN02927	antheraxanthin epoxidase/zeaxanthin epoxidase	668
215501	PLN02928	PLN02928	oxidoreductase family protein	347
215502	PLN02929	PLN02929	NADH kinase	301
178518	PLN02930	PLN02930	CDP-diacylglycerol-serine O-phosphatidyltransferase	353
215503	PLN02931	PLN02931	nucleoside diphosphate kinase family protein	177
178520	PLN02932	PLN02932	3-ketoacyl-CoA synthase	478
178521	PLN02933	PLN02933	Probable pectinesterase/pectinesterase inhibitor	530
215504	PLN02934	PLN02934	triacylglycerol lipase	515
215505	PLN02935	PLN02935	Bifunctional NADH kinase/NAD(+) kinase	508
178524	PLN02936	PLN02936	epsilon-ring hydroxylase	489
215506	PLN02937	PLN02937	Putative isoaspartyl peptidase/L-asparaginase	414
178526	PLN02938	PLN02938	phosphatidylserine decarboxylase	428
215507	PLN02939	PLN02939	transferase, transferring glycosyl groups	977
178528	PLN02940	PLN02940	riboflavin kinase	382
215508	PLN02941	PLN02941	inositol-tetrakisphosphate 1-kinase	328
178530	PLN02942	PLN02942	dihydropyrimidinase	486
215509	PLN02943	PLN02943	aminoacyl-tRNA ligase	958
178531	PLN02945	PLN02945	nicotinamide-nucleotide adenylyltransferase/nicotinate-nucleotide adenylyltransferase	236
178532	PLN02946	PLN02946	cysteine-tRNA ligase	557
215510	PLN02947	PLN02947	oxidoreductase	374
178534	PLN02948	PLN02948	phosphoribosylaminoimidazole carboxylase	577
215511	PLN02949	PLN02949	transferase, transferring glycosyl groups	463
215512	PLN02950	PLN02950	4-alpha-glucanotransferase	909
215513	PLN02951	PLN02951	Molybdopterin biosynthesis protein CNX2	373
178538	PLN02952	PLN02952	phosphoinositide phospholipase C	599
178539	PLN02953	PLN02953	phosphatidate cytidylyltransferase	403
215514	PLN02954	PLN02954	phosphoserine phosphatase	224
178541	PLN02955	PLN02955	8-amino-7-oxononanoate synthase	476
215515	PLN02956	PLN02956	PSII-Q subunit	185
215516	PLN02957	PLN02957	copper, zinc superoxide dismutase	238
215517	PLN02958	PLN02958	diacylglycerol kinase/D-erythro-sphingosine kinase	481
215518	PLN02959	PLN02959	aminoacyl-tRNA ligase	1084
215519	PLN02960	PLN02960	alpha-amylase	897
178546	PLN02961	PLN02961	alanine-tRNA ligase	223
178547	PLN02962	PLN02962	hydroxyacylglutathione hydrolase	251
215520	PLN02964	PLN02964	phosphatidylserine decarboxylase	644
178549	PLN02965	PLN02965	Probable pheophorbidase	255
178550	PLN02966	PLN02966	cytochrome P450 83A1	502
215521	PLN02967	PLN02967	kinase	581
215522	PLN02968	PLN02968	Probable N-acetyl-gamma-glutamyl-phosphate reductase	381
215523	PLN02969	PLN02969	9-cis-epoxycarotenoid dioxygenase	610
215524	PLN02970	PLN02970	serine racemase	328
166612	PLN02971	PLN02971	tryptophan N-hydroxylase	543
215525	PLN02972	PLN02972	Histidyl-tRNA synthetase	763
178556	PLN02973	PLN02973	beta-fructofuranosidase	571
215526	PLN02974	PLN02974	adenosylmethionine-8-amino-7-oxononanoate transaminase	817
166616	PLN02975	PLN02975	complex I subunit	97
215527	PLN02976	PLN02976	amine oxidase	1713
215528	PLN02977	PLN02977	glutathione synthetase	478
215529	PLN02978	PLN02978	pyridoxal kinase	308
166620	PLN02979	PLN02979	glycolate oxidase	366
215530	PLN02980	PLN02980	2-oxoglutarate decarboxylase/ hydro-lyase/ magnesium ion binding / thiamin pyrophosphate binding	1655
215531	PLN02981	PLN02981	glucosamine:fructose-6-phosphate aminotransferase	680
215532	PLN02982	PLN02982	galactinol-raffinose galactosyltransferase/hydrolase, hydrolyzing O-glycosyl compounds	865
215533	PLN02983	PLN02983	biotin carboxyl carrier protein of acetyl-CoA carboxylase	274
215534	PLN02984	PLN02984	oxidoreductase, 2OG-Fe(II) oxygenase family protein	341
178566	PLN02985	PLN02985	squalene monooxygenase	514
178567	PLN02986	PLN02986	cinnamyl-alcohol dehydrogenase family protein	322
166628	PLN02987	PLN02987	Cytochrome P450, family 90, subfamily A	472
178568	PLN02988	PLN02988	3-hydroxyisobutyryl-CoA hydrolase	381
178569	PLN02989	PLN02989	cinnamyl-alcohol dehydrogenase family protein	325
215535	PLN02990	PLN02990	Probable pectinesterase/pectinesterase inhibitor	572
215536	PLN02991	PLN02991	oxidoreductase	543
178572	PLN02992	PLN02992	coniferyl-alcohol glucosyltransferase	481
215537	PLN02993	PLN02993	lupeol synthase	763
166635	PLN02994	PLN02994	1-aminocyclopropane-1-carboxylate synthase	153
178574	PLN02995	PLN02995	Probable pectinesterase/pectinesterase inhibitor	539
215538	PLN02996	PLN02996	fatty acyl-CoA reductase	491
178576	PLN02997	PLN02997	flavonol synthase	325
215539	PLN02998	PLN02998	beta-glucosidase	497
178577	PLN02999	PLN02999	photosystem II oxygen-evolving enhancer 3 protein (PsbQ)	190
178578	PLN03000	PLN03000	amine oxidase	881
166642	PLN03001	PLN03001	oxidoreductase, 2OG-Fe(II) oxygenase family protein	262
178579	PLN03002	PLN03002	oxidoreductase, 2OG-Fe(II) oxygenase family protein	332
178580	PLN03003	PLN03003	Probable polygalacturonase At3g15720	456
178581	PLN03004	PLN03004	UDP-glycosyltransferase	451
178582	PLN03005	PLN03005	beta-fructofuranosidase	550
178583	PLN03006	PLN03006	carbonate dehydratase	301
178584	PLN03007	PLN03007	UDP-glucosyltransferase family protein	482
178585	PLN03008	PLN03008	Phospholipase D delta	868
166650	PLN03009	PLN03009	cellulase	495
215540	PLN03010	PLN03010	polygalacturonase	409
166653	PLN03012	PLN03012	Camelliol C synthase	759
178587	PLN03013	PLN03013	cysteine synthase	429
178588	PLN03014	PLN03014	carbonic anhydrase	347
178589	PLN03015	PLN03015	UDP-glucosyl transferase	470
178590	PLN03016	PLN03016	sinapoylglucose-malate O-sinapoyltransferase	433
178591	PLN03017	PLN03017	trehalose-phosphatase	366
178592	PLN03018	PLN03018	homomethionine N-hydroxylase	534
166660	PLN03019	PLN03019	carbonic anhydrase	330
215541	PLN03020	PLN03020	low-temperature-induced protein; Provisional	556
178593	PLN03021	PLN03021	Low-temperature-induced protein; Provisional	619
215542	PLN03023	PLN03023	Expansin-like B1; Provisional	247
178595	PLN03024	PLN03024	Putative EG45-like domain containing protein 1; Provisional	125
178596	PLN03025	PLN03025	replication factor C subunit; Provisional	319
178597	PLN03026	PLN03026	histidinol-phosphate aminotransferase; Provisional	380
215543	PLN03028	PLN03028	pyrophosphate--fructose-6-phosphate 1-phosphotransferase; Provisional	610
215544	PLN03029	PLN03029	type-a response regulator protein; Provisional	222
215545	PLN03030	PLN03030	cationic peroxidase; Provisional	324
215546	PLN03031	PLN03031	hypothetical protein; Provisional	102
166673	PLN03032	PLN03032	serine decarboxylase; Provisional	374
178601	PLN03033	PLN03033	2-dehydro-3-deoxyphosphooctonate aldolase; Provisional	290
178602	PLN03034	PLN03034	phosphoglycerate kinase; Provisional	481
178603	PLN03036	PLN03036	glutamine synthetase; Provisional	432
215547	PLN03037	PLN03037	lipase class 3 family protein; Provisional	525
166679	PLN03039	PLN03039	ethanolaminephosphotransferase; Provisional	337
215548	PLN03042	PLN03042	Lactoylglutathione lyase; Provisional	185
178606	PLN03043	PLN03043	Probable pectinesterase/pectinesterase inhibitor; Provisional	538
215549	PLN03044	PLN03044	GTP cyclohydrolase I; Provisional	188
178608	PLN03046	PLN03046	D-glycerate 3-kinase; Provisional	460
215550	PLN03049	PLN03049	pyridoxine (pyridoxamine) 5'-phosphate oxidase; Provisional	462
215551	PLN03050	PLN03050	pyridoxine (pyridoxamine) 5'-phosphate oxidase; Provisional	246
215552	PLN03051	PLN03051	acyl-activating enzyme; Provisional	499
215553	PLN03052	PLN03052	acetate--CoA ligase; Provisional	728
178613	PLN03055	PLN03055	AMP deaminase; Provisional	602
166697	PLN03058	PLN03058	dynein light chain type 1 family protein; Provisional	128
166698	PLN03059	PLN03059	beta-galactosidase; Provisional	840
215554	PLN03060	PLN03060	inositol phosphatase-like protein; Provisional	206
215555	PLN03063	PLN03063	alpha,alpha-trehalose-phosphate synthase (UDP-forming); Provisional	797
215556	PLN03064	PLN03064	alpha,alpha-trehalose-phosphate synthase (UDP-forming); Provisional	934
178617	PLN03065	PLN03065	isocitrate dehydrogenase (NADP+); Provisional	483
215557	PLN03069	PLN03069	magnesiumprotoporphyrin-IX chelatase subunit H; Provisional	1220
178619	PLN03070	PLN03070	photosystem I reaction center subunit psaK 247; Provisional	128
178620	PLN03071	PLN03071	GTP-binding nuclear protein Ran; Provisional	219
178621	PLN03072	PLN03072	60S ribosomal protein L12; Provisional	166
215558	PLN03073	PLN03073	ABC transporter F family; Provisional	718
215559	PLN03074	PLN03074	auxin influx permease; Provisional	473
178624	PLN03075	PLN03075	nicotianamine synthase; Provisional	296
215560	PLN03076	PLN03076	ARF guanine nucleotide exchange factor (ARF-GEF); Provisional	1780
215561	PLN03077	PLN03077	Protein ECB2; Provisional	857
215562	PLN03078	PLN03078	Putative tRNA pseudouridine synthase; Provisional	513
178628	PLN03079	PLN03079	Uncharacterized protein At4g33100; Provisional	91
178629	PLN03080	PLN03080	Probable beta-xylosidase; Provisional	779
215563	PLN03081	PLN03081	pentatricopeptide (PPR) repeat-containing protein; Provisional	697
215564	PLN03082	PLN03082	Iron-sulfur cluster assembly; Provisional	163
215565	PLN03083	PLN03083	E3 UFM1-protein ligase 1 homolog; Provisional	803
178633	PLN03084	PLN03084	alpha/beta hydrolase fold protein; Provisional	383
215566	PLN03085	PLN03085	nucleobase:cation symporter-1; Provisional	221
178635	PLN03086	PLN03086	PRLI-interacting factor K; Provisional	567
215567	PLN03087	PLN03087	BODYGUARD 1 domain containing hydrolase; Provisional	481
215568	PLN03088	PLN03088	SGT1, suppressor of G2 allele of SKP1; Provisional	356
215569	PLN03089	PLN03089	hypothetical protein; Provisional	373
178639	PLN03090	PLN03090	auxin-responsive family protein; Provisional	104
215570	PLN03091	PLN03091	hypothetical protein; Provisional	459
178641	PLN03093	PLN03093	Protein SENSITIVITY TO RED LIGHT REDUCED 1; Provisional	273
178642	PLN03094	PLN03094	Substrate binding subunit of ER-derived-lipid transporter; Provisional	370
215571	PLN03095	PLN03095	NADH:ubiquinone oxidoreductase 18 kDa subunit; Provisional	115
215572	PLN03096	PLN03096	glyceraldehyde-3-phosphate dehydrogenase A; Provisional	395
178645	PLN03097	FHY3	Protein FAR-RED ELONGATED HYPOCOTYL 3; Provisional	846
215573	PLN03098	LPA1	LOW PSII ACCUMULATION1; Provisional	453
215574	PLN03099	PIR	Protein PIR; Provisional	1232
215575	PLN03100	PLN03100	Permease subunit of ER-derived-lipid transporter; Provisional	292
215576	PLN03102	PLN03102	acyl-activating enzyme; Provisional	579
215577	PLN03103	PLN03103	GDP-L-galactose-hexose-1-phosphate guanyltransferase; Provisional	403
215578	PLN03104	FHL	FAR-RED-ELONGATED HYPOCOTYL1-LIKE; Provisional	201
178652	PLN03105	TCP24	transcription factor TCP24 (TEOSINTE BRANCHED1, CYCLOIDEA, AND PCF FAMILY 24); Provisional	324
215579	PLN03106	TCP2	Protein TCP2; Provisional	447
215580	PLN03107	PLN03107	eukaryotic translation initiation factor 5A; Provisional	159
178655	PLN03108	PLN03108	Rab family protein; Provisional	210
215581	PLN03109	PLN03109	ETHYLENE-INSENSITIVE3-like3 protein; Provisional	599
178657	PLN03110	PLN03110	Rab GTPase; Provisional	216
215582	PLN03111	PLN03111	DNA-directed RNA polymerase II subunit family protein; Provisional	206
215583	PLN03112	PLN03112	cytochrome P450 family protein; Provisional	514
215584	PLN03113	PLN03113	DNA ligase 1; Provisional	744
178661	PLN03114	PLN03114	ADP-ribosylation factor GTPase-activating protein AGD10; Provisional	395
215585	PLN03115	PLN03115	ferredoxin--NADP(+) reductase; Provisional	367
215586	PLN03116	PLN03116	ferredoxin--NADP+ reductase; Provisional	307
178664	PLN03117	PLN03117	Branched-chain-amino-acid aminotransferase; Provisional	355
215587	PLN03118	PLN03118	Rab family protein; Provisional	211
178666	PLN03119	PLN03119	putative ADP-ribosylation factor GTPase-activating protein AGD14; Provisional	648
215588	PLN03120	PLN03120	nucleic acid binding protein; Provisional	260
215589	PLN03121	PLN03121	nucleic acid binding protein; Provisional	243
178669	PLN03122	PLN03122	Poly [ADP-ribose] polymerase; Provisional	815
215590	PLN03123	PLN03123	poly [ADP-ribose] polymerase; Provisional	981
215591	PLN03124	PLN03124	poly [ADP-ribose] polymerase; Provisional	643
215592	PLN03126	PLN03126	Elongation factor Tu; Provisional	478
178673	PLN03127	PLN03127	Elongation factor Tu; Provisional	447
215593	PLN03128	PLN03128	DNA topoisomerase 2; Provisional	1135
215594	PLN03129	PLN03129	NADP-dependent malic enzyme; Provisional	581
215595	PLN03130	PLN03130	ABC transporter C family member; Provisional	1622
178677	PLN03131	PLN03131	hypothetical protein; Provisional	705
178678	PLN03132	PLN03132	NADH dehydrogenase (ubiquinone) flavoprotein 1; Provisional	461
215596	PLN03133	PLN03133	beta-1,3-galactosyltransferase; Provisional	636
178680	PLN03134	PLN03134	glycine-rich RNA-binding protein 4; Provisional	144
178681	PLN03136	PLN03136	Ferredoxin; Provisional	148
215597	PLN03137	PLN03137	ATP-dependent DNA helicase; Q4-like; Provisional	1195
215598	PLN03138	PLN03138	Protein TOC75; Provisional	796
178684	PLN03139	PLN03139	formate dehydrogenase; Provisional	386
215599	PLN03140	PLN03140	ABC transporter G family member; Provisional	1470
215600	PLN03141	PLN03141	3-epi-6-deoxocathasterone 23-monooxygenase; Provisional	452
215601	PLN03142	PLN03142	Probable chromatin-remodeling complex ATPase chain; Provisional	1033
215602	PLN03143	PLN03143	nudix hydrolase; Provisional	291
178689	PLN03144	PLN03144	Carbon catabolite repressor protein 4 homolog; Provisional	606
215603	PLN03145	PLN03145	Protein phosphatase 2c; Provisional	365
178691	PLN03146	PLN03146	aspartyl protease family protein; Provisional	431
178692	PLN03147	PLN03147	ribosomal protein S19; Provisional	92
178693	PLN03148	PLN03148	Blue copper-like protein; Provisional	167
178694	PLN03149	PLN03149	peptidyl-prolyl isomerase H (cyclophilin H); Provisional	186
178695	PLN03150	PLN03150	hypothetical protein; Provisional	623
215604	PLN03151	PLN03151	cation/calcium exchanger; Provisional	650
178697	PLN03152	PLN03152	hypothetical protein; Provisional	241
215605	PLN03153	PLN03153	hypothetical protein; Provisional	537
215606	PLN03154	PLN03154	putative allyl alcohol dehydrogenase; Provisional	348
178700	PLN03155	PLN03155	cytochrome c oxidase subunit 5C; Provisional	63
178701	PLN03156	PLN03156	GDSL esterase/lipase; Provisional	351
178702	PLN03157	PLN03157	spermidine hydroxycinnamoyl transferase; Provisional	447
215607	PLN03158	PLN03158	methionine aminopeptidase; Provisional	396
215608	PLN03159	PLN03159	cation/H(+) antiporter 15; Provisional	832
215609	PLN03160	PLN03160	uncharacterized protein; Provisional	219
178706	PLN03161	PLN03161	Probable xyloglucan endotransglucosylase/hydrolase protein; Provisional	291
178707	PLN03162	PLN03162	golden-2 like transcription factor; Provisional	526
215610	PLN03164	PLN03164	3-oxo-5-alpha-steroid 4-dehydrogenase, C-terminal domain containing protein; Provisional	323
178709	PLN03165	PLN03165	chaperone protein dnaJ-related; Provisional	111
178710	PLN03166	PLN03166	60S ribosomal protein L34; Provisional	96
215611	PLN03167	PLN03167	Chaperonin-60 beta subunit; Provisional	600
178712	PLN03168	PLN03168	chalcone synthase; Provisional	389
215612	PLN03169	PLN03169	chalcone synthase family protein; Provisional	391
178714	PLN03170	PLN03170	chalcone synthase; Provisional	401
178715	PLN03171	PLN03171	chalcone synthase-like protein; Provisional	399
178716	PLN03172	PLN03172	chalcone synthase family protein; Provisional	393
178717	PLN03173	PLN03173	chalcone synthase; Provisional	391
215613	PLN03174	PLN03174	Chalcone-flavanone isomerase-related; Provisional	278
178719	PLN03175	PLN03175	hypothetical protein; Provisional	415
178720	PLN03176	PLN03176	flavanone-3-hydroxylase; Provisional	120
215614	PLN03178	PLN03178	leucoanthocyanidin dioxygenase; Provisional	360
215615	PLN03180	PLN03180	reversibly glycosylated polypeptide; Provisional	346
215616	PLN03181	PLN03181	glycosyltransferase; Provisional	453
215617	PLN03182	PLN03182	xyloglucan 6-xylosyltransferase; Provisional	429
178725	PLN03183	PLN03183	acetylglucosaminyltransferase family protein; Provisional	421
215618	PLN03184	PLN03184	chloroplast Hsp70; Provisional	673
215619	PLN03185	PLN03185	phosphatidylinositol phosphate kinase; Provisional	765
178728	PLN03186	PLN03186	DNA repair protein RAD51 homolog; Provisional	342
215620	PLN03187	PLN03187	meiotic recombination protein DMC1 homolog; Provisional	344
215621	PLN03188	PLN03188	kinesin-12 family protein; Provisional	1320
215622	PLN03189	PLN03189	Protease specific for SMALL UBIQUITIN-RELATED MODIFIER (SUMO); Provisional	490
215623	PLN03190	PLN03190	aminophospholipid translocase; Provisional	1178
215624	PLN03191	PLN03191	Type I inositol-1,4,5-trisphosphate 5-phosphatase 2; Provisional	621
215625	PLN03192	PLN03192	Voltage-dependent potassium channel; Provisional	823
178735	PLN03193	PLN03193	beta-1,3-galactosyltransferase; Provisional	408
215626	PLN03194	PLN03194	putative disease resistance protein; Provisional	187
215627	PLN03195	PLN03195	fatty acid omega-hydroxylase; Provisional	516
215628	PLN03196	PLN03196	MOC1-like protein; Provisional	487
178739	PLN03198	PLN03198	delta6-acyl-lipid desaturase; Provisional	526
178740	PLN03199	PLN03199	delta6-acyl-lipid desaturase-like protein; Provisional	485
215629	PLN03200	PLN03200	cellulose synthase-interactive protein; Provisional	2102
215630	PLN03201	PLN03201	RAB geranylgeranyl transferase beta-subunit; Provisional	316
215631	PLN03202	PLN03202	protein argonaute; Provisional	900
178744	PLN03205	PLN03205	ATR interacting protein; Provisional	652
178745	PLN03206	PLN03206	phosphoribosylformylglycinamidine synthase; Provisional	1307
215632	PLN03207	PLN03207	stomagen; Provisional	113
178747	PLN03208	PLN03208	E3 ubiquitin-protein ligase RMA2; Provisional	193
178748	PLN03209	PLN03209	translocon at the inner envelope of chloroplast subunit 62; Provisional	576
215633	PLN03210	PLN03210	Resistant to P. syringae 6; Provisional	1153
215634	PLN03211	PLN03211	ABC transporter G-25; Provisional	659
178751	PLN03212	PLN03212	Transcription repressor MYB5; Provisional	249
178752	PLN03213	PLN03213	repressor of silencing 3; Provisional	759
215635	PLN03214	PLN03214	probable enoyl-CoA hydratase/isomerase; Provisional	278
178754	PLN03215	PLN03215	ascorbic acid mannose pathway regulator 1; Provisional	373
178755	PLN03216	PLN03216	actin depolymerizing factor; Provisional	141
178756	PLN03217	PLN03217	transcription factor ATBS1; Provisional	93
215636	PLN03218	PLN03218	maturation of RBCL 1; Provisional	1060
178758	PLN03219	PLN03219	uncharacterized protein; Provisional	108
178759	PLN03220	PLN03220	uncharacterized protein; Provisional	105
178760	PLN03221	PLN03221	rapid alkalinization factor 23; Provisional	137
178761	PLN03222	PLN03222	rapid alkalinization factor 23-like protein; Provisional	119
215637	PLN03223	PLN03223	Polycystin cation channel protein; Provisional	1634
178763	PLN03224	PLN03224	probable serine/threonine protein kinase; Provisional	507
215638	PLN03225	PLN03225	Serine/threonine-protein kinase SNT7; Provisional	566
215639	PLN03226	PLN03226	serine hydroxymethyltransferase; Provisional	475
178766	PLN03227	PLN03227	serine palmitoyltransferase-like protein; Provisional	392
178767	PLN03228	PLN03228	methylthioalkylmalate synthase; Provisional	503
178768	PLN03229	PLN03229	acetyl-coenzyme A carboxylase carboxyl transferase subunit alpha; Provisional	762
178769	PLN03230	PLN03230	acetyl-coenzyme A carboxylase carboxyl transferase; Provisional	431
178770	PLN03231	PLN03231	putative alpha-galactosidase; Provisional	357
215640	PLN03232	PLN03232	ABC transporter C family member; Provisional	1495
178772	PLN03233	PLN03233	putative glutamate-tRNA ligase; Provisional	523
178773	PLN03234	PLN03234	cytochrome P450 83B1; Provisional	499
178774	PLN03236	PLN03236	4-alpha-glucanotransferase; Provisional	745
215641	PLN03237	PLN03237	DNA topoisomerase 2; Provisional	1465
215642	PLN03238	PLN03238	probable histone acetyltransferase MYST; Provisional	290
178777	PLN03239	PLN03239	histone acetyltransferase; Provisional	351
178778	PLN03240	PLN03240	putative Low-temperature-induced protein; Provisional	626
215643	PLN03241	PLN03241	magnesium chelatase subunit H; Provisional	1315
178780	PLN03242	PLN03242	diacylglycerol o-acyltransferase; Provisional	410
215644	PLN03243	PLN03243	haloacid dehalogenase-like hydrolase; Provisional	260
178782	PLN03244	PLN03244	alpha-amylase; Provisional	872
215645	PLN03246	PLN03246	26S proteasome regulatory subunit; Provisional	303
234564	PRK00001	rplC	50S ribosomal protein L3; Validated	210
234565	PRK00002	aroB	3-dehydroquinate synthase; Reviewed	358
234566	PRK00004	rplX	50S ribosomal protein L24; Reviewed	105
234567	PRK00005	fmt	methionyl-tRNA formyltransferase; Reviewed	309
234568	PRK00006	fabZ	3-hydroxyacyl-ACP dehydratase FabZ. 	147
234569	PRK00007	PRK00007	elongation factor G; Reviewed	693
234570	PRK00009	PRK00009	phosphoenolpyruvate carboxylase; Reviewed	911
178791	PRK00010	rplE	50S ribosomal protein L5; Validated	179
234571	PRK00011	glyA	serine hydroxymethyltransferase; Reviewed	416
234572	PRK00012	gatA	Asp-tRNA(Asn)/Glu-tRNA(Gln) amidotransferase subunit GatA. 	459
234573	PRK00013	groEL	chaperonin GroEL; Reviewed	542
134031	PRK00014	ribB	3,4-dihydroxy-2-butanone 4-phosphate synthase; Provisional	230
234574	PRK00015	rnhB	ribonuclease HII; Validated	197
234575	PRK00016	PRK00016	metal-binding heat shock protein; Provisional	159
234576	PRK00019	rpmE	50S ribosomal protein L31; Reviewed	72
134035	PRK00020	truB	tRNA pseudouridine synthase B; Provisional	244
234577	PRK00021	truA	tRNA pseudouridine(38-40) synthase TruA. 	244
234578	PRK00022	lolB	lipoprotein localization protein LolB. 	202
234579	PRK00023	cmk	(d)CMP kinase. 	225
178801	PRK00024	PRK00024	DNA repair protein RadC. 	224
234580	PRK00025	lpxB	lipid-A-disaccharide synthase; Reviewed	380
234581	PRK00026	trmD	tRNA (guanine-N(1)-)-methyltransferase; Reviewed	244
234582	PRK00028	infC	translation initiation factor IF-3; Reviewed	177
234583	PRK00029	PRK00029	YdiU family protein. 	487
178806	PRK00030	minC	septum site-determining protein MinC. 	292
178807	PRK00031	lolA	outer membrane lipoprotein chaperone LolA. 	195
234584	PRK00032	PRK00032	septum formation inhibitor Maf. 	190
178809	PRK00033	clpS	ATP-dependent Clp protease adaptor protein ClpS; Reviewed	100
178810	PRK00034	gatC	Asp-tRNA(Asn)/Glu-tRNA(Gln) amidotransferase subunit GatC. 	95
234585	PRK00035	hemH	ferrochelatase; Reviewed	333
134050	PRK00036	PRK00036	primosomal replication protein N; Reviewed	107
234586	PRK00037	hisS	histidyl-tRNA synthetase; Reviewed	412
134052	PRK00038	rnpA	ribonuclease P protein component. 	123
234587	PRK00039	ruvC	Holliday junction resolvase; Reviewed	164
234588	PRK00040	rpsP	30S ribosomal protein S16; Reviewed	75
178815	PRK00041	PRK00041	hypothetical protein; Validated	93
234589	PRK00042	tpiA	triosephosphate isomerase; Provisional	250
234590	PRK00043	thiE	thiamine phosphate synthase. 	212
234591	PRK00044	psd	phosphatidylserine decarboxylase; Reviewed	288
234592	PRK00045	hemA	glutamyl-tRNA reductase; Reviewed	423
234593	PRK00046	murB	UDP-N-acetylmuramate dehydrogenase. 	334
234594	PRK00047	glpK	glycerol kinase GlpK. 	498
234595	PRK00048	PRK00048	dihydrodipicolinate reductase; Provisional	257
234596	PRK00049	PRK00049	elongation factor Tu; Reviewed	396
234597	PRK00050	PRK00050	16S rRNA (cytosine(1402)-N(4))-methyltransferase RsmH. 	296
234598	PRK00051	hisI	phosphoribosyl-AMP cyclohydrolase; Reviewed	125
234599	PRK00052	PRK00052	prolipoprotein diacylglyceryl transferase; Reviewed	269
234600	PRK00053	alr	alanine racemase; Reviewed	363
234601	PRK00054	PRK00054	dihydroorotate dehydrogenase electron transfer subunit; Reviewed	250
234602	PRK00055	PRK00055	ribonuclease Z; Reviewed	270
234603	PRK00056	mtgA	monofunctional biosynthetic peptidoglycan transglycosylase; Provisional	236
234604	PRK00058	PRK00058	peptide-methionine (S)-S-oxide reductase MsrA. 	213
234605	PRK00059	prsA	peptidylprolyl isomerase; Provisional	336
234606	PRK00061	ribH	6,7-dimethyl-8-ribityllumazine synthase; Provisional	154
234607	PRK00062	PRK00062	glutamate-1-semialdehyde 2,1-aminomutase. 	426
234608	PRK00064	recF	recombination protein F; Reviewed	361
178836	PRK00066	ldh	L-lactate dehydrogenase; Reviewed	315
234609	PRK00068	PRK00068	hypothetical protein; Validated	970
234610	PRK00070	acpS	4'-phosphopantetheinyl transferase; Provisional	126
234611	PRK00071	nadD	nicotinate-nucleotide adenylyltransferase. 	203
234612	PRK00072	hemC	porphobilinogen deaminase; Reviewed	295
234613	PRK00073	pgk	phosphoglycerate kinase; Provisional	389
234614	PRK00074	guaA	GMP synthase; Reviewed	511
234615	PRK00075	cbiD	cobalt-precorrin-6A synthase; Reviewed	361
234616	PRK00076	recR	recombination protein RecR; Reviewed	196
234617	PRK00077	eno	enolase; Provisional	425
234618	PRK00078	PRK00078	Maf-like protein; Reviewed	192
234619	PRK00080	ruvB	Holliday junction branch migration DNA helicase RuvB. 	328
234620	PRK00081	coaE	dephospho-CoA kinase; Reviewed	194
234621	PRK00082	hrcA	heat-inducible transcription repressor; Provisional	339
178850	PRK00083	frr	ribosome recycling factor; Reviewed	185
178851	PRK00084	ispF	2-C-methyl-D-erythritol 2,4-cyclodiphosphate synthase; Reviewed	159
234622	PRK00085	recO	DNA repair protein RecO; Reviewed	247
234623	PRK00087	PRK00087	bifunctional 4-hydroxy-3-methylbut-2-enyl diphosphate reductase/30S ribosomal protein S1. 	647
234624	PRK00089	era	GTPase Era; Reviewed	292
234625	PRK00090	bioD	ATP-dependent dethiobiotin synthetase BioD. 	222
234626	PRK00091	miaA	tRNA delta(2)-isopentenylpyrophosphate transferase; Reviewed	307
234627	PRK00092	PRK00092	ribosome maturation protein RimP; Reviewed	154
234628	PRK00093	PRK00093	GTP-binding protein Der; Reviewed	435
234629	PRK00094	gpsA	NAD(P)H-dependent glycerol-3-phosphate dehydrogenase. 	325
234630	PRK00095	mutL	DNA mismatch repair endonuclease MutL. 	617
234631	PRK00098	PRK00098	GTPase RsgA; Reviewed	298
234632	PRK00099	rplJ	50S ribosomal protein L10; Reviewed	172
234633	PRK00102	rnc	ribonuclease III; Reviewed	229
234634	PRK00103	PRK00103	rRNA large subunit methyltransferase; Provisional	157
234635	PRK00104	scpA	segregation and condensation protein A; Reviewed	242
234636	PRK00105	cobT	nicotinate-nucleotide--dimethylbenzimidazole phosphoribosyltransferase; Reviewed	335
178867	PRK00106	PRK00106	ribonuclease Y. 	535
234637	PRK00107	gidB	16S rRNA (guanine(527)-N(7))-methyltransferase RsmG. 	187
234638	PRK00108	mraY	phospho-N-acetylmuramoyl-pentapeptide-transferase; Provisional	344
234639	PRK00109	PRK00109	Holliday junction resolvase RuvX. 	138
234640	PRK00110	PRK00110	YebC/PmpR family DNA-binding transcriptional regulator. 	245
234641	PRK00111	PRK00111	hypothetical protein; Provisional	180
234642	PRK00112	tgt	queuine tRNA-ribosyltransferase; Provisional	366
234643	PRK00114	hslO	Hsp33 family molecular chaperone HslO. 	293
234644	PRK00115	hemE	uroporphyrinogen decarboxylase; Validated	346
234645	PRK00116	ruvA	Holliday junction branch migration protein RuvA. 	192
234646	PRK00117	recX	recombination regulator RecX; Reviewed	157
234647	PRK00118	PRK00118	putative DNA-binding protein; Validated	104
234648	PRK00120	PRK00120	dITP/XTP pyrophosphatase; Reviewed	196
234649	PRK00121	trmB	tRNA (guanine-N(7)-)-methyltransferase; Reviewed	202
234650	PRK00122	rimM	16S rRNA-processing protein RimM; Provisional	172
178882	PRK00124	PRK00124	YaiI/YqxD family protein. 	151
234651	PRK00125	pyrF	orotidine 5'-phosphate decarboxylase; Reviewed	278
234652	PRK00128	ipk	4-diphosphocytidyl-2-C-methyl-D-erythritol kinase; Provisional	286
234653	PRK00129	upp	uracil phosphoribosyltransferase; Reviewed	209
178886	PRK00130	truB	tRNA pseudouridine synthase B; Provisional	290
234654	PRK00131	aroK	shikimate kinase; Reviewed	175
178888	PRK00132	rpsI	30S ribosomal protein S9; Reviewed	130
234655	PRK00133	metG	methionyl-tRNA synthetase; Reviewed	673
234656	PRK00134	ccrB	fluoride efflux transporter family protein. 	104
234657	PRK00135	scpB	segregation and condensation protein B; Reviewed	188
234658	PRK00136	rpsH	30S ribosomal protein S8; Validated	130
234659	PRK00137	rplI	50S ribosomal protein L9; Reviewed	147
234660	PRK00139	murE	UDP-N-acetylmuramoylalanyl-D-glutamate--2,6-diaminopimelate ligase; Provisional	460
234661	PRK00140	rplK	50S ribosomal protein L11; Validated	141
234662	PRK00141	murD	UDP-N-acetylmuramoyl-L-alanine--D-glutamate ligase. 	473
234663	PRK00142	PRK00142	rhodanese-related sulfurtransferase. 	314
234664	PRK00143	mnmA	tRNA-specific 2-thiouridylase MnmA; Reviewed	346
234665	PRK00145	PRK00145	putative inner membrane protein translocase component YidC; Provisional	223
234666	PRK00147	queA	S-adenosylmethionine:tRNA ribosyltransferase-isomerase; Provisional	342
178901	PRK00148	PRK00148	Maf-like protein; Reviewed	194
234667	PRK00149	dnaA	chromosomal replication initiator protein DnaA. 	401
234668	PRK00150	def	peptide deformylase; Reviewed	165
234669	PRK00153	PRK00153	YbaB/EbfC family nucleoid-associated protein. 	104
234670	PRK00155	ispD	D-ribitol-5-phosphate cytidylyltransferase. 	227
234671	PRK00157	rplL	50S ribosomal protein L7/L12; Reviewed	123
178907	PRK00159	PRK00159	putative septation inhibitor protein; Reviewed	87
178908	PRK00162	glpE	thiosulfate sulfurtransferase GlpE. 	108
234672	PRK00164	moaA	GTP 3',8-cyclase MoaA. 	331
234673	PRK00166	apaH	symmetrical bis(5'-nucleosyl)-tetraphosphatase. 	275
234674	PRK00168	coaD	phosphopantetheine adenylyltransferase; Provisional	159
234675	PRK00170	PRK00170	azoreductase; Reviewed	201
234676	PRK00172	rpmI	50S ribosomal protein L35; Reviewed	65
178914	PRK00173	rph	ribonuclease PH; Reviewed	238
234677	PRK00174	PRK00174	acetyl-CoA synthetase; Provisional	637
234678	PRK00175	metX	homoserine O-acetyltransferase; Provisional	379
166839	PRK00178	tolB	Tol-Pal system protein TolB. 	430
234679	PRK00179	pgi	glucose-6-phosphate isomerase; Reviewed	548
234680	PRK00180	PRK00180	acetate kinase A/propionate kinase 2; Reviewed	402
234681	PRK00182	tatB	Sec-independent protein translocase subunit TatB. 	160
166842	PRK00183	PRK00183	hypothetical protein; Provisional	157
166843	PRK00187	PRK00187	NorM family multidrug efflux MATE transporter. 	464
234682	PRK00188	trpD	anthranilate phosphoribosyltransferase; Provisional	339
234683	PRK00191	tatA	twin arginine translocase protein A; Provisional	84
234684	PRK00192	PRK00192	mannosyl-3-phosphoglycerate phosphatase; Reviewed	273
178923	PRK00194	PRK00194	ACT domain-containing protein. 	90
234685	PRK00197	proA	gamma-glutamyl phosphate reductase; Provisional	417
178925	PRK00199	ihfB	integration host factor subunit beta; Reviewed	94
234686	PRK00202	nusB	transcription antitermination factor NusB. 	137
178927	PRK00203	rnhA	ribonuclease H; Reviewed	150
178928	PRK00207	PRK00207	sulfurtransferase complex subunit TusD. 	128
234687	PRK00208	thiG	thiazole synthase; Reviewed	250
178930	PRK00211	PRK00211	sulfurtransferase complex subunit TusC. 	119
234688	PRK00215	PRK00215	transcriptional repressor LexA. 	205
234689	PRK00216	ubiE	bifunctional demethylmenaquinone methyltransferase/2-methoxy-6-polyprenyl-1,4-benzoquinol methylase UbiE. 	239
234690	PRK00218	PRK00218	lysogenization regulator HflD. 	207
234691	PRK00220	PRK00220	glycerol-3-phosphate 1-O-acyltransferase PlsY. 	198
234692	PRK00222	PRK00222	peptide-methionine (R)-S-oxide reductase MsrB. 	142
234693	PRK00226	greA	transcription elongation factor GreA; Reviewed	157
178937	PRK00227	glnD	[protein-PII] uridylyltransferase. 	693
234694	PRK00228	PRK00228	YqgE/AlgH family protein. 	191
234695	PRK00230	PRK00230	orotidine-5'-phosphate decarboxylase. 	230
234696	PRK00232	pdxA	4-hydroxythreonine-4-phosphate dehydrogenase; Reviewed	332
166864	PRK00234	PRK00234	Maf-like protein; Reviewed	192
234697	PRK00235	cobS	cobalamin synthase; Reviewed	249
234698	PRK00236	xerC	site-specific tyrosine recombinase XerC; Reviewed	297
178943	PRK00239	rpsT	30S ribosomal protein S20; Reviewed	88
234699	PRK00241	nudC	NAD(+) diphosphatase. 	256
178945	PRK00247	PRK00247	putative inner membrane protein translocase component YidC; Validated	429
234700	PRK00249	flgH	flagellar basal body L-ring protein FlgH. 	222
234701	PRK00252	alaS	alanyl-tRNA synthetase; Reviewed	865
178948	PRK00253	fliE	flagellar hook-basal body protein FliE; Reviewed	108
234702	PRK00254	PRK00254	ski2-like helicase; Provisional	720
166874	PRK00257	PRK00257	4-phosphoerythronate dehydrogenase PdxB. 	381
234703	PRK00258	aroE	shikimate 5-dehydrogenase; Reviewed	278
234704	PRK00259	PRK00259	intracellular septation protein A; Reviewed	179
234705	PRK00260	cysS	cysteinyl-tRNA synthetase; Validated	463
234706	PRK00269	zipA	cell division protein ZipA; Reviewed	293
234707	PRK00270	rpsU	30S ribosomal protein S21; Reviewed	64
234708	PRK00274	ksgA	16S rRNA (adenine(1518)-N(6)/adenine(1519)-N(6))-dimethyltransferase RsmA. 	272
234709	PRK00275	glnD	PII uridylyl-transferase; Provisional	895
178954	PRK00276	infA	translation initiation factor IF-1; Validated	72
178955	PRK00277	clpP	ATP-dependent Clp protease proteolytic subunit; Reviewed	200
234710	PRK00278	trpC	indole-3-glycerol phosphate synthase TrpC. 	260
234711	PRK00279	adk	adenylate kinase; Reviewed	215
234712	PRK00281	PRK00281	undecaprenyl-diphosphate phosphatase. 	268
234713	PRK00283	xerD	tyrosine recombinase. 	299
178960	PRK00284	pqqA	pyrroloquinoline quinone precursor peptide PqqA. 	23
178961	PRK00285	ihfA	integration host factor subunit alpha; Reviewed	99
234714	PRK00286	xseA	exodeoxyribonuclease VII large subunit; Reviewed	438
234715	PRK00290	dnaK	molecular chaperone DnaK; Provisional	627
234716	PRK00292	glk	glucokinase; Provisional	316
234717	PRK00293	dipZ	thiol:disulfide interchange protein precursor; Provisional	571
166894	PRK00294	hscB	co-chaperone HscB; Provisional	173
166895	PRK00295	PRK00295	hypothetical protein; Provisional	68
234718	PRK00296	minE	cell division topological specificity factor MinE; Reviewed	86
178967	PRK00299	PRK00299	sulfurtransferase TusA. 	81
234719	PRK00300	gmk	guanylate kinase; Provisional	205
234720	PRK00301	aat	leucyl/phenylalanyl-tRNA--protein transferase; Reviewed	233
234721	PRK00302	lnt	apolipoprotein N-acyltransferase; Reviewed	505
166901	PRK00304	PRK00304	hypothetical protein; Provisional	75
178971	PRK00306	PRK00306	50S ribosomal protein L29; Reviewed	66
234722	PRK00310	rpsC	30S ribosomal protein S3; Reviewed	232
234723	PRK00311	panB	3-methyl-2-oxobutanoate hydroxymethyltransferase; Reviewed	264
178974	PRK00312	pcm	protein-L-isoaspartate(D-aspartate) O-methyltransferase. 	212
234724	PRK00315	PRK00315	potassium-transporting ATPase subunit KdpC. 	193
234725	PRK00317	mobA	molybdopterin-guanine dinucleotide biosynthesis protein MobA; Reviewed	193
234726	PRK00321	rdgC	recombination associated protein; Reviewed	303
234727	PRK00325	algL	polysaccharide lyase. 	359
234728	PRK00326	PRK00326	transcriptional regulator MraZ. 	139
178979	PRK00329	PRK00329	GIY-YIG nuclease superfamily protein; Validated	86
234729	PRK00331	PRK00331	isomerizing glutamine--fructose-6-phosphate transaminase. 	604
234730	PRK00339	minC	septum site-determining protein MinC. 	249
166914	PRK00341	PRK00341	DUF493 domain-containing protein. 	91
234731	PRK00343	ipk	4-diphosphocytidyl-2-C-methyl-D-erythritol kinase; Provisional	271
234732	PRK00346	surE	5'(3')-nucleotidase/polyphosphatase; Provisional	250
234733	PRK00347	PRK00347	DNA/RNA nuclease SfsA. 	234
234734	PRK00349	uvrA	excinuclease ABC subunit UvrA. 	943
178985	PRK00357	rpsS	30S ribosomal protein S19; Reviewed	92
234735	PRK00358	pyrH	uridylate kinase; Provisional	231
234736	PRK00359	rpmB	50S ribosomal protein L28; Reviewed	76
178988	PRK00364	groES	co-chaperonin GroES; Reviewed	95
234737	PRK00366	ispG	flavodoxin-dependent (E)-4-hydroxy-3-methylbut-2-enyl-diphosphate synthase. 	360
234738	PRK00369	pyrC	dihydroorotase; Provisional	392
178991	PRK00373	PRK00373	V-type ATP synthase subunit D; Reviewed	204
234739	PRK00376	lspA	lipoprotein signal peptidase. 	160
234740	PRK00377	cbiT	cobalt-precorrin-6Y C(15)-methyltransferase; Provisional	198
178993	PRK00378	PRK00378	nucleoid-associated protein NdpA; Validated	334
234741	PRK00380	panC	pantoate--beta-alanine ligase; Reviewed	281
234742	PRK00389	gcvT	glycine cleavage system aminomethyltransferase GcvT. 	359
234743	PRK00390	leuS	leucyl-tRNA synthetase; Validated	805
178997	PRK00391	rpsR	30S ribosomal protein S18; Reviewed	79
234744	PRK00392	rpoZ	DNA-directed RNA polymerase subunit omega; Reviewed	69
234745	PRK00393	ribA	GTP cyclohydrolase II RibA. 	197
234746	PRK00394	PRK00394	transcription factor; Reviewed	179
179001	PRK00395	hfq	RNA-binding protein Hfq; Provisional	79
179002	PRK00396	rnpA	ribonuclease P protein component. 	130
234747	PRK00398	rpoP	DNA-directed RNA polymerase subunit P; Provisional	46
179004	PRK00399	rpmH	50S ribosomal protein L34; Reviewed	44
179005	PRK00400	hisE	phosphoribosyl-ATP diphosphatase. 	105
234748	PRK00402	PRK00402	3-isopropylmalate dehydratase large subunit; Reviewed	418
166942	PRK00404	tatB	Sec-independent protein translocase subunit TatB. 	141
234749	PRK00405	rpoB	DNA-directed RNA polymerase subunit beta; Reviewed	1112
179008	PRK00407	Archease	Archease protein family (MTH1598/TM1083). This archease family of proteins has two SHS2 domains, with one inserted into the other. It is predicted to be an enzyme and to act as a chaperone in DNA/RNA metabolism.	139
234750	PRK00409	PRK00409	recombination and DNA strand exchange inhibitor protein; Reviewed	782
234751	PRK00411	cdc6	ORC1-type DNA replication protein. 	394
234752	PRK00413	thrS	threonyl-tRNA synthetase; Reviewed	638
179012	PRK00414	gmhA	D-sedoheptulose 7-phosphate isomerase. 	192
179013	PRK00415	rps27e	30S ribosomal protein S27e; Reviewed	59
234753	PRK00416	dcd	deoxycytidine triphosphate deaminase; Reviewed	177
234754	PRK00418	PRK00418	DNA gyrase inhibitor YacG. 	62
234755	PRK00419	PRK00419	DNA primase small subunit PriS. 	376
234756	PRK00420	PRK00420	hypothetical protein; Validated	112
234757	PRK00421	murC	UDP-N-acetylmuramate--L-alanine ligase; Provisional	461
234758	PRK00423	tfb	transcription initiation factor IIB; Reviewed	310
179020	PRK00430	fis	DNA-binding transcriptional regulator Fis. 	95
234759	PRK00431	PRK00431	ADP-ribose-binding protein. 	177
234760	PRK00432	PRK00432	30S ribosomal protein S27ae; Validated	50
179023	PRK00435	ef1B	elongation factor 1-beta; Validated	88
234761	PRK00436	argC	N-acetyl-gamma-glutamyl-phosphate reductase; Validated	343
234762	PRK00439	leuD	3-isopropylmalate dehydratase small subunit; Reviewed	163
234763	PRK00440	rfc	replication factor C small subunit; Reviewed	319
179027	PRK00441	argR	arginine repressor; Provisional	149
234764	PRK00442	tatA	twin-arginine translocase TatA/TatE family subunit. 	92
179028	PRK00443	nagB	glucosamine-6-phosphate deaminase; Provisional	261
234765	PRK00446	cyaY	frataxin-like protein; Provisional	105
234766	PRK00447	PRK00447	hypothetical protein; Provisional	144
234767	PRK00448	polC	DNA polymerase III PolC; Validated	1437
234768	PRK00450	dapF	diaminopimelate epimerase; Provisional	274
234769	PRK00451	PRK00451	aminomethyl-transferring glycine dehydrogenase subunit GcvPA. 	447
179034	PRK00453	rpsF	30S ribosomal protein S6; Reviewed	108
234770	PRK00454	engB	GTP-binding protein YsxC; Reviewed	196
234771	PRK00455	pyrE	orotate phosphoribosyltransferase; Validated	202
234772	PRK00458	PRK00458	adenosylmethionine decarboxylase. 	127
234773	PRK00461	rpmC	50S ribosomal protein L29; Reviewed	87
234774	PRK00464	nrdR	transcriptional repressor NrdR. 	154
179039	PRK00465	rpmJ	50S ribosomal protein L36; Reviewed	37
166979	PRK00466	PRK00466	acetyl-lysine deacetylase; Validated	346
179040	PRK00468	PRK00468	KH domain-containing protein. 	75
179041	PRK00474	rps9p	30S ribosomal protein S9P; Reviewed	134
234775	PRK00476	aspS	aspartyl-tRNA synthetase; Validated	588
234776	PRK00478	scpA	segregation and condensation protein ScpA. 	505
234777	PRK00481	PRK00481	NAD-dependent deacetylase; Provisional	242
234778	PRK00484	lysS	lysyl-tRNA synthetase; Reviewed	491
234779	PRK00485	fumC	fumarate hydratase; Reviewed	464
234780	PRK00488	pheS	phenylalanyl-tRNA synthetase subunit alpha; Validated	339
234781	PRK00489	hisG	ATP phosphoribosyltransferase; Reviewed	287
234782	PRK00499	rnpA	ribonuclease P; Reviewed	114
234783	PRK00504	rpmG	50S ribosomal protein L33; Validated	50
234784	PRK00507	PRK00507	deoxyribose-phosphate aldolase; Provisional	221
234785	PRK00509	PRK00509	argininosuccinate synthase; Provisional	399
179052	PRK00513	minC	septum formation inhibitor; Reviewed	214
234786	PRK00517	prmA	50S ribosomal protein L11 methyltransferase. 	250
234787	PRK00521	rbfA	30S ribosome-binding factor RbfA. 	120
179055	PRK00522	tpx	thiol peroxidase. 	167
179056	PRK00523	PRK00523	hypothetical protein; Provisional	72
179057	PRK00528	rpmE	50S ribosomal protein L31; Reviewed	71
234788	PRK00529	PRK00529	elongation factor P; Validated	186
134311	PRK00536	speE	spermidine synthase; Provisional	262
179059	PRK00539	atpC	F0F1 ATP synthase subunit epsilon; Validated	133
234789	PRK00549	PRK00549	competence damage-inducible protein A; Provisional	414
234790	PRK00550	rpsE	30S ribosomal protein S5; Validated	168
179062	PRK00553	PRK00553	ribose-phosphate pyrophosphokinase; Provisional	332
179063	PRK00555	PRK00555	galactokinase; Provisional	363
234791	PRK00556	minC	septum formation inhibitor; Reviewed	194
234792	PRK00558	uvrC	excinuclease ABC subunit UvrC. 	598
167003	PRK00560	PRK00560	molybdenum cofactor guanylyltransferase MobA. 	196
100598	PRK00561	ppnK	NAD(+) kinase. 	259
179066	PRK00564	hypA	hydrogenase nickel incorporation protein; Provisional	117
234793	PRK00565	rplV	50S ribosomal protein L22; Reviewed	112
234794	PRK00566	PRK00566	DNA-directed RNA polymerase subunit beta'; Provisional	1156
234795	PRK00567	mscL	large-conductance mechanosensitive channel protein MscL. 	134
134322	PRK00568	PRK00568	carbon storage regulator; Provisional	76
234796	PRK00571	atpC	F0F1 ATP synthase subunit epsilon; Validated	135
100605	PRK00573	lspA	signal peptidase II; Provisional	184
234797	PRK00575	tatA	Sec-independent protein translocase subunit TatA. 	92
234798	PRK00576	PRK00576	molybdopterin-guanine dinucleotide biosynthesis protein A; Provisional	178
234799	PRK00578	prfB	peptide chain release factor 2; Validated	367
234800	PRK00587	PRK00587	YbaB/EbfC family nucleoid-associated protein. 	99
179073	PRK00588	rnpA	ribonuclease P protein component. 	118
234801	PRK00591	prfA	peptide chain release factor 1; Validated	359
179075	PRK00595	rpmG	50S ribosomal protein L33; Validated	53
179076	PRK00596	rpsJ	30S ribosomal protein S10; Reviewed	102
234802	PRK00601	dut	dUTP diphosphatase. 	150
100616	PRK00611	PRK00611	putative disulfide oxidoreductase; Provisional	135
234803	PRK00615	PRK00615	glutamate-1-semialdehyde aminotransferase; Provisional	433
167014	PRK00624	PRK00624	glycine cleavage system protein H; Provisional	114
134335	PRK00625	PRK00625	shikimate kinase; Provisional	173
234804	PRK00629	pheT	phenylalanyl-tRNA synthetase subunit beta; Reviewed	791
234805	PRK00630	PRK00630	nickel responsive regulator; Provisional	148
234806	PRK00635	PRK00635	excinuclease ABC subunit A; Provisional	1809
179080	PRK00642	PRK00642	inorganic pyrophosphatase; Provisional	205
100624	PRK00647	PRK00647	hypothetical protein; Validated	96
234807	PRK00648	PRK00648	Maf-like protein; Reviewed	191
134340	PRK00650	PRK00650	4-(cytidine 5'-diphospho)-2-C-methyl-D-erythritol kinase. 	288
234808	PRK00652	lpxK	tetraacyldisaccharide 4'-kinase; Reviewed	325
234809	PRK00654	glgA	glycogen synthase GlgA. 	466
179084	PRK00665	petG	cytochrome b6-f complex subunit PetG; Reviewed	37
179085	PRK00668	ndk	multifunctional nucleoside diphosphate kinase/apyrimidinic endonuclease/3'-; Validated	134
234810	PRK00676	hemA	glutamyl-tRNA reductase; Validated	338
179086	PRK00683	murD	UDP-N-acetylmuramoyl-L-alanine--D-glutamate ligase. 	418
234811	PRK00685	PRK00685	metal-dependent hydrolase; Provisional	228
234812	PRK00694	PRK00694	4-hydroxy-3-methylbut-2-en-1-yl diphosphate synthase; Validated	606
234813	PRK00696	sucC	ADP-forming succinate--CoA ligase subunit beta. 	388
234814	PRK00698	tmk	thymidylate kinase; Validated	205
234815	PRK00701	PRK00701	divalent metal cation transporter MntH. 	439
234816	PRK00702	PRK00702	ribose-5-phosphate isomerase RpiA. 	220
234817	PRK00704	PRK00704	photosystem I reaction center protein subunit XI; Provisional	160
234818	PRK00708	PRK00708	twin-arginine translocase subunit TatB. 	209
234819	PRK00711	PRK00711	D-amino acid dehydrogenase. 	416
234820	PRK00714	PRK00714	RNA pyrophosphohydrolase; Reviewed	156
234821	PRK00719	PRK00719	alkanesulfonate monooxygenase; Provisional	378
234822	PRK00720	tatA	twin-arginine translocase TatA/TatE family subunit. 	78
179097	PRK00723	PRK00723	phosphatidylserine decarboxylase; Provisional	297
234823	PRK00724	PRK00724	formate dehydrogenase accessory sulfurtransferase FdhD. 	263
234824	PRK00725	glgC	glucose-1-phosphate adenylyltransferase; Provisional	425
234825	PRK00726	murG	undecaprenyldiphospho-muramoylpentapeptide beta-N-acetylglucosaminyltransferase; Provisional	357
234826	PRK00730	rnpA	ribonuclease P protein component. 	138
179101	PRK00732	fliE	flagellar hook-basal body complex protein FliE. 	102
234827	PRK00733	hppA	membrane-bound proton-translocating pyrophosphatase; Validated	666
179103	PRK00736	PRK00736	hypothetical protein; Provisional	68
179104	PRK00737	PRK00737	small nuclear ribonucleoprotein; Provisional	72
179105	PRK00741	prfC	peptide chain release factor 3; Provisional	526
234828	PRK00742	PRK00742	chemotaxis-specific protein-glutamate methyltransferase CheB. 	354
179107	PRK00745	PRK00745	4-oxalocrotonate tautomerase; Provisional	62
179108	PRK00748	PRK00748	1-(5-phosphoribosyl)-5-[(5-phosphoribosylamino)methylideneamino] imidazole-4-carboxamide isomerase; Validated	233
234829	PRK00750	lysK	lysyl-tRNA synthetase; Reviewed	510
134373	PRK00753	psbL	photosystem II reaction center L; Provisional	39
179110	PRK00754	PRK00754	signal recognition particle protein Srp19; Provisional	95
179111	PRK00756	PRK00756	acyltransferase NodA; Provisional	196
179112	PRK00758	PRK00758	GMP synthase subunit A; Validated	184
179113	PRK00762	hypA	hydrogenase maturation nickel metallochaperone HypA. 	124
234830	PRK00766	PRK00766	hypothetical protein; Provisional	194
179115	PRK00767	PRK00767	transcriptional regulator BetI; Validated	197
234831	PRK00768	nadE	ammonia-dependent NAD(+) synthetase. 	268
179117	PRK00770	PRK00770	deoxyhypusine synthase. 	384
179118	PRK00771	PRK00771	signal recognition particle protein Srp54; Provisional	437
234832	PRK00772	PRK00772	3-isopropylmalate dehydrogenase; Provisional	358
234833	PRK00773	rplX	50S ribosomal protein LX; Validated	76
234834	PRK00777	PRK00777	pantetheine-phosphate adenylyltransferase. 	153
234835	PRK00779	PRK00779	ornithine carbamoyltransferase; Provisional	304
234836	PRK00782	PRK00782	MEMO1 family protein. 	267
234837	PRK00783	PRK00783	DNA-directed RNA polymerase subunit D; Provisional	263
234838	PRK00784	PRK00784	cobyric acid synthase. 	488
179126	PRK00790	fliE	flagellar hook-basal body complex protein FliE. 	109
179127	PRK00794	flbT	flagellar biosynthesis repressor FlbT; Reviewed	132
234839	PRK00801	PRK00801	hypothetical protein; Provisional	201
234840	PRK00802	PRK00802	DNA-3-methyladenine glycosylase. 	188
234841	PRK00805	PRK00805	putative deoxyhypusine synthase; Provisional	329
179131	PRK00807	PRK00807	50S ribosomal protein L24e; Validated	52
179132	PRK00808	PRK00808	bacteriohemerythrin. 	150
179133	PRK00809	PRK00809	hypothetical protein; Provisional	144
234842	PRK00810	nifW	nitrogenase stabilizing/protective protein NifW. 	113
234843	PRK00811	PRK00811	polyamine aminopropyltransferase. 	283
234844	PRK00816	rnfD	electron transport complex protein RnfD; Reviewed	350
179136	PRK00819	PRK00819	RNA 2'-phosphotransferase; Reviewed	179
234845	PRK00823	phhB	pterin-4-alpha-carbinolamine dehydratase; Validated	97
179138	PRK00831	rpmJ	50S ribosomal protein L36; Validated	41
179139	PRK00843	egsA	NAD(P)-dependent glycerol-1-phosphate dehydrogenase; Reviewed	350
234846	PRK00844	glgC	glucose-1-phosphate adenylyltransferase; Provisional	407
179141	PRK00846	PRK00846	hypothetical protein; Provisional	77
234847	PRK00847	thyX	FAD-dependent thymidylate synthase; Reviewed	217
234848	PRK00854	rocD	ornithine--oxo-acid transaminase; Reviewed	401
179143	PRK00855	PRK00855	argininosuccinate lyase; Provisional	459
234849	PRK00856	pyrB	aspartate carbamoyltransferase catalytic subunit. 	305
234850	PRK00861	PRK00861	putative lipid kinase; Reviewed	300
234851	PRK00865	PRK00865	glutamate racemase; Provisional	261
179147	PRK00870	PRK00870	haloalkane dehalogenase; Provisional	302
234852	PRK00871	PRK00871	glutathione-regulated potassium-efflux system oxidoreductase KefF. 	176
179149	PRK00872	PRK00872	hypothetical protein; Provisional	157
179150	PRK00876	nadE	NAD(+) synthase. 	326
234853	PRK00877	hisD	bifunctional histidinal dehydrogenase/histidinol dehydrogenase; Reviewed	425
234854	PRK00881	purH	bifunctional phosphoribosylaminoimidazolecarboxamide formyltransferase/IMP cyclohydrolase; Provisional	513
234855	PRK00884	PRK00884	Maf-like protein; Reviewed	194
234856	PRK00885	PRK00885	phosphoribosylamine--glycine ligase; Provisional	420
234857	PRK00886	PRK00886	2-phosphosulfolactate phosphatase family protein. 	240
179156	PRK00888	ftsB	cell division protein FtsB; Reviewed	105
179157	PRK00889	PRK00889	adenylylsulfate kinase; Provisional	175
234858	PRK00892	lpxD	UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase; Provisional	343
234859	PRK00893	PRK00893	aspartate carbamoyltransferase regulatory subunit; Reviewed	152
234860	PRK00901	PRK00901	methylated-DNA--protein-cysteine methyltransferase; Provisional	155
179161	PRK00907	PRK00907	hypothetical protein; Provisional	92
179162	PRK00910	ribB	3,4-dihydroxy-2-butanone-4-phosphate synthase. 	218
234861	PRK00911	PRK00911	dihydroxy-acid dehydratase; Provisional	552
234862	PRK00912	PRK00912	ribonuclease P protein component 3; Provisional	237
234863	PRK00913	PRK00913	multifunctional aminopeptidase A; Provisional	483
234864	PRK00915	PRK00915	2-isopropylmalate synthase; Validated	513
179167	PRK00919	PRK00919	glutamine-hydrolyzing GMP synthase subunit GuaA. 	307
234865	PRK00923	PRK00923	sirohydrochlorin nickelochelatase. 	126
179169	PRK00924	PRK00924	5-keto-4-deoxyuronate isomerase; Provisional	276
234866	PRK00927	PRK00927	tryptophanyl-tRNA synthetase; Reviewed	333
234867	PRK00933	PRK00933	ribosomal biogenesis protein; Validated	165
234868	PRK00934	PRK00934	ribose-phosphate pyrophosphokinase; Provisional	285
179173	PRK00939	PRK00939	stress response translation initiation inhibitor YciH. 	99
179174	PRK00941	PRK00941	acetyl-CoA decarbonylase/synthase complex subunit alpha; Validated	781
234869	PRK00942	PRK00942	acetylglutamate kinase; Provisional	283
234870	PRK00943	PRK00943	selenide, water dikinase SelD. 	347
234871	PRK00944	PRK00944	hypothetical protein; Provisional	195
179177	PRK00945	PRK00945	acetyl-CoA decarbonylase/synthase complex subunit epsilon; Provisional	171
234872	PRK00950	PRK00950	histidinol-phosphate transaminase. 	361
234873	PRK00951	hisB	imidazoleglycerol-phosphate dehydratase HisB. 	195
234874	PRK00955	PRK00955	YgiQ family radical SAM protein. 	620
179181	PRK00956	thyA	thymidylate synthase; Provisional	208
234875	PRK00957	PRK00957	methionine synthase; Provisional	305
234876	PRK00960	PRK00960	seryl-tRNA synthetase; Provisional	517
179184	PRK00961	PRK00961	H(2)-dependent methylenetetrahydromethanopterin dehydrogenase; Provisional	342
179185	PRK00962	PRK00962	hypothetical protein; Provisional	165
234877	PRK00964	PRK00964	tetrahydromethanopterin S-methyltransferase subunit A; Provisional	225
179187	PRK00965	PRK00965	tetrahydromethanopterin S-methyltransferase subunit B; Provisional	96
179188	PRK00967	PRK00967	hypothetical protein; Provisional	105
234878	PRK00968	PRK00968	tetrahydromethanopterin S-methyltransferase subunit D; Provisional	240
234879	PRK00969	PRK00969	methanogenesis marker 3 protein. 	508
234880	PRK00971	PRK00971	glutaminase; Provisional	307
234881	PRK00972	PRK00972	tetrahydromethanopterin S-methyltransferase subunit E; Provisional	292
179193	PRK00973	PRK00973	glucose-6-phosphate isomerase; Provisional	446
234882	PRK00976	PRK00976	methanogenesis marker 12 protein. 	326
179195	PRK00977	PRK00977	exodeoxyribonuclease VII small subunit; Provisional	80
234883	PRK00979	PRK00979	tetrahydromethanopterin S-methyltransferase subunit H; Provisional	308
179197	PRK00982	acpP	acyl carrier protein; Provisional	78
234884	PRK00984	truD	tRNA pseudouridine synthase D; Reviewed	341
179199	PRK00989	truB	tRNA pseudouridine synthase B; Provisional	230
234885	PRK00994	PRK00994	F420-dependent methylenetetrahydromethanopterin dehydrogenase. 	277
234886	PRK00996	PRK00996	ribonuclease HIII; Provisional	304
234887	PRK01001	PRK01001	putative inner membrane protein translocase component YidC; Provisional	795
179203	PRK01002	PRK01002	nickel responsive regulator; Provisional	141
179204	PRK01005	PRK01005	V-type ATP synthase subunit E; Provisional	207
134464	PRK01008	PRK01008	queuine tRNA-ribosyltransferase; Provisional	372
179205	PRK01018	PRK01018	50S ribosomal protein L30e; Reviewed	99
167141	PRK01021	lpxB	lipid-A-disaccharide synthase; Reviewed	608
234888	PRK01022	PRK01022	hypothetical protein; Provisional	167
179207	PRK01024	PRK01024	Na(+)-translocating NADH-quinone reductase subunit B; Provisional	503
179208	PRK01026	PRK01026	tetrahydromethanopterin S-methyltransferase subunit G; Provisional	77
234889	PRK01029	tolB	Tol-Pal system protein TolB. 	428
234890	PRK01030	PRK01030	tetrahydromethanopterin S-methyltransferase subunit C; Provisional	264
234891	PRK01033	PRK01033	imidazole glycerol phosphate synthase subunit HisF; Provisional	258
234892	PRK01037	trmD	tRNA (guanine-N(1)-)-methyltransferase/unknown domain fusion protein; Reviewed	357
234893	PRK01045	ispH	4-hydroxy-3-methylbut-2-enyl diphosphate reductase; Reviewed	298
234894	PRK01059	PRK01059	ATP:guanido phosphotransferase; Provisional	346
179214	PRK01060	PRK01060	endonuclease IV; Provisional	281
234895	PRK01061	PRK01061	Na(+)-translocating NADH-quinone reductase subunit E; Provisional	244
179216	PRK01064	PRK01064	hypothetical protein; Provisional	78
167150	PRK01066	PRK01066	porphobilinogen deaminase; Provisional	231
179217	PRK01076	PRK01076	L-rhamnose isomerase; Provisional	419
234896	PRK01077	PRK01077	cobyrinate a,c-diamide synthase. 	451
234897	PRK01096	PRK01096	deoxyguanosinetriphosphate triphosphohydrolase-like protein; Provisional	440
179220	PRK01099	rpoK	DNA-directed RNA polymerase subunit K; Provisional	62
234898	PRK01100	PRK01100	accessory gene regulator AgrB-like protein. 	210
234899	PRK01103	PRK01103	bifunctional DNA-formamidopyrimidine glycosylase/DNA-(apurinic or apyrimidinic site) lyase. 	274
234900	PRK01109	PRK01109	ATP-dependent DNA ligase; Provisional	590
234901	PRK01110	rpmF	50S ribosomal protein L32; Validated	60
234902	PRK01112	PRK01112	2,3-bisphosphoglycerate-dependent phosphoglycerate mutase. 	228
234903	PRK01115	PRK01115	DNA polymerase sliding clamp; Validated	247
234904	PRK01117	PRK01117	adenylosuccinate synthetase; Provisional	430
179228	PRK01119	PRK01119	putative heavy metal-binding protein. 	106
234905	PRK01122	PRK01122	potassium-transporting ATPase subunit KdpB. 	679
234906	PRK01123	PRK01123	shikimate kinase; Provisional	282
234907	PRK01130	PRK01130	putative N-acetylmannosamine-6-phosphate 2-epimerase. 	221
234908	PRK01143	rpl11p	50S ribosomal protein L11P; Validated	163
234909	PRK01146	PRK01146	DNA-directed RNA polymerase subunit L; Provisional	85
179234	PRK01151	rps17E	30S ribosomal protein S17e; Validated	58
179235	PRK01153	PRK01153	nicotinamide-nucleotide adenylyltransferase; Provisional	174
100796	PRK01156	PRK01156	chromosome segregation protein; Provisional	895
234910	PRK01158	PRK01158	phosphoglycolate phosphatase; Provisional	230
234911	PRK01160	PRK01160	hypothetical protein; Provisional	178
234912	PRK01170	PRK01170	bifunctional pantetheine-phosphate adenylyltransferase/NTP phosphatase. 	322
100801	PRK01172	PRK01172	ATP-dependent DNA helicase. 	674
234913	PRK01175	PRK01175	phosphoribosylformylglycinamidine synthase I; Provisional	261
167170	PRK01177	PRK01177	hypothetical protein; Provisional	140
179239	PRK01178	rps24e	30S ribosomal protein S24e; Reviewed	99
234914	PRK01184	PRK01184	flagellar hook-basal body complex protein FliE. 	184
179241	PRK01185	ppnK	NAD(+) kinase. 	271
100807	PRK01189	PRK01189	V-type ATP synthase subunit F; Provisional	104
234915	PRK01191	rpl24p	50S ribosomal protein L24P; Validated	120
234916	PRK01192	PRK01192	50S ribosomal protein L31e; Reviewed	89
100810	PRK01194	PRK01194	V-type ATP synthase subunit E; Provisional	185
234917	PRK01198	PRK01198	V-type ATP synthase subunit C; Provisional	352
234918	PRK01202	PRK01202	glycine cleavage system protein GcvH. 	127
100813	PRK01203	PRK01203	prefoldin subunit alpha; Provisional	130
100814	PRK01207	PRK01207	methionine synthase; Provisional	343
234919	PRK01209	cobD	cobalamin biosynthesis protein. 	312
179247	PRK01211	PRK01211	dihydroorotase; Provisional	409
234920	PRK01212	PRK01212	homoserine kinase; Provisional	301
234921	PRK01213	PRK01213	phosphoribosylformylglycinamidine synthase subunit PurL. 	724
179250	PRK01215	PRK01215	nicotinamide mononucleotide deamidase-related protein. 	264
179251	PRK01216	PRK01216	DNA polymerase IV; Validated	351
179252	PRK01217	PRK01217	hypothetical protein; Provisional	114
179253	PRK01220	PRK01220	malonate decarboxylase subunit delta; Provisional	99
234922	PRK01221	PRK01221	deoxyhypusine synthase. 	312
234923	PRK01222	PRK01222	phosphoribosylanthranilate isomerase. 	210
234924	PRK01229	PRK01229	N-glycosylase/DNA lyase; Provisional	208
179257	PRK01231	ppnK	NAD(+) kinase. 	295
234925	PRK01233	glyS	glycyl-tRNA synthetase subunit beta; Validated	682
234926	PRK01236	PRK01236	S-adenosylmethionine decarboxylase proenzyme; Provisional	131
234927	PRK01237	PRK01237	triphosphoribosyl-dephospho-CoA synthase; Validated	289
179261	PRK01242	rpl39e	50S ribosomal protein L39e; Validated	50
179262	PRK01250	PRK01250	inorganic diphosphatase. 	176
179263	PRK01253	PRK01253	preprotein translocase subunit Sec61beta. 	54
234928	PRK01254	PRK01254	YgiQ family radical SAM protein. 	707
234929	PRK01259	PRK01259	ribose-phosphate diphosphokinase. 	309
234930	PRK01261	aroD	3-dehydroquinate dehydratase; Provisional	229
234931	PRK01265	PRK01265	heat shock protein HtpX; Provisional	324
234932	PRK01269	PRK01269	tRNA s(4)U8 sulfurtransferase; Provisional	482
179269	PRK01271	PRK01271	tautomerase PptA. 	76
179270	PRK01278	argD	acetylornithine transaminase protein; Provisional	389
234933	PRK01285	PRK01285	pyruvoyl-dependent arginine decarboxylase; Reviewed	155
234934	PRK01286	PRK01286	deoxyguanosinetriphosphate triphosphohydrolase-like protein; Provisional	336
234935	PRK01287	xerC	site-specific tyrosine recombinase XerC; Reviewed	358
234936	PRK01293	PRK01293	phosphoribosyl-dephospho-CoA transferase; Provisional	207
234937	PRK01294	PRK01294	lipase secretion chaperone. 	336
167205	PRK01295	PRK01295	phosphoglyceromutase; Provisional	206
234938	PRK01297	PRK01297	ATP-dependent RNA helicase RhlB; Provisional	475
234939	PRK01305	PRK01305	arginyl-tRNA-protein transferase; Provisional	240
167208	PRK01310	PRK01310	hypothetical protein; Validated	104
234940	PRK01313	rnpA	ribonuclease P protein component. 	129
234941	PRK01315	PRK01315	putative inner membrane protein translocase component YidC; Provisional	329
234942	PRK01318	PRK01318	membrane protein insertase; Provisional	521
179280	PRK01322	PRK01322	6-carboxyhexanoate--CoA ligase; Provisional	242
179281	PRK01326	prsA	foldase protein PrsA; Reviewed	310
234943	PRK01343	PRK01343	zinc-binding protein; Provisional	57
234944	PRK01345	PRK01345	heat shock protein HtpX; Provisional	317
234945	PRK01346	PRK01346	enhanced intracellular survival protein Eis. 	411
234946	PRK01355	PRK01355	azoreductase; Reviewed	199
167217	PRK01356	hscB	co-chaperone HscB; Provisional	166
234947	PRK01362	PRK01362	fructose-6-phosphate aldolase. 	214
179286	PRK01368	murD	UDP-N-acetylmuramoyl-L-alanine--D-glutamate ligase. 	454
179287	PRK01371	PRK01371	Sec-independent protein translocase protein TatB. 	137
234948	PRK01372	ddl	D-alanine--D-alanine ligase; Reviewed	304
134546	PRK01379	cyaY	iron donor protein CyaY. 	103
179289	PRK01381	PRK01381	trp operon repressor. 	99
234949	PRK01388	PRK01388	arginine deiminase; Provisional	406
234950	PRK01390	murD	UDP-N-acetylmuramoyl-L-alanyl-D-glutamate synthetase; Provisional	460
234951	PRK01392	citX	2'-(5''-triphosphoribosyl)-3'-dephospho-CoA:apo-citrate lyase; Reviewed	180
179293	PRK01395	PRK01395	V-type ATP synthase subunit F; Provisional	104
179294	PRK01397	PRK01397	50S ribosomal protein L31; Provisional	78
234952	PRK01402	hslO	Hsp33-like chaperonin; Reviewed	328
234953	PRK01406	gltX	glutamyl-tRNA synthetase; Reviewed	476
167229	PRK01415	PRK01415	hypothetical protein; Validated	247
234954	PRK01424	PRK01424	S-adenosylmethionine:tRNA ribosyltransferase-isomerase; Provisional	366
234955	PRK01433	hscA	chaperone protein HscA; Provisional	595
179297	PRK01438	murD	UDP-N-acetylmuramoyl-L-alanyl-D-glutamate synthetase; Provisional	480
167232	PRK01441	PRK01441	Maf-like protein; Reviewed	207
179298	PRK01470	tatA	twin arginine translocase protein A; Provisional	51
100879	PRK01474	atpC	F0F1 ATP synthase subunit epsilon; Validated	112
134562	PRK01482	fliE	flagellar hook-basal body complex protein FliE. 	108
234956	PRK01490	tig	trigger factor; Provisional	435
100883	PRK01492	rnpA	ribonuclease P protein component. 	118
234957	PRK01526	PRK01526	Maf-like protein; Reviewed	205
179300	PRK01528	truB	tRNA pseudouridine synthase B; Provisional	292
134567	PRK01530	PRK01530	hypothetical protein; Reviewed	105
134568	PRK01533	PRK01533	histidinol-phosphate aminotransferase; Validated	366
234958	PRK01544	PRK01544	bifunctional N5-glutamine S-adenosyl-L-methionine-dependent methyltransferase/tRNA (m7G46) methyltransferase; Reviewed	506
100891	PRK01546	PRK01546	hypothetical protein; Provisional	79
234959	PRK01550	truB	tRNA pseudouridine synthase B; Provisional	304
179302	PRK01558	PRK01558	V-type ATP synthase subunit E; Provisional	198
234960	PRK01565	PRK01565	thiamine biosynthesis protein ThiI; Provisional	394
179304	PRK01574	lspA	signal peptidase II; Provisional	163
234961	PRK01581	speE	polyamine aminopropyltransferase. 	374
234962	PRK01584	PRK01584	alanyl-tRNA synthetase; Provisional	594
234963	PRK01610	PRK01610	putative voltage-gated ClC-type chloride channel ClcB; Provisional	418
234964	PRK01611	argS	arginyl-tRNA synthetase; Reviewed	507
234965	PRK01614	tatE	Sec-independent protein translocase subunit TatA. 	85
234966	PRK01617	PRK01617	hypothetical protein; Provisional	154
179310	PRK01622	PRK01622	membrane protein insertase YidC. 	256
179311	PRK01625	sspH	acid-soluble spore protein H; Provisional	59
167247	PRK01631	PRK01631	hypothetical protein; Provisional	76
179312	PRK01636	ccrB	fluoride efflux transporter CrcB. 	118
179313	PRK01637	PRK01637	virulence factor BrkB family protein. 	286
179314	PRK01641	leuD	3-isopropylmalate dehydratase small subunit. 	200
234967	PRK01642	cls	cardiolipin synthetase; Reviewed	483
179316	PRK01655	spxA	transcriptional regulator Spx; Reviewed	131
167253	PRK01658	PRK01658	CidA/LrgA family holin-like protein. 	122
234968	PRK01663	PRK01663	C4-dicarboxylate transporter DctA; Reviewed	428
234969	PRK01678	rpmE2	type B 50S ribosomal protein L31. 	87
234970	PRK01683	PRK01683	trans-aconitate 2-methyltransferase; Provisional	258
234971	PRK01686	hisG	ATP phosphoribosyltransferase catalytic subunit; Reviewed	215
234972	PRK01688	PRK01688	histidinol-phosphate aminotransferase; Provisional	351
179322	PRK01699	fliE	flagellar hook-basal body complex protein FliE. 	99
167260	PRK01706	PRK01706	adenosylmethionine decarboxylase. 	123
179323	PRK01710	murD	UDP-N-acetylmuramoyl-L-alanine--D-glutamate ligase. 	458
234973	PRK01712	PRK01712	carbon storage regulator CsrA. 	64
167263	PRK01713	PRK01713	ornithine carbamoyltransferase; Provisional	334
234974	PRK01722	PRK01722	formimidoylglutamase; Provisional	320
234975	PRK01723	PRK01723	3-deoxy-D-manno-octulosonic-acid kinase; Reviewed	239
179327	PRK01732	rnpA	ribonuclease P; Reviewed	114
234976	PRK01736	PRK01736	hypothetical protein; Reviewed	190
234977	PRK01741	PRK01741	cell division protein ZipA; Provisional	332
179329	PRK01742	tolB	Tol-Pal system protein TolB. 	429
234978	PRK01747	mnmC	bifunctional tRNA (5-methylaminomethyl-2-thiouridine)(34)-methyltransferase MnmD/FAD-dependent 5-carboxymethylaminomethyl-2-thiouridine(34) oxidoreductase MnmC. 	662
234979	PRK01749	PRK01749	disulfide bond formation protein DsbB. 	176
179332	PRK01752	PRK01752	YchJ family protein. 	156
234980	PRK01759	glnD	bifunctional uridylyltransferase/uridylyl-removing protein GlnD. 	854
234981	PRK01766	PRK01766	multidrug efflux protein; Reviewed	456
179334	PRK01770	PRK01770	Sec-independent protein translocase subunit TatB. 	171
179335	PRK01773	hscB	Fe-S protein assembly co-chaperone HscB. 	173
234982	PRK01777	PRK01777	RnfH family protein. 	95
167278	PRK01792	ribB	3,4-dihydroxy-2-butanone-4-phosphate synthase. 	214
179337	PRK01810	PRK01810	DNA polymerase IV; Validated	407
179338	PRK01816	PRK01816	hypothetical protein; Provisional	143
234983	PRK01821	PRK01821	hypothetical protein; Provisional	133
234984	PRK01827	thyA	thymidylate synthase; Reviewed	264
167284	PRK01833	tatA	Sec-independent protein translocase subunit TatA. 	74
179341	PRK01839	PRK01839	septum formation inhibitor Maf. 	209
234985	PRK01842	PRK01842	hypothetical protein; Provisional	149
100947	PRK01844	PRK01844	YneF family protein. 	72
234986	PRK01851	truB	tRNA pseudouridine synthase B; Provisional	303
234987	PRK01862	PRK01862	voltage-gated chloride channel ClcB. 	574
179345	PRK01885	greB	transcription elongation factor GreB; Reviewed	157
234988	PRK01889	PRK01889	GTPase RsgA; Reviewed	356
234989	PRK01903	rnpA	ribonuclease P protein component. 	133
234990	PRK01904	PRK01904	DUF2057 domain-containing protein. 	219
179348	PRK01905	PRK01905	Fis family transcriptional regulator. 	77
179349	PRK01906	PRK01906	tetraacyldisaccharide 4'-kinase; Provisional	338
179350	PRK01908	PRK01908	electron transport complex protein RnfG; Validated	205
234991	PRK01909	pdxA	4-hydroxythreonine-4-phosphate dehydrogenase PdxA. 	329
179352	PRK01911	ppnK	inorganic polyphosphate/ATP-NAD kinase; Provisional	292
179353	PRK01917	PRK01917	cation-binding hemerythrin HHE family protein; Provisional	139
234992	PRK01919	tatB	Sec-independent protein translocase subunit TatB. 	169
179355	PRK01964	PRK01964	4-oxalocrotonate tautomerase; Provisional	64
234993	PRK01966	ddl	D-alanine--D-alanine ligase. 	333
234994	PRK01973	PRK01973	septum site-determining protein MinC. 	271
179358	PRK02001	PRK02001	ribosome assembly cofactor RimP. 	152
234995	PRK02006	murD	UDP-N-acetylmuramoyl-L-alanyl-D-glutamate synthetase; Provisional	498
179360	PRK02047	PRK02047	hypothetical protein; Provisional	91
179361	PRK02048	PRK02048	4-hydroxy-3-methylbut-2-en-1-yl diphosphate synthase; Provisional	611
179362	PRK02079	PRK02079	pyrroloquinoline quinone biosynthesis peptide chaperone PqqD. 	88
234996	PRK02083	PRK02083	imidazole glycerol phosphate synthase subunit HisF; Provisional	253
234997	PRK02090	PRK02090	phosphoadenylyl-sulfate reductase. 	241
234998	PRK02098	PRK02098	phosphoribosyl-dephospho-CoA transferase; Provisional	221
234999	PRK02101	PRK02101	peroxide stress protein YaaA. 	255
179366	PRK02102	PRK02102	ornithine carbamoyltransferase; Validated	331
179367	PRK02103	PRK02103	malonate decarboxylase acyl carrier protein. 	105
235000	PRK02106	PRK02106	choline dehydrogenase; Validated	560
235001	PRK02107	PRK02107	glutamate--cysteine ligase; Provisional	523
235002	PRK02110	PRK02110	disulfide bond formation protein B; Provisional	169
179371	PRK02113	PRK02113	MBL fold metallo-hydrolase. 	252
235003	PRK02114	PRK02114	formylmethanofuran--tetrahydromethanopterin formyltransferase; Provisional	297
179373	PRK02118	PRK02118	V-type ATP synthase subunit B; Provisional	436
235004	PRK02119	PRK02119	hypothetical protein; Provisional	73
235005	PRK02122	PRK02122	glucosamine-6-phosphate deaminase-like protein; Validated	652
235006	PRK02126	PRK02126	ribonuclease Z; Provisional	334
235007	PRK02134	PRK02134	chitin disaccharide deacetylase. 	249
235008	PRK02135	PRK02135	hypothetical protein; Provisional	201
235009	PRK02141	PRK02141	Maf-like protein; Reviewed	207
179379	PRK02155	ppnK	NAD kinase. 	291
167325	PRK02166	PRK02166	hypothetical protein; Reviewed	184
235010	PRK02186	PRK02186	argininosuccinate lyase; Provisional	887
235011	PRK02190	PRK02190	agmatinase; Provisional	301
179381	PRK02193	truB	tRNA pseudouridine synthase B; Provisional	279
179382	PRK02195	PRK02195	V-type ATP synthase subunit D; Provisional	201
235012	PRK02201	PRK02201	putative inner membrane protein translocase component YidC; Provisional	357
235013	PRK02220	PRK02220	4-oxalocrotonate tautomerase; Provisional	61
179385	PRK02224	PRK02224	DNA double-strand break repair Rad50 ATPase. 	880
235014	PRK02227	PRK02227	(5-formylfuran-3-yl)methyl phosphate synthase. 	238
179387	PRK02228	PRK02228	V-type ATP synthase subunit F; Provisional	100
179388	PRK02230	PRK02230	inorganic pyrophosphatase; Provisional	184
167337	PRK02231	ppnK	NAD(+) kinase. 	272
235015	PRK02234	recU	Holliday junction-specific endonuclease; Reviewed	195
235016	PRK02237	PRK02237	YnfA family protein. 	109
179391	PRK02240	PRK02240	GTP cyclohydrolase IIa. 	254
179392	PRK02249	PRK02249	DNA primase regulatory subunit PriL. 	343
179393	PRK02250	PRK02250	hypothetical protein; Provisional	166
179394	PRK02251	PRK02251	cell division protein CrgA. 	87
179395	PRK02253	PRK02253	deoxyuridine 5'-triphosphate nucleotidohydrolase; Provisional	167
235017	PRK02255	PRK02255	putrescine carbamoyltransferase; Provisional	338
235018	PRK02256	PRK02256	putative aminopeptidase 1; Provisional	462
235019	PRK02259	PRK02259	aspartoacylase; Provisional	288
179399	PRK02260	PRK02260	S-ribosylhomocysteine lyase. 	158
179400	PRK02261	PRK02261	methylaspartate mutase subunit S; Provisional	137
235020	PRK02264	PRK02264	N(5),N(10)-methenyltetrahydromethanopterin cyclohydrolase; Provisional	317
179402	PRK02265	PRK02265	acetoacetate decarboxylase; Provisional	246
235021	PRK02268	PRK02268	hypothetical protein; Provisional	141
167353	PRK02269	PRK02269	ribose-phosphate diphosphokinase. 	320
235022	PRK02271	PRK02271	methylenetetrahydromethanopterin reductase; Provisional	325
235023	PRK02277	PRK02277	orotate phosphoribosyltransferase-like protein; Provisional	200
235024	PRK02287	PRK02287	DUF367 family protein. 	171
179406	PRK02289	PRK02289	4-oxalocrotonate tautomerase; Provisional	60
235025	PRK02290	PRK02290	3-dehydroquinate synthase II family protein. 	344
235026	PRK02292	PRK02292	V-type ATP synthase subunit E; Provisional	188
235027	PRK02301	PRK02301	deoxyhypusine synthase. 	316
179410	PRK02302	PRK02302	hypothetical protein; Provisional	89
235028	PRK02304	PRK02304	adenine phosphoribosyltransferase; Provisional	175
235029	PRK02308	uvsE	putative UV damage endonuclease; Provisional	303
235030	PRK02315	PRK02315	adaptor protein MecA. 	233
235031	PRK02318	PRK02318	mannitol-1-phosphate 5-dehydrogenase; Provisional	381
235032	PRK02362	PRK02362	ATP-dependent DNA helicase. 	737
235033	PRK02363	PRK02363	DNA-directed RNA polymerase subunit delta; Reviewed	129
179417	PRK02382	PRK02382	dihydroorotase; Provisional	443
179418	PRK02391	PRK02391	heat shock protein HtpX; Provisional	296
235034	PRK02395	PRK02395	hypothetical protein; Provisional	279
179420	PRK02399	PRK02399	hypothetical protein; Provisional	406
235035	PRK02406	PRK02406	DNA polymerase IV; Validated	343
235036	PRK02412	aroD	type I 3-dehydroquinate dehydratase. 	253
235037	PRK02427	PRK02427	3-phosphoshikimate 1-carboxyvinyltransferase; Provisional	435
235038	PRK02436	xerD	site-specific tyrosine recombinase XerD. 	245
235039	PRK02458	PRK02458	ribose-phosphate pyrophosphokinase; Provisional	323
235040	PRK02463	PRK02463	OxaA-like protein precursor; Provisional	307
179427	PRK02471	PRK02471	bifunctional glutamate--cysteine ligase GshA/glutathione synthetase GshB. 	752
235041	PRK02472	murD	UDP-N-acetylmuramoyl-L-alanyl-D-glutamate synthetase; Provisional	447
167380	PRK02478	PRK02478	Maf-like protein; Reviewed	199
235042	PRK02484	truB	tRNA pseudouridine synthase B; Provisional	294
179430	PRK02487	PRK02487	heme-degrading domain-containing protein. 	163
179431	PRK02491	PRK02491	putative deoxyribonucleotide triphosphate pyrophosphatase/unknown domain fusion protein; Reviewed	328
235043	PRK02492	PRK02492	deoxyhypusine synthase. 	347
179433	PRK02496	adk	adenylate kinase; Provisional	184
235044	PRK02504	PRK02504	NAD(P)H-quinone oxidoreductase subunit N. 	513
235045	PRK02506	PRK02506	dihydroorotate dehydrogenase 1A; Reviewed	310
235046	PRK02507	PRK02507	proton extrusion protein PcxA; Provisional	422
235047	PRK02509	PRK02509	hypothetical protein; Provisional	973
235048	PRK02515	psbU	photosystem II complex extrinsic protein PsbU. 	132
134722	PRK02529	petN	cytochrome b6-f complex subunit PetN; Provisional	33
235049	PRK02534	PRK02534	4-diphosphocytidyl-2-C-methyl-D-erythritol kinase; Provisional	312
179440	PRK02539	PRK02539	DUF896 family protein. 	85
179441	PRK02542	PRK02542	photosystem I assembly protein Ycf4; Provisional	188
235050	PRK02546	PRK02546	NAD(P)H-quinone oxidoreductase subunit 4; Provisional	525
179443	PRK02551	PRK02551	flavoprotein NrdI; Provisional	154
167396	PRK02553	psbK	photosystem II reaction center protein K; Provisional	45
179444	PRK02557	psbE	cytochrome b559 subunit alpha; Provisional	81
235051	PRK02561	psbF	cytochrome b559 subunit beta; Provisional	44
179446	PRK02565	PRK02565	photosystem II reaction center protein J; Provisional	39
167400	PRK02576	psbZ	photosystem II reaction center protein PsbZ. 	62
235052	PRK02597	rpoC2	DNA-directed RNA polymerase subunit beta'; Provisional	1331
179448	PRK02603	PRK02603	photosystem I assembly protein Ycf3; Provisional	172
235053	PRK02610	PRK02610	histidinol-phosphate transaminase. 	374
235054	PRK02615	PRK02615	thiamine phosphate synthase. 	347
179451	PRK02624	psbH	photosystem II reaction center protein PsbH. 	64
235055	PRK02625	rpoC1	DNA-directed RNA polymerase subunit gamma; Provisional	627
235056	PRK02627	PRK02627	acetylornithine aminotransferase; Provisional	396
235057	PRK02628	nadE	NAD synthetase; Reviewed	679
179455	PRK02645	ppnK	NAD(+) kinase. 	305
179456	PRK02649	ppnK	NAD(+) kinase. 	305
179457	PRK02651	PRK02651	photosystem I iron-sulfur center protein PsaC. 	81
235058	PRK02654	PRK02654	putative inner membrane protein translocase component YidC; Provisional	375
179459	PRK02655	psbI	photosystem II reaction center protein I. 	38
179460	PRK02693	PRK02693	apocytochrome f; Reviewed	312
235059	PRK02705	murD	UDP-N-acetylmuramoyl-L-alanine--D-glutamate ligase. 	459
235060	PRK02710	PRK02710	plastocyanin; Provisional	119
235061	PRK02714	PRK02714	o-succinylbenzoate synthase. 	320
235062	PRK02724	PRK02724	30S ribosomal protein PSRP-3. 	104
235063	PRK02726	PRK02726	molybdenum cofactor guanylyltransferase. 	200
235064	PRK02731	PRK02731	histidinol-phosphate aminotransferase; Validated	367
235065	PRK02733	PRK02733	photosystem I reaction center subunit IX; Provisional	42
235066	PRK02746	pdxA	4-hydroxythreonine-4-phosphate dehydrogenase PdxA. 	345
179468	PRK02749	PRK02749	photosystem I reaction center subunit IV; Provisional	71
179469	PRK02755	truB	tRNA pseudouridine synthase B; Provisional	295
235067	PRK02759	PRK02759	bifunctional phosphoribosyl-AMP cyclohydrolase/phosphoribosyl-ATP diphosphatase HisIE. 	203
235068	PRK02769	PRK02769	histidine decarboxylase; Provisional	380
235069	PRK02770	PRK02770	adenosylmethionine decarboxylase. 	139
179472	PRK02793	PRK02793	phi X174 lysis protein; Provisional	72
179473	PRK02794	PRK02794	DNA polymerase IV; Provisional	419
235070	PRK02797	PRK02797	TDP-N-acetylfucosamine:lipid II N-acetylfucosaminyltransferase. 	322
235071	PRK02801	PRK02801	primosomal replication protein N; Provisional	101
235072	PRK02812	PRK02812	ribose-phosphate pyrophosphokinase; Provisional	330
235073	PRK02813	PRK02813	putative aminopeptidase 2; Provisional	428
179478	PRK02816	PRK02816	phycocyanobilin:ferredoxin oxidoreductase; Validated	243
179479	PRK02821	PRK02821	RNA-binding protein. 	77
235074	PRK02830	PRK02830	Na(+)-translocating NADH-quinone reductase subunit E; Provisional	202
235075	PRK02833	PRK02833	phosphate-starvation-inducible protein PsiE; Provisional	133
235076	PRK02842	PRK02842	ferredoxin:protochlorophyllide reductase (ATP-dependent) subunit N. 	427
235077	PRK02853	PRK02853	hypothetical protein; Provisional	161
179484	PRK02854	PRK02854	primosomal protein DnaT. 	179
235078	PRK02858	PRK02858	germination protease; Provisional	369
179486	PRK02862	glgC	glucose-1-phosphate adenylyltransferase; Provisional	429
235079	PRK02866	PRK02866	cyanate hydratase; Validated	147
235080	PRK02868	PRK02868	hypothetical protein; Provisional	245
235081	PRK02870	PRK02870	heat shock protein HtpX; Provisional	336
179490	PRK02877	PRK02877	hypothetical protein; Provisional	106
179491	PRK02886	PRK02886	hypothetical protein; Provisional	87
235082	PRK02888	PRK02888	nitrous-oxide reductase; Validated	635
235083	PRK02889	tolB	Tol-Pal system protein TolB. 	427
179494	PRK02898	PRK02898	energy-coupling factor ABC transporter substrate-binding protein. 	100
179495	PRK02899	PRK02899	genetic competence negative regulator. 	197
235084	PRK02901	PRK02901	O-succinylbenzoate synthase; Provisional	327
179497	PRK02909	PRK02909	flagellar transcriptional regulator FlhD. 	105
235085	PRK02910	PRK02910	ferredoxin:protochlorophyllide reductase (ATP-dependent) subunit B. 	519
179499	PRK02913	PRK02913	hypothetical protein; Provisional	150
235086	PRK02919	PRK02919	oxaloacetate decarboxylase subunit gamma; Provisional	82
179501	PRK02922	PRK02922	cell surface composition regulator GlgS. 	67
235087	PRK02925	PRK02925	glucuronate isomerase; Reviewed	466
179503	PRK02929	PRK02929	L-arabinose isomerase; Provisional	499
179504	PRK02935	PRK02935	hypothetical protein; Provisional	110
179505	PRK02936	argD	acetylornithine transaminase. 	377
179506	PRK02939	PRK02939	YnfC family lipoprotein. 	236
235088	PRK02943	PRK02943	secA regulator SecM. 	167
179508	PRK02944	PRK02944	YidC family membrane integrase SpoIIIJ. 	255
235089	PRK02946	aceK	bifunctional isocitrate dehydrogenase kinase/phosphatase protein; Validated	575
179510	PRK02947	PRK02947	sugar isomerase domain-containing protein. 	246
179511	PRK02948	PRK02948	IscS subfamily cysteine desulfurase. 	381
235090	PRK02951	PRK02951	DNA replication terminus site-binding protein; Provisional	309
179513	PRK02955	PRK02955	small acid-soluble spore protein SspI; Provisional	68
235091	PRK02958	tatA	Sec-independent protein translocase subunit TatA. 	73
235092	PRK02963	PRK02963	carbon starvation induced protein CsiD. 	316
235093	PRK02967	PRK02967	nickel-responsive transcriptional regulator NikR. 	139
235094	PRK02971	PRK02971	4-amino-4-deoxy-L-arabinose-phosphoundecaprenol flippase subunit ArnF; Provisional	129
179518	PRK02975	PRK02975	O-antigen assembly polymerase. 	450
235095	PRK02983	lysS	bifunctional lysylphosphatidylglycerol synthetase/lysine--tRNA ligase LysX. 	1094
179520	PRK02984	sspO	acid-soluble spore protein O; Provisional	49
235096	PRK02991	PRK02991	D-serine dehydratase; Provisional	441
179522	PRK02998	prsA	peptidylprolyl isomerase; Reviewed	283
235097	PRK02999	PRK02999	malate synthase G; Provisional	726
179524	PRK03001	PRK03001	zinc metalloprotease HtpX. 	283
101162	PRK03002	prsA	peptidylprolyl isomerase PrsA. 	285
179525	PRK03003	PRK03003	GTP-binding protein Der; Reviewed	472
235098	PRK03007	PRK03007	deoxyguanosinetriphosphate triphosphohydrolase-like protein; Provisional	428
235099	PRK03011	PRK03011	butyrate kinase; Provisional	358
235100	PRK03031	rnpA	ribonuclease P protein component. 	122
179529	PRK03057	PRK03057	hypothetical protein; Provisional	180
235101	PRK03059	PRK03059	PII uridylyl-transferase; Provisional	856
179531	PRK03065	hutP	anti-terminator HutP; Provisional	148
235102	PRK03072	PRK03072	heat shock protein HtpX; Provisional	288
235103	PRK03080	PRK03080	phosphoserine transaminase. 	378
179534	PRK03081	sspK	small, acid-soluble spore protein K. 	50
179535	PRK03092	PRK03092	ribose-phosphate diphosphokinase. 	304
179536	PRK03094	PRK03094	hypothetical protein; Provisional	80
179537	PRK03095	prsA	peptidylprolyl isomerase PrsA. 	287
179538	PRK03100	PRK03100	Sec-independent protein translocase subunit TatB. 	136
235104	PRK03103	PRK03103	DNA polymerase IV; Reviewed	409
179540	PRK03113	PRK03113	putative disulfide oxidoreductase; Provisional	139
179541	PRK03114	PRK03114	DUF84 family protein. 	169
235105	PRK03124	PRK03124	S-adenosylmethionine decarboxylase proenzyme; Provisional	127
179543	PRK03137	PRK03137	1-pyrroline-5-carboxylate dehydrogenase; Provisional	514
179544	PRK03140	PRK03140	phosphatidylserine decarboxylase; Provisional	259
179545	PRK03147	PRK03147	thiol-disulfide oxidoreductase ResA. 	173
235106	PRK03158	PRK03158	histidinol-phosphate aminotransferase; Provisional	359
235107	PRK03170	PRK03170	dihydrodipicolinate synthase; Provisional	292
179548	PRK03174	sspH	small acid-soluble spore protein H. 	59
235108	PRK03180	ligB	ATP-dependent DNA ligase; Reviewed	508
235109	PRK03187	tgl	transglutaminase; Provisional	272
235110	PRK03188	PRK03188	4-diphosphocytidyl-2-C-methyl-D-erythritol kinase; Provisional	300
179552	PRK03195	PRK03195	DUF721 family protein. 	186
235111	PRK03202	PRK03202	ATP-dependent 6-phosphofructokinase. 	320
179554	PRK03204	PRK03204	haloalkane dehalogenase; Provisional	286
235112	PRK03244	argD	acetylornithine transaminase. 	398
235113	PRK03287	truB	tRNA pseudouridine synthase B; Provisional	298
235114	PRK03298	PRK03298	endonuclease NucS. 	224
235115	PRK03317	PRK03317	histidinol-phosphate aminotransferase; Provisional	368
179559	PRK03321	PRK03321	putative aminotransferase; Provisional	352
179560	PRK03333	coaE	dephospho-CoA kinase/protein folding accessory domain-containing protein; Provisional	395
235116	PRK03341	PRK03341	arginine repressor; Provisional	168
235117	PRK03343	PRK03343	transaldolase; Validated	368
235118	PRK03348	PRK03348	DNA polymerase IV; Provisional	454
179564	PRK03352	PRK03352	DNA polymerase IV; Validated	346
235119	PRK03353	ribB	3,4-dihydroxy-2-butanone 4-phosphate synthase; Provisional	217
179566	PRK03354	PRK03354	crotonobetainyl-CoA dehydrogenase; Validated	380
179567	PRK03355	PRK03355	glycerol-3-phosphate 1-O-acyltransferase. 	783
179568	PRK03356	PRK03356	L-carnitine/gamma-butyrobetaine antiport BCCT transporter. 	504
179569	PRK03359	PRK03359	putative electron transfer flavoprotein FixA; Reviewed	256
235120	PRK03363	fixB	electron transfer flavoprotein subunit alpha/FixB family protein. 	313
179571	PRK03369	murD	UDP-N-acetylmuramoyl-L-alanine--D-glutamate ligase. 	488
179572	PRK03371	pdxA	D-threonate 4-phosphate dehydrogenase. 	326
235121	PRK03372	ppnK	inorganic polyphosphate/ATP-NAD kinase; Provisional	306
235122	PRK03378	ppnK	inorganic polyphosphate/ATP-NAD kinase; Provisional	292
179575	PRK03379	PRK03379	vitamin B12-transporter protein BtuF; Provisional	260
235123	PRK03381	PRK03381	PII uridylyl-transferase; Provisional	774
235124	PRK03427	PRK03427	cell division protein ZipA; Provisional	333
235125	PRK03430	PRK03430	hypothetical protein; Validated	157
179579	PRK03437	PRK03437	3-isopropylmalate dehydrogenase; Provisional	344
179580	PRK03449	PRK03449	putative inner membrane protein translocase component YidC; Provisional	304
235126	PRK03459	rnpA	ribonuclease P; Reviewed	122
235127	PRK03467	PRK03467	hypothetical protein; Provisional	144
179583	PRK03482	PRK03482	phosphoglycerate mutase GpmB. 	215
179584	PRK03501	ppnK	NAD kinase. 	264
179585	PRK03511	minC	septum site-determining protein MinC. 	228
179586	PRK03512	PRK03512	thiamine phosphate synthase. 	211
179587	PRK03515	PRK03515	ornithine carbamoyltransferase subunit I; Provisional	336
235128	PRK03522	rumB	23S rRNA (uracil(747)-C(5))-methyltransferase RlmC. 	315
179589	PRK03525	PRK03525	L-carnitine CoA-transferase. 	405
235129	PRK03537	PRK03537	molybdate ABC transporter substrate-binding protein. 	188
179591	PRK03545	PRK03545	putative arabinose transporter; Provisional	390
179592	PRK03554	tatA	Sec-independent protein translocase subunit TatA. 	89
235130	PRK03557	PRK03557	CDF family zinc transporter ZitB. 	312
235131	PRK03562	PRK03562	glutathione-regulated potassium-efflux system protein KefC; Provisional	621
179595	PRK03564	PRK03564	formate dehydrogenase accessory protein FdhE; Provisional	309
179596	PRK03573	PRK03573	transcriptional regulator SlyA; Provisional	144
235132	PRK03577	PRK03577	acid shock protein. 	102
235133	PRK03578	hscB	Fe-S protein assembly co-chaperone HscB. 	176
179599	PRK03580	PRK03580	crotonobetainyl-CoA hydratase. 	261
235134	PRK03584	PRK03584	acetoacetate--CoA ligase. 	655
235135	PRK03592	PRK03592	haloalkane dehalogenase; Provisional	295
235136	PRK03598	PRK03598	putative efflux pump membrane fusion protein; Provisional	331
179603	PRK03600	nrdI	class Ib ribonucleoside-diphosphate reductase assembly flavoprotein NrdI. 	134
235137	PRK03601	PRK03601	HTH-type transcriptional regulator HdfR. 	275
235138	PRK03604	moaC	bifunctional molybdenum cofactor biosynthesis protein MoaC/MogA; Provisional	312
179606	PRK03606	PRK03606	ureidoglycolate lyase. 	162
179607	PRK03609	umuC	translesion error-prone DNA polymerase V subunit UmuC. 	422
235139	PRK03612	PRK03612	polyamine aminopropyltransferase. 	521
235140	PRK03619	PRK03619	phosphoribosylformylglycinamidine synthase subunit PurQ. 	219
235141	PRK03620	PRK03620	5-dehydro-4-deoxyglucarate dehydratase; Provisional	303
235142	PRK03624	PRK03624	putative acetyltransferase; Provisional	140
179612	PRK03625	tatE	twin-arginine translocase subunit TatE. 	67
235143	PRK03629	tolB	Tol-Pal system protein TolB. 	429
179614	PRK03633	PRK03633	putative MFS family transporter protein; Provisional	381
179615	PRK03634	PRK03634	rhamnulose-1-phosphate aldolase; Provisional	274
235144	PRK03635	PRK03635	ArgP/LysG family DNA-binding transcriptional regulator. 	294
235145	PRK03636	PRK03636	hypothetical protein; Provisional	179
235146	PRK03640	PRK03640	o-succinylbenzoate--CoA ligase. 	483
179619	PRK03641	PRK03641	DUF2057 family protein. 	220
179620	PRK03642	PRK03642	putative periplasmic esterase; Provisional	432
235147	PRK03643	PRK03643	tagaturonate reductase. 	471
179622	PRK03646	dadX	catabolic alanine racemase. 	355
235148	PRK03655	PRK03655	putative ion channel protein; Provisional	414
235149	PRK03657	PRK03657	2-oxo-tetronate isomerase. 	170
179625	PRK03659	PRK03659	glutathione-regulated potassium-efflux system protein KefB; Provisional	601
179626	PRK03660	PRK03660	anti-sigma F factor; Provisional	146
179627	PRK03661	PRK03661	nicotinamide-nucleotide amidase. 	164
179628	PRK03669	PRK03669	mannosyl-3-phosphoglycerate phosphatase-related protein. 	271
167581	PRK03670	PRK03670	competence damage-inducible protein A; Provisional	252
179629	PRK03673	PRK03673	nicotinamide mononucleotide deamidase-related protein YfaY. 	396
179630	PRK03681	hypA	hydrogenase/urease nickel incorporation protein HypA. 	114
179631	PRK03692	PRK03692	putative UDP-N-acetyl-D-mannosaminuronic acid transferase; Provisional	243
235150	PRK03695	PRK03695	vitamin B12-transporter ATPase; Provisional	248
235151	PRK03699	PRK03699	putative transporter; Provisional	394
235152	PRK03705	PRK03705	glycogen debranching protein GlgX. 	658
179635	PRK03708	ppnK	NAD(+) kinase. 	277
179636	PRK03715	argD	acetylornithine transaminase protein; Provisional	395
167589	PRK03717	PRK03717	ribonuclease P protein component 2; Provisional	120
179637	PRK03719	PRK03719	ecotin; Provisional	166
235153	PRK03731	aroL	shikimate kinase AroL. 	171
167593	PRK03732	PRK03732	hypothetical protein; Provisional	114
179639	PRK03735	PRK03735	cytochrome b6; Provisional	223
235154	PRK03739	PRK03739	2-isopropylmalate synthase; Validated	552
179641	PRK03743	pdxA	4-hydroxythreonine-4-phosphate dehydrogenase PdxA. 	332
235155	PRK03745	PRK03745	signal recognition particle protein Srp19; Provisional	100
179642	PRK03757	PRK03757	YceI family protein. 	191
235156	PRK03759	PRK03759	isopentenyl-diphosphate Delta-isomerase. 	184
235157	PRK03760	PRK03760	hypothetical protein; Provisional	117
235158	PRK03761	PRK03761	LPS assembly outer membrane complex protein LptD; Provisional	778
179646	PRK03762	PRK03762	YbaB/EbfC family nucleoid-associated protein. 	103
179647	PRK03767	PRK03767	NAD(P)H:quinone oxidoreductase; Provisional	200
179648	PRK03776	PRK03776	phosphatidylglycerol--membrane-oligosaccharide glycerophosphotransferase. 	762
235159	PRK03784	PRK03784	vitamin B12-transporter permease; Provisional	331
235160	PRK03803	murD	UDP-N-acetylmuramoyl-L-alanyl-D-glutamate synthetase; Provisional	448
179651	PRK03806	murD	UDP-N-acetylmuramoyl-L-alanyl-D-glutamate synthetase; Provisional	438
235161	PRK03814	PRK03814	oxaloacetate decarboxylase subunit gamma; Provisional	85
235162	PRK03815	murD	UDP-N-acetylmuramoyl-L-alanyl-D-glutamate synthetase; Provisional	401
235163	PRK03817	PRK03817	galactokinase; Provisional	351
179654	PRK03818	PRK03818	putative transporter; Validated	552
179655	PRK03822	lplA	lipoate-protein ligase A; Provisional	338
235164	PRK03824	hypA	hydrogenase nickel incorporation protein HypA. 	135
235165	PRK03826	PRK03826	5'-nucleotidase; Provisional	195
179658	PRK03830	PRK03830	small acid-soluble spore protein Tlp; Provisional	73
235166	PRK03837	PRK03837	transcriptional regulator NanR; Provisional	241
179660	PRK03839	PRK03839	putative kinase; Provisional	180
179661	PRK03846	PRK03846	adenylylsulfate kinase; Provisional	198
235167	PRK03854	opgC	glucans biosynthesis protein MdoC. 	375
179663	PRK03858	PRK03858	DNA polymerase IV; Validated	396
235168	PRK03868	PRK03868	glucose-6-phosphate isomerase; Provisional	410
235169	PRK03879	PRK03879	ribonuclease P protein component 1; Validated	96
235170	PRK03881	PRK03881	hypothetical protein; Provisional	467
167628	PRK03887	PRK03887	methylated-DNA--protein-cysteine methyltransferase; Provisional	175
179667	PRK03892	PRK03892	Ribonuclease P protein component 3. 	216
179668	PRK03893	PRK03893	putative sialic acid transporter; Provisional	496
179669	PRK03902	PRK03902	transcriptional regulator MntR. 	142
235171	PRK03903	PRK03903	transaldolase; Provisional	274
235172	PRK03906	PRK03906	mannonate dehydratase; Provisional	385
235173	PRK03907	fliE	flagellar hook-basal body complex protein FliE. 	97
179673	PRK03910	PRK03910	D-cysteine desulfhydrase; Validated	331
235174	PRK03911	PRK03911	HrcA family transcriptional regulator. 	260
235175	PRK03918	PRK03918	DNA double-strand break repair ATPase Rad50. 	880
179676	PRK03922	PRK03922	hypothetical protein; Provisional	113
179677	PRK03926	PRK03926	mevalonate kinase; Provisional	302
235176	PRK03932	asnC	asparaginyl-tRNA synthetase; Validated	450
235177	PRK03934	PRK03934	phosphatidylserine decarboxylase; Provisional	265
235178	PRK03941	PRK03941	NTPase; Reviewed	174
179681	PRK03946	pdxA	4-hydroxythreonine-4-phosphate dehydrogenase; Provisional	307
235179	PRK03947	PRK03947	prefoldin subunit alpha; Reviewed	140
235180	PRK03954	PRK03954	ribonuclease P protein component 4; Validated	121
179684	PRK03955	PRK03955	DUF126 domain-containing protein. 	131
179685	PRK03957	PRK03957	V-type ATP synthase subunit F; Provisional	100
235181	PRK03958	PRK03958	tRNA 2'-O-methylase; Reviewed	176
167649	PRK03963	PRK03963	V-type ATP synthase subunit E; Provisional	198
167650	PRK03967	PRK03967	histidinol-phosphate transaminase. 	337
235182	PRK03968	PRK03968	DNA primase large subunit PriL. 	399
179688	PRK03971	PRK03971	deoxyhypusine synthase. 	334
179689	PRK03972	PRK03972	ribosomal biogenesis protein; Validated	208
179690	PRK03975	tfx	putative transcriptional regulator; Provisional	141
235183	PRK03976	rpl37ae	50S ribosomal protein L37Ae; Reviewed	90
235184	PRK03979	PRK03979	ADP-specific phosphofructokinase; Provisional	463
235185	PRK03980	PRK03980	flap endonuclease-1; Provisional	292
235186	PRK03982	PRK03982	heat shock protein HtpX; Provisional	288
235187	PRK03983	PRK03983	exosome complex exonuclease Rrp41; Provisional	244
235188	PRK03987	PRK03987	translation initiation factor IF-2 subunit alpha; Validated	262
235189	PRK03988	PRK03988	translation initiation factor IF-2 subunit beta; Validated	138
235190	PRK03991	PRK03991	threonyl-tRNA synthetase; Validated	613
179699	PRK03992	PRK03992	proteasome-activating nucleotidase; Provisional	389
235191	PRK03995	PRK03995	D-aminoacyl-tRNA deacylase. 	267
235192	PRK03996	PRK03996	archaeal proteasome endopeptidase complex subunit alpha. 	241
235193	PRK03999	PRK03999	translation initiation factor IF-5A; Provisional	129
235194	PRK04000	PRK04000	translation initiation factor IF-2 subunit gamma; Validated	411
235195	PRK04004	PRK04004	translation initiation factor IF-2; Validated	586
235196	PRK04005	PRK04005	50S ribosomal protein L18e; Provisional	111
235197	PRK04007	rps28e	30S ribosomal protein S28e; Validated	70
235198	PRK04011	PRK04011	peptide chain release factor 1; Provisional	411
179708	PRK04012	PRK04012	translation initiation factor IF-1A; Provisional	100
101376	PRK04013	argD	acetylornithine/acetyl-lysine aminotransferase; Provisional	364
235199	PRK04015	PRK04015	DNA/RNA-binding protein AlbA. 	91
235200	PRK04016	PRK04016	DNA-directed RNA polymerase subunit N; Provisional	62
179711	PRK04017	PRK04017	hypothetical protein; Provisional	132
179712	PRK04019	rplP0	acidic ribosomal protein P0; Validated	330
235201	PRK04020	rps2P	30S ribosomal protein S2; Provisional	204
167678	PRK04021	PRK04021	hypothetical protein; Reviewed	92
235202	PRK04023	PRK04023	DNA polymerase II large subunit; Validated	1121
235203	PRK04024	PRK04024	2,3-bisphosphoglycerate-independent phosphoglycerate mutase. 	412
179716	PRK04025	PRK04025	adenosylmethionine decarboxylase. 	139
235204	PRK04027	PRK04027	30S ribosomal protein S7P; Reviewed	195
235205	PRK04028	PRK04028	Glu-tRNA(Gln) amidotransferase subunit GatE. 	630
235206	PRK04031	PRK04031	DNA primase; Provisional	408
235207	PRK04032	PRK04032	CDP-2,3-bis-(O-geranylgeranyl)-sn-glycerol synthase. 	159
179721	PRK04034	rps8p	30S ribosomal protein S8P; Reviewed	130
235208	PRK04036	PRK04036	DNA-directed DNA polymerase II small subunit. 	504
235209	PRK04038	rps19p	30S ribosomal protein S19P; Provisional	134
235210	PRK04040	PRK04040	adenylate kinase; Provisional	188
235211	PRK04042	rpl4lp	50S ribosomal protein L4P; Provisional	254
235212	PRK04043	tolB	Tol-Pal system protein TolB. 	419
235213	PRK04044	rps5p	30S ribosomal protein S5P; Reviewed	211
179728	PRK04046	PRK04046	translation initiation factor IF-6; Provisional	222
235214	PRK04049	PRK04049	30S ribosomal protein S8e; Validated	127
179730	PRK04051	rps4p	30S ribosomal protein S4P; Validated	177
235215	PRK04053	rps13p	30S ribosomal protein S13P; Reviewed	149
179732	PRK04056	PRK04056	septum formation inhibitor Maf. 	180
235216	PRK04057	PRK04057	30S ribosomal protein S3Ae; Validated	203
179734	PRK04059	rpl34e	50S ribosomal protein L34e; Validated	88
235217	PRK04069	PRK04069	serine-protein kinase RsbW; Provisional	161
179736	PRK04073	rocD	ornithine--oxo-acid transaminase; Provisional	396
235218	PRK04081	PRK04081	hypothetical protein; Provisional	207
235219	PRK04098	PRK04098	Sec-independent protein translocase subunit TatB. 	158
179739	PRK04099	truB	tRNA pseudouridine synthase B; Provisional	273
179740	PRK04101	PRK04101	metallothiol transferase FosB. 	139
235220	PRK04115	PRK04115	hypothetical protein; Provisional	137
235221	PRK04123	PRK04123	ribulokinase; Provisional	548
235222	PRK04125	PRK04125	antiholin-like protein LrgA. 	141
167709	PRK04128	PRK04128	1-(5-phosphoribosyl)-5- ((5-phosphoribosylamino)methylideneamino)imidazole-4-carboxamide isomerase. 	228
235223	PRK04132	PRK04132	replication factor C small subunit; Provisional	846
179745	PRK04135	PRK04135	2,3-bisphosphoglycerate-independent phosphoglycerate mutase. 	395
179746	PRK04136	rpl40e	50S ribosomal protein L40e; Provisional	48
235224	PRK04140	PRK04140	transcriptional regulator. 	317
235225	PRK04143	PRK04143	protein-ADP-ribose hydrolase. 	264
179749	PRK04147	PRK04147	N-acetylneuraminate lyase; Provisional	293
235226	PRK04148	PRK04148	hypothetical protein; Provisional	134
235227	PRK04149	sat	sulfate adenylyltransferase; Reviewed	391
179752	PRK04151	PRK04151	IMP cyclohydrolase; Provisional	197
235228	PRK04155	PRK04155	protein deglycase HchA. 	287
235229	PRK04156	gltX	glutamyl-tRNA synthetase; Provisional	567
235230	PRK04158	PRK04158	GTP-sensing pleiotropic transcriptional regulator CodY. 	256
235231	PRK04160	PRK04160	diphthine synthase; Provisional	258
235232	PRK04161	PRK04161	tagatose 1,6-diphosphate aldolase; Reviewed	329
235233	PRK04163	PRK04163	exosome complex protein Rrp4. 	235
235234	PRK04164	PRK04164	hypothetical protein; Provisional	181
235235	PRK04165	PRK04165	acetyl-CoA decarbonylase/synthase complex subunit gamma; Provisional	450
235236	PRK04168	PRK04168	tungstate ABC transporter substrate-binding protein WtpA. 	334
235237	PRK04169	PRK04169	heptaprenylglyceryl phosphate synthase. 	232
235238	PRK04171	PRK04171	16S rRNA (pseudouridine)(914)-N(1))-methyltransferase Nep1. 	222
235239	PRK04172	pheS	phenylalanine--tRNA ligase subunit alpha. 	489
235240	PRK04173	PRK04173	glycyl-tRNA synthetase; Provisional	456
179766	PRK04175	rpl7ae	50S ribosomal protein L7Ae; Validated	122
235241	PRK04176	PRK04176	ribulose-1,5-biphosphate synthetase; Provisional	257
235242	PRK04179	rpl37e	50S ribosomal protein L37e; Reviewed	62
179769	PRK04180	PRK04180	pyridoxal 5'-phosphate synthase lyase subunit PdxS. 	293
235243	PRK04181	PRK04181	4-diphosphocytidyl-2-C-methyl-D-erythritol kinase; Provisional	257
235244	PRK04182	PRK04182	cytidylate kinase; Provisional	180
235245	PRK04183	PRK04183	Glu-tRNA(Gln) amidotransferase subunit GatD. 	419
235246	PRK04184	PRK04184	DNA topoisomerase VI subunit B; Validated	535
179774	PRK04190	PRK04190	glucose-6-phosphate isomerase; Provisional	191
235247	PRK04191	rps3p	30S ribosomal protein S3P; Reviewed	207
235248	PRK04192	PRK04192	V-type ATP synthase subunit A; Provisional	586
235249	PRK04194	PRK04194	nickel pincer cofactor biosynthesis protein LarC. 	392
235250	PRK04195	PRK04195	replication factor C large subunit; Provisional	482
235251	PRK04196	PRK04196	V-type ATP synthase subunit B; Provisional	460
235252	PRK04199	rpl10e	50S ribosomal protein L16. 	172
179781	PRK04200	PRK04200	cofactor-independent phosphoglycerate mutase; Provisional	395
235253	PRK04201	PRK04201	zinc transporter ZupT; Provisional	265
235254	PRK04203	rpl1P	50S ribosomal protein L1P; Reviewed	215
235255	PRK04204	PRK04204	RNA 3'-terminal phosphate cyclase. 	343
179785	PRK04205	PRK04205	hypothetical protein; Provisional	229
179786	PRK04207	PRK04207	type II glyceraldehyde-3-phosphate dehydrogenase. 	341
179787	PRK04208	rbcL	ribulose bisphosphate carboxylase; Reviewed	468
235256	PRK04210	PRK04210	phosphoenolpyruvate carboxykinase (GTP). 	601
235257	PRK04211	rps12P	30S ribosomal protein S12P; Reviewed	145
179790	PRK04213	PRK04213	GTP-binding protein EngB. 	201
179791	PRK04214	rbn	ribonuclease BN/unknown domain fusion protein; Reviewed	412
235258	PRK04217	PRK04217	hypothetical protein; Provisional	110
235259	PRK04219	rpl5p	50S ribosomal protein L5P; Reviewed	177
179793	PRK04220	PRK04220	2-phosphoglycerate kinase; Provisional	301
179794	PRK04223	rpl22p	50S ribosomal protein L22P; Reviewed	153
235260	PRK04231	rpl3p	50S ribosomal protein L3P; Reviewed	337
235261	PRK04233	PRK04233	hypothetical protein; Provisional	129
235262	PRK04235	PRK04235	hypothetical protein; Provisional	196
179798	PRK04239	PRK04239	DNA-binding protein. 	110
235263	PRK04243	PRK04243	50S ribosomal protein L15e; Validated	196
235264	PRK04247	PRK04247	endonuclease NucS. 	238
235265	PRK04250	PRK04250	dihydroorotase; Provisional	398
179802	PRK04257	PRK04257	hypothetical protein; Provisional	78
179803	PRK04260	PRK04260	acetylornithine transaminase. 	375
235266	PRK04262	PRK04262	hypothetical protein; Provisional	347
235267	PRK04266	PRK04266	fibrillarin-like rRNA/tRNA 2'-O-methyltransferase. 	226
179806	PRK04270	PRK04270	RNA-guided pseudouridylation complex pseudouridine synthase subunit Cbf5. 	300
179807	PRK04280	PRK04280	transcriptional regulator ArgR. 	148
235268	PRK04282	PRK04282	exosome complex protein Rrp42. 	271
235269	PRK04284	PRK04284	ornithine carbamoyltransferase; Provisional	332
235270	PRK04286	PRK04286	hypothetical protein; Provisional	298
179810	PRK04288	PRK04288	antiholin-like protein LrgB; Provisional	232
235271	PRK04290	PRK04290	30S ribosomal protein S6e; Validated	115
179812	PRK04293	PRK04293	adenylosuccinate synthetase; Provisional	333
235272	PRK04296	PRK04296	thymidine kinase; Provisional	190
235273	PRK04301	radA	DNA repair and recombination protein RadA; Validated	317
235274	PRK04302	PRK04302	triosephosphate isomerase; Provisional	223
235275	PRK04306	PRK04306	50S ribosomal protein L21e; Reviewed	98
235276	PRK04307	PRK04307	protein-disulfide oxidoreductase DsbI. 	218
167786	PRK04308	murD	UDP-N-acetylmuramoyl-L-alanine--D-glutamate ligase. 	445
235277	PRK04309	PRK04309	DNA-directed RNA polymerase subunit A''; Validated	383
235278	PRK04311	PRK04311	selenocysteine synthase; Provisional	464
179820	PRK04313	PRK04313	30S ribosomal protein S4e; Validated	237
235279	PRK04319	PRK04319	acetyl-CoA synthetase; Provisional	570
235280	PRK04322	PRK04322	peptidyl-tRNA hydrolase; Provisional	113
179823	PRK04323	PRK04323	hypothetical protein; Provisional	91
179824	PRK04325	PRK04325	hypothetical protein; Provisional	74
179825	PRK04326	PRK04326	methionine synthase; Provisional	330
235281	PRK04328	PRK04328	hypothetical protein; Provisional	249
235282	PRK04330	PRK04330	hypothetical protein; Provisional	88
235283	PRK04333	PRK04333	50S ribosomal protein L14e; Validated	84
235284	PRK04334	PRK04334	hypothetical protein; Provisional	251
235285	PRK04335	PRK04335	cell division protein ZipA; Provisional	313
179831	PRK04337	PRK04337	50S ribosomal protein L35Ae; Validated	87
235286	PRK04338	PRK04338	N(2),N(2)-dimethylguanosine tRNA methyltransferase; Provisional	382
235287	PRK04342	PRK04342	DNA topoisomerase IV subunit A. 	367
235288	PRK04346	PRK04346	tryptophan synthase subunit beta; Validated	397
235289	PRK04350	PRK04350	thymidine phosphorylase; Provisional	490
235290	PRK04351	PRK04351	SprT family protein. 	149
235291	PRK04358	PRK04358	hypothetical protein; Provisional	217
235292	PRK04366	PRK04366	aminomethyl-transferring glycine dehydrogenase subunit GcvPB. 	481
179839	PRK04374	PRK04374	[protein-PII] uridylyltransferase. 	869
235293	PRK04375	PRK04375	protoheme IX farnesyltransferase; Provisional	296
179841	PRK04387	PRK04387	hypothetical protein; Provisional	90
235294	PRK04388	PRK04388	disulfide bond formation protein B; Provisional	172
179843	PRK04390	rnpA	ribonuclease P protein component. 	120
235295	PRK04405	prsA	peptidylprolyl isomerase; Provisional	298
235296	PRK04406	PRK04406	hypothetical protein; Provisional	75
235297	PRK04423	PRK04423	LPS-assembly protein LptD. 	798
179847	PRK04424	PRK04424	transcription factor FapR. 	185
167814	PRK04425	PRK04425	septum formation protein Maf. 	196
179848	PRK04435	PRK04435	ACT domain-containing protein. 	147
235298	PRK04439	PRK04439	methionine adenosyltransferase. 	399
235299	PRK04443	PRK04443	[LysW]-lysine hydrolase. 	348
235300	PRK04447	PRK04447	hypothetical protein; Provisional	351
235301	PRK04452	PRK04452	acetyl-CoA decarbonylase/synthase complex subunit delta; Provisional	319
235302	PRK04456	PRK04456	acetyl-CoA decarbonylase/synthase complex subunit beta; Reviewed	463
179854	PRK04457	PRK04457	polyamine aminopropyltransferase. 	262
179855	PRK04460	PRK04460	nickel-responsive transcriptional regulator NikR. 	137
179856	PRK04516	minC	septum site-determining protein MinC. 	235
235303	PRK04517	PRK04517	hypothetical protein; Provisional	216
235304	PRK04523	PRK04523	N-acetylornithine carbamoyltransferase; Reviewed	335
235305	PRK04527	PRK04527	argininosuccinate synthase; Provisional	400
235306	PRK04531	PRK04531	acetylglutamate kinase; Provisional	398
235307	PRK04537	PRK04537	ATP-dependent RNA helicase RhlB; Provisional	572
179862	PRK04539	ppnK	inorganic polyphosphate/ATP-NAD kinase; Provisional	296
179863	PRK04542	PRK04542	elongation factor P; Provisional	189
179864	PRK04561	tatA	twin arginine translocase protein A; Provisional	75
235308	PRK04570	PRK04570	cell division protein ZipA; Provisional	243
235309	PRK04596	minC	septum site-determining protein MinC. 	248
179867	PRK04598	tatA	twin-arginine translocase subunit TatA. 	81
179868	PRK04612	argD	acetylornithine transaminase. 	408
179869	PRK04635	PRK04635	histidinol-phosphate aminotransferase; Provisional	354
179870	PRK04642	truB	tRNA pseudouridine synthase B; Provisional	300
135173	PRK04654	PRK04654	sec-independent translocase; Provisional	214
179871	PRK04663	murD	UDP-N-acetylmuramoyl-L-alanine--D-glutamate ligase. 	438
179872	PRK04690	murD	UDP-N-acetylmuramoyl-L-alanine--D-glutamate ligase. 	468
179873	PRK04694	PRK04694	Maf-like protein; Reviewed	190
235310	PRK04750	ubiB	putative ubiquinone biosynthesis protein UbiB; Reviewed	537
179875	PRK04758	PRK04758	hypothetical protein; Validated	181
179876	PRK04761	ppnK	inorganic polyphosphate/ATP-NAD kinase; Reviewed	246
179877	PRK04778	PRK04778	septation ring formation regulator EzrA; Provisional	569
235311	PRK04781	PRK04781	histidinol-phosphate aminotransferase; Provisional	364
235312	PRK04792	tolB	Tol-Pal system protein TolB. 	448
179880	PRK04804	minC	septum site-determining protein MinC. 	221
235313	PRK04813	PRK04813	D-alanine--poly(phosphoribitol) ligase subunit DltA. 	503
179882	PRK04820	rnpA	ribonuclease P protein component. 	145
179883	PRK04833	PRK04833	argininosuccinate lyase; Provisional	455
235314	PRK04837	PRK04837	ATP-dependent RNA helicase RhlB; Provisional	423
235315	PRK04841	PRK04841	HTH-type transcriptional regulator MalT. 	903
179886	PRK04860	PRK04860	SprT family zinc-dependent metalloprotease. 	160
235316	PRK04863	mukB	chromosome partition protein MukB. 	1486
179888	PRK04870	PRK04870	histidinol-phosphate transaminase. 	356
235317	PRK04885	ppnK	inorganic polyphosphate/ATP-NAD kinase; Provisional	265
235318	PRK04897	PRK04897	heat shock protein HtpX; Provisional	298
235319	PRK04914	PRK04914	RNA polymerase-associated protein RapA. 	956
179892	PRK04922	tolB	Tol-Pal system beta propeller repeat protein TolB. 	433
179893	PRK04923	PRK04923	ribose-phosphate diphosphokinase. 	319
235320	PRK04926	dgt	deoxyguanosinetriphosphate triphosphohydrolase; Provisional	503
179895	PRK04930	PRK04930	glutathione-regulated potassium-efflux system ancillary protein KefG; Provisional	184
179896	PRK04940	PRK04940	hypothetical protein; Provisional	180
235321	PRK04946	PRK04946	endonuclease SmrB. 	181
179898	PRK04949	PRK04949	putative sulfate transport protein CysZ; Validated	251
235322	PRK04950	PRK04950	ProP expression regulator; Provisional	213
235323	PRK04960	PRK04960	universal stress protein UspB; Provisional	111
179901	PRK04964	PRK04964	hypothetical protein; Provisional	66
179902	PRK04965	PRK04965	NADH:flavorubredoxin reductase NorW. 	377
179903	PRK04966	PRK04966	hypothetical protein; Provisional	72
235324	PRK04968	PRK04968	SecY interacting protein Syd; Provisional	181
179905	PRK04972	PRK04972	putative transporter; Provisional	558
235325	PRK04974	PRK04974	glycerol-3-phosphate 1-O-acyltransferase PlsB. 	818
235326	PRK04976	torD	chaperone protein TorD; Validated	202
179908	PRK04980	PRK04980	hypothetical protein; Provisional	102
179909	PRK04984	PRK04984	fatty acid metabolism transcriptional regulator FadR. 	239
235327	PRK04987	PRK04987	fumarate reductase subunit FrdC. 	130
235328	PRK04989	psbM	photosystem II reaction center protein M; Provisional	35
179912	PRK04998	PRK04998	hypothetical protein; Provisional	88
235329	PRK05007	PRK05007	bifunctional uridylyltransferase/uridylyl-removing protein GlnD. 	884
179914	PRK05014	hscB	co-chaperone HscB; Provisional	171
235330	PRK05015	PRK05015	aminopeptidase B; Provisional	424
235331	PRK05022	PRK05022	nitric oxide reductase transcriptional regulator NorR. 	509
235332	PRK05031	PRK05031	tRNA (uracil-5-)-methyltransferase; Validated	362
235333	PRK05033	truB	tRNA pseudouridine synthase B; Provisional	312
235334	PRK05035	PRK05035	electron transport complex protein RnfC; Provisional	695
179920	PRK05054	PRK05054	exoribonuclease II; Provisional	644
235335	PRK05057	aroK	shikimate kinase AroK. 	172
179922	PRK05066	PRK05066	transcriptional regulator ArgR. 	156
179923	PRK05070	PRK05070	DNA mismatch repair protein; Provisional	218
235336	PRK05074	PRK05074	non-canonical purine NTP phosphatase. 	173
235337	PRK05077	frsA	esterase FrsA. 	414
235338	PRK05082	PRK05082	N-acetylmannosamine kinase; Provisional	291
235339	PRK05084	xerS	site-specific tyrosine recombinase XerS; Reviewed	357
235340	PRK05086	PRK05086	malate dehydrogenase; Provisional	312
179929	PRK05087	PRK05087	D-alanine--poly(phosphoribitol) ligase subunit DltC. 	78
235341	PRK05089	PRK05089	cytochrome C oxidase assembly protein; Provisional	188
179931	PRK05090	PRK05090	hypothetical protein; Validated	95
235342	PRK05092	PRK05092	PII uridylyl-transferase; Provisional	931
179933	PRK05093	argD	acetylornithine/succinyldiaminopimelate transaminase. 	403
179934	PRK05094	PRK05094	dsDNA-mimic protein; Reviewed	107
235343	PRK05096	PRK05096	guanosine 5'-monophosphate oxidoreductase; Provisional	346
235344	PRK05097	PRK05097	macrodomain Ter protein MatP. 	150
179937	PRK05101	PRK05101	galactokinase; Provisional	382
235345	PRK05105	PRK05105	O-succinylbenzoate synthase; Provisional	322
235346	PRK05111	PRK05111	acetylornithine deacetylase; Provisional	383
235347	PRK05113	PRK05113	electron transport complex protein RnfB; Provisional	191
179941	PRK05114	PRK05114	YoaH family protein. 	59
235348	PRK05122	PRK05122	major facilitator superfamily transporter; Provisional	399
235349	PRK05124	cysN	sulfate adenylyltransferase subunit 1; Provisional	474
235350	PRK05134	PRK05134	bifunctional 2-polyprenyl-6-hydroxyphenol methylase/3-demethylubiquinol 3-O-methyltransferase UbiG. 	233
235351	PRK05137	tolB	Tol-Pal system protein TolB. 	435
235352	PRK05151	PRK05151	electron transport complex protein RsxA; Provisional	193
235353	PRK05157	PRK05157	pyrroloquinoline quinone biosynthesis protein PqqC; Provisional	246
235354	PRK05159	aspC	aspartyl-tRNA synthetase; Provisional	437
235355	PRK05163	rpsL	30S ribosomal protein S12; Validated	124
179950	PRK05166	PRK05166	histidinol-phosphate transaminase. 	371
179951	PRK05168	PRK05168	ribonuclease T; Provisional	211
235356	PRK05170	PRK05170	YcgN family cysteine cluster protein. 	147
179953	PRK05174	PRK05174	bifunctional 3-hydroxydecanoyl-ACP dehydratase/trans-2-decenoyl-ACP isomerase. 	172
235357	PRK05177	minC	septum formation inhibitor MinC. 	239
235358	PRK05179	rpsM	30S ribosomal protein S13; Validated	122
235359	PRK05182	PRK05182	DNA-directed RNA polymerase subunit alpha; Provisional	310
235360	PRK05183	hscA	chaperone protein HscA; Provisional	616
235361	PRK05184	PRK05184	pyrroloquinoline quinone biosynthesis protein PqqB; Provisional	302
179959	PRK05185	rplT	50S ribosomal protein L20; Provisional	114
235362	PRK05192	PRK05192	tRNA uridine-5-carboxymethylaminomethyl(34) synthesis enzyme MnmG. 	618
235363	PRK05198	PRK05198	2-dehydro-3-deoxyphosphooctonate aldolase; Provisional	264
235364	PRK05201	hslU	ATP-dependent protease ATPase subunit HslU. 	443
235365	PRK05205	PRK05205	bifunctional pyr operon transcriptional regulator/uracil phosphoribosyltransferase PyrR. 	176
179964	PRK05208	PRK05208	hypothetical protein; Provisional	168
235366	PRK05218	PRK05218	heat shock protein 90; Provisional	613
235367	PRK05222	PRK05222	5-methyltetrahydropteroyltriglutamate--homocysteine S-methyltransferase; Provisional	758
235368	PRK05225	PRK05225	ketol-acid reductoisomerase; Validated	487
235369	PRK05231	PRK05231	homoserine kinase; Provisional	319
179969	PRK05234	mgsA	methylglyoxal synthase; Validated	142
235370	PRK05244	PRK05244	Der GTPase-activating protein YihI. 	177
235371	PRK05246	PRK05246	glutathione synthetase; Provisional	316
235372	PRK05248	PRK05248	hypothetical protein; Provisional	121
235373	PRK05249	PRK05249	Si-specific NAD(P)(+) transhydrogenase. 	461
235374	PRK05250	PRK05250	S-adenosylmethionine synthetase; Validated	384
235375	PRK05253	PRK05253	sulfate adenylyltransferase subunit CysD. 	301
235376	PRK05254	PRK05254	uracil-DNA glycosylase; Provisional	224
235377	PRK05255	PRK05255	ribosome-associated protein. 	171
235378	PRK05256	PRK05256	chromosome partition protein MukE. 	238
179979	PRK05257	PRK05257	malate:quinone oxidoreductase; Validated	494
179980	PRK05260	PRK05260	chromosome partition protein MukF. 	440
235379	PRK05261	PRK05261	phosphoketolase. 	785
179982	PRK05264	PRK05264	met regulon transcriptional regulator MetJ. 	105
235380	PRK05265	PRK05265	pyridoxine 5'-phosphate synthase; Provisional	239
235381	PRK05269	PRK05269	transaldolase B; Provisional	318
235382	PRK05270	PRK05270	UDP-glucose--hexose-1-phosphate uridylyltransferase. 	493
235383	PRK05273	PRK05273	D-tyrosyl-tRNA(Tyr) deacylase; Provisional	147
235384	PRK05274	PRK05274	2-keto-3-deoxygluconate permease; Provisional	326
235385	PRK05277	PRK05277	H(+)/Cl(-) exchange transporter ClcA. 	438
235386	PRK05279	PRK05279	N-acetylglutamate synthase; Validated	441
179990	PRK05282	PRK05282	dipeptidase PepE. 	233
235387	PRK05283	PRK05283	deoxyribose-phosphate aldolase; Provisional	257
235388	PRK05286	PRK05286	quinone-dependent dihydroorotate dehydrogenase. 	344
235389	PRK05287	PRK05287	cell division protein ZapD. 	250
235390	PRK05289	PRK05289	acyl-ACP--UDP-N-acetylglucosamine O-acyltransferase. 	262
235391	PRK05290	PRK05290	hybrid cluster protein; Provisional	546
235392	PRK05291	trmE	tRNA uridine-5-carboxymethylaminomethyl(34) synthesis GTPase MnmE. 	449
179997	PRK05293	glgC	glucose-1-phosphate adenylyltransferase; Provisional	380
235393	PRK05294	carB	carbamoyl-phosphate synthase large subunit. 	1066
235394	PRK05297	PRK05297	phosphoribosylformylglycinamidine synthase; Provisional	1290
235395	PRK05298	PRK05298	excinuclease ABC subunit UvrB. 	652
235396	PRK05299	rpsB	30S ribosomal protein S2; Provisional	258
235397	PRK05301	PRK05301	pyrroloquinoline quinone biosynthesis protein PqqE; Provisional	378
235398	PRK05302	PRK05302	30S ribosomal protein S7; Validated	156
235399	PRK05303	flgI	flagellar basal body P-ring protein FlgI. 	367
235400	PRK05305	PRK05305	phosphatidylserine decarboxylase family protein. 	206
235401	PRK05306	infB	translation initiation factor IF-2; Validated	746
180007	PRK05309	PRK05309	30S ribosomal protein S11; Validated	128
235402	PRK05312	pdxA	4-hydroxythreonine-4-phosphate dehydrogenase PdxA. 	336
180009	PRK05313	PRK05313	hypothetical protein; Provisional	452
235403	PRK05318	PRK05318	deoxyguanosinetriphosphate triphosphohydrolase-like protein; Provisional	432
235404	PRK05319	rplD	50S ribosomal protein L4; Provisional	205
235405	PRK05320	PRK05320	rhodanese superfamily protein; Provisional	257
235406	PRK05321	PRK05321	nicotinate phosphoribosyltransferase; Provisional	400
235407	PRK05322	PRK05322	galactokinase; Provisional	387
235408	PRK05324	PRK05324	succinylglutamate desuccinylase; Provisional	329
235409	PRK05325	PRK05325	hypothetical protein; Provisional	401
235410	PRK05326	PRK05326	potassium/proton antiporter. 	562
235411	PRK05327	rpsD	30S ribosomal protein S4; Validated	203
235412	PRK05329	PRK05329	glycerol-3-phosphate dehydrogenase subunit GlpB. 	422
235413	PRK05330	PRK05330	oxygen-dependent coproporphyrinogen oxidase. 	300
235414	PRK05331	PRK05331	phosphate acyltransferase PlsX. 	334
235415	PRK05333	PRK05333	NAD-dependent protein deacetylase. 	285
235416	PRK05335	PRK05335	tRNA (uracil-5-)-methyltransferase Gid; Reviewed	436
235417	PRK05337	PRK05337	beta-hexosaminidase; Provisional	337
235418	PRK05338	rplS	50S ribosomal protein L19; Provisional	116
235419	PRK05339	PRK05339	pyruvate, phosphate dikinase/phosphoenolpyruvate synthase regulator. 	269
235420	PRK05340	PRK05340	UDP-2,3-diacylglucosamine hydrolase; Provisional	241
235421	PRK05341	PRK05341	homogentisate 1,2-dioxygenase; Provisional	438
235422	PRK05342	clpX	ATP-dependent Clp protease ATP-binding subunit ClpX. 	412
235423	PRK05346	PRK05346	Na(+)-translocating NADH-quinone reductase subunit C; Provisional	256
235424	PRK05347	PRK05347	glutaminyl-tRNA synthetase; Provisional	554
235425	PRK05349	PRK05349	Na(+)-translocating NADH-quinone reductase subunit B; Provisional	405
180033	PRK05350	PRK05350	acyl carrier protein; Provisional	82
235426	PRK05352	PRK05352	Na(+)-translocating NADH-quinone reductase subunit A; Provisional	448
235427	PRK05354	PRK05354	biosynthetic arginine decarboxylase. 	634
235428	PRK05355	PRK05355	3-phosphoserine/phosphohydroxythreonine transaminase. 	360
235429	PRK05359	PRK05359	oligoribonuclease; Provisional	181
235430	PRK05362	PRK05362	phosphopentomutase; Provisional	394
235431	PRK05363	PRK05363	protein-methionine-sulfoxide reductase catalytic subunit MsrP. 	280
180040	PRK05365	PRK05365	malonic semialdehyde reductase; Provisional	195
235432	PRK05367	PRK05367	aminomethyl-transferring glycine dehydrogenase. 	954
235433	PRK05368	PRK05368	homoserine O-succinyltransferase; Provisional	302
235434	PRK05370	PRK05370	argininosuccinate synthase; Validated	447
235435	PRK05371	PRK05371	x-prolyl-dipeptidyl aminopeptidase; Provisional	767
180045	PRK05377	PRK05377	fructose-1,6-bisphosphate aldolase; Reviewed	296
235436	PRK05379	PRK05379	bifunctional nicotinamide-nucleotide adenylyltransferase/Nudix hydroxylase. 	340
235437	PRK05380	pyrG	CTP synthetase; Validated	533
235438	PRK05382	PRK05382	chorismate synthase; Validated	359
235439	PRK05385	PRK05385	phosphoribosylaminoimidazole synthetase; Provisional	327
235440	PRK05387	PRK05387	histidinol-phosphate aminotransferase; Provisional	353
235441	PRK05388	argJ	bifunctional glutamate N-acetyltransferase/amino-acid acetyltransferase ArgJ. 	395
235442	PRK05389	truB	tRNA pseudouridine synthase B; Provisional	305
235443	PRK05395	PRK05395	type II 3-dehydroquinate dehydratase. 	146
180054	PRK05396	tdh	L-threonine 3-dehydrogenase; Validated	341
180055	PRK05398	PRK05398	formyl-coenzyme A transferase; Provisional	416
235444	PRK05399	PRK05399	DNA mismatch repair protein MutS; Provisional	854
235445	PRK05402	PRK05402	1,4-alpha-glucan branching protein GlgB. 	726
235446	PRK05406	PRK05406	5-oxoprolinase subunit PxpA. 	246
180059	PRK05408	PRK05408	oxidative damage protection protein; Provisional	90
235447	PRK05409	PRK05409	hypothetical protein; Provisional	281
180061	PRK05412	PRK05412	putative nucleotide-binding protein; Reviewed	161
235448	PRK05414	PRK05414	urocanate hydratase; Provisional	556
235449	PRK05415	PRK05415	hypothetical protein; Provisional	341
235450	PRK05416	PRK05416	RNase adapter RapZ. 	288
235451	PRK05417	PRK05417	glutathione-dependent formaldehyde-activating enzyme; Provisional	191
235452	PRK05419	PRK05419	protein-methionine-sulfoxide reductase heme-binding subunit MsrQ. 	205
235453	PRK05420	PRK05420	aquaporin Z; Provisional	231
235454	PRK05421	PRK05421	endonuclease/exonuclease/phosphatase family protein. 	263
235455	PRK05422	smpB	SsrA-binding protein SmpB. 	148
180070	PRK05423	PRK05423	DUF496 family protein. 	104
180071	PRK05424	rplA	50S ribosomal protein L1; Validated	230
235456	PRK05425	PRK05425	asparagine synthetase AsnA; Provisional	327
235457	PRK05426	PRK05426	peptidyl-tRNA hydrolase; Provisional	189
235458	PRK05427	PRK05427	putative manganese-dependent inorganic pyrophosphatase; Provisional	308
235459	PRK05428	PRK05428	HPr kinase/phosphorylase; Provisional	308
235460	PRK05429	PRK05429	gamma-glutamyl kinase; Provisional	372
235461	PRK05431	PRK05431	seryl-tRNA synthetase; Provisional	425
235462	PRK05433	PRK05433	GTP-binding protein LepA; Provisional	600
235463	PRK05434	PRK05434	2,3-bisphosphoglycerate-independent phosphoglycerate mutase. 	507
235464	PRK05435	rpmA	50S ribosomal protein L27; Validated	82
235465	PRK05437	PRK05437	isopentenyl pyrophosphate isomerase; Provisional	352
235466	PRK05439	PRK05439	pantothenate kinase; Provisional	311
235467	PRK05441	murQ	N-acetylmuramic acid-6-phosphate etherase; Reviewed	299
235468	PRK05442	PRK05442	malate dehydrogenase; Provisional	326
235469	PRK05443	PRK05443	polyphosphate kinase; Provisional	691
235470	PRK05444	PRK05444	1-deoxy-D-xylulose-5-phosphate synthase; Provisional	580
180087	PRK05445	PRK05445	YfbU family protein. 	164
235471	PRK05446	PRK05446	bifunctional histidinol-phosphatase/imidazoleglycerol-phosphate dehydratase HisB. 	354
235472	PRK05447	PRK05447	1-deoxy-D-xylulose 5-phosphate reductoisomerase; Provisional	385
180090	PRK05449	PRK05449	aspartate alpha-decarboxylase; Provisional	126
235473	PRK05450	PRK05450	3-deoxy-manno-octulosonate cytidylyltransferase; Provisional	245
235474	PRK05451	PRK05451	dihydroorotase; Provisional	345
235475	PRK05452	PRK05452	anaerobic nitric oxide reductase flavorubredoxin; Provisional	479
235476	PRK05454	PRK05454	glucans biosynthesis glucosyltransferase MdoH. 	605
235477	PRK05456	PRK05456	ATP-dependent protease subunit HslV. 	172
235478	PRK05457	PRK05457	protease HtpX. 	284
235479	PRK05458	PRK05458	guanosine 5'-monophosphate oxidoreductase; Provisional	326
180098	PRK05461	apaG	Co2+/Mg2+ efflux protein ApaG; Reviewed	127
235480	PRK05462	PRK05462	adenosylmethionine decarboxylase. 	266
180100	PRK05463	PRK05463	putative hydro-lyase. 	262
235481	PRK05464	PRK05464	Na(+)-translocating NADH-quinone reductase subunit F; Provisional	409
235482	PRK05465	PRK05465	ethanolamine ammonia-lyase subunit EutC. 	260
235483	PRK05467	PRK05467	Fe(II)-dependent oxygenase superfamily protein; Provisional	226
235484	PRK05469	PRK05469	tripeptide aminopeptidase PepT. 	408
180105	PRK05470	PRK05470	fumarate reductase subunit FrdD. 	118
235485	PRK05471	PRK05471	CDP-diacylglycerol pyrophosphatase; Provisional	252
235486	PRK05472	PRK05472	redox-sensing transcriptional repressor Rex; Provisional	213
180108	PRK05473	IreB-like	IreB family regulatory phosphoprotein. IreB (EF1202) was characterized in Enterococcus faecalis as a small protein, well-conserved in the Firmicutes. It belongs to a system that includes the Ser/Thr protein kinase IreK, and phosphatase IreP, undergoes phosphorylation on threonine residues, and is involved in regulating cephalosporin resistance. This family was previously named DUF965 by Pfam model pfam06135	86
235487	PRK05474	PRK05474	xylose isomerase; Provisional	437
235488	PRK05476	PRK05476	S-adenosyl-L-homocysteine hydrolase; Provisional	425
235489	PRK05477	gatB	Asp-tRNA(Asn)/Glu-tRNA(Gln) amidotransferase subunit GatB. 	474
235490	PRK05478	PRK05478	3-isopropylmalate dehydratase large subunit. 	466
235491	PRK05479	PRK05479	ketol-acid reductoisomerase; Provisional	330
235492	PRK05480	PRK05480	uridine/cytidine kinase; Provisional	209
235493	PRK05481	PRK05481	lipoyl synthase; Provisional	289
235494	PRK05482	PRK05482	potassium-transporting ATPase subunit A; Provisional	559
180117	PRK05483	rplN	50S ribosomal protein L14; Validated	122
235495	PRK05498	rplF	50S ribosomal protein L6; Validated	178
180119	PRK05500	PRK05500	bifunctional orotidine-5'-phosphate decarboxylase/orotate phosphoribosyltransferase. 	477
180120	PRK05506	PRK05506	bifunctional sulfate adenylyltransferase subunit 1/adenylylsulfate kinase protein; Provisional	632
180121	PRK05508	PRK05508	methionine-R-sulfoxide reductase. 	119
235496	PRK05518	rpl6p	50S ribosomal protein L6P; Reviewed	180
235497	PRK05528	PRK05528	peptide-methionine (S)-S-oxide reductase. 	156
135428	PRK05529	PRK05529	cell division protein FtsQ; Provisional	255
180124	PRK05537	PRK05537	bifunctional sulfate adenylyltransferase/adenylylsulfate kinase. 	568
235498	PRK05541	PRK05541	adenylylsulfate kinase; Provisional	176
235499	PRK05550	PRK05550	bifunctional methionine sulfoxide reductase B/A protein; Provisional	283
235500	PRK05557	fabG	3-ketoacyl-(acyl-carrier-protein) reductase; Validated	248
235501	PRK05559	PRK05559	DNA topoisomerase IV subunit B; Reviewed	631
235502	PRK05560	PRK05560	DNA gyrase subunit A; Validated	805
235503	PRK05561	PRK05561	DNA topoisomerase 4 subunit A. 	742
235504	PRK05562	PRK05562	NAD(P)-dependent oxidoreductase. 	223
235505	PRK05563	PRK05563	DNA polymerase III subunits gamma and tau; Validated	559
180132	PRK05564	PRK05564	DNA polymerase III subunit delta'; Validated	313
235506	PRK05565	fabG	3-ketoacyl-(acyl-carrier-protein) reductase; Provisional	247
235507	PRK05567	PRK05567	inosine 5'-monophosphate dehydrogenase; Reviewed	486
235508	PRK05568	PRK05568	flavodoxin; Provisional	142
135442	PRK05569	PRK05569	flavodoxin; Provisional	141
235509	PRK05571	PRK05571	ribose-5-phosphate isomerase B; Provisional	148
180137	PRK05572	PRK05572	RNA polymerase sporulation sigma factor, SigF/SigG family. 	252
235510	PRK05573	rplU	50S ribosomal protein L21; Validated	103
235511	PRK05574	holA	DNA polymerase III subunit delta; Reviewed	340
180140	PRK05575	cbiC	precorrin-8X methylmutase; Validated	204
235512	PRK05576	PRK05576	cobalt-factor II C(20)-methyltransferase. 	229
180142	PRK05578	PRK05578	cytidine deaminase; Validated	131
235513	PRK05579	PRK05579	bifunctional phosphopantothenoylcysteine decarboxylase/phosphopantothenate synthase; Validated	399
235514	PRK05580	PRK05580	primosome assembly protein PriA; Validated	679
235515	PRK05581	PRK05581	ribulose-phosphate 3-epimerase; Validated	220
235516	PRK05582	PRK05582	type I DNA topoisomerase. 	650
235517	PRK05583	PRK05583	ribosomal protein L7Ae family protein; Provisional	104
180148	PRK05584	PRK05584	5'-methylthioadenosine/adenosylhomocysteine nucleosidase. 	230
235518	PRK05585	yajC	preprotein translocase subunit YajC; Validated	106
180150	PRK05586	PRK05586	acetyl-CoA carboxylase biotin carboxylase subunit. 	447
235519	PRK05588	PRK05588	histidinol phosphate phosphatase. 	255
235520	PRK05589	PRK05589	peptide chain release factor 2; Provisional	325
235521	PRK05590	PRK05590	hypothetical protein; Provisional	166
235522	PRK05591	rplQ	50S ribosomal protein L17; Validated	113
235523	PRK05592	rplO	50S ribosomal protein L15; Reviewed	146
235524	PRK05593	rplR	50S ribosomal protein L18; Reviewed	117
235525	PRK05595	PRK05595	replicative DNA helicase; Provisional	444
235526	PRK05597	PRK05597	molybdopterin biosynthesis protein MoeB; Validated	355
235527	PRK05599	PRK05599	SDR family oxidoreductase. 	246
235528	PRK05600	PRK05600	thiamine biosynthesis protein ThiF; Validated	370
235529	PRK05601	PRK05601	DNA polymerase III subunit epsilon; Validated	377
235530	PRK05602	PRK05602	RNA polymerase sigma factor; Reviewed	186
235531	PRK05605	PRK05605	long-chain-fatty-acid--CoA ligase; Validated	573
180161	PRK05609	nusG	transcription antitermination protein NusG; Validated	181
235532	PRK05610	rpsQ	30S ribosomal protein S17; Reviewed	84
180163	PRK05611	rpmD	50S ribosomal protein L30; Reviewed	59
168128	PRK05613	PRK05613	O-acetylhomoserine/O-acetylserine sulfhydrylase. 	437
180164	PRK05614	gltA	citrate synthase. 	419
235533	PRK05617	PRK05617	3-hydroxyisobutyryl-CoA hydrolase; Provisional	342
235534	PRK05618	PRK05618	50S ribosomal protein L25/general stress protein Ctc; Reviewed	197
180167	PRK05620	PRK05620	long-chain fatty-acid--CoA ligase. 	576
235535	PRK05621	PRK05621	F0F1 ATP synthase subunit gamma; Validated	284
180169	PRK05625	PRK05625	5-amino-6-(5-phosphoribosylamino)uracil reductase; Validated	217
180170	PRK05626	rpsO	30S ribosomal protein S15; Reviewed	89
235536	PRK05627	PRK05627	bifunctional riboflavin kinase/FAD synthetase. 	305
180172	PRK05628	PRK05628	coproporphyrinogen III oxidase; Validated	375
180173	PRK05629	PRK05629	hypothetical protein; Validated	318
180174	PRK05630	PRK05630	adenosylmethionine--8-amino-7-oxononanoate transaminase; Provisional	422
235537	PRK05632	PRK05632	phosphate acetyltransferase; Reviewed	684
235538	PRK05634	PRK05634	nucleosidase; Provisional	185
180177	PRK05636	PRK05636	replicative DNA helicase; Provisional	505
180178	PRK05637	PRK05637	anthranilate synthase component II; Provisional	208
235539	PRK05638	PRK05638	threonine synthase; Validated	442
168145	PRK05639	PRK05639	acetyl ornithine aminotransferase family protein. 	457
101884	PRK05640	PRK05640	putative monovalent cation/H+ antiporter subunit B; Reviewed	151
235540	PRK05641	PRK05641	putative acetyl-CoA carboxylase biotin carboxyl carrier protein subunit; Validated	153
168147	PRK05642	PRK05642	DnaA regulatory inactivator Hda. 	234
235541	PRK05643	PRK05643	DNA polymerase III subunit beta; Validated	367
235542	PRK05644	gyrB	DNA gyrase subunit B; Validated	638
135493	PRK05645	PRK05645	lysophospholipid acyltransferase. 	295
235543	PRK05646	PRK05646	lipid A biosynthesis lauroyl acyltransferase; Provisional	310
235544	PRK05647	purN	phosphoribosylglycinamide formyltransferase; Reviewed	200
235545	PRK05650	PRK05650	SDR family oxidoreductase. 	270
235546	PRK05653	fabG	3-oxoacyl-ACP reductase FabG. 	246
235547	PRK05654	PRK05654	acetyl-CoA carboxylase carboxyltransferase subunit beta. 	292
168156	PRK05656	PRK05656	acetyl-CoA C-acetyltransferase. 	393
235548	PRK05657	PRK05657	RNA polymerase sigma factor RpoS; Validated	325
235549	PRK05658	PRK05658	RNA polymerase sigma factor RpoD; Validated	619
168159	PRK05659	PRK05659	sulfur carrier protein ThiS; Validated	66
235550	PRK05660	PRK05660	radical SAM family heme chaperone HemW. 	378
180188	PRK05664	PRK05664	threonine-phosphate decarboxylase; Reviewed	330
168162	PRK05665	PRK05665	amidotransferase; Provisional	240
235551	PRK05667	dnaG	DNA primase; Validated	580
235552	PRK05670	PRK05670	anthranilate synthase component II; Provisional	189
168165	PRK05671	PRK05671	aspartate-semialdehyde dehydrogenase; Reviewed	336
235553	PRK05672	dnaE2	error-prone DNA polymerase; Validated	1046
235554	PRK05673	dnaE	DNA polymerase III subunit alpha; Validated	1135
168168	PRK05674	PRK05674	gamma-carboxygeranoyl-CoA hydratase; Validated	265
180193	PRK05675	sdhA	succinate dehydrogenase flavoprotein subunit; Reviewed	570
168170	PRK05677	PRK05677	long-chain-fatty-acid--CoA ligase; Validated	562
180194	PRK05678	PRK05678	succinyl-CoA synthetase subunit alpha; Validated	291
235555	PRK05679	PRK05679	pyridoxal 5'-phosphate synthase. 	195
180196	PRK05680	flgB	flagellar basal body rod protein FlgB; Reviewed	137
235556	PRK05681	flgC	flagellar basal body rod protein FlgC; Reviewed	135
235557	PRK05682	flgE	flagellar hook protein FlgE; Validated	407
235558	PRK05683	flgK	flagellar hook-associated protein FlgK; Validated	676
235559	PRK05684	flgJ	flagellar assembly peptidoglycan hydrolase FlgJ. 	312
235560	PRK05685	fliS	flagellar export chaperone FliS. 	132
235561	PRK05686	fliG	flagellar motor switch protein G; Validated	339
235562	PRK05687	fliH	flagellar assembly protein FliH. 	246
168181	PRK05688	fliI	flagellar protein export ATPase FliI. 	451
235563	PRK05689	fliJ	flagella biosynthesis chaperone FliJ. 	147
180204	PRK05690	PRK05690	molybdopterin biosynthesis protein MoeB; Provisional	245
235564	PRK05691	PRK05691	peptide synthase; Validated	4334
180206	PRK05692	PRK05692	hydroxymethylglutaryl-CoA lyase; Provisional	287
168186	PRK05693	PRK05693	SDR family oxidoreductase. 	274
180207	PRK05696	fliL	flagellar basal body-associated protein FliL; Reviewed	170
235565	PRK05697	PRK05697	flagellar basal body-associated protein FliL-like protein; Validated	137
168189	PRK05698	fliN	flagellar motor switch protein FliN. 	155
235566	PRK05699	fliP	flagellar biosynthesis protein FliP; Reviewed	245
235567	PRK05700	fliQ	flagellar type III secretion system protein FliQ. 	89
235568	PRK05701	fliR	flagellar type III secretion system protein FliR. 	242
235569	PRK05702	flhB	flagellar type III secretion system protein FlhB. 	359
235570	PRK05703	flhF	flagellar biosynthesis protein FlhF. 	424
235571	PRK05704	PRK05704	2-oxoglutarate dehydrogenase complex dihydrolipoyllysine-residue succinyltransferase. 	407
180215	PRK05707	PRK05707	DNA polymerase III subunit delta'; Validated	328
235572	PRK05708	PRK05708	putative 2-dehydropantoate 2-reductase. 	305
235573	PRK05710	PRK05710	tRNA glutamyl-Q(34) synthetase GluQRS. 	299
235574	PRK05711	PRK05711	DNA polymerase III subunit epsilon; Provisional	240
235575	PRK05713	PRK05713	iron-sulfur-binding ferredoxin reductase. 	312
168201	PRK05714	PRK05714	2-octaprenyl-3-methyl-6-methoxy-1,4-benzoquinol hydroxylase; Provisional	405
180218	PRK05715	PRK05715	NADH-quinone oxidoreductase subunit NuoK. 	100
235576	PRK05716	PRK05716	methionine aminopeptidase; Validated	252
168204	PRK05717	PRK05717	SDR family oxidoreductase. 	255
235577	PRK05718	PRK05718	keto-hydroxyglutarate-aldolase/keto-deoxy-phosphogluconate aldolase; Provisional	212
235578	PRK05720	mtnA	methylthioribose-1-phosphate isomerase; Reviewed	344
235579	PRK05722	PRK05722	glucose-6-phosphate 1-dehydrogenase; Validated	495
168208	PRK05723	PRK05723	flavodoxin; Provisional	151
235580	PRK05724	PRK05724	acetyl-CoA carboxylase carboxyltransferase subunit alpha; Validated	319
235581	PRK05728	PRK05728	DNA polymerase III subunit chi; Validated	142
235582	PRK05729	valS	valyl-tRNA synthetase; Reviewed	874
235583	PRK05731	PRK05731	thiamine monophosphate kinase; Provisional	318
235584	PRK05732	PRK05732	2-octaprenyl-6-methoxyphenyl hydroxylase; Validated	395
235585	PRK05733	PRK05733	single-stranded DNA-binding protein; Provisional	172
235586	PRK05738	rplW	50S ribosomal protein L23; Reviewed	92
235587	PRK05740	secE	preprotein translocase subunit SecE; Reviewed	92
180230	PRK05742	PRK05742	carboxylating nicotinate-nucleotide diphosphorylase. 	277
235588	PRK05743	ileS	isoleucyl-tRNA synthetase; Reviewed	912
180232	PRK05748	PRK05748	replicative DNA helicase; Provisional	448
235589	PRK05749	PRK05749	3-deoxy-D-manno-octulosonic-acid transferase; Reviewed	425
180234	PRK05751	PRK05751	preprotein translocase subunit SecB; Validated	156
235590	PRK05752	PRK05752	uroporphyrinogen-III synthase; Validated	255
180236	PRK05753	PRK05753	nucleoside diphosphate kinase regulator; Provisional	137
235591	PRK05755	PRK05755	DNA polymerase I; Provisional	880
235592	PRK05756	PRK05756	pyridoxal kinase PdxY. 	286
235593	PRK05758	PRK05758	F0F1 ATP synthase subunit delta; Validated	177
180240	PRK05759	PRK05759	F0F1 ATP synthase subunit B; Validated	156
180241	PRK05760	PRK05760	F0F1 ATP synthase subunit I; Validated	124
235594	PRK05761	PRK05761	DNA-directed DNA polymerase I. 	787
235595	PRK05762	PRK05762	DNA polymerase II; Reviewed	786
235596	PRK05764	PRK05764	aspartate aminotransferase; Provisional	393
235597	PRK05765	PRK05765	precorrin-3B C17-methyltransferase; Provisional	246
235598	PRK05766	rps14P	30S ribosomal protein S14P; Reviewed	52
180247	PRK05767	rpl44e	50S ribosomal protein L44e; Validated	92
235599	PRK05769	PRK05769	acetyl ornithine aminotransferase family protein. 	441
235600	PRK05771	PRK05771	V-type ATP synthase subunit I; Validated	646
168237	PRK05772	PRK05772	S-methyl-5-thioribose-1-phosphate isomerase. 	363
235601	PRK05773	PRK05773	3,4-dihydroxy-2-butanone 4-phosphate synthase; Validated	219
235602	PRK05776	PRK05776	DNA topoisomerase I; Provisional	670
235603	PRK05777	PRK05777	NADH-quinone oxidoreductase subunit NuoN. 	476
235604	PRK05778	PRK05778	2-oxoglutarate ferredoxin oxidoreductase subunit beta; Validated	301
235605	PRK05782	PRK05782	bifunctional sirohydrochlorin cobalt chelatase/precorrin-8X methylmutase; Validated	335
235606	PRK05783	PRK05783	hypothetical protein; Provisional	84
180256	PRK05784	PRK05784	phosphoribosylamine--glycine ligase; Provisional	486
235607	PRK05785	PRK05785	hypothetical protein; Provisional	226
235608	PRK05786	fabG	3-ketoacyl-(acyl-carrier-protein) reductase; Provisional	238
235609	PRK05787	PRK05787	cobalt-precorrin-7 (C(5))-methyltransferase. 	210
235610	PRK05788	PRK05788	cobalt-precorrin 5A hydrolase. 	315
180261	PRK05790	PRK05790	putative acyltransferase; Provisional	393
235611	PRK05793	PRK05793	amidophosphoribosyltransferase; Provisional	469
180263	PRK05799	PRK05799	oxygen-independent coproporphyrinogen III oxidase. 	374
235612	PRK05800	cobU	adenosylcobinamide kinase/adenosylcobinamide-phosphate guanylyltransferase; Validated	170
235613	PRK05802	PRK05802	sulfide/dihydroorotate dehydrogenase-like FAD/NAD-binding protein. 	320
180266	PRK05803	PRK05803	RNA polymerase sporulation sigma factor SigK. 	233
180267	PRK05805	PRK05805	phosphate butyryltransferase; Validated	301
235614	PRK05807	PRK05807	RNA-binding protein S1. 	136
180269	PRK05808	PRK05808	3-hydroxybutyryl-CoA dehydrogenase; Validated	282
180270	PRK05809	PRK05809	short-chain-enoyl-CoA hydratase. 	260
235615	PRK05812	secD	preprotein translocase subunit SecD; Reviewed	462
235616	PRK05813	PRK05813	single-stranded DNA-binding protein; Provisional	219
235617	PRK05815	PRK05815	F0F1 ATP synthase subunit A; Validated	227
235618	PRK05818	PRK05818	DNA polymerase III subunit delta'; Validated	261
180275	PRK05819	deoD	DeoD-type purine-nucleoside phosphorylase. 	235
180276	PRK05820	deoA	thymidine phosphorylase; Reviewed	440
235619	PRK05826	PRK05826	pyruvate kinase; Provisional	465
180278	PRK05828	PRK05828	acyl carrier protein; Validated	84
180279	PRK05834	PRK05834	hypothetical protein; Provisional	194
180280	PRK05835	PRK05835	class II fructose-1,6-bisphosphate aldolase. 	307
180281	PRK05839	PRK05839	succinyldiaminopimelate transaminase. 	374
235620	PRK05841	flgE	flagellar hook protein FlgE; Validated	603
235621	PRK05842	flgD	flagellar hook assembly protein FlgD. 	295
180284	PRK05844	PRK05844	pyruvate flavodoxin oxidoreductase subunit gamma; Validated	186
235622	PRK05846	PRK05846	NADH:ubiquinone oxidoreductase subunit M; Reviewed	497
180286	PRK05848	PRK05848	carboxylating nicotinate-nucleotide diphosphorylase. 	273
235623	PRK05849	PRK05849	hypothetical protein; Provisional	783
235624	PRK05850	PRK05850	acyl-CoA synthetase; Validated	578
180289	PRK05851	PRK05851	long-chain-fatty acid--ACP ligase MbtM. 	525
235625	PRK05852	PRK05852	fatty acid--CoA ligase family protein. 	534
235626	PRK05853	PRK05853	hypothetical protein; Validated	161
235627	PRK05854	PRK05854	SDR family oxidoreductase. 	313
235628	PRK05855	PRK05855	SDR family oxidoreductase. 	582
180293	PRK05857	PRK05857	fatty acid--CoA ligase. 	540
235629	PRK05858	PRK05858	acetolactate synthase. 	542
180295	PRK05862	PRK05862	enoyl-CoA hydratase; Provisional	257
135627	PRK05863	PRK05863	sulfur carrier protein ThiS; Provisional	65
168278	PRK05864	PRK05864	enoyl-CoA hydratase; Provisional	276
235630	PRK05865	PRK05865	sugar epimerase family protein. 	854
235631	PRK05866	PRK05866	SDR family oxidoreductase. 	293
135631	PRK05867	PRK05867	SDR family oxidoreductase. 	253
180297	PRK05868	PRK05868	FAD-binding protein. 	372
235632	PRK05869	PRK05869	enoyl-CoA hydratase; Validated	222
180298	PRK05870	PRK05870	enoyl-CoA hydratase; Provisional	249
235633	PRK05872	PRK05872	short chain dehydrogenase; Provisional	296
102036	PRK05874	PRK05874	L-fuculose-phosphate aldolase; Validated	217
180300	PRK05875	PRK05875	short chain dehydrogenase; Provisional	276
135637	PRK05876	PRK05876	short chain dehydrogenase; Provisional	275
235634	PRK05877	PRK05877	aminodeoxychorismate synthase component I; Provisional	405
235635	PRK05878	PRK05878	pyruvate phosphate dikinase; Provisional	530
180303	PRK05880	PRK05880	F0F1 ATP synthase subunit C; Validated	81
180304	PRK05883	PRK05883	acyl carrier protein; Validated	91
135642	PRK05884	PRK05884	SDR family oxidoreductase. 	223
235636	PRK05886	yajC	preprotein translocase subunit YajC; Validated	109
235637	PRK05888	PRK05888	NADH-quinone oxidoreductase subunit NuoI. 	164
180306	PRK05889	PRK05889	biotin/lipoyl-binding carrier protein. 	71
180307	PRK05892	PRK05892	nucleoside diphosphate kinase regulator; Provisional	158
235638	PRK05896	PRK05896	DNA polymerase III subunits gamma and tau; Validated	605
135648	PRK05898	dnaE	DNA polymerase III subunit alpha. 	971
235639	PRK05899	PRK05899	transketolase; Reviewed	586
235640	PRK05901	PRK05901	RNA polymerase sigma factor; Provisional	509
235641	PRK05904	PRK05904	coproporphyrinogen III oxidase; Provisional	353
235642	PRK05905	PRK05905	4-(cytidine 5'-diphospho)-2-C-methyl-D-erythritol kinase. 	258
168292	PRK05906	PRK05906	lipid A biosynthesis lauroyl acyltransferase; Provisional	454
235643	PRK05907	PRK05907	hypothetical protein; Provisional	311
168293	PRK05910	PRK05910	type III secretion system protein; Validated	584
235644	PRK05911	PRK05911	RNA polymerase sigma factor sigma-28; Reviewed	257
235645	PRK05912	PRK05912	tyrosyl-tRNA synthetase; Validated	408
102059	PRK05917	PRK05917	DNA polymerase III subunit delta'; Validated	290
180312	PRK05920	PRK05920	aromatic acid decarboxylase; Validated	204
102061	PRK05922	PRK05922	type III secretion system ATPase; Validated	434
235646	PRK05925	PRK05925	aspartate kinase; Provisional	440
168296	PRK05926	PRK05926	hypothetical protein; Provisional	370
135660	PRK05927	PRK05927	dehypoxanthine futalosine cyclase. 	350
235647	PRK05928	hemD	uroporphyrinogen-III synthase; Reviewed	249
235648	PRK05932	PRK05932	RNA polymerase factor sigma-54; Reviewed	455
180315	PRK05933	PRK05933	type III secretion system protein; Validated	372
168300	PRK05934	PRK05934	type III secretion system protein; Validated	341
235649	PRK05935	PRK05935	biotin--protein ligase; Provisional	190
102071	PRK05937	PRK05937	8-amino-7-oxononanoate synthase; Provisional	370
235650	PRK05939	PRK05939	cystathionine gamma-synthase family protein. 	397
235651	PRK05940	PRK05940	anthranilate synthase component I. 	463
180317	PRK05942	PRK05942	aspartate aminotransferase; Provisional	394
180318	PRK05943	PRK05943	50S ribosomal protein L25; Reviewed	94
180319	PRK05945	sdhA	succinate dehydrogenase/fumarate reductase flavoprotein subunit. 	575
180320	PRK05948	PRK05948	precorrin-2 C(20)-methyltransferase. 	238
180321	PRK05949	PRK05949	RNA polymerase sigma factor; Validated	327
235652	PRK05950	sdhB	succinate dehydrogenase iron-sulfur subunit; Reviewed	232
180323	PRK05951	ubiA	prenyltransferase; Reviewed	296
235653	PRK05952	PRK05952	beta-ketoacyl-ACP synthase. 	381
180325	PRK05953	PRK05953	Precorrin-8X methylmutase. 	208
180326	PRK05954	PRK05954	precorrin-8X methylmutase; Provisional	203
235654	PRK05957	PRK05957	pyridoxal phosphate-dependent aminotransferase. 	389
235655	PRK05958	PRK05958	8-amino-7-oxononanoate synthase; Reviewed	385
168315	PRK05962	PRK05962	amidase; Validated	424
180328	PRK05963	PRK05963	beta-ketoacyl-ACP synthase III. 	326
235656	PRK05964	PRK05964	adenosylmethionine--8-amino-7-oxononanoate transaminase; Provisional	423
180330	PRK05965	PRK05965	hypothetical protein; Provisional	459
235657	PRK05967	PRK05967	cystathionine beta-lyase; Provisional	395
168320	PRK05968	PRK05968	hypothetical protein; Provisional	389
235658	PRK05972	ligD	ATP-dependent DNA ligase; Reviewed	860
168322	PRK05973	PRK05973	replicative DNA helicase; Provisional	237
235659	PRK05974	PRK05974	phosphoribosylformylglycinamidine synthase subunit PurS; Reviewed	80
168324	PRK05975	PRK05975	3-carboxy-cis,cis-muconate cycloisomerase; Provisional	351
235660	PRK05976	PRK05976	dihydrolipoamide dehydrogenase; Validated	472
180334	PRK05978	PRK05978	hypothetical protein; Provisional	148
180335	PRK05980	PRK05980	crotonase/enoyl-CoA hydratase family protein. 	260
235661	PRK05981	PRK05981	enoyl-CoA hydratase/isomerase. 	266
180337	PRK05985	PRK05985	cytosine deaminase; Provisional	391
235662	PRK05986	PRK05986	cob(I)yrinic acid a,c-diamide adenosyltransferase. 	191
180339	PRK05988	PRK05988	formate dehydrogenase subunit gamma; Validated	156
235663	PRK05989	cobN	cobaltochelatase subunit CobN; Reviewed	1244
180341	PRK05990	PRK05990	precorrin-2 C(20)-methyltransferase; Reviewed	241
180342	PRK05991	PRK05991	precorrin-3B C17-methyltransferase; Provisional	250
180343	PRK05993	PRK05993	SDR family oxidoreductase. 	277
180344	PRK05994	PRK05994	O-acetylhomoserine aminocarboxypropyltransferase; Validated	427
235664	PRK05995	PRK05995	enoyl-CoA hydratase; Provisional	262
235665	PRK05996	motB	MotB family protein. 	423
235666	PRK06002	fliI	flagellar protein export ATPase FliI. 	450
168340	PRK06003	flgB	flagellar basal body rod protein FlgB; Reviewed	126
235667	PRK06004	flgB	flagellar basal body rod protein FlgB; Reviewed	127
180347	PRK06005	flgA	flagellar basal body P-ring formation protein FlgA. 	160
235668	PRK06007	fliF	flagellar basal body M-ring protein FliF. 	542
235669	PRK06008	flgL	flagellar hook-associated family protein. 	348
235670	PRK06009	flgD	flagellar hook assembly protein FlgD. 	140
235671	PRK06010	fliQ	flagellar biosynthesis protein FliQ; Reviewed	88
235672	PRK06012	flhA	flagellar type III secretion system protein FlhA. 	697
168348	PRK06015	PRK06015	2-dehydro-3-deoxy-phosphogluconate aldolase. 	201
235673	PRK06018	PRK06018	putative acyl-CoA synthetase; Provisional	542
235674	PRK06019	PRK06019	phosphoribosylaminoimidazole carboxylase ATPase subunit; Reviewed	372
168351	PRK06023	PRK06023	crotonase/enoyl-CoA hydratase family protein. 	251
235675	PRK06025	PRK06025	acetyl-CoA C-acetyltransferase. 	417
180353	PRK06026	PRK06026	5'-methylthioadenosine/S-adenosylhomocysteine nucleosidase; Validated	212
235676	PRK06027	purU	formyltetrahydrofolate deformylase; Reviewed	286
235677	PRK06029	PRK06029	UbiX family flavin prenyltransferase. 	185
180356	PRK06030	PRK06030	hypothetical protein; Provisional	124
235678	PRK06031	PRK06031	phosphoribosyltransferase; Provisional	233
235679	PRK06032	fliH	flagellar assembly protein H; Validated	199
180359	PRK06033	PRK06033	flagellar motor switch protein FliN. 	83
235680	PRK06034	PRK06034	hypothetical protein; Provisional	279
180361	PRK06035	PRK06035	3-hydroxyacyl-CoA dehydrogenase; Validated	291
180362	PRK06036	PRK06036	S-methyl-5-thioribose-1-phosphate isomerase. 	339
180363	PRK06038	PRK06038	N-ethylammeline chlorohydrolase; Provisional	430
235681	PRK06039	ileS	isoleucyl-tRNA synthetase; Reviewed	975
235682	PRK06041	PRK06041	archaellar assembly protein FlaJ. 	553
180366	PRK06043	PRK06043	fumarate hydratase; Provisional	192
180367	PRK06046	PRK06046	alanine dehydrogenase; Validated	326
180368	PRK06048	PRK06048	acetolactate synthase large subunit. 	561
235683	PRK06049	rpl30p	50S ribosomal protein L30P; Reviewed	154
235684	PRK06052	PRK06052	methionine synthase. 	344
180371	PRK06057	PRK06057	short chain dehydrogenase; Provisional	255
235685	PRK06058	PRK06058	4-aminobutyrate--2-oxoglutarate transaminase. 	443
180373	PRK06059	PRK06059	lipid-transfer protein; Provisional	399
180374	PRK06060	PRK06060	p-hydroxybenzoic acid--AMP ligase FadD22. 	705
235686	PRK06061	PRK06061	amidase; Provisional	483
235687	PRK06062	PRK06062	hypothetical protein; Provisional	451
180377	PRK06063	PRK06063	DEDDh family exonuclease. 	313
235688	PRK06064	PRK06064	thiolase domain-containing protein. 	389
180379	PRK06065	PRK06065	thiolase domain-containing protein. 	392
180380	PRK06066	PRK06066	thiolase domain-containing protein. 	385
180381	PRK06067	PRK06067	flagellar accessory protein FlaH; Validated	234
235689	PRK06069	sdhA	succinate dehydrogenase/fumarate reductase flavoprotein subunit. 	577
168377	PRK06072	PRK06072	enoyl-CoA hydratase; Provisional	248
235690	PRK06073	PRK06073	NADH dehydrogenase subunit A; Validated	124
235691	PRK06074	PRK06074	NADH dehydrogenase subunit C; Provisional	189
180385	PRK06075	PRK06075	NADH-quinone oxidoreductase subunit D. 	392
235692	PRK06076	PRK06076	NADH-quinone oxidoreductase subunit NuoH. 	322
235693	PRK06077	fabG	3-ketoacyl-(acyl-carrier-protein) reductase; Provisional	252
180387	PRK06078	PRK06078	pyrimidine-nucleoside phosphorylase; Reviewed	434
235694	PRK06079	PRK06079	enoyl-[acyl-carrier-protein] reductase FabI. 	252
235695	PRK06080	PRK06080	1,4-dihydroxy-2-naphthoate octaprenyltransferase; Validated	293
180390	PRK06082	PRK06082	aspartate aminotransferase family protein. 	459
180391	PRK06083	PRK06083	sulfur carrier protein ThiS; Provisional	84
180392	PRK06084	PRK06084	bifunctional O-acetylhomoserine aminocarboxypropyltransferase/cysteine synthase. 	425
180393	PRK06087	PRK06087	medium-chain fatty-acid--CoA ligase. 	547
180394	PRK06090	PRK06090	DNA polymerase III subunit delta'; Validated	319
180395	PRK06091	PRK06091	membrane protein FdrA; Validated	555
235696	PRK06092	PRK06092	4-amino-4-deoxychorismate lyase; Reviewed	268
180397	PRK06096	PRK06096	molybdenum transport protein ModD; Provisional	284
235697	PRK06099	PRK06099	F0F1 ATP synthase subunit I; Validated	126
180398	PRK06100	PRK06100	DNA polymerase III subunit psi; Provisional	132
180399	PRK06101	PRK06101	SDR family oxidoreductase. 	240
235698	PRK06102	PRK06102	amidase. 	452
180401	PRK06105	PRK06105	aminotransferase; Provisional	460
180402	PRK06106	PRK06106	carboxylating nicotinate-nucleotide diphosphorylase. 	281
180403	PRK06107	PRK06107	aspartate transaminase. 	402
180404	PRK06108	PRK06108	pyridoxal phosphate-dependent aminotransferase. 	382
235699	PRK06110	PRK06110	threonine dehydratase. 	322
180406	PRK06111	PRK06111	acetyl-CoA carboxylase biotin carboxylase subunit; Validated	450
235700	PRK06112	PRK06112	acetolactate synthase catalytic subunit; Validated	578
135765	PRK06113	PRK06113	7-alpha-hydroxysteroid dehydrogenase; Validated	255
180408	PRK06114	PRK06114	SDR family oxidoreductase. 	254
180409	PRK06115	PRK06115	dihydrolipoamide dehydrogenase; Reviewed	466
235701	PRK06116	PRK06116	glutathione reductase; Validated	450
180411	PRK06123	PRK06123	SDR family oxidoreductase. 	248
235702	PRK06124	PRK06124	SDR family oxidoreductase. 	256
235703	PRK06125	PRK06125	short chain dehydrogenase; Provisional	259
235704	PRK06126	PRK06126	hypothetical protein; Provisional	545
235705	PRK06127	PRK06127	enoyl-CoA hydratase; Provisional	269
180413	PRK06128	PRK06128	SDR family oxidoreductase. 	300
235706	PRK06129	PRK06129	3-hydroxyacyl-CoA dehydrogenase; Validated	308
235707	PRK06130	PRK06130	3-hydroxybutyryl-CoA dehydrogenase; Validated	311
235708	PRK06131	PRK06131	dihydroxy-acid dehydratase; Validated	571
235709	PRK06132	PRK06132	hypothetical protein; Provisional	359
235710	PRK06133	PRK06133	glutamate carboxypeptidase; Reviewed	410
180419	PRK06134	PRK06134	putative FAD-binding dehydrogenase; Reviewed	581
235711	PRK06136	PRK06136	uroporphyrinogen-III C-methyltransferase. 	249
235712	PRK06138	PRK06138	SDR family oxidoreductase. 	252
235713	PRK06139	PRK06139	SDR family oxidoreductase. 	330
180421	PRK06141	PRK06141	ornithine cyclodeaminase family protein. 	314
235714	PRK06142	PRK06142	crotonase/enoyl-CoA hydratase family protein. 	272
180423	PRK06143	PRK06143	enoyl-CoA hydratase; Provisional	256
180424	PRK06144	PRK06144	enoyl-CoA hydratase; Provisional	262
102207	PRK06145	PRK06145	acyl-CoA synthetase; Validated	497
235715	PRK06147	PRK06147	3-oxoacyl-(acyl carrier protein) synthase; Validated	348
180426	PRK06148	PRK06148	hypothetical protein; Provisional	1013
235716	PRK06149	PRK06149	aminotransferase. 	972
180428	PRK06151	PRK06151	N-ethylammeline chlorohydrolase; Provisional	488
235717	PRK06153	PRK06153	hypothetical protein; Provisional	393
235718	PRK06154	PRK06154	thiamine pyrophosphate-requiring protein. 	565
235719	PRK06155	PRK06155	crotonobetaine/carnitine-CoA ligase; Provisional	542
235720	PRK06156	PRK06156	dipeptidase. 	520
180433	PRK06157	PRK06157	acetyl-CoA acetyltransferase; Validated	398
180434	PRK06158	PRK06158	thiolase; Provisional	384
180435	PRK06161	PRK06161	putative monovalent cation/H+ antiporter subunit F; Reviewed	89
235721	PRK06163	PRK06163	hypothetical protein; Provisional	202
235722	PRK06164	PRK06164	acyl-CoA synthetase; Validated	540
180437	PRK06169	PRK06169	putative amidase; Provisional	466
235723	PRK06170	PRK06170	amidase; Provisional	490
180439	PRK06171	PRK06171	sorbitol-6-phosphate 2-dehydrogenase; Provisional	266
180440	PRK06172	PRK06172	SDR family oxidoreductase. 	253
180441	PRK06173	PRK06173	adenosylmethionine--8-amino-7-oxononanoate transaminase; Provisional	429
180442	PRK06175	PRK06175	L-aspartate oxidase; Provisional	433
180443	PRK06176	PRK06176	cystathionine gamma-synthase. 	380
235724	PRK06178	PRK06178	acyl-CoA synthetase; Validated	567
235725	PRK06179	PRK06179	short chain dehydrogenase; Provisional	270
180446	PRK06180	PRK06180	short chain dehydrogenase; Provisional	277
235726	PRK06181	PRK06181	SDR family oxidoreductase. 	263
180448	PRK06182	PRK06182	short chain dehydrogenase; Validated	273
235727	PRK06183	mhpA	bifunctional 3-(3-hydroxy-phenyl)propionate/3-hydroxycinnamic acid hydroxylase. 	500
235728	PRK06184	PRK06184	hypothetical protein; Provisional	502
235729	PRK06185	PRK06185	FAD-dependent oxidoreductase. 	407
180452	PRK06186	PRK06186	hypothetical protein; Validated	229
235730	PRK06187	PRK06187	long-chain-fatty-acid--CoA ligase; Validated	521
235731	PRK06188	PRK06188	acyl-CoA synthetase; Validated	524
235732	PRK06189	PRK06189	allantoinase; Provisional	451
235733	PRK06190	PRK06190	enoyl-CoA hydratase; Provisional	258
235734	PRK06193	PRK06193	hypothetical protein; Provisional	206
180458	PRK06194	PRK06194	hypothetical protein; Provisional	287
235735	PRK06195	PRK06195	DNA polymerase III subunit epsilon; Validated	309
235736	PRK06196	PRK06196	oxidoreductase; Provisional	315
235737	PRK06197	PRK06197	short chain dehydrogenase; Provisional	306
180462	PRK06198	PRK06198	short chain dehydrogenase; Provisional	260
235738	PRK06199	PRK06199	ornithine cyclodeaminase; Validated	379
235739	PRK06200	PRK06200	2,3-dihydroxy-2,3-dihydrophenylpropionate dehydrogenase; Provisional	263
180465	PRK06201	PRK06201	hypothetical protein; Validated	221
180466	PRK06202	PRK06202	hypothetical protein; Provisional	232
235740	PRK06203	aroB	3-dehydroquinate synthase; Reviewed	389
235741	PRK06205	PRK06205	acetyl-CoA C-acetyltransferase. 	404
235742	PRK06207	PRK06207	pyridoxal phosphate-dependent aminotransferase. 	405
235743	PRK06208	PRK06208	class II aldolase/adducin family protein. 	274
180471	PRK06209	PRK06209	glutamate-1-semialdehyde 2,1-aminomutase; Provisional	431
180472	PRK06210	PRK06210	enoyl-CoA hydratase; Provisional	272
235744	PRK06213	PRK06213	crotonase/enoyl-CoA hydratase family protein. 	229
235745	PRK06214	PRK06214	sulfite reductase subunit alpha. 	530
235746	PRK06215	PRK06215	hypothetical protein; Provisional	238
168472	PRK06217	PRK06217	hypothetical protein; Validated	183
235747	PRK06222	PRK06222	sulfide/dihydroorotate dehydrogenase-like FAD/NAD-binding protein. 	281
180477	PRK06223	PRK06223	malate dehydrogenase; Reviewed	307
235748	PRK06224	PRK06224	citryl-CoA lyase. 	263
235749	PRK06225	PRK06225	pyridoxal phosphate-dependent aminotransferase. 	380
235750	PRK06228	PRK06228	F0F1 ATP synthase subunit epsilon; Validated	131
180481	PRK06231	PRK06231	F0F1 ATP synthase subunit B; Validated	205
180482	PRK06233	PRK06233	vitamin B12 independent methionine synthase. 	372
168478	PRK06234	PRK06234	methionine gamma-lyase; Provisional	400
235751	PRK06241	PRK06241	phosphoenolpyruvate synthase; Validated	871
180484	PRK06242	PRK06242	flavodoxin; Provisional	150
180485	PRK06245	cofG	FO synthase subunit 1; Reviewed	336
180486	PRK06246	PRK06246	fumarate hydratase; Provisional	280
180487	PRK06247	PRK06247	pyruvate kinase; Provisional	476
180488	PRK06249	PRK06249	putative 2-dehydropantoate 2-reductase. 	313
235752	PRK06251	PRK06251	V-type ATP synthase subunit K; Validated	102
235753	PRK06252	PRK06252	methylcobalamin:coenzyme M methyltransferase; Validated	339
235754	PRK06253	PRK06253	O-phosphoseryl-tRNA synthetase; Reviewed	529
235755	PRK06256	PRK06256	biotin synthase; Validated	336
235756	PRK06259	PRK06259	succinate dehydrogenase/fumarate reductase iron-sulfur subunit; Provisional	486
235757	PRK06260	PRK06260	threonine synthase; Validated	397
235758	PRK06263	sdhA	succinate dehydrogenase flavoprotein subunit; Reviewed	543
235759	PRK06264	cbiC	cobalt-precorrin-8 methylmutase. 	210
235760	PRK06265	PRK06265	cobalt transporter CbiM. 	199
235761	PRK06266	PRK06266	transcription initiation factor E subunit alpha; Validated	178
235762	PRK06267	PRK06267	hypothetical protein; Provisional	350
235763	PRK06270	PRK06270	homoserine dehydrogenase; Provisional	341
180501	PRK06271	PRK06271	V-type ATP synthase subunit K; Validated	213
235764	PRK06273	PRK06273	ferredoxin; Provisional	165
235765	PRK06274	PRK06274	indolepyruvate oxidoreductase subunit beta. 	197
235766	PRK06276	PRK06276	acetolactate synthase large subunit. 	586
235767	PRK06277	PRK06277	energy conserving hydrogenase EhbF. 	478
180505	PRK06278	PRK06278	cobyrinic acid a,c-diamide synthase; Validated	476
180506	PRK06279	PRK06279	putative monovalent cation/H+ antiporter subunit E; Reviewed	100
235768	PRK06280	PRK06280	hypothetical protein; Provisional	77
180508	PRK06281	PRK06281	putative monovalent cation/H+ antiporter subunit B; Reviewed	154
180509	PRK06285	PRK06285	chorismate mutase; Provisional	96
235769	PRK06286	PRK06286	putative monovalent cation/H+ antiporter subunit G; Reviewed	91
180511	PRK06287	PRK06287	cobalt transport protein CbiN; Validated	107
235770	PRK06288	PRK06288	RNA polymerase sigma factor WhiG; Reviewed	268
235771	PRK06289	PRK06289	acetyl-CoA acetyltransferase; Provisional	403
235772	PRK06290	PRK06290	LL-diaminopimelate aminotransferase. 	410
235773	PRK06291	PRK06291	aspartate kinase; Provisional	465
235774	PRK06292	PRK06292	dihydrolipoamide dehydrogenase; Validated	460
180517	PRK06293	PRK06293	single-stranded DNA-binding protein; Provisional	161
180518	PRK06294	PRK06294	coproporphyrinogen III oxidase; Provisional	370
180519	PRK06298	PRK06298	type III secretion system protein; Validated	356
235775	PRK06299	rpsA	30S ribosomal protein S1; Reviewed	565
235776	PRK06300	PRK06300	enoyl-(acyl carrier protein) reductase; Provisional	299
235777	PRK06302	PRK06302	acetyl-CoA carboxylase biotin carboxyl carrier protein. 	155
180523	PRK06305	PRK06305	DNA polymerase III subunits gamma and tau; Validated	451
180524	PRK06309	PRK06309	DNA polymerase III subunit epsilon; Validated	232
180525	PRK06310	PRK06310	DNA polymerase III subunit epsilon; Validated	250
180526	PRK06315	PRK06315	type III secretion system ATPase; Provisional	442
235778	PRK06319	PRK06319	DNA topoisomerase I/SWI domain fusion protein; Validated	860
180528	PRK06321	PRK06321	replicative DNA helicase; Provisional	472
235779	PRK06327	PRK06327	dihydrolipoamide dehydrogenase; Validated	475
180530	PRK06328	PRK06328	HrpE/YscL family type III secretion apparatus protein. 	223
235780	PRK06330	PRK06330	transcript cleavage factor/unknown domain fusion protein; Validated	718
235781	PRK06333	PRK06333	beta-ketoacyl-ACP synthase. 	424
180533	PRK06334	PRK06334	long chain fatty acid--[acyl-carrier-protein] ligase; Validated	539
235782	PRK06341	PRK06341	single-stranded DNA-binding protein; Provisional	166
180535	PRK06342	PRK06342	transcription elongation factor GreA. 	160
180536	PRK06347	PRK06347	1,4-beta-N-acetylmuramoylhydrolase. 	592
180537	PRK06348	PRK06348	pyridoxal phosphate-dependent aminotransferase. 	384
235783	PRK06349	PRK06349	homoserine dehydrogenase; Provisional	426
180539	PRK06352	PRK06352	threonine synthase; Validated	351
235784	PRK06354	PRK06354	pyruvate kinase; Provisional	590
180541	PRK06357	PRK06357	hypothetical protein; Provisional	216
180542	PRK06358	PRK06358	threonine-phosphate decarboxylase; Provisional	354
180543	PRK06361	PRK06361	histidinol phosphate phosphatase domain-containing protein. 	212
235785	PRK06365	PRK06365	thiolase domain-containing protein. 	430
102340	PRK06366	PRK06366	acetyl-CoA C-acetyltransferase. 	388
235786	PRK06369	nac	nascent polypeptide-associated complex protein; Reviewed	115
235787	PRK06370	PRK06370	FAD-containing oxidoreductase. 	463
180547	PRK06371	PRK06371	S-methyl-5-thioribose-1-phosphate isomerase. 	329
235788	PRK06372	PRK06372	translation initiation factor IF-2B subunit delta; Provisional	253
180548	PRK06380	PRK06380	metal-dependent hydrolase; Provisional	418
235789	PRK06381	PRK06381	threonine synthase; Validated	319
180550	PRK06382	PRK06382	threonine dehydratase; Provisional	406
235790	PRK06386	PRK06386	replication factor A; Reviewed	358
102351	PRK06388	PRK06388	amidophosphoribosyltransferase; Provisional	474
235791	PRK06389	PRK06389	argininosuccinate lyase; Provisional	434
235792	PRK06390	PRK06390	adenylosuccinate lyase; Provisional	451
102354	PRK06392	PRK06392	homoserine dehydrogenase; Provisional	326
102355	PRK06393	rpoE	DNA-directed RNA polymerase subunit E''; Validated	64
235793	PRK06394	rpl13p	50S ribosomal protein L13P; Reviewed	146
102357	PRK06395	PRK06395	phosphoribosylamine--glycine ligase; Provisional	435
135898	PRK06397	PRK06397	V-type ATP synthase subunit H; Validated	111
235794	PRK06398	PRK06398	aldose dehydrogenase; Validated	258
235795	PRK06402	rpl12p	50S ribosomal protein L12P; Reviewed	106
102361	PRK06404	PRK06404	anthranilate synthase component I; Reviewed	351
235796	PRK06406	PRK06406	vitamin B12-dependent ribonucleotide reductase. 	771
180556	PRK06407	PRK06407	ornithine cyclodeaminase; Provisional	301
235797	PRK06411	PRK06411	NADH-quinone oxidoreductase subunit NuoB. 	183
235798	PRK06416	PRK06416	dihydrolipoamide dehydrogenase; Reviewed	462
180559	PRK06418	PRK06418	transcription elongation factor NusA-like protein; Validated	166
235799	PRK06419	rpl15p	50S ribosomal protein L15P; Reviewed	148
102368	PRK06423	PRK06423	phosphoribosylformylglycinamidine synthase; Provisional	73
102369	PRK06424	PRK06424	transcription factor; Provisional	144
102370	PRK06425	PRK06425	histidinol-phosphate aminotransferase; Validated	332
180561	PRK06427	PRK06427	bifunctional hydroxy-methylpyrimidine kinase/ hydroxy-phosphomethylpyrimidine kinase; Reviewed	266
102372	PRK06432	PRK06432	NADH dehydrogenase subunit A; Validated	144
180562	PRK06433	PRK06433	NADH dehydrogenase subunit J; Provisional	88
102374	PRK06434	PRK06434	cystathionine gamma-lyase; Validated	384
235800	PRK06436	PRK06436	2-hydroxyacid dehydrogenase. 	303
135906	PRK06437	PRK06437	hypothetical protein; Provisional	67
102377	PRK06438	PRK06438	hypothetical protein; Provisional	292
102378	PRK06439	PRK06439	NADH dehydrogenase subunit J; Provisional	72
235801	PRK06443	PRK06443	chorismate mutase; Validated	177
102381	PRK06444	PRK06444	prephenate dehydrogenase; Provisional	197
180563	PRK06445	PRK06445	acetyl-CoA C-acetyltransferase. 	394
235802	PRK06446	PRK06446	hypothetical protein; Provisional	436
180565	PRK06450	PRK06450	threonine synthase; Validated	338
235803	PRK06451	PRK06451	NADP-dependent isocitrate dehydrogenase. 	412
180567	PRK06452	sdhA	succinate dehydrogenase flavoprotein subunit; Reviewed	566
235804	PRK06455	PRK06455	riboflavin synthase; Provisional	155
180569	PRK06456	PRK06456	acetolactate synthase large subunit. 	572
180570	PRK06457	PRK06457	pyruvate dehydrogenase; Provisional	549
235805	PRK06458	PRK06458	hydrogenase 4 subunit F; Validated	490
235806	PRK06459	PRK06459	hydrogenase 4 subunit B; Validated	585
235807	PRK06460	PRK06460	hypothetical protein; Provisional	376
180574	PRK06461	PRK06461	single-stranded DNA-binding protein; Reviewed	129
235808	PRK06462	PRK06462	asparagine synthetase A; Reviewed	335
180576	PRK06463	fabG	3-ketoacyl-(acyl-carrier-protein) reductase; Provisional	255
235809	PRK06464	PRK06464	phosphoenolpyruvate synthase; Validated	795
180578	PRK06466	PRK06466	acetolactate synthase 3 large subunit. 	574
180579	PRK06467	PRK06467	dihydrolipoamide dehydrogenase; Reviewed	471
235810	PRK06473	PRK06473	NADH-quinone oxidoreductase subunit M. 	500
235811	PRK06474	PRK06474	hypothetical protein; Provisional	178
180582	PRK06475	PRK06475	FAD-binding protein. 	400
235812	PRK06476	PRK06476	pyrroline-5-carboxylate reductase; Reviewed	258
180584	PRK06481	PRK06481	flavocytochrome c. 	506
235813	PRK06482	PRK06482	SDR family oxidoreductase. 	276
180586	PRK06483	PRK06483	dihydromonapterin reductase; Provisional	236
168574	PRK06484	PRK06484	short chain dehydrogenase; Validated	520
235814	PRK06486	PRK06486	aldolase. 	262
180588	PRK06487	PRK06487	2-hydroxyacid dehydrogenase. 	317
168577	PRK06488	PRK06488	sulfur carrier protein ThiS; Validated	65
235815	PRK06489	PRK06489	hypothetical protein; Provisional	360
180590	PRK06490	PRK06490	glutamine amidotransferase; Provisional	239
180591	PRK06494	PRK06494	enoyl-CoA hydratase; Provisional	259
168580	PRK06495	PRK06495	enoyl-CoA hydratase/isomerase family protein. 	257
180592	PRK06498	PRK06498	isocitrate lyase; Provisional	531
235816	PRK06500	PRK06500	SDR family oxidoreductase. 	249
235817	PRK06501	PRK06501	beta-ketoacyl-ACP synthase. 	425
180595	PRK06504	PRK06504	acetyl-CoA C-acetyltransferase. 	390
180596	PRK06505	PRK06505	enoyl-[acyl-carrier-protein] reductase FabI. 	271
180597	PRK06508	PRK06508	acyl carrier protein; Provisional	93
180598	PRK06512	PRK06512	thiamine phosphate synthase. 	221
235818	PRK06518	PRK06518	hypothetical protein; Provisional	177
235819	PRK06519	PRK06519	beta-ketoacyl-ACP synthase. 	398
180601	PRK06520	PRK06520	5-methyltetrahydropteroyltriglutamate--homocysteine S-methyltransferase. 	368
235820	PRK06521	PRK06521	hydrogenase 4 subunit B; Validated	667
235821	PRK06522	PRK06522	2-dehydropantoate 2-reductase; Reviewed	304
180604	PRK06523	PRK06523	short chain dehydrogenase; Provisional	260
180605	PRK06524	PRK06524	biotin carboxylase-like protein; Validated	493
180606	PRK06525	PRK06525	hydrogenase 4 subunit D; Validated	479
180607	PRK06526	PRK06526	transposase; Provisional	254
180608	PRK06529	PRK06529	amidase; Provisional	482
235822	PRK06531	yajC	preprotein translocase subunit YajC; Validated	113
180610	PRK06539	PRK06539	ribonucleoside-diphosphate reductase subunit alpha. 	822
235823	PRK06541	PRK06541	aspartate aminotransferase family protein. 	460
180612	PRK06543	PRK06543	carboxylating nicotinate-nucleotide diphosphorylase. 	281
235824	PRK06545	PRK06545	prephenate dehydrogenase; Validated	359
180614	PRK06546	PRK06546	pyruvate dehydrogenase; Provisional	578
235825	PRK06547	PRK06547	hypothetical protein; Provisional	172
75628	PRK06548	PRK06548	ribonuclease H; Provisional	161
235826	PRK06549	PRK06549	acetyl-CoA carboxylase biotin carboxyl carrier protein subunit; Validated	130
180617	PRK06550	fabG	3-ketoacyl-(acyl-carrier-protein) reductase; Provisional	235
180618	PRK06552	PRK06552	keto-hydroxyglutarate-aldolase/keto-deoxy-phosphogluconate aldolase; Provisional	213
235827	PRK06553	PRK06553	lipid A biosynthesis lauroyl acyltransferase; Provisional	308
180620	PRK06555	PRK06555	pyrophosphate--fructose-6-phosphate 1-phosphotransferase; Validated	403
235828	PRK06556	PRK06556	vitamin B12-dependent ribonucleotide reductase; Validated	953
235829	PRK06557	PRK06557	L-ribulose-5-phosphate 4-epimerase; Validated	221
235830	PRK06558	PRK06558	V-type ATP synthase subunit K; Validated	159
235831	PRK06559	PRK06559	carboxylating nicotinate-nucleotide diphosphorylase. 	290
180625	PRK06563	PRK06563	crotonase/enoyl-CoA hydratase family protein. 	255
180626	PRK06565	PRK06565	amidase; Validated	566
235832	PRK06567	PRK06567	putative bifunctional glutamate synthase subunit beta/2-polyprenylphenol hydroxylase; Validated	1028
168615	PRK06568	PRK06568	F0F1 ATP synthase subunit B; Validated	154
180627	PRK06569	PRK06569	F0F1 ATP synthase subunit B'; Validated	155
180628	PRK06580	PRK06580	putative monovalent cation/H+ antiporter subunit E; Reviewed	103
235833	PRK06581	PRK06581	DNA polymerase III subunit delta'; Validated	263
180630	PRK06582	PRK06582	coproporphyrinogen III oxidase; Provisional	390
235834	PRK06585	holA	DNA polymerase III subunit delta; Reviewed	343
168619	PRK06588	PRK06588	putative monovalent cation/H+ antiporter subunit D; Reviewed	506
235835	PRK06589	PRK06589	putative monovalent cation/H+ antiporter subunit D; Reviewed	489
235836	PRK06590	PRK06590	NADH:ubiquinone oxidoreductase subunit L; Reviewed	624
235837	PRK06591	PRK06591	putative monovalent cation/H+ antiporter subunit D; Reviewed	432
235838	PRK06596	PRK06596	RNA polymerase factor sigma-32; Reviewed	284
235839	PRK06598	PRK06598	aspartate-semialdehyde dehydrogenase; Reviewed	369
235840	PRK06599	PRK06599	DNA topoisomerase I; Validated	675
180638	PRK06602	PRK06602	NADH:ubiquinone oxidoreductase subunit A; Validated	121
168626	PRK06603	PRK06603	enoyl-[acyl-carrier-protein] reductase FabI. 	260
235841	PRK06606	PRK06606	branched-chain amino acid transaminase. 	306
235842	PRK06608	PRK06608	serine/threonine dehydratase. 	338
168629	PRK06617	PRK06617	2-octaprenyl-6-methoxyphenyl hydroxylase; Validated	374
168630	PRK06620	PRK06620	hypothetical protein; Validated	214
102471	PRK06628	PRK06628	lipid A biosynthesis lauroyl acyltransferase; Provisional	290
168631	PRK06630	PRK06630	hypothetical protein; Provisional	99
168632	PRK06633	PRK06633	acetyl-CoA C-acetyltransferase. 	392
235843	PRK06635	PRK06635	aspartate kinase; Reviewed	404
235844	PRK06638	PRK06638	NADH-quinone oxidoreductase subunit J. 	198
135984	PRK06642	PRK06642	single-stranded DNA-binding protein; Provisional	152
180643	PRK06645	PRK06645	DNA polymerase III subunits gamma and tau; Validated	507
102480	PRK06646	PRK06646	DNA polymerase III subunit chi; Provisional	154
235845	PRK06647	PRK06647	DNA polymerase III subunits gamma and tau; Validated	563
180645	PRK06649	PRK06649	V-type ATP synthase subunit K; Validated	143
235846	PRK06654	fliL	flagellar basal body-associated protein FliL; Reviewed	181
235847	PRK06655	flgD	flagellar hook assembly protein FlgD. 	225
168637	PRK06661	PRK06661	hypothetical protein; Provisional	231
180648	PRK06663	PRK06663	flagellar hook-associated protein 3. 	419
235848	PRK06664	fliD	flagellar hook-associated protein FliD; Validated	661
180650	PRK06665	flgK	flagellar hook-associated protein FlgK; Validated	627
235849	PRK06666	fliM	flagellar motor switch protein FliM; Validated	337
180652	PRK06667	motB	flagellar motor protein MotB; Validated	252
235850	PRK06669	fliH	flagellar assembly protein H; Validated	281
180654	PRK06672	PRK06672	hypothetical protein; Validated	341
135998	PRK06673	PRK06673	DNA polymerase III subunit beta; Validated	376
235851	PRK06676	rpsA	30S ribosomal protein S1; Reviewed	390
180656	PRK06680	PRK06680	D-amino acid aminotransferase; Reviewed	286
136002	PRK06683	PRK06683	hypothetical protein; Provisional	82
180657	PRK06687	PRK06687	TRZ/ATZ family protein. 	419
235852	PRK06688	PRK06688	enoyl-CoA hydratase; Provisional	259
180659	PRK06690	PRK06690	acetyl-CoA C-acyltransferase. 	361
180660	PRK06696	PRK06696	uridine kinase; Validated	223
136007	PRK06698	PRK06698	bifunctional 5'-methylthioadenosine/S-adenosylhomocysteine nucleosidase/phosphatase; Validated	459
235853	PRK06701	PRK06701	short chain dehydrogenase; Provisional	290
102505	PRK06702	PRK06702	bifunctional O-acetylhomoserine aminocarboxypropyltransferase/cysteine synthase. 	432
235854	PRK06703	PRK06703	flavodoxin; Provisional	151
180663	PRK06704	PRK06704	RNA polymerase subunit sigma-70. 	228
180664	PRK06705	PRK06705	argininosuccinate lyase; Provisional	502
235855	PRK06707	PRK06707	amidase; Provisional	536
180666	PRK06710	PRK06710	long-chain-fatty-acid--CoA ligase; Validated	563
168652	PRK06714	PRK06714	S-adenosylhomocysteine nucleosidase; Validated	236
180667	PRK06718	PRK06718	NAD(P)-binding protein. 	202
180668	PRK06719	PRK06719	precorrin-2 dehydrogenase; Validated	157
180669	PRK06720	PRK06720	hypothetical protein; Provisional	169
136018	PRK06721	PRK06721	threonine synthase; Reviewed	352
180670	PRK06722	PRK06722	exonuclease; Provisional	281
180671	PRK06724	PRK06724	hypothetical protein; Provisional	128
180672	PRK06725	PRK06725	acetolactate synthase large subunit. 	570
136022	PRK06728	PRK06728	aspartate-semialdehyde dehydrogenase; Provisional	347
75717	PRK06731	flhF	flagellar biosynthesis regulator FlhF; Validated	270
235856	PRK06732	PRK06732	phosphopantothenate--cysteine ligase; Validated	229
180674	PRK06733	PRK06733	hypothetical protein; Provisional	151
180675	PRK06737	PRK06737	ACT domain-containing protein. 	76
180676	PRK06739	PRK06739	pyruvate kinase; Validated	352
180677	PRK06740	PRK06740	histidinol phosphate phosphatase domain-containing protein. 	331
102525	PRK06742	PRK06742	flagellar motor protein MotS; Reviewed	225
136027	PRK06743	PRK06743	flagellar motor protein MotP; Reviewed	254
75726	PRK06746	PRK06746	peptide chain release factor 2; Provisional	326
180678	PRK06748	PRK06748	hypothetical protein; Validated	83
168658	PRK06749	PRK06749	replicative DNA helicase; Provisional	428
168659	PRK06751	PRK06751	single-stranded DNA-binding protein; Provisional	173
168660	PRK06752	PRK06752	single-stranded DNA-binding protein; Validated	112
168661	PRK06753	PRK06753	hypothetical protein; Provisional	373
180679	PRK06754	mtnB	methylthioribulose 1-phosphate dehydratase. 	208
102532	PRK06755	PRK06755	hypothetical protein; Validated	209
168663	PRK06756	PRK06756	flavodoxin; Provisional	148
136035	PRK06758	PRK06758	hypothetical protein; Provisional	128
235857	PRK06759	PRK06759	RNA polymerase factor sigma-70; Validated	154
180681	PRK06760	PRK06760	hypothetical protein; Provisional	223
180682	PRK06761	PRK06761	hypothetical protein; Provisional	282
235858	PRK06762	PRK06762	hypothetical protein; Provisional	166
136040	PRK06763	PRK06763	F0F1 ATP synthase subunit alpha; Validated	213
102540	PRK06764	PRK06764	hypothetical protein; Provisional	105
235859	PRK06765	PRK06765	homoserine O-acetyltransferase; Provisional	389
180685	PRK06767	PRK06767	methionine gamma-lyase; Provisional	386
180686	PRK06769	PRK06769	HAD-IIIA family hydrolase. 	173
180687	PRK06770	PRK06770	hypothetical protein; Provisional	180
180688	PRK06771	PRK06771	hypothetical protein; Provisional	93
102546	PRK06772	PRK06772	salicylate synthase. 	434
180689	PRK06774	PRK06774	aminodeoxychorismate synthase component II. 	191
180690	PRK06777	PRK06777	4-aminobutyrate--2-oxoglutarate transaminase. 	421
235860	PRK06778	PRK06778	hypothetical protein; Validated	289
136048	PRK06781	PRK06781	amidophosphoribosyltransferase; Provisional	471
235861	PRK06782	PRK06782	flagellar motor switch protein; Reviewed	528
180693	PRK06788	PRK06788	flagellar motor switch protein; Validated	119
180694	PRK06789	PRK06789	flagellar motor switch protein; Validated	74
180695	PRK06792	flgD	flagellar basal body rod modification protein; Validated	190
180696	PRK06793	fliI	flagellar protein export ATPase FliI. 	432
180697	PRK06797	flgB	flagellar basal body rod protein FlgB; Reviewed	135
180698	PRK06798	fliD	flagellar hook-associated protein 2. 	440
180699	PRK06799	flgK	flagellar hook-associated protein FlgK; Validated	431
180700	PRK06800	fliH	flagellar assembly protein H; Validated	228
180701	PRK06801	PRK06801	ketose 1,6-bisphosphate aldolase. 	286
180702	PRK06802	flgC	flagellar basal body rod protein FlgC; Reviewed	141
235862	PRK06803	flgE	flagellar basal body protein FlaE. 	402
235863	PRK06804	flgA	flagellar basal body P-ring formation protein FlgA. 	261
180705	PRK06806	PRK06806	class II aldolase. 	281
235864	PRK06807	PRK06807	3'-5' exonuclease. 	313
180707	PRK06811	PRK06811	RNA polymerase factor sigma-70; Validated	189
168683	PRK06813	PRK06813	homoserine dehydrogenase; Validated	346
235865	PRK06814	PRK06814	acyl-[ACP]--phospholipid O-acyltransferase. 	1140
180709	PRK06815	PRK06815	threonine/serine dehydratase. 	317
235866	PRK06816	PRK06816	StlD/DarB family beta-ketosynthase. 	378
235867	PRK06819	PRK06819	FliC/FljB family flagellin. 	376
180712	PRK06820	PRK06820	EscN/YscN/HrcN family type III secretion system ATPase. 	440
136070	PRK06823	PRK06823	ornithine cyclodeaminase family protein. 	315
168689	PRK06824	PRK06824	translation initiation factor Sui1; Validated	118
235868	PRK06826	dnaE	DNA polymerase III DnaE; Reviewed	1151
180714	PRK06827	PRK06827	phosphoribosylpyrophosphate synthetase; Provisional	382
180715	PRK06828	PRK06828	amidase; Provisional	491
235869	PRK06830	PRK06830	ATP-dependent 6-phosphofructokinase. 	443
180717	PRK06833	PRK06833	L-fuculose-phosphate aldolase. 	214
235870	PRK06834	PRK06834	hypothetical protein; Provisional	488
235871	PRK06835	PRK06835	DNA replication protein DnaC; Validated	329
180720	PRK06836	PRK06836	pyridoxal phosphate-dependent aminotransferase. 	394
180721	PRK06837	PRK06837	ArgE/DapE family deacylase. 	427
168698	PRK06839	PRK06839	o-succinylbenzoate--CoA ligase. 	496
235872	PRK06840	PRK06840	3-oxoacyl-ACP synthase. 	339
180723	PRK06841	PRK06841	short chain dehydrogenase; Provisional	255
180724	PRK06842	PRK06842	Fe-S-containing hydro-lyase. 	185
180725	PRK06843	PRK06843	inosine 5-monophosphate dehydrogenase; Validated	404
235873	PRK06846	PRK06846	putative deaminase; Validated	410
235874	PRK06847	PRK06847	hypothetical protein; Provisional	375
235875	PRK06848	PRK06848	cytidine deaminase. 	139
235876	PRK06849	PRK06849	hypothetical protein; Provisional	389
235877	PRK06850	PRK06850	hypothetical protein; Provisional	507
235878	PRK06851	PRK06851	hypothetical protein; Provisional	367
180731	PRK06852	PRK06852	aldolase; Validated	304
180732	PRK06853	PRK06853	indolepyruvate oxidoreductase subunit beta; Reviewed	197
235879	PRK06854	PRK06854	adenylyl-sulfate reductase subunit alpha. 	608
180734	PRK06855	PRK06855	pyridoxal phosphate-dependent aminotransferase. 	433
180735	PRK06856	PRK06856	DNA polymerase III subunit psi; Validated	128
235880	PRK06860	PRK06860	lipid A biosynthesis lauroyl acyltransferase; Provisional	309
136097	PRK06863	PRK06863	single-stranded DNA-binding protein; Provisional	168
235881	PRK06870	secG	preprotein translocase subunit SecG; Reviewed	76
180738	PRK06871	PRK06871	DNA polymerase III subunit delta'; Validated	325
180739	PRK06876	PRK06876	F0F1 ATP synthase subunit C; Validated	78
168717	PRK06882	PRK06882	acetolactate synthase 3 large subunit. 	574
180740	PRK06886	PRK06886	hypothetical protein; Validated	329
168719	PRK06893	PRK06893	DnaA regulatory inactivator Hda. 	229
235882	PRK06895	PRK06895	anthranilate synthase component II. 	190
235883	PRK06901	PRK06901	oxidoreductase. 	322
136106	PRK06904	PRK06904	replicative DNA helicase; Validated	472
180742	PRK06911	rpsN	30S ribosomal protein S14; Reviewed	100
180743	PRK06912	acoL	dihydrolipoamide dehydrogenase; Validated	458
180744	PRK06914	PRK06914	SDR family oxidoreductase. 	280
180745	PRK06915	PRK06915	peptidase. 	422
180746	PRK06916	PRK06916	adenosylmethionine--8-amino-7-oxononanoate transaminase; Provisional	460
235884	PRK06917	PRK06917	aspartate aminotransferase family protein. 	447
235885	PRK06918	PRK06918	4-aminobutyrate aminotransferase; Reviewed	451
180749	PRK06920	dnaE	DNA polymerase III subunit alpha. 	1107
180750	PRK06921	PRK06921	hypothetical protein; Provisional	266
180751	PRK06922	PRK06922	class I SAM-dependent methyltransferase. 	677
235886	PRK06923	PRK06923	isochorismate synthase DhbC; Validated	399
180753	PRK06924	PRK06924	(S)-benzoin forming benzil reductase. 	251
235887	PRK06925	PRK06925	flagellar motor protein MotB. 	230
180755	PRK06926	PRK06926	flagellar motor protein MotP; Reviewed	271
235888	PRK06928	PRK06928	pyrroline-5-carboxylate reductase; Reviewed	277
180757	PRK06930	PRK06930	positive control sigma-like factor; Validated	170
235889	PRK06931	PRK06931	diaminobutyrate--2-oxoglutarate transaminase. 	459
235890	PRK06932	PRK06932	2-hydroxyacid dehydrogenase. 	314
235891	PRK06933	PRK06933	YscQ/HrcQ family type III secretion apparatus protein. 	308
180760	PRK06934	PRK06934	flavodoxin; Provisional	221
180761	PRK06935	PRK06935	2-dehydro-3-deoxy-D-gluconate 5-dehydrogenase KduD. 	258
180762	PRK06936	PRK06936	EscN/YscN/HrcN family type III secretion system ATPase. 	439
180763	PRK06937	PRK06937	HrpE/YscL family type III secretion apparatus protein. 	204
235892	PRK06938	PRK06938	diaminobutyrate--2-oxoglutarate aminotransferase; Provisional	464
235893	PRK06939	PRK06939	2-amino-3-ketobutyrate coenzyme A ligase; Provisional	397
180766	PRK06940	PRK06940	short chain dehydrogenase; Provisional	275
235894	PRK06943	PRK06943	adenosylmethionine--8-amino-7-oxononanoate transaminase; Provisional	453
180768	PRK06944	PRK06944	sulfur carrier protein ThiS; Provisional	65
235895	PRK06945	flgK	flagellar hook-associated protein FlgK; Validated	651
180770	PRK06946	PRK06946	lipid A biosynthesis lauroyl acyltransferase; Provisional	293
180771	PRK06947	PRK06947	SDR family oxidoreductase. 	248
180772	PRK06948	PRK06948	ribonucleotide reductase-like protein; Provisional	595
180773	PRK06949	PRK06949	SDR family oxidoreductase. 	258
180774	PRK06953	PRK06953	SDR family oxidoreductase. 	222
180775	PRK06954	PRK06954	acetyl-CoA C-acetyltransferase. 	397
235896	PRK06955	PRK06955	biotin--[acetyl-CoA-carboxylase] ligase. 	300
180777	PRK06958	PRK06958	single-stranded DNA-binding protein; Provisional	182
235897	PRK06959	PRK06959	threonine-phosphate decarboxylase. 	339
235898	PRK06964	PRK06964	DNA polymerase III subunit delta'; Validated	342
180780	PRK06965	PRK06965	acetolactate synthase 3 catalytic subunit; Validated	587
180781	PRK06973	PRK06973	nicotinate-nucleotide adenylyltransferase. 	243
235899	PRK06975	PRK06975	bifunctional uroporphyrinogen-III synthetase/uroporphyrin-III C-methyltransferase; Reviewed	656
235900	PRK06978	PRK06978	carboxylating nicotinate-nucleotide diphosphorylase. 	294
235901	PRK06986	fliA	flagellar biosynthesis sigma factor; Validated	236
235902	PRK06988	PRK06988	formyltransferase. 	312
235903	PRK06991	PRK06991	electron transport complex subunit RsxB. 	270
235904	PRK06995	flhF	flagellar biosynthesis protein FlhF. 	484
235905	PRK06996	PRK06996	UbiH/UbiF/VisC/COQ6 family ubiquinone biosynthesis hydroxylase. 	398
180789	PRK06997	PRK06997	enoyl-[acyl-carrier-protein] reductase FabI. 	260
235906	PRK07003	PRK07003	DNA polymerase III subunit gamma/tau. 	830
235907	PRK07004	PRK07004	replicative DNA helicase; Provisional	460
180792	PRK07006	PRK07006	isocitrate dehydrogenase; Reviewed	409
235908	PRK07008	PRK07008	long-chain-fatty-acid--CoA ligase; Validated	539
180794	PRK07018	flgA	flagellar basal body P-ring formation protein FlgA. 	235
235909	PRK07021	fliL	flagellar basal body-associated protein FliL; Reviewed	162
180796	PRK07023	PRK07023	SDR family oxidoreductase. 	243
235910	PRK07024	PRK07024	SDR family oxidoreductase. 	257
235911	PRK07027	PRK07027	cobalamin biosynthesis protein CbiG; Provisional	126
235912	PRK07028	PRK07028	bifunctional hexulose-6-phosphate synthase/ribonuclease regulator; Validated	430
180800	PRK07030	PRK07030	adenosylmethionine--8-amino-7-oxononanoate transaminase; Provisional	466
180801	PRK07033	PRK07033	DotU family type VI secretion system protein. 	427
168775	PRK07034	PRK07034	hypothetical protein; Provisional	536
180802	PRK07035	PRK07035	SDR family oxidoreductase. 	252
235913	PRK07036	PRK07036	aminotransferase. 	466
180803	PRK07037	PRK07037	extracytoplasmic-function sigma-70 factor; Validated	163
235914	PRK07041	PRK07041	SDR family oxidoreductase. 	230
235915	PRK07042	PRK07042	amidase; Provisional	464
235916	PRK07044	PRK07044	aldolase II superfamily protein; Provisional	252
136171	PRK07045	PRK07045	putative monooxygenase; Reviewed	388
235917	PRK07046	PRK07046	aminotransferase; Validated	453
235918	PRK07048	PRK07048	threo-3-hydroxy-L-aspartate ammonia-lyase. 	321
180809	PRK07049	PRK07049	cystathionine gamma-synthase family protein. 	427
180810	PRK07050	PRK07050	cystathionine beta-lyase; Provisional	394
180811	PRK07051	PRK07051	biotin carboxyl carrier domain-containing protein. 	80
235919	PRK07053	PRK07053	glutamine amidotransferase; Provisional	234
235920	PRK07054	PRK07054	isochorismate synthase. 	475
235921	PRK07056	PRK07056	amidase; Provisional	454
180814	PRK07057	sdhA	succinate dehydrogenase flavoprotein subunit; Reviewed	591
235922	PRK07058	PRK07058	acetate/propionate family kinase. 	396
235923	PRK07059	PRK07059	Long-chain-fatty-acid--CoA ligase; Validated	557
180817	PRK07060	PRK07060	short chain dehydrogenase; Provisional	245
180818	PRK07062	PRK07062	SDR family oxidoreductase. 	265
235924	PRK07063	PRK07063	SDR family oxidoreductase. 	260
180820	PRK07064	PRK07064	thiamine pyrophosphate-binding protein. 	544
168796	PRK07066	PRK07066	L-carnitine dehydrogenase. 	321
235925	PRK07067	PRK07067	L-iditol 2-dehydrogenase. 	257
180822	PRK07069	PRK07069	short chain dehydrogenase; Validated	251
180823	PRK07074	PRK07074	SDR family oxidoreductase. 	257
136191	PRK07075	PRK07075	isochorismate lyase. 	101
235926	PRK07077	PRK07077	phosphorylase. 	238
235927	PRK07078	PRK07078	hypothetical protein; Validated	759
235928	PRK07079	PRK07079	hypothetical protein; Provisional	469
235929	PRK07080	PRK07080	amino acid--[acyl-carrier-protein] ligase. 	317
180828	PRK07081	PRK07081	acyl carrier protein; Provisional	83
180829	PRK07084	PRK07084	class II fructose-1,6-bisphosphate aldolase. 	321
235930	PRK07085	PRK07085	diphosphate--fructose-6-phosphate 1-phosphotransferase; Provisional	555
180831	PRK07088	PRK07088	ribonucleoside-diphosphate reductase subunit alpha. 	764
180832	PRK07090	PRK07090	class II aldolase/adducin domain protein; Provisional	260
235931	PRK07092	PRK07092	benzoylformate decarboxylase; Reviewed	530
235932	PRK07093	PRK07093	para-aminobenzoate synthase component I; Validated	323
180835	PRK07094	PRK07094	biotin synthase; Provisional	323
235933	PRK07097	PRK07097	gluconate 5-dehydrogenase; Provisional	265
235934	PRK07101	PRK07101	hypothetical protein; Provisional	187
180838	PRK07102	PRK07102	SDR family oxidoreductase. 	243
180839	PRK07103	PRK07103	polyketide beta-ketoacyl:acyl carrier protein synthase; Validated	410
180840	PRK07105	PRK07105	pyridoxamine kinase; Validated	284
180841	PRK07106	PRK07106	phosphoribosylaminoimidazolecarboxamide formyltransferase. 	390
180842	PRK07107	PRK07107	IMP dehydrogenase. 	502
180843	PRK07108	PRK07108	acetyl-CoA C-acyltransferase. 	392
235935	PRK07109	PRK07109	short chain dehydrogenase; Provisional	334
235936	PRK07110	PRK07110	polyketide biosynthesis enoyl-CoA hydratase; Validated	249
235937	PRK07111	PRK07111	anaerobic ribonucleoside triphosphate reductase; Provisional	735
235938	PRK07112	PRK07112	enoyl-CoA hydratase/isomerase. 	255
235939	PRK07114	PRK07114	bifunctional 4-hydroxy-2-oxoglutarate aldolase/2-dehydro-3-deoxy-phosphogluconate aldolase. 	222
235940	PRK07115	PRK07115	AMP nucleosidase; Provisional	258
180850	PRK07116	PRK07116	flavodoxin; Provisional	160
180851	PRK07117	PRK07117	acyl carrier protein; Validated	79
235941	PRK07118	PRK07118	Fe-S cluster domain-containing protein. 	280
235942	PRK07119	PRK07119	2-ketoisovalerate ferredoxin reductase; Validated	352
180854	PRK07121	PRK07121	FAD-binding protein. 	492
168831	PRK07122	PRK07122	RNA polymerase sigma factor SigF; Reviewed	264
180855	PRK07132	PRK07132	DNA polymerase III subunit delta'; Validated	299
235943	PRK07133	PRK07133	DNA polymerase III subunits gamma and tau; Validated	725
235944	PRK07135	dnaE	DNA polymerase III DnaE; Validated	973
235945	PRK07139	PRK07139	amidase; Provisional	439
235946	PRK07143	PRK07143	hypothetical protein; Provisional	279
235947	PRK07152	nadD	nicotinate-nucleotide adenylyltransferase. 	342
235948	PRK07157	PRK07157	acetate kinase; Provisional	400
235949	PRK07159	PRK07159	F0F1 ATP synthase subunit C; Validated	100
235950	PRK07164	PRK07164	5'-methylthioadenosine/S-adenosylhomocysteine nucleosidase; Provisional	218
235951	PRK07165	PRK07165	ATP F0F1 synthase subunit alpha. 	507
180864	PRK07168	PRK07168	uroporphyrin-III C-methyltransferase. 	474
180865	PRK07178	PRK07178	acetyl-CoA carboxylase biotin carboxylase subunit. 	472
180866	PRK07179	PRK07179	quorum-sensing autoinducer synthase. 	407
75964	PRK07182	flgB	flagellar basal body rod protein FlgB; Reviewed	148
235952	PRK07187	PRK07187	ribonucleoside-diphosphate reductase subunit alpha. 	721
235953	PRK07188	PRK07188	nicotinate phosphoribosyltransferase; Provisional	352
235954	PRK07189	PRK07189	malonate decarboxylase subunit beta; Reviewed	301
235955	PRK07190	PRK07190	FAD-binding protein. 	487
180871	PRK07191	flgK	flagellar hook-associated protein FlgK; Validated	456
235956	PRK07192	flgL	flagellar hook-associated protein FlgL; Reviewed	305
235957	PRK07193	fliF	flagellar MS-ring protein; Reviewed	552
235958	PRK07194	fliG	flagellar motor switch protein G; Reviewed	334
180875	PRK07196	fliI	flagellar protein export ATPase FliI. 	434
235959	PRK07198	PRK07198	GTP cyclohydrolase II. 	418
235960	PRK07199	PRK07199	ribose-phosphate diphosphokinase. 	301
235961	PRK07200	PRK07200	aspartate/ornithine carbamoyltransferase family protein; Validated	395
235962	PRK07201	PRK07201	SDR family oxidoreductase. 	657
235963	PRK07203	PRK07203	putative aminohydrolase SsnA. 	442
235964	PRK07204	PRK07204	beta-ketoacyl-ACP synthase III. 	329
235965	PRK07205	PRK07205	hypothetical protein; Provisional	444
180883	PRK07206	PRK07206	hypothetical protein; Provisional	416
235966	PRK07207	PRK07207	ribonucleoside-diphosphate reductase subunit alpha. 	965
235967	PRK07208	PRK07208	hypothetical protein; Provisional	479
235968	PRK07209	PRK07209	ribonucleotide-diphosphate reductase subunit beta; Validated	369
180887	PRK07211	PRK07211	single-stranded DNA binding protein. 	485
235969	PRK07213	PRK07213	chlorohydrolase; Provisional	375
235970	PRK07217	PRK07217	replication factor A; Reviewed	311
180890	PRK07218	PRK07218	Single-stranded DNA binding protein. 	423
235971	PRK07219	PRK07219	DNA topoisomerase I; Validated	822
180892	PRK07220	PRK07220	DNA topoisomerase I; Validated	740
235972	PRK07225	PRK07225	DNA-directed RNA polymerase subunit B'; Validated	605
235973	PRK07226	PRK07226	fructose-bisphosphate aldolase; Provisional	267
180895	PRK07228	PRK07228	5'-deoxyadenosine deaminase. 	445
235974	PRK07229	PRK07229	aconitate hydratase; Validated	646
235975	PRK07231	FabG-like	SDR family oxidoreductase. 	251
235976	PRK07232	PRK07232	bifunctional malic enzyme oxidoreductase/phosphotransacetylase; Reviewed	752
235977	PRK07233	PRK07233	hypothetical protein; Provisional	434
235978	PRK07234	PRK07234	putative monovalent cation/H+ antiporter subunit D; Reviewed	470
235979	PRK07235	PRK07235	amidase; Provisional	502
235980	PRK07236	PRK07236	hypothetical protein; Provisional	386
180903	PRK07238	PRK07238	bifunctional RNase H/acid phosphatase; Provisional	372
235981	PRK07239	PRK07239	bifunctional uroporphyrinogen-III synthetase/response regulator domain protein; Validated	381
180905	PRK07246	PRK07246	bifunctional ATP-dependent DNA helicase/DNA polymerase III subunit epsilon; Validated	820
180906	PRK07247	PRK07247	3'-5' exonuclease. 	195
168880	PRK07248	PRK07248	chorismate mutase. 	87
180907	PRK07251	PRK07251	FAD-containing oxidoreductase. 	438
180908	PRK07252	PRK07252	S1 RNA-binding domain-containing protein. 	120
235982	PRK07259	PRK07259	dihydroorotate dehydrogenase. 	301
180910	PRK07260	PRK07260	enoyl-CoA hydratase; Provisional	255
180911	PRK07261	PRK07261	DNA topology modulation protein. 	171
235983	PRK07269	PRK07269	cystathionine gamma-synthase; Reviewed	364
235984	PRK07272	PRK07272	amidophosphoribosyltransferase; Provisional	484
180914	PRK07274	PRK07274	single-stranded DNA-binding protein; Provisional	131
180915	PRK07275	PRK07275	single-stranded DNA-binding protein; Provisional	162
180916	PRK07276	PRK07276	DNA polymerase III subunit delta'; Validated	290
180917	PRK07279	dnaE	DNA polymerase III DnaE; Reviewed	1034
180918	PRK07281	PRK07281	methionyl aminopeptidase. 	286
180919	PRK07282	PRK07282	acetolactate synthase large subunit. 	566
180920	PRK07283	PRK07283	YlxQ-related RNA-binding protein. 	98
180921	PRK07306	PRK07306	ribonucleotide-diphosphate reductase subunit alpha; Validated	720
180922	PRK07308	PRK07308	flavodoxin; Validated	146
235985	PRK07309	PRK07309	pyridoxal phosphate-dependent aminotransferase. 	391
235986	PRK07313	PRK07313	phosphopantothenoylcysteine decarboxylase; Validated	182
235987	PRK07314	PRK07314	beta-ketoacyl-ACP synthase II. 	411
180926	PRK07315	PRK07315	fructose-bisphosphate aldolase; Provisional	293
235988	PRK07318	PRK07318	dipeptidase PepV; Reviewed	466
180928	PRK07322	PRK07322	adenine phosphoribosyltransferase; Provisional	178
235989	PRK07324	PRK07324	transaminase; Validated	373
235990	PRK07326	PRK07326	SDR family oxidoreductase. 	237
235991	PRK07327	PRK07327	enoyl-CoA hydratase/isomerase family protein. 	268
235992	PRK07328	PRK07328	histidinol-phosphatase; Provisional	269
180933	PRK07329	PRK07329	hypothetical protein; Provisional	246
235993	PRK07331	PRK07331	cobalt transporter CbiM. 	322
180935	PRK07333	PRK07333	ubiquinone biosynthesis hydroxylase. 	403
235994	PRK07334	PRK07334	threonine dehydratase; Provisional	403
180937	PRK07337	PRK07337	pyridoxal phosphate-dependent aminotransferase. 	388
235995	PRK07338	PRK07338	hydrolase. 	402
235996	PRK07340	PRK07340	delta(1)-pyrroline-2-carboxylate reductase family protein. 	304
235997	PRK07342	PRK07342	peptide chain release factor 2; Provisional	339
235998	PRK07349	PRK07349	amidophosphoribosyltransferase; Provisional	500
180941	PRK07352	PRK07352	F0F1 ATP synthase subunit B; Validated	174
235999	PRK07353	PRK07353	F0F1 ATP synthase subunit B'; Validated	140
180942	PRK07354	PRK07354	F0F1 ATP synthase subunit C; Validated	81
236000	PRK07360	PRK07360	FO synthase subunit 2; Reviewed	371
180944	PRK07362	PRK07362	NADP-dependent isocitrate dehydrogenase. 	474
180945	PRK07363	PRK07363	NADH-quinone oxidoreductase subunit M. 	501
236001	PRK07364	PRK07364	FAD-dependent hydroxylase. 	415
180947	PRK07366	PRK07366	LL-diaminopimelate aminotransferase. 	388
236002	PRK07369	PRK07369	dihydroorotase; Provisional	418
180949	PRK07370	PRK07370	enoyl-[acyl-carrier-protein] reductase FabI. 	258
236003	PRK07373	PRK07373	DNA polymerase III subunit alpha; Reviewed	449
168927	PRK07374	dnaE	DNA polymerase III subunit alpha; Validated	1170
236004	PRK07375	PRK07375	Na+/H+ antiporter subunit C. 	112
236005	PRK07376	PRK07376	NADH-quinone oxidoreductase subunit L. 	673
236006	PRK07377	PRK07377	hypothetical protein; Provisional	184
236007	PRK07379	PRK07379	coproporphyrinogen III oxidase; Provisional	400
180954	PRK07380	PRK07380	adenylosuccinate lyase; Provisional	431
236008	PRK07390	PRK07390	NAD(P)H-quinone oxidoreductase subunit F; Validated	613
236009	PRK07392	PRK07392	threonine-phosphate decarboxylase; Validated	360
168934	PRK07394	PRK07394	hypothetical protein; Provisional	342
236010	PRK07395	PRK07395	L-aspartate oxidase; Provisional	553
180958	PRK07396	PRK07396	dihydroxynaphthoic acid synthetase; Validated	273
236011	PRK07399	PRK07399	DNA polymerase III subunit delta'; Validated	314
180960	PRK07400	PRK07400	30S ribosomal protein S1; Reviewed	318
180961	PRK07402	PRK07402	precorrin-6Y C5,15-methyltransferase subunit CbiT. 	196
180962	PRK07403	PRK07403	type I glyceraldehyde-3-phosphate dehydrogenase. 	337
180963	PRK07405	PRK07405	RNA polymerase sigma factor SigD; Validated	317
236012	PRK07406	PRK07406	RNA polymerase sigma factor RpoD; Validated	373
180965	PRK07408	PRK07408	RNA polymerase sigma factor SigF; Reviewed	256
236013	PRK07409	PRK07409	threonine synthase; Validated	353
180967	PRK07411	PRK07411	molybdopterin-synthase adenylyltransferase MoeB. 	390
180968	PRK07413	PRK07413	cob(I)yrinic acid a,c-diamide adenosyltransferase. 	382
168945	PRK07414	PRK07414	P-loop NTPase family protein. 	178
180969	PRK07415	PRK07415	NAD(P)H-quinone oxidoreductase subunit H; Validated	394
180970	PRK07417	PRK07417	prephenate/arogenate dehydrogenase. 	279
236014	PRK07418	PRK07418	acetolactate synthase large subunit. 	616
236015	PRK07419	PRK07419	2-carboxy-1,4-naphthoquinone phytyltransferase. 	304
236016	PRK07424	PRK07424	bifunctional sterol desaturase/short chain dehydrogenase; Validated	406
236017	PRK07428	PRK07428	carboxylating nicotinate-nucleotide diphosphorylase. 	288
180975	PRK07429	PRK07429	phosphoribulokinase; Provisional	327
236018	PRK07431	PRK07431	aspartate kinase; Provisional	587
180977	PRK07432	PRK07432	S-methyl-5'-thioadenosine phosphorylase. 	290
180978	PRK07440	PRK07440	thiamine biosynthesis protein ThiS. 	70
236019	PRK07445	PRK07445	O-succinylbenzoic acid--CoA ligase; Reviewed	452
236020	PRK07449	PRK07449	2-succinyl-5-enolpyruvyl-6-hydroxy-3-cyclohexene-1-carboxylate synthase; Validated	568
236021	PRK07451	PRK07451	translation initiation factor. 	115
180982	PRK07452	PRK07452	DNA polymerase III subunit delta; Validated	326
180983	PRK07453	PRK07453	protochlorophyllide reductase. 	322
180984	PRK07454	PRK07454	SDR family oxidoreductase. 	241
180985	PRK07455	PRK07455	bifunctional 4-hydroxy-2-oxoglutarate aldolase/2-dehydro-3-deoxy-phosphogluconate aldolase. 	187
180986	PRK07459	PRK07459	single-stranded DNA-binding protein; Provisional	121
180987	PRK07468	PRK07468	crotonase/enoyl-CoA hydratase family protein. 	262
180988	PRK07470	PRK07470	acyl-CoA synthetase; Validated	528
236022	PRK07471	PRK07471	DNA polymerase III subunit delta'; Validated	365
168961	PRK07473	PRK07473	M20/M25/M40 family metallo-hydrolase. 	376
236023	PRK07474	PRK07474	sulfur oxidation protein SoxY; Provisional	154
236024	PRK07475	PRK07475	hypothetical protein; Provisional	245
236025	PRK07476	eutB	threonine dehydratase; Provisional	322
180993	PRK07478	PRK07478	short chain dehydrogenase; Provisional	254
180994	PRK07480	PRK07480	putative aminotransferase; Validated	456
168967	PRK07481	PRK07481	hypothetical protein; Provisional	449
236026	PRK07482	PRK07482	hypothetical protein; Provisional	461
236027	PRK07483	PRK07483	aspartate aminotransferase family protein. 	443
236028	PRK07486	PRK07486	amidase; Provisional	484
236029	PRK07487	PRK07487	amidase; Provisional	469
236030	PRK07488	PRK07488	indoleacetamide hydrolase. 	472
236031	PRK07490	PRK07490	hypothetical protein; Provisional	245
181000	PRK07492	PRK07492	adenylosuccinate lyase; Provisional	435
181001	PRK07494	PRK07494	UbiH/UbiF family hydroxylase. 	388
236032	PRK07495	PRK07495	4-aminobutyrate--2-oxoglutarate transaminase. 	425
236033	PRK07500	rpoH2	RNA polymerase factor sigma-32; Reviewed	289
236034	PRK07502	PRK07502	prephenate/arogenate dehydrogenase family protein. 	307
181005	PRK07503	PRK07503	methionine gamma-lyase; Provisional	403
168979	PRK07504	PRK07504	O-succinylhomoserine sulfhydrylase; Reviewed	398
181006	PRK07505	PRK07505	hypothetical protein; Provisional	402
236035	PRK07508	PRK07508	aminodeoxychorismate synthase component I. 	378
181008	PRK07509	PRK07509	crotonase/enoyl-CoA hydratase family protein. 	262
181009	PRK07511	PRK07511	enoyl-CoA hydratase; Provisional	260
236036	PRK07512	PRK07512	L-aspartate oxidase; Provisional	513
181011	PRK07514	PRK07514	malonyl-CoA synthase; Validated	504
236037	PRK07515	PRK07515	3-oxoacyl-(acyl carrier protein) synthase III; Reviewed	372
181013	PRK07516	PRK07516	thiolase domain-containing protein. 	389
236038	PRK07521	flgK	flagellar hook-associated protein FlgK; Validated	483
236039	PRK07522	PRK07522	acetylornithine deacetylase; Provisional	385
236040	PRK07523	PRK07523	gluconate 5-dehydrogenase; Provisional	255
236041	PRK07524	PRK07524	5-guanidino-2-oxopentanoate decarboxylase. 	535
236042	PRK07525	PRK07525	sulfoacetaldehyde acetyltransferase; Validated	588
236043	PRK07529	PRK07529	AMP-binding domain protein; Validated	632
181018	PRK07530	PRK07530	3-hydroxybutyryl-CoA dehydrogenase; Validated	292
236044	PRK07531	PRK07531	carnitine 3-dehydrogenase. 	495
181020	PRK07533	PRK07533	enoyl-[acyl-carrier-protein] reductase FabI. 	258
236045	PRK07534	PRK07534	betaine--homocysteine S-methyltransferase. 	336
181022	PRK07535	PRK07535	methyltetrahydrofolate:corrinoid/iron-sulfur protein methyltransferase; Validated	261
236046	PRK07538	PRK07538	hypothetical protein; Provisional	413
181024	PRK07539	PRK07539	NADH-quinone oxidoreductase subunit NuoE. 	154
181025	PRK07544	PRK07544	branched-chain amino acid aminotransferase; Validated	292
169002	PRK07546	PRK07546	hypothetical protein; Provisional	209
181026	PRK07550	PRK07550	aminotransferase. 	386
181027	PRK07558	PRK07558	F0F1 ATP synthase subunit C; Validated	74
181028	PRK07559	PRK07559	2'-deoxycytidine 5'-triphosphate deaminase; Provisional	365
236047	PRK07560	PRK07560	elongation factor EF-2; Reviewed	731
236048	PRK07561	PRK07561	DNA topoisomerase I subunit omega; Validated	859
236049	PRK07562	PRK07562	vitamin B12-dependent ribonucleotide reductase. 	1220
236050	PRK07564	PRK07564	phosphoglucomutase; Validated	543
236051	PRK07565	PRK07565	dihydroorotate dehydrogenase-like protein. 	334
236052	PRK07566	PRK07566	chlorophyll synthase ChlG. 	314
181035	PRK07567	PRK07567	glutamine amidotransferase; Provisional	242
181036	PRK07568	PRK07568	pyridoxal phosphate-dependent aminotransferase. 	397
181037	PRK07569	PRK07569	bidirectional hydrogenase complex protein HoxU; Validated	234
181038	PRK07570	PRK07570	succinate dehydrogenase/fumarate reductase iron-sulfur subunit; Validated	250
236053	PRK07571	PRK07571	bidirectional hydrogenase complex protein HoxE; Reviewed	169
181039	PRK07572	PRK07572	cytosine deaminase; Validated	426
236054	PRK07573	sdhA	fumarate reductase/succinate dehydrogenase flavoprotein subunit. 	640
181041	PRK07574	PRK07574	NAD-dependent formate dehydrogenase. 	385
236055	PRK07575	PRK07575	dihydroorotase; Provisional	438
236056	PRK07576	PRK07576	short chain dehydrogenase; Provisional	264
181044	PRK07577	PRK07577	SDR family oxidoreductase. 	234
236057	PRK07578	PRK07578	short chain dehydrogenase; Provisional	199
236058	PRK07579	PRK07579	dTDP-4-amino-4,6-dideoxyglucose formyltransferase. 	245
236059	PRK07580	PRK07580	Mg-protoporphyrin IX methyl transferase; Validated	230
236060	PRK07581	PRK07581	hypothetical protein; Validated	339
236061	PRK07582	PRK07582	cystathionine gamma-lyase; Validated	366
236062	PRK07583	PRK07583	cytosine deaminase. 	438
236063	PRK07586	PRK07586	acetolactate synthase large subunit. 	514
169028	PRK07588	PRK07588	FAD-binding domain. 	391
236064	PRK07589	PRK07589	ornithine cyclodeaminase; Validated	346
181053	PRK07590	PRK07590	L,L-diaminopimelate aminotransferase; Validated	409
236065	PRK07591	PRK07591	threonine synthase; Validated	421
136438	PRK07594	PRK07594	EscN/YscN/HrcN family type III secretion system ATPase. 	433
236066	PRK07597	secE	preprotein translocase subunit SecE; Reviewed	64
236067	PRK07598	PRK07598	RNA polymerase sigma factor SigC; Validated	415
181057	PRK07608	PRK07608	UbiH/UbiF family hydroxylase. 	388
181058	PRK07609	PRK07609	CDP-6-deoxy-delta-3,4-glucoseen reductase; Validated	339
181059	PRK07627	PRK07627	dihydroorotase; Provisional	425
236068	PRK07630	PRK07630	CobD/CbiB family protein; Provisional	312
181061	PRK07631	PRK07631	amidophosphoribosyltransferase; Provisional	475
236069	PRK07632	PRK07632	ribonucleotide-diphosphate reductase subunit alpha; Validated	699
181063	PRK07634	PRK07634	pyrroline-5-carboxylate reductase; Reviewed	245
236070	PRK07636	ligB	ATP-dependent DNA ligase; Reviewed	275
236071	PRK07638	PRK07638	acyl-CoA synthetase; Validated	487
181065	PRK07639	PRK07639	petrobactin biosynthesis protein AsbD. 	86
181066	PRK07649	PRK07649	aminodeoxychorismate/anthranilate synthase component II. 	195
181067	PRK07650	PRK07650	4-amino-4-deoxychorismate lyase; Provisional	283
236072	PRK07656	PRK07656	long-chain-fatty-acid--CoA ligase; Validated	513
181069	PRK07657	PRK07657	enoyl-CoA hydratase; Provisional	260
181070	PRK07658	PRK07658	enoyl-CoA hydratase; Provisional	257
236073	PRK07659	PRK07659	enoyl-CoA hydratase; Provisional	260
181072	PRK07661	PRK07661	acetyl-CoA C-acetyltransferase. 	391
236074	PRK07666	fabG	3-ketoacyl-(acyl-carrier-protein) reductase; Provisional	239
169051	PRK07667	PRK07667	uridine kinase; Provisional	193
181074	PRK07668	PRK07668	hypothetical protein; Validated	254
181075	PRK07670	PRK07670	FliA/WhiG family RNA polymerase sigma factor. 	251
181076	PRK07671	PRK07671	bifunctional cystathionine gamma-lyase/homocysteine desulfhydrase. 	377
181077	PRK07677	PRK07677	short chain dehydrogenase; Provisional	252
181078	PRK07678	PRK07678	aminotransferase; Validated	451
181079	PRK07679	PRK07679	pyrroline-5-carboxylate reductase; Reviewed	279
181080	PRK07680	PRK07680	late competence protein ComER; Validated	273
181081	PRK07681	PRK07681	LL-diaminopimelate aminotransferase. 	399
181082	PRK07682	PRK07682	aminotransferase. 	378
236075	PRK07683	PRK07683	aminotransferase A; Validated	387
181084	PRK07688	PRK07688	thiamine/molybdopterin biosynthesis ThiF/MoeB-like protein; Validated	339
181085	PRK07691	PRK07691	putative monovalent cation/H+ antiporter subunit D; Reviewed	496
181086	PRK07695	PRK07695	thiazole tautomerase TenI. 	201
169065	PRK07696	PRK07696	thiamine biosynthesis protein ThiS. 	67
181087	PRK07701	flgL	flagellar hook-associated protein FlgL; Validated	298
181088	PRK07708	PRK07708	hypothetical protein; Validated	219
169068	PRK07709	PRK07709	fructose-bisphosphate aldolase; Provisional	285
236076	PRK07710	PRK07710	acetolactate synthase large subunit. 	571
236077	PRK07714	PRK07714	YlxQ family RNA-binding protein. 	100
181090	PRK07718	fliL	flagellar basal body-associated protein FliL; Reviewed	142
181091	PRK07720	fliJ	flagellar biosynthesis chaperone FliJ. 	146
181092	PRK07721	fliI	flagellar protein export ATPase FliI. 	438
236078	PRK07726	PRK07726	DNA topoisomerase 3. 	658
236079	PRK07729	PRK07729	glyceraldehyde-3-phosphate dehydrogenase; Validated	343
236080	PRK07734	motB	flagellar motor protein MotB; Reviewed	259
236081	PRK07735	PRK07735	NADH-quinone oxidoreductase subunit C. 	430
236082	PRK07737	fliD	flagellar hook-associated protein 2. 	501
236083	PRK07738	PRK07738	flagellar protein FlaG; Provisional	117
236084	PRK07739	flgK	flagellar hook-associated protein FlgK; Validated	507
236085	PRK07740	PRK07740	hypothetical protein; Provisional	244
236086	PRK07742	PRK07742	phosphate butyryltransferase; Validated	299
236087	PRK07748	PRK07748	3'-5' exonuclease KapD. 	207
169084	PRK07756	PRK07756	NADH-quinone oxidoreductase subunit A. 	122
236088	PRK07757	PRK07757	N-acetyltransferase. 	152
181104	PRK07758	PRK07758	hypothetical protein; Provisional	95
236089	PRK07761	PRK07761	DNA polymerase III subunit beta; Validated	376
236090	PRK07764	PRK07764	DNA polymerase III subunits gamma and tau; Validated	824
181107	PRK07765	PRK07765	aminodeoxychorismate/anthranilate synthase component II. 	214
236091	PRK07768	PRK07768	long-chain-fatty-acid--CoA ligase; Validated	545
181109	PRK07769	PRK07769	long-chain-fatty-acid--CoA ligase; Validated	631
236092	PRK07772	PRK07772	single-stranded DNA-binding protein; Provisional	186
236093	PRK07773	PRK07773	replicative DNA helicase; Validated	886
236094	PRK07774	PRK07774	SDR family oxidoreductase. 	250
181113	PRK07775	PRK07775	SDR family oxidoreductase. 	274
236095	PRK07777	PRK07777	putative succinyldiaminopimelate transaminase DapC. 	387
181115	PRK07785	PRK07785	NADH dehydrogenase subunit C; Provisional	235
169098	PRK07786	PRK07786	long-chain-fatty-acid--CoA ligase; Validated	542
236096	PRK07787	PRK07787	acyl-CoA synthetase; Validated	471
236097	PRK07788	PRK07788	acyl-CoA synthetase; Validated	549
236098	PRK07789	PRK07789	acetolactate synthase 1 catalytic subunit; Validated	612
236099	PRK07791	PRK07791	short chain dehydrogenase; Provisional	286
181120	PRK07792	fabG	3-ketoacyl-(acyl-carrier-protein) reductase; Provisional	306
236100	PRK07798	PRK07798	acyl-CoA synthetase; Validated	533
181122	PRK07799	PRK07799	crotonase/enoyl-CoA hydratase family protein. 	263
181123	PRK07801	PRK07801	acetyl-CoA C-acetyltransferase. 	382
236101	PRK07803	sdhA	succinate dehydrogenase flavoprotein subunit; Reviewed	626
236102	PRK07804	PRK07804	L-aspartate oxidase; Provisional	541
181126	PRK07806	PRK07806	SDR family oxidoreductase. 	248
181127	PRK07807	PRK07807	GuaB1 family IMP dehydrogenase-related protein. 	479
236103	PRK07810	PRK07810	O-succinylhomoserine sulfhydrylase; Provisional	403
236104	PRK07811	PRK07811	cystathionine gamma-synthase; Provisional	388
236105	PRK07812	PRK07812	O-acetylhomoserine aminocarboxypropyltransferase; Validated	436
181131	PRK07814	PRK07814	SDR family oxidoreductase. 	263
236106	PRK07818	PRK07818	dihydrolipoamide dehydrogenase; Reviewed	466
181133	PRK07819	PRK07819	3-hydroxybutyryl-CoA dehydrogenase; Validated	286
236107	PRK07823	PRK07823	S-methyl-5'-thioadenosine phosphorylase. 	264
236108	PRK07824	PRK07824	o-succinylbenzoate--CoA ligase. 	358
181136	PRK07825	PRK07825	short chain dehydrogenase; Provisional	273
236109	PRK07827	PRK07827	enoyl-CoA hydratase family protein. 	260
236110	PRK07831	PRK07831	SDR family oxidoreductase. 	262
181139	PRK07832	PRK07832	SDR family oxidoreductase. 	272
236111	PRK07843	PRK07843	3-oxosteroid 1-dehydrogenase. 	557
236112	PRK07845	PRK07845	flavoprotein disulfide reductase; Reviewed	466
181142	PRK07846	PRK07846	mycothione reductase; Reviewed	451
236113	PRK07847	PRK07847	amidophosphoribosyltransferase; Provisional	510
236114	PRK07849	PRK07849	aminodeoxychorismate lyase. 	292
181145	PRK07850	PRK07850	steroid 3-ketoacyl-CoA thiolase. 	387
181146	PRK07851	PRK07851	acetyl-CoA C-acetyltransferase. 	406
236115	PRK07854	PRK07854	enoyl-CoA hydratase; Provisional	243
181147	PRK07855	PRK07855	lipid-transfer protein; Provisional	386
236116	PRK07856	PRK07856	SDR family oxidoreductase. 	252
236117	PRK07857	PRK07857	chorismate mutase. 	106
236118	PRK07860	PRK07860	NADH dehydrogenase subunit G; Validated	797
236119	PRK07865	PRK07865	N-succinyldiaminopimelate aminotransferase; Reviewed	364
236120	PRK07867	PRK07867	acyl-CoA synthetase; Validated	529
236121	PRK07868	PRK07868	acyl-CoA synthetase; Validated	994
181154	PRK07869	PRK07869	amidase; Provisional	468
169138	PRK07874	PRK07874	ATP synthase F0 subunit C. 	80
236122	PRK07877	PRK07877	Rv1355c family protein. 	722
181156	PRK07878	PRK07878	molybdopterin biosynthesis-like protein MoeZ; Validated	392
236123	PRK07883	PRK07883	DEDD exonuclease domain-containing protein. 	557
236124	PRK07889	PRK07889	enoyl-[acyl-carrier-protein] reductase FabI. 	256
181159	PRK07890	PRK07890	short chain dehydrogenase; Provisional	258
236125	PRK07896	PRK07896	carboxylating nicotinate-nucleotide diphosphorylase. 	289
236126	PRK07899	rpsA	30S ribosomal protein S1; Reviewed	486
181162	PRK07904	PRK07904	decaprenylphospho-beta-D-erythro-pentofuranosid-2-ulose 2-reductase. 	253
181163	PRK07906	PRK07906	hypothetical protein; Provisional	426
236127	PRK07907	PRK07907	hypothetical protein; Provisional	449
236128	PRK07908	PRK07908	threonine-phosphate decarboxylase. 	349
236129	PRK07910	PRK07910	beta-ketoacyl-ACP synthase. 	418
169151	PRK07912	PRK07912	salicylate synthase. 	449
236130	PRK07914	PRK07914	hypothetical protein; Reviewed	320
236131	PRK07920	PRK07920	lipid A biosynthesis lauroyl acyltransferase; Provisional	298
181169	PRK07921	PRK07921	RNA polymerase sigma factor SigB; Reviewed	324
236132	PRK07922	PRK07922	amino-acid N-acetyltransferase. 	169
181171	PRK07928	PRK07928	NADH dehydrogenase subunit A; Validated	119
236133	PRK07933	PRK07933	dTMP kinase. 	213
181173	PRK07937	PRK07937	lipid-transfer protein; Provisional	352
181174	PRK07938	PRK07938	enoyl-CoA hydratase family protein. 	249
236134	PRK07940	PRK07940	DNA polymerase III subunit delta'; Validated	394
181176	PRK07942	PRK07942	DNA polymerase III subunit epsilon; Provisional	232
236135	PRK07945	PRK07945	PHP domain-containing protein. 	335
236136	PRK07946	PRK07946	putative monovalent cation/H+ antiporter subunit C; Reviewed	163
181179	PRK07948	PRK07948	putative monovalent cation/H+ antiporter subunit F; Reviewed	86
181180	PRK07952	PRK07952	DNA replication protein DnaC; Validated	244
236137	PRK07956	ligA	NAD-dependent DNA ligase LigA; Validated	665
181182	PRK07960	fliI	flagellum-specific ATP synthase FliI. 	455
181183	PRK07963	fliN	flagellar motor switch protein FliN; Validated	137
181184	PRK07967	PRK07967	beta-ketoacyl-ACP synthase I. 	406
181185	PRK07979	PRK07979	acetolactate synthase 3 large subunit. 	574
181186	PRK07983	PRK07983	exodeoxyribonuclease X; Provisional	219
181187	PRK07984	PRK07984	enoyl-ACP reductase FabI. 	262
181188	PRK07985	PRK07985	SDR family oxidoreductase. 	294
181189	PRK07986	PRK07986	adenosylmethionine--8-amino-7-oxononanoate transaminase; Validated	428
181190	PRK07993	PRK07993	DNA polymerase III subunit delta'; Validated	334
236138	PRK07994	PRK07994	DNA polymerase III subunits gamma and tau; Validated	647
181192	PRK07998	gatY	class II aldolase. 	283
169179	PRK08005	PRK08005	ribulose-phosphate 3 epimerase family protein. 	210
181193	PRK08006	PRK08006	replicative DNA helicase DnaB. 	471
181194	PRK08007	PRK08007	aminodeoxychorismate synthase component 2. 	187
181195	PRK08008	caiC	putative crotonobetaine/carnitine-CoA ligase; Validated	517
181196	PRK08010	PRK08010	pyridine nucleotide-disulfide oxidoreductase; Provisional	441
236139	PRK08013	PRK08013	oxidoreductase; Provisional	400
181198	PRK08017	PRK08017	SDR family oxidoreductase. 	256
181199	PRK08020	ubiF	2-octaprenyl-3-methyl-6-methoxy-1,4-benzoquinol hydroxylase; Reviewed	391
181200	PRK08025	PRK08025	kdo(2)-lipid IV(A) palmitoleoyltransferase. 	305
236140	PRK08026	PRK08026	FliC/FljB family flagellin. 	529
181202	PRK08027	flgL	flagellar hook-filament junction protein FlgL. 	317
236141	PRK08032	fliD	flagellar capping protein; Reviewed	462
181204	PRK08035	PRK08035	YscQ/HrcQ family type III secretion apparatus protein. 	323
181205	PRK08040	PRK08040	putative semialdehyde dehydrogenase; Provisional	336
181206	PRK08042	PRK08042	formate hydrogenlyase subunit 3; Reviewed	593
181207	PRK08043	PRK08043	bifunctional acyl-ACP--phospholipid O-acyltransferase/long-chain-fatty-acid--ACP ligase. 	718
169193	PRK08044	PRK08044	allantoinase AllB. 	449
169194	PRK08045	PRK08045	cystathionine gamma-synthase; Provisional	386
181208	PRK08049	PRK08049	F0F1 ATP synthase subunit I; Validated	124
236142	PRK08051	fre	FMN reductase; Validated	232
181210	PRK08053	PRK08053	sulfur carrier protein ThiS; Provisional	66
236143	PRK08055	PRK08055	chorismate mutase; Provisional	181
181212	PRK08056	PRK08056	threonine-phosphate decarboxylase; Provisional	356
236144	PRK08057	PRK08057	cobalt-precorrin-6x reductase; Reviewed	248
181214	PRK08058	PRK08058	DNA polymerase III subunit delta'; Validated	329
181215	PRK08059	PRK08059	general stress protein 13; Validated	123
181216	PRK08061	rpsN	type Z 30S ribosomal protein S14. 	61
236145	PRK08063	PRK08063	enoyl-[acyl-carrier-protein] reductase FabL. 	250
236146	PRK08064	PRK08064	cystathionine beta-lyase MetC. 	390
181219	PRK08068	PRK08068	transaminase; Reviewed	389
236147	PRK08071	PRK08071	L-aspartate oxidase; Provisional	510
181221	PRK08072	PRK08072	carboxylating nicotinate-nucleotide diphosphorylase. 	277
181222	PRK08073	flgL	flagellar hook-associated protein 3. 	287
236148	PRK08074	PRK08074	bifunctional ATP-dependent DNA helicase/DNA polymerase III subunit epsilon; Validated	928
181224	PRK08084	PRK08084	DnaA inactivator Hda. 	235
181225	PRK08085	PRK08085	gluconate 5-dehydrogenase; Provisional	254
181226	PRK08087	PRK08087	L-fuculose-phosphate aldolase. 	215
236149	PRK08088	PRK08088	4-aminobutyrate--2-oxoglutarate transaminase. 	425
169215	PRK08091	PRK08091	ribulose-phosphate 3-epimerase; Validated	228
236150	PRK08097	ligB	NAD-dependent DNA ligase LigB. 	562
236151	PRK08099	PRK08099	multifunctional transcriptional regulator/nicotinamide-nucleotide adenylyltransferase/ribosylnicotinamide kinase NadR. 	399
181230	PRK08105	PRK08105	flavodoxin; Provisional	149
181231	PRK08114	PRK08114	cystathionine beta-lyase; Provisional	395
236152	PRK08115	PRK08115	vitamin B12-dependent ribonucleotide reductase. 	858
236153	PRK08116	PRK08116	hypothetical protein; Validated	268
181234	PRK08117	PRK08117	aspartate aminotransferase family protein. 	433
181235	PRK08118	PRK08118	DNA topology modulation protein. 	167
236154	PRK08119	PRK08119	flagellar motor switch protein; Validated	382
236155	PRK08123	PRK08123	histidinol-phosphatase HisJ. 	270
181238	PRK08124	PRK08124	flagellar motor protein MotA; Validated	263
236156	PRK08125	PRK08125	bifunctional UDP-4-amino-4-deoxy-L-arabinose formyltransferase/UDP-glucuronic acid oxidase ArnA. 	660
236157	PRK08126	PRK08126	hypothetical protein; Provisional	432
181241	PRK08130	PRK08130	putative aldolase; Validated	213
181242	PRK08131	PRK08131	3-oxoadipyl-CoA thiolase. 	401
236158	PRK08132	PRK08132	FAD-dependent oxidoreductase; Provisional	547
181244	PRK08133	PRK08133	O-succinylhomoserine sulfhydrylase; Validated	390
236159	PRK08134	PRK08134	O-acetylhomoserine aminocarboxypropyltransferase; Validated	433
236160	PRK08136	PRK08136	glycosyl transferase family protein; Provisional	317
236161	PRK08137	PRK08137	amidase; Provisional	497
236162	PRK08138	PRK08138	enoyl-CoA hydratase; Provisional	261
181249	PRK08139	PRK08139	enoyl-CoA hydratase; Validated	266
236163	PRK08140	PRK08140	enoyl-CoA hydratase; Provisional	262
236164	PRK08142	PRK08142	thiolase domain-containing protein. 	388
236165	PRK08147	flgK	flagellar hook-associated protein FlgK; Validated	547
236166	PRK08149	PRK08149	FliI/YscN family ATPase. 	428
181254	PRK08150	PRK08150	crotonase/enoyl-CoA hydratase family protein. 	255
181255	PRK08153	PRK08153	pyridoxal phosphate-dependent aminotransferase. 	369
236167	PRK08154	PRK08154	anaerobic benzoate catabolism transcriptional regulator; Reviewed	309
181257	PRK08155	PRK08155	acetolactate synthase large subunit. 	564
236168	PRK08156	PRK08156	EscU/YscU/HrcU family type III secretion system export apparatus switch protein. 	361
181259	PRK08158	PRK08158	YscQ/HrcQ family type III secretion apparatus protein. 	303
181260	PRK08159	PRK08159	enoyl-[acyl-carrier-protein] reductase FabI. 	272
236169	PRK08162	PRK08162	acyl-CoA synthetase; Validated	545
181262	PRK08163	PRK08163	3-hydroxybenzoate 6-monooxygenase. 	396
236170	PRK08166	PRK08166	NADH-quinone oxidoreductase subunit NuoG. 	791
236171	PRK08168	PRK08168	NADH-quinone oxidoreductase subunit L. 	516
181265	PRK08170	PRK08170	acetyl-CoA C-acetyltransferase. 	426
181266	PRK08172	PRK08172	acyl carrier protein. 	82
236172	PRK08173	PRK08173	DNA topoisomerase III; Validated	862
181268	PRK08175	PRK08175	aminotransferase; Validated	395
181269	PRK08176	pdxK	pyridoxine/pyridoxal/pyridoxamine kinase. 	281
236173	PRK08177	PRK08177	SDR family oxidoreductase. 	225
236174	PRK08178	PRK08178	acetolactate synthase 1 small subunit. 	96
181271	PRK08179	prfH	peptide chain release factor-like protein; Reviewed	200
236175	PRK08180	PRK08180	feruloyl-CoA synthase; Reviewed	614
136670	PRK08181	PRK08181	transposase; Validated	269
169261	PRK08182	PRK08182	single-stranded DNA-binding protein; Provisional	148
236176	PRK08183	PRK08183	NADH:ubiquinone oxidoreductase subunit NDUFA12. 	133
181274	PRK08184	PRK08184	benzoyl-CoA-dihydrodiol lyase; Provisional	550
181275	PRK08185	PRK08185	hypothetical protein; Provisional	283
236177	PRK08186	PRK08186	allophanate hydrolase; Provisional	600
236178	PRK08187	PRK08187	pyruvate kinase; Validated	493
236179	PRK08188	PRK08188	ribonucleotide-diphosphate reductase subunit alpha; Validated	714
236180	PRK08190	PRK08190	bifunctional enoyl-CoA hydratase/phosphate acetyltransferase; Validated	466
169269	PRK08192	PRK08192	aspartate carbamoyltransferase; Provisional	338
236181	PRK08193	araD	L-ribulose-5-phosphate 4-epimerase AraD. 	231
181281	PRK08194	PRK08194	tartrate dehydrogenase; Provisional	352
181282	PRK08195	PRK08195	4-hydroxy-2-oxovalerate/4-hydroxy-2-oxopentanoic acid aldolase; Validated	337
181283	PRK08197	PRK08197	threonine synthase; Validated	394
236182	PRK08198	PRK08198	threonine dehydratase; Provisional	404
181285	PRK08199	PRK08199	thiamine pyrophosphate protein; Validated	557
169276	PRK08201	PRK08201	dipeptidase. 	456
236183	PRK08202	PRK08202	purine nucleoside phosphorylase; Provisional	272
236184	PRK08203	PRK08203	hydroxydechloroatrazine ethylaminohydrolase; Reviewed	451
181288	PRK08204	PRK08204	hypothetical protein; Provisional	449
236185	PRK08205	sdhA	succinate dehydrogenase flavoprotein subunit; Reviewed	583
236186	PRK08206	PRK08206	diaminopropionate ammonia-lyase; Provisional	399
236187	PRK08207	PRK08207	coproporphyrinogen III oxidase; Provisional	488
181292	PRK08208	PRK08208	coproporphyrinogen III oxidase family protein. 	430
236188	PRK08210	PRK08210	aspartate kinase I; Reviewed	403
236189	PRK08211	PRK08211	YjhG/YagF family D-xylonate dehydratase. 	655
181295	PRK08213	PRK08213	gluconate 5-dehydrogenase; Provisional	259
181296	PRK08215	PRK08215	RNA polymerase sporulation sigma factor SigG. 	258
181297	PRK08217	fabG	3-ketoacyl-(acyl-carrier-protein) reductase; Provisional	253
181298	PRK08219	PRK08219	SDR family oxidoreductase. 	227
236190	PRK08220	PRK08220	2,3-dihydroxybenzoate-2,3-dehydrogenase; Validated	252
181300	PRK08221	PRK08221	anaerobic sulfite reductase subunit AsrB. 	263
181301	PRK08222	PRK08222	hydrogenase 4 subunit H; Validated	181
181302	PRK08223	PRK08223	hypothetical protein; Validated	287
236191	PRK08224	ligC	ATP-dependent DNA ligase; Reviewed	350
181304	PRK08225	PRK08225	acetyl-CoA carboxylase biotin carboxyl carrier protein subunit; Validated	70
181305	PRK08226	PRK08226	SDR family oxidoreductase UcpA. 	263
181306	PRK08227	PRK08227	3-hydroxy-5-phosphonooxypentane-2,4-dione thiolase. 	264
236192	PRK08228	PRK08228	L(+)-tartrate dehydratase subunit beta; Validated	204
236193	PRK08229	PRK08229	2-dehydropantoate 2-reductase; Provisional	341
181309	PRK08230	PRK08230	tartrate dehydratase subunit alpha; Validated	299
181310	PRK08233	PRK08233	hypothetical protein; Provisional	182
181311	PRK08235	PRK08235	acetyl-CoA C-acetyltransferase. 	393
236194	PRK08236	PRK08236	hypothetical protein; Provisional	212
236195	PRK08238	PRK08238	UbiA family prenyltransferase. 	479
236196	PRK08241	PRK08241	RNA polymerase subunit sigma-70. 	339
236197	PRK08242	PRK08242	acetyl-CoA C-acetyltransferase. 	402
236198	PRK08243	PRK08243	4-hydroxybenzoate 3-monooxygenase; Validated	392
236199	PRK08244	PRK08244	monooxygenase. 	493
236200	PRK08245	PRK08245	hypothetical protein; Validated	240
181319	PRK08246	PRK08246	serine/threonine dehydratase. 	310
181320	PRK08247	PRK08247	methionine biosynthesis PLP-dependent protein. 	366
236201	PRK08248	PRK08248	homocysteine synthase. 	431
236202	PRK08249	PRK08249	cystathionine gamma-synthase family protein. 	398
181323	PRK08250	PRK08250	glutamine amidotransferase; Provisional	235
181324	PRK08251	PRK08251	SDR family oxidoreductase. 	248
181325	PRK08252	PRK08252	crotonase/enoyl-CoA hydratase family protein. 	254
236203	PRK08255	PRK08255	bifunctional salicylyl-CoA 5-hydroxylase/oxidoreductase. 	765
181327	PRK08256	PRK08256	lipid-transfer protein; Provisional	391
236204	PRK08257	PRK08257	acetyl-CoA acetyltransferase; Validated	498
181329	PRK08258	PRK08258	enoyl-CoA hydratase family protein. 	277
236205	PRK08259	PRK08259	crotonase/enoyl-CoA hydratase family protein. 	254
236206	PRK08260	PRK08260	enoyl-CoA hydratase; Provisional	296
236207	PRK08261	fabG	3-ketoacyl-(acyl-carrier-protein) reductase; Provisional	450
236208	PRK08262	PRK08262	M20 family peptidase. 	486
181334	PRK08263	PRK08263	short chain dehydrogenase; Provisional	275
181335	PRK08264	PRK08264	SDR family oxidoreductase. 	238
236209	PRK08265	PRK08265	short chain dehydrogenase; Provisional	261
181337	PRK08266	PRK08266	hypothetical protein; Provisional	542
236210	PRK08267	PRK08267	SDR family oxidoreductase. 	260
236211	PRK08268	PRK08268	3-hydroxy-acyl-CoA dehydrogenase; Validated	507
181340	PRK08269	PRK08269	3-hydroxybutyryl-CoA dehydrogenase; Validated	314
236212	PRK08270	PRK08270	ribonucleoside triphosphate reductase. 	656
181342	PRK08271	PRK08271	anaerobic ribonucleoside triphosphate reductase; Provisional	623
236213	PRK08272	PRK08272	crotonase/enoyl-CoA hydratase family protein. 	302
181344	PRK08273	PRK08273	thiamine pyrophosphate protein; Provisional	597
236214	PRK08274	PRK08274	FAD-dependent tricarballylate dehydrogenase TcuA. 	466
181346	PRK08275	PRK08275	putative oxidoreductase; Provisional	554
236215	PRK08276	PRK08276	long-chain-fatty-acid--CoA ligase; Validated	502
236216	PRK08277	PRK08277	D-mannonate oxidoreductase; Provisional	278
181349	PRK08278	PRK08278	SDR family oxidoreductase. 	273
236217	PRK08279	PRK08279	long-chain-acyl-CoA synthetase; Validated	600
236218	PRK08284	PRK08284	precorrin 6A synthase; Provisional	253
181352	PRK08285	cobH	precorrin-8X methylmutase; Reviewed	208
181353	PRK08286	cbiC	cobalt-precorrin-8 methylmutase. 	214
181354	PRK08287	PRK08287	decarboxylating cobalt-precorrin-6B (C(15))-methyltransferase. 	187
236219	PRK08289	PRK08289	glyceraldehyde-3-phosphate dehydrogenase; Reviewed	477
236220	PRK08290	PRK08290	enoyl-CoA hydratase; Provisional	288
236221	PRK08291	PRK08291	cyclodeaminase. 	330
236222	PRK08292	PRK08292	AMP nucleosidase; Provisional	489
181359	PRK08293	PRK08293	3-hydroxyacyl-CoA dehydrogenase. 	287
236223	PRK08294	PRK08294	phenol 2-monooxygenase; Provisional	634
181361	PRK08295	PRK08295	RNA polymerase sporulation sigma factor SigH. 	208
181362	PRK08296	PRK08296	hypothetical protein; Provisional	603
236224	PRK08297	PRK08297	L-lysine aminotransferase; Provisional	443
236225	PRK08298	PRK08298	cytidine deaminase; Validated	136
236226	PRK08299	PRK08299	NADP-dependent isocitrate dehydrogenase. 	402
236227	PRK08300	PRK08300	acetaldehyde dehydrogenase; Validated	302
236228	PRK08301	PRK08301	RNA polymerase sporulation sigma factor SigE. 	234
236229	PRK08303	PRK08303	short chain dehydrogenase; Provisional	305
236230	PRK08304	PRK08304	stage V sporulation protein AD; Validated	337
181370	PRK08305	spoVFB	dipicolinate synthase subunit B; Reviewed	196
181371	PRK08306	PRK08306	dipicolinate synthase subunit DpsA. 	296
181372	PRK08307	PRK08307	stage III sporulation protein SpoAB; Provisional	171
236231	PRK08308	PRK08308	acyl-CoA synthetase; Validated	414
236232	PRK08309	PRK08309	short chain dehydrogenase; Provisional	177
181375	PRK08310	PRK08310	amidase; Provisional	395
236233	PRK08311	PRK08311	RNA polymerase sigma factor SigI. 	237
236234	PRK08312	PRK08312	indolepyruvate oxidoreductase subunit beta family protein. 	510
181378	PRK08313	PRK08313	thiolase domain-containing protein. 	386
236235	PRK08314	PRK08314	long-chain-fatty-acid--CoA ligase; Validated	546
236236	PRK08315	PRK08315	AMP-binding domain protein; Validated	559
181381	PRK08316	PRK08316	acyl-CoA synthetase; Validated	523
181382	PRK08317	PRK08317	hypothetical protein; Provisional	241
236237	PRK08318	PRK08318	NAD-dependent dihydropyrimidine dehydrogenase subunit PreA. 	420
181384	PRK08319	PRK08319	energy-coupling factor ABC transporter permease. 	224
236238	PRK08320	PRK08320	branched-chain amino acid aminotransferase; Reviewed	288
181386	PRK08321	PRK08321	1,4-dihydroxy-2-naphthoyl-CoA synthase. 	302
236239	PRK08322	PRK08322	acetolactate synthase large subunit. 	547
236240	PRK08323	PRK08323	phenylhydantoinase; Validated	459
236241	PRK08324	PRK08324	bifunctional aldolase/short-chain dehydrogenase. 	681
236242	PRK08326	PRK08326	R2-like ligand-binding oxidase. 	311
236243	PRK08327	PRK08327	thiamine pyrophosphate-requiring protein. 	569
169382	PRK08328	PRK08328	hypothetical protein; Provisional	231
236244	PRK08329	PRK08329	threonine synthase; Validated	347
169384	PRK08330	PRK08330	biotin--protein ligase; Provisional	236
181392	PRK08332	PRK08332	vitamin B12-dependent ribonucleotide reductase. 	1740
181393	PRK08333	PRK08333	aldolase. 	184
169386	PRK08334	PRK08334	S-methyl-5-thioribose-1-phosphate isomerase. 	356
169387	PRK08335	PRK08335	translation initiation factor IF-2B subunit alpha; Validated	275
181394	PRK08338	PRK08338	2-oxoacid:ferredoxin oxidoreductase subunit gamma. 	170
169389	PRK08339	PRK08339	short chain dehydrogenase; Provisional	263
169390	PRK08340	PRK08340	SDR family oxidoreductase. 	259
181395	PRK08341	PRK08341	amidophosphoribosyltransferase; Provisional	442
236245	PRK08343	secD	preprotein translocase subunit SecD; Reviewed	417
236246	PRK08344	PRK08344	V-type ATP synthase subunit K; Validated	157
236247	PRK08345	PRK08345	cytochrome-c3 hydrogenase subunit gamma; Provisional	289
181399	PRK08348	PRK08348	NADH-plastoquinone oxidoreductase subunit; Provisional	120
169396	PRK08349	PRK08349	hypothetical protein; Validated	198
169397	PRK08350	PRK08350	hypothetical protein; Provisional	341
181400	PRK08351	PRK08351	DNA-directed RNA polymerase subunit E''; Validated	61
169399	PRK08354	PRK08354	putative aminotransferase; Provisional	311
169400	PRK08356	PRK08356	hypothetical protein; Provisional	195
169401	PRK08359	PRK08359	transcription factor; Validated	176
181401	PRK08360	PRK08360	aspartate aminotransferase family protein. 	443
236248	PRK08361	PRK08361	aspartate aminotransferase; Provisional	391
181402	PRK08363	PRK08363	alanine aminotransferase; Validated	398
236249	PRK08364	PRK08364	sulfur carrier protein ThiS; Provisional	70
169406	PRK08366	vorA	2-ketoisovalerate ferredoxin oxidoreductase subunit alpha; Reviewed	390
181403	PRK08367	porA	pyruvate ferredoxin oxidoreductase subunit alpha; Reviewed	394
236250	PRK08373	PRK08373	aspartate kinase; Validated	341
169409	PRK08374	PRK08374	homoserine dehydrogenase; Provisional	336
236251	PRK08375	PRK08375	putative monovalent cation/H+ antiporter subunit D; Reviewed	487
236252	PRK08376	PRK08376	putative monovalent cation/H+ antiporter subunit D; Reviewed	521
181406	PRK08377	PRK08377	NADH dehydrogenase subunit N; Validated	494
236253	PRK08378	PRK08378	hypothetical protein; Provisional	93
169414	PRK08381	PRK08381	putative monovalent cation/H+ antiporter subunit F; Reviewed	87
169415	PRK08382	PRK08382	putative monovalent cation/H+ antiporter subunit E; Reviewed	201
181407	PRK08383	PRK08383	putative monovalent cation/H+ antiporter subunit E; Reviewed	168
236254	PRK08384	PRK08384	thiamine biosynthesis protein ThiI; Provisional	381
236255	PRK08385	PRK08385	carboxylating nicotinate-nucleotide diphosphorylase. 	278
236256	PRK08386	PRK08386	putative monovalent cation/H+ antiporter subunit B; Reviewed	151
169420	PRK08387	PRK08387	putative monovalent cation/H+ antiporter subunit B; Reviewed	131
236257	PRK08388	PRK08388	putative monovalent cation/H+ antiporter subunit C; Reviewed	119
236258	PRK08389	PRK08389	putative monovalent cation/H+ antiporter subunit C; Reviewed	114
169423	PRK08392	PRK08392	hypothetical protein; Provisional	215
181411	PRK08393	PRK08393	N-ethylammeline chlorohydrolase; Provisional	424
169425	PRK08395	PRK08395	fumarate hydratase; Provisional	162
236259	PRK08401	PRK08401	L-aspartate oxidase; Provisional	466
169427	PRK08402	PRK08402	replication factor A; Reviewed	355
169428	PRK08404	PRK08404	V-type ATP synthase subunit H; Validated	103
181413	PRK08406	PRK08406	transcription elongation factor NusA-like protein; Validated	140
181414	PRK08410	PRK08410	D-2-hydroxyacid dehydrogenase. 	311
236260	PRK08411	PRK08411	flagellin; Reviewed	572
236261	PRK08412	flgL	flagellar hook-associated protein FlgL; Validated	827
181416	PRK08415	PRK08415	enoyl-[acyl-carrier-protein] reductase FabI. 	274
181417	PRK08416	PRK08416	enoyl-ACP reductase. 	260
236262	PRK08417	PRK08417	metal-dependent hydrolase. 	386
181419	PRK08418	PRK08418	metal-dependent hydrolase. 	408
181420	PRK08419	PRK08419	lipid A biosynthesis lauroyl acyltransferase; Reviewed	298
236263	PRK08425	flgE	flagellar hook protein FlgE; Validated	731
236264	PRK08432	PRK08432	flagellar motor switch protein FliY; Validated	283
181423	PRK08433	PRK08433	flagellar motor switch protein FliN. 	111
236265	PRK08439	PRK08439	3-oxoacyl-(acyl carrier protein) synthase II; Reviewed	406
181425	PRK08441	oorC	2-oxoglutarate-acceptor oxidoreductase subunit OorC; Reviewed	183
181426	PRK08444	PRK08444	aminofutalosine synthase MqnE. 	353
181427	PRK08445	PRK08445	dehypoxanthine futalosine cyclase. 	348
181428	PRK08446	PRK08446	coproporphyrinogen III oxidase family protein. 	350
236266	PRK08447	PRK08447	ribonucleoside-diphosphate reductase subunit alpha. 	789
236267	PRK08451	PRK08451	DNA polymerase III subunits gamma and tau; Validated	535
236268	PRK08452	PRK08452	flagellar protein FlaG; Provisional	124
181432	PRK08453	fliD	flagellar filament capping protein FliD. 	673
181433	PRK08455	fliL	flagellar basal body-associated protein FliL; Reviewed	182
181434	PRK08456	PRK08456	flagellar motor protein MotA; Validated	257
181435	PRK08457	motB	flagellar motor protein MotB; Reviewed	257
236269	PRK08462	PRK08462	acetyl-CoA carboxylase biotin carboxylase subunit. 	445
169452	PRK08463	PRK08463	acetyl-CoA carboxylase subunit A; Validated	478
181437	PRK08470	PRK08470	adenylosuccinate lyase; Provisional	442
236270	PRK08471	flgK	flagellar hook-associated protein FlgK; Validated	613
181439	PRK08472	fliI	flagellar protein export ATPase FliI. 	434
236271	PRK08474	PRK08474	F0F1 ATP synthase subunit delta; Validated	176
236272	PRK08475	PRK08475	F0F1 ATP synthase subunit B; Validated	167
181442	PRK08476	PRK08476	F0F1 ATP synthase subunit B'; Validated	141
236273	PRK08477	PRK08477	biotin--[acetyl-CoA-carboxylase] ligase. 	211
181444	PRK08482	PRK08482	F0F1 ATP synthase subunit C; Validated	105
236274	PRK08485	PRK08485	DNA polymerase III subunit delta'; Validated	206
236275	PRK08486	PRK08486	single-stranded DNA-binding protein; Provisional	182
236276	PRK08487	PRK08487	DNA polymerase III subunit delta; Validated	328
181448	PRK08489	PRK08489	NAD(P)H-quinone oxidoreductase subunit 3. 	129
181449	PRK08491	PRK08491	NADH-quinone oxidoreductase subunit C. 	263
236277	PRK08493	PRK08493	NADH-quinone oxidoreductase subunit G. 	819
236278	PRK08506	PRK08506	replicative DNA helicase; Provisional	472
181452	PRK08507	PRK08507	prephenate dehydrogenase; Validated	275
236279	PRK08508	PRK08508	biotin synthase; Provisional	279
236280	PRK08515	flgA	flagellar basal body P-ring formation protein FlgA. 	222
236281	PRK08517	PRK08517	3'-5' exonuclease. 	257
181456	PRK08525	PRK08525	amidophosphoribosyltransferase; Provisional	445
181457	PRK08526	PRK08526	threonine dehydratase; Provisional	403
181458	PRK08527	PRK08527	acetolactate synthase large subunit. 	563
181459	PRK08533	PRK08533	flagellar accessory protein FlaH; Reviewed	230
181460	PRK08534	PRK08534	pyruvate ferredoxin oxidoreductase subunit gamma; Reviewed	181
236282	PRK08535	PRK08535	ribose 1,5-bisphosphate isomerase. 	310
181462	PRK08537	PRK08537	2-oxoacid:ferredoxin oxidoreductase subunit gamma. 	177
236283	PRK08540	PRK08540	adenylosuccinate lyase; Reviewed	449
236284	PRK08541	PRK08541	flagellin; Validated	211
236285	PRK08554	PRK08554	peptidase; Reviewed	438
181465	PRK08557	PRK08557	hypothetical protein; Provisional	417
181466	PRK08558	PRK08558	adenine phosphoribosyltransferase; Provisional	238
181467	PRK08559	nusG	transcription antitermination protein NusG; Validated	153
236286	PRK08560	PRK08560	tyrosyl-tRNA synthetase; Validated	329
236287	PRK08561	rps15p	30S ribosomal protein S15P; Reviewed	151
236288	PRK08562	rpl32e	50S ribosomal protein L32e; Validated	125
236289	PRK08563	PRK08563	DNA-directed RNA polymerase subunit E'; Provisional	187
236290	PRK08564	PRK08564	S-methyl-5'-thioadenosine phosphorylase. 	267
236291	PRK08565	PRK08565	DNA-directed RNA polymerase subunit B; Provisional	1103
236292	PRK08566	PRK08566	DNA-directed RNA polymerase subunit A'; Validated	882
236293	PRK08568	PRK08568	preprotein translocase subunit SecY; Reviewed	462
236294	PRK08569	rpl18p	50S ribosomal protein L18P; Reviewed	193
236295	PRK08570	rpl19e	50S ribosomal protein L19e; Reviewed	150
181478	PRK08571	rpl14p	50S ribosomal protein L14P; Reviewed	132
236296	PRK08572	rps17p	30S ribosomal protein S17P; Reviewed	108
236297	PRK08573	PRK08573	bifunctional hydroxymethylpyrimidine kinase/phosphomethylpyrimidine kinase. 	448
236298	PRK08574	PRK08574	cystathionine gamma-synthase family protein. 	385
236299	PRK08575	PRK08575	5-methyltetrahydropteroyltriglutamate--homocysteine methyltransferase; Provisional	326
236300	PRK08576	PRK08576	hypothetical protein; Provisional	438
236301	PRK08577	PRK08577	hypothetical protein; Provisional	136
236302	PRK08578	PRK08578	preprotein translocase subunit SecF; Reviewed	292
236303	PRK08579	PRK08579	anaerobic ribonucleoside triphosphate reductase; Provisional	625
236304	PRK08581	PRK08581	amidase domain-containing protein. 	619
236305	PRK08582	PRK08582	RNA-binding protein S1. 	139
236306	PRK08583	PRK08583	RNA polymerase sigma factor SigB; Validated	257
181490	PRK08588	PRK08588	succinyl-diaminopimelate desuccinylase; Reviewed	377
181491	PRK08589	PRK08589	SDR family oxidoreductase. 	272
236307	PRK08591	PRK08591	acetyl-CoA carboxylase biotin carboxylase subunit; Validated	451
181493	PRK08593	PRK08593	aspartate aminotransferase family protein. 	445
236308	PRK08594	PRK08594	enoyl-[acyl-carrier-protein] reductase FabI. 	257
181495	PRK08596	PRK08596	acetylornithine deacetylase; Validated	421
236309	PRK08599	PRK08599	oxygen-independent coproporphyrinogen III oxidase. 	377
181497	PRK08600	PRK08600	putative monovalent cation/H+ antiporter subunit C; Reviewed	113
236310	PRK08601	PRK08601	NADH dehydrogenase subunit 5; Validated	509
181499	PRK08605	PRK08605	D-lactate dehydrogenase; Validated	332
236311	PRK08609	PRK08609	DNA polymerase/3'-5' exonuclease PolX. 	570
181501	PRK08610	PRK08610	fructose-bisphosphate aldolase; Reviewed	286
181502	PRK08611	PRK08611	pyruvate oxidase; Provisional	576
236312	PRK08617	PRK08617	acetolactate synthase AlsS. 	552
236313	PRK08618	PRK08618	ornithine cyclodeaminase family protein. 	325
181505	PRK08621	PRK08621	galactose-6-phosphate isomerase subunit LacA; Reviewed	142
181506	PRK08622	PRK08622	galactose-6-phosphate isomerase subunit LacB; Reviewed	171
236314	PRK08624	PRK08624	hypothetical protein; Provisional	373
181507	PRK08626	PRK08626	fumarate reductase flavoprotein subunit; Provisional	657
181508	PRK08628	PRK08628	SDR family oxidoreductase. 	258
181509	PRK08629	PRK08629	coproporphyrinogen III oxidase family protein. 	433
236315	PRK08633	PRK08633	2-acyl-glycerophospho-ethanolamine acyltransferase; Validated	1146
236316	PRK08636	PRK08636	LL-diaminopimelate aminotransferase. 	403
181512	PRK08637	PRK08637	hypothetical protein; Provisional	388
236317	PRK08638	PRK08638	bifunctional threonine ammonia-lyase/L-serine ammonia-lyase TdcB. 	333
236318	PRK08639	PRK08639	threonine dehydratase; Validated	420
181515	PRK08640	sdhB	succinate dehydrogenase iron-sulfur subunit; Reviewed	249
236319	PRK08641	sdhA	succinate dehydrogenase flavoprotein subunit; Reviewed	589
181517	PRK08642	fabG	3-ketoacyl-(acyl-carrier-protein) reductase; Provisional	253
181518	PRK08643	PRK08643	(S)-acetoin forming diacetyl reductase. 	256
236320	PRK08644	PRK08644	sulfur carrier protein ThiS adenylyltransferase ThiF. 	212
236321	PRK08645	PRK08645	bifunctional homocysteine S-methyltransferase/5,10-methylenetetrahydrofolate reductase protein; Reviewed	612
236322	PRK08649	PRK08649	GuaB3 family IMP dehydrogenase-related protein. 	368
236323	PRK08651	PRK08651	succinyl-diaminopimelate desuccinylase; Reviewed	394
236324	PRK08652	PRK08652	acetylornithine deacetylase; Provisional	347
236325	PRK08654	PRK08654	acetyl-CoA carboxylase biotin carboxylase subunit. 	499
236326	PRK08655	PRK08655	prephenate dehydrogenase; Provisional	437
181526	PRK08659	PRK08659	2-oxoacid:acceptor oxidoreductase subunit alpha. 	376
181527	PRK08660	PRK08660	aldolase. 	181
236327	PRK08661	PRK08661	prolyl-tRNA synthetase; Provisional	477
236328	PRK08662	PRK08662	nicotinate phosphoribosyltransferase; Reviewed	343
236329	PRK08664	PRK08664	aspartate-semialdehyde dehydrogenase; Reviewed	349
236330	PRK08665	PRK08665	vitamin B12-dependent ribonucleotide reductase. 	752
169548	PRK08666	PRK08666	5'-methylthioadenosine phosphorylase; Validated	261
236331	PRK08667	PRK08667	hydrogenase membrane subunit; Validated	644
236332	PRK08668	PRK08668	NADH dehydrogenase subunit M; Validated	610
181534	PRK08671	PRK08671	methionine aminopeptidase; Provisional	291
181535	PRK08673	PRK08673	3-deoxy-7-phosphoheptulonate synthase; Reviewed	335
181536	PRK08674	PRK08674	bifunctional phosphoglucose/phosphomannose isomerase; Validated	337
181537	PRK08676	PRK08676	hydrogenase membrane subunit; Validated	485
169553	PRK08690	PRK08690	enoyl-[acyl-carrier-protein] reductase FabI. 	261
236333	PRK08691	PRK08691	DNA polymerase III subunits gamma and tau; Validated	709
181538	PRK08699	PRK08699	DNA polymerase III subunit delta'; Validated	325
169556	PRK08703	PRK08703	SDR family oxidoreductase. 	239
169557	PRK08706	PRK08706	lipid A biosynthesis lauroyl acyltransferase; Provisional	289
236334	PRK08719	PRK08719	ribonuclease H; Reviewed	147
181539	PRK08722	PRK08722	beta-ketoacyl-ACP synthase II. 	414
236335	PRK08724	fliD	flagellar filament capping protein FliD. 	673
181541	PRK08727	PRK08727	DnaA regulatory inactivator Hda. 	233
181542	PRK08733	PRK08733	LpxL/LpxP family Kdo(2)-lipid IV(A) lauroyl/palmitoleoyl acyltransferase. 	306
181543	PRK08734	PRK08734	lauroyl acyltransferase. 	305
181544	PRK08737	PRK08737	acetylornithine deacetylase; Provisional	364
236336	PRK08742	PRK08742	adenosylmethionine--8-amino-7-oxononanoate transaminase; Provisional	472
136958	PRK08745	PRK08745	ribulose-phosphate 3-epimerase; Provisional	223
181546	PRK08751	PRK08751	long-chain fatty acid--CoA ligase. 	560
181547	PRK08760	PRK08760	replicative DNA helicase; Provisional	476
236337	PRK08762	PRK08762	molybdopterin-synthase adenylyltransferase MoeB. 	376
181549	PRK08763	PRK08763	single-stranded DNA-binding protein; Provisional	164
181550	PRK08764	PRK08764	Rnf electron transport complex subunit RnfB. 	135
181551	PRK08769	PRK08769	DNA polymerase III subunit delta'; Validated	319
181552	PRK08773	PRK08773	UbiH/UbiF family hydroxylase. 	392
181553	PRK08775	PRK08775	homoserine O-succinyltransferase. 	343
181554	PRK08776	PRK08776	O-succinylhomoserine (thiol)-lyase. 	405
181555	PRK08780	PRK08780	DNA topoisomerase I; Provisional	780
136970	PRK08787	PRK08787	peptide chain release factor 2; Provisional	313
236338	PRK08788	PRK08788	enoyl-CoA hydratase; Validated	287
181557	PRK08808	PRK08808	type II secretion system protein J. 	211
181558	PRK08811	PRK08811	uroporphyrinogen-III synthase; Validated	266
236339	PRK08813	PRK08813	threonine dehydratase; Provisional	349
236340	PRK08815	PRK08815	GTP cyclohydrolase II RibA. 	375
181561	PRK08818	PRK08818	prephenate dehydrogenase; Provisional	370
181562	PRK08840	PRK08840	replicative DNA helicase; Provisional	464
181563	PRK08841	PRK08841	aspartate kinase; Validated	392
181564	PRK08849	PRK08849	2-octaprenyl-3-methyl-6-methoxy-1,4-benzoquinol hydroxylase; Provisional	384
236341	PRK08850	PRK08850	2-octaprenyl-6-methoxyphenol hydroxylase; Validated	405
181566	PRK08857	PRK08857	aminodeoxychorismate/anthranilate synthase component II. 	193
181567	PRK08861	PRK08861	O-succinylhomoserine (thiol)-lyase. 	388
236342	PRK08862	PRK08862	SDR family oxidoreductase. 	227
236343	PRK08868	PRK08868	flagellar protein FlaG; Provisional	144
236344	PRK08869	PRK08869	polar flagellin E. 	376
236345	PRK08870	flgL	flagellar hook-associated protein FlgL; Reviewed	404
181572	PRK08871	flgK	flagellar hook-associated protein FlgK; Validated	626
181573	PRK08878	PRK08878	cobalamin biosynthesis family protein. 	317
181574	PRK08881	rpsN	30S ribosomal protein S14; Reviewed	101
181575	PRK08883	PRK08883	ribulose-phosphate 3-epimerase; Provisional	220
181576	PRK08887	PRK08887	nicotinate-nicotinamide nucleotide adenylyltransferase. 	174
236346	PRK08898	PRK08898	oxygen-independent coproporphyrinogen III oxidase-like protein. 	394
236347	PRK08903	PRK08903	DnaA regulatory inactivator Hda; Validated	227
236348	PRK08905	PRK08905	lysophospholipid acyltransferase family protein. 	289
181580	PRK08912	PRK08912	aminotransferase. 	387
236349	PRK08913	flgL	flagellin. 	301
236350	PRK08916	PRK08916	flagellar motor switch protein FliN. 	116
236351	PRK08927	fliI	flagellar protein export ATPase FliI. 	442
181584	PRK08931	PRK08931	S-methyl-5'-thioadenosine phosphorylase. 	289
181585	PRK08936	PRK08936	glucose-1-dehydrogenase; Provisional	261
236352	PRK08937	PRK08937	adenylosuccinate lyase; Provisional	216
236353	PRK08939	PRK08939	primosomal protein DnaI; Reviewed	306
236354	PRK08942	PRK08942	D-glycero-beta-D-manno-heptose 1,7-bisphosphate 7-phosphatase. 	181
236355	PRK08943	PRK08943	lipid A biosynthesis (KDO)2-(lauroyl)-lipid IVA acyltransferase; Validated	314
236356	PRK08944	motB	flagellar motor protein MotB; Reviewed	302
236357	PRK08945	PRK08945	putative oxoacyl-(acyl carrier protein) reductase; Provisional	247
181592	PRK08947	fadA	3-ketoacyl-CoA thiolase; Reviewed	387
181593	PRK08951	PRK08951	malate synthase; Provisional	190
169599	PRK08955	PRK08955	glyceraldehyde-3-phosphate dehydrogenase; Validated	334
181594	PRK08958	sdhA	succinate dehydrogenase flavoprotein subunit; Reviewed	588
181595	PRK08960	PRK08960	pyridoxal phosphate-dependent aminotransferase. 	387
236358	PRK08961	PRK08961	bifunctional aspartate kinase/diaminopimelate decarboxylase. 	861
181597	PRK08963	fadI	3-ketoacyl-CoA thiolase; Reviewed	428
181598	PRK08965	PRK08965	putative monovalent cation/H+ antiporter subunit E; Reviewed	162
181599	PRK08972	fliI	flagellar protein export ATPase FliI. 	444
236359	PRK08974	PRK08974	long-chain-fatty-acid--CoA ligase FadD. 	560
181601	PRK08978	PRK08978	acetolactate synthase 2 catalytic subunit; Reviewed	548
181602	PRK08979	PRK08979	acetolactate synthase 3 large subunit. 	572
236360	PRK08983	fliN	flagellar motor switch protein FliN. 	127
181604	PRK08990	PRK08990	flagellar motor protein PomA; Reviewed	254
181605	PRK08993	PRK08993	2-dehydro-3-deoxy-D-gluconate 5-dehydrogenase KduD. 	253
181606	PRK08997	PRK08997	isocitrate dehydrogenase; Provisional	334
236361	PRK08999	PRK08999	Nudix family hydrolase. 	312
181608	PRK09004	PRK09004	FMN-binding protein MioC; Provisional	146
181609	PRK09009	PRK09009	SDR family oxidoreductase. 	235
236362	PRK09010	PRK09010	single-stranded DNA-binding protein; Provisional	177
181611	PRK09014	rfaH	transcription/translation regulatory transformer protein RfaH. 	162
181612	PRK09016	PRK09016	carboxylating nicotinate-nucleotide diphosphorylase. 	296
181613	PRK09019	PRK09019	stress response translation initiation inhibitor YciH. 	108
181614	PRK09027	PRK09027	cytidine deaminase; Provisional	295
181615	PRK09028	PRK09028	cystathionine beta-lyase; Provisional	394
236363	PRK09029	PRK09029	O-succinylbenzoic acid--CoA ligase; Provisional	458
236364	PRK09034	PRK09034	aspartate kinase; Reviewed	454
181618	PRK09038	PRK09038	flagellar motor protein MotD; Reviewed	281
181619	PRK09039	PRK09039	peptidoglycan-binding protein. 	343
181620	PRK09040	PRK09040	hypothetical protein; Provisional	214
236365	PRK09041	motB	motility protein MotB. 	317
236366	PRK09045	PRK09045	TRZ/ATZ family hydrolase. 	443
236367	PRK09047	PRK09047	RNA polymerase factor sigma-70; Validated	161
181624	PRK09050	PRK09050	beta-ketoadipyl CoA thiolase; Validated	401
181625	PRK09051	PRK09051	beta-ketothiolase BktB. 	394
181626	PRK09052	PRK09052	acetyl-CoA C-acyltransferase. 	399
181627	PRK09053	PRK09053	3-carboxy-cis,cis-muconate cycloisomerase; Provisional	452
181628	PRK09054	PRK09054	phosphogluconate dehydratase; Validated	603
181629	PRK09057	PRK09057	coproporphyrinogen III oxidase; Provisional	380
236368	PRK09058	PRK09058	heme anaerobic degradation radical SAM methyltransferase ChuW/HutW. 	449
181631	PRK09059	PRK09059	dihydroorotase; Validated	429
181632	PRK09060	PRK09060	dihydroorotase; Validated	444
236369	PRK09061	PRK09061	D-glutamate deacylase; Validated	509
236370	PRK09064	PRK09064	5-aminolevulinate synthase; Validated	407
181635	PRK09065	PRK09065	glutamine amidotransferase; Provisional	237
236371	PRK09070	PRK09070	aminodeoxychorismate synthase component I. 	447
181637	PRK09071	PRK09071	glycosyl transferase family protein. 	323
236372	PRK09072	PRK09072	SDR family oxidoreductase. 	263
236373	PRK09076	PRK09076	enoyl-CoA hydratase; Provisional	258
236374	PRK09077	PRK09077	L-aspartate oxidase; Provisional	536
236375	PRK09078	sdhA	succinate dehydrogenase flavoprotein subunit; Reviewed	598
181642	PRK09082	PRK09082	methionine aminotransferase; Validated	386
236376	PRK09084	PRK09084	aspartate kinase III; Validated	448
169652	PRK09087	PRK09087	hypothetical protein; Validated	226
181644	PRK09088	PRK09088	acyl-CoA synthetase; Validated	488
236377	PRK09094	PRK09094	putative monovalent cation/H+ antiporter subunit C; Reviewed	114
181646	PRK09098	PRK09098	HrpE/YscL family type III secretion apparatus protein. 	233
169656	PRK09099	PRK09099	type III secretion system ATPase; Provisional	441
181647	PRK09101	nrdB	ribonucleotide-diphosphate reductase subunit beta; Reviewed	376
236378	PRK09102	PRK09102	ribonucleoside-diphosphate reductase subunit alpha. 	601
181649	PRK09103	PRK09103	ribonucleoside-diphosphate reductase subunit alpha. 	758
236379	PRK09104	PRK09104	hypothetical protein; Validated	464
181651	PRK09105	PRK09105	pyridoxal phosphate-dependent aminotransferase. 	370
236380	PRK09107	PRK09107	acetolactate synthase 3 catalytic subunit; Validated	595
236381	PRK09108	PRK09108	type III secretion system protein HrcU; Validated	353
181654	PRK09109	motC	flagellar motor protein; Reviewed	246
181655	PRK09110	PRK09110	flagellar motor stator protein MotA. 	283
236382	PRK09111	PRK09111	DNA polymerase III subunits gamma and tau; Validated	598
169667	PRK09112	PRK09112	DNA polymerase III subunit delta'; Validated	351
181657	PRK09116	PRK09116	beta-ketoacyl-ACP synthase. 	405
236383	PRK09120	PRK09120	p-hydroxycinnamoyl CoA hydratase/lyase; Validated	275
181659	PRK09121	PRK09121	methionine synthase. 	339
236384	PRK09123	PRK09123	amidophosphoribosyltransferase; Provisional	479
181661	PRK09124	PRK09124	ubiquinone-dependent pyruvate dehydrogenase. 	574
181662	PRK09125	PRK09125	DNA ligase; Provisional	282
236385	PRK09126	PRK09126	FAD-dependent hydroxylase. 	392
236386	PRK09129	PRK09129	NADH dehydrogenase subunit G; Validated	776
236387	PRK09130	PRK09130	NADH dehydrogenase subunit G; Validated	687
236388	PRK09133	PRK09133	hypothetical protein; Provisional	472
236389	PRK09134	PRK09134	SDR family oxidoreductase. 	258
181668	PRK09135	PRK09135	pteridine reductase; Provisional	249
236390	PRK09136	PRK09136	S-methyl-5'-thioinosine phosphorylase. 	245
181670	PRK09140	PRK09140	2-dehydro-3-deoxy-6-phosphogalactonate aldolase; Reviewed	206
236391	PRK09145	PRK09145	3'-5' exonuclease. 	202
236392	PRK09146	PRK09146	DNA polymerase III subunit epsilon; Validated	239
236393	PRK09147	PRK09147	succinyldiaminopimelate transaminase; Provisional	396
181674	PRK09148	PRK09148	LL-diaminopimelate aminotransferase. 	405
181675	PRK09162	PRK09162	hypoxanthine-guanine phosphoribosyltransferase; Provisional	181
181676	PRK09165	PRK09165	replicative DNA helicase; Provisional	497
236394	PRK09169	PRK09169	hypothetical protein; Validated	2316
169691	PRK09173	PRK09173	F0F1 ATP synthase subunit B; Validated	159
169692	PRK09174	PRK09174	F0F1 ATP synthase subunit B. 	204
236395	PRK09177	PRK09177	xanthine-guanine phosphoribosyltransferase; Validated	156
236396	PRK09181	PRK09181	aspartate kinase; Validated	475
236397	PRK09182	PRK09182	DNA polymerase III subunit epsilon; Validated	294
181681	PRK09183	PRK09183	transposase/IS protein; Provisional	259
181682	PRK09184	PRK09184	acyl carrier protein; Provisional	89
236398	PRK09185	PRK09185	beta-ketoacyl-ACP synthase. 	392
236399	PRK09186	PRK09186	flagellin modification protein A; Provisional	256
236400	PRK09188	PRK09188	serine/threonine protein kinase; Provisional	365
169701	PRK09189	PRK09189	uroporphyrinogen-III synthase; Validated	240
236401	PRK09190	PRK09190	RNA-binding protein. 	220
236402	PRK09191	PRK09191	two-component response regulator; Provisional	261
236403	PRK09192	PRK09192	fatty acyl-AMP ligase. 	579
236404	PRK09193	PRK09193	indolepyruvate ferredoxin oxidoreductase; Validated	1165
236405	PRK09194	PRK09194	prolyl-tRNA synthetase; Provisional	565
181690	PRK09195	gatY	tagatose-bisphosphate aldolase; Reviewed	284
181691	PRK09196	PRK09196	fructose-bisphosphate aldolase class II. 	347
236406	PRK09197	PRK09197	fructose-bisphosphate aldolase; Provisional	350
236407	PRK09198	PRK09198	putative nicotinate phosphoribosyltransferase; Provisional	463
236408	PRK09200	PRK09200	preprotein translocase subunit SecA; Reviewed	790
236409	PRK09201	PRK09201	AtzE family amidohydrolase. 	465
236410	PRK09202	nusA	transcription elongation factor NusA; Validated	470
236411	PRK09203	rplP	50S ribosomal protein L16; Reviewed	138
236412	PRK09204	secY	preprotein translocase subunit SecY; Reviewed	426
181699	PRK09206	PRK09206	pyruvate kinase PykF. 	470
181700	PRK09209	PRK09209	ribonucleoside-diphosphate reductase subunit alpha. 	761
236413	PRK09210	PRK09210	RNA polymerase sigma factor RpoD; Validated	367
169719	PRK09212	PRK09212	pyruvate dehydrogenase subunit beta; Validated	327
236414	PRK09213	PRK09213	pur operon repressor; Provisional	271
181703	PRK09216	rplM	50S ribosomal protein L13; Reviewed	144
181704	PRK09218	PRK09218	peptide deformylase; Validated	136
181705	PRK09219	PRK09219	xanthine phosphoribosyltransferase; Validated	189
236415	PRK09220	PRK09220	methylthioribulose 1-phosphate dehydratase. 	204
181707	PRK09221	PRK09221	beta alanine--pyruvate transaminase; Provisional	445
236416	PRK09222	PRK09222	NADP-dependent isocitrate dehydrogenase. 	482
236417	PRK09224	PRK09224	threonine ammonia-lyase IlvA. 	504
236418	PRK09225	PRK09225	threonine synthase; Validated	462
236419	PRK09228	PRK09228	guanine deaminase; Provisional	433
236420	PRK09229	PRK09229	N-formimino-L-glutamate deiminase; Validated	456
181713	PRK09230	PRK09230	cytosine deaminase; Provisional	426
236421	PRK09231	PRK09231	fumarate reductase flavoprotein subunit; Validated	582
236422	PRK09234	fbiC	FO synthase; Reviewed	843
181716	PRK09236	PRK09236	dihydroorotase; Reviewed	444
236423	PRK09237	PRK09237	amidohydrolase/deacetylase family metallohydrolase. 	380
236424	PRK09238	PRK09238	bifunctional aconitate hydratase 2/2-methylisocitrate dehydratase; Validated	835
181719	PRK09239	PRK09239	chorismate mutase; Provisional	104
236425	PRK09240	thiH	2-iminoacetate synthase ThiH. 	371
181721	PRK09242	PRK09242	SDR family oxidoreductase. 	257
236426	PRK09243	PRK09243	nicotinate phosphoribosyltransferase; Validated	464
181723	PRK09245	PRK09245	crotonase/enoyl-CoA hydratase family protein. 	266
236427	PRK09246	PRK09246	amidophosphoribosyltransferase; Provisional	501
236428	PRK09247	PRK09247	ATP-dependent DNA ligase; Validated	539
236429	PRK09248	PRK09248	putative hydrolase; Validated	246
236430	PRK09249	PRK09249	coproporphyrinogen dehydrogenase. 	453
236431	PRK09250	PRK09250	class I fructose-bisphosphate aldolase. 	348
236432	PRK09255	PRK09255	malate synthase; Validated	531
181730	PRK09256	PRK09256	aminoacyl-tRNA hydrolase. 	138
181731	PRK09257	PRK09257	aromatic amino acid transaminase. 	396
181732	PRK09258	PRK09258	3-oxoacyl-(acyl carrier protein) synthase III; Reviewed	338
236433	PRK09259	PRK09259	putative oxalyl-CoA decarboxylase; Validated	569
236434	PRK09260	PRK09260	3-hydroxyacyl-CoA dehydrogenase. 	288
236435	PRK09261	PRK09261	phospho-2-dehydro-3-deoxyheptonate aldolase; Validated	349
181735	PRK09262	PRK09262	hypothetical protein; Provisional	225
236436	PRK09263	PRK09263	anaerobic ribonucleoside triphosphate reductase; Provisional	711
236437	PRK09264	PRK09264	diaminobutyrate--2-oxoglutarate transaminase. 	425
181738	PRK09265	PRK09265	aminotransferase AlaT; Validated	404
236438	PRK09266	PRK09266	hypothetical protein; Provisional	266
236439	PRK09267	PRK09267	flavodoxin FldA; Validated	169
236440	PRK09268	PRK09268	acetyl-CoA C-acetyltransferase. 	427
236441	PRK09269	PRK09269	chorismate mutase; Provisional	193
236442	PRK09270	PRK09270	nucleoside triphosphate hydrolase domain-containing protein; Reviewed	229
181744	PRK09271	PRK09271	flavodoxin; Provisional	160
181745	PRK09272	PRK09272	hypothetical protein; Provisional	109
181746	PRK09273	PRK09273	hypothetical protein; Provisional	211
236443	PRK09274	PRK09274	peptide synthase; Provisional	552
236444	PRK09275	PRK09275	bifunctional aspartate transaminase/aspartate 4-decarboxylase. 	527
181749	PRK09276	PRK09276	LL-diaminopimelate aminotransferase; Provisional	385
236445	PRK09277	PRK09277	aconitate hydratase AcnA. 	888
236446	PRK09279	PRK09279	pyruvate phosphate dikinase; Provisional	879
236447	PRK09280	PRK09280	F0F1 ATP synthase subunit beta; Validated	463
236448	PRK09281	PRK09281	F0F1 ATP synthase subunit alpha; Validated	502
236449	PRK09282	PRK09282	pyruvate carboxylase subunit B; Validated	592
236450	PRK09283	PRK09283	porphobilinogen synthase. 	323
236451	PRK09284	PRK09284	thiamine biosynthesis protein ThiC; Provisional	607
236452	PRK09285	PRK09285	adenylosuccinate lyase; Provisional	456
236453	PRK09287	PRK09287	NADP-dependent phosphogluconate dehydrogenase. 	459
236454	PRK09288	purT	formate-dependent phosphoribosylglycinamide formyltransferase. 	395
236455	PRK09289	PRK09289	riboflavin synthase. 	194
236456	PRK09290	PRK09290	allantoate amidohydrolase; Reviewed	413
181762	PRK09291	PRK09291	SDR family oxidoreductase. 	257
236457	PRK09292	PRK09292	Na(+)-translocating NADH-quinone reductase subunit D; Validated	209
236458	PRK09293	PRK09293	class 1 fructose-bisphosphatase. 	327
181765	PRK09294	PRK09294	phthiocerol/phthiodiolone dimycocerosyl transferase. 	416
181766	PRK09295	PRK09295	cysteine desulfurase SufS. 	406
181767	PRK09296	PRK09296	cysteine desulfuration protein SufE. 	138
236459	PRK09297	PRK09297	tRNA-splicing endonuclease subunit alpha; Reviewed	169
236460	PRK09300	PRK09300	tRNA splicing endonuclease; Reviewed	330
181770	PRK09301	PRK09301	circadian clock protein KaiB; Provisional	103
236461	PRK09302	PRK09302	circadian clock protein KaiC; Reviewed	509
236462	PRK09303	PRK09303	histidine kinase. 	380
236463	PRK09304	PRK09304	arginine exporter ArgO. 	207
137204	PRK09310	aroDE	bifunctional 3-dehydroquinate dehydratase/shikimate dehydrogenase. 	477
181774	PRK09311	PRK09311	bifunctional 3,4-dihydroxy-2-butanone-4-phosphate synthase/GTP cyclohydrolase II. 	402
181775	PRK09314	PRK09314	bifunctional 3,4-dihydroxy-2-butanone 4-phosphate synthase/GTP cyclohydrolase II. 	339
236464	PRK09318	PRK09318	bifunctional 3,4-dihydroxy-2-butanone-4-phosphate synthase/GTP cyclohydrolase II. 	387
236465	PRK09319	PRK09319	bifunctional 3,4-dihydroxy-2-butanone-4-phosphate synthase RibB/GTP cyclohydrolase II RibA. 	555
236466	PRK09325	PRK09325	coenzyme F420-reducing hydrogenase subunit beta; Validated	282
181779	PRK09326	PRK09326	F420H2 dehydrogenase subunit F; Provisional	341
236467	PRK09328	PRK09328	N5-glutamine S-adenosyl-L-methionine-dependent methyltransferase; Provisional	275
236468	PRK09330	PRK09330	cell division protein FtsZ; Validated	384
236469	PRK09331	PRK09331	Sep-tRNA:Cys-tRNA synthetase; Provisional	387
236470	PRK09333	PRK09333	30S ribosomal protein S19e; Provisional	150
181784	PRK09334	PRK09334	30S ribosomal protein S25e; Provisional	86
181785	PRK09335	PRK09335	30S ribosomal protein S26e; Provisional	95
181786	PRK09336	PRK09336	30S ribosomal protein S30e; Provisional	50
181787	PRK09343	PRK09343	prefoldin subunit beta; Provisional	121
236471	PRK09344	PRK09344	phosphoenolpyruvate carboxykinase. 	526
236472	PRK09347	folE	GTP cyclohydrolase I; Provisional	188
236473	PRK09348	glyQ	glycyl-tRNA synthetase subunit alpha; Validated	283
236474	PRK09350	PRK09350	elongation factor P--(R)-beta-lysine ligase. 	306
236475	PRK09352	PRK09352	beta-ketoacyl-ACP synthase 3. 	319
236476	PRK09354	recA	recombinase A; Provisional	349
236477	PRK09355	PRK09355	hydroxyethylthiazole kinase; Validated	263
236478	PRK09356	PRK09356	imidazolonepropionase; Validated	406
236479	PRK09357	pyrC	dihydroorotase; Validated	423
236480	PRK09358	PRK09358	adenosine deaminase; Provisional	340
236481	PRK09360	lamB	maltoporin LamB. 	415
236482	PRK09361	radB	DNA repair and recombination protein RadB; Provisional	225
181800	PRK09362	PRK09362	phosphoribosylaminoimidazole-succinocarboxamide synthase; Reviewed	238
236483	PRK09364	moaC	cyclic pyranopterin monophosphate synthase MoaC. 	159
236484	PRK09367	PRK09367	histidine ammonia-lyase; Provisional	500
236485	PRK09368	PRK09368	gas vesicle structural protein GvpA. 	140
236486	PRK09369	PRK09369	UDP-N-acetylglucosamine 1-carboxyvinyltransferase; Validated	417
181805	PRK09371	PRK09371	gas vesicle structural protein GvpA. 	68
236487	PRK09372	PRK09372	ribonuclease E inhibitor RraA. 	159
236488	PRK09374	rplB	50S ribosomal protein L2; Validated	276
236489	PRK09375	PRK09375	quinolinate synthase NadA. 	319
236490	PRK09376	rho	transcription termination factor Rho; Provisional	416
236491	PRK09377	tsf	elongation factor Ts; Provisional	290
181811	PRK09379	PRK09379	LytR family transcriptional regulator. 	303
181812	PRK09381	trxA	thioredoxin TrxA. 	109
236492	PRK09382	ispDF	bifunctional 2-C-methyl-D-erythritol 4-phosphate cytidylyltransferase/2-C-methyl-D-erythritol 2,4-cyclodiphosphate synthase protein; Provisional	378
236493	PRK09389	PRK09389	(R)-citramalate synthase; Provisional	488
181815	PRK09390	fixJ	response regulator FixJ; Provisional	202
236494	PRK09391	fixK	transcriptional regulator FixK; Provisional	230
181817	PRK09392	ftrB	transcriptional activator FtrB; Provisional	236
181818	PRK09393	ftrA	transcriptional activator FtrA; Provisional	322
236495	PRK09395	actP	cation/acetate symporter ActP. 	551
236496	PRK09398	sspN	acid-soluble spore protein N; Provisional	47
181821	PRK09399	sspP	small acid-soluble spore protein P. 	48
236497	PRK09400	secE	preprotein translocase subunit SecE; Reviewed	61
236498	PRK09401	PRK09401	reverse gyrase; Reviewed	1176
236499	PRK09404	sucA	2-oxoglutarate dehydrogenase E1 component; Reviewed	924
236500	PRK09405	aceE	pyruvate dehydrogenase subunit E1; Reviewed	891
181826	PRK09406	gabD1	succinic semialdehyde dehydrogenase; Reviewed	457
236501	PRK09407	gabD2	succinic semialdehyde dehydrogenase; Reviewed	524
181828	PRK09408	ompX	outer membrane protein OmpX. 	171
181829	PRK09409	PRK09409	IS2 transposase TnpB; Reviewed	301
236502	PRK09410	ulaA	PTS system ascorbate-specific transporter subunit IIC; Reviewed	452
181831	PRK09411	PRK09411	carbamate kinase; Reviewed	297
236503	PRK09412	PRK09412	anaerobic C4-dicarboxylate transporter; Reviewed	433
181833	PRK09413	PRK09413	IS2 repressor TnpA; Reviewed	121
181834	PRK09414	PRK09414	NADP-specific glutamate dehydrogenase. 	445
181835	PRK09415	PRK09415	RNA polymerase factor sigma C; Reviewed	179
181836	PRK09416	lstR	PadR family transcriptional regulator. 	135
181837	PRK09417	mogA	molybdenum cofactor biosynthesis protein MogA; Provisional	193
236504	PRK09418	PRK09418	bifunctional 2',3'-cyclic-nucleotide 2'-phosphodiesterase/3'-nucleotidase. 	780
236505	PRK09419	PRK09419	multifunctional 2',3'-cyclic-nucleotide 2'-phosphodiesterase/3'-nucleotidase/5'-nucleotidase. 	1163
236506	PRK09420	cpdB	bifunctional 2',3'-cyclic-nucleotide 2'-phosphodiesterase/3'-nucleotidase. 	649
181841	PRK09421	modB	molybdate ABC transporter permease subunit. 	229
181842	PRK09422	PRK09422	ethanol-active dehydrogenase/acetaldehyde-active reductase; Provisional	338
181843	PRK09423	gldA	glycerol dehydrogenase; Provisional	366
236507	PRK09424	pntA	Re/Si-specific NAD(P)(+) transhydrogenase subunit alpha. 	509
181845	PRK09425	prpD	bifunctional 2-methylcitrate dehydratase/aconitate hydratase. 	480
236508	PRK09426	PRK09426	methylmalonyl-CoA mutase; Reviewed	714
236509	PRK09427	PRK09427	bifunctional indole-3-glycerol-phosphate synthase TrpC/phosphoribosylanthranilate isomerase TrpF. 	454
236510	PRK09428	pssA	CDP-diacylglycerol--serine O-phosphatidyltransferase. 	451
236511	PRK09429	mepA	penicillin-insensitive murein endopeptidase; Reviewed	275
236512	PRK09430	djlA	co-chaperone DjlA. 	267
236513	PRK09431	asnB	asparagine synthetase B; Provisional	554
181852	PRK09432	metF	methylenetetrahydrofolate reductase. 	296
181853	PRK09433	thiP	thiamine transporter membrane protein; Reviewed	525
236514	PRK09434	PRK09434	aminoimidazole riboside kinase; Provisional	304
236515	PRK09435	PRK09435	methylmalonyl Co-A mutase-associated GTPase MeaB. 	332
181856	PRK09436	thrA	bifunctional aspartokinase I/homoserine dehydrogenase I; Provisional	819
181857	PRK09437	bcp	thioredoxin-dependent thiol peroxidase; Reviewed	154
236516	PRK09438	nudB	dihydroneopterin triphosphate pyrophosphatase; Provisional	148
181859	PRK09439	PRK09439	PTS system glucose-specific transporter subunit; Provisional	169
236517	PRK09440	avtA	valine--pyruvate transaminase; Provisional	416
236518	PRK09441	PRK09441	cytoplasmic alpha-amylase; Reviewed	479
236519	PRK09442	panF	sodium/pantothenate symporter. 	483
236520	PRK09444	pntB	Re/Si-specific NAD(P)(+) transhydrogenase subunit beta. 	462
236521	PRK09448	PRK09448	DNA starvation/stationary phase protection protein Dps; Provisional	162
181865	PRK09449	PRK09449	dUMP phosphatase; Provisional	224
236522	PRK09450	cyaA	class I adenylate cyclase. 	830
181867	PRK09451	glmU	bifunctional UDP-N-acetylglucosamine diphosphorylase/glucosamine-1-phosphate N-acetyltransferase GlmU. 	456
236523	PRK09452	potA	spermidine/putrescine ABC transporter ATP-binding protein PotA. 	375
181869	PRK09453	PRK09453	phosphodiesterase; Provisional	182
236524	PRK09454	ugpQ	cytoplasmic glycerophosphodiester phosphodiesterase; Provisional	249
236525	PRK09455	rseB	anti-sigma E factor; Provisional	319
181872	PRK09456	PRK09456	alpha-D-glucose-1-phosphatase; Provisional	199
181873	PRK09457	astD	succinylglutamic semialdehyde dehydrogenase; Reviewed	487
236526	PRK09458	pspB	envelope stress response membrane protein PspB. 	75
181875	PRK09459	pspG	envelope stress response protein PspG. 	76
181876	PRK09461	ansA	cytoplasmic asparaginase I; Provisional	335
236527	PRK09462	fur	ferric uptake regulator; Provisional	148
236528	PRK09463	fadE	acyl-CoA dehydrogenase; Reviewed	777
181879	PRK09464	pdhR	pyruvate dehydrogenase complex transcriptional repressor PdhR. 	254
236529	PRK09465	tolC	outer membrane channel protein; Reviewed	446
236530	PRK09466	metL	bifunctional aspartate kinase II/homoserine dehydrogenase II; Provisional	810
236531	PRK09467	envZ	osmolarity sensor protein; Provisional	435
181883	PRK09468	ompR	osmolarity response regulator; Provisional	239
181884	PRK09469	glnA	glutamate--ammonia ligase. 	469
236532	PRK09470	cpxA	envelope stress sensor histidine kinase CpxA. 	461
181886	PRK09471	oppB	oligopeptide ABC transporter permease OppB. 	306
181887	PRK09472	ftsA	cell division protein FtsA; Reviewed	420
181888	PRK09473	oppD	oligopeptide transporter ATP-binding component; Provisional	330
236533	PRK09474	malE	maltose/maltodextrin ABC transporter substrate-binding protein MalE. 	396
236534	PRK09476	napG	quinol dehydrogenase periplasmic component; Provisional	254
236535	PRK09477	napH	quinol dehydrogenase membrane component; Provisional	271
181892	PRK09478	mglC	galactose/methyl galactoside ABC transporter permease MglC. 	336
236536	PRK09479	glpX	fructose 1,6-bisphosphatase II; Reviewed	319
181894	PRK09480	slmA	division inhibitor protein; Provisional	194
236537	PRK09481	sspA	stringent starvation protein A; Provisional	211
181896	PRK09482	PRK09482	flap endonuclease-like protein; Provisional	256
236538	PRK09483	PRK09483	response regulator; Provisional	217
181898	PRK09484	PRK09484	3-deoxy-manno-octulosonate-8-phosphatase KdsC. 	183
181899	PRK09485	mmuM	homocysteine methyltransferase; Provisional	304
181900	PRK09487	sdhC	succinate dehydrogenase cytochrome b556 subunit. 	129
181901	PRK09488	sdhD	succinate dehydrogenase membrane anchor subunit. 	115
181902	PRK09489	rsmC	16S rRNA (guanine(1207)-N(2))-methyltransferase RsmC. 	342
236539	PRK09490	metH	B12-dependent methionine synthase; Provisional	1229
181904	PRK09491	rimI	ribosomal-protein-alanine N-acetyltransferase; Provisional	146
181905	PRK09492	treR	HTH-type transcriptional regulator TreR. 	315
181906	PRK09493	glnQ	glutamine ABC transporter ATP-binding protein GlnQ. 	240
181907	PRK09494	glnP	glutamine ABC transporter permease protein; Reviewed	219
236540	PRK09495	glnH	glutamine ABC transporter periplasmic protein; Reviewed	247
236541	PRK09496	trkA	Trk system potassium transporter TrkA. 	453
181910	PRK09497	potB	spermidine/putrescine ABC transporter membrane protein; Reviewed	285
181911	PRK09498	sifA	type III secretion system effector SifA. 	336
137339	PRK09499	sifB	type III secretion system effector SifB. 	316
236542	PRK09500	potC	spermidine/putrescine ABC transporter membrane protein; Reviewed	256
181913	PRK09501	potD	spermidine/putrescine ABC transporter periplasmic substrate-binding protein; Reviewed	348
181914	PRK09502	iscA	iron-sulfur cluster assembly protein IscA. 	107
181915	PRK09504	sufA	iron-sulfur cluster assembly scaffold protein; Provisional	122
236543	PRK09505	malS	alpha-amylase; Reviewed	683
236544	PRK09506	mrcB	bifunctional glycosyl transferase/transpeptidase; Reviewed	830
169931	PRK09507	cspE	cold shock-like protein CspE. 	69
181918	PRK09508	leuO	leucine transcriptional activator; Reviewed	314
181919	PRK09509	fieF	CDF family cation-efflux pump FieF. 	299
236545	PRK09510	tolA	cell envelope integrity inner membrane protein TolA; Provisional	387
181921	PRK09511	nirD	nitrite reductase small subunit NirD. 	108
181922	PRK09512	rbsC	ribose ABC transporter permease protein; Reviewed	320
181923	PRK09513	fruK	1-phosphofructokinase; Provisional	312
181924	PRK09514	zntR	Zn(2+)-responsive transcriptional regulator. 	140
169939	PRK09517	PRK09517	multifunctional thiamine-phosphate pyrophosphorylase/synthase/phosphomethylpyrimidine kinase; Provisional	755
236546	PRK09518	PRK09518	bifunctional cytidylate kinase/GTPase Der; Reviewed	712
77219	PRK09519	recA	intein-containing recombinase RecA. 	790
236547	PRK09521	PRK09521	exosome complex RNA-binding protein Csl4; Provisional	189
181927	PRK09522	PRK09522	bifunctional anthranilate synthase glutamate amidotransferase component TrpG/anthranilate phosphoribosyltransferase TrpD. 	531
236548	PRK09525	lacZ	beta-galactosidase. 	1027
181929	PRK09526	lacI	lac repressor; Reviewed	342
181930	PRK09527	lacA	galactoside O-acetyltransferase; Reviewed	203
236549	PRK09528	lacY	galactoside permease; Reviewed	420
236550	PRK09529	PRK09529	bifunctional acetyl-CoA decarbonylase/synthase complex subunit alpha/beta; Reviewed	711
181933	PRK09532	PRK09532	DNA polymerase III subunit alpha; Reviewed	874
236551	PRK09533	PRK09533	bifunctional transaldolase/phosphoglucose isomerase; Validated	948
236552	PRK09534	btuF	corrinoid ABC transporter substrate-binding protein; Reviewed	359
236553	PRK09535	btuC	cobalamin ABC transporter permease BtuC. 	366
236554	PRK09536	btuD	corrinoid ABC transporter ATPase; Reviewed	402
236555	PRK09537	pylS	pyrrolysine--tRNA(Pyl) ligase. 	417
236556	PRK09539	PRK09539	tRNA-splicing endonuclease subunit beta; Reviewed	124
137367	PRK09541	emrE	EmrE family multidrug efflux SMR transporter. 	110
236557	PRK09542	manB	phosphomannomutase/phosphoglucomutase; Reviewed	445
181938	PRK09543	znuB	zinc ABC transporter permease subunit ZnuB. 	261
181939	PRK09544	znuC	high-affinity zinc transporter ATPase; Reviewed	251
236558	PRK09545	znuA	zinc ABC transporter substrate-binding protein ZnuA. 	311
181941	PRK09546	zntB	zinc transporter ZntB. 	324
181942	PRK09547	nhaB	sodium/proton antiporter NhaB. 	513
236559	PRK09548	PRK09548	PTS ascorbate-specific subunit IIBC. 	602
236560	PRK09549	mtnW	2,3-diketo-5-methylthiopentyl-1-phosphate enolase; Reviewed	407
236561	PRK09550	mtnK	methylthioribose kinase; Reviewed	401
236562	PRK09552	mtnX	2-hydroxy-3-keto-5-methylthiopentenyl-1-phosphate phosphatase; Reviewed	219
181947	PRK09553	tauD	taurine dioxygenase; Reviewed	277
236563	PRK09554	feoB	Fe(2+) transporter permease subunit FeoB. 	772
181949	PRK09555	feoA	ferrous iron transporter A. 	74
236564	PRK09556	uhpT	hexose-6-phosphate:phosphate antiporter. 	467
236565	PRK09557	PRK09557	fructokinase; Reviewed	301
236566	PRK09558	ushA	bifunctional UDP-sugar hydrolase/5'-nucleotidase periplasmic precursor; Reviewed	551
236567	PRK09559	PRK09559	putative global regulator; Reviewed	327
236568	PRK09560	nhaA	pH-dependent sodium/proton antiporter; Reviewed	389
181955	PRK09561	nhaA	sodium/proton antiporter NhaA. 	388
236569	PRK09562	mazG	nucleoside triphosphate pyrophosphohydrolase; Reviewed	262
236570	PRK09563	rbgA	GTPase YlqF; Reviewed	287
181958	PRK09564	PRK09564	coenzyme A disulfide reductase; Reviewed	444
236571	PRK09565	PRK09565	heme-binding protein. 	533
236572	PRK09566	nirA	ferredoxin-nitrite reductase; Reviewed	513
236573	PRK09567	nirA	NirA family protein. 	593
236574	PRK09568	PRK09568	DNA primase regulatory subunit PriL. 	306
181961	PRK09569	PRK09569	citrate (Si)-synthase. 	437
236575	PRK09570	rpoH	DNA-directed RNA polymerase subunit H; Reviewed	79
181963	PRK09573	PRK09573	(S)-2,3-di-O-geranylgeranylglyceryl phosphate synthase; Reviewed	279
236576	PRK09575	vmrA	MATE family efflux transporter. 	453
169981	PRK09577	PRK09577	multidrug efflux RND transporter permease subunit. 	1032
169982	PRK09578	PRK09578	MexX/AxyX family multidrug efflux RND transporter periplasmic adaptor subunit. 	385
169983	PRK09579	PRK09579	multidrug efflux RND transporter permease subunit. 	1017
181965	PRK09580	sufC	cysteine desulfurase ATPase component; Reviewed	248
236577	PRK09581	pleD	response regulator PleD; Reviewed	457
181967	PRK09582	chaB	putative cation transport regulator ChaB. 	76
236578	PRK09583	PRK09583	mycothiol-dependent maleylpyruvate isomerase; Reviewed	241
181969	PRK09584	tppB	dipeptide/tripeptide permease DtpA. 	500
236579	PRK09585	anmK	anhydro-N-acetylmuramic acid kinase; Reviewed	365
181971	PRK09586	murP	PTS system N-acetylmuramic acid transporter subunits EIIBC; Reviewed	476
181972	PRK09588	PRK09588	hypothetical protein; Reviewed	376
181973	PRK09589	celA	6-phospho-beta-glucosidase; Reviewed	476
181974	PRK09590	celB	PTS cellobiose transporter subunit IIB. 	104
181975	PRK09591	celC	PTS cellobiose transporter subunit IIA. 	104
181976	PRK09592	celD	PTS cellobiose transporter subunit IIC. 	449
236580	PRK09593	arb	6-phospho-beta-glucosidase; Reviewed	478
181978	PRK09597	PRK09597	lipid A 1-phosphatase LpxE. 	190
236581	PRK09598	PRK09598	phosphoethanolamine--lipid A transferase EptA. 	522
236582	PRK09599	PRK09599	NADP-dependent phosphogluconate dehydrogenase. 	301
236583	PRK09601	PRK09601	redox-regulated ATPase YchF. 	364
236584	PRK09602	PRK09602	translation-associated GTPase; Reviewed	396
181983	PRK09603	PRK09603	DNA-directed RNA polymerase subunit beta/beta'. 	2890
236585	PRK09604	PRK09604	tRNA (adenosine(37)-N6)-threonylcarbamoyltransferase complex transferase subunit TsaD. 	332
236586	PRK09605	PRK09605	bifunctional N(6)-L-threonylcarbamoyladenine synthase/serine/threonine protein kinase. 	535
236587	PRK09606	PRK09606	DNA-directed RNA polymerase subunit B''; Validated	494
236588	PRK09607	rps11p	30S ribosomal protein S11P; Reviewed	132
181988	PRK09609	PRK09609	hypothetical protein; Provisional	312
236589	PRK09612	rpl2p	50S ribosomal protein L2P; Validated	238
236590	PRK09613	thiH	thiamine biosynthesis protein ThiH; Reviewed	469
236591	PRK09614	nrdF	ribonucleotide-diphosphate reductase subunit beta; Reviewed	324
181992	PRK09615	ggt	gamma-glutamyltranspeptidase; Reviewed	581
236592	PRK09616	pheT	phenylalanine--tRNA ligase subunit beta. 	552
236593	PRK09617	PRK09617	type III secretion system protein; Reviewed	243
236594	PRK09618	flgD	flagellar hook assembly protein FlgD. 	142
181996	PRK09619	flgD	flagellar hook assembly protein FlgD. 	218
181997	PRK09620	PRK09620	hypothetical protein; Provisional	229
236595	PRK09621	PRK09621	ATP synthase subunit C. 	141
181999	PRK09622	porA	2-oxoacid:ferredoxin oxidoreductase subunit alpha. 	407
170016	PRK09623	vorD	3-methyl-2-oxobutanoate dehydrogenase subunit delta. 	105
170017	PRK09624	porD	pyruvate ferredoxin oxidoreductase subunit delta; Reviewed	105
236596	PRK09625	porD	pyruvate flavodoxin oxidoreductase subunit delta; Reviewed	133
236597	PRK09626	oorD	2-oxoglutarate-acceptor oxidoreductase subunit OorD; Reviewed	103
182002	PRK09627	oorA	2-oxoglutarate synthase subunit alpha. 	375
182003	PRK09628	oorB	2-oxoglutarate ferredoxin oxidoreductase subunit beta. 	277
104071	PRK09629	PRK09629	bifunctional thiosulfate sulfurtransferase/phosphatidylserine decarboxylase; Provisional	610
170022	PRK09630	PRK09630	DNA topoisomerase IV subunit A; Provisional	479
236598	PRK09631	PRK09631	DNA topoisomerase IV subunit A; Provisional	635
236599	PRK09632	PRK09632	ATP-dependent DNA ligase; Reviewed	764
182006	PRK09633	ligD	DNA ligase D. 	610
182007	PRK09634	nusB	transcription antitermination protein NusB; Provisional	207
182008	PRK09635	sigI	RNA polymerase sigma factor SigI; Provisional	290
236600	PRK09636	PRK09636	RNA polymerase sigma factor SigJ; Provisional	293
236601	PRK09637	PRK09637	RNA polymerase sigma factor SigZ; Provisional	181
182010	PRK09638	PRK09638	RNA polymerase sigma factor SigY; Reviewed	176
236602	PRK09639	PRK09639	RNA polymerase sigma factor SigX; Provisional	166
236603	PRK09640	PRK09640	RNA polymerase sigma factor SigX; Reviewed	188
182012	PRK09641	PRK09641	RNA polymerase sigma factor SigW; Provisional	187
170031	PRK09642	PRK09642	RNA polymerase sigma factor. 	160
236604	PRK09643	PRK09643	RNA polymerase sigma factor SigM; Reviewed	192
170033	PRK09644	PRK09644	RNA polymerase sigma factor SigM; Provisional	165
236605	PRK09645	PRK09645	ECF RNA polymerase sigma factor SigL. 	173
182015	PRK09646	PRK09646	ECF RNA polymerase sigma factor SigK. 	194
236606	PRK09647	PRK09647	RNA polymerase sigma factor SigE; Reviewed	203
236607	PRK09648	PRK09648	RNA polymerase sigma factor ShbA. 	189
137458	PRK09649	PRK09649	RNA polymerase sigma factor SigC; Reviewed	185
182018	PRK09651	PRK09651	RNA polymerase sigma factor FecI; Provisional	172
236608	PRK09652	PRK09652	RNA polymerase sigma factor RpoE; Provisional	182
236609	PRK09653	eutD	phosphotransacetylase. 	324
182021	PRK09662	PRK09662	GspL-like protein; Provisional	286
182022	PRK09664	PRK09664	low affinity tryptophan permease TnaB. 	415
182023	PRK09665	PRK09665	PTS galactitol transporter subunit IIA. 	150
236610	PRK09669	PRK09669	putative symporter YagG; Provisional	444
236611	PRK09672	PRK09672	phage exclusion protein Lit; Provisional	305
182026	PRK09674	PRK09674	enoyl-CoA hydratase-isomerase; Provisional	255
137467	PRK09677	PRK09677	putative lipopolysaccharide biosynthesis O-acetyl transferase WbbJ; Provisional	192
137468	PRK09678	PRK09678	DNA-binding transcriptional regulator; Provisional	72
182027	PRK09681	PRK09681	putative type II secretion protein GspC; Provisional	276
236612	PRK09685	PRK09685	DNA-binding transcriptional activator FeaR; Provisional	302
170047	PRK09687	PRK09687	putative lyase; Provisional	280
182029	PRK09689	PRK09689	prophage protein NinE; Provisional	56
170049	PRK09692	PRK09692	integrase; Provisional	413
236613	PRK09693	PRK09693	Cascade antiviral complex protein; Validated	489
182031	PRK09694	PRK09694	CRISPR-associated helicase/endonuclease Cas3. 	878
182032	PRK09695	PRK09695	glycolate permease GlcA. 	560
182033	PRK09697	PRK09697	putative general secretion pathway protein GspB. 	139
182034	PRK09698	PRK09698	D-allose kinase; Provisional	302
182035	PRK09699	PRK09699	D-allose ABC transporter permease. 	312
182036	PRK09700	PRK09700	D-allose ABC transporter ATP-binding protein AlsA. 	510
182037	PRK09701	PRK09701	D-allose transporter substrate-binding protein. 	311
77355	PRK09702	PRK09702	PTS sugar transporter subunit IIB. 	161
236614	PRK09705	cynX	putative cyanate transporter; Provisional	393
182039	PRK09706	PRK09706	transcriptional repressor DicA; Reviewed	135
182040	PRK09707	PRK09707	putative lipoprotein; Provisional	1343
236615	PRK09709	PRK09709	exodeoxyribonuclease VIII. 	877
182042	PRK09710	lar	type I toxin-antitoxin system endodeoxyribonuclease toxin RalR. 	64
137485	PRK09713	focB	formate transporter. 	282
182043	PRK09716	PRK09716	YhaC family protein. 	395
182044	PRK09717	PRK09717	stationary phase growth adaptation protein; Provisional	179
182045	PRK09718	PRK09718	SopA family protein. 	512
182046	PRK09719	PRK09719	hypothetical protein; Provisional	89
182047	PRK09720	cybC	cytochrome b562; Provisional	100
236616	PRK09722	PRK09722	allulose-6-phosphate 3-epimerase; Provisional	229
236617	PRK09723	PRK09723	fimbrial-like adhesin. 	421
182049	PRK09726	PRK09726	type II toxin-antitoxin system antitoxin HipB. 	88
137493	PRK09727	PRK09727	his operon leader peptide; Provisional	16
182050	PRK09729	PRK09729	hypothetical protein; Provisional	68
182051	PRK09730	PRK09730	SDR family oxidoreductase. 	247
182052	PRK09731	PRK09731	type II secretion system protein. 	178
170072	PRK09732	PRK09732	hypothetical protein; Provisional	134
182053	PRK09733	PRK09733	fimbrial-like protein. 	181
236618	PRK09736	PRK09736	5-methylcytosine-specific restriction enzyme subunit McrC; Provisional	352
236619	PRK09737	PRK09737	type I restriction-modification system specificity subunit. 	461
182055	PRK09738	PRK09738	small toxic polypeptide; Provisional	52
236620	PRK09739	PRK09739	NAD(P)H oxidoreductase. 	199
182057	PRK09741	PRK09741	hypothetical protein; Provisional	148
137503	PRK09744	PRK09744	DNA-binding transcriptional regulator DicC; Provisional	75
182058	PRK09750	PRK09750	hypothetical protein; Provisional	64
137505	PRK09751	PRK09751	putative ATP-dependent helicase Lhr; Provisional	1490
182059	PRK09752	PRK09752	AIDA-I family autotransporter YfaL. 	1250
170080	PRK09754	PRK09754	phenylpropionate dioxygenase ferredoxin reductase subunit; Provisional	396
182060	PRK09755	PRK09755	ABC transporter substrate-binding protein. 	535
182061	PRK09756	PRK09756	PTS N-acetylgalactosamine transporter subunit IIB. 	158
236621	PRK09757	PRK09757	PTS N-acetylgalactosamine transporter subunit IIC. 	267
182063	PRK09759	PRK09759	type I toxin-antitoxin system toxin HokA. 	50
182064	PRK09762	PRK09762	galactosamine-6-phosphate isomerase; Provisional	232
182065	PRK09764	PRK09764	GntR family transcriptional regulator. 	240
182066	PRK09765	PRK09765	PTS system 2-O-alpha-mannosyl-D-glycerate specific transporter subunit IIABC; Provisional	631
182067	PRK09767	PRK09767	DUF559 domain-containing protein. 	117
170086	PRK09772	PRK09772	transcriptional antiterminator BglG; Provisional	278
236622	PRK09774	PRK09774	fec operon regulator FecR; Reviewed	319
236623	PRK09775	PRK09775	type II toxin-antitoxin system HipA family toxin YjjJ. 	442
182070	PRK09776	PRK09776	putative diguanylate cyclase; Provisional	1092
182071	PRK09777	fecD	Fe(3+) dicitrate ABC transporter permease subunit FecD. 	318
170091	PRK09778	PRK09778	type I toxin-antitoxin system antitoxin YafN. 	97
170092	PRK09781	PRK09781	hypothetical protein; Provisional	181
236624	PRK09782	PRK09782	bacteriophage N4 receptor, outer membrane subunit; Provisional	987
236625	PRK09783	PRK09783	copper/silver efflux system membrane fusion protein CusB; Provisional	409
182074	PRK09784	PRK09784	YccE family protein. 	417
182075	PRK09786	PRK09786	endodeoxyribonuclease RUS; Reviewed	120
182076	PRK09790	PRK09790	hypothetical protein; Reviewed	91
182077	PRK09791	PRK09791	LysR family transcriptional regulator. 	302
182078	PRK09792	PRK09792	4-aminobutyrate transaminase; Provisional	421
182079	PRK09793	PRK09793	methyl-accepting chemotaxis protein IV. 	533
182080	PRK09795	PRK09795	aminopeptidase; Provisional	361
182081	PRK09796	PRK09796	PTS system cellobiose/arbutin/salicin-specific transporter subunits IIBC; Provisional	472
182082	PRK09798	PRK09798	MazF-MazE toxin-antitoxin system antitoxin MazE. 	82
182083	PRK09799	PRK09799	putative oxidoreductase; Provisional	258
182084	PRK09800	PRK09800	putative hypoxanthine oxidase; Provisional	956
182085	PRK09801	PRK09801	LysR family transcriptional regulator. 	310
182086	PRK09802	PRK09802	DeoR family transcriptional regulator. 	269
182087	PRK09804	PRK09804	C4-dicarboxylate transporter DcuC. 	455
77417	PRK09806	PRK09806	tryptophanase leader peptide; Provisional	24
182088	PRK09807	PRK09807	hypothetical protein; Provisional	161
137533	PRK09810	PRK09810	lipoprotein antitoxin entericidin A. 	41
182089	PRK09812	PRK09812	type II toxin-antitoxin system ChpB family toxin. 	116
182090	PRK09813	PRK09813	fructoselysine 6-kinase; Provisional	260
236626	PRK09814	PRK09814	sugar transferase. 	333
77423	PRK09816	thrL	thr operon leader peptide; Provisional	21
182092	PRK09818	PRK09818	kinase inhibitor. 	183
182093	PRK09819	PRK09819	mannosylglycerate hydrolase. 	875
182094	PRK09821	PRK09821	putative transporter; Provisional	454
182095	PRK09822	PRK09822	lipopolysaccharide core biosynthesis protein; Provisional	269
170114	PRK09823	PRK09823	putative inner membrane protein; Provisional	160
236627	PRK09824	PRK09824	PTS system beta-glucoside-specific transporter subunits IIABC; Provisional	627
182097	PRK09825	idnK	gluconokinase. 	176
182098	PRK09828	PRK09828	putative fimbrial outer membrane usher protein; Provisional	865
182099	PRK09831	PRK09831	GNAT family N-acetyltransferase. 	147
182100	PRK09834	PRK09834	DNA-binding transcriptional regulator. 	263
182101	PRK09835	PRK09835	Cu(+)/Ag(+) sensor histidine kinase. 	482
182102	PRK09836	PRK09836	DNA-binding transcriptional activator CusR; Provisional	227
182103	PRK09837	PRK09837	Cu(I)/Ag(I) efflux RND transporter outer membrane protein. 	461
182104	PRK09838	PRK09838	periplasmic copper-binding protein; Provisional	115
182105	PRK09840	PRK09840	catecholate siderophore receptor Fiu; Provisional	761
182106	PRK09841	PRK09841	tyrosine-protein kinase. 	726
236628	PRK09846	recT	recombination protein RecT. 	266
182108	PRK09847	PRK09847	gamma-glutamyl-gamma-aminobutyraldehyde dehydrogenase; Provisional	494
182109	PRK09848	PRK09848	glucuronide transporter; Provisional	448
236629	PRK09849	PRK09849	putative oxidoreductase; Provisional	702
182111	PRK09850	PRK09850	pseudouridine kinase; Provisional	313
182112	PRK09852	PRK09852	cryptic 6-phospho-beta-glucosidase; Provisional	474
236630	PRK09853	PRK09853	putative selenate reductase subunit YgfK; Provisional	1019
182114	PRK09854	cmtB	PTS mannitol transporter subunit IIA. 	147
182115	PRK09855	PRK09855	PTS N-acetylgalactosamine transporter subunit IID. 	263
182116	PRK09856	PRK09856	fructoselysine 3-epimerase; Provisional	275
182117	PRK09857	PRK09857	recombination-promoting nuclease RpnA. 	292
137559	PRK09859	PRK09859	multidrug transporter subunit MdtE. 	385
182118	PRK09860	PRK09860	putative alcohol dehydrogenase; Provisional	383
182119	PRK09861	PRK09861	lipoprotein NlpA. 	272
182120	PRK09862	PRK09862	ATP-dependent protease. 	506
182121	PRK09863	PRK09863	putative frv operon regulatory protein; Provisional	584
182122	PRK09864	PRK09864	aminopeptidase. 	356
182123	PRK09866	PRK09866	clamp-binding protein CrfC. 	741
182124	PRK09867	PRK09867	hypothetical protein; Provisional	209
182125	PRK09870	PRK09870	tyrosine recombinase; Provisional	200
182126	PRK09871	PRK09871	tyrosine recombinase; Provisional	198
182127	PRK09874	PRK09874	multidrug efflux MFS transporter MdtG. 	408
182128	PRK09875	PRK09875	phosphotriesterase-related protein. 	292
182129	PRK09877	PRK09877	2,3-diketo-L-gulonate TRAP transporter small permease protein YiaM; Provisional	157
182130	PRK09880	PRK09880	L-idonate 5-dehydrogenase; Provisional	343
182131	PRK09881	PRK09881	D,D-dipeptide ABC transporter permease. 	296
236631	PRK09885	PRK09885	type II toxin-antitoxin system YafO family toxin. 	132
77467	PRK09890	PRK09890	cold shock protein CspG; Provisional	70
170147	PRK09891	PRK09891	protein YmcE. 	76
182133	PRK09894	PRK09894	diguanylate cyclase; Provisional	296
182134	PRK09897	PRK09897	FAD-NAD(P)-binding protein. 	534
182135	PRK09898	PRK09898	ferredoxin-like protein. 	208
182136	PRK09902	PRK09902	lipopolysaccharide kinase InaA. 	216
104216	PRK09903	PRK09903	transporter YfdV. 	314
182137	PRK09906	PRK09906	DNA-binding transcriptional regulator HcaR; Provisional	296
182138	PRK09907	PRK09907	endoribonuclease MazF. 	111
182139	PRK09908	PRK09908	xanthine dehydrogenase iron sulfur-binding subunit XdhC. 	159
182140	PRK09912	PRK09912	L-glyceraldehyde 3-phosphate reductase; Provisional	346
182141	PRK09913	PRK09913	PTS fructose transporter subunit IIA. 	148
182142	PRK09915	PRK09915	MdtP family multidrug efflux transporter outer membrane subunit. 	488
182143	PRK09917	PRK09917	threonine/serine exporter. 	157
236632	PRK09918	PRK09918	putative fimbrial chaperone protein; Provisional	230
236633	PRK09919	PRK09919	anti-adapter protein IraM; Provisional	114
182146	PRK09920	PRK09920	acetyl-CoA:acetoacetyl-CoA transferase subunit alpha; Provisional	219
182147	PRK09921	PRK09921	permease DsdX; Provisional	445
182148	PRK09922	PRK09922	lipopolysaccharide 1,6-galactosyltransferase. 	359
137592	PRK09925	PRK09925	leu operon leader peptide; Provisional	28
236634	PRK09926	PRK09926	fimbrial chaperone. 	246
236635	PRK09928	PRK09928	choline transport protein BetT; Provisional	679
182151	PRK09929	PRK09929	hypothetical protein; Provisional	91
182152	PRK09932	PRK09932	glycerate 3-kinase. 	381
182153	PRK09934	PRK09934	fimbriae assembly protein. 	171
182154	PRK09935	PRK09935	fimbriae biosynthesis transcriptional regulator FimZ. 	210
182155	PRK09936	PRK09936	DUF4434 family protein. 	296
77494	PRK09937	PRK09937	cold shock-like protein CspD. 	74
182156	PRK09939	PRK09939	acid resistance putative oxidoreductase YdeP. 	759
182157	PRK09940	PRK09940	transcriptional regulator YdeO; Provisional	253
182158	PRK09943	PRK09943	HTH-type transcriptional regulator PuuR. 	185
137602	PRK09945	PRK09945	hypothetical protein; Provisional	418
182159	PRK09946	PRK09946	hypothetical protein; Provisional	270
182160	PRK09947	PRK09947	YdhW family putative oxidoreductase system protein. 	215
236636	PRK09950	PRK09950	putative transporter; Provisional	506
182162	PRK09951	PRK09951	hypothetical protein; Provisional	222
182163	PRK09952	PRK09952	shikimate transporter; Provisional	438
182164	PRK09953	wcaD	putative colanic acid biosynthesis protein; Provisional	404
182165	PRK09954	PRK09954	sugar kinase. 	362
182166	PRK09955	rihB	ribosylpyrimidine nucleosidase. 	313
182167	PRK09956	PRK09956	ISNCY family transposase. 	308
182168	PRK09958	PRK09958	acid-sensing system DNA-binding response regulator EvgA. 	204
182169	PRK09959	PRK09959	acid-sensing system histidine kinase EvgS. 	1197
182170	PRK09961	PRK09961	aminopeptidase. 	344
170182	PRK09965	PRK09965	3-phenylpropionate dioxygenase ferredoxin subunit; Provisional	106
182171	PRK09966	PRK09966	diguanylate cyclase DgcN. 	407
182172	PRK09967	PRK09967	OmpA family protein. 	160
182173	PRK09968	PRK09968	protein-serine/threonine phosphatase. 	218
236637	PRK09970	PRK09970	xanthine dehydrogenase subunit XdhA; Provisional	759
182175	PRK09971	PRK09971	xanthine dehydrogenase subunit XdhB; Provisional	291
170188	PRK09973	PRK09973	lipoprotein YqhH. 	85
236638	PRK09974	PRK09974	type II toxin-antitoxin system PrlF family antitoxin. 	111
182177	PRK09975	PRK09975	DNA-binding transcriptional regulator EnvR; Provisional	213
182178	PRK09977	PRK09977	MgtC/SapB family protein. 	215
137624	PRK09978	PRK09978	DNA-binding transcriptional regulator GadX; Provisional	274
77522	PRK09979	PRK09979	rho operon leader peptide rhoL. 	33
182179	PRK09980	ompL	porin OmpL. 	230
182180	PRK09981	PRK09981	DUF406 domain-containing protein. 	99
137627	PRK09982	PRK09982	universal stress protein UspD; Provisional	142
182181	PRK09983	pflD	putative formate acetyltransferase 2; Provisional	765
182182	PRK09984	PRK09984	phosphonate ABC transporter ATP-binding protein. 	262
182183	PRK09986	PRK09986	LysR family transcriptional regulator. 	294
182184	PRK09987	PRK09987	dTDP-4-dehydrorhamnose reductase; Provisional	299
182185	PRK09989	PRK09989	HPr family phosphocarrier protein. 	258
182186	PRK09990	PRK09990	DNA-binding transcriptional regulator GlcC; Provisional	251
182187	PRK09993	PRK09993	C-lysozyme inhibitor; Provisional	153
182188	PRK09997	PRK09997	hydroxypyruvate isomerase; Provisional	258
182189	PRK10001	PRK10001	serine-type D-Ala-D-Ala carboxypeptidase. 	400
236639	PRK10002	PRK10002	porin OmpF. 	362
236640	PRK10003	PRK10003	ferric-rhodotorulic acid outer membrane transporter; Provisional	729
182192	PRK10005	PRK10005	dihydroxyacetone kinase ADP-binding subunit DhaL. 	210
182193	PRK10014	PRK10014	DNA-binding transcriptional repressor MalI; Provisional	342
182194	PRK10015	PRK10015	oxidoreductase; Provisional	429
182195	PRK10016	PRK10016	DNA gyrase inhibitor SbmC. 	156
182196	PRK10017	PRK10017	colanic acid biosynthesis protein; Provisional	426
182197	PRK10018	PRK10018	colanic acid biosynthesis glycosyltransferase WcaA. 	279
236641	PRK10019	PRK10019	nickel/cobalt efflux transporter RcnA. 	279
182199	PRK10022	PRK10022	putative DNA-binding transcriptional regulator; Provisional	167
182200	PRK10026	PRK10026	arsenate reductase (glutaredoxin). 	141
182201	PRK10027	PRK10027	cryptic adenine deaminase; Provisional	588
236642	PRK10030	PRK10030	YiiX family permuted papain-like enzyme. 	197
182203	PRK10034	PRK10034	gluconate transporter GntP. 	447
182204	PRK10037	PRK10037	cellulose biosynthesis protein BcsQ. 	250
170217	PRK10039	PRK10039	hypothetical protein; Provisional	127
182205	PRK10040	PRK10040	hypothetical protein; Provisional	52
236643	PRK10044	PRK10044	ferrichrome outer membrane transporter; Provisional	727
182207	PRK10045	PRK10045	ACP phosphodiesterase. 	193
182208	PRK10046	dpiA	two-component response regulator DpiA; Provisional	225
236644	PRK10049	pgaA	outer membrane protein PgaA; Provisional	765
182210	PRK10050	PRK10050	curli production assembly/transport protein CsgF. 	138
182211	PRK10051	csgA	major curlin subunit CsgA. 	151
182212	PRK10053	PRK10053	YdeI family stress tolerance OB fold protein. 	130
182213	PRK10054	PRK10054	efflux MFS transporter YdeE. 	395
182214	PRK10057	rpsV	stationary-phase-induced ribosome-associated protein. 	44
236645	PRK10060	PRK10060	cyclic di-GMP phosphodiesterase. 	663
182216	PRK10061	PRK10061	DNA damage-inducible protein YebG; Provisional	96
182217	PRK10062	PRK10062	hypothetical protein; Provisional	303
182218	PRK10063	PRK10063	colanic acid biosynthesis glycosyltransferase WcaE. 	248
236646	PRK10064	PRK10064	catecholate siderophore receptor CirA; Provisional	663
236647	PRK10069	PRK10069	3-phenylpropionate/cinnamic acid dioxygenase subunit beta. 	183
182221	PRK10070	PRK10070	proline/glycine betaine ABC transporter ATP-binding protein ProV. 	400
182222	PRK10072	PRK10072	HTH-type transcriptional regulator. 	96
182223	PRK10073	PRK10073	putative glycosyl transferase; Provisional	328
182224	PRK10076	PRK10076	pyruvate formate lyase II activase; Provisional	213
182225	PRK10077	xylE	D-xylose transporter XylE; Provisional	479
236648	PRK10078	PRK10078	ribose 1,5-bisphosphokinase; Provisional	186
182227	PRK10079	PRK10079	phosphonate metabolism transcriptional regulator PhnF; Provisional	241
170240	PRK10081	PRK10081	lipoprotein toxin entericidin B. 	48
182228	PRK10082	PRK10082	hypochlorite stress DNA-binding transcriptional regulator HypT. 	303
182229	PRK10083	PRK10083	putative oxidoreductase; Provisional	339
236649	PRK10084	PRK10084	dTDP-glucose 4,6 dehydratase; Provisional	352
182231	PRK10086	PRK10086	DNA-binding transcriptional regulator DsdC. 	311
182232	PRK10089	PRK10089	chaperone CsaA. 	112
182233	PRK10090	PRK10090	aldehyde dehydrogenase A; Provisional	409
182234	PRK10091	PRK10091	MFS transport protein AraJ; Provisional	382
182235	PRK10092	PRK10092	maltose O-acetyltransferase; Provisional	183
182236	PRK10093	PRK10093	primosomal replication protein N''; Provisional	171
182237	PRK10094	PRK10094	HTH-type transcriptional activator AllS. 	308
236650	PRK10095	PRK10095	ribonuclease I; Provisional	268
182239	PRK10096	citG	triphosphoribosyl-dephospho-CoA synthase; Provisional	292
182240	PRK10098	PRK10098	putative dehydrogenase; Provisional	350
182241	PRK10100	PRK10100	transcriptional regulator CsgD. 	216
182242	PRK10101	csgB	curlin minor subunit CsgB; Provisional	151
182243	PRK10102	csgC	curli assembly protein CsgC; Provisional	110
236651	PRK10106	PRK10106	multiple antibiotic resistance protein MarB. 	65
182245	PRK10110	PRK10110	PTS maltose transporter subunit IICB. 	530
182246	PRK10113	PRK10113	cell division activator CedA. 	80
182247	PRK10115	PRK10115	protease 2; Provisional	686
182248	PRK10116	PRK10116	universal stress protein UspC; Provisional	142
182249	PRK10117	PRK10117	trehalose-6-phosphate synthase; Provisional	474
236652	PRK10118	PRK10118	flagellar hook length control protein FliK. 	408
182251	PRK10119	PRK10119	putative hydrolase; Provisional	231
182252	PRK10122	PRK10122	UTP--glucose-1-phosphate uridylyltransferase GalF. 	297
182253	PRK10123	wcaM	putative colanic acid biosynthesis protein; Provisional	464
182254	PRK10124	PRK10124	putative UDP-glucose lipid carrier transferase; Provisional	463
182255	PRK10125	PRK10125	colanic acid biosynthesis glycosyltransferase WcaC. 	405
182256	PRK10126	PRK10126	low molecular weight protein-tyrosine-phosphatase Wzb. 	147
182257	PRK10128	PRK10128	2-keto-3-deoxy-L-rhamnonate aldolase; Provisional	267
182258	PRK10130	PRK10130	HTH-type transcriptional regulator EutR. 	350
182259	PRK10132	PRK10132	hypothetical protein; Provisional	108
182260	PRK10133	PRK10133	L-fucose:H+ symporter permease. 	438
236653	PRK10137	PRK10137	alpha-glucosidase; Provisional	786
182262	PRK10139	PRK10139	serine endoprotease DegQ. 	455
182263	PRK10140	PRK10140	N-acetyltransferase. 	162
236654	PRK10141	PRK10141	DNA-binding transcriptional repressor ArsR; Provisional	117
182265	PRK10144	PRK10144	formate-dependent nitrite reductase complex subunit NrfF; Provisional	126
182266	PRK10146	PRK10146	aminoalkylphosphonate N-acetyltransferase. 	144
236655	PRK10147	phnH	phosphonate C-P lyase system protein PhnH. 	196
236656	PRK10148	PRK10148	VOC family metalloprotein YjdN. 	147
236657	PRK10150	PRK10150	beta-D-glucuronidase; Provisional	604
182270	PRK10151	PRK10151	50S ribosomal protein L7/L12-serine acetyltransferase. 	179
236658	PRK10153	PRK10153	DNA-binding transcriptional activator CadC; Provisional	517
182272	PRK10154	PRK10154	DUF2541 family protein. 	134
182273	PRK10157	PRK10157	putative oxidoreductase FixC; Provisional	428
236659	PRK10158	PRK10158	bifunctional tRNA pseudouridine(32) synthase/23S rRNA pseudouridine(746) synthase RluA. 	219
182275	PRK10159	PRK10159	phosphoporin PhoE. 	351
182276	PRK10160	PRK10160	taurine ABC transporter permease TauC. 	275
182277	PRK10161	PRK10161	phosphate response regulator transcription factor PhoB. 	229
236660	PRK10162	PRK10162	acetyl esterase. 	318
182279	PRK10163	PRK10163	HTH-type transcriptional repressor AllR. 	271
182280	PRK10167	PRK10167	hypothetical protein; Provisional	169
182281	PRK10170	PRK10170	Ni/Fe-hydrogenase large subunit. 	597
182282	PRK10171	PRK10171	Ni/Fe-hydrogenase b-type cytochrome subunit. 	235
182283	PRK10172	PRK10172	AppA family phytase/histidine-type acid phosphatase. 	436
182284	PRK10173	PRK10173	glucose-1-phosphatase/inositol phosphatase; Provisional	413
182285	PRK10174	PRK10174	hypothetical protein; Provisional	75
182286	PRK10175	PRK10175	YceK/YidQ family lipoprotein. 	75
236661	PRK10177	PRK10177	YchO/YchP family invasin. 	465
236662	PRK10178	PRK10178	D-alanyl-D-alanine dipeptidase; Provisional	184
182289	PRK10179	PRK10179	formate dehydrogenase-N subunit gamma; Provisional	217
182290	PRK10183	PRK10183	hypothetical protein; Provisional	56
182291	PRK10187	PRK10187	trehalose-6-phosphate phosphatase; Provisional	266
182292	PRK10188	PRK10188	transcriptional regulator SdiA. 	240
182293	PRK10189	PRK10189	EmmdR/YeeO family multidrug/toxin efflux MATE transporter. 	478
182294	PRK10190	PRK10190	L,D-transpeptidase; Provisional	310
182295	PRK10191	PRK10191	putative acyl transferase; Provisional	146
182296	PRK10194	PRK10194	ferredoxin-type protein NapF. 	163
182297	PRK10197	PRK10197	GABA permease. 	446
182298	PRK10198	PRK10198	formate hydrogenlyase regulator HycA. 	152
182299	PRK10199	PRK10199	alkaline phosphatase isozyme conversion aminopeptidase; Provisional	346
182300	PRK10200	PRK10200	putative racemase; Provisional	230
182301	PRK10201	PRK10201	G/U mismatch-specific DNA glycosylase. 	168
182302	PRK10202	ebgC	beta-galactosidase subunit beta. 	149
182303	PRK10203	PRK10203	hypothetical protein; Provisional	122
182304	PRK10204	PRK10204	hypothetical protein; Provisional	55
182305	PRK10206	PRK10206	putative oxidoreductase; Provisional	344
182306	PRK10207	PRK10207	dipeptide/tripeptide permease DtpB. 	489
182307	PRK10208	PRK10208	acid-activated periplasmic chaperone HdeA. 	114
182308	PRK10209	PRK10209	HdeD family acid-resistance protein. 	190
182309	PRK10213	nepI	purine ribonucleoside efflux pump NepI. 	394
182310	PRK10214	PRK10214	ilvB operon leader peptide IvbL. 	32
236663	PRK10215	PRK10215	hypothetical protein; Provisional	218
182312	PRK10216	PRK10216	HTH-type transcriptional regulator YidZ. 	319
182313	PRK10217	PRK10217	dTDP-glucose 4,6-dehydratase; Provisional	355
104396	PRK10218	PRK10218	translational GTPase TypA. 	607
182314	PRK10219	PRK10219	superoxide response transcriptional regulator SoxS. 	107
182315	PRK10220	PRK10220	phnA family protein. 	111
182316	PRK10222	PRK10222	PTS ascorbate transporter subunit IIB. 	85
236664	PRK10224	PRK10224	pyr operon leader peptide. 	44
182318	PRK10225	PRK10225	Uxu operon transcriptional regulator. 	257
182319	PRK10226	PRK10226	isoaspartyl peptidase; Provisional	313
182320	PRK10227	PRK10227	HTH-type transcriptional regulator CueR. 	135
236665	PRK10229	PRK10229	threonine efflux system; Provisional	206
182322	PRK10234	PRK10234	transcriptional regulator GutM. 	118
182323	PRK10236	PRK10236	acidic protein MsyB. 	237
182324	PRK10238	PRK10238	aromatic amino acid transporter AroP. 	456
182325	PRK10239	PRK10239	2-amino-4-hydroxy-6-hydroxymethyldihydropteridine diphosphokinase. 	159
182326	PRK10240	PRK10240	(2E,6E)-farnesyl-diphosphate-specific ditrans,polycis-undecaprenyl-diphosphate synthase. 	229
182327	PRK10241	PRK10241	hydroxyacylglutathione hydrolase; Provisional	251
236666	PRK10244	PRK10244	anti-adapter protein IraP. 	88
182329	PRK10245	adrA	diguanylate cyclase AdrA; Provisional	366
182330	PRK10246	PRK10246	exonuclease subunit SbcC; Provisional	1047
182331	PRK10247	PRK10247	putative ABC transporter ATP-binding protein YbbL; Provisional	225
236667	PRK10249	PRK10249	phenylalanine transporter; Provisional	458
182333	PRK10250	PRK10250	MmcQ/YjbR family DNA-binding protein. 	122
182334	PRK10251	PRK10251	enterobactin synthase subunit EntD. 	207
236668	PRK10252	entF	enterobactin non-ribosomal peptide synthetase EntF. 	1296
182336	PRK10253	PRK10253	iron-enterobactin ABC transporter ATP-binding protein. 	265
182337	PRK10254	PRK10254	proofreading thioesterase EntH. 	137
182338	PRK10255	PRK10255	PTS system N-acetyl glucosamine specific transporter subunits IIABC; Provisional	648
182339	PRK10257	PRK10257	putative kinase inhibitor protein; Provisional	158
182340	PRK10258	PRK10258	biotin biosynthesis protein BioC; Provisional	251
137782	PRK10259	PRK10259	hypothetical protein; Provisional	86
182341	PRK10260	PRK10260	L,D-transpeptidase; Provisional	306
182342	PRK10261	PRK10261	glutathione transporter ATP-binding protein; Provisional	623
182343	PRK10262	PRK10262	thioredoxin reductase; Provisional	321
236669	PRK10263	PRK10263	DNA translocase FtsK; Provisional	1355
182345	PRK10264	PRK10264	hydrogenase 1 maturation protease; Provisional	195
182346	PRK10265	PRK10265	chaperone modulator CbpM. 	101
182347	PRK10266	PRK10266	curved DNA-binding protein. 	306
182348	PRK10270	PRK10270	putative aminodeoxychorismate lyase; Provisional	340
182349	PRK10271	thiK	thiamine kinase; Provisional	188
182350	PRK10276	PRK10276	translesion error-prone DNA polymerase V autoproteolytic subunit. 	139
182351	PRK10278	PRK10278	SirB family protein. 	130
182352	PRK10279	PRK10279	patatin-like phospholipase RssA. 	300
182353	PRK10280	PRK10280	peptidyl-dipeptidase Dcp. 	681
182354	PRK10281	PRK10281	PhzF family isomerase. 	299
182355	PRK10286	PRK10286	methylated-DNA--[protein]-cysteine S-methyltransferase. 	171
182356	PRK10287	PRK10287	thiosulfate:cyanide sulfurtransferase; Provisional	104
182357	PRK10290	PRK10290	superoxide dismutase [Cu-Zn] SodC2. 	173
182358	PRK10291	PRK10291	glyoxalase I; Provisional	129
182359	PRK10292	PRK10292	fumarate hydratase FumD. 	69
182360	PRK10293	PRK10293	1,4-dihydroxy-2-naphthoyl-CoA hydrolase. 	136
182361	PRK10294	PRK10294	6-phosphofructokinase 2; Provisional	309
182362	PRK10296	PRK10296	DNA-binding transcriptional regulator ChbR; Provisional	278
182363	PRK10297	PRK10297	PTS system N,N'-diacetylchitobiose-specific transporter subunit IIC; Provisional	452
182364	PRK10299	PRK10299	PhoP/PhoQ regulator MgrB. 	47
182365	PRK10301	PRK10301	CopC domain-containing protein YobA. 	124
182366	PRK10302	PRK10302	hypothetical protein; Provisional	272
182367	PRK10304	PRK10304	non-heme ferritin. 	165
182368	PRK10306	PRK10306	zinc/cadmium-binding protein; Provisional	216
236670	PRK10307	PRK10307	colanic acid biosynthesis glycosyltransferase WcaI. 	412
236671	PRK10308	PRK10308	3-methyl-adenine DNA glycosylase II; Provisional	283
182371	PRK10309	PRK10309	galactitol-1-phosphate 5-dehydrogenase. 	347
182372	PRK10310	PRK10310	PTS galactitol transporter subunit IIB. 	94
182373	PRK10314	PRK10314	GNAT family N-acetyltransferase. 	153
182374	PRK10316	PRK10316	hypothetical protein; Provisional	209
236672	PRK10318	PRK10318	hypothetical protein; Provisional	121
182376	PRK10319	PRK10319	N-acetylmuramoyl-L-alanine amidase AmiA. 	287
182377	PRK10323	PRK10323	cysteine/O-acetylserine transporter. 	195
182378	PRK10324	PRK10324	ribosome-associated translation inhibitor RaiA. 	113
182379	PRK10325	PRK10325	heat shock protein GrpE; Provisional	197
182380	PRK10328	PRK10328	DNA-binding protein StpA. 	134
182381	PRK10329	PRK10329	glutaredoxin-like protein NrdH. 	81
182382	PRK10330	PRK10330	electron transport protein HydN. 	181
182383	PRK10331	PRK10331	L-fuculokinase; Provisional	470
182384	PRK10332	PRK10332	prepilin-type N-terminal cleavage/methylation domain-containing protein. 	107
182385	PRK10333	PRK10333	5-formyltetrahydrofolate cyclo-ligase family protein; Provisional	182
182386	PRK10334	PRK10334	small-conductance mechanosensitive channel MscS. 	286
182387	PRK10336	PRK10336	two-component system response regulator QseB. 	219
182388	PRK10337	PRK10337	sensor protein QseC; Provisional	449
182389	PRK10339	PRK10339	DNA-binding transcriptional repressor EbgR; Provisional	327
236673	PRK10340	ebgA	cryptic beta-D-galactosidase subunit alpha; Reviewed	1021
182391	PRK10341	PRK10341	transcriptional regulator TdcA. 	312
182392	PRK10342	PRK10342	glycerate kinase I; Provisional	381
182393	PRK10343	PRK10343	ribosome assembly RNA-binding protein YhbY. 	97
182394	PRK10344	PRK10344	DNA-binding transcriptional regulator SfsB. 	92
182395	PRK10345	PRK10345	PhoP regulatory network protein YrbL. 	210
182396	PRK10347	PRK10347	putative adenosine monophosphate-protein transferase Fic. 	200
182397	PRK10348	PRK10348	ribosome-associated heat shock protein Hsp15; Provisional	133
137836	PRK10349	PRK10349	pimeloyl-ACP methyl ester esterase BioH. 	256
182398	PRK10350	PRK10350	DUF2756 family protein. 	145
182399	PRK10351	PRK10351	4'-phosphopantetheinyl transferase AcpT. 	187
182400	PRK10352	PRK10352	nickel transporter permease NikB; Provisional	314
182401	PRK10353	PRK10353	DNA-3-methyladenine glycosylase I. 	187
182402	PRK10354	PRK10354	RNA chaperone/antiterminator CspA. 	70
182403	PRK10355	xylF	D-xylose ABC transporter substrate-binding protein. 	330
182404	PRK10356	PRK10356	protein bax. 	274
182405	PRK10357	PRK10357	putative glutathione S-transferase; Provisional	202
182406	PRK10358	PRK10358	tRNA (uridine(34)/cytosine(34)/5-carboxymethylaminomethyluridine(34)-2'-O)-methyltransferase TrmL. 	157
182407	PRK10359	PRK10359	lipopolysaccharide core heptose(II) kinase RfaY. 	232
182408	PRK10360	PRK10360	transcriptional regulator UhpA. 	196
182409	PRK10361	PRK10361	DNA recombination protein RmuC; Provisional	475
182410	PRK10363	cpxP	cell-envelope stress modulator CpxP. 	166
236674	PRK10364	PRK10364	two-component system sensor histidine kinase ZraS. 	457
182412	PRK10365	PRK10365	sigma-54-dependent response regulator transcription factor ZraR. 	441
182413	PRK10367	PRK10367	DNA-damage-inducible SOS response protein; Provisional	441
236675	PRK10369	PRK10369	heme lyase subunit NrfE; Provisional	571
182415	PRK10370	PRK10370	formate-dependent nitrite reductase complex subunit NrfG; Provisional	198
182416	PRK10371	PRK10371	transcriptional regulator MelR. 	302
182417	PRK10372	PRK10372	PTS ascorbate transporter subunit IIA. 	154
236676	PRK10376	PRK10376	putative oxidoreductase; Provisional	290
182419	PRK10377	PRK10377	PTS glucitol/sorbitol transporter subunit IIA. 	120
236677	PRK10378	PRK10378	inactive ferrous ion transporter periplasmic protein EfeO; Provisional	375
182421	PRK10380	PRK10380	hypothetical protein; Provisional	63
182422	PRK10381	PRK10381	LPS O-antigen length regulator; Provisional	377
182423	PRK10382	PRK10382	alkyl hydroperoxide reductase subunit C; Provisional	187
236678	PRK10386	PRK10386	curli production assembly/transport protein CsgE. 	130
236679	PRK10387	PRK10387	glutaredoxin 2; Provisional	210
182426	PRK10391	PRK10391	transcription modulator YdgT. 	71
236680	PRK10396	PRK10396	hypothetical protein; Provisional	221
182428	PRK10397	PRK10397	lipoprotein; Provisional	137
236681	PRK10401	PRK10401	HTH-type transcriptional regulator GalS. 	346
236682	PRK10402	PRK10402	DNA-binding transcriptional activator YeiL; Provisional	226
182431	PRK10403	PRK10403	nitrate/nitrite response regulator protein NarP. 	215
182432	PRK10404	PRK10404	stress response protein ElaB. 	101
182433	PRK10406	PRK10406	alpha-ketoglutarate transporter; Provisional	432
182434	PRK10408	PRK10408	L-valine transporter subunit YgaH. 	111
182435	PRK10409	PRK10409	HypC/HybG/HupF family hydrogenase formation chaperone. 	90
236683	PRK10410	PRK10410	nitrous oxide-stimulated promoter family protein. 	100
236684	PRK10411	PRK10411	L-fucose operon activator. 	240
182438	PRK10413	PRK10413	hydrogenase maturation factor HybG. 	82
236685	PRK10414	PRK10414	biopolymer transporter ExbB. 	244
182440	PRK10415	PRK10415	tRNA-dihydrouridine synthase B; Provisional	321
236686	PRK10416	PRK10416	signal recognition particle-docking protein FtsY; Provisional	318
236687	PRK10417	nikC	nickel transporter permease NikC; Provisional	272
236688	PRK10418	nikD	nickel transporter ATP-binding protein NikD; Provisional	254
236689	PRK10419	nikE	nickel ABC transporter ATP-binding protein NikE. 	268
182445	PRK10420	PRK10420	L-lactate permease; Provisional	551
236690	PRK10421	PRK10421	DNA-binding transcriptional repressor LldR; Provisional	253
182447	PRK10422	PRK10422	lipopolysaccharide core biosynthesis protein; Provisional	352
182448	PRK10423	PRK10423	transcriptional repressor RbsR; Provisional	327
170429	PRK10424	PRK10424	ilv operon leader peptide. 	32
182449	PRK10425	PRK10425	3'-5' ssDNA/RNA exonuclease TatD. 	258
236691	PRK10426	PRK10426	alpha-glucosidase; Provisional	635
182451	PRK10427	PRK10427	PTS fructose-like transporter subunit IIB. 	114
182452	PRK10428	PRK10428	hypothetical protein; Provisional	69
182453	PRK10429	PRK10429	melibiose:sodium transporter MelB. 	473
182454	PRK10430	PRK10430	two-component system response regulator DcuR. 	239
236692	PRK10431	PRK10431	N-acetylmuramoyl-l-alanine amidase II; Provisional	445
236693	PRK10433	PRK10433	putative RNA methyltransferase; Provisional	228
182457	PRK10434	srlR	DNA-binding transcriptional repressor. 	256
182458	PRK10435	cadB	cadaverine/lysine antiporter. 	435
236694	PRK10436	PRK10436	hypothetical protein; Provisional	462
182460	PRK10437	PRK10437	carbonic anhydrase; Provisional	220
182461	PRK10438	PRK10438	C-N hydrolase family amidase; Provisional	256
236695	PRK10439	PRK10439	enterobactin/ferric enterobactin esterase; Provisional	411
182463	PRK10440	PRK10440	iron-enterobactin ABC transporter permease. 	330
182464	PRK10441	PRK10441	Fe(3+)-siderophore ABC transporter permease. 	335
182465	PRK10443	rihA	ribonucleoside hydrolase 1; Provisional	311
182466	PRK10444	PRK10444	HAD-IIA family hydrolase. 	248
182467	PRK10445	PRK10445	endonuclease VIII; Provisional	263
182468	PRK10446	PRK10446	30S ribosomal protein S6--L-glutamate ligase. 	300
182469	PRK10447	PRK10447	FtsH protease modulator YccA. 	219
182470	PRK10449	PRK10449	heat shock protein HslJ. 	140
182471	PRK10452	PRK10452	multidrug/spermidine efflux SMR transporter subunit MdtJ. 	120
182472	PRK10454	PRK10454	PTS N,N'-diacetylchitobiose transporter subunit IIA. 	115
182473	PRK10455	PRK10455	periplasmic protein; Reviewed	161
182474	PRK10456	PRK10456	arginine succinyltransferase; Provisional	344
182475	PRK10457	PRK10457	hypothetical protein; Provisional	82
236696	PRK10458	PRK10458	DNA cytosine methylase; Provisional	467
236697	PRK10459	PRK10459	MOP flippase family protein. 	492
182478	PRK10461	PRK10461	thiamine biosynthesis lipoprotein ApbE; Provisional	350
182479	PRK10463	PRK10463	hydrogenase nickel incorporation protein HypB; Provisional	290
182480	PRK10465	PRK10465	hydrogenase-2 assembly chaperone. 	159
182481	PRK10466	hybD	HyaD/HybD family hydrogenase maturation endopeptidase. 	164
182482	PRK10467	PRK10467	hydrogenase 2 large subunit; Provisional	567
182483	PRK10468	PRK10468	hydrogenase 2 small subunit; Provisional	371
182484	PRK10470	PRK10470	ribosome hibernation promoting factor. 	95
182485	PRK10472	PRK10472	low affinity gluconate transporter; Provisional	445
182486	PRK10473	PRK10473	MdtL family multidrug efflux MFS transporter. 	392
170468	PRK10474	PRK10474	PTS fructose-like transporter subunit IIB. 	88
236698	PRK10475	PRK10475	23S rRNA pseudouridine(2604) synthase RluF. 	290
182488	PRK10476	PRK10476	multidrug transporter subunit MdtN. 	346
182489	PRK10477	PRK10477	outer membrane lipoprotein Blc; Provisional	177
182490	PRK10478	PRK10478	PTS fructose transporter subunit EIIC. 	359
182491	PRK10481	PRK10481	hypothetical protein; Provisional	224
182492	PRK10483	PRK10483	tryptophan permease; Provisional	414
236699	PRK10484	PRK10484	putative transporter; Provisional	523
182494	PRK10486	PRK10486	(4S)-4-hydroxy-5-phosphonooxypentane-2,3-dione isomerase. 	96
236700	PRK10489	PRK10489	enterobactin transporter EntS. 	417
236701	PRK10490	PRK10490	sensor protein KdpD; Provisional	895
236702	PRK10494	PRK10494	envelope biogenesis factor ElyC. 	259
182498	PRK10497	PRK10497	phage shock protein PspD. 	73
182499	PRK10499	PRK10499	PTS sugar transporter subunit IIB. 	106
236703	PRK10502	PRK10502	putative acyl transferase; Provisional	182
182501	PRK10503	PRK10503	MdtB/MuxB family multidrug efflux RND transporter permease subunit. 	1040
182502	PRK10504	PRK10504	putative transporter; Provisional	471
236704	PRK10506	PRK10506	prepilin peptidase-dependent protein. 	162
182504	PRK10507	PRK10507	bifunctional glutathionylspermidine amidase/glutathionylspermidine synthetase; Provisional	619
182505	PRK10508	PRK10508	luciferase-like monooxygenase. 	333
182506	PRK10509	PRK10509	bacterioferritin-associated ferredoxin; Provisional	64
182507	PRK10510	PRK10510	OmpA family lipoprotein. 	219
182508	PRK10512	PRK10512	selenocysteinyl-tRNA-specific translation factor; Provisional	614
182509	PRK10513	PRK10513	sugar phosphate phosphatase; Provisional	270
182510	PRK10514	PRK10514	putative acetyltransferase; Provisional	145
170492	PRK10515	PRK10515	hypothetical protein; Provisional	90
236705	PRK10517	PRK10517	magnesium-transporting P-type ATPase MgtA. 	902
236706	PRK10518	PRK10518	alkaline phosphatase; Provisional	476
182513	PRK10519	PRK10519	hypothetical protein; Provisional	151
182514	PRK10520	rhtB	homoserine/homoserine lactone efflux protein; Provisional	205
236707	PRK10522	PRK10522	multidrug transporter membrane component/ATP-binding component; Provisional	547
236708	PRK10523	PRK10523	envelope stress response activation lipoprotein NlpE. 	234
182517	PRK10524	prpE	propionyl-CoA synthetase; Provisional	629
182518	PRK10525	PRK10525	cytochrome o ubiquinol oxidase subunit II; Provisional	315
182519	PRK10526	PRK10526	acyl-CoA thioesterase II; Provisional	286
182520	PRK10527	PRK10527	DUF454 family protein. 	125
182521	PRK10528	PRK10528	multifunctional acyl-CoA thioesterase I and protease I and lysophospholipase L1; Provisional	191
182522	PRK10529	PRK10529	DNA-binding transcriptional activator KdpE; Provisional	225
182523	PRK10530	PRK10530	pyridoxal phosphate (PLP) phosphatase; Provisional	272
236709	PRK10531	PRK10531	putative acyl-CoA thioester hydrolase. 	422
182525	PRK10532	PRK10532	threonine and homoserine efflux system; Provisional	293
182526	PRK10533	PRK10533	putative lipoprotein; Provisional	171
236710	PRK10534	PRK10534	L-threonine aldolase; Provisional	333
182528	PRK10535	PRK10535	macrolide ABC transporter ATP-binding protein/permease MacB. 	648
182529	PRK10536	PRK10536	phosphate starvation-inducible protein PhoH. 	262
236711	PRK10537	PRK10537	voltage-gated potassium channel protein. 	393
182531	PRK10538	PRK10538	bifunctional NADP-dependent 3-hydroxy acid dehydrogenase/3-hydroxypropionate dehydrogenase YdfG. 	248
182532	PRK10540	PRK10540	osmotically-inducible lipoprotein OsmB. 	72
182533	PRK10542	PRK10542	glutathionine S-transferase; Provisional	201
182534	PRK10543	PRK10543	superoxide dismutase [Fe]. 	193
182535	PRK10545	PRK10545	excinuclease Cho. 	286
182536	PRK10546	PRK10546	pyrimidine (deoxy)nucleoside triphosphate diphosphatase. 	135
236712	PRK10547	PRK10547	chemotaxis protein CheA; Provisional	670
182538	PRK10548	PRK10548	flagella biosynthesis regulatory protein FliT. 	121
182539	PRK10549	PRK10549	two-component system sensor histidine kinase BaeS. 	466
236713	PRK10550	PRK10550	tRNA dihydrouridine(16) synthase DusC. 	312
182541	PRK10551	PRK10551	cyclic di-GMP phosphodiesterase. 	518
182542	PRK10553	PRK10553	chaperone NapD. 	87
182543	PRK10554	PRK10554	outer membrane porin protein C; Provisional	355
182544	PRK10555	PRK10555	multidrug efflux RND transporter permease AcrD. 	1037
182545	PRK10556	PRK10556	hypothetical protein; Provisional	111
236714	PRK10557	PRK10557	prepilin peptidase-dependent protein. 	192
182547	PRK10558	PRK10558	alpha-dehydro-beta-deoxy-D-glucarate aldolase; Provisional	256
182548	PRK10559	PRK10559	p-hydroxybenzoic acid efflux pump subunit AaeA. 	310
182549	PRK10560	hofQ	outer membrane porin HofQ; Provisional	386
182550	PRK10561	PRK10561	sn-glycerol-3-phosphate ABC transporter permease UgpA. 	280
236715	PRK10562	PRK10562	putative acetyltransferase; Provisional	145
182552	PRK10563	PRK10563	6-phosphogluconate phosphatase; Provisional	221
236716	PRK10564	PRK10564	maltose operon protein MalM. 	303
182554	PRK10565	PRK10565	putative carbohydrate kinase; Provisional	508
182555	PRK10566	PRK10566	esterase; Provisional	249
182556	PRK10568	PRK10568	molecular chaperone OsmY. 	203
182557	PRK10569	PRK10569	NAD(P)H-dependent FMN reductase; Provisional	191
236717	PRK10572	PRK10572	arabinose operon transcriptional regulator AraC. 	290
182559	PRK10573	PRK10573	protein transport protein HofC. 	399
236718	PRK10574	PRK10574	putative major pilin subunit; Provisional	146
182561	PRK10575	PRK10575	Fe3+-hydroxamate ABC transporter ATP-binding protein FhuC. 	265
236719	PRK10576	PRK10576	Fe(3+)-hydroxamate ABC transporter substrate-binding protein FhuD. 	292
236720	PRK10577	PRK10577	Fe(3+)-hydroxamate ABC transporter permease FhuB. 	668
182564	PRK10578	PRK10578	hypothetical protein; Provisional	207
182565	PRK10579	PRK10579	pyrimidine/purine nucleoside phosphorylase. 	94
182566	PRK10580	proY	putative proline-specific permease; Provisional	457
182567	PRK10581	PRK10581	(2E,6E)-farnesyl diphosphate synthase. 	299
182568	PRK10582	PRK10582	cytochrome o ubiquinol oxidase subunit IV; Provisional	109
182569	PRK10584	PRK10584	putative ABC transporter ATP-binding protein YbbA; Provisional	228
182570	PRK10586	PRK10586	putative oxidoreductase; Provisional	362
236721	PRK10588	PRK10588	hypothetical protein; Provisional	97
236722	PRK10590	PRK10590	ATP-dependent RNA helicase RhlE; Provisional	456
182573	PRK10591	PRK10591	hypothetical protein; Provisional	92
182574	PRK10592	PRK10592	putrescine transporter subunit: membrane component of ABC superfamily; Provisional	281
182575	PRK10593	PRK10593	hypothetical protein; Provisional	297
236723	PRK10594	PRK10594	murein L,D-transpeptidase; Provisional	608
182577	PRK10595	PRK10595	cell division inhibitor SulA. 	164
182578	PRK10597	PRK10597	DNA damage-inducible protein I; Provisional	81
182579	PRK10598	PRK10598	lipoprotein; Provisional	186
182580	PRK10599	PRK10599	sodium-potassium/proton antiporter ChaA. 	366
182581	PRK10600	PRK10600	nitrate/nitrite two-component system sensor histidine kinase NarX. 	569
182582	PRK10602	PRK10602	murein tripeptide amidase MpaA. 	237
236724	PRK10604	PRK10604	sensor protein RstB; Provisional	433
182584	PRK10605	PRK10605	N-ethylmaleimide reductase; Provisional	362
182585	PRK10606	btuE	putative glutathione peroxidase; Provisional	183
170568	PRK10610	PRK10610	chemotaxis protein CheY. 	129
236725	PRK10611	PRK10611	protein-glutamate O-methyltransferase CheR. 	287
182587	PRK10612	PRK10612	chemotaxis protein CheW. 	167
182588	PRK10613	PRK10613	DUF2594 family protein. 	74
182589	PRK10614	PRK10614	multidrug efflux system subunit MdtC; Provisional	1025
182590	PRK10617	PRK10617	cytochrome c-type protein NapC; Provisional	200
236726	PRK10618	PRK10618	phosphotransfer intermediate protein in two-component regulatory system with RcsBC; Provisional	894
182592	PRK10619	PRK10619	histidine ABC transporter ATP-binding protein HisP. 	257
182593	PRK10621	PRK10621	hypothetical protein; Provisional	266
182594	PRK10622	pheA	bifunctional chorismate mutase/prephenate dehydratase; Provisional	386
182595	PRK10624	PRK10624	L-1,2-propanediol oxidoreductase; Provisional	382
236727	PRK10625	tas	putative aldo-keto reductase; Provisional	346
182597	PRK10626	PRK10626	hypothetical protein; Provisional	239
182598	PRK10628	PRK10628	LigB family dioxygenase; Provisional	246
236728	PRK10629	PRK10629	EnvZ/OmpR regulon moderator MzrA. 	127
182600	PRK10631	PRK10631	p-hydroxybenzoic acid efflux subunit AaeB; Provisional	652
182601	PRK10632	PRK10632	HTH-type transcriptional activator AaeR. 	309
182602	PRK10633	PRK10633	hypothetical protein; Provisional	80
182603	PRK10634	PRK10634	L-threonylcarbamoyladenylate synthase type 1 TsaC. 	190
182604	PRK10635	PRK10635	bacterioferritin; Provisional	158
236729	PRK10636	PRK10636	putative ABC transporter ATP-binding protein; Provisional	638
182606	PRK10637	cysG	siroheme synthase CysG. 	457
182607	PRK10638	PRK10638	glutaredoxin 3; Provisional	83
182608	PRK10639	PRK10639	formate dehydrogenase cytochrome b556 subunit. 	211
182609	PRK10640	rhaB	rhamnulokinase; Provisional	471
236730	PRK10641	btuB	TonB-dependent vitamin B12 receptor BtuB. 	614
182611	PRK10642	PRK10642	proline/glycine betaine transporter ProP. 	490
182612	PRK10643	PRK10643	two-component system response regulator PmrA. 	222
182613	PRK10644	PRK10644	arginine/agmatine antiporter. 	445
182614	PRK10645	PRK10645	divalent cation tolerance protein CutA. 	112
182615	PRK10646	PRK10646	tRNA (adenosine(37)-N6)-threonylcarbamoyltransferase complex ATPase subunit type 1 TsaE. 	153
182616	PRK10647	PRK10647	ferric iron reductase involved in ferric hydroxamate transport; Provisional	262
182617	PRK10649	PRK10649	phosphoethanolamine transferase CptA. 	577
182618	PRK10650	PRK10650	multidrug/spermidine efflux SMR transporter subunit MdtI. 	109
182619	PRK10651	PRK10651	transcriptional regulator NarL; Provisional	216
182620	PRK10653	PRK10653	ribose ABC transporter substrate-binding protein RbsB. 	295
182621	PRK10654	dcuC	C4-dicarboxylate transporter DcuC; Provisional	455
182622	PRK10655	potE	putrescine transporter; Provisional	438
182623	PRK10657	PRK10657	isoaspartyl dipeptidase; Provisional	388
236731	PRK10658	PRK10658	putative alpha-glucosidase; Provisional	665
182625	PRK10659	PRK10659	acetate uptake transporter. 	188
182626	PRK10660	tilS	tRNA(Ile)-lysidine synthetase; Provisional	436
236732	PRK10662	PRK10662	beta-lactam binding protein AmpH; Provisional	378
182628	PRK10663	PRK10663	cytochrome o ubiquinol oxidase subunit III; Provisional	204
170612	PRK10664	PRK10664	DNA-binding protein HU-beta. 	90
182629	PRK10665	PRK10665	P-II family nitrogen regulator. 	112
182630	PRK10666	PRK10666	ammonium transporter AmtB. 	428
182631	PRK10667	PRK10667	Hha toxicity modulator TomB. 	122
182632	PRK10668	PRK10668	DNA-binding transcriptional repressor AcrR; Provisional	215
182633	PRK10669	PRK10669	putative cation:proton antiport protein; Provisional	558
182634	PRK10670	PRK10670	Cys-tRNA(Pro)/Cys-tRNA(Cys) deacylase YbaK. 	159
182635	PRK10671	copA	copper-exporting P-type ATPase CopA. 	834
236733	PRK10672	PRK10672	endolytic peptidoglycan transglycosylase RlpA. 	361
182637	PRK10673	PRK10673	esterase. 	255
236734	PRK10674	PRK10674	deoxyribodipyrimidine photolyase; Provisional	472
182639	PRK10675	PRK10675	UDP-galactose-4-epimerase; Provisional	338
182640	PRK10676	PRK10676	DNA-binding transcriptional regulator ModE; Provisional	263
182641	PRK10677	modA	molybdate transporter periplasmic protein; Provisional	257
182642	PRK10678	moaE	molybdopterin synthase catalytic subunit MoaE. 	150
182643	PRK10680	PRK10680	molybdopterin biosynthesis protein MoeA; Provisional	411
182644	PRK10681	PRK10681	DNA-binding transcriptional repressor DeoR; Provisional	252
182645	PRK10682	PRK10682	putrescine transporter subunit: periplasmic-binding component of ABC superfamily; Provisional	370
182646	PRK10683	PRK10683	putrescine transporter subunit: membrane component of ABC superfamily; Provisional	317
236735	PRK10684	PRK10684	HCP oxidoreductase, NADH-dependent; Provisional	332
182648	PRK10687	PRK10687	purine nucleoside phosphoramidase; Provisional	119
182649	PRK10689	PRK10689	transcription-repair coupling factor; Provisional	1147
182650	PRK10691	PRK10691	fumarylacetoacetate hydrolase family protein. 	219
182651	PRK10692	PRK10692	stress response protein YchH. 	92
182652	PRK10693	PRK10693	two-component system response regulator RssB. 	303
236736	PRK10694	PRK10694	acyl-CoA thioester hydrolase YciA. 	133
182654	PRK10695	PRK10695	YdbH family protein. 	859
236737	PRK10696	PRK10696	tRNA 2-thiocytidine biosynthesis protein TtcA; Provisional	258
182656	PRK10697	PRK10697	envelope stress response membrane protein PspC. 	118
182657	PRK10698	PRK10698	phage shock protein PspA; Provisional	222
182658	PRK10699	PRK10699	phosphatidylglycerophosphatase B; Provisional	244
182659	PRK10700	PRK10700	23S rRNA pseudouridine(2605) synthase RluB. 	289
236738	PRK10701	PRK10701	DNA-binding transcriptional regulator RstA; Provisional	240
182661	PRK10702	PRK10702	endonuclease III; Provisional	211
236739	PRK10703	PRK10703	HTH-type transcriptional repressor PurR. 	341
182663	PRK10707	PRK10707	putative NUDIX hydrolase; Provisional	190
182664	PRK10708	PRK10708	protein DsrB. 	62
182665	PRK10710	PRK10710	DNA-binding transcriptional regulator BaeR; Provisional	240
182666	PRK10711	PRK10711	hypothetical protein; Provisional	231
236740	PRK10712	PRK10712	PTS system fructose-specific transporter subunits IIBC; Provisional	563
182668	PRK10713	PRK10713	2Fe-2S ferredoxin-like protein. 	84
182669	PRK10714	PRK10714	undecaprenyl phosphate 4-deoxy-4-formamido-L-arabinose transferase; Provisional	325
182670	PRK10715	flk	flagella biosynthesis regulator Flk. 	335
236741	PRK10716	PRK10716	long-chain fatty acid transporter FadL. 	435
182672	PRK10717	PRK10717	cysteine synthase A; Provisional	330
236742	PRK10718	PRK10718	RpoE-regulated lipoprotein; Provisional	191
236743	PRK10719	eutA	ethanolamine ammonia-lyase reactivating factor EutA. 	475
236744	PRK10720	PRK10720	uracil transporter; Provisional	428
170660	PRK10721	PRK10721	hypothetical protein; Provisional	66
236745	PRK10722	PRK10722	two-component system QseEF-associated lipoprotein QseG. 	247
182677	PRK10723	PRK10723	polyphenol oxidase. 	243
182678	PRK10724	PRK10724	type II toxin-antitoxin system RatA family toxin. 	158
182679	PRK10725	PRK10725	fructose-1-phosphate/6-phosphogluconate phosphatase. 	188
236746	PRK10726	PRK10726	DUF3561 family protein. 	105
182681	PRK10727	PRK10727	HTH-type transcriptional regulator GalR. 	343
182682	PRK10729	nudF	ADP-ribose pyrophosphatase NudF; Provisional	202
182683	PRK10733	hflB	ATP-dependent zinc metalloprotease FtsH. 	644
182684	PRK10734	PRK10734	putative calcium/sodium:proton antiporter; Provisional	325
182685	PRK10735	tldD	protease TldD; Provisional	481
236747	PRK10736	PRK10736	DNA-protecting protein DprA. 	374
236748	PRK10737	PRK10737	peptidylprolyl isomerase. 	196
182688	PRK10738	PRK10738	OsmC family protein. 	134
170674	PRK10739	PRK10739	YhgN family NAAT transporter. 	197
182689	PRK10740	PRK10740	high-affinity branched-chain amino acid ABC transporter permease LivH. 	308
236749	PRK10742	PRK10742	16S rRNA (guanine(1516)-N(2))-methyltransferase RsmJ. 	250
182691	PRK10743	PRK10743	heat shock chaperone IbpA. 	137
182692	PRK10744	pstB	phosphate ABC transporter ATP-binding protein PstB. 	260
182693	PRK10745	trkD	low affinity potassium transporter Kup. 	622
182694	PRK10746	PRK10746	putative transport protein YifK; Provisional	461
182695	PRK10747	PRK10747	putative protoheme IX biogenesis protein; Provisional	398
182696	PRK10748	PRK10748	5-amino-6-(5-phospho-D-ribitylamino)uracil phosphatase YigB. 	238
182697	PRK10749	PRK10749	lysophospholipase L2; Provisional	330
182698	PRK10750	PRK10750	Trk system potassium transporter TrkH. 	483
236750	PRK10751	PRK10751	molybdopterin-guanine dinucleotide biosynthesis protein B; Provisional	173
182700	PRK10752	PRK10752	sulfate ABC transporter substrate-binding protein. 	329
138142	PRK10753	PRK10753	DNA-binding protein HU-alpha. 	90
182701	PRK10754	PRK10754	NADPH:quinone reductase. 	327
236751	PRK10755	PRK10755	two-component system sensor histidine kinase PmrB. 	356
236752	PRK10756	PRK10756	protein CreA. 	157
236753	PRK10757	PRK10757	inositol-1-monophosphatase. 	267
182705	PRK10759	PRK10759	YfiM family lipoprotein. 	106
236754	PRK10760	PRK10760	murein hydrolase B; Provisional	359
236755	PRK10762	PRK10762	D-ribose transporter ATP binding protein; Provisional	501
182708	PRK10763	PRK10763	phospholipase A; Provisional	289
236756	PRK10764	PRK10764	potassium-tellurite ethidium and proflavin transporter; Provisional	324
182710	PRK10765	PRK10765	oxygen-insensitive NADPH nitroreductase. 	240
182711	PRK10766	PRK10766	two-component system response regulator TorR. 	221
236757	PRK10767	PRK10767	chaperone protein DnaJ; Provisional	371
182713	PRK10768	PRK10768	ribonucleoside hydrolase RihC; Provisional	304
182714	PRK10769	folA	type 3 dihydrofolate reductase. 	159
236758	PRK10770	PRK10770	peptidyl-prolyl cis-trans isomerase SurA; Provisional	413
182716	PRK10771	thiQ	thiamine ABC transporter ATP-binding protein ThiQ. 	232
182717	PRK10772	PRK10772	cell division protein FtsL; Provisional	108
182718	PRK10773	murF	UDP-N-acetylmuramoyl-tripeptide--D-alanyl-D-alanine ligase; Reviewed	453
182719	PRK10774	PRK10774	cell division protein FtsW; Provisional	404
182720	PRK10775	PRK10775	cell division protein FtsQ; Provisional	276
182721	PRK10776	PRK10776	8-oxo-dGTP diphosphatase MutT. 	129
182722	PRK10778	dksA	RNA polymerase-binding protein DksA. 	151
182723	PRK10779	PRK10779	sigma E protease regulator RseP. 	449
182724	PRK10780	PRK10780	molecular chaperone Skp. 	165
182725	PRK10781	rcsF	Rcs stress response system protein RcsF. 	133
182726	PRK10782	PRK10782	D-methionine ABC transporter permease MetI. 	217
182727	PRK10783	mltD	membrane-bound lytic murein transglycosylase D; Provisional	456
236759	PRK10785	PRK10785	maltodextrin glucosidase; Provisional	598
182729	PRK10786	ribD	bifunctional diaminohydroxyphosphoribosylaminopyrimidine deaminase/5-amino-6-(5-phosphoribosylamino)uracil reductase RibD. 	367
182730	PRK10787	PRK10787	DNA-binding ATP-dependent protease La; Provisional	784
182731	PRK10788	PRK10788	periplasmic folding chaperone; Provisional	623
182732	PRK10789	PRK10789	SmdA family multidrug ABC transporter permease/ATP-binding protein. 	569
182733	PRK10790	PRK10790	SmdB family multidrug efflux ABC transporter permease/ATP-binding protein. 	592
182734	PRK10791	PRK10791	peptidylprolyl isomerase B. 	164
236760	PRK10792	PRK10792	bifunctional methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase FolD. 	285
182736	PRK10793	PRK10793	D-alanyl-D-alanine carboxypeptidase fraction A; Provisional	403
182737	PRK10794	PRK10794	rod shape-determining protein RodA. 	370
236761	PRK10795	PRK10795	penicillin-binding protein 2; Provisional	634
236762	PRK10796	PRK10796	LPS-assembly lipoprotein RlpB; Provisional	188
236763	PRK10797	PRK10797	glutamate and aspartate transporter subunit; Provisional	302
182741	PRK10799	PRK10799	type 2 GTP cyclohydrolase I. 	247
182742	PRK10800	PRK10800	acyl-CoA thioesterase YbgC; Provisional	130
182743	PRK10801	PRK10801	Tol-Pal system protein TolQ. 	227
182744	PRK10802	PRK10802	peptidoglycan-associated lipoprotein Pal. 	173
182745	PRK10803	PRK10803	tol-pal system protein YbgF; Provisional	263
182746	PRK10805	PRK10805	formate transporter; Provisional	285
182747	PRK10807	PRK10807	intermembrane transport protein PqiB. 	547
236764	PRK10808	PRK10808	outer membrane protein A; Reviewed	351
182749	PRK10809	PRK10809	30S ribosomal protein S5 alanine N-acetyltransferase. 	194
236765	PRK10810	PRK10810	anti-sigma-28 factor FlgM. 	98
236766	PRK10811	rne	ribonuclease E; Reviewed	1068
236767	PRK10812	PRK10812	putative DNAse; Provisional	265
182753	PRK10814	PRK10814	lipoprotein-releasing ABC transporter permease subunit LolC. 	399
182754	PRK10815	PRK10815	two-component system sensor histidine kinase PhoQ. 	485
182755	PRK10816	PRK10816	two-component system response regulator PhoP. 	223
182756	PRK10818	PRK10818	septum site-determining protein MinD. 	270
236768	PRK10819	PRK10819	transport protein TonB; Provisional	246
236769	PRK10820	PRK10820	transcriptional regulator TyrR. 	520
182759	PRK10824	PRK10824	Grx4 family monothiol glutaredoxin. 	115
236770	PRK10826	PRK10826	hexitol phosphatase HxpB. 	222
182761	PRK10828	PRK10828	putative oxidoreductase; Provisional	183
236771	PRK10829	PRK10829	ribonuclease D; Provisional	373
182763	PRK10832	PRK10832	CDP-diacylglycerol--glycerol-3-phosphate 3-phosphatidyltransferase. 	182
236772	PRK10833	PRK10833	putative assembly protein; Provisional	617
182765	PRK10834	PRK10834	outer membrane permeability protein SanA. 	239
182766	PRK10835	PRK10835	hypothetical protein; Provisional	373
182767	PRK10836	PRK10836	lysine transporter; Provisional	489
182768	PRK10837	PRK10837	putative DNA-binding transcriptional regulator; Provisional	290
236773	PRK10838	spr	bifunctional murein DD-endopeptidase/murein LD-carboxypeptidase. 	190
236774	PRK10839	PRK10839	16S rRNA pseudouridine(516) synthase RsuA. 	232
182771	PRK10840	PRK10840	transcriptional regulator RcsB; Provisional	216
182772	PRK10841	PRK10841	two-component system sensor histidine kinase RcsC. 	924
182773	PRK10845	PRK10845	colicin V production protein; Provisional	162
182774	PRK10846	PRK10846	bifunctional folylpolyglutamate synthase/dihydrofolate synthase; Provisional	416
182775	PRK10847	PRK10847	DedA family protein. 	219
182776	PRK10848	PRK10848	phosphohistidine phosphatase SixA. 	159
182777	PRK10850	PRK10850	phosphocarrier protein Hpr. 	85
182778	PRK10851	PRK10851	sulfate/thiosulfate ABC transporter ATP-binding protein CysA. 	353
236775	PRK10852	PRK10852	thiosulfate ABC transporter substrate-binding protein CysP. 	338
182780	PRK10853	PRK10853	putative reductase; Provisional	118
182781	PRK10854	PRK10854	exopolyphosphatase; Provisional	513
236776	PRK10856	PRK10856	cytoskeleton protein RodZ. 	331
236777	PRK10857	PRK10857	Fe-S cluster assembly transcriptional regulator IscR. 	164
182784	PRK10858	PRK10858	nitrogen regulatory protein P-II. 	112
236778	PRK10859	PRK10859	membrane-bound lytic murein transglycosylase MltF. 	482
182786	PRK10860	PRK10860	tRNA-specific adenosine deaminase; Provisional	172
182787	PRK10861	PRK10861	signal peptidase I. 	324
182788	PRK10862	PRK10862	SoxR-reducing system protein RseC. 	154
182789	PRK10863	PRK10863	anti-sigma-E factor RseA. 	216
236779	PRK10864	PRK10864	putative methyltransferase; Provisional	346
182791	PRK10865	PRK10865	ATP-dependent chaperone ClpB. 	857
182792	PRK10866	PRK10866	outer membrane protein assembly factor BamD. 	243
236780	PRK10867	PRK10867	signal recognition particle protein; Provisional	433
236781	PRK10869	PRK10869	recombination and repair protein; Provisional	553
182795	PRK10870	PRK10870	transcriptional repressor MprA; Provisional	176
236782	PRK10871	nlpD	murein hydrolase activator NlpD. 	319
182797	PRK10872	relA	(p)ppGpp synthetase I/GTP pyrophosphokinase; Provisional	743
182798	PRK10873	PRK10873	hypothetical protein; Provisional	131
182799	PRK10874	PRK10874	cysteine desulfurase CsdA. 	401
236783	PRK10875	recD	exodeoxyribonuclease V subunit alpha. 	615
236784	PRK10876	recB	exonuclease V subunit beta; Provisional	1181
182802	PRK10877	PRK10877	protein disulfide isomerase II DsbC; Provisional	232
182803	PRK10878	PRK10878	FAD assembly factor SdhE. 	72
182804	PRK10879	PRK10879	proline aminopeptidase P II; Provisional	438
182805	PRK10880	PRK10880	adenine DNA glycosylase. 	350
236785	PRK10881	PRK10881	Ni/Fe-hydrogenase cytochrome b subunit. 	394
236786	PRK10882	PRK10882	hydrogenase 2 operon protein HybA. 	328
182808	PRK10883	PRK10883	FtsI repressor; Provisional	471
182809	PRK10884	PRK10884	SH3 domain-containing protein; Provisional	206
182810	PRK10885	cca	multifunctional CCA addition/repair protein. 	409
182811	PRK10886	PRK10886	DnaA initiator-associating protein DiaA; Provisional	196
236787	PRK10887	glmM	phosphoglucosamine mutase; Provisional	443
182813	PRK10888	PRK10888	octaprenyl diphosphate synthase; Provisional	323
182814	PRK10892	PRK10892	arabinose-5-phosphate isomerase KdsD. 	326
236788	PRK10893	PRK10893	LPS export ABC transporter periplasmic protein LptC. 	192
182816	PRK10894	PRK10894	lipopolysaccharide ABC transporter substrate-binding protein LptA. 	180
182817	PRK10895	PRK10895	lipopolysaccharide ABC transporter ATP-binding protein; Provisional	241
182818	PRK10896	PRK10896	PTS IIA-like nitrogen regulatory protein PtsN. 	154
182819	PRK10897	PRK10897	PTS phosphocarrier protein NPr. 	90
182820	PRK10898	PRK10898	serine endoprotease DegS. 	353
236789	PRK10899	PRK10899	AsmA2 domain-containing protein. 	1022
236790	PRK10901	PRK10901	16S rRNA (cytosine(967)-C(5))-methyltransferase RsmB. 	427
236791	PRK10902	PRK10902	FKBP-type peptidyl-prolyl cis-trans isomerase; Provisional	269
182824	PRK10903	PRK10903	peptidylprolyl isomerase A. 	190
182825	PRK10904	PRK10904	adenine-specific DNA-methyltransferase. 	271
236792	PRK10905	PRK10905	cell division protein DamX; Validated	328
182827	PRK10906	PRK10906	DeoR/GlpR family transcriptional regulator. 	252
182828	PRK10907	PRK10907	intramembrane serine protease GlpG; Provisional	276
182829	PRK10908	PRK10908	cell division ATP-binding protein FtsE. 	222
236793	PRK10909	rsmD	16S rRNA m(2)G966-methyltransferase; Provisional	199
182831	PRK10910	PRK10910	DUF1145 family protein. 	89
182832	PRK10911	PRK10911	oligopeptidase A; Provisional	680
182833	PRK10913	PRK10913	dipeptide ABC transporter permease DppC. 	300
182834	PRK10914	PRK10914	dipeptide ABC transporter permease DppB. 	339
182835	PRK10916	PRK10916	ADP-heptose--LPS heptosyltransferase RfaF. 	348
236794	PRK10917	PRK10917	ATP-dependent DNA helicase RecG; Provisional	681
182837	PRK10918	PRK10918	phosphate ABC transporter substrate-binding protein PstS. 	346
182838	PRK10919	PRK10919	ATP-dependent DNA helicase Rep; Provisional	672
236795	PRK10920	PRK10920	putative uroporphyrinogen III C-methyltransferase; Provisional	390
182840	PRK10921	PRK10921	Sec-independent protein translocase subunit TatC. 	258
236796	PRK10922	PRK10922	4-hydroxy-3-polyprenylbenzoate decarboxylase. 	497
182842	PRK10923	glnG	nitrogen regulation protein NR(I); Provisional	469
182843	PRK10925	PRK10925	superoxide dismutase [Mn]. 	206
182844	PRK10926	PRK10926	ferredoxin-NADP reductase; Provisional	248
236797	PRK10927	PRK10927	cell division protein FtsN. 	319
236798	PRK10929	PRK10929	putative mechanosensitive channel protein; Provisional	1109
236799	PRK10930	PRK10930	FtsH protease activity modulator HflK. 	419
182848	PRK10931	PRK10931	adenosine-3'(2'),5'-bisphosphate nucleotidase; Provisional	246
182849	PRK10933	PRK10933	trehalose-6-phosphate hydrolase; Provisional	551
236800	PRK10935	PRK10935	nitrate/nitrite two-component system sensor histidine kinase NarQ. 	565
236801	PRK10936	PRK10936	TMAO reductase system periplasmic protein TorT; Provisional	343
182852	PRK10938	PRK10938	putative molybdenum transport ATP-binding protein ModF; Provisional	490
182853	PRK10939	PRK10939	autoinducer-2 (AI-2) kinase; Provisional	520
182854	PRK10941	PRK10941	tetratricopeptide repeat-containing protein. 	269
236802	PRK10942	PRK10942	serine endoprotease DegP. 	473
170841	PRK10943	PRK10943	cold shock-like protein CspC; Provisional	69
182856	PRK10945	PRK10945	hemolysin expression modulator Hha. 	72
236803	PRK10946	entE	(2,3-dihydroxybenzoyl)adenylate synthase. 	536
182858	PRK10947	PRK10947	DNA-binding transcriptional regulator H-NS. 	135
236804	PRK10948	PRK10948	Fe-S cluster assembly protein SufD. 	424
182860	PRK10949	PRK10949	signal peptide peptidase SppA. 	618
236805	PRK10952	PRK10952	proline/glycine betaine ABC transporter permease ProW. 	355
182862	PRK10953	cysJ	NADPH-dependent assimilatory sulfite reductase flavoprotein subunit. 	600
182863	PRK10954	PRK10954	thiol:disulfide interchange protein DsbA. 	207
182864	PRK10955	PRK10955	envelope stress response regulator transcription factor CpxR. 	232
236806	PRK10957	PRK10957	iron-enterobactin transporter periplasmic binding protein; Provisional	317
236807	PRK10958	PRK10958	leucine export protein LeuE; Provisional	212
182867	PRK10959	PRK10959	outer membrane protein W; Provisional	212
236808	PRK10963	PRK10963	hypothetical protein; Provisional	223
236809	PRK10964	PRK10964	lipopolysaccharide heptosyltransferase RfaC. 	322
236810	PRK10965	PRK10965	multicopper oxidase; Provisional	523
182871	PRK10966	PRK10966	exonuclease subunit SbcD; Provisional	407
182872	PRK10969	PRK10969	DNA polymerase III subunit theta; Reviewed	75
182873	PRK10971	PRK10971	sulfate/thiosulfate ABC transporter permease CysT. 	277
182874	PRK10972	PRK10972	cell division protein ZapA. 	109
182875	PRK10973	PRK10973	sn-glycerol-3-phosphate ABC transporter permease UgpE. 	281
182876	PRK10974	PRK10974	sn-glycerol-3-phosphate ABC transporter substrate-binding protein UgpB. 	438
182877	PRK10975	PRK10975	dTDP-4-amino-4,6-dideoxy-D-galactose acyltransferase. 	194
182878	PRK10976	PRK10976	putative hydrolase; Provisional	266
182879	PRK10977	PRK10977	hypothetical protein; Provisional	509
182880	PRK10982	PRK10982	galactose/methyl galactoside transporter ATP-binding protein; Provisional	491
182881	PRK10983	PRK10983	AI-2E family transporter YdiK. 	368
182882	PRK10984	PRK10984	sigma factor-binding protein Crl. 	127
182883	PRK10985	PRK10985	putative hydrolase; Provisional	324
236811	PRK10987	PRK10987	beta-lactamase regulator AmpE. 	284
182885	PRK10991	fucI	L-fucose isomerase; Provisional	588
236812	PRK10992	PRK10992	iron-sulfur cluster repair protein YtfE. 	220
236813	PRK10993	PRK10993	omptin family outer membrane protease. 	314
236814	PRK10995	PRK10995	MarC family NAAT transporter. 	221
182889	PRK10996	PRK10996	thioredoxin 2; Provisional	139
236815	PRK10997	yieM	ATPase RavA stimulator ViaA. 	487
182891	PRK10998	malG	maltose ABC transporter permease MalG. 	296
236816	PRK10999	malF	maltose ABC transporter permease MalF. 	520
182893	PRK11000	PRK11000	maltose/maltodextrin ABC transporter ATP-binding protein MalK. 	369
236817	PRK11001	mtlR	MltR family transcriptional regulator. 	171
182895	PRK11006	phoR	phosphate regulon sensor histidine kinase PhoR. 	430
182896	PRK11007	PRK11007	PTS system trehalose(maltose)-specific transporter subunits IIBC; Provisional	473
236818	PRK11009	aphA	class B acid phosphatase. 	237
182898	PRK11010	ampG	muropeptide MFS transporter AmpG. 	491
236819	PRK11013	PRK11013	DNA-binding transcriptional regulator LysR; Provisional	309
236820	PRK11014	PRK11014	HTH-type transcriptional repressor NsrR. 	141
236821	PRK11017	codB	cytosine permease; Provisional	404
236822	PRK11018	PRK11018	putative sulfurtransferase YedF. 	78
182903	PRK11019	PRK11019	DksA/TraR family C4-type zinc finger protein. 	88
182904	PRK11020	PRK11020	YibL family ribosome-associated protein. 	118
236823	PRK11021	PRK11021	putative transporter; Provisional	410
182906	PRK11022	dppD	dipeptide transporter ATP-binding subunit; Provisional	326
182907	PRK11023	PRK11023	divisome-associated lipoprotein YraP. 	191
236824	PRK11024	PRK11024	colicin uptake protein TolR; Provisional	141
182909	PRK11025	PRK11025	23S rRNA pseudouridine(955/2504/2580) synthase RluC. 	317
182910	PRK11026	ftsX	cell division ABC transporter subunit FtsX; Provisional	309
236825	PRK11027	PRK11027	hypothetical protein; Provisional	112
182912	PRK11028	PRK11028	6-phosphogluconolactonase; Provisional	330
182913	PRK11029	PRK11029	protease modulator HflC. 	334
236826	PRK11031	PRK11031	guanosine-5'-triphosphate,3'-diphosphate diphosphatase. 	496
182915	PRK11032	PRK11032	zinc ribbon-containing protein. 	160
236827	PRK11033	zntA	zinc/cadmium/mercury/lead-transporting ATPase; Provisional	741
236828	PRK11034	clpA	ATP-dependent Clp protease ATP-binding subunit; Provisional	758
182918	PRK11036	PRK11036	tRNA uridine 5-oxyacetic acid(34) methyltransferase CmoM. 	255
182919	PRK11037	PRK11037	hypothetical protein; Provisional	83
182920	PRK11038	PRK11038	hypothetical protein; Provisional	47
182921	PRK11039	PRK11039	DUF1249 family protein. 	140
182922	PRK11040	PRK11040	peptidase PmbA; Provisional	446
182923	PRK11041	PRK11041	DNA-binding transcriptional regulator CytR; Provisional	309
182924	PRK11043	PRK11043	Bcr/CflA family multidrug efflux MFS transporter. 	401
236829	PRK11045	pagP	lipid IV(A) palmitoyltransferase PagP. 	184
236830	PRK11049	PRK11049	D-alanine/D-serine/glycine permease; Provisional	469
182927	PRK11050	PRK11050	manganese-binding transcriptional regulator MntR. 	152
236831	PRK11052	malQ	4-alpha-glucanotransferase; Provisional	695
182929	PRK11053	PRK11053	oxygen-insensitive NAD(P)H nitroreductase. 	217
182930	PRK11054	helD	DNA helicase IV; Provisional	684
182931	PRK11055	galM	galactose-1-epimerase; Provisional	342
236832	PRK11056	PRK11056	YijD family membrane protein. 	120
182933	PRK11057	PRK11057	ATP-dependent DNA helicase RecQ; Provisional	607
182934	PRK11058	PRK11058	GTPase HflX; Provisional	426
236833	PRK11059	PRK11059	regulatory protein CsrD; Provisional	640
182936	PRK11060	PRK11060	rod shape-determining protein MreD; Provisional	162
182937	PRK11061	PRK11061	phosphoenolpyruvate--protein phosphotransferase. 	748
182938	PRK11062	nhaR	transcriptional activator NhaR; Provisional	296
182939	PRK11063	metQ	D-methionine ABC transporter substrate-binding protein MetQ. 	271
182940	PRK11064	wecC	UDP-N-acetyl-D-mannosamine dehydrogenase; Provisional	415
236834	PRK11067	PRK11067	outer membrane protein assembly factor YaeT; Provisional	803
182942	PRK11068	PRK11068	phosphatidylglycerophosphatase A; Provisional	164
236835	PRK11069	recC	exodeoxyribonuclease V subunit gamma. 	1122
182944	PRK11070	PRK11070	single-stranded-DNA-specific exonuclease RecJ. 	575
182945	PRK11071	PRK11071	esterase YqiA; Provisional	190
236836	PRK11072	PRK11072	bifunctional [glutamate--ammonia ligase]-adenylyl-L-tyrosine phosphorylase/[glutamate--ammonia-ligase] adenylyltransferase. 	943
182947	PRK11073	glnL	nitrogen regulation protein NR(II). 	348
182948	PRK11074	PRK11074	putative DNA-binding transcriptional regulator; Provisional	300
236837	PRK11081	PRK11081	tRNA guanosine-2'-O-methyltransferase; Provisional	229
236838	PRK11083	PRK11083	DNA-binding response regulator CreB; Provisional	228
182951	PRK11085	PRK11085	magnesium/cobalt transporter CorA. 	316
236839	PRK11086	PRK11086	sensory histidine kinase DcuS; Provisional	542
236840	PRK11087	PRK11087	oxidative stress defense protein; Provisional	231
236841	PRK11088	rrmA	23S rRNA methyltransferase A; Provisional	272
182955	PRK11089	PRK11089	PTS system glucose-specific transporter subunits IIBC; Provisional	477
236842	PRK11091	PRK11091	aerobic respiration control sensor protein ArcB; Provisional	779
236843	PRK11092	PRK11092	bifunctional GTP diphosphokinase/guanosine-3',5'-bis pyrophosphate 3'-pyrophosphohydrolase. 	702
182958	PRK11096	ansB	L-asparaginase II; Provisional	347
236844	PRK11097	PRK11097	cellulase. 	376
182960	PRK11098	PRK11098	peptide antibiotic transporter SbmA. 	409
236845	PRK11099	PRK11099	putative inner membrane protein; Provisional	399
236846	PRK11100	PRK11100	sensory histidine kinase CreC; Provisional	475
236847	PRK11101	glpA	anaerobic glycerol-3-phosphate dehydrogenase subunit A. 	546
182964	PRK11102	PRK11102	Bcr/CflA family multidrug efflux MFS transporter. 	377
182965	PRK11103	PRK11103	PTS mannose transporter subunit IID. 	282
182966	PRK11104	hemG	menaquinone-dependent protoporphyrinogen IX dehydrogenase. 	177
182967	PRK11106	PRK11106	queuosine biosynthesis protein QueC; Provisional	231
236848	PRK11107	PRK11107	hybrid sensory histidine kinase BarA; Provisional	919
236849	PRK11109	PRK11109	fused PTS fructose transporter subunit IIA/HPr protein. 	375
236850	PRK11111	PRK11111	YchE family NAAT transporter. 	214
182971	PRK11112	PRK11112	tRNA pseudouridine synthase C; Provisional	257
182972	PRK11113	PRK11113	D-alanyl-D-alanine carboxypeptidase/endopeptidase; Provisional	477
236851	PRK11114	PRK11114	cellulose biosynthesis cyclic di-GMP-binding regulatory protein BcsB. 	756
182974	PRK11115	PRK11115	phosphate signaling complex protein PhoU. 	236
182975	PRK11118	PRK11118	putative monooxygenase; Provisional	100
236852	PRK11119	proX	proline/glycine betaine ABC transporter substrate-binding protein ProX. 	331
236853	PRK11121	nrdG	anaerobic ribonucleoside-triphosphate reductase-activating protein. 	154
182978	PRK11122	artM	arginine ABC transporter permease ArtM. 	222
182979	PRK11123	PRK11123	arginine ABC transporter permease ArtQ. 	238
182980	PRK11124	artP	arginine transporter ATP-binding subunit; Provisional	242
236854	PRK11125	nrfA	ammonia-forming cytochrome c nitrite reductase. 	480
236855	PRK11126	PRK11126	2-succinyl-6-hydroxy-2,4-cyclohexadiene-1-carboxylate synthase; Provisional	242
182983	PRK11127	PRK11127	autonomous glycyl radical cofactor GrcA; Provisional	127
236856	PRK11128	PRK11128	3-phenylpropionate MFS transporter. 	382
182985	PRK11130	moaD	molybdopterin synthase small subunit; Provisional	81
182986	PRK11131	PRK11131	ATP-dependent RNA helicase HrpA; Provisional	1294
182987	PRK11132	cysE	serine acetyltransferase; Provisional	273
182988	PRK11133	serB	phosphoserine phosphatase; Provisional	322
236857	PRK11138	PRK11138	outer membrane biogenesis protein BamB; Provisional	394
182990	PRK11139	PRK11139	DNA-binding transcriptional activator GcvA; Provisional	297
236858	PRK11142	PRK11142	ribokinase; Provisional	306
236859	PRK11143	glpQ	glycerophosphodiester phosphodiesterase; Provisional	355
182993	PRK11144	modC	molybdenum ABC transporter ATP-binding protein ModC. 	352
182994	PRK11145	pflA	pyruvate formate lyase 1-activating protein. 	246
236860	PRK11146	PRK11146	lipoprotein-releasing ABC transporter permease subunit LolE. 	412
236861	PRK11147	PRK11147	ABC transporter ATPase component; Reviewed	635
182997	PRK11148	PRK11148	cyclic 3',5'-adenosine monophosphate phosphodiesterase; Provisional	275
182998	PRK11150	rfaD	ADP-L-glycero-D-mannoheptose-6-epimerase; Provisional	308
182999	PRK11151	PRK11151	DNA-binding transcriptional regulator OxyR; Provisional	305
236862	PRK11152	ilvM	acetolactate synthase 2 small subunit. 	76
236863	PRK11153	metN	DL-methionine transporter ATP-binding subunit; Provisional	343
236864	PRK11154	fadJ	fatty acid oxidation complex subunit alpha FadJ. 	708
236865	PRK11160	PRK11160	cysteine/glutathione ABC transporter membrane/ATP-binding component; Reviewed	574
183004	PRK11161	PRK11161	fumarate/nitrate reduction transcriptional regulator Fnr. 	235
236866	PRK11162	mltA	murein transglycosylase A; Provisional	355
236867	PRK11165	PRK11165	diaminopimelate decarboxylase; Provisional	420
236868	PRK11166	PRK11166	chemotaxis regulator CheZ; Provisional	214
236869	PRK11168	glpC	anaerobic glycerol-3-phosphate dehydrogenase subunit C. 	396
183009	PRK11169	PRK11169	leucine-responsive transcriptional regulator Lrp. 	164
183010	PRK11170	nagA	N-acetylglucosamine-6-phosphate deacetylase; Provisional	382
183011	PRK11171	PRK11171	(S)-ureidoglycine aminohydrolase. 	266
183012	PRK11172	dkgB	2,5-didehydrogluconate reductase DkgB. 	267
183013	PRK11173	PRK11173	two-component response regulator; Provisional	237
236870	PRK11174	PRK11174	cysteine/glutathione ABC transporter membrane/ATP-binding component; Reviewed	588
236871	PRK11175	PRK11175	universal stress protein UspE; Provisional	305
183016	PRK11176	PRK11176	lipid A ABC transporter ATP-binding protein/permease MsbA. 	582
183017	PRK11177	PRK11177	phosphoenolpyruvate-protein phosphotransferase PtsI. 	575
183018	PRK11178	PRK11178	uridine phosphorylase; Provisional	251
183019	PRK11179	PRK11179	DNA-binding transcriptional regulator AsnC; Provisional	153
183020	PRK11180	rluD	23S rRNA pseudouridine(1911/1915/1917) synthase RluD. 	325
183021	PRK11181	PRK11181	23S rRNA (guanosine(2251)-2'-O)-methyltransferase RlmB. 	244
236872	PRK11183	PRK11183	D-lactate dehydrogenase; Provisional	564
236873	PRK11186	PRK11186	carboxy terminal-processing peptidase. 	667
236874	PRK11187	PRK11187	replication initiation negative regulator SeqA. 	182
183025	PRK11188	rrmJ	23S rRNA (uridine(2552)-2'-O)-methyltransferase RlmE. 	209
236875	PRK11189	PRK11189	lipoprotein NlpI; Provisional	296
183027	PRK11190	PRK11190	iron-sulfur cluster biogenesis protein NfuA. 	192
236876	PRK11191	PRK11191	ribonuclease E inhibitor RraB. 	138
236877	PRK11192	PRK11192	ATP-dependent RNA helicase SrmB; Provisional	434
236878	PRK11193	PRK11193	23S rRNA accumulation protein YceD. 	172
183031	PRK11194	PRK11194	ribosomal RNA large subunit methyltransferase N; Provisional	372
236879	PRK11195	PRK11195	lysophospholipid transporter LplT; Provisional	393
183033	PRK11197	lldD	L-lactate dehydrogenase; Provisional	381
236880	PRK11198	PRK11198	LysM domain/BON superfamily protein; Provisional	147
183035	PRK11199	tyrA	bifunctional chorismate mutase/prephenate dehydrogenase; Provisional	374
183036	PRK11200	grxA	glutaredoxin 1; Provisional	85
236881	PRK11202	PRK11202	HTH-type transcriptional repressor FabR. 	203
236882	PRK11204	PRK11204	N-glycosyltransferase; Provisional	420
236883	PRK11205	tbpA	thiamine transporter substrate binding subunit; Provisional	330
183040	PRK11207	PRK11207	tellurite resistance methyltransferase TehB. 	197
183041	PRK11212	PRK11212	7-cyano-7-deazaguanine/7-aminomethyl-7-deazaguanine transporter. 	210
183042	PRK11228	fecC	iron-dicitrate ABC transporter permease FecC. 	323
183043	PRK11230	PRK11230	glycolate oxidase subunit GlcD; Provisional	499
183044	PRK11231	fecE	Fe(3+) dicitrate ABC transporter ATP-binding protein FecE. 	255
183045	PRK11233	PRK11233	nitrogen assimilation transcriptional regulator; Provisional	305
236884	PRK11234	nfrB	phage adsorption protein NfrB. 	727
183047	PRK11235	PRK11235	type II toxin-antitoxin system RelB/DinJ family antitoxin. 	80
183048	PRK11239	PRK11239	hypothetical protein; Provisional	215
183049	PRK11240	PRK11240	penicillin-binding protein 1C; Provisional	772
183050	PRK11241	gabD	NADP-dependent succinate-semialdehyde dehydrogenase I. 	482
183051	PRK11242	PRK11242	DNA-binding transcriptional regulator CynR; Provisional	296
183052	PRK11244	phnP	carbon-phosphorus lyase complex accessory protein; Provisional	250
183053	PRK11245	folX	dihydroneopterin triphosphate 2'-epimerase. 	120
236885	PRK11246	PRK11246	YtjB family periplasmic protein. 	218
183055	PRK11247	ssuB	aliphatic sulfonates transport ATP-binding subunit; Provisional	257
183056	PRK11248	tauB	taurine ABC transporter ATP-binding subunit. 	255
236886	PRK11249	katE	hydroperoxidase II; Provisional	752
183058	PRK11251	PRK11251	osmotically-inducible lipoprotein OsmE. 	109
183059	PRK11253	ldcA	L,D-carboxypeptidase A; Provisional	305
236887	PRK11259	solA	N-methyl-L-tryptophan oxidase. 	376
183061	PRK11260	PRK11260	cystine ABC transporter substrate-binding protein. 	266
236888	PRK11263	PRK11263	cardiolipin synthase ClsB. 	411
183063	PRK11264	PRK11264	putative amino-acid ABC transporter ATP-binding protein YecC; Provisional	250
183064	PRK11267	PRK11267	biopolymer transport protein ExbD; Provisional	141
183065	PRK11268	pstA	phosphate ABC transporter permease PstA. 	295
183066	PRK11269	PRK11269	glyoxylate carboligase; Provisional	591
183067	PRK11272	PRK11272	putative DMT superfamily transporter inner membrane protein; Provisional	292
236889	PRK11273	glpT	glycerol-3-phosphate transporter. 	452
236890	PRK11274	glcF	glycolate oxidase subunit GlcF. 	407
183070	PRK11275	pstC	phosphate ABC transporter permease PstC. 	319
236891	PRK11278	PRK11278	NADH-quinone oxidoreductase subunit NuoF. 	448
183072	PRK11280	PRK11280	hypothetical protein; Provisional	170
236892	PRK11281	PRK11281	mechanosensitive channel MscK. 	1113
236893	PRK11282	glcE	glycolate oxidase FAD binding subunit; Provisional	352
183075	PRK11283	gltP	glutamate/aspartate:proton symporter; Provisional	437
183076	PRK11285	araH	L-arabinose transporter permease protein; Provisional	333
183077	PRK11288	araG	L-arabinose ABC transporter ATP-binding protein AraG. 	501
236894	PRK11289	ampC	beta-lactamase/D-alanine carboxypeptidase; Provisional	384
236895	PRK11295	PRK11295	HNH nuclease family protein. 	113
183080	PRK11300	livG	leucine/isoleucine/valine transporter ATP-binding subunit; Provisional	255
236896	PRK11301	livM	leucine/isoleucine/valine transporter permease subunit; Provisional	419
183082	PRK11302	PRK11302	DNA-binding transcriptional regulator HexR; Provisional	284
236897	PRK11303	PRK11303	catabolite repressor/activator. 	328
236898	PRK11308	dppF	dipeptide transporter ATP-binding subunit; Provisional	327
183085	PRK11316	PRK11316	bifunctional D-glycero-beta-D-manno-heptose-7-phosphate kinase/D-glycero-beta-D-manno-heptose 1-phosphate adenylyltransferase HldE. 	473
183086	PRK11320	prpB	2-methylisocitrate lyase; Provisional	292
183087	PRK11325	PRK11325	scaffold protein; Provisional	127
183088	PRK11331	PRK11331	5-methylcytosine-specific restriction enzyme subunit McrB; Provisional	459
183089	PRK11337	PRK11337	MurR/RpiR family transcriptional regulator. 	292
183090	PRK11339	abgT	putative aminobenzoyl-glutamate transporter; Provisional	508
236899	PRK11340	PRK11340	phosphodiesterase YaeI; Provisional	271
183092	PRK11342	mhpD	2-keto-4-pentenoate hydratase; Provisional	262
183093	PRK11346	PRK11346	hypothetical protein; Provisional	285
183094	PRK11347	PRK11347	antitoxin ChpS; Provisional	83
183095	PRK11352	PRK11352	formaldehyde-responsive transcriptional repressor FrmR. 	91
236900	PRK11354	kil	FtsZ inhibitor protein; Reviewed	73
183096	PRK11357	frlA	amino acid permease. 	445
183097	PRK11359	PRK11359	cyclic-di-GMP phosphodiesterase; Provisional	799
236901	PRK11360	PRK11360	two-component system sensor histidine kinase AtoS. 	607
183099	PRK11361	PRK11361	acetoacetate metabolism transcriptional regulator AtoC. 	457
183100	PRK11365	ssuC	aliphatic sulfonate ABC transporter permease SsuC. 	263
183101	PRK11366	puuD	gamma-glutamyl-gamma-aminobutyrate hydrolase; Provisional	254
183102	PRK11367	PRK11367	hypothetical protein; Provisional	476
183103	PRK11370	PRK11370	YciI family protein. 	99
183104	PRK11371	PRK11371	hypothetical protein; Provisional	126
236902	PRK11372	PRK11372	C-type lysozyme inhibitor. 	109
183106	PRK11375	PRK11375	putative allantoin permease. 	484
183107	PRK11376	hlyE	hemolysin HlyE. 	303
183108	PRK11377	PRK11377	dihydroxyacetone kinase subunit M; Provisional	473
183109	PRK11379	PRK11379	putative outer membrane porin protein; Provisional	417
236903	PRK11380	PRK11380	hypothetical protein; Provisional	353
183111	PRK11382	frlB	fructoselysine 6-phosphate deglycase. 	340
105206	PRK11383	PRK11383	YiaB family inner membrane protein. 	145
183112	PRK11385	PRK11385	fimbrial chaperone. 	236
236904	PRK11387	PRK11387	S-methylmethionine permease. 	471
183114	PRK11388	PRK11388	DNA-binding transcriptional regulator DhaR; Provisional	638
138553	PRK11391	etp	phosphotyrosine-protein phosphatase; Provisional	144
183115	PRK11394	PRK11394	23S rRNA pseudouridine(2457) synthase RluE. 	217
236905	PRK11396	PRK11396	environmental stress-induced protein Ves. 	191
183117	PRK11397	dacD	serine-type D-Ala-D-Ala carboxypeptidase DacD. 	388
105214	PRK11401	PRK11401	enamine/imine deaminase. 	129
183118	PRK11402	PRK11402	transcriptional regulator PhoB. 	241
183119	PRK11403	PRK11403	hypothetical protein; Provisional	113
183120	PRK11404	PRK11404	PTS fructose-like transporter subunit IIBC. 	482
236906	PRK11408	PRK11408	hypothetical protein; Provisional	145
171099	PRK11409	PRK11409	YoeB-YefM toxin-antitoxin system antitoxin YefM. 	83
236907	PRK11410	PRK11410	hypothetical protein; Provisional	561
183123	PRK11411	fecB	iron-dicitrate transporter substrate-binding subunit; Provisional	303
183124	PRK11412	PRK11412	uracil/xanthine transporter. 	433
183125	PRK11413	PRK11413	putative hydratase; Provisional	751
183126	PRK11414	PRK11414	GntR family transcriptional regulator. 	221
183127	PRK11415	PRK11415	hypothetical protein; Provisional	74
236908	PRK11423	PRK11423	methylmalonyl-CoA decarboxylase; Provisional	261
236909	PRK11424	PRK11424	DNA-binding transcriptional activator TdcR; Provisional	114
183129	PRK11425	PRK11425	PTS N-acetylgalactosamine transporter subunit IIB. 	157
183130	PRK11426	PRK11426	hypothetical protein; Provisional	132
183131	PRK11427	PRK11427	multidrug efflux transporter permease subunit MdtO. 	683
183132	PRK11430	PRK11430	putative CoA-transferase; Provisional	381
171110	PRK11431	PRK11431	quaternary ammonium compound efflux SMR transporter SugE. 	105
183133	PRK11432	fbpC	ferric ABC transporter ATP-binding protein. 	351
236910	PRK11433	PRK11433	aldehyde oxidoreductase 2Fe-2S subunit; Provisional	217
183135	PRK11436	PRK11436	biofilm-dependent modulation protein; Provisional	71
236911	PRK11439	pphA	protein-serine/threonine phosphatase. 	218
183137	PRK11440	PRK11440	putative hydrolase; Provisional	188
183138	PRK11443	PRK11443	lipoprotein; Provisional	124
183139	PRK11445	PRK11445	FAD-binding protein. 	351
183140	PRK11447	PRK11447	cellulose synthase subunit BcsC; Provisional	1157
236912	PRK11448	hsdR	type I restriction enzyme EcoKI subunit R; Provisional	1123
171118	PRK11449	PRK11449	metal-dependent hydrolase. 	258
183142	PRK11453	PRK11453	O-acetylserine/cysteine exporter. 	299
183143	PRK11459	PRK11459	multidrug resistance outer membrane protein MdtQ; Provisional	478
183144	PRK11460	PRK11460	putative hydrolase; Provisional	232
183145	PRK11462	PRK11462	putative transporter; Provisional	460
236913	PRK11463	fxsA	phage T7 F exclusion suppressor FxsA; Reviewed	148
183147	PRK11465	PRK11465	putative mechanosensitive channel protein; Provisional	741
236914	PRK11466	PRK11466	hybrid sensory histidine kinase TorS; Provisional	914
183149	PRK11467	PRK11467	secY/secA suppressor protein; Provisional	124
183150	PRK11468	PRK11468	dihydroxyacetone kinase subunit DhaK; Provisional	356
183151	PRK11469	PRK11469	manganese efflux pump MntP. 	188
183152	PRK11470	PRK11470	YebB family permuted papain-like enzyme. 	200
236915	PRK11475	PRK11475	DNA-binding transcriptional activator BglJ; Provisional	207
183154	PRK11476	PRK11476	carnitine metabolism transcriptional regulator CaiF. 	113
183155	PRK11477	PRK11477	CdaR family transcriptional regulator. 	385
183156	PRK11478	PRK11478	VOC family protein. 	129
183157	PRK11479	PRK11479	YaeF family permuted papain-like enzyme. 	274
183158	PRK11480	tauA	taurine transporter substrate binding subunit; Provisional	320
183159	PRK11482	PRK11482	DNA-binding transcriptional regulator. 	317
183160	PRK11486	PRK11486	flagellar type III secretion system protein FliO. 	124
236916	PRK11492	hyfE	hydrogenase 4 membrane subunit; Provisional	216
236917	PRK11493	sseA	3-mercaptopyruvate sulfurtransferase; Provisional	281
236918	PRK11498	bcsA	cellulose synthase catalytic subunit; Provisional	852
236919	PRK11504	tynA	primary-amine oxidase. 	647
183165	PRK11505	PRK11505	phosphate starvation-inducible protein PsiF. 	106
183166	PRK11507	PRK11507	ribosome-associated protein YbcJ. 	70
183167	PRK11508	PRK11508	sulfurtransferase TusE. 	109
183168	PRK11509	PRK11509	hydrogenase-1 operon protein HyaE; Provisional	132
236920	PRK11511	PRK11511	MDR efflux pump AcrAB transcriptional activator MarA. 	127
183170	PRK11512	PRK11512	multiple antibiotic resistance transcriptional regulator MarR. 	144
236921	PRK11513	PRK11513	cytochrome b561; Provisional	176
183172	PRK11517	PRK11517	DNA-binding response regulator HprR. 	223
183173	PRK11519	PRK11519	tyrosine-protein kinase Wzc. 	719
183174	PRK11521	PRK11521	DUF2509 family protein. 	124
183175	PRK11522	PRK11522	putrescine--2-oxoglutarate aminotransferase; Provisional	459
183176	PRK11523	PRK11523	transcriptional regulator ExuR. 	253
183177	PRK11524	PRK11524	adenine-specific DNA-methyltransferase. 	284
183178	PRK11525	dinD	DNA-damage-inducible protein D; Provisional	279
236922	PRK11528	PRK11528	hypothetical protein; Provisional	254
236923	PRK11530	PRK11530	hypothetical protein; Provisional	183
183181	PRK11534	PRK11534	DNA-binding transcriptional regulator CsiR; Provisional	224
183182	PRK11536	PRK11536	6-N-hydroxylaminopurine resistance protein; Provisional	223
183183	PRK11537	PRK11537	putative GTP-binding protein YjiA; Provisional	318
183184	PRK11538	PRK11538	ribosome silencing factor. 	105
236924	PRK11539	PRK11539	ComEC family competence protein; Provisional	755
183186	PRK11543	gutQ	arabinose-5-phosphate isomerase GutQ. 	321
236925	PRK11544	hycI	hydrogenase 3 maturation protease; Provisional	156
236926	PRK11545	gntK	gluconokinase. 	163
183189	PRK11546	zraP	zinc resistance sensor/chaperone ZraP. 	143
183190	PRK11548	PRK11548	outer membrane protein assembly factor BamE. 	113
236927	PRK11551	PRK11551	putative 3-hydroxyphenylpropionic transporter MhpT; Provisional	406
236928	PRK11552	PRK11552	putative DNA-binding transcriptional regulator; Provisional	225
236929	PRK11553	PRK11553	alkanesulfonate transporter substrate-binding subunit; Provisional	314
183194	PRK11556	PRK11556	MdtA/MuxA family multidrug efflux RND transporter periplasmic adaptor subunit. 	415
183195	PRK11557	PRK11557	MurR/RpiR family transcriptional regulator. 	278
236930	PRK11558	PRK11558	putative ssRNA endonuclease; Provisional	97
183197	PRK11559	garR	tartronate semialdehyde reductase; Provisional	296
183198	PRK11560	PRK11560	kdo(2)-lipid A phosphoethanolamine 7''-transferase. 	558
183199	PRK11561	PRK11561	isovaleryl CoA dehydrogenase; Provisional	538
183200	PRK11562	PRK11562	nitrite transporter NirC; Provisional	268
236931	PRK11563	PRK11563	bifunctional aldehyde dehydrogenase/enoyl-CoA hydratase; Provisional	675
236932	PRK11564	PRK11564	stationary phase inducible protein CsiE; Provisional	426
183203	PRK11565	dkgA	2,5-didehydrogluconate reductase DkgA. 	275
183204	PRK11566	hdeB	acid-activated periplasmic chaperone HdeB. 	102
183205	PRK11568	PRK11568	IMPACT family protein. 	204
183206	PRK11569	PRK11569	glyoxylate bypass operon transcriptional repressor IclR. 	274
183207	PRK11570	PRK11570	peptidyl-prolyl cis-trans isomerase; Provisional	206
183208	PRK11572	PRK11572	copper homeostasis protein CutC; Provisional	248
236933	PRK11573	PRK11573	hypothetical protein; Provisional	413
183210	PRK11574	PRK11574	protein deglycase YajL. 	196
183211	PRK11578	PRK11578	macrolide transporter subunit MacA; Provisional	370
183212	PRK11579	PRK11579	putative oxidoreductase; Provisional	346
183213	PRK11582	PRK11582	flagella biosynthesis regulatory protein FliZ. 	169
183214	PRK11586	napB	nitrate reductase cytochrome c-type subunit. 	149
183215	PRK11587	PRK11587	putative phosphatase; Provisional	218
236934	PRK11588	PRK11588	putative basic amino acid antiporter YfcC. 	506
236935	PRK11589	gcvR	glycine cleavage system transcriptional repressor; Provisional	190
183218	PRK11590	PRK11590	hypothetical protein; Provisional	211
183219	PRK11593	folB	bifunctional dihydroneopterin aldolase/7,8-dihydroneopterin epimerase. 	119
183220	PRK11594	PRK11594	efflux system membrane protein; Provisional	67
183221	PRK11595	PRK11595	DNA utilization protein GntX; Provisional	227
183222	PRK11596	PRK11596	cyclic-di-GMP phosphodiesterase; Provisional	255
183223	PRK11597	PRK11597	heat shock chaperone IbpB; Provisional	142
183224	PRK11598	PRK11598	putative metal dependent hydrolase; Provisional	545
183225	PRK11602	cysW	sulfate/thiosulfate ABC transporter permease CysW. 	283
183226	PRK11607	potG	putrescine ABC transporter ATP-binding subunit PotG. 	377
236936	PRK11608	pspF	phage shock protein operon transcriptional activator; Provisional	326
183228	PRK11609	PRK11609	bifunctional nicotinamidase/pyrazinamidase. 	212
183229	PRK11611	PRK11611	enhanced serine sensitivity protein SseB; Provisional	246
183230	PRK11613	folP	dihydropteroate synthase; Provisional	282
183231	PRK11614	livF	high-affinity branched-chain amino acid ABC transporter ATP-binding protein LivF. 	237
183232	PRK11615	PRK11615	hypothetical protein; Provisional	185
183233	PRK11616	PRK11616	hypothetical protein; Provisional	109
236937	PRK11617	PRK11617	deoxyribonuclease V. 	224
183235	PRK11618	PRK11618	inner membrane ABC transporter permease protein YjfF; Provisional	317
183236	PRK11619	PRK11619	lytic murein transglycosylase; Provisional	644
236938	PRK11621	PRK11621	Tat proofreading chaperone DmsD. 	204
183238	PRK11622	PRK11622	ABC transporter substrate-binding protein. 	401
236939	PRK11623	pcnB	poly(A) polymerase I; Provisional	472
183240	PRK11624	cdsA	phosphatidate cytidylyltransferase. 	285
183241	PRK11625	PRK11625	Rho-binding antiterminator; Provisional	84
183242	PRK11627	PRK11627	hypothetical protein; Provisional	192
183243	PRK11628	PRK11628	transcriptional regulator BolA; Provisional	105
183244	PRK11629	lolD	lipoprotein-releasing ABC transporter ATP-binding protein LolD. 	233
183245	PRK11630	PRK11630	threonylcarbamoyl-AMP synthase. 	206
236940	PRK11633	PRK11633	cell division protein DedD; Provisional	226
236941	PRK11634	PRK11634	ATP-dependent RNA helicase DeaD; Provisional	629
183248	PRK11636	mrcA	penicillin-binding protein 1a; Provisional	850
236942	PRK11637	PRK11637	AmiB activator; Provisional	428
236943	PRK11638	PRK11638	ECA polysaccharide chain length modulation protein. 	342
183251	PRK11639	PRK11639	zinc uptake transcriptional repressor Zur. 	169
183252	PRK11640	PRK11640	putative transcriptional regulator; Provisional	191
236944	PRK11642	PRK11642	ribonuclease R. 	813
236945	PRK11644	PRK11644	signal transduction histidine-protein kinase/phosphatase UhpB. 	495
183255	PRK11646	PRK11646	multidrug efflux MFS transporter MdtH. 	400
183256	PRK11648	PRK11648	metal-dependent hydrolase. 	195
236946	PRK11649	PRK11649	putative peptidase; Provisional	439
236947	PRK11650	ugpC	sn-glycerol-3-phosphate ABC transporter ATP-binding protein UgpC. 	356
183259	PRK11652	emrD	multidrug transporter EmrD. 	394
236948	PRK11653	PRK11653	DUF1190 family protein. 	225
236949	PRK11655	ubiC	chorismate pyruvate lyase; Provisional	169
183262	PRK11657	dsbG	disulfide isomerase/thiol-disulfide oxidase; Provisional	251
183263	PRK11658	PRK11658	UDP-4-amino-4-deoxy-L-arabinose aminotransferase. 	379
183264	PRK11659	PRK11659	cytochrome c nitrite reductase pentaheme subunit; Provisional	183
183265	PRK11660	PRK11660	putative transporter; Provisional	568
183266	PRK11663	PRK11663	glucose-6-phosphate receptor/MFS transporter UhpC. 	434
236950	PRK11664	PRK11664	ATP-dependent RNA helicase HrpB; Provisional	812
236951	PRK11667	PRK11667	hypothetical protein; Provisional	163
236952	PRK11669	pbpG	D-alanyl-D-alanine endopeptidase; Provisional	306
183270	PRK11670	PRK11670	iron-sulfur cluster carrier protein ApbC. 	369
183271	PRK11671	mltC	membrane-bound lytic murein transglycosylase MltC. 	359
183272	PRK11675	PRK11675	LexA regulated protein; Provisional	90
236953	PRK11677	PRK11677	DUF1043 family protein. 	134
236954	PRK11678	PRK11678	putative chaperone; Provisional	450
236955	PRK11679	PRK11679	outer membrane protein assembly factor BamC. 	346
183276	PRK11688	PRK11688	thioesterase family protein. 	154
183277	PRK11689	PRK11689	aromatic amino acid efflux DMT transporter YddG. 	295
236956	PRK11697	PRK11697	two-component system response regulator BtsR. 	238
236957	PRK11700	PRK11700	VOC family protein. 	187
183280	PRK11701	phnK	phosphonate C-P lyase system protein PhnK; Provisional	258
183281	PRK11702	PRK11702	hypothetical protein; Provisional	108
183282	PRK11705	PRK11705	cyclopropane fatty acyl phospholipid synthase. 	383
183283	PRK11706	PRK11706	TDP-4-oxo-6-deoxy-D-glucose transaminase; Provisional	375
236958	PRK11709	PRK11709	putative L-ascorbate 6-phosphate lactonase; Provisional	355
183285	PRK11712	PRK11712	ribonuclease G; Provisional	489
236959	PRK11713	PRK11713	16S ribosomal RNA methyltransferase RsmE; Provisional	234
236960	PRK11715	PRK11715	cell envelope integrity protein CreD. 	436
236961	PRK11716	PRK11716	HTH-type transcriptional activator IlvY. 	269
236962	PRK11718	PRK11718	sigma D regulator. 	161
236963	PRK11720	PRK11720	UDP-glucose--hexose-1-phosphate uridylyltransferase. 	346
236964	PRK11727	PRK11727	23S rRNA (adenine(1618)-N(6))-methyltransferase RlmF. 	321
183292	PRK11728	PRK11728	L-2-hydroxyglutarate oxidase. 	393
183293	PRK11730	fadB	fatty acid oxidation complex subunit alpha FadB. 	715
236965	PRK11742	PRK11742	bifunctional NADH:ubiquinone oxidoreductase subunit C/D; Provisional	575
236966	PRK11747	dinG	ATP-dependent DNA helicase DinG; Provisional	697
236967	PRK11749	PRK11749	dihydropyrimidine dehydrogenase subunit A; Provisional	457
236968	PRK11750	gltB	glutamate synthase subunit alpha; Provisional	1485
183298	PRK11752	PRK11752	putative S-transferase; Provisional	264
236969	PRK11753	PRK11753	cAMP-activated global transcriptional regulator CRP. 	211
236970	PRK11756	PRK11756	exonuclease III; Provisional	268
236971	PRK11760	PRK11760	putative 23S rRNA C2498 ribose 2'-O-ribose methyltransferase; Provisional	357
236972	PRK11761	cysM	cysteine synthase CysM. 	296
183303	PRK11762	nudE	adenosine nucleotide hydrolase NudE; Provisional	185
236973	PRK11767	PRK11767	SpoVR family protein; Provisional	498
236974	PRK11768	PRK11768	serine/threonine protein kinase. 	325
236975	PRK11770	PRK11770	YccF domain-containing protein. 	135
236976	PRK11773	uvrD	DNA-dependent helicase II; Provisional	721
236977	PRK11776	PRK11776	ATP-dependent RNA helicase DbpA; Provisional	460
236978	PRK11778	PRK11778	putative inner membrane peptidase; Provisional	330
236979	PRK11779	sbcB	exonuclease I; Provisional	476
236980	PRK11780	PRK11780	isoprenoid biosynthesis glyoxalase ElbB. 	217
236981	PRK11783	rlmL	bifunctional 23S rRNA (guanine(2069)-N(7))-methyltransferase RlmK/23S rRNA (guanine(2445)-N(2))-methyltransferase RlmL. 	702
236982	PRK11784	PRK11784	tRNA 2-selenouridine synthase; Provisional	345
236983	PRK11788	PRK11788	tetratricopeptide repeat protein; Provisional	389
236984	PRK11789	PRK11789	1,6-anhydro-N-acetylmuramyl-L-alanine amidase AmpD. 	185
236985	PRK11790	PRK11790	phosphoglycerate dehydrogenase. 	409
236986	PRK11792	queF	7-cyano-7-deazaguanine reductase; Provisional	273
183318	PRK11797	PRK11797	D-ribose pyranase; Provisional	139
236987	PRK11798	PRK11798	ClpXP protease specificity-enhancing factor; Provisional	138
236988	PRK11805	PRK11805	50S ribosomal protein L3 N(5)-glutamine methyltransferase. 	307
236989	PRK11809	putA	trifunctional transcriptional regulator/proline dehydrogenase/pyrroline-5-carboxylate dehydrogenase; Reviewed	1318
236990	PRK11814	PRK11814	cysteine desulfurase activator complex subunit SufB; Provisional	486
236991	PRK11815	PRK11815	tRNA dihydrouridine(20/20a) synthase DusA. 	333
236992	PRK11819	PRK11819	putative ABC transporter ATP-binding protein; Reviewed	556
236993	PRK11820	PRK11820	YicC family protein. 	288
236994	PRK11823	PRK11823	DNA repair protein RadA; Provisional	446
236995	PRK11824	PRK11824	polynucleotide phosphorylase/polyadenylase; Provisional	693
183328	PRK11827	PRK11827	protein YcaR. 	60
183329	PRK11829	PRK11829	biofilm formation regulator HmsP; Provisional	660
236996	PRK11830	dapD	2,3,4,5-tetrahydropyridine-2,6-carboxylate N-succinyltransferase; Provisional	272
236997	PRK11831	PRK11831	phospholipid ABC transporter ATP-binding protein MlaF. 	269
183332	PRK11832	PRK11832	hydrogen peroxide resistance inhibitor IprA. 	207
183333	PRK11835	PRK11835	hypothetical protein; Provisional	114
183334	PRK11836	PRK11836	deubiquitinase; Provisional	403
183335	PRK11837	PRK11837	undecaprenyl pyrophosphate phosphatase; Provisional	202
236998	PRK11840	PRK11840	bifunctional sulfur carrier protein/thiazole synthase protein; Provisional	326
236999	PRK11854	aceF	pyruvate dehydrogenase dihydrolipoyltransacetylase; Validated	633
237000	PRK11855	PRK11855	dihydrolipoamide acetyltransferase; Reviewed	547
237001	PRK11856	PRK11856	branched-chain alpha-keto acid dehydrogenase subunit E2; Reviewed	411
237002	PRK11857	PRK11857	dihydrolipoamide acetyltransferase; Reviewed	306
183341	PRK11858	aksA	trans-homoaconitate synthase; Reviewed	378
237003	PRK11860	PRK11860	bifunctional 3-phosphoshikimate 1-carboxyvinyltransferase/cytidylate kinase. 	661
183343	PRK11861	PRK11861	bifunctional prephenate dehydrogenase/3-phosphoshikimate 1-carboxyvinyltransferase; Provisional	673
237004	PRK11863	PRK11863	N-acetyl-gamma-glutamyl-phosphate reductase; Provisional	313
237005	PRK11864	PRK11864	3-methyl-2-oxobutanoate dehydrogenase subunit beta. 	300
183346	PRK11865	PRK11865	pyruvate synthase subunit beta. 	299
183347	PRK11866	PRK11866	2-oxoacid ferredoxin oxidoreductase subunit beta; Provisional	279
237006	PRK11867	PRK11867	2-oxoglutarate ferredoxin oxidoreductase subunit beta; Reviewed	286
183349	PRK11869	PRK11869	2-oxoacid ferredoxin oxidoreductase subunit beta; Provisional	280
183350	PRK11872	antC	anthranilate 1,2-dioxygenase electron transfer component AntC. 	340
237007	PRK11873	arsM	arsenite methyltransferase. 	272
183352	PRK11874	petL	cytochrome b6-f complex subunit PetL; Reviewed	30
183353	PRK11875	psbT	photosystem II reaction center protein T; Reviewed	31
183354	PRK11876	petM	cytochrome b6-f complex subunit PetM; Reviewed	32
183355	PRK11877	psaI	photosystem I reaction center subunit VIII; Reviewed	38
183356	PRK11878	psaM	photosystem I reaction center subunit XII; Reviewed	34
237008	PRK11880	PRK11880	pyrroline-5-carboxylate reductase; Reviewed	267
237009	PRK11883	PRK11883	protoporphyrinogen oxidase; Reviewed	451
237010	PRK11886	PRK11886	bifunctional biotin--[acetyl-CoA-carboxylase] ligase/biotin operon repressor BirA. 	319
183360	PRK11889	flhF	flagellar biosynthesis protein FlhF. 	436
183361	PRK11890	PRK11890	phosphate acetyltransferase; Provisional	312
183362	PRK11891	PRK11891	aspartate carbamoyltransferase; Provisional	429
237011	PRK11892	PRK11892	pyruvate dehydrogenase subunit beta; Provisional	464
237012	PRK11893	PRK11893	methionyl-tRNA synthetase; Reviewed	511
183365	PRK11895	ilvH	acetolactate synthase 3 regulatory subunit; Reviewed	161
237013	PRK11898	PRK11898	prephenate dehydratase; Provisional	283
237014	PRK11899	PRK11899	prephenate dehydratase; Provisional	279
237015	PRK11901	PRK11901	hypothetical protein; Reviewed	327
183369	PRK11902	ampG	muropeptide transporter; Reviewed	402
237016	PRK11903	PRK11903	3,4-dehydroadipyl-CoA semialdehyde dehydrogenase. 	521
237017	PRK11904	PRK11904	bifunctional proline dehydrogenase/L-glutamate gamma-semialdehyde dehydrogenase PutA. 	1038
237018	PRK11905	PRK11905	bifunctional proline dehydrogenase/pyrroline-5-carboxylate dehydrogenase; Reviewed	1208
183373	PRK11906	PRK11906	HilA family transcriptional regulator YgeH. 	458
237019	PRK11907	PRK11907	bifunctional 2',3'-cyclic-nucleotide 2'-phosphodiesterase/3'-nucleotidase. 	814
183375	PRK11908	PRK11908	bifunctional UDP-4-keto-pentose/UDP-xylose synthase. 	347
183376	PRK11909	PRK11909	cobalt transporter CbiM. 	230
183377	PRK11910	PRK11910	amidase; Provisional	615
138812	PRK11911	flgD	flagellar basal body rod modification protein; Provisional	140
237020	PRK11913	phhA	phenylalanine 4-monooxygenase; Reviewed	275
237021	PRK11914	PRK11914	diacylglycerol kinase; Reviewed	306
237022	PRK11915	PRK11915	lysophospholipid acyltransferase. 	621
183380	PRK11916	PRK11916	electron transfer flavoprotein subunit alpha. 	312
183381	PRK11917	PRK11917	bifunctional adhesin/ABC transporter aspartate/glutamate-binding protein; Reviewed	259
171344	PRK11920	rirA	iron-responsive transcriptional regulator RirA. 	153
237023	PRK11921	PRK11921	anaerobic nitric oxide reductase flavorubredoxin. 	394
237024	PRK11922	PRK11922	RNA polymerase sigma factor; Provisional	231
171347	PRK11923	algU	RNA polymerase sigma factor AlgU; Provisional	193
183384	PRK11924	PRK11924	RNA polymerase sigma factor; Provisional	179
237025	PRK11929	PRK11929	bifunctional UDP-N-acetylmuramoyl-L-alanyl-D-glutamate--2,6-diaminopimelate ligase MurE/UDP-N-acetylmuramoyl-tripeptide--D-alanyl-D-alanine ligase MurF. 	958
237026	PRK11930	PRK11930	putative bifunctional UDP-N-acetylmuramoyl-tripeptide:D-alanyl-D-alanine ligase/alanine racemase; Provisional	822
183387	PRK11933	yebU	rRNA (cytosine-C(5)-)-methyltransferase RsmF; Reviewed	470
237027	PRK12266	glpD	glycerol-3-phosphate dehydrogenase; Reviewed	508
237028	PRK12267	PRK12267	methionyl-tRNA synthetase; Reviewed	648
237029	PRK12268	PRK12268	methionyl-tRNA synthetase; Reviewed	556
105491	PRK12269	PRK12269	bifunctional cytidylate kinase/ribosomal protein S1; Provisional	863
237030	PRK12270	kgd	multifunctional oxoglutarate decarboxylase/oxoglutarate dehydrogenase thiamine pyrophosphate-binding subunit/dihydrolipoyllysine-residue succinyltransferase subunit. 	1228
183392	PRK12271	rps10p	30S ribosomal protein S10P; Reviewed	102
237031	PRK12273	aspA	aspartate ammonia-lyase; Provisional	472
237032	PRK12274	PRK12274	serine/threonine protein kinase; Provisional	218
183395	PRK12275	PRK12275	hypothetical protein; Reviewed	116
237033	PRK12276	PRK12276	putative heme peroxidase; Provisional	248
183397	PRK12277	PRK12277	50S ribosomal protein L13e; Provisional	83
237034	PRK12278	PRK12278	50S ribosomal protein L21. 	221
138835	PRK12279	PRK12279	50S ribosomal protein L22/unknown domain fusion protein; Provisional	311
237035	PRK12280	rplW	50S ribosomal protein L23; Reviewed	158
183399	PRK12281	rplX	50S ribosomal protein L24; Reviewed	76
183400	PRK12282	PRK12282	tryptophanyl-tRNA synthetase II; Reviewed	333
183401	PRK12283	PRK12283	tryptophanyl-tRNA synthetase; Reviewed	398
237036	PRK12284	PRK12284	tryptophanyl-tRNA synthetase; Reviewed	431
237037	PRK12285	PRK12285	tryptophanyl-tRNA synthetase; Reviewed	368
237038	PRK12286	rpmF	50S ribosomal protein L32; Reviewed	57
183405	PRK12287	tqsA	pheromone autoinducer 2 transporter; Reviewed	344
237039	PRK12288	PRK12288	small ribosomal subunit biogenesis GTPase RsgA. 	347
237040	PRK12289	PRK12289	small ribosomal subunit biogenesis GTPase RsgA. 	352
237041	PRK12290	thiE	thiamine phosphate synthase. 	437
237042	PRK12291	PRK12291	apolipoprotein N-acyltransferase; Reviewed	418
237043	PRK12292	hisZ	ATP phosphoribosyltransferase regulatory subunit; Provisional	391
183411	PRK12293	hisZ	ATP phosphoribosyltransferase regulatory subunit; Provisional	281
237044	PRK12294	hisZ	ATP phosphoribosyltransferase regulatory subunit; Provisional	272
183413	PRK12295	hisZ	ATP phosphoribosyltransferase regulatory subunit; Provisional	373
237045	PRK12296	obgE	GTPase CgtA; Reviewed	500
237046	PRK12297	obgE	GTPase CgtA; Reviewed	424
237047	PRK12298	obgE	GTPase CgtA; Reviewed	390
237048	PRK12299	obgE	GTPase CgtA; Reviewed	335
237049	PRK12300	leuS	leucyl-tRNA synthetase; Reviewed	897
183419	PRK12301	bssS	biofilm formation regulator BssS. 	84
183420	PRK12302	bssR	biofilm formation regulator BssR. 	127
183421	PRK12303	PRK12303	tumor necrosis factor alpha-inducing protein; Reviewed	192
237050	PRK12305	thrS	threonyl-tRNA synthetase; Reviewed	575
183423	PRK12306	uvrC	excinuclease ABC subunit C; Reviewed	519
237051	PRK12307	PRK12307	MFS transporter. 	426
183425	PRK12308	PRK12308	argininosuccinate lyase. 	614
183426	PRK12309	PRK12309	transaldolase. 	391
183427	PRK12310	PRK12310	hydroxylamine reductase; Provisional	433
183428	PRK12311	rpsB	30S ribosomal protein S2. 	326
237052	PRK12313	PRK12313	1,4-alpha-glucan branching protein GlgB. 	633
183430	PRK12314	PRK12314	gamma-glutamyl kinase; Provisional	266
237053	PRK12315	PRK12315	1-deoxy-D-xylulose-5-phosphate synthase; Provisional	581
237054	PRK12316	PRK12316	peptide synthase; Provisional	5163
237055	PRK12317	PRK12317	elongation factor 1-alpha; Reviewed	425
183434	PRK12318	PRK12318	methionyl aminopeptidase. 	291
183435	PRK12319	PRK12319	acetyl-CoA carboxylase subunit alpha; Provisional	256
138873	PRK12320	PRK12320	hypothetical protein; Provisional	699
237056	PRK12321	cobN	cobaltochelatase subunit CobN; Reviewed	1100
183437	PRK12322	PRK12322	NADH-quinone oxidoreductase subunit D. 	366
237057	PRK12323	PRK12323	DNA polymerase III subunit gamma/tau. 	700
237058	PRK12324	PRK12324	decaprenyl-phosphate phosphoribosyltransferase. 	295
237059	PRK12325	PRK12325	prolyl-tRNA synthetase; Provisional	439
237060	PRK12326	PRK12326	preprotein translocase subunit SecA; Reviewed	764
237061	PRK12327	nusA	transcription elongation factor NusA; Provisional	362
237062	PRK12328	nusA	transcription termination/antitermination protein NusA. 	374
237063	PRK12329	nusA	transcription termination factor NusA. 	449
183445	PRK12330	PRK12330	methylmalonyl-CoA carboxytransferase subunit 5S. 	499
183446	PRK12331	PRK12331	oxaloacetate decarboxylase subunit alpha. 	448
183447	PRK12332	tsf	elongation factor Ts; Reviewed	198
237064	PRK12333	PRK12333	nucleoside triphosphate pyrophosphohydrolase; Reviewed	204
237065	PRK12334	PRK12334	nucleoside triphosphate pyrophosphohydrolase; Reviewed	277
183450	PRK12335	PRK12335	tellurite resistance protein TehB; Provisional	287
183451	PRK12336	PRK12336	translation initiation factor IF-2 subunit beta; Provisional	201
183452	PRK12337	PRK12337	2-phosphoglycerate kinase; Provisional	475
237066	PRK12338	PRK12338	hypothetical protein; Provisional	319
105560	PRK12339	PRK12339	mevalonate-3-phosphate 5-kinase. 	197
183454	PRK12341	PRK12341	acyl-CoA dehydrogenase. 	381
183455	PRK12342	PRK12342	electron transfer flavoprotein. 	254
237067	PRK12343	PRK12343	cyclic pyranopterin monophosphate synthase MoaC. 	151
237068	PRK12344	PRK12344	putative alpha-isopropylmalate/homocitrate synthase family transferase; Provisional	524
183458	PRK12346	PRK12346	transaldolase A; Provisional	316
183459	PRK12347	sgbE	L-ribulose-5-phosphate 4-epimerase; Reviewed	231
183460	PRK12348	sgaE	L-ribulose-5-phosphate 4-epimerase; Reviewed	228
237069	PRK12349	PRK12349	citrate synthase. 	369
237070	PRK12350	PRK12350	citrate synthase 2; Provisional	353
183463	PRK12351	PRK12351	methylcitrate synthase; Provisional	378
183464	PRK12352	PRK12352	putative carbamate kinase; Reviewed	316
237071	PRK12353	PRK12353	putative amino acid kinase; Reviewed	314
183466	PRK12354	PRK12354	carbamate kinase; Reviewed	307
237072	PRK12355	PRK12355	type-F conjugative transfer system mating-pair stabilization protein TraN. 	558
237073	PRK12356	PRK12356	glutaminase; Reviewed	319
237074	PRK12357	PRK12357	glutaminase; Reviewed	326
183470	PRK12358	PRK12358	glucosamine-6-phosphate deaminase. 	239
183471	PRK12359	PRK12359	flavodoxin FldB; Provisional	172
237075	PRK12360	PRK12360	4-hydroxy-3-methylbut-2-enyl diphosphate reductase; Provisional	281
183473	PRK12361	PRK12361	hypothetical protein; Provisional	547
237076	PRK12362	PRK12362	germination protease; Provisional	318
171438	PRK12363	PRK12363	phosphoglycerol transferase I; Provisional	703
237077	PRK12364	PRK12364	ribonucleoside-diphosphate reductase subunit alpha. 	842
171440	PRK12365	PRK12365	ribonucleoside-diphosphate reductase subunit alpha. 	1046
237078	PRK12366	PRK12366	replication factor A; Reviewed	637
237079	PRK12367	PRK12367	short chain dehydrogenase; Provisional	245
171443	PRK12369	PRK12369	putative transporter; Reviewed	326
237080	PRK12370	PRK12370	HilA/EilA family virulence transcriptional regulator. 	553
171444	PRK12371	PRK12371	ribonuclease III; Reviewed	235
237081	PRK12372	PRK12372	ribonuclease III; Reviewed	413
237082	PRK12373	PRK12373	NADH-quinone oxidoreductase subunit E. 	400
237083	PRK12374	PRK12374	ATP-dependent dethiobiotin synthetase BioD. 	231
183481	PRK12376	PRK12376	putative transaldolase; Provisional	236
183482	PRK12377	PRK12377	putative replication protein; Provisional	248
237084	PRK12378	PRK12378	YebC/PmpR family DNA-binding transcriptional regulator. 	235
183484	PRK12379	PRK12379	propionate kinase. 	396
183485	PRK12380	PRK12380	hydrogenase/urease nickel incorporation protein. 	113
183486	PRK12381	PRK12381	bifunctional succinylornithine transaminase/acetylornithine transaminase; Provisional	406
183487	PRK12382	PRK12382	putative transporter; Provisional	392
237085	PRK12383	PRK12383	putative mutase; Provisional	406
183489	PRK12384	PRK12384	sorbitol-6-phosphate dehydrogenase; Provisional	259
183490	PRK12385	PRK12385	succinate dehydrogenase/fumarate reductase iron-sulfur subunit. 	244
237086	PRK12386	PRK12386	fumarate reductase iron-sulfur subunit; Provisional	251
183492	PRK12387	PRK12387	formate hydrogenlyase complex iron-sulfur subunit; Provisional	180
171459	PRK12388	PRK12388	class II fructose-bisphosphatase. 	321
183493	PRK12389	PRK12389	glutamate-1-semialdehyde aminotransferase; Provisional	428
183494	PRK12390	PRK12390	1-aminocyclopropane-1-carboxylate deaminase; Provisional	337
237087	PRK12391	PRK12391	TrpB-like pyridoxal phosphate-dependent enzyme. 	427
171463	PRK12392	PRK12392	bacteriochlorophyll c synthase; Provisional	331
237088	PRK12393	PRK12393	amidohydrolase; Provisional	457
183497	PRK12394	PRK12394	metallo-dependent hydrolase. 	379
183498	PRK12395	PRK12395	maltoporin; Provisional	419
183499	PRK12396	PRK12396	5-methylribose kinase; Reviewed	409
183500	PRK12397	PRK12397	acetate/propionate family kinase. 	404
237089	PRK12398	PRK12398	pyruvoyl-dependent arginine decarboxylase; Provisional	162
183502	PRK12399	PRK12399	tagatose 1,6-diphosphate aldolase; Reviewed	324
171470	PRK12400	PRK12400	D-amino acid aminotransferase; Reviewed	290
237090	PRK12402	PRK12402	replication factor C small subunit 2; Reviewed	337
171472	PRK12403	PRK12403	aspartate aminotransferase family protein. 	460
183504	PRK12404	PRK12404	stage V sporulation protein AD; Provisional	334
237091	PRK12405	PRK12405	electron transport complex RsxE subunit; Provisional	231
183506	PRK12406	PRK12406	long-chain-fatty-acid--CoA ligase; Provisional	509
183507	PRK12407	flgH	flagellar basal body L-ring protein FlgH. 	221
237092	PRK12408	PRK12408	glucokinase; Provisional	336
237093	PRK12409	PRK12409	D-amino acid dehydrogenase small subunit; Provisional	410
237094	PRK12410	PRK12410	glutamylglutaminyl-tRNA synthetase; Provisional	433
183511	PRK12411	PRK12411	cytidine deaminase; Provisional	132
183512	PRK12412	PRK12412	bifunctional hydroxymethylpyrimidine kinase/phosphomethylpyrimidine kinase. 	268
183513	PRK12413	PRK12413	bifunctional hydroxymethylpyrimidine kinase/phosphomethylpyrimidine kinase. 	253
183514	PRK12414	PRK12414	putative aminotransferase; Provisional	384
183515	PRK12415	PRK12415	fructose-bisphosphatase class II. 	322
183516	PRK12416	PRK12416	protoporphyrinogen oxidase; Provisional	463
237095	PRK12417	secY	preprotein translocase subunit SecY; Reviewed	404
183518	PRK12418	PRK12418	cysteinyl-tRNA synthetase; Provisional	384
237096	PRK12419	PRK12419	6,7-dimethyl-8-ribityllumazine synthase. 	158
237097	PRK12420	PRK12420	histidyl-tRNA synthetase; Provisional	423
237098	PRK12421	PRK12421	ATP phosphoribosyltransferase regulatory subunit; Provisional	392
183521	PRK12422	PRK12422	chromosomal replication initiator protein DnaA. 	445
171489	PRK12423	PRK12423	LexA repressor; Provisional	202
171490	PRK12425	PRK12425	class II fumarate hydratase. 	464
183522	PRK12426	PRK12426	elongation factor P; Provisional	185
183523	PRK12427	PRK12427	FliA/WhiG family RNA polymerase sigma factor. 	231
237099	PRK12428	PRK12428	coniferyl-alcohol dehydrogenase. 	241
237100	PRK12429	PRK12429	3-hydroxybutyrate dehydrogenase; Provisional	258
237101	PRK12430	PRK12430	putative bifunctional flagellar biosynthesis protein FliO/FliP; Provisional	379
171495	PRK12434	PRK12434	tRNA pseudouridine(38-40) synthase TruA. 	245
183526	PRK12435	PRK12435	ferrochelatase; Provisional	311
171497	PRK12436	PRK12436	UDP-N-acetylmuramate dehydrogenase. 	305
183527	PRK12437	PRK12437	prolipoprotein diacylglyceryl transferase; Reviewed	269
171499	PRK12438	PRK12438	hypothetical protein; Provisional	991
171500	PRK12439	PRK12439	NAD(P)H-dependent glycerol-3-phosphate dehydrogenase; Provisional	341
183528	PRK12440	PRK12440	acetate/propionate family kinase. 	397
237102	PRK12442	PRK12442	translation initiation factor IF-1; Reviewed	87
183530	PRK12444	PRK12444	threonyl-tRNA synthetase; Reviewed	639
171504	PRK12445	PRK12445	lysyl-tRNA synthetase; Reviewed	505
171505	PRK12446	PRK12446	undecaprenyldiphospho-muramoylpentapeptide beta-N-acetylglucosaminyltransferase; Reviewed	352
237103	PRK12447	PRK12447	histidinol dehydrogenase; Reviewed	426
237104	PRK12448	PRK12448	dihydroxy-acid dehydratase; Provisional	615
183533	PRK12449	PRK12449	acyl carrier protein; Provisional	80
138982	PRK12450	PRK12450	foldase protein PrsA; Reviewed	309
183534	PRK12451	PRK12451	arginyl-tRNA synthetase; Reviewed	562
171510	PRK12452	PRK12452	cardiolipin synthase. 	509
183535	PRK12454	PRK12454	carbamate kinase-like carbamoyl phosphate synthetase; Reviewed	313
84141	PRK12456	PRK12456	Na(+)-translocating NADH-quinone reductase subunit E; Provisional	199
237105	PRK12457	PRK12457	3-deoxy-8-phosphooctulonate synthase. 	281
183536	PRK12458	PRK12458	glutathione synthetase; Provisional	338
237106	PRK12459	PRK12459	S-adenosylmethionine synthetase; Provisional	386
183538	PRK12460	PRK12460	2-keto-3-deoxygluconate permease; Provisional	312
183539	PRK12461	PRK12461	UDP-N-acetylglucosamine acyltransferase; Provisional	255
183540	PRK12462	PRK12462	phosphoserine aminotransferase; Provisional	364
171518	PRK12463	PRK12463	chorismate synthase; Reviewed	390
237107	PRK12464	PRK12464	1-deoxy-D-xylulose 5-phosphate reductoisomerase; Provisional	383
183542	PRK12465	PRK12465	xylose isomerase; Provisional	445
183543	PRK12466	PRK12466	3-isopropylmalate dehydratase large subunit. 	471
237108	PRK12467	PRK12467	peptide synthase; Provisional	3956
171522	PRK12468	flhB	flagellar biosynthesis protein FlhB; Reviewed	386
237109	PRK12469	PRK12469	RNA polymerase factor sigma-54; Provisional	481
171524	PRK12470	PRK12470	amidase; Provisional	462
237110	PRK12472	PRK12472	hypothetical protein; Provisional	508
183546	PRK12473	PRK12473	hypothetical protein; Provisional	198
139002	PRK12474	PRK12474	hypothetical protein; Provisional	518
183547	PRK12475	PRK12475	thiamine/molybdopterin biosynthesis MoeB-like protein; Provisional	338
171527	PRK12476	PRK12476	putative fatty-acid--CoA ligase; Provisional	612
183548	PRK12478	PRK12478	crotonase/enoyl-CoA hydratase family protein. 	298
183549	PRK12479	PRK12479	branched-chain-amino-acid transaminase. 	299
183550	PRK12480	PRK12480	D-lactate dehydrogenase; Provisional	330
171531	PRK12481	PRK12481	2-dehydro-3-deoxy-D-gluconate 5-dehydrogenase KduD. 	251
171532	PRK12482	PRK12482	flagellar motor stator protein MotA. 	287
237111	PRK12483	PRK12483	threonine dehydratase; Reviewed	521
237112	PRK12484	PRK12484	nicotinate phosphoribosyltransferase; Provisional	443
171535	PRK12485	PRK12485	bifunctional 3,4-dihydroxy-2-butanone-4-phosphate synthase/GTP cyclohydrolase II. 	369
237113	PRK12486	dmdA	dimethylsulfoniopropionate demethylase. 	368
183553	PRK12487	PRK12487	putative 4-hydroxy-4-methyl-2-oxoglutarate aldolase. 	163
237114	PRK12488	PRK12488	cation/acetate symporter. 	549
237115	PRK12489	PRK12489	anaerobic C4-dicarboxylate transporter; Reviewed	443
237116	PRK12490	PRK12490	6-phosphogluconate dehydrogenase-like protein; Reviewed	299
105695	PRK12491	PRK12491	pyrroline-5-carboxylate reductase; Reviewed	272
171539	PRK12492	PRK12492	long-chain-fatty-acid--CoA ligase; Provisional	562
237117	PRK12493	PRK12493	magnesium chelatase subunit H; Provisional	1310
183557	PRK12494	PRK12494	NAD(P)H-quinone oxidoreductase subunit J. 	172
183558	PRK12495	PRK12495	hypothetical protein; Provisional	226
237118	PRK12496	PRK12496	hypothetical protein; Provisional	164
237119	PRK12497	PRK12497	YraN family protein. 	119
183561	PRK12504	PRK12504	DUF4040 domain-containing protein. 	178
237120	PRK12505	PRK12505	putative monovalent cation/H+ antiporter subunit B; Reviewed	159
237121	PRK12507	PRK12507	putative monovalent cation/H+ antiporter subunit B; Reviewed	332
183563	PRK12508	PRK12508	Na(+)/H(+) antiporter subunit B. 	139
237122	PRK12509	PRK12509	Na+/H+ antiporter subunit B. 	137
237123	PRK12511	PRK12511	RNA polymerase sigma factor; Provisional	182
171551	PRK12512	PRK12512	RNA polymerase sigma factor; Provisional	184
183566	PRK12513	PRK12513	RNA polymerase sigma factor; Provisional	194
105710	PRK12514	PRK12514	RNA polymerase sigma factor; Provisional	179
183567	PRK12515	PRK12515	RNA polymerase sigma factor; Provisional	189
183568	PRK12516	PRK12516	RNA polymerase sigma factor; Provisional	187
183569	PRK12517	PRK12517	RNA polymerase sigma factor; Provisional	188
237124	PRK12518	PRK12518	RNA polymerase sigma factor; Provisional	175
237125	PRK12519	PRK12519	RNA polymerase sigma factor; Provisional	194
237126	PRK12520	PRK12520	RNA polymerase sigma factor; Provisional	191
183571	PRK12522	PRK12522	RNA polymerase sigma factor; Provisional	173
183572	PRK12523	PRK12523	RNA polymerase sigma factor; Reviewed	172
183573	PRK12524	PRK12524	RNA polymerase sigma factor; Provisional	196
139037	PRK12525	PRK12525	RNA polymerase sigma factor; Provisional	168
237127	PRK12526	PRK12526	RNA polymerase sigma factor; Provisional	206
171560	PRK12527	PRK12527	RNA polymerase sigma factor; Reviewed	159
171561	PRK12528	PRK12528	RNA polymerase sigma factor; Provisional	161
183574	PRK12529	PRK12529	RNA polymerase sigma factor; Provisional	178
237128	PRK12530	PRK12530	RNA polymerase sigma factor; Provisional	189
105726	PRK12531	PRK12531	RNA polymerase sigma factor; Provisional	194
171564	PRK12532	PRK12532	RNA polymerase sigma factor; Provisional	195
237129	PRK12533	PRK12533	RNA polymerase sigma factor; Provisional	216
183576	PRK12534	PRK12534	RNA polymerase sigma factor; Provisional	187
237130	PRK12535	PRK12535	RNA polymerase sigma factor; Provisional	196
237131	PRK12536	PRK12536	RNA polymerase sigma factor; Provisional	181
171568	PRK12537	PRK12537	RNA polymerase sigma factor; Provisional	182
139048	PRK12538	PRK12538	RNA polymerase sigma factor; Provisional	233
237132	PRK12539	PRK12539	RNA polymerase sigma factor SigF. 	184
183579	PRK12540	PRK12540	RNA polymerase sigma factor; Provisional	182
183580	PRK12541	PRK12541	RNA polymerase sigma factor; Provisional	161
183581	PRK12542	PRK12542	RNA polymerase sigma factor; Provisional	185
183582	PRK12543	PRK12543	RNA polymerase sigma factor; Provisional	179
183583	PRK12544	PRK12544	RNA polymerase factor sigma-70. 	206
183584	PRK12545	PRK12545	RNA polymerase factor sigma-70. 	201
139055	PRK12546	PRK12546	RNA polymerase sigma factor; Provisional	188
139056	PRK12547	PRK12547	RNA polymerase sigma factor; Provisional	164
183585	PRK12548	PRK12548	shikimate dehydrogenase. 	289
183586	PRK12549	PRK12549	shikimate 5-dehydrogenase; Reviewed	284
183587	PRK12550	PRK12550	shikimate 5-dehydrogenase; Reviewed	272
139060	PRK12551	PRK12551	ATP-dependent Clp protease proteolytic subunit; Reviewed	196
183588	PRK12552	PRK12552	ATP-dependent Clp protease proteolytic subunit. 	222
237133	PRK12553	PRK12553	ATP-dependent Clp protease proteolytic subunit; Reviewed	207
237134	PRK12554	PRK12554	undecaprenyl pyrophosphate phosphatase; Reviewed	276
237135	PRK12555	PRK12555	chemotaxis-specific protein-glutamate methyltransferase CheB. 	337
183592	PRK12556	PRK12556	tryptophanyl-tRNA synthetase; Provisional	332
237136	PRK12557	PRK12557	H(2)-dependent methylenetetrahydromethanopterin dehydrogenase-related protein; Provisional	342
183594	PRK12558	PRK12558	glutamyl-tRNA synthetase; Provisional	445
79035	PRK12559	PRK12559	transcriptional regulator Spx; Provisional	131
183595	PRK12560	PRK12560	adenine phosphoribosyltransferase; Provisional	187
237137	PRK12561	PRK12561	NAD(P)H-quinone oxidoreductase subunit 4; Provisional	504
105755	PRK12562	PRK12562	ornithine carbamoyltransferase subunit F; Provisional	334
237138	PRK12563	PRK12563	sulfate adenylyltransferase subunit CysD. 	312
237139	PRK12564	PRK12564	carbamoyl-phosphate synthase small subunit. 	360
171585	PRK12566	PRK12566	glycine dehydrogenase; Provisional	954
237140	PRK12567	PRK12567	putative monovalent cation/H+ antiporter subunit B; Reviewed	218
139075	PRK12568	PRK12568	glycogen branching enzyme; Provisional	730
237141	PRK12569	PRK12569	hypothetical protein; Provisional	245
237142	PRK12570	PRK12570	N-acetylmuramic acid-6-phosphate etherase; Reviewed	296
183601	PRK12571	PRK12571	1-deoxy-D-xylulose-5-phosphate synthase; Provisional	641
183602	PRK12573	PRK12573	Na(+)/H(+) antiporter subunit B. 	140
183603	PRK12574	PRK12574	monovalent cation/H+ antiporter subunit B. 	141
171592	PRK12575	PRK12575	succinate dehydrogenase/fumarate reductase iron-sulfur subunit. 	235
237143	PRK12576	PRK12576	succinate dehydrogenase/fumarate reductase iron-sulfur subunit. 	279
183605	PRK12577	PRK12577	succinate dehydrogenase/fumarate reductase iron-sulfur subunit. 	329
183606	PRK12578	PRK12578	thiolase domain-containing protein. 	385
183607	PRK12579	PRK12579	putative monovalent cation/H+ antiporter subunit B; Reviewed	258
79055	PRK12580	PRK12580	omptin family plasminogen activator Pla. 	312
79056	PRK12581	PRK12581	oxaloacetate decarboxylase; Provisional	468
237144	PRK12582	PRK12582	acyl-CoA synthetase; Provisional	624
237145	PRK12583	PRK12583	acyl-CoA synthetase; Provisional	558
237146	PRK12584	PRK12584	flagellin A; Reviewed	510
183610	PRK12585	PRK12585	putative monovalent cation/H+ antiporter subunit G; Reviewed	197
237147	PRK12586	PRK12586	Na+/H+ antiporter subunit G. 	145
183612	PRK12587	PRK12587	putative monovalent cation/H+ antiporter subunit G; Reviewed	118
183613	PRK12592	PRK12592	putative monovalent cation/H+ antiporter subunit G; Reviewed	126
183614	PRK12595	PRK12595	bifunctional 3-deoxy-7-phosphoheptulonate synthase/chorismate mutase; Reviewed	360
105779	PRK12596	PRK12596	putative monovalent cation/H+ antiporter subunit E; Reviewed	171
183615	PRK12597	PRK12597	F0F1 ATP synthase subunit beta; Provisional	461
237148	PRK12599	PRK12599	putative monovalent cation/H+ antiporter subunit F; Reviewed	91
183617	PRK12600	PRK12600	Na(+)/H(+) antiporter subunit F1. 	94
183618	PRK12603	PRK12603	putative monovalent cation/H+ antiporter subunit F; Reviewed	86
183619	PRK12604	PRK12604	putative monovalent cation/H+ antiporter subunit F; Reviewed	84
237149	PRK12606	PRK12606	GTP cyclohydrolase I; Reviewed	201
183621	PRK12607	PRK12607	phosphoribosylaminoimidazole-succinocarboxamide synthase; Provisional	313
237150	PRK12608	PRK12608	transcription termination factor Rho; Provisional	380
237151	PRK12612	PRK12612	putative monovalent cation/H+ antiporter subunit F; Reviewed	87
171609	PRK12613	PRK12613	galactose-6-phosphate isomerase subunit LacA; Provisional	141
171610	PRK12615	PRK12615	galactose-6-phosphate isomerase subunit LacB; Reviewed	171
183624	PRK12616	PRK12616	bifunctional hydroxymethylpyrimidine kinase/phosphomethylpyrimidine kinase. 	270
183625	PRK12617	flgA	flagellar basal body P-ring formation protein FlgA. 	214
183626	PRK12618	flgA	flagellar basal body P-ring formation protein FlgA. 	141
183627	PRK12619	flgB	flagellar basal body rod protein FlgB; Provisional	130
183628	PRK12620	flgB	flagellar basal body rod protein FlgB; Provisional	132
171613	PRK12621	flgB	flagellar basal body rod protein FlgB; Provisional	136
183629	PRK12622	flgB	flagellar basal body rod protein FlgB; Provisional	135
183630	PRK12623	flgB	flagellar basal body rod protein FlgB; Provisional	131
139107	PRK12624	flgB	flagellar basal body rod protein FlgB; Provisional	143
183631	PRK12625	flgB	flagellar basal body rod protein FlgB; Provisional	132
183632	PRK12626	flgB	flagellar basal body rod protein FlgB; Provisional	162
237152	PRK12627	flgB	FlgB family protein. 	128
79089	PRK12628	flgC	flagellar basal body rod protein FlgC; Provisional	140
183634	PRK12629	flgC	flagellar basal body rod protein FlgC; Provisional	135
183635	PRK12630	flgC	flagellar basal body rod protein FlgC; Provisional	143
183636	PRK12631	flgC	flagellar basal body rod protein FlgC; Provisional	138
183637	PRK12632	flgC	flagellar basal body rod protein FlgC; Provisional	130
183638	PRK12633	flgD	flagellar basal body rod modification protein; Provisional	230
183639	PRK12634	flgD	flagellar basal body rod modification protein; Reviewed	221
183640	PRK12636	flgG	flagellar basal body rod protein FlgG; Provisional	263
183641	PRK12637	flgE	flagellar hook protein FlgE; Provisional	473
183642	PRK12640	flgF	flagellar basal body rod protein FlgF; Reviewed	246
105809	PRK12641	flgF	flagellar basal-body rod protein FlgF. 	252
237153	PRK12642	flgF	flagellar basal-body rod protein FlgF. 	241
139117	PRK12643	flgF	flagellar basal body rod protein FlgF; Reviewed	209
237154	PRK12644	PRK12644	putative monovalent cation/H+ antiporter subunit A; Reviewed	965
237155	PRK12645	PRK12645	monovalent cation/H+ antiporter subunit A; Reviewed	800
183646	PRK12646	PRK12646	DUF4040 family protein. 	800
237156	PRK12647	PRK12647	putative monovalent cation/H+ antiporter subunit A; Reviewed	761
237157	PRK12648	PRK12648	putative monovalent cation/H+ antiporter subunit A; Reviewed	948
183649	PRK12649	PRK12649	putative monovalent cation/H+ antiporter subunit A; Reviewed	789
237158	PRK12650	PRK12650	DUF4040 family protein. 	962
237159	PRK12651	PRK12651	Na+/H+ antiporter subunit E. 	158
237160	PRK12652	PRK12652	monovalent cation/H+ antiporter subunit E. 	357
183653	PRK12653	PRK12653	fructose-6-phosphate aldolase; Reviewed	220
237161	PRK12654	PRK12654	monovalent cation/H+ antiporter subunit E. 	151
183655	PRK12655	PRK12655	fructose-6-phosphate aldolase; Reviewed	220
183656	PRK12656	PRK12656	fructose-6-phosphate aldolase; Reviewed	222
183657	PRK12657	PRK12657	putative monovalent cation/H+ antiporter subunit F; Reviewed	100
183658	PRK12658	PRK12658	Na+/H+ antiporter subunit C. 	125
183659	PRK12659	PRK12659	Na+/H+ antiporter subunit C. 	117
183660	PRK12660	PRK12660	putative monovalent cation/H+ antiporter subunit C; Reviewed	114
237162	PRK12661	PRK12661	putative monovalent cation/H+ antiporter subunit C; Reviewed	140
183662	PRK12662	PRK12662	putative monovalent cation/H+ antiporter subunit D; Reviewed	492
237163	PRK12663	PRK12663	Na+/H+ antiporter subunit D. 	497
237164	PRK12664	PRK12664	putative monovalent cation/H+ antiporter subunit D; Reviewed	527
237165	PRK12665	PRK12665	putative monovalent cation/H+ antiporter subunit D; Reviewed	521
237166	PRK12666	PRK12666	putative monovalent cation/H+ antiporter subunit D; Reviewed	528
237167	PRK12667	PRK12667	putative monovalent cation/H+ antiporter subunit D; Reviewed	520
237168	PRK12668	PRK12668	Na(+)/H(+) antiporter subunit D. 	581
183669	PRK12670	PRK12670	putative monovalent cation/H+ antiporter subunit G; Reviewed	99
183670	PRK12671	PRK12671	putative monovalent cation/H+ antiporter subunit G; Reviewed	120
183671	PRK12672	PRK12672	putative monovalent cation/H+ antiporter subunit G; Reviewed	118
237169	PRK12674	PRK12674	putative monovalent cation/H+ antiporter subunit G; Reviewed	99
171652	PRK12675	PRK12675	putative monovalent cation/H+ antiporter subunit G; Reviewed	104
183673	PRK12676	PRK12676	bifunctional fructose-bisphosphatase/inositol-phosphate phosphatase. 	263
237170	PRK12677	PRK12677	xylose isomerase; Provisional	384
237171	PRK12678	PRK12678	transcription termination factor Rho; Provisional	672
183676	PRK12679	cbl	HTH-type transcriptional regulator Cbl. 	316
183677	PRK12680	PRK12680	LysR family transcriptional regulator. 	327
183678	PRK12681	cysB	HTH-type transcriptional regulator CysB. 	324
183679	PRK12682	PRK12682	transcriptional regulator CysB-like protein; Reviewed	309
237172	PRK12683	PRK12683	transcriptional regulator CysB-like protein; Reviewed	309
237173	PRK12684	PRK12684	CysB family HTH-type transcriptional regulator. 	313
183682	PRK12685	flgB	flagellar basal body rod protein FlgB; Reviewed	116
183683	PRK12686	PRK12686	carbamate kinase; Reviewed	312
105853	PRK12687	PRK12687	flagellin; Reviewed	311
171664	PRK12688	PRK12688	flagellin; Reviewed	751
183684	PRK12689	flgF	flagellar basal-body rod protein FlgF. 	253
183685	PRK12690	flgF	flagellar hook-basal body complex protein. 	238
183686	PRK12691	flgG	flagellar basal body rod protein FlgG; Reviewed	262
139158	PRK12692	flgG	flagellar basal body rod protein FlgG; Reviewed	262
183687	PRK12693	flgG	flagellar basal body rod protein FlgG; Provisional	261
183688	PRK12694	flgG	flagellar basal body rod protein FlgG; Reviewed	260
237174	PRK12696	flgH	flagellar basal body L-ring protein; Reviewed	236
237175	PRK12697	flgH	flagellar basal body L-ring protein FlgH. 	226
183690	PRK12698	flgH	flagellar basal body L-ring protein FlgH. 	224
105864	PRK12699	flgH	flagellar basal body L-ring protein; Reviewed	246
139164	PRK12700	flgH	flagellar basal body L-ring protein; Reviewed	230
183691	PRK12701	flgH	flagellar basal body L-ring protein; Reviewed	230
105866	PRK12702	PRK12702	mannosyl-3-phosphoglycerate phosphatase; Reviewed	302
237176	PRK12703	PRK12703	tRNA 2'-O-methylase; Reviewed	339
237177	PRK12704	PRK12704	phosphodiesterase; Provisional	520
237178	PRK12705	PRK12705	hypothetical protein; Provisional	508
183694	PRK12706	flgI	flagellar basal body P-ring protein; Provisional	328
139168	PRK12708	flgJ	peptidoglycan hydrolase; Reviewed	134
237179	PRK12709	flgJ	flagellar rod assembly protein/muramidase FlgJ; Provisional	320
139170	PRK12710	flgJ	flagellar rod assembly protein/muramidase FlgJ; Provisional	291
237180	PRK12711	flgJ	flagellar assembly peptidoglycan hydrolase FlgJ. 	392
139172	PRK12712	flgJ	flagellar rod assembly protein/muramidase FlgJ; Provisional	344
139173	PRK12713	flgJ	flagellar rod assembly protein/muramidase FlgJ; Provisional	339
183697	PRK12714	flgK	flagellar hook-associated protein FlgK; Provisional	624
183698	PRK12715	flgK	flagellar hook-associated protein FlgK; Provisional	649
171679	PRK12717	flgL	flagellar hook-associated protein 3. 	523
79176	PRK12718	flgL	flagellar hook-associated protein FlgL; Provisional	510
183699	PRK12720	PRK12720	EscV/YscV/HrcV family type III secretion system export apparatus protein. 	675
183700	PRK12721	PRK12721	EscU/YscU/HrcU family type III secretion system export apparatus switch protein. 	349
237181	PRK12722	PRK12722	flagellar transcriptional regulator FlhC. 	187
183702	PRK12723	PRK12723	flagellar biosynthesis regulator FlhF; Provisional	388
183703	PRK12724	PRK12724	flagellar biosynthesis regulator FlhF; Provisional	432
183704	PRK12726	PRK12726	flagellar biosynthesis regulator FlhF; Provisional	407
237182	PRK12727	PRK12727	flagellar biosynthesis protein FlhF. 	559
237183	PRK12728	fliE	flagellar hook-basal body protein FliE; Provisional	102
183707	PRK12729	fliE	flagellar hook-basal body protein FliE; Provisional	127
183708	PRK12735	PRK12735	elongation factor Tu; Reviewed	396
237184	PRK12736	PRK12736	elongation factor Tu; Reviewed	394
183710	PRK12737	gatY	tagatose-bisphosphate aldolase subunit GatY. 	284
183711	PRK12738	kbaY	tagatose-bisphosphate aldolase subunit KbaY. 	286
237185	PRK12739	PRK12739	elongation factor G; Reviewed	691
237186	PRK12740	PRK12740	elongation factor G-like protein EF-G2. 	668
183714	PRK12742	PRK12742	SDR family oxidoreductase. 	237
237187	PRK12743	PRK12743	SDR family oxidoreductase. 	256
183716	PRK12744	PRK12744	SDR family oxidoreductase. 	257
237188	PRK12745	PRK12745	3-ketoacyl-(acyl-carrier-protein) reductase; Provisional	256
183718	PRK12746	PRK12746	SDR family oxidoreductase. 	254
183719	PRK12747	PRK12747	short chain dehydrogenase; Provisional	252
237189	PRK12748	PRK12748	3-ketoacyl-(acyl-carrier-protein) reductase; Provisional	256
183721	PRK12749	PRK12749	quinate/shikimate dehydrogenase; Reviewed	288
183722	PRK12750	cpxP	periplasmic repressor CpxP; Reviewed	170
171704	PRK12751	cpxP	periplasmic stress adaptor protein CpxP; Reviewed	162
183723	PRK12753	PRK12753	transketolase; Reviewed	663
183724	PRK12754	PRK12754	transketolase; Reviewed	663
237190	PRK12755	PRK12755	phospho-2-dehydro-3-deoxyheptonate aldolase; Provisional	353
183726	PRK12756	PRK12756	Trp-sensitive 3-deoxy-7-phosphoheptulonate synthase AroH. 	348
237191	PRK12757	PRK12757	cell division protein FtsN; Provisional	256
237192	PRK12758	PRK12758	DNA gyrase/topoisomerase IV subunit A. 	869
139206	PRK12759	PRK12759	bifunctional glutaredoxin/ribonucleoside-diphosphate reductase subunit beta; Provisional	410
237193	PRK12764	PRK12764	fumarylacetoacetate hydrolase family protein. 	500
237194	PRK12765	PRK12765	flagellar filament capping protein FliD. 	595
183731	PRK12766	PRK12766	50S ribosomal protein L32e; Provisional	232
237195	PRK12767	PRK12767	carbamoyl phosphate synthase-like protein; Provisional	326
237196	PRK12768	PRK12768	sulfate transporter family protein. 	240
183733	PRK12769	PRK12769	putative oxidoreductase Fe-S binding subunit; Reviewed	654
237197	PRK12770	PRK12770	putative glutamate synthase subunit beta; Provisional	352
237198	PRK12771	PRK12771	putative glutamate synthase (NADPH) small subunit; Provisional	564
237199	PRK12772	PRK12772	fused FliR family export protein/FlhB family type III secretion system protein. 	609
183737	PRK12773	flhB	flagellar biosynthesis protein FlhB; Reviewed	646
183738	PRK12775	PRK12775	putative trifunctional 2-polyprenylphenol hydroxylase/glutamate synthase subunit beta/ferritin domain-containing protein; Provisional	1006
237200	PRK12778	PRK12778	bifunctional dihydroorotate dehydrogenase B NAD binding subunit/NADPH-dependent glutamate synthase. 	752
183740	PRK12779	PRK12779	putative bifunctional glutamate synthase subunit beta/2-polyprenylphenol hydroxylase; Provisional	944
183741	PRK12780	fliR	flagellar biosynthesis protein FliR; Reviewed	251
139219	PRK12781	fliQ	flagellar biosynthetic protein FliQ. 	88
139220	PRK12782	flgC	flagellar basal body rod protein FlgC; Reviewed	138
171720	PRK12783	fliP	flagellar biosynthesis protein FliP; Reviewed	255
183742	PRK12784	PRK12784	hypothetical protein; Provisional	84
183743	PRK12785	fliL	flagellar basal body-associated protein FliL; Reviewed	166
237201	PRK12786	flgA	flagellar basal body P-ring formation protein FlgA. 	338
183745	PRK12787	fliX	flagellar assembly regulator FliX; Reviewed	138
237202	PRK12788	flgH	flagellar basal body L-ring protein FlgH. 	234
183746	PRK12789	flgI	flagellar basal body P-ring protein FlgI. 	367
237203	PRK12790	PRK12790	flagellar rod assembly protein FlgJ. 	115
237204	PRK12791	flbT	flagellar biosynthesis repressor FlbT; Reviewed	131
237205	PRK12792	flhA	flagellar biosynthesis protein FlhA; Reviewed	694
237206	PRK12793	flaF	flagellar biosynthesis regulator FlaF. 	115
237207	PRK12794	flaF	flagellar biosynthesis regulatory protein FlaF; Reviewed	122
237208	PRK12795	fliM	flagellar motor switch protein FliM; Reviewed	388
237209	PRK12796	spaP	EscR/YscR/HrcR family type III secretion system export apparatus protein. 	221
237210	PRK12797	PRK12797	type III secretion system protein YscR; Provisional	213
237211	PRK12798	PRK12798	chemotaxis protein; Reviewed	421
183756	PRK12799	motB	flagellar motor protein MotB; Reviewed	421
183757	PRK12800	fliF	flagellar MS-ring protein; Reviewed	574
139237	PRK12802	PRK12802	flagellin; Provisional	282
183758	PRK12803	PRK12803	flagellin; Provisional	335
183759	PRK12804	PRK12804	flagellin; Provisional	301
183760	PRK12805	PRK12805	FliC/FljB family flagellin. 	287
183761	PRK12806	PRK12806	flagellin; Provisional	475
171737	PRK12807	PRK12807	flagellin; Provisional	287
237212	PRK12808	PRK12808	flagellin; Provisional	476
183762	PRK12809	PRK12809	putative oxidoreductase Fe-S binding subunit; Reviewed	639
237213	PRK12810	gltD	glutamate synthase subunit beta; Reviewed	471
139245	PRK12812	flgD	flagellar basal body rod modification protein; Reviewed	259
237214	PRK12813	flgD	flagellar basal body rod modification protein; Reviewed	223
139246	PRK12814	PRK12814	putative NADPH-dependent glutamate synthase small subunit; Provisional	652
237215	PRK12815	carB	carbamoyl phosphate synthase large subunit; Reviewed	1068
183766	PRK12816	flgG	flagellar basal body rod protein FlgG; Reviewed	264
183767	PRK12817	flgG	flagellar basal body rod protein FlgG; Reviewed	260
183768	PRK12818	flgG	flagellar basal body rod protein FlgG; Reviewed	256
183769	PRK12819	flgG	flagellar basal-body rod protein FlgG. 	257
105955	PRK12820	PRK12820	bifunctional aspartyl-tRNA synthetase/aspartyl/glutamyl-tRNA amidotransferase subunit C; Provisional	706
237216	PRK12821	PRK12821	aspartyl/glutamyl-tRNA amidotransferase subunit C-like protein; Provisional	477
237217	PRK12822	PRK12822	phospho-2-dehydro-3-deoxyheptonate aldolase; Provisional	356
183772	PRK12823	benD	1,6-dihydroxycyclohexa-2,4-diene-1-carboxylate dehydrogenase; Provisional	260
183773	PRK12824	PRK12824	3-oxoacyl-ACP reductase. 	245
237218	PRK12825	fabG	3-ketoacyl-(acyl-carrier-protein) reductase; Provisional	249
183775	PRK12826	PRK12826	SDR family oxidoreductase. 	251
237219	PRK12827	PRK12827	short chain dehydrogenase; Provisional	249
237220	PRK12828	PRK12828	short chain dehydrogenase; Provisional	239
183778	PRK12829	PRK12829	short chain dehydrogenase; Provisional	264
183779	PRK12830	PRK12830	UDP-N-acetylglucosamine 1-carboxyvinyltransferase; Reviewed	417
183780	PRK12831	PRK12831	putative oxidoreductase; Provisional	464
183781	PRK12833	PRK12833	acetyl-CoA carboxylase biotin carboxylase subunit; Provisional	467
183782	PRK12834	PRK12834	putative FAD-binding dehydrogenase; Reviewed	549
237221	PRK12835	PRK12835	3-ketosteroid-delta-1-dehydrogenase; Reviewed	584
237222	PRK12837	PRK12837	FAD-binding protein. 	513
183784	PRK12838	PRK12838	carbamoyl phosphate synthase small subunit; Reviewed	354
237223	PRK12839	PRK12839	FAD-dependent oxidoreductase. 	572
237224	PRK12842	PRK12842	putative succinate dehydrogenase; Reviewed	574
237225	PRK12843	PRK12843	FAD-dependent oxidoreductase. 	578
183787	PRK12844	PRK12844	3-ketosteroid-delta-1-dehydrogenase; Reviewed	557
237226	PRK12845	PRK12845	3-ketosteroid-delta-1-dehydrogenase; Reviewed	564
237227	PRK12846	PRK12846	peptide deformylase; Reviewed	165
237228	PRK12847	ubiA	4-hydroxybenzoate octaprenyltransferase. 	285
237229	PRK12848	ubiA	4-hydroxybenzoate octaprenyltransferase. 	282
237230	PRK12849	groEL	chaperonin GroEL; Reviewed	542
237231	PRK12850	groEL	chaperonin GroEL; Reviewed	544
171770	PRK12851	groEL	chaperonin GroEL; Reviewed	541
237232	PRK12852	groEL	chaperonin GroEL; Reviewed	545
237233	PRK12853	PRK12853	glucose-6-phosphate dehydrogenase. 	482
237234	PRK12854	PRK12854	glucose-6-phosphate 1-dehydrogenase; Provisional	484
171774	PRK12855	PRK12855	hypothetical protein; Provisional	103
105987	PRK12856	PRK12856	hypothetical protein; Provisional	103
237235	PRK12857	PRK12857	class II fructose-1,6-bisphosphate aldolase. 	284
237236	PRK12858	PRK12858	tagatose 1,6-diphosphate aldolase; Reviewed	340
183797	PRK12859	PRK12859	3-ketoacyl-(acyl-carrier-protein) reductase; Provisional	256
237237	PRK12860	PRK12860	flagellar transcriptional regulator FlhC. 	189
183798	PRK12861	PRK12861	malic enzyme; Reviewed	764
183799	PRK12862	PRK12862	malic enzyme; Reviewed	763
183800	PRK12863	PRK12863	YciI-like protein; Reviewed	94
183801	PRK12864	PRK12864	YciI-like protein; Reviewed	89
171782	PRK12865	PRK12865	YciI-like protein; Reviewed	97
237238	PRK12866	PRK12866	YciI-like protein; Reviewed	97
237239	PRK12869	ubiA	protoheme IX farnesyltransferase; Reviewed	279
237240	PRK12870	ubiA	4-hydroxybenzoate octaprenyltransferase. 	290
106000	PRK12871	ubiA	prenyltransferase; Reviewed	297
237241	PRK12872	ubiA	prenyltransferase; Reviewed	285
171787	PRK12873	ubiA	4-hydroxybenzoate polyprenyltransferase. 	294
237242	PRK12874	ubiA	4-hydroxybenzoate polyprenyltransferase. 	291
237243	PRK12875	ubiA	prenyltransferase; Reviewed	282
237244	PRK12876	ubiA	prenyltransferase; Reviewed	300
183808	PRK12878	ubiA	4-hydroxybenzoate octaprenyltransferase. 	314
237245	PRK12879	PRK12879	3-oxoacyl-(acyl carrier protein) synthase III; Reviewed	325
171793	PRK12880	PRK12880	beta-ketoacyl-ACP synthase III. 	353
237246	PRK12881	acnA	aconitate hydratase AcnA. 	889
183811	PRK12882	ubiA	prenyltransferase; Reviewed	276
171796	PRK12883	ubiA	prenyltransferase UbiA-like protein; Reviewed	277
183812	PRK12884	ubiA	prenyltransferase; Reviewed	279
237247	PRK12886	ubiA	prenyltransferase; Reviewed	291
183813	PRK12887	ubiA	tocopherol phytyltransferase; Reviewed	308
183814	PRK12888	ubiA	4-hydroxybenzoate octaprenyltransferase. 	284
237248	PRK12890	PRK12890	allantoate amidohydrolase; Reviewed	414
237249	PRK12891	PRK12891	allantoate amidohydrolase; Reviewed	414
183817	PRK12892	PRK12892	allantoate amidohydrolase; Reviewed	412
237250	PRK12893	PRK12893	Zn-dependent hydrolase. 	412
237251	PRK12895	ubiA	prenyltransferase; Reviewed	286
237252	PRK12896	PRK12896	methionine aminopeptidase; Reviewed	255
171806	PRK12897	PRK12897	type I methionyl aminopeptidase. 	248
237253	PRK12898	secA	preprotein translocase subunit SecA; Reviewed	656
237254	PRK12899	secA	preprotein translocase subunit SecA; Reviewed	970
237255	PRK12900	secA	preprotein translocase subunit SecA; Reviewed	1025
237256	PRK12901	secA	preprotein translocase subunit SecA; Reviewed	1112
237257	PRK12902	secA	preprotein translocase subunit SecA; Reviewed	939
237258	PRK12903	secA	preprotein translocase subunit SecA; Reviewed	925
237259	PRK12904	PRK12904	preprotein translocase subunit SecA; Reviewed	830
237260	PRK12906	secA	preprotein translocase subunit SecA; Reviewed	796
183828	PRK12907	secY	preprotein translocase subunit SecY; Reviewed	434
171815	PRK12911	PRK12911	bifunctional preprotein translocase subunit SecD/SecF; Reviewed	1403
183829	PRK12921	PRK12921	oxidoreductase. 	305
237261	PRK12928	PRK12928	lipoyl synthase; Provisional	290
183831	PRK12933	secD	protein translocase subunit SecD. 	604
183832	PRK12935	PRK12935	acetoacetyl-CoA reductase; Provisional	247
171820	PRK12936	PRK12936	3-ketoacyl-(acyl-carrier-protein) reductase NodG; Reviewed	245
171821	PRK12937	PRK12937	short chain dehydrogenase; Provisional	245
171822	PRK12938	PRK12938	3-ketoacyl-ACP reductase. 	246
183833	PRK12939	PRK12939	short chain dehydrogenase; Provisional	250
171824	PRK12996	ulaA	PTS ascorbate transporter subunit IIC. 	463
237262	PRK12997	PRK12997	PTS sugar transporter subunit IIC. 	466
237263	PRK12999	PRK12999	pyruvate carboxylase; Reviewed	1146
183836	PRK13004	PRK13004	YgeY family selenium metabolism-linked hydrolase. 	399
237264	PRK13007	PRK13007	succinyl-diaminopimelate desuccinylase; Reviewed	352
237265	PRK13009	PRK13009	succinyl-diaminopimelate desuccinylase; Reviewed	375
139334	PRK13010	purU	formyltetrahydrofolate deformylase; Reviewed	289
237266	PRK13011	PRK13011	formyltetrahydrofolate deformylase; Reviewed	286
237267	PRK13012	PRK13012	2-oxoacid dehydrogenase subunit E1; Provisional	896
237268	PRK13013	PRK13013	acetylornithine deacetylase/succinyl-diaminopimelate desuccinylase family protein. 	427
237269	PRK13014	PRK13014	methionine sulfoxide reductase A; Provisional	186
237270	PRK13015	PRK13015	3-dehydroquinate dehydratase; Reviewed	146
237271	PRK13016	PRK13016	dihydroxy-acid dehydratase; Provisional	577
237272	PRK13017	PRK13017	dihydroxy-acid dehydratase; Provisional	596
237273	PRK13018	PRK13018	cell division protein FtsZ; Provisional	378
183845	PRK13019	clpS	ATP-dependent Clp protease adapter ClpS. 	94
183846	PRK13020	PRK13020	riboflavin synthase subunit alpha; Provisional	206
237274	PRK13021	secF	preprotein translocase subunit SecF; Reviewed	297
237275	PRK13022	secF	protein translocase subunit SecF. 	289
171842	PRK13023	PRK13023	protein translocase subunit SecDF. 	758
237276	PRK13024	PRK13024	bifunctional preprotein translocase subunit SecD/SecF; Reviewed	755
237277	PRK13026	PRK13026	acyl-CoA dehydrogenase; Reviewed	774
183850	PRK13027	PRK13027	C4-dicarboxylate transporter DctA; Reviewed	421
183851	PRK13028	PRK13028	tryptophan synthase subunit beta; Provisional	402
237278	PRK13029	PRK13029	indolepyruvate ferredoxin oxidoreductase family protein. 	1186
237279	PRK13030	PRK13030	indolepyruvate ferredoxin oxidoreductase family protein. 	1159
106068	PRK13031	PRK13031	preprotein translocase subunit SecB; Provisional	149
171848	PRK13032	PRK13032	chemotaxis-inhibiting protein CHIPS; Reviewed	149
171849	PRK13033	PRK13033	formyl peptide receptor-like 1 inhibitory protein; Reviewed	133
237280	PRK13034	PRK13034	serine hydroxymethyltransferase; Reviewed	416
171851	PRK13035	PRK13035	superantigen-like protein SSL5; Reviewed. 	234
171852	PRK13036	PRK13036	superantigen-like protein SSL11; Reviewed. 	227
106074	PRK13037	PRK13037	superantigen-like protein SSL1; Reviewed. 	226
171853	PRK13038	PRK13038	superantigen-like protein SSL10; Reviewed. 	227
171854	PRK13039	PRK13039	superantigen-like protein SSL8; Reviewed. 	232
106077	PRK13040	PRK13040	superantigen-like protein SSL6; Reviewed. 	231
106078	PRK13041	PRK13041	superantigen-like protein SSL2; Reviewed. 	231
183854	PRK13042	PRK13042	superantigen-like protein SSL4; Reviewed. 	291
171855	PRK13043	PRK13043	superantigen-like protein SSL14; Reviewed. 	241
237281	PRK13054	PRK13054	lipid kinase; Reviewed	300
237282	PRK13055	PRK13055	putative lipid kinase; Reviewed	334
183857	PRK13057	PRK13057	lipid kinase. 	287
183858	PRK13059	PRK13059	putative lipid kinase; Reviewed	295
183859	PRK13103	secA	preprotein translocase subunit SecA; Reviewed	913
183860	PRK13104	secA	preprotein translocase subunit SecA; Reviewed	896
183861	PRK13105	ubiA	prenyltransferase; Reviewed	282
237283	PRK13106	ubiA	prenyltransferase; Reviewed	300
183863	PRK13107	PRK13107	preprotein translocase subunit SecA; Reviewed	908
237284	PRK13108	PRK13108	prolipoprotein diacylglyceryl transferase; Reviewed	460
183864	PRK13109	flhB	flagellar biosynthesis protein FlhB; Reviewed	358
237285	PRK13111	trpA	tryptophan synthase subunit alpha; Provisional	258
237286	PRK13125	trpA	tryptophan synthase subunit alpha; Provisional	244
171868	PRK13128	PRK13128	D-aminopeptidase; Reviewed	518
237287	PRK13130	PRK13130	RNA-protein complex protein Nop10. 	56
237288	PRK13141	hisH	imidazole glycerol phosphate synthase subunit HisH; Provisional	205
171871	PRK13142	hisH	imidazole glycerol phosphate synthase subunit HisH; Provisional	192
237289	PRK13143	hisH	imidazole glycerol phosphate synthase subunit HisH; Provisional	200
183870	PRK13145	araD	L-ribulose-5-phosphate 4-epimerase; Provisional	234
237290	PRK13146	hisH	imidazole glycerol phosphate synthase subunit HisH; Provisional	209
183872	PRK13149	PRK13149	H/ACA RNA-protein complex component Gar1; Reviewed	73
139376	PRK13150	PRK13150	cytochrome c maturation protein CcmE. 	159
171876	PRK13152	hisH	imidazole glycerol phosphate synthase subunit HisH; Provisional	201
183873	PRK13159	PRK13159	cytochrome c-type biogenesis protein CcmE; Reviewed	155
183874	PRK13165	PRK13165	cytochrome c maturation protein CcmE. 	160
237291	PRK13168	rumA	23S rRNA (uracil(1939)-C(5))-methyltransferase RlmD. 	443
183876	PRK13169	PRK13169	DNA replication initiation control protein YabA. 	110
183877	PRK13170	hisH	imidazole glycerol phosphate synthase subunit HisH; Provisional	196
183878	PRK13181	hisH	imidazole glycerol phosphate synthase subunit HisH; Provisional	199
237292	PRK13182	racA	chromosome-anchoring protein RacA. 	175
171884	PRK13183	psbN	photosystem II reaction center protein PsbN. 	46
183880	PRK13184	pknD	serine/threonine-protein kinase PknD. 	932
237293	PRK13185	chlL	protochlorophyllide reductase iron-sulfur ATP-binding protein; Provisional	270
237294	PRK13186	lpxC	UDP-3-O-acyl-N-acetylglucosamine deacetylase. 	295
237295	PRK13187	PRK13187	UDP-3-O-acyl N-acetylglycosamine deacetylase. 	304
237296	PRK13188	PRK13188	bifunctional UDP-3-O-[3-hydroxymyristoyl] N-acetylglucosamine deacetylase/(3R)-hydroxymyristoyl-[acyl-carrier-protein] dehydratase; Reviewed	464
237297	PRK13189	PRK13189	peroxiredoxin; Provisional	222
106159	PRK13190	PRK13190	putative peroxiredoxin; Provisional	202
183885	PRK13191	PRK13191	putative peroxiredoxin; Provisional	215
183886	PRK13192	PRK13192	bifunctional urease subunit gamma/beta; Reviewed	208
237298	PRK13193	PRK13193	pyroglutamyl-peptidase I. 	209
183887	PRK13194	PRK13194	pyrrolidone-carboxylate peptidase; Provisional	208
171894	PRK13195	PRK13195	pyrrolidone-carboxylate peptidase; Provisional	222
171895	PRK13196	PRK13196	pyroglutamyl-peptidase I. 	211
237299	PRK13197	PRK13197	pyrrolidone-carboxylate peptidase; Provisional	215
171897	PRK13198	ureB	urease subunit beta; Reviewed	158
237300	PRK13199	psaB	photosystem I P700 chlorophyll a apoprotein A2; Provisional	742
237301	PRK13200	psaA	photosystem I P700 chlorophyll a apoprotein A1; Provisional	766
237302	PRK13201	ureB	urease subunit beta; Reviewed	136
106171	PRK13202	ureB	urease subunit beta; Reviewed	104
237303	PRK13203	ureB	urease subunit beta; Reviewed	102
171902	PRK13204	ureB	urease subunit beta; Reviewed	159
106174	PRK13205	ureB	urease subunit beta; Reviewed	162
237304	PRK13206	ureC	urease subunit alpha; Reviewed	573
237305	PRK13207	ureC	urease subunit alpha; Reviewed	568
237306	PRK13208	valS	valyl-tRNA synthetase; Reviewed	800
237307	PRK13209	PRK13209	L-ribulose-5-phosphate 3-epimerase. 	283
237308	PRK13210	PRK13210	L-ribulose-5-phosphate 3-epimerase. 	284
237309	PRK13211	PRK13211	N-acetylglucosamine-binding protein GbpA. 	478
106181	PRK13213	araD	L-ribulose-5-phosphate 4-epimerase; Reviewed	231
183899	PRK13214	PRK13214	photosystem I reaction center subunit X; Reviewed	86
183900	PRK13216	PRK13216	photosystem I reaction center subunit X-like protein; Reviewed	91
237310	PRK13222	PRK13222	N-acetylmuramic acid 6-phosphate phosphatase MupP. 	226
171912	PRK13223	PRK13223	phosphoglycolate phosphatase; Provisional	272
106187	PRK13225	PRK13225	phosphoglycolate phosphatase; Provisional	273
237311	PRK13226	PRK13226	phosphoglycolate phosphatase; Provisional	229
183903	PRK13230	PRK13230	nitrogenase reductase-like protein; Reviewed	279
183904	PRK13231	PRK13231	nitrogenase reductase-like protein; Reviewed	264
106194	PRK13232	nifH	nitrogenase reductase; Reviewed	273
183905	PRK13233	nifH	nitrogenase iron protein. 	275
183906	PRK13234	nifH	nitrogenase reductase; Reviewed	295
183907	PRK13235	nifH	nitrogenase reductase; Reviewed	274
237312	PRK13236	PRK13236	nitrogenase reductase; Reviewed	296
237313	PRK13237	PRK13237	tyrosine phenol-lyase; Provisional	460
237314	PRK13238	tnaA	tryptophanase. 	460
183911	PRK13239	PRK13239	alkylmercury lyase MerB. 	206
183912	PRK13240	pbsY	photosystem II protein Y; Reviewed	40
183913	PRK13241	ureA	urease subunit gamma; Provisional	100
139420	PRK13242	ureA	urease subunit gamma; Provisional	100
183914	PRK13243	PRK13243	glyoxylate reductase; Reviewed	333
183915	PRK13244	PRK13244	protease inhibitor. 	145
183916	PRK13245	hetR	heterocyst differentiation control protein; Reviewed	299
106208	PRK13246	PRK13246	15,16-dihydrobiliverdin:ferredoxin oxidoreductase. 	236
237315	PRK13247	PRK13247	15,16-dihydrobiliverdin:ferredoxin oxidoreductase. 	238
139425	PRK13248	PRK13248	phycoerythrobilin:ferredoxin oxidoreductase; Provisional	253
139426	PRK13249	PRK13249	phycoerythrobilin:ferredoxin oxidoreductase; Provisional	257
139427	PRK13250	PRK13250	phycoerythrobilin:ferredoxin oxidoreductase; Provisional	248
183917	PRK13251	PRK13251	trp RNA-binding attenuation protein MtrB. 	75
183918	PRK13252	PRK13252	betaine aldehyde dehydrogenase; Provisional	488
237316	PRK13253	PRK13253	citrate lyase subunit gamma; Provisional	92
237317	PRK13254	PRK13254	cytochrome c maturation protein CcmE. 	148
183921	PRK13255	PRK13255	thiopurine S-methyltransferase; Reviewed	218
237318	PRK13256	PRK13256	thiopurine S-methyltransferase; Reviewed	226
237319	PRK13257	PRK13257	allantoicase; Provisional	336
237320	PRK13258	PRK13258	7-cyano-7-deazaguanine reductase; Provisional	114
237321	PRK13259	PRK13259	septation regulator SpoVG. 	94
183926	PRK13260	PRK13260	2,3-diketo-L-gulonate reductase; Provisional	332
237322	PRK13261	ureE	urease accessory protein UreE; Provisional	159
183928	PRK13262	ureE	urease accessory protein UreE; Provisional	231
237323	PRK13263	ureE	urease accessory protein UreE; Provisional	206
183930	PRK13264	PRK13264	3-hydroxyanthranilate 3,4-dioxygenase; Provisional	177
183931	PRK13265	PRK13265	glycine/sarcosine/betaine reductase complex protein A; Reviewed	154
237324	PRK13266	PRK13266	Thf1-like protein; Reviewed	225
237325	PRK13267	PRK13267	archaemetzincin-like protein; Reviewed	179
183934	PRK13270	treF	alpha,alpha-trehalase TreF. 	549
237326	PRK13271	treA	alpha,alpha-trehalase TreA. 	569
183936	PRK13272	treA	alpha,alpha-trehalase TreA. 	542
237327	PRK13273	mdoD	glucan biosynthesis protein D; Provisional	476
237328	PRK13274	mdoG	glucan biosynthesis protein G; Provisional	516
183939	PRK13275	mtrF	tetrahydromethanopterin S-methyltransferase subunit F; Provisional	67
183940	PRK13276	PRK13276	iron-sulfur cluster repair di-iron protein ScdA. 	224
183941	PRK13277	PRK13277	5-formaminoimidazole-4-carboxamide-1-(beta)-D-ribofuranosyl 5'-monophosphate synthetase-like protein; Provisional	366
237329	PRK13278	purP	formate--phosphoribosylaminoimidazolecarboxamide ligase. 	358
237330	PRK13279	arnT	lipid IV(A) 4-amino-4-deoxy-L-arabinosyltransferase. 	552
237331	PRK13280	PRK13280	N-glycosylase/DNA lyase; Provisional	269
237332	PRK13281	PRK13281	N-succinylarginine dihydrolase. 	442
183946	PRK13282	PRK13282	flagellar assembly protein FliW; Provisional	128
183947	PRK13283	PRK13283	flagellar assembly protein FliW; Provisional	134
237333	PRK13284	PRK13284	flagellar assembly protein FliW; Provisional	145
237334	PRK13285	PRK13285	flagellar assembly protein FliW; Provisional	148
237335	PRK13286	amiE	aliphatic amidase. 	345
183950	PRK13287	amiF	formamidase; Provisional	333
237336	PRK13288	PRK13288	pyrophosphatase PpaX; Provisional	214
237337	PRK13289	PRK13289	NO-inducible flavohemoprotein. 	399
183953	PRK13290	ectC	L-ectoine synthase; Reviewed	125
183954	PRK13291	PRK13291	putative metal-dependent hydrolase. 	173
183955	PRK13292	PRK13292	NADH-quinone oxidoreductase subunit B/C/D. 	788
183956	PRK13293	PRK13293	F420-0--gamma-glutamyl ligase; Reviewed	245
183957	PRK13294	PRK13294	F420-0--gamma-glutamyl ligase; Provisional	448
171961	PRK13295	PRK13295	cyclohexanecarboxylate-CoA ligase; Reviewed	547
106256	PRK13296	PRK13296	CCA tRNA nucleotidyltransferase. 	360
139469	PRK13297	PRK13297	tRNA CCA-pyrophosphorylase; Provisional	364
237338	PRK13298	PRK13298	tRNA CCA-pyrophosphorylase; Provisional	417
237339	PRK13299	PRK13299	tRNA CCA-pyrophosphorylase; Provisional	394
237340	PRK13300	PRK13300	CCA tRNA nucleotidyltransferase. 	447
106261	PRK13301	PRK13301	putative L-aspartate dehydrogenase; Provisional	267
237341	PRK13302	PRK13302	aspartate dehydrogenase. 	271
237342	PRK13303	PRK13303	aspartate dehydrogenase. 	265
237343	PRK13304	PRK13304	aspartate dehydrogenase. 	265
183962	PRK13305	sgbH	3-keto-L-gulonate-6-phosphate decarboxylase UlaD. 	218
237344	PRK13306	ulaD	3-dehydro-L-gulonate-6-phosphate decarboxylase. 	216
183964	PRK13307	PRK13307	bifunctional 5,6,7,8-tetrahydromethanopterin hydro-lyase/3-hexulose-6-phosphate synthase. 	391
183965	PRK13308	ureC	urease subunit alpha; Reviewed	569
183966	PRK13309	ureC	urease subunit alpha; Reviewed	572
183967	PRK13310	PRK13310	N-acetyl-D-glucosamine kinase; Provisional	303
106271	PRK13311	PRK13311	N-acetyl-D-glucosamine kinase; Provisional	256
139480	PRK13312	PRK13312	staphylobilin-forming heme oxygenase IsdG. 	107
183968	PRK13313	PRK13313	staphylobilin-forming heme oxygenase IsdI. 	108
183969	PRK13314	PRK13314	heme oxygenase. 	107
237345	PRK13315	PRK13315	heme oxygenase. 	107
183970	PRK13316	PRK13316	heme oxygenase IsdG. 	121
237346	PRK13317	PRK13317	pantothenate kinase; Provisional	277
237347	PRK13318	PRK13318	type III pantothenate kinase. 	258
237348	PRK13320	PRK13320	type III pantothenate kinase. 	244
237349	PRK13321	PRK13321	type III pantothenate kinase. 	256
237350	PRK13322	PRK13322	pantothenate kinase; Reviewed	246
106284	PRK13324	PRK13324	type III pantothenate kinase. 	258
183976	PRK13325	PRK13325	bifunctional biotin--[acetyl-CoA-carboxylase] ligase/type III pantothenate kinase. 	592
237351	PRK13326	PRK13326	type III pantothenate kinase. 	262
183977	PRK13327	PRK13327	type III pantothenate kinase. 	242
237352	PRK13328	PRK13328	type III pantothenate kinase. 	255
183979	PRK13329	PRK13329	pantothenate kinase; Reviewed	249
237353	PRK13331	PRK13331	pantothenate kinase; Reviewed	251
183981	PRK13333	PRK13333	type III pantothenate kinase. 	206
139494	PRK13335	PRK13335	superantigen-like protein SSL3; Reviewed. 	356
183982	PRK13337	PRK13337	putative lipid kinase; Reviewed	304
183983	PRK13339	PRK13339	malate:quinone oxidoreductase; Reviewed	497
183984	PRK13340	PRK13340	alanine racemase; Reviewed	406
237354	PRK13341	PRK13341	AAA family ATPase. 	725
237355	PRK13342	PRK13342	recombination factor protein RarA; Reviewed	413
183987	PRK13343	PRK13343	F0F1 ATP synthase subunit alpha; Provisional	502
183988	PRK13344	spxA	transcriptional regulator Spx; Reviewed	132
106303	PRK13345	PRK13345	superantigen-like protein SSL9; Reviewed. 	232
106304	PRK13346	PRK13346	superantigen-like protein SSL7; Reviewed. 	231
237356	PRK13347	PRK13347	coproporphyrinogen III oxidase; Provisional	453
237357	PRK13348	PRK13348	HTH-type transcriptional regulator ArgP. 	294
106307	PRK13349	PRK13349	superantigen-like protein SSL13; Reviewed. 	241
171995	PRK13350	PRK13350	superantigen-like protein SSL12; Reviewed. 	238
237358	PRK13351	PRK13351	elongation factor G-like protein. 	687
237359	PRK13352	PRK13352	phosphomethylpyrimidine synthase ThiC. 	431
183992	PRK13353	PRK13353	aspartate ammonia-lyase; Provisional	473
237360	PRK13354	PRK13354	tyrosyl-tRNA synthetase; Provisional	410
237361	PRK13355	PRK13355	bifunctional HTH-domain containing protein/aminotransferase; Provisional	517
237362	PRK13356	PRK13356	branched-chain amino acid aminotransferase. 	286
237363	PRK13357	PRK13357	branched-chain amino acid aminotransferase; Provisional	356
183997	PRK13358	PRK13358	protocatechuate 4,5-dioxygenase subunit beta; Provisional	269
183998	PRK13359	PRK13359	beta-ketoadipyl CoA thiolase; Provisional	400
183999	PRK13360	PRK13360	omega amino acid--pyruvate transaminase; Provisional	442
237364	PRK13361	PRK13361	GTP 3',8-cyclase MoaA. 	329
184001	PRK13362	PRK13362	protoheme IX farnesyltransferase; Provisional	306
184002	PRK13363	PRK13363	protocatechuate 4,5-dioxygenase subunit beta; Provisional	335
184003	PRK13364	PRK13364	protocatechuate 4,5-dioxygenase subunit beta; Provisional	278
184004	PRK13365	PRK13365	protocatechuate 4,5-dioxygenase subunit beta; Provisional	279
184005	PRK13366	PRK13366	protocatechuate 4,5-dioxygenase subunit beta; Provisional	284
184006	PRK13367	PRK13367	gallate dioxygenase. 	420
184007	PRK13368	PRK13368	3-deoxy-manno-octulosonate cytidylyltransferase; Provisional	238
237365	PRK13369	PRK13369	glycerol-3-phosphate dehydrogenase; Provisional	502
237366	PRK13370	mhpB	3-carboxyethylcatechol 2,3-dioxygenase. 	313
237367	PRK13371	PRK13371	4-hydroxy-3-methylbut-2-enyl diphosphate reductase; Provisional	387
106330	PRK13372	pcmA	protocatechuate 4,5-dioxygenase subunit alpha/beta. 	444
106331	PRK13373	PRK13373	putative dioxygenase; Provisional	344
237368	PRK13374	PRK13374	DeoD-type purine-nucleoside phosphorylase. 	233
172015	PRK13375	pimE	mannosyltransferase; Provisional	409
237369	PRK13376	pyrB	bifunctional aspartate carbamoyltransferase catalytic subunit/aspartate carbamoyltransferase regulatory subunit; Provisional	525
184013	PRK13377	PRK13377	protocatechuate 4,5-dioxygenase subunit alpha; Provisional	129
139527	PRK13378	PRK13378	protocatechuate 4,5-dioxygenase subunit alpha; Provisional	117
184014	PRK13379	PRK13379	protocatechuate 4,5-dioxygenase subunit alpha; Provisional	119
237370	PRK13380	PRK13380	glycine cleavage system protein H; Provisional	144
237371	PRK13381	PRK13381	peptidase T; Provisional	404
172019	PRK13382	PRK13382	bile acid CoA ligase. 	537
139531	PRK13383	PRK13383	acyl-CoA synthetase; Provisional	516
172020	PRK13384	PRK13384	porphobilinogen synthase. 	322
184017	PRK13385	PRK13385	2-C-methyl-D-erythritol 4-phosphate cytidylyltransferase; Provisional	230
237372	PRK13386	fliH	flagellar assembly protein H; Provisional	236
237373	PRK13387	PRK13387	1,4-dihydroxy-2-naphthoate octaprenyltransferase; Provisional	317
237374	PRK13388	PRK13388	acyl-CoA synthetase; Provisional	540
184021	PRK13389	PRK13389	UTP--glucose-1-phosphate uridylyltransferase GalU. 	302
139538	PRK13390	PRK13390	acyl-CoA synthetase; Provisional	501
184022	PRK13391	PRK13391	acyl-CoA synthetase; Provisional	511
184023	PRK13392	PRK13392	5-aminolevulinate synthase; Provisional	410
184024	PRK13393	PRK13393	5-aminolevulinate synthase; Provisional	406
184025	PRK13394	PRK13394	3-hydroxybutyrate dehydrogenase; Provisional	262
237375	PRK13395	PRK13395	ureidoglycolate lyase. 	171
237376	PRK13396	PRK13396	3-deoxy-7-phosphoheptulonate synthase; Provisional	352
172030	PRK13397	PRK13397	3-deoxy-7-phosphoheptulonate synthase; Provisional	250
184028	PRK13398	PRK13398	3-deoxy-7-phosphoheptulonate synthase; Provisional	266
184029	PRK13399	PRK13399	fructose-bisphosphate aldolase class II. 	347
184030	PRK13400	PRK13400	30S ribosomal protein S18; Provisional	147
184031	PRK13401	PRK13401	30S ribosomal protein S18; Provisional	82
184032	PRK13402	PRK13402	glutamate 5-kinase. 	368
106361	PRK13403	PRK13403	ketol-acid reductoisomerase; Provisional	335
184033	PRK13404	PRK13404	dihydropyrimidinase; Provisional	477
237377	PRK13405	bchH	magnesium chelatase subunit H; Provisional	1209
237378	PRK13406	bchD	magnesium chelatase subunit D; Provisional	584
184036	PRK13407	bchI	magnesium chelatase subunit I; Provisional	334
184037	PRK13409	PRK13409	ribosome biogenesis/translation initiation ATPase RLI. 	590
184038	PRK13410	PRK13410	molecular chaperone DnaK; Provisional	668
184039	PRK13411	PRK13411	molecular chaperone DnaK; Provisional	653
237379	PRK13412	fkp	bifunctional fucokinase/L-fucose-1-P-guanylyltransferase; Provisional	974
184041	PRK13413	mpi	master DNA invertase Mpi family serine-type recombinase. 	200
139556	PRK13414	PRK13414	flagellar biosynthesis protein FliZ; Provisional	209
184042	PRK13415	PRK13415	flagella biosynthesis protein FliZ; Provisional	219
237380	PRK13417	PRK13417	F0F1 ATP synthase subunit A; Provisional	352
237381	PRK13419	PRK13419	F0F1 ATP synthase subunit A; Provisional	342
237382	PRK13420	PRK13420	F0F1 ATP synthase subunit A; Provisional	226
237383	PRK13421	PRK13421	F0F1 ATP synthase subunit A; Provisional	223
184046	PRK13422	PRK13422	F0F1 ATP synthase subunit gamma; Provisional	298
237384	PRK13423	PRK13423	F0F1 ATP synthase subunit gamma; Provisional	288
172047	PRK13424	PRK13424	F0F1 ATP synthase subunit gamma; Provisional	291
139564	PRK13425	PRK13425	F0F1 ATP synthase subunit gamma; Provisional	291
237385	PRK13426	PRK13426	F0F1 ATP synthase subunit gamma; Provisional	291
172049	PRK13427	PRK13427	F0F1 ATP synthase subunit gamma; Provisional	289
184048	PRK13428	PRK13428	F0F1 ATP synthase subunit delta; Provisional	445
237386	PRK13429	PRK13429	F0F1 ATP synthase subunit delta; Provisional	181
237387	PRK13430	PRK13430	F0F1 ATP synthase subunit delta; Provisional	271
184051	PRK13431	PRK13431	F0F1 ATP synthase subunit delta; Provisional	180
139571	PRK13434	PRK13434	F0F1 ATP synthase subunit delta; Provisional	184
184052	PRK13435	PRK13435	response regulator; Provisional	145
184053	PRK13436	PRK13436	F0F1 ATP synthase subunit delta; Provisional	179
184054	PRK13441	PRK13441	F0F1 ATP synthase subunit delta; Provisional	180
184055	PRK13442	atpC	F0F1 ATP synthase subunit epsilon; Provisional	89
237388	PRK13443	atpC	F0F1 ATP synthase subunit epsilon; Provisional	136
139576	PRK13444	atpC	F0F1 ATP synthase subunit epsilon; Provisional	127
184056	PRK13446	atpC	F0F1 ATP synthase subunit epsilon; Provisional	136
184057	PRK13447	PRK13447	F0F1 ATP synthase subunit epsilon; Provisional	136
139579	PRK13448	atpC	F0F1 ATP synthase subunit epsilon; Provisional	135
184058	PRK13449	atpC	ATP synthase F1 subunit epsilon. 	88
184059	PRK13450	atpC	F0F1 ATP synthase subunit epsilon; Provisional	132
172059	PRK13451	atpC	F0F1 ATP synthase subunit epsilon; Provisional	101
106409	PRK13452	atpC	F0F1 ATP synthase subunit epsilon; Provisional	145
184060	PRK13453	PRK13453	F0F1 ATP synthase subunit B; Provisional	173
184061	PRK13454	PRK13454	F0F1 ATP synthase subunit B'; Provisional	181
184062	PRK13455	PRK13455	F0F1 ATP synthase subunit B; Provisional	184
237389	PRK13456	PRK13456	DNA protection protein DPS; Provisional	186
139585	PRK13460	PRK13460	F0F1 ATP synthase subunit B; Provisional	173
184064	PRK13461	PRK13461	F0F1 ATP synthase subunit B; Provisional	159
139587	PRK13462	PRK13462	acid phosphatase; Provisional	203
172065	PRK13463	PRK13463	phosphoserine phosphatase 1. 	203
184065	PRK13464	PRK13464	F0F1 ATP synthase subunit B. 	101
172066	PRK13466	PRK13466	F0F1 ATP synthase subunit C; Provisional	66
237390	PRK13467	PRK13467	F0F1 ATP synthase subunit C; Provisional	66
184067	PRK13468	PRK13468	F0F1 ATP synthase subunit C; Provisional	82
184068	PRK13469	PRK13469	F0F1 ATP synthase subunit C; Provisional	79
184069	PRK13471	PRK13471	F0F1 ATP synthase subunit C; Provisional	85
237391	PRK13473	PRK13473	aminobutyraldehyde dehydrogenase. 	475
237392	PRK13474	PRK13474	cytochrome b6-f complex iron-sulfur subunit; Provisional	178
184072	PRK13475	PRK13475	ribulose-bisphosphate carboxylase. 	443
184073	PRK13476	PRK13476	cytochrome b6-f complex subunit IV; Provisional	160
237393	PRK13477	PRK13477	bifunctional pantoate--beta-alanine ligase/(d)CMP kinase. 	512
184075	PRK13478	PRK13478	phosphonoacetaldehyde hydrolase; Provisional	267
184076	PRK13479	PRK13479	2-aminoethylphosphonate--pyruvate transaminase; Provisional	368
237394	PRK13480	PRK13480	3'-5' exoribonuclease YhaM; Provisional	314
184078	PRK13481	PRK13481	glycosyltransferase; Provisional	232
237395	PRK13482	PRK13482	DNA integrity scanning protein DisA; Provisional	352
184080	PRK13483	PRK13483	ligand-gated channel protein. 	660
139605	PRK13484	PRK13484	IreA family TonB-dependent siderophore receptor. 	682
139606	PRK13486	PRK13486	TonB-dependent receptor. 	696
237396	PRK13487	PRK13487	chemoreceptor glutamine deamidase CheD; Provisional	201
237397	PRK13488	PRK13488	chemoreceptor glutamine deamidase CheD; Provisional	157
237398	PRK13489	PRK13489	chemoreceptor glutamine deamidase CheD; Provisional	233
184084	PRK13490	PRK13490	chemoreceptor glutamine deamidase CheD; Provisional	162
184085	PRK13491	PRK13491	chemoreceptor glutamine deamidase CheD; Provisional	199
184086	PRK13493	PRK13493	chemoreceptor glutamine deamidase CheD; Provisional	213
184087	PRK13494	PRK13494	chemoreceptor glutamine deamidase CheD; Provisional	163
184088	PRK13495	PRK13495	chemoreceptor glutamine deamidase CheD; Provisional	159
237399	PRK13497	PRK13497	chemoreceptor glutamine deamidase CheD; Provisional	184
237400	PRK13498	PRK13498	chemoreceptor glutamine deamidase CheD; Provisional	167
237401	PRK13499	PRK13499	L-rhamnose/proton symporter RhaT. 	345
184091	PRK13500	PRK13500	HTH-type transcriptional activator RhaR. 	312
184092	PRK13501	PRK13501	HTH-type transcriptional activator RhaR. 	290
184093	PRK13502	PRK13502	HTH-type transcriptional activator RhaR. 	282
184094	PRK13503	PRK13503	HTH-type transcriptional activator RhaS. 	278
237402	PRK13504	PRK13504	NADPH-dependent assimilatory sulfite reductase hemoprotein subunit. 	569
237403	PRK13505	PRK13505	formate--tetrahydrofolate ligase; Provisional	557
237404	PRK13506	PRK13506	formate--tetrahydrofolate ligase; Provisional	578
184098	PRK13507	PRK13507	formate--tetrahydrofolate ligase; Provisional	587
237405	PRK13508	PRK13508	tagatose-6-phosphate kinase; Provisional	309
184100	PRK13509	PRK13509	HTH-type transcriptional regulator UlaR. 	251
184101	PRK13510	PRK13510	sulfurtransferase complex subunit TusB. 	95
184102	PRK13511	PRK13511	6-phospho-beta-galactosidase; Provisional	469
184103	PRK13512	PRK13512	coenzyme A disulfide reductase; Provisional	438
184104	PRK13513	PRK13513	ligand-gated channel protein. 	659
237406	PRK13515	PRK13515	carboxylate-amine ligase; Provisional	371
237407	PRK13516	PRK13516	gamma-glutamyl:cysteine ligase; Provisional	373
237408	PRK13517	PRK13517	glutamate--cysteine ligase. 	373
184108	PRK13518	PRK13518	glutamate--cysteine ligase. 	357
237409	PRK13520	PRK13520	tyrosine decarboxylase MfnA. 	371
184110	PRK13523	PRK13523	NADPH dehydrogenase NamA; Provisional	337
237410	PRK13524	PRK13524	FepA family TonB-dependent siderophore receptor. 	744
237411	PRK13525	PRK13525	pyridoxal 5'-phosphate synthase glutaminase subunit PdxT. 	189
184113	PRK13526	PRK13526	glutamine amidotransferase subunit PdxT; Provisional	179
237412	PRK13527	PRK13527	glutamine amidotransferase subunit PdxT; Provisional	200
237413	PRK13528	PRK13528	outer membrane receptor FepA; Provisional	727
237414	PRK13529	PRK13529	oxaloacetate-decarboxylating malate dehydrogenase. 	563
237415	PRK13530	PRK13530	arsenate reductase (thioredoxin). 	133
184118	PRK13531	PRK13531	regulatory ATPase RavA; Provisional	498
237416	PRK13532	PRK13532	nitrate reductase catalytic subunit NapA. 	830
237417	PRK13533	PRK13533	7-cyano-7-deazaguanine tRNA-ribosyltransferase; Provisional	487
237418	PRK13534	PRK13534	7-cyano-7-deazaguanine tRNA-ribosyltransferase; Provisional	639
184122	PRK13535	PRK13535	erythrose 4-phosphate dehydrogenase; Provisional	336
237419	PRK13536	PRK13536	nodulation factor ABC transporter ATP-binding protein NodI. 	340
237420	PRK13537	PRK13537	nodulation factor ABC transporter ATP-binding protein NodI. 	306
184125	PRK13538	PRK13538	cytochrome c biogenesis heme-transporting ATPase CcmA. 	204
237421	PRK13539	PRK13539	cytochrome c biogenesis protein CcmA; Provisional	207
184127	PRK13540	PRK13540	cytochrome c biogenesis protein CcmA; Provisional	200
184128	PRK13541	PRK13541	cytochrome c biogenesis protein CcmA; Provisional	195
184129	PRK13543	PRK13543	heme ABC exporter ATP-binding protein CcmA. 	214
184130	PRK13545	tagH	teichoic acids export protein ATP-binding subunit; Provisional	549
184131	PRK13546	PRK13546	teichoic acids export ABC transporter ATP-binding subunit TagH. 	264
184132	PRK13547	hmuV	heme ABC transporter ATP-binding protein. 	272
237422	PRK13548	hmuV	hemin importer ATP-binding subunit; Provisional	258
184134	PRK13549	PRK13549	xylose transporter ATP-binding subunit; Provisional	506
184135	PRK13551	PRK13551	agmatine deiminase; Provisional	362
184136	PRK13552	frdB	fumarate reductase iron-sulfur subunit; Provisional	239
237423	PRK13553	PRK13553	fumarate reductase cytochrome b subunit. 	258
237424	PRK13554	PRK13554	fumarate reductase cytochrome b-556 subunit; Provisional	241
184139	PRK13555	PRK13555	FMN-dependent NADH-azoreductase. 	208
184140	PRK13556	PRK13556	FMN-dependent NADH-azoreductase. 	208
237425	PRK13557	PRK13557	histidine kinase; Provisional	540
237426	PRK13558	PRK13558	bacterio-opsin activator; Provisional	665
237427	PRK13559	PRK13559	hypothetical protein; Provisional	361
106506	PRK13560	PRK13560	hypothetical protein; Provisional	807
184143	PRK13561	PRK13561	putative diguanylate cyclase; Provisional	651
184144	PRK13562	PRK13562	ACT domain-containing protein. 	84
237428	PRK13564	PRK13564	anthranilate synthase component 1. 	520
184146	PRK13565	PRK13565	anthranilate synthase component I; Provisional	490
237429	PRK13566	PRK13566	anthranilate synthase component I. 	720
184148	PRK13567	PRK13567	anthranilate synthase component I; Provisional	468
237430	PRK13568	hofQ	DNA uptake porin HofQ. 	381
184150	PRK13569	PRK13569	anthranilate synthase component I; Provisional	506
237431	PRK13570	PRK13570	anthranilate synthase component I; Provisional	455
184152	PRK13571	PRK13571	anthranilate synthase component I; Provisional	506
237432	PRK13572	PRK13572	anthranilate synthase component I; Provisional	435
184154	PRK13573	PRK13573	anthranilate synthase component I; Provisional	503
184155	PRK13574	PRK13574	anthranilate synthase component I; Provisional	420
184156	PRK13575	PRK13575	type I 3-dehydroquinate dehydratase. 	238
237433	PRK13576	PRK13576	type I 3-dehydroquinate dehydratase. 	216
184158	PRK13577	PRK13577	diaminopimelate epimerase; Provisional	281
237434	PRK13578	PRK13578	ornithine decarboxylase; Provisional	720
237435	PRK13579	gcvT	glycine cleavage system aminomethyltransferase GcvT. 	370
184161	PRK13580	PRK13580	glycine hydroxymethyltransferase. 	493
237436	PRK13581	PRK13581	D-3-phosphoglycerate dehydrogenase; Provisional	526
237437	PRK13582	thrH	bifunctional phosphoserine phosphatase/homoserine phosphotransferase ThrH. 	205
237438	PRK13583	hisG	ATP phosphoribosyltransferase. 	228
172153	PRK13584	hisG	ATP phosphoribosyltransferase. 	204
184165	PRK13585	PRK13585	1-(5-phosphoribosyl)-5-[(5-phosphoribosylamino)methylideneamino]imidazole-4-carboxamide isomerase. 	241
237439	PRK13586	PRK13586	1-(5-phosphoribosyl)-5- ((5-phosphoribosylamino)methylideneamino)imidazole-4-carboxamide isomerase. 	232
172156	PRK13587	PRK13587	1-(5-phosphoribosyl)-5-[(5-phosphoribosylamino)methylideneamino] imidazole-4-carboxamide isomerase; Provisional	234
237440	PRK13588	PRK13588	flagellin B; Provisional	514
172158	PRK13589	PRK13589	flagellin A. 	576
184168	PRK13590	PRK13590	putative bifunctional OHCU decarboxylase/allantoate amidohydrolase; Provisional	591
184169	PRK13591	ubiA	prenyltransferase; Provisional	307
139690	PRK13592	ubiA	prenyltransferase; Provisional	299
172161	PRK13595	ubiA	prenyltransferase; Provisional	292
237441	PRK13596	PRK13596	NADH-quinone oxidoreductase subunit NuoF. 	433
184171	PRK13598	hisB	imidazoleglycerol-phosphate dehydratase; Provisional	193
106544	PRK13599	PRK13599	peroxiredoxin. 	215
184172	PRK13600	PRK13600	putative ribosomal protein L7Ae-like; Provisional	84
184173	PRK13601	PRK13601	putative L7Ae-like ribosomal protein; Provisional	82
184174	PRK13602	PRK13602	50S ribosomal protein L7ae-like protein. 	82
172166	PRK13603	PRK13603	fumarate reductase subunit C; Provisional	126
184175	PRK13604	luxD	acyl transferase; Provisional	307
237442	PRK13605	PRK13605	endoribonuclease SymE; Provisional	113
237443	PRK13606	PRK13606	LPPG:FO 2-phospho-L-lactate transferase; Provisional	303
237444	PRK13607	PRK13607	proline dipeptidase; Provisional	443
184179	PRK13608	PRK13608	diacylglycerol glucosyltransferase; Provisional	391
237445	PRK13609	PRK13609	diacylglycerol glucosyltransferase; Provisional	380
139699	PRK13610	PRK13610	photosystem II reaction center protein Psb28; Provisional	113
106556	PRK13611	PRK13611	photosystem II reaction center protein Psb28; Provisional	104
237446	PRK13612	PRK13612	photosystem II reaction center protein Psb28; Provisional	113
237447	PRK13613	PRK13613	lipoprotein LpqB; Provisional	599
237448	PRK13614	PRK13614	lipoprotein LpqB; Provisional	573
184183	PRK13615	PRK13615	lipoprotein LpqB; Provisional	557
237449	PRK13616	PRK13616	MtrAB system accessory protein LpqB. 	591
106562	PRK13617	psbV	cytochrome c-550; Provisional	170
184185	PRK13618	psbV	cytochrome c-550; Provisional	163
172177	PRK13619	psbV	cytochrome c-550; Provisional	160
139707	PRK13620	psbV	cytochrome c-550; Provisional	215
237450	PRK13621	psbV	cytochrome c-550; Provisional	170
106567	PRK13622	psbV	cytochrome c-550; Provisional	180
184186	PRK13623	PRK13623	iron-sulfur cluster insertion protein ErpA; Provisional	115
184187	PRK13625	PRK13625	bis(5'-nucleosyl)-tetraphosphatase PrpE; Provisional	245
184188	PRK13626	PRK13626	HTH-type transcriptional regulator SgrR. 	552
184189	PRK13627	PRK13627	carnitine operon protein CaiE; Provisional	196
184190	PRK13628	PRK13628	serine/threonine transporter SstT; Provisional	402
184191	PRK13629	PRK13629	threonine/serine transporter TdcC; Provisional	443
237451	PRK13631	cbiO	cobalt transporter ATP-binding subunit; Provisional	320
237452	PRK13632	cbiO	cobalt transporter ATP-binding subunit; Provisional	271
237453	PRK13633	PRK13633	energy-coupling factor transporter ATPase. 	280
237454	PRK13634	cbiO	cobalt transporter ATP-binding subunit; Provisional	290
184195	PRK13635	cbiO	energy-coupling factor ABC transporter ATP-binding protein. 	279
184196	PRK13636	cbiO	cobalt transporter ATP-binding subunit; Provisional	283
237455	PRK13637	cbiO	energy-coupling factor transporter ATPase. 	287
184198	PRK13638	cbiO	energy-coupling factor ABC transporter ATP-binding protein. 	271
184199	PRK13639	cbiO	cobalt transporter ATP-binding subunit; Provisional	275
184200	PRK13640	cbiO	energy-coupling factor transporter ATPase. 	282
237456	PRK13641	cbiO	energy-coupling factor transporter ATPase. 	287
184202	PRK13642	cbiO	energy-coupling factor transporter ATPase. 	277
184203	PRK13643	cbiO	energy-coupling factor transporter ATPase. 	288
106587	PRK13644	cbiO	energy-coupling factor transporter ATPase. 	274
184204	PRK13645	cbiO	energy-coupling factor transporter ATPase. 	289
184205	PRK13646	cbiO	energy-coupling factor transporter ATPase. 	286
237457	PRK13647	cbiO	cobalt transporter ATP-binding subunit; Provisional	274
184207	PRK13648	cbiO	cobalt transporter ATP-binding subunit; Provisional	269
184208	PRK13649	cbiO	energy-coupling factor transporter ATPase. 	280
184209	PRK13650	cbiO	energy-coupling factor transporter ATPase. 	279
184210	PRK13651	PRK13651	cobalt transporter ATP-binding subunit; Provisional	305
172200	PRK13652	cbiO	cobalt transporter ATP-binding subunit; Provisional	277
237458	PRK13654	PRK13654	magnesium-protoporphyrin IX monomethyl ester cyclase; Provisional	355
237459	PRK13655	PRK13655	phosphoenolpyruvate carboxylase; Provisional	494
237460	PRK13656	PRK13656	enoyl-[acyl-carrier-protein] reductase FabV. 	398
184214	PRK13657	PRK13657	glucan ABC transporter ATP-binding protein/permease. 	588
184215	PRK13658	PRK13658	hypothetical protein; Provisional	59
184216	PRK13659	PRK13659	DUF1283 family protein. 	103
237461	PRK13660	PRK13660	hypothetical protein; Provisional	182
184218	PRK13661	PRK13661	ECF-type riboflavin transporter substrate-binding protein. 	182
184219	PRK13662	PRK13662	hypothetical protein; Provisional	177
184220	PRK13663	PRK13663	hypothetical protein; Provisional	493
184221	PRK13664	PRK13664	hypothetical protein; Provisional	62
237462	PRK13665	PRK13665	hypothetical protein; Provisional	316
184223	PRK13666	PRK13666	hypothetical protein; Provisional	92
184224	PRK13667	PRK13667	hypothetical protein; Provisional	70
237463	PRK13668	PRK13668	hypothetical protein; Provisional	267
184226	PRK13669	PRK13669	hypothetical protein; Provisional	78
237464	PRK13670	PRK13670	nucleotidyltransferase. 	388
184228	PRK13671	PRK13671	nucleotidyltransferase. 	298
184229	PRK13672	PRK13672	hypothetical protein; Provisional	71
237465	PRK13673	PRK13673	hypothetical protein; Provisional	118
237466	PRK13674	PRK13674	GTP cyclohydrolase I FolE2. 	271
184232	PRK13675	PRK13675	GTP cyclohydrolase; Provisional	308
237467	PRK13676	PRK13676	YlbF/YmcA family competence regulator. 	114
184234	PRK13677	PRK13677	DUF3461 family protein. 	125
184235	PRK13678	PRK13678	hypothetical protein; Provisional	95
184236	PRK13679	PRK13679	hypothetical protein; Provisional	168
184237	PRK13680	PRK13680	hypothetical protein; Provisional	117
184238	PRK13681	PRK13681	protein YohO. 	35
184239	PRK13682	PRK13682	hypothetical protein; Provisional	51
184240	PRK13683	PRK13683	hypothetical protein; Provisional	87
237468	PRK13684	PRK13684	photosynthesis system II assembly factor Ycf48. 	334
184242	PRK13685	PRK13685	hypothetical protein; Provisional	326
237469	PRK13686	PRK13686	photosystem II reaction center protein Ycf12. 	43
184244	PRK13687	PRK13687	hypothetical protein; Provisional	85
237470	PRK13688	PRK13688	N-acetyltransferase. 	156
237471	PRK13689	PRK13689	hypothetical protein; Provisional	75
237472	PRK13690	PRK13690	hypothetical protein; Provisional	184
139768	PRK13691	PRK13691	(3R)-hydroxyacyl-ACP dehydratase subunit HadC; Provisional	166
237473	PRK13692	PRK13692	(3R)-hydroxyacyl-ACP dehydratase subunit HadA; Provisional	159
184249	PRK13693	PRK13693	(3R)-hydroxyacyl-ACP dehydratase subunit HadB; Provisional	142
237474	PRK13694	PRK13694	hypothetical protein; Provisional	83
237475	PRK13695	PRK13695	NTPase. 	174
237476	PRK13696	PRK13696	hypothetical protein; Provisional	62
184253	PRK13697	PRK13697	cytochrome c6; Provisional	111
184254	PRK13698	PRK13698	ParB/RepB/Spo0J family plasmid partition protein. 	323
184255	PRK13699	PRK13699	putative methylase; Provisional	227
184256	PRK13700	PRK13700	conjugal transfer protein TraD; Provisional	732
237477	PRK13701	psiB	conjugation system SOS inhibitor PsiB. 	144
184258	PRK13702	PRK13702	replication regulatory protein RepA. 	85
184259	PRK13703	PRK13703	conjugal pilus assembly protein TraF; Provisional	248
184260	PRK13704	PRK13704	plasmid SOS inhibition protein A; Provisional	240
184261	PRK13705	PRK13705	plasmid-partitioning protein SopA; Provisional	388
184262	PRK13706	PRK13706	conjugal transfer pilus acetylase TraX. 	248
184263	PRK13707	PRK13707	type IV conjugative transfer system protein TraL. 	101
184264	PRK13708	PRK13708	type II toxin-antitoxin system toxin CcdB. 	101
237478	PRK13709	PRK13709	conjugal transfer nickase/helicase TraI; Provisional	1747
184266	PRK13710	PRK13710	type II toxin-antitoxin system antitoxin CcdA. 	72
184267	PRK13711	PRK13711	P-type conjugative transfer protein TrbJ. 	113
184268	PRK13712	PRK13712	conjugal transfer protein TrbA; Provisional	115
184269	PRK13713	PRK13713	relaxosome protein TraM. 	118
184270	PRK13715	PRK13715	conjugal transfer protein TraR; Provisional	73
106657	PRK13716	PRK13716	RepA leader peptide Tap. 	24
184271	PRK13717	PRK13717	type-F conjugative transfer system protein TrbI. 	128
172260	PRK13718	PRK13718	conjugal transfer protein TrbE; Provisional	84
237479	PRK13719	PRK13719	conjugal transfer transcriptional regulator TraJ; Provisional	217
172262	PRK13720	PRK13720	modulator of post-segregation killing protein; Provisional	70
237480	PRK13721	PRK13721	conjugal transfer ATP-binding protein TraC; Provisional	844
184274	PRK13722	PRK13722	lytic transglycosylase; Provisional	169
237481	PRK13723	PRK13723	conjugal transfer pilus assembly protein TraH; Provisional	451
237482	PRK13724	PRK13724	conjugal transfer protein TrbD; Provisional	65
184277	PRK13725	PRK13725	tRNA(fMet)-specific endonuclease VapC. 	132
184278	PRK13726	PRK13726	type IV conjugative transfer system protein TraE. 	188
237483	PRK13727	PRK13727	conjugal transfer pilin chaperone TraQ; Provisional	80
237484	PRK13728	PRK13728	conjugal transfer protein TrbB; Provisional	181
184281	PRK13729	PRK13729	conjugal transfer pilus assembly protein TraB; Provisional	475
184282	PRK13730	PRK13730	conjugal transfer pilus assembly protein TrbC; Provisional	212
184283	PRK13731	PRK13731	complement resistance protein TraT. 	243
184284	PRK13732	PRK13732	single-stranded DNA-binding protein; Provisional	175
184285	PRK13733	PRK13733	conjugal transfer protein TraV; Provisional	171
237485	PRK13734	PRK13734	conjugal transfer pilin subunit TraA; Provisional	120
184287	PRK13735	PRK13735	conjugal transfer mating pair stabilization protein TraG; Provisional	942
237486	PRK13736	PRK13736	conjugal transfer protein TraK; Provisional	245
237487	PRK13737	PRK13737	conjugal transfer pilus assembly protein TraU; Provisional	330
184290	PRK13738	PRK13738	conjugal transfer pilus assembly protein TraW; Provisional	209
237488	PRK13739	PRK13739	conjugal transfer protein TraP; Provisional	198
184292	PRK13740	PRK13740	conjugal transfer relaxosome protein TraY. 	70
172283	PRK13741	PRK13741	conjugal transfer protein TraS. 	171
184293	PRK13742	PRK13742	replication initiation protein RepE. 	245
184294	PRK13743	PRK13743	conjugal transfer protein TrbF; Provisional	141
139817	PRK13744	PRK13744	conjugal transfer protein TrbG; Provisional	83
237489	PRK13745	PRK13745	anaerobic sulfatase-maturation protein. 	412
184296	PRK13746	PRK13746	aminoglycoside resistance protein; Provisional	262
184297	PRK13747	PRK13747	putative mercury resistance protein; Provisional	78
184298	PRK13748	PRK13748	putative mercuric reductase; Provisional	561
184299	PRK13749	PRK13749	HTH-type transcriptional regulator MerD. 	121
184300	PRK13750	PRK13750	replication protein; Provisional	285
184301	PRK13751	PRK13751	putative mercuric transport protein; Provisional	116
184302	PRK13752	PRK13752	mercuric resistance operon transcriptional regulator MerR. 	144
184303	PRK13753	PRK13753	dihydropteroate synthase; Provisional	279
184304	PRK13754	PRK13754	fertility inhibition protein FinO. 	186
237490	PRK13755	PRK13755	organomercurial transporter MerC. 	139
172294	PRK13756	PRK13756	TetR family transcriptional regulator. 	205
172295	PRK13757	PRK13757	type A chloramphenicol O-acetyltransferase. 	219
172296	PRK13758	PRK13758	anaerobic sulfatase-maturase; Provisional	370
237491	PRK13759	PRK13759	arylsulfatase; Provisional	485
237492	PRK13760	PRK13760	ribosome assembly factor SBDS. 	231
184308	PRK13761	PRK13761	phosphopantothenate/pantothenate synthetase. 	248
237493	PRK13762	PRK13762	4-demethylwyosine synthase TYW1. 	322
237494	PRK13763	PRK13763	putative RNA-processing protein; Provisional	180
184311	PRK13764	PRK13764	ATPase; Provisional	602
237495	PRK13765	PRK13765	ATP-dependent protease Lon; Provisional	637
237496	PRK13766	PRK13766	Hef nuclease; Provisional	773
237497	PRK13767	PRK13767	ATP-dependent helicase; Provisional	876
237498	PRK13768	PRK13768	GTPase; Provisional	253
172307	PRK13769	PRK13769	histidinol dehydrogenase; Provisional	368
172308	PRK13770	PRK13770	histidinol dehydrogenase; Provisional	416
184316	PRK13771	PRK13771	putative alcohol dehydrogenase; Provisional	334
172310	PRK13772	PRK13772	formimidoylglutamase; Provisional	314
237499	PRK13773	PRK13773	formimidoylglutamase; Provisional	324
184317	PRK13774	PRK13774	formimidoylglutamase; Provisional	311
172313	PRK13775	PRK13775	formimidoylglutamase; Provisional	328
237500	PRK13776	PRK13776	formimidoylglutamase; Provisional	318
237501	PRK13777	PRK13777	HTH-type transcriptional regulator Hpr. 	185
184320	PRK13778	paaA	phenylacetate-CoA oxygenase subunit PaaA; Provisional	314
237502	PRK13779	PRK13779	bifunctional PTS system fructose-specific transporter subunit IIA/HPr protein; Provisional	503
237503	PRK13780	PRK13780	phosphocarrier protein HPr; Provisional	88
237504	PRK13781	paaB	phenylacetate-CoA oxygenase subunit PaaB; Provisional	95
172320	PRK13782	PRK13782	HPr family phosphocarrier protein. 	82
237505	PRK13783	PRK13783	adenylosuccinate synthetase; Provisional	404
172322	PRK13784	PRK13784	adenylosuccinate synthetase; Provisional	428
237506	PRK13785	PRK13785	adenylosuccinate synthetase; Provisional	454
184325	PRK13786	PRK13786	adenylosuccinate synthetase; Provisional	424
172324	PRK13787	PRK13787	adenylosuccinate synthetase; Provisional	423
184326	PRK13788	PRK13788	adenylosuccinate synthetase; Provisional	404
184327	PRK13789	PRK13789	phosphoribosylamine--glycine ligase; Provisional	426
237507	PRK13790	PRK13790	phosphoribosylamine--glycine ligase; Provisional	379
237508	PRK13791	PRK13791	c-type lysozyme inhibitor. 	113
106733	PRK13792	PRK13792	lysozyme inhibitor; Provisional	127
184329	PRK13793	PRK13793	nicotinate-nicotinamide nucleotide adenylyltransferase. 	196
237509	PRK13794	PRK13794	hypothetical protein; Provisional	479
237510	PRK13795	PRK13795	hypothetical protein; Provisional	636
237511	PRK13796	PRK13796	GTPase YqeH; Provisional	365
106738	PRK13797	PRK13797	allantoicase. 	516
184333	PRK13798	PRK13798	putative OHCU decarboxylase; Provisional	166
106740	PRK13799	PRK13799	unknown domain/N-carbamoyl-L-amino acid hydrolase fusion protein; Provisional	591
237512	PRK13800	PRK13800	fumarate reductase/succinate dehydrogenase flavoprotein subunit. 	897
184335	PRK13802	PRK13802	bifunctional indole-3-glycerol phosphate synthase/tryptophan synthase subunit beta; Provisional	695
237513	PRK13803	PRK13803	bifunctional phosphoribosylanthranilate isomerase/tryptophan synthase subunit beta; Provisional	610
237514	PRK13804	ileS	isoleucyl-tRNA synthetase; Provisional	961
237515	PRK13805	PRK13805	bifunctional acetaldehyde-CoA/alcohol dehydrogenase; Provisional	862
237516	PRK13806	rpsA	30S ribosomal protein S1; Provisional	491
237517	PRK13807	PRK13807	maltose phosphorylase; Provisional	756
172341	PRK13808	PRK13808	adenylate kinase; Provisional	333
184340	PRK13809	PRK13809	orotate phosphoribosyltransferase; Provisional	206
184341	PRK13810	PRK13810	orotate phosphoribosyltransferase; Provisional	187
237518	PRK13811	PRK13811	orotate phosphoribosyltransferase; Provisional	170
237519	PRK13812	PRK13812	orotate phosphoribosyltransferase; Provisional	176
237520	PRK13813	PRK13813	orotidine 5'-phosphate decarboxylase; Provisional	215
139876	PRK13814	pyrB	aspartate carbamoyltransferase. 	310
172345	PRK13815	PRK13815	ribosome-binding factor A; Provisional	122
184344	PRK13816	PRK13816	ribosome-binding factor A; Provisional	131
139879	PRK13817	PRK13817	ribosome-binding factor A; Provisional	119
184345	PRK13818	PRK13818	ribosome-binding factor A; Provisional	121
237521	PRK13820	PRK13820	argininosuccinate synthase; Provisional	394
184347	PRK13821	thyA	thymidylate synthase; Provisional	323
237522	PRK13822	PRK13822	conjugal transfer coupling protein TraG; Provisional	641
184348	PRK13823	PRK13823	conjugal transfer protein TrbD; Provisional	94
184349	PRK13824	PRK13824	replication initiation protein RepC; Provisional	404
237523	PRK13825	PRK13825	conjugal transfer protein TraB; Provisional	388
237524	PRK13826	PRK13826	Dtr system oriT relaxase; Provisional	1102
184351	PRK13828	rimM	16S rRNA-processing protein RimM; Provisional	161
184352	PRK13829	rimM	16S rRNA-processing protein RimM; Provisional	162
237525	PRK13830	PRK13830	conjugal transfer protein TrbE; Provisional	818
172358	PRK13831	PRK13831	conjugal transfer protein TrbI; Provisional	432
184353	PRK13832	PRK13832	plasmid partitioning protein; Provisional	520
172360	PRK13833	PRK13833	conjugal transfer protein TrbB; Provisional	323
172361	PRK13834	PRK13834	putative autoinducer synthesis protein; Provisional	207
172362	PRK13835	PRK13835	conjugal transfer protein TrbH; Provisional	145
172363	PRK13836	PRK13836	conjugal transfer protein TrbF; Provisional	220
237526	PRK13837	PRK13837	two-component system VirA-like sensor kinase. 	828
172365	PRK13838	PRK13838	conjugal transfer pilin processing protease TraF; Provisional	176
237527	PRK13839	PRK13839	conjugal transfer protein TrbG; Provisional	277
237528	PRK13840	PRK13840	sucrose phosphorylase; Provisional	495
237529	PRK13841	PRK13841	conjugal transfer protein TrbL; Provisional	391
172369	PRK13842	PRK13842	conjugal transfer protein TrbJ; Provisional	267
237530	PRK13843	PRK13843	conjugal transfer protein TraH; Provisional	207
139904	PRK13844	PRK13844	recombination protein RecR; Provisional	200
172371	PRK13845	PRK13845	putative glycerol-3-phosphate acyltransferase PlsX; Provisional	437
139906	PRK13846	PRK13846	phosphate acyltransferase PlsX. 	316
172372	PRK13847	PRK13847	type IV conjugative transfer system coupling protein TraD. 	71
172373	PRK13848	PRK13848	conjugal transfer protein TraC; Provisional	98
139909	PRK13849	PRK13849	conjugal transfer ATPase VirC1. 	231
237531	PRK13850	PRK13850	type IV secretion system protein VirD4; Provisional	670
172375	PRK13851	PRK13851	type IV secretion system protein VirB11; Provisional	344
139912	PRK13852	PRK13852	type IV secretion system protein. 	295
139913	PRK13853	PRK13853	type IV secretion system protein VirB4; Provisional	789
139914	PRK13854	PRK13854	type IV secretion system protein VirB3; Provisional	108
172376	PRK13855	PRK13855	type IV secretion system protein VirB10; Provisional	376
172377	PRK13856	PRK13856	two-component response regulator VirG; Provisional	241
172378	PRK13857	PRK13857	pilin major subunit VirB2. 	120
237532	PRK13858	PRK13858	T-DNA border endonuclease VirD1. 	147
172380	PRK13859	PRK13859	type IV secretion system lipoprotein VirB7; Provisional	55
172381	PRK13860	PRK13860	pilin minor subunit VirB5. 	220
172382	PRK13861	PRK13861	type IV secretion system protein VirB9; Provisional	292
172383	PRK13862	PRK13862	conjugal transfer protein VirC2. 	201
237533	PRK13863	PRK13863	T-DNA border endonuclease VirD2. 	446
237534	PRK13864	PRK13864	type IV secretion system lytic transglycosylase VirB1; Provisional	245
172386	PRK13865	PRK13865	type IV secretion system protein VirB8; Provisional	229
172387	PRK13866	PRK13866	plasmid partitioning protein RepB; Provisional	336
172388	PRK13867	PRK13867	type IV secretion system effector chaperone VirE1. 	65
237535	PRK13868	PRK13868	type IV secretion system single-stranded DNA binding effector VirE2. 	556
139929	PRK13869	PRK13869	plasmid-partitioning protein RepA; Provisional	405
172390	PRK13870	PRK13870	transcriptional regulator TraR; Provisional	234
172391	PRK13871	PRK13871	conjugal transfer protein TrbC; Provisional	135
184356	PRK13872	PRK13872	conjugal transfer protein TrbF; Provisional	228
237536	PRK13873	PRK13873	conjugal transfer ATPase TrbE; Provisional	811
184358	PRK13874	PRK13874	conjugal transfer protein TrbJ; Provisional	230
237537	PRK13875	PRK13875	conjugal transfer protein TrbL; Provisional	440
237538	PRK13876	PRK13876	conjugal transfer coupling protein TraG; Provisional	663
184361	PRK13877	PRK13877	conjugal transfer transcriptional regulator TraJ. 	114
237539	PRK13878	PRK13878	conjugal transfer relaxase TraI; Provisional	746
184363	PRK13879	PRK13879	P-type conjugative transfer protein TrbJ. 	253
237540	PRK13880	PRK13880	conjugal transfer coupling protein TraG; Provisional	636
237541	PRK13881	PRK13881	conjugal transfer protein TrbI; Provisional	472
237542	PRK13882	PRK13882	conjugal transfer protein TrbP; Provisional	232
184367	PRK13883	PRK13883	conjugal transfer protein TrbH; Provisional	151
184368	PRK13884	PRK13884	conjugal transfer peptidase TraF; Provisional	178
237543	PRK13885	PRK13885	conjugal transfer protein TrbG; Provisional	299
184370	PRK13886	PRK13886	conjugal transfer protein TraL; Provisional	241
237544	PRK13887	PRK13887	conjugal transfer protein TrbF; Provisional	250
237545	PRK13888	PRK13888	conjugal transfer protein TrbN; Provisional	206
237546	PRK13889	PRK13889	conjugal transfer relaxase TraA; Provisional	988
237547	PRK13890	PRK13890	conjugal transfer protein TrbA; Provisional	120
184375	PRK13891	PRK13891	conjugal transfer protein TrbE; Provisional	852
184376	PRK13892	PRK13892	conjugal transfer protein TrbC; Provisional	134
237548	PRK13893	PRK13893	conjugal transfer protein TrbM; Provisional	193
184377	PRK13894	PRK13894	conjugal transfer ATPase TrbB; Provisional	319
184378	PRK13895	PRK13895	conjugal transfer protein TraM; Provisional	144
184379	PRK13896	PRK13896	cobyrinic acid a,c-diamide synthase; Provisional	433
237549	PRK13897	PRK13897	type IV secretion system component VirD4; Provisional	606
172418	PRK13898	PRK13898	type IV secretion system ATPase VirB4; Provisional	800
237550	PRK13899	PRK13899	type IV secretion system protein VirB3; Provisional	97
184381	PRK13900	PRK13900	type IV secretion system ATPase VirB11; Provisional	332
139961	PRK13901	ruvA	Holliday junction branch migration protein RuvA. 	196
237551	PRK13902	alaS	alanyl-tRNA synthetase; Provisional	900
237552	PRK13903	murB	UDP-N-acetylmuramate dehydrogenase. 	363
184384	PRK13904	murB	UDP-N-acetylmuramate dehydrogenase. 	257
237553	PRK13905	murB	UDP-N-acetylmuramate dehydrogenase. 	298
184386	PRK13906	murB	UDP-N-acetylmuramate dehydrogenase. 	307
139967	PRK13907	rnhA	ribonuclease H; Provisional	128
184387	PRK13908	PRK13908	recombination protein RecO. 	204
237554	PRK13909	PRK13909	RecB-like helicase. 	910
172427	PRK13910	PRK13910	DNA glycosylase MutY; Provisional	289
139971	PRK13911	PRK13911	exodeoxyribonuclease III; Provisional	250
184389	PRK13912	PRK13912	nuclease NucT; Provisional	177
184390	PRK13913	PRK13913	3-methyladenine DNA glycosylase; Provisional	218
237555	PRK13914	PRK13914	invasion associated endopeptidase. 	481
237556	PRK13915	PRK13915	putative glucosyl-3-phosphoglycerate synthase; Provisional	306
139976	PRK13916	PRK13916	plasmid segregation protein ParR; Provisional	97
184393	PRK13917	PRK13917	plasmid segregation protein ParM; Provisional	344
237557	PRK13918	PRK13918	CRP/FNR family transcriptional regulator; Provisional	202
184395	PRK13919	PRK13919	putative RNA polymerase sigma E protein; Provisional	186
237558	PRK13920	PRK13920	putative anti-sigmaE protein; Provisional	206
237559	PRK13921	PRK13921	CRISPR-associated Cse2 family protein; Provisional	173
237560	PRK13922	PRK13922	rod shape-determining protein MreC; Provisional	276
237561	PRK13923	PRK13923	putative spore coat protein regulator protein YlbO; Provisional	170
184399	PRK13925	rnhB	ribonuclease HII; Provisional	198
184400	PRK13926	PRK13926	ribonuclease HII; Provisional	207
237562	PRK13927	PRK13927	rod shape-determining protein MreB; Provisional	334
237563	PRK13928	PRK13928	rod shape-determining protein Mbl; Provisional	336
184403	PRK13929	PRK13929	rod shape-determining protein MreBH; Provisional	335
237564	PRK13930	PRK13930	rod shape-determining protein MreB; Provisional	335
184405	PRK13931	PRK13931	5'/3'-nucleotidase SurE. 	261
172445	PRK13932	PRK13932	stationary phase survival protein SurE; Provisional	257
184406	PRK13933	PRK13933	stationary phase survival protein SurE; Provisional	253
237565	PRK13934	PRK13934	stationary phase survival protein SurE; Provisional	266
237566	PRK13935	PRK13935	stationary phase survival protein SurE; Provisional	253
237567	PRK13936	PRK13936	phosphoheptose isomerase; Provisional	197
184408	PRK13937	PRK13937	phosphoheptose isomerase; Provisional	188
139997	PRK13938	PRK13938	phosphoheptose isomerase; Provisional	196
172450	PRK13940	PRK13940	glutamyl-tRNA reductase; Provisional	414
184409	PRK13942	PRK13942	protein-L-isoaspartate O-methyltransferase; Provisional	212
237568	PRK13943	PRK13943	protein-L-isoaspartate O-methyltransferase; Provisional	322
140001	PRK13944	PRK13944	protein-L-isoaspartate O-methyltransferase; Provisional	205
184410	PRK13945	PRK13945	formamidopyrimidine-DNA glycosylase; Provisional	282
184411	PRK13946	PRK13946	shikimate kinase; Provisional	184
184412	PRK13947	PRK13947	shikimate kinase; Provisional	171
184413	PRK13948	PRK13948	shikimate kinase; Provisional	182
140006	PRK13949	PRK13949	shikimate kinase; Provisional	169
172457	PRK13951	PRK13951	bifunctional shikimate kinase AroK/3-dehydroquinate synthase AroB. 	488
237569	PRK13952	mscL	large conductance mechanosensitive channel protein MscL. 	142
184415	PRK13953	mscL	large conductance mechanosensitive channel protein MscL. 	125
172460	PRK13954	mscL	large conductance mechanosensitive channel protein MscL. 	119
184416	PRK13955	mscL	large conductance mechanosensitive channel protein MscL. 	130
184417	PRK13956	dut	dUTP diphosphatase. 	147
140013	PRK13957	PRK13957	indole-3-glycerol-phosphate synthase; Provisional	247
184418	PRK13958	PRK13958	N-(5'-phosphoribosyl)anthranilate isomerase; Provisional	207
237570	PRK13959	PRK13959	phosphoribosylaminoimidazole-succinocarboxamide synthase; Provisional	341
184420	PRK13960	PRK13960	phosphoribosylaminoimidazole-succinocarboxamide synthase; Provisional	367
237571	PRK13961	PRK13961	phosphoribosylaminoimidazole-succinocarboxamide synthase; Provisional	296
237572	PRK13962	PRK13962	bifunctional phosphoglycerate kinase/triosephosphate isomerase; Provisional	645
237573	PRK13963	PRK13963	rRNA maturation RNase YbeY. 	258
184424	PRK13964	coaD	pantetheine-phosphate adenylyltransferase. 	140
184425	PRK13965	PRK13965	ribonucleotide-diphosphate reductase subunit beta; Provisional	335
140022	PRK13966	nrdF2	ribonucleotide-diphosphate reductase subunit beta; Provisional	324
140023	PRK13967	nrdF1	ribonucleotide-diphosphate reductase subunit beta; Provisional	322
184426	PRK13968	PRK13968	putative succinate semialdehyde dehydrogenase; Provisional	462
184427	PRK13969	PRK13969	proline racemase; Provisional	334
172473	PRK13970	PRK13970	4-hydroxyproline epimerase. 	311
184428	PRK13971	PRK13971	4-hydroxyproline epimerase. 	333
172475	PRK13972	PRK13972	GSH-dependent disulfide bond oxidoreductase; Provisional	215
184429	PRK13973	PRK13973	thymidylate kinase; Provisional	213
172477	PRK13974	PRK13974	dTMP kinase. 	212
184430	PRK13975	PRK13975	dTMP kinase. 	196
237574	PRK13976	PRK13976	dTMP kinase. 	209
237575	PRK13977	PRK13977	myosin-cross-reactive antigen; Provisional	576
184433	PRK13978	PRK13978	ribose 5-phosphate isomerase A. 	228
237576	PRK13979	PRK13979	DNA topoisomerase IV subunit A; Provisional	957
184435	PRK13980	PRK13980	NAD synthetase; Provisional	265
237577	PRK13981	PRK13981	NAD synthetase; Provisional	540
172484	PRK13982	PRK13982	bifunctional SbtC-like/phosphopantothenoylcysteine decarboxylase/phosphopantothenate synthase; Provisional	475
237578	PRK13983	PRK13983	M20 family metallo-hydrolase. 	400
172486	PRK13984	PRK13984	putative oxidoreductase; Provisional	604
184438	PRK13985	ureB	urease subunit alpha. 	568
184439	PRK13986	PRK13986	urease subunit beta. 	225
237579	PRK13987	PRK13987	cell division topological specificity factor MinE; Provisional	91
184441	PRK13988	PRK13988	cell division topological specificity factor MinE; Provisional	97
184442	PRK13989	PRK13989	cell division topological specificity factor MinE; Provisional	84
172492	PRK13990	PRK13990	cell division topological specificity factor MinE; Provisional	90
172493	PRK13991	PRK13991	cell division topological specificity factor MinE; Provisional	87
237580	PRK13992	minC	septum site-determining protein MinC. 	205
237581	PRK13994	PRK13994	potassium-transporting ATPase subunit C; Provisional	222
184445	PRK13995	PRK13995	K(+)-transporting ATPase subunit C. 	203
172497	PRK13996	PRK13996	potassium-transporting ATPase subunit C; Provisional	197
172498	PRK13997	PRK13997	K(+)-transporting ATPase subunit C. 	193
172499	PRK13998	PRK13998	K(+)-transporting ATPase subunit C. 	186
172500	PRK13999	PRK13999	K(+)-transporting ATPase subunit C. 	201
184446	PRK14000	PRK14000	K(+)-transporting ATPase subunit C. 	185
172502	PRK14001	PRK14001	K(+)-transporting ATPase subunit C. 	189
172503	PRK14002	PRK14002	K(+)-transporting ATPase subunit C. 	186
184447	PRK14003	PRK14003	K(+)-transporting ATPase subunit C. 	194
172505	PRK14004	hisH	imidazole glycerol phosphate synthase subunit HisH; Provisional	210
184448	PRK14010	PRK14010	K(+)-transporting ATPase subunit B. 	673
237582	PRK14011	PRK14011	prefoldin subunit alpha; Provisional	144
184450	PRK14012	PRK14012	IscS subfamily cysteine desulfurase. 	404
237583	PRK14013	PRK14013	hypothetical protein; Provisional	338
237584	PRK14014	PRK14014	putative acyltransferase; Provisional	301
237585	PRK14015	pepN	aminopeptidase N; Provisional	875
237586	PRK14016	PRK14016	cyanophycin synthetase; Provisional	727
184455	PRK14017	PRK14017	galactonate dehydratase; Provisional	382
184456	PRK14018	PRK14018	bifunctional peptide-methionine (S)-S-oxide reductase MsrA/peptide-methionine (R)-S-oxide reductase MsrB. 	521
237587	PRK14019	PRK14019	bifunctional 3,4-dihydroxy-2-butanone-4-phosphate synthase/GTP cyclohydrolase II. 	367
184458	PRK14021	PRK14021	bifunctional shikimate kinase/3-dehydroquinate synthase; Provisional	542
237588	PRK14022	PRK14022	UDP-N-acetylmuramoyl-L-alanyl-D-glutamate--L-lysine ligase. 	481
184460	PRK14023	PRK14023	homoaconitate hydratase small subunit; Provisional	166
237589	PRK14024	PRK14024	phosphoribosyl isomerase A; Provisional	241
184462	PRK14025	PRK14025	multifunctional 3-isopropylmalate dehydrogenase/D-malate dehydrogenase; Provisional	330
172521	PRK14027	PRK14027	quinate/shikimate dehydrogenase (NAD+). 	283
172522	PRK14028	PRK14028	pyruvate ferredoxin oxidoreductase subunit gamma/delta; Provisional	312
172523	PRK14029	PRK14029	pyruvate/ketoisovalerate ferredoxin oxidoreductase subunit gamma; Provisional	185
184463	PRK14030	PRK14030	glutamate dehydrogenase; Provisional	445
184464	PRK14031	PRK14031	NADP-specific glutamate dehydrogenase. 	444
184465	PRK14032	PRK14032	citrate synthase; Provisional	447
237590	PRK14033	PRK14033	bifunctional 2-methylcitrate synthase/citrate synthase. 	375
184467	PRK14034	PRK14034	citrate synthase; Provisional	372
184468	PRK14035	PRK14035	citrate synthase; Provisional	371
237591	PRK14036	PRK14036	citrate synthase; Provisional	377
184470	PRK14037	PRK14037	citrate synthase; Provisional	377
172532	PRK14038	PRK14038	ADP-specific glucokinase. 	453
184471	PRK14039	PRK14039	ADP-dependent glucokinase; Provisional	453
237592	PRK14040	PRK14040	oxaloacetate decarboxylase subunit alpha. 	593
237593	PRK14041	PRK14041	pyruvate carboxylase subunit B. 	467
172536	PRK14042	PRK14042	pyruvate carboxylase subunit B; Provisional	596
172537	PRK14045	PRK14045	1-aminocyclopropane-1-carboxylate deaminase; Provisional	329
237594	PRK14046	PRK14046	malate--CoA ligase subunit beta; Provisional	392
184475	PRK14047	PRK14047	tetrahydromethanopterin S-methyltransferase subunit H. 	310
172540	PRK14048	PRK14048	ferrichrome/ferrioxamine B periplasmic transporter; Provisional	374
172541	PRK14049	PRK14049	ferrioxamine B receptor precursor protein; Provisional	726
237595	PRK14050	PRK14050	TonB-dependent siderophore receptor. 	728
184476	PRK14051	PRK14051	negative regulator GrlR; Provisional	123
184477	PRK14052	PRK14052	adenosine monophosphate-protein transferase vopS. 	387
237596	PRK14053	PRK14053	methyltransferase; Provisional	194
237597	PRK14054	PRK14054	peptide-methionine (S)-S-oxide reductase. 	172
172547	PRK14055	PRK14055	aromatic amino acid hydroxylase; Provisional	362
237598	PRK14056	PRK14056	aromatic amino acid hydroxylase. 	578
172549	PRK14057	PRK14057	epimerase; Provisional	254
237599	PRK14058	PRK14058	[LysW]-aminoadipate/[LysW]-glutamate kinase. 	268
184482	PRK14059	PRK14059	pyrimidine reductase family protein. 	251
172552	PRK14061	PRK14061	unknown domain/lipoate-protein ligase A fusion protein; Provisional	562
184483	PRK14063	PRK14063	exodeoxyribonuclease VII small subunit; Provisional	76
172554	PRK14064	PRK14064	exodeoxyribonuclease VII small subunit; Provisional	75
184484	PRK14065	PRK14065	exodeoxyribonuclease VII small subunit; Provisional	86
172556	PRK14066	PRK14066	exodeoxyribonuclease VII small subunit; Provisional	75
172557	PRK14067	PRK14067	exodeoxyribonuclease VII small subunit; Provisional	80
184485	PRK14068	PRK14068	exodeoxyribonuclease VII small subunit; Provisional	76
172559	PRK14069	PRK14069	exodeoxyribonuclease VII small subunit; Provisional	95
184486	PRK14070	PRK14070	exodeoxyribonuclease VII small subunit; Provisional	69
184487	PRK14071	PRK14071	ATP-dependent 6-phosphofructokinase. 	360
237600	PRK14072	PRK14072	diphosphate--fructose-6-phosphate 1-phosphotransferase. 	416
172564	PRK14074	rpsF	30S ribosomal protein S6; Provisional	257
184489	PRK14075	pnk	NAD(+) kinase. 	256
237601	PRK14076	pnk	bifunctional NADP phosphatase/NAD kinase. 	569
172567	PRK14077	pnk	NAD(+) kinase. 	287
184491	PRK14079	recF	recombination protein F; Provisional	349
237602	PRK14081	PRK14081	triple tyrosine motif-containing protein; Provisional	667
184493	PRK14082	PRK14082	hypothetical protein; Provisional	65
237603	PRK14083	PRK14083	HSP90 family protein; Provisional	601
184495	PRK14084	PRK14084	DNA-binding response regulator. 	246
237604	PRK14085	PRK14085	imidazolonepropionase; Provisional	382
237605	PRK14086	dnaA	chromosomal replication initiator protein DnaA. 	617
172577	PRK14087	dnaA	chromosomal replication initiator protein DnaA. 	450
172578	PRK14088	dnaA	chromosomal replication initiator protein DnaA. 	440
237606	PRK14089	PRK14089	lipid-A-disaccharide synthase. 	347
184499	PRK14090	PRK14090	phosphoribosylformylglycinamidine synthase subunit PurL. 	601
237607	PRK14091	PRK14091	RNA chaperone Hfq. 	165
172582	PRK14092	PRK14092	2-amino-4-hydroxy-6-hydroxymethyldihydropteridine diphosphokinase. 	163
184501	PRK14093	PRK14093	UDP-N-acetylmuramoylalanyl-D-glutamyl-2,6-diaminopimelate--D-alanyl-D-alanine ligase; Provisional	479
172584	PRK14094	psbM	photosystem II reaction center protein PsbM. 	50
237608	PRK14095	pgi	glucose-6-phosphate isomerase; Provisional	533
237609	PRK14096	pgi	glucose-6-phosphate isomerase; Provisional	528
184504	PRK14097	pgi	glucose-6-phosphate isomerase; Provisional	448
172588	PRK14098	PRK14098	starch synthase. 	489
237610	PRK14099	PRK14099	glycogen synthase GlgA. 	485
184506	PRK14100	PRK14100	2-phosphosulfolactate phosphatase; Provisional	237
184507	PRK14101	PRK14101	bifunctional transcriptional regulator/glucokinase. 	638
184508	PRK14102	nifW	nitrogenase-stabilizing/protective protein NifW. 	105
184509	PRK14103	PRK14103	trans-aconitate 2-methyltransferase; Provisional	255
172594	PRK14104	PRK14104	chaperonin GroEL; Provisional	546
237611	PRK14105	PRK14105	selenide, water dikinase SelD. 	345
184511	PRK14106	murD	UDP-N-acetylmuramoyl-L-alanyl-D-glutamate synthetase; Provisional	450
237612	PRK14108	PRK14108	bifunctional [glutamine synthetase] adenylyltransferase/[glutamine synthetase]-adenylyl-L-tyrosine phosphorylase. 	986
237613	PRK14109	PRK14109	bifunctional [glutamine synthetase] adenylyltransferase/[glutamine synthetase]-adenylyl-L-tyrosine phosphorylase. 	1007
184514	PRK14110	PRK14110	F0F1 ATP synthase subunit gamma; Provisional	291
184515	PRK14111	PRK14111	F0F1 ATP synthase subunit gamma; Provisional	290
172602	PRK14112	PRK14112	urease accessory protein UreE; Provisional	149
237614	PRK14113	PRK14113	urease accessory protein UreE; Provisional	152
172604	PRK14114	PRK14114	1-(5-phosphoribosyl)-5-((5-phosphoribosylamino)methylideneamino)imidazole-4-carboxamide isomerase. 	241
184516	PRK14115	gpmA	2,3-diphosphoglycerate-dependent phosphoglycerate mutase. 	247
172606	PRK14116	gpmA	2,3-diphosphoglycerate-dependent phosphoglycerate mutase. 	228
184517	PRK14117	gpmA	phosphoglyceromutase; Provisional	230
172608	PRK14118	gpmA	2,3-diphosphoglycerate-dependent phosphoglycerate mutase. 	227
184518	PRK14119	gpmA	phosphoglyceromutase; Provisional	228
184519	PRK14120	gpmA	phosphoglyceromutase; Provisional	249
237615	PRK14121	PRK14121	tRNA (guanine-N(7)-)-methyltransferase; Provisional	390
184521	PRK14122	PRK14122	tRNA pseudouridine synthase B; Provisional	312
184522	PRK14123	PRK14123	tRNA pseudouridine synthase B; Provisional	305
172614	PRK14124	PRK14124	tRNA pseudouridine synthase B; Provisional	308
184523	PRK14125	PRK14125	cell division suppressor protein YneA; Provisional	103
172616	PRK14126	PRK14126	cell division protein ZapA; Provisional	85
237616	PRK14127	PRK14127	cell division regulator GpsB. 	109
184525	PRK14128	iraD	anti-adapter protein IraD. 	69
184526	PRK14129	PRK14129	heat shock protein HspQ; Provisional	105
237617	PRK14131	PRK14131	N-acetylneuraminate epimerase. 	376
237618	PRK14132	PRK14132	riboflavin kinase; Provisional	126
184529	PRK14133	PRK14133	DNA polymerase IV; Provisional	347
184530	PRK14134	recX	recombination regulator RecX; Provisional	283
237619	PRK14135	recX	recombination regulator RecX; Provisional	263
237620	PRK14136	recX	recombination regulator RecX; Provisional	309
172626	PRK14137	recX	recombination regulator RecX; Provisional	195
172627	PRK14138	PRK14138	NAD-dependent deacetylase; Provisional	244
237621	PRK14139	PRK14139	heat shock protein GrpE; Provisional	185
237622	PRK14140	PRK14140	heat shock protein GrpE; Provisional	191
172630	PRK14141	PRK14141	heat shock protein GrpE; Provisional	209
237623	PRK14142	PRK14142	heat shock protein GrpE; Provisional	223
237624	PRK14143	PRK14143	heat shock protein GrpE; Provisional	238
184535	PRK14144	PRK14144	heat shock protein GrpE; Provisional	199
184536	PRK14145	PRK14145	heat shock protein GrpE; Provisional	196
172635	PRK14146	PRK14146	heat shock protein GrpE; Provisional	215
237625	PRK14147	PRK14147	heat shock protein GrpE; Provisional	172
172637	PRK14148	PRK14148	heat shock protein GrpE; Provisional	195
184538	PRK14149	PRK14149	heat shock protein GrpE; Provisional	191
184539	PRK14150	PRK14150	heat shock protein GrpE; Provisional	193
172640	PRK14151	PRK14151	heat shock protein GrpE; Provisional	176
184540	PRK14153	PRK14153	heat shock protein GrpE; Provisional	194
237626	PRK14154	PRK14154	heat shock protein GrpE; Provisional	208
237627	PRK14155	PRK14155	heat shock protein GrpE; Provisional	208
237628	PRK14156	PRK14156	heat shock protein GrpE; Provisional	177
184543	PRK14157	PRK14157	heat shock protein GrpE; Provisional	227
172646	PRK14158	PRK14158	heat shock protein GrpE; Provisional	194
172647	PRK14159	PRK14159	heat shock protein GrpE; Provisional	176
237629	PRK14160	PRK14160	heat shock protein GrpE; Provisional	211
237630	PRK14161	PRK14161	heat shock protein GrpE; Provisional	178
237631	PRK14162	PRK14162	heat shock protein GrpE; Provisional	194
184546	PRK14163	PRK14163	heat shock protein GrpE; Provisional	214
237632	PRK14164	PRK14164	heat shock protein GrpE; Provisional	218
184548	PRK14165	PRK14165	winged helix-turn-helix domain-containing protein/riboflavin kinase; Provisional	217
172654	PRK14166	PRK14166	bifunctional 5,10-methylene-tetrahydrofolate dehydrogenase/ 5,10-methylene-tetrahydrofolate cyclohydrolase; Provisional	282
184549	PRK14167	PRK14167	bifunctional 5,10-methylene-tetrahydrofolate dehydrogenase/ 5,10-methylene-tetrahydrofolate cyclohydrolase; Provisional	297
237633	PRK14168	PRK14168	bifunctional methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase FolD. 	297
184550	PRK14169	PRK14169	bifunctional methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase FolD. 	282
172658	PRK14170	PRK14170	bifunctional methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase FolD. 	284
172659	PRK14171	PRK14171	bifunctional methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase FolD. 	288
172660	PRK14172	PRK14172	bifunctional 5,10-methylene-tetrahydrofolate dehydrogenase/ 5,10-methylene-tetrahydrofolate cyclohydrolase; Provisional	278
184551	PRK14173	PRK14173	bifunctional 5,10-methylene-tetrahydrofolate dehydrogenase/ 5,10-methylene-tetrahydrofolate cyclohydrolase; Provisional	287
172662	PRK14174	PRK14174	bifunctional 5,10-methylene-tetrahydrofolate dehydrogenase/ 5,10-methylene-tetrahydrofolate cyclohydrolase; Provisional	295
184552	PRK14175	PRK14175	bifunctional methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase FolD. 	286
184553	PRK14176	PRK14176	bifunctional 5,10-methylene-tetrahydrofolate dehydrogenase/ 5,10-methylene-tetrahydrofolate cyclohydrolase; Provisional	287
172665	PRK14177	PRK14177	bifunctional methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase FolD. 	284
172666	PRK14178	PRK14178	bifunctional methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase FolD. 	279
237634	PRK14179	PRK14179	bifunctional methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase. 	284
172668	PRK14180	PRK14180	bifunctional 5,10-methylene-tetrahydrofolate dehydrogenase/ 5,10-methylene-tetrahydrofolate cyclohydrolase; Provisional	282
172669	PRK14181	PRK14181	bifunctional methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase FolD. 	287
172670	PRK14182	PRK14182	bifunctional 5,10-methylene-tetrahydrofolate dehydrogenase/ 5,10-methylene-tetrahydrofolate cyclohydrolase; Provisional	282
184555	PRK14183	PRK14183	bifunctional 5,10-methylene-tetrahydrofolate dehydrogenase/ 5,10-methylene-tetrahydrofolate cyclohydrolase; Provisional	281
237635	PRK14184	PRK14184	bifunctional 5,10-methylene-tetrahydrofolate dehydrogenase/ 5,10-methylene-tetrahydrofolate cyclohydrolase; Provisional	286
184556	PRK14185	PRK14185	bifunctional methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase FolD. 	293
237636	PRK14186	PRK14186	bifunctional methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase FolD. 	297
172675	PRK14187	PRK14187	bifunctional methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase FolD. 	294
184558	PRK14188	PRK14188	bifunctional methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase FolD. 	296
184559	PRK14189	PRK14189	bifunctional methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase. 	285
184560	PRK14190	PRK14190	bifunctional 5,10-methylene-tetrahydrofolate dehydrogenase/ 5,10-methylene-tetrahydrofolate cyclohydrolase; Provisional	284
172679	PRK14191	PRK14191	bifunctional methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase FolD. 	285
184561	PRK14192	PRK14192	bifunctional methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase FolD. 	283
237637	PRK14193	PRK14193	bifunctional 5,10-methylene-tetrahydrofolate dehydrogenase/ 5,10-methylene-tetrahydrofolate cyclohydrolase; Provisional	284
172682	PRK14194	PRK14194	bifunctional methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase FolD. 	301
184563	PRK14195	PRK14195	fluoride efflux transporter CrcB. 	125
184564	PRK14196	PRK14196	fluoride efflux transporter CrcB. 	127
172685	PRK14197	PRK14197	fluoride efflux transporter CrcB. 	124
172686	PRK14198	PRK14198	fluoride efflux transporter CrcB. 	124
172687	PRK14199	PRK14199	fluoride efflux transporter CrcB. 	128
237638	PRK14200	PRK14200	fluoride efflux transporter CrcB. 	127
184566	PRK14201	PRK14201	fluoride efflux transporter CrcB. 	121
172690	PRK14202	PRK14202	fluoride efflux transporter CrcB. 	128
237639	PRK14203	PRK14203	fluoride efflux transporter CrcB. 	132
172692	PRK14204	PRK14204	fluoride efflux transporter CrcB. 	127
172693	PRK14205	PRK14205	fluoride efflux transporter CrcB. 	118
172694	PRK14206	PRK14206	fluoride efflux transporter CrcB. 	127
172695	PRK14207	PRK14207	fluoride efflux transporter CrcB. 	123
172696	PRK14208	PRK14208	fluoride efflux transporter CrcB. 	126
237640	PRK14209	PRK14209	fluoride efflux transporter CrcB. 	124
172698	PRK14210	PRK14210	fluoride efflux transporter CrcB. 	127
172699	PRK14211	PRK14211	fluoride efflux transporter CrcB. 	114
184569	PRK14212	PRK14212	fluoride efflux transporter CrcB. 	128
184570	PRK14213	PRK14213	camphor resistance protein CrcB; Provisional	118
184571	PRK14214	PRK14214	fluoride efflux transporter CrcB. 	118
172703	PRK14215	PRK14215	fluoride efflux transporter CrcB. 	126
184572	PRK14216	PRK14216	fluoride efflux transporter CrcB. 	132
172705	PRK14217	PRK14217	fluoride efflux transporter CrcB. 	134
184573	PRK14218	PRK14218	fluoride efflux transporter CrcB. 	133
172707	PRK14219	PRK14219	fluoride efflux transporter CrcB. 	132
237641	PRK14220	PRK14220	fluoride efflux transporter CrcB. 	120
184575	PRK14221	PRK14221	fluoride efflux transporter CrcB. 	124
172710	PRK14222	PRK14222	fluoride efflux transporter CrcB. 	124
184576	PRK14223	PRK14223	fluoride efflux transporter CrcB. 	122
237642	PRK14224	PRK14224	fluoride efflux transporter CrcB. 	126
172713	PRK14225	PRK14225	fluoride efflux transporter CrcB. 	137
172714	PRK14226	PRK14226	fluoride efflux transporter CrcB. 	130
172715	PRK14227	PRK14227	fluoride efflux transporter CrcB. 	124
237643	PRK14228	PRK14228	fluoride efflux transporter CrcB. 	122
172717	PRK14229	PRK14229	fluoride efflux transporter CrcB. 	108
172718	PRK14230	PRK14230	camphor resistance protein CrcB; Provisional	119
184579	PRK14231	PRK14231	fluoride efflux transporter CrcB. 	129
237644	PRK14232	PRK14232	fluoride efflux transporter CrcB. 	120
172721	PRK14233	PRK14233	fluoride efflux transporter CrcB. 	133
172722	PRK14234	PRK14234	fluoride efflux transporter CrcB. 	124
237645	PRK14235	PRK14235	phosphate transporter ATP-binding protein; Provisional	267
184582	PRK14236	PRK14236	phosphate transporter ATP-binding protein; Provisional	272
237646	PRK14237	PRK14237	phosphate transporter ATP-binding protein; Provisional	267
184584	PRK14238	PRK14238	phosphate transporter ATP-binding protein; Provisional	271
184585	PRK14239	PRK14239	phosphate transporter ATP-binding protein; Provisional	252
184586	PRK14240	PRK14240	phosphate transporter ATP-binding protein; Provisional	250
184587	PRK14241	PRK14241	phosphate transporter ATP-binding protein; Provisional	258
172730	PRK14242	PRK14242	phosphate ABC transporter ATP-binding protein. 	253
184588	PRK14243	PRK14243	phosphate transporter ATP-binding protein; Provisional	264
172732	PRK14244	PRK14244	phosphate ABC transporter ATP-binding protein; Provisional	251
172733	PRK14245	PRK14245	phosphate ABC transporter ATP-binding protein; Provisional	250
172734	PRK14246	PRK14246	phosphate ABC transporter ATP-binding protein; Provisional	257
172735	PRK14247	PRK14247	phosphate ABC transporter ATP-binding protein; Provisional	250
237647	PRK14248	PRK14248	phosphate ABC transporter ATP-binding protein; Provisional	268
184590	PRK14249	PRK14249	phosphate ABC transporter ATP-binding protein; Provisional	251
237648	PRK14250	PRK14250	phosphate ABC transporter ATP-binding protein; Provisional	241
172739	PRK14251	PRK14251	phosphate ABC transporter ATP-binding protein; Provisional	251
172740	PRK14252	PRK14252	phosphate ABC transporter ATP-binding protein; Provisional	265
172741	PRK14253	PRK14253	phosphate ABC transporter ATP-binding protein; Provisional	249
237649	PRK14254	PRK14254	phosphate ABC transporter ATP-binding protein; Provisional	285
172743	PRK14255	PRK14255	phosphate ABC transporter ATP-binding protein; Provisional	252
172744	PRK14256	PRK14256	phosphate ABC transporter ATP-binding protein; Provisional	252
172745	PRK14257	PRK14257	phosphate ABC transporter ATP-binding protein; Provisional	329
184593	PRK14258	PRK14258	phosphate ABC transporter ATP-binding protein; Provisional	261
172747	PRK14259	PRK14259	phosphate ABC transporter ATP-binding protein; Provisional	269
172748	PRK14260	PRK14260	phosphate ABC transporter ATP-binding protein; Provisional	259
172749	PRK14261	PRK14261	phosphate ABC transporter ATP-binding protein; Provisional	253
172750	PRK14262	PRK14262	phosphate ABC transporter ATP-binding protein; Provisional	250
172751	PRK14263	PRK14263	phosphate ABC transporter ATP-binding protein; Provisional	261
184594	PRK14264	PRK14264	phosphate ABC transporter ATP-binding protein; Provisional	305
237650	PRK14265	PRK14265	phosphate ABC transporter ATP-binding protein; Provisional	274
237651	PRK14266	PRK14266	phosphate ABC transporter ATP-binding protein; Provisional	250
184596	PRK14267	PRK14267	phosphate ABC transporter ATP-binding protein; Provisional	253
172756	PRK14268	PRK14268	phosphate ABC transporter ATP-binding protein; Provisional	258
172757	PRK14269	PRK14269	phosphate ABC transporter ATP-binding protein; Provisional	246
184597	PRK14270	PRK14270	phosphate ABC transporter ATP-binding protein; Provisional	251
172759	PRK14271	PRK14271	phosphate ABC transporter ATP-binding protein; Provisional	276
172760	PRK14272	PRK14272	phosphate ABC transporter ATP-binding protein; Provisional	252
172761	PRK14273	PRK14273	phosphate ABC transporter ATP-binding protein; Provisional	254
172762	PRK14274	PRK14274	phosphate ABC transporter ATP-binding protein; Provisional	259
237652	PRK14275	PRK14275	phosphate ABC transporter ATP-binding protein; Provisional	286
237653	PRK14276	PRK14276	chaperone protein DnaJ; Provisional	380
184599	PRK14277	PRK14277	chaperone protein DnaJ; Provisional	386
237654	PRK14278	PRK14278	chaperone protein DnaJ; Provisional	378
237655	PRK14279	PRK14279	molecular chaperone DnaJ. 	392
237656	PRK14280	PRK14280	molecular chaperone DnaJ. 	376
237657	PRK14281	PRK14281	chaperone protein DnaJ; Provisional	397
184603	PRK14282	PRK14282	chaperone protein DnaJ; Provisional	369
184604	PRK14283	PRK14283	chaperone protein DnaJ; Provisional	378
237658	PRK14284	PRK14284	chaperone protein DnaJ; Provisional	391
172773	PRK14285	PRK14285	chaperone protein DnaJ; Provisional	365
172774	PRK14286	PRK14286	chaperone protein DnaJ; Provisional	372
237659	PRK14287	PRK14287	chaperone protein DnaJ; Provisional	371
172776	PRK14288	PRK14288	molecular chaperone DnaJ. 	369
237660	PRK14289	PRK14289	molecular chaperone DnaJ. 	386
172778	PRK14290	PRK14290	chaperone protein DnaJ; Provisional	365
237661	PRK14291	PRK14291	chaperone protein DnaJ; Provisional	382
237662	PRK14292	PRK14292	chaperone protein DnaJ; Provisional	371
237663	PRK14293	PRK14293	molecular chaperone DnaJ. 	374
237664	PRK14294	PRK14294	chaperone protein DnaJ; Provisional	366
237665	PRK14295	PRK14295	molecular chaperone DnaJ. 	389
237666	PRK14296	PRK14296	chaperone protein DnaJ; Provisional	372
184611	PRK14297	PRK14297	molecular chaperone DnaJ. 	380
184612	PRK14298	PRK14298	chaperone protein DnaJ; Provisional	377
237667	PRK14299	PRK14299	chaperone protein DnaJ; Provisional	291
172788	PRK14300	PRK14300	chaperone protein DnaJ; Provisional	372
237668	PRK14301	PRK14301	chaperone protein DnaJ; Provisional	373
184614	PRK14314	glmM	phosphoglucosamine mutase; Provisional	450
237669	PRK14315	glmM	phosphoglucosamine mutase; Provisional	448
237670	PRK14316	glmM	phosphoglucosamine mutase; Provisional	448
237671	PRK14317	glmM	phosphoglucosamine mutase; Provisional	465
237672	PRK14318	glmM	phosphoglucosamine mutase; Provisional	448
172795	PRK14319	glmM	phosphoglucosamine mutase; Provisional	430
172796	PRK14320	glmM	phosphoglucosamine mutase; Provisional	443
172797	PRK14321	glmM	phosphoglucosamine mutase; Provisional	449
184619	PRK14322	glmM	phosphoglucosamine mutase; Provisional	429
184620	PRK14323	glmM	phosphoglucosamine mutase; Provisional	440
184621	PRK14324	glmM	phosphoglucosamine mutase; Provisional	446
237673	PRK14325	PRK14325	(dimethylallyl)adenosine tRNA methylthiotransferase; Provisional	444
237674	PRK14326	PRK14326	(dimethylallyl)adenosine tRNA methylthiotransferase; Provisional	502
184624	PRK14327	PRK14327	(dimethylallyl)adenosine tRNA methylthiotransferase; Provisional	509
237675	PRK14328	PRK14328	(dimethylallyl)adenosine tRNA methylthiotransferase; Provisional	439
237676	PRK14329	PRK14329	(dimethylallyl)adenosine tRNA methylthiotransferase; Provisional	467
184627	PRK14330	PRK14330	(dimethylallyl)adenosine tRNA methylthiotransferase; Provisional	434
184628	PRK14331	PRK14331	(dimethylallyl)adenosine tRNA methylthiotransferase; Provisional	437
172808	PRK14332	PRK14332	(dimethylallyl)adenosine tRNA methylthiotransferase; Provisional	449
237677	PRK14333	PRK14333	(dimethylallyl)adenosine tRNA methylthiotransferase; Provisional	448
184630	PRK14334	PRK14334	(dimethylallyl)adenosine tRNA methylthiotransferase; Provisional	440
237678	PRK14335	PRK14335	(dimethylallyl)adenosine tRNA methylthiotransferase; Provisional	455
184632	PRK14336	PRK14336	(dimethylallyl)adenosine tRNA methylthiotransferase; Provisional	418
172813	PRK14337	PRK14337	(dimethylallyl)adenosine tRNA methylthiotransferase; Provisional	446
184633	PRK14338	PRK14338	(dimethylallyl)adenosine tRNA methylthiotransferase; Provisional	459
184634	PRK14339	PRK14339	(dimethylallyl)adenosine tRNA methylthiotransferase; Provisional	420
237679	PRK14340	PRK14340	(dimethylallyl)adenosine tRNA methylthiotransferase; Provisional	445
237680	PRK14341	PRK14341	lipoyl(octanoyl) transferase LipB. 	213
237681	PRK14342	PRK14342	lipoyl(octanoyl) transferase LipB. 	213
237682	PRK14343	PRK14343	lipoyl(octanoyl) transferase LipB. 	235
237683	PRK14344	PRK14344	lipoyl(octanoyl) transferase LipB. 	223
184638	PRK14345	PRK14345	lipoyl(octanoyl) transferase LipB. 	234
237684	PRK14346	PRK14346	lipoyl(octanoyl) transferase LipB. 	230
172823	PRK14347	PRK14347	lipoyl(octanoyl) transferase LipB. 	209
172824	PRK14348	PRK14348	lipoyl(octanoyl) transferase LipB. 	221
172825	PRK14349	PRK14349	lipoyl(octanoyl) transferase LipB. 	220
172826	PRK14350	ligA	NAD-dependent DNA ligase LigA; Provisional	669
184640	PRK14351	ligA	NAD-dependent DNA ligase LigA; Provisional	689
184641	PRK14352	glmU	bifunctional UDP-N-acetylglucosamine diphosphorylase/glucosamine-1-phosphate N-acetyltransferase GlmU. 	482
184642	PRK14353	glmU	bifunctional UDP-N-acetylglucosamine diphosphorylase/glucosamine-1-phosphate N-acetyltransferase GlmU. 	446
184643	PRK14354	glmU	bifunctional UDP-N-acetylglucosamine diphosphorylase/glucosamine-1-phosphate N-acetyltransferase GlmU. 	458
237685	PRK14355	glmU	bifunctional UDP-N-acetylglucosamine diphosphorylase/glucosamine-1-phosphate N-acetyltransferase GlmU. 	459
237686	PRK14356	glmU	bifunctional UDP-N-acetylglucosamine diphosphorylase/glucosamine-1-phosphate N-acetyltransferase GlmU. 	456
237687	PRK14357	glmU	bifunctional UDP-N-acetylglucosamine diphosphorylase/glucosamine-1-phosphate N-acetyltransferase GlmU. 	448
237688	PRK14358	glmU	bifunctional N-acetylglucosamine-1-phosphate uridyltransferase/glucosamine-1-phosphate acetyltransferase; Provisional	481
237689	PRK14359	glmU	bifunctional UDP-N-acetylglucosamine diphosphorylase/glucosamine-1-phosphate N-acetyltransferase GlmU. 	430
184646	PRK14360	glmU	bifunctional UDP-N-acetylglucosamine diphosphorylase/glucosamine-1-phosphate N-acetyltransferase GlmU. 	450
172837	PRK14361	PRK14361	Maf-like protein; Provisional	187
172838	PRK14362	PRK14362	Maf-like protein; Provisional	207
184647	PRK14363	PRK14363	Maf-like protein; Provisional	204
184648	PRK14364	PRK14364	Maf-like protein; Provisional	181
237690	PRK14365	PRK14365	Maf-like protein; Provisional	197
237691	PRK14366	PRK14366	Maf-like protein; Provisional	195
237692	PRK14367	PRK14367	Maf-like protein; Provisional	202
237693	PRK14368	PRK14368	Maf-like protein; Provisional	193
184650	PRK14369	PRK14369	membrane protein insertion efficiency factor YidD. 	119
184651	PRK14370	PRK14370	hypothetical protein; Provisional	120
172847	PRK14371	PRK14371	hypothetical protein; Provisional	81
172848	PRK14372	PRK14372	membrane protein insertion efficiency factor YidD. 	97
172849	PRK14373	PRK14373	hypothetical protein; Provisional	73
237694	PRK14374	PRK14374	membrane protein insertion efficiency factor YidD. 	118
172851	PRK14375	PRK14375	membrane protein insertion efficiency factor YidD. 	70
237695	PRK14376	PRK14376	membrane protein insertion efficiency factor YidD. 	176
172853	PRK14377	PRK14377	membrane protein insertion efficiency factor YidD. 	104
237696	PRK14378	PRK14378	membrane protein insertion efficiency factor YidD. 	103
237697	PRK14379	PRK14379	membrane protein insertion efficiency factor YidD. 	95
184654	PRK14380	PRK14380	hypothetical protein; Provisional	81
172857	PRK14381	PRK14381	membrane protein insertion efficiency factor YidD. 	103
172858	PRK14382	PRK14382	hypothetical protein; Provisional	68
237698	PRK14383	PRK14383	membrane protein insertion efficiency factor YidD. 	84
172860	PRK14384	PRK14384	hypothetical protein; Provisional	56
172861	PRK14385	PRK14385	membrane protein insertion efficiency factor YidD. 	96
172862	PRK14386	PRK14386	membrane protein insertion efficiency factor YidD. 	106
184655	PRK14387	PRK14387	membrane protein insertion efficiency factor YidD. 	84
172864	PRK14388	PRK14388	membrane protein insertion efficiency factor YidD. 	82
184656	PRK14389	PRK14389	membrane protein insertion efficiency factor YidD. 	98
172866	PRK14390	PRK14390	hypothetical protein; Provisional	63
172867	PRK14391	PRK14391	membrane protein insertion efficiency factor YidD. 	84
237699	PRK14392	PRK14392	glycerol-3-phosphate acyltransferase. 	207
172869	PRK14393	PRK14393	glycerol-3-phosphate acyltransferase. 	194
172870	PRK14394	PRK14394	glycerol-3-phosphate acyltransferase. 	195
172871	PRK14395	PRK14395	glycerol-3-phosphate acyltransferase. 	195
184657	PRK14396	PRK14396	glycerol-3-phosphate acyltransferase. 	190
237700	PRK14397	PRK14397	membrane protein; Provisional	222
237701	PRK14398	PRK14398	glycerol-3-phosphate acyltransferase. 	191
237702	PRK14399	PRK14399	membrane protein; Provisional	258
237703	PRK14400	PRK14400	glycerol-3-phosphate acyltransferase. 	201
184658	PRK14401	PRK14401	membrane protein; Provisional	187
237704	PRK14402	PRK14402	glycerol-3-phosphate acyltransferase. 	198
172879	PRK14403	PRK14403	glycerol-3-phosphate acyltransferase. 	196
237705	PRK14404	PRK14404	glycerol-3-phosphate acyltransferase. 	201
237706	PRK14405	PRK14405	membrane protein; Provisional	202
172882	PRK14406	PRK14406	glycerol-3-phosphate acyltransferase. 	199
172883	PRK14407	PRK14407	glycerol-3-phosphate acyltransferase. 	219
172884	PRK14408	PRK14408	glycerol-3-phosphate acyltransferase. 	257
172885	PRK14409	PRK14409	glycerol-3-phosphate acyltransferase. 	205
237707	PRK14410	PRK14410	glycerol-3-phosphate acyltransferase. 	235
184663	PRK14411	PRK14411	membrane protein; Provisional	204
184664	PRK14412	PRK14412	glycerol-3-phosphate acyltransferase. 	198
172889	PRK14413	PRK14413	glycerol-3-phosphate acyltransferase. 	197
184665	PRK14414	PRK14414	glycerol-3-phosphate acyltransferase. 	210
184666	PRK14415	PRK14415	glycerol-3-phosphate acyltransferase. 	216
184667	PRK14416	PRK14416	membrane protein; Provisional	200
184668	PRK14417	PRK14417	membrane protein; Provisional	232
237708	PRK14418	PRK14418	glycerol-3-phosphate acyltransferase. 	236
237709	PRK14419	PRK14419	membrane protein; Provisional	199
237710	PRK14420	PRK14420	acylphosphatase; Provisional	91
237711	PRK14421	PRK14421	acylphosphatase; Provisional	99
237712	PRK14422	PRK14422	acylphosphatase; Provisional	93
237713	PRK14423	PRK14423	acylphosphatase; Provisional	92
184674	PRK14424	PRK14424	acylphosphatase; Provisional	94
172901	PRK14425	PRK14425	acylphosphatase; Provisional	94
184675	PRK14426	PRK14426	acylphosphatase; Provisional	92
172903	PRK14427	PRK14427	acylphosphatase; Provisional	94
172904	PRK14428	PRK14428	acylphosphatase; Provisional	97
184676	PRK14429	PRK14429	acylphosphatase; Provisional	90
172906	PRK14430	PRK14430	acylphosphatase; Provisional	92
184677	PRK14431	PRK14431	acylphosphatase; Provisional	89
184678	PRK14432	PRK14432	acylphosphatase; Provisional	93
184679	PRK14433	PRK14433	acylphosphatase; Provisional	87
184680	PRK14434	PRK14434	acylphosphatase; Provisional	92
184681	PRK14435	PRK14435	acylphosphatase; Provisional	90
172912	PRK14436	PRK14436	acylphosphatase; Provisional	91
172913	PRK14437	PRK14437	acylphosphatase; Provisional	109
172914	PRK14438	PRK14438	acylphosphatase; Provisional	91
237714	PRK14439	PRK14439	acylphosphatase; Provisional	163
172916	PRK14440	PRK14440	acylphosphatase; Provisional	90
172917	PRK14441	PRK14441	acylphosphatase; Provisional	93
172918	PRK14442	PRK14442	acylphosphatase; Provisional	91
172919	PRK14443	PRK14443	acylphosphatase; Provisional	93
172920	PRK14444	PRK14444	acylphosphatase; Provisional	92
172921	PRK14445	PRK14445	acylphosphatase; Provisional	91
172922	PRK14446	PRK14446	acylphosphatase; Provisional	88
172923	PRK14447	PRK14447	acylphosphatase; Provisional	95
172924	PRK14448	PRK14448	acylphosphatase; Provisional	90
184682	PRK14449	PRK14449	acylphosphatase; Provisional	90
184683	PRK14450	PRK14450	acylphosphatase; Provisional	91
237715	PRK14451	PRK14451	acylphosphatase; Provisional	89
237716	PRK14452	PRK14452	acylphosphatase; Provisional	107
184685	PRK14453	PRK14453	chloramphenicol/florfenicol resistance protein; Provisional	347
184686	PRK14454	PRK14454	23S rRNA (adenine(2503)-C(2))-methyltransferase RlmN. 	342
237717	PRK14455	PRK14455	ribosomal RNA large subunit methyltransferase N; Provisional	356
172932	PRK14456	PRK14456	23S rRNA (adenine(2503)-C(2))-methyltransferase RlmN. 	368
184688	PRK14457	PRK14457	23S rRNA (adenine(2503)-C(2))-methyltransferase RlmN. 	345
184689	PRK14459	PRK14459	ribosomal RNA large subunit methyltransferase N; Provisional	373
172935	PRK14460	PRK14460	23S rRNA (adenine(2503)-C(2))-methyltransferase RlmN. 	354
237718	PRK14461	PRK14461	ribosomal RNA large subunit methyltransferase N; Provisional	371
237719	PRK14462	PRK14462	23S rRNA (adenine(2503)-C(2))-methyltransferase RlmN. 	356
237720	PRK14463	PRK14463	23S rRNA (adenine(2503)-C(2))-methyltransferase RlmN. 	349
184691	PRK14464	PRK14464	RNA methyltransferase. 	344
172940	PRK14465	PRK14465	23S rRNA (adenine(2503)-C(2))-methyltransferase RlmN. 	342
237721	PRK14466	PRK14466	23S rRNA (adenine(2503)-C(2))-methyltransferase RlmN. 	345
184693	PRK14467	PRK14467	23S rRNA (adenine(2503)-C(2))-methyltransferase RlmN. 	348
184694	PRK14468	PRK14468	23S rRNA (adenine(2503)-C(2))-methyltransferase RlmN. 	343
172944	PRK14469	PRK14469	23S rRNA (adenine(2503)-C(2))-methyltransferase RlmN. 	343
172945	PRK14470	PRK14470	ribosomal RNA large subunit methyltransferase N; Provisional	336
184695	PRK14471	PRK14471	F0F1 ATP synthase subunit B; Provisional	164
172947	PRK14472	PRK14472	F0F1 ATP synthase subunit B; Provisional	175
172948	PRK14473	PRK14473	F0F1 ATP synthase subunit B; Provisional	164
184696	PRK14474	PRK14474	F0F1 ATP synthase subunit B; Provisional	250
184697	PRK14475	PRK14475	F0F1 ATP synthase subunit B; Provisional	167
237722	PRK14476	PRK14476	nitrogenase molybdenum-cofactor biosynthesis protein NifN; Provisional	455
172952	PRK14477	PRK14477	bifunctional nitrogenase molybdenum-cofactor biosynthesis protein NifE/NifN; Provisional	917
184699	PRK14478	PRK14478	nitrogenase molybdenum-cofactor biosynthesis protein NifE; Provisional	475
237723	PRK14479	PRK14479	dihydroxyacetone kinase; Provisional	568
237724	PRK14481	PRK14481	dihydroxyacetone kinase subunit DhaK; Provisional	331
172956	PRK14483	PRK14483	DhaKLM operon coactivator DhaQ; Provisional	329
184702	PRK14484	PRK14484	phosphotransferase mannose-specific family component IIA; Provisional	124
184703	PRK14485	PRK14485	putative bifunctional cbb3-type cytochrome c oxidase subunit I/II; Provisional	712
184704	PRK14486	PRK14486	putative bifunctional cbb3-type cytochrome c oxidase subunit II/cytochrome c; Provisional	294
237725	PRK14487	PRK14487	cbb3-type cytochrome c oxidase subunit II; Provisional	217
237726	PRK14488	PRK14488	cbb3-type cytochrome c oxidase subunit I; Provisional	473
237727	PRK14489	PRK14489	putative bifunctional molybdopterin-guanine dinucleotide biosynthesis protein MobA/MobB; Provisional	366
237728	PRK14490	PRK14490	putative bifunctional molybdopterin-guanine dinucleotide biosynthesis protein MobB/MobA; Provisional	369
237729	PRK14491	PRK14491	putative bifunctional molybdopterin-guanine dinucleotide biosynthesis protein MobB/MoeA; Provisional	597
237730	PRK14493	PRK14493	putative bifunctional molybdopterin-guanine dinucleotide biosynthesis protein MobB/MoaE; Provisional	274
237731	PRK14494	PRK14494	putative molybdopterin-guanine dinucleotide biosynthesis protein MobB/FeS domain-containing protein; Provisional	229
172967	PRK14495	PRK14495	putative molybdopterin-guanine dinucleotide biosynthesis protein MobB/unknown domain fusion protein; Provisional	452
172968	PRK14497	PRK14497	putative molybdopterin biosynthesis protein MoeA/unknown domain fusion protein; Provisional	546
237732	PRK14498	PRK14498	putative molybdopterin biosynthesis protein MoeA/LysR substrate binding-domain-containing protein; Provisional	633
237733	PRK14499	PRK14499	cyclic pyranopterin monophosphate synthase MoaC/MOSC-domain-containing protein. 	308
237734	PRK14500	PRK14500	putative bifunctional molybdopterin-guanine dinucleotide biosynthesis protein MoaC/MobA; Provisional	346
184712	PRK14501	PRK14501	putative bifunctional trehalose-6-phosphate synthase/HAD hydrolase subfamily IIB; Provisional	726
184713	PRK14502	PRK14502	bifunctional mannosyl-3-phosphoglycerate synthase/mannosyl-3 phosphoglycerate phosphatase; Provisional	694
237735	PRK14503	PRK14503	mannosyl-3-phosphoglycerate synthase; Provisional	393
237736	PRK14504	PRK14504	photosynthetic reaction center subunit M; Provisional	315
172976	PRK14505	PRK14505	bifunctional photosynthetic reaction center subunit L/M; Provisional	643
184716	PRK14506	PRK14506	photosynthetic reaction center subunit L; Provisional	276
237737	PRK14507	PRK14507	malto-oligosyltrehalose synthase. 	1693
237738	PRK14508	PRK14508	4-alpha-glucanotransferase; Provisional	497
237739	PRK14510	PRK14510	bifunctional glycogen debranching protein GlgX/4-alpha-glucanotransferase. 	1221
237740	PRK14511	PRK14511	malto-oligosyltrehalose synthase. 	879
237741	PRK14512	PRK14512	ATP-dependent Clp protease proteolytic subunit; Provisional	197
237742	PRK14513	PRK14513	ATP-dependent Clp protease proteolytic subunit; Provisional	201
184722	PRK14514	PRK14514	ATP-dependent Clp endopeptidase proteolytic subunit ClpP. 	221
237743	PRK14515	PRK14515	aspartate ammonia-lyase; Provisional	479
184724	PRK14520	rpsP	30S ribosomal protein S16; Provisional	155
237744	PRK14521	rpsP	30S ribosomal protein S16; Provisional	186
172988	PRK14522	rpsP	30S ribosomal protein S16; Provisional	116
172989	PRK14523	rpsP	30S ribosomal protein S16; Provisional	137
172990	PRK14524	rpsP	30S ribosomal protein S16; Provisional	94
172991	PRK14525	rpsP	30S ribosomal protein S16; Provisional	88
172992	PRK14526	PRK14526	adenylate kinase; Provisional	211
237745	PRK14527	PRK14527	adenylate kinase; Provisional	191
172994	PRK14528	PRK14528	adenylate kinase; Provisional	186
237746	PRK14529	PRK14529	adenylate kinase; Provisional	223
237747	PRK14530	PRK14530	adenylate kinase; Provisional	215
172997	PRK14531	PRK14531	adenylate kinase; Provisional	183
184729	PRK14532	PRK14532	adenylate kinase; Provisional	188
184730	PRK14533	groES	co-chaperonin GroES; Provisional	91
173000	PRK14534	cysS	cysteinyl-tRNA synthetase; Provisional	481
173001	PRK14535	cysS	cysteinyl-tRNA synthetase; Provisional	699
184731	PRK14536	cysS	cysteinyl-tRNA synthetase; Provisional	490
237748	PRK14537	PRK14537	50S ribosomal protein L20/unknown domain fusion protein; Provisional	230
173004	PRK14538	PRK14538	putative bifunctional signaling protein/50S ribosomal protein L9; Provisional	838
184732	PRK14539	PRK14539	50S ribosomal protein L11/unknown domain fusion protein; Provisional	196
184733	PRK14540	PRK14540	nucleoside diphosphate kinase; Provisional	134
173007	PRK14541	PRK14541	nucleoside diphosphate kinase; Provisional	140
173008	PRK14542	PRK14542	nucleoside diphosphate kinase; Provisional	137
237749	PRK14543	PRK14543	nucleoside diphosphate kinase; Provisional	169
173010	PRK14544	PRK14544	nucleoside diphosphate kinase; Provisional	183
184734	PRK14545	PRK14545	nucleoside diphosphate kinase; Provisional	139
184735	PRK14547	rplD	50S ribosomal protein L4; Provisional	298
237750	PRK14548	PRK14548	50S ribosomal protein L23P; Provisional	84
237751	PRK14549	PRK14549	50S ribosomal protein L29P; Provisional	69
173015	PRK14550	rnhB	ribonuclease HII; Provisional	204
237752	PRK14551	rnhB	ribonuclease HII; Provisional	212
237753	PRK14552	PRK14552	C/D box methylation guide ribonucleoprotein complex aNOP56 subunit; Provisional	414
184740	PRK14553	PRK14553	ribosomal-processing cysteine protease Prp. 	108
237754	PRK14554	PRK14554	tRNA pseudouridine(54/55) synthase Pus10. 	422
237755	PRK14555	PRK14555	RNA-binding protein. 	145
173021	PRK14556	pyrH	UMP kinase. 	249
173022	PRK14557	pyrH	uridylate kinase; Provisional	247
173023	PRK14558	pyrH	uridylate kinase; Provisional	231
237756	PRK14559	PRK14559	serine/threonine phosphatase. 	645
237757	PRK14560	PRK14560	putative RNA-binding protein; Provisional	160
184745	PRK14561	PRK14561	hypothetical protein; Provisional	194
184746	PRK14562	PRK14562	haloacid dehalogenase superfamily protein; Provisional	204
184747	PRK14563	PRK14563	ribosome modulation factor; Provisional	55
237758	PRK14565	PRK14565	triosephosphate isomerase; Provisional	237
184749	PRK14566	PRK14566	triosephosphate isomerase; Provisional	260
173031	PRK14567	PRK14567	triosephosphate isomerase; Provisional	253
184750	PRK14568	vanB	D-alanine--D-lactate ligase; Provisional	343
173033	PRK14569	PRK14569	D-alanyl-alanine synthetase A; Provisional	296
173034	PRK14570	PRK14570	D-alanyl-alanine synthetase A; Provisional	364
184751	PRK14571	PRK14571	D-alanyl-alanine synthetase A; Provisional	299
173036	PRK14572	PRK14572	D-alanyl-alanine synthetase A; Provisional	347
184752	PRK14573	PRK14573	bifunctional UDP-N-acetylmuramate--L-alanine ligase/D-alanine--D-alanine ligase. 	809
173038	PRK14574	hmsH	poly-beta-1,6 N-acetyl-D-glucosamine export porin PgaA. 	822
173039	PRK14575	PRK14575	putative peptidase; Provisional	406
173040	PRK14576	PRK14576	putative endopeptidase; Provisional	405
173042	PRK14578	PRK14578	elongation factor P; Provisional	187
184753	PRK14581	hmsF	outer membrane N-deacetylase; Provisional	672
184754	PRK14582	pgaB	poly-beta-1,6-N-acetyl-D-glucosamine N-deacetylase PgaB. 	671
184755	PRK14583	hmsR	poly-beta-1,6 N-acetyl-D-glucosamine synthase. 	444
184756	PRK14584	hmsS	hemin storage system protein; Provisional	153
173049	PRK14585	pgaD	putative PGA biosynthesis protein; Provisional	137
173050	PRK14586	PRK14586	tRNA pseudouridine(38-40) synthase TruA. 	245
173051	PRK14587	PRK14587	tRNA pseudouridine synthase ACD; Provisional	256
173052	PRK14588	PRK14588	tRNA pseudouridine(38-40) synthase TruA. 	272
237759	PRK14589	PRK14589	tRNA pseudouridine(38-40) synthase TruA. 	265
173054	PRK14590	rimM	16S rRNA-processing protein RimM; Provisional	171
173055	PRK14591	rimM	16S rRNA-processing protein RimM; Provisional	169
173056	PRK14592	rimM	16S rRNA-processing protein RimM; Provisional	165
237760	PRK14593	rimM	ribosome maturation factor RimM. 	184
173058	PRK14594	rimM	16S rRNA-processing protein RimM; Provisional	166
184757	PRK14595	PRK14595	peptide deformylase; Provisional	162
184758	PRK14596	PRK14596	peptide deformylase; Provisional	199
237761	PRK14597	PRK14597	peptide deformylase; Provisional	166
237762	PRK14598	PRK14598	peptide deformylase; Provisional	187
173063	PRK14599	trmD	tRNA (guanine-N(1)-)-methyltransferase/unknown domain fusion protein; Provisional	222
173064	PRK14600	ruvA	Holliday junction branch migration protein RuvA. 	186
173065	PRK14601	ruvA	Holliday junction branch migration protein RuvA. 	183
173066	PRK14602	ruvA	Holliday junction branch migration protein RuvA. 	203
237763	PRK14603	ruvA	Holliday junction branch migration protein RuvA. 	197
184760	PRK14604	ruvA	Holliday junction branch migration protein RuvA. 	195
184761	PRK14605	ruvA	Holliday junction branch migration protein RuvA. 	194
184762	PRK14606	ruvA	Holliday junction branch migration protein RuvA. 	188
237764	PRK14607	PRK14607	bifunctional anthranilate synthase component II/anthranilate phosphoribosyltransferase. 	534
237765	PRK14608	PRK14608	4-diphosphocytidyl-2-C-methyl-D-erythritol kinase; Provisional	290
237766	PRK14609	PRK14609	4-diphosphocytidyl-2-C-methyl-D-erythritol kinase; Provisional	269
184766	PRK14610	PRK14610	4-(cytidine 5'-diphospho)-2-C-methyl-D-erythritol kinase. 	283
184767	PRK14611	PRK14611	4-(cytidine 5'-diphospho)-2-C-methyl-D-erythritol kinase. 	275
237767	PRK14612	PRK14612	4-(cytidine 5'-diphospho)-2-C-methyl-D-erythritol kinase. 	276
173077	PRK14613	PRK14613	4-(cytidine 5'-diphospho)-2-C-methyl-D-erythritol kinase. 	297
173078	PRK14614	PRK14614	4-diphosphocytidyl-2-C-methyl-D-erythritol kinase; Provisional	280
237768	PRK14615	PRK14615	4-(cytidine 5'-diphospho)-2-C-methyl-D-erythritol kinase. 	296
237769	PRK14616	PRK14616	4-(cytidine 5'-diphospho)-2-C-methyl-D-erythritol kinase. 	287
237770	PRK14618	PRK14618	NAD(P)H-dependent glycerol-3-phosphate dehydrogenase; Provisional	328
237771	PRK14619	PRK14619	NAD(P)H-dependent glycerol-3-phosphate dehydrogenase; Provisional	308
173083	PRK14620	PRK14620	NAD(P)H-dependent glycerol-3-phosphate dehydrogenase; Provisional	326
173084	PRK14621	PRK14621	YbaB/EbfC family nucleoid-associated protein. 	111
173085	PRK14622	PRK14622	YbaB/EbfC family nucleoid-associated protein. 	103
184771	PRK14623	PRK14623	YbaB/EbfC family nucleoid-associated protein. 	106
173087	PRK14624	PRK14624	YbaB/EbfC family nucleoid-associated protein. 	115
184772	PRK14625	PRK14625	hypothetical protein; Provisional	109
173089	PRK14626	PRK14626	YbaB/EbfC family nucleoid-associated protein. 	110
173090	PRK14627	PRK14627	YbaB/EbfC family nucleoid-associated protein. 	100
173091	PRK14628	PRK14628	YbaB/EbfC family nucleoid-associated protein. 	118
173092	PRK14629	PRK14629	YbaB/EbfC family nucleoid-associated protein. 	99
173093	PRK14630	PRK14630	ribosome maturation factor RimP. 	143
237772	PRK14631	PRK14631	ribosome maturation factor RimP. 	174
173095	PRK14632	PRK14632	ribosome maturation factor RimP. 	172
173096	PRK14633	PRK14633	ribosome maturation factor RimP. 	150
173097	PRK14634	PRK14634	ribosome maturation factor RimP. 	155
184774	PRK14635	PRK14635	ribosome maturation factor RimP. 	162
237773	PRK14636	PRK14636	ribosome maturation protein RimP. 	176
237774	PRK14637	PRK14637	ribosome maturation factor RimP. 	151
184777	PRK14638	PRK14638	ribosome maturation factor RimP. 	150
173102	PRK14639	PRK14639	ribosome maturation factor RimP. 	140
173103	PRK14640	PRK14640	hypothetical protein; Provisional	152
173104	PRK14641	PRK14641	ribosome maturation factor RimP. 	173
237775	PRK14642	PRK14642	ribosome maturation factor RimP. 	197
173106	PRK14643	PRK14643	ribosome maturation factor RimP. 	164
184779	PRK14644	PRK14644	hypothetical protein; Provisional	136
184780	PRK14645	PRK14645	ribosome maturation factor RimP. 	154
173109	PRK14646	PRK14646	ribosome maturation factor RimP. 	155
173110	PRK14647	PRK14647	ribosome maturation factor RimP. 	159
173111	PRK14648	PRK14648	UDP-N-acetylmuramate dehydrogenase. 	354
173112	PRK14649	PRK14649	UDP-N-acetylmuramate dehydrogenase. 	295
173113	PRK14650	PRK14650	UDP-N-acetylmuramate dehydrogenase. 	302
237776	PRK14651	PRK14651	UDP-N-acetylmuramate dehydrogenase. 	273
237777	PRK14652	PRK14652	UDP-N-acetylmuramate dehydrogenase. 	302
237778	PRK14653	PRK14653	UDP-N-acetylmuramate dehydrogenase. 	297
173117	PRK14654	mraY	phospho-N-acetylmuramoyl-pentapeptide-transferase; Provisional	302
173118	PRK14655	mraY	phospho-N-acetylmuramoyl-pentapeptide-transferase; Provisional	304
237779	PRK14656	acpS	holo-[acyl-carrier-protein] synthase. 	126
173120	PRK14657	acpS	holo-[acyl-carrier-protein] synthase. 	123
173121	PRK14658	acpS	holo-ACP synthase. 	115
237780	PRK14659	acpS	holo-[acyl-carrier-protein] synthase. 	122
173123	PRK14660	acpS	holo-[acyl-carrier-protein] synthase. 	125
184782	PRK14661	acpS	holo-[acyl-carrier-protein] synthase. 	169
184783	PRK14662	acpS	4'-phosphopantetheinyl transferase; Provisional	120
237781	PRK14663	acpS	holo-[acyl-carrier-protein] synthase. 	116
173127	PRK14664	PRK14664	tRNA-specific 2-thiouridylase MnmA; Provisional	362
173128	PRK14665	mnmA	tRNA-specific 2-thiouridylase MnmA; Provisional	360
237782	PRK14666	uvrC	excinuclease ABC subunit C; Provisional	694
237783	PRK14667	uvrC	excinuclease ABC subunit C; Provisional	567
184785	PRK14668	uvrC	excinuclease ABC subunit C; Provisional	577
237784	PRK14669	uvrC	excinuclease ABC subunit C; Provisional	624
173133	PRK14670	uvrC	excinuclease ABC subunit C; Provisional	574
237785	PRK14671	uvrC	excinuclease ABC subunit C; Provisional	621
173135	PRK14672	uvrC	excinuclease ABC subunit C; Provisional	691
237786	PRK14673	PRK14673	hypothetical protein; Provisional	137
184788	PRK14674	PRK14674	hypothetical protein; Provisional	133
173138	PRK14675	PRK14675	hypothetical protein; Provisional	125
173139	PRK14676	PRK14676	hypothetical protein; Provisional	117
184789	PRK14677	PRK14677	hypothetical protein; Provisional	107
173141	PRK14678	PRK14678	hypothetical protein; Provisional	120
173142	PRK14679	PRK14679	hypothetical protein; Provisional	128
173143	PRK14680	PRK14680	hypothetical protein; Provisional	134
237787	PRK14681	PRK14681	hypothetical protein; Provisional	158
173145	PRK14682	PRK14682	hypothetical protein; Provisional	117
173146	PRK14683	PRK14683	hypothetical protein; Provisional	122
173147	PRK14684	PRK14684	hypothetical protein; Provisional	120
173148	PRK14685	PRK14685	hypothetical protein; Provisional	177
184791	PRK14686	PRK14686	hypothetical protein; Provisional	119
237788	PRK14687	PRK14687	hypothetical protein; Provisional	173
184792	PRK14688	PRK14688	hypothetical protein; Provisional	121
173152	PRK14689	PRK14689	hypothetical protein; Provisional	124
237789	PRK14690	PRK14690	molybdopterin biosynthesis protein MoeA; Provisional	419
173154	PRK14691	PRK14691	3-oxoacyl-(acyl carrier protein) synthase II; Provisional	342
173155	PRK14692	PRK14692	flagellar hook-associated protein FlgL. 	749
173156	PRK14693	PRK14693	hypothetical protein; Provisional	552
237790	PRK14694	PRK14694	putative mercuric reductase; Provisional	468
173158	PRK14695	PRK14695	serine/threonine transporter SstT; Provisional	319
184793	PRK14696	tynA	primary-amine oxidase. 	721
184794	PRK14697	PRK14697	bifunctional 5'-methylthioadenosine/S-adenosylhomocysteine nucleosidase/phosphatase; Provisional	233
184795	PRK14698	PRK14698	V-type ATP synthase subunit A; Provisional	1017
173162	PRK14699	PRK14699	replication factor A; Provisional	484
173163	PRK14700	PRK14700	recombination factor protein RarA; Provisional	300
237791	PRK14701	PRK14701	reverse gyrase; Provisional	1638
237792	PRK14702	PRK14702	insertion element IS2 transposase InsD; Provisional	262
237793	PRK14703	PRK14703	glutaminyl-tRNA synthetase/YqeY domain fusion protein; Provisional	771
184798	PRK14704	PRK14704	anaerobic ribonucleoside triphosphate reductase; Provisional	618
237794	PRK14705	PRK14705	glycogen branching enzyme; Provisional	1224
237795	PRK14706	PRK14706	glycogen branching enzyme; Provisional	639
173170	PRK14707	PRK14707	hypothetical protein; Provisional	2710
173171	PRK14708	PRK14708	flagellin; Provisional	888
173172	PRK14709	PRK14709	hypothetical protein; Provisional	469
173173	PRK14710	PRK14710	hypothetical protein; Provisional	86
173174	PRK14711	ureE	urease accessory protein UreE; Provisional	191
237796	PRK14712	PRK14712	conjugal transfer nickase/helicase TraI; Provisional	1623
237797	PRK14713	PRK14713	bifunctional hydroxymethylpyrimidine kinase/phosphomethylpyrimidine kinase. 	530
237798	PRK14714	PRK14714	DNA-directed DNA polymerase II large subunit. 	1337
237799	PRK14715	PRK14715	DNA-directed DNA polymerase II large subunit. 	1627
237800	PRK14716	PRK14716	glycosyl transferase family protein. 	504
184803	PRK14717	PRK14717	putative glycine/sarcosine/betaine reductase complex protein A; Provisional	107
173181	PRK14718	PRK14718	ribonuclease III; Provisional	467
237801	PRK14719	PRK14719	bifunctional RNAse/5-amino-6-(5-phosphoribosylamino)uracil reductase; Provisional	360
184804	PRK14720	PRK14720	transcription elongation factor GreA. 	906
173184	PRK14721	flhF	flagellar biosynthesis regulator FlhF; Provisional	420
173185	PRK14722	flhF	flagellar biosynthesis regulator FlhF; Provisional	374
237802	PRK14723	flhF	flagellar biosynthesis regulator FlhF; Provisional	767
237803	PRK14724	PRK14724	DNA topoisomerase III; Provisional	987
237804	PRK14725	PRK14725	pyruvate kinase; Provisional	608
237805	PRK14726	PRK14726	protein translocase subunit SecDF. 	855
237806	PRK14727	PRK14727	putative mercuric reductase; Provisional	479
173191	PRK14729	miaA	tRNA delta(2)-isopentenylpyrophosphate transferase; Provisional	300
184807	PRK14730	coaE	dephospho-CoA kinase; Provisional	195
173193	PRK14731	coaE	dephospho-CoA kinase; Provisional	208
237807	PRK14732	coaE	dephospho-CoA kinase; Provisional	196
173195	PRK14733	coaE	dephospho-CoA kinase; Provisional	204
237808	PRK14734	coaE	dephospho-CoA kinase; Provisional	200
173197	PRK14735	atpC	F0F1 ATP synthase subunit epsilon; Provisional	139
173198	PRK14736	atpC	F0F1 ATP synthase subunit epsilon; Provisional	133
173199	PRK14737	gmk	guanylate kinase; Provisional	186
237809	PRK14738	gmk	guanylate kinase; Provisional	206
173201	PRK14740	kdbF	K(+)-transporting ATPase subunit F. 	29
173202	PRK14741	spoVM	stage V sporulation protein SpoVM. 	26
173203	PRK14742	thrL	thr operon leader peptide; Provisional	28
173204	PRK14743	thrL	thr operon leader peptide; Provisional	22
173205	PRK14744	PRK14744	leu operon leader peptide; Provisional	28
173206	PRK14745	PRK14745	RepA leader peptide Tap; Provisional	25
173207	PRK14746	PRK14746	RepA leader peptide Tap; Provisional	24
184810	PRK14747	PRK14747	cytochrome b6-f complex subunit PetN; Provisional	29
173209	PRK14748	kdpF	K(+)-transporting ATPase subunit F. 	29
173210	PRK14749	PRK14749	cytochrome bd-II oxidase subunit CbdX. 	30
173211	PRK14750	kdpF	K(+)-transporting ATPase subunit F. 	29
173212	PRK14751	PRK14751	tetracycline resistance determinant leader peptide; Provisional	28
173213	PRK14752	PRK14752	delta-lysin family phenol-soluble modulin. 	44
184811	PRK14753	PRK14753	30S ribosomal protein Thx; Provisional	27
173215	PRK14754	PRK14754	toxic peptide TisB; Provisional	29
173216	PRK14755	PRK14755	transcriptional regulatory protein PufK; Provisional	20
341227	PRK14756	small_mem_YnhF	YnhF family membrane protein; Validated. 	29
173218	PRK14757	PRK14757	putative protamine-like protein; Provisional	29
173219	PRK14758	PRK14758	hypothetical protein; Provisional	27
173220	PRK14759	PRK14759	K(+)-transporting ATPase subunit F. 	29
173221	PRK14760	PRK14760	protein YoaJ. 	24
173222	PRK14761	PRK14761	fur leader peptide. 	28
173223	PRK14762	PRK14762	protein YohP. 	27
184813	PRK14763	PRK14763	pyrroloquinoline quinone precursor peptide PqqA. 	23
237810	PRK14764	PRK14764	lipoprotein signal peptidase; Provisional	209
173227	PRK14766	PRK14766	lipoprotein signal peptidase; Provisional	201
237811	PRK14767	PRK14767	lipoprotein signal peptidase; Provisional	174
173229	PRK14768	PRK14768	lipoprotein signal peptidase; Provisional	148
173230	PRK14769	PRK14769	lipoprotein signal peptidase; Provisional	156
173231	PRK14770	PRK14770	lipoprotein signal peptidase; Provisional	167
184816	PRK14771	PRK14771	lipoprotein signal peptidase; Provisional	165
237812	PRK14772	PRK14772	lipoprotein signal peptidase; Provisional	190
173234	PRK14773	PRK14773	lipoprotein signal peptidase; Provisional	192
173235	PRK14774	PRK14774	lipoprotein signal peptidase; Provisional	185
237813	PRK14775	PRK14775	lipoprotein signal peptidase; Provisional	170
173237	PRK14776	PRK14776	lipoprotein signal peptidase; Provisional	170
237814	PRK14777	PRK14777	lipoprotein signal peptidase; Provisional	184
173239	PRK14778	PRK14778	lipoprotein signal peptidase; Provisional	186
184818	PRK14779	PRK14779	lipoprotein signal peptidase; Provisional	159
173241	PRK14780	PRK14780	lipoprotein signal peptidase; Provisional	263
237815	PRK14781	PRK14781	lipoprotein signal peptidase; Provisional	153
173243	PRK14782	PRK14782	signal peptidase II. 	157
173244	PRK14783	PRK14783	lipoprotein signal peptidase; Provisional	182
184819	PRK14784	PRK14784	lipoprotein signal peptidase; Provisional	160
184820	PRK14785	PRK14785	lipoprotein signal peptidase; Provisional	171
173247	PRK14786	PRK14786	lipoprotein signal peptidase; Provisional	154
173248	PRK14787	PRK14787	lipoprotein signal peptidase; Provisional	159
237816	PRK14788	PRK14788	lipoprotein signal peptidase; Provisional	200
173250	PRK14789	PRK14789	lipoprotein signal peptidase; Provisional	191
173251	PRK14790	PRK14790	lipoprotein signal peptidase; Provisional	169
237817	PRK14791	PRK14791	lipoprotein signal peptidase; Provisional	146
237818	PRK14792	PRK14792	signal peptidase II. 	159
184823	PRK14793	PRK14793	lipoprotein signal peptidase; Provisional	150
173255	PRK14794	PRK14794	lipoprotein signal peptidase; Provisional	136
173256	PRK14795	PRK14795	lipoprotein signal peptidase; Provisional	158
184824	PRK14796	PRK14796	lipoprotein signal peptidase; Provisional	161
184825	PRK14797	PRK14797	lipoprotein signal peptidase; Provisional	150
184826	PRK14799	thrS	threonyl-tRNA synthetase; Provisional	545
173265	PRK14804	PRK14804	ornithine carbamoyltransferase; Provisional	311
237819	PRK14805	PRK14805	ornithine carbamoyltransferase; Provisional	302
237820	PRK14806	PRK14806	bifunctional cyclohexadienyl dehydrogenase/ 3-phosphoshikimate 1-carboxyvinyltransferase; Provisional	735
184829	PRK14807	PRK14807	histidinol-phosphate transaminase. 	351
173269	PRK14808	PRK14808	histidinol-phosphate transaminase. 	335
184830	PRK14809	PRK14809	histidinol-phosphate transaminase. 	357
173271	PRK14810	PRK14810	formamidopyrimidine-DNA glycosylase; Provisional	272
184831	PRK14811	PRK14811	formamidopyrimidine-DNA glycosylase; Provisional	269
173273	PRK14812	PRK14812	hypothetical protein; Provisional	119
173274	PRK14813	PRK14813	NADH dehydrogenase subunit B; Provisional	189
173275	PRK14814	PRK14814	NADH dehydrogenase subunit B; Provisional	186
237821	PRK14815	PRK14815	NADH dehydrogenase subunit B; Provisional	183
173277	PRK14816	PRK14816	NADH-quinone oxidoreductase subunit B. 	182
173278	PRK14817	PRK14817	NADH-quinone oxidoreductase subunit B. 	181
173279	PRK14818	PRK14818	NADH dehydrogenase subunit B; Provisional	173
237822	PRK14819	PRK14819	NADH-quinone oxidoreductase subunit B. 	264
184833	PRK14820	PRK14820	NADH-quinone oxidoreductase subunit B. 	180
184834	PRK14821	PRK14821	XTP/dITP diphosphatase. 	184
184835	PRK14822	PRK14822	XTP/dITP diphosphatase. 	200
237823	PRK14823	PRK14823	putative deoxyribonucleoside-triphosphatase; Provisional	191
237824	PRK14824	PRK14824	putative deoxyribonucleotide triphosphate pyrophosphatase; Provisional	201
173286	PRK14825	PRK14825	putative deoxyribonucleotide triphosphate pyrophosphatase; Provisional	199
173287	PRK14826	PRK14826	putative deoxyribonucleotide triphosphate pyrophosphatase; Provisional	222
173288	PRK14827	PRK14827	undecaprenyl pyrophosphate synthase; Provisional	296
237825	PRK14828	PRK14828	undecaprenyl pyrophosphate synthase; Provisional	256
237826	PRK14829	PRK14829	undecaprenyl pyrophosphate synthase; Provisional	243
184840	PRK14830	PRK14830	undecaprenyl pyrophosphate synthase; Provisional	251
184841	PRK14831	PRK14831	undecaprenyl pyrophosphate synthase; Provisional	249
237827	PRK14832	PRK14832	undecaprenyl pyrophosphate synthase; Provisional	253
237828	PRK14833	PRK14833	di-trans,poly-cis-decaprenylcistransferase. 	233
237829	PRK14834	PRK14834	isoprenyl transferase. 	249
237830	PRK14835	PRK14835	isoprenyl transferase. 	275
237831	PRK14836	PRK14836	undecaprenyl pyrophosphate synthase; Provisional	253
173298	PRK14837	PRK14837	undecaprenyl pyrophosphate synthase; Provisional	230
184846	PRK14838	PRK14838	isoprenyl transferase. 	242
237832	PRK14839	PRK14839	di-trans,poly-cis-decaprenylcistransferase. 	239
173301	PRK14840	PRK14840	undecaprenyl pyrophosphate synthase; Provisional	250
173302	PRK14841	PRK14841	undecaprenyl pyrophosphate synthase; Provisional	233
173303	PRK14842	PRK14842	undecaprenyl pyrophosphate synthase; Provisional	241
184847	PRK14843	PRK14843	dihydrolipoamide acetyltransferase; Provisional	347
173305	PRK14844	PRK14844	DNA-directed RNA polymerase subunit beta/beta'. 	2836
237833	PRK14845	PRK14845	translation initiation factor IF-2; Provisional	1049
237834	PRK14846	truB	tRNA pseudouridine synthase B; Provisional	345
184849	PRK14847	PRK14847	2-isopropylmalate synthase. 	333
184850	PRK14848	PRK14848	type III secretion system effector deubiquitinase SseL. 	317
184851	PRK14849	PRK14849	autotransporter barrel domain-containing lipoprotein. 	1806
237835	PRK14850	PRK14850	penicillin-binding protein 1b; Provisional	764
184853	PRK14851	PRK14851	hypothetical protein; Provisional	679
184854	PRK14852	PRK14852	hypothetical protein; Provisional	989
184855	PRK14853	nhaA	pH-dependent sodium/proton antiporter; Provisional	423
184856	PRK14854	nhaA	pH-dependent sodium/proton antiporter; Provisional	383
237836	PRK14855	nhaA	pH-dependent sodium/proton antiporter; Provisional	423
184858	PRK14856	nhaA	sodium/proton antiporter NhaA. 	438
184859	PRK14857	tatA	TatA/E family twin arginine-targeting protein translocase. 	90
184860	PRK14858	tatA	twin arginine translocase protein A; Provisional	108
184861	PRK14859	tatA	twin arginine translocase protein A; Provisional	63
184862	PRK14860	tatA	twin arginine translocase protein A; Provisional	64
237837	PRK14861	tatA	twin arginine translocase protein A; Provisional	61
237838	PRK14862	rimO	30S ribosomal protein S12 methylthiotransferase RimO. 	440
184865	PRK14863	PRK14863	bifunctional regulator KidO; Provisional	292
184866	PRK14864	PRK14864	biofilm peroxide resistance protein BsmA. 	104
237839	PRK14865	rnpA	ribonuclease P protein component. 	116
237840	PRK14866	PRK14866	hypothetical protein; Provisional	451
237841	PRK14867	PRK14867	DNA topoisomerase VI subunit B; Provisional	659
237842	PRK14868	PRK14868	DNA topoisomerase VI subunit B; Provisional	795
237843	PRK14869	PRK14869	putative manganese-dependent inorganic diphosphatase. 	546
184872	PRK14872	PRK14872	rod shape-determining protein MreC; Provisional	337
237844	PRK14873	PRK14873	primosomal protein N'. 	665
237845	PRK14874	PRK14874	aspartate-semialdehyde dehydrogenase; Provisional	334
184875	PRK14875	PRK14875	acetoin dehydrogenase E2 subunit dihydrolipoyllysine-residue acetyltransferase; Provisional	371
237846	PRK14876	PRK14876	conjugal transfer mating pair stabilization protein TraN; Provisional	928
184877	PRK14877	PRK14877	conjugal transfer mating pair stabilization protein TraN; Provisional	1062
184878	PRK14878	PRK14878	UGMP family protein; Provisional	323
237847	PRK14879	PRK14879	Kae1-associated kinase Bud32. 	211
237848	PRK14886	PRK14886	KEOPS complex subunit Cgi121. 	167
237849	PRK14887	PRK14887	KEOPS complex Pcc1-like subunit; Provisional	84
237850	PRK14888	PRK14888	KEOPS complex Pcc1-like subunit; Provisional	59
184883	PRK14889	PRK14889	VKOR family protein; Provisional	143
184884	PRK14890	PRK14890	putative Zn-ribbon RNA-binding protein; Provisional	59
184885	PRK14891	PRK14891	50S ribosomal protein L24e/unknown domain fusion protein; Provisional	131
184886	PRK14892	PRK14892	putative transcription elongation factor Elf1; Provisional	99
184887	PRK14893	PRK14893	V-type ATP synthase subunit K; Provisional	161
237851	PRK14894	PRK14894	glycyl-tRNA synthetase; Provisional	539
184889	PRK14895	gltX	glutamyl-tRNA synthetase; Provisional	513
237852	PRK14896	ksgA	16S ribosomal RNA methyltransferase A. 	258
237853	PRK14897	PRK14897	unknown domain/DNA-directed RNA polymerase subunit A'' fusion protein; Provisional	509
237854	PRK14898	PRK14898	DNA-directed RNA polymerase subunit A''; Provisional	858
237855	PRK14900	valS	valyl-tRNA synthetase; Provisional	1052
237856	PRK14901	PRK14901	16S rRNA methyltransferase B; Provisional	434
237857	PRK14902	PRK14902	16S rRNA (cytosine(967)-C(5))-methyltransferase RsmB. 	444
184896	PRK14903	PRK14903	16S rRNA methyltransferase B; Provisional	431
237858	PRK14904	PRK14904	16S rRNA methyltransferase B; Provisional	445
184898	PRK14905	PRK14905	triosephosphate isomerase/PTS system glucose/sucrose-specific transporter subunit IIB; Provisional	355
184899	PRK14906	PRK14906	DNA-directed RNA polymerase subunit beta'. 	1460
184900	PRK14907	rplD	50S ribosomal protein L4; Provisional	295
237859	PRK14908	PRK14908	glycine--tRNA ligase. 	1000
184902	PRK14938	PRK14938	Ser-tRNA(Thr) hydrolase; Provisional	387
237860	PRK14939	gyrB	DNA gyrase subunit B; Provisional	756
184904	PRK14940	PRK14940	DNA polymerase III subunit beta; Provisional	367
184905	PRK14941	PRK14941	DNA polymerase III subunit beta; Provisional	374
184906	PRK14942	PRK14942	DNA polymerase III subunit beta; Provisional	373
184907	PRK14943	PRK14943	DNA polymerase III subunit beta; Provisional	374
184908	PRK14944	PRK14944	DNA polymerase III subunit beta; Provisional	375
184909	PRK14945	PRK14945	DNA polymerase III subunit beta; Provisional	362
184910	PRK14946	PRK14946	DNA polymerase III subunit beta; Provisional	366
237861	PRK14947	PRK14947	DNA polymerase III subunit beta; Provisional	384
237862	PRK14948	PRK14948	DNA polymerase III subunit gamma/tau. 	620
237863	PRK14949	PRK14949	DNA polymerase III subunits gamma and tau; Provisional	944
237864	PRK14950	PRK14950	DNA polymerase III subunits gamma and tau; Provisional	585
237865	PRK14951	PRK14951	DNA polymerase III subunits gamma and tau; Provisional	618
237866	PRK14952	PRK14952	DNA polymerase III subunits gamma/tau. 	584
237867	PRK14953	PRK14953	DNA polymerase III subunits gamma and tau; Provisional	486
184918	PRK14954	PRK14954	DNA polymerase III subunits gamma and tau; Provisional	620
184919	PRK14955	PRK14955	DNA polymerase III subunits gamma and tau; Provisional	397
184920	PRK14956	PRK14956	DNA polymerase III subunits gamma and tau; Provisional	484
184921	PRK14957	PRK14957	DNA polymerase III subunits gamma and tau; Provisional	546
184922	PRK14958	PRK14958	DNA polymerase III subunits gamma and tau; Provisional	509
184923	PRK14959	PRK14959	DNA polymerase III subunits gamma and tau; Provisional	624
237868	PRK14960	PRK14960	DNA polymerase III subunit gamma/tau. 	702
184925	PRK14961	PRK14961	DNA polymerase III subunits gamma and tau; Provisional	363
237869	PRK14962	PRK14962	DNA polymerase III subunits gamma and tau; Provisional	472
184927	PRK14963	PRK14963	DNA polymerase III subunits gamma and tau; Provisional	504
237870	PRK14964	PRK14964	DNA polymerase III subunits gamma and tau; Provisional	491
237871	PRK14965	PRK14965	DNA polymerase III subunits gamma and tau; Provisional	576
184930	PRK14966	PRK14966	unknown domain/N5-glutamine S-adenosyl-L-methionine-dependent methyltransferase fusion protein; Provisional	423
184931	PRK14967	PRK14967	putative methyltransferase; Provisional	223
237872	PRK14968	PRK14968	putative methyltransferase; Provisional	188
237873	PRK14969	PRK14969	DNA polymerase III subunits gamma and tau; Provisional	527
184934	PRK14970	PRK14970	DNA polymerase III subunits gamma and tau; Provisional	367
237874	PRK14971	PRK14971	DNA polymerase III subunit gamma/tau. 	614
184936	PRK14973	PRK14973	DNA topoisomerase I; Provisional	936
237875	PRK14974	PRK14974	signal recognition particle-docking protein FtsY. 	336
237876	PRK14975	PRK14975	bifunctional 3'-5' exonuclease/DNA polymerase; Provisional	553
237877	PRK14976	PRK14976	5'-3' exonuclease; Provisional	281
184940	PRK14977	PRK14977	bifunctional DNA-directed RNA polymerase A'/A'' subunit; Provisional	1321
237878	PRK14979	PRK14979	DNA-directed RNA polymerase subunit D; Provisional	195
184942	PRK14980	PRK14980	DNA-directed RNA polymerase subunit G; Provisional	127
184943	PRK14981	PRK14981	RNA polymerase Rpb4 family protein. 	112
184944	PRK14982	PRK14982	acyl-ACP reductase; Provisional	340
237879	PRK14983	PRK14983	aldehyde decarbonylase; Provisional	231
237880	PRK14984	PRK14984	gluconate transporter. 	438
237881	PRK14985	PRK14985	maltodextrin phosphorylase; Provisional	798
184948	PRK14986	PRK14986	glycogen phosphorylase; Provisional	815
184949	PRK14987	PRK14987	HTH-type transcriptional regulator GntR. 	331
237882	PRK14988	PRK14988	GMP/IMP nucleotidase; Provisional	224
184951	PRK14989	PRK14989	nitrite reductase subunit NirD; Provisional	847
184952	PRK14990	PRK14990	anaerobic dimethyl sulfoxide reductase subunit A; Provisional	814
237883	PRK14991	PRK14991	tetrathionate reductase subunit TtrA. 	1031
184954	PRK14992	PRK14992	tetrathionate reductase subunit TtrC. 	335
184955	PRK14993	PRK14993	tetrathionate reductase subunit TtrB. 	244
184956	PRK14994	PRK14994	SAM-dependent 16S ribosomal RNA C1402 ribose 2'-O-methyltransferase; Provisional	287
184957	PRK14995	PRK14995	SmvA family efflux MFS transporter. 	495
184958	PRK14996	PRK14996	TetR family transcriptional regulator; Provisional	192
184959	PRK14997	PRK14997	LysR family transcriptional regulator; Provisional	301
184960	PRK14998	PRK14998	cold shock-like protein CspD; Provisional	73
184961	PRK14999	PRK14999	histidine utilization repressor; Provisional	241
184962	PRK15000	PRK15000	peroxiredoxin C. 	200
184963	PRK15001	PRK15001	23S rRNA (guanine(1835)-N(2))-methyltransferase RlmG. 	378
184964	PRK15002	PRK15002	redox-sensitive transcriptional activator SoxR. 	154
184965	PRK15003	PRK15003	cytochrome d ubiquinol oxidase subunit II. 	379
184966	PRK15004	PRK15004	adenosylcobalamin/alpha-ribazole phosphatase. 	199
184967	PRK15005	PRK15005	universal stress protein UspF. 	144
184968	PRK15006	PRK15006	thiosulfate reductase cytochrome B subunit; Provisional	261
184969	PRK15007	PRK15007	arginine ABC transporter substrate-binding protein. 	243
184970	PRK15008	PRK15008	HTH-type transcriptional regulator RutR; Provisional	212
184971	PRK15009	PRK15009	GDP-mannose pyrophosphatase NudK; Provisional	191
184972	PRK15010	PRK15010	lysine/arginine/ornithine ABC transporter substrate-binding protein ArgT. 	260
184973	PRK15011	PRK15011	sugar efflux transporter SetB. 	393
184974	PRK15012	PRK15012	menaquinone-specific isochorismate synthase; Provisional	431
184975	PRK15014	PRK15014	6-phospho-beta-glucosidase BglA; Provisional	477
184976	PRK15015	PRK15015	carbon starvation protein CstA. 	701
184977	PRK15016	PRK15016	isochorismate synthase EntC; Provisional	391
184978	PRK15017	PRK15017	cytochrome o ubiquinol oxidase subunit I; Provisional	663
184979	PRK15018	PRK15018	1-acyl-sn-glycerol-3-phosphate acyltransferase; Provisional	245
184980	PRK15019	PRK15019	cysteine desulfurase sulfur acceptor subunit CsdE. 	147
237884	PRK15020	PRK15020	ethanolamine utilization cob(I)yrinic acid a,c-diamide adenosyltransferase EutT. 	267
184982	PRK15021	PRK15021	microcin C ABC transporter permease; Provisional	341
184983	PRK15022	PRK15022	non-heme ferritin-like protein. 	167
184984	PRK15023	PRK15023	L-serine deaminase; Provisional	454
184985	PRK15025	PRK15025	ureidoglycolate dehydrogenase; Provisional	349
184986	PRK15026	PRK15026	aminoacyl-histidine dipeptidase; Provisional	485
184987	PRK15027	PRK15027	xylulokinase; Provisional	484
184988	PRK15028	PRK15028	cytochrome d ubiquinol oxidase subunit II. 	378
184989	PRK15029	PRK15029	arginine decarboxylase; Provisional	755
184990	PRK15030	PRK15030	multidrug efflux RND transporter periplasmic adaptor subunit AcrA. 	397
184991	PRK15031	PRK15031	5-carboxymethyl-2-hydroxymuconate Delta-isomerase. 	126
184992	PRK15032	PRK15032	pentaheme c-type cytochrome TorC. 	390
237885	PRK15033	PRK15033	tricarballylate utilization 4Fe-4S protein TcuB. 	389
184994	PRK15034	PRK15034	nitrate/nitrite transport protein NarU; Provisional	462
184995	PRK15035	PRK15035	cytochrome bd-II oxidase subunit 1; Provisional	514
184996	PRK15036	PRK15036	hydroxyisourate hydrolase; Provisional	137
184997	PRK15037	PRK15037	D-mannonate oxidoreductase; Provisional	486
184998	PRK15038	PRK15038	autoinducer 2 ABC transporter permease LsrD. 	330
184999	PRK15039	PRK15039	Ni(II)/Co(II)-binding transcriptional repressor RcnR. 	90
185000	PRK15040	PRK15040	L-serine ammonia-lyase. 	454
185001	PRK15041	PRK15041	methyl-accepting chemotaxis protein. 	554
237886	PRK15042	pduD	propanediol/glycerol family dehydratase medium subunit. 	219
185003	PRK15043	PRK15043	HTH-type transcriptional regulator MlrA. 	243
185004	PRK15044	PRK15044	transcriptional regulator SirC; Provisional	295
185005	PRK15045	PRK15045	cellulose biosynthesis protein BcsE; Provisional	519
237887	PRK15046	PRK15046	2-aminoethylphosphonate ABC transporter substrate-binding protein; Provisional	349
185007	PRK15047	PRK15047	N-hydroxyarylamine O-acetyltransferase; Provisional	281
185008	PRK15048	PRK15048	methyl-accepting chemotaxis protein II; Provisional	553
185009	PRK15049	PRK15049	L-asparagine permease. 	499
237888	PRK15050	PRK15050	2-aminoethylphosphonate transport system permease PhnU; Provisional	296
185011	PRK15051	PRK15051	4-amino-4-deoxy-L-arabinose-phosphoundecaprenol flippase subunit ArnE; Provisional	111
237889	PRK15052	PRK15052	tagatose-bisphosphate aldolase subunit GatZ. 	421
185013	PRK15053	dpiB	sensor histidine kinase DpiB; Provisional	545
185014	PRK15054	PRK15054	nitrate reductase molybdenum cofactor assembly chaperone. 	231
237890	PRK15055	PRK15055	anaerobic sulfite reductase subunit AsrA. 	344
185016	PRK15056	PRK15056	manganese/iron ABC transporter ATP-binding protein. 	272
185017	PRK15057	PRK15057	UDP-glucose 6-dehydrogenase; Provisional	388
185018	PRK15058	PRK15058	cytochrome b562; Provisional	128
185019	PRK15059	PRK15059	2-hydroxy-3-oxopropionate reductase. 	292
185020	PRK15060	PRK15060	2,3-diketo-L-gulonate transporter large permease YiaN. 	425
237891	PRK15061	PRK15061	catalase/peroxidase. 	726
237892	PRK15062	PRK15062	hydrogenase isoenzymes formation protein HypD; Provisional	364
237893	PRK15063	PRK15063	isocitrate lyase; Provisional	428
237894	PRK15064	PRK15064	ABC transporter ATP-binding protein; Provisional	530
237895	PRK15065	PRK15065	mannose/fructose/sorbose family PTS transporter subunit IIC. 	262
237896	PRK15066	PRK15066	inner membrane transport permease; Provisional	257
237897	PRK15067	PRK15067	ethanolamine ammonia-lyase subunit EutB. 	461
237898	PRK15068	PRK15068	tRNA 5-methoxyuridine(34)/uridine 5-oxyacetic acid(34) synthase CmoB. 	322
185029	PRK15069	PRK15069	histidine ABC transporter permease HisM. 	234
237899	PRK15070	PRK15070	phosphate propanoyltransferase. 	211
237900	PRK15071	PRK15071	lipopolysaccharide ABC transporter permease; Provisional	356
237901	PRK15072	PRK15072	D-galactonate dehydratase family protein. 	404
185033	PRK15074	PRK15074	inosine/guanosine kinase; Provisional	434
237902	PRK15075	PRK15075	tricarballylate/proton symporter TcuC. 	434
185035	PRK15076	PRK15076	alpha-galactosidase; Provisional	431
237903	PRK15078	PRK15078	polysaccharide export protein Wza; Provisional	379
185037	PRK15079	PRK15079	oligopeptide ABC transporter ATP-binding protein OppF; Provisional	331
237904	PRK15080	PRK15080	ethanolamine utilization protein EutJ; Provisional	267
185039	PRK15081	PRK15081	glutathione ABC transporter permease GsiC; Provisional	306
185040	PRK15082	PRK15082	glutathione ABC transporter permease GsiD; Provisional	301
237905	PRK15083	PRK15083	PTS system mannitol-specific transporter subunit IICBA; Provisional	639
185042	PRK15084	PRK15084	formate hydrogenlyase maturation protein HycH; Provisional	133
237906	PRK15086	PRK15086	ethanolamine utilization protein EutH; Provisional	372
185044	PRK15087	PRK15087	hemolysin; Provisional	219
185045	PRK15088	PRK15088	PTS system mannose-specific transporter subunits IIAB; Provisional	322
185046	PRK15090	PRK15090	DNA-binding transcriptional regulator KdgR; Provisional	257
185047	PRK15091	PRK15091	phospholipid-binding lipoprotein MlaA. 	251
237907	PRK15092	PRK15092	DNA-binding transcriptional repressor LrhA; Provisional	310
185049	PRK15093	PRK15093	peptide ABC transporter ATP-binding protein SapD. 	330
185050	PRK15094	PRK15094	magnesium/cobalt transporter CorC. 	292
237908	PRK15095	PRK15095	FKBP-type peptidyl-prolyl cis-trans isomerase; Provisional	156
185052	PRK15097	PRK15097	cytochrome bd-I ubiquinol oxidase subunit CydA. 	522
185053	PRK15098	PRK15098	beta-glucosidase BglX. 	765
185054	PRK15099	PRK15099	lipid III flippase WzxE. 	416
185055	PRK15100	PRK15100	cystine ABC transporter permease. 	220
185056	PRK15101	PRK15101	protease3; Provisional	961
237909	PRK15102	PRK15102	trimethylamine-N-oxide reductase TorA. 	825
237910	PRK15103	PRK15103	membrane integrity-associated transporter subunit PqiA. 	419
185059	PRK15104	PRK15104	oligopeptide ABC transporter substrate-binding protein OppA; Provisional	543
185060	PRK15105	PRK15105	peptidoglycan synthase FtsI; Provisional	578
237911	PRK15106	PRK15106	nucleoside-specific channel-forming protein Tsx; Provisional	289
185062	PRK15107	PRK15107	glutamate/aspartate ABC transporter permease GltK. 	224
185063	PRK15108	PRK15108	biotin synthase; Provisional	345
185064	PRK15109	PRK15109	antimicrobial peptide ABC transporter periplasmic binding protein SapA; Provisional	547
185065	PRK15110	PRK15110	peptide ABC transporter permease SapB. 	321
185066	PRK15111	PRK15111	peptide ABC transporter permease SapC. 	296
185067	PRK15112	PRK15112	peptide ABC transporter ATP-binding protein SapF. 	267
185068	PRK15113	PRK15113	glutathione transferase. 	214
185069	PRK15114	PRK15114	tRNA (cytidine/uridine-2'-O-)-methyltransferase TrmJ; Provisional	245
185070	PRK15115	PRK15115	response regulator GlrR; Provisional	444
185071	PRK15116	PRK15116	sulfur acceptor protein CsdL; Provisional	268
237912	PRK15117	PRK15117	phospholipid-binding protein MlaC. 	211
185073	PRK15118	PRK15118	universal stress protein UspA. 	144
237913	PRK15119	PRK15119	UDP-N-acetylglucosamine--undecaprenyl-phosphate N-acetylglucosaminephosphotransferase. 	365
185075	PRK15120	PRK15120	lipopolysaccharide ABC transporter permease LptF; Provisional	366
185076	PRK15121	PRK15121	MDR efflux pump AcrAB transcriptional activator RobA. 	289
237914	PRK15122	PRK15122	magnesium-transporting ATPase; Provisional	903
237915	PRK15123	PRK15123	lipopolysaccharide core heptose(I) kinase RfaP; Provisional	268
185079	PRK15124	PRK15124	RNA 2',3'-cyclic phosphodiesterase. 	176
185080	PRK15126	PRK15126	HMP-PP phosphatase. 	272
185081	PRK15127	PRK15127	multidrug efflux RND transporter permease subunit AcrB. 	1049
185082	PRK15128	PRK15128	23S rRNA (cytosine(1962)-C(5))-methyltransferase RlmI. 	396
185083	PRK15129	PRK15129	L-Ala-D/L-Glu epimerase; Provisional	321
237916	PRK15130	PRK15130	spermidine N1-acetyltransferase; Provisional	186
185085	PRK15131	PRK15131	mannose-6-phosphate isomerase; Provisional	389
185086	PRK15132	PRK15132	tyrosine transporter TyrP; Provisional	403
185087	PRK15133	PRK15133	microcin C ABC transporter permease YejB; Provisional	364
237917	PRK15134	PRK15134	microcin C ABC transporter ATP-binding protein YejF; Provisional	529
185089	PRK15135	PRK15135	histidine ABC transporter permease HisQ. 	228
185090	PRK15136	PRK15136	multidrug efflux MFS transporter periplasmic adaptor subunit EmrA. 	390
185091	PRK15137	PRK15137	DNA-specific endonuclease I; Provisional	235
185092	PRK15138	PRK15138	alcohol dehydrogenase. 	387
185093	PRK15171	PRK15171	lipopolysaccharide 3-alpha-galactosyltransferase. 	334
237918	PRK15172	PRK15172	aldose-1-epimerase. 	300
185095	PRK15173	PRK15173	peptidase; Provisional	323
185096	PRK15174	PRK15174	Vi polysaccharide transport protein VexE. 	656
185097	PRK15175	PRK15175	Vi polysaccharide ABC transporter protein VexA. 	355
185098	PRK15176	PRK15176	Vi polysaccharide ABC transporter inner membrane protein VexB. 	264
185099	PRK15177	PRK15177	Vi polysaccharide ABC transporter ATP-binding protein VexC. 	213
185100	PRK15178	PRK15178	Vi polysaccharide ABC transporter inner membrane protein VexD. 	434
185101	PRK15179	PRK15179	Vi polysaccharide biosynthesis protein TviE; Provisional	694
185102	PRK15180	PRK15180	Vi polysaccharide biosynthesis protein TviD; Provisional	831
185103	PRK15181	PRK15181	Vi polysaccharide biosynthesis UDP-N-acetylglucosaminuronic acid C-4 epimerase TviC. 	348
185104	PRK15182	PRK15182	Vi polysaccharide biosynthesis UDP-N-acetylglucosamine C-6 dehydrogenase TviB. 	425
185105	PRK15183	PRK15183	Vi polysaccharide biosynthesis regulator TviA. 	143
185106	PRK15184	PRK15184	curli production assembly/transport protein CsgG; Provisional	277
185107	PRK15185	PRK15185	transcriptional regulator HilD; Provisional	309
185108	PRK15186	PRK15186	AraC family transcriptional regulator; Provisional	291
185109	PRK15187	PRK15187	fimbrial protein BcfA; Provisional	180
185110	PRK15188	PRK15188	fimbrial chaperone protein BcfB; Provisional	228
185111	PRK15189	PRK15189	fimbrial protein BcfD; Provisional	335
185112	PRK15190	PRK15190	fimbrial protein BcfE; Provisional	181
185113	PRK15191	PRK15191	fimbrial protein BcfF; Provisional	172
185114	PRK15192	PRK15192	pili/flagellar-assembly chaperone. 	234
237919	PRK15193	PRK15193	outer membrane usher protein; Provisional	876
237920	PRK15194	PRK15194	type 1 fimbrial protein subunit FimA. 	185
185117	PRK15195	PRK15195	molecular chaperone FimC. 	229
185118	PRK15196	PRK15196	type III secretion system effector PipB2. 	350
185119	PRK15197	PRK15197	type III secretion system effector PipB. 	291
185120	PRK15198	PRK15198	outer membrane usher protein FimD. 	860
237921	PRK15199	fimH	type 1 fimbrin D-mannose specific adhesin FimH. 	335
185122	PRK15200	PRK15200	type 1 fimbrial protein subunit FimI. 	177
185123	PRK15201	PRK15201	fimbriae biosynthesis transcriptional regulator FimW. 	198
237922	PRK15202	PRK15202	type III secretion system chaperone. 	117
185125	PRK15203	PRK15203	4-hydroxyphenylacetate degradation bifunctional isomerase/decarboxylase; Provisional	429
185126	PRK15204	PRK15204	undecaprenyl-phosphate galactose phosphotransferase; Provisional	476
237923	PRK15205	PRK15205	long polar fimbrial protein LpfE; Provisional	176
237924	PRK15206	PRK15206	long polar fimbrial protein LpfD; Provisional	359
185129	PRK15207	PRK15207	outer membrane usher protein LpfC. 	842
237925	PRK15208	PRK15208	molecular chaperone LpfB. 	228
185131	PRK15209	PRK15209	long polar fimbrial protein LpfA; Provisional	174
185132	PRK15210	PRK15210	fimbrial protein. 	194
185133	PRK15211	PRK15211	fimbrial chaperone. 	229
185134	PRK15212	PRK15212	virulence protein SpvA; Provisional	255
237926	PRK15213	PRK15213	outer membrane usher protein PefC. 	797
185136	PRK15214	PRK15214	fimbrial protein PefA; Provisional	172
185137	PRK15215	PRK15215	fimbriae biosynthesis regulatory protein; Provisional	100
185138	PRK15216	PRK15216	putative fimbrial biosynthesis regulatory protein; Provisional	340
185139	PRK15217	PRK15217	fimbrial outer membrane usher protein; Provisional	826
185140	PRK15218	PRK15218	fimbrial assembly chaperone. 	226
237927	PRK15219	PRK15219	carbonic anhydrase; Provisional	245
237928	PRK15220	PRK15220	fimbrial protein YehD. 	178
185143	PRK15221	PRK15221	Saf-pilin pilus formation protein SafA; Provisional	165
185144	PRK15222	PRK15222	putative pilin structural protein SafD; Provisional	156
185145	PRK15223	PRK15223	fimbrial biogenesis outer membrane usher protein. 	836
185146	PRK15224	PRK15224	pili assembly chaperone PapD. 	237
185147	PRK15228	PRK15228	fimbrial protein SefA; Provisional	165
185148	PRK15231	PRK15231	fimbrial adhesin protein SefD; Provisional	150
185149	PRK15233	PRK15233	fimbrial chaperone SefB. 	246
185150	PRK15235	PRK15235	outer membrane fimbrial usher protein SefC; Provisional	814
237929	PRK15238	PRK15238	inner membrane transporter YjeM; Provisional	496
185152	PRK15239	PRK15239	putative fimbrial protein StaA; Provisional	197
185153	PRK15240	PRK15240	resistance to complement killing; Provisional	185
185154	PRK15241	PRK15241	putative fimbrial protein StaD; Provisional	188
185155	PRK15243	PRK15243	virulence genes transcriptional activator SpvR. 	297
185156	PRK15244	PRK15244	type III secretion system effector NAD(+)--protein-arginine ADP-ribosyltransferase SpvB. 	591
185157	PRK15245	PRK15245	type III secretion system effector phosphothreonine lyase. 	241
185158	PRK15246	PRK15246	fimbrial assembly chaperone. 	233
185159	PRK15247	PRK15247	fimbrial usher protein StbD. 	441
185160	PRK15248	PRK15248	fimbrial outer membrane usher protein. 	853
237930	PRK15249	PRK15249	fimbrial chaperone. 	253
185162	PRK15250	PRK15250	type III secretion system effector cysteine hydrolase SpvD. 	216
237931	PRK15251	PRK15251	cytolethal distending toxin subunit B family protein. 	271
237932	PRK15252	PRK15252	putative fimbrial-like adhesin protein. 	344
185165	PRK15253	PRK15253	putative fimbrial assembly chaperone protein StcB; Provisional	242
185166	PRK15254	PRK15254	fimbrial chaperone protein StdC; Provisional	239
185167	PRK15255	PRK15255	fimbrial outer membrane usher protein StdB; Provisional	829
185168	PRK15260	PRK15260	fimbrial protein SteF; Provisional	178
185169	PRK15261	PRK15261	fimbrial protein SteA; Provisional	195
237933	PRK15262	PRK15262	putative fimbrial protein StaF; Provisional	197
185171	PRK15263	PRK15263	fimbrial-like protein. 	196
185172	PRK15265	PRK15265	subtilase cytotoxin subunit B-like protein; Provisional	134
185173	PRK15266	PRK15266	subtilase cytotoxin subunit B; Provisional	135
185174	PRK15267	PRK15267	subtilase cytotoxin subunit B-like protein; Provisional	141
185175	PRK15272	PRK15272	pertussis-like toxin subunit ArtA. 	242
185176	PRK15273	PRK15273	fimbrial biogenesis outer membrane usher protein. 	881
185177	PRK15274	PRK15274	putative periplasmic fimbrial chaperone protein SteC; Provisional	257
185178	PRK15275	PRK15275	putative fimbrial protein SteD; Provisional	166
185179	PRK15276	PRK15276	putative fimbrial subunit SteE; Provisional	153
185180	PRK15278	PRK15278	type III secretion protein BopE; Provisional	261
185181	PRK15279	PRK15279	type III secretion system guanine nucleotide exchange factor SopE. 	240
185182	PRK15280	PRK15280	type III secretion system guanine nucleotide exchange factor SopE2. 	240
185183	PRK15283	PRK15283	fimbrial major subunit StfA. 	186
185184	PRK15284	PRK15284	putative fimbrial outer membrane usher protein StfC; Provisional	881
185185	PRK15285	PRK15285	fimbrial chaperone StfD. 	250
185186	PRK15286	PRK15286	putative minor fimbrial subunit StfE; Provisional	170
185187	PRK15287	PRK15287	fimbrial minor subunit StfF. 	158
185188	PRK15288	PRK15288	putative minor fimbrial subunit StfG; Provisional	176
185189	PRK15289	lpfA	fimbrial protein; Provisional	190
237934	PRK15290	lfpB	fimbrial chaperone. 	243
185191	PRK15291	PRK15291	fimbrial protein StgD; Provisional	355
237935	PRK15292	PRK15292	type 1 fimbrial protein. 	365
185193	PRK15293	PRK15293	fimbrial protein SthD. 	185
185194	PRK15294	PRK15294	fimbrial outer membrane usher protein. 	845
237936	PRK15295	PRK15295	fimbrial assembly chaperone. 	226
185196	PRK15296	PRK15296	fimbrial protein SthA. 	181
185197	PRK15297	PRK15297	fimbrial protein. 	359
185198	PRK15298	PRK15298	fimbrial outer membrane usher protein. 	848
185199	PRK15299	PRK15299	fimbrial chaperone protein StiB; Provisional	227
185200	PRK15300	PRK15300	fimbrial protein StiA; Provisional	179
185201	PRK15301	PRK15301	hypothetical protein; Provisional	186
185202	PRK15302	PRK15302	hypothetical protein; Provisional	229
185203	PRK15303	PRK15303	hypothetical protein; Provisional	229
237937	PRK15304	PRK15304	putative fimbrial outer membrane usher protein; Provisional	801
185205	PRK15305	PRK15305	fimbrial protein StkG. 	353
237938	PRK15306	PRK15306	fimbrial protein. 	190
185207	PRK15307	PRK15307	major fimbrial protein StkA; Provisional	201
237939	PRK15308	PRK15308	fimbrial protein TcfA. 	234
185209	PRK15309	PRK15309	fimbrial protein TcfB. 	191
185210	PRK15310	PRK15310	fimbrial outer membrane usher protein TcfC; Provisional	895
185211	PRK15311	PRK15311	fimbrial protein TcfD. 	359
185212	PRK15312	PRK15312	antimicrobial resistance protein Mig-14; Provisional	298
237940	PRK15313	PRK15313	intestinal colonization autotransporter adhesin MisL. 	955
185214	PRK15314	PRK15314	outer membrane protein RatB; Provisional	2435
237941	PRK15315	PRK15315	outer membrane protein RatA; Provisional	1865
185216	PRK15316	PRK15316	RatA-like protein; Provisional	2683
237942	PRK15317	PRK15317	alkyl hydroperoxide reductase subunit F; Provisional	517
237943	PRK15318	PRK15318	intimin-like protein SinH; Provisional	730
185219	PRK15319	PRK15319	fibronectin-binding autotransporter adhesin ShdA. 	2039
185220	PRK15320	PRK15320	transcriptional activator SprB; Provisional	251
185221	PRK15321	PRK15321	type III secretion system effector protein OrgC. 	120
185222	PRK15322	PRK15322	oxygen-regulated invasion protein OrgB. 	210
185223	PRK15323	PRK15323	oxygen-regulated invasion protein OrgA. 	167
185224	PRK15324	PRK15324	EscJ/YscJ/HrcJ family type III secretion inner membrane ring protein. 	252
185225	PRK15325	PRK15325	type III secretion system inner rod protein PrgJ. 	80
185226	PRK15326	PRK15326	type III secretion system needle complex protein. 	80
237944	PRK15327	PRK15327	PrgH/EprH family type III secretion inner membrane ring protein. 	393
185228	PRK15328	PRK15328	type III secretion system invasion protein IagB. 	160
237945	PRK15329	PRK15329	chaperone SicP. 	138
185230	PRK15330	PRK15330	type III secretion system needle tip protein SipD. 	343
185231	PRK15331	PRK15331	type III secretion system translocator chaperone SicA. 	165
185232	PRK15332	PRK15332	SpaR/YscT/HrcT type III secretion system export apparatus protein. 	263
185233	PRK15333	PRK15333	EscS/YscS/HrcS family type III secretion system export apparatus protein. 	86
185234	PRK15334	PRK15334	type III secretion system protein SpaN. 	336
185235	PRK15335	PRK15335	type III secretion system protein SpaM; Provisional	147
185236	PRK15336	PRK15336	type III secretion system chaperone SpaK; Provisional	135
237946	PRK15337	PRK15337	EscV/YscV/HrcV family type III secretion system export apparatus protein. 	686
237947	PRK15338	PRK15338	YopN/LcrE/InvE/MxiC type III secretion system gatekeeper. 	372
237948	PRK15339	PRK15339	EscC/YscC/HrcC family type III secretion system outer membrane ring protein. 	559
185240	PRK15340	PRK15340	AraC family transcriptional regulator InvF. 	216
185241	PRK15341	PRK15341	type III secretion system invasion lipoprotein InvH. 	147
185242	PRK15344	PRK15344	type III secretion system needle protein SsaG; Provisional	71
237949	PRK15345	PRK15345	SepL/TyeA/HrpJ family type III secretion system protein. 	326
237950	PRK15346	PRK15346	EscC/YscC/HrcC family type III secretion system outer membrane ring protein. 	499
237951	PRK15347	PRK15347	two component system sensor kinase. 	921
185246	PRK15348	PRK15348	EscJ/YscJ/HrcJ family type III secretion inner membrane ring protein SsaJ. 	249
185247	PRK15349	PRK15349	EscT/YscT/HrcT family type III secretion system export apparatus protein. 	259
185248	PRK15350	PRK15350	EscS/YscS/HrcS family type III secretion system export apparatus protein. 	88
185249	PRK15351	PRK15351	type III secretion system apparatus protein SsaP. 	124
185250	PRK15352	PRK15352	type III secretion system apparatus protein SsaO. 	125
185251	PRK15353	PRK15353	type III secretion system apparatus protein. 	122
185252	PRK15354	PRK15354	type III secretion system apparatus protein SsaK. 	224
185253	PRK15355	PRK15355	type III secretion system protein SsaI; Provisional	82
185254	PRK15356	PRK15356	type III secretion system protein SsaH; Provisional	75
185255	PRK15357	PRK15357	pathogenicity island 2 effector protein SseG; Provisional	229
185256	PRK15358	PRK15358	type III secretion systems effector SseF. 	239
185257	PRK15359	PRK15359	type III secretion system chaperone protein SscB; Provisional	144
185258	PRK15360	PRK15360	pathogenicity island 2 effector protein SseE; Provisional	137
185259	PRK15361	PRK15361	type III secretion system translocon protein SseD. 	195
237952	PRK15362	PRK15362	type III secretion system translocon protein. 	473
185261	PRK15363	PRK15363	CesD/SycD/LcrH family type III secretion system chaperone SscA. 	157
185262	PRK15364	PRK15364	type III secretion system translocon protein SseB. 	196
185263	PRK15365	PRK15365	type III secretion system chaperone SseA; Provisional	107
185264	PRK15366	PRK15366	type III secretion system chaperone SsaE; Provisional	80
185265	PRK15367	PRK15367	EscD/YscD/HrpQ family type III secretion system inner membrane ring protein. 	395
185266	PRK15368	PRK15368	type III secretion system protein SpiC. 	127
185267	PRK15369	PRK15369	two component system response regulator. 	211
185268	PRK15370	PRK15370	type III secretion system effector E3 ubiquitin transferase SlrP. 	754
185269	PRK15371	PRK15371	YopJ family type III secretion system effector serine/threonine acetyltransferase. 	287
185270	PRK15372	PRK15372	type III secretion system effector SseI. 	292
185271	PRK15373	PRK15373	IpaC/SipC family type III secretion system needle tip complex protein. 	411
185272	PRK15374	PRK15374	type III secretion system needle tip complex protein SipB. 	593
185273	PRK15375	PRK15375	type III secretion system effector GTPase-activating protein SptP. 	535
185274	PRK15376	PRK15376	type III secretion system effector SipA. 	670
185275	PRK15377	PRK15377	type III secretion system effector HECT-type E3 ubiquitin transferase. 	782
237953	PRK15378	PRK15378	type III secretion system effector inositol phosphate phosphatase. 	564
185277	PRK15379	PRK15379	type III secretion system effector SopD. 	317
185278	PRK15380	PRK15380	type III secretion system effector SopD2. 	319
185279	PRK15381	PRK15381	type III secretion system effector. 	408
185280	PRK15382	PRK15382	NleB family type III secretion system effector arginine glycosyltransferase. 	326
185281	PRK15383	PRK15383	type III secretion system effector arginine glycosyltransferase. 	335
185282	PRK15384	PRK15384	type III secretion system effector arginine glycosyltransferase SseK1. 	336
185283	PRK15385	PRK15385	MgtC family protein. 	225
237954	PRK15386	PRK15386	type III secretion effector GogB. 	426
185285	PRK15387	PRK15387	type III secretion system effector E3 ubiquitin transferase SspH2. 	788
185286	PRK15388	PRK15388	superoxide dismutase [Cu-Zn]. 	177
237955	PRK15389	PRK15389	fumarate hydratase; Provisional	536
185288	PRK15390	PRK15390	fumarate hydratase FumA; Provisional	548
185289	PRK15391	PRK15391	class I fumarate hydratase. 	548
185290	PRK15392	PRK15392	class I fumarate hydratase. 	550
185291	PRK15393	PRK15393	NUDIX hydrolase YfcD; Provisional	180
185292	PRK15394	PRK15394	4-deoxy-4-formamido-L-arabinose-phosphoundecaprenol deformylase ArnD; Provisional	296
185293	PRK15395	PRK15395	galactose/glucose ABC transporter substrate-binding protein MglB. 	330
185294	PRK15396	PRK15396	major outer membrane lipoprotein. 	78
185295	PRK15397	PRK15397	nicotinamide riboside transporter PnuC; Provisional	239
237956	PRK15398	PRK15398	aldehyde dehydrogenase. 	465
185297	PRK15399	PRK15399	lysine decarboxylase. 	713
185298	PRK15400	PRK15400	lysine decarboxylase. 	714
237957	PRK15401	PRK15401	DNA oxidative demethylase AlkB. 	213
185300	PRK15402	PRK15402	MdfA family multidrug efflux MFS transporter. 	406
237958	PRK15403	PRK15403	multidrug efflux MFS transporter MdtM. 	413
237959	PRK15404	PRK15404	high-affinity branched-chain amino acid ABC transporter substrate-binding protein. 	369
185303	PRK15405	PRK15405	ethanolamine utilization microcompartment protein EutL. 	217
185304	PRK15406	PRK15406	oligopeptide ABC transporter permease OppC; Provisional	302
237960	PRK15407	PRK15407	lipopolysaccharide biosynthesis protein RfbH; Provisional	438
237961	PRK15408	PRK15408	autoinducer 2 ABC transporter substrate-binding protein LsrB. 	336
185307	PRK15409	PRK15409	glyoxylate/hydroxypyruvate reductase GhrB. 	323
185308	PRK15410	PRK15410	DgsA anti-repressor MtfA; Provisional	260
185309	PRK15411	rcsA	transcriptional regulator RcsA. 	207
185310	PRK15412	PRK15412	thiol:disulfide interchange protein DsbE; Provisional	185
185311	PRK15413	PRK15413	glutathione ABC transporter substrate-binding protein GsiB; Provisional	512
185312	PRK15414	PRK15414	phosphomannomutase. 	456
185313	PRK15415	PRK15415	propanediol utilization microcompartment protein PduB. 	266
185314	PRK15416	PRK15416	lipopolysaccharide core heptose(II)-phosphate phosphatase; Provisional	201
185315	PRK15417	PRK15417	integron integrase. 	337
237962	PRK15418	PRK15418	transcriptional regulator LsrR; Provisional	318
185317	PRK15419	PRK15419	sodium/proline symporter PutP. 	502
185318	PRK15420	fucU	L-fucose mutarotase; Provisional	140
185319	PRK15421	PRK15421	HTH-type transcriptional regulator MetR. 	317
185320	PRK15422	PRK15422	septal ring assembly protein ZapB; Provisional	79
185321	PRK15423	PRK15423	hypoxanthine phosphoribosyltransferase; Provisional	178
237963	PRK15424	PRK15424	propionate catabolism operon regulatory protein PrpR; Provisional	538
185323	PRK15425	gapA	glyceraldehyde-3-phosphate dehydrogenase. 	331
237964	PRK15426	PRK15426	cellulose biosynthesis regulator YedQ. 	570
185325	PRK15427	PRK15427	colanic acid biosynthesis glycosyltransferase WcaL; Provisional	406
185326	PRK15428	PRK15428	putative propanediol utilization protein PduM; Provisional	163
237965	PRK15429	PRK15429	formate hydrogenlyase transcriptional activator FlhA. 	686
185328	PRK15430	PRK15430	EamA family transporter RarD. 	296
185329	PRK15431	PRK15431	[Fe-S]-dependent transcriptional repressor FeoC. 	78
185330	PRK15432	PRK15432	autoinducer 2 ABC transporter permease LsrC; Provisional	344
185331	PRK15433	PRK15433	branched-chain amino acid transporter carrier protein BrnQ. 	439
237966	PRK15434	PRK15434	GDP-mannose mannosyl hydrolase. 	159
185333	PRK15435	PRK15435	bifunctional DNA-binding transcriptional regulator/O6-methylguanine-DNA methyltransferase Ada. 	353
185334	PRK15437	PRK15437	histidine ABC transporter substrate-binding protein HisJ; Provisional	259
185335	PRK15438	PRK15438	erythronate-4-phosphate dehydrogenase PdxB; Provisional	378
185336	PRK15439	PRK15439	autoinducer 2 ABC transporter ATP-binding protein LsrA; Provisional	510
185337	PRK15440	PRK15440	L-rhamnonate dehydratase; Provisional	394
185338	PRK15441	PRK15441	peptidyl-prolyl cis-trans isomerase C; Provisional	93
185339	PRK15442	PRK15442	beta-lactamase TEM; Provisional	284
185340	PRK15443	pduE	diol dehydratase small subunit. 	138
185341	PRK15444	pduC	propanediol/glycerol family dehydratase large subunit. 	554
185342	PRK15445	PRK15445	arsenical efflux pump membrane protein ArsB. 	427
237967	PRK15446	PRK15446	phosphonate metabolism protein PhnM; Provisional	383
237968	PRK15447	PRK15447	putative protease; Provisional	301
185345	PRK15448	PRK15448	ethanolamine utilization microcompartment protein EutN. 	95
185346	PRK15449	PRK15449	ferredoxin-like protein FixX; Provisional	95
185347	PRK15450	PRK15450	signal transduction protein PmrD; Provisional	85
185348	PRK15451	PRK15451	carboxy-S-adenosyl-L-methionine synthase CmoA. 	247
237969	PRK15452	PRK15452	putative protease; Provisional	443
237970	PRK15453	PRK15453	phosphoribulokinase; Provisional	290
185351	PRK15454	PRK15454	ethanolamine utilization ethanol dehydrogenase EutG. 	395
185352	PRK15455	PRK15455	PrkA family serine protein kinase; Provisional	644
185353	PRK15456	PRK15456	universal stress protein UspG; Provisional	142
185354	PRK15457	PRK15457	ethanolamine utilization acetate kinase EutQ. 	233
185355	PRK15458	PRK15458	tagatose 6-phosphate aldolase subunit KbaZ; Provisional	426
185356	PRK15459	PRK15459	flagella biosynthesis chaperone FlgN. 	140
185357	PRK15460	cpsB	mannose-1-phosphate guanyltransferase; Provisional	478
185358	PRK15461	PRK15461	sulfolactaldehyde 3-reductase. 	296
237971	PRK15462	PRK15462	dipeptide permease. 	493
185360	PRK15463	PRK15463	cold shock-like protein CspF; Provisional	70
185361	PRK15464	PRK15464	cold shock-like protein CspH; Provisional	70
185362	PRK15465	pabB	aminodeoxychorismate synthase component 1. 	453
185363	PRK15466	PRK15466	ethanolamine utilization microcompartment protein EutK. 	166
185364	PRK15467	PRK15467	ethanolamine utilization acetate kinase EutP. 	158
185365	PRK15468	PRK15468	ethanolamine utilization microcompartment protein EutS. 	111
185366	PRK15469	ghrA	glyoxylate/hydroxypyruvate reductase GhrA. 	312
185367	PRK15470	emtA	membrane-bound lytic murein transglycosylase EmtA. 	203
185368	PRK15471	PRK15471	chain length determinant protein WzzB; Provisional	325
185369	PRK15472	PRK15472	nucleoside triphosphatase NudI; Provisional	141
185370	PRK15473	cbiF	cobalt-precorrin-4 methyltransferase. 	257
185371	PRK15474	PRK15474	ethanolamine utilization microcompartment protein EutM. 	97
185372	PRK15475	PRK15475	oxaloacetate decarboxylase subunit beta; Provisional	433
185373	PRK15476	PRK15476	oxaloacetate decarboxylase subunit beta; Provisional	433
185374	PRK15477	PRK15477	oxaloacetate decarboxylase subunit beta; Provisional	433
185375	PRK15478	cbiH	precorrin-3B C(17)-methyltransferase. 	241
185376	PRK15479	PRK15479	transcriptional regulator TctD. 	221
185377	PRK15480	PRK15480	glucose-1-phosphate thymidylyltransferase RfbA; Provisional	292
185378	PRK15481	PRK15481	transcriptional regulatory protein PtsJ; Provisional	431
185379	PRK15482	PRK15482	HTH-type transcriptional regulator MurR. 	285
237972	PRK15483	PRK15483	type III restriction-modification system endonuclease. 	986
185381	PRK15484	PRK15484	lipopolysaccharide N-acetylglucosaminyltransferase. 	380
185382	PRK15485	PRK15485	energy-coupling factor ABC transporter transmembrane protein. 	225
185383	PRK15486	hpaC	4-hydroxyphenylacetate 3-monooxygenase reductase subunit; Provisional	170
185384	PRK15487	PRK15487	O-antigen ligase RfaL; Provisional	400
237973	PRK15488	PRK15488	thiosulfate reductase PhsA; Provisional	759
237974	PRK15489	nfrB	glycosyl transferase family protein. 	703
185387	PRK15490	PRK15490	Vi polysaccharide biosynthesis glycosyltransferase TviE. 	578
185388	PRK15491	PRK15491	replication factor A; Provisional	374
185389	PRK15492	PRK15492	triosephosphate isomerase; Provisional	260
185390	PRK15493	PRK15493	bifunctional S-methyl-5'-thioadenosine deaminase/S-adenosylhomocysteine deaminase. 	435
185391	PRK15494	era	GTPase Era; Provisional	339
240225	PTZ00004	PTZ00004	actin-2; Provisional	378
173310	PTZ00005	PTZ00005	phosphoglycerate kinase; Provisional	417
240226	PTZ00007	PTZ00007	(NAP-L) nucleosome assembly protein -L; Provisional	337
185394	PTZ00008	PTZ00008	(NAP-S) nucleosome assembly protein-S; Provisional	185
240227	PTZ00009	PTZ00009	heat shock 70 kDa protein; Provisional	653
240228	PTZ00010	PTZ00010	tubulin beta chain; Provisional	445
140051	PTZ00013	PTZ00013	plasmepsin 4 (PM4); Provisional	450
240229	PTZ00014	PTZ00014	myosin-A; Provisional	821
185397	PTZ00015	PTZ00015	histone H4; Provisional	102
240230	PTZ00016	PTZ00016	aquaglyceroporin; Provisional	294
185399	PTZ00017	PTZ00017	histone H2A; Provisional	134
185400	PTZ00018	PTZ00018	histone H3; Provisional	136
240231	PTZ00019	PTZ00019	fructose-bisphosphate aldolase; Provisional	355
240232	PTZ00021	PTZ00021	falcipain-2; Provisional	489
173322	PTZ00023	PTZ00023	glyceraldehyde-3-phosphate dehydrogenase; Provisional	337
240233	PTZ00024	PTZ00024	cyclin-dependent protein kinase; Provisional	335
185402	PTZ00026	PTZ00026	60S ribosomal protein L15; Provisional	204
240234	PTZ00027	PTZ00027	60S ribosomal protein L6; Provisional	190
185404	PTZ00028	PTZ00028	40S ribosomal protein S6e; Provisional	218
185405	PTZ00029	PTZ00029	60S ribosomal protein L10a; Provisional	216
185406	PTZ00030	PTZ00030	60S ribosomal protein L20; Provisional	121
173329	PTZ00031	PTZ00031	ribosomal protein L2; Provisional	317
240235	PTZ00032	PTZ00032	60S ribosomal protein L18; Provisional	211
140068	PTZ00033	PTZ00033	60S ribosomal protein L24; Provisional	125
173331	PTZ00034	PTZ00034	40S ribosomal protein S10; Provisional	124
185407	PTZ00035	PTZ00035	Rad51 protein; Provisional	337
173333	PTZ00036	PTZ00036	glycogen synthase kinase; Provisional	440
240236	PTZ00037	PTZ00037	DnaJ_C chaperone protein; Provisional	421
240237	PTZ00038	PTZ00038	ferredoxin; Provisional	191
173336	PTZ00039	PTZ00039	40S ribosomal protein S20; Provisional	115
240238	PTZ00040	PTZ00040	translation initiation factor E4; Provisional	233
240239	PTZ00041	PTZ00041	60S ribosomal protein L35a; Provisional	120
240240	PTZ00043	PTZ00043	cytochrome c oxidase subunit; Provisional	268
185411	PTZ00044	PTZ00044	ubiquitin; Provisional	76
240241	PTZ00045	PTZ00045	apical membrane antigen 1; Provisional	595
240242	PTZ00046	PTZ00046	rifin; Provisional	358
240243	PTZ00047	PTZ00047	cytochrome c oxidase subunit II; Provisional	162
185414	PTZ00048	PTZ00048	cytochrome c; Provisional	115
240244	PTZ00049	PTZ00049	cathepsin C-like protein; Provisional	693
240245	PTZ00050	PTZ00050	3-oxoacyl-acyl carrier protein synthase; Provisional	421
173347	PTZ00051	PTZ00051	thioredoxin; Provisional	98
185416	PTZ00052	PTZ00052	thioredoxin reductase; Provisional	499
240246	PTZ00053	PTZ00053	methionine aminopeptidase 2; Provisional	470
185418	PTZ00054	PTZ00054	60S ribosomal protein L23; Provisional	139
240247	PTZ00055	PTZ00055	glutathione synthetase; Provisional	619
240248	PTZ00056	PTZ00056	glutathione peroxidase; Provisional	199
173353	PTZ00057	PTZ00057	glutathione s-transferase; Provisional	205
185420	PTZ00058	PTZ00058	glutathione reductase; Provisional	561
185421	PTZ00059	PTZ00059	dynein light chain; Provisional	90
240249	PTZ00060	PTZ00060	cyclophilin; Provisional	183
173356	PTZ00061	PTZ00061	DNA-directed RNA polymerase; Provisional	205
240250	PTZ00062	PTZ00062	glutaredoxin; Provisional	204
240251	PTZ00063	PTZ00063	histone deacetylase; Provisional	436
173359	PTZ00064	PTZ00064	histone acetyltransferase; Provisional	552
240252	PTZ00065	PTZ00065	60S ribosomal protein L14; Provisional	130
173361	PTZ00066	PTZ00066	pyruvate kinase; Provisional	513
185422	PTZ00067	PTZ00067	40S ribosomal protein S23; Provisional	143
240253	PTZ00068	PTZ00068	60S ribosomal protein L13a; Provisional	202
240254	PTZ00069	PTZ00069	60S ribosomal protein L5; Provisional	300
240255	PTZ00070	PTZ00070	40S ribosomal protein S2; Provisional	257
240256	PTZ00071	PTZ00071	40S ribosomal protein S24; Provisional	132
185427	PTZ00072	PTZ00072	40S ribosomal protein S13; Provisional	148
240257	PTZ00073	PTZ00073	60S ribosomal protein L37; Provisional	91
185429	PTZ00074	PTZ00074	60S ribosomal protein L34; Provisional	135
240258	PTZ00075	PTZ00075	Adenosylhomocysteinase; Provisional	476
173371	PTZ00076	PTZ00076	60S ribosomal protein L17; Provisional	253
185431	PTZ00077	PTZ00077	asparagine synthetase-like protein; Provisional	586
185432	PTZ00078	PTZ00078	Superoxide dismutase [Fe]; Provisional	193
185433	PTZ00079	PTZ00079	NADP-specific glutamate dehydrogenase; Provisional	454
240259	PTZ00081	PTZ00081	enolase; Provisional	439
173376	PTZ00082	PTZ00082	L-lactate dehydrogenase; Provisional	321
185434	PTZ00083	PTZ00083	40S ribosomal protein S27; Provisional	85
240260	PTZ00084	PTZ00084	40S ribosomal protein S3; Provisional	220
240261	PTZ00085	PTZ00085	40S ribosomal protein S28; Provisional	73
185437	PTZ00086	PTZ00086	40S ribosomal protein S16; Provisional	147
185438	PTZ00087	PTZ00087	thrombospondin-related protein; Provisional	340
240262	PTZ00088	PTZ00088	adenylate kinase 1; Provisional	229
173383	PTZ00089	PTZ00089	transketolase; Provisional	661
173384	PTZ00090	PTZ00090	40S ribosomal protein S11; Provisional	233
185439	PTZ00091	PTZ00091	40S ribosomal protein S5; Provisional	193
240263	PTZ00092	PTZ00092	aconitate hydratase-like protein; Provisional	898
173387	PTZ00093	PTZ00093	nucleoside diphosphate kinase, cytosolic; Provisional	149
240264	PTZ00094	PTZ00094	serine hydroxymethyltransferase; Provisional	452
140127	PTZ00095	PTZ00095	40S ribosomal protein S19; Provisional	169
185442	PTZ00096	PTZ00096	40S ribosomal protein S15; Provisional	143
185443	PTZ00097	PTZ00097	60S ribosomal protein L19; Provisional	175
173391	PTZ00098	PTZ00098	phosphoethanolamine N-methyltransferase; Provisional	263
185444	PTZ00099	PTZ00099	rab6; Provisional	176
240265	PTZ00100	PTZ00100	DnaJ chaperone protein; Provisional	116
185445	PTZ00101	PTZ00101	rhomboid-1 protease; Provisional	278
240266	PTZ00102	PTZ00102	disulphide isomerase; Provisional	477
240267	PTZ00103	PTZ00103	60S ribosomal protein L3; Provisional	390
240268	PTZ00104	PTZ00104	S-adenosylmethionine synthase; Provisional	398
240269	PTZ00105	PTZ00105	60S ribosomal protein L12; Provisional	140
185450	PTZ00106	PTZ00106	60S ribosomal protein L30; Provisional	108
240270	PTZ00107	PTZ00107	hexokinase; Provisional	464
240271	PTZ00108	PTZ00108	DNA topoisomerase 2-like protein; Provisional	1388
240272	PTZ00109	PTZ00109	DNA gyrase subunit b; Provisional	903
240273	PTZ00110	PTZ00110	helicase; Provisional	545
173403	PTZ00111	PTZ00111	DNA replication licensing factor MCM4; Provisional	915
240274	PTZ00112	PTZ00112	origin recognition complex 1 protein; Provisional	1164
240275	PTZ00113	PTZ00113	proliferating cell nuclear antigen; Provisional	275
185455	PTZ00114	PTZ00114	Heat shock protein 60; Provisional	555
240276	PTZ00115	PTZ00115	40S ribosomal protein S12; Provisional	290
173408	PTZ00116	PTZ00116	signal peptidase; Provisional	185
173409	PTZ00117	PTZ00117	malate dehydrogenase; Provisional	319
240277	PTZ00118	PTZ00118	40S ribosomal protein S4; Provisional	262
240278	PTZ00119	PTZ00119	40S ribosomal protein S15; Provisional	302
185458	PTZ00120	PTZ00120	D-tyrosyl-tRNA(Tyr) deacylase; Provisional	154
173412	PTZ00121	PTZ00121	MAEBL; Provisional	2084
240279	PTZ00122	PTZ00122	phosphoglycerate mutase; Provisional	299
240280	PTZ00123	PTZ00123	phosphoglycerate mutase like-protein; Provisional	236
173415	PTZ00124	PTZ00124	adenosine deaminase; Provisional	362
240281	PTZ00125	PTZ00125	ornithine aminotransferase-like protein; Provisional	400
240282	PTZ00126	PTZ00126	tyrosyl-tRNA synthetase; Provisional	383
240283	PTZ00127	PTZ00127	cytochrome c oxidase assembly protein; Provisional	403
185464	PTZ00128	PTZ00128	cytochrome c oxidase assembly protein-like; Provisional	232
185465	PTZ00129	PTZ00129	40S ribosomal protein S14; Provisional	149
185466	PTZ00130	PTZ00130	heat shock protein 90; Provisional	814
185467	PTZ00131	PTZ00131	glycophorin-binding protein; Provisional	413
240284	PTZ00132	PTZ00132	GTP-binding nuclear protein Ran; Provisional	215
173423	PTZ00133	PTZ00133	ADP-ribosylation factor; Provisional	182
185469	PTZ00134	PTZ00134	40S ribosomal protein S18; Provisional	154
240285	PTZ00135	PTZ00135	60S acidic ribosomal protein P0; Provisional	310
185471	PTZ00136	PTZ00136	eukaryotic translation initiation factor 6-like protein; Provisional	247
173427	PTZ00137	PTZ00137	2-Cys peroxiredoxin; Provisional	261
185472	PTZ00138	PTZ00138	small nuclear ribonucleoprotein; Provisional	89
240286	PTZ00139	PTZ00139	Succinate dehydrogenase [ubiquinone] flavoprotein subunit; Provisional	617
173430	PTZ00140	PTZ00140	sexual stage antigen s45/48; Provisional	447
185474	PTZ00141	PTZ00141	elongation factor 1-alpha; Provisional	446
240287	PTZ00142	PTZ00142	6-phosphogluconate dehydrogenase; Provisional	470
240288	PTZ00143	PTZ00143	deoxyuridine 5'-triphosphate nucleotidohydrolase; Provisional	155
240289	PTZ00144	PTZ00144	dihydrolipoamide succinyltransferase; Provisional	418
240290	PTZ00145	PTZ00145	phosphoribosylpyrophosphate synthetase; Provisional	439
240291	PTZ00146	PTZ00146	fibrillarin; Provisional	293
140176	PTZ00147	PTZ00147	plasmepsin-1; Provisional	453
240292	PTZ00148	PTZ00148	40S ribosomal protein S8; Provisional	205
240293	PTZ00149	PTZ00149	hypoxanthine phosphoribosyltransferase; Provisional	241
240294	PTZ00150	PTZ00150	phosphoglucomutase-2-like protein; Provisional	584
173440	PTZ00151	PTZ00151	translationally controlled tumor-like protein; Provisional	172
173441	PTZ00152	PTZ00152	cofilin/actin-depolymerizing factor 1-like protein; Provisional	122
173442	PTZ00153	PTZ00153	lipoamide dehydrogenase; Provisional	659
240295	PTZ00154	PTZ00154	40S ribosomal protein S17; Provisional	134
185484	PTZ00155	PTZ00155	40S ribosomal protein S9; Provisional	181
185485	PTZ00156	PTZ00156	60S ribosomal protein L11; Provisional	172
240296	PTZ00157	PTZ00157	60S ribosomal protein L36a; Provisional	84
185487	PTZ00158	PTZ00158	40S ribosomal protein S15A; Provisional	130
240297	PTZ00159	PTZ00159	60S ribosomal protein L32; Provisional	133
185489	PTZ00160	PTZ00160	60S ribosomal protein L27a; Provisional	147
240298	PTZ00162	PTZ00162	DNA-directed RNA polymerase II subunit 7; Provisional	176
185490	PTZ00163	PTZ00163	hypothetical protein; Provisional	230
240299	PTZ00164	PTZ00164	bifunctional dihydrofolate reductase-thymidylate synthase; Provisional	514
240300	PTZ00165	PTZ00165	aspartyl protease; Provisional	482
240301	PTZ00166	PTZ00166	DNA polymerase delta catalytic subunit; Provisional	1054
185493	PTZ00167	PTZ00167	RNA polymerase subunit 8c; Provisional	144
185494	PTZ00168	PTZ00168	mitochondrial carrier protein; Provisional	259
240302	PTZ00169	PTZ00169	ADP/ATP transporter on adenylate translocase; Provisional	300
240303	PTZ00170	PTZ00170	D-ribulose-5-phosphate 3-epimerase; Provisional	228
240304	PTZ00171	PTZ00171	acyl carrier protein; Provisional	148
185497	PTZ00172	PTZ00172	40S ribosomal protein S26; Provisional	108
185498	PTZ00173	PTZ00173	60S ribosomal protein L10; Provisional	213
240305	PTZ00174	PTZ00174	phosphomannomutase; Provisional	247
185500	PTZ00175	PTZ00175	diphthine synthase; Provisional	270
140204	PTZ00176	PTZ00176	erythrocyte membrane protein 1 (PfEMP1); Provisional	1317
240306	PTZ00178	PTZ00178	60S ribosomal protein L17; Provisional	181
140206	PTZ00179	PTZ00179	60S ribosomal protein L9; Provisional	189
173464	PTZ00180	PTZ00180	60S ribosomal protein L8; Provisional	260
140208	PTZ00181	PTZ00181	60S ribosomal protein L38; Provisional	82
185502	PTZ00182	PTZ00182	3-methyl-2-oxobutanoate dehydrogenase; Provisional	355
185503	PTZ00183	PTZ00183	centrin; Provisional	158
185504	PTZ00184	PTZ00184	calmodulin; Provisional	149
140212	PTZ00185	PTZ00185	ATPase alpha subunit; Provisional	574
140213	PTZ00186	PTZ00186	heat shock 70 kDa precursor protein; Provisional	657
240307	PTZ00187	PTZ00187	succinyl-CoA synthetase alpha subunit; Provisional	317
240308	PTZ00188	PTZ00188	adrenodoxin reductase; Provisional	506
240309	PTZ00189	PTZ00189	60S ribosomal protein L21; Provisional	160
140217	PTZ00190	PTZ00190	60S ribosomal protein L29; Provisional	70
185507	PTZ00191	PTZ00191	60S ribosomal protein L23a; Provisional	145
173472	PTZ00192	PTZ00192	60S ribosomal protein L13; Provisional	218
140220	PTZ00193	PTZ00193	60S ribosomal protein L31; Provisional	188
185508	PTZ00194	PTZ00194	60S ribosomal protein L26; Provisional	143
140222	PTZ00195	PTZ00195	60S ribosomal protein L18; Provisional	198
185509	PTZ00196	PTZ00196	60S ribosomal protein L36; Provisional	98
185510	PTZ00197	PTZ00197	60S ribosomal protein L28; Provisional	146
173474	PTZ00198	PTZ00198	60S ribosomal protein L22; Provisional	122
185511	PTZ00199	PTZ00199	high mobility group protein; Provisional	94
240310	PTZ00200	PTZ00200	cysteine proteinase; Provisional	448
240311	PTZ00201	PTZ00201	amastin surface glycoprotein; Provisional	192
240312	PTZ00202	PTZ00202	tuzin; Provisional	550
185513	PTZ00203	PTZ00203	cathepsin L protease; Provisional	348
140231	PTZ00204	PTZ00204	hypothetical protein; Provisional	120
140232	PTZ00205	PTZ00205	DNA polymerase kappa; Provisional	571
240313	PTZ00206	PTZ00206	amino acid transporter; Provisional	467
140234	PTZ00207	PTZ00207	hypothetical protein; Provisional	591
240314	PTZ00208	PTZ00208	65 kDa invariant surface glycoprotein; Provisional	436
140236	PTZ00209	PTZ00209	retrotransposon hot spot protein; Provisional	693
140237	PTZ00210	PTZ00210	UDP-GlcNAc-dependent glycosyltransferase; Provisional	382
240315	PTZ00211	PTZ00211	ribonucleoside-diphosphate reductase small subunit; Provisional	330
185514	PTZ00212	PTZ00212	T-complex protein 1 subunit beta; Provisional	533
185515	PTZ00213	PTZ00213	asparagine synthetase A; Provisional	348
173479	PTZ00214	PTZ00214	high cysteine membrane protein Group 4; Provisional	800
185516	PTZ00215	PTZ00215	ribose 5-phosphate isomerase; Provisional	151
240316	PTZ00216	PTZ00216	acyl-CoA synthetase; Provisional	700
240317	PTZ00217	PTZ00217	flap endonuclease-1; Provisional	393
185518	PTZ00218	PTZ00218	40S ribosomal protein S29; Provisional	54
185519	PTZ00219	PTZ00219	Sec61 alpha subunit; Provisional	474
173484	PTZ00220	PTZ00220	Activator of HSP-90 ATPase; Provisional	132
140248	PTZ00221	PTZ00221	cyclophilin; Provisional	249
140249	PTZ00222	PTZ00222	60S ribosomal protein L7a; Provisional	263
140250	PTZ00223	PTZ00223	40S ribosomal protein S4; Provisional	273
240318	PTZ00224	PTZ00224	protein phosphatase 2C; Provisional	381
140252	PTZ00225	PTZ00225	60S ribosomal protein L10a; Provisional	214
240319	PTZ00226	PTZ00226	fumarate hydratase; Provisional	570
140254	PTZ00227	PTZ00227	variable surface protein Vir14; Provisional	418
240320	PTZ00228	PTZ00228	variable surface protein Vir21; Provisional	350
140256	PTZ00229	PTZ00229	variable surface protein Vir30; Provisional	317
240321	PTZ00230	PTZ00230	variable surface protein Vir7; Provisional	364
140258	PTZ00231	PTZ00231	variable surface protein Vir17; Provisional	385
240322	PTZ00232	PTZ00232	variable surface protein Vir27; Provisional	363
240323	PTZ00233	PTZ00233	variable surface protein Vir18; Provisional	509
240324	PTZ00234	PTZ00234	variable surface protein Vir12; Provisional	433
185521	PTZ00235	PTZ00235	DNA polymerase epsilon subunit B; Provisional	291
173487	PTZ00236	PTZ00236	mitochondrial import inner membrane translocase subunit tim17; Provisional	164
240325	PTZ00237	PTZ00237	acetyl-CoA synthetase; Provisional	647
140265	PTZ00238	PTZ00238	expression site-associated gene (ESAG); Provisional	326
173488	PTZ00239	PTZ00239	serine/threonine protein phosphatase 2A; Provisional	303
140267	PTZ00240	PTZ00240	60S ribosomal protein P0; Provisional	323
240326	PTZ00241	PTZ00241	40S ribosomal protein S11; Provisional	158
185524	PTZ00242	PTZ00242	protein tyrosine phosphatase; Provisional	166
240327	PTZ00243	PTZ00243	ABC transporter; Provisional	1560
140271	PTZ00244	PTZ00244	serine/threonine-protein phosphatase PP1; Provisional	294
140272	PTZ00245	PTZ00245	ubiquitin activating enzyme; Provisional	287
173491	PTZ00246	PTZ00246	proteasome subunit alpha; Provisional	253
240328	PTZ00247	PTZ00247	adenosine kinase; Provisional	345
240329	PTZ00248	PTZ00248	eukaryotic translation initiation factor 2 subunit 1; Provisional	319
140276	PTZ00249	PTZ00249	variable surface protein Vir28; Provisional	516
140277	PTZ00250	PTZ00250	variable surface protein Vir23; Provisional	350
140278	PTZ00251	PTZ00251	fatty acid elongase; Provisional	272
240330	PTZ00252	PTZ00252	histone H2A; Provisional	134
140280	PTZ00253	PTZ00253	tryparedoxin peroxidase; Provisional	199
240331	PTZ00254	PTZ00254	40S ribosomal protein SA; Provisional	249
240332	PTZ00255	PTZ00255	60S ribosomal protein L37a; Provisional	90
173495	PTZ00256	PTZ00256	glutathione peroxidase; Provisional	183
240333	PTZ00257	PTZ00257	Glycoprotein GP63 (leishmanolysin); Provisional	622
240334	PTZ00258	PTZ00258	GTP-binding protein; Provisional	390
240335	PTZ00259	PTZ00259	endonuclease G; Provisional	434
240336	PTZ00260	PTZ00260	dolichyl-phosphate beta-glucosyltransferase; Provisional	333
240337	PTZ00261	PTZ00261	acyltransferase; Provisional	355
240338	PTZ00262	PTZ00262	subtilisin-like protease; Provisional	639
140289	PTZ00263	PTZ00263	protein kinase A catalytic subunit; Provisional	329
173500	PTZ00264	PTZ00264	circumsporozoite-related antigen; Provisional	148
240339	PTZ00265	PTZ00265	multidrug resistance protein (mdr1); Provisional	1466
173502	PTZ00266	PTZ00266	NIMA-related protein kinase; Provisional	1021
140293	PTZ00267	PTZ00267	NIMA-related protein kinase; Provisional	478
140294	PTZ00268	PTZ00268	glycosylphosphatidylinositol-specific phospholipase C; Provisional	380
140295	PTZ00269	PTZ00269	variant surface glycoprotein; Provisional	472
240340	PTZ00270	PTZ00270	variable surface protein Vir32; Provisional	333
140297	PTZ00271	PTZ00271	hypoxanthine-guanine phosphoribosyltransferase; Provisional	211
240341	PTZ00272	PTZ00272	heat shock protein 83 kDa (Hsp83); Provisional	701
140299	PTZ00273	PTZ00273	oxidase reductase; Provisional	320
140300	PTZ00274	PTZ00274	cytochrome b5 reductase; Provisional	325
185536	PTZ00275	PTZ00275	biotin-acetyl-CoA-carboxylase ligase; Provisional	285
140302	PTZ00276	PTZ00276	biotin/lipoate protein ligase; Provisional	245
240342	PTZ00278	PTZ00278	ARP2/3 complex subunit; Provisional	174
240343	PTZ00280	PTZ00280	Actin-related protein 3; Provisional	414
173506	PTZ00281	PTZ00281	actin; Provisional	376
240344	PTZ00283	PTZ00283	serine/threonine protein kinase; Provisional	496
140307	PTZ00284	PTZ00284	protein kinase; Provisional	467
140308	PTZ00285	PTZ00285	glucosamine-6-phosphate isomerase; Provisional	253
185539	PTZ00286	PTZ00286	6-phospho-1-fructokinase; Provisional	459
240345	PTZ00287	PTZ00287	6-phosphofructokinase; Provisional	1419
240346	PTZ00288	PTZ00288	glucokinase 1; Provisional	405
240347	PTZ00290	PTZ00290	galactokinase; Provisional	468
185541	PTZ00292	PTZ00292	ribokinase; Provisional	326
185542	PTZ00293	PTZ00293	thymidine kinase; Provisional	211
240348	PTZ00294	PTZ00294	glycerol kinase-like protein; Provisional	504
240349	PTZ00295	PTZ00295	glucosamine-fructose-6-phosphate aminotransferase; Provisional	640
240350	PTZ00296	PTZ00296	choline kinase; Provisional	442
140318	PTZ00297	PTZ00297	pantothenate kinase; Provisional	1452
240351	PTZ00298	PTZ00298	mevalonate kinase; Provisional	328
140320	PTZ00299	PTZ00299	homoserine kinase; Provisional	336
140321	PTZ00300	PTZ00300	pyruvate kinase; Provisional	454
140322	PTZ00301	PTZ00301	uridine kinase; Provisional	210
240352	PTZ00302	PTZ00302	N-acetylglucosamine-phosphate mutase; Provisional	585
140324	PTZ00303	PTZ00303	phosphatidylinositol kinase; Provisional	1374
185547	PTZ00304	PTZ00304	NADH dehydrogenase [ubiquinone] flavoprotein 1; Provisional	461
140326	PTZ00305	PTZ00305	NADH:ubiquinone oxidoreductase; Provisional	297
140327	PTZ00306	PTZ00306	NADH-dependent fumarate reductase; Provisional	1167
140328	PTZ00307	PTZ00307	ethanolamine phosphotransferase; Provisional	417
140329	PTZ00308	PTZ00308	ethanolamine-phosphate cytidylyltransferase; Provisional	353
240353	PTZ00309	PTZ00309	glucose-6-phosphate 1-dehydrogenase; Provisional	542
240354	PTZ00310	PTZ00310	AMP deaminase; Provisional	1453
185549	PTZ00311	PTZ00311	phosphoenolpyruvate carboxykinase; Provisional	561
140333	PTZ00312	PTZ00312	inositol-1,4,5-triphosphate 5-phosphatase; Provisional	356
140334	PTZ00313	PTZ00313	inosine-adenosine-guanosine-nucleoside hydrolase; Provisional	326
240355	PTZ00314	PTZ00314	inosine-5'-monophosphate dehydrogenase; Provisional	495
240356	PTZ00315	PTZ00315	2'-phosphotransferase; Provisional	582
140337	PTZ00316	PTZ00316	profilin; Provisional	150
240357	PTZ00317	PTZ00317	NADP-dependent malic enzyme; Provisional	559
185553	PTZ00318	PTZ00318	NADH dehydrogenase-like protein; Provisional	424
173521	PTZ00319	PTZ00319	NADH-cytochrome B5 reductase; Provisional	300
140341	PTZ00320	PTZ00320	ribosomal protein L14; Provisional	188
240358	PTZ00321	PTZ00321	ribosomal protein L11; Provisional	342
140343	PTZ00322	PTZ00322	6-phosphofructo-2-kinase/fructose-2,6-bisphosphatase; Provisional	664
185554	PTZ00323	PTZ00323	NAD+ synthase; Provisional	294
240359	PTZ00324	PTZ00324	glutamate dehydrogenase 2; Provisional	1002
240360	PTZ00325	PTZ00325	malate dehydrogenase; Provisional	321
240361	PTZ00326	PTZ00326	phenylalanyl-tRNA synthetase alpha chain; Provisional	494
240362	PTZ00327	PTZ00327	eukaryotic translation initiation factor 2 gamma subunit; Provisional	460
140349	PTZ00328	PTZ00328	eukaryotic initiation factor 5a; Provisional	166
185558	PTZ00329	PTZ00329	eukaryotic translation initiation factor 1A; Provisional	155
140351	PTZ00330	PTZ00330	acetyltransferase; Provisional	147
240363	PTZ00331	PTZ00331	alpha/beta hydrolase; Provisional	212
240364	PTZ00332	PTZ00332	paraflagellar rod protein; Provisional	589
240365	PTZ00333	PTZ00333	triosephosphate isomerase; Provisional	255
240366	PTZ00334	PTZ00334	trans-sialidase; Provisional	780
185562	PTZ00335	PTZ00335	tubulin alpha chain; Provisional	448
185563	PTZ00337	PTZ00337	surface protease GP63; Provisional	567
240367	PTZ00338	PTZ00338	dimethyladenosine transferase-like protein; Provisional	294
240368	PTZ00339	PTZ00339	UDP-N-acetylglucosamine pyrophosphorylase; Provisional	482
240369	PTZ00340	PTZ00340	O-sialoglycoprotein endopeptidase-like protein; Provisional	345
173534	PTZ00341	PTZ00341	Ring-infected erythrocyte surface antigen; Provisional	1136
240370	PTZ00342	PTZ00342	acyl-CoA synthetase; Provisional	746
240371	PTZ00343	PTZ00343	triose or hexose phosphate/phosphate translocator; Provisional	350
240372	PTZ00344	PTZ00344	pyridoxal kinase; Provisional	296
240373	PTZ00345	PTZ00345	glycerol-3-phosphate dehydrogenase; Provisional	365
240374	PTZ00346	PTZ00346	histone deacetylase; Provisional	429
240375	PTZ00347	PTZ00347	phosphomethylpyrimidine kinase; Provisional	504
173541	PTZ00348	PTZ00348	tyrosyl-tRNA synthetase; Provisional	682
185571	PTZ00349	PTZ00349	dehydrodolichyl diphosphate synthetase; Provisional	322
240376	PTZ00350	PTZ00350	adenylosuccinate synthetase; Provisional	436
173544	PTZ00351	PTZ00351	adenylosuccinate synthetase; Provisional	710
240377	PTZ00352	PTZ00352	60S ribosomal protein L13; Provisional	212
173546	PTZ00353	PTZ00353	glycosomal glyceraldehyde-3-phosphate dehydrogenase; Provisional	342
173547	PTZ00354	PTZ00354	alcohol dehydrogenase; Provisional	334
173548	PTZ00355	PTZ00355	Rhoptry-associated protein 2; Provisional	400
185573	PTZ00356	PTZ00356	peptidyl-prolyl cis-trans isomerase (PPIase); Provisional	115
173550	PTZ00357	PTZ00357	methyltransferase; Provisional	1072
240378	PTZ00358	PTZ00358	hypothetical protein; Provisional	367
173552	PTZ00359	PTZ00359	hypothetical protein; Provisional	443
240379	PTZ00360	PTZ00360	sexual stage antigen; Provisional	543
185575	PTZ00361	PTZ00361	26S proteasome regulatory subunit 4-like protein; Provisional	438
240380	PTZ00362	PTZ00362	hypothetical protein; Provisional	479
185577	PTZ00363	PTZ00363	rab-GDP dissociation inhibitor; Provisional	443
240381	PTZ00364	PTZ00364	dipeptidyl-peptidase I precursor; Provisional	548
240382	PTZ00365	PTZ00365	60S ribosomal protein L7Ae-like; Provisional	266
240383	PTZ00366	PTZ00366	Surface antigen (SAG) superfamily; Provisional	392
240384	PTZ00367	PTZ00367	squalene epoxidase; Provisional	567
173561	PTZ00368	PTZ00368	universal minicircle sequence binding protein (UMSBP); Provisional	148
240385	PTZ00369	PTZ00369	Ras-like protein; Provisional	189
240386	PTZ00370	PTZ00370	STEVOR; Provisional	296
240387	PTZ00371	PTZ00371	aspartyl aminopeptidase; Provisional	465
240388	PTZ00372	PTZ00372	endonuclease 4-like protein; Provisional	413
185582	PTZ00373	PTZ00373	60S Acidic ribosomal protein P2; Provisional	112
240389	PTZ00374	PTZ00374	dihydroxyacetone phosphate acyltransferase; Provisional	1108
185583	PTZ00375	PTZ00375	dihydroxyacetone kinase-like protein; Provisional	584
240390	PTZ00376	PTZ00376	aspartate aminotransferase; Provisional	404
240391	PTZ00377	PTZ00377	alanine aminotransferase; Provisional	481
173571	PTZ00378	PTZ00378	hypothetical protein; Provisional	518
173572	PTZ00380	PTZ00380	microtubule-associated protein (MAP); Provisional	121
240392	PTZ00381	PTZ00381	aldehyde dehydrogenase family protein; Provisional	493
173574	PTZ00382	PTZ00382	Variant-specific surface protein (VSP); Provisional	96
240393	PTZ00383	PTZ00383	malate:quinone oxidoreductase; Provisional	497
173576	PTZ00384	PTZ00384	choline kinase; Provisional	383
185588	PTZ00385	PTZ00385	lysyl-tRNA synthetase; Provisional	659
240394	PTZ00386	PTZ00386	formyl tetrahydrofolate synthetase; Provisional	625
240395	PTZ00387	PTZ00387	epsilon tubulin; Provisional	465
240396	PTZ00388	PTZ00388	40S ribosomal protein S8-like; Provisional	223
185592	PTZ00389	PTZ00389	40S ribosomal protein S7; Provisional	184
240397	PTZ00390	PTZ00390	ubiquitin-conjugating enzyme; Provisional	152
240398	PTZ00391	PTZ00391	transport protein particle component (TRAPP) superfamily; Provisional	168
240399	PTZ00393	PTZ00393	protein tyrosine phosphatase; Provisional	241
173585	PTZ00394	PTZ00394	glucosamine-fructose-6-phosphate aminotransferase; Provisional	670
185594	PTZ00395	PTZ00395	Sec24-related protein; Provisional	1560
240400	PTZ00396	PTZ00396	Casein kinase II subunit beta; Provisional	251
240401	PTZ00397	PTZ00397	macrophage migration inhibition factor-like protein; Provisional	116
173589	PTZ00398	PTZ00398	phosphoenolpyruvate carboxylase; Provisional	974
240402	PTZ00399	PTZ00399	cysteinyl-tRNA-synthetase; Provisional	651
240403	PTZ00400	PTZ00400	DnaK-type molecular chaperone; Provisional	663
173592	PTZ00401	PTZ00401	aspartyl-tRNA synthetase; Provisional	550
240404	PTZ00402	PTZ00402	glutamyl-tRNA synthetase; Provisional	601
173594	PTZ00403	PTZ00403	phosphatidylserine decarboxylase; Provisional	353
173595	PTZ00404	PTZ00404	cytochrome P450; Provisional	482
173596	PTZ00405	PTZ00405	cytochrome c; Provisional	114
173597	PTZ00407	PTZ00407	DNA topoisomerase IA; Provisional	805
240405	PTZ00408	PTZ00408	NAD-dependent deacetylase; Provisional	242
173599	PTZ00409	PTZ00409	Sir2 (Silent Information Regulator) protein; Provisional	271
185600	PTZ00410	PTZ00410	NAD-dependent SIR2; Provisional	349
240406	PTZ00411	PTZ00411	transaldolase-like protein; Provisional	333
240407	PTZ00412	PTZ00412	leucyl aminopeptidase; Provisional	569
240408	PTZ00413	PTZ00413	lipoate synthase; Provisional	398
173604	PTZ00414	PTZ00414	10 kDa heat shock protein; Provisional	100
185603	PTZ00415	PTZ00415	transmission-blocking target antigen s230; Provisional	2849
240409	PTZ00416	PTZ00416	elongation factor 2; Provisional	836
173607	PTZ00417	PTZ00417	lysine-tRNA ligase; Provisional	585
240410	PTZ00418	PTZ00418	Poly(A) polymerase; Provisional	593
240411	PTZ00419	PTZ00419	valyl-tRNA synthetase-like protein; Provisional	995
240412	PTZ00420	PTZ00420	coronin; Provisional	568
173611	PTZ00421	PTZ00421	coronin; Provisional	493
185607	PTZ00422	PTZ00422	glideosome-associated protein 50; Provisional	394
240413	PTZ00423	PTZ00423	glideosome-associated protein 45; Provisional	193
185609	PTZ00424	PTZ00424	helicase 45; Provisional	401
240414	PTZ00425	PTZ00425	asparagine-tRNA ligase; Provisional	586
173616	PTZ00426	PTZ00426	cAMP-dependent protein kinase catalytic subunit; Provisional	340
173617	PTZ00427	PTZ00427	isoleucine-tRNA ligase, putative; Provisional	1205
185611	PTZ00428	PTZ00428	60S ribosomal protein L4; Provisional	381
240415	PTZ00429	PTZ00429	beta-adaptin; Provisional	746
185612	PTZ00430	PTZ00430	glucose-6-phosphate isomerase; Provisional	552
173621	PTZ00431	PTZ00431	pyrroline carboxylate reductase; Provisional	260
240416	PTZ00432	PTZ00432	falcilysin; Provisional	1119
185613	PTZ00433	PTZ00433	tyrosine aminotransferase; Provisional	412
185614	PTZ00434	PTZ00434	cytosolic glyceraldehyde 3-phosphate dehydrogenase; Provisional	361
240417	PTZ00435	PTZ00435	isocitrate dehydrogenase; Provisional	413
185616	PTZ00436	PTZ00436	60S ribosomal protein L19-like protein; Provisional	357
240418	PTZ00437	PTZ00437	glutaminyl-tRNA synthetase; Provisional	574
185618	PTZ00438	PTZ00438	gamete antigen 27/25-like protein; Provisional	374
240419	PTZ00440	PTZ00440	reticulocyte binding protein 2-like protein; Provisional	2722
240420	PTZ00441	PTZ00441	sporozoite surface protein 2 (SSP2); Provisional	576
185621	PTZ00442	PTZ00442	sexual stage antigen s48/45-like protein; Provisional	347
185622	PTZ00443	PTZ00443	Thioredoxin domain-containing protein; Provisional	224
185623	PTZ00444	PTZ00444	hypothetical protein; Provisional	184
240421	PTZ00445	PTZ00445	p36-like protein; Provisional	219
240422	PTZ00446	PTZ00446	vacuolar sorting protein SNF7-like; Provisional	191
185626	PTZ00447	PTZ00447	apical membrane antigen 1-like protein; Provisional	508
185627	PTZ00448	PTZ00448	hypothetical protein; Provisional	373
185628	PTZ00449	PTZ00449	104 kDa microneme/rhoptry antigen; Provisional	943
185629	PTZ00450	PTZ00450	macrophage migration inhibitory factor-like protein; Provisional	113
185630	PTZ00451	PTZ00451	dephospho-CoA kinase; Provisional	244
185631	PTZ00452	PTZ00452	actin; Provisional	375
185632	PTZ00453	PTZ00453	cyclin-dependent kinase; Provisional	96
240423	PTZ00454	PTZ00454	26S protease regulatory subunit 6B-like protein; Provisional	398
240424	PTZ00455	PTZ00455	3-ketoacyl-CoA thiolase; Provisional	438
185635	PTZ00456	PTZ00456	acyl-CoA dehydrogenase; Provisional	622
185636	PTZ00457	PTZ00457	acyl-CoA dehydrogenase; Provisional	520
185637	PTZ00458	PTZ00458	acyl CoA binding protein; Provisional	90
185638	PTZ00459	PTZ00459	mucin-associated surface protein (MASP); Provisional	291
185639	PTZ00460	PTZ00460	acyl-CoA dehydrogenase; Provisional	646
185640	PTZ00461	PTZ00461	isovaleryl-CoA dehydrogenase; Provisional	410
185641	PTZ00462	PTZ00462	Serine-repeat antigen protein; Provisional	1004
185642	PTZ00463	PTZ00463	histone H2B; Provisional	117
240425	PTZ00464	PTZ00464	SNF-7-like protein; Provisional	211
185644	PTZ00465	PTZ00465	rhoptry-associated protein 1 (RAP-1); Provisional	565
240426	PTZ00466	PTZ00466	actin-like protein; Provisional	380
185646	PTZ00467	PTZ00467	40S ribosomal protein S30; Provisional	66
185647	PTZ00468	PTZ00468	phosphofructokinase family protein; Provisional	1328
185648	PTZ00469	PTZ00469	60S ribosomal subunit protein L18; Provisional	187
240427	PTZ00470	PTZ00470	glycoside hydrolase family 47 protein; Provisional	522
240428	PTZ00471	PTZ00471	60S ribosomal protein L27; Provisional	134
240429	PTZ00472	PTZ00472	serine carboxypeptidase (CBP1); Provisional	462
240430	PTZ00473	PTZ00473	Plasmodium Vir superfamily; Provisional	420
240431	PTZ00474	PTZ00474	tryptophan/threonine-rich antigen superfamily; Provisional	316
185654	PTZ00475	PTZ00475	RESA-like protein; Provisional	282
240432	PTZ00477	PTZ00477	rhoptry-associated protein; Provisional	524
185656	PTZ00478	PTZ00478	Sec superfamily; Provisional	81
185657	PTZ00479	PTZ00479	RAP Superfamily; Provisional	435
185658	PTZ00480	PTZ00480	serine/threonine-protein phosphatase; Provisional	320
185659	PTZ00481	PTZ00481	Membrane attack complex/ Perforin (MACPF) Superfamily; Provisional	524
240433	PTZ00482	PTZ00482	membrane-attack complex/perforin (MACPF) Superfamily; Provisional	844
185661	PTZ00483	PTZ00483	proliferating cell nuclear antigen; Provisional	264
240434	PTZ00484	PTZ00484	GTP cyclohydrolase I; Provisional	259
240435	PTZ00485	PTZ00485	aldolase 1-epimerase; Provisional	376
240436	PTZ00486	PTZ00486	apyrase Superfamily; Provisional	352
240437	PTZ00487	PTZ00487	ceramidase; Provisional	715
185666	PTZ00488	PTZ00488	Proteasome subunit beta type-5; Provisional	247
240438	PTZ00489	PTZ00489	glutamate 5-kinase; Provisional	264
185668	PTZ00490	PTZ00490	Ferredoxin superfamily; Provisional	143
240439	PTZ00491	PTZ00491	major vault protein; Provisional	850
240440	PTZ00493	PTZ00493	phosphomethylpyrimidine kinase; Provisional	321
185671	PTZ00494	PTZ00494	tuzin-like protein; Provisional	664
272847	TIGR00001	rpmI_bact	ribosomal protein L35. This ribosomal protein is found in bacteria and organelles only. It is not closely related to any eukaryotic or archaeal ribosomal protein. [Protein synthesis, Ribosomal proteins: synthesis and modification]	63
272848	TIGR00002	S16	ribosomal protein S16. This model describes ribosomal S16 of bacteria and organelles. [Protein synthesis, Ribosomal proteins: synthesis and modification]	78
188014	TIGR00003	TIGR00003	copper ion binding protein. This model describes an apparently copper-specific subfamily of the metal-binding domain HMA (pfam00403). Closely related sequences outside this model include mercury resistance proteins and repeated domains of eukaryotic copper transport proteins. Members of this family are strictly prokaryotic. The model identifies both small proteins consisting of just this domain and N-terminal regions of cation (probably copper) transporting ATPases. [Transport and binding proteins, Cations and iron carrying compounds]	66
129116	TIGR00004	TIGR00004	reactive intermediate/imine deaminase. This protein was described initially as an inhibitor of protein synthesis initiation, then as an endoribonuclease active on single-stranded mRNA, endoribonuclease L-PSP. Members of this family, conserved in all domains of life and often with several members per bacterial genome, appear to catalyze a reaction that minimizes toxic by-products from reactions catalyzed by pyridoxal phosphate-dependent enzymes. [Cellular processes, Other]	124
161659	TIGR00005	rluA_subfam	pseudouridine synthase, RluA family. In E. coli, RluD (SfhB) modifies uridine to pseudouridine at 23S RNA U1911, 1915, and 1917, RluC modifies 955, 2504 and 2580, and RluA modifies U746 and tRNA U32. An additional homolog from E. coli outside this family, TruC (SP|Q46918), modifies uracil-65 in transfer RNAs to pseudouridine. [Protein synthesis, tRNA and rRNA base modification]	299
272849	TIGR00006	TIGR00006	16S rRNA (cytosine(1402)-N(4))-methyltransferase. This model describes RsmH, a 16S rRNA methyltransferase. Previously, this gene was designated MraW, known to be essential in E. coli and widely conserved in bacteria. [Protein synthesis, tRNA and rRNA base modification]	307
272850	TIGR00007	TIGR00007	phosphoribosylformimino-5-aminoimidazole carboxamide ribotide isomerase. This protein family consists of HisA, phosphoribosylformimino-5-aminoimidazole carboxamide ribotide isomerase, the enzyme catalyzing the fourth step in histidine biosynthesis. It is closely related to the enzyme HisF for the sixth step. Examples of this enzyme in Actinobacteria have been found to be bifunctional, also possessing phosphoribosylanthranilate isomerase activity; the trusted cutoff here has now been raised to 275.0 to exclude the bifunctional group, now represented by model TIGR01919. HisA from Lactococcus lactis was reported to be inactive (MEDLINE:93322317). [Amino acid biosynthesis, Histidine family]	230
188015	TIGR00008	infA	translation initiation factor IF-1. This family consists of translation initiation factor IF-1 as found in bacteria and chloroplasts. This protein, about 70 residues in length, consists largely of an S1 RNA binding domain (pfam00575). [Protein synthesis, Translation factors]	69
272851	TIGR00009	L28	ribosomal protein L28. This model describes bacterial and chloroplast forms of the 50S ribosomal protein L28, a polypeptide about 60 amino acids in length. Mitochondrial homologs differ substantially in architecture (e.g. SP|P36525 from Saccharomyces cerevisiae, which is 258 amino acids long) and are not included. [Protein synthesis, Ribosomal proteins: synthesis and modification]	56
272852	TIGR00010	TIGR00010	hydrolase, TatD family. PSI-BLAST, starting with a urease alpha subunit, finds a large superfamily of proteins, including a number of different enzymes that act as hydrolases at C-N bonds other than peptide bonds (EC 3.5.-.-), many uncharacterized proteins, and the members of this family. Several genomes have multiple paralogs related to this family. However, a set of 17 proteins can be found, one each from 17 of the first 20 genomes, such that each member forms a bidirectional best hit across genomes with all other members of the set. This core set (and one other near-perfect member), but not the other paralogs, form the seed for this model. Additionally, members of the seed alignment and all trusted hits, but not all paralogs, have a conserved motif DxHxH near the amino end. The member from E. coli was recently shown to have DNase activity. [Unknown function, Enzymes of unknown specificity]	252
272853	TIGR00011	YbaK_EbsC	Cys-tRNA(Pro) deacylase. This model represents the YbaK family, bacterial proteins whose full length sequence is homologous to an insertion domain in proline--tRNA ligases. The domain deacylates mischarged tRNAs. The YbaK protein of Haemophilus influenzae (HI1434) likewise deacylates Ala-tRNA(Pro), but not the correctly charged Pro-tRNA(Pro). A crystallographic study of HI1434 suggests a nucleotide binding function. Previously, a member of this family was described as EbsC and was thought to be involved in cell wall metabolism. [Protein synthesis, tRNA aminoacylation]	152
272854	TIGR00012	L29	ribosomal protein L29. This model describes a ribosomal large subunit protein, called L29 in prokaryotic (50S) large subunits and L35 in eukaryotic (60S) large subunits. [Protein synthesis, Ribosomal proteins: synthesis and modification]	55
129125	TIGR00013	taut	4-oxalocrotonate tautomerase family enzyme. 4-oxalocrotonate tautomerase is a homohexamer in which each monomer is very small, at about 62 amino acids. Pro-1 of the mature protein serves as a general base. The enzyme functions in meta-cleavage pathways of aromatic hydrocarbon catabolism. Because several Arg residues located near the active site in the crystal structure of Pseudomonas putida are not conserved among all members of this family, because the literature describes a general role in the isomerization of beta,gamma-unsaturated enones to their alpha,beta-isomers, and because of the presence of fairly distantly related paralogs in Campylobacter jejuni, the family is regarded as not necessarily uniform in function. [Energy metabolism, Other]	63
272855	TIGR00014	arsC	arsenate reductase (glutaredoxin). This model describes a distinct clade, including ArsC itself, of the broader ArsC family described by Pfam pfam03960. This clade is almost completely restricted to the Proteobacteria. An anion-translocating ATPase has been identified as the product of the arsenical resistance operon of resistance plasmid R773. When expressed in Escherichia coli this ATP-driven oxyanion pump catalyses extrusion of the oxyanions arsenite, antimonite and arsenate. The pump is composed of two polypeptides, the products of the arsA and arsB genes. The pump alone produces resistance to arsenite and antimonite. This protein, ArsC, catalyzes the reduction of arsenate to arsenite, and thus extends resistance to include arsenate. [Cellular processes, Detoxification]	114
272856	TIGR00016	ackA	acetate kinase. Acetate kinase is involved in the activation of acetate to acetyl CoA and in the secretion of acetate. It catalyzes the reaction ATP + acetate = ADP + acetyl phosphate. Some members of this family have been shown to act on propionate as well as acetate. An example of a propionate/acetate kinase is TdcD of E. coli (SP|P11868), an enzyme of an anaerobic pathway of threonine catabolism. It is not known how many members of this family act on additional substrates besides acetate. [Energy metabolism, Fermentation]	404
129128	TIGR00017	cmk	cytidylate kinase. This family consists of cytidylate kinase, which catalyzes the phosphorylation of cytidine 5'-monophosphate (dCMP) to cytidine 5'-diphosphate (dCDP) in the presence of ATP or GTP. UMP and dCMP can also act as acceptors. [Purines, pyrimidines, nucleosides, and nucleotides, Nucleotide and nucleoside interconversions]	217
272857	TIGR00018	panC	pantoate--beta-alanine ligase. This family is pantoate--beta-alanine ligase, the last enzyme of pantothenate biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pantothenate and coenzyme A]	282
129130	TIGR00019	prfA	peptide chain release factor 1. This model describes peptide chain release factor 1 (PrfA, RF-1), and excludes the related peptide chain release factor 2 (PrfB, RF-2). RF-1 helps recognize and terminate translation at UAA and UAG stop codons. The mitochondrial release factors are prfA-like, although not included above the trusted cutoff for this model. RF-1 does not have a translational frameshift. [Protein synthesis, Translation factors]	360
272858	TIGR00020	prfB	peptide chain release factor 2. In many but not all taxa, there is a conserved real translational frameshift at a TGA codon. RF-2 helps terminate translation at TGA codons and can therefore regulate its own production by readthrough when RF-2 is insufficient. There is a Pfam model called "RF-1" for the superfamily of RF-1, RF-2, mitochondrial, RF-H, etc. [Protein synthesis, Translation factors]	364
272859	TIGR00021	rpiA	ribose 5-phosphate isomerase. This model describes ribose 5-phosphate isomerase, an enzyme of the non-oxidative branch of the pentose phosphate pathway. [Energy metabolism, Pentose phosphate pathway]	218
129133	TIGR00022	TIGR00022	YhcH/YjgK/YiaL family protein. This family consists of conserved hypothetical proteins, about 150 amino acids in length. Members with limited information include YhcH, a possible sugar isomerase of sialic acid catabolism, and YjgK. [Unknown function, General]	142
272860	TIGR00023	TIGR00023	acyl-phosphate glycerol 3-phosphate acyltransferase. This model represents the full length of acylphosphate:glycerol 3-phosphate acyltransferase, an integral membrane protein about 200 amino acids in length, called PlsY in Streptococcus pneumoniae, YneS in Bacillus subtilis, and YgiH in E. coli. It is found in a single copy in a large number of bacteria, including the Mycoplasmas but not Mycobacteria or spirochetes, for example. Its partner is PlsX (see TIGR00182), and the pair can replace PlsB for synthesizing 1-acylglycerol-3-phosphate. [Fatty acid and phospholipid metabolism, Biosynthesis]	196
129135	TIGR00024	SbcD_rel_arch	putative phosphoesterase, SbcD/Mre11-related. Members of this uncharacterized family share a motif approximating DXH(X25)GDXXD(X25)GNHD as found in several phosphoesterases, including the nucleases SbcD and Mre11. SbcD is a subunit of the SbcCD nuclease of E. coli that can cleave DNA hairpins to unblock stalled DNA replication. All members of this family are archaeal. [Unknown function, Enzymes of unknown specificity]	225
272861	TIGR00025	Mtu_efflux	ABC transporter efflux protein, DrrB family. The seed members for this model are a paralogous family of Mycobacterium tuberculosis. Nearly all proteins scoring above the noise cutoff are from high-GC Gram-positive organisms. The members of this paralogous family of efflux proteins are all found in operons with ATP-binding chain partners. They are related to a putative daunorubicin resistance efflux protein of Streptomyces peucetius. This model represents a branch of a larger superfamily that also includes NodJ, a part of the NodIJ pair of nodulation-triggering signal efflux proteins. The members of this branch may all act in antibiotic resistance.	232
211538	TIGR00026	hi_GC_TIGR00026	deazaflavin-dependent oxidoreductase, nitroreductase family. This model represents a family of proteins found in paralogous families in the genera Mycobacterium and Streptomyces. Seven members are in Mycobacterium tuberculosis. Member protein Rv3547 has been characterized as a deazaflavin-dependent nitroreductase. [Unknown function, Enzymes of unknown specificity]	113
272862	TIGR00027	mthyl_TIGR00027	methyltransferase, TIGR00027 family. This model represents a set of probable methyltransferases, about 300 amino acids long, with essentially full length homology. Members share an N-terminal region described by Pfam model pfam02409. Included are a paralogous family of 12 proteins in Mycobacterium tuberculosis, plus close homologs in related species, a family of 8 in the archaeon Methanosarcina acetivorans, and small numbers of members in other species, including plants. [Unknown function, Enzymes of unknown specificity]	260
272863	TIGR00028	Mtu_PIN_fam	toxin-antitoxin system PIN domain toxin. Members of this protein family consist almost entirely of a PIN (PilT N terminus) domain (see pfam01850). This family was originally defined as a set of twelve closely related paralogs found in Mycobacterium tuberculosis, but additional members are now found in Synechococcus sp. WH8102, etc. Inspection of genomic regions suggests these represent toxin components of toxin-antitoxin regions, potentially important to creating dormant persister cells.	142
211539	TIGR00029	S20	ribosomal protein S20. This family consists of bacterial (and chloroplast) examples of the bacterial ribosomal small subunit protein S20. [Protein synthesis, Ribosomal proteins: synthesis and modification]	87
129141	TIGR00030	S21p	ribosomal protein S21. This model describes bacterial ribosomal protein S21 and most mitochondrial and chloroplast equivalents. [Protein synthesis, Ribosomal proteins: synthesis and modification]	58
272864	TIGR00031	UDP-GALP_mutase	UDP-galactopyranose mutase. This enzyme is involved in the conversion of UDP-GALP into UDP-GALF through a 2-keto intermediate. It contains FAD as a cofactor. The gene is known as glf, ceoA, and rfbD. It is known experimentally in E. coli, Mycobacterium tuberculosis, and Klebsiella pneumoniae. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	377
199987	TIGR00032	argG	argininosuccinate synthase. argG in bacteria, ARG1 in Saccharomyces cerevisiae. There is a very unusual clustering in the alignment, with a deep split between one cohort of E. coli, H. influenzae, and Streptomyces, and the other cohort of eukaryotes, archaea, and the rest of the eubacteria. [Amino acid biosynthesis, Glutamate family]	394
272865	TIGR00033	aroC	chorismate synthase. Homotetramer (noted in E. coli) suggests reason for good conservation. [Amino acid biosynthesis, Aromatic amino acid family]	351
129145	TIGR00034	aroFGH	phospho-2-dehydro-3-deoxyheptonate aldolase. [Amino acid biosynthesis, Aromatic amino acid family]	344
213495	TIGR00035	asp_race	aspartate racemase. Aspartate racemases and some close homologs of unknown function are related to the more common glutamate racemases, but form a distinct evolutionary branch. This model identifies members of the aspartate racemase-related subset of amino acid racemases. [Energy metabolism, Amino acids and amines]	229
129147	TIGR00036	dapB	4-hydroxy-tetrahydrodipicolinate reductase. [Amino acid biosynthesis, Aspartate family]	266
272866	TIGR00037	eIF_5A	translation elongation factor IF5A. Recent work (2009) changed the view of eIF5A in eukaryotes and aIF5A in archaea, hypusine-containing proteins, from translation initiation factor to translation elongation factor. [Protein synthesis, Translation factors]	130
272867	TIGR00038	efp	translation elongation factor P. Function: involved in peptide bond synthesis. Stimulates efficient translation and peptide-bond synthesis on native or reconstituted 70S ribosomes in vitro. Probably functions indirectly by altering the affinity of the ribosome for aminoacyl-tRNA, thus increasing their reactivity as acceptors for peptidyl transferase (by similarity). The trusted cutoff of this model is set high enough to exclude members of TIGR02178, an EFP-like protein of certain Gammaproteobacteria. [Protein synthesis, Translation factors]	184
272868	TIGR00039	6PTHBS	6-pyruvoyl tetrahydropterin synthase/QueD family protein. This model has been downgraded from hypothetical_equivalog to subfamily. The animal enzymes are known to be 6-pyruvoyl tetrahydropterin synthase. The function of the bacterial branch of the sequence lineage had been thought to be the same, but many are now taken to be QueD, an enzyme of queuosine biosynthesis. Queuosine is a hypermodified base in the wobble position of some tRNAs in most species. A new model is built to be the QueD equivalog model. [Protein synthesis, tRNA and rRNA base modification]	124
272869	TIGR00040	yfcE	phosphoesterase, MJ0936 family. Members of this largely uncharacterized family share a motif approximating DXH(X25)GDXXD(X25)GNHD as found in several phosphoesterases, including the nucleases SbcD and Mre11, and a family of uncharacterized archaeal putative phosphoesterases described by TIGR00024. In this family, the His residue in the GNHD portion of the motif is not conserved. The member MJ0936, one of two from Methanococcus jannaschii, was shown to act on model phosphodiesterase substrates; a divalent cation was required. [Unknown function, Enzymes of unknown specificity]	158
161676	TIGR00041	DTMP_kinase	dTMP kinase. Function: phosphorylation of DTMP to form DTDP in both de novo and salvage pathways of DTTP synthesis. Catalytic activity: ATP + thymidine 5'-phosphate = ADP + thymidine 5'-diphosphate. [Purines, pyrimidines, nucleosides, and nucleotides, Nucleotide and nucleoside interconversions]	195
272870	TIGR00042	TIGR00042	non-canonical purine NTP pyrophosphatase, RdgB/HAM1 family. Saccharomyces cerevisiae HAM1 protects against the mutagenic effects of the base analog 6-N-hydroxylaminopurine, which can be a natural product of monooxygenase activity on adenine. Methanococcus jannaschii MJ0226 and E. coli RdgB are also characterized as pyrophosphatases active against non-standard purine NTPs. E. coli RdgB appears to act by intercepting non-canonical deoxyribonucleotide triphosphates from replication precursor pools. [DNA metabolism, DNA replication, recombination, and repair]	184
272871	TIGR00043	TIGR00043	rRNA maturation RNase YbeY. This metalloprotein family is represented by a single member sequence only in nearly every bacterium. Crystallography demonstrated metal-binding activity, possibly to nickel. It is predicted to be a metallohydrolase, and more recently it was shown that mutants have a ribosomal RNA processing defect. [Protein synthesis, Other]	110
129155	TIGR00044	TIGR00044	pyridoxal phosphate enzyme, YggS family. Members of this protein family include YggS from Escherichia coli and YBL036C, an uncharacterized pyridoxal protein of Saccharomyces cerevisiae. [Unknown function, Enzymes of unknown specificity]	229
272872	TIGR00045	TIGR00045	glycerate kinase. The only characterized member of this family so far is the glycerate kinase GlxK (EC 2.7.1.31) of E. coli. This enzyme acts after glyoxylate carboligase and 2-hydroxy-3-oxopropionate reductase (tartronate semialdehyde reductase) in the conversion of glyoxylate to 3-phosphoglycerate (the D-glycerate pathway) as a part of allantoin degradation. [Energy metabolism, Other]	375
272873	TIGR00046	TIGR00046	RNA methyltransferase, RsmE family. Members of this protein family, previously called conserved hypothetical protein TIGR00046, include the YggJ protein of E. coli, which has now been shown to methylate U1498 in 16S rRNA. [Protein synthesis, tRNA and rRNA base modification]	240
272874	TIGR00048	rRNA_mod_RlmN	23S rRNA (adenine(2503)-C(2))-methyltransferase. Members of this family are RlmN, a 23S rRNA m2A2503 methyltransferase in the radical SAM enzyme family. Closely related is Cfr, a Staphylococcus sciuri plasmid-borne homolog to this family, which has been identified as essential to transferable resistance to chloramphenicol and florfenicol. Cfr methylates 23S RNA at a different site. [Protein synthesis, tRNA and rRNA base modification]	355
272875	TIGR00049	TIGR00049	Iron-sulfur cluster assembly accessory protein. Proteins in this subfamily appear to be associated with the process of FeS-cluster assembly. The HesB proteins are associated with the nif gene cluster and the Rhizobium gene IscN has been shown to be required for nitrogen fixation. Nitrogenase includes multiple FeS clusters and many genes for their assembly. The E. coli SufA protein is associated with SufS, a NifS homolog and SufD which are involved in the FeS cluster assembly of the FhnF protein. The Azotobacter protein IscA (homologs of which are also found in E. coli) is associated with IscS, another NifS homolog and IscU, a nifU homolog as well as other factors consistent with a role in FeS cluster chemistry. A homolog from Geobacter contains a selenocysteine in place of an otherwise invariant cysteine, further suggesting a role in redox chemistry. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	105
272876	TIGR00050	rRNA_methyl_1	RNA methyltransferase, TrmH family, group 1. This is part of the trmH (spoU) family of S-adenosyl-L-methionine (AdoMet)-dependent methyltransferases, and is now characterized, in E. coli, as a tRNA:Cm32/Um32 methyltransferase. It may be named TrMet(Xm32), or TrmJ, according to the nomenclature style chosen. [Protein synthesis, tRNA and rRNA base modification]	233
129161	TIGR00051	TIGR00051	acyl-CoA thioester hydrolase, YbgC/YbaW family. This model describes a subset of related acyl-CoA thioesterases that include several at least partially characterized proteins. YbgC is an acyl-CoA thioesterase associated with the Tol-Pal system. YbaW is part of the FadM regulon. [Unknown function, General]	117
129162	TIGR00052	TIGR00052	nudix-type nucleoside diphosphatase, YffH/AdpP family. Members of this family include proteins of about 200 amino acids, including the recently characterized nudix hydrolase YffH, shown to be highly active as a GDP-mannose pyrophosphatase. It also includes the C-terminal half of a 361-amino acid protein, TrgB from Rhodobacter sphaeroides, shown experimentally to help confer tellurite resistance. This model also hits a region near the C-terminus of a 1092-amino acid protein of C. elegans. [Unknown function, Enzymes of unknown specificity]	185
272877	TIGR00053	TIGR00053	addiction module toxin component, YafQ family. This model represents a cluster of eubacterial proteins and a cluster of archaeal proteins, all of which are uncharacterized, from 85 to 102 residues in length, and similar in sequence. These include YafQ, a ribosome-associated endoribonuclease that serves as part of a toxin-antitoxin system, for which DinJ is the antidote component. [Cellular processes, Adaptations to atypical conditions]	90
272878	TIGR00054	TIGR00054	RIP metalloprotease RseP. Members of this nearly universal bacterial protein family are regulated intramembrane proteolysis (RIP) proteases. Older and synonymous gene symbols include yaeL in E. coli, mmpA in Caulobacter crescentus, etc. This family includes a region that hits the PDZ domain, found in a number of proteins targeted to the membrane by binding to a peptide ligand. The N-terminal region of this family contains a perfectly conserved motif HEXGH as found in a number of metalloproteinases, where the Glu is the active site and the His residues coordinate the metal cation. Membership in this family is determined by a match to the full length of the seed alignment; the model also detects fragments and matches a number of members of the PEPTIDASE FAMILY S2C. The region of match appears not to overlap the active site domain. [Protein fate, Degradation of proteins, peptides, and glycopeptides]	419
129165	TIGR00055	uppS	undecaprenyl diphosphate synthase. This enzyme builds undecaprenyl diphosphate, a molecule that in bacteria is used as a carrier in synthesizing cell wall components. Alternate name: undecaprenyl pyrophosphate synthetase. Activity has been demonstrated experimentally for members of this family from Micrococcus luteus, E. coli, Haemophilus influenzae, and Streptococcus pneumoniae. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	226
129166	TIGR00056	TIGR00056	ABC transport permease subunit. This model describes a subfamily of ABC transporter permease subunits. One member of this family has been associated with the toluene tolerance phenotype of Pseudomonas putida, another with L-glutamate transport, another with maintenance of lipid asymmetry. Many bacterial species have one or two members. The Mycobacteria have large paralogous families included in the DUF140 family but excluded from this subfamily based on extreme divergence at the amino end and on phylogenetic and UPGMA trees on the more conserved regions. [Hypothetical proteins, Conserved]	259
272879	TIGR00057	TIGR00057	tRNA threonylcarbamoyl adenosine modification protein, Sua5/YciO/YrdC/YwlC family. Has paralogs, but YrdC called a tRNA modification protein. Ref 2 authors say probably heteromultimeric complex. Paralogs may mean it does the final binding to the tRNA. [Protein synthesis, tRNA and rRNA base modification]	201
129168	TIGR00058	Hemerythrin	hemerythrin family non-heme iron protein. This family includes oxygen carrier proteins of various oligomeric states from the vascular fluid (hemerythrin) and muscle (myohemerythrin) of some marine invertebrates. Each unit binds 2 non-heme Fe using 5 H, one E and one D. One member of this family, from the sandworm Nereis diversicolor, is an unusual (non-metallothionein) cadmium-binding protein. Homologous proteins, excluded from this narrowly defined family, are found in archaea and bacteria (see pfam01814).	115
272880	TIGR00059	L17	ribosomal protein L17. Eubacterial and mitochondrial. The mitochondrial form, from yeast, contains an additional 110 amino acids C-terminal to the region found by this model. [Protein synthesis, Ribosomal proteins: synthesis and modification]	111
272881	TIGR00060	L18_bact	ribosomal protein L18, bacterial type. The archaeal and eukaryotic type rpL18 is not detectable under this model. [Protein synthesis, Ribosomal proteins: synthesis and modification]	114
129171	TIGR00061	L21	ribosomal protein L21. Eubacterial and chloroplast. [Protein synthesis, Ribosomal proteins: synthesis and modification]	101
272882	TIGR00062	L27	ribosomal protein L27. Eubacterial, chloroplast, and mitochondrial. Mitochondrial members have an additional C-terminal domain. [Protein synthesis, Ribosomal proteins: synthesis and modification]	84
129173	TIGR00063	folE	GTP cyclohydrolase I. Alternate names: Punch (Drosophila). GTP cyclohydrolase I (EC 3.5.4.16) catalyzes the biosynthesis of formic acid and dihydroneopterin triphosphate from GTP. This reaction is the first step in the biosynthesis of tetrahydrofolate in prokaryotes, of tetrahydrobiopterin in vertebrates, and of pteridine-containing pigments in insects. [Biosynthesis of cofactors, prosthetic groups, and carriers, Folic acid]	180
272883	TIGR00064	ftsY	signal recognition particle-docking protein FtsY. There is a weak division between FtsY and SRP54; both are GTPases. In E. coli, ftsY is an essential gene located in an operon with cell division genes ftsE and ftsX, but its apparent function is as the signal recognition particle docking protein. [Protein fate, Protein and peptide secretion and trafficking]	277
272884	TIGR00065	ftsZ	cell division protein FtsZ. This family consists of cell division protein FtsZ, a GTPase found in bacteria, the chloroplast of plants, and in archaebacteria. Structurally similar to tubulin, FtsZ undergoes GTP-dependent polymerization into filaments that form a cytoskeleton involved in septum synthesis. [Cellular processes, Cell division]	349
129176	TIGR00066	g_glut_trans	gamma-glutamyltranspeptidase. Also called gamma-glutamyltranspeptidase (ggt). Some members of this family have antibiotic synthesis or resistance activities. In the case of a cephalosporin acylase from Pseudomonas sp., the enzyme was shown to retain some gamma-glutamyltranspeptidase activity. Other, more distantly related proteins have ggt-related activities and score below the trusted cutoff. [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs]	516
272885	TIGR00067	glut_race	glutamate racemase. This family consists of glutamate racemase, a protein required for making the UDP-N-acetylmuramoyl-pentapeptide used as a precursor in bacterial peptidoglycan biosynthesis. The most closely related proteins differing in function are aspartate racemases. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan]	251
272886	TIGR00068	glyox_I	lactoylglutathione lyase. Lactoylglutathione lyase is also known as aldoketomutase and glyoxalase I. Glyoxalase I is a homodimer in many species. In some eukaryotes, including yeasts and plants, the orthologous protein carries a tandem duplication, is twice as long, and hits this model twice. [Central intermediary metabolism, Amino sugars, Energy metabolism, Other]	150
272887	TIGR00069	hisD	histidinol dehydrogenase. This model describes a polypeptide sequence catalyzing the final step in histidine biosynthesis, found sometimes as an independent protein and sometimes as a part of a multifunctional protein. [Amino acid biosynthesis, Histidine family]	393
272888	TIGR00070	hisG	ATP phosphoribosyltransferase. Members of this family from B. subtilis, Aquifex aeolicus, and Synechocystis PCC6803 (and related taxa) lack the C-terminal third of the sequence. The sole homolog from Archaeoglobus fulgidus lacks the N-terminal 50 residues (as reported) and is otherwise atypical of the rest of the family. This model excludes the C-terminal extension. [Amino acid biosynthesis, Histidine family]	183
272889	TIGR00071	hisT_truA	tRNA pseudouridine(38-40) synthase. Members of this family are the tRNA modification enzyme TruA, tRNA pseudouridine(38-40) synthase. In a few species (e.g. Bacillus anthracis), TruA is represented by two paralogs. [Protein synthesis, tRNA and rRNA base modification]	227
272890	TIGR00072	hydrog_prot	hydrogenase maturation protease. HycI and HoxM are well-characterized as responsible for C-terminal protease activity on their respective hydrogenase large chains. A large number of homologous proteins appear responsible for the maturation of various forms of hydrogenase.	145
272891	TIGR00073	hypB	hydrogenase accessory protein HypB. A GTP hydrolase for assembly of nickel metallocenter of hydrogenase. A similar protein, ureG, is an accessory protein for urease, which also uses nickel. Hits scoring 75 and above are safe as orthologs. [SS 1/05/04 I changed the role_ID and process GO from protein folding to protein modification, since a protein folding role has not been established, but HypB is implicated in insertion of nickel into the large subunit of NiFe hydrogenases.] [Protein fate, Protein modification and repair]	208
129184	TIGR00074	hypC_hupF	hydrogenase assembly chaperone HypC/HupF. This protein is suggested to act as a chaperone for a hydrogenase large subunit, holding the precursor form before metallocenter nickel incorporation. [SS 12/31/03] A more recently proposed additional function is to shuttle the iron atom that has been liganded at the HypC/HypD complex to the precursor of the large hydrogenase (HycE) subunit. Added metallochaperone and protein mod GO terms. [Protein fate, Protein folding and stabilization, Protein fate, Protein modification and repair]	76
272892	TIGR00075	hypD	hydrogenase expression/formation protein HypD. HypD is involved in the hyp operon which is needed for the activity of the three hydrogenase isoenzymes in Escherichia coli. HypD is one of the genes needed for formation of these enzymes. This protein has been found in gram-negative and gram-positive bacteria and Archaea. [Protein fate, Protein modification and repair]	369
272893	TIGR00077	lspA	lipoprotein signal peptidase. Alternate name: lipoprotein signal peptidase [Protein fate, Protein and peptide secretion and trafficking]	166
272894	TIGR00078	nadC	nicotinate-nucleotide pyrophosphorylase. Synonym: quinolinate phosphoribosyltransferase (decarboxylating) [Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridine nucleotides]	265
272895	TIGR00079	pept_deformyl	peptide deformylase. Peptide deformylase (EC 3.5.1.88), also called polypeptide deformylase, is a metalloenzyme that uses water to release formate from the N-terminal formyl-L-methionine of bacterial and chloroplast peptides. This enzyme should not be confused with formylmethionine deformylase (EC 3.5.1.31) which is active on free N-formyl methionine and has been reported from rat intestine. [Protein fate, Protein modification and repair]	161
272896	TIGR00080	pimt	protein-L-isoaspartate(D-aspartate) O-methyltransferase. This is an all-kingdom (but not all species) full-length ortholog enzyme for repairing aging proteins. Among the prokaryotes, the gene name is pcm. Among eukaryotes, pimt. [Protein fate, Protein modification and repair]	215
272897	TIGR00081	purC	phosphoribosylaminoimidazole-succinocarboxamide synthase. Alternate name: SAICAR synthetase purine de novo biosynthesis. E. coli example noted as homotrimer. Check length. Longer versions may be multifunctional enzymes. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis]	237
129191	TIGR00082	rbfA	ribosome-binding factor A. Associates with free 30S ribosomal subunits (but not with 30S subunits that are part of 70S ribosomes or polysomes). Essential for efficient processing of 16S rRNA. May interact with the 5'terminal helix region of 16S rRNA. Mutants lacking rbfA have a cold-sensitive phenotype. [Transcription, RNA processing]	114
272898	TIGR00083	ribF	riboflavin kinase/FMN adenylyltransferase. multifunctional enzyme: riboflavin kinase (EC 2.7.1.26) (flavokinase) / FMN adenylyltransferase (EC 2.7.7.2) (FAD pyrophosphorylase) (FAD synthetase). [Biosynthesis of cofactors, prosthetic groups, and carriers, Riboflavin, FMN, and FAD]	288
129193	TIGR00084	ruvA	Holliday junction DNA helicase, RuvA subunit. RuvA specifically binds Holliday junctions as a sandwich of two tetramers and maintains the configuration of the junction. It forms a complex with two hexameric rings of RuvB, the subunit that contains helicase activity. The complex drives ATP-dependent branch migration of the Holliday junction recombination intermediate. The endonuclease RuvC resolves junctions. [DNA metabolism, DNA replication, recombination, and repair]	191
129194	TIGR00086	smpB	SsrA-binding protein. This model describes the SsrA-binding protein, also called tmRNA binding protein, small protein B, and SmpB. The small, stable RNA SsrA (also called tmRNA or 10Sa RNA) recognizes stalled ribosomes such as occur during translation from message that lacks a stop codon. It becomes charged with Ala like a tRNA, then acts as mRNA to resume translation started with the defective mRNA. The short C-terminal peptide tag added by the SsrA system marks the abortively translated protein for degradation. SmpB binds SsrA after its aminoacylation but before the coupling of the Ala to the nascent polypeptide chain and is an essential part of the SsrA peptide tagging system. SmpB has been associated with the survival of bacterial pathogens in conditions of stress. It is universal in the first 100 sequenced bacterial genomes. [Protein synthesis, Other]	144
272899	TIGR00087	surE	5'/3'-nucleotidase SurE. This protein family originally was named SurE because of its role in stationary phase survival in Escherichia coli. In E. coli, surE is next to pcm, an L-isoaspartyl protein repair methyltransferase that is also required for stationary phase survival. Recent work shows that viewing SurE as an acid phosphatase (3.1.3.2) is not accurate. Rather, SurE in E. coli, Thermotoga maritima, and Pyrobaculum aerophilum acts strictly on nucleoside 5'- and 3'-monophosphates. E. coli SurE is Recommended cutoffs are 15 for homology, 40 for probable orthology, and 200 for orthology with full-length homology. [Cellular processes, Adaptations to atypical conditions]	247
129196	TIGR00088	trmD	tRNA (guanine-N1)-methyltransferase. This model is specific for the tRNA modification enzyme tRNA (guanine-N1)-methyltransferase (trmD). This enzyme methylates guanosine-37 in a number of tRNAs. The enzyme's catalytic activity is as follows: S-adenosyl-L-methionine + tRNA = S-adenosyl-L-homocysteine + tRNA containing N1-methylguanine. [Protein synthesis, tRNA and rRNA base modification]	233
272900	TIGR00089	TIGR00089	radical SAM methylthiotransferase, MiaB/RimO family. This subfamily contains the tRNA-i(6)A37 modification enzyme, MiaB (TIGR01574). The phylogenetic tree indicates 4 distinct clades, one of which corresponds to MiaB. The other three clades are modelled by hypothetical equivalogs (TIGR01125, TIGR01579 and TIGR01578). Together, the four models hit every sequence hit by the subfamily model without any overlap between them. This subfamily is apparently a part of a larger superfamily of enzymes utilizing both a 4Fe4S cluster and S-adenosyl methionine (SAM) to initiate radical reactions. MiaB acts on a particular isoprenylated adenine base of certain tRNAs causing thiolation at an aromatic carbon, and probably also transferring a methyl group from SAM to the thiol. The particular substrate of the three other clades is unknown but may be very closely related.	429
272901	TIGR00090	rsfS_iojap_ybeB	ribosome silencing factor RsfS/YbeB/iojap. This model describes a widely distributed family of bacterial proteins related to iojap from plants. It includes RsfS(YbeB) from E. coli. The gene iojap is a pattern-striping gene in maize, reflecting a chloroplast development defect in some cells. The conserved function of this protein is to silence ribosomes by binding the ribosomal large subunit and impairing joining with the small subunit in response to nutrient stress. Note that RsfS (starvation) is an author-endorsed change from the published symbol RsfA, which conflicted with previously published gene symbols. [Protein synthesis, Translation factors]	99
161703	TIGR00091	TIGR00091	tRNA (guanine-N(7)-)-methyltransferase. This predicted S-adenosylmethionine-dependent methyltransferase is found in a single copy in most Bacteria. It is also found, with a short amino-terminal extension in eukaryotes. Its function is unknown. In E. coli, this protein flanks the DNA repair protein MutY, also called micA. [Protein synthesis, tRNA and rRNA base modification]	194
129200	TIGR00092	TIGR00092	GTP-binding protein YchF. This predicted GTP-binding protein is found in a single copy in every complete bacterial genome, and is found in Eukaryotes. A more distantly related protein, separated from this model, is found in the archaea. It is known to bind GTP and double-stranded nucleic acid. It is suggested to belong to a nucleoprotein complex and act as a translation factor. [Unknown function, General]	368
272902	TIGR00093	TIGR00093	pseudouridine synthase. This model identifies panels of pseudouridine synthase enzymes that perform RNA modifications involved in maturing the protein translation apparatus. Counts per genome vary: two in Staphylococcus aureus, three in Pseudomonas putida, four in E. coli, etc. [Protein synthesis, tRNA and rRNA base modification]	128
272903	TIGR00094	tRNA_TruD_broad	tRNA pseudouridine synthase, TruD family. An EGAD loading error caused one member to be called surE, but that's an adjacent gene. MJ11364 is a strong partial match from 50 to 230 aa. [Protein synthesis, tRNA and rRNA base modification]	387
188022	TIGR00095	TIGR00095	16S rRNA (guanine(966)-N(2))-methyltransferase RsmD. This model represents a family of uncharacterized bacterial proteins. Members are present in nearly every complete bacterial genome, always in a single copy. PSI-BLAST analysis shows homology to several families of SAM-dependent methyltransferases, including ribosomal RNA adenine dimethylases. [Protein synthesis, tRNA and rRNA base modification]	190
129204	TIGR00096	TIGR00096	16S rRNA (cytidine(1402)-2'-O)-methyltransferase. This protein, previously known as YraL, is RsmI, one of a pair of genes involved in a unique dimethyl modification of a cytidine in 16S rRNA. See pfam00590 (tetrapyrrole methylase), which demonstrates homology between this family and other members, including several methylases for the tetrapyrrole class of compound, as well as the enzyme diphthine synthase. [Protein synthesis, tRNA and rRNA base modification]	276
272904	TIGR00097	HMP-P_kinase	hydroxymethylpyrimidine kinase/phosphomethylpyrimidine kinase. This model represents a bifunctional enzyme, phosphomethylpyrimidine kinase (EC 2.7.4.7)/Hydroxymethylpyrimidine kinase (EC 2.7.1.49), the ThiD/J protein of thiamine biosynthesis. The protein is commonly observed within operons containing other thiamine biosynthesis genes. Numerous examples are fusion proteins with other thiamine-biosynthetic domains. Saccharomyces has three recent paralogs, two of which are isofunctional and score above the trusted cutoff. The third shows a longer branch length in a phylogenetic tree and scores below the trusted cutoff, as do putative second copies in a number of species. [Biosynthesis of cofactors, prosthetic groups, and carriers, Thiamine]	254
272905	TIGR00099	Cof-subfamily	Cof subfamily of IIB subfamily of haloacid dehalogenase superfamily. This subfamily of sequences falls within the Class-IIB subfamily (TIGR01484) of the Haloacid Dehalogenase superfamily of aspartate-nucleophile hydrolases. The use of the name "Cof" as an identifier here is arbitrary and refers to the E. coli Cof protein. This subfamily is notable for the large number of recent paralogs in many species. Listeria, for instance, has 12, Clostridium, Lactococcus and Streptococcus pneumoniae have 8 each, Enterococcus and Salmonella have 7 each, and Bacillus subtilis, Mycoplasma, Staphylococcus and E. coli have 6 each. This high degree of gene duplication is limited to the gamma proteobacteria and low-GC gram positive lineages. The profusion of genes in this subfamily is not coupled with a high degree of divergence, so it is impossible to determine an accurate phylogeny at the equivalog level. Considering the relationship of this subfamily to the other known members of the HAD-IIB subfamily (TIGR01484), sucrose and trehalose phosphatases and phosphomannomutase, it seems a reasonable hypothesis that these enzymes act on phosphorylated sugars. Possibly the diversification of genes in this subfamily represents the diverse sugars and polysaccharides that various bacteria find in their biological niches. The members of this subfamily are restricted almost exclusively to bacteria (one sequence from S. pombe scores above trusted, while another is between trusted and noise). It is notable that no archaea are found in this group, the closest relations to the archaea found here being two Deinococcus sequences. [Unknown function, Enzymes of unknown specificity]	256
272906	TIGR00100	hypA	hydrogenase nickel insertion protein HypA. Contains a CXXC-~12X-CXXC motif and genetically seems to be a regulatory protein. In H. pylori, a hypA mutant showed abolished hydrogenase activity and decreased urease activity. Nickel supplementation in media restored urease activity and partial hydrogenase activity. HypA is probably involved in inserting Ni into enzymes. [Protein fate, Protein modification and repair]	115
129208	TIGR00101	ureG	urease accessory protein UreG. This model represents UreG, a GTP hydrolase that acts in the assembly of the nickel metallocenter of urease. It is found only in urease-positive species, although some urease-positive species (e.g. Bacillus subtilis) lack this protein. A similar protein, hypB, is an accessory protein for expression of hydrogenase, which also uses nickel. [Central intermediary metabolism, Nitrogen metabolism]	199
211546	TIGR00103	DNA_YbaB_EbfC	DNA-binding protein, YbaB/EbfC family. The function of this protein is unknown, but it has been expressed and crystallized. Its gene nearly always occurs next to recR and/or dnaX. It is restricted to Bacteria and the plant Arabidopsis. The plant form contains an additional N-terminal region that may serve as a transit peptide and shows a close relationship to the cyanobacterial member, suggesting that it is a chloroplast protein. Members of this family are found in a single copy per bacterial genome, but are broadly distributed. A member is present even in the minimal gene complement of Mycoplasma genitalium. [Unknown function, General]	101
129210	TIGR00104	tRNA_TsaA	tRNA-Thr(GGU) m(6)t(6)A37 methyltransferase TsaA. This protein has been characterized by crystallography in complex with S-Adenosylmethionine, making it a probable S-adenosylmethionine-dependent methyltransferase. Analysis in EcoGene links this protein to the enzyme characterization mapped to the tsaA gene in Escherichia coli. [Unknown function, Enzymes of unknown specificity]	142
272907	TIGR00105	L31	ribosomal protein L31. This family consists exclusively of bacterial (and organellar) 50S ribosomal protein L31. In some species, such as Bacillus subtilis, this protein exists in two forms (RpmE and YtiA), one of which (RpmE) contains a pair of motifs, CXC and CXXC, for binding zinc. [Protein synthesis, Ribosomal proteins: synthesis and modification]	68
272908	TIGR00106	TIGR00106	uncharacterized protein, MTH1187 family. This protein has been crystallized in both Methanobacterium thermoautotrophicum and yeast, but its function remains unknown. Both crystal structures showed sulfate ions bound at the interface of two dimers to form a tetramer. [Unknown function, General]	97
188024	TIGR00107	deoD	purine-nucleoside phosphorylase, family 1 (deoD). Purine nucleoside phosphorylase (also called inosine phosphorylase) is a purine salvage enzyme. Purine nucleosides, such as guanosine, inosine, or xanthosine, plus orthophosphate, can be converted to their respective purine bases (guanine, hypoxanthine, or xanthine) plus ribose-1-phosphate. This family of purine nucleoside phosphorylase is restricted to the bacteria. [Purines, pyrimidines, nucleosides, and nucleotides, Salvage of nucleosides and nucleotides]	232
272909	TIGR00109	hemH	ferrochelatase. Human ferrochelatase, found at the mitochondrial inner membrane inner surface, was shown in an active recombinant form to be a homodimer. This contrasts to an earlier finding by gel filtration that overexpressed E. coli ferrochelatase runs as a monomer. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	322
272910	TIGR00110	ilvD	dihydroxy-acid dehydratase. This protein, dihydroxy-acid dehydratase, catalyzes the fourth step in valine and isoleucine biosynthesis. It contains a catalytically essential [4Fe-4S] cluster. This model generates scores of up to 150 bits vs. 6-phosphogluconate dehydratase, a homologous enzyme. [Amino acid biosynthesis, Pyruvate family]	535
129217	TIGR00111	pelota	mRNA surveillance protein pelota. This model describes the Drosophila protein Pelota, the budding yeast protein DOM34 which it can replace, and a set of closely related archaeal proteins. Members contain a proposed RNA binding motif. The meiotic defect in pelota mutants may be a complex result of a protein translation defect, as suggested in yeast by ribosomal protein RPS30A being a multicopy suppressor and by an altered polyribosome profile in DOM34 mutants rescued by RPS30A. This family is homologous to a family of peptide chain release factors. Pelota is proposed to act in protein translation. [Protein synthesis, Translation factors]	351
272911	TIGR00112	proC	pyrroline-5-carboxylate reductase. This enzyme catalyzes the final step in proline biosynthesis. Among the four paralogs in Bacillus subtilis (proG, proH, proI, and comER), ComER is the most divergent and does not prevent proline auxotrophy from mutation of the other three. It is excluded from the seed and scores between the trusted and noise cutoffs. [Amino acid biosynthesis, Glutamate family]	245
272912	TIGR00113	queA	S-adenosylmethionine:tRNA ribosyltransferase-isomerase. This model describes the enzyme for S-adenosylmethionine:tRNA ribosyltransferase-isomerase (QueA). QueA synthesizes Queuosine which is usually in the first position of the anticodon of tRNAs specific for asparagine, aspartate, histidine, and tyrosine. [Protein synthesis, tRNA and rRNA base modification]	344
211550	TIGR00114	lumazine-synth	6,7-dimethyl-8-ribityllumazine synthase. This enzyme catalyzes the cyclo-ligation of 3,4-dihydroxy-2-butanone-4-P and 5-amino-6-ribitylamino-2,4(1H,3H)-pyrimidinedione to form 6,7-dimethyl-8-ribityllumazine, the immediate precursor of riboflavin. Sometimes referred to as riboflavin synthase, beta subunit, this should not be confused with the alpha subunit which carries out the subsequent reaction. Archaeal members of this family are considered putative, although included in the seed and scoring above the trusted cutoff. [Biosynthesis of cofactors, prosthetic groups, and carriers, Riboflavin, FMN, and FAD]	138
272913	TIGR00115	tig	trigger factor. Trigger factor is a ribosome-associated molecular chaperone and is the first chaperone to interact with nascent polypeptide. Trigger factor can bind at the same time as the signal recognition particle (SRP), but is excluded by the SRP receptor (FtsY). The central domain of trigger factor has peptidyl-prolyl cis/trans isomerase activity. This protein is found in a single copy in virtually every bacterial genome. [Protein fate, Protein folding and stabilization]	410
272914	TIGR00116	tsf	translation elongation factor Ts. Translational elongation factor Ts (EF-Ts) catalyzes the exchange of GTP for the GDP of the EF-Tu.GDP complex as part of the cycle of translation elongation. This protein is found in Bacteria, mitochondria, and chloroplasts. [Protein synthesis, Translation factors]	291
129223	TIGR00117	acnB	aconitate hydratase 2. Aconitate hydratase (aconitase) is an enzyme of the TCA cycle. This model describes aconitase 2, AcnB, which has weak similarity to aconitase 1. It is found almost exclusively in the Proteobacteria. [Energy metabolism, TCA cycle]	844
272915	TIGR00118	acolac_lg	acetolactate synthase, large subunit, biosynthetic type. Two groups of proteins form acetolactate from two molecules of pyruvate. The type of acetolactate synthase described in this model also catalyzes the formation of acetohydroxybutyrate from pyruvate and 2-oxobutyrate, an early step in the branched chain amino acid biosynthesis; it is therefore also termed acetohydroxyacid synthase. In bacteria, this catalytic chain is associated with a smaller regulatory chain in an alpha2/beta2 heterotetramer. Acetolactate synthase is a thiamine pyrophosphate enzyme. In this type, FAD and Mg++ are also found. Several isozymes of this enzyme are found in E. coli K12, one of which contains a frameshift in the large subunit gene and is not expressed. [Amino acid biosynthesis, Pyruvate family]	558
272916	TIGR00119	acolac_sm	acetolactate synthase, small subunit. Acetolactate synthase is a heterodimeric thiamine pyrophosphate enzyme with large and small subunits. One of the three isozymes in E. coli K12 contains a frameshift in the large subunit gene and is not expressed. Acetohydroxyacid synthase is a synonym. [Amino acid biosynthesis, Pyruvate family]	157
161718	TIGR00120	ArgJ	glutamate N-acetyltransferase/amino-acid acetyltransferase. This enzyme can acetylate Glu to N-acetyl-Glu by deacetylating N-2-acetyl-ornithine into ornithine; the two halves of this reaction represent the first and fifth steps in the synthesis of Arg (or citrulline) from Glu by way of ornithine (EC 2.3.1.35). In Bacillus stearothermophilus, but not in Thermus thermophilus HB27, the enzyme is bifunctional and can also use acetyl-CoA to acetylate Glu (EC 2.3.1.1). [Amino acid biosynthesis, Glutamate family]	404
272917	TIGR00121	birA_ligase	birA, biotin-[acetyl-CoA-carboxylase] ligase region. This model represents the biotin--acetyl-CoA-carboxylase ligase region of biotin--acetyl-CoA-carboxylase ligase. In Escherichia coli and some other species, this enzyme is part of a bifunction protein BirA that includes a small, N-terminal biotin operon repressor domain. Proteins identified by this model should not be called bifunctional unless they are also identified by birA_repr_reg (TIGR00122). The protein name suggests that this enzyme transfers biotin only to acetyl-CoA-carboxylase but it also transfers the biotin moiety to other proteins. The apparent orthologs among the eukaryotes are larger proteins that contain a single copy of this domain. [Protein fate, Protein modification and repair]	237
272918	TIGR00122	birA_repr_reg	BirA biotin operon repressor domain. This model represents the amino-terminal helix-turn-helix repressor region of the biotin--acetyl-CoA-carboxylase ligase/biotin operon repressor bifunctional protein BirA. In many species, the biotin--acetyl-CoA-carboxylase ligase ortholog lacks this DNA-binding repressor region and therefore is not equivalent to the well-characterized BirA of E. coli. This model may recognize some other putative repressor proteins, such as DnrO of Streptomyces peucetius with scores below the noise cutoff but with significance shown by low E-value. [Regulatory functions, DNA interactions]	69
272919	TIGR00123	cbiM	cobalamin biosynthesis protein CbiM. A cutoff of 200 bits for trusted orthologs of cbiM is suggested. Scores lower than 200 but higher than 20 may be considered sufficient to call a protein cobalamin biosynthesis protein CbiM-related. The seed alignment for this model is a cluster of very closely related proteins from Methanobacterium thermoautotrophicum, Archaeoglobus fulgidus, Methanococcus jannaschii, and Salmonella typhimurium, each of which has greater than 50% identity to all the others. The ortholog from Salmonella is the source of the gene symbol cbiM for this set. In Methanobacterium thermoautotrophicum, Archaeoglobus fulgidus, and Methanococcus jannaschii, a second homolog of cbiM is also found. These cbiM-related proteins appear to represent a distinct but less well-conserved orthologous group. Still more distant homologs include sll0383 from Synechocystis sp. and HI1621 from Haemophilus influenzae; the latter protein, from a species that does not synthesize cobalamin, is the most divergent member of the group. The functions of and relationships among the set of proteins homologous to cbiM have not been determined. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	214
129230	TIGR00124	cit_ly_ligase	[citrate (pro-3S)-lyase] ligase. ATP is cleaved to AMP and pyrophosphate during the reaction. The carboxyl end is homologous to a number of cytidyltransferases that also release pyrophosphate. [Energy metabolism, Fermentation, Protein fate, Protein modification and repair]	332
272920	TIGR00125	cyt_tran_rel	cytidyltransferase-like domain. Protein families that contain at least one copy of this domain include citrate lyase ligase, pantoate-beta-alanine ligase, glycerol-3-phosphate cytidyltransferase, ADP-heptose synthase, phosphocholine cytidylyltransferase, lipopolysaccharide core biosynthesis protein KdtB, the bifunctional protein NadR, and a number whose function is unknown. Many of these proteins are known to use CTP or ATP and release pyrophosphate.	66
272921	TIGR00126	deoC	deoxyribose-phosphate aldolase. Deoxyribose-phosphate aldolase is involved in the catabolism of nucleotides and deoxyribonucleotides. The catalytic process is as follows: 2-deoxy-D-ribose 5-phosphate = D-glyceraldehyde 3-phosphate + acetaldehyde. It is found in both gram-positive and gram-negative bacteria. [Purines, pyrimidines, nucleosides, and nucleotides, Other, Energy metabolism, Other]	211
129233	TIGR00127	nadp_idh_euk	isocitrate dehydrogenase, NADP-dependent, eukaryotic type. This model describes a eukaryotic, NADP-dependent form of isocitrate dehydrogenase. These eukaryotic enzymes differ considerably from a fairly tight cluster that includes all other related isocitrate dehydrogenases, 3-isopropylmalate dehydrogenases, and tartrate dehydrogenases. Several NAD- or NADP-dependent dehydrogenases, including 3-isopropylmalate dehydrogenase, tartrate dehydrogenase, and the multimeric forms of isocitrate dehydrogenase, share a nucleotide binding domain unrelated to that of lactate dehydrogenase and its homologs. These enzymes dehydrogenate their substrates at a H-C-OH site adjacent to a H-C-COOH site; the latter carbon, now adjacent to a carbonyl group, readily decarboxylates. This model does not discriminate cytosolic, mitochondrial, and chloroplast proteins. However, the model starts very near the amino end of the cytosolic form; the finding of additional amino-terminal sequence may indicate a transit peptide. [Energy metabolism, TCA cycle]	409
272922	TIGR00128	fabD	malonyl CoA-acyl carrier protein transacylase. This enzyme of fatty acid biosynthesis transfers the malonyl moiety from coenzyme A to acyl-carrier protein. The seed alignment for this family of proteins contains a single member each from a number of bacterial species but also an additional pair of closely related, uncharacterized proteins from B. subtilis, one of which has a long C-terminal extension. [Fatty acid and phospholipid metabolism, Biosynthesis]	290
272923	TIGR00129	fdhD_narQ	formate dehydrogenase family accessory protein FdhD. FdhD in E. coli and NarQ in B. subtilis are required for the activity of formate dehydrogenase. The gene name in B. subtilis reflects the requirement of the neighboring gene narA for nitrate assimilation, for which NarQ is not required. In some species, the gene is associated not with a known formate dehydrogenase but with a related putative molybdopterin-binding oxidoreductase. A reasonable hypothesis is that this protein helps prepare a required cofactor for assembly into the holoenzyme. [Energy metabolism, Anaerobic, Energy metabolism, Electron transport]	237
161726	TIGR00130	frhD	coenzyme F420-reducing hydrogenase delta subunit (putative coenzyme F420 hydrogenase processing subunit). FrhD is not part of the active FRH heterotrimer, but is probably a protease required for maturation. Alternative name: 8-hydroxy-5-deazaflavin (F420) reducing hydrogenase (FRH) subunit delta. [Protein fate, Protein modification and repair]	153
272924	TIGR00131	gal_kin	galactokinase. Galactokinase is a member of the GHMP kinases (Galactokinase, Homoserine kinase, Mevalonate kinase, Phosphomevalonate kinase) and shares with them an amino-terminal domain probably related to ATP binding. The galactokinases found by this model are divided into two sets. Prokaryotic forms are generally shorter. The eukaryotic forms are longer because of additional central regions and in some cases are known to be bifunctional, with regulatory activities that are independent of galactokinase activity. [Energy metabolism, Sugars]	386
272925	TIGR00132	gatA	aspartyl/glutamyl-tRNA(Asn/Gln) amidotransferase, A subunit. In many species, Gln--tRNA ligase is missing. tRNA(Gln) is misacylated with Glu after which a heterotrimeric amidotransferase converts Glu to Gln. This model represents the amidase chain of that heterotrimer, encoded by the gatA gene. In the Archaea, Asn--tRNA ligase is also missing. This amidase subunit may also function in the conversion of Asp-tRNA(Asn) to Asn-tRNA(Asn), presumably with a different recognition unit to replace gatB. Both Methanococcus jannaschii and Methanobacterium thermoautotrophicum have both authentic gatB and a gatB-related gene, but only one gene like gatA. It has been shown that gatA can be expressed only when gatC is also expressed. In most species expressing the amidotransferase, the gatC ortholog is about 90 residues in length, but in Mycoplasma genitalium and Mycoplasma pneumoniae the gatC equivalent occurs as the C-terminal domain of a much longer protein. Not surprisingly, the Mycoplasmas also represent the most atypical lineage of gatA orthology. This orthology group is more narrowly defined here than in Proc Natl Acad Sci USA 94, 11819-11826 (1997). In particular, a Rhodococcus homolog found in association with nitrile hydratase genes and described as an enantiomer-selective amidase active on several 2-aryl propionamides, is excluded here. It is likely, however, that the amidase subunit GatA is not exclusively a part of the Glu-tRNA(Gln) amidotransferase heterotrimer and restricted to that function in all species. [Protein synthesis, tRNA aminoacylation]	460
272926	TIGR00133	gatB	aspartyl/glutamyl-tRNA(Asn/Gln) amidotransferase, B subunit. The heterotrimer GatABC is responsible for transferring the NH2 group that converts Glu to Gln, or Asp to Asn after the Glu or Asp has been ligated to the tRNA for Gln or Asn, respectively. In Lactobacillus, GatABC is responsible only for tRNA(Gln). In the Archaea, GatABC is responsible only for tRNA(Asn), while GatDE is responsible for tRNA(Gln). In lineages that include Thermus, Chlamydia, or Acidithiobacillus, the GatABC complex catalyzes both. [Protein synthesis, tRNA aminoacylation]	478
213509	TIGR00134	gatE_arch	glutamyl-tRNA(Gln) amidotransferase, subunit E. This peptide is found only in the Archaea. It is paralogous to the gatB-encoded subunit of Glu-tRNA(Gln) amidotransferase. The GatABC system operates in many bacteria to convert Glu-tRNA(Gln) into Gln-tRNA(Gln). However, the homologous system in archaea instead converts Asp-tRNA(Asn) to Asn-tRNA(Asn). Glu-tRNA(Gln) is converted to Gln-tRNA(Gln) by a heterodimeric amidotransferase of GatE (this protein) and GatD. The Archaea have an Asp-tRNA(Asn) amidotransferase instead of an Asp--tRNA ligase, but the genes have not been identified. It is likely that this protein replaces gatB in Asp-tRNA(Asn) amidotransferase but that both enzymes share gatA. [Protein synthesis, tRNA aminoacylation]	620
129241	TIGR00135	gatC	aspartyl/glutamyl-tRNA(Asn/Gln) amidotransferase, C subunit. Archaea, organelles, and many bacteria charge Gln-tRNA by first misacylating it with Glu and then amidating Glu to Gln. This small protein is part of the amidotransferase heterotrimer and appears to be important to the stability of the amidase subunit encoded by gatA, but its function may not be required in every organism that expresses gatA and gatB. The seed alignment for this model does not include any eukaryotic sequence and is not guaranteed to find eukaryotic examples, although it does find some. Saccharomyces cerevisiae, which expresses the amidotransferase for mitochondrial protein translation, seems to lack a gatC ortholog. This model has been revised to remove the candidate sequence from Methanococcus jannaschii, now part of a related model. [Protein synthesis, tRNA aminoacylation]	93
272927	TIGR00136	gidA	glucose-inhibited division protein A. GidA, the longer of two forms of GidA-related proteins, appears to be present in all complete eubacterial genomes so far, as well as Saccharomyces cerevisiae. A subset of these organisms have a closely related protein. GidA is absent in the Archaea. It appears to act with MnmE, in an alpha2/beta2 heterotetramer, in the 5-carboxymethylaminomethyl modification of uridine 34 in certain tRNAs. The shorter, related protein, previously called gid or gidA(S), is now called TrmFO (see model TIGR00137). [Protein synthesis, tRNA and rRNA base modification]	616
129243	TIGR00137	gid_trmFO	tRNA:m(5)U-54 methyltransferase. This model represents an orthologous set of proteins present in relatively few bacteria but very tightly conserved where it occurs. It is closely related to gidA (glucose-inhibited division protein A), which appears to be present in all complete eubacterial genomes so far and in Saccharomyces cerevisiae. It was designated gid but is now recognized as a tRNA:m(5)U-54 methyltransferase and is now designated trmFO. [Protein synthesis, tRNA and rRNA base modification]	433
272928	TIGR00138	rsmG_gidB	16S rRNA (guanine(527)-N(7))-methyltransferase RsmG. RsmG was previously called GidB (glucose-inhibited division protein B). It is present as a single copy in nearly all complete eubacterial genomes. It is missing only from some obligate intracellular species of various lineages (Chlamydiae, Ehrlichia, Wolbachia, Anaplasma, Buchnera, etc.). RsmG shows a methyltransferase fold in its crystal structure, and acts as a 7-methylguanosine (m(7)G) methyltransferase, apparently specific to 16S rRNA. [Protein synthesis, tRNA and rRNA base modification]	181
129245	TIGR00139	h_aconitase	homoaconitase. Homoaconitase is known only as a fungal enzyme from two species, where it is part of an unusual lysine biosynthesis pathway. Because this model is based on just two sequences from a narrow taxonomic range, it may not recognize distant orthologs, should any exist. Homoaconitase, aconitase, and 3-isopropylmalate dehydratase have similar overall structures, but 3-isopropylmalate dehydratase is split into large (leuC) and small (leuD) chains in eubacteria. Several pairs of archaeal proteins resemble leuC and leuD over their lengths but are even closer to the respective domains of homoaconitase, and their identity is uncertain. [Amino acid biosynthesis, Aspartate family]	712
129246	TIGR00140	hupD	hydrogenase expression/formation protein. Valid names: hupD, hynC, hoxM. C at 64 and 67 are believed to be metal binding. Postulated to be involved in processing of hydrogenase. Superfamily suggests that it is a peptidase/protease. [Protein fate, Protein modification and repair]	134
129247	TIGR00142	hycI	hydrogenase maturation protease HycI. Hydrogenase maturation protease is a protease that is involved in the C-terminal processing of HycE, the large subunit of hydrogenase 3 from E. coli. This protein seems to be found in E. coli and in Archaea. [Protein fate, Protein modification and repair]	146
272929	TIGR00143	hypF	[NiFe] hydrogenase maturation protein HypF. A previously described regulatory effect of HypF mutation is attributable to loss of activity of a regulatory hydrogenase. A zinc finger-like region, CXXCX(18)CXXCX(24)CXXCX(18)CXXC, further supported the regulatory hypothesis. However, more recent work (PUBMED:11375153) shows the direct effect is on the activity of expressed hydrogenases with nickel/iron centers, rather than on expression. [Protein fate, Protein modification and repair]	711
129249	TIGR00144	beta_RFAP_syn	beta-RFAP synthase. This protein family contains several archaeal examples of beta-ribofuranosylaminobenzene 5-prime-phosphate synthase (beta-RFAP synthase), an enzyme involved in methanopterin biosynthesis. In some species, two members of this family are found. It is unclear whether both act as beta-RFAP synthase. This family is related to the GHMP kinases (Galactokinase, Homoserine kinase, Mevalonate kinase, Phosphomevalonate kinase). Members are found so far only in the Archaea and in Methylobacterium extorquens. [Unknown function, Enzymes of unknown specificity]	324
272930	TIGR00145	TIGR00145	FTR1 family protein. A characterized member from yeast acts as oxidase-coupled high affinity iron transporter. Note that the apparent member from E. coli K12-MG1655 has a frameshift by homology with member sequences from other species. [Unknown function, General]	283
161732	TIGR00147	TIGR00147	lipid kinase, YegS/Rv2252/BmrU family. The E. coli member of this family, YegS has been purified and shown to have phosphatidylglycerol kinase activity. The member from M. tuberculosis, Rv2252, has diacylglycerol kinase activity. BmrU from B. subtilis is in an operon with multidrug efflux transporter Bmr, but is uncharacterized. [Unknown function, Enzymes of unknown specificity]	293
129252	TIGR00148	TIGR00148	UbiD family decarboxylase. The member of this family in E. coli is UbiD, 3-octaprenyl-4-hydroxybenzoate carboxy-lyase. The family described by this model, however, is broad enough that it is likely to contain several different decarboxylases. Found in bacteria, archaea, and yeast, with two members in A. fulgidus. No homologs were detected besides those classified as orthologs. The member from H. pylori has a C-terminal extension of just over 100 residues that is shared in part by the Aquifex aeolicus homolog. [Unknown function, General]	438
129253	TIGR00149	TIGR00149_YjbQ	secondary thiamine-phosphate synthase enzyme. Members of this protein family have been studied extensively by crystallography. Members from several different species have been shown to have sufficient thiamin phosphate synthase activity (EC 2.5.1.3) to complement thiE mutants. However, it is presumed that this is a secondary activity, and the primary function of this enzyme remains unknown. [Unknown function, Enzymes of unknown specificity]	132
129254	TIGR00150	T6A_YjeE	tRNA threonylcarbamoyl adenosine modification protein YjeE. This protein family belongs to a four-gene system responsible for the threonylcarbamoyl adenosine (t6A) tRNA modification. Members of this family have a conserved nucleotide-binding motif GXXGXGKT and a nucleotide-binding fold. Member protein YjeE of Haemophilus influenzae (HI0065) was shown to have (weak) ATPase activity. [Protein synthesis, tRNA and rRNA base modification]	133
129255	TIGR00151	ispF	2-C-methyl-D-erythritol 2,4-cyclodiphosphate synthase. Members of this protein family are 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase, the IspF protein of the deoxyxylulose (non-mevalonate) pathway of IPP biosynthesis. This protein occurs as an IspDF bifunctional fusion protein in about 20 percent of bacterial genomes. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	155
272931	TIGR00152	TIGR00152	dephospho-CoA kinase. This model produces scores in the range of 0-25 bits against adenylate, guanylate, uridine, and thymidylate kinases. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pantothenate and coenzyme A]	190
272932	TIGR00153	TIGR00153	TIGR00153 family protein. An apparent homolog with a suggested function is Pit accessory protein from Sinorhizobium meliloti, which may be involved in phosphate (Pi) transport. [Hypothetical proteins, Conserved]	216
188029	TIGR00154	ispE	4-diphosphocytidyl-2C-methyl-D-erythritol kinase. Members of this family of GHMP kinases were previously designated as conserved hypothetical protein YchB or as isopentenyl monophosphate kinase. It is now known, in tomato and E. coli, to encode 4-diphosphocytidyl-2C-methyl-D-erythritol kinase, an enzyme of the deoxyxylulose phosphate pathway of terpenoid biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	294
272933	TIGR00155	pqiA_fam	integral membrane protein, PqiA family. This family consists of uncharacterized predicted integral membrane proteins found, so far, only in the Proteobacteria. Of two members in E. coli, one is induced by paraquat and is designated PqiA, paraquat-inducible protein A. [Unknown function, General]	403
129260	TIGR00156	TIGR00156	TIGR00156 family protein. As of the last revision, this family consists only of two proteins from Escherichia coli and one from the related species Haemophilus influenzae. [Hypothetical proteins, Conserved]	126
272934	TIGR00157	TIGR00157	ribosome small subunit-dependent GTPase A. Members of this protein family were designated YjeQ and are now designated RsgA (ribosome small subunit-dependent GTPase A). The strongest motif in the alignment of these proteins is GXSGVGKS[ST], a classic P-loop for nucleotide binding. This protein has been shown to cleave GTP and remain bound to GDP. A role as a regulator of translation has been suggested. The Aquifex aeolicus ortholog is split into consecutive open reading frames. Consequently, this model was built in fragment mode (-f option). [Protein synthesis, Translation factors]	245
129262	TIGR00158	L9	ribosomal protein L9. Ribosomal protein L9 appears to be universal in, but restricted to, eubacteria and chloroplast. [Protein synthesis, Ribosomal proteins: synthesis and modification]	148
129263	TIGR00159	TIGR00159	TIGR00159 family protein. These proteins have no detectable global or local homology to any protein of known function. Members are restricted to the bacteria and found broadly in lineages other than the Proteobacteria. [Hypothetical proteins, Conserved]	211
272935	TIGR00160	MGSA	methylglyoxal synthase. Methylglyoxal synthase (MGS) generates methylglyoxal (MG), a toxic metabolite (that may also be a regulatory metabolite) that is detoxified, principally, through a pathway involving glutathione and glyoxalase I. Totemeyer et al. propose that, during a loss of control over carbon flux, with accumulation of phosphorylated sugars and depletion of phosphate, as might happen during a rapid shift to a richer medium, MGS aids the cell by converting some dihydroxyacetone phosphate (DHAP) to MG and phosphate. This is therefore an alternative to triosephosphate isomerase and the remainder of the glycolytic pathway for the disposal of DHAP during the stress of a sudden increase in available sugars. [Energy metabolism, Other]	143
129265	TIGR00161	TIGR00161	TIGR00161 family protein. This model represents one out of two closely related orthologous sets of proteins that, so far, are found only in the Archaea. This ortholog set includes MJ0106 from Methanococcus jannaschii and AF1251 from Archaeoglobus fulgidus, but not MJ1210 or AF0525. [Hypothetical proteins, Conserved]	238
129266	TIGR00162	TIGR00162	TIGR00162 family protein. This model represents one out of two closely related orthologous sets of proteins that, so far, are found only in but are universal among the Archaea. This ortholog set includes MJ1210 from Methanococcus jannaschii and AF0525 from Archaeoglobus fulgidus while excluding MJ0106 and AF1251. [Hypothetical proteins, Conserved]	188
272936	TIGR00163	PS_decarb	phosphatidylserine decarboxylase precursor. Phosphatidylserine decarboxylase is synthesized as a single chain precursor. Generation of the pyruvoyl active site from a Ser is coupled to cleavage of a Gly-Ser bond between the larger (beta) and smaller (alpha) chains. It is an integral membrane protein. A closely related family, possibly also active as phosphatidylserine decarboxylase, falls under model TIGR00164. [Fatty acid and phospholipid metabolism, Biosynthesis]	238
129268	TIGR00164	PS_decarb_rel	phosphatidylserine decarboxylase precursor-related protein. Phosphatidylserine decarboxylase is synthesized as a single chain precursor. Generation of the pyruvoyl active site from a Ser is coupled to cleavage of a Gly-Ser bond between the larger (beta) and smaller (alpha) chains. It is an integral membrane protein. This protein has many regions of homology to known phosphatidylserine decarboxylases, including the Gly-Ser motif for chain cleavage and active site generation, but has a shorter amino end and a number of deletions along the length of the alignment to the phosphatidylserine decarboxylases. It is unclear whether this protein is a form of phosphatidylserine decarboxylase or is a related enzyme. It is found in Neisseria gonorrhoeae, Mycobacterium tuberculosis, and several archaeal species, all of which lack known phosphatidylserine decarboxylase. [Unknown function, General]	189
272937	TIGR00165	S18	ribosomal protein S18. This ribosomal small subunit protein is found in all eubacteria so far, as well as in chloroplasts. YER050C from Saccharomyces cerevisiae and a related protein from Caenorhabditis elegans appear to be homologous and may represent mitochondrial forms. The trusted cutoff is set high enough that these two candidate S18 proteins are not categorized automatically. [Protein synthesis, Ribosomal proteins: synthesis and modification]	70
129270	TIGR00166	S6	ribosomal protein S6. The ribosomal protein S6 ortholog family, including yeast MRP17, shows more than two-fold length variation from 95 residues in Bacillus subtilis to 215 in Mycoplasma pneumoniae. This length variation comes primarily from poorly conserved C-terminal extensions that are particularly long in the Mycoplasmas. MRP17 protein is a component of the small ribosomal subunit in mitochondria, and is shown here to be an ortholog of S6. [Protein synthesis, Ribosomal proteins: synthesis and modification]	93
272938	TIGR00167	cbbA	ketose-bisphosphate aldolase. This model is under revision. Proteins found by this model include fructose-bisphosphate and tagatose-bisphosphate aldolase. [Energy metabolism, Glycolysis/gluconeogenesis]	288
129272	TIGR00168	infC	translation initiation factor IF-3. infC uses abnormal initiation codons such as AUA, AUC, and CUG which render its expression particularly sensitive to excess of its gene product IF-3 thereby regulating its own expression [Protein synthesis, Translation factors]	165
272939	TIGR00169	leuB	3-isopropylmalate dehydrogenase. Several NAD- or NADP-dependent dehydrogenases, including 3-isopropylmalate dehydrogenase, tartrate dehydrogenase, and the dimeric forms of isocitrate dehydrogenase, share a nucleotide binding domain unrelated to that of lactate dehydrogenase and its homologs. These enzymes dehydrogenate their substrates at a H-C-OH site adjacent to a H-C-COOH site; the latter carbon, now adjacent to a carbonyl group, readily decarboxylates. Among these decarboxylating dehydrogenases of hydroxyacids, overall sequence homology indicates evolutionary history rather than actual substrate or cofactor specificity, which may be toggled experimentally by replacement of just a few amino acids. 3-isopropylmalate dehydrogenase is an NAD-dependent enzyme and should have a sequence resembling HGSAPDI around residue 340. The substrate binding loop should include a sequence resembling E[KQR]X(0,1)LLXXR around residue 115. Other contacts of importance are known from crystallography but not detailed here. This model will not find all isopropylmalate dehydrogenases; the enzyme from Sulfolobus sp. strain 7 is more similar to mitochondrial NAD-dependent isocitrate dehydrogenases than to other known isopropylmalate dehydrogenases and was omitted to improve the specificity of the model. It scores below the cutoff and below some enzymes known not to be isopropylmalate dehydrogenase. [Amino acid biosynthesis, Pyruvate family]	346
272940	TIGR00170	leuC	3-isopropylmalate dehydratase, large subunit. Members of this family are 3-isopropylmalate dehydratase, large subunit, or the large subunit domain of single-chain forms. Homoaconitase, aconitase, and 3-isopropylmalate dehydratase have similar overall structures. All are dehydratases (EC 4.2.1.-) and bind a Fe-4S iron-sulfur cluster. 3-isopropylmalate dehydratase is split into large (leuC) and small (leuD) chains in eubacteria. Several pairs of archaeal proteins resemble the leuC and leuD pair in length and sequence but even more closely resemble the respective domains of homoaconitase, and their identity is uncertain. These homologs are now described by a separate model of subfamily (rather than equivalog) homology type, and the priors and cutoffs for this model have been changed to focus this equivalog family more narrowly. [Amino acid biosynthesis, Pyruvate family]	465
129275	TIGR00171	leuD	3-isopropylmalate dehydratase, small subunit. Homoaconitase, aconitase, and 3-isopropylmalate dehydratase have similar overall structures. All are dehydratases (EC 4.2.1.-) and bind a Fe-4S iron-sulfur cluster. 3-isopropylmalate dehydratase is split into large (leuC) and small (leuD) chains in eubacteria. Several pairs of archaeal proteins resemble the leuC and leuD pair in length and sequence but even more closely resemble the respective domains of homoaconitase, and their identity is uncertain. The candidate archaeal leuD proteins are not included in the seed alignment for this model and score below the trusted cutoff. [Amino acid biosynthesis, Pyruvate family]	188
129276	TIGR00172	maf	MAF protein. This nonessential gene causes inhibition of septation when overexpressed. A member of the family is found in the Archaeon Pyrococcus horikoshii and another in the round worm Caenorhabditis elegans. [Cellular processes, Cell division]	183
272941	TIGR00173	menD	2-succinyl-5-enolpyruvyl-6-hydroxy-3-cyclohexene-1-carboxylic-acid synthase. MenD was thought until recently to act as SHCHC synthase, but has recently been shown to act instead as SEPHCHC synthase. Conversion of SEPHCHC into SHCHC and pyruvate may occur spontaneously but is catalyzed efficiently, at least in some organisms, by MenH (see TIGR03695). 2-oxoglutarate decarboxylase/SHCHC synthase (menD) is a thiamine pyrophosphate enzyme involved in menaquinone biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone]	432
213512	TIGR00174	miaA	tRNA dimethylallyltransferase. Alternate names include delta(2)-isopentenylpyrophosphate transferase, IPP transferase, 2-methylthio-N6-isopentyladenosine tRNA modification enzyme. Catalyzes the first step in the modification of an adenosine near the anticodon to 2-methylthio-N6-isopentyladenosine. Understanding of substrate specificity has changed. [Protein synthesis, tRNA and rRNA base modification]	287
272942	TIGR00175	mito_nad_idh	isocitrate dehydrogenase, NAD-dependent, mitochondrial type. Several NAD- or NADP-dependent dehydrogenases, including 3-isopropylmalate dehydrogenase, tartrate dehydrogenase, and the multimeric forms of isocitrate dehydrogenase, share a nucleotide binding domain unrelated to that of lactate dehydrogenase and its homologs. These enzymes dehydrogenate their substrates at a H-C-OH site adjacent to a H-C-COOH site; the latter carbon, now adjacent to a carbonyl group, readily decarboxylates. Mitochondrial NAD-dependent isocitrate dehydrogenases (IDH) resemble prokaryotic NADP-dependent IDH and 3-isopropylmalate dehydrogenase (an NAD-dependent enzyme) more closely than they resemble eukaryotic NADP-dependent IDH. The mitochondrial NAD-dependent isocitrate dehydrogenase is believed to be an alpha(2)-beta-gamma heterotetramer. All subunits are homologous and found by this model. The NADP-dependent IDH of Thermus aquaticus thermophilus strain HB8 resembles these NAD-dependent IDH, except for the residues involved in cofactor specificity, much more closely than it resembles other prokaryotic NADP-dependent IDH, including that of Thermus aquaticus strain YT1. [Energy metabolism, TCA cycle]	333
272943	TIGR00176	mobB	molybdopterin-guanine dinucleotide biosynthesis protein MobB. This molybdenum cofactor biosynthesis enzyme is similar to the urease accessory protein UreG and to the hydrogenase accessory protein HypB, both GTP hydrolases involved in loading nickel into the metallocenters of their respective target enzymes. [Biosynthesis of cofactors, prosthetic groups, and carriers, Molybdopterin]	155
272944	TIGR00177	molyb_syn	molybdenum cofactor synthesis domain. The Drosophila protein cinnamon, the Arabidopsis protein cnx1, and rat protein gephyrin each have one domain like MoeA and one like MoaB and Mog. These domains are, however, distantly related to each other, as captured by this model. Gephyrin is unusual in that it seems to be a tubulin-binding neuroprotein involved in the clustering of both glycine receptors and GABA receptors, rather than a protein of molybdenum cofactor biosynthesis.	148
129282	TIGR00178	monomer_idh	isocitrate dehydrogenase, NADP-dependent, monomeric type. The monomeric type of isocitrate dehydrogenase has been found so far in a small number of species, including Azotobacter vinelandii, Corynebacterium glutamicum, Rhodomicrobium vannielii, and Neisseria meningitidis. It is NADP-specific. [Energy metabolism, TCA cycle]	741
272945	TIGR00179	murB	UDP-N-acetylenolpyruvoylglucosamine reductase. This model describes MurB, UDP-N-acetylenolpyruvoylglucosamine reductase, which is also called UDP-N-acetylmuramate dehydrogenase. It is part of the pathway for the biosynthesis of the UDP-N-acetylmuramoyl-pentapeptide that is a precursor of bacterial peptidoglycan. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan]	284
272946	TIGR00180	parB_part	ParB/RepB/Spo0J family partition protein. This model represents the most well-conserved core of a set of chromosomal and plasmid partition proteins related to ParB, including Spo0J, RepB, and SopB. Spo0J has been shown to bind a specific DNA sequence that, when introduced into a plasmid, can serve as partition site. Study of RepB, which has nicking-closing activity, suggests that it forms a transient protein-DNA covalent intermediate during the strand transfer reaction.	187
272947	TIGR00181	pepF	oligoendopeptidase F. This family represents the oligoendopeptidase F clade of the larger M3 or thimet (for thiol-dependent metallopeptidase) oligopeptidase family. Lactococcus lactis PepF hydrolyzed peptides of 7 and 17 amino acids with fairly broad specificity. The homolog of lactococcal PepF in group B Streptococcus was named PepB, with the name difference reflecting a difference in species of origin rather than activity; substrate profiles were quite similar. Differences in substrate specificity should be expected in other species. The gene is duplicated in Lactococcus lactis on the plasmid that bears it. A shortened second copy is found in Bacillus subtilis. [Protein fate, Degradation of proteins, peptides, and glycopeptides]	591
129286	TIGR00182	plsX	fatty acid/phospholipid synthesis protein PlsX. This protein of fatty acid/phospholipid biosynthesis, called PlsX after the member in Streptococcus pneumoniae, is proposed to be a phosphate acyltransferase that partners with PlsY (TIGR00023) in a two-step 1-acylglycerol-3-phosphate biosynthesis pathway alternative to the one-step PlsB (EC 2.3.1.15) pathway. [Fatty acid and phospholipid metabolism, Biosynthesis]	322
272948	TIGR00183	prok_nadp_idh	isocitrate dehydrogenase, NADP-dependent, prokaryotic type. Several NAD- or NADP-dependent dehydrogenases, including 3-isopropylmalate dehydrogenase, tartrate dehydrogenase, and the multimeric forms of isocitrate dehydrogenase, share a nucleotide binding domain unrelated to that of lactate dehydrogenase and its homologs. These enzymes dehydrogenate their substrates at a H-C-OH site adjacent to a H-C-COOH site. Prokaryotic NADP-dependent isocitrate dehydrogenases resemble their NAD-dependent counterparts and 3-isopropylmalate dehydrogenase (an NAD-dependent enzyme) more closely than they resemble eukaryotic NADP-dependent isocitrate dehydrogenases. [Energy metabolism, TCA cycle]	416
272949	TIGR00184	purA	adenylosuccinate synthase. Alternate name IMP--aspartate ligase. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis]	425
211559	TIGR00185	tRNA_yibK_trmL	tRNA (cytidine(34)-2'-O)-methyltransferase. TrmL (previously YibK) is responsible for 2'-O-methylation at tRNA(Leu) position 34. [Protein synthesis, tRNA and rRNA base modification]	153
129290	TIGR00186	rRNA_methyl_3	rRNA methylase, putative, group 3. This is part of the trmH (spoU) family of rRNA methylases. [Protein synthesis, tRNA and rRNA base modification]	237
272950	TIGR00187	ribE	riboflavin synthase, alpha subunit. This protein family consists almost entirely of two lumazine-binding domains, described in the family Lum_binding from Pfam. The model generates lower scores against other proteins that also have two lumazine-binding domains, including some involved in bioluminescence. The name ribE was selected, from among alternatives including ribB and ribC, to match the usage in EcoCyc. [Biosynthesis of cofactors, prosthetic groups, and carriers, Riboflavin, FMN, and FAD]	200
211560	TIGR00188	rnpA	ribonuclease P protein component, eubacterial. This peptide is the protein component of a ribonucleoprotein that cleaves the leader sequence from each tRNA precursor to leave the mature 5'-terminus. The catalytic site is in the RNA component, M1 RNA. The yeast mitochondrial RNase P protein component gene RPM2 has no obvious sequence similarity to rnpA, but resembles eukaryotic nuclear RNase P instead. [Transcription, RNA processing]	111
272951	TIGR00189	tesB	acyl-CoA thioesterase II. Function: hydrolyzes a broad range of acyl-CoA thioesters. Physiological function is not known. Subunit: homotetramer. [Fatty acid and phospholipid metabolism, Biosynthesis]	271
129294	TIGR00190	thiC	phosphomethylpyrimidine synthase. The thiC ortholog is designated thiA in Bacillus subtilis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Thiamine]	423
129295	TIGR00191	thrB	homoserine kinase. Homoserine kinase is part of the threonine biosynthetic pathway. Homoserine kinase is a member of the GHMP kinases (Galactokinase, Homoserine kinase, Mevalonate kinase, Phosphomevalonate kinase) and shares with them an amino-terminal domain probably related to ATP binding. P. aeruginosa homoserine kinase seems not to be homologous (see PROSITE:PDOC0054). [Amino acid biosynthesis, Aspartate family]	302
129296	TIGR00192	urease_beta	urease, beta subunit. In a number of species, including B. subtilis, Synechocystis, and Haemophilus influenzae, urease subunits beta and gamma are encoded as separate polypeptides. In Helicobacter pylori UreA and in the fission yeast Schizosaccharomyces pombe, beta subunit-like sequence follows gamma subunit-like sequence in a single chain; the fission yeast protein contains additional C-terminal regions. [Central intermediary metabolism, Nitrogen metabolism]	101
272952	TIGR00193	urease_gam	urease, gamma subunit. In a number of species, including B. subtilis, Synechocystis, and Haemophilus influenzae, urease subunits beta and gamma are encoded as separate polypeptides. In Helicobacter pylori UreA and in the fission yeast Schizosaccharomyces pombe, beta subunit-like sequence follows gamma subunit-like sequence in a single chain; the fission yeast protein contains additional C-terminal regions. Nomenclature for the various subunits of urease in Helicobacter differs from nomenclature in most other species. [Central intermediary metabolism, Nitrogen metabolism]	102
272953	TIGR00194	uvrC	excinuclease ABC, C subunit. This family consists of the DNA repair enzyme UvrC, an ABC excinuclease subunit which interacts with the UvrA/UvrB complex to excise UV-damaged nucleotide segments. [DNA metabolism, DNA replication, recombination, and repair]	574
272954	TIGR00195	exoDNase_III	exodeoxyribonuclease III. The model brings in reverse transcriptases at scores below 50. The model also contains eukaryotic apurinic/apyrimidinic endonucleases, which group in the same family. [DNA metabolism, DNA replication, recombination, and repair]	254
272955	TIGR00196	yjeF_cterm	yjeF C-terminal region, hydroxyethylthiazole kinase-related. E. coli yjeF has full-length orthologs in a number of species, all of unknown function. However, yeast YNL200C is homologous and corresponds to the N-terminal region while yeast YKL151C and B. subtilis yxkO correspond to this C-terminal region only. The present model may hit hydroxyethylthiazole kinase, an enzyme associated with thiamine biosynthesis. [Unknown function, General]	270
272956	TIGR00197	yjeF_nterm	yjeF N-terminal region. The protein region corresponding to this model shows no clear homology to any protein of known function. This model is built on yeast protein YNL200C and the N-terminal regions of E. coli yjeF and its orthologs in various species. The C-terminal region of yjeF and its orthologs shows similarity to hydroxyethylthiazole kinase (thiM) and other enzymes involved in thiamine biosynthesis. Yeast YKL151C and B. subtilis yxkO match the yjeF C-terminal domain but lack this region. [Unknown function, General]	205
272957	TIGR00198	cat_per_HPI	catalase/peroxidase HPI. As catalase, this enzyme catalyzes the dismutation of two molecules of hydrogen peroxide to dioxygen and two molecules of water. As a peroxidase, it uses hydrogen peroxide to oxidize donor compounds and produce water. KatG from E. coli is a homotetramer with two non-covalently associated iron protoheme IX groups per tetramer, but the ortholog from Synechococcus sp. is a homodimer with one protoheme. Important sites (numbered according to E. coli KatG) include heme ligands His-106 and His-267 and active site Trp-318. Note that the translation PID:g296476 from accession X71420 from Rhodobacter capsulatus B10 contains extensive frameshift differences from the rest of the orthologous family. [Cellular processes, Detoxification]	716
129303	TIGR00199	PncC_domain	amidohydrolase, PncC family. CinA is a DNA damage- or competence-inducible protein that is polycistronic with recA in a number of species. Several bacterial species have a protein consisting largely of the C-terminal domain of CinA but lacking the N-terminal domain, including nicotinamide mononucleotide (NMN) deamidase (3.5.1.42) proteins PncC in Shewanella oneidensis and ygaD in E. coli. [DNA metabolism, DNA replication, recombination, and repair]	146
161761	TIGR00200	cinA_nterm	competence/damage-inducible protein CinA N-terminal domain. cinA is a DNA damage- or competence-inducible protein that is polycistronic with recA in a number of species [DNA metabolism, DNA replication, recombination, and repair]	413
272958	TIGR00201	comF	comF family protein. This protein is found in species that do (Bacillus subtilis, Haemophilus influenzae) or do not (E. coli, Borrelia burgdorferi) have described systems for natural transformation with exogenous DNA. It is involved in competence for transformation in Bacillus subtilis. [Cellular processes, DNA transformation]	190
272959	TIGR00202	csrA	carbon storage regulator (csrA). Modulates the expression of genes in the glycogen biosynthesis and gluconeogenesis pathways by accelerating the 5'-to-3' degradation of these transcripts through selective RNA binding. The N-terminal end of the sequence (AA 11-45) contains the KH motif which is characteristic of a set of RNA-binding proteins. [Energy metabolism, Glycolysis/gluconeogenesis, Regulatory functions, RNA interactions]	69
129307	TIGR00203	cydB	cytochrome d oxidase, subunit II (cydB). Part of a two-component cytochrome d terminal complex. Terminal reaction in the aerobic respiratory chain. [Energy metabolism, Electron transport]	378
129308	TIGR00204	dxs	1-deoxy-D-xylulose-5-phosphate synthase. DXP synthase is a thiamine diphosphate-dependent enzyme related to transketolase and the pyruvate dehydrogenase E1-beta subunit. By an acyloin condensation of pyruvate with glyceraldehyde 3-phosphate, it produces 1-deoxy-D-xylulose 5-phosphate, a precursor of thiamine diphosphate (TPP), pyridoxal phosphate, and the isoprenoid building block isopentenyl diphosphate (IPP). [Biosynthesis of cofactors, prosthetic groups, and carriers, Other, Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridoxine, Biosynthesis of cofactors, prosthetic groups, and carriers, Thiamine]	617
272960	TIGR00205	fliE	flagellar hook-basal body complex protein FliE. fliE is a component of the flagellar hook-basal body complex located possibly at (MS-ring)-rod junction. [Cellular processes, Chemotaxis and motility]	108
129310	TIGR00206	fliF	flagellar basal-body M-ring protein/flagellar hook-basal body protein (fliF). Component of the M (cytoplasmic associated) ring, one of four rings (L,P,S,M) which make up the flagellar hook-basal body which is a major portion of the flagellar organelle. Although the basic structure of the flagella appears to be similar for all bacteria, additional rings and structures surrounding the basal body have been observed for some bacteria (eg Vibrio cholerae and Treponema pallidum). [Cellular processes, Chemotaxis and motility]	555
272961	TIGR00207	fliG	flagellar motor switch protein FliG. The fliG protein along with fliM and fliN interact to form the switch complex of the bacterial flagellar motor located at the base of the basal body. This complex interacts with chemotaxis proteins (e.g. CheY). In addition the complex interacts with other components of the motor that determine the direction of flagellar rotation. The model contains putative members of the fliG family at scores of less than 100 from Agrobacterium radiobacter and Sinorhizobium meliloti as well as fliG-like genes from Treponema pallidum and Borrelia burgdorferi. That is why the suggested cutoff is set at 20 but was set at 100 to construct the family. [Cellular processes, Chemotaxis and motility]	338
188033	TIGR00208	fliS	flagellar biosynthetic protein FliS. The function of this protein in flagellar biosynthesis is unknown, but appears to be regulatory. The member of this family in Vibrio parahaemolyticus is designated FlaJ (creating a synonym for FliS) and was shown essential for flagellin biosynthesis. [Cellular processes, Chemotaxis and motility]	124
129313	TIGR00209	galT_1	galactose-1-phosphate uridylyltransferase, family 1. This enzyme is involved in glucose and galactose interconversion. This model describes one of two extremely distantly related branches of the model pfam01087. [Energy metabolism, Sugars]	347
129314	TIGR00210	gltS	sodium--glutamate symport carrier (gltS). [Transport and binding proteins, Amino acids, peptides and amines]	398
272962	TIGR00211	glyS	glycyl-tRNA synthetase, tetrameric type, beta subunit. The glycyl-tRNA synthetases differ even among the eubacteria in oligomeric structure. In Escherichia coli and most others, it is a heterodimer of two alpha chains and two beta chains, encoded by tandem genes. The genes are similar, but fused, in Chlamydia trachomatis. By contrast, the glycyl-tRNA synthetases of Thermus thermophilus and of archaea and eukaryotes differ considerably; they are homodimeric, mutually similar, and not detected by this model. [Protein synthesis, tRNA aminoacylation]	691
272963	TIGR00212	hemC	hydroxymethylbilane synthase. This enzyme functions in heme and porphyrin biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	292
129317	TIGR00213	GmhB_yaeD	D,D-heptose 1,7-bisphosphate phosphatase. This family of proteins formerly designated yaeD resembles the histidinol phosphatase domain of the bifunctional protein HisB. The member from E. coli has been characterized as D,D-heptose 1,7-bisphosphate phosphatase, GmhB, involved in inner core LPS assembly. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	176
272964	TIGR00214	lipB	lipoate-protein ligase B. Involved in lipoate biosynthesis as the main determinant of the lipoyl-protein ligase activity required for lipoylation of enzymes such as alpha-ketoacid dehydrogenases. Involved in activation and re-activation (following denaturation) of lipoyl-protein ligases (calcium ion-dependent process). [Protein fate, Protein modification and repair]	184
129319	TIGR00215	lpxB	lipid-A-disaccharide synthase. Lipid-A precursor biosynthesis producing lipid A disaccharide in a condensation reaction. Transcribed as part of an operon including lpxA. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	385
272965	TIGR00216	ispH_lytB	(E)-4-hydroxy-3-methyl-but-2-enyl pyrophosphate reductase (IPP and DMAPP forming). The IspH protein (previously designated LytB) has now been recognized as the last enzyme in the biosynthesis of isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP). Escherichia coli LytB protein had been found to regulate the activity of RelA (guanosine 3',5'-bispyrophosphate synthetase I), which in turn controls the level of a regulatory metabolite. It is involved in penicillin tolerance and the stringent response. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	282
129321	TIGR00217	malQ	4-alpha-glucanotransferase. This enzyme is known as amylomaltase and disproportionating enzyme. [Energy metabolism, Biosynthesis and degradation of polysaccharides]	513
272966	TIGR00218	manA	mannose-6-phosphate isomerase, class I. The names phosphomannose isomerase and mannose-6-phosphate isomerase are synonymous. This family contains two rather deeply branched groups. One group contains an experimentally determined phosphomannose isomerase of Streptococcus mutans as well as three uncharacterized paralogous proteins of Bacillus subtilis, all at more than 50 % identity to each other, plus a more distant homolog from Archaeoglobus fulgidus. The other group contains members from E. coli, budding yeast, Borrelia burgdorferi, etc. [Energy metabolism, Sugars]	302
129323	TIGR00219	mreC	rod shape-determining protein MreC. MreC (murein formation C) is involved in the rod shape determination in E. coli, and more generally in cell shape determination of bacteria whether or not they are rod-shaped. Cells defective in MreC are round. Species with MreC include many of the Proteobacteria, Gram-positives, and spirochetes. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan]	283
272967	TIGR00220	mscL	large conductance mechanosensitive channel protein. Protein encodes a channel which opens in response to a membrane stretch force. Probably serves as an osmotic gauge. Carboxy terminus tends to be more divergent across species with a high degree of sequence conservation found at the N-terminus. [Cellular processes, Adaptations to atypical conditions]	127
272968	TIGR00221	nagA	N-acetylglucosamine-6-phosphate deacetylase. [Central intermediary metabolism, Amino sugars]	380
272969	TIGR00222	panB	3-methyl-2-oxobutanoate hydroxymethyltransferase. Members of this family are 3-methyl-2-oxobutanoate hydroxymethyltransferase, the first enzyme of the pantothenate biosynthesis pathway. An alternate name is ketopantoate hydroxymethyltransferase. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pantothenate and coenzyme A]	263
129327	TIGR00223	panD	L-aspartate-alpha-decarboxylase. Members of this family are aspartate 1-decarboxylase, the enzyme that makes beta-alanine and CO2 from aspartate. Beta-alanine is then used to make the vitamin pantothenate, from which coenzyme A is made. Aspartate 1-decarboxylase is synthesized as a proenzyme, then cleaved to an alpha (C-terminal) and beta (N-terminal) subunit with a pyruvoyl group. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pantothenate and coenzyme A]	126
161774	TIGR00224	pckA	phosphoenolpyruvate carboxykinase (ATP). Involved in the gluconeogenesis pathway. It converts oxaloacetic acid to phosphoenolpyruvate using ATP. Enzyme is a monomer. The reaction is also catalysed by phosphoenolpyruvate carboxykinase (GTP) (EC 4.1.1.32) using GTP instead of ATP, described in PROSITE:PDOC00421 [Energy metabolism, Glycolysis/gluconeogenesis]	532
272970	TIGR00225	prc	C-terminal peptidase (prc). A C-terminal peptidase with different substrates in different species, including processing of the D1 protein of the photosystem II reaction center in higher plants and cleavage of a peptide of 11 residues from the precursor form of penicillin-binding protein in E. coli. E. coli and H. influenzae have the most distal branch of the tree, and their proteins have an N-terminal 200 amino acids that show no homology to other proteins in the database. [Protein fate, Degradation of proteins, peptides, and glycopeptides, Protein fate, Protein modification and repair]	334
129330	TIGR00227	ribD_Cterm	riboflavin-specific deaminase C-terminal domain. Eubacterial riboflavin-specific deaminases have a zinc-binding domain recognized by the dCMP_cyt_deam model toward the N-terminus and this domain toward the C-terminus. Yeast HTP reductase, a riboflavin-biosynthetic enzyme, and several archaeal proteins believed related to riboflavin biosynthesis consist only of this domain and lack the dCMP_cyt_deam domain.	216
129331	TIGR00228	ruvC	crossover junction endodeoxyribonuclease RuvC. Endonuclease that resolves Holliday junction intermediates in genetic recombination. The active form of the protein is a dimer. Structural studies reveal that the catalytic center, comprised of four acidic residues, lies at the bottom of a cleft that fits a DNA duplex. The model hits a single Synechocystis PCC6803 protein at a score of 30, below the trusted cutoff, that appears orthologous and may act as authentic RuvC. [DNA metabolism, DNA replication, recombination, and repair]	156
272971	TIGR00229	sensory_box	PAS domain S-box. The PAS domain was previously described. This sensory box, or S-box domain occupies the central portion of the PAS domain but is more widely distributed. It is often tandemly repeated. Known prosthetic groups bound in the S-box domain include heme in the oxygen sensor FixL, FAD in the redox potential sensor NifL, and a 4-hydroxycinnamyl chromophore in photoactive yellow protein. Proteins containing the domain often contain other regulatory domains such as response regulator or sensor histidine kinase domains. Other S-box proteins include phytochromes and the aryl hydrocarbon receptor nuclear translocator. [Regulatory functions, Small molecule interactions]	124
272972	TIGR00230	sfsA	sugar fermentation stimulation protein. Probable regulatory factor involved in maltose metabolism; contains a putative DNA-binding domain. Isolated as a gene which enabled E. coli strain MK2001 to use maltose. [Energy metabolism, Sugars, Regulatory functions, Other]	234
272973	TIGR00231	small_GTP	small GTP-binding protein domain. Proteins with a small GTP-binding domain recognized by this model include Ras, RhoA, Rab11, translation elongation factor G, translation initiation factor IF-2, tetracycline resistance protein TetM, CDC42, Era, ADP-ribosylation factors, tdhF, and many others. In some proteins the domain occurs more than once. This model recognizes a large number of small GTP-binding proteins and related domains in larger proteins. Note that the alpha chains of heterotrimeric G proteins are larger proteins in which the NKXD motif is separated from the GxxxxGK[ST] motif (P-loop) by a long insert and are not easily detected by this model. [Unknown function, General]	162
272974	TIGR00232	tktlase_bact	transketolase, bacterial and yeast. This model is designed to capture orthologs of bacterial transketolases. The group includes two from the yeast Saccharomyces cerevisiae but excludes dihydroxyacetone synthases (formaldehyde transketolases) from various yeasts and the even more distant mammalian transketolases. Among the family of thiamine diphosphate-dependent enzymes that includes transketolases, dihydroxyacetone synthases, pyruvate dehydrogenase E1-beta subunits, and deoxyxylulose-5-phosphate synthases, mammalian and bacterial transketolases seem not to be orthologous. [Energy metabolism, Pentose phosphate pathway]	653
272975	TIGR00233	trpS	tryptophanyl-tRNA synthetase. This model represents tryptophanyl-tRNA synthetase. Some members of the family have a pfam00458 domain amino-terminal to the region described by this model. [Protein synthesis, tRNA aminoacylation]	327
272976	TIGR00234	tyrS	tyrosyl-tRNA synthetase. This tyrosyl-tRNA synthetase model starts picking up tryptophanyl-tRNA synthetases at scores of 0 and below. The proteins found by this model have a deep split between two groups. One group contains bacterial and organellar eukaryotic examples. The other contains archaeal and cytosolic eukaryotic examples. [Protein synthesis, tRNA aminoacylation]	378
272977	TIGR00235	udk	uridine kinase. Model contains a number of longer eukaryotic proteins and starts bringing in phosphoribulokinase hits at scores of 160 and below [Purines, pyrimidines, nucleosides, and nucleotides, Salvage of nucleosides and nucleotides]	207
272978	TIGR00236	wecB	UDP-N-acetylglucosamine 2-epimerase. This cytosolic enzyme converts UDP-N-acetyl-D-glucosamine to UDP-N-acetyl-D-mannosamine. In E. coli, this is the first step in the pathway of enterobacterial common antigen biosynthesis. Members of this orthology group have many gene symbols, often reflecting the overall activity of the pathway and/or operon that includes it. Symbols include epsC (exopolysaccharide C) in Burkholderia solanacearum, cap8P (type 8 capsule P) in Staphylococcus aureus, and nfrC in an older designation based on the effects of deletion on phage N4 adsorption. Epimerase activity was also demonstrated in a bifunctional rat enzyme, for which the N-terminal domain appears to be orthologous. The set of proteins found above the suggested cutoff includes E. coli WecB in one of two deeply branched clusters and the rat UDP-N-acetylglucosamine 2-epimerase domain in the other. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	365
272979	TIGR00237	xseA	exodeoxyribonuclease VII, large subunit. This family consists of exodeoxyribonuclease VII, large subunit XseA which catalyses exonucleolytic cleavage in either the 5'->3' or 3'->5' direction to yield 5'-phosphomononucleotides. Exonuclease VII consists of one large subunit and four small subunits. [DNA metabolism, Degradation of DNA]	389
272980	TIGR00238	TIGR00238	KamA family protein. This model represents essentially the whole of E. coli YjeK and of some of its apparent orthologs. YodO in Bacillus subtilis, a family member which is a longer protein by an additional 100 residues, is characterized as a lysine 2,3-aminomutase with iron, sulphide and pyridoxal 5'-phosphate groups. The homolog MJ0634 from M. jannaschii is preceded by nearly 200 C-terminal residues. This family shows similarity to molybdenum cofactor biosynthesis protein MoaA and related proteins. Note that the E. coli homolog was expressed in E. coli and purified and found not to display lysine 2,3-aminomutase activity. Active site residues are found in the 100-residue extension in B. subtilis. Name changed to KamA family protein. [Cellular processes, Adaptations to atypical conditions]	331
161785	TIGR00239	2oxo_dh_E1	2-oxoglutarate dehydrogenase, E1 component. The 2-oxoglutarate dehydrogenase complex consists of this thiamine pyrophosphate-binding subunit (E1), dihydrolipoamide succinyltransferase (E2), and lipoamide dehydrogenase (E3). The E1 ortholog from Corynebacterium glutamicum is unusual in having an N-terminal extension that resembles the dihydrolipoamide succinyltransferase (E2) component of 2-oxoglutarate dehydrogenase. [Energy metabolism, TCA cycle]	929
272981	TIGR00240	ATCase_reg	aspartate carbamoyltransferase, regulatory subunit. The presence of this regulatory subunit allows feedback inhibition by CTP on aspartate carbamoyltransferase, the first step in the synthesis of CTP from aspartate. In many species, this regulatory subunit is not present. In Thermotoga maritima, the catalytic and regulatory subunits are encoded by a fused gene and the regulatory region has enough sequence differences to score below the trusted cutoff. [Purines, pyrimidines, nucleosides, and nucleotides, Pyrimidine ribonucleotide biosynthesis]	150
129344	TIGR00241	CoA_E_activ	CoA-substrate-specific enzyme activase, putative. This domain is found in a set of closely related proteins including the (R)-2-hydroxyglutaryl-CoA dehydratase activase of Acidaminococcus fermentans, in longer proteins from M. jannaschii and M. thermoautotrophicum that share an additional N-terminal domain, in a protein described as a subunit of the benzoyl-CoA reductase of Rhodopseudomonas palustris, and in two repeats of an uncharacterized protein of Aquifex aeolicus. This domain may be involved in generating or regenerating the active sites of enzymes related to (R)-2-hydroxyglutaryl-CoA dehydratase and benzoyl-CoA reductase.	248
129345	TIGR00242	TIGR00242	division/cell wall cluster transcriptional repressor MraZ. Members of this family contain two tandem copies of a domain described by pfam02381. This protein often is found with other genes of the dcw (division cell wall) gene cluster, including mraW, ftsI, murE, murF, ftsW, murG, etc. Recent work shows MraW in E. coli binds an upstream region with three tandem GTGGG repeats separated by 5bp spacers. We find similar sites in other species. [Cellular processes, Cell division, Regulatory functions, DNA interactions]	142
161787	TIGR00243	Dxr	1-deoxy-D-xylulose 5-phosphate reductoisomerase. 1-deoxy-D-xylulose 5-phosphate is converted to 2-C-methyl-D-erythritol 4-phosphate in the presence of NADPH. It is involved in the synthesis of isopentenyl diphosphate (IPP), a basic building block in isoprenoid, thiamin, and pyridoxal biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	389
129347	TIGR00244	TIGR00244	transcriptional regulator NrdR. Members of this almost entirely bacterial family contain an ATP cone domain (pfam03477). There is never more than one member per genome. Common gene symbols given include nrdR, ybaD, ribX and ytcG. The member from Streptomyces coelicolor is found upstream in the operon of the class II oxygen-independent ribonucleotide reductase gene nrdJ and was shown to repress nrdJ expression. Many members of this family are found near genes for riboflavin biosynthesis in Gram-negative bacteria, suggesting a role in that pathway. However, a phylogenetic profiling study associates members of this family with the presence of a palindromic signal with consensus acaCwAtATaTwGtgt, termed the NrdR-box, an upstream element for most operons for ribonucleotide reductase of all three classes in bacterial genomes. [Regulatory functions, DNA interactions]	147
272982	TIGR00245	TIGR00245	TIGR00245 family protein. [Hypothetical proteins, Conserved]	248
129349	TIGR00246	tRNA_RlmH_YbeA	rRNA large subunit m3Psi methyltransferase RlmH. This protein, in the SPOUT methyltransferase family, previously designated YbeA in E. coli, was shown to be responsible for a further modification, a methylation, to a pseudouridine base in ribosomal large subunit RNA. [Protein synthesis, tRNA and rRNA base modification]	153
272983	TIGR00247	TIGR00247	conserved hypothetical protein, YceG family. This uncharacterized protein family, found in three of four microbial genomes, virtually always once per genome, includes YceG from Escherichia coli. This protein is encoded next to PabC, 4-amino-4-deoxychorismate lyase, in E. coli and numerous other proteobacteria, but that proximity is not conserved in other lineages. Numerous members of this family have been misannotated as aminodeoxychorismate lyase, apparently because of proximity to PabC. [Hypothetical proteins, Conserved]	342
129351	TIGR00249	sixA	phosphohistidine phosphatase SixA. [Regulatory functions, Protein interactions]	152
129352	TIGR00250	RNAse_H_YqgF	putative transcription antitermination factor YqgF. This protein family, which exhibits an RNAse H fold in crystal structure, has been proposed as a putative Holliday junction resolvase, an alternate to RuvC. [Unknown function, General]	130
129353	TIGR00251	TIGR00251	TIGR00251 family protein. [Hypothetical proteins, Conserved]	87
129354	TIGR00252	TIGR00252	TIGR00252 family protein. The scores for Mycobacterium tuberculosis and Treponema pallidum are low considering the alignment. [Hypothetical proteins, Conserved]	119
129355	TIGR00253	RNA_bind_YhbY	putative RNA-binding protein, YhbY family. A combination of crystal structure, molecular modeling, and bioinformatic data together suggest that members of this family, including YhbY of E. coli, are RNA binding proteins. [Unknown function, General]	95
272984	TIGR00254	GGDEF	diguanylate cyclase (GGDEF) domain. The GGDEF domain is named for the motif GG[DE]EF shared by many proteins carrying the domain. There is evidence that the domain has diguanylate cyclase activity. Several proteins carrying this domain also carry domains with functions relating to environmental sensing. These include PleD, a response regulator protein involved in the swarmer-to-stalked cell transition in Caulobacter crescentus, and FixL, a heme-containing oxygen sensor protein. [Regulatory functions, Small molecule interactions, Signal transduction, Other]	165
129357	TIGR00255	TIGR00255	TIGR00255 family protein. The apparent ortholog from Aquifex aeolicus as reported is split into two consecutive reading frames. [Hypothetical proteins, Conserved]	291
129358	TIGR00256	TIGR00256	D-tyrosyl-tRNA(Tyr) deacylase. This homodimeric enzyme appears able to cleave any D-amino acid (and glycine, which does not have distinct D/L forms) from charged tRNA. The name reflects characterization with respect to D-Tyr on tRNA(Tyr) as established in the literature, but substrate specificity seems much broader. [Protein synthesis, tRNA aminoacylation]	145
129359	TIGR00257	IMPACT_YIGZ	uncharacterized protein, YigZ family. This uncharacterized protein family includes YigZ, which has been crystallized, from E. coli. YigZ is homologous to the protein product of the mouse IMPACT gene. Crystallography shows a two-domain structure, and the C-terminal domain is suggested to bind nucleic acids. The function is unknown. Note that the ortholog from E. coli was shown fused to the pepQ gene in GenBank entry X54687. This caused occasional misidentification of this protein as pepQ; this family is found in a number of species that lack pepQ. [Unknown function, General]	204
272985	TIGR00258	TIGR00258	inosine/xanthosine triphosphatase. [Purines, pyrimidines, nucleosides, and nucleotides, Other]	163
129361	TIGR00259	thylakoid_BtpA	membrane complex biogenesis protein, BtpA family. Members of this family are found in C. elegans, Synechocystis sp., E. coli, and several of the Archaea. Members in Cyanobacteria have been shown to play a role in protein complex biogenesis, and designated BtpA (biogenesis of thylakoid protein). Homologs in non-photosynthetic species, where thylakoid intracytoplasmic membranes are lacking, are likely to act elsewhere in membrane protein biogenesis. [Protein fate, Protein folding and stabilization]	257
272986	TIGR00260	thrC	threonine synthase. Involved in threonine biosynthesis, it catalyses the reaction O-PHOSPHO-L-HOMOSERINE + H(2)O = L-THREONINE + ORTHOPHOSPHATE using pyridoxal phosphate as a cofactor. The enzyme is distantly related to the serine/threonine dehydratases, which are also pyridoxal-phosphate dependent enzymes. The pyridoxal-phosphate binding site is a Lys (K) residue present at residue 70 of the model. [Amino acid biosynthesis, Aspartate family]	327
129363	TIGR00261	traB	pheromone shutdown-related protein TraB. traB is a plasmid encoded gene that functions in the shutdown of the peptide sex pheromone cPD1 which is produced by the plasmid free recipient cell prior to conjugative transfer in Enterococcus faecalis. Once the recipient acquires the plasmid, production of cPD1 is shut down. The gene product may play another role in the other species in the family. [Unknown function, General]	380
161792	TIGR00262	trpA	tryptophan synthase, alpha subunit. Tryptophan synthase catalyzes the last step in the biosynthesis of tryptophan. The alpha chain is responsible for the aldol cleavage of indoleglycerol phosphate to indole and glyceraldehyde 3-phosphate. In bacteria and plants each domain is found on a separate subunit (alpha and beta chains), while in fungi the two domains are fused together on a single multifunctional protein. The signature pattern for trpA contains three conserved acidic residues. [LIVM]-E-[LIVM]-G-x(2)-[FYC]-[ST]-[DE]-[PA]-[LIVMY]-[AGLI]-[DE]-G and this is located between residues 43-58 of the model. The Sulfolobus solfataricus trpA is known to be quite divergent from other known trpA sequences. [Amino acid biosynthesis, Aromatic amino acid family]	256
272987	TIGR00263	trpB	tryptophan synthase, beta subunit. Tryptophan synthase catalyzes the last step in the biosynthesis of tryptophan. The beta chain contains the functional domain for the synthesis of tryptophan from indole and serine. The enzyme requires pyridoxal-phosphate as a cofactor. The pyridoxal-P attachment site is contained within the conserved region [LIVM]-x-H-x-G-[STA]-H-K-x-N] [K is the pyridoxal-P attachment site] which is present between residues 90-100 of the model. [Amino acid biosynthesis, Aromatic amino acid family]	385
272988	TIGR00264	TIGR00264	alpha-NAC-related protein. This hypothetical protein is found so far only in the Archaea. Its C-terminal domain of about 40 amino acids is homologous to the C-termini of the nascent polypeptide-associated complex alpha chain (alpha-NAC) and its yeast ortholog Egd2p and to the huntingtin-interacting protein HYPK. It shows weaker similarity, possibly through shared structural constraints rather than through homology, with the amino-terminal domain of elongation factor Ts. Alpha-NAC plays a role in preventing nascent polypeptides from binding inappropriately to membrane-targeting apparatus during translation, but is also active as a transcription regulator. [Unknown function, General]	116
272989	TIGR00266	TIGR00266	TIGR00266 family protein. [Hypothetical proteins, Conserved]	222
129368	TIGR00267	TIGR00267	TIGR00267 family protein. This family of uncharacterized proteins shows a low level of similarity (possibly meaningful) to the predicted membrane protein YLR220W, which is involved in calcium homeostasis. It shows no similarity to any other characterized protein. This family is represented in three of the first four completed archaeal genomes, with two members in A. fulgidus. [Hypothetical proteins, Conserved]	169
129369	TIGR00268	TIGR00268	TIGR00268 family protein. The N-terminal region of the model shows similarity to Argininosuccinate synthase proteins using PSI-blast and using the recognize protein identification server. [Hypothetical proteins, Conserved]	252
129370	TIGR00269	TIGR00269	TIGR00269 family protein. [Hypothetical proteins, Conserved]	104
129371	TIGR00270	TIGR00270	TIGR00270 family protein. [Hypothetical proteins, Conserved]	154
129372	TIGR00271	TIGR00271	uncharacterized hydrophobic domain. This domain is in a family of archaeal proteins that includes AF0785 of Archaeoglobus fulgidus and in several eubacterial proteins, including the much longer protein sll1151 from Synechocystis PCC6803.	175
272990	TIGR00272	DPH2	diphthamide biosynthesis protein 2. This protein has been shown in Saccharomyces cerevisiae to be one of several required for the modification of a particular histidine residue of translation elongation factor 2 to diphthamide. This modified site can then become the target for ADP-ribosylation by diphtheria toxin. [Protein fate, Protein modification and repair]	496
129374	TIGR00273	TIGR00273	iron-sulfur cluster-binding protein. Members of this family have a perfect 4Fe-4S binding motif C-x(2)-C-x(2)-C-x(3)-CP followed by either a perfect or imperfect (the first Cys replaced by Ser) second copy. Members probably bind two 4Fe-4S iron-sulfur clusters. [Energy metabolism, Electron transport]	432
272991	TIGR00274	TIGR00274	N-acetylmuramic acid 6-phosphate etherase. This protein, MurQ, is involved in recycling components of the bacterial murein sacculus turned over during cell growth. The cell wall metabolite anhydro-N-acetylmuramic acid (anhMurNAc) is converted by a kinase, AnmK, to MurNAc-phosphate, then converted to N-acetylglucosamine-phosphate by this etherase, called MurQ. This family of proteins is similar to the C-terminal half of a number of vertebrate glucokinase regulator proteins and contains a Prosite pattern which is shared by this group of proteins in a region of local similarity. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan]	291
272992	TIGR00275	TIGR00275	flavoprotein, HI0933 family. The model when searched with a partial length search brings in proteins with a dinucleotide-binding motif (Rossmann fold) over the initial 40 residues of the model, including oxidoreductases and dehydrogenases. Partially characterized members include an FAD-binding protein from Bacillus cereus and flavoprotein HI0933 from Haemophilus influenzae. [Unknown function, Enzymes of unknown specificity]	400
272993	TIGR00276	TIGR00276	epoxyqueuosine reductase. This model was rebuilt to exclude archaeal homologs, now that there is new information that bacterial members are epoxyqueuosine reductase, QueG, involved in queuosine biosynthesis for tRNA maturation. [Protein synthesis, tRNA and rRNA base modification]	337
272994	TIGR00277	HDIG	HDIG domain. This domain is found in a few known nucleotidyltransferases and in a large number of uncharacterized proteins. It contains four widely separated His residues, the second of which is part of an invariant dipeptide His-Asp in a region matched approximately by the motif HDIG. This model may annotate homologous domains in which one or more of the His residues is conserved but misaligned, and some probable false-positive hits.	80
272995	TIGR00278	TIGR00278	putative membrane protein insertion efficiency factor. This model describes a family, YidD, of small, non-essential proteins now suggested to improve YidC-dependent inner membrane protein insertion. A related protein is found in the temperate phage HP1 of Haemophilus influenzae. Annotation of some members of this family as hemolysins appears to represent propagation from an unpublished GenBank submission, L36462, attributed to Aeromonas hydrophila but a close match to E. coli. [Hypothetical proteins, Conserved]	75
129380	TIGR00279	uL16_euk_arch	ribosomal protein uL16(L10.e), eukaryotic/archaeal form. This model finds the archaeal and eukaryotic forms of ribosomal protein uL16, previously L10.e. The protein is encoded by multiple loci in some eukaryotes and has been assigned a number of extra-ribosomal functions, some of which will require re-evaluation in the context of identification as a ribosomal protein. L10.e is distantly related to eubacterial ribosomal protein L16. [Protein synthesis, Ribosomal proteins: synthesis and modification]	172
272996	TIGR00280	eL43_euk_arch	ribosomal protein eL43. This model finds eukaryotic ribosomal protein eL43 (previously L37a) and its archaeal orthologs. The nomenclature is tricky because eukaryotes have proteins called both L37 and L37a. [Protein synthesis, Ribosomal proteins: synthesis and modification]	92
213521	TIGR00281	TIGR00281	segregation and condensation protein B. Shown to be required for chromosome segregation and condensation in B. subtilis. [Cellular processes, Cell division, DNA metabolism, Chromosome-associated proteins]	186
161802	TIGR00282	TIGR00282	metallophosphoesterase, MG_246/BB_0505 family. A member of this family from Mycoplasma pneumoniae has been crystallized and described as a novel phosphatase. [Unknown function, Enzymes of unknown specificity]	266
161803	TIGR00283	arch_pth2	peptidyl-tRNA hydrolase. This model describes an archaeal/eukaryotic form of peptidyl-tRNA hydrolase. Most bacterial forms are described by TIGR00447. [Protein synthesis, Other]	115
272997	TIGR00284	TIGR00284	dihydropteroate synthase-related protein. This protein has been found so far only in the Archaea, and in particular in those archaea that lack a bacterial-type dihydropteroate synthase. The central region of this protein shows considerable homology to the amino-terminal half of dihydropteroate synthases, while the carboxyl-terminal region shows homology to the small, uncharacterized protein slr0651 of Synechocystis PCC6803. [Unknown function, General]	499
129386	TIGR00285	TIGR00285	DNA-binding protein Alba. Alba has been shown to bind DNA and affect DNA supercoiling in a temperature dependent manner. It is regulated by acetylation (alba = acetylation lowers binding affinity) by the Sir2 protein. Alba is proposed to play a role in establishment or maintenance of chromatin architecture and thereby in transcription repression. This protein appears so far only in the Archaea, but may be universal there. There is a single member in three of the first four completed archaeal genomes, and a second copy in A. fulgidus. In Sulfolobus shibatae there is a tandem second copy that is poorly conserved and scores below the trusted cutoff; all other members of the family are conserved at greater than 50 % pairwise identity. [DNA metabolism, Chromosome-associated proteins]	87
211565	TIGR00286	TIGR00286	arginine decarboxylase, pyruvoyl-dependent. The three copies present in Archaeoglobus fulgidus, one of which is only half-length and excluded from the seed alignment, are very closely related and clearly arose by duplication after the separation from well-studied species. The other completed archaeal genomes each contain a single copy. The lone, weak (below trusted cutoff) hit to a non-archaeal sequence is to an uncharacterized protein of Chlamydia, with the greatest similarity in the amino-terminal half of the model. [Central intermediary metabolism, Polyamine biosynthesis, Energy metabolism, Amino acids and amines]	152
272998	TIGR00287	cas1	CRISPR-associated endonuclease Cas1. This model identifies CRISPR-associated protein Cas1, the most universal CRISPR system protein. CRISPR is an acronym for Clustered Regularly Interspaced Short Palindromic Repeats, a system for heritable host defense by prokaryotic cells against phage and other foreign DNA. Cas1 is a metal-dependent DNA-specific endonuclease.	323
272999	TIGR00288	TIGR00288	TIGR00288 family protein. This family of orthologs is restricted to but universal among the completed archaeal genomes so far. Eubacterial proteins showing at least local homology include slr1870 from Synechocystis PCC6803 and two proteins from Aquifex aeolicus, none of which is characterized. [Hypothetical proteins, Conserved]	160
129390	TIGR00289	TIGR00289	TIGR00289 family protein. Homologous proteins related to MJ0570 of Methanococcus jannaschii include both the apparent orthologs found by this model above the trusted cutoff, the much longer protein YLR143W from Saccharomyces cerevisiae, and second homologous proteins from Archaeoglobus fulgidus and Pyrococcus horikoshii that appear to represent a second orthologous group. [Hypothetical proteins, Conserved]	222
273000	TIGR00290	MJ0570_dom	MJ0570-related uncharacterized domain. Proteins with this uncharacterized domain include two apparent ortholog families in the Archaea, one of which is universal among the first four completed archaeal genomes, and YLR143W, a much longer protein from Saccharomyces cerevisiae. The domain comprises the full length of the archaeal proteins and the first third of the yeast protein.	223
129392	TIGR00291	RNA_SBDS	rRNA metabolism protein, SBDS family. This protein family, possibly universal in both archaea and eukaryotes, appears to be involved in (ribosomal) RNA metabolism. Mutations in the human ortholog are associated with Shwachman-Bodian-Diamond syndrome. [Protein synthesis, Other]	231
273001	TIGR00292	TIGR00292	thiazole biosynthesis enzyme. This enzyme is involved in the biosynthesis of the thiamine precursor thiazole, and is repressed by thiamine. This family includes c-thi1, a Citrus gene induced during natural and ethylene induced fruit maturation and is highly homologous to plant and yeast thi genes involved in thiamine biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Thiamine]	254
129394	TIGR00293	TIGR00293	prefoldin, archaeal alpha subunit/eukaryotic subunit 5. Members of this protein family, rich in coiled coil regions, are molecular chaperones in the class of the prefoldin (GimC) alpha subunit. Prefoldin is a hexamer of two alpha and four beta subunits. This protein appears universal in the archaea but is restricted to Aquifex aeolicus among bacteria so far. Eukaryotes have several related proteins; only prefoldin subunit 5, which appeared the most similar to archaeal prefoldin alpha, is included in this model. This model finds a set of small proteins from the Archaea and from Aquifex aeolicus that may represent two orthologous groups. The proteins are predicted to be mostly coiled coil, and the model may have a significant number of hits to proteins that contain coiled coil regions. [Protein fate, Protein folding and stabilization]	126
129395	TIGR00294	TIGR00294	GTP cyclohydrolase, MptA/FolE2 family. This family includes type I GTP cyclohydrolases involved in methanopterin in archaea (MptA) and de novo tetrahydrofolate biosynthesis in bacteria (FolE2). [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	308
129396	TIGR00295	TIGR00295	TIGR00295 family protein. This set of orthologs is narrowly defined, comprising proteins found in three Archaea but not in Pyrococcus horikoshii. The closest homologs are other archaeal proteins that appear to represent distinct orthologous clusters. [Hypothetical proteins, Conserved]	164
273002	TIGR00296	TIGR00296	uncharacterized protein, PH0010 family. Members of this functionally uncharacterized protein family have been crystallized from Pyrococcus horikoshii, Methanosarcina mazei, and Sulfolobus tokodaii. [Unknown function, General]	200
213522	TIGR00297	TIGR00297	TIGR00297 family protein. [Hypothetical proteins, Conserved]	237
273003	TIGR00298	TIGR00298	2-phosphosulfolactate phosphatase. 2-phosphosulfolactate phosphatase catalyzes the sulfonation of phosphoenolpyruvate to form 2-phospho-3-sulfolactate, the second step in coenzyme M biosynthesis. Coenzyme M is the terminal methyl carrier in methanogenesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other, Energy metabolism, Methanogenesis]	216
129400	TIGR00299	TIGR00299	TIGR00299 family protein. Members of this family are found in the Archaea and in several different bacterial lineages. The function is unknown and the genomic context is not well conserved. [Hypothetical proteins, Conserved]	382
129401	TIGR00300	TIGR00300	TIGR00300 family protein. All members of the family come from genome projects. A partial length search brings in two plant lysine-ketoglutarate reductase/saccharopine dehydrogenase bifunctional enzymes hitting the N-terminal region of the family. [Hypothetical proteins, Conserved]	407
129402	TIGR00302	TIGR00302	phosphoribosylformylglycinamidine synthase, purS protein. In species such as Bacillus subtilis in which FGAM synthetase is split into two ORFs, purL and purQ, this small protein, previously called yexA, is required for FGAM synthetase activity. Although the article does not make it clear whether this is a subunit or an accessory protein, it is encoded as part of the operon, which suggests stoichiometric amounts and therefore a subunit. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis]	80
273004	TIGR00303	TIGR00303	TIGR00303 family protein. All current members of the family are from genome projects. [Hypothetical proteins, Conserved]	331
213523	TIGR00304	TIGR00304	TIGR00304 family protein. The member of this family from Pyrococcus horikoshii scores only 13.91 bits, largely because it is at least 15 residues shorter than other members of this family of small proteins and is penalized for not matching to the N-terminal section of the model. Cutoff scores are set so this hit is between noise and trusted cutoffs. [Hypothetical proteins, Conserved]	77
129405	TIGR00305	TIGR00305	putative toxin-antitoxin system toxin component, PIN family. This uncharacterized protein family, part of the PIN domain superfamily, is restricted to bacteria and archaea. A comprehensive in silico study of toxin-antitoxin systems by Makarova, et al. (2009) finds evidence this family represents the toxin-like component of one class of type 2 toxin-antitoxin systems. [Cellular processes, Other, Transcription, Degradation of RNA]	114
273005	TIGR00306	apgM	phosphoglycerate mutase (2,3-diphosphoglycerate-independent), archaeal form. Experimentally characterized in archaea as 2,3-bisphosphoglycerate-independent phosphoglycerate mutase. This model describes a set of proteins in the Archaea (two each in Methanococcus jannaschii, Methanobacterium thermoautotrophicum, and Archaeoglobus fulgidus) and in Aquifex aeolicus (1 member). [Energy metabolism, Glycolysis/gluconeogenesis]	396
129407	TIGR00307	eS8	ribosomal protein eS8. Archaeal and eukaryotic ribosomal protein S8. This model could easily have been split into two models, one for eukaryotic S8 and one for archaeal S8; eukaryotic forms invariably have an insert of about 80 residues that archaeal forms of S8 do not. [Protein synthesis, Ribosomal proteins: synthesis and modification]	127
273006	TIGR00308	TRM1	tRNA(guanine-26,N2-N2) methyltransferase. This enzyme is responsible for two methylations of a characteristic guanine of most tRNA molecules. The activity has been demonstrated for eukaryotic and archaeal proteins, which are active when expressed in E. coli, a species that lacks this enzyme. At least one Eubacterium, Aquifex aeolicus, has an ortholog, as do all completed archaeal genomes. [Protein synthesis, tRNA and rRNA base modification]	374
129409	TIGR00309	V_ATPase_subD	H(+)-transporting ATP synthase, vacuolar type, subunit D. Although this ATPase can run backwards, using a proton gradient to synthesize ATP, the primary biological role is to acidify some compartment, such as yeast vacuole (a lysosomal homolog) or the interior of a prokaryote. [Transport and binding proteins, Cations and iron carrying compounds]	209
273007	TIGR00310	ZPR1_znf	ZPR1 zinc finger domain. An orthologous protein found once in each of the completed archaeal genomes corresponds to a zinc finger-containing domain repeated as the N-terminal and C-terminal halves of the mouse protein ZPR1. ZPR1 is an experimentally proven zinc-binding protein that binds the tyrosine kinase domain of the epidermal growth factor receptor (EGFR); binding is inhibited by EGF stimulation and tyrosine phosphorylation, and activation by EGF is followed by some redistribution of ZPR1 to the nucleus. By analogy, other proteins with the ZPR1 zinc finger domain may be regulatory proteins that sense protein phosphorylation state and/or participate in signal transduction.	192
129411	TIGR00311	aIF-2beta	translation initiation factor aIF-2, beta subunit, putative. The trusted cutoff is set high enough to select only archaeal members. The suggested cutoff is set to include most eukaryotic members but largely exclude the related eIF-5. [Protein synthesis, Translation factors]	133
273008	TIGR00312	cbiD	cobalamin biosynthesis protein CbiD. This protein has been shown by cloning into E. coli to be required for cobalamin biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	347
129413	TIGR00313	cobQ	cobyric acid synthase CobQ. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	475
129414	TIGR00314	cdhA	CO dehydrogenase/acetyl-CoA synthase complex, epsilon subunit. Acetyl-CoA decarbonylase/synthase (ACDS) is a multienzyme complex. Carbon monoxide dehydrogenase is a synonym. The ACDS complex carries out an unusual reaction involving the reversible cleavage and synthesis of acetyl-CoA in methanogens. The model contains the prosite signature for 4Fe-4S ferredoxins [C-x(2)-C-x(2)-C-x(3)-C-[PEG]] between residues 448-462 of the model. [Energy metabolism, Chemoautotrophy]	784
273009	TIGR00315	cdhB	CO dehydrogenase/acetyl-CoA synthase complex, epsilon subunit. Nomenclature follows the description for Methanosarcina thermophila. The complex is also found in Archaeoglobus fulgidus, not considered a methanogen, but is otherwise generally associated with methanogenesis. [Energy metabolism, Chemoautotrophy]	165
129416	TIGR00316	cdhC	CO dehydrogenase/CO-methylating acetyl-CoA synthase complex, beta subunit. Nomenclature follows the description for Methanosarcina thermophila. The CO-methylating acetyl-CoA synthase is considered the defining enzyme of the Wood-Ljungdahl pathway, used for acetate catabolism by sulfate reducing bacteria but for acetate biosynthesis by acetogenic bacteria such as Moorella thermoacetica (f. Clostridium thermoaceticum). [Energy metabolism, Chemoautotrophy]	458
213524	TIGR00317	cobS	cobalamin 5'-phosphate synthase/cobalamin synthase. cobS is involved with cobalamin biosynthesis in part III of cobalamin biosynthesis. The enzyme catalyzes the reactions adenosylcobinamide-GDP + alpha-ribazole-5'-P = adenosylcobalamin-5'-phosphate + GMP and adenosylcobinamide-GDP + alpha-ribazole = adenosylcobalamin + GMP. The protein product is associated with a large complex of proteins and is induced by cobinamide. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	241
273010	TIGR00318	cyaB	adenylyl cyclase CyaB, putative. The protein CyaB from Aeromonas hydrophila is a second adenylyl cyclase from that species, as demonstrated by complementation in E. coli and by assay of the enzymatic properties of purified recombinant protein. It has no detectable homology to any other protein of known function, and has several unusual properties, including an optimal temperature of 65 degrees and an optimal pH of 9.5. A cluster of uncharacterized archaeal homologs may be orthologous and serve (under certain circumstances) to produce the regulatory metabolite cyclic AMP (cAMP). [Regulatory functions, Small molecule interactions]	174
200008	TIGR00319	desulf_FeS4	desulfoferrodoxin FeS4 iron-binding domain. This domain is found as essentially the full length of desulforedoxin, a 37-residue homodimeric non-heme iron protein. It is also found as the N-terminal domain of desulfoferrodoxin (rbo), a homodimeric non-heme iron protein with 2 Fe atoms per monomer in different oxidation states. This domain binds the ferric rather than the ferrous Fe of desulfoferrodoxin. Neelaredoxin, a monomeric blue non-heme iron protein, lacks this domain. [Energy metabolism, Electron transport]	33
273011	TIGR00320	dfx_rbo	desulfoferrodoxin. The short N-terminal domain contains four conserved Cys for binding of a ferric iron atom, and is homologous to the small protein desulforedoxin; this domain may also be responsible for dimerization. The remainder of the molecule binds a ferrous iron atom and is similar to neelaredoxin, a monomeric blue non-heme iron protein. The homolog from Treponema pallidum scores between the trusted cutoff for orthology and the noise cutoff. Although essentially a full length homolog, it lacks three of the four Cys residues in the N-terminal domain; the domain may have lost ferric binding ability but may have some conserved structural role such as dimerization, or some new function. This protein is described in some articles as rubredoxin oxidoreductase (rbo), and its gene shares an operon with the rubredoxin gene in Desulfovibrio vulgaris Hildenborough. [Energy metabolism, Electron transport]	125
273012	TIGR00321	dhys	deoxyhypusine synthase. Deoxyhypusine synthase is responsible for the first step in creating hypusine. Hypusine is a modified amino acid found in eukaryotes and in archaea in their respective forms of initiation factor 5A. Its presence is confirmed in archaeal genera Pyrococcus, Sulfolobus, Halobacterium, and Haloferax, but in an older report was not detected in Methanococcus voltae (J Biol Chem 1987 Dec 5;262(34):16585-9). This family of apparent orthologs has an unusual UPGMA difference tree, in which the members from the archaea M. jannaschii and P. horikoshii cluster with the known eukaryotic deoxyhypusine synthases. Separated by a fairly deep branch, although still strongly related, is a small cluster of proteins from Methanobacterium thermoautotrophicum and Archaeoglobus fulgidus, the latter of which has two. [Protein fate, Protein modification and repair]	301
273013	TIGR00322	diphth2_R	diphthamide biosynthesis enzyme Dph1/Dph2 domain. Archaea and Eukaryotes, but not Eubacteria, share the property of having a covalently modified residue, 2'-[3-carboxamido-3-(trimethylammonio)propyl]histidine, as a part of a cytosolic protein. The modified His, termed diphthamide, is part of translation elongation factor EF-2 and is the site for ADP-ribosylation by diphtheria toxin. This model includes both Dph1 and Dph2 from Saccharomyces cerevisiae, although only Dph2 is found in the Archaea (see TIGR03682). Dph2 has been shown to act analogously to the radical SAM (rSAM) family (pfam04055), with 4Fe-4S-assisted cleavage of S-adenosylmethionine to create a free radical, but a different organic radical than in rSAM.	318
211569	TIGR00323	eIF-6	translation initiation factor eIF-6, putative. This model finds translation initiation factor eIF-6 of eukaryotes, which is a ribosome dissociation factor. It also finds a set of apparent archaeal orthologs, slightly shorter proteins not yet shown to act as initiation factors; these probably should be designated as translation initiation factor aIF-6, putative. [Protein synthesis, Translation factors]	216
129424	TIGR00324	endA	tRNA-intron lyase. The enzyme catalyses the endonucleolytic cleavage of pre-tRNA at the 5' and 3' splice sites to release the intron and produce two half-tRNA molecules bearing 5' hydroxyl and 2',3'-cyclic phosphate termini. The genes are homologous in Eucarya and Archaea. The two yeast genes have been functionally studied and encode two subunits of a heterotetrameric enzyme in yeast, the other two subunits of which have no known homologs. [Transcription, RNA processing]	170
273014	TIGR00325	lpxC	UDP-3-O-acyl N-acetylglucosamine deacetylase. UDP-3-O-(R-3-hydroxymyristoyl)-GlcNAc deacetylase from E. coli, LpxC, was previously designated EnvA. This enzyme is involved in lipid-A precursor biosynthesis. It is essential for cell viability. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	297
273015	TIGR00326	eubact_ribD	riboflavin biosynthesis protein RibD. This model describes the ribD protein as found in Escherichia coli. The N-terminal domain includes the conserved zinc-binding site region captured in the model dCMP_cyt_deam and shared by proteins such as cytosine deaminase, mammalian apolipoprotein B mRNA editing protein, blasticidin-S deaminase, and Bacillus subtilis competence protein comEB. The C-terminal domain is homologous to the full length of yeast HTP reductase, a protein required for riboflavin biosynthesis. A number of archaeal proteins believed related to riboflavin biosynthesis contain only this C-terminal domain and are not found as full-length matches to this model. [Biosynthesis of cofactors, prosthetic groups, and carriers, Riboflavin, FMN, and FAD]	344
273016	TIGR00327	secE_euk_arch	protein translocase SEC61 complex gamma subunit, archaeal and eukaryotic. This model describes archaeal SEC61-like and eukaryotic SEC61 but not bacterial secE proteins, for which a Pfam pfam00584 (SecE) has been created. [Protein fate, Protein and peptide secretion and trafficking]	61
129428	TIGR00328	flhB	flagellar biosynthetic protein FlhB. FlhB and its functionally equivalent orthologs, from among a larger superfamily of proteins involved in type III protein export systems, are specifically involved in flagellar protein export. The seed members are restricted and the trusted cutoff is set high such that the proteins gathered by this model play roles specifically related to flagellar structures. Full-length homologs scoring below the trusted cutoff are involved in peptide export but not necessarily in the creation of flagella. [Cellular processes, Chemotaxis and motility]	347
129429	TIGR00329	gcp_kae1	metallohydrolase, glycoprotease/Kae1 family. This subfamily includes the well-studied secreted O-sialoglycoprotein endopeptidase (glycoprotease, EC 3.4.24.57) of Pasteurella haemolytica, a pathogen. A member from Riemerella anatipestifer, associated with cohemolysin activity, likewise is exported without benefit of a classical signal peptide and shows glycoprotease activity on the test substrate glycophorin. However, archaeal members of this subfamily show unrelated activities as demonstrated in Pyrococcus abyssi: DNA binding, iron binding, apurinic endonuclease activity, genomic association with a kinase domain, and no glycoprotease activity. This family thus pulls together a set of proteins as a homology group that appears to be near-universal in life, yet heterogeneous in assayed function between bacteria and archaea. [Protein fate, Degradation of proteins, peptides, and glycopeptides]	305
129430	TIGR00330	glpX	fructose-1,6-bisphosphatase, class II. This model represents GlpX, one of three classes of bacterial fructose-1,6-bisphosphatases. This form is homodimeric and Mn2+-dependent, and only very distantly related to the class I fructose-1,6-bisphosphatase, the product of the fbp gene, which is homotetrameric and Mg2+-dependent. A third class is found as one of two types in Bacillus subtilis. In E. coli, GlpX is found in the glpFKX operon together with a glycerol uptake protein and glycerol kinase. [Energy metabolism, Pentose phosphate pathway]	321
273017	TIGR00331	hrcA	heat shock gene repressor HrcA. HrcA represses the class I heat shock operons groE and dnaK; overproduction prevents induction of these operons by heat shock while deletion allows constitutive expression even at low temperatures. In Bacillus subtilis, hrcA is the first gene of the dnaK operon and so is itself a heat shock gene. [Regulatory functions, DNA interactions]	337
273018	TIGR00332	neela_ferrous	desulfoferrodoxin ferrous iron-binding domain. This domain comprises essentially the full length of neelaredoxin, a monomeric, blue, non-heme iron protein of Desulfovibrio gigas said to bind two iron atoms per monomer with identical spectral properties. Neelaredoxin was shown recently to have significant superoxide dismutase activity. This domain is also found (in a form in which the distance between the motifs H[HWYF]IXW and CN[IL]HGXW is somewhat shorter) as the C-terminal domain of desulfoferrodoxin, which is said to bind a single ferrous iron atom. The N-terminal domain of desulfoferrodoxin is described in a separate model, dfx_rbo (TIGR00320). [Energy metabolism, Electron transport]	106
188042	TIGR00333	nrdI	ribonucleoside-diphosphate reductase 2, operon protein nrdI. Ribonucleotide reductases (RNRs) are enzymes that provide the precursors of DNA synthesis. The three characterized classes of RNRs differ by their metal cofactor and their stable organic radical. The exact function of nrdI within the ribonucleotide reductases has not yet been fully characterised. [Purines, pyrimidines, nucleosides, and nucleotides, 2'-Deoxyribonucleotide metabolism]	127
273019	TIGR00334	5S_RNA_mat_M5	ribonuclease M5. This family of orthologous proteins shows a weak but significant similarity to the central region of the DnaG-type DNA primase. The region of similarity is termed the Toprim (topoisomerase-primase) domain and is also shared by RecR, OLD family nucleases, and type IA and II topoisomerases. [Transcription, RNA processing]	174
273020	TIGR00335	primase_sml	DNA primase, eukaryotic-type, small subunit, putative. Archaeal members differ substantially from eukaryotic members and should be considered putative pending experimental evidence. The protein is universal and single copy among completed archaeal and eukaryotic genomes to date. DNA primase creates RNA primers needed for DNA replication. This model is named putative because the assignment is putative for archaeal proteins. Eukaryotic proteins scoring above the trusted cutoff can be considered authentic. [DNA metabolism, DNA replication, recombination, and repair]	297
129436	TIGR00336	pyrE	orotate phosphoribosyltransferase. Orotate phosphoribosyltransferase (OPRTase) is involved in the biosynthesis of pyrimidine nucleotides. Alpha-D-ribosyldiphosphate 5-phosphate (PRPP) and orotate are utilized to form pyrophosphate and orotidine 5'-monophosphate (OMP) in the presence of divalent cations, preferably Mg2+. In a number of eukaryotes, this protein is fused to a domain that catalyses the reaction (EC 4.1.1.23). The combined activity of EC 2.4.2.10 and EC 4.1.1.23 is termed uridine 5'-monophosphate synthase. The conserved Lys (K) residue at position 101 of the seed alignment has been proposed as the active site for the enzyme. [Purines, pyrimidines, nucleosides, and nucleotides, Pyrimidine ribonucleotide biosynthesis]	173
273021	TIGR00337	PyrG	CTP synthase. CTP synthase is involved in pyrimidine ribonucleotide/ribonucleoside metabolism. The enzyme catalyzes the reaction L-glutamine + H2O + UTP + ATP = CTP + phosphate + ADP + L-glutamate. The enzyme exists as a dimer of identical chains that aggregates as a tetramer. This gene has been found circa 500 bp 5' upstream of enolase in both beta (Nitrosomonas europaea) and gamma (E. coli) subdivisions of the proteobacteria (FEMS Microbiol Lett 1998 Aug 1;165(1):153-7). [Purines, pyrimidines, nucleosides, and nucleotides, Pyrimidine ribonucleotide biosynthesis]	525
273022	TIGR00338	serB	phosphoserine phosphatase SerB. Phosphoserine phosphatase catalyzes the reaction 3-phospho-serine + H2O = L-serine + phosphate. It catalyzes the last of three steps in the biosynthesis of serine from D-3-phosphoglycerate. Note that this enzyme acts on free phosphoserine, not on phosphoserine residues of phosphoproteins. [Amino acid biosynthesis, Serine family]	219
273023	TIGR00339	sopT	ATP sulphurylase. This enzyme forms adenosine 5'-phosphosulfate (APS) from ATP and free sulfate, the first step in the formation of the activated sulfate donor 3'-phosphoadenylylsulfate (PAPS). In some cases, it is found in a bifunctional protein in which the other domain, APS kinase, catalyzes the second and final step, the phosphorylation of APS to PAPS; the combined ATP sulfurylase/APS kinase may be called PAPS synthase. Members of this family also include the dissimilatory sulfate adenylyltransferase (sat) of the sulfate reducer Archaeoglobus fulgidus. [Central intermediary metabolism, Sulfur metabolism]	383
129440	TIGR00340	zpr1_rel	ZPR1-related zinc finger protein. This model describes a strictly archaeal family homologous to the domain duplicated in the eukaryotic zinc-binding protein ZPR1. ZPR1 was shown experimentally to bind approximately two moles of zinc; each copy of the domain contains a putative zinc finger of the form CXXCX(25)CXXC. ZPR1 binds the tyrosine kinase domain of epidermal growth factor receptor, but is displaced by receptor activation and autophosphorylation after which it redistributes in part to the nucleus. The proteins described by this model by analogy may be suggested to play a role in signal transduction. A model ZPR1_znf (TIGR00310) has been created to describe the domain shared by this protein and ZPR1. [Unknown function, General]	163
273024	TIGR00341	TIGR00341	TIGR00341 family protein. This conserved hypothetical protein is found so far only in three archaeal genomes and in Streptomyces coelicolor. It shares a hydrophobic uncharacterized domain (see TIGR00271) of about 180 residues with several eubacterial proteins, including the much longer protein sll1151 of Synechocystis PCC6803. [Hypothetical proteins, Conserved]	325
273025	TIGR00342	TIGR00342	tRNA sulfurtransferase ThiI. Members of this protein family are "ThiI", a sulfurtransferase involved in 4-thiouridine modification of tRNA. This protein often is bifunctional, with genetically separable activities, where the C-terminal rhodanese-like domain (residues 385 to 482 in E. coli ThiI), a domain not included in this model, is sufficient to synthesize the thiazole moiety of thiamine (see TIGR04271). Note that ThiI, because of its role in tRNA modification, may occur in species (such as Mycoplasma genitalium) that lack de novo thiamine biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Thiamine, Protein synthesis, tRNA and rRNA base modification]	371
129443	TIGR00343	TIGR00343	pyridoxal 5'-phosphate synthase, synthase subunit Pdx1. This protein had been believed to be a singlet oxygen resistance protein. Subsequent work showed that it is a protein of pyridoxine (vitamin B6) biosynthesis, and that pyridoxine quenches the highly toxic singlet form of oxygen produced by light in the presence of certain chemicals. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridoxine]	287
273026	TIGR00344	alaS	alanine--tRNA ligase. The model describes alanine--tRNA ligase. This enzyme catalyzes the reaction (tRNAala + L-alanine + ATP = L-alanyl-tRNAala + pyrophosphate + AMP). [Protein synthesis, tRNA aminoacylation]	845
273027	TIGR00345	GET3_arsA_TRC40	transport-energizing ATPase, TRC40/GET3/ArsA family. Members of this family are ATPases that energize transport, although with different partner proteins for different functions. Recent findings show that TRC40 (GET3 in yeast) is involved in the insertion of tail-anchored membrane proteins in eukaryotes. A similar function is expected for members of this family in archaea. However, the earliest discovery of a function for this protein family is ArsA, an arsenic resistance protein that partners with ArsB (see pfam02040) for As(III) efflux. [Hypothetical proteins, Conserved]	284
129446	TIGR00346	azlC	4-azaleucine resistance probable transporter AzlC. Overexpression of this gene results in resistance to a leucine analog, 4-azaleucine. The protein has 5 potential transmembrane motifs. It has been inferred, but not experimentally demonstrated, to be part of a branched-chain amino acid transport system. Commonly found in association with azlD. [Transport and binding proteins, Amino acids, peptides and amines]	221
129447	TIGR00347	bioD	dethiobiotin synthase. Dethiobiotin synthase is involved in biotin biosynthesis and catalyses the reaction (CO2 + 7,8-diaminononanoate + ATP = dethiobiotin + phosphate + ADP). The enzyme binds ATP (see motif in first 12 residues of the SEED alignment) and requires magnesium as a co-factor. [Biosynthesis of cofactors, prosthetic groups, and carriers, Biotin]	166
273028	TIGR00348	hsdR	type I site-specific deoxyribonuclease, HsdR family. This gene is part of the type I restriction and modification system, which is composed of three polypeptides: R (restriction endonuclease), M (modification) and S (specificity). This group of enzymes recognizes specific short DNA sequences and has an absolute requirement for ATP (or dATP) and S-adenosyl-L-methionine. They also catalyse the reactions of EC 2.1.1.72 and EC 2.1.1.73, with similar site specificity (J. Mol. Biol. 271 (3), 342-348 (1997)). Members of this family are assumed to differ from each other in DNA site specificity. [DNA metabolism, Restriction/modification]	667
273029	TIGR00350	lytR_cpsA_psr	cell envelope-related function transcriptional attenuator common domain. This model describes a domain of unknown function that is found in the predicted extracellular domain of a number of putative membrane-bound proteins. One of these is the protein psr, described as a penicillin binding protein 5 (PDP-5) synthesis repressor. Another is Bacillus subtilis LytR, described as a transcriptional attenuator of itself and the LytABC operon, where LytC is N-acetylmuramoyl-L-alanine amidase. A third is CpsA, a putative regulatory protein involved in exocellular polysaccharide biosynthesis. Besides the region of strong similarity represented by this model, these proteins share the property of having a short putative N-terminal cytoplasmic domain and transmembrane domain forming a signal-anchor. [Regulatory functions, Other]	152
273030	TIGR00351	narI	respiratory nitrate reductase, gamma subunit. Involved in anaerobic respiration, the gene product catalyzes the reaction (reduced acceptor + NO3- = Acceptor + nitrite). Another possible role_id for this gene product is in nitrogen fixation (Role_id:160). [Energy metabolism, Anaerobic]	224
129451	TIGR00353	nrfE	c-type cytochrome biogenesis protein CcmF. The product of this gene is required for the biogenesis of C-type cytochromes. This gene is thought to have eleven transmembrane helices. Disruption of this gene in Paracoccus denitrificans, encoding a putative transporter, results in formation of an unstable apocytochrome c and deficiency in siderophore production. [Energy metabolism, Electron transport]	576
273031	TIGR00354	polC	DNA polymerase, archaeal type II, large subunit. This model represents the large subunit, DP2, of a two subunit novel Archaeal replicative DNA polymerase first characterized for Pyrococcus furiosus. Structure of DP2 appears to be organized as a ~950 residue component separated from a ~300 residue component by a ~150 residue intein. The other subunit, DP1, has sequence similarity to the eukaryotic DNA polymerase delta small subunit. [DNA metabolism, DNA replication, recombination, and repair]	1095
273032	TIGR00355	purH	phosphoribosylaminoimidazolecarboxamide formyltransferase/IMP cyclohydrolase. PurH is bifunctional: IMP cyclohydrolase (EC 3.5.4.10); phosphoribosylaminoimidazolecarboxamide formyltransferase (EC 2.1.2.3) Involved in purine ribonucleotide biosynthesis. The IMP cyclohydrolase activity is in the N-terminal region. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis]	511
129454	TIGR00357	TIGR00357	methionine-R-sulfoxide reductase. This model describes a domain found in PilB, a protein important for pilin expression, N-terminal to a domain coextensive with the known peptide methionine sulfoxide reductase (MsrA), a protein repair enzyme, of E. coli. Among the early completed genomes, this module is found if and only if MsrA is also found, whether N-terminal to MsrA (as for Helicobacter pylori), C-terminal (as for Treponema pallidum), or in a separate polypeptide. Although the function of this region is not clear, an auxiliary function to MsrA is suggested. [Protein fate, Protein modification and repair, Cellular processes, Adaptations to atypical conditions]	134
273033	TIGR00358	3_prime_RNase	VacB and RNase II family 3'-5' exoribonucleases. This model is defined to identify a pair of paralogous 3-prime exoribonucleases in E. coli, plus the set of proteins apparently orthologous to one or the other in other eubacteria. VacB was characterized originally as required for the expression of virulence genes, but is now recognized as the exoribonuclease RNase R (Rnr). Its paralog in E. coli and H. influenzae is designated exoribonuclease II (Rnb). Both are involved in the degradation of mRNA, and consequently have strong pleiotropic effects that may be difficult to disentangle. Both these proteins share domain-level similarity (RNB, S1) with a considerable number of other proteins, and full-length similarity scoring below the trusted cutoff to proteins associated with various phenotypes but uncertain biochemistry; it may be that these latter proteins are also 3-prime exoribonucleases. [Transcription, Degradation of RNA]	654
273034	TIGR00359	cello_pts_IIC	phosphotransferase system, cellobiose specific, IIC component. The family consists of the cellobiose specific form of the phosphotransferase system (PTS), IIC component. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS]	423
273035	TIGR00360	ComEC_N-term	ComEC/Rec2-related protein. The related model ComEC_Rec2 (TIGR00361) describes a set of proteins of ~ 700-800 residues, one each from a number of different species, of which most can become competent for natural transformation with exogenous DNA. The best-studied examples are ComEC from Bacillus subtilis and Rec-2 from Haemophilus influenzae, where the protein appears to form part of the DNA import structure. This model represents a region found in full-length ComEC/Rec2 and shorter homologs of unknown function from a large number of additional bacterial species, most of which are not known to become competent for transformation (an exception is Helicobacter pylori). [Unknown function, General]	171
273036	TIGR00361	ComEC_Rec2	DNA internalization-related competence protein ComEC/Rec2. Apparent orthologs are found in 5 species so far (Haemophilus influenzae, Escherichia coli, Bacillus subtilis, Neisseria gonorrhoeae, Streptococcus pneumoniae), of which all but E. coli are model systems for the study of competence for natural transformation. This protein is a predicted multiple membrane-spanning protein likely to be involved in DNA internalization. In a large number of bacterial species not known to exhibit competence, this protein is replaced by a half-length N-terminal homolog of unknown function, modelled by the related model ComEC_N-term. The role for this protein in species that are not naturally transformable is unknown. [Cellular processes, DNA transformation]	662
273037	TIGR00362	DnaA	chromosomal replication initiator protein DnaA. DnaA is involved in DNA biosynthesis and the initiation of chromosome replication, and can also act as a transcription regulator. The C-terminal of the family hits the pfam bacterial DnaA (bac_dnaA) domain family. For a review, see Kaguni (2006). [DNA metabolism, DNA replication, recombination, and repair]	437
129460	TIGR00363	TIGR00363	lipoprotein, YaeC family. This family of putative lipoproteins contains a consensus site for lipoprotein signal sequence cleavage. Included in this family is the E. coli hypothetical protein yaeC. About half of the proteins between the noise and trusted cutoffs contain the consensus lipoprotein signature and may belong to this family. [Cell envelope, Other]	258
129461	TIGR00364	TIGR00364	queuosine biosynthesis protein QueC. Members of this protein family are QueC, involved in synthesizing pre-Q0 from GTP en route to tRNA modification with queuosine. This protein family is represented by a single member in nearly every completed large (> 1000 genes) prokaryotic genome. In Rhizobium meliloti, the gene was designated exsB, possibly because of polar effects on exsA expression in a shared polycistronic mRNA. In Arthrobacter viscosus, the homologous gene was designated ALU1 and was associated with an aluminum tolerance phenotype. [Unknown function, General]	201
188046	TIGR00365	TIGR00365	monothiol glutaredoxin, Grx4 family. The gene for the member of this glutaredoxin family in E. coli, originally designated ydhD, is now designated grxD. Its protein, Grx4, is a monothiol glutaredoxin similar to Grx5 of yeast, which is involved in iron-sulfur cluster formation. [Energy metabolism, Electron transport]	97
273038	TIGR00366	TIGR00366	TIGR00366 family protein. [Hypothetical proteins, Conserved]	438
273039	TIGR00367	TIGR00367	K+-dependent Na+/Ca+ exchanger related-protein. This model models a family of bacterial and archaeal proteins that is homologous, except for lacking a central region of ~ 250 amino acids and an N-terminal region of > 100 residues, to a functionally proven potassium-dependent sodium-calcium exchanger of the rat. [Unknown function, General]	307
129465	TIGR00368	TIGR00368	Mg chelatase-related protein. The N-terminal end matches very strongly a pfam Mg_chelatase domain. [Unknown function, General]	499
161843	TIGR00369	unchar_dom_1	uncharacterized domain 1. Most proteins containing this domain consist almost entirely of a single copy of this domain. A protein from C. elegans consists of two tandem copies of the domain. The domain is also found as the N-terminal region of an apparent initiation factor eIF-2B alpha subunit of Aquifex aeolicus. The function of the domain is unknown.	117
129467	TIGR00370	TIGR00370	sensor histidine kinase inhibitor, KipI family. [Hypothetical proteins, Conserved]	202
273040	TIGR00372	cas4	CRISPR-associated protein Cas4. This model represents a family of proteins associated with CRISPR repeats in a wide set of prokaryotic genomes. The scope of this model has been broadened since it was first built to describe an archaeal subset only. The function of the protein is undefined. Distantly related proteins, excluded from this model, include ORFs from Mycobacteriophage D29 and Sulfolobus islandicus filamentous virus and a region of the Schizosaccharomyces pombe DNA replication helicase Dna2p.	178
129469	TIGR00373	TIGR00373	transcription factor E. This family of proteins is, so far, restricted to archaeal genomes. The family appears to be distantly related to the N-terminal region of the eukaryotic transcription initiation factor IIE alpha chain. [Transcription, Transcription factors]	158
129470	TIGR00374	TIGR00374	conserved hypothetical protein. This model is built on a superfamily of proteins in the Archaea and in Aquifex aeolicus. The authenticity of homology can be seen in the presence of motifs in the alignment that include residues relatively rare among these sequences, even though the alignment includes long regions of low-complexity hydrophobic sequences. One apparent fusion protein contains a Glycos_transf_2 region in the N-terminal half of the protein and a region homologous to this superfamily in the C-terminal region. [Unknown function, General]	319
161657	TIGR00375	TIGR00375	TIGR00375 family protein. The member of this family from Methanococcus jannaschii, MJ0043, is considerably longer and appears to contain an intein N-terminal to the region of homology. [Hypothetical proteins, Conserved]	374
273041	TIGR00376	TIGR00376	DNA helicase, putative. The gene product may represent a DNA helicase. Eukaryotic members of this family have been characterized as binding certain single-stranded G-rich DNA sequences (GGGGT and GGGCT). A number of related proteins are characterized as helicases. [DNA metabolism, DNA replication, recombination, and repair]	636
273042	TIGR00377	ant_ant_sig	anti-anti-sigma factor. This superfamily includes small (105-125 residue) proteins related to SpoIIAA of Bacillus subtilis, an anti-anti-sigma factor. SpoIIAA can bind to and inhibit the anti-sigma F factor SpoIIAB. Also, it can be phosphorylated by SpoIIAB on a Ser residue at position 59 of the seed alignment. A similar arrangement is inferred for RsbV, an anti-anti-sigma factor for sigma B. This Ser is fairly well conserved within a motif resembling MXS[STA]G[VIL]X[VIL][VILF] among homologous known or predicted anti-anti-sigma factors. Regions similar to SpoIIAA and apparently homologous, but differing considerably near the phosphorylated Ser of SpoIIAA, appear in a single copy in several longer proteins. [Regulatory functions, Protein interactions]	108
273043	TIGR00378	cax	calcium/proton exchanger (cax). [Transport and binding proteins, Cations and iron carrying compounds]	349
273044	TIGR00379	cobB	cobyrinic acid a,c-diamide synthase. This model describes cobyrinic acid a,c-diamide synthase, the cobB (cbiA in Salmonella) protein of cobalamin biosynthesis. It is responsible for the amidation of carboxylic groups at positions A and C of either cobyrinic acid or hydrogenobyrinic acid. NH(2) groups are provided by glutamine, and one molecule of ATP is hydrolyzed for each amidation. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	449
273045	TIGR00380	cobD	cobalamin biosynthesis protein CobD. This protein is involved in cobalamin (vitamin B12) biosynthesis and porphyrin biosynthesis. It converts cobyric acid to cobinamide by the addition of aminopropanol on the F carboxylic group. It is part of the cob operon. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	305
273046	TIGR00381	cdhD	CO dehydrogenase/acetyl-CoA synthase, delta subunit. This is the small subunit of a heterodimer which catalyzes the reaction CO + H2O + Acceptor = CO2 + Reduced acceptor and is involved in the synthesis of acetyl-CoA from CO2 and H2. [Energy metabolism, Chemoautotrophy]	389
273047	TIGR00382	clpX	endopeptidase Clp ATP-binding regulatory subunit (clpX). A member of the ATP-dependent proteases, ClpX has ATP-dependent chaperone activity and is required for specific ATP-dependent proteolytic activities expressed by ClpPX. The gene is also found to be involved in stress tolerance in Bacillus subtilis and is essential for the efficient acquisition of genes specifying type IA and IB restriction. [Protein fate, Protein folding and stabilization, Protein fate, Degradation of proteins, peptides, and glycopeptides]	413
273048	TIGR00383	corA	magnesium Mg(2+) and cobalt Co(2+) transport protein (corA). The article in Microb Comp Genomics 1998;3(3):151-69 discusses this family and suggests that some members may have functions other than Mg2+ transport. [Transport and binding proteins, Cations and iron carrying compounds]	318
273049	TIGR00384	dhsB	succinate dehydrogenase and fumarate reductase iron-sulfur protein. Succinate dehydrogenase and fumarate reductase are reverse directions of the same enzymatic interconversion, succinate + FAD+ = fumarate + FADH2 (EC 1.3.11.1). In E. coli, the forward and reverse reactions are catalyzed by distinct complexes: fumarate reductase operates under anaerobic conditions and succinate dehydrogenase operates under aerobic conditions. This model also describes a region of the B subunit of a cytosolic archaeal fumarate reductase. [Energy metabolism, Aerobic, Energy metabolism, Anaerobic, Energy metabolism, TCA cycle]	220
129481	TIGR00385	dsbE	periplasmic protein thiol:disulfide oxidoreductases, DsbE subfamily. Involved in the biogenesis of c-type cytochromes as well as in disulfide bond formation in some periplasmic proteins. [Protein fate, Protein folding and stabilization]	173
273050	TIGR00387	glcD	glycolate oxidase, subunit GlcD. This protein, the glycolate oxidase GlcD subunit, is similar in sequence to that of several D-lactate dehydrogenases, including that of E. coli. The glycolate oxidase has been found to have some D-lactate dehydrogenase activity. [Energy metabolism, Other]	413
129483	TIGR00388	glyQ	glycyl-tRNA synthetase, tetrameric type, alpha subunit. This tetrameric form of glycyl-tRNA synthetase (2 alpha, 2 beta) is found in the majority of completed eubacterial genomes, with the two genes fused in a few species. A substantially different homodimeric form (not recognized by this model) replaces this form in the Archaea, animals, yeasts, and some eubacteria. [Protein synthesis, tRNA aminoacylation]	293
273051	TIGR00389	glyS_dimeric	glycyl-tRNA synthetase, dimeric type. This model describes a glycyl-tRNA synthetase distinct from the two alpha and two beta chains of the tetrameric E. coli glycyl-tRNA synthetase. This enzyme is a homodimeric class II tRNA synthetase and is recognized by pfam model tRNA-synt_2b, which recognizes His, Ser, Pro, and this set of glycyl-tRNA synthetases. [Protein synthesis, tRNA aminoacylation]	551
273052	TIGR00390	hslU	ATP-dependent protease HslVU, ATPase subunit. This model represents the ATPase subunit of HslVU, while the proteasome-related peptidase subunit is HslV. Residues 54-61 of the model contain a P-loop ATP-binding motif. Cys-287 of E. coli (position 308 in the seed alignment) is Ser in other members of the seed alignment. [Protein fate, Protein folding and stabilization]	441
273053	TIGR00391	hydA	hydrogenase (NiFe) small subunit (hydA). Also called hupA/hydA/hupS/hoxK/vhtG. Involved in hydrogenase reactions performing different specific functions in different species, e.g. (EC 1.12.2.1) in Desulfovibrio gigas, (EC 1.12.99.3) in Wolinella succinogenes, (EC 1.18.99.1) in E. coli and a number of other species, and (EC 1.12.99.-) in the Archaea. [Energy metabolism, Electron transport]	365
273054	TIGR00392	ileS	isoleucyl-tRNA synthetase. The isoleucyl tRNA synthetase (IleS) is a class I amino acyl-tRNA ligase and is particularly closely related to the valyl tRNA synthetase. This model may recognize IleS from every species, including eukaryotic cytosolic and mitochondrial forms. [Protein synthesis, tRNA aminoacylation]	861
129488	TIGR00393	kpsF	KpsF/GutQ family protein. This model describes a number of closely related proteins with the phosphosugar-binding domain SIS (Sugar ISomerase) followed by two copies of the CBS (named after Cystathionine Beta Synthase) domain. One is GutQ, a protein of the glucitol operon. Another is KpsF, a virulence factor involved in capsular polysialic acid biosynthesis in some pathogenic strains of E. coli. [Energy metabolism, Sugars]	268
273055	TIGR00394	lac_pts_IIC	phosphotransferase system, lactose specific, IIC component. This family of proteins models the IIC domain of the phosphotransferase system (PTS) for lactose. The IIC domain catalyzes the transfer of a phosphoryl group from the IIB domain to lactose. When the IIC component and IIB components are in the same polypeptide chain they are designated IIBC. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS]	412
273056	TIGR00395	leuS_arch	leucyl-tRNA synthetase, archaeal and cytosolic family. The leucyl-tRNA synthetases belong to two families so broadly different that they are represented by separate models. This model includes both archaeal and cytosolic eukaryotic leucyl-tRNA synthetases; the eubacterial and mitochondrial forms differ so substantially that some other tRNA ligases score higher by this model than does any eubacterial LeuS. [Protein synthesis, tRNA aminoacylation]	938
273057	TIGR00396	leuS_bact	leucyl-tRNA synthetase, eubacterial and mitochondrial family. The leucyl-tRNA synthetases belong to two families so broadly different that they are represented by separate models. This model includes both eubacterial and mitochondrial leucyl-tRNA synthetases. It generates higher scores for some valyl-tRNA synthetases than for any archaeal or eukaryotic cytosolic leucyl-tRNA synthetase. Note that the enzyme from Aquifex aeolicus is split into alpha and beta chains; neither chain is long enough to score above the trusted cutoff, but the alpha chain scores well above the noise cutoff. The beta chain must be found by a model and search designed for partial length matches. [Protein synthesis, tRNA aminoacylation]	842
129492	TIGR00397	mauM_napG	MauM/NapG family ferredoxin-type protein. MauM is involved in methylamine utilization. NapG is associated with nitrate reductase activity. The two proteins are highly similar. [Energy metabolism, Electron transport]	213
273058	TIGR00398	metG	methionine--tRNA ligase. The methionyl-tRNA synthetase (metG) is a class I amino acyl-tRNA ligase. This model appears to recognize the methionyl-tRNA synthetase of every species, including eukaryotic cytosolic and mitochondrial forms. The UPGMA difference tree calculated after search and alignment according to this model shows an unusual deep split between two families of MetG. One family contains forms from the Archaea, yeast cytosol, spirochetes, and E. coli, among others. The other family includes forms from yeast mitochondrion, Synechocystis sp., Bacillus subtilis, the Mycoplasmas, Aquifex aeolicus, and Helicobacter pylori. The E. coli enzyme is homodimeric, although monomeric forms can be prepared that are fully active. Activity of this enzyme in bacteria includes aminoacylation of fMet-tRNA with Met; subsequent formylation of the Met to fMet is catalyzed by a separate enzyme. Note that the protein from Aquifex aeolicus is split into an alpha (large) and beta (small) subunit; this model does not include the C-terminal region corresponding to the beta chain. [Protein synthesis, tRNA aminoacylation]	530
273059	TIGR00399	metG_C_term	methionyl-tRNA synthetase C-terminal region/beta chain. The methionyl-tRNA synthetase (metG) is a class I amino acyl-tRNA ligase. This model describes a region of the methionyl-tRNA synthetase that is present at the C-terminus of MetG in some species (E. coli, B. subtilis, Thermotoga maritima, Methanobacterium thermoautotrophicum), and as a separate beta chain in Aquifex aeolicus. It is absent in a number of other species (e.g. Mycoplasma genitalium, Mycobacterium tuberculosis), while Pyrococcus horikoshii has both a full length MetG and a second protein homologous to the beta chain only. Proteins hit by this model should be called methionyl-tRNA synthetase beta chain if and only if the model metG hits a separate protein not also hit by this model. [Protein synthesis, tRNA aminoacylation]	137
129495	TIGR00400	mgtE	Mg2+ transporter (mgtE). This family of prokaryotic proteins models a class of Mg++ transporter first described in Bacillus firmus. May form a homodimer. [Transport and binding proteins, Cations and iron carrying compounds]	449
129496	TIGR00401	msrA	methionine-S-sulfoxide reductase. This model describes peptide methionine sulfoxide reductase (MsrA), a repair enzyme for proteins that have been inactivated by oxidation. The enzyme from E. coli is coextensive with this model and has enzymatic activity. However, in all completed genomes in which this module is present, a second protein module, described in TIGR00357, is also found, and in several cases as part of the same polypeptide chain: N-terminal to this module in Helicobacter pylori and Haemophilus influenzae (as in PilB of Neisseria gonorrhoeae) but C-terminal to it in Treponema pallidum. PilB, containing both domains, has been shown to be important for the expression of adhesins in certain pathogens. [Protein fate, Protein modification and repair, Cellular processes, Adaptations to atypical conditions]	149
273060	TIGR00402	napF	ferredoxin-type protein NapF. The gene codes for a ferredoxin-type cytosolic protein, NapF, of the periplasmic nitrate reductase system, as in Escherichia coli. NapF interacts with the catalytic subunit, NapA, and may be an accessory protein for NapA maturation. [Energy metabolism, Electron transport]	101
129498	TIGR00403	ndhI	NADH-plastoquinone oxidoreductase subunit I protein. [Energy metabolism, Electron transport]	183
129499	TIGR00405	KOW_elon_Spt5	transcription elongation factor Spt5. This protein contains a KOW domain, shared by bacterial NusG and the uL24 (previously L24p/L26e) family of ribosomal proteins. The most recent papers and crystal structures make this a transcription elongation factor rather than a ribosomal protein.	145
273061	TIGR00406	prmA	ribosomal protein L11 methyltransferase. Ribosomal protein L11 methyltransferase is an S-adenosyl-L-methionine-dependent methyltransferase required for the modification of ribosomal protein L11. This protein is found in bacteria and (with a probable transit peptide) in Arabidopsis. [Protein synthesis, Ribosomal proteins: synthesis and modification]	288
161862	TIGR00407	proA	gamma-glutamyl phosphate reductase. The related model TIGR01092 describes a full-length fusion protein delta l-pyrroline-5-carboxylate synthetase that includes a gamma-glutamyl phosphate reductase region as described by this model. Alternate name: glutamate-5-semialdehyde dehydrogenase. The prosite motif begins at residue 332 of the seed alignment although not all of the members of the family exactly obey the motif. [Amino acid biosynthesis, Glutamate family]	398
273062	TIGR00408	proS_fam_I	prolyl-tRNA synthetase, family I. Prolyl-tRNA synthetase is a class II tRNA synthetase and is recognized by pfam model tRNA-synt_2b, which recognizes tRNA synthetases for Gly, His, Ser, and Pro. The prolyl-tRNA synthetases are divided into two widely divergent families. This family includes the archaeal enzyme, the Pro-specific domain of a human multifunctional tRNA ligase, and the enzyme from the spirochete Borrelia burgdorferi. The other family includes enzymes from Escherichia coli, Bacillus subtilis, Synechocystis PCC6803, and one of the two prolyl-tRNA synthetases of Saccharomyces cerevisiae. [Protein synthesis, tRNA aminoacylation]	472
273063	TIGR00409	proS_fam_II	prolyl-tRNA synthetase, family II. Prolyl-tRNA synthetase is a class II tRNA synthetase and is recognized by pfam model tRNA-synt_2b, which recognizes tRNA synthetases for Gly, His, Ser, and Pro. The prolyl-tRNA synthetases are divided into two widely divergent groups. This group includes enzymes from Escherichia coli, Bacillus subtilis, Aquifex aeolicus, the spirochete Treponema pallidum, Synechocystis PCC6803, and one of the two prolyl-tRNA synthetases of Saccharomyces cerevisiae. The other group includes the Pro-specific domain of a human multifunctional tRNA ligase and the prolyl-tRNA synthetases from the Archaea, the Mycoplasmas, and the spirochete Borrelia burgdorferi. [Protein synthesis, tRNA aminoacylation]	568
273064	TIGR00410	lacE	PTS system, lactose/cellobiose family IIC component. Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. This family of proteins consists of both the cellobiose specific and the lactose specific forms of the phosphotransferase system (PTS) IIC component. The IIC domain catalyzes the transfer of a phosphoryl group from the IIB domain to the substrate. When the IIC component and IIB components are in the same polypeptide chain they are designated IIBC. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS]	423
129505	TIGR00411	redox_disulf_1	small redox-active disulfide protein 1. This protein is homologous to a family of proteins that includes thioredoxins, glutaredoxins, protein-disulfide isomerases, and others, some of which have several such domains. The sequence of this protein at the redox-active disulfide site, CPYC, matches glutaredoxins rather than thioredoxins, although its overall sequence seems closer to thioredoxins. It is suggested to be a ribonucleotide-reducing system component distinct from thioredoxin or glutaredoxin. [Unknown function, General]	82
129506	TIGR00412	redox_disulf_2	small redox-active disulfide protein 2. This small protein is found in three archaeal species so far (Methanococcus jannaschii, Archaeoglobus fulgidus, and Methanobacterium thermoautotrophicum) as well as in Anabaena PCC7120. It is homologous to thioredoxins, glutaredoxins, and protein disulfide isomerases, and shares with them a redox-active disulfide. The redox active disulfide region CXXC motif resembles neither thioredoxin nor glutaredoxin. A closely related protein found in the same three Archaea, described by redox_disulf_1, has a glutaredoxin-like CP[YH]C sequence; it has been characterized in functional assays as redox-active but unlikely to be a thioredoxin or glutaredoxin. [Unknown function, General]	76
273065	TIGR00413	rlpA	rare lipoprotein A. This is a family of prokaryotic proteins with unknown function. Lipoprotein annotation based on the presence of consensus lipoprotein signal sequence. Included in this family is the E. coli putative lipoprotein rlpA. [Cell envelope, Other]	208
273066	TIGR00414	serS	seryl-tRNA synthetase. This model represents the seryl-tRNA synthetase found in most organisms. This protein is a class II tRNA synthetase, and is recognized by the pfam model tRNA-synt_2b. The seryl-tRNA synthetases of two archaeal species, Methanococcus jannaschii and Methanobacterium thermoautotrophicum, differ considerably and are included in a different model. [Protein synthesis, tRNA aminoacylation]	418
129509	TIGR00415	serS_MJ	seryl-tRNA synthetase, Methanococcus jannaschii family. The seryl-tRNA synthetases from a few of the Archaea, represented by this model, are very different from the set of mutually more closely related seryl-tRNA synthetases from Eubacteria, Eukaryotes, and other Archaea. Although distantly homologous, the present set differs enough not to be recognized by the pfam model tRNA-synt_2b that recognizes the remainder of seryl-tRNA synthetases among other class II amino-acyl tRNA synthetases. [Protein synthesis, tRNA aminoacylation]	520
273067	TIGR00416	sms	DNA repair protein RadA. The gene product codes for a probable ATP-dependent protease involved in both DNA repair and degradation of proteins, peptides, glycopeptides. Also known as sms. Residues 11-28 of the seed alignment contain a putative Zn binding domain. Residues 110-117 of the seed contain a putative ATP binding site, both documented in Haemophilus (SP:P45266) and in Listeria monocytogenes (SP:Q48761). For E. coli, see J. Bacteriol. 178:5045-5048 (1996). [DNA metabolism, DNA replication, recombination, and repair]	454
188048	TIGR00417	speE	spermidine synthase. The SpeE subunit of spermidine synthase catalyzes the reaction (putrescine + S-adenosylmethioninamine = spermidine + 5'-methylthioadenosine) and is involved in polyamine biosynthesis and in the biosynthesis of spermidine from arginine. The region between residues 77 and 120 of the seed alignment is thought to be involved in binding to decarboxylated SAM. [Central intermediary metabolism, Polyamine biosynthesis]	271
273068	TIGR00418	thrS	threonyl-tRNA synthetase. This model represents the threonyl-tRNA synthetase found in most organisms. This protein is a class II tRNA synthetase, and is recognized by the pfam model tRNA-synt_2b. Note that B. subtilis has closely related isozymes thrS and thrZ. The N-terminal regions are quite dissimilar between archaeal and eubacterial forms, while some eukaryotic forms are missing sequence there altogether. [Protein synthesis, tRNA aminoacylation]	563
129513	TIGR00419	tim	triosephosphate isomerase. Triosephosphate isomerase (tim/TPIA) is the glycolytic enzyme that catalyzes the reversible interconversion of glyceraldehyde 3-phosphate and dihydroxyacetone phosphate. The active site of the enzyme is located between residues 240-258 of the model ([AV]-Y-E-P-[LIVM]-W-[SA]-I-G-T-[GK]) with E being the active site residue. There is a slight deviation from this sequence within the archaeal members of this family. [Energy metabolism, Glycolysis/gluconeogenesis]	205
273069	TIGR00420	trmU	tRNA (5-methylaminomethyl-2-thiouridylate)-methyltransferase. tRNA (5-methylaminomethyl-2-thiouridylate)-methyltransferase (trmU, asuE, or mnmA) is involved in the biosynthesis of the modified nucleoside 5-methylaminomethyl-2-thiouridine (mnm5s2U34) present in the wobble position of some tRNAs. This enzyme appears not to occur in the Archaea. [Protein synthesis, tRNA and rRNA base modification]	352
129515	TIGR00421	ubiX_pad	UbiX family flavin prenyltransferase. UbiX partners with UbiD for decarboxylation of the 3-octaprenyl-4-hydroxybenzoate precursor during ubiquinone biosynthesis, but the role of UbiX is as a flavin prenyltransferase that provides a cofactor UbiD requires. In E. coli, the protein UbiX (3-octaprenyl-4-hydroxybenzoate carboxy-lyase) has been shown to be involved in the third step of ubiquinone biosynthesis, the reaction [3-octaprenyl-4-hydroxybenzoate = 2-octaprenylphenol + CO2]. The knockout of the homologous protein in yeast confers sensitivity to phenylacrylic acid. Members are not restricted to ubiquinone-synthesizing species. This family represents a distinct clade within the flavoprotein family of pfam02441.	181
273070	TIGR00422	valS	valyl-tRNA synthetase. The valyl-tRNA synthetase (ValS) is a class I amino acyl-tRNA ligase and is particularly closely related to the isoleucyl tRNA synthetase. [Protein synthesis, tRNA aminoacylation]	861
273071	TIGR00423	TIGR00423	radical SAM domain protein, CofH subfamily. This protein family includes the CofH protein of coenzyme F(420) biosynthesis from Methanocaldococcus jannaschii, but appears to hit genomes more broadly than just the subset that make coenzyme F(420), so that narrower group is being built as a separate family. [Hypothetical proteins, Conserved]	309
273072	TIGR00424	APS_reduc	5'-adenylylsulfate reductase, thioredoxin-independent. This enzyme, involved in the assimilation of inorganic sulfate, is closely related to the thioredoxin-dependent PAPS reductase of Bacteria (CysH) and Saccharomyces cerevisiae. However, it has its own C-terminal thioredoxin-like domain and is not thioredoxin-dependent. Also, it has a substrate preference for 5'-adenylylsulfate (APS) over 3'-phosphoadenylylsulfate (PAPS) so the pathway does not require an APS kinase (CysC) to convert APS to PAPS. Arabidopsis thaliana appears to have three isozymes, all able to complement E. coli CysH mutants (even in backgrounds lacking thioredoxin or APS kinase) but likely localized to different compartments in Arabidopsis. [Central intermediary metabolism, Sulfur metabolism]	463
273073	TIGR00425	CBF5	rRNA pseudouridine synthase, putative. This family, found in archaea and eukaryotes, includes the only archaeal proteins markedly similar to bacterial TruB, the tRNA pseudouridine 55 synthase. However, among two related yeast proteins, the archaeal set matches yeast YLR175w far better than YNL292w. The first, termed centromere/microtubule binding protein 5 (CBF5), is an apparent rRNA pseudouridine synthase, while the second is the exclusive tRNA pseudouridine 55 synthase for both cytosolic and mitochondrial compartments. It is unclear whether archaeal proteins found by this model modify tRNA, rRNA, or both. [Protein synthesis, tRNA and rRNA base modification]	322
129520	TIGR00426	TIGR00426	competence protein ComEA helix-hairpin-helix repeat region. Members of the subfamily recognized by this model include competence protein ComEA and closely related proteins from a number of species that exhibit competence for transformation by exogenous DNA, including Streptococcus pneumoniae, Bacillus subtilis, Neisseria meningitidis, and Haemophilus influenzae. This model represents a region of two tandem copies of a helix-hairpin-helix domain (pfam00633), each about 30 residues in length. Limited sequence similarity can be found among some members of this family N-terminal to the region covered by this model. [Cellular processes, DNA transformation]	69
129521	TIGR00427	TIGR00427	membrane protein, MarC family. MarC is a protein that spans the plasma membrane multiple times and once was thought to be a multiple antibiotic resistance protein. The function for this family is unknown. [Unknown function, General]	201
129522	TIGR00430	Q_tRNA_tgt	tRNA-guanine transglycosylase. This tRNA-guanine transglycosylase (tgt) catalyzes an exchange for the guanine base at position 34 of many tRNAs; this nucleotide is subsequently modified to queuosine. The Archaea have a closely related enzyme that catalyzes a base exchange for guanine at position 15 in some tRNAs, a site that is subsequently converted to the archaeal-specific modified base archaeosine (7-formamidino-7-deazaguanosine), while Archaeoglobus fulgidus has both enzymes. [Protein synthesis, tRNA and rRNA base modification]	368
129523	TIGR00431	TruB	tRNA pseudouridine(55) synthase. TruB, the tRNA pseudouridine 55 synthase, converts uracil to pseudouridine in the T loop of most tRNAs in all three domains of life. This model is built on a seed alignment of bacterial proteins only. Saccharomyces cerevisiae protein YNL292w (Pus4) has been shown to be the pseudouridine 55 synthase of both cytosolic and mitochondrial compartments, active at no other position on tRNA and the only enzyme active at that position in the species. A distinct yeast protein YLR175w, (centromere/microtubule-binding protein CBF5) is an rRNA pseudouridine synthase, and the archaeal set is much more similar to CBF5 than to Pus4. It is unclear whether the archaeal proteins found by this model are tRNA pseudouridine 55 synthases like TruB, rRNA pseudouridine synthases like CBF5, or (as suggested by the absence of paralogs in the Archaea) both. CBF5 likely has additional, eukaryotic-specific functions. The trusted cutoff is set above the scores for the archaeal homologs of unknown function, so yeast Pus4p scores between trusted and noise. [Protein synthesis, tRNA and rRNA base modification]	209
273074	TIGR00432	arcsn_tRNA_tgt	tRNA-guanine(15) transglycosylase. This tRNA-guanine transglycosylase (tgt) differs from the tgt of E. coli and other Bacteria in the site of action and the modification that results. It exchanges 7-cyano-7-deazaguanine (preQ0) with guanine at position 15 of archaeal tRNA; this nucleotide is subsequently converted to archaeosine, found exclusively in the Archaea. This enzyme from Haloferax volcanii has been purified, characterized, and partially sequenced and is the basis for identifying this family. In contrast, bacterial tgt (TIGR00430) catalyzes the exchange of preQ0 or preQ1 for the guanine base at position 34; this nucleotide is subsequently modified to queuosine. Archaeoglobus fulgidus has both enzymes, while some other Archaea have just this one. [Protein synthesis, tRNA and rRNA base modification]	540
273075	TIGR00433	bioB	biotin synthase. Catalyzes the last step of the biotin biosynthesis pathway. All members of the seed alignment are in the immediate gene neighborhood of a bioA gene. [Biosynthesis of cofactors, prosthetic groups, and carriers, Biotin]	296
129526	TIGR00434	cysH	phosphoadenylyl-sulfate reductase (thioredoxin). This enzyme, involved in the assimilation of inorganic sulfate, is designated cysH in Bacteria and MET16 in Saccharomyces cerevisiae. Synonyms include phosphoadenosine phosphosulfate reductase, PAPS reductase, and PAPS reductase, thioredoxin-dependent. In a reaction requiring reduced thioredoxin and NADPH, it converts 3(prime)-phosphoadenylylsulfate (PAPS) to sulfite and adenosine 3(prime),5(prime) diphosphate (PAP). A related family of plant enzymes, scoring below the trusted cutoff, differs in having a thioredoxin-like C-terminal domain, not requiring thioredoxin, and in having a preference for 5(prime)-adenylylsulfate (APS) over PAPS. [Central intermediary metabolism, Sulfur metabolism]	212
273076	TIGR00435	cysS	cysteinyl-tRNA synthetase. This model finds the cysteinyl-tRNA synthetase from most but not from all species. The enzyme from one archaeal species, Archaeoglobus fulgidus, is found but the equivalent enzymes from some other Archaea, including Methanococcus jannaschii, are not found, although biochemical evidence suggests that tRNA(Cys) in these species are charged directly with Cys rather than through a misacylation and correction pathway as for tRNA(Gln). [Protein synthesis, tRNA aminoacylation]	464
129528	TIGR00436	era	GTP-binding protein Era. Era is an essential GTPase in Escherichia coli and many other bacteria. It plays a role in ribosome biogenesis. Few bacteria lack this protein. [Protein synthesis, Other]	270
273077	TIGR00437	feoB	ferrous iron transporter FeoB. FeoB (773 amino acids in E. coli), a cytoplasmic membrane protein required for iron(II) uptake, is encoded in an operon with FeoA (75 amino acids), which is also required, and is regulated by Fur. There appear to be two copies in Archaeoglobus fulgidus and Clostridium acetobutylicum. [Transport and binding proteins, Cations and iron carrying compounds]	591
273078	TIGR00438	rrmJ	cell division protein FtsJ. Methylates the 23S rRNA. Previously known as cell division protein ftsJ.// Trusted cutoff too high? [SS 10/1/04] [Protein synthesis, tRNA and rRNA base modification]	188
129531	TIGR00439	ftsX	putative protein insertion permease FtsX. FtsX is an integral membrane protein encoded in the same operon as signal recognition particle docking protein FtsY and FtsE. It belongs to a family of predicted permeases and may play a role in the insertion of proteins required for potassium transport, cell division, and other activities. FtsE is a hydrophilic nucleotide-binding protein that associates with the inner membrane by means of association with FtsX. [Cellular processes, Cell division, Protein fate, Protein and peptide secretion and trafficking]	309
273079	TIGR00440	glnS	glutaminyl-tRNA synthetase. This protein is a relatively rare aminoacyl-tRNA synthetase, found in the cytosolic compartment of eukaryotes, in E. coli and a number of other Gram-negative Bacteria, and in Deinococcus radiodurans. In contrast, the pathway to Gln-tRNA in mitochondria, Archaea, Gram-positive Bacteria, and a number of other lineages is by misacylation with Glu followed by transamidation to correct the aminoacylation to Gln. This enzyme is a class I tRNA synthetase (hit by the pfam model tRNA-synt_1c) and is quite closely related to glutamyl-tRNA synthetases. [Protein synthesis, tRNA aminoacylation]	522
129533	TIGR00441	gmhA	phosphoheptose isomerase. This model describes phosphoheptose isomerase. Because a closely related paralog in Escherichia coli differs in function (DnaA initiator-associating protein diaA), this model has been rebuilt with a high stringency, and is likely to miss many true examples for phosphoheptose isomerase. Involved in lipopolysaccharide biosynthesis, it may have a role in virulence in Haemophilus ducreyi. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	154
273080	TIGR00442	hisS	histidyl-tRNA synthetase. This model finds a histidyl-tRNA synthetase in every completed genome. Apparent second copies from Bacillus subtilis, Synechocystis sp., and Aquifex aeolicus are slightly shorter, more closely related to each other than to other hisS proteins, and actually serve as regulatory subunits for an enzyme of histidine biosynthesis. They were excluded from the seed alignment and score much lower than do single copy histidyl-tRNA synthetases of other genomes not included in the seed alignment. These putative second copies of HisS score below the trusted cutoff. The regulatory protein kinase GCN2 of Saccharomyces cerevisiae (YDR283c), and related proteins from other species designated eIF-2 alpha kinase, have a domain closely related to histidyl-tRNA synthetase that may serve to detect and respond to uncharged tRNA(his), an indicator of amino acid starvation; these regulatory proteins are not orthologous and so score below the noise cutoff. [Protein synthesis, tRNA aminoacylation]	404
273081	TIGR00443	hisZ_biosyn_reg	ATP phosphoribosyltransferase, regulatory subunit. Apparent second copies of histidyl-tRNA synthetase, found in Bacillus subtilis, Synechocystis sp., Aquifex aeolicus, and others, are in fact a regulatory subunit of ATP phosphoribosyltransferase, and usually encoded by a gene adjacent to that encoding the catalytic subunit. [Amino acid biosynthesis, Histidine family]	313
273082	TIGR00444	mazG	MazG family protein. This family of prokaryotic proteins has no known function. It includes the uncharacterized protein MazG in E. coli. [Unknown function, General]	248
161884	TIGR00445	mraY	phospho-N-acetylmuramoyl-pentapeptide-transferase. Involved in peptidoglycan biosynthesis, the enzyme catalyzes the first of the lipid cycle reactions. Also known as Muramoyl-Pentapeptide Transferase (murX). [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan]	321
188051	TIGR00446	nop2p	NOL1/NOP2/sun family putative RNA methylase. [Protein synthesis, tRNA and rRNA base modification]	264
213531	TIGR00447	pth	aminoacyl-tRNA hydrolase. The natural substrate for this enzyme may be peptidyl-tRNAs that drop off the ribosome during protein synthesis. Peptidyl-tRNA hydrolase is a bacterial protein; YHR189W from Saccharomyces cerevisiae appears to be orthologous and likely has the same function. [Protein synthesis, Other]	188
129540	TIGR00448	rpoE	DNA-directed RNA polymerase (rpoE), archaeal and eukaryotic form. This family seems to be confined to the Archaea and eukaryotic taxa, and its members are quite dissimilar to E.coli rpoE. [Transcription, DNA-dependent RNA polymerase]	179
129541	TIGR00449	tgt_general	tRNA-guanine family transglycosylase. Different tRNA-guanine transglycosylases catalyze different tRNA base modifications. Two guanine base substitutions by different enzymes described by the model are involved in generating queuosine at position 34 in bacterial tRNAs and archaeosine at position 15 in archaeal tRNAs. This model is designed for fragment searching, so the superfamily is used loosely. [Protein synthesis, tRNA and rRNA base modification]	367
273083	TIGR00450	mnmE_trmE_thdF	tRNA modification GTPase TrmE. TrmE, also called MnmE and previously designated ThdF (thiophene and furan oxidation protein), is a GTPase involved in tRNA modification to create 5-methylaminomethyl-2-thiouridine in the wobble position of some tRNAs. This protein and GidA form an alpha2/beta2 heterotetramer. [Protein synthesis, tRNA and rRNA base modification]	442
129543	TIGR00451	unchar_dom_2	uncharacterized domain 2. This uncharacterized domain is found in a number of enzymes and uncharacterized proteins, often at the C-terminus. It is found in some but not all members of a family of related tRNA-guanine transglycosylases (tgt), which exchange a guanine base for some modified base without breaking the phosphodiester backbone of the tRNA. It is also found in rRNA pseudouridine synthase, another enzyme of RNA base modification not otherwise homologous to tgt. It is found, again at the C-terminus, in two putative glutamate 5-kinases. It is also found in a family of small, uncharacterized archaeal proteins consisting mostly of this domain.	107
273084	TIGR00452	TIGR00452	tRNA (mo5U34)-methyltransferase. This model describes CmoB, the enzyme tRNA (mo5U34)-methyltransferase involved in tRNA wobble base modification. [Unknown function, Enzymes of unknown specificity]	316
213532	TIGR00453	ispD	2-C-methyl-D-erythritol 4-phosphate cytidylyltransferase. Members of this protein family are 2-C-methyl-D-erythritol 4-phosphate cytidylyltransferase, the IspD protein of the deoxyxylulose pathway of IPP biosynthesis. In about twenty percent of bacterial genomes, this protein occurs as IspDF, a bifunctional fusion protein. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	217
200016	TIGR00454	TIGR00454	TIGR00454 family protein. At this time this gene appears to be present only in Archaea. [Hypothetical proteins, Conserved]	175
129547	TIGR00455	apsK	adenylyl-sulfate kinase. This protein, adenylylsulfate kinase, is often found as a fusion protein with sulfate adenylyltransferase. Important residue (active site in E.coli) is residue 100 of the seed alignment. [Central intermediary metabolism, Sulfur metabolism]	184
273085	TIGR00456	argS	arginyl-tRNA synthetase. This model recognizes arginyl-tRNA synthetase in every completed genome to date. An interesting feature of the alignment of all arginyl-tRNA synthetases is a fairly deep split between two families. One family includes archaeal, eukaryotic and organellar, spirochete, E. coli, and Synechocystis sp. The second, sharing a deletion of about 25 residues in the central region relative to the first, includes Bacillus subtilis, Aquifex aeolicus, the Mycoplasmas and Mycobacteria, and the Gram-negative bacterium Helicobacter pylori. [Protein synthesis, tRNA aminoacylation]	563
273086	TIGR00457	asnS	asparaginyl-tRNA synthetase. In a multiple sequence alignment of representative asparaginyl-tRNA synthetases (asnS), archaeal/eukaryotic type aspartyl-tRNA synthetases (aspS_arch), and bacterial type aspartyl-tRNA synthetases (aspS_bact), there is a striking similarity between asnS and aspS_arch in gap pattern and in sequence, and a striking divergence of aspS_bact. Consequently, a separate model was built for each of the three groups. This model, asnS, represents asparaginyl-tRNA synthetases from the three domains of life. Some species lack this enzyme and charge tRNA(asn) by misacylation with Asp, followed by transamidation of Asp to Asn. [Protein synthesis, tRNA aminoacylation]	453
273087	TIGR00458	aspS_nondisc	nondiscriminating aspartyl-tRNA synthetase. In a multiple sequence alignment of representative asparaginyl-tRNA synthetases (asnS), archaeal/eukaryotic type aspartyl-tRNA synthetases (aspS_arch), and bacterial type aspartyl-tRNA synthetases (aspS_bact), there is a striking similarity between asnS and aspS_arch in gap pattern and in sequence, and a striking divergence of aspS_bact. Consequently, a separate model was built for each of the three groups. This model, aspS_arch, represents aspartyl-tRNA synthetases from the eukaryotic cytosol and from the Archaea. In some species, this enzyme aminoacylates tRNA for both Asp and Asn; Asp-tRNA(asn) is subsequently transamidated to Asn-tRNA(asn). [Protein synthesis, tRNA aminoacylation]	428
211576	TIGR00459	aspS_bact	aspartyl-tRNA synthetase, bacterial type. Aspartate--tRNA ligases in this family may be discriminating (6.1.1.12) or nondiscriminating (6.1.1.23). In a multiple sequence alignment of representative asparaginyl-tRNA synthetases (asnS), archaeal/eukaryotic type aspartyl-tRNA synthetases (aspS_arch), and bacterial type aspartyl-tRNA synthetases (aspS_bact), there is a striking similarity between asnS and aspS_arch in gap pattern and in sequence, and a striking divergence of aspS_bact. Consequently, a separate model was built for each of the three groups. This model, aspS_bact, represents aspartyl-tRNA synthetases from the Bacteria and from mitochondria. In some species, this enzyme aminoacylates tRNA for both Asp and Asn; Asp-tRNA(asn) is subsequently transamidated to Asn-tRNA(asn). This model generates very low scores for the archaeal type of aspS and for asnS; scores between the trusted and noise cutoffs represent fragmentary sequences. [Protein synthesis, tRNA aminoacylation]	583
273088	TIGR00460	fmt	methionyl-tRNA formyltransferase. The top-scoring characterized proteins other than methionyl-tRNA formyltransferase (fmt) itself are formyltetrahydrofolate dehydrogenases. The mitochondrial methionyl-tRNA formyltransferases are so divergent that, in a multiple alignment of bacterial fmt, mitochondrial fmt, and formyltetrahydrofolate dehydrogenases, the mitochondrial fmt appears the most different. However, because both bacterial and mitochondrial fmt are included in the seed alignment, all credible fmt sequences score higher than any non-fmt sequence. This enzyme modifies Met on initiator tRNA to f-Met. [Protein synthesis, tRNA aminoacylation]	313
273089	TIGR00461	gcvP	glycine dehydrogenase (decarboxylating). This apparently ubiquitous enzyme is found in bacterial, mammalian and plant sources. The enzyme catalyzes the reaction: GLYCINE + LIPOYLPROTEIN = S-AMINOMETHYL-DIHYDROLIPOYLPROTEIN + CO2. It is part of the glycine decarboxylase multienzyme complex (GDC) consisting of four proteins P, H, L and T. Active site in E.coli is located as the (K) residues at position 713 of the SEED alignment. [Energy metabolism, Amino acids and amines]	939
273090	TIGR00462	genX	EF-P lysine aminoacylase GenX. Many Gram-negative bacteria have a protein closely homologous to the C-terminal region of lysyl-tRNA synthetase (LysS). Multiple sequence alignment of these proteins with the homologous regions of collected LysS proteins shows that these proteins form a distinct set rather than just similar truncations of LysS. The protein is termed GenX after its designation in E. coli. Interestingly, genX often is located near a homolog of lysine-2,3-aminomutase. Its function is unknown. [Unknown function, General]	290
273091	TIGR00463	gltX_arch	glutamyl-tRNA synthetase, archaeal and eukaryotic family. The glutamyl-tRNA synthetases of the eukaryotic cytosol and of the Archaea are more similar to glutaminyl-tRNA synthetases than to bacterial glutamyl-tRNA synthetases. This model models just the eukaryotic cytosolic and archaeal forms of the enzyme. In some eukaryotes, the glutamyl-tRNA synthetase is part of a longer, multifunctional aminoacyl-tRNA ligase. In many species, the charging of tRNA(gln) proceeds first through misacylation with Glu and then transamidation. For this reason, glutamyl-tRNA synthetases, including all known archaeal enzymes (as of 2010) may act on both tRNA(gln) and tRNA(glu). [Protein synthesis, tRNA aminoacylation]	556
273092	TIGR00464	gltX_bact	glutamyl-tRNA synthetase, bacterial family. The glutamyl-tRNA synthetases of the eukaryotic cytosol and of the Archaea are more similar to glutaminyl-tRNA synthetases than to bacterial glutamyl-tRNA synthetases. This model models just the bacterial and mitochondrial forms of the enzyme. In many species, the charging of tRNA(gln) proceeds first through misacylation with Glu and then transamidation. For this reason, glutamyl-tRNA synthetases may act on both tRNA(gln) and tRNA(glu). This model is highly specific. Proteins with positive scores below the trusted cutoff may be fragments rather than full-length sequences. [Protein synthesis, tRNA aminoacylation]	470
273093	TIGR00465	ilvC	ketol-acid reductoisomerase. This is the second enzyme in the parallel isoleucine-valine biosynthetic pathway [Amino acid biosynthesis, Pyruvate family]	314
129558	TIGR00466	kdsB	3-deoxy-D-manno-octulosonate cytidylyltransferase. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	238
273094	TIGR00467	lysS_arch	lysyl-tRNA synthetase, archaeal and spirochete. This model represents the lysyl-tRNA synthetases that are class I amino-acyl tRNA synthetases. It includes archaeal and spirochete examples of the enzyme. All other known examples are class IIc amino-acyl tRNA synthetases and seem to form a separate orthologous set. [Protein synthesis, tRNA aminoacylation]	515
273095	TIGR00468	pheS	phenylalanyl-tRNA synthetase, alpha subunit. Most phenylalanyl-tRNA synthetases are heterodimeric, with 2 alpha (pheS) and 2 beta (pheT) subunits. This model describes the alpha subunit, which shows some similarity to class II aminoacyl-tRNA ligases. Mitochondrial phenylalanyl-tRNA synthetase is a single polypeptide chain, active as a monomer, and similar to this chain rather than to the beta chain, but excluded from this model. An interesting feature of the alignment of all sequences captured by this model is a deep split between non-spirochete bacterial examples and all other examples; supporting this split is a relative deletion of about 50 residues in the former set between two motifs well conserved throughout the alignment. [Protein synthesis, tRNA aminoacylation]	293
129561	TIGR00469	pheS_mito	phenylalanyl-tRNA synthetase, mitochondrial. Unlike all other known phenylalanyl-tRNA synthetases, the mitochondrial form demonstrated from yeast is monomeric. It is similar to but longer than the alpha subunit (PheS) of the alpha 2 beta 2 form found in Bacteria, Archaea, and eukaryotes, and shares the characteristic motifs of class II aminoacyl-tRNA ligases. This model models the experimental example from Saccharomyces cerevisiae (designated MSF1) and its orthologs from other eukaryotic species. [Protein synthesis, tRNA aminoacylation]	460
129562	TIGR00470	sepS	O-phosphoserine--tRNA ligase. This family of archaeal proteins resembles known phenylalanyl-tRNA synthetase alpha chains. Recently, it was shown to act in a proposed pathway of tRNA(Cys) indirect aminoacylation, resulting in Cys biosynthesis from O-phosphoserine, in certain archaea. It charges tRNA(Cys) with O-phosphoserine. The pscS gene product converts the phosphoserine to Cys. [Amino acid biosynthesis, Serine family, Protein synthesis, tRNA aminoacylation]	533
273096	TIGR00471	pheT_arch	phenylalanyl-tRNA synthetase, beta subunit. Every known example of the phenylalanyl-tRNA synthetase, except the monomeric form of mitochondrial, is an alpha 2 beta 2 heterotetramer. The beta subunits break into two subfamilies that are considerably different in sequence, length, and pattern of gaps. This model represents the subfamily that includes the beta subunit from eukaryotic cytosol, the Archaea, and spirochetes. [Protein synthesis, tRNA aminoacylation]	551
273097	TIGR00472	pheT_bact	phenylalanyl-tRNA synthetase, beta subunit, non-spirochete bacterial. Every known example of the phenylalanyl-tRNA synthetase, except the monomeric form of mitochondrial, is an alpha 2 beta 2 heterotetramer. The beta subunits break into two subfamilies that are considerably different in sequence, length, and pattern of gaps. This model represents the subfamily that includes the beta subunit from Bacteria other than spirochetes, as well as a chloroplast-encoded form from Porphyra purpurea. The chloroplast-derived sequence is considerably shorter at the amino end. [Protein synthesis, tRNA aminoacylation]	797
273098	TIGR00473	pssA	CDP-diacylglycerol--serine O-phosphatidyltransferase. This enzyme, CDP-diacylglycerol--serine O-phosphatidyltransferase, is involved in phospholipid biosynthesis catalyzing the reaction CDP-diacylglycerol + L-serine = CMP + L-1-phosphatidylserine. Members of this family do not bear any significant sequence similarity to the corresponding E.coli protein. [Fatty acid and phospholipid metabolism, Biosynthesis]	151
273099	TIGR00474	selA	L-seryl-tRNA(Sec) selenium transferase. In bacteria, the incorporation of selenocysteine as the 21st amino acid, encoded by TGA, requires several elements: SelC is the tRNA itself, SelD acts as a donor of reduced selenium, SelA modifies a serine residue on SelC into selenocysteine, and SelB is a selenocysteine-specific translation elongation factor. 3-prime or 5-prime non-coding elements of mRNA have been found as probable structures for directing selenocysteine incorporation. This model describes SelA. This model excludes homologs that appear to differ in function from Frankia alni, Helicobacter pylori, Methanococcus jannaschii and other archaea, and so on. [Protein synthesis, tRNA aminoacylation]	454
129567	TIGR00475	selB	selenocysteine-specific elongation factor SelB. In prokaryotes, the incorporation of selenocysteine as the 21st amino acid, encoded by TGA, requires several elements: SelC is the tRNA itself, SelD acts as a donor of reduced selenium, SelA modifies a serine residue on SelC into selenocysteine, and SelB is a selenocysteine-specific translation elongation factor. 3-prime or 5-prime non-coding elements of mRNA have been found as probable structures for directing selenocysteine incorporation. This model describes the elongation factor SelB, a close homolog of EF-Tu. It may function by replacing EF-Tu. A C-terminal domain not found in EF-Tu is in all SelB sequences in the seed alignment except that from Methanococcus jannaschii. This model does not find an equivalent protein for eukaryotes. [Protein synthesis, Translation factors]	581
273100	TIGR00476	selD	selenium donor protein. In prokaryotes, the incorporation of selenocysteine as the 21st amino acid, encoded by TGA, requires several elements: SelC is the tRNA itself, SelD acts as a donor of reduced selenium, SelA modifies a serine residue on SelC into selenocysteine, and SelB is a selenocysteine-specific translation elongation factor. 3-prime or 5-prime non-coding elements of mRNA have been found as probable structures for directing selenocysteine incorporation. This model describes SelD, known as selenophosphate synthetase, selenium donor protein, and selenide,water dikinase. SelD provides reduced selenium for the selenium transferase SelA. This protein itself contains selenocysteine in many species; any sequence scoring well but not aligning to the beginning of the model is likely to have a selenocysteine residue incorrectly interpreted as a stop codon upstream of the given sequence. The SelD protein also provides selenophosphate for the enzyme tRNA 2-selenouridine synthase, which catalyzes a tRNA base modification. It also contributes to selenium incorporation by selenium-dependent molybdenum hydroxylases (SDMH), in genomes with the marker TIGR03309. All genomes with SelD should make selenocysteine, selenouridine, SDMH, or some combination.	301
188054	TIGR00477	tehB	tellurite resistance protein TehB. Part of a tellurite-reducing operon tehA and tehB [Cellular processes, Toxin production and resistance]	195
129570	TIGR00478	tly	TlyA family rRNA methyltransferase/putative hemolysin. Members of this family include TlyA from Mycobacterium tuberculosis, an rRNA methylase whose modifications are necessary to confer sensitivity to ribosome-targeting antibiotics capreomycin and viomycin. Homology supports identification as a methyltransferase. However, a parallel literature persists in calling some members hemolysins. Hemolysins are exotoxins that attack blood cell membranes and cause cell rupture, often by forming a pore in the membrane. A recent study (2013) on SCO1782 from Streptomyces coelicolor shows hemolysin activity as earlier described for a homolog from the spirochete Serpula (Treponema) hyodysenteriae and one from Mycobacterium tuberculosis. [Unknown function, General]	228
129571	TIGR00479	rumA	23S rRNA (uracil-5-)-methyltransferase RumA. This protein family was first proposed to be RNA methyltransferases by homology to the TrmA family. The member from E. coli has now been shown to act as the 23S RNA methyltransferase for the conserved U1939. The gene is now designated rumA and was previously designated ygcA. [Protein synthesis, tRNA and rRNA base modification]	431
129572	TIGR00481	TIGR00481	Raf kinase inhibitor-like protein, YbhB/YbcL family. [Unknown function, General]	141
273101	TIGR00482	TIGR00482	nicotinate (nicotinamide) nucleotide adenylyltransferase. This model represents the predominant bacterial/eukaryotic adenylyltransferase for nicotinamide-nucleotide, its deamido form nicotinate nucleotide, or both. The first activity, nicotinamide-nucleotide adenylyltransferase (EC 2.7.7.1), synthesizes NAD by the salvage pathway, while the second, nicotinate-nucleotide adenylyltransferase (EC 2.7.7.18) synthesizes the immediate precursor of NAD by the de novo pathway. In E. coli, NadD activity is biased toward the de novo pathway while salvage activity is channeled through the multifunctional NadR protein, but this division of labor may be exceptional. The given name of this model, nicotinate (nicotinamide) nucleotide adenylyltransferase, reflects the lack of absolute specificity with respect to substrate amidation state in most species. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridine nucleotides]	193
129574	TIGR00483	EF-1_alpha	translation elongation factor EF-1 alpha. This model represents the counterpart of bacterial EF-Tu for the Archaea (aEF-1 alpha) and Eukaryotes (eEF-1 alpha). The trusted cutoff is set fairly high so that incomplete sequences will score between suggested and trusted cutoff levels. [Protein synthesis, Translation factors]	426
129575	TIGR00484	EF-G	translation elongation factor EF-G. After peptide bond formation, this elongation factor of bacteria and organelles catalyzes the translocation of the tRNA-mRNA complex, with its attached nascent polypeptide chain, from the A-site to the P-site of the ribosome. Every completed bacterial genome has at least one copy, but some species have additional EF-G-like proteins. The closest homolog to canonical (e.g. E. coli) EF-G in the spirochetes clusters as if it is derived from mitochondrial forms, while a more distant second copy is also present. Synechocystis PCC6803 has a few proteins more closely related to EF-G than to any other characterized protein. Two of these resemble E. coli EF-G more closely than does the best match from the spirochetes; it may be that both function as authentic EF-G. [Protein synthesis, Translation factors]	689
129576	TIGR00485	EF-Tu	translation elongation factor TU. This model models orthologs of translation elongation factor EF-Tu in bacteria, mitochondria, and chloroplasts, one of several GTP-binding translation factors found by the more general pfam model GTP_EFTU. The eukaryotic counterpart, eukaryotic translation elongation factor 1 (eEF-1 alpha), is excluded from this model. EF-Tu is one of the most abundant proteins in bacteria, as well as one of the most highly conserved, and in a number of species the gene is duplicated with identical function. When bound to GTP, EF-Tu can form a complex with any (correctly) aminoacylated tRNA except those for initiation and for selenocysteine, in which case EF-Tu is replaced by other factors. Transfer RNA is carried to the ribosome in these complexes for protein translation. [Protein synthesis, Translation factors]	394
213534	TIGR00486	YbgI_SA1388	dinuclear metal center protein, YbgI/SA1388 family. The characterization of this family of uncharacterized proteins as orthologous is tentative. Members are found in all three domains of life. Several members (from Bacillus subtilis, Listeria monocytogenes, and Mycobacterium tuberculosis - all classified as Firmicutes within the Eubacteria) share a long insert relative to other members. [Unknown function, General]	249
273102	TIGR00487	IF-2	translation initiation factor IF-2. This model discriminates eubacterial (and mitochondrial) translation initiation factor 2 (IF-2), encoded by the infB gene in bacteria, from similar proteins in the Archaea and Eukaryotes. In the bacteria and in organelles, the initiator tRNA is charged with N-formyl-Met instead of Met. This translation factor acts in delivering the initiator tRNA to the ribosome. It is one of a number of GTP-binding translation factors recognized by the pfam model GTP_EFTU. [Protein synthesis, Translation factors]	587
273103	TIGR00488	TIGR00488	putative HD superfamily hydrolase of NAD metabolism. The function of this protein family is unknown. Members of this family of uncharacterized proteins from the Mycoplasmas are longer at the amino end, fused to a region of nicotinamide nucleotide adenylyltransferase, an NAD salvage biosynthesis enzyme. Members are putative metal-dependent phosphohydrolases for NAD metabolism. [Unknown function, Enzymes of unknown specificity]	158
129580	TIGR00489	aEF-1_beta	translation elongation factor aEF-1 beta. This model describes the archaeal translation elongation factor aEF-1 beta. The member from Sulfolobus solfataricus was demonstrated experimentally. It is a dimer that catalyzes the exchange of GDP for GTP on aEF-1 alpha. [Protein synthesis, Translation factors]	88
129581	TIGR00490	aEF-2	translation elongation factor aEF-2. This model represents archaeal elongation factor 2, a protein more similar to eukaryotic EF-2 than to bacterial EF-G, both in sequence similarity and in sharing with eukaryotes the property of having a diphthamide (modified His) residue at a conserved position. The diphthamide can be ADP-ribosylated by diphtheria toxin in the presence of NAD. [Protein synthesis, Translation factors]	720
273104	TIGR00491	aIF-2	translation initiation factor aIF-2/yIF-2. This model describes archaeal and eukaryotic orthologs of bacterial IF-2. Like IF-2, it helps convey the initiator tRNA to the ribosome, although the initiator is N-formyl-Met in bacteria and Met here. This protein is not closely related to the subunits of eIF-2 of eukaryotes, which is also involved in the initiation of translation. The aIF-2 of Methanococcus jannaschii contains a large intein interrupting a region of very strongly conserved sequence very near the amino end; the alignment generated by this model does not correctly align the sequences from Methanococcus jannaschii and Pyrococcus horikoshii in this region. [Protein synthesis, Translation factors]	591
129583	TIGR00492	alr	alanine racemase. This enzyme interconverts L-alanine and D-alanine. Its primary function is to generate D-alanine for cell wall formation. With D-alanine-D-alanine ligase, it makes up the D-alanine branch of the peptidoglycan biosynthetic route. It is a monomer with one pyridoxal phosphate per subunit. In E. coli, the ortholog is duplicated so that a second isozyme, DadX, is present. DadX, a paralog of the biosynthetic Alr, is induced by D- or L-alanine and is involved in catabolism. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan]	367
188055	TIGR00493	clpP	ATP-dependent Clp endopeptidase, proteolytic subunit ClpP. This model for the proteolytic subunit ClpP has been rebuilt to a higher stringency. In every bacterial genome with the ClpXP machine, a ClpP protein will be found that scores well with this model. In general, this ClpP member will be encoded adjacent to the clpX gene, as were all examples used in the seed alignment. A large fraction of genomes have one or more additional ClpP paralogs, sometimes encoded nearby and sometimes elsewhere. The stringency of the trusted cutoff used here excludes the more divergent ClpP paralogs from being called authentic ClpP by this model. [Protein fate, Degradation of proteins, peptides, and glycopeptides]	192
129585	TIGR00494	crcB	protein CrcB. The role of this protein is uncharacterized, but phenotypes associated with overproduction include resistance to camphor, suppression of a mukB chromosomal partition mutant, and chromosome condensation, together suggesting a function related to chromosome folding. [Unknown function, General]	117
273105	TIGR00495	crvDNA_42K	42K curved DNA binding protein. Proteins identified by this model have been identified in a number of species as a nuclear (but not nucleolar) protein with a cell cycle dependence. Various names given to members of this family have included cell cycle protein p38-2G4, DNA-binding protein GBP16, and proliferation-associated protein 1. This protein is closely related to methionine aminopeptidase, a cobalt-binding protein. [Unknown function, General]	390
129587	TIGR00496	frr	ribosome recycling factor. This model finds only eubacterial proteins. Mitochondrial and/or chloroplast forms might be expected but are not currently known. This protein was previously called ribosome releasing factor. By releasing ribosomes from mRNA at the end of protein biosynthesis, it prevents inappropriate translation from 3-prime regions of the mRNA and frees the ribosome for new rounds of translation. EGAD|53116|YHR038W is part of the frr superfamily. [Protein synthesis, Translation factors]	176
211578	TIGR00497	hsdM	type I restriction system adenine methylase (hsdM). Function: methylation of specific adenine residues; required for both restriction and modification activities. The ECOR124/3 I enzyme recognizes 5'GAA(N7)RTCG. For E. coli, see J. Mol. Biol. 257: 960-969 (1996). [DNA metabolism, Restriction/modification]	501
273106	TIGR00498	lexA	SOS regulatory protein LexA. LexA acts as a homodimer to repress a number of genes involved in the response to DNA damage (SOS response), including itself and RecA. RecA, in the presence of single-stranded DNA, acts as a co-protease to activate a latent autolytic protease activity (EC 3.4.21.88) of LexA, where the active site Ser is part of LexA. The autolytic cleavage site is an Ala-Gly bond in LexA (at position 84-85 in E. coli LexA; this sequence is replaced by Gly-Gly in Synechocystis). The cleavage leads to derepression of the SOS regulon and eventually to DNA repair. LexA in Bacillus subtilis is called DinR. LexA is much less broadly distributed than RecA. [DNA metabolism, DNA replication, recombination, and repair, Regulatory functions, DNA interactions]	199
273107	TIGR00499	lysS_bact	lysyl-tRNA synthetase, eukaryotic and non-spirochete bacterial. This model represents the lysyl-tRNA synthetases that are class II amino-acyl tRNA synthetases. It includes all eukaryotic and most bacterial examples of the enzyme, but not archaeal or spirochete forms. [Protein synthesis, tRNA aminoacylation]	493
129591	TIGR00500	met_pdase_I	methionine aminopeptidase, type I. Methionine aminopeptidase is a cobalt-binding enzyme. Bacterial and organellar examples (type I) differ from eukaryotic and archaeal (type II) examples in lacking a region of approximately 60 amino acids between the 4th and 5th cobalt-binding ligands. This model describes type I. The role of this protein in general is to produce the mature form of cytosolic proteins by removing the N-terminal methionine. [Protein fate, Protein modification and repair]	247
129592	TIGR00501	met_pdase_II	methionine aminopeptidase, type II. Methionine aminopeptidase (map) is a cobalt-binding enzyme. Bacterial and organellar examples (type I) differ from eukaryotic and archaeal (type II) examples in lacking a region of approximately 60 amino acids between the 4th and 5th cobalt-binding ligands. The role of this protein in general is to produce the mature amino end of cytosolic proteins by removing the N-terminal methionine. This model describes type II, among which the eukaryotic members typically have an N-terminal extension not present in archaeal members. It can act cotranslationally. The enzyme from rat has been shown to associate with translation initiation factor 2 (IF-2) and may have a role in translational regulation. [Protein fate, Protein modification and repair]	295
129593	TIGR00502	nagB	glucosamine-6-phosphate isomerase. The set of proteins recognized by this model includes a closely related pair from Bacillus subtilis, one of which is uncharacterized but included as a member of the orthologous set. [Central intermediary metabolism, Amino sugars]	259
129594	TIGR00503	prfC	peptide chain release factor 3. This translation releasing factor, RF-3 (prfC) was originally described as stop codon-independent, in contrast to peptide chain release factor 1 (RF-1, prfA) and RF-2 (prfB). RF-1 and RF-2 are closely related to each other, while RF-3 is similar to elongation factors EF-Tu and EF-G; RF-1 is active at UAA and UAG and RF-2 is active at UAA and UGA. More recently, RF-3 was shown to be active primarily at UGA stop codons in E. coli. All bacteria and organelles have RF-1. The Mycoplasmas and organelles, which translate UGA as Trp rather than as a stop codon, lack RF-2. RF-3, in contrast, seems to be rare among bacteria and is found so far only in Escherichia coli and some other gamma subdivision Proteobacteria, in Synechocystis PCC6803, and in Staphylococcus aureus. [Protein synthesis, Translation factors]	527
129595	TIGR00504	pyro_pdase	pyroglutamyl-peptidase I. Alternate names include pyroglutamate aminopeptidase, pyrrolidone-carboxylate peptidase, and 5-oxoprolyl-peptidase. It removes pyroglutamate (pyrrolidone-carboxylate, a modified glutamine) that can otherwise block hydrolysis of a polypeptide at the amino end, and so can be extremely useful in the biochemical studies of proteins. The biological role in the various species in which it is found is not fully understood. The enzyme appears to be a homodimer. It does not closely resemble any other peptidases. [Protein fate, Degradation of proteins, peptides, and glycopeptides]	212
129596	TIGR00505	ribA	GTP cyclohydrolase II. Several members of the family are bifunctional, involving both ribA and ribB function. In these cases, ribA tends to be on the C-terminal end of the protein and ribB tends to be on the N-terminal. The function of archaeal members of the family has not been demonstrated and is assigned tentatively. [Biosynthesis of cofactors, prosthetic groups, and carriers, Riboflavin, FMN, and FAD]	191
273108	TIGR00506	ribB	3,4-dihydroxy-2-butanone 4-phosphate synthase. Several members of the family are bifunctional, involving both ribA and ribB function. In these cases, ribA tends to be on the C-terminal end of the protein and ribB tends to be on the N-terminal. [Biosynthesis of cofactors, prosthetic groups, and carriers, Riboflavin, FMN, and FAD]	199
161904	TIGR00507	aroE	shikimate dehydrogenase. This model finds proteins from prokaryotes and functionally equivalent domains from larger, multifunctional proteins of fungi and plants. Below the trusted cutoff of 180, but above the noise cutoff of 20, are the putative shikimate dehydrogenases of Thermotoga maritima and Mycobacterium tuberculosis, and uncharacterized paralogs of shikimate dehydrogenase from E. coli and H. influenzae. The related enzyme quinate 5-dehydrogenase scores below the noise cutoff. A neighbor-joining tree, constructed with quinate 5-dehydrogenases as the outgroup, shows the Chlamydial homolog as clustering among the shikimate dehydrogenases, although the sequence is unusual in the degree of sequence divergence and the presence of an additional N-terminal domain. [Amino acid biosynthesis, Aromatic amino acid family]	270
273109	TIGR00508	bioA	adenosylmethionine-8-amino-7-oxononanoate transaminase. All members of the seed alignment have been demonstrated experimentally to act as EC 2.6.1.62, an enzyme in the biotin biosynthetic pathway. Alternate names include 7,8-diaminopelargonic acid aminotransferase, DAPA aminotransferase, and adenosylmethionine-8-amino-7-oxononanoate aminotransferase. The gene symbol is bioA in E. coli and BIO3 in S. cerevisiae. [Biosynthesis of cofactors, prosthetic groups, and carriers, Biotin]	421
273110	TIGR00509	bisC_fam	molybdopterin guanine dinucleotide-containing S/N-oxide reductases. This enzyme family shares sequence similarity and a requirement for a molybdenum cofactor as the only prosthetic group. The form of the cofactor is a single molybdenum atom coordinated by two molybdopterin guanine dinucleotide molecules. Members of the family include biotin sulfoxide reductase, dimethylsulfoxide reductase, and trimethylamine-N-oxide reductase, although a single member may show all those activities and related activities; it may not be possible to resolve the primary function for members of this family by sequence comparison alone. A number of similar molybdoproteins in which the N-terminal region contains a CXXXC motif and may bind an iron-sulfur cluster are excluded from this set, including formate dehydrogenases and nitrate reductases. Also excluded is the A chain of a heteromeric, anaerobic DMSO reductase, which also contains the CXXXC motif.	770
273111	TIGR00510	lipA	lipoate synthase. This enzyme is an iron-sulfur protein. It is localized to mitochondria in yeast and Arabidopsis. It generates lipoic acid, a thiol antioxidant that is linked to a specific Lys as prosthetic group for the pyruvate and alpha-ketoglutarate dehydrogenase complexes and the glycine-cleavage system. The family shows strong sequence conservation. [Biosynthesis of cofactors, prosthetic groups, and carriers, Lipoate]	302
188057	TIGR00511	ribulose_e2b2	ribose-1,5-bisphosphate isomerase, e2b2 family. The delineation of this family was based originally, in part, on a discussion and neighbor-joining phylogenetic study by Kyrpides and Woese of archaeal and other proteins homologous to the alpha, beta, and delta subunits of eukaryotic initiation factor 2B (eIF-2B), a five-subunit molecule that catalyzes GTP recycling for eIF-2. Recently, Sato, et al. assigned the function ribulose-1,5 bisphosphate isomerase. [Energy metabolism, Other]	301
273112	TIGR00512	salvage_mtnA	S-methyl-5-thioribose-1-phosphate isomerase. The delineation of this family was based in part on a discussion and neighbor-joining phylogenetic study, by Kyrpides and Woese, of archaeal and other proteins homologous to the alpha, beta, and delta subunits of eukaryotic initiation factor 2B (eIF-2B), a five-subunit molecule that catalyzes GTP recycling for eIF-2. This clade is now recognized to include the methionine salvage pathway enzyme MtnA. [Amino acid biosynthesis, Aspartate family]	335
273113	TIGR00513	accA	acetyl-CoA carboxylase, carboxyl transferase, alpha subunit. The enzyme acetyl-CoA carboxylase contains a biotin carboxyl carrier protein or domain, a biotin carboxylase, and a carboxyl transferase. This model represents the alpha chain of the carboxyl transferase for cases in which the architecture of the protein is as in E. coli, in which the carboxyltransferase portion consists of two non-identical subunits, alpha and beta. [Fatty acid and phospholipid metabolism, Biosynthesis]	316
129605	TIGR00514	accC	acetyl-CoA carboxylase, biotin carboxylase subunit. This model represents the biotin carboxylase subunit found usually as a component of acetyl-CoA carboxylase. Acetyl-CoA carboxylase is designated EC 6.4.1.2 and this component, biotin carboxylase, has its own designation, EC 6.3.4.14. Homologous domains are found in eukaryotic forms of acetyl-CoA carboxylase and in a number of other carboxylases (e.g. pyruvate carboxylase), but seed members and trusted cutoff are selected so as to exclude these. In some systems, the biotin carboxyl carrier protein and this protein (biotin carboxylase) may be shared by different carboxyltransferases. However, this model is not intended to identify the biotin carboxylase domain of propionyl-coA carboxylase. The model should hit the full length of proteins, except for chloroplast transit peptides in plants. If it hits a domain only of a longer protein, there may be a problem with the identification. [Fatty acid and phospholipid metabolism, Biosynthesis]	449
129606	TIGR00515	accD	acetyl-CoA carboxylase, carboxyl transferase, beta subunit. The enzyme acetyl-CoA carboxylase contains a biotin carboxyl carrier protein or domain, a biotin carboxylase, and a carboxyl transferase. This model represents the beta chain of the carboxyl transferase for cases in which the architecture of the protein is as in E. coli, in which the carboxyltransferase portion consists of two non-identical subunits, alpha and beta. [Fatty acid and phospholipid metabolism, Biosynthesis]	285
273114	TIGR00516	acpS	holo-[acyl-carrier-protein] synthase. Formerly dpj. This enzyme adds the prosthetic group, phosphopantetheine, to the acyl carrier protein (ACP) apo-enzyme to generate the holo-enzyme. Related phosphopantetheine--protein transferases also exist. There is an orthologous domain in eukaryotic proteins. [Fatty acid and phospholipid metabolism, Biosynthesis]	121
213536	TIGR00517	acyl_carrier	acyl carrier protein. This small protein has phosphopantetheine covalently bound to a Ser residue. It acts as a carrier of the growing fatty acid chain, which is bound to the prosthetic group, during fatty acid biosynthesis. Homologous phosphopantetheine-binding domains are found in longer proteins. Acyl carrier proteins scoring above the noise cutoff but below the trusted cutoff may be specialized versions. These include those involved in mycolic acid biosynthesis in the Mycobacteria, lipid A biosynthesis in Rhizobium, actinorhodin polyketide synthesis in Streptomyces coelicolor, etc. This protein is not found in the Archaea. Gene name acpP. S (Ser) at position 37 in the seed alignment, in the motif DSLD, is the phosphopantetheine attachment site. [Fatty acid and phospholipid metabolism, Biosynthesis]	77
129609	TIGR00518	alaDH	alanine dehydrogenase. The family of known L-alanine dehydrogenases (EC 1.4.1.1) includes representatives from the Proteobacteria, Firmicutes, Cyanobacteria, and Actinobacteria, all with about 50 % identity or better. An outlier to this group in both sequence and gap pattern is the homolog from Helicobacter pylori, an epsilon division Proteobacteria, which must be considered a putative alanine dehydrogenase. In Mycobacterium smegmatis and M. tuberculosis, the enzyme doubles as a glycine dehydrogenase (1.4.1.10), running in the reverse direction (glyoxylate amination to glycine, with conversion of NADH to NAD+). Related proteins include saccharopine dehydrogenase and the N-terminal half of the NAD(P) transhydrogenase alpha subunit. All of these related proteins bind NAD and/or NADP. [Energy metabolism, Amino acids and amines]	370
129610	TIGR00519	asnASE_I	L-asparaginase, type I. Two related families of asparaginase are designated type I and type II according to the terminology in E. coli, which has both: L-asparaginase I is a low-affinity enzyme found in the cytoplasm, while L-asparaginase II is a high-affinity secreted enzyme synthesized with a cleavable signal sequence. This model describes L-asparaginases related to type I of E. coli. Archaeal putative asparaginases are of this type but contain an extra ~ 80 residues in a conserved N-terminal region. These archaeal homologs are included in this model.	336
273115	TIGR00520	asnASE_II	L-asparaginase, type II. Two related families of asparaginase (L-asparagine amidohydrolase, EC 3.5.1.1) are designated type I and type II according to the terminology in E. coli, which has both: L-asparaginase I is a low-affinity enzyme found in the cytoplasm, while L-asparaginase II is a high-affinity periplasmic enzyme synthesized with a cleavable signal sequence. This model describes L-asparaginases related to type II of E. coli. Both the cytoplasmic and the cell wall asparaginases of Saccharomyces cerevisiae belong to this set. Members of this set from Acinetobacter glutaminasificans and Pseudomonas fluorescens are described as having both glutaminase and asparaginase activities. All members are homotetrameric. [Energy metabolism, Amino acids and amines]	349
273116	TIGR00521	coaBC_dfp	phosphopantothenoylcysteine decarboxylase / phosphopantothenate--cysteine ligase. This model represents a bifunctional enzyme that catalyzes the second and third steps (cysteine ligation, EC 6.3.2.5, and decarboxylation, EC 4.1.1.36) in the biosynthesis of coenzyme A (CoA) from pantothenate in bacteria. In early descriptions of this flavoprotein, a ts mutation in one region of the protein appeared to cause a defect in DNA metabolism rather than an increased need for the pantothenate precursor beta-alanine. This protein was then called dfp, for DNA/pantothenate metabolism flavoprotein. The authors responsible for detecting phosphopantothenate--cysteine ligase activity suggest renaming this bifunctional protein coaBC for its role in CoA biosynthesis. This enzyme contains the FMN cofactor, but no FAD or pyruvoyl group. The amino-terminal region contains the phosphopantothenoylcysteine decarboxylase activity. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pantothenate and coenzyme A]	391
273117	TIGR00522	dph5	diphthine synthase. Alternate name: diphthamide biosynthesis S-adenosylmethionine-dependent methyltransferase. This protein participates in the modification of a specific His of elongation factor 2 of eukaryotes and Archaea to diphthamide. The protein was characterized in Saccharomyces cerevisiae and designated DPH5. [Protein fate, Protein modification and repair]	257
273118	TIGR00523	eIF-1A	eukaryotic/archaeal initiation factor 1A. Recommended nomenclature: eIF-1A for eukaryotes, aIF-1A for Archaea. Also called eIF-4C. [Protein synthesis, Translation factors]	98
273119	TIGR00524	eIF-2B_rel	eIF-2B alpha/beta/delta-related uncharacterized proteins. This model, eIF-2B_rel, describes half of a superfamily, where the other half consists of eukaryotic translation initiation factor 2B (eIF-2B) subunits alpha, beta, and delta. It is unclear whether the eIF-2B_rel set is monophyletic, or whether they are all more closely related to each other than to any eIF-2B subunit because the eIF-2B clade is highly derived. Members of this branch of the family are all uncharacterized with respect to function and are found in the Archaea, Bacteria, and Eukarya, although a number are described as putative translation initiation factor components. Proteins found by eIF-2B_rel include at least three clades, including a set of uncharacterized eukaryotic proteins, a set found in some but not all Archaea, and a set universal so far among the Archaea and closely related to several uncharacterized bacterial proteins. [Unknown function, General]	303
213537	TIGR00525	folB	dihydroneopterin aldolase. This model describes a bacterial dihydroneopterin aldolase, shown to form homo-octamers in E. coli. The equivalent activity is catalyzed by domains of larger folate biosynthesis proteins in other systems. The closely related paralogous enzyme in E. coli, dihydroneopterin triphosphate epimerase, which is also homo-octameric, and dihydroneopterin aldolase domains of larger proteins, score below the trusted cutoff but may score well above the noise cutoff. [Biosynthesis of cofactors, prosthetic groups, and carriers, Folic acid]	116
273120	TIGR00526	folB_dom	FolB domain. Two paralogous genes of E. coli, folB (dihydroneopterin aldolase) and folX (d-erythro-7,8-dihydroneopterin triphosphate epimerase) are homologous to each other and homo-octameric. In Pneumocystis carinii, a multifunctional enzyme of folate synthesis has an N-terminal region active as dihydroneopterin aldolase. This region consists of two tandem sequences each homologous to folB and forms tetramers.	118
200024	TIGR00527	gcvH	glycine cleavage system H protein. This model represents the glycine cleavage system H protein, which shuttles the methylamine group of glycine from the P protein to the T protein. The mature protein is about 130 residues long and contains a lipoyl group covalently bound to a conserved Lys residue. The genome of Aquifex aeolicus contains one protein scoring above the trusted cutoff and clustering with other bacterial H proteins, and four more proteins clustering together and scoring below the trusted cutoff; it seems doubtful that all of these homologs are authentic H protein. The Chlamydial homolog of H protein is nearly as divergent as the Aquifex outgroup, is not accompanied by P and T proteins, is not included in the seed alignment, and consequently also scores below the trusted cutoff. [Energy metabolism, Amino acids and amines]	128
273121	TIGR00528	gcvT	glycine cleavage system T protein. The glycine cleavage system T protein (GcvT) is also known as aminomethyltransferase (EC 2.1.2.10). It works with the H protein (GcvH), the P protein (GcvP), and lipoamide dehydrogenase. The reported sequence of the member from Aquifex aeolicus starts about 50 residues downstream of the start of other members of the family (perhaps in error); it scores below the trusted cutoff. Eukaryotic forms are mitochondrial and have an N-terminal transit peptide. [Energy metabolism, Amino acids and amines]	362
273122	TIGR00529	AF0261	integral membrane protein, TIGR00529 family. This protein is predicted to have 10 transmembrane regions. Members of this family are found so far in the Archaea (Archaeoglobus fulgidus and Pyrococcus horikoshii) and in a bacterial thermophile, Thermotoga maritima. In Pyrococcus, the gene is located between nadA and nadB, two components of an enzyme involved in de novo synthesis of NAD. By PSI-BLAST, this family shows similarity (but not necessarily homology) to gluconate permease and other transport proteins. [Hypothetical proteins, Conserved]	387
129621	TIGR00530	AGP_acyltrn	1-acyl-sn-glycerol-3-phosphate acyltransferases. This model describes the core homologous region of a collection of related proteins, several of which are known to act as 1-acyl-sn-glycerol-3-phosphate acyltransferases (EC 2.3.1.51). Proteins scoring above the trusted cutoff are likely to have the same general activity. However, there is variation among characterized members as to whether the acyl group can be donated by acyl carrier protein or coenzyme A, and in the length and saturation of the donated acyl group. 1-acyl-sn-glycerol-3-phosphate acyltransferase is also called 1-AGP acyltransferase, lysophosphatidic acid acyltransferase, and LPA acyltransferase. [Fatty acid and phospholipid metabolism, Biosynthesis]	130
273123	TIGR00531	BCCP	acetyl-CoA carboxylase, biotin carboxyl carrier protein. This model is designed to identify biotin carboxyl carrier protein as a peptide of acetyl-CoA carboxylase. Scoring below the trusted cutoff is a related protein encoded in a region associated with polyketide synthesis in the prokaryote Saccharopolyspora hirsuta, and a reported chloroplast-encoded biotin carboxyl carrier protein that may be highly derived from the last common ancestral sequence. Scoring below the noise cutoff are biotin carboxyl carrier domains of other enzymes such as pyruvate carboxylase. The gene name is accB or fabE. [Fatty acid and phospholipid metabolism, Biosynthesis]	155
129623	TIGR00532	HMG_CoA_R_NAD	hydroxymethylglutaryl-CoA reductase, degradative. Most known examples of hydroxymethylglutaryl-CoA reductase are NADP-dependent (EC 1.1.1.34) from eukaryotes and archaea, involved in the biosynthesis of mevalonate from 3-hydroxy-3-methylglutaryl-CoA. This model, in contrast, is built from the two examples in completed genomes of sequences closely related to the degradative, NAD-dependent hydroxymethylglutaryl-CoA reductase of Pseudomonas mevalonii, a bacterium that can use mevalonate as its sole carbon source. [Energy metabolism, Other]	393
129624	TIGR00533	HMG_CoA_R_NADP	3-hydroxy-3-methylglutaryl Coenzyme A reductase, hydroxymethylglutaryl-CoA reductase (NADP). This model represents archaeal examples of the enzyme hydroxymethylglutaryl-CoA reductase (NADP) (EC 1.1.1.34) and the catalytic domain of eukaryotic examples, which also contain a hydrophobic N-terminal domain. This enzyme synthesizes mevalonate, a precursor of isopentenyl pyrophosphate (IPP), a building block for the synthesis of cholesterol, isoprenoids, and other molecules. A related hydroxymethylglutaryl-CoA reductase, typified by an example from Pseudomonas mevalonii, is NAD-dependent and catabolic. [Central intermediary metabolism, Other]	402
213538	TIGR00534	OpcA	glucose-6-phosphate dehydrogenase assembly protein OpcA. The opcA gene is found immediately downstream of zwf, the glucose-6-phosphate dehydrogenase (G6PDH) gene, in a number of species, including Mycobacterium tuberculosis, Streptomyces coelicolor, Nostoc punctiforme, and Synechococcus sp. PCC 7942. In the latter, disruption of opcA was shown to block assembly of G6PDH into active oligomeric forms. [Protein fate, Protein folding and stabilization]	311
273124	TIGR00535	SAM_DCase	S-adenosylmethionine decarboxylase proenzyme, eukaryotic form. This enzyme is a key regulatory enzyme of the polyamine synthetic pathway. This protein is a pyruvoyl-dependent enzyme. The proenzyme is cleaved at a Ser residue that becomes a pyruvoyl group active site. [Central intermediary metabolism, Polyamine biosynthesis]	334
273125	TIGR00536	hemK_fam	HemK family putative methylases. The gene hemK from E. coli was found to contribute to heme biosynthesis and originally suggested to be protoporphyrinogen oxidase. Functional analysis of the nearest homolog in Saccharomyces cerevisiae, YNL063w, finds it is not protoporphyrinogen oxidase and sequence analysis suggests that HemK homologs have S-adenosyl-methionine-dependent methyltransferase activity (Medline 99237242). Homologs are found, usually in a single copy, in nearly all completed genomes, but varying somewhat in apparent domain architecture. Both E. coli and H. influenzae have two members rather than one. The members from the Mycoplasmas have an additional C-terminal domain. [Protein fate, Protein modification and repair]	284
129628	TIGR00537	hemK_rel_arch	HemK-related putative methylase. The gene hemK from E. coli was found to contribute to heme biosynthesis and originally suggested to be protoporphyrinogen oxidase. Functional analysis of the nearest homolog in Saccharomyces cerevisiae, YNL063w, finds it is not protoporphyrinogen oxidase and sequence analysis suggests that HemK homologs have S-adenosyl-methionine-dependent methyltransferase activity (Medline 99237242). Homologs are found, usually in a single copy, in nearly all completed genomes, but varying somewhat in apparent domain architecture. This model represents an archaeal and eukaryotic protein family that lacks an N-terminal domain found in HemK and its eubacterial homologs. It is found in a single copy in the first six completed archaeal and eukaryotic genomes. [Unknown function, Enzymes of unknown specificity]	179
129629	TIGR00538	hemN	oxygen-independent coproporphyrinogen III oxidase. This model represents HemN, the oxygen-independent coproporphyrinogen III oxidase that replaces HemF function under anaerobic conditions. Several species, including E. coli, Helicobacter pylori, and Aquifex aeolicus, have both a member of this family and a member of another, closely related family for which there is no evidence of coproporphyrinogen III oxidase activity. Members of this family have a perfectly conserved motif PYRT[SC]YP in a region N-terminal to the region of homology with the related uncharacterized protein. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	455
129630	TIGR00539	hemN_rel	putative oxygen-independent coproporphyrinogen III oxidase. Experimentally determined examples of oxygen-independent coproporphyrinogen III oxidase, an enzyme that replaces HemF function under anaerobic conditions, belong to a family of proteins described by the model hemN. This model, hemN_rel, models a closely related protein, shorter at the amino end and lacking the region containing the motif PYRT[SC]YP found in members of the hemN family. Several species, including E. coli, Helicobacter pylori, Aquifex aeolicus, and Chlamydia trachomatis, have members of both this family and the E. coli hemN family. The member of this family from Bacillus subtilis was shown to complement an hemF/hemN double mutant of Salmonella typhimurium and to prevent accumulation of coproporphyrinogen III under anaerobic conditions, but the exact role of this protein is still uncertain. It is found in a number of species that do not synthesize heme de novo. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	360
273126	TIGR00540	TPR_hemY_coli	heme biosynthesis-associated TPR protein. Members of this protein family are uncharacterized tetratricopeptide repeat (TPR) proteins invariably found in heme biosynthesis gene clusters. The absence of any invariant residues other than Ala argues against this protein serving as an enzyme per se. The gene symbol hemY assigned in E. coli is unfortunate in that an unrelated protein, protoporphyrinogen oxidase (HemG in E. coli) is designated HemY in Bacillus subtilis. [Unknown function, General]	367
129632	TIGR00541	hisDCase_pyru	histidine decarboxylase, pyruvoyl type. This enzyme converts histidine to histamine in a single step by catalyzing the release of CO2. This type is synthesized as an inactive single chain precursor, then cleaved into two chains. The Ser at the new N-terminus at the cleavage site is converted to a pyruvoyl group essential for activity. This type of histidine decarboxylase is known so far only in some Gram-positive bacteria, where it may play a role in amino acid catabolism. There is also a pyridoxal phosphate type histidine decarboxylase, as found in human, where histamine is a biologically active amine. [Energy metabolism, Amino acids and amines]	310
129633	TIGR00542	hxl6Piso_put	hexulose-6-phosphate isomerase, putative. This family shows similarity by PSI-BLAST to other isomerases. Putative identification as hexulose-6-phosphate isomerase is reported in Swiss-Prot, attributing a discussion in Genome Sci. Technol. 1:53-75(1996). This family is conserved at better than 40 % identity among the four known examples from three species: Escherichia coli (SgbU and SgaU), Haemophilus influenzae, and Mycoplasma pneumoniae. The rarity of the family, high level of conservation, and proposed catabolic role suggests lateral transfer may be a part of the evolutionary history of this protein. [Energy metabolism, Sugars]	279
273127	TIGR00543	isochor_syn	isochorismate synthases. This enzyme interconverts chorismate and isochorismate. In E. coli, different loci encode isochorismate synthases for the pathways of menaquinone biosynthesis and enterobactin biosynthesis (via salicylate) and fail to complement each other. Among isochorismate synthases, the N-terminal domain is poorly conserved. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone]	351
273128	TIGR00544	lgt	prolipoprotein diacylglyceryl transferase. The conversion of lipoprotein precursors into lipoproteins consists of three steps. First, the enzyme described by this model transfers a diacylglyceryl moiety from phosphatidylglycerol to the side chain of a Cys that will become the new N-terminus. Second, the signal peptide is removed by signal peptidase II. Finally, the free amino group of the new N-terminal Cys is acylated by apolipoprotein N-acyltransferase. [Protein fate, Protein modification and repair]	277
161920	TIGR00545	lipoyltrans	lipoyltransferase and lipoate-protein ligase. One member of this group of proteins is bovine lipoyltransferase, which transfers the lipoyl group from lipoyl-AMP to the specific Lys of lipoate-dependent enzymes. However, it does not first activate lipoic acid with ATP to create lipoyl-AMP and pyrophosphate. Another member of this group, lipoate-protein ligase A from E. coli, catalyzes both the activation and the transfer of lipoate. Homology between the two is full-length, except for the bovine mitochondrial targeting signal, but is strongest toward the N-terminus. [Protein fate, Protein modification and repair]	324
273129	TIGR00546	lnt	apolipoprotein N-acyltransferase. This enzyme transfers the acyl group to lipoproteins in the lgt/lsp/lnt system which is found broadly in bacteria but not in archaea. This model represents one component of the "lipoprotein lgt/lsp/lnt system" genome property. [Protein fate, Protein modification and repair]	391
129638	TIGR00547	lolA	periplasmic chaperone LolA. This protein, LolA, is known so far only in the gamma and beta subdivisions of the Proteobacteria. The major outer lipoprotein (Lpp) of E. coli is released from the inner membrane as a complex with this chaperone in an energy-requiring process, and is then delivered to LolB for insertion into the outer membrane. LolA is involved in the delivery of lipoproteins generally, rather than just Lpp, and is an essential protein in E. coli, unlike Lpp itself. [Protein fate, Protein and peptide secretion and trafficking]	204
129639	TIGR00548	lolB	outer membrane lipoprotein LolB. This protein, LolB, is known so far only in the gamma and beta subdivisions of the Proteobacteria. It is a processed, lipid-modified outer membrane protein. It is required in E. coli for insertion of the major outer lipoprotein (Lpp) into the outer membrane. Lpp is transferred to LolB from the carrier protein LolA in the periplasm. Previously, this protein was thought to play a role in 5-aminolevulinic acid synthesis and was designated HemM. [Protein fate, Protein and peptide secretion and trafficking]	202
273130	TIGR00549	mevalon_kin	mevalonate kinase. This model represents mevalonate kinase, the third step in the mevalonate pathway of isopentenyl pyrophosphate (IPP) biosynthesis. IPP is a common intermediate for a number of pathways including cholesterol biosynthesis. This model covers enzymes from eukaryotes, archaea and bacteria. The related enzyme from the same pathway, phosphomevalonate kinase, serves as an outgroup for this clade. Paracoccus exhibits two genes within the phosphomevalonate/mevalonate kinase family, one of which falls between trusted and noise cutoffs of this model. The degree of divergence is high, but if the trees created from this model are correct, the proper names of these genes have been swapped. [Central intermediary metabolism, Other]	273
129641	TIGR00550	nadA	quinolinate synthetase complex, A subunit. This protein, termed NadA, plays a role in the synthesis of pyridine, a precursor to NAD. The quinolinate synthetase complex consists of A protein (this protein) and B protein. B protein converts L-aspartate to iminoaspartate, an unstable reaction product which in the absence of A protein is spontaneously hydrolyzed to form oxaloacetate. The A protein, NadA, converts iminoaspartate to quinolinate. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridine nucleotides]	310
273131	TIGR00551	nadB	L-aspartate oxidase. L-aspartate oxidase is the B protein, NadB, of the quinolinate synthetase complex. Quinolinate synthetase makes a precursor of the pyridine nucleotide portion of NAD. This model identifies proteins that cluster as L-aspartate oxidase (a flavoprotein difficult to separate from the set of closely related flavoprotein subunits of succinate dehydrogenase and fumarate reductase) by both UPGMA and neighbor-joining trees. The most distant protein accepted as an L-aspartate oxidase (NadB), that from Pyrococcus horikoshii, not only clusters with other NadB but is just one gene away from NadA. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridine nucleotides]	489
273132	TIGR00552	nadE	NAD+ synthetase. NAD+ synthetase is a nearly ubiquitous enzyme for the final step in the biosynthesis of the essential cofactor NAD. The member of this family from Bacillus subtilis is a strictly NH(3)-dependent NAD(+) synthetase of 272 amino acids. Proteins consisting only of the domain modeled here may be named as NH3-dependent NAD+ synthetase. Amidotransferase activity may reside in a separate protein, or not be present. Some other members of the family, such as from Mycobacterium tuberculosis, are considerably longer, contain an apparent amidotransferase domain, and show glutamine-dependent as well as NH(3)-dependent activity. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridine nucleotides]	250
273133	TIGR00553	pabB	aminodeoxychorismate synthase, component I, bacterial clade. Members of this family, aminodeoxychorismate synthase, component I (PabB), were designated para-aminobenzoate synthase component I until it was recognized that PabC, a lyase, completes the pathway of PABA synthesis. This family is closely related to anthranilate synthase component I (trpE), and both act on chorismate. The clade of PabB enzymes represented by this model includes sequences from Gram-positive and alpha and gamma Proteobacteria as well as Chlorobium, Nostoc, Fusobacterium and Arabidopsis. A closely related clade of fungal PabB enzymes is identified by TIGR01823, while another bacterial clade of potential PabB enzymes is more closely related to TrpE (TIGR01824). [Biosynthesis of cofactors, prosthetic groups, and carriers, Folic acid]	328
273134	TIGR00554	panK_bact	pantothenate kinase, bacterial type. Shown to be a homodimer in E. coli. This enzyme catalyzes the rate-limiting step in the biosynthesis of coenzyme A. It is very well conserved from E. coli to B. subtilis, but differs considerably from known eukaryotic forms, described in a separate model. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pantothenate and coenzyme A]	290
273135	TIGR00555	panK_eukar	pantothenate kinase, eukaryotic/staphylococcal type. This model describes a eukaryotic form of pantothenate kinase, characterized from the fungus Aspergillus nidulans and with similar forms known in several other eukaryotes. It also includes forms from several Gram-positive bacteria suggested to have originated from the eukaryotic form by lateral transfer. It differs in a number of biochemical properties (such as inhibition by acetyl-CoA) from most bacterial CoaA and lacks sequence similarity. This enzyme is the key regulatory step in the biosynthesis of coenzyme A (CoA). [Biosynthesis of cofactors, prosthetic groups, and carriers, Pantothenate and coenzyme A]	296
273136	TIGR00556	pantethn_trn	phosphopantetheine--protein transferase domain. This model describes a domain active in transferring the phosphopantetheine prosthetic group to its attachment site on enzymes and carrier proteins. Many members of this family are small proteins that act on the acyl carrier protein involved in fatty acid biosynthesis. Some members are domains of larger proteins involved in specialized pathways for the synthesis of unusual molecules including polyketides, atypical fatty acids, and antibiotics. [Protein fate, Protein modification and repair]	128
273137	TIGR00557	pdxA	4-hydroxythreonine-4-phosphate dehydrogenase. This model represents PdxA, an NAD+-dependent 4-hydroxythreonine 4-phosphate dehydrogenase (EC 1.1.1.262) active in pyridoxal phosphate biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridoxine]	320
273138	TIGR00558	pdxH	pyridoxamine-phosphate oxidase. This model is similar to Pyridox_oxidase from Pfam but is designed to find only true pyridoxamine-phosphate oxidase and to ignore the related protein PhzG involved in phenazine biosynthesis. This protein from E. coli was characterized as a homodimer with two FMN per dimer. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridoxine]	190
188064	TIGR00559	pdxJ	pyridoxine 5'-phosphate synthase. PdxJ is required in the biosynthesis of pyridoxine (vitamin B6), a precursor to the enzyme cofactor pyridoxal phosphate. ECOCYC describes the predicted reaction equation as 1-amino-propan-2-one-3-phosphate + deoxyxylulose-5-phosphate = pyridoxine-5'-phosphate. The product of that reaction is oxidized by PdxH to pyridoxal 5'-phosphate. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridoxine]	236
273139	TIGR00560	pgsA	CDP-diacylglycerol--glycerol-3-phosphate 3-phosphatidyltransferase. Alternate names: phosphatidylglycerophosphate synthase; glycerophosphate phosphatidyltransferase; PGP synthase. A number of related enzymes are quite similar in both sequence and catalytic activity, including Saccharomyces cerevisiae YDL142c, now known to be a cardiolipin synthase. There may be problems with incorrect transitive annotation of near homologs as authentic CDP-diacylglycerol--glycerol-3-phosphate 3-phosphatidyltransferase. [Fatty acid and phospholipid metabolism, Biosynthesis]	182
273140	TIGR00561	pntA	NAD(P) transhydrogenase, alpha subunit. This integral membrane protein is the alpha subunit of an alpha 2 beta 2 tetramer that couples proton transport across the membrane to the reversible transfer of hydride ion equivalents between NAD and NADP. An alternate name is pyridine nucleotide transhydrogenase alpha subunit. The N-terminal region is homologous to alanine dehydrogenase. In some species, such as Rhodospirillum rubrum, the alpha chain is replaced by two shorter chains, both with some homology to the full-length alpha chain modeled here. These score below the trusted cutoff. [Energy metabolism, Electron transport]	512
213540	TIGR00562	proto_IX_ox	protoporphyrinogen oxidase. This enzyme oxidizes protoporphyrinogen IX to protoporphyrin IX, a precursor of heme and chlorophyll. Bacillus subtilis HemY also has coproporphyrinogen III to coproporphyrin III oxidase activity in a heterologous expression system, although the role for this activity in vivo is unclear. This protein is a flavoprotein and has a beta-alpha-beta dinucleotide binding motif near the amino end. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	462
273141	TIGR00563	rsmB	16S rRNA (cytosine(967)-C(5))-methyltransferase. This protein is also known as sun protein. The reading frame was originally interpreted as two reading frames, fmu and fmv. The recombinant protein from E. coli was shown to methylate only C967 of small subunit (16S) ribosomal RNA and to produce only m5C at that position. The seed alignment is built from bacterial sequences only. Eukaryotic homologs include Nop2, a protein required for processing pre-rRNA, that is likely also a rRNA methyltransferase, although the fine specificity may differ. Cutoff scores are set to avoid treating archaeal and eukaryotic homologs automatically as functionally equivalent, although they may have very similar roles. [Protein synthesis, tRNA and rRNA base modification]	426
273142	TIGR00564	trpE_most	anthranilate synthase component I, non-proteobacterial lineages. This enzyme resembles some other chorismate-binding enzymes, including para-aminobenzoate synthase (pabB) and isochorismate synthase. There is a fairly deep split between two sets, seen in the pattern of gaps as well as in amino acid sequence differences. Archaeal enzymes have been excluded from this model (and are now found in TIGR01820) as have a clade of enzymes which constitute a TrpE paralog which may have PabB activity (TIGR01824). This allows the B. subtilis paralog which has been shown to have PabB activity to score below trusted to this model. This model contains sequences from gram-positive bacteria, certain proteobacteria, cyanobacteria, plants, fungi and assorted other bacteria. A second family of TrpE enzymes is modelled by TIGR00565. The breaking of the TrpE family into these diverse models allows for the separation of the models for the related enzyme, PabB. [Amino acid biosynthesis, Aromatic amino acid family]	454
273143	TIGR00565	trpE_proteo	anthranilate synthase component I, proteobacterial subset. This enzyme resembles some other chorismate-binding enzymes, including para-aminobenzoate synthase (pabB) and isochorismate synthase. There is a fairly deep split between two sets, seen in the pattern of gaps as well as in amino acid sequence differences. This group includes proteobacteria such as E. coli and Helicobacter pylori but also the gram-positive organism Corynebacterium glutamicum. The second group includes eukaryotes, archaea, and most other bacterial lineages; sequences from the second group may resemble pabB more closely than other trpE from this group. [Amino acid biosynthesis, Aromatic amino acid family]	498
273144	TIGR00566	trpG_papA	glutamine amidotransferase of anthranilate synthase or aminodeoxychorismate synthase. This model describes the glutamine amidotransferase domain or peptide of the tryptophan-biosynthetic pathway enzyme anthranilate synthase or of the folate biosynthetic pathway enzyme para-aminobenzoate synthase. In at least one case, a single polypeptide from Bacillus subtilis was shown to have both functions. This model covers a subset of the sequences described by the Pfam model GATase.	188
273145	TIGR00567	3mg	DNA-3-methyladenine glycosylase. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). All proteins in this family for which the function is known are involved in the base excision repair of alkylation damage to DNA. The exact specificity of the type of alkylation damage repaired by each of these varies somewhat between species. Substrates include 3-methyl adenine, 7-methyl-guanine, and 3-methyl-guanine. [DNA metabolism, DNA replication, recombination, and repair]	192
129659	TIGR00568	alkb	DNA alkylation damage repair protein AlkB. Proteins in this family have an as of yet undetermined function in the repair of alkylation damage to DNA. Alignment and family designation based on phylogenomic analysis of Jonathan A. Eisen (PhD Thesis, Stanford University, 1999). [DNA metabolism, DNA replication, recombination, and repair]	169
129660	TIGR00569	ccl1	cyclin ccl1. All proteins in this family for which functions are known are cyclins that are components of TFIIH, a complex that is involved in nucleotide excision repair and transcription initiation. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	305
129661	TIGR00570	cdk7	CDK-activating kinase assembly factor MAT1. All proteins in this family for which functions are known are cyclin dependent protein kinases that are components of TFIIH, a complex that is involved in nucleotide excision repair and transcription initiation. Also known as MAT1 (menage a trois 1). This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	309
273146	TIGR00571	dam	DNA adenine methylase (dam). All proteins in this family for which functions are known are DNA-adenine methyltransferases. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). The DNA adenine methylase (dam) of E. coli and related species is instrumental in distinguishing the newly synthesized strand during DNA replication for methylation-directed mismatch repair. This family includes several phage methylases and a number of different restriction enzyme chromosomal site-specific modification systems. [DNA metabolism, DNA replication, recombination, and repair]	267
129663	TIGR00573	dnaq	exonuclease, DNA polymerase III, epsilon subunit family. All proteins in this family for which functions are known are components of the DNA polymerase III complex (epsilon subunit). There is, however, an outgroup that includes paralogs in some gamma-proteobacteria and the N-terminal region of DinG from some low GC gram positive bacteria. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, Degradation of DNA]	217
273147	TIGR00574	dnl1	DNA ligase I, ATP-dependent (dnl1). All proteins in this family with known functions are ATP-dependent DNA ligases. Functions include DNA repair, DNA replication, and DNA recombination (or any process requiring ligation of two single-stranded DNA sections). This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	514
273148	TIGR00575	dnlj	DNA ligase, NAD-dependent. All proteins in this family with known functions are NAD-dependent DNA ligases. Functions of these proteins include DNA repair, DNA replication, and DNA recombination. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). The member of this family from Treponema pallidum differs in having three rather than just one copy of the BRCT (BRCA1 C Terminus) domain (pfam00533) at the C-terminus. It is included in the seed. [DNA metabolism, DNA replication, recombination, and repair]	652
273149	TIGR00576	dut	deoxyuridine 5'-triphosphate nucleotidohydrolase (dut). The main function of these proteins is in maintaining the levels of dUTP in the cell to prevent dUTP incorporation into DNA during DNA replication. Pol proteins in viruses are very similar to this protein family. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). Changed role from 132 to 123. RTD [Purines, pyrimidines, nucleosides, and nucleotides, 2'-Deoxyribonucleotide metabolism]	142
273150	TIGR00577	fpg	DNA-formamidopyrimidine glycosylase. All proteins in the FPG family with known functions are FAPY-DNA glycosylases that function in base excision repair. Homologous to endonuclease VIII (nei). This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	272
273151	TIGR00578	ku70	ATP-dependent DNA helicase II, 70 kDa subunit (ku70). Proteins in this family are involved in non-homologous end joining, a process used for the repair of double stranded DNA breaks. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). Cutoff does not detect the putative ku70 homologs in yeast. [DNA metabolism, DNA replication, recombination, and repair]	586
273152	TIGR00580	mfd	transcription-repair coupling factor (mfd). All proteins in this family for which functions are known are DNA-dependent ATPases that function in the process of transcription-coupled DNA repair, in which the transcribed strand of actively transcribed genes is repaired at a higher rate than non-transcribed regions of the genome and the non-transcribed strand of the same gene. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). This family is closely related to the RecG and UvrB families. [DNA metabolism, DNA replication, recombination, and repair]	926
129670	TIGR00581	moaC	molybdenum cofactor biosynthesis protein MoaC. MoaC catalyzes an early step in molybdenum cofactor biosynthesis in E. coli. The Arabidopsis homolog Cnx3 complements MoaC deficiency in E. coli. Eukaryotic members of this family branch within the bacterial branch, with the archaeal members as an apparent outgroup. This protein is absent in a number of the pathogens with smaller genomes, including Mycoplasmas, Chlamydias, and spirochetes, but is found in most other complete genomes to date. The homolog from Synechocystis sp. is fused to a MobA-homologous region and is an outlier to all other bacterial forms by both neighbor-joining and UPGMA analyses. Members of this family are well-conserved. The seed for this model excludes both archaeal sequences and the most divergent bacterial sequences, but still finds all candidate MoaC sequences easily between trusted and noise cutoffs. We suggest that sequences branching outside the set that contains all seed members be regarded only as putative functional equivalents of MoaC unless and until a member of the archaeal outgroup is shown to have equivalent function. [Biosynthesis of cofactors, prosthetic groups, and carriers, Molybdopterin]	147
273153	TIGR00583	mre11	DNA repair protein (mre11). All proteins in this family for which functions are known are subunits of a nuclease complex made up of multiple proteins including MRE11 and RAD50 homologs. The functions of this nuclease complex include recombinational repair and non-homologous end joining. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). The proteins in this family are distantly related to proteins in the SbcCD complex of bacteria. [DNA metabolism, DNA replication, recombination, and repair]	405
273154	TIGR00584	mug	mismatch-specific thymine-DNA glycosylase (mug). All proteins in this family for which functions are known are G-T or G-U mismatch glycosylases that function in base excision repair. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). Used 2pf model. [DNA metabolism, DNA replication, recombination, and repair]	328
273155	TIGR00585	mutl	DNA mismatch repair protein MutL. All proteins in this family for which the functions are known are involved in the process of generalized mismatch repair. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	312
200031	TIGR00586	mutt	mutator mutT protein. All proteins in this family for which functions are known are involved in repairing oxidative damage to dGTP (they are 8-oxo-dGTPases). This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). Lowering the threshold picks up members of MutT superfamily well. [DNA metabolism, DNA replication, recombination, and repair]	128
273156	TIGR00587	nfo	apurinic endonuclease (APN1). All proteins in this family for which functions are known are 5' AP endonucleases that are used in base excision repair and the repair of abasic sites in DNA. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	274
211589	TIGR00588	ogg	8-oxoguanine DNA-glycosylase (ogg). All proteins in this family for which functions are known are 8-oxo-guanine DNA glycosylases that function in base excision repair. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). This family is distantly related to the Nth-MutY superfamily. [DNA metabolism, DNA replication, recombination, and repair]	310
273157	TIGR00589	ogt	O-6-methylguanine DNA methyltransferase. All proteins in this family for which functions are known are alkyl-DNA transferases which remove alkyl groups from DNA as part of alkylation DNA repair. Some of the proteins in this family are also transcription regulators and have a distinct transcription regulatory domain. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	80
273158	TIGR00590	pcna	proliferating cell nuclear antigen (pcna). All proteins in this family for which functions are known form sliding DNA clamps that are used in DNA replication processes. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	259
129679	TIGR00591	phr2	photolyase PhrII. All proteins in this family for which functions are known are DNA-photolyases used for the direct repair of UV irradiation induced DNA damage. Some repair 6-4 photoproducts while others repair cyclobutane pyrimidine dimers. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	454
273159	TIGR00592	pol2	DNA polymerase (pol2). All proteins in this superfamily for which functions are known are DNA polymerases. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	1172
273160	TIGR00593	pola	DNA polymerase I. All proteins in this family for which functions are known are DNA polymerases. Many also have an exonuclease motif. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	887
273161	TIGR00594	polc	DNA-directed DNA polymerase III (polc). All proteins in this family for which functions are known are DNA polymerases. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	1022
273162	TIGR00595	priA	primosomal protein N'. All proteins in this family for which functions are known are components of the primosome which is involved in replication, repair, and recombination. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	505
273163	TIGR00596	rad1	DNA repair protein (rad1). All proteins in this family for which functions are known are components in a multiprotein endonuclease complex (usually made up of Rad1 and Rad10 homologs). This complex is used primarily for nucleotide excision repair but also for some aspects of recombinational repair in some species. Most Archaeal species also have homologs of these genes, but the function of these Archaeal genes is not known, so we have set our cutoff to only pick up the eukaryotic genes. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	814
129685	TIGR00597	rad10	DNA repair protein rad10. All proteins in this family for which functions are known are components in a multiprotein endonuclease complex (usually made up of Rad1 and Rad10 homologs). This complex is used primarily for nucleotide excision repair but also for some aspects of recombination repair. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	112
273164	TIGR00598	rad14	DNA repair protein. All proteins in this family for which functions are known are used for the recognition of DNA damage as part of nucleotide excision repair. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	172
273165	TIGR00599	rad18	DNA repair protein rad18. All proteins in this family for which functions are known are involved in nucleotide excision repair. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	397
273166	TIGR00600	rad2	DNA excision repair protein (rad2). All proteins in this family for which functions are known are flap endonucleases that generate the 3' incision next to DNA damage as part of nucleotide excision repair. This family is related to many other flap endonuclease families including the fen1 family. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	1034
273167	TIGR00601	rad23	UV excision repair protein Rad23. All proteins in this family for which functions are known are components of a multiprotein complex used for targeting nucleotide excision repair to specific parts of the genome. In humans, Rad23 complexes with the XPC protein. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	378
129690	TIGR00602	rad24	checkpoint protein rad24. All proteins in this family for which functions are known are involved in DNA damage tolerance (likely cell cycle checkpoints). This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	637
273168	TIGR00603	rad25	DNA repair helicase rad25. All proteins in this family for which functions are known are DNA-DNA helicases used for the initiation of nucleotide excision repair and transcription as part of the TFIIH complex. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	732
273169	TIGR00604	rad3	DNA repair helicase (rad3). All proteins in this family for which functions are known are DNA-DNA helicases that function in the initiation of transcription and nucleotide excision repair as part of the TFIIH complex. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	705
273170	TIGR00605	rad4	DNA repair protein rad4. All proteins in this family for which functions are known are involved in targeting nucleotide excision repair to specific regions of the genome. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	713
129694	TIGR00606	rad50	rad50. All proteins in this family for which functions are known are involved in recombination, recombinational repair, and/or non-homologous end joining. They are components of an exonuclease complex with MRE11 homologs. This family is distantly related to the SbcC family of bacterial proteins. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University).	1311
129695	TIGR00607	rad52	recombination protein rad52. All proteins in this family for which functions are known are involved in recombination and recombination repair. Their exact biochemical activity is not yet known. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	161
273171	TIGR00608	radc	DNA repair protein radc. The genes in this family for which the functions are known have an as yet poorly defined role in determining sensitivity to DNA damaging agents such as UV irradiation. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	218
273172	TIGR00609	recB	exodeoxyribonuclease V, beta subunit. The RecBCD holoenzyme is a multifunctional nuclease with potent ATP-dependent exodeoxyribonuclease activity. Ejection of RecD, as occurs at chi recombinational hotspots, cripples exonuclease activity in favor of recombinagenic helicase activity. All proteins in this family for which functions are known are DNA-DNA helicases that are used as part of an exonuclease-helicase complex (made up of RecBCD homologs) that function to generate substrates for the initiation of recombination and recombinational repair. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	1087
273173	TIGR00611	recf	recF protein. All proteins in this family for which functions are known are DNA binding proteins that assist the filamentation of RecA onto DNA for the initiation of recombination or recombinational repair. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	365
273174	TIGR00612	ispG_gcpE	1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate synthase. This protein of previously unknown biochemical function has now been identified as an enzyme of the non-mevalonate pathway of IPP biosynthesis. Chlamydial members of the family have a long insert. The family is largely restricted to Bacteria, where it is widely but not universally distributed. No homology can be detected between the GcpE family and other proteins. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	346
273175	TIGR00613	reco	DNA repair protein RecO. All proteins in this family for which functions are known are DNA binding proteins that are involved in the initiation of recombination or recombinational repair. [DNA metabolism, DNA replication, recombination, and repair]	241
129701	TIGR00614	recQ_fam	ATP-dependent DNA helicase, RecQ family. All proteins in this family for which functions are known are 3'-5' DNA-DNA helicases. These proteins are used for recombination, recombinational repair, and possibly maintenance of chromosome stability. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	470
273176	TIGR00615	recR	recombination protein RecR. All proteins in this family for which functions are known are involved in the initiation of recombination and recombinational repair. RecF is also required. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	195
129703	TIGR00616	rect	recombinase, phage RecT family. All proteins in this family for which functions are known bind ssDNA and are involved in the pairing of homologous DNA. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). RecT and homologs are found in prophage regions of bacterial genomes. RecT works with a partner protein, RecE. [DNA metabolism, DNA replication, recombination, and repair, Mobile and extrachromosomal element functions, Prophage functions]	241
273177	TIGR00617	rpa1	replication factor-a protein 1 (rpa1). All proteins in this family for which functions are known are part of a multiprotein complex made up of homologs of RPA1, RPA2 and RPA3 that bind ssDNA and function in the recognition of DNA damage for nucleotide excision repair. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	608
129705	TIGR00618	sbcc	exonuclease SbcC. All proteins in this family for which functions are known are part of an exonuclease complex with sbcD homologs. This complex is involved in the initiation of recombination to regulate the levels of palindromic sequences in DNA. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	1042
273178	TIGR00619	sbcd	exonuclease SbcD. All proteins in this family for which functions are known are double-stranded DNA exonuclease (as part of a complex with SbcC homologs). This complex functions in the initiation of recombination and recombinational repair and is particularly important in regulating the stability of DNA sections that can form secondary structures. This family is likely homologous to the MRE11 family. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	253
273179	TIGR00621	ssb	single stranded DNA-binding protein (ssb). All proteins in this family for which functions are known are single-stranded DNA binding proteins that function in many processes including transcription, repair, replication and recombination. Members encoded between genes for ribosomal proteins S6 and S18 should be annotated as primosomal protein N (PriB). Forms in gamma-proteobacteria are much shorter and poorly recognized by this model. Additional members of this family include phage proteins. Eukaryotic members are organellar proteins. [DNA metabolism, DNA replication, recombination, and repair]	164
129709	TIGR00622	ssl1	transcription factor ssl1. All proteins in this family for which functions are known are components of the TFIIH complex which is involved in the initiation of transcription and nucleotide excision repair. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	112
129710	TIGR00623	sula	cell division inhibitor SulA. All proteins in this family for which the functions are known are cell division inhibitors. In E. coli, SulA is one of the SOS regulated genes. [DNA metabolism, DNA replication, recombination, and repair]	168
129711	TIGR00624	tag	DNA-3-methyladenine glycosylase I. All proteins in this family are alkylation DNA glycosylases that function in base excision repair. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	179
273180	TIGR00625	tfb2	Transcription factor tfb2. All proteins in this family are part of the TFIIH complex which is involved in the initiation of transcription and nucleotide excision repair. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	448
273181	TIGR00627	tfb4	transcription factor tfb4. All proteins in this family are part of the TFIIH complex which is involved in the initiation of transcription and nucleotide excision repair. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	279
273182	TIGR00628	ung	uracil-DNA glycosylase. All proteins in this family for which functions are known are uracil-DNA glycosylases that function in base excision repair. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	211
273183	TIGR00629	uvde	UV damage endonuclease UvdE. All proteins in this family for which functions are known are UV dimer endonucleases that function in an alternative nucleotide excision repair process. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	312
273184	TIGR00630	uvra	excinuclease ABC, A subunit. This family is a member of the ABC transporter superfamily of proteins of which all members for which functions are known except the UvrA proteins are involved in the transport of material through membranes. UvrA orthologs are involved in the recognition of DNA damage as a step in nucleotide excision repair. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	925
273185	TIGR00631	uvrb	excinuclease ABC, B subunit. All proteins in this family for which functions are known are DNA helicases that function in the nucleotide excision repair and are endonucleases that make the 3' incision next to DNA damage. They are part of a pathway requiring UvrA, UvrB, UvrC, and UvrD homologs. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	655
211591	TIGR00632	vsr	DNA mismatch endonuclease Vsr. All proteins in this family for which functions are known are G:T mismatch endonucleases that function in a specialized mismatch repair process used usually to repair G:T mismatches in specific sections of the genome. This family was based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). Members of this family typically are found near to a DNA cytosine methyltransferase. [DNA metabolism, DNA replication, recombination, and repair]	117
273186	TIGR00633	xth	exodeoxyribonuclease III (xth). All proteins in this family for which functions are known are 5' AP endonucleases that function in base excision repair and the repair of abasic sites in DNA. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	254
273187	TIGR00634	recN	DNA repair protein RecN. All proteins in this family for which functions are known are ATP binding proteins involved in the initiation of recombination and recombinational repair. [DNA metabolism, DNA replication, recombination, and repair]	563
129721	TIGR00635	ruvB	Holliday junction DNA helicase, RuvB subunit. All proteins in this family for which functions are known are 5'-3' DNA helicases that, as part of a complex with RuvA homologs serve as a 5'-3' Holliday junction helicase. RuvA specifically binds Holliday junctions as a sandwich of two tetramers and maintains the configuration of the junction. It forms a complex with two hexameric rings of RuvB, the subunit that contains helicase activity. The complex drives ATP-dependent branch migration of the Holliday junction recombination intermediate. The endonuclease RuvC resolves junctions. [DNA metabolism, DNA replication, recombination, and repair]	305
213544	TIGR00636	PduO_Nterm	ATP:cob(I)alamin adenosyltransferase. This model represents an ATP:cob(I)alamin adenosyltransferase family corresponding to the N-terminal half of Salmonella PduO, a 1,2-propanediol utilization protein that probably is bifunctional. PduO represents one of at least three families of ATP:corrinoid adenosyltransferase: others are CobA (which partially complements PduO) and EutT. It was not clear originally whether ATP:cob(I)alamin adenosyltransferase activity resides in the N-terminal region of PduO, modeled here, but this has now become clear from the characterization of MeaD from Methylobacterium extorquens. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	171
273188	TIGR00637	ModE_repress	ModE molybdate transport repressor domain. ModE is a molybdate-activated repressor of the molybdate transport operon in E. coli. It consists of the domain represented by this model and two tandem copies of mop-like domain, where Mop proteins are a family of 68-residue molybdenum-pterin binding proteins of Clostridium pasteurianum. This model also represents the full length of a pair of archaeal proteins that lack Mop-like domains. PSI-BLAST analysis shows similarity to helix-turn-helix regulatory proteins. [Regulatory functions, Other]	99
273189	TIGR00638	Mop	molybdenum-pterin binding domain. This model describes a multigene family of molybdenum-pterin binding proteins of about 70 amino acids in Clostridium pasteurianum, as a tandemly-repeated domain C-terminal to an unrelated domain in ModE, a molybdate transport gene repressor of E. coli, and in single or tandemly paired domains in several related proteins. [Transport and binding proteins, Anions]	69
161973	TIGR00639	PurN	phosphoribosylglycinamide formyltransferase, formyltetrahydrofolate-dependent. This model describes phosphoribosylglycinamide formyltransferase (GAR transformylase), one of several proteins in formyl_transf (pfam00551). This enzyme uses formyl tetrahydrofolate as a formyl group donor to produce 5'-phosphoribosyl-N-formylglycinamide. PurT, a different GAR transformylase, uses ATP and formate rather than formyl tetrahydrofolate. Experimental proof includes complementation of E. coli purN mutants by orthologs from vertebrates (where it is a domain of a multifunctional protein), Bacillus subtilis, and Arabidopsis. No archaeal example was detected. In phylogenetic analyses, the member from Saccharomyces cerevisiae shows a long branch length but membership in the family, while the formyltetrahydrofolate deformylases form a closely related outgroup. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis]	190
129726	TIGR00640	acid_CoA_mut_C	methylmalonyl-CoA mutase C-terminal domain. Methylmalonyl-CoA mutase (EC 5.4.99.2) catalyzes a reversible isomerization between L-methylmalonyl-CoA and succinyl-CoA. The enzyme uses an adenosylcobalamin cofactor. It may be a homodimer, as in mitochondrion, or a heterodimer with partially homologous beta chain that does not bind the adenosylcobalamin cofactor, as in Propionibacterium freudenreichii. The most similar archaeal sequences are separate chains, such as AF2215 and AF2219 of Archaeoglobus fulgidus, that correspond roughly to the first 500 and last 130 residues, respectively of known methylmalonyl-CoA mutases. This model describes the C-terminal domain subfamily. In a neighbor-joining tree (methylaspartate mutase S chain as the outgroup), AF2219 branches with a coenzyme B12-dependent enzyme known not to be 5.4.99.2.	132
273190	TIGR00641	acid_CoA_mut_N	methylmalonyl-CoA mutase N-terminal domain. Methylmalonyl-CoA mutase (EC 5.4.99.2) catalyzes a reversible isomerization between L-methylmalonyl-CoA and succinyl-CoA. The enzyme uses an adenosylcobalamin cofactor. It may be a homodimer, as in mitochondrion, or a heterodimer with partially homologous beta chain that does not bind the adenosylcobalamin cofactor, as in Propionibacterium freudenreichii. The most similar archaeal sequences are separate chains, such as AF2215 and AF2219 of Archaeoglobus fulgidus, that correspond roughly to the first 500 and last 130 residues, respectively of known methylmalonyl-CoA mutases. This model describes the N-terminal domain subfamily. In a neighbor-joining tree, AF2215 branches with a bacterial isobutyryl-CoA mutase, which is also the same length. Scoring between the noise and trusted cutoffs are the non-catalytic, partially homologous beta chains from certain heterodimeric examples of 5.4.99.2.	526
273191	TIGR00642	mmCoA_mut_beta	methylmalonyl-CoA mutase, heterodimeric type, beta chain. The adenosylcobalamin-binding, catalytic chain of methylmalonyl-CoA mutase may form homodimers, as in mitochondrion and E. coli, or heterodimers with a shorter, homologous chain that does not bind adenosylcobalamin. This model describes this non-catalytic beta chain, as found in the enzyme from Propionibacterium freudenreichii, for which the 3-dimensional structure has been solved. [Central intermediary metabolism, Other]	619
273192	TIGR00643	recG	ATP-dependent DNA helicase RecG. [DNA metabolism, DNA replication, recombination, and repair]	630
273193	TIGR00644	recJ	single-stranded-DNA-specific exonuclease RecJ. All proteins in this family are 5'-3' single-strand DNA exonucleases. These proteins are used in some aspects of mismatch repair, recombination, and recombinational repair. [DNA metabolism, DNA replication, recombination, and repair]	539
129731	TIGR00645	HI0507	TIGR00645 family protein. This conserved hypothetical protein with four predicted transmembrane regions is found in E. coli, Haemophilus influenzae, and Helicobacter pylori, among completed genomes. A similar protein from Aquifex aeolicus appears to share a central region of homology and a similar overall arrangement of hydrophobic stretches, and forms a bidirectional best hit with several members of the seed alignment. However, it is uncertain whether the observed similarity represents full-length homology and/or equivalent function, and so it is excluded from the seed and scores below the trusted cutoff. [Hypothetical proteins, Conserved]	167
129732	TIGR00646	MG010	DNA primase-related protein. The DNA primase DnaG of E. coli and its apparent orthologs in other eubacterial species are approximately 600 residues in length. Within this set, a conspicuous outlier in percent identity, as seen in a UPGMA difference tree, is the branch containing the Mycoplasmas. This lineage is also unique in containing the small, DNA primase-related protein modelled here, which is homologous to the central third of DNA primase. Several small regions of sequence similarity specifically to Mycoplasma sequences rather than to all DnaG homologs suggests that the divergence of this protein from DnaG post-dated the separation of bacterial lineages. The function of this DNA primase-related protein is unknown. [Unknown function, General]	218
273194	TIGR00647	DNA_bind_WhiA	DNA-binding protein WhiA. This family describes a DNA-binding protein widely conserved in Gram-positive bacteria, and occasionally occurring elsewhere, such as in Thermotoga. It is associated with cell division, and in sporulating organisms with sporulation. [Cellular processes, Cell division]	304
129734	TIGR00648	recU	recombination protein U. The Bacillus protein has been shown to be required for DNA recombination and repair. RJD 11/20/00 [DNA metabolism, DNA replication, recombination, and repair]	169
273195	TIGR00649	MG423	beta-CASP ribonuclease, RNase J family. This family of metalloenzymes includes RNase J1 and RNase J2, involved in mRNA degradation in a wide range of organisms. [Transcription, Degradation of RNA]	422
273196	TIGR00651	pta	phosphate acetyltransferase. Alternate name: phosphotransacetylase. Model contains a gene from E. coli coding for ethanolamine utilization protein (euti) and also contains similarity to malate oxidoreductases. [Central intermediary metabolism, Other, Energy metabolism, Fermentation]	303
273197	TIGR00652	DapF	diaminopimelate epimerase. [Amino acid biosynthesis, Aspartate family]	267
273198	TIGR00653	GlnA	glutamine synthetase, type I. Alternate name: glutamate--ammonia ligase. This model represents the dodecameric form, which can be subdivided into 1-alpha and 1-beta forms. The phylogeny of the 1-alpha and 1-beta forms appears polyphyletic. E. coli, Synechocystis PCC6803, Aquifex aeolicus, and the crenarchaeon Sulfolobus acidocaldarius have form 1-beta, while Bacillus subtilis, Thermotoga maritima, and various euryarchaea have form 1-alpha. The 1-beta dodecamer from the crenarchaeon Sulfolobus acidocaldarius differs from that in E. coli in that it is not regulated by adenylylation. [Amino acid biosynthesis, Glutamate family]	459
129739	TIGR00654	PhzF_family	phenazine biosynthesis protein PhzF family. Members of this family show a distant global similarity to diaminopimelate epimerases, which can be taken as the outgroup. One member of this family has been shown to act as an enzyme in the biosynthesis of the antibiotic phenazine in Pseudomonas aureofaciens. The function in other species is unclear. [Cellular processes, Toxin production and resistance]	297
273199	TIGR00655	PurU	formyltetrahydrofolate deformylase. This model describes formyltetrahydrofolate deformylases. The enzyme is a homohexamer. Sequences from a related formyl tetrahydrofolate-specific enzyme, phosphoribosylglycinamide formyltransferase, serve as an outgroup for phylogenetic analysis. Putative members of this family, scoring below the trusted cutoff, include a sequence from Rhodobacter capsulatus that lacks an otherwise conserved C-terminal region. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis]	280
273200	TIGR00656	asp_kin_monofn	aspartate kinase, monofunctional class. This model describes a subclass of aspartate kinases. These are mostly Lys-sensitive and not fused to homoserine dehydrogenase, unlike some Thr-sensitive and Met-sensitive forms. Homoserine dehydrogenase is part of Thr and Met but not Lys biosynthetic pathways. Aspartate kinase catalyzes a first step in the biosynthesis from Asp of Lys (and its precursor diaminopimelate), Met, and Thr. In E. coli, a distinct isozyme is inhibited by each of the three amino acid products. The Met-sensitive (I) and Thr-sensitive (II) forms are bifunctional enzymes fused to homoserine dehydrogenases and form homotetramers, while the Lys-sensitive form (III) is a monofunctional homodimer. The Lys-sensitive enzyme of Bacillus subtilis resembles the E. coli form but is an alpha 2/beta 2 heterotetramer, where the beta subunit is translated from an in-phase alternative initiator at Met-246. The protein slr0657 from Synechocystis PCC6803 is extended by a duplication of the C-terminal region corresponding to the beta chain. Incorporation of a second copy of the C-terminal domain may be quite common in this subgroup of aspartokinases. [Amino acid biosynthesis, Aspartate family]	400
273201	TIGR00657	asp_kinases	aspartate kinase. Aspartate kinase catalyzes a first step in the biosynthesis from Asp of Lys (and its precursor diaminopimelate), Met, and Thr. In E. coli, a distinct isozyme is inhibited by each of the three amino acid products. The Met-sensitive (I) and Thr-sensitive (II) forms are bifunctional enzymes fused to homoserine dehydrogenases and form homotetramers, while the Lys-sensitive form (III) is a monofunctional homodimer. The Lys-sensitive enzyme of Bacillus subtilis resembles the E. coli form but is an alpha 2/beta 2 heterotetramer, where the beta subunit is translated from an in-phase alternative initiator at Met-246. This may be a feature of a number of closely related forms, including a paralog from B. subtilis. [Amino acid biosynthesis, Aspartate family]	441
129743	TIGR00658	orni_carb_tr	ornithine carbamoyltransferase. This family of ornithine carbamoyltransferases (OTCase) is in a superfamily with the related enzyme aspartate carbamoyltransferase. Most known examples are anabolic, playing a role in arginine biosynthesis, but some are catabolic. Most OTCases are homotrimers, but the homotrimers are organized into dodecamers built from four trimers in at least two species; the catabolic OTCase of Pseudomonas aeruginosa is allosterically regulated, while OTCase of the extreme thermophile Pyrococcus furiosus shows both allostery and thermophily. [Amino acid biosynthesis, Glutamate family]	304
273202	TIGR00659	TIGR00659	TIGR00659 family protein. Members of this small but broadly distributed (Gram-positive, Gram-negative, and Archaeal) family appear to have multiple transmembrane segments. The function is unknown. A homolog, LrgB of Staphylococcus aureus, in the same small superfamily but in an outgroup to this subfamily, is regulated by LytSR and is suggested to act as a murein hydrolase. Of the three paralogous proteins in B. subtilis, one is a full length member of this family, one lacks the C-terminal 60 residues and has an additional 128 N-terminal residues but branches within the family in a phylogenetic tree, and one is closely related to LrgB and part of the outgroup. [Hypothetical proteins, Conserved]	226
273203	TIGR00661	MJ1255	conserved hypothetical protein. This model represents nearly the full length of MJ1255 from Methanococcus jannaschii and of an unpublished protein from Vibrio cholerae, as well as the C-terminal half of a protein from Methanobacterium thermoautotrophicum. A small region (~50 amino acids) within the domain appears related to a family of sugar transferases. [Hypothetical proteins, Conserved]	321
273204	TIGR00663	dnan	DNA polymerase III, beta subunit. All proteins in this family for which functions are known are components of the DNA polymerase III complex (beta subunit). This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	367
273205	TIGR00664	DNA_III_psi	DNA polymerase III, psi subunit. This small subunit of the DNA polymerase III holoenzyme in E. coli and related species appears to have a narrow taxonomic distribution. It is not found so far outside the gamma subdivision proteobacteria. [DNA metabolism, DNA replication, recombination, and repair]	124
273206	TIGR00665	DnaB	replicative DNA helicase. This model describes the helicase DnaB, a homohexameric protein required for DNA replication. The homohexamer can form a ring around a single strand of DNA near a replication fork. An intein of > 400 residues is found at a conserved location in DnaB of Synechocystis PCC6803, Rhodothermus marinus (both experimentally confirmed), and Mycobacterium tuberculosis. The intein removes itself by a self-splicing reaction. The seed alignment contains inteins so that the model built from the seed alignment will model a low cost at common intein insertion sites. [DNA metabolism, DNA replication, recombination, and repair]	432
200042	TIGR00666	PBP4	D-alanyl-D-alanine carboxypeptidase, serine-type, PBP4 family. In E. coli, this protein is known as penicillin binding protein 4 (dacB). A signal sequence is cleaved from a precursor form. The protein is described as periplasmic in E. coli (Gram-negative) and extracellular in Actinomadura R39 (Gram-positive). Unlike some other proteins with similar activity, it does not perform transpeptidation. It is not essential for viability. This family is related to class A beta-lactamases. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan]	333
273207	TIGR00667	aat	leucyl/phenylalanyl-tRNA--protein transferase. The N-terminal residue controls the biological half-life of many proteins via the N-end rule pathway. This enzyme transfers a Leu or Phe to the amino end of certain proteins to enable degradation. [Protein fate, Degradation of proteins, peptides, and glycopeptides]	185
273208	TIGR00668	apaH	bis(5'-nucleosyl)-tetraphosphatase (symmetrical). Diadenosine 5',5"'-P1,P4-tetraphosphate (Ap4A) is a regulatory metabolite of stress conditions. It is hydrolyzed to two ADP by this enzyme. Alternate names include diadenosine-tetraphosphatase and Ap4A hydrolase. [Cellular processes, Adaptations to atypical conditions]	279
129752	TIGR00669	asnA	aspartate--ammonia ligase, AsnA-type. This model represents one of two non-homologous forms of aspartate--ammonia ligase (asparagine synthetase) found in E. coli. This type is also found in Haemophilus influenzae, Treponema pallidum and Lactobacillus delbrueckii, but appears to have a very limited distribution. The fact that the protein from the H. influenzae is more than 70 % identical to that from the spirochete Treponema pallidum, but less than 65 % identical to that from the closely related E. coli, strongly suggests lateral transfer. [Amino acid biosynthesis, Aspartate family]	330
273209	TIGR00670	asp_carb_tr	aspartate carbamoyltransferase. Aspartate transcarbamylase (ATCase) is an alternate name. PyrB encodes the catalytic chain of aspartate carbamoyltransferase, an enzyme of pyrimidine biosynthesis, which organizes into trimers. In some species, including E. coli and the Archaea but excluding Bacillus subtilis, a regulatory subunit PyrI is also present in an allosterically regulated hexameric holoenzyme. Several molecular weight classes of ATCase are described in MEDLINE:96303527 and often vary within taxa. PyrB and PyrI are fused in Thermotoga maritima. Ornithine carbamoyltransferases are in the same superfamily and form an outgroup. [Purines, pyrimidines, nucleosides, and nucleotides, Pyrimidine ribonucleotide biosynthesis]	301
273210	TIGR00671	baf	pantothenate kinase, type III. This model describes a family of proteins found in a single copy in at least ten different early completed bacterial genomes. The only characterized member of the family is Bvg accessory factor (Baf), a protein required, in addition to the regulatory operon bvgAS, for heterologous transcription of the Bordetella pertussis toxin operon (ptx) in E. coli. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pantothenate and coenzyme A]	243
129755	TIGR00672	cdh	CDP-diacylglycerol diphosphatase, bacterial type. This model finds only bacterial examples of CDP-diacylglycerol pyrophosphatase. The member from Mycobacterium tuberculosis, the only non-proteobacterial example, is only tentatively identified and scores below the trusted cutoff. No homology is detected to functionally similar mammalian enzymes. Alternate names for this enzyme include CDP-diglyceride hydrolase and CDP-diacylglycerol hydrolase. [Fatty acid and phospholipid metabolism, Biosynthesis]	250
213549	TIGR00673	cynS	cyanase. Alternate names include cyanate C-N-lyase, cyanate hydratase, and cyanate hydrolase. [Cellular processes, Detoxification]	150
129757	TIGR00674	dapA	4-hydroxy-tetrahydrodipicolinate synthase. Members of this family are 4-hydroxy-tetrahydrodipicolinate synthase, previously (incorrectly) called dihydrodipicolinate synthase. It is a homotetrameric enzyme of lysine biosynthesis. E. coli has several paralogs closely related to dihydrodipicolinate synthase (DapA), as well as the more distant N-acetylneuraminate lyase. In Pyrococcus horikoshii, the bidirectional best hit with E. coli is to an uncharacterized paralog of DapA, not DapA itself, and it is omitted from the seed. The putative members from the Chlamydias (pathogens with a parasitic metabolism) are easily the most divergent members of the multiple alignment. [Amino acid biosynthesis, Aspartate family]	285
273211	TIGR00675	dcm	DNA-methyltransferase (dcm). All proteins in this family for which functions are known are DNA-cytosine methyltransferases. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	315
273212	TIGR00676	fadh2	5,10-methylenetetrahydrofolate reductase, prokaryotic form. The enzyme activities methylenetetrahydrofolate reductase (EC 1.5.1.20) and 5,10-methylenetetrahydrofolate reductase (FADH) (EC 1.7.99.5) differ in that 1.5.1.20 (assigned in many eukaryotes) is defined to use NADP+ as an acceptor, while 1.7.99.5 (assigned in many bacteria) is flexible with respect to the acceptor; both convert 5-methyltetrahydrofolate to 5,10-methylenetetrahydrofolate. From a larger set of proteins assigned as 1.5.1.20 and 1.7.99.5, this model describes the subset of proteins found in bacteria, and currently designated 1.7.99.5. This protein is an FAD-containing flavoprotein. [Amino acid biosynthesis, Aspartate family]	272
129760	TIGR00677	fadh2_euk	methylenetetrahydrofolate reductase, eukaryotic type. The enzyme activities methylenetetrahydrofolate reductase (EC 1.5.1.20) and 5,10-methylenetetrahydrofolate reductase (FADH) (EC 1.7.99.5) differ in that 1.5.1.20 (assigned in many eukaryotes) is defined to use NADP+ as an acceptor, while 1.7.99.5 (assigned in many bacteria) is flexible with respect to the acceptor; both convert 5-methyltetrahydrofolate to 5,10-methylenetetrahydrofolate. From a larger set of proteins assigned as 1.5.1.20 and 1.7.99.5, this model describes the subset of proteins found in eukaryotes and designated 1.5.1.20. This protein is an FAD-containing flavoprotein. [Biosynthesis of cofactors, prosthetic groups, and carriers, Folic acid]	281
273213	TIGR00678	holB	DNA polymerase III, delta' subunit. This model describes the N-terminal half of the delta' subunit of DNA polymerase III. Delta' is homologous to the gamma and tau subunits, which form an outgroup for phylogenetic comparison. The gamma/tau branch of the tree is much more tightly conserved than the delta' branch, and some members of that branch score more highly against this model than some proteins classified as delta'. The noise cutoff is set to detect weakly scoring delta' subunits rather than to exclude gamma/tau subunits. At position 126-127 of the seed alignment, this family lacks the HM motif of gamma/tau; at 132 it has a near-invariant A vs. an invariant F in gamma/tau. [DNA metabolism, DNA replication, recombination, and repair]	188
273214	TIGR00679	hpr-ser	Hpr(Ser) kinase/phosphatase. Members of this family are the bifunctional enzyme, HPr kinase/phosphatase. All members of the seed alignment (n=57) have a gene tightly clustered with a gene for the phosphocarrier protein HPr, its target. [Regulatory functions, Protein interactions, Signal transduction, PTS]	300
273215	TIGR00680	kdpA	K+-transporting ATPase, KdpA. Kdp is a high affinity ATP-driven K+ transport system in Escherichia coli. It is composed of three membrane-bound subunits, KdpA, KdpB and KdpC and one small peptide, KdpF. KdpA is the K+-transporting subunit of this complex. During assembly of the complex, KdpA and KdpC bind to each other. This interaction is thought to stabilize the complex [medline:9858692]. Data indicates that KdpC might connect the KdpA, the K+-transporting subunit, to KdpB, the ATP-hydrolyzing (energy providing) subunit [medline:9858692]. [Transport and binding proteins, Cations and iron carrying compounds]	563
129764	TIGR00681	kdpC	K+-transporting ATPase, C subunit. This chain has a single predicted transmembrane region near the amino end. It is part of a K+-transport ATPase that contains two other membrane-bound subunits, KdpA and KdpB, and a small subunit KdpF. KdpA is the K+-translocating subunit, KdpB the ATP-hydrolyzing subunit. During assembly of the complex, KdpA and KdpC bind to each other. This interaction is thought to stabilize the complex [MEDLINE:9858692]. Data indicates that KdpC might connect the KdpA, the K+-transporting subunit, to KdpB, the ATP-hydrolyzing (energy providing) subunit [MEDLINE:9858692]. [Transport and binding proteins, Cations and iron carrying compounds]	187
273216	TIGR00682	lpxK	tetraacyldisaccharide 4'-kinase. Also called lipid-A 4'-kinase. This essential gene encodes an enzyme in the pathway of lipid A biosynthesis in Gram-negative organisms. A single copy of this protein is found in Gram-negative bacteria. PSI-BLAST converges on this set of apparent orthologs without identifying any other homologs. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	311
273217	TIGR00683	nanA	N-acetylneuraminate lyase. N-acetylneuraminate lyase is also known as N-acetylneuraminic acid aldolase, sialic acid aldolase, or sialate lyase. It is an intracellular enzyme. The structure of this homotetrameric enzyme related to dihydrodipicolinate synthase is known. In Clostridium tertium, the enzyme appears to be in an operon with a secreted sialidase that releases sialic acid from host sialoglycoconjugates. In several E. coli strains, however, this enzyme is responsible for N-acetyl-D-neuraminic acid synthesis for capsule production by condensing N-acetyl-D-mannosamine and pyruvate. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides, Central intermediary metabolism, Amino sugars]	293
273218	TIGR00684	narJ	nitrate reductase molybdenum cofactor assembly chaperone. This protein is termed NarJ in most species that have a single copy, and has been called the delta subunit of nitrate reductase. However, although it is required for correct assembly of active enzyme, it dissociates and is not part of the enzyme. Two hits to this model are found each in E. coli and in Mycobacterium tuberculosis, but in each case duplication to create paralogs appears to be recent. The NarX protein of Mycobacterium tuberculosis includes one of these paralogs as a domain, fused to structural domains of nitrate reductases before and after the NarJ-homologous region. [Protein fate, Protein folding and stabilization]	152
273219	TIGR00685	T6PP	trehalose-phosphatase. Trehalose, a neutral disaccharide of two glucose residues, is an important osmolyte for desiccation and/or salt tolerance in a number of prokaryotic and eukaryotic species, including E. coli, Saccharomyces cerevisiae, and Arabidopsis thaliana. Many bacteria also utilize trehalose in the synthesis of trehalolipids, specialized cell wall constituents believed to be involved in the uptake of hydrophobic substances. Trehalose dimycolate (TDM, cord factor) and related substances are important constituents of the mycobacterial waxy coat and responsible for various clinically important immunological interactions with the host organism. This enzyme, trehalose-phosphatase, removes a phosphate group in the final step of trehalose biosynthesis. The trehalose-phosphatase from Saccharomyces cerevisiae is fused to the synthase. At least 18 distinct sequences from Arabidopsis have been identified; roughly half of these are of the fungal type, with a fused synthase, and half are like the bacterial members, having only the phosphatase domain. It has been suggested that trehalose is being used in Arabidopsis as a regulatory molecule in development and possibly other processes. [Cellular processes, Adaptations to atypical conditions]	244
273220	TIGR00686	phnA	alkylphosphonate utilization operon protein PhnA. The protein family includes an uncharacterized member designated phnA in Escherichia coli, part of a large operon associated with alkylphosphonate uptake and carbon-phosphorus bond cleavage. This protein is not related to the characterized phosphonoacetate hydrolase designated PhnA by Kulakova, et al. (2001, 1997). [Unknown function, General]	109
273221	TIGR00687	pyridox_kin	pyridoxal kinase. E. coli has an enzyme PdxK that acts in vitro as a pyridoxine/pyridoxal/pyridoxamine kinase, but mutants lacking PdxK activity retain a specific pyridoxal kinase, PdxY. PdxY acts in the salvage pathway of pyridoxal 5'-phosphate biosynthesis. Mammalian forms of pyridoxal kinase are more similar to PdxY than to PdxK. The PdxK isozyme is omitted from the seed alignment but scores above the trusted cutoff. ThiD and related proteins form an outgroup. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridoxine]	287
129771	TIGR00688	rarD	rarD protein. This uncharacterized protein is predicted to have many membrane-spanning domains. [Transport and binding proteins, Unknown substrate]	256
129772	TIGR00689	rpiB_lacA_lacB	sugar-phosphate isomerase, RpiB/LacA/LacB family. Proteins of known function in this family act as sugar (pentose and/or hexose)-phosphate isomerases, including the LacA and LacB subunits of galactose-6-phosphate isomerases from Gram-positive bacteria and RpiB. RpiB is the second ribose phosphate isomerase of E. coli. It lacks homology to RpiA, its inducer is unknown (but is not ribose), and it can be replaced by the homologous galactose-6-phosphate isomerase of Streptococcus mutans, all of which suggests that the ribose phosphate isomerase activity of RpiB is a secondary function. On the other hand, there appear to be a significant number of species which contain rpiB, lack rpiA and seem to require rpi activity in order to complete the pentose phosphate pathway.	144
188073	TIGR00690	rpoZ	DNA-directed RNA polymerase, omega subunit. This small component of highly purified E. coli RNA polymerase is not required for transcription, but acts in assembly and is present in stoichiometric amounts. The trusted cutoff excludes archaeal homologs but captures some organellar sequences. [Transcription, DNA-dependent RNA polymerase]	60
213552	TIGR00691	spoT_relA	(p)ppGpp synthetase, RelA/SpoT family. The functions of E. coli RelA and SpoT differ somewhat. RelA (EC 2.7.6.5) produces pppGpp (or ppGpp) from ATP and GTP (or GDP). SpoT (EC 3.1.7.2) degrades ppGpp, but may also act as a secondary ppGpp synthetase. The two proteins are strongly similar. In many species, a single homolog to SpoT and RelA appears responsible for both ppGpp synthesis and ppGpp degradation. (p)ppGpp is a regulatory metabolite of the stringent response, but appears also to be involved in antibiotic biosynthesis in some species. [Cellular processes, Adaptations to atypical conditions]	683
129775	TIGR00692	tdh	L-threonine 3-dehydrogenase. This protein is a tetrameric, zinc-binding, NAD-dependent enzyme of threonine catabolism. Closely related proteins include sorbitol dehydrogenase, xylitol dehydrogenase, and benzyl alcohol dehydrogenase. Eukaryotic examples of this enzyme have been demonstrated experimentally but do not appear in database search results. E. coli His-90 modulates substrate specificity and is believed part of the active site. [Energy metabolism, Amino acids and amines]	340
273222	TIGR00693	thiE	thiamine-phosphate diphosphorylase. This model represents the thiamine-phosphate pyrophosphorylase, ThiE, of a number of bacteria, and N-terminal domains of bifunctional thiamine proteins of Saccharomyces cerevisiae and Schizosaccharomyces pombe, in which the C-terminal domain corresponds to the bacterial hydroxyethylthiazole kinase (EC 2.7.1.50), ThiM. This model includes ThiE from Bacillus subtilis but excludes its paralog, the regulatory protein TenI (SP:P25053), and neighbors of TenI. [Biosynthesis of cofactors, prosthetic groups, and carriers, Thiamine]	196
188074	TIGR00694	thiM	hydroxyethylthiazole kinase. This model represents the hydroxyethylthiazole kinase, ThiM, of a number of bacteria, and C-terminal domains of bifunctional thiamine biosynthesis proteins of Saccharomyces cerevisiae and Schizosaccharomyces pombe, in which the N-terminal domain corresponds to the bacterial thiamine-phosphate pyrophosphorylase (EC 2.5.1.3), ThiE. [Biosynthesis of cofactors, prosthetic groups, and carriers, Thiamine]	249
129778	TIGR00695	uxuA	mannonate dehydratase. This Fe2+-requiring enzyme plays a role in D-glucuronate catabolism in Escherichia coli. Mannonate dehydratase converts D-mannonate to 2-dehydro-3-deoxy-D-gluconate. An apparent equivalog is found in a glucuronate utilization operon in Bacillus stearothermophilus T-6. [Energy metabolism, Sugars]	394
129779	TIGR00696	wecG_tagA_cpsF	bacterial polymer biosynthesis proteins, WecB/TagA/CpsF family. The WecG member of this superfamily, believed to be UDP-N-acetyl-D-mannosaminuronic acid transferase, plays a role in enterobacterial common antigen (eca) synthesis in Escherichia coli. Another family member, the Bacillus subtilis TagA protein, is involved in the biosynthesis of the cell wall polymer poly(glycerol phosphate). The third family member, CpsF, CMP-N-acetylneuraminic acid synthetase has a role in the capsular polysaccharide biosynthesis pathway. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	177
129780	TIGR00697	TIGR00697	conserved hypothetical integral membrane protein. All known members of this family are proteins of 210-250 amino acids in length. Conserved regions of hydrophobicity suggest that all members of the family are integral membrane proteins. [Hypothetical proteins, Conserved]	202
129781	TIGR00698	TIGR00698	conserved hypothetical integral membrane protein. Members of this family are found so far only in one archaeal species, Archaeoglobus fulgidus, and in two related bacterial species, Haemophilus influenzae and Escherichia coli. It has 9 GES predicted transmembrane regions at conserved locations in all members. These proteins have a molecular weight of approximately 35 to 38 kDa. [Hypothetical proteins, Conserved]	335
129782	TIGR00699	GABAtrns_euk	4-aminobutyrate aminotransferase, eukaryotic type. This enzyme is a class III pyridoxal-phosphate-dependent aminotransferase. This model describes known eukaryotic examples of the enzyme. The degree of sequence difference between this set and known bacterial examples is greater than the distance between either set and the most similar enzyme with distinct function, and so separate models are built for prokaryotic and eukaryotic sets. Alternate names include GABA transaminase, gamma-amino-N-butyrate transaminase, and beta-alanine--oxoglutarate aminotransferase. [Central intermediary metabolism, Other]	464
129783	TIGR00700	GABAtrnsam	4-aminobutyrate aminotransferase, prokaryotic type. This enzyme is a class III pyridoxal-phosphate-dependent aminotransferase. This model describes known bacterial examples of the enzyme. The best archaeal matches are presumed but not trusted to have the equivalent function. The degree of sequence difference between this set and known eukaryotic (mitochondrial) examples is greater than the distance to some proteins known to have different functions, and so separate models are built for prokaryotic and eukaryotic sets. E. coli has two isozymes. Alternate names include GABA transaminase, gamma-amino-N-butyrate transaminase, and beta-alanine--oxoglutarate aminotransferase. [Central intermediary metabolism, Other]	420
273223	TIGR00701	TIGR00701	TIGR00701 family protein. It appears this conserved hypothetical integral membrane protein is found only in gram negative bacteria. Completed genomes that include a member of this family include Rickettsia prowazekii, Synechocystis sp. PCC6803, and Helicobacter pylori. These proteins have 3 (Helicobacter pylori) to 5 (Synechocystis sp. PCC 6803) GES predicted transmembrane regions. Most members have 4 GES predicted transmembrane regions. [Hypothetical proteins, Conserved]	142
273224	TIGR00702	TIGR00702	YcaO-type kinase domain. This protein family includes YcaO and homologs that can phosphorylate a peptide amide backbone (rather than side chains), as during heterocycle-forming modifications during maturation of the TOMM class (Thiazole/Oxazole-Modified Microcins) of bacteriocins. However, YcaO domain proteins also occur in contexts that do not suggest peptide modification. [Hypothetical proteins, Conserved]	377
129786	TIGR00703	TIGR00703	TIGR00703 family protein. The function of this family is unknown. These proteins are from 222 to 233 residues in length, lack hydrophobic stretches, and are found so far only in thermophiles. [Hypothetical proteins, Conserved]	223
273225	TIGR00704	NaPi_cotrn_rel	Na/Pi-cotransporter. This model describes essentially the full length of an uncharacterized protein from Bacillus subtilis and corresponding lengths of longer proteins from E. coli and Treponema pallidum. PSI-BLAST analysis converges to demonstrate homology to one other group of proteins, type II sodium/phosphate (Na/Pi) cotransporters. A well-conserved repeated domain in this family, approximately 60 residues in length, is also repeated in the Na/Pi cotransporters, although with greater spacing between the repeats. The two families share additional homology in the region after the first repeat, share the property of having extensive hydrophobic regions, and may be similar in function. [Transport and binding proteins, Cations and iron carrying compounds]	308
273226	TIGR00705	SppA_67K	signal peptide peptidase SppA, 67K type. This model represents the signal peptide peptidase A (SppA, protease IV) as found in E. coli, Treponema pallidum, Mycobacterium leprae, and several other species, in which it has a molecular mass around 67 kDa and a duplication such that the N-terminal half shares extensive homology with the C-terminal half. This enzyme was shown in E. coli to form homotetramers. E. coli SohB, which is most closely homologous to the C-terminal duplication of SppA, is predicted to perform a similar function of small peptide degradation, but in the periplasm. Many prokaryotes have a single SppA/SohB homolog that may perform the function of either or both. [Protein fate, Degradation of proteins, peptides, and glycopeptides]	584
273227	TIGR00706	SppA_dom	signal peptide peptidase SppA, 36K type. The related but duplicated, double-length protein SppA (protease IV) of E. coli was shown experimentally to degrade signal peptides as are released by protein processing and secretion. This protein shows stronger homology to the C-terminal region of SppA than to the N-terminal domain or to the related putative protease SohB. The member of this family from Bacillus subtilis was shown to have properties consistent with a role in degrading signal peptides after cleavage from precursor proteins, although it was not demonstrated conclusively. [Protein fate, Degradation of proteins, peptides, and glycopeptides]	208
273228	TIGR00707	argD	transaminase, acetylornithine/succinylornithine family. This family of proteins, for which ornithine aminotransferases form an outgroup, consists mostly of proteins designated acetylornithine aminotransferase. However, the two very closely related members from E. coli are assigned different enzymatic activities. One is acetylornithine aminotransferase (EC 2.6.1.11), ArgD, an enzyme of arginine biosynthesis, while another is succinylornithine aminotransferase, an enzyme of the arginine succinyltransferase pathway, an ammonia-generating pathway of arginine catabolism (see MEDLINE:98361920). Members of this family may also act on ornithine, like ornithine aminotransferase (EC 2.6.1.13) (see MEDLINE:90337349), and on succinyldiaminopimelate, like N-succinyldiaminopimelate aminotransferase (EC 2.6.1.17, DapC, an enzyme of lysine biosynthesis) (see MEDLINE:99175097).	379
129791	TIGR00708	cobA	cob(I)alamin adenosyltransferase. Alternate name: corrinoid adenosyltransferase. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	173
129792	TIGR00709	dat	2,4-diaminobutyrate 4-transaminases. This family consists of L-diaminobutyric acid transaminases. This general designation covers both 2.6.1.76 (diaminobutyrate-2-oxoglutarate transaminase, which uses glutamate as the amino donor in DABA biosynthesis), and 2.6.1.46 (diaminobutyrate--pyruvate transaminase, which uses alanine as the amino donor). Most members with known function are 2.6.1.76, and at least some annotations as 2.6.1.46 in current databases at time of model revision are incorrect. A distinct branch of this family contains examples of 2.6.1.76 nearly all of which are involved in ectoine biosynthesis. A related enzyme is 4-aminobutyrate aminotransferase (EC 2.6.1.19), also called GABA transaminase. These enzymes all are pyridoxal phosphate-containing class III aminotransferase. [Central intermediary metabolism, Other]	442
273229	TIGR00710	efflux_Bcr_CflA	drug resistance transporter, Bcr/CflA subfamily. This subfamily of drug efflux proteins, a part of the major facilitator family, is predicted to have 12 membrane-spanning regions. Members with known activity include Bcr (bicyclomycin resistance protein) in E. coli, Flor (chloramphenicol and florfenicol resistance) in Salmonella typhimurium DT104, and CmlA (chloramphenicol resistance) in Pseudomonas sp. plasmid R1033.	385
129794	TIGR00711	efflux_EmrB	drug resistance transporter, EmrB/QacA subfamily. This subfamily of drug efflux proteins, a part of the major facilitator family, is predicted to have 14 potential membrane-spanning regions. Members with known activities include EmrB (multiple drug resistance efflux pump) in E. coli, FarB (antibacterial fatty acid resistance) in Neisseria gonorrhoeae, TcmA (tetracenomycin C resistance) in Streptomyces glaucescens, etc. In most cases, the efflux pump is described as having a second component encoded in the same operon, such as EmrA of E. coli. [Cellular processes, Toxin production and resistance, Transport and binding proteins, Other]	485
129795	TIGR00712	glpT	glycerol-3-phosphate transporter. This model describes a very hydrophobic protein, predicted to span the membrane at least 8 times. The two members confirmed experimentally as glycerol-3-phosphate transporters, from E. coli and B. subtilis, share more than 50% amino acid identity. Proteins of the hexose phosphate and phosphoglycerate transport systems are also quite similar. [Transport and binding proteins, Other]	438
273230	TIGR00713	hemL	glutamate-1-semialdehyde-2,1-aminomutase. This enzyme, glutamate-1-semialdehyde-2,1-aminomutase (glutamate-1-semialdehyde aminotransferase, GSA aminotransferase), contains a pyridoxal phosphate attached at a Lys residue at position 283 of the seed alignment. It is in the family of class III aminotransferases. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	423
211601	TIGR00714	hscB	Fe-S protein assembly co-chaperone HscB. This model describes the small subunit, Hsc20 (20K heat shock cognate protein) of a pair of proteins Hsc66-Hsc20, related to the DnaK-DnaJ heat shock proteins, which also serve as molecular chaperones. Hsc20, unlike DnaJ, appears not to have chaperone activity on its own, but to act solely as a regulatory subunit for Hsc66 (i.e., to be a co-chaperone). The gene for Hsc20 in E. coli, hscB, is not induced by heat shock. [Protein fate, Protein folding and stabilization]	155
273231	TIGR00715	precor6x_red	precorrin-6x reductase. This enzyme catalyzes a step in cobalamin biosynthesis. It has been identified experimentally in Pseudomonas denitrificans and has been shown to be part of cobalamin biosynthetic operons in several other species. This enzyme was found to be a monomer by gel filtration. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	256
129799	TIGR00716	rnhC	ribonuclease HIII. This enzyme cleaves RNA from DNA-RNA hybrids. Two types of ribonuclease H in Bacillus subtilis, RNase HII (rnhB) and RNase HIII (rnhC), are both known experimentally and are quite similar to each other. The only RNase H homolog in the Mycoplasmas resembles rnhC. Archaeal forms resemble HII more closely than HIII. This model describes bacterial RNase HIII. [DNA metabolism, DNA replication, recombination, and repair]	284
273232	TIGR00717	rpsA	ribosomal protein S1. This model describes ribosomal protein S1, RpsA. This protein is found in most bacterial genomes in a single copy, but is not present in the Mycoplasmas. It is heterogeneous with respect to the number of repeats of the S1 RNA binding domain described by pfam00575: six repeats in E. coli and most other bacteria, four in Bacillus subtilis and some other species. rpsA is an essential gene in E. coli but not in B. subtilis. It is associated with the cytidylate kinase gene cmk in many species, and fused to it in Treponema pallidum. RpsA is proposed (Medline:97323001) to assist in mRNA degradation. This model provides trusted hits to most long form (6 repeat) examples of RpsA. Among homologs with only four repeats are some to which other (perhaps secondary) functions have been assigned. [Protein synthesis, Ribosomal proteins: synthesis and modification]	516
129801	TIGR00718	sda_alpha	L-serine dehydratase, iron-sulfur-dependent, alpha subunit. This enzyme is also called serine deaminase. L-serine dehydratase converts serine into pyruvate in the gluconeogenesis pathway from serine. This model describes the alpha chain of an iron-sulfur-dependent L-serine dehydratase, found in Bacillus subtilis. A fairly deep split in a UPGMA tree separates members of this family of alpha chains from the homologous region of single chain forms such as found in Escherichia coli. This family of enzymes is not homologous to the pyridoxal phosphate-dependent threonine deaminases and eukaryotic serine deaminases. [Energy metabolism, Amino acids and amines, Energy metabolism, Glycolysis/gluconeogenesis]	294
129802	TIGR00719	sda_beta	L-serine dehydratase, iron-sulfur-dependent, beta subunit. This enzyme is also called serine deaminase. This model describes the beta chain of an iron-sulfur-dependent L-serine dehydratase, as in Bacillus subtilis. A fairly deep split in a UPGMA tree separates members of this family of beta chains from the homologous region of single chain forms such as found in E. coli. This family of enzymes is not homologous to the pyridoxal phosphate-dependent threonine deaminases and eukaryotic serine deaminases. [Energy metabolism, Amino acids and amines, Energy metabolism, Glycolysis/gluconeogenesis]	208
273233	TIGR00720	sda_mono	L-serine dehydratase, iron-sulfur-dependent, single chain form. This enzyme is also called serine deaminase and L-serine dehydratase 1. L-serine ammonia-lyase converts serine into pyruvate in the gluconeogenesis pathway from serine. This enzyme is comprised of a single chain in Escherichia coli, Mycobacterium tuberculosis, and several other species, but has separate alpha and beta chains in Bacillus subtilis and related species. The beta and alpha chains are homologous to the N-terminal and C-terminal regions, respectively, but are rather deeply branched in a UPGMA tree. This enzyme requires iron and dithiothreitol for activation in vitro, and is a predicted 4Fe-4S protein. Escherichia coli and Pseudomonas aeruginosa have two copies of this protein. [Energy metabolism, Amino acids and amines, Energy metabolism, Glycolysis/gluconeogenesis]	450
129804	TIGR00721	tfx	DNA-binding protein, Tfx family. PSI-BLAST starting with one member of this family converges with significant hits only to other members of the family, which is restricted to the Archaea. Homology is strongest in the helix-turn-helix-containing N-terminal region. Tfx from Methanobacterium thermoautotrophicum is associated with the operon for molybdenum formyl-methanofuran dehydrogenase and binds a DNA sequence near its promoter. [Regulatory functions, DNA interactions]	137
273234	TIGR00722	ttdA_fumA_fumB	hydro-lyases, Fe-S type, tartrate/fumarate subfamily, alpha region. A number of Fe-S cluster-containing hydro-lyases share a conserved motif, including argininosuccinate lyase, adenylosuccinate lyase, aspartase, class I fumarate hydratase (fumarase), and tartrate dehydratase (see PROSITE:PDOC00147). This model represents a subset of closely related proteins or modules, including the E. coli tartrate dehydratase alpha chain and the N-terminal region of the class I fumarase (where the C-terminal region is homologous to the tartrate dehydratase beta chain). The activity of archaeal proteins in this subfamily has not been established.	273
129806	TIGR00723	ttdB_fumA_fumB	hydro-lyases, Fe-S type, tartrate/fumarate subfamily, beta region. A number of Fe-S cluster-containing hydro-lyases share a conserved motif, including argininosuccinate lyase, adenylosuccinate lyase, aspartase, class I fumarate hydratase (fumarase), and tartrate dehydratase (see PROSITE:PDOC00147). This model represents a subset of closely related proteins or modules, including the E. coli tartrate dehydratase beta chain and the C-terminal region of the class I fumarase (where the N-terminal region is homologous to the tartrate dehydratase alpha chain). The activity of archaeal proteins in this subfamily has not been established.	168
129807	TIGR00724	urea_amlyse_rel	biotin-dependent carboxylase uncharacterized domain. Urea amidolyase of Saccharomyces cerevisiae is a 1835 amino acid protein with an amidase domain, a biotin/lipoyl cofactor attachment domain, a carbamoyl-phosphate synthase L chain-like domain, and uncharacterized regions. It has both urea carboxylase and allophanate hydrolase activities. This model describes a domain that represents uncharacterized prokaryotic proteins of about 300 amino acids, regions of prokaryotic urea carboxylase and of the urea carboxylase region of yeast urea amidolyase, and regions of other biotin-containing proteins. [Unknown function, General]	314
129808	TIGR00725	TIGR00725	TIGR00725 family protein. This model represents one branch of a subfamily of uncharacterized proteins. Both PSI-BLAST and weak hits by this model show a low level of similarity and suggest an evolutionary relationship of the subfamily to the DprA/Smf family of DNA-processing proteins involved in chromosomal transformation with foreign DNA. Both Aquifex aeolicus and Mycobacterium leprae have one member in each of two branches of this subfamily, suggesting the branches may have distinct functions. This family is one of several families within the scope of pfam03641, several members of which are annotated as lysine decarboxylases. That larger family, and the branch described by this model, have a well-conserved motif PGGXGTXXE. [Hypothetical proteins, Conserved]	159
273235	TIGR00726	TIGR00726	YfiH family protein. PSI-BLAST converges on members of this family of uncharacterized bacterial proteins and shows no significant similarity to any characterized protein. No completed genome to date has two members. Members of the family have been crystallized but the function is unknown. [Unknown function, General]	221
129810	TIGR00727	ISP4_OPT	small oligopeptide transporter, OPT family. This model represents a family of transporters of small oligopeptides, demonstrated experimentally in three different species of yeast. A set of related proteins from the plant Arabidopsis thaliana forms an outgroup to the yeast set by neighbor joining analysis but is remarkably well conserved and is predicted here to have equivalent function. [Transport and binding proteins, Amino acids, peptides and amines]	681
273236	TIGR00728	OPT_sfam	oligopeptide transporter, OPT superfamily. This superfamily has two main branches. One branch contains a tetrapeptide transporter demonstrated experimentally in three different species of yeast. The other family contains EspB of Myxococcus xanthus, a protein required for normal rather than delayed sporulation after cellular aggregation; its role is unknown but is compatible with transport of a signalling molecule. Homology between the two branches of the superfamily is seen most easily at the ends of the protein. The central regions are poorly conserved within each branch and may not be homologous between branches.	657
129812	TIGR00729	TIGR00729	ribonuclease H, mammalian HI/archaeal HII subfamily. This enzyme cleaves RNA from DNA-RNA hybrids. Archaeal members of this subfamily of RNase H are designated RNase HII and one has been shown to be active as a monomer. A member from Homo sapiens was characterized as RNase HI, large subunit. [DNA metabolism, DNA replication, recombination, and repair]	206
129813	TIGR00730	TIGR00730	TIGR00730 family protein. This model represents one branch of a subfamily of proteins of unknown function. Both PSI-BLAST and weak hits by this model show a low level of similarity to and suggest an evolutionary relationship of the subfamily to the DprA/Smf family of DNA-processing proteins involved in chromosomal transformation with foreign DNA. Both Aquifex aeolicus and Mycobacterium leprae have one member in each of two branches of this subfamily, suggesting that the branches may have distinct functions. [Hypothetical proteins, Conserved]	178
273237	TIGR00731	bL25_bact_ctc	ribosomal protein bL25, Ctc-form. This model describes a family of proteins with full-length homology to the general stress protein Ctc of Bacillus subtilis, a mesophile, and ribosomal protein TL5 of Thermus thermophilus, a thermophile. Ribosomal protein L25 of Escherichia coli and H. influenzae appear to be orthologous but consist only of the N-terminal half of Ctc and TL5. Both short (L25-like) and full-length (CTC-like) members of this family bind the E-loop of bacterial 5S rRNA. This protein appears to be restricted to bacteria and organelles, and consists of at most one copy per prokaryotic genome. Ctc of Bacillus subtilis has now been localized to ribosomes and can be viewed as the long form, or Ctc form, of L25. The C-terminal domain of sll1824, an apparent L25 of Synechocystis PCC6803, matches the N-terminal domain of this family. Examples of L25 and Ctc are not separated by a UPGMA tree built on the region of shared homology. [Protein synthesis, Ribosomal proteins: synthesis and modification]	176
273238	TIGR00732	dprA	DNA protecting protein DprA. Disruption of this gene in both Haemophilus influenzae and Helicobacter pylori drastically reduces the efficiency of transformation with exogenous DNA, but with different levels of effect on chromosomal (linear) and plasmid (circular) DNA. This difference suggests the DprA is not active in recombination, and it has been shown not to affect DNA binding, leaving the intermediate step in natural transformation, DNA processing. In Strep. pneumoniae, inactivation of dprA had no effect on the uptake of DNA. All of these data indicated that DprA is required at a later stage in transformation. Subsequently DprA and RecA were both shown in S. pneumoniae to be required to protect incoming ssDNA from immediate degradation. Role of DprA in non-transformable species is not known. The gene symbol smf was assigned in E. coli, but without assignment of function. [Cellular processes, DNA transformation]	220
273239	TIGR00733	TIGR00733	putative oligopeptide transporter, OPT family. This protein represents a small family of integral membrane proteins from Gram-negative bacteria, a Gram-positive bacterium, and an archaeal species. Members of this family contain 15 to 18 GES predicted transmembrane regions, and this family has extensive homology to a family of yeast tetrapeptide transporters, including isp4 (Schizosaccharomyces pombe) and Opt1 (Candida albicans). EspB, an apparent equivalog from Myxococcus xanthus, shares an operon with a two component system regulatory protein, and is required for the normal timing of sporulation after the aggregation of cells. This is consistent with a role in transporting oligopeptides as signals across the membrane. [Transport and binding proteins, Amino acids, peptides and amines]	591
273240	TIGR00734	hisAF_rel	hisA/hisF family protein. This model describes a family of proteins found so far in three archaeal species: Methanobacterium thermoautotrophicum, Methanococcus jannaschii, and Archaeoglobus fulgidus. This protein is homologous to phosphoribosylformimino-5-aminoimidazole carboxamide ribotide isomerase (HisA) and, with lower similarity, to the cyclase HisF, both of which are enzymes of histidine biosynthesis. Each species with this protein also encodes HisA. The function of this protein is unknown. [Unknown function, General]	221
273241	TIGR00735	hisF	imidazoleglycerol phosphate synthase, cyclase subunit. [Amino acid biosynthesis, Histidine family]	254
129819	TIGR00736	nifR3_rel_arch	TIM-barrel protein, putative. Members of this family show a distant relationship by PSI-BLAST to alpha/beta (TIM) barrel enzymes such as dihydroorotate dehydrogenase and glycolate oxidase. At least two closely related but well-separable families among the bacteria, the nifR3/yhdG family and the yjbN family, share a more distant relationship to this family of shorter, exclusively archaeal proteins. [Unknown function, General]	231
129820	TIGR00737	nifR3_yhdG	putative TIM-barrel protein, nifR3 family. This model represents one branch of COG0042 (Predicted TIM-barrel enzymes, possibly dehydrogenases, nifR3 family). This branch includes NifR3 itself, from Rhodobacter capsulatus. It excludes a broadly distributed but more sparsely populated subfamily that contains sll0926 from Synechocystis PCC6803, HI0634 from Haemophilus influenzae, and BB0225 from Borrelia burgdorferi. It also excludes a shorter and more distant archaeal subfamily. The function of nifR3, a member of this family, is unknown, but it is found in an operon with nitrogen-sensing two component regulators in Rhodobacter capsulatus. Members of this family show a distant relationship to alpha/beta (TIM) barrel enzymes such as dihydroorotate dehydrogenase and glycolate oxidase. [Unknown function, General]	319
273242	TIGR00738	rrf2_super	Rrf2 family protein. This model represents a superfamily of probable transcriptional regulators. One member, RRF2 of Desulfovibrio vulgaris is an apparent regulatory protein experimentally (MEDLINE:97293189). The N-terminal region appears related to the DNA-binding biotin repressor region of the BirA bifunctional according to results after three rounds of PSI-BLAST with a fairly high stringency. [Unknown function, General]	132
273243	TIGR00739	yajC	preprotein translocase, YajC subunit. While this protein is part of the preprotein translocase in Escherichia coli, it is not essential for viability or protein secretion. The N-terminus region contains a predicted membrane-spanning region followed by a region consisting almost entirely of residues with charged (acidic, basic, or zwitterionic) side chains. This small protein is about 100 residues in length, and is restricted to bacteria; however, this protein is absent from some lineages, including spirochetes and Mycoplasmas. [Protein fate, Protein and peptide secretion and trafficking]	84
273244	TIGR00740	TIGR00740	tRNA (cmo5U34)-methyltransferase. This tRNA methyltransferase is involved, together with cmoB, in preparing the uridine-5-oxyacetic acid (cmo5U) at position 34. [Unknown function, Enzymes of unknown specificity]	239
129824	TIGR00741	yfiA	ribosomal subunit interface protein. This model includes a small protein encoded by one of two genes, both downstream of the gene rpoN for sigma 54, whose deletion leads to increased expression from sigma 54-dependent promoters. It also includes the N-terminal half of a light-repressed protein LtrA of Synechococcus PCC 7002 and the N-terminal region (after removal of the transit peptide) of a larger plastid-specific ribosomal protein of spinach. The member of this family from E. coli is now recognized as a protein at the interface between ribosomal large and small subunits, with about 1/3 as many copies per cell as the number of ribosomes. [Protein synthesis, Translation factors]	95
129825	TIGR00742	yjbN	tRNA dihydrouridine synthase A. This model represents one branch of COG0042 (Predicted TIM-barrel enzymes, possibly dehydrogenases, nifR3 family). It represents a distinct subset by a set of shared unique motifs, a conserved pattern of insertions/deletions relative to other nifR3 homologs, and by subclustering based on cross-genome bidirectional best hits. Members are found in species as diverse as the proteobacteria, a spirochete, a cyanobacterium, and Deinococcus radiodurans. NifR3 itself, a protein of unknown function associated with nitrogen regulation in Rhodobacter capsulatus, is not a member of this branch. Members of this family show a distant relationship to alpha/beta (TIM) barrel enzymes such as dihydroorotate dehydrogenase and glycolate oxidase. [Protein synthesis, tRNA and rRNA base modification]	318
273245	TIGR00743	TIGR00743	conserved hypothetical protein. These small proteins are approximately 100 amino acids in length and appear to be found only in gamma proteobacteria. The function of this protein family is unknown. [Hypothetical proteins, Conserved]	95
273246	TIGR00744	ROK_glcA_fam	ROK family protein (putative glucokinase). This model models one branch of the ROK superfamily of proteins. The three members of the seed alignment for this model all have experimental evidence for activity as glucokinase, but the set of related proteins is crowded with paralogs of different or unknown function. Proteins scoring above the trusted_cutoff will show strong similarity to at least one known glucokinase and may be designated as putative glucokinases. However, definitive identification of glucokinases should be done only with extreme caution. [Unknown function, General]	318
273247	TIGR00745	apbA_panE	2-dehydropantoate 2-reductase. This model describes enzymes that perform as 2-dehydropantoate 2-reductase, one of four enzymes required for the de novo biosynthesis of pantothenate (vitamin B5) from Asp and 2-oxoisovalerate. Although few members of the seed alignment are characterized experimentally, nearly all from complete genomes are found in a genome-wide (but not local) context of all three other pantothenate-biosynthetic enzymes (TIGR00222, TIGR00018, TIGR00223). The gene encoding this enzyme is designated apbA in Salmonella typhimurium and panE in Escherichia coli; this protein functions as a monomer and functions in the alternative pyrimidine biosynthetic, or APB, pathway, used to synthesize the pyrimidine moiety of thiamine. Note, synthesis of the pyrimidine moiety of thiamine occurs either via the first five steps in de novo purine biosynthesis, which uses the pur gene products, or through the APB pathway. Note that this family includes both NADH and NADPH-dependent enzymes, and enzymes with broad specificity, such as a D-mandelate dehydrogenase that is also a 2-dehydropantoate 2-reductase. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pantothenate and coenzyme A]	293
273248	TIGR00746	arcC	carbamate kinase. In most species, carbamate kinase works in arginine catabolism and consumes carbamoyl phosphate to convert ADP into ATP. In the pathway in Pyrococcus furiosus, the enzyme acts instead to generate carbamoyl phosphate. The seed alignment for this model includes experimentally confirmed examples from a set of phylogenetically distinct species. In a neighbor-joining tree constructed from an alignment of candidate carbamate kinases and several acetylglutamate kinases, the latter group forms a clear outgroup which roots the tree of carbamate kinase-like proteins. This analysis suggests that in E. coli, the ArcC paralog YqeA may be a second isozyme, while the paralog YahI branches as an outlier and is less likely to be an authentic carbamate kinase. The homolog from Mycoplasma pneumoniae likewise branches outside the set containing known carbamate kinases and also scores below the trusted cutoff. [Energy metabolism, Amino acids and amines]	310
273249	TIGR00747	fabH	3-oxoacyl-(acyl-carrier-protein) synthase III. FabH in general initiates elongation in type II fatty acid synthase systems found in bacteria and plants. The two members of this subfamily from Bacillus subtilis differ from each other, and from FabH from E. coli, in acyl group specificity. Active site residues include Cys112, His244 and Asn274 of E. coli FabH. Cys-112 is the site of acyl group attachment. [Fatty acid and phospholipid metabolism, Biosynthesis]	318
273250	TIGR00748	HMG_CoA_syn_Arc	hydroxymethylglutaryl-CoA synthase, putative. This family of archaeal proteins shows considerable homology and identical active site residues to the bacterial hydroxymethylglutaryl-CoA synthase (HMG-CoA synthase, modeled by TIGR01835) which is the second step in the mevalonate pathway of IPP biosynthesis. An enzyme from Pseudomonas fluorescens involved in the biosynthesis of the polyketide diacetyl-phloroglucinol is more closely related, but lacks the active site residues. In each of the genomes containing a member of this family there is no other recognized HMG-CoA synthase, although other elements of the mevalonate pathway are in evidence. The only archaeon currently sequenced which lacks a homolog in this pathway is Halobacterium, which _does_ contain a separate HMG-CoA synthase. Thus, although there is no experimental evidence supporting this name, the bioinformatics-based conclusion appears to be sound. [Fatty acid and phospholipid metabolism, Biosynthesis]	347
129832	TIGR00749	glk	glucokinase, proteobacterial type. This model represents glucokinase of E. coli and close homologs, mostly from other proteobacteria, presumed to have equivalent function. This glucokinase is more closely related to a number of uncharacterized paralogs than to the glucokinase glcK (formerly yqgR) of Bacillus subtilis and its closest homologs, so the two sets are represented by separate models. [Energy metabolism, Glycolysis/gluconeogenesis]	316
129833	TIGR00750	lao	LAO/AO transport system ATPase. In E. coli, mutation of this kinase blocks phosphorylation of two transporter system periplasmic binding proteins and consequently inhibits those transporters. This kinase is also found in Gram-positive bacteria, archaea, and the roundworm C. elegans. It may have a more general, but still unknown function. Mutations have also been found that do not phosphorylate the periplasmic binding proteins, yet still allow transport. The ATPase activity of this protein seems to be necessary, however. [Transport and binding proteins, Amino acids, peptides and amines, Regulatory functions, Protein interactions]	300
129834	TIGR00751	menA	1,4-dihydroxy-2-naphthoate octaprenyltransferase. This membrane-associated enzyme converts 1,4-dihydroxy-2-naphthoic acid (DHNA) to demethylmenaquinone, a step in menaquinone biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone]	284
273251	TIGR00752	slp	outer membrane lipoprotein, Slp family. Slp superfamily members are present in the Gram-negative gamma proteobacteria Escherichia coli, which also contains a close paralog, Haemophilus influenzae and Pasteurella multocida and Vibrio cholerae. The known members of the family to date share a motif LX[GA]C near the N-terminus, which is compatible with the possibility that the protein is modified into a lipoprotein with Cys as the new N-terminus. Slp from Escherichia coli is known to be a lipoprotein of the outer membrane and to be expressed in response to carbon starvation. [Cell envelope, Other]	182
129836	TIGR00753	undec_PP_bacA	undecaprenyl-diphosphatase UppP. This is a family of small, highly hydrophobic proteins. Overexpression of this protein in Escherichia coli is associated with bacitracin resistance, and the protein was originally proposed to be an undecaprenol kinase and called bacA. It is now known to be an undecaprenyl pyrophosphate phosphatase (EC 3.6.1.27) and is renamed UppP. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan]	255
162022	TIGR00754	bfr	bacterioferritin. Bacterioferritin, predominantly an iron-storage protein restricted to Bacteria, has also been designated cytochrome b1 and cytochrome b-557. Bacterioferritin is a homomultimer in most species. In Neisseria gonorrhoeae, Synechocystis PCC6803, Magnetospirillum magnetotacticum, and Pseudomonas aeruginosa, two types of subunit are found in a heteromultimeric complex, with each species having one member of each type. At present, both types of subunit are included in this single model. [Transport and binding proteins, Cations and iron carrying compounds]	157
273252	TIGR00755	ksgA	ribosomal RNA small subunit methyltransferase A. In both E. coli and Saccharomyces cerevisiae, this protein is responsible for the dimethylation of two adjacent adenosine residues in a conserved hairpin of 16S rRNA in bacteria, 18S rRNA in eukaryotes. This adjacent dimethylation is the only rRNA modification shared by bacteria and eukaryotes. A single member of this family is present in each of the first 20 completed microbial genomes. This protein is essential in yeast, but not in E. coli, where its deletion leads to resistance to the antibiotic kasugamycin. Alternate name: S-adenosylmethionine--6-N',N'-adenosyl (rRNA) dimethyltransferase [Protein synthesis, tRNA and rRNA base modification]	254
273253	TIGR00756	PPR	pentatricopeptide repeat domain (PPR motif). This model describes a domain called the PPR motif, or pentatricopeptide repeat. Its consensus sequence is 35 positions long and typically is found in four or more tandem copies. This family is strongly represented in plant proteins, particularly those sorted to chloroplasts or mitochondria. The pfam01535, domain of unknown function DUF17, consists of 6 copies of this repeat. This family has a similar consensus to the TPR domain (tetratricopeptide), pfam00515, a 33-residue repeat. It is predicted to form a pair of antiparallel helices similar to that of TPR.	35
273254	TIGR00757	RNaseEG	ribonuclease, Rne/Rng family. This model describes ribonuclease G (formerly CafA, cytoplasmic axial filament protein A), the N-terminal domain of ribonuclease E in which ribonuclease activity resides, and related proteins. In E. coli, both RNase E and RNase G have been shown to play a role in the maturation of the 5' end of 16S RNA. The C-terminal half of RNase E (excluded from the seed alignment for this model) lacks ribonuclease activity but participates in mRNA degradation by organizing the degradosome. [Transcription, Degradation of RNA]	414
129841	TIGR00758	UDG_fam4	uracil-DNA glycosylase, family 4. This well-conserved family of proteins is about 200 residues in length and homologous to the N-terminus of the DNA polymerase of phage SPO1 of Bacillus subtilis. The member from Thermus thermophilus HB8 is known to act as uracil-DNA glycosylase, an enzyme of DNA base excision repair. Its appearance as a domain of phage DNA polymerases could be consistent with uracil-DNA glycosylase activity. [DNA metabolism, DNA replication, recombination, and repair]	173
273255	TIGR00759	aceE	pyruvate dehydrogenase E1 component, homodimeric type. Most members of this family are pyruvate dehydrogenase complex, E1 component. Note: this family was classified as subfamily rather than equivalog because it includes a counterexample from Pseudomonas putida, MdeB, that is active as an E1 component of an alpha-ketoglutarate dehydrogenase complex rather than a pyruvate dehydrogenase complex. The second pyruvate dehydrogenase complex E1 protein from Alcaligenes eutrophus, PdhE, complements an aceE mutant of E. coli but is not part of a pyruvate dehydrogenase complex operon, is more similar to the Pseudomonas putida MdeB than to E. coli AceE, and may also have a different primary specificity.	885
129843	TIGR00760	araD	L-ribulose-5-phosphate 4-epimerase. E. coli has two genes, sgaE and sgbE (YiaS), that are very close homologs of araD, the established L-ribulose-5-phosphate 4-epimerase of E. coli. SgbE, part of an operon for L-xylulose metabolism, also has L-ribulose-5-phosphate 4-epimerase activity; L-xylulose-5-phosphate may be converted into L-ribulose-5-phosphate by another product of that operon. The homolog to this family from Mycobacterium smegmatis is flanked by putative araB and araA genes, consistent with it also being araD. [Energy metabolism, Sugars]	231
273256	TIGR00761	argB	acetylglutamate kinase. This model describes N-acetylglutamate kinases (ArgB) of many prokaryotes and the N-acetylglutamate kinase domains of multifunctional proteins from yeasts. This enzyme is the second step in the "acetylated" ornithine biosynthesis pathway. A related group of enzymes representing the first step of the pathway contain a homologous domain and are excluded from this model. [Amino acid biosynthesis, Glutamate family]	231
273257	TIGR00762	DegV	EDD domain protein, DegV family. This family of proteins is related to DegV of Bacillus subtilis and includes paralogous sets in several species (B. subtilis, Deinococcus radiodurans, Mycoplasma pneumoniae) that are closer in percent identity to each than to most homologs from other species. This suggests both recent paralogy and diversity of function. DegV itself is encoded immediately downstream of DegU, a transcriptional regulator of degradation, but is itself uncharacterized. Crystallography suggested a lipid-binding site, while comparison of the crystal structure to dihydroxyacetone kinase and to a mannose transporter EIIA domain suggests a conserved domain, EDD, with phosphotransferase activity. [Unknown function, General]	275
273258	TIGR00763	lon	endopeptidase La. This protein, the ATP-dependent serine endopeptidase La, is induced by heat shock and other stresses in E. coli, B. subtilis, and other species. The yeast member, designated PIM1, is located in the mitochondrial matrix, required for mitochondrial function, and also induced by heat shock. [Protein fate, Degradation of proteins, peptides, and glycopeptides]	775
273259	TIGR00764	lon_rel	lon-related putative ATP-dependent protease. This model represents a set of proteins with extensive C-terminal homology to the ATP-dependent protease La, product of the lon gene of E. coli. The model is based on a seed alignment containing only archaeal members, but several bacterial proteins match the model well. Because several species, including Thermotoga maritima and Treponema pallidum, contain both a close homolog of the lon protease and nearly full-length homolog of the members of this family, we suggest there may also be a functional division between the two families. Members of this family from Pyrococcus horikoshii and Pyrococcus abyssi each contain a predicted intein. [Protein fate, Degradation of proteins, peptides, and glycopeptides]	608
273260	TIGR00765	yihY_not_rbn	YihY family inner membrane protein. Initial identification of members of this protein family was based on characterization of the yihY gene product as ribonuclease BN in Escherichia coli. This identification has been withdrawn, as the group now finds the homolog in E. coli of RNase Z is the true ribonuclease BN rather than a strict functional equivalent of RNase Z. Members of this subfamily include the largely uncharacterized BrkB (Bordetella resist killing by serum B) from Bordetella pertussis. Some members have an additional C-terminal domain. Paralogs from E. coli (yhjD) and Mycobacterium tuberculosis (Rv3335c) are part of a smaller, related subfamily that form their own cluster. [Unknown function, General]	259
188082	TIGR00766	TIGR00766	inner membrane protein YhjD. This family, including YhjD in E. coli, is a conserved inner membrane protein homologous to YihY, which in turn was incorrectly assigned to be ribonuclease BN. Thus, any suggestion this family is similar to ribonucleases should be removed. [Transcription, Degradation of RNA]	263
162030	TIGR00767	rho	transcription termination factor Rho. This RNA helicase, the transcription termination factor Rho, occurs in nearly all bacteria but is missing from the Cyanobacteria, the Mollicutes (Mycoplasmas), and various Lactobacillales including Streptococcus. It is also missing, of course, from the Archaea, which also lack Nus factors. Members of this family from Micrococcus luteus, Mycobacterium tuberculosis, and related species have a related but highly variable long, highly charged insert near the amino end. Members of this family differ in the specificity of RNA binding. [Transcription, Transcription factors]	415
273261	TIGR00768	rimK_fam	alpha-L-glutamate ligase, RimK family. This family, related to bacterial glutathione synthetases, contains at least three different alpha-L-glutamate ligases. One is RimK, as in E. coli, which adds additional Glu residues to the native Glu-Glu C-terminus of ribosomal protein S6, but not to Lys-Glu mutants. Most species with a member of this subfamily lack an S6 homolog ending in Glu-Glu, however. Members in Methanococcus jannaschii act instead as a tetrahydromethanopterin:alpha-l-glutamate ligase (MJ0620) and a gamma-F420-2:alpha-l-glutamate ligase (MJ1001).	276
273262	TIGR00769	AAA	ADP/ATP carrier protein family. These proteins are members of the ATP:ADP Antiporter (AAA) Family (TC 2.A.12), which consists of nucleotide transporters that have 12 GES predicted transmembrane regions. One protein from Rickettsia prowazekii functions to take up ATP from the eukaryotic cell cytoplasm into the bacterium in exchange for ADP. Five AAA family paralogues are encoded within the genome of R. prowazekii. This organism transports UMP and GMP but not CMP, and it seems likely that one or more of the AAA family paralogues are responsible. The genome of Chlamydia trachomatis encodes two AAA family members, Npt1 and Npt2, which catalyse ATP/ADP exchange and GTP, CTP, ATP and UTP uptake probably employing a proton symport mechanism. Two homologous adenylate translocators of Arabidopsis thaliana are postulated to be localized to the intracellular plastid membrane where they function as ATP importers. [Transport and binding proteins, Nucleosides, purines and pyrimidines]	472
273263	TIGR00770	Dcu	anaerobic c4-dicarboxylate membrane transporter family protein. These proteins are members of the C4-Dicarboxylate Uptake (Dcu) Family (TC 2.A.13). Most proteins in this family have 12 GES predicted transmembrane regions; however one member has 10 experimentally determined transmembrane regions with both the N- and C-termini localized to the periplasm. The two Escherichia coli proteins, DcuA and DcuB, transport aspartate, malate, fumarate and succinate, and function as antiporters with any two of these substrates. Since DcuA is encoded in an operon with the gene for aspartase, and DcuB is encoded in an operon with the gene for fumarase, their physiological functions may be to catalyze aspartate:fumarate and fumarate:malate exchange during the anaerobic utilization of aspartate and fumarate, respectively. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	430
129854	TIGR00771	DcuC	c4-dicarboxylate anaerobic carrier family protein. These proteins are members of the C4-dicarboxylate Uptake C (DcuC) Family (TC 2.A.61). The only functionally characterized member of this family is the anaerobic C4-dicarboxylate transporter (DcuC) of Escherichia coli. DcuC has 12 GES predicted transmembrane regions, is induced only under anaerobic conditions, and is not repressed by glucose. It may therefore function as a succinate efflux system during anaerobic glucose fermentation. However, when overexpressed, it can replace either DcuA or DcuB in catalyzing fumarate-succinate exchange and fumarate uptake. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	388
273264	TIGR00773	NhaA	Na+/H+ antiporter NhaA. These proteins are members of the NhaA Na+:H+ Antiporter (NhaA) Family (TC 2.A.33). The Escherichia coli NhaA protein probably functions in the regulation of the internal pH when the external pH is alkaline. It also uses the H+ gradient to expel Na+ from the cell. Its activity is highly pH dependent. Only the E. coli protein is functionally and structurally well characterized. [Transport and binding proteins, Cations and iron carrying compounds]	373
129856	TIGR00774	NhaB	Na+/H+ antiporter NhaB. These proteins are members of the NhaB Na+:H+ Antiporter (NhaB) Family (TC 2.A.34). The only characterised member of this family is the Escherichia coli NhaB protein, which has 12 GES predicted transmembrane regions, and catalyses sodium/proton exchange. Unlike NhaA this activity is not pH dependent. [Transport and binding proteins, Cations and iron carrying compounds]	515
129857	TIGR00775	NhaD	Na+/H+ antiporter, NhaD family. These proteins are members of the NhaD Na+:H+ Antiporter (NhaD) Family (TC 2.A.62). A single member of the NhaD family has been characterized. This protein is the NhaD protein of Vibrio parahaemolyticus which has 12 GES predicted transmembrane regions. It has been shown to catalyze Na+/H+ antiport, but Li+ can also be a substrate. [Transport and binding proteins, Cations and iron carrying compounds]	420
273265	TIGR00776	RhaT	RhaT L-rhamnose-proton symporter family protein. These proteins are members of the L-Rhamnose Symporter (RhaT) Family (TC 2.A.7). This family includes two characterized members, both of which function as L-rhamnose:H+ symporters and have 10 GES predicted transmembrane domains. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	290
129859	TIGR00777	ahpD	alkylhydroperoxidase, AhpD family. Members of this family are alkylhydroperoxidases, which catalyze the reduction of peroxides to their corresponding alcohols via oxidation of cysteine residues. In these alkylhydroperoxidases, the cysteines are located in a conserved -CXXC- motif located towards the COOH terminus. In Mycobacterium tuberculosis, two non-homologous alkylhydroperoxidases, AhpD and AhpC, are found in the same operon. [Cellular processes, Detoxification]	177
273266	TIGR00778	ahpD_dom	alkylhydroperoxidase AhpD family core domain. This model represents a 51-residue core region of homology among a family of mostly uncharacterized proteins of 110 to 227 amino acids. Most members of this family contain the motif EXXXXXX[SA]XXXXXC[VIL]XCXXXH. Members of the family include the alkylhydroperoxidase AhpD of Mycobacterium tuberculosis, a macrophage infectivity potentiator peptide of Legionella pneumophila, and an uncharacterized peptide in the tetrachloroethene reductive dehalogenase operon of Dehalospirillum multivorans. We suggest that many peptides containing this domain may have alkylhydroperoxidase or related antioxidant activity. [Unknown function, General]	50
129861	TIGR00779	cad	cadmium resistance transporter (or sequestration) family protein. These proteins are members of the Cadmium Resistance (CadD) Family (TC 2.A.77). To date, this family of proteins has only been found in Gram-positive bacteria. The CadD family includes several closely related Staphylococcal proteins reported to function in cadmium resistance. Members are predicted to span the membrane five times; the mechanism of resistance is believed to be export but has also been suggested to be binding and sequestration in the membrane. Closely related but outside the scope of this model is another staphylococcal protein that has been reported to possibly function in quaternary ammonium ion export. Still more distant are other members of the broader LysE family (see Vrljic. et al, ). [Transport and binding proteins, Amino acids, peptides and amines]	193
129862	TIGR00780	ccoN	cytochrome c oxidase, cbb3-type, subunit I. This model represents the largest subunit, I, of the cbb3-type cytochrome c oxidase, with two protohemes and copper. It shows strong homology to subunits of other types of cytochrome oxidases. Species with this type, all from the Proteobacteria so far, include Neisseria meningitidis, Helicobacter pylori, Campylobacter jejuni, Rhodobacter sphaeroides, Rhizobium leguminosarum, and others. Gene symbols ccoN and fixN are synonymous. [Energy metabolism, Electron transport]	474
129863	TIGR00781	ccoO	cytochrome c oxidase, cbb3-type, subunit II. This model describes the monoheme subunit of the cbb3-type cytochrome oxidase, found in a subset of Proteobacterial species. Species having this protein also have CcoN (subunit I, containing copper and two heme groups), CcoP (subunit III, containing two hemes), and CcoQ (essential for incorporation of the prosthetic groups). [Energy metabolism, Electron transport]	232
129864	TIGR00782	ccoP	cytochrome c oxidase, cbb3-type, subunit III. This model describes a di-heme subunit of approximately 26 kDa of the cbb3 type copper and heme-containing cytochrome oxidase. [Energy metabolism, Electron transport]	285
129865	TIGR00783	ccs	citrate carrier protein, CCS family. These proteins are members of the Citrate:Cation Symporter (CCS) Family (TC 2.A.24). These proteins have 12 GES predicted transmembrane regions. Most members of the CCS family catalyze citrate uptake with either Na+ or H+ as the cotransported cation. However, one member is specific for L-malate and probably functions by a proton symport mechanism. [Unclassified, Role category not yet assigned]	347
162036	TIGR00784	citMHS	citrate transporter, CitMHS family. This family includes two characterized citrate/proton symporters from Bacillus subtilis. CitM transports citrate complexed to Mg2+, while the CitH apparently transports citrate without Mg2+. The family also includes uncharacterized transporters, including a third paralog in Bacillus subtilis. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	431
273267	TIGR00785	dass	anion transporter. The Divalent Anion:Na+ Symporter (DASS) Family (TC 2.A.47). Functionally characterized proteins of the DASS family transport (1) organic di- and tricarboxylates of the Krebs Cycle as well as dicarboxylate amino acid, (2) inorganic sulfate and (3) phosphate. The animal NaDC-1 cotransports 3 Na+ with each dicarboxylate. Protonated tricarboxylates are also cotransported with 3 Na+. [Transport and binding proteins, Anions, Transport and binding proteins, Cations and iron carrying compounds]	444
129868	TIGR00786	dctM	TRAP transporter, DctM subunit. The Tripartite ATP-independent Periplasmic Transporter (TRAP-T) Family (TC 2.A.56), DctM subunit. TRAP-T family permeases generally consist of three components, and these systems have so far been found in Gram-negative bacteria, Gram-positive bacteria and archaea. Only one member of the family has been both sequenced and functionally characterized. This system is the DctPQM system of Rhodobacter capsulatus (Forward et al., 1997). DctP is a periplasmic dicarboxylate (malate, fumarate, succinate) binding receptor that is biochemically well-characterized. DctQ is an integral cytoplasmic membrane protein with 4 putative transmembrane alpha-helical spanners (TMSs). DctM is a second integral cytoplasmic membrane protein with 12 putative TMSs. These proteins have been shown to be both necessary and sufficient for the proton motive force-dependent uptake of dicarboxylates into R. capsulatus. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	405
129869	TIGR00787	dctP	tripartite ATP-independent periplasmic transporter solute receptor, DctP family. TRAP-T (Tripartite ATP-independent Periplasmic Transporter) family proteins generally consist of three components, and these systems have so far been found in Gram-negative bacteria, Gram-positive bacteria and archaea. The best characterized example is the DctPQM system of Rhodobacter capsulatus, a C4 dicarboxylate (malate, fumarate, succinate) transporter. This model represents the DctP family, one of at least three major families of extracytoplasmic solute receptor for TRAP family transporters. Others are the SnoM family (see pfam03480) and TAXI (TRAP-associated extracytoplasmic immunogenic) family. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	257
273268	TIGR00788	fbt	folate/biopterin transporter. The Folate-Biopterin Transporter (FBT) Family (TC 2.A.71) The only functionally characterized members of the family are from protozoa and include FT1, the major folate transporter in Leishmania, and BT1, the Leishmania biopterin/folate transporter. A related protein in Trypanosoma brucei, ESAGIO, shows weak folate/biopterin transport activity. [Cell envelope, Other]	468
129871	TIGR00789	flhB_rel	flhB C-terminus-related protein. This model describes a short protein (80-93 residues) homologous to the C-terminus of the flagellar biosynthetic protein FlhB. It is found so far only in species that also have FlhB. In a phylogenetic tree based on alignment of both this family and the homologous region of FlhB and its homologs, the members of this family form a monophyletic set. [Unknown function, General]	82
273269	TIGR00790	fnt	formate/nitrite transporter. The Formate-Nitrite Transporter (FNT) Family (TC 2.A.44) The prokaryotic proteins of the FNT family probably function in the transport of the structurally related compounds, formate and nitrite. The homologous yeast protein may function as a short chain aliphatic carboxylate H+ symporter, transporting formate, acetate and propionate, and functioning primarily as an acetate uptake permease. The putative formate efflux transporters (FocA) of bacteria associated with pyruvate-formate lyase (pfl) comprise cluster I; the putative formate uptake permeases (FdhC) of bacteria and archaea associated with formate dehydrogenase comprise cluster II; the putative nitrite uptake permeases (NirC) of bacteria comprise cluster III, and the single yeast protein, the putative acetate:H+ symporter alone comprises cluster IV. The energy coupling mechanisms for proteins of the FNT family have not been extensively characterized. HCO2-, CH3CO2- and NO2- uptakes are probably coupled to H+ symport. HCO2- efflux may be driven by the membrane potential by a uniport mechanism or by H+ antiport. [Transport and binding proteins, Anions]	239
129873	TIGR00791	gntP	gluconate transporter. This family includes known gluconate transporters of E. coli and Bacillus species as well as an idonate transporter from E. coli. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	440
273270	TIGR00792	gph	sugar (Glycoside-Pentoside-Hexuronide) transporter. The Glycoside-Pentoside-Hexuronide (GPH):Cation Symporter Family (TC 2.A.2) GPH:cation symporters catalyze uptake of sugars in symport with a monovalent cation (H+ or Na+). Members of this family include transporters for melibiose, lactose, raffinose, glucuronides, pentosides and isoprimeverose. Mutants of two groups of these symporters (the melibiose permeases of enteric bacteria, and the lactose permease of Streptococcus thermophilus) have been isolated in which altered cation specificity is observed or in which sugar transport is uncoupled from cation symport (i.e., uniport is catalyzed). The various members of the family can use Na+, H+ or Li+, Na+ or Li+, H+ or Li+, or only H+ as the symported cation. All of these proteins possess twelve putative transmembrane a-helical spanners. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	437
273271	TIGR00793	kdgT	2-keto-3-deoxygluconate transporter. This family includes the characterized 2-Keto-3-Deoxygluconate transporters from Bacillus subtilis and Erwinia chrysanthemi. There are homologs of this protein found in both gram-positive and gram-negative bacteria. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	314
129876	TIGR00794	kup	potassium uptake protein. Proteins of the KUP family include the KUP (TrkD) protein of E. coli, a partially sequenced ORF from Lactococcus lactis, high affinity K+ uptake systems (Hak1) of the yeast Debaryomyces occidentalis as well as the fungus, Neurospora crassa, and several homologues in plants. While the E. coli KUP protein is assumed to be a secondary transporter, and uptake is blocked by protonophores such as CCCP (but not arsenate), the energy coupling mechanism has not been defined. However, the N. crassa protein has been shown to be a K+:H+ symporter, establishing that the KUP family consists of secondary carriers. The plant high affinity (20mM) K+ transporter can complement K+ uptake defects in E. coli. [Transport and binding proteins, Cations and iron carrying compounds]	688
162041	TIGR00795	lctP	L-lactate transport. The Lactate Permease (LctP) Family (TC 2.A.14) The only characterized member of this family, from E. coli, appears to catalyze lactate:H+ uptake. Members of this family have 12 probable TMS. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	530
273272	TIGR00796	livcs	branched-chain amino acid uptake carrier. The Branched Chain Amino Acid:Cation Symporter (LIVCS) Family (TC 2.A.26) Characterized members of this family transport all three of the branched chain aliphatic amino acids (leucine (L), isoleucine (I) and valine (V)). They function by a Na+ or H+ symport mechanism and display 12 putative transmembrane helical spanners. [Transport and binding proteins, Amino acids, peptides and amines]	378
273273	TIGR00797	matE	putative efflux protein, MATE family. The Multi Antimicrobial Extrusion (MATE) Family (TC 2.A.66) The MATE family consists of probable efflux proteins including a functionally characterized multi drug efflux system from Vibrio parahaemolyticus, a putative ethionine resistance protein of Saccharomyces cerevisiae, and the functionally uncharacterized DNA damage-inducible protein F (DinF) of E. coli. These proteins have 12 probable TMS. [Transport and binding proteins, Other]	342
129880	TIGR00798	mtc	tricarboxylate carrier. The MTC family consists of a limited number of homologues, all from eukaryotes. A single member of the family has been functionally characterized, the tricarboxylate carrier from rat liver mitochondria. The rat liver mitochondrial tricarboxylate carrier has been reported to transport citrate, cis-aconitate, threo-D-isocitrate, D- and L-tartrate, malate, succinate and phosphoenolpyruvate. It presumably functions by a proton symport mechanism. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	318
273274	TIGR00799	mtp	Golgi 4-transmembrane spanning transporter. The proteins of the MET family have 4 TMS regions and are located in late endosomal or lysosomal membranes. Substrates of the mouse MTP transporter include thymidine, both nucleoside and nucleobase analogues, antibiotics, anthracyclines, ionophores and steroid hormones. MET transporters may be involved in the subcellular compartmentation of steroid hormones and other compounds. Drug sensitivity by mouse MET was regulated by compounds that inhibit lysosomal function, interface with intracellular cholesterol transport, or modulate the multidrug resistance phenotype of mammalian cells. Thus, MET family members may compartmentalize diverse hydrophobic molecules, thereby affecting cellular drug sensitivity, nucleoside/nucleobase availability and steroid hormone responses. [Transport and binding proteins, Unknown substrate]	258
273275	TIGR00800	ncs1	NCS1 nucleoside transporter family. The Nucleobase:Cation Symporter-1 (NCS1) Family (TC 2.A.39) The NCS1 family consists of bacterial and yeast transporters for nucleobases including purines and pyrimidines. Members of this family possess twelve putative transmembrane a-helical spanners (TMSs). At least some of them have been shown to function in uptake by substrate:H+ symport mechanism. [Transport and binding proteins, Nucleosides, purines and pyrimidines]	442
273276	TIGR00801	ncs2	uracil-xanthine permease. The Nucleobase:Cation Symporter-2 (NCS2) Family (TC 2.A.40) Most of the functionally characterized members of the NCS2 family are transporters specific for nucleobases including both purines and pyrimidines. However, two closely related rat members of the family, SVCT1 and SVCT2, localized to different tissues of the body, cotransport L-ascorbate and Na+ with a high degree of specificity and high affinity for the vitamin. The NCS2 family appears to be distantly related to the NCS1 family (TC #2.A.39). [Transport and binding proteins, Nucleosides, purines and pyrimidines]	412
273277	TIGR00802	nico	high-affinity nickel-transporter, HoxN/HupN/NixA family. This family is found in both Gram-negative and Gram-positive bacteria. The functionally characterized members of the family catalyze uptake of either Ni2+ or Co2+ in a proton motive force-dependent process. Topological analyses with the HoxN Ni2+ transporter of Ralstonia eutropha (Alcaligenes eutrophus) suggest that it possesses 8 TMSs with its N- and C-termini in the cytoplasm. [Transport and binding proteins, Cations and iron carrying compounds]	280
129885	TIGR00803	nst	UDP-galactose transporter. The 10-12 TMS Nucleotide Sugar Transporters (TC 2.A.7.10) Nucleotide-sugar transporters (NSTs) are found in the Golgi apparatus and the endoplasmic reticulum of eukaryotic cells. Members of the family have been sequenced from yeast, protozoans and animals. Animals such as C. elegans possess many of these transporters. Humans have at least two closely related isoforms of the UDP-galactose:UMP exchange transporter. NSTs generally appear to function by antiport mechanisms, exchanging a nucleotide-sugar for a nucleotide. Thus, CMP-sialic acid is exchanged for CMP; GDP-mannose is preferentially exchanged for GMP, and UDP-galactose and UDP-N-acetylglucosamine are exchanged for UMP (or possibly UDP). Other nucleotide sugars (e.g., GDP-fucose, UDP-xylose, UDP-glucose, UDP-N-acetylgalactosamine, etc.) may also be transported in exchange for various nucleotides, but their transporters have not been molecularly characterized. Each compound appears to be translocated by its own transport protein. Transport allows the compound, synthesized in the cytoplasm, to be exported to the lumen of the Golgi apparatus or the endoplasmic reticulum where it is used for the synthesis of glycoproteins and glycolipids.	222
273278	TIGR00804	nupC	nucleoside transporter. The Concentrative Nucleoside Transporter (CNT) Family (TC 2.A.41) Members of the CNT family mediate nucleoside uptake. In bacteria they are energized by H+ symport, but in mammals they are energized by Na+ symport. The different transporters exhibit differing specificities for nucleosides. The E. coli NupC permease transports all nucleosides (both ribo- and deoxyribonucleosides) except hypoxanthine and guanine nucleosides. The B. subtilis NupC is specific for pyrimidine nucleosides (cytidine and uridine and the corresponding deoxyribonucleosides). The mammalian permease members of the CNT family also exhibit differing specificities. Thus, rats possess at least two NupC homologues, one specific for both purine and pyrimidine nucleosides and one specific for purine nucleosides. At least three paralogues have been characterized from humans. One human homologue (CNT1) transports pyrimidine nucleosides and adenosine, but deoxyadenosine and guanosine are poor substrates of this permease. Another (CNT2) is selective for purine nucleosides. Alteration of just a few amino acyl residues in TMSs 7 and 8 interconverts their specificities. [Transport and binding proteins, Nucleosides, purines and pyrimidines]	401
273279	TIGR00805	oat	sodium-independent organic anion transporter. The Organo Anion Transporter (OAT) Family (TC 2.A.60) Proteins of the OAT family catalyze the Na+-independent facilitated transport of organic anions such as bromosulfobromophthalein and prostaglandins as well as conjugated and unconjugated bile acids (taurocholate and cholate, respectively). These transporters have been characterized in mammals, but homologues are present in C. elegans and A. thaliana. Some of the mammalian proteins exhibit a high degree of tissue specificity. For example, the rat OAT is found at high levels in liver and kidney and at lower levels in other tissues. These proteins possess 10-12 putative a-helical transmembrane spanners. They may catalyze electrogenic anion uniport or anion exchange.	632
129888	TIGR00806	rfc	RFC reduced folate carrier. The Reduced Folate Carrier (RFC) Family (TC 2.A.48) Members of the RFC family mediate the uptake of folate, reduced folate, derivatives of reduced folate and the drug methotrexate. Proteins of the RFC family are so far restricted to animals. RFC proteins possess 12 putative transmembrane a-helical spanners (TMSs) and evidence for a 12 TMS topology has been published for the human RFC. The RFC transporters appear to transport reduced folate by an energy-dependent, pH-dependent, Na+-independent mechanism. Folate:H+ symport, folate:OH- antiport and folate:anion antiport mechanisms have been proposed, but the energetic mechanism is not well defined. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	511
129889	TIGR00807	malonate_madL	malonate transporter, MadL subunit. The MSS family includes the monobasic malonate:Na+ symporter of Malonomonas rubra. It consists of two integral membrane proteins, MadL and MadM. The transporter is believed to catalyze the electroneutral reversible uptake of H+-malonate with one Na+, and both subunits have been shown to be essential for activity. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	125
129890	TIGR00808	malonate_madM	malonate transporter, MadM subunit. The MSS family includes the monobasic malonate:Na+ symporter of Malonomonas rubra. It consists of two integral membrane proteins, MadL and MadM. The transporter is believed to catalyze the electroneutral reversible uptake of H+-malonate with one Na+, and both subunits have been shown to be essential for activity. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	254
213561	TIGR00809	secB	protein-export chaperone SecB. This protein acts as an export-specific cytosolic chaperone. It binds the mature region of pre-proteins destined for secretion, prevents aggregation, and delivers them to SecA. This protein is tetrameric in E. coli. The archaeal Methanococcus jannaschii homolog MJ0357 has been shown to share many properties, including chaperone-like activity, and scores between trusted and noise. [Protein fate, Protein and peptide secretion and trafficking]	140
273280	TIGR00810	secG	protein translocase, SecG subunit. This family of proteins forms a complex with SecY and SecE. SecA then recruits the SecYEG complex to form an active protein translocation channel. [Protein fate, Protein and peptide secretion and trafficking]	73
273281	TIGR00811	sit	silicon transporter. Marine diatoms such as Cylindrotheca fusiformis encode at least six silicon transport protein homologues which exhibit similar size and topology. One characterized member of the family (Sit1) functions in the energy-dependent uptake of either Silicic acid [Si(OH)4] or Silicate [Si(OH)3O-] by a Na+ symport mechanism. The system is found in marine diatoms which make their "glass houses" out of silicon. [Transport and binding proteins, Other]	545
273282	TIGR00813	sss	transporter, SSS family. The Solute:Sodium Symporter (SSS) Family (TC 2.A.21) Members of the SSS family catalyze solute:Na+ symport. The solutes transported may be sugars, amino acids, nucleosides, inositols, vitamins, urea or anions, depending on the system. Members of the SSS family have been identified in bacteria, archaea and animals, and all functionally well characterized members catalyze solute uptake via Na+ symport. Proteins of the SSS generally share a core of 13 TMSs, but different members of the family may have different numbers of TMSs. A 13 TMS topology with a periplasmic N-terminus and a cytoplasmic C-terminus has been experimentally determined for the proline:Na+ symporter, PutP, of E. coli. [Transport and binding proteins, Cations and iron carrying compounds]	407
273283	TIGR00814	stp	serine transporter. The Hydroxy/Aromatic Amino Acid Permease (HAAAP) Family- serine/threonine subfamily (TC 2.A.42.2) The HAAAP family includes well characterized aromatic amino acid:H+ symport permeases and hydroxy amino acid permeases. This subfamily is specific for hydroxy amino acid transporters and includes the serine permease, SdaC, of E. coli, and the threonine permease, TdcC, of E. coli. [Transport and binding proteins, Amino acids, peptides and amines]	397
273284	TIGR00815	sulP	high affinity sulphate transporter 1. The SulP family is a large and ubiquitous family with over 30 sequenced members derived from bacteria, fungi, plants and animals. Many organisms including Bacillus subtilis, Synechocystis sp, Saccharomyces cerevisiae, Arabidopsis thaliana and Caenorhabditis elegans possess multiple SulP family paralogues. Many of these proteins are functionally characterized, and all are sulfate uptake transporters. Some transport their substrate with high affinities, while others transport it with relatively low affinities. Most function by SO42-:H+ symport, but SO42-:HCO3- antiport has been reported for the rat protein (spP45380). The bacterial proteins vary in size from 434 residues to 566 residues with one exception, a Mycobacterium tuberculosis protein with 784 residues. The eukaryotic proteins vary in size from 611 residues to 893 residues with one exception, a protein designated "early nodulin 70 protein" from Glycine max which is reported to be of 485 residues. Thus, the eukaryotic proteins are almost without exception larger than the prokaryotic proteins. These proteins exhibit 10-13 putative transmembrane a-helical spanners (TMSs) depending on the protein. The phylogenetic tree for the SulP family reveals five principal branches. Three of these are bacterial specific as follows: one bears a single protein from M. tuberculosis; a second bears two proteins, one from M. tuberculosis, the other from Synechocystis sp, and the third bears all remaining prokaryotic proteins. The remaining two clusters bear only eukaryotic proteins with the animal proteins all localized to one branch and the plant and fungal proteins localized to the other. The generalized transport reactions catalyzed by SulP family proteins are: (1) SO42- (out) + nH+ (out) --> SO42- (in) + nH+ (in). (2) SO42- (out) + nHCO3- (in) --> SO42- (in) + nHCO3- (out). [Transport and binding proteins, Anions]	552
273285	TIGR00816	tdt	C4-dicarboxylate transporter/malic acid transport protein. The Tellurite-Resistance/Dicarboxylate Transporter (TDT) Family (TC 2.A.16) Two members of the TDT family have been functionally characterized. One is the TehA protein of E. coli which has been implicated in resistance to tellurite; the other is the Mae1 protein of S. pombe which functions in the uptake of malate and other dicarboxylates by a proton symport mechanism. These proteins exhibit 10 putative transmembrane a-helical spanners (TMSs). [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	320
129898	TIGR00817	tpt	Tpt phosphate/phosphoenolpyruvate translocator. The 6-8 TMS Triose-phosphate Transporter (TPT) Family (TC 2.A.7.9) Functionally characterized members of the TPT family are derived from the inner envelope membranes of chloroplasts and nongreen plastids of plants. However, homologues are also present in yeast. Saccharomyces cerevisiae has three functionally uncharacterized TPT paralogues encoded within its genome. Under normal physiological conditions, chloroplast TPTs mediate a strict antiport of substrates, frequently exchanging an organic three carbon compound phosphate ester for inorganic phosphate (Pi). Normally, a triose-phosphate, 3-phosphoglycerate, or another phosphorylated C3 compound made in the chloroplast during photosynthesis, exits the organelle into the cytoplasm of the plant cell in exchange for Pi. However, experiments with reconstituted translocator in artificial membranes indicate that transport can also occur by a channel-like uniport mechanism with up to 10-fold higher transport rates. Channel opening may be induced by a membrane potential of large magnitude and/or by high substrate concentrations. Nongreen plastid and chloroplast carriers, such as those from maize endosperm and root membranes, mediate transport of C3 compounds phosphorylated at carbon atom 2, particularly phosphoenolpyruvate, in exchange for Pi. These are the phosphoenolpyruvate:Pi antiporters (PPT). Glucose-6-P has also been shown to be a substrate of some plastid translocators (GPT). The three types of proteins (TPT, PPT and GPT) are divergent in sequence as well as substrate specificity, but their substrate specificities overlap. [Hypothetical proteins, Conserved]	302
273286	TIGR00819	ydaH	p-Aminobenzoyl-glutamate transporter family. The p-Aminobenzoyl-glutamate transporter family includes two transporters, the AbgT (YdaH) protein of E. coli and MtrF of Neisseria gonorrhoea. AbgT is apparently cryptic in wild type cells, but when expressed on a high copy number plasmid, or when expressed at higher levels due to mutation, it allows utilization of p-aminobenzoyl-glutamate as a source of p-aminobenzoate for p-aminobenzoate auxotrophs. p-Aminobenzoate is a constituent of and a precursor for the biosynthesis of folic acid. [Hypothetical proteins, Conserved]	524
273287	TIGR00820	zip	ZIP zinc/iron transport family. The Zinc (Zn2+)-Iron (Fe2+) Permease (ZIP) Family (TC 2.A.5) Members of the ZIP family consist of proteins with eight putative transmembrane spanners. They are derived from animals, plants and yeast. They comprise a diverse family, with several paralogues in any one organism (e.g., at least five in Caenorhabditis elegans, at least five in Arabidopsis thaliana and two in Saccharomyces cerevisiae). The two S. cerevisiae proteins, Zrt1 and Zrt2, both probably transport Zn2+ with high specificity, but Zrt1 transports Zn2+ with ten-fold higher affinity than Zrt2. Some members of the ZIP family have been shown to transport Zn2+ while others transport Fe2+, and at least one transports a range of metal ions. The energy source for transport has not been characterized, but these systems probably function as secondary carriers. [Transport and binding proteins, Cations and iron carrying compounds]	324
129901	TIGR00821	EII-GUT	PTS system, glucitol/sorbitol-specific, IIC component. Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Gut family consists only of glucitol-specific transporters, but these occur both in Gram-negative and Gram-positive bacteria. The E. coli system consists of a IIA protein, a IIC protein and a IIBC protein. This family is specific for the IIC component. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS]	181
129902	TIGR00822	EII-Sor	PTS system, mannose/fructose/sorbose family, IIC component. Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Man (PTS splinter group) family is unique in several respects among PTS permease families. It is the only PTS family in which members possess a IID protein. It is the only PTS family in which the IIB constituent is phosphorylated on a histidyl rather than a cysteyl residue. Its permease members exhibit broad specificity for a range of sugars, rather than being specific for just one or a few sugars. The mannose permease of E. coli, for example, can transport and phosphorylate glucose, mannose, fructose, glucosamine, N-acetylglucosamine, and other sugars. Other members of this family can transport sorbose, fructose and N-acetylglucosamine. This family is specific for the sorbose-specific IIC subunits of this family of PTS transporters. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS]	265
129903	TIGR00823	EIIA-LAC	phosphotransferase system enzyme II, lactose-specific, factor III. The PTS Lactose-N,N'-Diacetylchitobiose-b-glucoside (Lac) Family (TC 4.A.3) Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Lac family includes several sequenced lactose (b-galactoside) permeases of Gram-positive bacteria as well as the E. coli N,N'-diacetylchitobiose (Chb) permease which can transport aromatic b-glucosides and cellobiose as well as the chitin disaccharide, Chb, but only Chb induces expression of the chb operon. While the Lac permeases consist of two polypeptide chains (IIA and IICB), the Chb permease of E. coli consists of three (IIA, IIB and IIC). In B. subtilis, a PTS permease similar to the Chb permease of E. coli is believed to transport lichenan (a b-1,3;1,4 glucan) degradation products, oligosaccharides of 2-4 glucose units. This model is specific for the IIA subunit of the Lac PTS family. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	99
129904	TIGR00824	EIIA-man	PTS system, mannose/fructose/sorbose family, IIA component. Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Man family is unique in several respects among PTS permease families. It is the only PTS family in which members possess a IID protein. It is the only PTS family in which the IIB constituent is phosphorylated on a histidyl rather than a cysteyl residue. Its permease members exhibit broad specificity for a range of sugars, rather than being specific for just one or a few sugars. The mannose permease of E. coli, for example, can transport and phosphorylate glucose, mannose, fructose, glucosamine, N-acetylglucosamine, and other sugars. Other members of this family can transport sorbose, fructose and N-acetylglucosamine. This family is specific for the IIA components. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS]	116
129905	TIGR00825	EIIBC-GUT	PTS system, glucitol/sorbitol-specific, IIBC component. Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Gut family consists only of glucitol-specific permeases, but these occur both in Gram-negative and Gram-positive bacteria. The E. coli system consists of a IIA protein, a IIC protein and a IIBC protein. This family is specific for the IIBC component. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS]	331
273288	TIGR00826	EIIB_glc	PTS system, glucose-like IIB component. The PTS Glucose-Glucoside (Glc) Family (TC 4.A.1) Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Glc family includes permeases specific for glucose, N-acetylglucosamine and a large variety of a- and b-glucosides. However, not all b-glucoside PTS permeases are in this class, as the cellobiose (Cel) b-glucoside PTS permease is in the Lac family (TC #4.A.3). These permeases show limited sequence similarity with members of the Fru family (TC #4.A.2). Several of the E. coli PTS permeases in the Glc family lack their own IIA domains and instead use the glucose IIA protein (IIAglc or Crr). Most of these permeases have the B and C domains linked together in a single polypeptide chain, and a cysteyl residue in the IIB domain is phosphorylated by direct phosphoryl transfer from IIAglc(his~P). Those permeases which lack a IIA domain include the maltose (Mal), arbutin-salicin-cellobiose (ASC), trehalose (Tre), putative glucoside (Glv) and sucrose (Scr) permeases of E. coli. Most, but not all Scr permeases of other bacteria also lack a IIA domain. This model is specific for the IIB domain of the Glc family PTS transporters. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS]	88
129907	TIGR00827	EIIC-GAT	PTS system, galactitol-specific IIC component. Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The only characterized member of this family of PTS transporters is the E. coli galactitol transporter. Gat family PTS systems typically have 3 components: IIA, IIB and IIC. This family is specific for the IIC component of the PTS Gat family. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS]	407
129908	TIGR00828	EIID-AGA	PTS system, mannose/fructose/sorbose family, IID component. Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Man family is unique in several respects among PTS permease families. It is the only PTS family in which members possess a IID protein. It is the only PTS family in which the IIB constituent is phosphorylated on a histidyl rather than a cysteyl residue. Its permease members exhibit broad specificity for a range of sugars, rather than being specific for just one or a few sugars. The mannose permease of E. coli, for example, can transport and phosphorylate glucose, mannose, fructose, glucosamine, N-acetylglucosamine, and other sugars. Other members of this family can transport sorbose, fructose and N-acetylglucosamine. This family is specific for the IID subunits of this family of PTS transporters. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS]	271
129909	TIGR00829	FRU	PTS system, fructose-specific, IIB component. Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Fru family is a large and complex family which includes several sequenced fructose and mannitol-specific permeases as well as several PTS components of unknown specificities. The fructose components of this family phosphorylate fructose on the 1-position. The Fru family PTS systems typically have 3 domains, IIA, IIB and IIC, which may be found as 1 or more proteins. The fructose and mannitol transporters form separate phylogenetic clusters in this family. This family is specific for the IIB domain of the fructose PTS transporters. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS]	85
273289	TIGR00830	PTBA	PTS system, glucose subfamily, IIA component. These are part of the PTS Glucose-Glucoside (Glc) SuperFamily. The Glc family includes permeases specific for glucose, N-acetylglucosamine and a large variety of a- and b-glucosides. However, not all b-glucoside PTS permeases are in this class, as the cellobiose (Cel) b-glucoside PTS permease is in the Lac family (TC #4.A.3). The IIA, IIB and IIC domains of all of the permeases listed below are demonstrably homologous. These permeases show limited sequence similarity with members of the Fru family (TC #4.A.2). Several of the PTS permeases in the Glc family lack their own IIA domains and instead use the glucose IIA protein (IIAglc or Crr). Most of these permeases have the B and C domains linked together in a single polypeptide chain, and a cysteyl residue in the IIB domain is phosphorylated by direct phosphoryl transfer from IIAglc(his~P). Those permeases which lack a IIA domain include the maltose (Mal), arbutin-salicin-cellobiose (ASC), trehalose (Tre), putative glucoside (Glv) and sucrose (Scr) permeases of E. coli. Most, but not all Scr permeases of other bacteria also lack a IIA domain. The three-dimensional structures of the IIA and IIB domains of the E. coli glucose permease have been elucidated. IIAglc has a complex b-sandwich structure while IIBglc is a split ab-sandwich with a topology unrelated to the split ab-sandwich structure of HPr. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS]	121
129911	TIGR00831	a_cpa1	Na+/H+ antiporter, bacterial form. The Monovalent Cation:Proton Antiporter-1 (CPA1) Family (TC 2.A.36) The CPA1 family is a large family of proteins derived from Gram-positive and Gram-negative bacteria, blue green bacteria, yeast, plants and animals. Transporters from eukaryotes have been functionally characterized, and all of these catalyze Na+:H+ exchange. Their primary physiological functions may be in (1) cytoplasmic pH regulation, extruding the H+ generated during metabolism, and (2) salt tolerance (in plants), due to Na+ uptake into vacuoles. This model is specific for the bacterial members of this family. [Transport and binding proteins, Cations and iron carrying compounds]	525
213563	TIGR00832	acr3	arsenical-resistance protein. The Arsenical Resistance-3 (ACR3) Family (TC 2.A.59) The first protein of the ACR3 family functionally characterized was the ACR3 protein of Saccharomyces cerevisiae. It is present in the yeast plasma membrane and pumps arsenite out of the cell in response to the pmf. Similar proteins are found in bacteria, often as part of a four gene operon with a regulatory protein ArsR, a protein of unknown function ArsH, and an arsenate reductase that converts arsenate to arsenite to facilitate transport. [Cellular processes, Detoxification, Transport and binding proteins, Anions]	328
129913	TIGR00833	actII	Transport protein. The Resistance-Nodulation-Cell Division (RND) Superfamily- MmpL subfamily (TC 2.A.6.5) Characterized members of the RND superfamily all probably catalyze substrate efflux via an H+ antiport mechanism. These proteins are found ubiquitously in bacteria, archaea and eukaryotes. This subfamily includes the S. coelicolor ActII3 protein, which may play a role in drug resistance, and the M. tuberculosis MmpL7 protein, which catalyzes export of an outer membrane lipid, phthiocerol dimycocerosate. [Transport and binding proteins, Unknown substrate]	910
273290	TIGR00834	ae	anion exchange protein. The Anion Exchanger (AE) Family (TC 2.A.31) Characterized protein members of the AE family are found only in animals. They preferentially catalyze anion exchange (antiport) reactions, typically acting as HCO3-:Cl- antiporters, but also transporting a range of other inorganic and organic anions. Additionally, renal Na+:HCO3- cotransporters have been found to be members of the AE family. They catalyze the reabsorption of HCO3- in the renal proximal tubule. [Transport and binding proteins, Anions]	900
273291	TIGR00835	agcS	amino acid carrier protein. The Alanine or Glycine: Cation Symporter (AGCS) Family (TC 2.A.25) Members of the AGCS family transport alanine and/or glycine in symport with Na+ and/or H+.	425
273292	TIGR00836	amt	ammonium transporter. The Ammonium Transporter (Amt) Family (TC 2.A.49) All functionally characterized members of the Amt family are ammonia or ammonium uptake transporters. Some, but not others, also transport methylammonium. The mechanism of energy coupling, if any, to methyl-NH2 or NH3 uptake by the AmtB protein of E. coli is not entirely clear. NH4+ uniport driven by the pmf, energy independent NH3 facilitation, and NH4+/K+ antiport have been proposed as possible transport mechanisms. In Corynebacterium glutamicum and Arabidopsis thaliana, uptake via the Amt1 homologues of AmtB has been reported to be driven by the pmf. [Transport and binding proteins, Cations and iron carrying compounds]	403
273293	TIGR00837	araaP	aromatic amino acid transport protein. The Hydroxy/Aromatic Amino Acid Permease (HAAAP) Family- tyrosine/tryptophan subfamily (TC 2.A.42.1) The HAAAP family includes well characterized aromatic amino acid:H+ symport permeases and hydroxy amino acid permeases. This subfamily is specific for aromatic amino acid transporters and includes the tyrosine permease, TyrP, of E. coli, and the tryptophan transporters TnaB and Mtr of E. coli. [Transport and binding proteins, Amino acids, peptides and amines]	381
129918	TIGR00838	argH	argininosuccinate lyase. This model describes argininosuccinate lyase, but may include examples of avian delta crystallins, in which argininosuccinate lyase activity may or may not be present and the biological role is to provide the optically clear cellular protein of the eye lens. [Amino acid biosynthesis, Glutamate family]	455
213564	TIGR00839	aspA	aspartate ammonia-lyase. This enzyme, aspartate ammonia-lyase, shows local homology to a number of other lyases, as modeled by pfam00206. Fumarate hydratase scores as high as 570 bits against this model. [Energy metabolism, Amino acids and amines]	468
273294	TIGR00840	b_cpa1	sodium/hydrogen exchanger 3. The Monovalent Cation:Proton Antiporter-1 (CPA1) Family (TC 2.A.36) The CPA1 family is a large family of proteins derived from Gram-positive and Gram-negative bacteria, blue green bacteria, yeast, plants and animals. Transporters from eukaryotes have been functionally characterized, and all of these catalyze Na+:H+ exchange. Their primary physiological functions may be in (1) cytoplasmic pH regulation, extruding the H+ generated during metabolism, and (2) salt tolerance (in plants), due to Na+ uptake into vacuoles. This model is specific for the eukaryotic members of this family. [Transport and binding proteins, Cations and iron carrying compounds]	559
188087	TIGR00841	bass	bile acid transporter. The Bile Acid:Na+ Symporter (BASS) Family (TC 2.A.28) Functionally characterized members of the BASS family catalyze Na+:bile acid symport. These systems have been identified in intestinal, liver and kidney tissues of animals. These symporters exhibit broad specificity, taking up a variety of non bile organic compounds as well as taurocholate and other bile salts. Functionally uncharacterised homologues are found in plants, yeast, archaea and bacteria. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	286
213565	TIGR00842	bcct	choline/carnitine/betaine transport. The Betaine/Carnitine/Choline Transporter (BCCT) Family (TC 2.A.15) Proteins of the BCCT family share the common functional feature of transporting molecules with a quaternary ammonium group [R-N+(CH3)3]. The BCCT family includes transporters for carnitine, choline and glycine betaine. BCCT transporters have 12 putative TMS, and are energized by pmf-driven proton symport. Some of these permeases exhibit osmosensory and osmoregulatory properties inherent to their polypeptide chains. [Transport and binding proteins, Other]	452
129923	TIGR00843	benE	benzoate transporter. The benzoate transporter family contains only a single characterised member, the benzoate transporter of Acinetobacter calcoaceticus, which functions as a benzoate/proton symporter. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	395
273295	TIGR00844	c_cpa1	na(+)/h(+) antiporter. The Monovalent Cation:Proton Antiporter-1 (CPA1) Family (TC 2.A.36) The CPA1 family is a large family of proteins derived from Gram-positive and Gram-negative bacteria, blue green bacteria, yeast, plants and animals. Transporters from eukaryotes have been functionally characterized, and all of these catalyze Na+:H+ exchange. Their primary physiological functions may be in (1) cytoplasmic pH regulation, extruding the H+ generated during metabolism, and (2) salt tolerance (in plants), due to Na+ uptake into vacuoles. This model is specific for the fungal members of this family. [Transport and binding proteins, Cations and iron carrying compounds]	810
273296	TIGR00845	caca	sodium/calcium exchanger 1. The Ca2+:Cation Antiporter (CaCA) Family (TC 2.A.19) Proteins of the CaCA family are found ubiquitously, having been identified in animals, plants, yeast, archaea and widely divergent bacteria. All of the characterized animal proteins catalyze Ca2+:Na+ exchange although some also transport K+. The NCX1 plasma membrane protein exchanges 3 Na+ for 1 Ca2+. The E. coli ChaA protein catalyzes Ca2+:H+ antiport but may also catalyze Na+:H+ antiport. All remaining well-characterized members of the family catalyze Ca2+:H+ exchange. This model is specific for the eukaryotic sodium ion/calcium ion exchangers of the CaCA family. [Transport and binding proteins, Other]	928
273297	TIGR00846	caca2	calcium/proton exchanger. The Ca2+:Cation Antiporter (CaCA) Family (TC 2.A.19) Proteins of the CaCA family are found ubiquitously, having been identified in animals, plants, yeast, archaea and widely divergent bacteria. All of the characterized animal proteins catalyze Ca2+:Na+ exchange although some also transport K+. The NCX1 plasma membrane protein exchanges 3 Na+ for 1 Ca2+. The E. coli ChaA protein catalyzes Ca2+:H+ antiport but may also catalyze Na+:H+ antiport. All remaining well-characterized members of the family catalyze Ca2+:H+ exchange. This model is generated from the calcium ion/proton exchangers of the CaCA family. [Transport and binding proteins, Cations and iron carrying compounds]	363
129927	TIGR00847	ccoS	cytochrome oxidase maturation protein, cbb3-type. CcoS from Rhodobacter capsulatus has been shown essential for incorporation of redox-active prosthetic groups (heme, Cu) into cytochrome cbb(3) oxidase. FixS of Bradyrhizobium japonicum appears to have the same function. Members of this family are found so far in organisms with a cbb3-type cytochrome oxidase, including Neisseria meningitidis, Helicobacter pylori, Campylobacter jejuni, Caulobacter crescentus, Bradyrhizobium japonicum, and Rhodobacter capsulatus. [Energy metabolism, Electron transport, Protein fate, Protein modification and repair]	51
273298	TIGR00848	fruA	PTS system, fructose subfamily, IIA component. 4.A.2 The PTS Fructose-Mannitol (Fru) Family Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Fru family is a large and complex family which includes several sequenced fructose and mannitol-specific permeases as well as several putative PTS permeases of unknown specificities. The fructose permeases of this family phosphorylate fructose on the 1-position. Those of family 4.6 phosphorylate fructose on the 6-position. The Fru family PTS systems typically have 3 domains, IIA, IIB and IIC, which may be found as 1 or more proteins. The fructose and mannitol transporters form separate phylogenetic clusters in this family. This model is specific for the IIA domain of the fructose PTS transporters. Also similar to the Enzyme IIA Fru subunits of the PTS, but included in TIGR01419 rather than this model, is enzyme IIA Ntr (nitrogen), also called PtsN, found in E. coli and other organisms, which may play a solely regulatory role. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS]	129
129929	TIGR00849	gutA	PTS system, glucitol/sorbitol-specific IIA component. Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. This family consists only of glucitol-specific transporters, and occurs in both Gram-negative and Gram-positive bacteria. The system in E. coli consists of a IIA protein and a IIBC protein. This family is specific for the IIA component. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS]	121
129930	TIGR00851	mtlA	PTS system, mannitol-specific IIC component. Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Fru family is a large and complex family which includes several sequenced fructose and mannitol-specific permeases as well as several putative PTS permeases of unknown specificities. The Fru family PTS systems typically have 3 domains, IIA, IIB and IIC, which may be found as 1 or more proteins. The fructose and mannitol transporters form separate phylogenetic clusters in this family. This family is specific for the IIC domain of the mannitol PTS transporters. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS]	338
273299	TIGR00852	pts-Glc	PTS system, maltose and glucose-specific subfamily, IIC component. The PTS Glucose-Glucoside (Glc) Family (TC 4.A.1) Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Glc family includes permeases specific for glucose, N-acetylglucosamine and a large variety of a- and b-glucosides. However, not all b-glucoside PTS permeases are in this class, as the cellobiose (Cel) b-glucoside PTS permease is in the Lac family (TC #4.A.3). These permeases show limited sequence similarity with members of the Fru family (TC #4.A.2). Several of the E. coli PTS permeases in the Glc family lack their own IIA domains and instead use the glucose IIA protein (IIAglc or Crr). Most of these permeases have the B and C domains linked together in a single polypeptide chain, and a cysteyl residue in the IIB domain is phosphorylated by direct phosphoryl transfer from IIAglc(his~P). Those permeases which lack a IIA domain include the maltose (Mal), arbutin-salicin-cellobiose (ASC), trehalose (Tre), putative glucoside (Glv) and sucrose (Scr) permeases of E. coli. Most, but not all, Scr permeases of other bacteria also lack a IIA domain. This model is specific for the IIC domain of the Glc family PTS transporters. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS]	289
273300	TIGR00853	pts-lac	PTS system, lactose/cellobiose family IIB component. Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Lac family includes several sequenced lactose (b-galactoside) permeases of Gram-positive bacteria as well as those in E. coli. While the Lac family usually consists of two polypeptide components, IIA and IICB, the Chb permease of E. coli consists of three: IIA, IIB and IIC. This family is specific for the IIB subunit of the Lac PTS family. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS]	95
129933	TIGR00854	pts-sorbose	PTS system, mannose/fructose/sorbose family, IIB component. Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Man family is unique in several respects among PTS permease families. It is the only PTS family in which members possess a IID protein. It is the only PTS family in which the IIB constituent is phosphorylated on a histidyl rather than a cysteyl residue. Its permease members exhibit broad specificity for a range of sugars, rather than being specific for just one or a few sugars. The mannose permease of E. coli, for example, can transport and phosphorylate glucose, mannose, fructose, glucosamine, N-acetylglucosamine, and other sugars. Other members of this family can transport sorbose, fructose and N-acetylglucosamine. This family is specific for the IIB components of this family of PTS transporters. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS]	151
273301	TIGR00855	L12	ribosomal protein L7/L12. Ribosomal proteins L7 and L12 are synonymous except for post-translational modification of the N-terminal amino acid. This model resembles pfam00542 but matches the full length of prokaryotic and organellar proteins rather than just the C-terminus. [Protein synthesis, Ribosomal proteins: synthesis and modification]	123
129935	TIGR00856	pyrC_dimer	dihydroorotase, homodimeric type. This homodimeric form of dihydroorotase is less common in microbial genomes than a related dihydroorotase that appears in a complex with aspartyltranscarbamoylase or as a homologous domain in multifunctional proteins of pyrimidine biosynthesis in higher eukaryotes. [Purines, pyrimidines, nucleosides, and nucleotides, Pyrimidine ribonucleotide biosynthesis]	341
273302	TIGR00857	pyrC_multi	dihydroorotase, multifunctional complex type. In contrast to the homodimeric type of dihydroorotase found in E. coli, this class tends to appear in a large, multifunctional complex with aspartate transcarbamoylase. Homologous domains appear in multifunctional proteins of higher eukaryotes. In some species, including Pseudomonas putida and P. aeruginosa, this protein is inactive but is required as a non-catalytic subunit of aspartate transcarbamoylase (ATCase). In these species, a second, active dihydroorotase is also present. The seed for this model does not include any example of the dihydroorotase domain of eukaryotic multidomain pyrimidine synthesis proteins. All proteins described by this model should represent active and inactive dihydroorotase per se and functionally equivalent domains of multifunctional proteins from higher eukaryotes, but exclude related proteins such as allantoinase. [Purines, pyrimidines, nucleosides, and nucleotides, Pyrimidine ribonucleotide biosynthesis]	411
273303	TIGR00858	bioF	8-amino-7-oxononanoate synthase. 7-keto-8-aminopelargonic acid synthetase is an alternate name. This model represents 8-amino-7-oxononanoate synthase, the BioF protein of biotin biosynthesis. This model is based on a careful phylogenetic analysis to separate members of this family from 2-amino-3-ketobutyrate and other related pyridoxal phosphate-dependent enzymes. In several species, including Staphylococcus and Coxiella, a candidate 8-amino-7-oxononanoate synthase is confirmed by location in the midst of a biotin biosynthesis operon but scores below the trusted cutoff of this model. [Biosynthesis of cofactors, prosthetic groups, and carriers, Biotin]	360
273304	TIGR00859	ENaC	sodium channel transporter. The Epithelial Na+ Channel (ENaC) Family (TC 1.A.06) The ENaC family consists of sodium channels from animals and has no recognizable homologues in other eukaryotes or bacteria. The vertebrate ENaC proteins from epithelial cells cluster tightly together on the phylogenetic tree: voltage-insensitive ENaC homologues are also found in the brain. Eleven sequenced C. elegans proteins, including the degenerins, are distantly related to the vertebrate proteins as well as to each other. At least some of these proteins form part of a mechano-transducing complex for touch sensitivity. Other members of the ENaC family, the acid-sensing ion channels, ASIC1-3, are homo- or hetero-oligomeric neuronal H+-gated channels that mediate pain sensation in response to tissue acidosis. The homologous Helix aspersa (FMRF-amide)-activated Na+ channel is the first peptide neurotransmitter-gated ionotropic receptor to be sequenced. Mammalian ENaC is important for the maintenance of Na+ balance and the regulation of blood pressure. Three homologous ENaC subunits, a, b and g, have been shown to assemble to form the highly Na+-selective channel. This model is designed from the vertebrate members of the ENaC family. [Transport and binding proteins, Cations and iron carrying compounds]	595
273305	TIGR00860	LIC	Cation transporter family protein. The Ligand-gated Ion Channel (LIC) Family of Neurotransmitter Receptors (TC 1.A.9) Members of the LIC family of ionotropic neurotransmitter receptors are found only in vertebrate and invertebrate animals. They exhibit receptor specificity for (1) acetylcholine, (2) serotonin, (3) glycine, (4) glutamate and (5) g-aminobutyric acid (GABA). All of these receptor channels are probably hetero- or homopentameric. The best characterized are the nicotinic acetylcholine receptors which are pentameric channels of a2bgd subunit composition. All subunits are homologous. The three dimensional structures of the protein complex in both the open and closed configurations have been solved at 0.9 nm resolution. The channel protein complexes of the LIC family preferentially transport cations or anions depending on the channel (e.g., the acetylcholine receptors are cation selective while glycine receptors are anion selective). [Transport and binding proteins, Cations and iron carrying compounds]	459
273306	TIGR00861	MIP	MIP family channel proteins. 1.A.8 The Major Intrinsic Protein (MIP) Family The MIP family is large and diverse, possessing over 100 members that all form transmembrane channels. These channel proteins function in water, small carbohydrate (e.g., glycerol), urea, NH3, CO2 and possibly ion transport by an energy independent mechanism. They are found ubiquitously in bacteria, archaea and eukaryotes. The MIP family contains two major groups of channels: aquaporins and glycerol facilitators. The known aquaporins cluster loosely together as do the known glycerol facilitators. MIP family proteins are believed to form aqueous pores that selectively allow passive transport of their solute(s) across the membrane with minimal apparent recognition. Aquaporins selectively transport water (but not glycerol) while glycerol facilitators selectively transport glycerol but not water. Some aquaporins can transport NH3 and CO2. Glycerol facilitators function as solute nonspecific channels, and may transport glycerol, dihydroxyacetone, propanediol, urea and other small neutral molecules in physiologically important processes. Some members of the family, including the yeast FPS protein (TC #1.A.8.5.1) and tobacco NtTIPA may transport both water and small solutes. [Transport and binding proteins, Unknown substrate]	216
129941	TIGR00862	O-ClC	intracellular chloride channel protein. The Organellar Chloride Channel (O-ClC) Family (TC 1.A.12) Proteins of the O-ClC family are voltage-sensitive chloride channels found in intracellular membranes but not the plasma membranes of animal cells. They are found in human nuclear membranes, and the bovine protein targets to the microsomes, but not the plasma membrane, when expressed in Xenopus laevis oocytes. These proteins are thought to function in the regulation of the membrane potential and in transepithelial ion absorption and secretion in the kidney. [Transport and binding proteins, Anions]	236
273307	TIGR00863	P2X	cation transporter protein. ATP-gated Cation Channel (ACC) Family (TC 1.A.7) Members of the ACC family (also called P2X receptors) respond to ATP, a functional neurotransmitter released by exocytosis from many types of neurons. These channels, which function at neuron-neuron and neuron-smooth muscle junctions, may play roles in the control of blood pressure and pain sensation. They may also function in lymphocyte and platelet physiology. They are found only in animals. ACC channels are probably hetero- or homomultimers and transport small monovalent cations (Me+). Some also transport Ca2+; a few also transport small metabolites. [Transport and binding proteins, Cations and iron carrying compounds]	372
188093	TIGR00864	PCC	polycystin cation channel protein. The Polycystin Cation Channel (PCC) Family (TC 1.A.5) Polycystin is a huge protein of 4303aas. Its repeated leucine-rich (LRR) segment is found in many proteins. It contains 16 polycystic kidney disease (PKD) domains, one LDL-receptor class A domain, one C-type lectin family domain, and 16-18 putative TMSs in positions between residues 2200 and 4100. Polycystin-L has been shown to be a cation (Na+, K+ and Ca2+) channel that is activated by Ca2+. Two members of the PCC family (polycystin 1 and 2) are mutated in autosomal dominant polycystic kidney disease, and polycystin-L is deleted in mice with renal and retinal defects. Note: this model is restricted to the amino half.	2740
273308	TIGR00865	bcl-2	apoptosis regulator. The Bcl-2 (Bcl-2) Family (TC 1.A.21) The Bcl-2 family consists of the apoptosis regulator, Bcl-X, and its homologues. Bcl-X is a dominant regulator of programmed cell death in mammalian cells. The long form (Bcl-X(L)) displays cell death repressor activity, but the short isoform (Bcl-X(S)) and the b-isoform (Bcl-Xb) promote cell death. Bcl-X(L), Bcl-X(S) and Bcl-Xb are three isoforms derived by alternative RNA splicing. Bcl-X(S) forms heterodimers with Bcl-2. Homologues of Bcl-X include the Bax (rat; 192 aas; spQ63690) and Bak (mouse; 208 aas; spO08734) proteins which also influence apoptosis. Using isolated mitochondria, recombinant Bax and Bak have been shown to induce Dy loss, swelling and cytochrome c release. All of these changes are dependent on Ca2+ and are prevented by cyclosporin A and bongkrekic acid, both of which are known to close permeability transition pores (megachannels). Coimmunoprecipitation studies revealed that Bax and Bak interact with VDAC to form permeability transition pores. Thus, even though they can form channels in artificial membranes at acidic pH, proapoptotic Bcl-2 family proteins (including Bax and Bak) probably induce the mitochondrial permeability transition and cytochrome c release by interacting with permeability transition pores, the most important component for pore formation of which is VDAC. [Regulatory functions, Other]	213
273309	TIGR00867	deg-1	degenerin. The Epithelial Na+ Channel (ENaC) Family (TC 1.A.06) The ENaC family consists of sodium channels from animals and has no recognizable homologues in other eukaryotes or bacteria. The vertebrate ENaC proteins from epithelial cells cluster tightly together on the phylogenetic tree: voltage-insensitive ENaC homologues are also found in the brain. Eleven sequenced C. elegans proteins, including the degenerins, are distantly related to the vertebrate proteins as well as to each other. At least some of these proteins form part of a mechano-transducing complex for touch sensitivity. Other members of the ENaC family, the acid-sensing ion channels, ASIC1-3, are homo- or hetero-oligomeric neuronal H+-gated channels that mediate pain sensation in response to tissue acidosis. The homologous Helix aspersa (FMRF-amide)-activated Na+ channel is the first peptide neurotransmitter-gated ionotropic receptor to be sequenced. Mammalian ENaC is important for the maintenance of Na+ balance and the regulation of blood pressure. Three homologous ENaC subunits, a, b and g, have been shown to assemble to form the highly Na+-selective channel. This model is designed from the invertebrate members of the ENaC family. [Transport and binding proteins, Cations and iron carrying compounds]	600
129946	TIGR00868	hCaCC	calcium-activated chloride channel protein 1. The Epithelial Chloride Channel (E-ClC) Family (TC 1.A.13) Mammals have multiple isoforms of epithelial chloride channel proteins. The first member of this family to be characterized was a respiratory epithelium, Ca2+-regulated, chloride channel protein isolated from bovine tracheal apical membranes. It was biochemically characterized as a 140 kDa complex. The purified complex when reconstituted in a planar lipid bilayer behaved as an anion-selective channel. It was regulated by Ca2+ via a calmodulin kinase II-dependent mechanism. When the cRNA was injected into Xenopus oocytes, an outward rectifying, DIDS-sensitive, anion conductance was measured. A related gene, Lu-ECAM, was cloned from the bovine aortic endothelial cell line, BAEC. It is expressed in the lung and spleen but not in the trachea. Homologues are found in several mammals, and at least three paralogues (hCaCC-1-3) are present in humans, each with different tissue distributions. [Transport and binding proteins, Anions]	863
273310	TIGR00869	sec62	protein translocation protein, Sec62 family. Members of the NSCC2 family have been sequenced from various yeast, fungal and animal species including Saccharomyces cerevisiae, Drosophila melanogaster and Homo sapiens. These proteins are the Sec62 proteins, believed to be associated with the Sec61 and Sec63 constituents of the general protein secretory systems of yeast microsomes. They are also the non-selective cation (NS) channels of the mammalian cytoplasmic membrane. The yeast Sec62 protein has been shown to be essential for cell growth. The mammalian NS channel proteins have been implicated in platelet derived growth factor (PDGF) dependent single channel current in fibroblasts. These channels are essentially closed in serum deprived tissue-culture cells and are specifically opened by exposure to PDGF. These channels are reported to exhibit equal selectivity for Na+, K+ and Cs+ with low permeability to Ca2+, and no permeability to anions. [Transport and binding proteins, Amino acids, peptides and amines]	232
273311	TIGR00870	trp	transient-receptor-potential calcium channel protein. The Transient Receptor Potential Ca2+ Channel (TRP-CC) Family (TC 1.A.4) The TRP-CC family has also been called the store-operated calcium channel (SOC) family. The prototypical members include the Drosophila retinal proteins TRP and TRPL (Montell and Rubin, 1989; Hardie and Minke, 1993). SOC members of the family mediate the entry of extracellular Ca2+ into cells in response to depletion of intracellular Ca2+ stores (Clapham, 1996) and agonist stimulated production of inositol-1,4,5 trisphosphate (IP3). One member of the TRP-CC family, mammalian Htrp3, has been shown to form a tight complex with the IP3 receptor (TC #1.A.3.2.1). This interaction is apparently required for IP3 to stimulate Ca2+ release via Htrp3. The vanilloid receptor subtype 1 (VR1), which is the receptor for capsaicin (the "hot" ingredient in chili peppers) and serves as a heat-activated ion channel in the pain pathway (Caterina et al., 1997), is also a member of this family. The stretch-inhibitable non-selective cation channel (SIC) is identical to the vanilloid receptor throughout all of its first 700 residues, but it exhibits a different sequence in its last 100 residues. VR1 and SIC transport monovalent cations as well as Ca2+. VR1 is about 10x more permeable to Ca2+ than to monovalent ions. Ca2+ overload probably causes cell death after chronic exposure to capsaicin (McCleskey and Gold, 1999). [Transport and binding proteins, Cations and iron carrying compounds]	743
273312	TIGR00871	zwf	glucose-6-phosphate 1-dehydrogenase. This enzyme (EC 1.1.1.49) acts on glucose 6-phosphate and reduces NADP(+). An alternate name appearing in the literature for the human enzyme, based on a slower activity with beta-D-glucose, is glucose 1-dehydrogenase (EC 1.1.1.47), but that name more properly describes a subfamily of the short chain dehydrogenases/reductases family. This is a well-studied enzyme family, with sequences available from well over 50 species. The trusted cutoff is set above the score for the Drosophila melanogaster CG7140 gene product, a homolog of unknown function. G6PD homologs from the bacteria Aquifex aeolicus and Helicobacter pylori lack several motifs well conserved in most other members, were omitted from the seed alignment, and score well below the trusted cutoff. [Energy metabolism, Pentose phosphate pathway]	487
273313	TIGR00872	gnd_rel	6-phosphogluconate dehydrogenase (decarboxylating). This family resembles a larger family (gnd) of bacterial and eukaryotic 6-phosphogluconate dehydrogenases but differs from it by a deep split in a UPGMA similarity clustering tree and the lack of a central region of about 140 residues. Among complete genomes, it is found in Bacillus subtilis and Mycobacterium tuberculosis, both of which also contain gnd, and in Aquifex aeolicus. The protein from Methylobacillus flagellatus KT has been characterized as a decarboxylating 6-phosphogluconate dehydrogenase as part of an unusual formaldehyde oxidation cycle. In some sequenced organisms members of this family are the sole 6-phosphogluconate dehydrogenase present and are probably active in the pentose phosphate cycle. [Energy metabolism, Pentose phosphate pathway]	298
273314	TIGR00873	gnd	6-phosphogluconate dehydrogenase (decarboxylating). This model does not specify whether the cofactor is NADP only (EC 1.1.1.44), NAD only, or both. The model does not assign an EC number for that reason. [Energy metabolism, Pentose phosphate pathway]	467
162081	TIGR00874	talAB	transaldolase. This family includes the majority of known and predicted transaldolase sequences, including E. coli TalA and TalB. It excludes two other families. The first includes E. coli transaldolase-like protein TalC. The second family includes the putative transaldolases of Helicobacter pylori and Mycobacterium tuberculosis. [Energy metabolism, Pentose phosphate pathway]	317
129953	TIGR00875	fsa_talC_mipB	fructose-6-phosphate aldolase, TalC/MipB family. This model represents a family that includes the E. coli transaldolase homologs TalC and MipB, both shown to be fructose-6-phosphate aldolases rather than transaldolases as previously thought. It is related to but distinct from the transaldolase family of E. coli TalA and TalB. The member from Bacillus subtilis becomes phosphorylated during early stationary phase but not during exponential growth. [Energy metabolism, Pentose phosphate pathway]	213
129954	TIGR00876	tal_mycobact	transaldolase, mycobacterial type. This model describes one of three related but easily separable families of known and putative transaldolases. This family and the family typified by E. coli TalA and TalB both contain experimentally verified examples. [Energy metabolism, Pentose phosphate pathway]	350
273315	TIGR00877	purD	phosphoribosylamine--glycine ligase. Alternate name: glycinamide ribonucleotide synthetase (GARS). This enzyme appears as a monofunctional protein in prokaryotes but as part of a larger, multidomain protein in eukaryotes. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis]	422
273316	TIGR00878	purM	phosphoribosylaminoimidazole synthetase. Alternate name: phosphoribosylformylglycinamidine cyclo-ligase; AIRS; AIR synthase. This enzyme is found as a homodimeric monofunctional protein in prokaryotes and as part of a larger, multifunctional protein, sometimes with two copies of this enzyme in tandem, in eukaryotes. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis]	332
273317	TIGR00879	SP	MFS transporter, sugar porter (SP) family. This model represents the sugar porter subfamily of the major facilitator superfamily (pfam00083). [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	481
273318	TIGR00880	2_A_01_02	Multidrug resistance protein. 	141
273319	TIGR00881	2A0104	phosphoglycerate transporter family protein. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	379
211613	TIGR00882	2A0105	oligosaccharide:H+ symporter. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	396
273320	TIGR00883	2A0106	metabolite-proton symporter. This model represents the metabolite:H+ symport subfamily of the major facilitator superfamily (pfam00083), including citrate-H+ symporters, dicarboxylate:H+ symporters, the proline/glycine-betaine transporter ProP, etc. [Transport and binding proteins, Unknown substrate]	394
273321	TIGR00884	guaA_Cterm	GMP synthase (glutamine-hydrolyzing), C-terminal domain or B subunit. This protein of purine de novo biosynthesis is well-conserved. However, it appears to split into two separate polypeptide chains in most of the Archaea. This C-terminal region would be the larger subunit. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis]	311
211614	TIGR00885	fucP	L-fucose:H+ symporter permease. This family describes the L-fucose permease in bacteria. L-fucose (6-deoxy-L-galactose) is a monosaccharide found in glycoproteins and cell wall polysaccharides. L-fucose is used in bacteria through an inducible pathway mediated by at least four enzymes: a permease, isomerase, kinase and an aldolase, which are encoded by fucP, fucI, fucK, fucA respectively. The fuc genes belong to a regulon comprising four linked operons: fucO, fucA, fucPIK and fucR. The positive regulator is encoded by fucR, whose protein responds to fuculose-1-phosphate, which acts as an effector. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	408
273322	TIGR00886	2A0108	nitrite extrusion protein (nitrite facilitator). [Transport and binding proteins, Anions]	354
129965	TIGR00887	2A0109	phosphate:H+ symporter. This model represents the phosphate uptake symporter subfamily of the major facilitator superfamily (pfam00083). [Transport and binding proteins, Anions]	502
129966	TIGR00888	guaA_Nterm	GMP synthase (glutamine-hydrolyzing), N-terminal domain or A subunit. This protein of purine de novo biosynthesis is well-conserved. However, it appears to split into two separate polypeptide chains in most of the Archaea. This N-terminal region would be the smaller subunit. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis]	188
129967	TIGR00889	2A0110	nucleoside transporter. This family of proteins transports nucleosides at a high affinity. The transport mechanism is driven by proton motive force. This family includes nucleoside permease NupG and xanthosine permease from E. coli. [Transport and binding proteins, Nucleosides, purines and pyrimidines]	418
273323	TIGR00890	2A0111	oxalate/formate antiporter family transporter. This subfamily belongs to the major facilitator family. Members include the oxalate/formate antiporter of Oxalobacter formigenes, where one substrate is decarboxylated in the cytosol into the other to consume a proton and drive an ion gradient. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	377
273324	TIGR00891	2A0112	putative sialic acid transporter. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	405
273325	TIGR00892	2A0113	monocarboxylate transporter 1. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	455
273326	TIGR00893	2A0114	D-galactonate transporter. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	399
129972	TIGR00894	2A0114euk	Na(+)-dependent inorganic phosphate cotransporter. [Transport and binding proteins, Anions]	465
273327	TIGR00895	2A0115	benzoate transport. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	398
129974	TIGR00896	CynX	cyanate transporter. This family of proteins is involved in active transport of cyanate. The cyanate transporter in E. coli is used to transport cyanate into the cell so it can be metabolized into ammonia and bicarbonate. This process is used to overcome the toxicity of environmental cyanate. [Transport and binding proteins, Other]	355
162096	TIGR00897	2A0118	polyol permease family. This family of proteins includes the ribitol and D-arabinitol transporters from Klebsiella pneumoniae and the alpha-ketoglutarate permease from Bacillus subtilis. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	402
273328	TIGR00898	2A0119	cation transport protein. [Transport and binding proteins, Cations and iron carrying compounds]	505
129977	TIGR00899	2A0120	sugar efflux transporter. This family of proteins is an efflux system for lactose, glucose, aromatic glucosides and galactosides, cellobiose, maltose, a-methyl glucoside and other sugar compounds. They are found in both Gram-negative and Gram-positive bacteria. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	375
162098	TIGR00900	2A0121	H+ Antiporter protein. [Transport and binding proteins, Cations and iron carrying compounds]	365
273329	TIGR00901	2A0125	AmpG-like permease. [Cellular processes, Adaptations to atypical conditions]	356
129980	TIGR00902	2A0127	phenyl proprionate permease family protein. This family of proteins is involved in the uptake of 3-phenylpropionic acid. This uptake mechanism is for the metabolism of phenylpropanoid compounds and plays an important role in the natural degradative cycle of these aromatic molecules. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	382
129981	TIGR00903	2A0129	major facilitator 4 family protein. This family consists of uncharacterized proteins from archaea. This family includes proteins from Archaeoglobus fulgidus and Aeropyrum pernix. [Transport and binding proteins, Other]	368
129982	TIGR00904	mreB	cell shape determining protein, MreB/Mrl family. MreB (mecillinam resistance) in E. coli (also called envB) and the paralogous pair MreB and Mrl of Bacillus subtilis have all been shown to help determine cell shape. This protein is present in a wide variety of bacteria, including spirochetes, but is missing from the Mycoplasmas and from Gram-positive cocci. Most completed bacterial genomes have a single member of this family. In some species it is an essential gene. A close homolog is found in the Archaeon Methanobacterium thermoautotrophicum, and a more distant homolog in Archaeoglobus fulgidus. The family is related to cell division protein FtsA and heat shock protein DnaK. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan]	333
129983	TIGR00905	2A0302	transporter, basic amino acid/polyamine antiporter (APA) family. This family includes several families of antiporters that, rather commonly, are encoded next to decarboxylases that convert one of the antiporter substrates into the other. This arrangement allows a cycle that can remove protons from the cytoplasm and thereby protect against acidic conditions. [Transport and binding proteins, Amino acids, peptides and amines]	473
273330	TIGR00906	2A0303	cationic amino acid transport permease. [Transport and binding proteins, Amino acids, peptides and amines]	557
273331	TIGR00907	2A0304	amino acid permease (GABA permease). [Transport and binding proteins, Amino acids, peptides and amines]	482
129986	TIGR00908	2A0305	ethanolamine permease. The three genes used as the seed for this model (from Burkholderia pseudomallei, Pseudomonas aeruginosa and Clostridium acetobutylicum) are all adjacent to genes for the catabolism of ethanolamine. Most if not all of the hits to this model have a similar arrangement of genes. This group is a member of the Amino Acid-Polyamine-Organocation (APC) Superfamily. [Transport and binding proteins, Amino acids, peptides and amines]	442
129987	TIGR00909	2A0306	amino acid transporter. [Transport and binding proteins, Amino acids, peptides and amines]	429
129988	TIGR00910	2A0307_GadC	glutamate:gamma-aminobutyrate antiporter. Lowered cutoffs from 1000/500 to 800/300, promoted from subfamily to equivalog, and put into a Genome Property DHH 9/1/2009 [Transport and binding proteins, Amino acids, peptides and amines]	507
273332	TIGR00911	2A0308	L-type amino acid transporter. [Transport and binding proteins, Amino acids, peptides and amines]	501
273333	TIGR00912	2A0309	spore germination protein (amino acid permease). This model describes spore germination protein GerKB and paralogs from Bacillus subtilis, Clostridium tetani, and other known or predicted endospore-forming members of the Firmicutes (low-GC Gram positive bacteria). Members show some similarity to amino acid permeases. [Transport and binding proteins, Amino acids, peptides and amines]	359
273334	TIGR00913	2A0310	amino acid permease (yeast). [Transport and binding proteins, Amino acids, peptides and amines]	478
129992	TIGR00914	2A0601	heavy metal efflux pump, CzcA family. This model represents a family of H+/heavy metal cation antiporters. This family is one of several subfamilies within the scope of pfam00873. [Cellular processes, Detoxification, Transport and binding proteins, Cations and iron carrying compounds]	1051
273335	TIGR00915	2A0602	The (Largely Gram-negative Bacterial) Hydrophobe/Amphiphile Efflux-1 (HAE1) Family. Proteins scoring above the trusted cutoff (1000) form a tight clade within the RND (Resistance-Nodulation-Cell Division) superfamily. Proteins scoring greater than the noise cutoff (100) appear to form a larger clade, cleanly separated from more distant homologs that include cadmium/zinc/cobalt resistance transporters. This family is one of several subfamilies within the scope of pfam00873. [Cellular processes, Toxin production and resistance, Transport and binding proteins, Unknown substrate]	1044
273336	TIGR00916	2A0604s01	protein-export membrane protein, SecD/SecF family. The SecA, SecB, SecD, SecE, SecF, SecG and SecY proteins form the protein translocation apparatus in prokaryotes. This family is specific for the SecD and SecF proteins. [Protein fate, Protein and peptide secretion and trafficking]	192
273337	TIGR00917	2A060601	Niemann-Pick C type protein family. The model describes Niemann-Pick C type protein in eukaryotes. The defective protein has been associated with Niemann-Pick disease which is described in humans as autosomal recessive lipidosis. It is characterized by the lysosomal accumulation of unesterified cholesterol. It is an integral membrane protein, which indicates that this protein is most likely involved in cholesterol transport or acts as some component of cholesterol homeostasis. [Transport and binding proteins, Other]	1205
273338	TIGR00918	2A060602	The Eukaryotic (Putative) Sterol Transporter (EST) Family. 	1145
273339	TIGR00920	2A060605	3-hydroxy-3-methylglutaryl-coenzyme A reductase. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	886
273340	TIGR00921	2A067	The (Largely Archaeal Putative) Hydrophobe/Amphiphile Efflux-3 (HAE3) Family. Characterized members of the RND superfamily all probably catalyze substrate efflux via an H+ antiport mechanism. These proteins are found ubiquitously in bacteria, archaea and eukaryotes. They fall into seven phylogenetic families; this family (2.A.6.7) consists of uncharacterised putative transporters, largely in the Archaea. [Transport and binding proteins, Unknown substrate]	719
273341	TIGR00922	nusG	transcription termination/antitermination factor NusG. NusG proteins are transcription factors which are apparently universal in prokaryotes (archaea and eukaryotes have homologs that may have related functions). The essential components of these factors include an N-terminal RNP-like (ribonucleoprotein) domain and a C-terminal KOW motif (pfam00467) believed to be a nucleic acid binding domain. In E. coli, NusA has been shown to interact with RNA polymerase and termination factor Rho. This model covers a wide variety of bacterial species but excludes mycoplasmas which are covered by a separate model (TIGR01956). The function of all of these NusG proteins is likely to be the same at the level of interaction with RNA and other protein factors to affect termination; however different species may utilize NusG towards different processes and in combination with different suites of affector proteins. In E. coli, NusG promotes rho-dependent termination. It is an essential gene. In Streptomyces virginiae and related species, an additional N-terminal sequence is also present and is suggested to play a role in butyrolactone-mediated autoregulation. In Thermotoga maritima, NusG has a long insert, fails to substitute for E. coli NusG (with or without the long insert), is a large 0.7 % of total cellular protein, and has a general, sequence non-specific DNA and RNA binding activity that blocks ethidium staining, yet permits transcription. Archaeal proteins once termed NusG share the KOW domain but are actually a ribosomal protein corresponding to L24p in bacterial and L26e in eukaryotes (TIGR00405). [Transcription, Transcription factors]	172
273342	TIGR00924	yjdL_sub1_fam	amino acid/peptide transporter (Peptide:H+ symporter), bacterial. The model describes proton-dependent oligopeptide transporters in bacteria. This model is restricted in its range in recognizing bacterial proton-dependent oligopeptide transporters, although they are found in yeast, plants and animals. They function by proton symport in a 1:1 stoichiometry, which is variable in different species. All of them are predicted to contain 12 transmembrane domains, for which limited experimental evidence exists. [Transport and binding proteins, Amino acids, peptides and amines]	475
273343	TIGR00926	2A1704	Peptide:H+ symporter (also transports b-lactam antibiotics, the antitumor agent, bestatin, and various protease inhibitors). [Transport and binding proteins, Amino acids, peptides and amines]	654
273344	TIGR00927	2A1904	K+-dependent Na+/Ca2+ exchanger. [Transport and binding proteins, Cations and iron carrying compounds]	1096
273345	TIGR00928	purB	adenylosuccinate lyase. This family consists of adenylosuccinate lyase, the enzyme that catalyzes step 8 in the purine biosynthesis pathway for de novo synthesis of IMP and also the final reaction in the two-step sequence from IMP to AMP. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis]	435
273346	TIGR00929	VirB4_CagE	type IV secretion/conjugal transfer ATPase, VirB4 family. Type IV secretion systems are found in Gram-negative pathogens. They export proteins, DNA, or complexes in different systems and are related to plasmid conjugation systems. This model represents related ATPases that include VirB4 in Agrobacterium tumefaciens (DNA export), CagE in Helicobacter pylori (protein export), and plasmid TraB (conjugation).	785
273347	TIGR00930	2a30	K-Cl cotransporter. [Transport and binding proteins, Other]	953
188097	TIGR00931	antiport_nhaC	Na+/H+ antiporter NhaC. A single member of the NhaC family, a protein from Bacillus firmus, has been functionally characterized. It is involved in pH homeostasis and sodium extrusion. Members of the NhaC family are found in both Gram-negative bacteria and Gram-positive bacteria. Intriguingly, archaeal homolog ArcD (just outside boundaries of family) has been identified as an arginine/ornithine antiporter. [Transport and binding proteins, Cations and iron carrying compounds]	454
273348	TIGR00932	2a37	transporter, monovalent cation:proton antiporter-2 (CPA2) family. [Transport and binding proteins, Cations and iron carrying compounds]	273
273349	TIGR00933	2a38	potassium uptake protein, TrkH family. The proteins of the Trk family are derived from Gram-negative and Gram-positive bacteria, yeast and wheat. The proteins of E. coli K12 TrkH and TrkG as well as several yeast proteins have been functionally characterized. The E. coli TrkH and TrkG proteins are complexed to two peripheral membrane proteins, TrkA, an NAD-binding protein, and TrkE, an ATP-binding protein. This complex forms the potassium uptake system. [Transport and binding proteins, Cations and iron carrying compounds]	391
130009	TIGR00934	2a38euk	potassium uptake protein, Trk family. The proteins of the Trk family are derived from Gram-negative and Gram-positive bacteria, yeast and wheat. The proteins of E. coli K12 TrkH and TrkG as well as several yeast proteins have been functionally characterized. The E. coli TrkH and TrkG proteins are complexed to two peripheral membrane proteins, TrkA, an NAD-binding protein, and TrkE, an ATP-binding protein. This complex forms the potassium uptake system. This family is specific for the eukaryotic Trk system. [Transport and binding proteins, Cations and iron carrying compounds]	800
213571	TIGR00935	2a45	arsenite/antimonite efflux pump membrane protein. Members of this protein family are ArsB, a highly hydrophobic integral membrane protein involved in transport processes used to protect cells from arsenite (or antimonite). Members of the seed alignment were selected by adjacency to the ATPase subunit ArsA that energizes the transport. [Cellular processes, Detoxification, Transport and binding proteins, Other]	426
213572	TIGR00936	ahcY	adenosylhomocysteinase. This enzyme hydrolyzes adenosylhomocysteine as part of a cycle for the regeneration of the methyl donor S-adenosylmethionine. Species that lack this enzyme are likely to have adenosylhomocysteine nucleosidase (EC 3.2.2.9), an enzyme which also acts as 5'-methyladenosine nucleosidase (see TIGR01704). [Energy metabolism, Amino acids and amines]	407
273350	TIGR00937	2A51	chromate transporter, chromate ion transporter (CHR) family. Members of this family probably act as chromate transporters, and are found in Pseudomonas aeruginosa, Alcaligenes eutrophus, Vibrio cholerae, Bacillus subtilis, Ochrobactrum tritici, cyanobacteria and archaea. The protein reduces chromate accumulation and is essential for chromate resistance. [Transport and binding proteins, Anions]	368
273351	TIGR00938	thrB_alt	homoserine kinase, Neisseria type. Homoserine kinase is required in the biosynthesis of threonine from aspartate. The member of this family from Pseudomonas aeruginosa was shown by direct assay and complementation to act specifically as a homoserine kinase. [Amino acid biosynthesis, Aspartate family]	307
273352	TIGR00939	2a57	Equilibrative Nucleoside Transporter (ENT). [Transport and binding proteins, Nucleosides, purines and pyrimidines]	437
273353	TIGR00940	2a6301s01	monovalent cation:proton antiporter. This family of proteins consists of bacterial multicomponent K+:H+ and Na+:H+ antiporters. The best characterized systems are the PhaABCDEFG system of Rhizobium meliloti which functions in pH adaptation and as a K+ efflux system and the MnhABCDEFG system of Staphylococcus aureus which functions as a Na+:H+ antiporter. [Transport and binding proteins, Cations and iron carrying compounds]	793
273354	TIGR00941	2a6301s03	Multicomponent Na+:H+ antiporter, MnhC subunit. [Transport and binding proteins, Cations and iron carrying compounds]	104
130017	TIGR00942	2a6301s05	Monovalent Cation (K+ or Na+):Proton Antiporter-3 (CPA3) subfamily. [Transport and binding proteins, Cations and iron carrying compounds]	144
130018	TIGR00943	2a6301s02	monovalent cation:proton antiporter. This family of proteins consists of bacterial multicomponent K+:H+ and Na+:H+ antiporters. The best characterized systems are the PhaABCDEFG system of Rhizobium meliloti which functions in pH adaptation and as a K+ efflux system and the MnhABCDEFG system of Staphylococcus aureus which functions as a Na+:H+ antiporter. This family is specific for the phaB and mnhB proteins. [Transport and binding proteins, Cations and iron carrying compounds]	107
130019	TIGR00944	2a6301s04	Multicomponent K+:H+ antiporter. [Transport and binding proteins, Cations and iron carrying compounds]	463
273355	TIGR00945	tatC	Twin arginine targeting (Tat) protein translocase TatC. This model represents the TatC translocase component of the Sec-independent protein translocation system. This system is responsible for translocation of folded proteins, often with bound cofactors across the periplasmic membrane. A related model (TIGR01912) represents the archaeal clade of this family. TatC is often found in a gene cluster with the two other components of the system, TatA/E (TIGR01411) and TatB (TIGR01410). A model also exists for the Twin-arginine signal sequence (TIGR01409). [Protein fate, Protein and peptide secretion and trafficking]	215
273356	TIGR00946	2a69	The Auxin Efflux Carrier (AEC) Family. [Transport and binding proteins, Other]	321
273357	TIGR00947	2A73	putative bicarbonate transporter, IctB family. This family of proteins is suggested to transport inorganic carbon (HCO3-), based on the phenotype of a mutant of IctB in Synechococcus sp. strain PCC 7942. Bicarbonate uptake is used by many photosynthetic organisms including cyanobacteria. These organisms are able to concentrate CO2/HCO3- against a greater than ten-fold concentration gradient. Cyanobacteria may have several such carriers operating with different efficiencies. Note that homology to various O-antigen ligases, with possible implications for mutant cell envelope structure, might allow alternatives to the interpretation of IctB as a bicarbonate transport protein. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	425
130023	TIGR00948	2a75	L-lysine exporter. [Transport and binding proteins, Amino acids, peptides and amines]	177
273358	TIGR00949	2A76	The Resistance to Homoserine/Threonine (RhtB) Family protein. [Transport and binding proteins, Amino acids, peptides and amines]	185
273359	TIGR00950	2A78	Carboxylate/Amino Acid/Amine Transporter. [Transport and binding proteins, Amino acids, peptides and amines]	260
130026	TIGR00951	2A43	Lysosomal Cystine Transporter. [Transport and binding proteins, Amino acids, peptides and amines]	220
130027	TIGR00952	S15_bact	ribosomal protein S15, bacterial/organelle. This model is built to recognize specifically bacterial, chloroplast, and mitochondrial ribosomal protein S15. The homologous proteins of Archaea and Eukarya are designated S13. [Protein synthesis, Ribosomal proteins: synthesis and modification]	86
273360	TIGR00954	3a01203	Peroxisomal Fatty Acyl CoA Transporter (FAT) Family protein. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	659
273361	TIGR00955	3a01204	The Eye Pigment Precursor Transporter (EPP) Family protein. [Transport and binding proteins, Other]	617
273362	TIGR00956	3a01205	Pleiotropic Drug Resistance (PDR) Family protein. [Transport and binding proteins, Other]	1394
188098	TIGR00957	MRP_assoc_pro	multi drug resistance-associated protein (MRP). This model describes multi drug resistance-associated protein (MRP) in eukaryotes. The multidrug resistance-associated protein is an integral membrane protein that causes multidrug resistance when overexpressed in mammalian cells. It belongs to the ABC transporter superfamily. The protein topology and function were experimentally demonstrated by epitope tagging and immunofluorescence. Insertion of tags in the critical regions associated with drug efflux abrogated its function. The C-terminal domain seems to be highly conserved. [Transport and binding proteins, Other]	1522
273363	TIGR00958	3a01208	Conjugate Transporter-2 (CT2) Family protein. [Transport and binding proteins, Other]	711
273364	TIGR00959	ffh	signal recognition particle protein. This model represents Ffh (Fifty-Four Homolog), the protein component that forms the bacterial (and organellar) signal recognition particle together with a 4.5S RNA. Ffh is a GTPase homologous to eukaryotic SRP54 and also to the GTPase FtsY (TIGR00064) that is the receptor for the signal recognition particle. [Protein fate, Protein and peptide secretion and trafficking]	428
273365	TIGR00962	atpA	proton translocating ATP synthase, F1 alpha subunit. The sequences of ATP synthase F1 alpha and beta subunits are related and both contain a nucleotide-binding site for ATP and ADP. They have a common amino terminal domain but vary at the C-terminus. The beta chain has catalytic activity, while the alpha chain is a regulatory subunit. The alpha-subunit contains a highly conserved adenine-specific noncatalytic nucleotide-binding domain. The conserved amino acid sequence is Gly-X-X-X-X-Gly-Lys. Proton translocating ATP synthase F1, alpha subunit is homologous to proton translocating ATP synthase archaeal/vacuolar(V1), B subunit. [Energy metabolism, ATP-proton motive force interconversion]	501
273366	TIGR00963	secA	preprotein translocase, SecA subunit. The proteins SecA-F and SecY, not all of which are necessary, comprise the standard prokaryotic protein translocation apparatus. Other, specialized translocation systems also exist but are not as broadly distributed. This model describes SecA, an essential member of the apparatus. This model excludes SecA2 of the accessory secretory system. [Protein fate, Protein and peptide secretion and trafficking]	742
273367	TIGR00964	secE_bact	preprotein translocase, SecE subunit, bacterial. This model represents exclusively the bacterial (and some organellar) SecE protein. SecE is part of the core heterotrimer, SecYEG, of the Sec preprotein translocase system. Other components are the ATPase SecA, a cytosolic chaperone SecB, and an accessory complex of SecDF and YajC. [Protein fate, Protein and peptide secretion and trafficking]	55
130038	TIGR00965	dapD	2,3,4,5-tetrahydropyridine-2,6-dicarboxylate N-succinyltransferase. This enzyme is part of the diaminopimelate pathway of lysine biosynthesis. Alternate name: tetrahydrodipicolinate N-succinyltransferase. The closely related TabB protein of Pseudomonas syringae (pv. tabaci), SP|P31852|TABB_PSESZ, appears to act in the biosynthesis of tabtoxin rather than lysine. The trusted cutoff is set high enough to exclude this gene. Sequences below trusted also include a version of this enzyme which apparently utilizes acetate rather than succinate (EC: 2.3.1.89). [Amino acid biosynthesis, Aspartate family]	269
273368	TIGR00966	3a0501s07	protein-export membrane protein SecF. This bacterial protein is always found with the homologous protein-export membrane protein SecD. In numerous lineages, this protein occurs as a SecDF fusion protein. [Protein fate, Protein and peptide secretion and trafficking]	246
273369	TIGR00967	3a0501s007	preprotein translocase, SecY subunit. Members of this protein family are the SecY component of the SecYEG translocon, or protein translocation pore, which is driven by the ATPase SecA. This model does not discriminate bacterial from archaeal forms. [Protein fate, Protein and peptide secretion and trafficking]	410
130041	TIGR00968	3a0106s01	sulfate ABC transporter, ATP-binding protein. [Transport and binding proteins, Anions]	237
273370	TIGR00969	3a0106s02	sulfate ABC transporter, permease protein. This model describes a subfamily of both CysT and CysW, paralogous and generally tandemly encoded permease proteins of the sulfate ABC transporter. [Transport and binding proteins, Anions]	271
273371	TIGR00970	leuA_yeast	2-isopropylmalate synthase, yeast type. A larger family of homologous proteins includes homocitrate synthase, distinct lineages of 2-isopropylmalate synthase, several distinct, uncharacterized, orthologous sets in the Archaea, and other related enzymes. This model describes a family of 2-isopropylmalate synthases as found in yeasts and in a minority of studied bacteria. [Amino acid biosynthesis, Pyruvate family]	564
130044	TIGR00971	3a0106s03	sulfate/thiosulfate-binding protein. This model describes binding proteins functionally associated with the sulfate ABC transporter. In the model bacterium E. coli, two different members work with the same transporter; mutation analysis says each enables the uptake of both sulfate and thiosulfate. In many species, a single binding protein is found, and may be referred to in general terms as a sulfate ABC transporter sulfate-binding protein. [Transport and binding proteins, Anions]	315
273372	TIGR00972	3a0107s01c2	phosphate ABC transporter, ATP-binding protein. This model represents the ATP-binding protein of a family of ABC transporters for inorganic phosphate. In the model species Escherichia coli, a constitutive transporter for inorganic phosphate, with low affinity, is also present. The high affinity transporter that includes this polypeptide is induced when extracellular phosphate concentrations are low. The proteins most similar to the members of this family but not included appear to be amino acid transporters. [Transport and binding proteins, Anions]	247
130046	TIGR00973	leuA_bact	2-isopropylmalate synthase, bacterial type. This is the first enzyme of leucine biosynthesis. A larger family of homologous proteins includes homocitrate synthase, distinct lineages of 2-isopropylmalate synthase, several distinct, uncharacterized, orthologous sets in the Archaea, and other related enzymes. This model describes a family of 2-isopropylmalate synthases found primarily in Bacteria. The homologous families in the Archaea may represent isozymes and/or related enzymes. [Amino acid biosynthesis, Pyruvate family]	494
273373	TIGR00974	3a0107s02c	phosphate ABC transporter, permease protein PstA. This model describes PstA, one of a pair of permease proteins in the ABC (high affinity) phosphate transporter. In a number of species, this permease is fused with the PstC protein (TIGR02138). In the model bacterium Escherichia coli, this transport system is induced when the concentration of extracellular inorganic phosphate is low. A constitutive, lower affinity transporter operates otherwise. [Transport and binding proteins, Anions]	271
273374	TIGR00975	3a0107s03	phosphate ABC transporter, phosphate-binding protein. This family represents one type of (periplasmic, in Gram-negative bacteria) phosphate-binding protein found in phosphate ABC (ATP-binding cassette) transporters. This protein is accompanied, generally in the same operon, by an ATP binding protein and (usually) two permease proteins. [Transport and binding proteins, Anions]	313
273375	TIGR00976	/NonD	putative hydrolase, CocE/NonD family. This model represents a protein subfamily that includes the cocaine esterase CocE, several glutaryl-7-ACA acylases, and the putative diester hydrolase NonD of Streptomyces griseus (all hydrolases). This family shows extensive, low-level similarity to a family of xaa-pro dipeptidyl-peptidases, and local similarity by PSI-BLAST to many other hydrolases. [Unknown function, Enzymes of unknown specificity]	550
130050	TIGR00977	citramal_synth	citramalate synthase. This model includes GSU1798 and is now known to represent citramalate synthase. Members are related to 2-isopropylmalate synthases and homocitrate synthases but phylogenetically distinct. The role is isoleucine biosynthesis, the first dedicated step. [Unknown function, General]	526
273376	TIGR00978	asd_EA	aspartate-semialdehyde dehydrogenase (non-peptidoglycan organisms). Two closely related families of aspartate-semialdehyde dehydrogenase are found. They differ by a deep split in phylogenetic and percent identity trees and in gap patterns. Separate models are built for the two types in order to exclude the USG-1 protein, found in several species, which is specifically related to the Bacillus subtilis type of aspartate-semialdehyde dehydrogenase. Members of this type are found primarily in organisms that lack peptidoglycan. [Amino acid biosynthesis, Aspartate family]	341
130052	TIGR00979	fumC_II	fumarate hydratase, class II. Putative fumarases from several species (Mycobacterium tuberculosis, Streptomyces coelicolor, Pseudomonas aeruginosa) branch deeply, although within the same branch of a phylogenetic tree rooted by aspartate ammonia-lyase sequences, and score between the trusted and noise cutoffs. [Energy metabolism, TCA cycle]	458
130053	TIGR00980	3a0801so1tim17	mitochondrial import inner membrane translocase subunit tim17. [Transport and binding proteins, Amino acids, peptides and amines]	170
130054	TIGR00981	rpsL_bact	ribosomal protein S12, bacterial/organelle. This model recognizes ribosomal protein S12 of Bacteria, mitochondria, and chloroplasts. The homologous ribosomal proteins of Archaea and Eukarya, termed S23 in Eukarya and S12 or S23 in Archaea, score below the trusted cutoff. [Protein synthesis, Ribosomal proteins: synthesis and modification]	124
273377	TIGR00982	uS12_E_A	ribosomal protein uS12, eukaryotic/archaeal form. This model represents eukaryotic and archaeal forms of ribosomal protein uS12. This protein was known previously as S23 in eukaryotes and as either S12 or S23 in the Archaea. [Protein synthesis, Ribosomal proteins: synthesis and modification]	139
130056	TIGR00983	3a0801s02tim23	mitochondrial import inner membrane translocase subunit tim23. [Transport and binding proteins, Amino acids, peptides and amines]	149
130057	TIGR00984	3a0801s03tim44	mitochondrial import inner membrane, translocase subunit. The mitochondrial protein translocase (MPT) family, which brings nuclearly encoded preproteins into mitochondria, is very complex with 19 currently identified protein constituents. These proteins include several chaperone proteins, four proteins of the outer membrane translocase (Tom) import receptor, five proteins of the Tom channel complex, five proteins of the inner membrane translocase (Tim) and three "motor" proteins. This family is specific for the Tim proteins. [Transport and binding proteins, Amino acids, peptides and amines]	378
273378	TIGR00985	3a0801s04tom	mitochondrial import receptor subunit translocase of outer membrane 20 kDa subunit. [Transport and binding proteins, Amino acids, peptides and amines]	148
273379	TIGR00986	3a0801s05tom22	mitochondrial import receptor subunit Tom22. The mitochondrial protein translocase (MPT) family, which brings nuclearly encoded preproteins into mitochondria, is very complex with 19 currently identified protein constituents. These proteins include several chaperone proteins, four proteins of the outer membrane translocase (Tom) import receptor, five proteins of the Tom channel complex, five proteins of the inner membrane translocase (Tim) and three "motor" proteins. This family is specific for the Tom22 proteins. [Transport and binding proteins, Amino acids, peptides and amines]	145
130060	TIGR00987	himA	integration host factor, alpha subunit. This protein forms a site-specific DNA-binding heterodimer with the integration host factor beta subunit. It is closely related to the DNA-binding protein HU. [DNA metabolism, DNA replication, recombination, and repair]	96
130061	TIGR00988	hip	integration host factor, beta subunit. This protein forms a site-specific DNA-binding heterodimer with the homologous integration host factor alpha subunit. It is closely related to the DNA-binding protein HU. [DNA metabolism, DNA replication, recombination, and repair]	94
130062	TIGR00989	3a0801s07tom40	mitochondrial import receptor subunit Tom40. The mitochondrial protein translocase (MPT) family, which brings nuclearly encoded preproteins into mitochondria, is very complex with 19 currently identified protein constituents. These proteins include several chaperone proteins, four proteins of the outer membrane translocase (Tom) import receptor, five proteins of the Tom channel complex, five proteins of the inner membrane translocase (Tim) and three "motor" proteins. This family is specific for the Tom40 proteins. [Transport and binding proteins, Amino acids, peptides and amines]	161
273380	TIGR00990	3a0801s09	mitochondrial precursor proteins import receptor (72 kDa mitochondrial outer membrane protein) (mitochondrial import receptor for the ADP/ATP carrier) (translocase of outer membrane tom70). [Transport and binding proteins, Amino acids, peptides and amines]	615
130064	TIGR00991	3a0901s02IAP34	GTP-binding protein (Chloroplast Envelope Protein Translocase). [Transport and binding proteins, Nucleosides, purines and pyrimidines]	313
130065	TIGR00992	3a0901s03IAP75	chloroplast envelope protein translocase, IAP75 family. Two families of proteins are involved in the chloroplast envelope import apparatus. They are the three proteins of the outer membrane (TOC) and four proteins in the inner membrane (TIC). This family is specific for the TOC IAP75 protein. [Transport and binding proteins, Amino acids, peptides and amines]	718
273381	TIGR00993	3a0901s04IAP86	chloroplast protein import component Toc86/159, G and M domains. The long precursor of the 86K protein originally described is proposed to have three domains. The N-terminal A-domain is acidic, repetitive, weakly conserved, readily removed by proteolysis during chloroplast isolation, and not required for protein translocation. The other domains are designated G (GTPase) and M (membrane anchor); this family includes most of the G domain and all of M. [Transport and binding proteins, Amino acids, peptides and amines]	763
273382	TIGR00994	3a0901s05TIC20	chloroplast protein import component, Tic20 family. Two families of proteins are involved in the chloroplast envelope import apparatus. They are the three proteins of the outer membrane (TOC) and four proteins in the inner membrane (TIC). This family is specific for the Tic20 protein. [Transport and binding proteins, Amino acids, peptides and amines]	267
273383	TIGR00995	3a0901s06TIC22	chloroplast protein import component, Tic22 family. Two families of proteins are involved in the chloroplast envelope import apparatus. They are the three proteins of the outer membrane (TOC) and four proteins in the inner membrane (TIC). This family is specific for the Tic22 protein. [Transport and binding proteins, Amino acids, peptides and amines]	270
273384	TIGR00996	Mtu_fam_mce	virulence factor Mce family protein. Members of this paralogous family are found as six tandem homologous proteins in the same orientation per cassette, in four separate cassettes in Mycobacterium tuberculosis. The six members of each cassette represent six subfamilies. One subfamily includes the protein mce (mycobacterial cell entry), a virulence protein required for invasion of non-phagocytic cells. [Cellular processes, Pathogenesis]	291
130070	TIGR00997	ispZ	intracellular septation protein A. This partially characterized protein, whose absence can cause a cell division defect in an intracellularly replicating bacterium, is so far found only in the Proteobacteria. [Cellular processes, Cell division]	178
273385	TIGR00998	8a0101	efflux pump membrane protein (multidrug resistance protein A). [Transport and binding proteins, Other]	334
273386	TIGR00999	8a0102	Membrane Fusion Protein cluster 2 (function with RND porters). [Transport and binding proteins, Other]	265
273387	TIGR01000	bacteriocin_acc	bacteriocin secretion accessory protein. This family represents an accessory protein that works with the bacteriocin maturation and ABC transport secretion protein described by TIGR01193. [Transport and binding proteins, Other]	457
130074	TIGR01001	metA	homoserine O-succinyltransferase. The apparent equivalog from Bacillus subtilis is broken into two tandem reading frames. [Amino acid biosynthesis, Aspartate family]	300
273388	TIGR01002	hlyII	beta-channel forming cytolysin. This family of cytolytic pore-forming proteins includes alpha toxin and leukocidin F and S subunits from Staphylococcus aureus, hemolysin II of Bacillus cereus, and related toxins. [Cellular processes, Toxin production and resistance]	312
273389	TIGR01003	PTS_HPr_family	Phosphotransferase System HPr (HPr) Family. The HPr family are bacterial proteins (or domains of proteins) which function in phosphotransferase systems (PTS). They include energy-coupling components which catalyze sugar uptake via a group translocation mechanism. The functions of most of these proteins are not known, but they presumably function in PTS-related regulatory capacities. All seed members are stand-alone HPr proteins, although the model also recognizes HPr domains of PTS fusion proteins. This family includes the related NPr protein. [Signal transduction, PTS]	82
273390	TIGR01004	PulS_OutS	lipoprotein, PulS/OutS family. This family comprises lipoproteins from four gamma proteobacterial species: PulS protein of Klebsiella pneumoniae, the OutS protein of Erwinia chrysanthemi and Pectobacterium chrysanthemi, and the functionally uncharacterized E. coli protein EtpO. PulS and OutS have been shown to interact with and facilitate insertion of secretins into the outer membrane, suggesting a chaperone-like, or piloting function for members of this family. [Transport and binding proteins, Amino acids, peptides and amines]	128
273391	TIGR01005	eps_transp_fam	exopolysaccharide transport protein family. The model describes the exopolysaccharide transport protein family in bacteria. The transport protein is part of a large genetic locus which is associated with exopolysaccharide (EPS) biosynthesis. Detailed molecular characterization and gene fusion analysis revealed that at least seven gene products are involved in the overall regulation, which includes, among other things, exopolysaccharide biosynthesis, the conferring of virulence, and exopolysaccharide export. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	764
130079	TIGR01006	polys_exp_MPA1	polysaccharide export protein, MPA1 family, Gram-positive type. This family contains members from Low GC Gram-positive bacteria; they are proposed to have a function in the export of complex polysaccharides. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	226
273392	TIGR01007	eps_fam	capsular exopolysaccharide family. This model describes the capsular exopolysaccharide proteins in bacteria. The exopolysaccharide gene cluster consists of several genes which encode a number of proteins which regulate exopolysaccharide (EPS) biosynthesis. At least 13 genes, espA to espM, in Streptococcus species seem to direct the EPS proteins, all of which share high homology. Functional roles were characterized by gene disruption experiments which resulted in exopolysaccharide-deficient phenotypes. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	204
273393	TIGR01008	uS3_euk_arch	ribosomal protein uS3, eukaryotic/archaeal type. This model describes ribosomal protein S3 of the eukaryotic cytosol and of the archaea. TIGRFAMs model TIGR01009 describes the bacterial/organellar type, although the organellar types have a different architecture with long insertions and may score poorly. [Protein synthesis, Ribosomal proteins: synthesis and modification]	195
130082	TIGR01009	rpsC_bact	ribosomal protein S3, bacterial type. This model describes the bacterial type of ribosomal protein S3. Chloroplast and mitochondrial forms have large, variable inserts between conserved N-terminal and C-terminal domains. This model recognizes all bacterial forms and many chloroplast forms above the trusted cutoff score. TIGRFAMs model TIGR01008 describes S3 of the eukaryotic cytosol and of the archaea. [Protein synthesis, Ribosomal proteins: synthesis and modification]	211
130083	TIGR01010	BexC_CtrB_KpsE	polysaccharide export inner-membrane protein, BexC/CtrB/KpsE family. This family contains gamma proteobacterial proteins involved in capsule polysaccharide export. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	362
130084	TIGR01011	rpsB_bact	ribosomal protein S2, bacterial type. This model describes the bacterial, mitochondrial, and chloroplast forms of ribosomal protein S2. TIGR01012 describes the archaeal and cytosolic forms. [Protein synthesis, Ribosomal proteins: synthesis and modification]	225
273394	TIGR01012	uS2_euk_arch	ribosomal protein uS2, eukaryotic/archaeal form. This model describes the ribosomal protein of the cytosol and of Archaea, homologous to S2 of bacteria. It is designated typically as Sa in eukaryotes and Sa or S2 in the archaea. TIGR01011 describes the related protein of organelles and bacteria. [Protein synthesis, Ribosomal proteins: synthesis and modification]	196
162157	TIGR01013	2a58	Phosphate:Na+ Symporter (PNaS) Family. [Transport and binding proteins, Cations and iron carrying compounds]	456
273395	TIGR01015	hmgA	homogentisate 1,2-dioxygenase. Missing in human disease alkaptonuria. [Energy metabolism, Amino acids and amines]	429
273396	TIGR01016	sucCoAbeta	succinyl-CoA synthetase, beta subunit. This model is designated subfamily because it does not discriminate the ADP-forming enzyme (EC 6.2.1.5) from the GDP-forming (EC 6.2.1.4) enzyme. The N-terminal half is described by the CoA-ligases model (pfam00549). The C-terminal half is described by the ATP-grasp model (pfam02222). This family contains a split seen both in a maximum parsimony tree (which ignores gaps) and in the gap pattern near position 85 of the seed alignment. Eukaryotic and most bacterial sequences are longer and contain a region similar to TXQTXXXG. Sequences from Deinococcus radiodurans, Mycobacterium tuberculosis, Streptomyces coelicolor, and the Archaea are 6 amino acids shorter in that region and contain a motif resembling [KR]G. [Energy metabolism, TCA cycle]	386
200066	TIGR01017	rpsD_bact	ribosomal protein S4, bacterial/organelle type. This model finds organelle (chloroplast and mitochondrial) ribosomal protein S4 as well as bacterial ribosomal protein S4. [Protein synthesis, Ribosomal proteins: synthesis and modification]	200
273397	TIGR01018	uS4_arch	ribosomal protein uS4, eukaryotic/archaeal type. This model finds eukaryotic ribosomal protein S9 as well as archaeal ribosomal protein S4. [Protein synthesis, Ribosomal proteins: synthesis and modification]	162
130091	TIGR01019	sucCoAalpha	succinyl-CoA synthetase, alpha subunit. This model describes succinyl-CoA synthetase alpha subunits but does not discriminate between GTP-specific and ATP-specific reactions. The model is designated as subfamily rather than equivalog for that reason. ATP citrate lyases appear to form an outgroup. [Energy metabolism, TCA cycle]	286
273398	TIGR01020	uS5_euk_arch	ribosomal protein uS5, eukaryotic/archaeal form. This model finds eukaryotic ribosomal protein uS5 (previously S2 in yeast and human) as well as archaeal ribosomal protein uS5. [Protein synthesis, Ribosomal proteins: synthesis and modification]	212
130093	TIGR01021	rpsE_bact	ribosomal protein S5, bacterial/organelle type. This model finds chloroplast ribosomal protein S5 as well as bacterial ribosomal protein S5. A candidate mitochondrial form (Saccharomyces cerevisiae YBR251W and its homolog) differs substantially and is not included in this model. [Protein synthesis, Ribosomal proteins: synthesis and modification]	154
130094	TIGR01022	rpmJ_bact	ribosomal protein L36, bacterial type. Proteins found by this model occur exclusively in bacteria and organelles. [Protein synthesis, Ribosomal proteins: synthesis and modification]	37
273399	TIGR01023	rpmG_bact	ribosomal protein L33, bacterial type. This model describes bacterial ribosomal protein L33 and its chloroplast and mitochondrial equivalents. [Protein synthesis, Ribosomal proteins: synthesis and modification]	54
130096	TIGR01024	rplS_bact	ribosomal protein L19, bacterial type. This model describes bacterial ribosomal protein L19 and its chloroplast equivalent. Putative mitochondrial L19 are found in several species (but not Saccharomyces cerevisiae) and score between trusted and noise cutoffs. [Protein synthesis, Ribosomal proteins: synthesis and modification]	113
273400	TIGR01025	uS19_arch	ribosomal protein uS19, eukaryotic/archaeal form. This model represents eukaryotic ribosomal protein uS19 (previously S15) and its archaeal equivalent. It excludes bacterial and organellar ribosomal protein S19. The nomenclature for the archaeal members is unresolved and given variously as S19 (after the more distant bacterial homologs) or S15. [Protein synthesis, Ribosomal proteins: synthesis and modification]	135
273401	TIGR01026	fliI_yscN	ATPase, FliI/YscN family. This family of ATPases demonstrates extensive homology with ATP synthase F1, beta subunit. It is a mixture of members with two different protein functions. The first group is exemplified by Salmonella typhimurium FliI protein. It is needed for flagellar assembly, its ATPase activity is required for flagellation, and it may be involved in a specialized protein export pathway that proceeds without signal peptide cleavage. The second group of proteins functions in the export of virulence proteins; it is exemplified by the Yersinia sp. YscN protein, an ATPase involved in the type III secretory pathway for the antihost Yops proteins. [Energy metabolism, ATP-proton motive force interconversion]	440
162163	TIGR01027	proB	glutamate 5-kinase. Bacterial ProB proteins hit the full length of this model, but the ProB-like domain of delta 1-pyrroline-5-carboxylate synthetase does not hit the C-terminal 100 residues of this model. The noise cutoff is set low enough to hit delta 1-pyrroline-5-carboxylate synthetase and other partial matches to this family. [Amino acid biosynthesis, Glutamate family]	363
273402	TIGR01028	uS7_euk_arch	ribosomal protein uS7, eukaryotic/archaeal. This model describes the members from the eukaryotic cytosol and the Archaea of the family that includes ribosomal protein uS7 (previously S5 in yeast and human). A separate model describes bacterial and organellar S7. [Protein synthesis, Ribosomal proteins: synthesis and modification]	186
273403	TIGR01029	rpsG_bact	ribosomal protein S7, bacterial/organelle. This model describes the bacterial and organellar branch of the ribosomal protein S7 family (includes prokaryotic S7 and eukaryotic S5). The eukaryotic and archaeal branch is described by model TIGR01028. [Protein synthesis, Ribosomal proteins: synthesis and modification]	154
130102	TIGR01030	rpmH_bact	ribosomal protein L34, bacterial type. This model describes the bacterial protein L34 and its equivalents in organelles. [Protein synthesis, Ribosomal proteins: synthesis and modification]	44
273404	TIGR01031	rpmF_bact	ribosomal protein L32. This protein describes bacterial ribosomal protein L32. The noise cutoff is set low enough to include the equivalent protein from mitochondria and chloroplasts. No related proteins from the Archaea nor from the eukaryotic cytosol are detected by this model. This model is a fragment model; the putative L32 of some species shows similarity only toward the N-terminus. [Protein synthesis, Ribosomal proteins: synthesis and modification]	55
130104	TIGR01032	rplT_bact	ribosomal protein L20. This model describes bacterial ribosomal protein L20 and its chloroplast equivalent. This protein binds directly to 23S ribosomal RNA and is necessary for the in vitro assembly process of the 50S ribosomal subunit. It is not involved in the protein synthesizing functions of that subunit. GO process changed accordingly (SS 5/09/03) [Protein synthesis, Ribosomal proteins: synthesis and modification]	113
273405	TIGR01033	TIGR01033	DNA-binding regulatory protein, YebC/PmpR family. This model describes a minimally characterized protein family, restricted to bacteria except for some eukaryotic sequences that have possible transit peptides. YebC from E. coli is crystallized, and PA0964 from Pseudomonas aeruginosa has been shown to be a sequence-specific DNA-binding regulatory protein. In silico analysis suggests a role in Holliday junction resolution. [Regulatory functions, DNA interactions]	238
273406	TIGR01034	metK	S-adenosylmethionine synthetase. Tandem isozymes of this S-adenosylmethionine synthetase in E. coli are designated MetK and MetX. [Central intermediary metabolism, Other]	377
273407	TIGR01035	hemA	glutamyl-tRNA reductase. This enzyme, together with glutamate-1-semialdehyde-2,1-aminomutase (TIGR00713), leads to the production of delta-amino-levulinic acid from Glu-tRNA. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	417
273408	TIGR01036	pyrD_sub2	dihydroorotate dehydrogenase, subfamily 2. This model describes enzyme protein dihydroorotate dehydrogenase exclusively for subfamily 2. It includes members from bacteria, yeast, plants etc. The subfamilies 1 and 2 share extensive homology, particularly toward the C-terminus. This subfamily has a longer N-terminal region. [Purines, pyrimidines, nucleosides, and nucleotides, Pyrimidine ribonucleotide biosynthesis]	335
130109	TIGR01037	pyrD_sub1_fam	dihydroorotate dehydrogenase (subfamily 1) family protein. This family includes subfamily 1 dihydroorotate dehydrogenases while excluding the closely related subfamily 2 (TIGR01036). This family also includes a number of uncharacterized proteins and a domain of dihydropyrimidine dehydrogenase. The uncharacterized proteins might all be dihydroorotate dehydrogenase.	300
273409	TIGR01038	uL22_arch_euk	ribosomal protein uL22, eukaryotic/archaeal form. This model describes the ribosomal protein uL22 of the eukaryotic cytosol and of the Archaea, previously designated as L17, L22, and L23. The corresponding bacterial form of uL22 is described by a separate model. [Protein synthesis, Ribosomal proteins: synthesis and modification]	148
211621	TIGR01039	atpD	ATP synthase, F1 beta subunit. The sequences of ATP synthase F1 alpha and beta subunits are related and both contain a nucleotide-binding site for ATP and ADP. They have a common amino terminal domain but vary at the C-terminus. The beta chain has catalytic activity, while the alpha chain is a regulatory subunit. Proton translocating ATP synthase, F1 beta subunit is homologous to proton translocating ATP synthase archaeal/vacuolar(V1), A subunit. [Energy metabolism, ATP-proton motive force interconversion]	461
273410	TIGR01040	V-ATPase_V1_B	V-type (H+)-ATPase V1, B subunit. This models eukaryotic vacuolar (H+)-ATPase that is responsible for acidifying cellular compartments. This enzyme shares extensive sequence similarity with archaeal ATP synthase. [Transport and binding proteins, Cations and iron carrying compounds]	466
200071	TIGR01041	ATP_syn_B_arch	ATP synthase archaeal, B subunit. Archaeal ATP synthase shares extensive sequence similarity with eukaryotic and prokaryotic V-type (H+)-ATPases. [Energy metabolism, ATP-proton motive force interconversion]	458
273411	TIGR01042	V-ATPase_V1_A	V-type (H+)-ATPase V1, A subunit. This models eukaryotic vacuolar (H+)-ATPase that is responsible for acidifying cellular compartments. This enzyme shares extensive sequence similarity with archaeal ATP synthase. [Transport and binding proteins, Cations and iron carrying compounds]	591
130115	TIGR01043	ATP_syn_A_arch	ATP synthase archaeal, A subunit. Archaeal ATP synthase shares extensive sequence similarity with eukaryotic and prokaryotic V-type (H+)-ATPases. [Energy metabolism, ATP-proton motive force interconversion]	578
130116	TIGR01044	rplV_bact	ribosomal protein L22, bacterial type. This model describes bacterial and chloroplast ribosomal protein L22. [Protein synthesis, Ribosomal proteins: synthesis and modification]	103
273412	TIGR01045	RPE1	Rickettsial palindromic element RPE1 domain. This model describes protein translations of the first family described, RPE1, of Rickettsia palindromic elements (RPE). In Rickettsia conorii, 19 copies are found within protein coding regions, where they encode an insert relative to homologs from other species but do not disrupt the reading frame. Insertion is always in the same reading frame. This model finds RPE-encoded regions in several Rickettsial species and, so far, nowhere else.	46
273413	TIGR01046	uS10_euk_arch	ribosomal protein uS10, eukaryotic/archaeal. This model describes the archaeal ribosomal protein uS10 and its equivalents (previously called S20) in eukaryotes. [Protein synthesis, Ribosomal proteins: synthesis and modification]	99
273414	TIGR01047	nspC	carboxynorspermidine decarboxylase. This protein is related to diaminopimelate decarboxylase. It is the last enzyme in norspermidine biosynthesis by an unusual pathway shown in Vibrio alginolyticus. [Central intermediary metabolism, Polyamine biosynthesis]	379
273415	TIGR01048	lysA	diaminopimelate decarboxylase. This family consists of diaminopimelate decarboxylase, an enzyme which catalyzes the conversion of diaminopimelic acid into lysine during the last step of lysine biosynthesis. [Amino acid biosynthesis, Aspartate family]	414
130121	TIGR01049	rpsJ_bact	ribosomal protein S10, bacterial/organelle. This model describes bacterial 30S ribosomal protein S10. In species that have a transcription antitermination complex, or N utilization substance, with NusA, NusB, NusG, and NusE, this ribosomal protein is responsible for NusE activity. Included in the family are one member each from Saccharomyces cerevisiae and Schizosaccharomyces pombe. These proteins lack an N-terminal mitochondrial transit peptide but contain additional sequence C-terminal to the ribosomal S10 protein region. [Protein synthesis, Ribosomal proteins: synthesis and modification]	99
130122	TIGR01050	rpsS_bact	ribosomal protein S19, bacterial/organelle. The homologous protein of the eukaryotic cytosol and of the Archaea may be designated S15 or S19. [Protein synthesis, Ribosomal proteins: synthesis and modification]	92
273416	TIGR01051	topA_bact	DNA topoisomerase I, bacterial. This model describes DNA topoisomerase I among the members of bacteria. DNA topoisomerase I transiently cleaves one DNA strand and thus relaxes negatively supercoiled DNA during replication, transcription and recombination events. [DNA metabolism, DNA replication, recombination, and repair]	610
273417	TIGR01052	top6b	DNA topoisomerase VI, B subunit. This model describes DNA topoisomerase VI, an archaeal type II DNA topoisomerase (DNA gyrase). [DNA metabolism, DNA replication, recombination, and repair]	488
130125	TIGR01053	LSD1	zinc finger domain, LSD1 subclass. This model describes a putative zinc finger domain found in three closely spaced copies in Arabidopsis protein LSD1 and in two copies in other proteins from the same species. The motif resembles CxxCRxxLMYxxGASxVxCxxC	31
273418	TIGR01054	rgy	reverse gyrase. This model describes reverse gyrase, found in both archaeal and bacterial thermophiles. This enzyme, a fusion of a type I topoisomerase domain and a helicase domain, introduces positive supercoiling to increase the melting temperature of DNA double strands. Generally, these gyrases are encoded as a single polypeptide. An exception was found in Methanopyrus kandleri, where the enzyme is split within the topoisomerase domain, yielding a heterodimer of gene products designated RgyB and RgyA. [DNA metabolism, DNA replication, recombination, and repair]	1171
130127	TIGR01055	parE_Gneg	DNA topoisomerase IV, B subunit, proteobacterial. Operationally, topoisomerase IV is a type II topoisomerase required for the decatenation step of chromosome segregation. Not every bacterium has both a topo II and a topo IV. The topo IV families of the Gram-positive bacteria and the Gram-negative bacteria appear not to represent a single clade among the type II topoisomerases, and are represented by separate models for this reason. This protein is active as an alpha(2)beta(2) heterotetramer. [DNA metabolism, DNA replication, recombination, and repair]	625
273419	TIGR01056	topB	DNA topoisomerase III, bacteria and conjugative plasmid. This model describes topoisomerase III from bacteria and its equivalents encoded on plasmids. The gene is designated topB if found in the bacterial chromosome, traE on conjugative plasmid RP4, etc. These enzymes are involved in the control of DNA topology. DNA topoisomerase III belongs to the type I topoisomerases, which are ATP-independent. [DNA metabolism, DNA replication, recombination, and repair]	660
273420	TIGR01057	topA_arch	DNA topoisomerase I, archaeal. This model describes topoisomerase I from archaea. These enzymes are involved in the control of DNA topology. DNA topoisomerase I belongs to the type I topoisomerases, which are ATP-independent. [DNA metabolism, DNA replication, recombination, and repair]	618
130130	TIGR01058	parE_Gpos	DNA topoisomerase IV, B subunit, Gram-positive. Operationally, topoisomerase IV is a type II topoisomerase required for the decatenation step of chromosome segregation. Not every bacterium has both a topo II and a topo IV. The topo IV families of the Gram-positive bacteria and the Gram-negative bacteria appear not to represent a single clade among the type II topoisomerases, and are represented by separate models for this reason. [DNA metabolism, DNA replication, recombination, and repair]	637
273421	TIGR01059	gyrB	DNA gyrase, B subunit. This model describes the common type II DNA topoisomerase (DNA gyrase). Two apparently independently arising families, one in the Proteobacteria and one in Gram-positive lineages, are both designated topoisomerase IV. Proteins scoring above the noise cutoff for this model and below the trusted cutoff for topoisomerase IV models probably should be designated GyrB. [DNA metabolism, DNA replication, recombination, and repair]	654
213580	TIGR01060	eno	phosphopyruvate hydratase. Alternate name: enolase [Energy metabolism, Glycolysis/gluconeogenesis]	425
273422	TIGR01061	parC_Gpos	DNA topoisomerase IV, A subunit, Gram-positive. Operationally, topoisomerase IV is a type II topoisomerase required for the decatenation step of chromosome segregation. Not every bacterium has both a topo II and a topo IV. The topo IV families of the Gram-positive bacteria and the Gram-negative bacteria appear not to represent a single clade among the type II topoisomerases, and are represented by separate models for this reason. [DNA metabolism, DNA replication, recombination, and repair]	738
130134	TIGR01062	parC_Gneg	DNA topoisomerase IV, A subunit, proteobacterial. Operationally, topoisomerase IV is a type II topoisomerase required for the decatenation step of chromosome segregation. Not every bacterium has both a topo II and a topo IV. The topo IV families of the Gram-positive bacteria and the Gram-negative bacteria appear not to represent a single clade among the type II topoisomerases, and are represented by separate models for this reason. [DNA metabolism, DNA replication, recombination, and repair]	735
273423	TIGR01063	gyrA	DNA gyrase, A subunit. This model describes the common type II DNA topoisomerase (DNA gyrase). Two apparently independently arising families, one in the Proteobacteria and one in Gram-positive lineages, are both designated topoisomerase IV. [DNA metabolism, DNA replication, recombination, and repair]	800
273424	TIGR01064	pyruv_kin	pyruvate kinase. This enzyme is a homotetramer. Some forms are active only in the presence of fructose-1,6-bisphosphate or similar phosphorylated sugars. [Energy metabolism, Glycolysis/gluconeogenesis]	472
273425	TIGR01065	hlyIII	channel protein, hemolysin III family. This family includes proteins from pathogenic and non-pathogenic bacteria, Homo sapiens and Drosophila. In Bacillus cereus, a pathogen, it has been shown to function as a channel-forming cytolysin. The human protein is expressed preferentially in mature macrophages, consistent with a cytolytic role.	204
162186	TIGR01066	rplM_bact	ribosomal protein L13, bacterial type. This model distinguishes ribosomal protein L13 of bacteria and organelles from its eukaryotic and archaeal counterparts. [Protein synthesis, Ribosomal proteins: synthesis and modification]	140
273426	TIGR01067	rplN_bact	ribosomal protein L14, bacterial/organelle. This model distinguishes bacterial and most organellar examples of ribosomal protein L14 from all archaeal and eukaryotic forms. [Protein synthesis, Ribosomal proteins: synthesis and modification]	122
200072	TIGR01068	thioredoxin	thioredoxin. Several proteins, such as protein disulfide isomerase, have two or more copies of a domain closely related to thioredoxin. This model is designed to recognize authentic thioredoxin, a small protein that should be hit exactly once by this model. Any protein that hits once with a score greater than the second (per domain) trusted cutoff may be taken as thioredoxin. [Energy metabolism, Electron transport]	101
130141	TIGR01069	mutS2	MutS2 family protein. Function of MutS2 is unknown. It should not be considered a DNA mismatch repair protein. It is likely a DNA mismatch binding protein of unknown cellular function. [DNA metabolism, Other]	771
273427	TIGR01070	mutS1	DNA mismatch repair protein MutS. [DNA metabolism, DNA replication, recombination, and repair]	840
273428	TIGR01071	rplO_bact	ribosomal protein L15, bacterial/organelle. [Protein synthesis, Ribosomal proteins: synthesis and modification]	145
162190	TIGR01072	murA	UDP-N-acetylglucosamine 1-carboxyvinyltransferase. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan]	416
273429	TIGR01073	pcrA	ATP-dependent DNA helicase PcrA. Designed to identify pcrA members of the uvrD/rep subfamily. [DNA metabolism, DNA replication, recombination, and repair]	726
130146	TIGR01074	rep	ATP-dependent DNA helicase Rep. Designed to identify rep members of the uvrD/rep subfamily. [DNA metabolism, DNA replication, recombination, and repair]	664
130147	TIGR01075	uvrD	DNA helicase II. Designed to identify uvrD members of the uvrD/rep subfamily. [DNA metabolism, DNA replication, recombination, and repair]	715
130148	TIGR01076	sortase_fam	LPXTG-site transpeptidase (sortase) family protein. This family includes Staphylococcus aureus sortase, a transpeptidase that attaches surface proteins by the Thr of an LPXTG motif to the cell wall. It also includes a protein required for correct assembly of an LPXTG-containing fimbrial protein, a set of homologous proteins from Streptococcus pneumoniae, in which LPXTG proteins are common. However, related proteins are found in Bacillus subtilis and Methanobacterium thermoautotrophicum, in which LPXTG-mediated cell wall attachment is not known. [Cell envelope, Other, Protein fate, Protein and peptide secretion and trafficking]	136
162192	TIGR01077	L13_A_E	ribosomal protein uL13, archaeal/eukaryotic form. This model represents ribosomal protein of L13 from the Archaea and from the eukaryotic cytosol. Bacterial and organellar forms are represented by model TIGR01066. [Protein synthesis, Ribosomal proteins: synthesis and modification]	142
273430	TIGR01078	arcA	arginine deiminase. Arginine deiminase is the first enzyme of the arginine deiminase pathway of arginine degradation. [Energy metabolism, Amino acids and amines]	405
273431	TIGR01079	rplX_bact	ribosomal protein L24, bacterial/organelle. This model recognizes bacterial and organellar forms of ribosomal protein L24. It excludes eukaryotic and archaeal forms, designated L26 in eukaryotes. [Protein synthesis, Ribosomal proteins: synthesis and modification]	102
273432	TIGR01080	rplX_A_E	ribosomal protein uL24, archaeal/eukaryotic form. This model represents the archaeal and eukaryotic branch of the ribosomal protein L24p/L26e family. Bacterial and organellar forms are represented by related model TIGR01079. [Protein synthesis, Ribosomal proteins: synthesis and modification]	114
130153	TIGR01081	mpl	UDP-N-acetylmuramate:L-alanyl-gamma-D-glutamyl-meso-diaminopimelate ligase. Alternate name: murein tripeptide ligase [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan]	448
273433	TIGR01082	murC	UDP-N-acetylmuramate--L-alanine ligase. This model describes the MurC protein in bacterial peptidoglycan (murein) biosynthesis. In a few species (Mycobacterium leprae, the Chlamydia), the amino acid may be L-serine or glycine instead of L-alanine. A related protein, UDP-N-acetylmuramate:L-alanyl-gamma-D-glutamyl-meso-diaminopimelate ligase (murein tripeptide ligase) is described by model TIGR01081. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan]	448
273434	TIGR01083	nth	endonuclease III. This equivalog model identifies nth members of the pfam00730 superfamily (HhH-GPD: Helix-hairpin-helix and Gly/Pro rich loop followed by a conserved aspartate). The major members of the superfamily are nth and mutY. [DNA metabolism, DNA replication, recombination, and repair]	192
130156	TIGR01084	mutY	A/G-specific adenine glycosylase. This equivalog model identifies mutY members of the pfam00730 superfamily (HhH-GPD: Helix-hairpin-helix and Gly/Pro rich loop followed by a conserved aspartate). The major members of the superfamily are nth and mutY. [DNA metabolism, DNA replication, recombination, and repair]	275
273435	TIGR01085	murE	UDP-N-acetylmuramyl-tripeptide synthetase. Most members of this family are EC 6.3.2.13, UDP-N-acetylmuramoyl-L-alanyl-D-glutamate--2,6-diaminopimelate ligase. An exception is Staphylococcus aureus, in which diaminopimelate is replaced by lysine in the peptidoglycan and MurE is EC 6.3.2.7. The Mycobacteria, part of the closest neighboring branch outside of the low-GC Gram-positive bacteria, use diaminopimelate. A close homolog, scoring just below the trusted cutoff, is found (with introns) in Arabidopsis thaliana. Its role is unknown. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan]	464
188107	TIGR01086	fucA	L-fuculose phosphate aldolase. Members of this family are L-fuculose phosphate aldolase from various Proteobacteria, encoded in fucose utilization operons. Homologs in other bacteria given similar annotation but scoring below the trusted cutoff may share extensive sequence similarity but are not experimentally characterized and are not found in apparent fucose utilization operons; we consider their annotation as L-fuculose phosphate aldolase to be tenuous. This model has been narrowed in scope from the previous version. [Energy metabolism, Sugars]	214
273436	TIGR01087	murD	UDP-N-acetylmuramoylalanine--D-glutamate ligase. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan]	433
130160	TIGR01088	aroQ	3-dehydroquinate dehydratase, type II. This model specifies the type II enzyme. The type I enzyme, often found as part of a multifunctional protein, is described by TIGR01093. [Amino acid biosynthesis, Aromatic amino acid family]	141
130161	TIGR01089	fucI	L-fucose isomerase. This enzyme catalyzes the first step in fucose metabolism, and has been characterized in Escherichia coli and Bacteroides thetaiotaomicron. [Energy metabolism, Sugars]	587
273437	TIGR01090	apt	adenine phosphoribosyltransferase. A phylogenetic analysis suggested omitting the bi-directional best hit homologs from the spirochetes from the seed for this model and making only tentative predictions of adenine phosphoribosyltransferase function for this lineage. The trusted cutoff score is made high for this reason. Most proteins scoring between the trusted and noise cutoffs are likely to act as adenine phosphoribosyltransferase. [Purines, pyrimidines, nucleosides, and nucleotides, Salvage of nucleosides and nucleotides]	169
273438	TIGR01091	upp	uracil phosphoribosyltransferase. A fairly deep split in phylogenetic and UPGMA trees separates this mostly prokaryotic set of uracil phosphoribosyltransferases from a mostly eukaryotic set that includes uracil phosphoribosyltransferase, uridine kinases, and other, uncharacterized proteins. [Purines, pyrimidines, nucleosides, and nucleotides, Salvage of nucleosides and nucleotides]	207
130164	TIGR01092	P5CS	delta l-pyrroline-5-carboxylate synthetase. This protein contains a glutamate 5-kinase (ProB, EC 2.7.2.11) region followed by a gamma-glutamyl phosphate reductase (ProA, EC 1.2.1.41) region. [Amino acid biosynthesis, Glutamate family]	715
273439	TIGR01093	aroD	3-dehydroquinate dehydratase, type I. This model detects 3-dehydroquinate dehydratase, type I, either as a monofunctional protein or as a domain of a larger, multifunctional protein. It is often found fused to shikimate 5-dehydrogenase (EC 1.1.1.25), and sometimes additional domains. Type II 3-dehydroquinate dehydratase, designated AroQ, is described by the model TIGR01088. [Amino acid biosynthesis, Aromatic amino acid family]	229
273440	TIGR01096	3A0103s03R	lysine-arginine-ornithine-binding periplasmic protein. [Transport and binding proteins, Amino acids, peptides and amines]	250
273441	TIGR01097	PhnE	phosphonate ABC transporter, permease protein PhnE. Phosphonates are a class of compound analogous to organic phosphates, but in which the C-O-P linkage is replaced by a direct, stable C-P bond. Some bacteria can utilize phosphonates as a source of phosphorus. This family consists of permease proteins of known or predicted phosphonate ABC transporters. Often this protein is found as a duplicated pair, occasionally as a fused pair. Certain "second" copies score in between the trusted and noise cutoff and should be considered true hits (by context). [Transport and binding proteins, Anions]	250
273442	TIGR01098	3A0109s03R	phosphate/phosphite/phosphonate ABC transporter, periplasmic binding protein. Phosphonates are a varied class of phosphorus-containing organic compound in which a direct C-P bond is found, rather than a C-O-P linkage of the phosphorus through an oxygen atom. They may be toxic but also may be used as sources of phosphorus and energy by various bacteria. Phosphonate utilization systems typically are encoded in 14 or more genes, including a three gene ABC transporter. This family includes the periplasmic binding protein component of ABC transporters for phosphonates as well as other, related binding components for closely related substances such as phosphate and phosphite. A number of members of this family are found in genomic contexts with components of selenium metabolic processes suggestive of a role in selenate or other selenium-compound transport. A subset of this model in which nearly all members exhibit genomic context with elements of phosphonate metabolism, particularly the C-P lyase system (GenProp0232) has been built (TIGR03431) as an equivalog. Nevertheless, there are members of this subfamily (TIGR01098) which show up sporadically on a phylogenetic tree that also show phosphonate context and are most likely competent to transport phosphonates. [Transport and binding proteins, Anions]	254
273443	TIGR01099	galU	UTP--glucose-1-phosphate uridylyltransferase. Built to distinguish between the highly similar genes galU and galF [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	260
130170	TIGR01100	V_ATP_synt_C	vacuolar ATP synthase 16 kDa proteolipid subunit. This model describes the vacuolar ATP synthase 16 kDa proteolipid subunit in eukaryotes and includes members from diverse groups, e.g., fungi, plants, parasites etc. The principal role of V-ATPases is the acidification of intracellular compartments of eukaryotic cells. [Transport and binding proteins, Cations and iron carrying compounds]	108
130171	TIGR01101	V_ATP_synt_F	vacuolar ATP synthase F subunit. This model describes the vacuolar ATP synthase F subunit (14 kDa subunit) in eukaryotes. In some archaeal species this protein subunit is referred to as the G subunit [Transport and binding proteins, Cations and iron carrying compounds]	115
130172	TIGR01102	yscR	type III secretion apparatus protein, YscR/HrcR family. This model identifies the generic virulence translocation proteins in bacteria. It derives its name:'Yop' from Yersinia enterocolitica species, where this virulence protein was identified. In bacterial pathogenesis, Yop effector proteins are translocated into the eukaryotic cells. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	202
211625	TIGR01103	fliP	flagellar biosynthetic protein FliP. This model describes bacterial flagellar biogenesis protein fliP, which is one of the genes in motility locus on the bacterial chromosome that is involved in structure and function of bacterial flagellum. It was demonstrated that mutants in fliP locus were non-flagellated and non-motile, while revertants were flagellated and motile. [Cellular processes, Chemotaxis and motility]	197
273444	TIGR01104	V_PPase	vacuolar-type H(+)-translocating pyrophosphatase. This model describes proton pyrophosphatases from eukaryotes (predominantly plants), archaea and bacteria. It is an integral membrane protein and is suggested to have about 15 membrane spanning domains. Proton translocating inorganic pyrophosphatase, like H(+)-ATPase, acidifies the vacuoles and is pivotal to the vacuolar secondary active transport systems in plants. [Transport and binding proteins, Cations and iron carrying compounds]	695
130175	TIGR01105	galF	UTP-glucose-1-phosphate uridylyltransferase, non-catalytic GalF subunit. GalF is a non-catalytic subunit of the UTP-glucose pyrophosphorylase modulating the enzyme activity to increase the formation of UDP-glucose [Regulatory functions, Protein interactions]	297
273445	TIGR01106	ATPase-IIC_X-K	sodium or proton efflux -- potassium uptake antiporter, P-type ATPase, alpha subunit. This model describes the P-type ATPases responsible for the exchange of either protons or sodium ions for potassium ions across the plasma membranes of eukaryotes. Unlike most other P-type ATPases, members of this subfamily require a beta subunit for activity. This model encompasses eukaryotes and consists of two functional types, a Na/K antiporter found widely distributed in eukaryotes and a H/K antiporter found only in vertebrates. The Na+ or H+/K+ antiporter P-type ATPases have been characterized as Type IIC based on a published phylogenetic analysis. Sequences from Blastocladiella emersonii (GP|6636502, GP|6636502 and PIR|T43025), C. elegans (GP|2315419, GP|6671808 and PIR|T31763) and Drosophila melanogaster (GP|7291424) score below trusted cutoff, apparently due to long branch length (excessive divergence from the last common ancestor) as evidenced by a phylogenetic tree. Experimental evidence is needed to determine whether these sequences represent ATPases with conserved function. Aside from fragments, other sequences between trusted and noise appear to be bacterial ATPases of unclear lineage, but most likely calcium pumps. [Energy metabolism, ATP-proton motive force interconversion]	997
273446	TIGR01107	Na_K_ATPase_bet	Sodium Potassium ATPase beta subunit. This model describes the Na+/K+ ATPase beta subunit in eukaryotes. Na+/K+ ATPase (also called Sodium-Potassium pump) is intimately associated with the plasma membrane. It couples the energy released by the hydrolysis of ATP to extrude 3 Na+ ions, with the concomitant uptake of 2 K+ ions, against their ionic gradients. [Transport and binding proteins, Cations and iron carrying compounds]	290
273447	TIGR01108	oadA	oxaloacetate decarboxylase alpha subunit. This model describes the bacterial oxaloacetate decarboxylase alpha subunit and its equivalents in archaea. The oxaloacetate decarboxylase Na+ pump is the paradigm of the family of Na+ transport decarboxylases that are present in bacteria and archaea. It is a multisubunit enzyme consisting of a peripheral alpha-subunit and integral membrane subunits beta and gamma. The energy released by the decarboxylation reaction of oxaloacetate is coupled to Na+ ion pumping across the membrane. [Transport and binding proteins, Cations and iron carrying compounds, Energy metabolism, Other]	582
130179	TIGR01109	Na_pump_decarbB	sodium ion-translocating decarboxylase, beta subunit. This model describes the beta subunits of sodium pump decarboxylases that include oxaloacetate decarboxylase, methylmalonyl-CoA decarboxylase, and glutaconyl-CoA decarboxylase. Beta and gamma-subunits are integral membrane proteins, while alpha is membrane bound. Catalytically, the energy released by the decarboxylation reaction is coupled to the extrusion of Na+ ions across the membrane. [Transport and binding proteins, Cations and iron carrying compounds, Energy metabolism, Other]	354
273448	TIGR01110	mdcA	malonate decarboxylase, alpha subunit. This model describes malonate decarboxylase alpha subunit, from both the water-soluble form as found in Klebsiella pneumoniae and the form coupled to sodium ion pumping in Malonomonas rubra. Malonate decarboxylase Na+ pump is the paradigm of the family of Na+ transport decarboxylases. Essentially, it couples the energy derived from decarboxylation of a carboxylic acid substrate to move Na+ ion across the bilayer. Functional malonate decarboxylase is a multisubunit protein. The alpha subunit enzymatically performs the transfer of malonate (substrate) to an acyl carrier protein subunit for subsequent decarboxylation, hence the name: acetyl-S-acyl carrier protein:malonate carrier protein-SH transferase. [Transport and binding proteins, Cations and iron carrying compounds, Energy metabolism, Other]	543
130181	TIGR01111	mtrA	N5-methyltetrahydromethanopterin:coenzyme M methyltransferase subunit A. This model describes N5-methyltetrahydromethanopterin: coenzyme M methyltransferase subunit A in methanogenic archaea. This methyltransferase is a membrane-associated enzyme complex that uses a methyl-transfer reaction to drive a sodium-ion pump. Archaea have evolved energy-yielding pathways marked by one-carbon biochemistry featuring novel cofactors and enzymes. This transferase (encoded by subunit A) is involved in the transfer of 'methyl' group from N5-methyltetrahydromethanopterin to coenzyme M. In an accompanying reaction, methane is produced by two-electron reduction of methyl-coenzyme M by another enzyme, methyl-coenzyme M reductase. [Transport and binding proteins, Cations and iron carrying compounds]	238
273449	TIGR01112	mtrD	N5-methyltetrahydromethanopterin:coenzyme M methyltransferase subunit D. This model describes N5-methyltetrahydromethanopterin: coenzyme M methyltransferase subunit D in methanogenic archaea. This methyltransferase is a membrane-associated enzyme complex that uses a methyl-transfer reaction to drive a sodium-ion pump. Archaea have evolved energy-yielding pathways marked by one-carbon biochemistry featuring novel cofactors and enzymes. This transferase is involved in the transfer of 'methyl' group from N5-methyltetrahydromethanopterin to coenzyme M. In an accompanying reaction, methane is produced by two-electron reduction of the methyl moiety in methyl-coenzyme M by another enzyme, methyl-coenzyme M reductase. [Transport and binding proteins, Cations and iron carrying compounds, Energy metabolism, Methanogenesis]	223
130183	TIGR01113	mtrE	N5-methyltetrahydromethanopterin:coenzyme M methyltransferase subunit E. This model describes N5-methyltetrahydromethanopterin: coenzyme M methyltransferase subunit E in methanogenic archaea. This methyltransferase is a membrane-associated enzyme complex that uses a methyl-transfer reaction to drive a sodium-ion pump. Archaea have evolved energy-yielding pathways marked by one-carbon biochemistry featuring novel cofactors and enzymes. This transferase is involved in the transfer of 'methyl' group from N5-methyltetrahydromethanopterin to coenzyme M. In an accompanying reaction, methane is produced by two-electron reduction of methyl-coenzyme M by another enzyme, methyl-coenzyme M reductase. [Transport and binding proteins, Cations and iron carrying compounds, Energy metabolism, Methanogenesis]	283
273450	TIGR01114	mtrH	N5-methyltetrahydromethanopterin:coenzyme M methyltransferase subunit H. This model describes N5-methyltetrahydromethanopterin: coenzyme M methyltransferase subunit H in methanogenic archaea. This methyltransferase is a membrane-associated enzyme complex that uses a methyl-transfer reaction to drive a sodium-ion pump. Archaea have evolved energy-yielding pathways marked by one-carbon biochemistry featuring novel cofactors and enzymes. This transferase is involved in the transfer of 'methyl' group from N5-methyltetrahydromethanopterin to coenzyme M. In an accompanying reaction, methane is produced by two-electron reduction of methyl-coenzyme M by another enzyme, methyl-coenzyme M reductase. [Energy metabolism, Methanogenesis]	314
273451	TIGR01115	pufM	photosynthetic reaction center M subunit. This model describes the photosynthetic reaction center M subunit in non-oxygenic photosynthetic bacteria. The reaction center is an integral membrane pigment-protein complex that carries out light-driven electron transfer reactions. At the core of the reaction center is a collection of light-harvesting cofactors and closely associated polypeptides. The core protein complex is made of L, M and H subunits. The common cofactors include bacteriochlorophyll, bacteriopheophytins, ubiquinone and non-heme ferrous iron. The net result of the electron transfer reactions is the establishment of a proton electrochemical gradient and production of reducing equivalents in the form of NADH. Ultimately the process results in the reduction of CO2 to carbohydrates (C6H12O6). In non-oxygenic organisms, the electron donor is some organic acid and not water. Much of our current functional understanding of photosynthesis comes from the structural determination, spectroscopic studies and mutational analysis on the reaction center of Rhodobacter sphaeroides. [Energy metabolism, Electron transport, Energy metabolism, Photosynthesis]	305
273452	TIGR01116	ATPase-IIA1_Ca	sarco/endoplasmic reticulum calcium-translocating P-type ATPase. This model describes the P-type ATPase responsible for translocating calcium ions across the endoplasmic reticulum membrane of eukaryotes, and is of particular importance in the sarcoplasmic reticulum of skeletal and cardiac muscle in vertebrates. These pumps transfer Ca2+ from the cytoplasm to the lumen of the endoplasmic reticulum. In humans and mice, at least, there are multiple isoforms of the SERCA pump with overlapping but not redundant functions. Defects in SERCA isoforms are associated with diseases in humans. The calcium P-type ATPases have been characterized as Type IIA based on a phylogenetic analysis which distinguishes this group from the Type IIB PMCA calcium pump modelled by TIGR01517. A separate analysis divides Type IIA into sub-types, SERCA and PMR1, the latter of which is modelled by TIGR01522. [Transport and binding proteins, Cations and iron carrying compounds]	917
130187	TIGR01117	mmdA	methylmalonyl-CoA decarboxylase alpha subunit. This model describes methylmalonyl-CoA decarboxylase alpha subunit in archaea and bacteria. Methylmalonyl-CoA decarboxylase Na+ pump is a representative of a class of Na+ transport decarboxylases that couples the energy derived by decarboxylation of carboxylic acid substrates to drive the extrusion of Na+ ion across the membrane. [Energy metabolism, ATP-proton motive force interconversion, Energy metabolism, Fermentation, Transport and binding proteins, Cations and iron carrying compounds]	512
130188	TIGR01118	lacA	galactose-6-phosphate isomerase, LacA subunit. This family contains members from low GC gram-positive bacteria. Galactose-6-phosphate isomerase is involved in lactose catabolism by the tagatose-6-phosphate pathway. [Energy metabolism, Biosynthesis and degradation of polysaccharides]	141
130189	TIGR01119	lacB	galactose-6-phosphate isomerase, LacB subunit. This family contains four members from low GC gram-positive bacteria. Galactose-6-phosphate isomerase is involved in lactose catabolism by the tagatose-6-phosphate pathway. [Energy metabolism, Biosynthesis and degradation of polysaccharides]	171
130190	TIGR01120	rpiB	ribose 5-phosphate isomerase B. Involved in the non-oxidative branch of the pentose phosphate pathway. [Energy metabolism, Pentose phosphate pathway]	143
130191	TIGR01121	D_amino_aminoT	D-amino acid aminotransferase. This enzyme is a homodimer. The pyridoxal phosphate attachment site is the Lys at position 146 of the seed alignment, in the motif Cys-Asp-Ile-Lys-Ser-Leu-Asn. Specificity is broad for various D-amino acids, and differs among members of the family; the family is designated equivalog, but with this caveat attached. [Energy metabolism, Amino acids and amines]	276
130192	TIGR01122	ilvE_I	branched-chain amino acid aminotransferase, group I. Among the class IV aminotransferases are two phylogenetically separable groups of branched-chain amino acid aminotransferase (IlvE). The last common ancestor of the two lineages appears also to have given rise to a family of D-amino acid aminotransferases (DAAT). This model represents the IlvE family more strongly similar to the DAAT family. [Amino acid biosynthesis, Pyruvate family]	298
233278	TIGR01123	ilvE_II	branched-chain amino acid aminotransferase, group II. Among the class IV aminotransferases are two phylogenetically separable groups of branched-chain amino acid aminotransferase (IlvE). The last common ancestor of the two lineages appears also to have given rise to a family of D-amino acid aminotransferases (DAAT). This model represents the IlvE family less similar to the DAAT family. [Amino acid biosynthesis, Pyruvate family]	313
130194	TIGR01124	ilvA_2Cterm	threonine ammonia-lyase, biosynthetic, long form. This model describes a form of threonine ammonia-lyase, a pyridoxal-phosphate dependent enzyme, with two copies of the threonine dehydratase C-terminal domain (pfam00585). Members with known function participate in isoleucine biosynthesis and are inhibited by isoleucine. Alternate name: threonine deaminase, threonine dehydratase. Forms scoring between the trusted and noise cutoff tend to branch with this subgroup of threonine ammonia-lyase phylogenetically but have only a single copy of the C-terminal domain. [Amino acid biosynthesis, Pyruvate family]	499
273453	TIGR01125	TIGR01125	ribosomal protein S12 methylthiotransferase RimO. Members of this family are the methylthiotransferase RimO, which modifies a conserved Asp residue in ribosomal protein S12. This clade of radical SAM family proteins is closely related to the tRNA modification bifunctional enzyme MiaB (see TIGR01574), and it catalyzes the same two types of reactions: a radical-mechanism sulfur insertion, and a methylation of the inserted sulfur. This clade spans alpha and gamma proteobacteria, cyanobacteria, Deinococcus, Porphyromonas, Aquifex, Helicobacter, Campylobacter, Thermotoga, Chlamydia, Streptococcus coelicolor and Clostridium, but does not include most other gram positive bacteria, archaea or eukaryotes. [Protein synthesis, Ribosomal proteins: synthesis and modification]	426
273454	TIGR01126	pdi_dom	protein disulfide-isomerase domain. This model describes a domain of eukaryotic protein disulfide isomerases, generally found in two copies. The high cutoff for total score reflects the expectation of finding both copies. The domain is similar to thioredoxin but the redox-active disulfide region motif is APWCGHCK. [Protein fate, Protein folding and stabilization]	102
130197	TIGR01127	ilvA_1Cterm	threonine ammonia-lyase, medium form. A form of threonine dehydratase with two copies of the C-terminal domain pfam00585 is described by TIGR01124. This model describes a phylogenetically distinct form with a single copy of pfam00585. This form branches with the catabolic threonine dehydratase of E. coli; many members are designated as catabolic for this reason. However, the catabolic form lacks any pfam00585 domain. Many members of this model are found in species with other Ile biosynthetic enzymes. [Amino acid biosynthesis, Pyruvate family]	380
273455	TIGR01128	holA	DNA polymerase III, delta subunit. DNA polymerase III delta (holA) and delta prime (holB) subunits are distinct proteins encoded by separate genes. The delta prime subunit (holB) exhibits sequence homology to the tau and gamma subunits (dnaX), but the delta subunit (holA) does not demonstrate this same homology with dnaX. The delta, delta prime, gamma, chi and psi subunits form the gamma complex subassembly of DNA polymerase III holoenzyme, which couples ATP to assemble the ring-shaped beta subunit around DNA forming a DNA sliding clamp. [DNA metabolism, DNA replication, recombination, and repair]	302
273456	TIGR01129	secD	protein-export membrane protein SecD. Members of this family are highly variable in length immediately after the well-conserved motif LGLGLXGG at the amino-terminal end of this model. Archaeal homologs are not included in the seed and score between the trusted and noise cutoffs. SecD from Mycobacterium tuberculosis has a long Pro-rich insert. [Protein fate, Protein and peptide secretion and trafficking]	397
273457	TIGR01130	ER_PDI_fam	protein disulfide isomerase, eukaryotic. This model represents eukaryotic protein disulfide isomerases retained in the endoplasmic reticulum (ER) and closely related forms. Some members have been assigned alternative or additional functions such as prolyl 4-hydroxylase and dolichyl-diphosphooligosaccharide-protein glycotransferase. Members of this family have at least two protein-disulfide domains, each similar to thioredoxin but with the redox-active disulfide in the motif PWCGHCK, and an ER retention signal at the extreme C-terminus (KDEL, HDEL, and similar motifs).	462
273458	TIGR01131	ATP_synt_6_or_A	ATP synthase subunit 6 (eukaryotes), also subunit A (prokaryotes). Bacterial forms should be designated ATP synthase, F0 subunit A; eukaryotic (chloroplast and mitochondrial) forms should be designated ATP synthase, F0 subunit 6. The F1/F0 ATP synthase is a multisubunit, membrane-associated enzyme found in bacteria, mitochondria and chloroplasts. This enzyme is principally involved in the synthesis of ATP from ADP and inorganic phosphate by coupling the energy derived from the proton electrochemical gradient across the biological membrane. A brief description of this multisubunit enzyme complex: F1 and F0 represent two major clusters of subunits. Individual subunits in each of these clusters are named differently in prokaryotes and in organelles, e.g., mitochondria and chloroplasts. The bacterial equivalent of subunit 6 is named subunit 'A'. It has been shown that protons are conducted through this subunit. Typically, deprotonation and reprotonation of the acidic amino acid side-chains are implicated in the process. [Energy metabolism, ATP-proton motive force interconversion]	226
273459	TIGR01132	pgm	phosphoglucomutase, alpha-D-glucose phosphate-specific. This enzyme interconverts alpha-D-glucose-1-P and alpha-D-glucose-6-P. [Energy metabolism, Sugars]	543
273460	TIGR01133	murG	undecaprenyldiphospho-muramoylpentapeptide beta-N-acetylglucosaminyltransferase. RM 8449890 RT The final step of peptidoglycan subunit assembly in Escherichia coli occurs in the cytoplasm. RA Bupp K, van Heijenoort J. RL J Bacteriol 1993 Mar;175(6):1841-3 [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan]	348
273461	TIGR01134	purF	amidophosphoribosyltransferase. Alternate name: glutamine phosphoribosylpyrophosphate (PRPP) amidotransferase. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis]	442
273462	TIGR01135	glmS	glucosamine--fructose-6-phosphate aminotransferase (isomerizing). The member from Methanococcus jannaschii contains an intein. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan, Central intermediary metabolism, Amino sugars]	607
273463	TIGR01136	cysKM	cysteine synthase. This model discriminates cysteine synthases (EC 2.5.1.47) (both CysK and CysM) from cystathionine beta-synthase, a protein found primarily in eukaryotes and carrying a C-terminal CBS domain lacking from this protein. Bacterial proteins lacking the CBS domain but otherwise showing resemblance to cystathionine beta-synthases and considerable phylogenetic distance from known cysteine synthases were excluded from the seed and score below the trusted cutoff. [Amino acid biosynthesis, Serine family]	299
273464	TIGR01137	cysta_beta	cystathionine beta-synthase. Members of this family closely resemble cysteine synthase but contain an additional C-terminal CBS domain. The function of any bacterial member included in this family is proposed but not proven. [Amino acid biosynthesis, Serine family]	455
130208	TIGR01138	cysM	cysteine synthase B. CysM differs from CysK in that it can also use thiosulfate instead of sulfide, to produce cysteine thiosulfonate instead of cysteine. Alternate name: O-acetylserine (thiol)-lyase [Amino acid biosynthesis, Serine family]	290
273465	TIGR01139	cysK	cysteine synthase A. This model distinguishes cysteine synthase A (CysK) from cysteine synthase B (CysM). CysM differs in having a broader specificity that also allows the use of thiosulfate to produce cysteine thiosulfonate. [Amino acid biosynthesis, Serine family]	298
273466	TIGR01140	L_thr_O3P_dcar	L-threonine-O-3-phosphate decarboxylase. This family contains pyridoxal phosphate-binding class II aminotransferases (see pfamAM:pfam00222) closely related to, yet distinct from, histidinol-phosphate aminotransferase (HisC). It is found in cobalamin biosynthesis operons in Salmonella typhimurium and Bacillus halodurans (each of which also has HisC) and has been shown to have L-threonine-O-3-phosphate decarboxylase activity in Salmonella. Although the gene symbol cobD was assigned in Salmonella, cobD in other contexts refers to a different cobalamin biosynthesis enzyme, modeled by pfam03186 and called cbiB in Salmonella. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	330
273467	TIGR01141	hisC	histidinol-phosphate aminotransferase. Alternate names: histidinol-phosphate transaminase; imidazole acetol-phosphate transaminase. Histidinol-phosphate aminotransferase is a pyridoxal-phosphate dependent enzyme. [Amino acid biosynthesis, Histidine family]	350
130212	TIGR01142	purT	phosphoribosylglycinamide formyltransferase 2. This enzyme is an alternative to PurN (TIGR00639) [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis]	380
273468	TIGR01143	murF	UDP-N-acetylmuramoyl-tripeptide--D-alanyl-D-alanine ligase. This family consists of the strictly bacterial MurF gene of peptidoglycan biosynthesis. This enzyme is almost always UDP-N-acetylmuramoylalanyl-D-glutamyl-2,6-diaminopimelate--D-alanyl-D-alanyl ligase, but in a few species, MurE adds lysine rather than diaminopimelate. This enzyme acts on the product from MurE activity, and so is also subfamily rather than equivalog. Staphylococcus aureus is an example of a species in which this MurF protein would differ. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan]	417
130214	TIGR01144	ATP_synt_b	ATP synthase, F0 subunit b. This model describes the F1/F0 ATP synthase b subunit in bacteria only. Scoring just below the trusted cutoff are the N-terminal domains of Mycobacterial b/delta fusion proteins and a subunit from an archaeon, Methanosarcina barkeri, in which the ATP synthase homolog differs in architecture and is not experimentally confirmed. This model helps resolve b from the related b' subunit. Within the family is an example from a sodium-translocating rather than proton-translocating ATP synthase. [Energy metabolism, ATP-proton motive force interconversion]	147
130215	TIGR01145	ATP_synt_delta	ATP synthase, F1 delta subunit. This model describes the ATP synthase delta subunit in bacteria, mitochondria, and chloroplasts. It is sometimes called OSCP for Oligomycin Sensitivity Conferring Protein. F1/F0-ATP synthase is a multisubunit, membrane associated enzyme found in bacteria and organelles of higher eukaryotes, namely, mitochondria and chloroplast. This enzyme is principally involved in the synthesis of ATP from ADP and inorganic phosphate by coupling the energy derived from the proton electrochemical gradient across the biological membrane. A brief description of this multisubunit enzyme complex: F1 and F0 represent two major clusters of subunits. The delta subunit belongs to the F1 cluster or sector and is functionally implicated in the overall stability of the complex. Expression of truncated forms of this subunit results in low ATPase activity. [Energy metabolism, ATP-proton motive force interconversion]	172
273469	TIGR01146	ATPsyn_F1gamma	ATP synthase, F1 gamma subunit. This model describes the ATP synthase gamma subunit in bacteria and its equivalents in organelles, namely, mitochondria and chloroplast. F1/F0-ATP synthase is a multisubunit, membrane associated enzyme found in bacteria and organelles of higher eukaryotes, namely, mitochondria and chloroplast. This enzyme is principally involved in the synthesis of ATP from ADP and inorganic phosphate by coupling the energy derived from the proton electrochemical gradient across the biological membrane. A brief description of this multisubunit enzyme complex: F1 and F0 represent two major clusters of subunits. The gamma subunit is part of the F1 cluster. Surrounding the gamma subunit in a cylinder-like structure are three alpha and three beta subunits in an alternating fashion. This is the central catalytic unit whose different conformations permit the binding of ADP and inorganic phosphate and release of ATP. [Energy metabolism, ATP-proton motive force interconversion]	286
130217	TIGR01147	V_ATP_synt_G	vacuolar ATP synthase, subunit G. This model describes the vacuolar ATP synthase G subunit in eukaryotes and includes members from diverse groups e.g., fungi, plants, parasites etc. V-ATPases are multi-subunit enzymes composed of two functional domains: A transmembrane Vo domain and a peripheral catalytic domain V1. The G subunit is one of the subunits of the catalytic domain. V-ATPases are responsible for the acidification of endosomes and lysosomes, which are part of the central vacuolar system. [Energy metabolism, ATP-proton motive force interconversion]	113
130218	TIGR01148	mtrC	N5-methyltetrahydromethanopterin:coenzyme M methyltransferase subunit C. This model describes N5-methyltetrahydromethanopterin:coenzyme M methyltransferase subunit C in methanogenic archaea. This methyltransferase is a membrane-associated enzyme complex that uses a methyl-transfer reaction to drive a sodium-ion pump. Archaea have evolved energy-yielding pathways marked by one-carbon biochemistry featuring novel cofactors and enzymes. This transferase is involved in the transfer of the methyl group from N5-methyltetrahydromethanopterin to coenzyme M. In an accompanying reaction, methane is produced by two-electron reduction of the methyl moiety in methyl-coenzyme M by another enzyme, methyl-coenzyme M reductase. [Energy metabolism, Other]	265
130219	TIGR01149	mtrG	N5-methyltetrahydromethanopterin:coenzyme M methyltransferase subunit G. This model describes N5-methyltetrahydromethanopterin:coenzyme M methyltransferase subunit G in methanogenic archaea. This methyltransferase is a membrane-associated enzyme complex that uses a methyl-transfer reaction to drive a sodium-ion pump. Archaea have evolved energy-yielding pathways marked by one-carbon biochemistry featuring novel cofactors and enzymes. This transferase is involved in the transfer of the methyl group from N5-methyltetrahydromethanopterin to coenzyme M. In an accompanying reaction, methane is produced by two-electron reduction of the methyl moiety in methyl-coenzyme M by another enzyme, methyl-coenzyme M reductase. [Energy metabolism, Other]	70
273470	TIGR01150	puhA	photosynthetic reaction center, subunit H, bacterial. This model describes the photosynthetic reaction center H subunit in non-oxygenic photosynthetic bacteria. The reaction center is an integral membrane pigment-protein that carries out light-driven electron transfer reactions. At the core of the reaction center is a collection of light-harvesting cofactors and closely associated polypeptides. The core protein complex is made of L, M and H subunits. The common cofactors include bacteriochlorophyll, bacteriopheophytins, ubiquinone and non-heme ferrous iron. The net result of electron transfer reactions is the establishment of a proton electrochemical gradient and production of reducing equivalents in the form of NADH. Ultimately, the process results in the reduction of CO2 to carbohydrates (C6H12O6). In non-oxygenic organisms, the electron donor is an organic acid rather than water. Much of our current functional understanding of photosynthesis comes from the structural determination and spectroscopic studies on the reaction center of Rhodobacter sphaeroides. [Energy metabolism, Electron transport, Energy metabolism, Photosynthesis]	252
130221	TIGR01151	psbA	photosystem II, DI subunit (also called Q(B)). This model describes the Photosystem II, DI subunit (also called Q(B)) in bacteria and its equivalents in chloroplasts of algae and higher plants. Photosystem II is in many ways functionally equivalent to the bacterial reaction center. At the core of Photosystem II are several light harvesting cofactors including plastoquinones, pheophytins, phylloquinones etc. These cofactors are intimately associated with the polypeptides, which principally include subunits DI, DII, Cyt.b, Cyt.f and iron-sulphur protein. Together they participate in the electron transfer reactions that lead to the net production of the reducing equivalents in the form of NADPH, which are used for reduction of CO2 to carbohydrates (C6H12O6). Photosystem II operates during oxygenic photosynthesis and the principal electron donor is H2O. Although no structural data is presently available, a huge body of literature exists that describes function using a variety of biochemical and biophysical techniques. [Energy metabolism, Electron transport, Energy metabolism, Photosynthesis]	360
130222	TIGR01152	psbD	Photosystem II, DII subunit (also called Q(A)). This model describes the Photosystem II, DII subunit (also called Q(A)) in bacteria and its equivalents in chloroplasts of algae and higher plants. Photosystem II is in many ways functionally equivalent to the bacterial reaction center. At the core of Photosystem II are several light harvesting cofactors including plastoquinones, pheophytins, phylloquinones etc. These cofactors are intimately associated with the polypeptides, which principally include subunits DI, DII, Cyt.b, Cyt.f and iron-sulphur protein. Together they participate in the electron transfer reactions that lead to the net production of the reducing equivalents in the form of NADPH, which are used for reduction of CO2 to carbohydrates (C6H12O6). Photosystem II operates during oxygenic photosynthesis and the principal electron donor is H2O. Although no high resolution X-ray structural data is presently available, recently a 3D structure of the supercomplex has been described by cryo-electron microscopy. Besides, a huge body of literature exists that describes function using a variety of biochemical and biophysical techniques. [Energy metabolism, Electron transport, Energy metabolism, Photosynthesis]	352
213589	TIGR01153	psbC	photosystem II 44 kDa subunit reaction center protein (also called P6 protein, CP43), bacterial and chloroplast. This model describes the Photosystem II, 44 kDa subunit (also called P6 protein, CP43) in bacteria and its equivalents in chloroplasts of algae and higher plants. Photosystem II is in many ways functionally equivalent to the bacterial reaction center. At the core of Photosystem II are several light harvesting cofactors including plastoquinones, pheophytins, phylloquinones etc. These cofactors are intimately associated with the polypeptides, which principally include subunits 44 kDa protein, DI, DII, Cyt.b, Cyt.f, iron-sulphur protein and others. Functionally, the 44 kDa subunit is implicated in chlorophyll binding. Together they participate in the electron transfer reactions that lead to the net production of the reducing equivalents in the form of NADPH, which are used for reduction of CO2 to carbohydrates (C6H12O6). Photosystem II operates during oxygenic photosynthesis and the principal electron donor is H2O. Although no high resolution X-ray structural data is presently available, recently a 3D structure of the supercomplex has been described by cryo-electron microscopy. Besides, a huge body of literature exists that describes function using a variety of biochemical and biophysical techniques. [Energy metabolism, Electron transport, Energy metabolism, Photosynthesis]	432
130224	TIGR01156	cytb6/f_IV	cytochrome b6/f complex subunit IV. This model describes the subunit IV of the cytochrome b6/f complex. The cyt b6/f complex is central to the functions of the oxygenic photosynthetic electron transport in cyanobacteria and its equivalents in algae and higher plants. Energetically, on the redox scale the cytb6/f complex is placed below the other components - Q(A); Q(B) of the photosystem II in the Z-scheme, along the pathway of the electron transport. The complex is made of the following subunits: cytochrome f; cytochrome b6; Rieske 2Fe-2S; and subunits IV; V; VI; VII. Subunit IV is one of the principal subunits for the binding of the redox prosthetic groups. Each monomer of the complex contains a molecule of chlorophyll a and beta-carotene. [Energy metabolism, Electron transport, Energy metabolism, Photosynthesis]	159
130225	TIGR01157	pufL	photosynthetic reaction center L subunit. This model describes the photosynthetic reaction center L subunit in non-oxygenic photosynthetic bacteria. The reaction center is an integral membrane pigment-protein that carries out light-driven electron transfer reactions. At the core of the reaction center is a collection of light-harvesting cofactors and closely associated polypeptides. The core protein complex is made of L, M and H subunits. The common cofactors include bacteriochlorophyll, bacteriopheophytins, ubiquinone and non-heme ferrous iron. The net result of electron transfer reactions is the establishment of a proton electrochemical gradient and production of reducing equivalents in the form of NADH. Ultimately, the process results in the reduction of CO2 to carbohydrates (C6H12O6). In non-oxygenic organisms, the electron donor is some organic acid and not water. Much of our current functional understanding of photosynthesis comes from the structural determination, spectroscopic studies and mutational analysis on the reaction center of Rhodobacter sphaeroides. [Energy metabolism, Electron transport, Energy metabolism, Photosynthesis]	239
273471	TIGR01158	SUI1_rel	translation initiation factor SUI1, putative, prokaryotic. This family of archaeal and bacterial proteins is homologous to the eukaryotic translation initiation factor SUI1 involved in directing the ribosome to the proper start site of translation by functioning in concert with eIF-2 and the initiator tRNA-Met. [Protein synthesis, Translation factors]	101
273472	TIGR01159	DRP1	density-regulated protein DRP1. This protein family shows weak but suggestive similarity to translation initiation factor SUI1 and its prokaryotic homologs.	173
130228	TIGR01160	SUI1_MOF2	translation initiation factor SUI1, eukaryotic. Alternate name: MOF2. A similar protein family (see TIGRFAMs model TIGR01158) is found in prokaryotes. The human protein complements a yeast SUI1 mutation. [Protein synthesis, Translation factors]	110
273473	TIGR01161	purK	phosphoribosylaminoimidazole carboxylase, PurK protein. Phosphoribosylaminoimidazole carboxylase is a fusion protein in plants and fungi, but consists of two non-interacting proteins in bacteria, PurK and PurE. This model represents PurK, N5-carboxyaminoimidazole ribonucleotide synthetase, which hydrolyzes ATP and converts AIR to N5-CAIR. PurE converts N5-CAIR to CAIR. In the presence of high concentrations of bicarbonate, PurE is reported able to convert AIR to CAIR directly and without ATP. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis]	352
273474	TIGR01162	purE	phosphoribosylaminoimidazole carboxylase, PurE protein. Phosphoribosylaminoimidazole carboxylase is a fusion protein in plants and fungi, but consists of two non-interacting proteins in bacteria, PurK and PurE. This model represents PurE, an N5-CAIR mutase. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis]	156
273475	TIGR01163	rpe	ribulose-phosphate 3-epimerase. This family consists of ribulose-phosphate 3-epimerase, also known as pentose-5-phosphate 3-epimerase (PPE). PPE converts D-ribulose 5-phosphate into D-xylulose 5-phosphate in Calvin's reductive pentose phosphate cycle. It has been found in a wide range of bacteria, archaebacteria, fungi and plants. [Energy metabolism, Pentose phosphate pathway]	210
273476	TIGR01164	rplP_bact	ribosomal protein L16, bacterial/organelle. This model describes bacterial and organellar ribosomal protein L16. The homologous protein of the eukaryotic cytosol is designated L10 [Protein synthesis, Ribosomal proteins: synthesis and modification]	125
273477	TIGR01165	cbiN	cobalt transport protein. This model describes the cobalt transporter in bacteria and its equivalents in archaea. It principally functions in the ion uptake mechanism. It is a multisubunit transporter with two integral membrane proteins and two closely associated cytoplasmic subunits. This transporter belongs to the ABC transporter superfamily (ABC stands for ATP Binding Cassette). This superfamily includes two groups, one which catalyzes the uptake of small molecules, including ions, from the external milieu and the other which is engaged in the efflux of small molecular weight compounds and ions from within the cell. Energy derived from the hydrolysis of ATP drives both uptake and efflux. [Transport and binding proteins, Cations and iron carrying compounds]	91
130234	TIGR01166	cbiO	cobalt transport protein ATP-binding subunit. This model describes the ATP binding subunit of the multisubunit cobalt transporter in bacteria and its equivalents in archaea. The model is restricted to the ATP binding subunit that is a part of the cobalt transporter, which belongs to the ABC transporter superfamily (ATP Binding Cassette). The model excludes ATP binding subunits that are associated with other transporters belonging to the ABC transporter superfamily. This superfamily includes two groups, one which catalyzes the uptake of small molecules, including ions, from the external milieu and the other which is engaged in the efflux of small molecular weight compounds and ions from within the cell. Energy derived from the hydrolysis of ATP drives both uptake and efflux. [Transport and binding proteins, Cations and iron carrying compounds]	190
273478	TIGR01167	LPXTG_anchor	LPXTG-motif cell wall anchor domain. This model describes the LPXTG motif-containing region found at the C-terminus of many surface proteins of Streptococcus and Streptomyces species. Cleavage between the Thr and Gly by sortase or a related enzyme leads to covalent anchoring at the new C-terminal Thr to the cell wall. Hits that do not lie at the C-terminus or are not found in Gram-positive bacteria are probably false-positive. A common feature of proteins containing this domain appears to be a high proportion of charged and zwitterionic residues immediately upstream of the LPXTG motif. This model differs from other descriptions of the LPXTG region by including a portion of that upstream charged region. [Cell envelope, Other]	34
273479	TIGR01168	YSIRK_signal	Gram-positive signal peptide, YSIRK family. Many surface proteins found in Streptococcus, Staphylococcus, and related lineages share apparently homologous signal sequences. A motif resembling [YF]SIRKxxxGxxS[VIA] appears at the start of the transmembrane domain. The GxxS motif appears perfectly conserved, suggesting a specific function and not just homology. There is a strong correlation between proteins carrying this region at the N-terminus and those carrying the Gram-positive anchor domain with the LPXTG sortase processing site at the C-terminus.	39
211630	TIGR01169	rplA_bact	ribosomal protein L1, bacterial/chloroplast. This model describes bacterial (and chloroplast) ribosomal protein L1. The apparent mitochondrial L1 is sufficiently diverged to be the subject of a separate model. [Protein synthesis, Ribosomal proteins: synthesis and modification]	227
273480	TIGR01170	rplA_mito	ribosomal protein uL1, mitochondrial. This model represents the mitochondrial homolog of bacterial ribosomal protein L1. Unlike chloroplast L1, this form was not sufficiently similar to bacterial forms to include in a single bacterial/organellar L1 model. [Protein synthesis, Ribosomal proteins: synthesis and modification]	141
273481	TIGR01171	rplB_bact	ribosomal protein L2, bacterial/organellar. This model distinguishes bacterial and organellar ribosomal protein L2 from its counterparts in the archaea and in the eukaryotic cytosol. Plant mitochondrial examples tend to have long, variable inserts. [Protein synthesis, Ribosomal proteins: synthesis and modification]	273
200082	TIGR01172	cysE	serine O-acetyltransferase. Cysteine biosynthesis [Amino acid biosynthesis, Serine family]	162
273482	TIGR01173	glmU	UDP-N-acetylglucosamine diphosphorylase/glucosamine-1-phosphate N-acetyltransferase. This protein is a bifunctional enzyme, GlmU, which catalyzes the last two reactions in the four-step pathway of UDP-N-acetylglucosamine biosynthesis from fructose-6-phosphate. Its reaction product is required for peptidoglycan biosynthesis, LPS biosynthesis in species with LPS, and certain other processes. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan, Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides, Central intermediary metabolism, Amino sugars]	451
273483	TIGR01174	ftsA	cell division protein FtsA. This bacterial cell division protein interacts with FtsZ, the bacterial homolog of tubulin. It is an ATP-binding protein and shows structural similarities to actin and heat shock cognate protein 70. [Cellular processes, Cell division]	371
273484	TIGR01175	pilM	type IV pilus assembly protein PilM. This protein is required for the assembly of the type IV fimbria in Pseudomonas aeruginosa responsible for twitching motility, and for a similar pilus-like structure in Synechocystis. It is also found in species such as Deinococcus described as having natural transformation (for which a type IV pilus-like structure is proposed) but not fimbria.	348
273485	TIGR01176	fum_red_Fp	fumarate reductase (quinol), flavoprotein subunit. The terms succinate dehydrogenase and fumarate reductase may be used interchangeably in certain systems. However, a number of species have distinct complexes, with the fumarate reductase active under anaerobic conditions. This model represents the fumarate reductase flavoprotein subunit from several such species in which a distinct succinate dehydrogenase is also found. Not all bona fide fumarate reductases will be found by this model.	580
273486	TIGR01177	TIGR01177	putative methyltransferase, TIGR01177 family. This family of probable methyltransferases is found exclusively in the Archaea. [Hypothetical proteins, Conserved]	329
130246	TIGR01178	ade	adenine deaminase. The family described by this model includes an experimentally characterized adenine deaminase of Bacillus subtilis. It also includes a member from Methanobacterium thermoautotrophicum, in which adenine deaminase activity has been detected. [Purines, pyrimidines, nucleosides, and nucleotides, Salvage of nucleosides and nucleotides]	552
273487	TIGR01179	galE	UDP-glucose-4-epimerase GalE. Alternate name: UDPgalactose 4-epimerase. This enzyme interconverts UDP-glucose and UDP-galactose. A set of related proteins, some of which are tentatively identified as UDP-glucose-4-epimerase in Thermotoga maritima, Bacillus halodurans, and several archaea, but deeply branched from this set and lacking experimental evidence, are excluded from this model and described by a separate model. [Energy metabolism, Sugars]	328
273488	TIGR01180	aman2_put	alpha-1,2-mannosidase, putative. The identification of members of this family as putative alpha-1,2-mannosidases is based on an unpublished characterization of the aman2 gene in Bacillus sp. M-90 by Maruyama,Y., Nakajima,M. and Nakajima,T. (Genbank accession BAA76709, pid g4587313). Most members of this family appear to have signal sequences. Members from the dental pathogen Porphyromonas gingivalis have been described as immunoreactive with periodontitis patient serum. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	750
273489	TIGR01181	dTDP_gluc_dehyt	dTDP-glucose 4,6-dehydratase. This protein is related to UDP-glucose 4-epimerase (GalE) and likewise has an NAD cofactor. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	317
273490	TIGR01182	eda	Entner-Doudoroff aldolase. 2-dehydro-3-deoxyphosphogluconate aldolase (EC 4.1.2.14) is an enzyme of the Entner-Doudoroff pathway. This aldolase has another function, 4-hydroxy-2-oxoglutarate aldolase (EC 4.1.3.16), shown experimentally in Escherichia coli and Pseudomonas putida. [Amino acid biosynthesis, Glutamate family, Energy metabolism, Entner-Doudoroff]	204
130251	TIGR01183	ntrB	nitrate ABC transporter, permease protein. This model describes the nitrate transport permease in bacteria. This is the gene product of ntrB. The nitrate transport permease is the integral membrane component of the nitrate transport system and belongs to the ATP-binding cassette (ABC) superfamily. At least in photosynthetic bacteria, nitrate assimilation is aided by other proteins derived from the operon, which among others include products of ntrA, ntrB, ntrC, ntrD, narB. Functionally, ntrC and ntrD resemble the ATP binding components of the binding protein-dependent transport systems. Mutational studies have shown that ntrB and ntrC are mandatory for nitrate accumulation. Nitrate reductase is encoded by narB. [Transport and binding proteins, Anions]	202
130252	TIGR01184	ntrCD	nitrate transport ATP-binding subunits C and D. This model describes the ATP binding subunits of nitrate transport in bacteria and archaea. This protein belongs to the ATP-binding cassette (ABC) superfamily. It is thought that the two subunits encoded by ntrC and ntrD form the binding surface for interaction with ATP. This model is restricted to identifying the ATP binding subunit associated with nitrate transport. Nitrate assimilation is aided by other proteins derived from the operon, which among others include products of ntrA - a regulatory protein; ntrB - a hydrophobic transmembrane permease and narB - a reductase. [Transport and binding proteins, Anions, Transport and binding proteins, Other]	230
130253	TIGR01185	devC	DevC protein. This model describes a predicted membrane subunit, DevC, of an ABC transporter known so far from two species of cyanobacteria. Some experimental data from mutational analysis suggest that this protein along with DevA and DevB encoded in the same operon may be involved in the transport/export of glycolipids. [Transport and binding proteins, Other]	380
130254	TIGR01186	proV	glycine betaine/L-proline transport ATP binding subunit. This model describes the glycine betaine/L-proline ATP binding subunit in bacteria and its equivalents in archaea. This transport system belongs to the larger ATP-Binding Cassette (ABC) transporter superfamily. The characteristic feature of these transporters is the obligatory coupling of ATP hydrolysis to substrate translocation. The minimal configuration of a bacterial ABC transport system: an ATPase or ATP binding subunit; an integral membrane protein; a hydrophilic polypeptide, which likely functions as substrate binding protein. Functionally, this transport system is involved in osmoregulation. Under conditions of stress, the organism recruits these transport systems to accumulate glycine betaine and other solutes which offer osmo-protection. It has been demonstrated that glycine betaine uptake is accompanied by symport with sodium ions. The locus has been named variously as proU or opuA. A gene library from L. lactis functionally complements an E. coli proU mutant. The complementing locus is similar to an opuA locus in B. subtilis. This clarifies the differences in nomenclature. [Transport and binding proteins, Amino acids, peptides and amines]	363
162242	TIGR01187	potA	spermidine/putrescine ABC transporter ATP-binding subunit. This model describes spermidine/putrescine ABC transporter, ATP binding subunit in bacteria and its equivalents in archaea. This transport system belongs to the larger ATP-Binding Cassette (ABC) transporter superfamily. The characteristic feature of these transporters is the obligatory coupling of ATP hydrolysis to substrate translocation. The minimal configuration of a bacterial ABC transport system: an ATPase or ATP binding subunit; an integral membrane protein; a hydrophilic polypeptide, which likely functions as substrate binding protein. Polyamines like spermidine and putrescine play a vital role in cell proliferation, differentiation, and ion homeostasis. The concentration of polyamines within the cell is regulated by biosynthesis, degradation and transport (uptake and efflux included). [Transport and binding proteins, Amino acids, peptides and amines]	325
130256	TIGR01188	drrA	daunorubicin resistance ABC transporter ATP-binding subunit. This model describes daunorubicin resistance ABC transporter, ATP binding subunit in bacteria and archaea. This model is restricted in its scope to preferentially recognize the ATP binding subunit associated with efflux of the drug daunorubicin. This transport system belongs to the larger ATP-Binding Cassette (ABC) transporter superfamily. The characteristic feature of these transporters is the obligatory coupling of ATP hydrolysis to substrate translocation. The minimal configuration of a bacterial ABC transport system: an ATPase or ATP binding subunit; an integral membrane protein; a hydrophilic polypeptide, which likely functions as substrate binding protein. In eukaryotes, proteins of similar function include P-glycoproteins, multidrug resistance protein etc. [Transport and binding proteins, Other]	302
273491	TIGR01189	ccmA	heme ABC exporter, ATP-binding protein CcmA. This model describes the cyt c biogenesis protein encoded by ccmA in bacteria. An exception is an Arabidopsis protein. Quite likely this is encoded by an organelle. Bacterial c-type cytochromes are located on the periplasmic side of the cytoplasmic membrane. Several gene products encoded in a locus designated as 'ccm' are implicated in the transport and assembly of the functional cytochrome c. This cluster includes genes: ccmA;B;C;D;E;F;G and H. The posttranslational pathway includes the transport of heme moiety, the secretion of the apoprotein and the covalent attachment of the heme with the apoprotein. The proteins ccmA and B represent an ABC transporter; ccmC and D participate in heme transfer to ccmE, which functions as a periplasmic heme chaperone. The presence of ccmF, G and H is suggested to be obligatory for the final functional assembly of cytochrome c. [Protein fate, Protein and peptide secretion and trafficking, Transport and binding proteins, Other]	198
200083	TIGR01190	ccmB	heme exporter protein CcmB. This model describes the cyt c biogenesis protein encoded by ccmB in bacteria. Bacterial c-type cytochromes are located on the periplasmic side of the cytoplasmic membrane. Several gene products encoded in a locus designated as 'ccm' are implicated in the transport and assembly of the functional cytochrome c. This cluster includes genes: ccmA;B;C;D;E;F;G and H. The posttranslational pathway includes the transport of the heme moiety, the secretion of the apoprotein and the covalent attachment of the heme to the apoprotein. The proteins ccmA and B represent an ABC transporter; ccmC and D participate in heme transfer to ccmE, which functions as a periplasmic heme chaperone. The presence of ccmF, G and H is suggested to be obligatory for the final functional assembly of cytochrome c. [Protein fate, Protein and peptide secretion and trafficking, Transport and binding proteins, Other]	211
273492	TIGR01191	ccmC	heme exporter protein CcmC. This model describes the cyt c biogenesis protein encoded by ccmC in bacteria. Note that plant proteins from Arabidopsis, Triticum and Pisum were recognizable in the clade. Quite likely they are of organellar origin. Bacterial c-type cytochromes are located on the periplasmic side of the cytoplasmic membrane. Several gene products encoded in a locus designated as 'ccm' are implicated in the transport and assembly of the functional cytochrome c. This cluster includes genes: ccmA;B;C;D;E;F;G and H. The posttranslational pathway includes the transport of the heme moiety, the secretion of the apoprotein and the covalent attachment of the heme to the apoprotein. The proteins ccmA and B represent an ABC transporter; ccmC and D participate in heme transfer to ccmE, which functions as a periplasmic heme chaperone. The presence of ccmF, G and H is suggested to be obligatory for the final functional assembly of cytochrome c. [Protein fate, Protein and peptide secretion and trafficking, Transport and binding proteins, Other]	184
130260	TIGR01192	chvA	glucan exporter ATP-binding protein. This model describes the glucan exporter ATP-binding protein in bacteria. It belongs to the larger ABC transporter superfamily with the characteristic ATP-binding motif. In general, this protein is implicated in osmoregulation and is suggested to participate in the export of glucan from the cytoplasm to the periplasm. The cyclic beta-1,2-glucan in the bacterial periplasmic space is suggested to confer the property of high osmolarity. It has also been demonstrated that mutants in this locus have lost virulence and motility functions. It is unclear how virulence and osmoadaptation are related. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	585
130261	TIGR01193	bacteriocin_ABC	ABC-type bacteriocin transporter. This model describes the ABC-type bacteriocin transporter. The amino-terminal domain (pfam03412) processes the N-terminal leader peptide from the bacteriocin, while the C-terminal domains resemble the ABC transporter membrane protein and ATP-binding cassette domain. In general, bacteriocins are agents that kill or inhibit closely related species or even different strains of the same species. Bacteriocins are usually encoded by bacterial plasmids. Bacteriocins are named after the producing species, and hence one encounters various names in the literature, e.g., leucocin from Leuconostoc gelidum; pediocin from Pediococcus acidilactici; sakacin from Lactobacillus sake, etc. [Protein fate, Protein and peptide secretion and trafficking, Protein fate, Protein modification and repair, Transport and binding proteins, Other]	708
130262	TIGR01194	cyc_pep_trnsptr	cyclic peptide transporter. This model describes the cyclic peptide transporter in bacteria. Bacteria have elaborate pathways for the production of toxins and secondary metabolites. Many such compounds, including syringomycin and pyoverdine, are synthesized on non-ribosomal templates consisting of a multienzyme complex. On several occasions the proteins of the complex and the transporter protein are present on the same operon. Often these compounds cross the biological membrane by specific transporters. Syringomycin is an amphipathic, cyclic lipodepsipeptide that, when inserted into the host, causes formation of channels permeable to a variety of cations. On the other hand, pyoverdine is a cyclic octa-peptidyl dihydroxyquinoline, which is efficient in sequestering iron for uptake. [Transport and binding proteins, Amino acids, peptides and amines, Transport and binding proteins, Other]	555
273493	TIGR01195	oadG_fam	sodium pump decarboxylases, gamma subunit. This model finds the subfamily of distantly related, low complexity, hydrophobic small subunits of several related sodium ion-pumping decarboxylases. These include oxaloacetate decarboxylase gamma subunit and methylmalonyl-CoA decarboxylase delta subunit. Most sequences scoring between the noise and trusted cutoffs are eukaryotic sodium channel proteins.	82
130264	TIGR01196	edd	6-phosphogluconate dehydratase. A close homolog, designated MocB (mannityl opine catabolism), is found in a mannopine catabolism region of a plasmid of Agrobacterium tumefaciens. However, it is not essential for mannopine catabolism and branches within the cluster of 6-phosphogluconate dehydratases (with a short branch length) in a tree rooted by the presence of other dehydratases. It may represent an authentic 6-phosphogluconate dehydratase, redundant with the chromosomal copy shown to exist in plasmid-cured strains. This model includes mocB above the trusted cutoff, although the designation is somewhat tenuous. [Energy metabolism, Entner-Doudoroff]	601
162246	TIGR01197	nramp	NRAMP (natural resistance-associated macrophage protein) metal ion transporters. This model describes the Nramp metal ion transporter family. Historically, in mammals these proteins have been functionally characterized as proteins involved in host pathogen resistance, hence the name NRAMP. At least two isoforms, Nramp1 and Nramp2, have been identified. However, the exact mechanism of pathogen resistance was unclear until it was demonstrated by expression cloning and electrophysiological techniques that this protein is a metal ion transporter. It was also independently demonstrated that a microcytic anemia (mk) locus in mouse encodes a metal ion transporter (DCT1 or Nramp2). The transporter has a broad range of substrate specificity that includes Fe+2, Zn+2, Mn+2, Co+2, Cd+2, Cu+2, Ni+2 and Pb+2. The uptake of these metal ions is coupled to proton symport. Metal ions are essential cofactors in a number of biological processes, including oxidative phosphorylation, gene regulation and metal ion homeostasis. Nramp1 could confer resistance to infection in one of two ways: (1) the uptake of Fe+2 can produce toxic hydroxyl radicals via the Fenton reaction, killing the pathogens in phagosomes, or (2) it can deplete the metal ion pools in the phagosome and deprive the pathogens of metal ions, which are critical for their survival. [Transport and binding proteins, Cations and iron carrying compounds]	390
273494	TIGR01198	pgl	6-phosphogluconolactonase. This enzyme of the pentose phosphate pathway is often found as a part of a multifunctional protein with [Energy metabolism, Pentose phosphate pathway]	233
273495	TIGR01200	GLPGLI	GLPGLI family protein. This protein family was first noted as a paralogous set in Porphyromonas gingivalis, but it is more widely distributed among the Bacteroidetes. The protein family is now renamed GLPGLI after its best-conserved motif.	226
273496	TIGR01201	HU_rel	DNA-binding protein, histone-like, putative. This model describes a set of proteins related to but longer than DNA-binding protein HU. Its distinctive domain architecture compared to HU and related histone-like DNA-binding proteins justifies the designation as superfamily. Members include, so far, one from Bacteroides fragilis, a gut bacterium, and ten from Porphyromonas gingivalis, an oral anaerobe. [DNA metabolism, Chromosome-associated proteins]	145
130269	TIGR01202	bchC	2-desacetyl-2-hydroxyethyl bacteriochlorophyllide A dehydrogenase. [Biosynthesis of cofactors, prosthetic groups, and carriers, Chlorophyll and bacteriochlorphyll]	308
273497	TIGR01203	HGPRTase	hypoxanthine phosphoribosyltransferase. Alternate name: hypoxanthine-guanine phosphoribosyltransferase. Sequence differences as small as a single residue can affect whether members of this family act on hypoxanthine and guanine or hypoxanthine only. The designation of this model as equivalog reflects hypoxanthine specificity and does not reflect whether or not guanine can replace hypoxanthine. [Purines, pyrimidines, nucleosides, and nucleotides, Salvage of nucleosides and nucleotides]	166
130271	TIGR01204	bioW	6-carboxyhexanoate--CoA ligase. Alternate name: pimeloyl-CoA synthase. [Biosynthesis of cofactors, prosthetic groups, and carriers, Biotin]	232
273498	TIGR01205	D_ala_D_alaTIGR	D-alanine--D-alanine ligase. This model describes D-Ala--D-Ala ligase, an enzyme that makes a required precursor of the bacterial cell wall. It also describes some closely related proteins responsible for resistance to glycopeptide antibiotics such as vancomycin. The mechanism of glycopeptide antibiotic resistance involves the production of D-alanine-D-lactate (VanA and VanB families) or D-alanine-D-serine (VanC). The seed alignment contains only chromosomally encoded D-ala--D-ala ligases, but a number of antibiotic resistance proteins score above the trusted cutoff of this model. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan]	315
273499	TIGR01206	lysW	lysine biosynthesis protein LysW. This very small, poorly characterized protein has been shown essential in Thermus thermophilus for an unusual pathway of Lys biosynthesis from aspartate by way of alpha-aminoadipate (AAA) rather than diaminopimelate. It is found also in Deinococcus radiodurans and Pyrococcus horikoshii, which appear to share the AAA pathway. [Amino acid biosynthesis, Aspartate family]	54
130274	TIGR01207	rmlA	glucose-1-phosphate thymidylyltransferase, short form. Alternate name: dTDP-D-glucose synthase homotetramer This model describes a tightly conserved but broadly distributed subfamily (here designated as short form) of known and putative bacterial glucose-1-phosphate thymidylyltransferases. It is well characterized in several species as the first of four enzymes involved in the biosynthesis of dTDP-L-rhamnose, a cell wall constituent and a feedback inhibitor of the enzyme. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	286
273500	TIGR01208	rmlA_long	glucose-1-phosphate thymidylyltransferase, long form. The family of known and putative glucose-1-phosphate thymidylyltransferases (also called dTDP-glucose synthase) shows a deep split into a short form (see TIGR01207) and a long form described by this model. The homotetrameric short form is found in numerous bacterial species that incorporate dTDP-L-rhamnose, which it helps synthesize, into the cell wall. It is subject to feedback inhibition. This form, in contrast, is found in many species for which it serves as a sugar-activating enzyme for antibiotic biosynthesis and/or other, unknown pathways, and in which dTDP-L-rhamnose is not necessarily produced. Alternate name: dTDP-D-glucose synthase	353
273501	TIGR01209	TIGR01209	RNA ligase, Pab1020 family. Members of this family are found, so far, in a single copy per genome and largely in thermophiles, of which only Aquifex aeolicus is bacterial rather than archaeal. PSI-BLAST converges after a single iteration to the whole of this family and reveals no convincing similarity to any other protein. The member protein Pab1020 has been characterized as an RNA ligase with circularization activity. [Transcription, RNA processing]	374
273502	TIGR01210	TIGR01210	radical SAM enzyme, TIGR01210 family. This family of exclusively archaeal radical SAM enzymes has no characterized close homologs. [Hypothetical proteins, Conserved]	313
273503	TIGR01211	ELP3	radical SAM enzyme/protein acetyltransferase, ELP3 family. This family includes elongator complex protein 3 (ELP3) from eukaryotes and related proteins from other lineages. ELP3 is a component of the RNA polymerase II holoenzyme. It has an N-terminal radical SAM domain and C-terminal GNAT acetyltransferase domain. Members of this family are found in eukaryotes, archaea, and a few bacteria (e.g. Atopobium sp). The activity discovered first was an acetyltransferase modification at the N-termini of all four core histones, shown in vitro in eukaryotes. More recently, the radical SAM domain was shown to play a role in zygotic paternal genome demethylation. Family TIGR01212, widespread in prokaryotes, lacks the GNAT acetyltransferase domain but shares extensive sequence similarity with this family (TIGR01211). [Transcription, DNA-dependent RNA polymerase]	522
130279	TIGR01212	TIGR01212	radical SAM protein, TIGR01212 family. Members of this family are apparent radical-SAM enzymes, related to the N-terminal region of the bifunctional ELP3, whose C-terminal region is part of the elongator complex and appears to acetylate histones and other proteins. ELP3 binds S-adenosylmethionine (SAM) and was recently shown to be involved in a DNA demethylation process in eukaryotes. Close sequence similarity of this family (which lacks the GNAT family acetyltransferase domain) to the ELP3 N-terminal region and a strong match to pfam04055 support identification of this family as radical SAM despite the atypical spacing between the first and second Cys residues in the 4Fe4S-binding motif. [Unknown function, Enzymes of unknown specificity]	302
273504	TIGR01213	pseudo_Pus10arc	tRNA pseudouridine(54/55) synthase. Members of this family show twilight-zone similarity to several predicted RNA pseudouridine synthases. All trusted members of this family are archaeal. Several eukaryotic homologs lack N-terminal homology including two CXXC motifs. [Hypothetical proteins, Conserved]	388
273505	TIGR01214	rmlD	dTDP-4-dehydrorhamnose reductase. This enzyme catalyzes the last of 4 steps in making dTDP-rhamnose, a precursor of LPS core antigen, O-antigen, etc. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	287
188120	TIGR01215	minE	cell division topological specificity factor MinE. This protein is involved in the process of cell division. It prevents the proteins MinC and MinD from inhibiting cell division at internal sites, but allows inhibition at polar sites. This allows for correct cell division at the proper sites. [Cellular processes, Cell division]	81
273506	TIGR01216	ATP_synt_epsi	ATP synthase, F1 epsilon subunit (delta in mitochondria). This model describes one of the five types of subunits in the F1 part of F1/F0 ATP synthases. Members of this family are designated epsilon in bacterial and chloroplast systems but designated delta in mitochondria, where the counterpart of the bacterial delta subunit is designated OSCP. In a few cases (Propionigenium modestum, Acetobacterium woodii) scoring above the trusted cutoff and designated here as exceptions, Na+ replaces H+ for translocation. [Energy metabolism, ATP-proton motive force interconversion]	130
273507	TIGR01217	ac_ac_CoA_syn	acetoacetyl-CoA synthase. This enzyme catalyzes the first step of the mevalonate pathway of IPP biosynthesis. Most bacteria do not use this pathway, but rather the deoxyxylulose pathway. [Central intermediary metabolism, Other]	652
273508	TIGR01218	Gpos_tandem_5TM	tandem five-transmembrane protein. Members of this family of proteins, with average length of 210, have no invariant residues but five predicted transmembrane segments. Strangely, most members occur in groups of consecutive paralogous genes. A striking example is a set of eleven encoded consecutively, head-to-tail, in Staphylococcus aureus strain COL.	207
273509	TIGR01219	Pmev_kin_ERG8	phosphomevalonate kinase, ERG8-type, eukaryotic branch. This enzyme is part of the mevalonate pathway, one of two alternative pathways for the biosynthesis of IPP. In an example of nonorthologous gene displacement, two different types of phosphomevalonate kinase are found - the animal type and this ERG8 type. This model represents plant and fungal forms of the ERG8 type of phosphomevalonate kinase. [Central intermediary metabolism, Other]	454
130287	TIGR01220	Pmev_kin_Gr_pos	phosphomevalonate kinase, ERG8-type, Gram-positive branch. This enzyme is part of the mevalonate pathway, one of two alternative pathways for the biosynthesis of IPP. In an example of nonorthologous gene displacement, two different types of phosphomevalonate kinase are found - the animal type and this ERG8 type. This model represents the low GC Gram-positive organism forms of the ERG8 type of phosphomevalonate kinase. [Central intermediary metabolism, Other]	358
273510	TIGR01221	rmlC	dTDP-4-dehydrorhamnose 3,5-epimerase. This enzyme participates in the biosynthesis of dTDP-L-rhamnose, often as a precursor to LPS O-antigen [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	176
273511	TIGR01222	minC	septum site-determining protein MinC. The MinC protein assists in correct placement of the septum for cell division by inhibiting septum formation at other sites. Homologs from Deinococcus, Synechocystis PCC 6803, and Helicobacter pylori do not hit the full length of the model and score between the trusted and noise cutoffs. [Cellular processes, Cell division]	217
130290	TIGR01223	Pmev_kin_anim	phosphomevalonate kinase, animal type. This enzyme is part of the mevalonate pathway, one of two alternative pathways for the biosynthesis of IPP. In an example of nonorthologous gene displacement, two different types of phosphomevalonate kinase are found. One is this type, found in animals. The other is the ERG8 type, found in plants and fungi (TIGR01219) and in Gram-positive bacteria (TIGR01220). [Central intermediary metabolism, Other]	182
273512	TIGR01224	hutI	imidazolonepropionase. This enzyme catalyzes the third step in histidine degradation. [Energy metabolism, Amino acids and amines]	377
200086	TIGR01225	hutH	histidine ammonia-lyase. This enzyme deaminates histidine to urocanic acid, the first step in histidine degradation. It is closely related to phenylalanine ammonia-lyase. [Energy metabolism, Amino acids and amines]	506
130293	TIGR01226	phe_am_lyase	phenylalanine ammonia-lyase. Members of this subfamily of MIO prosthetic group enzymes are phenylalanine ammonia-lyases. They are found, so far, in plants and fungi. From phenylalanine, this enzyme yields cinnamic acid, a precursor of many important plant compounds. This protein shows extensive homology to histidine ammonia-lyase, the first enzyme of a histidine degradation pathway. Note that members of this family from plant species that synthesize taxol are actually phenylalanine aminomutases, and are covered by exception model TIGR04473.	680
273513	TIGR01227	hutG	formimidoylglutamase. Formiminoglutamase, the fourth enzyme of histidine degradation, is similar to arginases and agmatinases. It is often encoded near other enzymes of the histidine degradation pathway: histidine ammonia-lyase, urocanate hydratase, and imidazolonepropionase. [Energy metabolism, Amino acids and amines]	307
130295	TIGR01228	hutU	urocanate hydratase. This model represents the second of four enzymes involved in the degradation of histidine to glutamate. [Energy metabolism, Amino acids and amines]	545
162262	TIGR01229	rocF_arginase	arginase. This model helps resolve arginases from known and putative agmatinases, formiminoglutamases, and other related proteins of unknown specificity. The pathway from arginine to the polyamine putrescine may proceed by hydrolysis to remove urea (arginase) followed by decarboxylation (ornithine decarboxylase), or by decarboxylation first (arginine decarboxylase) followed by removal of urea (agmatinase).	300
273514	TIGR01230	agmatinase	agmatinase. Members of this family include known and predicted examples of agmatinase (agmatine ureohydrolase). The seed includes members of archaea, for which no definitive agmatinase sequence has yet been made available. However, archaeal sequences are phylogenetically close to the experimentally verified B. subtilis sequence. One species of Halobacterium has been demonstrated in vitro to produce agmatine from arginine, but no putrescine from ornithine, suggesting that arginine decarboxylase and agmatinase, rather than arginase and ornithine decarboxylase, lead from Arg to polyamine biosynthesis. Note: a history of early misannotation of members of this family is detailed in PUBMED:10931887.	275
273515	TIGR01231	lacC	tagatose-6-phosphate kinase. This enzyme is part of the tagatose-6-phosphate pathway of lactose degradation. [Energy metabolism, Biosynthesis and degradation of polysaccharides]	310
130299	TIGR01232	lacD	tagatose 1,6-diphosphate aldolase. This family consists of Gram-positive proteins. Tagatose 1,6-diphosphate aldolase is part of the tagatose-6-phosphate pathway of galactose-6-phosphate degradation. [Energy metabolism, Biosynthesis and degradation of polysaccharides]	325
273516	TIGR01233	lacG	6-phospho-beta-galactosidase. This enzyme is part of the tagatose-6-phosphate pathway of galactose-6-phosphate degradation. [Energy metabolism, Biosynthesis and degradation of polysaccharides]	467
130301	TIGR01234	L-ribulokinase	ribulokinase. This enzyme catalyzes the second step in arabinose catabolism. The most closely related protein subfamily outside the scope of this model includes ribitol kinase from E. coli. [Energy metabolism, Sugars]	536
130302	TIGR01235	pyruv_carbox	pyruvate carboxylase. This enzyme plays a role in gluconeogenesis but not glycolysis. [Energy metabolism, Glycolysis/gluconeogenesis]	1143
273517	TIGR01236	D1pyr5carbox1	delta-1-pyrroline-5-carboxylate dehydrogenase, group 1. This model represents one of two related branches of delta-1-pyrroline-5-carboxylate dehydrogenase. The two branches are not as closely related to each other as some aldehyde dehydrogenases are to this branch, and separate models are built for this reason. The enzyme is the second of two in the degradation of proline to glutamate. [Energy metabolism, Amino acids and amines]	532
200087	TIGR01237	D1pyr5carbox2	delta-1-pyrroline-5-carboxylate dehydrogenase, group 2, putative. This enzyme is the second of two in the degradation of proline to glutamate. This model represents one of several related branches of delta-1-pyrroline-5-carboxylate dehydrogenase. Members of this branch may be associated with proline dehydrogenase (the other enzyme of the pathway from proline to glutamate) but have not been demonstrated experimentally. The branches are not as closely related to each other as some distinct aldehyde dehydrogenases are to some; separate models were built to let each model describe a set of equivalogs. [Energy metabolism, Amino acids and amines]	511
273518	TIGR01238	D1pyr5carbox3	delta-1-pyrroline-5-carboxylate dehydrogenase (PutA C-terminal domain). This model represents one of several related branches of delta-1-pyrroline-5-carboxylate dehydrogenase. Members of this branch are the C-terminal domain of the PutA bifunctional proline dehydrogenase / delta-1-pyrroline-5-carboxylate dehydrogenase. [Energy metabolism, Amino acids and amines]	500
273519	TIGR01239	galT_2	galactose-1-phosphate uridylyltransferase, family 2. This enzyme is involved in glucose and galactose interconversion. This model describes one of two extremely distantly related branches of the model pfam01087 from Pfam. [Energy metabolism, Sugars]	489
130307	TIGR01240	mevDPdecarb	diphosphomevalonate decarboxylase. This enzyme catalyzes the last step in the synthesis of isopentenyl diphosphate (IPP) in the mevalonate pathway. Alternate names: mevalonate diphosphate decarboxylase; pyrophosphomevalonate decarboxylase [Central intermediary metabolism, Other]	305
273520	TIGR01241	FtsH_fam	ATP-dependent metalloprotease FtsH. HflB(FtsH) is a pleiotropic protein required for correct cell division in bacteria. It has ATP-dependent zinc metalloprotease activity. It was formerly designated cell division protein FtsH. [Cellular processes, Cell division, Protein fate, Degradation of proteins, peptides, and glycopeptides]	495
130309	TIGR01242	26Sp45	26S proteasome subunit P45 family. Many proteins may score above the trusted cutoff because an internal	364
273521	TIGR01243	CDC48	AAA family ATPase, CDC48 subfamily. This subfamily of the AAA family ATPases includes two members each from three archaeal species. It also includes yeast CDC48 (cell division control protein 48) and the human ortholog, transitional endoplasmic reticulum ATPase (valosin-containing protein). These proteins in eukaryotes are involved in the budding and transfer of membrane from the transitional endoplasmic reticulum to the Golgi apparatus.	733
130311	TIGR01244	TIGR01244	TIGR01244 family protein. No member of this family is characterized. The member from Xylella fastidiosa is a longer protein with an N-terminal region described by this model, followed by a metallo-beta-lactamase family domain and an additional C-terminal region. Members scoring above the trusted cutoff are limited to the proteobacteria. [Hypothetical proteins, Conserved]	135
273522	TIGR01245	trpD	anthranilate phosphoribosyltransferase. In many widely different species, including E. coli, Thermotoga maritima, and Archaeoglobus fulgidus, this enzymatic domain (anthranilate phosphoribosyltransferase) is found C-terminal to glutamine amidotransferase; the fusion protein is designated anthranilate synthase component II (EC 4.1.3.27) [Amino acid biosynthesis, Aromatic amino acid family]	330
162269	TIGR01246	dapE_proteo	succinyl-diaminopimelate desuccinylase, proteobacterial clade. This model describes a proteobacterial subset of succinyl-diaminopimelate desuccinylases. An experimentally confirmed Gram-positive lineage succinyl-diaminopimelate desuccinylase has been described for Corynebacterium glutamicum (SP:Q59284), and a neighbor-joining tree shows the seed members, SP:Q59284, and putative archaeal members such as TrEMBL:O58003 in a single clade. However, the archaeal members differ substantially, share a number of motifs with acetylornithine deacetylases rather than succinyl-diaminopimelate desuccinylases, and are not taken as trusted examples of succinyl-diaminopimelate desuccinylases. This model is limited to proteobacterial members for this reason. [Amino acid biosynthesis, Aspartate family]	370
130314	TIGR01247	drrB	daunorubicin resistance ABC transporter membrane protein. This model describes the daunorubicin resistance ABC transporter membrane-associated protein in bacteria and archaea. The protein is associated with efflux of the drug daunorubicin. This transport system belongs to the larger ATP-Binding Cassette (ABC) transporter superfamily. The characteristic feature of these transporters is the obligatory coupling of ATP hydrolysis to substrate translocation. The minimal configuration of a bacterial ABC transport system is: an ATPase or ATP-binding subunit; an integral membrane protein; and a hydrophilic polypeptide, which likely functions as the substrate-binding protein. In eukaryotes, proteins of similar function include P-glycoproteins, multidrug resistance protein, etc. [Transport and binding proteins, Other]	236
130315	TIGR01248	drrC	daunorubicin resistance protein C. The model describes daunorubicin resistance protein C in bacteria. This protein confers daunorubicin resistance. The protein seems to share strong sequence similarity with UvrA proteins, which are involved in excision repair of DNA. Disruption of the drrC gene resulted in increased sensitivity upon exposure to daunorubicin. However, drrC failed to complement the UV sensitivity of uvrA mutants. The mechanism by which it confers daunomycin resistance is unclear, but has been suggested to be different from that of DrrA and DrrB, which are antiporters. [Unclassified, Role category not yet assigned]	152
130316	TIGR01249	pro_imino_pep_1	proline iminopeptidase, Neisseria-type subfamily. This model represents one of two related families of proline iminopeptidase in the alpha/beta fold hydrolase family. The fine specificities of the various members, including both the range of short peptides from which proline can be removed and whether other amino acids such as alanine can be also removed, may vary among members.	306
188121	TIGR01250	pro_imino_pep_2	proline-specific peptidase, Bacillus coagulans-type subfamily. This model describes a subfamily of the alpha/beta fold family of hydrolases. Characterized members include prolinases (Pro-Xaa dipeptidase, EC 3.4.13.8), prolyl aminopeptidases (EC 3.4.11.5), and a leucyl aminopeptidase	289
273523	TIGR01251	ribP_PPkin	ribose-phosphate pyrophosphokinase. Alternate name: phosphoribosylpyrophosphate synthetase. In some systems, close homologs lacking enzymatic activity exist and perform regulatory functions. The model is designated subfamily rather than equivalog for this reason. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis]	308
273524	TIGR01252	acetolac_decarb	alpha-acetolactate decarboxylase. Pyruvate can be fermented to 2,3-butanediol. It is first converted to alpha-acetolactate by alpha-acetolactate synthase, then decarboxylated to acetoin by this enzyme. Acetoin can be reduced in some species to 2,3-butanediol by acetoin reductase. [Energy metabolism, Fermentation]	232
130320	TIGR01253	thiP	thiamine ABC transporter, permease protein. The model describes the thiamine ABC transporter permease protein in bacteria. The protein belongs to the larger ABC transport system. It consists of at least three components: the inner membrane permease; a thiamine-binding protein; and an ATP-binding subunit. It has been experimentally demonstrated that mutants in the various steps of the de novo synthesis of thiamine and its biologically active form, thiamine pyrophosphate, can be exogenously supplemented with thiamine, thiamine monophosphate (TMP) or thiamine pyrophosphate (TPP). [Transport and binding proteins, Other]	519
130321	TIGR01254	sfuA	ABC transporter periplasmic binding protein, thiB subfamily. The model describes the thiamine ABC transporter periplasmic protein in bacteria and archaea. The protein belongs to the larger ABC transport system. It consists of at least three components: the thiamine-binding periplasmic protein; an inner membrane permease; and an ATP-binding subunit. It has been experimentally demonstrated that mutants in the various steps of the de novo synthesis of thiamine and its biologically active form, thiamine pyrophosphate, can be exogenously supplemented with thiamine, thiamine monophosphate (TMP) or thiamine pyrophosphate (TPP). [Transport and binding proteins, Other]	304
273525	TIGR01255	pyr_form_ly_1	formate acetyltransferase 1. Alternate names: pyruvate formate-lyase; formate C-acetyltransferase. This enzyme converts formate + acetyl-CoA into pyruvate + CoA. This model describes formate acetyltransferase 1. More distantly related putative formate acetyltransferases have also been identified, including formate acetyltransferase 2 from E. coli, which is excluded from this model. [Energy metabolism, Fermentation]	744
273526	TIGR01256	modA	molybdenum ABC transporter, periplasmic molybdate-binding protein. The model describes the molybdate ABC transporter periplasmic binding protein in bacteria and archaea. The periplasmic receptors constitute a diverse class of binding proteins that differ widely in size, sequence and ligand specificity. It has been shown experimentally by radioactive labeling that ModA represents a hydrophilic periplasmic-binding protein in gram-negative organisms and that its counterpart in gram-positive organisms is a lipoprotein. The other components of the system include ModB, an integral membrane protein, and ModC, the ATP-binding subunit. Almost all of them display a common beta/alpha folding motif and have similar tertiary structures consisting of two globular domains. [Transport and binding proteins, Anions]	216
130324	TIGR01257	rim_protein	retinal-specific rim ABC transporter. This model describes the photoreceptor protein (rim protein) in eukaryotes. It is a member of the ABC transporter superfamily. Rim protein is a membrane glycoprotein which is localized in the photoreceptor outer segment discs. Mutations in its genetic locus are implicated in recessive Stargardt's disease. [Transport and binding proteins, Other]	2272
213596	TIGR01258	pgm_1	phosphoglycerate mutase, BPG-dependent, family 1. Most members of this family are phosphoglycerate mutase (EC 5.4.2.1). This enzyme interconverts 2-phosphoglycerate and 3-phosphoglycerate. The enzyme is transiently phosphorylated on an active site histidine by 2,3-diphosphoglycerate, which is both substrate and product. Some members of this family have phosphoglycerate mutase as a minor activity and act primarily as a bisphosphoglycerate mutase, interconverting 2,3-diphosphoglycerate and 1,3-diphosphoglycerate (EC 5.4.2.4). This model is designated as a subfamily for this reason. The second and third paralogs in S. cerevisiae are somewhat divergent and apparently inactive (see PUBMED:9544241) but are also part of this subfamily phylogenetically.	245
213597	TIGR01259	comE	comEA protein. This model describes the ComEA protein in bacteria. The comE locus is obligatory for bacterial cell competence - the process of internalizing exogenously added DNA. Lesions in the locus have been variously associated with competence-related phenotypes and impairment of competence, suggesting an intimate functional role in bacterial transformation. [Cellular processes, DNA transformation]	120
130327	TIGR01260	ATP_synt_c	ATP synthase, F0 subunit c. This model describes the subunit c in F1/F0-ATP synthase, a membrane associated multisubunit complex found in bacteria and organelles of higher eukaryotes, namely, mitochondria and chloroplast. This enzyme is principally involved in the synthesis of ATP from ADP and inorganic phosphate by coupling the energy derived from the proton electrochemical gradient across the biological membrane. A brief description of this multisubunit enzyme complex: F1 and F0 represent two major clusters of subunits. The functional role of subunit c, which is part of the F0 cluster, has been delineated by in vitro reconstitution experiments. Overall, experimental proof exists demonstrating that the electrochemical gradient is converted into a rotational torque that leads to ATP synthesis. [Energy metabolism, ATP-proton motive force interconversion]	58
130328	TIGR01261	hisB_Nterm	histidinol-phosphatase. This model describes histidinol phosphatase. All known examples in the scope of this model are bifunctional proteins with a histidinol phosphatase domain followed by an imidazoleglycerol-phosphate dehydratase domain. These enzymatic domains catalyze the ninth and seventh steps, respectively, of histidine biosynthesis. [Amino acid biosynthesis, Histidine family]	161
273527	TIGR01262	maiA	maleylacetoacetate isomerase. Maleylacetoacetate isomerase is an enzyme of tyrosine and phenylalanine catabolism. It requires glutathione and belongs by homology to the zeta family of glutathione S-transferases. The enzyme (EC 5.2.1.2) is described as active also on maleylpyruvate, and the example from a Ralstonia sp. catabolic plasmid is described as a maleylpyruvate isomerase involved in gentisate catabolism. [Energy metabolism, Amino acids and amines]	210
273528	TIGR01263	4HPPD	4-hydroxyphenylpyruvate dioxygenase. This protein oxidizes 4-hydroxyphenylpyruvate, a tyrosine and phenylalanine catabolite, to homogentisate. Homogentisate can undergo a further non-enzymatic oxidation and polymerization into brown pigments that protect some bacterial species from light. A similar process occurs spontaneously in blood and is hemolytic. In some bacterial species, this enzyme has been studied as a hemolysin. [Energy metabolism, Amino acids and amines]	352
273529	TIGR01264	tyr_amTase_E	tyrosine aminotransferase, eukaryotic. This model describes tyrosine aminotransferase as found in animals and Trypanosoma cruzi. It is the first enzyme of a pathway of tyrosine degradation via homogentisate. Several plant enzymes designated as probable tyrosine aminotransferases are very closely related to an experimentally demonstrated nicotianamine aminotransferase, an enzyme in a siderophore (iron uptake chelator) biosynthesis pathway. These plant sequences are excluded from the model seed and score between the trusted and noise cutoffs. [Energy metabolism, Amino acids and amines]	401
188123	TIGR01265	tyr_nico_aTase	tyrosine/nicotianamine family aminotransferase. This subfamily of pyridoxal phosphate-dependent enzymes includes known examples of both tyrosine aminotransferase from animals and nicotianamine aminotransferase from barley.	403
162276	TIGR01266	fum_ac_acetase	fumarylacetoacetase. This enzyme catalyzes the final step in the breakdown of tyrosine or phenylalanine to fumarate and acetoacetate. [Energy metabolism, Amino acids and amines]	415
130334	TIGR01267	Phe4hydrox_mono	phenylalanine-4-hydroxylase, monomeric form. This model describes the smaller, monomeric form of phenylalanine-4-hydroxylase, as found in a small number of Gram-negative bacteria. The enzyme irreversibly converts phenylalanine to tyrosine and is known to be the rate-limiting step in phenylalanine catabolism in some systems. This family of biopterin and metal-dependent hydroxylases is related to a family of longer, multimeric aromatic amino acid hydroxylases that have additional N-terminal regulatory sequences. These include tyrosine 3-monooxygenase, phenylalanine-4-hydroxylase, and tryptophan 5-monooxygenase. [Energy metabolism, Amino acids and amines]	248
130335	TIGR01268	Phe4hydrox_tetr	phenylalanine-4-hydroxylase, tetrameric form. This model describes the larger, tetrameric form of phenylalanine-4-hydroxylase, as found in metazoans. The enzyme irreversibly converts phenylalanine to tyrosine and is known to be the rate-limiting step in phenylalanine catabolism in some systems. It is closely related to metazoan tyrosine 3-monooxygenase and tryptophan 5-monooxygenase, and more distantly to monomeric phenylalanine-4-hydroxylases of some Gram-negative bacteria. The member of this family from Drosophila has been described as having both phenylalanine-4-hydroxylase and tryptophan 5-monooxygenase activity. However, a Drosophila member of the tryptophan 5-monooxygenase clade has subsequently been discovered.	436
130336	TIGR01269	Tyr_3_monoox	tyrosine 3-monooxygenase, tetrameric. This model describes tyrosine 3-monooxygenase, a member of the family of tetrameric, biopterin-dependent aromatic amino acid hydroxylases found in metazoans. It is closely related to tetrameric phenylalanine-4-hydroxylase and tryptophan 5-monooxygenase, and more distantly related to the monomeric phenylalanine-4-hydroxylase found in some Gram-negative bacteria.	457
130337	TIGR01270	Trp_5_monoox	tryptophan 5-monooxygenase, tetrameric. This model describes tryptophan 5-monooxygenase, a member of the family of tetrameric, biopterin-dependent aromatic amino acid hydroxylases found in metazoans. It is closely related to tetrameric phenylalanine-4-hydroxylase and tyrosine 3-monooxygenase, and more distantly related to the monomeric phenylalanine-4-hydroxylase found in some Gram-negative bacteria. [Energy metabolism, Amino acids and amines]	464
273530	TIGR01271	CFTR_protein	cystic fibrosis transmembrane conductance regulator (CFTR). The model describes the cystic fibrosis transmembrane conductance regulator (CFTR) in eukaryotes. The principal role of this protein is chloride ion conductance. The protein is predicted to consist of 12 transmembrane domains. Mutations or lesions in the genetic loci have been linked to the aetiology of asthma, bronchiectasis, chronic obstructive pulmonary disease etc. Disease-causing mutations have been studied by 36Cl efflux assays in vitro cell cultures and electrophysiology, all of which point to the impairment of chloride channel stability and not the biosynthetic processing per se. [Transport and binding proteins, Anions]	1490
273531	TIGR01272	gluP	glucose/galactose transporter. This model describes the glucose/galactose transporter in bacteria. This belongs to the larger facilitator superfamily. Disruption of the loci leads to the total loss of glucose or galactose uptake in E. coli. Putative transporters in other bacterial species were isolated by functional complementation, which restored functional activity. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	310
273532	TIGR01273	speA	arginine decarboxylase, biosynthetic. Two alternative pathways can convert arginine to putrescine. One is decarboxylation by this enzyme followed by removal of the urea moiety by agmatinase. In the other, the ureohydrolase (arginase) acts first, followed by ornithine decarboxylase. This pathway leads to spermidine biosynthesis, hence the gene symbol speA. A distinct biodegradative form is also pyridoxal phosphate-dependent but is not similar in sequence. [Central intermediary metabolism, Polyamine biosynthesis]	624
130341	TIGR01274	ACC_deam	1-aminocyclopropane-1-carboxylate deaminase. This pyridoxal phosphate-dependent enzyme degrades 1-aminocyclopropane-1-carboxylate, which in plants is a precursor of the ripening hormone ethylene, to ammonia and alpha-ketobutyrate. This model includes all members of this family for which function has been demonstrated experimentally, but excludes a closely related family often annotated as putative members of this family. [Central intermediary metabolism, Other]	337
273533	TIGR01275	ACC_deam_rel	pyridoxal phosphate-dependent enzymes, D-cysteine desulfhydrase family. This model represents a family of pyridoxal phosphate-dependent enzymes closely related to (and often designated as putative examples of) 1-aminocyclopropane-1-carboxylate deaminase. It appears that members of this family include both D-cysteine desulfhydrase (EC 4.4.1.15) and 1-aminocyclopropane-1-carboxylate deaminase (EC 3.5.99.7).	318
130343	TIGR01276	thiB	thiamine ABC transporter, periplasmic binding protein. This model finds the thiamine (and thiamine pyrophosphate) ABC transporter periplasmic binding protein ThiB in proteobacteria. Completed genomes having this protein (E. coli, Vibrio cholerae, Haemophilus influenzae) also have the permease ThiP, described by TIGRFAMs equivalog model TIGR01253. [Transport and binding proteins, Other]	309
130344	TIGR01277	thiQ	thiamine ABC transporter, ATP-binding protein. This model describes the energy-transducing ATPase subunit ThiQ of the ThiBPQ thiamine (and thiamine pyrophosphate) ABC transporter in several Proteobacteria. This protein is found so far only in Proteobacteria, and is found in complete genomes only if the ThiB and ThiP subunits are also found. [Transport and binding proteins, Other]	213
273534	TIGR01278	DPOR_BchB	light-independent protochlorophyllide reductase, B subunit. Alternate name: dark protochlorophyllide reductase. This model describes the B subunit of the dark form protochlorophyllide reductase, a nitrogenase-like enzyme. This subunit shows homology to the nitrogenase molybdenum-iron protein. It catalyzes a step in bacteriochlorophyll biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Chlorophyll and bacteriochlorphyll]	511
273535	TIGR01279	DPOR_bchN	light-independent protochlorophyllide reductase, N subunit. This enzyme describes the N subunit of the dark form protochlorophyllide reductase, a nitrogenase-like enzyme involved in bacteriochlorophyll biosynthesis. This subunit shows homology to the nitrogenase molybdenum-iron protein NifN. [Biosynthesis of cofactors, prosthetic groups, and carriers, Chlorophyll and bacteriochlorphyll]	407
273536	TIGR01280	xseB	exodeoxyribonuclease VII, small subunit. This protein is the small subunit for exodeoxyribonuclease VII. Exodeoxyribonuclease VII is made of a complex of four small subunits to one large subunit. The complex degrades single-stranded DNA into large acid-insoluble oligonucleotides. These nucleotides are then degraded further into acid-soluble oligonucleotides. [DNA metabolism, Degradation of DNA]	54
130348	TIGR01281	DPOR_bchL	light-independent protochlorophyllide reductase, iron-sulfur ATP-binding protein. The BchL peptide (ChlL in chloroplast and cyanobacteria) is an ATP-binding iron-sulfur protein of the dark form protochlorophyllide reductase, an enzyme similar to nitrogenase. This subunit resembles the nitrogenase NifH subunit. [Biosynthesis of cofactors, prosthetic groups, and carriers, Chlorophyll and bacteriochlorphyll]	268
162284	TIGR01282	nifD	nitrogenase molybdenum-iron protein alpha chain. Nitrogenase consists of alpha (NifD) and beta (NifK) subunits of the molybdenum-iron protein and an ATP-binding iron-sulfur protein (NifH). This model describes a large clade of NifD proteins, but excludes a lineage that contains putative NifD and NifD homologs from species with vanadium-dependent nitrogenases. [Central intermediary metabolism, Nitrogen fixation]	466
188126	TIGR01283	nifE	nitrogenase molybdenum-iron cofactor biosynthesis protein NifE. This protein is part of the NifEN complex involved in biosynthesis of the molybdenum-iron cofactor used by the homologous NifDK complex of nitrogenase. In a few species, the protein is found as a NifEN fusion protein. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other, Central intermediary metabolism, Nitrogen fixation]	453
188127	TIGR01284	alt_nitrog_alph	nitrogenase alpha chain. This model represents the alpha chains of various forms of the nitrogen-fixing enzyme nitrogenase: vanadium-iron, iron-iron, and molybdenum-iron. Most examples of NifD, the molybdenum-iron type nitrogenase alpha chain, are excluded from this model and described instead by equivalog model TIGR01282. It appears by phylogenetic and UPGMA trees that this model represents a distinct clade of NifD homologs, in which arose several molybdenum-independent forms. [Central intermediary metabolism, Nitrogen fixation]	457
273537	TIGR01285	nifN	nitrogenase molybdenum-iron cofactor biosynthesis protein NifN. This protein forms a complex with NifE, and appears as a NifEN fusion in some species. NifEN is required for producing the molybdenum-iron cofactor of molybdenum-requiring nitrogenases. NifN is closely related to the nitrogenase molybdenum-iron protein beta chain NifK. This model describes most examples of NifN but excludes some cases, such as the putative NifN of Chlorobium tepidum, for which a separate model may be created. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other, Central intermediary metabolism, Nitrogen fixation]	432
130353	TIGR01286	nifK	nitrogenase molybdenum-iron protein beta chain. This model represents the majority of known sequences of the nitrogenase molybdenum-iron protein beta subunit. A distinct clade in a phylogenetic tree contains molybdenum-iron, vanadium-iron, and iron-iron forms of nitrogenase beta subunit and is excluded from this model. Nitrogenase, also called dinitrogenase, is responsible for nitrogen fixation. Note: the trusted cutoff score has recently been lowered to include an additional family in which the beta subunit is shorter by about 50 amino acids at the N-terminus. In species with the shorter form of the beta subunit, the alpha subunit has a novel insert of similar length. [Central intermediary metabolism, Nitrogen fixation]	515
273538	TIGR01287	nifH	nitrogenase iron protein. This model describes nitrogenase (EC 1.18.6.1) iron protein, also called nitrogenase reductase or nitrogenase component II. This model includes molybdenum-iron nitrogenase reductase (nifH), vanadium-iron nitrogenase reductase (vnfH), and iron-iron nitrogenase reductase (anfH). The model excludes the homologous protein from the light-independent protochlorophyllide reductase. [Central intermediary metabolism, Nitrogen fixation]	275
130355	TIGR01288	nodI	ATP-binding ABC transporter family nodulation protein NodI. This protein is required for normal nodulation by nitrogen-fixing root nodule bacteria such as Mesorhizobium loti. It is a member of the family of ABC transporter ATP binding proteins and works with NodJ to export a nodulation signal molecule. This model does not recognize the highly divergent NodI from Azorhizobium caulinodans. [Cellular processes, Other, Transport and binding proteins, Other]	303
200089	TIGR01289	LPOR	light-dependent protochlorophyllide reductase. This model represents the light-dependent, NADPH-dependent form of protochlorophyllide reductase. It belongs to the short chain alcohol dehydrogenase family, in contrast to the nitrogenase-related light-independent form. [Biosynthesis of cofactors, prosthetic groups, and carriers, Chlorophyll and bacteriochlorphyll]	314
273539	TIGR01290	nifB	nitrogenase cofactor biosynthesis protein NifB. This model describes NifB, a protein required for the biosynthesis of the iron-molybdenum (or iron-vanadium) cofactor used by the nitrogen-fixing enzyme nitrogenase. NifB belongs to the radical SAM family, and the FeMo cluster biosynthesis process requires S-adenosylmethionine. Archaeal homologs lack the most C-terminal region and score between the trusted and noise cutoffs of this model. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other, Central intermediary metabolism, Nitrogen fixation]	442
130358	TIGR01291	nodJ	ABC-2 type transporter, NodJ family. Nearly all members of this subfamily are NodJ which, together with NodI (TIGR01288), acts to export a variety of modified carbohydrate molecules as signals to plant hosts to establish root nodules. The seed alignment includes a highly divergent member from Azorhizobium caulinodans that is, nonetheless, associated with nodulation. This model is designated as subfamily in part because not all sequences derived from the last common ancestral sequence of Rhizobium sp. and Azorhizobium caulinodans NodJ are necessarily nodulation proteins. [Cellular processes, Other, Transport and binding proteins, Other]	253
273540	TIGR01292	TRX_reduct	thioredoxin-disulfide reductase. This model describes thioredoxin-disulfide reductase, a member of the pyridine nucleotide-disulphide oxidoreductases (pfam00070). [Energy metabolism, Electron transport]	299
213602	TIGR01293	Kv_beta	voltage-dependent potassium channel beta subunit, animal. This model describes the conserved core region of the beta subunit of voltage-gated potassium (Kv) channels in animals. Amino-terminal regions differ substantially, in part by alternative splicing, and are not included in the model. Four beta subunits form a complex with four alpha subunit cytoplasmic (T1) regions, and the structure of the complex is solved. The beta subunit belongs to a family of NAD(P)H-dependent aldo-keto reductases, binds NADPH, and couples voltage-gated channel activity to the redox potential of the cell. Plant beta subunits and their closely related bacterial homologs (in Deinococcus radiodurans, Xylella fastidiosa, etc.) appear more closely related to each other than to animal forms. However, the bacterial species lack convincing counterparts to the Kv alpha subunit, and the Kv beta homolog may serve as an enzyme. Cutoffs are set for this model such that yeast and plant forms and bacterial close homologs score between trusted and noise cutoffs.	317
273541	TIGR01294	P_lamban	phospholamban. This model represents the short (52 residue) transmembrane phosphoprotein phospholamban. Phospholamban, in its unphosphorylated form, inhibits SERCA2, the cardiac sarcoplasmic reticulum Ca-ATPase.	52
273542	TIGR01295	PedC_BrcD	bacteriocin transport accessory protein, putative. This model describes a small family of proteins believed to aid in the export of various class II bacteriocins, which are ribosomally-synthesized, non-lantibiotic bacterial peptide antibiotics. Members of this family are found in operons for pediocin PA-1 from Pediococcus acidilactici and brochocin-C from Brochothrix campestris.	122
273543	TIGR01296	asd_B	aspartate-semialdehyde dehydrogenase (peptidoglycan organisms). Two closely related families of aspartate-semialdehyde dehydrogenase are found. They differ by a deep split in phylogenetic and percent identity trees and in gap patterns. This model represents a branch more closely related to the USG-1 protein than to the other aspartate-semialdehyde dehydrogenases represented in model TIGR00978. [Amino acid biosynthesis, Aspartate family]	338
273544	TIGR01297	CDF	cation diffusion facilitator family transporter. This model describes a broadly distributed family of transporters, a number of which have been shown to transport divalent cations of cobalt, cadmium and/or zinc. The family has six predicted transmembrane domains. Members of the family are variable in length because of variably sized inserts, often containing low-complexity sequence. [Transport and binding proteins, Cations and iron carrying compounds]	268
188129	TIGR01298	RNaseT	ribonuclease T. This model describes ribonuclease T, an enzyme found so far only in gamma-subdivision Proteobacteria such as Escherichia coli and Xylella fastidiosa. Ribonuclease T is homologous to the DNA polymerase III alpha chain. It can liberate AMP from the common C-C-A terminus of uncharged tRNA. It appears also to be involved in RNA maturation. It also acts as a 3' to 5' single-strand DNA-specific exonuclease; it is distinctive for its ability to remove residues near a double-stranded stem. Ribonuclease T is a high copy suppressor in E. coli of a uv-repair defect caused by deletion of three other single-stranded DNA exonucleases. [Transcription, RNA processing]	200
130366	TIGR01299	synapt_SV2	synaptic vesicle protein SV2. This model describes a tightly conserved subfamily of the larger family of sugar (and other) transporters described by pfam00083. Members of this subfamily include closely related forms SV2A and SV2B of synaptic vesicle protein from vertebrates and a more distantly related homolog (below trusted cutoff) from Drosophila melanogaster. Members are predicted to have two sets of six transmembrane helices.	742
130367	TIGR01300	CPA3_mnhG_phaG	monovalent cation/proton antiporter, MnhG/PhaG subunit. This model represents a subfamily of small, transmembrane proteins believed to be components of Na+/H+ and K+/H+ antiporters. Members, including proteins designated MnhG from Staphylococcus aureus and PhaG from Rhizobium meliloti, show some similarity to chain L of the NADH dehydrogenase I, which also translocates protons. [Transport and binding proteins, Cations and iron carrying compounds]	97
273545	TIGR01301	GPH_sucrose	GPH family sucrose/H+ symporter. This model represents sucrose/proton symporters, found in plants, from the Glycoside-Pentoside-Hexuronide (GPH)/cation symporter family. These proteins are predicted to have 12 transmembrane domains. Members may export sucrose (e.g. SUT1, SUT4) from green parts to the phloem for long-distance transport or import sucrose (e.g. SUT2) to sucrose sinks such as the tap root of the carrot.	477
273546	TIGR01302	IMP_dehydrog	inosine-5'-monophosphate dehydrogenase. This model describes IMP dehydrogenase, an enzyme of GMP biosynthesis. This form contains two CBS domains. This model describes a rather tightly conserved cluster of IMP dehydrogenase sequences, many of which are characterized. The model excludes two related families of proteins proposed also to be IMP dehydrogenases, but without characterized members. These related families are the subject of separate models. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis]	450
130370	TIGR01303	IMP_DH_rel_1	IMP dehydrogenase family protein. This model represents a family of proteins, often annotated as a putative IMP dehydrogenase, related to IMP dehydrogenase and GMP reductase and restricted to the high GC Gram-positive bacteria. All species in which a member is found so far (Corynebacterium glutamicum, Mycobacterium tuberculosis, Streptomyces coelicolor, etc.) also have IMP dehydrogenase as described by TIGRFAMs entry TIGR01302. [Unknown function, General]	475
273547	TIGR01304	IMP_DH_rel_2	IMP dehydrogenase family protein. This model represents a family of proteins, often annotated as a putative IMP dehydrogenase, related to IMP dehydrogenase and GMP reductase. Most species with a member of this family belong to the high GC Gram-positive bacteria, and these also have the IMP dehydrogenase described by TIGRFAMs equivalog model TIGR01302. [Unknown function, General]	369
130372	TIGR01305	GMP_reduct_1	guanosine monophosphate reductase, eukaryotic. A deep split separates two families of GMP reductase. This family includes both eukaryotic and some proteobacterial sequences, while the other family contains other bacterial sequences. [Purines, pyrimidines, nucleosides, and nucleotides, Nucleotide and nucleoside interconversions]	343
130373	TIGR01306	GMP_reduct_2	guanosine monophosphate reductase, bacterial. A deep split separates two families of GMP reductase. The other (TIGR01305) is found in eukaryotic and some proteobacterial lineages, including E. coli, while this family is found in a variety of bacterial lineages. [Purines, pyrimidines, nucleosides, and nucleotides, Nucleotide and nucleoside interconversions]	321
130374	TIGR01307	pgm_bpd_ind	phosphoglycerate mutase (2,3-diphosphoglycerate-independent). This protein is about double in length of, and devoid of homology to the form of phosphoglycerate mutase that uses 2,3-bisphosphoglycerate as a cofactor. [Energy metabolism, Glycolysis/gluconeogenesis]	501
130375	TIGR01308	rpmD_bact	ribosomal protein L30, bacterial/organelle. This model describes bacterial (and organellar) 50S ribosomal protein L30. Homologous ribosomal proteins of the eukaryotic cytosol and of the archaea differ substantially in architecture, from bacterial L30 and also from each other, and are described by separate models. [Protein synthesis, Ribosomal proteins: synthesis and modification]	55
130376	TIGR01309	uL30_arch	50S ribosomal protein uL30, archaeal form. This model represents the archaeal ribosomal protein similar to longer (~ 250 residue) eukaryotic 60S ribosomal protein L7 and to the much shorter (~ 60 residue) bacterial 50S ribosomal protein L30. Protein naming follows the SwissProt designation as L30P, while the gene symbol rpmD follows TIGR usage. [Protein synthesis, Ribosomal proteins: synthesis and modification]	152
273548	TIGR01310	uL30_euk	60S ribosomal protein uL30, eukaryotic form. This model describes the eukaryotic 60S (cytosolic) ribosomal protein uL30 (previously L7) and paralogs that may or may not also be uL30. Human, Drosophila, and Arabidopsis all have both a typical L7 and an L7-related paralog. This family is designated subfamily rather than equivalog to reflect these uncharacterized paralogs. Members of this family average ~ 250 residues in length, somewhat longer than the archaeal L30P/L7E homolog (~ 155 residues) and much longer than the related bacterial/organellar form (~ 60 residues).	235
273549	TIGR01311	glycerol_kin	glycerol kinase. This model describes glycerol kinase, a member of the FGGY family of carbohydrate kinases. [Energy metabolism, Other]	493
273550	TIGR01312	XylB	D-xylulose kinase. This model describes D-xylulose kinases, a subfamily of the FGGY family of carbohydrate kinases. The member from Klebsiella pneumoniae, designated DalK, was annotated erroneously in GenBank as D-arabinitol kinase but is authentic D-xylulose kinase. D-xylulose kinase (XylB) generally is found with xylose isomerase (XylA) and acts in xylose utilization. [Energy metabolism, Sugars]	481
273551	TIGR01313	therm_gnt_kin	carbohydrate kinase, thermoresistant glucokinase family. This model represents a subfamily of proteins that includes thermoresistant and thermosensitive isozymes of gluconate kinase (gluconokinase) in E. coli and other related proteins; members of this family are often named by similarity to the thermostable isozyme. These proteins show homology to shikimate kinases and adenylate kinases but not to gluconate kinases from the FGGY family of carbohydrate kinases.	163
130381	TIGR01314	gntK_FGGY	gluconate kinase, FGGY type. Gluconate is derived from glucose in two steps. This model describes one form of gluconate kinase, belonging to the FGGY family of carbohydrate kinases. Gluconate kinase phosphorylates gluconate for entry into the Entner-Doudoroff pathway. [Energy metabolism, Sugars]	505
273552	TIGR01315	5C_CHO_kinase	FGGY-family pentulose kinase. This model represents a subfamily of the FGGY family of carbohydrate kinases. This subfamily is closely related to a set of ribulose kinases, and many members are designated ribitol kinase. However, the member from Klebsiella pneumoniae, from a ribitol catabolism operon, accepts D-ribulose and to a lesser extent D-arabinitol and ribitol (JW Lengeler, personal communication); its annotation in GenBank as ribitol kinase is imprecise and may have affected public annotation of related proteins.	541
130383	TIGR01316	gltA	glutamate synthase (NADPH), homotetrameric. This protein is homologous to the small subunit of NADPH and NADH forms of glutamate synthase as found in eukaryotes and some bacteria. This protein is found in numerous species having no homolog of the glutamate synthase large subunit. The prototype of the family, from Pyrococcus sp. KOD1, was shown to be active as a homotetramer and to require NADPH. [Amino acid biosynthesis, Glutamate family]	449
162300	TIGR01317	GOGAT_sm_gam	glutamate synthases, NADH/NADPH, small subunit. This model represents one of three built for the NADPH-dependent or NADH-dependent glutamate synthase (EC 1.4.1.13 and 1.4.1.14, respectively) small subunit or homologous region. TIGR01316 describes a family in several archaeal and deeply branched bacterial lineages of a homotetrameric form for which there is no large subunit. Another model describes glutamate synthase small subunit from gamma and some alpha subdivision Proteobacteria plus paralogs of unknown function. This model describes the small subunit, or homologous region of longer-form proteins, of eukaryotes, Gram-positive bacteria, cyanobacteria, and some other lineages. All members with known function participate in NADH or NADPH-dependent reactions to interconvert between glutamine plus 2-oxoglutarate and two molecules of glutamate.	485
273553	TIGR01318	gltD_gamma_fam	glutamate synthase small subunit family protein, proteobacterial. This model represents one of three built for the NADPH-dependent or NADH-dependent glutamate synthase (EC 1.4.1.13 and 1.4.1.14, respectively) small subunit and homologs. TIGR01317 describes the small subunit (or equivalent region from longer forms) in eukaryotes, Gram-positive bacteria, and some other lineages, both NADH and NADPH-dependent. TIGR01316 describes a protein of similar length, from Archaea and a number of bacterial lineages, that forms glutamate synthase homotetramers without a large subunit. This model describes both glutamate synthase small subunit and closely related paralogs of unknown function from a number of gamma and alpha subdivision Proteobacteria, including E. coli.	467
130386	TIGR01319	glmL_fam	conserved hypothetical protein. This small family includes, so far, an uncharacterized protein from E. coli O157:H7 and GlmL from Clostridium tetanomorphum and Clostridium cochlearium. GlmL is located between the genes for the two subunits, epsilon (GlmE) and sigma (GlmS), of the coenzyme-B12-dependent glutamate mutase (methylaspartate mutase), the first enzyme in a pathway of glutamate fermentation. Members show significant sequence similarity to the hydantoinase branch of the hydantoinase/oxoprolinase family (pfam01968).	463
130387	TIGR01320	mal_quin_oxido	malate:quinone-oxidoreductase. This membrane-associated enzyme is an alternative to the better-known NAD-dependent malate dehydrogenase as part of the TCA cycle. The reduction of a quinone rather than NAD+ makes the reaction essentially irreversible in the direction of malate oxidation to oxaloacetate. Both forms of malate dehydrogenase are active in E. coli; disruption of this form causes less phenotypic change. In some bacteria, this form is the only or the more important malate dehydrogenase. [Energy metabolism, TCA cycle]	483
130388	TIGR01321	TrpR	trp operon repressor, proteobacterial. This model represents TrpR, the repressor of the trp operon. It is found so far only in the gamma subdivision of the proteobacteria and in Chlamydia trachomatis. All members belong to species capable of tryptophan biosynthesis. [Amino acid biosynthesis, Aromatic amino acid family, Regulatory functions, DNA interactions]	94
273554	TIGR01322	scrB_fam	sucrose-6-phosphate hydrolase. [Energy metabolism, Biosynthesis and degradation of polysaccharides]	445
188130	TIGR01323	nitrile_alph	nitrile hydratase, alpha subunit. This model describes both iron- and cobalt-containing nitrile hydratase alpha chains. It excludes the thiocyanate hydrolase gamma subunit of Thiobacillus thioparus, a sequence that appears to have evolved from within the family of nitrile hydratase alpha subunits but which differs by several indels and a more rapid accumulation of point mutations. [Energy metabolism, Amino acids and amines]	189
130391	TIGR01324	cysta_beta_ly_B	cystathionine beta-lyase, bacterial. This model represents cystathionine beta-lyase (alternate name: beta-cystathionase), one of several pyridoxal-dependent enzymes of cysteine, methionine, and homocysteine metabolism. This enzyme is involved in the biosynthesis of Met from Cys. [Amino acid biosynthesis, Aspartate family]	377
188131	TIGR01325	O_suc_HS_sulf	O-succinylhomoserine sulfhydrylase. This model describes O-succinylhomoserine sulfhydrylase, one of several related pyridoxal phosphate-dependent enzymes of cysteine and methionine metabolism. This enzyme is part of an alternative pathway of homocysteine biosynthesis, a step in methionine biosynthesis. [Amino acid biosynthesis, Aspartate family]	381
273555	TIGR01326	OAH_OAS_sulfhy	OAH/OAS sulfhydrylase. This model describes a distinct clade of the Cys/Met metabolism pyridoxal phosphate-dependent enzyme superfamily. Members include examples of OAH/OAS sulfhydrylase, an enzyme with activity both as O-acetylhomoserine (OAH) sulfhydrylase (EC 2.5.1.49) and O-acetylserine (OAS) sulphydrylase (EC 2.5.1.47). An alternate name for OAH sulfhydrylase is homocysteine synthase. This model is designated subfamily because it may or may not have both activities. [Amino acid biosynthesis, Aspartate family, Amino acid biosynthesis, Serine family]	418
273556	TIGR01327	PGDH	D-3-phosphoglycerate dehydrogenase. This model represents a long form of D-3-phosphoglycerate dehydrogenase, the serA gene of one pathway of serine biosynthesis. Shorter forms, scoring between trusted and noise cutoff, include SerA from E. coli. [Amino acid biosynthesis, Serine family]	525
130395	TIGR01328	met_gam_lyase	methionine gamma-lyase. This model describes a methionine gamma-lyase subset of a family of PLP-dependent trans-sulfuration enzymes. The member from the parasite Trichomonas vaginalis is described as catalyzing alpha gamma- and alpha-beta eliminations and gamma-replacement reactions on methionine, cysteine, and some derivatives. Likewise, the enzyme from Pseudomonas degrades cysteine as well as methionine. [Energy metabolism, Amino acids and amines]	391
273557	TIGR01329	cysta_beta_ly_E	cystathionine beta-lyase, eukaryotic. This model represents cystathionine beta-lyase (alternate name: beta-cystathionase), one of several pyridoxal-dependent enzymes of cysteine, methionine, and homocysteine metabolism. This enzyme is involved in the biosynthesis of Met from Cys.	378
273558	TIGR01330	bisphos_HAL2	3'(2'),5'-bisphosphate nucleotidase, HAL2 family. Sulfate is incorporated into 3-phosphoadenylylsulfate, PAPS, for utilization in pathways such as methionine biosynthesis. Transfer of sulfate from PAPS to an acceptor leaves adenosine 3'-5'-bisphosphate, APS. This model describes a form found in plants of the enzyme 3'(2'),5'-bisphosphate nucleotidase, which removes the 3'-phosphate from APS to regenerate AMP and help drive the cycle. Sensitivity of this essential enzyme to sodium and other metal ions is responsible for characterization of this enzyme as a salt tolerance protein. Some members of this family are active also as inositol 1-monophosphatase.	353
130398	TIGR01331	bisphos_cysQ	3'(2'),5'-bisphosphate nucleotidase, bacterial. Sulfate is incorporated into 3-phosphoadenylylsulfate, PAPS, for utilization in pathways such as methionine biosynthesis. Transfer of sulfate from PAPS to an acceptor leaves adenosine 3'-5'-bisphosphate, APS. This model describes a form found in bacteria of the enzyme 3'(2'),5'-bisphosphate nucleotidase, which removes the 3'-phosphate from APS to regenerate AMP and help drive the cycle. [Central intermediary metabolism, Sulfur metabolism]	249
130399	TIGR01332	cyt_b559_alpha	cytochrome b559, alpha subunit. This model describes the alpha subunit of cytochrome b559, about 83 residues in length. The N-terminal half is homologous to the ~ 40-residue beta subunit. Cytochrome b559 is associated with photosystem II. Sequences scoring between trusted and noise cutoffs are fragments. [Energy metabolism, Photosynthesis]	80
130400	TIGR01333	cyt_b559_beta	cytochrome b559, beta subunit. This model describes the beta subunit of cytochrome b559, about 40 residues in length. It is homologous to the N-terminal half of the alpha subunit, a protein of about 83 residues. Cytochrome b559 is associated with photosystem II. [Energy metabolism, Photosynthesis]	43
130401	TIGR01334	modD	putative molybdenum utilization protein ModD. The gene modD for a member of this family is found with molybdenum transport genes modABC in Rhodobacter capsulatus. However, disruption of modD causes only a 4-fold (rather than 500-fold for modA, modB, modC) change in the external molybdenum concentration required to suppress an alternative nitrogenase. ModD proteins are highly similar to nicotinate-nucleotide pyrophosphorylase (also called quinolinate phosphoribosyltransferase). The function is unknown. [Unknown function, General]	277
130402	TIGR01335	psaA	photosystem I core protein PsaA. The core proteins of photosystem I are PsaA and PsaB, homologous integral membrane proteins that form a heterodimer. The heterodimer binds the electron-donating chlorophyll dimer P700, as well as chlorophyll, phylloquinone, and 4FE-4S electron acceptors. This model describes PsaA only. [Energy metabolism, Photosynthesis]	752
130403	TIGR01336	psaB	photosystem I core protein PsaB. The core proteins of photosystem I are PsaA and PsaB, homologous integral membrane proteins that form a heterodimer. The heterodimer binds the electron-donating chlorophyll dimer P700, as well as chlorophyll, phylloquinone, and 4FE-4S electron acceptors. This model describes PsaB only. [Energy metabolism, Photosynthesis]	734
273559	TIGR01337	apcB	allophycocyanin, beta subunit. The alpha and beta subunits of allophycocyanin form heterodimers, six of which associate into larger aggregates as part of the phycobilisome, a light-harvesting complex of phycobiliproteins and linker proteins. This model describes allophycocyanin beta subunit. Other, homologous phycobiliproteins include allophycocyanin alpha chain and the phycocyanin and phycoerythrin alpha and beta chains. [Energy metabolism, Photosynthesis]	167
130405	TIGR01338	phycocy_alpha	phycocyanin, alpha subunit. This model describes the phycocyanin alpha subunit. Other, homologous phycobiliproteins of the phycobilisome include phycocyanin alpha chain and the allophycocyanin and phycoerythrin alpha and beta chains. This model excludes the closely related phycoerythrocyanin alpha subunit. [Energy metabolism, Photosynthesis]	161
273560	TIGR01339	phycocy_beta	phycocyanin, beta subunit. This model describes the phycocyanin beta subunit. Other, homologous phycobiliproteins of the phycobilisome include phycocyanin beta chain and the allophycocyanin and phycoerythrin alpha and beta chains. This model excludes the closely related phycoerythrocyanin beta subunit. [Energy metabolism, Photosynthesis]	171
273561	TIGR01340	aconitase_mito	aconitate hydratase, mitochondrial. This model represents mitochondrial forms of the TCA cycle enzyme aconitate hydratase, also known as aconitase and citrate hydro-lyase. [Energy metabolism, TCA cycle]	745
273562	TIGR01341	aconitase_1	aconitate hydratase 1. This model represents one form of the TCA cycle enzyme aconitate hydratase, also known as aconitase and citrate hydro-lyase. It is found in bacteria, archaea, and eukaryotic cytosol. It has been shown to act also as an iron-responsive element binding protein in animals and may have the same role in other eukaryotes. [Energy metabolism, TCA cycle]	876
130409	TIGR01342	acon_putative	aconitate hydratase, putative, Aquifex type. This model represents a small family of proteins homologous (and likely functionally equivalent) to aconitase 1. Members are found, so far, in the anaerobe Clostridium acetobutylicum, in the microaerophilic, early-branching bacterium Aquifex aeolicus, and in the halophilic archaeon Halobacterium sp. NRC-1. No member is experimentally characterized. [Energy metabolism, TCA cycle]	658
273563	TIGR01343	hacA_fam	homoaconitate hydratase family protein. This model represents a subfamily of proteins consisting of aconitase, homoaconitase, 3-isopropylmalate dehydratase, and uncharacterized proteins. The majority of the members of this family have been designated as 3-isopropylmalate dehydratase large subunit (LeuC) in microbial genome annotation, but the only characterized member is Thermus thermophilus homoaconitase, an enzyme of a non-aspartate pathway of Lys biosynthesis.	412
188132	TIGR01344	malate_syn_A	malate synthase A. This model represents plant malate synthase and one of two bacterial forms, designated malate synthase A. The distantly related malate synthase G is described by a separate model. This enzyme and isocitrate lyase are the two characteristic enzymes of the glyoxylate shunt. The shunt enables the cell to use acetyl-CoA to generate increased levels of TCA cycle intermediates for biosynthetic pathways such as gluconeogenesis. [Energy metabolism, TCA cycle]	511
130412	TIGR01345	malate_syn_G	malate synthase G. This model describes the G isozyme of malate synthase. Isocitrate lyase and malate synthase form the glyoxylate shunt, which generates additional TCA cycle intermediates. [Energy metabolism, TCA cycle]	721
273564	TIGR01346	isocit_lyase	isocitrate lyase. Isocitrate lyase and malate synthase are the enzymes of the glyoxylate shunt, a pathway associated with the TCA cycle. [Energy metabolism, TCA cycle]	527
273565	TIGR01347	sucB	2-oxoglutarate dehydrogenase complex dihydrolipoamide succinyltransferase (E2 component). This model describes the TCA cycle 2-oxoglutarate system E2 component, dihydrolipoamide succinyltransferase. It is closely related to the pyruvate dehydrogenase E2 component, dihydrolipoamide acetyltransferase. The seed for this model includes mitochondrial and Gram-negative bacterial forms. Mycobacterial candidates are highly derived and differ in having an extra copy of the lipoyl-binding domain at the N-terminus. They score below the trusted cutoff, but above the noise cutoff and above all examples of dihydrolipoamide acetyltransferase. [Energy metabolism, TCA cycle]	403
273566	TIGR01348	PDHac_trf_long	pyruvate dehydrogenase complex dihydrolipoamide acetyltransferase, long form. This model describes a subset of pyruvate dehydrogenase complex dihydrolipoamide acetyltransferase specifically close by both phylogenetic and per cent identity (UPGMA) trees. Members of this set include two or three copies of the lipoyl-binding domain. E. coli AceF is a member of this model, while mitochondrial and some other bacterial forms belong to a separate model. [Energy metabolism, Pyruvate dehydrogenase]	546
273567	TIGR01349	PDHac_trf_mito	pyruvate dehydrogenase complex dihydrolipoamide acetyltransferase, long form. This model represents one of several closely related clades of the dihydrolipoamide acetyltransferase subunit of the pyruvate dehydrogenase complex. It includes sequences from mitochondria and from alpha and beta branches of the proteobacteria, as well as from some other bacteria. Sequences from Gram-positive bacteria are not included. The non-enzymatic homolog protein X, which serves as an E3 component binding protein, falls within the clade phylogenetically but is rejected by its low score. [Energy metabolism, Pyruvate dehydrogenase]	436
273568	TIGR01350	lipoamide_DH	dihydrolipoamide dehydrogenase. This model describes dihydrolipoamide dehydrogenase, a flavoprotein that acts in a number of ways. It is the E3 component of dehydrogenase complexes for pyruvate, 2-oxoglutarate, 2-oxoisovalerate, and acetoin. It can also serve as the L protein of the glycine cleavage system. This family includes a few members known to have distinct functions (ferric leghemoglobin reductase and NADH:ferredoxin oxidoreductase) but that may be predicted by homology to act as dihydrolipoamide dehydrogenase as well. The motif GGXCXXXGCXP near the N-terminus contains a redox-active disulfide.	460
273569	TIGR01351	adk	adenylate kinase. Adenylate kinase (EC 2.7.4.3) converts ATP + AMP to ADP + ADP, that is, uses ATP as a phosphate donor for AMP. Most members of this family are known or believed to be adenylate kinase. However, some members accept other nucleotide triphosphates as donors, may be unable to use ATP, and may fail to complement adenylate kinase mutants. An example of a nucleoside-triphosphate--adenylate kinase (EC 2.7.4.10) is SP|Q9UIJ7, a GTP:AMP phosphotransferase. This family is designated subfamily rather than equivalog for this reason. [Purines, pyrimidines, nucleosides, and nucleotides, Nucleotide and nucleoside interconversions]	210
273570	TIGR01352	tonB_Cterm	TonB family C-terminal domain. This model represents the C-terminal domain of TonB and its homologs. TonB is an energy-transducer for TonB-dependent receptors of Gram-negative bacteria. Most members are designated as TonB or TonB-related proteins, but a few represent the paralogous TolA protein. Several bacteria have up to four TonB paralogs. In nearly every case, a proline-rich repetitive region is found N-terminal to this domain; these low-complexity regions are highly divergent and cannot readily be aligned. The region is suggested to help span the periplasm. [Transport and binding proteins, Cations and iron carrying compounds]	74
273571	TIGR01353	dGTP_triPase	deoxyguanosinetriphosphate triphosphohydrolase, putative. dGTP triphosphohydrolase (dgt) releases inorganic triphosphate, an unusual activity reaction product, from GTP. Its activity has been called limited to the Enterobacteriaceae, although homologous sequences are detected elsewhere. This finding casts doubt on whether the activity is shared in other species. In several of these other species, the homologous gene is found in an apparent operon with dnaG, the DNA primase gene. The enzyme from E. coli was shown to bind cooperatively to single-stranded DNA. The biological role of dgt is unknown. [Purines, pyrimidines, nucleosides, and nucleotides, Nucleotide and nucleoside interconversions]	381
273572	TIGR01354	cyt_deam_tetra	cytidine deaminase, homotetrameric. This small, homotetrameric zinc metalloprotein is found in humans and most bacteria. A related, homodimeric form with a much larger subunit is found in E. coli and in Arabidopsis. Both types may act on deoxycytidine as well as cytidine. [Purines, pyrimidines, nucleosides, and nucleotides, Salvage of nucleosides and nucleotides]	127
273573	TIGR01355	cyt_deam_dimer	cytidine deaminase, homodimeric. This homodimeric zinc metalloprotein is found in Arabidopsis and some Proteobacteria. A related, homotetrameric form with a much smaller subunit is found in most bacteria and in animals. Both types may act on deoxycytidine as well as cytidine. [Purines, pyrimidines, nucleosides, and nucleotides, Salvage of nucleosides and nucleotides]	283
273574	TIGR01356	aroA	3-phosphoshikimate 1-carboxyvinyltransferase. This model represents 3-phosphoshikimate-1-carboxyvinyltransferase (aroA), which catalyzes the sixth of seven steps in the shikimate pathway of the biosynthesis of chorismate. Chorismate is the last common precursor of all three aromatic amino acids. Sequences scoring between the trusted and noise cutoffs include fragmentary and aberrant sequences in which generally well-conserved motifs are missing or altered, but no example of a protein known to have a different function. [Amino acid biosynthesis, Aromatic amino acid family]	409
273575	TIGR01357	aroB	3-dehydroquinate synthase. This model represents 3-dehydroquinate synthase, the enzyme catalyzing the second of seven steps in the shikimate pathway of chorismate biosynthesis. Chorismate is the last common intermediate in the biosynthesis of all three aromatic amino acids. [Amino acid biosynthesis, Aromatic amino acid family]	344
130425	TIGR01358	DAHP_synth_II	3-deoxy-7-phosphoheptulonate synthase, class II. This model represents the class II family of 3-deoxy-7-phosphoheptulonate synthase, aka phospho-2-dehydro-3-deoxyheptonate aldolase, as found in plants and some bacteria. It shows some similarity to the class I family found in many bacteria. The enzyme catalyzes the first of 7 steps in the biosynthesis of chorismate, the last common precursor of all three aromatic amino acids. Homologs scoring between trusted and noise cutoff include proteins involved in antibiotic biosynthesis; one example is active as this enzyme, while another acts on an amino analog. [Amino acid biosynthesis, Aromatic amino acid family]	443
273576	TIGR01359	UMP_CMP_kin_fam	UMP-CMP kinase family. This subfamily of the adenylate kinase superfamily contains examples of UMP-CMP kinase, as well as other proteins with unknown specificity, some currently designated adenylate kinase. All known members are eukaryotic.	185
130427	TIGR01360	aden_kin_iso1	adenylate kinase, isozyme 1 subfamily. Members of this family are adenylate kinase, EC 2.7.4.3. This clade is found only in eukaryotes and includes human adenylate kinase isozyme 1 (myokinase). Within the adenylate kinase superfamily, this set appears specifically closely related to a subfamily of eukaryotic UMP-CMP kinases (TIGR01359), rather than to the large clade of bacterial, archaeal, and eukaryotic adenylate kinase family members in TIGR01351.	188
273577	TIGR01361	DAHP_synth_Bsub	3-deoxy-7-phosphoheptulonate synthase. This model describes one of at least three types of phospho-2-dehydro-3-deoxyheptonate aldolase (DAHP synthase). This enzyme catalyzes the first of 7 steps in the biosynthesis of chorismate, the last common precursor of all three aromatic amino acids and of PABA, ubiquinone and menaquinone. Some members of this family, including an experimentally characterized member from Bacillus subtilis, are bifunctional, with a chorismate mutase domain N-terminal to this region. The member of this family from Synechocystis PCC 6803, CcmA, was shown to be essential for carboxysome formation. However, no other candidate for this enzyme is present in that species, chorismate biosynthesis does occur, other species having this protein lack carboxysomes but appear to make chorismate, and a requirement of CcmA for carboxysome formation does not prohibit a role in chorismate biosynthesis. [Amino acid biosynthesis, Aromatic amino acid family]	260
130429	TIGR01362	KDO8P_synth	3-deoxy-8-phosphooctulonate synthase. This model describes 3-deoxy-8-phosphooctulonate synthase. Alternate names include 2-dehydro-3-deoxyphosphooctonate aldolase, 3-deoxy-d-manno-octulosonic acid 8-phosphate and KDO-8 phosphate synthetase. It catalyzes the aldol condensation of phosphoenolpyruvate with D-arabinose 5-phosphate: phosphoenolpyruvate + D-arabinose 5-phosphate + H2O = 2-dehydro-3-deoxy-D-octonate 8-phosphate + phosphate. In Gram-negative bacteria, this is the first step in the biosynthesis of 3-deoxy-D-manno-octulosonate, part of the oligosaccharide core of lipopolysaccharide. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	258
273578	TIGR01363	strep_his_triad	streptococcal histidine triad protein. This model represents the N-terminal half of a family of Streptococcal proteins that contain a signal peptide and then up to five repeats of a region that includes a His-X-X-His-X-His (histidine triad) motif. Three repeats are found in the seed alignment. Additional copies in more poorly conserved regions may be detected. Members of this family from Streptococcus pneumoniae are suggested to cleave human C3, and the member PhpA has been shown in vaccine studies to be a protective antigen in mice. [Cellular processes, Pathogenesis]	348
130431	TIGR01364	serC_1	phosphoserine aminotransferase. This model represents the common form of the phosphoserine aminotransferase SerC. The phosphoserine aminotransferase of the archaeon Methanosarcina barkeri and putative phosphoserine aminotransferase of Mycobacterium tuberculosis are represented by separate models. All are members of the class V aminotransferases (pfam00266). [Amino acid biosynthesis, Serine family]	349
130432	TIGR01365	serC_2	phosphoserine aminotransferase, Methanosarcina type. This model represents a variant form of the serine biosynthesis enzyme phosphoserine aminotransferase, as found in a small number of distantly related species, including Caulobacter crescentus, Mesorhizobium loti, and the archaeon Methanosarcina barkeri. [Amino acid biosynthesis, Serine family]	374
130433	TIGR01366	serC_3	phosphoserine aminotransferase, putative. This model represents a putative variant form of the serine biosynthesis enzyme phosphoserine aminotransferase, as found in Mycobacterium tuberculosis and related high-GC Gram-positive bacteria. [Amino acid biosynthesis, Serine family]	361
273579	TIGR01367	pyrE_Therm	orotate phosphoribosyltransferase, Thermus family. This model represents a distinct clade of orotate phosphoribosyltransferases. Members include the experimentally determined example from Thermus aquaticus and additional examples from Caulobacter crescentus, Helicobacter pylori, Mesorhizobium loti, and related species. [Purines, pyrimidines, nucleosides, and nucleotides, Pyrimidine ribonucleotide biosynthesis]	187
273580	TIGR01368	CPSaseIIsmall	carbamoyl-phosphate synthase, small subunit. This model represents the whole of the small chain of the glutamine-dependent form (EC 6.3.5.5) of carbamoyl phosphate synthase, CPSase II. The C-terminal domain has glutamine amidotransferase activity. Note that the sequence from the mammalian urea cycle form has lost the active site Cys, resulting in an ammonia-dependent form, CPSase I (EC 6.3.4.16). CPSases of pyrimidine biosynthesis, arginine biosynthesis, and the urea cycle may be encoded by one or by several genes, depending on the species. [Purines, pyrimidines, nucleosides, and nucleotides, Pyrimidine ribonucleotide biosynthesis]	357
273581	TIGR01369	CPSaseII_lrg	carbamoyl-phosphate synthase, large subunit. Carbamoyl-phosphate synthase (CPSase) catalyzes the first committed step in pyrimidine, arginine, and urea biosynthesis. In general, it is a glutamine-dependent enzyme, EC 6.3.5.5, termed CPSase II in eukaryotes. An exception is the mammalian mitochondrial urea-cycle form, CPSase I, in which the glutamine amidotransferase domain active site Cys on the small subunit has been lost, and the enzyme is ammonia-dependent. In both CPSase I and the closely related, glutamine-dependent CPSase III (allosterically activated by acetyl-glutamate) demonstrated in some other vertebrates, the small and large chain regions are fused in a single polypeptide chain. This model represents the large chain of glutamine-hydrolysing carbamoyl-phosphate synthases, or the corresponding regions of larger, multifunctional proteins, as found in all domains of life, and CPSase I forms are considered exceptions within the family. In several thermophilic species (Methanobacterium thermoautotrophicum, Methanococcus jannaschii, Aquifex aeolicus), the large subunit appears split, at different points, into two separate genes. [Purines, pyrimidines, nucleosides, and nucleotides, Pyrimidine ribonucleotide biosynthesis]	1050
273582	TIGR01370	TIGR01370	extracellular protein. Original assignment of this protein family as cysteinyl-tRNA synthetase is controversial, supported by but challenged by and by subsequent discovery of the actual mechanism for synthesizing Cys-tRNA in species where a direct Cys--tRNA ligase was not found. Lingering legacy annotations of members of this family probably should be removed. Evidence against the role includes a signal peptide. This family has been renamed "extracellular protein" to facilitate correction. Members of this family occur in Deinococcus radiodurans (bacterial) and Methanococcus jannaschii (archaeal). A number of homologous but more distantly related proteins are annotated as alpha-1,4 polygalactosaminidases. The function remains unknown. [Unknown function, General]	315
273583	TIGR01371	met_syn_B12ind	5-methyltetrahydropteroyltriglutamate--homocysteine S-methyltransferase. This model describes the cobalamin-independent methionine synthase. A family of uncharacterized archaeal proteins is homologous to the C-terminal region of this family. That family is excluded from this model but, along with this family, belongs to pfam01717. [Amino acid biosynthesis, Aspartate family]	750
273584	TIGR01372	soxA	sarcosine oxidase, alpha subunit family, heterotetrameric form. This model describes the alpha subunit of a family of known and putative heterotetrameric sarcosine oxidases. Five operons of such oxidases are found in Mesorhizobium loti and three in Agrobacterium tumefaciens, a high enough copy number to suggest that not all members share the same function. The model is designated as subfamily rather than equivalog for this reason. Sarcosine oxidase catalyzes the oxidative demethylation of sarcosine to glycine. The reaction converts tetrahydrofolate to 5,10-methylene-tetrahydrofolate. The enzyme is known in monomeric and heterotetrameric (alpha,beta,gamma,delta) forms. [Energy metabolism, Amino acids and amines]	985
273585	TIGR01373	soxB	sarcosine oxidase, beta subunit family, heterotetrameric form. This model describes the beta subunit of a family of known and putative heterotetrameric sarcosine oxidases. Five operons of such oxidases are found in Mesorhizobium loti and three in Agrobacterium tumefaciens, a high enough copy number to suggest that not all members share the same function. The model is designated as subfamily rather than equivalog for this reason. Sarcosine oxidase catalyzes the oxidative demethylation of sarcosine to glycine. The reaction converts tetrahydrofolate to 5,10-methylene-tetrahydrofolate. The enzyme is known in monomeric and heterotetrameric (alpha,beta,gamma,delta) forms. [Energy metabolism, Amino acids and amines]	407
130441	TIGR01374	soxD	sarcosine oxidase, delta subunit family, heterotetrameric form. This model describes the delta subunit of a family of known and putative heterotetrameric sarcosine oxidases. Five operons of such oxidases are found in Mesorhizobium loti and three in Agrobacterium tumefaciens, a high enough copy number to suggest that not all members share the same function. The model is designated as subfamily rather than equivalog for this reason. Sarcosine oxidase catalyzes the oxidative demethylation of sarcosine to glycine. The reaction converts tetrahydrofolate to 5,10-methylene-tetrahydrofolate. The enzyme is known in monomeric and heterotetrameric (alpha,beta,gamma,delta) forms. [Energy metabolism, Amino acids and amines]	84
273586	TIGR01375	soxG	sarcosine oxidase, gamma subunit family, heterotetrameric form. This model describes the gamma subunit of a family of known and putative heterotetrameric sarcosine oxidases. Five operons of such oxidases are found in Mesorhizobium loti and three in Agrobacterium tumefaciens, a high enough copy number to suggest that not all members share the same function. The model is designated as subfamily rather than equivalog for this reason. The gamma subunit is the most divergent between operons of the four subunits. Sarcosine oxidase catalyzes the oxidative demethylation of sarcosine to glycine. The reaction converts tetrahydrofolate to 5,10-methylene-tetrahydrofolate. The enzyme is known in monomeric and heterotetrameric (alpha,beta,gamma,delta) forms. [Energy metabolism, Amino acids and amines]	152
273587	TIGR01376	POMP_repeat	Chlamydial polymorphic outer membrane protein repeat. This model represents a repeat region of about 27 residues that appears from twice to over twenty times in Chlamydial polymorphic outer membrane proteins (POMP). Characteristic motifs in the repeat are FXXN and GGAI. Except for a few apparently truncated examples, Chlamydial proteins have this repeat region if and only if they also have the autotransporter beta-domain (pfam03797) at the C-terminus, with Phe as the C-terminal residue. This repeat is observed, but is very rare, outside the Chlamydias.	27
130444	TIGR01377	soxA_mon	sarcosine oxidase, monomeric form. Sarcosine oxidase catalyzes the oxidative demethylation of sarcosine to glycine. The reaction converts tetrahydrofolate to 5,10-methylene-tetrahydrofolate. The enzyme is known in monomeric and heterotetrameric (alpha,beta,gamma,delta) forms. [Energy metabolism, Amino acids and amines]	380
273588	TIGR01378	thi_PPkinase	thiamine pyrophosphokinase. This model has been revised. Originally, it described strictly eukaryotic thiamine pyrophosphokinase. However, it is now expanded to include also homologous enzymes, apparently functionally equivalent, from species that rely on thiamine pyrophosphokinase rather than thiamine-monophosphate kinase (TIGR01379) to produce the active TPP cofactor. This includes the thiamine pyrophosphokinase from Bacillus subtilis, previously designated YloS. [Biosynthesis of cofactors, prosthetic groups, and carriers, Thiamine]	205
273589	TIGR01379	thiL	thiamine-phosphate kinase. This model describes thiamine-monophosphate kinase, an enzyme that converts thiamine monophosphate into thiamine pyrophosphate (TPP, coenzyme B1), an enzyme cofactor. Thiamine monophosphate may be derived from de novo synthesis or from unphosphorylated thiamine, known as vitamin B1. Proteins scoring between the trusted and noise cutoff for this model include short forms from the Thermoplasmas (which lack the N-terminal region) and a highly derived form from Campylobacter jejuni. Eukaryotes lack this enzyme, and add pyrophosphate from ATP to unphosphorylated thiamine in a single step. [Biosynthesis of cofactors, prosthetic groups, and carriers, Thiamine]	317
130447	TIGR01380	glut_syn	glutathione synthetase, prokaryotic. This model was built using glutathione synthetases found in Gram-negative bacteria. This gene does not appear to be present in genomes of Gram-positive bacteria. Glutathione synthetase has an ATP-binding domain in the COOH terminus and catalyzes the second step in the glutathione biosynthesis pathway: ATP + gamma-L-glutamyl-L-cysteine + glycine = ADP + phosphate + glutathione. Glutathione is a tripeptide that functions as a reductant in many cellular reactions. [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs]	312
273590	TIGR01381	E1_like_apg7	E1-like protein-activating enzyme Gsa7p/Apg7p. This model represents a family of eukaryotic proteins found in animals, plants, and yeasts, including Apg7p (YHR171W) from Saccharomyces cerevisiae and GSA7 from Pichia pastoris. Members are about 650 to 700 residues in length and include a central domain of about 150 residues shared with the ThiF/MoeB/HesA family of proteins. A low level of similarity to ubiquitin-activating enzyme E1 is described in a paper on peroxisome autophagy mediated by GSA7, and is the basis of the name ubiquitin activating enzyme E1-like protein. Members of the family appear to be involved in protein lipidation events analogous to ubiquitination and required for membrane fusion events during autophagy.	664
273591	TIGR01382	PfpI	intracellular protease, PfpI family. The member of this family from Pyrococcus horikoshii has been solved to 2 Angstrom resolution. It is an ATP-independent intracellular protease that crystallizes as a hexameric ring. Cys-101 is proposed as the active site residue in a catalytic triad with the adjacent His-102 and a Glu residue from an adjacent monomer. A member of this family from Bacillus subtilis, GSP18, has been shown to be expressed in response to several forms of stress. A role in the degradation of small peptides has been suggested. A closely related family consists of the thiamine biosynthesis protein ThiJ and its homologs. [Protein fate, Degradation of proteins, peptides, and glycopeptides]	166
213612	TIGR01383	not_thiJ	DJ-1 family protein. This model represents the DJ-1 clade of the so-called ThiJ/PfpI family of proteins. PfpI, represented by a distinct model, is a putative intracellular cysteine protease. DJ-1 is described as an oncogene that acts cooperatively with H-Ras. Many members of the DJ-1 clade are annotated (apparently incorrectly) as ThiJ, a protein of thiamine biosynthesis. However, published reports of ThiJ activity and identification of a ThiJ/ThiD bifunctional protein describe an unrelated locus mapping near ThiM, rather than the DJ-1 homolog of E. coli. The ThiJ designation for this family may be spurious; the cited paper refers to a locus near thiD and thiM in E. coli, unlike the gene represented here. Current public annotation reflects ThiJ/ThiD bifunctional activity, apparently a property of ThiD and not of this locus. [Unknown function, General]	179
130451	TIGR01384	TFS_arch	transcription factor S, archaeal. This model describes archaeal transcription factor S, a protein related in size and sequence to certain eukaryotic RNA polymerase small subunits, and in sequence and function to the much larger eukaryotic transcription factor IIS (TFIIS). Although originally suggested to be a subunit of the archaeal RNA polymerase, it elutes separately from active polymerase in gel filtration experiments and acts, like TFIIS, as an induction factor for RNA cleavage by RNA polymerase. There has been an apparent duplication event in the Halobacteriaceae lineage (Haloarcula, Haloferax, Haloquadratum, Halobacterium and Natronomonas). There appears to be a separate duplication in Methanosphaera stadtmanae. [Transcription, Transcription factors]	104
273592	TIGR01385	TFSII	transcription elongation factor S-II. This model represents eukaryotic transcription elongation factor S-II. This protein allows stalled RNA transcription complexes to perform a cleavage of the nascent RNA and restart at the newly generated 3-prime end.	299
273593	TIGR01386	cztS_silS_copS	heavy metal sensor kinase. Members of this family contain a sensor histidine kinase domain (pfam00512) and a domain found in bacterial signal proteins (pfam00672). This group is separated phylogenetically from related proteins with similar architecture and contains a number of proteins associated with heavy metal resistance efflux systems for copper, silver, cadmium, and/or zinc.	457
130454	TIGR01387	cztR_silR_copR	heavy metal response regulator. Members of this family contain a response regulator receiver domain (pfam00072) and an associated transcriptional regulatory region (pfam00486). This group is separated phylogenetically from related proteins with similar architecture and contains a number of proteins associated with heavy metal resistance efflux systems for copper, silver, cadmium, and/or zinc. Most members are encoded by genes adjacent to genes encoding a member of the heavy metal sensor histidine kinase family (TIGRFAMs:TIGR01386), its partner in the two-component response regulator system. [Regulatory functions, DNA interactions]	218
130455	TIGR01388	rnd	ribonuclease D. This model describes ribonuclease D, a 3'-exonuclease shown to act on tRNA both in vitro and when overexpressed in vivo. Trusted members of this family are restricted to the Proteobacteria; Aquifex, Mycobacterial, and eukaryotic homologs are not full-length homologs. Ribonuclease D is not essential in E. coli and is deleterious when overexpressed. Its precise biological role is still unknown. [Transcription, RNA processing]	367
273594	TIGR01389	recQ	ATP-dependent DNA helicase RecQ. The ATP-dependent DNA helicase RecQ of E. coli is about 600 residues long. This model represents bacterial proteins with a high degree of similarity in domain architecture and in primary sequence to E. coli RecQ. The model excludes eukaryotic and archaeal proteins with RecQ-like regions, as well as more distantly related bacterial helicases related to RecQ. [DNA metabolism, DNA replication, recombination, and repair]	591
130457	TIGR01390	CycNucDiestase	2',3'-cyclic-nucleotide 2'-phosphodiesterase. 2',3'-cyclic-nucleotide 2'-phosphodiesterase is a bifunctional enzyme localized to the periplasm of Gram-negative bacteria. 2',3'-cyclic-nucleotide 2'-phosphodiesters are intermediates formed during the hydrolysis of RNA by ribonuclease I, which is also found in the periplasm, and other enzymes of the RNAse T2 family. Bacteria are unable to transport 2',3'-cyclic-nucleotides into the cytoplasm. 2',3'-cyclic-nucleotide 2'-phosphodiesterase contains 2 active sites which catalyze the reactions that convert the 2',3'-cyclic-nucleotide into a 3'-nucleotide, which is then converted into nucleoside and phosphate. Both final products can be transported into the cytoplasm. Thus, it has been suggested that 2',3'-cyclic-nucleotide 2'-phosphodiesterase has a 'scavenging' function. Experimental evidence indicates that 2',3'-cyclic-nucleotide 2'-phosphodiesterase enables Yersinia enterocolitica O:8 to grow on 2',3'-cAMP as a sole source of carbon and energy. [Purines, pyrimidines, nucleosides, and nucleotides, Other]	626
273595	TIGR01391	dnaG	DNA primase, catalytic core. Members of this family are DNA primase, a ubiquitous bacterial protein. Most members of this family contain nearly two hundred additional residues C-terminal to the region represented here, but conservation between species is poor and the C-terminal region was not included in the seed alignment. This protein contains a CHC2 zinc finger (pfam01807) and a Toprim domain (pfam01751). [DNA metabolism, DNA replication, recombination, and repair]	415
273596	TIGR01392	homoserO_Ac_trn	homoserine O-acetyltransferase. This family describes homoserine-O-acetyltransferase, an enzyme of methionine biosynthesis. This model has been rebuilt to identify sequences more broadly, including a number of sequences suggested to be homoserine O-acetyltransferase based on proximity to other Met biosynthesis genes. [Amino acid biosynthesis, Aspartate family]	351
130460	TIGR01393	lepA	elongation factor 4. LepA (GUF1 in Saccharomyces), now called elongation factor 4, is a GTP-binding membrane protein related to EF-G and EF-Tu. Two types of phylogenetic tree, rooted by other GTP-binding proteins, suggest that eukaryotic homologs (including GUF1 of yeast) originated within the bacterial LepA family. The function is unknown. [Unknown function, General]	595
273597	TIGR01394	TypA_BipA	GTP-binding protein TypA/BipA. This bacterial (and Arabidopsis) protein, termed TypA or BipA, a GTP-binding protein, is phosphorylated on a tyrosine residue under some cellular conditions. Mutants show altered regulation of some pathways, but the precise function is unknown. [Regulatory functions, Other, Cellular processes, Adaptations to atypical conditions, Protein synthesis, Translation factors]	594
130462	TIGR01395	FlgC	flagellar basal-body rod protein FlgC. This model represents FlgC, one of several components of bacterial flagella that share a domain described by pfam00460. FlgC is part of the basal body. [Cellular processes, Chemotaxis and motility]	135
273598	TIGR01396	FlgB	flagellar basal-body rod protein FlgB. This model represents FlgB, one of several components of bacterial flagella that share a domain described by pfam00460. FlgB is part of the basal body. [Cellular processes, Chemotaxis and motility]	131
130464	TIGR01397	fliM_switch	flagellar motor switch protein FliM. Members of this family are the flagellar motor switch protein FliM. The family excludes FliM homologs that lack an N-terminal region critical to interaction with phosphorylated CheY. One set lacking this N-terminal region is found in Rhizobium meliloti, in which the direction of flagellar rotation is not reversible (i.e. the FliM homolog does not act to reverse the motor direction), and in related species. Another is found in Buchnera, an obligate intracellular endosymbiont with genes for many of the components of the flagellar apparatus, but not, apparently, for flagellin itself. [Cellular processes, Chemotaxis and motility]	320
273599	TIGR01398	FlhA	flagellar biosynthesis protein FlhA. This model describes flagellar biosynthesis protein FlhA, one of a large number of genes associated with the biosynthesis of functional bacterial flagella. Homologs of many such proteins, including FlhA, function in type III protein secretion systems. A separate model describes InvA (Salmonella enterica), LcrD (Yersinia enterocolitica), HrcV (Xanthomonas), etc., all of which score below the noise cutoff for this model. [Cellular processes, Chemotaxis and motility]	678
273600	TIGR01399	hrcV	type III secretion protein, HrcV family. Members of this family are closely homologous to the flagellar biosynthesis protein FlhA (TIGR01398) and should all participate in type III secretion systems. Examples include InvA (Salmonella enterica), LcrD (Yersinia enterocolitica), HrcV (Xanthomonas), etc. Type III secretion systems resemble flagellar biogenesis systems, and may share the property of translocating special classes of peptides through the membrane. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	677
130467	TIGR01400	fliR	flagellar biosynthetic protein FliR. This model recognizes the FliR protein of bacterial flagellar biosynthesis. It distinguishes FliR from the homologous proteins of bacterial type III protein secretion systems, known by names such as YopT, EscT, and HrcT. [Cellular processes, Chemotaxis and motility]	245
130468	TIGR01401	fliR_like_III	type III secretion protein SpaR/YscT/HrcT. This model represents members of bacterial type III secretion systems homologous to the flagellar biosynthetic protein FliR (TIGRFAMs:TIGR01400). [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	253
130469	TIGR01402	fliQ	flagellar biosynthetic protein FliQ. This model describes FliQ, a protein involved in biosynthesis of bacterial flagella. A related family of proteins, excluded from this model, participates in bacterial type III protein secretion systems. [Cellular processes, Chemotaxis and motility]	88
130470	TIGR01403	fliQ_rel_III	type III secretion protein, HrpO family. This model represents one of several families of proteins related to bacterial flagellar biosynthesis proteins and involved in bacterial type III protein secretion systems. This family is homologous to, but distinguished from, flagellar biosynthetic protein FliQ. This model may not identify all type III secretion system FliQ homologs. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	81
130471	TIGR01404	FlhB_rel_III	type III secretion protein, YscU/HrpY family. This model represents one of several families of proteins related to bacterial flagellar biosynthesis proteins and involved in bacterial type III protein secretion systems. This family is homologous to, but distinguished from, flagellar biosynthetic protein FlhB (TIGRFAMs model TIGR00328). This model may not identify all type III secretion system FlhB homologs. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	342
273601	TIGR01405	polC_Gram_pos	DNA polymerase III, alpha chain, Gram-positive type. This model describes a polypeptide chain of DNA polymerase III. Full-length homologs of this protein are restricted to the Gram-positive lineages, including the Mycoplasmas. This protein is designated alpha chain and given the gene symbol polC, but is not a full-length homolog of other polC genes. The N-terminal region of about 200 amino acids is rich in low-complexity sequence, poorly alignable, and not included in this model. [DNA metabolism, DNA replication, recombination, and repair]	1213
130473	TIGR01406	dnaQ_proteo	DNA polymerase III, epsilon subunit, Proteobacterial. This model represents DnaQ, the DNA polymerase III epsilon subunit, as found in most Proteobacteria. It consists largely of an exonuclease domain as described in pfam00929. In Gram-positive bacteria, closely related regions are found both in the Gram-positive type DNA polymerase III alpha subunit and as an additional N-terminal domain of a DinG-family helicase. Both are excluded from this model, as are smaller proteins, also outside the Proteobacteria, that are similar in size to the epsilon subunit but as different in sequence as are the epsilon-like regions found in Gram-positive bacteria. [DNA metabolism, DNA replication, recombination, and repair]	225
273602	TIGR01407	dinG_rel	DnaQ family exonuclease/DinG family helicase, putative. This model represents a family of proteins in Gram-positive bacteria. The N-terminal region of about 200 amino acids resembles the epsilon subunit of E. coli DNA polymerase III and the homologous region of the Gram-positive type DNA polymerase III alpha subunit. The epsilon subunit contains an exonuclease domain. The remainder of this protein family resembles a predicted ATP-dependent helicase, the DNA damage-inducible protein DinG of E. coli. [DNA metabolism, DNA replication, recombination, and repair]	850
273603	TIGR01408	Ube1	ubiquitin-activating enzyme E1. This model represents the full length, over a thousand amino acids, of a multicopy family of eukaryotic proteins, many of which are designated ubiquitin-activating enzyme E1. Members have two copies of the ThiF family domain (pfam00899), a repeat found in ubiquitin-activating proteins (pfam02134), and other regions.	1006
273604	TIGR01409	TAT_signal_seq	Tat (twin-arginine translocation) pathway signal sequence. Proteins assembled with various cofactors or by means of cytosolic molecular chaperones are poor candidates for translocation across the bacterial inner membrane by the standard general secretory (Sec) pathway. This model describes a family of predicted long, non-Sec signal sequences and signal-anchor sequences (uncleaved signal sequences). All contain an absolutely conserved pair of arginine residues, in a motif approximated by (S/T)-R-R-X-F-L-K, followed by a membrane-spanning hydrophobic region. Members with small amino acid side chains at the -1 and -3 positions from the C-terminus of the model should be predicted to be cleaved as are Sec pathway signal sequences. Members are almost exclusively bacterial, although archaeal sequences are also found. A large fraction of the members of this family may have bound redox-active cofactors. [Protein fate, Protein and peptide secretion and trafficking]	29
130477	TIGR01410	tatB	twin arginine-targeting protein translocase TatB. This model represents the TatB protein of a Sec-independent system for transporting folded proteins, often with a bound redox cofactor, across the bacterial inner membrane. TatC is the multiple membrane spanning component. TatB, like the related TatA/E proteins, appears to span the membrane one time. The tat system recognizes proteins with an elongated signal sequence containing a conserved R-R in a motif approximated by RRxFLK N-terminal to the transmembrane helix. TIGRFAMs model TIGR01409 describes this twin-Arg signal sequence. A similar system, termed Delta-pH-dependent transport, operates on chloroplast-encoded proteins. [Protein fate, Protein and peptide secretion and trafficking]	80
273605	TIGR01411	tatAE	twin arginine-targeting protein translocase, TatA/E family. This model distinguishes TatA/E from the related TatB, but does not distinguish TatA from TatE. The Tat (twin-arginine translocation) system is a Sec-independent exporter for folded proteins, often with a redox cofactor already bound, across the bacterial inner membrane. Functionally equivalent systems are found in the chloroplast and in some archaeal species. The signal peptide recognized by the Tat system is modeled by TIGR01409. [Protein fate, Protein and peptide secretion and trafficking]	47
273606	TIGR01412	tat_substr_1	Tat-translocated enzyme. This model represents a small family of proteins with a typical Tat (twin-arginine translocation) signal sequence, suggesting that the family is exported in a folded state, perhaps with a bound redox cofactor. Members of this family show homology to Dyp, a dye-decolorizing peroxidase from Geotrichum candidum that lacks any typical heme-binding site.	414
273607	TIGR01413	Dyp_perox_fam	Dyp-type peroxidase family. A defined member of this superfamily is Dyp, a dye-decolorizing peroxidase that lacks a typical heme-binding region. A distinct, uncharacterized branch (TIGR01412) of this superfamily has a typical twin-arginine dependent signal sequence characteristic of exported proteins with bound redox cofactors.	308
273608	TIGR01414	autotrans_barl	outer membrane autotransporter barrel domain. A number of Gram-negative bacterial proteins, mostly found in pathogens and associated with virulence, contain a conserved C-terminal domain that integrates into the outer membrane and enables the N-terminal region to be delivered across the membrane. This C-terminal autotransporter domain is about 400 amino acids in length and includes the aromatic amino acid-rich OMP signal, typically ending with a Phe or Trp residue, at the extreme C-terminus. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	431
273609	TIGR01415	trpB_rel	pyridoxal-phosphate dependent TrpB-like enzyme. This model represents a family of pyridoxal-phosphate dependent enzymes (pfam00291) closely related to the beta subunit of tryptophan synthase (TIGR00263). However, the only case in which a member of this family replaces a member of TIGR00263 is in Sulfolobus species which contain two sequences which hit this model, one of which is proximal to the alpha subunit. In every other case so far, either the species appears not to make tryptophan (there is no trp synthase alpha subunit), or a trp synthase beta subunit matching TIGR00263 is also found. [Unknown function, Enzymes of unknown specificity]	419
273610	TIGR01416	Rieske_proteo	ubiquinol-cytochrome c reductase, iron-sulfur subunit. This model represents the Proteobacterial and mitochondrial type of the Rieske [2Fe-2S] iron-sulfur as found in ubiquinol-cytochrome c reductase. The model excludes the Rieske iron-sulfur protein as found in the cytochrome b6-f complex of the Cyanobacteria and chloroplasts. Most members of this family have a recognizable twin-arginine translocation (tat) signal sequence (DeltaPh-dependent translocation in chloroplast) for transport across the membrane with the 2Fe-2S group already bound. These signal sequences include a motif resembling RRxFLK before the transmembrane helix. [Energy metabolism, Electron transport]	174
273611	TIGR01417	PTS_I_fam	phosphoenolpyruvate-protein phosphotransferase. This model recognizes a distinct clade of phosphoenolpyruvate (PEP)-dependent enzymes. Most members are known or deduced to function as the phosphoenolpyruvate-protein phosphotransferase (or enzyme I) of PTS sugar transport systems. However, some species with both a member of this family and a homolog of the phosphocarrier protein HPr lack a IIC component able to serve as a permease. An HPr homolog designated NPr has been implicated in the regulation of nitrogen assimilation, which demonstrates that not all phosphotransferase system components are associated directly with PTS transport.	565
273612	TIGR01418	PEP_synth	phosphoenolpyruvate synthase. Also called pyruvate,water dikinase and PEP synthase. The member from Methanococcus jannaschii contains a large intein. This enzyme generates phosphoenolpyruvate (PEP) from pyruvate, hydrolyzing ATP to AMP and releasing inorganic phosphate in the process. The enzyme shows extensive homology to other enzymes that use PEP as substrate or product. This enzyme may provide PEP for gluconeogenesis, for PTS-type carbohydrate transport systems, or for other processes. [Energy metabolism, Glycolysis/gluconeogenesis]	786
162350	TIGR01419	nitro_reg_IIA	PTS IIA-like nitrogen-regulatory protein PtsN. This model describes a full-length protein of about 160 residues closely related to the fructose-specific phosphotransferase (PTS) system IIA component. It is a regulatory protein found only in species with a phosphoenolpyruvate-protein phosphotransferase (enzyme I of PTS systems) and an HPr-like phosphocarrier protein, but not all species have a IIC-like permease. Members of this family are found in Proteobacteria, Chlamydia, and the spirochete Treponema pallidum. [Signal transduction, PTS]	145
273613	TIGR01420	pilT_fam	pilus retraction protein PilT. This model represents the PilT subfamily of proteins related to GspE, a protein involved in type II secretion (also called the General Secretion Pathway). PilT is an apparent cytosolic ATPase associated with type IV pilus systems. It is not required for pilin biogenesis, but is required for twitching motility and social gliding behaviors, shown in some species, powered by pilus retraction. Members of this family may be found in some species that lack type IV pili but have related structures for DNA uptake and natural transformation. [Cell envelope, Surface structures, Cellular processes, Chemotaxis and motility]	343
273614	TIGR01421	gluta_reduc_1	glutathione-disulfide reductase, animal/bacterial. The tripeptide glutathione is an important reductant, e.g., for maintaining the cellular thiol/disulfide status and for protecting against reactive oxygen species such as hydrogen peroxide. Glutathione-disulfide reductase regenerates reduced glutathione from oxidized glutathione (glutathione disulfide) + NADPH. This model represents one of two closely related subfamilies of glutathione-disulfide reductase. Both are closely related to trypanothione reductase, and separate models are built so each of the three can describe proteins with conserved function. This model describes glutathione-disulfide reductases of animals, yeast, and a number of animal-resident bacteria. [Energy metabolism, Electron transport]	450
188140	TIGR01422	phosphonatase	phosphonoacetaldehyde hydrolase. This enzyme catalyzes the cleavage of the carbon-phosphorus bond of a phosphonate. The mechanism depends on the substrate having a carbonyl one carbon away from the cleavage position. This enzyme is a member of the Haloacid Dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases (pfam00702), and contains a modified version of the conserved catalytic motifs of that superfamily: the first motif is usually DxDx(T/V), here it is DxAxT, and in the third motif the normal conserved lysine is instead an arginine. Additionally, the enzyme contains a unique conserved catalytic lysine (B. cereus pos. 53) which is involved in the binding and activation of the substrate through the formation of a Schiff base. The substrate of this enzyme is the product of 2-aminoethylphosphonate (AEP) transaminase, phosphonoacetaldehyde. This degradation pathway for AEP may be related to its toxic properties which are utilized by microorganisms as a chemical warfare agent. [Central intermediary metabolism, Other]	253
200098	TIGR01423	trypano_reduc	trypanothione-disulfide reductase. Trypanothione, a glutathione-modified derivative of spermidine, is (in its reduced form) an important antioxidant found in trypanosomatids (Crithidia, Leishmania, Trypanosoma). This model describes trypanothione reductase, a possible antitrypanosomal drug target closely related to some forms of glutathione reductase.	486
213618	TIGR01424	gluta_reduc_2	glutathione-disulfide reductase, plant. The tripeptide glutathione is an important reductant, e.g., for maintaining the cellular thiol/disulfide status and for protecting against reactive oxygen species such as hydrogen peroxide. Glutathione-disulfide reductase regenerates reduced glutathione from oxidized glutathione (glutathione disulfide) + NADPH. This model represents one of two closely related subfamilies of glutathione-disulfide reductase. Both are closely related to trypanothione reductase, and separate models are built so each of the three can describe proteins with conserved function. This model describes glutathione-disulfide reductases of plants and some bacteria, including cyanobacteria. [Energy metabolism, Electron transport]	446
273615	TIGR01425	SRP54_euk	signal recognition particle protein SRP54. This model represents examples from the eukaryotic cytosol of the signal recognition particle protein component, SRP54. This GTP-binding protein is a component of the eukaryotic signal recognition particle, along with several other protein subunits and a 7S RNA. Some species, including Arabidopsis, have several closely related forms. The extreme C-terminal region is glycine-rich and lower in complexity, poorly conserved between species, and excluded from this model.	428
273616	TIGR01426	MGT	glycosyltransferase, MGT family. This model describes the MGT (macroside glycosyltransferase) subfamily of the UDP-glucuronosyltransferase family. Members include a number of glucosyl transferases for macrolide antibiotic inactivation, but also include transferases of glucose-related sugars for macrolide antibiotic production. [Cellular processes, Toxin production and resistance]	392
273617	TIGR01427	PTS_IIC_fructo	PTS system, fructose subfamily, IIC component. This model represents the IIC component, or IIC region of a IIABC or IIBC polypeptide of a phosphotransferase system for carbohydrate transport. Members of this family belong to the fructose-specific subfamily of the broader family (pfam02378) of PTS IIC proteins. Members should be found as part of the same chain or in the same operon as fructose family IIA (TIGR00848) and IIB (TIGR00829) protein regions. A number of bacterial species have members in two different branches of this subfamily, suggesting some diversity in substrate specificity of its members.	346
130495	TIGR01428	HAD_type_II	2-haloalkanoic acid dehalogenase, type II. Catalyzes the hydrolytic dehalogenation of small L-2-haloalkanoic acids to yield the corresponding D-2-hydroxyalkanoic acids. Belongs to the Haloacid Dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases (pfam00702), class (subfamily) I. Note that the Type I HAD enzymes have not yet been fully characterized, but clearly utilize a substantially different catalytic mechanism and are thus unlikely to be related.	198
273618	TIGR01429	AMP_deaminase	AMP deaminase. This model describes AMP deaminase, a large, well-conserved eukaryotic protein involved in energy metabolism. Most members of the family have an additional, poorly alignable region of 150 amino acids or more N-terminal to the region included in the model.	611
273619	TIGR01430	aden_deam	adenosine deaminase. This family includes the experimentally verified adenosine deaminases of mammals and E. coli. Other members of this family are predicted also to be adenosine deaminase, an enzyme of nucleotide degradation. This family is distantly related to AMP deaminase.	324
273620	TIGR01431	adm_rel	adenosine deaminase-related growth factor. Members of this family have been described as secreted proteins with growth factor activity and regions of adenosine deaminase homology in insects, mollusks, and vertebrates.	479
273621	TIGR01432	QOXA	cytochrome aa3 quinol oxidase, subunit II. This enzyme catalyzes the oxidation of quinol with the concomitant reduction of molecular oxygen to water. This acts as the terminal electron acceptor in the respiratory chain. This subunit contains two transmembrane helices and a large external domain responsible for the binding and oxidation of quinol. QoxA is (presently) only found in gram positive bacteria of the Bacillus/Staphylococcus group. Like CyoA, the ubiquinol oxidase found in proteobacteria, the residues responsible for the ligation of Cu(a) and cytochrome c (found in the related cyt. c oxidases) are absent. Unlike CyoA, QoxA is in complex with a subunit I which contains cytochromes a similar to the cyt. c oxidases (as opposed to cytochromes b). [Energy metabolism, Electron transport]	226
213620	TIGR01433	CyoA	cytochrome o ubiquinol oxidase subunit II. This enzyme catalyzes the oxidation of ubiquinol with the concomitant reduction of molecular oxygen to water. This acts as the terminal electron acceptor in the respiratory chain. Subunit II is responsible for binding and oxidation of the ubiquinone substrate. This sequence is closely related to QoxA, which oxidizes quinol in gram positive bacteria but which is in complex with subunits which utilize cytochromes a in the reduction of molecular oxygen. Slightly more distantly related is subunit II of cytochrome c oxidase which uses cyt. c as the oxidant. [Energy metabolism, Electron transport]	226
213621	TIGR01434	glu_cys_ligase	glutamate--cysteine ligase. Alternate name: gamma-glutamylcysteine synthetase. This model represents glutamate--cysteine ligase, an enzyme in the biosynthesis of glutathione (GSH). GSH is one of several low molecular weight cysteine derivatives that can serve to protect against oxidative damage and participate in biosynthetic or detoxification reactions. [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs]	512
273622	TIGR01435	glu_cys_lig_rel	glutamate--cysteine ligase/glutathione synthase, Streptococcus agalactiae type. This model represents a bifunctional protein family for the biosynthesis of glutathione, and perhaps a range of related gamma-glutamyltripeptides of the form gamma-Glu-Cys-X(aa). The N-terminal region is similar to proteobacterial glutamate-cysteine ligase. The C-terminal region is homologous to cyanophycin synthetase of cyanobacteria and, more distantly, to D-alanine-D-alanine ligases. Members of this family are found in Listeria and Enterococcus, Gram-positive lineages in which glutathione is produced (see PUBMED:8606174), and in Pasteurella multocida, a Proteobacterium. In Clostridium acetobutylicum, adjacent genes include separate proteins rather than a fusion protein. [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs]	737
130503	TIGR01436	glu_cys_lig_pln	glutamate--cysteine ligase, plant type. This model represents one of two highly dissimilar forms of glutamate--cysteine ligase (gamma-glutamylcysteine synthetase), an enzyme of glutathione biosynthesis. The other type is modeled by TIGR01434. This type is found in plants (with a probable transit peptide), root nodule and other bacteria, but not E. coli and closely related species. [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs]	446
273623	TIGR01437	selA_rel	uncharacterized pyridoxal phosphate-dependent enzyme. This model describes a protein related to a number of pyridoxal phosphate-dependent enzymes, and in particular to selenocysteine synthase (SelA), which converts Ser to selenocysteine on its tRNA. While resembling SelA, this protein is found only in species that have a better candidate SelA or else lack the other genes (selB, selC, and selD) required for selenocysteine incorporation. [Unknown function, Enzymes of unknown specificity]	363
273624	TIGR01438	TGR	thioredoxin and glutathione reductase selenoprotein. This homodimeric, FAD-containing member of the pyridine nucleotide disulfide oxidoreductase family contains a C-terminal motif Cys-SeCys-Gly, where SeCys is selenocysteine encoded by TGA (in some sequence reports interpreted as a stop codon). In some members of this subfamily, Cys-SeCys-Gly is replaced by Cys-Cys-Gly. The reach of the selenium atom at the C-term arm of the protein is proposed to allow broad substrate specificity.	484
273625	TIGR01439	lp_hng_hel_AbrB	looped-hinge helix DNA binding domain, AbrB family. This DNA-binding domain family includes AbrB, a transition state regulator in Bacillus subtilis, whose DNA-binding domain structure in solution was determined by NMR. The domain binds DNA as a dimer in what is termed a looped-hinge helix fold. Some members of the family have two copies of the domain in tandem. The domain is found usually at the N-terminus of a small protein. This model excludes members of family TIGR02609. [Regulatory functions, DNA interactions]	43
130507	TIGR01440	TIGR01440	TIGR01440 family protein. Members of this family are uncharacterized proteins of about 180 amino acids from the Bacillus/Clostridium group of Gram-positive bacteria, found in no more than one copy per genome. [Hypothetical proteins, Conserved]	172
273626	TIGR01441	GPR	GPR endopeptidase. This model describes a tetrameric protease that makes the rate-limiting first cut in the small, acid-soluble spore proteins (SASP) of Bacillus subtilis and related species. The enzyme lacks clear homology to other known proteases. It processes its own amino end before becoming active to cleave SASPs. [Protein fate, Degradation of proteins, peptides, and glycopeptides, Cellular processes, Sporulation and germination]	358
273627	TIGR01442	SASP_gamma	small, acid-soluble spore protein, gamma-type. This model represents a family of small, glutamine and asparagine-rich peptides that store amino acids in the spores of Bacillus subtilis and related bacteria. Most members of the family have two copies of the spore protease (GPR) cleavage motif, typically EFASE in this family, separating three low-complexity repeats. [Cellular processes, Sporulation and germination]	85
213622	TIGR01443	intein_Cterm	intein C-terminal splicing region. This model represents the well-conserved C-terminal region of a large number of inteins. It is based on iterated search results, starting with a curated collection of intein N-terminal splicing regions from InBase, the New England Biolabs Intein Database, as presented on its web site. Inteins are regions encoded within proteins from which they remove themselves after translation in a self-splicing reaction, leaving the remainder of the coding region to form a complete, functional protein as if the intein were never there. Proteins with inteins include RecA, GyrA, ribonucleotide reductase, and others. Most inteins have a central region with putative endonuclease activity.	21
273628	TIGR01444	fkbM_fam	methyltransferase, FkbM family. Members of this family are characterized by two well-conserved short regions separated by a region variable in both sequence and length. The first of the two regions is found in a large number of proteins outside this subfamily, a number of which have been characterized as methyltransferases. One member of the present family, FkbM, was shown to be required for a specific methylation in the biosynthesis of the immunosuppressant FK506 in Streptomyces strain MA6548.	143
273629	TIGR01445	intein_Nterm	intein N-terminal splicing region. This model is based on iterated search results, starting with a curated collection of intein N-terminal splicing regions from InBase, the New England Biolabs Intein Database, as presented on its web site. It is designed to recognize inteins but not the related region of the sonic hedgehog protein.	81
273630	TIGR01446	DnaD_dom	DnaD and phage-associated domain. This model represents the conserved domain of DnaD, part of Bacillus subtilis replication restart primosome, and of a number of phage-associated proteins. Members, both chromosomal or phage-associated, are found in the Bacillus/Clostridium group of Gram-positive bacteria. [DNA metabolism, DNA replication, recombination, and repair, Mobile and extrachromosomal element functions, Prophage functions]	73
273631	TIGR01447	recD	exodeoxyribonuclease V, alpha subunit. This family describes the exodeoxyribonuclease V alpha subunit, RecD. RecD is part of a RecBCD complex. A related family in the Gram-positive bacteria separates in a phylogenetic tree, has an additional N-terminal extension of about 200 residues, and is not supported as a member of a RecBCD complex by neighboring genes. The related family is consequently described by a different model. [DNA metabolism, DNA replication, recombination, and repair]	582
273632	TIGR01448	recD_rel	helicase, putative, RecD/TraA family. This model describes a family similar to RecD, the exodeoxyribonuclease V alpha chain of TIGR01447. Members of this family, however, are not found in a context of RecB and RecC and are longer by about 200 amino acids at the amino end. Chlamydia muridarum has both a member of this family and a RecD. [Unknown function, Enzymes of unknown specificity]	720
130516	TIGR01449	PGP_bact	2-phosphoglycolate phosphatase, prokaryotic. PGP is an essential enzyme in the glycolate salvage pathway in higher organisms (photorespiration in plants). Phosphoglycolate results from the oxidase activity of RubisCO in the Calvin cycle when concentrations of carbon dioxide are low relative to oxygen. In Ralstonia (Alcaligenes) eutropha and Rhodobacter sphaeroides, the PGP gene (CbbZ) is located on an operon along with other Calvin cycle enzymes including RubisCO. The only other pertinent experimental evidence concerns the gene from E. coli. The in vitro activity of the Ralstonia and Escherichia enzymes was determined with crude cell extracts of strains containing PGP on expression plasmids and compared to controls. In E. coli, however, there does not appear to be a functional Calvin cycle (RubisCO is absent), although the E. coli PGP gene (gph) is on the same operon (dam) with ribulose-5-phosphate-3-epimerase (rpe), a gene in the pentose-phosphate pathway (along with other, unrelated genes). The E. coli enzyme is not expressed under normal laboratory conditions; the pathway to which it belongs has not been determined. In fact, the possibility exists, although unlikely, that the E. coli enzyme and others within this equivalog have as their physiological substrate another, closely related molecule. The other seed chosen for this model, from Xylella fastidiosa has no experimental evidence, but is a plant pathogen and thus may obtain phosphoglycolate from its host. This model has been restricted to encompass only proteobacteria as no related PGP has been verified outside of this clade. Sequences from Aquifex aeolicus and Treponema pallidum fall between the trusted and noise cutoffs. Just below the noise cutoff is a gene which is part of the operon for the biosynthesis of the blue pigment, indigoidine, from Erwinia (Pectobacterium) chrysanthemi, a plant pathogen. It does not seem likely, considering the proposed biosynthetic mechanism, that the dephosphorylation of phosphoglycolate or a closely related compound is required. Possibly, this gene is fortuitously located in this operon, or has an indirect relationship to the necessity for the biosynthesis of this compound. Sequences from 11 species have been annotated as PGP or putative PGP but fall below the noise cutoff. None of these have experimental validation. This enzyme is a member of the Haloacid Dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolase enzymes (pfam00702). [Energy metabolism, Sugars]	213
273633	TIGR01450	recC	exodeoxyribonuclease V, gamma subunit. This model describes the gamma subunit of exodeoxyribonuclease V. Species containing this protein should also have the alpha (TIGR01447) and beta (TIGR00609) subunits. Candidates from Borrelia and from the Chlamydias differ dramatically and score between trusted and noise cutoffs. [DNA metabolism, DNA replication, recombination, and repair]	1060
273634	TIGR01451	B_ant_repeat	conserved repeat domain. This model represents the conserved region of about 53 amino acids shared between regions, usually repeated, of proteins from a small number of phylogenetically distant prokaryotes. Examples include a 132-residue region found repeated in three of the five longest proteins of Bacillus anthracis, a 131-residue repeat in a cell wall-anchored protein of Enterococcus faecalis, and a 120-residue repeat in Methanobacterium thermoautotrophicum. A similar region is found in some Chlamydial outer membrane proteins.	53
273635	TIGR01452	PGP_euk	phosphoglycolate/pyridoxal phosphate phosphatase family. PGP is an essential enzyme in the glycolate salvage pathway in higher organisms (photorespiration in plants). Phosphoglycolate results from the oxidase activity of RubisCO in the Calvin cycle when concentrations of carbon dioxide are low relative to oxygen. In mammals, PGP is found in many tissues, notably in red blood cells where P-glycolate is an important activator of the hydrolysis of 2,3-bisphosphoglycerate, a major modifier of the oxygen affinity of hemoglobin. Pyridoxal phosphate (PLP, Vitamin B6) phosphatase is involved in the degradation of PLP in mammals and is widely distributed in human tissues including erythrocytes. The enzymes described here are members of the Haloacid dehalogenase superfamily of hydrolase enzymes (pfam00702). Unlike the bacterial PGP equivalog (TIGR01449), which is a member of class (subfamily) I, these enzymes are members of class (subfamily) II. These two families have almost certainly arisen from convergent evolution (although these two ancestors may themselves have diverged from a more distant HAD superfamily progenitor). The primary seed sequence for this model comes from Chlamydomonas reinhardtii, a photosynthetic alga. The enzyme has been purified and characterized and these data are fully consistent with the assignment of function as a PGPase involved in photorespiration. The second seed, from Homo sapiens chromosome 22 has been characterized as a pyridoxal phosphatase. Biochemical characterization of partially purified PGPs from various tissues including red blood cells has been performed while one gene for PGP has been localized to chromosome 16p13.3. The sequence used here maps to chromosome 22. There is indeed a related gene on chromosome 16 (and it is expressed, since ESTs are found) which shows 46% identity. The chromosome 16 gene is not in evidence in nraa but translated from the genomic sequence. The third seed, from C. elegans, is only supported by sequence similarity. This model is limited to eukaryotic species including S. pombe and S. cerevisiae, although several archaea score between the trusted and noise cutoffs. This model is closely related to a family of bacterial sequences including the E. coli NagD and B. subtilis AraL genes which are characterized by the ability to hydrolyze para-nitrophenylphosphate (pNPPases or NPPases). The chlamydomonas PGPase d	279
273636	TIGR01453	grpIintron_endo	group I intron endonuclease. This model represents one subfamily of endonucleases containing the endo/excinuclease amino terminal domain, pfam01541 at its amino end. A distinct subfamily includes excinuclease abc subunit c (uvrC). Members of pfam01541 are often termed GIY-YIG endonucleases after conserved motifs near the amino end. This subfamily in this model is found in open reading frames of group I introns in both phage and mitochondria. The closely related endonucleases of phage T4: segA, segB, segC, segD and segE, score below the trusted cutoff for the family.	214
130521	TIGR01454	AHBA_synth_RP	3-amino-5-hydroxybenzoic acid synthesis related protein. The enzymes in this equivalog are all located in the operons for the biosynthesis of 3-amino-5-hydroxybenzoic acid (AHBA), which is a precursor of several antibiotics including ansatrienin, naphthomycin, rifamycin and mitomycin. The role that this enzyme plays in this biosynthesis has not been elucidated. This enzyme is a member of the Haloacid dehalogenase superfamily (pfam00702) of aspartate-nucleophile hydrolases. This enzyme is closely related to phosphoglycolate phosphatase (TIGR01449), but it is unclear what purpose a PGPase or PGPase-like activity would serve in these biosyntheses. This model is limited to the Gram positive Actinobacteria. The most closely related enzyme below the noise cutoff is IndB which is involved in the biosynthesis of Indigoidine in Pectobacterium (Erwinia) chrysanthemi, a gamma proteobacterium. This enzyme is similarly related to PGP. In this case, too, it is unclear what role would be played by a PGPase activity.	205
130522	TIGR01455	glmM	phosphoglucosamine mutase. This model describes GlmM, phosphoglucosamine mutase, also designated MrsA and YhbF in E. coli, UreC in Helicobacter pylori, and femR315 or FemD in Staphylococcus aureus. It converts glucosamine-6-phosphate to glucosamine-1-phosphate as part of the pathway toward UDP-N-acetylglucosamine for peptidoglycan and lipopolysaccharides. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan, Central intermediary metabolism, Amino sugars]	443
200106	TIGR01456	CECR5	HAD-superfamily class IIA hydrolase, TIGR01456, CECR5. This hypothetical equivalog is a member of the Class IIA subfamily of the haloacid dehalogenase superfamily of aspartate-nucleophile hydrolases. The sequences modelled by this equivalog are all eukaryotes. One sequence (GP|13344995) is called "Cat Eye Syndrome critical region protein 5" (CECR5). This gene has been cloned from a pericentromere region of human chromosome 22 believed to be the location of the gene or genes responsible for Cat Eye Syndrome. This is one of a number of candidate genes. The Schizosaccharomyces pombe sequence (EGAD|138276) is annotated as "phosphatidyl synthase," however this is due entirely to a C-terminal region of the protein (outside the region of similarity of this model) which is highly homologous to a family of CDP-alcohol phosphatidyltransferases. (Thus, the annotation of GP|4226073 from C. elegans as similar to phosphatidyl synthase, is a mistake as this gene does not contain the C-terminal portion). The physical connection of the phosphatidyl synthase and the HAD-superfamily hydrolase domain in S. pombe may, however, be an important clue to the substrate for the hydrolases in this equivalog.	321
130524	TIGR01457	HAD-SF-IIA-hyp2	HAD-superfamily subfamily IIA hydrolase, TIGR01457. This hypothetical equivalog is a member of the Class IIA subfamily of the haloacid dehalogenase superfamily of aspartate-nucleophile hydrolases. The sequences modelled by this equivalog are all gram positive (low-GC) bacteria. Sequences found in this model are annotated variously as related to NagD or 4-nitrophenyl phosphatase, and this hypothetical equivalog, of all of those within the Class IIA subfamily, is most closely related to the E. coli NagD enzyme and the PGP_euk equivalog (TIGR01452). However, there is presently no evidence that this hypothetical equivalog has the same function as either of those. [Unknown function, Enzymes of unknown specificity]	249
162372	TIGR01458	HAD-SF-IIA-hyp3	HAD-superfamily subfamily IIA hydrolase, TIGR01458. This hypothetical equivalog is a member of the IIA subfamily (TIGR01460) of the haloacid dehalogenase superfamily of aspartate-nucleophile hydrolases. One sequence (GP|10716807) has been annotated as a "phospholysine phosphohistidine inorganic pyrophosphatase," probably in reference to studies on similarly described (but unsequenced) enzymes from bovine and rat tissues. However, the supporting information for this annotation has never been published. [Unknown function, Enzymes of unknown specificity]	257
130526	TIGR01459	HAD-SF-IIA-hyp4	HAD-superfamily class IIA hydrolase, TIGR01459. This hypothetical equivalog is a member of the Class IIA subfamily of the haloacid dehalogenase superfamily of aspartate-nucleophile hydrolases. The sequences modelled by this equivalog are all gram negative and primarily alpha proteobacteria. Only one sequence has been annotated as other than "hypothetical." That one, from Brucella, is annotated as related to NagD, but only by sequence similarity and should be treated with some skepticism. (See comments for Class IIA subfamily model)	242
273637	TIGR01460	HAD-SF-IIA	Haloacid Dehalogenase Superfamily Class (subfamily) IIA. This model represents one structural subclass of the Haloacid Dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases. The superfamily is defined by the presence of three short catalytic motifs. The classes are defined based on the location and the observed or predicted fold of a so-called "capping domain", or the absence of such a domain. Class I consists of sequences in which the capping domain is found in between the first and second catalytic motifs. Class II consists of sequences in which the capping domain is found between the second and third motifs. Class III sequences have no capping domain in either of these positions. The Class IIA capping domain is predicted by PSI-PRED to consist of a mixed alpha-beta fold with the basic pattern: Helix-Helix-Helix-Sheet-Helix-Loop-Sheet-Helix-Sheet-Helix. Presently, this subfamily encompasses a single equivalog model (TIGR01452) for the eukaryotic phosphoglycolate phosphatase, as well as four hypothetical equivalogs covering closely related sequences (TIGR01456 and TIGR01458 in eukaryotes, TIGR01457 in gram positive bacteria and TIGR01459 in gram negative bacteria). The Escherichia coli NagD gene and the Bacillus subtilis AraL gene are members of this subfamily but are not members of any of the presently defined equivalogs within it. NagD is part of the NAG operon responsible for N-acetylglucosamine metabolism. The function of this gene is unknown. Genes from several organisms have been annotated as NagD, or NagD-like. However, without data on the presence of other members of this pathway, (such as in the case of Yersinia pestis) these assignments should not be given great weight. The AraL gene is similar: it is part of the L-arabinose operon, but the function is unknown. A gene from Halobacterium has been annotated as AraL, but no other Ara operon genes have been annotated. Many of the genes in this subfamily have been annotated as "pNPPase", "4-nitrophenyl phosphatase" or "NPPase". These all refer to the same activity versus a common lab test compound used to determine phosphatase activity. There is no evidence that this activity is physiologically relevant. [Unknown function, Enzymes of unknown specificity]	236
130528	TIGR01461	greB	transcription elongation factor GreB. The GreA and GreB transcription elongation factors enable the continuation of RNA transcription past template-encoded arresting sites. Among the Proteobacteria, distinct clades of GreA and GreB are found. GreB differs functionally in that it releases larger oligonucleotides. This model describes proteobacterial GreB. [Transcription, Transcription factors]	156
273638	TIGR01462	greA	transcription elongation factor GreA. The GreA and GreB transcription elongation factors enable the continuation of RNA transcription past template-encoded arresting sites. Among the Proteobacteria, distinct clades of GreA and GreB are found. GreA differs functionally in that it releases smaller oligonucleotides. Because members of the family outside the Proteobacteria resemble GreA more closely than GreB, the GreB clade (TIGR01461) forms a plausible outgroup and the remainder of the GreA/B family, included in this model, is designated GreA. In the Chlamydias and some spirochetes, the region described by this model is found as the C-terminal region of a much larger protein. [Transcription, Transcription factors]	151
273639	TIGR01463	mtaA_cmuA	methyltransferase, MtaA/CmuA family. This subfamily is closely related to, yet is distinct from, uroporphyrinogen decarboxylase (EC 4.1.1.37). It includes two isozymes from Methanosarcina barkeri of methylcobalamin--coenzyme M methyltransferase. It also includes a chloromethane utilization protein, CmuA, which transfers the methyl group of chloromethane to a corrinoid protein.	336
273640	TIGR01464	hemE	uroporphyrinogen decarboxylase. This model represents uroporphyrinogen decarboxylase (HemE), which converts uroporphyrinogen III to coproporphyrinogen III. This step takes the pathway toward protoporphyrin IX, a common precursor of both heme and chlorophyll, rather than toward precorrin 2 and its products. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	338
200107	TIGR01465	cobM_cbiF	precorrin-4 C11-methyltransferase. This model represents precorrin-4 C11-methyltransferase, one of two methyltransferases commonly referred to as precorrin-3 methylase (the other is precorrin-3B C17-methyltransferase, EC 2.1.1.131). This enzyme participates in the pathway toward the biosynthesis of cobalamin and related products. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	247
273641	TIGR01466	cobJ_cbiH	precorrin-3B C17-methyltransferase. This model represents precorrin-3B C17-methyltransferase, one of two methyltransferases commonly referred to as precorrin-3 methylase (the other is precorrin-4 C11-methyltransferase, EC 2.1.1.133). This enzyme participates in the pathway toward the biosynthesis of cobalamin and related products. Members of this family may appear as fusion proteins with other enzymes of cobalamin biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	239
273642	TIGR01467	cobI_cbiL	precorrin-2 C(20)-methyltransferase. This model represents precorrin-2 C(20)-methyltransferase, one of several closely related S-adenosylmethionine-dependent methyltransferases involved in cobalamin (vitamin B12) biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	230
273643	TIGR01469	cobA_cysG_Cterm	uroporphyrin-III C-methyltransferase. This model represents enzymes, or enzyme domains, with uroporphyrin-III C-methyltransferase activity. This enzyme catalyzes the first step committed to the biosynthesis of either siroheme or cobalamin (vitamin B12) rather than protoheme (heme). Cobalamin contains cobalt while siroheme contains iron. Siroheme is a cofactor for nitrite and sulfite reductases and therefore plays a role in cysteine biosynthesis; many members of this family are CysG, siroheme synthase, with an additional N-terminal domain and with additional oxidation and iron insertion activities. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	236
130536	TIGR01470	cysG_Nterm	siroheme synthase, N-terminal domain. This model represents a subfamily of CysG N-terminal region-related sequences. All sequences in the seed alignment for this model are N-terminal regions of known or predicted siroheme synthases. The C-terminal region of each is uroporphyrin-III C-methyltransferase (EC 2.1.1.107), which catalyzes the first step committed to the biosynthesis of either siroheme or cobalamin (vitamin B12) rather than protoheme (heme). The region represented by this model completes the process of oxidation and iron insertion to yield siroheme. Siroheme is a cofactor for nitrite and sulfite reductases, so siroheme synthase is CysG of cysteine biosynthesis in some organisms. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	205
273644	TIGR01472	gmd	GDP-mannose 4,6-dehydratase. Alternate name: GDP-D-mannose dehydratase. This enzyme converts GDP-mannose to GDP-4-dehydro-6-deoxy-D-mannose, the first of three steps for the conversion of GDP-mannose to GDP-fucose in animals, plants, and bacteria. In bacteria, GDP-L-fucose acts as a precursor of surface antigens such as the extracellular polysaccharide colanic acid of E. coli. Excluded from this model are members of the clade that score poorly because of highly derived (phylogenetically long-branch) sequences, e.g. Aneurinibacillus thermoaerophilus Gmd, described as a bifunctional GDP-mannose 4,6-dehydratase/GDP-6-deoxy-D-lyxo-4-hexulose reductase (PUBMED:11096116). [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	343
273645	TIGR01473	cyoE_ctaB	protoheme IX farnesyltransferase. This model describes protoheme IX farnesyltransferase, also called heme O synthase, an enzyme that creates an intermediate in the biosynthesis of heme A. Prior to the description of its enzymatic function, this protein was often called a cytochrome o ubiquinol oxidase assembly factor. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	280
130539	TIGR01474	ubiA_proteo	4-hydroxybenzoate polyprenyl transferase, proteobacterial. This model represents a family of integral membrane proteins that condenses para-hydroxybenzoate with any of several polyprenyldiphosphates. Heterologous expression studies suggest that, for many but not all members, the activity seen (e.g. octaprenyltransferase in E. coli) reflects available host isoprenyl pools rather than enzyme specificity. A fairly deep split by both clustering (UPGMA) and phylogenetics (NJ tree) separates this group (mostly Proteobacterial and mitochondrial), with several characterized members, from another group (mostly archaeal and Gram-positive bacterial) lacking characterized members. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone]	281
273646	TIGR01475	ubiA_other	putative 4-hydroxybenzoate polyprenyltransferase. A fairly deep split separates this polyprenyltransferase subfamily from the set of mitochondrial and proteobacterial 4-hydroxybenzoate polyprenyltransferases, described in TIGR01474. Protoheme IX farnesyltransferase (heme O synthase) (TIGR01473) is more distantly related. Because no species appears to have both this protein and a member of TIGR01474, it is likely that this model represents 4-hydroxybenzoate polyprenyltransferase, a critical enzyme of ubiquinone biosynthesis, in the Archaea, Gram-positive bacteria, Aquifex aeolicus, the Chlamydias, etc. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone]	282
130541	TIGR01476	chlor_syn_BchG	bacteriochlorophyll/chlorophyll synthetase. This model describes a subfamily of a large family of polyprenyltransferases (pfam01040) that also includes 4-hydroxybenzoate octaprenyltransferase and protoheme IX farnesyltransferase (heme O synthase). Members of this family are found exclusively in photosynthetic organisms, including a single copy in Arabidopsis thaliana. [Biosynthesis of cofactors, prosthetic groups, and carriers, Chlorophyll and bacteriochlorphyll]	283
273647	TIGR01477	RIFIN	variant surface antigen, rifin family. This model represents the rifin branch of the rifin/stevor family (pfam02009) of predicted variant surface antigens as found in Plasmodium falciparum. This model is based on a set of rifin sequences kindly provided by Matt Berriman from the Sanger Center. This is a global model and assesses a penalty for incomplete sequence. Additional fragmentary sequences may be found with the fragment model and a cutoff of 20 bits.	353
130543	TIGR01478	STEVOR	variant surface antigen, stevor family. This model represents the stevor branch of the rifin/stevor family (pfam02009) of predicted variant surface antigens as found in Plasmodium falciparum. This model is based on a set of stevor sequences kindly provided by Matt Berriman from the Sanger Center. This is a global model and assesses a penalty for incomplete sequence. Additional fragmentary sequences may be found with the fragment model and a cutoff of 8 bits.	295
273648	TIGR01479	GMP_PMI	mannose-1-phosphate guanylyltransferase/mannose-6-phosphate isomerase. This enzyme is known to be bifunctional, as both mannose-6-phosphate isomerase (EC 5.3.1.8) (PMI) and mannose-1-phosphate guanylyltransferase (EC 2.7.7.22) in Pseudomonas aeruginosa, Xanthomonas campestris, and Gluconacetobacter xylinus. The literature on the enzyme from E. coli attributes mannose-6-phosphate isomerase activity to an adjacent gene, but the present sequence has not been shown to lack the activity. The PMI domain is C-terminal. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	468
273649	TIGR01480	copper_res_A	copper-resistance protein, CopA family. This model represents the CopA copper resistance protein family. CopA is related to laccase (benzenediol:oxygen oxidoreductase) and L-ascorbate oxidase, both copper-containing enzymes. Most members have a typical TAT (twin-arginine translocation) signal sequence with an Arg-Arg pair. Twin-arginine translocation is observed for a large number of periplasmic proteins that cross the inner membrane with metal-containing cofactors already bound. The combination of copper-binding sites and TAT translocation motif suggests a mechanism of resistance by packaging and export. [Cellular processes, Detoxification, Transport and binding proteins, Cations and iron carrying compounds]	587
130546	TIGR01481	ccpA	catabolite control protein A. Catabolite control protein A is a LacI family global transcriptional regulator found in Gram-positive bacteria. CcpA is involved in repressing carbohydrate utilization genes [ex: alpha-amylase (amyE), acetyl-coenzyme A synthase (acsA)] and in activating genes involved in transporting excess carbon from the cell [ex: acetate kinase (ackA), alpha-acetolactate synthase (alsS)]. Additionally, disruption of CcpA in Bacillus megaterium, Staphylococcus xylosus, Lactobacillus casei and Lactobacillus pentosus also decreases growth rate, which suggests CcpA is involved in the regulation of other metabolic pathways. [Regulatory functions, DNA interactions]	329
273650	TIGR01482	SPP-subfamily	sucrose-phosphate phosphatase subfamily. This model includes both the members of the SPP equivalog model (TIGR01485), encompassing plants and cyanobacteria, as well as those archaeal sequences which are the closest relatives (TIGR01487). It remains to be shown whether these archaeal sequences catalyze the same reaction as SPP.	225
273651	TIGR01484	HAD-SF-IIB	HAD-superfamily hydrolase, subfamily IIB. This subfamily falls within the Haloacid Dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases. The Class II subfamilies are characterized by a domain that is located between the second and third conserved catalytic motifs of the superfamily domain. The IIB subfamily is distinguished from the IIA subfamily (TIGR01460) by homology and the predicted secondary structure of this domain by PSI-PRED. The IIB subfamily's Class II domain has the following predicted structure: Helix-Sheet-Sheet-(Helix or Sheet)-Helix-Sheet-(variable)-Helix-Sheet-Sheet. The IIB subfamily consists of Trehalose-6-phosphatase (TIGR00685), plant and cyanobacterial Sucrose-phosphatase and a closely related group of bacterial and archaeal sequences, eukaryotic phosphomannomutase (pfam03332), a large subfamily ("Cof-like hydrolases", TIGR00099) containing many closely related bacterial sequences, a hypothetical equivalog containing the E. coli YedP protein, as well as two small clusters containing OMNI|TC0379 and OMNI|SA2196 whose relationship to the other groups is unclear. [Unknown function, Enzymes of unknown specificity]	207
130549	TIGR01485	SPP_plant-cyano	sucrose-6F-phosphate phosphohydrolase. This model describes the sucrose phosphate phosphohydrolase from plants and cyanobacteria (SPP). SPP is a member of the Class IIB subfamily (TIGR01484) of the Haloacid Dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases. SPP catalyzes the final step in the biosynthesis of sucrose, a critically important molecule for plants. Sucrose phosphate synthase (SPS), the prior step in the biosynthesis of sucrose, contains a domain which exhibits considerable similarity to SPP albeit without conservation of the catalytic residues. The catalytic machinery of the synthase resides in another domain. It seems likely that the phosphatase-like domain is involved in substrate binding, possibly binding both substrates in a "product-like" orientation prior to ligation by the synthase catalytic domain.	249
130550	TIGR01486	HAD-SF-IIB-MPGP	mannosyl-3-phosphoglycerate phosphatase family. This small group of proteins is a member of the IIB subfamily (TIGR01484) of the Haloacid Dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases. Several members of this family from thermophiles (and from Dehalococcoides ethenogenes) are now known to act as mannosyl-3-phosphoglycerate (MPG) phosphatase. In these cases, the enzyme acts after MPG synthase to make the compatible solute mannosylglycerate. We propose that other mesophilic members of this family do not act as mannosyl-3-phosphoglycerate phosphatase. A member of this family is found in Escherichia coli, which appears to lack MPG synthase. Mannosylglycerate is imported in E. coli by phosphoenolpyruvate-dependent transporter (), but it appears the phosphorylation is not on the glycerate moiety, that the phosphorylated import is degraded by an alpha-mannosidase from an adjacent gene, and that E. coli would have no pathway to obtain MPG.	256
273652	TIGR01487	Pglycolate_arch	phosphoglycolate phosphatase, TA0175-type. This group of Archaeal sequences, now known to be phosphoglycolate phosphatases, is most closely related to the sucrose-phosphate phosphatases from plants and cyanobacteria (TIGR01485). Together, these two models comprise a subfamily model (TIGR01482). TIGR01482, in turn, is a member of the IIB subfamily (TIGR01484) of the Haloacid Dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases.	215
273653	TIGR01488	HAD-SF-IB	Haloacid Dehalogenase superfamily, subfamily IB, phosphoserine phosphatase-like. This model represents a subfamily of the Haloacid Dehalogenase superfamily of aspartate-nucleophile hydrolases. Subfamilies IA, B, C and D are distinguished from the rest of the superfamily by the presence of a variable domain between the first and second conserved catalytic motifs. In subfamilies IA and IB, this domain consists of an alpha-helical bundle. It was necessary to model these two subfamilies separately, breaking them at an apparent phylogenetic bifurcation, so that the resulting model(s) are not so broadly defined that members of subfamily III (which lack the variable domain) are included. Subfamily IB includes the enzyme phosphoserine phosphatase (TIGR00338) as well as three hypothetical equivalogs. Many members of these hypothetical equivalogs have been annotated as PSPase-like or PSPase-family proteins. In particular, the hypothetical equivalog which appears to be most closely related to PSPase contains only Archaea (while TIGR00338 contains only eukaryotes and bacteria) of which some are annotated as PSPases. Although this is a reasonable conjecture, none of these sequences has sufficient evidence for this assignment. If such should be found, this model should be retired while the PSPase model should be broadened to include these sequences. [Unknown function, Enzymes of unknown specificity]	177
213629	TIGR01489	DKMTPPase-SF	2,3-diketo-5-methylthio-1-phosphopentane phosphatase. This phosphatase is a member of the IB subfamily (TIGR01488) of the haloacid dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases. With the exception of OMNI|NTL01BS01361 from B. subtilis and GP|15024582 from Clostridium acetobutylicum, the members of this group are all eukaryotic, spanning metazoa, plants and fungi. The B. subtilis gene (YkrX, renamed MtnX) is part of an operon for the conversion of methylthioribose (MTR) to methionine. It works with the enolase MtnW, a RuBisCO homolog. The combination of MtnW and MtnX achieves the same overall reaction as the enolase-phosphatase MtnC. The function of MtnX was shown by Ashida, et al. (2003) to be 2,3-diketo-5-methylthio-1-phosphopentane phosphatase, rather than 2,3-diketo-5-methylthio-1-phosphopentane phosphatase as proposed earlier. See the Genome Property for methionine salvage for more details. In eukaryotes, methionine salvage from methylthioadenosine also occurs. It seems reasonable that members of this family in eukaryotes fulfill a similar role as in Bacillus. A more specific, equivalog-level model is TIGR03333. Note that SP|P53981 from S. cerevisiae, a member of this family, is annotated as a "probable membrane protein" due to a predicted transmembrane helix. The region in question contains the second of the three conserved HAD superfamily catalytic motifs and thus, considering the fold of the HAD catalytic domain, is unlikely to be a transmembrane region in fact. [Central intermediary metabolism, Other]	188
273654	TIGR01490	HAD-SF-IB-hyp1	HAD-superfamily subfamily IB hydrolase, TIGR01490. This hypothetical equivalog is a member of the IB subfamily (TIGR01488) of the haloacid dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases. The sequences modelled here are all bacterial. The IB subfamily includes the enzyme phosphoserine phosphatase (TIGR00338). Due to this relationship, several of these sequences have been annotated as "phosphoserine phosphatase related proteins," or "Phosphoserine phosphatase-family enzymes." There is presently no evidence that any of the enzymes in this model possess PSPase activity. OMNI|NTL01ML1250 is annotated as a "possible transferase," however this is due to the C-terminal domain found on this sequence which is homologous to a group of glycerol-phosphate acyltransferases (between trusted and noise to TIGR00530). A subset of these sequences including OMNI|CC1962, the Caulobacter crescentus CicA protein cluster together and may represent a separate equivalog. [Unknown function, Enzymes of unknown specificity]	202
273655	TIGR01491	HAD-SF-IB-PSPlk	HAD-superfamily, subfamily-IB PSPase-like hydrolase, archaeal. This hypothetical equivalog is a member of the IB subfamily (TIGR01488) of the haloacid dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases. The sequences modelled here are all from archaeal species. The phylogenetically closest group of sequences to these are phosphoserine phosphatases (TIGR00338). There are no known archaeal phosphoserine phosphatases, and no archaea fall within TIGR00338. It is likely, then, that this model represents the archaeal branch of the PSPase equivalog.	201
130556	TIGR01492	CPW_WPC	Plasmodium falciparum CPW-WPC domain. This model represents a domain of about 61 residues in length with six well-conserved cysteine residues and six well-conserved aromatic sites. The domain can be found in tandem repeats, and is known so far only in Plasmodium falciparum. It is named for motifs of CPxxW and (less well conserved) WPC.	62
130557	TIGR01493	HAD-SF-IA-v2	Haloacid dehalogenase superfamily, subfamily IA, variant 2 with 3rd motif like haloacid dehalogenase. This model represents part of one structural subfamily of the Haloacid Dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases. The superfamily is defined by the presence of three short catalytic motifs. The subfamilies are defined based on the location and the observed or predicted fold of a so-called 'capping domain', or the absence of such a domain. Subfamily I consists of sequences in which the capping domain is found in between the first and second catalytic motifs. Subfamily II consists of sequences in which the capping domain is found between the second and third motifs. Subfamily III sequences have no capping domain in either of these positions. The Subfamily IA and IB capping domains are predicted by PSI-PRED to consist of an alpha helical bundle. Subfamily I encompasses such a wide region of sequence space (the sequences are highly divergent) that representing it with a single model is impossible, resulting in an overly broad description which allows in many unrelated sequences. Subfamily IA and IB are separated based on an apparent phylogenetic bifurcation. Subfamily IA is still too broad to model, but cannot be further subdivided into large chunks based on phylogenetic trees. Of the three motifs defining the HAD superfamily, the third has three variant forms: (1) hhhhsDxxx(x)D, (2) hhhhssxxx(x)D and (3) hhhhDDxxx(x)s where _s_ refers to a small amino acid and _h_ to a hydrophobic one. All three of these variants are found in subfamily IA. Individual models were made based on seeds exhibiting only one of the variants each. Variant 2 (this model) is distinctive of the type II haloacid dehalogenases, and nearly all of the sequences are also part of the HAD, type II equivalog model (TIGR01428). These three variant models were created with the knowledge that there will be overlap among them - this is by design and serves the purpose of eliminating the overlap with models of more distantly related HAD subfamilies caused by an overly broad single model.	175
273656	TIGR01494	ATPase_P-type	ATPase, P-type (transporting), HAD superfamily, subfamily IC. The P-type ATPases are a large family of trans-membrane transporters acting on charged substances. The distinguishing feature of the family is the formation of a phosphorylated intermediate (aspartyl-phosphate) during the course of the reaction. Another common name for these enzymes is the E1-E2 ATPases based on the two isolable conformations: E1 (unphosphorylated) and E2 (phosphorylated). Generally, P-type ATPases consist of only a single subunit encompassing the ATPase and ion translocation pathway, however, in the case of the potassium (TIGR01497) and sodium/potassium (TIGR01106) varieties, these functions are split between two subunits. Additional small regulatory or stabilizing subunits may also exist in some forms. P-type ATPases are nearly ubiquitous in life and are found in numerous copies in higher organisms (at least 45 in Arabidopsis thaliana, for instance). Phylogenetic analyses have revealed that the P-type ATPase subfamily is divided up into groups based on substrate specificities and this is represented in the various subfamily and equivalog models that have been made: IA (K+) TIGR01497, IB (heavy metals) TIGR01525, IIA1 (SERCA-type Ca++) TIGR01116, IIA2 (PMR1-type Ca++) TIGR01522, IIB (PMCA-type Ca++) TIGR01517, IIC (Na+/K+, H+/K+ antiporters) TIGR01106, IID (fungal-type Na+ and K+) TIGR01523, IIIA (H+) TIGR01647, IIIB (Mg++) TIGR01524, IV (phospholipid, flippase) TIGR01652 and V (unknown specificity) TIGR01657. The crystal structure of one calcium-pumping ATPase and an analysis of the fold of the catalytic domain of the P-type ATPases have been published. These reveal that the catalytic core of these enzymes is a haloacid dehalogenase(HAD)-type aspartate-nucleophile hydrolase. The location of the ATP-binding loop in between the first and second HAD conserved catalytic motifs defines these enzymes as members of subfamily I of the HAD superfamily (see also TIGR01493, TIGR01509, TIGR01549, TIGR01544 and TIGR01545). Based on these classifications, the P-type ATPase _superfamily_ corresponds to the IC subfamily of the HAD superfamily.	545
130559	TIGR01495	ETRAMP	Plasmodium ring stage membrane protein ETRAMP. This model describes a family of proteins from the malaria parasite Plasmodium falciparum, several of which have been shown to be expressed specifically in the ring stage, as well as from the rodent parasite Plasmodium yoelii. A homolog from Plasmodium chabaudi was localized to the parasitophorous vacuole membrane. Members have an initial hydrophobic, Phe/Tyr-rich stretch long enough to span the membrane, a highly charged region rich in Lys, a second putative transmembrane region, and a second highly charged, low complexity sequence region. Some members have up to 100 residues of additional C-terminal sequence. These genes have been shown to be found in the sub-telomeric regions of both P. falciparum and P. yoelii chromosomes.	85
273657	TIGR01496	DHPS	dihydropteroate synthase. This model represents dihydropteroate synthase, the enzyme that catalyzes the second to last step in folic acid biosynthesis. The gene is usually designated folP (folic acid biosynthesis) or sul (sulfanilamide resistance). This model represents one branch of the family of pterin-binding enzymes (pfam00809) and of a cluster of dihydropteroate synthase and related enzymes (COG0294). Other members of pfam00809 and COG0294 are represented by model TIGR00284. [Biosynthesis of cofactors, prosthetic groups, and carriers, Folic acid]	257
130561	TIGR01497	kdpB	K+-transporting ATPase, B subunit. This model describes the P-type ATPase subunit of the complex responsible for translocating potassium ions across biological membranes in microbes. In E. coli and other species, this complex consists of the proteins KdpA, KdpB, KdpC and KdpF. KdpB is the ATPase subunit, while KdpA is the potassium-ion translocating subunit. The function of KdpC is unclear, although it has been suggested to couple the ATPase subunit to the ion-translocating subunit, while KdpF serves to stabilize the complex. The potassium P-type ATPases have been characterized as Type IA based on a phylogenetic analysis which places this clade closest to the heavy-metal translocating ATPases (Type IB). Others place this clade closer to the Na+/K+ antiporter type (Type IIC) based on physical characteristics. This model is very clear-cut, with a strong break between trusted hits and noise. All members of the seed alignment, from Clostridium, Anabaena and E. coli, are in the characterized table. One sequence above trusted, OMNI|NTL01TA01282, is apparently mis-annotated in the primary literature, but properly annotated by TIGR. [Transport and binding proteins, Cations and iron carrying compounds]	675
273658	TIGR01498	folK	2-amino-4-hydroxy-6-hydroxymethyldihydropteridine diphosphokinase. This model describes the folate biosynthesis enzyme 2-amino-4-hydroxy-6-hydroxymethyldihydropteridine pyrophosphokinase. Alternate names include 6-hydroxymethyl-7,8-dihydropterin diphosphokinase and 7,8-dihydro-6-hydroxymethylpterin pyrophosphokinase (HPPK). The extreme C-terminal region, of typically eight to thirty residues, is not included in the model. This enzyme may be found as a fusion protein with other enzymes of folate biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Folic acid]	129
273659	TIGR01499	folC	folylpolyglutamate synthase/dihydrofolate synthase. This model represents the FolC family of folate pathway proteins. Most examples are bifunctional, active as both folylpolyglutamate synthetase (EC 6.3.2.17) and dihydrofolate synthetase (EC 6.3.2.12). The two activities are similar - ATP + glutamate + dihydropteroate or tetrahydrofolyl-[Glu](n) = ADP + orthophosphate + dihydrofolate or tetrahydrofolyl-[Glu](n+1). A mutation study of the FolC gene of E. coli suggests that both activities belong to the same active site. Because some examples are monofunctional (and these cannot be separated phylogenetically), the model is treated as subfamily, not equivalog. [Biosynthesis of cofactors, prosthetic groups, and carriers, Folic acid]	397
273660	TIGR01500	sepiapter_red	sepiapterin reductase. This model describes sepiapterin reductase, a member of the short chain dehydrogenase/reductase family. The enzyme catalyzes the last step in the biosynthesis of tetrahydrobiopterin. A similar enzyme in Bacillus cereus was isolated for its ability to convert benzil to (S)-benzoin, a property sepiapterin reductase also shares. Cutoff scores for this model are set such that benzil reductase scores between trusted and noise cutoffs.	256
130565	TIGR01501	MthylAspMutase	methylaspartate mutase, S subunit. This model represents the S (sigma) subunit of methylaspartate mutase (glutamate mutase), a cobalamin-dependent enzyme that catalyzes the first step in a pathway of glutamate fermentation. [Energy metabolism, Amino acids and amines, Energy metabolism, Fermentation]	134
213632	TIGR01502	B_methylAsp_ase	methylaspartate ammonia-lyase. This model describes methylaspartate ammonia-lyase, also called beta-methylaspartase (EC 4.3.1.2). It follows methylaspartate mutase (composed of S and E subunits) in one of several possible pathways of glutamate fermentation. [Energy metabolism, Amino acids and amines, Energy metabolism, Fermentation]	408
130567	TIGR01503	MthylAspMut_E	methylaspartate mutase, E subunit. This model represents the E (epsilon) subunit of methylaspartate mutase (glutamate mutase), a cobalamin-dependent enzyme that catalyzes the first step in a pathway of glutamate fermentation. [Energy metabolism, Amino acids and amines, Energy metabolism, Fermentation]	480
213633	TIGR01504	glyox_carbo_lig	glyoxylate carboligase. Glyoxylate carboligase, also called tartronate-semialdehyde synthase, releases CO2 while synthesizing a single molecule of tartronate semialdehyde from two molecules of glyoxylate. It is a thiamine pyrophosphate-dependent enzyme, closely related in sequence to the large subunit of acetolactate synthase. In the D-glycerate pathway, part of allantoin degradation in the Enterobacteriaceae, tartronate semialdehyde is converted to D-glycerate and then 3-phosphoglycerate, a product of glycolysis and entry point in the general metabolism.	588
130569	TIGR01505	tartro_sem_red	2-hydroxy-3-oxopropionate reductase. This model represents 2-hydroxy-3-oxopropionate reductase (EC 1.1.1.60), also called tartronate semialdehyde reductase. It follows glyoxylate carboligase and precedes glycerate kinase in D-glycerate pathway of glyoxylate degradation. The eventual product, 3-phosphoglycerate, is an intermediate of glycolysis and is readily metabolized. Tartronic semialdehyde, the substrate of this enzyme, may also come from other pathways, such as D-glucarate catabolism.	291
130570	TIGR01506	ribC_arch	riboflavin synthase. This archaeal protein catalyzes the same reaction, the final step in riboflavin biosynthesis, as bacterial riboflavin biosynthesis alpha chain. However, it is more similar in sequence to 6,7-dimethyl-8-ribityllumazine synthase, which catalyzes the previous reaction and which (in bacteria) is called the riboflavin synthase beta chain. [Biosynthesis of cofactors, prosthetic groups, and carriers, Riboflavin, FMN, and FAD]	151
273661	TIGR01507	hopene_cyclase	squalene-hopene cyclase. SHC is an essential prokaryotic gene in hopanoid (triterpenoid) biosynthesis. Squalene hopene cyclase, an integral membrane protein, directly cyclizes squalene into hopanoid products. [Fatty acid and phospholipid metabolism, Other]	635
130572	TIGR01508	rib_reduct_arch	2,5-diamino-6-hydroxy-4-(5-phosphoribosylamino)pyrimidine 1'-reductase, archaeal. This model represents a specific reductase of riboflavin biosynthesis in the Archaea, diaminohydroxyphosphoribosylaminopyrimidine reductase. It should not be confused with bacterial 5-amino-6-(5-phosphoribosylamino)uracil reductase. The intermediate 2,5-diamino-6-hydroxy-4-(5-phosphoribosylamino)pyrimidine in riboflavin biosynthesis is reduced first, and then deaminated, in both Archaea and Fungi, opposite the order in Bacteria. The subsequent deaminase is not presently known and is not closely homologous to the deaminase domain (3.5.4.26) fused to the reductase domain (1.1.1.193) similar to this protein but found in most bacteria.	210
273662	TIGR01509	HAD-SF-IA-v3	haloacid dehalogenase superfamily, subfamily IA, variant 3 with third motif having DD or ED. This model represents part of one structural subfamily of the Haloacid Dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases. The superfamily is defined by the presence of three short catalytic motifs. The subfamilies are defined based on the location and the observed or predicted fold of a so-called "capping domain", or the absence of such a domain. Subfamily I consists of sequences in which the capping domain is found in between the first and second catalytic motifs. Subfamily II consists of sequences in which the capping domain is found between the second and third motifs. Subfamily III sequences have no capping domain in either of these positions. The Subfamily IA and IB capping domains are predicted by PSI-PRED to consist of an alpha helical bundle. Subfamily I encompasses such a wide region of sequence space (the sequences are highly divergent) that representing it with a single model is impossible, resulting in an overly broad description which allows in many unrelated sequences. Subfamily IA and IB are separated based on an apparent phylogenetic bifurcation. Subfamily IA is still too broad to model, but cannot be further subdivided into large chunks based on phylogenetic trees. Of the three motifs defining the HAD superfamily, the third has three variant forms: (1) hhhhsDxxx(x)D, (2) hhhhssxxx(x)D and (3) hhhhDDxxx(x)s where _s_ refers to a small amino acid and _h_ to a hydrophobic one. All three of these variants are found in subfamily IA. Individual models were made based on seeds exhibiting only one of the variants each. Variant 3 (this model) is found in the enzymes beta-phosphoglucomutase (TIGR01990) and deoxyglucose-6-phosphatase, while many other enzymes of subfamily IA exhibit this variant as well as variant 1 (TIGR01549). These three variant models were created with the knowledge that there will be overlap among them - this is by design and serves the purpose of eliminating the overlap with models of more distantly related HAD subfamilies caused by an overly broad single model. [Unknown function, Enzymes of unknown specificity]	178
273663	TIGR01510	coaD_prev_kdtB	pantetheine-phosphate adenylyltransferase, bacterial. This model describes pantetheine-phosphate adenylyltransferase, the penultimate enzyme of coenzyme A (CoA) biosynthesis in bacteria. It does not show any strong homology to eukaryotic enzymes of coenzyme A biosynthesis. This protein was previously designated KdtB and postulated (because of cytidyltransferase homology and proximity to kdtA) to be an enzyme of LPS biosynthesis, a cytidyltransferase for 3-deoxy-D-manno-2-octulosonic acid. However, no activity toward that compound was found with either CTP or ATP. The phylogenetic distribution of this enzyme is more consistent with coenzyme A biosynthesis than with LPS biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pantothenate and coenzyme A]	155
273664	TIGR01511	ATPase-IB1_Cu	copper-(or silver)-translocating P-type ATPase. This model describes the P-type ATPase primarily responsible for translocating copper ions across biological membranes. These transporters are found in prokaryotes and eukaryotes. This model encompasses those species which pump copper ions out of cells or organelles (efflux pumps such as CopA of Escherichia coli) as well as those which pump the ion into cells or organelles either for the purpose of supporting life in extremely low-copper environments (for example CopA of Enterococcus hirae) or for the specific delivery of copper to a biological complex for which it is a necessary component (for example FixI of Bradyrhizobium japonicum, or CtaA and PacS of Synechocystis). The substrate specificity of these transporters may, to a varying degree, include silver ions (for example, CopA from Archaeoglobus fulgidus). Copper transporters from this family are well known as the genes which are mutated in two human disorders of copper metabolism, Wilson's and Menkes' diseases. The sequences contributing to the seed of this model are all experimentally characterized. The copper P-type ATPases have been characterized as Type IB based on a phylogenetic analysis which combines the copper-translocating ATPases with the cadmium-translocating species. This model and that describing the cadmium-ATPases (TIGR01512) are well separated, and thus we further type the copper-ATPases as IB1 (and the cadmium-ATPases as IB2). Several sequences which have not been characterized experimentally fall just below the cutoffs for both of these models (SP|Q9CCL1 from Mycobacterium leprae, GP|13816263 from Sulfolobus solfataricus, OMNI|NTL01CJ01098 from Campylobacter jejuni, OMNI|NTL01HS01687 from Halobacterium sp., GP|6899169 from Ureaplasma urealyticum and OMNI|HP1503 from Helicobacter pylori). Accession PIR|A29576 from Enterococcus faecalis scores very high against this model, yet is annotated as an "H+/K+ exchanging ATPase". BLAST of this sequence does not hit anything else annotated in this way. This error may come from the characterization paper published in 1987. Accession GP|7415611 from Saccharomyces cerevisiae appears to be mis-annotated as a cadmium resistance protein. Accession OMNI|NTL01HS00542 from Halobacterium, which scores above trusted for this model, is annotated as "molybdenum-binding protein" although no evidence can be found for this classification. [Cellular processes, Detoxification, Transport and binding proteins, Cations and iron carrying compounds]	562
273665	TIGR01512	ATPase-IB2_Cd	heavy metal-(Cd/Co/Hg/Pb/Zn)-translocating P-type ATPase. This model describes the P-type ATPase primarily responsible for translocating cadmium ions (and other closely-related divalent heavy metals such as cobalt, mercury, lead and zinc) across biological membranes. These transporters are found in prokaryotes and plants. Experimentally characterized members of the seed alignment include: SP|P37617 from E. coli, SP|Q10866 from Mycobacterium tuberculosis and SP|Q59998 from Synechocystis PCC6803. The cadmium P-type ATPases have been characterized as Type IB based on a phylogenetic analysis which combines the copper-translocating ATPases with the cadmium-translocating species. This model and that describing the copper-ATPases (TIGR01511) are well separated, and thus we further type the copper-ATPases as IB1 and the cadmium-ATPases as IB2. Several sequences which have not been characterized experimentally fall just below trusted cutoff for both of these models (SP|Q9CCL1 from Mycobacterium leprae, GP|13816263 from Sulfolobus solfataricus, OMNI|NTL01CJ01098 from Campylobacter jejuni, OMNI|NTL01HS01687 from Halobacterium sp., GP|6899169 from Ureaplasma urealyticum and OMNI|HP1503 from Helicobacter pylori). [Transport and binding proteins, Cations and iron carrying compounds]	550
273666	TIGR01513	NAPRTase_put	putative nicotinate phosphoribosyltransferase. A deep split separates two related families of proteins, one of which includes experimentally characterized examples of nicotinate phosphoribosyltransferase, the first enzyme of NAD salvage biosynthesis. This model represents the other family. Members have a different (longer) spacing of several key motifs and have an additional C-terminal domain of up to 100 residues. One argument suggesting that this family represents the same enzyme is that no species has a member of both families. Another is that the gene encoding this protein is located near other NAD salvage biosynthesis genes in Nostoc and in at least four different Gram-positive bacteria. NAD and NADP are ubiquitous in life. Most members of this family are Gram-positive bacteria. An additional set of mutually closely related archaeal sequences score between the trusted and noise cutoffs. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridine nucleotides]	443
130578	TIGR01514	NAPRTase	nicotinate phosphoribosyltransferase. This model represents nicotinate phosphoribosyltransferase, the first enzyme in the salvage pathway of NAD biosynthesis from nicotinate (niacin). Members are primarily proteobacterial but also include yeasts and Methanosarcina acetivorans. A related family, apparently non-overlapping in species distribution, is TIGR01513. Members of that family differ substantially in sequence and have a long C-terminal extension missing from this family, but are proposed also to act as nicotinate phosphoribosyltransferase (see model TIGR01513). [Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridine nucleotides]	394
273667	TIGR01515	branching_enzym	alpha-1,4-glucan:alpha-1,4-glucan 6-glycosyltransferase. This model describes the glycogen branching enzymes which are responsible for the transfer of chains of approx. 7 alpha(1--4)-linked glucosyl residues to other similar chains (in new alpha(1--6) linkages) in the biosynthesis of glycogen. This enzyme is a member of the broader amylase family of starch hydrolases which fold as (beta/alpha)8 barrels, the so-called TIM-barrel structure. All of the sequences comprising the seed of this model have been experimentally characterized. This model encompasses both bacterial and eukaryotic species. No archaea have this enzyme, although Aquifex aeolicus does. Two species, Bacillus thuringiensis and Clostridium perfringens, have two sequences each which are annotated as amylases. These annotations are apparently in error. GP|18143720 from C. perfringens, for instance, contains the note "674 aa, similar to gp:A14658_1 amylase (1,4-alpha-glucan branching enzyme (EC 2.4.1.18) ) from Bacillus thuringiensis (648 aa); 51.1% identity in 632 aa overlap." A branching enzyme from Porphyromonas gingivalis, OMNI|PG1793, appears to be more closely related to the eukaryotic species (across a deep phylogenetic split) and may represent an instance of lateral transfer from this species' host. A sequence from Arabidopsis thaliana, GP|9294564, scores just above trusted, but appears either to contain corrupt sequence or, more likely, to be a pseudogene as some of the conserved catalytic residues common to the alpha amylase family are not conserved here. [Energy metabolism, Biosynthesis and degradation of polysaccharides]	618
273668	TIGR01517	ATPase-IIB_Ca	plasma-membrane calcium-translocating P-type ATPase. This model describes the P-type ATPase responsible for translocating calcium ions across the plasma membrane of eukaryotes, out of the cell. In some organisms, this type of pump may also be found in vacuolar membranes. In humans and mice, at least, there are multiple isoforms of the PMCA pump with overlapping but not redundant functions. Accordingly, there are no human diseases linked to PMCA defects, although alterations of PMCA function do elicit physiological effects. The calcium P-type ATPases have been characterized as Type IIB based on a phylogenetic analysis which distinguishes this group from the Type IIA SERCA calcium pump. A separate analysis divides Type IIA into sub-types (SERCA and PMR1) which are represented by two corresponding models (TIGR01116 and TIGR01522). This model is well separated from those.	956
130581	TIGR01518	g3p_cytidyltrns	glycerol-3-phosphate cytidylyltransferase. This model describes glycerol-3-phosphate cytidylyltransferase, also called CDP-glycerol pyrophosphorylase. A closely related protein assigned a different function experimentally is a human ethanolamine-phosphate cytidylyltransferase (EC 2.7.7.14). Glycerol-3-phosphate cytidylyltransferase acts in pathways of teichoic acid biosynthesis. Teichoic acids are substituted polymers, linked by phosphodiester bonds, of glycerol, ribitol, etc. An example is poly(glycerol phosphate), the major teichoic acid of the Bacillus subtilis cell wall. Most but not all species encoding proteins in this family are Gram-positive bacteria. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan]	125
130582	TIGR01519	plasmod_dom_1	Plasmodium falciparum uncharacterized domain. This model represents an uncharacterized domain present in roughly eight hypothetical proteins of the malaria parasite Plasmodium falciparum.	70
130583	TIGR01520	FruBisAldo_II_A	fructose-bisphosphate aldolase, class II, yeast/E. coli subtype. Members of this family are class II examples of the glycolytic enzyme fructose-bisphosphate aldolase (FBA). This model represents one of two deeply split, architecturally distinct clades of the family that includes class II fructose-bisphosphate aldolases, tagatose-bisphosphate aldolases, and related uncharacterized proteins. This family is well-conserved and includes characterized FBA from Saccharomyces cerevisiae, Escherichia coli, and Corynebacterium glutamicum. Proteins outside the scope of this model may also be designated as class II fructose-bisphosphate aldolases, but are well separated in an alignment-based phylogenetic tree. [Energy metabolism, Glycolysis/gluconeogenesis]	357
130584	TIGR01521	FruBisAldo_II_B	fructose-bisphosphate aldolase, class II, Calvin cycle subtype. Members of this family are class II examples of the enzyme fructose-bisphosphate aldolase, an enzyme both of glycolysis and (in the opposite direction) of the Calvin cycle of CO2 fixation. A deep split separates the tightly conserved yeast/E. coli/Mycobacterium subtype (all species lacking the Calvin cycle) represented by model TIGR01520 from a broader group of aldolases that includes both tagatose- and fructose-bisphosphate aldolases. This model represents a distinct, elongated, very well conserved subtype within the latter group. Most species with this aldolase subtype have the Calvin cycle.	347
130585	TIGR01522	ATPase-IIA2_Ca	golgi membrane calcium-translocating P-type ATPase. This model describes the P-type ATPase responsible for translocating calcium ions across the golgi membrane of fungi and animals, and is of particular importance in the sarcoplasmic reticulum of skeletal and cardiac muscle in vertebrates. The calcium P-type ATPases have been characterized as Type IIA based on a phylogenetic analysis which distinguishes this group from the Type IIB PMCA calcium pump modelled by TIGR01517. A separate analysis divides Type IIA into sub-types, SERCA and PMR1, the former of which is modelled by TIGR01116.	884
130586	TIGR01523	ATPase-IID_K-Na	potassium and/or sodium efflux P-type ATPase, fungal-type. Initially described as a calcium efflux ATPase, more recent work has shown that the S. pombe CTA3 gene is in fact a potassium ion efflux pump. This model describes the clade of fungal P-type ATPases responsible for potassium and sodium efflux. The degree to which these pumps show preference for sodium or potassium varies. This group of ATPases has been classified by phylogenetic analysis as type IID. The Leishmania sequence (GP|3192903), which falls between trusted and noise in this model, may very well turn out to be an active potassium pump.	1053
130587	TIGR01524	ATPase-IIIB_Mg	magnesium-translocating P-type ATPase. This model describes the magnesium translocating P-type ATPase found in a limited number of bacterial species and best described in Salmonella typhimurium, which contains two isoforms. These transporters are active in low external Mg2+ concentrations and pump the ion into the cytoplasm. The magnesium ATPases have been classified as type IIIB by a phylogenetic analysis. [Transport and binding proteins, Cations and iron carrying compounds]	867
273669	TIGR01525	ATPase-IB_hvy	heavy metal translocating P-type ATPase. This model encompasses two equivalog models for the copper and cadmium-type heavy metal transporting P-type ATPases (TIGR01511 and TIGR01512) as well as those species which score ambiguously between both models. For more comments and references, see the files on TIGR01511 and 01512.	558
273670	TIGR01526	nadR_NMN_Atrans	nicotinamide-nucleotide adenylyltransferase, NadR type. The NadR protein of E. coli and closely related bacteria is both enzyme and regulatory protein. The first 60 or so amino acids, N-terminal to the region covered by this model, is a DNA-binding helix-turn-helix domain (pfam01381) responsible for repressing the nadAB genes of NAD de novo biosynthesis. The NadR homologs in Mycobacterium tuberculosis, Haemophilus influenzae, and others appear to lack the repressor domain. NadR has recently been shown to act as an enzyme of the salvage pathway of NAD biosynthesis, nicotinamide-nucleotide adenylyltransferase; members of this family are presumed to share this activity. E. coli NadR has also been found to regulate the import of its substrate, nicotinamide ribonucleotide, but it is not known if the other members of this model share that activity.	325
273671	TIGR01527	arch_NMN_Atrans	nicotinamide-nucleotide adenylyltransferase. This model describes a family of archaeal proteins with the activity of the NAD salvage biosynthesis enzyme nicotinamide-nucleotide adenylyltransferase (EC 2.7.7.1). In some cases, the enzyme was tested and found also to have the activity of nicotinate-nucleotide adenylyltransferase (EC 2.7.7.18), an enzyme of NAD de novo biosynthesis, although with a higher Km. In some archaeal species, a lower-scoring paralog, uncharacterized with respect to activity, is also present. These score between trusted and noise cutoffs. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridine nucleotides]	165
273672	TIGR01528	NMN_trans_PnuC	nicotinamide mononucleotide transporter PnuC. The PnuC protein of E. coli is a membrane protein responsible for nicotinamide mononucleotide transport, subject to regulation by interaction with the NadR (also called NadI) protein (see TIGR01526). This model defines a region corresponding to most of the length of PnuC, found primarily in pathogens. The extreme N- and C-terminal regions are poorly conserved and not included in the alignment and model. [Transport and binding proteins, Other, Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridine nucleotides]	189
130592	TIGR01529	argR_whole	arginine repressor. This model includes most members of the arginine-responsive transcriptional regulator family ArgR. This hexameric protein binds DNA at its amino end to repress arginine biosynthesis or activate arginine catabolism. Some species have several ArgR paralogs. In a neighbor-joining tree, some of these paralogous sequences show long branches and differ significantly in an otherwise well-conserved C-terminal region motif GT[VIL][AC]GDDT. These paralogs are excluded from the seed and score in the gray zone of this model, between trusted and noise cutoffs. [Amino acid biosynthesis, Glutamate family, Regulatory functions, DNA interactions]	146
211667	TIGR01530	nadN	NAD pyrophosphatase/5'-nucleotidase NadN. This model describes NadN of Haemophilus influenzae and a small number of close homologs in pathogenic, Gram-negative bacteria. NadN is a periplasmic enzyme that cleaves NAD (nicotinamide adenine dinucleotide) to NMN (nicotinamide mononucleotide) and AMP. The NMN must be converted by a 5'-nucleotidase to nicotinamide riboside for import. NadN belongs to a large family of 5'-nucleotidases and has 5'-nucleotidase activity for NMN, AMP, etc. [Transport and binding proteins, Other, Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridine nucleotides]	545
273673	TIGR01531	glyc_debranch	glycogen debranching enzyme. Glycogen debranching enzyme possesses two different catalytic activities: oligo-1,4-->1,4-glucantransferase (EC 2.4.1.25) and amylo-1,6-glucosidase (EC 3.2.1.33). Site directed mutagenesis studies in S. cerevisiae indicate that the transferase and glucosidase activities are independent and located in different regions of the polypeptide chain. Proteins in this model belong to the larger alpha-amylase family. The model covers eukaryotic proteins with a seed composed of human, nematode and yeast sequences. The yeast seed sequence is well characterized. The model is quite rigorous; either a query sequence yields a large bit score or it fails to hit the model altogether. There doesn't appear to be any middle ground. [Energy metabolism, Biosynthesis and degradation of polysaccharides]	1464
130595	TIGR01532	E4PD_g-proteo	erythrose-4-phosphate dehydrogenase. This model represents the small clade of dehydrogenases in gamma-proteobacteria which utilize NAD+ to oxidize erythrose-4-phosphate (E4P) to 4-phospho-erythronate, a precursor for the de novo synthesis of pyridoxine via 4-hydroxythreonine and D-1-deoxyxylulose. This enzyme activity appears to have evolved from glyceraldehyde-3-phosphate dehydrogenase, whose substrate differs only in the lack of one carbon relative to E4P. Accordingly, this model is very close to the corresponding models for GAPDH, and those sequences which hit above trusted here invariably hit between trusted and noise to the GAPDH model (TIGR01534). Similarly, it may be found that there are species outside of the gamma proteobacteria which synthesize pyridoxine and have more than one apparent GAPDH gene of which one may have E4PD activity - this may necessitate a readjustment of these models. Alternatively, some of the GAPDH enzymes may prove to be bifunctional in certain species. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridoxine]	325
273674	TIGR01533	lipo_e_P4	5'-nucleotidase, lipoprotein e(P4) family. This model represents a set of bacterial lipoproteins belonging to a larger acid phosphatase family (pfam03767), which in turn belongs to the haloacid dehalogenase (HAD) superfamily of aspartate-dependent hydrolases. Members are found on the outer membrane of Gram-negative bacteria and the cytoplasmic membrane of Gram-positive bacteria. Most members have classic lipoprotein signal sequences. A critical role of this 5'-nucleotidase in Haemophilus influenzae is the degradation of external riboside in order to allow transport into the cell. An earlier suggested role in hemin transport is no longer current. This enzyme may also have other physiologically significant roles. [Transport and binding proteins, Other, Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridine nucleotides]	266
273675	TIGR01534	GAPDH-I	glyceraldehyde-3-phosphate dehydrogenase, type I. This model represents glyceraldehyde-3-phosphate dehydrogenase (GAPDH), the enzyme responsible for the interconversion of 1,3-diphosphoglycerate and glyceraldehyde-3-phosphate, a central step in glycolysis and gluconeogenesis. Forms exist which utilize NAD (EC 1.2.1.12), NADP (EC 1.2.1.13) or either (1.2.1.59). In some species, NAD- and NADP- utilizing forms exist, generally being responsible for reactions in the anabolic and catabolic directions respectively. Two Pfam models cover the two functional domains of this protein; pfam00044 represents the N-terminal NAD(P)-binding domain and pfam02800 represents the C-terminal catalytic domain. An additional form of gap gene is found in gamma proteobacteria and is responsible for the conversion of erythrose-4-phosphate (E4P) to 4-phospho-erythronate in the biosynthesis of pyridoxine. This pathway of pyridoxine biosynthesis appears to be limited, however, to a relatively small number of bacterial species although it is prevalent among the gamma-proteobacteria. This enzyme is described by TIGR01532. These sequences generally score between trusted and noise to this GAPDH model due to the close evolutionary relationship. There exists the possibility that some forms of GAPDH may be bifunctional and act on E4P in species which make pyridoxine via hydroxythreonine and lack a separate E4PDH enzyme (for instance, the GAPDH from Bacillus stearothermophilus has been shown to possess a limited E4PD activity as well as a robust GAPDH activity). There are a great number of sequences in the databases which score between trusted and noise to this model, nearly all of them due to fragmentary sequences. It seems that study of this gene has been carried out in many species utilizing PCR probes which exclude the extreme ends of the consensus used to define this model. The noise level is set relative not to E4PD, but the next closest outliers, the class II GAPDH's (found in archaea, TIGR01546) and aspartate semialdehyde dehydrogenase (ASADH, TIGR01296) both of which have highest-scoring hits around -225 to the prior model. [Energy metabolism, Glycolysis/gluconeogenesis]	326
130598	TIGR01535	glucan_glucosid	glucan 1,4-alpha-glucosidase. Glucan 1,4-alpha-glucosidase catalyzes the hydrolysis of terminal 1,4-linked alpha-D-glucose residues from non-reducing ends of polysaccharides, releasing a beta-D-glucose monomer. Some forms of this enzyme can hydrolyze terminal 1,6- and 1,3-alpha-D-glucosidic bonds in polysaccharides as well. [Energy metabolism, Biosynthesis and degradation of polysaccharides]	648
273676	TIGR01536	asn_synth_AEB	asparagine synthase (glutamine-hydrolyzing). This model describes the glutamine-hydrolysing asparagine synthase. A poorly conserved C-terminal extension was removed from the model. Bacterial members of the family tend to have a long, poorly conserved insert lacking from archaeal and eukaryotic sequences. Multiple isozymes have been demonstrated, such as in Bacillus subtilis. Long-branch members of the phylogenetic tree (which typically were also second or third candidate members from their genomes) were removed from the seed alignment and score below trusted cutoff. [Amino acid biosynthesis, Aspartate family]	466
273677	TIGR01537	portal_HK97	phage portal protein, HK97 family. This model represents one of several distantly related families of phage portal protein. This protein forms a hole, or portal, that enables DNA passage during packaging and ejection. It also forms the junction between the phage head (capsid) and the tail proteins. It functions as a dodecamer of a single polypeptide of average mol. wt. of 40-90 KDa. [Mobile and extrachromosomal element functions, Prophage functions]	342
273678	TIGR01538	portal_SPP1	phage portal protein, SPP1 family. This model represents one of several distantly related families of phage portal protein. This protein forms a hole, or portal, that enables DNA passage during packaging and ejection. It also forms the junction between the phage head (capsid) and the tail proteins. It functions as a dodecamer of a single polypeptide of average mol. wt. of 40-90 KDa. [Mobile and extrachromosomal element functions, Prophage functions]	412
273679	TIGR01539	portal_lambda	phage portal protein, lambda family. This model represents one of several distantly related families of phage portal protein. This protein forms a hole, or portal, that enables DNA passage during packaging and ejection. It also forms the junction between the phage head (capsid) and the tail proteins. It functions as a dodecamer of a single polypeptide of average mol. wt. of 40-90 KDa. [Mobile and extrachromosomal element functions, Prophage functions]	458
273680	TIGR01540	portal_PBSX	phage portal protein, PBSX family. This model represents one of several distantly related families of phage portal protein. This protein forms a hole, or portal, that enables DNA passage during packaging and ejection. It also forms the junction between the phage head (capsid) and the tail proteins. It functions as a dodecamer of a single polypeptide of average mol. wt. of 40-90 KDa. This family shows clear homology to TIGR01537. The alignment for this group was trimmed of poorly alignable N-terminal sequence of about 50 residues and of C-terminal regions present in some but not all members of up to 180 residues. [Mobile and extrachromosomal element functions, Prophage functions]	320
273681	TIGR01541	tape_meas_lam_C	phage tail tape measure protein, lambda family. This model represents a relatively well-conserved region near the C-terminus of the tape measure protein of a lambda and related phage. This protein, which controls phage tail length, is typically about 1000 residues in length. Both low-complexity sequence and insertion/deletion events appear common in this family. Mutational studies suggest a ruler or template role in the determination of phage tail length. Similar behavior is attributed to proteins from distantly related or unrelated families in other phage. [Mobile and extrachromosomal element functions, Prophage functions]	332
130605	TIGR01542	A118_put_portal	phage portal protein, putative, A118 family. This model represents a family of phage minor structural proteins. The protein is suggested to be the head-tail connector, or portal protein, on the basis of its position in the phage gene order, its presence in mature phage, its size, and its conservation across a number of complete genomes of tailed phage that lack other candidate portal proteins. Several other known portal protein families lack clear homology to this family and to each other. [Mobile and extrachromosomal element functions, Prophage functions]	476
273682	TIGR01543	proheadase_HK97	phage prohead protease, HK97 family. This model describes the prohead protease of HK97 and related phage. It is generally encoded next to the gene for the capsid protein that it processes, and in some cases may be fused to it. This family does not show similarity to the prohead protease of phage T4 (see pfam03420). [Mobile and extrachromosomal element functions, Prophage functions, Protein fate, Other]	145
273683	TIGR01544	HAD-SF-IE	haloacid dehalogenase superfamily, subfamily IE hydrolase, TIGR01544. This model represents a small group of metazoan sequences. The sequences from mouse are annotated as Pyrimidine 5'-nucleotidases, apparently in reference to HSPC233, the human homolog. However, no such annotation can currently be found for this gene. This group of sequences was found during searches for members of the haloacid dehalogenase (HAD) superfamily. All of the conserved catalytic motifs are found. The placement of the variable domain between motifs 1 and 2 indicates membership in subfamily I of the superfamily, but these sequences are sufficiently different from any of the branches (IA, TIGR01493, TIGR01509, TIGR01549; IB, TIGR01488; IC, TIGR01494; ID, TIGR01658; IF TIGR01545) of that subfamily as to constitute a separate branch to now be called IE. Considering that the closest identifiable hit outside of the noise range is to a phosphoserine phosphatase, this group may be considered to be most closely allied to subfamily IB.	283
130608	TIGR01545	YfhB_g-proteo	haloacid dehalogenase superfamily, subfamily IF hydrolase, YfhB. This model describes a clade of sequences limited to the gamma proteobacteria. This group is a member of the haloacid dehalogenase (HAD) superfamily of aspartate-dependent hydrolases and all of the conserved catalytic motifs are present. Although structurally similar to subfamily IA in that the variable domain is predicted to consist of five consecutive alpha helices (by PSI-PRED), it is sufficiently divergent to warrant being regarded as a separate sub-family (IF). The gene name comes from the E. coli gene. There is currently no information regarding the function of this gene.	210
130609	TIGR01546	GAPDH-II_archae	glyceraldehyde-3-phosphate dehydrogenase, type II. This model describes the type II glyceraldehyde-3-phosphate dehydrogenases which are limited to archaea. These enzymes catalyze the interconversion of 1,3-diphosphoglycerate and glyceraldehyde-3-phosphate, a central step in glycolysis and gluconeogenesis. In archaea, either NAD or NADP may be utilized as the cofactor. The class I GAPDH's from bacteria and eukaryotes are covered by TIGR01534. All of the members of the seed are characterized. See, for instance. This model is very solid, there are no species falling between trusted and noise at this time. The closest relatives scoring in the noise are the class I GAPDH's.	333
273684	TIGR01547	phage_term_2	phage terminase, large subunit, PBSX family. This model detects members of a highly divergent family of the large subunit of phage terminase. All members are encoded by phage genomes or within prophage regions of bacterial genomes. This is a distinct family from pfam03354. [Mobile and extrachromosomal element functions, Prophage functions]	394
273685	TIGR01548	HAD-SF-IA-hyp1	haloacid dehalogenase superfamily, subfamily IA hydrolase, TIGR01548. This model represents a small and phylogenetically curious clade of sequences. Sequences are found from Halobacterium (an archaeon), Nostoc and Synechococcus (cyanobacteria) and Phytophthora (a stramenopile eukaryote). These appear to be members of the haloacid dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases by general homology and the conservation of all of the recognized catalytic motifs. The variable domain is found in between motifs 1 and 2, indicating membership in subfamily I and phylogeny and prediction of the alpha helical nature of the variable domain (by PSI-PRED) indicate membership in subfamily IA. All but the Halobacterium sequence currently found are annotated as "Imidazoleglycerol-phosphate dehydratase", however, the source of the annotation could not be traced and significant homology could not be found between any of these sequences and known IGPD's.	197
273686	TIGR01549	HAD-SF-IA-v1	haloacid dehalogenase superfamily, subfamily IA, variant 1 with third motif having Dx(3-4)D or Dx(3-4)E. This model represents part of one structural subfamily of the Haloacid Dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases. The superfamily is defined by the presence of three short catalytic motifs. The subfamilies are defined based on the location and the observed or predicted fold of a so-called "capping domain", or the absence of such a domain. Subfamily I consists of sequences in which the capping domain is found in between the first and second catalytic motifs. Subfamily II consists of sequences in which the capping domain is found between the second and third motifs. Subfamily III sequences have no capping domain in either of these positions. The Subfamily IA and IB capping domains are predicted by PSI-PRED to consist of an alpha helical bundle. Subfamily I encompasses such a wide region of sequence space (the sequences are highly divergent) that representing it with a single model is impossible, resulting in an overly broad description which allows in many unrelated sequences. Subfamily IA and IB are separated based on an apparent phylogenetic bifurcation. Subfamily IA is still too broad to model, but cannot be further subdivided into large chunks based on phylogenetic trees. Of the three motifs defining the HAD superfamily, the third has three variant forms: (1) hhhhsDxxx(x)(D/E), (2) hhhhssxxx(x)D and (3) hhhhDDxxx(x)s where _s_ refers to a small amino acid and _h_ to a hydrophobic one. All three of these variants are found in subfamily IA. Individual models were made based on seeds exhibiting only one of the variants each. Variant 1 (this model) is found in the enzymes phosphoglycolate phosphatase (TIGR01449) and enolase-phosphatase. These three variant models (see also TIGR01493 and TIGR01509) were created with the knowledge that there will be overlap among them - this is by design and serves the purpose of eliminating the overlap with models of more distantly related HAD subfamilies caused by an overly broad single model. [Unknown function, Enzymes of unknown specificity]	164
273687	TIGR01550	DOC_P1	death-on-curing family protein. The characterized member of this family is the death-on-curing (DOC) protein of phage P1. It is part of a two protein operon with prevents-host-death (phd) that forms an addiction module. DOC lacks homology to analogous addiction module post-segregational killing proteins involved in plasmid maintenance. These modules work as a combination of a long lived poison (e.g. this protein) and a more abundant but shorter lived antidote. Members of this family have a well-conserved central motif HxFx[ND][AG]NKR. A similar region, with K replaced by G, is found in the huntingtin interacting protein (HYPE) family. [Unknown function, General]	121
233464	TIGR01551	major_capsid_P2	phage major capsid protein, P2 family. This model family represents the major capsid protein component of the heads (capsids) of bacteriophage P2 and related phage. This model represents one of several analogous families lacking detectable sequence similarity. The gene encoding this component is typically located in an operon encoding the small and large terminase subunits, the portal protein and the prohead or maturation protease. [Mobile and extrachromosomal element functions, Prophage functions]	327
273688	TIGR01552	phd_fam	prevent-host-death family protein. This model recognizes a region of about 55 amino acids toward the N-terminal end of bacterial proteins of about 85 amino acids in length. The best-characterized member is prevent-host-death (phd) of bacteriophage P1, the antidote partner of death-on-curing (doc) (TIGR01550) in an addiction module. Addiction modules prevent plasmid curing by killing the host cell as the longer-lived killing protein persists while the gene for the shorter-lived antidote is lost. Note, however, that relatively few members of this family appear to be plasmid or phage-encoded. Also, there is little overlap, except for phage P1 itself, of species with this family and with the doc family. [Cellular processes, Toxin production and resistance, Mobile and extrachromosomal element functions, Other]	52
273689	TIGR01553	formate-DH-alph	formate dehydrogenase-N alpha subunit. This model describes a subset of formate dehydrogenase alpha chains found mainly in proteobacteria but also in Aquifex. The alpha chain contains domains for molybdopterin dinucleotide binding and molybdopterin oxidoreductase (pfam01568 and pfam00384, respectively). The holo-enzyme also contains beta and gamma subunits of 32 and 20 kDa. The enzyme catalyzes the oxidation of formate (produced from pyruvate during anaerobic growth) to carbon dioxide with the concomitant release of two electrons and two protons. The electrons are utilized mainly in the nitrate respiration by nitrate reductase. In E. coli and Salmonella, there are two forms of the formate dehydrogenase, one induced by nitrate which is strictly anaerobic (fdn), and one induced during the transition from aerobic to anaerobic growth (fdo). This subunit is one of only three proteins in E. coli which contain selenocysteine. This model is well-defined, with a large, unpopulated trusted/noise gap. [Energy metabolism, Anaerobic, Energy metabolism, Electron transport]	1009
273690	TIGR01554	major_cap_HK97	phage major capsid protein, HK97 family. This model family represents the major capsid protein component of the heads (capsids) of bacteriophage HK97, phi-105, P27, and related phage. This model represents one of several analogous families lacking detectable sequence similarity. The gene encoding this component is typically located in an operon encoding the small and large terminase subunits, the portal protein and the prohead or maturation protease. [Mobile and extrachromosomal element functions, Prophage functions]	386
130618	TIGR01555	phge_rel_HI1409	phage-related protein, HI1409 family. This model describes an uncharacterized family of proteins found in prophage regions of a number of bacterial genomes, including Haemophilus influenzae, Xylella fastidiosa, Salmonella typhi, and Enterococcus faecalis. Distantly related proteins can be found in the prophage-bearing plasmids of Borrelia burgdorferi. [Mobile and extrachromosomal element functions, Prophage functions]	404
130619	TIGR01556	rhamnosyltran	L-rhamnosyltransferase. This model subfamily is comprised of gamma proteobacteria whose proteins function as L-rhamnosyltransferases in the synthesis of their respective surface polysaccharides. Rhamnolipids are glycolipids containing mono- or di- L-rhamnose molecules. Rhamnolipid synthesis occurs by sequential glycosyltransferase reactions involving two distinct rhamnosyltransferase enzymes. In P. aeruginosa, the synthesis of mono-rhamnolipids is catalyzed by rhamnosyltransferase 1, and proceeds by a glycosyltransfer reaction catalyzed by rhamnosyltransferase 2 to yield di-rhamnolipids. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	281
130620	TIGR01557	myb_SHAQKYF	myb-like DNA-binding domain, SHAQKYF class. This model describes a DNA-binding domain restricted to (but common in) plant proteins, many of which also contain a response regulator domain. The domain appears related to the Myb-like DNA-binding domain described by pfam00249. It is distinguished in part by a well-conserved motif SH[AL]QKY[RF] at the C-terminal end of the motif.	57
273691	TIGR01558	sm_term_P27	phage terminase, small subunit, putative, P27 family. This model describes a distinct family of phage (and integrated prophage) putative terminase small subunit. Members tend to be adjacent to the phage terminase large subunit gene. [Mobile and extrachromosomal element functions, Prophage functions]	116
188157	TIGR01559	squal_synth	farnesyl-diphosphate farnesyltransferase. This model describes farnesyl-diphosphate farnesyltransferase, also known as squalene synthase, as found in eukaryotes. This family is related to phytoene synthases. Tentatively identified archaeal homologs (excluded from this model) lack the C-terminal predicted transmembrane region universally conserved among members of this family.	337
130623	TIGR01560	put_DNA_pack	uncharacterized phage protein (possible DNA packaging). This model describes a small (~ 100 amino acids) protein found in phage and in putative prophage regions of a number of bacterial genomes. Members have been annotated in some cases as a possible DNA packaging protein, but the source of this annotation was not traced during construction of this model. [Mobile and extrachromosomal element functions, Prophage functions]	91
130624	TIGR01561	gde_arch	glycogen debranching enzyme, archaeal type, putative. The seed for this model is composed of two uncharacterized archaeal proteins from Methanosarcina acetivorans and Sulfolobus solfataricus. Trusted cutoff is set so that essentially only archaeal members hit the model. The notable exceptions to archaeal membership are the Gram positive Clostridium perfringens which scores much better than some other archaea and the Cyanobacterium Nostoc sp. which scores just above the trusted cutoff. Noise cutoff is set to exclude the characterized eukaryotic glycogen debranching enzyme in S. cerevisiae. These cutoffs leave the prokaryotes Porphyromonas gingivalis and Deinococcus radiodurans below trusted but above noise. Multiple alignments including these last two species exhibit sequence divergence which may suggest a subtly different function for these prokaryotic proteins. [Energy metabolism, Biosynthesis and degradation of polysaccharides]	575
130625	TIGR01562	FdhE	formate dehydrogenase accessory protein FdhE. This model describes an accessory protein required for the assembly of formate dehydrogenase of certain proteobacteria although not present in the final complex. The exact nature of the function of FdhE in the assembly of the complex is unknown, but considering the presence of selenocysteine, molybdopterin, iron-sulfur clusters and cytochrome b556, it is likely to have something to do with the insertion of cofactors. The only sequence scoring between trusted and noise is that from Aquifex aeolicus, which shows certain structural differences from the proteobacterial forms in the alignment. However it is notable that A. aeolicus also has a sequence scoring above trusted to the alpha subunit of formate dehydrogenase (TIGR01553).	305
273692	TIGR01563	gp16_SPP1	phage head-tail adaptor, putative, SPP1 family. This family describes a small protein of about 100 amino acids found in bacteriophage and in bacterial prophage regions. Examples include gp9 of phage HK022 and gp16 of phage SPP1. This minor structural protein is suggested to be a head-tail adaptor protein (although the source of this annotation was not traced during construction of this model). [Mobile and extrachromosomal element functions, Prophage functions]	101
273693	TIGR01564	S_layer_MJ	S-layer protein, MJ0822 family. This model represents one of several families of proteins associated with the formation of prokaryotic S-layers. Members of this family are found in archaeal species, including Pyrococcus horikoshii (split into two tandem reading frames), Methanococcus jannaschii, and related species. Some local similarity can be found to other S-layer protein families. [Cell envelope, Surface structures]	571
130628	TIGR01565	homeo_ZF_HD	homeobox domain, ZF-HD class. This model represents a class of homeobox domain that differs substantially from the typical homeobox domain described in pfam00046. It is found in both C4 and C3 plants.	58
130629	TIGR01566	ZF_HD_prot_N	ZF-HD homeobox protein Cys/His-rich dimerization domain. This model describes a 54-residue domain found in the N-terminal region of plant proteins, the vast majority of which contain a ZF-HD class homeobox domain toward the C-terminus. The region between the two domains typically is rich in low complexity sequence. The companion ZF-HD homeobox domain is described in model TIGR01565.	53
273694	TIGR01567	S_layer_rel_Mac	S-layer family duplication domain. This model represents a sequence region found tandemly duplicated in two proven archaeal S-layer glycoproteins, MA0829 from Methanosarcina acetivorans C2A and MM1976 from Methanosarcina mazei Go1, as well as in several paralogs of those S-layer proteins from both species. Members of the family show regions of local similarity to another known family of archaeal S-layer proteins described by model TIGR01564. Some members of this family, including the proven S-layer proteins, have the archaeosortase A target motif, PGF-CTERM (TIGR04126), at the protein C-terminus. [Cell envelope, Surface structures]	256
130631	TIGR01568	A_thal_3678	uncharacterized plant-specific domain TIGR01568. This model describes an uncharacterized domain of about 70 residues found exclusively in plants, generally toward the C-terminus of proteins of 200 to 350 amino acids in length. At least 14 such proteins are found in Arabidopsis thaliana. Other regions of these proteins tend to consist largely of low-complexity sequence.	66
273695	TIGR01569	A_tha_TIGR01569	plant integral membrane protein TIGR01569. This model describes a region of ~160 residues found exclusively in plant proteins, generally as the near complete length of the protein. At least 24 different members are found in Arabidopsis thaliana. Members have four predicted transmembrane regions, the last of which is preceded by an invariant CXXXXX[FY]C motif. The family is not functionally characterized.	154
273696	TIGR01570	A_thal_3588	uncharacterized plant-specific domain TIGR01570. This model represents a region of about 170 amino acids found at the C-terminus of a family of plant proteins. These proteins typically have additional highly divergent N-terminal regions rich in low complexity sequence. PSI-BLAST reveals no clear similarity to any characterized protein. At least 12 distinct members are found in Arabidopsis thaliana.	161
273697	TIGR01571	A_thal_Cys_rich	uncharacterized Cys-rich domain. This model describes an uncharacterized domain of about 100 residues. It is common in plants but found also in Homo sapiens, Dictyostelium, and Leishmania; at least 12 distinct members are found in Arabidopsis. Most members of this family contain more than 10 per cent Cys, but no Cys residue is invariant across the family.	104
273698	TIGR01572	A_thl_para_3677	Arabidopsis paralogous family TIGR01572. This model describes a paralogous family of hypothetical proteins in Arabidopsis thaliana. No homologs are detected from other species. Length heterogeneity within the family is attributable partly to a 21-residue repeat present in from zero to three tandem copies. The central region of the repeat resembles the pattern [VIF][FY][QK]GX[LM]P[DEK]XXXDDAL.	265
273699	TIGR01573	cas2	CRISPR-associated endonuclease Cas2. This model describes most members of the family of Cas2, one of the first four protein families found to mark prokaryotic genomes that contain multiple CRISPR elements. CRISPR systems protect against invasive nucleic acid sequences, including phage. Cas2 proteins have been characterized as either endoribonuclease (for ssRNA) or endodeoxyribonuclease (for dsDNA), depending on the system to which the Cas2 belongs. CRISPR is an acronym for Clustered Regularly Interspaced Short Palindromic Repeats. The cas genes usually are found near the repeats. A distinct branch of the Cas2 family shows a very low level of sequence identity and is modeled by TIGR01873 instead of by this model (TIGR01573).	95
273700	TIGR01574	miaB-methiolase	tRNA-N(6)-(isopentenyl)adenosine-37 thiotransferase enzyme MiaB. This model represents homologs of the MiaB enzyme responsible for the modification of the isopentenylated adenine-37 base of most bacterial and eukaryotic tRNAs that read codons beginning with uracil (all except tRNA(I,V) Ser). Adenine-37 is next to the anticodon on the 3' side in these tRNAs, and lack of modification at this site leads to an increased spontaneous mutation frequency. Isopentenylated A-37 is modified by methylthiolation at position 2, either by MiaB alone or in concert with a separate methylase yet to be discovered (MiaC?). MiaB contains a 4Fe-4S cluster which is labile under oxidizing conditions. Additionally, the sequence is homologous (via PSI-BLAST searches) to the biotin synthetase, BioB, which utilizes both an iron-sulfur cluster and S-adenosylmethionine (SAM) to generate a radical which is responsible for initiating the insertion of sulfur into the substrate. It is reasonable to surmise that the methyl group of SAM becomes the methyl group of the product, but this has not been shown, and the possibility of a separate methylase exists. This equivalog is a member of a subfamily (TIGR00089) which contains several other hypothetical equivalogs which are all probably enzymes with similar function acting on different substrates. These enzymes contain a TRAM domain (pfam01938) which is believed to be responsible for binding to tRNAs. Hits to this model span all major groups of bacteria and eukaryotes, but not archaea, which are known to lack this particular tRNA modification. The enzyme from Thermotoga maritima has been cloned, expressed, spectroscopically characterized and shown to complement the E. coli MiaB enzyme. [Protein synthesis, tRNA and rRNA base modification]	438
273701	TIGR01575	rimI	ribosomal-protein-alanine acetyltransferase. Members of this model belong to the GCN5-related N-acetyltransferase (GNAT) superfamily. This model covers prokaryotes and the archaea. The seed contains a characterized accession for Gram negative E. coli. An untraceable characterized accession (PIR|S66013) for Gram positive B. subtilis scores well (205.0) in the full alignment. Characterized members are lacking in the archaea. Noise cutoff (72.4) was set to exclude M. loti paralog of rimI. Trusted cutoff (80.0) was set at next highest scoring member in the mini-database. [Protein synthesis, Ribosomal proteins: synthesis and modification]	131
273702	TIGR01577	oligosac_amyl	oligosaccharide amylase. The name of this type of amylase is based on the characterization of a glucoamylase family enzyme from Thermoactinomyces vulgaris. The T. vulgaris enzyme was expressed in E. coli and, like other glucoamylases, it releases beta-D-glucose from starch. However, unlike previously characterized glucoamylases, this T. vulgaris amylase hydrolyzes maltooligosaccharides (maltotetraose, maltose) more efficiently than starch (1), indicating this enzyme belongs to a class of glucoamylase-type enzymes with oligosaccharide-metabolizing activity.	616
273703	TIGR01578	MiaB-like-B	MiaB-like tRNA modifying enzyme, archaeal-type. This clade of sequences is closely related to MiaB, a modifier of isopentenylated adenosine-37 of certain eukaryotic and bacterial tRNAs (see TIGR01574). Sequence alignments suggest that this equivalog performs the same chemical transformation as MiaB, perhaps on a different (or differently modified) tRNA base substrate. This clade is a member of a subfamily (TIGR00089) and spans the archaea and eukaryotes. The only archaeal miaB-like genes are in this clade, while eukaryotes have sequences described by this model as well as ones falling within the scope of the MiaB equivalog model. [Protein synthesis, tRNA and rRNA base modification]	420
273704	TIGR01579	MiaB-like-C	MiaB-like tRNA modifying enzyme. This clade of sequences is closely related to MiaB, a modifier of isopentenylated adenosine-37 of certain eukaryotic and bacterial tRNAs (see TIGR01574). Sequence alignments suggest that this equivalog performs the same chemical transformation as MiaB, perhaps on a different (or differently modified) tRNA base substrate. This clade is a member of a subfamily (TIGR00089) and spans low GC Gram positive bacteria, alpha and epsilon proteobacteria, Campylobacter, Porphyromonas, Aquifex, Thermotoga, Chlamydia, Treponema and Fusobacterium. [Protein synthesis, tRNA and rRNA base modification]	414
162434	TIGR01580	narG	respiratory nitrate reductase, alpha subunit. The nitrate reductase enzyme complex allows bacteria to use nitrate as an electron acceptor during anaerobic growth. The enzyme complex consists of a tetramer that has an alpha, beta and 2 gamma subunits. The alpha and beta subunits have catalytic activity, and the gamma subunits attach the enzyme to the membrane and are b-type cytochromes that receive electrons from the quinone pool and transfer them to the beta subunit. This model is specific for the alpha subunit for nitrate reductase I (narG) and nitrate reductase II (narZ) for gram positive and gram negative bacteria. A few thermophiles and archaea also match the model. The seed members used to make the model include nitrate reductases from Pseudomonas fluorescens (GP:11344601), E. coli (SP:P09152) and B. subtilis (SP:P42175). All seed members are experimentally characterized. Some unpublished nitrate reductases that are shorter sequences, and probably fragments, fall in between the noise and trusted cutoffs. Pfam models pfam00384 (Molybdopterin oxidoreductase) and pfam01568 (Molybdopterin dinucleotide binding domain) will also match the nitrate reductase, alpha subunit. [Energy metabolism, Anaerobic]	1235
130643	TIGR01581	Mo_ABC_porter	NifC-like ABC-type porter. This model describes a clade of ABC porter genes with relatively weak homology compared to its neighbor clades, the molybdate (TIGR02141) and sulfate (TIGR00969) porters. Neighbor-Joining, PAM-distance phylogenetic trees support the separation of the clades in this way. Included in this group is a gene designated NifC in Clostridium pasteurianum. It would be reasonable to presume that NifC acts as a molybdate porter since the most common form of nitrogenase is a molybdoenzyme. Several other sequences falling within the scope of this model are annotated as molybdate porters and one, from Halobacterium, is annotated as a sulfate porter. There is presently no experimental evidence to support annotations with this degree of specificity.	225
273705	TIGR01582	FDH-beta	formate dehydrogenase, beta subunit, Fe-S containing. This model represents the beta subunit of the gamma-proteobacterial formate dehydrogenase. This subunit contains four 4Fe-4S clusters and is involved in transmitting electrons from the alpha subunit (TIGR01553) at the periplasmic space to the gamma subunit which spans the cytoplasmic membrane. In addition to the gamma proteobacteria, a sequence from Aquifex aeolicus falls within the scope of this model. This appears to be the case for the alpha, gamma and epsilon (accessory protein TIGR01562) chains as well. [Energy metabolism, Anaerobic, Energy metabolism, Electron transport]	283
130645	TIGR01583	formate-DH-gamm	formate dehydrogenase, gamma subunit. This model represents the gamma chain of the gamma proteobacteria (and Aquifex aeolicus) formate dehydrogenase. This subunit is integral to the cytoplasmic membrane, consisting of 4 transmembrane helices, and receives electrons from the beta subunit. The entire E. coli formate dehydrogenase N (nitrate-inducible form) has been crystallized. The gamma subunit contains two cytochromes, heme b(P) and heme b(C) near the periplasmic and cytoplasmic sides of the membrane respectively. The electron acceptor quinone binds at the cytoplasmic heme histidine ligand. NiFe-hydrogenase and thiosulfate reductase contain homologous gamma subunits, and these can be found scoring in the noise of this model. [Energy metabolism, Anaerobic, Energy metabolism, Electron transport]	204
130646	TIGR01584	citF	citrate lyase, alpha subunit. This is a model of the alpha subunit of the holoenzyme citrate lyase (EC 4.1.3.6) composed of alpha (EC 2.8.3.10), beta (EC 4.1.3.34), and acyl carrier protein subunits in a stoichiometric relationship of 6:6:6. Citrate lyase is an enzyme which converts citrate to oxaloacetate. In bacteria, this reaction is involved in citrate fermentation. The alpha subunit catalyzes the reaction Acetyl-CoA + citrate = acetate + (3S)-citryl-CoA. The seed contains an experimentally characterized member from Lactococcus lactis subsp. lactis. The model covers both Gram positive and Gram negative bacteria. It is quite robust with queries scoring either quite well or quite poorly against the model. There are currently no hits in between the noise cutoff and trusted cutoff. [Energy metabolism, Fermentation]	492
273706	TIGR01586	yopT_cys_prot	cysteine protease domain, YopT-type. The model represents a cysteine protease domain found in proteins of bacteria that include plant pathogens (Pseudomonas syringae), root nodule bacteria, and intracellular pathogens (e.g. Yersinia pestis, Haemophilus ducreyi, Pasteurella multocida, Chlamydia trachomatis) of animal hosts. The domain features a catalytic triad of Cys, His, and Asp. Sequences can be extremely divergent outside of a few well-conserved motifs, and additional members may exist that are detected by this model. YopT, a virulence effector protein of Yersinia pestis, cleaves and releases host cell Rho GTPases from the membrane, thereby disrupting the actin cytoskeleton. Members of the family from pathogenic bacteria are likely to be pathogenesis factors. [Cellular processes, Pathogenesis]	196
273707	TIGR01587	cas3_core	CRISPR-associated helicase Cas3. This model represents the highly conserved core region of an alignment of Cas3, a protein found in association with CRISPR repeat elements in a broad range of bacteria and archaea. Cas3 appears to be a helicase, with regions found by pfam00270 (DEAD/DEAH box helicase) and pfam00271 (Helicase conserved C-terminal domain). Some but not all members have an N-terminal HD domain region (pfam01966) that is not included within this model.	359
130649	TIGR01588	citE	citrate lyase, beta subunit. This is a model of the beta subunit of the holoenzyme citrate lyase (EC 4.1.3.6) composed of alpha (EC 2.8.3.10), beta (EC 4.1.3.34), and acyl carrier protein subunits in a stoichiometric relationship of 6:6:6. Citrate lyase is an enzyme which converts citrate to oxaloacetate. In bacteria, this reaction is involved in citrate fermentation. The beta subunit catalyzes the reaction (3S)-citryl-CoA = acetyl-CoA + oxaloacetate. The seed contains an experimentally characterized member from Leuconostoc mesenteroides. The model covers a wide range of Gram positive bacteria. For Gram negative bacteria, it appears that only gamma proteobacteria hit this model. The model is quite robust with queries scoring either quite well or quite poorly against the model. There are currently no hits in-between the noise cutoff and trusted cutoff. [Energy metabolism, Fermentation]	288
130650	TIGR01589	A_thal_3526	uncharacterized plant-specific domain TIGR01589. This model represents an uncharacterized plant-specific domain 57 residues in length. It is found toward the N-terminus of most proteins that contain it. Examples include at least 10 proteins from Arabidopsis thaliana and at least one from Oryza sativa.	57
130651	TIGR01590	yir-bir-cir_Pla	yir/bir/cir-family of variant antigens, Plasmodium-specific. This model represents a large paralogous family of variant antigens from several Plasmodium species (P. yoelii, P. berghei and P. chabaudi). The seed was generated from a list of ORFs in P. yoelii containing a paralogous domain as defined by an algorithm implemented at TIFR. The list was aligned and reduced to six sequences approximating the most divergent clades present in the data set. The model only hits genes previously characterized as yir, bir, or cir genes above the trusted cutoff. In between trusted and noise is one gene from P. vivax (vir25) which has been characterized as a distant relative of the yir/bir/cir family. The vir family appears to be present in 600-1000 copies per haploid genome and is preferentially located in the sub-telomeric regions of the chromosomes. The genomic data for yoelii is consistent with this observation. It is not believed that there are any orthologs of this family in P. falciparum.	199
130652	TIGR01591	Fdh-alpha	formate dehydrogenase, alpha subunit, archaeal-type. This model describes a subset of formate dehydrogenase alpha chains found mainly in archaea but also in alpha and gamma proteobacteria and a small number of gram positive bacteria. The alpha chain contains domains for molybdopterin dinucleotide binding and molybdopterin oxidoreductase (pfam01568 and pfam00384, respectively). The holo-enzyme also contains beta and gamma subunits. The enzyme catalyzes the oxidation of formate (produced from pyruvate during anaerobic growth) to carbon dioxide with the concomitant release of two electrons and two protons. The enzyme's purpose is to allow growth on formate in some circumstances and, in the case of FdhH in gamma proteobacteria, to pass electrons to hydrogenase (by which process acid is neutralized). This model is well-defined, with only a single fragmentary sequence falling between trusted and noise. The alpha subunit of a version of nitrate reductase is closely related.	671
130653	TIGR01592	holin_SPP1	holin, SPP1 family. This model represents one of more than 30 families of phage proteins, all lacking detectable homology with each other, known or believed to act as holins. Holins act in cell lysis by bacteriophage. Members of this family are found in phage PBSX and phage SPP1, among others. [Mobile and extrachromosomal element functions, Prophage functions]	75
273708	TIGR01593	holin_tox_secr	toxin secretion/phage lysis holin. This model describes one of the many mutually dissimilar families of holins, phage proteins that act together with lytic enzymes in bacterial lysis. This family includes, besides phage holins, the protein TcdE/UtxA involved in toxin secretion in Clostridium difficile and related species. [Protein fate, Protein and peptide secretion and trafficking, Mobile and extrachromosomal element functions, Prophage functions]	128
273709	TIGR01594	holin_lambda	phage holin, lambda family. This model represents one of a large number of mutually dissimilar families of phage holins. Holins act against the host cell membrane to allow lytic enzymes of the phage to reach the bacterial cell wall. This family includes the product of the S gene of phage lambda. [Mobile and extrachromosomal element functions, Prophage functions]	107
273710	TIGR01595	cas_CT1132	CRISPR-associated protein, CT1132 family. This protein is found in at least five widely species that contain CRISPR loci. Four cas (CRISPR-associated) proteins are widely distributed and found near the CRISPR repeats. This protein is found exclusively next to other cas proteins. Its function is unknown.	281
273711	TIGR01596	cas3_HD	CRISPR-associated endonuclease Cas3-HD. CRISPR/Cas systems are widespread, mobile systems for host defense against invasive elements such as phage. In these systems, Cas3 designates one of the core proteins shared widely by multiple types of CRISPR/Cas system. This model represents an HD-like endonuclease that occurs either separately or as the N-terminal region of Cas3, the helicase-containing CRISPR-associated protein.	176
130658	TIGR01597	PYST-B	Plasmodium yoelii subtelomeric family PYST-B. This model represents a paralogous family of Plasmodium yoelii genes preferentially located in the subtelomeric regions of the chromosomes. There are no obvious homologs to these genes in any other organism.	255
130659	TIGR01598	holin_phiLC3	holin, phage phi LC3 family. Phage proteins for bacterial lysis typically include a membrane-disrupting protein, or holin, and one or more cell wall degrading enzymes that reach the cell wall because of holin action. Holins are found in a large number of mutually non-homologous families. [Mobile and extrachromosomal element functions, Prophage functions]	78
273712	TIGR01599	PYST-A	Plasmodium yoelii subtelomeric family PYST-A. This model represents a paralogous family of Plasmodium yoelii genes preferentially located in the subtelomeric regions of the chromosomes. Members of this family are expressed in both the Sporozoite and Gametozoite life stages. A single high-scoring gene was identified in the complete genome of P. falciparum as well as a single gene from P. chabaudi from GenBank which were included in the seed. There are no obvious homologs to these genes in any non-Plasmodium organism. These observations suggest an expansion of this family in yoelii from a common Plasmodium ancestor gene (present in a single copy in falciparum).	208
130661	TIGR01600	phage_tail_L	lambda-like phage minor tail protein L. This model detects members of the family of phage lambda minor tail protein L. This model was built as a fragment model to allow detection of fragmentary sequences, as might be found in cryptic prophage regions. [Mobile and extrachromosomal element functions, Prophage functions]	225
213640	TIGR01601	PYST-C1	Plasmodium yoelii subtelomeric domain PYST-C1. This model represents the N-terminal domain of a paralogous family of Plasmodium yoelii genes preferentially located in the subtelomeric regions of the chromosomes. There are no obvious homologs to these genes in any other organism. The C-terminal portions of the genes which contain this domain are divergent and some contain other yoelii-specific paralogous domains such as PYST-C2 (TIGR01604).	82
130663	TIGR01602	PY-rept-46	Plasmodium yoelii repeat of length 46. This repeat is found in only 2 genes in Plasmodium yoelii, in each of these genes it is repeated 9 times. It is found in no other organism.	46
273713	TIGR01603	maj_tail_phi13	phage major tail protein, phi13 family. This model describes a set of proteins that share low levels of sequence similarity but similar lengths and similar patterns of charged, hydrophobic, and Gly/Pro residues. All members (except one attributed to mouse embryo cDNA) belong to phage of Gram-positive bacteria. Several are identified as phage major tail proteins. Some members of this family have additional C-terminal regions of about 100 residues not included in this model. [Mobile and extrachromosomal element functions, Prophage functions]	190
130665	TIGR01604	PYST-C2	Plasmodium yoelii subtelomeric domain PYST-C2. This model represents a domain of a paralogous family of Plasmodium yoelii genes preferentially located in the subtelomeric regions of the chromosomes. There are no obvious homologs to these genes in any other organism. The genes found by this model often are associated with an N-terminal yoelii-specific domain such as PYST-C1 (TIGR01601).	150
130666	TIGR01605	PYST-D	Plasmodium yoelii subtelomeric family PYST-D. This model represents a paralogous family of Plasmodium yoelii genes preferentially located in the subtelomeric regions of the chromosomes. These genes are generally very short (ca. 50 residues). There are no obvious homologs to these genes in any other organism.	55
200119	TIGR01606	holin_BlyA	holin, BlyA family. This family represents BlyA, a small holin found in Borrelia circular plasmids that prove to be temperate phage. This protein was previously proposed to be a hemolysin. BlyA is small (67 residues) and contains two largely hydrophobic helices and a highly charged C-terminus. [Mobile and extrachromosomal element functions, Prophage functions]	63
162444	TIGR01607	PST-A	Plasmodium subtelomeric family (PST-A). This model represents a paralogous family of genes in Plasmodium falciparum and Plasmodium yoelii, which are closely related to various phospholipases and lysophospholipases of plants as well as generally being related to the alpha/beta-fold superfamily of hydrolases. These genes are preferentially located in the subtelomeric regions of the chromosomes of both P. falciparum and P. yoelii.	332
130669	TIGR01608	citD	citrate lyase acyl carrier protein. This is a model of the acyl carrier protein (aka gamma subunit) of the holoenzyme citrate lyase (EC 4.1.3.6) composed of alpha (EC 2.8.3.10), beta (EC 4.1.3.34), and acyl carrier protein subunits in a stoichiometric relationship of 6:6:6. Citrate lyase is an enzyme which converts citrate to oxaloacetate. In bacteria, this reaction is involved in citrate fermentation. The acyl carrier protein covalently binds the coenzyme of citrate lyase. The seed contains an experimentally characterized member from Leuconostoc mesenteroides. The model covers a wide range of Gram positive bacteria. For Gram negative bacteria, it appears that only gamma proteobacteria hit this model. The model is quite robust with queries scoring either quite well or quite poorly against the model. There are currently no hits in-between the noise cutoff and trusted cutoff. [Energy metabolism, Fermentation]	92
273714	TIGR01609	PF_unchar_267	Plasmodium falciparum uncharacterized protein TIGR01609. This model represents a family of at least four proteins in Plasmodium falciparum. An interesting feature is five perfectly conserved Trp residues.	146
273715	TIGR01610	phage_O_Nterm	phage replication protein O, N-terminal domain. This model represents the N-terminal region of the phage lambda replication protein O and homologous regions of other phage proteins. [DNA metabolism, DNA replication, recombination, and repair, Mobile and extrachromosomal element functions, Prophage functions]	95
130672	TIGR01611	tail_tube	phage contractile tail tube protein, P2 family. The tails of some phage are contractile. This model represents the tail tube, or tail core, protein of the contractile tail of phage P2, and homologous proteins from additional phage. [Mobile and extrachromosomal element functions, Prophage functions]	168
130673	TIGR01612	235kDa-fam	reticulocyte binding/rhoptry protein. This model represents a group of paralogous families in Plasmodium species alternately annotated as reticulocyte binding protein, 235-kDa family protein and rhoptry protein. Rhoptry protein is localized on the cell surface and is extremely large (although apparently lacking in repeat structure) and is important for the process of invasion of the RBCs by the parasite. These proteins are found in P. falciparum, P. vivax and P. yoelii.	2757
273716	TIGR01613	primase_Cterm	phage/plasmid primase, P4 family, C-terminal domain. This model represents a clade within a larger family of proteins from viruses of bacteria and animals. Members of this family are found in phage and plasmids of bacteria and archaea only. The model describes a domain of about 300 residues, found generally toward the protein C-terminus. [DNA metabolism, DNA replication, recombination, and repair, Mobile and extrachromosomal element functions, Prophage functions]	304
273717	TIGR01614	PME_inhib	pectinesterase inhibitor domain. This model describes a plant domain of about 200 amino acids, characterized by four conserved Cys residues, shown in a pectinesterase inhibitor from Kiwi to form two disulfide bonds: first to second and third to fourth. Roughly half the members of this family have the region described by this model followed immediately by a pectinesterase domain, pfam01095. This suggests that the pairing of the enzymatic domain and its inhibitor reflects a conserved regulatory mechanism for this enzyme family.	178
273718	TIGR01615	A_thal_3542	uncharacterized plant-specific domain TIGR01615. This model represents a domain found toward the C-terminus of a number of uncharacterized plant proteins. The domain is strongly conserved (greater than 30 % sequence identity between most pairs of members) but flanked by highly divergent regions including stretches of low-complexity sequence.	131
273719	TIGR01616	nitro_assoc	nitrogenase-associated protein. This model describes a small family of uncharacterized proteins found so far in alpha and gamma proteobacteria and in Nostoc sp. PCC 7120, a cyanobacterium. The gene for this protein is associated with nitrogenase genes. This family shows sequence similarity to TIGR00014, a glutaredoxin-dependent arsenate reductase that converts arsenate to arsenite for disposal. This family is one of several included in pfam03960. [Unknown function, General]	126
273720	TIGR01617	arsC_related	transcriptional regulator, Spx/MgsR family. This model represents a portion of the proteins within the larger set covered by pfam03960. That larger family includes a glutaredoxin-dependent arsenate reductase (TIGR00014). Characterized members of this family include Spx and MgsR from Bacillus subtilis. Spx is a global regulator for response to thiol-specific oxidative stress. It interacts with RNA polymerase. MgsR (modulator of the general stress response, also called YqgZ) provides a second level of regulation for more than a third of the proteins in the B. subtilis general stress regulon controlled by Sigma-B. [Regulatory functions, DNA interactions]	117
130679	TIGR01618	phage_P_loop	phage nucleotide-binding protein. This model represents an uncharacterized family of proteins from a number of phage of Gram-positive bacteria. This protein contains a P-loop motif, G/A-X-X-G-X-G-K-T near its amino end. The function of this protein is unknown. [Mobile and extrachromosomal element functions, Prophage functions]	220
130680	TIGR01619	hyp_HI0040	TIGR01619 family protein. This model represents a hypothetical equivalog of gamma proteobacteria, includes HI0040. These sequences do not have any similarity to known proteins by PSI-BLAST.	249
130681	TIGR01620	hyp_HI0043	TIGR01620 family protein. This model includes putative membrane proteins from alpha and gamma proteobacteria, each making up their own clade. The two clades have less than 25% identity between them. We could not find support for the assignment to the sequence from Brucella (OMNI|NTL01BM0951) of being a GTP-binding protein.	289
130682	TIGR01621	RluA-like	pseudouridine synthase Rlu family protein, TIGR01621. This model represents a clade of sequences within the pseudouridine synthase superfamily (pfam00849). The superfamily includes E. coli proteins: RluA, RluB, RluC, RluD, and RsuA. The sequences modeled here are most closely related to RluA. Neisseria, among those species hitting this model, does not appear to have an RluA homolog. It is presumed that these sequences function as pseudouridine synthases, although perhaps with different specificity. [Protein synthesis, tRNA and rRNA base modification]	217
273721	TIGR01622	SF-CC1	splicing factor, CC1-like family. This model represents a subfamily of RNA splicing factors including the Pad-1 protein (N. crassa), CAPER (M. musculus) and CC1.3 (H. sapiens). These proteins are characterized by an N-terminal arginine-rich, low complexity domain followed by three (or in the case of 4 H. sapiens paralogs, two) RNA recognition domains (rrm: pfam00706). These splicing factors are closely related to the U2AF splicing factor family (TIGR01642). A homologous gene from Plasmodium falciparum was identified in the course of the analysis of that genome at TIGR and was included in the seed.	494
130684	TIGR01623	put_zinc_LRP1	putative zinc finger domain, LRP1 type. This model represents a putative zinc finger domain found in plants. Arabidopsis thaliana has at least 10 distinct members. Proteins containing this domain, including LRP1, generally share the same size, about 300 amino acids, and architecture. This 43-residue domain, and a more C-terminal companion domain of similar size, appear as tightly conserved islands of sequence similarity. The remainder consists largely of low-complexity sequence. Several animal proteins have regions with matching patterns of Cys, Gly, and His residues. These are not included in the model but score between trusted and noise cutoffs.	43
273722	TIGR01624	LRP1_Cterm	LRP1 C-terminal domain. This model represents a tightly conserved small domain found in LRP1 and related plant proteins. This family also contains a well-conserved putative zinc finger domain (TIGR01623). The rest of the sequence of most members consists of highly divergent, low-complexity sequence.	50
130686	TIGR01625	YidE_YbjL_dupl	AspT/YidE/YbjL antiporter duplication domain. This model represents a domain that is duplicated in the aspartate-alanine antiporter AspT, as well as HI0035 of Haemophilus influenzae, YidE and YbjL of E. coli, and a number of other known or putative transporters. Member proteins may have 0, 1, or 2 copies of TrkA potassium uptake domain pfam02080 between the duplications. The domain contains several apparent transmembrane regions and is proposed here to act in transport. [Transport and binding proteins, Unknown substrate]	154
130687	TIGR01626	ytfJ_HI0045	conserved hypothetical protein YtfJ-family, TIGR01626. This model represents sequences from gamma proteobacteria that are related to the E. coli protein, YtfJ.	184
130688	TIGR01627	A_thal_3515	uncharacterized plant-specific domain TIGR01627. This model represents an uncharacterized domain found in both Arabidopsis thaliana (at least 10 copies) and Oryza sativa. Most member proteins have only a short stretch of sequence N-terminal to this domain, but one has a long N-terminal extension that includes a protein kinase domain (pfam00069).	225
130689	TIGR01628	PABP-1234	polyadenylate binding protein, human types 1, 2, 3, 4 family. These eukaryotic proteins recognize the poly-A of mRNA and consist of four tandem RNA recognition domains at the N-terminus (rrm: pfam00076) followed by a PABP-specific domain (pfam00658) at the C-terminus. The protein is involved in the transport of mRNAs from the nucleus to the cytoplasm. There are four paralogs in Homo sapiens which are expressed in testis, platelets, broadly expressed and of unknown tissue range.	562
273723	TIGR01629	rep_II_X	phage/plasmid replication protein, gene II/X family. This model represents a family of phage and plasmid replication proteins. In bacteriophage IKe and related phage, the full-length protein is designated gene II protein. A much shorter protein of unknown function, translated from a conserved in-frame alternative initiator, is designated gene X protein. Members of this family also include plasmid replication proteins. This model is built as a fragment model to better detect translations from alternate initiators and other fragments relative to full length gene II protein. [Mobile and extrachromosomal element functions, Prophage functions, Mobile and extrachromosomal element functions, Plasmid functions]	345
130691	TIGR01630	psiM2_ORF9	phage uncharacterized protein (putative large terminase), C-terminal domain. This model represents the C-terminal region of a set of phage proteins typically about 400-500 amino acids in length, although some members are considerably shorter. An article on Methanobacterium phage Psi-M2 calls the member from that phage, ORF9, a putative large terminase subunit, and ORF8 a candidate terminase small subunit. Most proteins in this family have an apparent P-loop nucleotide-binding sequence toward the N-terminus. [Mobile and extrachromosomal element functions, Prophage functions]	142
273724	TIGR01631	Trypano_RHS	trypanosome RHS (retrotransposon hot spot) family. This model describes full-length and part-length members of the RHS (retrotransposon hot spot) family in Trypanosoma brucei and Trypanosoma cruzi. Members of this family are frequently interrupted by non-LTR retrotransposons inserted at exactly the same relative position.	760
233500	TIGR01632	L11_bact	50S ribosomal protein uL11, bacterial form. This model represents bacterial, chloroplast, and most mitochondrial forms of 50S ribosomal protein L11. [Protein synthesis, Ribosomal proteins: synthesis and modification]	140
188159	TIGR01633	phi3626_gp14_N	putative phage tail component, N-terminal domain. This model represents the best-conserved region of about 125 amino acids, toward the N-terminus, of a family of proteins from temperate phage of a number of Gram-positive bacteria. These phage proteins range in length from 230 to 525 amino acids. [Mobile and extrachromosomal element functions, Prophage functions]	124
130695	TIGR01634	tail_P2_I	phage tail protein, P2 protein I family. This model represents the family of phage P2 protein I and related tail proteins from a number of temperate phage of Gram-negative bacteria. This model is built as a fragment model and identifies some phage tail proteins with strong but local similarity to members of the seed alignment. [Mobile and extrachromosomal element functions, Prophage functions]	139
130696	TIGR01635	tail_comp_S	phage virion morphogenesis (putative tail completion) protein. This model describes protein S of phage P2, suggested experimentally to act in tail completion and stable head joining, and related proteins from a number of phage. [Mobile and extrachromosomal element functions, Prophage functions]	144
130697	TIGR01636	phage_rinA	phage transcriptional activator, RinA family. This model represents a family of phage proteins, including RinA, a transcriptional activator in staphylococcal phage phi 11. This family shows similarity to ArpU, a phage-related putative autolysin regulator, and to some sporulation-specific sigma factors. [Mobile and extrachromosomal element functions, Prophage functions, Regulatory functions, DNA interactions]	134
273725	TIGR01637	phage_arpU	phage transcriptional regulator, ArpU family. This model represents a family of phage proteins, including ArpU, called a putative autolysin regulatory protein. ArpU was described as a regulator of cellular muramidase-2 of Enterococcus hirae but appears to have been cloned from a prophage. This family appears related to the RinA family of bacteriophage transcriptional activators and to some sporulation-specific sigma factors. We propose that this is a phage transcriptional activator family. [Mobile and extrachromosomal element functions, Prophage functions, Regulatory functions, DNA interactions]	132
130699	TIGR01638	Atha_cystat_rel	Arabidopsis thaliana cystatin-related protein. This model represents a family similar in sequence and probably homologous to a large family of cysteine proteinase inhibitors, or cystatins, as described by pfam00031. Cystatins may help plants resist attack by insects.	92
130700	TIGR01639	P_fal_TIGR01639	Plasmodium falciparum uncharacterized domain TIGR01639. This model represents a conserved sequence region of about 60 amino acids found in over 40 predicted proteins of Plasmodium falciparum. It is not found elsewhere, including closely related species such as Plasmodium yoelii. No member of this family is characterized.	61
273726	TIGR01640	F_box_assoc_1	F-box protein interaction domain. This model describes a large family of plant domains, with several hundred members in Arabidopsis thaliana. Most examples are found C-terminal to an F-box (pfam00646), a 60 amino acid motif involved in ubiquitination of target proteins to mark them for degradation. Two-hybrid experiments support the idea that most members are interchangeable F-box subunits of SCF E3 complexes. Some members have two copies of this domain.	230
213641	TIGR01641	phageSPP1_gp7	phage putative head morphogenesis protein, SPP1 gp7 family. This model describes a region of about 110 amino acids found exclusively in phage-related proteins, internally or toward the C-terminus. One member, gp7 of phage SPP1, appears involved in head morphogenesis. [Mobile and extrachromosomal element functions, Prophage functions]	108
273727	TIGR01642	U2AF_lg	U2 snRNP auxiliary factor, large subunit, splicing factor. These splicing factors consist of an N-terminal arginine-rich low complexity domain followed by three tandem RNA recognition motifs (pfam00076). The well-characterized members of this family are auxiliary components of the U2 small nuclear ribonucleoprotein splicing factor (U2AF). These proteins are closely related to the CC1-like subfamily of splicing factors (TIGR01622). Members of this subfamily are found in plants, metazoa and fungi.	509
273728	TIGR01643	YD_repeat_2x	YD repeat (two copies). This model describes two tandem copies of a 21-residue extracellular repeat found in Gram-negative, Gram-positive, and animal proteins. The repeat is named for a YD dipeptide, the most strongly conserved motif of the repeat. These repeats appear in general to be involved in binding carbohydrate; the chicken teneurin-1 YD-repeat region has been shown to bind heparin.	42
273729	TIGR01644	phage_P2_V	phage baseplate assembly protein V. This model describes a family of phage (and bacteriocin) proteins related to the phage P2 V gene product, which forms the small spike at the tip of the tail. Homologs in general are annotated as baseplate assembly protein V. At least one member is encoded within a region of Pectobacterium carotovorum (Erwinia carotovora) described as a bacteriocin, a phage tail-derived module able to kill bacteria closely related to the host strain. [Mobile and extrachromosomal element functions, Prophage functions]	190
130706	TIGR01645	half-pint	poly-U binding splicing factor, half-pint family. The proteins represented by this model contain three RNA recognition motifs (rrm: pfam00076) and have been characterized as poly-pyrimidine tract binding proteins associated with RNA splicing factors. In the case of PUF60 (GP|6176532), in complex with p54, and in the presence of U2AF, facilitates association of U2 snRNP with pre-mRNA.	612
273730	TIGR01646	vgr_GE	Rhs element Vgr protein. This model represents the Vgr family of proteins, associated with some classes of Rhs elements. This model does not include a large octapeptide repeat region, VGXXXXXX, found in the Vgr of Rhs classes G and E.	483
273731	TIGR01647	ATPase-IIIA_H	plasma-membrane proton-efflux P-type ATPase. This model describes the plasma membrane proton efflux P-type ATPase found in plants, fungi, protozoa, slime molds and archaea. The best studied representative is from yeast.	754
273732	TIGR01648	hnRNP-R-Q	heterogeneous nuclear ribonucleoprotein R, Q family. Sequences in this subfamily include the human heterogeneous nuclear ribonucleoproteins (hnRNP) R, Q, and APOBEC-1 complementation factor (aka APOBEC-1 stimulating protein). These proteins contain three RNA recognition domains (rrm: pfam00076) and a somewhat variable C-terminal domain.	578
273733	TIGR01649	hnRNP-L_PTB	hnRNP-L/PTB/hephaestus splicing factor family. Included in this family of heterogeneous ribonucleoproteins are PTB (polypyrimidine tract binding protein) and hnRNP-L. These proteins contain four RNA recognition motifs (rrm: pfam00067).	481
130711	TIGR01650	PD_CobS	cobaltochelatase, CobS subunit. This model describes Pseudomonas denitrificans CobS gene product, which is a cobalt chelatase subunit that functions in cobalamin biosynthesis. Cobalamin (vitamin B12) can be synthesized via several pathways, including an aerobic pathway (found in Pseudomonas denitrificans) and an anaerobic pathway (found in P. shermanii and Salmonella typhimurium). These pathways differ in the point of cobalt insertion during corrin ring formation. There are apparently a number of variations on these two pathways, where the major differences seem to be concerned with the process of ring contraction. Confusion regarding the functions of enzymes found in the aerobic vs. anaerobic pathways has arisen because nonhomologous genes in these different pathways were given the same gene symbols. Thus, cobS in the aerobic pathway (P. denitrificans) is not a homolog of cobS in the anaerobic pathway (S. typhimurium). It should be noted that E. coli synthesizes cobalamin only when it is supplied with the precursor cobinamide, which is a complex intermediate. Additionally, all E. coli cobalamin synthesis genes (cobU, cobS and cobT) were named after their Salmonella typhimurium homologs which function in the anaerobic cobalamin synthesis pathway. This model describes the aerobic cobalamin pathway Pseudomonas denitrificans CobS gene product, which is a cobalt chelatase subunit, with a MW ~37 kDa. The aerobic pathway cobalt chelatase is a heterotrimeric, ATP-dependent enzyme that catalyzes cobalt insertion during cobalamin biosynthesis. The other two subunits are the P. denitrificans CobT (TIGR01651) and CobN (pfam02514 CobN/Magnesium Chelatase) proteins. To avoid potential confusion with the nonhomologous Salmonella typhimurium/E.coli cobS gene product, the P. denitrificans gene symbol is not used in the name of this model. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	327
130712	TIGR01651	CobT	cobaltochelatase, CobT subunit. This model describes Pseudomonas denitrificans CobT gene product, which is a cobalt chelatase subunit that functions in cobalamin biosynthesis. Cobalamin (vitamin B12) can be synthesized via several pathways, including an aerobic pathway (found in Pseudomonas denitrificans) and an anaerobic pathway (found in P. shermanii and Salmonella typhimurium). These pathways differ in the point of cobalt insertion during corrin ring formation. There are apparently a number of variations on these two pathways, where the major differences seem to be concerned with the process of ring contraction. Confusion regarding the functions of enzymes found in the aerobic vs. anaerobic pathways has arisen because nonhomologous genes in these different pathways were given the same gene symbols. Thus, cobT in the aerobic pathway (P. denitrificans) is not a homolog of cobT in the anaerobic pathway (S. typhimurium). It should be noted that E. coli synthesizes cobalamin only when it is supplied with the precursor cobinamide, which is a complex intermediate. Additionally, all E. coli cobalamin synthesis genes (cobU, cobS and cobT) were named after their Salmonella typhimurium homologs which function in the anaerobic cobalamin synthesis pathway. This model describes the aerobic cobalamin pathway Pseudomonas denitrificans CobT gene product, which is a cobalt chelatase subunit, with a MW ~70 kDa. The aerobic pathway cobalt chelatase is a heterotrimeric, ATP-dependent enzyme that catalyzes cobalt insertion during cobalamin biosynthesis. The other two subunits are the P. denitrificans CobS (TIGR01650) and CobN (pfam02514 CobN/Magnesium Chelatase) proteins. To avoid potential confusion with the nonhomologous Salmonella typhimurium/E.coli cobT gene product, the P. denitrificans gene symbol is not used in the name of this model. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	600
273734	TIGR01652	ATPase-Plipid	phospholipid-translocating P-type ATPase, flippase. This model describes the P-type ATPase responsible for transporting phospholipids from one leaflet of bilayer membranes to the other. These ATPases are found only in eukaryotes.	1057
273735	TIGR01653	lactococcin_972	bacteriocin, lactococcin 972 family. This model represents bacteriocins related to lactococcin 972. Members tend to be found in association with a seven transmembrane putative immunity protein. [Cellular processes, Toxin production and resistance]	92
273736	TIGR01654	bact_immun_7tm	bacteriocin-associated integral membrane (putative immunity) protein. This model represents a family of integral membrane proteins, most of which are about 650 residues in size and predicted to span the membrane seven times. Nearly half of the members of this family are found in association with a member of the lactococcin 972 family of bacteriocins (TIGR01653). Others may be associated with uncharacterized proteins that may also act as bacteriocins. Although this protein is suggested to be an immunity protein, and the bacteriocin is suggested to be exported by a Sec-dependent process, the role of this protein is unclear. [Cellular processes, Toxin production and resistance]	679
130716	TIGR01655	yxeA_fam	conserved hypothetical protein TIGR01655. This model represents a family of small (about 115 amino acids) uncharacterized proteins with N-terminal signal sequences, found exclusively in Gram-positive organisms. Most genomes that have any members of this family have at least two members. [Hypothetical proteins, Conserved]	114
273737	TIGR01656	Histidinol-ppas	histidinol-phosphate phosphatase family domain. This domain is found in authentic histidinol-phosphate phosphatases which are sometimes found as stand-alone entities and sometimes as fusions with imidazoleglycerol-phosphate dehydratase (TIGR01261). Additionally, a family of proteins including YaeD from E. coli (TIGR00213) and various other proteins are closely related but may not have the same substrate specificity. This domain is a member of the haloacid-dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases. This superfamily is distinguished by the presence of three motifs: an N-terminal motif containing the nucleophilic aspartate, a central motif containing a conserved serine or threonine, and a C-terminal motif containing a conserved lysine (or arginine) and conserved aspartates. More specifically, the domain modelled here is a member of subfamily III of the HAD-superfamily by virtue of lacking a "capping" domain in either of the two common positions, between motifs 1 and 2, or between motifs 2 and 3.	147
273738	TIGR01657	P-ATPase-V	P-type ATPase of unknown pump specificity (type V). These P-type ATPases form a distinct clade but the substrate of their pumping activity has yet to be determined. This clade has been designated type V in.	1054
273739	TIGR01658	EYA-cons_domain	eyes absent protein conserved domain. This domain is common to all eyes absent (EYA) homologs. Metazoan EYA's also contain a variable N-terminal domain consisting largely of low-complexity sequences.	274
273740	TIGR01659	sex-lethal	sex-lethal family splicing factor. This model describes the sex-lethal family of splicing factors found in Dipteran insects. The sex-lethal phenotype, however, may be limited to the Melanogasters and closely related species. In Drosophila the protein acts as an inhibitor of splicing. This subfamily is most closely related to the ELAV/HUD subfamily of splicing factors (TIGR01661).	346
211677	TIGR01660	narH	nitrate reductase, beta subunit. The Nitrate reductase enzyme complex allows bacteria to use nitrate as an electron acceptor during anaerobic growth. The enzyme complex consists of a tetramer that has an alpha, beta and 2 gamma subunits. The alpha and beta subunits have catalytic activity and the gamma subunits attach the enzyme to the membrane and is a b-type cytochrome that receives electrons from the quinone pool and transfers them to the beta subunit. This model is specific for the beta subunit for nitrate reductase I (narH) and nitrate reductase II (narY) for gram positive and gram negative bacteria. A few thermophiles and archaea also match the model. The seed members used in this model are all experimentally characterized and include the following: SP:P11349 and SP:P19318, both E. coli (NarH and NarY respectively), SP:P42176 from B. subtilis, GP:11344602 from Pseudomonas fluorescens, GP:541762 from Paracoccus denitrificans, and GP:18413622 from Halomonas halodenitrificans. This model also matches Pfam pfam00037 for 4Fe-4S binding domain. [Energy metabolism, Anaerobic]	492
273741	TIGR01661	ELAV_HUD_SF	ELAV/HuD family splicing factor. This model describes the ELAV/HuD subfamily of splicing factors found in metazoa. HuD stands for the human paraneoplastic encephalomyelitis antigen D of which there are 4 variants in human. ELAV stands for the Drosophila Embryonic lethal abnormal visual protein. ELAV-like splicing factors are also known in human as HuB (ELAV-like protein 2), HuC (ELAV-like protein 3, Paraneoplastic cerebellar degeneration-associated antigen) and HuR (ELAV-like protein 1). These genes are most closely related to the sex-lethal subfamily of splicing factors found in Dipteran insects (TIGR01659). These proteins contain 3 RNA-recognition motifs (rrm: pfam00076).	352
273742	TIGR01662	HAD-SF-IIIA	HAD-superfamily hydrolase, subfamily IIIA. This subfamily falls within the Haloacid Dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases. The Class III subfamilies are characterized by the lack of any domains located either between the first and second conserved catalytic motifs (as in the Class I subfamilies, TIGR01493, TIGR01509, TIGR01488 and TIGR01494) or between the second and third conserved catalytic motifs (as in the Class II subfamilies, TIGR01460 and TIGR01484) of the superfamily domain. The IIIA subfamily contains five major clades: histidinol-phosphatase (TIGR01261) and histidinol-phosphatase-related protein (TIGR00213) which together form a subfamily (TIGR01656), DNA 3'-phosphatase (TIGR01663, TIGR01664), YqeG (TIGR01668) and YrbI (TIGR01670). In the case of histidinol phosphatase and PNK-3'-phosphatase, this model represents a domain of a bifunctional system. In the histidinol phosphatase HisB, a C-terminal domain is an imidazoleglycerol-phosphate dehydratase which catalyzes a related step in histidine biosynthesis. In PNK-3'-phosphatase, N- and C-terminal domains constitute the polynucleotide kinase and DNA-binding components of the enzyme. [Unknown function, Enzymes of unknown specificity]	135
130724	TIGR01663	PNK-3'Pase	polynucleotide 5'-kinase 3'-phosphatase. This model represents the metazoan 5'-polynucleotide-kinase-3'-phosphatase, PNKP, which is believed to be involved in repair of oxidative DNA damage. Removal of 3' phosphates is essential for the further processing of the break by DNA polymerases. The central phosphatase domain is a member of the IIIA subfamily (TIGR01662) of the haloacid dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases. As is common in this superfamily, the enzyme is magnesium dependent. A difference between this enzyme and other HAD-superfamily phosphatases is in the third conserved catalytic motif which usually contains two conserved aspartate residues believed to be involved in binding the magnesium ion. Here, the second aspartate is replaced by a conserved arginine residue which may indicate an interaction with the phosphate backbone of the substrate. Very close relatives of this domain are also found separate from the N- and C-terminal domains seen here, as in the 3'-phosphatase found in plants. The larger family of these domains is described by TIGR01664. Outside of the phosphatase domain is a P-loop ATP-binding motif associated with the kinase activity. The entry for the mouse homolog appears to be missing a large piece of sequence corresponding to the first conserved catalytic motif of the phosphatase domain as well as the conserved threonine of the second motif. Either this is a sequencing artifact or this may represent a pseudo- or non-functional gene. Note that the EC number for the kinase function is: 2.7.1.78	526
211680	TIGR01664	DNA-3'-Pase	DNA 3'-phosphatase. This model represents a family of proteins and protein domains which catalyze the dephosphorylation of DNA 3'-phosphates. It is believed that this activity is important for the repair of single-strand breaks in DNA caused by radiation or oxidative damage. This domain is often (TIGR01663), but not always, linked to a DNA 5'-kinase domain. The central phosphatase domain is a member of the IIIA subfamily (TIGR01662) of the haloacid dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases. As is common in this superfamily, the enzyme is magnesium dependent. A difference between this enzyme and other HAD-superfamily phosphatases is in the third conserved catalytic motif which usually contains two conserved aspartate residues believed to be involved in binding the magnesium ion. Here, the second aspartate is usually replaced by an arginine residue which may indicate an interaction with the phosphate backbone of the substrate. Alternatively, there is an additional conserved aspartate downstream of the usual site which may indicate a slightly different fold in this region.	166
273743	TIGR01665	put_anti_recept	phage minor structural protein, N-terminal region. This model represents the conserved N-terminal region, typically from about residue 25 to about residue 350, of a family of uncharacterized phage proteins 500 to 1700 residues in length. [Mobile and extrachromosomal element functions, Prophage functions]	317
130727	TIGR01666	YCCS	TIGR01666 family membrane protein. This model represents a clade of sequences from gamma and beta proteobacteria. These proteins are >700 amino acids long and many have been annotated as putative membrane proteins. The gene from Salmonella has been annotated as a putative efflux transporter. The gene from E. coli has the name yccS. [Cell envelope, Other]	704
130728	TIGR01667	YCCS_YHJK	integral membrane protein, YccS/YhfK family. This model represents two clades of putative transmembrane proteins including the E. coli YccS and YhfK proteins. The YccS hypothetical equivalog (TIGR01666) is found in beta and gamma proteobacteria, while the smaller YhfK group is only found in E. coli, Salmonella and Yersinia. TMHMM on the 19 hits to this model shows a consensus of 11 transmembrane helices separated into two clusters, an N-terminal cluster of 6 and a central cluster of 5. This would indicate two non-membrane domains, one on each side of the membrane.	701
273744	TIGR01668	YqeG_hyp_ppase	HAD superfamily (subfamily IIIA) phosphatase, TIGR01668. This family of hypothetical proteins is a member of the IIIA subfamily of the haloacid dehalogenase (HAD) superfamily of hydrolases. All characterized members of this subfamily (TIGR01662) and most characterized members of the HAD superfamily are phosphatases. HAD superfamily phosphatases contain active site residues in several conserved catalytic motifs, all of which are found conserved here. This family consists of sequences from fungi, plants, cyanobacteria, gram-positive bacteria and Deinococcus. There is presently no characterization of any sequence in this family.	170
273745	TIGR01669	phage_XkdX	phage uncharacterized protein, XkdX family. This model represents a family of small (about 50 amino acid) phage proteins, found in at least 12 different phage and prophage regions of Gram-positive bacteria. In a number of these phage, the gene for this protein is found near the holin and endolysin genes. [Mobile and extrachromosomal element functions, Prophage functions]	45
130731	TIGR01670	KdsC-phosphatas	3-deoxy-D-manno-octulosonate 8-phosphate phosphatase, YrbI family. This family of proteins is a member of the IIIA subfamily of the haloacid dehalogenase (HAD) superfamily of hydrolases. All characterized members of this subfamily (TIGR01662) and most characterized members of the HAD superfamily are phosphatases. HAD superfamily phosphatases contain active site residues in several conserved catalytic motifs, all of which are found conserved here. One member of this family, the YrbI protein from H. influenzae, has been cloned, expressed, purified and found to be an active 3-deoxy-D-manno-octulosonate 8-phosphate phosphatase. Furthermore, its crystal structure has been determined. This family consists of sequences from beta, gamma and epsilon proteobacteria, Aquifex, Fusobacterium, Porphyromonas and Methanosarcina. The Methanosarcina sequence is distinctive in that it is linked to an N-terminal cytidylyltransferase domain (pfam02348) and is annotated as acylneuraminate cytidylyltransferase. This may give some clue as to the function of these phosphatases. Several eukaryotic sequences scoring between trusted and noise are also closely related to this function such as the CMP-N-acetylneuraminic acid synthetase from mouse, but in these cases the phosphatase domain is clearly inactive as many of the active site residues are not conserved. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	154
273746	TIGR01671	phage_TIGR01671	phage uncharacterized protein TIGR01671. This model represents an uncharacterized, well-conserved family of proteins found in bacteriophage and prophage regions of Gram-positive bacteria. [Mobile and extrachromosomal element functions, Prophage functions, Hypothetical proteins, Conserved]	151
273747	TIGR01672	AphA	HAD superfamily (subfamily IIIB) phosphatase, TIGR01672. This family of proteins is a member of the IIIB subfamily (pfam02767) of the haloacid dehalogenase (HAD) superfamily of hydrolases. All characterized members of subfamily III and most characterized members of the HAD superfamily are phosphatases. HAD superfamily phosphatases contain active site residues in several conserved catalytic motifs, all of which are found conserved here. The AphA gene from E. coli has been characterized and shown to be an active phosphatase enzyme. This family has been previously described as the "class B non-specific bacterial acid phosphatase" (B-NSAP) family, where it is noted that the enzyme is secreted and has a broad substrate range. The possibility exists, however, that the enzyme is specific for an as yet undefined substrate. Supporting evidence for the inclusion in the HAD superfamily, whose phosphatase members are magnesium dependent, is the inhibition by EDTA and calcium ions, and stimulation by magnesium ion.	237
130734	TIGR01673	holin_LLH	phage holin, LL-H family. This model represents a putative phage holin from a number of phage and prophage regions of Gram-positive bacteria. Like other holins, it is small (about 100 amino acids) with stretches of hydrophobic sequence and is encoded adjacent to lytic enzymes. [Mobile and extrachromosomal element functions, Prophage functions]	108
273748	TIGR01674	phage_lambda_G	phage minor tail protein G. This model describes a family of bacteriophage proteins including G of phage lambda. This protein has been described as undergoing a translational frameshift at a Gly-Lys dipeptide near the C-terminus of protein G from phage lambda, with about 4 % efficiency, to produce tail assembly protein G-T. The Lys of the Gly-Lys pair is the conserved second-to-last residue of seed alignment for this family. [Mobile and extrachromosomal element functions, Prophage functions]	138
273749	TIGR01675	plant-AP	plant acid phosphatase. This model represents a family of acid phosphatase from plants which are most closely related to the (so called) class B non-specific acid phosphatase OlpA (TIGR01533, which is believed to be a 5'-nucleotide phosphatase) and somewhat more distantly to another class B phosphatase, AphA (TIGR01672). Together these three clades define a subfamily (pfam03767) which corresponds to the IIIB subfamily of the haloacid dehalogenase (HAD) superfamily of aspartate nucleophile hydrolases. It has been reported that the best substrate for this enzyme that could be found was purine 5'-nucleoside phosphates. This is in concordance with the assignment of the H. influenzae hel protein (from TIGR01533) as a 5'-nucleotidase, however there is presently no other evidence to support this specific function for these plant phosphatases. Many genes from this family have been annotated as vegetative storage proteins due to their close homology with these earlier-characterized gene products, which are highly expressed in leaves. There are significant differences however, including expression levels and distribution. The most important difference is the lack in authentic VSPs of the nucleophilic aspartate residue, which is instead replaced by serine, glycine or asparagine. Thus these proteins can not be expected to be active phosphatases. This issue was confused by the publication in 1992 of an article claiming activity for the Glycine max VSP. In 1994 this assertion was refuted by the separation of the activity from the VSP. This model explicitly excludes the VSPs which lack the nucleophilic aspartate. The possibility exists, however, that some members of this family may, while containing all of the conserved HAD-superfamily catalytic residues, lack activity and have a function related to the function of the VSPs rather than the acid phosphatases.	228
130737	TIGR01676	GLDHase	galactonolactone dehydrogenase. This model represents L-Galactono-gamma-lactone dehydrogenase (EC 1.3.2.3). This enzyme catalyzes the final step in ascorbic acid biosynthesis in higher plants. This protein is homologous to ascorbic acid biosynthesis enzymes of other species: L-gulono-gamma-lactone oxidase in rat and L-galactono-gamma-lactone oxidase in yeast. All three covalently bind the cofactor FAD.	541
273750	TIGR01677	pln_FAD_oxido	plant-specific FAD-dependent oxidoreductase. This model represents an uncharacterized plant-specific family of FAD-dependent oxidoreductases. At least seven distinct members are found in Arabidopsis thaliana. The family shows considerable sequence similarity to three different enzymes of ascorbic acid biosynthesis: L-galactono-1,4-lactone dehydrogenase (EC 1.3.2.3) from higher plants, D-arabinono-1,4-lactone oxidase (EC 1.1.3.37) from Saccharomyces cerevisiae, and L-gulonolactone oxidase (EC 1.1.3.8) from mouse, as well as to a bacterial sorbitol oxidase. The class of compound acted on by members of this family is unknown.	557
273751	TIGR01678	FAD_lactone_ox	sugar 1,4-lactone oxidases. This model represents a family of at least two different sugar 1,4 lactone oxidases, both involved in synthesizing ascorbic acid or a derivative. These include L-gulonolactone oxidase (EC 1.1.3.8) from rat and D-arabinono-1,4-lactone oxidase (EC 1.1.3.37) from Saccharomyces cerevisiae. Members are proposed to have the cofactor FAD covalently bound at a site specified by Prosite motif PS00862; OX2_COVAL_FAD; 1.	438
130740	TIGR01679	bact_FAD_ox	FAD-linked oxidoreductase. This model represents a family of bacterial oxidoreductases with covalently linked FAD, closely related to two different eukaryotic oxidases, L-gulonolactone oxidase (EC 1.1.3.8) from rat and D-arabinono-1,4-lactone oxidase (EC 1.1.3.37) from Saccharomyces cerevisiae.	419
130741	TIGR01680	Veg_Stor_Prot	vegetative storage protein. The proteins represented by this model are close relatives of the plant acid phosphatases (TIGR01675) and are limited to members of the Phaseoleae including Glycine max (soybean) and Phaseolus vulgaris (kidney bean). These proteins are highly expressed in the leaves of repeatedly depodded plants. VSP differs most strikingly from the acid phosphatases in the lack of the conserved nucleophilic aspartate residue in the N-terminus; thus, they should be inactive as phosphatases. This issue was confused by the publication in 1992 of an article claiming activity for the Glycine max VSP. In 1994 this assertion was refuted by the separation of the activity from the VSP.	275
273752	TIGR01681	HAD-SF-IIIC	HAD-superfamily phosphatase, subfamily IIIC. This model represents the IIIC subfamily of the Haloacid Dehalogenase (HAD) superfamily of aspartate nucleophile hydrolases. Subfamily III (also including IIIA - TIGR01662 and IIIB - pfam03767) contains sequences which do not contain either of the insert domains (between the 1st and 2nd conserved catalytic motifs, subfamily I - TIGR01493, TIGR01509, TIGR01549, TIGR01488, TIGR01494, TIGR01658, TIGR01544 and TIGR01545, or between the 2nd and 3rd, subfamily II - TIGR01460 and TIGR01484). Subfamily IIIC contains five relatively distantly related clades: a family of viral proteins (TIGR01684), a family of eukaryotic proteins called MDP-1 and a family of archaeal proteins most closely related to MDP-1 (TIGR01685), a family of bacteria including the Streptomyces FkbH protein (TIGR01686), and a small clade including the Pasteurella BcbF and EcbF proteins. The overall lack of species overlap among these clades may indicate a conserved function, but the degree of divergence between the clades and the differences in architecture outside of the domain in some clades warn against such a conclusion. No member of this subfamily is characterized with respect to function, however the MDP-1 protein is a characterized phosphatase. All of the characterized enzymes within subfamily III are phosphatases, and all of the active site residues characteristic of HAD-superfamily phosphatases are present in subfamily IIIC.	128
273753	TIGR01682	moaD	molybdopterin converting factor, subunit 1, non-archaeal. This model describes MoaD. It excludes archaeal homologs, since many Archaea have two MoaD-like proteins, suggesting two different functions. pfam02597 describes both the thiamine biosynthesis protein ThiS and this protein, MoaD, a subunit (together with MoaE, pfam02391) of the molybdopterin converting factor. Both ThiS and MoaD are involved in sulfur transfer reactions. Distribution of this family appears limited to species that also have a member of pfam02391, but a number of Archaea have two different members, suggesting functionally distinct subtypes. The C-terminal Gly-Gly of this model is critical to function. [Biosynthesis of cofactors, prosthetic groups, and carriers, Molybdopterin]	80
273754	TIGR01683	thiS	thiamine biosynthesis protein ThiS. This model represents ThiS, a small, ubiquitin-like thiamine biosynthesis protein related to MoaD, a molybdenum cofactor biosynthesis protein. Both proteins are involved in sulfur transfer. ThiS has a conserved Gly-Gly C-terminus that is modified, in reactions requiring ThiI, ThiF, IscS, and a sulfur atom from Cys, into the thiocarboxylate that provides the sulfur for thiazole biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Thiamine]	64
273755	TIGR01684	viral_ppase	viral phosphatase. This model represents a family of viral proteins of unknown function. These proteins are members, however, of the IIIC (TIGR01681) subfamily of the haloacid dehalogenase (HAD) superfamily of aspartate nucleophile hydrolases. All characterized members of the III subfamilies (IIIA, TIGR01662; IIIB, pfam03767) are phosphatases, including MDP-1, a member of subfamily IIIC (TIGR01681). No member of this subfamily is characterized with respect to particular function. All of the active site residues characteristic of HAD-superfamily phosphatases are present in subfamily IIIC. These proteins also include an N-terminal domain (ca. 125 aas) that is unique to this clade.	301
273756	TIGR01685	MDP-1	magnesium-dependent phosphatase-1. This model represents two closely related clades of sequences from eukaryotes and archaea. The mouse enzyme has been characterized as a phosphatase and has been positively identified as a member of the haloacid dehalogenase (HAD) superfamily by site-directed mutagenesis of the active site residues.	174
273757	TIGR01686	FkbH	FkbH-like domain. This model describes a domain of a family of proteins of unknown overall function. One of these, however, is a modular polyketide synthase 4800 amino acids in length from Streptomyces avermitilis in which this domain is the C-terminal segment. By contrast, the FkbH protein from Streptomyces hygroscopicus apparently contains only this domain. The remaining members of the family all contain an additional N-terminal domain of between 200 and 275 amino acids which show less than 20% identity to one another. It seems likely then that these proteins are involved in disparate functions, probably the biosynthesis of different natural products. For instance, the FkbH gene is found in a gene cluster believed to be responsible for the biosynthesis of unusual "PKS extender units" in the ascomycin pathway. This domain is composed of two parts, the first of which is a member of subfamily IIIC (TIGR01681) of the haloacid dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases. All of the characterized enzymes within subfamily III are phosphatases, and all of the active site residues characteristic of HAD-superfamily phosphatases are present in this domain. The C-terminal portion of this domain is unique to this family (by BLAST).	320
273758	TIGR01687	moaD_arch	MoaD family protein, archaeal. Members of this family appear to be archaeal versions of MoaD, subunit 1 of molybdopterin converting factor. This model has been split from the bacterial/eukaryotic equivalog model TIGR01682 because the presence of two members of this family in a substantial number of archaeal species suggests that roles might not be interchangeable. [Biosynthesis of cofactors, prosthetic groups, and carriers, Molybdopterin]	88
130749	TIGR01688	dltC	D-alanine--poly(phosphoribitol) ligase, subunit 2. This protein is part of the teichoic acid operon in gram-positive organisms. Gram positive organisms incorporate teichoic acid in their cell walls, and in the fatty acid residues of the glycolipid component of the outer layer of the cytoplasmic membrane. This gene, dltC, encodes the alanyl carrier protein. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan]	73
273759	TIGR01689	EcbF-BcbF	capsule biosynthesis phosphatase. This model describes a small family of highly conserved proteins (>60% ID). Two of these, BcbF and EcbF of Pasteurella multocida are believed to be part of the capsule polysaccharide biosynthesis machinery because they are cotranscribed from a locus devoted to that purpose. In pasteurella there are six different variant capsules (A-F), and these proteins are found only in B and E. The other two species in which this gene is (currently) found are both also pathogenic. These proteins are also members of the IIIC (TIGR01681) subfamily of the Haloacid Dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases. All of the characterized enzymes within subfamily III are phosphatases, and all of the active site residues characteristic of HAD-superfamily phosphatases are present in this subfamily. Due to the likelihood that the substrates of these enzymes are different depending on the nature of the particular polysaccharides associated with each species, this model has been classified as a subfamily despite the close homology.	126
162489	TIGR01690	ICE_RAQPRD	integrative conjugative element protein, RAQPRD family. This model represents a small family of proteins about 100 amino acids in length, including a predicted signal sequence and a perfectly conserved motif RAQPRD towards the C-terminus. Members are found in the Pseudomonas putida TOL plasmid pWW0 and in cryptic plasmid regions of Salmonella enterica subsp. enterica serovar Typhi and Pseudomonas syringae DC3000. The function is unknown. [Mobile and extrachromosomal element functions, Plasmid functions]	94
273760	TIGR01691	enolase-ppase	2,3-diketo-5-methylthio-1-phosphopentane phosphatase. This enzyme is the enolase-phosphatase of methionine salvage, a pathway that regenerates methionine from methylthioadenosine (MTA). Adenosylmethionine (AdoMet) is a donor of different moieties for various processes, including methylation reactions. Use of AdoMet for spermidine biosynthesis, which leads to polyamine biosynthesis, leaves MTA as a by-product that must be cleared. In Bacillus subtilis and related species, this single protein is replaced by separate enzymes with enolase and phosphatase activities. [Central intermediary metabolism, Sulfur metabolism]	220
130753	TIGR01692	HIBADH	3-hydroxyisobutyrate dehydrogenase. 3-hydroxyisobutyrate dehydrogenase is an enzyme that catalyzes the NAD+-dependent oxidation of 3-hydroxyisobutyrate to methylmalonate semialdehyde of the valine catabolism pathway. In Pseudomonas aeruginosa, 3-hydroxyisobutyrate dehydrogenase (mmsB) is co-induced with methylmalonate-semialdehyde dehydrogenase (mmsA) when grown on medium containing valine as the sole carbon source. The positive transcriptional regulator of this operon (mmsR) is located upstream of these genes and has been identified as a member of the XylS/AraC family of transcriptional regulators. 3-hydroxyisobutyrate dehydrogenase shares high sequence homology to the characterized 3-hydroxyisobutyrate dehydrogenase from rat liver with conservation of proposed NAD+ binding residues at the N-terminus (G-8,10,13,24 and D-31). This enzyme belongs to the 3-hydroxyacid dehydrogenase family, sharing a common evolutionary origin and enzymatic mechanism with 6-phosphogluconate. HIBADH exhibits sequence similarity to the NAD binding domain of 6-phosphogluconate dehydrogenase above trusted (pfam03446). [Energy metabolism, Amino acids and amines]	288
273761	TIGR01693	UTase_glnD	[Protein-PII] uridylyltransferase. This model describes GlnD, the uridylyltransferase/uridylyl-removing enzyme for the nitrogen regulatory protein PII. Not all homologs of PII share the property of uridylyltransferase modification on the characteristic Tyr residue (see Prosite pattern PS00496 and document PDOC00439), but the modification site is preserved in the PII homolog of all species with a member of this family. [Central intermediary metabolism, Nitrogen metabolism, Regulatory functions, Protein interactions]	850
273762	TIGR01694	MTAP	5'-deoxy-5'-methylthioadenosine phosphorylase. This model represents the methylthioadenosine phosphorylase found in metazoa, cyanobacteria and a limited number of archaea such as Sulfolobus, Aeropyrum, Pyrobaculum, Pyrococcus, and Thermoplasma. This enzyme is responsible for the first step in the methionine salvage pathway after the transfer of the amino acid moiety from S-adenosylmethionine. The enzyme from human is well-characterized including a crystal structure. A misleading characterization is found for a Sulfolobus solfataricus enzyme, which is called a MTAP. In fact, as uncovered by the genome sequence of S. solfataricus, there are at least two nucleotide phosphorylases and the one found in the MTAP clade is not the one annotated as such. The sequence in this clade has not been isolated but is likely to be the authentic SsMTAP as it displays all of the conserved active site residues found in the human enzyme. This explains the finding that the characterized enzyme has greater efficiency towards the purines inosine, guanosine and adenosine over MTA. In fact, this mis-naming of this enzyme has been carried forward to several publications including a crystal structure. In between the trusted and noise cutoffs are: 1) several archaeal sequences which appear to contain several residues characteristic of phosphorylases which act on guanosine or inosine (according to the crystal structure of MTAP and alignments). In any case, these residues are not conserved. 2) sequences from Mycobacterium tuberculosis and Streptomyces coelicolor which have better, although not perfect retention of the active site residues, but considering the general observation that bacteria utilize the MTA/SAH nucleotidase enzyme and a kinase to do this reaction, these have been excluded pending stronger evidence of their function, and 3) a sequence from Drosophila which appears to be a recent divergence (long branch in neighbor-joining trees) and lacks some of the conserved active site residues. [Central intermediary metabolism, Other, Purines, pyrimidines, nucleosides, and nucleotides, Salvage of nucleosides and nucleotides]	241
273763	TIGR01695	murJ_mviN	murein biosynthesis integral membrane protein MurJ. This model represents MurJ (previously MviN), a family of integral membrane proteins predicted to have ten or more transmembrane regions. Members have been suggested to act as a lipid II flippase, translocating a precursor of murein. However, it appears FtsW has that activity. Flippase activity for MurJ has not been shown. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan]	502
162494	TIGR01696	deoB	phosphopentomutase. This protein is involved in the purine and pyrimidine salvage pathway. It catalyzes the conversion of D-ribose 1-phosphate to D-ribose 5-phosphate and the conversion of 2-deoxy-D-ribose 1-phosphate to 2-deoxy-D-ribose 5-phosphate. The seed members of this protein are characterized deoB proteins from E. coli (SP:P07651) and Bacillus (SP:P46353). This model matches pfam01676 for Metalloenzyme superfamily. [Purines, pyrimidines, nucleosides, and nucleotides, Other]	381
130758	TIGR01697	PNPH-PUNA-XAPA	inosine/guanosine/xanthosine phosphorylase family. This model is a subset of the subfamily represented by pfam00896 (phosphorylase family 2). This model excludes the methylthioadenosine phosphorylases (MTAP, TIGR01694) which are believed to play a specific role in the recycling of methionine from methylthioadenosine. In this subfamily are found three clades of purine phosphorylases based on a neighbor-joining tree using the MTAP family as an outgroup. The highest-branching clade (TIGR01698) consists of a group of sequences from both gram positive and gram negative bacteria which have been annotated as purine nucleotide phosphorylases but have not been further characterized as to substrate specificity. Of the two remaining clades, one is xanthosine phosphorylase (XAPA, TIGR01699), which is limited to certain gamma proteobacteria and constitutes a special purine phosphorylase found in a specialized operon for xanthosine catabolism. The enzyme also acts on the same purines (inosine and guanosine) as the other characterized members of this subfamily, but is only induced when xanthosine must be degraded. The remaining and largest clade consists of purine nucleotide phosphorylases (PNPH, TIGR01700) from metazoa and bacteria which act primarily on guanosine and inosine (and do not act on adenosine). Sequences from Clostridium (GP:15025051) and Thermotoga (OMNI:TM1596) fall between these last two clades and are uncharacterized with respect to substrate range and operon.	248
130759	TIGR01698	PUNP	purine nucleotide phosphorylase. This clade of purine nucleotide phosphorylases has not been experimentally characterized but is assigned based on strong sequence homology. Closely related clades act on inosine and guanosine (PNPH, TIGR01700), and xanthosine, inosine and guanosine (XAPA, TIGR01699); neither of these will act on adenosine. A more distantly related clade (MTAP, TIGR01694) acts on methylthioadenosine.	237
130760	TIGR01699	XAPA	xanthosine phosphorylase. This model represents a small clade of purine nucleotide phosphorylases found in certain gamma proteobacteria. The gene is part of an operon for the degradation of xanthosine and is induced by xanthosine. The enzyme is also capable of acting on inosine and guanosine (but not adenosine) in a manner similar to those other phosphorylases to which it is closely related (TIGR01698, TIGR01700).	248
273764	TIGR01700	PNPH	purine nucleoside phosphorylase I, inosine and guanosine-specific. This model represents a family of bacterial and metazoan purine phosphorylases acting primarily on inosine and guanosine and not acting on adenosine. PNP-I refers to the nomenclature from Bacillus stearothermophilus where PNP-II refers to the nucleotidase acting on adenosine as the primary substrate. The bacterial enzymes (PUNA) are typified by the Bacillus PupG protein, which is involved in the metabolism of nucleosides as a carbon source. Several metazoan enzymes (PNPH) are well characterized including the human and bovine enzymes which have been crystallized. [Purines, pyrimidines, nucleosides, and nucleotides, Salvage of nucleosides and nucleotides]	249
273765	TIGR01701	Fdhalpha-like	oxidoreductase alpha (molybdopterin) subunit. This model represents a well-defined clade of oxidoreductase alpha subunits most closely related to a group of formate dehydrogenases including the E. coli FdhH protein (TIGR01591). These alpha subunits contain a molybdopterin cofactor and generally associate with two other subunits which contain iron-sulfur clusters and cytochromes. The particular subunits with which this enzyme interacts and the substrate which is reduced is unknown at this time. In Ralstonia, the gene is associated with the cbb operon, but is not essential for CO2 fixation.	743
130763	TIGR01702	CO_DH_cata	carbon-monoxide dehydrogenase, catalytic subunit. This model represents the carbon-monoxide dehydrogenase catalytic subunit. This protein is related to prismane (also called hybrid cluster protein), a complex whose activity is not yet fully described; the two share similar sets of ligands to unusual metal-containing clusters.	621
130764	TIGR01703	hybrid_clust	hydroxylamine reductase. This model represents a family of proteins containing an unusual 4Fe-2S-2O hybrid cluster. Earlier reports had proposed a 6Fe-6S prismane cluster. This subfamily is heterogeneous with respect to the presence or absence of a region of about 100 amino acids not far from the N-terminus of the protein. Members have been described as monomeric. The general function is unknown, although members from E. coli and several other species have hydroxylamine reductase activity. Members are found in various bacteria, in Archaea, and in several parasitic eukaryotes: Giardia intestinalis, Trichomonas vaginalis, and Entamoeba histolytica. [Cellular processes, Detoxification, Energy metabolism, Amino acids and amines]	522
130765	TIGR01704	MTA/SAH-Nsdase	5'-methylthioadenosine/S-adenosylhomocysteine nucleosidase. This model represents the enzyme 5'-methylthioadenosine/S-adenosylhomocysteine nucleosidase which acts on its two substrates at the same active site. This enzyme is involved in the recycling of the components of S-adenosylmethionine after it has donated one of its two non-ribose sulfur ligands to an acceptor. In the case of 5'-methylthioadenosine this represents the first step of the methionine salvage pathway in bacteria. This enzyme is widely distributed in bacteria, especially those that lack adenosylhomocysteinase (EC 3.3.1.1). One clade of bacteria including Agrobacterium, Mesorhizobium, Sinorhizobium and Brucella includes sequences annotated as MTA/SAH nucleotidase, but differs significantly in homology and has no independent experimental evidence. There are homologs of this enzyme in plants, some of which score between trusted and noise cutoffs here, but there is no experimental evidence to validate this function at this time. [Central intermediary metabolism, Other, Purines, pyrimidines, nucleosides, and nucleotides, Salvage of nucleosides and nucleotides]	228
130766	TIGR01705	MTA/SAH-nuc-hyp	5'-methylthioadenosine/S-adenosylhomocysteine nucleosidase, putative. This model represents the enzyme 5'-methylthioadenosine/S-adenosylhomocysteine nucleosidase which acts on its two substrates at the same active site. This clade of sequences is sufficiently distinct from the characterized proteins, which form the seed of TIGR01704 as to cast some doubt on the accuracy of annotations based on sequence similarity alone. This enzyme is involved in the recycling of the components of S-adenosylmethionine after it has donated one of its two non-ribose sulfur ligands to an acceptor. In the case of 5'-methylthioadenosine this represents the first step of the methionine salvage pathway in bacteria. This enzyme is widely distributed in bacteria.	212
273766	TIGR01706	NAPA	periplasmic nitrate reductase, large subunit. This model represents the large subunit of a family of nitrate reductases found in proteobacteria which are localized to the periplasm. This subunit binds molybdopterin and contains a twin-arginine motif at the N-terminus. The protein associates with NapB, a soluble heme-containing protein and NapC, a membrane-bound cytochrome c. The periplasmic nitrate reductases are not involved in the assimilation of nitrogen, and are not directly involved in the formation of electrochemical gradients (i.e. respiration) either. Rather, the purpose of this enzyme is either dissimilatory (i.e. to dispose of excess reductive equivalents) or indirectly respiratory by virtue of the consumption of electrons derived from NADH via the proton translocating NADH dehydrogenase. The enzymes from Alcaligenes eutrophus and Paracoccus pantotrophus have been characterized. In E. coli (as well as other organisms) this gene is part of a large nitrate reduction operon (napFDAGHBC). [Energy metabolism, Aerobic, Energy metabolism, Electron transport, Central intermediary metabolism, Nitrogen metabolism]	830
273767	TIGR01707	gspI	type II secretion system protein I. This model represents GspI, one of two proteins highly conserved at their N-termini and described by pfam02501 but easily separable phylogenetically. The other is GspJ. Both GspI and GspJ are proteins of the type II secretion pathway, or main terminal branch of the general secretion pathway. This pathway carries proteins across the outer membrane. Note that proteins of type II secretion are cryptic in E. coli K-12 - present but not yet demonstrated to act on any target.	101
130769	TIGR01708	typeII_sec_gspH	type II secretion system protein H. This model represents GspH, protein H of the main terminal branch of the general secretion pathway, also called type II secretion. It transports folded proteins across the bacterial outer membrane and is widely distributed in Gram-negative pathogens. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	143
273768	TIGR01709	typeII_sec_gspL	type II secretion system protein L. This model represents GspL, protein L of the main terminal branch of the general secretion pathway, also called type II secretion. It transports folded proteins across the bacterial outer membrane and is widely distributed in Gram-negative pathogens. [Protein fate, Protein and peptide secretion and trafficking]	384
130771	TIGR01710	typeII_sec_gspG	type II secretion system protein G. This model represents GspG, protein G of the main terminal branch of the general secretion pathway, also called type II secretion. It transports folded proteins across the bacterial outer membrane and is widely distributed in Gram-negative pathogens. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	134
130772	TIGR01711	gspJ	type II secretion system protein J. This model represents GspJ, one of two proteins highly conserved at their N-termini and described by pfam02501 but easily separable phylogenetically. The other is GspI. Both GspI and GspJ are proteins of the type II secretion pathway, or main terminal branch of the general secretion pathway. This pathway carries proteins across the outer membrane. Note that proteins of type II secretion are cryptic in E. coli K-12 - present but not yet demonstrated to act on any target.	192
273769	TIGR01712	phage_N6A_met	phage N-6-adenine-methyltransferase. This model is a fragment-mode model for a phage-borne DNA N-6-adenine-methyltransferase. [Mobile and extrachromosomal element functions, Prophage functions, DNA metabolism, Restriction/modification]	166
273770	TIGR01713	typeII_sec_gspC	type II secretion system protein C. This model represents GspC, protein C of the main terminal branch of the general secretion pathway, also called type II secretion. This system transports folded proteins across the bacterial outer membrane and is widely distributed in Gram-negative pathogens. [Protein fate, Protein and peptide secretion and trafficking]	259
130775	TIGR01714	phage_rep_org_N	phage replisome organizer, putative, N-terminal region. This model represents the N-terminal domain of a small family of phage proteins. The protein contains a region of low-complexity sequence that reflects DNA direct repeats able to function as an origin of phage replication. The region covered by this model is N-terminal to the low-complexity region. [Mobile and extrachromosomal element functions, Prophage functions]	119
273771	TIGR01715	phage_lam_T	phage tail assembly protein T. This model represents a translation of the T gene in phage lambda and related phage. A translational frameshift from the upstream gene G into the frame of T produces a minor protein gpG-T, essential in tail assembly but not found in the mature virion. [Mobile and extrachromosomal element functions, Prophage functions]	95
273772	TIGR01716	RGG_Cterm	transcriptional activator, Rgg/GadR/MutR family, C-terminal domain. This model describes the whole, except for a 60 residue N-terminal helix-turn-helix DNA-binding domain (pfam01381) of the family of proteins related to the transcriptional regulator Rgg, also called RopB. Rgg is required for secretion of several proteins, including a cysteine proteinase associated with virulence. GadR is a positive regulator of a glutamate-dependent acid resistance mechanism. MutR is a transcriptional activator for mutacin biosynthesis genes in Streptococcus mutans. This family appears restricted to the low-GC Gram-positive bacteria, including at least eight members in Lactococcus lactis. [Regulatory functions, DNA interactions]	220
273773	TIGR01717	AMP-nucleosdse	AMP nucleosidase. This model represents the AMP nucleosidase from proteobacteria but also including a sequence from Corynebacterium, a gram-positive organism. The species from E. coli has been most well studied.	477
130779	TIGR01718	Uridine-psphlse	uridine phosphorylase. This model represents a family of bacterial and archaeal uridine phosphorylases unrelated to the mammalian enzymes of the same name. The E. coli, Salmonella and Klebsiella genes have been characterized. Sequences from Clostridium, Streptomyces, Treponema, Halobacterium and Pyrobaculum were included above trusted on the basis of sequence homology and a PAM-based neighbor-joining tree. A clade including second sequences from Halobacterium and Vibrio was somewhat more distantly related and may represent a slightly different substrate specificity - these were placed below the noise cutoff. More distantly related is a clade of archaeal sequences which are as related to the DeoD family of inosine phosphorylases (TIGR00107) as they are to these uridine phosphorylases. This clade includes a characterized protein from Sulfolobus solfataricus which has been mis-named as a methylthioadenosine phosphorylase, but which acts on inosine and guanosine - it is unclear whether uridine has been evaluated as a substrate. [Purines, pyrimidines, nucleosides, and nucleotides, Salvage of nucleosides and nucleotides]	245
130780	TIGR01719	euk_UDPppase	uridine phosphorylase. This model represents a clade of mainly eukaryotic uridine phosphorylases. Genes from human and mouse have been characterized. This enzyme is a member of the PHP/UDP subfamily (pfam01048) and is closely related to the bacterial uridine (TIGR01718) and inosine (TIGR00107) phosphorylase equivalogs. In addition to the eukaryotes, a gene from Mycobacterium leprae is included in this equivalog and may have resulted from lateral gene transfer. [Purines, pyrimidines, nucleosides, and nucleotides, Salvage of nucleosides and nucleotides]	287
273774	TIGR01720	NRPS-para261	non-ribosomal peptide synthase domain TIGR01720. This domain appears to be located immediately downstream from a condensation domain (pfam00668), and is followed primarily by the end of the molecule or another condensation domain (in a few cases it is followed by pfam00501, an AMP-binding module). The converse is not true, pfam00668 domains are not always followed by this domain. This implicates this domain in possible post-condensation modification events. This model is 171 amino acids long and contains three very highly conserved regions. At the N-terminus is a nearly invariant lysine (position 11) followed by xxxRxxPxxGxGYG in which the proline and the first glycine are invariant. This is followed approximately 22 residues later by the motif FNYLG. Near the C-terminus of the domain is the sequence TxSD where the serine and aspartate are nearly invariant.	153
130782	TIGR01721	AMN-like	AMP nucleosidase, putative. The sequences in the clade represented by this model are most closely related to the AMP nucleosidase found in TIGR01717. These sequences are found only in Chlamydia and Porphyromonas and differ sufficiently from the characterized AMP nucleosidase to put some doubt on assignment of this name.	266
130783	TIGR01722	MMSDH	methylmalonic acid semialdehyde dehydrogenase. Involved in valine catabolism, methylmalonate-semialdehyde dehydrogenase catalyzes the irreversible NAD+- and CoA-dependent oxidative decarboxylation of methylmalonate semialdehyde to propionyl-CoA. Methylmalonate-semialdehyde dehydrogenase has been characterized in both prokaryotes and eukaryotes, functioning as a mammalian tetramer and a bacterial homodimer. Although similar in monomeric molecular mass and enzymatic activity, the N-terminal sequence in P. aeruginosa does not correspond with the N-terminal sequence predicted for rat liver. Sequence homology to a variety of prokaryotic and eukaryotic aldehyde dehydrogenases places MMSDH in the aldehyde dehydrogenase (NAD+) superfamily (pfam00171), making MMSDH's CoA requirement unique among known ALDHs. Methylmalonate semialdehyde dehydrogenase is closely related to betaine aldehyde dehydrogenase, 2-hydroxymuconic semialdehyde dehydrogenase, and class 1 and 2 aldehyde dehydrogenase. In Bacillus, a protein highly homologous to methylmalonic acid semialdehyde dehydrogenase groups out from the main MMSDH clade with Listeria and Sulfolobus. This Bacillus protein has been suggested to be located in an iol operon and/or involved in myo-inositol catabolism, converting malonic semialdehyde to acetyl CoA and CO2. The preceding enzymes responsible for valine catabolism are present in Bacillus, Listeria, and Sulfolobus. [Energy metabolism, Amino acids and amines]	477
130784	TIGR01723	hmd_TIGR	5,10-methenyltetrahydromethanopterin hydrogenase. This model represents a clade of authenticated coenzyme N(5),N(10)-methenyltetrahydromethanopterin reductases. This enzyme does not use F420. This enzyme acts in methanogenesis and as such is restricted to methanogenic archaeal species. This clade is one of two clades in pfam03201. [Energy metabolism, Methanogenesis]	340
130785	TIGR01724	hmd_rel	H2-forming N(5),N(10)-methenyltetrahydromethanopterin dehydrogenase-related protein. This model represents a sister clade to the authenticated coenzyme F420-dependent N(5),N(10)-methenyltetrahydromethanopterin reductase (HMD) of TIGR01723. Two members, designated HmdII and HmdIII, are found. Members are restricted to methanogens, but the function is unknown. [Unknown function, Enzymes of unknown specificity]	341
273775	TIGR01725	phge_HK97_gp10	phage protein, HK97 gp10 family. This model represents an uncharacterized, highly divergent bacteriophage family. The family includes gp10 from HK022 and HK97. It appears related to TIGR01635, a phage morphogenesis family believed to be involved in tail completion. [Mobile and extrachromosomal element functions, Prophage functions]	119
130787	TIGR01726	HEQRo_perm_3TM	amino acid ABC transporter, permease protein, 3-TM region, His/Glu/Gln/Arg/opine family. This model represents one of several classes of multiple membrane spanning regions found immediately N-terminal to the domain described by pfam00528, binding-protein-dependent transport systems inner membrane component. The region covered by this model generally is predicted to contain three transmembrane helices. Substrate specificities attributed to members of this family include histidine, arginine, glutamine, glutamate, and (in Agrobacterium) the opines octopine and nopaline. [Transport and binding proteins, Amino acids, peptides and amines]	99
213647	TIGR01727	oligo_HPY	oligopeptide/dipeptide ABC transporter, ATP-binding protein, C-terminal domain. This model represents a domain found in the C-terminal regions of oligopeptide ABC transporter ATP binding proteins, immediately following the ATP-binding domain (pfam00005). All characterized members appear able to be involved in the transport of oligopeptides or dipeptides. Some are important for sporulation or antibiotic resistance. Some dipeptide transporters also act on the heme precursor delta-aminolevulinic acid. [Transport and binding proteins, Amino acids, peptides and amines]	87
130789	TIGR01728	SsuA_fam	ABC transporter, substrate-binding protein, aliphatic sulfonates family. Members of this family are substrate-binding periplasmic proteins of ABC transporters. This subfamily includes SsuA, a member of a transporter operon needed to obtain sulfur from aliphatic sulfonates. Related proteins outside the scope of this model include taurine (NH2-CH2-CH2-SO3H) binding proteins, the probable sulfate ester binding protein AtsR, and the probable aromatic sulfonate binding protein AsfC. All these families make sulfur available when Cys and sulfate levels are low. Please note that phylogenetic analysis by neighbor-joining suggests that a number of sequences belonging to this family have been excluded because of scoring lower than taurine-binding proteins. [Transport and binding proteins, Other]	288
130790	TIGR01729	taurine_ABC_bnd	taurine ABC transporter, periplasmic binding protein. This model identifies a cluster of ABC transporter periplasmic substrate binding proteins, apparently specific for taurine. Transport systems for taurine (NH2-CH2-CH2-SO3H), sulfonates, and sulfate esters import sulfur when sulfate levels are low. The most closely related proteins outside this family are putative aliphatic sulfonate binding proteins (TIGR01728).	300
273776	TIGR01730	RND_mfp	RND family efflux transporter, MFP subunit. This model represents the MFP (membrane fusion protein) component of the RND family of transporters. RND refers to Resistance, Nodulation, and cell Division. It is, in part, a subfamily of pfam00529 (Pfam release 7.5) but hits substantial numbers of proteins missed by that model. The related HlyD secretion protein, for which pfam00529 is named, is outside the scope of this model. Attributed functions imply outward transport. These functions include nodulation, acriflavin resistance, heavy metal efflux, and multidrug resistance proteins. Most members of this family are found in Gram-negative bacteria. The proposed function of MFP proteins is to bring the inner and outer membranes together and enable transport to the outside of the outer membrane. Note, however, that a few members of this family are found in Gram-positive bacteria, where there is no outer membrane. [Transport and binding proteins, Unknown substrate]	322
273777	TIGR01731	fil_hemag_20aa	adhesin HecA family 20-residue repeat (two copies). This model represents two copies of a 20-residue repeat found in Bordetella pertussis filamentous hemagglutinin family of adhesins. This family includes extremely long proteins from a number of plant and animal pathogens.	40
273778	TIGR01732	tiny_TM_bacill	conserved hypothetical tiny transmembrane protein. This model represents a family of hypothetical proteins, half of which are 40 residues or less in length. Members are found only in spore-forming species. A Gly-rich variable region is followed by a strongly conserved, highly hydrophobic region, predicted to form a transmembrane helix, ending with an invariant Gly. The consensus for this stretch is FALLVVFILLIIV. [Hypothetical proteins, Conserved]	26
273779	TIGR01733	AA-adenyl-dom	amino acid adenylation domain. This model represents a domain responsible for the specific recognition of amino acids and activation as adenylyl amino acids. The reaction catalyzed is aa + ATP -> aa-AMP + PPi. These domains are usually found as components of multi-domain non-ribosomal peptide synthetases and are usually called "A-domains" in that context. A-domains are almost invariably followed by "T-domains" (thiolation domains, pfam00550) to which the amino acid adenylate is transferred as a thiol-ester to a bound pantetheine cofactor with the release of AMP (these are also called peptide carrier proteins, or PCPs). When the A-domain does not represent the first module (corresponding to the first amino acid in the product molecule) it is usually preceded by a "C-domain" (condensation domain, pfam00668) which catalyzes the ligation of two amino acid thiol-esters from neighboring modules. This domain is a subset of the AMP-binding domain found in Pfam (pfam00501) which also hits substrate--CoA ligases and luciferases. Sequences scoring in between trusted and noise for this model may be ambiguous as to whether they activate amino acids or other molecules lacking an alpha amino group.	409
273780	TIGR01734	D-ala-DACP-lig	D-alanine--poly(phosphoribitol) ligase, subunit 1. This model represents the enzyme (also called D-alanine-D-alanyl carrier protein ligase) which activates D-alanine as an adenylate via the reaction D-ala + ATP -> D-ala-AMP + PPi, and further catalyzes the condensation of the amino acid adenylate with the D-alanyl carrier protein (D-ala-ACP). The D-alanine is then further transferred to teichoic acid in the biosynthesis of lipoteichoic acid (LTA) and wall teichoic acid (WTA) in gram positive bacteria, both polysaccharides. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan]	502
188163	TIGR01735	FGAM_synt	phosphoribosylformylglycinamidine synthase, single chain form. This model represents a single-molecule form of phosphoribosylformylglycinamidine synthase, also called FGAM synthase, an enzyme of purine de novo biosynthesis. This form is found mostly in eukaryotes and Proteobacteria. In Bacillus subtilis PurL (FGAM synthase II) and PurQ (FGAM synthase I), homologous to different parts of this model, perform the equivalent function; the unrelated small protein PurS is also required and may be a third subunit. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis]	1310
273781	TIGR01736	FGAM_synth_II	phosphoribosylformylglycinamidine synthase II. Phosphoribosylformylglycinamidine synthase is a single, long polypeptide in most Proteobacteria and eukaryotes. Three proteins are required in Bacillus subtilis and many other species. This is the longest of the three and is designated PurL, phosphoribosylformylglycinamidine synthase II, or FGAM synthase II. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis]	715
273782	TIGR01737	FGAM_synth_I	phosphoribosylformylglycinamidine synthase I. In some species, phosphoribosylformylglycinamidine synthase is composed of a single polypeptide chain. This model describes the PurQ protein of Bacillus subtilis (where PurL, PurQ, and PurS are required for phosphoribosylformylglycinamidine synthase activity) and functionally equivalent proteins from other bacteria and archaea. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis]	227
273783	TIGR01738	bioH	pimelyl-[acyl-carrier protein] methyl ester esterase. This CoA-binding enzyme is required for the production of pimeloyl-coenzyme A, the substrate of the BioF protein early in the biosynthesis of biotin. Its exact function is unknown, but is proposed in ref 2. This enzyme belongs to the alpha/beta hydrolase fold family (pfam00561). Members of this family are restricted to the Proteobacteria. [Biosynthesis of cofactors, prosthetic groups, and carriers, Biotin]	245
273784	TIGR01739	tegu_FGAM_synt	herpesvirus tegument protein/v-FGAM-synthase. This model describes a family of large proteins of herpesviruses. The protein is described variably as tegument protein or phosphoribosylformylglycinamidine synthase (FGAM-synthase). Most of the length of the protein shows homology to eukaryotic FGAM-synthase. Functional characterizations were not verified during construction of this model.	1202
273785	TIGR01740	pyrF	orotidine 5'-phosphate decarboxylase, subfamily 1. This model represents orotidine 5'-monophosphate decarboxylase, the PyrF protein of pyrimidine nucleotide biosynthesis. In many eukaryotes, the region hit by this model is part of a multifunctional protein. [Purines, pyrimidines, nucleosides, and nucleotides, Pyrimidine ribonucleotide biosynthesis]	214
130802	TIGR01741	staph_tand_hypo	conserved hypothetical protein. This model represents a tandem array of 10 proteins in Staphylococcus aureus and the C-terminal region of one protein each in Bacillus subtilis and Bacillus halodurans.	157
273786	TIGR01742	SA_tandem_lipo	Staphylococcus tandem lipoproteins. Members of this family are predicted lipoproteins (mostly), found in Staphylococcus aureus in several different tandem clusters in pathogenicity islands. Members are also found, clustered, in Staphylococcus epidermidis.	255
130804	TIGR01743	purR_Bsub	pur operon repressor, Bacillus subtilis type. This model represents the purine operon repressor PurR of low-GC Gram-positive bacteria. This homodimeric repressor contains a large region homologous to phosphoribosyltransferases and is inhibited by 5-phosphoribosyl 1-pyrophosphate. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis, Regulatory functions, DNA interactions]	268
130805	TIGR01744	XPRTase	xanthine phosphoribosyltransferase. This model represents a xanthine-specific phosphoribosyltransferase of Bacillus subtilis and closely related proteins from other species, mostly from other Gram-positive bacteria. The adjacent gene is a xanthine transporter; B. subtilis can import xanthine for the purine salvage pathway or for catabolism to obtain nitrogen. [Purines, pyrimidines, nucleosides, and nucleotides, Salvage of nucleosides and nucleotides]	191
130806	TIGR01745	asd_gamma	aspartate-semialdehyde dehydrogenase, gamma-proteobacterial. [Amino acid biosynthesis, Aspartate family]	366
273787	TIGR01746	Thioester-redct	thioester reductase domain. This model includes the terminal domain from the fungal alpha aminoadipate reductase enzyme (also known as aminoadipate semialdehyde dehydrogenase) which is involved in the biosynthesis of lysine, as well as the reductase-containing component of the myxochelin biosynthetic gene cluster, MxcG. The mechanism of reduction involves activation of the substrate by adenylation and transfer to a covalently-linked pantetheine cofactor as a thioester. This thioester is then reduced to give an aldehyde (thus releasing the product) and a regenerated pantetheine thiol. (In myxochelin biosynthesis this aldehyde is further reduced to an alcohol or converted to an amine by an aminotransferase.) This is a fundamentally different reaction than beta-ketoreductase domains of polyketide synthases which act at a carbonyl two carbons removed from the thioester and forms an alcohol as a product. This domain is invariably found at the C-terminus of the proteins which contain it (presumably because it results in the release of the product). The majority of hits to this model are non-ribosomal peptide synthetases in which this domain is similarly located proximal to a thiolation domain (pfam00550). In some cases this domain is found at the end of a polyketide synthetase enzyme, but is unlike ketoreductase domains which are found before the thiolase domains. Exceptions to this observed relationship with the thiolase domain include three proteins which consist of stand-alone reductase domains (GP|466833 from M. leprae, GP|435954 from Anabaena and OMNI|NTL02SC1199 from Strep. coelicolor) and one protein (OMNI|NTL01NS2636 from Nostoc) which contains N-terminal homology with a small group of hypothetical proteins but no evidence of a thiolation domain next to the putative reductase domain. Below the noise cutoff to this model are proteins containing more distantly related ketoreductase and dehydratase/epimerase domains. It has been suggested that a NADP-binding motif can be found in the N-terminal portion of this domain that may form a Rossmann-type fold.	367
130808	TIGR01747	diampropi_NH3ly	diaminopropionate ammonia-lyase family. This small subfamily includes diaminopropionate ammonia-lyase from Salmonella typhimurium and a small number of close homologs, about 50 % identical in sequence. The enzyme is a pyridoxal phosphate-binding homodimer homologous to threonine dehydratase (threonine deaminase). [Energy metabolism, Other]	376
130809	TIGR01748	rhaA	L-rhamnose isomerase. This enzyme interconverts L-rhamnose and L-rhamnulose. In some species, including E. coli, this is the first step in rhamnose catabolism. Sequential steps are catalyzed by rhamnulose kinase (rhaB), then rhamnulose-1-phosphate aldolase (rhaD) to yield glycerone phosphate and (S)-lactaldehyde. Characterization of this family is based on members in E. coli and Salmonella. [Energy metabolism, Sugars]	414
130810	TIGR01749	fabA	beta-hydroxyacyl-[acyl carrier protein] dehydratase FabA. This enzyme, FabA, shows overlapping substrate specificity with FabZ with regard to chain length in fatty acid biosynthesis. It is commonly designated 3-hydroxydecanoyl-[acyl-carrier-protein] dehydratase (EC 4.2.1.60) as if it were specific for that chain length, but its specificity is broader; it is active even in the initiation of fatty acid biosynthesis. This enzyme can also isomerize trans-2-decenoyl-ACP to cis-3-decenoyl-ACP to bypass reduction by FabI and instead allow biosynthesis of unsaturated fatty acids. FabA cannot elongate unsaturated fatty acids. [Fatty acid and phospholipid metabolism, Biosynthesis]	169
130811	TIGR01750	fabZ	beta-hydroxyacyl-[acyl carrier protein] dehydratase FabZ. This enzyme, FabZ, shows overlapping substrate specificity with FabA with regard to chain length in fatty acid biosynthesis. FabZ works preferentially on shorter chains and is often designated (3R)-hydroxymyristoyl-[acyl carrier protein] dehydratase, although its actual specificity is broader. Unlike FabA, FabZ does not function as an isomerase and cannot initiate unsaturated fatty acid biosynthesis. However, only FabZ can act during the elongation of unsaturated fatty acid chains. [Fatty acid and phospholipid metabolism, Biosynthesis]	140
188164	TIGR01751	crot-CoA-red	crotonyl-CoA carboxylase/reductase. The enzyme represented by this model can convert crotonyl-CoA to butyryl-CoA (crotonyl-CoA reductase activity), but more importantly, in the presence of CO2, generates (2S)-ethylmalonyl-CoA. In serine cycle methylotrophic bacteria this enzyme is involved in the process of acetyl-CoA to glyoxylate. In other bacteria the enzyme is used to produce extender units for incorporation into polyketides such as tylosin from Streptomyces fradiae and coronatine from Pseudomonas syringae.	398
273788	TIGR01752	flav_long	flavodoxin, long chain. Flavodoxins are small redox-active proteins with a flavin mononucleotide (FMN) prosthetic group. They can act in nitrogen fixation by nitrogenase, in sulfite reduction, and light-dependent NADP+ reduction during photosynthesis, among other roles. This model describes the long chain type, typical for nitrogen fixation but associated with pyruvate formate-lyase activation and cobalamin-dependent methionine synthase activity in E. coli. [Energy metabolism, Electron transport]	167
273789	TIGR01753	flav_short	flavodoxin, short chain. Flavodoxins are small redox-active proteins with a flavin mononucleotide (FMN) prosthetic group. They can act in nitrogen fixation by nitrogenase, in sulfite reduction, and light-dependent NADP+ reduction during photosynthesis, among other roles. This model describes the short chain type. Many of these are involved in sulfite reduction. [Energy metabolism, Electron transport]	140
130815	TIGR01754	flav_RNR	ribonucleotide reductase-associated flavodoxin, putative. This model represents a family of proteins found immediately downstream of ribonucleotide reductase genes in Xylella fastidiosa and some Gram-positive bacteria. It appears to be a highly divergent flavodoxin of the short chain type, more like the flavodoxins of the sulfate-reducing genus Desulfovibrio than like the NifF flavodoxins associated with nitrogen fixation.	140
130816	TIGR01755	flav_wrbA	NAD(P)H:quinone oxidoreductase, type IV. This model represents a protein, WrbA, related to and slightly larger than flavodoxin. It was just shown, in E. coli and Archaeoglobus fulgidus (and previously for some eukaryotic homologs) to act as a fourth type of NAD(P)H:quinone oxidoreductase. In E. coli, this protein was earlier reported to be produced during stationary phase, bind to the trp repressor, and make trp operon repression more efficient. WrbA does not interact with the trp operator by itself. Members are found in species in which homologs of the E. coli trp operon repressor TrpR (SP:P03032) are not detected. [Energy metabolism, Electron transport]	197
130817	TIGR01756	LDH_protist	lactate dehydrogenase. This model represents a family of protist lactate dehydrogenases which have apparently evolved from a recent protist malate dehydrogenase ancestor. Lactate dehydrogenase converts the hydroxyl at C-2 of lactate to a carbonyl in the product, pyruvate. The preference of this enzyme for NAD or NADP has not been determined. A critical residue in malate dehydrogenase, arginine-91 (T. vaginalis numbering) has been mutated to a leucine, eliminating the positive charge which complemented the carboxylate in malate which is absent in lactate. Several other more subtle changes are proposed to make the active site smaller to accommodate the less bulky lactate molecule.	313
130818	TIGR01757	Malate-DH_plant	malate dehydrogenase, NADP-dependent. This model represents the NADP-dependent malate dehydrogenase found in plants, mosses and green algae and localized to the chloroplast. Malate dehydrogenase converts oxaloacetate into malate, a critical step in the C4 cycle which allows circumvention of the effects of photorespiration. Malate is subsequently transported from the chloroplast to the cytoplasm (and then to the bundle sheath cells in C4 plants). The plant and moss enzymes are light regulated via cysteine disulfide bonds. The enzyme from Sorghum has been crystallized.	387
130819	TIGR01758	MDH_euk_cyt	malate dehydrogenase, NAD-dependent. This model represents the NAD-dependent cytosolic malate dehydrogenase from eukaryotes. The enzyme from pig has been studied by X-ray crystallography.	324
130820	TIGR01759	MalateDH-SF1	malate dehydrogenase. This model represents a family of malate dehydrogenases in bacteria and eukaryotes which utilize either NAD or NADP depending on the species and context. MDH interconverts malate and oxaloacetate and is a part of the citric acid cycle as well as the C4 cycle in certain photosynthetic organisms.	323
273790	TIGR01760	tape_meas_TP901	phage tail tape measure protein, TP901 family, core region. This model represents a reasonably well conserved core region of a family of phage tail proteins. The member from phage TP901-1 was characterized as a tail length tape measure protein in that a shortened form of the protein leads to phage with proportionately shorter tails. [Mobile and extrachromosomal element functions, Prophage functions]	350
273791	TIGR01761	thiaz-red	thiazolinyl imide reductase. This reductase is found associated with gene clusters for the biosynthesis of various non-ribosomal peptide derived natural products in which cysteine is cyclized to a thiazoline ring containing an imide double bond. Examples include yersiniabactin (irp3/YbtU, GP|21959262) and pyochelin (PchG, GP|4325022).	344
130823	TIGR01762	chlorin-enz	chlorinating enzyme. This model represents a group of highly homologous enzymes related to dioxygenases which chlorinate amino acid methyl groups. BarB1 and BarB2 are proposed to trichlorinate one of the methyl groups of a leucine residue in the biosynthesis of barbamide in the cyanobacterium Lyngbya majuscula. SyrB2 is proposed to chlorinate the methyl group of threonine in the biosynthesis of syringomycin in Pseudomonas syringae. CmaB is proposed to chlorinate the beta-methyl group of alloisoleucine in the process of ring closure in the biosynthesis of coronamic acid, a component of coronatine also in Pseudomonas syringae.	288
273792	TIGR01763	MalateDH_bact	malate dehydrogenase, NAD-dependent. This enzyme converts malate into oxaloacetate in the citric acid cycle. The critical residues which discriminate malate dehydrogenase from lactate dehydrogenase have been characterized, and have been used to set the cutoffs for this model. Sequences showing [aflimv][ap]R[rk]pgM[st] and [ltv][ilm]gGhgd were kept above trusted, while those in which the capitalized residues in the patterns were found to be Q, E and E were kept below the noise cutoff. Some sequences in the grey zone have been annotated as malate dehydrogenases, but none have been characterized. Phylogenetically, a clade of sequences from eukaryotes such as Toxoplasma and Plasmodium which include a characterized lactate dehydrogenase and show ambiguous critical residue patterns appears to be more closely related to these bacterial sequences than other eukaryotic sequences. These are relatively long branch and have been excluded from the model. All other sequences falling below trusted appear to be phylogenetically outside of the clade including the trusted hits. The annotation of Botryococcus braunii as lactate dehydrogenase appears to be in error. This was initially annotated as MDH by Swiss-Prot and then changed. The rationale for either of these annotations is not traceable. [Energy metabolism, TCA cycle]	305
200128	TIGR01764	excise	DNA binding domain, excisionase family. An excisionase, or Xis protein, is a small protein that binds and promotes excisive recombination; it is not enzymatically active. This model represents a number of putative excisionases and related proteins from temperate phage, plasmids, and transposons, as well as DNA binding domains of other proteins, such as a DNA modification methylase. This model identifies mostly small proteins and N-terminal regions of large proteins, but some proteins appear to have two copies. This domain appears similar, in both sequence and predicted secondary structure (PSIPRED) to the MerR family of transcriptional regulators (pfam00376). [Unknown function, General]	49
130826	TIGR01765	tspaseT_teng_N	transposase, putative, N-terminal domain. This model represents the N-terminal region of a family of putative transposases found in the largest copy number in Thermoanaerobacter tengcongensis. The three homologs in Bacillus anthracis are each split into two ORFs and this model represents the upstream ORF. [Mobile and extrachromosomal element functions, Transposon functions]	73
273793	TIGR01766	tspaseT_teng_C	transposase, IS605 OrfB family, central region. This model represents a region of sequence similarity between a family of putative transposases of Thermoanaerobacter tengcongensis, smaller related proteins from Bacillus anthracis, putative transposases described by pfam01385, and other proteins. [Mobile and extrachromosomal element functions, Transposon functions]	82
130828	TIGR01767	MTRK	S-methyl-5-thioribose kinase. This enzyme, S-methyl-5-thioribose kinase (MtnK) is involved in the methionine salvage pathway in certain bacteria.	370
273794	TIGR01768	GGGP-family	geranylgeranylglyceryl phosphate synthase family protein. This model represents a family of sequences including geranylgeranylglyceryl phosphate synthase which catalyzes the first committed step in the synthesis of ether-linked membrane lipids in archaea. The clade of bacterial sequences may have the same function or a closely related function. This model supersedes TIGR00265, which has been retired.	223
130830	TIGR01769	GGGP	phosphoglycerol geranylgeranyltransferase. This model represents geranylgeranylglyceryl phosphate synthase which catalyzes the first committed step in the synthesis of ether-linked membrane lipids in archaea. The active enzyme is reported to be a homopentamer in Methanobacterium thermoautotrophicum but is reported to be a homodimer in Thermoplasma acidophilum.	205
273795	TIGR01770	NDH_I_N	proton-translocating NADH-quinone oxidoreductase, chain N. This model describes the 14th (based on E. coli) structural gene, N, of bacterial and chloroplast energy-transducing NADH (or NADPH) dehydrogenases. This model does not describe any subunit of the mitochondrial complex I (for which the subunit composition is very different), nor NADH dehydrogenases that are not coupled to ion transport. The Enzyme Commission designation 1.6.5.3, for NADH dehydrogenase (ubiquinone), is applied broadly, perhaps unfortunately, even if the quinone is menaquinone (Thermus, Mycobacterium) or plastoquinone (chloroplast). For chloroplast members, the name NADH-plastoquinone oxidoreductase is used for the complex and this protein is designated as subunit 2 or B. This model also includes a subunit of a related complex in the archaeal methanogen, Methanosarcina mazei, in which F420H2 replaces NADH and 2-hydroxyphenazine replaces the quinone. [Energy metabolism, Electron transport]	468
273796	TIGR01771	L-LDH-NAD	L-lactate dehydrogenase. This model represents the NAD-dependent L-lactate dehydrogenases from bacteria and eukaryotes. This enzyme functions as the final step in anaerobic glycolysis. Although lactate dehydrogenases have in some cases been mistaken for malate dehydrogenases due to the similarity of these two substrates and the apparent ease with which evolution can toggle these activities, critical residues have been identified which can discriminate between the two activities. At the time of the creation of this model no hits above the trusted cutoff contained critical residues typical of malate dehydrogenases. [Energy metabolism, Anaerobic, Energy metabolism, Glycolysis/gluconeogenesis]	299
130833	TIGR01772	MDH_euk_gproteo	malate dehydrogenase, NAD-dependent. This model represents the NAD-dependent malate dehydrogenase found in eukaryotes and certain gamma proteobacteria. The enzyme is involved in the citric acid cycle as well as the glyoxylate cycle. Several isoforms exist in eukaryotes. In S. cerevisiae, for example, there are cytoplasmic, mitochondrial and peroxisomal forms. Although malate dehydrogenases have in some cases been mistaken for lactate dehydrogenases due to the similarity of these two substrates and the apparent ease with which evolution can toggle these activities, critical residues have been identified which can discriminate between the two activities. At the time of the creation of this model no hits above the trusted cutoff contained critical residues typical of lactate dehydrogenases. [Energy metabolism, TCA cycle]	312
273797	TIGR01773	GABAperm	gamma-aminobutyrate permease. GABA permease (gabP) catalyzes the translocation of 4-aminobutyrate (GABA) across the plasma membrane, with homologues expressed in Gram-negative and Gram-positive organisms. This permease is a highly hydrophobic transmembrane protein consisting of 12 transmembrane domains with hydrophilic N- and C-terminal ends. Induced by nitrogen-limited culture conditions in both Escherichia coli and Bacillus subtilis, gabP is an energy dependent transport system stimulated by membrane potential and has been observed adjacent and distant from other GABA degradation proteins. GabP is highly homologous to amino acid permeases from B. subtilis, E. coli, as well as to other members of the amino acid permease family (pfam00324). A member of the APC (amine-polyamine-choline) transporter superfamily, GABA permease possesses a "consensus amphipathic region" (CAR) found to be evolutionarily conserved within this transport family. This amphipathic region is located between helix 8 and cytoplasmic loop 8-9, forming a potential channel domain and suggested to play a significant role in ligand recognition and translocation. Unique to GABA permeases, a conserved cysteine residue (CYS-300, E. coli) located at the beginning of the amphipathic domain, has been determined to be critical for catalytic specificity. [Transport and binding proteins, Amino acids, peptides and amines]	452
273798	TIGR01774	PFL2-3	glycyl radical enzyme, PFL2/glycerol dehydratase family. This family previously was designated pyruvate formate-lyase, but it now appears that members include the B12-independent glycerol dehydratase. Therefore, the functional definition of the family is being broadened. This family includes the PflF and PflD proteins of E. coli, described as isoforms of pyruvate-formate lyase found in a limited number of additional species. PFL catalyzes the reaction pyruvate + CoA -> acetyl-CoA + formate, which is a step in the fermentation of glucose.	786
273799	TIGR01776	TonB-tbp-lbp	TonB-dependent lactoferrin and transferrin receptors. This family of TonB-dependent receptors are responsible for import of iron from the mammalian iron carriers lactoferrin and transferrin across the outer membrane. These receptors are found only in bacteria which can infect mammals such as Moraxella, Mannheimia, Neisseria, Actinobacillus, Pasteurella, Haemophilus and Histophilus species. [Transport and binding proteins, Cations and iron carrying compounds, Transport and binding proteins, Porins]	932
273800	TIGR01777	yfcH	TIGR01777 family protein. This model represents a clade of proteins of unknown function including the E. coli yfcH protein. [Hypothetical proteins, Conserved]	291
273801	TIGR01778	TonB-copper	TonB-dependent copper receptor. This model represents a family of proteobacterial TonB-dependent outer membrane receptor/transporters which bind and translocate copper ions. Two characterized members of this family exist, outer membrane protein C (OprC) from Pseudomonas aeruginosa and NosA from Pseudomonas stutzeri which is responsible for providing copper for the copper-containing N2O reductase. [Transport and binding proteins, Cations and iron carrying compounds, Transport and binding proteins, Porins]	636
273802	TIGR01779	TonB-B12	TonB-dependent vitamin B12 receptor. This model represents the TonB-dependent outer membrane receptor found in gamma proteobacteria responsible for translocating the cobalt-containing vitamin B12 (cobalamin). [Transport and binding proteins, Other, Transport and binding proteins, Porins]	614
188167	TIGR01780	SSADH	succinate-semialdehyde dehydrogenase. Succinic semialdehyde dehydrogenase is one of three enzymes constituting 4-aminobutyrate (GABA) degradation in both prokaryotes and eukaryotes, catalyzing the (NAD(P)+)-dependent catabolism reaction of succinic semialdehyde to succinate for metabolism by the citric acid cycle. The EC number depends on the cofactor: 1.2.1.24 for NAD only, 1.2.1.79 for NADP only, and 1.2.1.16 if both can be used. In Escherichia coli, succinic semialdehyde dehydrogenase is located in an unidirectionally transcribed gene cluster encoding enzymes for GABA degradation and is suggested to be cotranscribed with succinic semialdehyde transaminase from a common promoter upstream of SSADH. Similar gene arrangements can be found in characterized Ralstonia eutropha and the genome analysis of Bacillus subtilis. Prokaryotic succinic semialdehyde dehydrogenases (1.2.1.16) share high sequence homology to characterized succinic semialdehyde dehydrogenases from rat and human (1.2.1.24), exhibiting conservation of proposed cofactor binding residues, and putative active sites (G-237 & G-242, C-293 & G-259 respectively of rat SSADH). Eukaryotic SSADH enzymes exclusively utilize NAD+ as a cofactor, exhibiting little to no NADP+ activity. While a NADP+ preference has been detected in prokaryotes in addition to both NADP+- and NAD+-dependencies as in E.coli, Pseudomonas, and Klebsiella pneumoniae. The function of this alternative SSADH currently is unknown, but has been suggested to play a possible role in 4-hydroxyphenylacetic degradation. Just outside the scope of this model, are several sequences belonging to clades scoring between trusted and noise. These sequences may be actual SSADH enzymes, but lack sufficiently close characterized homologs to make a definitive assignment at this time. SSADH enzyme belongs to the aldehyde dehydrogenase family (pfam00171), sharing a common evolutionary origin and enzymatic mechanism with lactaldehyde dehydrogenase. Like in lactaldehyde dehydrogenase and succinate semialdehyde dehydrogenase, the mammalian catalytic glutamic acid and cysteine residues are conserved in all the enzymes of this family (PS00687, PS00070). [Central intermediary metabolism, Other]	448
273803	TIGR01781	Trep_dent_lipo	Treponema denticola clustered lipoprotein. This model represents a family of six predicted lipoproteins from a region of about 20 tandemly arranged genes in the Treponema denticola genome. Two other neighboring genes share the lipoprotein signal peptide region but do not show more extensive homology. The function of this locus is unknown.	412
273804	TIGR01782	TonB-Xanth-Caul	TonB-dependent receptor. This model represents a family of TonB-dependent outer-membrane receptors which are found mainly in Xanthomonas and Caulobacter. These appear to represent the expansion of a paralogous family in that the 22 X. axonopodis (21 in X. campestris) and 18 C. crescentus sequences are more closely related to each other than any of the many TonB-dependent receptors found in other species. In fact, the Crescentus and Xanthomonas sequences are inseparable on a phylogenetic tree using a PAM-weighted neighbor-joining method, indicating that one of the two genuses may have acquired this set of receptors from the other. The mechanism by which this family is shared between Xanthomonas, a gamma proteobacterial plant pathogen and Caulobacter, an alpha proteobacterial aquatic organism is unclear. [Transport and binding proteins, Porins]	845
273805	TIGR01783	TonB-siderophor	TonB-dependent siderophore receptor. This subfamily model encompasses a wide variety of TonB-dependent outer membrane siderophore receptors. It has no overlap with TonB receptors known to transport other substances, but is likely incomplete due to lack of characterizations. It is likely that genuine siderophore receptors will be identified which score below the noise cutoff to this model at which point the model should be updated. [Transport and binding proteins, Cations and iron carrying compounds, Transport and binding proteins, Porins]	651
273806	TIGR01784	T_den_put_tspse	conserved hypothetical protein (putative transposase or invertase). Several lines of evidence suggest that members of this family (loaded as a fragment mode model to find part-length matches) are associated with transposition, inversion, or recombination. Members are found in small numbers of genomes, but in large copy numbers in many of those species, including over 30 full length and fragmentary members in Treponema denticola. The strongest similarities are usually within rather than between species. PSI-BLAST shows similarity to proteins designated as possible transposases, DNA invertases (resolvases), and recombinases. In the oral pathogenic spirochete Treponema denticola, full-length members are often found near transporters or other membrane proteins. This family includes members of the putative transposase family pfam04754.	270
273807	TIGR01785	TonB-hemin	TonB-dependent heme/hemoglobin receptor family protein. This model represents the TonB-dependent outer membrane heme/hemoglobin receptor/transporter found in bacteria which live in contact with animals (which contain hemoglobin or other heme-bearing globins) or legumes (which contain leghemoglobin). Some species having hits to this model such as Nostoc, Caulobacter and Chlorobium do not have an obvious source of hemoglobin-like proteins in their biological niche and so the possibility exists that they act on some other substance. [Transport and binding proteins, Cations and iron carrying compounds, Transport and binding proteins, Porins]	665
273808	TIGR01786	TonB-hemlactrns	TonB-dependent hemoglobin/transferrin/lactoferrin receptor family protein. This model represents a family of TonB-dependent outer membrane receptor/transporters acting on iron-containing proteins such as hemoglobin, transferrin and lactoferrin. Two subfamily models with a narrower scope are contained within this model, the heme/hemoglobin receptor family protein model (TIGR01785) and the transferrin/lactoferrin receptor family model (TIGR01776). Accessions which score above trusted to this model while not scoring above trusted to the more specific models are most likely to be hemoglobin transporters. Nearly all of the species containing trusted hits to this model have access to hemoglobin, transferrin or lactoferrin or related proteins in their biological niche. [Transport and binding proteins, Cations and iron carrying compounds, Transport and binding proteins, Porins]	715
273809	TIGR01787	squalene_cyclas	squalene/oxidosqualene cyclases. This family of enzymes catalyzes the cyclization of the triterpenes squalene or 2-3-oxidosqualene to a variety of products including hopene, lanosterol, cycloartenol, amyrin, lupeol, and isomultiflorenol.	621
130848	TIGR01788	Glu-decarb-GAD	glutamate decarboxylase. This model represents the pyridoxal phosphate-dependent glutamate (alpha) decarboxylase found in bacteria (low and high-GC gram positive, proteobacteria and cyanobacteria), plants, fungi and at least one archaeon (Methanosarcina). The product of the enzyme is gamma-aminobutyrate (GABA).	431
130849	TIGR01789	lycopene_cycl	lycopene cyclase. This model represents a family of bacterial lycopene cyclases catalyzing the transformation of lycopene to carotene. These enzymes are found in a limited spectrum of alpha and gamma proteobacteria as well as Flavobacterium.	370
130850	TIGR01790	carotene-cycl	lycopene cyclase family protein. This family includes lycopene beta and epsilon cyclases (which form beta and delta carotene, respectively) from bacteria and plants as well as the plant capsanthin/capsorubin and neoxanthin cyclases which appear to have evolved from the plant lycopene cyclases. The plant lycopene epsilon cyclases also transform neurosporene to alpha zeacarotene.	388
130851	TIGR01791	CM_archaeal	chorismate mutase, archaeal type. This model represents a clade of archaeal chorismate mutases. Chorismate mutase catalyzes the conversion of chorismate into prephenate which is subsequently converted into either phenylalanine or tyrosine. In Sulfolobus this gene is found as a fusion with prephenate dehydrogenase (although the non-TIGR annotation contains a typographical error indicating it as a dehydratase OMNI|NTL02SS0274) which is the next enzyme in the tyrosine biosynthesis pathway. The Archaeoglobus gene contains an N-terminal prephenate dehydrogenase domain and a C-terminal prephenate dehydratase domain followed by a regulatory amino acid-binding ACT domain. The Thermoplasma volcanium gene is adjacent to prephenate dehydratase. [Amino acid biosynthesis, Aromatic amino acid family]	83
273810	TIGR01792	urease_alph	urease, alpha subunit. This model describes the urease alpha subunit UreC (designated beta or B chain, UreB in Helicobacter species). Accessory proteins for incorporation of the nickel cofactor are usually found in addition to the urease alpha, beta, and gamma subunits. The trusted cutoff is set above the scores of many reported fragments and of a putative second urease alpha chain in Streptomyces coelicolor. [Central intermediary metabolism, Nitrogen metabolism]	567
130853	TIGR01793	cit_synth_euk	citrate (Si)-synthase, eukaryotic. This model includes both mitochondrial and peroxisomal forms of citrate synthase. Citrate synthase is the entry point to the TCA cycle from acetyl-CoA. Peroxisomal forms, such as SP:P08679 from yeast (recognized by the C-terminal targeting motif SKL) act in the glyoxylate cycle. Eukaryotic homologs excluded by the high trusted cutoff of this model include a Tetrahymena thermophila citrate synthase that doubles as a filament protein, a putative citrate synthase from Plasmodium falciparum (no TCA cycle), and a methylcitrate synthase from Aspergillus nidulans.	427
130854	TIGR01795	CM_mono_cladeE	monofunctional chorismate mutase, alpha proteobacterial type. This model represents a small clade of monofunctional (non-fused) chorismate mutases spanning alpha proteobacteria and two actinobacter gram positive species. The alpha proteobacterial members are trusted because the pathways of CM are evident and there is only one plausible CM in the genome. In S. coelicolor, however, there is another apparent monofunctional CM. [Amino acid biosynthesis, Aromatic amino acid family]	94
130855	TIGR01796	CM_mono_aroH	monofunctional chorismate mutase, gram positive type, clade 1. This model represents a family of monofunctional (non-fused) chorismate mutases from gram positive bacteria (Firmicutes) and cyanobacteria. Trusted members of the family are found in operons with other enzymes of the chorismate pathways, both up- and downstream of CM (Listeria, Bacillus, Oceanobacillus) or are the sole CM in the genome where the other members of the chorismate pathways are found elsewhere in the genome (Nostoc, Thermosynechococcus). [Amino acid biosynthesis, Aromatic amino acid family]	117
130856	TIGR01797	CM_P_1	chorismate mutase domain of proteobacterial P-protein, clade 1. This model represents the chorismate mutase domain of the gamma and beta proteobacterial "P-protein" which contains an N-terminal chorismate mutase domain and a C-terminal prephenate dehydratase domain. [Amino acid biosynthesis, Aromatic amino acid family]	83
273811	TIGR01798	cit_synth_I	citrate synthase I (hexameric type). This model describes one of several distinct but closely homologous classes of citrate synthase, the protein that brings carbon (from acetyl-CoA) into the TCA cycle. This form, class I, is known to be hexameric and allosterically inhibited by NADH in Escherichia coli, Acinetobacter anitratum, Azotobacter vinelandii, Pseudomonas aeruginosa, etc. In most species with a class I citrate synthase, a dimeric class II isozyme is found. The class II enzyme may act primarily on propionyl-CoA to make 2-methylcitrate or be bifunctional, may be found among propionate utilization enzymes, and may be constitutive or induced by propionate. Some members of this model group as class I enzymes, and may be hexameric, but have shown regulatory properties more like class II enzymes. [Energy metabolism, TCA cycle]	412
130858	TIGR01799	CM_T	chorismate mutase domain of T-protein. This model represents the chorismate mutase domain of the gamma proteobacterial "T-protein" which consists of an N-terminal chorismate mutase domain and a C-terminal prephenate dehydrogenase domain. [Amino acid biosynthesis, Aromatic amino acid family]	83
130859	TIGR01800	cit_synth_II	2-methylcitrate synthase/citrate synthase II. Members of this family are dimeric enzymes with activity as 2-methylcitrate synthase, citrate synthase, or both. Many Gram-negative species have a hexameric citrate synthase, termed citrate synthase I (TIGR01798). Members of this family (TIGR01800) appear as a second citrate synthase isozyme but typically are associated with propionate metabolism and synthesize 2-methylcitrate from propionyl-CoA; citrate synthase activity may be incidental. A number of species, including Thermoplasma acidophilum, Pyrococcus furiosus, and the Antarctic bacterium DS2-3R have a bifunctional member of this family as the only citrate synthase isozyme.	368
130860	TIGR01801	CM_A	chorismate mutase domain of gram positive AroA protein. This model represents a small clade of chorismate mutase domains N-terminally fused to the first enzyme in the chorismate pathway, 2-dehydro-3-deoxyphosphoheptanoate aldolase (DAHP synthetase, AroA) which are found in some gram positive species and Deinococcus. Only in Deinococcus, where this domain is the sole CM domain in the genome can a trusted assignment of function be made. In the other species there is at least one other trusted CM domain present. The similarity between the Deinococcus gene and the others in this clade is sufficiently strong (~44% identity), that the whole clade can be trusted to be functional. The possibility exists, however, that in the gram positive species the fusion to the first enzyme in the pathway has evolved a separate, regulatory role. [Amino acid biosynthesis, Aromatic amino acid family]	102
273812	TIGR01802	CM_pl-yst	monofunctional chorismate mutase, eukaryotic type. This model represents the plant and yeast (plastidic) chorismate mutase. These CM's are distinct from other forms by the presence of an extended regulatory domain. [Amino acid biosynthesis, Aromatic amino acid family]	246
130862	TIGR01803	CM-like	chorismate mutase related enzymes. This subfamily includes two enzymes which are variants on the mechanism of chorismate mutase and are likely to have evolved from an ancestral chorismate mutase enzyme. 4-amino-4-deoxy-chorismate mutase produces amino-deoxy-prephenate which is subsequently converted to para-dimethylamino-phenylalanine, a component of the natural product pristinamycin. Isochorismate-pyruvate lyase presumably catalyzes the same type of 2+2+2 cyclo-rearrangement as chorismate mutase, but acting on isochorismate, this results in two broken bonds instead of one broken and one made. The product of this reaction is salicylate (2-hydroxy-benzoate) which is also incorporated into various natural products.	82
200131	TIGR01804	BADH	betaine-aldehyde dehydrogenase. Under osmotic stress, betaine aldehyde dehydrogenase oxidizes glycine betaine aldehyde into the osmoprotectant glycine betaine, via the second of two oxidation steps from exogenously supplied choline or betaine aldehyde. This choline-glycine betaine synthesis pathway can be found in gram-positive and gram-negative bacteria. In Escherichia coli, betaine aldehyde dehydrogenase (betB) is osmotically co-induced with choline dehydrogenase (betA) in the presence of choline. These dehydrogenases are located in a betaine gene cluster with the upstream choline transporter (betT) and transcriptional regulator (betI). Similar to E.coli, betaine synthesis in Staphylococcus xylosus is also influenced by osmotic stress and the presence of choline with genes localized in a functionally equivalent gene cluster. Organization of the betaine gene cluster in Sinorhizobium meliloti and Bacillus subtilis differs from that of E.coli by the absence of upstream choline transporter and transcriptional regulator homologues. Additionally, B.subtilis co-expresses a type II alcohol dehydrogenase with betaine aldehyde dehydrogenase instead of choline dehydrogenase as in E.coli, St.xylosus, and Si.meliloti. Betaine aldehyde dehydrogenase is a member of the aldehyde dehydrogenase family (pfam00171). [Cellular processes, Adaptations to atypical conditions]	467
130864	TIGR01805	CM_mono_grmpos	monofunctional chorismate mutase, gram positive-type, clade 2. This model represents a clade of chorismate mutase proteins/domains from gram positive species. The sequence from Enterococcus is fused to the C-terminus of an apparent acetyltransferase, and the sequence from Clostridium acetobutylicum (but not perfringens) is fused to the N-terminus of shikimate-5-dehydrogenase, another enzyme of the chorismate pathway. All the other members of this clade are mono-functional. Members of this clade from Streptococcus and Lactococcus have been found which represent the sole chorismate mutase domain in their respective genomes which also exhibit evidence of the enzymes of both the upstream and downstream branches of the chorismate pathways. [Amino acid biosynthesis, Aromatic amino acid family]	81
130865	TIGR01806	CM_mono2	chorismate mutase, putative. This model represents a clade of probable chorismate mutases from alpha, beta and gamma proteobacteria as well as Mycobacterium tuberculosis and a clade of nematodes. Although the most likely function for the enzymes represented by this model is as a chorismate mutase, in no species are these enzymes the sole chorismate mutase in the genome. Also, in no case are these enzymes located in a region of the genome proximal to any other enzymes involved in chorismate pathways. Although the Pantoea enzyme has been shown to complement a CM-free mutant of E. coli, this was also shown to be the case with isochorismate-pyruvate lyase which only has a secondary (non-physiologically relevant) chorismate mutase activity. This enzyme is believed to be a homodimer and be localized to the periplasm. [Amino acid biosynthesis, Aromatic amino acid family]	114
130866	TIGR01807	CM_P2	chorismate mutase domain of proteobacterial P-protein, clade 2. This model represents one of two separate clades of the chorismate mutase domain of the gamma, beta, and epsilon proteobacterial "P-protein" which contains an N-terminal chorismate mutase domain and a C-terminal prephenate dehydratase domain. It is also found in Aquifex aeolicus. [Amino acid biosynthesis, Aromatic amino acid family]	76
130867	TIGR01808	CM_M_hiGC-arch	monofunctional chorismate mutase, high GC gram positive type. This model represents the monofunctional chorismate mutase from high GC gram-positive bacteria and archaea. Trusted annotations from Corynebacterium and Pyrococcus are apparently the sole chorismate mutase enzymes in their respective genomes. This is coupled with the presence in those genomes of the enzymes of the chorismate pathways both up- and downstream of chorismate mutase. [Amino acid biosynthesis, Aromatic amino acid family]	74
273813	TIGR01809	Shik-DH-AROM	shikimate-5-dehydrogenase, fungal AROM-type. This model represents a clade of shikimate-5-dehydrogenases found in Corynebacterium, Mycobacteria and fungi. The fungal sequences are pentafunctional proteins known as AroM which contain the central five of the seven steps in the chorismate biosynthesis pathway. The Corynebacterium and Mycobacterial sequences represent the sole shikimate-5-dehydrogenases in species which otherwise have every enzyme of the chorismate biosynthesis pathway. [Amino acid biosynthesis, Aromatic amino acid family]	282
273814	TIGR01810	betA	choline dehydrogenase. Choline dehydrogenase catalyzes the conversion of exogenously supplied choline into the intermediate glycine betaine aldehyde, as part of a two-step oxidative reaction leading to the formation of osmoprotectant betaine. This enzymatic system can be found in both gram-positive and gram-negative bacteria. As in Escherichia coli, Staphylococcus xylosus, and Sinorhizobium meliloti, this enzyme is found associated in a transcriptionally co-induced gene cluster with betaine aldehyde dehydrogenase, the second catalytic enzyme in this reaction. Other gram-positive organisms have been shown to employ a different enzymatic system, utilizing a soluble choline oxidase or type III alcohol dehydrogenase instead of choline dehydrogenase. This enzyme is a member of the GMC oxidoreductase family (pfam00732 and pfam05199), sharing a common evolutionary origin and enzymatic reaction with alcohol dehydrogenase. Outgrouping from this model, Caulobacter crescentus shares sequence homology with choline dehydrogenase, yet other genes participating in this enzymatic reaction have not currently been identified. [Cellular processes, Adaptations to atypical conditions]	532
130870	TIGR01811	sdhA_Bsu	succinate dehydrogenase or fumarate reductase, flavoprotein subunit, Bacillus subtilis subgroup. This model represents the succinate dehydrogenase flavoprotein subunit as found in the low-GC Gram-positive bacteria and a few other lineages. This enzyme may act in a complete or partial TCA cycle, or act in the opposite direction as fumarate reductase. In some but not all species, succinate dehydrogenase and fumarate reductase may be encoded as separate isozymes. [Energy metabolism, TCA cycle]	603
273815	TIGR01812	sdhA_frdA_Gneg	succinate dehydrogenase or fumarate reductase, flavoprotein subunit, Gram-negative/mitochondrial subgroup. This model represents the succinate dehydrogenase flavoprotein subunit as found in Gram-negative bacteria, mitochondria, and some Archaea. Mitochondrial forms interact with ubiquinone and are designated EC 1.3.5.1, but can be degraded to 1.3.99.1. Some isozymes in E. coli and other species run primarily in the opposite direction and are designated fumarate reductase. [Energy metabolism, Aerobic, Energy metabolism, Anaerobic, Energy metabolism, TCA cycle]	566
273816	TIGR01813	flavo_cyto_c	flavocytochrome c. This model describes a family of redox proteins related to the succinate dehydrogenases and fumarate reductases of E. coli, mitochondria, and other well-characterized systems. A member of this family from Shewanella frigidimarina NCIMB400 is characterized as a water-soluble periplasmic protein with four heme groups, a non-covalently bound FAD, and essentially unidirectional fumarate reductase activity. At least seven distinct members of this family are found in Shewanella oneidensis, a species able to use a wide variety of pathways for respiration. [Energy metabolism, Electron transport]	439
130873	TIGR01814	kynureninase	kynureninase. This model describes kynureninase, a pyridoxal-phosphate enzyme. Kynurenine is a Trp breakdown product and a precursor for NAD. In Chlamydia psittaci, an obligate intracellular pathogen, kynureninase makes anthranilate, a Trp precursor, from kynurenine. This counters the tryptophan hydrolysis that occurs in the host cell in response to the pathogen. [Energy metabolism, Amino acids and amines]	406
130874	TIGR01815	TrpE-clade3	anthranilate synthase, alpha proteobacterial clade. This model represents a small clade of anthranilate synthases from alpha proteobacteria and Nostoc (a cyanobacterium). This enzyme is the first step in the pathway for the biosynthesis of tryptophan from chorismate. [Amino acid biosynthesis, Aromatic amino acid family]	717
130875	TIGR01816	sdhA_forward	succinate dehydrogenase, flavoprotein subunit, E. coli/mitochondrial subgroup. Succinate dehydrogenase and fumarate reductase are homologous enzymes reversible in principle but favored under different circumstances. This model represents a narrowly defined clade of the succinate dehydrogenase flavoprotein subunit as found in mitochondria, in Rickettsia, in E. coli and other Proteobacteria, and in a few other lineages. However, this model excludes all known fumarate reductases. It also excludes putative succinate dehydrogenases that appear to have diverged before the split between E. coli succinate dehydrogenase and fumarate reductase. [Energy metabolism, TCA cycle]	565
273817	TIGR01817	nifA	Nif-specific regulatory protein. This model represents NifA, a DNA-binding regulatory protein for nitrogen fixation. The model produces scores between the trusted and noise cutoffs for a well-described NifA homolog in Aquifex aeolicus (which lacks nitrogenase), for transcriptional activators of alternative nitrogenases (VFe or FeFe instead of MoFe), and truncated forms. [Central intermediary metabolism, Nitrogen fixation, Regulatory functions, DNA interactions]	534
273818	TIGR01818	ntrC	nitrogen regulation protein NR(I). This model represents NtrC, a DNA-binding response regulator that is phosphorylated by NtrB and interacts with sigma-54. NtrC usually controls the expression of glutamine synthase, GlnA, and may be called GlnL, GlnG, etc. [Central intermediary metabolism, Nitrogen metabolism, Regulatory functions, DNA interactions, Signal transduction, Two-component systems]	463
130878	TIGR01819	F420_cofD	2-phospho-L-lactate transferase. This model represents LPPG:Fo 2-phospho-L-lactate transferase, which catalyses the fourth step in the biosynthesis of coenzyme F420, a flavin derivative found in methanogens, the Mycobacteria, and several other lineages. This enzyme is characterized so far in Methanococcus jannaschii but appears restricted to F420-containing species and is predicted to carry out the same function in these other species. The clade represented by this model is one of two major divisions of proteins in pfam01933. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	297
273819	TIGR01820	TrpE-arch	anthranilate synthase component I, archaeal clade. This model represents an archaeal clade of anthranilate synthase component I enzymes. This enzyme is responsible for the first step of tryptophan biosynthesis from chorismate. The Sulfolobus enzyme has been reported to be part of a gene cluster for Trp biosynthesis [Amino acid biosynthesis, Aromatic amino acid family]	435
273820	TIGR01821	5aminolev_synth	5-aminolevulinic acid synthase. This model represents 5-aminolevulinic acid synthase, an enzyme for one of two routes to the heme precursor 5-aminolevulinate. The protein is a pyridoxal phosphate-dependent enzyme related to 2-amino-3-ketobutyrate CoA transferase and 8-amino-7-oxononanoate synthase. This enzyme appears restricted to the alpha Proteobacteria and mitochondrial derivatives. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	402
130881	TIGR01822	2am3keto_CoA	glycine C-acetyltransferase. This model represents a narrowly defined clade of animal and bacterial (almost exclusively Proteobacterial) 2-amino-3-ketobutyrate--CoA ligase, now called glycine C-acetyltransferase. This enzyme can act in threonine catabolism. The closest homolog from Bacillus subtilis, and sequences like it, may be functionally equivalent but were not included in the model because of difficulty in finding reports of function. [Energy metabolism, Amino acids and amines]	393
273821	TIGR01823	PabB-fungal	aminodeoxychorismate synthase, fungal clade. This model represents the fungal clade of a para-aminobenzoate synthesis enzyme, aminodeoxychorismate synthase, which acts on chorismate in a pathway that yields PABA, a precursor of folate.	742
130883	TIGR01824	PabB-clade2	aminodeoxychorismate synthase, component I, clade 2. This clade of sequences is more closely related to TrpE (anthranilate synthase, TIGR00564/TIGR01820/TIGR00565) than to the better characterized group of PabB enzymes (TIGR00553/TIGR01823). This clade includes one characterized enzyme from Lactococcus and the conserved function across the clade is supported by these pieces of evidence: 1) all genomes with a member in this clade also have a separate TrpE gene, 2) none of these genomes contain an apparent PabB from any of the other PabB clades, 3) none of these sequences are found in a region of the genome in association with other Trp biosynthesis genes, 4) all of these genomes apparently contain most if not all of the steps of the folate biosynthetic pathway (for which PABA is a precursor). Many of the sequences hit by this model are annotated as TrpE enzymes, however, we believe that all members of this clade are, in fact, PabB. The sequences from Bacillus halodurans and subtilis which score below the trusted cutoff for this model are also likely to be PabB enzymes, but are too closely related to TrpE to be separated at this time.	355
130884	TIGR01825	gly_Cac_T_rel	pyridoxal phosphate-dependent acyltransferase, putative. This model represents an enzyme subfamily related to three known enzymes; it appears closest to glycine C-acetyltransferase, shows no overlap with it in species distribution, and may share that function. The three closely related enzymes are glycine C-acetyltransferase (2-amino-3-ketobutyrate coenzyme A ligase), 5-aminolevulinic acid synthase, and 8-amino-7-oxononanoate synthase. All transfer the R-group (acetyl, succinyl, or 6-carboxyhexanoyl) from coenzyme A to an amino acid (Gly, Gly, Ala, respectively), with release of CO2 for the latter two reactions.	385
211689	TIGR01826	CofD_related	conserved hypothetical protein, cofD-related. This model represents a subfamily of conserved hypothetical proteins that forms a sister group to the family of CofD (TIGR01819), LPPG:Fo 2-phospho-L-lactate transferase, an enzyme of coenzyme F420 biosynthesis. Both this family and TIGR01819 are within the scope of pfam01933. [Hypothetical proteins, Conserved]	310
130886	TIGR01827	gatC_rel	Asp-tRNA(Asn)/Glu-tRNA(Gln) amidotransferase, subunit C, putative. This model represents a small family related to GatC, the third subunit of an enzyme for completing the charging of tRNA(Gln) by amidating the Glu-tRNA(Gln). The few known archaea that contain a member of this family appear to produce Asn-tRNA(Asn) by an analogous amidotransferase reaction. This protein is proposed to substitute for GatC in the charging of both tRNAs.	73
273822	TIGR01828	pyru_phos_dikin	pyruvate, phosphate dikinase. This model represents pyruvate,phosphate dikinase, also called pyruvate,orthophosphate dikinase. It is similar in sequence to other PEP-utilizing enzymes. [Energy metabolism, Other]	856
273823	TIGR01829	AcAcCoA_reduct	acetoacetyl-CoA reductase. This model represents acetoacetyl-CoA reductase, a member of the family of short-chain-alcohol dehydrogenases. Note that, despite the precision implied by the enzyme name, the reaction of EC 1.1.1.36 is defined more generally as (R)-3-hydroxyacyl-CoA + NADP+ = 3-oxoacyl-CoA + NADPH. Members of this family may act in the biosynthesis of poly-beta-hydroxybutyrate (e.g. Rhizobium meliloti) and related poly-beta-hydroxyalkanoates. Note that the member of this family from Azospirillum brasilense, designated NodG, appears to lack acetoacetyl-CoA reductase activity and to act instead in the production of nodulation factor. This family is downgraded to subfamily for this NodG. Other proteins designated NodG, as from Rhizobium, belong to related but distinct protein families.	242
273824	TIGR01830	3oxo_ACP_reduc	3-oxoacyl-(acyl-carrier-protein) reductase. This model represents 3-oxoacyl-[ACP] reductase, also called 3-ketoacyl-acyl carrier protein reductase, an enzyme of fatty acid biosynthesis. [Fatty acid and phospholipid metabolism, Biosynthesis]	239
273825	TIGR01831	fabG_rel	3-oxoacyl-(acyl-carrier-protein) reductase, putative. This model represents a small, very well conserved family of proteins closely related to the FabG family, TIGR01830, and possibly equal in function. In all completed genomes with a member of this family, a FabG in TIGR01830 is also found. [Fatty acid and phospholipid metabolism, Biosynthesis]	239
188170	TIGR01832	kduD	2-deoxy-D-gluconate 3-dehydrogenase. This model describes 2-deoxy-D-gluconate 3-dehydrogenase (also called 2-keto-3-deoxygluconate oxidoreductase), a member of the family of short-chain-alcohol dehydrogenases (pfam00106). This protein has been characterized in Erwinia chrysanthemi as an enzyme of pectin degradation. [Energy metabolism, Biosynthesis and degradation of polysaccharides]	248
273826	TIGR01833	HMG-CoA-S_euk	3-hydroxy-3-methylglutaryl-CoA-synthase, eukaryotic clade. Hydroxymethylglutaryl(HMG)-CoA synthase is the first step of isopentenyl pyrophosphate (IPP) biosynthesis via the mevalonate pathway. This pathway is found mainly in eukaryotes, but also in archaea and some bacteria. This model is specific for eukaryotes.	457
273827	TIGR01834	PHA_synth_III_E	poly(R)-hydroxyalkanoic acid synthase, class III, PhaE subunit. This model represents the PhaE subunit of the heterodimeric class (class III) of polymerase for poly(R)-hydroxyalkanoic acids (PHAs), carbon and energy storage polymers of many bacteria. The most common PHA is polyhydroxybutyrate but about 150 different constituent hydroxyalkanoic acids (HAs) have been identified in various species. This model must be designated subfamily to indicate the heterogeneity of PHAs. [Cellular processes, Adaptations to atypical conditions, Fatty acid and phospholipid metabolism, Biosynthesis]	320
213655	TIGR01835	HMG-CoA-S_prok	3-hydroxy-3-methylglutaryl CoA synthase, prokaryotic clade. This clade of hydroxymethylglutaryl-CoA (HMG-CoA) synthases is found in a limited spectrum of mostly gram-positive bacteria which make isopentenyl pyrophosphate (IPP) via the mevalonate pathway. This pathway is found primarily in eukaryotes and archaea, but the bacterial homologs are distinct, having apparently diverged after being laterally transferred from an early eukaryote. HMG-CoA synthase is the first step in the pathway and joins acetyl-CoA with acetoacetyl-CoA with the release of one molecule of CoA. The Borrelia sequence may have resulted from a separate lateral transfer event.	379
130895	TIGR01836	PHA_synth_III_C	poly(R)-hydroxyalkanoic acid synthase, class III, PhaC subunit. This model represents the PhaC subunit of a heterodimeric form of polyhydroxyalkanoic acid (PHA) synthase. Excepting the PhaC of Bacillus megaterium (which needs PhaR), all members require PhaE (TIGR01834) for activity and are designated class III. This enzyme builds ester polymers for carbon and energy storage that accumulate in inclusions, and both this enzyme and the depolymerase associate with the inclusions. Class III enzymes polymerize short-chain-length hydroxyalkanoates. [Fatty acid and phospholipid metabolism, Biosynthesis]	350
130896	TIGR01837	PHA_granule_1	poly(hydroxyalkanoate) granule-associated protein. This model describes a domain found in some proteins associated with polyhydroxyalkanoate (PHA) granules in a subset of species that have PHA inclusion granules. Included are two tandem proteins of Pseudomonas oleovorans, PhaI and PhaF, and their homologs in related species. PhaF proteins have a low-complexity C-terminal region with repeats similar to AAAKP. [Fatty acid and phospholipid metabolism, Biosynthesis]	118
213656	TIGR01838	PHA_synth_I	poly(R)-hydroxyalkanoic acid synthase, class I. This model represents the class I subfamily of poly(R)-hydroxyalkanoate synthases, which polymerizes hydroxyacyl-CoAs with three to five carbons in the hydroxyacyl backbone into aliphatic esters termed poly(R)-hydroxyalkanoic acids. These polymers accumulate as carbon and energy storage inclusions in many species and can amount to 90 percent of the dry weight of the cell. [Fatty acid and phospholipid metabolism, Biosynthesis]	532
130898	TIGR01839	PHA_synth_II	poly(R)-hydroxyalkanoic acid synthase, class II. This model represents the class II subfamily of poly(R)-hydroxyalkanoate synthases, which polymerizes hydroxyacyl-CoAs, typically with six to fourteen carbons in the hydroxyacyl backbone, into aliphatic esters termed poly(R)-hydroxyalkanoic acids. These polymers accumulate as carbon and energy storage inclusions in many species and can amount to 90 percent of the dry weight of the cell. [Fatty acid and phospholipid metabolism, Biosynthesis]	560
273828	TIGR01840	esterase_phb	esterase, PHB depolymerase family. This model describes a subfamily among lipases of the ab-hydrolase family. This subfamily includes bacterial depolymerases for poly(3-hydroxybutyrate) (PHB) and related polyhydroxyalkanoates (PHA), as well as acetyl xylan esterases, feruloyl esterases, and others from fungi. [Fatty acid and phospholipid metabolism, Degradation]	212
130900	TIGR01841	phasin	phasin family protein. This model describes a family of small proteins found associated with inclusions in bacterial cells. Most associate with polyhydroxyalkanoate (PHA) inclusions, the most common of which consist of polyhydroxybutyrate (PHB). These are designated granule-associate proteins or phasins; the member from Rhodospirillum rubrum is an activator of polyhydroxybutyrate (PHB) degradation. However, the member from Magnetospirillum sp. AMB-1 is called a magnetic particle membrane-specific GTPase.	88
200134	TIGR01842	type_I_sec_PrtD	type I secretion system ABC transporter, PrtD family. Type I protein secretion is a system in some Gram-negative bacteria to export proteins (often proteases) across both inner and outer membranes to the extracellular medium. This is one of three proteins of the type I secretion apparatus. Targeted proteins are not cleaved at the N-terminus, but rather carry signals located toward the extreme C-terminus to direct type I secretion. [Protein fate, Protein and peptide secretion and trafficking]	544
130902	TIGR01843	type_I_hlyD	type I secretion membrane fusion protein, HlyD family. Type I secretion is an ABC transport process that exports proteins, without cleavage of any signal sequence, from the cytosol to extracellular medium across both inner and outer membranes. The secretion signal is found in the C-terminus of the transported protein. This model represents the adaptor protein between the ATP-binding cassette (ABC) protein of the inner membrane and the outer membrane protein, and is called the membrane fusion protein. This model selects a subfamily closely related to HlyD; it is defined narrowly and excludes, for example, colicin V secretion protein CvaA and multidrug efflux proteins. [Protein fate, Protein and peptide secretion and trafficking]	423
273829	TIGR01844	type_I_sec_TolC	type I secretion outer membrane protein, TolC family. Members of this model are outer membrane proteins from the TolC subfamily within the RND (Resistance-Nodulation-cell Division) efflux systems. These proteins, unlike the NodT subfamily, appear not to be lipoproteins. All are believed to participate in type I protein secretion, an ABC transporter system for protein secretion without cleavage of a signal sequence, although they may, like TolC, also participate in the efflux of smaller molecules. This family includes the well-documented examples TolC (E. coli), PrtF (Erwinia), and AprF (Pseudomonas aeruginosa). [Protein fate, Protein and peptide secretion and trafficking, Transport and binding proteins, Porins]	415
273830	TIGR01845	outer_NodT	efflux transporter, outer membrane factor (OMF) lipoprotein, NodT family. Members of this model comprise a subfamily of the Outer Membrane Factor (TCDB 1.B.17) porins. OMF proteins operate in conjunction with a primary transporter of the RND, MFS, ABC, or PET systems, and a MFP (membrane fusion protein) to transport substrates across membranes. The complex thus formed allows transport (export) of various solutes (heavy metal cations; drugs, oligosaccharides, proteins, etc.) across the two membranes of the Gram-negative bacterial cell envelope in a single energy-coupled step. Current data suggest that the OMF (and not the MFP) is largely responsible for the formation of both the trans-outer membrane and trans-periplasmic channels. The roles played by the MFP have yet to be determined. [Cellular processes, Detoxification, Transport and binding proteins, Porins]	460
273831	TIGR01846	type_I_sec_HlyB	type I secretion system ABC transporter, HlyB family. Type I protein secretion is a system in some Gram-negative bacteria to export proteins (often proteases) across both inner and outer membranes to the extracellular medium. This is one of three proteins of the type I secretion apparatus. Targeted proteins are not cleaved at the N-terminus, but rather carry signals located toward the extreme C-terminus to direct type I secretion. [Protein fate, Protein and peptide secretion and trafficking]	694
130906	TIGR01847	bacteriocin_sig	bacteriocin-type signal sequence. Bacteriocins are bacterial peptide products toxic to closely related bacteria. This model represents the N-terminal region up to the GG cleavage motif. Processing to remove this bacteriocin leader peptide occurs together with export by an ABC transporter. Note: because this model is so small (15 amino acids), it may have many spurious high-scoring matches to unrelated proteins, even with fairly stringent cutoff scores. The most likely true positives are small proteins of Gram-positive bacteria, matching regions that start within the first 15 amino acids, and encoded near bacteriocin transport family proteins (TIGR01000, TIGR01193).	15
130907	TIGR01848	PHA_reg_PhaR	polyhydroxyalkanoate synthesis repressor PhaR. Poly-B-hydroxyalkanoates are lipidlike carbon/energy storage polymers found in granular inclusions. PhaR is a regulatory protein found in general near other proteins associated with polyhydroxyalkanoate (PHA) granule biosynthesis and utilization. It is found to be a DNA-binding homotetramer that is also capable of binding short chain hydroxyalkanoic acids and PHA granules. PhaR may regulate the expression of itself, of the phasins that coat granules, and of enzymes that direct carbon flux into polymers stored in granules. The C-terminal region is poorly conserved in this family and is not part of this model.//GO terms added 12/6/04 [SS] [Fatty acid and phospholipid metabolism, Biosynthesis, Regulatory functions, DNA interactions]	107
130908	TIGR01849	PHB_depoly_PhaZ	polyhydroxyalkanoate depolymerase, intracellular. This model represents an intracellular depolymerase for polyhydroxyalkanoate (PHA), a carbon and energy storing polyester that accumulates in granules in many bacterial species when carbon sources are abundant but other nutrients are limiting. This family is named for PHAs generally, rather than polyhydroxybutyrate (PHB) specifically as in Ralstonia eutropha H16, to avoid overcalling chemical specificity in other species. Note that this family lacks the classic GXSXG lipase motif and instead shows weak similarity to some [Fatty acid and phospholipid metabolism, Degradation]	406
273832	TIGR01850	argC	N-acetyl-gamma-glutamyl-phosphate reductase, common form. This model represents the more common of two related families of N-acetyl-gamma-glutamyl-phosphate reductase, an enzyme catalyzing the third step of Arg biosynthesis from Glu. The two families differ by phylogeny, similarity clustering, and the gap architecture in a multiple sequence alignment. Bacterial members of this family tend to be found within Arg biosynthesis operons. [Amino acid biosynthesis, Glutamate family]	346
273833	TIGR01851	argC_other	N-acetyl-gamma-glutamyl-phosphate reductase, uncommon form. This model represents the less common of two related families of N-acetyl-gamma-glutamyl-phosphate reductase, an enzyme catalyzing the third step of Arg biosynthesis from Glu. The two families differ by phylogeny, similarity clustering, and gap architecture in a multiple sequence alignment. [Amino acid biosynthesis, Glutamate family]	310
188173	TIGR01852	lipid_A_lpxA	acyl-[acyl-carrier-protein]--UDP-N-acetylglucosamine O-acyltransferase. This model describes LpxA, an enzyme for the biosynthesis of lipid A, a component of lipopolysaccharide (LPS) in the outer membrane outer leaflet of most Gram-negative bacteria. Some differences are found between lipid A of different species, but this protein represents the first step (from UDP-N-acetyl-D-glucosamine) and appears to be conserved in function. Proteins from this family contain many copies of the bacterial transferase hexapeptide repeat (pfam00132). [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	254
273834	TIGR01853	lipid_A_lpxD	UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase LpxD. This model describes LpxD, an enzyme for the biosynthesis of lipid A, a component of lipopolysaccharide (LPS) in the outer membrane outer leaflet of most Gram-negative bacteria. Some differences are found between lipid A of different species. This protein represents the third step from UDP-N-acetyl-D-glucosamine. The group added at this step generally is 14:0(3-OH) (myristate) but may vary; in Aquifex it appears to be 16:0(3-OH) (palmitate). [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	324
273835	TIGR01854	lipid_A_lpxH	UDP-2,3-diacylglucosamine diphosphatase. This model represents LpxH, UDP-2,3-diacylglucosamine hydrolase, an essential enzyme in E. coli that catalyzes the fourth step in lipid A biosynthesis. Note that Pseudomonas aeruginosa has both a member of this family that shares this function and a more distant homolog, designated LpxH2, that does not. Many species that produce lipid A lack an lpxH gene in this family; some of those species have an lpxH2 gene instead, although its function is unknown. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	231
273836	TIGR01855	IMP_synth_hisH	imidazole glycerol phosphate synthase, glutamine amidotransferase subunit. This model represents the glutamine amidotransferase subunit (or domain, in eukaryotic systems) of imidazole glycerol phosphate synthase. This subunit catalyzes step 5 of histidine biosynthesis from PRPP. The other subunit, the cyclase, catalyzes step 6. [Amino acid biosynthesis, Histidine family]	196
273837	TIGR01856	hisJ_fam	histidinol phosphate phosphatase, HisJ family. This model represents the histidinol phosphate phosphatase HisJ of Bacillus subtilis, and related proteins from a number of species within a larger family of phosphatases in the PHP hydrolase family. HisJ catalyzes the penultimate step of histidine biosynthesis but shows no homology to the functionally equivalent sequence in E. coli, a domain of the bifunctional HisB protein. Note, however, that many species have two members and that Clostridium perfringens, predicted not to make histidine, has five members of this family; this family is designated subfamily rather than equivalog to indicate that members may not all act as HisJ.	253
130916	TIGR01857	FGAM-synthase	phosphoribosylformylglycinamidine synthase, clade II. This model represents a single-molecule form of phosphoribosylformylglycinamidine synthase, also called FGAM synthase, an enzyme of purine de novo biosynthesis. This model represents a second clade of these enzymes found in Clostridia, Bifidobacteria and Streptococcus species. This enzyme performs the fourth step in IMP biosynthesis (the precursor of all purines) from PRPP. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis]	1239
130917	TIGR01858	tag_bisphos_ald	class II aldolase, tagatose bisphosphate family. This model describes tagatose-1,6-bisphosphate aldolases, and perhaps other closely related class II aldolases. This tetrameric, Zn2+-dependent enzyme is related to the class II fructose bisphosphate aldolase; fructose 1,6-bisphosphate and tagatose 1,6-bisphosphate differ only in chirality at C4.	282
130918	TIGR01859	fruc_bis_ald_	fructose-1,6-bisphosphate aldolase, class II, various bacterial and amitochondriate protist. This model represents one of several subtypes of the class II fructose-1,6-bisphosphate aldolase, an enzyme of glycolysis. The subtypes are split into several models to allow separation of a family of tagatose bisphosphate aldolases. This form is found in Gram-positive bacteria, a variety of Gram-negative, and in amitochondriate protists. The class II enzymes share homology with tagatose bisphosphate aldolase but not with class I aldolase. [Energy metabolism, Glycolysis/gluconeogenesis]	282
130919	TIGR01860	VNFD	nitrogenase vanadium-iron protein, alpha chain. This model represents the alpha chain of the vanadium-containing component of the vanadium-iron nitrogenase compound I. The complex also includes a second alpha chain, two beta chains and two delta chains. Compound I interacts with compound II, also known as the iron-protein, which transfers electrons to compound I where the catalysis occurs. [Central intermediary metabolism, Nitrogen fixation]	461
130920	TIGR01861	ANFD	nitrogenase iron-iron protein, alpha chain. This model represents the all-iron variant of the nitrogenase component I alpha chain. Molybdenum-iron and vanadium iron forms are also found. The complete complex contains two alpha chains, two beta chains and two delta chains. The component I associates with component II also known as the iron protein which serves to provide electrons for component I. [Central intermediary metabolism, Nitrogen fixation]	513
273838	TIGR01862	N2-ase-Ialpha	nitrogenase component I, alpha chain. This model represents the alpha chain of all three varieties (Mo-Fe, V-Fe, and Fe-Fe) of component I of nitrogenase. [Central intermediary metabolism, Nitrogen fixation]	443
273839	TIGR01863	cas_Csd1	CRISPR-associated protein Cas8c/Csd1, subtype I-C/DVULG. CRISPR loci appear to be mobile elements with a wide host range. This model represents a protein that tends to be found near CRISPR repeats of the DVULG subtype of CRISPR/Cas locus. We designate this family Csd1 (CRISPR/Cas Subtype DVULG protein 1). The species range for this subtype, so far, is exclusively bacterial and mesophilic, although CRISPR loci in general are particularly common among archaea and thermophilic bacteria. In a few species (Xanthomonas axonopodis pv. citri str. 306 and Streptococcus mutans UA159), homology to this protein family is split across two tandem genes; the trusted cutoff to this family is set low enough to capture at least the longer of the two.	584
273840	TIGR01865	cas_Csn1	CRISPR subtype II/NMENI RNA-guided endonuclease Cas9/Csn1. CRISPR loci appear to be mobile elements with a wide host range. This model represents a protein found only in CRISPR-containing species, near other CRISPR-associated proteins (cas), as part of the NMENI subtype of CRISPR/Cas locus. The species range so far for this protein is animal pathogens and commensals only.	805
273841	TIGR01866	cas_Csn2	CRISPR type II-A/NMENI-associated protein Csn2. CRISPR loci appear to be mobile elements with a wide host range. This model represents a protein found only in CRISPR-containing species, near other CRISPR-associated proteins (cas), as part of the NMENI subtype of CRISPR/Cas loci. The species range so far for this subtype is animal pathogens and commensals only. This protein is present in some but not all NMENI CRISPR/Cas loci.	222
273842	TIGR01868	casD_Cas5e	CRISPR-associated protein Cas5/CasD, subtype I-E/ECOLI. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This family is part of the ECOLI subtype CRISPR/Cas locus, and now characterized as part of the CASCADE complex of that system. It shares a small N-terminal homology region with members of several other CRISPR/Cas subtypes, and we view the families that share this region as being Cas5.	216
273843	TIGR01869	casC_Cse4	CRISPR-associated protein Cas7/Cse4/CasC, subtype I-E/ECOLI. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This family is represented by CT1975 of Chlorobium tepidum and is part of the Ecoli subtype of CRISPR/Cas loci. It is designated Cse4, for CRISPR/Cas Subtype Ecoli protein 4.	325
273844	TIGR01870	cas_TM1810_Csm2	CRISPR type III-A/MTUBE-associated protein Csm2. These proteins are found adjacent to a characteristic short, palindromic repeat cluster termed CRISPR, a probable mobile DNA element. This model represents the C-terminal domain of a minor family of CRISPR-associated protein from the Mtube subtype of CRISPR/Cas locus. The family is designated Csm2, for CRISPR/Cas Subtype Mtube Protein 2.	97
233610	TIGR01873	cas_CT1978	CRISPR-associated endoribonuclease Cas2, subtype I-E/ECOLI. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This model represents a minor branch of the Cas2 family of CRISPR-associated endonuclease, whereas most Cas2 proteins are modeled instead by TIGR01573. This form of Cas2 is characteristic for the Ecoli subtype of CRISPR/Cas locus.	87
273845	TIGR01874	cas_cas5a	CRISPR-associated protein Cas5, subtype I-A/APERN. This model represents a minor family of CRISPR-associated (Cas) protein. These proteins are found adjacent to a characteristic short, palindromic repeat cluster termed CRISPR, a probable mobile DNA element. This family belongs to a set of several Cas proteins, one each for a number of different CRISPR/Cas subtypes, that share a region of N-terminal sequence similarity modeled by TIGR02593. The family is designated Cas5a, for CRISPR-associated protein Cas5, Apern subtype.	172
273846	TIGR01875	cas_MJ0381	CRISPR-associated autoregulator DevR family. CRISPR is a term for Clustered Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR associated) proteins. This model represents one such family, represented by MJ0381 of Methanococcus jannaschii. This family includes the DevR protein of Myxococcus xanthus, a protein whose expression appears to be regulated through a number of means, including both location and autorepression; DevR mutants are incapable of fruiting body development. [Regulatory functions, DNA interactions]	237
273847	TIGR01876	cas_Cas5d	CRISPR-associated protein Cas5, subtype I-C/DVULG. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This small Cas family is represented by CT1134 of Chlorobium tepidum. This family belongs to a set of several Cas protein families, one each for a number of different CRISPR/Cas subtypes, that share a region of N-terminal sequence similarity modeled by TIGR02593. This family represents the Dvulg subtype of CRISPR/Cas locus.	203
273848	TIGR01877	cas_cas6	CRISPR-associated endoribonuclease Cas6. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This broadly distributed, highly divergent Cas family is now characterized as an endoribonuclease that generates guide RNAs for host defense against phage and other invaders. The family contains a C-terminal motif GXGXXXXXGXG, where each X between two Gly is hydrophobic and the spacer XXXXX contains (usually) one Arg or Lys. The seed alignment for the current version of this model has gappy columns removed. Members of this protein family are found associated with several different CRISPR/cas system subtypes, and consequently we designate this family Cas6.	199
273849	TIGR01878	cas_Csa5	CRISPR type I-A/APERN-associated protein Csa5. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This model represents a minor family of Cas protein found in the (all archaeal) APERN subtype of CRISPR/Cas locus, so the family is designated Csa5, for CRISPR/Cas Subtype Protein 5.	97
200138	TIGR01879	hydantase	amidase, hydantoinase/carbamoylase family. Enzymes in this subfamily hydrolyze the amide bonds of compounds containing carbamoyl groups or hydantoin rings. These enzymes are members of the broader family of amidases represented by pfam01546.	400
273850	TIGR01880	Ac-peptdase-euk	N-acyl-L-amino-acid amidohydrolase. This model represents a family of eukaryotic N-acyl-L-amino-acid amidohydrolases active on fatty acid and acetyl amides of L-amino acids.	400
273851	TIGR01881	cas_Cmr5	CRISPR type III-B/RAMP module-associated protein Cmr5. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This model family, represented by TM1791.1 of Thermotoga maritima, is found in both archaeal and bacterial species as part of the 6-gene CRISPR RAMP module.	127
130937	TIGR01882	peptidase-T	peptidase T. This model represents a tripeptide aminopeptidase known as Peptidase T, which has a substrate preference for hydrophobic peptides. [Protein fate, Degradation of proteins, peptides, and glycopeptides]	410
162579	TIGR01883	PepT-like	peptidase T-like protein. This model represents a clade of enzymes closely related to Peptidase T, an aminotripeptidase found in bacteria. This clade consists of gram positive bacteria of which several additionally contain a Peptidase T gene.	361
273852	TIGR01884	cas_HTH	CRISPR locus-related DNA-binding protein. Most but not all examples of this family are associated with CRISPR loci, a combination of DNA repeats and characteristic proteins encoded near the repeat cluster. The C-terminal region of this protein is homologous to DNA-binding helix-turn-helix domains with predicted transcriptional regulatory activity. [Regulatory functions, DNA interactions]	203
273853	TIGR01885	Orn_aminotrans	ornithine aminotransferase. This model describes the final step in the biosynthesis of ornithine from glutamate via the non-acetylated pathway. Ornithine amino transferase takes L-glutamate 5-semialdehyde and makes it into ornithine, which is used in the urea cycle, as well as in the biosynthesis of arginine. This model includes low-GC bacteria and eukaryotic species. The genes from two species are annotated as putative acetylornithine aminotransferases - one from Porphyromonas gingivalis (OMNI|PG1271), and the other from Staphylococcus aureus (OMNI|SA0170). After homology searching using BLAST it was determined that these two sequences were most closely related to ornithine aminotransferases. This model's seed includes one characterized hit, from Bacillus subtilis (SP|P38021).	401
130941	TIGR01886	dipeptidase	dipeptidase PepV. This model represents a small clade of dipeptidase enzymes which are members of the larger M25 subfamily of metalloproteases. Two characterized enzymes are included in the seed. One, from Lactococcus lactis has been shown to act on a wide range of dipeptides, but not larger peptides. The enzyme from Lactobacillus delbrueckii was originally characterized as a Xaa-His dipeptidase, specifically a carnosinase (beta-Ala-His) by complementation of an E. coli mutant. Further study, including the crystallization of the enzyme, has shown it to also be a non-specific dipeptidase. This group also includes enzymes from Streptococcus and Enterococcus. [Protein fate, Degradation of proteins, peptides, and glycopeptides]	466
273854	TIGR01887	dipeptidaselike	dipeptidase, putative. This model represents a clade of probable zinc dipeptidases, closely related to the characterized non-specific dipeptidase, PepV. Many enzymes in this clade have been given names including the terms "Xaa-His" and "carnosinase" due to the early mis-characterization of the Lactobacillus delbrueckii PepV enzyme. These names are likely too specific.	447
273855	TIGR01888	cas_cmr3	CRISPR type III-B/RAMP module-associated protein Cmr3. CRISPR is a term for Clustered Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR associated) proteins. This highly divergent family is found in at least ten different archaeal and bacterial species as part of the CRISPR RAMP module but is not a member of the RAMP superfamily itself. A typical example is TM1793 from Thermotoga maritima.	333
130944	TIGR01889	Staph_reg_Sar	staphylococcal accessory regulator family. This model represents a family of transcriptional regulatory proteins in Staphylococcus aureus and Staphylococcus epidermidis. Some members contain two tandem copies of this region. This family is related to the MarR transcriptional regulator family described by pfam01047. [Regulatory functions, DNA interactions]	109
273856	TIGR01890	N-Ac-Glu-synth	amino-acid N-acetyltransferase. This model represents a clade of amino-acid N-acetyltransferases acting mainly on glutamate in the first step of the "acetylated" ornithine biosynthesis pathway. For this reason it is also called N-acetylglutamate synthase. The enzyme may also act on aspartate. [Amino acid biosynthesis, Glutamate family]	429
273857	TIGR01891	amidohydrolases	amidohydrolase. This model represents a subfamily of amidohydrolases which are a subset of those sequences detected by pfam01546. Included within this group are hydrolases of hippurate (N-benzylglycine), indoleacetic acid (IAA) N-conjugates of amino acids, N-acetyl-L-amino acids and aminobenzoylglutamate. These hydrolases are of the carboxypeptidase-type, most likely utilizing a zinc ion in the active site. [Protein fate, Degradation of proteins, peptides, and glycopeptides]	363
130947	TIGR01892	AcOrn-deacetyl	acetylornithine deacetylase (ArgE). This model represents a clade of acetylornithine deacetylases from proteobacteria. This enzyme is the final step of the "acetylated" ornithine biosynthesis pathway. The enzyme is closely related to dapE, succinyl-diaminopimelate desuccinylase, and outside of this clade annotation is very inaccurate as to which function should be ascribed to genes. [Amino acid biosynthesis, Glutamate family]	364
273858	TIGR01893	aa-his-dipept	Xaa-His dipeptidase. This model represents a clade of dipeptidase enzymes, many of which are specific for carnosine (beta-alanyl-histidine). This enzymes is found broadly in bacteria and at least one archaeon (Methanosarcina). In most species there is only one sequence hitting this model, while Bacteroides thetaiotaomicron, Chlorobium tepidum and Clostridium perfringens have two each and Fusobacterium nucleatum has three. These may indicate that there is a broader substrate range than just carnosine in these (and other) species. 8/19/03 GO terms added [SS]	477
273859	TIGR01894	cas_TM1795_cmr1	CRISPR type III-B/RAMP module RAMP protein Cmr1. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This model represents the region of strongest conservation, the N-terminal half, of one such family, represented by TM1795 from Thermotoga maritima. This protein is the first of a set of six genes, mostly from the RAMP superfamily, that we designated the CRISPR-associated RAMP module.	154
273860	TIGR01895	cas_Cas5t	CRISPR-associated protein Cas5, subtype I-B/TNEAP. CRISPR is a term for Clustered Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR associated) proteins. This family is represented by TM1800 from Thermotoga maritima. It is related to TIGR01868 (CRISPR-associated protein, CT1976 family).	215
273861	TIGR01896	cas_AF1879	CRISPR-associated protein Cas4/Csa1, subtype I-A/APERN. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This model describes a particularly strongly conserved family found so far only in the APERN subtype of CRISPR/Cas loci and represented by AF1879 from Archaeoglobus fulgidus. This family has four perfectly preserved Cys residues. This subfamily is found in a CRISPR/Cas locus we designate APERN, so the family is designated Csa1, for CRISPR/Cas Subtype Protein 1.	271
273862	TIGR01897	cas_MJ1666	CRISPR-associated protein, MJ1666 family. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This model describes a Cas protein about 400 residues in length, found mostly in the Archaea but also in Aquifex.	410
213662	TIGR01898	cas_TM1791_cmr6	CRISPR type III-B/RAMP module RAMP protein Cmr6. CRISPR is a term for Clustered Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR associated) proteins. This family, represented by TM1791 of Thermotoga maritima, is designated Cmr6 [sic], for CRISPR/Cas Ramp Module protein 6. This family is both closely related to and frequently encoded next to the TM1792 family of Cas proteins described by TIGR01867. The two proteins are fused in an example from Methanopyrus kandleri.	176
273863	TIGR01899	cas_TM1807_csm5	CRISPR type III-A/MTUBE-associated RAMP protein Csm5. CRISPR is a term for Clustered Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR associated) proteins. Members of this cas gene family are found in the mtube subtype of CRISPR/cas locus and designated Csm5, for CRISPR/cas Subtype Mtube, protein 5.	365
273864	TIGR01900	dapE-gram_pos	succinyl-diaminopimelate desuccinylase. This model represents a clade of succinyl-diaminopimelate desuccinylases from actinobacteria (high-GC gram positives), delta-proteobacteria and aquificales and is based on the characterization of the enzyme from Corynebacterium glutamicum. This enzyme is involved in the biosynthesis of lysine, and is related to the enzyme acetylornithine deacetylase and other amidases and peptidases found within pfam01546. Other sequences included in the seed of this model were assessed to confirm that 1) the related genes DapC (succinyl-diaminopimelate transaminase) and DapD (2,3,4,5-tetrahydropyridine-2,6-dicarboxylate N-succinyltransferase) are also found in the genome, 2) each is found only once in those genomes, 3) the lysine biosynthesis pathway is complete and 4) the direct (TIGR03540 or TIGR03542) or acetylated (GenProp0787) aminotransferase pathways are absent in these genomes. Additionally, a number of the seed members are observed adjacent to either DapC or DapD (often as a divergon with a putative promoter site between them). [Amino acid biosynthesis, Aspartate family]	351
273865	TIGR01901	adhes_NPXG	filamentous hemagglutinin family N-terminal domain. This model represents a conserved domain found near the N-terminus of a number of large, repetitive bacterial proteins, including many proteins of over 2500 amino acids. Members generally have a signal sequence, then an intervening region, then the region described by this model. Following this region, proteins typically have regions rich in repeats but may show no homology between the repeats of one member and the repeats of another. A number of the members of this family have been designated adhesins, filamentous haemagglutinins, heme/hemopexin-binding protein, etc.	79
130957	TIGR01902	dapE-lys-deAc	N-acetyl-ornithine/N-acetyl-lysine deacetylase. This clade of mainly archaeal and related bacterial species contains two characterized enzymes, a deacetylase with specificity for both N-acetyl-ornithine and N-acetyl-lysine from Thermus, which is found within a lysine biosynthesis operon, and a fusion protein with acetyl-glutamate kinase (an enzyme of ornithine biosynthesis) from Lactobacillus. It is possible that all of the sequences within this clade have dual specificity, or that a mix of specificities have evolved within this clade.	336
273866	TIGR01903	cas5_csm4	CRISPR type III-A/MTUBE-associated RAMP protein Csm4. CRISPR is a term for Clustered Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR associated) proteins. Members of this cas gene family are found in the mtube subtype of CRISPR/cas locus and designated Csm4, for CRISPR/cas Subtype Mtube, protein 4.	297
273867	TIGR01904	GSu_C4xC__C2xCH	Geobacter sulfurreducens CxxxxCH...CXXCH domain. This domain occurs from three to eight times in eight different proteins of Geobacter sulfurreducens. The final CXXCH motif matches ProSite motif PS00190, the cytochrome c family heme-binding site signature, suggesting	42
213663	TIGR01905	paired_CXXCH_1	doubled CXXCH domain. This model represents a domain of about 41 amino acids that contains, among other motifs, two copies of the motif CXXCH associated with heme binding. Almost every member of this family has at least three copies of this domain (at least six copies of CXXCH) and is predicted to be a high molecular weight c-type cytochrome. Members are found mostly in species of Shewanella, Geobacter, and Vibrio.	41
273868	TIGR01906	integ_TIGR01906	integral membrane protein TIGR01906. This model represents a family of highly hydrophobic, uncharacterized predicted integral membrane proteins found almost entirely in low-GC Gram-positive bacteria, although a member is also found in the early-branching bacterium Aquifex aeolicus.	207
273869	TIGR01907	casE_Cse3	CRISPR-associated protein Cas6/Cse3/CasE, subtype I-E/ECOLI. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This model family, represented by CT1974 from Chlorobium tepidum, is found in the Ecoli subtype of CRISPR/Cas regions and is designated Cse3 (CRISPR/Cas Subtype Ecoli protein 3). The representative of this family from Thermus thermophilus HB8 (TTHB192) has been crystallized and found to have a structure consisting of two domains with opposing parallel beta-sheets known as a beta-sheet platform. This structure is similar to those found in the Sex-lethal protein and poly(A)-binding protein. This structure is consistent with an RNA-binding function.	206
162595	TIGR01908	cas_CXXC_CXXC	CRISPR-associated protein Cas8b1/Cst1, subtype I-B/TNEAP. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This (revised) model describes a conserved region from an otherwise highly divergent protein found in the Tneap subtype of CRISPR/Cas regions. This Cys-rich region features two motifs of CXXC.	309
213664	TIGR01909	C_GCAxxG_C_C	C_GCAxxG_C_C family probable redox protein. This model represents a putative redox-active protein of about 140 residues, with four perfectly conserved Cys residues. It includes a CGAXXG motif. Most members are found within one or two loci of transporter or oxidoreductase genes. A member from Geobacter sulfurreducens, located in a molybdenum transporter operon, has a TAT (twin-arginine translocation) signal sequence for Sec-independent transport across the plasma membrane, a hallmark of bound prosthetic groups such as FeS clusters.	120
273870	TIGR01910	DapE-ArgE	acetylornithine deacetylase or succinyl-diaminopimelate desuccinylase. This group of sequences contains annotations for both acetylornithine deacetylase and succinyl-diaminopimelate desuccinylase, but does not contain any members with experimental characterization. Bacillus, Staphylococcus and Sulfolobus species contain multiple hits to this subfamily and each may have a separate activity. Determining which is which must await further laboratory research. [Protein fate, Degradation of proteins, peptides, and glycopeptides]	375
188182	TIGR01911	HesB_rel_seleno	HesB-like selenoprotein. This model represents a family of small proteins related to HesB and its close homologs, which are likely to be involved in iron-sulfur cluster assembly (See TIGR00049 and pfam01521). Several members are selenoproteins, with a TGA codon and Sec residue that aligns to the conserved Cys of the HesB domain. A variable Cys/Ser/Gly-rich C-terminal region is not included in the seed alignment and model. [Unknown function, General]	92
162597	TIGR01912	TatC-Arch	Twin arginine targeting (Tat) protein translocase TatC, Archaeal clade. This model represents the TatC translocase component of the Sec-independent protein translocation system. This system is responsible for translocation of folded proteins, often with bound cofactors across the periplasmic membrane. A related model (TIGR00945) represents the bacterial clade of this family. TatC is often found (in bacteria) in a gene cluster with the two other components of the system, TatA/E (TIGR01411) and TatB (TIGR01410). A model also exists for the Twin-arginine signal sequence (TIGR01409).	237
273871	TIGR01913	bet_lambda	phage recombination protein Bet. This model represents the phage recombination protein Bet from a number of phage, including phage lambda. All members of this family are found in phage genomes or in putative prophage regions of bacterial genomes. [Mobile and extrachromosomal element functions, Prophage functions]	180
273872	TIGR01914	cas_Csa4	CRISPR-associated protein Cas8a2/Csa4, subtype I-A/APERN. CRISPR loci appear to be mobile elements with a wide host range. This model represents a protein that tends to be found near CRISPR repeats. The species range for this family, so far, is exclusively archaeal. It is found so far in only four different species, and includes two tandem genes in Pyrococcus furiosus DSM 3638. This subfamily is found in a CRISPR/Cas locus we designate APERN, so the family is designated Csa4, for CRISPR/Cas Subtype Protein 4.	354
273873	TIGR01915	npdG	NADPH-dependent F420 reductase. This model represents a subset of a parent family described by pfam03807. Unlike the parent family, members of this family are found only in species with evidence of coenzyme F420. All members of this family are believed to act as NADPH-dependent F420 reductase. [Energy metabolism, Electron transport]	219
273874	TIGR01916	F420_cofE	coenzyme F420-0:L-glutamate ligase. This model represents an enzyme of coenzyme F(420) biosynthesis, as catalyzed by MJ0768 of Methanococcus jannaschii and by the N-terminal half of FbiB of Mycobacterium bovis strain BCG. Note that only two glutamates are ligated in M. jannaschii, but five to six in the Mycobacterium lineage. In M. jannaschii, CofE catalyzes the GTP-dependent addition of two L-glutamates. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	243
130972	TIGR01917	gly_red_sel_B	glycine reductase, selenoprotein B. Glycine reductase is a complex with two selenoprotein subunits, A and B. This model represents the glycine reductase selenoprotein B. Closely related to it, but excluded from this model, are selenoprotein B subunits of betaine reductase and sarcosine reductase. All contain selenocysteine incorporated during translation at a specific UGA codon.	431
130973	TIGR01918	various_sel_PB	selenoprotein B, glycine/betaine/sarcosine/D-proline reductase family. This model represents selenoprotein B of glycine reductase, sarcosine reductase, betaine reductase, D-proline reductase, and perhaps others. This model is built in fragment mode to assist in recognizing fragmentary translations. All members are expected to contain an internal TGA codon, encoding selenocysteine, which may be misinterpreted as a stop codon.	431
273875	TIGR01919	hisA-trpF	1-(5-phosphoribosyl)-5-[(5-phosphoribosylamino)methylideneamino] imidazole-4-carboxamide isomerase/N-(5'phosphoribosyl)anthranilate isomerase. This model represents a bifunctional protein possessing both hisA (1-(5-phosphoribosyl)-5-[(5-phosphoribosylamino)methylideneamino] imidazole-4-carboxamide isomerase) and trpF (N-(5'phosphoribosyl)anthranilate isomerase) activities. Thus, it is involved in both the histidine and tryptophan biosynthetic pathways. Enzymes with this property have been described only in the Actinobacteria (High-GC gram-positive). The enzyme is closely related to the monofunctional HisA proteins (TIGR00007) and in Actinobacteria, the classical monofunctional TrpF is generally absent.	243
273876	TIGR01920	Shik_kin_archae	shikimate kinase. This model represents the shikimate kinase (SK) gene found in archaea which is only distantly related to homoserine kinase (thrB) and not at all to the bacterial SK enzyme. The SK from M. jannaschii has been overexpressed in E. coli and characterized. SK catalyzes the fifth step of the biosynthesis of chorismate from D-erythrose-4-phosphate and phosphoenolpyruvate. [Amino acid biosynthesis, Aromatic amino acid family]	261
273877	TIGR01921	DAP-DH	diaminopimelate dehydrogenase. This model represents the diaminopimelate dehydrogenase enzyme which provides an alternate (shortcut) route of lysine biosynthesis in Corynebacterium, Bacteroides, Porphyromonas and scattered other species. The enzyme from Corynebacterium glutamicum has been crystallized and characterized.	324
273878	TIGR01922	purO_arch	IMP cyclohydrolase. This model represents IMP cyclohydrolase, the final step in the biosynthesis of inosine monophosphate (IMP) in archaea. In bacteria this step is catalyzed by a bifunctional enzyme (purH).	199
162605	TIGR01923	menE	O-succinylbenzoate-CoA ligase. This model represents an enzyme, O-succinylbenzoate-CoA ligase, which is involved in the fourth step of the menaquinone biosynthesis pathway. O-succinylbenzoate-CoA ligase, together with menB (naphthoate synthase), takes 2-succinylbenzoate and converts it into 1,4-di-hydroxy-2-naphthoate. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone]	436
273879	TIGR01924	rsbW_low_gc	serine-protein kinase RsbW. This model describes the anti-sigma B factor also known as serine-protein kinase RsbW. Sigma B controls the general stress regulon in B subtilis and is activated by cell stresses such as stationary phase and heat shock. RsbW binds to sigma B and prevents formation of the transcription complex at the promoter. RsbV (anti-anti-sigma factor) binds to RsbW to inhibit association with sigma B, however RsbW can phosphorylate RsbV, causing disassociation of the RsbV/RsbW complex. Low ATP level or environmental stress causes the dephosphorylation of RsbV.	159
130980	TIGR01925	spIIAB	anti-sigma F factor. This model describes the SpoIIAB anti-sigma F factor. Sigma F regulates spore development in B subtilis. SpoIIAB binds to sigma F, preventing formation of the transcription complex at the promoter. SpoIIAA (anti-anti-sigma F factor) binds to SpoIIAB to inhibit association with sigma F, however SpoIIAB can phosphorylate SpoIIAA, causing disassociation of the SpoIIAA/B complex. The SpoIIE phosphatase dephosphorylates SpoIIAA. [Regulatory functions, Protein interactions, Cellular processes, Sporulation and germination]	137
130981	TIGR01926	peroxid_rel	uncharacterized peroxidase-related enzyme. This protein family has a length of about 200 amino acids. One member, from Myxococcus xanthus, is a selenoprotein, with an otherwise conserved Cys replaced by Sec. This family is drawn narrowly enough to suggest that These proteins contain a domain described by TIGR00778, with a CxxCxxxHxxxxxxxG motif. Some members of that family are known to act as peroxidases or correlate with resistance to oxidative stress.	177
273880	TIGR01927	menC_gamma/gm+	o-succinylbenzoate synthase. This model describes the enzyme o-succinylbenzoic acid synthetase (menC) that is involved in one of the steps of the menaquinone biosynthesis pathway. It takes SHCHC and makes it into 2-succinylbenzoate. Included in this model are gamma proteobacteria and archaea. Many of the com-names of the proteins identified by the model are identified as O-succinylbenzoyl-CoA synthase in error. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone]	307
213667	TIGR01928	menC_lowGC/arch	o-succinylbenzoate synthase. This model describes the enzyme o-succinylbenzoic acid synthetase (menC) that is involved in one of the steps of the menaquinone biosynthesis pathway. It takes SHCHC and makes it into 2-succinylbenzoate. Included in this model are low GC gram positive bacteria and archaea. Also included in the seed and in the model are enzymes with the com-name of N-acylamino acid racemase (or the more general term, racemase / racemase family), which refers to the enzyme's industrial application as racemases, and not to its biological function as o-succinylbenzoic acid synthetase. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone]	324
200143	TIGR01929	menB	naphthoate synthase (dihydroxynaphthoic acid synthetase). This model represents an enzyme, naphthoate synthase (dihydroxynaphthoic acid synthetase), which is involved in the fifth step of the menaquinone biosynthesis pathway. Together with o-succinylbenzoate-CoA ligase (menE: TIGR01923), this enzyme takes 2-succinylbenzoate and converts it into 1,4-di-hydroxy-2-naphthoate. Included above the trusted cutoff are two enzymes from Arabidopsis thaliana and one from Staphylococcus aureus which are identified as putative enoyl-CoA hydratase/isomerases. These enzymes group with the naphthoate synthases when building a tree and when doing BLAST searches. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone]	259
273881	TIGR01930	AcCoA-C-Actrans	acetyl-CoA acetyltransferases. This model represents a large family of enzymes which catalyze the thiolysis of a linear fatty acid CoA (or acetoacetyl-CoA) using a second CoA molecule to produce acetyl-CoA and a CoA-ester product two carbons shorter (or, alternatively, the condensation of two molecules of acetyl-CoA to produce acetoacetyl-CoA and CoA). This enzyme is also known as "thiolase", "3-ketoacyl-CoA thiolase", "beta-ketothiolase" and "Fatty oxidation complex beta subunit". When catalyzing the degradative reaction on fatty acids the corresponding EC number is 2.3.1.16. The condensation reaction corresponds to 2.3.1.9. Note that the enzymes which catalyze the condensation are generally not involved in fatty acid biosynthesis, which is carried out by a decarboxylating condensation of acetyl and malonyl esters of acyl carrier proteins. Rather, this activity may produce acetoacetyl-CoA for pathways such as IPP biosynthesis in the absence of sufficient fatty acid oxidation. [Fatty acid and phospholipid metabolism, Other]	385
273882	TIGR01931	cysJ	sulfite reductase [NADPH] flavoprotein, alpha-component. This model describes an NADPH-dependent sulfite reductase flavoprotein subunit. Most members of this family are found in Cys biosynthesis gene clusters. The closest homologs below the trusted cutoff are designated as subunits of nitrate reductase.	597
273883	TIGR01932	hflC	HflC protein. HflK and HflC are paralogs encoded by tandem genes in Proteobacteria, spirochetes, and some other bacterial lineages. The HflKC complex is anchored in the membrane and exposed to the periplasm. The complex is not active as a protease, but rather binds to and appears to modulate the ATP-dependent protease FtsH. The overall function of HflKC is not fully described. [Protein fate, Degradation of proteins, peptides, and glycopeptides, Regulatory functions, Protein interactions]	317
130988	TIGR01933	hflK	HflK protein. HflK and HflC are paralogs encoded by tandem genes in Proteobacteria, spirochetes, and some other bacterial lineages. The HflKC complex is anchored in the membrane and exposed to the periplasm. The complex is not active as a protease, but rather binds to and appears to modulate the ATP-dependent protease FtsH. The overall function of HflKC is not fully described. Regulation of FtsH by HflKC appears to be negative [SS 8/27/03]	261
273884	TIGR01934	MenG_MenH_UbiE	ubiquinone/menaquinone biosynthesis methyltransferases. This model represents a family of methyltransferases involved in the biosynthesis of menaquinone and ubiquinone. Some members such as the UbiE enzyme from E. coli are believed to act in both pathways, while others may act in only the menaquinone pathway. These methyltransferases are members of the UbiE/CoQ family of methyltransferases (pfam01209) which also contains ubiquinone methyltransferases and other methyltransferases. Members of this clade include a wide distribution of bacteria and eukaryotes, but no archaea. An outgroup for this clade is provided by the phosphatidylethanolamine methyltransferase (EC 2.1.1.17) from Rhodobacter sphaeroides. Note that a number of non-orthologous genes which are members of pfam03737 have been erroneously annotated as MenG methyltransferases. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone]	223
130990	TIGR01935	NOT-MenG	RraA family. The E. coli member of this family has been characterized as a regulator of RNase E and its crystal structure has been analyzed. This model was initially classified as a "hypothetical equivalog" expressing the tentative hypothesis that all members might have the same function as the E. coli enzyme. Considering the second clade of enterobacterial sequences within this family, that appears to be less tenable. The function of these sequences outside of the narrow RraA equivalog model (TIGR02998) remains obscure. All of these were initially annotated as MenG, AKA S-adenosylmethionine: 2-demethylmenaquinone methyltransferase (EC 2.1.-.-). See the references characterizing this as a case of transitive annotation error in the case of the E. coli protein. [Unknown function, General]	150
273885	TIGR01936	nqrA	NADH:ubiquinone oxidoreductase, Na(+)-translocating, A subunit. This model represents the NqrA subunit of the six-protein, Na(+)-pumping NADH-quinone reductase of a number of marine and pathogenic Gram-negative bacteria. This oxidoreductase complex functions primarily as a sodium ion pump. [Transport and binding proteins, Cations and iron carrying compounds]	447
130992	TIGR01937	nqrB	NADH:ubiquinone oxidoreductase, Na(+)-translocating, B subunit. This model represents the NqrB subunit of the six-protein, Na(+)-pumping NADH-quinone reductase of a number of marine and pathogenic Gram-negative bacteria. This oxidoreductase complex functions primarily as a sodium ion pump. [Transport and binding proteins, Cations and iron carrying compounds]	413
273886	TIGR01938	nqrC	NADH:ubiquinone oxidoreductase, Na(+)-translocating, C subunit. This model represents the NqrC subunit of the six-protein, Na(+)-pumping NADH-quinone reductase of a number of marine and pathogenic Gram-negative bacteria. This oxidoreductase complex functions primarily as a sodium ion pump. [Transport and binding proteins, Cations and iron carrying compounds]	251
130994	TIGR01939	nqrD	NADH:ubiquinone oxidoreductase, Na(+)-translocating, D subunit. This model represents the NqrD subunit of the six-protein, Na(+)-pumping NADH-quinone reductase of a number of marine and pathogenic Gram-negative bacteria. This oxidoreductase complex functions primarily as a sodium ion pump. [Transport and binding proteins, Cations and iron carrying compounds]	207
130995	TIGR01940	nqrE	NADH:ubiquinone oxidoreductase, Na(+)-translocating, E subunit. This model represents the NqrE subunit of the six-protein, Na(+)-pumping NADH-quinone reductase of a number of marine and pathogenic Gram-negative bacteria. This oxidoreductase complex functions primarily as a sodium ion pump. [Transport and binding proteins, Cations and iron carrying compounds]	200
130996	TIGR01941	nqrF	NADH:ubiquinone oxidoreductase, Na(+)-translocating, F subunit. This model represents the NqrF subunit of the six-protein, Na(+)-pumping NADH-quinone reductase of a number of marine and pathogenic Gram-negative bacteria. This oxidoreductase complex functions primarily as a sodium ion pump. [Transport and binding proteins, Cations and iron carrying compounds]	405
130997	TIGR01942	pcnB	poly(A) polymerase. This model describes the pcnB family of poly(A) polymerases (also known as plasmid copy number protein). These enzymes sequentially add adenosine nucleotides to the 3' end of RNAs, targeting them for degradation by the cell. This was originally described for anti-sense RNAs, but was later demonstrated for mRNAs as well. Members of this family are as yet limited to the gamma- and beta-proteobacteria, with putative members in the Chlamydiaceae and spirochetes. This family has homology to tRNA nucleotidyltransferase (cca).	410
130998	TIGR01943	rnfA	electron transport complex, RnfABCDGE type, A subunit. The six subunit complex RnfABCDGE in Rhodobacter capsulatus encodes an apparent NADH oxidoreductase responsible for electron transport to nitrogenase, necessary for nitrogen fixation. A closely related complex in E. coli, RsxABCDGE (Reducer of SoxR), reduces the 2Fe-2S-containing superoxide sensor SoxR, active as a transcription factor when oxidized. This family of putative NADH oxidoreductase complexes exists in many of the same species as the related NQR, a Na(+)-translocating NADH-quinone reductase, but is distinct. This model describes the A subunit. [Energy metabolism, Electron transport]	190
273887	TIGR01944	rnfB	electron transport complex, RnfABCDGE type, B subunit. The six subunit complex RnfABCDGE in Rhodobacter capsulatus encodes an apparent NADH oxidoreductase responsible for electron transport to nitrogenase, necessary for nitrogen fixation. A closely related complex in E. coli, RsxABCDGE (Reducer of SoxR), reduces the 2Fe-2S-containing superoxide sensor SoxR, active as a transcription factor when oxidized. This family of putative NADH oxidoreductase complexes exists in many of the same species as the related NQR, a Na(+)-translocating NADH-quinone reductase, but is distinct. This model describes the B subunit. [Energy metabolism, Electron transport]	165
273888	TIGR01945	rnfC	electron transport complex, RnfABCDGE type, C subunit. The six subunit complex RnfABCDGE in Rhodobacter capsulatus encodes an apparent NADH oxidoreductase responsible for electron transport to nitrogenase, necessary for nitrogen fixation. A closely related complex in E. coli, RsxABCDGE (Reducer of SoxR), reduces the 2Fe-2S-containing superoxide sensor SoxR, active as a transcription factor when oxidized. This family of putative NADH oxidoreductase complexes exists in many of the same species as the related NQR, a Na(+)-translocating NADH-quinone reductase, but is distinct. This model describes the C subunit. [Energy metabolism, Electron transport]	435
131001	TIGR01946	rnfD	electron transport complex, RnfABCDGE type, D subunit. The six subunit complex RnfABCDGE in Rhodobacter capsulatus encodes an apparent NADH oxidoreductase responsible for electron transport to nitrogenase, necessary for nitrogen fixation. A closely related complex in E. coli, RsxABCDGE (Reducer of SoxR), reduces the 2Fe-2S-containing superoxide sensor SoxR, active as a transcription factor when oxidized. This family of putative NADH oxidoreductase complexes exists in many of the same species as the related NQR, a Na(+)-translocating NADH-quinone reductase, but is distinct. This model describes the D subunit. [Energy metabolism, Electron transport]	327
273889	TIGR01947	rnfG	electron transport complex, RnfABCDGE type, G subunit. The six subunit complex RnfABCDGE in Rhodobacter capsulatus encodes an apparent NADH oxidoreductase responsible for electron transport to nitrogenase, necessary for nitrogen fixation. A closely related complex in E. coli, RsxABCDGE (Reducer of SoxR), reduces the 2Fe-2S-containing superoxide sensor SoxR, active as a transcription factor when oxidized. This family of putative NADH oxidoreductase complexes exists in many of the same species as the related NQR, a Na(+)-translocating NADH-quinone reductase, but is distinct. This model describes the G subunit. [Energy metabolism, Electron transport]	186
162619	TIGR01948	rnfE	electron transport complex, RnfABCDGE type, E subunit. The six subunit complex RnfABCDGE in Rhodobacter capsulatus encodes an apparent NADH oxidoreductase responsible for electron transport to nitrogenase, necessary for nitrogen fixation. A closely related complex in E. coli, RsxABCDGE (Reducer of SoxR), reduces the 2Fe-2S-containing superoxide sensor SoxR, active as a transcription factor when oxidized. This family of putative NADH oxidoreductase complexes exists in many of the same species as the related NQR, a Na(+)-translocating NADH-quinone reductase, but is distinct. This model describes the E subunit. [Energy metabolism, Electron transport]	196
273890	TIGR01949	AroFGH_arch	predicted phospho-2-dehydro-3-deoxyheptonate aldolase. This model represents a clade of sequences related to fructose-bisphosphate aldolase (class I, included within pfam01791). The members of this clade appear to be phospho-2-dehydro-3-deoxyheptonate aldolases. This enzyme is the first step of the chorismate biosynthesis pathway. Evidence for this assignment is based on gene clustering and phylogenetic profiling. A group of species lack members of the three other types of phospho-2-dehydro-3-deoxyheptonate aldolase (represented by TIGR00034, TIGR01358 and TIGR01361), and also apparently lack the well-known forms of step 2 (3-dehydroquinate synthase), but contain all other steps of the pathway: Desulfovibrio, Aquifex, Archaeoglobus, Halobacterium, Methanopyrus, Methanococcus and Methanobacterium. The clade of sequences represented here is limited strictly to this group of organisms. In Desulfovibrio, Aquifex, Archaeoglobus, Halobacterium and Methanosarcina the genes found by this model are clustered with other genes from the chorismate, phenylalanine, tyrosine and tryptophan biosynthesis pathways. In addition, these genes in Desulfovibrio, Archaeoglobus, Halobacterium, Methanosarcina and Methanopyrus are adjacent to a gene which hits pfam01959 which also has the property of having members only in those species which lack steps 1 and 2. Together these two genes appear to perform the synthesis of 3-dehydroquinate. It is presumed that the substrates and the chemical transformations involved are identical, but this has not yet been proven experimentally.	258
131005	TIGR01950	SoxR	redox-sensitive transcriptional activator SoxR. SoxR is a MerR-family homodimeric transcription factor with a 2Fe-2S cluster in each monomer. The motif CIGCGCxxxxxC is conserved. Oxidation of the iron-sulfur cluster activates SoxR. The physiological role in E. coli is response to oxidative stress. It is activated by superoxide, singlet oxygen, nitric oxide (NO), and hydrogen peroxide. In E. coli, SoxR increases expression of transcription factor SoxS; different downstream targets may exist in other species. [Cellular processes, Detoxification, Regulatory functions, DNA interactions]	142
273891	TIGR01951	nusB	transcription antitermination factor NusB. A transcription antitermination complex active in many bacteria was designated N-utilization substance (Nus) in E. coli because of its interaction with phage lambda protein N. This model represents NusB. Other components are NusA and NusG. NusE is, in fact, ribosomal protein S10. [Transcription, Transcription factors]	129
273892	TIGR01952	nusA_arch	NusA family KH domain protein, archaeal. This model represents a family of archaeal proteins found in a single copy per genome. It contains two KH domains (pfam00013) and is most closely related to the central region of bacterial NusA, a transcription termination factor named for its interaction with phage lambda protein N in E. coli. The proteins required for antitermination by N include NusA, NusB, nusE (ribosomal protein S10), and nusG. This system, on the whole, appears not to be present in the Archaea.	141
273893	TIGR01953	NusA	transcription termination factor NusA. This model describes NusA, or N utilization substance protein A, a bacterial transcription termination factor. It binds to RNA polymerase alpha subunit and promotes termination at certain RNA hairpin structures. It is named for the interaction in E. coli of phage lambda antitermination protein N with the N-utilization substance, consisting of NusA, NusB, NusE (ribosomal protein S10), and nusG. This model represents a region of NusA shared in all bacterial forms, and including an S1 (pfam00575) and a KH (pfam00013) RNA binding domain. Proteobacterial forms have an additional C-terminal region, not included in this model, with two repeats of a 50-residue domain rich in acidic amino acids. [Transcription, Transcription factors]	341
273894	TIGR01954	nusA_Cterm_rpt	transcription termination factor NusA, C-terminal duplication. NusA is a bacterial transcription termination factor. It is named for its interaction with phage lambda protein N, as part of the N utilization substance. Some members of the NusA family have a long C-terminal extension. This model represents an acidic 50-residue region found in two copies toward the C-terminus of most Proteobacterial NusA proteins, spaced about 26 residues apart. Analogous C-terminal extensions in some other bacterial lineages lack apparent homology but appear similarly acidic. [Transcription, Transcription factors]	50
131010	TIGR01955	RfaH	transcription elongation factor/antiterminator RfaH. This model represents the transcription elongation factor/antiterminator, RfaH. This protein is most closely related to the transcriptional termination/antitermination protein NusG (TIGR00922) and contains the KOW motif (pfam00467). This protein appears to be limited to the gamma proteobacteria. In E. coli, this gene appears to control the expression of haemolysin, sex factor and lipopolysaccharide genes. [Transcription, Transcription factors]	159
273895	TIGR01956	NusG_myco	NusG family protein. This model represents a family of Mycoplasma proteins orthologous to the bacterial transcription termination/antitermination factor NusG. These sequences from Mycoplasma are notably diverged (long branches in a Neighbor-joining phylogenetic tree) from the bacterial species. And although NusA and ribosomal protein S10 (NusE) appear to be present, NusB may be absent in Mycoplasmas calling into question whether these species have a functional Nus system including this family as a member.	258
273896	TIGR01957	nuoB_fam	NADH-quinone oxidoreductase, B subunit. This model describes the B chain of complexes that resemble NADH-quinone oxidoreductases. The electron acceptor is a quinone, ubiquinone, in mitochondria and most bacteria, including Escherichia coli, where the recommended gene symbol is nuoB. The quinone is plastoquinone in Synechocystis (where the chain is designated K) and in chloroplast, where NADH may be replaced by NADPH. In the methanogenic archaeal genus Methanosarcina, NADH is replaced by F420H2. [Energy metabolism, Electron transport]	145
131013	TIGR01958	nuoE_fam	NADH-quinone oxidoreductase, E subunit. This model describes the E chain of complexes that resemble NADH-quinone oxidoreductases. The electron acceptor is a quinone, ubiquinone, in mitochondria and most bacteria, including Escherichia coli, where the recommended gene symbol is nuoE. This model does not identify proteins from chloroplast and cyanobacteria. [Energy metabolism, Electron transport]	148
131014	TIGR01959	nuoF_fam	NADH-quinone oxidoreductase, F subunit. This model describes the F chain of complexes that resemble NADH-quinone oxidoreductases. The electron acceptor is a quinone, ubiquinone, in mitochondria and most bacteria, including Escherichia coli, where the recommended gene symbol is nuoF. This family does not have any members in chloroplast or cyanobacteria, where the quinone may be plastoquinone and NADH may be replaced by NADPH, nor in Methanosarcina, where NADH is replaced by F420H2. [Energy metabolism, Electron transport]	411
131015	TIGR01960	ndhF3_CO2	NAD(P)H dehydrogenase, subunit NdhF3 family. This family represents a subfamily of NAD(P)H dehydrogenase subunit 5, or ndhF. It is restricted to two paralogs in each completed cyanobacterial genome, in which several subtypes of ndhF are found. Included in this family is NdhF3, shown to play a role in high-affinity CO2 uptake in Synechococcus sp. PCC7002. In all cases, neighboring genes include a paralog of ndhD but do not include other NAD(P)H dehydrogenase subunits. Instead, genes related to CO2 uptake tend to be found nearby.	606
273897	TIGR01961	NuoC_fam	NADH (or F420H2) dehydrogenase, subunit C. This model describes the C subunit of the NADH dehydrogenase complex I in bacteria, as well as many instances of the corresponding mitochondrial subunit (NADH dehydrogenase subunit 9) and of the F420H2 dehydrogenase in Methanosarcina. Complex I contains subunits designated A-N. This C subunit often occurs as a fusion protein with the D subunit. This model excludes the NAD(P)H and plastoquinone-dependent form of chloroplasts and [Energy metabolism, Electron transport]	121
273898	TIGR01962	NuoD	NADH dehydrogenase I, D subunit. This model recognizes specifically the D subunit of NADH dehydrogenase I complex. It excludes the related chain of NAD(P)H-quinone oxidoreductases from chloroplast and Synechocystis, where the quinone may be plastoquinone rather than ubiquinone. This subunit often appears as a C/D fusion. [Energy metabolism, Electron transport]	386
211705	TIGR01963	PHB_DH	3-hydroxybutyrate dehydrogenase. This model represents a subfamily of the short chain dehydrogenases. Characterized members so far are 3-hydroxybutyrate dehydrogenases and are found in species that accumulate ester polymers called polyhydroxyalkanoic acids (PHAs) under certain conditions. Several members of the family are from species not known to accumulate PHAs, including Oceanobacillus iheyensis and Bacillus subtilis. However, polymer formation is not required for there to be a role for 3-hydroxybutyrate dehydrogenase; it may be that members of this family have the same function in those species.	255
213671	TIGR01964	chpXY	CO2 hydration protein. This small family of proteins includes paralogs ChpX and ChpY in Synechococcus sp. PCC7942 and other cyanobacteria, associated with distinct NAD(P)H dehydrogenase complexes. These proteins collectively enable light-dependent CO2 hydration and CO2 uptake; loss of both blocks growth at low CO2 concentrations. [Energy metabolism, Photosynthesis]	367
273899	TIGR01965	VCBS_repeat	VCBS repeat. This domain of about 100 residues is found in multiple (up to 35) copies in long proteins from several species of Vibrio, Colwellia, Bradyrhizobium, and Shewanella (hence the name VCBS) and in smaller copy numbers in proteins from several other bacteria. The large protein size and repeat copy numbers, species distribution, and suggested activities of several member proteins suggest a role for this domain in adhesion.	99
131021	TIGR01966	RNasePH	ribonuclease PH. This bacterial enzyme, ribonuclease PH, performs the final 3'-trimming and modification of tRNA precursors. This model is restricted absolutely to bacteria. Related families outside the model include proteins described as probable exosome complex exonucleases (rRNA processing) and polyribonucleotide nucleotidyltransferases (mRNA degradation). The most divergent member within the family is RNase PH from Deinococcus radiodurans. [Transcription, RNA processing]	236
273900	TIGR01967	DEAH_box_HrpA	RNA helicase HrpA. This model represents HrpA, one of two related but uncharacterized DEAH-box ATP-dependent helicases in many Proteobacteria and a few high-GC Gram-positive bacteria. HrpA is about 1300 amino acids long, while its paralog HrpB, also uncharacterized, is about 800 amino acids long. Related characterized eukaryotic proteins are RNA helicases associated with pre-mRNA processing. The HrpA/B homolog from Borrelia is 500 amino acids shorter but appears to be derived from HrpA rather than HrpB. [Unknown function, Enzymes of unknown specificity]	1283
131023	TIGR01968	minD_bact	septum site-determining protein MinD. This model describes the bacterial and chloroplast form of MinD, a multifunctional cell division protein that guides correct placement of the septum. The homologous archaeal MinD proteins, with many archaeal genomes having two or more forms, are described by a separate model. [Cellular processes, Cell division]	261
131024	TIGR01969	minD_arch	cell division ATPase MinD, archaeal. This model represents the archaeal branch of the MinD family. MinD, a weak ATPase, works in bacteria with MinC as a generalized cell division inhibitor and, through interaction with MinE, prevents septum placement at inappropriate sites. Often several members of this family are found in archaeal genomes, and the function is uncharacterized. More distantly related proteins include ParA chromosome partitioning proteins. The exact roles of the various archaeal MinD homologs are unknown.	251
273901	TIGR01970	DEAH_box_HrpB	ATP-dependent helicase HrpB. This model represents HrpB, one of two related but uncharacterized DEAH-box ATP-dependent helicases in many Proteobacteria, but also in a few species of other lineages. The member from Rhizobium meliloti has been designated HelO. HrpB is typically about 800 residues in length, while its paralog HrpA (TIGR01967), also uncharacterized, is about 1300 amino acids long. Related characterized eukaryotic proteins are RNA helicases associated with pre-mRNA processing. [Unknown function, Enzymes of unknown specificity]	819
273902	TIGR01971	NuoI	NADH-quinone oxidoreductase, chain I. This model represents the I subunit (one of 14: A->N) of the NADH-quinone oxidoreductase complex I which generally couples NADH and ubiquinone oxidation/reduction in bacteria and mammalian mitochondria, but may act on NADPH and/or plastoquinone in cyanobacteria and plant chloroplasts. This model excludes "I" subunits from the closely related F420H2 dehydrogenase and formate hydrogenlyase complexes. [Energy metabolism, Electron transport]	122
273903	TIGR01972	NDH_I_M	proton-translocating NADH-quinone oxidoreductase, chain M. This model describes the 13th (based on E. coli) structural gene, M, of bacterial NADH dehydrogenase I, as well as chain 4 of the corresponding mitochondrial complex I and of the chloroplast NAD(P)H dehydrogenase complex. [Energy metabolism, Electron transport]	481
273904	TIGR01973	NuoG	NADH-quinone oxidoreductase, chain G. This model represents the G subunit (one of 14: A->N) of the NADH-quinone oxidoreductase complex I which generally couples NADH and ubiquinone oxidation/reduction in bacteria and mammalian mitochondria while translocating protons, but may act on NADPH and/or plastoquinone in cyanobacteria and plant chloroplasts. This model excludes related subunits from formate dehydrogenase complexes. [Energy metabolism, Electron transport]	602
273905	TIGR01974	NDH_I_L	proton-translocating NADH-quinone oxidoreductase, chain L. This model describes the 12th (based on E. coli) structural gene, L, of bacterial NADH dehydrogenase I, as well as chain 5 of the corresponding mitochondrial complex I and subunit 5 (or F) of the chloroplast NAD(P)H-plastoquinone dehydrogenase complex. [Energy metabolism, Electron transport]	609
131030	TIGR01975	isoAsp_dipep	isoaspartyl dipeptidase IadA. The L-isoaspartyl derivative of Asp arises non-enzymatically over time as a form of protein damage. In this isomerization, the connectivity of the polypeptide changes to pass through the beta-carboxyl of the side chain. Much but not all of this damage can be repaired by protein-L-isoaspartate (D-aspartate) O-methyltransferase. This model describes the isoaspartyl dipeptidase IadA, apparently one of two such enzymes in E. coli, an enzyme that degrades isoaspartyl dipeptides and may unblock degradation of proteins that cannot be repaired. This model also describes closely related proteins from other species (e.g. Clostridium perfringens, Thermoanaerobacter tengcongensis) that we assume to be equivalent in function. This family shows homology to dihydroorotases. [Protein fate, Degradation of proteins, peptides, and glycopeptides]	389
273906	TIGR01976	am_tr_V_VC1184	cysteine desulfurase family protein, VC1184 subfamily. This model describes a subfamily of probable pyridoxal phosphate-dependent enzymes in the aminotransferase class V family (pfam00266). The most closely related characterized proteins are active as cysteine desulfurases, selenocysteine lyases, or both; some are involved in FeS cofactor biosynthesis and are designated NifS. An active site Cys residue present in those sequences, in motifs resembling GHHC or GSAC, is not found in this family. The function of members of this family is unknown, but seems unlikely to be as an aminotransferase. [Unknown function, Enzymes of unknown specificity]	397
131032	TIGR01977	am_tr_V_EF2568	cysteine desulfurase family protein. This model describes a subfamily of probable pyridoxal phosphate-dependent enzymes in the aminotransferase class V family. Related families contain members active as cysteine desulfurases, selenocysteine lyases, or both. The members of this family form a distinct clade and all are shorter at the N-terminus. The function of this subfamily is unknown. [Unknown function, Enzymes of unknown specificity]	376
273907	TIGR01978	sufC	FeS assembly ATPase SufC. SufC is part of the SUF system, shown in E. coli to consist of six proteins and believed to act in Fe-S cluster formation during oxidative stress. SufC forms a complex with SufB and SufD. SufC belongs to the ATP-binding cassette transporter family (pfam00005) but is no longer thought to be part of a transporter. The complex is reported as cytosolic or associated with the membrane. The SUF system also includes a cysteine desulfurase (SufS, enhanced by SufE) and a probable iron-sulfur cluster assembly scaffold protein, SufA. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	243
131034	TIGR01979	sufS	cysteine desulfurases, SufS family. This model represents a subfamily of NifS-related cysteine desulfurases involved in FeS cluster formation needed for nitrogen fixation among other vital functions. Many cysteine desulfurases are also active as selenocysteine lyase and/or cysteine sulfinate desulfinase. This subfamily is associated with the six-gene SUF system described in E. coli and Erwinia as an FeS cluster formation system during oxidative stress. The active site Cys in this subfamily resembles GHHC with one or both His conserved. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	403
131035	TIGR01980	sufB	FeS assembly protein SufB. This protein, SufB, forms a cytosolic complex SufBCD. This complex enhances the cysteine desulfurase of SufSE. The system, together with SufA, is believed to act in iron-sulfur cluster formation during oxidative stress. Note that SufC belongs to the family of ABC transporter ATP binding proteins, so this protein, encoded by an adjacent gene, has often been annotated as a transporter component. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	448
273908	TIGR01981	sufD	FeS assembly protein SufD. This protein, SufD, forms a cytosolic complex SufBCD. This complex enhances the cysteine desulfurase of SufSE. The system, together with SufA, is believed to act in iron-sulfur cluster formation during oxidative stress. SufB and SufD are homologous. Note that SufC belongs to the family of ABC transporter ATP binding proteins, so this protein, encoded by an adjacent gene, has often been annotated as a transporter component. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	275
273909	TIGR01982	UbiB	2-polyprenylphenol 6-hydroxylase. This model represents the enzyme (UbiB) which catalyzes the first hydroxylation step in the ubiquinone biosynthetic pathway in bacteria. It is believed that the reaction is 2-polyprenylphenol -> 6-hydroxy-2-polyprenylphenol. This model finds hits primarily in the proteobacteria. The gene is also known as AarF in certain species. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone]	437
273910	TIGR01983	UbiG	ubiquinone biosynthesis O-methyltransferase. This model represents an O-methyltransferase believed to act at two points in the ubiquinone biosynthetic pathway in bacteria (UbiG) and fungi (COQ3). A separate methylase (MenG/UbiE) catalyzes the single C-methylation step. The most commonly used names for genes in this family do not indicate whether this gene is an O-methyl, or C-methyl transferase. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone]	224
273911	TIGR01984	UbiH	2-polyprenyl-6-methoxyphenol 4-hydroxylase. This model represents the FAD-dependent monooxygenase responsible for the second hydroxylation step in the aerobic ubiquinone biosynthetic pathway. The scope of this model is limited to the proteobacteria. This family is closely related to the UbiF hydroxylase which catalyzes the final hydroxylation step. The enzyme has also been named VisB due to a mutant VISible light sensitive phenotype. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone]	382
131040	TIGR01985	phasin_2	phasin. This model represents a family of granule-associated proteins (phasins) which are part of the polyhydroxyalkanoate synthesis machinery. This family is based on a pair of characterized genes from Methylobacterium extorquens. Members of the seed for this model all contain the rest of the components believed to be essential for this system (see the "polyhydroxyalkanoic acid synthesis" property in the GenPropDB). Members of this family score below trusted to another phasin model, TIGR01841 and together may represent a subfamily or broader equivalog.	112
273912	TIGR01986	glut_syn_euk	glutathione synthetase, eukaryotic. This model represents the eukaryotic glutathione synthetase, which shows little resemblance to the analogous enzyme of Gram-negative bacteria (TIGR01380). In the Kinetoplastida, trypanothione replaces glutathione, but can be made from glutathione; a sequence from Leishmania is not included in the seed, is highly divergent, and therefore scores between the trusted and noise cutoffs.	472
131042	TIGR01987	HI0074	nucleotidyltransferase substrate binding protein, HI0074 family. The member of this family from Haemophilus influenzae, HI0074, has been shown by crystal structure to resemble nucleotidyltransferase substrate binding proteins. It forms a complex with HI0073, encoded by the adjacent gene and containing a nucleotidyltransferase nucleotide binding domain (pfam01909).	123
273913	TIGR01988	Ubi-OHases	Ubiquinone biosynthesis hydroxylase, UbiH/UbiF/VisC/COQ6 family. This model represents a family of FAD-dependent hydroxylases (monooxygenases) which are all believed to act in the aerobic ubiquinone biosynthesis pathway. A separate set of hydroxylases, as yet undiscovered, are believed to be active under anaerobic conditions. In E. coli three enzyme activities have been described, UbiB (which acts first at position 6, see TIGR01982), UbiH (which acts at position 4) and UbiF (which acts at position 5). UbiH and UbiF are similar to one another and form the basis of this subfamily. Interestingly, E. coli contains another hydroxylase gene, called visC, that is highly similar to UbiF, adjacent to UbiH and, when mutated, results in a phenotype similar to that of UbiH (which has also been named visB). Several other species appear to have three homologs in this family, although they assort themselves differently on phylogenetic trees (e.g. Xylella and Mesorhizobium) making it difficult to ascribe a specific activity to each one. Eukaryotes appear to have only a single homolog in this subfamily (COQ6) which complements UbiH, but also possess a non-orthologous gene, COQ7 which complements UbiF. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone]	385
273914	TIGR01989	COQ6	ubiquinone biosynthesis monooxygenase COQ6. This model represents the monooxygenase responsible for the 4-hydroxylation of the phenol ring in the aerobic biosynthesis of ubiquinone.	437
213672	TIGR01990	bPGM	beta-phosphoglucomutase. This model represents the beta-phosphoglucomutase enzyme which catalyzes the interconversion of beta-D-glucose-1-phosphate and beta-D-glucose-6-phosphate. The 6-phosphate is capable of non-enzymatic anomerization (alpha <-> beta) while the 1-phosphate is not. A separate enzyme is responsible for the isomerization of the alpha anomers. Beta-D-glucose-1-phosphate results from the phosphorolysis of maltose (2.4.1.8), trehalose (2.4.1.64) or trehalose-6-phosphate (2.4.1.216). Alternatively, these reactions can be run in the synthetic direction to create the disaccharides. All sequenced genomes which contain a member of this family also appear to contain at least one putative maltose or trehalose phosphorylase. Three species, Lactococcus, Enterococcus and Neisseria appear to contain a pair of paralogous beta-PGM's. Beta-phosphoglucomutase is a member of the haloacid dehalogenase superfamily of hydrolase enzymes. These enzymes are characterized by a series of three catalytic motifs positioned within an alpha-beta (Rossmann) fold. beta-PGM contains an inserted alpha helical domain in between the first and second conserved motifs and thus is a member of subfamily IA of the superfamily. The third catalytic motif comes in three variants, the third of which, containing a conserved DD or ED, is the only one found here as well as in several other related enzymes (TIGR01509). The enzyme from L. lactis has been extensively characterized including a remarkable crystal structure which traps the pentacoordinate transition state. [Energy metabolism, Biosynthesis and degradation of polysaccharides]	185
273915	TIGR01991	HscA	Fe-S protein assembly chaperone HscA. The Heat Shock Cognate proteins HscA and HscB act together as chaperones. HscA resembles DnaK but belongs in a separate clade. The apparent function is to aid assembly of iron-sulfur cluster proteins. Homologs from Buchnera and Wolbachia are clearly in the same clade but are highly derived and score lower than some examples of DnaK. [Protein fate, Protein folding and stabilization]	599
273916	TIGR01992	PTS-IIBC-Tre	PTS system, trehalose-specific IIBC component. This model represents the fused enzyme II B and C components of the trehalose-specific PTS sugar transporter system. Trehalose is converted to trehalose-6-phosphate in the process of translocation into the cell. These transporters lack their own IIA domains and instead use the glucose IIA protein (IIAglc or Crr). The exceptions to this rule are Staphylococci and Streptococci which contain their own A domain as a C-terminal fusion. This family is closely related to the sucrose transporting PTS IIBC enzymes and the B and C domains of each are described by subfamily-domain level TIGRFAMs models (TIGR00826 and TIGR00852, respectively). In E. coli, B. subtilis and P. fluorescens the presence of this gene is associated with the presence of trehalase which degrades T6P to glucose and glucose-6-P. Trehalose may also be transported (in Salmonella) via the mannose PTS or galactose permease systems, or (in Sinorhizobium, Thermococcus and Sulfolobus, for instance) by ABC transporters.	462
273917	TIGR01993	Pyr-5-nucltdase	pyrimidine 5'-nucleotidase. This family of proteins includes the SDT1/SSM1 gene from yeast which has been shown to code for a pyrimidine (UMP/CMP) 5'nucleotidase. The family spans plants, fungi and a small number of bacteria. These enzymes are members of the haloacid dehalogenase (HAD) superfamily of hydrolases, specifically the IA subfamily (variant 3, TIGR01509).	183
273918	TIGR01994	SUF_scaf_2	SUF system FeS assembly protein, NifU family. Three iron-sulfur cluster assembly systems are known so far. ISC is broadly distributed while NIF tends to be associated with nitrogenase in nitrogen-fixing bacteria. The most recently described is SUF, believed to be important to maintain the function during aerobic stress of enzymes with labile Fe-S clusters. It is fairly widely distributed. This family represents one of two different proteins proposed to act as a scaffold on which the Fe-S cluster is built and from which it is transferred. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	137
273919	TIGR01995	PTS-II-ABC-beta	PTS system, beta-glucoside-specific IIABC component. This model represents a family of PTS enzyme II proteins in which all three domains are found in the same polypeptide chain and which appear to have a broad specificity for beta-glucosides including salicin (beta-D-glucose-1-salicylate) and arbutin (Hydroquinone-O-beta-D-glucopyranoside). These are distinct from the closely related sucrose-specific and trehalose-specific PTS transporters.	610
131051	TIGR01996	PTS-II-BC-sucr	PTS system, sucrose-specific IIBC component. This model represents the fused enzyme II B and C components of the sucrose-specific PTS sugar transporter system. Sucrose is converted to sucrose-6-phosphate in the process of translocation into the cell. Some of these transporters lack their own IIA domains and instead use the glucose IIA protein (IIAglc or Crr). The exceptions to this rule are Staphylococci, Streptococci, Lactococci, Lactobacilli, etc. which contain their own A domain as a C-terminal fusion. This family is closely related to the trehalose transporting PTS IIBC enzymes and the B and C domains of each are described by subfamily-domain level TIGRFAMs models (TIGR00826 and TIGR00852, respectively).	461
131052	TIGR01997	sufA_proteo	FeS assembly scaffold SufA. This model represents the SufA protein of the SUF system of iron-sulfur cluster biosynthesis. This system performs FeS biosynthesis even during oxidative stress and tends to be absent in obligate anaerobic and microaerophilic bacteria. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	107
273920	TIGR01998	PTS-II-BC-nag	PTS system, N-acetylglucosamine-specific IIBC component. This model represents the combined B and C domains of the PTS transport system enzyme II specific for N-acetylglucosamine transport. Many of the genes in this family also include an A domain as part of the same polypeptide and thus should be given the name "PTS system, N-acetylglucosamine-specific IIABC component". This family is most closely related to the glucose-specific PTS enzymes. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	475
188192	TIGR01999	iscU	FeS cluster assembly scaffold IscU. This model represents IscU, a homolog of the N-terminal region of NifU, an Fe-S cluster assembly protein found mostly in nitrogen-fixing bacteria. IscU is considered part of the IscSUA-hscAB-fdx system of Fe-S assembly, whereas NifU is found in nitrogenase-containing (nitrogen-fixing) species. A NifU-type protein is also found in Helicobacter and Campylobacter. IscU and NifU are considered scaffold proteins on which Fe-S clusters are assembled before transfer to apoproteins. This model excludes true NifU proteins as in Klebsiella pneumoniae and Anabaena sp. as well as archaeal homologs. It includes largely proteobacterial and eukaryotic forms. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	124
273921	TIGR02000	NifU_proper	Fe-S cluster assembly protein NifU. Three different but partially homologous Fe-S cluster assembly systems have been described: Isc, Suf, and Nif. The latter is associated with donation of an Fe-S cluster to nitrogenase in a number of nitrogen-fixing species. NifU, described here, consists of an N-terminal domain (pfam01592) and a C-terminal domain (pfam01106). Homologs with an equivalent domain architecture from Helicobacter and Campylobacter, however, are excluded from this model by a high trusted cutoff. The model, therefore, is specific for NifU involved in nitrogenase maturation. The related model TIGR01999 homologous to the N-terminus of this model describes IscU from the Isc system as in E. coli, Saccharomyces cerevisiae, and Homo sapiens. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other, Central intermediary metabolism, Nitrogen fixation]	290
273922	TIGR02001	gcw_chp	conserved hypothetical protein, proteobacterial. This model represents a conserved hypothetical protein about 240 residues in length found so far in Proteobacteria including Shewanella oneidensis, Ralstonia solanacearum, and Colwellia psychrerythraea, usually as part of a paralogous family. The function is unknown.	243
273923	TIGR02002	PTS-II-BC-glcB	PTS system, glucose-specific IIBC component. This model represents the combined B and C domains of the PTS transport system enzyme II specific for glucose transport. Many of the genes in this family also include an A domain as part of the same polypeptide and thus should be given the name "PTS system, glucose-specific IIABC component" while the B. subtilis enzyme also contains an enzyme III domain which appears to act independently of the enzyme II domains. This family is most closely related to the N-acetylglucosamine-specific PTS enzymes (TIGR01998). [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	502
131058	TIGR02003	PTS-II-BC-unk1	PTS system, IIBC component. This model represents a family of fused B and C components of PTS enzyme II. This clade is a member of a larger family which contains enzyme II's specific for a variety of sugars including glucose (TIGR02002) and N-acetylglucosamine (TIGR01998). None of the members of this clade have been experimentally characterized. This clade includes sequences from Streptococcus and Enterococcus which also include a C-terminal A domain as well as Bacillus and Clostridium which do not. In nearly all cases, these species also contain an authentic glucose-specific PTS transporter. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	548
273924	TIGR02004	PTS-IIBC-malX	PTS system, maltose and glucose-specific IIBC component. This model represents a family of PTS enzyme II fused B and C components including and most closely related to the MalX maltose and glucose-specific transporter of E. coli. A pair of paralogous genes from E. coli strain CFT073 score between trusted and noise and may have diverged sufficiently to have an altered substrate specificity. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	517
273925	TIGR02005	PTS-IIBC-alpha	PTS system, alpha-glucoside-specific IIBC component. This model represents a family of fused PTS enzyme II B and C domains. A gene from Clostridium has been partially characterized as a maltose transporter, while genes from Fusobacterium and Klebsiella have been proposed to transport the five non-standard isomers of sucrose. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	524
131061	TIGR02006	IscS	cysteine desulfurase IscS. This model represents IscS, one of several cysteine desulfurases from a larger protein family designated (misleadingly, in this case) class V aminotransferases. IscS is one of at least 6 enzymes characteristic of the IscSUA-hscAB-fdx system of iron-sulfur cluster assembly. Scoring almost as well as proteobacterial sequences included in the model are mitochondrial cysteine desulfurases, apparently from an analogous system in eukaryotes. The sulfur, taken from cysteine, may be used in other systems as well, such as tRNA base modification and biosynthesis of other cofactors. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other, Protein synthesis, tRNA and rRNA base modification]	402
131062	TIGR02007	fdx_isc	ferredoxin, 2Fe-2S type, ISC system. This family consists of proteobacterial ferredoxins associated with and essential to the ISC system of 2Fe-2S cluster assembly. This family is closely related to (but excludes) eukaryotic (mitochondrial) adrenodoxins, which are ferredoxins involved in electron transfer to P450 cytochromes. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	110
273926	TIGR02008	fdx_plant	ferredoxin [2Fe-2S]. This model represents single domain 2Fe-2S (also called plant type) ferredoxins. In general, these occur as single domain proteins or with a chloroplast transit peptide. Species tend to be photosynthetic, but several forms may occur in one species and individually may not be associated with photosynthesis. Halobacterial forms differ somewhat in architecture; they score between trusted and noise cutoffs. Sequences scoring below the noise cutoff tend to be ferredoxin-related domains of larger proteins.	97
213673	TIGR02009	PGMB-YQAB-SF	beta-phosphoglucomutase family hydrolase. This subfamily model groups together three clades: the characterized beta-phosphoglucomutases (including those from E.coli, B.subtilis and L.lactis, TIGR01990), a clade of putative bPGM's from mycobacteria and a clade including the uncharacterized E.coli and H.influenzae yqaB genes which may prove to be beta-mutases of a related 1-phosphosugar. All of these are members of the larger Haloacid dehalogenase (HAD) subfamily IA and include the "variant 3" glu-asp version of the third conserved HAD domain (TIGR01509).	185
273927	TIGR02010	IscR	iron-sulfur cluster assembly transcription factor IscR. This model describes IscR, an iron-sulfur binding transcription factor of the ISC iron-sulfur cluster assembly system. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other, Regulatory functions, DNA interactions]	135
213674	TIGR02011	IscA	iron-sulfur cluster assembly protein IscA. This model represents the IscA component of the ISC system for iron-sulfur cluster assembly. The ISC system consists of IscRASU, HscAB and an Isc-specific ferredoxin. IscA previously was believed to act as a scaffold and now is seen as an iron donor protein. This clade is limited to the proteobacteria. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	105
162659	TIGR02012	tigrfam_recA	protein RecA. This model describes orthologs of the recA protein. RecA promotes hybridization of homologous regions of DNA. A segment of ssDNA can be hybridized to another ssDNA region, or to a dsDNA region. ATP is hydrolyzed in the process. Part of the SOS response, it is regulated by LexA via autocatalytic cleavage. [DNA metabolism, DNA replication, recombination, and repair]	321
273928	TIGR02013	rpoB	DNA-directed RNA polymerase, beta subunit. This model describes orthologs of the beta subunit of Bacterial RNA polymerase. The core enzyme consists of two alpha chains, one beta chain, and one beta' subunit. [Transcription, DNA-dependent RNA polymerase]	1065
131069	TIGR02014	BchZ	chlorophyllide reductase subunit Z. This model represents the Z subunit of the three-subunit enzyme, (bacterio)chlorophyllide reductase. This enzyme is responsible for the reduction of the chlorin B-ring and is closely related to the protochlorophyllide reductase complex which reduces the D-ring. Both of these complexes in turn are homologous to nitrogenase. [Energy metabolism, Photosynthesis]	468
131070	TIGR02015	BchY	chlorophyllide reductase subunit Y. This model represents the Y subunit of the three-subunit enzyme, (bacterio)chlorophyllide reductase. This enzyme is responsible for the reduction of the chlorin B-ring and is closely related to the protochlorophyllide reductase complex which reduces the D-ring. Both of these complexes in turn are homologous to nitrogenase. [Energy metabolism, Photosynthesis]	422
273929	TIGR02016	BchX	chlorophyllide reductase iron protein subunit X. This model represents the X subunit of the three-subunit enzyme, (bacterio)chlorophyllide reductase. This enzyme is responsible for the reduction of the chlorin B-ring and is closely related to the protochlorophyllide reductase complex which reduces the D-ring. Both of these complexes in turn are homologous to nitrogenase. This subunit is homologous to the nitrogenase component II, or "iron" protein. [Energy metabolism, Photosynthesis]	296
131072	TIGR02017	hutG_amidohyd	N-formylglutamate amidohydrolase. In some species, histidine is converted via urocanate and then formimino-L-glutamate to glutamate in four steps, where the fourth step is conversion of N-formimino-L-glutamate to L-glutamate and formamide. In others, that pathway from formimino-L-glutamate may differ, with the next enzyme being formiminoglutamate hydrolase (HutF) yielding N-formyl-L-glutamate. This model represents the enzyme N-formylglutamate deformylase, also called N-formylglutamate amidohydrolase, which then produces glutamate. [Energy metabolism, Amino acids and amines]	263
188194	TIGR02018	his_ut_repres	histidine utilization repressor, proteobacterial. This model represents a proteobacterial histidine utilization repressor. It is usually found clustered with the enzymes HutUHIG so that it can regulate its own expression as well. A number of species have several paralogs and may fine-tune the regulation according to levels of degradation intermediates such as urocanate. This family belongs to the larger GntR family of transcriptional regulators. [Energy metabolism, Amino acids and amines, Regulatory functions, DNA interactions]	230
131074	TIGR02019	BchJ	bacteriochlorophyll 4-vinyl reductase. This model represents the component of bacteriochlorophyll synthetase responsible for reduction of the B-ring pendant ethylene (4-vinyl) group. It appears that this step must precede the reduction of ring D, at least by the "dark" protochlorophyllide reductase enzymes BchN, BchB and BchL. This family appears to be present in photosynthetic bacteria except for the cyanobacterial clade. Cyanobacteria must use a non-orthologous gene to carry out this required step for the biosynthesis of both bacteriochlorophyll and chlorophyll. [Biosynthesis of cofactors, prosthetic groups, and carriers, Chlorophyll and bacteriochlorphyll]	188
131075	TIGR02020	BchF	2-vinyl bacteriochlorophyllide hydratase. This model represents the enzyme responsible for the first step in the modification of the ring A vinyl group of chlorophyllide a which (in part) distinguishes chlorophyll from bacteriochlorophyll. This enzyme is apparently absent from cyanobacteria (which do not use bacteriochlorophyll). [Energy metabolism, Photosynthesis]	145
273930	TIGR02021	BchM-ChlM	magnesium protoporphyrin O-methyltransferase. This model represents the S-adenosylmethionine-dependent O-methyltransferase responsible for methylation of magnesium protoporphyrin IX. This step is essential for the biosynthesis of both chlorophyll and bacteriochlorophyll. This model encompasses two closely related clades, from cyanobacteria (and plants) where it is called ChlM and other photosynthetic bacteria where it is known as BchM. [Biosynthesis of cofactors, prosthetic groups, and carriers, Chlorophyll and bacteriochlorphyll]	219
273931	TIGR02022	hutF	formiminoglutamate deiminase. In some species, histidine utilization goes via urocanate to glutamate in four steps, the last being removal of formamide. This model describes an alternate fourth step, formiminoglutamate hydrolase, which leads to N-formyl-L-glutamate. This product may be acted on by formylglutamate amidohydrolase (TIGR02017) and bypass glutamate as a product during its degradation. Alternatively, removal of formate (by EC 3.5.1.68) would yield glutamate. [Energy metabolism, Amino acids and amines]	454
273932	TIGR02023	BchP-ChlP	geranylgeranyl reductase. This model represents a group of geranylgeranyl reductases specific for the biosyntheses of bacteriochlorophyll and chlorophyll. It is unclear whether the processes of isoprenoid ligation to the chlorin ring and reduction of the geranylgeranyl chain to a phytyl chain are necessarily ordered the same way in all species. [Biosynthesis of cofactors, prosthetic groups, and carriers, Chlorophyll and bacteriochlorphyll]	388
131079	TIGR02024	FtcD	glutamate formiminotransferase. This model represents the tetrahydrofolate (THF) dependent glutamate formiminotransferase involved in the histidine utilization pathway. This enzyme interconverts L-glutamate and N-formimino-L-glutamate. The enzyme is bifunctional as it also catalyzes the cyclodeaminase reaction on N-formimino-THF, converting it to 5,10-methenyl-THF and releasing ammonia - part of the process of regenerating THF. This model covers enzymes from metazoa as well as gram-positive bacteria and archaea. In humans, deficiency of this enzyme results in a disease phenotype. The crystal structure of the enzyme has been studied in the context of the catalytic mechanism. [Energy metabolism, Amino acids and amines]	298
273933	TIGR02025	BchH	magnesium chelatase, H subunit. This model represents the H subunit of the magnesium chelatase complex responsible for magnesium insertion into the protoporphyrin IX ring in the biosynthesis of both chlorophyll and bacteriochlorophyll. In chlorophyll-utilizing species, this gene is known as ChlH, while in bacteriochlorophyll-utilizing species it is called BchH. Subunit H is the largest (~140kDa) of the three subunits (the others being BchD/ChlD and BchI/ChlI), and is known to bind protoporphyrin IX. Subunit H is homologous to the CobN subunit of cobaltochelatase and by analogy with that enzyme, subunit H is believed to also bind the magnesium ion which is inserted into the ring. In conjunction with the hydrolysis of ATP by subunits I and D, a conformation change is believed to happen in subunit H causing the magnesium ion insertion into the distorted protoporphyrin ring. [Biosynthesis of cofactors, prosthetic groups, and carriers, Chlorophyll and bacteriochlorphyll]	1224
131081	TIGR02026	BchE	magnesium-protoporphyrin IX monomethyl ester anaerobic oxidative cyclase. This model represents the cobalamin-dependent oxidative cyclase, a radical SAM enzyme responsible for forming the distinctive E-ring of the chlorin ring system under anaerobic conditions. This step is essential in the biosynthesis of both bacteriochlorophyll and chlorophyll under anaerobic conditions (a separate enzyme, AcsF, acts under aerobic conditions). This model identifies two clades of sequences, one from photosynthetic, non-cyanobacterial bacteria and another including Synechocystis and several non-photosynthetic bacteria. The function of the Synechocystis gene is supported by gene clustering with other photosynthetic genes, so the purpose of the gene in the non-photosynthetic bacteria is uncertain. Note that homologs of this gene are not found in plants which rely solely on the aerobic cyclase.	497
273934	TIGR02027	rpoA	DNA-directed RNA polymerase, alpha subunit, bacterial and chloroplast-type. This family consists of the bacterial (and chloroplast) DNA-directed RNA polymerase alpha subunit, encoded by the rpoA gene. The RNA polymerase catalyzes the transcription of DNA into RNA using the four ribonucleoside triphosphates as substrates. The amino terminal domain is involved in dimerizing and assembling the other RNA polymerase subunits into a transcriptionally active enzyme. The carboxy-terminal domain contains determinants for interaction with DNA and with transcriptional activator proteins. [Transcription, DNA-dependent RNA polymerase]	297
131083	TIGR02028	ChlP	geranylgeranyl reductase. This model represents the reductase which reduces the geranylgeranyl group to the phytyl group in the side chain of chlorophyll. It is unclear whether the enzyme has a preference for acting before or after the attachment of the side chain to chlorophyllide a by chlorophyll synthase. This clade is restricted to plants and cyanobacteria to separate it from the homologues which act in the biosynthesis of bacteriochlorophyll. [Biosynthesis of cofactors, prosthetic groups, and carriers, Chlorophyll and bacteriochlorphyll]	398
131084	TIGR02029	AcsF	magnesium-protoporphyrin IX monomethyl ester aerobic oxidative cyclase. This model represents the oxidative cyclase responsible for forming the distinctive E-ring of the chlorin ring system under aerobic conditions. This enzyme is believed to utilize a binuclear iron center and molecular oxygen. There are two isoforms of this enzyme in some plants and cyanobacteria which are differentially regulated based on the levels of copper and oxygen. This step is essential in the biosynthesis of both bacteriochlorophyll and chlorophyll under aerobic conditions (a separate enzyme, BchE, acts under anaerobic conditions). This enzyme is found in plants, cyanobacteria and other photosynthetic bacteria. [Biosynthesis of cofactors, prosthetic groups, and carriers, Chlorophyll and bacteriochlorphyll]	337
131085	TIGR02030	BchI-ChlI	magnesium chelatase ATPase subunit I. This model represents one of two ATPase subunits of the trimeric magnesium chelatase responsible for insertion of magnesium ion into protoporphyrin IX. This is an essential step in the biosynthesis of both chlorophyll and bacteriochlorophyll. This subunit is found in green plants, photosynthetic algae, cyanobacteria and other photosynthetic bacteria. [Biosynthesis of cofactors, prosthetic groups, and carriers, Chlorophyll and bacteriochlorphyll]	337
273935	TIGR02031	BchD-ChlD	magnesium chelatase ATPase subunit D. This model represents one of two ATPase subunits of the trimeric magnesium chelatase responsible for insertion of magnesium ion into protoporphyrin IX. This is an essential step in the biosynthesis of both chlorophyll and bacteriochlorophyll. This subunit is found in green plants, photosynthetic algae, cyanobacteria and other photosynthetic bacteria. Unlike subunit I (TIGR02030), this subunit is not found in archaea. [Biosynthesis of cofactors, prosthetic groups, and carriers, Chlorophyll and bacteriochlorphyll]	589
273936	TIGR02032	GG-red-SF	geranylgeranyl reductase family. This model represents a subfamily which includes geranylgeranyl reductases involved in chlorophyll and bacteriochlorophyll biosynthesis as well as other related enzymes which may also act on geranylgeranyl groups or related substrates. [Biosynthesis of cofactors, prosthetic groups, and carriers, Chlorophyll and bacteriochlorphyll]	295
273937	TIGR02033	D-hydantoinase	D-hydantoinase. This model represents the D-hydantoinase (dihydropyrimidinase) which primarily converts 5,6-dihydrouracil to 3-ureidopropanoate but also acts on dihydrothymine and hydantoin. The enzyme is a metalloenzyme.	454
213679	TIGR02034	CysN	sulfate adenylyltransferase, large subunit. Metabolic assimilation of sulfur from inorganic sulfate, requires sulfate activation by coupling to a nucleoside, for the production of high-energy nucleoside phosphosulfates. This pathway appears to be similar in all prokaryotic organisms. Activation is first achieved through sulfation of sulfate with ATP by sulfate adenylyltransferase (ATP sulfurylase) to produce 5'-phosphosulfate (APS), coupled by GTP hydrolysis. Subsequently, APS is phosphorylated by an APS kinase to produce 3'-phosphoadenosine-5'-phosphosulfate (PAPS). In Escherichia coli, ATP sulfurylase is a heterodimer composed of two subunits encoded by cysD and cysN, with APS kinase encoded by cysC. These genes are located in a unidirectionally transcribed gene cluster, and have been shown to be required for the synthesis of sulfur-containing amino acids. Homologous to this E.coli activation pathway are nodPQH gene products found among members of the Rhizobiaceae family. These gene products have been shown to exhibit ATP sulfurase and APS kinase activity, yet are involved in Nod factor sulfation, and sulfation of other macromolecules. With members of the Rhizobiaceae family, nodQ often appears as a fusion of cysN (large subunit of ATP sulfurase) and cysC (APS kinase). [Central intermediary metabolism, Sulfur metabolism]	406
211710	TIGR02035	D_Ser_am_lyase	D-serine ammonia-lyase. This family consists of D-serine ammonia-lyase (EC 4.3.1.18), a pyridoxal-phosphate enzyme that converts D-serine to pyruvate and NH3. This enzyme is also called D-serine dehydratase and D-serine deaminase and was previously designated EC 4.2.1.14. It is homologous to an enzyme that acts on threonine and may itself act weakly on threonine. [Energy metabolism, Amino acids and amines]	431
131091	TIGR02036	dsdC	D-serine deaminase transcriptional activator. This family, part of the LysR family of transcriptional regulators, activates transcription of the gene for D-serine deaminase, dsdA. Trusted members of this family so far are found adjacent to dsdA and only in Gammaproteobacteria, including E. coli, Vibrio cholerae, and Colwellia psychrerythraea. [Regulatory functions, DNA interactions]	302
273938	TIGR02037	degP_htrA_DO	periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures. [Protein fate, Protein folding and stabilization, Protein fate, Degradation of proteins, peptides, and glycopeptides]	428
273939	TIGR02038	protease_degS	periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of PDZ domain (in contrast to DegP with two copies). A critical role of this DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E). [Protein fate, Degradation of proteins, peptides, and glycopeptides, Regulatory functions, Protein interactions]	351
131094	TIGR02039	CysD	sulfate adenylyltransferase, small subunit. Metabolic assimilation of sulfur from inorganic sulfate, requires sulfate activation by coupling to a nucleoside, for the production of high-energy nucleoside phosphosulfates. This pathway appears to be similar in all prokaryotic organisms. Activation is first achieved through sulfation of sulfate with ATP by sulfate adenylyltransferase (ATP sulfurylase) to produce 5'-phosphosulfate (APS), coupled by GTP hydrolysis. Subsequently, APS is phosphorylated by an APS kinase to produce 3'-phosphoadenosine-5'-phosphosulfate (PAPS). In Escherichia coli, ATP sulfurylase is a heterodimer composed of two subunits encoded by cysD and cysN, with APS kinase encoded by cysC. These genes are located in a unidirectionally transcribed gene cluster, and have been shown to be required for the synthesis of sulfur-containing amino acids. Homologous to this E.coli activation pathway are nodPQH gene products found among members of the Rhizobiaceae family. These gene products have been shown to exhibit ATP sulfurase and APS kinase activity, yet are involved in Nod factor sulfation, and sulfation of other macromolecules. [Central intermediary metabolism, Sulfur metabolism]	294
273940	TIGR02040	PpsR-CrtJ	transcriptional regulator PpsR. This model represents the transcriptional regulator PpsR which is strictly associated with photosynthetic proteobacteria and found in photosynthetic operons. PpsR has been reported to be a repressor. These proteins contain a Helix-Turn_Helix motif of the "fis" type (pfam02954). [Energy metabolism, Photosynthesis, Regulatory functions, DNA interactions]	442
273941	TIGR02041	CysI	sulfite reductase (NADPH) hemoprotein, beta-component. Sulfite reductase (NADPH) catalyzes a six electron reduction of sulfite to sulfide in prokaryotic organisms. It is a complex oligomeric enzyme composed of two different peptides with a subunit composition of alpha(8)-beta(4). The alpha component, encoded by cysJ, is a flavoprotein containing both FMN and FAD, while the beta component, encoded by cysI, is a siroheme, iron-sulfur protein. In Salmonella typhimurium and Escherichia coli, both the alpha and beta subunits of sulfite reductase are located in a unidirectional gene cluster along with phosphoadenosine phosphosulfate reductase, which catalyzes a two step reduction of PAPS to give free sulfite. In cyanobacteria and plant species, sulfite reductase ferredoxin (EC 1.8.7.1) catalyzes the reduction of sulfite to sulfide. [Central intermediary metabolism, Sulfur metabolism]	541
131097	TIGR02042	sir	ferredoxin-sulfite reductase. Distantly related to the iron-sulfur hemoprotein of sulfite reductase (NADPH) found in Proteobacteria and Eubacteria, sulfite reductase (ferredoxin) is a cyanobacterial and plant monomeric enzyme that also catalyzes the reduction of sulfite to sulfide. [Central intermediary metabolism, Sulfur metabolism]	577
131098	TIGR02043	ZntR	Zn(II)-responsive transcriptional regulator. This model represents the zinc and cadmium (II) responsive transcriptional activator of the gamma proteobacterial zinc efflux system. This protein is a member of the MerR family of transcriptional activators (pfam00376) and contains a distinctive pattern of cysteine residues in its metal binding loop, Cys-Cys-X(8-9)-Cys, as well as a conserved and critical cysteine at the N-terminal end of the dimerization helix. [Regulatory functions, DNA interactions]	131
131099	TIGR02044	CueR	Cu(I)-responsive transcriptional regulator. This model represents the copper-, silver- and gold- (I) responsive transcriptional activator of the gamma proteobacterial copper efflux system. This protein is a member of the MerR family of transcriptional activators (pfam00376) and contains a distinctive pattern of cysteine residues in its metal binding loop, Cys-X7-Cys. This family also lacks a conserved cysteine at the N-terminal end of the dimerization helix which is required for the binding of divalent metals such as zinc; here it is replaced by a serine residue. [Regulatory functions, DNA interactions]	127
131100	TIGR02045	P_fruct_ADP	ADP-specific phosphofructokinase. Phosphofructokinase is a key enzyme of glycolysis. The phosphate group donor for different subtypes of phosphofructokinase can be ATP, ADP, or pyrophosphate. This family consists of ADP-dependent phosphofructokinases. Members are more similar to ADP-dependent glucokinases (excluded from this family) than to other phosphofructokinases. [Energy metabolism, Glycolysis/gluconeogenesis]	446
131101	TIGR02046	sdhC_b558_fam	succinate dehydrogenase (or fumarate reductase) cytochrome b subunit, b558 family. This family consists of the succinate dehydrogenase subunit C of Bacillus subtilis, designated cytochrome b-558, and related sequences that include a fumarate reductase subunit C. This subfamily is only weakly similar to the main group of succinate dehydrogenase cytochrome b subunits described by pfam01127, so that some members score above the gathering threshold and some do not. [Energy metabolism, TCA cycle]	214
131102	TIGR02047	CadR-PbrR	Cd(II)/Pb(II)-responsive transcriptional regulator. This model represents the cadmium(II) and/or lead(II) responsive transcriptional activator of the proteobacterial metal efflux system. This protein is a member of the MerR family of transcriptional activators (pfam00376) and contains a distinctive pattern of cysteine residues in its metal binding loop, Cys-X(6-9)-Cys, as well as a conserved and critical cysteine at the N-terminal end of the dimerization helix. [Regulatory functions, DNA interactions]	127
131103	TIGR02048	gshA_cyano	glutamate--cysteine ligase, cyanobacterial, putative. This family consists of proteins believed (see Copley SD, Dhillon JK, 2002) to be the glutamate--cysteine ligases of several cyanobacteria, which are known to make glutathione. [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs]	376
273942	TIGR02049	gshA_ferroox	glutamate--cysteine ligase, T. ferrooxidans family. This family consists of a rare family of glutamate--cysteine ligases, demonstrated first in Thiobacillus ferrooxidans and present in a few other Proteobacteria. It is the first of two enzymes for glutathione biosynthesis. It is also called gamma-glutamylcysteine synthetase. [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs]	403
273943	TIGR02050	gshA_cyan_rel	carboxylate-amine ligase, YbdK family. This family represents a division of a larger family, the other branch of which is predicted to act as glutamate--cysteine ligase (the first of two enzymes in glutathione biosynthesis) in the cyanobacteria. Species containing this protein, however, are generally not believed to make glutathione, and the function is unknown.	287
131106	TIGR02051	MerR	Hg(II)-responsive transcriptional regulator. This model represents the mercury (II) responsive transcriptional activator of the mer organomercurial resistance operon. This protein is a member of the MerR family of transcriptional activators (pfam00376) and contains a distinctive pattern of cysteine residues in its metal binding loop, Cys-X(8)-Cys-Pro, as well as a conserved and critical cysteine at the N-terminal end of the dimerization helix. [Cellular processes, Detoxification, Regulatory functions, DNA interactions]	124
131107	TIGR02052	MerP	mercuric transport protein periplasmic component. This model represents the periplasmic mercury (II) binding protein of the bacterial mercury detoxification system which passes mercuric ion to the MerT transporter for subsequent reduction to Hg(0) by the mercuric reductase MerA. MerP contains a distinctive GMTCXXC motif associated with metal binding. MerP is related to a larger family of metal binding proteins (pfam00403). [Cellular processes, Detoxification]	92
273944	TIGR02053	MerA	mercury(II) reductase. This model represents the mercuric reductase found in the mer operon for the detoxification of mercury compounds. MerA is a FAD-containing flavoprotein which reduces Hg(II) to Hg(0) utilizing NADPH. [Cellular processes, Detoxification]	463
131109	TIGR02054	MerD	mercuric resistance transcriptional repressor protein MerD. This model represents a transcriptional repressor protein of the MerR family (pfam00376) whose expression is regulated by the mercury-sensitive transcriptional activator, MerR. MerD has been shown to repress the transcription of the mer operon. [Cellular processes, Detoxification]	120
273945	TIGR02055	APS_reductase	thioredoxin-dependent adenylylsulfate APS reductase. This model describes recently identified adenosine 5'-phosphosulfate (APS) reductase activity found in sulfate-assimilatory prokaryotes, thus separating it from the traditionally described phosphoadenosine 5'-phosphosulfate (PAPS) reductases found in bacteria and fungi. Homologous to PAPS reductase in enterobacteria, cyanobacteria, and yeast, APS reductase here clusters with, and demonstrates greater homology to plant APS reductase. Additionally, the presence of two conserved C-terminal motifs (CCXXRKXXPL & SXGCXXCT) distinguishes APS substrate specificity and serves as a FeS cluster. [Central intermediary metabolism, Sulfur metabolism]	191
131111	TIGR02056	ChlG	chlorophyll synthase, ChlG. This model represents the strictly cyanobacterial and plant-specific chlorophyll synthase ChlG. ChlG is the enzyme (esterase) which attaches the side chain moiety onto chlorophyllide a. Both geranylgeranyl and phytyl pyrophosphates are substrates to varying degrees in enzymes from different sources. Thus, ChlG may act as the final or penultimate step in chlorophyll biosynthesis (along with the geranylgeranyl reductase, ChlP). [Biosynthesis of cofactors, prosthetic groups, and carriers, Chlorophyll and bacteriochlorphyll]	306
131112	TIGR02057	PAPS_reductase	phosphoadenosine phosphosulfate reductase, thioredoxin dependent. Requiring thioredoxin as an electron donor, phosphoadenosine phosphosulfate reductase catalyzes the reduction of 3'-phosphoadenylylsulfate (PAPS) to sulfite and phospho-adenosine-phosphate (PAP). Found in enterobacteria, cyanobacteria, and yeast, PAPS reductase is related to a group of plant (TIGR00424) and bacterial (TIGR02055) enzymes preferring 5'-adenylylsulfate (APS) over PAPS as a substrate for reduction to sulfite. [Central intermediary metabolism, Sulfur metabolism]	226
131113	TIGR02058	lin0512_fam	conserved hypothetical protein. This family consists of few members, broadly distributed. It occurs so far in several Firmicutes (twice in Oceanobacillus), one Cyanobacterium, one alpha Proteobacterium, and (with a long prefix) in plants. The function is unknown. The alignment includes a perfectly conserved motif GxGxDxHG near the N-terminus. [Hypothetical proteins, Conserved]	116
131114	TIGR02059	swm_rep_I	cyanobacterial long protein repeat. This domain appears in 29 copies in a large (>10000 amino acid) protein in Synechococcus sp. WH8102 associated with a novel flagellar system, as one of three different repeats. Similar domains are found in two different large (<3500) proteins of Synechocystis PCC6803.	101
131115	TIGR02060	aprB	adenosine phosphosulphate reductase, beta subunit. During dissimilatory sulfate reduction and sulfur oxidation, adenylylsulfate (APS) reductase catalyzes reversibly the two-electron reduction of APS to sulfite and AMP. Found in several bacterial lineages and in Archaeoglobales, APS reductase is a heterodimer composed of an alpha subunit containing a noncovalently bound FAD, and a beta subunit containing two [4Fe-4S] clusters. Described by this model is the beta subunit of APS reductase, sharing common evolutionary origin with other iron-sulfur cluster-binding proteins. [Central intermediary metabolism, Sulfur metabolism]	132
273946	TIGR02061	aprA	adenosine phosphosulphate reductase, alpha subunit. During dissimilatory sulfate reduction or sulfur oxidation, adenylylsulfate (APS) reductase catalyzes reversibly the two-electron reduction of APS to sulfite and AMP. Found in several bacterial lineages and in Archaeoglobales, APS reductase is a heterodimer composed of an alpha subunit containing a noncovalently bound FAD, and a beta subunit containing two [4Fe-4S] clusters. Described by this model is the alpha subunit of APS reductase, sharing common evolutionary origin with fumarate reductase/succinate dehydrogenase flavoproteins. [Central intermediary metabolism, Sulfur metabolism]	614
131117	TIGR02062	RNase_B	exoribonuclease II. This family consists of exoribonuclease II, the product of the rnb gene, as found in a number of gamma proteobacteria. In Escherichia coli, it is one of eight different exoribonucleases. It is involved in mRNA degradation and tRNA precursor end processing. [Transcription, Degradation of RNA]	639
273947	TIGR02063	RNase_R	ribonuclease R. This family consists of an exoribonuclease, ribonuclease R, also called VacB. It is one of the eight exoribonucleases reported in E. coli and is broadly distributed throughout the bacteria. In E. coli, double mutants of this protein and polynucleotide phosphorylase are not viable. Scoring between trusted and noise cutoffs to the model are shorter, divergent forms from the Chlamydiae, and divergent forms from the Campylobacterales (including Helicobacter pylori) and Leptospira interrogans. [Transcription, Degradation of RNA]	709
273948	TIGR02064	dsrA	sulfite reductase, dissimilatory-type alpha subunit. Dissimilatory sulfite reductase catalyzes the six-electron reduction of sulfite to sulfide, as the terminal reaction in dissimilatory sulfate reduction. It remains unclear however, whether trithionate and thiosulfate serve as intermediate compounds to sulfide, or as end products of sulfite reduction. Sulfite reductase is a multisubunit enzyme composed of dimers of either alpha/beta or alpha/beta/gamma subunits, each containing a siroheme and iron sulfur cluster prosthetic center. Found in sulfate-reducing bacteria, these genes are commonly located in an unidirectional gene cluster. This model describes the alpha subunit of sulfite reductase. [Central intermediary metabolism, Sulfur metabolism]	402
131120	TIGR02065	ECX1	archaeal exosome-like complex exonuclease 1. This family contains the archaeal protein orthologous to the eukaryotic exosome protein Rrp41. It is somewhat more distantly related to the bacterial protein ribonuclease PH. An exosome-like complex has been demonstrated experimentally for the Archaea in Sulfolobus solfataricus, so members of this family are designated exosome complex exonuclease 1, after usage in SwissProt. [Transcription, Degradation of RNA]	230
131121	TIGR02066	dsrB	sulfite reductase, dissimilatory-type beta subunit. Dissimilatory sulfite reductase catalyzes the six-electron reduction of sulfite to sulfide, as the terminal reaction in dissimilatory sulfate reduction. It remains unclear, however, whether trithionate and thiosulfate serve as intermediate compounds to sulfide, or as end products of sulfite reduction. Sulfite reductase is a multisubunit enzyme composed of dimers of either alpha/beta or alpha/beta/gamma subunits, each containing a siroheme and iron sulfur cluster prosthetic center. Found in sulfate-reducing bacteria, these genes are commonly located in a unidirectional gene cluster. This model describes the beta subunit of sulfite reductase. [Central intermediary metabolism, Sulfur metabolism]	341
273949	TIGR02067	his_9_HisN	histidinol-phosphatase, inositol monophosphatase family. This subfamily belongs to the inositol monophosphatase family (pfam00459). The members of this family consist of no more than one per species and are found only in species in which histidine is synthesized de novo but no histidinol phosphatase can be found in either of the two described families (TIGR01261, TIGR01856). In at least one species, the member of this family is found near known histidine biosynthesis genes. The role as histidinol-phosphatase was first proven in Corynebacterium glutamicum. [Amino acid biosynthesis, Histidine family]	251
273950	TIGR02068	cya_phycin_syn	cyanophycin synthetase. Cyanophycin is an insoluble storage polymer for carbon, nitrogen, and energy, found in most Cyanobacteria. The polymer has a backbone of L-aspartic acid, with most Asp side chain carboxyl groups attached to L-arginine. The polymer is made by this enzyme, cyanophycin synthetase, and degraded by cyanophycinase. Heterologously expressed cyanophycin synthetase in E. coli produces a closely related, water-soluble polymer with some Arg replaced by Lys. It is unclear whether enzymes that produce soluble cyanophycin-like polymers in vivo in non-Cyanobacterial species should be designated as cyanophycin synthetase itself or as a related enzyme. This model makes the designation as cyanophycin synthetase. Cyanophycin synthesis is analogous to polyhydroxyalkanoic acid (PHA) biosynthesis, except that PHA polymers lack nitrogen and may be made under nitrogen-limiting conditions. [Cellular processes, Biosynthesis of natural products]	864
131124	TIGR02069	cyanophycinase	cyanophycinase. This model describes both cytosolic and extracellular cyanophycinases. The former are part of a system in many Cyanobacteria and a few other species of generating and later utilizing a storage polymer for nitrogen, carbon, and energy, called cyanophycin. The latter are found in species such as Pseudomonas anguilliseptica that can use external cyanophycin. The polymer has a backbone of L-aspartic acid, with most Asp side chain carboxyl groups attached to L-arginine. [Energy metabolism, Other]	250
273951	TIGR02070	mono_pep_trsgly	monofunctional biosynthetic peptidoglycan transglycosylase. This family is one of the transglycosylases involved in the late stages of peptidoglycan biosynthesis. Members tend to be small, about 240 amino acids in length, and consist almost entirely of a domain described by pfam00912 for transglycosylases. Species with this protein will have several other transglycosylases as well. All species with this protein are Proteobacteria that produce murein (peptidoglycan). [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan]	224
273952	TIGR02071	PBP_1b	penicillin-binding protein 1B. Bacteria that synthesize a cell wall of peptidoglycan (murein) generally have several transglycosylases and transpeptidases for the task. This family consists of a particular bifunctional transglycosylase/transpeptidase in E. coli and other Proteobacteria, designated penicillin-binding protein 1B. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan]	730
273953	TIGR02072	BioC	malonyl-acyl carrier protein O-methyltransferase BioC. This enzyme, which is found in biotin biosynthetic gene clusters in proteobacteria, firmicutes, green-sulfur bacteria, fusobacterium and bacteroides, carries out an enzymatic step prior to the formation of pimeloyl-CoA, namely O-methylation of the malonyl group preferentially while on acyl carrier protein. The enzyme is recognizable as a methyltransferase by homology. [Biosynthesis of cofactors, prosthetic groups, and carriers, Biotin]	240
273954	TIGR02073	PBP_1c	penicillin-binding protein 1C. This subfamily of the penicillin binding proteins includes the member from E. coli designated penicillin-binding protein 1C. Members have both transglycosylase and transpeptidase domains and are involved in forming cross-links in the late stages of peptidoglycan biosynthesis. All members of this subfamily are presumed to have the same basic function. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan]	727
273955	TIGR02074	PBP_1a_fam	penicillin-binding protein, 1A family. Bacteria that synthesize a cell wall of peptidoglycan (murein) generally have several transglycosylases and transpeptidases for the task. This family consists of bifunctional transglycosylase/transpeptidase penicillin-binding proteins (PBP). In the Proteobacteria, this family includes PBP 1A but not the paralogous PBP 1B (TIGR02071). This family also includes related proteins, often designated PBP 1A, from other bacterial lineages. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan]	531
213681	TIGR02075	pyrH_bact	uridylate kinase. This protein, also called UMP kinase, converts UMP to UDP by adding a phosphate from ATP. It is the first step in pyrimidine biosynthesis. GTP is an allosteric activator. In a large fraction of all bacterial genomes, the gene tends to be located immediately downstream of elongation factor Ts and upstream of ribosome recycling factor. A related protein family, believed to be equivalent in function and found in the archaea and in spirochetes, is described by a separate model, TIGR02076. [Purines, pyrimidines, nucleosides, and nucleotides, Nucleotide and nucleoside interconversions]	232
273956	TIGR02076	pyrH_arch	uridylate kinase, putative. This family consists of the archaeal and spirochete proteins most closely related to bacterial uridylate kinases (TIGR02075), an enzyme involved in pyrimidine biosynthesis. Members are likely, but not known, to be functionally equivalent to their bacterial counterparts. However, substantial sequence differences suggest that regulatory mechanisms may be different; the bacterial form is allosterically regulated by GTP. [Purines, pyrimidines, nucleosides, and nucleotides, Nucleotide and nucleoside interconversions]	221
200156	TIGR02077	thr_lead_pep	thr operon leader peptide. This family consists of examples of the threonine biosynthesis (thr) operon leader peptide, also called the thr operon attenuator. The small gene for this peptide is often missed in genome annotation. It should be looked for in genomes of the proteobacteria, immediately upstream of genes for threonine biosynthesis, typically aspartokinase I/homoserine dehydrogenase, homoserine kinase, and threonine synthase. Transcription of the rest of the Thr operon is attenuated (mostly turned off) unless the ribosome pauses during a stretch of the leader sequence rich in both Ile (made from Thr) and in Thr itself because of the scarcity of those amino acids at the time. The leader peptide itself, once made, may have no role other than to be degraded. Similar systems exist for some other amino acid biosynthetic operons, such as Trp. [Amino acid biosynthesis, Aspartate family]	24
131133	TIGR02078	AspKin_pair	Pyrococcus aspartate kinase subunit, putative. This family consists of proteins restricted to and found as paralogous pairs (typically close together) in species of Pyrococcus, a hyperthermophilic archaeal genus. Members are always found close to other genes of threonine biosynthesis and appear to represent the Pyrococcal form of aspartate kinase. Alignment to aspartokinase III from E. coli shows that 300 N-terminal and 20 C-terminal amino acids are homologous, but the form in Pyrococcus lacks ~ 100 amino acids in between. [Amino acid biosynthesis, Aspartate family]	327
273957	TIGR02079	THD1	threonine dehydratase. This model represents threonine dehydratase, the first step in the pathway converting threonine into isoleucine. At least two other clades of biosynthetic threonine dehydratases have been characterized by models (TIGR01124 and TIGR01127). Those sequences described by this model are exclusively found in species containing the rest of the isoleucine pathway and which are generally lacking in members of those other two clades of threonine dehydratases. Members of this clade are also often gene clustered with other elements of the isoleucine pathway. [Amino acid biosynthesis, Pyruvate family]	409
131135	TIGR02080	O_succ_thio_ly	O-succinylhomoserine (thiol)-lyase. This family consists of O-succinylhomoserine (thiol)-lyase, one of three different enzymes designated cystathionine gamma-synthase and involved in methionine biosynthesis. In all three cases, sulfur is added by transsulfuration from Cys to yield cystathionine rather than by a sulfhydrylation step that uses H2S directly and bypasses cystathionine. [Amino acid biosynthesis, Aspartate family]	382
273958	TIGR02081	metW	methionine biosynthesis protein MetW. In many species, this protein is found alongside MetX, the enzyme that acylates homoserine as a first step toward methionine biosynthesis. It appears to act in methionine biosynthesis but is not fully characterized. [Amino acid biosynthesis, Aspartate family]	194
273959	TIGR02082	metH	5-methyltetrahydrofolate--homocysteine methyltransferase. This family represents 5-methyltetrahydrofolate--homocysteine methyltransferase (EC 2.1.1.13), one of at least three different enzymes able to convert homocysteine to methionine by transferring a methyl group onto the sulfur atom. It is also called the vitamin B12 (or cobalamin)-dependent methionine synthase. Other methionine synthases include 5-methyltetrahydropteroyltriglutamate--homocysteine S-methyltransferase (MetE, EC 2.1.1.14, the cobalamin-independent methionine synthase) and betaine-homocysteine methyltransferase. [Amino acid biosynthesis, Aspartate family]	1181
131138	TIGR02083	LEU2	3-isopropylmalate dehydratase, large subunit. Homoaconitase, aconitase, and 3-isopropylmalate dehydratase have similar overall structures. All are dehydratases (EC 4.2.1.-) and bind a Fe-4S iron-sulfur cluster. 3-isopropylmalate dehydratase is split into large (leuC) and small (leuD) chains in eubacteria. Several pairs of archaeal proteins resemble the leuC and leuD pair in length and sequence but even more closely resemble the respective domains of homoaconitase, and their identity is uncertain. These homologs are described by a separate model of subfamily (rather than equivalog) homology type (TIGR01343). This model along with TIGR00170 describe clades which consist only of LeuC sequences. Here, the genes from Pyrococcus furiosus, Clostridium acetobutylicum, Thermotoga maritima and others are gene clustered with related genes from the leucine biosynthesis pathway. [Amino acid biosynthesis, Pyruvate family]	419
131139	TIGR02084	leud	3-isopropylmalate dehydratase, small subunit. Homoaconitase, aconitase, and 3-isopropylmalate dehydratase have similar overall structures. All are dehydratases (EC 4.2.1.-) and bind a Fe-4S iron-sulfur cluster. 3-isopropylmalate dehydratase is split into large (leuC) and small (leuD) chains in eubacteria. Several pairs of archaeal proteins resemble the leuC and leuD pair in length and sequence but even more closely resemble the respective domains of homoaconitase, and their identity is uncertain. The members of the seed for this model are those sequences which are gene clustered with other genes involved in leucine biosynthesis and include some archaea. [Amino acid biosynthesis, Pyruvate family]	156
131140	TIGR02085	meth_trns_rumB	23S rRNA (uracil-5-)-methyltransferase RumB. This family consists of RNA methyltransferases designated RumB, formerly YbjF. Members act on 23S rRNA U747 and the equivalent position in other proteobacterial species. This family is homologous to the other 23S rRNA methyltransferase RumA and to the tRNA methyltransferase TrmA. [Protein synthesis, tRNA and rRNA base modification]	374
273960	TIGR02086	IPMI_arch	3-isopropylmalate dehydratase, large subunit. This subfamily is a subset of the larger HacA family (Homoaconitate hydratase family, TIGR01343) and is most closely related to the 3-isopropylmalate dehydratase, large subunits which form TIGR00170. This subfamily includes the members of TIGR01343 which are gene clustered with other genes of leucine biosynthesis. The rest of the subfamily includes mainly archaeal species which exhibit two hits to this model. In these cases it is possible that one or the other of the hits does not have a 3-isopropylmalate dehydratase activity but rather one of the other related aconitase-like activities.	413
273961	TIGR02087	LEUD_arch	3-isopropylmalate dehydratase, small subunit. This subfamily is most closely related to the 3-isopropylmalate dehydratase, small subunits which form TIGR00171. This subfamily includes the members of TIGR02084 which are gene clustered with other genes of leucine biosynthesis. The rest of the subfamily includes mainly archaeal species which exhibit two hits to this model. In these cases it is possible that one or the other of the hits does not have a 3-isopropylmalate dehydratase activity but rather one of the other related aconitase-like activities.	154
273962	TIGR02088	LEU3_arch	isopropylmalate/isohomocitrate dehydrogenases. This model represents a group of archaeal decarboxylating dehydrogenases which include the leucine biosynthesis enzyme 3-isopropylmalate dehydrogenase (LeuB, LEU3) and the methanogenic cofactor CoB biosynthesis enzyme isohomocitrate dehydrogenase (AksF). Both of these have been characterized in Methanococcus jannaschii. Non-methanogenic archaea have only one hit to this model and presumably this is LeuB, although phylogenetic trees cannot establish which gene is which in the methanogens. The AksF gene is capable of acting on isohomocitrate, iso(homo)2-citrate and iso(homo)3-citrate in the successive elongation cycles of coenzyme B (7-mercaptoheptanoyl-threonine phosphate). This family is closely related to both the LeuB genes found in TIGR00169 and the mitochondrial eukaryotic isocitrate dehydrogenases found in TIGR00175. All of these are included within the broader subfamily model, pfam00180.	322
273963	TIGR02089	TTC	tartrate dehydrogenase. Tartrate dehydrogenase catalyzes the oxidation of both meso- and (+)-tartrate as well as D-malate. These enzymes are closely related to the 3-isopropylmalate and isohomocitrate dehydrogenases found in TIGR00169 and TIGR02088, respectively. [Energy metabolism, Other]	352
273964	TIGR02090	LEU1_arch	isopropylmalate/citramalate/homocitrate synthases. Methanogenic archaea contain three closely related homologs of the 2-isopropylmalate synthases (LeuA) represented by TIGR00973. Two of these in Methanococcus jannaschii (MJ1392 - CimA; MJ0503 - AksA) have been characterized as catalyzing alternative reactions, leaving the third (MJ1195) as the presumptive LeuA enzyme. CimA is citramalate (2-methylmalate) synthase, which condenses acetyl-CoA with pyruvate. This enzyme is believed to be involved in the biosynthesis of isoleucine in methanogens and possibly other species lacking threonine dehydratase. AksA is a homocitrate synthase which also produces (homo)2-citrate and (homo)3-citrate in the biosynthesis of Coenzyme B, which is restricted solely to methanogenic archaea. Methanogens, then, should and apparently do contain all three of these enzymes. Unfortunately, phylogenetic trees do not resolve into three unambiguous clades, making assignment of function to particular genes problematic. Other archaea which lack a threonine dehydratase (mainly Euryarchaeota) should contain both a CimA and a LeuA gene. This is true of, for example, Archaeoglobus fulgidus, but not for the Pyrococci, which have none in this clade, but one in TIGR00973 and one in TIGR00977 which may fulfill these roles. Other species which have only one hit to this model and lack threonine dehydratase are very likely LeuA enzymes.	363
273965	TIGR02091	glgC	glucose-1-phosphate adenylyltransferase. This enzyme, glucose-1-phosphate adenylyltransferase, is also called ADP-glucose pyrophosphorylase. The plant form is an alpha2,beta2 heterodimer, allosterically regulated in plants. Both subunits are homologous and included in this model. In bacteria, both homomeric forms of GlgC and more active heterodimers of GlgC and GlgD have been described. This model describes the GlgC subunit only. This enzyme appears in variants of glycogen synthesis pathways that use ADP-glucose, rather than UDP-glucose as in animals. [Energy metabolism, Biosynthesis and degradation of polysaccharides]	361
273966	TIGR02092	glgD	glucose-1-phosphate adenylyltransferase, GlgD subunit. This family is GlgD, an apparent regulatory protein that appears in an alpha2/beta2 heterotetramer with GlgC (glucose-1-phosphate adenylyltransferase, TIGR02091) in a subset of bacteria that use GlgC for glycogen biosynthesis. [Energy metabolism, Biosynthesis and degradation of polysaccharides]	369
273967	TIGR02093	P_ylase	glycogen/starch/alpha-glucan phosphorylases. This family consists of phosphorylases. Members use phosphate to break alpha 1,4 linkages between pairs of glucose residues at the end of long glucose polymers, releasing alpha-D-glucose 1-phosphate. The nomenclature convention is to preface the name according to the natural substrate, as in glycogen phosphorylase, starch phosphorylase, maltodextrin phosphorylase, etc. Name differences among these substrates reflect differences in patterns of branching with alpha 1,6 linkages. Members include allosterically regulated and unregulated forms. A related family, TIGR02094, contains examples known to act well on particularly small alpha 1,4 glucans, as may be found after import from exogenous sources. [Energy metabolism, Biosynthesis and degradation of polysaccharides]	794
273968	TIGR02094	more_P_ylases	alpha-glucan phosphorylases. This family consists of known phosphorylases, and homologs believed to share the function of using inorganic phosphate to cleave an alpha 1,4 linkage between the terminal glucose residue and the rest of the polymer (maltodextrin, glycogen, etc.). The name of the glucose storage polymer substrate, and therefore the name of this enzyme, depends on the chain lengths and branching patterns. A number of the members of this family have been shown to operate on small maltodextrins, as may be obtained by utilization of exogenous sources. This family represents a distinct clade from the related family modeled by TIGR02093/pfam00343.	601
273969	TIGR02095	glgA	glycogen/starch synthase, ADP-glucose type. This family consists of glycogen (or starch) synthases that use ADP-glucose (EC 2.4.1.21), rather than UDP-glucose (EC 2.4.1.11) as in animals, as the glucose donor. This enzyme is found in bacteria and plants. Whether the name given is glycogen synthase or starch synthase depends on context, and therefore on substrate. [Energy metabolism, Biosynthesis and degradation of polysaccharides]	473
273970	TIGR02096	TIGR02096	conserved hypothetical protein, steroid delta-isomerase-related. This family of proteins, about 135 amino acids in length, is largely restricted to the Proteobacteria. This family and a delta5-3-ketosteroid isomerase from Pseudomonas testosteroni appear homologous, especially toward their respective N-termini. Members, therefore, probably are enzymes.	129
131152	TIGR02097	yccV	hemimethylated DNA binding domain. This model describes the small protein from E. coli YccV and its homologs in other Proteobacteria. YccV is now described as a hemimethylated DNA binding protein. The model also describes a domain in longer eukaryotic proteins.	101
131153	TIGR02098	MJ0042_CXXC	MJ0042 family finger-like domain. This domain contains a CXXCX(19)CXXC motif suggestive of both zinc fingers and thioredoxin, usually found at the N-terminus of prokaryotic proteins. One partially characterized gene, agmX, is among a large set in Myxococcus whose interruption affects adventurous gliding motility.	38
273971	TIGR02099	TIGR02099	TIGR02099 family protein. This model describes a family of long proteins, over 1250 amino acids in length and present in the Proteobacteria. The degree of sequence similarity is low between sequences from different genera. Apparent membrane-spanning regions at the N-terminus and C-terminus suggest the protein is inserted into (or exported through) the membrane. [Hypothetical proteins, Conserved]	1260
131155	TIGR02100	glgX_debranch	glycogen debranching enzyme GlgX. This family consists of the GlgX protein from the E. coli glycogen operon and probable equivalogs from other prokaryotic species. GlgX is not required for glycogen biosynthesis, but instead acts as a debranching enzyme for glycogen catabolism. This model distinguishes GlgX from pullulanases and other related proteins that also operate on alpha-1,6-glycosidic linkages. In the wide band between the trusted and noise cutoffs are functionally similar enzymes, mostly from plants, that act similarly but usually are termed isoamylase. [Energy metabolism, Biosynthesis and degradation of polysaccharides]	688
273972	TIGR02101	IpaC_SipC	type III secretion target, IpaC/SipC family. This model represents a family of proteins associated with bacterial type III secretion systems, which are injection machines for virulence factors into host cell cytoplasm. Characterized members of this protein family are known to be secreted and are described as invasins, including IpaC from Shigella flexneri (SP:P18012) and SipC from Salmonella typhimurium (GB:AAA75170.1). Members may be referred to as invasins, pathogenicity island effectors, and cell invasion proteins. [Cellular processes, Pathogenesis]	317
273973	TIGR02102	pullulan_Gpos	pullulanase, extracellular, Gram-positive. Pullulan is an unusual, industrially important polysaccharide in which short alpha-1,4 chains (maltotriose) are connected in alpha-1,6 linkages. Enzymes that cleave alpha-1,6 linkages in pullulan and release maltotriose are called pullulanases although pullulan itself may not be the natural substrate. In contrast, a glycogen debranching enzyme such as GlgX, homologous to this family, can release glucose at alpha-1,6 linkages from glycogen first subjected to limit degradation by phosphorylase. Characterized members of this family include a surface-located pullulanase from Streptococcus pneumoniae and an extracellular bifunctional amylase/pullulanase with C-terminal pullulanase activity.	1111
273974	TIGR02103	pullul_strch	alpha-1,6-glucosidases, pullulanase-type. Members of this protein family include secreted (or membrane-anchored) pullulanases of Gram-negative bacteria and pullulanase-type starch debranching enzymes of plants. Both enzymes hydrolyze alpha-1,6 glycosidic linkages. Pullulan is an unusual, industrially important polysaccharide in which short alpha-1,4 chains (maltotriose) are connected in alpha-1,6 linkages. Enzymes that cleave alpha-1,6 linkages in pullulan and release maltotriose are called pullulanases although pullulan itself may not be the natural substrate. This family is closely homologous to, but architecturally different from, the pullulanases of Gram-positive bacteria (TIGR02102). [Energy metabolism, Biosynthesis and degradation of polysaccharides]	898
273975	TIGR02104	pulA_typeI	pullulanase, type I. Pullulan is an unusual, industrially important polysaccharide in which short alpha-1,4 chains (maltotriose) are connected in alpha-1,6 linkages. Enzymes that cleave alpha-1,6 linkages in pullulan and release maltotriose are called pullulanases although pullulan itself may not be the natural substrate. This family consists of pullulanases related to the subfamilies described in TIGR02102 and TIGR02103 but having a different domain architecture with shorter sequences. Members are called type I pullulanases.	605
131160	TIGR02105	III_needle	type III secretion apparatus needle protein. Type III secretion systems translocate proteins, usually virulence factors, out across both inner and outer membranes of certain Gram-negative bacteria and further across the plasma membrane and into the cytoplasm of the host cell. This protein, termed YscF in Yersinia, and EscF, PscF, EprI, etc. in other systems, forms the needle of the injection apparatus. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	72
211715	TIGR02106	cyd_oper_ybgT	cyd operon protein YbgT. This model describes a very small (as short as 33 amino acids) protein of unknown function, essentially always found in an operon with CydAB, subunits of the cytochrome d terminal oxidase. It begins with an aromatic motif MWYFXW and appears to contain a membrane-spanning helix. This protein appears to be restricted to the Proteobacteria and exist in a single copy only. We suggest it may be a membrane subunit of the terminal oxidase. The family is named after the E. coli member YbgT (SP|P56100). This model excludes the apparently related protein YccB (SP|P24244). [Energy metabolism, Electron transport]	30
273976	TIGR02107	PQQ_syn_pqqA	coenzyme PQQ precursor peptide PqqA. This model describes a very small protein, coenzyme PQQ biosynthesis protein A, which is smaller than 25 amino acids in many species. It is proposed to serve as a peptide precursor of coenzyme pyrrolo-quinoline-quinone (PQQ), with Glu and Tyr of a conserved motif Glu-Xxx-Xxx-Xxx-Tyr becoming part of the product. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	21
273977	TIGR02108	PQQ_syn_pqqB	coenzyme PQQ biosynthesis protein B. This model describes coenzyme PQQ biosynthesis protein B, a gene required for the biosynthesis of pyrrolo-quinoline-quinone (coenzyme PQQ). PQQ is required for some glucose dehydrogenases and alcohol dehydrogenases. Note that this gene appears to be required for PQQ biosynthesis in Methylobacterium extorquens (under the name pqqG) and in Klebsiella pneumoniae but that the equivalent pqqV in Acinetobacter calcoaceticus is not necessary for heterologous expression of PQQ biosynthesis in E. coli. Based on this latter finding, it is suggested (Goosen, et al. 1989) that PqqB might be a transporter or a PQQ-dependent enzyme rather than a PQQ biosynthesis enzyme. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	302
162708	TIGR02109	PQQ_syn_pqqE	coenzyme PQQ biosynthesis enzyme PqqE. This model describes coenzyme PQQ biosynthesis protein E, a prototypical peptide-cyclizing radical SAM enzyme. It links a Tyr to a Glu as the first step in the biosynthesis of pyrrolo-quinoline-quinone (coenzyme PQQ) from the precursor peptide PqqA. PQQ is required for some glucose dehydrogenases and alcohol dehydrogenases. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	358
273978	TIGR02110	PQQ_syn_pqqF	coenzyme PQQ biosynthesis probable peptidase PqqF. In a subset of species that make coenzyme PQQ (pyrrolo-quinoline-quinone), this probable peptidase is found in the PQQ biosynthesis region and is thought to act as a protease on PqqA (TIGR02107), a probable peptide precursor of the coenzyme. PQQ is required for some glucose dehydrogenases and alcohol dehydrogenases. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	697
131166	TIGR02111	PQQ_syn_pqqC	coenzyme PQQ biosynthesis protein C. This model describes the coenzyme PQQ (pyrrolo-quinoline-quinone) biosynthesis protein PqqC. In contrast to the broader model pfam05312, this model does not include related proteins likely to be functionally distinct from PqqC, such as homologs found in the Chlamydias. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	239
131167	TIGR02112	cyd_oper_ybgE	cyd operon protein YbgE. This model describes a small protein of unknown function, about 100 amino acids in length, essentially always found in an operon with CydAB, subunits of the cytochrome d terminal oxidase. It appears to be an integral membrane protein. It is found so far only in the Proteobacteria. [Energy metabolism, Electron transport]	93
131168	TIGR02113	coaC_strep	phosphopantothenoylcysteine decarboxylase, streptococcal. In most bacteria, a single bifunctional protein catalyses phosphopantothenoylcysteine decarboxylase and phosphopantothenate--cysteine ligase activities, sequential steps in coenzyme A biosynthesis (see TIGR00521). These activities reside in separate proteins encoded by tandem genes in some bacterial lineages. This model describes proteins from the genera Streptococcus and Enterococcus homologous to the N-terminal region of TIGR00521, corresponding to phosphopantothenoylcysteine decarboxylase activity. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pantothenate and coenzyme A]	177
131169	TIGR02114	coaB_strep	phosphopantothenate--cysteine ligase, streptococcal. In most bacteria, a single bifunctional protein catalyses phosphopantothenoylcysteine decarboxylase and phosphopantothenate--cysteine ligase activities, sequential steps in coenzyme A biosynthesis (see TIGR00521). These activities reside in separate proteins encoded by tandem genes in some bacterial lineages. This model describes proteins from the genera Streptococcus and Enterococcus homologous to the C-terminal region of TIGR00521, corresponding to phosphopantothenate--cysteine ligase activity. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pantothenate and coenzyme A]	227
131170	TIGR02115	potass_kdpF	K+-transporting ATPase, KdpF subunit. This model describes a very small integral membrane peptide KdpF, a subunit of the K(+)-translocating Kdp complex. It is found upstream of the KdpA subunit (TIGR00680). Because of its very small size and highly hydrophobic character, it is sometimes missed in genome annotation. [Transport and binding proteins, Cations and iron carrying compounds]	24
131171	TIGR02116	toxin_Txe_YoeB	toxin-antitoxin system, toxin component, Txe/YoeB family. The Axe-Txe pair in Enterococcus faecium and the homologous YefM-YoeB pair in Escherichia coli have been shown to act as an antitoxin-toxin pair. This model describes the toxin component. Nearly every example found is next to an identifiable antitoxin, as indicated by match to models TIGR01552 and/or pfam02604. [Cellular processes, Toxin production and resistance, Mobile and extrachromosomal element functions, Other]	80
273979	TIGR02117	chp_urease_rgn	conserved hypothetical protein. This conserved hypothetical protein of unknown function is found in several Proteobacteria. Its function is unknown and its genome context is not well-conserved. It is found amid urease genes in at least one species. [Hypothetical proteins, Conserved]	208
131173	TIGR02118	TIGR02118	conserved hypothetical protein. This model represents a small family of proteins of unknown function, each about 105 amino acids in length. Conserved sites in the multiple alignment include a pair of aromatic residues, a histidine, and an aspartate. [Hypothetical proteins, Conserved]	100
131174	TIGR02119	panF	sodium/pantothenate symporter. Pantothenate (vitamin B5) is a precursor of coenzyme A and is made from aspartate and 2-oxoisovalerate in most bacteria with completed genome sequences. However, some pathogens must import pantothenate. This model describes PanF, a sodium/pantothenate symporter, from a larger family of Sodium/substrate symporters (pfam00474). Several species that have this transporter appear to lack all enzymes of pantothenate biosynthesis, namely Haemophilus influenzae, Pasteurella multocida, Fusobacterium nucleatum, and Borrelia burgdorferi. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pantothenate and coenzyme A, Transport and binding proteins, Other]	471
273980	TIGR02120	GspF	type II secretion system protein F. This membrane protein is a component of the terminal branch complex of the general secretion pathway (GSP), also known as the "Type II" secretion pathway. The GSP transports proteins (generally virulence-associated cell wall hydrolases) across the outer membrane of the bacterial cell. Transport across the inner membrane is often, but not exclusively, handled by the Sec system. This model was constructed from the broader subfamily model, pfam00482, which includes components of pilin complexes (PilC) as well as other related genes. GspF is nearly always gene clustered with other GSP subunits. Some genes from Xylella and Xanthomonas strains score below the trusted cutoff due to excessive divergence from the family such that a sequence from Deinococcus which does not appear to be GspF scores higher. [Protein fate, Protein and peptide secretion and trafficking]	399
273981	TIGR02121	Na_Pro_sym	sodium/proline symporter. This family consists of the sodium/proline symporter (proline permease) from a number of Gram-negative and Gram-positive bacteria and from the archaeal genus Methanosarcina. Using the related pantothenate permease as an outgroup, candidate sequences from Bifidobacterium longum and several from archaea are found to be outside the clade defined by known proline permeases. These sequences, scoring between 570 and -40, define the range between trusted and noise cutoff scores. [Transport and binding proteins, Amino acids, peptides and amines]	487
273982	TIGR02122	TRAP_TAXI	TRAP transporter solute receptor, TAXI family. This family is one of at least three major families of extracytoplasmic solute receptor (ESR) for TRAP (Tripartite ATP-independent Periplasmic Transporter) transporters. The others are the DctP (TIGR00787) and SmoM (pfam03480) families. These transporters are secondary (driven by an ion gradient) but composed of three polypeptides, although in some species the 4-TM and 12-TM integral membrane proteins are fused. Substrates for this transporter family are not fully characterized but, besides C4 dicarboxylates, may include mannitol and other compounds. [Transport and binding proteins, Unknown substrate]	320
273983	TIGR02123	TRAP_fused	TRAP transporter, 4TM/12TM fusion protein. In some species, the 12-transmembrane spanning and 4-transmembrane spanning components of tripartite ATP-independent periplasmic (TRAP)-type transporters are fused. This model describes such transporters, found in the Archaea and in Bacteria. [Transport and binding proteins, Unknown substrate]	614
273984	TIGR02124	hypE	hydrogenase expression/formation protein HypE. This family contains HypE (or HupE), a protein required for expression of catalytically active hydrogenase in many systems. It appears to be an accessory protein involved in maturation rather than a regulatory protein involved in expression. HypE shows considerable homology to the thiamine-monophosphate kinase ThiL (TIGR01379) and other enzymes.	320
273985	TIGR02125	CytB-hydogenase	Ni/Fe-hydrogenase, b-type cytochrome subunit. This model describes a family of cytochrome b proteins which appear to be specific for nickel-iron hydrogenase complexes. Every genome which contains a member of this family possesses a Ni/Fe hydrogenase according to Genome Properties (GenProp0177), and most are gene clustered with other hydrogenase components. Some Ni/Fe hydrogenase-containing species lack a member of this family but contain other CytB homologs (pfam01292) which may substitute for it.	211
273986	TIGR02126	phgtail_TP901_1	phage major tail protein, TP901-1 family. This family includes the members of pfam06199 but is broader. Characterized members are major tail proteins from various phage, including lactococcal temperate bacteriophage TP901-1. [Mobile and extrachromosomal element functions, Prophage functions]	136
273987	TIGR02127	pyrF_sub2	orotidine 5'-phosphate decarboxylase, subfamily 2. This model represents orotidine 5'-monophosphate decarboxylase, the PyrF protein of pyrimidine nucleotide biosynthesis. See TIGR01740 for a related but distinct subfamily of the same enzyme. [Purines, pyrimidines, nucleosides, and nucleotides, Pyrimidine ribonucleotide biosynthesis]	261
273988	TIGR02128	G6PI_arch	bifunctional phosphoglucose/phosphomannose isomerase. This bifunctional isomerase is a member of the larger PGI superfamily and only distantly related to other glucose-6-phosphate isomerases. The family is limited to the archaea.	308
162719	TIGR02129	hisA_euk	phosphoribosylformimino-5-aminoimidazole carboxamide ribotide isomerase, eukaryotic type. This enzyme acts in the biosynthesis of histidine and has been characterized in S. cerevisiae and Arabidopsis where it complements the E. coli HisA gene. In eukaryotes the gene is known as HIS6. In bacteria, this gene is found in Fibrobacter succinogenes, presumably due to lateral gene transfer from plants in the rumen gut. [Amino acid biosynthesis, Histidine family]	253
131185	TIGR02130	dapB_plant	dihydrodipicolinate reductase. This narrow family includes genes from Arabidopsis and Fibrobacter succinogenes (which probably received the gene from a plant via lateral gene transfer). The sequences are distantly related to the dihydrodipicolinate reductases from archaea. In Fibrobacter this gene is the only candidate DHPR in the genome. [Amino acid biosynthesis, Aspartate family]	275
131186	TIGR02131	phaP_Bmeg	polyhydroxyalkanoic acid inclusion protein PhaP. This model describes a protein found in polyhydroxyalkanoic acid (PHA) gene regions and incorporated into PHA inclusions in Bacillus cereus and Bacillus megaterium. The role of the protein may include amino acid storage (see McCool,G.J. and Cannon,M.C, 1999).	165
131187	TIGR02132	phaR_Bmeg	polyhydroxyalkanoic acid synthase, PhaR subunit. This model describes a protein, PhaR, localized to polyhydroxyalkanoic acid (PHA) inclusion granules in Bacillus cereus and related species. PhaR is required for PHA biosynthesis along with PhaC and may be a regulatory subunit.	189
273989	TIGR02133	RPI_actino	ribose 5-phosphate isomerase. This family is a member of the RpiB/LacA/LacB subfamily (TIGR00689) but lies outside the RpiB equivalog (TIGR01120) which is also a member of that subfamily. Ribose 5-phosphate isomerase is an essential enzyme of the pentose phosphate pathway; a pathway that appears to be present in the actinobacteria. The only candidates for ribose 5-phosphate isomerase in the Actinobacteria are members of this family.	148
131189	TIGR02134	transald_staph	transaldolase. This small family of proteins is a member of the transaldolase subfamily represented by pfam00923. Coxiella and Staphylococcus lack members of the known transaldolase equivalog families and appear to require a transaldolase activity for completion of the pentose phosphate pathway. [Energy metabolism, Pentose phosphate pathway]	236
273990	TIGR02135	phoU_full	phosphate transport system regulatory protein PhoU. This model describes PhoU, a regulatory protein of unknown mechanism for high-affinity phosphate ABC transporter systems. The protein consists of two copies of the domain described by pfam01895. Deletion of PhoU activates constitutive expression of the phosphate ABC transporter and allows phosphate transport, but causes a growth defect and so likely has some second function. [Regulatory functions, Other, Transport and binding proteins, Anions]	212
273991	TIGR02136	ptsS_2	phosphate binding protein. Members of this family are phosphate-binding proteins. Most are found in phosphate ABC-transporter operons, but some are found in phosphate regulatory operons. This model separates members of the current family from the phosphate ABC transporter phosphate binding protein described by TIGRFAMs model TIGR00975. [Transport and binding proteins, Anions]	287
162723	TIGR02137	HSK-PSP	phosphoserine phosphatase/homoserine phosphotransferase bifunctional protein. This protein has been characterized as both a phosphoserine phosphatase and a phosphoserine:homoserine phosphotransferase. In Pseudomonas aeruginosa, where the characterization was done, a second phosphoserine phosphatase (SerB) and a second homoserine kinase (thrB) are found, but in Fibrobacter succinogenes neither is present. This enzyme is a member of the haloacid dehalogenase (HAD) superfamily, specifically part of subfamily IB by virtue of the presence of an alpha helical domain in between motifs I and II of the HAD domain. The closest homologs to this family are monofunctional phosphoserine phosphatases (TIGR00338).	203
273992	TIGR02138	phosphate_pstC	phosphate ABC transporter, permease protein PstC. The typical operon for the high affinity inorganic phosphate ABC transporter encodes an ATP-binding protein, a phosphate-binding protein, and two permease proteins. This family consists of one of the two permease proteins, PstC, which is homologous to PstA (TIGR00974). In the model bacterium Escherichia coli, this transport system is induced when the concentration of extracellular inorganic phosphate is low. A constitutive, lower affinity transporter operates otherwise. [Transport and binding proteins, Anions]	295
131194	TIGR02139	permease_CysT	sulfate ABC transporter, permease protein CysT. This model represents CysT, one of two homologous, tandem permeases in the sulfate ABC transporter system; the other is CysW (TIGR02140). The sulfate transporter has been described in E. coli as transporting sulfate, thiosulfate, selenate, and selenite. Sulfate transporters may also transport molybdate ion if a specific molybdate transporter is not present. [Transport and binding proteins, Anions]	265
162725	TIGR02140	permease_CysW	sulfate ABC transporter, permease protein CysW. This model represents CysW, one of two homologous, tandem permeases in the sulfate ABC transporter system; the other is CysT (TIGR02139). The sulfate transporter has been described in E. coli as transporting sulfate, thiosulfate, selenate, and selenite. Sulfate transporters may also transport molybdate ion if a specific molybdate transporter is not present. [Transport and binding proteins, Anions]	261
273993	TIGR02141	modB_ABC	molybdate ABC transporter, permease protein. This model describes the permease protein, ModB, of the molybdate ABC transporter. This system has been characterized in E. coli, Staphylococcus carnosus, Rhodobacter capsulatus and Azotobacter vinelandii. Molybdate is chemically similar to sulfate, thiosulfate, and selenate. These related substrates, and sometimes molybdate itself, can be transported by the homologous sulfate receptor. Some apparent molybdenum transport operons include a permease related to this ModB, although less similar than some sulfate permease proteins and not included in this model. [Transport and binding proteins, Anions]	208
131197	TIGR02142	modC_ABC	molybdenum ABC transporter, ATP-binding protein. This model represents the ATP-binding cassette (ABC) protein of the three subunit molybdate ABC transporter. The three proteins of this complex are homologous to proteins of the sulfate ABC transporter. Molybdenum may be used in nitrogenases of nitrogen-fixing bacteria and in molybdopterin cofactors. In some cases, molybdate may be transported by a sulfate transporter rather than by a specific molybdate transporter. [Transport and binding proteins, Anions]	354
131198	TIGR02143	trmA_only	tRNA (uracil(54)-C(5))-methyltransferase. This family consists exclusively of proteins believed to act as tRNA (uracil-5-)-methyltransferase. All members so far are proteobacterial. The seed alignment was taken directly from pfam05958 in Pfam 12.0, but higher cutoffs are used to select only functionally equivalent proteins. Homologous proteins excluded by the higher cutoff scores of this model include other uracil methyltransferases, such as RumA, active on rRNA. [Protein synthesis, tRNA and rRNA base modification]	353
273994	TIGR02144	LysX_arch	Lysine biosynthesis enzyme LysX. The family of proteins found in this equivalog includes the characterized LysX from Thermus thermophilus, which is part of a well-organized lysine biosynthesis gene cluster. LysX is believed to carry out an ATP-dependent acylation of the amino group of alpha-aminoadipate in the prokaryotic version of the fungal AAA lysine biosynthesis pathway. No species having a sequence in this equivalog contains the elements of the more common diaminopimelate lysine biosynthesis pathway, and none has been shown to be a lysine auxotroph. These sequences have mainly received the name of the related enzyme, "ribosomal protein S6 modification protein RimK". RimK has been characterized in E. coli, and acts by ATP-dependent condensation of S6 with glutamate residues.	280
273995	TIGR02145	Fib_succ_major	Fibrobacter succinogenes major paralogous domain. This domain of about 175 to 200 amino acids is found, in from one to five copies, in over 50 proteins in Fibrobacter succinogenes S85, an obligate anaerobe of the rumen. Many members of this family have an apparent lipoprotein signal sequence. Conserved cysteine residues, suggestive of disulfide bond formation, are also consistent with an extracytoplasmic location for this domain. This domain can also be found in small numbers of proteins in Chlorobium tepidum and Bacteroides thetaiotaomicron. [Cell envelope, Other]	171
162728	TIGR02146	LysS_fung_arch	homocitrate synthase. This model includes the yeast LYS21 gene which carries out the first step of the alpha-aminoadipate (AAA) lysine biosynthesis pathway. A related pathway is found in Thermus thermophilus. This enzyme is closely related to 2-isopropylmalate synthase (LeuA) and citramalate synthase (CimA), both of which are present in the euryarchaeota. Some archaea have a separate homocitrate synthase (AksA) which also synthesizes longer homocitrate analogs.	344
273996	TIGR02147	Fsuc_second	TIGR02147 family protein. This family consists of the 40 members of a paralogous protein family in the rumen anaerobe Fibrobacter succinogenes S85 and a smaller number in Bdellovibrio bacteriovorus HD100. Member proteins are about 270 residues long and appear to lack signal sequences and transmembrane helices. The only perfectly conserved residue is a glycine in an otherwise poorly conserved region, suggesting members are not enzymes. The family is not characterized. [Hypothetical proteins, Conserved]	271
162730	TIGR02148	Fibro_Slime	fibro-slime domain. This model represents a conserved region of about 90 amino acids, shared in at least 4 distinct large putative proteins from the slime mold Dictyostelium discoideum and 10 proteins from the rumen bacterium Fibrobacter succinogenes, and in no other species so far. We propose here the name fibro-slime domain	90
273997	TIGR02149	glgA_Coryne	glycogen synthase, Corynebacterium family. This model describes Corynebacterium glutamicum GlgA and closely related proteins in several other species. This enzyme is required for glycogen biosynthesis and appears to replace the distantly related TIGR02095 family of ADP-glucose type glycogen synthase in Corynebacterium glutamicum, Mycobacterium tuberculosis, Bifidobacterium longum, and Streptomyces coelicolor. [Energy metabolism, Biosynthesis and degradation of polysaccharides]	388
273998	TIGR02150	IPP_isom_1	isopentenyl-diphosphate delta-isomerase, type 1. This model represents type 1 of two non-homologous families of the enzyme isopentenyl-diphosphate delta-isomerase (IPP isomerase). IPP is an essential building block for many compounds, including enzyme cofactors, sterols, and prenyl groups. This enzyme interconverts isopentenyl diphosphate and dimethylallyl diphosphate. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	158
273999	TIGR02151	IPP_isom_2	isopentenyl-diphosphate delta-isomerase, type 2. Isopentenyl-diphosphate delta-isomerase (IPP isomerase) interconverts isopentenyl diphosphate and dimethylallyl diphosphate. This model represents the type 2 enzyme. FMN, NADPH, and Mg2+ are required by this form, which lacks homology to the type 1 enzyme (TIGR02150). IPP is a precursor to many compounds, including enzyme cofactors, sterols, and isoprenoids. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	333
274000	TIGR02152	D_ribokin_bact	ribokinase. This model describes ribokinase, an enzyme catalyzing the first step in ribose catabolism. The rbsK gene encoding ribokinase typically is found with ribose transport genes. Ribokinase belongs to the carbohydrate kinase pfkB family (pfam00294). In the wide gulf between the current trusted (360 bit) and noise (100 bit) cutoffs are a number of sequences, few of which are clustered with predicted ribose transport genes but many of which are currently annotated as if having ribokinase activity. Most likely some have this function and others do not. [Energy metabolism, Sugars]	293
274001	TIGR02153	gatD_arch	glutamyl-tRNA(Gln) amidotransferase, subunit D. This peptide is found only in the Archaea. It is part of a heterodimer, with GatE (TIGR00134), that acts as an amidotransferase on misacylated Glu-tRNA(Gln) to produce Gln-tRNA(Gln). The analogous amidotransferase found in bacteria is the GatABC system, although GatABC homologs in the Archaea appear to act instead on Asp-tRNA(Asn). [Protein synthesis, tRNA aminoacylation]	404
131209	TIGR02154	PhoB	phosphate regulon transcriptional regulatory protein PhoB. PhoB is a DNA-binding response regulator protein acting with PhoR in a 2-component system responding to phosphate ion. PhoB acts as a positive regulator of gene expression for phosphate-related genes such as phoA, phoS, phoE and ugpAB as well as itself. It is often found proximal to genes for the high-affinity phosphate ABC transporter (pstSCAB; GenProp0190) and presumably regulates these as well. [Regulatory functions, DNA interactions, Signal transduction, Two-component systems]	226
131210	TIGR02155	PA_CoA_ligase	phenylacetate-CoA ligase. Phenylacetate-CoA ligase (PA-CoA ligase) catalyzes the first step in aromatic catabolism of phenylacetic acid (PA) into phenylacetyl-CoA (PA-CoA). Often located in a conserved gene cluster with enzymes involved in phenylacetic acid activation (paaG/H/I/J), phenylacetate-CoA ligase has been found among the proteobacteria as well as in gram positive prokaryotes. In the B-subclass proteobacterium Azoarcus evansii, phenylacetate-CoA ligase has been shown to be induced under aerobic and anaerobic growth conditions. It remains unclear however, whether this induction is due to the same enzyme or to another isoenzyme restricted to specific anaerobic growth conditions. [Energy metabolism, Other]	422
131211	TIGR02156	PA_CoA_Oxy1	phenylacetate-CoA oxygenase, PaaG subunit. Phenylacetate-CoA oxygenase is comprised of a five gene complex responsible for the hydroxylation of phenylacetate-CoA (PA-CoA) as the second catabolic step in phenylacetic acid (PA) degradation. Although the exact function of this enzyme has not been determined, it has been shown to be required for phenylacetic acid degradation and has been proposed to function in a multicomponent oxygenase acting on phenylacetate-CoA. [Energy metabolism, Other]	289
274002	TIGR02157	PA_CoA_Oxy2	phenylacetate-CoA oxygenase, PaaH subunit. Phenylacetate-CoA oxygenase is comprised of a five gene complex responsible for the hydroxylation of phenylacetate-CoA (PA-CoA) as the second catabolic step in phenylacetic acid (PA) degradation. Although the exact function of this enzyme has not been determined, it has been shown to be required for phenylacetic acid degradation and has been proposed to function in a multicomponent oxygenase acting on phenylacetate-CoA. [Energy metabolism, Other]	90
131213	TIGR02158	PA_CoA_Oxy3	phenylacetate-CoA oxygenase, PaaI subunit. Phenylacetate-CoA oxygenase is comprised of a five gene complex responsible for the hydroxylation of phenylacetate-CoA (PA-CoA) as the second catabolic step in phenylacetic acid (PA) degradation. Although the exact function of this enzyme has not been determined, it has been shown to be required for phenylacetic acid degradation and has been proposed to function in a multicomponent oxygenase acting on phenylacetate-CoA. [Energy metabolism, Other]	237
131214	TIGR02159	PA_CoA_Oxy4	phenylacetate-CoA oxygenase, PaaJ subunit. Phenylacetate-CoA oxygenase is comprised of a five gene complex responsible for the hydroxylation of phenylacetate-CoA (PA-CoA) as the second catabolic step in phenylacetic acid (PA) degradation. Although the exact function of this enzyme has not been determined, it has been shown to be required for phenylacetic acid degradation and has been proposed to function in a multicomponent oxygenase acting on phenylacetate-CoA. [Energy metabolism, Other]	146
131215	TIGR02160	PA_CoA_Oxy5	phenylacetate-CoA oxygenase/reductase, PaaK subunit. Phenylacetate-CoA oxygenase is comprised of a five gene complex responsible for the hydroxylation of phenylacetate-CoA (PA-CoA) as the second catabolic step in phenylacetic acid (PA) degradation. Although the exact function of this enzyme has not been determined, it has been shown to be required for phenylacetic acid degradation and has been proposed to function in a multicomponent oxygenase acting on phenylacetate-CoA. [Energy metabolism, Other]	352
131216	TIGR02161	napC_nirT	periplasmic nitrate (or nitrite) reductase c-type cytochrome, NapC/NirT family. Nearly every member of this subfamily is NapC, a predicted membrane-anchored four-heme c-type cytochrome that forms one component of the periplasmic nitrate reductase along with NapA, NapB, NapD, NapE, and NapF subunits. A single known exception at this time is NirT, which is instead a component of a nitrite reductase. This family excludes TorC subunits of trimethylamine N-oxide (TMAO) reductases.	185
274003	TIGR02162	torC	trimethylamine-N-oxide reductase c-type cytochrome TorC. This family consists of TorC, a pentahemic c-type cytochrome subunit of periplasmic reductases for trimethylamine-N-oxide (TMAO). The N-terminal half is closely related to tetrahemic NapC (or NirT) subunits of periplasmic nitrate (or nitrite) reductases; some species have both TMAO and nitrate reductase complexes.	386
274004	TIGR02163	napH_	ferredoxin-type protein, NapH/MauN family. Most members of this family are the NapH protein, found next to NapG, in operons that encode the periplasmic nitrate reductase. Some species with this reductase lack NapC but accomplish electron transfer to NapAB in some other manner, likely to involve NapH, NapG, and/or some other protein. A few members of this family are designated MauN and are found in methylamine utilization operons in species that appear to lack a periplasmic nitrate reductase.	255
131219	TIGR02164	torA	trimethylamine-N-oxide reductase TorA. This very narrowly defined family represents TorA, part of a family of related molybdoenzymes that include biotin sulfoxide reductases, dimethyl sulfoxide reductases, and at least two different subfamilies of trimethylamine-N-oxide reductases. A single enzyme from the larger family may have more than one activity. TorA typically is located in the periplasm, has a Tat (twin-arginine translocation)-dependent signal sequence, and is encoded in a torCAD operon.	822
274005	TIGR02165	cas5_6_GSU0054	CRISPR-associated protein GSU0054/csb2, Dpsyc system. This model represents a CRISPR-associated protein from the Dpsyc subtype (a type I-C variant), named for Desulfotalea psychrophila LSv54. CRISPR systems confer resistance in prokaryotes to invasive DNA or RNA, including phage and plasmids. CRISPR-associated proteins typically are found near CRISPR repeats and other CRISPR-associated proteins, have low levels of sequence identity, have sequence relationships that suggest lateral transfer, and show some sequence similarity to DNA-active proteins such as helicases and repair proteins.	484
274006	TIGR02166	dmsA_ynfE	anaerobic dimethyl sulfoxide reductase, A subunit, DmsA/YnfE family. Members of this family include known and probable dimethyl sulfoxide reductase (DMSO reductase) A chains. In E. coli, dmsA encodes the canonical anaerobic DMSO reductase A chain. The paralog ynfE, as part of ynfFGH expressed from a multicopy plasmid, could complement a dmsABC deletion, suggesting a similar function and some overlap in specificity, although YnfE could not substitute for DmsA in a mixed complex.	797
274007	TIGR02167	Liste_lipo_26	bacterial surface protein 26-residue repeat. This model describes a tandem peptide repeat sequence of 25 or 26 residues, found in predicted surface proteins (often lipoproteins) from Listeria monocytogenes, L. innocua, Enterococcus faecalis, Lactobacillus plantarum, Mycoplasma mycoides, Helicobacter hepaticus, and other species.	26
274008	TIGR02168	SMC_prok_B	chromosome segregation protein SMC, common bacterial type. SMC (structural maintenance of chromosomes) proteins bind DNA and act in organizing and segregating chromosomes for partition. SMC proteins are found in bacteria, archaea, and eukaryotes. This family represents the SMC protein of most bacteria. The smc gene is often associated with scpB (TIGR00281) and scpA genes, where scp stands for segregation and condensation protein. SMC was shown (in Caulobacter crescentus) to be induced early in S phase but present and bound to DNA throughout the cell cycle. [Cellular processes, Cell division, DNA metabolism, Chromosome-associated proteins]	1179
274009	TIGR02169	SMC_prok_A	chromosome segregation protein SMC, primarily archaeal type. SMC (structural maintenance of chromosomes) proteins bind DNA and act in organizing and segregating chromosomes for partition. SMC proteins are found in bacteria, archaea, and eukaryotes. It is found in a single copy and is homodimeric in prokaryotes, but six paralogs (excluded from this family) are found in eukaryotes, where SMC proteins are heterodimeric. This family represents the SMC protein of archaea and a few bacteria (Aquifex, Synechocystis, etc); the SMC of other bacteria is described by TIGR02168. The N- and C-terminal domains of this protein are well conserved, but the central hinge region is skewed in composition and highly divergent. [Cellular processes, Cell division, DNA metabolism, Chromosome-associated proteins]	1164
274010	TIGR02170	thyX	thymidylate synthase, flavin-dependent. Two forms of microbial thymidylate synthase are known: ThyA (2.1.1.45) and ThyX (2.1.1.148). This model describes ThyX, a homotetrameric flavoprotein. Both enzymes convert dUMP to dTMP. Under oxygen-limiting conditions, thyX can complement a thyA mutation. [Purines, pyrimidines, nucleosides, and nucleotides, 2'-Deoxyribonucleotide metabolism]	209
274011	TIGR02171	Fb_sc_TIGR02171	Fibrobacter succinogenes paralogous family TIGR02171. This model describes a paralogous family of the rumen bacterium Fibrobacter succinogenes. Eleven members are found in Fibrobacter succinogenes S85, averaging over 900 amino acids in length. More than half are predicted lipoproteins. The function is unknown.	912
162743	TIGR02172	Fb_sc_TIGR02172	Fibrobacter succinogenes paralogous family TIGR02172. This model describes a paralogous family of five proteins, likely to be enzymes, in the rumen bacterium Fibrobacter succinogenes S85. Members show homology to proteins described by pfam01636, a phosphotransferase enzyme family associated with resistance to aminoglycoside antibiotics.	226
274012	TIGR02173	cyt_kin_arch	cytidylate kinase, putative. Proteins in this family are believed to be cytidylate kinase. Members of this family are found in the archaea and in spirochaetes, and differ considerably from the common bacterial form of cytidylate kinase described by TIGR00017.	171
274013	TIGR02174	CXXU_selWTH	selT/selW/selH selenoprotein domain. This model represents a domain found in both bacteria and animals, including animal proteins SelT, SelW, and SelH, all of which are selenoproteins. In a CXXC motif near the N-terminus of the domain, selenocysteine may replace the second Cys. Proteins with this domain may include an insert of about 70 amino acids. This model is broader than the current SelW model pfam05169 in Pfam.	73
274014	TIGR02175	PorC_KorC	2-oxoacid:acceptor oxidoreductase, gamma subunit, pyruvate/2-ketoisovalerate family. A number of anaerobic and microaerophilic species lack pyruvate dehydrogenase and have instead a four subunit, oxygen-sensitive pyruvate oxidoreductase, with either ferredoxins or flavodoxins (H. pylori) used as the acceptor. Several related four-subunit enzymes may exist in the same species. This model describes the gamma subunit. In Pyrococcus furiosus, enzymes active on pyruvate and 2-ketoisovalerate share a common gamma subunit.	177
131231	TIGR02176	pyruv_ox_red	pyruvate:ferredoxin (flavodoxin) oxidoreductase, homodimeric. This model represents a single chain form of pyruvate:ferredoxin (or flavodoxin) oxidoreductase. This enzyme may transfer electrons to nitrogenase in nitrogen-fixing species. Portions of this protein are homologous to gamma subunit of the four subunit pyruvate:ferredoxin (flavodoxin) oxidoreductase.	1165
274015	TIGR02177	PorB_KorB	2-oxoacid:acceptor oxidoreductase, beta subunit, pyruvate/2-ketoisovalerate family. A number of anaerobic and microaerophilic species lack pyruvate dehydrogenase and have instead a four subunit, oxygen-sensitive pyruvate oxidoreductase, with either ferredoxins or flavodoxins used as the acceptor. Several related four-subunit enzymes may exist in the same species. This model describes a subfamily of beta subunits, representing mostly pyruvate and 2-ketoisovalerate specific enzymes.	287
131233	TIGR02178	yeiP	elongation factor P-like protein YeiP. This model represents the family of Escherichia coli protein YeiP, a close homolog of elongation factor P (TIGR00038) and probably itself a translation factor. Members of this family are found only in some Gammaproteobacteria, including E. coli and Vibrio cholerae. [Protein synthesis, Translation factors]	186
131234	TIGR02179	PorD_KorD	2-oxoacid:acceptor oxidoreductase, delta subunit, pyruvate/2-ketoisovalerate family. A number of anaerobic and microaerophilic species lack pyruvate dehydrogenase and have instead a four subunit, oxygen-sensitive pyruvate oxidoreductase, with either ferredoxins or flavodoxins used as the acceptor. Several related four-subunit enzymes may exist in the same species. This model describes a subfamily of delta subunits, representing mostly pyruvate, 2-ketoisovalerate, and 2-oxoglutarate specific enzymes. The delta subunit is the smallest and resembles ferredoxins.	78
274016	TIGR02180	GRX_euk	Glutaredoxin. Glutaredoxins are thioltransferases (disulfide reductases) which utilize glutathione and NADPH as cofactors. Oxidized glutathione is regenerated by glutathione reductase. Together these components compose the glutathione system. Glutaredoxins utilize the CXXC motif common to thioredoxins and are involved in multiple cellular processes including protection from redox stress, reduction of critical enzymes such as ribonucleotide reductase and the generation of reduced sulfur for iron sulfur cluster formation. Glutaredoxins are capable of reduction of mixed disulfides of glutathione as well as the formation of glutathione mixed disulfides. This model represents eukaryotic glutaredoxins and includes sequences from fungi, plants and metazoans as well as viruses.	83
274017	TIGR02181	GRX_bact	Glutaredoxin, GrxC family. Glutaredoxins are thioltransferases (disulfide reductases) which utilize glutathione and NADPH as cofactors. Oxidized glutathione is regenerated by glutathione reductase. Together these components compose the glutathione system. Glutaredoxins utilize the CXXC motif common to thioredoxins and are involved in multiple cellular processes including protection from redox stress, reduction of critical enzymes such as ribonucleotide reductase and the generation of reduced sulfur for iron sulfur cluster formation. Glutaredoxins are capable of reduction of mixed disulfides of glutathione as well as the formation of glutathione mixed disulfides. This family of glutaredoxins includes the E. coli protein GrxC (Grx3) which appears to have a secondary role in reducing ribonucleotide reductase (in the absence of GrxA) possibly indicating a role in the reduction of other protein disulfides. [Energy metabolism, Electron transport]	79
274018	TIGR02182	GRXB	Glutaredoxin, GrxB family. Glutaredoxins are thioltransferases (disulfide reductases) which utilize glutathione and NADPH as cofactors. Oxidized glutathione is regenerated by glutathione reductase. Together these components compose the glutathione system. Glutaredoxins utilize the CXXC motif common to thioredoxins and are involved in multiple cellular processes including protection from redox stress, reduction of critical enzymes such as ribonucleotide reductase and the generation of reduced sulfur for iron sulfur cluster formation. Glutaredoxins are capable of reduction of mixed disulfides of glutathione as well as the formation of glutathione mixed disulfides. This model includes the highly abundant E. coli GrxB (Grx2) glutaredoxin which is notably longer than either GrxA or GrxC. Unlike the other two E. coli glutaredoxins, GrxB appears to be unable to reduce ribonucleotide reductase, and may have more to do with resistance to redox stress. [Energy metabolism, Electron transport]	209
131238	TIGR02183	GRXA	Glutaredoxin, GrxA family. Glutaredoxins are thioltransferases (disulfide reductases) which utilize glutathione and NADPH as cofactors. Oxidized glutathione is regenerated by glutathione reductase. Together these components compose the glutathione system. Glutaredoxins utilize the CXXC motif common to thioredoxins and are involved in multiple cellular processes including protection from redox stress, reduction of critical enzymes such as ribonucleotide reductase and the generation of reduced sulfur for iron sulfur cluster formation. Glutaredoxins are capable of reduction of mixed disulfides of glutathione as well as the formation of glutathione mixed disulfides. This model includes the E. coli glutaredoxin GrxA which appears to have primary responsibility for the reduction of ribonucleotide reductase.	86
213689	TIGR02184	Myco_arth_vir_N	Mycoplasma virulence family signal region. This model represents the N-terminal region, including a probable signal sequence or signal anchor which in most instances has four consecutive Lys residues before the hydrophobic stretch, of a family of large, virulence-associated proteins in Mycoplasma arthritidis and smaller proteins in Mycoplasma capricolum.	33
274019	TIGR02185	Trep_Strep	putative ECF transporter S component, Trep_Strep family. This family consists of strongly hydrophobic proteins about 190 amino acids in length with a strongly basic motif near the C-terminus. It is found in rather few species, but in paralogous families of 12 members in the oral pathogenic spirochaete Treponema denticola and 2 in Streptococcus pneumoniae R6. [Transport and binding proteins, Unknown substrate]	189
274020	TIGR02186	alph_Pro_TM	conserved hypothetical protein. This family consists of predicted transmembrane proteins of about 270 amino acids. Members are found, so far, only among the Alphaproteobacteria and only once in each genome.	261
274021	TIGR02187	GlrX_arch	Glutaredoxin-like domain protein. This family of archaeal proteins contains a C-terminal domain with homology to bacterial and eukaryotic glutaredoxins, including a CPYC motif. There is an N-terminal domain which has even more distant homology to glutaredoxins. The name "glutaredoxin" may be inappropriate in the sense of working in tandem with glutathione and glutathione reductase which may not be present in the archaea. The overall domain structure appears to be related to bacterial alkylhydroperoxide reductases, but the homology may be distant enough that the function of this family is wholly different.	215
274022	TIGR02188	Ac_CoA_lig_AcsA	acetate--CoA ligase. This model describes acetate-CoA ligase (EC 6.2.1.1), also called acetyl-CoA synthetase and acetyl-activating enzyme. It catalyzes the reaction ATP + acetate + CoA = AMP + diphosphate + acetyl-CoA and belongs to the family of AMP-binding enzymes described by pfam00501.	626
274023	TIGR02189	GlrX-like_plant	Glutaredoxin-like family. This family of glutaredoxin-like proteins is apparently limited to plants. Multiple isoforms are found in A. thaliana and O. sativa.	99
131245	TIGR02190	GlrX-dom	Glutaredoxin-family domain. This C-terminal domain with homology to glutaredoxin is fused to an N-terminal peroxiredoxin-like domain.	79
274024	TIGR02191	RNaseIII	ribonuclease III, bacterial. This family consists of bacterial examples of ribonuclease III. This enzyme cleaves double-stranded rRNA. It is involved in processing ribosomal RNA precursors. It is found even in minimal genomes such as Mycoplasma genitalium and Buchnera aphidicola, and in some cases has been shown to be an essential gene. These bacterial proteins contain a double-stranded RNA binding motif (pfam00035) and a ribonuclease III domain (pfam00636). Eukaryotic homologs tend to be much longer proteins with additional domains, localized to the nucleus, and not included in this family. [Transcription, RNA processing]	220
131247	TIGR02192	HtrL_YibB	protein YibB. The protein from this rare, uncharacterized protein family is designated HtrL or YibB in E. coli, where its gene is found in a region of LPS core biosynthesis genes. Homologs are found in Shigella flexneri, Campylobacter jejuni, and Caenorhabditis elegans only. The htrL gene may represent an insertion to the LPS core biosynthesis region, rather than an LPS biosynthetic protein. [Hypothetical proteins, Conserved]	270
274025	TIGR02193	heptsyl_trn_I	lipopolysaccharide heptosyltransferase I. This family consists of examples of ADP-heptose:LPS heptosyltransferase I, an enzyme of LPS inner core region biosynthesis. LPS, composed of lipid A, a core region, and O antigen, is found in the outer membrane of Gram-negative bacteria. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	319
131249	TIGR02194	GlrX_NrdH	Glutaredoxin-like protein NrdH. NrdH-redoxin is a representative of a class of small redox proteins that contain a conserved CXXC motif and are characterized by a glutaredoxin-like amino acid sequence and thioredoxin-like activity profile. Unlike the other glutaredoxins to which it is most closely related, NrdH apparently does not interact with glutathione/glutathione reductase, but rather with thioredoxin reductase to catalyze the reduction of ribonucleotide reductase.	72
274026	TIGR02195	heptsyl_trn_II	lipopolysaccharide heptosyltransferase II. This family consists of examples of ADP-heptose:LPS heptosyltransferase II, an enzyme of LPS inner core region biosynthesis. LPS, composed of lipid A, a core region, and O antigen, is found in the outer membrane of Gram-negative bacteria. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	334
274027	TIGR02196	GlrX_YruB	Glutaredoxin-like protein, YruB-family. This glutaredoxin-like protein family contains the conserved CxxC motif and includes the Clostridium pasteurianum protein YruB which has been cloned from a rubredoxin operon. Somewhat related to NrdH, it is unknown whether this protein actually interacts with glutathione/glutathione reductase, or, like NrdH, some other reductant system.	74
274028	TIGR02197	heptose_epim	ADP-L-glycero-D-manno-heptose-6-epimerase. This family consists of examples of ADP-L-glycero-D-mannoheptose-6-epimerase, an enzyme involved in biosynthesis of the inner core of lipopolysaccharide (LPS) for Gram-negative bacteria. This enzyme is homologous to UDP-glucose 4-epimerase (TIGR01179) and belongs to the NAD dependent epimerase/dehydratase family (pfam01370). [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	314
274029	TIGR02198	rfaE_dom_I	rfaE bifunctional protein, domain I. RfaE is a protein involved in the biosynthesis of ADP-L-glycero-D-manno-heptose, a precursor for LPS inner core biosynthesis. RfaE is a bifunctional protein in E. coli, and separate proteins in some other genomes. The longer, N-terminal domain I (this family) is suggested to act in D-glycero-D-manno-heptose 1-phosphate biosynthesis, while domain II (TIGR02199) adds ADP to yield ADP-D-glycero-D-manno-heptose. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	315
131254	TIGR02199	rfaE_dom_II	rfaE bifunctional protein, domain II. RfaE is a protein involved in the biosynthesis of ADP-L-glycero-D-manno-heptose, a precursor for LPS inner core biosynthesis. RfaE is a bifunctional protein in E. coli, and separate proteins in some other genomes. Domain I (TIGR02198) is suggested to act in D-glycero-D-manno-heptose 1-phosphate biosynthesis, while domain II (this family) adds ADP to yield ADP-D-glycero-D-manno-heptose. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	144
131255	TIGR02200	GlrX_actino	Glutaredoxin-like protein. This family of glutaredoxin-like proteins is limited to the Actinobacteria and contains the conserved CxxC motif.	77
131256	TIGR02201	heptsyl_trn_III	lipopolysaccharide heptosyltransferase III, putative. This family consists of examples of the putative ADP-heptose:LPS heptosyltransferase III, an enzyme of LPS inner core region biosynthesis. LPS, composed of lipid A, a core region, and O antigen, is found in the outer membrane of Gram-negative bacteria. This enzyme may be less widely distributed than heptosyltransferases I and II. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	344
131257	TIGR02202	Ehrlichia_rpt	Ehrlichia chaffeensis immunodominant surface protein repeat. This model represents 77 residues of an 80 amino acid (240 nucleotide) tandem repeat, found in a variable number of copies in an immunodominant outer membrane protein of Ehrlichia chaffeensis, a tick-borne obligate intracellular pathogen.	77
131258	TIGR02203	MsbA_lipidA	lipid A export permease/ATP-binding protein MsbA. This family consists of a single polypeptide chain transporter in the ATP-binding cassette (ABC) transporter family, MsbA, which exports lipid A. It may also act in multidrug resistance. Lipid A, a part of lipopolysaccharide, is found in the outer leaflet of the outer membrane of most Gram-negative bacteria. Members of this family are restricted to the Proteobacteria (although lipid A is more broadly distributed) and often are clustered with lipid A biosynthesis genes. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides, Transport and binding proteins, Other]	571
131259	TIGR02204	MsbA_rel	ABC transporter, permease/ATP-binding protein. This protein is related to a Proteobacterial ATP transporter that exports lipid A and to eukaryotic P-glycoproteins.	576
274030	TIGR02205	septum_zipA	cell division protein ZipA. This model represents the full length of bacterial cell division protein ZipA. The N-terminal hydrophobic stretch is an uncleaved signal-anchor sequence. This is followed by an unconserved, variable length, low complexity region, and then a conserved C-terminal region of about 140 amino acids (see pfam04354) that interacts with the tubulin-like cell division protein FtsZ. [Cellular processes, Cell division]	284
131261	TIGR02206	intg_mem_TP0381	conserved hypothetical integral membrane protein TIGR02206. This model represents a family of hydrophobic proteins with seven predicted transmembrane alpha helices. Members are found in Bacillus subtilis (ywaF), TP0381 from Treponema pallidum (TP0381), Streptococcus pyogenes, Rhodococcus erythropolis, etc.	222
274031	TIGR02207	lipid_A_htrB	lipid A biosynthesis lauroyl (or palmitoleoyl) acyltransferase. This model represents a narrow clade of acyltransferases, nearly all of which transfer a lauroyl group to KDO2-lipid IV-A, a lipid A precursor; these proteins are termed lipid A biosynthesis lauroyl acyltransferase, HtrB. An exception is a closely related paralog of E. coli HtrB, LpxP, which acts in cold shock conditions by transferring a palmitoleoyl rather than lauroyl group to the lipid A precursor. Members of this family are homologous to the family of acyltransferases responsible for the next step in lipid A biosynthesis. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	303
274032	TIGR02208	lipid_A_msbB	lipid A biosynthesis (KDO)2-(lauroyl)-lipid IVA acyltransferase. This family consists of MsbB in E. coli and closely related proteins in other species. MsbB is homologous to HtrB (TIGR02207) and acts immediately after it in the biosynthesis of KDO-2 lipid A (also called Re LPS and Re endotoxin). These two enzymes act after creation of KDO-2 lipid IV-A by addition of the KDO sugars. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	305
131264	TIGR02209	ftsL_broad	cell division protein FtsL. This model represents FtsL, both forms similar to that in E. coli and similar to that in B. subtilis. FtsL is one of the later proteins active in cell division septum formation. FtsL is small, low in complexity, and highly divergent. The scope of this model is broader than that of the pfam04999.3 for FtsL, as this one includes FtsL from Bacillus subtilis and related species. [Cellular processes, Cell division]	85
274033	TIGR02210	rodA_shape	rod shape-determining protein RodA. This protein is a member of the FtsW/RodA/SpoVE family (pfam01098). It is found only in species with rod (or spiral) shapes. In many species, mutation of rodA has been shown to correlate with loss of the normal rod shape. Note that RodA homologs are found, scoring below the cutoffs for this model, in a number of both rod-shaped and coccoid bacteria, including four proteins in Bacillus anthracis, for example. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan, Cellular processes, Cell division]	352
131266	TIGR02211	LolD_lipo_ex	lipoprotein releasing system, ATP-binding protein. This model represents LolD, a member of the ABC transporter family (pfam00005). LolD is involved in localization of lipoproteins in some bacteria. It works with a transmembrane protein LolC, which in some species is a paralogous pair LolC and LolE. Depending on the residue immediately following the new, modified N-terminal Cys residue, the nascent lipoprotein may be carried further by LolA and LolB to the outer membrane, or remain at the inner membrane. The top scoring proteins excluded by this model include homologs from the archaeal genus Methanosarcina. [Protein fate, Protein and peptide secretion and trafficking]	221
274034	TIGR02212	lolCE	lipoprotein releasing system, transmembrane protein, LolC/E family. This model describes the LolC protein, and its paralog LolE found in some species. These proteins are homologous to permease proteins of ABC transporters. In some species, two paralogs occur, designated LolC and LolE. In others, a single form is found and tends to be designated LolC. [Protein fate, Protein and peptide secretion and trafficking]	411
131268	TIGR02213	lolE_release	lipoprotein releasing system, transmembrane protein LolE. This protein is part of an unusual ABC transporter complex that releases lipoproteins from the periplasmic side of the bacterial inner membrane, rather than transport any substrate across the inner membrane. In some species, the permease-like transmembrane protein is represented by two paralogs, LolC and LolE, both in the LolCDE complex. This family consists of LolE, as found in E. coli and related species. [Protein fate, Protein and peptide secretion and trafficking]	411
131269	TIGR02214	spoVD_pbp	stage V sporulation protein D. This model describes the spoVD subfamily of homologs of the cell division protein FtsI, a penicillin binding protein. This subfamily is restricted to Bacillus subtilis and related Gram-positive species with known or suspected endospore formation capability. In these species, the functional equivalent of FtsI is designated PBP-2B, a paralog of spoVD. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan, Cellular processes, Sporulation and germination]	636
274035	TIGR02215	phage_chp_gp8	phage conserved hypothetical protein, phiE125 gp8 family. This model describes a family of proteins found exclusively in phage or in prophage regions of bacterial genomes, including the phage-like Rhodobacter capsulatus gene transfer agent, which packages DNA. Members of this family show some similarity to members of pfam05135, a putative DNA packaging protein family. [Mobile and extrachromosomal element functions, Prophage functions]	188
274036	TIGR02216	phage_TIGR02216	phage conserved hypothetical protein. This model describes a family of proteins found exclusively in phage or in prophage regions of bacterial genomes, including the phage-like Rhodobacter capsulatus gene transfer agent, which packages DNA. [Mobile and extrachromosomal element functions, Prophage functions]	58
274037	TIGR02217	chp_TIGR02217	TIGR02217 family protein. This model represents a family of conserved hypothetical proteins. It is usually (but not always) found in apparent phage-derived regions of bacterial chromosomes. [Mobile and extrachromosomal element functions, Prophage functions]	210
274038	TIGR02218	phg_TIGR02218	phage conserved hypothetical protein BR0599. This model describes a family of proteins found almost exclusively in phage or in prophage regions of bacterial genomes, including the phage-like Rhodobacter capsulatus gene transfer agent, which packages DNA. An apparent exception is Wolbachia pipientis wMel, a bacterial endosymbiont of the fruit fly, which has several candidate phage-related genes physically separate from obvious prophage regions. [Mobile and extrachromosomal element functions, Prophage functions]	229
131274	TIGR02219	phage_NlpC_fam	putative phage cell wall peptidase, NlpC/P60 family. Members of this family show sequence similarity to members of the NlpC/P60 family described by pfam00877 and by Anantharaman and Aravind (). The NlpC/P60 family includes a number of characterized bacterial cell wall hydrolases. Members of this related family are all found in prophage regions of bacterial genomes. [Mobile and extrachromosomal element functions, Prophage functions]	134
274039	TIGR02220	phg_TIGR02220	phage conserved hypothetical protein, C-terminal domain. This model represents the conserved C-terminal domain of a family of proteins found exclusively in bacteriophage and in bacterial prophage regions. The functions of this domain and the proteins containing it are unknown. [Mobile and extrachromosomal element functions, Prophage functions]	77
274040	TIGR02221	cas_TM1812	CRISPR-associated protein, TM1812 family. CRISPR is a term for Clustered Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR associated) proteins. This family, represented by TM1812 of Thermotoga maritima, is found also in Vibrio vulnificus YJ016, Nitrosomonas europaea ATCC 19718, a large plasmid of Synechocystis sp. PCC 6803, and Fibrobacter succinogenes S85.	218
131277	TIGR02222	chap_CsaA	export-related chaperone protein CsaA. This model describes Bacillus subtilis CsaA, an export-related chaperone that interacts with the Sec system, and related proteins from a number of other bacteria and archaea. The crystal structure is known for the homodimer from Thermus thermophilus. [Protein fate, Protein folding and stabilization, Protein fate, Protein and peptide secretion and trafficking]	107
274041	TIGR02223	ftsN	cell division protein FtsN. FtsN is a poorly conserved protein active in cell division in a number of Proteobacteria. The N-terminal 30 residue region tends to be Lys/Arg-rich, and is followed by a membrane-spanning region. This is followed by an acidic low-complexity region of variable length and a well-conserved C-terminal domain of two tandem regions matched by pfam05036 (Sporulation related repeat), found in several cell division and sporulation proteins. The role of FtsN as a suppressor for other cell division mutations is poorly understood; it may involve cell wall hydrolysis. [Cellular processes, Cell division]	298
274042	TIGR02224	recomb_XerC	tyrosine recombinase XerC. The phage integrase family describes a number of recombinases with tyrosine active sites that transiently bind covalently to DNA. Many are associated with mobile DNA elements, including phage, transposons, and phase variation loci. This model represents XerC, one of two closely related chromosomal proteins along with XerD (TIGR02225). XerC and XerD are site-specific recombinases which help resolve chromosome dimers to monomers for cell division after DNA replication. In species with a large chromosome and homologs of XerC on other replicons, the chromosomal copy was preferred for building this model. This model does not detect all XerC, as some apparent XerC examples score in the gray zone between trusted (450) and noise (410) cutoffs, along with some XerD examples. XerC and XerD interact with cell division protein FtsK. [DNA metabolism, DNA replication, recombination, and repair]	295
274043	TIGR02225	recomb_XerD	tyrosine recombinase XerD. The phage integrase family describes a number of recombinases with tyrosine active sites that transiently bind covalently to DNA. Many are associated with mobile DNA elements, including phage, transposons, and phase variation loci. This model represents XerD, one of two closely related chromosomal proteins along with XerC (TIGR02224). XerC and XerD are site-specific recombinases which help resolve chromosome dimers to monomers for cell division after DNA replication. In species with a large chromosome and with homologs of XerD on other replicons, the chromosomal copy was preferred for building this model. This model does not detect all XerD, as some apparent XerD examples score below the trusted and noise cutoff scores. XerC and XerD interact with cell division protein FtsK. [DNA metabolism, DNA replication, recombination, and repair]	291
131281	TIGR02226	two_anch	N-terminal double-transmembrane domain. This model represents a prokaryotic N-terminal region of about 80 amino acids. The predicted membrane topology by TMHMM puts the N-terminus outside and spans the membrane twice, with a cytosolic region of about 25 amino acids between the two transmembrane regions. Member proteins tend to be between 600 and 1000 amino acids in length. [Hypothetical proteins, Domain]	82
274044	TIGR02227	sigpep_I_bact	signal peptidase I, bacterial type. This model represents signal peptidase I from most bacteria. Eukaryotic sequences are likely organellar. Several bacteria have multiple paralogs, but these represent isozymes of signal peptidase I. Virtually all known bacteria may be presumed to A related model finds a similar protein in many archaea and a few bacteria, as well as a microsomal (endoplasmic reticulum) protein in eukaryotes. [Protein fate, Protein and peptide secretion and trafficking]	142
131283	TIGR02228	sigpep_I_arch	signal peptidase I, archaeal type. This model represents signal peptidase I from most archaea, a subunit of the eukaryotic endoplasmic reticulum signal peptidase I complex, and an apparent signal peptidase I from a small number of bacteria. It is related to but does not overlap in hits with TIGR02227, the bacterial and mitochondrial signal peptidase I.	158
131284	TIGR02229	caa3_sub_IV	caa(3)-type oxidase, subunit IV. This model represents a small set of proteins with weak similarity to the sequences in pfam03626, which describes the cytochrome C oxidase subunit IV. [Energy metabolism, Electron transport]	92
131285	TIGR02230	ATPase_gene1	F0F1-ATPase subunit, putative. This model represents a protein found encoded in F1F0-ATPase operons in several genomes, including Methanosarcina barkeri (archaeal) and Chlorobium tepidum (bacterial). It is a small protein (about 100 amino acids) with long hydrophobic stretches and is presumed to be a subunit of the enzyme. [Energy metabolism, ATP-proton motive force interconversion]	100
274045	TIGR02231	TIGR02231	conserved hypothetical protein. This family consists of proteins over 500 amino acids long in Caenorhabditis elegans and several bacteria (Pseudomonas aeruginosa, Nostoc sp. PCC 7120, Leptospira interrogans, etc.). The function is unknown.	525
200169	TIGR02232	myxo_disulf_rpt	Myxococcus cysteine-rich repeat. This model represents a sequence region shared between several proteins of Myxococcus xanthus DK 1622 and some eukaryotic proteins that include human pappalysin-1 (SP|Q13219). The region of about 40 amino acids contains several conserved Cys residues presumed to form disulfide bonds. The region appears in up to 13 repeats in Myxococcus.	38
274046	TIGR02234	trp_oprn_chp	trp region conserved hypothetical membrane protein. Members of this family are predicted transmembrane proteins with four membrane-spanning helices. Members are found in the Actinobacteria (Mycobacterium, Corynebacterium, Streptomyces), always associated with genes for tryptophan biosynthesis.	202
131289	TIGR02235	menA_cyano-plnt	1,4-dihydroxy-2-naphthoate phytyltransferase. This family of phytyltransferases, found in plants and cyanobacteria, is involved in the biosynthesis of phylloquinone (Vitamin K1). Phylloquinone is a critical component of photosystem I. The closely related MenA enzyme from bacteria transfers a prenyl group (which only differs in the saturation of the isoprenyl groups) in the biosynthesis of menaquinone. Activity towards both substrates in certain organisms should be considered a possibility. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone]	285
131290	TIGR02236	recomb_radA	DNA repair and recombination protein RadA. This family consists exclusively of archaeal RadA protein, a homolog of bacterial RecA (TIGR02012), eukaryotic RAD51 (TIGR02239), and archaeal RadB (TIGR02237). This protein is involved in DNA repair and recombination. The member from Pyrococcus horikoshii contains an intein. [DNA metabolism, DNA replication, recombination, and repair]	310
274047	TIGR02237	recomb_radB	DNA repair and recombination protein RadB. This family consists exclusively of archaeal RadB protein, a homolog of bacterial RecA (TIGR02012), eukaryotic RAD51 (TIGR02239) and DMC1 (TIGR02238), and archaeal RadA (TIGR02236).	209
131292	TIGR02238	recomb_DMC1	meiotic recombinase Dmc1. This model describes DMC1, a subfamily of a larger family of DNA repair and recombination proteins. It is eukaryotic only and most closely related to eukaryotic RAD51. It also resembles archaeal RadA (TIGR02236) and RadB (TIGR02237) and bacterial RecA (TIGR02012). It has been characterized for human as a recombinase active only in meiosis.	313
274048	TIGR02239	recomb_RAD51	DNA repair protein RAD51. This eukaryotic sequence family consists of RAD51, a protein involved in DNA homologous recombination and repair. It is similar in sequence to the exclusively meiotic recombinase DMC1 (TIGR02238), to archaeal families RadA (TIGR02236) and RadB (TIGR02237), and to bacterial RecA (TIGR02012).	316
131294	TIGR02240	PHA_depoly_arom	poly(3-hydroxyalkanoate) depolymerase. This family consists of the polyhydroxyalkanoic acid (PHA) depolymerase of Pseudomonas oleovorans, Pseudomonas putida BM01, and related species. This enzyme is part of a polyester storage and mobilization system, as in many bacteria. However, species containing this enzyme are unusual in their capacity to produce aromatic polyesters when grown on carbon sources such as benzoic acid or phenylacetic acid. [Energy metabolism, Other]	276
274049	TIGR02241	TIGR02241	conserved hypothetical phage tail region protein. This family consists of uncharacterized proteins. All members so far represent bacterial genes found in apparent phage or otherwise laterally transferred regions of the chromosome. Tentatively identified neighboring proteins tend to be phage tail region proteins. In some species, including Photorhabdus luminescens TTO1, several members of this family may be encoded near each other.	140
274050	TIGR02242	tail_TIGR02242	phage tail protein domain. This model describes a region of sequence similarity shared by a number of uncharacterized proteins in bacterial genomes, including Geobacter sulfurreducens PCA, Mesorhizobium loti, Streptomyces coelicolor A3(2), Gloeobacter violaceus PCC 7421, and Myxococcus xanthus. In all cases, the genomic region resembles a phage tail region, based on tentative identifications of neighboring genes. A region of this domain resembles a region of TIGR01634, another phage tail protein model. [Mobile and extrachromosomal element functions, Prophage functions]	130
274051	TIGR02243	TIGR02243	putative baseplate assembly protein. This family consists of a large, conserved hypothetical protein in phage tail-like regions of at least six bacterial genomes: Gloeobacter violaceus PCC 7421, Geobacter sulfurreducens PCA, Streptomyces coelicolor A3(2), Streptomyces avermitilis MA-4680, Mesorhizobium loti, and Myxococcus xanthus. The C-terminal region is identified by the broader model pfam04865 as related to baseplate protein J from phage P2, but that relationship is not observed directly. [Mobile and extrachromosomal element functions, Prophage functions]	656
274052	TIGR02244	HAD-IG-Ncltidse	HAD superfamily (subfamily IG) hydrolase, 5'-nucleotidase. This model includes a 5'-nucleotidase specific for purines (IMP and GMP). These enzymes are members of the Haloacid Dehalogenase (HAD) superfamily. HAD members are recognized by three short motifs {hhhhDxDx(T/V)}, {hhhh(T/S)}, and either {hhhh(D/E)(D/E)x(3-4)(G/N)} or {hhhh(G/N)(D/E)x(3-4)(D/E)} (where "h" stands for a hydrophobic residue). Crystal structures of many HAD enzymes have verified PSI-PRED predictions of secondary structural elements which show each of the "hhhh" sequences of the motifs as part of beta sheets. This subfamily of enzymes is part of "Subfamily I" of the HAD superfamily by virtue of a "cap" domain in between motifs 1 and 2. This subfamily's cap domain has a different predicted secondary structure than all other known HAD enzymes and thus has been designated "subfamily IG". This domain appears to consist of a mixed alpha/beta fold. A Pfam model (pfam05761) detects an identical range of sequences above the trusted cutoff, but does not model the N-terminal motif 1 region. A TIGRFAMs model (TIGR01993) represents a (putative) family of _pyrimidine_ 5'-nucleotidases which are also subfamily I HAD's, which should not be confused with the current model.	343
131299	TIGR02245	HAD_IIID1	HAD-superfamily subfamily IIID hydrolase, TIGR02245. This family of sequences appears to belong to the Haloacid Dehalogenase (HAD) superfamily of enzymes by virtue of the presence of three catalytic domains, in this case: LLVLD(ILV)D(YH)T, I(VMG)IWS, and (DN)(VC)K(PA)Lx{15-17}T(IL)(MH)(FV)DD(IL)(GRS)(RK)N. Since this family has no large "cap" domain between motifs 1 and 2 or between 2 and 3, it is formally a "class III" HAD.	195
274053	TIGR02246	TIGR02246	conserved hypothetical protein. This family consists of uncharacterized proteins found in a number of genera and species, including Streptomyces, Xanthomonas, Oceanobacillus iheyensis, Caulobacter crescentus CB15, and Xylella fastidiosa. The function is unknown.	128
274054	TIGR02247	HAD-1A3-hyp	epoxide hydrolase N-terminal domain-like phosphatase. This model represents a small clade of sequences including C. elegans and mammalian sequences as well as a small number of bacteria. In eukaryotes, this domain exists as an N-terminal fusion to the soluble epoxide hydrolase enzyme and has recently been shown to be an active phosphatase, although the nature of the biological substrate is unclear. These appear to be members of the haloacid dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases by general homology and the conservation of all of the recognized catalytic motifs (although the first motif is unusual in the replacement of the more common aspartate with glycine...). The variable domain is found in between motifs 1 and 2, indicating membership in subfamily I and phylogeny and prediction of the alpha helical nature of the variable domain (by PSI-PRED) indicate membership in subfamily IA.	211
131302	TIGR02248	mutH_TIGR	DNA mismatch repair endonuclease MutH. This family consists exclusively of MutH, an endonuclease in some Proteobacteria that is activated by MutS1 and MutL for methylation-directed mismatch repair. [DNA metabolism, DNA replication, recombination, and repair]	217
131303	TIGR02249	integrase_gron	integron integrase. Members of this family are integrases associated with integrons (and super-integrons), which are systems for incorporating and expressing cassettes of laterally transferred DNA. Incorporation occurs at an attI site. A super-integron, as in Vibrio sp., may include over 100 cassettes. This family belongs to the phage integrase family (pfam00589) that also includes recombinases XerC (TIGR02224) and XerD (TIGR02225), which are bacterial housekeeping proteins. Within this family of integron integrases, some are designated by class, e.g. IntI4, a class 4 integron integrase from Vibrio cholerae N16961. [DNA metabolism, DNA replication, recombination, and repair, Mobile and extrachromosomal element functions, Other]	315
131304	TIGR02250	FCP1_euk	FCP1-like phosphatase, phosphatase domain. This model represents the phosphatase domain of the human RNA polymerase II subunit A C-terminal domain phosphatase (FCP1) and closely related phosphatases from eukaryotes including plants, fungi, and slime mold. This domain is a member of the haloacid dehalogenase (HAD) superfamily by virtue of a conserved set of three catalytic motifs and a conserved fold as predicted by PSIPRED. The third motif in this family is distinctive (hhhhDDppphW). This domain is classified as a "Class III" HAD, since there is no large "cap" domain found between motifs 1 and 2 or motifs 2 and 3. This domain is related to domains found in the human NLI interacting factor-like phosphatases, and together both are detected by the pfam03031.	156
274055	TIGR02251	HIF-SF_euk	Dullard-like phosphatase domain. This model represents the putative phosphatase domain of a family of eukaryotic proteins including "Dullard", and the NLI interacting factor (NIF)-like phosphatases. This domain is a member of the haloacid dehalogenase (HAD) superfamily by virtue of a conserved set of three catalytic motifs and a conserved fold as predicted by PSIPRED. The third motif in this family is distinctive (hhhhDNxPxxa) and apparently lacking the last aspartate. This domain is classified as a "Class III" HAD, since there is no large "cap" domain found between motifs 1 and 2 or motifs 2 and 3. This domain is related to domains found in FCP1-like phosphatases (TIGR02250), and together both are detected by the pfam03031.	162
274056	TIGR02252	DREG-2	REG-2-like, HAD superfamily (subfamily IA) hydrolase. This family of proteins includes uncharacterized sequences from eukaryotes, cyanobacteria and Leptospira as well as the DREG-2 protein from Drosophila melanogaster which has been identified as a rhythmically (diurnally) regulated gene. This family is a member of the Haloacid Dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases. The superfamily is defined by the presence of three short catalytic motifs. The subfamilies are defined based on the location and the observed or predicted fold of a so-called 'capping domain', or the absence of such a domain. This family is a member of subfamily 1A in which the cap domain consists of a predicted alpha helical bundle found in between the first and second catalytic motifs. A distinctive feature of this family is a conserved tandem pair of tryptophan residues in the cap domain. The most divergent sequences included within the scope of this model are from plants and have "FW" at this position instead. Most likely, these sequences, like the vast majority of HAD sequences, represent phosphatase enzymes.	203
274057	TIGR02253	CTE7	HAD superfamily (subfamily IA) hydrolase, TIGR02253. This family of sequences from archaea and metazoans includes the human uncharacterized protein CTE7. Pyrococcus species appear to have three different forms of this enzyme, so it is unclear whether all members of this family have the same function. This family is a member of the haloacid dehalogenase (HAD) superfamily of hydrolases which are characterized by three conserved sequence motifs. By virtue of an alpha helical domain in-between the first and second conserved motif, this family is a member of subfamily IA (TIGR01549).	221
162788	TIGR02254	YjjG/YfnB	noncanonical pyrimidine nucleotidase, YjjG family. This family of HAD superfamily enzymes includes YjjG from E. coli and YfnB from B. subtilis. YjjG has been shown to act as a house-cleaning enzyme, cleaving nucleotides with non-canonical nucleotide bases. This family is a member of the haloacid dehalogenase (HAD) superfamily of hydrolases which are characterized by three conserved sequence motifs. By virtue of an alpha helical domain in-between the first and second conserved motif, this family is a member of subfamily IA (TIGR01549).	224
131309	TIGR02256	ICE_VC0181	integrative and conjugative element protein, VC0181 family. This uncharacterized protein is found in several Proteobacteria, among them Rhizobium sp. NGR234, Vibrio cholerae, Myxococcus xanthus, and E. coli strain ECOR31. In the latter, it is part of an integrative and conjugative element that is readily induced to excise and circularize.	131
131310	TIGR02257	cobalto_cobN	cobaltochelatase, CobN subunit. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	1122
274058	TIGR02258	2_5_ligase	2'-5' RNA ligase. This protein family consists of bacterial and archaeal proteins with two tandem copies of Pfam domain pfam02834. Members for which activity has been measured perform a reversible, ATP-independent 2'-5'-ligation of what is presumably a non-physiological substrate: half-tRNA splice intermediates from an intron-containing yeast tRNA. The physiological substrate(s) in prokaryotes may include small 2'-5'-link-containing oligonucleotides, perhaps with regulatory or biosynthetic roles. [Transcription, RNA processing]	179
131312	TIGR02259	benz_CoA_red_A	benzoyl-CoA reductase, bcr type, subunit A. This model describes A, or gamma, subunit of the bcr type of benzoyl-CoA reductase, a 4-subunit enzyme. Many aromatic compounds are metabolized by way of benzoyl-CoA. This family shows strong sequence similarity to the 2-hydroxyglutaryl-CoA dehydratase alpha chain and to subunits of different types of benzoyl-CoA reductase (such as the bzd type).	432
131313	TIGR02260	benz_CoA_red_B	benzoyl-CoA reductase, bcr type, subunit B. This model describes B, or beta, subunit of the bcr type of benzoyl-CoA reductase, a 4-subunit enzyme. Many aromatic compounds are metabolized by way of benzoyl-CoA.	413
131314	TIGR02261	benz_CoA_red_D	benzoyl-CoA reductase, bcr type, subunit D. This model describes the D subunit of benzoyl-CoA reductase, a 4-subunit enzyme. Many aromatic compounds are metabolized by way of benzoyl-CoA. This family shows sequence similarity to the A subunit (TIGR02259) and to the 2-hydroxyglutaryl-CoA dehydratase alpha chain.	262
274059	TIGR02262	benz_CoA_lig	benzoate-CoA ligase family. Characterized members of this protein family include benzoate-CoA ligase, 4-hydroxybenzoate-CoA ligase, 2-aminobenzoate-CoA ligase, etc. Members are related to fatty acid and acetate CoA ligases.	505
131316	TIGR02263	benz_CoA_red_C	benzoyl-CoA reductase, subunit C. This model describes C subunit of benzoyl-CoA reductase, a 4-subunit enzyme. Many aromatic compounds are metabolized by way of benzoyl-CoA. This enzyme acts under anaerobic conditions.	380
131317	TIGR02264	gmx_para_CXXCG	Myxococcus xanthus double-CXXCG motif paralogous family. This family consists of at least 10 paralogous proteins from Myxococcus xanthus that lack detectable sequence similarity to any other protein family. An imperfectly conserved CXXCG motif, a probable binding site, appears twice in the multiple sequence alignment.	237
131318	TIGR02265	Mxa_TIGR02265	Myxococcales-restricted protein, TIGR02265 family. This family consists of a set of at least 17 paralogous proteins in Myxococcus xanthus DK 1622. Members are about 200 amino acids in length. No other homologs are known; the function is unknown.	179
274060	TIGR02266	gmx_TIGR02266	Myxococcus xanthus paralogous domain TIGR02266. This domain is related to Type IV pilus assembly protein PilZ (pfam07238). It is found in at least 12 copies in Myxococcus xanthus DK 1622.	96
131320	TIGR02267	TIGR02267	DUSAM domain. This family consists of at least eight paralogs in Myxococcus xanthus and six in Stigmatella aurantiaca DW4/3-1, both members of Myxococcales order within the Deltaproteobacteria. The function is unknown. Some member proteins consist of two copies of the domain. This domain is hereby named DUSAM, DUplication in Stigmatella And Myxococcus.	123
131321	TIGR02268	TIGR02268	Myxococcus xanthus paralogous family TIGR02268. This family consists of at least 8 paralogs in Myxococcus xanthus, a member of the Deltaproteobacteria. The function is unknown.	295
131322	TIGR02269	TIGR02269	Myxococcus xanthus paralogous lipoprotein family TIGR02269. This family consists of at least 9 paralogs in Myxococcus xanthus, a member of the Deltaproteobacteria. One appears truncated toward the N-terminus; the others are predicted lipoproteins. The function is unknown.	211
131323	TIGR02270	TIGR02270	conserved hypothetical protein. Members are found in Myxococcus xanthus (six members), Geobacter sulfurreducens, and Pseudomonas aeruginosa; a short protein homologous to the N-terminal region is found in Mesorhizobium loti. All sequences are from Proteobacteria. The function is unknown. [Hypothetical proteins, Conserved]	410
131324	TIGR02271	TIGR02271	conserved domain. This model describes an uncharacterized domain, sometimes found in association with a PRC-barrel domain (pfam05239, which is also found in rRNA processing protein RimM and in a photosynthetic reaction center complex protein). This domain is found in proteins from Bacillus subtilis, Deinococcus radiodurans, Nostoc sp. PCC 7120, Myxococcus xanthus, and several other species. The function is not known.	115
131325	TIGR02272	gentisate_1_2	gentisate 1,2-dioxygenase. This family consists of gentisate 1,2-dioxygenases. This ring-opening enzyme acts in salicylate degradation that goes via gentisate rather than via catechol. It converts gentisate to maleylpyruvate. Some putative gentisate 1,2-dioxygenases are excluded by a relatively high trusted cutoff score because they are too closely related to known examples of 1-hydroxy-2-naphthoate dioxygenase. Therefore some homologs may be bona fide gentisate 1,2-dioxygenases even if they score below the given cutoffs.	335
274061	TIGR02273	16S_RimM	16S rRNA processing protein RimM. This family consists of the bacterial protein RimM (YfjA, 21K), a 30S ribosomal subunit-binding protein implicated in 16S ribosomal RNA processing. It has been partially characterized in Escherichia coli and is found with other translation-associated genes such as trmD. It is broadly distributed among bacteria, including some minimal genomes such as the aphid endosymbiont Buchnera aphidicola. The protein contains a PRC-barrel domain that it shares with other protein families (pfam05239) and a unique domain (pfam01782). This model describes the full-length protein. A member from Arabidopsis (plant) has additional N-terminal sequence likely to represent a chloroplast transit peptide. [Transcription, RNA processing]	165
274062	TIGR02274	dCTP_deam	deoxycytidine triphosphate deaminase. Members of this family include the Escherichia coli monofunctional deoxycytidine triphosphate deaminase (dCTP deaminase) and a Methanocaldococcus jannaschii bifunctional dCTP deaminase (3.5.4.13)/dUTP diphosphatase (EC 3.6.1.23), which has the EC number 3.5.4.30 for the overall operation. [Purines, pyrimidines, nucleosides, and nucleotides, 2'-Deoxyribonucleotide metabolism]	179
274063	TIGR02275	DHB_AMP_lig	2,3-dihydroxybenzoate-AMP ligase. Proteins in this family belong to the AMP-binding enzyme family (pfam00501). Members activate 2,3-dihydroxybenzoate (DHB) by ligation of AMP from ATP with the release of pyrophosphate; many are involved in synthesis of siderophores such as enterobactin, vibriobactin, vulnibactin, etc. The most closely related protein believed to differ in function activates salicylate rather than DHB. [Transport and binding proteins, Cations and iron carrying compounds]	526
213697	TIGR02276	beta_rpt_yvtn	40-residue YVTN family beta-propeller repeat. This repeat of about 40 amino acids is found in up to 14 copies per protein. Archaea Methanosarcina mazei and Methanosarcina acetivorans each have over 10 genes that encode tandem copies of this repeat, which is also found in other species. PSIPRED predicts with high confidence that each 40-residue repeat contains four beta strands. This model overlaps somewhat with the NHL repeat (pfam01436) and also shows sequence similarity to the WD domain, G-beta repeat (pfam00400).	42
274064	TIGR02277	PaaX_trns_reg	phenylacetic acid degradation operon negative regulatory protein PaaX. This transcriptional regulator is always found in association with operons believed to be involved in the degradation of phenylacetic acid. The gene product has been shown to bind to the promoter sites and repress their transcription. [Regulatory functions, DNA interactions]	280
131331	TIGR02278	PaaN-DH	phenylacetic acid degradation protein paaN. This enzyme is proposed to act in the ring-opening step of phenylacetic acid degradation which follows ligation of the acid with coenzyme A (by PaaF) and hydroxylation by a multicomponent non-heme iron hydroxylase complex (PaaGHIJK). Gene symbols have been standardized in. This enzyme is related to aldehyde dehydrogenases and has domains which are members of the pfam00171 and pfam01575 families. This family includes paaN genes from Pseudomonas, Sinorhizobium, Rhodopseudomonas, Escherichia, Deinococcus and Corynebacterium. Another homology family (TIGR02288) includes several other species.	663
188207	TIGR02279	PaaC-3OHAcCoADH	3-hydroxyacyl-CoA dehydrogenase PaaC. This 3-hydroxyacyl-CoA dehydrogenase is involved in the degradation of phenylacetic acid, presumably in steps following the opening of the phenyl ring. The sequences included in this model are all found in apparent operons with other related genes such as paaA, paaB, paaD, paaE, paaF and paaN. Some genomes contain these other genes without an apparent paaC in the same operon - possibly in these cases a different dehydrogenase involved in fatty acid degradation may fill in the needed activity. This enzyme has domains which are members of the pfam02737 and pfam00725 families.	503
274065	TIGR02280	PaaB1	phenylacetate degradation probable enoyl-CoA hydratase paaB. This family of proteins are found within apparent operons for the degradation of phenylacetic acid. These proteins contain the enoyl-CoA hydratase domain as detected by pfam00378. This activity is consistent with current hypotheses for the degradation pathway, which involve the ligation of phenylacetate with coenzyme A (paaF), hydroxylation (paaGHIJK), ring-opening (paaN) and degradation of the resulting fatty acid-like compound to a Krebs cycle intermediate (paaABCDE).	257
131334	TIGR02281	clan_AA_DTGA	clan AA aspartic protease, TIGR02281 family. This family consists of predicted aspartic proteases, typically from 180 to 230 amino acids in length, in MEROPS clan AA. This model describes the well-conserved 121-residue C-terminal region. The poorly conserved, variable length N-terminal region usually contains a predicted transmembrane helix. Sequences in the seed alignment and those scoring above the trusted cutoff are Proteobacterial; homologs scoring between trusted and noise are found in Pyrobaculum aerophilum str. IM2 (archaeal), Pirellula sp. (Planctomycetes), and Nostoc sp. PCC 7120 (Cyanobacteria). [Protein fate, Degradation of proteins, peptides, and glycopeptides]	121
274066	TIGR02282	MltB	lytic murein transglycosylase B. This family consists of lytic murein transglycosylases (murein hydrolases) in the family of MltB, which is a membrane-bound lipoprotein in Escherichia coli. The N-terminal lipoprotein modification motif is conserved in about half the members of this family. The term Slt35 describes a naturally occurring soluble fragment of MltB. Members of this family never contain the putative peptidoglycan binding domain described by pfam01471, which is associated with several classes of bacterial cell wall lytic enzymes. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan]	290
274067	TIGR02283	MltB_2	lytic murein transglycosylase. Members of this family are closely related to the MltB family lytic murein transglycosylases described by TIGR02282 and are likewise all proteobacterial, although that family and this one form clearly distinct clades. Several species have one member of each family. Many members of this family (unlike the MltB family) contain an additional C-terminal domain, a putative peptidoglycan binding domain (pfam01471), not included in region described by this model. Many sequences appear to contain N-terminal lipoprotein attachment sites, as does E. coli MltB in TIGR02282. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan]	300
131337	TIGR02284	TIGR02284	conserved hypothetical protein. Members of this protein family are found mostly in the Proteobacteria, although one member is found in the marine planctomycete Pirellula sp. strain 1. The function is unknown.	139
131338	TIGR02285	TIGR02285	conserved hypothetical protein. Members of this family are found in several Proteobacteria, including Pseudomonas putida KT2440, Bdellovibrio bacteriovorus HD100 (three members), Aeromonas hydrophila, and Chromobacterium violaceum ATCC 12472. The function is unknown. [Hypothetical proteins, Conserved]	268
131339	TIGR02286	PaaD	phenylacetic acid degradation protein PaaD. This member of the domain family TIGR00369 (which is, in turn, a member of the pfam03061 thioesterase superfamily) is nearly always found adjacent to other genes of the phenylacetic acid degradation pathway. Its function is currently unknown, but a role as a thioesterase is not inconsistent with the proposed overall pathway. Sequences scoring between trusted and noise include those from archaea and other species not known to catabolize phenylacetic acid and which are not adjacent to other genes potentially involved with such a pathway.	114
131340	TIGR02287	PaaY	phenylacetic acid degradation protein PaaY. Members of this family are located next to other genes organized into apparent operons for phenylacetic acid degradation. PaaY is located near the end of these gene clusters and often next to PaaX, a transcriptional regulator. [Energy metabolism, Other]	192
131341	TIGR02288	PaaN_2	phenylacetic acid degradation protein paaN. This enzyme is proposed to act in the ring-opening step of phenylacetic acid degradation which follows ligation of the acid with coenzyme A (by PaaF) and hydroxylation by a multicomponent non-heme iron hydroxylase complex (PaaGHIJK). Gene symbols have been standardized in. This enzyme is related to aldehyde dehydrogenases and has a domain which is a member of the pfam00171 family. This family includes sequences from Burkholderia, Bordetella, Streptomyces. Other PaaN enzymes are represented by a separate model, TIGR02278.	551
274068	TIGR02289	M3_not_pepF	oligoendopeptidase, M3 family. This family consists of probable oligoendopeptidases in the M3 family, related to lactococcal PepF and group B streptococcal PepB (TIGR00181) but in a distinct clade with considerable sequence differences. The likely substrate is small peptides and not whole proteins, as with PepF, but members are not characterized and the activity profile may differ. Several bacteria have both a member of this family and a member of the PepF family.	549
274069	TIGR02290	M3_fam_3	oligoendopeptidase, pepF/M3 family. The M3 family of metallopeptidases contains several distinct clades. Oligoendopeptidase F as characterized in Lactococcus, the functionally equivalent oligoendopeptidase B of group B Streptococcus, and closely related sequences are described by TIGR00181. The present family is quite similar but forms a distinct clade, and a number of species have one member of each. A greater sequence difference separates members of TIGR02289, probable oligoendopeptidases of the M3 family that probably should not be designated PepF.	587
274070	TIGR02291	rimK_rel_E_lig	alpha-L-glutamate ligase-related protein. Members of this protein family contain a region of homology to the RimK family of alpha-L-glutamate ligases (TIGR00768), various members of which modify the Glu-Glu C-terminus of ribosomal protein S6, or tetrahydromethanopterin, or a form of coenzyme F420 derivative. Members of this family are found so far in various Vibrio and Pseudomonas species and some other gamma and beta Proteobacteria. The function is unknown.	317
274071	TIGR02292	ygfB_yecA	yecA family protein. This family resembles pfam03695 (version pfam03695.3), uncharacterised protein family UPF0149, but is broader in scope and includes additional proteins. It includes E. coli proteins YgfB and YecA. The function of this family of proteins is unknown. The crystal structure is known for the member from Haemophilus influenzae (Ygfb, HI0817). [Unknown function, General]	150
131346	TIGR02293	TAS_TIGR02293	putative toxin-antitoxin system antitoxin component, TIGR02293 family. Proteins in this family are found almost exclusively in the Proteobacteria, but also in Gloeobacter violaceus PCC 7421, a cyanobacterium. This family was proposed by Makarova, et al. (2009) to be the antitoxin component of a new class of type 2 toxin-antitoxin system, or addiction module. [Cellular processes, Other]	133
274072	TIGR02294	nickel_nikA	nickel ABC transporter, nickel/metallophore periplasmic binding protein. Members of this family are periplasmic nickel-binding proteins of nickel ABC transporters. Most appear to be lipoproteins. This protein was previously (circa 2003) thought to mediate binding to nickel through water molecules, but is now thought to involve a chelating organic molecule, perhaps butane-1,2,4-tricarboxylate, acting as a metallophore. [Transport and binding proteins, Cations and iron carrying compounds]	500
213698	TIGR02295	HpaD	3,4-dihydroxyphenylacetate 2,3-dioxygenase. This enzyme catalyzes the second step in the degradation of 4-hydroxyphenylacetate to succinate and pyruvate. 4-hydroxyphenylacetate arises from the degradation of tyrosine. The substrate, 3,4-dihydroxyphenylacetate (homoprotocatechuate) arises from the action of a hydroxylase on 4-hydroxyphenylacetate. The aromatic ring is opened by this dioxygenase exo to the 3,4-diol resulting in 2-hydroxy-5-carboxymethylmuconate semialdehyde. The enzyme from Bacillus brevis contains manganese.	294
131349	TIGR02296	HpaC	4-hydroxyphenylacetate 3-monooxygenase, reductase component. This model identifies the reductase component (HpaC) of 4-hydroxyphenylacetate 3-monooxygenase. This enzyme catalyzes the first step (hydroxylation at the 3-position) in the degradation of 4-hydroxyphenylacetate to succinate and pyruvate. 4-hydroxyphenylacetate arises from the degradation of tyrosine. These reductases catalyze the reduction of free flavins by NADPH. The flavin is then utilized by the large subunit of the monooxygenase.	154
131350	TIGR02297	HpaA	4-hydroxyphenylacetate catabolism regulatory protein HpaA. This putative transcriptional regulator, which contains both the substrate-binding, dimerization domain (pfam02311) and the helix-turn-helix DNA-binding domain (pfam00165) of the AraC family, is located proximal to genes of the 4-hydroxyphenylacetate catabolism pathway.	287
131351	TIGR02298	HpaD_Fe	3,4-dihydroxyphenylacetate 2,3-dioxygenase. This enzyme catalyzes the ring-opening step in the degradation of 4-hydroxyphenylacetate.	282
131352	TIGR02299	HpaE	5-carboxymethyl-2-hydroxymuconate semialdehyde dehydrogenase. This model represents the dehydrogenase responsible for the conversion of 5-carboxymethyl-2-hydroxymuconate semialdehyde to 5-carboxymethyl-2-hydroxymuconate (a tricarboxylic acid). This is the step in the degradation of 4-hydroxyphenylacetic acid via homoprotocatechuate following the oxidative opening of the aromatic ring.	488
131353	TIGR02300	FYDLN_acid	TIGR02300 family protein. Members of this family are bacterial proteins with a conserved motif [KR]FYDLN, sometimes flanked by a pair of CXXC motifs, followed by a long region of low complexity sequence in which roughly half the residues are Asp and Glu, including multiple runs of five or more acidic residues. The function of members of this family is unknown.	129
131354	TIGR02301	TIGR02301	TIGR02301 family protein. Members of this uncharacterized protein family are found in a number of alphaProteobacteria, including root nodule bacteria, Brucella suis, Caulobacter crescentus, and Rhodopseudomonas palustris. Conserved residues include two well-separated cysteines, suggesting a disulfide bond. The function is unknown.	121
274073	TIGR02302	aProt_lowcomp	TIGR02302 family protein. Members of this family are long (~850 residue) bacterial proteins from the alpha Proteobacteria. Each has 2-3 predicted transmembrane helices near the N-terminus and a long C-terminal region that includes stretches of Gln/Gly-rich low complexity sequence, predicted by TMHMM to be outside the membrane. In Bradyrhizobium japonicum, two tandem reading frames are together homologous to the single members found in other species; the cutoff scores are set low enough that the longer scores above the trusted cutoff and the shorter above the noise cutoff for this model.	851
131356	TIGR02303	HpaG-C-term	4-hydroxyphenylacetate degradation bifunctional isomerase/decarboxylase, C-terminal subunit. This model represents one of two subunits/domains of the bifunctional isomerase/decarboxylase involved in 4-hydroxyphenylacetate degradation. In E. coli and some other species this enzyme is encoded by a single polypeptide containing both this domain and the closely related N-terminal domain (TIGR02305). In other species such as Pasteurella multocida these domains are found as two separate proteins (usually as tandem genes). Together, these domains carry out the decarboxylation of 5-oxopent-3-ene-1,2,5-tricarboxylic acid (OPET) to 2-hydroxy-2,4-diene-1,7-dioate (HHDD) and the subsequent isomerization to 2-oxohept-3-ene-1,7-dioate (OHED).	245
274074	TIGR02304	aden_form_hyp	putative adenylate-forming enzyme. Members of this family form a distinct clade within a larger family of proteins that also includes coenzyme F390 synthetase, an enzyme known in Methanobacterium thermoautotrophicum and a few other methanogenic archaea. That enzyme adenylates coenzyme F420 to F390, a reversible process, during oxygen stress. Other informative homologies include domains of the non-ribosomal peptide synthetases involved in activation by adenylation. The family defined by this model is likely to be of an adenylate-forming enzyme related to but distinct from coenzyme F390 synthetase.	430
131358	TIGR02305	HpaG-N-term	4-hydroxyphenylacetate degradation bifunctional isomerase/decarboxylase, N-terminal subunit. This model represents one of two subunits/domains of the bifunctional isomerase/decarboxylase involved in 4-hydroxyphenylacetate degradation. In E. coli and some other species this enzyme is encoded by a single polypeptide containing both this domain and the closely related C-terminal domain (TIGR02303). In other species such as Pasteurella multocida these domains are found as two separate proteins (usually as tandem genes). Together, these domains carry out the decarboxylation of 5-oxopent-3-ene-1,2,5-tricarboxylic acid (OPET) to 2-hydroxy-2,4-diene-1,7-dioate (HHDD) and the subsequent isomerization to 2-oxohept-3-ene-1,7-dioate (OHED).	205
274075	TIGR02306	RNA_lig_DRB0094	RNA ligase, DRB0094 family. The member of this family from Deinococcus radiodurans, a species that withstands and recovers from extensive radiation or desiccation damage, is an apparent RNA ligase. It repairs RNA strand breaks in nicked DNA:RNA and RNA:RNA but not DNA:DNA duplexes. It has adenylyltransferase activity associated with the C-terminal domain. Related proteins also in this family are found in Streptomyces avermitilis MA-4680 and in bacteriophage 44RR2.8t. The phage example is unsurprising since one mechanism of host cell defense against phage is cleavage and inactivation of certain tRNA molecules. A fungal sequence from Neurospora crassa scores between trusted and noise cutoffs and may be similar in function.	341
274076	TIGR02307	RNA_lig_RNL2	RNA ligase, Rnl2 family. Members of this family ligate (seal breaks in) RNA. Members so far include phage proteins that can counteract a host defense of cleavage of specific tRNA molecules, trypanosome ligases involved in RNA editing, but no prokaryotic host proteins. [Mobile and extrachromosomal element functions, Prophage functions, Transcription, RNA processing]	325
213699	TIGR02308	RNA_lig_T4_1	RNA ligase, T4 RnlA family. Members of this family are phage proteins with ATP-dependent RNA ligase activity. Host defense to phage may include cleavage and inactivation of specific tRNA molecules; members of this family act to reverse this RNA damage. The enzyme is adenylated, transiently, on a Lys residue in a motif KXDGSL. [Mobile and extrachromosomal element functions, Prophage functions, Transcription, RNA processing]	374
131362	TIGR02309	HpaB-1	4-hydroxyphenylacetate 3-monooxygenase, oxygenase component. The gene for this monooxygenase is found within apparent operons for the degradation of 4-hydroxyphenylacetic acid in Deinococcus, Thermus and Oceanobacillus. Phylogenetic trees support inclusion of the Bacillus halodurans sequence above trusted although the complete 4-hydroxyphenylacetic acid degradation pathway may not exist in that organism. Generally, this enzyme acts with the assistance of a small flavin reductase domain protein (HpaC) to provide the cycle the flavin reductant for the reaction. This family of sequences is a member of a larger subfamily of monooxygenases (pfam03241).	477
213700	TIGR02310	HpaB-2	4-hydroxyphenylacetate 3-monooxygenase, oxygenase component. The gene for this monooxygenase is found within apparent operons for the degradation of 4-hydroxyphenylacetic acid in Shigella, Photorhabdus and Pasteurella. The family represented by this model is narrowly limited to gammaproteobacteria to exclude other aromatic hydroxylases involved in various secondary metabolic pathways. Generally, this enzyme acts with the assistance of a small flavin reductase domain protein (HpaC) to provide the cycle the flavin reductant for the reaction. This family of sequences is a member of a larger subfamily of monooxygenases (pfam03241).	519
131364	TIGR02311	HpaI	2,4-dihydroxyhept-2-ene-1,7-dioic acid aldolase. This model represents the aldolase which performs the final step unique to the 4-hydroxyphenylacetic acid catabolism pathway in which 2,4-dihydroxyhept-2-ene-1,7-dioic acid is split into pyruvate and succinate-semialdehyde. The gene for this enzyme is generally found adjacent to other genes for this pathway organized into an operon.	249
131365	TIGR02312	HpaH	2-oxo-hepta-3-ene-1,7-dioic acid hydratase. This model represents the enzyme which hydrates the double bond of 2-oxo-hepta-3-ene-1,7-dioic acid to form 4-hydroxy-2-oxo-heptane-1,7-dioic acid in the catabolism of 4-hydroxyphenylacetic acid. The gene for this enzyme is generally found adjacent to other genes of this pathway in an apparent operon.	267
131366	TIGR02313	HpaI-NOT-DapA	2,4-dihydroxyhept-2-ene-1,7-dioic acid aldolase. This model represents a subset of the DapA (dihydrodipicolinate synthase) family which has apparently evolved a separate function. The product of DapA, dihydrodipicolinate, results from the non-enzymatic cyclization and dehydration of 6-amino-2,4-dihydroxyhept-2-ene-1,7-dioic acid, which is different from the substrate of this reaction only in the presence of the amino group. In the absence of this amino group, and running the reaction in the opposite direction, the reaction corresponds to the HpaI aldolase component of the 4-hydroxyphenylacetic acid catabolism pathway (see TIGR02311). At present, this variant of DapA is found only in Oceanobacillus iheyensis HTE831 and Thermus thermophilus HB27. In both of these cases, one or more other DapA genes can be found and the one identified by this model is part of an operon for 4-hydroxyphenylacetic acid catabolism.	294
131367	TIGR02314	ABC_MetN	D-methionine ABC transporter, ATP-binding protein. Members of this family are the ATP-binding protein of the D-methionine ABC transporter complex. Known members belong to the Proteobacteria.	343
131368	TIGR02315	ABC_phnC	phosphonate ABC transporter, ATP-binding protein. Phosphonates are a class of phosphorus-containing organic compound with a stable direct C-P bond rather than a C-O-P linkage. A number of bacterial species have operons, typically about 14 genes in size, with genes for ATP-dependent transport of phosphonates, degradation, and regulation of the expression of the system. Members of this protein family are the ATP-binding cassette component of tripartite ABC transporters of phosphonates. [Transport and binding proteins, Anions]	243
131369	TIGR02316	propion_prpE	propionate--CoA ligase. This family contains one of three readily separable clades of proteins in the group of acetate and propionate--CoA ligases. Characterized members of this family act on propionate. From propionyl-CoA, there is a cyclic degradation pathway: it is ligated by PrpC to the TCA cycle intermediate oxaloacetate, acted upon further by PrpD and an aconitase, then cleaved by PrpB to pyruvate and the TCA cycle intermediate succinate.	628
131370	TIGR02317	prpB	methylisocitrate lyase. Members of this family are methylisocitrate lyase, also called (2S,3R)-3-hydroxybutane-1,2,3-tricarboxylate pyruvate-lyase. This enzyme acts in propionate metabolism. It cleaves a carbon-carbon bond to convert 2-methylisocitrate to pyruvate plus succinate. Some members of this family have been annotated, incorrectly it seems, as the related protein carboxyphosphoenolpyruvate phosphomutase, which is involved in synthesizing the antibiotic bialaphos in Streptomyces hygroscopicus.	285
131371	TIGR02318	phosphono_phnM	phosphonate metabolism protein PhnM. This family consists of proteins from the PhnM family. PhnM is a protein associated with phosphonate utilization in a number of bacterial species. In Pseudomonas stutzeri WM88, a protein that is part of a system for the oxidation of phosphites (another form of reduced phosphorus compound) scores between trusted and noise cutoffs. [Energy metabolism, Other]	376
131372	TIGR02319	CPEP_Pphonmut	carboxyvinyl-carboxyphosphonate phosphorylmutase. This family consists of carboxyvinyl-carboxyphosphonate phosphorylmutase (CPEP phosphonomutase), an unusual enzyme involved in the biosynthesis of the antibiotic bialaphos. So far, it is known only in that pathway and only in Streptomyces hygroscopicus. Some related proteins annotated as being functionally equivalent are likely misannotated examples of methylisocitrate lyase, an enzyme of propionate utilization.	294
274077	TIGR02320	PEP_mutase	phosphoenolpyruvate mutase. This family consists of examples of phosphoenolpyruvate phosphomutase, an enzyme that creates a C-P bond as the first step in the biosynthesis of natural products including antibiotics like bialaphos and phosphinothricin in Streptomyces species, phosphonate-modified molecules such as the polysaccharide B of Bacteroides fragilis, the phosphonolipids of Tetrahymena pyriformis, and the glycosylinositolphospholipids of Trypanosoma cruzi. This gene generally occurs in prokaryotic organisms adjacent to the gene for phosphonopyruvate decarboxylase (aepY). Since the PEP phosphomutase reaction favors the substrate PEP energetically, the decarboxylase is required to drive the reaction in the direction of phosphonate production. Most often an aminotransferase (aepZ) is also present which leads to the production of the most common phosphonate compound, 2-aminoethylphosphonate (AEP). A closely related enzyme, phosphonopyruvate hydrolase from Variovorax sp. Pal2, is excluded from this model.	284
131374	TIGR02321	Pphn_pyruv_hyd	phosphonopyruvate hydrolase. This family consists of phosphonopyruvate hydrolase, an enzyme closely related to phosphoenolpyruvate phosphomutase. It cleaves the direct C-P bond of phosphonopyruvate. The characterized example is from Variovorax sp. Pal2.	290
274078	TIGR02322	phosphon_PhnN	phosphonate metabolism protein/1,5-bisphosphokinase (PRPP-forming) PhnN. Members of this family resemble PhnN of phosphonate utilization operons, where different such operons confer the ability to use somewhat different profiles of C-P bond-containing compounds, including phosphites as well as phosphonates. PhnN in E. coli shows considerable homology to guanylate kinases (EC 2.7.4.8), and has actually been shown to act as a ribose 1,5-bisphosphokinase (PRPP forming). This suggests an analogous kinase reaction for phosphonate metabolism, converting 5-phosphoalpha-1-(methylphosphono)ribose to methylphosphono-PRPP. [Central intermediary metabolism, Phosphorus compounds]	179
188208	TIGR02323	CP_lyasePhnK	phosphonate C-P lyase system protein PhnK. Members of this family are the PhnK protein of C-P lyase systems for utilization of phosphonates. These systems resemble phosphonatase-based systems in having a three component ABC transporter, where TIGR01097 is the permease, TIGR01098 is the phosphonates binding protein, and TIGR02315 is the ATP-binding cassette (ABC) protein. They differ, however, in having, typically, ten or more additional genes, many of which are believed to form a membrane-associated complex. This protein (PhnK) and the adjacent-encoded PhnL resemble transporter ATP-binding proteins but are suggested, based on mutagenesis studies, to be part of this complex rather than part of a transporter per se. [Central intermediary metabolism, Phosphorus compounds]	253
131377	TIGR02324	CP_lyasePhnL	phosphonate C-P lyase system protein PhnL. Members of this family are the PhnL protein of C-P lyase systems for utilization of phosphonates. These systems resemble phosphonatase-based systems in having a three component ABC transporter, where TIGR01097 is the permease, TIGR01098 is the phosphonates binding protein, and TIGR02315 is the ATP-binding cassette (ABC) protein. They differ, however, in having, typically, ten or more additional genes, many of which are believed to form a membrane-associated C-P lyase complex. This protein (PhnL) and the adjacent-encoded PhnK (TIGR02323) resemble transporter ATP-binding proteins but are suggested, based on mutagenesis studies, to be part of this C-P lyase complex rather than part of a transporter per se.	224
131378	TIGR02325	C_P_lyase_phnF	phosphonates metabolism transcriptional regulator PhnF. All members of the seed alignment for this family are predicted helix-turn-helix transcriptional regulatory proteins of the broader gntR family and are found associated with genes for the import and degradation of phosphonates and/or related compounds (e.g. phosphonites) with a direct C-P bond. [Transport and binding proteins, Anions, Regulatory functions, DNA interactions]	238
131379	TIGR02326	transamin_PhnW	2-aminoethylphosphonate--pyruvate transaminase. Members of this family are 2-aminoethylphosphonate--pyruvate transaminase. This enzyme acts on the most common type of naturally occurring phosphonate. It interconverts 2-aminoethylphosphonate plus pyruvate with 2-phosphonoacetaldehyde plus alanine. The enzyme phosphonoacetaldehyde hydrolase (EC 3.11.1.1), usually encoded by an adjacent gene, then cleaves the C-P bond of phosphonoacetaldehyde, adding water to yield acetaldehyde plus inorganic phosphate. Species with this pathway generally have an identified phosphonate ABC transporter but do not also have the multisubunit C-P lyase complex as found in Escherichia coli. [Central intermediary metabolism, Phosphorus compounds]	363
131380	TIGR02327	int_mem_ywzB	conserved hypothetical integral membrane protein. Members of this protein family are small, typically about 80 residues in length, and are highly hydrophobic. The gene is found so far only in a subset of the Firmicutes in association with genes of the ATP synthase F1 complex or NADH-quinone oxidoreductase. This family includes ywzB from Bacillus subtilis; pfam06612 describes the same family as Protein of unknown function DUF1146.	68
131381	TIGR02328	TIGR02328	conserved hypothetical protein. Members of this protein family are found in a small number of taxonomically well separated species, yet are strongly conserved, suggesting lateral gene transfer. Members are found in Treponema denticola, Clostridium acetobutylicum, and several of the Firmicutes. The function of this protein is unknown. [Hypothetical proteins, Conserved]	120
274079	TIGR02329	propionate_PrpR	propionate catabolism operon regulatory protein PrpR. At least five distinct pathways exist for the catabolism of propionate by way of propionyl-CoA. Members of this family represent the transcriptional regulatory protein PrpR, whose gene is found in most cases divergently transcribed from an operon for the methylcitric acid cycle of propionate catabolism. 2-methylcitric acid, a catabolite by this pathway, is a coactivator of PrpR. [Regulatory functions, DNA interactions]	526
131383	TIGR02330	prpD	2-methylcitrate dehydratase. Members of this family are bacterial proteins known or predicted to act as 2-methylcitrate dehydratase, an enzyme involved in the methylcitrate cycle of propionate catabolism. A related clade of archaeal proteins that may or may not be functionally equivalent is reserved for a future model and is excluded from this family. The PrpD enzyme of E. coli is responsible for the minor aconitase activity (AcnC) not accounted for by AcnA and AcnB.	468
131384	TIGR02331	rib_alpha	Rib/alpha/Esp surface antigen repeat. Sequences in this family are tandem repeats of about 79 amino acids, present in up to 14 copies in a protein and highly identical, even at the DNA level, within each protein. Sequences with these repeats are found in the Rib and alpha surface antigens of group B Streptococcus, Esp of Enterococcus faecalis, and related proteins of Lactobacillus. The repeat lacks Cys residues. Most members of this protein family also have the cell wall anchor motif LPXTG shared by many staphylococcal and streptococcal surface antigens.	80
131385	TIGR02332	HpaX	4-hydroxyphenylacetate permease. This protein is a part of the Major Facilitator Superfamily (pfam07690). Members of this family are found in a number of proteobacterial genomes, but only in the context of having genes for 4-hydroxyphenylacetate (4-HPA) degradation. The protein is characterized by Prieto, et al. as 4-hydroxyphenylacetate permease in E. coli, where 3-HPA and 3,4-dihydroxyphenylacetate are shown to competitively inhibit 4-HPA transport and therefore also interact specifically.	412
131386	TIGR02333	2met_isocit_dHY	2-methylisocitrate dehydratase, Fe/S-dependent. Members of this family appear in an operon for the degradation of propionyl-CoA via 2-methylcitrate. This family is homologous to aconitases A and B and appears to act as 2-methylisocitrate dehydratase, the enzyme after PrpD and before PrpB. In Escherichia coli, which lacks a member of this family, 2-methylisocitrate dehydratase activity was traced to aconitase B (TIGR00117).	858
131387	TIGR02334	prpF	probable AcnD-accessory protein PrpF. The 2-methylcitrate cycle is one of at least five degradation pathways for propionate via propionyl-CoA. Degradation of propionate toward pyruvate consumes oxaloacetate and releases succinate. Oxidation of succinate back into oxaloacetate by the TCA cycle makes the 2-methylcitrate pathway a cycle. This family consists of PrpF, an incompletely characterized protein that appears to be an essential accessory protein for the Fe/S-dependent 2-methylisocitrate dehydratase AcnD (TIGR02333). This protein is related to but distinct from FldA (part of pfam04303), a putative fluorene degradation protein of Sphingomonas sp. LB126. [Energy metabolism, Fermentation]	390
131388	TIGR02335	hydr_PhnA	phosphonoacetate hydrolase. This family consists of examples of phosphonoacetate hydrolase, an enzyme specific for the cleavage of the C-P bond in phosphonoacetate. Phosphonates are organic compounds with a direct C-P bond that is far less labile than the C-O-P bonds of phosphate attachment sites. Phosphonates may be degraded for phosphorus and energy by a broad spectrum C-P lyase encoded by a large operon or by specific enzymes for some of the more common phosphonates in nature. This family represents an enzyme from the latter category. It may be found encoded near genes for phosphonate transport and for other specific phosphonatases.	408
213701	TIGR02336	TIGR02336	1,3-beta-galactosyl-N-acetylhexosamine phosphorylase. Members of this family are found in phylogenetically diverse bacteria, including Clostridium perfringens (in the Firmicutes), Bifidobacterium longum and Propionibacterium acnes (in the Actinobacteria), and Vibrio vulnificus (in the Proteobacteria), most of which occur as mammalian pathogens or commensals. The nominal activity, 1,3-beta-galactosyl-N-acetylhexosamine phosphorylase (EC 2.4.1.211), varies somewhat from instance to instance in relative rates for closely related substrates. [Energy metabolism, Biosynthesis and degradation of polysaccharides]	719
188209	TIGR02337	HpaR	homoprotocatechuate degradation operon regulator, HpaR. This Helix-Turn-Helix transcriptional regulator is a member of the MarR family (pfam01047) and is found in association with operons for the degradation of 4-hydroxyphenylacetic acid via homoprotocatechuate.	118
131391	TIGR02338	gimC_beta	prefoldin, beta subunit, archaeal. Chaperonins are cytosolic, ATP-dependent molecular chaperones, with a conserved toroidal architecture, that assist in the folding of nascent and/or denatured polypeptide chains. The group I chaperonin system consists of GroEL and GroES, and is found (usually) in bacteria and organelles of bacterial origin. The group II chaperonin system, called the thermosome in Archaea and TRiC or CCT in the Eukaryota, is structurally similar but only distantly related. Prefoldin, also called GimC, is a complex in Archaea and Eukaryota, that works with group II chaperonins. Members of this protein family are the archaeal clade of the beta class of prefoldin subunit. Closely related, but outside the scope of this family are the eukaryotic beta-class prefoldin subunits, Gim-1,3,4 and 6. The alpha class prefoldin subunits are more distantly related.	110
274080	TIGR02339	thermosome_arch	thermosome, various subunits, archaeal. Thermosome is the name given to the archaeal rather than eukaryotic form of the group II chaperonin (counterpart to the group I chaperonin, GroEL/GroES, in bacteria), a toroidal, ATP-dependent molecular chaperone that assists in the folding or refolding of nascent or denatured proteins. Various homologous subunits, one to five per archaeal genome, may be designated alpha, beta, etc., but phylogenetic analysis does not show distinct alpha subunit and beta subunit lineages traceable to ancient paralogs. [Protein fate, Protein folding and stabilization]	519
274081	TIGR02340	chap_CCT_alpha	T-complex protein 1, alpha subunit. Members of this family, all eukaryotic, are part of the group II chaperonin complex called CCT (chaperonin containing TCP-1) or TRiC. The archaeal equivalent group II chaperonin is often called the thermosome. Both are somewhat related to the group I chaperonin of bacteria, GroEL/GroES. This family consists exclusively of the CCT alpha chain (part of a paralogous family) from animals, plants, fungi, and other eukaryotes.	536
274082	TIGR02341	chap_CCT_beta	T-complex protein 1, beta subunit. Members of this family, all eukaryotic, are part of the group II chaperonin complex called CCT (chaperonin containing TCP-1) or TRiC. The archaeal equivalent group II chaperonin is often called the thermosome. Both are somewhat related to the group I chaperonin of bacteria, GroEL/GroES. This family consists exclusively of the CCT beta chain (part of a paralogous family) from animals, plants, fungi, and other eukaryotes.	519
274083	TIGR02342	chap_CCT_delta	T-complex protein 1, delta subunit. Members of this family, all eukaryotic, are part of the group II chaperonin complex called CCT (chaperonin containing TCP-1) or TRiC. The archaeal equivalent group II chaperonin is often called the thermosome. Both are somewhat related to the group I chaperonin of bacteria, GroEL/GroES. This family consists exclusively of the CCT delta chain (part of a paralogous family) from animals, plants, fungi, and other eukaryotes.	517
274084	TIGR02343	chap_CCT_epsi	T-complex protein 1, epsilon subunit. Members of this family, all eukaryotic, are part of the group II chaperonin complex called CCT (chaperonin containing TCP-1) or TRiC. The archaeal equivalent group II chaperonin is often called the thermosome. Both are somewhat related to the group I chaperonin of bacteria, GroEL/GroES. This family consists exclusively of the CCT epsilon chain (part of a paralogous family) from animals, plants, fungi, and other eukaryotes.	532
274085	TIGR02344	chap_CCT_gamma	T-complex protein 1, gamma subunit. Members of this family, all eukaryotic, are part of the group II chaperonin complex called CCT (chaperonin containing TCP-1) or TRiC. The archaeal equivalent group II chaperonin is often called the thermosome. Both are somewhat related to the group I chaperonin of bacteria, GroEL/GroES. This family consists exclusively of the CCT gamma chain (part of a paralogous family) from animals, plants, fungi, and other eukaryotes.	524
274086	TIGR02345	chap_CCT_eta	T-complex protein 1, eta subunit. Members of this family, all eukaryotic, are part of the group II chaperonin complex called CCT (chaperonin containing TCP-1) or TRiC. The archaeal equivalent group II chaperonin is often called the thermosome. Both are somewhat related to the group I chaperonin of bacteria, GroEL/GroES. This family consists exclusively of the CCT eta chain (part of a paralogous family) from animals, plants, fungi, and other eukaryotes.	523
274087	TIGR02346	chap_CCT_theta	T-complex protein 1, theta subunit. Members of this family, all eukaryotic, are part of the group II chaperonin complex called CCT (chaperonin containing TCP-1) or TRiC. The archaeal equivalent group II chaperonin is often called the thermosome. Both are somewhat related to the group I chaperonin of bacteria, GroEL/GroES. This family consists exclusively of the CCT theta chain (part of a paralogous family) from animals, plants, fungi, and other eukaryotes.	531
274088	TIGR02347	chap_CCT_zeta	T-complex protein 1, zeta subunit. Members of this family, all eukaryotic, are part of the group II chaperonin complex called CCT (chaperonin containing TCP-1) or TRiC. The archaeal equivalent group II chaperonin is often called the thermosome. Both are somewhat related to the group I chaperonin of bacteria, GroEL/GroES. This family consists exclusively of the CCT zeta chain (part of a paralogous family) from animals, plants, fungi, and other eukaryotes.	531
274089	TIGR02348	GroEL	chaperonin GroL. This family consists of GroEL, the larger subunit of the GroEL/GroES cytosolic chaperonin. It is found in bacteria, organelles derived from bacteria, and occasionally in the Archaea. The bacterial GroEL/GroES group I chaperonin is replaced by a group II chaperonin, usually called the thermosome in the Archaeota and CCT (chaperone-containing TCP) in the Eukaryota. GroEL, thermosome subunits, and CCT subunits all fall under the scope of pfam00118. [Protein fate, Protein folding and stabilization]	524
274090	TIGR02349	DnaJ_bact	chaperone protein DnaJ. This model represents bacterial forms of DnaJ, part of the DnaK-DnaJ-GrpE chaperone system. The three components typically are encoded by consecutive genes. DnaJ homologs occur in many genomes, typically not near DnaK and GrpE-like genes; most such genes are not included by this family. Eukaryotic (mitochondrial and chloroplast) forms are not included in the scope of this family.	354
274091	TIGR02350	prok_dnaK	chaperone protein DnaK. Members of this family are the chaperone DnaK, of the DnaK-DnaJ-GrpE chaperone system. All members of the seed alignment were taken from completely sequenced bacterial or archaeal genomes and (except for the Mycoplasma sequence) found clustered with other genes of this system. This model excludes DnaK homologs that are not DnaK itself, such as the heat shock cognate protein HscA (TIGR01991). However, it is not designed to distinguish among DnaK paralogs in eukaryotes. Note that a number of dnaK genes have shadow ORFs in the same reverse (relative to dnaK) reading frame, a few of which have been assigned glutamate dehydrogenase activity. The significance of this observation is unclear; lengths of such shadow ORFs are highly variable as if the presumptive protein product is not conserved. [Protein fate, Protein folding and stabilization]	595
131404	TIGR02351	thiH	thiazole biosynthesis protein ThiH. Members of this protein family are the ThiH protein of thiamine biosynthesis, a homolog of the BioB protein of biotin biosynthesis. Genes for this protein generally are found in operons with other thiamine biosynthesis genes. [Biosynthesis of cofactors, prosthetic groups, and carriers, Thiamine]	366
274092	TIGR02352	thiamin_ThiO	glycine oxidase ThiO. This family consists of the homotetrameric, FAD-dependent glycine oxidase ThiO, from species such as Bacillus subtilis that use glycine in thiamine biosynthesis. In general, members of this family will not be found in species such as E. coli that instead use tyrosine and the ThiH protein. [Biosynthesis of cofactors, prosthetic groups, and carriers, Thiamine]	337
274093	TIGR02353	NRPS_term_dom	non-ribosomal peptide synthetase terminal domain of unknown function. This domain is found exclusively in non-ribosomal peptide synthetases and always as the final domain in the polypeptide. This domain is roughly 700 amino acids in size and is found in polypeptides roughly twice that size.	695
162819	TIGR02354	thiF_fam2	thiamine biosynthesis protein ThiF, family 2. Members of the HesA/MoeB/ThiF family of proteins (pfam00899) include a number of members encoded in the midst of thiamine biosynthetic operons. This mix of known and putative ThiF proteins shows a deep split in phylogenetic trees, with the E. coli ThiF and the E. coli MoeB proteins seemingly more closely related than E. coli ThiF and Campylobacter (for example) ThiF. This model represents the divergent clade of putative ThiF proteins such as found in Campylobacter. [Biosynthesis of cofactors, prosthetic groups, and carriers, Thiamine]	200
131408	TIGR02355	moeB	molybdopterin synthase sulfurylase MoeB. This model describes the molybdopterin biosynthesis protein MoeB in E. coli and related species. The enzyme covalently modifies the molybdopterin synthase MoaD by sulfurylation. This enzyme is closely related to ThiF, a thiamine biosynthesis enzyme that modifies ThiS by an analogous adenylation. Both MoeB and ThiF belong to the HesA/MoeB/ThiF family (pfam00899). [Biosynthesis of cofactors, prosthetic groups, and carriers, Molybdopterin]	240
274094	TIGR02356	adenyl_thiF	thiazole biosynthesis adenylyltransferase ThiF, E. coli subfamily. Members of the HesA/MoeB/ThiF family of proteins (pfam00899) include a number of members encoded in the midst of thiamine biosynthetic operons. This mix of known and putative ThiF proteins shows a deep split in phylogenetic trees, with the E. coli ThiF and the E. coli MoeB proteins seemingly more closely related than E. coli ThiF and Campylobacter (for example) ThiF. This model represents the more widely distributed clade of ThiF proteins such as found in E. coli. [Biosynthesis of cofactors, prosthetic groups, and carriers, Thiamine]	202
274095	TIGR02357	ECF_ThiT_YuaJ	energy-coupled thiamine transporter ThiT. Members of this protein family have been assigned as thiamine transporters by a phylogenomic analysis of families of genes regulated by the THI element, a broadly conserved RNA secondary structure element through which thiamine pyrophosphate (TPP) levels can regulate transcription of many genes related to thiamine transport, salvage, and de novo biosynthesis. Species with this protein always lack the ThiBPQ ABC transporter. In some species (e.g. Streptococcus mutans and Streptococcus pyogenes), yuaJ is the only THI-regulated gene. Evidence from Bacillus cereus indicated thiamine uptake is coupled to proton translocation. However, a more recent comprehensive study of energy-coupled factor (ECF) transport suggests this protein is the S (substrate capture) component of an ECF system, meaning it is energized by ATP. Previously YuaJ, but renamed ThiT. [Transport and binding proteins, Other]	173
274096	TIGR02358	thia_cytX	putative hydroxymethylpyrimidine transporter CytX. On the basis of a phylogenomic study of thiamine biosynthetic, salvage, and transporter genes and a highly conserved RNA element THI, this protein family has been identified as a probable transporter of hydroxymethylpyrimidine (HMP), the phosphorylated (by ThiD) form of which gets joined (by ThiE) to hydroxyethylthiazole phosphate to make thiamine phosphate. [Transport and binding proteins, Nucleosides, purines and pyrimidines, Biosynthesis of cofactors, prosthetic groups, and carriers, Thiamine]	386
131412	TIGR02359	thiW	energy coupling factor transporter S component ThiW. Levels of thiamine pyrophosphate (TPP) or thiamine regulate transcription or translation of a number of thiamine biosynthesis, salvage, or transport genes in a wide range of prokaryotes. The mechanism involves direct binding, with no protein involved, to a structural element called THI found in the untranslated upstream region of thiamine metabolism gene operons. This element is called a riboswitch and is seen also for other metabolites such as FMN and glycine. This protein family consists of proteins identified in operons controlled by the THI riboswitch and designated ThiW. The hydrophobic nature of this protein and reconstructed metabolic background suggests that this protein acts in transport of a thiazole precursor of thiamine. [Transport and binding proteins, Other, Biosynthesis of cofactors, prosthetic groups, and carriers, Thiamine]	160
131413	TIGR02360	pbenz_hydroxyl	4-hydroxybenzoate 3-monooxygenase. Members of this family are the enzyme 4-hydroxybenzoate 3-monooxygenase, also called p-hydroxybenzoate hydroxylase. It converts 4-hydroxybenzoate + NADPH + molecular oxygen to protocatechuate + NADP+ + water. It contains monooxygenase (pfam01360) and FAD binding (pfam01494) domains. Pathways that contain this enzyme include the protocatechuate 4,5-degradation pathway. [Energy metabolism, Other]	390
274097	TIGR02361	dak_ATP	dihydroxyacetone kinase, ATP-dependent. This family consists of examples of the form of dihydroxyacetone kinase (also called glycerone kinase) that uses ATP (2.7.1.29) as the phosphate donor, rather than a phosphoprotein as in E. coli. This form is composed of a single chain with separable domains homologous to the K and L subunits of the E. coli enzyme, and is found in yeasts and other eukaryotes and in some bacteria, including Citrobacter freundii. The member from tomato has been shown to phosphorylate dihydroxyacetone, 3,4-dihydroxy-2-butanone, and some other aldoses and ketoses.	574
213706	TIGR02362	dhaK1b	probable dihydroxyacetone kinase DhaK1b subunit. Two types of dihydroxyacetone kinase (glycerone kinase) are described. In yeast and a few bacteria, e.g. Citrobacter freundii, the enzyme is a single chain that uses ATP as phosphoryl donor and is designated EC 2.7.1.29. By contrast, E. coli and many other bacterial species have a multisubunit form with a phosphoprotein donor related to PTS transport proteins. This family represents a protein, unique to the Firmicutes (low GC Gram-positives), that appears to be a divergent second copy of the K subunit of that complex; its gene is always found in operons with the other three proteins of the complex.	326
274098	TIGR02363	dhaK1	dihydroxyacetone kinase, DhaK subunit. Two types of dihydroxyacetone kinase (glycerone kinase) are described. In yeast and a few bacteria, e.g. Citrobacter freundii, the enzyme is a single chain that uses ATP as phosphoryl donor and is designated EC 2.7.1.29. By contrast, E. coli and many other bacterial species have a multisubunit form (EC 2.7.1.-) with a phosphoprotein donor related to PTS transport proteins. This family represents the DhaK subunit of the latter type of dihydroxyacetone kinase, but it specifically excludes the DhaK paralog DhaK2 (TIGR02362) found in the same operon as DhaK and DhaK in the Firmicutes.	329
131417	TIGR02364	dha_pts	dihydroxyacetone kinase, phosphotransfer subunit. In E. coli and many other bacteria, unlike the yeasts and a few bacteria such as Citrobacter freundii, the dihydroxyacetone kinase (also called glycerone kinase) transfers a phosphate from a phosphoprotein rather than from ATP and contains multiple subunits. This protein, which resembles proteins of PTS transport systems, is found with its gene adjacent to	125
274099	TIGR02365	dha_L_ycgS	dihydroxyacetone kinase, phosphoprotein-dependent, L subunit. Two types of dihydroxyacetone kinase (glycerone kinase) are described. In yeast and a few bacteria, e.g. Citrobacter freundii, the enzyme is a single chain that uses ATP as phosphoryl donor and is designated EC 2.7.1.29. By contrast, E. coli and many other bacterial species have a multisubunit form (EC 2.7.1.-) with a phosphoprotein donor related to PTS transport proteins. This family represents the subunit homologous to the E. coli YcgS subunit.	194
274100	TIGR02366	DHAK_reg	probable dihydroxyacetone kinase regulator. The seed alignment for this family was built from a set of closely related uncharacterized proteins associated with operons for the type of bacterial dihydroxyacetone kinase that transfers PEP-derived phosphate from a phosphoprotein, as in phosphotransferase system transport, rather than from ATP. Members have a TetR transcriptional regulator domain (pfam00440) at the N-terminus and sequence homology throughout.	176
188213	TIGR02367	PylS_Cterm	pyrrolysyl-tRNA synthetase, C-terminal region. PylS is the enzyme responsible for charging the pyrrolysine tRNA, PylT, by ligating a free molecule of pyrrolysine. Pyrrolysine is encoded at an in-frame UAG (amber) at least in several corrinoid-dependent methyltransferases of the archaeal genera Methanosarcina and Methanococcoides, such as trimethylamine methyltransferase. This protein occurs as a fusion protein in Methanosarcina but as split genes in Desulfitobacterium hafniense and other bacteria. [Protein synthesis, tRNA aminoacylation]	242
131421	TIGR02368	dimeth_PyL	dimethylamine:corrinoid methyltransferase. This family consists of dimethylamine methyltransferases from the genus Methanosarcina. It is found in three nearly identical copies in each of M. acetivorans, M. barkeri, and M. mazei. It is one of a suite of three non-homologous enzymes with a critical UAG-encoded pyrrolysine residue in these species (along with trimethylamine methyltransferase and monomethylamine methyltransferase). It demethylates dimethylamine, leaving monomethylamine, and methylates the prosthetic group of the small corrinoid protein MtbC. The methyl group is then transferred by methylcorrinoid:coenzyme M methyltransferase to coenzyme M. Note that the pyrrolysine residue is variously translated as K or X, or as a stop codon that truncates the sequence.	466
131422	TIGR02369	trimeth_pyl	trimethylamine:corrinoid methyltransferase. This model represents a distinct subfamily of pfam06253. All members here are trimethylamine:corrinoid methyltransferases that contain a critical pyrrolysine residue incorporated during translation via a special tRNA for a TAG (amber) codon. Known members so far are from the genus Methanosarcina. It is one of a suite of three non-homologous enzymes with a critical UAG-encoded pyrrolysine residue in these species (along with dimethylamine methyltransferase and monomethylamine methyltransferase). It demethylates trimethylamine, leaving dimethylamine, and methylates the prosthetic group of its small cognate corrinoid protein, MttC. The methyl group is then transferred by methylcorrinoid:coenzyme M methyltransferase to coenzyme M. Note that the pyrrolysine residue is variously translated as K or X, or as a stop codon that truncates the sequence.	489
131423	TIGR02370	pyl_corrinoid	methyltransferase cognate corrinoid proteins, Methanosarcina family. This model describes a subfamily of the B12 binding domain (pfam02607, pfam02310) proteins. Members of the seed alignment include corrinoid proteins specific to four different, mutually non-homologous enzymes of the genus Methanosarcina. Three of the four cognate enzymes (trimethylamine, dimethylamine, and monomethylamine methyltransferases) all have the unusual, ribosomally incorporated amino acid pyrrolysine at the active site. All act in systems in which a methyl group is transferred to the corrinoid protein to create methylcobalamin, from which the methyl group is later transferred elsewhere.	197
131424	TIGR02371	ala_DH_arch	alanine dehydrogenase, Archaeoglobus fulgidus type. This enzyme, a homolog of bacterial ornithine cyclodeaminases and marsupial mu-crystallins, is a homodimeric, NAD-dependent alanine dehydrogenase found in Archaeoglobus fulgidus and several other Archaea. For a number of close homologs, scoring between trusted and noise cutoffs, it is not clear at present what is the enzymatic activity.	325
131425	TIGR02372	4_coum_CoA_lig	4-coumarate--CoA ligase, photoactive yellow protein activation family. This model represents the 4-coumarate--CoA ligase associated with biosynthesis of the 4-hydroxy cinnamyl (also called 4-coumaroyl) chromophore covalently linked to a Cys residue in photoactive yellow protein of Rhodobacter spp. and	386
131426	TIGR02373	photo_yellow	photoactive yellow protein. Members of this family are photoactive yellow protein, a cytosolic, 14-kDa light-sensing protein which has a 4-hydroxycinnamyl (p-coumaric acid) chromophore covalently linked to a Cys residue. The enzyme 4-coumarate--CoA ligase as described by TIGR02372 is required for its biosynthesis. The modified Cys is in a PAS (pfam00989) domain, frequently found in signal transducing proteins. Members are known in alpha and gamma Proteobacteria that include Rhodobacter capsulatus, Halorhodospira halophila, Rhodospirillum centenum, etc.	124
162827	TIGR02374	nitri_red_nirB	nitrite reductase [NAD(P)H], large subunit. [Central intermediary metabolism, Nitrogen metabolism]	785
131428	TIGR02375	pseudoazurin	pseudoazurin. Pseudoazurin, also called cupredoxin, is a small, blue periplasmic protein with a single bound copper atom. Pseudoazurin is related to plastocyanins. Several examples of pseudoazurin are encoded by a neighboring gene for, or have been shown to transfer electrons to, copper-containing nitrite reductases (TIGR02376) of the same species. [Energy metabolism, Electron transport]	116
131429	TIGR02376	Cu_nitrite_red	nitrite reductase, copper-containing. This family consists of copper-type nitrite reductase. It reduces nitrite to nitric oxide, the first step in denitrification. [Central intermediary metabolism, Nitrogen metabolism]	311
131430	TIGR02377	MocE_fam_FeS	Rieske [2Fe-2S] domain protein, MocE subfamily. This model describes a subfamily of the Rieske-like [2Fe-2S] family of ferredoxins that includes MocE, part of the rhizopine (3-O-methyl-scyllo-inosamine) catabolic cluster in Rhizobium. Members of this family are related to, yet distinct from, the small subunit of nitrite reductase [NAD(P)H].	101
131431	TIGR02378	nirD_assim_sml	nitrite reductase [NAD(P)H], small subunit. This model describes NirD, the small subunit of nitrite reductase [NAD(P)H] (the assimilatory nitrite reductase), which associates with NirB, the large subunit (TIGR02374). In a few bacteria such as Klebsiella pneumoniae and in Fungi, the two regions are fused. [Central intermediary metabolism, Nitrogen metabolism]	105
131432	TIGR02379	ECA_wecE	TDP-4-keto-6-deoxy-D-glucose transaminase. This family consists of TDP-4-keto-6-deoxy-D-glucose transaminases, the WecE (formerly RffA) protein of enterobacterial common antigen (ECA) biosynthesis, from enterobacteria. It also includes closely matching sequence from species not expected to make ECA, but which contain other genes for the biosynthesis of TDP-4-keto-6-deoxy-D-Glc, an intermediate in the biosynthesis of other compounds as well and the substrate of WecA. This family belongs to the DegT/DnrJ/EryC1/StrS aminotransferase family (pfam01041). [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	376
131433	TIGR02380	ECA_wecA	undecaprenyl-phosphate alpha-N-acetylglucosaminyl 1-phosphatetransferase. Members of this family are the WecA enzyme of enterobacterial common antigen biosynthesis, undecaprenyl-phosphate alpha-N-acetylglucosaminyl 1-phosphatetransferase. This family represents one narrow clade, and closely related sequences outside this clade may represent enzymes that catalyze the same specific reaction, but in the context of different pathways. A His-rich motif in a cytosolic loop of this integral membrane protein, shown critical to enzymatic activity for WecA is variously present or absent in the clade that includes Bacillus subtilis TagO teichoic acid biosynthesis enzyme, which may catalyze the same reaction as WecA. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	346
131434	TIGR02381	cspD	cold shock domain protein CspD. This model represents what appears to be a phylogenetically distinct clade, containing E. coli CspD (SP|P24245) and related proteobacterial proteins within the larger family of cold shock domain proteins described by pfam00313. The gene symbol cspD may have been used independently for other subfamilies of cold shock domain proteins, such as for B. subtilis CspD. These proteins typically are shorter than 70 amino acids. In E. coli, CspD is a stress response protein induced in stationary phase. This homodimer binds single-stranded DNA and appears to inhibit DNA replication. [DNA metabolism, DNA replication, recombination, and repair, Cellular processes, Adaptations to atypical conditions]	68
131435	TIGR02382	wecD_rffC	TDP-D-fucosamine acetyltransferase. This model represents the WecD protein (formerly RffC) for the biosynthesis of enterobacterial common antigen (ECA), an outer leaflet, outer membrane glycolipid with a trisaccharide repeat unit. WecD is a member of the GNAT family of acetyltransferases (pfam00583). [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	191
274101	TIGR02383	Hfq	RNA chaperone Hfq. This model represents the RNA-binding pleiotropic regulator Hfq, a small, Sm-like protein of bacteria. It helps pair regulatory noncoding RNAs with complementary mRNA target regions. It enhances the elongation of poly(A) tails on mRNA. It appears also to protect RNase E recognition sites (A/U-rich sequences with adjacent stem-loop structures) from cleavage. Being pleiotropic, it differs in some of its activities in different species. Hfq binds the non-coding regulatory RNA DsrA (see Rfam RF00014) in the few species known to have it: Escherichia coli, Shigella flexneri, Salmonella spp. In Azorhizobium caulinodans, an hfq mutant is unable to express nifA, and Hfq is called NrfA, for nif regulatory factor. The name hfq reflects phenomenology as a host factor for phage Q-beta RNA replication. [Regulatory functions, Other]	61
274102	TIGR02384	RelB_DinJ	addiction module antitoxin, RelB/DinJ family. Plasmids may be maintained stably in bacterial populations through the action of addiction modules, in which a toxin and antidote are encoded in a cassette on the plasmid. In any daughter cell that lacks the plasmid, the toxin persists and is lethal after the antidote protein is depleted. Toxin/antitoxin pairs are also found on main chromosomes, and likely represent selfish DNA. Sequences in the seed for this alignment all were found adjacent to toxin genes. The resulting model appears to describe a narrower set of proteins than pfam04221, although many in the scope of this model are not obviously paired with toxin proteins. Several toxin/antitoxin pairs may occur in a single species. [Cellular processes, Toxin production and resistance, Mobile and extrachromosomal element functions, Other]	84
211740	TIGR02385	RelE_StbE	addiction module toxin, RelE/StbE family. Plasmids may be maintained stably in bacterial populations through the action of addiction modules, in which a toxin and antidote are encoded in a cassette on the plasmid. In any daughter cell that lacks the plasmid, the toxin persists and is lethal after the antidote protein is depleted. Toxin/antitoxin pairs are also found on main chromosomes, and likely represent selfish DNA. Sequences in the seed for this alignment all are found adjacent to RelB/DinJ family antitoxin genes (TIGR02384), as are most genes found by the resulting model. StbE from Morganella morganii plasmid R485 shows typical behaviour for an addiction module toxin. It cannot be cloned without its partner (the antitoxin), whereas its partner cannot confer plasmid stability without StbE. [Cellular processes, Toxin production and resistance, Mobile and extrachromosomal element functions, Other]	88
274103	TIGR02386	rpoC_TIGR	DNA-directed RNA polymerase, beta' subunit, predominant form. Bacteria have a single DNA-directed RNA polymerase, with required subunits that include alpha, beta, and beta-prime. This model describes the predominant architecture of the beta-prime subunit in most bacteria. Among bacterial sequences, this model excludes mostly those from the cyanobacteria, where RpoC is replaced by two tandem genes homologous to it but also encoding an additional domain. [Transcription, DNA-dependent RNA polymerase]	1140
131440	TIGR02387	rpoC1_cyan	DNA-directed RNA polymerase, gamma subunit. The RNA polymerase gamma subunit, encoded by the rpoC1 gene, is found in cyanobacteria and corresponds to the N-terminal region of the beta' subunit, encoded by rpoC, in other bacteria. The equivalent subunit in plastids and chloroplasts is designated beta', while the product of the rpoC2 gene is designated beta''.	619
274104	TIGR02388	rpoC2_cyan	DNA-directed RNA polymerase, beta'' subunit. The family consists of the product of the rpoC2 gene, a subunit of DNA-directed RNA polymerase of cyanobacteria and chloroplasts. RpoC2 corresponds largely to the C-terminal region of the RpoC (the beta' subunit) of other bacteria. Members of this family are designated beta'' in chloroplasts/plastids, and beta' (confusingly) in Cyanobacteria, where RpoC1 is called beta' in chloroplasts/plastids and gamma in Cyanobacteria. We prefer to name this family beta'', after its organellar members, to emphasize that RpoC1 and RpoC2 together replace RpoC in other bacteria. [Transcription, DNA-dependent RNA polymerase]	1227
274105	TIGR02389	RNA_pol_rpoA2	DNA-directed RNA polymerase, subunit A''. This family consists of the archaeal A'' subunit of the DNA-directed RNA polymerase. The example from Methanocaldococcus jannaschii contains an intein. [Transcription, DNA-dependent RNA polymerase]	367
274106	TIGR02390	RNA_pol_rpoA1	DNA-directed RNA polymerase subunit A'. This family consists of the archaeal A' subunit of the DNA-directed RNA polymerase. The example from Methanocaldococcus jannaschii contains an intein.	868
162834	TIGR02391	hypoth_ymh	TIGR02391 family protein. This family consists of a relatively rare (~ 8 occurrences per 200 genomes) prokaryotic protein family. Genes for members appear to be associated variously with phage and plasmid regions, restriction system loci, transposons, and housekeeping genes. The function is unknown. [Hypothetical proteins, Domain]	125
274107	TIGR02392	rpoH_proteo	alternative sigma factor RpoH. A sigma factor is a DNA-binding protein that binds to the DNA-directed RNA polymerase core to produce the holoenzyme capable of initiating transcription at specific sites. Different sigma factors act in vegetative growth, heat shock, extracytoplasmic functions (ECF), etc. This model represents the clade of sigma factors called RpoH and further restricted to the Proteobacteria. This protein may be called sigma-32, sigma factor H, heat shock sigma factor, and alternative sigma factor RpoH. Note that in some species the single locus rpoH may be replaced by two or more differentially regulated stress response sigma factors. [Cellular processes, Adaptations to atypical conditions, Transcription, Transcription factors]	270
274108	TIGR02393	RpoD_Cterm	RNA polymerase sigma factor RpoD, C-terminal domain. This model represents the well-conserved C-terminal region of the major, essential sigma factor of most bacteria. Members of this clade show considerable variability in domain architecture and molecular weight, as well as in nomenclature: RpoD in E. coli and other Proteobacteria, SigA in Bacillus subtilis and many other Gram-positive bacteria, HrdB in Streptomyces, MysA in Mycobacterium smegmatis, etc. [Transcription, Transcription factors]	238
131447	TIGR02394	rpoS_proteo	RNA polymerase sigma factor RpoS. A sigma factor is a DNA-binding protein that binds to the DNA-directed RNA polymerase core to produce the holoenzyme capable of initiating transcription at specific sites. Different sigma factors act in vegetative growth, heat shock, extracytoplasmic functions (ECF), etc. This model represents the clade of sigma factors called RpoS (also called sigma-38, KatF, etc.), found only in Proteobacteria. This sigma factor is induced in stationary phase (in response to the stress of nutrient limitation) and becomes the second principal sigma factor at that time. RpoS is a member of the larger Sigma-70 subfamily (TIGR02937) and most closely related to RpoD (TIGR02393). [Cellular processes, Adaptations to atypical conditions, Transcription, Transcription factors]	285
274109	TIGR02395	rpoN_sigma	RNA polymerase sigma-54 factor. A sigma factor is a DNA-binding protein that binds to the DNA-directed RNA polymerase core to produce the holoenzyme capable of initiating transcription at specific sites. Different sigma factors act in vegetative growth, heat shock, extracytoplasmic functions (ECF), etc. This model represents the clade of sigma factors called sigma-54, or RpoN (unrelated to sigma 70-type factors such as RpoD/SigA). RpoN is responsible for enhancer-dependent transcription, and its presence characteristically is associated with varied panels of activators, most of which are enhancer-binding proteins (but see Brahmachary, et al.). RpoN may be responsible for transcription of nitrogen fixation genes, flagellins, pilins, etc., and synonyms for the gene symbol rpoN, such as ntrA, reflect these observations. [Transcription, Transcription factors]	429
274110	TIGR02396	diverge_rpsU	rpsU-divergently transcribed protein. This uncharacterized protein is found in a number of Alphaproteobacteria and, with N-terminal regions long enough to be transit peptides, in eukaryotes. This phylogeny suggests mitochondrial derivation. In several Alphaproteobacteria, the gene for this protein is encoded divergently from rpsU, the gene for ribosomal protein S21. S21 is unusual in being encoded outside the usual long ribosomal protein operons, but rather in contexts that suggest regulation of the initiation of protein translation. [Unknown function, General]	185
274111	TIGR02397	dnaX_nterm	DNA polymerase III, subunit gamma and tau. This model represents the well-conserved first ~ 365 amino acids of the translation of the dnaX gene. The full-length product of the dnaX gene in the model bacterium E. coli is the DNA polymerase III tau subunit. A translational frameshift leads to early termination and a truncated protein subunit gamma, about 1/3 shorter than tau and present in roughly equal amounts. This frameshift mechanism is not necessarily universal for species with DNA polymerase III but appears conserved in the extreme thermophile Thermus thermophilus. [DNA metabolism, DNA replication, recombination, and repair]	355
131451	TIGR02398	gluc_glyc_Psyn	glucosylglycerol-phosphate synthase. Glucosylglycerol-phosphate synthase catalyzes the key step in the biosynthesis of the osmolyte glucosylglycerol. It is known in several cyanobacteria and in Pseudomonas anguilliseptica. The enzyme is closely related to the alpha,alpha-trehalose-phosphate synthase, likewise involved in osmolyte biosynthesis, of E. coli and many other bacteria. A close homolog from Xanthomonas campestris is excluded from this model and scores between trusted and noise.	487
131452	TIGR02399	salt_tol_Pase	glucosylglycerol 3-phosphatase. Proteins in this family are glucosylglycerol-phosphate phosphatase, with the gene symbol stpA (Salt Tolerance Protein A). A motif characteristic of acid phosphatases is found, but otherwise this family shows little sequence similarity to other phosphatases. This enzyme acts on the glucosylglycerol phosphate, product of glucosylglycerol phosphate synthase and immediate precursor of the osmoprotectant glucosylglycerol.	389
274112	TIGR02400	trehalose_OtsA	alpha,alpha-trehalose-phosphate synthase [UDP-forming]. This enzyme catalyzes the key, penultimate step in biosynthesis of trehalose, a compatible solute made as an osmoprotectant in some species in all three domains of life. The gene symbol OtsA stands for osmotically regulated trehalose synthesis A. Trehalose helps protect against both osmotic and thermal stresses, and is made from two glucose subunits. This model excludes glucosylglycerol-phosphate synthase, an enzyme of an analogous osmoprotectant system in many cyanobacterial strains. This model does not identify archaeal examples, as they are more divergent than glucosylglycerol-phosphate synthase. Sequences that score in the gray zone between the trusted and noise cutoffs include a number of yeast multidomain proteins in which the N-terminal domain may be functionally equivalent to this family. The gray zone also includes the OtsA of Corynebacterium glutamicum (and related species), shown to be responsible for synthesis of only trace amounts of trehalose while the majority is synthesized by the TreYZ pathway; the significance of OtsA in this species is unclear (see Wolf, et al.). [Cellular processes, Adaptations to atypical conditions]	456
274113	TIGR02401	trehalose_TreY	malto-oligosyltrehalose synthase. This enzyme, formally named (1->4)-alpha-D-glucan 1-alpha-D-glucosylmutase, is the TreY enzyme of the TreYZ pathway of trehalose biosynthesis, an alternative to the OtsAB pathway. Trehalose may be incorporated into more complex compounds but is best known as compatible solute. It is one of the most effective osmoprotectants, and unlike the various betaines does not require nitrogen for its synthesis. [Energy metabolism, Biosynthesis and degradation of polysaccharides]	825
274114	TIGR02402	trehalose_TreZ	malto-oligosyltrehalose trehalohydrolase. Members of this family are the trehalose biosynthetic enzyme malto-oligosyltrehalose trehalohydrolase, formally known as 4-alpha-D-{(1->4)-alpha-D-glucano}trehalose trehalohydrolase (EC 3.2.1.141). It is the TreZ protein of the TreYZ pathway for trehalose biosynthesis, an alternative to the OtsAB system. [Energy metabolism, Biosynthesis and degradation of polysaccharides]	544
274115	TIGR02403	trehalose_treC	alpha,alpha-phosphotrehalase. Trehalose is a glucose disaccharide that serves in many biological systems as a compatible solute for protection against hyperosmotic and thermal stress. This family describes trehalose-6-phosphate hydrolase, product of the treC (or treA) gene, which is often found together with a trehalose uptake transporter and a trehalose operon repressor.	543
274116	TIGR02404	trehalos_R_Bsub	trehalose operon repressor, B. subtilis-type. This family consists of repressors of the GntR family typically associated with trehalose utilization operons. Trehalose is imported as trehalose-6-phosphate and then hydrolyzed by alpha,alpha-phosphotrehalase to glucose and glucose-6-P. This family includes repressors mostly from Gram-positive lineages and does not include the TreR from E. coli. [Regulatory functions, DNA interactions]	233
131458	TIGR02405	trehalos_R_Ecol	trehalose operon repressor, proteobacterial. This family consists of repressors of the LacI family typically associated with trehalose utilization operons. Trehalose is imported as trehalose-6-phosphate and then hydrolyzed by alpha,alpha-phosphotrehalase to glucose and glucose-6-P. This family includes repressors mostly from Gammaproteobacteria and does not include the GntR family TreR of Bacillus subtilis. [Regulatory functions, DNA interactions]	311
131459	TIGR02406	ectoine_EctA	diaminobutyrate acetyltransferase. This enzyme family is the EctA of ectoine biosynthesis. Ectoine is a compatible solute, analogous to trehalose, betaines, etc., found often in halotolerant organisms. EctA is L-2,4-diaminobutyric acid acetyltransferase, also called DABA acetyltransferase. [Cellular processes, Adaptations to atypical conditions]	157
274117	TIGR02407	ectoine_ectB	diaminobutyrate--2-oxoglutarate aminotransferase. Members of this family of class III pyridoxal-phosphate-dependent aminotransferases are diaminobutyrate--2-oxoglutarate aminotransferase (EC 2.6.1.76) that catalyze the first step in ectoine biosynthesis from L-aspartate beta-semialdehyde. This family is readily separated phylogenetically from enzymes with the same substrate and product but involved in other process such as siderophore (SP|Q9Z3R2) or 1,3-diaminopropane (SP|P44951) biosynthesis. The family TIGR00709 previously included both groups but has now been revised to exclude the ectoine biosynthesis proteins of this family. Ectoine is a compatible solute particularly effective in conferring salt tolerance. [Cellular processes, Adaptations to atypical conditions]	412
131461	TIGR02408	ectoine_ThpD	ectoine hydroxylase. Both ectoine and hydroxyectoine are compatible solutes that serve as protectants against osmotic and thermal stresses. A number of genomes synthesize ectoine. This enzyme allows conversion of ectoine to hydroxyectoine, which may be more effective for some purposes, and is found in a subset of ectoine-producing organisms.	277
274118	TIGR02409	carnitine_bodg	gamma-butyrobetaine hydroxylase. Members of this protein family are gamma-butyrobetaine hydroxylase, both bacterial and eukaryotic. This enzyme catalyzes the last step in the conversion of lysine to carnitine. Carnitine can serve as a compatible solute in bacteria and also participates in fatty acid metabolism.	366
274119	TIGR02410	carnitine_TMLD	trimethyllysine dioxygenase. Members of this family with known function act as trimethyllysine dioxygenase, an enzyme in the pathway for carnitine biosynthesis from lysine. This enzyme is homologous to gamma-butyrobetaine,2-oxoglutarate dioxygenase, which catalyzes the last step in carnitine biosynthesis. Members of this family appear to be eukaryotic only.	362
274120	TIGR02411	leuko_A4_hydro	leukotriene A-4 hydrolase/aminopeptidase. Members of this family represent a distinctive subset within the zinc metallopeptidase family M1 (pfam01433). The majority of the members of pfam01433 are aminopeptidases, but the sequences in this family for which the function is known are leukotriene A-4 hydrolase. A dual epoxide hydrolase and aminopeptidase activity at the same active site is indicated. The physiological substrate for aminopeptidase activity is not known.	602
274121	TIGR02412	pepN_strep_liv	aminopeptidase N, Streptomyces lividans type. This family is a subset of the members of the zinc metallopeptidase family M1 (pfam01433), with a single member characterized in Streptomyces lividans 66 and designated aminopeptidase N. The spectrum of activity may differ somewhat from the aminopeptidase N clade of E. coli and most other Proteobacteria, well separated phylogenetically within the M1 family. The M1 family also includes leukotriene A-4 hydrolase/aminopeptidase (with a bifunctional active site).	831
131466	TIGR02413	Bac_small_yrzI	Bacillus tandem small hypothetical protein. Members of this family are very small proteins, about 47 residues each, in the genus Bacillus. Single members are found in Bacillus subtilis and Bacillus halodurans, but arrays of six in tandem in Bacillus cereus and Bacillus anthracis. An EIxxE motif present in most members of this family resembles cleavage sites by the germination protease GPR in a number of small, acid-soluble spore proteins (SASP). A role in sporulation is possible.	46
274122	TIGR02414	pepN_proteo	aminopeptidase N, Escherichia coli type. The M1 family of zinc metallopeptidases contains a number of distinct, well-separated clades of proteins with aminopeptidase activity. Several are designated aminopeptidase N, EC 3.4.11.2, after the Escherichia coli enzyme, suggesting a similar activity profile (see SP|P04825 for a description of catalytic activity). This family consists of all aminopeptidases closely related to E. coli PepN and presumed to have similar (not identical) function. Nearly all are found in Proteobacteria, but members are found also in Cyanobacteria, plants, and apicomplexan parasites. This family differs greatly in sequence from the family of aminopeptidases typified by Streptomyces lividans PepN (TIGR02412), from the membrane bound aminopeptidase N family in animals, etc. [Protein fate, Degradation of proteins, peptides, and glycopeptides]	863
131468	TIGR02415	23BDH	acetoin reductases. One member of this family, as characterized in Klebsiella terrigena, is described as able to interconvert acetoin + NADH with meso-2,3-butanediol + NAD(+). It is also described as capable of irreversible reduction of diacetyl with NADH to acetoin. Blomqvist, et al. decline to specify either EC 1.1.1.4 which is (R,R)-butanediol dehydrogenase, or EC 1.1.1.5, which is acetoin dehydrogenase without a specified stereochemistry, for this enzyme. This enzyme is a homotetramer in the family of short chain dehydrogenases (pfam00106). Another member of this family, from Corynebacterium glutamicum, is called L-2,3-butanediol dehydrogenase. [Energy metabolism, Fermentation]	254
131469	TIGR02416	CO_dehy_Mo_lg	carbon-monoxide dehydrogenase, large subunit. This model represents the large subunits of a group of carbon-monoxide dehydrogenases that include molybdenum as part of the enzymatic cofactor. There are various forms of carbon-monoxide dehydrogenase; Silicibacter pomeroyi DSS-3, for example, has two forms. Note that, at least in some species, the active site Cys is modified with a selenium attached to (rather than replacing) the sulfur atom. This is termed selanylcysteine, and created post-translationally, in contrast to selenocysteine incorporation during translation as for many other selenoproteins. [Energy metabolism, Other]	770
131470	TIGR02417	fruct_sucro_rep	D-fructose-responsive transcription factor. Members of this family belong to the lacI helix-turn-helix family (pfam00356) of DNA-binding transcriptional regulators. All members are from the proteobacteria. Characterized members act as positive and negative transcriptional regulators of fructose and sucrose transport and metabolism. Sucrose is a disaccharide composed of fructose and glucose; D-fructose-1-phosphate rather than an intact sucrose moiety has been shown to act as the inducer. [Regulatory functions, DNA interactions]	327
131471	TIGR02418	acolac_catab	acetolactate synthase, catabolic. Acetolactate synthase (EC 2.2.1.6) combines two molecules of pyruvate to yield 2-acetolactate with the release of CO2. This reaction may be involved in either valine biosynthesis (biosynthetic) or conversion of pyruvate to acetoin and possibly to 2,3-butanediol (catabolic). The biosynthetic type, described by TIGR00118, is also capable of forming acetohydroxybutyrate from pyruvate and 2-oxobutyrate for isoleucine biosynthesis. The family described here, part of the same larger family of thiamine pyrophosphate-dependent enzymes (pfam00205, pfam02776), is the catabolic form, generally found in species with acetolactate decarboxylase and usually found in the same operon. The model may not encompass all catabolic acetolactate synthases, but rather one particular clade in the larger TPP-dependent enzyme family. [Energy metabolism, Fermentation]	539
274123	TIGR02419	C4_traR_proteo	phage/conjugal plasmid C-4 type zinc finger protein, TraR family. Members of this family are putative C4-type zinc finger proteins found almost exclusively in prophage regions, actual phage, or conjugal transfer regions of the Proteobacteria. This small protein (about 70 amino acids) appears homologous to but is smaller than DksA (DnaK suppressor protein), found to be critical for regulating transcription of ribosomal RNA. [Mobile and extrachromosomal element functions, Prophage functions]	63
274124	TIGR02420	dksA	RNA polymerase-binding protein DksA. The model that is the basis for this family describes a small, pleiotropic protein, DksA (DnaK suppressor A), originally named as a multicopy suppressor of temperature sensitivity of dnaKJ mutants. DksA mutants are defective in quorum sensing, virulence, etc. DksA is now understood to bind RNA polymerase directly and modulate its response to small molecules to control the level of transcription of rRNA. Nearly all members of this family are in the Proteobacteria. Whether the closest homologs outside the Proteobacteria function equivalently is unknown. The low value set for the noise cutoff allows identification of possible DksA proteins from outside the proteobacteria. TIGR02419 describes a closely related family of short sequences usually found in prophage regions of proteobacterial genomes or in known phage. [Transcription, Transcription factors, Regulatory functions, Small molecule interactions]	110
274125	TIGR02421	QEGLA	conserved hypothetical protein. Members of this family include a possible metal-binding motif HEXXXH and, nearby, a perfectly conserved motif QEGLA. All members belong to the Proteobacteria, including Agrobacterium tumefaciens and several species of Vibrio and Pseudomonas, and are found in only one copy per chromosome (Vibrio vulnificus, with two chromosomes, has two). The function is unknown.	366
131475	TIGR02422	protocat_beta	protocatechuate 3,4-dioxygenase, beta subunit. This model represents the beta chain of protocatechuate 3,4-dioxygenase. The most closely related family outside this family is that of the alpha chain (TIGR02423), typically encoded in an adjacent locus. This enzyme acts in the degradation of aromatic compounds by way of p-hydroxybenzoate to succinate and acetyl-CoA. [Energy metabolism, Other]	220
274126	TIGR02423	protocat_alph	protocatechuate 3,4-dioxygenase, alpha subunit. This model represents the alpha chain of protocatechuate 3,4-dioxygenase. The most closely related family outside this family is that of the beta chain (TIGR02422), typically encoded in an adjacent locus. This enzyme acts in the degradation of aromatic compounds by way of p-hydroxybenzoate to succinate and acetyl-CoA. [Energy metabolism, Other]	193
274127	TIGR02424	TF_pcaQ	pca operon transcription factor PcaQ. Members of this family are LysR-family transcription factors associated with operons for catabolism of protocatechuate. Members occur only in Proteobacteria. [Energy metabolism, Other, Regulatory functions, DNA interactions]	300
131478	TIGR02425	decarb_PcaC	4-carboxymuconolactone decarboxylase. Members of this family are 4-carboxymuconolactone decarboxylase, which catalyzes the third step in the catabolism of protocatechuate (and therefore the fourth step in the catabolism of para-hydroxybenzoate, of 3-hydroxybenzoate, of vanillate, etc.). Most members of this family are encoded within protocatechuate catabolism operons. This protein is sometimes found as a fusion protein with other enzymes of the pathway, as in Rhodococcus opacus, Streptomyces avermitilis, and Caulobacter crescentus. [Energy metabolism, Other]	123
274128	TIGR02426	protocat_pcaB	3-carboxy-cis,cis-muconate cycloisomerase. Members of this family are 3-carboxy-cis,cis-muconate cycloisomerase, the enzyme that catalyzes the second step in the protocatechuate degradation to beta-ketoadipate and then to succinyl-CoA and acetyl-CoA. 4-hydroxybenzoate, 3-hydroxybenzoate, and vanillate all can be converted in one step to protocatechuate. All members of the seed alignment for this model were chosen from within protocatechuate degradation operons of at least three genes of the pathway, from genomes with the complete pathway through beta-ketoadipate. [Energy metabolism, Other]	338
131480	TIGR02427	protocat_pcaD	3-oxoadipate enol-lactonase. Members of this family are 3-oxoadipate enol-lactonase. Note that the substrate is known as 3-oxoadipate enol-lactone, 2-oxo-2,3-dihydrofuran-5-acetate, 4,5-Dihydro-5-oxofuran-2-acetate, and 5-oxo-4,5-dihydrofuran-2-acetate. The enzyme catalyzes the fourth step in the protocatechuate degradation to beta-ketoadipate and then to succinyl-CoA and acetyl-CoA. 4-hydroxybenzoate, 3-hydroxybenzoate, and vanillate all can be converted in one step to protocatechuate. This enzyme also acts in catechol degradation. In genomes that catabolize both catechol and protocatechuate, two forms of this enzyme may be found. All members of the seed alignment for this model were chosen from within protocatechuate degradation operons of at least three genes of the pathway, from genomes with the complete pathway through beta-ketoadipate. [Energy metabolism, Other]	251
188219	TIGR02428	pcaJ_scoB_fam	3-oxoacid CoA-transferase, B subunit. Various members of this family are characterized as the B subunits of succinyl-CoA:3-ketoacid-CoA transferase (EC 2.8.3.5), beta-ketoadipate:succinyl-CoA transferase (EC 2.8.3.6), acetyl-CoA:acetoacetate CoA transferase (EC 2.8.3.8), and butyrate-acetoacetate CoA-transferase (EC 2.8.3.9). This represents a very distinct clade with strong sequence conservation within the larger family defined by pfam01144. The A subunit represents a different clade in pfam01144.	207
131482	TIGR02429	pcaI_scoA_fam	3-oxoacid CoA-transferase, A subunit. Various members of this family are characterized as the A subunits of succinyl-CoA:3-ketoacid-CoA transferase (EC 2.8.3.5), beta-ketoadipate:succinyl-CoA transferase (EC 2.8.3.6), acetyl-CoA:acetoacetate CoA transferase (EC 2.8.3.8), and butyrate-acetoacetate CoA-transferase (EC 2.8.3.9). This represents a very distinct clade with strong sequence conservation within the larger family defined by pfam01144. The B subunit represents a different clade in pfam01144, described by TIGR02428. The two are found in general as tandem genes and occasionally as a fusion.	222
131483	TIGR02430	pcaF	3-oxoadipyl-CoA thiolase. Members of this family are designated beta-ketoadipyl CoA thiolase, an enzyme that acts at the end of pathways for the degradation of protocatechuate (from benzoate and related compounds) and of phenylacetic acid.	400
131484	TIGR02431	pcaR_pcaU	beta-ketoadipate pathway transcriptional regulators, PcaR/PcaU/PobR family. Members of this family are IclR-type transcriptional regulators with similar DNA binding sites, able to bind at least three different metabolites related to protocatechuate metabolism. Beta-ketoadipate is the inducer for PcaR, p-hydroxybenzoate for PobR, and protocatechuate for PcaU. [Regulatory functions, DNA interactions]	248
274129	TIGR02432	lysidine_TilS_N	tRNA(Ile)-lysidine synthetase, N-terminal domain. The only examples in which the wobble position of a tRNA must discriminate between G and A of mRNA are AUA (Ile) vs. AUG (Met) and UGA (stop) vs. UGG (Trp). In all bacteria, the wobble position of the tRNA(Ile) recognizing AUA is lysidine, a lysine derivative of cytidine. This family describes a protein domain found, apparently, in all bacteria in a single copy. Eukaryotic sequences appear to be organellar. The domain architecture of this protein family is variable; some, including characterized proteins of E. coli and B. subtilis known to be tRNA(Ile)-lysidine synthetase, include a conserved 50-residue domain that many other members lack. This protein belongs to the ATP-binding PP-loop family (pfam01171). It appears in the literature and protein databases as TilS, YacA, and putative cell cycle protein MesJ (a misnomer). [Protein synthesis, tRNA and rRNA base modification]	189
274130	TIGR02433	lysidine_TilS_C	tRNA(Ile)-lysidine synthetase, C-terminal domain. TIGRFAMs model TIGR02432 describes the family of the N-terminal domain of tRNA(Ile)-lysidine synthetase. This family (TIGR02433) describes a small C-terminal domain of about 50 residues present in about half the members of family TIGR02432, and in no other protein. Characterized examples of tRNA(Ile)-lysidine synthetase from E. coli and Bacillus subtilis both contain this domain. [Protein synthesis, tRNA and rRNA base modification]	47
131487	TIGR02434	CobF	precorrin-6A synthase (deacetylating). In the aerobic cobalamin biosynthesis pathway, four enzymes are involved in the conversion of precorrin-3A to precorrin-6A. The first of the four steps is carried out by EC 1.14.13.83, precorrin-3B synthase (CobG), yielding precorrin-3B as the product. This is followed by three methylation reactions, which introduce a methyl group at C-17 (CobJ; EC 2.1.1.131), C-11 (CobM; EC 2.1.1.133) and C-1 (CobF; EC 2.1.1.152) of the macrocycle, giving rise to precorrin-4, precorrin-5 and precorrin-6A, respectively. This model identifies CobF in High GC gram positive, alphaproteobacteria and pseudomonas-related species.	249
274131	TIGR02435	CobG	precorrin-3B synthase. An iron-sulfur protein. An oxygen atom from dioxygen is incorporated into the macrocycle at C-20. In the aerobic cobalamin biosynthesis pathway, four enzymes are involved in the conversion of precorrin-3A to precorrin-6A. The first of the four steps is carried out by EC 1.14.13.83, precorrin-3B synthase (CobG), yielding precorrin-3B as the product. This is followed by three methylation reactions, which introduce a methyl group at C-17 (CobJ; EC 2.1.1.131), C-11 (CobM; EC 2.1.1.133) and C-1 (CobF; EC 2.1.1.152) of the macrocycle, giving rise to precorrin-4, precorrin-5 and precorrin-6A, respectively. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	390
274132	TIGR02436	TIGR02436	four helix bundle protein. This family describes a protein of unknown function whose structure is a bundle of four long alpha helices. Some of the first members of this family were found encoded in the (atypically large) intervening sequence (IVS) of Leptospira 23S RNA, a region often present in the rRNA gene and removed during rRNA processing without re-ligation. However, this location is not conserved, and naming this protein as a 23S RNA protein is both confusing and inaccurate.	108
131490	TIGR02437	FadB	fatty oxidation complex, alpha subunit FadB. Members represent alpha subunit of multifunctional enzyme complex of the fatty acid degradation cycle. Activities include: enoyl-CoA hydratase (EC 4.2.1.17), dodecenoyl-CoA delta-isomerase activity (EC 5.3.3.8), 3-hydroxyacyl-CoA dehydrogenase (EC 1.1.1.35), 3-hydroxybutyryl-CoA epimerase (EC 5.1.2.3). A representative is E. coli FadB (SP:P21177). This model excludes the FadJ family represented by SP:P77399. [Fatty acid and phospholipid metabolism, Degradation]	714
274133	TIGR02438	catachol_actin	catechol 1,2-dioxygenase, Actinobacterial. Members of this family are catechol 1,2-dioxygenases of the Actinobacteria. They are more closely related to actinobacterial chlorocatechol 1,2-dioxygenases than to proteobacterial catechol 1,2-dioxygenases, and so are built in this separate model. The member from Rhodococcus rhodochrous NCIMB 13259 (GB|AAC33003.1) is described as a homodimer with bound Fe, similarly active on catechol, 3-methylcatechol and 4-methylcatechol.	281
274134	TIGR02439	catechol_proteo	catechol 1,2-dioxygenase, proteobacterial. Members of this family known so far are catechol 1,2-dioxygenases of the Proteobacteria. They are distinct from catechol 1,2-dioxygenases and chlorocatechol 1,2-dioxygenases of the Actinobacteria, which are quite similar to each other and resolved by separate models. This enzyme catalyzes intradiol cleavage in which catechol + O2 becomes cis,cis-muconate. Catechol is an intermediate in the catabolism of many different aromatic compounds, as is the alternative intermediate protocatechuate. In Acinetobacter lwoffii, two isozymes are present with abilities, differing somewhat, to act on catechol analogs 3-methylcatechol, 4-methylcatechol, 4-methoxycatechol, and 4-chlorocatechol. [Energy metabolism, Other]	285
131493	TIGR02440	FadJ	fatty oxidation complex, alpha subunit FadJ. Members represent alpha subunit of multifunctional enzyme complex of the fatty acid degradation cycle. Plays a minor role in aerobic beta-oxidation of fatty acids. FadJI complex is necessary for anaerobic growth on short-chain acids with nitrate as an electron acceptor. Activities include: enoyl-CoA hydratase (EC 4.2.1.17), 3-hydroxyacyl-CoA dehydrogenase (EC 1.1.1.35), 3-hydroxybutyryl-CoA epimerase (EC 5.1.2.3). A representative is E. coli FadJ (aka YfcX) (SP:P77399). This model excludes the FadB of TIGR02437 equivalog model. [Fatty acid and phospholipid metabolism, Degradation]	699
131494	TIGR02441	fa_ox_alpha_mit	fatty acid oxidation complex, alpha subunit, mitochondrial. Members represent alpha subunit of mitochondrial multifunctional fatty acid degradation enzyme complex. Subunit activities include: enoyl-CoA hydratase (EC 4.2.1.17) & 3-hydroxyacyl-CoA dehydrogenase (EC 1.1.1.35). Some characterization in human (SP:P40939), pig (SP:Q29554), and rat (SP:Q64428). The beta subunit has activity: acetyl-CoA C-acyltransferase (EC 2.3.1.16).	737
274135	TIGR02442	Cob-chelat-sub	cobaltochelatase subunit. Cobaltochelatase is responsible for the insertion of cobalt into the corrin ring of coenzyme B12 during its biosynthesis. Two versions have been well described. CbiK/CbiX is a monomeric, anaerobic version which acts early in the biosynthesis (pfam06180). CobNST is a trimeric, ATP-dependent, aerobic version which acts late in the biosynthesis (TIGR02257/TIGR01650/TIGR01651). A number of genomes (actinobacteria, cyanobacteria, betaproteobacteria and pseudomonads) which apparently biosynthesize B12, encode a cobN gene but are demonstrably lacking cobS and cobT. These genomes do, however contain a homolog (modelled here) of the magnesium chelatase subunits BchI/BchD family. Aside from the cyanobacteria (which have a separate magnesium chelatase trimer), these species do not make chlorins, so do not have any use for a magnesium chelatase. Furthermore, in nearly all cases the members of this family are proximal to either CobN itself or other genes involved in cobalt transport or B12 biosynthesis.	633
131496	TIGR02443	TIGR02443	conserved hypothetical metal-binding protein. Members of this family are small proteins, about 70 residues in length, with a basic triplet near the N-terminus and a probable metal-binding motif CPXCX(18)CXXC. Members are found in various Proteobacteria.	59
274136	TIGR02444	TIGR02444	TIGR02444 family protein. Members of this family are bacterial hypothetical proteins, about 160 amino acids in length, found in various Proteobacteria, including members of the genera Pseudomonas and Vibrio. The C-terminal region is poorly conserved and is not included in the model. [Hypothetical proteins, Conserved]	116
131498	TIGR02445	fadA	fatty oxidation complex, beta subunit FadA. This subunit of the FadBA complex has acetyl-CoA C-acyltransferase (EC 2.3.1.16) activity, and is also known as beta-ketothiolase and fatty oxidation complex, beta subunit. This protein is almost always located adjacent to FadB (TIGR02437). The FadBA complex is the major complex active for beta-oxidation of fatty acids in E. coli. [Fatty acid and phospholipid metabolism, Degradation]	385
131499	TIGR02446	FadI	fatty oxidation complex, beta subunit FadI. This subunit of the FadJI complex has acetyl-CoA C-acyltransferase (EC 2.3.1.16) activity, and is also known as beta-ketothiolase and fatty oxidation complex, beta subunit, and YfcY. This protein is almost always located adjacent to FadJ (TIGR02440). The FadJI complex is needed for anaerobic beta-oxidation of short-chain fatty acids in E. coli. [Fatty acid and phospholipid metabolism, Degradation]	430
131500	TIGR02447	yiiD_Cterm	thioesterase domain, putative. This family consists of a broadly distributed uncharacterized domain found often as a standalone protein. The member from Shewanella oneidensis, PDB|1T82_A (Forouhar, et al., unpublished) is described from crystallography work as a putative thioesterase. About half of the members of this family are fused to an Acetyltransf_1 domain (pfam00583). The function of this protein is unknown.	138
274137	TIGR02448	TIGR02448	conserved hypothetical protein. This family consists of small hypothetical proteins, about 100 amino acids in length. The family includes five members (three in tandem) in Pseudomonas aeruginosa PAO1, and also in Pseudomonas putida KT2440, four in Pseudomonas syringae DC3000, and single members in several other Proteobacteria. The function is unknown.	101
131502	TIGR02449	TIGR02449	TIGR02449 family protein. Members of this family are small proteins, typically 73 amino acids in length, with single copies in each of several Proteobacteria, including Xylella fastidiosa, Pseudomonas aeruginosa, and Xanthomonas campestris. The function is unknown.	65
131503	TIGR02450	TIGR02450	tryptophan-rich conserved hypothetical protein. Members of this family are small hypothetical proteins of 60 to 100 residues from Cyanobacteria and some Proteobacteria. Prochlorococcus marinus strains have two members, other species one only. Interestingly, of the eight most conserved residues, four are aromatic and three are invariant tryptophans. It appears all species that encode this protein can synthesize tryptophan de novo.	61
131504	TIGR02451	anti_sig_ChrR	anti-sigma factor, putative, ChrR family. The member of this family from Rhodobacter sphaeroides has been shown both to form a complex with sigma(E) and to negatively regulate tetrapyrrole biosynthesis. This protein likely contains (at least) two distinct functional domains; several smaller homologs (excluded by the model) show homology only to the C-terminal, including a motif PxHxHxGxE. [Regulatory functions, Other]	215
131505	TIGR02452	TIGR02452	TIGR02452 family protein. Members of this uncharacterized protein family are found in Streptomyces, Nostoc sp. PCC 7120, Clostridium acetobutylicum, Lactobacillus johnsonii NCC 533, Deinococcus radiodurans, and Pirellula sp. for a broad but sparse phylogenetic distribution that at least suggests lateral gene transfer.	266
274138	TIGR02453	TIGR02453	TIGR02453 family protein. Members of this family are widely (though sparsely) distributed bacterial proteins about 230 residues in length. All members have a motif RxxRDxRFxxx[DN]KxxY. The function of this protein family is unknown. In several fungi, this model identifies a conserved region of a longer protein. Therefore, it may be incorrect to speculate that all members share a common function.	217
274139	TIGR02454	ECF_T_CbiQ	cobalt ECF transporter T component CbiQ. This model represents the CbiQ component of the cobalt-specific ECF-type transporter. CbiQ is now recognized as the T component of energy-coupling factor (ECF)-type transporters. The S component confers specificity (CbiM-N for cobalt systems), while CbiO is the ABC-family ATPase. In general, proteins found by this model reside next to the other putative subunits of the complex, identified as CbiN, CbiO, or CbiM. Note that the designation of cobalt transporter has been spread excessively among ECF system transporters with many other specificities. [Transport and binding proteins, Cations and iron carrying compounds]	198
131508	TIGR02455	TreS_stutzeri	trehalose synthase, Pseudomonas stutzeri type. Trehalose synthase catalyzes a one-step conversion of maltose to trehalose. This is an alternative to the OtsAB and TreYZ pathways. This family includes a characterized example from Pseudomonas stutzeri plus very closely related sequences from other Pseudomonads. Cutoff scores are set to find a more distantly related sequence from Desulfovibrio vulgaris, likely to be functionally equivalent, between trusted and noise limits. [Energy metabolism, Biosynthesis and degradation of polysaccharides, Cellular processes, Adaptations to atypical conditions]	688
274140	TIGR02456	treS_nterm	trehalose synthase. Trehalose synthase interconverts maltose and alpha, alpha-trehalose by transglucosylation. This is one of at least three mechanisms for biosynthesis of trehalose, an important and widespread compatible solute. However, it is not driven by phosphate activation of sugars and its physiological role may tend toward trehalose degradation. This view is accentuated by numerous examples of fusion to a probable maltokinase domain. The sequence region described by this model is found both as the whole of a trehalose synthase and as the N-terminal region of a larger fusion protein that includes trehalose synthase activity. Several of these fused trehalose synthases have a domain homologous to proteins with maltokinase activity from Actinoplanes missouriensis and Streptomyces coelicolor. [Energy metabolism, Biosynthesis and degradation of polysaccharides]	539
274141	TIGR02457	TreS_Cterm	trehalose synthase-fused probable maltokinase. Three pathways exist for the biosynthesis of trehalose, an osmoprotectant that in some species is also a precursor of certain cell wall glycolipids. Trehalose synthase, TreS, can interconvert maltose and trehalose, but while the equilibrium may favor trehalose, physiological concentrations of trehalose may be much greater than that of maltose and TreS may act largely in its degradation. This model describes a domain found only as a C-terminal fusion to TreS proteins. The most closely related proteins outside this family, Pep2 of Streptomyces coelicolor and Mak1 of Actinoplanes missouriensis, have known maltokinase activity. We suggest this domain acts as a maltokinase and helps drive conversion of trehalose to maltose. [Energy metabolism, Biosynthesis and degradation of polysaccharides]	528
274142	TIGR02458	CbtA	cobalt transporter subunit CbtA (proposed). This model represents a family of proteins which have been proposed to act as cobalt transporters acting in concert with vitamin B12 biosynthesis systems. Evidence for this assignment includes 1) prediction of five trans-membrane segments, 2) positional gene linkage with known B12 biosynthesis genes, 3) upstream proximity of B12 transcriptional regulatory sites, 4) the absence of other known cobalt import systems and 5) the obligate co-localization with a small protein (CbtB) having a single additional trans-membrane segment and a C-terminal histidine-rich motif likely to be a metal-binding site.	225
131512	TIGR02459	CbtB	cobalt transporter subunit CbtB (proposed). This model represents a family of proteins which have been proposed to act as cobalt transporters acting in concert with vitamin B12 biosynthesis systems. Evidence for this assignment includes 1) prediction of a single trans-membrane segment and a C-terminal histidine-rich motif likely to be a metal-binding site, 2) positional gene linkage with known B12 biosynthesis genes, 3) upstream proximity of B12 transcriptional regulatory sites, 4) the absence of other known cobalt import systems and 5) the obligate co-localization with a protein (CbtA) predicted to have five additional trans-membrane segments.	60
162866	TIGR02460	osmo_MPGsynth	mannosyl-3-phosphoglycerate synthase. This family consists of examples of mannosyl-3-phosphoglycerate synthase (MPGS), which together with mannosyl-3-phosphoglycerate phosphatase (MPGP) comprises a two-step pathway for mannosylglycerate biosynthesis. Mannosylglycerate is a compatible solute that tends to be restricted to extreme thermophiles of archaea and bacteria. Note that in Rhodothermus marinus, this pathway is one of two; the other is condensation of GDP-mannose with D-glycerate by mannosylglycerate synthase.	381
131514	TIGR02461	osmo_MPG_phos	mannosyl-3-phosphoglycerate phosphatase. Members of this family are mannosyl-3-phosphoglycerate phosphatase (EC 3.1.3.70). It acts sequentially after mannosyl-3-phosphoglycerate synthase (EC 2.4.1.217) in a two-step pathway of biosynthesis of the compatible solute mannosylglycerate, a typical osmolyte of thermophiles.	225
274143	TIGR02462	pyranose_ox	pyranose oxidase. Pyranose oxidase (also called glucose 2-oxidase) converts D-glucose and molecular oxygen to 2-dehydro-D-glucose and hydrogen peroxide. Peroxide production is believed to be important to the wood rot fungi in which this enzyme is found for lignin degradation.	547
131516	TIGR02463	MPGP_rel	mannosyl-3-phosphoglycerate phosphatase-related protein. This family consists of members of the HAD superfamily, subfamily IIB. All members are closely related to mannosyl-3-phosphoglycerate phosphatase, the second enzyme in a two-step pathway for biosynthesis of mannosylglycerate, a compatible solute present in some thermophiles and in Dehalococcoides ethenogenes. However, members of this family are separable in a neighbor-joining tree constructed from a multiple sequence alignment and are found only in mesophiles that lack the companion mannosyl-3-phosphoglycerate synthase (TIGR02460). Members of this family are likely to act on a compound related to yet distinct from mannosyl-3-phosphoglycerate. [Unknown function, General]	221
274144	TIGR02464	ribofla_fusion	conserved hypothetical protein, ribA/ribD-fused. This model describes a sequence region that occurs in at least three different polypeptide contexts. It is found fused to GTP cyclohydrolase II, the RibA of riboflavin biosynthesis (TIGR00505), as in Vibrio vulnificus. It is found fused to riboflavin biosynthesis protein RibD (TIGR00326) in rice and Arabidopsis. It occurs as a standalone protein in a number of bacterial species in varied contexts, including single gene operons and bacteriophage genomes. The member from E. coli currently is named YbiA. The function(s) of members of this family is unknown.	153
131518	TIGR02465	chlorocat_1_2	chlorocatechol 1,2-dioxygenase. Members of this protein family are chlorocatechol 1,2-dioxygenase. This protein is closely related to catechol 1,2-dioxygenase, TIGR02439, EC 1.13.11.1. Note that annotated database entries have appeared for the present protein family with the EC number that refers to that of family TIGR02439. This protein acts in pathways of the biodegradation of chlorinated aromatic compounds.	246
274145	TIGR02466	TIGR02466	conserved hypothetical protein. This family consists of uncharacterized proteins in Caulobacter crescentus CB15, Bdellovibrio bacteriovorus HD100, Synechococcus sp. WH 8102, Silicibacter pomeroyi DSS-3, and Hyphomonas neptunium ATCC 15444. The context of nearby genes differs substantially between members and does not point to any specific biological role. [Hypothetical proteins, Conserved]	201
274146	TIGR02467	CbiE	precorrin-6y C5,15-methyltransferase (decarboxylating), CbiE subunit. This model recognizes the CbiE methylase which is responsible, in part (along with CbiT), for methylating precorrin-6y (or cobalt-precorrin-6y) at both the 5 and 15 positions as well as the concomitant decarboxylation at C-12. In many organisms, this protein is fused to the CbiT subunit. The fused protein, when found in organisms catalyzing the oxidative version of the cobalamin biosynthesis pathway, is called CobL.	204
274147	TIGR02468	sucrsPsyn_pln	sucrose phosphate synthase/possible sucrose phosphate phosphatase, plant. Members of this family are sucrose-phosphate synthases of plants. This enzyme is known to exist in multigene families in several species of both monocots and dicots. The N-terminal domain is the glucosyltransferase domain. Members of this family also have a variable linker region and a C-terminal domain that resembles sucrose phosphate phosphatase (SPP) (EC 3.1.3.24) (see TIGR01485), the next and final enzyme of sucrose biosynthesis. The SPP-like domain likely serves a binding and not a catalytic function, as the reported SPP is always encoded by a distinct protein.	1050
274148	TIGR02469	CbiT	precorrin-6Y C5,15-methyltransferase (decarboxylating), CbiT subunit. This model recognizes the CbiT methylase which is responsible, in part (along with CbiE), for methylating precorrin-6y (or cobalt-precorrin-6y) at both the 5 and 15 positions as well as the concomitant decarboxylation at C-12. In many organisms, this protein is fused to the CbiE subunit. The fused protein, when found in organisms catalyzing the oxidative version of the cobalamin biosynthesis pathway, is called CobL. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	124
274149	TIGR02470	sucr_synth	sucrose synthase. This model represents sucrose synthase, an enzyme that, despite its name, generally uses rather than produces sucrose. Sucrose plus UDP (or ADP) becomes D-fructose plus UDP-glucose (or ADP-glucose), which is then available for cell wall (or starch) biosynthesis. The enzyme is homologous to sucrose phosphate synthase, which catalyzes the penultimate step in sucrose synthesis. Sucrose synthase is found, so far, exclusively in plants and cyanobacteria. [Energy metabolism, Biosynthesis and degradation of polysaccharides]	784
131524	TIGR02471	sucr_syn_bact_C	sucrose-phosphate synthase, sucrose phosphatase-like domain, bacterial. Sucrose phosphate synthase (SPS) and sucrose phosphate phosphatase (SPP) are the last two enzymes of sucrose biosynthesis. In cyanobacteria and plants, the C-terminal region of most or all versions of SPS has a domain homologous to the known SPP. This domain may serve a binding or regulatory rather than catalytic function. Sequences in this family are bacterial C-terminal regions found in all but two of the putative bacterial sucrose phosphate synthases described by TIGR02472.	236
131525	TIGR02472	sucr_P_syn_N	sucrose-phosphate synthase, putative, glycosyltransferase domain. This family consists of the N-terminal regions, or in some cases the entirety, of bacterial proteins closely related to plant sucrose-phosphate synthases (SPS). The C-terminal domain (TIGR02471), found with most members of this family, resembles both bona fide plant sucrose-phosphate phosphatases (SPP) and the SPP-like domain of plant SPS. At least two members of this family lack the SPP-like domain, which may have binding or regulatory rather than enzymatic activity by analogy to plant SPS. This enzyme produces sucrose 6-phosphate and UDP from UDP-glucose and D-fructose 6-phosphate, and may be encoded near the gene for fructokinase.	439
131526	TIGR02473	flagell_FliJ	flagellar export protein FliJ. Members of this family are the FliJ protein found, in nearly every case, in the midst of other flagellar biosynthesis genes in bacterial genomes. Typically the fliJ gene is found adjacent to the gene for the flagellum-specific ATPase FliI. Sequences scoring in the gray zone between trusted and noise cutoffs include both probable FliJ proteins and components of bacterial type III secretion systems.	141
274150	TIGR02474	pec_lyase	pectate lyase, PelA/Pel-15E family. Members of this family are isozymes of pectate lyase (EC 4.2.2.2), also called polygalacturonic transeliminase and alpha-1,4-D-endopolygalacturonic acid lyase. [Energy metabolism, Biosynthesis and degradation of polysaccharides]	290
274151	TIGR02475	CobW	cobalamin biosynthesis protein CobW. The family of proteins identified by this model is generally found proximal to the trimeric cobaltochelatase subunit CobN which is essential for vitamin B12 (cobalamin) biosynthesis. The protein contains a P-loop nucleotide-binding loop in the N-terminal domain and a histidine-rich region in the C-terminal portion suggesting a role in metal binding, possibly as an intermediary between the cobalt transport and chelation systems. A broader CobW family is delineated by two Pfam models which identify the N- and C-terminal domains (pfam02492 and pfam07683). [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	341
162875	TIGR02476	BluB	5,6-dimethylbenzimidazole synthase. A previously published hypothesis that BluB, involved in cobalamin biosynthesis, is EC 1.16.8.1 (cob(II)yrinic acid a,c-diamide reductase) is now contradicted by newer work ascribing a role in 5,6-dimethylbenzimidazole (DMB) biosynthesis. The BluB protein is related to the nitroreductase family (pfam0881). [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	205
131530	TIGR02477	PFKA_PPi	diphosphate--fructose-6-phosphate 1-phosphotransferase. Diphosphate--fructose-6-phosphate 1-phosphotransferase catalyzes the addition of phosphate from diphosphate (PPi) to fructose 6-phosphate to give fructose 1,6-bisphosphate (EC 2.7.1.90). The enzyme is also known as pyrophosphate-dependent phosphofructokinase. The usage of PPi-dependent enzymes in glycolysis presumably frees up ATP for other processes. TIGR02482 represents the ATP-dependent 6-phosphofructokinase enzyme contained within pfam00365: Phosphofructokinase. This model hits primarily bacterial, plant alpha, and plant beta sequences. [Energy metabolism, Glycolysis/gluconeogenesis]	539
274152	TIGR02478	6PF1K_euk	6-phosphofructokinase, eukaryotic type. Members of this family are eukaryotic (with one exception) ATP-dependent 6-phosphofructokinases (EC 2.7.1.11) in which two tandem copies of the phosphofructokinase are found. Members are found, often including several isozymes, in animals and fungi and in the bacterium Propionibacterium acnes KPA171202 (a human skin commensal).	746
274153	TIGR02479	FliA_WhiG	RNA polymerase sigma factor, FliA/WhiG family. Most members of this family are the flagellar operon sigma factor FliA, controlling transcription of bacterial flagellar genes by RNA polymerase. An exception is the sigma factor WhiG in the genus Streptomyces, involved in the production of sporulating aerial mycelium.	224
131533	TIGR02480	fliN	flagellar motor switch protein FliN. Proteins that consist largely of the domain described by this model for this protein family can be designated flagellar motor switch protein FliN. Longer proteins in which this region is a C-terminal domain typically are designated FliY. More distantly related sequences, outside the scope of this family, are associated with type III secretion and include the surface presentation of antigens protein SpaO required for invasion of host cells by Salmonella enterica. [Cellular processes, Chemotaxis and motility]	77
274154	TIGR02481	hemeryth_dom	hemerythrin-like metal-binding domain. This model describes both members of the hemerythrin (TIGR00058) family of marine invertebrates and a broader collection of bacterial and archaeal homologs. Many of the latter group are multidomain proteins with signal-transducing domains such as the GGDEF diguanylate cyclase domain (TIGR00254, pfam00990) and methyl-accepting chemotaxis protein signaling domain (pfam00015). Most hemerythrins are oxygen-carriers with a bound non-heme iron, but at least one example is a cadmium-binding protein, apparently with a role in sequestering toxic metals rather than in binding oxygen. Patterns of conserved residues suggest that all prokaryotic instances of this domain bind iron or another heavy metal, but the exact function is unknown. Not surprisingly, the prokaryote with the most instances of this domain is Magnetococcus sp. MC-1, a magnetotactic bacterium.	126
213713	TIGR02482	PFKA_ATP	6-phosphofructokinase. 6-phosphofructokinase (EC 2.7.1.11) catalyzes the addition of phosphate from ATP to fructose 6-phosphate to give fructose 1,6-bisphosphate. This represents a key control step in glycolysis. This model hits bacterial ATP-dependent 6-phosphofructokinases which lack a beta-hairpin loop present in TIGR02483 family members. TIGR02483 contains members that are ATP-dependent as well as members that are pyrophosphate-dependent. TIGR02477 represents the pyrophosphate-dependent phosphofructokinase, diphosphate--fructose-6-phosphate 1-phosphotransferase (EC 2.7.1.90). [Energy metabolism, Glycolysis/gluconeogenesis]	301
274155	TIGR02483	PFK_mixed	phosphofructokinase. Members of this family that are characterized, save one, are phosphofructokinases dependent on pyrophosphate (EC 2.7.1.90) rather than ATP (EC 2.7.1.11). The exception is one of three phosphofructokinases from Streptomyces coelicolor. Family members are both bacterial and archaeal. [Energy metabolism, Glycolysis/gluconeogenesis]	324
274156	TIGR02484	CitB	CitB domain protein. This model identifies proteins of two distinct names which may or may not have two distinct functions. CitB has been identified in salmonella and E. coli as the signal transduction component of a two-component system for citrate in which CitA acts as a citrate transporter. CobZ is essential for cobalamin biosynthesis (by knockout of the R. capsulatus gene) and is complemented by the characterized precorrin 3B synthase CobG. The enzyme has been shown to contain flavin, heme and Fe-S cluster cofactors and is believed to require dioxygen as a substrate. This model identifies the C-terminal domain of the R. capsulatus CobZ, which, in most other species exists as a separate gene adjacent to CobZ.	372
274157	TIGR02485	CobZ_N-term	precorrin 3B synthase CobZ. CobZ is essential for cobalamin biosynthesis (by knockout of the R. capsulatus gene) and is complemented by the characterized precorrin 3B synthase CobG. The enzyme has been shown to contain flavin, heme and Fe-S cluster cofactors and is believed to require dioxygen as a substrate. This model identifies the N-terminal portion of the R. capsulatus gene which, in other species exists as a separate protein. The C-terminal portion is homologous to the 2-component signal transduction system protein CitB (TIGR02484). [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	432
274158	TIGR02486	RDH	reductive dehalogenase. This model represents a family of corrin and 8-iron Fe-S cluster-containing reductive dehalogenases found primarily in halorespiring microorganisms such as Dehalococcoides ethenogenes, which contains as many as 17 enzymes of this type with varying substrate ranges. One example of a characterized species is the tetrachloroethene reductive dehalogenase (1.97.1.8), which also acts on trichloroethene, converting it to dichloroethene.	314
274159	TIGR02487	NrdD	anaerobic ribonucleoside-triphosphate reductase. This model represents the oxygen-sensitive (anaerobic, class III) ribonucleotide reductase. The mechanism of the enzyme involves a glycine-centered radical, a C-terminal zinc binding site, and a set of conserved active site cysteines and asparagines. This enzyme requires an activating component, NrdG, a radical-SAM domain containing enzyme (TIGR02491). Together the two form an alpha-2/beta-2 heterodimer. [Purines, pyrimidines, nucleosides, and nucleotides, 2'-Deoxyribonucleotide metabolism]	579
131541	TIGR02488	flgG_G_neg	flagellar basal-body rod protein FlgG, Gram-negative bacteria. This family consists of the FlgG protein of the flagellar apparatus in the Proteobacteria and spirochetes. [Cellular processes, Chemotaxis and motility]	259
131542	TIGR02489	flgE_epsilon	flagellar hook protein FlgE, epsilon proteobacterial. Members of this family are flagellar hook proteins, designated FlgE, as found in the epsilon subdivision of the Proteobacteria (Helicobacter, Wolinella, and Campylobacter). These proteins differ significantly in architecture from proteins designated FlgE in other lineages; the N-terminal and C-terminal domains are homologous, but members of this family only contain a large central domain that is surface-exposed and variable between strains.	719
274160	TIGR02490	flgF	flagellar basal-body rod protein FlgF. Members of this protein family are FlgF, one of several homologous flagellar basal-body rod proteins in bacteria. [Cellular processes, Chemotaxis and motility]	89
274161	TIGR02491	NrdG	anaerobic ribonucleoside-triphosphate reductase activating protein. This enzyme is a member of the radical-SAM family (pfam04055) and utilizes S-adenosyl methionine, an iron-sulfur cluster and a reductant (dihydroflavodoxin) to produce a glycine-centered radical in the class III (anaerobic) ribonucleotide triphosphate reductase (NrdD, TIGR02487). The two components form an alpha-2/beta-2 heterodimer. [Purines, pyrimidines, nucleosides, and nucleotides, 2'-Deoxyribonucleotide metabolism, Protein fate, Protein modification and repair]	154
274162	TIGR02492	flgK_ends	flagellar hook-associated protein FlgK. The flagellar hook-associated protein FlgK of bacterial flagella has conserved N- and C-terminal domains. The central region is highly variable in length and sequence, and often contains substantial runs of low-complexity sequence. This model is built from an alignment of FlgK sequences with the central region excised. Note that several other proteins of the flagellar apparatus also are homologous in the N- and C-terminal regions to FlgK, but are excluded from this model. [Cellular processes, Chemotaxis and motility]	323
131546	TIGR02493	PFLA	pyruvate formate-lyase 1-activating enzyme. An iron-sulfur protein with a radical-SAM domain (pfam04055). A single glycine residue in EC 2.3.1.54, formate C-acetyltransferase (formate-pyruvate lyase), is oxidized to the corresponding radical by transfer of H from its CH2 to AdoMet with concomitant cleavage of the latter. The reaction requires Fe2+. The first stage is reduction of the AdoMet to give methionine and the 5'-deoxyadenosin-5-yl radical, which then abstracts a hydrogen radical from the glycine residue. [Energy metabolism, Anaerobic, Protein fate, Protein modification and repair]	235
274163	TIGR02494	PFLE_PFLC	glycyl-radical enzyme activating protein. This subset of the radical-SAM family (pfam04055) includes a number of probable activating proteins acting on different enzymes all requiring an amino-acid-centered radical. The closest relatives to this family are the pyruvate-formate lyase activating enzyme (PflA, 1.97.1.4, TIGR02493) and the anaerobic ribonucleotide reductase activating enzyme (TIGR02491). Included within this subfamily are activators of hydroxyphenyl acetate decarboxylase (HdpA), benzylsuccinate synthase (BssD), glycerol dehydratase (DhaB2) as well as enzymes annotated in E. coli as activators of different isozymes of pyruvate-formate lyase (PFLC and PFLE); however, these appear to lack characterization and may activate enzymes with distinctive functions. Most of the sequence-level variability between these forms is concentrated within an N-terminal domain which follows a conserved group of three cysteines and contains a variable pattern of 0 to 8 additional cysteines.	295
274164	TIGR02495	NrdG2	anaerobic ribonucleoside-triphosphate reductase activating protein. This enzyme is a member of the radical-SAM family (pfam04055). It is often gene clustered with the class III (anaerobic) ribonucleotide triphosphate reductase (NrdD, TIGR02487) and presumably fulfills the identical function as NrdG, which utilizes S-adenosyl methionine, an iron-sulfur cluster and a reductant (dihydroflavodoxin) to produce a glycine-centered radical in NrdD. [Purines, pyrimidines, nucleosides, and nucleotides, 2'-Deoxyribonucleotide metabolism, Protein fate, Protein modification and repair]	192
131549	TIGR02497	yscI_hrpB_dom	type III secretion apparatus protein, YscI/HrpB, C-terminal domain. This model represents the conserved C-terminal domain of a protein conserved across species in the bacterial type III secretion apparatus. This protein is designated YscI (Yop proteins translocation protein I) in Yersinia and HrpB (hypersensitivity response and pathogenicity protein B) in plant pathogens such as Pseudomonas syringae. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	39
131550	TIGR02498	type_III_ssaH	type III secretion system protein, SsaH family. This family describes a small protein, always smaller than 100 amino acids, encoded in pathogenicity islands for bacterial type III secretion systems in various strains of Yersinia, Salmonella, and enteropathogenic E. coli, as well as Chromobacterium violaceum and Citrobacter rodentium. Although strictly associated with type III secretion systems, this protein seems not yet to have been characterized as part of the apparatus or as an effector protein. [Cellular processes, Pathogenesis]	79
274165	TIGR02499	HrpE_YscL_not	type III secretion apparatus protein, HrpE/YscL family. This model is related to pfam06188, but is broader. pfam06188 describes HrpE-like proteins, components of bacterial type III secretion systems primarily in bacteria that infect plants. This model also includes the homologous proteins of animal pathogens, such as YscL of Yersinia pestis. This model excludes the related protein FliH of the bacterial flagellar apparatus (see pfam02108). [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	166
274166	TIGR02500	type_III_yscD	type III secretion apparatus protein, YscD/HrpQ family. This family represents a conserved protein of bacterial type III secretion systems. Gene symbols are variable from species to species. Members are designated YscD in Yersinia, HrpQ in Pseudomonas syringae, and EscD in enteropathogenic Escherichia coli. In the Chlamydiae, this model describes the C-terminal 400 residues of a longer protein. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	410
274167	TIGR02501	type_III_yscE	type III secretion system protein, YseE family. Members of this family are found exclusively in type III secretion apparatus gene clusters in bacteria. Those bacteria with a protein from this family tend to target animal cells, as does Yersinia pestis. This protein is small (about 70 amino acids) and not well characterized. [Cellular processes, Pathogenesis]	67
131554	TIGR02502	type_III_YscX	type III secretion protein, YscX family. Members of this family are encoded within bacterial type III secretion gene clusters. Among all species with type III secretion, those with this protein are found among those that target animal rather than plant cells. The member of this family in Yersinia was shown by mutation to be required for type III secretion of Yops effector proteins and therefore is believed to be part of the secretion machinery. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	121
131555	TIGR02503	type_III_SycN	type III secretion chaperone SycN. Members of this protein family are part of the machinery of bacterial type III secretion in a number of bacteria that target animal cells. In the well-studied system from Yersinia, a complex of this protein (SycN) and YscB (pfam07329) acts as a chaperone for the export of YopN. YopN then acts to control effector protein secretion, in response to calcium levels, so that secretion occurs only after contact with the targeted eukaryotic cell. [Protein fate, Protein folding and stabilization, Cellular processes, Pathogenesis]	119
274168	TIGR02504	NrdJ_Z	ribonucleoside-diphosphate reductase, adenosylcobalamin-dependent. This model represents a group of adenosylcobalamin(B12)-dependent ribonucleotide reductases (Class II RNRs) related to the characterized species from Pyrococcus, Thermoplasma, Corynebacterium, and Deinococcus. RNR's are responsible for the conversion of the ribose sugar of RNA into the deoxyribose sugar of DNA. This is the rate-limiting step of DNA biosynthesis. This model identifies genes in a wide range of deeply branching bacteria. All are structurally related to the class I (non-heme iron dependent) RNRs. In most species this gene is known as NrdJ, while in mycobacteria it is called NrdZ. [Purines, pyrimidines, nucleosides, and nucleotides, 2'-Deoxyribonucleotide metabolism]	575
274169	TIGR02505	RTPR	ribonucleoside-triphosphate reductase, adenosylcobalamin-dependent. This model represents a group of adenosylcobalamin(B12)-dependent ribonucleotide reductases (RNR) related to the characterized species from Lactococcus leichmannii. RNR's are responsible for the conversion of the ribose sugar of RNA into the deoxyribose sugar of DNA. This is the rate-limiting step of DNA biosynthesis. This model identifies NrdJ enzymes only in cyanobacteria, lactococcus and certain bacteriophage. A separate model (TIGR02504) identifies a larger group of divergent B12-dependent RNR's. [Purines, pyrimidines, nucleosides, and nucleotides, 2'-Deoxyribonucleotide metabolism]	713
274170	TIGR02506	NrdE_NrdA	ribonucleoside-diphosphate reductase, alpha subunit. This model represents the alpha (large) chain of the class I ribonucleotide reductase (RNR). RNR's are responsible for the conversion of the ribose sugar of RNA into the deoxyribose sugar of DNA. This is the rate-limiting step of DNA biosynthesis. Class I RNR's generate the required radical (on tyrosine) via a "non-heme" iron cofactor which resides in the beta (small) subunit. The alpha subunit contains the catalytic and allosteric regulatory sites. The mechanism of this enzyme requires molecular oxygen. E. coli contains two versions of this enzyme which are regulated independently (NrdAB and NrdEF, where NrdA and NrdE are the large chains). Most organisms contain only one, but the application of the gene symbols NrdA and NrdE is somewhat arbitrary. This model identifies RNR's in diverse clades of bacteria, eukaryotes as well as numerous DNA viruses and phage. [Purines, pyrimidines, nucleosides, and nucleotides, 2'-Deoxyribonucleotide metabolism]	617
131559	TIGR02507	MtrF	tetrahydromethanopterin S-methyltransferase, F subunit. This small protein (MtrF) is one of eight subunits of the N5-methyltetrahydromethanopterin: coenzyme M methyltransferase in methanogenic archaea. This methyltransferase is a membrane-associated enzyme complex that uses a methyl-transfer reaction to drive a sodium-ion pump. Organisms of the Archaea domain have evolved energy-yielding pathways marked by one-carbon biochemistry featuring novel cofactors and enzymes. This transferase is involved in the transfer of a methyl group from N5-methyltetrahydromethanopterin to coenzyme M. In an accompanying reaction, methane is produced by two-electron reduction of the methyl moiety in methyl-coenzyme M by another enzyme, methyl-coenzyme M reductase.	65
131560	TIGR02508	type_III_yscG	type III secretion protein, YscG family. YscG is a molecular chaperone for YscE, where both are part of the type III secretion system that in Yersinia is designated Ysc (Yersinia secretion). The secretion system delivers effector proteins, designated Yops (Yersinia outer proteins) in Yersinia. This family consists of YscG of Yersinia, and functionally equivalent type III secretion machinery protein in other species: AscG in Aeromonas, LscG in Photorhabdus luminescens, etc. [Protein fate, Protein folding and stabilization, Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	115
131561	TIGR02509	type_III_yopR	type III secretion effector, YopR family. Members of this family are type III secretion system effectors, named differently in different species and designated YopR (Yersinia outer protein R), encoded by the YscH (Yersinia secretion H) gene. This Yops protein is unusual in that it is released extracellularly rather than injected directly into the target cell as are most Yops. [Cellular processes, Pathogenesis]	131
188230	TIGR02510	NrdE-prime	ribonucleoside-diphosphate reductase, alpha chain. This model represents a small clade of ribonucleoside-diphosphate reductase, alpha chains which are sufficiently divergent from the usual Class I RNR alpha chains (NrdE or NrdA, TIGR02506) as to warrant their own model. The genes from Thermus thermophilus, Dichelobacter and Salinibacter are adjacent to the usual RNR beta chain. [Purines, pyrimidines, nucleosides, and nucleotides, 2'-Deoxyribonucleotide metabolism]	548
131563	TIGR02511	type_III_tyeA	type III secretion effector delivery regulator, TyeA family. Members of this family include both small proteins, about 90 amino acids, in which this model covers the whole, and longer proteins of about 360 residues which match in the C-terminal region. The longer proteins (HrpJ) have N-terminal regions that match pfam07201. Members of this family belong to bacterial type III secretion systems, and include TyeA from the well-studied Yersinia systems. TyeA appears involved in calcium-responsive regulation of the delivery of type III effectors.	79
274171	TIGR02512	FeFe_hydrog_A	[FeFe] hydrogenase, group A. This model describes iron-only hydrogenases of anaerobic and microaerophilic bacteria and protozoa. This model is narrower, and covers a longer stretch of sequence, than pfam02906. This family represents a division among families that belong to pfam02906, which also includes proteins such as nuclear prelamin A recognition factor in animals. Note that this family shows some heterogeneity in terms of periplasmic, cytosolic, or hydrogenosome location, NAD or NADP dependence, and overall protein length.	374
131565	TIGR02513	type_III_yscB	type III secretion system chaperone, YscB family. Members of this family include YscB of Yersinia and functionally equivalent (but differently named) proteins from type III secretion systems of other pathogens that affect animal cells. YscB acts, along with SycN (TIGR02503), as a chaperone for YopN, a key part of a complex that regulates type III secretion so it responds to contact with the eukaryotic target cell.	139
274172	TIGR02514	type_III_yscP	type III secretion system needle length determinant. Members of this family include YscP of the Yersinia type III secretion system and equivalent proteins in other animal pathogen bacterial type III secretion systems. The model describes the conserved C-terminal region. N-terminal regions are poorly conserved and variable in length with some low-complexity sequence.	129
274173	TIGR02515	IV_pilus_PilQ	type IV pilus secretin (or competence protein) PilQ. A number of proteins homologous to PilQ are involved in type IV pilus formation, competence for transformation, type III secretion, and type II secretion (also called the main terminal branch of the general secretion pathway). Members of this family include PilQ itself, which is a component of the type IV pilus structure, from a number of species. In Haemophilus influenzae, the member of this family is associated with competence for transformation with exogenous DNA rather than with formation of a type IV pilus; the surface structure required for competence may be considered an unusual, incomplete type IV pilus structure. [Cell envelope, Surface structures]	418
274174	TIGR02516	type_III_yscC	type III secretion outer membrane pore, YscC/HrcC family. A number of proteins homologous to the type IV pilus secretin PilQ (TIGR02515) are involved in type IV pilus formation, competence for transformation, type III secretion, and type II secretion (also called the main terminal branch of the general secretion pathway). The clade described by this model contains the outer membrane pore proteins of bacterial type III secretion systems, typified by YscC for animal pathogens and HrcC for plant pathogens. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	462
274175	TIGR02517	type_II_gspD	type II secretion system protein D. In Gram-negative bacteria, proteins that have first crossed the inner membrane by Sec-dependent protein transport can be exported across the outer membrane by type II secretion, also called the main terminal branch of the general secretion pathway. Members of this family are general secretion pathway protein D. In Yersinia enterocolitica, a second member of this family is part of a novel second type II secretion system specifically associated with virulence. This family is closely homologous to the type IV pilus outer membrane secretin PilQ (TIGR02515) and to the type III secretion system pore YscC/HrcC (TIGR02516). [Protein fate, Protein and peptide secretion and trafficking]	594
131570	TIGR02518	EutH_ACDH	acetaldehyde dehydrogenase (acetylating). 	488
274176	TIGR02519	pilus_MshL	pilus (MSHA type) biogenesis protein MshL. Members of this family are predicted secretins, that is, outer membrane pore proteins associated with delivery of proteins from periplasm to the outside of the cell. Related families include GspD of type II secretion (TIGR02517), the YscC/HrcC family from type III secretion (TIGR02516), and the PilQ secretin of type IV pilus formation (TIGR02515). Members of this family are found in gene clusters associated with MSHA (mannose-sensitive hemagglutinin) and related pili, and appear to be the secretin of this pilus system. [Cell envelope, Surface structures]	290
274177	TIGR02520	pilus_B_mal_scr	type IVB pilus formation outer membrane protein, R64 PilN family. Several related protein families encode outer membrane pore proteins for type II secretion, type III secretion, and type IV pilus formation. This protein family appears to encode a secretin for pilus formation, although it is quite different from PilQ. Members include the PilN lipoprotein of the plasmid R64 thin pilus, a type IV pilus. Scoring between the trusted and noise cutoffs are examples of bundle-forming pilus B (bfpB). [Cell envelope, Surface structures, Protein fate, Protein and peptide secretion and trafficking]	497
131573	TIGR02521	type_IV_pilW	type IV pilus biogenesis/stability protein PilW. Members of this family are designated PilF and PilW. This outer membrane protein is required both for pilus stability and for pilus function such as adherence to human cells. Members of this family contain copies of the TPR (tetratricopeptide repeat) domain.	234
274178	TIGR02522	pilus_cpaD	pilus (Caulobacter type) biogenesis lipoprotein CpaD. This family consists of a pilus biogenesis protein, CpaD, from Caulobacter, and homologs in other bacteria, including three in the root nodule bacterium Bradyrhizobium japonicum. The molecular function is not known. [Cell envelope, Surface structures]	198
131575	TIGR02523	type_IV_pilV	type IV pilus modification protein PilV. Pilus systems categorized as type IV pilins differ greatly from one another, with some showing greater similarity to type II or type III secretion systems than to each other. Members of this protein family represent the PilV protein of type IV pilus systems as found in Pseudomonas aeruginosa PAO1, Pseudomonas syringae DC3000, Neisseria meningitidis MC58, Xylella fastidiosa 9a5c, etc. [Cell envelope, Surface structures, Protein fate, Protein modification and repair]	139
131576	TIGR02524	dot_icm_DotB	Dot/Icm secretion system ATPase DotB. Members of this protein family are the DotB component of Dot/Icm secretion systems, as found in obligate intracellular pathogens Legionella pneumophila and Coxiella burnetii. While this system resembles type IV secretion systems and has been called a form of type IV, the literature now seems to favor calling this the Dot/Icm system. This family is most closely related to TraJ proteins of plasmid transfer, rather than to proteins of other type IV secretion systems.	358
131577	TIGR02525	plasmid_TraJ	plasmid transfer ATPase TraJ. Members of this protein family are predicted ATPases associated with plasmid transfer loci in bacteria. This family is most similar to the DotB ATPase of a type-IV secretion-like system of obligate intracellular pathogens Legionella pneumophila and Coxiella burnetii (TIGR02524). [Mobile and extrachromosomal element functions, Plasmid functions]	372
131578	TIGR02526	eut_PduT	PduT-like ethanolamine utilization protein. This gene shows up in ethanolamine utilization operons in which a proteinaceous coat organelle is also encoded. It is closely related to the PduT protein in propane-diol operons with the same structure.	182
274179	TIGR02527	dot_icm_IcmQ	Dot/Icm secretion system protein IcmQ. Members of this protein family are the IcmQ component of Dot/Icm secretion systems, as found in obligate intracellular pathogens Legionella pneumophila and Coxiella burnetii. While this system resembles type IV secretion systems and has been called a form of type IV, the literature now seems to favor calling this the Dot/Icm system. This protein was shown to be essential for translocation.	179
131580	TIGR02528	EutP	ethanolamine utilization protein, EutP. This protein is found within operons which code for polyhedral organelles containing the enzyme ethanolamine ammonia lyase. The function of this gene is unknown, although the presence of an N-terminal GxxGxGK motif implies a GTP-binding site. [Energy metabolism, Amino acids and amines]	142
274180	TIGR02529	EutJ	ethanolamine utilization protein EutJ family protein. 	239
274181	TIGR02530	flg_new	flagellar operon protein. Members of this family are found in a subset of bacterial flagellar operons, generally between genes designated flgD and flgE, in species as diverse as Bacillus halodurans and various other Firmicutes, Geobacter sulfurreducens, and Bdellovibrio bacteriovorus. The specific molecular function is unknown. [Cellular processes, Chemotaxis and motility]	96
188231	TIGR02531	yecD_yerC	TrpR-related protein YerC/YecD. This model represents a protein subfamily found mostly in the Firmicutes (Bacillus and allies). This family is similar in sequence to the trp operon repressor TrpR described by TIGR01321, and represents a distinct clade within the broader family described by pfam01371. At least one species, Xylella fastidiosa, in the Proteobacteria, has a member of both this family and TIGR01321. Several genomes with a member of this family do not synthesize tryptophan, and members of this family should not be considered trp operon repressors without new evidence. [Unknown function, General]	87
274182	TIGR02532	IV_pilin_GFxxxE	prepilin-type N-terminal cleavage/methylation domain. This model describes many but not all examples of the N-terminal region of bacterial proteins that resemble type IV pilins at their N-terminus, with a cleavage site G^FxxxE followed by a hydrophobic stretch. The new N-terminal residue, usually Phe, is methylated. Separate domains of the prepilin peptidase appear responsible for cleavage and methylation. Proteins with this N-terminal region include type IV pilins and other components of pilus biogenesis, competence proteins, and type II secretion proteins. Typically several proteins in a single operon have this N-terminal domain. The N-terminal cleavage and methylation site is described by PROSITE motif PS00409 as [KRHEQSTAG]-G-[FYLIVM]-[ST]-[LT]-[LIVP]-E-[LIVMFWSTAG](14). [Cell envelope, Surface structures, Protein fate, Protein and peptide secretion and trafficking]	24
131585	TIGR02533	type_II_gspE	type II secretion system protein E. This family describes GspE, the E protein of the type II secretion system, also called the main terminal branch of the general secretion pathway. This model separates GspE from the PilB protein of type IV pilin biosynthesis. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	486
162905	TIGR02534	mucon_cyclo	muconate and chloromuconate cycloisomerases. This model encompasses muconate cycloisomerase (EC 5.5.1.1) and chloromuconate cycloisomerase (EC 5.5.1.7), enzymes that often overlap in specificity. It excludes more distantly related proteins such as mandelate racemase (5.1.2.2).	368
274183	TIGR02535	hyp_Hser_kinase	proposed homoserine kinase. The genes in this family are largely adjacent to genes involved in the biosynthesis of threonine (aspartate kinase, homoserine dehydrogenase and threonine synthase) in genomes which are lacking any other known homoserine kinase, and in which the presence of a homoserine kinase would indicate a complete pathway for the biosynthesis of threonine. These genes are members of the (now subfamily, formerly equivalog) TIGR00306 model describing the archaeal form of 2,3-bisphosphoglycerate-independent phosphoglycerate mutase. All of these are members of a superfamily (pfam01676) of metalloenzymes also including phosphopentomutase, alkaline phosphatases and sulfatases. The proposal that this family encodes a kinase is based on analogy to phosphomutases, which are intramolecular phosphotransferases. A mutase active site could evolve to bring together homoserine and a phosphate donor such as phosphoenolpyruvate, resulting in a kinase activity.	396
274184	TIGR02536	eut_hyp	ethanolamine utilization protein. This family of proteins is found in operons for the polyhedral organelle-based degradation of ethanolamine. This family is not found in proteobacterial species which otherwise have the same suite of genes in the eut operon. Proteobacteria have two genes that are not found in non-proteobacteria which may complement this gene's function, a phosphotransacetylase (pfam01515) and the EutJ protein (TIGR02529) of unknown function.	207
274185	TIGR02537	arch_flag_Nterm	archaeal flagellin N-terminal-like domain. This model describes a hydrophobic N-terminal sequence of archaeal flagellins and other archaeal proteins. The sequence is directly analogous to bacterial sequences recognized by TIGR02532, which has a cleavage motif resembling G^FxxxE followed by a strongly hydrophobic sequence. Such sequences are recognized for cleavage and methylation, and include pilins and other pilus components and competence and type II secretion proteins. In the present family, the E is not conserved and the sequence differs enough that there is no overlap between this family and TIGR02532.	24
274186	TIGR02538	type_IV_pilB	type IV-A pilus assembly ATPase PilB. This model describes a protein of type IV pilus biogenesis designated PilB in Pseudomonas aeruginosa but PilF in Neisseria gonorrhoeae; the more common usage, reflected here, is PilB. This protein is an ATPase involved in protein export for pilin assembly and is closely related to GspE (TIGR02533) of type II secretion, also called the main terminal branch of the general secretion pathway. Note that type IV pilus systems are often divided into type IV-A and IV-B, with the latter group including bundle-forming pilus, mannose-sensitive hemagglutinin, etc. Members of this family are found in type IV-A systems. [Cell envelope, Surface structures, Protein fate, Protein and peptide secretion and trafficking]	564
274187	TIGR02539	SepCysS	O-phospho-L-seryl-tRNA:Cys-tRNA synthase. Aminoacylation of tRNA(Cys) with Cys, and cysteine biosynthesis in the process, happens in Methanocaldococcus jannaschii and several other archaea by misacylation of tRNA(Cys) with O-phosphoserine (Sep), followed by modification of the phosphoserine to cysteine. In some species, direct tRNA-cys aminoacylation also occurs but this pathway is required for Cys biosynthesis. Members of this protein family catalyze the second step in this two-step pathway, using pyridoxal phosphate and a sulfur donor to synthesize Cys from Sep while attached to the tRNA.	369
131592	TIGR02540	gpx7	putative glutathione peroxidase Gpx7. This model represents one of several families of known and probable glutathione peroxidases. This family is restricted to animals and designated GPX7.	153
274188	TIGR02541	flagell_FlgJ	flagellar rod assembly protein/muramidase FlgJ. The N-terminal region of this protein acts directly in flagellar rod assembly. The C-terminal region is a flagellum-specific muramidase (peptidoglycan hydrolase) required for formation of the outer membrane L ring.	294
211749	TIGR02542	T_forsyth_147	TANFOR domain. The longest predicted protein in Tannerella forsythia (Bacteroides forsythus) ATCC 43037 is over 3000 residues long and lacks homology to other known proteins. Immediately after the signal sequence are four tandem repeats, approximately 147 residues long. This model describes that repeat, plus homologous single copy N-terminal domains in other large bacterial proteins. We designate this region the TANFOR domain. Many proteins with this domain also have fibronectin type III domains.	145
274189	TIGR02543	List_Bact_rpt	Listeria/Bacteroides repeat. This model describes a conserved core region, about 43 residues in length, of at least two families of tandem repeats. These include 78-residue repeats from 2 to 15 in number, in some proteins of Bacteroides forsythus ATCC 43037, and 70-residue repeats in families of internalins of Listeria species. Single copies are found in proteins of Fibrobacter succinogenes, Geobacter sulfurreducens, and a few other bacteria. [Unknown function, General]	43
274190	TIGR02544	III_secr_YscJ	type III secretion apparatus lipoprotein, YscJ/HrcJ family. All members of this protein family are predicted lipoproteins with a conserved Cys near the N-terminus for cleavage and modification, and are part of known or predicted type III secretion systems. Members are found in both plant and animal pathogens, including the obligately intracellular chlamydial species and (non-pathogenic) root nodule bacteria. The most closely related proteins outside this family are examples of the flagellar M-ring protein FliF. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	193
274191	TIGR02546	III_secr_ATP	type III secretion apparatus H+-transporting two-sector ATPase. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	422
274192	TIGR02547	casA_cse1	CRISPR type I-E/ECOLI-associated protein CasA/Cse1. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This model family, represented by CT1972 from Chlorobium tepidum, is found in Ecoli subtype CRISPR/Cas regions of many bacteria, most of which are mesophiles, and not in Archaea. It is designated Cse1.	502
274193	TIGR02548	casB_cse2	CRISPR type I-E/ECOLI-associated protein CasB/Cse2. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This model family is found in Ecoli subtype CRISPR/Cas regions of many bacteria, most of which are mesophiles, and not in Archaea. It was designated Cse2 originally, and renamed CasB based on its characterization in the CASCADE complex.	160
274194	TIGR02549	CRISPR_DxTHG	CRISPR-associated DxTHG motif protein. This model describes a short region highly conserved between two otherwise substantially different CRISPR-associated (cas) proteins, TIGR02221 and TIGR01987. This region includes the motif [VIL]-D-x-[ST]-H-[GS].	21
274195	TIGR02550	flagell_flgL	flagellar hook-associated protein 3. This protein family consists of flagellar hook-associated proteins designated FlgL (or HAP3) encoded in bacterial flagellar operons. An N-terminal region of about 150 residues and a C-terminal region of about 85 residues are conserved. Members show considerable length heterogeneity between these two well-conserved terminal regions; the seed alignment has 486 columns, 393 of which are represented in the model, while members of this family are from 287 to over 500 residues in length. This model distinguishes FlgL from the flagellin gene product FliC. [Cellular processes, Chemotaxis and motility]	306
274196	TIGR02551	SpaO_YscQ	type III secretion system apparatus protein YscQ/HrcQ. Genes in this family are found in type III secretion operons. The gene (YscQ) in Yersinia is essential for YOPs secretion, while SpaO in Shigella is involved in the Surface Presentation of Antigens apparatus found on the virulence plasmid, and HrcQ is involved in the Harpin secretory system in organisms like Pseudomonas syringae. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	298
274197	TIGR02552	LcrH_SycD	type III secretion low calcium response chaperone LcrH/SycD. Genes in this family are found in type III secretion operons. LcrH, from Yersinia, is believed to have a regulatory function in the low-calcium response of the secretion system. The same protein is also known as SycD (SYC = Specific Yop Chaperone) for its chaperone role. In Pseudomonas, where the homolog is known as PcrH, the chaperone role has been demonstrated and the regulatory role appears to be absent. SycD/LcrH contains three central tetratricopeptide-like repeats that are predicted to fold into an all-alpha-helical array.	135
274198	TIGR02553	SipD_IpaD_SspD	type III effector protein IpaD/SipD/SspD. These proteins are found within type III secretion operons and have been shown to be secreted by that system.	313
131605	TIGR02554	PrgH	type III secretion system protein PrgH/EprH. In Salmonella, this gene is part of a four-gene operon PrgHIJK and in general is found in type III secretion operons. PrgH has been shown to be required for secretion, as well as being a structural component of the needle complex. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	389
131606	TIGR02555	OrgA_MxiK	type III secretion apparatus protein OrgA/MxiK. This gene is found in type III secretion operons and has been shown to be essential for the invasion phenotype in Salmonella and a component of the secretion apparatus. The protein is known as OrgA in Salmonella due to its oxygen-dependent expression pattern in which low-oxygen levels up-regulate the gene. In Shigella the gene is called MxiK and has been shown to be essential for the proper assembly of the secretion needle complex.	185
274199	TIGR02556	cas_TM1802	CRISPR-associated protein, TM1802 family. This minor cas protein is found in CRISPR/cas regions of at least five prokaryotic genomes: Methanosarcina mazei, Sulfurihydrogenibium azorense, Thermotoga maritima, Carboxydothermus hydrogenoformans, and Dictyoglomus thermophilum, the first of which is archaeal while the rest are bacterial.	555
274200	TIGR02557	HpaP	type III secretion protein HpaP. This family of genes is always found in type III secretion operons, although its function in the processes of secretion and virulence is unclear. Hpa stands for Hrp-associated gene, where Hrp stands for hypersensitivity response and virulence.	201
131609	TIGR02558	HrpB2	type III secretion protein HrpB2. This family of genes is found in type III secretion operons in a narrow group of species including Xanthomonas, Burkholderia and Ralstonia.	124
131610	TIGR02559	HrpB7	type III secretion protein HrpB7. This family of genes is found in type III secretion operons in a narrow range of species including Xanthomonas, Burkholderia and Ralstonia.	158
131611	TIGR02560	HrpB4	type III secretion protein HrpB4. This family of genes is always found in type III secretion operons in a limited number of species including Burkholderia, Xanthomonas and Ralstonia.	210
131612	TIGR02561	HrpB1_HrpK	type III secretion protein HrpB1/HrpK. This gene is found within type III secretion operons in a limited range of species including Xanthomonas, Ralstonia and Burkholderia.	153
274201	TIGR02562	cas3_yersinia	CRISPR-associated helicase Cas3, subtype I-F/YPEST. The helicase in many CRISPR-associated (cas) gene clusters is designated Cas3, and most Cas3 proteins are described by model TIGR01587. Members of this family are considerably larger, show a number of motifs in common with TIGR01587 sequences, and replace Cas3 in some CRISPR/cas loci in a number of Proteobacteria, including Yersinia pestis, Chromobacterium violaceum, Erwinia carotovora subsp. atroseptica SCRI1043, Photorhabdus luminescens subsp. laumondii TTO1, Legionella pneumophila, etc.	1110
274202	TIGR02563	cas_Csy4	CRISPR-associated endoribonuclease Cas6/Csy4, subtype I-F/YPEST. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is a widespread family of prokaryotic direct repeats with spacers of unique sequence between consecutive repeats. This protein family, typified by YPO2462 of Yersinia pestis, is a CRISPR-associated (Cas) family strictly associated with the Ypest subtype of CRISPR/Cas locus. This family is designated Csy4, for CRISPR/Cas Subtype Ypest protein 4.	185
274203	TIGR02564	cas_Csy1	CRISPR type I-F/YPEST-associated protein Csy1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is a widespread family of prokaryotic direct repeats with spacers of unique sequence between consecutive repeats. This protein family, typified by YPO2465 of Yersinia pestis, is a CRISPR-associated (Cas) family strictly associated with the Ypest subtype of CRISPR/Cas locus. This family is designated Csy1, for CRISPR/Cas Subtype Ypest protein 1.	384
274204	TIGR02565	cas_Csy2	CRISPR type I-F/YPEST-associated protein Csy2. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is a widespread family of prokaryotic direct repeats with spacers of unique sequence between consecutive repeats. This protein family, typified by YPO2464 of Yersinia pestis, is a CRISPR-associated (Cas) family strictly associated with the Ypest subtype of CRISPR/Cas locus. This family is designated Csy2, for CRISPR/Cas Subtype Ypest protein 2.	296
274205	TIGR02566	cas_Csy3	CRISPR type I-F/YPEST-associated protein Csy3. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is a widespread family of prokaryotic direct repeats with spacers of unique sequence between consecutive repeats. This protein family, typified by YPO2463 of Yersinia pestis, is a CRISPR-associated (Cas) family strictly associated with the Ypest subtype of CRISPR/Cas locus. This family is designated Csy3, for CRISPR/Cas Subtype Ypest protein 3.	341
131618	TIGR02567	YscW	type III secretion system chaperone YscW. This family of proteins is found within type III secretion operons. The protein has been characterized as a chaperone for the outer membrane pore component YscC (TIGR02516). YscW is a lipoprotein which is itself localized to the outer membrane and, it is believed, facilitates the oligomerization and localization of YscC.	124
274206	TIGR02568	LcrE	type III secretion regulator YopN/LcrE/InvE/MxiC. This protein is found in type III secretion operons and, in Yersinia, is localized to the cell surface and is involved in the Low-Calcium Response (LCR), possibly by sensing the calcium concentration. In Salmonella, the gene is known as InvE and is believed to perform an essential role in the secretion process and interacts with the proteins SipBCD and SicA.//Altered name to reflect regulatory role. Added GO and role IDs . Negative regulation of type III secretion in Y pestis is mediated in part by a multiprotein complex that has been proposed to act as a physical impediment to type III secretion by blocking the entrance to the secretion apparatus prior to contact with mammalian cells. This complex is composed of YopN, its heterodimeric secretion chaperone SycN-YscB, and TyeA. 3[SS 6/3/05] [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	240
131620	TIGR02569	TIGR02569_actnb	TIGR02569 family protein. This protein family is found, so far, only in Actinobacteria, including at least five species of Mycobacterium, three of Corynebacterium, and Nocardia farcinica, always in a single copy per genome. The function is unknown. [Hypothetical proteins, Conserved]	272
274207	TIGR02570	cas7_GSU0053	CRISPR-associated protein GSU0053/csb1, Dpsyc system. Members of this family are found in association with CRISPR repeats and other CRISPR-associated (cas) genes in the genomes of Geobacter sulfurreducens PCA and Desulfotalea psychrophila LSv54 (both Desulfobacterales from the Deltaproteobacteria), Gemmata obscuriglobus (Planctomycete), and Actinomyces naeslundii MG1 (Actinobacteria). This CRISPR/Cas type is designated Dpsych.	172
131622	TIGR02571	ComEB	ComE operon protein 2. This protein is found in the ComE operon for "late competence" as characterized in B. subtilis. Proteins in this family contain homology to a cytidine/deoxycytidine deaminase domain family (pfam00383), and may carry out this activity.	151
131623	TIGR02572	LcrR	type III secretion system regulator LcrR. This protein is found in type III secretion operons and has been characterized in Yersinia as a regulator of the Low-Calcium Response (LCR). [Protein fate, Protein and peptide secretion and trafficking]	139
131624	TIGR02573	LcrG_PcrG	type III secretion protein LcrG. This protein is found in type III secretion operons, along with LcrR, H and V. Also known as PcrG in Pseudomonas, the protein is believed to make a 1:1 complex with PcrV (LcrV). Mutants of LcrG cause premature secretion of effector proteins into the medium.	90
131625	TIGR02574	stabl_TIGR02574	putative addiction module component, TIGR02574 family. Members of this family are bacterial proteins, typically are about 75 amino acids long, always found as part of a pair (at least) of two small genes. The other in the pair always belongs to a subfamily of the larger family pfam05016 (although not necessarily scoring above the designated cutoff), which contains plasmid stabilization proteins. It is likely that this protein and its pfam05016 member partner comprise some form of addiction module, although these gene pairs usually are found on the bacterial main chromosome. [Mobile and extrachromosomal element functions, Other]	63
274208	TIGR02577	cas_TM1794_Cmr2	CRISPR-associated protein Cas10/Cmr2, subtype III-B. This model represents a Cmr2 family of the CRISPR-associated RAMP module, a set of six genes recurringly found together in prokaryotic genomes. This gene cluster is found only in species with CRISPR repeats, usually near the repeats themselves. Because most of the six (but not this family) contain RAMP domains, and because its appearance in a genome appears to depend on other CRISPR-associated Cas genes, the set is designated the CRISPR RAMP module. This protein, typified by TM1794 from Thermotoga maritima, is designated Cmr2, for CRISPR RAMP Module protein 2.	483
274209	TIGR02578	cas_TM1811_Csm1	CRISPR-associated protein Cas10/Csm1, subtype III-A/MTUBE. The family is designated Csm1, for CRISPR/Cas Subtype Mtube Protein 1. A typical example is TM1811 from Thermotoga maritima. CRISPR are Clustered Regularly Interspaced Short Palindromic Repeats. This protein family belongs to a conserved gene cluster regularly found near CRISPR repeats.	648
131628	TIGR02579	cas_csx3	CRISPR-associated protein, Csx3 family. Members of this family are found encoded in CRISPR-associated (cas) gene clusters, near CRISPR repeats, in the genomes of several different thermophiles: Archaeoglobus fulgidus (archaeal), Aquifex aeolicus (Aquificae), Dictyoglomus thermophilum (Dictyoglomi), and a thermophilic Synechococcus (Cyanobacteria). It is not yet assigned to a specific CRISPR/cas subtype (hence the x designation csx3).	83
274210	TIGR02580	cas_RAMP_Cmr4	CRISPR type III-B/RAMP module RAMP protein Cmr4. This model represents a CRISPR-associated protein from the family that includes TM1792 of Thermotoga maritima. This family is part of the broad RAMP superfamily (pfam03787) collection of CRISPR-associated proteins. It is the fourth of a recurring set of six proteins, four of which are in the RAMP superfamily, that we designate the CRISPR RAMP module.	280
274211	TIGR02581	cas_cyan_RAMP	CRISPR-associated RAMP protein, SSO1426 family. Members of this CRISPR-associated (cas) gene family are found in the RAMP-2 subtype of CRISPR/cas locus and designated TM1809 family.	217
274212	TIGR02582	cas7_TM1809	CRISPR type III-A/MTUBE-associated RAMP protein Csm3. Members of this CRISPR-associated (cas) gene family are found in the mtube subtype of CRISPR/cas locus and designated Csm3, for CRISPR/cas Subtype Mtube, protein 3.	204
274213	TIGR02583	DevR_archaea	CRISPR-associated protein Cas7/Csa2, subtype I-A/APERN. CRISPR is a term for Clustered Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR associated) proteins. This model represents one such family, typified by MJ0381 of Methanococcus jannaschii. This archaeal clade is a member of the DevR family (TIGR01875) which includes the DevR protein of Myxococcus xanthus, a protein whose expression appears to be regulated through a number of means, including both location and autorepression; DevR mutants are incapable of fruiting body development. This subfamily is found in a CRISPR/Cas locus we designate APERN, so the family is designated Csa2, for CRISPR/Cas Subtype Protein 2.	285
274214	TIGR02584	cas_NE0113	CRISPR-associated protein, NE0113 family. Members of this minor CRISPR-associated (Cas) protein family are found in cas gene clusters in Vibrio vulnificus YJ016, Nitrosomonas europaea ATCC 19718, Mannheimia succiniciproducens MBEL55E, and Verrucomicrobium spinosum.	209
274215	TIGR02585	cas_Cst2_DevR	CRISPR-associated protein Cas7/Cst2/DevR, subtype I-B/TNEAP. CRISPR is a term for Clustered Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR associated) proteins. This clade is a member of the DevR family (TIGR01875) and includes the DevR protein of Myxococcus xanthus, a protein whose expression appears to be regulated through a number of means, including both location and autorepression; DevR mutants are incapable of fruiting body development.	310
131635	TIGR02586	cas5_cmx5_devS	CRISPR-associated protein Cas5/DevS, subtype MYXAN. This model represents DevS of Myxococcus xanthus and related proteins of Leptospira interrogans and Gemmata obscuriglobus. This protein is encoded in a cluster of CRISPR-associated (cas) genes, and in the special case of Myxococcus xanthus has taken on a role in the control of fruiting body development. CRISPRs are clustered, regularly interspaced short palindromic repeats. This protein family is related to models TIGR01868, TIGR01895, and TIGR01876.	188
131636	TIGR02587	TIGR02587	putative integral membrane protein TIGR02587. Members of this family are found in Nostoc sp. PCC 7120, Agrobacterium tumefaciens, Sinorhizobium meliloti, and Gloeobacter violaceus in a conserved two-gene neighborhood. This family, as defined, includes some members of COG4711 but is narrower and strictly bacterial. Members appear to span the membrane seven times. [Cell envelope, Other]	271
131637	TIGR02588	TIGR02588	TIGR02588 family protein. The function of this protein is unknown. It is always found as part of a two-gene operon with TIGR02587, a protein that appears to span the membrane seven times. It is found in Nostoc sp. PCC 7120, Agrobacterium tumefaciens, Sinorhizobium meliloti, and Gloeobacter violaceus, so far, all of which are bacterial. [Hypothetical proteins, Conserved]	122
274216	TIGR02589	cas_Csd2	CRISPR-associated protein Cas7/Csd2, subtype I-C/DVULG. This model represents one of two closely related CRISPR-associated proteins that belong to the larger family of TIGR01595. Members are the Csd2 protein of the Dvulg subtype of CRISPR/cas system. CRISPR stands for Clustered Regularly Interspaced Short Palindromic Repeats. The related model is TIGR02590, the Csh2 protein of the Hmari CRISPR subtype.	284
274217	TIGR02590	cas_Csh2	CRISPR-associated protein Cas7/Csh2, subtype I-B/HMARI. This model represents one of two closely related CRISPR-associated proteins that belong to the larger family of TIGR01595. Members are the Csh2 protein of the Hmari subtype of CRISPR/cas system. CRISPR stands for Clustered Regularly Interspaced Short Palindromic Repeats. The related model is TIGR02589, the Csd2 protein of the Dvulg CRISPR subtype.	286
188234	TIGR02591	cas_Csh1	CRISPR-associated protein Cas8b/Csh1, subtype I-B/HMARI. This domain is found in the C-terminal 2/3 of a family of CRISPR associated proteins of the Hmari subtype. Except for the two sequences from halophilic archaea this domain contains a pair of CXXC motifs.	393
131641	TIGR02592	cas_Cas5h	CRISPR-associated protein Cas5, subtype I-B/HMARI. This is a CRISPR-associated protein unique to the hmari subtype of cas genes and CRISPR repeat, which is the only subtype present in Haloarcula marismortui ATCC 43049. The hmari type, though uncommon, is also found in the Aquificae, Thermotogae, Firmicutes, and Dictyoglomi.	241
274218	TIGR02593	CRISPR_cas5	CRISPR-associated protein Cas5, N-terminal domain. This model represents a shared N-terminal domain, about 43 amino acids in length, common to a number of related protein families each of which is associated with a distinct subtype of CRISPR/cas system, where CRISPR is an acronym for Clustered Regularly Interspaced Short Palindromic Repeat and Cas is an abbreviation for CRISPR-associated. Members of this family are widely distributed enough that we designated the family Cas5. Homology appears remote, or absent, between the more C-terminal regions of different subfamilies of these proteins, which typically are 210 to 265 amino acids in total length. Cas5 proteins of six different CRISPR/cas subtypes so far defined are described by respective full-length models TIGR01868, TIGR01876, TIGR01895, TIGR01874, TIGR02586, and TIGR02592. The best characterized protein in this family is DevS of Myxococcus xanthus, a Cas protein that appears to participate in a species-specific developmental pathway.	42
131643	TIGR02594	TIGR02594	TIGR02594 family protein. Members of this protein family known so far are restricted to the bacteria, and for the most part to the proteobacteria. The function is unknown.	129
131644	TIGR02595	PEP_exosort	PEP-CTERM protein-sorting domain. This model describes a 25-residue domain that includes a near-invariant Pro-Glu-Pro (PEP) motif, a thirteen residue strongly hydrophobic sequence likely to span the membrane, and a five-residue strongly basic motif that often contains four Arg residues. In nearly every case, this motif is found within nine residues, and usually within five residues, of the extreme C-terminus of the protein. Proteins with this motif typically have signal sequences at the N-terminus. This region appears many times per genome or not at all, and co-occurs in genomes with a proposed protein-sorting integral membrane protein we designate exosortase (see TIGR02602). PEP-CTERM proteins frequently are poorly conserved, Ser/Thr-rich proteins and may become extensively modified proteinaceous constituents of extracellular material in bacterial biofilms. [Cell envelope, Surface structures]	24
274219	TIGR02596	TIGR02596	Verru_Chthon cassette protein D. This model describes a nearly twenty member protein family in Verrucomicrobium spinosum and a somewhat smaller paralogous family in Chthoniobacter flavus. All members share a type IV pilin-like N-terminal leader sequence (TIGR02532). These proteins occur in the four-gene Verru_Chthon cassette, in which two other genes likewise encode a cleavage/methylation domain. Most of these cassettes occur next to an unusually large PEP-CTERM protein with an autotransporter domain. [Cell envelope, Surface structures]	195
274220	TIGR02597	TIGR02597	TIGR02597 family protein. This model describes a paralogous family with at least ten members in Verrucomicrobium spinosum. Two additional predicted proteins match more weakly and score between the trusted and noise cutoffs, while a third contains a point mutation. Eleven of the thirteen genes are found in a single tandem array.	361
131647	TIGR02598	TIGR02598	Verru_Chthon cassette protein B. This family consists of sets of paralogous proteins in Verrucomicrobium spinosum and Chthoniobacter flavus. All members contain the prepilin-type N-terminal cleavage/methylation domain (TIGR02532) at the N-terminus. The mature protein would be about 150 amino acids long. These proteins occur in the four-gene Verru_Chthon cassette, in which two other genes likewise encode a cleavage/methylation domain. Most of these cassettes occur next to an unusually large PEP-CTERM protein with an autotransporter domain. [Cell envelope, Surface structures]	151
274221	TIGR02599	TIGR02599	Verru_Chthon cassette protein C. This family consists of sets of paralogous proteins in Verrucomicrobium spinosum and Chthoniobacter flavus. All members contain the prepilin-type N-terminal cleavage/methylation domain (TIGR02532) at the N-terminus. The mature protein would be about 350 amino acids long. These proteins occur in the four-gene Verru_Chthon cassette, in which two other genes likewise encode a cleavage/methylation domain. Most of these cassettes occur next to an unusually large PEP-CTERM protein with an autotransporter domain. [Cell envelope, Surface structures]	339
274222	TIGR02600	Verru_Chthon_A	Verru_Chthon cassette protein A. In Verrucomicrobium spinosum and Chthoniobacter flavus, a four-gene operon that includes proteins with an N-terminal signal sequence for cleavage and methylation recurs many times. Each operon is likely to encode a membrane complex, the function of which is unknown. This model represents a long protein from this putative membrane complex, with members averaging about 1300 amino acids. The N-terminal region includes an apparent signal sequence. The function is unknown. Most cassettes are adjacent to an unusually large protein with both an outer membrane autotransporter region and PEP-CTERM putative protein-sorting motif. [Cell envelope, Surface structures]	1265
274223	TIGR02601	autotrns_rpt	autotransporter-associated beta strand repeat. This model represents a core 32-residue region of a class of bacterial protein repeat found in one to 30 copies per protein. Most proteins with a copy of this repeat have domains associated with membrane autotransporters (pfam03797, TIGR01414). The repeats occur with a periodicity of 60 to 100 residues. A pattern of sequence conservation is that every second residue is well-conserved across most of the domain. pfam05594 is based on a longer, much more poorly conserved multiple sequence alignment and hits some of the same proteins as this model with some overlap between the hit regions of the two models. It describes these repeats as likely to have a beta-helical structure. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	32
131651	TIGR02602	8TM_EpsH	exosortase. This family is designated exosortase, and it is the predicted protein-sorting transpeptidase for the PEP-CTERM protein-sorting signal of many biofilm-producing Gram-negative bacteria. This system is analogous to the sortase/LPXTG system found mostly in Gram-positive bacteria. Members of this family are integral membrane proteins with eight predicted transmembrane helices in common, and with a triad of invariant residues that matches the catalytic triad of sortases. Some members of this family have long trailing sequences past the region described by this model, which in other species is a separate protein EpsI. This model does not include the region of the first predicted transmembrane region. The only partially characterized member is EpsH of Methylobacillus sp. 12S, part of a locus associated with biosynthesis of the exopolysaccharide methanolan but itself not involved in polysaccharide biosynthesis. [Protein fate, Protein and peptide secretion and trafficking]	241
274224	TIGR02603	CxxCH_TIGR02603	putative heme-binding domain, Pirellula/Verrucomicrobium type. This model represents a domain limited to very few species but expanded into large paralogous families in some species that contain it. We find it in over 20 copies each in Pirellula sp. strain 1 (phylum Planctomycetes) and Verrucomicrobium spinosum DSM 4136 (phylum Verrucomicrobia), and no matches above trusted cutoff in any other species so far. This domain, about 140 amino acids long, contains an absolutely conserved motif CxxCH, the cytochrome c family heme-binding site signature (PS00190).	133
274225	TIGR02604	Piru_Ver_Nterm	putative membrane-bound dehydrogenase domain. All proteins that score above the trusted cutoff score of 45 to this model are large proteins of either Pirellula sp. 1 or Verrucomicrobium spinosum. These proteins all contain, in addition to this domain, several hundred residues of highly variable sequence, and then a well-conserved C-terminal domain (TIGR02603) that features a putative cytochrome c-type heme binding motif CXXCH. The membrane-bound L-sorbosone dehydrogenase from Acetobacter liquefaciens (Gluconacetobacter liquefaciens) (SP|Q44091) is homologous to this domain but lacks additional sequence regions shared by members of this family and belongs to a different clade of the larger family of homologs. It and its closely related homologs are excluded from this model by scoring between the trusted (45) and noise (18) cutoffs.	367
274226	TIGR02605	CxxC_CxxC_SSSS	putative regulatory protein, FmdB family. This model represents a region of about 50 amino acids found in a number of small proteins in a wide range of bacteria. The region begins usually with the initiator Met and contains two CxxC motifs separated by 17 amino acids. One member of this family has been noted as a putative regulatory protein, designated FmdB (SP:Q50229). Most members of this family have a C-terminal region containing highly degenerate sequence, such as SSTSESTKSSGSSGSSGSSESKASGSTEKSTSSTTAAAAV in Mycobacterium tuberculosis and VAVGGSAPAPSPAPRAGGGGGGCCGGGCCG in Streptomyces avermitilis. These low complexity regions, which are not included in the model, resemble low-complexity C-terminal regions of some heterocycle-containing bacteriocin precursors. [Regulatory functions, DNA interactions]	52
274227	TIGR02606	antidote_CC2985	putative addiction module antidote protein, CC2985 family. This bacterial protein family has a very similar seed alignment to that of pfam03693 but is a more stringent model with higher cutoff scores. Proteins that score above the trusted cutoff to this model almost invariably are found adjacent to a ParE family protein (pfam05016), where ParE is the killing partner of an addiction module for plasmid stabilization. Members of this family, therefore, are putative addiction module antidote proteins. Some are encoded on plasmids or in prophage regions, but others appear chromosomal. A genome may contain several identical copies, such as the four in Magnetococcus sp. MC-1. This family is named for one member, CC2985 of Caulobacter crescentus CB15. [Cellular processes, Other, Mobile and extrachromosomal element functions, Plasmid functions]	69
274228	TIGR02607	antidote_HigA	addiction module antidote protein, HigA family. Members of this family form a distinct clade within the larger family HTH_3 of helix-turn-helix proteins, described by pfam01381. Members of this clade are strictly bacterial and nearly always shorter than 110 amino acids. This family includes the characterized member HigA, without which the killer protein HigB cannot be cloned. The hig (host inhibition of growth) system is noted to be unusual in that the killer protein is encoded by the upstream member of the gene pair. [Regulatory functions, DNA interactions, Regulatory functions, Protein interactions, Mobile and extrachromosomal element functions, Other]	78
274229	TIGR02608	delta_60_rpt	delta-60 repeat domain. This domain occurs in tandem repeats, as many as 13, in proteins from Bdellovibrio bacteriovorus, Azotobacter vinelandii, Geobacter sulfurreducens, Pirellula sp. 1, Myxococcus xanthus, and others, many of which are Deltaproteobacteria. The periodicity of the repeat ranges from about 57 to 61 amino acids, and a core region of about 54 is represented by this model and seed alignment.	54
274230	TIGR02609	doc_partner	putative addiction module antidote. Members of this protein family are putative addiction module antidote proteins that appear recurringly in two-gene operons with members of the Doc (death-on-curing) family TIGR01550. Members of this family contain a SpoVT/AbrB-like domain (pfam04014). Note that the gene pairs with a member of this family tend to be found on bacterial chromosomes, not on plasmids. [Mobile and extrachromosomal element functions, Other]	74
131659	TIGR02610	PHA_gran_rgn	putative polyhydroxyalkanoic acid system protein. All members of this family are encoded by polyhydroxyalkanoic acid (PHA) biosynthesis and utilization genes, including proteins found at the surface of PHA granules. Examples so far are found in the Pseudomonadales, Xanthomonadales, and Vibrionales, all of which belong to the Gammaproteobacteria.	91
131660	TIGR02611	TIGR02611	TIGR02611 family protein. Members of this family are Actinobacterial putative proteins of about 150 amino acids in length with three apparent transmembrane helices and an unusual motif with consensus sequence PGPGW. [Hypothetical proteins, Conserved]	121
274231	TIGR02612	mob_myst_A	mobile mystery protein A. Members of this protein family are found in mobilization-related contexts more often than not, including within a CRISPR-associated gene region in Geobacter sulfurreducens PCA, and on plasmids in Agrobacterium tumefaciens and Coxiella burnetii, always together with mobile mystery protein B, a member of the Fic protein family (pfam02661). This protein is encoded by the upstream member of the gene pair and belongs to a family of helix-turn-helix DNA binding proteins (pfam01381). [Unknown function, General]	150
131662	TIGR02613	mob_myst_B	mobile mystery protein B. Members of this protein family, which we designate mobile mystery protein B, are found in mobilization-related contexts more often than not, including within a CRISPR-associated gene region in Geobacter sulfurreducens PCA, and on plasmids in Agrobacterium tumefaciens and Coxiella burnetii, always together with mobile mystery protein A (TIGR02612), a member of the family of helix-turn-helix DNA binding proteins (pfam01381). This protein is encoded by the downstream member of the gene pair and belongs to the Fic protein family (pfam02661), where Fic (filamentation induced by cAMP) is a regulator of cell division. The characteristics of having a two-gene operon in a varied context and often on plasmids, with one member affecting cell division and the other able to bind DNA, suggest similarity to addiction modules.	186
274232	TIGR02614	ftsW	cell division protein FtsW. This family consists of FtsW, an integral membrane protein with ten transmembrane segments. In general, it is one of two paralogs involved in peptidoglycan biosynthesis, the other being RodA, and is essential for cell division. All members of the seed alignment for this model are encoded in operons for the biosynthesis of UDP-N-acetylmuramoyl-pentapeptide, a precursor of murein (peptidoglycan). The FtsW designation is not used in endospore-forming bacteria (e.g. Bacillus subtilis), where the member of this family is designated SpoVE and three or more RodA/FtsW/SpoVE family paralogs are present. SpoVE acts in spore cortex formation and is dispensable for growth. Biological roles for FtsW in cell division include recruitment of penicillin-binding protein 3 to the division site. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan, Cellular processes, Cell division]	356
131664	TIGR02615	spoVE	stage V sporulation protein E. This model represents an exception within the members of the FtsW model TIGR02614. This exception occurs only in endospore-forming genera such as Bacillus, Geobacillus, and Oceanobacillus. Like FtsW, members are found in a peptidoglycan operon context, but in these genera they are part of a larger set of paralogs (not just the pair FtsW and RodA) and are required specifically for sporulation, not for viability. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan, Cellular processes, Sporulation and germination]	354
274233	TIGR02616	tnaC_leader	tryptophanase leader peptide. Members of this family are the apparent leader peptides of tryptophanase operons in Escherichia coli, Vibrio cholerae, Photobacterium profundum, Haemophilus influenzae type b, and related species. All members of the seed alignment are example ORFs upstream of tryptophanase, with a start codon, a conserved single Trp residue, and several other conserved residues. It is suggested (Konan KV and Yanofsky C) that the nascent peptide interacts with the ribosome once (if) the ribosome reaches the stop codon. Note that this model describes a much broader set (and shorter protein region) than pfam08053. [Energy metabolism, Amino acids and amines, Transcription, Other]	22
131666	TIGR02617	tnaA_trp_ase	tryptophanase, leader peptide-associated. Members of this family belong to the beta-eliminating lyase family (pfam01212) and act as tryptophanase (L-tryptophan indole-lyase). The tryptophanases of this family, as a rule, are found with a tryptophanase leader peptide (TnaC) encoded upstream. Both tryptophanases (4.1.99.1) and tyrosine phenol-lyases (EC 4.1.99.2) are found between trusted and noise cutoffs, but this model captures nearly all tryptophanases for which the leader peptide gene tnaC can be found upstream. [Energy metabolism, Amino acids and amines]	467
131667	TIGR02618	tyr_phenol_ly	tyrosine phenol-lyase. This model describes a group of tyrosine phenol-lyase (4.1.99.2) (beta-tyrosinase), a pyridoxal-phosphate enzyme closely related to tryptophanase (4.1.99.1) (see model TIGR02617). Both belong to the beta-eliminating lyase family (pfam01212) [Energy metabolism, Amino acids and amines]	450
274234	TIGR02619	TIGR02619	putative CRISPR-associated protein, APE2256 family. This model represents a conserved domain of about 150 amino acids found in at least five archaeal species and three bacterial species, exclusively in species with CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats). In six of eight species, the member of this family is in the vicinity of a CRISPR/Cas locus.	149
200203	TIGR02620	cas_VVA1548	putative CRISPR-associated protein, VVA1548 family. This model represents a conserved domain of about 95 amino acids found exclusively in species with CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats). In all bacterial species with members so far (Vibrio vulnificus YJ016, Mannheimia succiniciproducens MBEL55E, and Nitrosomonas europaea ATCC 19718), but not in the archaeon Methanothermobacter thermautotrophicus str. Delta H, the gene for this protein is in the midst of a cluster of Cas protein genes near CRISPR repeats.	93
274235	TIGR02621	cas3_GSU0051	CRISPR-associated helicase Cas3, subtype Dpsyc. This model describes a CRISPR-associated putative DEAH-box helicase, or Cas3, of a subtype found in Actinomyces naeslundii MG1, Geobacter sulfurreducens PCA, Gemmata obscuriglobus UQM 2246, and Desulfotalea psychrophila. This protein includes both DEAH and HD motifs.	862
274236	TIGR02622	CDP_4_6_dhtase	CDP-glucose 4,6-dehydratase. Members of this protein family are CDP-glucose 4,6-dehydratase from a variety of Gram-negative and Gram-positive bacteria. Members typically are encoded next to a gene that encodes a glucose-1-phosphate cytidylyltransferase, which produces the substrate, CDP-D-glucose, used by this enzyme to produce CDP-4-keto-6-deoxyglucose. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	349
131672	TIGR02623	G1P_cyt_trans	glucose-1-phosphate cytidylyltransferase. Members of this family are the enzyme glucose-1-phosphate cytidylyltransferase, also called CDP-glucose pyrophosphorylase, the product of the rfbF gene. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	254
131673	TIGR02624	rhamnu_1P_ald	rhamnulose-1-phosphate aldolase. Members of this family are the enzyme RhaD, rhamnulose-1-phosphate aldolase.	270
131674	TIGR02625	YiiL_rotase	L-rhamnose mutarotase. Members of this protein family are rhamnose mutarotase from Escherichia coli, previously designated YiiL as an uncharacterized protein, and close homologs also associated with rhamnose dissimilation operons in other bacterial genomes. Mutarotase is a term for an epimerase that changes optical activity. This enzyme was shown experimentally to interconvert alpha and beta stereoisomers of the pyranose form of L-rhamnose. The crystal structure of this small (104 amino acid) protein shows a locally asymmetric dimer with active site residues of His, Tyr, and Trp. [Energy metabolism, Sugars]	102
274237	TIGR02627	rhamnulo_kin	rhamnulokinase. This model describes rhamnulokinase, an enzyme that catalyzes the second step in rhamnose catabolism.	454
131676	TIGR02628	fuculo_kin_coli	L-fuculokinase. Members of this family are L-fuculokinase, from the clade that includes the L-fuculokinase of Escherichia coli. This enzyme catalyzes the second step in fucose catabolism. This family belongs to FGGY family of carbohydrate kinases (pfam02782, pfam00370). It is encoded by the kinase (K) gene of the fucose (fuc) operon. [Energy metabolism, Sugars]	465
131677	TIGR02629	L_rham_iso_rhiz	L-rhamnose catabolism isomerase, Pseudomonas stutzeri subtype. Members of this family are isomerases in the pathway of L-rhamnose catabolism as found in Pseudomonas stutzeri and in a number of the Rhizobiales. This family differs from the L-rhamnose isomerases of Escherichia coli (see TIGR01748). This enzyme catalyzes the isomerization step in rhamnose catabolism. Genetic evidence in Rhizobium leguminosarum bv. trifolii suggests phosphorylation occurs first, then isomerization of the phosphorylated sugar, but characterization of the recombinant enzyme from Pseudomonas	412
274238	TIGR02630	xylose_isom_A	xylose isomerase. Members of this family are the enzyme xylose isomerase (5.3.1.5), which interconverts D-xylose and D-xylulose. [Energy metabolism, Sugars]	434
131679	TIGR02631	xylA_Arthro	xylose isomerase, Arthrobacter type. This model describes a D-xylose isomerase that is also active as a D-glucose isomerase. It is tetrameric and dependent on a divalent cation Mg2+, Co2+ or Mn2+ as characterized in Arthrobacter. Members of this family differ substantially from the D-xylose isomerases of family TIGR02630.	382
131680	TIGR02632	RhaD_aldol-ADH	rhamnulose-1-phosphate aldolase/alcohol dehydrogenase. 	676
131681	TIGR02633	xylG	D-xylose ABC transporter, ATP-binding protein. Several bacterial species have the enzymes xylose isomerase and xylulokinase for xylose utilization. Members of this protein family are the ATP-binding cassette (ABC) subunit of the known or predicted high-affinity xylose ABC transporter for xylose import. These genes, which closely resemble other sugar transport ABC transporter genes, typically are encoded near xylose utilization enzymes and regulatory proteins. Note that this form of the transporter contains two copies of the ABC transporter domain (pfam00005). [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	500
188237	TIGR02634	xylF	D-xylose ABC transporter, substrate-binding protein. Members of this family are periplasmic (when in Gram-negative bacteria) binding proteins for D-xylose import by a high-affinity ATP-binding cassette (ABC) transporter. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	302
274239	TIGR02635	RhaI_grampos	L-rhamnose isomerase, Streptomyces subtype. This clade of sequences is closely related to the L-rhamnose isomerases found in Pseudomonas stutzeri and in a number of the Rhizobiales (TIGR02629). The genes of the family represented here are found in similar genomic contexts which contain genes apparently involved in rhamnose catabolism such as rhamnulose-1-phosphate aldolase (TIGR02632), sugar kinases, and sugar transporters. [Energy metabolism, Sugars]	378
274240	TIGR02636	galM_Leloir	galactose mutarotase. Members of this protein family act as galactose mutarotase (D-galactose 1-epimerase) and participate in the Leloir pathway for galactose/glucose interconversion. All members of the seed alignment for this model are found in gene clusters with other enzymes of the Leloir pathway. This enzyme family belongs to the aldose 1-epimerase family, described by pfam01263. However, the enzyme described as aldose 1-epimerase itself (EC 5.1.3.3) is called broadly specific for D-glucose, L-arabinose, D-xylose, D-galactose, maltose and lactose. The restricted genome context for genes in this family suggests members should act primarily on D-galactose.	336
131685	TIGR02637	RhaS	rhamnose ABC transporter, rhamnose-binding protein. This sugar-binding component of ABC transporter complexes is found in rhamnose catabolism operon contexts. Mutation of this gene in Rhizobium leguminosarum abolishes rhamnose transport and prevents growth on rhamnose as a carbon source.	302
131686	TIGR02638	lactal_redase	lactaldehyde reductase. This clade of genes encoding iron-containing alcohol dehydrogenase (pfam00465) proteins is generally found in apparent operons for the catabolism of rhamnose or fucose. Catabolism of both of these monosaccharides results in lactaldehyde which is reduced by this enzyme to 1,2 propanediol. This protein is alternatively known by the name 1,2 propanediol oxidoreductase. This enzyme is active under anaerobic conditions in E. coli while being inactivated by reactive oxygen species under aerobic conditions. Under aerobic conditions the lactaldehyde product of rhamnose and fucose catabolism is believed to be oxidized to lactate by a separate enzyme, lactaldehyde dehydrogenase. [Energy metabolism, Sugars]	379
274241	TIGR02639	ClpA	ATP-dependent Clp protease ATP-binding subunit clpA. [Protein fate, Degradation of proteins, peptides, and glycopeptides]	730
131688	TIGR02640	gas_vesic_GvpN	gas vesicle protein GvpN. Members of this family are the GvpN protein associated with the production of gas vesicles produced in some prokaryotes to give cells buoyancy. This family belongs to a larger family of ATPases (pfam07728). [Cellular processes, Other]	262
274242	TIGR02641	gvpC_cyan_rpt	gas vesicle protein GvpC repeat. This model describes a 33-amino acid repeated domain in bacterial versions of the gas vesicle protein GvpC, a structural protein less abundant than GvpA. [Cellular processes, Other]	33
274243	TIGR02642	phage_xxxx	uncharacterized phage protein. This uncharacterized protein is found in prophage regions of Shewanella oneidensis MR-1, Vibrio vulnificus YJ016, Yersinia pseudotuberculosis IP 32953, and Aeromonas hydrophila ATCC7966. It appears to have regions of sequence similarity to phage lambda antitermination protein Q. [Mobile and extrachromosomal element functions, Prophage functions]	186
131691	TIGR02643	T_phosphoryl	thymidine phosphorylase. Thymidine phosphorylase (alternate name: pyrimidine phosphorylase), EC 2.4.2.4, is the designation for the enzyme of E. coli and other Proteobacteria involved in (deoxy)nucleotide degradation. It often occurs in an operon with a deoxyribose-phosphate aldolase, phosphopentomutase and a purine nucleoside phosphorylase. In many other lineages, the corresponding enzyme is designated pyrimidine-nucleoside phosphorylase (EC 2.4.2.2); the naming convention imposed by this model represents standard literature practice. [Purines, pyrimidines, nucleosides, and nucleotides, Other]	437
274244	TIGR02644	Y_phosphoryl	pyrimidine-nucleoside phosphorylase. In general, members of this protein family are designated pyrimidine-nucleoside phosphorylase, enzyme family EC 2.4.2.2, as in Bacillus subtilis, and more narrowly as the enzyme family EC 2.4.2.4, thymidine phosphorylase (alternate name: pyrimidine phosphorylase), as in Escherichia coli. The set of proteins encompassed by this model is designated subfamily rather than equivalog for this reason; the protein name from this model should be used when TIGR02643 does not score above trusted cutoff. [Purines, pyrimidines, nucleosides, and nucleotides, Other]	405
274245	TIGR02645	ARCH_P_rylase	putative thymidine phosphorylase. Members of this family are closely related to characterized examples of thymidine phosphorylase (EC 2.4.2.4) and pyrimidine nucleoside phosphorylase (EC 2.4.2.2). Most examples are found in the archaea, but other examples occur in Legionella pneumophila str. Paris and Rhodopseudomonas palustris CGA009.	493
131694	TIGR02646	TIGR02646	TIGR02646 family protein. Members of this uncharacterized protein family are found exclusively in bacteria. Neighboring genes in various genomes are also uncharacterized or may be annotated as similar to restriction system proteins. [Hypothetical proteins, Conserved]	144
131695	TIGR02647	DNA	TIGR02647 family protein. Members of this family are found, so far, only in the Gammaproteobacteria. The function is unknown. The location on the chromosome usually is not far from housekeeping genes rather than in what is clearly, say, a prophage region. Some members have been annotated in public databases as DNA-binding protein inhibitor Id-2-related protein, putative transcriptional regulator, or hypothetical DNA binding protein. [Hypothetical proteins, Conserved]	77
131696	TIGR02648	rep_term_tus	DNA replication terminus site-binding protein. Members of this protein family are found on the main chromosomes of a number of the Gammaproteobacteria; this model excludes related plasmid proteins, which score between trusted and noise cutoffs. This protein, DNA replication terminus site-binding protein, binds specific DNA sites near the replication terminus to arrest the DNA replication fork. [DNA metabolism, DNA replication, recombination, and repair]	300
131697	TIGR02649	true_RNase_BN	ribonuclease BN. Members of this protein family are ribonuclease BN of Escherichia coli K-12 and closely related proteins believed to be equivalent in function. Note that E. coli appears to lack RNase Z per se, and this protein of E. coli appears orthologous to (but not functionally equivalent to) RNase Z of Bacillus subtilis and various other species. Meanwhile, the yihY gene product of E. coli previously was incorrectly identified as RNase BN. [Transcription, RNA processing]	303
188239	TIGR02650	RNase_Z_T_toga	ribonuclease Z, Thermotoga type. Members of this protein family are ribonuclease Z as found in the genus Thermotoga, where the enzyme cleaves after the CCA, in contrast to the activities characterized for other enzymes also designated ribonuclease Z. In other systems, cleavage occurs 5-prime to the location of the CCA sequence, and CCA is added subsequently. A species may lack ribonuclease Z if all tRNA genes encode the CCA sequence, or if the CCA is exposed by exonuclease activity rather than endonuclease activity. Note that members of this sequence family differ considerably from the majority of RNase Z sequences. [Transcription, RNA processing]	277
274246	TIGR02651	RNase_Z	ribonuclease Z. Processing of the 3-prime end of tRNA precursors may be the result of endonuclease or exonuclease activity, and differs in different species. Members of this family are ribonuclease Z, a tRNA 3-prime endonuclease that processes tRNAs to prepare for addition of CCA. In species where all tRNA sequences already have the CCA tail, such as E. coli, the need for such an enzyme is unclear. Proteins similar to the E. coli enzyme, matched by TIGRFAMs model TIGR02649, are designated ribonuclease BN. [Transcription, RNA processing]	299
131700	TIGR02652	TIGR02652	TIGR02652 family protein. Members of this family of conserved hypothetical proteins are found, so far, only in the Cyanobacteria. Members are about 170 amino acids long and share a motif CxxCx(14)CxxH near the amino end. [Hypothetical proteins, Conserved]	163
131701	TIGR02653	Lon_rel_chp	conserved hypothetical protein. This model describes a protein family of unknown function, about 690 residues in length, in which some members show C-terminal sequence similarity to pfam05362, which is the Lon protease C-terminal proteolytic domain, from MEROPS family S16. However, the annotated catalytic sites of E. coli Lon protease are not conserved in members of this family. Members have a motif GP[RK][GS]TGKS, similar to the ATP-binding P-loop motif GxxGxGK[ST]. [Hypothetical proteins, Conserved]	675
211759	TIGR02654	circ_KaiB	circadian clock protein KaiB. Members of this protein family are the circadian clock protein KaiB of Cyanobacteria, encoded in the circadian clock gene cluster kaiABC. KaiB has homologs of unknown function in some Archaea and Proteobacteria, and has paralogs of unknown function in some Cyanobacteria. KaiB forms homodimers, homotetramers, and multimeric complexes with KaiA and/or KaiC. [Cellular processes, Other]	87
131703	TIGR02655	circ_KaiC	circadian clock protein KaiC. Members of this family are the circadian clock protein KaiC, part of the kaiABC operon that controls circadian rhythm. It may be universal in Cyanobacteria. Each member has two copies of the KaiC domain (pfam06745), which is also found in other proteins. KaiC performs autophosphorylation and acts as its own transcriptional repressor. [Cellular processes, Other]	484
274247	TIGR02656	cyanin_plasto	plastocyanin. Members of this family are plastocyanin, a blue copper protein related to pseudoazurin, halocyanin, amicyanin, etc. This protein, located in the thylakoid lumen, performs electron transport to photosystem I in Cyanobacteria and chloroplasts. [Energy metabolism, Electron transport, Energy metabolism, Photosynthesis]	99
131705	TIGR02657	amicyanin	amicyanin. Members of this family are amicyanin, a type I blue copper protein that accepts electrons from the tryptophan tryptophylquinone (TTQ) cofactor of the methylamine dehydrogenase light chain and then transfers them to the heme group of cytochrome c-551i. Amicyanin, methylamine dehydrogenase, and cytochrome c-551i are periplasmic and form a complex. This system has been studied primarily in Paracoccus denitrificans and Methylobacterium extorquens. Related type I blue copper proteins include plastocyanin, pseudoazurin, halocyanin, etc. [Energy metabolism, Electron transport]	83
131706	TIGR02658	TTQ_MADH_Hv	methylamine dehydrogenase (amicyanin) heavy chain. This family consists of the heavy chain of methylamine dehydrogenase, a periplasmic enzyme. The enzyme contains a tryptophan tryptophylquinone (TTQ) prosthetic group derived from two Trp residues in the light subunit. The enzyme forms a complex with the type I blue copper protein amicyanin and a cytochrome. Electron transfer proceeds from TTQ to the copper and then to the heme group of the cytochrome. [Energy metabolism, Amino acids and amines]	352
131707	TIGR02659	TTQ_MADH_Lt	methylamine dehydrogenase (amicyanin) light chain. This family consists of the light chain of methylamine dehydrogenase, a periplasmic enzyme. This subunit contains a tryptophan tryptophylquinone (TTQ) prosthetic group derived from Trp-114 and Trp-165 of the precursor, numbered according to the sequence from Paracoccus denitrificans. The enzyme forms a complex with the type I blue copper protein amicyanin and a cytochrome. Electron transfer proceeds from TTQ to the copper and then to the heme group of the cytochrome. [Energy metabolism, Amino acids and amines]	186
274248	TIGR02660	nifV_homocitr	homocitrate synthase NifV. This family consists of the NifV clade of homocitrate synthases, most of which are found in operons for nitrogen fixation. Members are closely homologous to enzymes that include 2-isopropylmalate synthase, (R)-citramalate synthase, and homocitrate synthases associated with other processes. The homocitrate made by this enzyme becomes a part of the iron-molybdenum cofactor of nitrogenase. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other, Central intermediary metabolism, Nitrogen fixation]	365
131709	TIGR02661	MauD	methylamine dehydrogenase accessory protein MauD. This protein, MauD, appears critical to proper formation of the small subunit of methylamine dehydrogenase, which has both an unusual tryptophan tryptophylquinone cofactor and multiple disulfide bonds. MauD shares sequence similarity, including a CPxC motif, with a number of thiol:disulfide interchange proteins. In MauD mutants, the small subunit apparently does not form properly and is rapidly degraded. [Protein fate, Protein folding and stabilization, Energy metabolism, Amino acids and amines]	189
131710	TIGR02662	dinitro_DRAG	ADP-ribosyl-[dinitrogen reductase] hydrolase. Members of this family are the enzyme ADP-ribosyl-[dinitrogen reductase] hydrolase (EC 3.2.2.24), better known as Dinitrogenase Reductase Activating Glycohydrolase, DRAG. This enzyme reverses a regulatory inactivation of dinitrogen reductase caused by the action of NAD(+)--dinitrogen-reductase ADP-D-ribosyltransferase (EC 2.4.2.37) (DRAT). This enzyme is restricted to nitrogen-fixing bacteria and belongs to the larger family of ADP-ribosylglycohydrolases described by pfam03747. [Central intermediary metabolism, Nitrogen fixation]	287
131711	TIGR02663	nifX	nitrogen fixation protein NifX. Members of this family are NifX proteins encoded within operons for nitrogen fixation in a number of bacteria. NifX, NafY, and the C-terminal region of NifB all belong to pfam02579 and are involved in MoFe cofactor biosynthesis. NifX is a nitrogenase accessory protein with a role in expression of the MoFe cofactor. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other, Central intermediary metabolism, Nitrogen fixation]	119
131712	TIGR02664	nitr_red_assoc	conserved hypothetical protein. Most members of this protein family are found in the Cyanobacteria, and these mostly near nitrate reductase genes and molybdopterin biosynthesis genes. We note that molybdopterin guanine dinucleotide is a cofactor for nitrate reductase. This protein is sometimes annotated as nitrate reductase-associated protein. Its function is unknown.	145
274249	TIGR02665	molyb_mobA	molybdenum cofactor guanylyltransferase, proteobacterial. In many molybdopterin-containing enzymes, including nitrate reductase and dimethylsulfoxide reductase, the cofactor is molybdopterin-guanine dinucleotide. The family described here contains MobA, molybdenum cofactor guanylyltransferase, from the Proteobacteria only. MobA can reconstitute molybdopterin-guanine dinucleotide biosynthesis without the product of the neighboring gene MobB. The probable MobA proteins of other lineages differ sufficiently that they are not included in scope of this family. [Biosynthesis of cofactors, prosthetic groups, and carriers, Molybdopterin]	186
274250	TIGR02666	moaA	molybdenum cofactor biosynthesis protein A, bacterial. The model for this family describes molybdenum cofactor biosynthesis protein A, or MoaA, as found in bacteria. It does not include the family of probable functional equivalent proteins from the archaea. MoaA works together with MoaC to synthesize precursor Z from guanine. [Biosynthesis of cofactors, prosthetic groups, and carriers, Molybdopterin]	334
131715	TIGR02667	moaB_proteo	molybdenum cofactor biosynthesis protein B, proteobacterial. This model represents the MoaB protein molybdopterin biosynthesis regions in Proteobacteria. This crystallized but incompletely characterized protein is thought to be involved in, though not required for, early steps in molybdopterin biosynthesis. It may bind a molybdopterin precursor. A distinctive conserved motif PCN near the C-terminus helps distinguish this clade from other homologs, including sets of proteins designated MogA. [Biosynthesis of cofactors, prosthetic groups, and carriers, Molybdopterin]	163
274251	TIGR02668	moaA_archaeal	probable molybdenum cofactor biosynthesis protein A, archaeal. This model describes an archaeal family related, and predicted to be functionally equivalent, to molybdenum cofactor biosynthesis protein A (MoaA) of bacteria (see TIGR02666). [Biosynthesis of cofactors, prosthetic groups, and carriers, Molybdopterin]	302
274252	TIGR02669	SpoIID_LytB	SpoIID/LytB domain. This model describes a domain found typically in two or three proteins per genome in Cyanobacteria and Firmicutes, and sporadically in other genomes. One member is SpoIID of Bacillus subtilis. Another in B. subtilis is the C-terminal half of LytB, encoded immediately upstream of an amidase, the autolysin LytC, to which its N-terminus is homologous. Gene neighborhoods are not well conserved for members of this family, as many, such as SpoIID, are monocistronic. One early modelling-based study suggests a DNA-binding role for SpoIID, but the function of this domain is unknown. [Unknown function, General]	267
274253	TIGR02670	cas_csx8	CRISPR-associated protein Cas8a1/Csx8, subtype I. In three genomes so far, a member of this protein family appears in the midst of a CRISPR-associated (cas) gene operon, immediately upstream of a member of family TIGR01875 (CRISPR-associated autoregulator, DevR family). The genomes so far are Nocardia farcinica IFM10152, Clostridium perfringens SM101, and Clostridium tetani E88.	441
131719	TIGR02671	cas_csx9	CRISPR-associated protein Cas8a2/Csx9, subtype I-A/APERN. Members of this family, so far, are archaeal proteins found in CRISPR-associated (cas) gene regions. So far, this rare cas protein is found in only three genomes: Pyrococcus horikoshii shinkaj OT3, Pyrococcus abyssi GE5, and Thermococcus kodakarensis KOD1. In each case it is found immediately upstream of cas3 in loci that resemble the Apern type but lack Csa1 and Csa4 genes.	377
131720	TIGR02672	cas_csm6	CRISPR type III-A/MTUBE-associated protein Csm6. Members of this family are found in CRISPR-associated (cas) gene regions in Streptococcus thermophilus CNRZ1066, Staphylococcus epidermidis RP62A, and Mycobacterium tuberculosis (strains CDC1551 and H37Rv), as part of Mtube-type CRISPR/Cas systems. CRISPR is a widespread form of direct repeat found in archaea and bacteria, with distinctive subtypes each of which has a characteristic sporadic distribution.	362
131721	TIGR02673	FtsE	cell division ATP-binding protein FtsE. This model describes FtsE, a member of the ABC transporter ATP-binding protein family. This protein, and its permease partner FtsX, localize to the division site. In a number of species, the ftsEX gene pair is located next to FtsY, the signal recognition particle-docking protein. [Cellular processes, Cell division]	214
131722	TIGR02674	cas_cyan_RAMP_2	CRISPR-associated RAMP protein, Csx10 family. CRISPR is a widespread repeat family in prokaryotes. At least 45 different protein families occur in prokaryotes only when these repeats are present. This family, a minor CRISPR-associated protein family, seems largely restricted to the Cyanobacteria. It belongs to the RAMP superfamily (pfam03787).	393
131723	TIGR02675	tape_meas_nterm	tape measure domain. Proteins containing this domain are strictly bacterial, including bacteriophage and prophage regions of bacterial genomes. Most members are 800 to 1800 amino acids long, making them among the longest predicted proteins of their respective phage genomes, where they are encoded in tail protein regions. This roughly 80-residue domain described here usually begins between residue 100 and 250. Many members are known or predicted to act as phage tail tape measure proteins, a minor tail component that regulates tail length.	75
131724	TIGR02677	TIGR02677	TIGR02677 family protein. Members of this protein family belong to a conserved four-gene neighborhood found sporadically in a phylogenetically broad range of bacteria: Nocardia farcinica, Symbiobacterium thermophilum, and Streptomyces avermitilis (Actinobacteria), Geobacillus kaustophilus (Firmicutes), Azoarcus sp. EbN1 and Ralstonia solanacearum (Betaproteobacteria). [Hypothetical proteins, Conserved]	494
274254	TIGR02678	TIGR02678	TIGR02678 family protein. Members of this protein family belong to a conserved four-gene neighborhood found sporadically in a phylogenetically broad range of bacteria: Nocardia farcinica, Symbiobacterium thermophilum, and Streptomyces avermitilis (Actinobacteria), Geobacillus kaustophilus (Firmicutes), Azoarcus sp. EbN1 and Ralstonia solanacearum (Betaproteobacteria). [Hypothetical proteins, Conserved]	375
274255	TIGR02679	TIGR02679	TIGR02679 family protein. Members of this protein family belong to a conserved four-gene neighborhood found sporadically in a phylogenetically broad range of bacteria: Nocardia farcinica, Symbiobacterium thermophilum, and Streptomyces avermitilis (Actinobacteria), Geobacillus kaustophilus (Firmicutes), Azoarcus sp. EbN1 and Ralstonia solanacearum (Betaproteobacteria). [Hypothetical proteins, Conserved]	385
274256	TIGR02680	TIGR02680	TIGR02680 family protein. Members of this protein family belong to a conserved four-gene neighborhood found sporadically in a phylogenetically broad range of bacteria: Nocardia farcinica, Symbiobacterium thermophilum, and Streptomyces avermitilis (Actinobacteria), Geobacillus kaustophilus (Firmicutes), Azoarcus sp. EbN1 and Ralstonia solanacearum (Betaproteobacteria). Proteins in this family average over 1400 amino acids in length. [Hypothetical proteins, Conserved]	1353
131728	TIGR02681	phage_pRha	phage regulatory protein, rha family. Members of this protein family are found in temperate phage and bacterial prophage regions. Members include the product of the rha gene of the lambdoid phage phi-80, a late operon gene. The presence of this gene interferes with infection of bacterial strains that lack integration host factor (IHF), which regulates the rha gene. It is suggested that pRha is a phage regulatory protein. [Mobile and extrachromosomal element functions, Prophage functions]	108
274257	TIGR02682	cas_csx11	CRISPR-associated protein, Csx11 family. Members of this uncommon, sporadically distributed protein family are large (>900 amino acids) and strictly associated, so far, with CRISPR-associated (Cas) gene clusters. Nearby Cas genes always include members of the RAMP superfamily and the six-gene CRISPR-associated RAMP module. Species in which it is found, so far, include three archaea (Methanosarcina mazei, M. barkeri and Methanobacterium thermoautotrophicum) and two bacteria (Thermodesulfovibrio yellowstonii DSM 11347 and Sulfurihydrogenibium azorense).	918
162974	TIGR02683	upstrm_HI1419	putative addiction module killer protein. Members of this strictly bacterial protein family are small, at roughly 100 amino acids. The gene is almost invariably the upstream member of a gene pair, where the downstream member is a predicted DNA-binding protein from a clade within Pfam helix-turn-helix family pfam01381. These gene pairs, when found on the bacterial chromosome, often are located with prophage regions, but also in both integrated plasmid regions and near housekeeping genes. Analysis suggests that the gene pair may serve as an addiction module.	95
188241	TIGR02684	dnstrm_HI1420	probable addiction module antidote protein. Members of this strictly bacterial protein family are small, at roughly 100 amino acids. The gene is almost invariably the downstream member of a gene pair. It is a predicted DNA-binding protein from a clade within Pfam helix-turn-helix family pfam01381. These gene pairs, when found on the bacterial chromosome, are located often with prophage regions, but also both in integrated plasmid regions and in housekeeping gene regions. Analysis suggests that the gene pair may serve as an addiction module. [Mobile and extrachromosomal element functions, Other]	89
131732	TIGR02685	pter_reduc_Leis	pteridine reductase. Pteridine reductase is an enzyme used by trypanosomatids (including Trypanosoma cruzi and Leishmania major) to obtain reduced pteridines by salvage rather than biosynthetic pathways. Enzymes in T. cruzi described as pteridine reductase 1 (PTR1) and pteridine reductase 2 (PTR2) have different activity profiles. PTR1 is more active with fully oxidized biopterin and folate than with reduced forms, while PTR2 reduces dihydrobiopterin and dihydrofolate but not oxidized pteridines. T. cruzi PTR1 and PTR2 are more similar to each other in sequence than either is to the pteridine reductase of Leishmania major, and all are included in this family.	267
274258	TIGR02686	relax_trwC	conjugative relaxase domain, TrwC/TraI family. This domain is in the N-terminal (relaxase) region of TrwC, a relaxase-helicase that acts in plasmid R388 conjugation. The relaxase domain has DNA cleavage and strand transfer activities. Plasmid transfer protein TraI is also a member of this domain family. Members of this family on bacterial chromosomes typically are found near other genes typical of conjugative plasmids and appear to mark integrated plasmids. [Mobile and extrachromosomal element functions, Plasmid functions]	283
274259	TIGR02687	TIGR02687	TIGR02687 family protein. Members of this family are uncharacterized proteins sporadically distributed in bacteria and archaea, about 880 amino acids in length. This protein is repeatedly found upstream of another uncharacterized protein of about 470 amino acids in length, modeled by TIGR02688.	844
131735	TIGR02688	TIGR02688	TIGR02688 family protein. Members of this family are uncharacterized proteins sporadically distributed in bacteria and archaea, about 470 amino acids in length. Several members of this family appear in public databases with annotation as ATP-dependent protease La, despite the lack of similarity to families TIGR00763 (ATP-dependent protease La) or pfam02190 (ATP-dependent protease La (LON) domain). This protein is repeatedly found downstream of another uncharacterized protein of about 880 amino acids in length, described by model TIGR02687. [Hypothetical proteins, Conserved]	449
131736	TIGR02689	ars_reduc_gluta	arsenate reductase, glutathione/glutaredoxin type. Members of this protein family represent a novel form of arsenate reductase, using glutathione and glutaredoxin rather than thioredoxin for reducing equivalents as do some homologous arsenate reductases. An example of this type is Synechocystis sp. strain PCC 6803 slr0946, and of the latter type (excluded from this model) is Staphylococcus aureus plasmid pI258 ArsC. Both are among the subset of arsenate reductases that belong to the low-molecular-weight protein-tyrosine phosphatase superfamily. [Cellular processes, Detoxification]	126
274260	TIGR02690	resist_ArsH	arsenical resistance protein ArsH. Members of this protein family occur in arsenate resistance operons that include at least two different types of arsenate reductase. ArsH is not required for arsenate resistance in some systems. This family belongs to the larger family of NADPH-dependent FMN reductases (pfam03358). The function of ArsH is not known. [Cellular processes, Detoxification]	219
131738	TIGR02691	arsC_pI258_fam	arsenate reductase (thioredoxin). This family describes the well-studied thioredoxin-dependent arsenate reductase of Staphylococcus aureus plasmid pI258 and other mechanistically similar arsenate reductases. The mechanism involves an intramolecular disulfide bond cascade, and aligned members of this family have four absolutely conserved Cys residues. This group of arsenate reductases belongs to the low-molecular weight protein-tyrosine phosphatase family (pfam01451), as does a group of glutathione/glutaredoxin type arsenate reductases (TIGR02689). At least two other, non-homologous groups of arsenate reductases involved in arsenical resistance are also known. This enzyme reduces arsenate to arsenite, which may be more toxic but which is more easily exported. [Cellular processes, Detoxification]	129
131739	TIGR02692	tRNA_CCA_actino	tRNA adenylyltransferase. The enzyme tRNA adenylyltransferase, also called tRNA-nucleotidyltransferase and CCA-adding enzyme, can add or repair the required CCA triplet at the 3'-end of tRNA molecules. Genes encoding tRNA include the CCA tail in some but not all bacteria, and this enzyme may be required for viability. Members of this family represent a distinct clade within the larger family pfam01743 (tRNA nucleotidyltransferase/poly(A) polymerase family protein). The example from Streptomyces coelicolor was shown to act as a CCA-adding enzyme and not as a poly(A) polymerase. [Protein synthesis, tRNA and rRNA base modification]	466
274261	TIGR02693	arsenite_ox_L	arsenite oxidase, large subunit. This model represents the large subunit of an arsenite oxidase complex. The small subunit is a Rieske protein. Homologs to both large and small subunits that score in the gray zone between the set trusted and noise bit score cutoffs for the respective models are found in Aeropyrum pernix K1 and in Sulfolobus tokodaii str. 7. This enzyme acts in energy metabolism by arsenite oxidation, rather than detoxification by reduction of arsenate to arsenite prior to export. [Energy metabolism, Electron transport]	806
131741	TIGR02694	arsenite_ox_S	arsenite oxidase, small subunit. This model represents the small subunit of an arsenite oxidase complex. It is a Rieske protein and appears to rely on the Tat (twin-arginine translocation) system to cross the membrane. Although this enzyme could run in the direction of arsenate reduction to arsenite in principle, the relevant biological function is arsenite oxidation for energy metabolism, not arsenic resistance. Homologs to both large (TIGR02693) and small subunits that score in the gray zone between the set trusted and noise bit score cutoffs for the respective models are found in Aeropyrum pernix K1 and in Sulfolobus tokodaii str. 7. [Energy metabolism, Electron transport]	129
131742	TIGR02695	azurin	azurin. Azurin is a blue copper-binding protein in the plastocyanin/azurin family (see pfam00127). It serves as a redox partner to enzymes such as nitrite reductase or arsenite oxidase. The most closely related copper-binding proteins to this family are auracyanins, as in Chloroflexus aurantiacus, which have similar redox activities. [Energy metabolism, Electron transport]	125
131743	TIGR02696	pppGpp_PNP	guanosine pentaphosphate synthetase I/polynucleotide phosphorylase. Sohlberg, et al. present characterization of two proteins from Streptomyces coelicolor. The protein in this family was shown to have poly(A) polymerase activity and may be responsible for polyadenylating RNA in this species. Reference 2 showed that a nearly identical plasmid-encoded protein from Streptomyces antibioticus is a bifunctional enzyme that acts also as a guanosine pentaphosphate synthetase.	719
131744	TIGR02697	WPE_wolbac	Wolbachia palindromic element (WPE) domain. This domain conceptually resembles TIGR01045, the Rickettsial palindromic element (RPE) domain. In both cases, a protein-coding palindromic element spreads through a genome, inserting usually in protein-coding regions. The additional protein coding sequence is thought to allow function of the host protein because of location in surface-exposed regions of the protein structure. Note that this model appears to work better in fragment mode. [Mobile and extrachromosomal element functions, Other]	36
131745	TIGR02698	CopY_TcrY	copper transport repressor, CopY/TcrY family. This family includes metal-fist type transcriptional repressors of copper transport systems such as copYZAB of Enterococcus hirae and tcrYAZB (transferable copper resistance) of an Enterococcus faecium plasmid. High levels of copper can displace zinc and prevent binding by the repressor, activating efflux by copper resistance transporters. The most closely related proteins excluded by this model are antibiotic resistance regulators including the methicillin resistance regulatory protein MecI. [Transport and binding proteins, Cations and iron carrying compounds, Regulatory functions, DNA interactions]	130
131746	TIGR02699	archaeo_AfpA	archaeoflavoprotein AfpA. The prototypical member of this archaeal protein family is AF1518 from Archaeoglobus fulgidus. This homodimer with two non-covalently bound FMN cofactors can receive electrons from ferredoxin, but not from a number of other electron donors such as NADH or rubredoxin. It can then donate electrons to various reductases. [Energy metabolism, Electron transport]	174
131747	TIGR02700	flavo_MJ0208	archaeoflavoprotein, MJ0208 family. This model describes one of two paralogous families of archaeoflavoprotein. The other, described by TIGR02699 and typified by the partially characterized AF1518 of Archaeoglobus fulgidus, is a homodimeric FMN-containing flavoprotein that accepts electrons from ferredoxin and can transfer them to various oxidoreductases. The function of this protein family is unknown. [Unknown function, General]	234
274262	TIGR02701	shell_carb_anhy	carboxysome shell carbonic anhydrase. This model describes a carboxysome shell protein that proves to be a novel class, designated epsilon, of carbonic anhydrase. It tends to be encoded near genes for RuBisCo and for other carboxysome shell proteins. [Central intermediary metabolism, One-carbon metabolism]	450
131749	TIGR02702	SufR_cyano	iron-sulfur cluster biosynthesis transcriptional regulator SufR. All members of this cyanobacterial protein family are the transcriptional regulator SufR and regulate the SUF system, which makes possible iron-sulfur cluster biosynthesis despite exposure to oxygen. In all cases, the sufR gene is encoded near SUF system genes but in the opposite direction. This DNA-binding protein belongs to the DeoR family of helix-loop-helix proteins. All members also have a probable metal-binding motif C-X(12)-C-X(13)-C-X(14)-C near the C-terminus. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other, Regulatory functions, DNA interactions]	203
131750	TIGR02703	carboxysome_A	carboxysome peptide A. This model distinguishes one of two closely related paralogs encoded by nearby genes in the carboxysome operons of a number of cyanobacteria and chemoautotrophic bacteria. More distantly related proteins, also belonging to pfam03319, participate in other types of shell such as the ethanolamine degradation organelle. [Central intermediary metabolism, One-carbon metabolism]	81
131751	TIGR02704	carboxysome_B	carboxysome peptide B. This model distinguishes one of two closely related paralogs encoded by nearby genes in the carboxysome operons of a number of cyanobacteria and chemoautotrophic bacteria. More distantly related proteins, also belonging to pfam03319, participate in other types of shell such as the ethanolamine degradation organelle. [Central intermediary metabolism, One-carbon metabolism]	80
131752	TIGR02705	nudix_YtkD	nucleoside triphosphatase YtkD. The functional assignment to the proteins of this family is contentious, with papers disagreeing in both interpretation and enzyme assay results. This protein belongs to the nudix family and shares some sequence identity with E. coli MutT but appears not to be functionally interchangeable with it. [DNA metabolism, DNA replication, recombination, and repair]	156
162980	TIGR02706	P_butyryltrans	phosphate butyryltransferase. Members of this family are phosphate butyryltransferase, also called phosphotransbutyrylase. In general, this enzyme is found in butyrate-producing anaerobic bacteria, encoded next to the gene for butyrate kinase. Together, these two enzymes represent what may be the less common of two pathways for butyrate production from butyryl-CoA. The alternative is transfer of the CoA group to acetate by butyryl-CoA:acetate CoA transferase. Cutoffs for this model are set such that the homolog from Thermotoga maritima, whose activity on butyryl-CoA is only 30 % of its activity with acetyl-CoA, scores in the zone between trusted and noise cutoffs. [Energy metabolism, Fermentation]	294
162981	TIGR02707	butyr_kinase	butyrate kinase. This model represents an enzyme family in which members are designated either butyrate kinase or branched-chain carboxylic acid kinase. The EC designation 2.7.2.7 describes an enzyme with relatively broad specificity; gene products whose context suggests a role in metabolism of aliphatic amino acids are likely to act as branched-chain carboxylic acid kinase. The gene typically found adjacent, ptb (phosphate butyryltransferase), likewise encodes an enzyme that may have a broad specificity that includes a role in aliphatic amino acid catabolism. [Energy metabolism, Fermentation]	351
131755	TIGR02708	L_lactate_ox	L-lactate oxidase. Members of this protein family oxidize L-lactate to pyruvate, reducing molecular oxygen to hydrogen peroxide. The enzyme is known in Aerococcus viridans, Streptococcus iniae, and some strains of Streptococcus pyogenes where it appears to contribute to virulence. [Energy metabolism, Other]	367
131756	TIGR02709	branched_ptb	branched-chain phosphotransacylase. This model distinguishes branched-chain phosphotransacylases like that of Enterococcus faecalis from closely related subfamilies of phosphate butyryltransferase (EC 2.3.1.19) (TIGR02706) and phosphate acetyltransferase (EC 2.3.1.8) (TIGR00651). Members of this family and of TIGR02706 show considerable crossreactivity, and the occurrence of a member of either family near an apparent leucine dehydrogenase will suggest activity on branched chain-acyl-CoA compounds. [Energy metabolism, Amino acids and amines]	271
274263	TIGR02710	TIGR02710	CRISPR-associated protein, TIGR02710 family. Members of this family are found, exclusively in the vicinity of CRISPR repeats and other CRISPR-associated (cas) genes, in Methanothermobacter thermautotrophicus (Archaea), Thermus thermophilus (Deinococcus-Thermus), Chloroflexus aurantiacus (Chloroflexi), and Thermomicrobium roseum (Thermomicrobia).	380
131758	TIGR02711	symport_actP	cation/acetate symporter ActP. Members of this family belong to the Sodium:solute symporter family. Both members of this family and other close homologs tend to be encoded next to a member of pfam04341, a set of uncharacterized membrane proteins. The characterized member from E. coli is encoded near and cotranscribed with the acetyl coenzyme A synthetase (acs) gene. Proximity to an acs gene was used as one criterion for determining the trusted cutoff for this model. Closely related proteins may differ in function and are excluded by the high cutoffs of this model; members of the family of phenylacetic acid transporter PhaJ can score as high as 1011 bits. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	549
274264	TIGR02712	urea_carbox	urea carboxylase. Members of this family are ATP-dependent urea carboxylase, including characterized members from Oleomonas sagaranensis (alpha class Proteobacterium) and yeasts such as Saccharomyces cerevisiae. The allophanate hydrolase domain of the yeast enzyme is not included in this model and is represented by an adjacent gene in Oleomonas sagaranensis. The fusion of urea carboxylase and allophanate hydrolase is designated urea amidolyase. The enzyme from Oleomonas sagaranensis was shown to be highly active on acetamide and formamide as well as urea. [Central intermediary metabolism, Nitrogen metabolism]	1201
274265	TIGR02713	allophanate_hyd	allophanate hydrolase. Allophanate hydrolase catalyzes the second reaction in an ATP-dependent two-step degradation of urea to ammonia and CO2, following the action of the biotin-containing urea carboxylase. The yeast enzyme, a fusion of allophanate hydrolase to urea carboxylase, is designated urea amidolyase. [Central intermediary metabolism, Nitrogen metabolism]	561
274266	TIGR02714	amido_AtzD_TrzD	ring-opening amidohydrolases. Members of this family are ring-opening amidohydrolases, including cyanuric acid amidohydrolase (EC 3.5.2.15) (AtzD and TrzD) and barbiturase. Note that barbiturase does not act as defined for EC 3.5.2.1 (barbiturate + water = malonate + urea) but rather catalyzes the ring-opening of barbituric acid to ureidomalonic acid (see Soong, et al., ).	366
274267	TIGR02715	amido_AtzE	amidohydrolase, AtzE family. Members of this protein family are aminohydrolases related to, but distinct from, glutamyl-tRNA(Gln) amidotransferase subunit A. The best characterized member is the biuret hydrolase of Pseudomonas sp. ADP, which hydrolyzes ammonia from the three-nitrogen compound biuret to yield allophanate. Allophanate is also an intermediate in urea degradation by the urea carboxylase/allophanate hydrolase pathway, an alternative to urease. [Unknown function, Enzymes of unknown specificity]	452
131763	TIGR02716	C20_methyl_CrtF	C-20 methyltransferase BchU. Members of this protein family are the S-adenosylmethionine-dependent C-20 methyltransferase BchU, part of the pathway of bacteriochlorophyll c production in photosynthetic green sulfur bacteria. The position modified by this enzyme represents the difference between bacteriochlorophylls c and d; strains lacking this protein can only produce bacteriochlorophyll d. [Biosynthesis of cofactors, prosthetic groups, and carriers, Chlorophyll and bacteriochlorphyll]	306
131764	TIGR02717	AcCoA-syn-alpha	acetyl coenzyme A synthetase (ADP forming), alpha domain. Although technically reversible, it is believed that this group of ADP-dependent acetyl-CoA synthetases (ACS) act in the direction of acetate and ATP production in the organisms in which it has been characterized. In most species this protein exists as a fused alpha-beta domain polypeptide. In Pyrococcus and related species, however the domains exist as separate polypeptides. This model represents the alpha (N-terminal) domain. In Pyrococcus and related species there appears to have been the development of a paralogous family such that four other proteins are close relatives. In reference, one of these (along with its beta-domain partner) was characterized as ACS-II showing specificity for phenylacetyl-CoA. This model has been constructed to exclude these non-ACS-I paralogs. This may result in new, authentic ACS-I sequences falling below the trusted cutoff.	447
131765	TIGR02718	sider_RhtX_FptX	siderophore transporter, RhtX/FptX family. RhtX from Sinorhizobium meliloti 2011 and FptX from Pseudomonas aeruginosa appear to be single polypeptide transporters, from the major facilitator family (see pfam07690) for import of siderophores as a means to import iron. This function was suggested by proximity to siderophore biosynthesis genes and then confirmed by study of knockout and heterologous expression phenotypes. [Transport and binding proteins, Cations and iron carrying compounds]	390
131766	TIGR02719	repress_PhaQ	poly-beta-hydroxybutyrate-responsive repressor. Members of this family are transcriptional regulatory proteins found in the vicinity of poly-beta-hydroxybutyrate (PHB) operons in several species of Bacillus. This protein appears to have repressor activity modulated by PHB itself. This protein belongs to the larger PadR family (see pfam03551). [Regulatory functions, DNA interactions]	138
213733	TIGR02720	pyruv_oxi_spxB	pyruvate oxidase. Members of this family are examples of pyruvate oxidase (EC 1.2.3.3), an enzyme with FAD and TPP as cofactors that catalyzes the reaction pyruvate + phosphate + O2 + H2O = acetyl phosphate + CO2 + H2O2. It should not be confused with pyruvate dehydrogenase [cytochrome] (EC 1.2.2.2) as in E. coli PoxB, although the E. coli enzyme is closely homologous and has pyruvate oxidase as an alternate name. [Energy metabolism, Aerobic]	575
274268	TIGR02721	ycfN_thiK	thiamine kinase. Members of this family are the ycfN gene product of Escherichia coli, now identified as the salvage enzyme thiamine kinase (thiK), and additional proteobacterial homologs taken to be orthologs with equivalent function. [Biosynthesis of cofactors, prosthetic groups, and carriers, Thiamine]	256
274269	TIGR02722	lp_	uncharacterized proteobacterial lipoprotein. Members of this protein family are restricted to the Proteobacteria, and all are predicted lipoproteins. In genomes that contain the thiK gene for the salvage enzyme thiamin kinase, the member of this family is encoded nearby. [Cell envelope, Other]	189
131770	TIGR02723	phenyl_P_alpha	phenylphosphate carboxylase, alpha subunit. Members of this protein family are the alpha subunit of phenylphosphate carboxylase. Phenol (hydroxybenzene) is converted to phenylphosphate, then para-carboxylated by this four-subunit enzyme, with the release of phosphate, to 4-hydroxybenzoate. The enzyme contains neither biotin nor thiamin pyrophosphate. This alpha subunit is homologous to the beta subunit and, more broadly, to UbiD family decarboxylases. [Energy metabolism, Anaerobic]	485
131771	TIGR02724	phenyl_P_beta	phenylphosphate carboxylase, beta subunit. Members of this protein family are the beta subunit of phenylphosphate carboxylase. Phenol (hydroxybenzene) is converted to phenylphosphate, then para-carboxylated by this four-subunit enzyme, with the release of phosphate, to 4-hydroxybenzoate. The enzyme contains neither biotin nor thiamin pyrophosphate. This beta subunit is homologous to the alpha subunit and, more broadly, to UbiD family decarboxylases.	472
131772	TIGR02725	phenyl_P_gamma	phenylphosphate carboxylase, gamma subunit. Members of this protein family are the gamma subunit of phenylphosphate carboxylase. Phenol (hydroxybenzene) is converted to phenylphosphate, then para-carboxylated by this four-subunit enzyme, with the release of phosphate, to 4-hydroxybenzoate. The enzyme contains neither biotin nor thiamin pyrophosphate. The gamma subunit has no known homologs.	84
131773	TIGR02726	phenyl_P_delta	phenylphosphate carboxylase, delta subunit. Members of this protein family are the delta subunit of phenylphosphate carboxylase. Phenol (hydroxybenzene) is converted to phenylphosphate, then para-carboxylated by this four-subunit enzyme, with the release of phosphate, to 4-hydroxybenzoate. The enzyme contains neither biotin nor thiamin pyrophosphate. This delta subunit belongs to HAD family hydrolases. [Energy metabolism, Anaerobic]	169
274270	TIGR02727	MTHFS_bact	5,10-methenyltetrahydrofolate synthetase. This enzyme, 5,10-methenyltetrahydrofolate synthetase, is also called 5-formyltetrahydrofolate cycloligase. Function of bacterial proteins in this family was inferred originally from the known activity of eukaryotic homologs. Recently, activity was shown explicitly for the member from Mycoplasma pneumoniae. Members of this family from alpha- and gamma-proteobacteria, designated ygfA, are often found in an operon with 6S structural RNA, and show a similar pattern of high expression during stationary phase. The function may be to deplete folate to slow 1-carbon biosynthetic metabolism. [Central intermediary metabolism, One-carbon metabolism]	179
131775	TIGR02728	spore_gerQ	spore coat protein GerQ. Members of this protein family are the spore coat protein GerQ of endospore-forming Firmicutes (low GC Gram-positive bacteria). This protein is cross-linked by a spore coat-associated transglutaminase. [Cellular processes, Sporulation and germination]	82
274271	TIGR02729	Obg_CgtA	Obg family GTPase CgtA. This model describes a universal, mostly one-gene-per-genome GTP-binding protein that associates with ribosomal subunits and appears to play a role in ribosomal RNA maturation. This GTPase, related to the nucleolar protein Obg, is designated CgtA in bacteria. Mutations in this gene are pleiotropic, but it appears that effects on cellular functions such as chromosome partition may be secondary to the effect on ribosome structure. Recent work done in Vibrio cholerae shows an essential role in the stringent response, in which RelA-dependent ability to synthesize the alarmone ppGpp is required for deletion of this GTPase to be lethal. [Protein synthesis, Other]	328
131777	TIGR02730	carot_isom	carotene isomerase. Members of this family, including sll0033 (crtH) of Synechocystis sp. PCC 6803, catalyze a cis-trans isomerization of carotenes to the all-trans lycopene, a reaction that can also occur non-enzymatically in light through photoisomerization. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	493
131778	TIGR02731	phytoene_desat	phytoene desaturase. Plants and cyanobacteria (and, supposedly, Chlorobium tepidum) have a conserved pathway from two molecules of geranylgeranyl-PP to one of all-trans-lycopene. Members of this family are the enzyme phytoene desaturase (also called phytoene dehydrogenase). This model does not include the region of the chloroplast transit peptide in plants. A closely related family, excluded by this model, is zeta-carotene desaturase, another enzyme in the same pathway. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	453
131779	TIGR02732	zeta_caro_desat	9,9'-di-cis-zeta-carotene desaturase. Carotene 7,8-desaturase, also called zeta-carotene desaturase, catalyzes multiple steps in the pathway from geranylgeranyl-PP to all-trans-lycopene in plants and cyanobacteria. A similar enzyme and pathway is found in the green sulfur bacterium Chlorobium tepidum.	474
274272	TIGR02733	desat_CrtD	C-3',4' desaturase CrtD. Members of this family are slr1293, a carotenoid biosynthesis protein which was shown to be the C-3',4' desaturase (CrtD) of myxoxanthophyll biosynthesis in Synechocystis sp. strain PCC 6803, and close homologs (presumed to be functionally equivalent) from other cyanobacteria, where myxoxanthophyll biosynthesis is either known or expected. This enzyme can act on neurosporene and so presumably catalyzes the first step that is committed to myxoxanthophyll. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	492
274273	TIGR02734	crtI_fam	phytoene desaturase. Phytoene is converted to lycopene by desaturation at four (two symmetrical pairs of) sites. This is achieved by two enzymes (crtP and crtQ) in cyanobacteria (Gloeobacter being an exception) and plants, but by a single enzyme in most other bacteria and in fungi. This single enzyme is called the bacterial-type phytoene desaturase, or CrtI. Most members of this family, part of the larger pfam01593, which also contains amino oxidases, are CrtI itself; it is likely that all members act on either phytoene or on related compounds such as dehydrosqualene, for carotenoid biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	495
131782	TIGR02735	purC_vibrio	phosphoribosylaminoimidazole-succinocarboxamide synthase, Vibrio type. Members of this protein family appear to represent a novel form of phosphoribosylaminoimidazole-succinocarboxamide synthase (SAICAR synthetase), significantly different in sequence and gap pattern from a form (see TIGR00081) shared by a broad range of bacteria and eukaryotes. Members of this family are found within the gammaproteobacteria in the genera Vibrio, Shewanella, and Colwellia, and also (reported as a fragment) in the primitive eukaryote Guillardia theta. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis]	365
131783	TIGR02736	cbb3_Q_epsi	cytochrome c oxidase, cbb3-type, CcoQ subunit, epsilon-Proteobacterial. Members of this protein family are restricted to the epsilon branch of the Proteobacteria. All members are found in operons containing the other three structural subunits of the cbb3 type of cytochrome c oxidase. These small proteins show remote sequence similarity to the CcoQ subunit in other cytochrome c oxidase systems, so this family is assumed to represent the epsilonproteobacterial variant of CcoQ. [Energy metabolism, Electron transport]	56
131784	TIGR02737	caa3_CtaG	cytochrome c oxidase assembly factor CtaG. Members of this family are the CtaG protein required for assembly of active cytochrome c oxidase of the caa3 type, as in Bacillus subtilis.	281
131785	TIGR02738	TrbB	type-F conjugative transfer system pilin assembly thiol-disulfide isomerase TrbB. This protein is part of a large group of proteins involved in conjugative transfer of plasmid DNA, specifically the F-type system. This protein has been predicted to contain a thioredoxin fold, contains a conserved pair of cysteines and has been shown to function as a thiol disulfide isomerase by complementation of an E. coli DsbA defect. The protein is believed to be involved in pilin assembly. The protein is closely related to TraF (TIGR02739) which is somewhat longer, lacks the cysteine motif and is apparently not functional as a disulfide bond isomerase.	153
274274	TIGR02739	TraF	type-F conjugative transfer system pilin assembly protein TraF. This protein is part of a large group of proteins involved in conjugative transfer of plasmid DNA, specifically the F-type system. This protein has been predicted to contain a thioredoxin fold and has been shown to be localized to the periplasm. Unlike the related protein TrbB (TIGR02738), TraF does not contain a conserved pair of cysteines and has been shown not to function as a thiol disulfide isomerase by complementation of an E. coli DsbA defect. The protein is believed to be involved in pilin assembly. Even more closely related than TrbB is a clade of genes (TIGR02740) which do contain the CXXC motif, but it is unclear whether these genes are involved in type-F conjugation systems per se.	256
274275	TIGR02740	TraF-like	TraF-like protein. This protein is related to the F-type conjugation system pilus assembly proteins TraF (TIGR02739) and TrbB (TIGR02738), both of which exhibit a thioredoxin fold. The protein represented by this model has the same length and architecture as TraF, but lacks the CXXC-motif found in TrbB and believed to be responsible for the disulfide isomerase activity of that protein.	271
131788	TIGR02741	TraQ	type-F conjugative transfer system pilin chaperone TraQ. This protein makes a specific interaction with the pilin (TraA) protein to aid its transfer through the inner membrane during the process of F-type conjugative pilus assembly.	80
131789	TIGR02742	TrbC_Ftype	type-F conjugative transfer system pilin assembly protein TrbC. This protein is an essential component of the F-type conjugative pilus assembly system for the transfer of plasmid DNA. The N-terminal portion of these proteins are heterogeneous and are not covered by this model.	130
274276	TIGR02743	TraW	type-F conjugative transfer system protein TraW. This protein is an essential component of the F-type conjugative transfer system for plasmid DNA transfer and has been shown to be localized to the periplasm.	202
274277	TIGR02744	TrbI_Ftype	type-F conjugative transfer system protein TrbI. This protein is an essential component of the F-type conjugative transfer system for plasmid DNA transfer and has been shown to be localized to the periplasm.	112
274278	TIGR02745	ccoG_rdxA_fixG	cytochrome c oxidase accessory protein FixG. Members of this ferredoxin-like protein family are found exclusively in species with an operon encoding the cbb3 type of cytochrome c oxidase (cco-cbb3), and near the cco-cbb3 operon in about half the cases. The cco-cbb3 is found in a variety of proteobacteria and almost nowhere else, and is associated with oxygen use under microaerobic conditions. Some (but not all) of these proteobacteria are also nitrogen-fixing, hence the gene symbol fixG. FixG was shown essential for functional cco-cbb3 expression in Bradyrhizobium japonicum.	434
274279	TIGR02746	TraC-F-type	type-IV secretion system protein TraC. The protein family described here is common among the F, P and I-like type IV secretion systems. Gene symbols include TraC (F-type), TrbE/VirB4 (P-type) and TraU (I-type). The protein contains the Walker A and B motifs and so is a putative nucleotide triphosphatase.	797
274280	TIGR02747	TraV	type IV conjugative transfer system lipoprotein TraV. The TraV protein is a component of conjugative type IV secretion systems. TraV is an outer membrane lipoprotein and is believed to interact with the secretin TraK. The alignment contains three conserved cysteines in the N-terminal half.	144
131795	TIGR02748	GerC3_HepT	heptaprenyl diphosphate synthase component II. Members of this family are component II of the heterodimeric heptaprenyl diphosphate synthase. The trusted cutoff was set such that all members identified are encoded near to a recognizable gene for component I (in pfam07307). This enzyme acts in menaquinone-7 isoprenoid side chain biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone]	319
131796	TIGR02749	prenyl_cyano	solanesyl diphosphate synthase. Members of this family all are from cyanobacteria or plastid-containing eukaryotes. A member from Arabidopsis (where both plastoquinone and ubiquinone contain the C(45) prenyl moiety) was characterized by heterologous expression as a solanesyl diphosphate synthase. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone]	322
274281	TIGR02750	TraN_Ftype	type-F conjugative transfer system mating-pair stabilization protein TraN. TraN is a large cysteine-rich outer membrane protein involved in the mating-pair stabilization (adhesin) component of the F-type conjugative plasmid transfer system. TraN is believed to interact with the core type IV secretion system apparatus through the TraV protein.	572
131798	TIGR02751	PEPCase_arch	phosphoenolpyruvate carboxylase, archaeal type. This family is the archaeal-type phosphoenolpyruvate carboxylase, although not every host species is archaeal. These sequences bear little resemblance to the bacterial/eukaryotic type. The members from Sulfolobus solfataricus and Methanothermobacter thermautotrophicus were verified experimentally, while the activity is known to be present in a number of other archaea. [Energy metabolism, Other]	506
131799	TIGR02752	MenG_heptapren	demethylmenaquinone methyltransferase. MenG is a generic term for a methyltransferase that catalyzes the last step in menaquinone biosynthesis; the exact enzymatic activity differs for different MenG because the menaquinone differ in their prenoid side chains in different species. Members of this MenG protein family are 2-heptaprenyl-1,4-naphthoquinone methyltransferase, and are found together in operons with the two subunits of the heptaprenyl diphosphate synthase in Bacillus subtilis and related species. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone]	231
131800	TIGR02753	sodN	superoxide dismutase, Ni. This superoxide dismutase uses nickel, rather than iron, manganese, copper, or zinc. Its gene is always accompanied by a gene for a required protease.	145
274282	TIGR02754	sod_Ni_protease	nickel-type superoxide dismutase maturation protease. Members of this protein family are apparent proteases encoded adjacent to the genes for a nickel-type superoxide dismutase. This family belongs to the same larger family (see pfam00717) as signal peptidase I, an unusual serine protease suggested to have a Ser/Lys catalytic dyad. [Cellular processes, Detoxification, Protein fate, Protein modification and repair]	90
131802	TIGR02755	TraX_Ftype	type-F conjugative transfer system pilin acetylase TraX. TraX is responsible for the acetylation of the F-pilin TraA during conjugative plasmid transfer. The purpose of this acetylation is unclear, but the reported transcriptional regulation of TraX may indicate that it is involved in the process of pilus extension/retraction.	224
274283	TIGR02756	TraK_Ftype	type-F conjugative transfer system secretin TraK. The TraK protein is predicted to interact with the TraV and TraB proteins as part of the scaffold which extends from the inner membrane, through the periplasm to the cell envelope and through which the F-type conjugative pilus passes. TraK is homologous to the P-type IV secretion system protein TrbG, the Ti-type protein VirB9 and the I-type TraN protein. The protein is related to the secretin family especially the HrcC subgroup of the type III secretion system. The protein is hypothesized to oligomerize to form a ring structure akin to other secretins.	232
274284	TIGR02757	TIGR02757	TIGR02757 family protein. Members of this uncharacterized protein family are found sporadically, so far only among spirochetes, epsilon and delta proteobacteria, and Bacteroides. The function is unknown and its gene neighborhoods show little conservation. [Hypothetical proteins, Conserved]	229
131805	TIGR02758	TraA_TIGR	type IV conjugative transfer system pilin TraA. TraA is the single structural subunit of the pilus found in type IV conjugative transfer systems. This family is generally found in gammaproteobacteria. The pilins show considerable heterogeneity among the different conjugative plasmid types. All of them, however, contain an N-terminal part which is cleaved off by a leader peptidase (LepB, or similar) to result in a 68-78 amino acid product. Pilins may be further processed by acetylation (in F-like systems by the TraX protein) or by cyclization (in P-like systems by the TraF protein).	93
131806	TIGR02759	TraD_Ftype	type IV conjugative transfer system coupling protein TraD. The TraD protein performs an essential coupling function in conjugative type IV secretion systems. This protein sits at the inner membrane in contact with the assembled pilus and its scaffold as well as the relaxosome-plasmid DNA complex (through TraM).	566
274285	TIGR02760	TraI_TIGR	conjugative transfer relaxase protein TraI. This protein is a component of the relaxosome complex. In the process of conjugative plasmid transfer the relaxosome binds to the plasmid at the oriT (origin of transfer) site. The relaxase protein TraI mediates the single-strand nicking and ATP-dependent unwinding (relaxation, helicase activity) of the plasmid molecule. These two activities reside in separate domains of the protein.	1960
163004	TIGR02761	TraE_TIGR	type IV conjugative transfer system protein TraE. TraE is a component of type IV secretion systems involved in conjugative transfer of plasmid DNA. The function of the TraE protein is unknown.	181
274286	TIGR02762	TraL_TIGR	type IV conjugative transfer system protein TraL. This protein is part of the type IV secretion system for conjugative plasmid transfer. The function of the TraL protein is unknown. [Cellular processes, Conjugation]	94
131810	TIGR02763	chlamy_scaf	chlamydiaphage internal scaffolding protein. Members of this protein family are encoded by genes in chlamydiaphage such as Chp2, viruses with around eight genes that infect obligately intracellular bacterial pathogens of the genus Chlamydia. This protein, initially designated VP3 (as if a structural protein of mature viral particles), is displaced from procapsids as DNA is packaged, and therefore is described as a scaffolding protein. [Mobile and extrachromosomal element functions, Prophage functions]	114
274287	TIGR02764	spore_ybaN_pdaB	polysaccharide deacetylase family sporulation protein PdaB. This model describes the YbaN protein family, also called PdaB and SpoVIE, of Gram-positive bacteria. Although ybaN null mutants have only a mild sporulation defect, ybaN/ytrI double mutants show drastically reduced sporulation efficiencies. This synthetic defect suggests the role of this sigmaE-controlled gene in sporulation had been masked by functional redundancy. Members of this family are homologous to a characterized polysaccharide deacetylase; the exact function of this protein family is unknown. [Cellular processes, Sporulation and germination]	191
274288	TIGR02765	crypto_DASH	cryptochrome, DASH family. Photolyases and cryptochromes are related flavoproteins. Photolyases harness the energy of blue light to repair DNA damage by removing pyrimidine dimers. Cryptochromes do not repair DNA and are presumed to act instead in some other (possibly unknown) process such as entraining circadian rhythms. This model describes the cryptochrome DASH subfamily, one of at least five major subfamilies, which is found in plants, animals, marine bacteria, etc. Members of this family bind both folate and FAD. They may show weak photolyase activity in vitro but have not been shown to affect DNA repair in vivo. Rather, DASH family cryptochromes have been shown to bind RNA (Vibrio cholerae VC1814), or DNA, and seem likely to act in light-responsive regulatory processes. [Cellular processes, Adaptations to atypical conditions]	429
131813	TIGR02766	crypt_chrom_pln	cryptochrome, plant family. At least five major families of cryptochromes and photolyases share FAD cofactor binding, sequence homology, and the ability to react to short wavelengths of visible light. Photolyases are responsible for light-dependent DNA repair by removal of two types of UV-induced DNA dimerizations. Cryptochromes have other functions, often regulatory and often largely unknown, which may include circadian clock entrainment and control of development. Members of this subfamily are known so far only in plants; they may show some photolyase activity in vitro but appear mostly to be regulatory proteins that respond to blue light.	475
131814	TIGR02767	TraG-Ti	Ti-type conjugative transfer system protein TraG. This protein is found in the Agrobacterium tumefaciens Ti plasmid tra region responsible for conjugative transfer of the entire plasmid among Agrobacterium strains. The protein is distantly related to the F-type conjugation system TraG protein. Both of these systems are examples of type IV secretion systems.	623
274289	TIGR02768	TraA_Ti	Ti-type conjugative transfer relaxase TraA. This protein contains domains distinctive of a single strand exonuclease (N-terminus, MobA/MobL, pfam03389) as well as a helicase domain (central region, homologous to the corresponding region of the F-type relaxase TraI, TIGR02760). This protein likely fills the same role as TraI(F), nicking (at the oriT site) and unwinding the coiled plasmid prior to conjugative transfer.	744
131816	TIGR02769	nickel_nikE	nickel import ATP-binding protein NikE. This family represents the NikE subunit of a multisubunit nickel import ABC transporter complex. Nickel, once imported, may be used in urease and in certain classes of hydrogenase and superoxide dismutase. [Transport and binding proteins, Cations and iron carrying compounds]	265
131817	TIGR02770	nickel_nikD	nickel import ATP-binding protein NikD. This family represents the NikD subunit of a multisubunit nickel import ABC transporter complex. Nickel, once imported, may be used in urease and in certain classes of hydrogenase and superoxide dismutase. NikD and NikE are homologous. [Transport and binding proteins, Cations and iron carrying compounds]	230
131818	TIGR02771	TraF_Ti	conjugative transfer signal peptidase TraF. This protein is found in apparent operons encoding elements of conjugative transfer systems. This family is homologous to a broader family of signal (leader) peptidases such as lepB. This family is present in both Ti-type and I-type conjugative systems.	171
274290	TIGR02772	Ku_bact	Ku protein, prokaryotic. Members of this protein family are Ku proteins of non-homologous end joining (NHEJ) DNA repair in bacteria and in at least one member of the archaea (Archaeoglobus fulgidus). Most members are encoded by a gene adjacent to the gene for the DNA ligase that completes the repair. The NHEJ system is broadly but rather sparsely distributed, being present in about one fifth of the first 250 completed prokaryotic genomes. A few species (e.g. Archaeoglobus fulgidus and Bradyrhizobium japonicum) have multiple copies that appear to represent recent paralogous family expansion. [DNA metabolism, DNA replication, recombination, and repair]	258
213736	TIGR02773	addB_Gpos	helicase-exonuclease AddAB, AddB subunit. DNA repair is accomplished by several different systems in prokaryotes. Recombinational repair of double-stranded DNA breaks involves the RecBCD pathway in some lineages, and AddAB (also called RexAB) in others. The AddA protein is conserved between the firmicutes and the alphaproteobacteria, while the partner protein is not. Nevertheless, the partner is designated AddB in both systems. This model describes the AddB protein as found in Bacillus subtilis and related species. Although the RexB protein of Streptococcus and Lactococcus is considered to be orthologous, functionally equivalent, and merely named differently, all members of this protein family have a P-loop nucleotide binding motif GxxGxGK[ST] at the N-terminus, unlike RexB proteins, and a CxxCxxxxxC motif at the C-terminus, both of which may be relevant to function. [DNA metabolism, DNA replication, recombination, and repair]	1160
274291	TIGR02774	rexB_recomb	ATP-dependent nuclease subunit B. DNA repair is accomplished by several different systems in prokaryotes. Recombinational repair of double-stranded DNA breaks involves the RecBCD pathway in some lineages, and AddAB (also called RexAB) in others. The AddA protein is conserved between the firmicutes and the alphaproteobacteria, while the partner protein is not. The partner may be designated AddB, as in Bacillus and in alphaproteobacteria, or RexB as in Streptococcus and Lactococcus. Note, however, that RexB proteins lack an N-terminal GxxGxGK[ST] ATP-binding motif found in Bacillus subtilis and related species, and this difference may be important; this model represents specifically RexB proteins as found in Streptococcus and Lactococcus. [DNA metabolism, DNA replication, recombination, and repair]	1076
274292	TIGR02775	TrbG_Ti	P-type conjugative transfer protein TrbG. The TrbG protein is found in the trb locus of Agrobacterium Ti plasmids where it is involved in the type IV secretion system for plasmid conjugative transfer. TrbG is a homolog of the F-type TraK protein (which is believed to be an outer membrane pore-forming secretin, TIGR02756) as well as the vir system VirB9 protein. [Cellular processes, Conjugation]	206
274293	TIGR02776	NHEJ_ligase_prk	DNA ligase D. Members of this protein family are DNA ligases involved in the repair of DNA double-stranded breaks by non-homologous end joining (NHEJ). The system of the bacterial Ku protein (TIGR02772) plus this DNA ligase is seen in about 20% of bacterial genomes to date and at least one archaeon (Archaeoglobus fulgidus). This model describes a central and a C-terminal domain. These two domains may be permuted, as in genus Mycobacterium, or divided into tandem ORFs, and therefore not be identified by this model. An additional N-terminal 3'-phosphoesterase (PE) domain present in some but not all examples of this ligase is not included in the seed alignment for this model; it only represents the central ATP-dependent ligase domain and the C-terminal polymerase domain. Most examples of genes for this ligase are adjacent to the gene for Ku. [DNA metabolism, DNA replication, recombination, and repair]	552
131824	TIGR02777	LigD_PE_dom	DNA ligase D, 3'-phosphoesterase domain. Most sequences in this family are the 3'-phosphoesterase domain of a multidomain, multifunctional DNA ligase, LigD, involved, along with bacterial Ku protein, in non-homologous end joining, the less common of two general mechanisms of repairing double-stranded breaks in DNA sequences. LigD is variable in architecture, as it lacks this domain in Bacillus subtilis, is permuted in Mycobacterium tuberculosis, and occasionally is encoded by tandem ORFs rather than as a multifunctional protein. In a few species (Dehalococcoides ethenogenes and the archaeal genus Methanosarcina), sequences corresponding to the ligase and polymerase domains of LigD are not found, and the role of this protein is unclear. [DNA metabolism, DNA replication, recombination, and repair]	156
274294	TIGR02778	ligD_pol	DNA ligase D, polymerase domain. DNA repair of double-stranded breaks by non-homologous end joining (NHEJ) is accomplished by a two-protein system that is present in a minority of prokaryotes. One component is the Ku protein (see TIGR02772), which binds DNA ends. The other is a DNA ligase, a protein that is a multidomain polypeptide in most of those bacteria that have NHEJ, a permuted polypeptide in Mycobacterium tuberculosis and a few other species, and the product of tandem genes in some other bacteria. This model represents the polymerase domain.	245
274295	TIGR02779	NHEJ_ligase_lig	DNA ligase D, ligase domain. DNA repair of double-stranded breaks by non-homologous end joining (NHEJ) is accomplished by a two-protein system that is present in a minority of prokaryotes. One component is the Ku protein (see TIGR02772), which binds DNA ends. The other is a DNA ligase, a protein that is a multidomain polypeptide in most of those bacteria that have NHEJ, a permuted polypeptide in Mycobacterium tuberculosis and a few other species, and the product of tandem genes in some other bacteria. This model represents the ligase domain.	298
131827	TIGR02780	TrbJ_Ti	P-type conjugative transfer protein TrbJ. The TrbJ protein is found in the trb locus of Agrobacterium Ti plasmids where it is involved in the type IV secretion system for plasmid conjugative transfer. TrbJ is a homolog of the F-type TraE protein (which is believed to be an inner membrane pore-forming protein, TIGR02761) as well as the vir system VirB5 protein.	246
274296	TIGR02781	VirB9	P-type conjugative transfer protein VirB9. The VirB9 protein is found in the vir locus of Agrobacterium Ti plasmids where it is involved in a type IV secretion system. VirB9 is a homolog of the F-type conjugative transfer system TraK protein (which is believed to be an outer membrane pore-forming secretin, TIGR02756) as well as the Ti system TrbG protein. [Cellular processes, Conjugation]	243
274297	TIGR02782	TrbB_P	P-type conjugative transfer ATPase TrbB. The TrbB protein is found in the trb locus of Agrobacterium Ti plasmids where it is involved in the type IV secretion system for plasmid conjugative transfer. TrbB is a homolog of the vir system VirB11 ATPase, and the Flp pilus system ATPase TadA. [Cellular processes, Conjugation]	299
131830	TIGR02783	TrbL_P	P-type conjugative transfer protein TrbL. The TrbL protein is found in the trb locus of Agrobacterium Ti plasmids where it is involved in the type IV secretion system for plasmid conjugative transfer. TrbL is a homolog of the F-type TraG protein (which is believed to be a mating pair stabilization pore-forming protein, pfam07916) as well as the vir system VirB6 protein. [Cellular processes, Conjugation]	298
274298	TIGR02784	addA_alphas	double-strand break repair helicase AddA, alphaproteobacterial type. AddAB, also called RexAB, substitutes for RecBCD in several bacterial lineages. These DNA recombination proteins act before synapse and are particularly important for DNA repair of double-stranded breaks by homologous recombination. The term AddAB is used broadly, with AddA homologous between the alphaproteobacteria (as modeled here) and the Firmicutes, while the partner AddB proteins show no strong homology across the two groups of species. [DNA metabolism, DNA replication, recombination, and repair]	1135
274299	TIGR02785	addA_Gpos	helicase-exonuclease AddAB, AddA subunit, Firmicutes type. AddAB, also called RexAB, substitutes for RecBCD in several bacterial lineages. These DNA recombination proteins act before synapse and are particularly important for DNA repair of double-stranded breaks by homologous recombination. The term AddAB is used broadly, with AddA homologous between the Firmicutes (as modeled here) and the alphaproteobacteria, while the partner AddB proteins show no strong homology across the two groups of species. [DNA metabolism, DNA replication, recombination, and repair]	1230
274300	TIGR02786	addB_alphas	double-strand break repair protein AddB, alphaproteobacterial type. AddAB is a system well described in the Firmicutes as a replacement for RecBCD in many prokaryotes for the repair of double stranded break DNA damage. More recently, a distantly related gene pair conserved in many alphaproteobacteria was shown also to function in double-stranded break repair in Rhizobium etli. This family consists of AddB proteins of the alphaproteobacterial type. [DNA metabolism, DNA replication, recombination, and repair]	1021
131834	TIGR02787	codY_Gpos	GTP-sensing transcriptional pleiotropic repressor CodY. This model represents the full length of CodY, a pleiotropic repressor in Bacillus subtilis and other Firmicutes (low-GC Gram-positive bacteria) that responds to intracellular levels of GTP and branched chain amino acids. The C-terminal helix-turn-helix DNA-binding region is modeled by pfam08222 in Pfam. [Regulatory functions, DNA interactions]	251
274301	TIGR02788	VirB11	P-type DNA transfer ATPase VirB11. The VirB11 protein is found in the vir locus of Agrobacterium Ti plasmids where it is involved in the type IV secretion system for DNA transfer. VirB11 is believed to be an ATPase. VirB11 is a homolog of the P-like conjugation system TrbB protein and the Flp pilus system protein TadA.	308
131836	TIGR02789	nickel_nikB	nickel ABC transporter, permease subunit NikB. This family consists of the NikB family of nickel ABC transporter permeases. Operons that contain this protein also contain a homologous permease subunit NikC. Nickel is used in cells as part of urease or certain hydrogenases or superoxide dismutases. [Transport and binding proteins, Cations and iron carrying compounds]	314
131837	TIGR02790	nickel_nikC	nickel ABC transporter, permease subunit NikC. This family consists of the NikC family of nickel ABC transporter permeases. Operons that contain this protein also contain a homologous permease subunit NikB. Nickel is used in cells as part of urease or certain hydrogenases or superoxide dismutases. [Transport and binding proteins, Cations and iron carrying compounds]	258
274302	TIGR02791	VirB5	P-type DNA transfer protein VirB5. The VirB5 protein is involved in the type IV DNA secretion systems typified by the Agrobacterium Ti plasmid vir system where it interacts with several other proteins essential for proper pilus formation. VirB5 is homologous to the IncN (N-type) conjugation system protein TraC as well as the P-type protein TrbJ and the F-type protein TraE.	220
131839	TIGR02792	PCA_ligA	protocatechuate 4,5-dioxygenase, alpha subunit. Protocatechuate (PCA) 4,5-dioxygenase is the first enzyme in the PCA 4,5-cleavage pathway that is an alternative to PCA 3,4-cleavage and PCA 2,3 cleavage pathways. PCA is an intermediate in the breakdown of lignin (hence the gene symbol ligA) and other compounds. Members of this family are the alpha chain of PCA 4,5-dioxygenase, or the equivalent domain of a fusion protein. [Energy metabolism, Aerobic]	117
131840	TIGR02793	nikR	nickel-responsive transcriptional regulator NikR. Three members of the seed for this model, from Escherichia coli, Pseudomonas putida, and Brucella melitensis, are found associated with a nickel ABC transporter operon that acts to import nickel for use as a cofactor in urease or hydrogenase. These proteins, with characterized nickel-binding and DNA-binding domains, act as nickel-responsive transcriptional regulators. In the larger family of full-length homologs, most others both lack proximity to the nickel ABC transporter operon and form a separate clade. Several of the homologs not within the scope of this model, but rather scoring between the trusted and noise cutoffs, have been shown to bind nickel, copper, or both, and to regulate genes in response to nickel. [Regulatory functions, DNA interactions]	129
274303	TIGR02794	tolA_full	TolA protein. TolA couples the inner membrane complex of itself with TolQ and TolR to the outer membrane complex of TolB and OprL (also called Pal). Most of the length of the protein consists of low-complexity sequence that may differ in both length and composition from one species to another, complicating efforts to discriminate TolA (the most divergent gene in the tol-pal system) from paralogs such as TonB. Selection of members of the seed alignment and criteria for setting scoring cutoffs are based largely on conserved operon structure. //The Tol-Pal complex is required for maintaining outer membrane integrity. Also involved in transport (uptake) of colicins and filamentous DNA, and implicated in pathogenesis. Transport is energized by the proton motive force. TolA is an inner membrane protein that interacts with periplasmic TolB and with outer membrane porins ompC, phoE and lamB. [Transport and binding proteins, Other, Cellular processes, Pathogenesis]	346
188247	TIGR02795	tol_pal_ybgF	tol-pal system protein YbgF. Members of this protein family are the product of one of seven genes regularly clustered in operons to encode the proteins of the tol-pal system, which is critical for maintaining the integrity of the bacterial outer membrane. The gene for this periplasmic protein has been designated orf2 and ybgF. All members of the seed alignment were from unique tol-pal gene regions from completed bacterial genomes. The architecture of this protein is a signal sequence, a low-complexity region usually rich in Asn and Gln, a well-conserved region with tandem repeats that resemble the tetratricopeptide (TPR) repeat, involved in protein-protein interaction.	117
131843	TIGR02796	tolQ	TolQ protein. TolQ is one of the essential components of the Tol-Pal system. Together with TolR, it harnesses protonmotive force to energize TolA, which spans the periplasm to reach the complex of TolB and Pal at the outer membrane. The tol-pal system proves to be important for maintaining outer membrane integrity. Gene pairs similar to the TolQ and TolR gene pair often number several per genome, but this model describes specifically TolQ per se, as found in tol-pal operons. A close homolog, excluded from this model, is ExbB of the ExbB/ExbD/TonB protein complex, which powers transport of siderophores and vitamin B12 across the bacterial outer membrane. The Tol-Pal system is exploited by colicin and filamentous phage DNA to enter the cell. It is also implicated in pathogenesis in several bacterial species. [Transport and binding proteins, Other, Cellular processes, Pathogenesis]	215
131844	TIGR02797	exbB	tonB-system energizer ExbB. This model describes ExbB proteins, part of the MotA/TolQ/ExbB protein family. The paired proteins MotA and MotB, TolQ and TolR, and ExbB and ExbD harness the proton-motive force to drive the flagellar motor, energize the Tol-Pal system, or energize TonB, respectively. Tol-Pal and TonB are both active at the outer membrane. Genomes may have many different TonB-dependent receptors, of which many of those characterized are involved in siderophore transport across the outer membrane. [Transport and binding proteins, Cations and iron carrying compounds]	211
131845	TIGR02798	ligK_PcmE	4-carboxy-4-hydroxy-2-oxoadipate aldolase/oxaloacetate decarboxylase. Members of this protein family are 4-carboxy-4-hydroxy-2-oxoadipate aldolase, also called 4-oxalocitramalate aldolase. This enzyme of the protocatechuate 4,5-cleavage pathway converts its substrate to pyruvate plus oxaloacetate. Protocatechuate is an intermediate in many pathways for degrading aromatic compounds, including lignin, fluorene, etc. Hara, et al. showed the LigK gene was not only a 4-carboxy-4-hydroxy-2-oxoadipate aldolase but also the enzyme of the following step, oxaloacetate decarboxylase.	222
274304	TIGR02799	thio_ybgC	tol-pal system-associated acyl-CoA thioesterase. The tol-pal system consists of five critical genes. Inner membrane proteins TolQ and TolR convert proton-motive force to energy that is transduced through TolA to an outer membrane complex of TolB and Pal. The system is known to be required to maintain outer membrane integrity. In a system with several homologous parts, ExbB and ExbD transduce energy through TonB to a variety of outer membrane proteins, many of which are siderophore receptors. The tol-pal system therefore may also be involved in transport. This family consists of a protein nearly always found in operons with the genes of the tol-pal system. The significance of this thioesterase to the tol-pal system is unclear, but either of two observations may be relevant. First, Pal, or peptidoglycan-associated lipoprotein, has a conserved N-terminal cleavage and acylation that makes it a lipoprotein. Second, the tol-pal system is implicated not only in the import of certain organics but also in the maintenance of outer membrane integrity (by an unknown mechanism).	126
274305	TIGR02800	propeller_TolB	tol-pal system beta propeller repeat protein TolB. Members of this protein family are the TolB periplasmic protein of Gram-negative bacteria. TolB is part of the Tol-Pal (peptidoglycan-associated lipoprotein) multiprotein complex, comprising five envelope proteins, TolQ, TolR, TolA, TolB and Pal, which form two complexes. The TolQ, TolR and TolA inner-membrane proteins interact via their transmembrane domains. The {beta}-propeller domain of the periplasmic protein TolB is responsible for its interaction with Pal. TolB also interacts with the outer-membrane peptidoglycan-associated proteins Lpp and OmpA. TolA undergoes a conformational change in response to changes in the proton-motive force, and interacts with Pal in an energy-dependent manner. The C-terminal periplasmic domain of TolA also interacts with the N-terminal domain of TolB. The Tol-Pal system is required for bacterial outer membrane integrity. E. coli TolB is involved in the tonB-independent uptake of group A colicins (colicins A, E1, E2, E3 and K), and is necessary for the colicins to reach their respective targets after initial binding to the bacteria. It is also involved in uptake of filamentous DNA. Study of its structure suggests that the TolB protein might be involved in the recycling of peptidoglycan or in its covalent linking with lipoproteins. The Tol-Pal system is also implicated in pathogenesis of E. coli, Haemophilus ducreyi, Salmonella enterica and Vibrio cholerae, but the mechanism(s) is unclear. [Transport and binding proteins, Other, Cellular processes, Pathogenesis]	417
274306	TIGR02801	tolR	TolR protein. The model describes the inner membrane protein TolR, part of the TolR/TolQ complex that transduces energy from the proton-motive force, through TolA, to an outer membrane complex made up of TolB and Pal (peptidoglycan-associated lipoprotein). The complex is required to maintain outer membrane integrity, and defects may cause a defect in the import of some organic compounds in addition to the resulting morphologic. While several gene pairs homologous to tolR and tolQ may be found in a single genome, the scope of this model is set to favor finding only bona fide TolR, supported by operon structure as well as by score. [Transport and binding proteins, Other, Cellular processes, Pathogenesis]	129
274307	TIGR02802	Pal_lipo	peptidoglycan-associated lipoprotein. Members of this protein family are Pal (also called OprL), the Peptidoglycan-Associated Lipoprotein of the Tol-Pal system. The system appears to be involved both in the maintenance of outer membrane integrity and in the import of certain organic molecules as nutrients. Members of this family contain a hydrophobic lipoprotein signal sequence, a conserved N-terminal cleavage and modification site, a poorly conserved low-complexity region, together comprising about 65 amino acids, and a well-conserved C-terminal domain. The seed alignment for this model includes only the conserved C-terminal domain.	104
131850	TIGR02803	ExbD_1	TonB system transport protein ExbD, group 1. Members of this family are Gram-negative bacterial inner membrane proteins, generally designated ExbD, related to the TolR family modeled by TIGRFAMs TIGR02801. Members always are encoded next to a protein designated ExbB (TIGR02797), related to the TolQ family modeled by TIGRFAMs TIGR02796. ExbD and ExbB together form a proton channel through which they can harness the proton-motive force to energize TonB, which in turn energizes TonB-dependent receptors in the outer membrane. TonB-dependent receptors with known specificity tend to import siderophores or vitamin B12. A TonB system and Tol-Pal system often will co-exist in a single bacterial genome.	122
131851	TIGR02804	ExbD_2	TonB system transport protein ExbD, group 2. Members of this family are Gram-negative bacterial inner membrane proteins, generally designated ExbD, related to the TolR family modeled by TIGRFAMs TIGR02801. Members always are encoded next to a protein designated ExbB (TIGR02797), related to the TolQ family modeled by TIGRFAMs TIGR02796. ExbD and ExbB together form a proton channel through which they can harness the proton-motive force to energize TonB, which in turn energizes TonB-dependent receptors in the outer membrane. TonB-dependent receptors with known specificity tend to import siderophores or vitamin B12. A TonB system and Tol-Pal system often will co-exist in a single bacterial genome.	121
131852	TIGR02805	exbB2	tonB-system energizer ExbB, group 2. Members of this protein family appear to be the ExbB protein of an ExbBD proton-transporting membrane complex that, by means of TonB, energizes transport by TonB-dependent receptors. Note that this family represents one of at least two distinct groups of TolQ homologs designated ExbB - see also TIGR02797. Each group associates with a distinct group of ExbD proteins, and a single species may have two ExbB/ExbD/TonB systems. [Transport and binding proteins, Cations and iron carrying compounds]	138
131853	TIGR02806	clostrip	clostripain. Clostripain is a cysteine protease characterized from Clostridium histolyticum, and also known from Clostridium perfringens. It is a heterodimer processed from a single precursor polypeptide, specific for Arg-|-Xaa peptide bonds. The older term alpha-clostripain refers to the most active, most reduced form, rather than to the product of one of several different genes. Clostripain belongs to the peptidase family C11, or clostripain family (see pfam03415). [Protein fate, Degradation of proteins, peptides, and glycopeptides, Cellular processes, Pathogenesis]	476
274308	TIGR02807	cas6_cmx6	CRISPR-associated protein Cas6, subtype MYXAN. Members of this protein family resemble the Cas6 proteins described by TIGR01877 in having a C-terminal motif GXGXXXXXGXG, where the single X of each GXG is hydrophobic and the spacer XXXXX has at least one Lys or Arg. Examples are found in cas gene operons of CRISPR regions in Anabaena variabilis ATCC 29413, Leptospira interrogans, Gemmata obscuriglobus UQM 2246, and twice in Myxococcus xanthus DK 1622. Oddly, an orphan member is found in Thiobacillus denitrificans ATCC 25259, whose genome does not seem to contain other evidence of CRISPR repeats or cas genes.	190
131855	TIGR02808	short_TIGR02808	TIGR02808 family protein. This very small protein (about 46 amino acids) consists largely of a single predicted membrane-spanning region. It is found in Photobacterium profundum SS9 and in three species of Vibrio, always near periplasmic nitrate reductase genes, but far from the periplasmic nitrate reductase genes in Aeromonas hydrophila ATCC7966. [Hypothetical proteins, Conserved]	42
131856	TIGR02809	phasin_3	phasin family protein. Members of this protein family are encoded in polyhydroxyalkanoic acid storage system regions in Vibrio, Photobacterium profundum SS9, Acinetobacter sp., Aeromonas hydrophila, and several species of Vibrio. Members appear distantly related to the phasin family proteins modeled by TIGR01841 and TIGR01985.	110
274309	TIGR02810	agaZ_gatZ	D-tagatose-bisphosphate aldolase, class II, non-catalytic subunit. Aldolases specific for D-tagatose-bisphosphate occur in distinct pathways in Escherichia coli and other bacteria, one for the degradation of galactitol (formerly dulcitol) and one for degradation of N-acetyl-galactosamine and D-galactosamine. This family represents a protein of both systems that behaves as a non-catalytic subunit of D-tagatose-bisphosphate aldolase, required both for full activity and for good stability of the aldolase. Note that members of this protein family appear in public databases annotated as putative tagatose 6-phosphate kinases, possibly in error. [Energy metabolism, Sugars]	420
274310	TIGR02811	formate_TAT	formate dehydrogenase region TAT target. Members of this uncharacterized protein family are all small, extending 70 or fewer residues from their respective likely start codons. All have the twin-arginine-dependent transport (TAT) signal sequence at the N-terminus and a conserved 20-residue C-terminal region that includes the motif Y-[HRK]-X-[TS]-X-H-[IV]-X-X-[YF]-Y. The TAT signal sequence suggests a bound cofactor. All members are encoded near genes for subunits of formate dehydrogenase, and may themselves be a subunit or accessory protein. [Unknown function, General]	66
163028	TIGR02812	fadR_gamma	fatty acid metabolism transcriptional regulator FadR. Members of this family are FadR, a transcriptional regulator of fatty acid metabolism, including both biosynthesis and beta-oxidation. It is found exclusively in a subset of Gammaproteobacteria, with strictly one copy per genome. It has an N-terminal DNA-binding domain and a less well conserved C-terminal long chain acyl-CoA-binding domain. FadR from this family heterologously expressed in Escherichia coli show differences in regulatory response and fatty acid binding profiles. The family is nevertheless designated equivalog, as all member proteins have at least nominally the same function. [Fatty acid and phospholipid metabolism, Biosynthesis, Fatty acid and phospholipid metabolism, Degradation, Regulatory functions, DNA interactions]	235
274311	TIGR02813	omega_3_PfaA	polyketide-type polyunsaturated fatty acid synthase PfaA. Members of the seed for this alignment are involved in omega-3 polyunsaturated fatty acid biosynthesis, such as the protein PfaA from the eicosapentaenoic acid biosynthesis operon in Photobacterium profundum strain SS9. PfaA is encoded together with PfaB, PfaC, and PfaD, and the functions of the individual polypeptides have not yet been described. More distant homologs of PfaA, also included within the reach of this model, appear to be involved in polyketide-like biosynthetic mechanisms of polyunsaturated fatty acid biosynthesis, an alternative to the more familiar iterated mechanism of chain extension and desaturation, and in most cases are encoded near genes for homologs of PfaB, PfaC, and/or PfaD.	2582
274312	TIGR02814	pfaD_fam	PfaD family protein. The protein PfaD is part of a four-gene locus, similar to polyketide biosynthesis systems, responsible for omega-3 polyunsaturated fatty acid biosynthesis in several high pressure and/or cold-adapted bacteria. Several other members of the seed alignment for this model are found in loci presumed to act in polyketide biosyntheses per se.	444
131862	TIGR02815	agaS_fam	putative sugar isomerase, AgaS family. Some members of this protein family are found in regions associated with N-acetyl-galactosamine and galactosamine utilization and are suggested to be isomerases.	372
131863	TIGR02816	pfaB_fam	PfaB family protein. The protein PfaB is part of a four-gene locus, similar to polyketide biosynthesis systems, responsible for omega-3 polyunsaturated fatty acid biosynthesis in several high pressure and/or cold-adapted bacteria. The fairly permissive trusted cutoff set for this model allows detection of homologs encoded near homologs to other proteins of the locus: PfaA, PfaC, and/or PfaD. The likely role in every case is either polyunsaturated fatty acid or polyketide biosynthesis.	538
274313	TIGR02817	adh_fam_1	zinc-binding alcohol dehydrogenase family protein. Members of this model form a distinct subset of the larger family of oxidoreductases that includes zinc-binding alcohol dehydrogenases and NADPH:quinone reductases (pfam00107). While some current members of this family carry designations as putative alginate lyase, it seems no sequence with a direct characterization as such is detected by this model. [Energy metabolism, Fermentation]	336
131865	TIGR02818	adh_III_F_hyde	S-(hydroxymethyl)glutathione dehydrogenase/class III alcohol dehydrogenase. The members of this protein family show dual function. First, they remove formaldehyde, a toxic metabolite, by acting as S-(hydroxymethyl)glutathione dehydrogenase (1.1.1.284). S-(hydroxymethyl)glutathione can form spontaneously from formaldehyde and glutathione, and so this enzyme previously was designated glutathione-dependent formaldehyde dehydrogenase. These same proteins are also designated alcohol dehydrogenase (EC 1.1.1.1) of class III, for activities that do not require glutathione; they tend to show poor activity for ethanol among their various substrate alcohols. [Cellular processes, Detoxification, Energy metabolism, Fermentation]	368
274314	TIGR02819	fdhA_non_GSH	formaldehyde dehydrogenase, glutathione-independent. Members of this family represent a distinct clade within the larger family of zinc-dependent dehydrogenases of medium chain alcohols, a family that also includes the so-called glutathione-dependent formaldehyde dehydrogenase. Members of this protein family have a tightly bound NAD that can act as a true cofactor, rather than a cosubstrate in dehydrogenase reactions, in dismutase reactions for some aldehydes. The name given to this family, however, is formaldehyde dehydrogenase, glutathione-independent. [Central intermediary metabolism, One-carbon metabolism]	393
131867	TIGR02820	formald_GSH	S-(hydroxymethyl)glutathione synthase. The formation of S-(hydroxymethyl)glutathione from glutathione and formaldehyde occurs naturally, but this enzyme speeds its formation in some species as part of a pathway of formaldehyde detoxification. [Cellular processes, Detoxification, Central intermediary metabolism, One-carbon metabolism]	182
131868	TIGR02821	fghA_ester_D	S-formylglutathione hydrolase. This model describes a protein family from bacteria, yeast, and human, with a conserved critical role in formaldehyde detoxification as S-formylglutathione hydrolase (EC 3.1.2.12). Members in eukaryotes such as the human protein are better known as esterase D (EC 3.1.1.1), an enzyme with broad specificity, although S-formylglutathione hydrolase has now been demonstrated as well. [Cellular processes, Detoxification]	275
131869	TIGR02822	adh_fam_2	zinc-binding alcohol dehydrogenase family protein. Members of this model form a distinct subset of the larger family of oxidoreductases that includes zinc-binding alcohol dehydrogenases and NADPH:quinone reductases (pfam00107). The gene neighborhood of members of this family is not conserved and it appears that no members are characterized. The sequence of the family includes 6 invariant cysteine residues and one invariant histidine. [Energy metabolism, Fermentation]	329
274315	TIGR02823	oxido_YhdH	putative quinone oxidoreductase, YhdH/YhfP family. This model represents a subfamily of pfam00107 as defined by Pfam, a superfamily in which some members are zinc-binding medium-chain alcohol dehydrogenases while others are quinone oxidoreductases with no bound zinc. This subfamily includes proteins studied crystallographically for insight into function: YhdH from Escherichia coli and YhfP from Bacillus subtilis. Members bind NADPH or NAD, but not zinc. [Unknown function, Enzymes of unknown specificity]	323
274316	TIGR02824	quinone_pig3	putative NAD(P)H quinone oxidoreductase, PIG3 family. Members of this family are putative quinone oxidoreductases that belong to the broader superfamily (modeled by Pfam pfam00107) of zinc-dependent alcohol (of medium chain length) dehydrogenases and quinone oxidoreductases. The alignment shows no motif of conserved Cys residues as are found in zinc-binding members of the superfamily, and members are likely to be quinone oxidoreductases instead. A member of this family in Homo sapiens, PIG3, is induced by p53 but is otherwise uncharacterized. [Unknown function, Enzymes of unknown specificity]	325
131872	TIGR02825	B4_12hDH	leukotriene B4 12-hydroxydehydrogenase/15-oxo-prostaglandin 13-reductase. Leukotriene B4 12-hydroxydehydrogenase is an NADP-dependent enzyme of arachidonic acid metabolism, responsible for converting leukotriene B4 to the much less active metabolite 12-oxo-leukotriene B4. The BRENDA database lists leukotriene B4 12-hydroxydehydrogenase as one of the synonyms of 2-alkenal reductase (EC 1.3.1.74), while 1.3.1.48 is 15-oxoprostaglandin 13-reductase.	325
274317	TIGR02826	RNR_activ_nrdG3	anaerobic ribonucleoside-triphosphate reductase activating protein. Members of this family represent a set of radical SAM enzymes related to, yet architecturally different from, the activating protein for the glycine radical-containing, oxygen-sensitive ribonucleoside-triphosphate reductase (RNR) as described in model TIGR02491. Members of this family are found paired with members of a similarly divergent set of anaerobic ribonucleoside-triphosphate reductases. Identification of this protein as an RNR activating protein is partly from pairing with a candidate RNR. It is further supported by our finding that upstream of these operons are examples of a conserved regulatory element (described by Rodionov and Gelfand) that is found in nearly all bacteria and that occurs specifically upstream of operons for all three classes of RNR genes. [Purines, pyrimidines, nucleosides, and nucleotides, 2'-Deoxyribonucleotide metabolism]	147
274318	TIGR02827	RNR_anaer_Bdell	anaerobic ribonucleoside-triphosphate reductase. Members of this family belong to the class III anaerobic ribonucleoside-triphosphate reductases (RNR). These glycine-radical-containing enzymes are oxygen-sensitive and operate under anaerobic conditions. The genes for this family are paired with genes for an activating protein that creates a glycine radical. Members of this family, though related, fall outside the scope of TIGR02487, a functionally equivalent protein set; no genome has members in both families. Identification as RNR is supported by gene pairing with the activating protein, lack of other anaerobic RNR, and presence of an upstream regulatory element strongly conserved upstream of most RNR operons. [Purines, pyrimidines, nucleosides, and nucleotides, 2'-Deoxyribonucleotide metabolism]	595
131875	TIGR02828	TIGR02828	putative membrane fusion protein. Members of this family show similarity to the members of TIGR00999, the membrane fusion protein (MFP) cluster 2 family, which is linked to RND transport systems. [Transport and binding proteins, Unknown substrate]	188
213743	TIGR02829	spore_III_AE	stage III sporulation protein AE. A comparative genome analysis of all sequenced genomes of shows a number of proteins conserved strictly among the endospore-forming subset of the Firmicutes. This protein, a member of this panel, is found in a spore formation operon and is designated stage III sporulation protein AE. [Cellular processes, Sporulation and germination]	381
274319	TIGR02830	spore_III_AG	stage III sporulation protein AG. A comparative genome analysis of all sequenced genomes of shows a number of proteins conserved strictly among the endospore-forming subset of the Firmicutes. This protein, a member of this panel, is found in a spore formation operon and is designated stage III sporulation protein AG. [Cellular processes, Sporulation and germination]	186
131878	TIGR02831	spo_II_M	stage II sporulation protein M. A comparative genome analysis of all sequenced genomes of shows a number of proteins conserved strictly among the endospore-forming subset of the Firmicutes. This predicted integral membrane protein is designated stage II sporulation protein M. [Cellular processes, Sporulation and germination]	200
131879	TIGR02832	spo_yunB	sporulation protein YunB. A comparative genome analysis of all sequenced genomes of shows a number of proteins conserved strictly among the endospore-forming subset of the Firmicutes. Mutation of this sigma E-regulated gene, designated yunB, has been shown to cause a sporulation defect. [Cellular processes, Sporulation and germination]	204
131880	TIGR02833	spore_III_AB	stage III sporulation protein AB. A comparative genome analysis of all sequenced genomes of shows a number of proteins conserved strictly among the endospore-forming subset of the Firmicutes. This protein, a member of this panel, is designated stage III sporulation protein AB. [Cellular processes, Sporulation and germination]	170
274320	TIGR02834	spo_ytxC	putative sporulation protein YtxC. This uncharacterized protein is part of a panel of proteins conserved in all known endospore-forming Firmicutes (low-GC Gram-positive bacteria), including Carboxydothermus hydrogenoformans, and nowhere else. [Cellular processes, Sporulation and germination]	276
131882	TIGR02835	spore_sigmaE	RNA polymerase sigma-E factor. Members of this family comprise the Firmicutes lineage endospore formation-specific sigma factor SigE, also called SpoIIGB and sigma-29. As characterized in Bacillus subtilis, this protein is synthesized as a precursor, specifically in the mother cell compartment, and must be cleaved by the SpoIIGA protein to be made active. [Transcription, Transcription factors, Cellular processes, Sporulation and germination]	234
131883	TIGR02836	spore_IV_A	stage IV sporulation protein A. A comparative genome analysis of all sequenced genomes of shows a number of proteins conserved strictly among the endospore-forming subset of the Firmicutes. This protein, a member of this panel, is designated stage IV sporulation protein A. It acts in the mother cell compartment and plays a role in spore coat morphogenesis. [Cellular processes, Sporulation and germination]	492
131884	TIGR02837	spore_II_R	stage II sporulation protein R. A comparative genome analysis of all sequenced genomes of shows a number of proteins conserved strictly among the endospore-forming subset of the Firmicutes. This protein, a member of this panel, is designated stage II sporulation protein R. [Cellular processes, Sporulation and germination]	168
131885	TIGR02838	spore_V_AC	stage V sporulation protein AC. This model describes stage V sporulation protein AC, a paralog of stage V sporulation protein AE. Both are proteins found to be present in a species if and only if that species is one of the Firmicutes capable of endospore formation, as of the time of the publication of the genome of Carboxydothermus hydrogenoformans. Mutants in spoVAC have a stage V sporulation defect. [Cellular processes, Sporulation and germination]	141
131886	TIGR02839	spore_V_AE	stage V sporulation protein AE. This model describes stage V sporulation protein AE, a paralog of stage V sporulation protein AC. Both are proteins found to be present in a species if and only if that species is one of the Firmicutes capable of endospore formation, as of the time of the publication of the genome of Carboxydothermus hydrogenoformans. Mutants in spoVAE have a stage V sporulation defect. [Cellular processes, Sporulation and germination]	114
274321	TIGR02840	spore_YtaF	putative sporulation protein YtaF. This protein family was identified, at the time of the publication of the Carboxydothermus hydrogenoformans genome, as having a phylogenetic profile that exactly matches the subset of the Firmicutes capable of forming endospores. The species include Bacillus anthracis, Clostridium tetani, Thermoanaerobacter tengcongensis, Geobacillus kaustophilus, etc. This protein, previously named YtaF, is therefore a putative sporulation protein. [Cellular processes, Sporulation and germination]	206
131888	TIGR02841	spore_YyaC	putative sporulation protein YyaC. A comparative genome analysis of all sequenced genomes of shows a number of proteins conserved strictly among the endospore-forming subset of the Firmicutes. This protein, also called YyaC, is a member of that panel and is otherwise uncharacterized. The second round of PSI-BLAST shows many similarities to the germination protease GPR, which is found in exactly the same set of organisms and has a known role in the sporulation/germination process. [Cellular processes, Sporulation and germination]	140
131889	TIGR02842	CyoC	cytochrome o ubiquinol oxidase, subunit III. Cytochrome o terminal oxidase complex is the component of the aerobic respiratory chain which reacts with oxygen, reducing it to water with the concomitant transport of 4 protons across the membrane. Also known as the cytochrome bo complex, cytochrome o ubiquinol oxidase contains four subunits, two heme b cofactors and a copper atom which is believed to be the oxygen active site. This complex is structurally related to the cytochrome caa3 oxidases which utilize cytochrome c as the reductant and contain heme a cofactors, as well as the intermediate form aa3 oxidases which also react directly with quinones as the reductant. [Energy metabolism, Electron transport]	180
131890	TIGR02843	CyoB	cytochrome o ubiquinol oxidase, subunit I. Cytochrome o terminal oxidase complex is the component of the aerobic respiratory chain which reacts with oxygen, reducing it to water with the concomitant transport of 4 protons across the membrane. Also known as the cytochrome bo complex, cytochrome o ubiquinol oxidase contains four subunits, two heme b cofactors and a copper atom which is believed to be the oxygen active site. This complex is structurally related to the cytochrome caa3 oxidases which utilize cytochrome c as the reductant and contain heme a cofactors, as well as the intermediate form aa3 oxidases which also react directly with quinones as the reductant. [Energy metabolism, Electron transport]	646
131891	TIGR02844	spore_III_D	sporulation transcriptional regulator SpoIIID. Members of this protein family are the transcriptional regulator SpoIIID, or stage III sporulation protein D. It is present in genomes if and only if the species is capable of endospore formation as occurs in the model species Bacillus subtilis. SpoIIID is a DNA binding protein that, in B. subtilis, downregulates many genes but also turns on ten genes. [Regulatory functions, DNA interactions, Cellular processes, Sporulation and germination]	80
188254	TIGR02845	spore_V_AD	stage V sporulation protein AD. Bacillus and Clostridium species contain about 10 % dipicolinic acid (pyridine-2,6-dicarboxylic acid) by weight. This protein family, SpoVAD, belongs to the spoVA operon that is suggested to act in the transport of dipicolinic acid (DPA) from the mother cell, where DPA is synthesized, to the forespore, a process essential to sporulation. Members of this protein family are found, so far, in exactly those species believed capable of endospore formation. [Cellular processes, Sporulation and germination]	327
131893	TIGR02846	spore_sigmaK	RNA polymerase sigma-K factor. The sporulation-specific transcription factor sigma-K (also called sigma-27) is expressed in the mother cell compartment of endospore-forming bacteria such as Bacillus subtilis. Like its close homolog sigma-E (sigma-29) (see TIGR02835), also specific to the mother cell compartment, it must be activated by a proteolytic cleavage. Note that in Bacillus subtilis (and apparently also Clostridium tetani), but not in other endospore forming species such as Bacillus anthracis, the sigK gene is generated by a non-germline (mother cell only) chromosomal rearrangement that recombines coding regions for the N-terminal and C-terminal regions of sigma-K. [Transcription, Transcription factors, Cellular processes, Sporulation and germination]	227
131894	TIGR02847	CyoD	cytochrome o ubiquinol oxidase subunit IV. Cytochrome o terminal oxidase complex is the component of the aerobic respiratory chain which reacts with oxygen, reducing it to water with the concomitant transport of 4 protons across the membrane. Also known as the cytochrome bo complex, cytochrome o ubiquinol oxidase contains four subunits, two heme b cofactors and a copper atom which is believed to be the oxygen active site. This complex is structurally related to the cytochrome caa3 oxidases which utilize cytochrome c as the reductant and contain heme a cofactors, as well as the intermediate form aa3 oxidases which also react directly with quinones as the reductant. [Energy metabolism, Electron transport]	96
131895	TIGR02848	spore_III_AC	stage III sporulation protein AC. Members of this protein family are designated SpoIIIAC, part of the spoIIIA operon of sporulation genes whose mutant phenotype is linked to sporulation stage III. Members of this family are encoded by the genome of a species if and only if that species is capable of endospore formation, as in Bacillus subtilis. The molecular function of this small, probable integral membrane protein is unknown. [Cellular processes, Sporulation and germination]	64
131896	TIGR02849	spore_III_AD	stage III sporulation protein AD. Members of this family are the uncharacterized protein SpoIIIAD, part of the spoIIIA operon that acts at sporulation stage III as part of a cascade of events leading to endospore formation. Note that the start sites of members of this family as annotated tend to be variable; quite a few members have apparent homologous protein-coding regions continuing upstream of the first available start codon. The length of the alignment and the scoring cutoff thresholds for the model have been set to try to detect all valid members of the family, even if annotation of the start site begins too far downstream. [Cellular processes, Sporulation and germination]	101
131897	TIGR02850	spore_sigG	RNA polymerase sigma-G factor. Members of this family comprise the Firmicutes lineage endospore formation-specific sigma factor SigG. It is also designated stage III sporulation protein G (SpoIIIG). This protein is rather closely related to sigma-F (SpoIIAC), another sporulation sigma factor. [Transcription, Transcription factors, Cellular processes, Sporulation and germination]	254
131898	TIGR02851	spore_V_T	stage V sporulation protein T. Members of this protein family are the stage V sporulation protein T (SpoVT), a protein of the sporulation/germination program in Bacillus subtilis and related species. The amino-terminal 50 amino acids are nearly perfectly conserved across all endospore-forming bacteria. SpoVT is a DNA-binding transcriptional regulator related to AbrB (See pfam04014). [Regulatory functions, DNA interactions, Cellular processes, Sporulation and germination]	180
188255	TIGR02852	spore_dpaB	dipicolinic acid synthetase, B subunit. Members of this family represent the B subunit of dipicolinic acid synthetase, an enzyme that synthesizes a small molecule that appears to confer heat stability to bacterial endospores such as those of Bacillus subtilis. The A and B subunits are together in what was originally designated the spoVF locus for stage V of endospore formation. [Cellular processes, Sporulation and germination]	187
131900	TIGR02853	spore_dpaA	dipicolinic acid synthetase, A subunit. This predicted Rossmann fold-containing protein is the A subunit of dipicolinic acid synthetase as found in most, though not all, endospore-forming low-GC Gram-positive bacteria; it is absent in Clostridium. The B subunit is represented by TIGR02852. This protein is also known as SpoVFA. [Cellular processes, Sporulation and germination]	287
131901	TIGR02854	spore_II_GA	sigma-E processing peptidase SpoIIGA. Members of this protein family are the stage II sporulation protein SpoIIGA. This protein acts as an activating protease for Sigma-E, one of several specialized sigma factors of the sporulation process in Bacillus subtilis and related endospore-forming bacteria. [Cellular processes, Sporulation and germination]	288
274322	TIGR02855	spore_yabG	sporulation peptidase YabG. Members of this family are the protein YabG, demonstrated for Bacillus subtilis to be an endopeptidase able to release N-terminal peptides from a number of sporulation proteins, including CotT, CotF, and SpoIVA. It appears to be expressed under control of sigma-K. [Cellular processes, Sporulation and germination]	283
131903	TIGR02856	spore_yqfC	sporulation protein YqfC. This small protein, designated YqfC in Bacillus subtilis, is both restricted to and universal in sporulating species of the Firmicutes, such as Bacillus subtilis and Clostridium perfringens. It is part of the sigma(E)-controlled regulon, and its mutation leads to a sporulation defect. [Cellular processes, Sporulation and germination]	85
274323	TIGR02857	CydD	thiol reductant ABC exporter, CydD subunit. The gene pair cydCD encodes an ABC-family transporter in which each gene contains an N-terminal membrane-spanning domain (pfam00664) and a C-terminal ATP-binding domain (pfam00005). In E. coli these genes were discovered as mutants which caused the terminal heme-copper oxidase complex cytochrome bd to fail to assemble. Recent work has shown that the transporter is involved in export of redox-active thiol compounds such as cysteine and glutathione. The linkage to assembly of the cytochrome bd complex is further supported by the conserved operon structure found outside the gammaproteobacteria (cydABCD) containing both the transporter and oxidase gene components. The genes used as the seed members for this model are all either found in the gammaproteobacterial context or the CydABCD context. All members of this family scoring above trusted at the time of its creation were from genomes which encode a cytochrome bd complex. Unfortunately, the gene symbol nomenclature adopted based on this operon in B. subtilis assigns cydC to the third gene in the operon where this gene is actually homologous to the E. coli cydD gene. We have chosen to name all homologs in this family in accordance with the precedence of publication of the E. coli name, CydD.	529
274324	TIGR02858	spore_III_AA	stage III sporulation protein AA. Members of this protein family are the stage III sporulation protein AA, encoded by one of several genes in the spoIIIA locus. It seems that this protein is found in a species if and only if that species is capable of endospore formation. [Cellular processes, Sporulation and germination]	270
131906	TIGR02859	spore_sigH	RNA polymerase sigma-H factor. Members of this protein family are RNA polymerase sigma-H factor for sporulation in endospore-forming bacteria. This protein is also called Sigma-30 and SigH. Although rather close homologs in Listeria score well against this model, Listeria does not form spores and the role of the related sigma factor in that genus is in doubt. [Transcription, Transcription factors, Cellular processes, Sporulation and germination]	198
274325	TIGR02860	spore_IV_B	stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else. [Cellular processes, Sporulation and germination]	402
274326	TIGR02861	SASP_H	small acid-soluble spore protein, H-type. This model is derived from pfam08141 but has been expanded to include in the seed corresponding proteins from three species of Clostridium. Members of this family should occur only in endospore-forming bacteria, typically with two members per genome, but may be absent from the genomes of some endospore-forming bacteria. SspH (previously designated YfjU) was shown to be expressed specifically in spores of Bacillus subtilis. [Cellular processes, Sporulation and germination]	58
163046	TIGR02862	spore_BofA	pro-sigmaK processing inhibitor BofA. Members of this protein family are found only in endospore-forming bacteria, such as Bacillus subtilis and Clostridium tetani. Among such bacteria, it appears only Symbiobacterium thermophilum lacks a member of this family. The protein, designated BofA, is an integral membrane protein that regulates the proteolytic activation of the RNA polymerase sigma factor K. [Cellular processes, Sporulation and germination]	83
131910	TIGR02863	spore_sspJ	small, acid-soluble spore protein, SspJ. New small, acid-soluble proteins unique to spores of Bacillus subtilis. [Cellular processes, Sporulation and germination]	47
274327	TIGR02864	spore_sspO	small, acid-soluble spore protein O. This model represents a minor (low-abundance) spore protein, designated SspO. It is found in a very limited subset of the already small group of endospore-forming bacteria, but these species include Oceanobacillus iheyensis, Geobacillus kaustophilus, Bacillus subtilis, B. halodurans, and B. cereus. This protein was previously called CotK. [Cellular processes, Sporulation and germination]	50
274328	TIGR02865	spore_II_E	stage II sporulation protein E. Stage II sporulation protein E (SpoIIE) is a multiple membrane spanning protein with two separable functions. It plays a role in the switch to polar cell division during sporulation. By means of its protein phosphatase activity, located in the C-terminal region, it activates sigma-F. All proteins that score above the trusted cutoff to this model are found in endospore-forming Gram-positive bacteria. Surprisingly, a sequence from the Cyanobacterium-like (and presumably non-spore-forming) photosynthesizer Heliobacillus mobilis is homologous, and scores between the trusted and noise cutoffs. [Cellular processes, Sporulation and germination]	764
274329	TIGR02866	CoxB	cytochrome c oxidase, subunit II. Cytochrome c oxidase is the terminal electron acceptor of mitochondria (and one of several possible acceptors in prokaryotes) in the electron transport chain of aerobic respiration. The enzyme couples the oxidation of reduced cytochrome c with the reduction of molecular oxygen to water. This process results in the pumping of four protons across the membrane which are used in the proton gradient powered synthesis of ATP. The oxidase contains two heme a cofactors and three copper atoms as well as other bound ions. [Energy metabolism, Electron transport]	199
274330	TIGR02867	spore_II_P	stage II sporulation protein P. Stage II sporulation protein P is a protein of the endospore formation program in a number of lineages in the Firmicutes (low-GC Gram-positive bacteria). It is expressed in the mother cell compartment, under control of Sigma-E. SpoIIP, along with SpoIIM and SpoIID, is one of three major proteins involved in engulfment of the forespore by the mother cell. This protein family is named for the single member in Bacillus subtilis, although most sporulating bacteria have two members. [Cellular processes, Sporulation and germination]	196
274331	TIGR02868	CydC	thiol reductant ABC exporter, CydC subunit. The gene pair cydCD encodes an ABC-family transporter in which each gene contains an N-terminal membrane-spanning domain (pfam00664) and a C-terminal ATP-binding domain (pfam00005). In E. coli these genes were discovered as mutants which caused the terminal heme-copper oxidase complex cytochrome bd to fail to assemble. Recent work has shown that the transporter is involved in export of redox-active thiol compounds such as cysteine and glutathione. The linkage to assembly of the cytochrome bd complex is further supported by the conserved operon structure found outside the gammaproteobacteria (cydABCD) containing both the transporter and oxidase gene components. The genes used as the seed members for this model are all either found in the gammaproteobacterial context or the CydABCD context. All members of this family scoring above trusted at the time of its creation were from genomes which encode a cytochrome bd complex.	530
213747	TIGR02869	spore_SleB	spore cortex-lytic enzyme. Members of this protein family are the spore cortex-lytic enzyme SleB from Bacillus subtilis and other Gram-positive, endospore-forming bacterial species. This protein is stored in an inactive form in the spore and activated during germination. [Cellular processes, Sporulation and germination]	200
274332	TIGR02870	spore_II_D	stage II sporulation protein D. Stage II sporulation protein D (SpoIID) is a protein of the endospore formation program in a number of lineages in the Firmicutes (low-GC Gram-positive bacteria). It is expressed in the mother cell compartment, under control of Sigma-E. SpoIID, along with SpoIIM and SpoIIP, is one of three major proteins involved in engulfment of the forespore by the mother cell. [Cellular processes, Sporulation and germination]	338
274333	TIGR02871	spore_ylbJ	sporulation integral membrane protein YlbJ. Members of this protein family are found exclusively in Firmicutes (low-GC Gram-positive bacteria) and are known from studies in Bacillus subtilis to be part of the sigma-E regulon. Mutation leads to a sporulation defect, confirming that members of this protein family, YlbJ, are sporulation proteins. This protein appears to be universal among endospore-forming bacteria, but is encoded by a pair of ORFs distant from each other in Symbiobacterium thermophilum IAM14863. [Cellular processes, Sporulation and germination]	362
274334	TIGR02872	spore_ytvI	sporulation integral membrane protein YtvI. Three lines of evidence show this protein to be involved in sporulation. First, it is under control of a sporulation-specific sigma factor, sigma-E. Second, mutation leads to a sporulation defect. Third, it is found in exactly those genomes whose bacteria are capable of sporulation, except for being absent in Clostridium acetobutylicum ATCC824. This protein has extensive hydrophobic regions and is likely an integral membrane protein. [Cellular processes, Sporulation and germination]	341
131920	TIGR02873	spore_ylxY	probable sporulation protein, polysaccharide deacetylase family. Members of this protein family are most closely related to TIGR02764, a subset of polysaccharide deacetylase family proteins found in a species if and only if the species forms endospores like those of Bacillus subtilis or Clostridium tetani. This family is likewise restricted to spore-formers, but is not universal among them in having sequences with full-length matches to the model. [Energy metabolism, Biosynthesis and degradation of polysaccharides, Cellular processes, Sporulation and germination]	268
131921	TIGR02874	spore_ytfJ	sporulation protein YtfJ. Members of this protein family, exemplified by YtfJ of Bacillus subtilis, are encoded by bacterial genomes if and only if the species is capable of endospore formation. YtfJ was confirmed in spores of Bacillus subtilis; it appears to be expressed in the forespore under control of SigF. [Cellular processes, Sporulation and germination]	125
131922	TIGR02875	spore_0_A	sporulation transcription factor Spo0A. Spo0A, the stage 0 sporulation protein A, is a transcription factor critical for the initiation of sporulation. It contains a response regulator receiver domain (pfam00072). In Bacillus subtilis, it works together with response regulator Spo0F and the phosphotransferase Spo0B, both of which are missing from at least some sporulating species and thus not part of the endospore forming bacteria minimal gene set. Spo0A, however, is universal among endospore-forming species. [Cellular processes, Sporulation and germination]	262
274335	TIGR02876	spore_yqfD	sporulation protein YqfD. YqfD is part of the sigma-E regulon in the sporulation program of endospore-forming Gram-positive bacteria. Mutation results in a sporulation defect in Bacillus subtilis. Members are found in all currently known endospore-forming bacteria, including the genera Bacillus, Symbiobacterium, Carboxydothermus, Clostridium, and Thermoanaerobacter. [Cellular processes, Sporulation and germination]	382
274336	TIGR02877	spore_yhbH	sporulation protein YhbH. This protein family, typified by YhbH in Bacillus subtilis, is found in nearly every endospore-forming bacterium and in no other genome (but note that the trusted cutoff score is set high to exclude a single high-scoring sequence from Nitrosococcus oceani ATCC 19707, which is classified in the Gammaproteobacteria). The gene in Bacillus subtilis was shown to be in the regulon of the sporulation sigma factor, sigma-E, and its mutation was shown to create a sporulation defect. [Cellular processes, Sporulation and germination]	371
131925	TIGR02878	spore_ypjB	sporulation protein YpjB. Members of this protein family, YpjB, are restricted to a subset of endospore-forming bacteria, including Bacillus species but not Clostridium or some others. In Bacillus subtilis, ypjB was found to be part of the sigma-E regulon, where sigma-E is a sporulation sigma factor that regulates expression in the mother cell compartment. Null mutants of ypjB show a sporulation defect. This protein family is not, however, a part of the endospore formation minimal gene set. [Cellular processes, Sporulation and germination]	233
200217	TIGR02880	cbbX_cfxQ	probable Rubisco expression protein CbbX. Proteins in this family are now designated CbbX. Some previously were CfxQ (carbon fixation Q). Its gene is often found immediately downstream of the Rubisco large and small chain genes, and it is suggested to be necessary for Rubisco expression. CbbX has been shown to be necessary for photoautotrophic growth. This protein belongs to the larger family of pfam00004, ATPase family Associated with various cellular Activities. Within that larger family, members of this family are most closely related to the stage V sporulation protein K, or SpoVK, in endospore-forming bacteria such as Bacillus subtilis.	284
163057	TIGR02881	spore_V_K	stage V sporulation protein K. Members of this protein family are the stage V sporulation protein K (SpoVK), a close homolog of the Rubisco expression protein CbbX (TIGR02880) and a member of the ATPase family associated with various cellular activities (pfam00004). Members are strictly limited to bacterial endospore-forming species, but are not universal in this group and are missing from the Clostridium group. [Cellular processes, Sporulation and germination]	261
131928	TIGR02882	QoxB	cytochrome aa3 quinol oxidase, subunit I. This family (QoxB) encodes subunit I of the aa3-type quinone oxidase, one of several bacterial terminal oxidases. This complex couples oxidation of reduced quinones with the reduction of molecular oxygen to water and the pumping of protons to form a proton gradient utilized for ATP production. aa3-type oxidases contain two heme a cofactors as well as copper atoms in the active site. [Energy metabolism, Electron transport]	643
274337	TIGR02883	spore_cwlD	N-acetylmuramoyl-L-alanine amidase CwlD. Members of this protein family are the CwlD family of N-acetylmuramoyl-L-alanine amidase. This family has been called the germination-specific N-acetylmuramoyl-L-alanine amidase. CwlD is required, along with the putative deacetylase PdaA, to make muramic delta-lactam, a novel peptidoglycan constituent found only in spores. CwlD mutants show a germination defect. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan, Cellular processes, Sporulation and germination]	189
131930	TIGR02884	spore_pdaA	delta-lactam-biosynthetic de-N-acetylase. Muramic delta-lactam is an unusual constituent of peptidoglycan, found only in bacterial spores in the peptidoglycan wall, or spore cortex. The proteins in this family are PdaA (yfjS), a member of a larger family of polysaccharide deacetylases, and are specifically involved in delta-lactam biosynthesis. PdaA acts immediately after CwlD, an N-acetylmuramoyl-L-alanine amidase, and performs a de-N-acetylation. PdaA may also perform the following transpeptidation for lactam ring formation, as heterologous expression in E. coli of CwlD and PdaA together is sufficient for delta-lactam production. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan, Cellular processes, Sporulation and germination]	224
131931	TIGR02885	spore_sigF	RNA polymerase sigma-F factor. Members of this protein family are the RNA polymerase sigma factor F. Sigma-F is specifically and universally a component of the Firmicutes lineage endospore formation program, and is expressed in the forespore to turn on expression of dozens of genes. It is closely homologous to sigma-G, which is also expressed in the forespore. [Transcription, Transcription factors, Cellular processes, Sporulation and germination]	231
131932	TIGR02886	spore_II_AA	anti-sigma F factor antagonist. The anti-sigma F factor antagonist, also called stage II sporulation protein AA, is a protein universal among endospore-forming bacteria, all of which belong to the Firmicutes. [Regulatory functions, Protein interactions, Cellular processes, Sporulation and germination]	106
274338	TIGR02887	spore_ger_x_C	germination protein, Ger(x)C family. Members of this protein family are restricted to endospore-forming members of the Firmicutes lineage of bacteria, including the genera Bacillus, Clostridium, Thermoanaerobacter, Carboxydothermus, etc. Members are nearly all predicted lipoproteins and belong to probable transport operons, some of which have been characterized as crucial to germination in response to alanine. Members typically have been given gene symbols gerKC, gerAC, gerYC, etc. [Transport and binding proteins, Amino acids, peptides and amines, Cellular processes, Sporulation and germination]	371
274339	TIGR02888	spore_YlmC_YmxH	sporulation protein, YlmC/YmxH family. Members of this family belong to the broader family of PRC-barrel domain proteins (see pfam05239), but are found only in endospore-forming bacteria of the Firmicutes lineage. Most such species have exactly two members of this family and all have at least one; the function is unknown. One of two members from Bacillus subtilis, YmxH, is strongly induced by the mother cell-specific sigma-E factor. [Cellular processes, Sporulation and germination]	76
131935	TIGR02889	spore_YpeB	germination protein YpeB. Members of this family are YpeB, a protein usually encoded with the putative spore-cortex-lytic enzyme SleB and required, together with SleB, for normal germination. This family is restricted to endospore-forming species in the Firmicutes lineage of bacteria, and found in all such species to date except Clostridium perfringens. The matching phenotypes of mutants in SleB (called a lytic transglycosylase) and YpeB suggest that YpeB is necessary to allow SleB to function. [Cellular processes, Sporulation and germination]	435
274340	TIGR02890	bacill_yteA	regulatory protein, yteA family. Members of this predicted regulatory protein family are found only in endospore-forming members of the Firmicutes group of bacteria, and in nearly every such species; Clostridium perfringens seems to be an exception. The member from Bacillus subtilis, the model system for the study of the sporulation program, has been designated both yteA and yzwB. Some (but not all) members of this family show a strong sequence match to Pfam family pfam01258, the C4-type zinc finger protein, DksA/TraR family, but only one of the four key Cys residues is conserved. All members of this protein family share an additional C-terminal domain. Smaller proteins from the proteobacteria with just the N-terminal domain, including DksA and DksA2, are RNA polymerase-binding regulatory proteins even if the Zn-binding site is not conserved. [Unknown function, General]	159
213748	TIGR02891	CtaD_CoxA	cytochrome c oxidase, subunit I. This large family represents subunit I's (CtaD, CoxA, CaaA) of cytochrome c oxidases of bacterial origin. Cytochrome c oxidase is the component of the respiratory chain that catalyzes the reduction of oxygen to water. Subunits I-III form the functional core of the enzyme complex. Subunit I is the catalytic subunit of the enzyme. Electrons originating in cytochrome c are transferred via the copper A center of subunit II and heme a of subunit I to the bimetallic center formed by heme a3 and copper B. This cytochrome c oxidase shows proton pump activity across the membrane in addition to the electron transfer. In the bacilli an apparent split (paralogism) has created a sister clade (TIGR02882) encoding subunits (QoxA) of the aa3-type quinone oxidase complex which reacts directly with quinones, bypassing the interaction with soluble cytochrome c. This model attempts to exclude these sequences, placing them between the trusted and noise cutoffs. These families, as well as archaeal and eukaryotic cytochrome c subunit I's are included within the superfamily model, pfam00115. [Energy metabolism, Electron transport]	499
274341	TIGR02892	spore_yabP	sporulation protein YabP. Members of this protein family are the YabP protein of the bacterial sporulation program, as found in Bacillus subtilis, Clostridium tetani, and other spore-forming members of the Firmicutes. In Bacillus subtilis, a yabP single mutant appears to sporulate and germinate normally, but yabP is in an operon with yabQ (essential for formation of the spore cortex), is near-universal among endospore-forming bacteria, and is found nowhere else. It is likely, therefore, that YabP does have a function in sporulation or germination, one that is either unappreciated or partially redundant with that of another protein. [Cellular processes, Sporulation and germination]	85
131939	TIGR02893	spore_yabQ	spore cortex biosynthesis protein YabQ. YabQ, a protein predicted to span the membrane several times, is found in exactly those genomes whose species perform sporulation in the style of Bacillus subtilis, Clostridium tetani, and others of the Firmicutes. Mutation of this sigma(E)-dependent gene blocks development of the spore cortex. The length of the C-terminal region, including some hydrophobic regions, is rather variable between members. [Cellular processes, Sporulation and germination]	130
274342	TIGR02894	DNA_bind_RsfA	transcription factor, RsfA family. In a subset of endospore-forming members of the Firmicutes, members of this protein family are found, several to a genome. Two very strongly conserved sequence regions are separated by a highly variable linker region. Much of the linker region was excised from the seed alignment for this model. A characterized member is the prespore-specific transcription factor RsfA from Bacillus subtilis, previously called YwfN, which is controlled by sigma factor F and seems to fine-tune expression of some genes in the sigma-F regulon. A paralog in Bacillus subtilis is designated YlbO. [Regulatory functions, DNA interactions, Cellular processes, Sporulation and germination]	161
131941	TIGR02895	spore_sigI	RNA polymerase sigma-I factor. Members of this sigma factor protein family are strictly limited to endospore-forming species in the Firmicutes lineage of bacteria, but are not universally present among such species. Sigma-I was shown to be induced by heat shock in Bacillus subtilis and is suggested by its phylogenetic profile to be connected to the program of sporulation. [Transcription, Transcription factors, Cellular processes, Sporulation and germination]	218
131942	TIGR02896	spore_III_AF	stage III sporulation protein AF. This family represents the stage III sporulation protein AF of the bacterial endospore formation program, which exists in some but not all members of the Firmicutes (formerly called low-GC Gram-positives). The C-terminal region of this protein is poorly conserved, so only the N-terminal region, which includes two predicted transmembrane domains, is included in the seed alignment. [Cellular processes, Sporulation and germination]	106
131943	TIGR02897	QoxC	cytochrome aa3 quinol oxidase, subunit III. This family (QoxC) encodes subunit III of the aa3-type quinone oxidase, one of several bacterial terminal oxidases. This complex couples oxidation of reduced quinones with the reduction of molecular oxygen to water and the pumping of protons to form a proton gradient utilized for ATP production. aa3-type oxidases contain two heme a cofactors as well as copper atoms in the active site. [Energy metabolism, Electron transport]	190
131944	TIGR02898	spore_YhcN_YlaJ	sporulation lipoprotein, YhcN/YlaJ family. YhcN and YlaJ are predicted lipoproteins that have been detected as spore proteins but not vegetative proteins in Bacillus subtilis. Both appear to be expressed under control of the RNA polymerase sigma-G factor. The YlaJ-like members of this family have a low-complexity, strongly acidic 40-residue C-terminal domain that is not included in the seed alignment for this model. A portion of the low-complexity region between the lipoprotein signal sequence and the main conserved region of the protein family was also excised from the seed alignment. [Cellular processes, Sporulation and germination]	158
131945	TIGR02899	spore_safA	spore coat assembly protein SafA. SafA (YrbB) of Bacillus subtilis is a protein found at the interface of the spore cortex and spore coat, and is dependent on SpoVID for its localization. This model is based on the N-terminal LysM (lysin motif) domain (see Pfam model pfam01476) of SafA and, from several other spore-forming species, the protein with the most similar N-terminal region. However, this set of proteins differs greatly C-terminal to the LysM domain; blocks of 12-residue and 13-residue repeats are found in the Bacillus cereus group, tandem LysM domains in Thermoanaerobacter tengcongensis MB4, etc., one of which is found in most examples of endospore-forming bacteria. [Cellular processes, Sporulation and germination]	44
274343	TIGR02900	spore_V_B	stage V sporulation protein B. SpoVB is the stage V sporulation protein B of the bacterial endospore formation program in Bacillus subtilis and various other Firmicutes. It is nearly universal among endospore-formers. Paralogs with rather high sequence similarity to SpoVB exist, including YkvU in B. subtilis and a number of proteins in the genus Clostridium. [Cellular processes, Sporulation and germination]	488
200218	TIGR02901	QoxD	cytochrome aa3 quinol oxidase, subunit IV. This family (QoxD) encodes subunit IV of the aa3-type quinone oxidase, one of several bacterial terminal oxidases. This complex couples oxidation of reduced quinones with the reduction of molecular oxygen to water and the pumping of protons to form a proton gradient utilized for ATP production. aa3-type oxidases contain two heme a cofactors as well as copper atoms in the active site. [Energy metabolism, Electron transport]	94
131948	TIGR02902	spore_lonB	ATP-dependent protease LonB. Members of this protein family are LonB, a paralog of the ATP-dependent protease La (LonA, TIGR00763). LonB proteins are found strictly, and almost universally, in endospore-forming bacteria. This protease was shown, in Bacillus subtilis, to be expressed specifically in the forespore, during sporulation, under control of sigma(F). The lonB gene, despite location immediately upstream of lonA, was shown to be monocistronic. LonB appears able to act on sigma(H) for post-translation control, but lonB mutation did not produce an obvious sporulation defect under the conditions tested. Note that additional paralogs of LonA and LonB occur in the Clostridium lineage and this model selects only one per species as the protein that corresponds to LonB in B. subtilis. [Protein fate, Degradation of proteins, peptides, and glycopeptides, Cellular processes, Sporulation and germination]	531
274344	TIGR02903	spore_lon_C	ATP-dependent protease, Lon family. Members of this protein family resemble the widely distributed ATP-dependent protease La, also called Lon and LonA. It resembles even more closely LonB, which is a LonA paralog found in genomes if and only if the species is capable of endospore formation (as in Bacillus subtilis, Clostridium tetani, and select other members of the Firmicutes) and expressed specifically in the forespore compartment. Members of this family are restricted to a subset of spore-forming species, and are very likely to participate in the program of endospore formation. We propose the designation LonC. [Protein fate, Degradation of proteins, peptides, and glycopeptides, Cellular processes, Sporulation and germination]	615
274345	TIGR02904	spore_ysxE	spore coat protein YsxE. Members of this family are homologs of the Bacillus subtilis spore coat protein CotS. Members of this family, designated YsxE, are found only in the family Bacillaceae, from among the endospore-forming members of the Firmicutes branch of the Bacteria. As a rule, the ysxE gene is found immediately downstream of spoVID, a gene necessary for spore coat assembly. The protein has been shown to be part of the spore coat. [Cellular processes, Sporulation and germination]	309
131951	TIGR02905	spore_yutH	spore coat protein YutH. Members of this family are homologs of the Bacillus subtilis spore coat protein CotS. Members of this family, designated YutH, are found only in the family Bacillaceae from among the endospore-forming members of the Firmicutes branch of the Bacteria. [Cellular processes, Sporulation and germination]	313
131952	TIGR02906	spore_CotS	spore coat protein, CotS family. Members of this family include the spore coat proteins CotS and YtaA from Bacillus subtilis and, from other endospore-forming bacteria, homologs that are more closely related to these two than to the spore coat proteins YutH and YsxE. The CotS family is more broadly distributed than YutH or YsxE, but still is not universal among spore-formers. [Cellular processes, Sporulation and germination]	313
274346	TIGR02907	spore_VI_D	stage VI sporulation protein D. SpoVID, the stage VI sporulation protein D, is restricted to endospore-forming members of the bacteria, all of which are found among the Firmicutes. It is widely distributed but not quite universal in this group. Between well-conserved N-terminal and C-terminal domains is a poorly conserved, low-complexity region of variable length, rich enough in glutamic acid to cause spurious BLAST search results unless a filter is used. The seed alignment for this model was trimmed, in effect, by choosing member sequences in which these regions are relatively short. SpoVID is involved in spore coat assembly by the mother cell compartment late in the process of sporulation. [Cellular processes, Sporulation and germination]	338
131954	TIGR02908	CoxD_Bacillus	cytochrome c oxidase, subunit IVB. This model represents a small clade of cytochrome oxidase subunit IV's found in the Bacilli. [Energy metabolism, Electron transport]	110
131955	TIGR02909	spore_YkwD	uncharacterized protein, YkwD family. Members of this protein family represent a subset of those belonging to pfam00188 (SCP-like extracellular protein). Based on currently cuttoffs for this model, all member proteins are found in Bacteria capable of endospore formation. Members include a named but uncharacterized protein, YkwD of Bacillus subtilis. Only the C-terminal region is well-conserved and is included in the seed alignment for this model. Three members of this family have an N-terminal domain homologous to the spore coat assembly protein SafA.	127
131956	TIGR02910	sulfite_red_A	sulfite reductase, subunit A. Members of this protein family include the A subunit, one of three subunits, of the anaerobic sulfite reductase of Salmonella, and close homologs from various Clostridum species, where the three-gene neighborhood is preserved. Two such gene clusters are found in Clostridium perfringens, but it may be that these sets of genes correspond to the distinct assimilatory and dissimilatory forms as seen in Clostridium pasteurianum. Note that any one of these enzymes may have secondary substates such as NH2OH, SeO3(2-), and SO3(2-). Heterologous expression of the anaerobic sulfite reductase of Salmonella confers on Escherichia coli the ability to produce hydrogen sulfide gas from sulfite. [Central intermediary metabolism, Sulfur metabolism]	334
131957	TIGR02911	sulfite_red_B	sulfite reductase, subunit B. Members of this protein family include the B subunit, one of three subunits, of the anaerobic sulfite reductase of Salmonella, and close homologs from various Clostridium species, where the three-gene neighborhood is preserved. Two such gene clusters are found in Clostridium perfringens, but it may be that these sets of genes correspond to the distinct assimilatory and dissimilatory forms as seen in Clostridium pasteurianum. [Central intermediary metabolism, Sulfur metabolism]	261
131958	TIGR02912	sulfite_red_C	sulfite reductase, subunit C. Members of this protein family include the C subunit, one of three subunits, of the anaerobic sulfite reductase of Salmonella, and close homologs from various Clostridium species, where the three-gene neighborhood is preserved. Two such gene clusters are found in Clostridium perfringens, but it may be that these sets of genes correspond to the distinct assimilatory and dissimilatory forms as seen in Clostridium pasteurianum. Note that any one of these enzymes may have secondary substrates such as NH2OH, SeO3(2-), and SO3(2-). Heterologous expression of the anaerobic sulfite reductase of Salmonella confers on Escherichia coli the ability to produce hydrogen sulfide gas from sulfite. [Central intermediary metabolism, Sulfur metabolism]	314
131959	TIGR02913	HAF_rpt	probable extracellular repeat, HAF family. The model for this family detects a homology domain of about 40 amino acids. Member proteins always have at least two tandem copies and as many as seven. The spacing between repeats as defined here usually is four residues exactly. This repeat is named for a tripeptide motif HAF found in most members. Some member proteins are found in species with no outer membrane (archaea and Gram-positive bacteria) while others have C-terminal autotransporter domains that suggest that the repeat region is transported across the outer membrane. This domain seems likely to be an extracellular protein repeat.	39
274347	TIGR02914	EpsI_fam	EpsI family protein. In Methylobacillus sp strain 12S, EpsI is encoded immediately downstream of the multiple-membrane-spanning putative transporter EpsH, and is predicted to be a periplasmic protein involved in, but not required for, expression of the exopolysaccharide methanolan. In a number of other species, protein homologous to EpsI is encoded either next to EpsH or, more often, combined in a fused gene. We have proposed renaming EpsH, or the EpsHI fusion protein, to exosortase, based on its phylogenetic association with the PEP-CTERM proposed protein targeting signal. [Transport and binding proteins, Unknown substrate]	174
274348	TIGR02915	PEP_resp_reg	PEP-CTERM-box response regulator transcription factor. Members of this protein family share full-length homology with (but do not include) the acetoacetate metabolism regulatory protein AtoC (see SP|Q06065). These proteins have a Fis family DNA binding sequence (pfam02954), a response regulator receiver domain (pfam00072), and sigma-54 interaction domain (pfam00158). [Regulatory functions, DNA interactions]	445
274349	TIGR02916	PEP_his_kin	putative PEP-CTERM system histidine kinase. Members of this protein family have a novel N-terminal domain, a single predicted membrane-spanning helix, and a predicted cytosolic histidine kinase domain. We designate this protein PrsK, and its companion DNA-binding response regulator protein (TIGR02915) PrsR. These predicted signal-transducing proteins appear to enable enhancer-dependent transcriptional activation. The prsK gene is often associated with exopolysaccharide biosynthesis genes. [Protein fate, Protein and peptide secretion and trafficking, Signal transduction, Two-component systems]	679
274350	TIGR02917	PEP_TPR_lipo	putative PEP-CTERM system TPR-repeat lipoprotein. This protein family occurs strictly within a subset of Gram-negative bacterial species with the proposed PEP-CTERM/exosortase system, analogous to the LPXTG/sortase system common in Gram-positive bacteria. This protein occurs in a species if and only if a transmembrane histidine kinase (TIGR02916) and a DNA-binding response regulator (TIGR02915) also occur. The presence of tetratricopeptide repeats (TPR) suggests protein-protein interaction, possibly for the regulation of PEP-CTERM protein expression, since many PEP-CTERM proteins in these genomes are preceded by a proposed DNA binding site for the response regulator.	899
131964	TIGR02918	TIGR02918	accessory Sec system glycosylation protein GtfA. Members of this protein family are found only in Gram-positive bacteria of the Firmicutes lineage, including several species of Staphylococcus, Streptococcus, and Lactobacillus. Members are associated with glycosylation of serine-rich glycoproteins exported by the accessory Sec system. [Protein fate, Protein modification and repair]	500
274351	TIGR02919	TIGR02919	accessory Sec system glycosyltransferase GtfB. Members of this protein family are found only in Gram-positive bacteria of the Firmicutes lineage, including several species of Staphylococcus, Streptococcus, and Lactobacillus. [Protein fate, Protein modification and repair]	438
131966	TIGR02920	acc_sec_Y2	accessory Sec system translocase SecY2. Members of this family are restricted to the Firmicutes lineage (low-GC Gram-positive bacteria) and appear to be paralogous to, and much more divergent than, the preprotein translocase SecY. Members include the SecY2 protein of the accessory Sec system in Streptococcus gordonii, involved in export of the highly glycosylated platelet-binding protein GspB. [Protein fate, Protein and peptide secretion and trafficking]	395
131967	TIGR02921	PEP_integral	PEP-CTERM family integral membrane protein. Members of this protein family, found in eighteen genera so far, have a PEP-CTERM sequence at the carboxyl-terminus (see model TIGR02595), but are unusual among PEP-CTERM proteins in having multiple predicted transmembrane segments. The function is unknown. It is proposed that an exosortase (see TIGR02602), recognizes and cleaves PEP-CTERM proteins in a manner analogous to the cleavage of LPXTG proteins by sortase (see Haft, et al., 2006). In at least six species, a gene encoding what appears to be a dedicated (single target) exosortase is adjacent. In that subset, the PEP-CTERM motif takes the form VPEPxxWxL.	952
131968	TIGR02922	TIGR02922	TIGR02922 family protein. Two members of this family are found in Colwellia psychrerythraea 34H and one each in various other species of Colwellia and Shewanella. One member from C. psychrerythraea is of special interest because it is preceded by the same cis-regulatory site as a number of genes that have the PEP-CTERM domain described by TIGR02595. [Hypothetical proteins, Conserved]	67
274352	TIGR02923	AhaC	ATP synthase A1, C subunit. The A1/A0 ATP synthase is homologous to the V-type (V1/V0, vacuolar) ATPase, but functions in the ATP synthetic direction as does the F1/F0 ATPase of bacteria. The C subunit is part of the hydrophilic A1 "stalk" complex (AhaABCDEFG), which is the site of ATP generation and is coupled to the membrane-embedded proton translocating A0 complex.	343
274353	TIGR02924	ICDH_alpha	isocitrate dehydrogenase. This family of mainly alphaproteobacterial enzymes is a member of the isocitrate/isopropylmalate dehydrogenase superfamily described by pfam00180. Every member of the seed of this model appears to have a TCA cycle lacking only a determined isocitrate dehydrogenase. The precise identity of the cofactor (NADH -- 1.1.1.41 vs. NADPH -- 1.1.1.42) is unclear. [Energy metabolism, TCA cycle]	473
131971	TIGR02925	cis_trans_EpsD	peptidyl-prolyl cis-trans isomerase, EpsD family. Members of this family belong to the peptidyl-prolyl cis-trans isomerase family and are found in loci associated with exopolysaccharide biosynthesis. All members are encoded near a homolog of EpsH, as detected by TIGR02602.	232
131972	TIGR02926	AhaH	ATP synthase archaeal, H subunit. The A1/A0 ATP synthase is homologous to the V-type (V1/V0, vacuolar) ATPase, but functions in the ATP synthetic direction as does the F1/F0 ATPase of bacteria. The hydrophilic A1 "stalk" complex (AhaABCDEFG) is the site of ATP generation and is coupled to the membrane-embedded proton translocating A0 complex. It is unclear precisely where AhaH fits into these complexes.	85
200219	TIGR02927	SucB_Actino	2-oxoglutarate dehydrogenase, E2 component, dihydrolipoamide succinyltransferase. This model represents an Actinobacterial clade of E2 enzyme, a component of the 2-oxoglutarate dehydrogenase complex involved in the TCA cycle. These proteins have multiple domains including the catalytic domain (pfam00198), one or two biotin domains (pfam00364) and an E3-component binding domain (pfam02817).	579
274354	TIGR02928	TIGR02928	orc1/cdc6 family replication initiation protein. Members of this protein family are found exclusively in the archaea. This set of DNA binding proteins shows homology to the origin recognition complex subunit 1/cell division control protein 6 family in eukaryotes. Several members may be found in a genome and interact with each other. [DNA metabolism, DNA replication, recombination, and repair]	365
131975	TIGR02929	anfG_nitrog	Fe-only nitrogenase, delta subunit. Nitrogenase, also called dinitrogenase, is the enzyme of biological nitrogen fixation. The most wide-spread and most efficient nitrogenase contains a molybdenum cofactor. This protein family, AnfG, represents the delta subunit of the Fe-only alternative nitrogenase. It is homologous to VnfG, the delta subunit of the V-containing (vanadium) nitrogenase. [Central intermediary metabolism, Nitrogen fixation]	109
131976	TIGR02930	vnfG_nitrog	V-containing nitrogenase, delta subunit. Nitrogenase is the enzyme of biological nitrogen fixation. The most wide-spread and most efficient nitrogenase contains a molybdenum cofactor. This protein family, VnfG, represents the delta subunit of the V-containing (vanadium) alternative nitrogenase. It is homologous to AnfG, the delta subunit of the Fe-only nitrogenase. [Central intermediary metabolism, Nitrogen fixation]	109
131977	TIGR02931	anfK_nitrog	Fe-only nitrogenase, beta subunit. Nitrogenase is the enzyme of biological nitrogen fixation. The most wide-spread and most efficient nitrogenase contains a molybdenum cofactor. This protein family, AnfK, represents the beta subunit of the iron-only alternative nitrogenase. It is homologous to NifK and VnfK, of the molybdenum-containing and the vanadium (V)-containing types, respectively. [Central intermediary metabolism, Nitrogen fixation]	461
131978	TIGR02932	vnfK_nitrog	V-containing nitrogenase, beta subunit. Nitrogenase is the enzyme of biological nitrogen fixation. The most wide-spread and most efficient nitrogenase contains a molybdenum cofactor. This protein family, VnfK, represents the beta subunit of the vanadium (V)-containing alternative nitrogenase. It is homologous to NifK and AnfK, of the molybdenum-containing and the iron (Fe)-only types, respectively. [Central intermediary metabolism, Nitrogen fixation]	457
131979	TIGR02933	nifM_nitrog	nitrogen fixation protein NifM. Members of this protein family, found in a subset of nitrogen-fixing bacteria, are the nitrogen fixation protein NifM. NifM, homologous to peptidyl-prolyl cis-trans isomerases, appears to be an accessory protein for NifH, the Fe protein, also called component II or dinitrogenase reductase, of nitrogenase. [Central intermediary metabolism, Nitrogen fixation]	256
274355	TIGR02934	nifT_nitrog	probable nitrogen fixation protein FixT. This largely uncharacterized protein family is assigned a role in nitrogen fixation by two criteria. First, its gene occurs, generally, among genes essential for expression of active nitrogenase. Second, its phylogenetic profile closely matches that of nitrogen-fixing bacteria. However, mutational studies in Klebsiella pneumoniae failed to demonstrate any phenotype for deletion or overexpression of the protein.	68
131981	TIGR02935	TIGR02935	probable nitrogen fixation protein. Members of this protein family, called DUF269 by pfam03270, are strictly limited to nitrogen-fixing species, although not universal among them. The gene typically is found next to the nifX gene (see TIGRFAMs model TIGR02663). [Central intermediary metabolism, Nitrogen fixation]	140
274356	TIGR02936	fdxN_nitrog	ferredoxin III, nif-specific. Members of this family are homodimeric ferredoxins from nitrogen fixation regions of many nitrogen-fixing bacteria. As characterized in Rhodobacter capsulatus, these proteins are homodimeric, with two 4Fe-4S clusters bound per monomer. Although nif-specific, this protein family is not universal, as other nitrogenase systems may substitute flavodoxins, or different types of ferredoxin. [Central intermediary metabolism, Nitrogen fixation]	91
274357	TIGR02937	sigma70-ECF	RNA polymerase sigma factor, sigma-70 family. This model encompasses all varieties of the sigma-70 type sigma factors including the ECF subfamily. A number of sigma factors have names with a different number than 70 (i.e. sigma-38), but in fact, all except for the Sigma-54 family (TIGR02395) are included within this family. Several Pfam models hit segments of these sequences including Sigma-70 region 2 (pfam04542) and Sigma-70, region 4 (pfam04545), but not always above their respective trusted cutoffs.	158
131984	TIGR02938	nifL_nitrog	nitrogen fixation negative regulator NifL. NifL is a modulator of the nitrogen fixation positive regulator protein NifA, and is therefore a negative regulator. It binds NifA. NifA and NifL are encoded by adjacent genes. [Central intermediary metabolism, Nitrogen fixation, Regulatory functions, Protein interactions]	494
131985	TIGR02939	RpoE_Sigma70	RNA polymerase sigma factor RpoE. A sigma factor is a DNA-binding protein that binds to the DNA-directed RNA polymerase core to produce the holoenzyme capable of initiating transcription at specific sites. Different sigma factors act in vegetative growth, heat shock, extracytoplasmic functions (ECF), etc. This model represents the clade of sigma factors called RpoE. This protein may be called sigma-24, sigma-E factor, sigma-H factor, fecI-like sigma factor or alternative sigma factor AlgU.	190
274358	TIGR02940	anfO_nitrog	Fe-only nitrogenase accessory protein AnfO. Members of this protein family, called Anf1 in Rhodobacter capsulatus and AnfO in Azotobacter vinelandii, are found only in species with the Fe-only nitrogenase and are encoded immediately downstream of the structural genes in the above named species.	214
131987	TIGR02941	Sigma_B	RNA polymerase sigma-B factor. This sigma factor is restricted to certain lineages of the order Bacillales including Staphylococcus, Listeria, and Bacillus.	255
274359	TIGR02943	Sig70_famx1	RNA polymerase sigma-70 factor, TIGR02943 family. This group of sigma factors are members of the sigma-70 family (TIGR02937). They appear by homology, tree building, bidirectional best hits and one-to-a-genome distribution, to represent a conserved family.	188
131989	TIGR02944	suf_reg_Xantho	FeS assembly SUF system regulator, gammaproteobacterial. The SUF system is an oxygen-resistant iron-sulfur cluster assembly system found in both aerobes and facultative anaerobes. Its presence appears to be a marker of oxygen tolerance; strict anaerobes and microaerophiles tend to have different FeS cluster biosynthesis systems. Members of this protein family belong to the rrf2 family of transcriptional regulators and are found, typically, as the first gene of a SUF operon. It is found only in a subset of genomes that encode the SUF system, including the genus Xanthomonas. The conserved location suggests an autoregulatory role. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other, Regulatory functions, DNA interactions]	130
131990	TIGR02945	SUF_assoc	FeS assembly SUF system protein. Members of this family belong to the broader pfam01883, or Domain of Unknown Function DUF59. Many members of DUF59 are candidate ring hydroxylating complex subunits. However, members of the narrower family defined here all are found in genomes that carry the FeS assembly SUF system. For 70 % of these species, the member of this protein family is found as part of the SUF locus, usually immediately downstream of the sufS gene. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	99
274360	TIGR02946	acyl_WS_DGAT	acyltransferase, WS/DGAT/MGAT. This bacteria-specific protein family includes a characterized, homodimeric, broad specificity acyltransferase from Acinetobacter sp. strain ADP1, active as wax ester synthase, as acyl coenzyme A:diacylglycerol acyltransferase, and as acyl-CoA:monoacylglycerol acyltransferase. [Unknown function, Enzymes of unknown specificity]	446
131992	TIGR02947	SigH_actino	RNA polymerase sigma-70 factor, TIGR02947 family. This group of sigma factors are members of the sigma-70 family (TIGR02937). They appear by homology, tree building, bidirectional best hits and (with the exception of a paralog in Thermobifida fusca YX) one-to-a-genome distribution, to represent a conserved family. This family is restricted to the Actinobacteria and each gene examined is followed by an anti-sigma factor in an apparent operon.	193
131993	TIGR02948	SigW_bacill	RNA polymerase sigma-W factor. This sigma factor is restricted to certain lineages of the order Bacillales.	187
188261	TIGR02949	anti_SigH_actin	anti-sigma factor, TIGR02949 family. This group of anti-sigma factors are associated in an apparent operon with a family of sigma-70 family sigma factors (TIGR02947). They appear by homology, tree building, bidirectional best hits and one-to-a-genome distribution, to represent a conserved family. This family is restricted to the Actinobacteria. [Transcription, Transcription factors]	84
274361	TIGR02950	SigM_subfam	RNA polymerase sigma factor, SigM family. This family of RNA polymerase sigma factors is a member of the Sigma-70 subfamily (TIGR02937) and is restricted to certain lineages of the order Bacillales. This family encompasses at least two distinct sigma factors as two proteins are found in each of B. anthracis, B. subtilis subsp. subtilis str. 168, and B. licheniformis (although these are not apparently the same two in each). One of these is designated as SigM in B. subtilis (Swiss_Prot:	154
131996	TIGR02951	DMSO_dmsB	DMSO reductase, iron-sulfur subunit. This family consists of the iron-sulfur subunit, or chain B, of an enzyme called the anaerobic dimethyl sulfoxide reductase. Chains A and B are catalytic, while chain C is a membrane anchor.	161
131997	TIGR02952	Sig70_famx2	RNA polymerase sigma-70 factor, TIGR02952 family. This group of sigma factors are members of the sigma-70 family (TIGR02937). They appear by homology, tree building, bidirectional best hits and one-to-a-genome distribution, to represent a conserved family. This family is found in a limited number of Gram-positive bacterial lineages.	170
131998	TIGR02953	penta_MxKDx	pentapeptide MXKDX repeat protein. Members of this protein family are small bacterial proteins, each with an N-terminal signal sequence followed by up to 11 imperfect repeats of a pentapeptide. The pentapeptide repeat usually follows the form Met-Xaa-Lys-Asp-Xaa.	75
213752	TIGR02954	Sig70_famx3	RNA polymerase sigma-70 factor, TIGR02954 family. This group of sigma factors are members of the sigma-70 family (TIGR02937). They appear by homology, tree building, bidirectional best hits and one-to-a-genome distribution, to represent a conserved family. This family is found in certain Bacillus and Clostridium species.	169
132000	TIGR02955	TMAO_TorT	TMAO reductase system periplasmic protein TorT. Members of this family are the periplasmic protein TorT which, together with the TorS/TorR histidine kinase/response regulator system, regulates expression of the torCAD operon for trimethylamine N-oxide reductase (TMAO reductase). It appears to bind an inducer for TMAO reductase, and shows homology to a periplasmic D-ribose binding protein.	295
274362	TIGR02956	TMAO_torS	TMAO reductase system sensor TorS. This protein, TorS, is part of a regulatory system for the torCAD operon that encodes the pterin molybdenum cofactor-containing enzyme trimethylamine-N-oxide (TMAO) reductase (TorA), a cognate chaperone (TorD), and a penta-haem cytochrome (TorC). TorS works together with the inducer-binding protein TorT and the response regulator TorR. TorS contains histidine kinase ATPase (pfam02518), HAMP (pfam00672), phosphoacceptor (pfam00512), and phosphotransfer (pfam01627) domains and a response regulator receiver domain (pfam00072). [Signal transduction, Two-component systems]	968
132002	TIGR02957	SigX4	RNA polymerase sigma-70 factor, TIGR02957 family. This group of sigma factors are members of the sigma-70 family (TIGR02937). They appear by homology, tree building and bidirectional best hits, to represent a conserved family. This family is found in a limited number of bacterial lineages. This family includes apparent paralogous expansion in Streptomyces coelicolor A3(2), and multiple copies in Mycobacterium smegmatis MC2, Streptomyces avermitilis MA-4680 and Nocardia farcinica IFM10152.	281
132004	TIGR02959	SigZ	RNA polymerase sigma factor, SigZ family. This family of RNA polymerase sigma factors is a member of the Sigma-70 subfamily (TIGR02937). One of these is designated as SigZ in B. subtilis (Swiss_Prot: SIGZ_BACSU). Interestingly, this group has a very sporadic distribution, B. subtilis, for instance, being the only sequenced strain of Bacilli with a member. Dechloromonas aromatica RCB appears to have two of these sigma factors. A member appears on a plasmid found in Photobacterium profundum SS9 and Vibrio fischeri ES114 (where a second one is chromosomally encoded).	170
132005	TIGR02960	SigX5	RNA polymerase sigma-70 factor, TIGR02960 family. This group of sigma factors are members of the sigma-70 family (TIGR02937). They appear by homology, tree building, bidirectional best hits and one-to-a-genome distribution, to represent a conserved family.	324
274363	TIGR02961	allantoicase	allantoicase. Members of this family are the enzyme allantoicase (EC 3.5.3.4), also called allantoate amidinohydrolase. This enzyme hydrolyzes allantoate to (S)-ureidoglycolate and urea; it can also degrade (R)-ureidoglycolate to glyoxylate and urea. Allantoinase (EC 3.5.2.5) hydrolyzes (S)-allantoin (a xanthine metabolite, via urate) to allantoate. Allantoate can then be degraded either by this enzyme, allantoicase, or by allantoate deiminase (EC 3.5.3.9). Members of the seed alignment for this model were taken from BRENDA. Proteins in this family contain two copies of the allantoicase repeat (pfam03561). A different but similarly named enzyme, allantoate amidohydrolase (EC 3.5.3.9), simultaneously breaks down the urea to ammonia and carbon dioxide. [Purines, pyrimidines, nucleosides, and nucleotides, Other, Energy metabolism, Other]	322
274364	TIGR02962	hdxy_isourate	hydroxyisourate hydrolase. Members of this family, hydroxyisourate hydrolase, represent a distinct clade of transthyretin-related proteins. Bacterial members typically are encoded next to ureidoglycolate hydrolase and often near either xanthine dehydrogenase or xanthine/uracil permease genes and have been demonstrated to have hydroxyisourate hydrolase activity. In eukaryotes, a clade separate from the transthyretins (a family of thyroid-hormone binding proteins) has also been shown to have HIU hydrolase activity in urate catabolizing organisms. Transthyretin, then, would appear to be the recently diverged paralog of the more ancient HIUH family. [Purines, pyrimidines, nucleosides, and nucleotides, Other]	112
274365	TIGR02963	xanthine_xdhA	xanthine dehydrogenase, small subunit. Members of this protein family are the small subunit (or, in eukaryotes, the N-terminal domain) of xanthine dehydrogenase, an enzyme of purine catabolism via urate. The small subunit contains both an FAD and a 2Fe-2S cofactor. Aldehyde oxidase (retinal oxidase) appears to have arisen as a neofunctionalization among xanthine dehydrogenases in eukaryotes and [Purines, pyrimidines, nucleosides, and nucleotides, Other]	467
274366	TIGR02964	xanthine_xdhC	xanthine dehydrogenase accessory protein XdhC. Members of this protein family are the accessory protein XdhC for insertion of the molybdenum cofactor into the xanthine dehydrogenase large chain, XdhB, in bacteria. This protein is not part of the mature xanthine dehydrogenase. Xanthine dehydrogenase is an enzyme for purine catabolism, from other purines to xanthine to urate to further breakdown products. [Protein fate, Protein folding and stabilization, Purines, pyrimidines, nucleosides, and nucleotides, Other]	246
274367	TIGR02965	xanthine_xdhB	xanthine dehydrogenase, molybdopterin binding subunit. Members of the protein family are the molybdopterin-containing large subunit (or, in eukaryotes, the molybdopterin-binding domain) of xanthine dehydrogenase, an enzyme that reduces the purine pool by catabolizing xanthine to urate. This model is based primarily on bacterial sequences; it does not manage to include all eukaryotic xanthine dehydrogenases and thereby discriminate them from the closely related enzyme aldehyde dehydrogenase. [Purines, pyrimidines, nucleosides, and nucleotides, Other]	758
274368	TIGR02966	phoR_proteo	phosphate regulon sensor kinase PhoR. Members of this protein family are the regulatory histidine kinase PhoR associated with the phosphate ABC transporter in most Proteobacteria. Related proteins from Gram-positive organisms are not included in this model. The phoR gene usually is adjacent to the response regulator phoB gene (TIGR02154). [Signal transduction, Two-component systems]	333
132012	TIGR02967	guan_deamin	guanine deaminase. This model describes guanine deaminase, which hydrolyzes guanine to xanthine and ammonia. Xanthine can then be converted to urate by xanthine dehydrogenase, and urate subsequently degraded. In some bacteria, the guanine deaminase gene is found near the xdhABC genes for xanthine dehydrogenase. Non-homologous forms of guanine deaminase also exist, as well as distantly related forms outside the scope of this model. [Purines, pyrimidines, nucleosides, and nucleotides, Other]	401
274369	TIGR02968	succ_dehyd_anc	succinate dehydrogenase, hydrophobic membrane anchor protein. In E. coli and many other bacteria, two small, hydrophobic, mutually homologous subunits of succinate dehydrogenase, a TCA cycle enzyme, are SdhC and SdhD. This family is the SdhD, the hydrophobic membrane anchor protein. SdhC is apocytochrome b558, which also plays a role in anchoring the complex. [Energy metabolism, TCA cycle]	105
132014	TIGR02969	mam_aldehyde_ox	aldehyde oxidase. Members of this family are mammalian aldehyde oxidase (EC 1.2.3.1) isozymes, closely related to xanthine dehydrogenase/oxidase.	1330
274370	TIGR02970	succ_dehyd_cytB	succinate dehydrogenase, cytochrome b556 subunit. In E. coli and many other bacteria, two small, hydrophobic, mutually homologous subunits of succinate dehydrogenase, a TCA cycle enzyme, are SdhC and SdhD. This family is the SdhC, the cytochrome b subunit, called b556 in bacteria and b560 in mitochondria. SdhD (see TIGR02968) is called the hydrophobic membrane anchor subunit, although both SdhC and SdhD participate in anchoring the complex. In some bacteria, this cytochrome b subunit is replaced by a member of the cytochrome b558 family (see TIGR02046). [Energy metabolism, TCA cycle]	120
213754	TIGR02971	heterocyst_DevB	ABC exporter membrane fusion protein, DevB family. Members of this protein family are found mostly in the Cyanobacteria, but also in the Planctomycetes. DevB from Anabaena sp. strain PCC 7120 is partially characterized as a membrane fusion protein of the DevBCA ABC exporter, probably a glycolipid exporter, required for heterocyst formation. Most Cyanobacteria have one member only, but Nostoc sp. PCC 7120 has seven members.	327
132017	TIGR02972	TMAO_torE	trimethylamine N-oxide reductase system, TorE protein. Members of this small, apparent transmembrane protein are designated TorE and occur in operons for the trimethylamine N-oxide (TMAO) reductase system. Members are closely related to the NapE protein of the related periplasmic nitrate reductase system. It may be that TorE is an integral membrane subunit of a complex with the reductase TorA. [Energy metabolism, Anaerobic]	47
132018	TIGR02973	nitrate_rd_NapE	periplasmic nitrate reductase, NapE protein. NapE, homologous to TorE (TIGR02972), is a membrane protein of unknown function that is part of the periplasmic nitrate reductase system; it may be part of the enzyme complex. The periplasmic nitrate reductase allows for nitrate respiration in anaerobic conditions. [Energy metabolism, Anaerobic, Energy metabolism, Electron transport]	42
274371	TIGR02974	phageshock_pspF	psp operon transcriptional activator PspF. Members of this protein family are PspF, the sigma-54-dependent transcriptional activator of the phage shock protein (psp) operon, in Escherichia coli and numerous other species. The psp operon is induced by a number of stress conditions, including heat shock, ethanol, and filamentous phage infection. Changed com_name to adhere to TIGR role notes conventions. 09/15/06 - DMH [Regulatory functions, DNA interactions]	329
132020	TIGR02975	phageshock_pspG	phage shock protein G. This protein previously was designated yjbO in E. coli. It is found only in genomes that have the phage shock operon (psp), but only rarely is encoded near other psp genes. The psp regulon is upregulated in response to a number of stress conditions, including ethanol, expression of the filamentous phage secretin protein IV and other secretins, and heat shock. [Cellular processes, Adaptations to atypical conditions]	64
132021	TIGR02976	phageshock_pspB	phage shock protein B. This model describes the PspB protein of the psp (phage shock protein) operon, as found in Escherichia coli and many related species. Expression of a phage protein called secretin protein IV, and a number of other stresses including ethanol, heat shock, and defects in protein secretion trigger sigma-54-dependent expression of the phage shock regulon. PspB is both a regulator and an effector protein of the phage shock response. [Cellular processes, Adaptations to atypical conditions]	75
274372	TIGR02977	phageshock_pspA	phage shock protein A. Members of this family are the phage shock protein PspA, from the phage shock operon. This is a narrower family than the set of PspA and its homologs, sometimes several in a genome, as described by pfam04012. PspA appears to maintain the protonmotive force under stress conditions that include overexpression of certain phage secretins, heat shock, ethanol, and protein export defects. [Cellular processes, Adaptations to atypical conditions]	219
132023	TIGR02978	phageshock_pspC	phage shock protein C. All members of this protein family are the phage shock protein PspC. These proteins contain a PspC domain, as do other members of the larger family of proteins described by pfam04024. The phage shock regulon is restricted to the Proteobacteria and somewhat sparsely distributed there. It is expressed, under positive control of a sigma-54-dependent transcription factor, PspF, which binds and is modulated by PspA. Stresses that induce the psp regulon include phage secretin overexpression, ethanol, heat shock, and protein export defects. [Cellular processes, Adaptations to atypical conditions]	121
132024	TIGR02979	phageshock_pspD	phage shock protein PspD. Members of this family are phage shock protein PspD, found in a minority of bacteria that carry the defining genes of the phage shock regulon (pspA, pspB, pspC, and pspF). It is found in Escherichia coli, Yersinia pestis, and closely related species, where it is part of the phage shock operon. It is known to be expressed but its function is unknown. [Cellular processes, Adaptations to atypical conditions]	59
274373	TIGR02980	SigBFG	RNA polymerase sigma-70 factor, sigma-B/F/G subfamily. This group of similar sigma-70 factors includes clades found in Bacilli (including the sporulation factors SigF:TIGR02885 and SigG:TIGR02850 as well as SigB:TIGR02941), and the high GC gram positive bacteria (Actinobacteria) where a variable number of them are found depending on the lineage.	227
132026	TIGR02981	phageshock_pspE	phage shock operon rhodanese PspE. Members of this very narrowly defined protein family are proteins active as rhodanese (EC 2.8.1.1) and found in the extended variants of the phage shock protein (psp operon) in Escherichia coli and a few closely related species. Note that the designation phage shock protein PspE has been applied, incorrectly, in many instances where the genome lacks the phage shock regulon entirely.	101
274374	TIGR02982	heterocyst_DevA	ABC exporter ATP-binding subunit, DevA family. Members of this protein family are found mostly in the Cyanobacteria, but also in the Planctomycetes. Cyanobacterial examples are involved in heterocyst formation, by which some fraction of members of the colony undergo a developmental change and become capable of nitrogen fixation. The DevBCA proteins are thought to export either heterocyst-specific glycolipids or an enzyme essential for formation of the laminated layer found in heterocysts.	220
132028	TIGR02983	SigE-fam_strep	RNA polymerase sigma-70 factor, sigma-E family. This group of similar sigma-70 factors includes the sigE factor from Streptomyces coelicolor. The family appears to include a paralogous expansion in the Streptomycetes lineage, while related Actinomycetales have at most two representatives.	162
274375	TIGR02984	Sig-70_plancto1	RNA polymerase sigma-70 factor, Planctomycetaceae-specific subfamily 1. This group of sigma factors are members of the sigma-70 family (TIGR02937) and are apparently found only in the Planctomycetaceae family including the genera Gemmata and Pirellula (in which seven sequences are found).	189
274376	TIGR02985	Sig70_bacteroi1	RNA polymerase sigma-70 factor, Bacteroides expansion family 1. This group of sigma factors are members of the sigma-70 family (TIGR02937) and are found primarily in the genus Bacteroides. This family appears to have resulted from a lineage-specific expansion as B. thetaiotaomicron VPI-5482, Bacteroides forsythus ATCC 43037, Bacteroides fragilis YCH46 and Bacteroides fragilis NCTC 9343 contain 25, 12, 24 and 23 members, respectively. There are currently only two known members of this family outside of the Bacteroides, in Rhodopseudomonas and Bradyrhizobium.	161
132031	TIGR02986	restrict_Alw26I	type II restriction endonuclease, Alw26I/Eco31I/Esp3I family. Members of this family are type II restriction endonucleases of the Alw26I/Eco31I/Esp3I family. Characterized specificities of three members are GGTCTC, CGTCTC, and the shared subsequence GTCTC. [DNA metabolism, Restriction/modification]	424
274377	TIGR02987	met_A_Alw26	type II restriction m6 adenine DNA methyltransferase, Alw26I/Eco31I/Esp3I family. Members of this family are the m6-adenine DNA methyltransferase protein, or domain of a fusion protein that also carries m5 cytosine methyltransferase activity, of type II restriction systems of the Alw26I/Eco31I/Esp3I family. A methyltransferase of this family is always accompanied by a type II restriction endonuclease from the Alw26I/Eco31I/Esp3I family (TIGR02986) and by an adenine-specific modification methyltransferase. Members of this family are unusual in that regions of similarity to homologs outside this family are circularly permuted. [DNA metabolism, Restriction/modification]	524
274378	TIGR02988	YaaA_near_RecF	S4 domain protein YaaA. This small protein has a single S4 domain (pfam01479), as do bacterial ribosomal protein S4, some pseudouridine synthases, and tyrosyl-tRNA synthetases. The S4 domain may bind RNA. Members of this protein family are found almost exclusively in the Firmicutes, and almost invariably just a few nucleotides upstream of the gene for the DNA replication and repair protein RecF. The few members of this family that are not near recF are found instead near dnaA and/or dnaN, the usual neighbors of recF, near the origin of replication. The conserved location suggests a possible role in replication in the Firmicutes lineage. [DNA metabolism, DNA replication, recombination, and repair]	59
274379	TIGR02989	Sig-70_gvs1	RNA polymerase sigma-70 factor, Rhodopirellula/Verrucomicrobium family. This group of sigma factors are members of the sigma-70 family (TIGR02937) and are abundantly found in the species Rhodopirellula baltica (11), and Verrucomicrobium spinosum (16) and to a lesser extent in Gemmata obscuriglobus (2).	159
132035	TIGR02990	ectoine_eutA	ectoine utilization protein EutA. Members of this protein family are EutA, a predicted arylmalonate decarboxylase found in a conserved ectoine utilization operon of species that include Sinorhizobium meliloti 1021 (where it is known to be induced by ectoine), Mesorhizobium loti and Silicibacter pomeroyi. It is missing from two other species with the other ectoine transport and utilization genes: Pseudomonas putida and Agrobacterium tumefaciens.	239
132036	TIGR02991	ectoine_eutB	ectoine utilization protein EutB. Members of this protein family are EutB, a predicted arylmalonate decarboxylase found in a conserved ectoine utilization operon of species that include Sinorhizobium meliloti 1021 (where it is known to be induced by ectoine), Mesorhizobium loti, Silicibacter pomeroyi, Agrobacterium tumefaciens, and Pseudomonas putida. Members of this family resemble threonine dehydratases.	317
132037	TIGR02992	ectoine_eutC	ectoine utilization protein EutC. Members of this protein family are EutC, a predicted arylmalonate decarboxylase found in a conserved ectoine utilization operon of species that include Sinorhizobium meliloti 1021 (where it is known to be induced by ectoine), Mesorhizobium loti, Silicibacter pomeroyi, Agrobacterium tumefaciens, and Pseudomonas putida. This family belongs to the ornithine cyclodeaminase/mu-crystallin family (pfam02423).	326
274380	TIGR02993	ectoine_eutD	ectoine utilization protein EutD. Members of this family are putative peptidases or hydrolases similar to Xaa-Pro aminopeptidase (pfam00557). They belong to ectoine utilization operons, as found in Sinorhizobium meliloti 1021 (where it is known to be induced by ectoine), Mesorhizobium loti, Silicibacter pomeroyi, Agrobacterium tumefaciens, and Pseudomonas putida. The exact function is unknown.	391
132039	TIGR02994	ectoine_eutE	ectoine utilization protein EutE. Members of this family, part of the succinylglutamate desuccinylase / aspartoacylase family (pfam04952), belong to ectoine utilization operons, as found in Sinorhizobium meliloti 1021 (where the operon is known to be induced by ectoine), Mesorhizobium loti, Silicibacter pomeroyi, Agrobacterium tumefaciens, and Pseudomonas putida.	325
132040	TIGR02995	ectoine_ehuB	ectoine/hydroxyectoine ABC transporter solute-binding protein. Members of this family are the extracellular solute-binding proteins of ABC transporters that closely resemble amino acid transporters. The member from Sinorhizobium meliloti is involved in ectoine uptake, both for osmoprotection and for catabolism. All other members of the seed alignment are found associated with ectoine catabolic genes. [Transport and binding proteins, Amino acids, peptides and amines]	275
274381	TIGR02996	rpt_mate_G_obs	repeat-companion domain TIGR02996. This model describes an abundant paralogous domain of Gemmata obscuriglobus UQM 2246, a member of the Planctomycetes. The domain also occurs, although rarely, in Myxococcus xanthus DK 1622 and related species. Most member proteins have extensive repeats similar to the leucine-rich repeat, or another repeat class or region of low-complexity sequence. This domain is not repeated, and in Gemmata is usually found at the protein N-terminus.	42
274382	TIGR02997	Sig70-cyanoRpoD	RNA polymerase sigma factor, cyanobacterial RpoD-like family. This family includes a number of closely related sigma-70 (TIGR02937) factors in the cyanobacteria. All appear most closely related to the essential sigma-70 factor RpoD, and some score above trusted to the RpoD C-terminal domain model (TIGR02393).	298
132043	TIGR02998	RraA_entero	regulator of ribonuclease activity A. This family includes a number of closely related sequences from certain enterobacteria. The E. coli member of this family has been characterized as a regulator of RNase E and its crystal structure has been analyzed. The broader subfamily which includes this equivalog, TIGR01935, was initially classified as a "hypothetical equivalog" with the name "regulator of ribonuclease activity A" based on the same evidence for this model. It now appears that, considering the second group of enterobacterial sequences within TIGR01935, the functional assignment is unsupported. THIS PROTEIN IS _NOT_ MenG, AKA S-adenosylmethionine: 2-demethylmenaquinone methyltransferase (EC 2.1.-.-). See the references characterizing this as a case of transitive annotation error. [Transcription, Degradation of RNA, Regulatory functions, Protein interactions]	161
132044	TIGR02999	Sig-70_X6	RNA polymerase sigma factor, TIGR02999 family. This group of sigma factors are members of the sigma-70 family (TIGR02937) and are found in a variety of species including Rhodopirellula baltica which encodes a paralogous group of five.	183
274383	TIGR03000	plancto_dom_1	Planctomycetes uncharacterized domain TIGR03000. Domains described by this model are found, so far, only in the Planctomycetes (Pirellula sp. strain 1 and Gemmata obscuriglobus), in up to six proteins per genome, and may be duplicated within a protein. The function is unknown.	75
188267	TIGR03001	Sig-70_gmx1	RNA polymerase sigma-70 factor, Myxococcales family 1. This group of sigma factors are members of the sigma-70 family (TIGR02937) and are found in multiple copies in the order Myxococcales. This model supersedes TIGR02233, which has now been retired.	244
274384	TIGR03002	outer_YhbN_LptA	lipopolysaccharide transport periplasmic protein LptA. Members of this protein family include LptA (previously called YhbN). It was shown to be an essential protein in E. coli, implicated in cell envelope integrity, and to play a role in the delivery of LPS to the outer leaflet of the outer membrane. It works with LptB (formerly yhbG), a homolog of ABC transporter ATP-binding proteins, encoded by an adjacent gene. Numerous homologs in other Proteobacteria are found in a conserved location near lipopolysaccharide inner core biosynthesis genes. This family is related to organic solvent tolerance protein (OstA), though distantly. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides, Transport and binding proteins, Other]	142
132048	TIGR03003	ectoine_ehuD	ectoine/hydroxyectoine ABC transporter, permease protein EhuD. Members of this family are presumed to act as permease subunits of ectoine ABC transporters. Operons containing this gene also contain the other genes of the ABC transporter and typically are found next to either ectoine utilization or ectoine biosynthesis operons.	212
132049	TIGR03004	ectoine_ehuC	ectoine/hydroxyectoine ABC transporter, permease protein EhuC. Members of this family are presumed to act as permease subunits of ectoine ABC transporters. Operons containing this gene also contain the other genes of the ABC transporter and typically are found next to either ectoine utilization or ectoine biosynthesis operons. Permease subunits EhuC and EhuD are homologous.	214
132050	TIGR03005	ectoine_ehuA	ectoine/hydroxyectoine ABC transporter, ATP-binding protein. Members of this family are the ATP-binding protein of a conserved four gene ABC transporter operon found next to ectoine utilization operons and ectoine biosynthesis operons. Ectoine is a compatible solute that protects enzymes from high osmolarity. It is released by some species in response to hypoosmotic shock, and it is taken up by a number of bacteria as a compatible solute or for consumption. This family shows strong sequence similarity to a number of amino acid ABC transporter ATP-binding proteins.	252
274385	TIGR03006	pepcterm_polyde	polysaccharide deacetylase family protein, PEP-CTERM locus subfamily. Members of this protein family belong to the family of polysaccharide deacetylases (pfam01522). All are found in species that encode the PEP-CTERM/exosortase system predicted to act in protein sorting in a number of Gram-negative bacteria, and are found near the epsH homolog that is the putative exosortase gene. The highest scoring homologs below the trusted cutoff for this model are found in several species of Methanosarcina, an archaeal genus. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	271
274386	TIGR03007	pepcterm_ChnLen	polysaccharide chain length determinant protein, PEP-CTERM locus subfamily. Members of this protein family belong to the family of polysaccharide chain length determinant proteins (pfam02706). All are found in species that encode the PEP-CTERM/exosortase system predicted to act in protein sorting in a number of Gram-negative bacteria, and are found near the epsH homolog that is the putative exosortase gene. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	498
163100	TIGR03008	pepcterm_CAAX	CAAX prenyl protease-related protein. The CAAX prenyl protease, in eukaryotes, catalyzes three covalent modifications, including cleavage and acylation, at the C-terminus of certain proteins in a process connected to protein sorting. This family describes a bacterial protein family homologous to one domain of the CAAX-processing enzyme. Members of this protein family are found in genomes that carry a predicted protein sorting system, PEP-CTERM/exosortase, usually in the vicinity of the EpsH homolog that is the hallmark of the system. The function of this protein is unknown, but it may relate to protein modification. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	222
274387	TIGR03009	plancto_dom_2	Planctomycetes uncharacterized domain TIGR03009. Domains described by this model are found, so far, only in the Planctomycetes (Pirellula sp. strain 1 and Gemmata obscuriglobus), in up to four proteins per genome. The function is unknown. [Hypothetical proteins, Conserved]	210
200229	TIGR03010	sulf_tusC_dsrF	sulfur relay protein TusC/DsrF. The three proteins TusB, TusC, and TusD form a heterohexamer responsible for a sulfur relay reaction. In large numbers of proteobacterial species, this complex acts on a Cys-derived persulfide moiety, delivered by the cysteine desulfurase IscS to TusA, then to TusBCD. The activated sulfur group is then transferred to TusE (DsrC), then by MnmA (TrmU) for modification of an anticodon nucleotide in tRNAs for Glu, Lys, and Gln. The sulfur relay complex TusBCD is also found, under the designation DsrEFH, in phototrophic and chemotrophic sulfur bacteria, such as Chromatium vinosum. In these organisms, it seems the primary purpose is related to sulfur flux, such as oxidation from sulfide to molecular sulfur to sulfate. [Protein synthesis, tRNA and rRNA base modification]	116
274388	TIGR03011	sulf_tusB_dsrH	sulfur relay protein TusB/DsrH. The three proteins TusB, TusC, and TusD form a heterohexamer responsible for a sulfur relay reaction. In large numbers of proteobacterial species, this complex acts on a Cys-derived persulfide moiety, delivered by the cysteine desulfurase IscS to TusA, then to TusBCD. The activated sulfur group is then transferred to TusE (DsrC), then by MnmA (TrmU) for modification of an anticodon nucleotide in tRNAs for Glu, Lys, and Gln. The sulfur relay complex TusBCD is also found, under the designation DsrEFH, in phototrophic and chemotrophic sulfur bacteria, such as Chromatium vinosum. In these organisms, it seems the primary purpose is related to sulfur flux, such as oxidation from sulfide to molecular sulfur to sulfate. [Protein synthesis, tRNA and rRNA base modification]	94
274389	TIGR03012	sulf_tusD_dsrE	sulfur relay protein TusD/DsrE. The three proteins TusB, TusC, and TusD form a heterohexamer responsible for a sulfur relay reaction. In large numbers of proteobacterial species, this complex acts on a Cys-derived persulfide moiety, delivered by the cysteine desulfurase IscS to TusA, then to TusBCD. The activated sulfur group is then transferred to TusE (DsrC), then by MnmA (TrmU) for modification of an anticodon nucleotide in tRNAs for Glu, Lys, and Gln. The sulfur relay complex TusBCD is also found, under the designation DsrEFH, in phototrophic and chemotrophic sulfur bacteria, such as Chromatium vinosum. In these organisms, it seems the primary purpose is related to sulfur flux, such as oxidation from sulfide to molecular sulfur to sulfate. [Protein synthesis, tRNA and rRNA base modification]	127
274390	TIGR03013	EpsB_2	sugar transferase, PEP-CTERM system associated. Members of this protein family belong to the family of bacterial sugar transferases (pfam02397). Nearly all are found in species that encode the PEP-CTERM/exosortase system predicted to act in protein sorting in a number of Gram-negative bacteria (notable exceptions appear to include Magnetococcus sp. MC-1 and Myxococcus xanthus DK 1622). These genes are generally found near one or more of the PrsK, PrsR or PrsT genes that have been related to the PEP-CTERM system by phylogenetic profiling methods. The nature of the sugar transferase reaction catalyzed by members of this clade is unknown and may conceivably be variable with respect to substrate by species. These proteins are homologs of the EpsB protein found in Methylobacillus sp. strain 12S, which is also associated with a PEP-CTERM system, but of a distinct type. A name which appears attached to a number of genes (by transitive annotation) in this family is "undecaprenyl-phosphate galactose phosphotransferase", which comes from relatively distant characterized enterobacterial homologs, and is considerably more specific than warranted from the currently available evidence.	442
132059	TIGR03014	EpsL	exopolysaccharide biosynthesis operon protein EpsL. The epsL gene is described as a component of the methanolan exopolysaccharide biosynthesis operon in Methylobacillus sp strain 12S, although no other information regarding its possible function is suggested. Homologs of this gene are found in several other exopolysaccharide operons in a small number of species. These operons contain a subset of the methanolan operon genes by homology and synteny, including the epsH gene which is proposed to act as an "exosortase" directing proteins with a C-terminal tag (PEP-CTERM) to the exopolysaccharide layer. Each of the genomes in which these genes and epsL are found also encode genes with these C-terminal tags.	381
132060	TIGR03015	pepcterm_ATPase	putative secretion ATPase, PEP-CTERM locus subfamily. Members of this protein are marked as probable ATPases by the nucleotide binding P-loop motif GXXGXGKTT, a motif DEAQ similar to the DEAD/H box of helicases, and extensive homology to ATPases of MSHA-type pilus systems and to GspA proteins associated with type II protein secretion systems. [Protein fate, Protein and peptide secretion and trafficking]	269
274391	TIGR03016	pepcterm_hypo_1	uncharacterized protein, PEP-CTERM system associated. Members of this protein family are found predominantly in exopolysaccharide biosynthesis operons marked by the presence of the EpsH-family putative exosortase and presence in the genome of the PEP-CTERM protein sorting signal. Members of this family may be distantly related to the EpsL family modeled in TIGR03014.	431
132062	TIGR03017	EpsF	chain length determinant protein EpsF. Sequences in this family of proteins are members of the chain length determinant family (pfam02706) which includes the wzc protein from E. coli. This family of proteins are homologous to the EpsF protein of the methanolan biosynthesis operon of Methylobacillus species strain 12S. The distribution of this protein appears to be restricted to a subset of exopolysaccharide operons containing a syntenic grouping of genes including a variant of the EpsH exosortase protein. Exosortase has been proposed to be involved in the targeting and processing of proteins containing the PEP-CTERM domain to the exopolysaccharide layer.	444
274392	TIGR03018	pepcterm_TyrKin	exopolysaccharide/PEP-CTERM locus tyrosine autokinase. Members of this protein family are related to a known protein-tyrosine autokinase and to numerous homologs from exopolysaccharide biosynthesis region proteins, many of which are designated as chain length determinants. Most members of this family contain a short region, immediately C-terminal to the region modeled here, with an abundance of Tyr residues. These C-terminal tyrosine residues are likely to be autophosphorylation sites. Some members of this family are fusion proteins.	207
132064	TIGR03019	pepcterm_femAB	FemAB-related protein, PEP-CTERM system-associated. Members of this protein family are found always as part of extended exopolysaccharide biosynthesis loci in bacteria. In nearly every case, these loci contain determinants for the processing of the PEP-CTERM proposed C-terminal protein sorting signal. This family shows remote, local sequence similarity to the FemAB protein family (see pfam02388), whose members [Unknown function, General]	330
274393	TIGR03020	EpsA	transcriptional regulator EpsA. Proteins in this family include a C-terminal LuxR transcriptional regulator domain (pfam00196). These proteins are positioned proximal to either EpsH-containing exopolysaccharide biosynthesis operons of the Methylobacillus type, or the associated PEP-CTERM-containing genes.	247
274394	TIGR03021	pilP_fam	type IV pilus biogenesis protein PilP. Members of this protein family are found in type IV pilus biogenesis loci and include proteins designated PilP. [Cell envelope, Surface structures]	118
274395	TIGR03022	WbaP_sugtrans	Undecaprenyl-phosphate galactose phosphotransferase, WbaP. The WbaP (formerly RfbP) protein has been characterized as the first enzyme in O-antigen biosynthesis in Salmonella typhimurium. The enzyme transfers galactose from UDP-galactose to a polyprenyl carrier (utilizing the highly conserved C-terminal sugar transferase domain, pfam02397) a reaction which takes place at the cytoplasmic face of the inner membrane. The N-terminal hydrophobic domain is then believed to facilitate the "flippase" function of transferring the liposaccharide unit from the cytoplasmic face to the periplasmic face of the inner membrane. This model includes the enterobacterial enzymes, where the function is presumed to be identical to the S. typhimurium enzyme as well as a somewhat broader group which are likely to catalyze the same or highly similar reactions based on a phylogenetic tree-building analysis of the broader sugar transferase family. Most of these genes are found within large operons dedicated to the production of complex exopolysaccharides such as the enterobacterial O-antigen. The most likely heterogeneity would be in the precise nature of the sugar molecule transferred.	456
274396	TIGR03023	WcaJ_sugtrans	Undecaprenyl-phosphate glucose phosphotransferase. This family of proteins encompasses the E. coli WcaJ protein involved in colanic acid biosynthesis, the Methylobacillus EpsB protein involved in methanolan biosynthesis, as well as the GumD protein involved in the biosynthesis of xanthan. All of these are closely related to the well-characterized WbaP (formerly RfbP) protein, which is the first enzyme in O-antigen biosynthesis in Salmonella typhimurium. The enzyme transfers galactose from UDP-galactose (NOTE: not glucose) to a polyprenyl carrier (utilizing the highly conserved C-terminal sugar transferase domain, pfam02397) a reaction which takes place at the cytoplasmic face of the inner membrane. The N-terminal hydrophobic domain is then believed to facilitate the "flippase" function of transferring the liposaccharide unit from the cytoplasmic face to the periplasmic face of the inner membrane. Most of these genes are found within large operons dedicated to the production of complex exopolysaccharides such as the enterobacterial O-antigen. Colanic acid biosynthesis utilizes a glucose-undecaprenyl carrier, knockout of EpsB abolishes incorporation of UDP-glucose into the lipid phase, and the C-terminal portion of GumD has been shown to be responsible for the glucosyl-1-transferase activity.	450
274397	TIGR03024	arch_PEF_CTERM	PEF-CTERM protein sorting domain. This domain, distantly related to the PEP-Cterm domain described in model TIGR02595, is found in Methanosarcina mazei in four different proteins, as well as in other archaea such as Methanococcoides burtonii. Several proteins with this domain have their genes only a short distance from archaeosortase C, a proposed integral membrane transpeptidase. This family should exclude members of the PEFG-CTERM domain family (TIGR04296), specific to the Thaumarchaeota.	25
274398	TIGR03025	EPS_sugtrans	exopolysaccharide biosynthesis polyprenyl glycosylphosphotransferase. Members of this family are generally found near other genes involved in the biosynthesis of a variety of exopolysaccharides. These proteins consist of two fused domains, an N-terminal hydrophobic domain of generally low conservation and a highly conserved C-terminal sugar transferase domain (pfam02397). Characterized and partially characterized members of this subfamily include Salmonella WbaP (originally RfbP), E. coli WcaJ, Methylobacillus EpsB, Xanthomonas GumD, Vibrio CpsA, Erwinia AmsG, Group B Streptococcus CpsE (originally CpsD), and Streptococcus suis Cps2E. Each of these is believed to act in transferring the sugar from, for instance, UDP-glucose or UDP-galactose, to a lipid carrier such as undecaprenyl phosphate as the first (priming) step in the synthesis of an oligosaccharide "block". This function is encoded in the C-terminal domain. The liposaccharide is believed to be subsequently transferred through a "flippase" function from the cytoplasmic to the periplasmic face of the inner membrane by the N-terminal domain. Certain closely related transferase enzymes, such as Sinorhizobium ExoY and Lactococcus EpsD, lack the N-terminal domain and are not found by this model.	445
274399	TIGR03026	NDP-sugDHase	nucleotide sugar dehydrogenase. Enzymes in this family catalyze the NAD-dependent alcohol-to-acid oxidation of nucleotide-linked sugars. Examples include UDP-glucose 6-dehydrogenase (1.1.1.22), GDP-mannose 6-dehydrogenase (1.1.1.132), UDP-N-acetylglucosamine 6-dehydrogenase (1.1.1.136), UDP-N-acetyl-D-galactosaminuronic acid dehydrogenase, and UDP-N-acetyl-D-mannosaminuronic acid dehydrogenase. These enzymes are most often involved in the biosynthesis of polysaccharides and are often found in operons devoted to that purpose. All of these enzymes contain three Pfam domains, pfam03721, pfam00984, and pfam03720 for the N-terminal, central, and C-terminal regions respectively.	409
132072	TIGR03027	pepcterm_export	putative polysaccharide export protein, PEP-CTERM system-associated. This protein family belongs to the larger set of polysaccharide biosynthesis/export proteins described by pfam02563. Members of this family are variable in either containing or lacking a 78-residue insert, but appear to fall within a single clade, nevertheless, where the regions in which the gene is found encode components of the PEP-CTERM/EpsH proposed exosortase protein sorting system. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	165
132073	TIGR03028	EpsE	polysaccharide export protein EpsE. Sequences in this family of proteins are members of a polysaccharide export protein family (pfam02563) which includes the wza protein from E. coli. This family of proteins are homologous to the EpsE protein of the methanolan biosynthesis operon of Methylobacillus species strain 12S. The distribution of this protein appears to be restricted to a subset of exopolysaccharide operons containing a syntenic grouping of genes including a variant of the EpsH exosortase protein. Exosortase has been proposed to be involved in the targeting and processing of proteins containing the PEP-CTERM domain to the exopolysaccharide layer.	239
132074	TIGR03029	EpsG	chain length determinant protein tyrosine kinase EpsG. The proteins in this family are homologs of the EpsG protein found in Methylobacillus strain 12S and are generally found in operons with other Eps homologs. The protein is believed to function as the protein tyrosine kinase component of the chain length regulator (along with the transmembrane component EpsF).	274
274400	TIGR03030	CelA	cellulose synthase catalytic subunit (UDP-forming). Cellulose synthase catalyzes the beta-1,4 polymerization of glucose residues in the formation of cellulose. In bacteria, the substrate is UDP-glucose. The synthase consists of two subunits (or domains in the frequent cases where it is encoded as a single polypeptide), the catalytic domain modelled here and the regulatory domain (pfam03170). The regulatory domain binds the allosteric activator cyclic di-GMP. The protein is membrane-associated and probably assembles into multimers such that the individual cellulose strands can self-assemble into multi-strand fibrils.	713
274401	TIGR03031	cas_csx12	CRISPR system subtype II-B RNA-guided endonuclease Cas9/Csx12. Members of this family of CRISPR-associated (cas) protein are found, so far, in CRISPR/cas loci in Wolinella succinogenes DSM 1740, Legionella pneumophila str. Paris, and Francisella tularensis, where the last probably is an example of a degenerate CRISPR locus, having neither repeats nor a functional Cas1. The characteristic repeat length is 37 base pairs and period is about 72. One region of this large protein shows sequence similarity to pfam01844, HNH endonuclease.	802
274402	TIGR03032	TIGR03032	TIGR03032 family protein. This protein family is uncharacterized. A number of motifs are conserved perfectly among all member sequences. The function of this protein is unknown. [Hypothetical proteins, Conserved]	335
200235	TIGR03033	phage_rel_nuc	putative phage-type endonuclease. Members of this protein family are found often in phage genomes and in prokaryotic genomes in uncharacterized regions that resemble integrated prophage regions.	153
132079	TIGR03034	TIGR03034	conserved hypothetical protein. Members of this protein family have been found in several species of gammaproteobacteria, including Yersinia pestis and Y. pseudotuberculosis, Xylella fastidiosa, and Escherichia coli UTI89. As many as five members can be found in a single genome. The function is unknown. [Hypothetical proteins, Conserved]	274
132080	TIGR03035	trp_arylform	arylformamidase. One of several pathways of tryptophan degradation is as follows: tryptophan 2,3-dioxygenase (1.13.11.11) uses O2 to convert Trp to L-formylkynurenine. Arylformamidase (3.5.1.9) hydrolyzes the product to L-kynurenine and formate. Kynureninase (3.7.1.3) hydrolyzes L-kynurenine to anthranilate plus alanine. Members of the seed alignment for this model are bacterial predicted metal-dependent hydrolases. All are supported as arylformamidase (3.5.1.9) by an operon structure in which kynureninase and/or tryptophan 2,3-dioxygenase genes are adjacent. The members from Bacillus cereus, Pseudomonas aeruginosa and Ralstonia metallidurans were characterized. An example from Pseudomonas fluorescens is given the gene symbol qbsH instead of kynB because of its role in quinolobactin biosynthesis, which begins with tryptophan. All members of this family should be arylformamidase (3.5.1.9). [Energy metabolism, Amino acids and amines]	206
188272	TIGR03036	trp_2_3_diox	tryptophan 2,3-dioxygenase. Members of this family are tryptophan 2,3-dioxygenase, as confirmed by several experimental characterizations, and by conserved operon structure for many of the other members. This enzyme represents the first of a two-step degradation to L-kynurenine, and a three-step pathway (via kynurenine) to anthranilate plus alanine. [Energy metabolism, Amino acids and amines]	264
132082	TIGR03037	anthran_nbaC	3-hydroxyanthranilate 3,4-dioxygenase. Members of this protein family, from both bacteria and eukaryotes, are the enzyme 3-hydroxyanthranilate 3,4-dioxygenase. This enzyme acts on the tryptophan metabolite 3-hydroxyanthranilate and produces 2-amino-3-carboxymuconate semialdehyde, which can rearrange spontaneously to quinolinic acid and feed into nicotinamide biosynthesis, or undergo further enzymatic degradation.	159
274403	TIGR03038	PS_II_psbM	photosystem II reaction center protein PsbM. Members of this protein family are the photosystem II reaction center M protein, product of the psbM gene, in Cyanobacteria and their derived organelles in plants. This model resembles pfam05151 but has cutoffs set to avoid false-positive matches to similar (not necessarily homologous) sequences in species that are not photosynthetic. [Energy metabolism, Photosynthesis]	33
163117	TIGR03039	PS_II_CP47	photosystem II chlorophyll-binding protein CP47. [Energy metabolism, Photosynthesis]	504
213761	TIGR03041	PS_antenn_a_b	chlorophyll a/b binding light-harvesting protein. This model represents a family of proteins from the Cyanobacteria, closely homologous to and yet distinct from PsbC, a chlorophyll a antenna protein of photosystem II. Members are not universally present in Cyanobacteria, while the family has several members per genome in Prochlorococcus marinus, with seven members in a strain adapted to low light conditions. These antenna proteins may deliver light energy to photosystem I and/or photosystem II. [Energy metabolism, Photosynthesis]	321
274404	TIGR03042	PS_II_psbQ_bact	photosystem II protein PsbQ. This protein, through the member sll1638 from Synechocystis sp. PCC 6803, was shown to be part of the cyanobacterial photosystem II. It is homologous to (but quite diverged from) the chloroplast PsbQ protein, called oxygen-evolving enhancer protein 3 (OEE3). We designate this cyanobacterial protein PsbQ by homology. [Energy metabolism, Photosynthesis]	142
274405	TIGR03043	PS_II_psbZ	photosystem II core protein PsbZ. PsbZ is a core protein of photosystem II in thylakoid-containing Cyanobacteria and plant chloroplasts. The original Chlamydomonas gene symbol, ycf9, is a synonym. PsbZ controls the interaction of the reaction center core with the light-harvesting antenna. [Energy metabolism, Photosynthesis]	58
274406	TIGR03044	PS_II_psb27	photosystem II protein Psb27. Members of this family are the Psb27 protein of the cyanobacterial photosynthetic supracomplex, photosystem II. Although most protein components of both cyanobacterial and chloroplast versions of photosystem II are closely related and described together by single models, this family is strictly bacterial. Some uncharacterized proteins with highly divergent sequences, from Arabidopsis, score between trusted and noise cutoffs for this model but are not at this time assigned as functionally equivalent photosystem II proteins. [Energy metabolism, Photosynthesis]	135
274407	TIGR03045	PS_II_C550	cytochrome c-550. Members of this protein family are cytochrome c-550, the PsbV extrinsic protein of photosystem II, from both Cyanobacteria and chloroplasts. A paralog to this protein, PsbV2, is found in some species in addition to PsbV itself. [Energy metabolism, Photosynthesis]	159
274408	TIGR03046	PS_II_psbV2	photosystem II cytochrome PsbV2. Members of this protein family are PsbV2, a protein closely related to cytochrome c-550 (PsbV), a protein important to the water-splitting and oxygen-evolving activity of photosystem II. Mutant studies in Thermosynechococcus elongatus showed PsbV2 can partially replace PsbV, from which it appears to have arisen first by duplication, then by intergenic recombination with a different gene. [Energy metabolism, Photosynthesis]	155
274409	TIGR03047	PS_II_psb28	photosystem II reaction center protein Psb28. Members of this protein family are the Psb28 protein of photosystem II. Two different protein families, apparently without homology between them, have been designated PsbW. Cyanobacterial proteins previously designated PsbW are members of the family described here. However, while members of the plant PsbW family are not found (so far) in Cyanobacteria, members of the present family do occur in plants. We therefore support the alternative designation that has emerged for this protein family, Psb28, rather than PsbW. [Energy metabolism, Photosynthesis]	108
132092	TIGR03048	PS_I_psaC	photosystem I iron-sulfur protein PsaC. Members of this family are PsaC, an essential component of photosystem I (PS-I) reaction center in Cyanobacteria and chloroplasts. This small protein, about 80 amino acids in length, contains two copies of the ferredoxin-like 4Fe-4S binding site (pfam00037) and therefore eight conserved Cys residues. This protein is also called photosystem I subunit VII. [Energy metabolism, Photosynthesis]	80
274410	TIGR03049	PS_I_psaK	photosystem I reaction center subunit PsaK. Members of this protein family are the PsaK of the photosystem I reaction center. Photosystems I and II occur together in the same sets of organisms. Photosystem I uses light energy to transfer electrons from plastocyanin to ferredoxin, while photosystem II uses light energy to split water and releases molecular oxygen. [Energy metabolism, Photosynthesis]	81
188274	TIGR03050	PS_I_psaK_plant	photosystem I reaction center PsaK, plant form. This protein family is based on a model that separates the photosystem I PsaK subunit of chloroplasts from chloroplast PsaG protein and from Cyanobacterial PsaK, both of which show sequence similarity.	83
132095	TIGR03051	PS_I_psaG_plant	photosystem I reaction center subunit V, chloroplast. 	88
132096	TIGR03052	PS_I_psaI	photosystem I reaction center subunit VIII. Members of this protein family are PsaI, subunit VIII of the photosystem I reaction center. This protein is found in both the Cyanobacteria and the chloroplasts of plants, but is absent from non-oxygenic photosynthetic bacteria such as Rhodobacter sphaeroides. Species that contain photosystem I also contain photosystem II, which splits water and releases molecular oxygen. [Energy metabolism, Photosynthesis]	31
274411	TIGR03053	PS_I_psaM	photosystem I reaction center subunit XII. Members of this protein family are PsaM, which is subunit XII of the photosystem I reaction center. This protein is found in both the Cyanobacteria and the chloroplasts of plants, but is absent from non-oxygenic photosynthetic bacteria such as Rhodobacter sphaeroides. Species that contain photosystem I also contain photosystem II, which splits water and releases molecular oxygen. The seed alignment for this model includes sequences from pfam07465 and additional sequences, as from Prochlorococcus. [Energy metabolism, Photosynthesis]	29
213764	TIGR03054	photo_alph_chp1	putative photosynthetic complex assembly protein. In twenty or so anoxygenic photosynthetic alpha-Proteobacteria known so far, a gene for a member of this protein family is present and is found in the vicinity of puhA, which encodes a component of the photosynthetic reaction center, and other genes associated with photosynthesis. This protein family is suggested, consequently, as a probable assembly factor for the photosynthetic reaction center, but it seems its actual function has not yet been demonstrated. [Energy metabolism, Photosynthesis]	135
188275	TIGR03055	photo_alph_chp2	putative photosynthetic complex assembly protein 2. This uncharacterized protein family was identified, by the method of partial phylogenetic profiling, as having a matching phylogenetic distribution to that of the photosynthetic reaction center of the alpha-proteobacterial type. It is nearly always encoded near other photosynthesis-related genes, including puhA. [Energy metabolism, Photosynthesis]	245
132100	TIGR03056	bchO_mg_che_rel	putative magnesium chelatase accessory protein. Members of this family belong to the alpha/beta fold family hydrolases (pfam00561). Members are found in bacterial genomes if and only if they encode anoxygenic photosynthetic systems similar to that of Rhodobacter capsulatus and other alpha-Proteobacteria. Members often are encoded in the same operon as subunits of the protoporphyrin IX magnesium chelatase, and were once designated BchO. No literature supports a role as an actual subunit of magnesium chelatase, but an accessory role is possible, as suggested by placement by its probable hydrolase activity. [Energy metabolism, Photosynthesis]	278
274412	TIGR03057	xxxLxxG_by_4	X-X-X-Leu-X-X-Gly heptad repeats. This model represents a 28-column alignment, comprising four tandem sets of seven residues each, in which the fourth residue tends to be Leu and the seventh tends to be Gly in each set. This heptad periodicity, corresponding to two turns of an alpha helix, suggests alpha-helical structure; in many proteins this 28-region model hits many times in tandem. Arrangement of these sequences on a helical wheel would show a strict alternation of Leu and Gly residues on one side of the helix, that is, an extremely bulky side chain alternating with the virtual absence of one. This suggests an extended zippering of one alpha helix to another, analogous to the shorter leucine zippers found in many dimerizing transcription factors. Proteins in which these heptad repeats occur often have higher order repeats of a unit comprised of several heptads.	28
132102	TIGR03058	rpt_csmH	chlorosome envelope protein H repeat. CsmH, as studied in Chlorobium tepidum, is one of at least ten surface-exposed proteins of the chlorosome, a bacteriochlorophyll-rich structure with a lipid-protein envelope. CsmH typically contains three copies of a repeated sequence, represented by this model. [Energy metabolism, Photosynthesis]	27
132103	TIGR03059	psaOeuk	photosystem I protein PsaO. Members of this family are the PsaO protein of photosystem I. This protein is found in chloroplasts but not in Cyanobacteria.	82
213765	TIGR03060	PS_II_psb29	photosystem II biogenesis protein Psb29. Psb29, originally designated sll1414 in Synechocystis 6803, is found universally in Cyanobacteria and in Arabidopsis. It was isolated and partially sequenced from purified photosystem II (PS II) in Synechocystis. While its function is unknown, mutant studies show an impairment in photosystem II biogenesis and/or stability, rather than in PS II core function. [Energy metabolism, Photosynthesis]	214
274413	TIGR03061	pip_yhgE_Nterm	YhgE/Pip N-terminal domain. This family contains the N-terminal domain of a family of multiple membrane-spanning proteins of Gram-positive bacteria. One member was shown to be a host protein essential for phage infection, so many members of this family are called "phage infection protein". A separate model, TIGR03062, represents the conserved C-terminal domain. The domains are separated by regions highly variable in both length and sequence, often containing extended heptad repeats as described in model TIGR03057.	164
274414	TIGR03062	pip_yhgE_Cterm	YhgE/Pip C-terminal domain. This family contains the C-terminal domain of a family of multiple membrane-spanning proteins of Gram-positive bacteria. One member was shown to be a host protein essential for phage infection, so many members of this family are called "phage infection protein". A separate model, TIGR03061, represents the conserved N-terminal domain. The domains are separated by regions highly variable in both length and sequence, often containing extended heptad repeats as described in model TIGR03057.	208
213766	TIGR03063	srtB_target	sortase B cell surface sorting signal. Two different classes of sorting signal, both analogous to the sortase A signal LPXTG, may be recognized by the sortase SrtB. These are given as NXZTN and NPKXZ. Proteins sorted by this class of sortase are less common than the sortase A and LPXTG system. This model describes a number of cell surface protein C-terminal regions from Gram-positive bacteria that appear to be sortase B (SrtB) sorting signals.	29
211782	TIGR03064	sortase_srtB	sortase, SrtB family. Members of this transpeptidase family are, in most cases, designated sortase B, product of the srtB gene. This protein shows only distant similarity to the sortase A family, for which there may be several members in a single bacterial genome. Typical SrtB substrate motifs include NAKTN, NPKSS, etc, and otherwise resemble the LPXTG sorting signals recognized by sortase A proteins. [Cell envelope, Other, Protein fate, Protein and peptide secretion and trafficking]	232
132109	TIGR03065	srtB_sig_QVPTGV	sortase B signal domain, QVPTGV class. This model represents a boutique (unusual) sorting signal, recognized by a member of the sortase SrtB family rather than by the housekeeping sortase, SrtA.	32
132110	TIGR03066	Gem_osc_para_1	Gemmata obscuriglobus paralogous family TIGR03066. This model represents an uncharacterized paralogous family in Gemmata obscuriglobus UQM 2246, a member of the Planctomycetes. This family shows sequence similarity to TIGR03067, which is also found in Gemmata obscuriglobus as well as in a few other species. [Hypothetical proteins, Conserved]	111
274415	TIGR03067	Planc_TIGR03067	Planctomycetes uncharacterized domain TIGR03067. This domain occurs in several species, mostly from the Planctomycetes division of the bacteria. It is expanded into a paralogous family of at least twenty-five members in Gemmata obscuriglobus UQM 2246. This family appears related to TIGR03066, which also is expanded into a large paralogous family in Gemmata obscuriglobus. [Unknown function, General]	107
132112	TIGR03068	srtB_sig_NPQTN	sortase B signal domain, NPQTN class. This model represents one of the boutique (rare) sortase signals, recognized by sortase B (SrtB) rather than by the housekeeping-type SrtA class sortase. This sequence, beginning NPQTN, shows little similarity to several other SrtB substrates.	33
132113	TIGR03069	PS_II_S4	photosystem II S4 domain protein. Members of this protein family are about 265 residues long and each contains an S4 RNA-binding domain of about 48 residues. The member from the Cyanobacterium, Synechocystis sp. PCC 6803, was detected as a novel polypeptide in a highly purified preparation of active photosystem II (Kashino, et al., 2002). The phylogenetic distribution, including Cyanobacteria and Arabidopsis, supports a role in photosystem II, although the high bit score cutoffs for this model reflect similar sequences in non-photosynthetic organisms such as Carboxydothermus hydrogenoformans, a Gram-positive bacterium. [Energy metabolism, Photosynthesis]	257
213767	TIGR03070	couple_hipB	transcriptional regulator, y4mF family. Members of this family belong to a clade of helix-turn-helix DNA-binding proteins, among the larger family pfam01381 (HTH_3; Helix-turn-helix). Members are similar in sequence to the HipB protein of E. coli. Genes for members of the seed alignment for this protein family were found to be closely linked to genes encoding proteins related to HipA. The HipBA operon appears to have some features in common with toxin-antitoxin post-segregational killing systems. [Regulatory functions, DNA interactions]	58
274416	TIGR03071	couple_hipA	HipA N-terminal domain. Although Pfam models pfam07805 and pfam07804 currently are called HipA-like N-terminal domain and HipA-like C-terminal domain, respectively, those models hit the central and C-terminal regions of E. coli HipA but not the N-terminal region. This model hits the N-terminal region of HipA and its homologs, and also identifies proteins that lack match regions for pfam07804 and pfam07805.	101
213768	TIGR03072	release_prfH	putative peptide chain release factor H. Members of this protein family are bacterial proteins homologous to peptide chain release factors 1 (RF-1, product of the prfA gene), and 2 (RF-2, product of the prfB gene). The member from Escherichia coli K-12, designated prfH, appears to be a pseudogene. This class I release factor is always found as the downstream gene of a two-gene operon. [Protein synthesis, Translation factors]	200
274417	TIGR03073	release_rtcB	release factor H-coupled RctB family protein. Members of this family are related to RctB. RctB is a protein of known structure but unknown function that often is encoded near RNA cyclase and therefore is suggested to be a tRNA or mRNA processing enzyme. This family of RctB-like proteins is encoded upstream of, and apparently is translationally coupled to, the putative peptide chain release factor RF-H (TIGR03072), product of the prfH gene. Note that a large deletion at the junction between this gene and the prfH gene in Escherichia coli K-12 marks both as probable pseudogenes. [Protein synthesis, Other]	356
274418	TIGR03074	PQQ_membr_DH	membrane-bound PQQ-dependent dehydrogenase, glucose/quinate/shikimate family. This protein family has a phylogenetic distribution very similar to that of coenzyme PQQ biosynthesis enzymes, as shown by partial phylogenetic profiling. Members of this family have several predicted transmembrane helices in the N-terminal region, and include the quinoprotein glucose dehydrogenase (EC 1.1.5.2) of Escherichia coli and the quinate/shikimate dehydrogenase of Acinetobacter sp. ADP1 (EC 1.1.99.25). Sequences closely related except for the absence of the N-terminal hydrophobic region, scoring in the gray zone between the trusted and noise cutoffs, include PQQ-dependent glycerol (EC 1.1.99.22) and other polyol (sugar alcohol) dehydrogenases.	764
274419	TIGR03075	PQQ_enz_alc_DH	PQQ-dependent dehydrogenase, methanol/ethanol family. This protein family has a phylogenetic distribution very similar to that of coenzyme PQQ biosynthesis enzymes, as shown by partial phylogenetic profiling. Genes in this family often are found adjacent to the PQQ biosynthesis genes themselves. An unusual, strained disulfide bond between adjacent Cys residues contributes to PQQ-binding, as does a Trp residue that is part of a PQQ enzyme repeat (see pfam01011). Characterized members include the dehydrogenase subunit of a membrane-anchored, three subunit alcohol (ethanol) dehydrogenase of Gluconobacter suboxydans, a homodimeric ethanol dehydrogenase in Pseudomonas aeruginosa, and the large subunit of an alpha2/beta2 heterotetrameric methanol dehydrogenase in Methylobacterium extorquens.	527
213771	TIGR03076	near_not_gcvH	Chlamydial GcvH-like protein upstream region protein. The H protein (GcvH) of the glycine cleavage system shuttles the methylamine group of glycine from the P protein to the T protein. Most Chlamydia lack the P and T proteins, and have a single homolog of GcvH that appears deeply split from canonical GcvH in molecular phylogenetic trees. The protein family modeled here is observed so far only in the Chlamydiae, always as part of a two-gene operon, upstream of the homolog of GcvH. Its function is unknown. [Unknown function, General]	686
132121	TIGR03077	not_gcvH	glycine cleavage protein H-like protein, Chlamydial. The H protein (GcvH) of the glycine cleavage system shuttles the methylamine group of glycine from the P protein to the T protein. Most Chlamydia lack the P and T proteins, and have a single homolog of GcvH that appears deeply split from canonical GcvH in molecular phylogenetic trees. The protein family modeled here is the Chlamydial GcvH homolog, so far always seen as part of a two-gene operon, downstream of a member of the uncharacterized protein family TIGR03076. The function of this protein is unknown.	110
274420	TIGR03078	CH4_NH3mon_ox_C	methane monooxygenase/ammonia monooxygenase, subunit C. Both ammonia oxidizers such as Nitrosomonas europaea and methanotrophs (obligate methane oxidizers) such as Methylococcus capsulatus each can grow only on their own characteristic substrate. However, both groups have the ability to oxidize both substrates, and so the relevant enzymes must be named here according to their ability to oxidize both. The protein family represented here reflects subunit C of both the particulate methane monooxygenase of methylotrophs and the ammonia monooxygenase of nitrifying bacteria.	231
132123	TIGR03079	CH4_NH3mon_ox_B	methane monooxygenase/ammonia monooxygenase, subunit B. Both ammonia oxidizers such as Nitrosomonas europaea and methanotrophs (obligate methane oxidizers) such as Methylococcus capsulatus each can grow only on their own characteristic substrate. However, both groups have the ability to oxidize both substrates, and so the relevant enzymes must be named here according to their ability to oxidize both. The protein family represented here reflects subunit B of both the particulate methane monooxygenase of methylotrophs and the ammonia monooxygenase of nitrifying bacteria.	399
132124	TIGR03080	CH4_NH3mon_ox_A	methane monooxygenase/ammonia monooxygenase, subunit A. Both ammonia oxidizers such as Nitrosomonas europaea and methanotrophs (obligate methane oxidizers) such as Methylococcus capsulatus each can grow only on their own characteristic substrate. However, both groups have the ability to oxidize both substrates, and so the relevant enzymes must be named here according to their ability to oxidize both. The protein family represented here reflects subunit A of both the particulate methane monooxygenase of methylotrophs and the ammonia monooxygenase of nitrifying bacteria.	243
213772	TIGR03081	metmalonyl_epim	methylmalonyl-CoA epimerase. Members of this protein family are the enzyme methylmalonyl-CoA epimerase (EC 5.1.99.1), also called methylmalonyl-CoA racemase. This enzyme converts (2R)-methylmalonyl-CoA to (2S)-methylmalonyl-CoA, which is then a substrate for methylmalonyl-CoA mutase (TIGR00642). It is known in bacteria, archaea, and as a mitochondrial protein in animals. It is closely related to lactoylglutathione lyase (TIGR00068), which is also called glyoxylase I, and is also a homodimer.	128
274421	TIGR03082	Gneg_AbrB_dup	membrane protein AbrB duplication. The model describes a hydrophobic sequence region that is duplicated to form the AbrB protein of Escherichia coli (not to be confused with a Bacillus subtilis protein with the same gene symbol). In some species, notably the Cyanobacteria and Thermus thermophilus, proteins consist of a single copy rather than two copies. The member from Pseudomonas putida, PP_1415, was suggested to be an ammonia monooxygenase characteristic of heterotrophic nitrifiers, based on an experimental indication of such activity in the organism and a glimmer of local sequence similarity between parts of P. putida protein and an instance of the AmoA protein from Nitrosomonas europaea; we do not believe the sequence similarity to be meaningful. The member from E. coli (b0715, ybgN) appears to be the largely uncharacterized AbrB (aidB regulator) protein of E. coli cited in Volkert, et al. (PMID 8002588), although we did not manage to trace the origin of association of the article to the sequence.	156
274422	TIGR03083	TIGR03083	uncharacterized Actinobacterial protein TIGR03083. This protein family pulls together several groups of proteins, each very different from the others. They share in common three conserved regions. The first is a region of about 38 amino acids, nearly always at the N-terminus of a protein. This region has a bulky hydrophobic residue, usually Trp, at position 29, and a His residue at position 37 that is invariant, so far, in over 150 instances. The second conserved region has a motif [DE]xxxHxxD. The third conserved region contains a hydrophobic patch and a well-conserved Arg residue. Most examples are found in the Actinobacteria, including the genera Mycobacterium, Corynebacterium, Streptomyces, Nocardia, Frankia, etc. The pattern of near-invariant residues against a backdrop of extreme sequence divergence suggests enzymatic activity and conservation of active site residues.	202
274423	TIGR03084	TIGR03084	TIGR03084 family protein. This family, like pfam07398, belongs to the larger set of probable enzymes modeled by TIGRFAMs family TIGR03083. Members are found primarily in the Actinobacteria (Mycobacterium, Streptomyces, etc.). The family is uncharacterized. [Hypothetical proteins, Conserved]	253
132129	TIGR03085	TIGR03085	TIGR03085 family protein. This family, like pfam07398 and TIGRFAMs family TIGR03084, belongs to the larger set of probable enzymes defined in family TIGR03083. Members are found primarily in the Actinobacteria (Mycobacterium, Streptomyces, etc.). The family is uncharacterized. [Hypothetical proteins, Conserved]	199
274424	TIGR03086	TIGR03086	TIGR03086 family protein. This family, like pfam07398 and TIGRFAMs family TIGR03084, belongs to the larger set of probable enzymes defined in family TIGR03083. Members are found primarily in the Actinobacteria (Mycobacterium, Streptomyces, etc.). The family is uncharacterized.	180
274425	TIGR03087	stp1	sugar transferase, PEP-CTERM/EpsH1 system associated. Members of this family include a match to the pfam00534 Glycosyl transferases group 1 domain. Nearly all are found in species that encode the PEP-CTERM/exosortase system predicted to act in protein sorting in a number of Gram-negative bacteria. In particular, these transferases are found proximal to a particular variant of exosortase, EpsH1, which appears to travel with a conserved group of genes summarized by Genome Property GenProp0652. The nature of the sugar transferase reaction catalyzed by members of this clade is unknown and may conceivably be variable with respect to substrate by species, but we hypothesize a conserved substrate.	397
132132	TIGR03088	stp2	sugar transferase, PEP-CTERM/EpsH1 system associated. Members of this family include a match to the pfam00534 Glycosyl transferases group 1 domain. Nearly all are found in species that encode the PEP-CTERM/exosortase system predicted to act in protein sorting in a number of Gram-negative bacteria. In particular, these transferases are found proximal to a particular variant of exosortase, EpsH1, which appears to travel with a conserved group of genes summarized by Genome Property GenProp0652. The nature of the sugar transferase reaction catalyzed by members of this clade is unknown and may conceivably be variable with respect to substrate by species, but we hypothesize a conserved substrate.	374
274426	TIGR03089	TIGR03089	TIGR03089 family protein. This protein family is found, so far, only in the Actinobacteria (Streptomyces, Mycobacterium, Corynebacterium, Nocardia, Propionibacterium, etc.) and never more than one to a genome. Members show twilight-level sequence similarity to the family of AMP-binding enzymes described by pfam00501.	228
163133	TIGR03090	SASP_tlp	small, acid-soluble spore protein tlp. This protein family is restricted to a subset of endospore-forming bacteria such as Bacillus subtilis, all of which are in the Firmicutes (low-GC Gram-positive) lineage. Although previously designated tlp (thioredoxin-like protein), the B. subtilis protein was shown to be a minor small acid-soluble spore protein SASP, unique to spores. The motif E[VIL]XDE near the C-terminus probably represents a germination protease cleavage site. [Cellular processes, Sporulation and germination]	70
274427	TIGR03091	SASP_sspK	small, acid-soluble spore protein K. This protein family is restricted to a subset of endospore-forming bacteria such as Bacillus subtilis, all of which are in the Firmicutes (low-GC Gram-positive) lineage. It is a minor SASP (small, acid-soluble spore protein) designated SspK. [Cellular processes, Sporulation and germination]	32
132136	TIGR03092	SASP_sspI	small, acid-soluble spore protein I. This protein family is restricted to a subset of endospore-forming bacteria such as Bacillus subtilis, all of which are in the Firmicutes (low-GC Gram-positive) lineage. It is a minor SASP (small, acid-soluble spore protein) designated SspI. The gene in Bacillus subtilis previously was designated ysfA. [Cellular processes, Sporulation and germination]	65
132137	TIGR03093	SASP_sspL	small, acid-soluble spore protein L. This protein family is restricted to a subset of endospore-forming bacteria such as Bacillus subtilis, all of which are in the Firmicutes (low-GC Gram-positive) lineage. It is a minor SASP (small, acid-soluble spore protein) designated SspL. [Cellular processes, Sporulation and germination]	36
132138	TIGR03094	sulfo_cyanin	sulfocyanin. Members of this family are blue-copper redox proteins designated sulfocyanin, from the archaeal genera Sulfolobus, Ferroplasma, and Picrophilus. The most closely related proteins characterized as functionally different are the rusticyanins. [Energy metabolism, Electron transport]	195
132139	TIGR03095	rusti_cyanin	rusticyanin. Rusticyanin is a blue copper protein, described in an obligate acidophilic chemolithoautotroph, Acidithiobacillus ferrooxidans, as an electron transfer protein. It can constitute up to 5 percent of protein in cells grown on Fe(II) and is thought to be part of an electron chain for Fe(II) oxidation, with two c-type cytochromes, an aa3-type cytochrome oxidase, and O2 as terminal electron acceptor. It is rather closely related to sulfocyanin (TIGR03094). [Energy metabolism, Electron transport]	148
132140	TIGR03096	nitroso_cyanin	nitrosocyanin. Nitrosocyanin, as described from the obligate chemolithoautotroph Nitrosomonas europaea, is a red copper protein of unknown function with sequence similarity to a number of blue copper redox proteins. [Energy metabolism, Electron transport]	135
132141	TIGR03097	PEP_O_lig_1	probable O-glycosylation ligase, exosortase A-associated. These proteins are members of the O-antigen polymerase (wzy) family described by pfam04932. This group is associated with genomes and usually genomic contexts containing elements of the exosortase/PEP-CTERM protein export system, specifically the type 1 variety of this system described by the Genome Property, GenProp0652.	402
211788	TIGR03098	ligase_PEP_1	acyl-CoA ligase (AMP-forming), exosortase A-associated. This group of proteins contains an AMP-binding domain (pfam00501) associated with acyl CoA-ligases. These proteins are generally found in genomes containing the exosortase/PEP-CTERM protein export system, specifically the type 1 variant of this system described by the Genome Property GenProp0652. When found in this context they are invariably present next to a decarboxylase enzyme. A number of sequences from Burkholderia species also hit this model, but the genomic context is obviously different. The hypothesis of a constant substrate for this family is only strong where the exosortase context is present.	517
132143	TIGR03099	dCO2ase_PEP1	pyridoxal-dependent decarboxylase, exosortase A system-associated. The sequences in this family contain the pyridoxal binding domain (pfam02784) and C-terminal sheet domain (pfam00278) of a family of Pyridoxal-dependent decarboxylases. Characterized enzymes in this family decarboxylate substrates such as ornithine, diaminopimelate and arginine. The genes of the family modeled here, with the exception of those observed in certain Burkholderia species, are all found in the context of exopolysaccharide biosynthesis loci containing the exosortase/PEP-CTERM protein sorting system. More specifically, these are characteristic of the type 1 exosortase system represented by the Genome Property GenProp0652. The substrate of these enzymes may be a precursor of the carrier or linker which is hypothesized to release the PEP-CTERM protein from the exosortase enzyme. These enzymes are apparently most closely related to the diaminopimelate decarboxylase modeled by TIGR01048 which may suggest a similarity (or identity) of substrate.	398
132144	TIGR03100	hydr1_PEP	exosortase A system-associated hydrolase 1. This group of proteins are members of the alpha/beta hydrolase superfamily. These proteins are generally found in genomes containing the exosortase/PEP-CTERM protein export system, specifically the type 1 variant of this system described by the Genome Property GenProp0652. When found in this context they are invariably present in the vicinity of a second, relatively unrelated enzyme (ortholog 2, TIGR03101) of the same superfamily.	274
274428	TIGR03101	hydr2_PEP	exosortase A system-associated hydrolase 2. This group of proteins are members of the alpha/beta hydrolase superfamily. These proteins are generally found in genomes containing the exosortase/PEP-CTERM protein export system, specifically the type 1 variant of this system described by the Genome Property GenProp0652. When found in this context they are invariably present in the vicinity of a second, relatively unrelated enzyme (ortholog 1, TIGR03100) of the same superfamily.	266
274429	TIGR03102	halo_cynanin	halocyanin domain. Halocyanins are blue (type I) copper redox proteins found in halophilic archaea such as Natronobacterium pharaonis. This model represents a domain duplicated in some halocyanins, while appearing once in others. This domain includes the characteristic copper ligand residues. This family does not include plastocyanins, and does not include certain divergent paralogs of halocyanin.	115
132147	TIGR03103	trio_acet_GNAT	GNAT-family acetyltransferase TIGR03103. Members of this protein family belong to the GNAT family of acetyltransferases. Each is part of a conserved three-gene cassette sparsely distributed across at least twenty different species known so far, including alpha, beta, and gamma Proteobacteria, Mycobacterium, and Prosthecochloris, which is a member of the Chlorobi. The other two members of the cassette are a probable protease and an asparagine synthetase family protein.	547
274430	TIGR03104	trio_amidotrans	asparagine synthase family amidotransferase. Members of this protein family are closely related to several isoforms of asparagine synthetase (glutamine amidotransferase) and typically have been given this name in genome annotation to date. Each is part of a conserved three-gene cassette sparsely distributed across at least twenty different species known so far, including alpha, beta, and gamma Proteobacteria, Mycobacterium, and Prosthecochloris, which is a member of the Chlorobi. The other two members of the cassette are a probable protease and a member of the GNAT family of acetyltransferases.	589
274431	TIGR03105	gln_synth_III	glutamine synthetase, type III. This family consists of the type III isozyme of glutamine synthetase, originally described in Rhizobium meliloti, where types I and II also occur.	435
132150	TIGR03106	trio_M42_hydro	hydrolase, peptidase M42 family. This model describes a subfamily of MEROPS peptidase family M42, a glutamyl aminopeptidase family that also includes the cellulase CelM from Clostridium thermocellum and deblocking aminopeptidases that can remove acylated amino acids. Members of this family occur in a three gene cassette with an amidotransferase (TIGR03104) in the asparagine synthase (glutamine-hydrolyzing) family, and a probable acetyltransferase (TIGR03103) in the GNAT family.	343
132151	TIGR03107	glu_aminopep	glutamyl aminopeptidase. This model represents the M42.001 clade within MEROPS family M42. M42 includes glutamyl aminopeptidase as in the present model, deblocking aminopeptidases as from Pyrococcus horikoshii and related species, and endo-1,4-beta-glucanase (cellulase M) as from Clostridium thermocellum. The current family includes [Protein fate, Degradation of proteins, peptides, and glycopeptides]	350
132152	TIGR03108	eps_aminotran_1	exosortase A system-associated amidotransferase 1. The predicted protein-sorting transpeptidase that we call exosortase (see TIGR02602) has distinct subclasses that are associated with different types of exopolysaccharide production loci. This model represents a distinct clade among a set of amidotransferases largely annotated (not necessarily accurately) as glutamine-hydrolyzing asparagine synthases. Members of this clade are essentially restricted to the characteristic exopolysaccharide (EPS) regions that contain the exosortase 1 gene (xrtA), in genomes that also have numbers of PEP-CTERM domain (TIGR02595) proteins.	628
274432	TIGR03109	exosortase_1	exosortase A. The predicted protein-sorting transpeptidase that we call exosortase (see TIGR02602) has distinct subclasses that are associated with different types of exopolysaccharide production loci. We designate this, the most common type so far, exosortase 1. We propose the gene symbol xrtA, analogous to srtA for the most common type of sortase in Gram-positive bacteria.	267
188282	TIGR03110	exosort_Gpos	exosortase family protein XrtG. Members of this protein family are found in a modest number of non-pathogenic Gram-positive bacteria, including three species of Lactococcus and three paralogs in Clostridium acetobutylicum. This protein appears related to the conserved core region of a family of proposed transpeptidases, exosortase (previously EpsH), thought to act on PEP-CTERM proteins. Members of the seed alignment include all exosortase proposed active site residues. However, in contrast to canonical exosortase (TIGR02602) and archaeal (TIGR03762), and cyanobacterial (TIGR03763) variants, this family has not yet been matched to a cognate PEP-CTERM-like sorting signal. This protein is assigned the gene symbol XrtG (eXosoRTase family protein of Gram-positives).	187
132155	TIGR03111	glyc2_xrt_Gpos1	putative glycosyltransferase, exosortase G-associated. Members of this protein family are probable glycosyltransferases of family 2, whose genes are near those for the exosortase homolog XrtG (TIGR03110), which is restricted to Gram-positive bacteria. Other genes in the conserved gene neighborhood include a 6-pyruvoyl tetrahydropterin synthase homolog (TIGR03112) and an uncharacterized integral membrane protein (TIGR03766).	439
132156	TIGR03112	6_pyr_pter_rel	6-pyruvoyl tetrahydropterin synthase-related domain. Members of this family are small proteins, or small domains of larger proteins, that occur in certain Firmicutes in the same regions as members of families TIGR03110 and TIGR03111. Members of TIGR03110 resemble exosortase, a proposed protein sorting transpeptidase (see TIGR02602). TIGR03111 represents a small clade among the group 2 glycosyltransferases. Members of the current protein family resemble eukaryotic known and prokaryotic predicted 6-pyruvoyl tetrahydropterin synthases.	113
274433	TIGR03113	exosortase_2	exosortase B. The predicted protein-sorting transpeptidase that we call exosortase (see TIGR02602) has distinct subclasses that are associated with different types of exopolysaccharide production loci. We designate this relatively uncommon proteobacterial type to be type 2. We propose the gene symbol xrtB. Most species encountered so far with xrtB also contain xrtA (TIGR03109).	268
274434	TIGR03114	cas8u_csf1	CRISPR type AFERR-associated protein Csf1. Members of this family show up near CRISPR repeats in Acidithiobacillus ferrooxidans ATCC 23270, Azoarcus sp. EbN1, and Rhodoferax ferrireducens DSM 15236. In the latter two species, the CRISPR/cas locus is found on a plasmid. This family is one of several characteristic of a type of CRISPR-associated (cas) gene cluster we designate Aferr after A. ferrooxidans, where it is both chromosomal and the only type of cas gene cluster found. The gene is designated csf1 (CRISPR/cas Subtype as in A. ferrooxidans protein 1), as it lies closest to the repeats.	202
274435	TIGR03115	cas7_csf2	CRISPR type IV/AFERR-associated protein Csf2. Members of this family show up near CRISPR repeats in Acidithiobacillus ferrooxidans ATCC 23270, Azoarcus sp. EbN1, and Rhodoferax ferrireducens DSM 15236. In the latter two species, the CRISPR/cas locus is found on a plasmid. This family is one of several characteristic of a type of CRISPR-associated (cas) gene cluster we designate Aferr after A. ferrooxidans, where it is both chromosomal and the only type of cas gene cluster found. The gene is designated csf2 (CRISPR/cas Subtype as in A. ferrooxidans protein 2), as it lies second closest to the repeats.	344
132160	TIGR03116	cas5_csf3	CRISPR type IV/AFERR-associated protein Csf3. Members of this family show up near CRISPR repeats in Acidithiobacillus ferrooxidans ATCC 23270, Azoarcus sp. EbN1, and Rhodoferax ferrireducens DSM 15236. In the latter two species, the CRISPR/cas locus is found on a plasmid. This family is one of several characteristic of a type of CRISPR-associated (cas) gene cluster we designate Aferr after A. ferrooxidans, where it is both chromosomal and the only type of cas gene cluster found. The gene is designated csf3 (CRISPR/cas Subtype as in A. ferrooxidans protein 3), as it lies third closest to the repeats.	214
274436	TIGR03117	cas_csf4	CRISPR type AFERR-associated DEAD/DEAH-box helicase Csf4. Members of this family show up near CRISPR repeats in Acidithiobacillus ferrooxidans ATCC 23270, Azoarcus sp. EbN1, and Rhodoferax ferrireducens DSM 15236. In the latter two species, the CRISPR/cas locus is found on a plasmid. This family is one of several characteristic of a type of CRISPR-associated (cas) gene cluster we designate Aferr after A. ferrooxidans, where it is both chromosomal and the only type of cas gene cluster found. The gene is designated csf4 (CRISPR/cas Subtype as in A. ferrooxidans protein 4), as it lies farthest (fourth closest) from the repeats in the A. ferrooxidans genome.	636
132162	TIGR03118	PEPCTERM_chp_1	TIGR03118 family protein. This model describes an uncharacterized conserved hypothetical protein. Members are found with the C-terminal putative exosortase interaction domain, PEP-CTERM, in Nitrosospira multiformis, Rhodoferax ferrireducens, Solibacter usitatus Ellin6076, and Acidobacteria bacterium Ellin345. It is found without the PEP-CTERM domain in several other species, including Burkholderia ambifaria, Gloeobacter violaceus PCC 7421, and three copies in the Acanthamoeba polyphaga mimivirus. [Hypothetical proteins, Conserved]	336
132163	TIGR03119	one_C_fhcD	formylmethanofuran--tetrahydromethanopterin N-formyltransferase. Members of this protein family are the FhcD protein of tetrahydromethanopterin (H4MPT)-dependent C-1 carrier metabolism. In the archaea, FhcD is designated formylmethanofuran--tetrahydromethanopterin N-formyltransferase, while in bacteria it is commonly designated as formyltransferase/hydrolase complex subunit D. FhcD is essential for one-carbon metabolism in at least three groups of prokaryotes: methanogenic archaea, sulfate-reducing archaea, and methylotrophic bacteria. [Central intermediary metabolism, One-carbon metabolism]	287
274437	TIGR03120	one_C_mch	methenyltetrahydromethanopterin cyclohydrolase. Members of this protein family are the enzyme methenyltetrahydromethanopterin cyclohydrolase, a key enzyme for tetrahydromethanopterin (H4MPT)-linked C1 transfer metabolism. [Central intermediary metabolism, One-carbon metabolism]	313
274438	TIGR03121	one_C_dehyd_A	formylmethanofuran dehydrogenase subunit A. Members of this largely archaeal protein family are subunit A of the formylmethanofuran dehydrogenase. Nomenclature in some bacteria may reflect inclusion of the formyltransferase described by TIGR03119 as part of the complex, and therefore call this protein formyltransferase/hydrolase complex Fhc subunit A. Note that this model does not distinguish tungsten (FwdA) from molybdenum-containing (FmdA) forms of this enzyme; a single gene from this family is expressed constitutively in Methanobacterium thermoautotrophicum, which has both tungsten and molybdenum forms and may work interchangeably.	556
274439	TIGR03122	one_C_dehyd_C	formylmethanofuran dehydrogenase subunit C. Members of this largely archaeal protein family are subunit C of the formylmethanofuran dehydrogenase. Nomenclature in some bacteria may reflect inclusion of the formyltransferase described by TIGR03119 as part of the complex, and therefore call this protein formyltransferase/hydrolase complex Fhc subunit C. Note that this model does not distinguish tungsten (FwdC) from molybdenum-containing (FmdC) forms of this enzyme.	257
163144	TIGR03123	one_C_unchar_1	probable H4MPT-linked C1 transfer pathway protein. This protein family was identified, by the method of partial phylogenetic profiling, as related to the use of tetrahydromethanopterin (H4MPT) as a C-1 carrier. Characteristic markers of the H4MPT-linked C1 transfer pathway include formylmethanofuran dehydrogenase subunits, methenyltetrahydromethanopterin cyclohydrolase, etc. Tetrahydromethanopterin, a tetrahydrofolate analog, occurs in methanogenic archaea, bacterial methanotrophs, planctomycetes, and a few other lineages. [Central intermediary metabolism, One-carbon metabolism]	318
163145	TIGR03124	citrate_citX	holo-ACP synthase CitX. Members of this protein family are the CitX protein, or CitX domain of the CitXG bifunctional protein, of the citrate lyase system. CitX transfers the prosthetic group 2'-(5''-triphosphoribosyl)-3'-dephospho-CoA to the citrate lyase gamma chain, an acyl carrier protein. This enzyme may be designated holo-ACP synthase, holo-citrate lyase synthase, or apo-citrate lyase phosphoribosyl-dephospho-CoA transferase. In a few genera, including Haemophilus, this protein occurs as a fusion protein with CitG (2.7.8.25), an enzyme involved in prosthetic group biosynthesis. This CitX family is easily separated from the holo-ACP synthases of other enzyme systems. [Energy metabolism, Fermentation, Protein fate, Protein modification and repair]	165
132169	TIGR03125	citrate_citG	triphosphoribosyl-dephospho-CoA synthase CitG. Triphosphoribosyl-dephospho-CoA is transferred to, and becomes the prosthetic group of, the respective acyl carrier protein subunits of both citrate lyase and malonate decarboxylase. Members of this protein family are triphosphoribosyl-dephospho-CoA synthases specifically from citrate lyase systems. This protein sometimes occurs as a fusion protein with CitX, the phosphoribosyl-dephospho-CoA transferase. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other, Energy metabolism, Fermentation]	275
132170	TIGR03126	one_C_fae	formaldehyde-activating enzyme. This family consists of formaldehyde-activating enzyme, or the corresponding domain of longer, bifunctional proteins. It links formaldehyde to the C1 carrier tetrahydromethanopterin (H4MPT), an analog of tetrahydrofolate, and is common among species with H4MPT. The ribulose monophosphate (RuMP) pathway, which removes the toxic metabolite formaldehyde by assimilation, runs in the opposite direction in some species to produce ribulose 5-phosphate for nucleotide biosynthesis, leaving formaldehyde as an additional metabolite. In these species, formaldehyde activating enzyme may occur as a fusion protein with D-arabino 3-hexulose 6-phosphate formaldehyde lyase from the RuMP pathway.	160
132171	TIGR03127	RuMP_HxlB	6-phospho 3-hexuloisomerase. Members of this protein family are 6-phospho 3-hexuloisomerase (PHI), or the PHI domain of a fusion protein. This enzyme is part of the ribulose monophosphate (RuMP) pathway, which in one direction removes the toxic metabolite formaldehyde by assimilation into fructose-6-phosphate. In the other direction, in species lacking a complete pentose phosphate pathway, the RuMP pathway yields ribulose-5-phosphate, necessary for nucleotide biosynthesis, at the cost of also yielding formaldehyde. These latter species usually have a formaldehyde-activating enzyme to attach formaldehyde to the C1 carrier tetrahydromethanopterin.	179
132172	TIGR03128	RuMP_HxlA	3-hexulose-6-phosphate synthase. Members of this protein family are 3-hexulose-6-phosphate synthase (HPS), or the HPS domain of a fusion protein. This enzyme is part of the ribulose monophosphate (RuMP) pathway, which in one direction removes the toxic metabolite formaldehyde by assimilation into fructose-6-phosphate. In the other direction, in species lacking a complete pentose phosphate pathway, the RuMP pathway yields ribulose-5-phosphate, necessary for nucleotide biosynthesis, at the cost of also yielding formaldehyde. These latter species usually have a formaldehyde-activating enzyme to attach formaldehyde to the C1 carrier tetrahydromethanopterin. In these species, the enzyme is viewed as a lyase rather than a synthase and is called D-arabino 3-hexulose 6-phosphate formaldehyde lyase. Note that there is some overlap in specificity with the Escherichia coli enzyme 3-keto-L-gulonate 6-phosphate decarboxylase.	206
132173	TIGR03129	one_C_dehyd_B	formylmethanofuran dehydrogenase subunit B. Members of this largely archaeal protein family are subunit B of the formylmethanofuran dehydrogenase. Nomenclature in some bacteria may reflect inclusion of the formyltransferase described by TIGR03119 as part of the complex, and therefore call this protein formyltransferase/hydrolase complex Fhc subunit B. Note that this model does not distinguish tungsten (FwdB) from molybdenum-containing (FmdB) forms of this enzyme.	421
188283	TIGR03130	malonate_delta	malonate decarboxylase acyl carrier protein. Members of this protein family are the acyl carrier protein, also called the delta subunit, of malonate decarboxylase. This subunit has the same covalently bound prosthetic group, derived from and similar to coenzyme A, as does citrate lyase, although this protein and the acyl carrier protein of citrate lyase do not show significant sequence similarity. Both malonyl and acetyl groups are transferred to the prosthetic group for catalysis.	98
132175	TIGR03131	malonate_mdcH	malonate decarboxylase, epsilon subunit. Members of this protein family are the epsilon subunit of malonate decarboxylase. This subunit has malonyl-CoA/dephospho-CoA acyltransferase activity. Malonate decarboxylase may be a soluble enzyme, or linked to membrane subunits and active as a sodium pump. The epsilon subunit is closely related to the malonyl CoA-acyl carrier protein (ACP) transacylase family described by TIGR00128, but acts on an ACP subunit of malonate decarboxylase that has an unusual coenzyme A derivative as its prosthetic group.	295
274440	TIGR03132	malonate_mdcB	triphosphoribosyl-dephospho-CoA synthase MdcB. This protein acts in cofactor biosynthesis, preparing the coenzyme A derivative that becomes attached to the malonate decarboxylase acyl carrier protein (or delta subunit). The closely related protein CitG of citrate lyase produces the same molecule, but the two families are nonetheless readily separated. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	272
188285	TIGR03133	malonate_beta	biotin-independent malonate decarboxylase, beta subunit. Members of this protein family are the beta subunit of malonate decarboxylase. Malonate decarboxylase may be a soluble enzyme, or linked to membrane subunits and active as a sodium pump. In the malonate decarboxylase complex, the beta subunit appears to act as a malonyl-CoA decarboxylase.	274
274441	TIGR03134	malonate_gamma	biotin-independent malonate decarboxylase, gamma subunit. Members of this protein family are the gamma subunit of malonate decarboxylase. Malonate decarboxylase may be a soluble enzyme, or linked to membrane subunits and active as a sodium pump. In the malonate decarboxylase complex, the beta subunit appears to act as a malonyl-CoA decarboxylase, while the gamma subunit appears either to mediate subunit interaction or to act as a co-decarboxylase with the beta subunit. The beta and gamma subunits exhibit some local sequence similarity.	238
274442	TIGR03135	malonate_mdcG	malonate decarboxylase holo-[acyl-carrier-protein] synthase. Malonate decarboxylase, like citrate lyase, has a unique acyl carrier protein subunit with a prosthetic group derived from, and distinct from, coenzyme A. Members of this protein family are the phosphoribosyl-dephospho-CoA transferase specific to the malonate decarboxylase system. This enzyme can also be designated holo-ACP synthase (2.7.7.61). The corresponding component of the citrate lyase system, CitX, shows little or no sequence similarity to this family. [Energy metabolism, Other]	202
188287	TIGR03136	malonate_biotin	Na+-transporting malonate decarboxylase, carboxybiotin decarboxylase subunit. Malonate decarboxylase can be a soluble enzyme, or a sodium ion-translocating enzyme with additional membrane-bound components. Members of this protein family are integral membrane proteins required to couple decarboxylation to sodium ion export. This family belongs to a broader family, TIGR01109, of sodium ion-translocating decarboxylase beta subunits. [Transport and binding proteins, Cations and iron carrying compounds]	399
211789	TIGR03137	AhpC	peroxiredoxin. This peroxiredoxin (AhpC, alkylhydroperoxide reductase subunit C) is one subunit of a two-subunit complex with subunit F (TIGR03140). Usually these are found as an apparent operon. The gene has been characterized in Bacteroides fragilis, where it is important in oxidative stress defense. This gene contains two invariant cysteine residues, one near the N-terminus and one near the C-terminus, each followed immediately by a proline residue. [Cellular processes, Detoxification, Cellular processes, Adaptations to atypical conditions]	187
274443	TIGR03138	QueF	7-cyano-7-deazaguanine reductase. This enzyme catalyzes the 4-electron reduction of the cyano group of 7-cyano-7-deazaguanine (preQ0) to an amine. Although related to a large family of GTP cyclohydrolases (pfam01227), the relationship is structural and not germane to the catalytic mechanism. This model represents the longer, gram-negative version of the enzyme as found in E. coli. The enzymatic step represents the first point at which the biosynthesis of queuosine in bacteria and eukaryotes is distinguished from the biosynthesis of archaeosine in archaea. [Transcription, RNA processing]	275
213775	TIGR03139	QueF-II	7-cyano-7-deazaguanine reductase. This enzyme catalyzes the 4-electron reduction of the cyano group of 7-cyano-7-deazaguanine (preQ0) to an amine. Although related to a large family of GTP cyclohydrolases (pfam01227), the relationship is structural and not germane to the catalytic mechanism. This model represents the shorter, gram-positive version of the enzyme as found in B. subtilis. The enzymatic step represents the first point at which the biosynthesis of queuosine in bacteria and eukaryotes is distinguished from the biosynthesis of archaeosine in archaea.	115
274444	TIGR03140	AhpF	alkyl hydroperoxide reductase subunit F. This enzyme is the partner of the peroxiredoxin (alkyl hydroperoxide reductase) AhpC which contains the peroxide-reactive cysteine. AhpF contains the reductant (NAD(P)H) binding domain (pfam00070) and presumably acts to resolve the disulfide which forms after oxidation of the active site cysteine in AhpC. This protein contains two paired conserved cysteine motifs, CxxCP and CxHCDGP. [Cellular processes, Detoxification, Cellular processes, Adaptations to atypical conditions]	515
274445	TIGR03141	cytochro_ccmD	heme exporter protein CcmD. The model for this protein family describes a small, hydrophobic, and only moderately well-conserved protein, tricky to identify accurately for all of these reasons. However, members are found as part of large operons involved in heme export across the inner membrane for assembly of c-type cytochromes in a large number of bacteria. The gray zone between the trusted cutoff (13.0) and noise cutoff (4.75) includes both low-scoring examples and false-positive matches to hydrophobic domains of longer proteins.	45
274446	TIGR03142	cytochro_ccmI	cytochrome c-type biogenesis protein CcmI. This TPR repeat-containing protein is the CcmI protein (also called CycH) of c-type cytochrome biogenesis. CcmI is thought to act as an apo-cytochrome c chaperone. This model describes the N-terminal region of the protein, Members of this protein family [Protein fate, Protein folding and stabilization, Energy metabolism, Electron transport]	117
132187	TIGR03143	AhpF_homolog	putative alkyl hydroperoxide reductase F subunit. This family of thioredoxin reductase homologs is found adjacent to alkylhydroperoxide reductase C subunit predominantly in cases where there is only one C subunit in the genome and that genome is lacking the F subunit partner (also a thioredoxin reductase homolog) that is usually found (TIGR03140).	555
274447	TIGR03144	cytochr_II_ccsB	cytochrome c-type biogenesis protein CcsB. Members of this protein family represent one of two essential proteins of system II for c-type cytochrome biogenesis. Additional proteins tend to be part of the system but can be replaced by chemical reductants such as dithiothreitol. This protein is designated CcsB in Bordetella pertussis and some other bacteria, resC in Bacillus (where there is additional N-terminal sequence), and CcsA in chloroplast. We use the CcsB designation here. Member sequences show regions of strong sequence conservation and variable-length, poorly conserved regions in between; sparsely filled columns were removed from the seed alignment prior to model construction. [Energy metabolism, Electron transport, Protein fate, Protein modification and repair]	245
274448	TIGR03145	cyt_nit_nrfE	cytochrome c nitrite reductase biogenesis protein NrfE. Members of this protein family closely resemble the CcmF protein of the CcmABCDEFGH system, or system I, for c-type cytochrome biogenesis (GenProp0678). Members are found, as a rule, next to closely related paralogs of CcmG and CcmH and always located near other genes associated with the cytochrome c nitrite reductase enzyme complex. As a rule, members are found in species that also encode bona fide members of the CcmF, CcmG, and CcmH families.	614
132190	TIGR03146	cyt_nit_nrfB	cytochrome c nitrite reductase, pentaheme subunit. Members of this protein family contain five copies of the CXXCH heme-binding motif, and are the NrfB component of the multisubunit enzyme, cytochrome c nitrite reductase. [Energy metabolism, Electron transport]	145
274449	TIGR03147	cyt_nit_nrfF	cytochrome c nitrite reductase, accessory protein NrfF. [Energy metabolism, Electron transport]	126
274450	TIGR03148	cyt_nit_nrfD	cytochrome c nitrite reductase, NrfD subunit. Members of this protein family are NrfD, a highly hydrophobic protein encoded in the nrf operon, which encodes cytochrome c nitrite reductase. This multiple heme-containing enzyme can reduce nitrite to ammonia. Members belong to a broader Pfam protein family, pfam03916, which also contains an NrfD-related subunit of polysulphide reductase. [Energy metabolism, Electron transport]	316
274451	TIGR03149	cyt_nit_nrfC	cytochrome c nitrite reductase, Fe-S protein. Members of this protein family are the Fe-S protein, NrfC, of a cytochrome c nitrite reductase system for which the pentaheme cytochrome c protein, NrfB (family TIGR03146) is an unambiguous marker. Members of this protein family show similarity to other ferredoxin-like proteins, including a subunit of a polysulfide reductase. [Energy metabolism, Electron transport]	225
274452	TIGR03150	fabF	beta-ketoacyl-acyl-carrier-protein synthase II. 3-oxoacyl-[acyl-carrier-protein] synthase 2 (KAS-II, FabF) is involved in the condensation step of fatty acid biosynthesis in which the malonyl donor group is decarboxylated and the resulting carbanion used to attack and extend the acyl group attached to the acyl carrier protein. Most genomes encoding fatty acid biosynthesis contain a number of condensing enzymes, often of all three types: 1, 2 and 3. Synthase 2 is mechanistically related to synthase 1 (KAS-I, FabB) containing a number of absolutely conserved catalytic residues in common. This model is based primarily on genes which are found in apparent operons with other essential genes of fatty acid biosynthesis (GenProp0681). The large gap between the trusted cutoff and the noise cutoff contains many genes which are not found adjacent to genes of the fatty acid pathway in genomes that often also contain a better hit to this model. These genes may be involved in other processes such as polyketide biosyntheses. Some genomes contain more than one above-trusted hit to this model which may result from recent paralogous expansions. Second hits to this model which are not next to other fatty acid biosynthesis genes may be involved in other processes. FabB sequences should fall well below the noise cutoff of this model. [Fatty acid and phospholipid metabolism, Biosynthesis]	407
132195	TIGR03151	enACPred_II	putative enoyl-[acyl-carrier-protein] reductase II. This oxidoreductase of the 2-nitropropane dioxygenase family (pfam03060) is commonly found in apparent operons with genes involved in fatty acid biosynthesis. Furthermore, this genomic context generally includes the fabG 3-oxoacyl-[ACP] reductase and lacks the fabI enoyl-[ACP] reductase.	307
200248	TIGR03152	cyto_c552_HCOOH	formate-dependent cytochrome c nitrite reductase, c552 subunit. Members of this protein family are cytochrome c552, a component of cytochrome c nitrite reductase, which is known more formally as nitrite reductase (cytochrome; ammonia-forming) (EC 1.7.2.2). Nitrite can be reduced by several enzymes. EC 1.7.2.2 reduces nitrite all the way to ammonia, rather than to ammonium hydroxide (nitrite reductase (NAD(P)H), EC 1.7.1.4) or nitric oxide (nitrite reductase (NO-forming), EC 1.7.2.1). Some examples of EC 1.7.2.2 occur in a seven gene system that enables formate-dependent nitrite reduction, but the enzyme is also found in simpler contexts. Members of this protein family, however, belong to the formate-dependent system. [Energy metabolism, Electron transport]	439
274453	TIGR03153	cytochr_NrfH	cytochrome c nitrite reductase, small subunit. Members of this protein family are NrfH, a tetraheme cytochrome c. NrfH is the cytochrome c nitrite reductase small subunit, and forms a heterodimer with NrfA, the catalytic subunit. While NrfA can act as a monomer, NrfH can bind to and anchor NrfA in the membrane and enables electron transfer to NrfA from quinones. [Energy metabolism, Electron transport]	135
132198	TIGR03154	sulfolob_CbsA	cytochrome b558/566, subunit A. Members of this protein family are CbsA, one subunit of a highly glycosylated, heterodimeric, mono-heme cytochrome b558/566, found in Sulfolobus acidocaldarius and several other members of the Sulfolobales, a branch of the Crenarchaeota.	465
274454	TIGR03155	sulfolob_CbsB	cytochrome b558/566, subunit B. Members of this protein family are CbsB, one subunit of a highly glycosylated, heterodimeric, mono-heme cytochrome b558/566, found in Sulfolobus acidocaldarius and several other members of the Sulfolobales, a branch of the Crenarchaeota.	302
274455	TIGR03156	GTP_HflX	GTP-binding protein HflX. This protein family is one of a number of homologous small, well-conserved GTP-binding proteins with pleiotropic effects. Bacterial members are designated HflX, following the naming convention in Escherichia coli where HflX is encoded immediately downstream of the RNA chaperone Hfq, and immediately upstream of HflKC, a membrane-associated protease pair with an important housekeeping function. Over large numbers of other bacterial genomes, the pairing with hfq is more significant than with hflK and hflC. The gene from Homo sapiens in this family has been named PGPL (pseudoautosomal GTP-binding protein-like). [Unknown function, General]	351
274456	TIGR03157	cas_Csc2	CRISPR type I-D/CYANO-associated protein Csc2. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is a widespread family of prokaryotic direct repeats with spacers of unique sequence between consecutive repeats. This protein family is a CRISPR-associated (Cas) family strictly associated with the Cyano subtype of CRISPR/Cas locus, found in several species of Cyanobacteria and several archaeal species. This family is designated Csc2 for CRISPR/Cas Subtype Cyano protein 2, as it is often the second gene upstream of the core cas genes, cas3-cas4-cas1-cas2.	282
274457	TIGR03158	cas3_cyano	CRISPR-associated helicase Cas3, subtype CYANO. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is a widespread family of prokaryotic direct repeats with spacers of unique sequence between consecutive repeats. This protein family is a CRISPR-associated (Cas) family strictly associated with the Cyano subtype of CRISPR/Cas locus, found in several species of Cyanobacteria and several archaeal species. It contains helicase motifs and appears to represent the Cas3 protein of the Cyano subtype of CRISPR/Cas system.	357
274458	TIGR03159	cas_Csc1	CRISPR type I-D/CYANO-associated protein Csc1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is a widespread family of prokaryotic direct repeats with spacers of unique sequence between consecutive repeats. This protein family is a CRISPR-associated (Cas) family strictly associated with the Cyano subtype of CRISPR/Cas locus, found in several species of Cyanobacteria and several archaeal species. This family is designated Csc1 for CRISPR/Cas Subtype Cyano protein 1, as it is often the first gene upstream of the core cas genes, cas3-cas4-cas1-cas2.	225
274459	TIGR03160	cobT_DBIPRT	nicotinate-nucleotide--dimethylbenzimidazole phosphoribosyltransferase. Members of this family are nicotinate-nucleotide--dimethylbenzimidazole phosphoribosyltransferase, an enzyme of cobalamin biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	333
274460	TIGR03161	ribazole_CobZ	alpha-ribazole phosphatase CobZ. Sequences in the seed alignment were the experimentally characterized CobZ of the methanogenic archaeon Methanosarcina mazei, and other archaeal proteins found similarly next to or very near to other cobalamin biosynthesis genes. CobZ replaces the alpha-ribazole-phosphate phosphatase (EC 3.1.3.73) called CobC in analogous bacterial pathways for cobalamin biosynthesis under anaerobic conditions. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	139
274461	TIGR03162	ribazole_cobC	alpha-ribazole phosphatase. Members of this protein family include the known CobC protein of Salmonella and Escherichia coli species, and homologous proteins found in cobalamin biosynthesis regions in other bacteria. This protein is alpha-ribazole phosphatase (EC 3.1.3.73) and, like many phosphatases, can be closely related in sequence to other phosphatases with different functions. Close homologs excluded from this model include proteins with duplications, so this model is built in -g mode to suppress hits to those proteins. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	177
132208	TIGR03164	UHCUDC	OHCU decarboxylase. Previously thought to proceed only spontaneously, the decarboxylation of 2-oxo-4-hydroxy-4-carboxy-5-ureidoimidazoline (OHCU) has recently been shown to be catalyzed by this enzyme in Mus musculus. Homologs of this enzyme are found adjacent to and fused with uricase in a number of prokaryotes and are represented by this model.	157
274462	TIGR03165	F1F0_chp_2	F1/F0 ATPase, Methanosarcina type, subunit 2. Members of this protein family are uncharacterized, highly hydrophobic proteins encoded in the middle of apparent F1/F0 ATPase operons. We note, however, that this protein is both broadly and sparsely distributed. It is found in only about two percent of microbial genomes sequenced, with the first ten examples found coming from the Euryarchaeota, Chlorobia, Betaproteobacteria, Deltaproteobacteria, and Planctomycetes. In most of these species, the surrounding operon appears to represent a second F1/F0 ATPase system, and the member proteins belong to subfamilies with the same phylogenetic distribution as the current protein family.	83
132210	TIGR03166	alt_F1F0_F1_eps	alternate F1F0 ATPase, F1 subunit epsilon. A small number of taxonomically diverse prokaryotic species have what appears to be a second ATP synthase, in addition to the normal F1F0 ATPase in bacteria and A1A0 ATPase in archaea. These enzymes use ion gradients to synthesize ATP, and in principle may run in either direction. This model represents the F1 epsilon subunit of this apparent second ATP synthase.	122
274463	TIGR03167	tRNA_sel_U_synt	tRNA 2-selenouridine synthase. The Escherichia coli YbbB protein was shown to encode a selenophosphate-dependent tRNA 2-selenouridine synthase, essential for modification of some tRNAs to replace a sulfur atom with selenium. This enzyme works with SelD, the selenium donor protein, which also acts in selenocysteine incorporation. Although the members of this protein family show a fairly deep split, sequences from both sides of the split are supported by co-occurrence with, and often proximity to, the selD gene. [Protein synthesis, tRNA and rRNA base modification]	311
274464	TIGR03168	1-PFK	hexose kinase, 1-phosphofructokinase family. This family consists largely of 1-phosphofructokinases, but also includes tagatose-6-kinases and 6-phosphofructokinases.	303
274465	TIGR03169	Nterm_to_SelD	pyridine nucleotide-disulfide oxidoreductase family protein. Members of this protein family include N-terminal sequence regions of (probable) bifunctional proteins whose C-terminal sequences are SelD, or selenide,water dikinase, the selenium donor protein necessary for selenium incorporation into protein (as selenocysteine), tRNA (as 2-selenouridine), or both. However, some members of this family occur in species that do not show selenium incorporation, and the function of this protein family is unknown.	364
274466	TIGR03170	flgA_cterm	flagella basal body P-ring formation protein FlgA. This model describes a conserved C-terminal region of the flagellar basal body P-ring formation protein FlgA. This sequence region contains a SAF domain, now described by pfam08666. [Cellular processes, Chemotaxis and motility]	122
132215	TIGR03171	soxL2	Rieske iron-sulfur protein SoxL2. This iron-sulfur protein is found in a contiguous genomic region with subunits of cytochrome b558/566 in several archaeal species, and appears to be part of a cytochrome bc1-analogous system.	321
274467	TIGR03172	TIGR03172	probable selenium-dependent hydroxylase accessory protein YqeC. This uncharacterized protein family includes YqeC from Escherichia coli. A phylogenetic profiling analysis shows correlation with SelD, the selenium donor protein, even in species where SelD contributes to neither selenocysteine nor selenouridine biosynthesis. Instead, this family, and families TIGR03309 and TIGR03310 appear to mark selenium-dependent molybdenum hydroxylase maturation systems. [Unknown function, General]	210
274468	TIGR03173	pbuX	xanthine permease. All the seed members of this model are observed adjacent to genes for either xanthine phosphoribosyltransferase (for the conversion of xanthine to guanine, GenProp0696) or genes for the conversion of xanthine to urate and its concomitant catabolism (GenProp0640, GenProp0688, GenProp0686 and GenProp0687). A number of sequences scoring higher than trusted to this model are found in different genomic contexts, and the possibility exists that these transport related compounds in addition to or instead of xanthine itself. The outgroup to this family are sequences which are characterized as uracil permeases or are adjacent to established uracil phosphoribosyltransferases.	406
274469	TIGR03174	cas_Csc3	CRISPR type I-D/CYANO-associated protein Csc3/Cas10d. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is a widespread family of prokaryotic direct repeats with spacers of unique sequence between consecutive repeats. This protein family is a CRISPR-associated (Cas) family strictly associated with the Cyano subtype of CRISPR/Cas locus, found in several species of Cyanobacteria and several archaeal species. This family is designated Csc3 for CRISPR/Cas Subtype Cyano protein 3, as it is often the third gene upstream of the core cas genes, cas3-cas4-cas1-cas2.	953
274470	TIGR03175	AllD	ureidoglycolate dehydrogenase. This enzyme converts ureidoglycolate to oxalureate in the non-urea-forming catabolism of allantoin (GenProp0687). The pathway has been characterized in E. coli and is observed in the genomes of Enterococcus faecalis and Bacillus licheniformis.	349
274471	TIGR03176	AllC	allantoate amidohydrolase. This enzyme catalyzes the breakdown of allantoate, first to ureidoglycine by hydrolysis and then decarboxylation of one of the two equivalent ureido groups. Ureidoglycine then spontaneously exchanges ammonia for water resulting in ureidoglycolate. This enzyme is an alternative to allantoicase (3.5.3.4) which releases urea. [Central intermediary metabolism, Nitrogen metabolism]	406
274472	TIGR03177	pilus_cpaB	Flp pilus assembly protein CpaB. Members of this protein family are the CpaB protein of Flp-type pilus assembly. Similar proteins include the FlgA protein of bacterial flagellum biosynthesis.	261
163175	TIGR03178	allantoinase	allantoinase. This enzyme carries out the first step in the degradation of allantoin, a ring-opening hydrolysis. The seed members of this model are all in the vicinity of other genes involved in the processes of xanthine/urate/allantoin catabolism. Although not included in the seed, many eukaryotic homologs of this family are included above the trusted cutoff. Below the noise cutoff are related hydantoinases.	443
188295	TIGR03180	UraD_2	OHCU decarboxylase. Previously thought to proceed only spontaneously, the decarboxylation of 2-oxo-4-hydroxy-4-carboxy-5-ureidoimidazoline (OHCU) has recently been shown to be catalyzed by this enzyme in Mus musculus. Homologs of this enzyme are found adjacent to and fused with uricase in a number of prokaryotes and are represented by this model. This model is a separate (but related) clade from that represented by TIGR03164. This model places a second homolog in Streptomyces species (which is not in the vicinity of other urate catabolism associated genes) below the trusted cutoff.	158
213783	TIGR03181	PDH_E1_alph_x	pyruvate dehydrogenase E1 component, alpha subunit. Members of this protein family are the alpha subunit of the E1 component of pyruvate dehydrogenase (PDH). This model represents one branch of a larger family that includes E1-alpha proteins from 2-oxoisovalerate dehydrogenase, acetoin dehydrogenase, another PDH clade, etc. [Energy metabolism, Pyruvate dehydrogenase]	341
274473	TIGR03182	PDH_E1_alph_y	pyruvate dehydrogenase E1 component, alpha subunit. Members of this protein family are the alpha subunit of the E1 component of pyruvate dehydrogenase (PDH). This model represents one branch of a larger family that includes E1-alpha proteins from 2-oxoisovalerate dehydrogenase, acetoin dehydrogenase, another PDH clade, etc. [Energy metabolism, Pyruvate dehydrogenase]	315
163177	TIGR03183	DNA_S_dndC	putative sulfurtransferase DndC. Members of this protein family are the DndC protein from the dnd (degradation during electrophoresis) operon. The dnd phenotype reflects a sulfur-containing modification to DNA. This operon is sparsely and sporadically distributed among bacteria; among the first eight examples are members from the Actinobacteria, Firmicutes, Gammaproteobacteria, and Cyanobacteria. DndC is suggested to be a sulfurtransferase. [DNA metabolism, Restriction/modification]	447
274474	TIGR03184	DNA_S_dndE	DNA sulfur modification protein DndE. This model describes the DndE protein encoded by an operon associated with a sulfur-containing modification to DNA. The operon is sporadically distributed in bacteria, much like some restriction enzyme operons. DndE is a putative carboxylase homologous to NCAIR synthetases. [DNA metabolism, Restriction/modification]	105
274475	TIGR03185	DNA_S_dndD	DNA sulfur modification protein DndD. This model describes the DndD protein encoded by an operon associated with a sulfur-containing modification to DNA. The operon is sporadically distributed in bacteria, much like some restriction enzyme operons. DndD is described as a putative ATPase. The small number of examples known so far include species from among the Firmicutes, Actinomycetes, Proteobacteria, and Cyanobacteria. [DNA metabolism, Restriction/modification]	650
132230	TIGR03186	AKGDH_not_PDH	alpha-ketoglutarate dehydrogenase. Several bacterial species have a paralog to the homodimeric form of the pyruvate dehydrogenase E1 component (see model TIGR00759), often encoded next to the L-methionine gamma-lyase gene (mdeA). The member from a strain of Pseudomonas putida was shown to act on alpha-ketobutyrate, which is produced by MdeA. This model serves as an exception model to TIGR00759, as other proteins hitting TIGR00759 should be identified as the pyruvate dehydrogenase E1 component.	889
274476	TIGR03187	DGQHR	DGQHR domain. This highly divergent, uncharacterized domain has several absolutely conserved residues, including a QR pair and FxxxN motif. Its most striking feature, however, is a near invariant pentapeptide motif DGQHR. Several different subfamilies occur specifically as a part of DNA phosphorothioation systems, previously called DND (DNA instability during electrophoresis), while others (e.g. CPS_2936) occur in other contexts suggestive of lateral gene transfer (sporadic distribution of helicase-containing cassettes). The region described by this model is about 280 amino acids in length; additional sequences show local sequence similarity.	272
274477	TIGR03188	histidine_hisI	phosphoribosyl-ATP pyrophosphohydrolase. This enzyme, phosphoribosyl-ATP pyrophosphohydrolase, catalyses the second step in the histidine biosynthesis pathway. It often occurs as a fusion protein. This model has a somewhat narrower scope than pfam01503, as some paralogs that appear to be functionally distinct are excluded from this model. [Amino acid biosynthesis, Histidine family]	84
132233	TIGR03189	dienoyl_CoA_hyt	cyclohexa-1,5-dienecarbonyl-CoA hydratase. This enzyme, cyclohexa-1,5-dienecarbonyl-CoA hydratase, also called dienoyl-CoA hydratase, acts on the product of benzoyl-CoA reductase (EC 1.3.99.15). Benzoyl-CoA is a common intermediate in the degradation of many aromatic compounds, and this enzyme is part of an anaerobic pathway for dearomatization and degradation.	251
132234	TIGR03190	benz_CoA_bzdN	benzoyl-CoA reductase, bzd-type, N subunit. Members of this family are the N subunit of one of two related types of four-subunit ATP-dependent benzoyl-CoA reductase. This enzyme system catalyzes the dearomatization of benzoyl-CoA, a common intermediate in pathways for the degradation of a number of different aromatic compounds, such as phenol and toluene.	377
132235	TIGR03191	benz_CoA_bzdO	benzoyl-CoA reductase, bzd-type, O subunit. Members of this family are the O subunit of one of two related types of four-subunit ATP-dependent benzoyl-CoA reductase. This enzyme system catalyzes the dearomatization of benzoyl-CoA, a common intermediate in pathways for the degradation of a number of different aromatic compounds, such as phenol and toluene.	430
132236	TIGR03192	benz_CoA_bzdQ	benzoyl-CoA reductase, bzd-type, Q subunit. Members of this family are the Q subunit of one of two related types of four-subunit ATP-dependent benzoyl-CoA reductase. This enzyme system catalyzes the dearomatization of benzoyl-CoA, a common intermediate in pathways for the degradation of a number of different aromatic compounds, such as phenol and toluene.	293
132237	TIGR03193	4hydroxCoAred	4-hydroxybenzoyl-CoA reductase, gamma subunit. 4-hydroxybenzoyl-CoA reductase converts 4-hydroxybenzoyl-CoA to benzoyl-CoA, a common intermediate in the degradation of aromatic compounds. This protein family represents the gamma chain of this three-subunit enzyme.	148
132238	TIGR03194	4hydrxCoA_A	4-hydroxybenzoyl-CoA reductase, alpha subunit. This model represents the largest chain, alpha, of the enzyme 4-hydroxybenzoyl-CoA reductase. In species capable of degrading various aromatic compounds by way of benzoyl-CoA, this enzyme can convert 4-hydroxybenzoyl-CoA to benzoyl-CoA.	746
132239	TIGR03195	4hydrxCoA_B	4-hydroxybenzoyl-CoA reductase, beta subunit. This model represents the second largest chain, beta, of the enzyme 4-hydroxybenzoyl-CoA reductase. In species capable of degrading various aromatic compounds by way of benzoyl-CoA, this enzyme can convert 4-hydroxybenzoyl-CoA to benzoyl-CoA.	321
132240	TIGR03196	pucD	xanthine dehydrogenase D subunit. This gene has been characterized in B. subtilis as the molybdopterin binding-subunit of xanthine dehydrogenase (pucD), acting in conjunction with pucC, the FAD-binding subunit and pucE, the FeS-binding subunit. The more common XDH complex (GenProp0640) includes the xdhB gene which is related to pucD. It appears that most of the relatives of pucD outside of this narrow clade are involved in other processes as they are found in unrelated genomic contexts, contain the more common XDH complex and/or do not appear to process purines to allantoin.	768
274478	TIGR03197	MnmC_Cterm	tRNA U-34 5-methylaminomethyl-2-thiouridine biosynthesis protein MnmC, C-terminal domain. In Escherichia coli, the protein previously designated YfcK is now identified as the bifunctional enzyme MnmC. It acts, following the action of the heterotetramer of GidA and MnmE, in the modification of U-34 of certain tRNA to 5-methylaminomethyl-2-thiouridine (mnm5s2U). In other bacteria, the corresponding proteins are usually but not always found as a single polypeptide chain, and occasionally as the product of tandem genes. This model represents the C-terminal region of the multifunctional protein. [Protein synthesis, tRNA and rRNA base modification]	381
132242	TIGR03198	pucE	xanthine dehydrogenase E subunit. This gene has been characterized in B. subtilis as the Iron-sulfur cluster binding-subunit of xanthine dehydrogenase (pucE), acting in conjunction with pucC, the FAD-binding subunit and pucD, the molybdopterin binding subunit. The more common XDH complex (GenProp0640) includes the xdhA gene as the Fe-S cluster binding component.	151
274479	TIGR03199	pucC	xanthine dehydrogenase C subunit. This gene has been characterized in B. subtilis as the FAD binding-subunit of xanthine dehydrogenase (pucC), acting in conjunction with pucD, the molybdopterin-binding subunit and pucE, the FeS-binding subunit.	264
132244	TIGR03200	dearomat_oah	6-oxocyclohex-1-ene-1-carbonyl-CoA hydrolase. Members of this protein family are 6-oxocyclohex-1-ene-1-carbonyl-CoA hydrolase, a ring-hydrolyzing enzyme in the anaerobic metabolism of aromatic compounds by way of benzoyl-CoA, as seen in Thauera aromatica, Geobacter metallireducens, and Azoarcus sp. Note that Rhodopseudomonas palustris uses a different pathway to perform a similar degradation of benzoyl-CoA to 3-hydroxypimelyl-CoA.	360
132245	TIGR03201	dearomat_had	6-hydroxycyclohex-1-ene-1-carbonyl-CoA dehydrogenase. Members of this protein family are 6-hydroxycyclohex-1-ene-1-carbonyl-CoA dehydrogenase, an enzyme in the anaerobic metabolism of aromatic compounds by way of benzoyl-CoA, as seen in Thauera aromatica, Geobacter metallireducens, and Azoarcus sp. The experimentally characterized form from T. aromatica uses only NAD+, not NADP+. Note that Rhodopseudomonas palustris uses a different pathway to perform a similar degradation of benzoyl-CoA to 3-hydroxypimelyl-CoA.	349
132246	TIGR03202	pucB	xanthine dehydrogenase accessory protein pucB. In Bacillus subtilis the expression of this protein, located in an operon with the structural subunits of xanthine dehydrogenase, has been found to be essential for XDH activity. Some members of this family appear to have a distant relationship to the MobA protein involved in molybdopterin biosynthesis, although this may be coincidental.	190
132247	TIGR03203	pimD_small	pimeloyl-CoA dehydrogenase, small subunit. Members of this protein family are the PimD proteins of species such as Rhodopseudomonas palustris and Bradyrhizobium japonicum. The pimFABCDE operon encodes proteins for the metabolism of straight chain dicarboxylates of seven to fourteen carbons. Especially relevant is pimeloyl-CoA, basis of the gene symbol, as it is a catabolite of benzoyl-CoA degradation, which occurs in Rhodopseudomonas palustris.	378
132248	TIGR03204	pimC_large	pimeloyl-CoA dehydrogenase, large subunit. Members of this protein family are the PimC proteins of species such as Rhodopseudomonas palustris and Bradyrhizobium japonicum. The pimFABCDE operon encodes proteins for the metabolism of straight chain dicarboxylates of seven to fourteen carbons. Especially relevant is pimeloyl-CoA, basis of the gene symbol, as it is a catabolite of benzoyl-CoA degradation, which occurs in Rhodopseudomonas palustris.	395
132249	TIGR03205	pimA	dicarboxylate--CoA ligase PimA. PimA, a member of a large family of acyl-CoA ligases, is found in a characteristic operon pimFABCDE for the metabolism of pimelate and related compounds. It is found, so far, in Bradyrhizobium japonicum and several strains of Rhodopseudomonas palustris. PimA from R. palustris was shown to be active as a CoA ligase for C(7) to C(14) dicarboxylates and fatty acids.	541
132250	TIGR03206	benzo_BadH	2-hydroxycyclohexanecarboxyl-CoA dehydrogenase. Members of this protein family are the enzyme 2-hydroxycyclohexanecarboxyl-CoA dehydrogenase. The enzymatic properties were confirmed experimentally in Rhodopseudomonas palustris; the enzyme is homotetrameric, and not sensitive to oxygen. This enzyme is part of a proposed pathway for degradation of benzoyl-CoA to 3-hydroxypimeloyl-CoA that differs from the analogous pathway in Thauera aromatica. It also may occur in degradation of the non-aromatic compound cyclohexane-1-carboxylate.	250
132251	TIGR03207	cyc_hxne_CoA_dh	cyclohexanecarboxyl-CoA dehydrogenase. Cyclohex-1-ene-1-carboxyl-CoA is an intermediate in the anaerobic degradation of benzoyl-CoA derived from various aromatic compounds, in Rhodopseudomonas palustris but not Thauera aromatica. The aliphatic compound cyclohexanecarboxylate can be converted to the same intermediate in two steps. The first step is its ligation to coenzyme A. The second is the action of this enzyme, cyclohexanecarboxyl-CoA dehydrogenase.	372
132252	TIGR03208	cyc_hxne_CoA_lg	cyclohexanecarboxylate-CoA ligase. Members of this protein family are cyclohexanecarboxylate-CoA ligase. This enzyme prepares the aliphatic ring compound, cyclohexanecarboxylate, for dehydrogenation and then degradation by a pathway also used in benzoyl-CoA degradation in Rhodopseudomonas palustris.	538
132253	TIGR03209	P21_Cbot	clostridium toxin-associated regulator BotR. Clostridium botulinum neurotoxin production is regulated by a regulatory sigma-70 protein, BotR transcription regulator. Similarly, tetanus toxin production of Clostridium tetani is regulated by TetR which is a very close relative of BotR. Both BotR and TetR are members of the TIGR02937 subfamily of sigma-70 RNA polymerase sigma factors. Functional complementation experiments have been done for botR and tetR in a highly transformable strain of Clostridium perfringens host cells to assess functional interchangeability of sigma factors and it has been confirmed that they are interchangeable in vivo.	142
132254	TIGR03210	badI	2-ketocyclohexanecarboxyl-CoA hydrolase. Members of this protein family are 2-ketocyclohexanecarboxyl-CoA hydrolase, a ring-opening enzyme that acts in catabolism of molecules such as benzoyl-CoA and cyclohexane carboxylate. It converts 2-ketocyclohexanecarboxyl-CoA to pimelyl-CoA. It is not sensitive to oxygen.	256
274480	TIGR03211	catechol_2_3	catechol 2,3 dioxygenase. Members of this family all are enzymes active as catechol 2,3 dioxygenase (1.13.11.2), although some members have highly significant activity on catechol derivatives such as 3-methylcatechol, 3-chlorocatechol, and 4-chlorocatechol (see Mars, et al.). This enzyme is also called metapyrocatechase, as it performs a meta-cleavage (an extradiol ring cleavage), in contrast to the ortho-cleavage (intradiol ring cleavage) performed by catechol 1,2-dioxygenase (EC 1.13.11.1), also called pyrocatechase. [Energy metabolism, Other]	303
211797	TIGR03212	uraD_N-term-dom	putative urate catabolism protein. This model represents a protein that is predominantly found just upstream of the UraD protein (OHCU decarboxylase) and in a number of instances as a N-terminal fusion with it. UraD itself catalyzes the last step in the catabolism of urate to allantoate. The function of this protein is presently unknown. It shows homology with the pfam01522 polysaccharide deacetylase domain family.	297
132257	TIGR03213	23dbph12diox	2,3-dihydroxybiphenyl 1,2-dioxygenase. Members of this protein family all have activity as 2,3-dihydroxybiphenyl 1,2-dioxygenase, the third enzyme of a pathway for biphenyl degradation. Many of the extradiol ring-cleaving dioxygenases, to which these proteins belong, act on a range of related substrates. Note that some members of this family may be found in operons for toluene or naphthalene degradation, where other activities of the same enzyme may be more significant; the trusted cutoff for this model is set relatively high to exclude most such instances. [Energy metabolism, Other]	286
200251	TIGR03214	ura-cupin	putative allantoin catabolism protein. This model represents a protein containing a tandem arrangement of cupin domains (N-terminal part of pfam07883 and C-terminal more distantly related to pfam00190). This protein is found in the vicinity of genes involved in the catabolism of allantoin, a breakdown product of urate and sometimes of urate itself. The distribution of pathway components in the genomes in which this family is observed suggests that the function is linked to the allantoate catabolism to glyoxylate pathway (GenProp0686) since it is sometimes found in genomes lacking any elements of the xanthine-to-allantoin pathways (e.g. in Enterococcus faecalis).	252
132259	TIGR03215	ac_ald_DH_ac	acetaldehyde dehydrogenase (acetylating). Members of this protein family are acetaldehyde dehydrogenase (acetylating), EC 1.2.1.10. This enzyme oxidizes acetaldehyde, using NAD(+), and attaches coenzyme A (CoA), yielding acetyl-CoA. It occurs as a late step in the meta-cleavage pathways of a variety of compounds, including catechol, biphenyl, toluene, salicylate, etc.	285
132260	TIGR03216	OH_muco_semi_DH	2-hydroxymuconic semialdehyde dehydrogenase. Members of this protein family are 2-hydroxymuconic semialdehyde dehydrogenase. Many aromatic compounds are catabolized by way of catechol, via the meta-cleavage pathway, to pyruvate and acetyl-CoA. This enzyme performs the second of seven steps in that pathway for catechol degradation. [Energy metabolism, Other]	481
274481	TIGR03217	4OH_2_O_val_ald	4-hydroxy-2-oxovalerate aldolase. Members of this protein family are 4-hydroxy-2-oxovalerate aldolase, also called 4-hydroxy-2-ketovalerate aldolase and 2-oxo-4-hydroxypentanoate aldolase. This enzyme, part of the pathway for the meta-cleavage of catechol, produces pyruvate and acetaldehyde. Acetaldehyde is then converted by acetaldehyde dehydrogenase (acylating) (DmpF; EC 1.2.1.10) to acetyl-CoA. The two enzymes are tightly associated. [Energy metabolism, Other]	333
132262	TIGR03218	catechol_dmpH	4-oxalocrotonate decarboxylase. Members of this protein family are 4-oxalocrotonate decarboxylase. Note that this protein, as characterized (indirectly) in Pseudomonas sp. strain CF600, was inactive except when coexpressed with DmpE, 2-oxopent-4-enoate hydratase, a homologous protein from the same operon. Both of these enzymes are active in the degradation of catechol, a common intermediate in the degradation of aromatic compounds such as benzoate, toluene, phenol, dimethylphenol (dmp), salicylate, etc. [Energy metabolism, Other]	263
274482	TIGR03219	salicylate_mono	salicylate 1-monooxygenase. Members of this protein family are salicylate 1-monooxygenase, also called salicylate hydroxylase. This enzyme converts salicylate to catechol, which is a common intermediate in the degradation of a number of aromatic compounds (phenol, toluene, benzoate, etc.). The gene for this protein may occur in catechol degradation genes, such as those of the meta-cleavage pathway.	414
132264	TIGR03220	catechol_dmpE	2-oxopent-4-enoate hydratase. Members of this protein family are 2-oxopent-4-enoate hydratase, which is also called 2-hydroxypent-2,4-dienoate hydratase. It is closely related to another gene found in the same operon, 4-oxalocrotonate decarboxylase, with which it interacts closely.	255
213786	TIGR03221	muco_delta	muconolactone delta-isomerase. Members of this protein family are muconolactone delta-isomerase (EC 5.3.3.4), the CatC protein of the ortho cleavage pathway for metabolizing aromatic compounds by way of catechol. [Energy metabolism, Other]	90
213787	TIGR03222	benzo_boxC	benzoyl-CoA-dihydrodiol lyase. In the presence of O2, the benzoyl-CoA oxygenase/reductase BoxAB converts benzoyl-CoA to 2,3-dihydro-2,3-dihydroxybenzoyl-CoA. Members of this family, BoxC, homologous to enoyl-CoA hydratases/isomerases, hydrolyze this compound to 3,4-dehydroadipyl-CoA semialdehyde + HCOOH.	546
274483	TIGR03223	Phn_opern_protn	putative phosphonate metabolism protein. This family of proteins is observed in the vicinity of other characterized genes involved in the catabolism of phosphonates via the C-P lyase system (GenProp0232); its function is unknown. These proteins are members of the somewhat broader pfam06299 model "Protein of unknown function (DUF1045)" which contains proteins found in a different genomic context as well.	228
132268	TIGR03224	benzo_boxA	benzoyl-CoA oxygenase/reductase, BoxA protein. Members of this protein family are BoxA, the A component of the BoxAB benzoyl-CoA oxygenase/reductase. This oxygen-requiring enzyme acts in an aerobic pathway of benzoate catabolism via coenzyme A ligation. BoxA is a homodimeric iron-sulphur-flavoprotein and acts as an NADPH-dependent reductase for BoxB. [Energy metabolism, Other]	411
200253	TIGR03225	benzo_boxB	benzoyl-CoA oxygenase, B subunit. Members of this protein family are BoxB, the B subunit of benzoyl-CoA oxygenase. This oxygen-requiring enzyme acts in an aerobic pathway of benzoate catabolism via coenzyme A ligation. [Energy metabolism, Other]	471
274484	TIGR03226	PhnU	2-aminoethylphosphonate ABC transporter, permease protein. This ABC transporter permease (membrane-spanning) component is found in a region of the Salmonella typhimurium LT2 genome responsible for the catabolism of 2-aminoethylphosphonate via the phnWX pathway (GenProp0238).	288
132271	TIGR03227	PhnS	2-aminoethylphosphonate ABC transporter, periplasmic 2-aminoethylphosphonate binding protein. This ABC transporter periplasmic substrate binding protein component is found in a region of the Salmonella typhimurium LT2 genome responsible for the catabolism of 2-aminoethylphosphonate via the phnWX pathway (GenProp0238). The protein contains a match to pfam01547 for the "Bacterial extracellular solute-binding protein" domain.	367
132272	TIGR03228	anthran_1_2_A	anthranilate 1,2-dioxygenase, large subunit. Anthranilate (2-aminobenzoate) is an intermediate of tryptophan (Trp) biosynthesis and degradation. Members of this family are the large subunit of anthranilate 1,2-dioxygenase, which acts in Trp degradation by converting anthranilate to catechol. Closely related paralogs typically are the benzoate 1,2-dioxygenase large subunit, among the larger set of ring-hydroxylating dioxygenases. [Energy metabolism, Amino acids and amines]	438
132273	TIGR03229	benzo_1_2_benA	benzoate 1,2-dioxygenase, large subunit. Benzoate 1,2-dioxygenase (EC 1.14.12.10) belongs to the larger family of aromatic ring-hydroxylating dioxygenases. Members of this family all act on benzoate, but may have additional activities on various benzoate analogs. This model describes the large subunit. Between the trusted and noise cutoffs are similar enzymes, likely to act on benzoate but perhaps best identified according to some other activity, such as 2-chlorobenzoate 1,2-dioxygenase (1.14.12.13). [Energy metabolism, Other]	433
132274	TIGR03230	lipo_lipase	lipoprotein lipase. Members of this protein family are lipoprotein lipase (EC 3.1.1.34), a eukaryotic triacylglycerol lipase active in plasma and similar to pancreatic and hepatic triacylglycerol lipases (EC 3.1.1.3). It is also called clearing factor. It cleaves chylomicron and VLDL triacylglycerols; it also has phospholipase A-1 activity.	442
132275	TIGR03231	anthran_1_2_B	anthranilate 1,2-dioxygenase, small subunit. Anthranilate (2-aminobenzoate) is an intermediate of tryptophan (Trp) biosynthesis and degradation. Members of this family are the small subunit of anthranilate 1,2-dioxygenase, which acts in Trp degradation by converting anthranilate to catechol. Closely related paralogs typically are the benzoate 1,2-dioxygenase small subunit, among the larger set of ring-hydroxylating dioxygenases. [Energy metabolism, Amino acids and amines]	155
132276	TIGR03232	benzo_1_2_benB	benzoate 1,2-dioxygenase, small subunit. Benzoate 1,2-dioxygenase (EC 1.14.12.10) belongs to the larger family of aromatic ring-hydroxylating dioxygenases. Members of this family should all act on benzoate, but several have additional known activities on various benzoate analogs. Some members actually may be named more suitably according to such an alternate activity, such as 2-chlorobenzoate 1,2-dioxygenase (1.14.12.13).	155
163189	TIGR03233	DNA_S_dndB	DNA sulfur modification protein DndB. This model describes the DndB protein encoded by an operon associated with a sulfur-containing modification to DNA. The operon is sporadically distributed in bacteria, much like some restriction enzyme operons. DndB is described as a putative ATPase. [DNA metabolism, Restriction/modification]	355
163190	TIGR03234	OH-pyruv-isom	hydroxypyruvate isomerase. This enzyme interconverts tartronate semi-aldehyde (TSA, aka 2-hydroxy 3-oxopropionate) and hydroxypyruvate. The E. coli enzyme has been characterized and found to be specific for TSA, contain no cofactors, and have a rather high Km for hydroxypyruvate of 12.5 mM. The gene is often found in association with glyoxalate carboligase (which produces TSA), but has been shown to have no effect on growth on glyoxalate when knocked out. This is consistent with the fact that the gene for tartronate semialdehyde reductase (glxR) is also associated and may have primary responsibility for the catabolism of TSA.	254
163191	TIGR03235	DNA_S_dndA	cysteine desulfurase DndA. This model describes DndA, a protein related to IscS and part of a larger family of cysteine desulfurases. It is encoded, typically, divergently from a conserved, sparsely distributed operon for sulfur modification of DNA. This modification system is designated dnd, after the phenotype of DNA degradation during electrophoresis. The system is sporadically distributed in bacteria, much like some restriction enzyme operons. DndB is described as a putative ATPase. [DNA metabolism, Restriction/modification]	353
274485	TIGR03236	dnd_assoc_1	DNA phosphorothioation-dependent restriction protein DptG. A DNA sulfur modification (phosphorothioation) system, dnd (degradation during electrophoresis), is sparsely and sporadically distributed among the bacteria. This protein is one member of a three-gene restriction enzyme cassette that depends on DNA phosphorothioation. [DNA metabolism, Restriction/modification]	363
132281	TIGR03237	dnd_assoc_2	DNA phosphorothioation-dependent restriction protein DptH. A DNA sulfur modification (phosphorothioation) system, dnd (degradation during electrophoresis), is sparsely and sporadically distributed among the bacteria. This protein is one member of a three-gene restriction enzyme cassette that depends on DNA phosphorothioation. [DNA metabolism, Restriction/modification]	1256
132282	TIGR03238	dnd_assoc_3	DNA phosphorothioation-dependent restriction protein DptF. A DNA sulfur modification (phosphorothioation) system, dnd (degradation during electrophoresis), is sparsely and sporadically distributed among the bacteria. This protein is one member of a three-gene restriction enzyme cassette that depends on DNA phosphorothioation. [DNA metabolism, Restriction/modification]	504
132283	TIGR03239	GarL	2-dehydro-3-deoxyglucarate aldolase. In E. coli this enzyme (GarL) 2-dehydro-3-deoxyglucarate aldolase acts in the catabolism of several sugars including D-galactarate, D-glucarate and L-idarate. In fact, 5-dehydro-4-deoxy-D-glucarate aldolase is a synonym for this enzyme as it is unclear in the literature whether the enzyme acts on only one of these or, as seems likely, has no preference. (Despite the apparent large difference in substrate structure indicated by their names, 2-DH-3DO- and 5-DH-4DO-glucarate differ only by the chirality of the most central hydroxyl-bearing carbon and is alternately named 2-DH-3DO-galactarate.) The reported product of D-galactarate dehydratase (4.2.1.42) is the 5DH-4DO-glucarate isomer and this enzyme is found proximal to the aldolase in many genomes (GenProp0714) where no epimerase is known. Similarly, the product of D-glucarate dehydratase (4.2.1.40) is again the 5-DH-4DO isomer, so the provenance of the 2-DH-3DO-glucarate isomer for which this enzyme is named is unclear.	249
274486	TIGR03240	arg_catab_astD	succinylglutamate-semialdehyde dehydrogenase. Members of this protein family are succinylglutamic semialdehyde dehydrogenase (EC 1.2.1.71), the fourth enzyme in the arginine succinyltransferase (AST) pathway for arginine catabolism. [Energy metabolism, Amino acids and amines]	484
132285	TIGR03241	arg_catab_astB	succinylarginine dihydrolase. Members of this family are succinylarginine dihydrolase (EC 3.5.3.23), the second of five enzymes in the arginine succinyltransferase (AST) pathway. [Energy metabolism, Amino acids and amines]	443
132286	TIGR03242	arg_catab_astE	succinylglutamate desuccinylase. Members of this protein family are succinylglutamate desuccinylase, the fifth and final enzyme of the arginine succinyltransferase (AST) pathway for arginine catabolism. This model excludes the related protein aspartoacylase. [Energy metabolism, Amino acids and amines]	319
274487	TIGR03243	arg_catab_AOST	arginine and ornithine succinyltransferase subunits. In many bacteria, the sole member of this protein family is arginine N-succinyltransferase (EC 2.3.1.109), the AstA protein of the arginine succinyltransferase (ast) pathway. However, in Pseudomonas aeruginosa and several other species, a tandem gene pair encodes alpha and beta subunits of a heterodimer that is designated arginine and ornithine succinyltransferase (AOST).	335
274488	TIGR03244	arg_catab_AstA	arginine N-succinyltransferase. In many bacteria, the arginine succinyltransferase (ast) pathway operon consists of five genes, including this protein, arginine N-succinyltransferase (EC 2.3.1.109). In a few species, such as Pseudomonas aeruginosa, the member of this family is encoded adjacent to a paralog, and the two polypeptides form a heterodimeric enzyme, active on both arginine and ornithine. In such species, this polypeptide may be treated as the beta subunit of an enzyme that may be named either arginine N-succinyltransferase (AST) or arginine and ornithine N-succinyltransferase (AOST). [Energy metabolism, Amino acids and amines]	336
274489	TIGR03245	arg_AOST_alph	arginine/ornithine succinyltransferase, alpha subunit. In some bacteria, including Pseudomonas aeruginosa, the astA gene (arginine N-succinyltransferase) is replaced by tandem paralogs that form a heterodimer. This heterodimer from P. aeruginosa is characterized as arginine and ornithine N-2 succinyltransferase (AOST). Members of this protein family represent the less widespread paralog, designated AruI, or arginine/ornithine succinyltransferase, alpha subunit.	336
274490	TIGR03246	arg_catab_astC	succinylornithine transaminase family. Members of the seed alignment for this protein family are the enzyme succinylornithine transaminase (EC 2.6.1.81), which catalyzes the third of five steps in the arginine succinyltransferase (AST) pathway, an ammonia-releasing pathway of arginine degradation. All seed alignment sequences are found within arginine succinyltransferase operons, and all proteins that score above 820.0 bits should function as succinylornithine transaminase. However, a number of sequences extremely closely related in sequence, found in different genomic contexts, are likely to act in different biological processes and may act on different substrates. This model is designated subfamily rather than equivalog, pending further consideration, for this reason. [Energy metabolism, Amino acids and amines]	397
211799	TIGR03247	glucar-dehydr	glucarate dehydratase. Glucarate dehydratase converts D-glucarate (and L-idarate, a stereoisomer) to 5-dehydro-4-deoxyglucarate which is subsequently acted on by GarL, tartronate semialdehyde reductase and glycerate kinase (GenProp0716). The E. coli enzyme has been well-characterized.	441
274491	TIGR03248	galactar-dH20	galactarate dehydratase. Galactarate dehydratase converts D-galactarate to 5-dehydro-4-deoxyglucarate which is subsequently acted on by GarL, tartronate semialdehyde reductase and glycerate kinase (GenProp0714).	506
132293	TIGR03249	KdgD	5-dehydro-4-deoxyglucarate dehydratase. 5-dehydro-4-deoxyglucarate dehydratase not only catalyzes the dehydration of the substrate (diol to ketone + water), but causes the decarboxylation of the intermediate product to yield 2-oxoglutarate semialdehyde (2,5-dioxopentanoate). The gene for the enzyme is usually observed in the vicinity of transporters and dehydratases handling D-galactarate and D-gluconate as well as aldehyde dehydrogenases which convert the product to alpha-ketoglutarate.	296
132294	TIGR03250	PhnAcAld_DH	putative phosphonoacetaldehyde dehydrogenase. This family of genes are members of the pfam00171 NAD-dependent aldehyde dehydrogenase family. These genes are observed in Ralstonia eutropha JMP134, Sinorhizobium meliloti 1021, Burkholderia mallei ATCC 23344, Burkholderia thailandensis E264, Burkholderia cenocepacia AU 1054, Burkholderia pseudomallei K96243 and 1710b, Burkholderia xenovorans LB400, Burkholderia sp. 383 and Polaromonas sp. JS666 in close proximity to the PhnW gene (TIGR02326) encoding 2-aminoethyl phosphonate aminotransferase (which generates phosphonoacetaldehyde) and PhnA (TIGR02335) encoding phosphonoacetate hydrolase (not to be confused with the alkylphosphonate utilization operon protein PhnA modeled by TIGR00686). Additionally, transporters believed to be specific for 2-aminoethyl phosphonate are often present. PhnW is, in other organisms, coupled with PhnX (TIGR01422) for the degradation of phosphonoacetaldehyde (GenProp0238), but PhnX is apparently absent in each of the organisms containing this aldehyde reductase. PhnA, characterized in a strain of Pseudomonas fluorescens that has not yet been genome sequenced, is only rarely found outside of the PhnW and aldehyde dehydrogenase context. For instance, in Rhodopseudomonas and Bordetella bronchiseptica, it is adjacent to transporters presumably specific for the import of phosphonoacetate. It seems reasonably certain then, that this enzyme catalyzes the NAD-dependent oxidation of phosphonoacetaldehyde to phosphonoacetate, bridging the metabolic gap between PhnW and PhnA. We propose the name phosphonoacetaldehyde dehydrogenase and the gene symbol PhnY for this enzyme.	472
274492	TIGR03251	LAT_fam	L-lysine 6-transaminase. Characterized members of this protein family are L-lysine 6-transaminase, also called lysine epsilon-aminotransferase (LAT). The immediate product of the reaction of this enzyme on lysine, 2-aminoadipate 6-semialdehyde, becomes 1-piperideine 6-carboxylate, or P6C. This product may be converted subsequently to pipecolate or alpha-aminoadipate, lysine catabolites that may be precursors of certain secondary metabolites.	431
132296	TIGR03252	TIGR03252	uncharacterized HhH-GPD family protein. This model describes a small, well-conserved bacterial protein family. Its sequence largely consists of a domain, HhH-GPD, found in a variety of related base excision DNA repair enzymes (see pfam00730). [DNA metabolism, DNA replication, recombination, and repair]	177
211800	TIGR03253	oxalate_frc	formyl-CoA transferase. This enzyme, formyl-CoA transferase, transfers coenzyme A from formyl-CoA to oxalate. It forms a pathway, together with oxalyl-CoA decarboxylase, for oxalate degradation; decarboxylation by the latter gene regenerates formyl-CoA. The two enzymes typically are encoded by a two-gene operon. [Cellular processes, Detoxification]	415
132298	TIGR03254	oxalate_oxc	oxalyl-CoA decarboxylase. In a number of bacteria, including Oxalobacter formigenes from the human gut, a two-gene operon of oxc (oxalyl-CoA decarboxylase) and frc (formyl-CoA transferase) encodes a system for degrading and therefore detoxifying oxalate. Members of this family are the thiamine pyrophosphate (TPP)-containing enzyme oxalyl-CoA decarboxylase. [Cellular processes, Detoxification]	554
132299	TIGR03255	PhnV	2-aminoethylphosphonate ABC transport system, membrane component PhnV. This membrane component of an ABC transport system is found in Salmonella and Burkholderia lineages in the vicinity of enzymes for the breakdown of 2-aminoethylphosphonate.	272
132300	TIGR03256	met_CoM_red_alp	methyl-coenzyme M reductase, alpha subunit. Members of this protein family are the alpha subunit of methyl coenzyme M reductase, also called coenzyme-B sulfoethylthiotransferase (EC 2.8.4.1). This enzyme, with alpha, beta, and gamma subunits, catalyzes the last step in methanogenesis. Several methanogens encode two such enzymes, designated I and II; this model does not separate the isozymes. [Energy metabolism, Methanogenesis]	548
132301	TIGR03257	met_CoM_red_bet	methyl-coenzyme M reductase, beta subunit. Members of this protein family are the beta subunit of methyl coenzyme M reductase, also called coenzyme-B sulfoethylthiotransferase (EC 2.8.4.1). This enzyme, with alpha, beta, and gamma subunits, catalyzes the last step in methanogenesis. Several methanogens encode two such enzymes, designated I and II; this model does not separate the isozymes. [Energy metabolism, Methanogenesis]	433
132302	TIGR03258	PhnT	2-aminoethylphosphonate ABC transport system, ATP-binding component PhnT. This ATP-binding component of an ABC transport system is found in Salmonella and Burkholderia lineages in the vicinity of enzymes for the breakdown of 2-aminoethylphosphonate.	362
132303	TIGR03259	met_CoM_red_gam	methyl-coenzyme M reductase, gamma subunit. Members of this protein family are the gamma subunit of methyl coenzyme M reductase, also called coenzyme-B sulfoethylthiotransferase (EC 2.8.4.1). This enzyme, with alpha, beta, and gamma subunits, catalyzes the last step in methanogenesis. Several methanogens encode two such enzymes, designated I and II; this model does not separate the isozymes. [Energy metabolism, Methanogenesis]	244
274493	TIGR03260	met_CoM_red_D	methyl-coenzyme M reductase operon protein D. Members of this protein family are protein D, a non-structural protein, of the operon for methyl coenzyme M reductase, also called coenzyme-B sulfoethylthiotransferase (EC 2.8.4.1). That enzyme, with alpha, beta, and gamma subunits, catalyzes the last step in methanogenesis; it has several modified sites, so accessory proteins are expected. Several methanogens encode two such enzymes, designated I and II; this model does not separate the isozymes. Proteins in this family are expressed at much lower levels than the methyl-coenzyme M reductase itself and have been shown to form at least transient associations with it. The precise function is unknown. [Energy metabolism, Methanogenesis]	150
274494	TIGR03261	phnS2	putative 2-aminoethylphosphonate ABC transporter, periplasmic 2-aminoethylphosphonate-binding protein. This ABC transporter extracellular solute-binding protein is found in a number of genomes in operon-like contexts strongly suggesting a substrate specificity for 2-aminoethylphosphonate (2-AEP). The characterized PhnSTUV system is absent in the genomes in which this system is found. These genomes encode systems for the catabolism of 2-AEP, making the need for a 2-AEP-specific transporter likely. [Transport and binding proteins, Amino acids, peptides and amines]	334
274495	TIGR03262	PhnU2	putative 2-aminoethylphosphonate ABC transporter, permease protein. [Transport and binding proteins, Amino acids, peptides and amines]	546
213788	TIGR03263	guanyl_kin	guanylate kinase. Members of this family are the enzyme guanylate kinase, also called GMP kinase. This enzyme transfers a phosphate from ATP to GMP, yielding ADP and GDP. [Purines, pyrimidines, nucleosides, and nucleotides, Nucleotide and nucleoside interconversions]	179
132308	TIGR03264	met_CoM_red_C	methyl-coenzyme M reductase I operon protein C. Members of this protein family are protein C, a non-structural protein, of the operon for methyl coenzyme M reductase, also called coenzyme-B sulfoethylthiotransferase (EC 2.8.4.1). That enzyme, with alpha, beta, and gamma subunits, catalyzes the last step in methanogenesis; it has several modified sites, so accessory proteins are expected. Several methanogens encode two such enzymes, designated I and II; this protein occurs only in operons of type I. The precise function is unknown. [Energy metabolism, Methanogenesis]	194
274496	TIGR03265	PhnT2	putative 2-aminoethylphosphonate ABC transporter, ATP-binding protein. This ABC transporter ATP-binding protein is found in a number of genomes in operon-like contexts strongly suggesting a substrate specificity for 2-aminoethylphosphonate (2-AEP). The characterized PhnSTUV system is absent in the genomes in which this system is found. These genomes encode systems for the catabolism of 2-AEP, making the need for a 2-AEP-specific transporter likely. [Transport and binding proteins, Amino acids, peptides and amines]	353
132310	TIGR03266	methan_mark_1	putative methanogenesis marker protein 1. Members of this protein family represent a distinct clade among the larger set of proteins that belong to families TIGR00702 and pfam02624. Proteins from this clade are found in genome sequence if and only if the species sequenced is one of the methanogens. All methanogens belong to the archaea; some but not all of those sequenced are hyperthermophiles. This protein family was detected by the method of partial phylogenetic profiling (see Haft, et al., 2006).	376
132311	TIGR03267	methan_mark_2	putative methanogenesis marker protein 2. A single member of this protein family is found in each of the first ten complete genome sequences of archaeal methanogens, and nowhere else. Sequence similarity to various bacterial proteins is reflected in Pfam models pfam00586 and pfam02769, AIR synthase related protein N-terminal and C-terminal domains, respectively. The functions of proteins in this family are unknown, but their role is likely one essential to methanogenesis. [Energy metabolism, Methanogenesis]	323
132312	TIGR03268	methan_mark_3	putative methanogenesis marker protein 3. A single member of this protein family is found in each of the first ten complete genome sequences of archaeal methanogens, and nowhere else. This protein family was detected by the method of partial phylogenetic profiling (see Haft, et al., 2006). The functions of proteins in this family are unknown, but their role is likely one essential to methanogenesis. [Energy metabolism, Methanogenesis]	503
132313	TIGR03269	met_CoM_red_A2	methyl coenzyme M reductase system, component A2. The enzyme that catalyzes the final step in methanogenesis, methyl coenzyme M reductase, contains alpha, beta, and gamma chains. In older literature, the complex of alpha, beta, and gamma chains was termed component C, while this single chain protein was termed methyl coenzyme M reductase system component A2. [Energy metabolism, Methanogenesis]	520
274497	TIGR03270	methan_mark_4	putative methanogen marker protein 4. Members of this protein family, to date, are found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely linked to it. Some members have been suggested to be a methyltransferase, based on the proximity of its gene to genes of the multi-subunit complex, N5-methyl-tetrahydromethanopterin--coenzyme M methyltransferase. That context is not conserved, however. The family shows similarity to various phosphate acyltransferases. [Energy metabolism, Methanogenesis]	202
132315	TIGR03271	methan_mark_5	putative methanogenesis marker protein 5. Members of this protein family, to date, are found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it. [Energy metabolism, Methanogenesis]	142
274498	TIGR03272	methan_mark_6	putative methanogenesis marker protein 6. Members of this protein family, to date, are found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it. [Energy metabolism, Methanogenesis]	132
132317	TIGR03274	methan_mark_7	putative methanogenesis marker protein 7. Members of this protein family, to date, are found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it.	302
132318	TIGR03275	methan_mark_8	putative methanogenesis marker protein 8. Members of this protein family, to date, are found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it.	259
132319	TIGR03276	Phn-HD	phosphonate degradation operons associated HDIG domain protein. This small clade of proteins are found adjacent to other genes implicated in the catabolism of phosphonates. They are members of the TIGR00277 domain family and contain a series of five invariant histidines (the domain in general has only four).	179
132320	TIGR03277	methan_mark_9	putative methanogenesis marker domain 9. A gene for a protein that contains a copy of this domain, to date, is found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it. A 69-amino acid core region of this 110-amino acid domain contains eight invariant Cys residues, including two copies of a motif [WFY]CCxxKPC. These motifs could be consistent with predicted metal-binding transcription factor as was suggested for the COG4008 family. Some members of this family have an additional N-terminal domain of about 250 amino acids from the nifR3 family of predicted TIM-barrel proteins.	109
132321	TIGR03278	methan_mark_10	methanogenesis marker radical SAM protein. Members of this protein family, to date, are found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. It is a radical SAM enzyme by homology. The exact function is unknown, but likely is linked to methanogenesis. In most genomes, the member of this family is encoded by a gene next to, and divergently transcribed from, the methyl coenzyme M reductase operon.	404
132322	TIGR03279	cyano_FeS_chp	putative radical SAM enzyme, TIGR03279 family. Members of this protein family are predicted radical SAM enzymes of unknown function, apparently restricted to and universal across the Cyanobacteria. The high trusted cutoff score for this model, 700 bits, excludes homologs from other lineages. This exclusion seems justified because a significant number of sequence positions are simultaneously unique to and invariant across the Cyanobacteria, suggesting a specialized, conserved function, perhaps related to photosynthesis. A distantly related protein family, TIGR03278, is universal in and restricted to archaeal methanogens, and may be linked to methanogenesis.	433
213789	TIGR03280	methan_mark_11	methanogenesis imperfect marker protein 11. The first twenty-nine completed genomes with a member of this protein family include twenty-eight archaeal methanogens and one other related archaeon, Ferroglobus placidus DSM 10642. The exact function is unknown, but the protein likely belongs to a system usually tightly linked to methanogenesis.	292
132324	TIGR03281	methan_mark_12	putative methanogenesis marker protein 12. Members of this protein family, to date, are found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it. [Energy metabolism, Methanogenesis]	326
274499	TIGR03282	methan_mark_13	putative methanogenesis marker 13 metalloprotein. Members of this protein family, to date, are found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it. This metal cluster-binding family is related to nitrogenase structural protein NifD and accessory protein NifE, among others. [Energy metabolism, Methanogenesis]	352
132326	TIGR03283	thy_syn_methano	thymidylate synthase, methanogen type. Thymidylate synthase makes dTMP for DNA synthesis, and is among the most widely distributed of all enzymes. Members of this protein family are encoded within a completed genome sequence if and only if that species is one of the methanogenic archaea. In these species, tetrahydromethanopterin replaces tetrahydrofolate. The member from Methanobacterium thermoautotrophicum was shown to behave as a thymidylate synthase based on similar side reactions (the exchange of a characteristic proton with water), although the full reaction was not reconstituted. Partial sequence data showed no similarity to known thymidylate synthases simply because the region sequenced was from a distinctive N-terminal region not found in other thymidylate synthases. Members of this protein family appear, therefore, to be a novel, tetrahydromethanopterin-dependent thymidylate synthase. [Purines, pyrimidines, nucleosides, and nucleotides, 2'-Deoxyribonucleotide metabolism]	199
213790	TIGR03284	thym_sym	thymidylate synthase. Members of this protein family are thymidylate synthase, an enzyme that produces dTMP from dUMP. In prokaryotes, its gene usually is found close to that for dihydrofolate reductase, and in some systems the two enzymes are found as a fusion protein. This model excludes a set of related proteins (TIGR03283) that appears to replace this family in archaeal methanogens, where tetrahydrofolate is replaced by tetrahydromethanopterin. [Purines, pyrimidines, nucleosides, and nucleotides, 2'-Deoxyribonucleotide metabolism]	295
274500	TIGR03285	methan_mark_14	putative methanogenesis marker protein 14. Members of this protein family, to date, are found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it. [Energy metabolism, Methanogenesis]	445
132329	TIGR03286	methan_mark_15	putative methanogenesis marker protein 15. Members of this protein family, to date, are found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it. Related proteins include the BadF/BadG/BcrA/BcrD ATPase family (pfam01869), which includes an activator for (R)-2-hydroxyglutaryl-CoA dehydratase. [Energy metabolism, Methanogenesis]	404
274501	TIGR03287	methan_mark_16	putative methanogenesis marker 16 metalloprotein. Members of this protein family, to date, are found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it. This protein is predicted to bind FeS clusters, based on the presence of two copies of the Fer4 domain (pfam00037), with each copy having four Cys residues invariant across all members. [Energy metabolism, Methanogenesis]	391
132331	TIGR03288	CoB_CoM_SS_B	CoB--CoM heterodisulfide reductase, subunit B. Members of this protein family are subunit B of the CoB--CoM heterodisulfide reductase, or simply heterodisulfide reductase, found in methanogenic archaea. Some archaea species have two copies, HdrB1 and HdrB2.	290
274502	TIGR03289	frhB	coenzyme F420 hydrogenase, subunit beta. This model represents that clade of F420-dependent hydrogenases (FRH) beta subunits found exclusively and universally in methanogenic archaea. The N- and C-terminal domains of this protein are modelled by pfam04422 and pfam04423 respectively.	275
132333	TIGR03290	CoB_CoM_SS_C	CoB--CoM heterodisulfide reductase, subunit C. The last step in methanogenesis leaves two coenzymes of methanogenesis, CoM and CoB, linked by a disulfide bond. Members of this protein family are the C subunit of the enzyme that reduces the heterodisulfide to CoB-SH and CoM-SH. Similar enzyme complex subunits are found in various other species, but likely act on a different substrate.	144
274503	TIGR03291	methan_mark_17	putative methanogenesis marker protein 17. Members of this protein family, to date, are found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it. [Energy metabolism, Methanogenesis]	185
274504	TIGR03292	PhnH_redo	phosphonate C-P lyase system protein PhnH. PhnH is a component of the C-P lyase system (GenProp0232) for the catabolism of phosphonate compounds. The specific function of this component is unknown. This model is based on pfam05845.2, and has been broadened to include sequences missed by that model which are clearly true positive hits based on genome context.	174
274505	TIGR03293	PhnG_redo	phosphonate C-P lyase system protein PhnG. PhnG is a component of the C-P lyase system (GenProp0232) for the catabolism of phosphonate compounds. The specific function of this component is unknown. This model is based on pfam06754.2, and has been broadened to include sequences missed by that model which are clearly true positive hits based on genome context.	144
132337	TIGR03294	FrhG	coenzyme F420 hydrogenase, subunit gamma. This model represents that clade of F420-dependent hydrogenases (FRH) gamma subunits found exclusively and universally in methanogenic archaea. This protein contains two 4Fe-4S cluster binding domains (pfam00037) and scores above the trusted cutoff to model pfam01058 for the "NADH ubiquinone oxidoreductase, 20 Kd subunit" family.	228
274506	TIGR03295	frhA	coenzyme F420 hydrogenase, subunit alpha. This model represents that clade of F420-dependent hydrogenases (FRH) alpha subunits found exclusively and universally in methanogenic archaea. This protein is a member of the Nickel-dependent hydrogenase superfamily represented by Pfam model, pfam00374.	408
274507	TIGR03296	M6dom_TIGR03296	M6 family metalloprotease domain. This model describes a metalloproteinase domain, with a characteristic HExxH motif. Examples of this domain are found in proteins in the family of immune inhibitor A, which cleaves antibacterial peptides, and in other, only distantly related proteases. This model is built to be broader and more inclusive than pfam05547.	285
274508	TIGR03297	Ppyr-DeCO2ase	phosphonopyruvate decarboxylase. This family consists of examples of phosphonopyruvate decarboxylase, an enzyme that produces phosphonoacetaldehyde (Pald), the second step in the biosynthesis of phosphonate-containing compounds. Since the preceding enzymatic step, PEP phosphomutase (AepX, TIGR02320) favors the substrate PEP energetically, the decarboxylase is required to drive the reaction in the direction of phosphonate production. Pald is a precursor of natural products including antibiotics like bialaphos and phosphinothricin in Streptomyces species, phosphonate-modified molecules such as the polysaccharide B of Bacteroides fragilis, the phosphonolipids of Tetrahymena pyriformis, and the glycosylinositolphospholipids of Trypanosoma cruzi. This gene generally occurs in prokaryotic organisms adjacent to the gene for AepX. Most often an aminotransferase (aepZ) is also present which leads to the production of the most common phosphonate compound, 2-aminoethylphosphonate (AEP).	361
274509	TIGR03298	argP	transcriptional regulator, ArgP family. ArgP used to be known as IciA. ArgP is a positive regulator of argK. It is a negative autoregulator in presence of arginine. It competes with DnaA for oriC iteron (13-mer) binding. It activates dnaA and nrd transcription. It has been demonstrated to be part of the pho regulon. ArgP mutants convey canavanine (an L-arginine structural homolog) sensitivity. [Cellular processes, Toxin production and resistance, DNA metabolism, DNA replication, recombination, and repair, Regulatory functions, DNA interactions]	292
274510	TIGR03299	LGT_TIGR03299	phage/plasmid-like protein TIGR03299. Members of this uncharacterized protein family are found in various Mycobacterium phage genomes, in Streptomyces coelicolor plasmid SCP1, and in bacterial genomes near various markers that suggest lateral gene transfer. The function is unknown. [Mobile and extrachromosomal element functions, Other]	309
274511	TIGR03300	assembly_YfgL	outer membrane assembly lipoprotein YfgL. Members of this protein family are YfgL, a lipoprotein component of a complex that acts in protein insertion into the bacterial outer membrane. Other members of this complex are NlpB, YfiO, and YaeT. This protein contains multiple copies of a repeat that, in other contexts, are associated with binding of the coenzyme PQQ. [Protein fate, Protein and peptide secretion and trafficking]	377
274512	TIGR03301	PhnW-AepZ	2-aminoethylphosphonate aminotransferase. This family includes a number of 2-aminoethylphosphonate aminotransferases, some of which are indicated to operate in the catabolism of 2-aminoethylphosphonate (AEP) and others which are involved in the biosynthesis of the same compound. The catabolic enzyme (PhnW) is known to use pyruvate:alanine as the transfer partner and is modeled by the equivalog-level model (TIGR02326). The PhnW family is apparently a branch of a larger tree including genes (AepZ) adjacent to others responsible for the biosynthesis of phosphonoacetaldehyde. The identity of the transfer partner is unknown for these enzymes and considering the reversed flux compared to PhnW, it may very well be different.	355
274513	TIGR03302	OM_YfiO	outer membrane assembly lipoprotein YfiO. Members of this protein family include YfiO, a near-essential protein of the outer membrane, part of a complex involved in protein insertion into the bacterial outer membrane. Many proteins in this family are annotated as ComL, based on the involvement of this protein in natural transformation with exogenous DNA in Neisseria gonorrhoeae. This protein family shows sequence similarity to, but is distinct from, the tol-pal system protein YbgF (TIGR02795). [Protein fate, Protein and peptide secretion and trafficking]	235
274514	TIGR03303	OM_YaeT	outer membrane protein assembly complex, YaeT protein. Members of this protein family are the YaeT protein of the YaeT/YfiO complex for assembling proteins into the outer membrane of Gram-negative bacteria. This protein is similar in sequence and function to a non-essential paralog, YtfM, that is also in the Omp85 family. Members of this family typically have five tandem copies of the surface antigen variable number repeat (pfam07244), followed by an outer membrane beta-barrel domain (pfam01103), while the YtfM family typically has a single pfam07244 repeat. [Protein fate, Protein and peptide secretion and trafficking]	741
274515	TIGR03304	OMP85_target	outer membrane insertion C-terminal signal. This hidden Markov model detects a 10-residue targeting sequence common to beta-barrel outer membrane proteins (OMP) that rely on Omp85-like proteins for insertion into the outer membrane. Hits should be trusted if they include the last amino acid of a protein sequence that occurs in Gram-negative bacteria. It has been noted that Omp85 target sequences differ somewhat by species, while this model works generally for most Proteobacteria.	10
132348	TIGR03305	alt_F1F0_F1_bet	alternate F1F0 ATPase, F1 subunit beta. A small number of taxonomically diverse prokaryotic species have what appears to be a second ATP synthase, in addition to the normal F1F0 ATPase in bacteria and A1A0 ATPase in archaea. These enzymes use ion gradients to synthesize ATP, and in principle may run in either direction. This model represents the F1 beta subunit of this apparent second ATP synthase.	449
132349	TIGR03306	altF1_A	alternate F1F0 ATPase, F0 subunit A. A small number of taxonomically diverse prokaryotic species have what appears to be a second ATP synthase, in addition to the normal F1F0 ATPase in bacteria and A1A0 ATPase in archaea. These enzymes use ion gradients to synthesize ATP, and in principle may run in either direction. This model represents the F0 subunit A of this apparent second ATP synthase.	217
163212	TIGR03307	PhnP	phosphonate metabolism protein PhnP. This family of proteins is found in operons encoding phosphonate C-P lyase systems, as is observed in E. coli, and is a member of the metallo-beta-lactamase superfamily (pfam00753). As defined by this model, all instances of this protein are associated with the C-P lyase, but not all genomes containing the C-P lyase system contain phnP.	238
132351	TIGR03308	phn_thr-fam	phosphonate metabolism protein, transferase hexapeptide repeat family. This family of proteins contains copies of the Bacterial transferase hexapeptide repeat family (pfam00132) and is only found in operons encoding the phosphonate C-P lyase system (GenProp0232). Many C-P lyase operons, however, lack a homolog of this protein.	204
132352	TIGR03309	matur_yqeB	selenium-dependent molybdenum hydroxylase system protein, YqeB family. Members of this protein family are probable accessory proteins for the biosynthesis of enzymes with labile selenium-containing centers, different from selenocysteine-containing proteins.	256
274516	TIGR03310	matur_MocA_YgfJ	molybdenum cofactor cytidylyltransferase. Members of this protein family include MocA, which transfers cytosine from CTP to molybdopterin during molybdopterin cytosine dinucleotide (MCD) cofactor biosynthesis. It is distantly related to MobA, the GTP:molybdopterin guanylyltransferase. The MocA family is particularly closely related in phylogenetic distribution to other markers of selenium-dependent molybdenum hydroxylases (SDMH), suggesting most SDMH must use MCD rather than molybdopterin guanine dinucleotide.	188
132354	TIGR03311	Se_dep_XDH	selenium-dependent xanthine dehydrogenase. Members of this protein family resemble conventional xanthine dehydrogenase enzymes, which depend on molybdenum cofactor - molybdopterin bound to molybdate with two sulfur atoms as ligands. But all members of this family occur in species that contain markers for the biosynthesis of enzymes with a selenium-containing form of molybdenum cofactor. The member of this family from Enterococcus faecalis has been shown to act as a xanthine dehydrogenase, and its activity is dependent on SelD (selenophosphate synthase), selenium, and molybdenum. [Purines, pyrimidines, nucleosides, and nucleotides, Other]	848
132355	TIGR03312	Se_sel_red_FAD	probable selenate reductase, FAD-binding subunit. This protein is suggested by Bebien, et al., to be the FAD-binding subunit of a molybdopterin-containing selenate reductase. Our comparative genomics suggests it to be a subunit of a selenium-dependent molybdenum hydroxylase for an unknown substrate.	257
132356	TIGR03313	Se_sel_red_Mo	probable selenate reductase, molybdenum-binding subunit. Our comparative genomics suggests this protein family to be a subunit of a selenium-dependent molybdenum hydroxylase, although the substrate is not specified. This protein is suggested by Bebien, et al., to be the molybdenum-binding subunit of a molybdopterin-containing selenate reductase. Xi, et al., however, show that mutation of this gene in E. coli conferred sensitivity to adenine, suggesting a defect in purine interconversion. This finding, plus homology of nearby genes in a 23-gene purine catabolism region in E. coli to xanthine dehydrogenase subunits, suggests xanthine dehydrogenase activity.	951
132357	TIGR03314	Se_ssnA	putative selenium metabolism protein SsnA. Members of this protein family are found exclusively in genomes that contain a putative set of labile selenium-dependent enzyme accessory proteins as well as homologs of a labile selenium-dependent purine hydroxylase. A mutant in this gene in Escherichia coli had improved stationary phase viability. The function is unknown.	441
132358	TIGR03315	Se_ygfK	putative selenate reductase, YgfK subunit. Members of this protein family are YgfK, predicted to be one subunit of a three-subunit, molybdopterin-containing selenate reductase. This enzyme is found, typically, in genomic regions associated with xanthine dehydrogenase homologs predicted to belong to the selenium-dependent molybdenum hydroxylases (SDMH). Therefore, the selenate reductase is suggested to play a role in furnishing selenide for SelD, the selenophosphate synthase.	1012
274517	TIGR03316	ygeW	knotted carbamoyltransferase YgeW. Members of this protein family include the ygeW gene product of Escherichia coli. The function is unknown. Members show homology to ornithine carbamoyltransferase (TIGR00658) and aspartate carbamoyltransferase (carbamoyltransferase), and therefore may belong to the carbamoyltransferases in function. Members often are found in a large, conserved genomic region associated with selenium-dependent molybdenum hydroxylases.	391
274518	TIGR03317	ygfZ_signature	folate-binding protein YgfZ. YgfZ is a protein from Escherichia coli, homologous to the glycine cleavage system T protein, or aminomethyltransferase, GcvT (TIGR00528). Homologs of YgfZ other than members of the GcvT family share a well-conserved signature region that includes the motif, KGCYxGQE. Elsewhere, sequence divergence and length variation are substantial. Members of this family are mostly bacterial, largely absent from the Firmicutes and otherwise usually present. A few eukaryotic examples are found among the Apicomplexa, and a few archaeal sequences are found. Two functions implicated for this folate-binding protein are RNA modification (a function likely to be conserved) and replication initiation (a function likely to be highly variable). Many members of this family are, at the time of construction of this model, misnamed as the glycine cleavage system T protein. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	67
132361	TIGR03318	YdfZ_fam	putative selenium-binding protein YdfZ. This small protein has a very limited distribution, being found so far only among some gamma-Proteobacteria. The member from Escherichia coli was shown to bind selenium in the absence of a working SelD-dependent selenium incorporation system. Note that while the E. coli member contains a single Cys residue, a likely selenium binding site, some other members of this protein family contain two Cys residues or none. [Unknown function, General]	65
188306	TIGR03319	RNase_Y	ribonuclease Y. Members of this family are RNase Y, an endoribonuclease. The member from Bacillus subtilis, YmdA, has been shown to be involved in turnover of yitJ riboswitch. [Transcription, Degradation of RNA]	514
132364	TIGR03321	alt_F1F0_F0_B	alternate F1F0 ATPase, F0 subunit B. A small number of taxonomically diverse prokaryotic species, including Methanosarcina barkeri, have what appears to be a second ATP synthase, in addition to the normal F1F0 ATPase in bacteria and A1A0 ATPase in archaea. These enzymes use ion gradients to synthesize ATP, and in principle may run in either direction. This model represents the F0 subunit B of this apparent second ATP synthase.	246
132365	TIGR03322	alt_F1F0_F0_C	alternate F1F0 ATPase, F0 subunit C. A small number of taxonomically diverse prokaryotic species, including Methanosarcina barkeri, have what appears to be a second ATP synthase, in addition to the normal F1F0 ATPase in bacteria and A1A0 ATPase in archaea. These enzymes use ion gradients to synthesize ATP, and in principle may run in either direction. This model represents the F0 subunit C of this apparent second ATP synthase.	86
274519	TIGR03323	alt_F1F0_F1_gam	alternate F1F0 ATPase, F1 subunit gamma. A small number of taxonomically diverse prokaryotic species, including Methanosarcina barkeri, have what appears to be a second ATP synthase, in addition to the normal F1F0 ATPase in bacteria and A1A0 ATPase in archaea. These enzymes use ion gradients to synthesize ATP, and in principle may run in either direction. This model represents the F1 gamma subunit of this apparent second ATP synthase.	285
132367	TIGR03324	alt_F1F0_F1_al	alternate F1F0 ATPase, F1 subunit alpha. A small number of taxonomically diverse prokaryotic species, including Methanosarcina barkeri, have what appears to be a second ATP synthase, in addition to the normal F1F0 ATPase in bacteria and A1A0 ATPase in archaea. These enzymes use ion gradients to synthesize ATP, and in principle may run in either direction. This model represents the F1 alpha subunit of this apparent second ATP synthase.	497
132368	TIGR03325	BphB_TodD	cis-2,3-dihydrobiphenyl-2,3-diol dehydrogenase. Members of this family occur as the BphB protein of biphenyl catabolism and as the TodD protein of toluene catabolism. Members catalyze the second step in each pathway and proved interchangeable when tested; the first and fourth enzymes in each pathway confer metabolic specificity. In the context of biphenyl degradation, the enzyme acts as cis-2,3-dihydrobiphenyl-2,3-diol dehydrogenase (EC 1.3.1.56), while in toluene degradation it acts as cis-toluene dihydrodiol dehydrogenase.	262
188307	TIGR03326	rubisco_III	ribulose bisphosphate carboxylase, type III. Members of this protein family are the archaeal, single chain, type III form of ribulose bisphosphate carboxylase, or RuBisCO. Members act in a three-step pathway for conversion of the sugar moiety of AMP to two molecules of 3-phosphoglycerate. Many of these species use ADP-dependent sugar kinases, which form AMP, for glycolysis. [Energy metabolism, Sugars]	411
274520	TIGR03327	AMP_phos	AMP phosphorylase. This enzyme family is found, so far, strictly in the Archaea, and only in those with a type III Rubisco enzyme. Most of the members previously were annotated as thymidine phosphorylase, or DeoA. The AMP metabolized by this enzyme may be produced by ADP-dependent sugar kinases.	494
274521	TIGR03328	salvage_mtnB	methylthioribulose-1-phosphate dehydratase. Members of this family are the methylthioribulose-1-phosphate dehydratase of the methionine salvage pathway. This pathway allows methylthioadenosine, left over from polyamine biosynthesis, to be recycled to methionine. [Amino acid biosynthesis, Aspartate family]	192
274522	TIGR03329	Phn_aa_oxid	putative aminophosphonate oxidoreductase. This clade of sequences are members of the pfam01266 family of FAD-dependent oxidoreductases. Characterized proteins within this family include glycerol-3-phosphate dehydrogenase (1.1.99.5), sarcosine oxidase beta subunit (1.5.3.1) and a number of deaminating amino acid oxidases (1.4.-.-). These genes have been consistently observed in a genomic context including genes for the import and catabolism of 2-aminoethylphosphonate (AEP). If the substrate of this oxidoreductase is AEP itself, then it is probably acting in the manner of a deaminating oxidase, resulting in the same product (phosphonoacetaldehyde) as the transaminase PhnW (TIGR02326), but releasing ammonia instead of coupling to pyruvate:alanine. Alternatively, it is reasonable to suppose that the various ABC cassette transporters which are also associated with these loci allow the import of phosphonates closely related to AEP which may not be substrates for PhnW.	460
274523	TIGR03330	SAM_DCase_Bsu	S-adenosylmethionine decarboxylase proenzyme, Bacillus form. Members of this protein family are the single chain precursor of the two chains of the mature S-adenosylmethionine decarboxylase as found in Methanocaldococcus jannaschii, Bacillus subtilis, and a wide range of other species. It differs substantially in architecture from the form as found in Escherichia coli, and lacks any extended homology to the eukaryotic form (TIGR00535). [Central intermediary metabolism, Polyamine biosynthesis]	112
274524	TIGR03331	SAM_DCase_Eco	S-adenosylmethionine decarboxylase proenzyme, Escherichia coli form. Members of this protein family are the single chain precursor of the S-adenosylmethionine decarboxylase as found in Escherichia coli. This form shows a substantially different architecture from the form shared by the Archaea, Bacillus, and many other species (TIGR03330). It shows little or no similarity to the form found in eukaryotes (TIGR00535). [Central intermediary metabolism, Polyamine biosynthesis]	259
132375	TIGR03332	salvage_mtnW	2,3-diketo-5-methylthiopentyl-1-phosphate enolase. Members of this family are the methionine salvage pathway enzyme 2,3-diketo-5-methylthiopentyl-1-phosphate enolase, a homolog of RuBisCO. This protein family seems restricted to Bacillus subtilis and close relatives, where two separate proteins carry the enolase and phosphatase activities that in other species occur in a single protein, MtnC (TIGR01691). [Amino acid biosynthesis, Aspartate family, Central intermediary metabolism, Sulfur metabolism]	407
213797	TIGR03333	salvage_mtnX	2-hydroxy-3-keto-5-methylthiopentenyl-1-phosphate phosphatase. Members of this family are the methionine salvage enzyme MtnX, a member of the HAD-superfamily hydrolases, subfamily IB (see TIGR01488). Members are found in Bacillus subtilis and related species, paired with MtnW (TIGR03332). In most species that recycle methionine from methylthioadenosine, the single protein MtnC replaces the MtnW/MtnX pair. In B. subtilis, mtnX was first known as ykrX. [Amino acid biosynthesis, Aspartate family, Central intermediary metabolism, Sulfur metabolism]	214
274525	TIGR03334	IOR_beta	indolepyruvate ferredoxin oxidoreductase, beta subunit. This model represents the beta subunit of indolepyruvate ferredoxin oxidoreductase, an alpha(2)/beta(2) tetramer, as found in Pyrococcus furiosus and Methanobacterium thermoautotrophicum. Cofactors for the tetramer include TPP, 4Fe-4S, and 3Fe-4S. It shows considerable sequence similarity to subunits of several other ketoacid oxidoreductases.	189
132378	TIGR03335	F390_ftsA	coenzyme F390 synthetase. This enzyme, characterized in Methanobacterium thermoautotrophicum and found in several other methanogens, modifies coenzyme F420 by ligation of AMP (or GMP) from ATP (or GTP). On F420, it activates an aromatic hydroxyl group, which is unusual chemistry for an adenylyltransferase. This enzyme name has been attached to numbers of uncharacterized genes likely to instead act as phenylacetate CoA ligase, based on proximity to predicted indolepyruvate ferredoxin oxidoreductase (1.2.7.8) genes. The enzyme acts during transient exposure of the organism to oxygen. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other, Energy metabolism, Methanogenesis]	445
274526	TIGR03336	IOR_alpha	indolepyruvate ferredoxin oxidoreductase, alpha subunit. Indolepyruvate ferredoxin oxidoreductase (IOR) is an alpha 2/beta 2 tetramer related to ketoacid oxidoreductases for pyruvate (1.2.7.1, POR), 2-ketoglutarate (1.2.7.3, KOR), and 2-oxoisovalerate (1.2.7.7, VOR). These multi-subunit enzymes typically are found in anaerobes and are inactivated by oxygen. IOR in Pyrococcus acts in fermentation of all three aromatic amino acids, following removal of the amino group by transamination. In Methanococcus maripaludis, by contrast, IOR acts in the opposite direction, in pathways of amino acid biosynthesis from phenylacetate, indoleacetate, and p-hydroxyphenylacetate. In M. maripaludis and many other species, iorA and iorB are found next to an apparent phenylacetate-CoA ligase.	595
132380	TIGR03337	phnR	phosphonate utilization transcriptional regulator PhnR. This family of proteins are members of the GntR family (pfam00392) containing an N-terminal helix-turn-helix (HTH) motif. This clade is found adjacent to or inside of operons for the degradation of 2-aminoethylphosphonate (AEP) in Salmonella, Vibrio, Aeromonas hydrophila, Hahella chejuensis and Psychromonas ingrahamii. [Regulatory functions, DNA interactions]	231
132381	TIGR03338	phnR_burk	phosphonate utilization associated transcriptional regulator. This family of proteins are members of the GntR family (pfam00392) containing an N-terminal helix-turn-helix (HTH) motif. This clade is found adjacent to or inside of operons for the degradation of 2-aminoethylphosphonate (AEP) in Polaromonas, Burkholderia, Ralstonia and Verminephrobacter.	212
132382	TIGR03339	phn_lysR	aminoethylphosphonate catabolism associated LysR family transcriptional regulator. This group of sequences represents a number of related clades with numerous examples of members adjacent to operons for the degradation of 2-aminoethylphosphonate (AEP) in Pseudomonas, Ralstonia, Bordetella and Burkholderia species. These are transcriptional regulators of the LysR family which contain a helix-turn-helix (HTH) domain (pfam00126) and a periplasmic substrate-binding protein-like domain (pfam03466). [Regulatory functions, DNA interactions]	279
274527	TIGR03340	phn_DUF6	phosphonate utilization associated putative membrane protein. This family of hydrophobic proteins has some homology to families of integral membrane proteins such as (pfam00892) and may be a permease. It occurs in the vicinity of various types of operons for the catabolism of phosphonates in Vibrio, Pseudomonas, Polaromonas and Thiomicrospira.	281
132384	TIGR03341	YhgI_GntY	IscR-regulated protein YhgI. IscR (TIGR02010) is an iron-sulfur cluster-binding transcriptional regulator (see Genome Property GenProp0138). Members of this protein family include YhgI, whose expression is under control of IscR, and show sequence similarity to IscA, a known protein of iron-sulfur cluster biosynthesis. These two lines of evidence strongly suggest a role as an iron-sulfur cluster biosynthesis protein. An older study designated this protein GntY and suggested a role for it and for the product of an adjacent gene, based on complementation studies, in gluconate utilization. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	190
213798	TIGR03342	dsrC_tusE_dsvC	sulfur relay protein, TusE/DsrC/DsvC family. Members of this protein family may be described as TusE, a partner to TusBCD in a sulfur relay system for 2-thiouridine biosynthesis, a tRNA base modification process. Other members are DsrC, a functionally similar protein in species where the sulfur relay system exists primarily for sulfur metabolism rather than tRNA base modification. Some members of this family are known explicitly as the gamma subunit of sulfite reductases.	108
132386	TIGR03343	biphenyl_bphD	2-hydroxy-6-oxo-6-phenylhexa-2,4-dienoate hydrolase. Members of this family are 2-hydroxy-6-oxo-6-phenylhexa-2,4-dienoate hydrolase, or HOPD hydrolase, the BphD protein of biphenyl degradation. BphD acts on the product of ring meta-cleavage by BphC. Many species carrying bphC and bphD are capable of degrading polychlorinated biphenyls as well as biphenyl itself.	282
213799	TIGR03344	VI_effect_Hcp1	type VI secretion system effector, Hcp1 family. This family includes Hcp1 (hemolysin coregulated protein 1), an exported, homohexameric ring-forming virulence protein from Pseudomonas aeruginosa. Hcp1 lacks a conventional signal sequence and is instead exported by means of the type VI secretion system, encoded by a pathogenicity cluster of a class previously designated IAHP (IcmF-associated homologous protein). Homologs of Hcp1, in this protein family, are found in various bacteria of which most but not all are known pathogens. Pathogens may have multiple members of this family, with three to ten in Erwinia carotovora, Yersinia pestis, uropathogenic Escherichia coli, and the insect pathogen Photorhabdus luminescens. [Cellular processes, Pathogenesis]	166
274528	TIGR03345	VI_ClpV1	type VI secretion ATPase, ClpV1 family. Members of this protein family are homologs of ClpB, an ATPase associated with chaperone-related functions. These ClpB homologs, designated ClpV1, are a key component of the bacterial pathogenicity-associated type VI secretion system. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	852
274529	TIGR03346	chaperone_ClpB	ATP-dependent chaperone ClpB. Members of this protein family are the bacterial ATP-dependent chaperone ClpB. This protein belongs to the AAA family, ATPases associated with various cellular activities (pfam00004). This molecular chaperone does not act as a protease, but rather serves to disaggregate misfolded and aggregated proteins. [Protein fate, Protein folding and stabilization]	850
274530	TIGR03347	VI_chp_1	type VI secretion protein, VC_A0111 family. Work by Mougous, et al. (2006), describes IAHP-related loci as a type VI secretion system. This protein family is associated with type VI secretion loci, although not treated explicitly by Mougous, et al. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	300
274531	TIGR03348	VI_IcmF	type VI secretion protein IcmF. Members of this protein family are IcmF homologs and tend to be associated with type VI secretion systems. [Cellular processes, Pathogenesis]	1169
274532	TIGR03349	IV_VI_DotU	type IV / VI secretion system protein, DotU family. At least two families of proteins, often encoded by adjacent genes, show sequence similarity due to homology between type IV secretion systems and type VI secretion systems. One is the IcmF family (TIGR03348). The other is the family described by this model. Members include DotU from the Legionella pneumophila type IV secretion system. Many of the members of this protein family from type VI secretion systems have an additional C-terminal domain with OmpA/MotB homology. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	183
274533	TIGR03350	type_VI_ompA	type VI secretion system peptidoglycan-associated domain. The flagellar motor protein MotB and the Gram-negative bacterial outer membrane protein OmpA (with an N-terminal outer membrane beta barrel domain) share a C-terminal peptidoglycan-associating homology region. This model describes a domain found fused to type VI secretion system homologs of the type IV system protein DotU (see model TIGR03349), with OmpA/MotB homology. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	137
274534	TIGR03351	PhnX-like	phosphonatase-like hydrolase. This clade of sequences are the closest homologs to the PhnX enzyme, phosphonoacetaldehyde (Pald) hydrolase (phosphonatase, TIGR01422). This phosphonatase-like enzyme and PhnX itself are members of the haloacid dehalogenase (HAD) superfamily (pfam00702) having a number of distinctive features that set them apart from typical HAD enzymes. The typical HAD N-terminal motif DxDx(T/V) here is DxAGT and the usual conserved lysine prior to the C-terminal motif is instead an arginine. Also distinctive of phosphonatase, and particular to its bi-catalytic mechanism, is a conserved lysine in the variable "cap" domain. This lysine forms a Schiff base with the aldehyde of phosphonoacetaldehyde, providing, through the resulting positive charge, a polarization of the C-P bond necessary for cleavage as well as a route to the initial product of cleavage, an ene-amine. The conservation of these elements in this phosphonatase-like enzyme suggests that the substrate is also, like Pald, a 2-oxo-ethylphosphonate. Despite this, the genomic context of members of this family is quite distinct from PhnX, which is almost invariably associated with the 2-aminoethylphosphonate transaminase PhnW (TIGR02326), the source of the substrate Pald. Members of this clade are never associated with PhnW, but rather associate with families of FAD-dependent oxidoreductases related to deaminating amino acid oxidases (pfam01266) as well as zinc-dependent dehydrogenases (pfam00107). Notably, family members from Arthrobacter aurescens TC1 and Nocardia farcinica IFM 10152 are adjacent to the PhnCDE ABC cassette phosphonates transporter (GenProp0236) typically found in association with the phosphonates C-P lyase system (GenProp0232). These observations suggest two possibilities. First, the substrate for this enzyme family is also Pald, the non-association with PhnW notwithstanding. Alternatively, the substrate is something very closely related such as hydroxyphosphonoacetaldehyde (Hpald). Hpald could come from oxidative deamination of 1-hydroxy-2-aminoethylphosphonate (HAEP) by the associated oxidase. HAEP would not be a substrate for PhnW due to its high specificity for AEP. HAEP has been shown to be a constituent of the sphingophosphonolipid of Bacteriovorax stolpii, and presumably has other natural sources. If Hpald is the substrate, the product would be glycolaldehyde (hydroxyacetaldehyde), and the associated alcohol dehydrogenase may serve to convert this to glycol.	220
274535	TIGR03352	VI_chp_3	type VI secretion lipoprotein, VC_A0113 family. Work by Mougous, et al. (2006), describes IAHP-related loci as a type VI secretion system. This protein family is associated with type VI secretion loci, although not treated explicitly by Mougous, et al. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	130
274536	TIGR03353	VI_chp_4	type VI secretion protein, VC_A0114 family. Work by Mougous, et al. (2006), describes IAHP-related loci as a type VI secretion system. This protein family is associated with type VI secretion loci, although not treated explicitly by Mougous, et al. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	439
274537	TIGR03354	VI_FHA	type VI secretion system FHA domain protein. Members of this protein family are FHA (forkhead-associated) domain-containing proteins that are part of type VI secretion loci in a considerable number of bacteria, most of which are known pathogens. Species include Pseudomonas aeruginosa PAO1, Aeromonas hydrophila, Yersinia pestis, Burkholderia mallei, etc. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	396
274538	TIGR03355	VI_chp_2	type VI secretion protein, EvpB/VC_A0108 family. Work by Mougous, et al. (2006), describes IAHP-related loci as a type VI secretion system. This protein family is associated with type VI secretion loci, although not treated explicitly by Mougous, et al. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	473
274539	TIGR03356	BGL	beta-galactosidase. 	426
274540	TIGR03357	VI_zyme	type VI secretion system lysozyme-like protein. The description for Pfam family pfam04965 cites acidic lysozyme activity for some phage-encoded members. This family represents a different subgroup of the proteins from pfam04965, where all members are associated with bacterial type VI secretion system genomic contexts.	133
132401	TIGR03358	VI_chp_5	type VI secretion protein, VC_A0107 family. Work by Mougous, et al. (2006), describes IAHP-related loci as a type VI secretion system. This protein family is associated with type VI secretion loci, although not treated explicitly by Mougous, et al. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	159
274541	TIGR03359	VI_chp_6	type VI secretion protein, VC_A0110 family. This protein family is associated with type VI secretion in a number of pathogenic bacteria. Mutation is associated with impaired virulence, such as impaired infection of plants by Rhizobium leguminosarum.	598
132403	TIGR03360	VI_minor_1	type VI secretion-associated protein, VC_A0118 family. Members of this protein family, including VC_A0118 from Vibrio cholerae El Tor N16961, are restricted to a subset of bacteria with the type VI secretion system, and are encoded among the type VI-associated pathogenicity islands. However, many species with type VI secretion lack a member of this family. This lack suggests that members of this family may be targets rather than components of the type VI secretion system.	185
274542	TIGR03361	VI_Rhs_Vgr	type VI secretion system Vgr family protein. Members of this protein family belong to the Rhs element Vgr protein family (see TIGR01646), but furthermore all are found in genomes with type VI secretion loci. However, members of this protein family, although recognizably correlated to type VI secretion according to the partial phylogenetic profiling algorithm, are often found far from the type VI secretion locus.	513
274543	TIGR03362	VI_chp_7	type VI secretion-associated protein, VC_A0119 family. This protein family is one of two related families in type VI secretion systems that contain an ImpA-related N-terminal domain (pfam06812). [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	301
274544	TIGR03363	VI_chp_8	type VI secretion-associated protein, ImpA family. This protein family is one of two related families in type VI secretion systems that contain an ImpA-related N-terminal domain (pfam06812).	353
132407	TIGR03364	HpnW_proposed	FAD dependent oxidoreductase TIGR03364. This clade of FAD dependent oxidoreductases (members of the pfam01266 family) is syntenically associated with a family of proposed phosphonatase-like enzymes (TIGR03351) and is also found (less frequently) in association with phosphonate transporter components. A likely role for this enzyme involves the oxidative deamination of an aminophosphonate differing slightly from 2-aminoethylphosphonate, possibly 1-hydroxy-2-aminoethylphosphonate (see the comments for TIGR03351). Many members of the larger FAD dependent oxidoreductase family act as amino acid oxidative deaminases.	365
274545	TIGR03365	Bsubt_queE	7-cyano-7-deazaguanosine (preQ0) biosynthesis protein QueE. This uncharacterized enzyme, designated QueE, participates in the biosynthesis, from GTP, of 7-cyano-7-deazaguanosine, also called preQ0 because in many species it is a precursor of queuosine. In most Archaea, it is instead the precursor of a different tRNA modified base, archaeosine. [Protein synthesis, tRNA and rRNA base modification]	238
274546	TIGR03366	HpnZ_proposed	putative phosphonate catabolism associated alcohol dehydrogenase. This clade of zinc-binding alcohol dehydrogenases (members of pfam00107) is repeatedly associated with genes proposed to be involved with the catabolism of phosphonate compounds.	280
274547	TIGR03367	queuosine_QueD	queuosine biosynthesis protein QueD. Members of this protein family, closely related to eukaryotic 6-pyruvoyl tetrahydrobiopterin synthase enzymes, are the QueD protein of queuosine biosynthesis. Queuosine is a hypermodified base in the wobble position of tRNAs for Tyr, His, Asp, and Asn in many species. This modification, although widespread, appears not to be important for viability. The queuosine precursor made by this enzyme may be converted instead to archaeosine as in some Archaea. [Protein synthesis, tRNA and rRNA base modification]	89
132411	TIGR03368	cellulose_yhjU	cellulose synthase operon protein YhjU. This protein was identified by the partial phylogenetic profiling algorithm as part of the system for cellulose biosynthesis in bacteria, and in fact is found in cellulose biosynthesis gene regions. The protein was designated YhjU in Salmonella enteritidis, where disruption of its gene disrupts cellulose biosynthesis and biofilm formation. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	518
274548	TIGR03369	cellulose_bcsE	cellulose biosynthesis protein BcsE. This protein, called BcsE (bacterial cellulose synthase E) or YhjS, is required for cellulose biosynthesis in Salmonella enteritidis. Its role in this process across multiple bacterial species is implied by the partial phylogenetic profiling algorithm. Members are found in the vicinity of other cellulose biosynthesis genes. The model does not include a much less well-conserved N-terminal region about 150 amino acids in length for most members. Solano, et al. suggest this protein acts as a protease. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	322
132413	TIGR03370	VPLPA-CTERM	VPLPA-CTERM protein sorting domain. A probable protein export sorting signal, PEP-CTERM, was described by Haft, et al. It is predicted to interact with a putative transpeptidase we designate exosortase. This model describes a variant with conserved motif VPLPA, rather than VPEP. It appears to be the recognition sequence for exosortase D (TIGR04152). This variant is found prominently in two members of the Rhodobacterales, namely Jannaschia sp. CCS1 and Roseobacter denitrificans OCh 114. One interesting member protein has a full-length duplication and therefore two copies of this putative sorting domain.	25
274549	TIGR03371	cellulose_yhjQ	cellulose synthase operon protein YhjQ. Members of this family are the YhjQ protein, found immediately upstream of bacterial cellulose synthase (bcs) genes in a broad range of bacteria, including both copies of the bcs locus in Klebsiella pneumoniae. In several species it is seen clearly as part of the bcs operon. It is identified as a probable component of the bacterial cellulose metabolic process not only by gene location, but also by partial phylogenetic profiling, or Haft-Selengut algorithm, based on a bacterial cellulose biosynthesis genome property profile. Cellulose plays an important role in biofilm formation and structural integrity in some bacteria. Mutants in yhjQ in Escherichia coli show altered morphology and growth, but the function of YhjQ has not yet been determined. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	246
132415	TIGR03372	putres_am_tran	putrescine aminotransferase. Members of this family are putrescine aminotransferase, as found in Escherichia coli, Erwinia carotovora subsp. atroseptica, and closely related species. This pyridoxal phosphate enzyme, as characterized in E. coli, can act also on cadaverine and, more weakly, spermidine. [Central intermediary metabolism, Polyamine biosynthesis]	442
132416	TIGR03373	VI_minor_4	type VI secretion-associated protein, BMA_A0400 family. Members of this protein family are found exclusively, although not universally, in bacterial species that possess a type VI secretion system. Genes are found in type VI secretion-associated gene clusters. The specific function is unknown. This model represents the rather well-conserved amino-terminal domain of a protein family in which carboxy-terminal regions, when present, show little conservation. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	136
132417	TIGR03374	ABALDH	1-pyrroline dehydrogenase. Members of this protein family are 1-pyrroline dehydrogenase (1.5.1.35), also called gamma-aminobutyraldehyde dehydrogenase. This enzyme can follow putrescine transaminase (EC 2.6.1.82) for a two-step conversion of putrescine to gamma-aminobutyric acid (GABA). The member from Escherichia coli is characterized as a homotetramer that binds one NADH per monomer. This enzyme belongs to the medium-chain aldehyde dehydrogenases, and is quite similar in sequence to the betaine aldehyde dehydrogenase (EC 1.2.1.8) family.	472
274550	TIGR03375	type_I_sec_LssB	type I secretion system ATPase, LssB family. Type I protein secretion is a system in some Gram-negative bacteria to export proteins (often proteases) across both inner and outer membranes to the extracellular medium. This is one of three proteins of the type I secretion apparatus. Targeted proteins are not cleaved at the N-terminus, but rather carry signals located toward the extreme C-terminus to direct type I secretion. This model is related to models TIGR01842 and TIGR01846, and to bacteriocin ABC transporters that cleave their substrates during export. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	694
274551	TIGR03376	glycerol3P_DH	glycerol-3-phosphate dehydrogenase (NAD(+)). Members of this protein family are the eukaryotic enzyme, glycerol-3-phosphate dehydrogenase (NAD(+)) (EC 1.1.1.8). Enzymatic activity for 1.1.1.8 is defined as sn-glycerol 3-phosphate + NAD(+) = glycerone phosphate + NADH. Note the very similar reactions of enzymes defined as EC 1.1.1.94 and 1.1.99.5, assigned to families of proteins in the bacteria.	342
274552	TIGR03377	glycerol3P_GlpA	glycerol-3-phosphate dehydrogenase, anaerobic, A subunit. Members of this protein family are the A subunit, product of the glpA gene, of a three-subunit, membrane-anchored, FAD-dependent anaerobic glycerol-3-phosphate dehydrogenase. [Energy metabolism, Anaerobic]	516
213807	TIGR03378	glycerol3P_GlpB	glycerol-3-phosphate dehydrogenase, anaerobic, B subunit. Members of this protein family are the B subunit, product of the glpB gene, of a three-subunit, membrane-anchored, FAD-dependent anaerobic glycerol-3-phosphate dehydrogenase. [Energy metabolism, Anaerobic]	419
132422	TIGR03379	glycerol3P_GlpC	glycerol-3-phosphate dehydrogenase, anaerobic, C subunit. Members of this protein family are the membrane-anchoring, non-catalytic C subunit, product of the glpC gene, of a three-subunit, FAD-dependent, anaerobic glycerol-3-phosphate dehydrogenase. GlpC lacks classical hydrophobic transmembrane helices; Cole, et al. suggest interaction with the membrane may involve amphipathic helices. GlpC has conserved Cys-containing motifs suggestive of iron-sulfur binding. This complex is found mostly in Escherichia coli and closely related species. [Energy metabolism, Anaerobic]	397
132423	TIGR03380	agmatine_aguA	agmatine deiminase. Members of this family are agmatine deiminase (3.5.3.12), as characterized in Pseudomonas aeruginosa and plants. Related deiminases include the peptidyl-arginine deiminase (3.5.3.15) as found in Porphyromonas gingivalis. [Central intermediary metabolism, Polyamine biosynthesis]	357
274553	TIGR03381	agmatine_aguB	N-carbamoylputrescine amidase. Members of this family are N-carbamoylputrescine amidase (3.5.1.53). Bacterial genes are designated AguB. The AguAB pathway replaces SpeB for conversion of agmatine to putrescine in two steps rather than one. [Central intermediary metabolism, Polyamine biosynthesis]	279
213808	TIGR03382	GC_trans_RRR	Myxococcales GC_trans_RRR domain. The domain described here is small (about 30 amino acids), hydrophobic, only moderately conserved, and similar to numerous other transmembrane helix-containing sequence regions from convergent evolution. This domain is found, once per protein but in many proteins per genome in several bacteria of the order Myxococcales. It begins with a signature Gly-Cys motif. Its other features, including a hydrophobic transmembrane helix, Arg-rich cluster, and location at the protein C-terminus, resemble the PEP-CTERM proposed protein targeting domain.	27
274554	TIGR03383	urate_oxi	urate oxidase. Members of this protein family are urate oxidase, also called uricase. This protein contains two copies of the domain described by the uricase model pfam01014. In animals, this enzyme has been lost from primates and birds. [Central intermediary metabolism, Other]	282
132427	TIGR03384	betaine_BetI	transcriptional repressor BetI. BetI is a DNA-binding transcriptional repressor of the bet (betaine) regulon. In sequence, it is related to TetR (pfam00440). Choline, through BetI, induces the expression of the betaine biosynthesis genes betA and betB by derepression. The choline porter gene betT is also part of this regulon in Escherichia coli. Note that a different transcriptional regulator, ArcA, controls the expression of bet regulon genes in response to oxygen, as BetA is an oxygen-dependent enzyme. [Regulatory functions, DNA interactions]	189
163244	TIGR03385	CoA_CoA_reduc	CoA-disulfide reductase. Members of this protein family are CoA-disulfide reductase (EC 1.8.1.14), as characterized in Staphylococcus aureus, Pyrococcus horikoshii, and Borrelia burgdorferi, and inferred in several other species on the basis of high levels of CoA and an absence of glutathione as a protective thiol. [Cellular processes, Detoxification]	427
274555	TIGR03388	ascorbase	L-ascorbate oxidase, plant type. Members of this protein family are the copper-containing enzyme L-ascorbate oxidase (EC 1.10.3.3), also called ascorbase. This family is found in flowering plants, and shows greater sequence similarity to a family of laccases (EC 1.10.3.2) from plants than to other known ascorbate oxidases.	541
274556	TIGR03389	laccase	laccase, plant. Members of this protein family include the copper-containing enzyme laccase (EC 1.10.3.2), often several from a single plant species, and additional, uncharacterized, closely related plant proteins termed laccase-like multicopper oxidases. This protein family shows considerable sequence similarity to the L-ascorbate oxidase (EC 1.10.3.3) family. Laccases are enzymes of rather broad specificity, and classification of all proteins scoring above the trusted cutoff of this model as laccases may be appropriate.	539
132431	TIGR03390	ascorbOXfungal	L-ascorbate oxidase, fungal type. This model describes a family of fungal ascorbate oxidases, within a larger family of multicopper oxidases that also includes plant ascorbate oxidases (TIGR03388), plant laccases and laccase-like proteins (TIGR03389), and related proteins. The member from Acremonium sp. HI-25 is characterized.	538
274557	TIGR03391	FeS_syn_CsdE	cysteine desulfurase, sulfur acceptor subunit CsdE. Members of this protein family are CsdE, formerly called YgdK. This protein, found as a paralog to SufE in Escherichia coli, Yersinia pestis, Photorhabdus luminescens, and related species, works together and physically interacts with CsdA (a paralog of SufS). CsdA has cysteine desulfurase activity that is enhanced by this protein (CsdE), in which Cys-61 (numbered as in E. coli) is a sulfur acceptor site. This gene pair, although involved in FeS cluster biosynthesis, is not found next to other such genes as are its paralogs from the Suf or Isc systems. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	139
274558	TIGR03392	FeS_syn_CsdA	cysteine desulfurase, catalytic subunit CsdA. Members of this protein family are CsdA. This protein, found in Escherichia coli, Yersinia pestis, Photorhabdus luminescens, and related species, and related to SufS, works together with and physically interacts with CsdE (a paralog of SufE). CsdA has cysteine desulfurase activity that is enhanced by CsdE, a sulfur acceptor protein. This gene pair, although involved in FeS cluster biosynthesis, is not found next to other such genes as are its paralogs from the Suf or Isc systems. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	398
274559	TIGR03393	indolpyr_decarb	indolepyruvate decarboxylase, Erwinia family. A family of closely related, thiamine pyrophosphate-dependent enzymes includes indolepyruvate decarboxylase (EC 4.1.1.74), phenylpyruvate decarboxylase (EC 4.1.1.43), pyruvate decarboxylase (EC 4.1.1.1), branched-chain alpha-ketoacid decarboxylase, etc. Members of this group of homologs may overlap in specificity. Within the larger family, this model represents a clade of bacterial indolepyruvate decarboxylases, part of a pathway for biosynthesis of the plant hormone indole-3-acetic acid. Typically, these species interact with plants, as pathogens or as beneficial, root-associated bacteria. [Central intermediary metabolism, Other]	539
274560	TIGR03394	indol_phenyl_DC	indolepyruvate/phenylpyruvate decarboxylase, Azospirillum family. A family of closely related, thiamine pyrophosphate-dependent enzymes includes indolepyruvate decarboxylase (EC 4.1.1.74), phenylpyruvate decarboxylase (EC 4.1.1.43), pyruvate decarboxylase (EC 4.1.1.1), branched-chain alpha-ketoacid decarboxylase, etc. Members of this group of homologs may overlap in specificity. This model represents a clade that includes an Azospirillum brasilense member active as both phenylpyruvate decarboxylase and indolepyruvate decarboxylase.	535
132436	TIGR03395	sphingomy	sphingomyelin phosphodiesterase. Members of this family are bacterial proteins that act as sphingomyelin phosphodiesterase (EC 3.1.4.12), also called sphingomyelinase. Some members of this family have been shown to act as hemolysins. [Cellular processes, Pathogenesis]	283
274561	TIGR03396	PC_PLC	phospholipase C, phosphocholine-specific, Pseudomonas-type. Members of this protein family are bacterial, phosphatidylcholine-hydrolyzing phospholipase C enzymes, with a characteristic domain architecture as found in hemolytic (PlcH) and nonhemolytic (PlcN) secreted enzymes of Pseudomonas aeruginosa. PlcH hydrolyzes phosphatidylcholine to diacylglycerol and phosphocholine, but unlike PlcN can also hydrolyze sphingomyelin to ceramide (N-acylsphingosine) and phosphocholine. Members of this family share the twin-arginine signal sequence for Sec-independent transport across the plasma membrane. PlcH is secreted as a heterodimer with a small chaperone, PlcR, encoded immediately downstream. [Cellular processes, Pathogenesis]	689
274562	TIGR03397	acid_phos_Burk	acid phosphatase, Burkholderia-type. A member of this family, AcpA from Burkholderia mallei, has been characterized as a surface-bound glycoprotein with acid phosphatase activity, as can be shown with the colorigenic substrate 5-bromo-4-chloro-3-indolyl phosphate. This family shares regions of sequence similarity with phosphocholine-preferring phospholipase C enzymes (TIGR03396) from many of the same species.	483
132439	TIGR03398	plc_access_R	phospholipase C accessory protein PlcR. The class of microbial phosphocholine-preferring phospholipase C enzymes described by model TIGR03396 has two members in Pseudomonas aeruginosa, one of which (PlcH) is hemolytic and can hydrolyze sphingomyelin as well as phosphatidylcholine. This model describes PlcR, an accessory protein for PlcH with which it forms a heterodimer. The member of the family from P. aeruginosa, although not the members from various Burkholderia species, is encoded immediately downstream of phospholipase C.	141
274563	TIGR03399	RNA_3prim_cycl	RNA 3'-phosphate cyclase. Members of this protein family are RNA 3'-phosphate cyclase (6.5.1.4), an enzyme whose function is conserved from E. coli to human. The modification this enzyme performs enables certain RNA ligations to occur, although the full biological role for this enzyme is not fully described. This model separates this enzyme from a related protein, present only in eukaryotes, localized to the nucleolus, and involved in ribosomal modification. [Transcription, RNA processing]	326
274564	TIGR03400	18S_RNA_Rcl1p	18S rRNA biogenesis protein RCL1. Members of this strictly eukaryotic protein family are not RNA 3'-phosphate cyclase (6.5.1.4), but rather a homolog with a distinct function, found in the nucleolus and required for ribosomal RNA processing. Homo sapiens has both a member of this RCL (RNA terminal phosphate cyclase like) family and EC 6.5.1.4, while Saccharomyces has a member of this family only.	360
188314	TIGR03401	cyanamide_fam	HD domain protein, cyanamide hydratase family. Members of this protein family are known, so far, in the Ascomycota, a branch of the Fungi, and contain an HD domain (pfam01966), found typically in various metal-dependent phosphohydrolases. The only characterized member of this family, from the soil fungus Myrothecium verrucaria, is cyanamide hydratase (EC 4.2.1.69), a zinc-containing homohexamer that adds water to the fertilizer cyanamide (NCNH2), a nitrile compound, to produce urea (NH2-CO-NH2). Homologs are likely to be nitrile hydratases.	228
132443	TIGR03402	FeS_nifS	cysteine desulfurase NifS. Members of this protein family are NifS, one of several related families of cysteine desulfurase involved in iron-sulfur (FeS) cluster biosynthesis. NifS is part of the NIF system, usually associated with other nif genes involved in nitrogenase expression and nitrogen fixation. The protein family is given a fairly broad interpretation here. It includes a clade nearly always found in extended nitrogen fixation genomic regions, plus a second clade more closely related to the first than to IscS and also part of NifS-like/NifU-like systems. This model does not extend to a more distantly related clade found in the epsilon proteobacteria such as Helicobacter pylori, also named NifS in the literature, built instead in TIGR03403.	379
132444	TIGR03403	nifS_epsilon	cysteine desulfurase, NifS family, epsilon proteobacteria type. Members of this family are the NifS-like cysteine desulfurase of the epsilon division of the Proteobacteria, similar to the NifS protein of nitrogen-fixing bacteria. Like NifS, and unlike IscS, this protein is found as part of a system of just two proteins, a cysteine desulfurase and a scaffold, for iron-sulfur cluster biosynthesis. This protein is called NifS by Olsen, et al., so we use this designation.	382
274565	TIGR03404	bicupin_oxalic	bicupin, oxalate decarboxylase family. Members of this protein family are defined as bicupins as they have two copies of the cupin domain (pfam00190). Two different known activities for members of this family are oxalate decarboxylase (EC 4.1.1.2) and oxalate oxidase (EC 1.2.3.4), although the latter activity has more often been found in distantly related monocupin (germin) proteins.	367
274566	TIGR03405	Phn_Fe-ADH	phosphonate metabolism-associated iron-containing alcohol dehydrogenase. This small clade of iron-containing alcohol dehydrogenases of the pfam00465 family is found in genomic contexts indicating a role in the metabolism of phosphonates. In Delftia acidovorans SPH-1, the gene ZP_01580650.1 is adjacent to and running in the same direction as ZP_01580649.1 encoding the enzyme phosphonatase (PhnX, TIGR01422). Upstream are also found genes encoding components of a phosphonate ABC transport complex. In Ralstonia eutropha H16 and Verminephrobacter eiseniae EF01-2 the dehydrogenase is followed by a homolog of the PhnB gene, a putative phosphonate-specific MFS-type transporter. In Azoarcus BH72 the gene is preceded by Phosphoenolpyruvate phosphomutase (aepX) and a putative phosphonopyruvate decarboxylase (aepY), two genes involved in the biosynthesis of phosphonoacetaldehyde (Pald). Usually these two are accompanied by a specific transaminase, AepZ, which converts Pald to 2-aminoethylphosphonate (2-AEP). 2-hydroxyethylphosphonate (2-HEP), the presumed product of the reaction of Pald with an alcohol dehydrogenase, is a biologically novel but reasonable analog of 2-AEP and may be a constituent of as-yet undescribed natural products. In the case of Azoarcus, downstream of the dehydrogenase is a CDP-glycerol:glycerophosphate transferase homolog that may indicate the existence of a pathway for 2-HEP-derived phosphonolipid biosynthesis.	355
213809	TIGR03406	FeS_long_SufT	probable FeS assembly SUF system protein SufT. The function is unknown for this protein family, but members are found almost always in operons for the SUF system of iron-sulfur cluster biosynthesis. The SUF system is present elsewhere on the chromosome for those few species where SUF genes are not adjacent. This family shares this property of association with the SUF system with a related family, TIGR02945. TIGR02945 consists largely of a DUF59 domain (see pfam01883), while this protein is about double the length, with a unique N-terminal domain and DUF59 C-terminal domain. A location immediately downstream of the cysteine desulfurase gene sufS in many contexts suggests the gene symbol sufT. Note that some other homologs of this family and of TIGR02945, but no actual members of this family, are found in operons associated with phenylacetic acid (or other ring-hydroxylating) degradation pathways. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	174
132448	TIGR03407	urea_ABC_UrtA	urea ABC transporter, urea binding protein. Members of this protein family are ABC transporter substrate-binding proteins associated with urea transport and metabolism. This protein is found in a conserved five-gene transport operon typically found adjacent to urease genes. It was shown in Cyanobacteria that disruption leads to the loss of high-affinity urea transport activity. Members of this protein family tend to have the twin-arginine signal for Sec-independent transport across the plasma membrane. [Transport and binding proteins, Amino acids, peptides and amines]	359
132449	TIGR03408	urea_trans_UrtC	urea ABC transporter, permease protein UrtC. Members of this protein family are ABC transporter permease proteins associated with urea transport and metabolism. This protein is found in a conserved five-gene transport operon typically found adjacent to urease genes. It was shown in Cyanobacteria that disruption leads to the loss of high-affinity urea transport activity. [Transport and binding proteins, Amino acids, peptides and amines]	313
200272	TIGR03409	urea_trans_UrtB	urea ABC transporter, permease protein UrtB. Members of this protein family are ABC transporter permease proteins associated with urea transport and metabolism. This protein is found in a conserved five-gene transport operon typically found adjacent to urease genes. It was shown in Cyanobacteria that disruption leads to the loss of high-affinity urea transport activity. [Transport and binding proteins, Amino acids, peptides and amines]	291
274567	TIGR03410	urea_trans_UrtE	urea ABC transporter, ATP-binding protein UrtE. Members of this protein family are ABC transporter ATP-binding subunits associated with urea transport and metabolism. This protein is found in a conserved five-gene transport operon typically found adjacent to urease genes. It was shown in Cyanobacteria that disruption leads to the loss of high-affinity urea transport activity. [Transport and binding proteins, Amino acids, peptides and amines]	230
274568	TIGR03411	urea_trans_UrtD	urea ABC transporter, ATP-binding protein UrtD. Members of this protein family are ABC transporter ATP-binding subunits associated with urea transport and metabolism. This protein is found in a conserved five-gene transport operon typically found adjacent to urease genes. It was shown in Cyanobacteria that disruption leads to the loss of high-affinity urea transport activity. [Transport and binding proteins, Amino acids, peptides and amines]	242
132453	TIGR03412	iscX_yfhJ	FeS assembly protein IscX. Members of this protein family are YfhJ, a protein of the ISC system for iron-sulfur cluster assembly. Other genes in the system include iscSUA, hscBA, and fdx. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	63
274569	TIGR03413	GSH_gloB	hydroxyacylglutathione hydrolase. Members of this protein family are hydroxyacylglutathione hydrolase, a detoxification enzyme known as glyoxalase II. It follows lactoylglutathione lyase, or glyoxalase I, and acts to remove the toxic metabolite methylglyoxal and related compounds. This protein belongs to the broader metallo-beta-lactamase family (pfam00753). [Cellular processes, Detoxification]	248
188316	TIGR03414	ABC_choline_bnd	choline ABC transporter, periplasmic binding protein. Partial phylogenetic profiling vs. the genome property of glycine betaine biosynthesis from choline consistently reveals a member of this ABC transporter periplasmic binding protein as the best match, save for the betaine biosynthesis enzymes themselves. Genomes often carry several paralogs, one encoded together with the permease and ATP-binding components and another encoded next to a choline-sulfatase gene, suggesting that different members of this protein family interact with shared components and give some flexibility in substrate. Of two members from Sinorhizobium meliloti 1021, one designated ChoX has been shown experimentally to bind choline (though not various related compounds such as betaine) and to be required for about 60 % of choline uptake. Members of this protein have an invariant Cys residue near the N-terminus and likely are lipoproteins. [Transport and binding proteins, Amino acids, peptides and amines]	290
188317	TIGR03415	ABC_choXWV_ATP	choline ABC transporter, ATP-binding protein. Members of this protein family are the ATP-binding subunit of a three-protein transporter. This family belongs, more broadly, to the family of proline and glycine-betaine transporters, but members have been identified by direct characterization and by bioinformatic means as choline transporters. Many species have several closely-related members of this family, probably with variable abilities to act additionally on related quaternary amines. [Transport and binding proteins, Amino acids, peptides and amines]	382
188318	TIGR03416	ABC_choXWV_perm	choline ABC transporter, permease protein. 	267
274570	TIGR03417	chol_sulfatase	choline-sulfatase. 	500
188320	TIGR03418	chol_sulf_TF	putative choline sulfate-utilization transcription factor. Members of this protein family are transcription factors of the LysR family. Their genes typically are divergently transcribed from choline-sulfatase genes. That enzyme makes choline, a precursor to the osmoprotectant glycine-betaine, available by hydrolysis of choline sulfate.	291
132460	TIGR03419	NifU_clost	FeS cluster assembly scaffold protein NifU, Clostridium type. NifU and NifS form a pair of iron-sulfur (FeS) cluster biosynthesis proteins much simpler than the ISC and SUF systems. Members of this protein family are a distinct group of NifU-like proteins, found always next to a NifS-like protein and restricted to species that lack a SUF system. Typically, NIF systems service a smaller number of FeS-containing proteins than do ISC or SUF. Members of this particular branch typically are found, almost half the time, near the mnmA gene, involved in the carboxymethylaminomethyl modification of U34 in some tRNAs (see GenProp0704). While other NifU proteins are associated with nitrogen fixation, this family is not. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	121
274571	TIGR03420	DnaA_homol_Hda	DnaA regulatory inactivator Hda. Members of this protein family are Hda (Homologous to DnaA). These proteins are about half the length of DnaA and homologous over the length of Hda. In the model species Escherichia coli, the initiation of DNA replication requires DnaA bound to ATP rather than ADP; Hda helps facilitate the conversion of DnaA-ATP to DnaA-ADP. [DNA metabolism, DNA replication, recombination, and repair]	226
274572	TIGR03421	FeS_CyaY	iron donor protein CyaY. Members of this protein family are the iron-sulfur cluster (FeS) metabolism protein CyaY, a homolog of eukaryotic frataxin. ISC is one of several bacterial systems for FeS assembly; we find by Partial Phylogenetic Profiling vs. the ISC system that CyaY most likely works with the ISC system for FeS cluster biosynthesis. A study of cyaY mutants in Salmonella enterica bears this out. Although the trusted cutoff is set low enough to include eukaryotic frataxin sequences, a narrower, exception-type model (TIGR03422) identifies members of that specific set. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	102
132463	TIGR03422	mito_frataxin	frataxin. Frataxin is a mitochondrial protein, mutation of which leads to the disease Friedreich's ataxia. Its orthologs are widely distributed in the bacteria, associated with the ISC system for iron-sulfur cluster assembly, and designated CyaY. This exception-type model allows those examples of frataxin per se that score above the trusted cutoff to the CyaY equivalog-type model (TIGR03421) to be named appropriately.	97
274573	TIGR03423	pbp2_mrdA	penicillin-binding protein 2. Members of this protein family are penicillin-binding protein 2 (PBP-2), a protein whose gene (designated pbpA or mrdA) generally is found next to the gene for RodA, a protein required for rod (bacillus) shape in many bacteria. PBP-2 acts as a transpeptidase for cell elongation (hence, rod-shape). [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan]	592
132465	TIGR03424	urea_degr_1	urea carboxylase-associated protein 1. A number of bacteria degrade urea as a nitrogen source by the urea carboxylase/allophanate hydrolase pathway, which uses biotin and consumes ATP, rather than by means of the nickel-dependent enzyme urease. This model represents one of a pair of homologous, tandem uncharacterized genes found together with the urea carboxylase and allophanate hydrolase genes.	198
163257	TIGR03425	urea_degr_2	urea carboxylase-associated protein 2. A number of bacteria degrade urea as a nitrogen source by the urea carboxylase/allophanate hydrolase pathway, which uses biotin and consumes ATP, rather than by means of the nickel-dependent enzyme urease. This model represents one of a pair of homologous, tandem uncharacterized genes found together with the urea carboxylase and allophanate hydrolase genes.	233
274574	TIGR03426	shape_MreD	rod shape-determining protein MreD. Members of this protein family are the MreD protein of bacterial cell shape determination. Most rod-shaped bacteria depend on MreB and RodA to achieve either a rod shape or some other non-spherical morphology such as coil or stalk formation. MreD is encoded in an operon with MreB, and often with RodA and PBP-2 as well. It is highly hydrophobic (therefore somewhat low-complexity) and highly divergent, and therefore sometimes tricky to discover by homology, but this model finds most examples. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan]	152
213811	TIGR03427	ABC_peri_uca	ABC transporter periplasmic binding protein, urea carboxylase region. Members of this family are ABC transporter periplasmic binding proteins associated with the urea carboxylase/allophanate hydrolase pathway, an alternative to urease for urea degradation. The protein is restricted to bacteria with the pathway, with its gene close to the urea carboxylase and allophanate hydrolase genes. The substrate for this transporter therefore is likely to be urea or a compound from which urea is easily derived. [Transport and binding proteins, Unknown substrate]	328
132469	TIGR03428	ureacarb_perm	permease, urea carboxylase system. A number of bacteria obtain nitrogen by a biotin- and ATP-dependent urea degradation system distinct from urease. The two characterized proteins of this system are the enzymes urea carboxylase and allophanate hydrolase, but other, uncharacterized proteins co-occur as genes encoded nearby in multiple organisms. This family includes predicted permeases of the amino acid permease family, likely to transport either urea or a compound from which urea is derived. It is found so far only in Actinobacteria, whereas a number of other species with the urea carboxylase have an adjacent ABC transporter operon.	475
274575	TIGR03429	arom_pren_DMATS	aromatic prenyltransferase, DMATS type. Members of this protein family are mostly fungal enzymes of secondary metabolite production. Characterized or partially characterized members include several examples of dimethylallyltryptophan synthase, a brevianamide F prenyltransferase, LtxC from lyngbyatoxin biosynthesis, and a probable dimethylallyl tyrosine synthase.	405
132471	TIGR03430	trp_dimet_allyl	tryptophan dimethylallyltransferase. Members of this family are the enzyme tryptophan dimethylallyltransferase (EC 2.5.1.34), a distinct clade within a larger group of aromatic prenyltransferases that may act on trp-containing cyclic dipeptides, or on tyrosine or other related substrates. Tryptophan dimethylallyltransferase and related enzymes typically are of fungal origin and are involved in the biosynthesis of secondary metabolites such as ergot alkaloids.	419
132472	TIGR03431	PhnD	phosphonate ABC transporter, periplasmic phosphonate binding protein. This model is a subset of the broader subfamily of phosphate/phosphonate binding protein ABC transporter components, TIGR01098. In this model all members of the seed have support from genomic context for association with pathways for the metabolism of phosphonates, particularly the C-P lyase system, GenProp0232. This model includes the characterized phnD gene from E. coli. Note that this model does not identify all phnD-subfamily genes with evident phosphonate context, but all sequences above the trusted cutoff may be inferred to bind phosphonate compounds even in the absence of such context. Furthermore, there is ample evidence to suggest that many other members of the TIGR01098 subfamily have a different primary function.	288
163260	TIGR03432	yjhG_yagF	putative dehydratase, YjhG/YagF family. This homolog of dihydroxy-acid dehydratases has an odd, sparse distribution. Members are found in two Acidobacteria, two Planctomycetes, Bacillus clausii KSM-K16, and (in two copies each) in strains K12-MG1655 and W3110 of Escherichia coli. The local context is not well conserved, but a few members are adjacent to homologs of the gluconate:H+ symporter (see TIGR00791). [Unknown function, Enzymes of unknown specificity]	640
163261	TIGR03433	padR_acidobact	transcriptional regulator, Acidobacterial, PadR-family. Members of this protein family are putative transcriptional regulators of the PadR family, as found in species of the Acidobacteria. This family of proteins has expanded greatly in this lineage, where it regularly is found in the vicinity of a putative transporter protein. [Regulatory functions, DNA interactions]	100
274576	TIGR03434	ADOP	Acidobacterial duplicated orphan permease. Members of this protein family are found, so far, only in three species of Acidobacteria, namely Acidobacteria bacterium Ellin345, Acidobacterium capsulatum ATCC 51196, and Solibacter usitatus Ellin6076, where they form large paralogous families. Each protein contains two copies of a domain called the efflux ABC transporter permease protein (pfam02687). However, unlike other members of that family (including LolC, FtsX, and MacB), genes for these proteins are essentially never found fused or adjacent to ABC transporter ATP-binding protein (pfam00005) genes. We name this family ADOP, for Acidobacterial Duplicated Orphan Permease, to reflect the restricted lineage, internal duplication, lack of associated ATP-binding cassette proteins, and permease homology. The function is unknown.	803
132476	TIGR03435	Soli_TIGR03435	soil-associated protein, TIGR03435 family. Bacterial reference strains encoding members of this protein family are all isolated from soil. These include 39 members from Solibacter usitatus Ellin6076, 27 from Acidobacterium sp. MP5ACTX8 (both Acidobacteria), and four from Pedosphaera parvula Ellin514 (Verrucomicrobia). The family is well-diversified, with few pairs showing greater than 50 % pairwise identity. A few members are fused to Peptidase_M56 domains (see pfam05569), to Sigma70_r2 domains (see pfam04542), or have a duplication of this domain.	237
274577	TIGR03436	acidobact_VWFA	VWFA-related Acidobacterial domain. Members of this family are bacterial domains that include a region related to the von Willebrand factor type A (VWFA) domain (pfam00092). These domains are restricted to, and have undergone a large paralogous family expansion in, the Acidobacteria, including Solibacter usitatus and Acidobacterium capsulatum ATCC 51196.	296
274578	TIGR03437	Soli_cterm	Solibacter uncharacterized C-terminal domain. This model describes a protein domain found in 90 proteins of Solibacter usitatus Ellin6076, nearly always as the C-terminal domain of a much larger protein. No homologs to this domain are detected outside of S. usitatus, a member of the Acidobacteria.	215
274579	TIGR03438	egtD_ergothio	dimethylhistidine N-methyltransferase. This model represents a distinct set of uncharacterized proteins found in the bacteria. Analysis by PSI-BLAST shows remote sequence homology to methyltransferases. [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs]	301
274580	TIGR03439	methyl_EasF	probable methyltransferase domain, EasF family. This model represents an uncharacterized domain of about 300 amino acids with homology to S-adenosylmethionine-dependent methyltransferases. Proteins with this domain are exclusively fungal. A few, such as EasF from Neotyphodium lolii, are associated with the biosynthesis of ergot alkaloids, a class of fungal secondary metabolites. EasF may, in fact, be the AdoMet:dimethylallyltryptophan N-methyltransferase, the enzyme that follows tryptophan dimethylallyltransferase (DMATS) in ergot alkaloid biosynthesis. Several other members of this family, including mug158 (meiotically up-regulated gene 158 protein) from Schizosaccharomyces pombe, contain an additional uncharacterized domain DUF323 (pfam03781).	319
274581	TIGR03440	egtB_TIGR03440	ergothioneine biosynthesis protein EgtB. Members of this family include EgtB, an enzyme of ergothioneine biosynthesis, as found in numerous Actinobacteria. Characterized homologs to this family include a formylglycine-generating enzyme that serves as a maturase for an aerobic sulfatase (cf. the radical SAM enzymes that serve as anaerobic sulfatase maturases). [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs]	406
163267	TIGR03441	urea_trans_yut	urea transporter, Yersinia type. Members of this protein family are bacterial urea transporters, found not only in species that contain urease, but adjacent to the urease operon. It was characterized in Yersinia pseudotuberculosis. Members are homologous to eukaryotic members of solute carrier family 14, a family that includes urea transporters, and to bacterial proteins in species with no detectable urea degradation system. [Transport and binding proteins, Other]	292
132483	TIGR03442	TIGR03442	ergothioneine biosynthesis protein EgtC. Members of this strictly bacterial protein family show similarity to class II glutamine amidotransferases (see pfam00310). They are distinguished by appearing in a genome context with, and usually adjacent to or between, members of families TIGR03438 (an uncharacterized methyltransferase) and TIGR03440 (an uncharacterized protein). [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs]	251
274582	TIGR03443	alpha_am_amid	L-aminoadipate-semialdehyde dehydrogenase. Members of this protein family are L-aminoadipate-semialdehyde dehydrogenase (EC 1.2.1.31), product of the LYS2 gene. It is also called alpha-aminoadipate reductase. In fungi, lysine is synthesized via aminoadipate. Currently, all members of this family are fungal.	1389
274583	TIGR03444	EgtA_Cys_ligase	ergothioneine biosynthesis glutamate--cysteine ligase EgtA. Members of this bacterial protein family, EgtA, resemble the glutamate--cysteine ligase of the two-step pathway for glutathione (GSH) biosynthesis, but instead are involved in the biosynthesis of the histidine-derived thiol, ergothioneine (EGT). Successful in vitro reconstitution of EGT biosynthesis using EgtBCDE and gamma-L-glutamyl-L-cysteine suggests that this enzyme is a bona fide glutamate--cysteine ligase. [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs]	390
274584	TIGR03445	mycothiol_MshB	N-acetyl-1-D-myo-inositol-2-amino-2-deoxy-alpha-D-glucopyranoside deacetylase. Members of this protein family are N-acetyl-1-D-myo-inositol-2-amino-2-deoxy-alpha-D-glucopyranoside deacetylase, also called 1D-myo-inosityl-2-acetamido-2-deoxy-alpha-D-glucopyranoside deacetylase, the MshB protein of mycothiol biosynthesis in Mycobacterium tuberculosis and related species. [Cellular processes, Detoxification]	284
132487	TIGR03446	mycothiol_Mca	mycothiol conjugate amidase Mca. Mycobacterium tuberculosis, Corynebacterium glutamicum, and related species use the thiol mycothiol in place of glutathione. This enzyme, homologous to the (dispensable) MshB enzyme of mycothiol biosynthesis, is described as an amidase that acts on conjugates to mycothiol. It is a detoxification enzyme. [Cellular processes, Detoxification]	283
132488	TIGR03447	mycothiol_MshC	cysteine--1-D-myo-inosityl 2-amino-2-deoxy-alpha-D-glucopyranoside ligase. Members of this protein family are MshC, l-cysteine:1-D-myo-inosityl 2-amino-2-deoxy-alpha-D-glucopyranoside ligase, an enzyme that uses ATP to ligate a Cys residue to a mycothiol precursor molecule, in the second to last step in mycothiol biosynthesis. This enzyme shows considerable homology to Cys--tRNA ligases, and many instances are misannotated as such. Mycothiol is found in Mycobacterium tuberculosis, Corynebacterium glutamicum, Streptomyces coelicolor, and various other members of the Actinobacteria. Mycothiol is an analog to glutathione. [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs]	411
132489	TIGR03448	mycothiol_MshD	mycothiol synthase. Members of this family are MshD, the acetyltransferase that catalyzes the final step of mycothiol biosynthesis in various members of the Actinomycetes. Mycothiol replaces glutathione in these species. [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs]	292
132490	TIGR03449	mycothiol_MshA	D-inositol-3-phosphate glycosyltransferase. Members of this protein family, found exclusively in the Actinobacteria, are MshA, the glycosyltransferase of mycothiol biosynthesis. Mycothiol replaces glutathione in these species.	405
132491	TIGR03450	mycothiol_INO1	inositol 1-phosphate synthase, Actinobacterial type. This enzyme, inositol 1-phosphate synthase as found in Actinobacteria, produces an essential precursor for several different products, including mycothiol, which is a glutathione analog, and phosphatidylinositol, which is a phospholipid.	351
132492	TIGR03451	mycoS_dep_FDH	S-(hydroxymethyl)mycothiol dehydrogenase. Members of this protein family are mycothiol-dependent formaldehyde dehydrogenase (EC 1.2.1.66). This protein is found, so far, only in the Actinobacteria (Mycobacterium sp., Streptomyces sp., Corynebacterium sp., and related species), where mycothione replaces glutathione. [Cellular processes, Detoxification]	358
132493	TIGR03452	mycothione_red	mycothione reductase. Mycothiol, a glutathione analog in Mycobacterium tuberculosis and related species, can form a disulfide-linked dimer called mycothione. This enzyme can reduce mycothione to regenerate two mycothiol molecules. The enzyme shows some sequence similarity to glutathione-disulfide reductase, trypanothione-disulfide reductase, and dihydrolipoamide dehydrogenase. The characterized protein from M. tuberculosis, a homodimer, has FAD as a cofactor, one per monomer, and uses NADPH as a substrate.	452
274585	TIGR03453	partition_RepA	plasmid partitioning protein RepA. Members of this family are the RepA (or ParA) protein involved in replicon partitioning. All known examples occur in bacterial species with two or more replicons, on a plasmid or the smaller chromosome. Note that an apparent exception may be seen as a pseudomolecule from assembly of an incompletely sequenced genome. Members of this family belong to a larger family that also includes the enzyme cobyrinic acid a,c-diamide synthase, but assignment of that name to members of this family would be in error. [Mobile and extrachromosomal element functions, Plasmid functions]	387
274586	TIGR03454	partition_RepB	plasmid partitioning protein RepB. Members of this family are the RepB protein involved in replicon partitioning. RepB is found, in general, as part of a repABC operon in plasmids and small chromosomes, separate from the main chromosome, in various bacteria. This model describes a rather narrow clade of proteins; it should be noted that additional homologs scoring below the trusted cutoff have very similar functions, although they may be named differently. [Mobile and extrachromosomal element functions, Plasmid functions]	325
274587	TIGR03455	HisG_C-term	ATP phosphoribosyltransferase, C-terminal domain. This domain corresponds to the C-terminal third of the HisG protein. It is absent in many lineages.	92
132497	TIGR03457	sulphoacet_xsc	sulfoacetaldehyde acetyltransferase. Members of this protein family are sulfoacetaldehyde acetyltransferase, an enzyme of taurine utilization. Taurine, or 2-aminoethanesulfonate, can be used by bacteria as a source of carbon, nitrogen, and sulfur. [Central intermediary metabolism, Other]	579
274588	TIGR03458	YgfH_subfam	succinate CoA transferase. This family of CoA transferases includes enzymes catalyzing at least two related but distinct activities. The E. coli YgfH protein has been characterized as a propionyl-CoA:succinate CoA transferase where it appears to be involved in a pathway for the decarboxylation of succinate to propionate. The Clostridium kluyveri CAT1 protein has been characterized as an acetyl-CoA:succinate CoA transferase and is believed to be involved in anaerobic succinate degradation. The propionate:succinate transferase activity has been reported in the propionic acid fermentation of Propionibacterium species, where it is distinct from the coupled activities of distinct nucleotide-triphosphate dependent succinate and propionate/acetate CoA transferases (as inferred from activity in the absence of NTPs). The family represented by this model includes a member from Propionibacterium acnes KPA171202 which is likely to be responsible for this activity. A closely related clade not included in this family are the Ach1p proteins of fungi which are acetyl-CoA hydrolases. This name has been applied to many of the proteins represented by this model, possibly erroneously.	485
274589	TIGR03459	crt_membr	carotene biosynthesis associated membrane protein. This model represents a family of hydrophobic and presumed membrane proteins found in the Actinobacteria. The genes encoding these proteins are syntenically associated with (found proximal to) genes of carotene biosynthesis usually including phytoene synthase (crtB), phytoene dehydrogenase (crtI) and geranylgeranyl pyrophosphate synthase (ispA).	456
132500	TIGR03460	crt_membr_arch	carotene biosynthesis associated membrane protein. 	232
132501	TIGR03461	pabC_Proteo	aminodeoxychorismate lyase. Members of this protein family are aminodeoxychorismate lyase (ADC lyase), EC 4.1.3.38, the PabC protein of PABA biosynthesis. PABA (para-aminobenzoate) is a precursor of folate, needed for de novo purine biosynthesis. This enzyme is a pyridoxal-phosphate-binding protein in the class IV aminotransferase family (pfam01063). [Biosynthesis of cofactors, prosthetic groups, and carriers, Folic acid]	261
274590	TIGR03462	CarR_dom_SF	lycopene cyclase domain. This domain is often repeated twice within the same polypeptide, as is observed in Archaea, Thermus, Sphingobacteria and Fungi. In the fungal sequences, this tandem domain pair is observed as the N-terminal half of a bifunctional protein, where it has been characterized as a lycopene beta-cyclase and the C-terminal half is a phytoene synthetase. In Myxococcus and Actinobacterial genomes this domain appears as a single polypeptide, tandemly repeated and usually in a genomic context consistent with a role in carotenoid biosynthesis. It is unclear whether any of the sequences in this family truly encode lycopene epsilon cyclases. However, a number are annotated as such. The domain is generally hydrophobic with a number of predicted membrane spanning segments and contains a distinctive motif (hPhEEhhhhhh). In certain sequences one of either the proline or glutamates may vary, but always one of the tandem pair appears to match this canonical sequence exactly.	89
274591	TIGR03463	osq_cycl	2,3-oxidosqualene cyclase. This model identifies 2,3-oxidosqualene cyclases from Stigmatella aurantiaca which produces cycloartenol, and Gemmata obscuriglobus and Methylococcus capsulatus, which each produce the closely related sterol, lanosterol.	634
274592	TIGR03464	HpnC	squalene synthase HpnC. This family of genes are members of a superfamily (pfam00494) of phytoene and squalene synthases which catalyze the head-to-head condensation of polyisoprene pyrophosphates. The genes of this family are often found in the same genetic locus with squalene-hopene cyclase genes, and are never associated with genes for the metabolism of phytoene. In the organisms Zymomonas mobilis and Bradyrhizobium japonicum these genes have been characterized as squalene synthases (farnesyl-pyrophosphate ligases). Often, these genes appear in tandem with the HpnD gene which appears to have resulted from an ancient gene duplication event. Presumably these proteins form a heteromeric complex, but this has not yet been experimentally demonstrated.	266
163278	TIGR03465	HpnD	squalene synthase HpnD. The genes of this family are often found in the same genetic locus with squalene-hopene cyclase genes, and are never associated with genes for the metabolism of phytoene. In the organisms Zymomonas mobilis and Bradyrhizobium japonicum these genes have been characterized as squalene synthases (farnesyl-pyrophosphate ligases). Often, these genes appear in tandem with the HpnC gene which appears to have resulted from an ancient gene duplication event. Presumably these proteins form a heteromeric complex, but this has not yet been experimentally demonstrated.	266
163279	TIGR03466	HpnA	hopanoid-associated sugar epimerase. The sequences in this family are members of the pfam01370 superfamily of NAD-dependent epimerases and dehydratases typically acting on nucleotide-sugar substrates. The genes of the family modeled here are generally in the same locus with genes involved in the biosynthesis and elaboration of hopene, the cyclization product of the polyisoprenoid squalene. This gene and its association with hopene biosynthesis in Zymomonas mobilis has been noted in the literature where the gene symbol hpnA was assigned. Hopanoids are known to be components of the plasma membrane and to have polar sugar head groups in Z. mobilis and other species.	328
274593	TIGR03467	HpnE	squalene-associated FAD-dependent desaturase. The sequences in this family are members of the pfam01593 superfamily of flavin-containing amine oxidases which include the phytoene desaturases. These sequences also include a FAD-dependent oxidoreductase domain, pfam01266. The genes of the family modeled here are generally in the same locus with genes involved in the biosynthesis and elaboration of squalene, the condensation product of the polyisoprenoid farnesyl pyrophosphate. This gene and its association with hopene biosynthesis in Zymomonas mobilis has been noted in the literature where the gene symbol hpnE was assigned. This gene is also found in contexts where the downstream conversion of squalene to hopenes is not evident. The precise nature of the reaction catalyzed by this enzyme is unknown at this time.	419
274594	TIGR03468	HpnG	hopanoid-associated phosphorylase. The sequences in this family are members of the pfam01048 family of phosphorylases typically acting on nucleotide-sugar substrates. The genes of the family modeled here are generally in the same locus with genes involved in the biosynthesis and elaboration of hopene, the cyclization product of the polyisoprenoid squalene. This gene is adjacent to the genes HpnA-E and squalene-hopene cyclase (which would be HpnF) in Zymomonas mobilis and their association with hopene biosynthesis has been noted in the literature. Extending the gene symbol sequence, we suggest the symbol HpnG for the product of this gene. Hopanoids are known to be components of the plasma membrane and to have polar sugar head groups in Z. mobilis and other species.	212
213815	TIGR03469	HpnB	hopene-associated glycosyltransferase HpnB. This family of genes includes a glycosyl transferase, group 2 domain (pfam00535); such domains are generally responsible for the transfer of nucleotide-diphosphate sugars to substrates such as polysaccharides and lipids. The genes of this family are often found in the same genetic locus with squalene-hopene cyclase genes, and are never associated with genes for the metabolism of phytoene. Indeed, the members of this family appear to never be found in a genome lacking squalene-hopene cyclase (SHC), although not all genomes encoding SHC have this glycosyl transferase. In the organism Zymomonas mobilis the linkage of this gene to hopanoid biosynthesis has been noted and the gene named HpnB. Hopanoids are known to feature polar glycosyl head groups in many organisms.	384
274595	TIGR03470	HpnH	hopanoid biosynthesis associated radical SAM protein HpnH. The sequences represented by this model are members of the radical SAM superfamily of enzymes (pfam04055). These enzymes utilize an iron-sulfur redox cluster and S-adenosylmethionine to carry out diverse radical mediated reactions. The members of this clade are frequently found in the same locus as squalene-hopene cyclase (SHC, TIGR01507) and other genes associated with the biosynthesis of hopanoid natural products. The linkage between SHC and this radical SAM enzyme is strong; one is nearly always observed in the same genome where the other is found. A hopanoid biosynthesis locus was described in Zymomonas mobilis consisting of the genes HpnA-E and SHC (HpnF). Continuing past SHC are found a phosphorylase enzyme (ZMO0873, i.e. HpnG, TIGR03468) and this radical SAM enzyme (ZMO0874) which we name here HpnH. Granted, in Z. mobilis, HpnH is in a convergent orientation with respect to HpnA-G, but one gene beyond HpnH and running in the same convergent direction is IspH (ZMO0875, 4-hydroxy-3-methylbut-2-enyl diphosphate reductase), an essential enzyme of IPP biosynthesis and therefore essential for the biosynthesis of hopanoids. One of the well-described hopanoid intermediates is bacteriohopanetetrol. In the conversion from hopene several reactions must occur in the side chain for which a radical mechanism might be reasonable. These include the four (presumably anaerobic) hydroxylations and a methyl shift.	318
132511	TIGR03471	HpnJ	hopanoid biosynthesis associated radical SAM protein HpnJ. The sequences represented by this model are members of the radical SAM superfamily of enzymes (pfam04055). These enzymes utilize an iron-sulfur redox cluster and S-adenosylmethionine to carry out diverse radical mediated reactions. The member of this clade from Acidithiobacillus ferrooxidans ATCC 23270 (AFE_0975) is found in the same locus as squalene-hopene cyclase (SHC, TIGR01507) and other genes associated with the biosynthesis of hopanoid natural products. Similarly, in Ralstonia eutropha JMP134 (Reut_B4901) this gene is adjacent to HpnAB, IspH and HpnH (TIGR03470), although SHC itself is elsewhere in the genome. Notably, this gene (here named HpnJ) and three others form a conserved set (HpnIJKL) which occur in a subset of all genomes containing the SHC enzyme. This relationship was discerned using the method of partial phylogenetic profiling. This group includes Zymomonas mobilis, the organism where the initial hopanoid biosynthesis locus was described consisting of the genes HpnA-E and SHC (HpnF). Continuing past SHC are found a phosphorylase enzyme (ZMO0873, i.e. HpnG, TIGR03468) and another radical SAM enzyme (ZMO0874), HpnH. Although discontinuous in Z. mobilis, we continue the gene symbol sequence with HpnIJKL. One of the well-described hopanoid intermediates is bacteriohopanetetrol. In the conversion from hopene several reactions must occur in the side chain for which a radical mechanism might be reasonable. These include the four (presumably anaerobic) hydroxylations and a methyl shift.	472
132512	TIGR03472	HpnI	hopanoid biosynthesis associated glycosyl transferase protein HpnI. This family of genes includes a glycosyl transferase, group 2 domain (pfam00535); such domains are generally responsible for the transfer of nucleotide-diphosphate sugars to substrates such as polysaccharides and lipids. The member of this clade from Acidithiobacillus ferrooxidans ATCC 23270 (AFE_0974) is found in the same locus as squalene-hopene cyclase (SHC, TIGR01507) and other genes associated with the biosynthesis of hopanoid natural products. Similarly, in Ralstonia eutropha JMP134 (Reut_B4902) this gene is adjacent to HpnAB, IspH and HpnH (TIGR03470), although SHC itself is elsewhere in the genome. Notably, this gene (here named HpnI) and three others form a conserved set (HpnIJKL) which occur in a subset of all genomes containing the SHC enzyme. This relationship was discerned using the method of partial phylogenetic profiling. This group includes Zymomonas mobilis, the organism where the initial hopanoid biosynthesis locus was described consisting of the genes HpnA-E and SHC (HpnF). Continuing past SHC are found a phosphorylase enzyme (ZMO0873, i.e. HpnG, TIGR03468) and another radical SAM enzyme (ZMO0874), HpnH. Although discontinuous in Z. mobilis, we continue the gene symbol sequence with HpnIJKL. Hopanoids are known to feature polar glycosyl head groups in many organisms.	373
132513	TIGR03473	HpnK	hopanoid biosynthesis associated protein HpnK. The sequences represented by this model are members of the pfam04794 "YdjC-like" family of uncharacterized proteins. The member of this clade from Acidithiobacillus ferrooxidans ATCC 23270 (AFE_0976) is found in the same locus as squalene-hopene cyclase (SHC, TIGR01507) and other genes associated with the biosynthesis of hopanoid natural products. Similarly, in Ralstonia eutropha JMP134 (Reut_B4902) this gene is adjacent to HpnAB, IspH and HpnH (TIGR03470), although SHC itself is elsewhere in the genome. Notably, this gene (here named HpnK) and three others form a conserved set (HpnIJKL) which occur in a subset of all genomes containing the SHC enzyme. This relationship was discerned using the method of partial phylogenetic profiling. This group includes Zymomonas mobilis, the organism where the initial hopanoid biosynthesis locus was described consisting of the genes HpnA-E and SHC (HpnF). Continuing past SHC are found a phosphorylase enzyme (ZMO0873, i.e. HpnG, TIGR03468) and a radical SAM enzyme (ZMO0874), HpnH. Although discontinuous in Z. mobilis, we continue the gene symbol sequence with HpnIJKL.	283
132514	TIGR03474	incFII_RepA	incFII family plasmid replication initiator RepA. Members of this protein family are the plasmid replication initiator RepA of incFII (plasmid incompatibility group F-II) plasmids. R1 and R100 are plasmids in this group. Immediately upstream of repA is found tap, a leader peptide of about 24 amino acids, often not assigned as a gene in annotated plasmid sequences. Note that other, non-homologous plasmid replication proteins share the gene symbol (repA) and similar names (plasmid replication protein RepA).	275
132515	TIGR03475	tap_IncFII_lead	RepA leader peptide Tap. This protein is a translated leader peptide that acts in the regulation of the expression of the plasmid replication protein RepA in incF2 group plasmids. [Mobile and extrachromosomal element functions, Plasmid functions]	25
274596	TIGR03476	HpnL	putative membrane protein. This family of hydrophobic proteins is observed in two distinct contexts. It is primarily found in the presence of genes for the biosynthesis and elaboration of hopene where we assign the gene symbol HpnL. In a subset of the genomes containing HpnL a second, often plasmid-encoded, homolog is observed in a context implying the biosynthesis of 2-aminoethylphosphonate head-group containing lipids.	318
274597	TIGR03477	DMSO_red_II_gam	DMSO reductase family type II enzyme, heme b subunit. This model represents a heme b-binding subunit, typically called the gamma subunit, of various proteins that also contain a molybdopterin subunit and an iron-sulfur protein. The group includes two distinct but very closely related periplasmic proteins of anaerobic respiration, selenate reductase and chlorate reductase. Other members of this family include dimethyl sulphide dehydrogenase and ethylbenzene dehydrogenase. [Energy metabolism, Electron transport]	206
132518	TIGR03478	DMSO_red_II_bet	DMSO reductase family type II enzyme, iron-sulfur subunit. This model represents the iron-sulfur subunit, typically called the beta subunit, of various proteins that also contain a molybdopterin subunit and a heme b subunit. The group includes two distinct but very closely related periplasmic proteins of anaerobic respiration, selenate reductase and chlorate reductase. Other members of this family include dimethyl sulphide dehydrogenase and ethylbenzene dehydrogenase. [Energy metabolism, Anaerobic, Energy metabolism, Electron transport]	321
132519	TIGR03479	DMSO_red_II_alp	DMSO reductase family type II enzyme, molybdopterin subunit. This model represents the molybdopterin subunit, typically called the alpha subunit, of various proteins that also contain an iron-sulfur subunit and a heme b subunit. The group includes two distinct but very closely related periplasmic proteins of anaerobic respiration, selenate reductase and chlorate reductase. Other members of this family include dimethyl sulphide dehydrogenase, ethylbenzene dehydrogenase, and an archaeal respiratory nitrate reductase. This alpha subunit has a twin-arginine translocation (TAT) signal for Sec-independent translocation across the plasma membrane.	912
274598	TIGR03480	HpnN	hopanoid biosynthesis associated RND transporter like protein HpnN. The genomes containing members of this family share the machinery for the biosynthesis of hopanoid lipids. Furthermore, the genes of this family are usually located proximal to other components of this biological process. The proteins appear to be related to the RND family of export proteins, particularly the hydrophobe/amphiphile efflux-3 (HAE3) family represented by TIGR00921.	862
274599	TIGR03481	HpnM	hopanoid biosynthesis associated membrane protein HpnM. The genomes containing members of this family share the machinery for the biosynthesis of hopanoid lipids. Furthermore, the genes of this family are usually located proximal to other components of this biological process. The proteins are members of the pfam05494 family of putative transporters known as "toluene tolerance protein Ttg2D", although it is unlikely that the members included here have anything to do with toluene per se.	198
274600	TIGR03482	DMSO_red_II_cha	DMSO reductase family type II enzyme chaperone. Type II members of the DMSO reductase family are heterotrimeric proteins with bis(molybdopterin guanine dinucleotide)Mo, iron-sulfur, and heme b prosthetic groups bound by the alpha, beta, and gamma subunits respectively. Members of this protein family are not part of the mature protein, although they are the product of a fourth clustered gene. Proteins in this family are interpreted as a chaperone, analogous to NarJ of nitrate reductases.	197
274601	TIGR03483	FtsZ_alphas_C	cell division protein FtsZ, alphaProteobacterial C-terminal extension. This model describes a domain found as a C-terminal extension to the cell division protein FtsZ in many but not all members of the alphaProteobacteria. [Cellular processes, Cell division]	121
132524	TIGR03485	cas_csx13_N	CRISPR-associated protein Cas8a1/Csx13, MYXAN subtype. Members of this family are found among cas (CRISPR-Associated) genes close to CRISPR repeats in Leptospira interrogans (a spirochete), Myxococcus xanthus (a delta-proteobacterium), and Lyngbya sp. PCC 8106 (a cyanobacterium). It is found with other cas genes in Anabaena variabilis ATCC 29413. In Lyngbya sp., the protein is split into two tandem genes. This model corresponds to the N-terminal region or upstream gene; the C-terminal region is described by TIGR03486. CRISPR/cas systems are associated with prokaryotic acquired resistance to phage and other exogenous DNA.	316
132525	TIGR03486	cas_csx13_C	CRISPR-associated protein Cas8a1/Csx13, MYXAN subtype, C-terminal region. Members of this family are found among cas (CRISPR-Associated) genes close to CRISPR repeats in Leptospira interrogans (a spirochete), Myxococcus xanthus (a delta-proteobacterium), and Lyngbya sp. PCC 8106 (a cyanobacterium). It is found with other cas genes in Anabaena variabilis ATCC 29413. In Lyngbya sp., the protein is split into two tandem genes. This model corresponds to the C-terminal region or downstream gene; the N-terminal region is modeled by TIGR03485. CRISPR/cas systems are associated with prokaryotic acquired resistance to phage and other exogenous DNA.	152
132526	TIGR03487	cas_csp2	CRISPR-associated protein Cas8c/Csp2, subtype PGING. Members of this protein family are cas, or CRISPR-associated, proteins. The two sequences in the alignment seed are found within cas gene clusters that are adjacent to CRISPR DNA repeats in two members of the order Bacteroidales, Porphyromonas gingivalis W83 and Bacteroides forsythus ATCC 43037. This cas protein family is unique to the Pging (Porphyromonas gingivalis) subtype.	489
132527	TIGR03488	cas_Cas5p	CRISPR-associated protein Cas5, subtype PGING. Members of this protein family are cas, or CRISPR-associated, proteins. The two sequences in the alignment seed are found within cas gene clusters that are adjacent to CRISPR DNA repeats in two members of the order Bacteroidales, Porphyromonas gingivalis W83 and Bacteroides forsythus ATCC 43037. This cas protein family is unique to the Pging (Porphyromonas gingivalis) subtype, but shows some sequence similarity to genes of the Cas5 type (see TIGR02593).	237
132528	TIGR03489	cas_csp1	CRISPR-associated protein Cas7/Csp1, subtype PGING. Members of this protein family are Csp1, a CRISPR-associated (cas) gene marker for the Pging subtype of CRISPR/cas system, as found in Porphyromonas gingivalis W83 and Bacteroides forsythus ATCC 43037. This protein belongs to the family of DevR (TIGR01875), a regulator of development in Myxococcus xanthus located in a cas gene region. A different branch of the DevR family, Cst2 (TIGR02585), is a marker for the Tneap subtype of CRISPR/cas system.	292
274602	TIGR03490	Mycoplas_LppA	mycoides cluster lipoprotein, LppA/P72 family. Members of this protein family occur in Mycoplasma mycoides, Mycoplasma hyopneumoniae, and related Mycoplasmas in small paralogous families that may also include truncated forms and/or pseudogenes. Members are predicted lipoproteins with a conserved signal peptidase II processing and lipid attachment site. Note that the name for certain characterized members, p72, reflects an anomalous apparent molecular weight, given a theoretical MW of about 61 kDa.	541
274603	TIGR03491	TIGR03491	RecB family nuclease, putative, TM0106 family. Members of this uncharacterized protein family are found broadly but sporadically among bacteria. The N-terminal region is homologous to the Cas4 protein of CRISPR systems, although this protein family shows no signs of association with CRISPR repeats.	457
274604	TIGR03492	TIGR03492	conserved hypothetical protein. This protein family is restricted to the Cyanobacteria, in one or two copies, save for instances in the genus Deinococcus. This protein shows some sequence similarity, especially toward the C-terminus, to lipid-A-disaccharide synthase (TIGR00215 or pfam02684). The function is unknown.	396
274605	TIGR03493	cellullose_BcsF	cellulose biosynthesis operon protein BcsF/YhjT. Members of this protein family are found invariably together with genes of bacterial cellulose biosynthesis, and are presumed to be involved in the process. Members average about 63 amino acids in length and are not uncharacterized. The gene has been designated both YhjT and BcsF (bacterial cellulose synthesis F). [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	62
132533	TIGR03494	salicyl_syn	salicylate synthase. Members of this protein family are salicylate synthases, bifunctional enzymes that make salicylate, in two steps, from chorismate. Members are homologous to anthranilate synthase component I from Trp biosynthesis. Members typically are found in gene regions associated with siderophore or other secondary metabolite biosynthesis.	425
274606	TIGR03495	phage_LysB	phage lysis regulatory protein, LysB family. Members of this protein family are phage lysis regulatory protein, including the well-studied protein LysB (lysis protein B) of Enterobacteria phage P2. For members of this family, genes are found in phage or in prophage regions of bacterial genomes, typically near a phage lysozyme or phage holin.	135
274607	TIGR03496	FliI_clade1	flagellar protein export ATPase FliI. Members of this protein family are the FliI protein of bacterial flagellum systems. This protein acts to drive protein export for flagellar biosynthesis. The most closely related family is the YscN family of bacterial type III secretion systems. This model represents one (of three) segment of the FliI family tree. These have been modeled separately in order to exclude the type III secretion ATPases more effectively. [Cellular processes, Chemotaxis and motility]	411
274608	TIGR03497	FliI_clade2	flagellar protein export ATPase FliI. Members of this protein family are the FliI protein of bacterial flagellum systems. This protein acts to drive protein export for flagellar biosynthesis. The most closely related family is the YscN family of bacterial type III secretion systems. This model represents one (of three) segment of the FliI family tree. These have been modeled separately in order to exclude the type III secretion ATPases more effectively. [Cellular processes, Chemotaxis and motility]	413
163293	TIGR03498	FliI_clade3	flagellar protein export ATPase FliI. Members of this protein family are the FliI protein of bacterial flagellum systems. This protein acts to drive protein export for flagellar biosynthesis. The most closely related family is the YscN family of bacterial type III secretion systems. This model represents one (of three) segment of the FliI family tree. These have been modeled separately in order to exclude the type III secretion ATPases more effectively.	418
274609	TIGR03499	FlhF	flagellar biosynthetic protein FlhF. [Cellular processes, Chemotaxis and motility]	282
274610	TIGR03500	FliO_TIGR	flagellar biosynthetic protein FliO. This short protein found in flagellar biosynthesis operons contains a highly hydrophobic N-terminal sequence followed generally by two basic amino acids. This region is reminiscent of but distinct from the twin-arginine translocation signal sequence. Some instances of this gene have been named "FliZ" but phylogenetic tree building supports a single FliO family.	69
274611	TIGR03501	GlyGly_CTERM	GlyGly-CTERM domain. This homology domain, GlyGly-CTERM, shares a species distribution with rhombosortase (TIGR03902), a subfamily of rhomboid-like intramembrane serine proteases. It is probably a recognition sequence for protein sorting and then cleavage by rhombosortase. Shewanella species have the largest number of target proteins per genome, up to thirteen. The domain occurs at the extreme carboxyl-terminus of a diverse set of proteins, most of which are enzymes with conventional signal sequences and with hydrolytic activities: nucleases, proteases, agarases, etc. The agarase AgaA from Vibrio sp. strain JT0107 is secreted into the medium, while the same protein heterologously expressed in E. coli is retained in the cell fraction. This suggests cleavage and release in species with this domain. Both this suggestion and the chemical structure of the domain (motif, hydrophobic predicted transmembrane helix, cluster of basic residues) closely parallel those of the LPXTG/sortase system and the PEP-CTERM/exosortase(EpsH) system. For this reason, the putative processing enzyme is designated rhombosortase.	22
274612	TIGR03502	lipase_Pla1_cef	extracellular lipase, Pla-1/cef family. Members of this protein family are bacterial lipoproteins largely from the Gammaproteobacteria. Characterized members are expressed extracellularly and have esterase activity. Members include the lipase Pla-1 from Aeromonas hydrophila (AF092033) and CHO cell elongation factor (cef) from Vibrio hollisae.	792
274613	TIGR03503	TIGR03503	TIGR03503 family protein. This set of conserved hypothetical proteins has a phylogenetic range that closely matches that of TIGR03501, a putative C-terminal protein targeting signal.	374
274614	TIGR03504	FimV_Cterm	FimV C-terminal domain. This protein is found at the extreme C-terminus of FimV from Pseudomonas aeruginosa, and of TspA of Neisseria meningitidis. Disruption of the former blocks twitching motility from type IV pili; Semmler, et al. suggest a role in peptidoglycan layer remodelling required by type IV fimbrial systems.	44
274615	TIGR03505	FimV_core	FimV N-terminal domain. This region is found at, or about 200 amino acids from, the N-terminus of FimV from Pseudomonas aeruginosa, TspA of Neisseria meningitidis, and related proteins. Disruption of FimV blocks twitching motility from type IV pili; Semmler, et al. suggest a role for this family in peptidoglycan layer remodelling required by type IV fimbrial systems. Most but not all members of this protein family have a C-terminal region recognized by TIGR03504. In between is a highly variable, often repeat-filled region rich in the negatively charged amino acids Asp and Glu.	74
274616	TIGR03506	FlgEFG_subfam	flagellar hook-basal body protein. This model encompasses three closely related flagellar proteins usually denoted FlgE, FlgF and FlgG. The names have often been mis-assigned, however. Three equivalog models, TIGR02489, TIGR02490 and TIGR02488, respectively, separate the individual forms into three genome-context consistent groups. The major differences between these genes are architectural, with variable central sections between relatively conserved N- and C-terminal domains. More distantly related are two other flagellar apparatus families, FlgC (TIGR01395) which consists of little else but the N- and C-terminal domains and FlgK (TIGR02492) with a substantial but different central domain.	231
274617	TIGR03507	decahem_SO1788	decaheme c-type cytochrome, OmcA/MtrC family. The protein SO_1778 (MtrC) of Shewanella oneidensis MR-1, and its paralog SO_1779 (OmcA), with which it interacts, are large decaheme proteins, about 900 amino acids in length, involved in the use of manganese [Mn(III/IV)] and iron [Fe(III)] as terminal electron acceptors. This model represents these and similar decaheme proteins, found also in Rhodoferax ferrireducens DSM 15236, Aeromonas hydrophila ATCC7966, and a few other bacterial species. [Energy metabolism, Electron transport]	659
274618	TIGR03508	decahem_SO	decaheme c-type cytochrome, DmsE family. Members of this family are small, decaheme c-type cytochromes, related to DmsE of Shewanella oneidensis MR-1, which has been shown to be part of an anaerobic dimethyl sulfoxide reductase.	258
274619	TIGR03509	OMP_MtrB_PioB	decaheme-associated outer membrane protein, MtrB/PioB family. Members of this protein family are integral proteins of the bacterial outer membrane, associated with multiheme c-type cytochromes involved in electron transfer. The MtrB protein of Shewanella oneidensis MR-1 (SO1776) has been shown to form a complex with 1:1:1 stoichiometry with the small, periplasmic decaheme cytochrome MtrA and large, surface-exposed decaheme cytochrome MtrC. [Energy metabolism, Electron transport]	649
274620	TIGR03510	XapX	XapX domain. This model describes an uncharacterized small, hydrophobic protein of about 50 amino acids, found between the xapB and xapR genes of the E. coli xanthosine utilization system, and homologous regions in other small proteins, such as the N-terminal region of DUF1427 (pfam07235). We name this domain XapX, as it comprises the full length of the protein encoded between the genes for the well-studied XapB and XapR proteins. [Unknown function, General]	49
274621	TIGR03511	GldH_lipo	gliding motility-associated lipoprotein GldH. Members of this protein family are predicted lipoproteins, exclusive to the Bacteroidetes phylum (previously Cytophaga-Flavobacteria-Bacteroides). Members include GldH, a protein linked to a type of rapid surface gliding motility found in certain Bacteroidetes, such as Flavobacterium johnsoniae and Cytophaga hutchinsonii. Gliding motility appears closely linked to chitin utilization in the model species Flavobacterium johnsoniae. Not all Bacteroidetes with members of this protein family may have gliding motility. [Cellular processes, Chemotaxis and motility]	156
132551	TIGR03512	GldD_lipo	gliding motility-associated lipoprotein GldD. Members of this protein family are found in a number of Bacteroidetes lineage bacteria, including both species such as Flavobacterium johnsoniae, which possess a poorly understood form of rapid gliding motility, and other species which apparently do not. Mutation of GldD blocks both this motility and chitin utilization in the model species, Flavobacterium johnsoniae. [Cellular processes, Chemotaxis and motility]	186
274622	TIGR03513	GldL_gliding	gliding motility-associated protein GldL. This protein family, GldL, is named for the member from Flavobacterium johnsoniae, which is required for a type of rapid gliding motility found in certain members of the Bacteroidetes. However, members are found also in several members of the Bacteroidetes that appear not to be motile. [Cellular processes, Chemotaxis and motility]	202
274623	TIGR03514	GldB_lipo	gliding motility-associated lipoprotein GldB. 	319
132554	TIGR03515	GldC	gliding motility-associated protein GldC. Members of this protein family are exclusive to the Bacteroidetes phylum (previously Cytophaga-Flavobacteria-Bacteroides). GldC is a protein linked to a type of rapid surface gliding motility found in certain Bacteroidetes, such as Flavobacterium johnsoniae and Cytophaga hutchinsonii. Knockouts of GldC do not abolish the gliding phenotype but do impair it. Gliding motility appears closely linked to chitin utilization in the model species Flavobacterium johnsoniae. Bacteroidetes with members of this protein family appear to have all of the genes associated with gliding motility.	108
132555	TIGR03516	ppisom_GldI	peptidyl-prolyl isomerase, gliding motility-associated. Members of this protein family are exclusive to the Bacteroidetes phylum (previously Cytophaga-Flavobacteria-Bacteroides). GldI is a FKBP-type peptidyl-prolyl cis-trans isomerase (pfam00254) linked to a type of rapid surface gliding motility found in certain Bacteroidetes, such as Flavobacterium johnsoniae and Cytophaga hutchinsonii. Knockout of this gene abolishes the gliding phenotype. Gliding motility appears closely linked to chitin utilization in the model species Flavobacterium johnsoniae. This family is only found in Bacteroidetes containing the suite of genes proposed to confer the gliding motility phenotype.	177
274624	TIGR03517	GldM_gliding	gliding motility-associated protein GldM. This protein family, GldM, is named for the member from Flavobacterium johnsoniae, which is required for a type of rapid gliding motility found in certain members of the Bacteroidetes. However, members are found also in several members of the Bacteroidetes that appear not to be motile. The best conserved region, toward the N-terminus, is centered on a highly hydrophobic probable transmembrane helix. Two paralogs are found in Cytophaga hutchinsonii.	523
132557	TIGR03518	ABC_perm_GldF	gliding motility-associated ABC transporter permease protein GldF. Members of this protein family are exclusive to the Bacteroidetes phylum (previously Cytophaga-Flavobacteria-Bacteroides). GldF is believed to be an ABC transporter permease protein (along with an ATP-binding subunit, GldA, and a substrate-binding subunit, GldG) and is linked to a type of rapid surface gliding motility found in certain Bacteroidetes, such as Flavobacterium johnsoniae and Cytophaga hutchinsonii. Knockouts of GldF abolish the gliding phenotype. Gliding motility appears closely linked to chitin utilization in the model species Flavobacterium johnsoniae. Bacteroidetes with members of this protein family appear to have all of the genes associated with gliding motility.	240
274625	TIGR03519	T9SS_PorP_fam	type IX secretion system membrane protein, PorP/SprF family. This model describes a protein family unique to, and greatly expanded in, the Bacteroidetes. Species in this lineage include several, such as Cytophaga hutchinsonii and Flavobacterium johnsoniae, that have type IX secretion systems (T9SS) and exhibit a poorly understood rapid gliding phenotype. Several members of this protein family are found in operons with other genes whose loss leads to a loss of this motility.	291
274626	TIGR03520	GldE	gliding motility-associated protein GldE. Members of this protein family are exclusive to the Bacteroidetes phylum (previously Cytophaga-Flavobacteria-Bacteroides). GldE is a protein linked to a type of rapid surface gliding motility found in certain Bacteroidetes, such as Flavobacterium johnsoniae and Cytophaga hutchinsonii. GldE was discovered because of its adjacency to GldD in F. johnsoniae. Overexpression of GldE partially suppresses the effects of a GldB point mutant, suggesting that GldB and GldE interact. Gliding motility appears closely linked to chitin utilization in the model species Flavobacterium johnsoniae. Not all Bacteroidetes with members of this protein family appear to have all of the genes associated with gliding motility and in fact some do not appear to express the gliding phenotype.	407
274627	TIGR03521	GldG	gliding-associated putative ABC transporter substrate-binding component GldG. Members of this protein family are exclusive to the Bacteroidetes phylum (previously Cytophaga-Flavobacteria-Bacteroides). GldG is a protein linked to a type of rapid surface gliding motility found in certain Bacteroidetes, such as Flavobacterium johnsoniae and Cytophaga hutchinsonii. Knockouts of GldG abolish the gliding phenotype. GldG, along with GldA and GldF are believed to compose an ABC transporter and are observed as an operon. Gliding motility appears closely linked to chitin utilization in the model species Flavobacterium johnsoniae. Bacteroidetes with members of this protein family appear to have all of the genes associated with gliding motility.	552
132561	TIGR03522	GldA_ABC_ATP	gliding motility-associated ABC transporter ATP-binding subunit GldA. Members of this protein family are exclusive to the Bacteroidetes phylum (previously Cytophaga-Flavobacteria-Bacteroides). GldA is an ABC transporter ATP-binding protein (pfam00005) linked to a type of rapid surface gliding motility found in certain Bacteroidetes, such as Flavobacterium johnsoniae and Cytophaga hutchinsonii. Knockouts of GldA abolish the gliding phenotype. Gliding motility appears closely linked to chitin utilization in the model species Flavobacterium johnsoniae. Bacteroidetes with members of this protein family appear to have all of the genes associated with gliding motility.	301
274628	TIGR03523	GldN	gliding motility associated protein GldN. Members of this protein family are exclusive to the Bacteroidetes phylum (previously Cytophaga-Flavobacteria-Bacteroides). GldN is a protein linked to a type of rapid surface gliding motility found in certain Bacteroidetes, such as Flavobacterium johnsoniae and Cytophaga hutchinsonii. Knockouts of GldN abolish the gliding phenotype. Gliding motility appears closely linked to chitin utilization in the model species Flavobacterium johnsoniae. Bacteroidetes with members of this protein family also include those which are not believed to express the gliding phenotype, such as Prevotella intermedia and Porphyromonas gingivalis.	280
132563	TIGR03524	GldJ	gliding motility-associated lipoprotein GldJ. Members of this protein family are exclusive to the Bacteroidetes phylum (previously Cytophaga-Flavobacteria-Bacteroides). GldJ is a lipoprotein linked to a type of rapid surface gliding motility found in certain Bacteroidetes, such as Flavobacterium johnsoniae. Knockouts of GldJ abolish the gliding phenotype. GldJ is homologous to GldK. There is a GldJ homolog in Cytophaga hutchinsonii and several other species that has a different, shorter architecture and is represented by a separate model. Gliding motility appears closely linked to chitin utilization in the model species Flavobacterium johnsoniae. Bacteroidetes with members of this protein family appear to have all of the genes associated with gliding motility.	559
274629	TIGR03525	GldK	gliding motility-associated lipoprotein GldK. Members of this protein family are exclusive to the Bacteroidetes phylum (previously Cytophaga-Flavobacteria-Bacteroides). GldK is a lipoprotein linked to a type of rapid surface gliding motility found in certain Bacteroidetes, such as Flavobacterium johnsoniae. Knockouts of GldK abolish the gliding phenotype. GldK is homologous to GldJ. There is a GldK homolog in Cytophaga hutchinsonii and several other species that has a different, shorter architecture and is represented by a separate model. Gliding motility appears closely linked to chitin utilization in the model species Flavobacterium johnsoniae. Bacteroidetes with members of this protein family appear to have all of the genes associated with gliding motility.	450
132565	TIGR03526	selenium_YgeY	putative selenium metabolism hydrolase. SelD, selenophosphate synthase, is the selenium donor protein for both selenocysteine and selenouridine biosynthesis systems, but it occurs also in a few prokaryotes that have neither of those pathways. The method of partial phylogenetic profiling, starting from such orphan-selD genomes, identifies this protein as one of those most strongly correlated to SelD occurrence. Its distribution is also well correlated with that of family TIGR03309, a putative accessory protein of labile selenium (non-selenocysteine) enzyme maturation. This family includes the uncharacterized YgeY of Escherichia coli, and belongs to a larger family of metalloenzymes in which some are known peptidases, others enzymes of different types.	395
274630	TIGR03527	selenium_YedF	selenium metabolism protein YedF. Members of this protein family are about 200 amino acids in size, and include the uncharacterized YedF protein of Escherichia coli. This family shares an N-terminal domain, modeled by pfam01206, with the sulfurtransferase TusA (also called SirA). The C-terminal domain includes a typical redox-active disulfide motif, CGXC. This protein family is found only among those genomes that also carry the selenium donor protein SelD, and its connection to selenium metabolism is indicated by the method of partial phylogenetic profiling vs. SelD. Its gene typically is found next to selD. Members of this family are found even when selenocysteine and selenouridine biosynthesis pathways are, except for SelD, completely absent, as in Enterococcus faecalis. Its role in selenium metabolism is unclear, but may include either detoxification or a role in labile selenoprotein biosynthesis.	194
274631	TIGR03528	2_3_DAP_am_ly	diaminopropionate ammonia-lyase. Members of this protein family are the homodimeric, pyridoxal phosphate enzyme diaminopropionate ammonia-lyase, which adds water to remove two amino groups, leaving pyruvate.	396
274632	TIGR03529	GldK_short	gliding motility-associated lipoprotein GldK. Members of this protein family are exclusive to the Bacteroidetes phylum (previously Cytophaga-Flavobacteria-Bacteroides). GldK is a lipoprotein linked to a type of rapid surface gliding motility found in certain Bacteroidetes, such as Flavobacterium johnsoniae. Knockouts of GldK abolish the gliding phenotype. GldK is homologous to GldJ. This model represents a GldK homolog in Cytophaga hutchinsonii and several other species that has a different, shorter architecture than that found in Flavobacterium johnsoniae and related species (represented by TIGR03525). Gliding motility appears closely linked to chitin utilization in the model species Flavobacterium johnsoniae. Bacteroidetes with members of this protein family appear to have all of the genes associated with gliding motility.	344
132569	TIGR03530	GldJ_short	gliding motility-associated lipoprotein GldJ. Members of this protein family are exclusive to the Bacteroidetes phylum (previously Cytophaga-Flavobacteria-Bacteroides). GldJ is a lipoprotein linked to a type of rapid surface gliding motility found in certain Bacteroidetes, such as Flavobacterium johnsoniae. Knockouts of GldJ abolish the gliding phenotype. GldJ is homologous to GldK. This model represents the GldJ homolog in Cytophaga hutchinsonii and several other species which is of shorter architecture than that found in Flavobacterium johnsoniae and is represented by a separate model (TIGR03524). Gliding motility appears closely linked to chitin utilization in the model species Flavobacterium johnsoniae. Bacteroidetes with members of this protein family appear to have all of the genes associated with gliding motility.	402
211833	TIGR03531	selenium_SpcS	O-phosphoseryl-tRNA(Sec) selenium transferase. In the archaea and eukaryotes, the conversion of the mischarged serine to selenocysteine (Sec) on its tRNA is accomplished in two steps. This enzyme, O-phosphoseryl-tRNA(Sec) selenium transferase, acts second, after a phosphorylation step catalyzed by a homolog of the bacterial SelA protein. [Protein synthesis, tRNA aminoacylation]	444
132571	TIGR03532	DapD_Ac	2,3,4,5-tetrahydropyridine-2,6-dicarboxylate N-acetyltransferase. This enzyme is part of the diaminopimelate pathway of lysine biosynthesis. Alternate name: tetrahydrodipicolinate N-acetyltransferase. Note that IUBMB lists this alternate name as the accepted name. Unfortunately, the related succinyl transferase acting on the same substrate (EC:2.3.1.117, TIGR00695) uses the opposite standard. We have decided to give these two enzymes names which more clearly indicate that they act on the same substrate.	231
274633	TIGR03533	L3_gln_methyl	protein-(glutamine-N5) methyltransferase, ribosomal protein L3-specific. Members of this protein family methylate ribosomal protein L3 on a glutamine side chain. This family is related to HemK, a protein-glutamine methyltransferase for peptide chain release factors. [Protein synthesis, Ribosomal proteins: synthesis and modification]	284
274634	TIGR03534	RF_mod_PrmC	protein-(glutamine-N5) methyltransferase, release factor-specific. Members of this protein family are HemK (PrmC), a protein once thought to be involved in heme biosynthesis but now recognized to be a protein-glutamine methyltransferase that modifies the peptide chain release factors. All members of the seed alignment are encoded next to the release factor 1 gene (prfA) and confirmed by phylogenetic analysis. SIMBAL analysis (manuscript in prep.) shows the motif [LIV]PRx[DE]TE (in Escherichia coli, IPRPDTE) confers specificity for the release factors rather than for ribosomal protein L3. [Protein fate, Protein modification and repair]	250
274635	TIGR03535	DapD_actino	2,3,4,5-tetrahydropyridine-2,6-dicarboxylate N-succinyltransferase. This enzyme is part of the diaminopimelate pathway of lysine biosynthesis. This model represents a clade of the enzyme specific to Actinobacteria. Alternate name: tetrahydrodipicolinate N-succinyltransferase.	319
211834	TIGR03536	DapD_gpp	2,3,4,5-tetrahydropyridine-2,6-dicarboxylate N-succinyltransferase. 2,3,4,5-tetrahydropyridine-2,6-dicarboxylate N-succinyltransferase (DapD) is involved in the succinylated branch of the "lysine biosynthesis via diaminopimelate (DAP)" pathway (GenProp0125). This model represents a clade of DapD sequences most closely related to the actinobacterial DapD family represented by the TIGR03535 model. All of the genes evaluated for the seed of this model are found in genomes where the downstream desuccinylase is present, but known DapD genes are absent. Additionally, many of the genes identified by this model are found proximal to genes involved in this lysine biosynthesis pathway.	341
274636	TIGR03537	DapC	succinyldiaminopimelate transaminase. The four sequences which make up the seed for this model are not closely related, although they are all members of the pfam00155 family of aminotransferases and are more closely related to each other than to anything else. Additionally, all of them are found in the vicinity of genes involved in the biosynthesis of lysine via the diaminopimelate pathway (GenProp0125), although this amount to a separation of 12 genes in the case of Sulfurihydrogenibium azorense Az-Fu1. None of these genomes contain another strong candidate for this role in the pathway. Note: the detailed information included in the EC:2.6.1.17 record includes the assertions that the enzyme uses the pyridoxal pyrophosphate cofactor, which is consistent with the pfam00155 family, and the assertion that the amino group donor is L-glutamate, which is undetermined for the sequences in this clade.	350
274637	TIGR03538	DapC_gpp	succinyldiaminopimelate transaminase. This family of succinyldiaminopimelate transaminases (DapC) includes the experimentally characterized enzyme from Bordetella pertussis. The majority of genes in this family are proximal to genes encoding components of the lysine biosynthesis via diaminopimelate pathway (GenProp0125).	393
132578	TIGR03539	DapC_actino	succinyldiaminopimelate transaminase. This family of actinobacterial succinyldiaminopimelate transaminase enzymes (DapC) are members of the pfam00155 superfamily. Many of these genes appear adjacent to other genes encoding enzymes of the lysine biosynthesis via diaminopimelate pathway (GenProp0125).	357
274638	TIGR03540	DapC_direct	LL-diaminopimelate aminotransferase. This clade of the pfam00155 superfamily of aminotransferases includes several which are adjacent to elements of the lysine biosynthesis via diaminopimelate pathway (GenProp0125). Every member of this clade is from a genome which possesses most of the lysine biosynthesis pathway but lacks any of the known aminotransferases, succinylases, desuccinylases, acetylases or deacetylases typical of the acylated versions of this pathway nor do they have the direct, NADPH-dependent enzyme (ddh). Although there is no experimental characterization of any of the sequences in this clade, a direct pathway is known in plants and Chlamydia and the clade containing the Chlamydia gene is a neighboring one in the same pfam00155 superfamily so it seems quite reasonable that these enzymes catalyze the same transformation.	383
132580	TIGR03541	reg_near_HchA	LuxR family transcriptional regulatory, chaperone HchA-associated. Members of this protein family belong to the LuxR transcriptional regulator family, and contain both autoinducer binding (pfam03472) and transcriptional regulator (pfam00196) domains. Members, however, occur only in a few members of the Gammaproteobacteria that have the chaperone/aminopeptidase HchA, and are always encoded by the adjacent gene.	232
163316	TIGR03542	DAPAT_plant	LL-diaminopimelate aminotransferase. This clade of the pfam00155 superfamily of aminotransferases includes several which are adjacent to elements of the lysine biosynthesis via diaminopimelate pathway (GenProp0125). This clade includes characterized species in plants and Chlamydia. Every member of this clade is from a genome which possesses most of the lysine biosynthesis pathway but lacks any of the known succinylases, desuccinylases, acetylases or deacetylases typical of the acylated versions of this pathway nor do they have the direct, NADPH-dependent enzyme (ddh).	402
188337	TIGR03543	divI1A_rptt_fam	DivIVA domain repeat protein. Members of this protein family contain two full and two partial repeats of a domain found at the N-terminus of Bacillus subtilis cell-division initiation protein DivIVA. The portion repeated four times in these proteins includes the motif GYxxxxVD.	178
274639	TIGR03544	DivI1A_domain	DivIVA domain. This model describes a domain found in Bacillus subtilis cell division initiation protein DivIVA, and homologs, toward the N-terminus. It is also found as a repeated domain in certain other proteins, including family TIGR03543.	34
274640	TIGR03545	TIGR03545	TIGR03545 family protein. This model represents a relatively rare but broadly distributed uncharacterized protein family, distributed in 1-2 percent of bacterial genomes, all of which have outer membranes. In many of these genomes, it is part of a two-gene pair.	555
200289	TIGR03546	TIGR03546	TIGR03546 family protein. Members of this family are uncharacterized proteins, usually encoded by a gene adjacent to a member of family TIGR03545, which is also uncharacterized.	154
274641	TIGR03547	muta_rot_YjhT	mutarotase, YjhT family. Members of this protein family contain multiple copies of the beta-propeller-forming Kelch repeat. All are full-length homologs to YjhT of Escherichia coli, which has been identified as a mutarotase for sialic acid. This protein improves bacterial ability to obtain host sialic acid, and thus serves as a virulence factor. Some bacteria carry what appears to be a cyclically permuted homolog of this protein.	346
274642	TIGR03548	mutarot_permut	cyclically-permuted mutarotase family protein. Members of this protein family show essentially full-length homology, cyclically permuted, to YjhT from Escherichia coli. YjhT was shown to act as a mutarotase for sialic acid, and by this ability to be able to act as a virulence factor. Members of the YjhT family (TIGR03547) and this cyclically-permuted family have multiple repeats of the beta-propeller-forming Kelch repeat.	331
132588	TIGR03549	TIGR03549	YcaO domain protein. This family consists of remarkably well-conserved proteins from gamma and beta Proteobacteria, heavily skewed towards organisms of marine environments. Its gene neighborhood is not conserved. This family has an OsmC-like N-terminal domain. It shares a YcaO domain, frequently associated with ATP-dependent cyclodehydration for peptide modification. The function is unknown. Fifteen of the first sixteen members of this family are from selenouridine-positive genomes, but this correlation may not be meaningful.	718
132589	TIGR03550	F420_cofG	7,8-didemethyl-8-hydroxy-5-deazariboflavin synthase, CofG subunit. This model represents either a subunit or a domain, depending on whether or not the genes are fused, of a bifunctional protein that completes the synthesis of 7,8-didemethyl-8-hydroxy-5-deazariboflavin, or FO. FO is the chromophore of coenzyme F(420), involved in methanogenesis in methanogenic archaea but found in certain other lineages as well. The chromophore also occurs as a cofactor in DNA photolyases in Cyanobacteria. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	322
132590	TIGR03551	F420_cofH	7,8-didemethyl-8-hydroxy-5-deazariboflavin synthase, CofH subunit. This enzyme, together with CofG, completes the biosynthesis of 7,8-didemethyl-8-hydroxy-5-deazariboflavin, the chromophore of coenzyme F420. The chromophore is also used in cyanobacterial DNA photolyases. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	343
274643	TIGR03552	F420_cofC	2-phospho-L-lactate guanylyltransferase. Members of this protein family are the CofC enzyme of coenzyme F420 biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	195
132592	TIGR03553	F420_FbiB_CTERM	F420 biosynthesis protein FbiB, C-terminal domain. Coenzyme F420 differs between the Archaea and the Actinobacteria, where the numbers of glutamate residues attached are 2 (Archaea) or 5-6 (Mycobacterium). The enzyme in the Archaea is homologous to the N-terminal domain of FbiB from Mycobacterium bovis, and is responsible for glutamate ligation. Therefore it seems likely that the C-terminal domain of FbiB modeled here, is involved in additional glutamate ligation. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	194
213827	TIGR03554	F420_G6P_DH	glucose-6-phosphate dehydrogenase (coenzyme-F420). This family consists of the F420-dependent glucose-6-phosphate dehydrogenase of Mycobacterium and Nocardia. It shows homology to several other F420-dependent enzymes rather than to the NAD or NADP-dependent glucose-6-phosphate dehydrogenases. [Energy metabolism, Pentose phosphate pathway]	331
132594	TIGR03555	F420_mer	5,10-methylenetetrahydromethanopterin reductase. Members of this protein family are 5,10-methylenetetrahydromethanopterin reductase, an F420-dependent enzyme of methanogenesis. It is restricted to the Archaea. [Energy metabolism, Methanogenesis]	325
274644	TIGR03556	photolyase_8HDF	deoxyribodipyrimidine photo-lyase, 8-HDF type. This model describes a narrow clade of cyanobacterial deoxyribodipyrimidine photo-lyase. This group, in contrast to several closely related proteins, uses a chromophore that, in other lineages is modified further to become coenzyme F420. This chromophore is called 8-HDF in most articles on the DNA photolyase and FO in most literature on coenzyme F420. [DNA metabolism, DNA replication, recombination, and repair]	471
274645	TIGR03557	F420_G6P_family	F420-dependent oxidoreductase, G6PDH family. Members of this protein family include F420-dependent glucose-6-phosphate dehydrogenases (TIGR03554) and related proteins. All members of this family come from species that synthesize coenzyme F420, with the exception of those that belong to TIGR03885, a clade within this family in which cofactor binding may instead be directed to FMN. [Unknown function, Enzymes of unknown specificity]	316
274646	TIGR03558	oxido_grp_1	luciferase family oxidoreductase, group 1. The Pfam domain family pfam00296 is named for luciferase-like monooxygenases, but the family also contains several coenzyme F420-dependent enzymes. This protein family represents a well-resolved clade within family pfam00296 and shows no restriction to coenzyme F420-positive species, unlike some other clades within pfam00296. [Unknown function, Enzymes of unknown specificity]	323
274647	TIGR03559	F420_Rv3520c	probable F420-dependent oxidoreductase, Rv3520c family. Members of this protein family are predicted to be oxidoreductases dependent on coenzyme F420. The family includes a single member in Mycobacterium tuberculosis (Rv3520c/MT3621) but four in Mycobacterium smegmatis. Prediction that this family is F420-dependent is based primarily on Partial Phylogenetic Profiling vs. F420 biosynthesis. [Unknown function, Enzymes of unknown specificity]	325
274648	TIGR03560	F420_Rv1855c	probable F420-dependent oxidoreductase, Rv1855c family. Coenzyme F420 has a limited phylogenetic distribution, including methanogenic archaea, Mycobacterium tuberculosis and related species, Colwellia psychrerythraea 34H, Rhodopseudomonas palustris HaA2, and others. Partial phylogenetic profiling identifies protein subfamilies, within the larger family called luciferase-like monooxygenases (pfam00296), that appear only in F420-positive genomes and are likely to be F420-dependent. This model describes one such subfamily, exemplified by Rv1855c from Mycobacterium tuberculosis. [Unknown function, Enzymes of unknown specificity]	227
274649	TIGR03561	organ_hyd_perox	peroxiredoxin, Ohr subfamily. pfam02566, OsmC-like protein, contains several deeply split clades of homologous proteins. The clade modeled here includes the protein Ohr, or organic hydroperoxide resistance protein. [Cellular processes, Detoxification]	134
274650	TIGR03562	osmo_induc_OsmC	peroxiredoxin, OsmC subfamily. pfam02566, OsmC-like protein, contains several deeply split clades of homologous proteins. The clade modeled here includes the protein OsmC, or osmotically induced protein C. The member from Thermus thermophilus was shown to have hydroperoxide peroxidase activity. In many species, this protein is induced by stress and helps resist oxidative stress. [Cellular processes, Detoxification]	135
132602	TIGR03563	perox_SACOL1771	peroxiredoxin, SACOL1771 subfamily. This protein family belongs to the OsmC/Ohr family (pfam02566, OsmC-like protein) of peroxiredoxins.	138
274651	TIGR03564	F420_MSMEG_4879	F420-dependent oxidoreductase, MSMEG_4879 family. Coenzyme F420 is produced by methanogenic archaea, a number of the Actinomycetes (including Mycobacterium tuberculosis), and rare members of other lineages. The resulting information-rich phylogenetic profile identifies candidate F420-dependent oxidoreductases within the family of luciferase-like enzymes (pfam00296), where the species range for the subfamily encompasses many F420-positive genomes without straying beyond. This family is uncharacterized, and named for member MSMEG_4879 from Mycobacterium smegmatis. [Unknown function, Enzymes of unknown specificity]	265
274652	TIGR03565	alk_sulf_monoox	alkanesulfonate monooxygenase, FMNH(2)-dependent. Members of this protein family are monooxygenases that catalyze desulfonation of aliphatic sulfonates such as methane sulfonate. This enzyme uses reduced FMN, although various other members of the same luciferase-like monooxygenase family (pfam00296) are F420-dependent enzymes. [Central intermediary metabolism, Sulfur metabolism]	346
211840	TIGR03566	FMN_reduc_MsuE	FMN reductase, MsuE subfamily. Members of this protein family use NAD(P)H to reduce FMN and regenerate FMNH2. Members include the NADH-dependent enzyme MsuE from Pseudomonas aeruginosa, which serves as a partner to an FMNH2-dependent alkanesulfonate monooxygenase. The NADP-dependent enzyme from E. coli is outside the scope of this model.	174
274653	TIGR03567	FMN_reduc_SsuE	FMN reductase, SsuE family. Members of this protein family use NAD(P)H to reduce FMN and regenerate FMNH2. Members include the homodimeric, NAD(P)H-dependent enzyme SsuE from Escherichia coli, which serves as a partner to an FMNH2-dependent alkanesulfonate monooxygenase. It is induced by sulfate starvation. The NADH-dependent enzyme MsuE from Pseudomonas aeruginosa is outside the scope of this model (see model TIGR03566). [Central intermediary metabolism, Sulfur metabolism]	171
274654	TIGR03568	NeuC_NnaA	UDP-N-acetyl-D-glucosamine 2-epimerase, UDP-hydrolysing. This family of enzymes catalyzes the combined epimerization and UDP-hydrolysis of UDP-N-acetylglucosamine to N-acetylmannosamine. This is in contrast to the related enzyme WecB (TIGR00236) which retains the UDP moiety. NeuC acts in concert with NeuA and NeuB to synthesize CMP-N5-acetyl-neuraminate.	364
274655	TIGR03569	NeuB_NnaB	N-acetylneuraminate synthase. This family is a subset of the pfam03102 and is believed to include only authentic NeuB N-acetylneuraminate (sialic acid) synthase enzymes. The majority of the genes identified by this model are observed adjacent to both the NeuA and NeuC genes which together effect the biosynthesis of CMP-N-acetylneuraminate from UDP-N-acetylglucosamine.	329
274656	TIGR03570	NeuD_NnaD	sugar O-acyltransferase, sialic acid O-acetyltransferase NeuD family. This family of proteins includes the characterized NeuD sialic acid O-acetyltransferase enzymes from E. coli and Streptococcus agalactiae (group B strep). These two are quite closely related to one another, so extension of this annotation to other members of the family is unsupported without additional independent evidence. The neuD gene is often observed in close proximity to the neuABC genes for the biosynthesis of CMP-N-acetylneuraminic acid (CMP-sialic acid), and NeuD sequences from these organisms were used to construct the seed for this model. Nevertheless, there are numerous instances of sequences identified by this model which are observed in a different genomic context (although almost universally in exopolysaccharide biosynthesis-related loci), as well as in genomes for which the biosynthesis of sialic acid (SA) is undemonstrated. Even in the cases where the association with SA biosynthesis is strong, it is unclear in the literature whether the biological substrate is SA itself, CMP-SA, or a polymer containing SA. Similarly, it is unclear to what extent the enzyme has a preference for acetylation at the 7, 8 or 9 positions. In the absence of evidence of association with SA, members of this family may be involved with the acetylation of differing sugar substrates, or possibly the delivery of alternative acyl groups. The closest related sequences to this family (and those used to root the phylogenetic tree constructed to create this model) are believed to be succinyltransferases involved in lysine biosynthesis. These proteins contain repeats of the bacterial transferase hexapeptide (pfam00132), although often these do not register above the trusted cutoff.	201
274657	TIGR03571	lucif_BA3436	luciferase-type oxidoreductase, BA3436 family. This family is a distinct subgroup among members of the luciferase monooxygenase domain family. The larger family contains both FMN-binding enzymes (luciferase, alkane monooxygenase) and F420-binding enzymes (methylenetetrahydromethanopterin reductase, secondary alcohol dehydrogenase, glucose-6-phosphate dehydrogenase). Although some members of the domain family bind coenzyme F420 rather than FMN, members of this family are from species that lack the genes for F420 biosynthesis. A crystal structure, but not function, is known (but unpublished) for the member from Bacillus cereus, PDB|2B81. [Unknown function, Enzymes of unknown specificity]	298
132611	TIGR03572	WbuZ	glycosyl amidation-associated protein WbuZ. This clade of sequences is highly similar to the HisF protein, but generally represents the second HisF homolog in the genome where the other is an authentic HisF observed in the context of a complete histidine biosynthesis operon. The similarity between these WbuZ sequences and true HisFs is such that often the closest match by BLAST of a WbuZ is a HisF. Only by making a multiple sequence alignment is the homology relationship among the WbuZ sequences made apparent. WbuZ genes are invariably observed in the presence of a homolog of the HisH protein (designated WbuY) and a proposed N-acetyl sugar amidotransferase designated WbuX in E. coli, IfnA in P. aeruginosa and PseA in C. jejuni. Similarly, this trio of genes is invariably found in the context of saccharide biosynthesis loci. It has been shown that the WbuYZ homologs are not essential components of the activity expressed by WbuX, leading to the proposal that these two proteins provide ammonium ions to the amidotransferase when these are in low concentration. WbuY (like HisH) is proposed to act as a glutaminase to release ammonium. In histidine biosynthesis this is also dispensable in the presence of exogenous ammonium ion. HisH and HisF form a complex such that the ammonium ion is passed directly to HisF where it is used in an amidation reaction causing a subsequent cleavage and cyclization. In the case of WbuYZ, the ammonium ion would be passed from WbuY to WbuZ. WbuZ, being non-essential and so similar to HisF that a sugar substrate is unlikely, would function instead as an ammonium channel to the WbuX protein which does the enzymatic work.	232
274658	TIGR03573	WbuX	N-acetyl sugar amidotransferase. This enzyme has been implicated in the formation of the acetamido moiety (sugar-NC(=NH)CH3) which is found on some exopolysaccharides and is positively charged at neutral pH. The reaction involves ligation of ammonia with a sugar N-acetyl group, displacing water. In E. coli (O145 strain) and Pseudomonas aeruginosa (O12 strain) this gene is known as wbuX and ifnA respectively and likely acts on sialic acid. In Campylobacter jejuni, the gene is known as pseA and acts on pseudaminic acid in the process of flagellin glycosylation. In other Pseudomonas strains and various organisms it is unclear what the identity of the sugar substrate is, and in fact, the phylogenetic tree of this family sports a considerably deep branching suggestive of possible major differences in substrate structure. Nevertheless, the family is characterized by a conserved tetracysteine motif (CxxC.....[GN]xCxxC) possibly indicative of a metal binding site, as well as an invariable contextual association with homologs of the HisH and HisF proteins known as WbuY and WbuZ, respectively. These two proteins are believed to supply the enzyme with ammonium by hydrolysis of glutamine and delivery through an ammonium conduit.	343
132613	TIGR03574	selen_PSTK	L-seryl-tRNA(Sec) kinase, archaeal. Members of this protein family are L-seryl-tRNA(Sec) kinase. This enzyme is part of a two-step pathway in Eukaryota and Archaea for performing selenocysteine biosynthesis by changing serine misacylated on selenocysteine-tRNA to selenocysteine. This enzyme performs the first step, phosphorylation of the OH group of the serine side chain. This family represents archaeal proteins with this activity. [Protein synthesis, tRNA aminoacylation]	249
188340	TIGR03575	selen_PSTK_euk	L-seryl-tRNA(Sec) kinase, eukaryotic. Members of this protein family are L-seryl-tRNA(Sec) kinase. This enzyme is part of a two-step pathway in Eukaryota and Archaea for performing selenocysteine biosynthesis by changing serine misacylated on selenocysteine-tRNA to selenocysteine. This enzyme performs the first step, phosphorylation of the OH group of the serine side chain. This family represents eukaryotic proteins with this activity.	340
213830	TIGR03576	pyridox_MJ0158	pyridoxal phosphate enzyme, MJ0158 family. Members of this archaeal protein family are pyridoxal phosphate enzymes of unknown function. Sequence similarity to SelA, a bacterial enzyme of selenocysteine biosynthesis, has led to some members being misannotated as functionally equivalent, but selenocysteine is made on tRNA in Archaea by a two-step process that does not involve a SelA homolog. [Unknown function, Enzymes of unknown specificity]	346
132616	TIGR03577	EF_0830	conserved hypothetical protein EF_0830/AHA_3911. Members of this family of small (about 120 amino acid), relatively rare proteins are found in both Gram-positive (e.g. Enterococcus faecalis) and Gram-negative (e.g. Aeromonas hydrophila) bacteria, as part of a cluster of conserved proteins. The function is unknown. [Hypothetical proteins, Conserved]	115
213831	TIGR03578	EF_0831	conserved hypothetical protein EF_0831/AHA_3912. Members of this family of small (about 100 amino acid), relatively rare proteins are found in both Gram-positive (e.g. Enterococcus faecalis) and Gram-negative (e.g. Aeromonas hydrophila) bacteria, as part of a cluster of conserved proteins. The function is unknown. [Hypothetical proteins, Conserved]	96
132618	TIGR03579	EF_0833	conserved hypothetical protein EF_0833/AHA_3914. Members of this family of relatively rare proteins are found in both Gram-positive (e.g. Enterococcus faecalis) and Gram-negative (e.g. Aeromonas hydrophila) bacteria, as part of a cluster of conserved proteins. The function is unknown.	209
274659	TIGR03580	EF_0832	conserved hypothetical protein EF_0832/AHA_3913. Members of this family of relatively rare proteins are found in both Gram-positive (e.g. Enterococcus faecalis) and Gram-negative (e.g. Aeromonas hydrophila) bacteria, as part of a cluster of conserved proteins. The function is unknown. [Unknown function, General]	246
188342	TIGR03581	EF_0839	conserved hypothetical protein EF_0839/AHA_3917. Members of this family of relatively uncommon proteins are found in both Gram-positive (e.g. Enterococcus faecalis) and Gram-negative (e.g. Aeromonas hydrophila) bacteria, as part of a cluster of conserved proteins. The function is unknown. [Hypothetical proteins, Conserved]	227
132621	TIGR03582	EF_0829	PRD domain protein EF_0829/AHA_3910. Members of this family of relatively uncommon proteins are found in both Gram-positive (e.g. Enterococcus faecalis) and Gram-negative (e.g. Aeromonas hydrophila) bacteria, as part of a cluster of conserved proteins. This protein contains a PRD domain (see pfam00874). The function is unknown. [Unknown function, General]	107
132622	TIGR03583	EF_0837	probable amidohydrolase EF_0837/AHA_3915. Members of this family of relatively uncommon proteins are found in both Gram-positive (e.g. Enterococcus faecalis) and Gram-negative (e.g. Aeromonas hydrophila) bacteria, as part of a cluster of conserved proteins. These proteins resemble aminohydrolases (see pfam01979), including dihydroorotases. The function is unknown. [Hypothetical proteins, Conserved]	365
274660	TIGR03584	PseF	pseudaminic acid cytidylyltransferase. The sequences in this family include the pfam02348 (cytidyltransferase) domain and are homologous to the NeuA protein responsible for the transfer of CMP to neuraminic acid. According to, this gene is responsible for the transfer of CMP to the structurally related sugar, pseudaminic acid which is observed as a component of sugar modifications of flagellin in Campylobacter species. This gene is commonly observed in apparent operons with other genes responsible for the biosynthesis of pseudaminic acid and as a component of flagellar and exopolysaccharide biosynthesis loci.	222
274661	TIGR03585	PseH	UDP-4-amino-4,6-dideoxy-N-acetyl-beta-L-altrosamine N-acetyltransferase. Sequences in this family are members of the pfam00583 (GNAT) superfamily of acetyltransferases and are proposed to perform an N-acetylation step in the process of pseudaminic acid biosynthesis in Campylobacter species. This gene is commonly observed in apparent operons with other genes responsible for the biosynthesis of pseudaminic acid and as a component of flagellar and exopolysaccharide biosynthesis loci. Significantly, many genomes containing other components of this pathway lack this gene, indicating that some other N-acetyl transferases may be involved and/or the step is optional, resulting in a non-acetylated pseudaminic acid variant sugar.	152
163337	TIGR03586	PseI	pseudaminic acid synthase. Members of this family are included within the larger pfam03102 (NeuB) family. NeuB itself (TIGR03569) is involved in the biosynthesis of neuraminic acid by the condensation of phosphoenolpyruvate (PEP) with N-Acetyl-D-Mannosamine. In an analogous reaction, this enzyme, PseI, condenses PEP with 6-deoxy-beta-L-AltNAc4NAc to generate pseudaminic acid.	327
132626	TIGR03587	Pse_Me-ase	pseudaminic acid biosynthesis-associated methylase. Members of this small clade are methyltransferases of the pfam08241 family and are observed within operons for the biosynthesis of pseudaminic acid, a component of exopolysaccharide and flagellin glycosyl modifications. Notable among these genomes is Pseudomonas fluorescens PfO-1. Possibly one of the two hydroxyl groups of pseudaminic acid, at positions 4 and 8, is converted to a methoxy group by this enzyme.	204
274662	TIGR03588	PseC	UDP-4-amino-4,6-dideoxy-N-acetyl-beta-L-altrosamine transaminase. This family of enzymes are aminotransferases of the pfam01041 family involved in the biosynthesis of pseudaminic acid. They convert UDP-4-keto-6-deoxy-N-acetylglucosamine into UDP-4-amino-4,6-dideoxy-N-acetylgalactose. Pseudaminic acid has a role in surface polysaccharide in Pseudomonas as well as in the modification of flagellin in Campylobacter and Helicobacter species.	380
132628	TIGR03589	PseB	UDP-N-acetylglucosamine 4,6-dehydratase (inverting). This enzyme catalyzes the first step in the biosynthesis of pseudaminic acid, the conversion of UDP-N-acetylglucosamine to UDP-4-keto-6-deoxy-N-acetylglucosamine. These sequences are members of the broader pfam01073 (3-beta hydroxysteroid dehydrogenase/isomerase family) family.	324
274663	TIGR03590	PseG	UDP-2,4-diacetamido-2,4,6-trideoxy-beta-L-altropyranose hydrolase. This protein is found in association with enzymes involved in the biosynthesis of pseudaminic acid, a component of polysaccharide in certain Pseudomonas strains as well as a modification of flagellin in Campylobacter and Helicobacter. The role of this protein is unclear, although it may participate in N-acetylation in conjunction with, or in the absence of PseH (TIGR03585) as it often scores above the trusted cutoff to pfam00583 representing a family of acetyltransferases.	279
274664	TIGR03591	polynuc_phos	polyribonucleotide nucleotidyltransferase. Members of this protein family are polyribonucleotide nucleotidyltransferase, also called polynucleotide phosphorylase. Some members have been shown also to have additional functions as guanosine pentaphosphate synthetase and as poly(A) polymerase (see model TIGR02696 for an exception clade, within this family). [Transcription, Degradation of RNA]	688
274665	TIGR03592	yidC_oxa1_cterm	membrane protein insertase, YidC/Oxa1 family, C-terminal domain. This model describes full-length from some species, and the C-terminal region only from other species, of the YidC/Oxa1 family of proteins. This domain appears to be universal among bacteria (although absent from Archaea). The well-characterized YidC protein from Escherichia coli and its close homologs contain a large N-terminal periplasmic domain in addition to the region modeled here. [Protein fate, Protein and peptide secretion and trafficking]	179
274666	TIGR03593	yidC_nterm	membrane protein insertase, YidC/Oxa1 family, N-terminal domain. Essentially all bacteria have a member of the YidC family, whose C-terminal domain is modeled by TIGR03592. The two copies found in endospore-forming bacteria such as Bacillus subtilis appear redundant during vegetative growth, although the member designated spoIIIJ (stage III sporulation protein J) has a distinct role in spore formation. YidC, its mitochondrial homolog Oxa1, and its chloroplast homolog direct insertion into the bacterial/organellar inner (or only) membrane. This model describes an N-terminal sequence region, including a large periplasmic domain lacking in YidC members from Gram-positive species. The multifunctional YidC protein acts both with and independently of the Sec system. [Protein fate, Protein and peptide secretion and trafficking]	366
274667	TIGR03594	GTPase_EngA	ribosome-associated GTPase EngA. EngA (YfgK, Der) is a ribosome-associated essential GTPase with a duplication of its GTP-binding domain. It is broadly to universally distributed among bacteria. It appears to function in ribosome biogenesis or stability. [Protein synthesis, Other]	428
274668	TIGR03595	Obg_CgtA_exten	Obg family GTPase CgtA, C-terminal extension. CgtA (see model TIGR02729) is a broadly conserved member of the obg family of GTPases associated with ribosome maturation. This model represents a unique C-terminal domain found in some but not all sequences of CgtA. This region is preceded, and may be followed, by a region of low-complexity sequence.	69
274669	TIGR03596	GTPase_YlqF	ribosome biogenesis GTP-binding protein YlqF. Members of this protein family are GTP-binding proteins involved in ribosome biogenesis, including the essential YlqF protein of Bacillus subtilis. They are related to Era, EngA, and other GTPases of ribosome biogenesis, but are circularly permuted. This family is not universal, and is not present in Escherichia coli, and so is not as well studied as some other GTPases. This model is built for bacterial members. [Protein synthesis, Other]	276
213834	TIGR03597	GTPase_YqeH	ribosome biogenesis GTPase YqeH. This family describes YqeH, a member of a larger family of GTPases involved in ribosome biogenesis. Like YlqF, it shows a cyclical permutation relative to GTPases EngA (in which the GTPase domain is duplicated), Era, and others. Members of this protein family are found in a relatively small number of bacterial species, including Bacillus subtilis but not Escherichia coli. [Protein synthesis, Other]	360
274670	TIGR03598	GTPase_YsxC	ribosome biogenesis GTP-binding protein YsxC/EngB. Members of this protein family are a GTPase associated with ribosome biogenesis, typified by YsxC from Bacillus subtilis. The family is widely but not universally distributed among bacteria. Members commonly are called EngB based on homology to EngA, one of several other GTPases of ribosome biogenesis. Cutoffs as set find essentially all bacterial members, but also identify large numbers of eukaryotic (probably organellar) sequences. This protein is found in about 80 percent of bacterial genomes. [Protein synthesis, Other]	179
274671	TIGR03599	YloV	DAK2 domain fusion protein YloV. This model describes a protein family that contains an N-terminal DAK2 domain (pfam02734), so named because of similarity to the dihydroxyacetone kinase family. The GTP-binding protein CgtA (a member of the obg family) is a bacterial GTPase associated with ribosome biogenesis, and it has a characteristic extension (TIGR03595) in certain lineages. The protein family described here was found, by the method of partial phylogenetic profiling, to have a phylogenetic distribution strongly correlated to that of TIGR03595. This correlation implies some form of functional coupling.	530
274672	TIGR03600	phage_DnaB	phage replicative helicase, DnaB family, HK022 subfamily. Members of this family are phage (or prophage-region) homologs of the bacterial homohexameric replicative helicase DnaB. Some phage may rely on host DnaB, while others encode their own versions. This model describes the largest phage-specific clade among the close homologs of DnaB, but there are, of course, other DnaB homologs from phage that fall outside the scope of this model. [Mobile and extrachromosomal element functions, Prophage functions]	420
274673	TIGR03601	B_an_ocin	bacteriocin, heterocycloanthracin/sonorensin family. Numerous bacteria encode systems for producing bacteriocins by extensive modification of ribosomally produced precursors. Members of the TOMM class (thiazole/oxazole-modified microcins) are recognizable by association with cyclodehydratase (and often dehydrogenase) maturation proteins. This family consists of a special subclass, the heterocycloanthracin family, that share a homologous leader peptide region and then a repeat region with Cys as every third residue. In Bacillus anthracis and Bacillus cereus, the RiPP (ribosomally translated and post-translationally modified natural product) precursor is encoded far from its maturase genes, and every strain has the system. In other species (e.g. B. licheniformis, B. sonorensis), precursor and maturase genes are close together. Sonorensin, from B. sonorensis MT93, was shown to have broad spectrum antimicrobial activity, affecting Gram-positive and Gram-negative bacteria. [Cellular processes, Toxin production and resistance]	88
132641	TIGR03602	streptolysinS	bacteriocin protoxin, streptolysin S family. Members of this family are bacteriocin precursors. These small, ribosomally produced polypeptide precursors are extensively processed post-translationally. This family belongs to a class of heterocycle-containing bacteriocins, including streptolysin S from Streptococcus pyogenes, and related bacteriocins from Streptococcus iniae and Clostridium botulinum. Streptolysin S is hemolytic. Bacteriocin genes in general are small and highly diverse, with odd sequence composition, and are easily missed by many gene-finding programs. [Cellular processes, Toxin production and resistance]	56
200298	TIGR03603	cyclo_dehy_ocin	thiazole/oxazole-forming peptide maturase, SagC family component. Members of this protein family include enzymes related to SagC, a protein involved in thiazole/oxazole cyclodehydration modifications during biosynthesis of streptolysin S in Streptococcus pyogenes from the protoxin polypeptide (product of the sagA gene). Recent evidence suggests that the YcaO/SagD-like component, not this component, performs an ATP-dependent cyclodehydration. This protein family serves as a marker for widely distributed prokaryotic systems for making a general class of heterocycle-containing bacteriocins. Note that this model does not find all possible examples of bacteriocin biosynthesis cyclodehydratases, and in particular misses the E. coli plasmid protein McbB of microcin B17 biosynthesis. [Cellular processes, Pathogenesis]	319
274674	TIGR03604	TOMM_cyclo_SagD	thiazole/oxazole-forming peptide maturase, SagD family component. Members of this protein family include enzymes related to SagD, previously referred to as a scaffold or docking protein involved in the biosynthesis of streptolysin S in Streptococcus pyogenes from the protoxin polypeptide (product of the sagA gene). Newer evidence describes an enzymatic activity, an ATP-dependent cyclodehydration reaction, previously ascribed to the SagC component. This protein family serves as a marker for widely distributed prokaryotic systems for making a general class of heterocycle-containing bacteriocins.	377
188352	TIGR03605	antibiot_sagB	SagB-type dehydrogenase domain. SagB of Streptococcus pyogenes participates in the maturation of streptolysin S from a ribosomally produced precursor polypeptide. Chemically similar systems operate on highly diverse sets of bacteriocin precursors in numerous other bacteria. This model describes a domain within SagB and homologous regions from other proteins, many of which appear to be involved in biosynthesis of secondary metabolites. While some substrates may be intermediates in non-ribosomal peptide syntheses, others are involved in heterocycle-containing bacteriocin biosynthesis, and can be found near SagC-like (see TIGR03603, cyclodehydratase) and SagD-like (see TIGR03604, "docking") proteins. Members of this domain family are heterogeneous in length, as many have a partial second copy of the domain represented here. The incomplete second domain scores below the cutoffs to this model in most cases.	173
274675	TIGR03606	non_repeat_PQQ	dehydrogenase, PQQ-dependent, s-GDH family. PQQ, or pyrroloquinoline-quinone, serves as a cofactor for a number of sugar and alcohol dehydrogenases in a limited number of bacterial species. Most characterized PQQ-dependent enzymes have multiple repeats of a sequence region described by pfam01011 (PQQ enzyme repeat), but this protein family is unusual in lacking that repeat. Below the noise cutoff are related proteins mostly from species that lack PQQ biosynthesis.	454
274676	TIGR03607	TIGR03607	patatin-related protein. This bacterial protein family contains an N-terminal patatin domain, where patatins are plant storage proteins capable of phospholipase activity (see pfam01734). Regions of strong sequence conservation are separated by regions of significant sequence and length variability. Members of the family are distributed sporadically among bacteria. The function is unknown. [Unknown function, General]	738
188353	TIGR03608	L_ocin_972_ABC	putative bacteriocin export ABC transporter, lactococcin 972 group. A gene pair with a fairly wide distribution consists of a polypeptide related to the lactococcin 972 (see TIGR01653) and multiple-membrane-spanning putative immunity protein (see TIGR01654). This model represents a small clade within the ABC transporters that regularly are found adjacent to these bacteriocin system gene pairs and are likely to serve as export proteins. [Cellular processes, Toxin production and resistance, Transport and binding proteins, Unknown substrate]	206
132648	TIGR03609	S_layer_CsaB	polysaccharide pyruvyl transferase CsaB. The CsaB protein (cell surface anchoring B) of Bacillus anthracis adds a pyruvoyl group to peptidoglycan-associated polysaccharide. This addition is required for proteins with an S-layer homology domain (pfam00395) to bind. Within the larger group of proteins described by pfam04230, this model represents a distinct clade that nearly exactly follows the phylogenetic distribution of the S-layer homology domain (pfam00395). [Cell envelope, Surface structures, Protein fate, Protein and peptide secretion and trafficking]	298
274677	TIGR03610	RutC	pyrimidine utilization protein C. This protein is observed in operons extremely similar to that characterized in E. coli K-12 responsible for the import and catabolism of pyrimidines, primarily uracil. This protein is a member of the endoribonuclease L-PSP family defined by pfam01042.	127
211851	TIGR03611	RutD	pyrimidine utilization protein D. This protein is observed in operons extremely similar to that characterized in E. coli K-12 responsible for the import and catabolism of pyrimidines, primarily uracil. This protein is a member of the hydrolase, alpha/beta fold family defined by pfam00067.	248
163355	TIGR03612	RutA	pyrimidine utilization protein A. This protein is observed in operons extremely similar to that characterized in E. coli K-12 responsible for the import and catabolism of pyrimidines, primarily uracil. This protein is a member of the luciferase family defined by pfam00296 and is likely an FMN-dependent monooxygenase. [Unknown function, Enzymes of unknown specificity]	355
274678	TIGR03613	RutR	pyrimidine utilization regulatory protein R. This protein is observed in operons extremely similar to that characterized in E. coli K-12 responsible for the import and catabolism of pyrimidines, primarily uracil. This protein is a member of the TetR family of transcriptional regulators defined by the N-terminal model pfam00440 and the C-terminal model pfam08362 (YcdC-like protein, C-terminal region).	202
163356	TIGR03614	RutB	pyrimidine utilization protein B. 	226
132654	TIGR03615	RutF	pyrimidine utilization flavin reductase protein F. This protein is observed in operons extremely similar to that characterized in E. coli K-12 responsible for the import and catabolism of pyrimidines, primarily uracil. This protein is a member of the flavin reductase family defined by pfam01613. Presumably, this protein recycles the flavin of the RutA luciferase-like oxidoreductase.	156
132655	TIGR03616	RutG	pyrimidine utilization transport protein G. This protein is observed in operons extremely similar to that characterized in E. coli K-12 responsible for the import and catabolism of pyrimidines, primarily uracil. This protein is a member of the uracil-xanthine permease family defined by TIGR00801, as well as of the Nucleobase:Cation Symporter-2 (NCS2) Family (TC 2.A.40).	429
132656	TIGR03617	F420_MSMEG_2256	probable F420-dependent oxidoreductase, MSMEG_2256 family. Coenzyme F420 has a limited phylogenetic distribution, including methanogenic archaea, Mycobacterium tuberculosis and related species, Colwellia psychrerythraea 34H, Rhodopseudomonas palustris HaA2, and others. Partial phylogenetic profiling identifies protein subfamilies, within the larger family called luciferase-like monooxygenases (pfam00296), that appear only in F420-positive genomes and are likely to be F420-dependent. This model describes one such subfamily, exemplified by MSMEG_2256 from Mycobacterium smegmatis. [Unknown function, Enzymes of unknown specificity]	318
274679	TIGR03618	Rv1155_F420	PPOX class probable F420-dependent enzyme. A Genome Properties metabolic reconstruction for F420 biosynthesis shows that slightly over 10 percent of all prokaryotes with fully sequenced genomes, including about two thirds of the Actinomyces, make F420. The Partial Phylogenetic Profiling algorithm identifies members of this protein family as high-scoring proteins to the F420 biosynthesis profile. A member of this family, Rv1155, was crystallized after expression in Escherichia coli, which does not synthesize F420; the crystal structure was shown to resemble FMN-binding proteins, but with a recognizable empty cleft corresponding to, yet differing profoundly from, the FMN site of pyridoxine 5'-phosphate oxidase. We propose that this protein family consists of F420-binding enzymes. [Unknown function, Enzymes of unknown specificity]	126
274680	TIGR03619	F420_Rv2161c	probable F420-dependent oxidoreductase, Rv2161c family. Coenzyme F420 has a limited phylogenetic distribution, including methanogenic archaea, Mycobacterium tuberculosis and related species, Colwellia psychrerythraea 34H, Rhodopseudomonas palustris HaA2, and others. Partial phylogenetic profiling identifies protein subfamilies, within the larger family called luciferase-like monooxygenases (pfam00296), that appear only in F420-positive genomes and are likely to be F420-dependent. This model describes a domain found in a distinctive subset of bacterial luciferase homologs, found only in F420-biosynthesizing members of the Actinobacteria. [Unknown function, Enzymes of unknown specificity]	246
274681	TIGR03620	F420_MSMEG_4141	probable F420-dependent oxidoreductase, MSMEG_4141 family. Members of this protein family, related to F420-dependent oxidoreductases within the larger family of bacterial luciferase (an FMN-dependent enzyme), occur only within the small subset of species that synthesize F420. Most such proteins are from members of the Actinobacteria, but at least one species, Sphingomonas wittichii, belongs to the Alphaproteobacteria. [Unknown function, Enzymes of unknown specificity]	278
200301	TIGR03621	F420_MSMEG_2516	probable F420-dependent oxidoreductase, MSMEG_2516 family. Coenzyme F420 is produced by methanogenic archaea, a number of the Actinomycetes (including Mycobacterium tuberculosis), and rare members of other lineages. The resulting information-rich phylogenetic profile identifies candidate F420-dependent oxidoreductases within the family of luciferase-like enzymes (pfam00296), where the species range for the subfamily encompasses many F420-positive genomes without straying beyond. This family is uncharacterized, and named for member MSMEG_2516 from Mycobacterium smegmatis. [Unknown function, Enzymes of unknown specificity]	295
132661	TIGR03622	urea_t_UrtB_arc	urea ABC transporter, permease protein UrtB. Members of this protein family are ABC transporter permease subunits restricted to the Archaea. Several lines of evidence suggest this protein is functionally analogous, as well as homologous, to the UrtB subunit of the Corynebacterium glutamicum urea transporter. All members of the operon show sequence similarity to urea transport subunits, the gene is located near the urease structural subunits in two of three species, and partial phylogenetic profiling identifies this permease subunit as closely matching the profile of urea utilization.	283
274682	TIGR03623	TIGR03623	probable DNA repair protein. Members of this protein family are bacterial proteins of about 900 amino acids in length. Members show extended homology to proteins in TIGR02786, the AddB protein of double-strand break repair via homologous recombination. Members of this family, therefore, may be DNA repair proteins.	874
274683	TIGR03624	TIGR03624	putative hydrolase. Members of this protein family have a phylogenetic distribution skewed toward the Actinobacteria (high GC Gram-positive bacteria), but with a few members occurring in the Archaea and Chloroflexi. The function is unknown. [Unknown function, General]	346
274684	TIGR03625	L3_bact	50S ribosomal protein uL3, bacterial form. This model describes bacterial (and mitochondrial and chloroplast) class of ribosomal protein L3. A separate model describes the archaeal form, where both belong to pfam00297. The name is phrased to meet the needs of bacterial genome annotation. Organellar forms typically will have transit peptides, N-terminal to the region modeled here.	202
274685	TIGR03626	L3_arch	ribosomal protein uL3, archaeal form. This model describes exclusively the archaeal class of ribosomal protein L3. A separate model (TIGR03625) describes the bacterial/organelle form, and both belong to pfam00297. Eukaryotic proteins are excluded from this model. [Protein synthesis, Ribosomal proteins: synthesis and modification]	330
132666	TIGR03627	uS9_arch	ribosomal protein uS9, archaeal form. This model describes exclusively the archaeal ribosomal protein S9P. Homologous eukaryotic and bacterial ribosomal proteins are excluded from this model. [Protein synthesis, Ribosomal proteins: synthesis and modification]	130
274686	TIGR03628	arch_S11P	ribosomal protein uS11P, archaeal form. This model describes exclusively the archaeal ribosomal protein S11P. It excludes homologous ribosomal proteins S14 from eukaryotes and S11 from bacteria. [Protein synthesis, Ribosomal proteins: synthesis and modification]	117
213839	TIGR03629	uS13_arch	ribosomal protein uS13, archaeal form. This model describes exclusively the archaeal ribosomal protein S13P. It excludes the homologous eukaryotic 40S ribosomal protein S18 and bacterial 30S ribosomal protein S13. [Protein synthesis, Ribosomal proteins: synthesis and modification]	144
274687	TIGR03630	uS17_arch	ribosomal protein uS17, archaeal form. This model describes exclusively the archaeal ribosomal protein S17P. It excludes the homologous ribosomal protein S17 from bacteria, and is not intended for use on eukaryotic sequences, where some instances of ribosomal proteins S11 score above the trusted cutoff. [Protein synthesis, Ribosomal proteins: synthesis and modification]	102
274688	TIGR03631	uS13_bact	ribosomal protein uS13, bacterial form. This model describes bacterial ribosomal protein S13, to the exclusion of the homologous archaeal S13P and eukaryotic ribosomal protein S18. This model identifies some (but not all) instances of chloroplast and mitochondrial S13, which is of bacterial type. [Protein synthesis, Ribosomal proteins: synthesis and modification]	113
274689	TIGR03632	uS11_bact	ribosomal protein uS11, bacterial form. This model describes the bacterial 30S ribosomal protein S11. Cutoffs are set such that the model excludes archaeal and eukaryotic ribosomal proteins, but many chloroplast and mitochondrial equivalents of S11 are detected. [Protein synthesis, Ribosomal proteins: synthesis and modification]	108
163366	TIGR03633	arc_protsome_A	proteasome endopeptidase complex, archaeal, alpha subunit. This protein family describes the archaeal proteasome alpha subunit, homologous to both the beta subunit and to the alpha and beta subunits of eukaryotic proteasome subunits. This family is universal in the first 29 complete archaeal genomes but occasionally is duplicated. [Protein fate, Degradation of proteins, peptides, and glycopeptides]	224
274690	TIGR03634	arc_protsome_B	proteasome endopeptidase complex, archaeal, beta subunit. This protein family describes the archaeal proteasome beta subunit, homologous to both the alpha subunit and to the alpha and beta subunits of eukaryotic proteasome subunits. This family is universal in the first 29 complete archaeal genomes but occasionally is duplicated. [Protein fate, Degradation of proteins, peptides, and glycopeptides]	185
274691	TIGR03635	uS17_bact	ribosomal protein uS17, bacterial form. This model describes the bacterial ribosomal small subunit protein S17, while excluding cytosolic eukaryotic homologs and archaeal homologs. The model finds many, but not all, chloroplast and mitochondrial counterparts to bacterial S17. [Protein synthesis, Ribosomal proteins: synthesis and modification]	72
274692	TIGR03636	uL23_arch	ribosomal protein uL23, archaeal form. This model describes the archaeal ribosomal protein L23P and rigorously excludes the bacterial counterpart L23. In order to capture every known instance of archaeal L23P, the trusted cutoff is set lower than a few of the highest scoring eukaryotic cytosolic ribosomal counterparts. [Protein synthesis, Ribosomal proteins: synthesis and modification]	77
132676	TIGR03637	cas1_YPEST	CRISPR-associated endonuclease Cas1, subtype I-F/YPEST. The CRISPR-associated protein Cas1 is virtually universal to CRISPR systems. CRISPR, an acronym for Clustered Regularly Interspaced Short Palindromic Repeats, is a prokaryotic immunity system for foreign DNA, mostly from phage. CRISPR systems belong to different subtypes, distinguished by both nature of the repeats, the makeup of the cohort of associated Cas proteins, and by molecular phylogeny within the more universal Cas proteins such as this one. This model is of type EXCEPTION and provides more specific information than the EQUIVALOG model TIGR00287. It describes the Cas1 protein particular to the YPEST subtype of CRISPR/Cas system.	307
274693	TIGR03638	cas1_ECOLI	CRISPR-associated endonuclease Cas1, subtype I-E/ECOLI. The CRISPR-associated protein Cas1 is virtually universal to CRISPR systems. CRISPR, an acronym for Clustered Regularly Interspaced Short Palindromic Repeats, is a prokaryotic immunity system for foreign DNA, mostly from phage. CRISPR systems belong to different subtypes, distinguished by both nature of the repeats, the makeup of the cohort of associated Cas proteins, and by molecular phylogeny within the more universal Cas proteins such as this one. This model is of type EXCEPTION and provides more specific information than the EQUIVALOG model TIGR00287. It describes the Cas1 protein particular to the ECOLI subtype of CRISPR/Cas system.	268
274694	TIGR03639	cas1_NMENI	CRISPR-associated endonuclease Cas1, subtype II/NMENI. The CRISPR-associated protein Cas1 is virtually universal to CRISPR systems. CRISPR, an acronym for Clustered Regularly Interspaced Short Palindromic Repeats, is a prokaryotic immunity system for foreign DNA, mostly from phage. CRISPR systems belong to different subtypes, distinguished by both nature of the repeats, the makeup of the cohort of associated Cas proteins, and by molecular phylogeny within the more universal Cas proteins such as this one. This model is of type EXCEPTION and provides more specific information than the EQUIVALOG model TIGR00287. It describes the Cas1 variant of the NMENI subtype of CRISPR/Cas system.	278
188360	TIGR03640	cas1_DVULG	CRISPR-associated endonuclease Cas1, subtype I-C/DVULG. The CRISPR-associated protein Cas1 is virtually universal to CRISPR systems. CRISPR, an acronym for Clustered Regularly Interspaced Short Palindromic Repeats, is a prokaryotic immunity system for foreign DNA, mostly from phage. CRISPR systems belong to different subtypes, distinguished by both nature of the repeats, the makeup of the cohort of associated Cas proteins, and by molecular phylogeny within the more universal Cas proteins such as this one. This model is of type EXCEPTION and provides more specific information than the EQUIVALOG model TIGR00287. It describes the Cas1 protein particular to the DVULG subtype of CRISPR/Cas system.	340
274695	TIGR03641	cas1_HMARI	CRISPR-associated endonuclease Cas1, subtype I-B/HMARI/TNEAP. The CRISPR-associated protein Cas1 is virtually universal to CRISPR systems. CRISPR, an acronym for Clustered Regularly Interspaced Short Palindromic Repeats, is a prokaryotic immunity system for foreign DNA, mostly from phage. CRISPR systems belong to different subtypes, distinguished by both nature of the repeats, the makeup of the cohort of associated Cas proteins, and by molecular phylogeny within the more universal Cas proteins such as this one. This model is of type EXCEPTION and provides more specific information than the EQUIVALOG model TIGR00287. It describes the Cas1 subgroup that includes Cas1 proteins of the related HMARI and TNEAP subtypes of CRISPR/Cas system.	320
274696	TIGR03642	cas_csx14	CRISPR-associated protein, Csx14 family. This model describes an N-terminal protein sequence domain strictly associated with CRISPR and CRISPR-associated protein systems. This model and TIGR02584 identify two separate clades from a larger homology domain family, both CRISPR-associated, while other homologs are found that may not be. Members are found in bacteria that include Pelotomaculum thermopropionicum SI, Thermoanaerobacter tengcongensis MB4, and Roseiflexus sp. RS-1, and in archaea that include Thermoplasma volcanium, Picrophilus torridus, and Methanospirillum hungatei. The molecular function is unknown.	124
132682	TIGR03643	TIGR03643	TIGR03643 family protein. This model describes an uncharacterized bacterial protein family. Members average about 90 amino acids in length with several well-conserved uncommon amino acids (Trp, Met). The majority of species are marine bacteria. Few species have more than one copy, but Vibrio cholerae El Tor N16961 has three identical copies. [Hypothetical proteins, Conserved]	72
274697	TIGR03644	marine_trans_1	probable ammonium transporter, marine subtype. Members of this protein family are a well-conserved subclass of putative ammonium transporters, belonging to the much broader set of ammonium/methylammonium transporters described by TIGR00836. Species with this transporter tend to be marine bacteria. Partial phylogenetic profiling (PPP) picks a member of this protein family as the single best-scoring protein vs. a reference profile for the marine environment Genome Property for a large number of different query genomes. This finding by PPP suggests that this transporter family represents an important adaptation to the marine environment.	404
132684	TIGR03645	glyox_marine	lactoylglutathione lyase family protein. Members of this protein family share homology with lactoylglutathione lyase (glyoxalase I) and are found mainly in marine members of the gammaproteobacteria, including CPS_0532 from Colwellia psychrerythraea 34H. This family excludes a well-separated, more narrowly distributed paralogous family, exemplified by CPS_3492 from C. psychrerythraea. The function of this protein family is unknown.	162
132685	TIGR03646	YtoQ_fam	YtoQ family protein. Members of this family are uncharacterized proteins, including YtoQ from Bacillus subtilis. This family shows some sequence similarity to a family of nucleoside 2-deoxyribosyltransferases (COG3613 as iterated through CDD), but sufficiently remote that PSI-BLAST starting from YtoQ and exploring outwards does not discover the relationship.	144
132686	TIGR03647	Na_symport_sm	putative solute:sodium symporter small subunit. Members of this family are highly hydrophobic bacterial proteins of about 90 amino acids in length. Members usually are found immediately upstream (sometimes fused to) a member of the solute:sodium symporter family, and therefore are a putative sodium:solute symporter small subunit. Members tend to be found in aquatic species, especially those from marine or other high salt environments. [Transport and binding proteins, Unknown substrate]	77
274698	TIGR03648	Na_symport_lg	probable sodium:solute symporter, VC_2705 subfamily. This family belongs to a larger family of transporters of the sodium:solute symporter superfamily, TC 2.A.21. Members of this strictly bacterial protein subfamily are found almost invariably immediately downstream from a member of family TIGR03647. Occasionally, the two genes are fused.	552
274699	TIGR03649	ergot_EASG	ergot alkaloid biosynthesis protein, AFUA_2G17970 family. This family consists of fungal proteins of unknown function associated with secondary metabolite biosynthesis, such as that of ergot alkaloids like ergovaline. Nomenclature differs because gene order differs - this is EasG in Neotyphodium lolii but is designated ergot alkaloid biosynthetic protein A in several other fungi.	285
132689	TIGR03650	violacein_E	violacein biosynthesis enzyme VioE. This enzyme catalyzes the third step in violacein biosynthesis from a pair of Trp residues, as in Chromobacterium violaceum, but the first step that distinguishes that pathway from staurosporine (an indolocarbazole antibiotic) biosynthesis. [Cellular processes, Toxin production and resistance]	184
274700	TIGR03651	circ_ocin_uber	circular bacteriocin, circularin A/uberolysin family. Circular bacteriocins are antibiotic proteins made by ribosomal translation of a precursor molecule, followed by cleavage and circularization. Members of this subclass of the circular bacteriocins include circularin A from Clostridium beijerinckii, bacteriocin AS-48 from Enterococcus faecalis, uberolysin from Streptococcus uberis, and carnocyclin A from Carnobacterium maltaromaticum. The mature circularized peptides average about 70 amino acids in size. [Cellular processes, Toxin production and resistance]	73
274701	TIGR03652	FeS_repair_RIC	iron-sulfur cluster repair di-iron protein. Members of this protein family, designated variously as YftE, NorA, DrnN, and NipC, are di-iron proteins involved in the repair of iron-sulfur clusters. Previously assigned names reflect pleiotropic effects of damage from NO or other oxidative stress when this protein is mutated. The suggested name now is RIC, for Repair of Iron Centers. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	216
274702	TIGR03653	uL6_arch	ribosomal protein uL6, archaeal form. Members of this protein family are the archaeal form of ribosomal protein uL6 (previously L9 in yeast and human). The top-scoring proteins not selected by this model are eukaryotic cytosolic uL6. Bacterial ribosomal protein L6 scores lower and is described by a distinct model. [Protein synthesis, Ribosomal proteins: synthesis and modification]	170
274703	TIGR03654	L6_bact	ribosomal protein L6, bacterial type. [Protein synthesis, Ribosomal proteins: synthesis and modification]	175
274704	TIGR03655	anti_R_Lar	restriction alleviation protein, Lar family. Restriction alleviation proteins provide a countermeasure to host cell restriction enzyme defense against foreign DNA such as phage or plasmids. This family consists of homologs to the phage antirestriction protein Lar, and most members belong to phage genomes or prophage regions of bacterial genomes. [Mobile and extrachromosomal element functions, Prophage functions, DNA metabolism, Restriction/modification]	53
274705	TIGR03656	IsdC	heme uptake protein IsdC. Isd proteins are iron-regulated surface proteins found in Bacillus, Staphylococcus and Listeria species and are responsible for heme scavenging from hemoproteins. The IsdC protein consists of an N-terminal hydrophobic signal sequence, a central NEAT (NEAr Transporter, pfam05031) domain, which confers the ability to bind heme, and a C-terminal SrtB processing signal which targets the protein to the cell wall. IsdC is believed to make a direct contact with, and transfer heme to, the heme-binding component (IsdE) of an ABC transporter in the cytoplasmic membrane, and to receive heme from other NEAT-containing heme-binding proteins also localized in the cell wall.	217
213844	TIGR03657	IsdB	heme uptake protein IsdB. Isd proteins are iron-regulated surface proteins found in Bacillus, Staphylococcus and Listeria species and are responsible for heme scavenging from hemoproteins. The IsdB protein is only observed in Staphylococcus and consists of an N-terminal hydrophobic signal sequence, a pair of tandem NEAT (NEAr Transporter, pfam05031) domains, which confers the ability to bind heme, and a C-terminal sortase processing signal which targets the protein to the cell wall. IsdB is believed to make a direct contact with methemoglobin facilitating transfer of heme to IsdB. The heme is then transferred to other cell wall-bound NEAT domain proteins such as IsdA and IsdC.	644
132697	TIGR03658	IsdH_HarA	haptoglobin-binding heme uptake protein HarA. HarA is a heme-binding NEAT-domain (NEAr Transporter, pfam05031) protein which has been shown to bind to the haptoglobin-hemoglobin complex in order to extract heme from it. HarA has also been reported to bind hemoglobin directly. HarA (also known as IsdH) contains three NEAT domains as well as a sortase A C-terminal signal for localization to the cell wall. The heme bound at the third of these NEAT domains has been shown to be transferred to the IsdA protein also localized at the cell wall, presumably through an additional specific protein-protein interaction. Haptoglobin is a hemoglobin carrier protein involved in scavenging hemoglobin in the blood following red blood cell lysis and targeting it to the liver.	895
274706	TIGR03659	IsdE	heme ABC transporter, heme-binding protein isdE. This family of ABC substrate-binding proteins is observed primarily in close proximity with proteins localized to the cell wall and bearing the NEAT (NEAr Transporter, pfam05031) heme-binding domain. IsdE has been shown to bind heme and is involved in the process of scavenging heme for the purpose of obtaining iron.	289
132699	TIGR03660	T1SS_rpt_143	T1SS-143 repeat domain. This model represents a domain of about 143 amino acids that may occur singly or in up to 23 tandem repeats in very large proteins in the genus Vibrio, and in related species such as Legionella pneumophila, Photobacterium profundum, Rhodopseudomonas palustris, Shewanella pealeana, and Aeromonas hydrophila. Proteins with these domains represent a subset of a broader set of proteins with a particular signal for type 1 secretion, consisting of several glycine-rich repeats modeled by pfam00353, followed by a C-terminal domain modeled by TIGR03661. Proteins with this domain tend to share several properties with the RtxA (Repeats in Toxin) protein of Vibrio cholerae, including a large size often containing tandemly repeated domains and a C-terminal signal for type 1 secretion. [Cellular processes, Pathogenesis]	137
274707	TIGR03661	T1SS_VCA0849	type I secretion C-terminal target domain (VC_A0849 subclass). This model represents a C-terminal domain associated with secretion by type 1 secretion systems (T1SS). Members of this subclass do not include the RtxA toxin of Vibrio cholerae and its homologs, although the two classes of proteins share large size, occurrence in genomes with T1SS, regions with long tandem repeats, and regions with the glycine-rich repeat modeled by pfam00353. [Cellular processes, Pathogenesis]	88
274708	TIGR03662	Chlor_Arch_YYY	Chlor_Arch_YYY domain. Members of this highly hydrophobic probable integral membrane family belong to two classes. In one, a single copy of the region covered by this model represents essentially the full length of a strongly hydrophobic protein of about 700 to 900 residues (variable because of long inserts in some). The domain architecture of the other class consists of an additional N-terminal region, two copies of the region represented by this model, and three to four repeats of TPR, or tetratricopeptide repeat. The unusual species range includes several Archaea, several Chloroflexi, and Clostridium phytofermentans. An unusual motif YYYxG is present, and we suggest the name Chlor_Arch_YYY protein. The function is unknown.	723
274709	TIGR03663	TIGR03663	TIGR03663 family protein. Members of this protein family, uncommon and rather sporadically distributed, are found almost always in the same genomes as members of family TIGR03662, and frequently as a nearby gene. Members show some N-terminal sequence similarity with pfam02366, dolichyl-phosphate-mannose-protein mannosyltransferase. The few invariant residues in this family, found toward the N-terminus, include a dipeptide DE, a tripeptide HGP, and two different Arg residues. Up to three members may be found in a genome. The function is unknown.	439
274710	TIGR03664	fut_nucase	futalosine hydrolase. This enzyme catalyzes the conversion of futalosine to de-hypoxanthine futalosine in a pathway for the biosynthesis of menaquinone distinct from the pathway observed in E. coli.	222
274711	TIGR03665	arCOG04150	arCOG04150 universal archaeal KH domain protein. This family of proteins is universal among the 41 archaeal genomes analyzed, and is not observed outside of the archaea. The proteins contain a single KH domain (pfam00013) which is likely to confer the ability to bind RNA.	172
274712	TIGR03666	Rv2061_F420	PPOX class probable F420-dependent enzyme, Rv2061 family. A Genome Properties metabolic reconstruction for F420 biosynthesis shows that slightly over 10 percent of all prokaryotes with fully sequenced genomes, including about two thirds of the Actinomycetales, make F420. A variant of the Partial Phylogenetic Profiling algorithm, SIMBAL, shows that this protein likely binds F420 in a cleft similar to that in which the homologous enzyme pyridoxamine phosphate oxidase (PPOX) binds FMN. [Unknown function, Enzymes of unknown specificity]	132
132706	TIGR03667	Rv3369	PPOX class probable F420-dependent enzyme, Rv3369 family. A Genome Properties metabolic reconstruction for F420 biosynthesis shows that slightly over 10 percent of all prokaryotes with fully sequenced genomes, including about two thirds of the Actinomycetales, make F420. A variant of the Partial Phylogenetic Profiling algorithm, SIMBAL, shows that this protein likely binds F420 in a cleft similar to that in which the homologous enzyme pyridoxamine phosphate oxidase (PPOX) binds FMN. [Unknown function, Enzymes of unknown specificity]	130
132707	TIGR03668	Rv0121_F420	PPOX class probable F420-dependent enzyme, Rv0121 family. A Genome Properties metabolic reconstruction for F420 biosynthesis shows that slightly over 10 percent of all prokaryotes with fully sequenced genomes, including about two thirds of the Actinomycetales, make F420. A variant of the Partial Phylogenetic Profiling algorithm, SIMBAL, shows that this protein likely binds F420 in a cleft similar to that in which the homologous enzyme pyridoxamine phosphate oxidase (PPOX) binds FMN. [Unknown function, Enzymes of unknown specificity]	141
132708	TIGR03669	urea_ABC_arch	urea ABC transporter, substrate-binding protein, archaeal type. Members of this protein family are identified as the substrate-binding protein of a urea ABC transport system by similarity to a known urea transporter from Corynebacterium glutamicum, operon structure, proximity of its operons to urease (urea-utilization protein) operons, and by Partial Phylogenetic Profiling vs. urea utilization. [Transport and binding proteins, Amino acids, peptides and amines]	374
274713	TIGR03670	rpoB_arch	DNA-directed RNA polymerase subunit B. This model represents the archaeal version of DNA-directed RNA polymerase subunit B (rpoB) and is observed in all archaeal genomes.	599
274714	TIGR03671	cca_archaeal	CCA-adding enzyme. 	408
274715	TIGR03672	rpl4p_arch	50S ribosomal protein uL4, archaeal form. One of the primary rRNA binding proteins, this protein initially binds near the 5'-end of the 23S rRNA. It is important during the early stages of 50S assembly. It makes multiple contacts with different domains of the 23S rRNA in the assembled 50S subunit and ribosome.	251
274716	TIGR03673	uL14_arch	50S ribosomal protein uL14, archaeal form. Part of the 50S ribosomal subunit. Forms a cluster with proteins L3 and L24e, part of which may contact the 16S rRNA in 2 intersubunit bridges.	131
274717	TIGR03674	fen_arch	flap structure-specific endonuclease. Endonuclease that cleaves the 5'-overhanging flap structure that is generated by displacement synthesis when DNA polymerase encounters the 5'-end of a downstream Okazaki fragment. Has 5'-endo-/exonuclease and 5'-pseudo-Y-endonuclease activities. Cleaves the junction between single and double-stranded regions of flap DNA	338
274718	TIGR03675	arCOG00543	arCOG00543 universal archaeal KH-domain/beta-lactamase-domain protein. This family of proteins is universal in the archaea and consists of an N-terminal type-1 KH-domain (pfam00013) and a central beta-lactamase-domain (pfam00753) with a C-terminal motif associated with RNA metabolism (pfam07521). KH-domains are associated with RNA-binding, so taken together, this protein is a likely metal-dependent RNAase. This family was defined as arCOG01782.	630
274719	TIGR03676	aRF1/eRF1	peptide chain release factor 1, archaeal and eukaryotic forms. Directs the termination of nascent peptide synthesis (translation) in response to the termination codons UAA, UAG and UGA. This model identifies both archaeal (aRF1) and eukaryotic (eRF1) forms of the protein. Also known as translation termination factor 1. [Protein synthesis, Translation factors]	403
188367	TIGR03677	eL8_ribo	ribosomal protein eL8, archaeal form. This model specifically identifies the archaeal version of the large ribosomal complex protein eL8, previously designated L8 in yeast and L7Ae in the archaea. The family is a narrower version of the pfam01248 model which also recognizes the L30 protein.	117
163391	TIGR03678	het_cyc_patell	bacteriocin leader peptide, microcyclamide/patellamide family. This model represents a conserved N-terminal region shared by microcyclamide and patellamide bacteriocins precursors. These bacteriocin precursors are associated with heterocyclization. Related precursors are found in family TIGR04446.	34
188368	TIGR03679	arCOG00187	arCOG00187 universal archaeal metal-binding-domain/4Fe-4S-binding-domain containing ABC transporter, ATP-binding protein. This protein consists of an N-terminal possible metal-binding domain (pfam04068) followed by a 4Fe-4S cluster binding domain (pfam00037) followed by a C-terminal ABC transporter, ATP-binding domain (pfam00005). This combination of N-terminal domains is observed in the RNase L inhibitor, RLI. This model has the same scope as an archaeal COG (arCOG00187) and is found in all completely sequenced archaea and does not recognize any known non-archaeal genes.	218
274720	TIGR03680	eif2g_arch	translation initiation factor 2 subunit gamma. This model represents the archaeal translation initiation factor 2 subunit gamma and is found in all known archaea. eIF-2 functions in the early steps of protein synthesis by forming a ternary complex with GTP and initiator tRNA.	406
274721	TIGR03682	arCOG04112	diphthamide biosynthesis enzyme Dph2. Members of this family are the archaeal protein Dph2, members of the universal archaeal protein family designated arCOG04112. The chemical function of this protein is analogous to the radical SAM family (pfam04055), although the sequence is not homologous. The chemistry involves [4Fe-4S]-aided formation of a 3-amino-3-carboxypropyl radical rather than the canonical 5'-deoxyadenosyl radical of the radical SAM family.	308
274722	TIGR03683	A-tRNA_syn_arch	alanyl-tRNA synthetase. This family of alanyl-tRNA synthetases is limited to the archaea, and is a subset of those sequences identified by the model pfam07973 covering the second additional domain (SAD) of alanyl and threonyl tRNA synthetases.	902
274723	TIGR03684	arCOG00985	arCOG04150 universal archaeal PUA-domain protein. This universal archaeal protein contains a domain possibly associated with RNA binding (pfam01472, TIGR00451).	150
274724	TIGR03685	ribo_P1_arch	50S ribosomal protein P1. This model represents P1, the L12P protein of the large (50S) subunit of the archaeal ribosome.	105
274725	TIGR03686	pupylate_PafA	Pup--protein ligase. Members of this family are the Pup--protein ligase PafA (proteasome accessory factor A), a protein shown to regulate steady-state levels of certain proteasome targets in Mycobacterium tuberculosis. Iyer, et al (2008) first suggested that PafA is the ligase for Pup, a ubiquitin analog attached to an epsilon-amino group of a Lys side-chain to direct the target to the proteasome. [Protein fate, Degradation of proteins, peptides, and glycopeptides]	453
200311	TIGR03687	pupylate_cterm	ubiquitin-like protein Pup. Members of this protein family are Pup, a small protein whose ligation to target proteins steers them toward degradation. This protein family occurs in a number of bacteria, especially Actinobacteria such as Mycobacterium tuberculosis, that possess an archaeal-type proteasome. All members of this protein family known during model construction end with the C-terminal motif [FY][VI]QKGG[QE]. Ligation is thought to occur between the C-terminal COOH of Pup and an epsilon-amino group of a Lys on the target protein. The N-terminal half of this protein is poorly conserved and not represented in the seed alignment. [Protein fate, Degradation of proteins, peptides, and glycopeptides]	33
274726	TIGR03688	pupylate_PafA2	proteasome accessory factor PafA2. This protein family is paralogous to (and distinct from) the PafA (proteasome accessory factor) first described in Mycobacterium tuberculosis (see TIGR03686). Members of both this family and TIGR03686 itself tend to cluster with each other, with the ubiquitin analog Pup (TIGR03687) associated with targeting to the proteasome, and with proteasome subunits themselves. [Protein fate, Degradation of proteins, peptides, and glycopeptides]	485
200312	TIGR03689	pup_AAA	proteasome ATPase. In the Actinobacteria, as shown for Mycobacterium tuberculosis, some proteins are modified by ligation between an epsilon-amino group of a lysine side chain and the C-terminal carboxylate of the ubiquitin-like protein Pup. This modification leads to protein degradation by the archaeal-like proteasome found in the Actinobacteria. Members of this protein family belong to the AAA family of ATPases and tend to be clustered with the genes for Pup, the Pup ligase PafA, and structural components of the proteasome. This protein forms hexameric rings with ATPase activity. [Protein fate, Degradation of proteins, peptides, and glycopeptides]	512
163402	TIGR03690	20S_bact_beta	proteasome, beta subunit, bacterial type. Members of this family are the beta subunit of the 20S proteasome as found in Actinobacteria such as Mycobacterium, Rhodococcus, and Streptomyces. In Streptomyces, maturation during proteasome assembly was shown to remove a 53-amino acid propeptide. Most of the length of the propeptide is not included in this model. [Protein fate, Degradation of proteins, peptides, and glycopeptides]	219
163403	TIGR03691	20S_bact_alpha	proteasome, alpha subunit, bacterial type. Members of this family are the alpha subunit of the 20S proteasome as found in Actinobacteria such as Mycobacterium, Rhodococcus, and Streptomyces. In most Actinobacteria (an exception is Propionibacterium acnes), the proteasome is accompanied by a system of tagging proteins for degradation with Pup. [Protein fate, Degradation of proteins, peptides, and glycopeptides]	228
274727	TIGR03692	ATP_dep_HslV	ATP-dependent protease HslVU, peptidase subunit. The ATP-dependent protease HslVU, a complex of hexameric HslU active as a protein-unfolding ATPase and dodecameric HslV, the catalytic threonine protease. [Protein fate, Degradation of proteins, peptides, and glycopeptides]	171
163405	TIGR03693	ocin_ThiF_like	putative thiazole-containing bacteriocin maturation protein. Members of this protein family are found in a three-gene operon in Bacillus anthracis and related Bacillus species, where the other two genes are clearly identified with maturation of a putative thiazole-containing bacteriocin precursor. While there is no detectable pairwise sequence similarity between members of this family and the proposed cyclodehydratases such as SagC of Streptococcus pyogenes (see family TIGR03603), both families show similarity through PSI-BLAST to ThiF, a protein involved in biosynthesis of the thiazole moiety for thiamine biosynthesis. This family, therefore, may contribute to cyclodehydratase function in heterocycle-containing bacteriocin biosyntheses. In Bacillus licheniformis ATCC 14580, the bacteriocin precursor gene is adjacent to the gene for this protein. [Cellular processes, Toxin production and resistance]	637
274728	TIGR03694	exosort_acyl	N-acyl amino acid synthase, PEP-CTERM/exosortase system-associated. Members of this protein family are restricted to bacterial species with the PEP-CTERM/exosortase system predicted to act in exopolysaccharide-associated protein targeting. PSI-BLAST and CDD reveal relationships to the acyltransferase family that includes N-acyl-L-homoserine lactone synthetase, and recent work shows long-chain N-acyl amino acid biosynthesis activity. Several members of this family may be found in a single genome. These acyltransferases may produce a quorum signalling molecule or may contribute to chemical modifications in exopolysaccharide and biofilm structural material production.	241
274729	TIGR03695	menH_SHCHC	2-succinyl-6-hydroxy-2,4-cyclohexadiene-1-carboxylate synthase. This protein catalyzes the formation of SHCHC, or (1 R,6 R)-2-succinyl-6-hydroxy-2,4-cyclohexadiene-1-carboxylate, by elimination of pyruvate from 2-succinyl-5-enolpyruvyl-6-hydroxy-3-cyclohexene-1-carboxylate (SEPHCHC). Note that SHCHC synthase activity previously was attributed to MenD, which in fact is SEPHCHC synthase. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone]	252
274730	TIGR03696	Rhs_assc_core	RHS repeat-associated core domain. This model represents a conserved unique core sequence shared by large numbers of proteins. It is occasional in the Archaea (Methanosarcina barkeri) but common in bacteria and eukaryotes. Most fall into two large classes. One class consists of long proteins in which two classes of repeats are abundant: an FG-GAP repeat (pfam01839) class, and an RHS repeat (pfam05593) or YD repeat (TIGR01643). This class includes secreted bacterial insecticidal toxins and intercellular signalling proteins such as the teneurins in animals. The other class consists of uncharacterized proteins shorter than 400 amino acids, where this core domain of about 75 amino acids tends to occur in the N-terminal half. Over twenty such proteins are found in Pseudomonas putida alone; little sequence similarity or repeat structure is found among these proteins outside the region modeled by this domain.	77
163409	TIGR03697	NtcA_cyano	global nitrogen regulator NtcA, cyanobacterial. Members of this protein family, found in the cyanobacteria, are the global nitrogen regulator NtcA. This DNA-binding transcriptional regulator is required for expressing many different ammonia-repressible genes. The consensus NtcA-binding site is G T A N(8)T A C. [Regulatory functions, DNA interactions]	193
163410	TIGR03698	clan_AA_DTGF	clan AA aspartic protease, AF_0612 family. Members of this protein family are clan AA aspartic proteases, related to family TIGR02281. These proteins resemble retropepsins, pepsin-like proteases of retroviruses such as HIV. Members of this family are found in archaea and bacteria. [Protein fate, Degradation of proteins, peptides, and glycopeptides]	107
274731	TIGR03699	menaquin_MqnC	dehypoxanthine futalosine cyclase. Members of this protein family are involved in menaquinone biosynthesis by an alternate pathway via futalosine. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone]	340
213851	TIGR03700	mena_SCO4494	putative menaquinone biosynthesis radical SAM enzyme, SCO4494 family. Members of this protein family appear to be involved in menaquinone biosynthesis by an alternate pathway via futalosine, based on close phylogenetic correlation with known markers of the futalosine pathway, gene clustering in many organisms, and paralogy with the SCO4550 protein. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone]	351
163413	TIGR03701	mena_SCO4490	menaquinone biosynthesis decarboxylase, SCO4490 family. Members of this protein family are putative decarboxylases involved in a late stage of the alternative pathway for menaquinone, via futalosine, as in Streptomyces coelicolor and Helicobacter pylori. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone]	433
163414	TIGR03702	lip_kinase_YegS	lipid kinase YegS. Members of this protein family are designated YegS, an apparent lipid kinase family in the Proteobacteria. Bakali, et al. report phosphatidylglycerol kinase activity for the member from Escherichia coli, but refrain from calling that activity synonymous with its biological role. Note that a broader, subfamily-type model (TIGR00147), includes this family but also multiple paralogs in some species and varied functions. [Unknown function, Enzymes of unknown specificity]	293
274732	TIGR03703	plsB	glycerol-3-phosphate O-acyltransferase. Members of this protein family are PlsB, glycerol-3-phosphate O-acyltransferase, present in E. coli and numerous related species. In many bacteria, PlsB is not found, and appears to be replaced by a two enzyme system for 1-acyl-glycerol-3-phosphate biosynthesis, the PlsX/Y system. [Fatty acid and phospholipid metabolism, Biosynthesis]	799
274733	TIGR03704	PrmC_rel_meth	putative protein-(glutamine-N5) methyltransferase, unknown substrate-specific. This protein family is closely related to two different families of protein-(glutamine-N5) methyltransferase. The first is PrmB, which modifies ribosomal protein L3 in some bacteria. The second is PrmC (HemK), which modifies peptide chain release factors 1 and 2 in most bacteria and also in eukaryotes. The glutamine side chain-binding motif NPPY shared by PrmB and PrmC is N[VAT]PY in this family. The protein substrate is unknown. [Protein synthesis, Ribosomal proteins: synthesis and modification]	251
274734	TIGR03705	poly_P_kin	polyphosphate kinase 1. Members of this protein family are the enzyme polyphosphate kinase 1 (PPK1). This family is found in many prokaryotes and also in Dictyostelium. Sequences in the seed alignment were taken from prokaryotic consecutive two-gene pairs in which the other gene encodes an exopolyphosphatase. It synthesizes polyphosphate from the terminal phosphate of ATP but not GTP, in contrast to PPK2. [Central intermediary metabolism, Phosphorus compounds]	672
274735	TIGR03706	exo_poly_only	exopolyphosphatase. It appears that a single enzyme may act as both exopolyphosphatase (Ppx) and guanosine pentaphosphate phosphohydrolase (GppA) in a number of species. Members of the seed alignment used to define this exception-level model are encoded adjacent to a polyphosphate kinase 1 gene, and the trusted cutoff is set high enough (425) that no genome has a second hit. Therefore all members may be presumed to at least share exopolyphosphatase activity, and may lack GppA activity. GppA acts in the stringent response. [Central intermediary metabolism, Phosphorus compounds]	300
213852	TIGR03707	PPK2_P_aer	polyphosphate kinase 2, PA0141 family. Members of this protein family are designated polyphosphate kinase 2 (PPK2) after the characterized protein in Pseudomonas aeruginosa. This family comprises one of three well-separated clades in the larger family described by pfam03976. PA0141 from this family has been shown capable of operating in reverse, with GDP preferred (over ADP) as a substrate, producing GTP (or ATP) by transfer of a phosphate residue from polyphosphate. Most species with a member of this family also encode a polyphosphate kinase 1 (PPK1). [Central intermediary metabolism, Phosphorus compounds]	230
274736	TIGR03708	poly_P_AMP_trns	polyphosphate:AMP phosphotransferase. Members of this protein family contain a domain duplication. The characterized member from Acinetobacter johnsonii is polyphosphate:AMP phosphotransferase (PAP), which can transfer the terminal phosphate from poly(P) to AMP, yielding ADP. In the opposite direction, this enzyme can synthesize poly(P). Each domain of this protein family is homologous to polyphosphate kinase, an enzyme that can run in the forward direction to extend a polyphosphate chain with a new terminal phosphate from ATP, or in reverse to make ATP (or GTP) from ADP (or GDP). [Central intermediary metabolism, Phosphorus compounds]	493
274737	TIGR03709	PPK2_rel_1	polyphosphate:nucleotide phosphotransferase, PPK2 family. Members of this protein family belong to the polyphosphate kinase 2 (PPK2) family, which is not related in sequence to PPK1. While PPK1 tends to act in the biosynthesis of polyphosphate, or poly(P), members of the PPK2 family tend to use the terminal phosphate of poly(P) to regenerate ATP or GTP from the corresponding nucleoside diphosphate, or ADP from AMP as is the case with polyphosphate:AMP phosphotransferase (PAP). Members of this protein family most likely transfer the terminal phosphate between poly(P) and some nucleotide, but it is not clear which. [Central intermediary metabolism, Phosphorus compounds]	264
274738	TIGR03710	OAFO_sf	2-oxoacid:acceptor oxidoreductase, alpha subunit. This family of proteins contains a C-terminal thiamine diphosphate (TPP) binding domain typical of flavodoxin/ferredoxin oxidoreductases (pfam01855) as well as an N-terminal domain similar to the gamma subunit of the same group of oxidoreductases (pfam01558). The genes represented by this model are always found in association with a neighboring gene for a beta subunit (TIGR02177) which also occurs in a 4-subunit (alpha/beta/gamma/ferredoxin) version of the system. This alpha/gamma plus beta structure was used to define the set of sequences to include in this model. This pair of genes is not consistently observed in proximity to any electron acceptor genes, but is found next to putative ferredoxins or ferredoxin-domain proteins in Azoarcus sp. EbN1, Bradyrhizobium japonicum USDA 110, Frankia sp. CcI3, Rhodoferax ferrireducens DSM 15236, Rhodopseudomonas palustris BisB5, Os, Sphingomonas wittichii RW1 and Streptomyces clavuligerus. Other potential acceptors are also sporadically observed in close proximity including ferritin-like proteins, rubrerythrin, peroxiredoxin and a variety of other flavin and iron-sulfur cluster-containing proteins. The phylogenetic distribution of this family encompasses archaea, a number of deeply-branching bacterial clades and only a small number of firmicutes and proteobacteria. The enzyme from Sulfolobus has been characterized with respect to its substrate specificity, which is described as wide, encompassing various 2-oxoacids such as 2-oxoglutarate, 2-oxobutyrate and pyruvate. The enzyme from Hydrogenobacter thermophilus has been shown to have a high specificity towards 2-oxoglutarate and is one of the key enzymes in the reverse TCA cycle in this organism. Furthermore, considering its binding of coenzyme A, it can be reasonably inferred that the product of the reaction is succinyl-CoA. The genes for this enzyme in Prevotella intermedia 17, Persephonella marina EX-H1 and Picrophilus torridus DSM 9790 are in close proximity to a variety of TCA cycle genes. Persephonella marina and P. torridus are believed to encode complete TCA cycles, and none of these contains the lipoate-based 2-oxoglutarate dehydrogenase (E1/E2/E3) system. That system is presumed to be replaced by this one. In fact, the lipoate system is absent in most organisms possessing a member of this family, providing additional circumstantial evidence that many of these enzymes are capable of acting as 2-oxoglutarate dehydrogenases and	562
163423	TIGR03711	acc_sec_asp3	accessory Sec system protein Asp3. This protein is designated Asp3 because, along with SecY2, SecA2, and other proteins it is part of the accessory Sec system. The system is involved in the export of serine-rich glycoproteins important for virulence in a number of Gram-positive species, including Streptococcus gordonii and Staphylococcus aureus. This protein family is assigned to transport rather than glycosylation function, but the specific molecular role is unknown. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	135
274739	TIGR03712	acc_sec_asp2	accessory Sec system protein Asp2. This protein is designated Asp2 because, along with SecY2, SecA2, and other proteins it is part of the accessory secretory protein system. The system is involved in the export of serine-rich glycoproteins important for virulence in a number of Gram-positive species, including Streptococcus gordonii and Staphylococcus aureus. This protein family is assigned to transport rather than glycosylation function, but the specific molecular role is unknown. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	511
274740	TIGR03713	acc_sec_asp1	accessory Sec system protein Asp1. This protein is designated Asp1 because, along with SecY2, SecA2, and other proteins it is part of the accessory secretory protein system. The system is involved in the export of serine-rich glycoproteins important for virulence in a number of Gram-positive species, including Streptococcus gordonii and Staphylococcus aureus. This protein family is assigned to transport rather than glycosylation function, but the specific molecular role is unknown. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	519
163426	TIGR03714	secA2	accessory Sec system translocase SecA2. Members of this protein family are homologous to SecA and part of the accessory Sec system. This system, including both five core proteins for export and a variable number of proteins for glycosylation, operates in certain Gram-positive pathogens for the maturation and delivery of serine-rich glycoproteins such as the cell surface glycoprotein GspB in Streptococcus gordonii. [Protein fate, Protein and peptide secretion and trafficking]	762
274741	TIGR03715	KxYKxGKxW	KxYKxGKxW signal peptide. This model describes a novel form of signal peptide that occurs as an N-terminal domain with a recognizable motif, reminiscent of the YSIRK and PEP-CTERM forms of signal peptide. This domain tends to occur on long, low-complexity (usually Serine-rich and heavily glycosylated) proteins of the Firmicutes, and (as with YSIRK) the majority of these proteins have the LPXTG cell wall-anchoring motif at the C-terminus.	23
274742	TIGR03716	R_switched_YkoY	integral membrane protein, YkoY family. Rfam model RF00080 describes a structured RNA element called the yybP-ykoY leader, or SraF, which may precede one or several genes in a genome. Members of this highly hydrophobic protein family often are preceded by a yybP-ykoY leader, which may serve as a riboswitch. From the larger group of TerC homologs (pfam03741), this subfamily contains proteins YceF and YkoY from Bacillus subtilis. A transport function is proposed.	215
163429	TIGR03717	R_switched_YjbE	integral membrane protein, YjbE family. Rfam model RF00080 describes a structured RNA element called the yybP-ykoY leader, or SraF, which may precede one or several genes in a genome. Members of this highly hydrophobic protein family commonly are preceded by a yybP-ykoY leader, which may serve as a riboswitch. From the larger group of TerC homologs (pfam03741), this subfamily contains protein YjbE from Bacillus subtilis. A transport function is proposed.	176
274743	TIGR03718	R_switched_Alx	integral membrane protein, TerC family. Rfam model RF00080 describes a structured RNA element called the yybP-ykoY leader, or SraF, which may precede one or several genes in a genome. Members of this highly hydrophobic protein family often are preceded by a yybP-ykoY leader, which may serve as a riboswitch. From the larger group of TerC homologs (pfam03741), this subfamily contains TerC itself from Alcaligenes sp. plasmid IncHI2 pMER610 and from Proteus mirabilis. It also contains the alkaline-inducible E. coli protein Alx, which unlike the two TerC examples is preceded by a yybP-ykoY leader.	302
274744	TIGR03719	ABC_ABC_ChvD	ATP-binding cassette protein, ChvD family. Members of this protein family have two copies of the ABC transporter ATP-binding cassette, but are found outside the common ABC transporter operon structure that features integral membrane permease proteins and substrate-binding proteins encoded next to the ATP-binding cassette (ABC domain) protein. The member protein ChvD from Agrobacterium tumefaciens was identified as both a candidate to interact with VirB8, based on yeast two-hybrid analysis, and as an apparent regulator of VirG. The general function of this protein family is unknown.	552
274745	TIGR03720	exospor_lead	exosporium leader peptide. This domain is found as a leader peptide in at least two proteins targeted to the exosporium, a structure that occurs as the outermost layer of Bacillus anthracis, B. cereus, and B. thuringiensis spores. The exosporium consists of a basal layer and a nap of hair-like filaments. BclA, the major protein of the nap filaments, is targeted there by this leader peptide. [Cellular processes, Sporulation and germination]	19
274746	TIGR03721	exospore_TM	BclB C-terminal domain. This domain occurs as the C-terminal region in a number of proteins that have extensive collagen-like triple helix repeat regions. Member domains are predicted by TmHMM to have four or five transmembrane helices. Members are found mostly in the Firmicutes, but also in Acanthamoeba polyphaga mimivirus. Members include spore surface glycoprotein BclB from Bacillus anthracis, a protein of the exosporium. The exosporium is an additional outermost spore layer, lacking in B. subtilis and most other spore formers, consisting of a basal layer and, above it, a nap of fine filaments.	165
274747	TIGR03722	arch_KAE1	universal archaeal protein Kae1. This family represents the archaeal protein Kae1. Its partner Bud32 is fused with it in about half of the known archaeal genomes. The pair, which appears universal in the archaea, corresponds to EKC/KEOPS complex in eukaryotes. A recent characterization of the member from Pyrococcus abyssi, as an iron-binding, atypical DNA-binding protein with an apurinic lyase activity, challenges the common annotation of close homologs as O-sialoglycoprotein endopeptidase. The latter annotation is based on a characterized protein from the bacterium Pasteurella haemolytica. [DNA metabolism, DNA replication, recombination, and repair]	322
274748	TIGR03723	T6A_TsaD_YgjD	tRNA threonylcarbamoyl adenosine modification protein TsaD. This model represents bacterial members of a protein family that is widely distributed. In a few pathogenic species, the protein is exported in a way that may represent an exceptional secondary function. This model plus companion (archaeal) model TIGR03722 together span the prokaryotic member sequences of TIGR00329, a protein family that appears universal in life, and whose broad function is unknown. A member of TIGR03722 has been characterized as a DNA-binding protein with apurinic endopeptidase activity. In contrast, the rare characterized members of the present family show O-sialoglycoprotein endopeptidase (EC 3.4.24.57) activity after export. These include glycoprotease (gcp) from Pasteurella haemolytica A1 and a cohemolysin from Riemerella anatipestifer (GB|AAG39646.1). The member from Staphylococcus aureus is essential and is related to cell wall dynamics and the modulation of autolysis, but members are also found in the Mycoplasmas (which lack a cell wall). A reasonable hypothesis is that virulence-related activities after export are secondary to a bacterial domain-wide unknown function. [Protein synthesis, tRNA and rRNA base modification]	313
274749	TIGR03724	arch_bud32	Kae1-associated kinase Bud32. Members of this protein family are the Bud32 protein associated with Kae1 (kinase-associated endopeptidase 1) in the Archaea. In many Archaeal genomes, Kae1 and Bud32 are fused. The complex is homologous to the Kae1 and Bud32 subunits of the eukaryotic KEOPS complex, an apparently ancient protein kinase-containing molecular machine. [Unknown function, General]	199
274750	TIGR03725	T6A_YeaZ	tRNA threonylcarbamoyl adenosine modification protein YeaZ. This family describes a protein family, YeaZ, now associated with the threonylcarbamoyl adenosine (t6A) tRNA modification. Members of this family may occur as fusions with ygjD (previously gcp) or the ribosomal protein N-acetyltransferase rimI, and are frequently encoded next to rimI. [Protein synthesis, tRNA and rRNA base modification]	204
274751	TIGR03726	strep_RK_lipo	putative cross-wall-targeting lipoprotein signal. The YSIRK signal domain targets proteins to the cross-wall, or septum, of dividing Gram-positive bacteria. Lipoprotein signal motifs direct a characteristic N-terminal cleavage and lipid modification for membrane anchoring. This Streptococcal-only signal peptide variant appears to be a hybrid between the two, likely directing protein targeting of nascent surface lipoproteins to the cross-wall. Nearly all members of this family have the characteristic LPXTG cell wall anchor signal at the C-terminus.	34
274752	TIGR03727	urea_t_UrtC_arc	urea ABC transporter, permease protein UrtC, archaeal type. Members of this protein family are ABC transporter permease subunits restricted to the Archaea. Several lines of evidence suggest this protein is functionally analogous, as well as homologous, to the UrtC subunit of the Corynebacterium glutamicum urea transporter. All members of the operon show sequence similarity to urea transport subunits, the gene is located near the urease structural subunits in two of three species, and partial phylogenetic profiling identifies this permease subunit as closely matching the profile of urea utilization.	369
163440	TIGR03728	glyco_access_1	glycosyltransferase, SP_1767 family. Members of this protein family are putative glycosyltransferases. Some members are found close to genes for the accessory secretory (SecA2) system, and are suggested by Partial Phylogenetic Profiling to correlate with SecA2 systems. Glycosylation, therefore, may occur in the cytosol prior to secretion.	265
163441	TIGR03729	acc_ester	putative phosphoesterase. Members of this protein family belong to the larger family pfam00149 (calcineurin-like phosphoesterase), a family largely defined by small motifs of metal-chelating residues. The subfamily in this model shows a good but imperfect co-occurrence in species with domain TIGR03715 that defines a novel class of signal peptide typical of the accessory secretory system.	239
163442	TIGR03730	tungstate_WtpA	tungstate ABC transporter binding protein WtpA. Members of this protein family are tungstate (and, more weakly, molybdate) binding proteins of tungstate(/molybdate) ABC transporters, as first characterized in Pyrococcus furiosus. Model seed members and cutoffs, pending experimental evidence for more distant homologs, were chosen such that this model identifies select archaeal proteins, excluding weaker archaeal and all bacterial homologs. Note that this family is homologous to molybdate transporters, and that at least one other family of tungstate transporter binding protein, TupA, also exists.	273
274753	TIGR03731	lantibio_gallid	lantibiotic, gallidermin/nisin family. Members of this family are lantibiotic precursors in the family that includes gallidermin, nisin, mutacin, epidermin, and streptin. [Cellular processes, Toxin production and resistance]	48
274754	TIGR03732	lanti_perm_MutE	lantibiotic protection ABC transporter permease subunit, MutE/EpiE family. Model TIGR03731 represents the family of all lantibiotics related to gallidermin, including epidermin, mutacin, and nisin. This protein family is largely restricted to gallidermin-family lantibiotic cassettes, but also includes orphan transporter cassettes in species that lack candidate lantibiotic precursor and synthetase genes. In most species, this subunit is paralogous to an adjacent gene, modeled separately.	241
163445	TIGR03733	lanti_perm_MutG	lantibiotic protection ABC transporter permease subunit, MutG family. Model TIGR03731 represents the family of all lantibiotics related to gallidermin, including epidermin, mutacin, and nisin. This protein family is largely restricted to gallidermin-family lantibiotic cassettes, but also includes orphan transporter cassettes in species that lack candidate lantibiotic precursor and synthetase genes. In most species, this subunit is paralogous to an adjacent gene modeled separately by TIGR03732, while in some species only one subunit is found.	248
274755	TIGR03734	PRTRC_parB	PRTRC system ParB family protein. A novel genetic system characterized by six major proteins, including a ParB homolog and a ThiF homolog, is designated PRTRC, or ParB-Related, ThiF-Related Cassette. It is often found on plasmids. This protein family is the member related to ParB, and is designated PRTRC system ParB family protein.	554
163447	TIGR03735	PRTRC_A	PRTRC system protein A. A novel genetic system characterized by six major proteins, including a ParB homolog and a ThiF homolog, is designated PRTRC, or ParB-Related, ThiF-Related Cassette. It is often found on plasmids. This protein family is designated protein A.	192
163448	TIGR03736	PRTRC_ThiF	PRTRC system ThiF family protein. A novel genetic system characterized by six major proteins, including a ParB homolog and a ThiF homolog, is designated PRTRC, or ParB-Related, ThiF-Related Cassette. This family is the PRTRC system ThiF family protein.	244
274756	TIGR03737	PRTRC_B	PRTRC system protein B. A novel genetic system characterized by six major proteins, including a ParB homolog and a ThiF homolog, is designated PRTRC, or ParB-Related, ThiF-Related Cassette. This protein family is designated protein B.	228
163450	TIGR03738	PRTRC_C	PRTRC system protein C. A novel genetic system characterized by six major proteins, including a ParB homolog and a ThiF homolog, is designated PRTRC, or ParB-Related, ThiF-Related Cassette. It is often found on plasmids. This protein family is designated PRTRC system protein C.	66
274757	TIGR03739	PRTRC_D	PRTRC system protein D. A novel genetic system characterized by six major proteins, including a ParB homolog and a ThiF homolog, is designated PRTRC, or ParB-Related, ThiF-Related Cassette. It is often found on plasmids. This protein family is designated PRTRC system protein D. The gray zone, between trusted and noise, includes proteins found in the same genomes as other proteins of the PRTRC systems, but not in the same contiguous gene region.	320
163452	TIGR03740	galliderm_ABC	gallidermin-class lantibiotic protection ABC transporter, ATP-binding subunit. Model TIGR03731 represents the family of all lantibiotics related to gallidermin, including epidermin, mutacin, and nisin. This protein family describes the ATP-binding subunit of a gallidermin/epidermin class lantibiotic protection transporter. It is largely restricted to gallidermin-family lantibiotic biosynthesis and export cassettes, but also occurs in orphan transporter cassettes in species that lack candidate lantibiotic precursor and synthetase genes.	223
274758	TIGR03741	PRTRC_E	PRTRC system protein E. A novel genetic system characterized by six or seven major proteins, including a ParB homolog and a ThiF homolog, is designated PRTRC, or ParB-Related, ThiF-Related Cassette. It is often found on plasmids. This protein family averages about 150 amino acids in length, but the last third contains low-complexity sequence that complicates sequence comparisons. This model does not include the low-complexity region.	104
274759	TIGR03742	PRTRC_F	PRTRC system protein F. A novel genetic system characterized by seven (usually) major proteins, including a ParB homolog and a ThiF homolog, is commonly found on plasmids or in bacterial chromosomal regions near phage, plasmid, or transposon markers. It is most common among the beta Proteobacteria. We designate the system PRTRC, or ParB-Related,ThiF-Related Cassette. This protein family is designated protein F. It is the most divergent of the families.	342
274760	TIGR03743	SXT_TraD	conjugative coupling factor TraD, SXT/TOL subfamily. Members of this protein family are the putative conjugative coupling factor, TraD (or TraG), rather distantly related to the well-characterized TraD of the F plasmid. Members are associated with conjugative-transposon-like mobile genetic elements of the class that includes SXT, an antibiotic resistance transfer element in some Vibrio cholerae strains. [Mobile and extrachromosomal element functions, Other]	634
274761	TIGR03744	traC_PFL_4706	conjugative transfer ATPase, PFL_4706 family. Members of this protein family are predicted ATP-binding proteins apparently associated with DNA conjugal transfer. Members are found both in plasmids and in bacterial chromosomal regions that appear to derive from integrative elements such as conjugative transposons. More distant homologs, outside the scope of this family, include type IV secretion/conjugal transfer proteins such as TraC, VirB4 and TrsE. The granularity of this protein family definition is chosen so as to represent one distinctive clade and act as a marker through which to define and recognize the class of mobile element it serves. [Mobile and extrachromosomal element functions, Plasmid functions]	893
274762	TIGR03745	conj_TIGR03745	integrating conjugative element membrane protein, PFL_4702 family. Members of this protein family are found occasionally on plasmids such as the Pseudomonas putida TOL plasmid pWWO_p085. Usually, however, they are found on the bacterial main chromosome in a region flanked by markers of conjugative transfer and/or transposition. [Mobile and extrachromosomal element functions, Plasmid functions]	105
163458	TIGR03746	conj_TIGR03746	integrating conjugative element protein, PFL_4703 family. Members of this protein family are found occasionally on plasmids such as the Pseudomonas putida TOL plasmid pWWO_p085. Usually, however, they are found on the bacterial main chromosome in regions flanked by markers of conjugative transfer and/or transposition. The function is unknown. [Mobile and extrachromosomal element functions, Plasmid functions]	202
163459	TIGR03747	conj_TIGR03747	integrating conjugative element membrane protein, PFL_4697 family. Members of this protein family are found occasionally on plasmids such as the Pseudomonas putida TOL plasmid pWWO_p085. Usually, however, they are found on the bacterial main chromosome in regions flanked by markers of conjugative transfer and/or transposition. [Mobile and extrachromosomal element functions, Plasmid functions]	233
163460	TIGR03748	conj_PilL	conjugative transfer region protein, TIGR03748 family. This model describes the conserved N-terminal region of a variable length protein family associated with laterally transferred regions flanked by markers of conjugative plasmid integration and/or transposition. Most members of the family have the lipoprotein signal peptide motif. A member of the family from a pathogenicity island in Salmonella enterica serovar Dublin strain was designated PilL for nomenclature consistency with a neighboring gene for the pilin structural protein PilS. However, the species distribution of this protein family tracks much better with markers of conjugal transfer than with markers of PilS-like pilin structure. [Mobile and extrachromosomal element functions, Plasmid functions]	105
163461	TIGR03749	conj_TIGR03749	integrating conjugative element protein, PFL_4704 family. Members of this protein family are found occasionally on plasmids such as the Pseudomonas putida TOL plasmid pWWO_p085. Usually, however, they are found on the bacterial main chromosome in a region flanked by markers of conjugative transfer and/or transposition. [Mobile and extrachromosomal element functions, Plasmid functions]	257
274763	TIGR03750	conj_TIGR03750	conjugative transfer region protein, TIGR03750 family. Members of this protein family are found occasionally on plasmids. Usually, however, they are found on the bacterial main chromosome in regions flanked by markers of conjugative transfer and/or transposition. [Mobile and extrachromosomal element functions, Plasmid functions]	111
274764	TIGR03751	conj_TIGR03751	conjugative transfer region lipoprotein, TIGR03751 family. Members of this protein family are found occasionally on plasmids. Usually, however, they are found on the bacterial main chromosome in regions flanked by markers of conjugative transfer and/or transposition. [Mobile and extrachromosomal element functions, Plasmid functions]	124
274765	TIGR03752	conj_TIGR03752	integrating conjugative element protein, PFL_4705 family. Members of this protein family are found occasionally on plasmids such as the Pseudomonas putida toluene catabolic TOL plasmid pWWO_p085. Usually, however, they are found on the bacterial main chromosome in regions flanked by markers of conjugative transfer and/or transposition. [Mobile and extrachromosomal element functions, Plasmid functions]	472
274766	TIGR03753	blh_monoox	beta-carotene 15,15'-monooxygenase, Brp/Blh family. This integral membrane protein family includes Brp (bacterio-opsin related protein) and Blh (Brp-like protein). Bacteriorhodopsin is a light-driven proton pump with a covalently bound retinal cofactor that appears to be derived from beta-carotene. Blh has been shown to cleave beta-carotene to produce two all-trans retinal molecules. Mammalian enzymes with similar enzymatic function are not multiple membrane spanning proteins and are not homologous.	259
274767	TIGR03754	conj_TOL_TraD	conjugative coupling factor TraD, TOL family. Members of this protein family are assigned by homology to the TraD family of conjugative coupling factors. This particular clade serves as a marker for an extended gene region that occurs occasionally on plasmids, including the toluene catabolism TOL plasmid. More commonly, the gene region is chromosomal, flanked by various markers of conjugative transfer and insertion.	643
274768	TIGR03755	conj_TIGR03755	integrating conjugative element protein, PFL_4711 family. Members of this protein family are found in genomic regions associated with conjugative transfer and integrated TOL-like plasmids. The specific function is unknown. [Mobile and extrachromosomal element functions, Plasmid functions]	418
274769	TIGR03756	conj_TIGR03756	integrating conjugative element protein, PFL_4710 family. Members of this protein family are found in genomic regions associated with conjugative transfer and integrated TOL-like plasmids. The specific function is unknown. [Mobile and extrachromosomal element functions, Plasmid functions]	297
163469	TIGR03757	conj_TIGR03757	integrating conjugative element protein, PFL_4709 family. Members of this protein family belong to extended genomic regions that appear to be spread by conjugative transfer. [Mobile and extrachromosomal element functions, Plasmid functions]	113
163470	TIGR03758	conj_TIGR03758	integrating conjugative element protein, PFL_4701 family. Members of this family of small, hydrophobic proteins are found occasionally on plasmids such as the Pseudomonas putida TOL (toluene catabolic) plasmid pWWO_p085. Usually, however, they are found on the bacterial main chromosome in regions flanked by markers of conjugative transfer and/or transposition. [Mobile and extrachromosomal element functions, Plasmid functions]	65
274770	TIGR03759	conj_TIGR03759	integrating conjugative element protein, PFL_4693 family. Members of this protein family, such as model protein PFL_4693 from Pseudomonas fluorescens Pf-5, belong to extended genomic regions that appear to be spread by conjugative transfer. Most members have a predicted N-terminal signal sequence. The function is unknown. [Mobile and extrachromosomal element functions, Plasmid functions]	200
163472	TIGR03760	ICE_TraI_Pfluor	integrating conjugative element relaxase, PFL_4751 family. Members of this protein family are the TraI putative relaxases required for transfer by a subclass of integrating conjugative elements (ICE) as found in Pseudomonas fluorescens Pf-5, and understood from study of two related ICE, SXT and R391. This model represents the N-terminal domain. Note that no homology is detected to the similarly named TraI relaxase of the F plasmid.	218
274771	TIGR03761	ICE_PFL4669	integrating conjugative element protein, PFL_4669 family. Members of this protein family, such as PFL4669, are found in integrating conjugative elements (ICE) of the PFGI-1 class as in Pseudomonas fluorescens.	216
274772	TIGR03762	archaeo_artC	archaeosortase C, PEF-CTERM variant. Members of this family are archaeal homologs to bacterial PEP-CTERM-sorting protein exosortase (TIGR02602). Members of this family are found in species with an archaeal variant sorting motif, PEF-CTERM (TIGR03024). Members are found in the thermoacidophilic Aciduliprofundum boonei, the mesophilic psychromethanogens Methanosarcina mazei and Methanococcoides burtonii, and in Ferroglobus placidus DSM 10642. [Protein fate, Protein and peptide secretion and trafficking]	274
163475	TIGR03763	cyanoexo_CrtA	cyanoexosortase A. The predicted protein-sorting transpeptidase that we call exosortase (see TIGR02602) has distinct subclasses that are associated with different types of exopolysaccharide production loci and/or different taxonomic lineages. We designate this relatively divergent cyanobacterial type to be type 3. We propose the gene symbol xrtC. This type coexists with a TIGR02602-recognized form in Nostoc sp. PCC 7120. [Protein fate, Protein and peptide secretion and trafficking]	260
213858	TIGR03764	ICE_PFGI_1_parB	integrating conjugative element, PFGI_1 class, ParB family protein. Members of this protein family carry the ParB-type nuclease domain and are found in integrating conjugative elements (ICE) in the same class as PFGI-1 of Pseudomonas fluorescens Pf-5.	258
274773	TIGR03765	ICE_PFL_4695	integrating conjugative element protein, PFL_4695 family. This model describes a protein family exemplified by PFL_4695 of Pseudomonas fluorescens Pf-5. Full-length proteins in this family show some architectural variety, but this model represents a conserved domain. Most or all member proteins belong to laterally transferred chromosomal islands called integrative conjugative elements, or ICE.	105
274774	TIGR03766	TIGR03766	conserved hypothetical integral membrane protein. Models TIGR03110, TIGR03111, and TIGR03112 describe a three-gene system found in several Gram-positive bacteria, where TIGR03110 (XrtG) is distantly related to a putative transpeptidase, exosortase (TIGR02602). This model describes a small clade that correlates by both gene clustering and phyletic pattern, although imperfectly, to the three gene system. Both this narrow clade, and the larger set of full-length homologous integral membrane proteins, have an especially well-conserved region near the C-terminus with an invariant tyrosine. The function is unknown.	483
213859	TIGR03767	P_acnes_RR	metallophosphoesterase, PPA1498 family. This model describes a small collection of probable metallophosphoesterases, related to pfam00149 but with long inserts separating some of the shared motifs such that the homology is apparent only through multiple sequence alignment. Members of this protein family, in general, have a Sec-independent TAT (twin-arginine translocation) signal sequence, N-terminal to the region represented by this model. Members include YP_056203.1 from Propionibacterium acnes KPA171202.	496
163480	TIGR03768	RPA4764	metallophosphoesterase, RPA4764 family. This model describes a small collection of probable metallophosphoesterases, related to pfam00149. Members of this protein family usually have a Sec-independent TAT (twin-arginine translocation) signal sequence, N-terminal to the region represented by this model. This model and TIGR03767 divide a narrow clade of pfam00149-related enzymes.	492
274775	TIGR03769	P_ac_wall_RPT	actinobacterial surface-anchored protein domain. This model describes a repeat domain that occurs one to three times in Actinobacterial proteins, some of which have LPXTG-type sortase recognition motifs for covalent attachment to the Gram-positive cell wall. Where it occurs with duplication in an LPXTG-anchored protein, it tends to be adjacent to the substrate-binding protein of the gene trio of an ABC transporter system, where that substrate-binding protein has a single copy of this same domain. This arrangement suggests a substrate-binding relay system, with the LPXTG protein acting as a substrate receptor.	41
163482	TIGR03770	anch_rpt_perm	anchored repeat-type ABC transporter, permease subunit. This protein family is the permease subunit of a binding protein-dependent ABC transporter complex that strictly co-occurs with TIGR03769. TIGRFAMs model TIGR03769 describes a protein domain that occurs singly or as one of up to three repeats in proteins of a number of Actinobacteria, including Propionibacterium acnes KPA171202. The TIGR03769 domain occurs both in the adjacent gene for the substrate-binding protein and in additional (often nearby) proteins, often with LPXTG-like sortase recognition signals. Homologous permease subunits outside the scope of this family include manganese transporter MntB in Synechocystis sp. PCC 6803 and chelated iron transporter subunits. The function of this transporter complex is unknown. [Transport and binding proteins, Unknown substrate]	270
163483	TIGR03771	anch_rpt_ABC	anchored repeat-type ABC transporter, ATP-binding subunit. This protein family is the ATP-binding cassette subunit of a binding protein-dependent ABC transporter complex that strictly co-occurs with TIGR03769. TIGRFAMs model TIGR03769 describes a protein domain that occurs singly or as one of up to three repeats in proteins of a number of Actinobacteria, including Propionibacterium acnes KPA171202. The TIGR03769 domain occurs both in an adjacent gene for the substrate-binding protein and in additional (often nearby) proteins, often with LPXTG-like sortase recognition signals. Homologous ATP-binding subunits outside the scope of this family include manganese transporter MntA in Synechocystis sp. PCC 6803 and chelated iron transporter subunits. The function of this transporter complex is unknown. [Transport and binding proteins, Unknown substrate]	223
163484	TIGR03772	anch_rpt_subst	anchored repeat ABC transporter, substrate-binding protein. Members of this protein family are ABC transporter permease subunits as identified by pfam00950, but additionally contain the Actinobacterial insert domain described by TIGR03769. Some homologs (lacking the insert) have been described as transporters of manganese or of chelated iron. Members of this family typically are found along with an ATP-binding cassette protein, a permease, and an LPXTG-anchored protein with two or three copies of the TIGR03769 insert that occurs just once in this protein family. [Transport and binding proteins, Unknown substrate]	479
274776	TIGR03773	anch_rpt_wall	putative ABC transporter-associated repeat protein. Members of this protein family occur in genomes that contain a three-gene ABC transporter operon associated with the presence of domain TIGR03769. That domain occurs as a single-copy insert in the substrate-binding protein, and occurs in two or more copies in members of this protein family. Members of this family typically are encoded adjacent to the said transporter operon and may serve as a substrate receptor.	513
163486	TIGR03774	RPE2	Rickettsial palindromic element RPE2 domain. This model describes protein translations of a second family, RPE2, of Rickettsia palindromic elements (RPE). The elements spread within a genome as selfish genetic elements, inserting into genes additional coding regions that do not disrupt the reading frame. This model finds RPE-encoded regions in several Rickettsial species and, so far, nowhere else.	35
163487	TIGR03775	RPE3	Rickettsial palindromic element RPE3 domain. This model describes protein translations of a second family, RPE3, of Rickettsia palindromic elements (RPE). The elements spread within a genome as selfish genetic elements, inserting into genes additional coding regions that do not disrupt the reading frame. This model finds RPE-encoded regions in several Rickettsial species and, so far, nowhere else.	43
163488	TIGR03776	RPE5	Rickettsial palindromic element RPE5 domain. This model describes protein translations of a family, RPE5, of Rickettsia palindromic elements (RPE). The elements spread within a genome as selfish genetic elements, inserting into genes additional coding regions that do not disrupt the reading frame. This model finds RPE-encoded regions in several Rickettsial species and, so far, nowhere else.	43
274777	TIGR03777	RPE4	Rickettsial palindromic element RPE4 domain. This model describes protein translations of a family, RPE4, of Rickettsia palindromic elements (RPE). The elements spread within a genome as selfish genetic elements, inserting into genes additional coding regions that do not disrupt the reading frame. This model finds RPE-encoded regions in several Rickettsial species and, so far, nowhere else.	32
274778	TIGR03778	VPDSG_CTERM	VPDSG-CTERM protein sorting domain. Through in silico analysis, we previously described the PEP-CTERM/exosortase system. This model describes a PEP-CTERM-like variant C-terminal protein sorting signal, as found at the C-terminus of twenty otherwise unrelated proteins in Verrucomicrobiae bacterium DG1235. The variant motif, VPDSG, seems an intermediate between the VPEP motif (TIGR02595) of typical exosortase systems and the classical LPXTG of sortase in Gram-positive bacteria.	24
274779	TIGR03779	Bac_Flav_CT_M	Bacteroides conjugative transposon TraM protein. Members of this protein family are designated TraM and are found in a proposed transfer region of a class of conjugative transposon found in the Bacteroides lineage. [Cellular processes, DNA transformation]	410
163492	TIGR03780	Bac_Flav_CT_N	Bacteroides conjugative transposon TraN protein. Members of this family are the TraN protein encoded by transfer region genes of conjugative transposons of Bacteroides. The family is related to conjugative transfer proteins VirB9 and TrbG of Agrobacterium Ti plasmids. [Cellular processes, DNA transformation]	285
200324	TIGR03781	Bac_Flav_CT_K	Bacteroides conjugative transposon TraK protein. Members of this protein family are designated TraK and are found in a proposed transfer region of a class of conjugative transposon found in the Bacteroides lineage. PSI-BLAST reveals a distant relationship to proteins TrbF and VirB8 in Proteobacterial conjugal transfer systems. [Cellular processes, DNA transformation]	202
274780	TIGR03782	Bac_Flav_CT_J	Bacteroides conjugative transposon TraJ protein. Members of this protein family are designated TraJ and are found in a proposed transfer region of a class of conjugative transposon found in the Bacteroides lineage. This family is related to conjugation system proteins in the Proteobacteria, including TrbL of Agrobacterium Ti plasmids and VirB6. [Cellular processes, DNA transformation]	323
163495	TIGR03783	Bac_Flav_CT_G	Bacteroides conjugation system ATPase, TraG family. Members of this family include the predicted ATPase, TraG, encoded by transfer region genes of conjugative transposons of Bacteroides, such as CTnDOT, found on the main chromosome. Members also include TraG homologs borne on plasmids in Bacteroides. The protein family is related to the conjugative transfer system ATPase VirB4. [Cellular processes, DNA transformation]	829
163496	TIGR03784	marine_sortase	sortase, marine proteobacterial type. Members of this protein family are sortase enzymes, cysteine transpeptidases involved in protein sorting activities. Members of this family tend to be found in proteobacteria, rather than in Gram-positive bacteria where sortases attach proteins to the Gram-positive cell wall or participate in pilin cross-linking. Many species with this sortase appear to contain a signal target sequence, a protein with a Vault protein inter-alpha-trypsin domain (pfam08487) and a von Willebrand factor type A domain (pfam00092), encoded by an adjacent gene. These sortases are designated subfamily 6 according to Comfort and Clubb (2004).	174
163497	TIGR03785	marine_sort_HK	proteobacterial dedicated sortase system histidine kinase. This histidine kinase protein is paired with an adjacent response regulator (TIGR03787) gene. It co-occurs with a variant sortase enzyme (TIGR03784), usually in the same gene neighborhood, in proteobacterial species most of which are marine, and with an LPXTG motif-containing sortase target conserved protein (TIGR03788). Sortases and LPXTG proteins are far more common in Gram-positive bacteria, where sortase systems mediate attachment to the cell wall or cross-linking of pilin structures. We give this predicted sensor histidine kinase the gene symbol pdsS, for Proteobacterial Dedicated Sortase system Sensor histidine kinase.	703
274781	TIGR03786	strep_pil_rpt	streptococcal pilin isopeptide linkage domain. This model describes a domain that occurs once in the major pilin of Streptococcus pyogenes, Spy0128, but in higher copy numbers in other streptococcal proteins. The domain occurs nine times in a surface-anchored protein of Bifidobacterium longum. All members of this family have LPXTG-type sortase target sequences. The S. pyogenes major pilin has been shown to undergo isopeptide bond cross-linking, mediated by sortases, that is critical to maintaining pilus structural integrity. One such Lys-to-Asn isopeptide bond is to a near-invariant Asn near the C-terminal end of this domain (column 81 of the seed alignment). A Glu in the S. pyogenes major pilin (column 25 of the seed alignment), invariant as Glu or Gln, is described as catalytic for isopeptide bond formation.	63
163499	TIGR03787	marine_sort_RR	proteobacterial dedicated sortase system response regulator. This model describes a family of DNA-binding response regulator proteins, associated with an adjacent histidine kinase (TIGR03785) to form a two-component system. This system co-occurs with, and often is adjacent to, a proteobacterial variant form of the protein sorting transpeptidase called sortase (TIGR03784), and a single target protein for the sortase. We give this protein the gene symbol pdsR, for Proteobacterial Dedicated Sortase system Response regulator.	227
274782	TIGR03788	marine_srt_targ	marine proteobacterial sortase target protein. Members of this protein family are restricted to the Proteobacteria. Each contains a C-terminal sortase-recognition motif, transmembrane domain, and basic residues cluster at the C-terminus, and is encoded adjacent to a sortase gene. This protein is frequently the only sortase target in its genome, which is as unusual as its occurrence in Gram-negative rather than Gram-positive genomes. Many bacteria with this system are marine. In addition to the LPXTG signal, members carry a vault protein inter-alpha-trypsin inhibitor domain (pfam08487) and a von Willebrand factor type A domain (pfam00092).	596
274783	TIGR03789	pdsO	proteobacterial sortase system peptidoglycan-associated protein. A newly defined histidine kinase (TIGR03785) and response regulator (TIGR03787) gene pair occurs exclusively in Proteobacteria, mostly of marine origin, nearly all of which contain a subfamily 6 sortase (TIGR03784) and its single dedicated target protein (TIGR03788) adjacent to the sortase. This protein family shows up only in those species with the histidine kinase/response regulator gene pair, and often adjacent to that pair. It belongs to the pfam00691 domain family, which is the peptidoglycan-associated region of flagellar motor protein MotB, OmpA (whose N-terminal region forms an outer membrane beta barrel), and peptidoglycan-associated lipoprotein Pal. Its function is unknown. We assign the gene symbol pdsO, for Proteobacterial Dedicated Sortase system OmpA family protein. [Protein fate, Protein and peptide secretion and trafficking]	243
274784	TIGR03790	TIGR03790	TIGR03790 family protein. Despite a broad and sporadic distribution (Cyanobacteria, Verrucomicrobia, Acidobacteria, beta and delta Proteobacteria, and Planctomycetes), this uncharacterized protein family occurs only among the roughly 8 percent of prokaryotic species that carry homologs of the integral membrane protein exosortase (see TIGR02602), a proposed protein-sorting system transpeptidase.	322
163503	TIGR03791	TTQ_mauG	tryptophan tryptophylquinone biosynthesis enzyme MauG. Members of this protein family are the tryptophan tryptophylquinone biosynthesis (TTQ) enzyme MauG, as found in Methylobacterium extorquens and related species. This protein is required to complete the maturation of the TTQ cofactor in the methylamine dehydrogenase light (beta) chain.	291
274785	TIGR03792	TIGR03792	uncharacterized cyanobacterial protein, TIGR03792 family. Members of this family are found, no more than one to a genome, exclusively in (but not universal to) the Cyanobacteria. These proteins are small, 100-150 amino acids. The function is unknown. [Unknown function, General]	90
274786	TIGR03793	TOMM_pelo	NHLP leader peptide domain. This model represents a domain that is conserved among a large number of putative ribosomal natural product (RNP) precursors, including the thiazole/oxazole-modified microcins (TOMMs). As a leader peptide domain, likely to be removed from the mature product, this domain is unusual in several ways. First, it is longer than most previously described RNP leader peptides. Second, most of the domain is homologous to nitrile hydratase alpha subunits. Finally, it appears that this domain correlates with a specific family of cleavage/export proteins while members undergo modifications by different classes of peptide maturase, including cyclodehydratases, lantibiotic synthases, and radical SAM peptide maturases. This family is expanded especially in Pelotomaculum thermopropionicum SI. [Cellular processes, Biosynthesis of natural products]	77
274787	TIGR03794	NHLM_micro_HlyD	NHLM bacteriocin system secretion protein. Members of this protein family are homologs of the HlyD membrane fusion protein of type I secretion systems. Their occurrence in prokaryotic genomes is associated with the occurrence of a novel class of microcin (small bacteriocins) with a leader peptide region related to nitrile hydratase. We designate the class of bacteriocin as Nitrile Hydratase Leader Microcin, or NHLM. This family, therefore, is designated as NHLM bacteriocin system secretion protein. Some but not all NHLM-class putative microcins belong to the TOMM (thiazole/oxazole modified microcin) class as assessed by the presence of the scaffolding protein and/or cyclodehydratase in the same gene clusters. [Transport and binding proteins, Amino acids, peptides and amines, Cellular processes, Biosynthesis of natural products]	421
163507	TIGR03795	RNP_Burkhold	ribosomal natural product, two-chain TOMM family. Members of this protein family are found sparsely, mostly in members of the genus Burkholderia. Members often occur as tandem homologous genes, such as BMA_0021 and BMA_0022 in Burkholderia mallei ATCC 23344, or else have a duplication. The genes regularly are encoded near a cyclodehydrogenase/docking protein fusion protein of TOMM (thiazole/oxazole-modified microcins) biosynthetic clusters, suggesting a role in bacteriocin biosynthesis. The role of the putative natural product is unknown, but function as a two-chain bacteriocin is suggested. [Cellular processes, Biosynthesis of natural products]	114
274788	TIGR03796	NHLM_micro_ABC1	NHLM bacteriocin system ABC transporter, peptidase/ATP-binding protein. This protein describes a multidomain ABC transporter subunit that is one of three protein families associated with some regularity with a distinctive family of putative bacteriocins. It includes a bacteriocin-processing peptidase domain at the N-terminus. Model TIGR03793 describes a conserved propeptide region for this bacteriocin family, unusual because it shows obvious homology to a region of the enzyme nitrile hydratase up to the classic Gly-Gly cleavage motif. This family is therefore predicted to be a subunit of a bacteriocin processing and export system characteristic to this system that we designate NHLM, Nitrile Hydratase Leader Microcin. [Transport and binding proteins, Amino acids, peptides and amines, Cellular processes, Biosynthesis of natural products]	710
274789	TIGR03797	NHLM_micro_ABC2	NHLM bacteriocin system ABC transporter, ATP-binding protein. Members of this protein family are ABC transporter ATP-binding subunits, part of a three-gene putative bacteriocin transport operon. The other subunits include another ATP-binding subunit (TIGR03796), which has an N-terminal leader sequence cleavage domain, and an HlyD homolog (TIGR03794). In a number of genomes, members of protein families related to nitrile hydratase alpha subunit or to nif11 have undergone paralogous family expansions, with members possessing a putative bacteriocin cleavage region ending with a classic Gly-Gly motif. Those sets of putative bacteriocins, members of this protein family and its partners TIGR03794 and TIGR03796, and cyclodehydratase/docking scaffold fusion proteins of thiazole/oxazole biosynthesis frequently show correlated species distribution and co-clustering within many of those genomes. [Transport and binding proteins, Amino acids, peptides and amines, Cellular processes, Biosynthesis of natural products]	686
274790	TIGR03798	ocin_TIGR03798	nif11-like leader peptide domain. This model describes a conserved, fairly long (about 65 residue) leader peptide region for a family of putative ribosomal natural products (RNP) of small size. Members of the seed alignment tend to have the Gly-Gly motif as the last two residues of the matched region. This is a cleavage site for a combination processing/export ABC transporter with a peptidase domain. Members include the prochlorosins, lantipeptides from Prochlorococcus. [Cellular processes, Biosynthesis of natural products]	64
274791	TIGR03799	NOD_PanD_pyr	putative pyridoxal-dependent aspartate 1-decarboxylase. This enzyme is proposed here to be a form of aspartate 1-decarboxylase, pyridoxal-dependent, that represents a non-orthologous displacement to the more widely distributed pyruvoyl-dependent form (TIGR00223). Aspartate 1-decarboxylase makes beta-alanine, used usually in pantothenate biosynthesis, by decarboxylation from aspartate. A number of species with the PanB and PanC enzymes, however, lack PanD. This protein family occurs in a number of Proteobacteria that lack PanD. This enzyme family appears to be a pyridoxal-dependent enzyme (see pfam00282). The family was identified by Partial Phylogenetic Profiling; members in Geobacter sulfurreducens, G. metallireducens, and Pseudoalteromonas atlantica are clustered with the genes for PanB and PanC. We suggest the gene symbol panP (pantothenate biosynthesis enzyme, Pyridoxal-dependent). [Biosynthesis of cofactors, prosthetic groups, and carriers, Pantothenate and coenzyme A]	522
274792	TIGR03800	PLP_synth_Pdx2	pyridoxal 5'-phosphate synthase, glutaminase subunit Pdx2. Pyridoxal 5'-phosphate (PLP) is synthesized by the PdxA/PdxJ pathway in some species (mostly within the gamma subdivision of the proteobacteria) and by the Pdx1/Pdx2 pathway in most other organisms. This family describes Pdx2, the glutaminase subunit of the PLP synthase. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridoxine]	184
163513	TIGR03801	asp_4_decarbox	aspartate 4-decarboxylase. This enzyme, aspartate 4-decarboxylase (EC 4.1.1.12), removes the side-chain carboxylate from L-aspartate, converting it to L-alanine plus carbon dioxide. It is a PLP-dependent enzyme, homologous to aspartate aminotransferase (EC 2.6.1.1). [Energy metabolism, Amino acids and amines]	521
274793	TIGR03802	Asp_Ala_antiprt	aspartate-alanine antiporter. All members of the seed alignment for this model are aspartate-alanine anti-transporters (AspT) encoded next to the gene for aspartate 4-decarboxylase (AspD), which converts aspartate to alanine, releasing CO2. The exchange of Asp for Ala is electrogenic, so the AspD/AspT system confers a proton-motive force. This transporter contains two copies of the AspT/YidE/YbjL antiporter duplication domain (TIGR01625).	562
274794	TIGR03803	Gloeo_Verruco	Gloeo_Verruco repeat. This model describes a rare protein repeat, found so far in two species of Verrucomicrobia (Chthoniobacter flavus and Verrucomicrobium spinosum) and in four different proteins of Gloeobacter violaceus PCC7421. In the Verrucomicrobial species, the repeat region is followed by a PEP-CTERM protein-sorting signal, suggesting an extracellular location.	34
274795	TIGR03804	para_beta_helix	parallel beta-helix repeat (two copies). This model represents a tandem pair of an approximately 22-amino acid (each) repeat homologous to the beta-strand repeats that stack in a right-handed parallel beta-helix in the periplasmic C-5 mannuronan epimerase, AlgG, of Pseudomonas aeruginosa. A homology domain consisting of a longer tandem array of these repeats is described in the SMART database as CASH (SM00722), and is found in many carbohydrate-binding proteins and sugar hydrolases. A single repeat is represented by SM00710. This TIGRFAMs model represents a flavor of the parallel beta-helix-forming repeat based on prokaryotic sequences only in its seed alignment, although it also finds many eukaryotic sequences.	44
163517	TIGR03805	beta_helix_1	parallel beta-helix repeat-containing protein. Members of this protein family contain a tandem pair of beta-helix repeats (see TIGR03804). Each repeat is expected to consist of three beta strands that form a single turn as they form a right-handed helix of stacked beta-structure. Member proteins occur regularly in two-gene pairs along with another uncharacterized protein family; both protein families exhibit either lipoprotein or regular signal peptides, suggesting transit through the plasma membrane, and the two may be fused. The function of the pair is unknown. [Unknown function, General]	314
163518	TIGR03806	chp_HNE_0200	conserved hypothetical protein, HNE_0200 family. The model TIGR03805 describes an uncharacterized protein family that contains repeats associated with the formation of a right-handed helical stack of parallel beta strands, homologous to those found in a number of carbohydrate-binding proteins and sugar hydrolases. This model describes another uncharacterized protein family, found in the same species as TIGR03805 member proteins, usually as the adjacent gene or in a fusion protein. An example is HNE_0200 from Hyphomonas neptunium ATCC 15444. Sometimes two members of this family are with a single member of TIGR03805. The function is unknown. [Hypothetical proteins, Conserved]	317
213864	TIGR03807	RR_fam_repeat	putative cofactor-binding repeat. This model describes a small repeat found in a family of proteins that crosses the plasma membrane by twin-arginine translocation, which usually signifies the presence of a bound cofactor. This repeat shows similarity to the beta-helical repeat, in which three beta-strands per repeat wind once per repeat around in a right-handed helical stack of parallel beta structure.	27
163520	TIGR03808	RR_plus_rpt_1	twin-arg-translocated uncharacterized repeat protein. Members of this protein family have a Sec-independent twin-arginine translocation (TAT) signal sequence, which enables transfer of proteins folded around prosthetic groups across the plasma membrane. These proteins have four copies of a repeat of about 23 amino acids that resembles the beta-helix repeat. Beta-helix refers to a structural motif in which successive beta strands wind around to stack parallel in a right-handed helix, as in AlgG and related enzymes of carbohydrate metabolism. The twin-arginine motif suggests that members of this protein family bind some unknown cofactor.	455
163521	TIGR03809	TIGR03809	TIGR03809 family protein. This protein family contains proteins with a median length of about 175, including a strongly conserved N-terminal region of about 55 amino acids, a conserved extreme C-terminal region of about 15 amino acids, and highly variable sequence in between the two. Members are found invariably with a member of family TIGR03808.	168
163522	TIGR03810	arg_ornith_anti	arginine-ornithine antiporter. Members of this protein family are the arginine/ornithine antiporter, ArcD. This exchanger of ornithine for arginine occurs in a system with arginine deiminase, ornithine carbamoyltransferase, and carbamate kinase, which together turn arginine to ornithine with the generation of ATP and release of CO2. [Transport and binding proteins, Amino acids, peptides and amines]	468
163523	TIGR03811	tyr_de_CO2_Ent	tyrosine decarboxylase, Enterococcus type. This model represents tyrosine decarboxylases in the family of the Enterococcus faecalis enzyme Tdc. These enzymes often are encoded next to tyrosine/tyramine antiporter, together comprising a system in which tyrosine decarboxylation can protect against exposure to acid conditions. This clade differs from the archaeal tyrosine decarboxylases associated with methanofuran biosynthesis. [Cellular processes, Adaptations to atypical conditions]	608
274796	TIGR03812	tyr_de_CO2_Arch	tyrosine decarboxylase MnfA. Members of this protein family are the archaeal form, MnfA, of tyrosine decarboxylase, and are involved in methanofuran biosynthesis. Members show clear homology to the Enterococcus form, Tdc, that is involved in tyrosine decarboxylation for resistance to acidic conditions. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	373
163525	TIGR03813	put_Glu_GABA_T	putative glutamate/gamma-aminobutyrate antiporter. Members of this protein family are putative glutamate/gamma-aminobutyrate antiporters. Each member of the seed alignment is found adjacent to a glutamate decarboxylase, which converts glutamate (Glu) to gamma-aminobutyrate (GABA). However, the majority belong to genome contexts with a glutaminase (converts Gln to Glu) as well as the decarboxylase that converts Glu to GABA. The specificity of the transporter remains uncertain.	474
274797	TIGR03814	Gln_ase	glutaminase A. This family describes the enzyme glutaminase, from a larger family that includes serine-dependent beta-lactamases and penicillin-binding proteins. Many bacteria have two isozymes. This model is based on selected known glutaminases and their homologs within prokaryotes, with the exclusion of highly-derived (long branch) and architecturally varied homologs, so as to achieve conservative assignments. A sharp drop in scores occurs below 250, and cutoffs are set accordingly. The enzyme converts glutamine to glutamate, with the release of ammonia. Members tend to be described as glutaminase A (glsA), where B (glsB) is unknown and may not be homologous (as in Rhizobium etli). Some species have two isozymes that may both be designated A (GlsA1 and GlsA2). [Energy metabolism, Amino acids and amines]	300
274798	TIGR03815	CpaE_hom_Actino	helicase/secretion neighborhood CpaE-like protein. Members of this protein family belong to the MinD/ParA family of P-loop NTPases, and in particular show homology to the CpaE family of pilus assembly proteins. Nearly all members are found, not only in a gene context consistent with pilus biogenesis or a pilus-like secretion apparatus, but also near a DEAD/DEAH-box helicase, suggesting an involvement in DNA transfer activity. The model describes a clade restricted to the Actinobacteria.	322
274799	TIGR03816	tadE_like_DECH	helicase/secretion neighborhood TadE-like protein. Members of this small, highly hydrophobic protein family occur in a pilus/secretion-like region that usually is next to an uncharacterized DEAH-box helicase, in Actinobacteria. Members show sequence similarity to the TadE-like family described by pfam07811. The function is unknown. [Unknown function, General]	109
274800	TIGR03817	DECH_helic	helicase/secretion neighborhood putative DEAH-box helicase. A conserved gene neighborhood widely spread in the Actinobacteria contains this uncharacterized DEAH-box family helicase encoded convergently towards an operon of genes for proteins homologous to type II secretion and pilus formation proteins. The context suggests that this helicase may play a role in conjugal transfer of DNA.	742
274801	TIGR03818	MotA1	flagellar motor stator protein MotA. The MotA protein, along with its partner MotB, comprise the stator complex of the bacterial flagellar motor. MotAB span the cytoplasmic membrane and undergo conformational changes powered by the translocation of protons. These conformational changes in turn are communicated to the rotor assembly, producing torque. This model represents one family of MotA proteins which are often not identified by the "transporter, MotA/TolQ/ExbB proton channel family" model, pfam01618.	282
200328	TIGR03819	heli_sec_ATPase	helicase/secretion neighborhood ATPase. Members of this protein family comprise a distinct clade of putative ATPase associated with an integral membrane complex likely to act in pilus formation, secretion, or conjugal transfer. The association of most members with a nearby gene for a DEAH-box helicase suggests a role in conjugal transfer.	340
163532	TIGR03820	lys_2_3_AblA	lysine-2,3-aminomutase. This model describes lysine-2,3-aminomutase as found along with beta-lysine acetyltransferase in a two-enzyme pathway for making the compatible solute N-epsilon-acetyl-beta-lysine. This compatible solute, or osmolyte, is known to protect a number of methanogenic archaea against salt stress. The trusted cutoff distinguishes a tight clade with essentially full-length homology from additional homologs that are shorter or highly diverged in the C-terminal region. All members of this family have the radical SAM motif CXXXCXXC, while some but not all have a second copy of the motif in the C-terminal region.	417
163533	TIGR03821	EFP_modif_epmB	EF-P beta-lysylation protein EpmB. Members of this radical SAM protein subfamily, including yjeK in E. coli, form a distinctive clade, homologous to lysine-2,3-aminomutase of Bacillus, Clostridium, and methanogenic archaea. Members of this family are found in E. coli, Buchnera, Yersinia, etc. The gene symbol is now reassigned as EpmB (Elongation factor P Modification B). [Protein fate, Protein modification and repair]	321
163534	TIGR03822	AblA_like_2	lysine-2,3-aminomutase-related protein. Members of this protein family form a distinctive clade, homologous to lysine-2,3-aminomutase (of Bacillus, Clostridium, and methanogenic archaea) and likely similar in function. Members of this family are found in Rhodopseudomonas, Caulobacter crescentus, Bradyrhizobium, etc.	321
163535	TIGR03823	FliZ	flagellar regulatory protein FliZ. FliZ is involved in the regulation of flagellar assembly and possibly also the down-regulation of the motile phenotype. FliZ interacts with the flagellar translational activator FlhCD complex.	168
274802	TIGR03824	FlgM_jcvi	flagellar biosynthesis anti-sigma factor FlgM. FlgM interacts with and inhibits the alternative sigma factor sigma(28) FliA. The C-terminus of FlgM contains the sigma(28)-binding domain.	95
274803	TIGR03825	FliH_bacil	flagellar assembly protein FliH. This bacillus clade of FliH proteins is not found by the Pfam FliH model pfam02108, but is closely related to the sequences identified by that model. Sequences identified by this model are observed in flagellar operons in an analogous position relative to other flagellar operon genes.	255
163538	TIGR03826	YvyF	flagellar operon protein TIGR03826. This gene is found in flagellar operons of Bacillus-related organisms. Its function has not been determined and an official gene symbol has not been assigned, although the gene is designated yvyF in B. subtilis. A tentative assignment as a regulator is suggested in the NCBI record GI:16080597.	137
163539	TIGR03827	GNAT_ablB	putative beta-lysine N-acetyltransferase. Members of this protein family are GNAT family acetyltransferases, based on a seed alignment in which every member is associated with a lysine 2,3-aminomutase family protein, usually as the adjacent gene. This family includes AblB, the enzyme beta-lysine acetyltransferase that completes the two-step synthesis of the osmolyte (compatible solute) N-epsilon-acetyl-beta-lysine; all members of the family may have this function. Note that N-epsilon-acetyl-beta-lysine has been observed only in methanogenic archaea (e.g. Methanosarcina) but that this model, paired with TIGR03820, suggests a much broader distribution.	266
274804	TIGR03828	pfkB	1-phosphofructokinase. This enzyme acts in concert with the fructose-specific phosphotransferase system (PTS) which imports fructose as fructose-1-phosphate. The action of 1-phosphofructokinase results in beta-D-fructose-1,6-bisphosphate and is an entry point into glycolysis (GenProp0688).	304
163541	TIGR03829	YokU_near_AblA	uncharacterized protein, YokU family. Members of this protein family occur in various species of the genus Bacillus, always next to the gene (kamA or ablA) for lysine 2,3-aminomutase. Members have a pair of CXXC motifs, and share homology to the amino-terminal region of a family of putative transcription factors for which the C-terminal is modeled by pfam01381, a helix-turn-helix domain model. This family, however, is shorter and lacks the helix-turn-helix region. The function of this protein family is unknown, but a regulatory role in compatible solute biosynthesis is suggested by local genome context. [Unknown function, General]	89
274805	TIGR03830	CxxCG_CxxCG_HTH	putative zinc finger/helix-turn-helix protein, YgiT family. This model describes a family of predicted regulatory proteins with a conserved zinc finger/HTH architecture. The amino-terminal region contains a novel domain, featuring two CXXC motifs and occurring in a number of small bacterial proteins as well as in the present family. The carboxyl-terminal region consists of a helix-turn-helix domain, modeled by pfam01381. The predicted function is DNA binding and transcriptional regulation.	127
274806	TIGR03831	YgiT_finger	YgiT-type zinc finger domain. This domain model describes a small domain with two copies of a putative zinc-binding motif CXXC (usually CXXCG). Most member proteins consist largely of this domain or else carry an additional C-terminal helix-turn-helix domain, resembling that of the phage protein Cro and modeled by pfam01381.	46
163544	TIGR03832	Tyr_2_3_mutase	tyrosine 2,3-aminomutase. Members of this protein family are tyrosine 2,3-aminomutase. It is variable from member to member as to whether the (R)-beta-Tyr or (S)-beta-Tyr is the preferred product from L-Tyr. This enzyme tends to occur in secondary metabolite biosynthesis systems, as in the production of chondramides in Chondromyces crocatus. This class of enzyme has a prosthetic group, MIO (4-methylideneimidazol-5-one), that forms posttranslationally from an Ala-Ser-Gly motif.	507
274807	TIGR03833	TIGR03833	conserved hypothetical protein. A pair of adjacent genes, ablAB (acetyl-beta-lysine biosynthesis) encodes lysine 2,3-aminomutase and beta-lysine acetyltransferase in methanogenic archaea. Homologous pairs, possibly with identical function, occur in a wide range of species, including Bacillus subtilis. This model describes a conserved hypothetical protein, small in size, with a phylogenetic distribution moderately well correlated to that of the acetyltransferase family. This protein family is also described as DUF2196 and COG4895. The function is unknown. [Hypothetical proteins, Conserved]	62
213869	TIGR03834	EAGR_box	EAGR box. The EAGR box (Enriched in Aromatic and Glycine Residues) is found in three different proteins of the Mycoplasma genitalium terminal organelle, which acts in both cytadherence and gliding motility. The presence of this domain in a genome predicts the Mycoplasma-type terminal organelle structure, gliding motility, and cytadherence. The EAGR box may occur from one to nine times in a protein.	28
274808	TIGR03835	termin_org_DnaJ	terminal organelle assembly protein TopJ. This model describes TopJ (MG_200, CbpA), a DnaJ homolog and probable assembly protein of the Mycoplasma terminal organelle. The terminal organelle is involved in both cytadherence and gliding motility. [Cellular processes, Chemotaxis and motility]	871
163548	TIGR03836	termin_org_HMW1	cytadherence high molecular weight protein 1 N-terminal region. This model describes the N-terminal region of the Mycoplasma cytadherence protein HMW1, up to but not including the first EAGR box domain. The apparent orthologs in different Mycoplasma species differ profoundly in architecture C-terminally to the region described here.	82
274809	TIGR03837	EarP	Elongation-Factor P (EF-P) rhamnosyltransferase EarP. This model describes a conserved protein that typically is encoded next to the gene efp for translation elongation factor P.	371
274810	TIGR03838	queuosine_YadB	glutamyl-queuosine tRNA(Asp) synthetase. This protein resembles a shortened glutamyl-tRNA ligase, but its purpose is to modify tRNA(Asp) at a queuosine position in the anticodon rather than to charge a tRNA with its cognate amino acid. [Protein synthesis, tRNA and rRNA base modification]	271
274811	TIGR03839	termin_org_P1	adhesin P1. Members of this protein family are the major adhesin of the Mycoplasma terminal organelle. The protein is called adhesin P1, cytadhesin P1, P140, attachment protein, and MgPa, with locus names MG191 in Mycoplasma genitalium and MPN141 in M. pneumoniae. A conserved C-terminal region is shared by additional paralogs in M. pneumoniae and M. gallisepticum, as well as by the member of this family. [Cell envelope, Surface structures, Cellular processes, Pathogenesis]	1425
213871	TIGR03840	TMPT_Se_Te	thiopurine S-methyltransferase, Se/Te detoxification family. Members of this family are thiopurine S-methyltransferase from a branch in which at least some member proteins can perform selenium methylation as a means to detoxify selenium, or perform a related detoxification of tellurium. Note that the EC number definition does not specify a particular thiopurine, but rather represents a class of activity.	213
274812	TIGR03841	F420_Rv3093c	probable F420-dependent oxidoreductase, Rv3093c family. This model describes a small family of enzymes in the bacterial luciferase-like monooxygenase family, which includes F420-dependent enzymes such as N5,N10-methylenetetrahydromethanopterin reductase as well as FMN-dependent enzymes. All members of this family are from species that produce coenzyme F420; SIMBAL analysis suggests that members of this family bind F420 rather than FMN. [Unknown function, Enzymes of unknown specificity]	301
163554	TIGR03842	F420_CPS_4043	F420-dependent oxidoreductase, CPS_4043 family. This model represents a family of putative F420-dependent oxidoreductases, fairly closely related to 5,10-methylenetetrahydromethanopterin reductase (mer, TIGR03555), both within the bacterial luciferase-like monooxygenase (LLM) family. A fairly deep split (to about 40 % sequence identity) in the present family separates a strictly Actinobacterial clade from an alpha/beta/gamma-proteobacterial clade, in which the member is often the only apparent F420-dependent LLM family member. The specific function, and whether Actinobacterial and Proteobacterial clades differ in function, are unknown. [Unknown function, Enzymes of unknown specificity]	330
274813	TIGR03843	TIGR03843	conserved hypothetical protein. This model represents a protein family largely restricted to the Actinobacteria (high-GC Gram-positives), although it is also found in the Chloroflexi. Distant similarity to the phosphatidylinositol 3- and 4-kinase is suggested by the matching of some members to pfam00454.	226
163556	TIGR03844	cysteate_syn	cysteate synthase. Members of this family are cysteate synthase, an enzyme of an alternate pathway to sulfopyruvate, a precursor of coenzyme M. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other, Energy metabolism, Methanogenesis]	398
163557	TIGR03845	sulfopyru_alph	sulfopyruvate decarboxylase, alpha subunit. This model represents the alpha subunit, or the N-terminal region, of sulfopyruvate decarboxylase, an enzyme of coenzyme M biosynthesis. Coenzyme M is found almost exclusively in the methanogenic archaea. However, the enzyme also occurs in Roseovarius nubinhibens ISM in a degradative pathway, where the resulting sulfoacetaldehyde is desulfonated to acetyl phosphate, then converted to acetyl-CoA. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other, Energy metabolism, Methanogenesis]	157
274814	TIGR03846	sulfopy_beta	sulfopyruvate decarboxylase, beta subunit. Nearly every member of this protein family is the beta subunit, or else the C-terminal region, of sulfopyruvate decarboxylase, in an archaeal species capable of coenzyme M biosynthesis. However, the enzyme also occurs in Roseovarius nubinhibens ISM in a degradative pathway, where the resulting sulfoacetaldehyde is desulfonated to acetyl phosphate, then converted to acetyl-CoA.	181
213872	TIGR03847	TIGR03847	conserved hypothetical protein. The conserved hypothetical protein described here occurs as part of the trio of uncharacterized proteins common in the Actinobacteria.	177
163560	TIGR03848	MSMEG_4193	probable phosphomutase, MSMEG_4193 family. A three-gene system broadly conserved among the Actinobacteria includes MSMEG_4193 and homologs, a subgroup among the larger phosphoglycerate mutase family protein (pfam00300). Another member of the trio is a probable kinase, related to phosphatidylinositol kinases; that context supports the hypothesis that this protein acts as a phosphomutase.	204
163561	TIGR03849	arch_ComA	phosphosulfolactate synthase. This model finds the ComA (Coenzyme M biosynthesis A) protein, phosphosulfolactate synthase, in methanogenic archaea. The ComABC pathway is one of at least two pathways to the intermediate sulfopyruvate. Coenzyme M occurs rarely and sporadically outside of the archaea, as for epoxide metabolism in Xanthobacter autotrophicus Py2, but candidate phosphosulfolactate synthases from that and other species fall below the cutoff and outside the scope of this model. This model deliberately is narrower in scope than pfam02679. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other, Energy metabolism, Methanogenesis]	237
274815	TIGR03850	bind_CPR_0540	carbohydrate ABC transporter substrate-binding protein, CPR_0540 family. Members of this protein family are the substrate-binding protein of a predicted carbohydrate transporter operon, together with permease subunits of ABC transporter homology families. This substrate-binding protein frequently co-occurs in genomes with a family of disaccharide phosphorylases, TIGR02336, suggesting that the molecule transported will include beta-D-galactopyranosyl-(1->3)-N-acetyl-D-glucosamine and related carbohydrates. Members of this family are distributed sporadically, strain by strain, often in species with a human host association, including Propionibacterium acnes, Clostridium perfringens, and Bacillus cereus. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	437
274816	TIGR03851	chitin_NgcE	carbohydrate ABC transporter, N-acetylglucosamine/diacetylchitobiose-binding protein. Members of this protein family are the substrate-binding protein, a lipid-anchored protein of Gram-positive bacteria in all examples found so far, that include NgcE of the chitin-degrader, Streptomyces olivaceoviridis, and close homologs from other species likely to share the same function. NgcE binds both N-acetylglucosamine and the chitin dimer, N,N'-diacetylchitobiose. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	450
163564	TIGR03852	sucrose_gtfA	sucrose phosphorylase. In the forward direction, this enzyme uses phosphate to cleave sucrose into D-fructose + alpha-D-glucose 1-phosphate. Characterized representatives from Streptococcus mutans and Bifidobacterium adolescentis represent well-separated branches of a molecular phylogenetic tree. In S. mutans, the region including this gene has been associated with neighboring transporter genes and multiple sugar metabolism.	470
163565	TIGR03853	matur_matur	probable metal-binding protein. This model describes a family of small cytosolic proteins, about 80 amino acids in length, in which the eight invariant residues include three His residues and two Cys residues. Two pairs of these invariant residues occur in motifs HxH (where x is A or G) and CxH, both of which suggest metal-binding activity. This protein family was identified by searching with a phylogenetic profile based on an anaerobic sulfatase-maturase enzyme, which contains multiple 4Fe-4S clusters. The linkages by phylogenetic profiling and by iron-sulfur cluster-related motifs together suggest this protein may be an accessory protein to certain maturases in sulfatase/maturase systems.	77
163566	TIGR03854	F420_MSMEG_3544	probable F420-dependent oxidoreductase, MSMEG_3544 family. Coenzyme F420 has a limited phylogenetic distribution, including methanogenic archaea, Mycobacterium tuberculosis and related species, Colwellia psychrerythraea 34H, Rhodopseudomonas palustris HaA2, and others. Partial phylogenetic profiling identifies protein subfamilies, within the larger family called luciferase-like monooxygenases (pfam00296), that appear only in F420-positive genomes and are likely to be F420-dependent. This model describes a small family, closely related to other such families in the putative F420-binding region, exemplified by MSMEG_3544 in Mycobacterium smegmatis. [Unknown function, Enzymes of unknown specificity]	290
163567	TIGR03855	NAD_NadX	aspartate dehydrogenase. Members of this protein family are L-aspartate dehydrogenase, as shown for the NADP-dependent enzyme TM_1643 of Thermotoga maritima. Members lack homology to NadB, the aspartate oxidase (EC 1.4.3.16) of most mesophilic bacteria (described by TIGR00551), which this enzyme replaces in the generation of oxaloacetate from aspartate for the NAD biosynthetic pathway. All members of the seed alignment are found adjacent to other genes of NAD biosynthesis, although other uses of L-aspartate dehydrogenase may occur.	229
213873	TIGR03856	F420_MSMEG_2906	probable F420-dependent oxidoreductase, MSMEG_2906 family. This model describes a small family of enzymes in the bacterial luciferase-like monooxygenase family, which includes F420-dependent enzymes such as N5,N10-methylenetetrahydromethanopterin reductase as well as FMN-dependent enzymes. All members of this family are from species that produce coenzyme F420; SIMBAL analysis suggests that members of this family bind F420 rather than FMN. [Unknown function, Enzymes of unknown specificity]	249
213874	TIGR03857	F420_MSMEG_2249	probable F420-dependent oxidoreductase, MSMEG_2249 family. Coenzyme F420 has a limited phylogenetic distribution, including methanogenic archaea, Mycobacterium tuberculosis and related species, Colwellia psychrerythraea 34H, Rhodopseudomonas palustris HaA2, and others. Partial phylogenetic profiling identifies protein subfamilies, within the larger family called luciferase-like monooxygenases (pfam00296), that appear only in F420-positive genomes and are likely to be F420-dependent. This model describes a distinctive subfamily, found only in F420-biosynthesizing members of the Actinobacteria, of the bacterial luciferase-like monooxygenase (LLM) superfamily. [Unknown function, Enzymes of unknown specificity]	329
274817	TIGR03858	LLM_2I7G	probable oxidoreductase, LLM family. This model describes a highly conserved, somewhat broadly distributed family within the luciferase-like monooxygenase (LLM) superfamily. Most members are from species incapable of synthesizing coenzyme F420, which is bound by some members of the LLM superfamily. Members, therefore, are more likely to use FMN as a cofactor.	337
274818	TIGR03859	PQQ_PqqD	coenzyme PQQ biosynthesis protein PqqD. This model identifies PqqD, a protein involved in the final steps of the biosynthesis of pyrroloquinoline quinone, coenzyme PQQ. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	81
274819	TIGR03860	FMN_nitrolo	FMN-dependent oxidoreductase, nitrilotriacetate monooxygenase family. This model represents a distinctive clade, in which all characterized members are FMN-binding, within the larger family of luciferase-like monooxygenases (LLM), among which there are both FMN- and F420-binding enzymes. A well-characterized member is nitrilotriacetate monooxygenase from Aminobacter aminovorans (Chelatobacter heintzii), where nitrilotriacetate is a chelating agent used in detergents. [Unknown function, Enzymes of unknown specificity]	422
163573	TIGR03861	phenyl_ABC_PedC	alcohol ABC transporter, permease protein. Members of this protein family, part of a larger class of efflux-type ABC transport permease proteins, are found exclusively in genomic contexts with pyrroloquinoline-quinone (PQQ) biosynthesis enzymes and/or PQQ-dependent alcohol dehydrogenases, such as the phenylethanol dehydrogenase PedE of Pseudomonas putida U. Members include PedC, an apparent phenylethanol transport protein whose suggested role is efflux to limit intracellular concentrations of toxic metabolites during phenylethanol catalysis. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	253
274820	TIGR03862	flavo_PP4765	uncharacterized flavoprotein, PP_4765 family. This model describes a sharply distinctive clade of proteins within the larger family of flavoproteins described by pfam03486 and TIGRFAMs model TIGR00275. The function is unknown.	376
274821	TIGR03863	PQQ_ABC_bind	ABC transporter, substrate binding protein, PQQ-dependent alcohol dehydrogenase system. Members of this protein family are putative substrate-binding proteins of an ABC transporter family that associates, in gene neighborhood and phylogenomic profile, with pyrroloquinoline-quinone (PQQ)-dependent degradation of certain alcohols, such as 2-phenylethanol in Pseudomonas putida U. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	347
274822	TIGR03864	PQQ_ABC_ATP	ABC transporter, ATP-binding subunit, PQQ-dependent alcohol dehydrogenase system. Members of this protein family are the ATP-binding subunit of an ABC transporter system that is associated with PQQ biosynthesis and PQQ-dependent alcohol dehydrogenases. While this family shows homology to several efflux ABC transporter subunits, the presence of a periplasmic substrate-binding protein and association with systems for catabolism of alcohols suggests a role in import rather than detoxification. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	236
274823	TIGR03865	PQQ_CXXCW	PQQ-dependent catabolism-associated CXXCW motif protein. Members of this protein family have a CXXCW motif, consistent with a possible role in redox cofactor binding. This protein family shows strong relationships by phylogenetic profiling and conserved gene neighborhoods with a transport system for alcohols metabolized by PQQ-dependent enzymes.	162
274824	TIGR03866	PQQ_ABC_repeats	PQQ-dependent catabolism-associated beta-propeller protein. Members of this protein family consist of seven repeats each of the YVTN family beta-propeller repeat (see TIGR02276). Members occur invariably as part of a transport operon that is associated with PQQ-dependent catabolism of alcohols such as phenylethanol.	310
274825	TIGR03867	MprA_tail	MprA protease C-terminal rhombosortase-interaction domain. This model describes the Ralstonia lineage variant of the GlyGly-CTERM domain (TIGR03501), a predicted target for protein sorting and cleavage by rhombosortase, a member of the family of rhomboid proteases. Note that some MprA family proteases are full-length homologs except for the lack of this domain. All members of the present family are predicted serine proteases.	27
274826	TIGR03868	F420-O_ABCperi	proposed F420-0 ABC transporter, periplasmic F420-0 binding protein. This small clade of ABC-type transporter periplasmic binding protein components is found as a three gene cassette along with a permease (TIGR03869) and an ATPase (TIGR03873). The organisms containing this cassette are all Actinobacteria and all contain numerous genes requiring the coenzyme F420. This model was defined based on five such organisms, four of which are lacking all F420 biosynthetic capability save the final side-chain polyglutamate attachment step (via the gene cofE: TIGR01916). In Jonesia denitrificans DSM 20603 and marine actinobacterium PHSC20C1 this cassette is in an apparent operon with the cofE gene and, in PHSC20C1, also with an F420-dependent glucose-6-phosphate dehydrogenase (TIGR03554). Based on these observations we propose that this periplasmic binding protein is a component of an F420-0 (that is, F420 lacking only the polyglutamate tail) transporter.	287
163581	TIGR03869	F420-0_ABCperm	proposed F420-0 ABC transporter, permease protein. This small clade of ABC-type transporter permease protein components is found as a three gene cassette along with a periplasmic substrate-binding protein (TIGR03868) and an ATPase (TIGR03873). The organisms containing this cassette are all Actinobacteria and all contain numerous genes requiring the coenzyme F420. This model was defined based on five such organisms, four of which are lacking all F420 biosynthetic capability save the final side-chain polyglutamate attachment step (via the gene cofE: TIGR01916). In Jonesia denitrificans DSM 20603 and marine actinobacterium PHSC20C1 this cassette is in an apparent operon with the cofE gene and, in PHSC20C1, also with an F420-dependent glucose-6-phosphate dehydrogenase (TIGR03554). Based on these observations we propose that this permease protein is a component of an F420-0 (that is, F420 lacking only the polyglutamate tail) transporter.	325
274827	TIGR03870	ABC_MoxJ	methanol oxidation system protein MoxJ. This predicted periplasmic protein, called MoxJ or MxaJ, is required for methanol oxidation in Methylobacterium extorquens. Two differing lines of evidence suggest two different roles. Forming one view, homology suggests it is the substrate-binding protein of an ABC transporter associated with methanol oxidation. The gene, furthermore, is found regularly in genomes with, and only two or three genes away from, a corresponding permease and ATP-binding cassette gene pair. The other view is that this protein is an accessory factor or additional subunit of methanol dehydrogenase itself. Mutational studies show a dependence on this protein for expression of the PQQ-dependent, two-subunit methanol dehydrogenase (MxaF and MxaI) in Methylobacterium extorquens, as if it is a chaperone for enzyme assembly or a third subunit. A homologous N-terminal sequence was found in Paracoccus denitrificans as a 32 kDa third subunit. This protein may, in fact, be both: a component of a periplasmic enzyme that converts methanol to formaldehyde and a component of an ABC transporter that delivers the resulting formaldehyde to the cell's interior. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Energy metabolism, Other]	246
274828	TIGR03871	ABC_peri_MoxJ_2	quinoprotein dehydrogenase-associated probable ABC transporter substrate-binding protein. This protein family, a sister family to TIGR03870, is found more broadly. It occurs in a range of PQQ-biosynthesizing species, not just in known methanotrophs. Interpretation of evidence by homology and by direct experimental work suggests two different roles. By homology, this family appears to be the periplasmic substrate-binding protein of an ABC transport family. However, mutational studies and direct characterization for some sequences related to this family suggest this family may act as a maturation chaperone or additional subunit of a methanol dehydrogenase-like enzyme.	232
274829	TIGR03872	cytochrome_MoxG	cytochrome c(L), periplasmic. This model describes a periplasmic c-type cytochrome that serves as the primary electron acceptor for the quinoprotein methanol dehydrogenase, a PQQ enzyme. The member from Paracoccus denitrificans is also characterized as an electron acceptor for methylamine dehydrogenase, a tryptophan tryptophylquinone enzyme. This protein is called cytochrome c(L) in methylotrophic bacteria such as Methylobacterium extorquens, but c551i in Paracoccus denitrificans. [Energy metabolism, Electron transport]	133
163585	TIGR03873	F420-0_ABC_ATP	proposed F420-0 ABC transporter, ATP-binding protein. This small clade of ABC-type transporter ATP-binding protein components is found as a three gene cassette along with a periplasmic substrate-binding protein (TIGR03868) and a permease (TIGR03869). The organisms containing this cassette are all Actinobacteria and all contain numerous genes requiring the coenzyme F420. This model was defined based on five such organisms, four of which are lacking all F420 biosynthetic capability save the final side-chain polyglutamate attachment step (via the gene cofE: TIGR01916). In Jonesia denitrificans DSM 20603 and marine actinobacterium PHSC20C1 this cassette is in an apparent operon with the cofE gene and, in PHSC20C1, also with an F420-dependent glucose-6-phosphate dehydrogenase (TIGR03554). Based on these observations we propose that this ATP-binding protein is a component of an F420-0 (that is, F420 lacking only the polyglutamate tail) transporter.	256
163586	TIGR03874	4cys_cytochr	c-type cytochrome, methanol metabolism-related. This family represents a c-type cytochrome related to (but excluding) cytochrome c-555 of Methylococcus capsulatus. Members contain four invariant Cys residues, including two from a heme-binding motif shared with c-555, and two others.	143
163587	TIGR03875	RNA_lig_partner	RNA ligase partner, MJ_0950 family. This uncharacterized protein family is found in almost exactly the same set of genomes as the Pab1020 family described by model TIGR01209. These pairs are found mostly in Archaea, but also in a few bacteria (e.g. Alkalilimnicola ehrlichei MLHE-1, Aquifex aeolicus). While the partner protein has been described as a homodimeric ligase that has RNA circularization activity, the function of this protein (also called UPF0278) is unknown.	206
274830	TIGR03876	cas_csaX	CRISPR type I-A/APERN-associated protein CsaX. This family comprises a minor CRISPR-associated protein family. It occurs only in the context of the (strictly archaeal) Apern subtype of CRISPR/Cas system, and is further restricted to the Sulfolobales, including Metallosphaera sedula DSM 5348 and multiple species of the genus Sulfolobus.	281
163589	TIGR03877	thermo_KaiC_1	KaiC domain protein, Ph0284 family. Members of this family contain a single copy of the KaiC domain (pfam06745) that occurs in two copies in the circadian clock protein kinase KaiC itself. Members occur primarily in thermophilic archaea and in Thermotoga.	237
274831	TIGR03878	thermo_KaiC_2	KaiC domain protein, AF_0795 family. This KaiC domain-containing protein family occurs sporadically across a broad taxonomic range (Euryarchaeota, Aquificae, Dictyoglomi, Epsilonproteobacteria, and Firmicutes), but exclusively in thermophiles.	259
163591	TIGR03879	near_KaiC_dom	probable regulatory domain. This model describes a common domain shared by two different families of proteins, each of which occurs regularly next to its corresponding partner family, a probable regulatory protein with homology to KaiC. By implication, this protein family likely is also involved in sensory transduction and/or regulation.	73
163592	TIGR03880	KaiC_arch_3	KaiC domain protein, AF_0351 family. This model represents a rather narrowly distributed archaeal protein family in which members have a single copy of the KaiC domain. This stands in contrast to the circadian clock protein KaiC itself, with two copies of the domain. Members are expected to have weak ATPase activity, by homology to the autokinase/autophosphorylase KaiC itself.	224
163593	TIGR03881	KaiC_arch_4	KaiC domain protein, PAE1156 family. Members of this protein family are archaeal single-domain KaiC-related proteins, homologous to the cyanobacterial circadian clock cycle protein KaiC, an autokinase/autophosphorylase that has two copies of the domain.	229
274832	TIGR03882	cyclo_dehyd_2	bacteriocin biosynthesis cyclodehydratase domain. This model describes a ThiF-like domain of a fusion protein found in clusters associated with the production of TOMMs (thiazole/oxazole-modified microcins), small bacteriocins with characteristic heterocycle modifications. This domain is presumed to act as a cyclodehydratase, as do members of the SagC family modeled by TIGR03603.	164
274833	TIGR03883	DUF2342_F420	uncharacterized protein, coenzyme F420 biosynthesis associated. A phylogenetic tree of the DUF2342 family (TIGR03624) consists of two major branches. One of these branches, modeled here, is found almost entirely in coenzyme F420 biosynthesizing species of the Actinobacterial, Chloroflexi and Archaeal lineages. The few organisms having genes within this family and lacking F420 biosynthesis may either have an undiscovered F420 transporter, or may represent F420-to-FMN revertants. This family includes a Chloroflexus aurantiacus protein whose crystal structure has been determined (PDB:3CMN_A). This has been annotated as a putative hydrolase, but the support for that assertion is untraceable. There is no cofactor present in the structure.	346
163596	TIGR03884	sel_bind_Methan	selenium-binding protein. This model describes a homopentameric selenium-binding protein with a suggested role in selenium transport and delivery to selenophosphate synthase, the SelD protein. This protein family is closely related to pfam01906, but is shorter because of several deleted regions. It is restricted to the archaeal genus Methanococcus.	74
274834	TIGR03885	flavin_revert	probable non-F420 flavinoid oxidoreductase. This model represents a clade of proteins within the larger subfamily TIGR03557. The parent model includes the F420-dependent glucose-6-phosphate dehydrogenase (TIGR03554) and many other proteins. Excepting the members of this family, all members of TIGR03557 occur in species capable of synthesizing coenzyme F420. All members of the seed alignment for this model are from species that lack F420 biosynthesis. It is suggested that members of this family bind FMN, or FO, or a novel flavinoid cofactor, but not F420 per se. [Unknown function, Enzymes of unknown specificity]	315
188401	TIGR03886	lyase_spl_fam	spore photoproduct lyase family protein. This uncharacterized radical SAM domain protein occurs rarely and sporadically in species that include select Alphaproteobacteria and Actinobacteria, and in Deinococcus deserti VCD115. It is a distant but full-length homolog to the Bacillus subtilis spore photoproduct lyase (spl), which monomerizes thymine dimers created as DNA damage by UV radiation.	346
188402	TIGR03887	thiocyan_alph	thiocyanate hydrolase, gamma subunit. Members of this family are the gamma subunit of thiocyanate hydrolase. This family is closely related to the nitrile hydratase, alpha subunit (TIGR01323).	200
274835	TIGR03888	nitrile_beta	nitrile hydratase, beta subunit. Members of this protein family are the beta subunit of nitrile hydratase. The alpha subunit is represented by model TIGR01323. While nitrile hydratase is given the specific EC number 4.2.1.84, nitriles are a class of compounds, and one genome may carry more than one nitrile hydratase. The enzyme occurs in both non-heme iron and non-corrin cobalt forms. [Energy metabolism, Amino acids and amines]	223
188404	TIGR03889	nitrile_acc	nitrile hydratase accessory protein. Members of this protein family are found in operons with the alpha and beta subunits of nitrile hydratase, an enzyme with Fe(III) or Co(III) at the active site, and appear to be accessory proteins for maturation or activation of the enzyme. This protein is homologous to the beta subunit (see TIGR03888).	74
188405	TIGR03890	nif11_cupin	nif11 domain/cupin domain protein. Members of this protein family occur exclusively in the Cyanobacteria and contain both a nif11 and a cupin domain. The function is unknown.	171
274836	TIGR03891	thiopep_ocin	thiopeptide-type bacteriocin biosynthesis domain. This domain occurs within longer proteins that contain lantibiotic dehydratase domains (see pfam04737 and pfam04738), and as single-domain proteins in bacteriocin biosynthesis genomic contexts. Three named genes in this family, SioK in Streptomyces sioyaensis, TsrD in Streptomyces laurentii, and NosD in Streptomyces actuosus, all occur in regions associated with thiopeptide biosynthesis. [Cellular processes, Toxin production and resistance]	263
200334	TIGR03892	thiopep_precurs	thiazolylpeptide-type bacteriocin precursor. Members of this protein family are the precursors of a family of small bacteriocins (i.e. microcins) with thiopeptide type modifications, a highly modified subclass of heterocycle-containing peptide antibiotics. Members tend to be found clustered in genomes with proteins recognized by TIGR03891 and proteins/domains annotated as lantibiotic dehydratase (pfam04737, pfam04738), and with a cyclodehydratase/docking protein fusion protein characteristic of heterocycle formation. The seed alignment includes both an N-terminal leader peptide region and a C-terminal low-complexity region consisting mostly of Cys and Ser residues. Members with known function block translation by inhibiting translation factor activity. [Cellular processes, Toxin production and resistance]	43
274837	TIGR03893	lant_SP_1948	type 2 lantibiotic, SP_1948 family. This model recognizes a number of type 2 lantibiotic-type bacteriocins, related to but distinct from the family that includes lichenicidin and mersacidin. Sequence similarity among members consists largely of a 20-residue block of conserved sequence that covers most of the leader peptide region, absent from the mature lantibiotic. This is followed by a region with characteristic composition for lantibiotic precursor regions, rich in Ser and Thr and including a near-invariant Cys near or at the C-terminus, involved in cyclization. Members of this family typically are shorter than 70 amino acids. [Cellular processes, Toxin production and resistance]	61
188409	TIGR03894	chp_P_marinus_1	conserved hypothetical protein, TIGR03894 family. This protein family is restricted to the Prochlorococcus and Synechococcus lineages of the Cyanobacteria, and is sporadic in those lineages. Members average 100 amino acids in length, including a 30-residue, highly polar, low complexity region sandwiched between an N-terminal region of about 60 residues and a C-terminal [KR]VVR[KR]RS motif, both well-conserved. The function is unknown. [Hypothetical proteins, Conserved]	95
274838	TIGR03895	protease_PatA	cyanobactin maturation protease, PatA/PatG family. This model describes a protease domain associated with the maturation of various members of the cyanobactin family of ribosomally produced, heavily modified bioactive metabolites. Members include the PatA protein and C-terminal domain of the PatG protein of Prochloron didemni, TenA and a region of TenG from Nostoc spongiaeforme var. tenue, etc.	602
274839	TIGR03896	cyc_nuc_ocin	bacteriocin-type transport-associated protein. Members of this protein family are uncharacterized and contain two copies of the cyclic nucleotide-binding domain pfam00027. Members are restricted to select cyanobacteria but are found regularly in association with a transport operon that, in turn, is associated with the production of putative bacteriocins. The models describing the transport operon are TIGR03794, TIGR03796, and TIGR03797.	317
274840	TIGR03897	lanti_2_LanM	type 2 lantibiotic biosynthesis protein LanM. Members of this family are known generally as LanM, a multifunctional enzyme of lantibiotic biosynthesis. This catalysis by LanM distinguishes the type 2 lantibiotics, such as mersacidin, cinnamycin, and lichenicidin, from LanBC-produced type 1 lantibiotics such as nisin and subtilin. The N-terminal domain contains regions associated with Ser and Thr dehydration. The C-terminal region contains a pfam05147 domain, which catalyzes the formation of the lanthionine bridge. [Cellular processes, Toxin production and resistance]	931
274841	TIGR03898	lanti_MRSA_kill	type 2 lantibiotic, mersacidin/lichenicidin family. This model recognizes a number of type 2 lantibiotic-type bacteriocins, including mersacidin and lichenicidin. Members often are found as gene pairs encoding two-chain bacteriocins. Maturation is accomplished, at least in part, by a LanM-type enzyme (TIGR03897). This model describes only the leader peptide region. [Cellular processes, Toxin production and resistance]	44
274842	TIGR03899	TIGR03899	TIGR03899 family protein. Members of this protein family are conserved hypothetical proteins with a limited species distribution within the Gammaproteobacteria. It is common in the genera Vibrio and Shewanella, and in this resembles the C-terminal domain and putative protein sorting motif TIGR03501. This model, by design, does not extend to all homologs, but rather represents a particular clade.	250
274843	TIGR03900	prc_long_Delta	putative carboxyl-terminal-processing protease, deltaproteobacterial. This model describes a multidomain protein of about 1070 residues, restricted to the order Myxococcales in the Deltaproteobacteria. Members contain a PDZ domain (pfam00595), an S41 family peptidase domain (pfam03572), and an SH3 domain (pfam06347). A core region of this family, including PDZ and S41 regions, is described by TIGR00225, C-terminal processing peptidase, which recognizes the Prc protease. The species distribution of this family approximates that of the largely Deltaproteobacterial C-terminal putative protein-sorting domain, TIGR03901, analogous to LPXTG and PEP-CTERM, but the co-occurrence may reflect shared restriction to the Myxococcales rather than a substrate/target relationship.	973
274844	TIGR03901	MYXO-CTERM	MYXO-CTERM domain. This model describes MYXO-CTERM, a C-terminal putative protein sorting domain, analogous to LPXTG (TIGR01167) and PEP-CTERM (TIGR02595). It is restricted to the Myxococcales, a division of the Deltaproteobacteria, with over 60 members occurring in Plesiocystis pacifica SIR-1. An example protein is TraA, involved in outer membrane exchange (lipids and proteins) through which one strain of Myxococcus can repair a mobility defect in another. The trusted cutoff for this model is set artificially high to avoid false positives, and consequently only about half of all members are recognized.	31
274845	TIGR03902	rhom_GG_sort	rhomboid family GlyGly-CTERM serine protease. This model describes a rhomboid-like intramembrane serine protease. Its species distribution closely matches model TIGR03501, GlyGly-CTERM, which describes a protein targeting domain analogous to LPXTG and PEP-CTERM. In a number of species (Ralstonia eutropha, R. metallidurans, R. solanacearum, Marinobacter aquaeolei, etc.) with just one GlyGly-CTERM protein (i.e., a dedicated system), the rhombosortase and GlyGly-CTERM genes are adjacent.	154
274846	TIGR03903	TOMM_kin_cyc	TOMM system kinase/cyclase fusion protein. This model represents proteins of 1350 amino acids in length, in multiple species of Burkholderia, in Acidovorax avenae subsp. citrulli AAC00-1 and Delftia acidovorans SPH-1, and in multiple copies in Sorangium cellulosum, in genomic neighborhoods that include a cyclodehydratase/docking scaffold fusion protein (TIGR03882) and a member of the thiazole/oxazole modified metabolite (TOMM) precursor family TIGR03795. It has a kinase domain in the N-terminal 300 amino acids, followed by a cyclase homology domain, followed by regions without named domain definitions. It is a probable bacteriocin-like metabolite biosynthesis protein. [Cellular processes, Toxin production and resistance]	1266
274847	TIGR03904	SAM_YgiQ	uncharacterized radical SAM protein YgiQ. Members of this family are fairly widespread uncharacterized radical SAM family proteins, many of which are designated YgiQ. [Unknown function, Enzymes of unknown specificity]	559
188420	TIGR03905	TIGR03905_4_Cys	uncharacterized protein TIGR03905. This model describes a family of conserved hypothetical proteins of small size, typically ~85 residues, with four invariant Cys residues. This small protein is distantly homologous to a C-terminal domain found in proteins identified by N-terminal homology as ribonucleotide reductases. The rare and sporadic distribution of this protein family falls mostly within the subset of bacterial genomes containing the uncharacterized radical SAM protein modeled by TIGR03904. [Unknown function, General]	78
274848	TIGR03906	quino_hemo_SAM	quinohemoprotein amine dehydrogenase maturation protein. Members of this protein family are radical SAM enzymes responsible for post-translational modifications to the gamma subunit of quinohemoprotein amine dehydrogenases. Ono, et al. suggest that this protein is responsible for intrapeptidyl thioether cross-linking rather than cysteine tryptophylquinone biogenesis in the gamma subunit. [Protein fate, Protein modification and repair]	467
211887	TIGR03907	QH_beta	quinohemoprotein amine dehydrogenase, beta subunit. Quinohemoprotein amine dehydrogenase is a three subunit enzyme with both a heme group and a cysteine tryptophylquinone group derived by post-translational modification of the gamma subunit. This model describes the beta subunit. This enzyme catalyzes oxidative deamination of primary aliphatic and aromatic amines. [Energy metabolism, Amino acids and amines]	338
274849	TIGR03908	QH_alpha	quinohemoprotein amine dehydrogenase, alpha subunit. Quinohemoprotein amine dehydrogenase is a three subunit enzyme with both a heme group and a cysteine tryptophylquinone group derived by post-translational modification of the gamma subunit. This model describes the alpha subunit. This enzyme catalyzes oxidative deamination of primary aliphatic and aromatic amines. [Energy metabolism, Amino acids and amines]	510
188424	TIGR03909	pyrrolys_PylC	pyrrolysine biosynthesis protein PylC. This protein is PylC, part of a three-gene cassette that is sufficient to direct the biosynthesis of pyrrolysine, the twenty-second amino acid, incorporated in some species at a UAG canonical stop codon. [Amino acid biosynthesis, Other]	374
188425	TIGR03910	pyrrolys_PylB	pyrrolysine biosynthesis radical SAM protein. This model describes a radical SAM protein, PylB, that is part of the three-gene cassette sufficient for the biosynthesis of pyrrolysine (the twenty-second amino acid) when expressed heterologously in E. coli. The pyrrolysine next is ligated to its own tRNA and incorporated at special UAG codons. [Amino acid biosynthesis, Other]	347
188426	TIGR03911	pyrrolys_PylD	pyrrolysine biosynthesis protein PylD. This protein is PylD, part of a three-gene cassette that is sufficient to direct the biosynthesis of pyrrolysine, the twenty-second amino acid, incorporated in some species at a UAG canonical stop codon. [Amino acid biosynthesis, Other]	266
188427	TIGR03912	PylS_Nterm	pyrrolysyl-tRNA synthetase, N-terminal region. PylS is the enzyme responsible for charging the pyrrolysine tRNA, PylT, by ligating a free molecule of pyrrolysine. Pyrrolysine is encoded at an in-frame UAG (amber) at least in several corrinoid-dependent methyltransferases of the archaeal genera Methanosarcina and Methanococcoides, such as trimethylamine methyltransferase. This protein occurs as a fusion protein in Methanosarcina but as split genes in Desulfitobacterium hafniense and other bacteria. This model describes the small, N-terminal region. [Protein synthesis, tRNA aminoacylation]	89
188428	TIGR03913	rad_SAM_trio	Y_X(10)_GDL-associated radical SAM protein. This narrowly distributed protein family contains an N-terminal radical SAM domain. It occurs in Pseudomonas fluorescens Pf0-1, Ralstonia solanacearum, and numerous species and strains of Burkholderia. Members always occur next to a trio of three mutually homologous genes, all of which contain the domain pfam08898 as the whole of the protein (about 60 amino acids) or as the C-terminal domain. The function is unknown, but the fact that all phylogenetically correlated proteins are mutually homologous with prominent invariant motifs (an invariant tyrosine and a GDL motif) and as small as 60 amino acids suggests that post-translational modification of pfam08898 domain-containing proteins may be its function. This view is supported by closer homology to the PqqE radical SAM protein involved in PQQ biosynthesis from the PqqA precursor peptide than to other characterized radical SAM proteins. [Unknown function, Enzymes of unknown specificity]	477
274850	TIGR03914	UDG_fam_dom	uracil-DNA glycosylase family domain. This model represents a clade within the uracil-DNA glycosylase superfamily. Among characterized proteins, it most closely resembles the Thermus thermophilus uracil-DNA glycosylase TTUDGA, which acts on uracil (deamidated cytosine) in both single-stranded DNA and U/G pairs of double-stranded DNA. This domain may occur either as a stand-alone protein or as the C-terminal domain of a fusion with another domain that always pairs with a particular radical-SAM family protein.	230
274851	TIGR03915	SAM_7_link_chp	probable DNA metabolism protein. This model represents a conserved hypothetical protein that almost invariably pairs with an uncharacterized radical SAM protein. The pair occurs in about twenty percent of completed prokaryotic genomes. About forty percent of the members of this family occur as fusion proteins, where the C-terminal domain belongs to the uracil-DNA glycosylase family, a DNA repair family (because uracil in DNA is deamidated cytosine). The linkage by gene clustering and correlated species distribution to a radical SAM protein, and by gene fusion to a DNA repair protein family, suggests a role in DNA modification and/or repair.	241
188431	TIGR03916	rSAM_link_UDG	putative DNA modification/repair radical SAM protein. This uncharacterized protein of about 400 amino acids in length contains a radical SAM protein in the N-terminal half. Members are present in about twenty percent of prokaryotic genomes, always paired with a member of the conserved hypothetical protein TIGR03915. Roughly forty percent of the members of that family exist as fusions with a uracil-DNA glycosylase-like region, TIGR03914. In DNA, uracil results from deamidation of cytosine, forming U/G mismatches that lead to mutation, and so uracil-DNA glycosylase is a DNA repair enzyme. This indirect connection, and the recurring role of radical SAM proteins in modification chemistries, suggest that this protein may act in DNA modification, repair, or both. [Unknown function, Enzymes of unknown specificity]	415
274852	TIGR03917	Frankia_40_dom	Frankia-40 domain. This model describes a paralogous domain of length 40, restricted to smaller proteins of the genus Frankia, a member of the Actinobacteria. The function is unknown.	40
274853	TIGR03918	GTP_HydF	[FeFe] hydrogenase H-cluster maturation GTPase HydF. This model describes the family of the [Fe] hydrogenase maturation protein HydF as characterized in Chlamydomonas reinhardtii and found, in an operon with radical SAM proteins HydE and HydG, in numerous bacteria. It has GTPase activity, can bind a 4Fe-4S cluster, and is essential for hydrogenase activity. [Protein fate, Protein modification and repair]	391
274854	TIGR03919	T7SS_EccB	type VII secretion protein EccB, Actinobacterial. This model represents the transmembrane protein EccB of the actinobacterial flavor of type VII secretion systems. Species such as Mycobacterium tuberculosis have several instances of this system per genome, designated EccB1, EccB2, etc. This model does not identify functionally related proteins in the Firmicutes such as Staphylococcus aureus and Bacillus anthracis. [Protein fate, Protein and peptide secretion and trafficking]	456
274855	TIGR03920	T7SS_EccD	type VII secretion integral membrane protein EccD. Members of this family are EccD, a component of actinobacterial type VII secretion systems (T7SS) with ten to eleven predicted transmembrane helix regions. [Protein fate, Protein and peptide secretion and trafficking]	453
274856	TIGR03921	T7SS_mycosin	type VII secretion-associated serine protease mycosin. Members of this family are subtilisin-related serine proteases, found strictly in the Actinobacteria and associated with type VII secretion operons. The designation mycosin is used for members from Mycobacterium. [Protein fate, Protein and peptide secretion and trafficking, Protein fate, Protein modification and repair]	350
188437	TIGR03922	T7SS_EccA	type VII secretion AAA-ATPase EccA. This model represents the AAA family ATPase, EccA, of the actinobacterial flavor of type VII secretion systems. Species such as Mycobacterium tuberculosis have several instances of this system per genome, designated EccA1, EccA2, etc. [Protein fate, Protein and peptide secretion and trafficking]	557
274857	TIGR03923	T7SS_EccE	type VII secretion protein EccE. This model represents the transmembrane protein EccE of the actinobacterial flavor of type VII secretion systems. Species such as Mycobacterium tuberculosis have several instances of this system per genome, designated EccE1, EccE2, etc. This model represents a conserved core region, and many members have 200 or more additional C-terminal residues. [Protein fate, Protein and peptide secretion and trafficking]	341
274858	TIGR03924	T7SS_EccC_a	type VII secretion protein EccCa. This model represents the N-terminal domain or EccCa subunit of the type VII secretion protein EccC as found in the Actinobacteria. Type VII secretion is defined more broadly as including secretion systems for ESAT-6-like proteins in the Firmicutes as well as in the Actinobacteria, but this family does not show close homologs in the Firmicutes. [Protein fate, Protein and peptide secretion and trafficking]	658
274859	TIGR03925	T7SS_EccC_b	type VII secretion protein EccCb. This model represents the C-terminal domain or EccCb subunit of the type VII secretion protein EccC as found in the Actinobacteria. Type VII secretion is defined more broadly as including secretion systems for ESAT-6-like proteins in the Firmicutes as well as in the Actinobacteria, but this family does not show close homologs in the Firmicutes. [Protein fate, Protein and peptide secretion and trafficking]	566
188441	TIGR03926	T7_EssB	type VII secretion protein EssB. Members of this family are associated with type VII secretion of WXG100 family targets in the Firmicutes, but not in the Actinobacteria. This protein is designated YukC in Bacillus subtilis and EssB in Staphylococcus aureus. [Protein fate, Protein and peptide secretion and trafficking]	377
200340	TIGR03927	T7SS_EssA_Firm	type VII secretion protein EssA. Members of this family are associated with type VII secretion of WXG100 family targets in the Firmicutes, but not in the Actinobacteria. This highly divergent protein family consists largely of a central region of highly polar low-complexity sequence containing occasional LF motifs in weak repeats about 17 residues in length, flanked by hydrophobic N- and C-terminal regions. [Protein fate, Protein and peptide secretion and trafficking]	150
274860	TIGR03928	T7_EssCb_Firm	type VII secretion protein EssC, C-terminal domain. This model describes the C-terminal domain, or longer subunit, of the Firmicutes type VII secretion protein EssC. This protein (homologous to EccC in Actinobacteria) and the WXG100 target proteins are the only homologous parts of type VII secretion between Firmicutes and Actinobacteria. [Protein fate, Protein and peptide secretion and trafficking]	1296
274861	TIGR03929	T7_esaA_Nterm	type VII secretion protein EsaA, N-terminal domain. Members of this family are associated with type VII secretion of WXG100 family targets in the Firmicutes, but not in the Actinobacteria. This model represents the conserved N-terminal domain.	193
274862	TIGR03930	WXG100_ESAT6	WXG100 family type VII secretion target. Members of this protein family include secretion targets for the two main variants of type VII secretion systems (T7SS), one found in the Actinobacteria, one found in the Firmicutes. This model was derived through iteration from pfam06013. The best characterized member of this family is ESAT-6 from Mycobacterium tuberculosis. Members of this family usually are ~100 amino acids in length but occasionally have a long C-terminal extension.	90
274863	TIGR03931	T7SS_Rv3446c	type VII secretion-associated protein, Rv3446c family, C-terminal domain. Members of this protein family occur as part of the ESX-4 cluster of type VII secretion system (T7SS) proteins in Mycobacterium tuberculosis and in similar T7SS clusters in other Actinobacteria genera, including Corynebacterium, Nocardia, Rhodococcus, and Saccharopolyspora. This model describes the better-conserved C-terminal region. [Protein fate, Protein and peptide secretion and trafficking]	172
188447	TIGR03932	PIA_icaD	intercellular adhesion protein D. Members of this protein family are IcaD (intercellular adhesion protein D), which with catalytic subunit IcaA forms an N-acetylglucosaminyltransferase. In the absence of IcaC, this enzyme forms N-acetylglucosamine oligomers up to 20 in length. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	88
188448	TIGR03933	PIA_icaB	intercellular adhesin biosynthesis polysaccharide N-deacetylase. A common motif in bacterial biosynthesis of polysaccharide for export is modification that follows polymerization. This model describes a subfamily of polysaccharide N-deacetylases that acts on poly-beta-1,6-N-acetyl-D-glucosamine as produced by Staphylococcus epidermidis and S. aureus. The end product in these species is designated polysaccharide intercellular adhesin (PIA), and this gene designated icaB (intercellular adhesion protein B). [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides, Cellular processes, Pathogenesis]	245
274864	TIGR03934	TQXA_dom	TQXA domain. This model describes a domain of about 40 residues with an invariant TQ dipeptide in an almost invariant TQxA[VI]W motif. This domain occurs in surface-expressed proteins of Gram-positive bacteria, many of which are anchored by LPXTG-containing sortase target domains. Numerous members of this family have domains pfam05738 (Cna protein B-type domain) and pfam08341 (fibronectin-binding protein signal sequence).	42
188450	TIGR03935	fragilysin	fragilysin. Members of this family are fragilysin, the Bacteroides fragilis enterotoxin. This enzyme is a Zn metalloprotease. Three distinct subtypes included in this family all are produced by enterotoxigenic (by definition) strains of Bacteroides fragilis. [Cellular processes, Pathogenesis]	386
274865	TIGR03936	sam_1_link_chp	radical SAM-linked protein. This model describes an uncharacterized protein encoded adjacent to, or as a fusion protein with, an uncharacterized radical SAM protein.	208
274866	TIGR03937	PgaC_IcaA	poly-beta-1,6 N-acetyl-D-glucosamine synthase. Members of this protein family are biofilm-forming enzymes that polymerize N-acetyl-D-glucosamine residues in beta(1,6) linkage. One named member is IcaA (intercellular adhesin protein A), an enzyme that acts (with aid of subunit IcaD) in Polysaccharide Intercellular Adhesin (PIA) biosynthesis in Staphylococcus epidermidis. The homologous member in E. coli is designated PgaC. Members are often encoded next to a polysaccharide deacetylase and involved in biofilm formation. Note that chitin, although also made from N-acetylglucosamine, is formed with beta-1,4 linkages.	407
274867	TIGR03938	deacetyl_PgaB	poly-beta-1,6-N-acetyl-D-glucosamine N-deacetylase PgaB. Two well-characterized systems produce polysaccharide based on N-acetyl-D-glucosamine in straight chains with beta-1,6 linkages. These are encoded by the icaADBC operon in Staphylococcus species, where the system is designated polysaccharide intercellular adhesin (PIA), and the pgaABCD operon in Gram-negative bacteria such as E. coli. Both systems include a putative polysaccharide deacetylase. The PgaB protein, described here, contains an additional domain lacking from its Gram-positive counterpart IcaB (TIGR03933). Deacetylation by this protein appears necessary to allow export through the porin PgaA. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	619
274868	TIGR03939	PGA_TPR_OMP	poly-beta-1,6 N-acetyl-D-glucosamine export porin PgaA. Members of this protein family are the poly-beta-1,6 N-acetyl-D-glucosamine (PGA) export porin PgaA of Gram-negative bacteria. There is no counterpart in the poly-beta-1,6 N-acetyl-D-glucosamine biosynthesis systems of Gram-positive bacteria such as Staphylococcus epidermidis. The PGA polysaccharide adhesin is a critical determinant of biofilm formation. The conserved C-terminal domain of this outer membrane protein is preceded by a variable number of TPR repeats.	800
188455	TIGR03940	PGA_PgaD	poly-beta-1,6-N-acetyl-D-glucosamine biosynthesis protein PgaD. Members of this protein family are PgaD, essential to the production of poly-beta-1,6-N-acetyl-D-glucosamine (PGA). This cytoplasmic membrane protein appears to be an auxiliary subunit to the PGA synthase, PgaC (TIGR03937). [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	141
274869	TIGR03941	tRNA_deam_assoc	putative tRNA adenosine deaminase-associated protein. This model describes a protein family about 200 amino acids in length with only five invariant residues, including an Arg, a Ser-Asp pair, and two Gly residues. Members always are found exclusively in Actinobacteria, and always adjacent to homologs of TadA, a tRNA-specific adenosine deaminase from Escherichia coli. Homology, phyletic pattern, and gene neighborhood together suggest a housekeeping function in tRNA metabolism. [Unknown function, General]	154
188457	TIGR03942	sulfatase_rSAM	anaerobic sulfatase-maturating enzyme. Members of this protein family are radical SAM family enzymes, maturases that prepare the oxygen-sensitive radical required in the active site of anaerobic sulfatases. This maturase role has led to many misleading legacy annotations suggesting that this maturase is instead a sulfatase regulatory protein. All members of the seed alignment are radical SAM enzymes encoded next to or near an anaerobic sulfatase. Note that a single genome may encode more than one sulfatase/maturase pair. [Protein fate, Protein modification and repair]	363
274870	TIGR03943	TIGR03943	TIGR03943 family protein. Members of this family occur in gene pairs with members of pfam03773. The N-terminal region contains several predicted transmembrane helix regions while the few invariant residues (G, CxxD, and W) occur in the C-terminal region.	219
274871	TIGR03944	dehyd_SbnB_fam	2,3-diaminopropionate biosynthesis protein SbnB. Members of this protein family are probable NAD-dependent dehydrogenases related to the alanine dehydrogenase of Archaeoglobus fulgidus (see TIGR02371 and PDB structure 1OMO) and more distantly to ornithine cyclodeaminase. Members include the staphylobactin biosynthesis protein SbnB and tend to occur in contexts suggesting non-ribosomal peptide synthesis, always adjacent to (occasionally fused with) a pyridoxal phosphate-dependent enzyme, SbnA. The pair appears to provide 2,3-diaminopropionate for biosynthesis of siderophores or other secondary metabolites. [Cellular processes, Biosynthesis of natural products]	327
274872	TIGR03945	PLP_SbnA_fam	2,3-diaminopropionate biosynthesis protein SbnA. Members of this family include SbnA, a protein of the staphyloferrin B biosynthesis operon of Staphylococcus aureus. SbnA and SbnB together appear to synthesize 2,3-diaminopropionate, a precursor of certain siderophores and other secondary metabolites. SbnA is a pyridoxal phosphate-dependent enzyme. [Cellular processes, Biosynthesis of natural products]	304
188461	TIGR03946	viomycin_VioC	arginine beta-hydroxylase, Fe(II)/alpha-ketoglutarate-dependent. Members of this protein family are L-arginine beta-hydroxylase, members of a broader family of enzymes dependent on Fe(II), alpha-ketoglutarate, and molecular oxygen. Enzymes in the broader family but excluded by this model include clavaminate synthase, taurine dioxygenase, and prolyl-4-hydroxylase. [Cellular processes, Biosynthesis of natural products]	333
188462	TIGR03947	viomycin_VioD	capreomycidine synthase. Members of this family are the enzyme capreomycidine synthase, which performs the second of two steps in the biosynthesis of 2S,3R-capreomycidine from arginine. Capreomycidine is an unusual amino acid used by non-ribosomal peptide synthases (NRPS) to make the tuberactinomycin class of peptide antibiotic. The best characterized member is VioD from the biosynthetic pathway for viomycin. [Cellular processes, Biosynthesis of natural products]	359
188463	TIGR03948	butyr_acet_CoA	butyryl-CoA:acetate CoA-transferase. This enzyme represents one of at least two mechanisms for reclaiming CoA from butyryl-CoA at the end of butyrate biosynthesis (an important process performed by some colonic bacteria), namely transfer of CoA to acetate. An alternate mechanism transfers the butyrate onto inorganic phosphate, after which butyrate kinase transfers the phosphate onto ADP, creating ATP. [Energy metabolism, Fermentation]	445
274873	TIGR03949	bact_IIb_cerein	class IIb bacteriocin, lactobin A/cerein 7B family. Members of this protein family are described variably as bacteriocins per se, one chain of a two-chain bacteriocin, or bacteriocin enhancer proteins. All members of the seed alignment occur in paired gene contexts with another member of the same protein family. This family includes bacteriocins that appear not to undergo post-translational modification, other than cleavage at a Gly-Gly motif coupled to sec-independent export. For many members, the N-terminal bacteriocin cleavage motif region is recognized by TIGR01847. C-terminal to the cleavage motif, these proteins are hydrophobic and low in complexity, consistent with pore-forming activity as a mechanism of bacteriocin action.	45
274874	TIGR03950	sidero_Fe_reduc	siderophore ferric iron reductase, AHA_1954 family. Members of this protein family are 2Fe-2S cluster binding proteins, found regularly in the context of siderophore transporters. Members are distantly related to FhuF from E. coli, a ferric iron reductase linked to removal of iron from hydroxamate-type siderophores. [Energy metabolism, Electron transport, Transport and binding proteins, Cations and iron carrying compounds]	223
274875	TIGR03951	Fe_III_red_FhuF	siderophore-iron reductase FhuF. Members of this protein family, including FhuF of E. coli, are siderophore ferric iron reductases that appear to play a role in iron removal from certain hydroxamate-type siderophores, including coprogen, ferrichrome, ferrioxamine B, and aerobactin. Genes occur regularly in siderophore transport and/or biosynthesis clusters. The C-terminus includes four Cys residues in a C-C-10(X)-C-X-X-C motif that binds a 2Fe-2S cluster. Family TIGR03950 is similar, especially in the C-terminal region, but likely acts on a different panel of siderophores. [Energy metabolism, Electron transport, Transport and binding proteins, Cations and iron carrying compounds]	182
274876	TIGR03952	metzin_BF0631	zinc-dependent metalloproteinase lipoprotein, BF0631 family. Members of this protein family are zinc-dependent metalloproteinases, related to ulilysin and other members of the pappalysin family. Members occur as predicted lipoproteins and occur mostly in the genera Bacteroides and Prevotella. [Protein fate, Degradation of proteins, peptides, and glycopeptides]	351
274877	TIGR03953	rplD_bact	50S ribosomal protein L4, bacterial/organelle. Members of this protein family are ribosomal protein L4. This model recognizes bacterial and most organellar forms, but excludes homologs from the eukaryotic cytoplasm and from archaea. [Protein synthesis, Ribosomal proteins: synthesis and modification]	188
274878	TIGR03954	integ_memb_HG	integral membrane protein. This model describes a strictly bacterial integral membrane domain of about 85 residues in length. It occurs in proteins that on rare occasions are fused to transporter domains such as the major facilitator superfamily domain. Of three invariant residues, two occur as a His-Gly dipeptide in the middle of three predicted transmembrane helices. [Unknown function, General]	85
274879	TIGR03955	rSAM_HydG	[FeFe] hydrogenase H-cluster radical SAM maturase HydG. This model describes the radical SAM protein HydG. It is part of an enzyme metallocenter maturation system, working together with GTP-binding protein HydF and another radical SAM enzyme, HydE, in H-cluster maturation in [FeFe] hydrogenases. [Protein fate, Protein modification and repair]	471
274880	TIGR03956	rSAM_HydE	[FeFe] hydrogenase H-cluster radical SAM maturase HydE. This model describes the radical SAM protein HydE, one of a pair of radical SAM proteins, along with GTP-binding protein HydF, for maturation of [Fe] hydrogenase in Chlamydomonas reinhardtii and numerous bacteria. [Protein fate, Protein modification and repair]	340
188472	TIGR03957	rSAM_HmdB	5,10-methenyltetrahydromethanopterin hydrogenase cofactor biosynthesis protein HmdB. Members of this archaeal protein family are HmdB, a partially characterized radical SAM protein with an unusual CX5CX2C motif. Its gene flanks the H2-forming methylene-H4-methanopterin dehydrogenase gene hmdA, found in hydrogenotrophic methanogens. HmdB appears to act in biosynthesis of the novel cofactor of HmdA. [Protein fate, Protein modification and repair, Energy metabolism, Methanogenesis]	317
274881	TIGR03958	monoFe_hyd_HmdC	5,10-methenyltetrahydromethanopterin hydrogenase cofactor biosynthesis protein HmdC. Members of this protein family are HmdC, whose gene regularly occurs in the context of genes for HmdA (5,10-methenyltetrahydromethanopterin hydrogenase) and the radical SAM protein HmdB involved in biosynthesis of the HmdA cofactor. Bioinformatics suggests this protein, a homolog of eukaryotic fibrillarin, may be involved in biosynthesis of the guanylyl pyridinol cofactor in HmdA. [Protein fate, Protein modification and repair, Energy metabolism, Methanogenesis]	505
274882	TIGR03959	hyd_TM1266	putative iron-only hydrogenase system regulator. Members of this protein family occur as part of a system for producing iron-only hydrogenases, dependent on radical SAM proteins HydE and HydG and GTPase HydF. One member of this family, TM_1266 from Thermotoga maritima, has a known crystal structure. The small size, about 80 residues, and a distant relationship to the nickel regulator NikR of the CopG transcriptional regulator family suggest a role as a transcription factor. [Regulatory functions, DNA interactions]	76
188475	TIGR03960	rSAM_fuse_unch	radical SAM family uncharacterized protein. This model describes a radical SAM protein, or protein region, regularly found paired with or fused to a region described by TIGR03936. PSI-BLAST analysis of TIGR03936 suggests a relationship to the tRNA pseudouridine synthase TruA, suggesting that this system may act in RNA modification. [Unknown function, Enzymes of unknown specificity]	605
188476	TIGR03961	rSAM_PTO1314	archaeal radical SAM protein, PTO1314 family. Members of this protein family average about 340 residues in length, with a radical SAM domain in the N-terminal 200 residues. The taxonomic distribution is restricted to non-methanogenic archaea, including Picrophilus torridus (locus PTO1314), Sulfolobus sp., Thermoplasma sp., and Metallosphaera sedula. The gene neighborhood is not conserved, and the function of this family is unknown. [Unknown function, Enzymes of unknown specificity]	332
188477	TIGR03962	mycofact_rSAM	mycofactocin radical SAM maturase. Members of this family are uncharacterized radical SAM proteins from Mycobacterium tuberculosis and many other Actinobacteria, as well as some deltaproteobacteria (e.g. Geobacter uraniireducens), firmicutes (Pelotomaculum thermopropionicum and Desulfotomaculum acetoxidans), and Chloroflexi (Thermomicrobium roseum DSM 5159 and Sphaerobacter thermophilus DSM 20745). They resemble several characterized radical SAM enzymes of peptide modification (PqqE, AlbA), and are always found next to the proposed target, TIGR03969, the putative mycofactocin precursor. [Unknown function, Enzymes of unknown specificity]	339
188478	TIGR03963	rSAM_QueE_Clost	putative 7-cyano-7-deazaguanosine (preQ0) biosynthesis protein QueE, clostridial. Members of this radical SAM domain protein family appear to be the Clostridial form of the queuosine biosynthesis protein QueE. QueE is involved in making preQ0 (7-cyano-7-deazaguanine), a precursor of both the bacterial/eukaryotic modified tRNA base queuosine and the archaeal modified base archaeosine. Members occur in preQ0 operons in species that lack members of related protein family TIGR03365. [Protein synthesis, tRNA and rRNA base modification]	219
274883	TIGR03964	mycofact_creat	mycofactocin system creatininase family protein. Members of this protein family are uncharacterized Actinobacterial proteins, with homology to creatinine amidohydrolase from Pseudomonas. Members occur only in the context of the mycofactocin system. [Unknown function, Enzymes of unknown specificity]	228
274884	TIGR03965	mycofact_glyco	mycofactocin system glycosyltransferase. Members of this protein family are putative glycosyltransferases, members of pfam00535 (glycosyl transferase family 2). Members appear mostly in the Actinobacteria, where they appear to be part of a system for converting a precursor peptide (TIGR03969) into a novel redox carrier designated mycofactocin. A radical SAM enzyme, TIGR03962, is proposed to be a key maturase for mycofactocin.	466
274885	TIGR03966	actino_HemFlav	heme/flavin dehydrogenase, mycofactocin system. Members of this protein family possess an N-terminal heme-binding domain and C-terminal flavodehydrogenase domain, and share homology to yeast flavocytochrome b2, to E. coli L-lactate dehydrogenase [cytochrome], to (S)-mandelate dehydrogenase, etc. This enzyme appears only in the context of the mycofactocin system. Interestingly, it is absent from the four species detected so far with mycofactocin but without an F420 biosynthesis system.	385
274886	TIGR03967	mycofact_MftB	putative mycofactocin binding protein MftB. Families TIGR03969 and TIGR03962 describe, respectively, the putative mycofactocin precursor and its cognate radical SAM peptide maturase. This small protein family appears in the same sporadically distributed cassette and may serve as a scaffolding protein during mycofactocin maturation or as a carrier protein for the mature product, a putative novel redox carrier. A feature of mycofactocin-encoding genomes is co-clustering with sets of NAD-binding oxidoreductases in which the NAD is not exchangeable. Therefore it is proposed that mature mycofactocin, bound by a member of this family as a carrier protein, docks with the nicotinoprotein to allow electron transfer. Mediation of electron transfer through this system would define a segregated redox pool. [Unknown function, General]	81
188483	TIGR03968	mycofact_TetR	mycofactocin system transcriptional regulator. Members of this family are TetR family putative transcriptional regulators that occur in genome contexts near proteins of the mycofactocin system. These include the precursor peptide (TIGR03969), a radical SAM peptide maturase (TIGR03962), and a putative carrier protein (TIGR03967). [Regulatory functions, DNA interactions]	190
274887	TIGR03969	mycofactocin	mycofactocin precursor. Members of this protein family occur in Mycobacterium tuberculosis and many other Actinobacteria, as well as some delta-Proteobacteria (e.g. Geobacter uraniireducens), Firmicutes (Pelotomaculum thermopropionicum and Desulfotomaculum acetoxidans), and Chloroflexi (Thermomicrobium roseum DSM 5159 and Sphaerobacter thermophilus DSM 20745). Members sometimes are missed during gene model identification but always occur in the vicinity of radical SAM (rSAM) enzyme TIGR03962, which resembles several rSAM enzymes of peptide maturation (PqqE, AlbA). Species with this protein always carry members of unusual clades of nicotinoproteins that are restricted to mycofactocin-containing species and in which the NAD, when studied, has appeared non-exchangeable. It is proposed that the mature form of mycofactocin is a novel redox carrier for a segregated redox pool.	23
274888	TIGR03970	Rv0697	dehydrogenase, Rv0697 family. This model describes a set of dehydrogenases belonging to the glucose-methanol-choline oxidoreductase (GMC oxidoreductase) family. Members of the present family are restricted to Actinobacterial genome contexts containing also members of families TIGR03962 and TIGR03969 (the mycofactocin system), and are proposed to be uniform in function.	487
274889	TIGR03971	SDR_subfam_1	SDR family mycofactocin-dependent oxidoreductase. Members of this protein subfamily are putative oxidoreductases belonging to the larger SDR family. All members occur in genomes that encode a cassette for the biosynthesis of mycofactocin, a proposed electron carrier of a novel redox pool. Characterized members of this family are described as NDMA-dependent, meaning that a blue aniline dye serving as an artificial electron acceptor is required for members of this family to cycle in vitro, since the bound NAD residue is not exchangeable. See EC 1.1.99.36. [Unknown function, Enzymes of unknown specificity]	270
274890	TIGR03972	rSAM_TYW1	wyosine biosynthesis protein TYW1. Members of this protein family are the archaeal protein TYW1, a radical SAM protein that catalyzes the second step in creating the wye-bases, wyosine and derivatives such as wybutosine, for tRNA base modification. [Protein synthesis, tRNA and rRNA base modification]	297
274891	TIGR03973	six_Cys_in_45	six-cysteine peptide SCIFF. Members of this protein family are essentially universal in the class Clostridia and therefore highly abundant in the human gut microbiome. This short peptide is designated SCIFF, for Six Cysteines in Forty-Five residues. It is a presumed ribosomal natural product precursor, always found associated with a yet-uncharacterized radical SAM protein, family TIGR03974, that resembles other peptide modification radical SAM enzymes and is designated SCIFF radical SAM maturase.	43
274892	TIGR03974	rSAM_six_Cys	SCIFF radical SAM maturase. Members of this protein family are predicted radical SAM enzymes universally associated with Six Cysteines in Forty-Five protein, or SCIFF (family TIGR03973), a predicted ribosomal natural product precursor that is nearly universal in the class Clostridia. Similarity of this family to radical SAM maturases (PqqE and subtilosin A maturase) found in the vicinity of other peptide precursors suggests this protein is the SCIFF radical SAM maturase. [Cellular processes, Biosynthesis of natural products]	451
274893	TIGR03975	rSAM_ocin_1	ribosomal peptide maturation radical SAM protein 1. Models TIGR03793 and TIGR03798 describe bacteriocin precursor families that often occur in large paralogous families and are subject to various modifications, including by LanM family lantibiotic synthases and by cyclodehydratases. This model represents a radical SAM protein family that regularly occurs in the context of these bacteriocins, and may occur where other familiar peptide modification enzymes are absent. [Cellular processes, Toxin production and resistance]	606
274894	TIGR03976	chp_LLNDYxLRE	His-Xaa-Ser system protein HxsD. This rare conserved hypothetical protein of small size occurs exclusively, and perhaps universally, in the context of a pair of (uncharacterized) radical SAM proteins, TIGR03977 and TIGR03978. Many members of this family have invariant motifs LYW and LLNDYxLRE, but PSI-BLAST starting from family members well below 20 % pairwise sequence identity to this group eventually brings in the entire family as modeled here. The family TIGR03979 represents the fourth regularly conserved member of this system.	90
274895	TIGR03977	rSAM_pair_HxsC	His-Xaa-Ser system radical SAM maturase HxsC. This model describes the downstream member, HxsC, of a pair of uncharacterized radical SAM proteins, regularly found in the context of a small protein with four or more repeats of the tripeptide His-Xaa-Ser (HXS). This enzyme appears to be part of a peptide modification system.	292
274896	TIGR03978	rSAM_paired_1	His-Xaa-Ser system radical SAM maturase HxsB. This model describes the upstream member, HxsB, of a pair of uncharacterized radical SAM proteins, regularly found in the context of a small protein with four or more repeats of the tripeptide His-Xaa-Ser (HXS). This enzyme appears to be part of a peptide modification system.	466
274897	TIGR03979	His_Ser_Rich	His-Xaa-Ser repeat protein HxsA. Members of this protein family share two defining regions. One is a histidine/serine-rich cluster, typically H-R-S-H-S-S-H-R-S-H-S-S-H. Members are found always in the context of a pair of radical SAM proteins, HxsB and HxsC, and a fourth protein HxsD. The system is predicted to perform peptide modifications, likely in the His-Xaa-Ser region, to produce some uncharacterized natural product.	186
274898	TIGR03980	prismane_assoc	hybrid cluster protein-associated redox disulfide domain. Members of this protein family resemble the domain of unknown function DUF1858 described by pfam08984, but all members contain an apparent redox-active disulfide. In at least one member protein, a cysteine in the CXXC motif is substituted by a selenocysteine. Most member proteins consist of this domain only, but a few members are fused to or adjacent to members of the hybrid-cluster (prismane) family or the nitrite/sulfite reductase family. [Energy metabolism, Electron transport]	58
188496	TIGR03981	SAM_quin_mod	His-Xaa-Ser system putative quinone modification maturase. One clue for the interpretation of this protein family is homology to the MauG protein (see TIGR03791) involved in the tryptophan tryptophylquinone post-translational modification of methylamine dehydrogenase light (beta) chain. The other is occurrence only in a five gene context in which two members are radical SAM proteins (TIGR03977 and TIGR03978) also likely involved in post-translational modification.	411
188497	TIGR03982	TIGR03982	His-Xaa-Ser system protein, TIGR03982 family. Members of this rare protein family occur in the presence of TIGR03981 and TIGR03979, which in turn occur only in the context of radical SAM protein families TIGR03977 and TIGR03978. The function is unknown.	117
274899	TIGR03983	cas1_MYXAN	CRISPR-associated endonuclease Cas1, subtype MYXAN. Members of this protein family are the Cas1 endonuclease, or Cas1 domain in Cas4/Cas1 fusion proteins, of the MYXAN subtype of CRISPR/Cas systems. These systems typically feature repeats and spacers each about 36 base pairs in length. Species with this type of CRISPR system include Myxococcus xanthus, Cyanothece sp., Leptospira interrogans, Sorangium cellulosum, Anabaena variabilis ATCC 29413, etc.	347
274900	TIGR03984	TIGR03984	CRISPR-associated protein, TIGR03984 family. Members of this protein family are found exclusively in CRISPR-containing organisms, in operon contexts with RAMP (repeat-associated mystery protein) proteins also linked to CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats).	147
274901	TIGR03985	TIGR03985	CRISPR-associated protein, TIGR03985 family. Members of this protein family belong to CRISPR-associated (Cas) gene clusters. The majority of members are Cyanobacterial.	248
274902	TIGR03986	TIGR03986	CRISPR-associated protein. Members of this protein family, part of the larger RAMP family, are found exclusively in species with CRISPR systems, in local contexts containing other RAMP (Repeat-Associated Mystery Proteins).	562
274903	TIGR03987	TIGR03987	TIGR03987 family protein. Conserved hypothetical protein	120
274904	TIGR03988	antisig_RsrA	mycothiol system anti-sigma-R factor. Members of this family are the anti-sigma-R factor RsrA, which contains a CXXC motif as a thiol-disulphide redox switch. It interacts with sigma-R. It regulates and is regulated by the mycothiol system, which occurs in many actinomycetes. [Transcription, Transcription factors]	77
274905	TIGR03989	Rxyl_3153	NDMA-dependent alcohol dehydrogenase, Rxyl_3153 family. This model describes a clade within the family pfam00107 of zinc-binding dehydrogenases. The family pfam00107 contains class III alcohol dehydrogenases, including enzymes designated S-(hydroxymethyl)glutathione dehydrogenase and NAD/mycothiol-dependent formaldehyde dehydrogenase. Members of the current family occur only in species that contain the very small protein mycofactocin (TIGR03969), a possible cofactor precursor, and radical SAM protein TIGR03962. We name this family for Rxyl_3153, where the lone member of the family co-clusters with these markers in Rubrobacter xylanophilus. [Unknown function, Enzymes of unknown specificity]	369
274906	TIGR03990	Arch_GlmM	phosphoglucosamine mutase. The MMP1680 protein from Methanococcus maripaludis has been characterized as the archaeal protein responsible for the second step of UDP-GlcNAc biosynthesis. This GlmM protein catalyzes the conversion of glucosamine-6-phosphate to glucosamine-1-phosphate. The first-characterized bacterial GlmM protein is modeled by TIGR01455. These two families are members of the larger phosphoglucomutase/phosphomannomutase family (characterized by three domains: pfam02878, pfam02879 and pfam02880), but are not nearest neighbors to each other. This model also includes a number of sequences from non-archaea in the Bacteroides, Chlorobi, Chloroflexi, Planctomycetes and Spirochaetes lineages. Evidence supporting their inclusion in this equivalog as having the same activity comes from genomic context and phylogenetic profiling. A large number of these organisms are known to produce exo-polysaccharide and yet only appeared to contain the GlmS enzyme of the GlmSMU pathway for UDP-GlcNAc biosynthesis (GenProp0750). In some organisms including Leptospira, this archaeal GlmM is found adjacent to the GlmS as well as a putative GlmU non-orthologous homolog. Phylogenetic profiling of the GlmS-only pattern using PPP identifies members of this archaeal GlmM family as the highest-scoring result. [Central intermediary metabolism, Amino sugars]	443
274907	TIGR03991	alt_bact_glmU	UDP-N-acetylglucosamine diphosphorylase/glucosamine-1-phosphate N-acetyltransferase. The MJ_1101 protein from Methanococcus jannaschii has been characterized as the GlmU enzyme catalyzing the final two steps of UDP-GlcNAc biosynthesis. Homologs of this enzyme are identified in a number of bacterial organisms and modeled here. A number of these are observed in proximity to the GlmS and GlmM genes, and phylogenetic profiling by PPP identifies the LEPBI_I0518 gene in Leptospira biflexa as a likely Glm-system candidate. Multiple sequence alignments of these bacterial homologs with their archaeal counterparts reveal significant structural differences, necessitating the construction of separate models. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan, Central intermediary metabolism, Amino sugars]	337
274908	TIGR03992	Arch_glmU	UDP-N-acetylglucosamine diphosphorylase/glucosamine-1-phosphate N-acetyltransferase. The MJ_1101 protein from Methanococcus jannaschii has been characterized as the GlmU enzyme catalyzing the final two steps of UDP-GlcNAc biosynthesis. Many of the genes identified by this model are in proximity to the GlmS and GlmM genes and are also presumed to be GlmU. However, some archaeal genomes contain multiple closely-related homologs from this family and it is not clear what the substrate specificity is for each of them.	393
274909	TIGR03993	hydrog_HybE	[NiFe] hydrogenase assembly chaperone, HybE family. Members of this family are chaperones for the assembly of [NiFe] hydrogenases, in the family of HybE, which is specific for hydrogenase-2 of Escherichia coli. Members often have an additional N-terminal rubredoxin domain.	143
274910	TIGR03994	rSAM_HemZ	coproporphyrinogen dehydrogenase HemZ. Members of this radical SAM protein family are HemZ, a protein involved in coproporphyrinogen III decarboxylation. Alternative names for this enzyme (EC 1.3.99.22) include coproporphyrinogen dehydrogenase and oxygen-independent coproporphyrinogen III oxidase. The family is related to, but distinct from HemN, and in Bacillus subtilis was shown to be connected to peroxide stress and catalase formation. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	401
274911	TIGR03995	target_X_rSAM	putative rSAM target protein, CGCGG family. Members of this family of small proteins, approx. 100 amino acids in length, co-occur with a subfamily of radical SAM protein in several species in the Halobacteria and in Bacillus. The radical SAM protein belongs to a branch in which most characterized members act on peptide substrates. The lack of homology of this family to any known enzyme and the distinctive C-terminal region motif, with the common modification target residue Cys flanked by sterically permissive Gly residues, suggest that members are targets of the radical SAM protein.	84
188511	TIGR03996	mycofact_OYE_1	mycofactocin system FadH/OYE family oxidoreductase 1. The yeast protein called old yellow enzyme and FadH from Escherichia coli (2,4-dienoyl CoA reductase) are enzymes with 4Fe-4S, FMN, and FAD prosthetic groups, and interact with NADPH as well as substrate. Members of this related protein family occur in the vicinity of the putative mycofactocin biosynthesis operon in a number of Actinobacteria such as Frankia sp. and Rhodococcus sp. The function of this oxidoreductase is unknown.	633
274912	TIGR03997	mycofact_OYE_2	mycofactocin system FadH/OYE family oxidoreductase 2. The yeast protein called old yellow enzyme and FadH from Escherichia coli (2,4-dienoyl CoA reductase) are enzymes with 4Fe-4S, FMN, and FAD prosthetic groups, and interact with NADPH as well as substrate. Members of this related protein family occur in the vicinity of the putative mycofactocin biosynthesis operon in a number of Actinobacteria such as Frankia sp. and Rhodococcus sp., in Pelotomaculum thermopropionicum SI (Firmicutes), and in Geobacter uraniireducens Rf4 (Deltaproteobacteria). The function of this oxidoreductase is unknown.	644
274913	TIGR03998	thiol_BshC	bacillithiol biosynthesis cysteine-adding enzyme BshC. Members of this protein family are BshC, an enzyme required for bacillithiol biosynthesis and described as a cysteine-adding enzyme. Bacillithiol is a low-molecular-weight thiol, an analog of glutathione and mycothiol, and is found largely in the Firmicutes. [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs]	528
274914	TIGR03999	thiol_BshA	N-acetyl-alpha-D-glucosaminyl L-malate synthase BshA. Members of this protein family are BshA, a glycosyltransferase required for bacillithiol biosynthesis. This enzyme combines UDP-GlcNAc and L-malate to form N-acetyl-alpha-D-glucosaminyl L-malate. Bacillithiol is a low-molecular-weight thiol, an analog of glutathione and mycothiol, and is found largely in the Firmicutes. [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs]	374
188515	TIGR04000	thiol_BshB2	bacillithiol biosynthesis deacetylase BshB2. Members of this protein family are BshB2 (YojG), an enzyme of bacillithiol biosynthesis; either BshB1 (YpjG) or BshB2 must be present, and often both are present. Bacillithiol is a low-molecular-weight thiol, an analog of glutathione and mycothiol, and is found largely in the Firmicutes. [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs]	217
274915	TIGR04001	thiol_BshB1	bacillithiol biosynthesis deacetylase BshB1. Members of this protein family are BshB1 (YpjG), an enzyme of bacillithiol biosynthesis; either BshB1 or BshB2 (YojG) must be present, and often both are present. Bacillithiol is a low-molecular-weight thiol, an analog of glutathione and mycothiol, and is found largely in the Firmicutes. [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs]	226
188517	TIGR04002	TIGR04002	TIGR04002 family protein. TIGR04002 family proteins, a division within DUF1393 (pfam07155), occur strictly as part of a tandem gene pair with an uncharacterized radical SAM protein. [Unknown function, General]	151
188518	TIGR04003	rSAM_BssD	[benzylsuccinate synthase]-activating enzyme. Members of this radical SAM protein family are [benzylsuccinate synthase]-activating enzyme, a glycyl radical active site-creating enzyme related to [pyruvate formate-lyase]-activating enzyme and additional uncharacterized homologs activating additional glycyl radical-containing enzymes. [Protein fate, Protein modification and repair]	314
188519	TIGR04004	WcaM	colanic acid biosynthesis protein WcaM. This protein of uncharacterized function is encoded by the final gene in the conserved colanic acid biosynthesis cluster observed in Enterobacteriaceae.	464
188520	TIGR04005	wcaL	colanic acid biosynthesis glycosyltransferase WcaL. This gene is one of the glycosyl transferases involved in the biosynthesis of colanic acid, an exopolysaccharide expressed in Enterobacteriaceae species.	406
188521	TIGR04006	wcaK	colanic acid biosynthesis pyruvyl transferase WcaK. This gene is the pyruvyl transferase involved in the biosynthesis of colanic acid, an exopolysaccharide expressed in Enterobacteriaceae species.	426
188522	TIGR04007	wcaI	colanic acid biosynthesis glycosyl transferase WcaI. This gene is one of the glycosyl transferases involved in the biosynthesis of colanic acid, an exopolysaccharide expressed in Enterobacteriaceae species.	407
188523	TIGR04008	WcaF	colanic acid biosynthesis acetyltransferase WcaF. This gene is one of the acetyl transferases involved in the biosynthesis of colanic acid, an exopolysaccharide expressed in Enterobacteriaceae species. This acetyltransferase is believed to catalyze the addition of the acetyl group that is attached through an O linkage to the first fucosyl residue of the colanic acid repetitive unit (E unit).	180
188524	TIGR04009	wcaE	colanic acid biosynthesis glycosyl transferase WcaE. This gene is one of the glycosyl transferases involved in the biosynthesis of colanic acid, an exopolysaccharide expressed in Enterobacteriaceae species.	248
188525	TIGR04010	WcaD	putative colanic acid polymerase WcaD. This membrane protein is believed to function as the colanic acid repeating unit polymerase (in an analogous fashion to wzy proteins in O-antigen polymerization).	404
274916	TIGR04011	poly_gGlu_PgsC	poly-gamma-glutamate biosynthesis protein PgsC/CapC. Of four genes commonly found to be involved in biosynthesis and export of poly-gamma-glutamate, pgsB(capB) and pgsC(capC) are found to be involved in the synthesis per se. Members of this family are designated PgsC, covering both cases in which the poly-gamma-glutamate is secreted and those in which it is retained to form capsular material. PgsC binds tightly to PgsB, which has been shown to have poly-gamma-glutamate synthase activity. [Cell envelope, Other]	132
188527	TIGR04012	poly_gGlu_PgsB	poly-gamma-glutamate synthase PgsB/CapB. Of four genes commonly found to be involved in biosynthesis and export of poly-gamma-glutamate, pgsB(capB) and pgsC(capC) are found to be involved in the synthesis per se. Members of this family are designated PgsB, a nomenclature that covers both cases in which the poly-gamma-glutamate is secreted and those in which it is retained to form capsular material. PgsB has been shown to have poly-gamma-glutamate synthase activity by itself but is bound tightly by PgsC (TIGR04011). [Cell envelope, Other]	366
274917	TIGR04013	B12_SAM_MJ_1487	B12-binding domain/radical SAM domain protein, MJ_1487 family. Members of this family have both a B12 binding homology domain (pfam02310) and a radical SAM domain (pfam04055), and occur only once per genome. Some species with members of this family have a related protein with similar domain architecture. This protein occurs largely in archaeal methanogens but also in a few bacteria, including Thermotoga maritima and Myxococcus xanthus. [Unknown function, Enzymes of unknown specificity]	383
274918	TIGR04014	B12_SAM_MJ_0865	B12-binding domain/radical SAM domain protein, MJ_0865 family. Members of this family have both a B12 binding homology domain (pfam02310) and a radical SAM domain (pfam04055), and occur only once per genome. This protein occurs so far only in methanogenic archaea. Some species with members of this family have a related protein with similar domain architecture (see TIGR04013). [Unknown function, Enzymes of unknown specificity]	434
274919	TIGR04015	WcaC	colanic acid biosynthesis glycosyl transferase WcaC. This gene is one of the glycosyl transferases involved in the biosynthesis of colanic acid, an exopolysaccharide expressed in Enterobacteriaceae species.	405
188531	TIGR04016	WcaB	colanic acid biosynthesis acetyltransferase WcaB. This gene is one of the acetyl transferases involved in the biosynthesis of colanic acid, an exopolysaccharide expressed in Enterobacteriaceae species.	146
274920	TIGR04017	WcaA	colanic acid biosynthesis glycosyl transferase WcaA. This gene is one of the glycosyl transferases involved in the biosynthesis of colanic acid, an exopolysaccharide expressed in Enterobacteriaceae species.	279
188533	TIGR04018	Bthiol_YpdA	putative bacillithiol system oxidoreductase, YpdA family. Members of this protein family, including YpdA from Bacillus subtilis, are apparent oxidoreductases present only in species with an active bacillithiol system. They have been suggested actually to be thiol disulfide oxidoreductases (TDOR), although the evidence is incomplete. [Unknown function, Enzymes of unknown specificity]	316
274921	TIGR04019	B_thiol_YtxJ	bacillithiol system protein YtxJ. Members of this protein family, including YtxJ from Bacillus subtilis, occur in species that encode proteins for synthesizing bacillithiol. The protein is described as thioredoxin-like, while another bacillithiol-associated protein, YpdA (TIGR04018), is described as thioredoxin reductase-like. [Unknown function, Enzymes of unknown specificity]	105
274922	TIGR04020	seco_metab_LLM	natural product biosynthesis luciferase-like monooxygenase domain. This model describes a subfamily within the bacterial luciferase-like monooxygenase (LLM) family that regularly occurs within large non-ribosomal protein synthases/polyketide synthases, but also as small proteins. The LLM family includes members that bind either FMN or F420, and FMN is more likely in this case because many members are from species that lack F420 biosynthesis capability. An example member is the MupA protein of mupirocin biosynthesis in Pseudomonas fluorescens NCIMB 10586.	341
274923	TIGR04021	LLM_DMSO2_sfnG	dimethyl sulfone monooxygenase SfnG. This family of FMNH2-dependent members of the luciferase-like monooxygenase (LLM) family includes SfnG, a monooxygenase that converts dimethylsulphone (DMSO2) to methanesulphonate. This step can be followed immediately by methanesulfonate sulfonatase (an alkanesulfonate monooxygenase - see TIGR03565) for the FMNH2-dependent conversion to an inorganic form. [Central intermediary metabolism, Sulfur metabolism]	350
274924	TIGR04022	sulfur_SfnB	sulfur acquisition oxidoreductase, SfnB family. Members of this protein family belong to the greater family of acyl-CoA dehydrogenases. This family includes the sulfate starvation induced protein SfnB of Pseudomonas putida strain DS1, which is both encoded nearby to and phylogenetically closely correlated with the dimethyl sulphone monooxygenase SfnG. This family shows considerable sequence similarity to the Rhodococcus dibenzothiophene desulfurization enzyme DszC, although that enzyme falls outside of the scope of this family. [Central intermediary metabolism, Sulfur metabolism]	391
274925	TIGR04023	PPOX_MSMEG_5819	PPOX class F420-dependent enzyme, MSMEG_5819/OxyR family. A Genome Properties metabolic reconstruction for F420 biosynthesis shows that slightly over 10 percent of all prokaryotes with fully sequenced genomes, including about two thirds of the Actinomycetales, make F420. This subfamily within the PPOX family occurs in at least 19 distinct species of F420 producers and is likely to bind F420 rather than FMN. The member OxyR was shown to use F420 to catalyze a C5a-C11a reduction in oxytetracycline biosynthesis. [Unknown function, Enzymes of unknown specificity]	115
188539	TIGR04024	F420_NP1902A	coenzyme F420-dependent oxidoreductase, NP1902A family. This subfamily of the luciferase-like monooxygenases is restricted to the order Halobacteriales. SIMBAL analysis strongly suggests this oxidoreductase binds coenzyme F420 rather than FMN. Occasional annotations of members of this family as N5,N10-methylenetetrahydromethanopterin reductase appear to represent overly aggressive transfer of annotation. [Unknown function, Enzymes of unknown specificity]	330
274926	TIGR04025	PPOX_FMN_DR2398	PPOX class probable FMN-dependent enzyme, DR_2398 family. Members of the PPOX family (see pfam01243) may contain either FMN or F420 as cofactor. This subfamily consists of proteins mostly from species that lack the capability to synthesize F420, and therefore most likely all bind FMN.	197
274927	TIGR04026	PPOX_FMN_cyano	PPOX class probable FMN-dependent enzyme, alr4036 family. Members of the PPOX family (see pfam01243) may contain either FMN or F420 as cofactor. This subfamily described here is widespread in Cyanobacteria and plants, and is named for alr4036 from Nostoc sp. PCC 7120. The family consists mostly of proteins from species that lack the capability to synthesize F420, so it is probable that all members bind FMN rather than F420. [Unknown function, Enzymes of unknown specificity]	185
274928	TIGR04027	LLM_KPN_01858	putative FMN-dependent luciferase-like monooxygenase, KPN_01858 family. This protein family consists of luciferase-like monooxygenases (LLM), and includes KPN_01858 from Klebsiella pneumoniae as a representative member. Most are from species that lack F420 biosynthesis, so the family is likely to bind FMN as its cofactor. This family is closely associated with a binding protein-dependent ABC transporter, suggesting a role in catabolism. [Unknown function, Enzymes of unknown specificity]	326
274929	TIGR04028	SBP_KPN_01854	ABC transporter substrate binding protein, KPN_01854 family. Members of this protein family are ABC transporter substrate-binding proteins related to KPN_01854 from Klebsiella pneumoniae, and occur in both Gram-positive and Gram-negative species. This transport protein family is closely associated with a putative FMN-dependent luciferase-like monooxygenase of unknown function (TIGR04027), as well as with the other proteins of its transporter complex. [Transport and binding proteins, Unknown substrate]	509
213885	TIGR04029	CMD_Avi_7170	CMD domain protein, Avi_7170 family. Sequences in this family occur as the N-terminal domain of a fusion protein with a C-terminal peroxidase-like protein, or as a discrete protein encoded next to a peroxidase-like protein. The two partners regularly are encoded near to, and in the same genomes as, a putative FMN-dependent luciferase-like monooxygenase (LLM) (TIGR04027), and an ABC transporter in which TIGR04028 models the substrate-binding protein. CDD identifies this family as falling within the CMD superfamily that includes carboxymuconolactone decarboxylase.	174
188545	TIGR04030	perox_Avi_7169	alkylhydroperoxidase domain protein, Avi_7169 family. Members of this family represent a narrow clade that falls within a family of alkylhydroperoxidase-related proteins, fused to or adjacent to a sequence described by TIGR04029. These two partners occur almost always in the context of a putative FMN-dependent luciferase-like monooxygenase (LLM) (TIGR04027), and an ABC transporter in which TIGR04028 models the substrate-binding protein.	185
188546	TIGR04031	Htur_1727_fam	rSAM-partnered protein, Htur_1727 family. Members of this protein family show homology to the putative PaaH (or PaaB) subunit of the phenylacetate-CoA oxygenase complex. However, all members are found in the Halobacteriales in the vicinity of a radical SAM protein homologous to the PqqE protein of pyrroloquinoline quinone (PQQ) biosynthesis. Members are well-conserved to about residue 75, but then become low-complexity and hypervariable.	71
274930	TIGR04032	toxin_SdpC	antimicrobial peptide, SdpC family. This protein family contains the antimicrobial peptide SdpC, used in cannibalistic killing by Bacillus subtilis, and related sequences in species as distant as Myxococcus xanthus from the Deltaproteobacteria. A conserved gene neighborhood includes proteins associated with immunity.	172
274931	TIGR04033	export_SdpB	antimicrobial peptide system protein, SdpB family. Members of this protein family resemble SdpB (Sporulation Delaying Protein B), an integral membrane protein associated with production of the cannibalism peptide SdpC in Bacillus subtilis. Similar proteins are found in Myxococcus xanthus.	276
274932	TIGR04034	export_SdpA	antimicrobial peptide system protein, SdpA family. Members of this protein family resemble SdpA (Sporulation Delaying Protein A), a protein associated with production and export of the cannibalism peptide SdpC in Bacillus subtilis. Similar proteins are found in Myxococcus xanthus, Stigmatella aurantiaca DW4/3-1, Streptomyces sp. ACTE, etc.	156
274933	TIGR04035	glucan_65_rpt	glucan-binding repeat. This model describes a region of about 63 amino acids that is composed of three repeats of a more broadly distributed family of shorter repeats modeled by pfam01473. While the shorter repeats are often associated with choline binding (and therefore with cell wall binding), the longer repeat described here represents a subgroup of repeat sequences associated with glucan binding, as found in a number of glycosylhydrolases. Shah, et al. describe a repeat consensus, WYYFDANGKAVTGAQTINGQTLYFDQDGKQVKG, that corresponds to half of the repeat as modeled here and one and a half copies of the repeat as modeled by pfam01473.	62
274934	TIGR04036	LLM_CE1758_fam	putative luciferase-like monooxygenase, FMN-dependent, CE1758 family. This tightly conserved subfamily of the bacterial luciferase-like monooxygenase (LLM) family, with members showing > 60 % pairwise sequence identity, includes proteins from both species with and species without the ability to make coenzyme F420. Therefore, the likely cofactor is FMN rather than F420. The presence of three members in Kineococcus radiotolerans SRS30216 and two in Saccharopolyspora erythraea NRRL 2338 suggests closely related (subfamily) rather than exactly conserved (equivalog) function. Gene neighborhoods around members are not conserved. [Unknown function, Enzymes of unknown specificity]	355
274935	TIGR04037	LLM_duo_CE1759	LLM-partnered FMN reductase, CE1759 family. This family represents a distinct clade within pfam03358. That family includes enzymes such as the NADH-dependent FMN reductase MsuE. Members of the present family regularly co-occur in genomes, typically as gene pairs, with members of TIGR04036, a probable FMN-dependent member of the bacterial luciferase-like monooxygenase (LLM) family. At least one member, RF|YP_001509627.1 from Frankia sp. EAN1pec, is fused to the LLM protein. The function of these gene pairs is unknown.	198
274936	TIGR04038	tatD_link_rSAM	radical SAM protein, TatD family-associated. Members of this family are radical SAM proteins found in about 5 percent of microbial genomes. A portion occur as gene fusions with, or adjacent to, members of the TatD family of hydrolases (pfam01026). The TatD family may have several paralogs per genome, including TatD itself from E. coli (a soluble protein not actually part of the twin-arginine translocation complex), which appears to act in quality control for TAT, directing turnover of misfolded TAT substrates. The functions of TatD family hydrolases in general (other than TatD itself, which may be exceptional within its larger family), and of this radical SAM domain protein modeled here, are unknown.	191
188554	TIGR04039	MXAN_0977_Heme2	di-heme enzyme, MXAN_0977 family. This model describes a subfamily of di-heme proteins related to the di-heme cytochrome c peroxidase and to MauG (methylamine utilization G), an enzyme that performs a tryptophan tryptophylquinone modification to the methylamine dehydrogenase light chain.	336
274937	TIGR04040	glycyl_YjjI	glycine radical enzyme, YjjI family. Members of this family are homologs to enzymes known to undergo activation by a radical SAM protein to create an active site glycyl radical. This family appears to be activated by the YjjW radical SAM protein, usually encoded by an adjacent gene. [Unknown function, Enzymes of unknown specificity]	497
274938	TIGR04041	activase_YjjW	glycine radical enzyme activase, YjjW family. Members of this family are radical SAM enzymes, designated YjjW in E. coli, that are paired with and appear to activate a glycyl radical enzyme of unknown function, designated YjjI. This activase and its target are found in Clostridial species as well as E. coli and cousins. Members of this family may be misannotated as pyruvate formate lyase activating enzyme. [Protein fate, Protein modification and repair]	276
274939	TIGR04042	MSMEG_0570_fam	MSMEG_0570 family protein. This small protein, about 90 residues long, has no detectable homologs outside the set used to characterize this model. Member proteins serve as markers for an eight-gene region whose overall function is unknown. One member of the cluster is a radical SAM protein with some similarity to enzymes of cofactor biosynthesis, another a glycosyltransferase, several hydrolases or oxidoreductases, and several unknown.	90
274940	TIGR04043	rSAM_MSMEG_0568	radical SAM protein, MSMEG_0568 family. Members of this protein family are radical SAM proteins related to MSMEG_0568 from Mycobacterium smegmatis. Members occur within 8-gene operons in species as diverse as M. smegmatis, Rhizobium leguminosarum, Synechococcus elongatus, and Sorangium cellulosum. The function of the operon is unknown, but similarity of MSMEG_0568 to some cofactor biosynthesis radical SAM proteins suggests a similar biosynthetic function. [Unknown function, Enzymes of unknown specificity]	354
188559	TIGR04044	MSMEG_0572_fam	MSMEG_0572 family protein. This model describes a family of proteins with remote similarity to the DsrE/DsrF-like family (see pfam02635). All members are found in a context of at least seven genes that includes a radical SAM protein, suggesting biosynthesis. The system is sparsely but broadly distributed in bacteria, including Actinobacteria, Proteobacteria, and Cyanobacteria.	159
274941	TIGR04045	MSMEG_0567_GNAT	putative N-acetyltransferase, MSMEG_0567 N-terminal domain family. Members of this family belong to the GNAT family (pfam00583), in which numerous characterized examples, though not all, are shown to be N-acetyltransferases or to interact with acetyl-CoA. This family occurs in a sparsely distributed biosynthetic cluster that occurs in Actinobacteria, Cyanobacteria, and Proteobacteria.	153
274942	TIGR04046	MSMEG_0569_nitr	flavin-dependent oxidoreductase, MSMEG_0569 family. Members of this protein family belong to a conserved seven-gene biosynthetic cluster found sparsely in Cyanobacteria, Proteobacteria, and Actinobacteria. Distant homologies to characterized proteins suggest that members are enzymes dependent on a flavinoid cofactor.	400
274943	TIGR04047	MSMEG_0565_glyc	glycosyltransferase, MSMEG_0565 family. A conserved gene cluster found sporadically from Actinobacteria to Proteobacteria to Cyanobacteria features a radical SAM protein, an N-acetyltransferase, an oxidoreductase, and two additional proteins whose functional classes are unclear. The metabolic role of the cluster is probably biosynthetic. This glycosyltransferase, named from member MSMEG_0565 from Mycobacterium smegmatis, occurs in most but not all instances of the cluster. [Unknown function, Enzymes of unknown specificity]	373
188563	TIGR04048	nitrile_sll0784	putative nitrilase, sll0784 family. This family represents a subfamily of C-N bond-cleaving hydrolases (see pfam00795). Members occur as part of a cluster of genes in a probable biosynthetic cluster that contains a radical SAM protein, an N-acetyltransferase, a flavoprotein, several proteins of unknown function, and usually a glycosyltransferase. Members are closely related to a characterized aliphatic nitrilase from Rhodococcus rhodochrous J1, for which an active site Cys was found at position 165. [Unknown function, Enzymes of unknown specificity]	301
188564	TIGR04049	AIR_rel_sll0787	AIR synthase-related protein, sll0787 family. Members of this family include sll0787 from Synechocystis sp. PCC 6803 and resemble the C-terminal region of MSMEG_0567 from Mycobacterium smegmatis, where the N-terminal is a GNAT family N-acetyltransferase. The conserved cluster is found broadly (Cyanobacteria, Proteobacteria, Actinobacteria) in about 8 percent of genomes and appears to be biosynthetic. The product is unknown. [Unknown function, Enzymes of unknown specificity]	316
274944	TIGR04050	MSMEG_0567_Cter	AIR synthase-related protein, MSMEG_0567 C-terminal family. Members of this family include the C-terminal region of MSMEG_0567 from Mycobacterium smegmatis, where the N-terminal is a GNAT family N-acetyltransferase, and resemble the full length of sll0787 from Synechocystis sp. PCC 6803. The conserved cluster that contains these is found broadly (Cyanobacteria, Proteobacteria, Actinobacteria) in about 8 percent of genomes and appears to be biosynthetic. The product is unknown. [Unknown function, Enzymes of unknown specificity]	296
188566	TIGR04051	rSAM_NirJ	heme d1 biosynthesis radical SAM protein NirJ. Heme d1 occurs in the cytochrome cd1 subunit of nitrite reductase in species such as Pseudomonas stutzeri. NirJ is a radical SAM protein involved in its biosynthesis. In a number of species, distinct genes NirJ1 and NirJ2 are found in similar genomic regions; this model describes authentic NirJ from genomes with NirJ only. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	354
188567	TIGR04052	AZL_007920_fam	AZL_007920/MXAN_0976 family protein. Members of this rare protein family regularly occur next to a member of the MXAN_0977 subfamily (TIGR04039) of the di-heme cytochrome c peroxidase/MauG family (pfam03150). MauG itself (TIGR03791) is a protein modification enzyme responsible for the tryptophan tryptophylquinone (TTQ) modification involved in methylamine dehydrogenase activation. All members of this family have a motif of four spaced invariant Cys residues, while additional homologs outside the scope of this family lack the four Cys residues.	206
274945	TIGR04053	sam_11	radical SAM protein, BA_1875 family. Members of this subfamily of the radical SAM domain superfamily show closer sequence relationships to peptide-modifying proteins of bacteriocin and PQQ biosynthesis than to other characterized radical SAM proteins. Within this subfamily, targets are likely to be diverse. [Unknown function, Enzymes of unknown specificity]	365
274946	TIGR04054	rSAM_NirJ1	putative heme d1 biosynthesis radical SAM protein NirJ1. Members of this radical SAM protein subfamily, designated NirJ1, occur in genomic contexts with a paralog NirJ2 and with other nitrite reductase operon genes associated with heme d1 biosynthesis, as in Heliobacillus mobilis and Heliophilum fasciatum. NirJ1 is presumed by bioinformatics analysis (Xiong, et al.) to be a heme d1 biosynthesis protein by context, perhaps involved in conversions of acetate groups to methyl groups in conversion from uroporphyrinogen III. A very closely related protein, involved in alternative heme b biosynthesis, occurs in Desulfovibrio and in methanogens. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	387
274947	TIGR04055	rSAM_NirJ2	putative heme d1 biosynthesis radical SAM protein NirJ2. Members of this radical SAM protein subfamily, designated NirJ2, occur in genomic contexts with a paralog NirJ1 and with other nitrite reductase operon genes associated with heme d1 biosynthesis, as in Heliobacillus mobilis and Heliophilum fasciatum. NirJ2 is presumed by bioinformatics analysis (Xiong, et al.) to be a heme d1 biosynthesis protein by context. This model has been redone (2014) to remove the branch (TIGR04545) that included DVU_0855, from a similar pathway for heme b biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	326
274948	TIGR04056	OMP_RagA_SusC	TonB-linked outer membrane protein, SusC/RagA family. This model describes a distinctive clade among the TonB-linked outer membrane proteins (OMP). Members of this family are restricted to the Bacteroidetes lineage (except for Gemmatimonas aurantiaca T-27 from the novel phylum Gemmatimonadetes) and occur in high copy numbers, with over 100 members from Bacteroides thetaiotaomicron VPI-5482 alone. Published descriptions of members of this family are available for RagA from Porphyromonas gingivalis, SusC from Bacteroides thetaiotaomicron, and OmpW from Bacteroides caccae. Members form pairs with members of the SusD/RagB family (pfam07980). Transporter complexes including these outer membrane proteins are likely to import large degradation products of proteins (e.g. RagA) or carbohydrates (e.g. SusC) as nutrients, rather than siderophores. [Transport and binding proteins, Unknown substrate]	981
274949	TIGR04057	SusC_RagA_signa	TonB-dependent outer membrane receptor, SusC/RagA subfamily, signature region. This model describes a 31-residue signature region of the SusC/RagA family of outer membrane proteins from the Bacteroidetes. While many TonB-dependent outer membrane receptors are associated with siderophore import, this family seems to include generalized nutrient receptors that may convey fairly large oligomers of protein or carbohydrate. This family occurs in high copy numbers in the most abundant species of the human gut microbiome.	31
188573	TIGR04058	AcACP_reductase	long-chain fatty acyl-ACP reductase (aldehyde-forming). This enzyme, found in cyanobacteria, reduces a long-chain (mainly C16 or C18) fatty acyl ACP ester to its corresponding fatty aldehyde, releasing the acyl carrier protein (ACP). NADPH or NADH is the reductant for this reaction. This enzyme may be distantly related to the short-chain dehydrogenase or reductase (SDR) family (pfam00106). The purpose of this reaction is in the first step of alkane biosynthesis (GenProp0942). [Central intermediary metabolism, Other]	339
274950	TIGR04059	Ald_deCOase	long-chain fatty aldehyde decarbonylase. This cyanobacterial family of fatty aldehyde decarbonylases acts on mainly C16 and C18 substrates to form hydrocarbons and carbon monoxide. Note that the corresponding EC number (4.1.99.5) dating from 1989 refers to a nonorthologous Pisum sativum enzyme that acts on C18 and longer chains and attaches the overly narrow name octadecanal decarbonylase. [Central intermediary metabolism, Other]	220
274951	TIGR04060	formate_focA	formate transporter FocA. FocA (formate channel A) forms a pentameric formate-selective channel through the plasma membrane. The focA gene is largely restricted to Proteobacteria and occurs adjacent to genes for pyruvate formate lyase (PFL) and the PFL activase, a radical SAM protein. FocA is homologous to a nitrite transport protein, NirC. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	267
274952	TIGR04061	AZL_007950_fam	AZL_007950 family protein. This set of proteins includes PP_3335 from Pseudomonas putida, a protein of unknown function, and AZL_007950, a member of a putative biosynthetic cluster from Azospirillum sp. B510.	164
274953	TIGR04062	dnd_assoc_4	dnd system-associated protein 4. A DNA sulfur modification system, dnd (degradation during electrophoresis), is sparsely and sporadically distributed among the bacteria. Members of this protein family are strictly limited to species with the dnd operon, and are found close to the dnd operon on the chromosomes of species such as Nostoc sp. PCC 7120, Geobacter uraniireducens Rf4, and Roseobacter denitrificans OCh 114. [DNA metabolism, Restriction/modification]	151
274954	TIGR04063	stp3	PEP-CTERM/exosortase A-associated glycosyltransferase, Daro_2409 family. PEP-CTERM/exosortase is a protein-sorting system associated with exopolysaccharide production. Members of this protein family are group 1 glycosyltransferases (see pfam00534) in which the overwhelming majority occur in species with the EpsH1 form of exosortase (see TIGR03109), and usually co-clustered with the exosortase. A typical member is Daro_2409 from Dechloromonas aromatica RCB.	397
274955	TIGR04064	rSAM_nif11	nif11-like peptide radical SAM maturase. Members of this family are radical SAM enzymes that occur co-clustered with nif11-related ribosomal natural product (RNP) precursors described by TIGRFAMs model TIGR03798. Homology within the bacteriocin family reflects largely constraints on the leader peptide, tied to processes such as cleavage and export, and members associate with various families of maturation enzyme. The gene symbol assigned is nlpM, for Nif11-class Leader Peptide family Radical SAM Maturase. [Cellular processes, Toxin production and resistance]	458
274956	TIGR04065	ocin_CLI_3235	putative bacteriocin precursor, CLI_3235 family. Members of this protein family are Cys-rich putative bacteriocin precursor peptides restricted to the Clostridia but found in multiple species with up to three per genome. They are found next to a CLI_3234 family radical SAM protein that may perform post-translational modification. This model describes approximately 35 residues starting from the N-terminus. Precursor peptides average about 50 amino acids in length. [Cellular processes, Toxin production and resistance]	34
274957	TIGR04066	nat_prod_clost	peptide maturation system protein, TIGR04066 family. Members of this protein family occur in various Clostridial genomes, always in the context of a short peptide and a radical SAM protein predicted to modify the short peptide. PSI-BLAST analysis suggests a sequence relationship to archaeal proteins designated as subunits of an H+-transporting two-sector ATPase. The modified peptide is likely to be a bacteriocin, and this protein is a candidate to act in either maturation or immunity.	361
188582	TIGR04067	oc_CLOSPO_01332	putative bacteriocin precursor, CLOSPO_01332 family. Members of this protein family are Cys-rich putative bacteriocin precursor peptides found in a few strains of Clostridium and Anaerococcus. This family is related to the family of CLI_3235 (TIGR04065). Members of both families are found next to a CLI_3234 family radical SAM protein that appears to perform post-translational modification.	59
274958	TIGR04068	rSAM_ocin_clost	Cys-rich peptide radical SAM maturase CcpM. Members of this family are radical SAM enzymes that occur next to clostridial Cys-rich predicted bacteriocin (or other class of ribosomal natural product) precursors (see families TIGR04065 and TIGR04067). They include a TIGR04085 C-terminal additional 4Fe4S cluster-binding domain that is associated with peptide modification by radical SAM enzymes, and they are proposed to be ribosomal natural product maturases. The gene symbol ccpM is assigned, for Clostridial Cys-rich Peptide Maturase. [Cellular processes, Toxin production and resistance]	459
274959	TIGR04069	ocin_ACP_rel	peptide maturation system acyl carrier-related protein. Both PSI-BLAST and large numbers of noise-level HMM hits show a relationship between this family and the phosphopantetheine attachment site domain modeled by pfam00550. That domain includes acyl carrier proteins (ACP) and features an essentially invariant serine residue that is the attachment site for the phosphopantetheine prosthetic group. In this family, the corresponding residue is not Ser and is not conserved. Members are found in genomic contexts associated with a small Cys-rich peptide and a radical SAM protein we predict modifies the peptide. [Cellular processes, Toxin production and resistance]	77
213890	TIGR04070	photo_TT_lyase	spore photoproduct lyase. DNA damage to bacterial spores from ultraviolet light accumulates in the form of 5-thyminyl-5,6-dihydrothymine, spore photoproduct. The damage is repaired by spore photoproduct lyase, a member of the radical SAM family of enzymes. The score of this model is set to restrict itself to spore-forming members of the Firmicutes, but additional homologs scoring below the trusted cutoff tend to occur in radioresistant organisms (e.g. Kineococcus radiotolerans) and may be functionally equivalent. A related family in the Mycobacterium lineage is described by family TIGR03886, and may or may not be equivalent in function. [DNA metabolism, DNA replication, recombination, and repair, Cellular processes, Sporulation and germination]	338
274960	TIGR04071	methanobac_OB3b	methanobactin precursor, Mb-OB3b family. Methanobactins are siderophore-like copper-chelating natural products with considerable variety from species to species. The 11-residue methanobactin of Methylosinus trichosporium OB3b is derived from a 30-residue precursor. A very similar 31-residue precursor is found in the rice endophyte Azospirillum sp. B510, which has not yet been shown to produce a methanobactin. This model models the shared region of the first 25 amino acids, including a Cys-Gly-Ser motif.	30
188587	TIGR04072	rSAM_ladder_B12	lipid biosynthesis B12-binding/radical SAM protein. Members of this protein family occur in conserved genomic contexts highly suggestive of lipid biosynthesis, including an island shared between Kuenenia stuttgartiensis, which produces ladderanes, and Desulfotalea psychrophila, which produces a different kind of unusual polyunsaturated hydrocarbon.	151
274961	TIGR04073	exo_TIGR04073	putative exosortase-associated protein, TIGR04073 family. Members of this protein family are found in beta, gamma, and delta proteobacteria, and in the verrucomicrobia. Twenty-two of twenty-four species encountered contain the PEP-CTERM/exosortase system for modulating extracellular polysaccharide biosynthesis production, suggesting a role in protein sorting. The N-terminal signal sequence is divergent and not included in the model. PSI-BLAST and HMM searches suggest a distant sequence relationship between a region of this protein of about 100 amino acids and a corresponding region of the very large eukaryotic protein vps13, associated with vacuolar protein sorting in yeast.	75
274962	TIGR04074	bacter_Hen1	3' terminal RNA ribose 2'-O-methyltransferase Hen1. Members of this protein family are bacterial Hen1, a 3' terminal RNA ribose 2'-O-methyltransferase that acts in bacterial RNA repair. All members of the seed alignment belong to a cassette with the RNA repair enzyme polynucleotide kinase-phosphatase (Pnkp). Chemically similar Hen1 in eukaryotes acts instead on small regulatory RNAs. [Transcription, RNA processing, Protein synthesis, tRNA and rRNA base modification]	462
274963	TIGR04075	bacter_Pnkp	polynucleotide kinase-phosphatase. Members of this protein family are the bacterial polynucleotide kinase-phosphatase (Pnkp) whose genes occur paired with genes for the 3' terminal RNA ribose 2'-O-methyltransferase Hen1. All members of the seed alignment belong to a cassette with Hen1. The pair acts in bacterial RNA repair. This enzyme performs end-healing reactions on broken RNA, preparing for the RNA ligase to close the break. The working hypothesis is that the combination of Pnkp (RNA repair) and Hen1 (RNA modification) serves to first repair RNA damage from ribotoxins and then perform a modification that prevents the damage from recurring. [Transcription, RNA processing]	851
274964	TIGR04076	TIGR04076	TIGR04076 family protein. Members of this protein family are uncharacterized. The only invariant residue and one of three other residues better than 90 percent conserved are both Cys. Phylogenetic profiling results and occasional fusion genes suggest a role for members of this family in redox reactions or iron cluster metabolism. Species occasionally have two or three copies.	89
188592	TIGR04077	expor_sig_YdyF	exported signaling peptide, YydF/SAG_2028 family. This model describes a rare family of small proteins, about 50 residues in length, that includes YydF from Bacillus subtilis and SAG_2028 from Streptococcus agalactiae 2603V/R. Mutational analysis and genomic context show that members of this family likely are modified by a (variably present) radical SAM enzyme, are exported by an ABC transporter, and serve as signaling peptides. The member from Bacillus subtilis induces the LiaRS two-component system. [Regulatory functions, Protein interactions]	49
188593	TIGR04078	rSAM_yydG	peptide modification radical SAM enzyme, YydG family. Members of this radical SAM protein family for peptide modification occur only in the context of members of family TIGR04077, which average about 50 amino acids in length. In Bacillus subtilis, this protein (YydG) appears to act on its cognate target peptide (YydF) prior to its export, and result in the creation of a signaling molecule that induces the LiaRS two-component system. [Regulatory functions, Protein interactions]	309
188594	TIGR04079	phero_cyc_pep	KxxxW-cyclized secreted peptide. Members of this family are short precursor peptides in which the mature form undergoes a cyclization between a Lys and a Trp four residues away. The modification enzyme appears to be an adjacent encoded radical SAM protein. Genomes encoding this system include Streptococcus thermophilus LMD-9 and Lactococcus lactis subsp. cremoris MG1363, among others. [Cellular processes, Biosynthesis of natural products]	23
188595	TIGR04080	rSAM_pep_cyc	KxxxW cyclic peptide radical SAM maturase. Members of this family are radical SAM enzymes that appear to perform a cyclization on an adjacent cognate peptide from family TIGR04079. Genomes with the complete system include Streptococcus thermophilus LMD-9 and Lactococcus lactis subsp. cremoris MG1363, among others. The gene symbol assigned is kwcM, for KxxxW Cyclic peptide Maturase. [Protein fate, Protein modification and repair]	440
188596	TIGR04081	selen_ocin	radical SAM modification target peptide, selenobiotic family. Members of this protein family are small peptides found in the vicinity of a peptide modification-type radical SAM protein family. Multiple members of this protein family occur in species with a selenocysteine incorporation system and have a TGA stop codon at a position that aligns with cysteine residues from other homologs. This finding strongly suggests that GSU_1558 and similar members of the family are selenopeptides. The selenocysteine insertion sequence (SECIS) finder bSECISearch finds two homologous SECIS elements for two TGA codons in the extension of GSU_1558. Meanwhile, the pairing with the radical SAM enzyme suggests additional modification.	37
274965	TIGR04082	rSAM_for_selen	selenobiotic family peptide radical SAM maturase. Members of this protein family are radical SAM (rSAM) enzymes similar in sequence to others with known or postulated roles in peptide modification, and regularly found adjacent to members of the GSU_1558 peptide family described by model TIGR04081. GSU_1558 and several other members of that family appear to be selenoproteins, hence the term selenobiotic.	516
274966	TIGR04083	rSAM_pep_methan	putative peptide-modifying radical SAM enzyme, Mhun_1560 family. Members of this family are radical SAM enzymes, homologous to a variety of other peptide-modifying radical SAM, and found primarily in methanogenic archaea.	376
274967	TIGR04084	rSAM_AF0577	putative peptide-modifying radical SAM enzyme, AF0577 family. This radical SAM family contains a C-terminal region motif CXXCX5CX3C that is found in PqqE and other radical SAM enzymes that act on peptide substrates. Members of this family are found primarily in the Archaea, but also several eukaryotes (Trichomonas vaginalis G3, Entamoeba dispar SAW760, Giardia intestinalis ATCC 50581, etc.). The function is unknown.	347
274968	TIGR04085	rSAM_more_4Fe4S	radical SAM additional 4Fe4S-binding SPASM domain. This domain contains regions binding additional 4Fe4S clusters found in various radical SAM proteins C-terminal to the domain described by model pfam04055. Radical SAM enzymes with this domain tend to be involved in protein modification, including anaerobic sulfatase maturation proteins, a quinohemoprotein amine dehydrogenase biogenesis protein, the Pep1357-cyclizing radical SAM enzyme, and various bacteriocin biosynthesis proteins. The motif CxxCxxxxxCxxxC is nearly invariant for members of this family, although PqqE has a variant form. We name this domain SPASM for Subtilosin, PQQ, Anaerobic Sulfatase, and Mycofactocin.	93
274969	TIGR04086	TIGR04086_membr	putative membrane protein, TIGR04086 family. Members of this family of strongly hydrophobic putative transmembrane protein average about 125 amino acids in length and occur mostly, but not exclusively, in the Firmicutes. Members are quite diverse in sequence. The function is unknown.	115
274970	TIGR04087	YqxM_for_SipW	YqxM protein. Members of this protein family, including the partially characterized YqxM of Bacillus subtilis, are always found adjacent to a variant form, SipW, of signal peptidase, and are targets for this signal peptidase, as is the biofilm protein constituent TasA. The function may always be associated with biofilm formation. In one instance, this protein is fused with the SipW signal peptidase.	186
274971	TIGR04088	cognate_SipW	SipW-cognate class signal peptide. This model describes a protein N-terminal domain found regularly in proteins encoded near a variant form of signal peptidase I such as the SipW protein of Bacillus subtilis. Many though not all members are homologs of camelysin (a casein-cleaving metalloprotease) and TasA (CotN), a metalloprotease that is secreted, along with extracellular polysaccharide (EPS), to be the major protein constituent of the Bacillus subtilis biofilm matrix. Sequencing from several known TasA/CotN proteins shows the cleavage location to be near the center of the alignment and typical of type I signal peptidases, with small residues at -3 and -1. This domain, therefore, appears to be a special subclass of signal peptide.	34
274972	TIGR04089	exp_by_SipW_III	alternate signal-mediated exported protein, RER_14450 family. Members of this Actinobacterial protein family contain the cognate signal peptide domain, modeled by TIGR04088, for the variant SipW form of the signal peptidase I family. The remainder of this protein, however, differs from families such as Peptidase_M73 (pfam12389) and YqxM (TIGR04087) that share the same signal peptide domain. Some additional homologs to this family lack full-length homology and are excluded by the trusted cutoff as set. The two known targets for export by the SipW signal peptidase in Bacillus subtilis act in producing biofilm matrix material.	179
274973	TIGR04090	exp_by_SipW_IV	alternate signal-mediated exported protein, CPF_0494 family. Members of this largely Clostridial protein family contain the cognate signal peptide domain, modeled by TIGR04088, for the variant SipW form of the signal peptidase I family. The remainder of this protein, however, differs from families such as Peptidase_M73 (pfam12389) and YqxM (TIGR04087) that share the same signal peptide domain. Some additional homologs to this family lack full-length homology and are excluded by the trusted cutoff as set. The two known targets for export by the SipW signal peptidase in Bacillus subtilis act in producing biofilm matrix material. Members include CPF_0494, adjacent to SipW homolog CPF_0493.	244
274974	TIGR04091	LTA_dltB	D-alanyl-lipoteichoic acid biosynthesis protein DltB. Members of this protein family are DltB, part of a four-gene operon for D-alanyl-lipoteichoic acid biosynthesis that is present in the vast majority of low-GC Gram-positive organisms. This protein may be involved in transport of D-alanine across the plasma membrane. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan]	380
274975	TIGR04092	LTA_DltD	D-alanyl-lipoteichoic acid biosynthesis protein DltD. Members of this protein family are DltD, part of the DltABCD system widely distributed in the Firmicutes for D-alanylation of lipoteichoic acids. The most common form of LTA, as in Staphylococcus aureus, has a backbone of polyglycerolphosphate.	384
274976	TIGR04093	cas1_CYANO	CRISPR-associated endonuclease Cas1, subtype CYANO. The CRISPR-associated protein Cas1 is virtually universal to CRISPR systems. CRISPR, an acronym for Clustered Regularly Interspaced Short Palindromic Repeats, is a prokaryotic immunity system for foreign DNA, mostly from phage. CRISPR systems belong to different subtypes, distinguished by the nature of the repeats, the makeup of the cohort of associated Cas proteins, and molecular phylogeny within the more universal Cas proteins such as this one. This model is of type EXCEPTION and provides more specific information than the EQUIVALOG model TIGR00287. It describes a clade of Cas1 limited to the CYANO subtype of CRISPR/Cas system and most often the type found there.	323
274977	TIGR04094	adjacent_YSIRK	YSIRK-targeted surface antigen transcriptional regulator. Bacteria whose genomes encode only one protein with the YSIRK variant form of signal peptide (TIGR01168) were examined for conserved genes near that one tagged protein. This protein is found adjacent to various classes of repetitive or low-complexity YSIRK proteins (whether unique in genome or not), in a range of species (Enterococcus faecalis X98, Ruminococcus torques, Coprobacillus sp. D7, Lysinibacillus fusiformis ZC1, Streptococcus equi subsp. equi 4047, etc). The affiliated YSIRK proteins include Streptococcal protective antigen (see ) and proteins with the Rib/alpha/Esp surface antigen repeat (see TIGR02331). The last quarter of this protein has an AraC family helix-turn-helix (HTH) transcriptional regulator domain.	383
274978	TIGR04095	dnd_restrict_1	DNA phosphorothioation system restriction enzyme. The DNA phosphorothioate modification system dnd (DNA instability during electrophoresis) recently has been shown to provide a modification essential to a restriction system. This protein family was detected by Partial Phylogenetic Profiling as linked to dnd, and its members usually are clustered with the dndABCDE genes.	451
274979	TIGR04096	dnd_rel_methyl	DNA phosphorothioation-associated putative methyltransferase. Members of this protein family show distant local sequence similarity to a number of S-adenosyl-methionine-dependent methyltransferases. The family is identified by Partial Phylogenetic Profiling as closely tied to the DNA phosphorothioation system (dnd), and members are found adjacent to dnd genes in at least 13 species (Streptomyces lividans TK24, Shewanella frigidimarina NCIMB 400, Mycobacterium abscessus ATCC 19977, Nostoc punctiforme PCC 73102, Vibrio fischeri MJ11, etc.). The DNA phosphorothioation enables a novel form of restriction enzyme activity. Most members of this family appear in species with the DNA phosphorothioation system. [DNA metabolism, Restriction/modification]	478
274980	TIGR04098	biosyn_clust_1	biosynthesis cluster domain. Radical SAM family TIGR04043 is a marker for a widespread eight-gene probable biosynthetic cluster of unknown function. This protein family describes a domain that occurs as an additional protein for some of those clusters, but also as the N-terminal domain of large, multidomain polyketide synthases and in other contexts.	270
274981	TIGR04099	biosn_Pnap_2097	probable biosynthetic protein, Pnap_2097 family. Radical SAM family TIGR04043 is a marker for a widespread eight-gene probable biosynthetic cluster of unknown function. This protein family occurs only in the context of TIGR04043 member-containing biosynthetic clusters, although in a minority of such clusters. This protein family belongs to the TIGR04098 domain family, which also includes N-terminal domains of several probable polyketide synthases. A role in biosynthetic processes is suspected.	258
188615	TIGR04100	rSAM_pair_X	radical SAM enzyme, TIGR04100 family. Members of this protein family are radical SAM enzymes that appear paired with members of TIGR04002, a family of small (~170 residue), mostly hydrophobic proteins. This family of radical SAM enzymes belongs to a larger family TIGR04038, in which some members show regularly in contexts with TatD.	197
200352	TIGR04101	CCGSCS	CCGSCS motif protein. This protein family, with average protein length about 58 residues, occurs in several marine bacteria, such as Shewanella benthica KT99, Marinobacter sp. ELB17, and Photobacterium profundum 3TCK. The striking feature is a C-terminal motif CCGSCS, which (perhaps coincidentally) resembles conserved core motif [LC]CGSC shared by two methanobactin precursors (see TIGR04071). There is no detectable conserved gene region for these proteins.	59
200353	TIGR04102	SWIM_PBPRA1643	SWIM/SEC-C metal-binding motif protein, PBPRA1643 family. Members of this protein family have a SWIM, or SEC-C, domain (see pfam02810), a 21-amino acid putative Zn-binding domain that is shared with SecA, plant MuDR transposases, etc. This small protein family of unknown function occurs primarily in marine bacteria.	108
200354	TIGR04103	rSAM_nif11_3	nif11-class peptide radical SAM maturase 3. Members of this protein family are peptide-modifying radical SAM enzymes, with a C-terminal additional 4Fe-4S cluster binding domain like many other peptide-modifying radical SAM enzymes. This form occurs primarily in the genera Cyanothece and Nostoc.	412
274982	TIGR04104	cxxc_20_cxxc	cxxc_20_cxxc protein. This small, uncommon, poorly conserved protein is found primarily in the Firmicutes. It features a pair of CxxC motifs separated by about 20 amino acids, followed by a highly hydrophobic region of about 45 amino acids. It has no conserved gene neighborhood, and its function is unknown.	94
274983	TIGR04105	FeFe_hydrog_B1	[FeFe] hydrogenase, group B1/B3. See for descriptions of different groups.	462
274984	TIGR04106	cas8c_GSU0052	CRISPR-associated protein GSU0052/csb3, Dpsyc system. This model describes a CRISPR-associated (cas) protein unique to the Dpsyc subtype (named for Desulfotalea psychrophila), a variant type I-C subtype, although not universal to that subtype. Members of this family occur in CRISPR loci of Geobacter sulfurreducens PCA, Gemmata obscuriglobus UQM 2246, Rhodospirillum centenum SW, Planctomyces limnophilus DSM 3776, and Methylosinus trichosporium OB3b.	282
274985	TIGR04107	rSAM_HutW	putative heme utilization radical SAM enzyme HutW. HutW is a radical SAM enzyme closely related to HemN, the heme biosynthetic oxygen-independent coproporphyrinogen oxidase. It belongs to operons associated with heme uptake and utilization in Vibrio cholerae and related species, but neither it nor HutX has been shown to be needed, as is HutZ, for heme utilization. HutW failed to complement a Salmonella enterica hemN mutant (), suggesting a related but distinct activity. Some members of this family are fused to hutX.	420
274986	TIGR04108	HutX	putative heme utilization carrier protein HutX. Members of this protein family are HutX, found paired with HutW in some heme utilization loci although not shown directly to be necessary for heme utilization. This protein is homologous to the heme carrier protein HemS, while its partner HutW is homologous to (but does not complement) HemN, the radical SAM enzyme oxygen-independent coproporphyrinogen III oxidase involved in heme biosynthesis.	154
200360	TIGR04109	heme_ox_HugZ	heme oxygenase, HugZ family. Members of this protein family are HugZ, a class of heme oxygenase that belongs to the PPOX family (pfam01243) and lacks homology to the HmuO family (pfam01126). Characterized members of this family include HP0318 from Helicobacter pylori and CJ1613c from Campylobacter jejuni. This enzyme releases iron during the conversion of heme to biliverdin.	243
200361	TIGR04110	heme_HutZ	heme utilization protein HutZ. Members of this family are heme utilization proteins, typically designated HutZ. They are members of the PPOX family (pfam01243) and, except for the lack of an N-terminal extension, are closely related to one form of heme oxidase (1.14.99.3), HugZ (TIGR04109). Members typically are found in a three-gene operon with radical SAM enzyme HutW and a protein of unknown function, HutX.	168
274987	TIGR04111	BcepMu_gp16	phage-associated protein, BcepMu gp16 family. Members of this protein family occur in Burkholderia phage BcepMu, Pseudomonas phage B3, and Burkholderia phage KS10, and many bacterial putative prophage regions. The member from Burkholderia phage BcepMu is named gp16. Homology suggests DNA-binding activity. [Mobile and extrachromosomal element functions, Prophage functions]	55
274988	TIGR04112	seleno_YedE	putative selenium metabolism protein, YedE family. For 79 of the first 80 reference genomes in which a member of this protein family, YedE, is found, a selenium utilization system is found, spread over a broad taxonomic range (Firmicutes, spirochetes, delta-proteobacteria, Fusobacteria, Bacteroides, etc.). This family is less widespread than YedF, also involved in selenium metabolism.	337
274989	TIGR04113	cas_csx17	CRISPR-associated protein Csx17, subtype Dpsyc. Members of this protein family are found exclusively in CRISPR-associated (cas) type I system gene clusters of the Dpsyc subtype. Markers for that type include a variant form of cas3 (model TIGR02621) and the GSU0054-like protein family (model TIGR02165). This family occurs in less than half of known Dpsyc clusters.	703
274990	TIGR04114	tSAM_targ_Cxxx	modification target Cys-rich repeat. This model describes a cysteine-rich repeat found in a number of bacterial putative radical-SAM modified natural product precursors. A substantial fraction of members of this family have been missed during gene-finding. A true hit to the model must exceed both TC on the whole and trusted cutoff 2 for at least one domain, to avoid false-positives from African swine fever virus proteins.	18
200366	TIGR04115	rSAM_Cxxx_rpt	radical SAM peptide maturase, CXXX-repeat target family. Members of this radical SAM domain protein are predicted peptide maturases, similar to PqqE, AlbA, the mycofactocin radical SAM maturase, and many others that share the peptide modification radical SAM protein C-terminal additional 4Fe4S-binding domain (TIGR04085). Members co-occur with a protein of unknown function that may be a chaperone or immunity protein and with a peptide that may have twelve or more cysteines occurring regularly spaced every fourth residue. These Cys residues tend to be flanked by residues with small side chains that provide minimal steric hindrance to crosslink formation by the radical SAM enzyme as in the subtilosin A system.	359
274991	TIGR04116	CXXX_rpt_assoc	CXXX repeat peptide modification system protein. Members of this protein family occur strictly in the presence of a peptide modification radical SAM enzyme described by model TIGR04115 and some small peptide in which, for a stretch, every fourth amino acid is Cys. Cysteine residues usually are flanked by residues with sterically small side chains, as with many radical SAM-modified peptides. Many of the latter are recognized by model TIGR04114.	90
274992	TIGR04117	Syntroph_Cxxx	Syntrophus aciditrophicus Cys-Xaa-Xaa-Xaa repeat radical SAM target protein. This model represents a paralogous family, in Syntrophus aciditrophicus SB, of peptides with a conserved N-terminal region followed by ten to seventeen direct repeats of the sequence CXXX (see repeats model TIGR04114). The N-terminal region includes a hydrophobic patch that is not shared by most members of family TIGR04114.	95
200369	TIGR04118	Cxxx_AC3_0185	modification target Cys-rich peptide, AC3_0185 family. Radical SAM enzyme family TIGR04115 is paired with a number of short peptides with multiple tandem repeats of Cys-Xaa-Xaa-Xaa (see TIGR04114). This family represents a peptide family with a TIGR04114-like region, although the repeat region is relatively short in this group.	46
200370	TIGR04119	CXXX_matur	CXXX repeat peptide maturase. This model describes a peptide maturase that works with, usually fused to, a radical SAM enzyme in a system that modifies peptides with multiple tandem repeats of CXXX sequences. This protein includes an iron-sulfur cluster binding region associated with peptide modification as described in domain model TIGR04085.	210
274993	TIGR04120	DNA_lig_bact	DNA ligase, ATP-dependent, PP_1105 family. This model describes a family of ATP-dependent DNA ligases present in about 12 % of prokaryotic genomes. It occurs as part of a four-gene system with an exonuclease, a helicase and a phosphoesterase, with all four genes clustered or at least the first two and last two paired. This family resembles DNA ligase I (see TIGR00574 and pfam01068), and its presumed function may be in DNA repair, replication, or recombination.	526
274994	TIGR04121	DEXH_lig_assoc	DEXH box helicase, DNA ligase-associated. Members of this protein family are DEAD/DEAH box helicases found associated with a bacterial ATP-dependent DNA ligase, part of a four-gene system that occurs in about 12 % of prokaryotic reference genomes. The actual motif in this family is DE[VILW]H.	804
274995	TIGR04122	Xnuc_lig_assoc	putative exonuclease, DNA ligase-associated. Members of this protein family frequently are found annotated as a putative exonuclease involved in mRNA processing. This protein is found, exclusively in bacteria, associated with three other proteins: an ATP-dependent DNA ligase, a helicase, and putative phosphoesterase.	326
274996	TIGR04123	P_estr_lig_assc	metallophosphoesterase, DNA ligase-associated. Members of this protein family are an uncharacterized putative metallophosphoesterase associated with a DNA ligase, a helicase, and a putative exonuclease. It may play a role in DNA repair. Its system is present in about 12 % of prokaryotic reference genomes.	208
274997	TIGR04124	archaeo_artE	archaeosortase family protein ArtE. This protein family is related to the predicted protein-sorting transpeptidase Exosortase (EpsH), with the Cys, Arg, and His putative active site residues preserved, but it is strictly archaeal and is not associated with any known PEP-CTERM-like target sequence. The immediate gene neighborhood in most genomes suggests RNA (methylase, cyclase) and cofactor (thiamine pyrophosphate) metabolism. The function is unknown. It is designated archaeosortase family protein ArtE.	153
274998	TIGR04125	exosort_PGF_TRM	archaeosortase A, PGF-CTERM-specific. This family is an archaeal variant of the (normally bacterial) putative protein-sorting integral membrane protein exosortase, hence archaeosortase. In species with a member of this family, its PGF-CTERM cognate sequence (TIGR04126) occurs at the C-termini of from two to over fifty proteins per genome. Those target proteins may not share homology to each other in regions N-terminal to the PGF-CTERM region.	262
274999	TIGR04126	PGF_CTERM	PGF-CTERM archaeal protein-sorting signal. This model describes a strictly archaeal putative protein-sorting motif, PGF-CTERM. It is the (predicted) recognition sequence for an exosortase homolog, archaeosortase (TIGR04125). In some archaea, up to fifty proteins have this domain as their C-terminal region, usually preceded by a Thr-rich region likely to be heavily glycosylated. The removal of this sorting signal may be associated with a C-terminal prenyl group modification in the halobacterial major cell surface glycoprotein, an S-layer protein.	28
275000	TIGR04127	flavo_near_exo	exosortase F-associated protein. Members of this protein family are always found next to an exosortase/archaeosortase-like protein, and occur so far only in the flavobacteria, within the Bacteroidetes. Members do not have an obvious PEP-CTERM-like C-terminal protein sorting domain.	136
275001	TIGR04128	exoso_Fjoh_1448	exosortase family protein XrtF. Members of this protein family are exosortase-related proteins found always in association with a member of family TIGR04127, a small, hydrophobic, uncharacterized protein limited to the Bacteroidetes. Exosortases are proposed transpeptidases with a cysteine active site (3.4.22.-), but usually are associated with specific C-terminal target motifs (PEP-CTERM, PEF-CTERM, PGF-CTERM, etc).	174
275002	TIGR04129	CxxH_BA5709	CxxH/CxxC protein, BA_5709 family. Members of this protein family occur exclusively in the Firmicutes, in at least 50 different species. Members average about 55 residues in length, and four of the five invariant or nearly invariant residues occur in motifs CxxH and CxxC. The function is unknown.	49
275003	TIGR04130	FnlA	UDP-N-acetylglucosamine 4,6-dehydratase/5-epimerase. The FnlA enzyme is the first step in the biosynthesis of UDP-FucNAc from UDP-GlcNAc in E. coli (along with FnlB and FnlC). The proteins identified by this model include FnlA homologs in the O-antigen clusters of O4, O25, O26, O29 (Shigella D11), O118, O145 and O172 serotype strains, all of which produce O-antigens containing FucNAc (or the further modified FucNAm). A homolog from Pseudomonas aeruginosa serotype O11, WbjB, also involved in the biosynthesis of UDP-FucNAc has been characterized and is now believed to carry out both the initial 4,6-dehydratase reaction and the subsequent epimerization of the resulting methyl group at C-5. A phylogenetic tree of related sequences shows a distinct clade of enzymes involved in the biosynthesis of UDP-QuiNAc (Qui=quinovosamine). This clade appears to be descendant from the common ancestor of the Pseudomonas and E. coli fucose-biosynthesis enzymes. It has been hypothesized that the first step in the biosynthesis of these two compounds may be the same, and thus that these enzymes all have the same function. At present, lacking sufficient confirmation of this, the current model trusted cutoff only covers the tree segment surrounding the E. coli genes. The clades containing the Pseudomonas and QuiNAc biosynthesis enzymes score above the noise cutoff. Immediately below the noise cutoff are enzymes involved in the biosynthesis of UDP-RhaNAc (Rha=rhamnose), which again may or may not produce the same product.	337
275004	TIGR04131	Bac_Flav_CTERM	gliding motility-associated C-terminal domain. This model describes a protein homology domain unique to, and greatly expanded in, the Bacteroidetes. Species in this lineage include several, such as Cytophaga hutchinsonii and Flavobacterium johnsoniae, that exhibit a poorly understood rapid gliding phenotype. Several members of this protein family are found in operons with other genes whose loss leads to a loss of this motility. Proteins with this domain frequently pair with members of family TIGR03519, whether one such pair or many occur in a genome. More than 30 members may occur in one genome.	85
275005	TIGR04132	intra_fol_E_lig	putative folate metabolism gamma-glutamate ligase. This protein family is related to CofE, a gamma-glutamyl ligase of coenzyme F420 biosynthesis. However, it occurs in a different gamma-glutamyl ligase context, polyglutamylated tetrahydrofolate biosynthesis-like regions in two widely separated lineages that both occur as intracellular bacteria - Chlamydia and Wolbachia.	241
200384	TIGR04133	rSAM_w_lipo	radical SAM enzyme, rSAM/lipoprotein system. Members of this protein family are radical SAM enzymes with an additional 4Fe4S cluster-binding C-terminal domain (TIGR04085) shared with PqqE and many other peptide and protein-modifying radical SAM enzymes. All members occur in the context of a predicted lipoprotein that usually is encoded by an adjacent gene.	350
200385	TIGR04134	lipo_with_rSAM	putative lipoprotein, rSAM/lipoprotein system. Members of this family are Bacteroidetes lineage putative lipoproteins that always occur in pairs with a radical SAM enzyme, TIGR04133, from a branch of the radical SAM superfamily in which many members perform peptide or protein modifications. In some members, the region distal to the Cys of the putative lipoprotein cleavage motif is duplicated.	150
200386	TIGR04135	FibroRuminTarg	Cys-rich radical SAM target, FibroRumin family. Members of this protein family are cysteine-rich small peptides, about 52 amino acids long, that are proposed targets for modification by a radical SAM enzyme. Known occurrences are as tandem gene pairs in Fibrobacter succinogenes subsp. succinogenes S85 (missed gene calls) and in Ruminococcus albus 8.	52
200387	TIGR04136	rSAM_FibroRumin	radical SAM peptide maturase, FibroRumin system. Members of this protein family are radical SAM enzymes proposed to act on small, Cys-rich peptides encoded by tandem gene pairs. Members occur in Fibrobacter succinogenes subsp. succinogenes S85 (genes for their target peptides missed) and in Ruminococcus albus 8. This enzyme family is similar in sequence to the SCIFF (Six Cysteines in Forty-Five) system maturase (TIGR03974).	458
275006	TIGR04137	Chlam_Ver_rRNA	Chlam_Verruc_Plancto small basic protein. Members of this protein family are commonly found next to markers of rRNA processing such as YbeY. They are extremely lineage-restricted, in the Planctomycetes and Chlamydiae/Verrucomicrobia group. Since classification is based on rRNA molecular phylogeny, this provides additional support for a role in rRNA metabolism. This small protein, about 50 amino acids in length, is rich in basic residues, a third line of support for rRNA interaction.	50
275007	TIGR04138	Plancto_Ver_chp	Verruc_Plancto-restricted protein. Members of this protein family are extremely lineage-restricted, occurring exclusively in the Planctomycetes and Chlamydiae/Verrucomicrobia group, although not in Chlamydia itself. The function is unknown; the lack of invariant residues other than a single Phe suggests an ancient, conserved, non-enzymatic role.	122
200390	TIGR04139	CxxCx5CxxC_targ	putative peptide modification target, TIGR04139 family. This model describes a rare family of small putative polypeptides, including three encoded in tandem in Sphingobacterium spiritivorum ATCC 33300, in the vicinity of a TIGR04085 protein. This pairing is conserved in Chryseobacterium gleum ATCC 35910, Kordia algicida OT-1, and other species. TIGR04085 describes a C-terminal additional 4Fe4S-binding domain in PqqE and other radical SAM enzymes that seems to be a marker for peptide modification, and the family modeled here is a candidate modified peptide precursor.	66
275008	TIGR04140	chp_AF_0576	TIGR04140 family protein. This model represents an uncharacterized small archaeal protein.	66
275009	TIGR04141	TIGR04141	sporadically distributed protein, TIGR04141 family. This model describes a sporadically distributed conserved hypothetical protein in which complete members average over 500 amino acids in length, although matching sequences frequently are truncated or broken into tandem ORFs. Regular co-clustering with known markers of mobility (integrases, transposases, phage proteins, restriction enzymes, etc.) suggests this family also is part of the mobilome. The function is unknown.	516
200393	TIGR04142	PCisTranLspir	putative peptidyl-prolyl cis-trans isomerase, LIC12922 family. Members of this protein family have a known crystal structure (3NRK) showing similarity to the peptidyl-prolyl cis-trans isomerase SurA. Members are found in Leptospira species next to an uncharacterized radical SAM enzyme and a cytidylyltransferase family protein.	315
200394	TIGR04143	VPxxxP_CTERM	VPXXXP-CTERM protein sorting domain. This C-terminal protein sorting domain is detected, so far, in Methanohalophilus mahii DSM 5219 (five members) and Methanohalobium evestigatum Z-7303 (nine members). This domain resembles the PEP-CTERM, PEF-CTERM, and PGF-CTERM domains of other exosortase/archaeosortase systems. Member proteins co-cluster with a variant member of the exosortase/archaeosortase protein family, and represent a boutique second sorting system in these species.	25
200395	TIGR04144	archaeo_VPXXXP	archaeosortase B, VPXXXP-CTERM-specific. Members of this protein family are found so far in Methanohalophilus mahii DSM 5219 and Methanohalobium evestigatum Z-7303, along with five and nine proteins, respectively, with the VPXXXP-CTERM protein sorting signal (TIGR04143). In these species, this boutique system represents a second exosortase/archaeosortase-type system.	156
200396	TIGR04145	Firmicu_CTERM	Firmicu-CTERM domain. This C-terminal domain is found only in the Firmicutes, where its presence is sporadically distributed. Proteins with this domain are most conserved in the C-terminal region, where the pattern of ending with a transmembrane domain resembles both the LPXTG (sortase target) and PEP-CTERM (exosortase target) domain structures. However, members occur exclusively in the presence of an exosortase-like protein XrtG (TIGR03110), a putative glycosyltransferase (TIGR03111), and a 6-pyruvoyl tetrahydropterin synthase-related protein (TIGR03112).	45
275010	TIGR04146	GGGPS_Afulg	phosphoglycerol geranylgeranyltransferase. This enzyme, known also as GGGP synthase and GGGPS, catalyzes the stereospecific first step in the biosynthesis of the characteristic membrane diether lipids of archaea. Interestingly, the closest homologs outside this family are not the functionally equivalent enzymes of other archaea, but rather functionally distinct bacterial enzymes.	221
275011	TIGR04147	GGGPS_Halobact	phosphoglycerol geranylgeranyltransferase, putative. In most archaea, phosphoglycerol geranylgeranyltransferase (EC 2.5.1.41), also known as GGGP synthase and GGGPS, catalyzes the stereospecific first step in the biosynthesis of their characteristic membrane diether lipids. However, some groups of archaeal GGGPS homologs are more closely related to certain bacterial proteins than to each other. This family represents the putative GGGPS family as found in the Halobacteria.	229
200399	TIGR04148	GG_samocin_CFB	radical SAM peptide maturase, GG-Bacteroidales family. Members of this protein family are radical SAM enzymes (pfam04055) with the additional C-terminal region (TIGR04085) that is frequently a marker of peptide modification. Many members of this family are found in the vicinity of one or several ORFs encoding short polypeptides with a Gly-Gly motif (common for bacteriocin leader peptide cleavage), followed by a Cys-rich patch and then poorly conserved sequences.	411
275012	TIGR04149	GG_sam_targ_CFB	natural product precursor, GG-Bacteroidales family. Sequences in this protein domain family include a leader peptide region, up to and including a Gly-Gly cleavage motif, and about 15 additional residues, usually Cys-rich, from a family of predicted ribosomal natural product precursors. Many of these are associated with peptide-modifying radical SAM enzymes. The core region, up through the diglycine motif, resembles and contains some overlapping hits with the bacteriocin precursor leader peptide region modeled by TIGR01847, but is longer with an extreme N-terminal region with consensus sequence MKKLKKLKL. [Cellular processes, Biosynthesis of natural products]	43
275013	TIGR04150	pseudo_rSAM_GG	pseudo-rSAM protein, GG-Bacteroidales system. Many peptide-modifying radical SAM enzymes have two 4Fe4S-binding regions, an N-terminal one recognized by Pfam radical SAM domain-defining model pfam04055 and a C-terminal one recognized by TIGR04085. Members of this protein family occur in cassettes with such a radical SAM family (TIGR04148) and with a peptide modification target (TIGR04149). Surprisingly, members of this family show full-length homology to each other, with several scoring at least borderline hits to both pfam04055 and TIGR04085, and yet differ in the presence/absence of a signature CX(3)CX(2)CX(9)C motif. Instead, members are best-conserved in the TIGR04085-like C-terminal region. Therefore, this protein family is designated a pseudo-radical-SAM protein, which likely works in partnership with a TIGR04148 family protein.	407
200402	TIGR04151	exosort_VPDSG	exosortase C, VPDSG-CTERM-specific. Through in silico analysis, we previously described the PEP-CTERM/exosortase system. This model describes the exosortase subtype specific for the VPDSG-CTERM variant (TIGR03778) of PEP-CTERM. Systems are found, so far, in Verrucomicrobiae bacterium DG1235 (twenty) and bacterium Ellin514 (two). This system may coexist with other system variants. [Protein fate, Protein and peptide secretion and trafficking]	309
275014	TIGR04152	exosort_VPLPA	exosortase D, VPLPA-CTERM-specific. This model describes a variant sub class, exosortase D, of protein sorting enzyme (see parent exosortase model TIGR02602), specific for the VPLPA-CTERM variant (TIGR03370) of the PEP-CTERM protein sorting signal. [Protein fate, Protein and peptide secretion and trafficking]	486
275015	TIGR04153	cyanosortA_assc	cyanosortase A-associated protein. Members of this protein family are found exclusively in the Cyanobacteria, usually encoded next to and in at least one case fused to a gene encoding cyanoexosortase A. Note that family TIGR04533 shows a similar relationship to cyanoexosortase B (TIGR04156), and no EpsI is found.	186
275016	TIGR04154	archaeo_STT3	oligosaccharyl transferase, archaeosortase A system-associated. Members of this protein family occur, one to three members per genome, in the same species of Euryarchaeota as contain the predicted protein-sorting enzyme archaeosortase (TIGR04125) and its cognate protein-sorting signal PGF-CTERM (TIGR04126).	817
275017	TIGR04155	cyano_PEP	PEP-CTERM protein sorting domain, cyanobacterial subclass. This domain model describes a subclass within family TIGR02595 of PEP-CTERM protein sorting signals associated with bacterial exosortases. This subclass is restricted to Cyanobacteria, including the genera Cyanothece, Nostoc, Trichodesmium, Lyngbya, Arthrospira, etc. This PEP-CTERM subclass features strongly conserved residues within the transmembrane region, including a Gx4GxG motif. Model TIGR03763 describes a corresponding cyanobacterial form of exosortase found in most species with this domain.	25
275018	TIGR04156	cyanoexo_CrtB	cyanoexosortase B. This model describes a cyanobacterial-restricted form of exosortase, associated with a PEP-CTERM domain subclass described in model TIGR04155. This is one of two such cyanoexosortases, either of which is sufficient to accompany TIGR04155 family members. The other cyanoexosortase is TIGR03763. [Protein fate, Protein and peptide secretion and trafficking]	280
275019	TIGR04157	glyco_rSAM_CFB	glycosyltransferase, GG-Bacteroidales peptide system. Members of this protein family are predicted glycosyltransferases that occur in conserved gene neighborhoods in various members of the Bacteroidales. These neighborhoods feature a radical SAM enzyme predicted to act in peptide modification (family TIGR04148), peptides from family TIGR04149 with a characteristic GG cleavage motif, and several other proteins.	405
275020	TIGR04158	rSAM_MIA_synth	3-methyl-2-indolic acid synthase. Members are a radical SAM enzyme that converts L-Trp to 3-methyl-2-indolic acid through a complex rearrangement. This enzyme is closest to ThiH, which also does a complex rearrangement, among other characterized radical SAM enzymes.	368
200410	TIGR04159	methbact_MbnB	methanobactin biosynthesis cassette protein MbnB. The first characterized methanobactin is made from a ribosomal precursor in Methylosinus trichosporium OB3b. Two additional species with homologous precursor peptides (family TIGR04071) are Azospirillum sp. B510 and Gluconacetobacter sp. SXCC-1. This model describes a clique of related sequences, domain or full-length, that occurs always and only next to a methanobactin precursor. The model excludes some close homologs from species where no similar precursor can be found.	91
275021	TIGR04160	methbact_MbnC	methanobactin biosynthesis cassette protein MbnC. The first characterized methanobactin is made from a ribosomal precursor in Methylosinus trichosporium OB3b. Two additional species with homologous precursor peptides (family TIGR04071) are Azospirillum sp. B510 and Gluconacetobacter sp. SXCC-1. This model describes a clique of related sequences, domain or full-length, that occurs always and only next to a methanobactin precursor of the Mb-OB3b type. The model excludes several Pseudomonas proteins whose function is unknown, which likewise are in model TIGR04061, but which diverge toward the C-terminus.	89
275022	TIGR04161	VPEID-CTERM	VPEID-CTERM protein sorting domain. Proteins belonging to this family are small, 80 to 120 residues, including a signal peptide, a central low-complexity region, and this roughly 31-amino acid extreme C-terminal region. Members occur paired with a variant form of exosortase. Species include Ruegeria sp., Phaeobacter gallaeciensis, Roseovarius nubinhibens ISM, and two in Methylobacter tundripaludum.	31
200413	TIGR04162	exo_VPEID	exosortase E/protease, VPEID-CTERM system. Members of this protein family are fusion proteins of exosortase (N-terminal) and a CAAX prenyl protease domain (C-terminal). Members are restricted to the alpha Proteobacteria. The variant C-terminal protein sequence VPEID-CTERM occurs only in these species, often adjacent.	519
200414	TIGR04163	rSAM_cobopep	peptide-modifying radical SAM enzyme CbpB. Members of this family are radical SAM enzymes that modify a short peptide encoded by an upstream gene. A role in metal chelation is suggested.	428
200415	TIGR04164	cobo_pep	modified peptide precursor CbpA. Members of this family are short peptides predicted to reach mature form after modification by a radical SAM enzyme (TIGR04163).	25
275023	TIGR04165	methano_modCys	Cys-rich peptide, TIGR04165 family. Members of this small peptide family occur strictly in a subset of archaeal methanogens. Members have four invariant Cys residues in two Cys-Xaa-Xaa-Cys-Gly motifs and may have other Cys residues as well. At least two members occur next to family TIGR04083 radical SAM enzymes predicted to act in peptide or protein modification.	50
275024	TIGR04166	methano_MtrB	tetrahydromethanopterin S-methyltransferase, subunit B. Members of this protein family are the MtrB protein of the tetrahydromethanopterin S-methyltransferase complex. This system is universal in archaeal methanogens. [Energy metabolism, Methanogenesis]	95
275025	TIGR04167	rSAM_SeCys	radical SAM/Cys-rich domain protein. Members of this protein family have an N-terminal radical SAM domain (pfam04055) and a C-terminal pfam12345 domain. The C-terminal region has several conserved Cys residues, one of which is replaced by selenocysteine in at least five bacterial reference genomes.	303
275026	TIGR04168	TIGR04168	TIGR04168 family protein. Members of this uncharacterized protein family are restricted, in 49 of 50 genomes, to organisms with a family TIGR04167 radical SAM protein, which occasionally is a selenoprotein.	269
200420	TIGR04169	perox_w_seleSAM	alkylhydroperoxidase/carboxymuconolactone decarboxylase family protein. Members of this family are usually annotated as putative carboxymuconolactone decarboxylases, are related also to alkylhydroperoxidase AhpD, and contain a peroxidase-like Cys-X-X-Cys putative redox-active disulfide. All members occur in genomes with a radical SAM protein of family TIGR04167, which occasionally is a selenoprotein.	109
211905	TIGR04170	RNR_1b_NrdE	ribonucleoside-diphosphate reductase, class 1b, alpha subunit. Members of this family are NrdE, the alpha subunit of class 1b ribonucleotide reductase. This form uses a dimanganese moiety associated with a tyrosine radical to reduce the cellular requirement for iron.	698
275027	TIGR04171	RNR_1b_NrdF	ribonucleoside-diphosphate reductase, class 1b, beta subunit. Members of this family are NrdF, the beta subunit of class 1b ribonucleotide reductase. This form uses a dimanganese moiety associated with a tyrosine radical to reduce the cellular requirement for iron. [Purines, pyrimidines, nucleosides, and nucleotides, 2'-Deoxyribonucleotide metabolism]	313
275028	TIGR04172	DGQHR_dnd_1	DNA phosphorothioation-associated DGQHR protein 1. The DND system produces a phosphorothioation modification to DNA, replacing a non-bridging oxygen of a phosphate group with sulfur. The modification causes DNA degradation during electrophoresis in Tris buffer. This protein, like DndB (TIGR03233), contains a DGQHR domain (TIGR03187), which also occurs in several contexts that suggest lateral transfer rather than DNA phosphorothioation-dependent restriction.	378
200424	TIGR04173	PIP_CTERM	PIP-CTERM protein sorting domain. Proteins closely related to MJ_1469.1 from Methanocaldococcus jannaschii DSM 2661 are designated archaeosortase D (ArtD). ArtD appears to be a dedicated protein-sorting enzyme with a single target, a PKD domain (pfam00801) repeat protein encoded by an adjacent gene. This model describes the C-terminal putative protein-sorting region structurally similar to PEP-CTERM (TIGR02602) and found only on these methanogen PKD domain proteins.	27
275029	TIGR04174	IPTL_CTERM	IPTL-CTERM protein sorting domain. This model describes a variant form of the PEP-CTERM C-terminal protein-sorting domain, with a consensus motif IPTL replacing the more typical VPEP. A majority of these sequences have a WG (Trp-Gly) motif at positions 7-8 of the domain. Species with multiple (up to 15) copies of this domain include Acidovorax citrulli, Acidovorax delafieldii 2AN, Delftia acidovorans SPH-1, and gamma proteobacterium NOR5-3.	27
200426	TIGR04175	archaeo_artD	archaeosortase D. This model describes archaeosortase D, one of several strictly archaeal subfamilies related to exosortase, the bacterial protein-sorting putative transpeptidase (see TIGR02602). ArtD is found in the genus Methanocaldococcus. Its predicted target, encoded by an adjacent gene, has a C-terminal VPIP motif-containing region (TIGR04173) likely to be its recognition site.	150
275030	TIGR04176	MarR_EPS	EPS-associated transcriptional regulator, MarR family. Members of this family of MarR-family transcriptional regulators are associated with long genomic loci consisting of genes encoding enzymes for the biosynthesis of exopolysaccharides. These genes include glycosyl transferases, sugar modifying enzymes (epimerases, isomerases, methyltransferases, aminotransferases, etc.), and exopolysaccharide polymerases (wzx, wzy). In Leptospira interrogans, borgpetersenii and biflexa, this gene is observed first in unidirectional EPS biosynthesis loci as long as 90 genes. MarR genes (pfam01407) are known to bind to DNA regions with palindromic or pseudopalindromic sequences as homodimers, and to bind small molecules as triggers for conformational changes controlling on/off states.	105
200428	TIGR04177	exosort_XrtH	exosortase H, IPTLxxWG-CTERM-specific. This model describes exosortase subfamily H, for which most cognate recognition sequences are found by the IPTLxxWG-CTERM model TIGR04174. Species with this exosortase and multiple (up to 15) copies of the target domain include Acidovorax citrulli, Acidovorax delafieldii 2AN, Delftia acidovorans SPH-1, and gamma proteobacterium NOR5-3. [Protein fate, Protein and peptide secretion and trafficking]	158
275031	TIGR04178	exo_archaeo	exosortase/archaeosortase family protein. This model represents the most conserved region of the multitransmembrane protein family of exosortases and archaeosortases. The region includes nearly invariant motifs at the ends of three predicted transmembrane helices on the extracytoplasmic face: a Cys (often Cys-Xaa-Gly), Asn-Xaa-Xaa-Arg, and His. This model is much broader than the bacterial exosortase model (TIGR02602), and has an intended scope similar to (or broader than) pfam09721.	97
275032	TIGR04179	rhombo_lipo	rhombotail lipoprotein. Members of this protein family are probable lipoproteins. Nearly every member ends with a C-terminal region consisting of a glycine-rich probable cleavage site, a hydrophobic probable transmembrane helix, and a cluster of basic residues, as described in putative protein sorting region model TIGR03501. Furthermore, members tend to be encoded next to a rhomboid family protease, called rhombosortase (TIGR03902) predicted to perform a C-terminal cleavage.	258
275033	TIGR04180	EDH_00030	NAD dependent epimerase/dehydratase, LLPSF_EDH_00030 family. This clade within the NAD dependent epimerase/dehydratase superfamily (pfam01370) is characterized by inclusion of its members within a cassette of seven distinctive enzymes. These include four genes homologous to the elements of the neuraminic (sialic) acid biosynthesis cluster (NeuABCD), an aminotransferase and a nucleotidyltransferase in addition to the epimerase/dehydratase. Together it is very likely that these enzymes direct the biosynthesis of a nine-carbon sugar analogous to CMP-neuraminic acid. These seven genes form the core of the cassette, although they are often accompanied by additional genes that may further modify the product sugar. Although this cassette is widely distributed in bacteria, the family nomenclature arises from the instance in Leptospira interrogans serovar Lai, str. 56601, where it appears as the 30th gene in the 91-gene lipopolysaccharide biosynthesis cluster.	297
275034	TIGR04181	NHT_00031	aminotransferase, LLPSF_NHT_00031 family. This clade of aminotransferases is a member of the pfam01041 (DegT/DnrJ/EryC1/StrS) superfamily. The family is named after the instance in Leptospira interrogans serovar Lai, str. 56601, where it is the 31st gene in the 91-gene lipopolysaccharide biosynthesis locus. Members of this family are generally found within a subcluster of seven or more genes including an epimerase/dehydratase, four genes homologous to the elements of the neuraminic (sialic) acid biosynthesis cluster (NeuABCD) and a nucleotidyl transferase. Together it is very likely that these enzymes direct the biosynthesis of a nine-carbon sugar analogous to CMP-neuraminic acid. These seven genes form the core of the cassette, although they are often accompanied by additional genes that may further modify the product sugar.	359
275035	TIGR04182	glyco_TIGR04182	glycosyltransferase, TIGR04182 family. Members of this family are glycosyltransferases restricted to the archaea. All but two members are from species with the PGF-CTERM/archaeosortase A system, a proposed maturation system for exported, glycosylated proteins as are found often in S-layers.	293
275036	TIGR04183	Por_Secre_tail	Por secretion system C-terminal sorting domain. Species that include Porphyromonas gingivalis, Fibrobacter succinogenes, Flavobacterium johnsoniae, Cytophaga hutchinsonii, Gramella forsetii, Prevotella intermedia, and Salinibacter ruber average twenty or more copies of a C-terminal domain, represented by this model, associated with sorting to the outer membrane and covalent modification.	72
275037	TIGR04184	ATPgraspMvdD	ATP-grasp ribosomal peptide maturase, MvdD family. The pair of ATP-grasp proteins MvdD and MvdC (microviridin D and C), as well as an acetyltransferase, produce microviridin K, an example of a RiPP (ribosomally synthesized and posttranslationally modified peptide). Microviridins are peptidase inhibitors.	321
275038	TIGR04185	ATPgraspMvdC	ATP-grasp ribosomal peptide maturase, MvdC family. The pair of ATP-grasp proteins MvdD and MvdC (microviridin D and C), as well as an acetyltransferase, produce microviridin K, an example of a RiPP (ribosomally synthesized and posttranslationally modified peptide). Microviridins are peptidase inhibitors. This family includes MvdC and corresponding members of similar cassettes.	318
275039	TIGR04186	GRASP_targ	putative ATP-grasp target RiPP. A RiPP is a ribosomally produced, post-translationally modified peptide. This family regularly occurs next to ATP-grasp enzymes related to those of microviridin maturation and next to a methyltransferase.	72
275040	TIGR04187	GRASP_SAV_5884	ATP-grasp ribosomal peptide maturase, SAV_5884 family. Members of this protein family are ATP-grasp ligase family enzymes that regularly occur in a context with a methyltransferase and a putative ribosomally translated post-translationally modified peptide precursor. Because of this conserved gene neighborhood and close sequence similarity to ATP-grasp enzymes from microviridin/marinostatin biosynthesis cassettes, this enzyme is suggested also to serve as a peptide maturase.	312
275041	TIGR04188	methyltr_grsp	methyltransferase, ATP-grasp peptide maturase system. Members of this protein family are predicted SAM-dependent methyltransferases that regularly occur in the context of a putative peptide modification ATP-grasp enzyme (TIGR04187, related to enzymes of microviridin maturation) and a putative ribosomal peptide modification target (TIGR04186).	363
275042	TIGR04189	surface_SprA	cell surface protein SprA. SprA is a cell surface protein widely distributed in the Bacteroidetes lineage. In Flavobacterium johnsoniae, a species that shows gliding motility, mutation disrupts gliding.	2315
211913	TIGR04190	B12_SAM_Ta0216	B12-binding domain/radical SAM domain protein, Ta0216 family. Members of this family are enzymes with an N-terminal B12-binding domain and central radical SAM domain. Families TIGR03975, TIGR04013 and TIGR04014 exhibit a similar architecture, which may be associated with lipid metabolism.	553
275043	TIGR04191	YphP_YqiW	putative bacilliredoxin, YphP/YqiW family. This protein family is one of several observed in species that express bacillithiol, an analog of glutathione and mycothiol. Rather than being involved in bacillithiol biosynthesis, members are likely to act in bacillithiol-dependent processes. A suggested term is bacilliredoxin (a glutaredoxin-like thiol-dependent oxidoreductase), and a suggested role of YphP is de-bacillithiolation - removing bacillithiol that became linked to protein thiols under oxidative stress. An older description of YphP as a disulphide isomerase therefore may be wrong.	136
275044	TIGR04192	GRASP_w_spasm	ATP-GRASP peptide maturase, grasp-with-spasm system. Members of this protein family are ATP-GRASP proteins that occur in a peptide maturation cassette with a SPASM domain protein. SPASM (TIGR04085) usually occurs as a C-terminal extension to radical SAM enzymes that act as peptide maturases, although it can occur independently.	318
211916	TIGR04193	SPASM_w_grasp	SPASM domain peptide maturase, grasp-with-spasm system. A 4Fe-4S-binding C-terminal domain is shared by radical SAM maturases for Subtilosin A (S), PQQ (P), Anaerobic sulfatases (AS), and mycofactocin (M), hence SPASM. Radical SAM proteins with SPASM tend to be peptide maturases. All members of this family, like some members of the quasi-rSAM family TIGR04105, lack the 4Fe-4S cluster of the radical SAM domain (pfam04055) in the N-terminal region. Members of this family occur with an ATP-GRASP family protein, known as a possible maturase from microviridin biosynthetic clusters. Systems occur in Microscilla marina ATCC 23134, Kordia algicida OT-1, Sphingobacterium spiritivorum ATCC 33300, etc.	342
211917	TIGR04194	grasp_w_spasm_A	grasp-with-spasm leader A domain. This model describes the leader peptide domain, ending in a Gly-Gly cleavage motif, for a post-ribosomal natural product (PRNP) precursor. The corresponding modification enzymes include an ATP-GRASP enzyme and a SPASM-domain protein, related to the C-terminal region of numerous peptide-modification radical SAM enzymes.	28
211918	TIGR04195	S_glycosyl_SunS	peptide S-glycosyltransferase, SunS family. Members of this family include SunS, the S-glycosyltransferase that transfers a sugar (substrate is variable in reconstitution assays) onto the precursor of the glycopeptide sublancin, which once was thought to be a lantibiotic.	422
211919	TIGR04196	glycopep_SunS	glycopeptide, sublancin family. Members of this family, including sublancin, are post-ribosomal natural products (PRNP) with an S-linked glycosylation. Sublancin itself also has two disulfide bonds. A related gene cluster in Bacillus cereus E33L includes the four Cys involved in the disulfide cluster but lacks the region with the glycosylated Cys, and has been excluded.	80
275045	TIGR04197	T7SS_SACOL2603	type VII secretion effector, SACOL2603 family. Members of this protein family are similar in length and sequence (although remotely) to the WXG100 family of type VII secretion system (T7SS) targets, described by family TIGR03930. Phylogenetic profiling shows that members of this family are similarly restricted to species with T7SS, marking this family as a related set of T7SS effectors. Members include SACOL2603 from Staphylococcus aureus subsp. aureus COL. Oddly, members of family pfam10824 (DUF2580), which appears also to be related, seem not to be tied to T7SS.	85
275046	TIGR04198	paramyx_RNAcap	mRNA capping enzyme, paramyxovirus family. This model represents a common C-terminal region shared by paramyxovirus-like RNA-dependent RNA polymerases (see pfam00946). Polymerase proteins described by these two models are often called L protein (large polymerase protein). Capping of mRNA requires RNA triphosphatase and guanylyl transferase activities, demonstrated for the rinderpest virus L protein and at least partially localized to the region of this model.	893
275047	TIGR04199	exosort_xrtJ	exosortase J. Exosortase J occurs as a three-member paralogous family in Acidobacterium sp. MP5ACTX8. It contains an N-terminal exosortase/archaeosortase domain and a novel C-terminal domain comprising about half of total protein length. The presumptive target, found as an adjacent gene for two of the three paralogs, consists of a possible lipoprotein signal peptide followed almost immediately by a C-terminal region with some PEP-CTERM-like characteristics.	522
211923	TIGR04200	targ_of_XrtJ	XrtJ-associated TM-motif-TM protein. This model represents essentially the full length, ~60 residues, of a two-gene paralogous family from Acidobacterium sp. MP5ACTX8. Sequences consist of an N-terminal signal sequence ending in a GC motif, suggestive of the lipoprotein signal sequence, followed immediately by a C-terminal domain sequence with characteristics of PEP-CTERM-like sequences, including a PExP motif and a transmembrane helix. Both members occur next to the novel exosortase variant, XrtJ, which contains a novel C-terminal domain.	62
275048	TIGR04201	Myxo_Cys_RPT	Cys-rich repeat, Myxococcales-type. This repeat is restricted to the Myxococcales, a division of the deltaproteobacteria. It occurs in several surface proteins, and may form a stalk region. The repeat averages about 21 amino acids in length with four or five Cys, three of which are nearly invariant.	22
275049	TIGR04202	capSnatchArena	RNA endonuclease, cap-snatching, arenavirus family. This model describes a shared signature region from an RNA endonuclease region associated with cap-snatching for mRNA production by RNA viruses. This domain usually is part of a multifunctional protein, the L protein responsible for RNA-dependent RNA polymerase activity. Cap-snatching is a viral alternative to synthesizing a eukaryotic-like mRNA cap itself.	61
275050	TIGR04203	RPT_S_cricet	Streptococcal surface-anchored protein repeat, S. criceti family. This model describes a repeat sequence that occurs primarily in LPXTG-anchored Streptococcus surface proteins, although it does occur elsewhere. It can comprise a major fraction of the length of repeat proteins that exceed 2000 in length.	38
275051	TIGR04204	MAST_ArtA_sort	MAST domain. This model describes a domain (or in most cases the full length) of archaeal surface proteins that are putative targets for C-terminal processing by archaeosortase A (TIGR04125). Most members of this family belong to proteins encoded by tandem genes in the genus Methanosarcina. The putative processing signal, PGF-CTERM (TIGR04126), included within the domain definition, takes a variant form, with consensus motif PAF instead of PGF. We suggest the name MAST domain: Methanosarcina Archaeosortase-Sorted Tandem gene family domain.	182
275052	TIGR04205	classIII_w_PIP	class III signal peptide protein, archaeosortase D/PIP-CTERM system. Members of this protein family are short proteins that consist largely of the archaeal class III signal peptide (see pfam04021). Members are encoded in a gene cassette between archaeosortase D (TIGR04175) and its PIP-CTERM target protein (TIGR04173).	67
275053	TIGR04206	near_ArtA	TIGR04206 family protein. Members of this integral membrane protein family are found exclusively in halophilic archaea. In at least three species (Haloarcula marismortui, Haloquadratum walsbyi, and Haloferax volcanii), members are found in the gene neighborhood of archaeosortase A, suggesting a role in protein sorting.	139
275054	TIGR04207	halo_sig_pep	surface glycoprotein signal peptide. This N-terminal homology domain appears to be a specialized class of signal peptide. It occurs mostly in the halophilic archaea, primarily on proteins with the C-terminal PGF-CTERM domain, including the S-layer-forming major surface glycoprotein of several species. The PGF-CTERM domain is the putative archaeosortase A recognition sequence. However, this N-terminal domain occurs also in several archaeal proteins that lack PGF-CTERM, and occurs in bacteria on a protein from Clostridium leptum DSM 753.	30
275055	TIGR04209	sarcinarray	sarcinarray family protein. Members of this protein family are exclusive to archaea, probably all of which have S-layer surface protein arrays. All member proteins have an N-terminal signal sequence. The majority of known members belong to codirectional tandem arrays in the genus Methanosarcina (nine in M. barkeri str. Fusaro). Nearly all members have an additional 50 residues, (trimmed from the seed alignment for this model), consisting of low-complexity sequence rich in E,N,Q,T,S, and P, followed by a variant (PAF) form of the PGF-CTERM putative archaeal surface glycoprotein sorting signal. The coined name, sarcinarray family protein, evokes the predicted archaeal surface layer localization, the taxonomic bias of known members, and the tandem organization of most members.	144
211933	TIGR04210	bunya_NSm	bunyavirus nonstructural protein NSm. This model describes a protein region that is cleaved from a bunyavirus polyprotein to become the nonstructural protein NSm (encoded by the M segment). It is flanked by glycoprotein GP2 and glycoprotein GP1.	173
275056	TIGR04211	SH3_and_anchor	SH3 domain protein. Members of this protein family have a signal peptide, a strongly conserved SH3 domain, a variable region, and then a C-terminal hydrophobic transmembrane alpha helix region.	198
275057	TIGR04212	GlyGly_RbtA	Acinetobacter rhombotarget A. Members of this protein family are found, so far, exclusively in the genus Acinetobacter. Members average just over 600 amino acids in length, including a 22-amino acid C-terminal putative protein sorting recognition sequence, GlyGly-CTERM (TIGR03501). The GlyGly-CTERM signal always co-occurs with a subfamily of the rhomboid family intramembrane serine proteases called rhombosortase (TIGR03902). Members occur paired with a second rhombosortase target, with which it also shares an N-terminal motif CSLREA. This protein is designated Acinetobacter rhombotarget A (rbtA).	605
275058	TIGR04213	PGF_pre_PGF	PGF-pre-PGF domain. This domain occurs in archaeal species. Most domains in this family end with a motif PGF, after which the member sequences change in character to low-complexity sequence (usually Thr-rich) for about 40 residues. The low complexity region usually is followed by a PGF-CTERM domain (TIGR04126), which we suggest is the recognition sequence for archaeosortase A (TIGR04125), a putative protein-sorting transpeptidase. The similarity between the PGF motif in this domain and in the PGF-CTERM domain is highly suggestive.	153
211937	TIGR04214	CSLREA_Nterm	CSLREA domain. This model describes an N-terminal region, with a motif CSLREA, shared by tandem genes in Acinetobacter that both have the GlyGly-CTERM putative protein-sorting domain. Many proteins with this domain are putative outer membrane proteins (OMPs) with predicted beta strand-forming repeats.	27
275059	TIGR04215	choice_anch_A	choice-of-anchor A domain. This domain may occur as essentially the full length of a protein, except for an N-terminal sequence and a C-terminal protein-sorting signal such as PEP-CTERM or LPXTG. Most often, the putative surface protein is longer and contains repetitive sequence regions. This is one of very few domains for which both anchoring domains occur, and designated choice-of-anchor A domain. The best characterized member is Bacillus anthracis protein BA0871, a collagen-binding protein with five CNA-family protein B-type repeats toward the C-terminus and an LPXTG cell wall attachment motif.	249
275060	TIGR04216	halo_surf_glyco	major cell surface glycoprotein. Members of this family are the S-layer-forming halobacterial major cell surface glycoprotein. The highest scores below model cutoffs are fragmentary paralogs to actual members of the family. Modifications include N-linked and O-linked glycosylation, a C-terminal diphytanylglyceryl modification, and probable cleavage of the PGF-CTERM tail.	763
211940	TIGR04217	archae_ser_T	archaetidylserine synthase. The activity CDP-2,3-di-O-geranylgeranyl-sn-glycerol:L-serine O-archaetidyltransferase (archaetidylserine synthase) was demonstrated experimentally in Methanothermobacter thermautotrophicus. Members represent an exception within the broader family (TIGR00473) of CDP-diacylglycerol-serine O-phosphatidyltransferases.	221
211941	TIGR04218	TOMM_plantaz	ribosomal natural product, plantazolicin-class. Members of this protein family are precursors of TOMMs, that is, thiazole/oxazole-modified microcins. Members are about 42 residues in length, have a C-terminal region of extremely low complexity rich in Ser, and are often missed by ab initio gene callers. The plantazolicin from Bacillus amyloliquefaciens FZB42 is a peptide antibiotic effective against Bacillus anthracis.	41
275061	TIGR04219	OMP_w_GlyGly	outer membrane protein. Members of this protein family are outer membrane proteins (OMP), as can be seen by their homology to YfaZ protein (see ) and by the OMP targeting region at the C-terminus, including a C-terminal Phe residue. Members of this protein family are found in the great majority of genomes with the GlyGly-CTERM protein sorting signal and the rhombosortase putative sorting enzyme, although the relationship may be fortuitous.	233
211943	TIGR04220	patB_acyB_mcaB	cyanobactin biosynthesis protein, PatB/AcyB/McaB family. Members of this protein family are small (~ 80 amino acids) and occur in biosynthesis clusters for cyanobactins, a type of ribosomal natural product, thiazole/oxazole-modified microcin (TOMM). The function of this protein family is unknown, and the recognized cyanobactin precursors (e.g. microcyclamides and patellamides) are encoded by a different protein (see TIGR03678). In this protein family, however, a core region of about 62 amino acids (modeled) is followed by a hypervariable region of 5 to 23 amino acids, with hallmarks of possible cyclodehydratase modification sites. The hallmarks include Cys residues flanked by Gly, and variable length Ser-rich tripeptide repeats. Further, members of this family were shown dispensable for patellamide biosynthesis, and two may occur in a cluster. Therefore, this family may represent a precursor of another type of ribosomal natural product.	61
275062	TIGR04221	SecA2_Mycobac	accessory Sec system translocase SecA2, Actinobacterial type. Members of this family are the SecA2 subunit of the Mycobacterial type of accessory secretory system. This family is quite different from SecA2 of the Staph/Strep type (TIGR03714).	762
275063	TIGR04222	near_uncomplex	TIGR04222 domain. The majority of the proteins with a domain as described by this model have an extreme C-terminal sequence that consists of extremely low-complexity sequence, rich in Ser or in Gly interspersed with Cys. That C-terminal region resembles ribosomal natural product precursors, although there is no evidence that C-terminal regions of these proteins undergo any modification or have any such function.	227
275064	TIGR04223	quorum_AgrD	cyclic lactone autoinducer peptide. Members of this family of short peptides are precursors to thiolactone (unless Cys is replaced by Ser) cyclic autoinducer peptides, used in quorum-sensing systems in Gram-positive bacteria. The best characterized is the AgrD precursor, processed by the AgrB protein. Nearby proteins regularly encountered include a histidine kinase and a response regulator. This model is related to pfam05931 but is newer and currently broader in scope.	37
275065	TIGR04224	ser_adhes_Nterm	serine-rich repeat adhesion glycoprotein AST domain. This model describes a definitive conserved N-terminal domain shared by Streptococcal serine-rich adhesion glycoproteins. These highly repetitive proteins may exceed 4000 amino acids in length, consisting largely of long regions in which every second amino acid is Ser. Members of this family, if sequenced completely and assigned the correct start site, begin with a KxYKxGKxW motif region (see TIGR03715) and end with an LPXTG motif region (see TIGR01167). Members are exported by the accessory secretory system (SecA2 and SecY2). They are highly variable among the Streptococci and may help determine host ranges for pathogenesis.	50
275066	TIGR04225	CshA_fibril_rpt	CshA-type fibril repeat. Many proteins with this repeat are LPXTG-anchored surface proteins of Firmicutes species, but the repeat occurs more broadly. Members include CshA from Streptococcus gordonii.	103
275067	TIGR04226	RrgB_K2N_iso_D2	fimbrial isopeptide formation D2 domain. The Streptococcus pneumoniae pilus backbone protein, RrgB, has three tandem domains with Lys-to-Asn isopeptide bonds, but these three regions are extremely divergent in sequence. This model represents the homology domain family of the D2 domain. It occurs just once in many surface proteins but up to twenty times in some pilin subunit proteins. Three of every four members have the typical Gram-positive C-terminal motif, LPXTG, although in many cases this motif may be involved in pilin subunit cross-linking rather than cell wall attachment. Proteins with this domain include fimbrial proteins with lectin-like adhesion functions, and the majority of characterized members are involved in surface adhesion to host structures.	124
211950	TIGR04227	zmp_18_rpt	zinc metalloproteinase 18-residue repeat. This model describes a short (18-amino acid) tandem repeat that occurs variable numbers of times in zinc metalloproteinase C (zmpC) homologs in various species of Streptococcus. This repeat occurs, oddly, as an interruption in a region of tandem repeats of another type.	18
275068	TIGR04228	isopep_sspB_C2	adhesin isopeptide-forming domain, sspB-C2 type. This domain has a conserved Lys (position 3 in seed alignment) and Asn at 177 that form an intramolecular isopeptide bond. The Asp (or Glu) at position 59	173
275069	TIGR04229	geopeptide	putative radical SAM-modified peptide. This family of short peptides occurs near radical SAM/SPASM domain proteins and is proposed to be modified by that enzyme.	23
275070	TIGR04230	seadorna_VP11	seadornavirus VP11 protein. This protein family occurs in the seadornavirus virus group, with designations VP11 in Banna virus, and VP12 in Kadipiro virus and Liao ning virus. The function has not been assigned.	175
275071	TIGR04231	seadorna_VP5	seadornavirus VP5 protein. This protein family occurs in the seadornavirus virus group, with designations VP5 in Banna virus, and VP6 in Kadipiro virus and Liao ning virus. The function is unassigned.	505
211955	TIGR04232	seadorna_VP3	seadornavirus VP3 protein. Members of this protein family are VP3 proteins in the seadornavirus group. Sequences show sequence similarity to methyltransferases.	731
211956	TIGR04233	seadorna_VP8	seadornavirus VP8 protein. This protein family occurs in the seadornavirus virus group, with designations VP8 in Banna virus, and VP9 in Kadipiro virus and Liao ning virus. The function has not been assigned.	291
275072	TIGR04234	seadorna_RNAP	seadornavirus RNA-directed RNA polymerase. Members of this protein family are the seadornavirus VP1 protein, the RNA-directed RNA polymerase.	1144
211958	TIGR04235	seadorna_VP4	seadornavirus VP4 protein. This protein family occurs in the seadornavirus virus group, with designation VP4 in Banna virus, Kadipiro virus, and Liao ning virus. Although this family has been suggested to resemble methyltransferases, members show apparent N-terminal sequence similarity to the outer capsid protein VP5 of the orbivirus group, such as bluetongue virus, which also belong to the Reoviridae.	618
275073	TIGR04236	seadorna_VP2	seadornavirus VP2 protein. This protein family occurs in the seadornavirus virus group, with the designation VP2 in Banna virus, Kadipiro virus, and Liao ning virus.	953
211960	TIGR04237	seadorna_VP9	seadornavirus/coltivirus VP9 protein. This model, broader than related pfam08978, describes proteins VP9 in Coltivirus, and proteins with various designations in the seadornavirus group: VP9 in Banna virus, VP10 in Liao ning virus, and VP11 in Kadipiro virus.	280
275074	TIGR04238	seadorna_dsRNA	seadornavirus double-stranded RNA-binding protein. This protein family occurs in the seadornavirus virus group, with an N-terminal domain for binding double-stranded RNA, and is designated VP12 in Banna virus, VP8 in Kadipiro virus, and VP11 in Liao ning virus.	201
275075	TIGR04239	rhombo_GlpG	rhomboid family protease GlpG. GlpG in E. coli is a rhomboid family intramembrane serine protease that has been extensively characterized as a proxy for rhomboid family proteases in animals. It efficiently cleaves eukaryote-derived model substrates. This multiple membrane-spanning protein excludes inappropriate substrates from access to its cleavage site, and shows activity against truncated versions, but not full-length versions, of the E. coli multidrug transporter MdfA. This finding suggests a housekeeping function in removing faulty proteins. In contrast, several eukaryotic rhomboid family proteases release peptide hormones for signaling functions, and the Shewanella and Vibrio protein rhombosortase appears to be part of a protein-sorting system, cleaving a C-terminal anchoring helix domain.	270
213897	TIGR04240	flavi_E_stem	flavivirus envelope glycoprotein E, stem/anchor domain. This model describes the C-terminal domain, containing a stem region followed by two transmembrane anchor domains, of the envelope protein E. This protein is cleaved from the large flavivirus polyprotein, which yields three structural and seven nonstructural proteins.	97
211964	TIGR04241	adenoE3CR1rpt	mastadenovirus E3 CR1-alpha-1. This domain occurs only in the adenovirus E3 region CR1-alpha-1 protein. It may occur once, twice, or three times.	81
275076	TIGR04242	nodulat_NodC	chitooligosaccharide synthase NodC. Members of this family are NodC, an N-acetylglucosaminyltransferase involved in the production of nodulation factors through which rhizobia establish symbioses with leguminous plants.	395
211966	TIGR04243	nodulat_NodB	chitooligosaccharide deacetylase NodB. Nodulation factors are lipooligosaccharide signalling molecules produced by rhizobia, the symbiotic nitrogen-fixing bacteria that form nodules in plants. These Nod factor systems have the NodABC genes in common but differ subtly in what they produce, which affects host range. NodB is a chitooligosaccharide deacetylase.	197
275077	TIGR04244	nitrous_NosZ_RR	nitrous-oxide reductase, TAT-dependent. Members of this family are the nitrous-oxide reductase structural protein, NosZ, with an N-terminal twin-arginine translocation (TAT) signal sequence (see TIGR01409). The TAT system replaces the Sec system for export of proteins with bound cofactor.	627
211968	TIGR04245	nodulat_NodA	N-acyltransferase NodA. Nodulation factors are lipo-chitooligosaccharides made by nitrogen-fixing bacteria as a signal to plant hosts. Nod factors differ slightly from system to system and serve as host range determinants. Because the N-acyl group varies from one NodA to another, the family is treated as a subfamily, but all members of this family belong to NodABC systems.	193
275078	TIGR04246	nitrous_NosZ_Gp	nitrous-oxide reductase, Sec-dependent. This model represents the nitrous-oxide reductase protein NosZ as characterized in Geobacillus thermodenitrificans. In contrast to the related form in Pseudomonas stutzeri, this version lacks a recognizable twin-arginine translocation (TAT) signal at the N-terminus. Consequently, its accessory protein may differ. Some members of this family have an additional cytochrome c-like domain at the C-terminus.	578
275079	TIGR04247	NosD_copper_fam	nitrous oxide reductase family maturation protein NosD. Members of this family include NosD, a repetitive periplasmic protein required for the maturation of the copper-containing enzyme nitrous-oxide reductase. NosD appears to be part of a complex with NosF (an ABC transporter family ATP-binding protein) and NosY (a six-helix transmembrane protein in the ABC-2 permease family). However, NosDFY-like complexes appear to occur also in species whose copper requiring enzymes are something other than nitrous-oxide reductase.	377
211971	TIGR04248	SCM_PqqD_rel	SynChlorMet cassette protein ScmD. A biosynthesis cassette found in Syntrophobacter fumaroxidans MPOB, Chlorobium limicola DSM 245, Methanocella paludicola SANAE, and delta proteobacterium NaphS2 contains two PqqE-like radical SAM/SPASM domain proteins, a PqqD homolog, and a conserved hypothetical protein. These components suggest modification of a ribosomally produced peptide precursor, but the precursor has not been identified. Members of this family are the PqqD-like protein.	84
275080	TIGR04249	SCM_chp_ScmC	SynChlorMet cassette protein ScmC. A biosynthesis cassette found in Syntrophobacter fumaroxidans MPOB, Chlorobium limicola DSM 245, Methanocella paludicola SANAE, and delta proteobacterium NaphS2 contains two PqqE-like radical SAM/SPASM domain proteins, a PqqD homolog, and a conserved hypothetical protein. These components suggest modification of a ribosomally produced peptide precursor, but the precursor has not been identified. Members of this family are designated ScmC.	292
211973	TIGR04250	SCM_rSAM_ScmE	SynChlorMet cassette radical SAM/SPASM protein ScmE. A biosynthesis cassette found in Syntrophobacter fumaroxidans MPOB, Chlorobium limicola DSM 245, Methanocella paludicola SANAE, and delta proteobacterium NaphS2 contains two PqqE-like radical SAM/SPASM domain proteins, a PqqD homolog, and a conserved hypothetical protein. These components suggest modification of a ribosomally produced peptide precursor, but the precursor has not been identified. Of the two PqqE homologs of the cassette, this family is the closer in sequence.	358
211974	TIGR04251	SCM_rSAM_ScmF	SynChlorMet cassette radical SAM/SPASM protein ScmF. A biosynthesis cassette found in Syntrophobacter fumaroxidans MPOB, Chlorobium limicola DSM 245, Methanocella paludicola SANAE, and delta proteobacterium NaphS2 contains two PqqE-like radical SAM/SPASM domain proteins, a PqqD homolog, and a conserved hypothetical protein. These components suggest modification of a ribosomally produced peptide precursor, but the precursor has not been identified. Of the two PqqE homologs of the cassette, this family is the more distant in sequence.	353
211975	TIGR04252	SCM_precur_ScmA	SynChlorMet cassette protein ScmA. A biosynthesis cassette found in Syntrophobacter fumaroxidans MPOB, Chlorobium limicola DSM 245, Methanocella paludicola SANAE, and delta proteobacterium NaphS2 contains two PqqE-like radical SAM/SPASM domain proteins, a PqqD homolog, and a conserved hypothetical protein. This model identifies a conserved open reading frame that was identified as a predicted gene in only one of those species (Chlorobium), but that may represent the ribosomally produced peptide precursor of the system. As with most other radical SAM enzyme-modified ribosomal natural products, these polypeptides are Cys-rich in the C-terminal half.	49
211976	TIGR04253	mesacon_CoA_iso	mesaconyl-CoA isomerase. Members of this protein family belong by homology to the family of CoA transferases. However, the characterized member from Chloroflexus aurantiacus appears to perform an intramolecular transfer, making it an isomerase. The enzyme converts mesaconyl-C1-CoA to mesaconyl-C4-CoA as part of the bicyclic 3-hydroxypropionate pathway for carbon fixation.	403
275081	TIGR04254	OpituPEPCTERM_1	putative globular PEP-CTERM protein. Representatives of this family include a 13-member paralogous family of proteins about 215 amino acids in length from the termite gut bacterium Opitutaceae bacterium TAV2, a member of the Verrucomicrobia. The signal peptide (N-terminal) and PEP-CTERM putative protein sorting signal (C-terminal) are not included in the seed alignment. Conserved residues such as an invariant Arg and a lack of conspicuous low-complexity sequence suggest a globular structure and possible enzymatic activity. Members average about thirty percent sequence identity overall, but over seventy percent in the PEP-CTERM region. The function of this family is unknown.	136
275082	TIGR04255	sporadTIGR04255	TIGR04255 family protein. Members of this uncharacterized protein family are found broadly but sporadically among bacteria and archaea, including members of the genera Mycobacterium, Nostoc, Acinetobacter, Planctomyces, Geobacter, Streptomyces, Methanospirillum, etc. The function is unknown.	249
275083	TIGR04256	GxxExxY	GxxExxY protein. Members of this protein family average about 130 residues in length and include an almost perfectly conserved motif GxxExxY. Members occur in a wide range of prokaryotes, including Proteobacteria, Verrucomicrobia, Cyanobacteria, Bacteroidetes, Archaea, etc.	116
275084	TIGR04257	nanowire_3heme	c(7)-type cytochrome triheme domain. This domain binds three hemes, and itself occurs as a repeating unit. It occurs, for instance, four times in the dodecaheme c-type cytochrome protein GSU_1996, whose crystal structure shows elongation and a nanowire-like arrangement of twelve hemes that could function in extracellular electron transport processes.	75
275085	TIGR04258	4helix_suffix	four helix bundle suffix domain. This domain occurs as a suffix domain to some members of the much broader protein family TIGR02436, a few of whose other members are encoded within intervening sequences of bacterial 23S ribosomal RNA. Some proteins with this domain, in turn, are followed by a predicted DNA topoisomerase type C4 zinc finger.	49
275086	TIGR04259	oxa_formateAnti	oxalate/formate antiporter. This model represents a subgroup of the more broadly defined model TIGR00890, which in turn belongs to the Major Facilitator transporter family. Seed members for this family include the known oxalate/formate antiporter of Oxalobacter formigenes, as well as transporter subunits co-clustered with the two genes of a system that decarboxylates oxalate into formate. In many of these cassettes, two subunits are found rather than one, suggesting the antiporter is sometimes homodimeric, sometimes heterodimeric.	405
275087	TIGR04260	Cyano_gly_rpt	rSAM-associated Gly-rich repeat protein. Members of this protein family average 125 in length, roughly half of which is the repetitive and extremely Gly-rich C-terminal region. Virtually all members occur in the Cyanobacteria, in a neighborhood that includes a radical SAM/SPASM domain, often a marker of peptide modification systems.	119
211984	TIGR04261	rSAM_GlyRichRpt	radical SAM/SPASM domain protein, GRRM system. Members of this protein family are radical SAM/SPASM domain proteins (see pfam04055 and TIGR04085) related to anaerobic sulfatase maturating enzymes and the peptide modification enzyme PqqE. Members are found primarily in Cyanobacteria adjacent to a short protein, ~150 residues, in which the last ~60 residues tend to be repetitive and highly glycine-rich (see TIGR04260). The arrangement suggests modifications to the repetitive C-terminal region by this radical SAM domain enzyme, but the purpose of this system on the whole is unknown.	363
275088	TIGR04262	orph_peri_GRRM	extracellular substrate-binding orphan protein, GRRM family. This subfamily belongs to bacterial extracellular solute-binding protein family 3 (pfam00497). In that family, most members are ABC transporter periplasmic substrate-binding proteins. However, members of the present subfamily are orphans in the sense of being adjacent to neither ABC transporter ATP-binding proteins nor permease subunits. Instead, most members are encoded next to the two signature proteins of the proposed Glycine-Rich Repeat Modification (GRRM) system, a radical SAM/SPASM protein GrrM (TIGR04261) and the Gly-rich repeat protein itself GrrA (TIGR04260).	257
275089	TIGR04263	SasC_Mrp_aggreg	SasC/Mrp/FmtB intercellular aggregation domain. This domain, about 375 amino acids long on average, occurs only in Staphylococcus and Streptococcus. It occurs as a non-repetitive N-terminal domain of LPXTG-anchored surface proteins, including SasC, Mrp, and FmtB. This region in SasC was shown to be involved in cell aggregation and biofilm formation, which may explain the methicillin resistance seen for Mrp and FmtB.	366
275090	TIGR04264	hyperosmo_Ebh	hyperosmolarity resistance protein Ebh, N-terminal domain. Staphylococcal protein Ebh (extracellular matrix-binding protein homolog) is a giant protein, sometimes over 10,000 amino acids long as reported. This model describes a non-repetitive amino-terminal domain of about 2400 amino acids.	2354
211988	TIGR04265	bac_cardiolipin	cardiolipin synthase. This model is based on experimentally characterized bacterial cardiolipin synthases (cls) from E. coli, Staphylococcus aureus (two), and Bacillus pseudofirmus OF4. This model describes just one of several homologous but non-orthologous forms of cls. The cutoff score is set arbitrarily high to avoid false-positives. Note that there are two enzymatic activities called cardiolipin synthase. This model represents type 1, which does not rely on a CDP-linked donor, but instead does a reversible transfer of a phosphatidyl group from one phosphatidylglycerol molecule to another.	483
211989	TIGR04266	NDMA_methanol	NDMA-dependent methanol dehydrogenase. Members of this family belong to the iron-dependent alcohol dehydrogenase family (see pfam00465). The NADP(H) cofactor is bound too tightly for exchange (although non-covalently), so enzymatic activity depends on a second substrate or electron carrier. The radical SAM-modified natural product mycofactocin is proposed to fill this role. In Rhodococcus erythropolis N9T-4, a role was shown for this protein in CO2 fixation during extreme oligotrophic (or possibly chemoautotrophic) growth.	420
275091	TIGR04267	mod_HExxH	HEXXH motif domain. Some proteins with this domain toward the C-terminus have an N-terminal region with a radical SAM domain (pfam04055) and a SPASM domain (TIGR04085), a combination frequently associated with peptide modification. All seed alignment members, and all family members that are not fused to a radical SAM domain, have a motif HEXXH that suggests metalloprotease activity. A role in peptide or protein maturation is suggested.	399
275092	TIGR04268	FxSxx-COOH	FXSXX-COOH protein. Members of this family are very short (~60 residue) polypeptides, among which the fifth and third to last residues are nearly always Phe and Ser, respectively. Because members occur in a conserved context with a putative peptide-modifying radical SAM/SPASM domain protein, we suggest that members of this family may be the modification target. The gene symbol fxsA reflects both the FXA motif and the proposed role as a ribosomal natural product.	44
275093	TIGR04269	SAM_SPASM_FxsB	radical SAM/SPASM domain protein, FxsB family. This model describes a radical SAM (pfam04055)/SPASM domain (TIGR04085) fusion subfamily distinct from PqqE, MftC, anaerobic sulfatase maturases, and other peptide maturases. The combined region described in this model can itself be fused to another domain, such as TIGR04267, or stand alone. Members occurring in the same cassette as a member of family TIGR04268 should be designated FxsB.	363
211993	TIGR04270	Rama_corrin_act	methylamine methyltransferase corrinoid protein reductive activase. Members of this family occur as paralogs in species capable of generating methane from mono-, di-, and tri-methylamine. Members include RamA (Reductive Activation of Methyltransfer, Amines) from Methanosarcina barkeri MS (DSM 800). Member proteins have two C-terminal motifs with four Cys each, likely to bind one 4Fe-4S cluster per motif.	535
275094	TIGR04271	ThiI_C_thiazole	thiazole biosynthesis domain. The ThiI protein of Escherichia coli is a bifunctional protein in which most of the length of the protein is responsible for sulfurtransferase activity in 4-thiouridine modification to tRNA (EC 2.8.1.4 - see model TIGR00342). This rhodanese-like C-terminal domain, by itself, is able to synthesize the thiazole moiety during thiamin biosynthesis. Note that the invariant Cys residue in this domain is unusual in being required for both activities of the bifunctional ThiI protein.	101
275095	TIGR04272	cxxc_cxxc_Mbark	CxxC-x17-CxxC domain. This domain, with a pair of CXXC motifs separated by 17 amino acids, is a candidate zinc finger domain based on these motifs. Some proteins have two copies of the domain, while others are fused to another probable zinc-binding domain, described by pfam13451.	37
275096	TIGR04273	Y_sulf_Ax21	sulfation-dependent quorum factor, Ax21 family. This family consists of proteins closely related to Ax21 (Activator of XA21-mediated immunity), a protein that is secreted by a type I secretion system (RaxABC), and that appears to be sulfated on an N-terminal region tyrosine in a motif LSYN. Ax21 acts in a quorum-sensing system. Homologous peptide-mediated quorum-sensing systems appear to exist in other species, such as the emerging opportunistic pathogen Stenotrophomonas maltophilia. Intriguingly, the rice genome encodes a receptor (XA21) for this protein that triggers innate immunity. [Cellular processes, Pathogenesis]	186
211997	TIGR04274	hypoxanDNAglyco	hypoxanthine-DNA glycosylase. Members of this protein family represent family 6 of the uracil-DNA glycosylase superfamily, where the five previously described families all act as uracil-DNA glycosylase (EC 3.2.2.27) per se. This family, instead, acts as a hypoxanthine-DNA glycosylase, where hypoxanthine results from deamination of adenine. Activity was shown directly for members from Methanosarcina barkeri and Methanosarcina acetivorans.	150
275097	TIGR04275	beta_prop_Msarc	beta propeller repeat, Methanosarcina surface protein type. This model describes a repeat region found mostly in cell surface proteins of various methanogens. Methanosarcina barkeri, for example, has twenty such proteins, often with either seven or fourteen repeats. These repeats resemble the beta propeller repeats of the TolB periplasmic protein of Gram-negative bacteria, part of a complex associated with various functions including biopolymer transport (see TIGR02800).	40
275098	TIGR04276	FxsC_Cterm	FxsC C-terminal domain. This model describes a sequence region found regularly as the C-terminal domain of a protein (where the N-terminal domain resembles a TIR domain - see pfam13676) in the vicinity of a proposed peptide-modifying radical SAM/SPASM domain protein, FxsB (TIGR04269).	196
212000	TIGR04277	squa_tetra_cyc	squalene--tetrahymanol cyclase. This enzyme, also called squalene--tetrahymanol cyclase, occurs in a small number of eukaryotes, some of them anaerobic. The pathway can occur under anaerobic conditions, and the product is thought to replace sterols, letting organisms with this compound build membrane suitable for performing phagocytosis.	624
212001	TIGR04278	viperin	antiviral radical SAM protein viperin. Viperin (Virus Inhibitory Protein, ER-associated, Interferon-inducible) is a radical SAM enzyme found in human and other vertebrates. It is both induced by interferon and demonstrably active in blocking replication by several types of virus, apparently by modifying lipid chemistries in lipid droplets and membrane rafts.	347
275099	TIGR04279	TIGR04279	TIGR04279 methanogen extracellular domain. This domain, with length just over 300 amino acids, occurs in predicted extracellular proteins in a number of methanogens, in one to three proteins per genome. The aromatic residue tyrosine, comprising about five percent of the amino acid composition, is overrepresented among the most highly conserved columns of the multiple sequence alignment. The three members of this family in Methanosarcina barkeri occur all within a six-gene region.	316
275100	TIGR04280	geopep_mat_rSAM	putative geopeptide radical SAM maturase. This family is the radical SAM/SPASM domain putative peptide maturase for geopeptide, described by model TIGR04229. The SPASM domain (see model TIGR04085) frequently marks peptide-modifying radical SAM enzymes.	428
275101	TIGR04281	peripla_PGF_1	putative ABC transporter PGF-CTERM-modified substrate-binding protein. Members of this archaeal protein family resemble periplasmic substrate-binding proteins of ABC transporters and appear in gene neighborhoods with permease and ATP-binding cassette proteins. Notably, essentially all members also have the PGF-CTERM putative protein-sorting domain at the C-terminus, while more distant homologs (excluded by the trusted cutoff) instead have what appear to be lipoprotein signal peptides at the N-terminus.	330
275102	TIGR04282	glyco_like_cofC	transferase 1, rSAM/selenodomain-associated. Members of this protein family show strongly correlated phylogenetic distribution, and in most cases co-clustering, with an unusual radical SAM enzyme (TIGR04167) whose C-terminal pfam12345 domain often contains a selenocysteine residue. Other members of the conserved gene neighborhood include another putative glycosyltransferase, an alkylhydroperoxidase family protein (TIGR04169), and a phosphoesterase family protein (TIGR04168). The cassette is likely to be biosynthetic but its exact function is unknown. [Unknown function, Enzymes of unknown specificity]	189
275103	TIGR04283	glyco_like_mftF	transferase 2, rSAM/selenodomain-associated. This enzyme may transfer a nucleotide, or its sugar moiety, as part of a biosynthetic pathway. Other proposed members of the pathway include another transferase (TIGR04282), a phosphoesterase, and a radical SAM enzyme (TIGR04167) whose C-terminal domain (pfam12345) frequently contains a selenocysteine. [Unknown function, Enzymes of unknown specificity]	220
275104	TIGR04284	aldehy_Rv0768	aldehyde dehydrogenase, Rv0768 family. This family describes a branch of the aldehyde dehydrogenase (NAD) family (see pfam00171) that includes Rv0768 from Mycobacterium tuberculosis. All members of this family belong to species predicted to synthesize mycofactocin, suggesting that this enzyme or another upstream or downstream in the same pathway might be mycofactocin-dependent. However, the taxonomic range of this family is not nearly broad enough to make that relationship conclusive. [Unknown function, Enzymes of unknown specificity]	480
275105	TIGR04285	nucleoid_noc	nucleoid occlusion protein. This model describes nucleoid occlusion protein, a close homolog to ParB chromosome partitioning proteins including Spo0J in Bacillus subtilis. Its gene often is located near the gene for the Spo0J ortholog. This protein binds a specific DNA sequence and blocks cytokinesis from happening until chromosome segregation is complete.	255
275106	TIGR04286	MSEP-CTERM	MSEP-CTERM protein. Members of this protein family average over 900 residues in length and appear to have multiple membrane-spanning helices in the N-terminal half. The extreme C-terminal region consists of a motif with consensus sequence MSEP, then a transmembrane alpha helix, then a short region with several basic residues. This region, hereby dubbed MSEP-CTERM, resembles other putative sorting signals associated with the archaeosortase/exosortase protein family (see TIGR04178). Genes for all members of this family are found next to a gene for exosortase K.	920
213900	TIGR04287	exosort_XrtK	exosortase K. Members of this protein family are exosortase K, a bacterial branch of the archaeosortase/exosortase family of protein-processing enzymes (see TIGR04178). All members of the seed alignment are encoded next to a member of family TIGR04286, which has the putative processing signal MSEP-CTERM (see family TIGR04286) at the extreme C-terminus.	163
213901	TIGR04288	CGP_CTERM	CGP-CTERM domain. This domain has an essentially invariant motif, Cys-Gly-Pro, followed by a highly hydrophobic transmembrane domain, always at the protein C-terminus. It occurs, so far, strictly in the family Thermococcaceae (includes Thermococcus and Pyrococcus) within the Euryarchaeota. It occurs in ten proteins per genome on average, and proteins with the domain may lack similarity elsewhere. The presumed sorting/processing protein, for which this domain contains the recognition sequence, is unknown, but it is unlikely to be a member of the exosortase/archaeosortase family. The Cys residue suggests a lipid modification. Upstream, from this domain, most member proteins have an extremely Thr-rich sequence, suggesting archaeal surface protein O-linked glycosylation.	20
275107	TIGR04289	heavy_Cys	eight-cysteine-cluster domain. In this domain of about 50 residues, eight of twelve invariant residues are Cys. Proteins with this domain tend to have N-terminal signal sequences, suggesting an extracytoplasmic location for this domain.	52
275108	TIGR04290	meth_Rta_06860	methyltransferase, Rta_06860 family. Members of this family are methyltransferases that mark a widely distributed large conserved gene neighborhood of unknown function. It appears most common in soil and rhizosphere bacteria.	226
275109	TIGR04291	arsen_driv_ArsA	arsenical pump-driving ATPase. The broader family (TIGR00345) to which the current family belongs consists of transport-energizing ATPases, including the TRC40/GET3 family involved in post-translational insertion of protein C-terminal transmembrane anchors into membranes from the cytosolic face. This family, however, is restricted to ATPases that energize pumps that export arsenite (or antimonite).	566
275110	TIGR04292	heavy_Cys_CGP	heavy-Cys/CGP-CTERM domain protein. Members of this protein family are restricted to the Pyrococcus and Thermococcus genera of the archaea. Member proteins have a C-terminal, Cys-containing predicted surface anchor domain, where the Cys may be the site of cleavage and lipid attachment (see domain TIGR04288). Members also contain a region crowded with 10 invariant Cys in 60 residues (see domain TIGR04289), possible ligands to some redox cofactor. Note that a sorting motif is CGP. Previously, the motif was named incorrectly as GCP-CTERM in this model due to a typographical error.	373
213906	TIGR04293	archaeo_artF	archaeosortase family protein ArtF. Members of this protein family, ArtF, belong to the archaeosortase/exosortase family, in which many members associate with specific protein C-terminal putative protein sorting domains (exosortase A with PEP-CTERM, archaeosortase A with PGF-CTERM, etc.). This subgroup is observed in Thermococcus gammatolerans EJ3 and Thermococcus sp. AM4, but the gene neighborhood is not conserved. The cognate sequence to ArtF is unknown, but should not be ICGP-CTERM (model TIGR04288), found also in many Pyrococcus species that lack any archaeosortase family member.	166
213907	TIGR04294	pre_pil_HX9DG	prepilin-type processing-associated H-X9-DG domain. This model describes a region of ~16 residues found typically about 30 residues away from the C-terminus of large numbers of proteins in the Planctomycetes, Lentisphaerae, and Verrucomicrobia, on proteins with a prepilin-type N-terminal cleavage/methylation domain (see TIGR02532). The motif H-X(9)-D-G is nearly invariant. Single genomes may encode over 200 such proteins.	16
275111	TIGR04295	B12_rSAM_oligo	B12-binding domain/radical SAM domain protein, rhizo-twelve system. A variety of bacteria, including multiple species of Bradyrhizobium, Mesorhizobium, and Methylobacterium, have a typically twelve-gene cassette (hence the designation rhizo-twelve) for the biosynthesis of some unknown oligosaccharide. This family is a B12-binding domain/radical SAM domain protein found in roughly half of these cassettes, but nowhere else.	422
275112	TIGR04296	PEFG-CTERM	PEFG-CTERM domain. This putative protein sorting/processing domain occurs about ten times per genome in members of the Thaumarchaeota. Its putative handling protein, a member of the archaeosortase/exosortase protein family, is exceptional in having a Ser rather than Cys at the putative active site. The highly conserved motif resembles the PEF-CTERM protein sorting domain of family TIGR03024, but membership does not overlap.	30
213910	TIGR04297	thauma_sortase	thaumarchaeosortase. This member of the archaeosortase/exosortase family occurs exclusively in the Thaumarchaeota, where the corresponding proposed sorting signal is PEFG-CTERM (see model TIGR04296). This family is unusual in that the suspected active site residue, Cys in every other defined subfamily of archaeosortases and exosortases is replaced by Ser.	307
213911	TIGR04298	his_histam_anti	histidine-histamine antiporter. Members of this protein family are antiporters that exchange histidine with histamine, the product of histidine decarboxylation. A system consisting of this protein and a histidine decarboxylase encoded by an adjacent gene creates a decarboxylation/antiport proton-motive cycle that provides a transient resistance to acidic conditions.	429
213912	TIGR04299	antiport_PotE	putrescine-ornithine antiporter. Members of this protein family are putrescine-ornithine antiporters. They work together with an enzyme that decarboxylates ornithine to putrescine. This two-gene system has the net effect of removing a proton from the cytosol, providing transient resistance to acid conditions.	430
213913	TIGR04300	exosort_XrtM	exosortase family protein XrtM. Members of this family, part of the larger exosortase/archaeosortase family, are known from five related cassettes of genes in Methylomonas methanica MC09, a gammaproteobacterial methanotroph. Each xrtM gene occurs near a large YD repeat (see TIGR01643) protein of 1500-2500 residues and a small, uncharacterized protein of about 200 residues. No PEP-CTERM-like recognition sequence has been identified, so this protein is designated as exosortase family, but not necessarily a functional exosortase.	150
275113	TIGR04301	ODC_inducible	ornithine decarboxylase SpeF. Members of this family are known or trusted examples of ornithine decarboxylase, all encoded in the immediate vicinity of an ornithine-putrescine antiporter. Decarboxylation of ornithine to putrescine, followed by exchange of a putrescine for a new ornithine, is a proton-motive cycle that can be induced by low pH and protect a bacterium against transient exposure to acidic conditions.	719
213915	TIGR04302	geo_PqqD_fam	GeoRSP system PqqD family protein. Members of this PqqD-related family so far occur only in the genus Geobacter, always together with a PqqE-like radical SAM domain/SPASM domain protein and a second SPASM domain protein with traces of a degenerate radical SAM domain. The extended gene region includes a high-molecular-weight cytochrome c family protein. Besides authentic PqqD (TIGR03859), another example of a PqqD family protein occurs in the SynChlorMet cassette, again with two PqqE-like proteins. The system is named GeoRSP for its prevalence in Geobacter, its Radical SAM protein, its SPASM domain protein, and its PqqD family protein.	102
213916	TIGR04303	GeoRSP_rSAM	GeoRSP system radical SAM/SPASM protein. Members of this family are radical SAM/SPASM domain proteins from a cassette restricted to the genus Geobacter. Genes always found adjacent include a non-radical SAM protein with a closely related SPASM domain and a short stretch of N-terminal homology as well to this family, and also a PqqD-like protein. The three-gene cassette is designated GeoRSP for the genus Geobacter, this radical SAM protein, the SPASM domain protein, and the PqqD family protein.	325
213917	TIGR04304	GeoRSP_SPASM	GeoRSP system SPASM domain protein. Members of this protein family are encoded by one of two consecutive genes for SPASM domain proteins. The two are closely homologous in the SPASM domain regions, and also in a small N-terminal region, but the other family (TIGR04303) has an intact radical SAM domain (pfam04055) that this "quasi-rSAM" protein lacks. A PqqD-family protein, TIGR04302, is always adjacent.	293
275114	TIGR04305	fol_rel_CADD	putative folate metabolism protein, CADD family. This protein family, related to but outside the family of PqqC proteins involved in PQQ biosynthesis, includes the well-studied Chlamydia protein CADD (Chlamydia protein Associating with Death Domains), which can induce apoptosis in a host cell. Other members of this family occur in Rickettsia and Wolbachia, unrelated in terms of phylogeny (both are alphaproteobacteria) but similar in living intracellularly. Local gene context in these species, although not in Trichodesmium or Nitrosomonas eutropha, suggests a role in folate metabolism, and some species with this protein lack FolE but have other folate synthesis proteins.	212
213919	TIGR04306	salvage_TenA	thiaminase II. The TenA protein of Bacillus subtilis and Staphylococcus aureus, and the C-terminal region of trifunctional protein Thi20p from Saccharomyces cerevisiae, perform cleavages on thiamine and related compounds to produce 4-amino-5-hydroxymethyl-2-methylpyrimidine (HMP), a substrate of a salvage pathway for thiamine biosynthesis. The gene symbol tenA, for Transcription ENhancement A, reflects a misleading early characterization as a regulatory protein. This family is related to PqqC from the PQQ biosynthesis system (see TIGR02111), heme oxygenase (pfam01126), and CADD (Chlamydia protein Associating with Death Domains), a putative folate metabolism enzyme (see TIGR04305).	208
213920	TIGR04307	ProTailRpt	proline-rich tail region repeat. This model describes a proline-rich tandem repeat of about 24 residues found in C-terminal regions of Gram-positive surface proteins with LPXTG sequences for processing and cell surface attachment by sortase.	23
275115	TIGR04308	repeat_SSSPR51	surface protein repeat SSSPR-51. This repeat domain is designated SSSPR-51, Streptococcal and Staphylococcal Surface Protein Repeat of size 51. These repeats are homologous to the listerial repeats of pfam13461, but shorter on average by about 8 amino acids. Up to twelve tandem repeats can occur, on some of the longest proteins of their respective species. Nearly all member proteins carry the C-terminal sortase target sequence, LPXTG, recognizable by model TIGR01167. The repeat structure and probable surface location suggest a possible adhesion function. A protein with this class of repeats may have other classes as well.	48
213922	TIGR04309	amanitin	amanitin/phalloidin family toxin. Members of this family are ribosomally produced precursors of toxins produced by several mushrooms. These precursors undergo extensive post-translational modification to become amatoxins (e.g. alpha-amanitin) and phallotoxins (e.g. phalloidin).	33
213923	TIGR04310	pantocin_A_pre	pantocin A family RiPP. Members of this family are ribosomally-synthesized and posttranslationally-modified peptide (RiPP) precursors about 30 amino acids in length encoded in the vicinity of PaaA and PaaB homologs. Members include PaaP from Pantoea agglomerans, whose central tripeptide EEN appears to be the source of the mature product, pantocin A. Note, however, that the corresponding residues in Photobacterium sp. SKA34 and Photobacterium asymbiotica are EEK rather than EEN. This family, therefore, resembles the PQQ precursor PqqA as a peptide precursor of an extremely small mature product.	29
275116	TIGR04311	rSAM_Geo_metal	putative metalloenzyme radical SAM/SPASM domain maturase. This model describes a family of radical SAM/SPASM enzymes largely from the deltaproteobacteria. The family is most closely related to a radical SAM enzyme family found regularly in archaea in the vicinity of tungsten-containing oxidoreductases. A single member of the family in archaea may correspond to multiple tungsten enzymes, e.g. five in Pyrococcus furiosus. Therefore, the lack of a conserved gene neighborhood for members of this family in deltaproteobacteria suggests members may be involved in the maturation of multiple metalloenzymes.	423
275117	TIGR04312	choice_anch_B	choice-of-anchor B domain. This domain, about 385 amino acids long, can have either of at least two types of C-terminal sorting signal. Members from Shewanella and allies have the rhombosortase target domain GlyGly-CTERM (TIGR03501), while members of the Bacteroidetes have the Por secretion system C-terminal domain (TIGR04183). Most other members lack any C-terminal extension, but in most of those, the normal signal sequence is replaced by a lipoprotein signal sequence. Member sequences show a region of local similarity to the LVIVD repeat sequence (pfam08309).	364
275118	TIGR04313	aro_clust_Mycop	aromatic cluster surface protein. Members of this family are absolutely restricted to the Mollicutes (Mycoplasma and Ureaplasma). All have a signal peptide, usually of the lipoprotein type, suggesting surface expression. Most members have lengths of about 280 residues but some members have a nearly full-length duplication. The most nearly invariant residue, a Trp, is part of a strongly conserved 9-residue motif, [ND]-W-[LY]-[WF]-X-[LF]-X-N-[LI], where X usually is hydrophobic. Because the hydrophobic six-residue core of this motif almost always contains three to four aromatic residues, we name this family aromatic cluster surface protein. Multiple paralogs may occur in a given Mycoplasma, usually clustered on the genome.	293
213927	TIGR04314	methano7heme	methanogenesis multiheme c-type cytochrome. Members of this protein family are multiheme cytochrome c proteins of Methanosarcina acetivorans C2A and several other archaeal methanogens. All members have N-terminal signal peptides and are presumed to act in electron transfer reactions associated with methanogenesis. Putative heme-binding motifs include five (or six) CXXCH motifs, a CXXXCH motif, and a CXXXXCH motif. These proteins show multiple regions of local homology, in the same order, with multiheme cytochrome c proteins such as octaheme tetrathionate reductase from Shewanella.	494
275119	TIGR04315	octaheme_Shew	octaheme c-type cytochrome, tetrathionate reductase family. Members of this protein family bind heme covalently and contain eight (at least) CXXCH heme-binding motifs. A characterized member is the respiratory enzyme octaheme tetrathionate reductase from Shewanella.	430
275120	TIGR04316	dhbA_paeA	2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase. Members of this family are 2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase (EC 1.3.1.28), the third enzyme in the biosynthesis of 2,3-dihydroxybenzoic acid (DHB) from chorismate. The first two enzymes are isochorismate synthase (EC 5.4.4.2) and isochorismatase (EC 3.3.2.1). Synthesis is often followed by adenylation by the enzyme DHBA-AMP ligase (EC 2.7.7.58) to activate (DHB) for a non-ribosomal peptide synthetase.	250
275121	TIGR04317	W_rSAM_matur	tungsten cofactor oxidoreductase radical SAM maturase. Members of this family are radical SAM enzymes involved in the maturation of tungsten (W)-containing cofactors in the enzymes aldehyde ferredoxin oxidoreductase, formaldehyde ferredoxin oxidoreductase, and others, and tend to be encoded by an adjacent gene.	349
275122	TIGR04318	lacto_ODC_hypo	putative ornithine decarboxylase. In at least ten species of Lactobacillus, this close homolog to known ornithine decarboxylase occurs in a three-gene neighborhood, along with an amino acid permease family transporter and pyridoxal phosphate-dependent enzyme from the cystathionine gamma-lyase family. Species include L. acidophilus, L. amylovorus, L. crispatus, L. delbrueckii, L. farciminis, L. helveticus, L. johnsonii, etc. The combination of a decarboxylase with an antiporter in a two-gene system suggests a decarboxylation/antiport proton-motive cycle for transient resistance to acidic conditions. The substrate for this decarboxylase might be ornithine but is unknown.	695
275123	TIGR04319	SerAla_Lrha_rpt	surface protein repeat Ser-Ala-175. This serine and alanine-rich surface protein repeat, about 175 amino acids long, occurs up to nine times in surface proteins of some Lactobacillus strains, particularly in Lactobacillus rhamnosus. Member proteins have the N-terminal variant signal sequence described by TIGR03715 and C-terminal LPXTG signals for surface attachment by sortase.	175
275124	TIGR04320	Surf_Exclu_PgrA	SEC10/PgrA surface exclusion domain. This model describes a conserved domain found in surface proteins of a number of Firmicutes. Many members have LPXTG C-terminal anchoring motifs and a substantial number have the KxYKxGKxW putative sorting signal at the N-terminus. The tetracycline resistance plasmid pCF10 in Enterococcus faecalis promotes conjugal plasmid transfer in response to sex pheromones, but PgrA/Sec10 encoded by that plasmid, a member of this family, specifically inhibits the ability of cells to receive homologous plasmids. The phenomenon is called surface exclusion.	356
275125	TIGR04321	spiroSPASM	spiro-SPASM protein. This three-domain protein is restricted to the spirochetes and widely distributed (excepting Borrelia). It has a conserved C-terminal SPASM domain, a 4Fe-4S binding domain shared by a number of peptide-modifying and heme-modifying radical SAM proteins. It has a central radical SAM domain, although half the members have lost the signature 4Fe-4S-binding Cys residues, fail to register with the radical SAM domain definition of pfam04055, and must be considered pseudo-SAM proteins. PSI-BLAST shows a relationship between the N-terminal domain and various predicted glycosyltransferases (e.g. Bacillus subtilis SpsF) and cytidyltransferases. In some Treponema species, this protein appears to split into two tandem genes.	508
275126	TIGR04322	rSAM_QueE_Ecoli	putative 7-cyano-7-deazaguanosine (preQ0) biosynthesis protein QueE. Members of this radical SAM domain protein family appear to be the E. coli form of the queuosine biosynthesis protein QueE. QueE is involved in making preQ0 (7-cyano-7-deazaguanine), a precursor of both the bacterial/eukaryotic modified tRNA base queuosine and the archaeal modified base archaeosine. Members occur in species that lack known forms of QueE but usually are not found in queuosine biosynthesis operons. Members of this family tend to form bi-directional best hit matches to members of known (TIGR03365) and putative (TIGR03963) QueE families from other lineages.	215
213936	TIGR04323	SpoChoClust_1	sporadic carbohydrate cluster protein, LIC12192 family. Members of this uncharacterized protein family mark a rare but widely distributed carbohydrate biosynthesis cluster found sporadically in genera Bradyrhizobium, Leptospira, Magnetospirillum, Oscillatoria, Prochlorococcus, etc.	122
275127	TIGR04324	SpoChoClust_2	sporadic carbohydrate cluster 2OG-Fe(II) oxygenase. This family, related to streptomycin biosynthesis protein StrG and to phytanoyl-CoA dioxygenase, belongs to the 2-oxoglutarate and Fe(II)-dependent oxygenase superfamily, which includes not just dioxygenases, but also some chlorinating enzymes involved in natural product biosynthesis. Most members of this family are adjacent to a member of TIGR04323, and occur in a larger carbohydrate biosynthesis cluster found sporadically in genera Bradyrhizobium, Leptospira, Magnetospirillum, Oscillatoria, Prochlorococcus, etc.	248
275128	TIGR04325	MTase_LIC12133	putative methyltransferase, LIC12133 family. Members of this family tend to occur next to glycosyltransferases and other characteristic enzymes of O-antigen biosynthetic regions. The founding member is LIC12133 from Leptospira interrogans serovar Copenhageni. PSI-BLAST reveals distant homology to known SAM-dependent methyltransferases, as in pfam13489.	235
275129	TIGR04326	O_ant_LIC13510	surface carbohydrate biosynthesis protein, LIC13510 family. This uncharacterized, rare protein occurs in the highly variable O-antigen region of some strains of Leptospira, as well as strains of and serves as a phylogenetic marker for the likely presence of six additional proteins, including an activated sugar-nucleotidyltransferase, an activated sugar epimerase and a dehydratase, an aldolase, and a DegT family aminotransferase. The patterns suggests a role in preparing a novel sugar for O-antigen incorporation.	602
275130	TIGR04327	OMP_LA_2444	outer membrane protein, LA_2444/LA_4059 family. Members of this family are predicted outer membrane proteins, apparently restricted to the Leptospiraceae (Leptospira and Leptonema).	291
213941	TIGR04328	cas4_PREFRAN	CRISPR-associated protein Cas4, subtype PREFRAN. Members of this family are the Cas4 protein of a novel CRISPR subtype, PREFRAN, found in Prevotella bryantii B14, Prevotella disiens FB035-09AN, Francisella tularensis subsp. novicida, Francisella philomiragia, Butyrivibrio proteoclasticus B316, Helcococcus kunzii ATCC 51366, etc.	178
213942	TIGR04329	cas1_PREFRAN	CRISPR-associated endonuclease Cas1, subtype PREFRAN. Members of this family are the Cas1 endonuclease of a novel CRISPR subtype, PREFRAN, found in Prevotella bryantii B14, Prevotella disiens FB035-09AN, Francisella tularensis subsp. novicida, Francisella philomiragia, Butyrivibrio proteoclasticus B316, Helcococcus kunzii ATCC 51366, etc.	317
275131	TIGR04330	cas_Cpf1	CRISPR-associated protein Cpf1, subtype PREFRAN. This family is the long protein of a novel CRISPR subtype, PREFRAN, which is most common in Prevotella and Francisella, although widely distributed. The PREFRAN type has Cas1, Cas2, and Cas4, but lacks the helicase Cas3 and endonuclease Cas3-HD.	1286
275132	TIGR04331	o_ant_LIC12162	putative transferase, LIC12162 family. This protein family shows C-terminal sequence similarity to various surface carbohydrate biosynthesis enzymes: spore coat polysaccharide biosynthesis protein SpsB, UDP-N-acetyl-D-glucosamine 2-epimerase, lipid A disaccharide synthetase LpxB, etc. It may occur in O-antigen biosynthesis regions.	585
275133	TIGR04332	gamma_Glu_sys	poly-gamma-glutamate system protein. Poly(gamma-glutamic acid), or PGA, is an extracellular structural polymer found in Bacillus subtilis and a number of other species. PGA is sometimes capsule-forming, sometimes secretory, and may be produced by Gram-positive (single plasma membrane) and Gram-negative (inner and outer membranes), so export and/or attachment machinery may differ from system to system. Members of this family occur in a subset of PGA operons, in lineages that include Francisella, Leptospira, Treponema, Thermotoga, Fusobacterium, and Clostridium, among others. Because gene symbols pgsWXYZ are not yet in use, we suggest pgsW, as one of a series of poly-gamma-glutamate synthesis auxiliary proteins.	307
213946	TIGR04333	Clo7Bot_mod_Cys	Cys-rich peptide, Clo7bot family. Members of this protein family range in size from 34 to 53 residues, including from four to seven Cys residues. Multiple strains of Clostridium botulinum show seven tandem members upstream of a radical SAM/SPASM domain protein likely to act as a ribosomal natural product maturase. By analogy to subtilosin A, the Cys residues are likely targets for modifications that may introduce new crosslinks. Across multiple strains of Clostridium botulinum and C. sporogenes, the adjacent radical SAM enzyme is nearly invariant.	34
213947	TIGR04334	rSAM_Clo7bot	radical SAM/SPASM domain Clo7bot peptide maturase. In multiple strains of Clostridium botulinum, this single radical SAM domain protein occurs next to a tandem array of seven homologous Cys-rich small peptides (see TIGR04333). Because this radical SAM enzyme contains the SPASM domain, associated with peptide modification, it is proposed to modify all seven C. botulinum targets, hence the name Clo7bot for this system. Suggested gene symbol is ctpM (Clostridial Tandem Peptide Maturase). [Protein fate, Protein modification and repair]	440
275134	TIGR04335	AmmeMemoSam_A	AmmeMemoRadiSam system protein A. Members of this protein family belong to the same domain family as AMMECR1, a mammalian protein named for AMME - Alport syndrome, Mental Retardation, Midface hypoplasia, and Elliptocytosis. Members of the present family occur as part of a three gene system with a homolog of the mammalian protein Memo (Mediator of ErbB2-driven cell MOtility), and an uncharacterized radical SAM enzyme.	174
275135	TIGR04336	AmmeMemoSam_B	AmmeMemoRadiSam system protein B. Members of this protein family belong to the same domain family as the mammalian protein Memo (Mediator of ErbB2-driven cell MOtility). Members of the present family occur as part of a three gene system with an uncharacterized radical SAM enzyme and a homolog of the mammalian protein AMMECR1, a mammalian protein named for AMME - Alport syndrome, Mental Retardation, Midface hypoplasia, and Elliptocytosis. Memo in humans has protein-protein interaction activity with binding of phosphorylated Tyr, but members of this family may be active as enzymes, as suggested by homology to a class of nonheme iron dioxygenases.	269
275136	TIGR04337	AmmeMemoSam_rS	AmmeMemoRadiSam system radical SAM enzyme. Members of this protein family are uncharacterized radical SAM enzymes that occur in a prokaryotic three-gene system along with homologs of mammalian proteins Memo (Mediator of ErbB2-driven cell MOtility) and AMMECR1 (Alport syndrome, Mental Retardation, Midface hypoplasia, and Elliptocytosis). Among radical SAM enzymes that have been experimentally characterized, the most closely related in sequence include activases of pyruvate formate-lyase and of benzylsuccinate synthase.	349
275137	TIGR04338	HEXXH_Rv0185	putative metallohydrolase, TIGR04338 family. This protein family is restricted to the Actinomycetales, including Mycobacterium, Rhodococcus, Nocardia, Gordonii, and others. The invariant motif HEXXH, at the core of the best conserved region in the protein, suggests metallohydrolase activity, as does local sequence similarity in this region to other metallohydrolases.	159
213952	TIGR04339	PQQ_MSMEG_3727	Actinobacterial PQQ system protein. Members of this protein family are restricted to members of the Actinobacteria (Mycobacterium smegmatis, Streptomyces hygroscopicus, Geodermatophilus obscurus, Pseudonocardia dioxanivorans, Saccharomonospora marina, etc) that synthesize PQQ. This small protein, 155 amino acids long on average, is found regularly next to a much larger protein, a PQQ-dependent oxidoreductase, and might be a companion subunit or an accessory protein such as chaperone involved in cofactor insertion.	151
213953	TIGR04340	rSAM_ACGX	radical SAM/SPASM domain protein, ACGX system. Members of this protein family are radical SAM/SPASM domain proteins likely to be involved in the modification of small, Cys-rich peptides. Members of the family of proposed target sequences, TIGR04341, average 75 amino acids in length and average six instances of the motif ACGX, where X is A, S, or T.	341
213954	TIGR04341	target_ACGX	ACGX-repeat peptide. Members of this family average 75 amino acids in length and average six instances of the motif ACGX, where X is A, S, or T. Members are proposed target sequences for modification by adjacent radical SAM/SPASM domain proteins (family TIGR04340). Cys residues adjacent to Gly residues are common as proposed sites for modification by radical SAM enzymes.	57
275138	TIGR04342	EXLDI	EXLDI protein. The most conserved region in this protein family is the C-terminal pentapeptide, with motif ExLDI. Members from the Firmicutes average about 120 amino acids in length, while members from the Actinobacteria have an additional 45-residue amino-terminal segment not included in the model. In it is suggested that the member from Streptococcus mutans UA159, and its homologs, participate in bacteriocin production, export, or immunity.	124
275139	TIGR04343	egtE_PLP_lyase	ergothioneine biosynthesis PLP-dependent enzyme EgtE. Members of this protein family are the pyridoxal phosphate-dependent enzyme EgtE, which catalyzes the final step in the biosynthesis of ergothioneine. [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs]	370
275140	TIGR04344	ovoA_Nterm	5-histidylcysteine sulfoxide synthase. Ovothiol A is N1-methyl-4-mercaptohistidine. In the absence of S-adenosylmethionine, a methyl donor, the intermediate produced is 4-mercaptohistidine. In both Erwinia tasmaniensis and Trypanosoma cruzi, a protein occurs with 5-histidylcysteine sulfoxide synthase activity, but these two enzymes and most homologs share an additional C-terminal methyltransferase domain. Thus OvoA may be a bifunctional enzyme with 5-histidylcysteine sulfoxide synthase and 4-mercaptohistidine N1-methyltransferase activity. This model describes the 5-histidylcysteine sulfoxide synthase domain, a homolog of the ergothioneine biosynthesis protein EgtB. [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs]	442
275141	TIGR04345	ovoA_Cterm	putative 4-mercaptohistidine N1-methyltransferase. Ovothiol A is N1-methyl-4-mercaptohistidine. In the absence of S-adenosylmethionine, a methyl donor, the intermediate produced is 4-mercaptohistidine. In both Erwinia tasmaniensis and Trypanosoma cruzi, a protein occurs with 5-histidylcysteine sulfoxide synthase activity, but these two enzymes and most homologs share an additional C-terminal methyltransferase domain. Thus OvoA may be a bifunctional enzyme with 5-histidylcysteine sulfoxide synthase and 4-mercaptohistidine N1-methyltransferase activity. This model describes the C-terminal putative 4-mercaptohistidine N1-methyltransferase domain. [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs]	242
275142	TIGR04346	DotA_TraY	conjugal transfer/type IV secretion protein DotA/TraY. Members of this protein family include transfer protein TraY of IncI1 plasmid R64 and DotA (defect in organelle trafficking A) of Legionella pneumophila.	652
275143	TIGR04347	pseudo_SAM_Halo	pseudo-rSAM protein/SPASM domain protein. Members of this family all have a C-terminal SPASM domain (see model TIGR04085), a region usually found as a C-terminal second 4Fe-4S domain of radical SAM domain (see pfam04055) proteins. A majority of rSAM/SPASM proteins modify ribosomally produced peptides. In a few members of this family, the key Cys residues of the radical SAM domain have been lost, making this a pseudo-rSAM family. Members of this family are restricted so far to Haloarchaea, always occur next to a member of family TIGR04031, and are often accompanied by another rSAM/SPASM domain protein. The function of this two or three gene cassette is unknown.	390
275144	TIGR04348	TIGR04348	putative glycosyltransferase, TIGR04348 family. This putative glycosyltransferase is found in marine bacteria such as Marinobacter and soil bacteria such as Anaeromyxobacter, but does not seem to occur in known pathogenic bacteria.	310
275145	TIGR04349	rSAM_QueE_gams	putative 7-cyano-7-deazaguanosine (preQ0) biosynthesis protein QueE, gammaproteobacterial type. Members of this radical SAM domain protein family appear to be a form of the queuosine biosynthesis protein QueE. QueE is involved in making preQ0 (7-cyano-7-deazaguanine), a precursor of both the bacterial/eukaryotic modified tRNA base queuosine and the archaeal modified base archaeosine. Members occur in preQ0 operons in species that lack members of related protein family TIGR03365.	210
275146	TIGR04350	C_S_lyase_PatB	putative C-S lyase. Members of this subfamily are probable C-S lyases from a family of pyridoxal phosphate-dependent enzymes that tend to be (mis)annotated as probable aminotransferases. One member is PatB of Bacillus subtilis, a proven C-S-lyase. Another is the virulence factor cystalysin from Treponema denticola, whose hemolysin activity may stem from H2S production. Members of the seed alignment occur next to examples of the enzyme 5-histidylcysteine sulfoxide synthase, from ovothiol A biosynthesis, and would be expected to perform a C-S cleavage of 5-histidylcysteine sulfoxide to leave 1-methyl-4-mercaptohistidine (ovothiol A).	384
213964	TIGR04351	TOMM_nitrile_2	putative TOMM peptide. Members of this family of short peptides average about 110 amino acids in length, with greatest variability in the last thirty. The conserved region resembles the alpha subunit of nitrile hydratase, as with the NHLP leader peptide domain (TIGR03793), and members usually are found near a cyclodehydratase (maturase) enzyme, marking these as like thiazole/oxazole-modified microcins (TOMM), but these precursor forms lack the GlyGly cleavage motif that marks the clear end of a leader peptide region. Genomes with this system include Streptomyces clavuligerus ATCC 27064, Verrucosispora maris AB-18-032, and Kitasatospora setae KM-6054.	48
275147	TIGR04352	HprK_rel_A	HprK-related kinase A. A number of protein families resemble HPr kinase (see TIGR00679) but do not belong to that system. They include this family, which appears instead to be the marker for a different type of gene neighborhood, in which one of the conserved neighboring proteins resembles (but is distinct from) PqqD.	280
275148	TIGR04353	PqqD_rel_X	PqqD family protein, HPr-rel-A system. Members of this protein family show distant homology to PqqD, and belong to a three-gene cassette that includes the HPr kinase related protein family of TIGR04352. The roles of the cassette, and of this protein, are unknown.	73
275149	TIGR04354	amphi-Trp	amphi-Trp domain. This domain usually comprises most of the span of bacterial or archaeal proteins with a length of about 90 amino acids. Some members, however, are extended by one or two copies of domain pfam07411 in the C-terminal region. No residue in this domain is invariant. A striking feature of this domain is a C-terminal region that alternates strongly charged with strongly hydrophobic residues and usually ends with a Trp residue, e.g. LEIEIEW or FEIKVRW, suggesting an amphipathic beta strand structure. We suggest the name amphi-Trp for this domain. Some members of this family occur regularly in genomic contexts that include putative kinases of unknown specificity related to (but distinct from) HPr kinase, a Ser-specific protein kinase. The function is unknown.	67
275150	TIGR04355	HprK_rel_B	HprK-related kinase B. Members of this protein family resemble (and often are misannotated as) HprK, the serine kinase/phosphatase of the phosphocarrier protein HPr. However, members do not occur with an HPr homolog, but instead as part of a distinctive gene cassette of unknown function.	351
213969	TIGR04356	grasp_GAK	ATP-grasp enzyme, GAK system. Members of this family are ATP-grasp family enzymes related to a number of characterized glutamate ligases, including the ribosomal protein S6 modification enzyme RimK. This group belongs to a conserved gene neighborhood that also features an HPr kinase-related protein (see TIGR04355). We assign this system the initial designation GAK, for Grasp (this ATP-grasp family enzyme), Amphipathic (for the member of family TIGR04354, designated Amphi-Trp), and Kinase, for the HPr-kinase homolog TIGR04355.	287
275151	TIGR04357	CofD_rel_GAK	CofD-related protein, GAK system. Members of this family are distantly related to CofD, the enzyme LPPG:FO 2-phospho-L-lactate transferase, involved in coenzyme F420 biosynthesis. This family appears to belong to a biosynthesis cassette of unknown function.	368
275152	TIGR04358	XXXCH_domain	XXXCH domain. Members of this family show C-terminal sequence similarity, perhaps indicating distant homology, to cytochrome c-prime (see pfam01322). However, the motif CxxCH is replaced by xxxCH. Genes for this protein occur in a sporadically distributed genome context, largely in deltaProteobacteria, in which an ATP-grasp family glutamate ligase homolog and a CofD (LPPG:FO 2-phospho-L-lactate transferase) homolog suggest a novel biosynthesis.	90
275153	TIGR04359	TrbK_RP4	entry exclusion lipoprotein TrbK. The characterized model example of TrbK, from incompatibility group P (IncP) plasmid RP4, is an N-terminally processed lipoprotein, localized to the periplasmic face of the plasma membrane. TrbK prevents entry through conjugation by other IncP plasmids. Unrelated, uncharacterized proteins encoded in equivalent positions in other plasmid P-type conjugative transfer regions (e.g. TIGR04360) may have analogous functions. [Mobile and extrachromosomal element functions, Plasmid functions]	66
275154	TIGR04360	other_trbK	conjugative transfer region protein TrbK. Members of this family regularly are encoded between the TrbJ and TrbL proteins essential for P-type conjugal transfer, and therefore are designated TrbK. Positional analogy to family TIGR04359 (the entry exclusion lipoprotein TrbK of IncP plasmid RP4), which is a lipoprotein and not homologous, suggests this protein may also be involved in entry exclusion. Members of this family are small, with a non-lipoprotein signal peptide and a conserved disulfide bond. [Mobile and extrachromosomal element functions, Plasmid functions]	74
275155	TIGR04361	TrbK_Ti	entry exclusion protein TrbK, Ti-type. Members of this family are encoded between the genes for TrbJ and TrbL of P-type plasmid conjugal transfer systems, and therefore are TrbK, a member of a guild of unrelated TrbK protein families. The similarly located TrbK of plasmid RP4 (family TIGR04359) functions in entry exclusion, and the current family may as well, despite lacking any detectable homology. Members of this family include TrbK of the Ti plasmid from Agrobacterium, shown not to be required for transfer, which would be consistent with a role in entry exclusion rather than transfer itself. Li et al. cite unpublished results that showed an entry exclusion function for TrbK of the Ti plasmid. This small protein shares close C-terminal sequence homology to the much longer protein encoded by the neighboring gene TrbJ. [Mobile and extrachromosomal element functions, Plasmid functions]	62
275156	TIGR04362	choice_anch_C	choice-of-anchor C domain. This family describes an extracellular bacterial domain that occurs on a number of proteins with PEP-CTERM (exosortase recognition site) sequences at the C-terminus, as well as some with an apparent alternate anchor sequence. Note that related pfam04862 (DUF642), as of release 26, is double the length of this model because it has two tandem regions homologous to this domain. pfam04862, in turn, belongs to a Pfam clan called the galactose-binding domain-like superfamily.	157
275157	TIGR04363	LD_lanti_pre	FxLD family lantipeptide. Members of this protein family occur with a cassette of lanthionine-type peptide modification enzymes. Members are small (about 60 amino acids long), rich in Cys, and variable in copy number per genome (from one to three). These features suggest that members of this family are modified to become lantipeptides, although not necessarily lantibiotics. There is no GlyGly cleavage motif to separate a leader peptide from core region. The considerable abundance in Streptomyces and relatively strong conservation hint at a non-antibiotic function. The motif FxLD in the N-terminal region is nearly invariant.	37
275158	TIGR04364	methyltran_FxLD	methyltransferase, FxLD system. Members of this family occur regularly in the vicinity of lantibiotic biosynthesis enzymes and their probable target, the FxLD family of putative ribosomal natural product precursor (TIGR04363). Members resemble protein-L-isoaspartate O-methyltransferase (TIGR00080) and a predicted methyltransferase, TIGR04188, of another putative peptide modification system.	394
213978	TIGR04365	spare_glycyl	autonomous glycyl radical cofactor GrcA. This small protein, previously designated YfiD in E. coli, is closely homologous to pyruvate formate-lyase (PFL) in a region surrounding the stable glycyl radical that is prepared by the action of pyruvate formate-lyase activase, a radical SAM enzyme. When damage at the site of this radical breaks the main chain of PFL, this protein acts as a spare part that reintroduces the needed stable glycyl radical. Cutoffs for this model are set to exclude a set of closely related phage proteins that appear to have a corresponding function.	124
275159	TIGR04366	cupin_WbuC	cupin fold metalloprotein, WbuC family. Members of this family show sequence similarity to cupin fold proteins (see pfam07883), including conserved His residues likely to serve as metal-binding ligands. Many members occur in bacterial O-antigen biosynthesis regions. Some members have acquired the gene symbol wbuC (e.g. Jarvis, et al, 2011), but publications using this term do not ascribe a function.	132
275160	TIGR04367	HpnR_B12_rSAM	hopanoid C-3 methylase HpnR. Members of this family are B12-binding domain/radical SAM domain proteins required for 3-methylhopanoid production. Activity was confirmed by mutant phenotype by disrupting this gene in Methylococcus capsulatus strain Bath. This protein family should only occur in genomes that encode a squalene-hopene cyclase (see TIGR01507). [Fatty acid and phospholipid metabolism, Biosynthesis]	490
275161	TIGR04368	Glu_2_3_NH3_mut	glutamate 2,3-aminomutase. Members of this family are glutamate 2,3-aminomutase, a radical SAM enzyme with a pyridoxal phosphate group. It is closely related to lysine 2,3-aminomutase, but distinguished by architecture (longer N-terminal region, shorter C-terminal region) and replacement of key lysine-binding residues Asp293 and Asp330 (inferred from the crystal structure) by glutamate-binding residues Lys and Asn. Activity was demonstrated for sequences from Clostridium difficile, Thermoanaerobacter tengcongensis MB4, and Syntrophomonas wolfei str. Goettingen. The action of this enzyme creates beta-glutamate, an osmolyte. [Cellular processes, Adaptations to atypical conditions]	404
275162	TIGR04369	fusion_not_SelD	oxidoreductase/SelD-related fusion protein. Some selenium donor proteins (selenide,water dikase, product of the selD gene, model TIGR00476) are fusion proteins with an N-terminal extension described by model TIGR03169. Members of this family have a C-terminal region similar to yet outside the scope of the SelD model, fused to an N-terminal region similar to but outside the scope of TIGR03169.	702
275163	TIGR04370	glyco_rpt_poly	oligosaccharide repeat unit polymerase. Members of this subfamily of highly hydrophobic proteins, with few highly conserved residues, all may act to polymerize the oligosaccharide repeat units of surface polysaccharides, including O-antigen in Gram-negative bacteria such as Leptospira (assign gene symbol wzy) and capsular polysaccharide in Gram-positive bacteria such as Streptococcus. O-antigen biosynthesis enzymes produce a repeat unit, usually an oligosaccharide, which itself is polymerized. O-antigen polymerase, usually designated Wzy. This family bears homology to the O-antigen ligase WaaL, but known examples of WaaL fall outside the bounds defined here. This model is much broader than pfam14296. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	392
275164	TIGR04371	methyltran_NanM	putative sugar O-methyltransferase. Members of this family appear to be SAM-dependent O-methyltransferases acting on sugars, based on iterated sequence searches and gene context. Members occur in Leptospira O-antigen regions, as well as NanM from the biosynthesis cluster for nanchangmycin, which produces 4-O-methyl-L-rhodinose as an intermediate. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	273
275165	TIGR04372	glycosyl_04372	putative glycosyltransferase, TIGR04372 family. This domain occurs in proteins of various lengths, in contexts that include O-antigen biosynthesis regions of various Leptospira species. Hits to this model and PSI-BLAST analysis suggest distant sequence similarity to family 9 glycosyltransferases (pfam01075), including ADP-heptose:LPS heptosyltransferase (RfaF), an enzyme involved in LPS inner core region biosynthesis. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	205
275166	TIGR04373	egtB_X_signatur	EgtB-related enzyme signature domain. This model represents a signature C-terminal region of a distinct clade in the EgtB subfamily, other members of which participate in ergothioneine biosynthesis	50
275167	TIGR04374	small_w_EgtBD	hercynine metabolism small protein. Hercynine is the betaine (trimethylated amino group) form of histidine. This small protein occurs in a conserved four-gene cyanobacterial cassette along with EgtD, the methyltransferase that converts histidine to hercynine, and an EgtB homolog as in ergothioneine biosynthesis, likely to attach some thiol through its sulfur to the imidazole ring.	73
275168	TIGR04375	cyano_w_EgtBD	hercynine metabolism protein. Hercynine is the betaine (trimethylated amino group) form of histidine. This protein occurs in a conserved four-gene cyanobacterial cassette along with EgtD, the methyltransferase that converts histidine to hercynine as in ergothioneine biosynthesis, an EgtB homolog that is likely to attach some thiol (e.g. gamma-glutamyl-cysteine) through its sulfur to the hercynine imidazole ring, and a small protein of unknown function (TIGR04374). Members are distantly related to phage shock protein A (PspA).	154
275169	TIGR04376	TIGR04376	TIGR04376 family protein. Members of this protein family resemble TIGR04375 and, more distantly, phage shock protein A (PspA). Members are restricted to the Cyanobacteria.	189
275170	TIGR04377	myo_inos_iolD	3,5/4-trihydroxycyclohexa-1,2-dione hydrolase. Members of this protein family, 3,5/4-trihydroxycyclohexa-1,2-dione hydrolase (iolD), represent one of eight enzymes in a pathway converting myo-inositol to acetyl-CoA. IolD hydrolyzes the cyclic molecule 3D-(3,5/4)-trihydroxycyclohexane-1,2-dione to yield 5-deoxy-D-glucuronic acid. TPP is a cofactor. [Energy metabolism, Sugars]	615
275171	TIGR04378	myo_inos_iolB	5-deoxy-glucuronate isomerase. Members of this protein family, 5-deoxy-glucuronate isomerase (iolB), represent one of eight enzymes in a pathway converting myo-inositol to acetyl-CoA. [Energy metabolism, Sugars]	247
275172	TIGR04379	myo_inos_iolE	myo-inosose-2 dehydratase. Members of this family include the enzyme myo-inosose-2 dehydratase, product of the gene iolE, as found in inositol utilization cassettes in many species. [Energy metabolism, Sugars]	290
275173	TIGR04380	myo_inos_iolG	inositol 2-dehydrogenase. All members of the seed alignment for this model are known or predicted inositol 2-dehydrogenase sequences co-clustered with other enzymes for catabolism of myo-inositol or closely related compounds. Inositol 2-dehydrogenase catalyzes the first step in inositol catabolism. Members of this family may vary somewhat in their ranges of acceptable substrates and some may act on analogs to myo-inositol rather than myo-inositol per se. [Energy metabolism, Sugars]	330
275174	TIGR04381	HTH_TypR	TyrR family helix-turn-helix domain. This model describes the C-terminal DNA-binding helix-turn-helix domain of several regulators of aromatic amino acid metabolism. Examples include TyrR in Escherichia coli and PhhR in Pseudomonas putida. Most members of this family have a sigma-54 interaction domain. [Regulatory functions, DNA interactions]	49
275175	TIGR04382	myo_inos_iolC_N	5-dehydro-2-deoxygluconokinase. All members of the seed alignment for this model are translated from the iolC gene of known or putative inositol catabolism operons. Members with characterized function are 5-dehydro-2-deoxygluconokinase, the enzyme catalyzing the fifth step in degradation from myo-inositol or closely related compounds. Note that many members of this family are fusion proteins with an additional C-terminal domain, of unknown function, described by pfam09863. [Energy metabolism, Sugars]	309
275176	TIGR04383	acidic_w_LPXTA	processed acidic surface protein. Members of this family are acidic surface proteins with an N-terminal signal peptide and a variant C-terminal sortase recognition sequence, LPXTA rather than LPXTG. The N-terminal region past the signal peptide is repeated a second or third time in many members of this family. Members occur in Firmicutes, encoded next to a dedicated sortase related to SrtC that assembles pilins, suggesting that this protein serves a structural rather than enzymatic role. Processing by the neighboring sortase may result in polymerization as well as surface attachment. [Cell envelope, Surface structures]	316
275177	TIGR04384	putr_carbamoyl	putrescine carbamoyltransferase. Members of this family are putrescine carbamoyltransferase (EC 2.1.3.6). There is some overlapping specificity with ornithine carbamoyltransferase (EC 2.1.3.3). The gene regularly is found next to agmatine deiminase and a carbamate kinase, suggesting a conserved catabolic agmatine deiminase pathway. [Energy metabolism, Amino acids and amines]	330
275178	TIGR04385	B12_rSAM_cofa1	putative variant cofactor biosynthesis B12-binding domain/radical SAM domain protein 1. Members of this protein family are one of two tandem B12-binding domain/radical SAM domain proteins that occur in a genome context with a pair of homologs to ThiC (phosphomethylpyrimidine synthase, EC 4.1.99.17), an enzyme that performs a complex rearrangement involved in thiamin biosynthesis, and a putative CobT (nicotinate-nucleotide--dimethylbenzimidazole phosphoribosyltransferase), an enzyme of cobalamin biosynthesis.	438
275179	TIGR04386	ThiC_like_1	ThiC-like protein 1. Members of this protein family closely resemble ThiC, an enzyme that performs a complex rearrangement during thiamin biosynthesis, but instead occur as one of two adjacent additional paralogs to bona fide ThiC, in a conserved gene neighborhood with a pair of B12 binding domain/radical SAM domain proteins. Members of the ThiC family are non-canonical radical SAM enzymes, using a C-terminal Cys-rich motif to ligand a 4Fe-4S cluster that cleaves S-adenosylmethionine (SAM), but that sequence region does not belong to pfam04055.	426
275180	TIGR04387	capsid_maj_N4	major capsid protein, N4-gp56 family. Members of this family are phage major capsid proteins as found in phage N4 (a double-stranded DNA virus) plus many additional lytic phage and integrated prophage regions. [Mobile and extrachromosomal element functions, Prophage functions]	315
275181	TIGR04388	Lepto_longest	putative large structural protein. Members of this family are restricted so far to the lineage Leptospira, where they may be the longest protein encoded by the genome. Two or three paralogs are often found. The seed alignment for this model includes sequences with significant length variability, and stops adjacent to an intein feature most full-length members of this family share. Oddly, members closely related in sequence up to the start of the intein (see TIGR01445) usually show very little sequence similarity C-terminal to the end of the intein (see TIGR01443). [Unknown function, General]	1134
275182	TIGR04389	Lepto_lipo_1	lipoprotein, Leptospiral tandem type. Members of this family are lipoproteins restricted (so far) to the genus Leptospira, sometimes with several paralogs clustered with each other, such as four in a row (out of six) in Leptospira interrogans str. UI 13372. The tandem set may be co-clustered with a putative structural protein that is usually the longest encoded by the leptospiral genome (and that often is an intein-containing protein). [Cell envelope, Other]	201
275183	TIGR04390	OMP_YaiO_dom	outer membrane protein, YaiO family. Members of this family share a domain of bacterial outer membrane beta barrel, up to the protein C-terminal residue (usually Phe or Trp). The member YaiO was shown experimentally to be localized to the outer membrane. [Unknown function, General]	230
275184	TIGR04391	CcmD_alt_fam	CcmD family protein. Members of this protein family are small (typically less than 50 amino acids in length), with the first half highly hydrophobic like transmembrane alpha helices and containing a nearly invariant tyrosine residue. Members from the Desulfovibrionales appear in the position of ccmD of system I c-type cytochrome biogenesis operons (see pfam04995). This family and pfam04995 appear very similar in sequence properties, but the very low level of actual sequence identity makes it unclear whether the similarity reflects homology per se.	36
275185	TIGR04392	haoB_nitrify	hydroxylamine oxidation protein HaoB. Members of this family occur as the HaoB (hydroxylamine oxidation B) protein encoded next to the homotrimeric HaoA, hydroxylamine oxidoreductase, a protein with eight heme groups. It appears all species with this enzyme are nitrifying bacteria.	312
275186	TIGR04393	rpt_T5SS_PEPC	T5SS/PEP-CTERM-associated repeat. This model describes a repeat about 50 amino acids in length, appearing sometimes more than ten times in tandem in a single protein. Most proteins with this repeat have a C-terminal autotransporter domain (TIGR01414, pfam03797) and/or an N-terminal type V secretion system signal peptide (pfam13018), while others instead have a C-terminal PEP-CTERM domain (TIGR02595).	49
275187	TIGR04394	choline_CutC	choline trimethylamine-lyase. Members of this family, homologs to pyruvate formate-lyases and benzylsuccinate synthases, are glycine radical enzymes that appear to act as choline TMA-lyase, that is, to perform a C-N bond cleavage turning choline into trimethylamine (TMA) plus acetaldehyde. The gene symbol is cutC, for choline utilization. The activase, CutD, is a radical SAM enzyme. [Energy metabolism, Amino acids and amines]	789
275188	TIGR04395	cutC_activ_rSAM	choline TMA-lyase-activating enzyme. Members of this family are CutD, a radical enzyme that serves as an activase for choline TMA-lyase, CutC. CutC is a glycyl radical enzyme related to pyruvate formate-lyase, and this enzyme, CutD, is related to pyruvate formate-lyase activase. [Energy metabolism, Amino acids and amines]	309
275189	TIGR04396	surf_polysacc	surface carbohydrate biosynthesis protein. This model describes an uncharacterized homology region found broadly in proteins of surface carbohydrate biosynthesis regions. This family shows distant homology to regions of family TIGR04326, of spore coat polysaccharide biosynthesis protein SpsB from Bacillus subtilis, etc. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	321
275190	TIGR04397	SecA2_Bac_anthr	accessory Sec system translocase SecA2, Bacillus type. Members of this family always occur in genomes with the preprotein translocase SecA (TIGR00963) and closely resemble it, hence the designation SecA2. However, this appears to mark a different type of accessory Sec system SecA2 (TIGR03714) from the serine-rich glycoprotein type found in Staphylococcus and Streptococcus, and the actinobacterial SecA2 (TIGR04221). This type occurs in species including Bacillus anthracis, Geobacillus thermoglucosidasius, Solibacillus silvestris, etc. [Protein fate, Protein and peptide secretion and trafficking]	774
275191	TIGR04398	SLAP_DUP	SLAP domain. This domain is duplicated in SlaP (S-layer assembly protein), a partner of SecA2 in the Bacillus anthracis type of accessory Sec system (see TIGR04397). The domain is found, either once or twice, in additional Firmicutes species.	125
275192	TIGR04399	acc_Sec_SLAP	accessory Sec system S-layer assembly protein. Members of this family, designated S-layer assembly protein (SlaP), occur next to a Bacillus anthracis-type accessory Sec system SecA2. Members have two tandem copies of a duplicated domain (TIGR04398) that may also occur in other contexts. SlaP is found both free in the cytoplasm and membrane-associated. SecA2 and SlaP appear to work together to modify Sec for efficient S-layer secretion. [Protein fate, Protein and peptide secretion and trafficking]	288
275193	TIGR04400	RK_trnsloc_Pase	Arg-Lys translocation region protein phosphatase. The Sec-independent protein export system TAT, or twin-arginine translocation, is unusual in Leptospira, with Lys replacing Arg in the second position of the twin-Arg motif. This protein, restricted to Leptospira and showing distant homology to the phosphoserine phosphatases RsbU and SpoIIE, is always encoded immediately downstream of the tatC gene and appears to be part of the variant TAT system. It lacks a TAT signal itself, and so is more likely to be part of the Sec-independent translocation machinery than to be a substrate. The suggested symbol is rktP, for RK-Translocation Phosphatase. [Protein fate, Protein and peptide secretion and trafficking]	358
275194	TIGR04401	TAT_Cys_rich	twin-arginine translocation signal/Cys-rich four helix bundle protein. Members of this family average about 150 amino acids in length, beginning with a twin-arginine translocation signal sequence, then a His-rich spacer region, followed by a ~105-residue region in which thirteen positions are nearly invariant Cys residues. CDD (Conserved Domain Database) assigns members of this family to clan cl13994, the DUF326 superfamily, based on homology to PA2107 from Pseudomonas aeruginosa. PA2107 is a cysteine-rich four helical bundle protein, with solved structure PDB:3KAW.	150
275195	TIGR04402	mob_CxxC_CxxC	mobilome CxxCx(11)CxxC protein. Members of this family share twin CxxC motifs near the C-terminus, suggesting a DNA- or RNA-binding activity. The spacing between CxxC motifs is variable, from 11 to 16 amino acids. Members in general occur on plasmids or near other markers of lateral gene transfer (transposases, integrases, endonucleases, etc).	186
275196	TIGR04403	rSAM_skfB	sporulation killing factor system radical SAM maturase. Members of this family are a radical SAM enzyme of post-translational modification of ribosomally translated peptides. In Bacillus subtilis, the enzyme SkfB creates a sactipeptide (sulfur-to-alpha-carbon) crosslink of Cys-4 to Met-12 of the mature form of sporulation killing factor (SkfA). In Paenibacillus larvae subsp. larvae B-3650, the Met is replaced by Leu, so the modification must be different. SkfB has 2 4Fe-4S clusters, one in its radical SAM domain (pfam04055) and one in a region that somewhat resembles the SPASM domain (TIGR04085).	402
275197	TIGR04404	RiPP_SkfA	sporulation killing factor. Members of this family are ribosomally synthesized and post-translationally modified peptide natural products, modified by sulfur-to-alpha-carbon cross-link introduced by a radical SAM enzyme, SkfB (TIGR04403).	53
275198	TIGR04405	SkfF	sporulation killing factor system integral membrane protein. Members of this family, SkfF, occur only in cassettes of the sporulation killing factor system. This protein has multiple membrane-spanning domains and is encoded next to a protein with ATP-binding cassette (ABC) homology (SkfE), suggesting ABC transporter permease activity for this protein.	472
275199	TIGR04406	LPS_export_lptB	LPS export ABC transporter ATP-binding protein. Members of this family are LptB, the ATP-binding cassette protein of an ABC transporter involved in lipopolysaccharide export. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides, Transport and binding proteins, Other]	239
275200	TIGR04407	LptF_YjgP	LPS export ABC transporter permease LptF. Members of this family are LptF, one of two homologous, tandem-encoded permeases of an export ABC transporter for lipopolysaccharide (LPS) assembly in most Gram-negative bacteria. The other permease subunit is LptG (TIGR04408). [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides, Transport and binding proteins, Other]	346
275201	TIGR04408	LptG_lptG	LPS export ABC transporter permease LptG. Members of this family are LptG, one of two homologous, tandem-encoded permeases of an export ABC transporter for lipopolysaccharide (LPS) assembly in most Gram-negative bacteria. The other permease subunit is LptF (TIGR04407). [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides, Transport and binding proteins, Other]	351
275202	TIGR04409	LptC_YrbK	LPS export ABC transporter periplasmic protein LptC. Members of this family are LptC, a periplasmic protein tethered to the inner membrane in the lipopolysaccharide (LPS) exporter transenvelope complex (Lpt), which is responsible for conducting LPS to the outer leaflet of the outer membrane in most Gram-negative bacteria. LptC is homologous to LptA, another member of the transenvelope complex. Genes lptC and lptA are often adjacent. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides, Transport and binding proteins, Other]	180
275203	TIGR04410	Spiro_T2SS_lipo	type II secretion system-associated lipoprotein. Members of this family occur only in spirochetes (Leptospira, Leptonema, Turneriella), as part of a type II secretion system (T2SS) cassette. Properly extended gene models always include an N-terminal signal sequence ending with a Cys residue, suggesting this small protein (about 100 amino acids) is a lipoprotein.	117
275204	TIGR04411	T2SS_GspN_Lepto	type II secretion system protein N, Leptospira/Geobacter-type. Members of this family are the N (or GspN) protein of type II secretion systems (T2SS) as found in Leptospira, Geobacter, Myxococcus, and several other genera. Sequence similarity to GspN as found in, say, Gammaproteobacteria (see pfam01203) is extremely remote. [Protein fate, Protein and peptide secretion and trafficking]	279
275205	TIGR04412	T2SS_GspM_XcpZ	type II secretion system protein M, XcpZ-type. Members of this family are a variant form of the type II secretion system (T2SS) protein M, GspM, as found in several species of Pseudomonas. Members, including XcpZ, are short relatives to most members of pfam04612 (as of release 26.0).	126
275206	TIGR04413	MYXAN_cmx8	CRISPR type MYXAN-associated protein Cmx8. Members of this family occur only in CRISPR/cas loci of the MYXAN type, but are present in a minority of such systems. This protein appears to replace the MYXAN system Cas8a1/Csx13 protein (TIGR03485/TIGR03486), compared to which it shows similar length and composition but only about 12 percent sequence identity.	528
275207	TIGR04414	hepto_Aah_TibC	autotransporter strand-loop-strand O-heptosyltransferase. Both Aah (autotransporter adhesin heptosyltransferase) and TibC (tib is enterotoxigenic invasion locus B) are protein O-heptosyltransferases that act on multiple sites from repeat regions of proteins exported by autotransporters. Aah glycosylates AIDA, the autotransporter adhesin involved in diffuse adherence; TibC acts on TibA, but TibC can replace Aah. [Protein fate, Protein modification and repair]	374
275208	TIGR04415	O_hepto_targRPT	autotransporter passenger strand-loop-strand repeat. This model describes two tandem copies of a strand-loop-strand repeat that occurs often in type V secretion system (T5SS). These repeats usually occur in the passenger domain of the classical monomeric autotransporter. Proteins with this repeat often are encoded next to a member of family TIGR04414, the Aah/TibC family O-heptosyltransferase, and may be glycosylated in regions with this repeat.	38
275209	TIGR04416	group_II_RT_mat	group II intron reverse transcriptase/maturase. Members of this protein family are multifunctional proteins encoded in most examples of bacterial group II introns. These group II introns are mobile selfish genetic elements, often with multiple highly identical copies per genome. Member proteins have an N-terminal reverse transcriptase (RNA-directed DNA polymerase) domain (pfam00078) followed by an RNA-binding maturase domain (pfam08388). Some members of this family may have an additional C-terminal DNA endonuclease domain that this model does not cover. A region of the group II intron ribozyme structure should be detectable nearby on the genome by Rfam model RF00029. [Mobile and extrachromosomal element functions, Other]	354
275210	TIGR04417	PFTS_polysacc	polysaccharide biosynthesis PFTS motif protein. Members of this protein family are found in O-antigen biosynthesis loci in Leptospira, two tandem homologs in a polysaccharide biosynthesis region in the archaeon Methanoregula formicicum, in Rhizobium leguminosarum bv. trifolii WSM2297, etc. Members are more strongly conserved in the C-terminal region, where an invariant sequence PFTS is found. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	519
275211	TIGR04418	PriB_gamma	primosomal replication protein PriB. Members of this protein family are primosomal replication protein N (PriB), virtually always encoded between rpsF (ribosomal protein S6) and rpsR (ribosomal protein S18). Note that only this short form, as found primarily in the gamma-Proteobacteria, of the single-stranded DNA binding protein family (see model TIGR00621) may partner with PriA and be involved in priming for re-initiation of replication. [DNA metabolism, DNA replication, recombination, and repair]	96
275212	TIGR04419	no_iron_rSAM	HemN-related non-iron pseudo-SAM protein PsgB. Members of this protein family are related to radical SAM enzymes HemN (oxygen-independent coproporphyrinogen III oxidase) and HutW (a putative heme utilization enzyme) but lack the signature CxxxCxxC motif for 4Fe-4S binding. Members occur exclusively in Borrelia, which appears to live without iron, as the only radical SAM enzyme homolog in any Borrelia genome. We designate this enzyme PsgB (Pseudo-SAM, Genus Borrelia).	378
275213	TIGR04420	Sec_Non_Glob	Sec region non-globular protein. Members of this family occur only in the genus Leptospira, always encoded between genes for the YajC and SecD components of the Sec preprotein translocase. Sequences have an N-terminal signal peptide and a C-terminal transmembrane segment. Between these are regions of non-globular, low-complexity sequence including Lys-rich and Ser/Thr/Asn/Glu-rich regions.	241
275214	TIGR04421	FemAB_IMCC1989	FemAB family protein. Members of this family are FemA/FemB family proteins from a cassette that some Leptospira have in their O-antigen biosynthesis regions and share with a cassette from gamma proteobacterium IMCC1989. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	345
275215	TIGR04422	PLP_IMCC1989	putative PLP-dependent aminotransferase. Members of this family are PLP-dependent enzymes, probable aminotransferases of the DegT/DnrJ/EryC1/StrS family. Members occur in the O-antigen biosynthesis regions of some Leptospira and in a related cassette in gamma proteobacterium IMCC1989. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	315
275216	TIGR04423	casT3_TIGR04423	CRISPR type III-associated protein, TIGR04423 family. Members of this protein family occur only in species with CRISPR systems, in the context of type III systems that resemble type III-A (MTUBE) and III-B (the RAMP module). It occurs in several species of Prevotella, Helicobacter, Campylobacter, and Bacteroides.	124
275217	TIGR04424	metallo_McbB	McbB family protein. This family includes the partially characterized zinc metalloprotein McbB, part of the maturase system for microcin B17, a thiazole/oxazole-modified microcin (TOMM). Other members of this family belong to very different gene neighborhoods. The Cys residues that act as zinc ligands are conserved in most members, but for rare members are jointly absent.	286
275218	TIGR04425	P_lya_rel_AroB	putative sugar phosphate phospholyase (cyclizing). Members of this family tend to be found in O-antigen biosynthesis regions. Members frequently are misidentified as the closely related 3-dehydroquinate synthase, AroB (see family TIGR01357), the phospholyase that converts 3-deoxy-D-arabino-hept-2-ulosonate 7-phosphate to 3-dehydroquinate during chorismate biosynthesis. Most bacteria with this enzyme have a true AroB in a chorismate biosynthesis gene cluster. [Biosynthesis of cofactors, prosthetic groups, and carriers, Lipoate]	348
275219	TIGR04426	rSAM_desII	TDP-4-amino-4,6-dideoxy-D-glucose deaminase. Members of this protein family, including DesII, are radical SAM enzymes that deaminate TDP-4-amino-4,6-dideoxy-D-glucose to TDP-3-keto-4,6-dideoxy-D-glucose. This is the fourth step of the six step pathway in Streptomyces venezuelae for synthesizing D-desosamine, or 3-(dimethylamino)-3,4,6-trideoxyglucose, a precursor for many macrolide antibiotics.	468
275220	TIGR04427	PLP_DesI	dTDP-4-dehydro-6-deoxyglucose aminotransferase. Members of this family are pyridoxal phosphate-dependent aminotransferases that convert TDP-4-keto-6-deoxy-D-glucose to the 4-amino sugar form, TDP-4-amino-4,6-dideoxy-D-glucose. In Streptomyces venezuelae, this enzyme is designated DesI, catalyzing the third of six steps in the biosynthesis of TDP-D-desosamine, a component of a number of different macrolide antibiotics made by that organism. Related proteins, scoring below the trusted cutoff, include sugar aminotransferases in O-antigen biosynthesis regions.	390
275221	TIGR04428	B12_rSAM_trp_MT	tryptophan 2-C-methyltransferase. Members of this family are the B12-binding domain/radical SAM domain enzyme tryptophan methyltransferase, named TsrT in the cassette for thiostrepton biosynthesis. Thiostrepton and related thiopeptides are synthesized by extensive modification of a ribosomally translated product, but this enzyme is involved in a pathway that converts a free Trp residue to a quinaldic acid moiety before it is appended. [Cellular processes, Toxin production and resistance]	545
275222	TIGR04429	Phr_nterm	Phr family secreted Rap phosphatase inhibitor. Phr peptides are short peptides, best conserved in their amino-terminal regions, that are almost always encoded immediately downstream of a Rap phosphatase. A portion of the Phr peptide is secreted, enters another cell, and forms a quorum-sensing system by inhibiting its Rap phosphatase partner. The set of Phr peptides recognized by this N-terminal region model is disjoint from the PhrC/PhrF set recognized by pfam11131. [Regulatory functions, Protein interactions]	28
275223	TIGR04430	OM_asym_MlaD	outer membrane lipid asymmetry maintenance protein MlaD. Members of this protein family are the MlaD (maintenance of Lipid Asymmetry D) protein of an ABC transport system that seems to remove phospholipid from the outer leaflet of the Gram-negative bacterial outer membrane (OM), leaving only lipopolysaccharide in the outer leaflet. The Mla locus has long been associated with toluene tolerance, consistent with the proposed role in retrograde transport of phospholipid and therefore with maintaining the integrity of the OM as a protective barrier.	146
275224	TIGR04431	N6_acetyl_AAC6	aminoglycoside N(6')-acetyltransferase, AacA4 family. Members of this family are the aacA4 type of aminoglycoside N(6')-acetyltransferase (EC 2.3.1.82), an enzyme that modifies and inactivates aminoglycoside antibiotics such as kanamycin, neomycin, tobramycin, and amikacin. Members are regularly spread among pathogens into integron, transposon, and plasmid loci, with recombination often happening within the protein-coding region. Most of the region amino-terminal to the recombination site or sites was removed from this model. [Cellular processes, Toxin production and resistance]	184
275225	TIGR04432	rSAM_Cfr	23S rRNA (adenine(2503)-C(8))-methyltransferase Cfr. This model identifies Cfr, a 23S rRNA methyltransferase, EC 2.1.1.224, responsible for a transmissible form of chloramphenicol/florfenicol resistance. It is closely related to RlmN (see TIGR00048), an adenine C2-methyltransferase that acts at the same site where Cfr acts as a C8-methyltransferase. [Protein synthesis, tRNA and rRNA base modification]	341
275226	TIGR04433	UrcA_uranyl	UrcA family protein. Members of this family feature an N-terminal signal sequence, small size, and two invariant Cys residues, 10-20 residues apart. One member of this family, UrcA from the aerobic bacterium Caulobacter crescentus, is expressed so highly in response to uranium, but not other heavy metals, that a genetically engineered strain expressing green fluorescent protein in place of UrcA serves as a biodetector for micromolar uranyl ion. Caulobacter crescentus tolerates high levels of U(VI) exposure by mineralizing the uranium. UrcA and its homologs may participate in such a process.	91
275227	TIGR04434	rSAM_Pput_1520	B12-binding domain/radical SAM domain protein, Pput_1520 family. Members of this family are radical SAM domain (pfam04055) enzymes with an N-terminal B12-binding domain (pfam02310), as is fairly common for radical SAM enzymes with lipid substrates. However, both domains as found in this family seem to be long-branch. The function is unknown, but in all cases a PLP-dependent enzyme (a cysteine desulfurase homolog) is found nearby.	515
275228	TIGR04435	restrict_AAA_1	restriction system-associated AAA family ATPase. Members of this family are AAA family ATPases by homology. They occur regularly in a conserved gene neighborhood with the restriction (R), modification (M), and specificity (S) proteins of an apparent type I restriction enzyme system, plus one additional uncharacterized protein. It is not clear whether members of this family contribute to restriction per se, or to another process such as transfer of mobile elements.	555
275229	TIGR04436	SpoChoClust_3	putative 2OG-Fe(II) oxygenase. This family, related to streptomycin biosynthesis protein StrG and to phytanoyl-CoA dioxygenase, belongs to the 2-oxoglutarate and Fe(II)-dependent oxygenase clan, which includes not just dioxygenases such as phytanoyl-CoA dioxygenase PhyH, but also some chlorinating enzymes involved in natural product biosynthesis. Members of this family occur so far only in O-antigen biosynthesis regions of select Leptospira, and include an ~80 residue additional C-terminal region shared by no other proteins.	300
275230	TIGR04437	WaaZ_KDO_III	Kdo-III transferase WaaZ. Members of this family are WaaZ, or Kdo-III transferase. This enzyme, present in some strains of E. coli and its allies but not others, performs a non-stoichiometric addition of a third 3-deoxy-D-manno-oct-2-ulosonic acid (KDO-III) onto some fraction of KDO-II in the lipopolysaccharide (LPS) inner core. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	252
275231	TIGR04438	small_Trp_rich	small Trp-rich protein. Members of this bacterial protein family average 80 residues in length, and average nearly 6 Trp residues (two of which are invariant) in the first 45, which are strongly hydrophobic. Past this region, the protein is highly charged, with large numbers of Lys, Arg, Asp, and Glu residues. Members usually are divergently transcribed from a gene encoding a c-type cytochrome.	76
275232	TIGR04439	histamin_N_OH	putative histamine N-monooxygenase. Members of this family are involved in synthesizing N-hydroxyhistamine as a precursor to acinetobactin, a siderophore found in Acinetobacter baumannii. Assuming histidine is first decarboxylated to histamine, then hydroxylated, members of this family are histamine N-monooxygenase. The putative histidine decarboxylase is found in the same biosynthetic cluster.	431
275233	TIGR04440	glyco_TIGR04440	glycosyltransferase domain. This model describes a putative glycosyltransferase domain, related to the group 2 family glycosyltransferases of pfam00535.	215
275234	TIGR04441	lept_O_ant_chp1	surface carbohydrate biosynthesis protein. Members of this protein family occur only in a subset of Leptospira species, and in those species occur only in O-antigen biosynthesis regions. Members average about 375 amino acids in length. The function is unknown. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	373
275235	TIGR04442	TIGR04442	TIGR04442 family protein. Members of this family occur exclusively in certain Deltaproteobacteria, including Geobacter and Pelobacter. The function is unknown.	608
275236	TIGR04443	F420_CofF	coenzyme gamma-F420-2:alpha-L-glutamate ligase. Members of this family are CofF, a RimK-related glutamate ligase that caps the gamma-glutamyl tail of the archaeal form of coenzyme F420, a hydride carrier. This enzyme does not appear in bacterial species, such as Mycobacterium tuberculosis, that make F420 with a different type of tail. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	283
275237	TIGR04444	chori_FkbO_Hyg5	chorismatase, FkbO/Hyg5 family. Members of this family include two chemically distinct enzymes that cleave pyruvate from chorismate. FkbO and RapK convert chorismate to pyruvate plus (4R,5R)-4,5-dihydroxycyclohexa-1,5-dienecarboxylic acid (DCDC). Hyg5 and Bra8 convert chorismate to pyruvate, water, and 3-hydroxybenzoate. These two enzymes are closely related, and lack homology to chorismate lyase, EC 4.1.3.40, which converts chorismate to pyruvate plus 4-hydroxybenzoate.	311
275238	TIGR04445	preny_LynF_TruF	peptide O-prenyltransferase, LynF/TruF/PatF family. Members of this family are prenyltransferases that modify mostly, perhaps exclusively, ribosomally derived cyclic peptides. Note that in some cases, the natural product becomes C-prenylated through a non-enzymatic rearrangement after initial O-prenylation. Prenylated products carry names such as prenylagaramide, anacyclamide, and piricyclamide.	282
275239	TIGR04446	pren_cyc_PirE	prenylated cyclic peptide, anacyclamide/piricyclamide family. Members of this protein family occur primarily in Cyanobacteria. They average about 50 residues in length and are the ribosomally translated precursors of peptide natural products whose modifications include cleavage, cyclization, and prenylation. Sequences are well-conserved in the N-terminal region. They are nearly invariant over the last eight residues, but hypervariable just before that stretch. A related family, often in a similar genome context, is TIGR03678.	48
275240	TIGR04447	PatC_TenC_TruC	cyanobactin cluster PatC/TenC/TruC protein. Members of this family usually are small proteins (PatC, TenC, TruC) of unknown function in cyanobactin (prenylated cyclic peptide) biosynthesis clusters, where a different small protein is a known cyanobactin precursor (patellamide, anacyclamide, piricyclamide, etc). They may instead be the C-terminal domain of a longer protein that otherwise consists mostly of lectin-like or VWF type A domains, in similar context. Similar to the cyanobactin precursors, members of this family have two very strongly conserved regions separated by a hypervariable region, suggesting these proteins may undergo a similar maturation.	34
275241	TIGR04448	creatininase	creatininase. Members of this family are creatininase (EC 3.5.2.10), an amidohydrolase that interconverts creatinine + H(2)O with creatine. It should not be confused with creatinase (EC 3.5.3.3), which hydrolyzes creatine to sarcosine plus urea. [Central intermediary metabolism, Nitrogen metabolism]	246
275242	TIGR04449	halocin_C8_dom	halocin C8-like bacteriocin domain. This model describes the 76-amino acid C-terminal domain of the halocin C8 precursor that actually becomes the mature bacteriocin halocin C8 after export and cleavage, as well as homologous C-terminal regions from many other archaea. Surprisingly, this Cys-rich region occurs also in many strains of Staphylococcus epidermidis. Gene regions do not provide evidence for post-translational modification other than cleavage. Halocin C8 is active against a broad range of archaea; the region N-terminal to the bacteriocin domain modeled here serves as the immunity protein for the cell secreting the bacteriocin. [Cellular processes, Toxin production and resistance]	69
275243	TIGR04450	Gpos_C8_like	putative immunity protein/bacteriocin. This model describes full-length proteins of Gram-positive bacteria that consist of an N-terminal signal peptide, a central region of unknown function, and a Cys-rich C-terminal region. In both the overall architecture and the apparent weak homology of the C-terminal region itself, these proteins resemble archaeal proteins such as the halocin C8 precursor. In that precursor, the C-terminal region is a bacteriocin but the N-terminal region functions as the immunity protein. The related family of halocin C8-like bacteriocins and their bacterial homologs are recognized by model TIGR04449.	234
275244	TIGR04451	lanti_SCO0268	lantipeptide, SCO0268 family. Members of this family are putative lantipeptide (most likely lantibiotic) precursors, about 53 amino acids long, found in a lantibiotic-type biosynthetic cluster in several species of Streptomyces (S. coelicolor, S. griseoflavus, S. ambofaciens, etc.). This family is described in AntiSMASH as Streptomyces PEQAXS motif lantipeptide precursor.	53
275245	TIGR04452	Lepto_Lipo_YY_C	small lipoprotein, LA_3946 family. Members of this family are small lipoproteins that occur in at least nineteen species of Leptospira, but not so far anywhere outside that genus. Notable features include the putative lipoprotein modification Cys and an additional Cys near the C-terminus, both invariant, plus a well-conserved (although not invariant) Tyr-Tyr pair. From one to four paralogs occur in most Leptospira species. Members include LA_3946.	115
275246	TIGR04453	Lepto_8Cys	Cys-rich protein, LA_1312 family. Members of this protein family occur, so far, exclusively in the genus Leptospira. Members, although small (about 90 amino acids), have a predicted signal peptide followed by a region with eight invariant Cys residues. Some members have an additional Cys in the signal peptide region that suggests handling as a lipoprotein, but this Cys is not well conserved.	85
275247	TIGR04454	Lepto_4Cys	small lipoprotein, LB_250 family. Members of this family average about 92 amino acids in length, including an apparent lipoprotein signal peptide and a mature portion with four additional invariant Cys residues. This family is universal, so far, across at least twenty species of Leptospira but unknown outside the genus.	92
275248	TIGR04455	lipo_MXAN_6521	probable lipoprotein, LA_1396/MXAN_6521 family. Members of this family are predicted lipoproteins, near 200 residues in length, found in multiple species of Leptospira (a spirochete) but also in the Myxococcales branch of the Deltaproteobacteria (Myxococcus xanthus, Chondromyces apiculatus, Cystobacter fuscus, Corallococcus coralloides), an uncommon mix of species. In both lineages, the N-terminal region appears to be a lipoprotein signal peptide.	191
275249	TIGR04456	LruC_dom	LruC domain. This domain is abundant in the Leptospira, in Bacteroides, and in Vibrio (three widely separated lineages). Most members have plausible lipoprotein signal peptides, including lipoprotein LruC from Leptospira interrogans and BACOVA_00967, from Bacteroides ovatus, with a solved crystal structure. Note that the C-terminal region of pfam13448 (length 83) matches the N-terminal region of some members of this domain (length 243).	242
275250	TIGR04457	lipo_LenA_lepto	lipoprotein LenA. Members of this family are LenA (Leptospira Endostatin-like protein A), found in pathogenic and intermediate species of Leptospira but not in saprophytes. LenA binds plasminogen, laminin, and complement regulator factor H. Behavior during outer membrane solubilization by low concentrations of Triton X-114 and conservation in all family members of an apparent lipoprotein signal sequence, with invariant Cys residue, strongly suggest that LenA is a lipoprotein. Just outside this family is a full-length homolog found in another spirochete, Turneriella parva DSM 21527.	224
275251	TIGR04458	CYP450_TxtE	4-nitrotryptophan synthase. Members of this family are cytochrome P450 enzymes that convert L-tryptophan into L-4-nitrotryptophan. In thaxtomin gene clusters, this enzyme (TxtE) uses nitric oxide (NO) derived from arginine by the nitric oxide synthase TxtD, and O2, to perform the tryptophan nitration. L-4-nitrotryptophan is then used as a non-proteinogenic amino acid by non-ribosomal peptide synthases (NRPS).	403
275252	TIGR04459	TisB_tox	type I toxin-antitoxin system toxin TisB. Members of this family are the TisB toxin protein of a type I toxin-antitoxin system, meaning that the antitoxin is a non-coding RNA. TisB is induced by some types of stress (SOS, ciprofloxacin) and appears to induce a persister state by dissipating transmembrane potential, thus depleting ATP. Persister cells, unable to grow until the persister state changes, survive a number of challenges, including exposure to most antibiotics. [Cellular processes, Adaptations to atypical conditions]	29
275253	TIGR04460	endura_MppR	enduracididine biosynthesis enzyme MppR. Members of this family are MppR, one of three enzymes involved in synthesizing enduracididine, a non-proteinogenic amino acid used in non-ribosomal peptide synthases to make natural products such as enduracidin from Streptomyces fungicidicus ATCC 21013. MppR belongs to the acetoacetate decarboxylase-like superfamily. MppR catalyzes an aldol condensation and a dehydration, not a decarboxylation.	257
275254	TIGR04461	endura_MppQ	enduracididine biosynthesis enzyme MppQ. Members of this family are MppQ, one of three enzymes involved in synthesizing enduracididine, a non-proteinogenic amino acid used in non-ribosomal peptide synthases to make natural products such as enduracidin from Streptomyces fungicidicus ATCC 21013. MppQ is a PLP-dependent enzyme, predicted by homology to be an aminotransferase.	392
275255	TIGR04462	endura_MppP	enduracididine biosynthesis enzyme MppP. Members of this family are MppP, one of three enzymes involved in synthesizing enduracididine, a non-proteinogenic amino acid used in non-ribosomal peptide synthases to make natural products such as enduracidin from Streptomyces fungicidicus ATCC 21013. MppP is a PLP-dependent enzyme, predicted by homology to be an aminotransferase.	289
275256	TIGR04463	rSAM_vs_C_rich	radical SAM/SPASM domain protein maturase. Members of this family are probable protein/peptide-modifying radical SAM/SPASM domain proteins. The majority of members of this family seem to target Cys-rich repetitive regions of large proteins rather than of bacteriocin-sized small precursors. This arrangement suggests the modification target may be multifunctional, with the C-terminal domain behaving like a bacteriocin but other parts of the same precursor serving an immunity function, as occurs for the halocin C8 precursor.	439
275257	TIGR04464	chaper_lep	LipL41-expression chaperone Lep. Members of this protein family are Lep, an outer membrane lipoprotein LipL41-binding protein that appears to function as a chaperone important to its expression. LipL41 is the third most abundant lipoprotein in the pathogen Leptospira interrogans, but is found in saprophytic Leptospira species as well and is not essential for virulence.	114
275258	TIGR04465	ArgArg_F420	TAT-translocated F420-dependent dehydrogenase, FGD2 family. Members of this family are F420-binding enzymes with a proven functional N-terminal twin-arginine translocation (TAT) signal. Members are homologous to the cytosolic F420-dependent glucose-6-phosphate dehydrogenase but do not share the same function.	364
275259	TIGR04466	rSAM_BlsE	cytosylglucuronate decarboxylase. BlsE, part of the blasticidin S biosynthetic pathway, is a radical SAM enzyme that performs a decarboxylation at C5 of the glucoside residue. MilG in mildiomycin biosynthesis is equivalent. This enzyme follows CGA synthase and makes the pyranoside core moiety of a class of peptidyl nucleoside antibiotics. [Cellular processes, Toxin production and resistance]	327
275260	TIGR04467	CGA_synthase	hydroxymethylcytosylglucuronate/cytosylglucuronate synthase. Members of this family synthesize cytosylglucuronate (or hydroxymethylcytosylglucuronate) from UDP-glucuronate and free cytosine (or hydroxymethylcytosine). This reaction is followed by a decarboxylation at C5 of the glucoside residue. The net reaction makes the pyranoside core moiety of a class of peptidyl nucleoside antibiotics.	386
275261	TIGR04468	arg_2_3_am_muta	arginine 2,3-aminomutase. Members of this family are arginine 2,3-aminomutase, a radical SAM enzyme more closely related to lysine 2,3-aminomutase than to glutamate 2,3-aminomutase. The enzyme makes L-beta-arginine, sometimes in the context of antibiotic biosynthesis (blasticidin S, mildiomycin, etc). Activity is proven in Streptomyces griseochromogenes, which makes blasticidin S.	351
275262	TIGR04469	CGA_synth_rel	CGA synthase-related protein. Members of this family are related to cytosylglucuronate (CGA) synthase (TIGR04467), and found in the same clusters as CGA synthase and CGA decarboxylase. These clusters produce peptidyl nucleoside antibiotics with a pyranoside core moiety, found in a number of Streptomyces species. Removal of the S. griseochromogenes member of this family, BlsF, from a heterologous expression system caused an increase, not blockage, of blasticidin S. [Cellular processes, Toxin production and resistance]	297
275263	TIGR04470	rSAM_mob_pairB	radical SAM mobile pair protein B. Members of this family are the downstream member (B) of a pair of tandem-encoded radical SAM enzymes. Most of these radical SAM gene pairs have an additional upstream regulatory gene in the MarR family. Examples of high sequence identity (over 96 percent) from cassettes in several Treponema species of the oral cavity to those in multiple Firmicutes in the gut microbiome suggest recent lateral gene transfer, as might be expected for antibiotic resistance genes. The function is unknown.	285
275264	TIGR04471	rSAM_mob_pairA	radical SAM mobile pair protein A. Members of this family are the upstream member (A) of a pair of tandem-encoded radical SAM enzymes. Most of these radical SAM gene pairs have an additional upstream regulatory gene in the MarR family. Examples of high sequence identity (over 96 percent) from cassettes in several Treponema species of the oral cavity to those in multiple Firmicutes in the gut microbiome suggest recent lateral gene transfer, as might be expected for antibiotic resistance genes. The function is unknown.	220
275265	TIGR04472	reg_rSAM_mob	mobile rSAM pair MarR family regulator. A number of human microbiome species from both the Firmicutes and the spirochete genus Treponema share a gene cassette encoding a pair of radical SAM enzymes and, in most cases, this MarR family transcriptional regulator as well. Sequence identity can exceed 96 percent, suggesting recent lateral transfer. These observations suggest the cassette confers resistance to a toxic compound such as an antibiotic.	141
275266	TIGR04473	taxol_Phe_23mut	phenylalanine aminomutase (L-beta-phenylalanine forming). Members of this family are the phenylalanine aminomutase known from taxol biosynthesis. This enzyme has the MIO prosthetic group (4-methylideneimidazole-5-one), derived from an Ala-Ser-Gly motif. Other MIO enzymes include Phe, Tyr, and His ammonia-lyases. This model serves as an exception to overrule assignments by equivalog model TIGR01226 for phenylalanine ammonia-lyase.	687
275267	TIGR04474	tcm_partner	three-Cys-motif partner protein. Members of this family occur regularly as a partner to a member of family pfam07505, which has been called a phage protein but which seems to occur also in other contexts. Members average about 400 residues in length, but the conserved region covered by the model averages 260 residues and excludes the C-terminus. Conserved motifs suggest enzymatic activity. Note that its frequent partner protein (see pfam07505) has a three-cysteine motif that resembles the Cx3CxxC motif of radical SAM proteins, and that in one branch (see TIGR04471) actually becomes Cx3CxxC. We suggest the name three-Cys-motif partner protein (tcmP), and renaming pfam07505 to three-Cys-motif family protein.	262
275268	TIGR04475	Phe_D_beta_mut	phenylalanine aminomutase (D-beta-phenylalanine forming). Members of this family have the MIO prosthetic group (4-methylideneimidazole-5-one), derived from an Ala-Ser-Gly motif. Other MIO enzymes include Phe, Tyr, and His ammonia-lyases. The member from Pantoea agglomerans, and probably all members, convert Phe to D-beta-phenylalanine (EC 5.4.3.11). By contrast, members of family TIGR04473 convert Phe to L-beta-phenylalanine (5.4.3.10), as in taxol biosynthesis.	510
275269	TIGR04476	exosort_XrtN	exosortase N. Members of this family are exosortase N (xrtN), a bacterial exosortase variant whose single target, encoded always by an adjacent gene, belongs to the vault protein inter-alpha-trypsin family. This system occurs in a number of spirochete (Leptospira) and Bacteroidetes species.	403
275270	TIGR04477	sorted_by_XrtN	XrtN system VIT domain protein. Members of this subfamily average about 850 amino acids in length, ending with a variant form of PEP-CTERM sorting signal. Members have a VIT (vault protein inter-alpha-trypsin inhibitor heavy chain) domain (pfam08487). Other bacterial subfamilies of VIT domain proteins have members with either GlyGly-CTERM or LPXTG C-terminal sorting signals. Members of this subfamily occur only in context next to a protein sorting/processing enzyme, exosortase N (XrtN). These subsystems occur both among the Bacteroidetes and in the spirochete genus Leptospira.	822
275271	TIGR04478	rSAM_YfkAB	radical SAM/CxCxxxxC motif protein YfkAB. Members of this highly conserved family in some Firmicutes have an N-terminal radical SAM domain (pfam04055) and a C-terminal pfam08756 domain with a CxCxxxxC motif that suggests binding to an additional metallocluster. It appears all correct sequences in this family are about 370 amino acids in length, containing the YfkA and YfkB regions originally reported as separate ORFs in Bacillus subtilis. Partial Phylogenetic Profiling shows occurrences almost exclusively in the Bacilli, with very few examples of either lateral transfer or gene loss. The essentially monophyletic distribution suggests a housekeeping function. Members have no well-conserved gene neighborhood. The function is unknown. [Unknown function, Enzymes of unknown specificity]	363
275272	TIGR04479	bcpD_PhpK_rSAM	radical SAM P-methyltransferase, PhpK family. Characterized members of this family are B12-binding domain/radical SAM domain enzymes that use methylcobalamin as a methyl donor to methylate a phosphorus atom during the biosynthesis of natural products such as bialaphos and phosalacine. These syntheses create an extremely rare C-P-C bond. All members of the seed alignment derive from genomic regions that include a non-ribosomal peptide synthase. Note that a single organism, Pelosinus fermentans JBW45 from Cr(VI)-contaminated groundwater, has eight additional homologs of unknown function that score between trusted and noise cutoffs of this model. [Cellular processes, Toxin production and resistance]	504
275273	TIGR04480	D_pro_red_PrdA	D-proline reductase (dithiol), PrdA proprotein. Members of this family are the PrdA proprotein. The polypeptide undergoes an autocatalytic cleavage that creates two subunits, one with a Cys-derived pyruvoyl moiety critical for activity. The D-proline reductase complex also contains a subunit derived from the prdB gene. The complex acts on D-proline, which may be supplied from L-proline by an active proline racemase encoded nearby.	587
275274	TIGR04481	PR_assoc_PrdC	proline reductase-associated electron transfer protein PrdC. Members of this family are encoded near the prdA and prdB genes for proteins of the proline reductase complex, are induced by proline, and are designated PrdC by Bouillaut, et al. Some members are selenoproteins (at two different positions), as is PrdB. Members are homologous to, but distinct from, electron transport protein RnfC.	425
275275	TIGR04482	D_pro_red_PrdD	proline reductase cluster protein PrdD. Members of this family are PrdD, encoded in the proline reductase gene cluster. Members are closely homologous to PrdA, which cleaves during maturation to create two of the subunits of the proline reductase complex, one of which has a Cys-derived pyruvoyl active site.	246
275276	TIGR04483	D_pro_red_PrdB	D-proline reductase (dithiol), PrdB protein. Members of this family form the PrdB subunit, usually a selenoprotein, in the D-proline reductase complex. The usual pathway is conversion of L-proline to D-proline by a racemase, then use of D-proline as an electron acceptor coupled to ATP generation under anaerobic conditions.	238
275277	TIGR04484	thiosulf_SoxA	sulfur oxidation c-type cytochrome SoxA. Members of this family are SoxA, a c-type cytochrome with a CxxCH motif, part of a heterodimer with SoxX. SoxXA, SoxYZ, and SoxB contribute to thiosulfate oxidation to sulfate.	211
275278	TIGR04485	thiosulf_SoxX	sulfur oxidation c-type cytochrome SoxX. Members of this family are SoxX, a c-type cytochrome with a CxxCH motif, part of a heterodimer with SoxA. SoxXA, SoxYZ, and SoxB contribute to thiosulfate oxidation to sulfate.	78
275279	TIGR04486	thiosulf_SoxB	thiosulfohydrolase SoxB. SoxB, a di-manganese(II) enzyme, works with SoxYZ and the c-type cytochrome SoxXA in oxidation of thiosulfate to sulfate.	556
275280	TIGR04487	SoxY_para_1	SoxY-related AACIE arm protein. Members of this family are paralogs to the authentic thiosulfate oxidation system protein SoxY. True SoxY end with the sequence GGCG(G), the swinging arm in which the Cys residue covalently binds the inorganic sulfur moiety. In this family, members end with a different swinging arm sequence, [AS]AC[IVT]E. The few species with a member of this family always have authentic SoxY.	145
275281	TIGR04488	SoxY_true_GGCGG	thiosulfate oxidation carrier protein SoxY. Members of this family are bona fide SoxY, the sulfate carrier protein with the GGCGG or GGCG swinging arm C-terminal sequence. The Cys in the swinging arm is the residue to which the inorganic sulfate moiety becomes covalently attached. In some species, a closely related paralog occurs as well (TIGR04487), with a swinging arm sequence AACIE. More distantly related forms have an additional C-terminal SoxZ-related domain. All members are periplasmic and have an N-terminal twin-arginine translocation (TAT) signal sequence.	148
275282	TIGR04489	exosort_XrtO	exosortase O. Members of this family are a variant form of exosortase, XrtO, with a dedicated target typically encoded by the adjacent gene. Members have a unique C-terminal extension very different from EpsI (TIGR02914), the extension that many exosortases have. The targets of XrtO all are members of family TIGR02921, which describes a PEP-CTERM protein about 950 residues long, found in more than 15 genera so far. These PEP-CTERM proteins are unusually hydrophobic in stretches, suggesting an integral membrane location, which is unusual. About one third of the members of TIGR02921 are in genomes with this protein, exosortase O, always encoded by an adjacent gene. Genomes include Synechocystis sp. PCC 7509, Xenococcus sp. PCC 7305, Pleurocapsa sp. PCC 7327, Microcoleus vaginatus, Hahella chejuensis, Vibrio azureus NBRC 104587, etc.	450
275283	TIGR04490	SoxZ_true	thiosulfate oxidation carrier complex protein SoxZ. SoxZ forms a heterodimer with SoxY, the subunit that forms a covalent bond with a sulfur moiety during thiosulfate oxidation to sulfate. Note that virtually all proteins that have a SoxY domain fused to a SoxZ domain are functionally distinct and not involved in thiosulfate oxidation.	95
275284	TIGR04491	reactive_PduG	diol dehydratase reactivase alpha subunit. Members of this family are the alpha (large) subunit of the alpha-2/beta-2 tetrameric enzyme that reactivates B12-dependent trimeric diol dehydratases (1,2-propanediol dehydratase, glycerol dehydratase). Note that the beta subunit of the reactivase is homologous to the beta (medium) subunit of the diol dehydratase. The reactivase catalyzes the exchange of chemically inactivated B-12 for active B12. This model excludes homologs from Mycobacterium (e.g. M. smegmatis), where the several paralogous forms of the dehydratase occur and are exceptional also by not being found in a carboxysome-like microcompartment.	598
275285	TIGR04492	VioB	iminophenyl-pyruvate dimer synthase VioB. Following the action of a flavin-dependent L-amino acid oxidase that converts L-tryptophan to indole-3-pyruvate imine, the enzymes VioB (this family), RebD, and StaD can ligate two molecules, forming a coupled iminophenyl-pyruvate dimer. In the violacein biosynthesis pathway, this compound is acted on by VioE before it cyclizes spontaneously to chromopyrrolic acid. In the pathways of homolog StaD (staurosporine), chromopyrrolate is formed, and the enzyme is referred to as chromopyrrolate synthase. RebD is very similar to StaD, but acts on chlorinated Trp-derived molecules. [Cellular processes, Biosynthesis of natural products]	1003
275286	TIGR04493	microcomp_PduM	microcompartment protein PduM. Members of this family are PduM, a protein essential for forming functional microcompartments in which a trimeric B12-dependent enzyme acts as a dehydratase for 1,2-propanediol (Salmonella enterica) or glycerol (Lactobacillus reuteri).	153
275287	TIGR04494	c550_PedF	cytochrome c-550 PedF. Members of this family are c-type cytochromes with some remote similarity to the sulfur oxidation cytochrome SoxX.	133
275288	TIGR04495	RiPP_XyeA	putative rSAM-modified RiPP, XyeA family. Members of this family are short polypeptides with a conserved GG motif as found in bacteriocins that are cleaved upon export. Each gene occurs immediately upstream of a gene for a peptide-modifying radical SAM/SPASM domain protein. The system is designated Xye for genera Xenorhabdus, Yersinia, and Erwinia, hence XyeA for the precursor peptide. The vicinity will also contain a transport protein with a C39 family peptidase domain (see pfam03412), characteristic of GG motif cleave-on-export systems. The function of this RiPP family is unknown.	52
275289	TIGR04496	rSAM_XyeB	radical SAM/SPASM domain peptide maturase, XyeB family. Members of this family are radical SAM/SPASM domain enzymes associated with maturation of the XyeA family of GG-motif containing RiPP (Ribosomally synthesized, Post-translationally modified Peptide) natural products.	385
275290	TIGR04497	GRASP_targ_2	putative ATP-grasp target RiPP. Members of this small family are putative RiPP (Ribosomally translated, Post-translationally modified Peptides) precursors, modified by RimK-like ATP-grasp proteins. Members are encoded near both an ATP-grasp protein and C39 peptidase domain-containing transporter. Members are short polypeptides that contain the GG motif expected for cleavage on export.	46
275291	TIGR04498	AbiV_defense	abortive infection protein, AbiV family. This family includes AbiV (abortive infection system V) from Lactococcus lactis, a phage resistance protein that causes certain phage infections to fail to lead to successful phage replication. Abortive infection mechanisms differ greatly. AbiV interacts directly with the protein SaV in phage p2 and blocks translation of phage proteins.	141
275292	TIGR04499	abortive_AbiA	abortive infection protein, AbiA family. Members of this protein family average about 650 amino acids in length, with an N-terminal region related to reverse transcriptases. The only characterized member is AbiA, with reported activity as an abortive infection protein for phage defense in Lactococcus lactis and (heterologously) in Streptococcus thermophilus.	615
275293	TIGR04500	PpiC_rel_mature	putative peptide maturation system protein. Members of this protein family have a novel N-terminal sequence region. Close homologs to the C-terminal region of this protein score well to PpiC-type peptidyl-prolyl cis-trans isomerase models (see pfam00639), yet no sequence within the family scores well to such models, suggesting origin within a branch of the PpiC family but subsequent neofunctionalization with a rapid change of sequence. The genome context for members always includes an ATP-grasp enzyme associated with peptide modification and a short polypeptide likely to be the modification target.	337
275294	TIGR04501	microcomp_PduB	microcompartment protein PduB. Members of this family are PduB, a protein of bacterial microcompartments for coenzyme B(12)-dependent utilization of 1,2-propanediol (hence pdu) or glycerol. The most closely related protein in ethanolamine utilization microcompartments is EutL (TIGR04502).	225
275295	TIGR04502	microcomp_EutL	microcompartment protein EutL. Members of this family are EutL, a protein of bacterial microcompartments for ethanolamine utilization (eut). The most closely related protein in microcompartments for utilization of 1,2-propanediol (hence pdu) or glycerol is PduB (TIGR04501).	214
275296	TIGR04503	mft_etfB	electron transfer flavoprotein, mycofactocin-associated. Members of this small protein family are putative electron transfer flavoproteins, related to FixA from E. coli and EtfB from Methylophilus methylotrophus but clearly forming a distinctive clade. All members occur in species with a mycofactocin system. We have proposed that mycofactocin is a redox carrier synthesized from a ribosomally translated peptide with aid from a radical SAM enzyme, analogous to PQQ.	290
275297	TIGR04504	SDR_subfam_2	SDR family mycofactocin-dependent oxidoreductase. Members of this protein subfamily are putative oxidoreductases belonging to the larger SDR family. All members occur in genomes that encode a cassette for the biosynthesis of mycofactocin, a proposed electron carrier of a novel redox pool. Characterized members of this family are described as NDMA-dependent, meaning that a blue aniline dye serving as an artificial electron acceptor is required for members of this family to cycle in vitro, since the bound NAD residue is not exchangeable. This family resembles TIGR03971 most closely in the N-terminal region, consistent with the published hypothesis of NAD interaction with mycofactocin. See EC 1.1.99.36. [Unknown function, Enzymes of unknown specificity]	259
275298	TIGR04505	PtsS_plasma	phosphate ABC transporter substrate-binding protein. Members of this family are the substrate-binding protein of the phosphate ABC transporter as found in Mollicutes genera such as Mycoplasma, Mesoplasma, and Spiroplasma. The most similar sequences outside this family are PtsS in family TIGR02136, but sequence architecture differs considerably. Members of this family are never lipoproteins.	328
275299	TIGR04506	F_threo_transal	fluorothreonine transaldolase. Members of this family are fluorothreonine transaldolase, an enzyme involved in biosynthesis of 4-fluorothreonine, one of the few known naturally occurring organofluorine compounds. [Cellular processes, Biosynthesis of natural products]	609
275300	TIGR04507	fluorinase	adenosyl-fluoride synthase. Members of this family are fluorinase (adenosyl-fluoride synthase, EC 2.5.1.63), an enzyme involved in the first committed step in the biosynthesis of at least two different organofluorine compounds. Few organofluorine natural products are known. Related enzymes include chlorinases (EC 2.5.1.94) that lack fluorinase activity, although a fluorinase may show chlorinase activity. [Cellular processes, Biosynthesis of natural products]	285
275301	TIGR04508	queE_Cx14CxxC	7-carboxy-7-deazaguanine synthase, Cx14CxxC type. In the pathway of 7-cyano-7-deazaguanine (preQ0) biosynthesis, the radical SAM enzyme QueE is quite variable. This model describes a variant form in which the three-Cys motif that binds the signature 4Fe-4S cluster takes the form Cx14CxxC, as in Burkholderia multivorans ATCC 17616. The crystal structure is known.	208
275302	TIGR04509	mod_pep_NH_fam	putative modified peptide. Members of this family average 110 residues in length, with strong N-terminal homology to both the nitrile hydratase (NH) alpha subunit and family TIGR03793 of NH-related ribosomally translated natural product precursors. A neighboring gene resembles SagB, the dehydrogenase of many thiazole and oxazole modified peptide systems, supporting the hypothesis that members of this family are post-translationally modified.	85
275303	TIGR04510	mod_pep_cyc	putative peptide modification system cyclase. Members of this family show homology to mononucleotidyl cyclases and to tetratricopeptide repeat (TPR) proteins. Members occur next to two other markers of ribosomal peptide modification systems. One is a dehydrogenase related to SagB proteins from thiazole/oxazole modification systems. The other is the putative precursor, related to the nitrile hydratase-related leader peptide (NHLP) and nitrile hydratase alpha subunit families. These systems occur in many species of Xanthomonas and Stenotrophomonas, among others.	814
275304	TIGR04511	SagB_rel_DH_2	putative peptide maturation dehydrogenase. Members resemble the peptide maturation dehydrogenase SagB of thiazole and oxazole modification systems, and occur in what appears to be a new type of peptide modification system. One adjacent marker is a new type of nitrile hydratase alpha subunit-related putative precursor, TIGR04509, distantly related to the NHLP leader peptide family TIGR03793. Another is a large protein, TIGR04510, with regions similar to adenylate cyclases and TPR proteins.	380
275305	TIGR04512	Mycopla_NOT_gsn	STREFT protein. Members of this family occur strictly in the genus Mycoplasma, average 1050 in length with little length variability, have an N-terminal signal sequence, and exhibit no detectable sequence similarity to any characterized protein. Up to four tandem copies occur in some Mycoplasma (e.g. M. putrefaciens KS1). Incorrect inclusion of a 57-amino acid stretch of one family member in pfam08178, for a helix-turn-helix transcriptional regulator in several E. coli phage, has caused many members of this family to be annotated, in error, as GnsA/GnsB family proteins. We suggest the name STREFT (Secreted Thousand Residue Frequently Tandem) protein as a distinctive name to spread and replace the incorrect GnsA/GnsB designation. [Unknown function, General]	1041
275306	TIGR04513	VPAMP_CTERM	VPAMP-CTERM protein sorting domain. This domain is found as the extreme C-terminal region of four extracellular protein (mostly protease) precursor sequences in Chthoniobacter flavus Ellin428, the first representative sequenced from the Spartobacteria class of phylum Verrucomicrobia. This domain contains the cognate signal for one of four exosortase family enzymes in the C. flavus genome, and coexists with the more common PEP-CTERM domain, found on more than 50 proteins.	28
275307	TIGR04514	GWxTD_dom	GWxTD domain. This domain, about 100 amino acids long, occurs in Actinobacteria and other little-studied Gram-negative bacteria, Sec-dependent proteins 300-800 in length. The domain is rich in aromatic residues, with Trp, Tyr, or Phe as the majority amino acid in ten of the twenty-four most-conserved residue positions.	105
275308	TIGR04515	P450_rel_GT_act	P450-derived glycosyltransferase activator. Members of this family resemble cytochrome P450 by homology, but lack a critical heme-binding Cys residue. Members in general are encoded next to a glycosyltransferase gene in a natural products biosynthesis cluster, physically interact with it, and help the glycosyltransferase achieve high specificity while retaining high activity. Many members of this family assist in the attachment of a sugar moiety to a natural product such as a polyketide.	384
275309	TIGR04516	glycosyl_450act	glycosyltransferase, activator-dependent family. Many biosynthesis clusters for secondary metabolites feature a glycosyltransferase gene next to a P450 homolog, often with the P450 lacking a critical heme-binding Cys. These P450-derived sequences seem to be allosteric activators of glycosyltransferases such as the member of this family. This model describes a set of related glycosyltransferases, many of which can be recognized as activator-dependent from genomic context.	418
275310	TIGR04517	rSAM_PoyD	radical SAM family RiPP maturation amino acid epimerase. This model describes PoyD and its homologs. These are divergent putative radical SAM enzymes, with the classical CxxxCxxC motif but with few members approaching the cutoff score of pfam04055. PoyD appears responsible for catalyzing a unidirectional L-to-D epimerization of 18 of the 48 residues in the core peptide of the first characterized polytheonamide. The RiPP (ribosomally translated natural product) precursor, and peptides encoded near many other members of this family, belong to the nitrile hydratase leader peptide (NHLP) family.	445
275311	TIGR04518	ECF_S_folT_fam	ECF transporter S component, folate family. Members of this model are the multiple membrane-spanning S (specificity) component of ECF (energy coupling factor) type uptake transporters. All seed members were found in the vicinity of the bifunctional enzyme folC, involved in making active cofactor from imported folate. However, some species have multiple members of this family, suggesting some diversity of function. [Transport and binding proteins, Unknown substrate]	162
275312	TIGR04519	MoCo_extend_TAT	MoCo/4Fe-4S cofactor protein extended Tat translocation domain. This model describes a forty-five residue domain in which the last six residues represent the start of a TAT (Twin-Arginine Translocation) sorting signal. TAT allows proteins already folded, with cofactor already bound, to transit the membrane and reach the periplasm with the ability to perform redox or other cofactor-dependent activities. TAT signals are not normally seen so far from a well-supported start site. Member proteins may all be mutually homologous, with both a molybdenum cofactor-binding domain and a 4Fe-4S dicluster-binding domain.	43
275313	TIGR04520	ECF_ATPase_1	energy-coupling factor transporter ATPase. Members of this family are ATP-binding cassette (ABC) proteins by homology, but belong to energy coupling factor (ECF) transport systems. The architecture in general is two ATPase subunits (or a double-length fusion protein), a T component, and a substrate capture (S) component that is highly variable, and may be interchangeable in genomes with only one T component. This model identifies many but not all examples of the upstream member of the pair of ECF ATPases in Firmicutes and Mollicutes. [Transport and binding proteins, Unknown substrate]	268
275314	TIGR04521	ECF_ATPase_2	energy-coupling factor transporter ATPase. Members of this family are ATP-binding cassette (ABC) proteins by homology, but belong to energy coupling factor (ECF) transport systems. The architecture in general is two ATPase subunits (or a double-length fusion protein), a T component, and a substrate capture (S) component that is highly variable, and may be interchangeable in genomes with only one T component. This model identifies many but not all examples of the downstream member of the pair of ECF ATPases in Firmicutes and Mollicutes. [Transport and binding proteins, Unknown substrate]	277
275315	TIGR04522	EcfS_MSC_0063	putative energy coupling factor transporter S component, MSC_0063 family. This family of proteins is restricted to the Mollicutes (including Mycoplasma, Spiroplasma, and Ureaplasma). Members belong to a superfamily of multiple membrane-spanning proteins, among which those with assigned activities function as the S component (the specificity component) of ECF transporters. However, members fail to score better than the trusted cutoffs to previously built models for S component proteins (see pfam07155). [Transport and binding proteins, Unknown substrate]	151
275316	TIGR04523	Mplasa_alph_rch	helix-rich Mycoplasma protein. Members of this family occur strictly within a subset of Mycoplasma species. Members average 750 amino acids in length, including signal peptide. Sequences are predicted (Jpred 3) to be almost entirely alpha-helical. These sequences show strong periodicity (consistent with long alpha helical structures) and low complexity rich in D,E,N,Q, and K. Genes encoding these proteins are often found in tandem. The function is unknown.	745
275317	TIGR04524	mycoplas_M_dom	IgG-blocking virulence domain. This model defines a domain restricted to Mycoplasma and Ureaplasma proteins. Members include protein M of Mycoplasma genitalium, MG_281, a virulence protein that binds the IgG light chain to block the binding of antibody to antigen. The crystal structure of the protein M antibody-binding region is solved (PDB|4NZR), and includes this homology domain. Full-length homologs to MG_281 are known in a few other Mycoplasma species, but this model's seed alignment demonstrates distant homology to many additional proteins with a much wider distribution across the Mollicutes. Member proteins include paralogous families in some species, such as MCAP_0345, MCAP_0347, MCAP_0349, and MCAP_0351 in Mycoplasma capricolum. [Cellular processes, Pathogenesis]	251
275318	TIGR04525	prot_M_MG281	IgG-blocking protein M. Members of this family, including MG_281 of Mycoplasma genitalium, bind conserved regions of the IgG light chain sequences, blocking IgG's normal function of antigen-specific binding. It is therefore an important virulence protein. Members of this family are found also in Mycoplasma pneumoniae, M. penetrans, M. gallisepticum, and M. iowae. Model TIGR04524 describes a region within this protein that is shared by many additional Mycoplasma and Ureaplasma proteins. [Cellular processes, Pathogenesis]	526
275319	TIGR04526	predic_Ig_block	putative immunoglobulin-blocking virulence protein. Members of this family are putative virulence proteins of Mycoplasma and Ureaplasma species. Members share a region of sequence similarity (see TIGR04524) with protein M, a Mycoplasma genitalium protein that binds a conserved light chain region of IgG and blocks its protective function of antigen-specific binding. The seed alignment for this model includes an N-terminal signal-anchor domain and a proline-rich linker domain, and a C-terminal extension, in addition to the protein M-like domain recognized by TIGR04524. [Cellular processes, Pathogenesis]	692
275320	TIGR04527	mycoplas_twoTM	two transmembrane protein. Members of this family are uncharacterized proteins from the genus Mycoplasma, typically about 260 amino acids long, with a hydrophobic predicted transmembrane alpha helix toward each end. Often two family members are encoded in tandem, e.g. MG_279 and MG_280 from Mycoplasma genitalium.	245
275321	TIGR04528	acido_non_PQQ	acido-empty-quinoprotein group A. Members of this family closely resemble quinoproteins and quinohemoproteins such as PQQ-dependent methanol, glucose, and shikimate dehydrogenases, but restricted to species of Acidobacteria unable to synthesize PQQ. Seven members occur in candidatus Solibacter usitatus Ellin6076, eleven in Acidobacteriaceae bacterium KBS 96, etc. Members have N-terminal signal sequences. They lack the pair of adjacent Cys residues, involved in electron transfer, typical for family TIGR03075, and they lack CxxCH motifs for cytochrome c-like heme-binding. What cofactor these paralogous families of enzymes might use is unclear.	491
275322	TIGR04529	MTB_hemophore	hemophore-related protein, Rv0203/Rv1174c family. Members of this family occur as paralogs in most Mycobacterium strains, including 2 in M. tuberculosis, 6 in M. avium, and 9 in M. smegmatis. Members have a cleaved N-terminal signal peptide and exactly two Cys residues in the mature protein, both at invariant positions. The best characterized member, Rv0203, is a hemophore, that is, a secreted polypeptide that binds heme and delivers it to a transport system for import. Hemophores are protein analogs of siderophores, natural products that chelate non-heme iron and deliver it to receptors for transport. The unrelated HasA family of hemophores has been described in Gram-negative bacteria such as Yersinia pestis and Pseudomonas aeruginosa. [Transport and binding proteins, Other]	77
275323	TIGR04530	hemophoreRv0203	hemophore, mycobacterial-type. Members of this family, including Rv0203 from Mycobacterium tuberculosis, are secreted heme-binding proteins used in heme acquisition. Such proteins are called hemophores. Members have a cleavable N-terminal signal peptide, and a mature region just over 100 amino acids long with a pair of invariant Cys residues. An unrelated hemophore, HasA, occurs in Gram-negative pathogens such as Yersinia pestis. [Transport and binding proteins, Other]	113
275324	TIGR04531	nonproteo_OH	putative nonproteinogenic amino acid hydroxylase. This extremely rare protein family, a branch of the 2-oxoglutarate dependent oxygenase family related to proline 3-hydroxylase, appears only in natural product biosynthetic clusters that include nonribosomal peptide synthases. One member is PlyP from the polyoxypeptin A cluster, suggested to hydroxylate 3-methylproline. Another, GetF from the GE81112 biosynthetic gene cluster, is proposed to hydroxylate pipecolic acid.	277
275325	TIGR04532	PT_fungal_PKS	iterative type I PKS product template domain. Sequences found by this model are the so-called product template (PT) domain of various fungal iterative type I polyketide synthases. This domain resembles pfam14765, designated polyketide synthase dehydratase by Pfam, but members of that family are primarily bacterial, where type I PKS are predominantly modular, not iterative. The dehydratase active site residues well-conserved in pfam14765 (His in the first hot dog domain, Asp in the second hot dog domain) seem well conserved in this family also.	324
275326	TIGR04533	cyanosortB_assc	cyanoexosortase B-associated protein. Members of this protein family are found exclusively in the Cyanobacteria, usually encoded next to a gene encoding cyanoexosortase B (TIGR04156). This relationship resembles the association of the unrelated protein family TIGR04153 with cyanoexosortase A (TIGR03763), and of most exosortases with EpsI.	221
275327	TIGR04534	ELWxxDGT_rpt	ELWxxDGT repeat. This model describes a protein repeat with a well-conserved motif ELWxxDGT, and a periodicity of about 48. A single protein may have as many as 18 repeats. It may consist nearly entirely of this repeat, or may have other repeats as well (e.g. hyalin repeat). It is most common in the Deltaproteobacteria.	47
275328	TIGR04535	ferrit_encaps	ferritin-like protein. Two families of proteins are known to be encoded, with some regularity, next to the gene for encapsulin (pfam04454), which surrounds the target protein to form a prokaryotic proteinaceous organelle. One is the family of enzymes that includes Dyp-type peroxidases. The other is this family, with a resemblance to bacterioferritins. Encapsulin-associated forms of the proteins in these two families have a necessary C-terminal motif that resembles DGSL[SGN]IGSL[KR]. Members of this family that include the last columns of the model in the hit region (and are encoded next to the encapsulin gene) should be designated encapsulin-associated ferritin-like protein.	113
275329	TIGR04536	geobac_encap	encapsulated protein. Members of this family are lineage-restricted uncharacterized proteins found mostly in Brevibacillus and Geobacillus. Members are encoded next to the gene for encapsulin (which once was called a bacteriocin), and have the C-terminal motif for associating with encapsulin. [Unknown function, Enzymes of unknown specificity]	194
275330	TIGR04537	encap_target	encapsulation C-terminal sorting signal. This model describes a diverse region of extremely small size (11 residues), so unavoidably there are both false-positives and false-negatives. All true hits should occur in proteins encoded next to an encapsulating protein (see pfam04454), and should occur near the extreme C-terminus. Families previously known to have this domain on some members to mediate encapsulation include dye-decolorizing peroxidases and a ferritin-like protein, but (as this model helps show) there are others, including some hemerythrin family proteins and novel family TIGR04536.	11
275331	TIGR04538	P450_cycloAA_1	cytochrome P450, cyclodipeptide synthase-associated. Members of this subfamily are cytochrome P450 enzymes that occur next to tRNA-dependent cyclodipeptide synthases. This group does NOT include CYP121 (Rv2275) from Mycobacterium tuberculosis, adjacent to the cyclodityrosine synthetase Rv2276.	395
275332	TIGR04539	tRNA_cyclodipep	tRNA-dependent cyclodipeptide synthase. Members of this family take two aminoacylated tRNA molecules and produce a cyclic dipeptide with two peptide bonds. This enzyme therefore produces a type of nonribosomal peptide, but by a mechanism entirely different from the typical non-ribosomal peptide synthase (NRPS) that relies on adenylation to activate amino acids. Three characterized members of this family are the cyclodityrosine synthase of Mycobacterium tuberculosis (an essential gene), a cyclo(L-Phe-L-Leu) synthase from Streptomyces noursei involved in natural product biosynthesis, and cyclodileucine synthase YvmC from Bacillus licheniformis. Many cyclodipeptide synthases are found next to a cytochrome P450 that further modifies the product.	220
275333	TIGR04540	CLB_0814_fam	conserved hypothetical protein. Members of this family are conserved hypothetical proteins in a narrow range of species. In Clostridium botulinum A ATCC 19397, the gene occurs immediately after a five gene operon for biosynthesis of the natural product bacimethrin, a thiamin antivitamin antibiotic.	73
275334	TIGR04541	thiaminase_BcmE	thiamine pyridinylase. Members of this family are thiamine pyridinylase (EC 2.5.1.2), also called thiaminase I. Most examples of this secreted, thiamine-degrading enzyme are encoded with a cluster for biosynthesis of the thiamine antivitamin bacimethrin. [Cellular processes, Biosynthesis of natural products]	381
275335	TIGR04542	GMC_mycofac_2	GMC family mycofactocin-associated oxidoreductase. This model describes a set of dehydrogenases belonging to the glucose-methanol-choline oxidoreductase (GMC oxidoreductase) family. Members of the present family are restricted to the bacterial genus Gordonia, and seem to replace the related family TIGR03970, which occurs in Actinobacteria generally but not in the genus Gordonia. Members of both this family and TIGR03970 are associated with the mycofactocin biosynthesis operon in Actinobacteria. [Unknown function, Enzymes of unknown specificity]	425
275336	TIGR04543	ketoArg_3Met	2-ketoarginine methyltransferase. This SAM-dependent C-methyltransferase performs the middle step of a three step conversion from arginine to beta-methylarginine. It performs a C-methylation at position 3 of 5-guanidino-2-oxopentanoic acid (keto-arginine). An aminotransferase converts arginine to 5-guanidino-2-oxopentanoic acid, and later converts 5-guanidino-3-methyl-2-oxopentanoic acid to beta-methylarginine.	331
275337	TIGR04544	3metArgNH2trans	beta-methylarginine biosynthesis bifunctional aminotransferase. Members of this family are the bifunctional aminotransferase that catalyzes the first and third steps in the three-step conversion of arginine to beta-methylarginine. It first converts arginine to 2-ketoarginine, then converts 3-methyl-2-ketoarginine to 3-methylarginine. All members of the seed alignment are encoded next to a 2-ketoarginine methyltransferase (EC 2.1.1.243).	366
275338	TIGR04545	rSAM_ahbD_hemeb	heme b synthase. Members of this family are AhbD (alternative heme biosynthetic protein D), a radical SAM enzyme in sulfate-reducing bacteria and methanogens that performs the last decarboxylations to synthesize heme b from Fe-coproporphyrin III. Members include DVU_0855, previously included in error in TIGR04055, the NirJ2 family thought to be involved in heme d1 biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	339
275339	TIGR04546	rSAM_ahbC_deAc	12,18-didecarboxysiroheme deacetylase. This model describes one of a pair of radical SAM enzymes involved in the alternative heme biosynthesis (ahb) pathway for heme b biosynthesis from siroheme. This anaerobic pathway occurs in sulfate-reducing bacteria and methanogens. A very similar pair of radical SAM enzymes (TIGR04054, TIGR04055) is involved in heme d1 biosynthesis in species such as Heliobacillus mobilis and Heliophilum fasciatum. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	390
275340	TIGR04547	Mollicu_LP	MOLPALP family lipoprotein. Members of this family are surface lipoproteins, about 900 amino acids long on average, found only in the Mollicutes (Mycoplasma, Entomoplasma, Acholeplasma, Mesoplasma, Spiroplasma). Paralogs occur, such as MCAP_0360, MCAP_0361, and MCAP_0362 in Mycoplasma capricolum. This family shares significant N-terminal sequence similarity with STREFT (Secreted Thousand Residue Frequently Tandem), described by model TIGR04512; several members of the STREFT family have been misannotated as GnsA/GnsB family proteins. For proteins in this family, we suggest the name MOLPALP (Mollicutes Paralogous Lipoprotein) family lipoprotein [Cell envelope, Surface structures]	805
275341	TIGR04548	DnaD_Mollicutes	DnaD family protein, Mollicutes type. This model describes the full length of a family of proteins in the Mollicutes (Mycoplasma, Spiroplasma, Mesoplasma, etc.) homologous to the N-terminal region of DnaD from Bacillus subtilis. [DNA metabolism, DNA replication, recombination, and repair]	178
275342	TIGR04549	LP_HExxH_w_tonB	substrate import-associated zinc metallohydrolase lipoprotein. Members of this family are lipoproteins with the typical zinc metallohydrolase HExxH motif and additional similarities to a better-documented zinc peptidase family, pfam06167. The seed alignment begins immediately after the lipoprotein motif Cys residue. Up to five members of this protein family occur per genome, in the context of certain gene pairs related to RagA and RagB, or to SusC and SusD. Those gene pairs, like the present family, are restricted to the Bacteroidetes, may number up to 100 pairs per genome, and are linked to TonB-dependent uptake of biopolymer-derived nutrients such as glycans. A possible function for this lipoprotein is to hydrolyse larger molecules to prepare substrates for import and utilization. [Unknown function, Enzymes of unknown specificity]	261
275343	TIGR04550	sMetMonox_MmoD	soluble methane monooxygenase-binding protein MmoD. Members of this family are MmoD, a protein that binds the soluble (as opposed to the membrane-bound, copper-rich, particulate) methane monooxygenase and may regulate its activity. Recent work suggests that MmoD, together with methanobactin, acts as a copper switch to regulate which enzyme form is produced.	64
275344	TIGR04551	TIGR04551	TIGR04551 family protein. Members of this family are proteins of unknown function, about 620 amino acids in length, and universal in but restricted to the Myxococcales, an order within the Deltaproteobacteria with at least 15 sequenced genomes as of 8/2014. The most closely related homologs outside the Myxococcales show localized homology only and display sharply lower scores. Relatively few protein families (roughly 20) could be built to have a comparable restriction to the Myxococcales. The putative protein sorting signal MYXO-CTERM (TIGR03901) appears so far universal in but restricted to the Myxococcales, making the present family a candidate to be involved in recognizing and processing proteins with that signal.	523
275345	TIGR04552	TIGR04552	TIGR04552 family protein. Members of this family are bacterial proteins, roughly 400 amino acids in length. Most members belong to the Deltaproteobacteria. All members of the Myxococcales, an order within the Deltaproteobacteria, have a member. The arrangement of conserved residues into invariant motifs suggests enzymatic activity. The function is unknown.	353
275346	TIGR04553	ABC_peri_selen	putative selenate ABC transporter periplasmic binding protein. Members of this family are ABC transporter periplasmic binding proteins and represent one clade within a larger family that includes phosphate, phosphite, and phosphonate transporters. All members of the seed alignment occur near a gene for SelD, the selenium-activating protein needed to make selenocysteine or selenouridine. Context therefore suggests members should be able to transport selenate, although transporting other substrates as well (e.g. phosphonates) is possible. This model has no overlap with TIGR03431, whose members are found regularly with phosphonate catabolism operons.	266
275347	TIGR04554	3TM_mycoplas	three transmembrane helix protein. Members of this rare family are small, highly hydrophobic, and restricted so far to the genus Mycoplasma, where they appear not to be essential. All members have three hydrophobic transmembrane helical segments.	105
275348	TIGR04555	sulfite_DH_soxC	sulfite dehydrogenase. Members of this family are the sulfite dehydrogenase SoxC. All members have a twin-arginine translocation (TAT) signal for secretion of proteins with bound cofactor across the plasma membrane.	408
275349	TIGR04556	PKS_assoc	polyketide synthase-associated domain. This model describes a rare domain found as the N-terminal region of a number of dinoflagellate-specific proteins that resemble type I polyketide synthases.	228
275350	TIGR04557	fuse_rel_SoxYZ	quinoprotein dehydrogenase-associated SoxYZ-like carrier. Members of this family are fusion proteins, with the N-terminal region similar to the sulfur oxidation protein SoxY (TIGR04488) and the C-terminal region similar to sulfur oxidation protein SoxZ (TIGR04490). Members occur exclusively in species with PQQ-dependent enzymes that have a Cys-Cys motif (TIGR03075) for electron transfer to c550 family cytochrome. By homology to the sulfur moiety-binding subunit SoxY, we predict the conserved Cys in the Gly-Gly-Cys motif binds some unknown adduct.	225
275351	TIGR04558	SoxH_rel_PQQ_1	quinoprotein relay system zinc metallohydrolase 1. By homology, members are Zn metallohydrolases in the same family as the SoxH protein associated with sulfate metabolism, Bacillus cereus beta-lactamase II (see PDB:1bc2), and, more distantly, hydroxyacylglutathione hydrolase (glyoxalase II). All members occur in genomes with both PQQ biosynthesis and a PQQ-dependent (quinoprotein) dehydrogenase that has a motif of two consecutive Cys residues (see TIGR03075). The Cys-Cys motif is associated with electron transfer by specialized cytochromes such as c551. All these genomes also include a fusion protein (TIGR04557) whose domains resemble SoxY and SoxZ from thiosulfate oxidation. A conserved Cys in this fusion protein aligns to the Cys residue in SoxY that carries sulfur cycle intermediates. In many genomes, the genes for PQQ biosynthesis enzymes, PQQ-dependent enzymes, their associated cytochromes, and members of this family are clustered. Note that one to three closely related Zn metallohydrolases may occur; this family represents a specific clade among them. [Unknown function, Enzymes of unknown specificity]	285
275352	TIGR04559	SoxH_rel_PQQ_2	quinoprotein relay system zinc metallohydrolase 2. By homology, members are Zn metallohydrolases in the same family as the SoxH protein associated with sulfate metabolism, Bacillus cereus beta-lactamase II (see PDB:1bc2), and, more distantly, hydroxyacylglutathione hydrolase (glyoxalase II). All members occur in genomes with both PQQ biosynthesis and a PQQ-dependent (quinoprotein) dehydrogenase that has a motif of two consecutive Cys residues (see TIGR03075). The Cys-Cys motif is associated with electron transfer by specialized cytochromes such as c551. All these genomes also include a fusion protein (TIGR04557) whose domains resemble SoxY and SoxZ from thiosulfate oxidation. A conserved Cys in this fusion protein aligns to the Cys residue in SoxY that carries sulfur cycle intermediates. In many genomes, the genes for PQQ biosynthesis enzymes, PQQ-dependent enzymes, their associated cytochromes, and members of this family are clustered. Note that one to three closely related Zn metallohydrolases may occur; this family represents a specific clade among them. Some members of this family have a short additional N-terminal domain with four conserved Cys residues. [Unknown function, Enzymes of unknown specificity]	283
275353	TIGR04560	ribo_THX	ribosomal small subunit protein bTHX. Members of this family are the lineage-specific bacterial ribosomal small subunit protein bTHX (previously THX), originally shown to exist in the genus Thermus. The protein is conserved for the first 26 amino acids, past which some members continue with additional sequence, often repetitive or low-complexity. This model also finds eukaryotic organelle forms, which have additional N-terminal transit peptides. [Protein synthesis, Ribosomal proteins: synthesis and modification]	26
275354	TIGR04561	membra_charge	integral membrane protein. Members of this protein are short (about 85-residue), low-complexity sequences of unknown function, with a highly hydrophobic N-terminal region of about 40 amino acids followed by a charged (Asp, Glu, Lys, and Arg-rich), sometimes repetitive C-terminal region. Members occur exclusively among the Mollicutes (Mycoplasma, Mesoplasma, Acholeplasma, Spiroplasma, Entomoplasma). The gene neighborhood of this protein is not conserved.	82
275355	TIGR04562	TIGR04562	TIGR04562 family protein. Members of this family are proteins of unknown function, about 400 amino acids in length. Members are universal among the Myxococcales (a branch of the Deltaproteobacteria) and occur sporadically elsewhere. [Unknown function, General]	355
275356	TIGR04563	TIGR04563	MXAN_4361/MXAN_4362 family small protein. Members of this family are small proteins that appear to be restricted to and yet universal in the Myxococcales. The function is unknown. Members include two tandem loci in Myxococcus xanthus DK 1622, MXAN_4361 and MXAN_4362, although members are not tandem in other Myxococcales.	53
275357	TIGR04564	Synergist_CTERM	Synergist-CTERM protein sorting domain. This model identifies a C-terminal domain of about 27 residues whose features are 1) a short Gly/Ser-rich region that ends in an invariant Gly-Cys motif, 2) a highly hydrophobic probable transmembrane alpha helix with a nearly invariant Pro near the end, and 3) a cluster of basic residues (Arg, Lys), and then the end of the protein. This domain occurs, so far, only in species of Synergistetes (Dethiosulfovibrio peptidovorans, Aminiphilus circumscriptus, Aminomonas paucivorans, Fretibacterium fastidiosum, Cloacibacillus evryensis, Synergistes jonesii, etc). This region closely resembles the MYXO-CTERM region of the Myxococcales, a division of the Deltaproteobacteria (see TIGR03901), but that domain lacks the conserved Pro, frequently has two Cys residues instead of one, and most importantly, has a spacer region separating the Gly-Cys motif from the transmembrane segment. As with MYXO-CTERM, the enzyme presumed to recognize and cleave the sorting signal is not known. The lack of a spacer region between motif and TM segment suggests the presumed protease is located largely within the membrane, like rhombosortase and archaeosortase, rather than merely tethered to it like sortase.	26
275358	TIGR04565	OMP_myx_plus	outer membrane beta-barrel protein. Members of this family are outer membrane beta-barrel proteins, as inferred by distant homologies to other families (e.g. pfam13505) and by the concentration of aromatic residues, especially Phe, in the OMP signal region, which is flush with the C-terminus in some members, but followed by a few residues in others. Members have variable insertions and deletions, affecting scores, so this model does not cleanly separate all members from all non-members. Members are common in the Myxococcales, with five occurring in Myxococcus xanthus DK 1622.	157
275359	TIGR04566	myxo_TraA_Nterm	outer membrane exchange protein TraA, N-terminal region. In Myxococcus xanthus, the protein pair TraA (MXAN_6895) and TraB (MXAN_6898) are required for contact-dependent exchange of outer membrane proteins. The C-terminal half of TraA consists largely of Cys-rich tandem repeats. This model describes the N-terminal region of TraA, and related protein MXAN_4924. This region is suggested to be similar to the lectin PA14. Members of this family are restricted to a subset of the Myxococcales, and so have a narrower species distribution than the MYXO-CTERM putative protein sorting signal (TIGR03901), which is universal in the Myxococcales. Note that TIGR04201 matches at least seven repeats in the C-terminal region of TraA. [Protein fate, Protein and peptide secretion and trafficking]	240
275360	TIGR04567	RNAP_delt_lowGC	DNA-directed RNA polymerase delta subunit. Members of this family are the RNA polymerase delta subunit, as found in the Firmicutes and the Mollicutes. All members of the seed alignment have an extended C-terminal low-complexity region, consisting largely of Asp and Glu, that is not included in the model. Proteins giving borderline scores should be checked to confirm a similar acidic C-terminal domain. [Transcription, DNA-dependent RNA polymerase]	83
275361	TIGR04568	arch_SelU_Nterm	selenouridine synthase, SelU N-terminal-like subunit. This protein is involved in biosynthesis of a selenonucleotide, probably 2-selenouridine, in tRNA of some archaea, such as Methanococcus maripaludis. This protein resembles the N-terminal region of bacterial SelU, and its partner protein resembles the C-terminal region. [Protein synthesis, tRNA and rRNA base modification]	215
275362	TIGR04569	arch_SelU_Cterm	selenouridine synthase, SelU C-terminal-like subunit. This protein is involved in biosynthesis of a selenonucleotide, probably 2-selenouridine, in tRNA of some archaea, such as Methanococcus maripaludis. This protein resembles the C-terminal region of bacterial SelU, and its partner protein resembles the N-terminal region. [Protein synthesis, tRNA and rRNA base modification]	217
275363	TIGR04570	mollicut_2TM	small integral membrane protein. Members of this extremely rare protein family occur in Mycoplasma mycoides and two species of Spiroplasma. The protein is small and hydrophobic with two predicted transmembrane (TM) regions. [Unknown function, General]	87
275364	TIGR04571	LmtA_Leptospira	lipid A Kdo2 1-phosphate O-methyltransferase. This family describes LmtA, which methylates a phosphate on the Kdo2 sugar of lipid A. The model is classified as exception (more specific than equivalog) to reflect that its scope is limited to the genus Leptospira, whereas homologs with matching activity might exist more broadly. Members of this family belong to the broader family of pfam04191, phospholipid methyltransferase, which includes a characterized yeast enzyme that acts on a range of unsaturated phospholipids. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	252
237975	cd00001	PTS_IIB_man	PTS_IIB, PTS system, Mannose/sorbose specific IIB subunit. The bacterial phosphoenolpyruvate: sugar phosphotransferase system (PTS) is a multi-protein system involved in the regulation of a variety of metabolic and transcriptional processes. This family is one of four structurally and functionally distinct group IIB PTS system cytoplasmic enzymes, necessary for the uptake of carbohydrates across the cytoplasmic membrane and their phosphorylation. The active site histidine receives a phosphate group from the IIA subunit and transfers it to the substrate.	151
237976	cd00002	YbaK_deacylase	This CD includes cysteinyl-tRNA(Pro) deacylases from Haemophilus influenzae and Escherichia coli and other related bacterial proteins. These trans-acting, single-domain proteins are homologs of ProX and also the cis-acting prolyl-tRNA synthetase (ProRS) inserted (INS) editing domain.  The bacterial amino acid trans-editing enzyme YbaK is a deacylase that hydrolyzes cysteinyl-tRNA(Pro)'s mischarged by prolyl-tRNA synthetase.   YbaK also hydrolyzes glycyl-tRNA's, alanyl-tRNA's, seryl-tRNA's, and prolyl-tRNA's.  YbaK is homologous to the INS domain of prolyl-tRNA synthetase (ProRS) as well as the trans-editing enzyme ProX of Aeropyrum pernix which hydrolyzes alanyl-tRNA's and glycyl-tRNA's.	152
237977	cd00003	PNPsynthase	Pyridoxine 5'-phosphate (PNP) synthase domain; pyridoxal 5'-phosphate is the active form of vitamin B6 that acts as an essential, ubiquitous coenzyme in amino acid metabolism. In bacteria, formation of pyridoxine 5'-phosphate is a step in the biosynthesis of vitamin B6. PNP synthase, a homooctameric enzyme, catalyzes the final step in PNP biosynthesis, the condensation of 1-amino-acetone 3-phosphate and 1-deoxy-D-xylulose 5-phosphate. PNP synthase adopts a TIM barrel topology, intersubunit contacts are mediated by three ''extra'' helices, generating a tetramer of symmetric dimers with shared active sites; the open state has been proposed to accept substrates and to release products, while most of the catalytic events are likely to occur in the closed state; a hydrophilic channel running through the center of the barrel was identified as the essential structural feature that enables PNP synthase to release water molecules produced during the reaction from the closed, solvent-shielded active site.	234
320674	cd00004	Sortase	Sortase domain. Sortases are cysteine transpeptidases, mainly found in Gram-positive bacteria, which either anchor surface proteins to peptidoglycans of the bacterial cell wall envelope or link proteins together to form pili by working alone, or in concert with other enzymes. They do so by catalyzing a transpeptidation reaction in which the surface protein substrate is cleaved at a conserved cell wall sorting signal and covalently linked to peptidoglycan for display on the bacterial surface. Sortases are grouped into different classes based on sequence, membrane topology, genomic positioning, and cleavage site preference. The different classes are called class A to F sortases. Most Gram-positive bacteria contain more than one sortase and it is thought that the different sortases attach different surface protein classes. The typical eight-stranded beta-barrel fold is observed in all known sortases, along with the conserved catalytic triad consisting of cysteine, histidine and arginine residues. Some sortases contain an N-terminal signal peptide only and the C-terminus serves as a membrane anchor, which represents a type I membrane topology, with the N-terminal enzymatic portion projecting towards the bacterial surface and the C-terminal end residing in the cytoplasm. Other sortases adopt a type II membrane topology, with the N-terminal hydrophobic segment inside the cytoplasm and the C-terminal enzymatic portion located across the plasma membrane. In these, the N-terminus functions as both a signal peptide for secretion and a stop-transfer signal for membrane anchoring. Sortases are also present in some Gram-negative and Archaebacterial species, but the functions of these enzymes are unknown.	125
187674	cd00005	CBM9_like_1	DOMON-like type 9 carbohydrate binding module of xylanases. Family 9 carbohydrate-binding modules (CBM9) play a role in the microbial degradation of cellulose and hemicellulose (materials found in plants). The domain has previously been called cellulose-binding domain. The polysaccharide binding sites of CBMs with available 3D structure have been found to be either flat surfaces with interactions formed by predominantly aromatic residues (tryptophan and tyrosine), or extended shallow grooves. The CBM9 domain frequently occurs in tandem repeats; members found in this subfamily typically co-occur with glycosyl hydrolase family 10 domains and are annotated as endo-1,4-beta-xylanases. CBM9 from Thermotoga maritima xylanase 10A is reported to have specificity for polysaccharide reducing ends.	185
237978	cd00006	PTS_IIA_man	PTS_IIA, PTS system, mannose/sorbose specific IIA subunit. The bacterial phosphoenolpyruvate: sugar phosphotransferase system (PTS) is a multi-protein system involved in the regulation of a variety of metabolic and transcriptional processes. This family is one of four structurally and functionally distinct group IIA PTS system cytoplasmic enzymes, necessary for the uptake of carbohydrates across the cytoplasmic membrane and their phosphorylation. IIA subunits receive phosphoryl groups from HPr and transfer them to IIB subunits, which in turn phosphorylate the substrate.	122
350199	cd00008	PIN_53EXO-like	FEN-like PIN domains of the 5'-3' exonucleases of DNA polymerase I, bacteriophage T4 RNase H and T5-5' nucleases, and homologs. PIN (PilT N terminus) domains of the 5'-3' exonucleases (53EXO) of multi-domain DNA polymerase I and single domain protein homologs, as well as, the PIN domains of bacteriophage T5-5'nuclease (T5FEN or 5'-3'exonuclease), bacteriophage T4 RNase H (T4FEN), bacteriophage T3 (T3 phage exodeoxyribonuclease) and other similar nucleases are included in this family. The 53EXO of DNA polymerase I recognizes and endonucleolytically cleaves a structure-specific DNA substrate that has a bifurcated downstream duplex and an upstream template-primer duplex that overlaps the downstream duplex by 1 bp. The T5-5'nuclease is a 5'-3'exodeoxyribonuclease that also exhibits endonucleolytic activity on flap structures (branched duplex DNA containing a free single-stranded 5'end). T4 RNase H, which removes the RNA primers that initiate lagging strand fragments, has 5'- 3'exonuclease activity on DNA/DNA and RNA/DNA duplexes and has endonuclease activity on flap or forked DNA structures. These nucleases are members of the structure-specific, 5' nuclease family (FEN-like) that catalyzes hydrolysis of DNA duplex-containing nucleic acid structures during DNA replication, repair, and recombination. Canonical members of the FEN-like family possess a PIN domain with a two-helical structure insert (also known as the helical arch, helical clamp or I domain) of variable length (approximately 16 to 800 residues), and at the C-terminus of the PIN domain a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included in this model) and the helical arch/clamp region are involved in DNA binding. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	158
99707	cd00009	AAA	The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperones, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases.	151
237980	cd00010	AAI_LTSS	AAI_LTSS: Alpha-Amylase Inhibitors (AAI), Lipid Transfer (LT) and Seed Storage (SS) Protein family; a protein family unique to higher plants that includes cereal-type alpha-amylase inhibitors, lipid transfer proteins, seed storage proteins, and similar proteins. Proteins in this family are known to play important roles in defending plants from insects and pathogens, lipid transport between intracellular membranes, and nutrient storage. Many proteins of this family have been identified as allergens in humans. These proteins contain a common pattern of eight cysteines that form four disulfide bridges.	63
153270	cd00011	BAR_Arfaptin_like	The Bin/Amphiphysin/Rvs (BAR) domain of Arfaptin-like proteins, a dimerization module that binds and bends membranes. The BAR domain of Arfaptin-like proteins, also called the Arfaptin domain, is a dimerization, lipid binding and curvature sensing module present in Arfaptins, PICK1, ICA69, and similar proteins. Arfaptins are ubiquitously expressed proteins implicated in mediating cross-talk between Rac, a member of the Rho family GTPases, and Arf (ADP-ribosylation factor) small GTPases. Arfaptins bind to GTP-bound Arf1, Arf5, and Arf6, with strongest binding to GTP-Arf1. Arfaptins also bind to Rac-GTP and Rac-GDP with similar affinities. The Arfs are thought to bind to the same surface as Rac, and their binding is mutually exclusive. Protein Interacting with C Kinase 1 (PICK1) plays a key role in the trafficking of AMPA receptors, which are critical for regulating synaptic strength and may be important in cellular processes involved in learning and memory. Islet cell autoantigen 69-kDa (ICA69) is a diabetes-associated autoantigen that is involved in membrane trafficking at the Golgi complex in neurosecretory cells. ICA69 associates with PICK1 through their BAR domains to form a heterodimer which is involved in regulating the synaptic targeting and surface expression of AMPA receptors. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.	203
212657	cd00012	NBD_sugar-kinase_HSP70_actin	Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily. This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure.	185
200435	cd00013	ADF_gelsolin	Actin depolymerization factor/cofilin- and gelsolin-like domains. Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments.	97
409031	cd00014	CH_SF	calponin homology (CH) domain superfamily. CH domains are actin filament (F-actin) binding motifs, which may be present as a single copy or in tandem repeats (which increase binding affinity). They either function as autonomous actin binding motifs or serve a regulatory function. CH domains are found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, as well as proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav).	103
237982	cd00015	ALBUMIN	Albumin domain, contains five or six internal disulphide bonds; albuminoid superfamily includes alpha-fetoprotein which binds various cations, fatty acids and bilirubin; vitamin D-binding protein which binds to vitamin D, its metabolites, and fatty acids; alpha-albumin which binds water, cations (such as Ca2+, Na+ and K+), fatty acids, hormones, bilirubin and drugs; and afamin of which little is known; these belong to a multigene family with highly conserved intron/exon organization and encoded protein structures; evolutionary comparisons strongly support vitamin D-binding protein as the original gene in this group with subsequent local duplications generating the remaining genes in the cluster	185
293732	cd00016	ALP_like	alkaline phosphatases and sulfatases. This family includes alkaline phosphatases and sulfatases. Alkaline phosphatases are non-specific phosphomonoesterases that catalyze the hydrolysis reaction via a phosphoseryl intermediate to produce inorganic phosphate and the corresponding alcohol, optimally at high pH. Alkaline phosphatase exists as a dimer, each monomer binding 2 zinc atoms and one magnesium atom, which are essential for enzymatic activity. Sulfatases catalyze the hydrolysis of sulfate esters from a wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatases include the cycling of sulfur in the environment, the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and the remodeling of sulfated glycosaminoglycans in the extracellular space. Both alkaline phosphatase and sulfatase are essential for human metabolism. Deficiencies of individual enzymes cause genetic diseases.	237
237984	cd00017	ANATO	Anaphylatoxin homologous domain; C3a, C4a and C5a anaphylatoxins are protein fragments generated enzymatically in serum during activation of complement molecules C3, C4, and C5. They induce smooth muscle contraction. These fragments are homologous to repeats in fibulins.	70
237985	cd00018	AP2	DNA-binding domain found in transcription regulators in plants such as APETALA2 and EREBP (ethylene responsive element binding protein). In EREBPs the domain specifically binds to the 11bp GCC box of the ethylene response element (ERE), a promoter element essential for ethylene responsiveness. EREBPs and the C-repeat binding factor CBF1, which is involved in stress response, contain a single copy of the AP2 domain. APETALA2-like proteins, which play a role in plant development, contain two copies.	61
237986	cd00019	AP2Ec	AP endonuclease family 2; These endonucleases play a role in DNA repair, cleaving phosphodiester bonds at apurinic or apyrimidinic sites; the alignment also contains hexulose-6-phosphate isomerases, enzymes that catalyze the epimerization of D-arabino-6-hexulose 3-phosphate to D-fructose 6-phosphate, via cleaving the phosphoester bond with the sugar.	279
380813	cd00021	Bbox_SF	B-box-type zinc finger superfamily. The B-box-type zinc finger is a short zinc binding domain of around 40 amino acid residues in length. It has been found in transcription factors, ribonucleoproteins and proto-oncoproteins, such as in TRIM (tripartite motif) proteins that consist of an N-terminal RING finger (originally called an A-box), followed by 1-2 B-box domains and a coiled-coil domain (also called RBCC for Ring, B-box, Coiled-Coil). The B-box-type zinc finger often presents in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interactions. Based on different consensus sequences and the spacing of the 7-8 zinc-binding residues, B-box-type zinc fingers can be divided into two groups, type 1 (Bbox1: C6H2) and type 2 (Bbox2: CHC3H2).	39
237989	cd00022	BIR	Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger.	69
237990	cd00023	BBI	Bowman-Birk type proteinase inhibitor (BBI); family of plant serine protease inhibitors that block trypsin or chymotrypsin. They are either single-headed (one reactive site, one inactive site, present mainly in monocotyledonous seeds) or double-headed (two reactive sites, present mainly in dicotyledonous seeds).	55
349274	cd00024	CD_CSD	CHROMO (CHRomatin Organization MOdifier) domains and chromo shadow domains. Members of this group are chromodomains or chromo shadow domains; these are SH3-fold-beta-barrel domains of the chromo-like superfamily. Chromodomains lack the first strand of the SH3-fold-beta-barrel; this first strand is altered by insertion in the chromo shadow domains. The chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and which appears to play a role in the functional organization of the eukaryotic nucleus. The chromodomain is implicated in the binding of the proteins in which it is found to methylated histone tails and maybe RNA. Chromodomain-containing proteins include: i) those having an N-terminal chromodomain followed by a related chromo shadow domain, such as Drosophila and human heterochromatin protein Su(var)205 (HP1), and mammalian modifier 1 and 2; ii) those having a single chromodomain, such as Drosophila protein Polycomb (Pc), mammalian modifier 3, human Mi-2 autoantigen, and several yeast and Caenorhabditis elegans proteins of unknown function; iii) those having paired tandem chromodomains, such as mammalian DNA-binding/helicase proteins CHD-1 to CHD-4 and yeast protein CHD1; and iv) elongation factor eEF3, a member of the ATP-binding cassette (ABC) family of proteins, that serves an essential function in the translation cycle of fungi. eEF3 is a soluble factor lacking a transmembrane domain and having two ABC domains arranged in tandem, with a unique chromodomain inserted within the ABC2 domain.	50
237992	cd00025	BPI1	BPI/LBP/CETP N-terminal domain; Bactericidal permeability-increasing protein (BPI) / Lipopolysaccharide-binding protein (LBP) / Cholesteryl ester transfer protein (CETP) N-terminal domain; binds to and neutralizes lipopolysaccharides from the outer membrane of Gram-negative bacteria; apolar pockets on the concave surface bind a molecule of phosphatidylcholine, primarily by interacting with its acyl chains; this suggests that the pockets may also bind the acyl chains of lipopolysaccharide.	223
237993	cd00026	BPI2	BPI/LBP/CETP C-terminal domain; Bactericidal permeability-increasing protein (BPI) / Lipopolysaccharide-binding protein (LBP) / Cholesteryl ester transfer protein (CETP) C-terminal domain; binds to and neutralizes lipopolysaccharides from the outer membrane of Gram-negative bacteria; apolar pockets on the concave surface bind a molecule of phosphatidylcholine, primarily by interacting with its acyl chains; this suggests that the pockets may also bind the acyl chains of lipopolysaccharide.	200
349339	cd00027	BRCT	C-terminal domain of the breast cancer suppressor protein (BRCA1) and related domains. The BRCT (BRCA1 C-terminus) domain is found within many DNA damage repair and cell cycle checkpoint proteins. BRCT domains interact with each other forming homo/hetero BRCT multimers, but are also involved in BRCT-non-BRCT interactions and interactions within DNA strand breaks. BRCT tandem repeats bind to phosphopeptides; it has been shown that the repeats in human BRCA1 bind specifically to pS-X-X-F motifs, mediating the interaction between BRCA1 and the DNA helicase BACH1, or BRCA1 and CtIP, a transcriptional corepressor. It is assumed that BRCT repeats play similar roles in many signaling pathways associated with the response to DNA damage.	68
237995	cd00028	B_lectin	Bulb-type mannose-specific lectin. The domain contains a three-fold internal repeat (beta-prism architecture). The consensus sequence motif QXDXNXVXY is involved in alpha-D-mannose recognition. Lectins are carbohydrate-binding proteins which specifically recognize diverse carbohydrates and mediate a wide variety of biological processes, such as cell-cell and host-pathogen interactions, serum glycoprotein turnover, and innate immune responses.	116
410341	cd00029	C1	protein kinase C conserved region 1 (C1 domain) superfamily. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains. It contains the motif HX12CX2CXnCX2CX4HX2CX7C, where C and H are cysteine and histidine, respectively; X represents other residues; and n is either 13 or 14. C1 has a globular fold with two separate Zn(2+)-binding sites. It was originally discovered as lipid-binding modules in protein kinase C (PKC) isoforms. C1 domains that bind and respond to phorbol esters (PE) and diacylglycerol (DAG) are referred to as typical, and those that do not respond to PE and DAG are deemed atypical. A C1 domain may also be referred to as PKC or non-PKC C1, based on the parent protein's activity. Most C1 domain-containing non-PKC proteins act as lipid kinases and scaffolds, except PKD which acts as a protein kinase. PKC C1 domains play roles in membrane translocation and activation of the enzyme.	50
175973	cd00030	C2	C2 domain. The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.	102
206635	cd00031	CA_like	Cadherin repeat-like domain. Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan.	98
237997	cd00032	CASc	Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs.	243
153056	cd00033	CCP	Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system. SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function.	57
349275	cd00034	CSD	chromo shadow domain. The chromo shadow domain (CSD) is always found in association with a related N-terminal chromo (CHRomatin Organization MOdifier) domain. CSD domains have only been found in proteins that also possess a chromodomain, while chromodomains can exist in isolation. CSDs are found for example in Drosophila and human heterochromatin protein (HP1) and mammalian modifier 1 and modifier 2. HP1 is a highly conserved non-histone chromosomal protein that is evolutionarily conserved from fission yeast to plants and animals. HP1 has two conserved protein-protein interaction domains, a single N-terminal chromodomain (CD) which can bind to histone proteins via methylated lysine residues, and a related C-terminal chromo shadow domain (CSD) which is responsible for the homodimerization and interaction with a number of chromatin-associated non-histone proteins; a flexible hinge region separates the CD and CSD and may bind nucleic acid. The HP1 CSD, in addition to interacting with various proteins bearing the PXVXL motif, also interacts with a region of histone H3 that bears the similar PXXVXL motif. There are three human homologs of HP1 proteins: HP1alpha (also known as Cbx5), HP1beta (also known as Cbx1), and HP1gamma (also known as Cbx3). The CSD domains of all three human HP1 homologs have similar affinities to the PXXVXL motif of histone H3.	52
211311	cd00035	ChtBD1	Hevein or type 1 chitin binding domain. Hevein or type 1 chitin binding domain (ChtBD1), a lectin domain found in proteins from plants and fungi that bind N-acetylglucosamine, plant endochitinases, wound-induced proteins such as hevein, a major IgE-binding allergen in natural rubber latex, and the alpha subunit of Kluyveromyces lactis killer toxin. This domain is involved in the recognition and/or binding of chitin subunits; it typically occurs N-terminal to glycosyl hydrolase domains in chitinases, together with other carbohydrate-binding domains, or by itself in tandem-repeat arrangements.	39
213175	cd00036	ChtBD3	Chitin/cellulose binding domains of chitinase and related enzymes. This group contains proteins related to the cellulose-binding domain of Erwinia chrysanthemi endoglucanase Z (EGZ) and Serratia marcescens chitinase B (ChiB). Gram-negative plant parasite Erwinia chrysanthemi produces a variety of depolymerizing enzymes to metabolize pectin and cellulose on the host plant. Cellulase EGZ has a modular structure, with an N-terminal catalytic domain linked to a C-terminal cellulose-binding domain (CBD). CBD mediates the secretion activity of EGZ. Chitinases allow certain bacteria to utilize chitin as an energy source. Typically, non-plant chitinases are of the glycosidase family 18. Bacillus circulans Glycosidase ChiA1 hydrolyzes chitin and is comprised of several domains: the C-terminal chitin binding domain, an N-terminal catalytic domain, and 2 fibronectin type III-like domains. Bacillus circulans WL-12 ChiA1 facilitates invasion of fungal cell walls. The ChiA1 chitin binding domain is required for the specific recognition of insoluble chitin. Although topologically and structurally related, ChiA1 lacks the characteristic aromatic residues of Erwinia chrysanthemi endoglucanase Z (CBD(EGZ)). Streptomyces griseus Chitinase C is a family 19 chitinase, and consists of an N-terminal chitin binding domain and a C-terminal chitin-catalytic domain that effects degradation. ChiC contains the characteristic chitin-binding aromatic residues. Chitinases function in invertebrates in the degradation of old exoskeletons, in fungi to utilize chitin in cell walls, and in bacteria which use chitin as an energy source.	40
153057	cd00037	CLECT	C-type lectin (CTL)/C-type lectin-like (CTLD) domain. CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins.  This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs.  Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice.  Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis;  P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration.  CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose.  Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors.  C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces.  Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model.  In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer.  A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome.  Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model.	116
237999	cd00038	CAP_ED	effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen, and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domains similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels.  Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels	115
119409	cd00039	COLIPASE	Colipase; a stoichiometric cofactor for pancreatic lipase, allowing the enzyme to anchor itself to the water-lipid interface and stabilizing the active enzyme conformation 	90
238000	cd00040	CSF2	Granulocyte Macrophage Colony Stimulating Factor (GM-CSF) is a member of the large family of polypeptide growth factors called cytokines. It stimulates a wide variety of hematopoietic and nonhematopoietic cell types via binding to members of the cytokine receptor family, mainly the GM-CSF receptor.	121
238001	cd00041	CUB	CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast.	113
238002	cd00042	CY	Substituted updates: Jan 30, 2002	105
410207	cd00043	CYCLIN_SF	Cyclin box fold superfamily. The cyclin box is a protein binding domain that functions in cell-cycle and transcriptional control. It is about 100 amino acids in length, composed of five helices, and is present in cyclins, transcription initiation factor IIB (TFIIB), and retinoblastoma tumour suppressor protein (Rb). Cyclins consist of 8 classes of cell cycle regulators that function as regulatory subunits of cyclin-dependent kinases (CDKs), which are serine/threonine kinases. The catalytic activities of CDKs are modulated not only by their interactions with cyclins but also by CDK inhibitors (CKIs). CDKs, cyclins and CKIs play key roles in transcription, epigenetic regulation, metabolism, stem cell self-renewal, neuronal functions, and spermatogenesis. TFIIB is a transcription factor that binds the TATA box. Members in this superfamily contain one or two copies of the cyclin box.	82
238004	cd00044	CysPc	Calpains, domains IIa, IIb; calcium-dependent cytoplasmic cysteine proteinases, papain-like. Functions in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction.	315
260016	cd00045	DED	Death Effector Domain: a protein-protein interaction domain. Death Effector Domains comprise a subfamily of the Death Domain (DD) superfamily. DED-containing proteins include Fas-Associated via Death Domain (FADD), Astrocyte phosphoprotein PEA-15, the initiator caspases (caspase-8 and -10), and FLICE-inhibitory protein (FLIP), among others. These proteins are prominent components of the programmed cell death (apoptosis) pathway. Some members also have non-apoptotic functions such as regulation of insulin signaling (DEDD and PEA15) and cell cycle progression (DEDD). DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and CARD (Caspase activation and recruitment domain). They serve as adaptors in signaling pathways and they can recruit other proteins into signaling complexes.	77
350668	cd00046	SF2-N	N-terminal DEAD/H-box helicase domain of superfamily 2 helicases. The DEAD/H-like superfamily 2 helicases comprise a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This N-terminal domain contains the ATP-binding region.	146
350343	cd00047	PTPc	catalytic domain of protein tyrosine phosphatases. Protein tyrosine phosphatases (PTP, EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG, and its members are characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins; only one copy may be active.	200
380679	cd00048	DSRM_SF	double-stranded RNA binding motif (DSRM) superfamily. DSRM (also known as dsRBM) is a 65-70 amino acid domain that adopts an alpha-beta-beta-beta-alpha fold. It is not sequence specific, but highly specific for double-stranded RNAs (dsRNAs) of various origin and structure. The DSRM domains are found in a variety of proteins including dsRNA dependent protein kinase PKR, RNA helicases, Drosophila Staufen protein, E. coli RNase III, RNase H1, and dsRNA dependent adenosine deaminases. They are involved in numerous cellular mechanisms ranging from localization and transport of messenger RNAs, through maturation and degradation of RNAs, to viral response and signal transduction. Some members harbor tandem DSRMs that act in small RNA biogenesis.	57
199811	cd00049	MH1	N-terminal Mad Homology 1 (MH1) domain. The MH1 is a small DNA-binding domain present in SMAD (small mothers against decapentaplegic) family of proteins, which are signal transducers and transcriptional modulators that mediate multiple signaling pathways. MH1 binds to the DNA major groove in an unusual manner via a beta hairpin structure.  It negatively regulates the functions of the MH2 domain, the C-terminal domain of SMAD. Receptor-regulated SMAD proteins (R-SMADs, including SMAD1, SMAD2, SMAD3, SMAD5, and SMAD9) are activated by phosphorylation by transforming growth factor (TGF)-beta type I receptors. The active R-SMAD associates with a common mediator SMAD (Co-SMAD or SMAD4) and other cofactors, which together translocate to the nucleus to regulate gene expression. The inhibitory or antagonistic SMADs (I-SMADs, including SMAD6 and SMAD7) negatively regulate TGF-beta signaling by competing with R-SMADs for type I receptor or Co-SMADs. MH1 domains of R-SMAD and SMAD4 contain a nuclear localization signal as well as DNA-binding activity. The activated R-SMAD/SMAD4 complex then binds with very low affinity to a DNA sequence CAGAC called SMAD-binding element (SBE) via the MH1 domain.	121
199819	cd00050	MH2	C-terminal Mad Homology 2 (MH2) domain. The MH2 domain is found in the SMAD (small mothers against decapentaplegic) family of proteins and is responsible for type I receptor interactions, phosphorylation-triggered homo- and hetero-oligomerization, and transactivation. It is negatively regulated by the N-terminal MH1 domain which prevents it from forming a complex with SMAD4. The MH2 domain is multifunctional and provides SMADs with their specificity and selectivity, as well as transcriptional activity. Several transcriptional co-activators and repressors have also been reported to regulate SMAD signaling by interacting with the MH2 domain. Mutations in the MH2 domains of SMAD2 and especially SMAD4 have been detected  in colorectal and other human cancers.	170
238008	cd00051	EFh	EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers.	63
238009	cd00052	EH	Eps15 homology domain; found in proteins implicated in endocytosis, vesicle transport, and signal transduction. The alignment contains a pair of EF-hand motifs, typically one of them is canonical and binds to Ca2+, while the other may not bind to Ca2+. A hydrophobic binding pocket is formed by residues from both EF-hand motifs. The EH domain binds to proteins containing NPF (class I), [WF]W or SWG (class II), or H[TS]F (class III) sequence motifs.	67
238010	cd00053	EGF	Epidermal growth factor domain, found in epidermal growth factor (EGF) and present in a large number of proteins, mostly animal; the list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied; the functional significance of EGF-like domains in what appear to be unrelated proteins is not yet clear; a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase); the domain includes six cysteine residues which have been shown to be involved in disulfide bonds; the main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet; subdomains between the conserved cysteines vary in length; the region between the 5th and 6th cysteine contains two conserved glycines of which at least one is present in most EGF-like domains; a subset of these bind calcium.	36
238011	cd00054	EGF_CA	Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.	38
238012	cd00055	EGF_Lam	Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth, migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d), the first three of which resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable, ranging from 3 up to 22 copies	50
238013	cd00056	ENDO3c	endonuclease III; includes endonuclease III (DNA-(apurinic or apyrimidinic site) lyase), alkylbase DNA glycosidases (Alka-family) and other DNA glycosidases	158
238014	cd00057	FA58C	Substituted updates: Jan 31, 2002	143
238015	cd00058	FGF	Acidic and basic fibroblast growth factor family; FGFs are mitogens, which stimulate growth or differentiation of cells of mesodermal or neuroectodermal origin. The family plays essential roles in patterning and differentiation during vertebrate embryogenesis, and has neurotrophic activities. FGFs have a high affinity for heparan sulfate proteoglycans and require heparan sulfate to activate one of four cell surface FGF receptors. Upon binding to FGF, the receptors dimerize and their intracellular tyrosine kinase domains become active. FGFs have internal pseudo-threefold symmetry (beta-trefoil topology).	123
410788	cd00059	FH_FOX	Forkhead (FH) domain found in Forkhead box (FOX) family of transcription factors and similar proteins. The FOX family comprises diverse tissue- and cell type-specific transcription factors with an evolutionarily conserved "Forkhead (FH)" or "winged helix" DNA-binding domain. FH is named for the Drosophila fork head protein, a transcription factor which promotes terminal rather than segmental development. The structure of the FH domain contains 2 flexible loops or "wings" in the C-terminal region, hence the term winged helix. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'. They participate in a variety of cellular processes, such as cell cycle progression, proliferation, differentiation, migration, metabolism, and DNA damage response. Their expression can be regulated by multiple factors, and they can act as co-activators and/or transcriptional repressors. Fifty FOX-encoding genes in humans have been categorized into 19 subfamilies based on protein sequence homology (FOXA to FOXS).	75
238017	cd00060	FHA	Forkhead associated domain (FHA); found in eukaryotic and prokaryotic proteins. Putative nuclear signalling domain. FHA domains may bind phosphothreonine, phosphoserine and sometimes phosphotyrosine. In eukaryotes, many FHA domain-containing proteins localize to the nucleus, where they participate in establishing or maintaining cell cycle checkpoints, DNA repair, or transcriptional regulation. Members of the FHA family include: Dun1, Rad53, Cds1, Mek1, KAPP (kinase-associated protein phosphatase), and Ki-67 (a human nuclear protein related to cell proliferation).	102
238018	cd00061	FN1	Fibronectin type 1 domain, approximately 40 residue long with two conserved disulfide bridges. FN1 is one of three types of internal repeats which combine to form larger domains within fibronectin. Fibronectin, a plasma protein that binds cell surfaces and various compounds including collagen, fibrin, heparin, DNA, and actin, usually exists as a dimer in plasma and as an insoluble multimer in extracellular matrices.  Dimers of nearly identical subunits are linked by a disulfide bond close to their C-terminus. FN1 domains also found in coagulation factor XII, HGF activator, and tissue-type plasminogen activator. In tissue plasminogen activator, FN1 domains may form functional fibrin-binding units with EGF-like domains C-terminal to FN1.	43
238019	cd00062	FN2	Fibronectin Type II domain: FN2 is one of three types of internal repeats which combine to form larger domains within fibronectin. Fibronectin, a plasma protein that binds cell surfaces and various compounds including collagen, fibrin, heparin, DNA, and actin, usually exists as a dimer in plasma and as an insoluble multimer in extracellular matrices. Dimers of nearly identical subunits are linked by a disulfide bond close to their C-terminus. Fibronectin is composed of 3 types of modules, FN1, FN2 and FN3. The collagen binding domain contains four FN1 and two FN2 repeats.	48
238020	cd00063	FN3	Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases.	93
238021	cd00064	FU	Furin-like repeats. Cysteine rich region. Exact function of the domain is not known. Furin is a serine-kinase dependent proprotein processor. Other members of this family include endoproteases and cell surface receptors.	49
277249	cd00065	FYVE_like_SF	FYVE domain like superfamily. FYVE domain is a 60-80 residue double zinc finger motif-containing module named after the four proteins, Fab1, YOTB, Vac1, and EEA1. The canonical FYVE domains are distinguished from other zinc fingers by three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif, which form a compact phosphatidylinositol 3-phosphate (PtdIns3P, also termed PI3P)-binding site. They are found in many membrane trafficking regulators, including EEA1, Hrs, Vac1p, Vps27p, and FENS-1, which locate to early endosomes, specifically bind PtdIns3P, and play important roles in vesicular traffic and in signal transduction. Some proteins, such as rabphilin-3A and alpha-Rab3-interacting molecules (RIMs), are also involved in membrane trafficking and bind to members of the Rab subfamily of GTP hydrolases. However, they contain FYVE-related domains that are structurally similar to the canonical FYVE domains but lack the three signature sequences. At this point, they may not bind to phosphoinositides. In addition, this superfamily also contains the third group of proteins, caspase-associated ring proteins CARP1 and CARP2. They do not localize to membranes in the cell and are involved in the negative regulation of apoptosis, specifically targeting two initiator caspases, caspase 8 and caspase 10, which are distinguished from other FYVE-type proteins. Moreover, these proteins have an altered sequence in the basic ligand binding patch and lack the WxxD motif that is conserved only in phosphoinositide binding FYVE domains. Thus they constitute a family of unique FYVE-type domains called FYVE-like domains. The FYVE domain is structurally similar to the RING domain and the PHD finger. This superfamily also includes ADDz zinc finger domain, which is a PHD-like zinc finger motif that contains two parts, a C2-C2 and a PHD-like zinc finger.	52
206639	cd00066	G-alpha	Alpha subunit of G proteins (guanine nucleotide binding). The alpha subunit of G proteins contains the guanine nucleotide binding site. The heterotrimeric GNP-binding proteins are signal transducers that communicate signals from many hormones, neurotransmitters, chemokines, and autocrine and paracrine factors. Extracellular signals are received by receptors, which activate the G proteins, which in turn route the signals to several distinct intracellular signaling pathways. The alpha subunit of G proteins is a weak GTPase. In the resting state, heterotrimeric G proteins are associated at the cytosolic face of the plasma membrane and the alpha subunit binds to GDP. Upon activation by a receptor GDP is replaced with GTP, and the G-alpha/GTP complex dissociates from the beta and gamma subunits. This results in activation of downstream signaling pathways, such as cAMP synthesis by adenylyl cyclase, which is terminated when GTP is hydrolyzed and the heterotrimers reconstitute.	315
238023	cd00067	GAL4	GAL4-like Zn2Cys6 binuclear cluster DNA-binding domain; found in transcription regulators like GAL4. Domain consists of two helices organized around a Zn(2)Cys(6) motif; binds to sequences containing 2 DNA half sites comprised of 3-5 C/G combinations	36
238024	cd00068	GGL	G protein gamma subunit-like motifs, the alpha-helical G-gamma chain dimerizes with the G-beta propeller subunit as part of the heterotrimeric G-protein complex; involved in signal transduction via G-protein-coupled receptors	57
200450	cd00069	GHB_like	Glycoprotein hormone beta chain homologues. This family of cystine-knot hormones includes the beta chains of gonadotropins, thyrotropins, follitropins, choriogonadotropins and more. The members are reproductive hormones that consist of two glycosylated chains (alpha and beta), which form a tightly bound dimer.	96
238025	cd00070	GLECT	Galectin/galactose-binding lectin. This domain exclusively binds beta-galactosides, such as lactose, and does not require metal ions for activity. GLECT domains occur as homodimers or tandemly repeated domains. They are developmentally regulated and may be involved in differentiation, cell-cell interaction and cellular regulation.	127
238026	cd00071	GMPK	Guanosine monophosphate kinase (GMPK, EC 2.7.4.8), also known as guanylate kinase (GKase), catalyzes the reversible phosphoryl transfer from adenosine triphosphate (ATP) to guanosine monophosphate (GMP) to yield adenosine diphosphate (ADP) and guanosine diphosphate (GDP). It plays an essential role in the biosynthesis of guanosine triphosphate (GTP). This enzyme is also important for the activation of some antiviral and anticancer agents, such as acyclovir, ganciclovir, carbovir, and thiopurines.	137
238027	cd00072	GYF	GYF domain: contains conserved Gly-Tyr-Phe residues; Proline-binding domain in CD2-binding and other proteins. Involved in signaling lymphocyte activity. Also present in other unrelated proteins (mainly unknown) derived from diverse eukaryotic species.	57
238028	cd00073	H15	linker histone 1 and histone 5 domains; the basic subunit of chromatin is the nucleosome, consisting of an octamer of core histones, two full turns of DNA, a linker histone (H1 or H5) and a variable length of linker DNA; H1/H5 are chromatin-associated proteins that bind to the exterior of nucleosomes and dramatically stabilize the highly condensed states of chromatin fibers; stabilization of higher order folding occurs through electrostatic neutralization of the linker DNA segments, through a highly positively charged carboxy- terminal domain known as the AKP helix (Ala, Lys, Pro); thought to be involved in specific protein-protein and protein-DNA interactions and play a role in suppressing core histone tail domain acetylation in the chromatin fiber	88
238029	cd00074	H2A	Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins.	115
340391	cd00075	HATPase	Histidine kinase-like ATPase domain. This superfamily includes the histidine kinase-like ATPase (HATPase) domains of several ATP-binding proteins such as histidine kinase, DNA gyrase B, topoisomerases, heat shock protein 90 (HSP90), phytochrome-like ATPases and DNA mismatch repair proteins. Domains belonging to this superfamily are also referred to as GHKL (gyrase, heat-shock protein 90, histidine kinase, MutL) ATPase domains.	102
238031	cd00076	H4	Histone H4, one of the four histones, along with H2A, H2B and H3, which forms the eukaryotic nucleosome core; along with H3, it plays a central role in nucleosome formation; histones bind to DNA and wrap the genetic material into "beads on a string" in which DNA (the string) is wrapped around small blobs of histones (the beads) at regular intervals; play a role in the inheritance of specialized chromosome structures and the control of gene activity; defects in the establishment of proper chromosome structure by histones may activate or silence genes aberrantly and thus lead to disease;  the sequence of histone H4 has remained almost invariant in more than 2 billion years of evolution	85
238032	cd00077	HDc	Metal dependent phosphohydrolases with conserved 'HD' motif	145
238033	cd00078	HECTc	HECT domain; C-terminal catalytic domain of a subclass of Ubiquitin-protein ligase (E3). It binds specific ubiquitin-conjugating enzymes (E2), accepts ubiquitin from E2, transfers ubiquitin to substrate lysine side chains, and transfers additional ubiquitin molecules to the end of growing ubiquitin chains.	352
188616	cd00080	H3TH_StructSpec-5'-nucleases	H3TH domains of structure-specific 5' nucleases (or flap endonuclease-1-like) involved in DNA replication, repair, and recombination. The 5' nucleases of this superfamily are capable of both 5'-3' exonucleolytic activity and cleaving bifurcated or branched DNA, in an endonucleolytic, structure-specific manner, and are involved in DNA replication, repair, and recombination. The superfamily includes the H3TH (helix-3-turn-helix) domains of Flap Endonuclease-1 (FEN1), Exonuclease-1 (EXO1), Mkt1, Gap Endonuclease 1 (GEN1) and Xeroderma pigmentosum complementation group G (XPG) nuclease. Also included are the H3TH domains of the 5'-3' exonucleases of DNA polymerase I and single domain protein homologs, as well as, the bacteriophage T4 RNase H, T5-5'nuclease, and other homologs. These nucleases contain a PIN (PilT N terminus) domain with a helical arch/clamp region/I domain (not included here) and inserted within the C-terminal region of the PIN domain is an atypical helix-hairpin-helix-2 (HhH2)-like region. This atypical HhH2 region, the H3TH domain, has an extended loop with at least three turns between the first two helices, and only three of the four helices appear to be conserved. Both the H3TH domain and the helical arch/clamp region are involved in DNA binding. Studies suggest that a glycine-rich loop in the H3TH domain contacts the phosphate backbone of the template strand in the downstream DNA duplex. Typically, the nucleases within this superfamily have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (i.e., Mg2+, Mn2+, Zn2+, or Co2+) required for nuclease activity. The first metal binding site is composed entirely of Asp/Glu residues from the PIN domain, whereas, the second metal binding site is composed generally of two Asp residues from the PIN domain and one or two Asp residues from the H3TH domain. Together with the helical arch and network of amino acids interacting with metal binding ions, the H3TH region defines a positively charged active-site DNA-binding groove in structure-specific 5' nucleases.	71
238035	cd00081	Hint	Hedgehog/Intein domain, found in Hedgehog proteins as well as proteins which contain inteins and undergo protein splicing (e.g. DnaB, RIR1-2, GyrA and Pol). In protein splicing an intervening polypeptide sequence - the intein - is excised from a protein, and the flanking polypeptide sequences - the exteins - are joined by a peptide bond. In addition to the autocatalytic splicing domain, many inteins contain an inserted endonuclease domain, which plays a role in spreading inteins. Hedgehog proteins are a major class of intercellular signaling molecules, which control inductive interactions during animal development. The mature signaling forms of hedgehog proteins are the N-terminal fragments, which are covalently linked to cholesterol at their C-termini. This modification is the result of an autoprocessing step catalyzed by the C-terminal fragments, which are aligned here.	136
119399	cd00082	HisKA	Histidine Kinase A (dimerization/phosphoacceptor) domain; Histidine Kinase A dimers are formed through parallel association of 2 domains creating 4-helix bundles; usually these domains contain a conserved His residue and are activated via trans-autophosphorylation by the catalytic domain of the histidine kinase. They subsequently transfer the phosphoryl group to the Asp acceptor residue of a response regulator protein. Two-component signalling systems, consisting of a histidine protein kinase that senses a signal input and a response regulator that mediates the output, are ancient and evolutionarily conserved signaling mechanisms in prokaryotes and eukaryotes.	65
381392	cd00083	bHLH_SF	basic Helix Loop Helix (bHLH) domain superfamily. bHLH proteins are transcriptional regulators that are found in organisms from yeast to humans. Members of the bHLH superfamily have two highly conserved and functionally distinct regions. The basic part is at the amino end of the bHLH that may bind DNA to a consensus hexanucleotide sequence known as the E box (CANNTG). Different families of bHLH proteins recognize different E-box consensus sequences. At the carboxyl-terminal end of the region is the HLH region that interacts with other proteins to form homo- and heterodimers. bHLH proteins function as a diverse set of regulatory factors because they recognize different DNA sequences and dimerize with different proteins. The bHLH proteins can be divided to cell-type specific and widely expressed proteins. The cell-type specific members of bHLH superfamily are involved in cell-fate determination and act in neurogenesis, cardiogenesis, myogenesis, and hematopoiesis.	46
238037	cd00084	HMG-box	High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions.	66
238038	cd00085	HNHc	HNH nucleases; HNH endonuclease signature which is found in viral, prokaryotic, and eukaryotic proteins. The alignment includes members of the large group of homing endonucleases, yeast intron 1 protein, MutS, as well as bacterial colicins, pyocins, and anaredoxins.	57
238039	cd00086	homeodomain	Homeodomain;  DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner.	59
238040	cd00087	FReD	Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation.	215
238041	cd00088	HPT	Histidine Phosphotransfer domain, involved in signalling through two-component systems in which an autophosphorylating histidine protein kinase serves as a phosphoryl donor to a response regulator protein; the response regulator protein is modulated by phosphorylation and dephosphorylation of a conserved aspartic acid residue; two-component proteins are abundant in most eubacteria; in E. coli there are 62 two-component proteins involved in a variety of processes such as chemotaxis, osmoregulation, metabolism and transport; also present in both Gram positive and Gram negative pathogenic bacteria where they regulate basic housekeeping functions and control expression of toxins and other proteins important for pathogenesis; in archaea and eukaryotes, two-component pathways constitute a very small number of all signaling systems; in fungi they mediate environmental stress responses and, in pathogenic yeast, hyphal development; in Dictyostelium and in plants, they are involved in important processes such as osmoregulation, cell growth, and differentiation; to date two-component proteins have not been identified in animals; in most prokaryotic systems, the output response is effected directly by the response regulator (RR), which functions as a transcription factor, while in eukaryotic systems, two-component proteins are found at the beginning of signaling pathways where they interface with more conventional eukaryotic signaling strategies such as MAP kinase and cyclic nucleotide cascades	94
212008	cd00089	HR1	Protein kinase C-related kinase homology region 1 (HR1) domain that binds Rho family small GTPases. The HR1 domain, also called the ACC (anti-parallel coiled-coil) finger domain or Rho-binding domain binds small GTPases from the Rho family. It is found in Rho effector proteins including PKC-related kinases such as vertebrate PRK1 (or PKN) and yeast PKC1 protein kinases C, as well as in rhophilins and Rho-associated kinase (ROCK). Rho family members function as molecular switches, cycling between inactive and active forms, controlling a variety of cellular processes. HR1 domains may occur in repeat arrangements (PKN contains three HR1 domains), separated by a short linker region.	68
238042	cd00090	HTH_ARSR	Arsenical Resistance Operon Repressor and similar prokaryotic, metal regulated homodimeric repressors. ARSR subfamily of helix-turn-helix bacterial transcription regulatory proteins (winged helix topology). Includes several proteins that appear to dissociate from DNA in the presence of metal ions.	78
238043	cd00091	NUC	DNA/RNA non-specific endonuclease; prokaryotic and eukaryotic double- and single-stranded DNA and RNA endonucleases also present in phosphodiesterases. They exist as monomers and homodimers.	241
238044	cd00092	HTH_CRP	helix_turn_helix, cAMP Regulatory protein C-terminus; DNA binding domain of prokaryotic regulatory proteins belonging to the catabolite activator protein family.	67
238045	cd00093	HTH_XRE	Helix-turn-helix XRE-family like proteins. Prokaryotic DNA binding proteins belonging to the xenobiotic response element family of transcriptional regulators.	58
238046	cd00094	HX	Hemopexin-like repeats; Hemopexin is a heme-binding protein that transports heme to the liver. Hemopexin-like repeats occur in vitronectin and some members of the matrix metalloproteinase family (matrixins). The HX repeats of some matrixins bind tissue inhibitor of metalloproteinases (TIMPs). This CD contains 4 instances of the repeat.	194
238047	cd00095	IFab	Interferon alpha, beta. Includes also interferon omega and tau. Different from interferon gamma family. Type I interferons (alpha, beta) belong to the larger helical cytokine superfamily, which includes growth hormones, interleukins, several colony-stimulating factors and several other regulatory molecules. All function as regulators of cellular activity by interacting with cell-surface receptors and activating various signalling pathways. Interferons produce antiviral and antiproliferative responses in cells. Receptor specificity determines function of the various members of the family.	152
409353	cd00096	Ig	Immunoglobulin domain. The members here are composed of the immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, including T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, including butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond. Ig superfamily (IgSF) domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. Typically, the V-set domains have A, B, E, and D strands in one sheet and A', G, F, C, C' and C" in the other. The structures in C1-set are smaller than those in the V-set; they have one beta sheet that is formed by strands A, B, E, and D and the other by strands G, F, C, and C'. Moreover, a C1-set Ig domain contains a short C' strand (three residues) and lacks A' and C" strand. Unlike other Ig domain sets, C2-set structures do not have a D strand. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand.	70
409354	cd00098	IgC1	Immunoglobulin Constant-1 (C1)-set domain. The members here are composed of C1-set domains, classical Ig-like domains resembling the antibody constant domain. Members of the IgC1 family are components of immunoglobulin, T-cell receptors, CD1 cell surface glycoproteins, secretory glycoproteins A/C, and major histocompatibility complex (MHC) class I/II molecules. In immunoglobulins, each chain is composed of one variable domain (IgV) and one or more IgC domains. These names reflect the fact that the variability in sequences is higher in the variable domain than in the constant domain. The IgV domain is responsible for antigen binding, while the IgC domain is involved in oligomerization and molecular interactions. The structures in C1-set are smaller than those in the V-set; they have one beta sheet that is formed by strands A, B, E, and D and the other strands by G, F, C, and C'.	95
409355	cd00099	IgV	Immunoglobulin variable domain (IgV). The members here are composed of the immunoglobulin variable domain (IgV). The IgV family contains the standard Ig superfamily V-set AGFCC'C"/DEB domain topology, and its members are components of immunoglobulin (Ig) and T cell receptors. The basic structure of Ig molecules is a tetramer of two light chains and two heavy chains linked by disulfide bonds. In Ig, each chain is composed of one variable domain (IgV) and one or more constant domains (IgC); these names reflect the fact that the variability in sequences is higher in the variable domain than in the constant domain. Within the variable domain, there are regions of even more variability called the hypervariable or complementarity-determining regions (CDRs) which are responsible for antigen binding. A predominant feature of most Ig domains is the disulfide bridge connecting 2 beta-sheets with a tryptophan residue packed against the disulfide bond. Ig superfamily (IgSF) domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. Typically, the V-set domains have A, B, E, and D strands in one sheet and A', G, F, C, C', and C" strands in the other.	111
238048	cd00100	IL1	Interleukin-1 homologs; Cytokines with various biological functions. Interleukin 1 alpha and beta are also known as hematopoietin and catabolin. This family also contains interleukin-1 receptor antagonists (inhibitors).	144
238049	cd00101	IlGF_like	Insulin/insulin-like growth factor/relaxin family; insulin family of proteins. Members include a number of active peptides which are evolutionary related including insulin, relaxin, prorelaxin, insulin-like growth factors I and II, mammalian Leydig cell-specific insulin-like peptide (gene INSL3), early placenta insulin-like peptide (ELIP; gene INSL4), insect prothoracicotropic hormone (bombyxin), locust insulin-related peptide (LIRP), molluscan insulin-related peptides 1 to 5 (MIP), and C. elegans insulin-like peptides. Typically, the active forms of these peptide hormones are composed of two chains (A and B) linked by two disulfide bonds; the arrangement of four cysteines is conserved in the "A" chain: Cys1 is linked by a disulfide bond to Cys3, Cys2 and Cys4 are linked by interchain disulfide bonds to cysteines in the "B" chain. This alignment contains both chains, plus the intervening linker region, arranged as found in the propeptide form. Propeptides are cleaved to yield two separate chains linked covalently by the two disulfide bonds.	41
238050	cd00102	IPT	Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated T cells (NFAT) transcription factors which are mainly monomers.	89
238051	cd00103	IRF	Interferon Regulatory Factor (IRF); also known as tryptophan pentad repeat. The family of IRF transcription factors is important in the regulation of interferons in response to infection by virus and in the regulation of interferon-inducible genes. The IRF family is characterized by a unique 'tryptophan cluster' DNA-binding region. Viral IRFs bind to cellular IRFs; block type I and II interferons and host IRF-mediated transcriptional activation.	107
238052	cd00104	KAZAL_FS	Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as BM-40 or osteonectin, the Gallus gallus Flik protein, as well as agrin, which has a long array of FS domains. The Kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD.	41
411802	cd00105	KH-I	K homology (KH) RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include an N-terminal extension and type I KH domains (e.g. hnRNP K) contain a C-terminal extension. Some KH-I superfamily members contain a divergent KH domain that lacks the RNA-binding GXXG motif. Some others have a mutated GXXG motif which may or may not have nucleic acid binding ability.	63
276812	cd00106	KISc	Kinesin motor domain. Kinesin motor domain. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. In most kinesins, the motor domain is found at the N-terminus (N-type); in some it is found in the middle (M-type) or at the C-terminus (C-type). N-type and M-type kinesins are (+) end-directed motors, while C-type kinesins are (-) end-directed motors, i.e. they transport cargo towards the (-) end of the microtubule. Kinesin motor domains hydrolyze ATP at a rate of about 80 per second, and move along the microtubule at a speed of about 6400 Angstroms per second. To achieve that, kinesin head groups work in pairs. Upon replacing ADP with ATP, a kinesin motor domain increases its affinity for microtubule binding and locks in place. Also, the neck linker binds to the motor domain, which repositions the other head domain through the coiled-coil domain close to a second tubulin dimer, about 80 Angstroms along the microtubule. Meanwhile, ATP hydrolysis takes place, and when the second head domain binds to the microtubule, the first domain again replaces ADP with ATP, triggering a conformational change that pulls the first domain forward.	326
238055	cd00107	Knot1	The "knottin" fold is a stable cysteine-rich scaffold, in which one disulfide bridge crosses the macrocycle made by two other disulfide bridges and the connecting backbone segments. Members include plant lectins/antimicrobial peptides, plant proteinase/amylase inhibitors, plant gamma-thionins, and arthropod defensins.	33
238056	cd00108	KR	Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides.	83
238057	cd00109	KU	BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure.	54
238058	cd00110	LamG	Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules.	151
238059	cd00111	Trefoil	P or trefoil or TFF domain; Trefoil factor family domain peptides are mucin-associated molecules, largely found in epithelia of gastrointestinal tissues. Function is not known but it was originally identified from mucosal tissues, where it may have a regulatory or structural role and has also been implicated as a growth factor in other tissues. The domain is found in 1 to 6 copies where it occurs.	44
238060	cd00112	LDLa	Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure	35
238061	cd00113	PLAT	PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates.	116
238062	cd00114	LIGANc	NAD+ dependent DNA ligase adenylation domain. DNA ligases catalyze the crucial step of joining the breaks in duplex DNA during DNA replication, repair and recombination, utilizing either ATP or NAD(+) as a cofactor, but using the same basic reaction mechanism. The enzyme reacts with the cofactor to form a phosphoamide-linked AMP with the amino group of a conserved Lysine in the KXDG motif, and subsequently transfers it to the DNA substrate to yield adenylated DNA. This alignment contains members of the NAD+ dependent subfamily only.	307
319970	cd00115	LMWP	Low molecular weight phosphatase family. Substituted updates: Aug 22, 2001	137
238064	cd00116	LRR_RI	Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1).	319
238065	cd00117	LU	Ly-6 antigen / uPA receptor-like domain; occurs singly in GPI-linked cell-surface glycoproteins (Ly-6 family, CD59, thymocyte B cell antigen, Sgp-2) or as three-fold repeated domain in urokinase-type plasminogen activator receptor. Topology of these domains is similar to that of snake venom neurotoxins.	79
212030	cd00118	LysM	Lysin Motif is a small domain involved in binding peptidoglycan. LysM, a small globular domain with approximately 40 amino acids, is a widespread protein module involved in binding peptidoglycan in bacteria and chitin in eukaryotes. The domain was originally identified in enzymes that degrade bacterial cell walls, but proteins involved in many other biological functions also contain this domain. It has been reported that the LysM domain functions as a signal for specific plant-bacteria recognition in bacterial pathogenesis. Many of these enzymes are modular and are composed of catalytic units linked to one or several repeats of LysM domains. LysM domains are found in bacteria and eukaryotes.	45
340357	cd00119	LYZ	C-type lysozyme and alpha-lactalbumin. C-type lysozyme (chicken or conventional type, 1,4-beta-N-acetylmuramidase) and alpha-lactalbumin (lactose synthase B protein, LA). They have a close evolutionary relationship and similar tertiary structure, however, functionally they are quite different. Lysozymes have primarily bacteriolytic function; hydrolysis of peptidoglycans of prokaryotic cell walls and transglycosylation. LA is a calcium-binding metalloprotein that is expressed exclusively in the mammary gland during lactation. LA is the regulatory subunit of the enzyme lactose synthase. The association of LA with the catalytic component of lactose synthase, galactosyltransferase, alters the acceptor substrate specificity of this glycosyltransferase, facilitating biosynthesis of lactose. Some lysozymes have evolved into digestive enzymes, both in mammals and invertebrates.	122
238067	cd00120	MADS	MADS: MCM1, Agamous, Deficiens, and SRF (serum response factor) box family of eukaryotic transcriptional regulators. Binds DNA and exists as hetero- and homodimers. Composed of 2 main subgroups: SRF-like/Type I and MEF2-like (myocyte enhancer factor 2)/Type II. These subgroups differ mainly in position of the alpha 2 helix responsible for the dimerization interface. Important in homeotic regulation in plants and in immediate-early development in animals. Also found in fungi.	59
238068	cd00121	MATH	MATH (meprin and TRAF-C homology) domain; an independent folding unit with an eight-stranded beta-sandwich structure found in meprins, TRAFs and other proteins. Meprins comprise a class of extracellular metalloproteases which are anchored to the membrane and are capable of cleaving growth factors, extracellular matrix proteins, and biologically active peptides. TRAF molecules serve as adapter proteins that link cell surface receptors of the Tumor Necrosis Factor and Interleukin-1/Toll-like families to downstream kinase cascades, which results in the activation of transcription factors and the regulation of cell survival, proliferation and stress responses in the immune and inflammatory systems. Other members include the ubiquitin ligases, TRIM37 and SPOP, and the ubiquitin-specific proteases, HAUSP and Ubp21p. A large number of uncharacterized members mostly from lineage-specific expansions in C. elegans and rice contain MATH and BTB domains, similar to SPOP. The MATH domain has been shown to bind peptide/protein substrates in TRAFs and HAUSP. It is possible that the MATH domain in other members of this superfamily also interacts with various protein substrates. The TRAF domain may also be involved in the trimerization of TRAFs. Based on homology, it is postulated that the MATH domain in meprins may be involved in its tetramer assembly and that the MATH domain, in general, may take part in diverse modular arrangements defined by adjacent multimerization domains.	126
238069	cd00122	MBD	MeCP2, MBD1, MBD2, MBD3, MBD4, CLLD8-like, and BAZ2A-like proteins constitute a family of proteins that share the methyl-CpG-binding domain (MBD). The MBD consists of about 70 residues and is defined as the minimal region required for binding to methylated DNA by a methyl-CpG-binding protein which binds specifically to methylated DNA. The MBD can recognize a single symmetrically methylated CpG either as naked DNA or within chromatin.  MeCP2, MBD1 and MBD2 (and likely MBD3) form complexes with histone deacetylase and are involved in histone deacetylase-dependent repression of transcription. MBD4 is an endonuclease that forms a complex with the DNA mismatch-repair protein MLH1. The MBDs present in putative chromatin remodelling subunit, BAZ2A, and putative histone methyltransferase, CLLD8, represent two phylogenetically distinct groups within the MBD protein family.	62
238070	cd00123	DmpA_OAT	DmpA/OAT superfamily; composed of L-aminopeptidase D-amidase/D-esterase (DmpA), ornithine acetyltransferase (OAT) and similar proteins. DmpA is an aminopeptidase that releases N-terminal D and L amino acids from peptide substrates. This group represents one of the rare aminopeptidases that are not metalloenzymes. DmpA shows similarity in catalytic mechanism to N-terminal nucleophile (Ntn) hydrolases, which are enzymes that catalyze the cleavage of amide bonds through the nucleophilic attack of the side chain of an N-terminal serine, threonine, or cysteine. OAT catalyzes the first and fifth steps in arginine biosynthesis, coupling acetylation of glutamate with deacetylation of N-acetylornithine, which allows recycling of the acetyl group in the arginine biosynthetic pathway. The superfamily also contains an enzyme, endo-type 6-aminohexanoate-oligomer hydrolase, that has been shown to be involved in nylon degradation. Proteins in this superfamily undergo autocatalytic cleavage of an inactive precursor at the site immediately upstream to the catalytic nucleophile.	286
276950	cd00124	MYSc	Myosin motor domain superfamily. Myosin motor domain. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 	633
153091	cd00125	PLA2c	PLA2c: Phospholipase A2, a family of secretory and cytosolic enzymes; the latter are either Ca dependent or Ca independent. PLA2 cleaves the sn-2 position of the glycerol backbone of phospholipids (PC or phosphatidylethanolamine), usually in a metal-dependent reaction, to generate lysophospholipid (LysoPL) and a free fatty acid (FA). The resulting products are either dietary or used in synthetic pathways for leukotrienes and prostaglandins. Often, arachidonic acid is released as a free fatty acid and acts as second messenger in signaling networks. Secreted PLA2s have also been found to specifically bind to a variety of soluble and membrane proteins in mammals, including receptors. As a toxin, PLA2 is a potent presynaptic neurotoxin which blocks nerve terminals by binding to the nerve membrane and hydrolyzing stable membrane lipids. The products of the hydrolysis (LysoPL and FA) cannot form bilayers leading to a change in membrane conformation and ultimately to a block in the release of neurotransmitters. PLA2 may form dimers or oligomers.	115
238072	cd00126	PAH	Pancreatic Hormone domain, a regulator of pancreatic and gastrointestinal functions; neuropeptide Y (NPY), peptide YY (PYY), and pancreatic polypeptide (PP) are closely related; propeptide is enzymatically cleaved to yield the mature active peptide with amidated C-terminal ends; receptor binding and activation functions may reside in the N- and C-termini respectively; occurs in neurons, intestinal endocrine cells, and pancreas; exist as monomers and dimers	36
350200	cd00128	PIN_FEN1_EXO1-like	FEN-like PIN domains of Flap endonuclease-1 (FEN1)-like and exonuclease-1 (EXO1)-like nucleases, structure-specific, divalent-metal-ion dependent, 5' nucleases. PIN (PilT N terminus) domain of Flap endonuclease-1 (FEN1) and exonuclease-1 (EXO1)-like nucleases: FEN1, EXO1, Mkt1, Gap endonuclease 1 (GEN1) and Xeroderma pigmentosum complementation group G (XPG) nuclease. These nucleases are members of the structure-specific, 5' nuclease family (FEN-like) that catalyzes hydrolysis of DNA duplex-containing nucleic acid structures during DNA replication, repair, and recombination. Canonical members of the FEN-like family possess a PIN domain with a two-helical structure insert (also known as the helical arch, helical clamp or I domain) of variable length (approximately 16 to 800 residues), and at the C-terminus of the PIN domain a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included in this model) and the helical arch/clamp region are involved in DNA binding. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	162
238074	cd00129	PAN_APPLE	PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions.	80
238075	cd00130	PAS	PAS domain; PAS motifs appear in archaea, eubacteria and eukarya. Probably the most surprising identification of a PAS domain was that in EAG-like K+-channels. PAS domains have been found to bind ligands, and to act as sensors for light and oxygen in signal transduction.	103
238076	cd00131	PAX	Paired Box domain	128
238077	cd00132	CRIB	PAK (p21 activated kinase) Binding Domain (PBD), binds Cdc42p- and/or Rho-like small GTPases; also known as the Cdc42/Rac interactive binding (CRIB) motif; has been shown to inhibit transcriptional activation and cell transformation mediated by the Ras-Rac pathway. CRIB-containing effector proteins are functionally diverse and include serine/threonine kinases, tyrosine kinases, actin-binding proteins, and adapter molecules.	42
99904	cd00133	PTS_IIB	PTS_IIB: subunit IIB of enzyme II (EII) is the central energy-coupling domain of the phosphoenolpyruvate:carbohydrate phosphotransferase system (PTS). In the multienzyme PTS complex, EII is a carbohydrate-specific permease consisting of two cytoplasmic domains (IIA and IIB) and a transmembrane channel IIC domain. The IIB domain fold includes a central four-stranded parallel open twisted beta-sheet flanked by alpha-helices on both sides. The seven major PTS systems with this IIB fold include chitobiose/lichenan, ascorbate, lactose, galactitol, mannitol, fructose, and a sensory system with similarity to the bacterial bgl system. The PTS is found only in bacteria, where it catalyzes the transport and phosphorylation of numerous monosaccharides, disaccharides, polyols, amino sugars, and other sugar derivatives. The four proteins (domains) forming the PTS phosphorylation cascade (EI, HPr, EIIA, and EIIB), can phosphorylate or interact with numerous non-PTS proteins thereby regulating their activity.	84
238079	cd00135	PDGF	Platelet-derived and vascular endothelial growth factors (PDGF, VEGF) family domain; PDGF is a potent activator for cells of mesenchymal origin; PDGF-A and PDGF-B form AA and BB homodimers and an AB heterodimer; VEGF is a potent mitogen in embryonic and somatic angiogenesis with a unique specificity for vascular endothelial cells; VEGF forms homodimers and exists in 4 different isoforms; overall, the VEGF monomer resembles that of PDGF, but its N-terminal segment is helical rather than extended; the cysteine knot motif is a common feature of this domain	86
238080	cd00136	PDZ	PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either an N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95 (post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein.	70
176497	cd00137	PI-PLCc	Catalytic domain of prokaryotic and eukaryotic phosphoinositide-specific phospholipase C. This subfamily corresponds to the catalytic domain present in prokaryotic and eukaryotic phosphoinositide-specific phospholipase C (PI-PLC), which is a ubiquitous enzyme catalyzing the cleavage of the sn3-phosphodiester bond in the membrane phosphoinositides (phosphatidylinositol, PI; Phosphatidylinositol-4-phosphate, PIP; phosphatidylinositol 4,5-bisphosphate, PIP2) to yield inositol phosphates (inositol monophosphate, InsP; inositol diphosphate, InsP2; inositol trisphosphate, InsP3) and diacylglycerol (DAG). The higher eukaryotic PI-PLCs (EC 3.1.4.11) have a multidomain organization that consists of a PLC catalytic core domain, and various regulatory domains. They play a critical role in most signal transduction pathways, controlling numerous cellular events, such as cell growth, proliferation, excitation and secretion. These PI-PLCs strictly require Ca2+ for their catalytic activity. They display a clear preference towards the hydrolysis of the more highly phosphorylated PI-analogues, PIP2 and PIP, to generate two important second messengers, InsP3 and DAG. InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which then phosphorylates other molecules, leading to altered cellular activity. In contrast, bacterial PI-PLCs contain a single catalytic domain. Although their precise physiological function remains unclear, bacterial PI-PLCs may function as virulence factors in some pathogenic bacteria. They participate in Ca2+-independent PI metabolism. They are characterized as phosphatidylinositol-specific phospholipase C (EC 4.6.1.13) that selectively hydrolyze PI, not PIP or PIP2. The TIM-barrel type catalytic domain in bacterial PI-PLCs is very similar to the one in eukaryotic PI-PLCs, in which the catalytic domain is assembled from two highly conserved X- and Y-regions split by a divergent linker sequence. The catalytic mechanism of both prokaryotic and eukaryotic PI-PLCs is based on general base and acid catalysis utilizing two well conserved histidines, and consists of two steps, a phosphotransfer and a phosphodiesterase reaction. This superfamily also includes a distinctly different type of eukaryotic PLC, glycosylphosphatidylinositol-specific phospholipase C (GPI-PLC), an integral membrane protein characterized in the protozoan parasite Trypanosoma brucei. T. brucei GPI-PLC hydrolyzes the GPI-anchor on the variant specific glycoprotein (VSG), releasing dimyristyl glycerol (DMG), which may help the protozoan evade the host's immune system. It does not require Ca2+ for its activity and is more closely related to bacterial PI-PLCs than to mammalian PI-PLCs.	274
197200	cd00138	PLDc_SF	Catalytic domain of phospholipase D superfamily proteins. Catalytic domain of phospholipase D (PLD) superfamily proteins. The PLD superfamily is composed of a large and diverse group of proteins including plant, mammalian and bacterial PLDs, bacterial cardiolipin (CL) synthases, bacterial phosphatidylserine synthases (PSS), eukaryotic phosphatidylglycerophosphate (PGP) synthase, eukaryotic tyrosyl-DNA phosphodiesterase 1 (Tdp1), and some bacterial endonucleases (Nuc and BfiI), among others. PLD enzymes hydrolyze phospholipid phosphodiester bonds to yield phosphatidic acid and a free polar head group. They can also catalyze the transphosphatidylation of phospholipids to acceptor alcohols. The majority of members in this superfamily contain a short conserved sequence motif (H-x-K-x(4)-D, where x represents any amino acid residue), called the HKD signature motif. There are varying expanded forms of this motif in different family members. Some members contain variant HKD motifs. Most PLD enzymes are monomeric proteins with two HKD motif-containing domains. Two HKD motifs from two domains form a single active site. Some PLD enzymes have only one copy of the HKD motif per subunit but form a functionally active dimer, which has a single active site at the dimer interface containing the two HKD motifs from both subunits. Different PLD enzymes may have evolved through domain fusion of a common catalytic core with separate substrate recognition domains. Despite their various catalytic functions and a very broad range of substrate specificities, the diverse group of PLD enzymes can bind to a phosphodiester moiety. Most of them are active as bi-lobed monomers or dimers, and may possess similar core structures for catalytic activity. They are generally thought to utilize a common two-step ping-pong catalytic mechanism, involving an enzyme-substrate intermediate, to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group.	119
340436	cd00139	PIPKc	Phosphatidylinositol phosphate kinase (PIPK) catalytic domain family. The Phosphatidylinositol phosphate kinase (PIPK) catalytic domain family includes phosphatidylinositol 5-phosphate 4-kinases (PIP5Ks) and similar proteins. PIP5Ks catalyze the phosphorylation of phosphatidylinositol phosphate on the fourth or fifth hydroxyl of the inositol ring, to form phosphatidylinositol bisphosphate. The family includes type I and II PIP5Ks (-alpha, -beta, and -gamma) kinases. Signalling by phosphorylated species of phosphatidylinositol regulates secretion, vesicular trafficking, membrane translocation, cell adhesion, chemotaxis, DNA synthesis, and cell cycling.	253
238082	cd00140	beta_clamp	Beta clamp domain. The beta subunit (processivity factor) of DNA polymerase III holoenzyme, referred to as the beta clamp, forms a ring shaped dimer that encircles dsDNA (sliding clamp) in bacteria. The beta-clamp is structurally similar to the trimeric ring formed by PCNA (found in eukaryotes and archaea) and the processivity factor (found in bacteriophages T4 and RB69). This structural correspondence further substantiates the mechanistic connection between eukaryotic and prokaryotic DNA replication that has been suggested on biochemical grounds.	365
143386	cd00141	NT_POLXc	Nucleotidyltransferase (NT) domain of family X DNA Polymerases. X family polymerases fill in short gaps during DNA repair. They are relatively inaccurate enzymes and play roles in base excision repair, in non-homologous end joining (NHEJ) which acts mainly to repair damage due to ionizing radiation, and in V(D)J recombination. This family includes eukaryotic Pol beta, Pol lambda, Pol mu, and terminal deoxyribonucleotidyl transferase (TdT). Pol beta and Pol lambda are primarily DNA template-dependent polymerases. TdT is a DNA template-independent polymerase. Pol mu has both template dependent and template independent activities. This subgroup belongs to the Pol beta-like NT superfamily. In the majority of enzymes in this superfamily, two carboxylates, Dx[D/E], together with a third more distal carboxylate, coordinate two divalent metal cations involved in a two-metal ion mechanism of nucleotide addition. These three carboxylate residues are fairly well conserved in this family.	307
270621	cd00142	PI3Kc_like	Catalytic domain of Phosphoinositide 3-kinase and similar proteins. Members of the family include PI3K, phosphoinositide 4-kinase (PI4K), PI3K-related protein kinases (PIKKs), and TRansformation/tRanscription domain-Associated Protein (TRAPP). PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives, while PI4K catalyze the phosphorylation of the 4-hydroxyl of PtdIns. PIKKs are protein kinases that catalyze the phosphorylation of serine/threonine residues, especially those that are followed by a glutamine. PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. PI4Ks produce PtdIns(4)P, the major precursor to important signaling phosphoinositides. PIKKs have diverse functions including cell-cycle checkpoints, genome surveillance, mRNA surveillance, and translation control. The PI3K-like catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases.	216
238083	cd00143	PP2Cc	Serine/threonine phosphatases, family 2C, catalytic domain; The protein architecture and deduced catalytic mechanism of PP2C phosphatases are similar to the PP1, PP2A, PP2B family of protein Ser/Thr phosphatases, with which PP2C shares no sequence similarity.	254
277316	cd00144	MPP_PPP_family	phosphoprotein phosphatases of the metallophosphatase superfamily, metallophosphatase domain. The PPP (phosphoprotein phosphatase) family is one of two known protein phosphatase families specific for serine and threonine.  This family includes: PP1, PP2A, PP2B (calcineurin), PP4, PP5, PP6, PP7, Bsu1, RdgC, PrpE, PrpA/PrpB, and ApA4 hydrolase. The PPP catalytic domain is defined by three conserved motifs (-GDXHG-, -GDXVDRG- and -GNHE-).  The PPP enzyme family is ancient with members found in all eukaryotes, and in most bacterial and archeal genomes.  Dephosphorylation of phosphoserines and phosphothreonines on target proteins plays a central role in the regulation of many cellular processes.  PPPs belong to the metallophosphatase (MPP) superfamily.  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets.  This domain is thought to allow for productive metal coordination.	229
99912	cd00145	POLBc	DNA polymerase type-B family catalytic domain. DNA-directed DNA polymerases elongate DNA by adding nucleotide triphosphate (dNTP) residues to the 3'-end of the growing chain of DNA. DNA-directed DNA polymerases are multifunctional with both synthetic (polymerase) and degradative modes (exonucleases) and play roles in the processes of DNA replication, repair, and recombination. DNA-dependent DNA polymerases can be classified in six main groups based upon their phylogenetic relationships with E. coli polymerase I (class A), E. coli polymerase II (class B), E. coli polymerase III (class C), euryarchaeota polymerase II (class D), human polymerase beta (class X), E. coli UmuC/DinB, and eukaryotic RAP 30/Xeroderma pigmentosum variant (class Y).  Family B DNA polymerases include E. coli DNA polymerase II, some eubacterial phage DNA polymerases, nuclear replicative DNA polymerases (alpha, delta, epsilon, and zeta), and eukaryotic viral and plasmid-borne enzymes. DNA polymerase is made up of distinct domains and sub-domains. The polymerase domain of DNA polymerase type B (Pol domain) is responsible for the template-directed polymerization of dNTPs onto the growing primer strand of duplex DNA that is usually magnesium dependent. In general, the architecture of the Pol domain has been likened to a right hand with fingers, thumb, and palm sub-domains with a deep groove to accommodate the nucleic acid substrate. There are a few conserved motifs in the Pol domain of family B DNA polymerases. The conserved aspartic acid residues in the DTDS motifs of the palm sub-domain are crucial for binding to divalent metal ions and are suggested to be important for polymerase catalysis.	323
238084	cd00146	PKD	polycystic kidney disease I (PKD) domain; similar to other cell-surface modules, with an IG-like fold; domain probably functions as a ligand binding site in protein-protein or protein-carbohydrate interactions; a single instance of the repeat is presented here. The domain is also found in microbial collagenases and chitinases.	81
132835	cd00147	cPLA2_like	Cytosolic phospholipase A2, catalytic domain; hydrolyses arachidonyl phospholipids. Catalytic domain of cytosolic phospholipase A2 (PLA2; EC 3.1.1.4) hydrolyzes the sn-2-acyl ester bond of phospholipids to release arachidonic acid. At the active site, cPLA2 contains a serine nucleophile through which the catalytic mechanism is initiated. The active site is partially covered by a solvent-accessible flexible lid. cPLA2 displays interfacial activation as it exists in both "closed lid" and "open lid" forms. Movement of the cPLA2 lid possibly exposes a greater hydrophobic surface and the active site. cPLA2 belongs to the alpha-beta hydrolase family which is identified by a characteristic nucleophile elbow with a consensus sequence of Sm-X-Nu-Sm (Sm = small residue, X = any residue and Nu = nucleophile). Calcium is required for cPLA2 to bind with membranes or phospholipids. Group IV cPLA2 includes six intracellular enzymes: cPLA2alpha, cPLA2beta, cPLA2gamma, cPLA2delta, cPLA2epsilon, and cPLA2zeta.	438
238085	cd00148	PROF	Profilin binds actin monomers, membrane polyphosphoinositides such as PI(4,5)P2, and poly-L-proline. Profilin can inhibit actin polymerization into F-actin by binding to monomeric actin (G-actin) and terminal F-actin subunits, but - as a regulator of the cytoskeleton - it may also promote actin polymerization. It plays a role in the assembly of branched actin filament networks, by activating WASP via binding to WASP's proline rich domain. Profilin may link the cytoskeleton with major signalling pathways by interacting with components of the phosphatidylinositol cycle and Ras pathway.	127
119410	cd00150	PlantTI	Plant trypsin inhibitors such as squash trypsin inhibitor. Plant proteinase inhibitors play important roles in natural plant defense. Proteinase inhibitors from squash seeds form a uniform family of small proteins cross-linked with three disulfide bridges.	27
238086	cd00152	PTX	Pentraxins are plasma proteins characterized by their pentameric discoid assembly and their Ca2+ dependent ligand binding, such as Serum amyloid P component (SAP) and C-reactive Protein (CRP), which are cytokine-inducible acute-phase proteins implicated in innate immunity. CRP binds to ligands containing phosphocholine, SAP binds to amyloid fibrils, DNA, chromatin, fibronectin, C4-binding proteins and glycosaminoglycans. "Long" pentraxins have N-terminal extensions to the common pentraxin domain; one group, the neuronal pentraxins, may be involved in synapse formation and remodeling, and they may also be able to form heteromultimers.	201
340449	cd00153	RA_RalGDS_like	Ras-associating (RA) domain of RalGDS family. The RalGDS family RA domains can interact with activated Ras and may function as effectors for other Ras family. Ras proteins are small GTPases that are involved in cellular signal transduction. The RA domain has the beta-grasp ubiquitin-like (Ubl) fold with low sequence similarity to ubiquitin (Ub); Ub is a protein modifier in eukaryotes and is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair. The RalGDS family includes RalGDS, RGL, RGL2/Rlf and RGL3. All family members have similar domain structure: a central CDC25 homology domain with an upstream Ras Exchange motif (REM), and a C-terminal RA domain. The RA domain mediates the GTP-dependent interaction with Ras and Ras-related proteins.	88
206640	cd00154	Rab	Ras-related in brain (Rab) family of small guanosine triphosphatases (GTPases). Rab GTPases form the largest family within the Ras superfamily. There are at least 60 Rab genes in the human genome, and a number of Rab GTPases are conserved from yeast to humans. Rab GTPases are small, monomeric proteins that function as molecular switches to regulate vesicle trafficking pathways. The different Rab GTPases are localized to the cytosolic face of specific intracellular membranes, where they regulate distinct steps in membrane traffic pathways. In the GTP-bound form, Rab GTPases recruit specific sets of effector proteins onto membranes. Through their effectors, Rab GTPases regulate vesicle formation, actin- and tubulin-dependent vesicle movement, and membrane fusion. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which mask C-terminal lipid binding and promote cytosolic localization. While most unicellular organisms possess 5-20 Rab members, several have been found to possess 60 or more Rabs; for many of these Rab isoforms, homologous proteins are not found in other organisms. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Since crystal structures often lack C-terminal residues, the lipid modification site is not available for annotation in many of the CDs in the hierarchy, but is included where possible.	159
238087	cd00155	RasGEF	Guanine nucleotide exchange factor for Ras-like small GTPases. Small GTP-binding proteins of the Ras superfamily function as molecular switches in fundamental events such as signal transduction, cytoskeleton dynamics and intracellular trafficking. Guanine-nucleotide-exchange factors (GEFs) positively regulate these GTP-binding proteins in response to a variety of signals. GEFs catalyze the dissociation of GDP from the inactive GTP-binding proteins. GTP can then bind and induce structural changes that allow interaction with effectors.	237
381085	cd00156	REC	phosphoacceptor receiver (REC) domain of response regulators (RRs) and pseudo response regulators (PRRs). Two-component systems (TCSs) involving a sensor and a response regulator are used by bacteria to adapt to changing environments. Processes regulated by two-component systems in bacteria include sporulation, pathogenicity, virulence, chemotaxis, and membrane transport. Response regulators (RRs) share the common phosphoacceptor REC domain and different effector/output domains such as DNA, RNA, ligand-binding, protein-binding, or enzymatic domains. Response regulators regulate transcription, post-transcription or post-translation, or have functions such as methylesterases, adenylate or diguanylate cyclase, c-di-GMP-specific phosphodiesterases, histidine kinases, serine/threonine protein kinases, and protein phosphatases, depending on their output domains. The functions of some output domains are still unknown. TCSs are found in all three domains of life (bacteria, archaea, and eukaryotes); however, the presence and abundance of particular RRs vary between the lineages. Archaea encode very few RRs with DNA-binding output domains; most are stand-alone REC domains. Among eukaryotes, TCSs are found primarily in protozoa, fungi, algae, and green plants. REC domains function as phosphorylation-mediated switches within RRs, but some also transfer phosphoryl groups in multistep phosphorelays.	99
206641	cd00157	Rho	Ras homology family (Rho) of small guanosine triphosphatases (GTPases). Members of the Rho (Ras homology) family include RhoA, Cdc42, Rac, Rnd, Wrch1, RhoBTB, and Rop. There are 22 human Rho family members identified currently. These proteins are all involved in the reorganization of the actin cytoskeleton in response to external stimuli. They also have roles in cell transformation by Ras in cytokinesis, in focal adhesion formation and in the stimulation of stress-activated kinase. These various functions are controlled through distinct effector proteins and mediated through a GTP-binding/GTPase cycle involving three classes of regulating proteins: GAPs (GTPase-activating proteins), GEFs (guanine nucleotide exchange factors), and GDIs (guanine nucleotide dissociation inhibitors). Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Rho proteins. Since crystal structures often lack C-terminal residues, this feature is not available for annotation in many of the CDs in the hierarchy.	171
238089	cd00158	RHOD	Rhodanese Homology Domain (RHOD); an alpha beta fold domain found duplicated in the rhodanese protein. The cysteine containing enzymatically active version of the domain is also found in the Cdc25 class of protein phosphatases and a variety of proteins such as sulfide dehydrogenases and certain stress proteins such as senescence specific protein 1 in plants, PspE and GlpE in bacteria and cyanide and arsenate resistance proteins. Inactive versions (no active site cysteine) are also seen in dual specificity phosphatases, ubiquitin hydrolases from yeast and in sulfuryltransferases, where they are believed to play a regulatory role in multidomain proteins.	89
238090	cd00159	RhoGAP	RhoGAP: GTPase-activator protein (GAP) for Rho-like GTPases; GAPs towards Rho/Rac/Cdc42-like small GTPases. Small GTPases (G proteins) cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when bound to GDP. The Rho family of small G proteins, which includes Cdc42Hs, activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. G proteins generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. The RhoGAPs are one of the major classes of regulators of Rho G proteins.	169
238091	cd00160	RhoGEF	Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains.	181
238092	cd00161	RICIN	Ricin-type beta-trefoil; Carbohydrate-binding domain formed from presumed gene triplication. The domain is found in a variety of molecules serving diverse functions such as enzymatic activity, inhibitory toxicity and signal transduction. Highly specific ligand binding occurs on exposed surfaces of the compact domain structure.	124
319361	cd00162	RING_Ubox	The superfamily of RING finger (Really Interesting New Gene) domain and U-box domain. RING finger is a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc. It is defined by the "cross-brace" motif that chelates zinc atoms by eight amino acid residues, typically Cys or His, arranged in a characteristic spacing. Canonical RING motifs have been categorized as two major subclasses, RING-HC (C3HC4-type) and RING-H2 (C3H2C3-type), according to their Cys/His content. There are also many variants of RING fingers. Some have different Cys/His pattern. Some lack a single Cys or His residues at typical Zn ligand positions. Especially, the fourth or eighth zinc ligand is prevalently exchanged for an Asp, which can indeed chelate Zn in a RING finger as well. C4C4-, C3HC3D-, C2H2C4-, and C3HC5-type RING fingers are closely related to RING-HC finger. In contrast, C4HC3- (RING-CH alias RINGv), C3H3C2-, C3H2C2D-, C3DHC3-, and C4HC2H-type RING fingers are close to RING-H2 finger. However, not all RING finger-containing proteins display regular RING finger features, and the RING finger family has turned out to be multifarious. The degenerated RING fingers from Siz/PIAS RING (SP-RING) family proteins and sporulation protein RMD5, are characterized by lacking the second, fifth, and sixth Zn2+ ion-coordinating residues. They bind only one Zn2+ ion. On the other hand, the RING fingers of the human APC11 and RBX1 proteins can bind a third Zn atom since they harbor four additional Zn ligands. U-box is a modified form of the RING finger domain that lacks metal chelating Cys and His. It resembles the cross-brace RING structure consisting of three beta-sheets and a single alpha-helix, which would be stabilized by salt bridges instead of chelated metal ions. U-box proteins are widely distributed among eukaryotic organisms and show a higher prevalence in plants than in other organisms. RING finger/U-box-containing proteins are a group of diverse proteins with a variety of cellular functions, including oncogenesis, development, viral replication, signal transduction, the cell cycle and apoptosis. Many of them are ubiquitin-protein ligases (E3s) that serve as a scaffold for binding to ubiquitin-conjugating enzymes (E2s, also referred to as ubiquitin carrier proteins or UBCs) in close proximity to substrate proteins, which enables efficient transfer of ubiquitin from E2 to the substrates.	40
119386	cd00163	RNase_A	RNase A family, or Pancreatic RNases family; includes vertebrate RNase homologs to the bovine pancreatic ribonuclease A (RNase A). Many of these enzymes have special biological activities; for example, some stimulate the development of vascular endothelial cells, dendritic cells, and neurons, are cytotoxic/anti-tumoral and/or anti-pathogenic. RNase A is involved in endonucleolytic cleavage of 3'-phosphomononucleotides and 3'-phosphooligonucleotides ending in C-P or U-P with 2',3'-cyclic phosphate intermediates. The catalytic mechanism is a transphosphorylation of P-O 5' bonds on the 3' side of pyrimidines and subsequent hydrolysis to generate 3' phosphate groups. The RNase A family proteins have a conserved catalytic triad (two histidines and one lysine); recently some family members lacking the catalytic residues have been identified. They also share three or four disulfide bonds. The most conserved disulfide bonds are close to the N and C termini and contribute most significantly to the conformational stability. 8 RNase A homologs had initially been identified in the human genome, pancreatic RNase (RNase 1), Eosinophil Derived Neurotoxin (EDN/RNASE 2), Eosinophil Cationic Protein (ECP/RNase 3), RNase 4, Angiogenin (RNase 5), RNase 6 or k6, the skin derived RNase (RNase 7) and RNase 8. These eight human genes are all located in a cluster on chromosome 14. Recent genomic analysis has extended the family to 13 sequences. However only the first eight identified human RNases, which are referred to as "canonical" RNases, contain the catalytic residues required for RNase A activity. The new genes corresponding to RNases 9-13 are also located in the same chromosome cluster and seem to be related to male-reproductive functions. RNases 9-13 have the characteristic disulfide bridge pattern but are unlikely to share RNase activity. The RNase A family most likely started off in vertebrates as a host-defense protein, and comparative analysis in mammals and birds indicates that the family may have originated from a RNase 5-like gene. This hypothesis is supported by the fact that only RNase 5-like RNases have been reported outside the mammalian class. The RNase 5 group would therefore be the most ancient form of this family, and all other members would have arisen during mammalian evolution.	119
238094	cd00164	S1_like	S1_like: Ribosomal protein S1-like RNA-binding domain. Found in a wide variety of RNA-associated proteins. Originally identified in S1 ribosomal protein. This superfamily also contains the Cold Shock Domain (CSD), which is a homolog of the S1 domain. Both domains are members of the Oligonucleotide/oligosaccharide Binding (OB) fold.	65
238095	cd00165	S4	S4/Hsp/ tRNA synthetase RNA-binding domain; The domain surface is populated by conserved, charged residues that define a likely RNA-binding site;  Found in stress proteins, ribosomal proteins and tRNA synthetases; This may imply a hitherto unrecognized functional similarity between these three protein classes.	70
238096	cd00167	SANT	'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding domains. Tandem copies of the domain bind telomeric DNA tandem repeats as part of the capping complex. Binding is sequence dependent for repeats which contain the G/C rich motif [C2-3 A (CA)1-6]. The domain is also found in regulatory transcriptional repressor complexes where it also binds DNA.	45
349397	cd00168	CAP	CAP (cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins) domain family. The CAP (cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins) domain, also called SCP (sperm-coating glycoprotein), is found in eukaryotes and prokaryotes. This family includes plant pathogenesis-related protein 1 (PR-1), which accumulates after infections with pathogens, and may act as an anti-fungal agent or be involved in cell wall loosening. This family also includes CRISPs (cysteine-rich secretory proteins), which combine the CAP/SCP domain with a C-terminal cysteine rich domain, and allergen 5 from vespid venom. Roles for CRISP, in response to pathogens, fertilization, and sperm maturation have been proposed. One member, Tex31 from the venom duct of Conus textile, has been shown to possess proteolytic activity sensitive to serine protease inhibitors. The human GAPR-1 protein has been reported to dimerize, and such a dimer may form an active site containing a catalytic triad. CAP/SCP has also been proposed to be a Ca++ chelating serine protease. The Ca++-chelating function would fit with various signaling processes that members of this family, such as the CRISPs, are involved in, and is supported by sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how helothermine, a toxic peptide secreted by the beaded lizard, blocks Ca++ transporting ryanodine receptors. Little is known about the biological roles of the bacterial and archaeal CAP/SCP domains.	128
238098	cd00169	Chemokine	Chemokine: small cytokines, including a number of secreted growth factors and interferons involved in mitogenic, chemotactic, and inflammatory activity; distinguished from other cytokines by their receptors, which are G-protein coupled receptors; divided into 4 subfamilies based on the arrangement of the two N-terminal cysteines; some members can bind multiple receptors and many chemokine receptors can bind more than one chemokine; this redundancy allows precise control in stimulating the immune system and in contributing to the homeostasis of a cell; when expressed inappropriately, chemokines play a role in autoimmune diseases, vascular irregularities, graft rejection, neoplasia, and allergies; exist as monomers, dimers and multimers, but are believed to function as monomers; found only in vertebrates and a few viruses.  See CDs: Chemokine_CXC (cd00273), Chemokine_CC (cd00272), Chemokine_C (cd00271), and Chemokine_CX3C (cd00274) for chemokine subgroups.	59
238099	cd00170	SEC14	Sec14p-like lipid-binding domain. Found in secretory proteins, such as S. cerevisiae phosphatidylinositol transfer protein (Sec14p), and in lipid regulated proteins such as RhoGAPs, RhoGEFs and neurofibromin (NF1). SEC14 domain of Dbl is known to associate with G protein beta/gamma subunits.	157
238100	cd00171	Sec7	Sec7 domain; Domain named after the S. cerevisiae SEC7 gene product. The Sec7 domain is the central domain of the guanine-nucleotide-exchange factors (GEFs) of the ADP-ribosylation factor family of small GTPases (ARFs). It carries the exchange factor activity.	185
381000	cd00172	serpin	SERine Proteinase INhibitors (serpin) family. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	365
198173	cd00173	SH2	Src homology 2 (SH2) domain. In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1),  Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others.	79
212690	cd00174	SH3	Src Homology 3 domain superfamily. Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction.	51
238102	cd00175	SNc	Staphylococcal nuclease homologues. SNase homologues are found in bacteria, archaea, and eukaryotes. They contain no disulfide bonds.	129
238103	cd00176	SPEC	Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here	213
176851	cd00177	START	Lipid-binding START domain of mammalian STARD1-STARD15 and related proteins. This family includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD1-STARD15, and related domains, such as the START domain of the Arabidopsis homeobox protein GLABRA 2. The mammalian STARDs are grouped into 8 subfamilies. This family belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. For some members of this family, specific lipids that bind in this pocket are known; these include cholesterol (STARD1/STARD3/ STARD4/STARD5), 25-hydroxycholesterol (STARD5), phosphatidylcholine (STARD2/ STARD7/STARD10), phosphatidylethanolamine (STARD10) and ceramides (STARD11). The START domain is found either alone or in association with other domains. Mammalian STARDs participate in the control of various cellular processes including lipid trafficking between intracellular compartments, lipid metabolism, and modulation of signaling events. Mutation or altered expression of STARDs is linked to diseases such as cancer, genetic disorders, and autoimmune disease. The Arabidopsis homeobox protein GLABRA 2 suppresses root hair formation in hairless epidermal root cells.	193
238104	cd00178	STI	Soybean trypsin inhibitor (Kunitz) family of protease inhibitors. Inhibit proteases by binding with high affinity to their active sites. Trefoil fold, common to interleukins and fibroblast growth factors.	172
238105	cd00179	SynN	Syntaxin N-terminus domain; syntaxins are nervous system-specific proteins implicated in the docking of synaptic vesicles with the presynaptic plasma membrane; they are a family of receptors for intracellular transport vesicles; each target membrane may be identified by a specific member of the syntaxin family; syntaxins contain a moderately well conserved amino-terminal domain, called Habc, whose structure is an antiparallel three-helix bundle; a linker of about 30 amino acids connects this to the carboxy-terminal region, designated H3 (t_SNARE), of the syntaxin cytoplasmic domain; the highly conserved H3 region forms a single, long alpha-helix when it is part of the core SNARE complex and anchors the protein on the cytoplasmic surface of cellular membranes; H3 is not included in defining this domain	151
270622	cd00180	PKc	Catalytic domain of Protein Kinases. PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine or tyrosine residues on protein substrates. PKs make up a large family of serine/threonine kinases (STKs), protein tyrosine kinases (PTKs), and dual-specificity PKs that phosphorylate both serine/threonine and tyrosine residues of target proteins. The majority of protein phosphorylation occurs on serine residues while only 1% occurs on tyrosine residues. Protein phosphorylation is a mechanism by which a wide variety of cellular proteins, such as enzymes and membrane channels, are reversibly regulated in response to certain stimuli. PKs often function as components of signal transduction pathways in which one kinase activates a second kinase, which in turn, may act on other kinases; this sequential action transmits a signal from the cell surface to target proteins, which results in cellular responses. The PK family is one of the largest known protein families with more than 100 homologous yeast enzymes and more than 500 human proteins. A fraction of PK family members are pseudokinases that lack crucial residues for catalytic activity. The multiplicity of kinases allows for specific regulation according to substrate, tissue distribution, and cellular localization. PKs regulate many cellular processes including proliferation, division, differentiation, motility, survival, metabolism, cell-cycle progression, cytoskeletal rearrangement, immunity, and neuronal functions. Many kinases are implicated in the development of various human diseases including different types of cancer. The PK family is part of a larger superfamily that includes the catalytic domains of RIO kinases, aminoglycoside phosphotransferase, choline kinase, phosphoinositide 3-kinase (PI3K), and actin-fragmin kinase.	215
206638	cd00181	Tar_Tsr_LBD	ligand binding domain of Tar- and Tsr-related chemoreceptors. E.coli Tar (taxis to aspartate and repellents) and Tsr (taxis to serine and repellents) are homologous chemoreceptors that have a high specificity for aspartate and serine, respectively. Both are homodimeric receptors and contain an N-terminal periplasmic ligand binding domain, a transmembrane region, a HAMP domain and a C-terminal cytosolic signaling domain.	129
410312	cd00182	T-box	DNA-binding domain of the T-box transcription factor family. The T-box family is an ancient family of transcription factors which plays a multitude of diverse functions throughout development. The founding member of the family is Brachyury (also known as TBXT, or T). Members share a conserved DNA-binding domain (T-box) which binds DNA in a sequence-specific manner. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development, and conserved expression patterns. The T-box factors in Caenorhabditis elegans have evolved very differently than those in other organisms; its genome contains 22 T-box genes which encode factors which are diverse in DNA-binding specificity, function and sequence, and only 3 of these factors fall into the conserved T-box subfamilies.	176
238107	cd00183	TFIIS_I	N-terminal domain (domain I) of transcription elongation factor S-II (TFIIS); similar to a domain found in elongin A and CRSP70; likely to be involved in transcription; domain I from TFIIS interacts with RNA polymerase II holoenzyme	76
238108	cd00184	TNF	Tumor Necrosis Factor; TNF superfamily members include the cytokines: TNF (TNF-alpha), LT (lymphotoxin-alpha, TNF-beta), CD40 ligand, Apo2L (TRAIL), Fas ligand, and osteoprotegerin (OPG) ligand. These proteins generally have an intracellular N-terminal domain, a short transmembrane segment, an extracellular stalk, and a globular TNF-like extracellular domain of about 150 residues. They initiate apoptosis by binding to related receptors, some of which have intracellular death domains. They generally form homo- or heterotrimeric complexes. TNF cytokines bind one elongated receptor molecule along each of three clefts formed by neighboring monomers of the trimer, with ligand trimerization a requisite for receptor binding.	137
276900	cd00185	TNFRSF	Tumor necrosis factor receptor superfamily (TNFRSF). Members of TNFR superfamily (TNFRSF) interactions with TNF superfamily (TNFSF) ligands (TNFL) control key cellular processes such as differentiation, proliferation, apoptosis, and cell growth. Dysregulation of these pathways has been shown to result in a wide range of pathological conditions, including autoimmune diseases, inflammation, cancer, and viral infection. There are 29 very diverse family members of TNFRSF reported in humans: 22 are type I transmembrane receptors (single pass with the N terminus on extracellular side of the cell membrane) and have a clear signal peptide; the remaining 7 members are either type III transmembrane receptors (single pass with the N terminus on extracellular side of the membrane but no signal sequence; TNFR13B, TNFR13C, TNFR17, and XEDAR), or attached to the membrane via a glycosylphosphatidylinositol (GPI) linker (TNFR10C), or secreted as soluble receptors (TNFR11B and TNFR6B). All TNFRs contain relatively short cysteine-rich domains (CRDs) in the ectodomain, and are involved in interaction with the TNF homology domain (THD) of their ligands. TNFRs often have multiple CRDs (between one and six), with the most frequent configurations of three or four copies; most CRDs possess three disulfide bridges, but could have between one and four. Localized or genome-wide duplication and evolution of the TNFRSF members appear to have paralleled the emergence of the adaptive immune system; teleosts (i.e. ray-finned, bony fish), which possess an immune system with B and T cells, possess primary and secondary lymphoid organs, and are capable of adaptive responses to pathogens also display several characteristics that are different from the mammalian immune system, making teleost TNFSF orthologs and paralogs of interest to better understand immune system evolution and the immunological pathways elicited to pathogens.	87
238110	cd00186	TOP1Ac	DNA Topoisomerase, subtype IA; DNA-binding, ATP-binding and catalytic domain of bacterial DNA topoisomerases I and III, and eukaryotic DNA topoisomerase III and eubacterial and archaeal reverse gyrases. Topoisomerases cleave single or double stranded DNA and then rejoin the broken phosphodiester backbone. Proposed catalytic mechanism of single stranded DNA cleavage is by phosphoryl transfer through a tyrosine nucleophile using acid/base catalysis. Tyr is activated by a nearby group (not yet identified) acting as a general base for nucleophilic attack on the 5' phosphate of the scissile bond. Arg and Lys stabilize the pentavalent transition state. Glu then acts as a proton donor for the leaving 3'-oxygen, upon cleavage of the scissile strand.	381
238111	cd00187	TOP4c	DNA Topoisomerase, subtype IIA; domain A'; bacterial DNA topoisomerase IV (C subunit, ParC), bacterial DNA gyrases (A subunit, GyrA), mammalian DNA topoisomerases II. DNA topoisomerases are essential enzymes that regulate the conformational changes in DNA topology by catalysing the concerted breakage and rejoining of DNA strands during normal cellular growth.	445
173773	cd00188	TOPRIM	Topoisomerase-primase domain. This is a nucleotidyl transferase/hydrolase domain found in type IA, type IIA and type IIB topoisomerases, bacterial DnaG-type primases, small primase-like proteins from bacteria and archaea, OLD family nucleases from bacteria and archaea, and bacterial DNA repair proteins of the RecR/M family. This domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD). This glutamate and two aspartates cluster together to form a highly acidic surface patch. The conserved glutamate may act as a general base in nucleotide polymerization by primases and in strand joining in topoisomerases and, as a general acid in strand cleavage by topoisomerases and nucleases. The DXD motif may co-ordinate Mg2+, a cofactor required for full catalytic function.	83
238113	cd00190	Tryp_SPc	Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.	232
238114	cd00191	TY	Thyroglobulin type I repeats; the N-terminal region of human thyroglobulin contains 11 type-1 repeats. TY repeats are proposed to be inhibitors of cysteine proteases	66
270623	cd00192	PTKc	Catalytic domain of Protein Tyrosine Kinases. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. They can be classified into receptor and non-receptor tyr kinases. PTKs play important roles in many cellular processes including lymphocyte activation, epithelium growth and maintenance, metabolism control, organogenesis regulation, survival, proliferation, differentiation, migration, adhesion, motility, and morphogenesis. Receptor tyr kinases (RTKs) are integral membrane proteins which contain an extracellular ligand-binding region, a transmembrane segment, and an intracellular tyr kinase domain. RTKs are usually activated through ligand binding, which causes dimerization and autophosphorylation of the intracellular tyr kinase catalytic domain, leading to intracellular signaling. Some RTKs are orphan receptors with no known ligands. Non-receptor (or cytoplasmic) tyr kinases are distributed in different intracellular compartments and are usually multi-domain proteins containing a catalytic tyr kinase domain as well as various regulatory domains such as SH3 and SH2. PTKs are usually autoinhibited and require a mechanism for activation. In many PTKs, the phosphorylation of tyr residues in the activation loop is essential for optimal activity. Aberrant expression of PTKs is associated with many development abnormalities and cancers. The PTK family is part of a larger superfamily that includes the catalytic domains of serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	262
277192	cd00193	SNARE	SNARE motif. SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins consist of coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, Qb- and Qc-SNAREs are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles.	54
270455	cd00194	UBA_like_SF	UBA domain-like superfamily. The ubiquitin-associated (UBA) domain-like superfamily contains alpha-helical structural homology ubiquitin-binding domains, including UBA domains and coupling of ubiquitin conjugation to endoplasmic reticulum degradation (CUE) domains which share a common three-helical bundle architecture. UBA domains are commonly occurring sequence motifs found in proteins involved in ubiquitin-mediated proteolysis. They contribute to ubiquitin (Ub) binding or ubiquitin-like (UbL) domain binding. However, some kinds of UBA domains can only bind the UbL domain, but not the Ub domain. UBA domains are normally comprised of compact three-helix bundles which contain a conserved GF/Y-loop. They can bind polyubiquitin with high affinity. They also bind monoubiquitin and other proteins. Most UBA domain-containing proteins have one UBA domain, but some harbor two or three UBA domains. CUE domain containing proteins are characterized by an FP and a di-leucine-like sequence and bind to monoubiquitin with varying affinities. Some higher eukaryotic CUE domain proteins do not bind monoubiquitin efficiently, since they carry LP, rather than FP among CUE domains. This superfamily also includes many UBA-like domains found in AMP-activated protein kinase (AMPK) related kinases, the NXF family of mRNA nuclear export factors, elongation factor Ts (EF-Ts), nascent polypeptide-associated complex subunit alpha (NACA) and similar proteins. Although many UBA-like domains may have a conserved TG but not GF/Y-loop, they still show a high level of structural and sequence similarity with three-helical ubiquitin binding domains.	28
238117	cd00195	UBCc	Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3.  This pathway regulates many fundamental cellular processes.  There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD.	141
340450	cd00196	Ubiquitin_like_fold	Beta-grasp ubiquitin-like fold. Ubiquitin is a protein modifier that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair in eukaryotes. The ubiquitination process comprises a cascade of E1, E2 and E3 enzymes that results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. Ubiquitin-like proteins have similar ubiquitin beta-grasp fold and attach to other proteins in a ubiquitin-like manner but with biochemically distinct roles. Ubiquitin and ubiquitin-like proteins conjugate and deconjugate via ligases and peptidases to covalently modify target polypeptides. Some other ubiquitin-like domains have adaptor roles in ubiquitin-signaling by mediating protein-protein interaction. In addition to Ubiquitin-like (Ubl) domain, Ras-associating (RA) domain,  F0/F1 sub-domain of FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, TGS (ThrRS, GTPase and SpoT) domain, Ras-binding domain (RBD),  Ubiquitin regulatory domain X (UBX), Dublecortin-like domain, and RING finger- and WD40-associated ubiquitin-like (RAWUL) domain have beta-grasp ubiquitin-like folds, and are included in this superfamily.	68
340764	cd00197	VHS_ENTH_ANTH	VHS, ENTH and ANTH domain superfamily. This superfamily is composed of proteins containing a VHS, CID, ENTH, or ANTH domain. The VHS domain is present in Vps27 (Vacuolar Protein Sorting), Hrs (Hepatocyte growth factor-regulated tyrosine kinase substrate) and STAM (Signal Transducing Adaptor Molecule). It is located at the N-termini of proteins involved in intracellular membrane trafficking. The CTD-Interacting Domain (CID) is present in several RNA-processing factors and binds tightly to the carboxy-terminal domain (CTD) of RNA polymerase II (RNAP II or Pol II). The epsin N-terminal homology (ENTH) domain is an evolutionarily conserved protein module found primarily in proteins that participate in clathrin-mediated endocytosis. A set of proteins previously designated as harboring an ENTH domain in fact contains a highly similar, yet unique module referred to as an AP180 N-Terminal Homology (ANTH) domain. VHS, ENTH, and ANTH domains are structurally similar and are composed of a superhelix of eight alpha helices. ENTH and ANTH (E/ANTH) domains bind both inositol phospholipids and proteins and contribute to the nucleation and formation of clathrin coats on membranes. ENTH domains also function in the development of membrane curvature through lipid remodeling during the formation of clathrin-coated vesicles. E/ANTH domain-bearing proteins have recently been shown to function with adaptor protein-1 and GGA adaptors at the Trans-Golgi Network, which suggests that E/ANTH domains are universal components of the machinery for clathrin-mediated membrane budding.	115
238119	cd00198	vWFA	Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and immune defenses. In integrins these domains form heterodimers while in vWF they form multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed the MIDAS motif that is a characteristic feature of most, if not all, A domains.	161
238120	cd00199	WAP	whey acidic protein-type four-disulfide core domains. Members of the family include whey acidic protein, elafin (elastase-specific inhibitor), caltrin-like protein (a calcium transport inhibitor) and other extracellular proteinase inhibitors. A group of proteins containing 8 characteristically-spaced cysteine residues forming disulphide bonds have been termed '4-disulphide core' proteins. Protease inhibition occurs by insertion of the inhibitory loop into the active site pocket and interference with the catalytic residues of the protease.	60
238121	cd00200	WD40	WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.	289
238122	cd00201	WW	Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs; 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs.	31
238123	cd00202	ZnF_GATA	Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] promoter elements; a subset of family members may also bind protein; zinc-finger consensus topology is C-X(2)-C-X(17)-C-X(2)-C	54
238124	cd00203	ZnMc	Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease.	167
119412	cd00205	rhv_like	Picornavirus capsid protein domain_like. Picornaviruses are non-enveloped plus-strand ssRNA animal viruses with icosahedral capsids composed of 60 copies each of 4 virus encoded proteins; alignment includes picornaviridae, like poliovirus, hepatitis A virus, rhinovirus, foot-and-mouth disease virus and encephalomyocarditis virus; common structure is an 8-stranded beta sandwich	178
119411	cd00206	snake_toxin	Snake toxin domain, present in short and long neurotoxins, cytotoxins and short toxins, and in other miscellaneous venom peptides. The toxin acts by binding to the nicotinic acetylcholine receptors in the postsynaptic membrane of skeletal muscles and preventing the binding of acetylcholine, thereby blocking the excitation of muscles. This domain contains 60-75 amino acids that are fixed by 4-5 disulfide bridges and is nearly all beta sheet; it exists as either monomers or dimers.	64
238126	cd00207	fer2	2Fe-2S iron-sulfur cluster binding domain. Iron-sulfur proteins play an important role in electron transfer processes and in various enzymatic reactions. The family includes plant and algal ferredoxins, which act as electron carriers in photosynthesis and ferredoxins, which participate in redox chains (from bacteria to mammals). Fold is similar to thioredoxin.	84
100038	cd00208	LbetaH	Left-handed parallel beta-Helix (LbetaH or LbH) domain: The alignment contains 5 turns, each containing three imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X). Proteins containing hexapeptide repeats are often enzymes showing acyltransferase activity, however, some subfamilies in this hierarchy also show activities related to ion transport or translation initiation. Many are trimeric in their active forms.	78
238127	cd00209	DHFR	Dihydrofolate reductase (DHFR). Reduces 7,8-dihydrofolate to 5,6,7,8-tetrahydrofolate with NADPH as a cofactor. This is an essential step in the biosynthesis of deoxythymidine phosphate since 5,6,7,8-tetrahydrofolate is required to regenerate 5,10-methylenetetrahydrofolate which is then utilized by thymidylate synthase. Inhibition of DHFR interrupts thymidylate synthesis and DNA replication; inhibitors of DHFR (such as Methotrexate) are used in cancer chemotherapy.  5,6,7,8-tetrahydrofolate also is involved in glycine, serine, and threonine metabolism and aminoacyl-tRNA biosynthesis.	158
238128	cd00210	PTS_IIA_glc	PTS_IIA, PTS system, glucose/sucrose specific IIA subunit. The bacterial phosphoenolpyruvate: sugar phosphotransferase system (PTS) is a multi-protein system involved in the regulation of a variety of metabolic and transcriptional processes. This family is one of four structurally and functionally distinct group IIA PTS system cytoplasmic enzymes, necessary for the uptake of carbohydrates across the cytoplasmic membrane and their phosphorylation.	124
238129	cd00211	PTS_IIA_fru	PTS_IIA, PTS system, fructose/mannitol specific IIA subunit. The bacterial phosphoenolpyruvate: sugar phosphotransferase system (PTS) is a multi-protein system involved in the regulation of a variety of metabolic and transcriptional processes. This family is one of four structurally and functionally distinct group IIA PTS system cytoplasmic enzymes, necessary for the uptake of carbohydrates across the cytoplasmic membrane and their phosphorylation.	136
238130	cd00212	PTS_IIB_glc	PTS_IIB, PTS system, glucose/sucrose specific IIB subunit. The bacterial phosphoenolpyruvate: sugar phosphotransferase system (PTS) is a multi-protein system involved in the regulation of a variety of metabolic and transcriptional processes. This family is one of four structurally and functionally distinct group IIB PTS system cytoplasmic enzymes, necessary for the uptake of carbohydrates across the cytoplasmic membrane and their phosphorylation	78
238131	cd00213	S-100	S-100: S-100 domain, which represents the largest family within the superfamily of proteins carrying the Ca-binding EF-hand motif. Note that this S-100 hierarchy contains only S-100 EF-hand domains, other EF-hands have been modeled separately. S100 proteins are expressed exclusively in vertebrates, and are implicated in intracellular and extracellular regulatory activities. Intracellularly, S100 proteins act as Ca-signaling or Ca-buffering proteins. The most unusual characteristic of certain S100 proteins is their occurrence in extracellular space, where they act in a cytokine-like manner through RAGE, the receptor for advanced glycation products. Structural data suggest that many S100 members exist within cells as homo- or heterodimers and even oligomers; oligomerization contributes to their functional diversification. Upon binding calcium, most S100 proteins change conformation to a more open structure exposing a hydrophobic cleft. This hydrophobic surface represents the interaction site of S100 proteins with their target proteins. There is experimental evidence showing that many S100 proteins have multiple binding partners with diverse mode of interaction with different targets. In addition to S100 proteins (such as S100A1,-3,-4,-6,-7,-10,-11,and -13), this group includes the ''fused'' gene family, a group of calcium binding S100-related proteins. The ''fused'' gene family includes multifunctional epidermal differentiation proteins - profilaggrin, trichohyalin, repetin, hornerin, and cornulin; functionally these proteins are associated with keratin intermediate filaments and partially crosslinked to the cell envelope. These ''fused'' gene proteins contain N-terminal sequence with two Ca-binding EF-hands motif, which may be associated with calcium signaling in epidermal cells and autoprocessing in a calcium-dependent manner. In contrast to S100 proteins, "fused" gene family proteins contain an extraordinary high number of almost perfect peptide repeats with regular array of polar and charged residues similar to many known cell envelope proteins.	88
238132	cd00214	Calpain_III	Calpain, subdomain III. Calpains are  calcium-activated cytoplasmic cysteine proteinases, participate in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction. Catalytic domain and the two calmodulin-like domains are separated by C2-like domain III. Domain III plays an important role in calcium-induced activation of calpain involving electrostatic interactions with subdomain II. Proposed to mediate calpain's interaction with phospholipids and translocation to cytoplasmic/nuclear membranes. CD includes subdomain III of typical and atypical calpains.	150
238133	cd00215	PTS_IIA_lac	PTS_IIA, PTS system, lactose/cellobiose specific IIA subunit. The bacterial phosphoenolpyruvate: sugar phosphotransferase system (PTS) is a multi-protein system involved in the regulation of a variety of metabolic and transcriptional processes. This family is one of four structurally and functionally distinct group IIA PTS system cytoplasmic enzymes, necessary for the uptake of carbohydrates across the cytoplasmic membrane and their phosphorylation. This family of proteins normally function as a homotrimer, stabilized by a centrally located metal ion. Separation into subunits is thought to occur after phosphorylation.	97
199833	cd00216	PQQ_DH_like	PQQ-dependent dehydrogenases and related proteins. This family is composed of dehydrogenases with pyrroloquinoline quinone (PQQ) as a cofactor, such as ethanol, methanol, and membrane-bound glucose dehydrogenases. The alignment model contains an 8-bladed beta-propeller, and the family also includes distantly related proteins which are not enzymatically active and do not bind PQQ.	434
271174	cd00217	INT_Flp_C	Flp Tyrosine-based site-specific recombinases (also called integrases), C-terminal catalytic domain. Yeast Flp-like recombinases mediate the amplification of the 2 micron circular plasmid copy number by catalyzing the intra-molecular recombination between two inverted repeats during replication. They belong to the DNA breaking-rejoining enzyme superfamily, which also includes prokaryotic tyrosine recombinases and type IB topoisomerases. These enzymes share the same fold in their catalytic domain containing six conserved active site residues and the overall reaction mechanism. Flp-like recombinases are almost exclusively found in yeast and are highly diverged in sequence from the prokaryotic tyrosine recombinases. They cleave their target DNA in trans with a composite active site in which the catalytic tyrosine is provided by a protomer bound to a site other than the one being cleaved. Thus each active site within Flp complexes is assembled by domain swapping and contains catalytic residues from two different monomers. Two DNA segments are synapsed by the tetrameric enzyme, carrying the nucleophilic tyrosine in each active site with only two of the four monomers active at a given time. The catalytic domain is linked through a flexible loop to the N-terminal domain, which is largely responsible for non-specific DNA binding and isomerization. Its overall fold is similar to the SAM domain fold also found in the N-terminal domains of lambda integrase and XerD recombinase.	410
132995	cd00218	GlcAT-I	Beta1,3-glucuronyltransferase I (GlcAT-I) is involved in the initial steps of proteoglycan synthesis. Beta1,3-glucuronyltransferase I (GlcAT-I) domain; GlcAT-I is a key enzyme involved in the initial steps of proteoglycan synthesis. GlcAT-I catalyzes the transfer of a glucuronic acid moiety from the uridine diphosphate-glucuronic acid (UDP-GlcUA) to the common linkage region of trisaccharide Gal-beta-(1-3)-Gal-beta-(1-4)-Xyl  of proteoglycans. The enzyme has two subdomains that bind the donor and acceptor substrate separately.  The active site is located at the cleft between both subdomains in which the trisaccharide molecule is oriented perpendicular to the UDP. This family has been classified as Glycosyltransferase family 43 (GT-43).	223
119405	cd00219	ToxGAP	GTPase-activating protein (GAP) domain found in bacterial cytotoxins, ExoS, SptP, and YopE. Part of protein secretion system; stimulates Rac1- dependent cytoskeletal changes that promote bacterial internalization.	120
238135	cd00220	VMO-I	Vitelline membrane outer layer protein I (VMO-I) domain, VMO-I is one of the proteins found in the outer layer of the vitelline membrane of poultry eggs; VMO-I, lysozyme, and VMO-II are tightly bound to ovomucin; this complex forms the backbone of the outer layer;  VMO-I has three distinct internal repeats;  all three repeats are used to define the domain here; VMO-I has recently been shown to synthesize N-acetylchito-oligosaccharides from N-acetylglucosamine; may be a carbohydrate-binding protein; member of the beta-prism-fold family	177
411702	cd00221	Vsr	Very short patch repair endonuclease. Very short patch repair (Vsr) is an endonuclease functioning in DNA repair that recognizes damaged DNA and cleaves the phosphodiester backbone. Vsr endonucleases have a common endonuclease topology that has been tailored for recognition of TG mismatches. It belongs to a superfamily of nucleases including archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	133
212461	cd00222	CollagenBindB	Repeat unit of collagen-binding protein domain B. The collagen-binding protein mediates bacterial adherence to collagen; the primary sequence has a non-repetitive, collagen-binding A region, followed by instances of this B region repetitive unit. The B region has one to four 23 kDa repeat units (B1-B4), which have been suggested to serve as 'stalks' that project the A region from the bacterial surface and thus facilitate bacterial adherence to collagen. Each B repeat unit has two highly similar domains (D1 and D2) placed side-by-side; both D1 and D2 are included in this model. They exhibit a unique inverse IgG-like domain fold.	92
173774	cd00223	TOPRIM_TopoIIB_SPO	TOPRIM_TopoIIB_SPO: topoisomerase-primase (TOPRIM) nucleotidyl transferase/hydrolase domain of the type found in the type IIB family of DNA topoisomerases and Spo11.  This subgroup contains proteins similar to Sulfolobus shibatae topoisomerase VI (TopoVI) and Saccharomyces cerevisiae meiotic recombination factor: Spo11.   Type II DNA topoisomerases catalyze the ATP-dependent transport of one DNA duplex through another, in the process generating transient double strand breaks via covalent attachments to both DNA strands at the 5' positions.  TopoVI enzymes are heterotetramers found in archaea and plants. Spo11 plays a role in generating the double strand breaks that initiate homologous recombination during meiosis.  S. shibatae TopoVI relaxes both positive and negative supercoils, and in addition has a strong decatenase activity.  The TOPRIM domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD).  For topoisomerases the conserved glutamate is believed to act as a general base in strand joining and, as a general acid in strand cleavage. The DXD motif may co-ordinate Mg2+, a cofactor required for full catalytic function.	160
238137	cd00224	Mog1	homolog to Ran-Binding Protein Mog1p; binds to the small GTPase Ran, which plays an important role in nuclear import. Binding is independent of Ran's nucleotide state (RanGTP/RanGDP)	173
119406	cd00225	API3	Ascaris pepsin inhibitor-3 (API3); protein inhibitor that reversibly inhibits aspartic proteinase cathepsin E, and gastric enzymes pepsin and gastricsin.	159
238138	cd00226	PRCH	Photosynthetic reaction center (RC) complex, subunit H;  RC is an integral membrane protein-pigment complex which catalyzes light-induced reduction of ubiquinone to ubiquinol, generating a transmembrane electrochemical gradient of protons used to produce ATP by ATP synthase. Subunit H is positioned mainly in the cytoplasm with one transmembrane alpha helix. Provides proton transfer pathway (water channels) connecting the terminal quinone electron acceptor of RC, to the aqueous phase. Found in photosynthetic bacteria: alpha, beta, and gamma proteobacteria.	246
238139	cd00227	CPT	Chloramphenicol (Cm) phosphotransferase (CPT). Cm-inactivating enzyme; modifies the primary (C-3) hydroxyl of the antibiotic. Related structurally to shikimate kinase II.	175
238140	cd00228	eu-GS	Eukaryotic Glutathione Synthetase (eu-GS); catalyses the production of glutathione from gamma-glutamylcysteine and glycine in an ATP-dependent manner. Belongs to the ATP-grasp superfamily.	471
238141	cd00229	SGNH_hydrolase	SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid.	187
238142	cd00231	ZipA	ZipA C-terminal domain. ZipA, a membrane-anchored protein, is one of at least nine essential gene products necessary for assembly of the septal ring which mediates cell division in E.coli. ZipA and FtsA directly bind FtsZ, a homolog of eukaryotic tubulins, at the prospective division site, followed by the sequential addition of FtsK, FtsQ, FtsL, FtsW, FtsI, and FtsN.  ZipA contains three domains: a short N-terminal membrane-anchored domain, a central P/Q domain that is rich in proline and glutamine and a C-terminal domain, which comprises almost half the protein.	130
350855	cd00232	HemeO-like	heme oxygenase. Heme oxygenase (HO, EC 1.14.14.18) catalyzes the rate limiting step in the degradation of heme to biliverdin in a multi-step reaction. HO is essential for recycling iron from heme which is used as a substrate and cofactor for its own degradation to biliverdin, iron, and carbon monoxide. This family serves a variety of specific needs in different branches of life: in vertebrates, HO plays a role in heme homeostasis and oxidative stress response, and cellular signaling in mammals that include isoforms HO-1 and HO-2; in photosynthetic organisms including cyanobacteria, algae, and higher plants, biliverdin is used for photosynthetic pigment formation or light-sensing; and, in pathogenic bacteria, HO is part of a pathway for iron acquisition from host heme and heme products. HO shares tertiary structure similarity to methane monooxygenase (EC 1.14.13.25), ribonucleotide reductase (EC 1.17.4.1) and thiaminase II (EC 3.5.99.2), but shares little sequence homology.	201
238144	cd00233	VIP2	VIP2; A family of actin-ADP-ribosylating toxins. A member of the Bacillus-produced vegetative insecticidal proteins (VIPs) possesses high specificity against the major insect pest, corn rootworms, and belongs to a class of binary toxins and regulators of biological pathways distinct from classical A-B toxins. A novel family of insecticidal ADP-ribosyltransferases were isolated from Bacillus cereus during vegetative growth, where VIP1 likely targets insect cells and VIP2 ribosylates actin. VIP2 shares significant sequence similarity with enzymatic components of other binary toxins, Clostridium botulinum C2 toxin, C. perfringens iota toxin, C. spiroforme toxin and C. difficile toxin.	201
119413	cd00235	TLP-20	Telokin-like protein-20 (TLP-20) domain; a baculovirus protein that shares some antigenic similarities to the smooth muscle protein telokin, a kinase-related protein	108
238145	cd00236	FinO_conjug_rep	FinO bacterial conjugation repressor domain;  the basic protein FinO is part of the two-component FinOP system which is responsible for repressing bacterial conjugation; the FinOP system represses the transfer (tra) operon of the F-plasmid which encodes the proteins responsible for conjugative transfer of this plasmid from host to recipient Escherichia coli cells; antisense RNA, FinP is thought to interact with traJ mRNA to occlude its ribosome binding site, blocking traJ translation and thereby inhibiting transcription of the tra operon; FinO protects FinP against degradation by binding to FinP and sterically blocking the cellular endonuclease RNase E; FinO also binds to the complementary stem-loop structures in traJ mRNA and promotes duplex formation between FinP and traJ RNA in vitro;  this domain contains two independent RNA binding regions	146
107218	cd00237	p23	p23 binds heat shock protein (Hsp)90 and participates in the folding of a number of Hsp90 clients, including the progesterone receptor. p23 also has a passive chaperoning activity and in addition may participate in prostaglandin synthesis.	106
238146	cd00238	ERp29c	ERp29 and ERp38, C-terminal domain; composed of the protein disulfide isomerase (PDI)-like proteins ERp29 and ERp38. ERp29 (also called ERp28) is a ubiquitous endoplasmic reticulum (ER)-resident protein expressed in high levels in secretory cells. It contains a redox inactive TRX-like domain at the N-terminus. The expression profile of ERp29 suggests a role in secretory protein production, distinct from that of PDI. It has also been identified as a member of the thyroglobulin folding complex and is essential in regulating the secretion of thyroglobulin. The Drosophila homolog, Wind, is the product of windbeutel, an essential gene in the development of dorsal-ventral patterning. Wind is required for correct targeting of Pipe, a Golgi-resident type II transmembrane protein with homology to 2-O-sulfotransferase. ERp38 is a P5-like protein, first isolated from alfalfa (the cDNA clone was named G1), which contains two redox active TRX domains at the N-terminus, like human P5. However, unlike human P5, ERp38 also contains a C-terminal domain with homology to the C-terminal domain of ERp29. It may be a glucose-regulated protein. The function of the all-helical C-terminal domain of ERp29 and ERp38 remains unclear. The C-terminal domain of Wind is thought to provide a distinct site required for interaction with its substrate, Pipe.	93
119414	cd00239	PapG_CBD	PapG carbohydrate / receptor binding domain (CBD); PapG, the adhesin of the P-pili, is situated at the tip, mediating the attachment of uropathogenic Escherichia coli to the uroepithelium of the human kidney; PapG has a two-domain architecture: a carbohydrate binding N-terminus (this domain) and chaperone binding C-terminus (C-terminal pilin region). The carbohydrate-binding domain interacts with the receptor glycan. There are 3 PapG alleles, class I-III, which bind to different receptor isotypes, GbO3, GbO4, and GbO5, respectively.	194
238147	cd00240	TFIIFa	Transcription initiation factor IIF, alpha subunit, N-terminal region of RAP74. Subunit of transcription initiation complex involved in initiation, elongation and promoter escape. Tetramer of 2 alpha and 2 beta TFIIF subunits interacts directly with RNA polymerase II. TFIIF inhibits non-specific transcription initiation by PolII and recruits the polymerase to the preinitiation complex on promoter DNA for site-specific transcription initiation. The PolII/TFIIF-complex attaches through direct interactions of TFIIF with promoter DNA, TFIIB and the TAF250 subunit of TFIID, and provides scaffolding for addition of TFIIE and TFIIH. Together with TFIIE, TFIIF participates in DNA strand separation (open complex formation). N-terminal domains of RAP30 and RAP74 co-fold to form a single core structure, a triple barrel heterodimer, which has pseudo-2-fold symmetry.	162
187675	cd00241	DOMON_like	Domon-like ligand-binding domains. DOMON-like domains can be found in all three kingdoms of life and are a diverse group of ligand binding domains that have been shown to interact with sugars and hemes. DOMON domains were initially thought to confer protein-protein interactions. They were subsequently found as a heme-binding motif in cellobiose dehydrogenase, an extracellular fungal oxidoreductase that degrades both lignin and cellulose, and in ethylbenzene dehydrogenase, an enzyme that aids in the anaerobic degradation of hydrocarbons. The domain interacts with sugars in the type 9 carbohydrate binding modules (CBM9), which are present in a variety of glycosyl hydrolases, and it can also be found at the N-terminus of sensor histidine kinases.	158
153074	cd00242	Ecotin	Protease Inhibitor Ecotin; homodimeric protease inhibitor. Protease Inhibitor Ecotin; homodimeric protease inhibitor which binds two chymotrypsin-like serine proteases to form a heterotetramer. Found in bacterial periplasm. Inhibits a broad range of serine proteases including collagenase, trypsin, chymotrypsin, elastase, and factor Xa but not thrombin. Inhibition mechanism involves binding at two different protease contact sites: the primary and secondary binding sites. Primary site loops of ecotin bind to the active site of target proteases in a substrate-like manner with the P1 residue in ecotin mimicking the interactions of a canonical P1 substrate residue.	136
119415	cd00243	Lysin-Sp18	Sp18 and Lysin from Archaeogastropoda (marine mollusks of the families Haliotidae and Trochidae) sperm. Both proteins play an important role in fertilization: sp18 mediates fusion between sperm and egg cell membrane, lysin dissolves the vitelline envelope surrounding the egg; they are believed to be a product of gene duplication and subsequent divergence.	122
238148	cd00244	AlgLyase	Alginate Lyase A1-III; enzymatically depolymerizes alginate, a complex copolymer of beta-D-mannuronate and alpha-L-guluronate, by cleaving the beta-(1,4) glycosidic bond.	339
238149	cd00245	Glm_e	Coenzyme B12-dependent glutamate mutase epsilon subunit-like family; contains proteins similar to Clostridium cochlearium glutamate mutase (Glm) and Streptomyces tendae Tu901 NikV. Glm catalyzes a carbon-skeleton rearrangement of L-glutamate to L-threo-3-methylaspartate. The first step in the catalysis is a homolytic cleavage of the Co-C bond of the coenzyme B12 cofactor to generate a 5'-deoxyadenosyl radical. This radical then initiates the rearrangement reaction. C. cochlearium Glm is a sigma2epsilon2 heterotetramer. Glm plays a role in glutamate fermentation in Clostridium sp. and in members of the family Enterobacteriaceae, and in the synthesis of the lipopeptide antibiotic friulimicin in Actinoplanes friuliensis. S. tendae Tu901 glutamate mutase-like proteins NikU and NikV participate in the synthesis of the peptidyl nucleoside antibiotic nikkomycin. NikU and NikV proteins have sequence similarity to Clostridium Glm sigma and epsilon components respectively, and may catalyze the rearrangement of 2-oxoglutaric acid to 2-keto-3-methylsuccinic acid during nikkomycin synthesis.	428
238150	cd00246	RabGEF	Nucleotide exchange factor for Rab-like small GTPases (RabGEF), Mss4 type; RabGEF positively regulates the function of Rab GTPase by promoting exchange of GDP for GTP; members of the Rab subfamily of Ras GTPases are important in vesicular transport	103
238151	cd00247	Endostatin-like	Endostatin-like domain; the angiogenesis inhibitor endostatin is a C-terminal fragment of collagen XV/XVIII, a proteoglycan/collagen found in vessel walls and basement membranes; this domain has a compact globular fold similar to that of C-type lectins; endostatin XVIII is monomeric and contains a heparin-binding epitope and zinc binding sites while endostatin XV is trimeric and contains neither of these sites; the generation of endostatin or endostatin-like collagen XV/XVIII fragments is catalyzed by proteolytic enzymes within the protease-sensitive hinge region of the C-terminal domain; endostatin inhibits endothelial cell migration in vitro and appears to be highly effective in murine in vivo studies	171
238152	cd00248	Mth938-like	Mth938-like domain. The members of this family include: Mth938, 2P1, Xcr35, Rpa2829, and several uncharacterized sequences. Mth938 is a hypothetical protein encoded by the Methanobacterium thermoautotrophicum (Mth) genome. This protein crystallizes as a dimer, although it is monomeric in solution, with one disulfide bond in each monomer.  2P1 is a partially characterized nuclear protein which is homologous to E3-3 from rat and known to be alternately spliced. Xcr35 and Rpa2829 are hypothetical proteins of unknown function from the Xanthomonas campestris and Rhodopseudomonas palustris genomes, respectively, for which the crystal structures have been determined.	109
238153	cd00249	AGE	AGE domain; N-acyl-D-glucosamine 2-epimerase domain; Responsible for intermediate epimerization during biosynthesis of N-acetylneuraminic acid. Catalytic mechanism is believed to be via nucleotide elimination and readdition and is ATP modulated. AGE is structurally and mechanistically distinct from the other four types of epimerases. The AGE domain monomer is composed of an alpha(6)/alpha(6)-barrel, the structure of which is also found in glucoamylase and cellulase. The active form is a homodimer. The alignment also contains subtype III mannose 6-phosphate isomerases.	384
238154	cd00250	CAS_like	Clavaminic acid synthetase (CAS) -like;  CAS is a trifunctional Fe(II)/ 2-oxoglutarate (2OG) oxygenase carrying out three reactions in the biosynthesis of clavulanic acid, an inhibitor of class A serine beta-lactamases. In general, Fe(II)-2OG oxygenases catalyze a hydroxylation reaction, which leads to the incorporation of an oxygen atom from dioxygen into a hydroxyl group and conversion of 2OG to succinate and CO2	262
119403	cd00251	Mth_Ecto	The ectodomain of Methuselah (Mth); Mth mutants have a 35% increase in average lifespan and increased resistance to several forms of stress, including heat, starvation, and oxidative damage; The protein affected by this mutation is related to G protein-coupled receptors of the secretin receptor family; Mth, like secretin receptor family members, has a large N-terminal ectodomain, which may constitute the ligand binding site.	176
320009	cd00252	EFh_SPARC_EC	EF-hand, extracellular calcium-binding (EC) motif, found in secreted protein acidic and rich in cysteine (SPARC)-like proteins. The SPARC protein family represents a diverse group of proteins that share a follistatin-like (FS) domain and an extracellular calcium-binding (EC) domain with two EF-hand motifs. It includes SPARC (for secreted protein acidic and rich in cysteine, also termed osteonectin/ON, or basement-membrane protein 40/BM-40), SPARC-like protein 1 (for secreted protein, acidic and rich in cysteines-like 1/ SPARCL1, also termed high endothelial venule protein/Hevi, or MAST 9, or SC-1, or RAGS-1, or QR1, or ECM 2), testicans 1, 2, and 3 (also termed SPARC/osteonectin, CWCV, and Kazal-like domains proteoglycans, or SPOCK), secreted modular calcium-binding protein SMOC-1 (also termed SPARC-related modular calcium-binding protein 1) and SMOC-2 (also termed SPARC-related modular calcium-binding protein 2, or smooth muscle-associated protein 2/SMAP-2), follistatin-related protein 1 (FRP-1, also termed follistatin-like protein 1/fstl-1, TSC-36/Flik, TGF-beta inducible protein). The SPARC proteins have been implicated in modulating cell interaction with the extracellular milieu, including regulation of extracellular matrix assembly and deposition, counter-adhesion, effects on extracellular protease activity, and modulation of growth factor/cytokine signaling pathways, as well as in development and disease.	107
238156	cd00253	PL_Passenger_AT	Pertactin-like passenger domains (virulence factors) of autotransporter proteins of the type V secretion system. Autotransporters are proteins used by Gram-negative bacteria to transport proteins across their outer membranes. The C-terminal (beta) domain of autotransporters forms a pore in the outer membrane through which the N-terminal passenger domain is transported. Following transport, the passenger domain is generally cleaved by an outer membrane protease with the passenger domain either remaining in contact with the surface via a noncovalent interaction with the beta domain or cleaved to release a soluble protein. These proteins are highly diverse and perform a variety of functions that promote virulence, including catalyzing proteolysis, serving as an adhesin, mediating actin-promoted motility, or serving as a cytotoxin. Proteins in this family share similarity in the C-terminal region of the passenger domain as seen in the pertactin structure P.69, a Bordetella pertussis agglutinogen responsible for human pertussis. The P.69 protein consists of a 16-stranded parallel beta-helix with a V-shaped cross-section, and is one of the largest beta-helices known to date.	186
381594	cd00254	LT-like	lytic transglycosylase(LT)-like domain. Members include the soluble and insoluble membrane-bound LTs in bacteria and LTs in bacteriophage lambda. LTs catalyze the cleavage of the beta-1,4-glycosidic bond between N-acetylmuramic acid (MurNAc) and N-acetyl-D-glucosamine (GlcNAc), as do "goose-type" lysozymes. However, in addition to this, they also make a new glycosidic bond with the C6 hydroxyl group of the same muramic acid residue.	111
238158	cd00255	nidG2	Nidogen, G2 domain; Nidogen is an important component of the basement membrane, an extracellular sheet-like matrix. Nidogen is a multifunctional protein that interacts with many other basement membrane proteins, like collagen, perlecan, laminin, and has a potential role in the assembly and connection of networks. Nidogen consists of 3 globular domains (G1-G3), G3 is the laminin-binding domain, while G2 binds collagen IV and perlecan. Also found in hemicentin, a protein which functions at various cell-cell and cell-matrix junctions and might assist in refining broad regions of cell contact into oriented, line-shaped junctions. Nidogen G2 consists of an N-terminal EGF-like domain (excluded from this alignment model) and an 11-stranded beta-barrel with a central helix, a topology that exhibits high structural similarity to the green fluorescent proteins of Cnidaria.	224
238159	cd00256	VATPase_H	VATPase_H, regulatory vacuolar ATP synthase subunit H (Vma13p); activation component of the peripheral V1 complex of V-ATPase, a heteromultimeric enzyme which uses  ATP to actively transport protons into organelles and extracellular compartments. The topology is that of a superhelical spiral, in part the geometry is similar to superhelices composed of armadillo repeat motifs, as found in importins for example.	429
238160	cd00257	Fascin	Fascin-like domain; members include actin-bundling/crosslinking proteins fascin, histoactophilin and singed;  identified in sea urchin, Drosophila, Xenopus, rodents, and humans; The fascin-like domain adopts a beta-trefoil topology and contains an internal threefold repeat; the fascin subgroup contains four copies of the domain; Structurally similar to fibroblast growth factor (FGF)	119
238161	cd00258	GM2-AP	GM2 activator protein (GM2-AP) is a non-enzymatic lysosomal protein that acts as cofactor in the sequential degradation of gangliosides. GM2A is an essential cofactor for beta-hexosaminidase A (Hex A) in the enzymatic hydrolysis of GM2 ganglioside to GM3. Mutation of the gene results in the AB variant of Tay-Sachs disease. GM2-AP and similar proteins belong to the ML domain family.	162
119404	cd00259	STNV	STNV domain; satellite tobacco necrosis viruses (STNV) are small plant viruses which are completely dependent on the presence of a specific helper virus, TNV, for their replication;  60 identical subunits, of which this domain is one, form an icosahedral shell around a single RNA molecule. Half of the RNA codes for the coat protein with the other half being non-coding. The STNV domain has a "Swiss roll" Greek key topology with its two 4-stranded antiparallel beta sheets	195
271229	cd00260	Sialidase	sialidases/neuraminidases. Sialidases or neuraminidases function to bind and hydrolyze terminal sialic acid residues from various glycoconjugates as well as playing roles in pathogenesis, bacterial nutrition and cellular interactions. They have a six-bladed beta-propeller fold. This hierarchy includes eubacterial, eukaryotic, and viral sialidases.	361
238163	cd00261	AAI_SS	AAI_SS: Alpha-Amylase Inhibitors (AAIs) and Seed Storage (SS) Protein subfamily; composed of cereal-type AAIs and SS proteins. They are mainly present in the seeds of a variety of plants. AAIs play an important role in the natural defenses of plants against insects and pathogens such as fungi, bacteria and viruses. AAIs impede the digestion of plant starch and proteins by inhibiting digestive alpha-amylases and proteinases. Also included in this subfamily are SS proteins such as 2S albumin, gamma-gliadin, napin, and prolamin. These AAIs and SS proteins are also known allergens in humans.	110
238164	cd00264	BPI	BPI/LBP/CETP domain; Bactericidal permeability-increasing protein (BPI) / Lipopolysaccharide-binding protein (LBP) / Cholesteryl ester transfer protein (CETP) domain; binds to and neutralizes lipopolysaccharides from the outer membrane of Gram-negative bacteria.; Apolar pockets on the concave surface bind a molecule of phosphatidylcholine, primarily by interacting with their acyl chains; this suggests that the pockets may also bind the acyl chains of lipopolysaccharide.	208
238165	cd00265	MADS_MEF2_like	MEF2 (myocyte enhancer factor 2)-like/Type II subfamily of MADS (MCM1, Agamous, Deficiens, and SRF (serum response factor)) box family of eukaryotic transcriptional regulators. Binds DNA and exists as hetero- and homo-dimers. Differs from SRF-like/Type I subgroup mainly in position of the alpha helix responsible for the dimerization interface. Important in homeotic regulation in plants and in immediate-early development in animals.  Also found in fungi.	77
238166	cd00266	MADS_SRF_like	SRF-like/Type I subfamily of MADS (MCM1, Agamous, Deficiens, and SRF (serum response factor)) box family of eukaryotic transcriptional regulators. Binds DNA and exists as hetero- and homo-dimers. Differs from the MEF-like/Type II subgroup mainly in position of the alpha 2 helix responsible for the dimerization interface. Important in homeotic regulation in plants and in immediate-early development in animals.  Also found in fungi.	83
213179	cd00267	ABC_ATPase	ATP-binding cassette transporter nucleotide-binding domain. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins.	157
350669	cd00268	DEADc	DEAD-box helicase domain of DEAD box helicases. DEAD-box helicases comprise a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	196
238168	cd00270	MATH_TRAF_C	Tumor Necrosis Factor Receptor (TNFR)-Associated Factor (TRAF) family, TRAF domain, C-terminal MATH subdomain; TRAF molecules serve as adapter proteins that link cell surface TNFRs and receptors of the interleukin-1/Toll-like family to downstream kinase signaling cascades that result in the activation of transcription factors and the regulation of cell survival, proliferation and stress responses in the immune and inflammatory systems. There are at least six mammalian and three Drosophila proteins containing TRAF domains. The mammalian TRAFs display varying expression profiles, indicating independent and cell type-specific regulation. They display distinct, as well as overlapping functions and interactions with receptors. Most TRAFs, except TRAF1, share N-terminal homology and contain a RING domain, multiple zinc finger domains, and a TRAF domain. TRAFs form homo- and heterotrimers through their TRAF domains. The TRAF domain can be divided into a more divergent N-terminal alpha helical region (TRAF-N), and a highly conserved C-terminal MATH subdomain (TRAF-C) with an eight-stranded beta-sandwich structure. TRAF-N mediates trimerization while TRAF-C interacts with receptors.	149
238169	cd00271	Chemokine_C	Chemokine_C, C or lymphotactin subgroup, 1 of 4 subgroup designations of chemokines based on the arrangement of two N-terminal, conserved cysteine residues. Most of the known chemokines (cd00169) belong to either the CC (cd00272) or CXC (cd00273) subclass. The two other subclasses each have a single known member: fractalkine for the CX3C (cd00274) class and lymphotactin for the C (cd00271) class. Chemokine_Cs differ structurally since they contain only one of the two disulfide bridges that are conserved in all other chemokines and they possess a unique C-terminal extension, which is required for biological activity and thought to play a role in receptor binding. Lymphotactin, a mediator of mucosal immunity, has been found to chemoattract neutrophils and B cells through the XCR1 receptor and thought to be a factor in acute allograft rejection and inflammatory bowel disease.	72
238170	cd00272	Chemokine_CC	Chemokine_CC:  1 of 4 subgroup designations based on the arrangement of the two N-terminal cysteine residues; includes a number of secreted growth factors and interferons involved in mitogenic, chemotactic, and inflammatory activity; some members (e.g. 2HCC) contain an additional disulfide bond which is thought to compensate for the highly conserved Trp missing in these; chemotactic for monocytes, macrophages, eosinophils, basophils, and T cells, but not neutrophils; exist as monomers and dimers, but are believed to be functional as monomers; found only in vertebrates and a few viruses; a subgroup of CC, identified by an N-terminal DCCL motif (Exodus-1, Exodus-2, and Exodus-3), has been shown to inhibit specific types of human cancer cell growth in a mouse model. See CDs:  Chemokine (cd00169) for the general alignment of chemokines, or Chemokine_CXC (cd00273), Chemokine_C (cd00271), and Chemokine_CX3C (cd00274) for the additional chemokine subgroups, and Chemokine_CC_DCCL for the DCCL subgroup of this CD.	57
238171	cd00273	Chemokine_CXC	Chemokine_CXC:  1 of 4 subgroup designations based on the arrangement of the two N-terminal cysteine residues; includes a number of secreted growth factors and interferons involved in mitogenic, chemotactic, and inflammatory activity; many members contain an RCxC motif which may be a general requirement for binding to CXC chemokine receptors; those with the ELR motif are chemotactic for neutrophils and have been shown to be angiogenic, while those without the motif act on T and B cells, and are typically angiostatic; exist as monomers and dimers, but are believed to be functional as monomers; found only in vertebrates and a few viruses.  See CDs:  Chemokine (cd00169) for the general alignment of chemokines, or Chemokine_CC (cd00272), Chemokine_C (cd00271), and Chemokine_CX3C (cd00274) for the additional chemokine subgroups.	64
238172	cd00274	Chemokine_CX3C	Chemokine_CX3C:  1 of 4 subgroup designations based on the arrangement of the two N-terminal cysteines; differ structurally from the other subgroups in that they are attached to a membrane-spanning domain via a mucin-like stalk and can be proteolytically cleaved to a freely diffusible form; chemotactic for T cells, monocytes, and natural killer cells; function as monomers and are found only in vertebrates and a few viruses; currently only fractalkine (sometimes called neurotactin) has been identified as a member of this subfamily; the primary source of fractalkine is neurons, and they exhibit cell adhesion and chemoattractive properties in the central nervous system. See CDs:  Chemokine (cd00169) for the general alignment of chemokines, or Chemokine_CXC (cd00273), Chemokine_CC (cd00272), and Chemokine_C (cd00271) for the additional chemokine subgroups.	76
175974	cd00275	C2_PLC_like	C2 domain present in Phosphoinositide-specific phospholipases C (PLC). PLCs are involved in the hydrolysis of phosphatidylinositol-4,5-bisphosphate (PIP2) to d-myo-inositol-1,4,5-trisphosphate (1,4,5-IP3) and sn-1,2-diacylglycerol (DAG).   1,4,5-IP3 and DAG are second messengers in eukaryotic signal transduction cascades. PLC is composed of a N-terminal PH domain followed by a series of EF hands, a catalytic TIM barrel and a C-terminal C2 domain. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. Members here have a type-II topology.	128
175975	cd00276	C2B_Synaptotagmin	C2 domain second repeat present in Synaptotagmin. Synaptotagmin is a membrane-trafficking protein characterized by a N-terminal transmembrane region, a linker, and 2 C-terminal C2 domains. There are several classes of Synaptotagmins. Previously all synaptotagmins were thought to be calcium sensors in the regulation of neurotransmitter release and hormone secretion, but it has been shown that not all of them bind calcium.  Of the 17 identified synaptotagmins only 8 bind calcium (1-3, 5-7, 9, 10).  The functions of the two C2 domains that bind calcium are: regulating the fusion step of synaptic vesicle exocytosis (C2A) and binding to phosphatidyl-inositol-3,4,5-triphosphate (PIP3) in the absence of calcium ions and to phosphatidylinositol bisphosphate (PIP2) in their presence (C2B).  C2B also regulates the recycling step of synaptic vesicles. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the second C2 repeat, C2B, and has a type-I topology.	134
238173	cd00279	YlxR	Ylxr homologs; group of conserved hypothetical bacterial proteins of unknown function; structure revealed putative RNA binding cleft; proteins are encoded by an operon that includes other proteins involved in transcription and/or translation	79
238174	cd00280	TRFH	Telomeric Repeat binding Factor or TTAGGG Repeat binding Factor, central (dimerization) domain Homology; TRFH. Telomeres are protein/DNA complexes that make up the physical ends of eukaryotic linear chromosomes and are essential for chromosome stability, protecting the chromosome ends from degradation and end-to-end fusion. Proteins TRF1, TRF2 and Taz1 bind telomeric DNA and are also involved in recruiting interacting proteins, TIN2, and Rap1, to the telomeres. It has also been demonstrated that PARP1 associates with TRF2 and is capable of poly(ADP-ribosyl)ation of TRF2, which affects binding of TRF2 to telomeric DNA. TRF1, TRF2 and Taz1 proteins contain three functional domains: an N-terminal acidic domain, a central TRF-specific/dimerization domain, and a C-terminal DNA binding domain with a single Myb-like repeat. Homodimerization, a prerequisite to DNA binding, results in the juxtaposition of two Myb DNA binding domains.	200
176449	cd00281	DAP_dppA	Peptidase M55, D-aminopeptidase dipeptide-binding protein family. M55 Peptidase, D-Aminopeptidase dipeptide-binding protein (dppA; DAP dppA; EC 3.4.11.-) domain: Peptide transport systems are found in many bacterial species and generally function to accumulate intact peptides in the cell, where they are hydrolyzed. The dipeptide-binding protein (dppA) of Bacillus subtilis belongs to the dipeptide ABC transport (dpp) operon expressed early during sporulation. It is a binuclear zinc-dependent, D-specific aminopeptidase. The biologically active enzyme is a homodecamer with active sites buried in its channel. These self-compartmentalizing proteases are characterized by a SXDXEG motif. D-Ala-D-Ala and D-Ala-Gly-Gly are the preferred substrates. Bacillus subtilis dppA is thought to function as an adaptation to nutrient deficiency; hydrolysis of its substrate releases D-Ala which can be used subsequently as metabolic fuel. This family also contains a number of uncharacterized putative peptidases.	265
238175	cd00283	GIY-YIG_Cterm	GIYX(10-11)YIG family of class I homing endonucleases C-terminus (GIY-YIG_Cterm). Homing endonucleases promote the mobility of intron or intein by recognizing and cleaving a homologous allele that lacks the sequence. They catalyze a double-strand break in the DNA near the insertion site of that element to facilitate homing at that site. Class I homing endonucleases are sorted into four families based on the presence of these motifs in their respective N-termini: LAGLIDADG, His-Cys box, HNH, and GIY-YIG. This CD contains several but not all members of the GIY-YIG family. The C-terminus of GIY-YIG is a DNA-binding domain which is separated from the N-terminus by a long, flexible linker. The DNA-binding domain consists of a minor-groove binding alpha-helix, and a helix-turn-helix.  Some also contain a zinc finger (i.e. I-TevI) which is not required for DNA binding or catalysis, but is a component of the linker and directs the catalytic domain to cleave the homing site at a fixed distance from the intron insertion site.	113
238176	cd00284	Cytochrome_b_N	Cytochrome b (N-terminus)/b6/petB:  Cytochrome b is a subunit of cytochrome bc1, an 11-subunit mitochondrial respiratory enzyme. Cytochrome b spans the mitochondrial membrane with 8 transmembrane helices (A-H) in eukaryotes. In plants and cyanobacteria, cytochrome b6 is analogous to eukaryote cytochrome b, containing two chains: helices A-D are encoded by the petB gene and helices E-H are encoded by the petD gene in these organisms.  Cytochrome b/b6 contains two bound hemes and two ubiquinol/ubiquinone binding sites.  The C-terminal portion of cytochrome b is described in a separate CD.	200
276954	cd00286	Tubulin_FtsZ_Cetz-like	Tubulin protein family of FtsZ and CetZ-like. This family includes tubulin alpha-, beta-, gamma-, delta-, epsilon, and zeta-tubulins as well as FtsZ and CetZ, all of which are involved in polymer formation. Tubulin is the major component of microtubules, but also exists as a heterodimer and as a curved oligomer. Microtubules exist in all eukaryotic cells and are responsible for many functions, including cellular transport, cell motility, and mitosis.  FtsZ forms a ring-shaped septum at the site of bacterial cell division, which is required for constriction of cell membrane and cell envelope to yield two daughter cells. FtsZ can polymerize into tubes, sheets, and rings in vitro and is ubiquitous in eubacteria, archaea, and chloroplasts. A recent study found that CetZ proteins, formerly annotated FtsZ type 2, are not required for cell division, whereas FtsZ proteins play an important role.  Instead, CetZ proteins are shown to be involved in controlling archaeal cell shape dynamics. The results from inactivation studies of CetZ proteins in Haloferax volcanii suggest that CetZ1 is essential for normal swimming motility and rod-cell development.	332
238177	cd00287	ribokinase_pfkB_like	ribokinase/pfkB superfamily: Kinases that accept a wide variety of substrates, including carbohydrates and aromatic small molecules, all of which are phosphorylated at a hydroxyl group. The superfamily includes ribokinase, fructokinase, ketohexokinase, 2-dehydro-3-deoxygluconokinase, 1-phosphofructokinase, the minor 6-phosphofructokinase (PfkB), inosine-guanosine kinase, and adenosine kinase. Even though there is a high degree of structural conservation within this superfamily, their multimerization level varies widely: monomeric (e.g. adenosine kinase), dimeric (e.g. ribokinase), and trimeric (e.g. THZ kinase).	196
238178	cd00288	Pyruvate_Kinase	Pyruvate kinase (PK):  Large allosteric enzyme that regulates glycolysis through binding of the substrate, phosphoenolpyruvate, and one or more allosteric effectors.  Like other allosteric enzymes, PK has a high substrate affinity R state and a low affinity T state.  PK exists as several different isozymes, depending on organism and tissue type.  In mammals, there are four PK isozymes: R, found in red blood cells, L, found in liver, M1, found in skeletal muscle, and M2, found in kidney, adipose tissue, and lung.  PK forms a homotetramer, with each subunit containing three domains.  The T state to R state transition of PK is more complex than in most allosteric enzymes, involving a concerted rotation of all 3 domains of each monomer in the homotetramer.	480
238179	cd00290	cytochrome_b_C	Cytochrome b(C-terminus)/b6/petD:  Cytochrome b is a subunit of cytochrome bc1, an 11-subunit mitochondrial respiratory enzyme. Cytochrome b spans the mitochondrial membrane with 8 transmembrane helices (A-H) in eukaryotes. In plants and cyanobacteria, cytochrome b6 is analogous to eukaryote cytochrome b, containing two chains: helices A-D are encoded by the petB gene and helices E-H are encoded by the petD gene in these organisms.  Cytochrome b/b6 contains two bound hemes and two ubiquinol/ubiquinone binding sites.  The C-terminal domain is involved in forming the ubiquinol/ubiquinone binding sites, but not the heme binding sites.  The N-terminal portion of cytochrome b, which contains both heme binding sites,  is described in a separate CD.	147
238180	cd00291	SirA_YedF_YeeD	SirA, YedF, and YeeD. Two-layered alpha/beta sandwich domain.  SirA (also known as UvrY,  and YhhP) belongs to a family of bacterial two-component response regulators that controls secondary metabolism and virulence. The other member of this two-component system is a sensor kinase called BarA which phosphorylates SirA.  A variety of microorganisms have similar proteins, all of which contain a common CPxP sequence motif in the N-terminal region. YhhP is suggested to be important for normal cell division and growth in rich nutrient medium.  Moreover, despite a low primary sequence similarity,  the YccP structure closely resembles the non-homologous C-terminal RNA-binding domain of E. coli translation initiation factor IF3. The signature CPxP motif serves to stabilize the N-terminal helix as part of the N-capping box and might be important in mRNA-binding.	69
238181	cd00292	EF1B	Elongation factor 1 beta (EF1B) guanine nucleotide exchange domain. EF1B catalyzes the exchange of GDP bound to the G-protein, EF1A, for GTP, an important step in the elongation cycle of the protein biosynthesis. EF1A binds to and delivers the aminoacyl tRNA to the ribosome. The guanine nucleotide exchange domain of EF1B, which is the alpha subunit in yeast, is responsible for the catalysis of this exchange reaction.	88
238182	cd00293	USP_Like	Usp: Universal stress protein family. The universal stress protein Usp is a small cytoplasmic bacterial protein whose expression is enhanced when the cell is exposed to stress agents. Usp enhances the rate of cell survival during prolonged exposure to such conditions, and may provide a general "stress endurance" activity. The crystal structure of Haemophilus influenzae Usp reveals an alpha/beta fold similar to that of the Methanococcus jannaschii MJ0577 protein, which binds ATP, although Usp lacks ATP-binding activity.	130
238183	cd00295	RNA_Cyclase	RNA 3' phosphate cyclase domain -  RNA phosphate cyclases are enzymes that catalyze the ATP-dependent conversion of the 3'-phosphate at the end of RNA into a 2', 3'-cyclic phosphodiester bond. The enzymes are conserved in eucaryotes, bacteria and archaea. The exact biological role of this enzyme is unknown, but it has been proposed that it is likely to function in cellular RNA metabolism and processing. RNA phosphate cyclase has been characterized in human (with at least three isozymes), and E. coli, and it seems to be taxonomically widespread. The crystal structure of RNA phosphate cyclase shows that it consists of two domains. The larger domain contains three repeats of a fold originally identified in the bacterial translation initiation factor IF3.	338
238184	cd00296	SIR2	SIR2 superfamily of proteins includes silent information regulator 2 (Sir2) enzymes which catalyze NAD+-dependent protein/histone deacetylation, where the acetyl group from the lysine epsilon-amino group is transferred to the ADP-ribose moiety of NAD+, producing nicotinamide and the novel metabolite O-acetyl-ADP-ribose. Sir2 proteins, also known as sirtuins, are found in all eukaryotes and many archaea and prokaryotes and have been shown to regulate gene silencing, DNA repair, metabolic enzymes, and life span. The most-studied function, gene silencing, involves the inactivation of chromosome domains containing key regulatory genes by packaging them into a specialized chromatin structure that is inaccessible to DNA-binding proteins. The oligomerization state of Sir2 appears to be organism-dependent, sometimes occurring as a monomer and sometimes as a multimer. Also included in this superfamily is a group of uncharacterized Sir2-like proteins which lack certain key catalytic residues and conserved zinc binding cysteines.	222
107219	cd00298	ACD_sHsps_p23-like	This domain family includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins.  sHsps are small stress induced proteins with monomeric masses between 12 -43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. Included in this family is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and  the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR).	80
198286	cd00299	GST_C_family	C-terminal, alpha helical domain of the Glutathione S-transferase family. Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of  glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction  and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases.	100
133418	cd00300	LDH_like	L-lactate dehydrogenase-like enzymes. Members of this subfamily are tetrameric NAD-dependent 2-hydroxycarboxylate dehydrogenases including LDHs, L-2-hydroxyisocaproate dehydrogenases (L-HicDH), and LDH-like malate dehydrogenases (MDH). Dehydrogenases catalyze the conversion of carbonyl compounds to alcohols or amino acids. LDHs catalyze the last step of glycolysis in which pyruvate is converted to L-lactate. Vertebrate LDHs are non-allosteric, but some bacterial LDHs are activated by an allosteric effector such as fructose-1,6-bisphosphate. L-HicDH catalyzes the conversion of a variety of 2-oxo carboxylic acids with medium-sized aliphatic or aromatic side chains. MDH is one of the key enzymes in the citric acid cycle, facilitating both the conversion of malate to oxaloacetate and replenishing levels of oxalacetate by reductive carboxylation of pyruvate. The LDH-like subfamily is part of the NAD(P)-binding Rossmann fold superfamily, which includes a wide variety of protein families including the NAD(P)-binding domains of alcohol dehydrogenases, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate dehydrogenases, formate/glycerate dehydrogenases, siroheme synthases, 6-phosphogluconate dehydrogenases, aminoacid dehydrogenases, repressor rex, and NAD-binding potassium channel domains, among others.	300
381182	cd00301	lipocalin_FABP	lipocalin/cytosolic fatty acid-binding protein family. Lipocalins are diverse, mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules as well as membrane-bound receptors. They have a large beta-barrel ligand-binding cavity. Members include retinol-binding protein, retinoic acid-binding protein, complement protein C8 gamma, Can f 2, apolipoprotein D,  extracellular fatty acid-binding protein, beta-lactoglobulin, odorant-binding protein, and bacterial lipocalin Blc. Lipocalins are involved in many important processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty acid-binding proteins also bind hydrophobic ligands in a non-covalent, reversible manner, and are involved in protection and shuttling of fatty acids within the cell, and in acquisition and removal of fatty acids from intracellular sites.	109
410651	cd00302	cytochrome_P450	cytochrome P450 (CYP) superfamily. Cytochrome P450 (P450, CYP) is a large superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs with > 40% sequence identity are members of the same family. There are approximately 2250 CYP families: mammals, insects, plants, fungi, bacteria, and archaea have around 18, 208, 277, 805, 591, and 14 families, respectively. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. Their monooxygenase activity relies on the reductive scission of molecular oxygen bound to the P450 heme iron, and the delivery of two electrons to the heme iron during the catalytic cycle. CYPs use a variety of redox partners, such as the eukaryotic diflavin enzyme NADPH-cytochrome P450 oxidoreductase and the bacterial/mitochondrial NAD(P)H-ferredoxin reductase and ferredoxin partners. Some CYPs are naturally linked to their redox partners and others have evolved to bypass requirements for redox partners, and instead react directly with hydrogen peroxide or NAD(P)H to facilitate oxidative or reductive catalysis.	391
133136	cd00303	retropepsin_like	Retropepsins; pepsin-like aspartate proteases. The family includes pepsin-like aspartate proteases from retroviruses, retrotransposons and retroelements, as well as eukaryotic DNA-damage-inducible proteins (DDIs), and bacterial aspartate peptidases. While fungal and mammalian pepsins are bilobal proteins with structurally related N and C-terminals, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate peptidases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A.	92
238185	cd00304	RT_like	RT_like: Reverse transcriptase (RT, RNA-dependent DNA polymerase)_like family. An RT gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. RTs occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. These elements can be divided into two major groups. One group contains retroviruses and DNA viruses whose propagation involves an RNA intermediate. They are grouped together with transposable elements containing long terminal repeats (LTRs). The other group, also called poly(A)-type retrotransposons, contain fungal mitochondrial introns and transposable elements that lack LTRs.	98
238186	cd00305	Cu-Zn_Superoxide_Dismutase	Copper/zinc superoxide dismutase (SOD). Superoxide dismutases catalyse the conversion of superoxide radicals to molecular oxygen. Three evolutionarily distinct families of SODs are known, of which the copper/zinc-binding family is one. Defects in the human SOD1 gene cause familial amyotrophic lateral sclerosis (Lou Gehrig's disease). Cytoplasmic and periplasmic SODs exist as dimers, whereas chloroplastic and extracellular enzymes exist as tetramers. Structure supports independent functional evolution in prokaryotes (P-class) and eukaryotes (E-class) [PMID:8176730].	144
173787	cd00306	Peptidases_S8_S53	Peptidase domain in the S8 and S53 families. Members of the peptidases S8 (subtilisin and kexin) and S53 (sedolisin) family include endopeptidases and  exopeptidases. The S8 family has an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but does not share their three-dimensional structure and is not homologous to trypsin. Serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The S53 family contains a catalytic triad Glu/Asp/Ser with an additional acidic residue Asp in the oxyanion hole, similar to that of subtilisin.  The serine residue here is the nucleophilic equivalent of the serine residue in the S8 family, while glutamic acid has the same role here as the histidine base.   However, the aspartic acid residue that acts as an electrophile is quite different.  In S53, it follows glutamic acid, while in S8 it precedes histidine. The stability of these enzymes may be enhanced by calcium; some members have been shown to bind up to 4 ions via binding sites with different affinity.  There is a great diversity in the characteristics of their members: some contain disulfide bonds, some are intracellular while others are extracellular, some function at extreme temperatures, and others at high or low pH values.	241
238187	cd00307	RuBisCO_small_like	Ribulose bisphosphate carboxylase/oxygenase (Rubisco), small subunit and related proteins. Rubisco is a bifunctional enzyme that catalyzes the initial steps of two opposing metabolic pathways: photosynthetic carbon fixation and the competing process of photorespiration. Rubisco Form I, present in plants and green algae, is composed of eight large and eight small subunits. The nearly identical small subunits are encoded by a family of nuclear genes. After translation, the small subunits are translocated across the chloroplast membrane, where an N-terminal signal peptide is cleaved off. While the large subunits contain the catalytic activities, it has been shown that the small subunits are important for catalysis by enhancing the catalytic rate through inducing conformational changes in the large subunits. This superfamily also contains specific proteins from cyanobacteria. CcmM plays a role in a CO2 concentrating mechanism, which cyanobacteria need to overcome the low specificity of their Rubisco and fusions to Rubisco activase, a type of chaperone, which promotes and maintains the catalytic activity of Rubisco. CcmM contains an N-terminal carbonic anhydrase fused to four copies of the Rubisco-small subunit domain.	84
238188	cd00308	enolase_like	Enolase-superfamily, characterized by the presence of an enolate anion intermediate which is generated by abstraction of the alpha-proton of the carboxylate substrate by an active site residue and is stabilized by coordination to the essential Mg2+ ion. Enolase superfamily contains different enzymes, like enolases, glutarate-, fucanate- and galactonate dehydratases, o-succinylbenzoate synthase, N-acylamino acid racemase, L-alanine-DL-glutamate epimerase, mandelate racemase, muconate lactonizing enzyme and 3-methylaspartase.	229
238189	cd00309	chaperonin_type_I_II	chaperonin families, type I and type II. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings, each composed of 7-9 subunits. There are 2 main chaperonin groups. The symmetry of type I is seven-fold and they are found in eubacteria (GroEL) and in organelles of eubacterial descent (hsp60 and RBP). The symmetry of type II is eight- or nine-fold and they are found in archaea (thermosome), thermophilic bacteria (TF55) and  in the eukaryotic cytosol (CCT). Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis.	464
349411	cd00310	ATP-synt_Fo_a_6	ATP synthase Fo complex, subunit 6 (eukaryotes) and subunit a (prokaryotes). Bacterial forms are designated as ATP synthase, Fo complex, subunit a; eukaryotic (chloroplast and mitochondrial) forms are designated as ATP synthase, Fo complex, subunit 6. The F-ATP synthases (also called FoF1-ATPases) consist of two structural domains: F1 (factor one) complex containing the soluble catalytic core, and Fo (oligomycin sensitive factor) complex containing the membrane proton channel, linked together by a central stalk and a peripheral stalk. F-ATP synthases are primarily found in the inner membranes of eukaryotic mitochondria, in the thylakoid membranes of chloroplasts or in the plasma membranes of bacteria. F-ATP synthase has also been found in the archaea Methanosarcina acetivorans.  F-ATP synthases are the primary producers of ATP, using the proton gradient generated by oxidative phosphorylation (mitochondria) or photosynthesis (chloroplasts). Alternatively, under conditions of low driving force, ATP synthases function as ATPases, thus generating a transmembrane proton or Na(+) gradient at the expense of energy derived from ATP hydrolysis.	156
238190	cd00311	TIM	Triosephosphate isomerase (TIM) is a glycolytic enzyme that catalyzes the interconversion of dihydroxyacetone phosphate and D-glyceraldehyde-3-phosphate. The reaction is very efficient and requires neither cofactors nor metal ions. TIM, usually homodimeric, but in some organisms tetrameric, is ubiquitous and conserved in function across eukaryotes, bacteria and archaea.	242
238191	cd00312	Esterase_lipase	Esterases and lipases (includes fungal lipases, cholinesterases, etc.)  These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine. These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate.	493
349412	cd00313	ATP-synt_Fo_Vo_Ao_c	ATP synthase, membrane-bound Fo/Vo/Ao complexes, subunit c. Subunit c of the Fo/Vo/Ao complex is the main transmembrane subunit of F-, V- or A-type family of ATP synthases with rotary motors. These ion-transporting rotary ATP synthases are composed of two linked multi-subunit complexes: the F1, V1, and A1 complexes contain three copies each of the alpha and beta subunits that form the soluble catalytic core, which is involved in ATP synthesis/hydrolysis, and the Fo, Vo, or Ao (oligomycin sensitive) complex that forms the membrane-embedded proton pore. The F-ATP synthases (also called FoF1-ATPases) are found in the inner membranes of eukaryotic mitochondria, in the thylakoid membranes of chloroplasts, or in the plasma membranes of bacteria. F-ATPases are the primary producers of ATP, using the proton gradient generated by oxidative phosphorylation (mitochondria) or photosynthesis (chloroplasts). Alternatively, under conditions of low driving force, ATP synthases function as ATPases, thus generating a transmembrane proton or Na(+) gradient at the expense of energy  derived from ATP hydrolysis. The A-ATP synthase (AoA1-ATPases) is exclusively found in archaea and functions like F-ATP synthase. Structurally, however, the A-ATP synthase is more closely related to the V-ATP synthase (vacuolar VoV1-ATPase), which is a proton-translocating ATPase responsible for acidification of eukaryotic intracellular compartments and for ATP synthesis in archaea and some eubacteria. Collectively, F-, V-, and A-type synthases can function in both ATP synthesis and hydrolysis modes.	65
173823	cd00314	plant_peroxidase_like	Heme-dependent peroxidases similar to plant peroxidases. Along with animal peroxidases, these enzymes belong to a group of peroxidases containing a heme prosthetic group (ferriprotoporphyrin IX), which catalyzes a multistep oxidative reaction involving hydrogen peroxide as the electron acceptor. The plant peroxidase-like superfamily is found in all three kingdoms of life and carries out a variety of biosynthetic and degradative functions. Several sub-families can be identified. Class I includes intracellular peroxidases present in fungi, plants, archaea and bacteria, called catalase-peroxidases, that can exhibit both catalase and broad-spectrum peroxidase activities depending on the steady-state concentration of hydrogen peroxide. Catalase-peroxidases are typically comprised of two homologous domains that probably arose via a single gene duplication event. Class II includes ligninase and other extracellular fungal peroxidases, while class III is comprised of classic extracellular plant peroxidases, like horseradish peroxidase.	255
238192	cd00315	Cyt_C5_DNA_methylase	Cytosine-C5 specific DNA methylases; Methyl transfer reactions play an important role in many aspects of biology. Cytosine-specific DNA methylases are found both in prokaryotes and eukaryotes. DNA methylation, or the covalent addition of a methyl group to cytosine within the context of the CpG dinucleotide, has profound effects on the mammalian genome. These effects include transcriptional repression via inhibition of transcription factor binding or the recruitment of methyl-binding proteins and their associated chromatin remodeling factors, X chromosome inactivation, imprinting and the suppression of parasitic DNA sequences. DNA methylation is also essential for proper embryonic development and is an important player in both DNA repair and genome stability.	275
238193	cd00316	Oxidoreductase_nitrogenase	The nitrogenase enzyme system catalyzes the ATP-dependent reduction of dinitrogen to ammonia.  This group contains both alpha and beta subunits of component 1 of the three known genetically distinct types of nitrogenase systems: a molybdenum-dependent  nitrogenase (Mo-nitrogenase), a vanadium-dependent nitrogenase (V-nitrogenase), and an iron-only nitrogenase (Fe-nitrogenase) and, both subunits of Protochlorophyllide (Pchlide) reductase and chlorophyllide (chlide) reductase. The nitrogenase systems consist of component 1 (MoFe protein, VFe protein or, FeFe protein respectively) and, component 2 (Fe protein). The most widespread and best characterized nitrogenase is the Mo-nitrogenase. MoFe is an alpha2beta2 tetramer, the alternative nitrogenases are alpha2beta2delta2 hexamers whose alpha and beta subunits are similar to the alpha and beta subunits of MoFe. For MoFe, each alphabeta pair contains one P-cluster (at the alphabeta interface) and, one molecule of iron molybdenum cofactor (FeMoco) contained within the alpha subunit. The Fe protein contains a single [4Fe-4S] cluster from which, electrons are transferred  to the P-cluster of the MoFe and in turn, to FeMoCo at the site of substrate reduction. The V-nitrogenase requires an iron-vanadium cofactor (FeVco), the iron only-nitrogenase an iron only cofactor (FeFeco). These cofactors are analogous to the FeMoco. The V-nitrogenase has P clusters identical to those of  MoFe. Pchlide reductase and chlide reductase participate in the Mg-branch of the tetrapyrrole biosynthetic pathway.  Pchlide reductase catalyzes the reduction of the D-ring of Pchlide during the synthesis of chlorophylls (Chl) and bacteriochlorophylls (BChl).  Chlide-a reductase catalyzes the reduction of the B-ring of Chlide-a during the synthesis of BChl-a.  The Pchlide reductase NB complex is an N2B2 heterotetramer resembling nitrogenase FeMo, N and B proteins are homologous to the FeMo alpha and beta subunits respectively.  The NB complex may serve as a catalytic site for Pchlide reduction and, the ZY complex as a site of chlide reduction, similar to MoFe for nitrogen reduction.	399
238194	cd00317	cyclophilin	cyclophilin: cyclophilin-type peptidylprolyl cis-trans isomerases. This family contains eukaryotic, bacterial and archaeal proteins which exhibit peptidylprolyl cis-trans isomerase activity (PPIase, rotamase) and in addition bind the immunosuppressive drug cyclosporin (CsA). Immunosuppression in vertebrates is believed to be the result of the cyclophilin A-cyclosporin protein-drug complex binding to and inhibiting the protein phosphatase calcineurin. PPIase is an enzyme which accelerates protein folding by catalyzing the cis-trans isomerization of the peptide bonds preceding proline residues. Cyclophilins are a diverse family in terms of function and have been implicated in protein folding processes which depend on catalytic/chaperone-like activities. This group contains human cyclophilin 40, a co-chaperone of the hsp90 chaperone system; human cyclophilin A, a chaperone in the HIV-1 infectious process; and human cyclophilin H, a component of the U4/U6 snRNP, whose isomerization or chaperoning activities may play a role in RNA splicing.	146
238195	cd00318	Phosphoglycerate_kinase	Phosphoglycerate kinase (PGK) is a monomeric enzyme which catalyzes the transfer of the high-energy phosphate group of 1,3-bisphosphoglycerate to ADP, forming ATP and 3-phosphoglycerate. This reaction represents the first of the two substrate-level phosphorylation events in the glycolytic pathway. Substrate-level phosphorylation is defined as production of ATP by a process that is catalyzed by water-soluble enzymes in the cytosol and does not involve membranes or ion gradients.	397
238196	cd00319	Ribosomal_S12_like	Ribosomal protein S12-like family; composed of  prokaryotic 30S ribosomal protein S12, eukaryotic 40S ribosomal protein S23 and similar proteins. S12 and S23 are located at the interface of the large and small ribosomal subunits, adjacent to the decoding center. They play an important role in translocation during the peptide elongation step of protein synthesis. They are also involved in important RNA and protein interactions. Ribosomal protein S12 is essential for maintenance of a pretranslocation state and, together with S13, functions as a control element for the rRNA- and tRNA-driven movements of translocation. S23 interacts with domain III of the eukaryotic elongation factor 2 (eEF2), which catalyzes translocation. Mutations in S12 and S23 have been found to affect translational accuracy. Antibiotics such as streptomycin may also bind S12/S23 and cause the ribosome to misread the genetic code.	95
238197	cd00320	cpn10	Chaperonin 10 Kd subunit (cpn10 or GroES); Cpn10 cooperates with chaperonin 60 (cpn60 or GroEL), an ATPase, to assist the folding and assembly of proteins and is found in eubacterial cytosol, as well as in the matrix of mitochondria and chloroplasts. It forms heptameric rings with a dome-like structure, forming a lid to the large cavity of the tetradecameric cpn60 cylinder and thereby tightly regulating release and binding of proteins to the cpn60 surface.	93
238198	cd00321	SO_family_Moco	Sulfite oxidase (SO) family, molybdopterin binding domain. This molybdopterin cofactor (Moco) binding domain is found in a variety of oxidoreductases, main members of this family are nitrate reductase (NR) and sulfite oxidase (SO). SO catalyzes the terminal reaction in the oxidative degradation of the sulfur-containing amino acids cysteine and methionine. Assimilatory NRs catalyze the reduction of nitrate to nitrite which is subsequently converted to NH4+ by nitrite reductase. Common features of all known members of this family are that they contain one single pterin cofactor and part of the coordination of the metal (Mo) is a cysteine ligand of the protein and that they catalyze the transfer of an oxygen to or from a lone pair of electrons on the substrate.	156
99778	cd00322	FNR_like	Ferredoxin reductase (FNR), an FAD and NAD(P) binding protein, was initially identified as a chloroplast reductase activity, catalyzing the electron transfer from reduced iron-sulfur protein ferredoxin to NADP+ as the final step in the electron transport mechanism of photosystem I. FNR transfers electrons from reduced ferredoxin to FAD (forming FADH2 via a semiquinone intermediate) and then transfers a hydride ion to convert NADP+ to NADPH. FNR has since been shown to utilize a variety of electron acceptors and donors and has a variety of physiological functions including nitrogen assimilation, dinitrogen fixation, steroid hydroxylation, fatty acid metabolism, oxygenase activity, and methane assimilation in many organisms. FNR has an NAD(P)-binding sub-domain of the alpha/beta class and a discrete (usually N-terminal) flavin sub-domain which varies in orientation with respect to the NAD(P) binding domain. The N-terminal moiety may contain a flavin prosthetic group (as in flavoenzymes) or use flavin as a substrate. Because flavins such as FAD can exist in oxidized, semiquinone (one-electron reduced), or fully reduced hydroquinone forms, FNR can interact with one- and two-electron carriers. FNR has a strong preference for NADP(H) vs NAD(H).	223
271245	cd00323	uS7	Ribosomal protein S7. uS7, also known as Ribosomal protein (RP)S7, is an important part of the translation process which is universally present in the small subunit of prokaryotic and eukaryotic ribosomes. The ribosome small subunit is one of the two subunits of ribosome organelles that use mRNA as a template for protein synthesis in a process called translation. The small subunits of bacteria and eukaryotes have the same shape of head, body, platform, beak, and shoulder. RPS7 is located at the head of the small subunit. RPS7 is a primary ribosomal RNA (rRNA) binding protein that assists in rRNA folding and the binding of other proteins during small subunit assembly in all species. RPS7 is also involved in the formation of the mRNA exit channel at the interface of the large and small subunits. Some ribosomal proteins have extra ribosomal functions in cell differentiation and apoptosis.	130
381595	cd00325	chitinase_GH19	Glycoside hydrolase family 19, chitinase domain. Chitinases are enzymes that catalyze the hydrolysis of the beta-1,4-N-acetyl-D-glucosamine linkages in chitin polymers. Glycoside hydrolase family 19 chitinases are found primarily in plants (classes I, III, and IV), but some are found in bacteria. Class I and II chitinases are similar in their catalytic domains. Class I chitinases have an N-terminal cysteine-rich, chitin-binding domain which is separated from the catalytic domain by a proline and glycine-rich hinge region. Class II chitinases lack both the chitin-binding domain and the hinge region. Class IV chitinases are similar to class I chitinases, but they are smaller in size due to certain deletions. Despite lacking any significant sequence homology with lysozymes, structural analysis reveals that family 19 chitinases, together with family 46 chitosanases, are similar to several lysozymes including those from T4-phage and from goose. The structures reveal that the different enzyme groups arose from a common ancestor glycohydrolase antecedent to the prokaryotic/eukaryotic divergence.	224
238200	cd00326	alpha_CA	Carbonic anhydrase alpha (vertebrate-like) group. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer.	227
238201	cd00327	cond_enzymes	Condensing enzymes; Family of enzymes that catalyze a (decarboxylating or non-decarboxylating) Claisen-like condensation reaction. Members share strong structural similarity and are involved in the synthesis and degradation of fatty acids, and the production of polyketides, a diverse group of natural products.	254
163705	cd00328	catalase	Catalase heme-binding enzyme. Catalase is a ubiquitous enzyme found in both prokaryotes and eukaryotes, which is involved in the protection of cells from the toxic effects of peroxides. It catalyzes the conversion of hydrogen peroxide to water and molecular oxygen. Catalases also utilize hydrogen peroxide to oxidize various substrates such as alcohol or phenols. Most catalases exist as tetramers of 65KD subunits containing a protoheme IX group buried deep inside the structure. In eukaryotic cells, catalases are located in peroxisomes.	433
238202	cd00329	TopoII_MutL_Trans	MutL_Trans: transducer domain, having a ribosomal S5 domain 2-like fold, conserved in the C-terminal domain of type II DNA topoisomerases (Topo II) and DNA mismatch repair (MutL/MLH1/PMS2) proteins. This transducer domain is homologous to the second domain of the DNA gyrase B subunit, which is known to be important in nucleotide hydrolysis and the transduction of structural signals from ATP-binding site to the DNA breakage/reunion regions of the enzymes. The GyrB dimerizes in response to ATP binding, and is homologous to the N-terminal half of eukaryotic Topo II and the ATPase fragment of MutL. Type II DNA topoisomerases catalyze the ATP-dependent transport of one DNA duplex through another, in the process generating transient double strand breaks via covalent attachments to both DNA strands at the 5' positions. Included in this group are proteins similar to human MLH1 and PMS2.  MLH1 forms a heterodimer with PMS2 which functions in meiosis and in DNA mismatch repair (MMR). Cells lacking either hMLH1 or hPMS2 have a strong mutator phenotype and display microsatellite instability (MSI). Mutation in hMLH1 accounts for a large fraction of Lynch syndrome (HNPCC) families.	107
153075	cd00330	phosphagen_kinases	Phosphagen (guanidino) kinases. Phosphagen (guanidino) kinases are enzymes that transphosphorylate a high energy phosphoguanidino compound, like phosphocreatine (PCr) in the case of creatine kinase (CK) or phosphoarginine in the case of arginine kinase, which is used as an energy-storage and -transport metabolite, to ADP, thereby creating ATP. The substrate binding site is located in the cleft between the N and C-terminal domains, but most of the catalytic residues are found in the larger C-terminal domain. In higher eukaryotes, CK exists in tissue-specific (muscle, brain), as well as compartment-specific (mitochondrial and cytosolic) isoforms. They are either coupled to glycolysis (cytosolic form) or oxidative phosphorylation (mitochondrial form). Besides CK and AK, the most studied members of this family are also other phosphagen kinases with different substrate specificities, like glycocyamine kinase (GK), lombricine kinase (LK), taurocyamine kinase (TK) and hypotaurocyamine kinase (HTK). The majority of bacterial phosphagen kinases appear to lack the N-terminal domain and have not been functionally characterized.	236
238203	cd00331	IGPS	Indole-3-glycerol phosphate synthase (IGPS); an enzyme in the tryptophan biosynthetic pathway, catalyzing the ring closure reaction of 1-(o-carboxyphenylamino)-1-deoxyribulose-5-phosphate (CdRP) to indole-3-glycerol phosphate (IGP), accompanied by the release of carbon dioxide and water. IGPS is active as a separate monomer in most organisms, but is also found fused to other enzymes as part of a bifunctional or multifunctional enzyme involved in tryptophan biosynthesis.	217
176460	cd00332	PAL-HAL	Phenylalanine ammonia-lyase (PAL) and histidine ammonia-lyase (HAL). PAL and HAL are members of the Lyase class I_like superfamily of enzymes that catalyze similar beta-elimination reactions and are active as homotetramers. The four active sites of the homotetrameric enzyme are each formed by residues from three different subunits. PAL, present in plants and fungi, catalyzes the conversion of L-phenylalanine to E-cinnamic acid. HAL, found in several bacteria and animals, catalyzes the conversion of L-histidine to E-urocanic acid. Both PAL and HAL contain the cofactor 3,5-dihydro-5-methylidene-4H-imidazol-4-one (MIO), which is formed by autocatalytic excision/cyclization of the internal tripeptide Ala-Ser-Gly. PAL is being explored as enzyme substitution therapy for phenylketonuria (PKU), a disorder which involves an inability to metabolize phenylalanine. HAL failure in humans results in the disease histidinemia.	444
238204	cd00333	MIP	Major intrinsic protein (MIP) superfamily. Members of the MIP superfamily function as membrane channels that selectively transport water, small neutral molecules, and ions out of and between cells. The channel proteins share a common fold: the N-terminal cytosolic portion followed by six transmembrane helices, which might have arisen through gene duplication. On the basis of sequence similarity and functional characteristics, the superfamily can be subdivided into two major groups: water-selective channels called aquaporins (AQPs) and glycerol uptake facilitators (GlpFs). AQPs are found in all three kingdoms of life, while GlpFs have been characterized only within microorganisms.	228
238205	cd00336	Ribosomal_L22	Ribosomal protein L22/L17e.  L22 (L17 in eukaryotes) is a core protein of the large ribosomal subunit.  It is the only ribosomal protein that interacts with all six domains of 23S rRNA, and is one of the proteins important for directing the proper folding and stabilizing the conformation of 23S rRNA.  L22 is the largest protein contributor to the surface of the polypeptide exit channel, the tunnel through which the polypeptide product passes.  L22 is also one of six proteins located at the putative translocon binding site on the exterior surface of the ribosome.	105
238206	cd00338	Ser_Recombinase	Serine Recombinase family, catalytic domain; a DNA binding domain may be present either N- or C-terminal to the catalytic domain. These enzymes perform site-specific recombination of DNA molecules by a concerted, four-strand cleavage and rejoining mechanism which involves a transient phosphoserine linkage between DNA and serine recombinase. Serine recombinases demonstrate functional versatility and include resolvases, invertases, integrases, and transposases. Resolvases and invertases (i.e. Tn3, gamma-delta, Tn5044 resolvases, Gin and Hin invertases) in this family contain a C-terminal DNA binding domain and comprise a major phylogenic group. Also included are phage- and bacterial-encoded recombinases such as phiC31 integrase, SpoIVCA excisionase, and Tn4451 TnpX transposase. These integrases and transposases have larger C-terminal domains compared to resolvases/invertases and are referred to as large serine recombinases. Also belonging to this family are proteins with N-terminal DNA binding domains similar to IS607- and IS1535-transposases from Helicobacter and Mycobacterium.	137
238207	cd00340	GSH_Peroxidase	Glutathione (GSH) peroxidase family; tetrameric selenoenzymes that catalyze the reduction of a variety of hydroperoxides including lipid peroxides, using GSH as a specific electron donor substrate. GSH peroxidase contains one selenocysteine residue per subunit, which is involved in catalysis. Different isoenzymes are known in mammals, which are involved in protection against reactive oxygen species, redox regulation of many metabolic processes, peroxynitrite scavenging, and modulation of inflammatory processes.	152
238208	cd00342	gram_neg_porins	Porins form aqueous channels for the diffusion of small hydrophilic molecules across the outer membrane. Individual 16-strand anti-parallel beta-barrels form a central pore, and trimerize through mainly hydrophobic interactions at the interface. Trimers are stabilized by hydrophilic clamping of Loop L2. Loop 3 bends into the pore, creating an elliptical constriction of about 7 x 11A, large enough to allow passage of a glucose molecule without steric hindrance. Removal of the C-terminal residue (usually F) destabilizes the trimer and removal of the 16th beta-sheet abolishes trimerization. Unlike typical membrane proteins, porins lack long hydrophobic stretches. Short turns are found at the smooth, periplasmic end; longer irregular loops are found at the rough, extracellular end. The C-terminal residue forms a salt bridge with the N-terminus.	329
188629	cd00344	FBP_aldolase_I	Fructose-bisphosphate aldolase class I. Fructose-1,6-bisphosphate aldolase is an enzyme of the glycolytic and gluconeogenic pathways found in vertebrates, plants, and bacteria. The enzyme catalyzes the cleavage of fructose 1,6-bisphosphate to glyceraldehyde 3-phosphate and dihydroxyacetone phosphate (DHAP). Mutations in the aldolase genes in humans cause hemolytic anemia and hereditary fructose intolerance. The enzyme is a member of the class I aldolase family, which utilizes covalent catalysis through a Schiff base formed between a lysine residue of the enzyme and ketose substrates. Although structurally similar, the class II aldolases use a different mechanism and are believed to have an independent evolutionary origin.	328
238209	cd00347	Flavin_utilizing_monoxygenases	Flavin-utilizing monoxygenases	90
100101	cd00349	Ribosomal_L11	Ribosomal protein L11. Ribosomal protein L11, together with proteins L10 and L7/L12, and 23S rRNA, form the L7/L12 stalk on the surface of the large subunit of the ribosome. The homologous eukaryotic cytoplasmic protein is also called 60S ribosomal protein L12, which is distinct from the L12 involved in the formation of the L7/L12 stalk. The C-terminal domain (CTD) of L11 is essential for binding 23S rRNA, while the N-terminal domain (NTD) contains the binding site for the antibiotics thiostrepton and micrococcin. L11 and 23S rRNA form an essential part of the GTPase-associated region (GAR). Based on differences in the relative positions of the L11 NTD and CTD during the translational cycle, L11 is proposed to play a significant role in the binding of initiation factors, elongation factors, and release factors to the ribosome. Several factors, including the class I release factors RF1 and RF2, are known to interact directly with L11. In eukaryotes, L11 has been implicated in regulating the levels of ubiquitinated p53 and MDM2 in the MDM2-p53 feedback loop, which is responsible for apoptosis in response to DNA damage. In bacteria, the "stringent response" to harsh conditions allows bacteria to survive, and ribosomes that lack L11 are deficient in stringent factor stimulation.	131
238210	cd00350	rubredoxin_like	Rubredoxin_like; nonheme iron binding domain containing a [Fe(SCys)4] center. The family includes rubredoxins, a small electron transfer protein, and a slightly smaller modular rubredoxin domain present in rubrerythrin and nigerythrin and detected either N- or C-terminal to such proteins as flavin reductase, NAD(P)H-nitrite reductase, and ferredoxin-thioredoxin reductase. In rubredoxin, the iron atom is coordinated by four cysteine residues (Fe(S-Cys)4), but iron can also be replaced by cobalt, nickel or zinc and believed to be involved in electron transfer.  Rubrerythrins and nigerythrins are small homodimeric proteins, generally consisting of 2 domains: a rubredoxin domain C-terminal to a non-sulfur, oxo-bridged diiron site in the N-terminal rubrerythrin domain.  Rubrerythrins and nigerythrins have putative peroxide activity.	33
238211	cd00351	TS_Pyrimidine_HMase	Thymidylate synthase and pyrimidine hydroxymethylase: Thymidylate synthase (TS) and deoxycytidylate hydroxymethylase (dCMP-HMase) are homologs that catalyze analogous alkylation of C5 of pyrimidine nucleotides. Both enzymes are involved in the biosynthesis of DNA precursors and are active as homodimers. However, they exhibit distinct pyrimidine base specificities and differ in the details of their catalyzed reactions. TS is biologically ubiquitous and catalyzes the conversion of dUMP and methylene-tetrahydrofolate (CH2THF) to dTMP and dihydrofolate (DHF). It also acts as a regulator of its own expression by binding and inactivating its own RNA. Due to its key role in the de novo pathway for thymidylate synthesis and, hence, DNA synthesis, it is one of the most conserved enzymes across species and phyla. TS is a well-recognized target for anticancer chemotherapy, as well as a valuable new target against infectious diseases. Interestingly, in several protozoa, a single polypeptide chain codes for both, dihydrofolate reductase (DHFR) and thymidylate synthase (TS), forming a bifunctional enzyme (DHFR-TS), possibly through gene fusion at a single evolutionary point. DHFR-TS is also active as a dimer. Virus encoded dCMP-HMase catalyzes the reversible conversion of dCMP and CH2THF to hydroxymethyl-dCMP and THF. This family also includes dUMP hydroxymethylase, which is encoded by several bacteriophages that infect Bacillus subtilis, for their own protection against the host restriction system,  and contain hydroxymethyl-dUMP instead of dTMP in their DNA.	215
238212	cd00352	Gn_AT_II	Glutamine amidotransferases class-II (GATase). The glutaminase domain catalyzes an amide nitrogen transfer from glutamine to the appropriate substrate. In this process, glutamine is hydrolyzed to glutamic acid and ammonia. This domain is related to members of the Ntn (N-terminal nucleophile) hydrolase superfamily and is found at the N-terminus of enzymes such as glucosamine-fructose 6-phosphate synthase (GLMS or GFAT), glutamine phosphoribosylpyrophosphate (Prpp) amidotransferase (GPATase), asparagine synthetase B (AsnB), beta lactam synthetase (beta-LS) and glutamate synthase (GltS). GLMS catalyzes the formation of glucosamine 6-phosphate from fructose 6-phosphate and glutamine in amino sugar synthesis. GPATase catalyzes the first step in purine biosynthesis, an amide transfer from glutamine to PRPP, resulting in phosphoribosylamine, pyrophosphate and glutamate.  Asparagine synthetase B  synthesizes asparagine from aspartate and glutamine. Beta-LS catalyzes the formation of the beta-lactam ring in the beta-lactamase inhibitor clavulanic acid. GltS synthesizes L-glutamate from 2-oxoglutarate and L-glutamine. These enzymes are generally dimers, but GPATase also exists as a homotetramer.	220
238213	cd00353	Ribosomal_S15p_S13e	Ribosomal protein S15 (prokaryotic)_S13 (eukaryotic) binds the central domain of 16S rRNA and is required for assembly of the small ribosomal subunit and for intersubunit association, thus representing a key element in the assembly of the whole ribosome. S15 also plays an important autoregulatory role by binding and preventing its own mRNA from being translated. S15 has a predominantly alpha-helical fold that is highly structured except for the N-terminal alpha helix.	80
238214	cd00354	FBPase	Fructose-1,6-bisphosphatase, an enzyme that catalyzes the hydrolysis of fructose-1,6-bisphosphate into fructose-6-phosphate and is critical in the gluconeogenesis pathway. The alignment model also includes chloroplastic FBPases and sedoheptulose-1,7-bisphosphatases that play a role in the pentose phosphate pathway (Calvin cycle).	315
100098	cd00355	Ribosomal_L30_like	Ribosomal protein L30, which is found in eukaryotes and prokaryotes but not in archaea, is one of the smallest ribosomal proteins with a molecular mass of about 7 kDa. L30 binds the 23S rRNA as well as the 5S rRNA and is one of five ribosomal proteins that mediate the interactions 5S rRNA makes with the ribosome. The eukaryotic L30 members have N- and/or C-terminal extensions not found in their prokaryotic orthologs. L30 is closely related to the ribosomal L7 protein found in eukaryotes and archaea.	53
238215	cd00361	arom_aa_hydroxylase	Biopterin-dependent aromatic amino acid hydroxylase; a family of non-heme, iron(II)-dependent enzymes that includes prokaryotic and eukaryotic phenylalanine-4-hydroxylase (PheOH), eukaryotic tyrosine hydroxylase (TyrOH) and eukaryotic tryptophan hydroxylase (TrpOH). PheOH converts L-phenylalanine to L-tyrosine, an important step in phenylalanine catabolism and neurotransmitter biosynthesis, and is linked to a severe variant of phenylketonuria in humans. TyrOH and TrpOH are involved in the biosynthesis of catecholamine and serotonin, respectively. The eukaryotic enzymes are all homotetramers.	221
238216	cd00363	PFK	Phosphofructokinase, a key regulatory enzyme in glycolysis, catalyzes the phosphorylation of fructose-6-phosphate to fructose-1,6-bisphosphate. The members belong to the PFK family that includes ATP- and pyrophosphate (PPi)-dependent phosphofructokinases. Some members evolved by gene duplication and thus have a large C-terminal/N-terminal extension comprising a second PFK domain. Generally, ATP-PFKs are allosteric homotetramers, and PPi-PFKs are dimeric and nonallosteric except for plant PPi-PFKs which are allosteric heterotetramers.	338
153080	cd00365	HMG-CoA_reductase	Hydroxymethylglutaryl-coenzyme A (HMG-CoA) reductase (HMGR). Hydroxymethylglutaryl-coenzyme A (HMG-CoA) reductase (HMGR) is a tightly regulated enzyme, which catalyzes the synthesis of coenzyme A and mevalonate in isoprenoid synthesis. In mammals, this is the rate-limiting committed step in cholesterol biosynthesis. Bacteria, such as Pseudomonas mevalonii, which rely solely on mevalonate for their carbon source, catalyze the reverse reaction, using an NAD-dependent HMGR to deacetylate mevalonate into 3-hydroxy-3-methylglutaryl-CoA. There are two classes of HMGR: class I enzymes which are found predominantly in eukaryotes and contain N-terminal membrane regions and class II enzymes which are found primarily in prokaryotes and are soluble as they lack the membrane region. With the exception of Archaeoglobus fulgidus, most archaea are assigned to class I, based on sequence similarity of the active site, even though they lack membrane regions. Yeast and human HMGR are divergent in their N-terminal regions, but are conserved in their active site. In contrast, human and bacterial HMGR differ in their active site architecture. While the prokaryotic enzyme is a homodimer, the eukaryotic enzyme is a homotetramer.	376
212658	cd00366	FGGY	FGGY family of carbohydrate kinases. This family is predominantly composed of glycerol kinase (GK) and similar carbohydrate kinases including rhamnulokinase (RhuK), xylulokinase (XK), gluconokinase (GntK), ribulokinase (RBK), and fuculokinase (FK). These enzymes catalyze the transfer of a phosphate group, usually from ATP, to their carbohydrate substrates. The monomer of FGGY proteins contains two large domains, which are separated by a deep cleft that forms the active site. One domain is primarily involved in sugar substrate binding, and the other is mainly responsible for ATP binding. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. Substrate-induced conformational changes and a divalent cation may be required for the catalytic activity.	435
238217	cd00367	PTS-HPr_like	Histidine-containing phosphocarrier protein (HPr)-like proteins. HPr is a central component of the bacterial phosphoenolpyruvate sugar phosphotransferase system (PTS). The PTS catalyses the phosphorylation of sugar substrates during their translocation across the cell membrane. The phosphoryl group from phosphoenolpyruvate is transferred to HPr by enzyme I (EI). Phospho-HPr then transfers the phosphoryl group to one of several sugar-specific phosphoprotein intermediates. The conserved histidine in the N-terminus of HPr serves as an acceptor for the phosphoryl group of EI. In addition to the phosphotransferase proteins HPr and EI, this family also includes the closely related Carbon Catabolite Repressor (CCR) proteins which use the same phosphorylation mechanism and interact with transcriptional regulators to control expression of genes coding for utilization of less favored carbon sources.	77
238218	cd00368	Molybdopterin-Binding	Molybdopterin-Binding (MopB) domain of the MopB superfamily of proteins, a  large, diverse, heterogeneous superfamily of enzymes that, in general, bind molybdopterin as a cofactor. The MopB domain is found in a wide variety of molybdenum- and tungsten-containing enzymes, including formate dehydrogenase-H (Fdh-H) and -N (Fdh-N), several forms of nitrate reductase (Nap, Nas, NarG), dimethylsulfoxide reductase (DMSOR), thiosulfate reductase, formylmethanofuran dehydrogenase, and arsenite oxidase. Molybdenum is present in most of these enzymes in the form of molybdopterin, a modified pterin ring with a dithiolene side chain, which is responsible for ligating the Mo. In many bacterial and archaeal species, molybdopterin is in the form of a dinucleotide, with two molybdopterin dinucleotide units per molybdenum. These proteins can function as monomers, heterodimers, or heterotrimers, depending on the protein and organism. Also included in the MopB superfamily is the eukaryotic/eubacterial protein domain family of the 75-kDa subunit/Nad11/NuoG (second domain) of respiratory complex 1/NADH-quinone oxidoreductase which is postulated to have lost an ancestral formate dehydrogenase activity and only vestigial sequence evidence remains of a molybdopterin binding site.	374
238219	cd00371	HMA	Heavy-metal-associated domain (HMA) is a conserved domain of approximately 30 amino acid residues found in a number of proteins that transport or detoxify heavy metals, for example, the CPx-type heavy metal ATPases and copper chaperones. The HMA domain contains two cysteine residues that are important in binding and transfer of metal ions, such as copper, cadmium, cobalt and zinc. In the case of copper, stoichiometry of binding is one Cu+ ion per binding domain. Repeats of the HMA domain in copper chaperones have been associated with Menkes/Wilson disease due to binding of multiple copper ions.	63
238220	cd00374	RNase_T2	Ribonuclease T2 (RNase T2) is a widespread family of secreted RNases found in every organism examined thus far.  This family includes RNase Rh, RNase MC1, RNase LE, and self-incompatibility RNases (S-RNases).  Plant T2 RNases are expressed during leaf senescence in order to scavenge phosphate from ribonucleotides. They are also expressed in response to wounding or pathogen invasion. S-RNases are thought to prevent self-fertilization by acting as selective cytotoxins of "self" pollen.	195
238221	cd00375	Urease_alpha	Urease alpha-subunit; Urease is a nickel-dependent metalloenzyme that catalyzes the hydrolysis of urea to form ammonia and carbon dioxide. Nickel-dependent ureases are found in bacteria, fungi and plants. Their primary role is to allow the use of external and internally generated urea as a nitrogen source. The enzyme consists of 3 subunits, alpha, beta and gamma, which can be fused and present on a single protein chain and which in turn forms multimers, mainly trimers. The large alpha subunit is the catalytic domain containing an active site with a bi-nickel center complexed by a carbamylated lysine. The beta and gamma subunits play a role in subunit association to form the higher order trimers.	567
119340	cd00377	ICL_PEPM	Members of the ICL/PEPM enzyme family catalyze either P-C or C-C bond formation/cleavage. Known members are phosphoenolpyruvate mutase (PEPM), phosphonopyruvate hydrolase (PPH), carboxyPEP mutase (CPEP mutase), oxaloacetate hydrolase (OAH), isocitrate lyase (ICL), and 2-methylisocitrate lyase (MICL). Isocitrate lyase (ICL) catalyzes the conversion of isocitrate to succinate and glyoxylate, the first committed step in the glyoxylate pathway. This carbon-conserving pathway is present in most prokaryotes, lower eukaryotes and plants, but has not been observed in vertebrates. PEP mutase (PEPM) turns phosphoenolpyruvate (PEP) into phosphonopyruvate (P-pyr), an important intermediate in the formation of organophosphonates, which function as antibiotics or play a role in pathogenesis or signaling. P-pyr can be hydrolyzed by phosphonopyruvate hydrolase (PPH) to form pyruvate and phosphate. Oxaloacetate acetylhydrolase (OAH) catalyzes the hydrolytic cleavage of oxaloacetate to form acetate and oxalate, an important pathway to produce oxalate in filamentous fungi. 2-methylisocitrate lyase (MICL) cleaves 2-methylisocitrate to pyruvate and succinate, part of the methylcitrate cycle for the alpha-oxidation of propionate.	243
99733	cd00378	SHMT	Serine-glycine hydroxymethyltransferase (SHMT). This family belongs to pyridoxal phosphate (PLP)-dependent aspartate aminotransferase superfamily (fold I). SHMT carries out interconversion of serine and glycine; it catalyzes the transfer of hydroxymethyl group of N5, N10-methylene tetrahydrofolate to glycine resulting in the formation of serine and tetrahydrofolate. Both eukaryotic and prokaryotic SHMT enzymes form tight obligate homodimers; the mammalian enzyme forms a homotetramer comprising four pyridoxal phosphate-bound active sites.	402
238222	cd00379	Ribosomal_L10_P0	Ribosomal protein L10 family; composed of the large subunit ribosomal protein called L10 in bacteria, P0 in eukaryotes, and L10e in archaea, as well as uncharacterized P0-like eukaryotic proteins. In all three kingdoms, L10 forms a tight complex with multiple copies of the small acidic protein L12(e). This complex forms a stalk structure on the large subunit of the ribosome. The N-terminal domain (NTD) of L10 interacts with L11 protein and forms the base of the L7/L12 stalk, while the extended C-terminal helix binds to two or three dimers of the NTD of L7/L12 (L7 and L12 are identical except for an acetylated N-terminus). The L7/L12 stalk is known to contain the binding site for elongation factors G and Tu (EF-G and EF-Tu, respectively); however, there is disagreement as to whether or not L10 is involved in forming the binding site. The stalk is believed to be associated with GTPase activities in protein synthesis. In a neuroblastoma cell line, L10 has been shown to interact with the SH3 domain of Src and to activate the binding of the Nck1 adaptor protein with skeletal proteins such as the Wiskott-Aldrich Syndrome Protein (WASP) and the WASP-interacting protein (WIP). Some eukaryotic P0 sequences have an additional C-terminal domain homologous with acidic proteins P1 and P2.	155
240504	cd00380	KOW	KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese). KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. The KOW motif contains an invariant glycine residue and comprises alternating blocks of hydrophilic and hydrophobic residues.	49
238223	cd00381	IMPDH	IMPDH: The catalytic domain of the inosine monophosphate dehydrogenase. IMPDH catalyzes the NAD-dependent oxidation of inosine 5'-monophosphate (IMP) to xanthosine 5' monophosphate (XMP). It is a rate-limiting step in the de novo synthesis of the guanine nucleotides. There is often a CBS domain inserted in the middle of this domain, which is proposed to play a regulatory role. IMPDH is a key enzyme in the regulation of cell proliferation and differentiation. It has been identified as an attractive target for developing chemotherapeutic agents.	325
238224	cd00382	beta_CA	Carbonic anhydrases (CA) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism in which the nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide is followed by the regeneration of an active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. CAs are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionarily distinct families of CAs (the alpha-, beta-, and gamma-CAs) which show no significant sequence identity or structural similarity.  Within the beta-CA family there are four evolutionarily distinct clades (A through D). The beta-CAs are multimeric enzymes (forming dimers, tetramers, hexamers and octamers) which are present in higher plants, algae, fungi, archaea and prokaryotes.	119
294013	cd00383	trans_reg_C	DNA-binding effector domain of two-component system response regulators. Bacteria and some eukaryotes use two-component signal transduction systems to detect and respond to changes in the environment. The systems consist of a sensor histidine kinase and a response regulator. The former autophosphorylates a histidine residue on detecting an external stimulus. The phosphate is then transferred to an invariant aspartate residue in a highly conserved receiver domain of the response regulator. Phosphorylation activates a variable effector domain of the response regulator, which triggers the cellular response. This C-terminal effector domain belongs to the winged helix-turn-helix family of transcriptional regulators and contains DNA and RNA polymerase binding sites. Several dimers or monomers bind head to tail to small tandem repeats upstream of the genes. The RNA polymerase binding sites interact with the alpha or sigma subunit of RNA polymerase.	89
238226	cd00384	ALAD_PBGS	Porphobilinogen synthase (PBGS), which is also called delta-aminolevulinic acid dehydratase (ALAD), catalyzes the condensation of two 5-aminolevulinic acid (ALA) molecules to form the pyrrole porphobilinogen (PBG), which is the second step in the biosynthesis of tetrapyrroles, such as heme, vitamin B12 and chlorophyll. This reaction involves the formation of a Schiff base link between the substrate and the enzyme. PBGSs are metalloenzymes, some of which have a second, allosteric metal binding site, beside the metal ion binding site in their active site. Although PBGS is a family of homologous enzymes, its metal ion utilization at catalytic site varies between zinc and magnesium and/or potassium. PBGS can be classified into two groups based on differences in their active site metal binding site. They either contain a cysteine-rich zinc binding site (consensus DXCXCX(Y/F)X3G(H/Q)CG) or an aspartate-rich magnesium binding site (consensus DXALDX(Y/F)X3G(H/Q)DG). The cysteine-rich zinc binding site appears more common. Most members represented by this model also have a second allosteric magnesium binding site (consensus RX~164DX~65EXXXD, missing in a eukaryotic subfamily with cysteine-rich zinc binding site).	314
173830	cd00385	Isoprenoid_Biosyn_C1	Isoprenoid Biosynthesis enzymes, Class 1. Superfamily of trans-isoprenyl diphosphate synthases (IPPS) and class I terpene cyclases which either synthesize geranyl/farnesyl diphosphates (GPP/FPP) or longer chained products from isoprene precursors, isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP), or use geranyl (C10)-, farnesyl (C15)-, or geranylgeranyl (C20)-diphosphate as substrate. These enzymes produce a myriad of precursors for such end products as steroids, cholesterol, sesquiterpenes, heme, carotenoids, retinoids, and diterpenes; and are widely distributed among archaea, bacteria, and eukaryota. The enzymes in this superfamily share the same 'isoprenoid synthase fold' and include several subgroups. The head-to-tail (HT) IPPS catalyze the successive 1'-4 condensation of the 5-carbon IPP to the growing isoprene chain to form linear, all-trans, C10-, C15-, C20-, C25-, C30-, C35-, C40-, C45-, or C50-isoprenoid diphosphates. Cyclic monoterpenes, diterpenes, and sesquiterpenes, are formed from their respective linear isoprenoid diphosphates by class I terpene cyclases. The head-to-head (HH) IPPS catalyze the successive 1'-1 condensation of 2 farnesyl or 2 geranylgeranyl isoprenoid diphosphates. Cyclization of these 30- and 40-carbon linear forms is catalyzed by class II cyclases. Both the isoprenoid chain elongation reactions and the class I terpene cyclization reactions proceed via electrophilic alkylations in which a new carbon-carbon single bond is generated through interaction between a highly reactive electron-deficient allylic carbocation and an electron-rich carbon-carbon double bond. The catalytic site consists of a large central cavity formed by mostly antiparallel alpha helices with two aspartate-rich regions located on opposite walls. These residues mediate binding of prenyl phosphates via bridging Mg2+ ions, inducing proposed conformational changes that close the active site to solvent, stabilizing reactive carbocation intermediates. Generally, the enzymes in this family exhibit an all-trans reaction pathway; an exception is the cis-trans terpene cyclase, trichodiene synthase. Mechanistically and structurally distinct, class II terpene cyclases and cis-IPPS are not included in this CD.	243
238227	cd00386	Heme_Cu_Oxidase_III_like	Heme-copper oxidase subunit III.  Heme-copper oxidases are transmembrane protein complexes in the respiratory chains of prokaryotes and mitochondria which couple the reduction of molecular oxygen to water to proton pumping across the membrane. The heme-copper oxidase superfamily is diverse in terms of electron donors, subunit composition, and heme types.  This superfamily includes cytochrome c and ubiquinol oxidases.  Bacterial oxidases typically contain 3 or 4 subunits in contrast to the 13 subunit bovine cytochrome c oxidase (CcO). Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Subunits I, II and III of ubiquinol oxidase are homologous to the corresponding subunits in CcO.  This group additionally contains proteins which are fusions between subunits I and III, such as Sulfolobus acidocaldarius SoxM, a subunit of the SoxM terminal oxidase complex. It also includes NorE which has been speculated to be a subunit of nitric oxide reductase. Some archaebacterial cytochrome oxidases lack subunit III.   Although not required for catalytic activity, subunit III is believed to play a role in assembly of the multimer complex. Rhodobacter CcO subunit III stabilizes the integrity of the binuclear center in subunit I.  It has been proposed that archaea acquired heme-copper oxidases through gene transfer from gram-positive bacteria.	183
100102	cd00387	Ribosomal_L7_L12	Ribosomal protein L7/L12. Ribosomal protein L7/L12 refers to the large ribosomal subunit proteins L7 and L12, which are identical except that L7 is acetylated at the N terminus. It is a component of the L7/L12 stalk, which is located at the surface of the ribosome. The stalk base consists of a portion of the 23S rRNA and ribosomal proteins L11 and L10. An extended C-terminal helix of L10 provides the binding site for L7/L12. L7/L12 consists of two domains joined by a flexible hinge, with the helical N-terminal domain (NTD) forming pairs of homodimers that bind to the extended helix of L10. It is the only multimeric ribosomal component, with either four or six copies per ribosome that occur as two or three dimers bound to the L10 helix. L7/L12 is the only ribosomal protein that does not interact directly with rRNA, but instead has indirect interactions through L10. The globular C-terminal domains of L7/L12 are highly mobile. They are exposed to the cytoplasm and contain binding sites for other molecules. Initiation factors, elongation factors, and release factors are known to interact with the L7/L12 stalk during their GTP-dependent cycles. The binding site for the factors EF-Tu and EF-G comprises L7/L12, L10, L11, the L11-binding region of 23S rRNA, and the sarcin-ricin loop of 23S rRNA. Removal of L7/L12 has minimal effect on factor binding and it has been proposed that L7/L12 induces the catalytically active conformation of EF-Tu and EF-G, thereby stimulating the GTPase activity of both factors. In eukaryotes, the proteins that perform the equivalent function to L7/L12 are called P1 and P2, which do not share sequence similarity with L7/L12. However, a bacterial L7/L12 homolog is found in some eukaryotes, in mitochondria and chloroplasts. In archaea, the protein equivalent to L7/L12 is called aL12 or L12p, but it is closer in sequence to P1 and P2 than to L7/L12.	127
238228	cd00389	microbial_RNases	microbial_RNases. Ribonucleases (RNAses) cleave phosphodiester bonds in RNA and are essential  for both non-specific RNA degradation and for numerous forms of RNA processing. The alignment contains fungal RNases (U2, T1, F1, Th,  Pb, N1, and Ms) and bacterial RNases (barnase, binase, RNase Sa) , the majority of which are guanyl specific and fungal ribotoxins.	71
238229	cd00390	Urease_gamma	Urease gamma-subunit; Urease is a nickel-dependent metalloenzyme that catalyzes the hydrolysis of urea to form ammonia and carbon dioxide. Nickel-dependent ureases are found in bacteria, archaea, fungi and plants. Their primary role is to allow the use of external and internally-generated urea as a nitrogen source. The enzyme consists of three subunits, alpha, beta and gamma, which can exist as separate proteins or can be fused on a single protein chain. The alpha-beta-gamma heterotrimer forms multimers, mainly trimers. The large alpha subunit is the catalytic domain containing an active site with a bi-nickel center complexed by a carbamylated lysine. The beta and gamma subunits play a role in subunit association to form the higher order trimers.	96
238230	cd00392	Ribosomal_L13	Ribosomal protein L13.  Protein L13, a large ribosomal subunit protein, is one of five proteins required for an early folding intermediate of 23S rRNA in the assembly of the large subunit. L13 is situated on the bottom of the large subunit, near the polypeptide exit site.  It interacts with proteins L3 and L6, and forms an extensive network of interactions with 23S rRNA. L13 has been identified as a homolog of the human breast basic conserved protein 1 (BBC1), a protein identified through its increased expression in breast cancer.  L13 expression is also upregulated in a variety of human gastrointestinal cancers, suggesting it may play a role in the etiology of a variety of human malignancies.	114
132923	cd00394	Clp_protease_like	Caseinolytic protease (ClpP) is an ATP-dependent protease. Clp protease (caseinolytic protease; ClpP; endopeptidase Clp; Peptidase S14; ATP-dependent protease, ClpAP)-like enzymes are highly conserved serine proteases and belong to the ClpP/Crotonase superfamily. Included in this family are Clp proteases that are involved in a number of cellular processes such as degradation of misfolded proteins, regulation of short-lived proteins and housekeeping removal of dysfunctional proteins. They are also implicated in the control of cell growth, targeting DNA-binding protein from starved cells. The functional Clp protease is comprised of two components: a proteolytic component and one of several regulatory ATPase components, both of which are required for effective levels of protease activity in the presence of ATP. Active site consists of the triad Ser, His and Asp, preferring hydrophobic or non-polar residues at P1 or P1' positions. The protease exists as a tetradecamer made up of two heptameric rings stacked back-to-back such that the catalytic triad of each subunit is located at the interface between three monomers, thus making oligomerization essential for function. Another family included in this class of enzymes is the signal peptide peptidase A (SppA; S49) which is involved in the cleavage of signal peptides after their removal from the precursor proteins by signal peptidases. Mutagenesis studies suggest that the catalytic center of SppA comprises a Ser-Lys dyad and not the usual Ser-His-Asp catalytic triad found in the majority of serine proteases. In addition to the carboxyl-terminal protease domain that is conserved in all the S49 family members, the E. coli SppA contains an amino-terminal domain. Others, including sohB peptidase, protein C, protein 1510-N and archaeal signal peptide peptidase, do not contain the amino-terminal domain. The third family included in this hierarchy is nodulation formation efficiency D (NfeD) which is a membrane-bound Clp-class protease and only found in bacteria and archaea. Majority of the NfeD genomes have been shown to possess operons containing a homologous NfeD/stomatin gene pair, causing NfeD to be previously named stomatin operon partner protein (STOPP). NfeD homologs can be divided into two groups: long and short forms. Long-form homologs have a putative ClpP-class serine protease domain while the short form homologs do not. Downstream from the ClpP-class domain is the so-called NfeD or DUF107 domain. N-terminal region of the NfeD homolog PH1510 from Pyrococcus horikoshii has been shown to possess serine protease activity having a Ser-Lys catalytic dyad.	161
173893	cd00395	Tyr_Trp_RS_core	catalytic core domain of tyrosinyl-tRNA and tryptophanyl-tRNA synthetase. Tyrosinyl-tRNA synthetase (TyrRS)/Tryptophanyl-tRNA synthetase (TrpRS) catalytic core domain. These enzymes attach Tyr or Trp, respectively, to the appropriate tRNA. These class I enzymes are homodimers, which aminoacylate the 2'-OH of the nucleotide at the 3' end of the appropriate tRNA. The core domain is based on the Rossmann fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. It contains the class I characteristic HIGH and KMSKS motifs, which are involved in ATP binding.	273
100027	cd00396	PurM-like	AIR (aminoimidazole ribonucleotide) synthase related protein. This family includes Hydrogen expression/formation protein HypE, AIR synthases, FGAM (formylglycinamidine ribonucleotide) synthase and Selenophosphate synthetase (SelD). The N-terminal domain of AIR synthase forms the dimer interface of the protein, and is suggested as a putative ATP binding domain.	222
271175	cd00397	DNA_BRE_C	DNA breaking-rejoining enzymes, C-terminal catalytic domain. The DNA breaking-rejoining enzyme superfamily includes type IB topoisomerases and tyrosine based site-specific recombinases (integrases) that share the same fold in their catalytic domain containing conserved active site residues. The best-studied members of this diverse superfamily include Human topoisomerase I, the bacteriophage lambda integrase, the bacteriophage P1 Cre recombinase, the yeast Flp recombinase, and the bacterial XerD/C recombinases. Their overall reaction mechanism is essentially identical and involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. The enzymes differ in that topoisomerases cleave and then rejoin the same 5' and 3' termini, whereas a site-specific recombinase transfers a 5' hydroxyl generated by recombinase cleavage to a new 3' phosphate partner located in a different duplex region. Many DNA breaking-rejoining enzymes also have N-terminal domains, which show little sequence or structure similarity.	167
238232	cd00398	Aldolase_II	Class II Aldolase and Adducin head (N-terminal) domain. Aldolases are ubiquitous enzymes catalyzing central steps of carbohydrate metabolism. Based on enzymatic mechanisms, this superfamily has been divided into two distinct classes (Class I and II). Class II enzymes are further divided into two sub-classes A and B. This family includes class II A aldolases and adducins, which have not been ascribed any enzymatic function. Members of this class are primarily bacterial and eukaryotic in origin and  include L-fuculose-1-phosphate, L-rhamnulose-1-phosphate aldolases and L-ribulose-5-phosphate 4-epimerases. They all share the ability to promote carbon-carbon bond cleavage and stabilize enolate intermediates using divalent cations.	209
259843	cd00399	RNAP_largest_subunit_N	Largest subunit of RNA polymerase (RNAP), N-terminal domain. This region represents the N-terminal domain of the largest subunit of RNA polymerase (RNAP). RNAP is a large multi-protein complex responsible for the synthesis of RNA. It is the principal enzyme of the transcription process, and is a final target in many regulatory pathways that control gene expression in all living cells. At least three distinct RNAP complexes are found in eukaryotic nuclei; RNAP I transcribes the ribosomal RNA precursor, RNAP II the mRNA precursor, and RNAP III the 5S and tRNA genes. A single distinct RNAP complex is found in prokaryotes and archaea, respectively, which may be responsible for the synthesis of all RNAs. Structure studies reveal that prokaryotic and eukaryotic RNAPs share a conserved crab-claw-shaped structure. The largest and the second largest subunits each make up one clamp, one jaw, and part of the cleft. All RNAPs are metalloenzymes. At least one Mg2+ ion is bound in the catalytic center. In addition, all cellular RNAPs contain several tightly bound zinc ions to different subunits that vary between RNAPs from prokaryotic to eukaryotic lineages. This domain represents the N-terminal region of the largest subunit of RNAP, and includes part of the active site. In archaea and some of the photosynthetic organisms or cellular organelle, however, this domain exists as a separate subunit.	528
238233	cd00400	Voltage_gated_ClC	CLC voltage-gated chloride channel. The ClC chloride channels catalyse the selective flow of Cl- ions across cell membranes, thereby regulating electrical excitation in skeletal muscle and the flow of salt and water across epithelial barriers. This domain is found in the halogen ions (Cl-, Br- and I-) transport proteins of the ClC family.  The ClC channels are found in all three kingdoms of life and perform a variety of functions including cellular excitability regulation, cell volume regulation, membrane potential stabilization, acidification of intracellular organelles, signal transduction, transepithelial transport in animals, and the extreme acid resistance response in eubacteria.  They lack any structural or sequence similarity to other known ion channels and exhibit unique properties of ion permeation and gating.  Unlike cation-selective ion channels, which form oligomers containing a single pore along the axis of symmetry, the ClC channels form two-pore homodimers with one pore per subunit without axial symmetry.  Although lacking the typical voltage-sensor found in cation channels, all studied ClC channels are gated (opened and closed) by transmembrane voltage. The gating is conferred by the permeating ion itself, acting as the gating charge.  In addition, eukaryotic and some prokaryotic ClC channels have two additional C-terminal CBS (cystathionine beta synthase) domains of putative regulatory function.	383
240619	cd00401	SAHH	S-Adenosylhomocysteine Hydrolase, NAD-binding and catalytic domains. S-adenosyl-L-homocysteine hydrolase (SAHH, AdoHycase) catalyzes the hydrolysis of S-adenosyl-L-homocysteine (AdoHyc) to form adenosine (Ado) and homocysteine (Hcy). The equilibrium lies far on the side of AdoHyc synthesis, but in nature the removal of Ado and Hcy is sufficiently fast, so that the net reaction is in the direction of hydrolysis. Since AdoHyc is a potent inhibitor of S-adenosyl-L-methionine dependent methyltransferases, AdoHycase plays a critical role in the modulation of the activity of various methyltransferases. The enzyme forms homotetramers, with each monomer binding one molecule of NAD+.	402
293928	cd00402	Riboflavin_synthase_like	Riboflavin synthase and similar proteins. Riboflavin synthase catalyzes the dismutation of two molecules of 6,7-dimethyl-8-(1'-D-ribityl)-lumazine (DMRL) to yield riboflavin (vitamin B2)  and 4-ribitylamino-5-amino-2,6-dihydroxypyrimidine (RAADP). Riboflavin synthase is a homotrimer and the catalysis does not require any cofactors.  Active sites are located between pairs of monomers, but only one active site catalyzes a reaction, the other two sites are inactive. Humans do not produce riboflavin synthase, and thus it is a good target for antimicrobial agents. This family also includes lumazine protein (LumP) from bioluminescent bacteria. LumP serves as an optical transponder in bioluminescence emission.	185
238235	cd00403	Ribosomal_L1	Ribosomal protein L1.  The L1 protein, located near the E-site of the ribosome, forms part of the L1 stalk along with 23S rRNA.  In bacteria and archaea, L1 functions both as a ribosomal protein that binds rRNA, and as a translation repressor that binds its own mRNA.  Like several other large ribosomal subunit proteins, L1 displays RNA chaperone activity.  L1 is one of the largest ribosomal proteins. It is composed of two domains that cycle between open and closed conformations via a hinge motion. The RNA-binding site of L1 is highly conserved, with both mRNA and rRNA binding the same binding site.	208
238236	cd00404	Aconitase_swivel	Aconitase swivel domain. Aconitase (aconitate hydratase) catalyzes the reversible isomerization of citrate and isocitrate as part of the TCA cycle. This is the aconitase swivel domain, which undergoes swivelling conformational change in the enzyme mechanism. The aconitase family contains the following proteins: - Iron-responsive  element binding protein (IRE-BP). IRE-BP is a cytosolic protein that binds to iron-responsive elements (IREs). IREs are stem-loop structures found in the 5'UTR of ferritin, and delta aminolevulinic acid synthase mRNAs, and in the 3'UTR of transferrin receptor mRNA. IRE-BP also expresses aconitase activity. - 3-isopropylmalate dehydratase (isopropylmalate isomerase), the enzyme that catalyzes the second step in the biosynthesis of leucine. - Homoaconitase (homoaconitate hydratase), an enzyme that participates in the alpha-aminoadipate pathway of lysine biosynthesis and that converts cis-homoaconitate into homoisocitric acid.	88
238237	cd00405	PRAI	Phosphoribosylanthranilate isomerase (PRAI) catalyzes the fourth step of the tryptophan biosynthesis, the conversion of N-(5'- phosphoribosyl)-anthranilate (PRA) to 1-(o-carboxyphenylamino)- 1-deoxyribulose 5-phosphate (CdRP). Most PRAIs are monomeric, monofunctional and thermolabile, but in some thermophile organisms PRAI is dimeric for reasons of stability and in others it is fused to other components of the tryptophan biosynthesis pathway to form multifunctional enzymes.	203
238238	cd00407	Urease_beta	Urease beta-subunit; Urease is a nickel-dependent metalloenzyme that catalyzes the hydrolysis of urea to form ammonia and carbon dioxide. Nickel-dependent ureases are found in bacteria, archaea, fungi and plants. Their primary role is to allow the use of external and internally-generated urea as a nitrogen source. The enzyme consists of three subunits, alpha, beta and gamma, which can exist as separate proteins or can be fused on a single protein chain. The alpha-beta-gamma heterotrimer forms multimers, mainly trimers. The large alpha subunit is the catalytic domain containing an active site with a bi-nickel center complexed by a carbamylated lysine. The beta and gamma subunits play a role in subunit association to form the higher order trimers.	101
188630	cd00408	DHDPS-like	Dihydrodipicolinate synthase family. Dihydrodipicolinate synthase family. A member of the class I aldolases, which use an active-site lysine which stabilizes a reaction intermediate via Schiff base formation, and have TIM beta/alpha barrel fold. The dihydrodipicolinate synthase family comprises several pyruvate-dependent class I aldolases that use the same catalytic step to catalyze different reactions in different pathways and includes such proteins as N-acetylneuraminate lyase, MosA protein, 5-keto-4-deoxy-glucarate dehydratase, trans-o-hydroxybenzylidenepyruvate hydratase-aldolase, trans-2'-carboxybenzalpyruvate hydratase-aldolase, and 2-keto-3-deoxy- gluconate aldolase. The family is also referred to as the N-acetylneuraminate lyase (NAL) family.	281
199205	cd00411	L-asparaginase_like	Bacterial L-asparaginases and related enzymes. Asparaginases (amidohydrolases, E.C. 3.5.1.1) are dimeric or tetrameric enzymes that catalyze the hydrolysis of asparagine to aspartic acid and ammonia. In bacteria, there are two classes of amidohydrolases, one highly specific for asparagine and localized to the periplasm (type II L-asparaginase), and a second (asparaginase- glutaminase) present in the cytosol (type I L-asparaginase) that hydrolyzes both asparagine and glutamine with similar specificities and has a lower affinity for its substrate. Bacterial L-asparaginases (type II) are potent antileukemic agents and have been used in the treatment of acute lymphoblastic leukemia (ALL). A conserved threonine residue is thought to supply the nucleophile hydroxy-group that attacks the amide bond. Many bacterial L-asparaginases have both L-asparagine and L-glutamine hydrolysis activities, to a different degree, and some of them are annotated as asparaginase/glutaminase. This wider family also includes a subunit of an archaeal Glu-tRNA amidotransferase.	320
238239	cd00412	pyrophosphatase	Inorganic pyrophosphatase. These enzymes hydrolyze inorganic pyrophosphate (PPi) to two molecules of orthophosphates (Pi). The reaction requires bivalent cations. The enzymes in general exist as homooligomers.	155
185683	cd00413	Glyco_hydrolase_16	glycosyl hydrolase family 16. The O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A glycosyl hydrolase classification system based on sequence similarity has led to the definition of more than 95 different families including glycosyl hydrolase family 16. Family 16 includes lichenase, xyloglucan endotransglycosylase (XET), beta-agarase, kappa-carrageenase, endo-beta-1,3-glucanase, endo-beta-1,3-1,4-glucanase, and endo-beta-galactosidase, all of which have a conserved jelly roll fold with a deep active site channel harboring the catalytic residues.	210
185672	cd00418	GlxRS_core	catalytic core domain of glutamyl-tRNA and glutaminyl-tRNA synthetase. Glutamyl-tRNA synthetase (GluRS)/Glutaminyl-tRNA synthetase (GlnRS) catalytic core domain. These enzymes attach Glu or Gln, respectively, to the appropriate tRNA. Like other class I tRNA synthetases, they aminoacylate the 2'-OH of the nucleotide at the 3' end of the tRNA. The core domain is based on the Rossmann fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. It contains the characteristic class I HIGH and KMSKS motifs, which are involved in ATP binding. These enzymes function as monomers.  Archaea, cellular organelles, and some bacteria lack GlnRS.  In these cases, the "non-discriminating" form of GluRS aminoacylates both tRNA(Glu) and tRNA(Gln) with Glu, which is converted to Gln when appropriate by a transamidation enzyme. The discriminating form of GluRS differs from GlnRS and the non-discriminating form of GluRS in their C-terminal anti-codon binding domains.	230
238240	cd00419	Ferrochelatase_C	Ferrochelatase, C-terminal domain: Ferrochelatase (protoheme ferrolyase or HemH) is the terminal enzyme of the heme biosynthetic pathway. It catalyzes the insertion of ferrous iron into the protoporphyrin IX ring yielding protoheme. This enzyme is ubiquitous in nature and widely distributed in bacteria and eukaryotes. Recently, some archaeal members have been identified. The oligomeric state of these enzymes varies depending on the presence of a dimerization motif at the C-terminus.	135
238241	cd00421	intradiol_dioxygenase	Intradiol dioxygenases catalyze the critical ring-cleavage step in the conversion of catecholate derivatives to citric acid cycle intermediates. This family contains catechol 1,2-dioxygenases and protocatechuate 3,4-dioxygenases which are mononuclear non-heme iron enzymes that catalyze the oxygenation of catecholates to aliphatic acids via the cleavage of aromatic rings. The members are intradiol-cleaving enzymes which break the catechol C1-C2 bond and utilize Fe3+, as opposed to the extradiol-cleaving enzymes which break the C2-C3 or C1-C6 bond and utilize Fe2+ and Mn+. Catechol 1,2-dioxygenases are mostly homodimers with one catalytic ferric ion per monomer. Protocatechuate 3,4-dioxygenases form more diverse oligomers.	146
238242	cd00423	Pterin_binding	Pterin binding enzymes. This family includes dihydropteroate synthase (DHPS) and cobalamin-dependent methyltransferases such as methyltetrahydrofolate, corrinoid iron-sulfur protein methyltransferase (MeTr) and methionine synthase (MetH).  DHPS, a functional homodimer, catalyzes the condensation of p-aminobenzoic acid (pABA) in the de novo biosynthesis of folate, which is an essential cofactor in both nucleic acid and protein biosynthesis. Prokaryotes (and some lower eukaryotes) must synthesize folate de novo, while higher eukaryotes are able to utilize dietary folate and therefore lack DHPS.  Sulfonamide drugs, which are substrate analogs of pABA, target DHPS.  Cobalamin-dependent methyltransferases catalyze the transfer of a methyl group via a methyl- cob(III)amide intermediate.  These include MeTr, a functional heterodimer, and the folate binding domain of MetH.	258
176453	cd00424	PolY	Y-family of DNA polymerases. Y-family DNA polymerases are a specialized subset of polymerases that facilitate translesion synthesis (TLS), a process that allows the bypass of a variety of DNA lesions.  Unlike replicative polymerases, TLS polymerases lack proofreading activity and have low fidelity and low processivity.  They use damaged DNA as templates and insert nucleotides opposite the lesions. The active sites of TLS polymerases are large and flexible to allow the accommodation of distorted bases.  Most TLS polymerases are members of the Y-family, including Pol eta, Pol kappa/IV, Pol iota, Rev1, and Pol V, which is found exclusively in bacteria.  In eukaryotes, the B-family polymerase Pol zeta also functions as a TLS polymerase. Expression of Y-family polymerases is often induced by DNA damage and is believed to be highly regulated. TLS is likely induced by the monoubiquitination of the replication clamp PCNA, which provides a scaffold for TLS polymerases to bind in order to access the lesion.  Because of their high error rates, TLS polymerases are potential targets for cancer treatment and prevention.	343
238243	cd00427	Ribosomal_L29_HIP	Ribosomal L29 protein/HIP.  L29 is a protein of the large ribosomal subunit. A homolog, called heparin/heparan sulfate interacting protein (HIP), has also been identified in mammals.  L29 is located on the surface of the large ribosomal subunit, where it participates in forming a protein ring that surrounds the polypeptide exit channel, providing structural support for the ribosome.  L29 is involved in forming the translocon binding site, along with L19, L22, L23, L24, and L31e.  In addition, L29 and L23 form the interaction site for trigger factor (TF) on the ribosomal surface, adjacent to the exit tunnel.  L29 forms numerous interactions with L23 and with the 23S rRNA. In some eukaryotes, L29 is referred to as L35, which is distinct from L35 found in bacteria and some eukaryotes (primarily plastids and mitochondria).  The mammalian homolog, HIP, is found on the surface of many tissues and cell lines. It is believed to play a role in cell adhesion and modulation of blood coagulation. It has also been shown to inhibit apoptosis in cancer cells.	57
238244	cd00429	RPE	Ribulose-5-phosphate 3-epimerase (RPE). This enzyme catalyses the interconversion of D-ribulose 5-phosphate (Ru5P) into D-xylulose 5-phosphate, as part of the Calvin cycle (reductive pentose phosphate pathway) in chloroplasts and in the oxidative pentose phosphate pathway. In the Calvin cycle Ru5P is phosphorylated by phosphoribulose kinase to ribulose-1,5-bisphosphate, which in turn is used by RubisCO (ribulose-1,5-bisphosphate carboxylase/oxygenase) to incorporate CO2 as the central step in carbohydrate synthesis.	211
143481	cd00430	PLPDE_III_AR	Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzyme Alanine Racemase. This family includes predominantly bacterial alanine racemases (AR), some serine racemases (SerRac), and putative bifunctional enzymes containing N-terminal UDP-N-acetylmuramoyl-tripeptide:D-alanyl-D-alanine ligase (murF) and C-terminal AR domains. These proteins are fold type III PLP-dependent enzymes that play essential roles in peptidoglycan biosynthesis. AR catalyzes the interconversion between L- and D-alanine, which is an essential component of the peptidoglycan layer of bacterial cell walls. SerRac converts L-serine into its D-enantiomer (D-serine) for peptidoglycan synthesis. murF catalyzes the addition of D-Ala-D-Ala to UDPMurNAc-tripeptide, the final step in the synthesis of the cytoplasmic precursor of bacterial cell wall peptidoglycan. Members of this family contain an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain. They exist as homodimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. AR and other members of this family require dimer formation and the presence of the PLP cofactor for catalytic activity. Fungal ARs and eukaryotic serine racemases, which are fold types I and II PLP-dependent enzymes respectively, are excluded from this family.	367
238245	cd00431	cysteine_hydrolases	Cysteine hydrolases; This family contains amidohydrolases, like CSHase (N-carbamoylsarcosine amidohydrolase), involved in creatine metabolism and nicotinamidase, converting nicotinamide to nicotinic acid and ammonia in the pyridine nucleotide cycle. It also contains isochorismatase, an enzyme that catalyzes the conversion of isochorismate to 2,3-dihydroxybenzoate and pyruvate, via the hydrolysis of the vinyl ether bond, and other related enzymes with unknown function.	161
238246	cd00432	Ribosomal_L18_L5e	Ribosomal L18/L5e:  L18 (L5e) is a ribosomal protein found in the central protuberance (CP) of the large subunit. L18 binds 5S rRNA and induces a conformational change that stimulates the binding of L5 to 5S rRNA. Association of 5S rRNA with 23S rRNA depends on the binding of L18 and L5 to 5S rRNA. L18/L5e is generally described as L18 in prokaryotes and archaea, and as L5e (or L5) in eukaryotes. In bacteria, the CP proteins L5, L18, and L25 are required for the ribosome to incorporate 5S rRNA into the large subunit, one of the last steps in ribosome assembly. In archaea, both L18 and L5 bind 5S rRNA; in eukaryotes, only the L18 homolog (L5e) binds 5S rRNA but a homolog to L5 is also identified.	103
238247	cd00433	Peptidase_M17	Cytosol aminopeptidase family, N-terminal and catalytic domains.  Family M17 contains zinc- and manganese-dependent exopeptidases (EC 3.4.11.1), including leucine aminopeptidase. They catalyze removal of amino acids from the N-terminus of a protein and play a key role in protein degradation and in the metabolism of biologically active peptides. They do not contain the HEXXH motif (which is used as one of the signature patterns to group the peptidase families) in the metal-binding site. The two associated zinc ions and the active site are entirely enclosed within the C-terminal catalytic domain in leucine aminopeptidase. The enzyme is a hexamer, with the catalytic domains clustered around the three-fold axis, and the two trimers related to one another by a two-fold rotation. The N-terminal domain is structurally similar to the ADP-ribose binding Macro domain. This family includes proteins from bacteria, archaea, animals and plants.	468
238248	cd00435	ACBP	Acyl CoA binding protein (ACBP) binds thiol esters of long fatty acids and coenzyme A in a one-to-one binding mode with high specificity and affinity. Acyl-CoAs are important intermediates in fatty lipid synthesis and fatty acid degradation and play a role in regulation of intermediary metabolism and gene regulation. The suggested role of ACBP is to act as an intracellular acyl-CoA transporter and pool former. ACBPs are present in a large group of eukaryotic species and several tissue-specific isoforms have been detected.	85
350155	cd00436	UP_TbUP-like	uridine phosphorylases similar to Trypanosoma brucei UP. Uridine phosphorylase (UP) catalyzes the reversible phosphorolysis of uracil ribosides and analogous compounds to their respective nucleobases and ribose 1-phosphate. Trypanosoma brucei UP has a high specificity for uracil-containing (deoxy)nucleosides, and may function as a dimer. This subfamily belongs to the nucleoside phosphorylase-I (NP-I) family, whose members accept a range of purine nucleosides as well as the pyrimidine nucleoside uridine. The NP-I family includes phosphorolytic nucleosidases, such as purine nucleoside phosphorylase (PNPs, EC 2.4.2.1), uridine phosphorylase (UP, EC 2.4.2.3), and 5'-deoxy-5'-methylthioadenosine phosphorylase (MTAP, EC 2.4.2.28), and hydrolytic nucleosidases, such as AMP nucleosidase (AMN, EC 3.2.2.4), and 5'-methylthioadenosine/S-adenosylhomocysteine (MTA/SAH) nucleosidase (MTAN, EC 3.2.2.16). The NP-I family is distinct from nucleoside phosphorylase-II, which belongs to a different structural family.	282
380337	cd00438	cupin_RmlC	RmlC carbohydrate epimerase, involved in dTDP-L-rhamnose production. RmlC (deoxythymidine diphosphate (dTDP)-4-keto-6-deoxy-D-hexulose 3, 5-epimerase or dTDP-6-deoxy-D-xylo-4-hexulose 3,5-epimerase; also known as RfbC) is a carbohydrate epimerase involved in the production of dTDP-L-rhamnose, a precursor of the bacterial cell wall constituent, L-rhamnose. L-Rhamnose (6-deoxy-l-mannose) plays an important role in the cell-wall structure of many bacterial species. It has been found to contribute to the virulence of several species, including the Gram-negative Salmonella enterica and Vibrio cholerae, where it is present as a part of the O-antigen, and is essential for the growth of Gram-positive bacteria such as Streptococcus pyogenes. RmlC converts dTDP-6-deoxy-D-xylo-4-hexulose to dTDP-6-deoxy-L-xylo-hexulose by catalyzing the epimerization of the 5-methyl and 3-hydroxyl groups of hexulose, the third of four steps in the dTDP-L-rhamnose biosynthetic pathway. RmlC belongs to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	168
188631	cd00439	Transaldolase	Transaldolase. Enzymes found in the non-oxidative branch of the pentose phosphate pathway that catalyze the reversible transfer of a dihydroxyacetone group from fructose-6-phosphate to erythrose-4-phosphate, yielding sedoheptulose-7-phosphate and glyceraldehyde-3-phosphate. They are members of the class I aldolases, which are characterized by using a Schiff-base mechanism for stabilization of the reaction intermediates.	252
381596	cd00442	Lyz-like	lysozyme-like domains. This family contains several members, including soluble lytic transglycosylases (SLT), goose egg-white lysozymes (GEWL), hen egg-white lysozymes (HEWL), chitinases, bacteriophage lambda lysozymes, endolysins, autolysins, chitosanases, and pesticin. Typical members are involved in the hydrolysis of beta-1,4- linked polysaccharides.	59
238250	cd00443	ADA_AMPD	Adenosine/AMP deaminase. Adenosine deaminases (ADAs) are present in pro- and eukaryotic organisms and catalyze  the zinc dependent irreversible deamination of adenosine nucleosides to inosine nucleosides and ammonia. The eukaryotic AMP deaminase catalyzes a similar reaction leading to the hydrolytic removal of an amino group at the 6 position of the adenine nucleotide ring, a branch point in the adenylate catabolic pathway.	305
238251	cd00445	Uricase	Urate oxidase (UO, uricase) is a peroxisomal enzyme that catalyzes the oxidation of uric acid to allantoin in most fish, amphibian, and mammalian species. The enzymatic process involves catalyzing the oxidative opening of the purine ring during the purine degradation pathway.  In humans and certain other primates, however, the enzyme has been lost by some unknown mechanism. Each monomer contains two instances of this domain.  Its functional form is a homotetramer for most species, though there are reports that some may form heterotetramers based on a few biochemical studies.	286
271355	cd00446	GrpE	nucleotide exchange factor GrpE. GrpE is the adenine nucleotide exchange factor of DnaK (Hsp70)-type ATPases. In bacteria, the DnaK-DnaJ-GrpE (KJE) chaperone system functions at the fulcrum of protein homeostasis. GrpE participates actively in response to heat shock by preventing aggregation of stress-denatured proteins; unfolded proteins initially bind to DnaJ, the J-domain ATPase-activating protein (Hsp40 family), whereupon DnaK hydrolyzes its bound ATP, resulting in a stable complex. The GrpE dimer binds to the ATPase domain of Hsp70 catalyzing the dissociation of ADP, which enables rebinding of ATP, one step in the Hsp70 reaction cycle in protein folding. In eukaryotes, only the mitochondrial Hsp70, not the cytosolic form, is GrpE dependent.  Over-expression of Hsp70 molecular chaperones is important in suppressing toxicity of aberrantly folded proteins that occur in Alzheimer's disease (AD), Parkinson's disease (PD), amyotrophic lateral sclerosis, as well as several polyQ-diseases  such as Huntington's disease and ataxias.	136
238253	cd00447	NusB_Sun	RNA binding domain of NusB (N protein-Utilization Substance B) and Sun (also known as RrmB or Fmu) proteins. This family includes two orthologous groups exemplified by the transcription termination factor NusB and the N-terminal domain of the rRNA-specific 5-methylcytidine transferase (m5C-methyltransferase) Sun. The NusB protein plays a key role in the regulation of ribosomal RNA biosynthesis in eubacteria by modulating the efficiency of transcriptional antitermination. NusB along with other Nus factors (NusA, NusE/S10 and NusG) forms the core complex with the boxA element of the nut site of the rRNA operons. These interactions help RNA polymerase to counteract polarity during transcription of rRNA operons and allow stable antitermination. The transcription antitermination system can be appropriated by some bacteriophages such as lambda, which use the system to switch between the lysogenic and lytic modes of phage propagation. The m5C-methyltransferase Sun shares the N-terminal non-catalytic RNA-binding domain with NusB.	129
100004	cd00448	YjgF_YER057c_UK114_family	YjgF, YER057c, and UK114 belong to a large family of proteins present in bacteria, archaea, and eukaryotes with no definitive function. The conserved domain is similar in structure to chorismate mutase but there is no sequence similarity and no functional connection. Members of this family have been implicated in isoleucine (Yeo7, Ibm1, aldR) and purine (YjgF) biosynthesis, as well as threonine anaerobic degradation (tdcF) and mitochondrial DNA maintenance (Ibm1). This domain homotrimerizes forming a distinct intersubunit cavity that may serve as a small molecule binding site.	107
238254	cd00449	PLPDE_IV	Pyridoxal 5'-Phosphate Dependent Enzymes class IV (PLPDE_IV). This D-amino acid superfamily, one of five classes of PLPDE, consists of branched-chain amino acid aminotransferases (BCAT), D-amino acid transferases (DAAT), and 4-amino-4-deoxychorismate lyases (ADCL). BCAT catalyzes the reversible transamination reaction between the L-branched-chain amino and alpha-keto acids. DAAT catalyzes the synthesis of D-glutamic acid and D-alanine, and ADCL converts 4-amino-4-deoxychorismate to p-aminobenzoate and pyruvate. Except for a few enzymes, i.e., Escherichia coli and Salmonella BCATs, which are homohexamers arranged as a double trimer, the class IV PLPDEs are homodimers. Homodimer formation is required for catalytic activity.	256
212095	cd00451	GH38N_AMII_euk	N-terminal catalytic domain of eukaryotic class II alpha-mannosidases; glycoside hydrolase family 38 (GH38). The family corresponds to a group of eukaryotic class II alpha-mannosidases (AlphaMII), which contain Golgi alpha-mannosidases II (GMII), the major broad specificity lysosomal alpha-mannosidases (LAM, MAN2B1), the novel core-specific lysosomal alpha 1,6-mannosidases (Epman, MAN2B2), and similar proteins. GMII catalyzes the hydrolysis of both the terminal alpha-1,3-linked and alpha-1,6-linked mannoses from the high-mannose oligosaccharide GlcNAc(Man)5(GlcNAc)2 to yield GlcNAc(Man)3(GlcNAc)2 (GlcNAc, N-acetylglucosamine), which is the committed step of complex N-glycan synthesis. LAM is a broad specificity exoglycosidase hydrolyzing all known alpha 1,2-, alpha 1,3-, and alpha 1,6-mannosidic linkages from numerous high mannose type oligosaccharides. Different from LAM, Epman can efficiently cleave only the alpha 1,6-linked mannose residue from (Man)3GlcNAc, but not (Man)3(GlcNAc)2 or other larger high mannose oligosaccharides, in the core of N-linked glycans.  Members in this family are retaining glycosyl hydrolases of family GH38 that employ a two-step mechanism involving the formation of a covalent glycosyl enzyme complex.  Two carboxylic acids positioned within the active site act in concert: one as a catalytic nucleophile and the other as a general acid/base catalyst.	258
188632	cd00452	KDPG_aldolase	KDPG and KHG aldolase. This family belongs to the class I aldolases whose reaction mechanism involves Schiff base formation between a substrate carbonyl and lysine residue in the active site. 2-keto-3-deoxy-6-phosphogluconate (KDPG) aldolase is best known for its role in the Entner-Doudoroff pathway of bacteria, where it catalyzes the reversible cleavage of KDPG to pyruvate and glyceraldehyde-3-phosphate. 2-keto-4-hydroxyglutarate (KHG) aldolase has enzymatic specificity toward glyoxylate, forming KHG in the presence of pyruvate, and is capable of regulating glyoxylate levels in the glyoxylate bypass, an alternate pathway when bacteria are grown on acetate carbon sources.	190
238255	cd00453	FTBP_aldolase_II	Fructose/tagatose-bisphosphate aldolase class II. This family includes fructose-1,6-bisphosphate (FBP) and tagatose 1,6-bisphosphate (TBP) aldolases. FBP-aldolase is homodimeric and used in gluconeogenesis and glycolysis; the enzyme controls the condensation of dihydroxyacetone phosphate with glyceraldehyde-3-phosphate to yield fructose-1,6-bisphosphate. TBP-aldolase is tetrameric and produces tagatose-1,6-bisphosphate. There is an absolute requirement for a divalent metal ion, usually zinc, and in addition the enzymes are activated by monovalent cations such as Na+. Although structurally similar, the class I aldolases use a different mechanism and are believed to have an independent evolutionary origin.	340
381253	cd00454	TrHb1_N	truncated hemoglobins (TrHbs, 2/2Hb, 2/2 globins); group 1 (N). The M- and S families exhibit the canonical secondary structure of hemoglobins, a 3-over-3 alpha-helical sandwich structure (3/3 Mb-fold), built by eight alpha-helical segments. Truncated hemoglobins (TrHbs, 2/2Hb, or 2/2 globins) or T family globins adopt a 2-on-2 alpha-helical sandwich structure, resulting from extensive and complex modifications of the canonical 3-on-3 alpha-helical sandwich that are distributed throughout the whole protein molecule. They are classified into three main groups based on their structural properties and named after Mycobacterium sp. genes glbN, glbO, and glbP: TrHb1s (N), TrHb2s (O) and TrHb3s (P). Typical of the TrHb1s (N) group is a protein matrix tunnel. It includes a Mycobacterium tuberculosis TrHb1, Mt-trHbN, which is encoded by the glbN gene. Mt-trHbN is expressed during the Mycobacterium stationary phase, and plays a specific defense role against nitrosative stress. The cyanobacterium Synechococcus sp. PCC 7002 TrHb1 GlbN, is constitutively expressed, and likely also protects cells from reactive nitrogen species.	111
238257	cd00455	nuc_hydro	nuc_hydro: Nucleoside hydrolases. Nucleoside hydrolases cleave the N-glycosidic bond in nucleosides generating ribose and the respective base. These enzymes vary in their substrate specificity.  This group contains eukaryotic, bacterial and archaeal proteins similar to the inosine-uridine preferring nucleoside hydrolase from Crithidia fasciculata, the xanthosine-inosine-uridine-adenosine-preferring nucleoside hydrolase RihC from Salmonella enterica serovar Typhimurium, the purine-specific inosine-adenosine-guanosine-preferring nucleoside hydrolase from Trypanosoma vivax, and pyrimidine-specific uridine-cytidine preferring nucleoside hydrolases such as URH1 from Saccharomyces cerevisiae, RihA and RihB from Escherichia coli. Nucleoside hydrolases are of interest as a target for antiprotozoan drugs, as no nucleoside hydrolase activity or genes encoding these enzymes have been detected in humans, and parasitic protozoans lack de novo purine synthesis, relying on nucleoside hydrolase to scavenge purines and/or pyrimidines from the environment.	295
176642	cd00457	PEBP	PhosphatidylEthanolamine-Binding Protein (PEBP) domain. PhosphatidylEthanolamine-Binding Proteins (PEBPs) are represented in all three major phylogenetic divisions (eukaryotes, bacteria, archaea). A number of biological roles for members of the PEBP family include serine protease inhibition, membrane biogenesis, regulation of flowering plant stem architecture, and Raf-1 kinase inhibition. Although their overall structures are similar, the members of the PEBP family bind very different substrates including phospholipids, opioids, and hydrophobic odorant molecules as well as having different oligomerization states (monomer/dimer/tetramer).	159
238258	cd00458	SugarP_isomerase	SugarP_isomerase: Sugar Phosphate Isomerase family; includes type A ribose 5-phosphate isomerase (RPI_A), glucosamine-6-phosphate (GlcN6P) deaminase, and 6-phosphogluconolactonase (6PGL). RPI catalyzes the reversible conversion of ribose-5-phosphate to ribulose 5-phosphate, the first step of the non-oxidative branch of the pentose phosphate pathway. GlcN6P deaminase catalyzes the reversible conversion of GlcN6P to D-fructose-6-phosphate (Fru6P) and ammonium, the last step of the metabolic pathway of N-acetyl-D-glucosamine-6-phosphate. 6PGL converts 6-phosphoglucono-1,5-lactone to 6-phosphogluconate, the second step of the oxidative phase of the pentose phosphate pathway.	169
132901	cd00460	RNAP_RPB11_RPB3	RPB11 and RPB3 subunits of RNA polymerase. The eukaryotic RPB11 and RPB3 subunits of RNA polymerase (RNAP), as well as their archaeal (L and D subunits) and bacterial (alpha subunit) counterparts, are involved in the assembly of RNAP, a large multi-subunit complex responsible for the synthesis of RNA. It is the principal enzyme of the transcription process, and is a final target in many regulatory pathways that control gene expression in all living cells. At least three distinct RNAP complexes are found in eukaryotic nuclei: RNAP I, RNAP II, and RNAP III, for the synthesis of ribosomal RNA precursor, mRNA precursor, and 5S and tRNA, respectively. A single distinct RNAP complex is found in prokaryotes and archaea, which may be responsible for the synthesis of all RNAs. The assembly of the two largest eukaryotic RNAP subunits that provide most of the enzyme's catalytic functions depends on the presence of RPB3/RPB11 heterodimer subunits. This is also true for the archaeal (D/L subunits) and bacterial (alpha subunit) counterparts.	86
238259	cd00462	PTH	Peptidyl-tRNA hydrolase (PTH) is a monomeric protein that cleaves the ester bond linking the nascent peptide and tRNA when peptidyl-tRNA is released prematurely from the ribosome. This ensures the recycling of peptidyl-tRNAs into tRNAs produced through abortion of translation and is essential for cell viability. This group also contains chloroplast RNA splicing 2 (CRS2), which is a closely related nuclear-encoded protein required for the splicing of nine group II introns in chloroplasts.	171
199209	cd00463	Ribosomal_L31e	Eukaryotic/archaeal ribosomal protein L31. Ribosomal protein L31e, which is present in archaea and eukaryotes, binds the 23S rRNA and is one of six protein components encircling the polypeptide exit tunnel. It is a component of the eukaryotic 60S (large) ribosomal subunit, and the archaeal 50S (large) ribosomal subunit.	83
238260	cd00464	SK	Shikimate kinase (SK) is the fifth enzyme in the shikimate pathway, a seven-step biosynthetic pathway which converts erythrose-4-phosphate to chorismic acid, found in bacteria, fungi and plants. Chorismic acid is an important intermediate in the synthesis of aromatic compounds, such as aromatic amino acids, p-aminobenzoic acid, folate and ubiquinone. Shikimate kinase catalyses the phosphorylation of the 3-hydroxyl group of shikimic acid using ATP.	154
238261	cd00465	URO-D_CIMS_like	The URO-D_CIMS_like protein superfamily includes bacterial and eukaryotic uroporphyrinogen decarboxylases (URO-D), coenzyme M methyltransferases and other putative bacterial methyltransferases, as well as cobalamine (B12) independent methionine synthases. Despite their sequence similarities, members of this family have clearly different functions. Uroporphyrinogen decarboxylase (URO-D) decarboxylates the four acetate side chains of uroporphyrinogen III (uro-III) to create coproporphyrinogen III, an important branching point of the tetrapyrrole biosynthetic pathway. The methyltransferases represented here are important for ability of methanogenic organisms to use other compounds than carbon dioxide for reduction to methane, and methionine synthases transfer a methyl group from a folate cofactor to L-homocysteine in a reaction requiring zinc.	306
238262	cd00466	DHQase_II	Dehydroquinase (DHQase), type II. Dehydroquinase (or 3-dehydroquinate dehydratase) catalyzes the reversible dehydration of 3-dehydroquinate to form 3-dehydroshikimate. This reaction is part of two metabolic pathways: the biosynthetic shikimate pathway and the catabolic quinate pathway. There are two types of DHQases, which are distinct from each other in amino acid sequence and three-dimensional structure. Type I enzymes usually catalyze the biosynthetic reaction using a syn elimination mechanism. In contrast, type II enzymes, found in the quinate pathway of fungi and in the shikimate pathway of many bacteria, are dodecameric enzymes that employ an anti elimination reaction mechanism.	140
238263	cd00468	HIT_like	HIT family: HIT (Histidine triad) proteins, named for a motif related to the sequence HxHxH/Qxx (x, a hydrophobic amino acid), are a superfamily of nucleotide hydrolases and transferases, which act on the alpha-phosphate of ribonucleotides. On the basis of sequence, substrate specificity, structure, evolution and mechanism, HIT proteins are classified in the literature into three major branches: the Hint branch, which consists of adenosine 5'-monophosphoramide hydrolases, the Fhit branch, which consists of diadenosine polyphosphate hydrolases, and the GalT branch, which consists of specific nucleoside monophosphate transferases. Further sequence analysis reveals several new closely related, yet uncharacterized subgroups.	86
238264	cd00470	PTPS	6-pyruvoyl tetrahydropterin synthase (PTPS). Folate derivatives are essential cofactors in the biosynthesis of purines, pyrimidines, and amino acids, as well as formyl-tRNA. Mammalian cells are able to utilize pre-formed folates after uptake by a carrier-mediated active transport system. Most microbes and plants lack this system and must synthesize folates de novo from guanosine triphosphate. One enzyme from this pathway is PTPS which catalyzes the conversion of dihydroneopterin triphosphate to 6-pyruvoyl tetrahydropterin. The functional enzyme is a hexamer of identical subunits.	135
100103	cd00472	Ribosomal_L24e_L24	Ribosomal protein L24e/L24 is a ribosomal protein found in eukaryotes (L24) and in archaea (L24e, distinct from archaeal L24). L24e/L24 is located on the surface of the large subunit, adjacent to proteins L14 and L3, and near the translation factor binding site.  L24e/L24 appears to play a role in the kinetics of peptide synthesis, and may be involved in interactions between the large and small subunits, either directly or through other factors. In mouse, a deletion mutation in L24 has been identified as the cause for the belly spot and tail (Bst) mutation that results in disrupted pigmentation, somitogenesis and retinal cell fate determination.  L24 may be an important protein in eukaryotic reproduction:  in shrimp, L24 expression is elevated in the ovary, suggesting a role in oogenesis, and in Arabidopsis, L24 has been proposed to have a specific function in gynoecium development. No protein with sequence or structural homology to L24e/L24 has been identified in bacteria, but a functionally equivalent protein may exist.  Bacterial L19 forms an interprotein beta sheet with L14 that is similar to the L24e/L14 interprotein beta sheet observed in the archaeal L24e structures. Some eukaryotic L24 proteins were initially identified as L30, and this alignment model contains several sequences called L30.	54
275385	cd00473	bS6	Bacterial ribosomal protein S6. bS6 is one of the components of the small subunit of the prokaryotic ribosome, a ribonucleoprotein organelle that decodes the genetic information in messenger RNA and forms peptide bonds to synthesize the corresponding polypeptides. Ribosomes consist of a large and a small subunit, which assemble during the initiation stage of protein synthesis. Prokaryotic ribosomes consist of three molecules of RNA and more than 50 proteins. The small subunits of bacterial and eukaryotic ribosomes have the same overall shapes (with structural elements described as head, body, platform, beak and shoulder). The bacterial ribosomal protein S6 is important for the assembly of the central domain of the small subunit via heterodimerization with ribosomal protein S18.	91
211317	cd00474	eIF1_SUI1_like	Eukaryotic initiation factor 1 and related proteins. Members of the eIF1/SUI1 (eukaryotic initiation factor 1) family are found in eukaryotes, archaea, and some bacteria; eukaryotic members are understood to play an important role in accurate initiator codon recognition during translation initiation. eIF1 interacts with 18S rRNA in the 40S ribosomal subunit during eukaryotic translation initiation. Point mutations in the yeast eIF1 implicate the protein in maintaining accurate start-site selection but its mechanism of action is unknown. The function of non-eukaryotic family members is also unclear.	78
259850	cd00475	Cis_IPPS	Cis (Z)-Isoprenyl Diphosphate Synthases. Cis (Z)-Isoprenyl Diphosphate Synthases (cis-IPPS) catalyze the successive 1'-4 condensation of the isopentenyl diphosphate (IPP) molecule to trans,trans-farnesyl diphosphate (FPP) or to cis,trans-FPP to form long-chain polyprenyl diphosphates. A few can also catalyze the condensation of IPP to trans-geranyl diphosphate to form the short-chain cis,trans- FPP. In prokaryotes, the cis-IPPS, undecaprenyl diphosphate synthase (UPP synthase), catalyzes the formation of the carrier lipid UPP in bacterial cell wall peptidoglycan biosynthesis. Similarly, in eukaryotes, the cis-IPPS, dehydrodolichyl diphosphate (dedol-PP) synthase catalyzes the formation of the polyisoprenoid glycosyl carrier lipid dolichyl monophosphate. cis-IPPS form homodimers and are mechanistically and structurally distinct from trans-IPPS, which lack the DDXXD motifs, yet require Mg2+ for activity.	219
133468	cd00476	SAICAR_synt	5-aminoimidazole-4-(N-succinylcarboxamide) ribonucleotide (SAICAR) synthase. SAICAR synthetase (the PurC gene product) catalyzes the seventh step of the de novo biosynthesis of purine nucleotides (also reported as eighth step). It converts 5-aminoimidazole-4-carboxyribonucleotide (CAIR), ATP, and L-aspartate into 5-aminoimidazole-4-(N-succinylcarboxamide) ribonucleotide (SAICAR), ADP, and phosphate.	230
349750	cd00477	FTHFS	formyltetrahydrofolate synthetase. Formyltetrahydrofolate synthetase (FTHFS) catalyzes the ATP-dependent activation of formate ion via its addition to the N10 position of tetrahydrofolate. FTHFS is a highly expressed key enzyme in both the Wood-Ljungdahl pathway of autotrophic CO2 fixation (acetogenesis) and the glycine synthase/reductase pathways of purinolysis. The key physiological role of this enzyme in acetogens is to catalyze the formylation of tetrahydrofolate, an initial step in the reduction of carbon dioxide and other one-carbon precursors to acetate. In purinolytic organisms, the enzymatic reaction is reversed, liberating formate from 10-formyltetrahydrofolate with concurrent production of ATP.	540
238267	cd00480	malate_synt	Malate synthase catalyzes the Claisen condensation of glyoxylate and acetyl-CoA to malyl-CoA , which hydrolyzes to malate and CoA. This reaction is part of the glyoxylate cycle, which allows certain organisms, like plants and fungi, to derive their carbon requirements from two-carbon compounds, by bypassing the two carboxylation steps of the citric acid cycle.	511
238268	cd00481	Ribosomal_L19e	Ribosomal protein L19e.  L19e is found in the large ribosomal subunit of eukaryotes and archaea. L19e is distinct from the ribosomal subunit L19, which is found in prokaryotes. It consists of two small globular domains connected by an extended segment. It is located toward the surface of the large subunit, with one exposed end involved in forming the intersubunit bridge with the small subunit.  The other exposed end is involved in forming the translocon binding site, along with L22, L23, L24, L29, and L31e subunits.	145
238269	cd00483	HPPK	7,8-dihydro-6-hydroxymethylpterin-pyrophosphokinase (HPPK). Folate derivatives are essential cofactors in the biosynthesis of purines, pyrimidines, and amino acids as well as formyl-tRNA. Mammalian cells are able to utilize pre-formed folates after uptake by a carrier-mediated active transport system. Most microbes and plants lack this system and must synthesize folates de novo from guanosine triphosphate.  One enzyme from this pathway is HPPK which catalyzes pyrophosphoryl transfer from ATP to 6-hydroxymethyl-7,8-dihydropterin (HP). The functional enzyme is a monomer.  Mammals lack many of the enzymes in the folate pathway including, HPPK.	128
238270	cd00484	PEPCK_ATP	Phosphoenolpyruvate carboxykinase (PEPCK), a critical gluconeogenic enzyme, catalyzes the first committed step in the diversion of tricarboxylic acid cycle intermediates toward gluconeogenesis. It catalyzes the reversible decarboxylation and phosphorylation of oxaloacetate to yield phosphoenolpyruvate and carbon dioxide, using a nucleotide molecule (ATP) for the phosphoryl transfer, and has a strict requirement for divalent metal ions for activity. PEPCKs separate into two phylogenetic groups based on their nucleotide substrate specificity; this model describes the ATP-dependent group.	508
238271	cd00487	Pep_deformylase	Polypeptide or peptide deformylase; a family of metalloenzymes that catalyzes the removal of the N-terminal formyl group in a growing polypeptide chain following translation initiation during protein synthesis in prokaryotes. These enzymes utilize Fe(II) as the catalytic metal ion, which can be replaced with a nickel or cobalt ion with no loss of activity. There are two types of peptide deformylases, types I and II, which differ in structure only in the outer surface of the domain. Because these enzymes are essential only in prokaryotes (although eukaryotic gene sequences have been found), they are a target for a new class of antibacterial agents.	141
238272	cd00488	PCD_DCoH	PCD_DCoH: The bifunctional protein pterin-4alpha-carbinolamine dehydratase (PCD), also known as DCoH (dimerization cofactor of hepatocyte nuclear factor-1), is both a transcription activator and a metabolic enzyme.  DCoH stimulates gene expression by associating with specific DNA binding proteins such as HNF-1alpha (hepatocyte nuclear factor-1) and Xenopus enhancer of rudimentary homologue (XERH).  DCoH also catalyzes the dehydration of 4alpha-hydroxy-tetrahydrobiopterin (4alpha-OH-BH4) to quinoid dihydrobiopterin, a precursor of the phenylalanine hydroxylase cofactor BH4 (tetrahydrobiopterin). The DCoH homodimer has a saddle-shaped structure similar to that of TBP (TATA binding protein). Two DCoH proteins have been identified in humans: DCoH1 and DCoH2. Mutations in human DCoH1 cause hyperphenylalaninemia. Loss of enzymic activity of DCoH in humans is associated with the depigmentation disorder vitiligo. DCoH1 has been reported to be overexpressed in colon cancer carcinomas and in malignant melanomas.	75
238273	cd00489	Barstar_like	Barstar is an intracellular inhibitor of barnase, an extracellular ribonuclease of Bacillus amyloliquefaciens. Barstar binds tightly to the barnase active site and sterically blocks it, thus inhibiting its potentially lethal RNase activity inside the cell.  Barstar also binds and inhibits a ribonuclease called RNase Sa (produced by Streptomyces aureofaciens) which belongs  to the same enzyme family as does barnase.	85
119402	cd00490	Met_repressor_MetJ	Met Repressor, MetJ.  MetJ is a bacterial regulatory protein that uses S-adenosylmethionine (SAM) as a corepressor to regulate the production of methionine.  MetJ binds arrays of two to five adjacent copies of an eight base-pair 'metbox' sequence.  MetJ forms sufficiently strong interactions with the sugar-phosphate backbone to accommodate sequence variation in natural operators. However, it is very sensitive to particular base changes in the operator. MetJ exists as a homodimer.	103
238274	cd00491	4Oxalocrotonate_Tautomerase	4-Oxalocrotonate Tautomerase:  Catalyzes the isomerization of unsaturated ketones. The structure is a homohexamer that is arranged as a trimer of dimers. The hexamer contains six active sites, each formed by residues from three monomers, two from one dimer and the third from a neighboring monomer.  Each monomer is a beta-alpha-beta fold with two small beta strands at the C-terminus that fold back on themselves. A pair of monomers form a dimer with two-fold symmetry, consisting of a 4-stranded beta sheet with two helices on one side and two additional small beta strands at each end. The dimers are assembled around a 3-fold axis of rotation to form a hexamer, with the short beta strands from each dimer contacting the neighboring dimers.	58
238275	cd00493	FabA_FabZ	FabA/Z, beta-hydroxyacyl-acyl carrier protein (ACP)-dehydratases: One of several distinct enzyme types of the dissociative, type II, fatty acid synthase system (found in bacteria and plants) required to complete successive cycles of fatty acid elongation. The third step of the elongation cycle, the dehydration of beta-hydroxyacyl-ACP to trans-2-acyl-ACP, is catalyzed by FabA or FabZ.  FabA is bifunctional and catalyzes an additional isomerization reaction of trans-2-acyl-ACP to cis-3-acyl-ACP, an essential reaction to unsaturated fatty acid synthesis.  FabZ is the primary dehydratase that participates in the elongation cycles of saturated as well as unsaturated fatty acid biosynthesis, whereas FabA is more active in the dehydration of beta-hydroxydecanoyl-ACP. The FabA structure is homodimeric with two independent active sites located at the dimer interface.	131
270213	cd00494	PBP2_HMBS	Hydroxymethylbilane synthase possesses the type 2 periplasmic binding protein fold. Hydroxymethylbilane synthase (HMBS), also known as porphobilinogen deaminase (PBGD), is an intermediate enzyme in the biosynthetic pathway of tetrapyrrolic ring systems, such as heme, chlorophyll, vitamin B12 and related macrocycles. HMBS catalyzes the conversion of porphobilinogen (PBG) into hydroxymethylbilane (HMB). This family includes the three domains of HMBS. The enzyme is believed to bind substrate through a hinge-bending motion of domains 1 and 2. The C-terminal domain 3 contains an invariant cysteine that forms the covalent attachment site for the DPM (dipyrromethane) cofactor. HMBS is found in all organisms except viruses. The domains 1 and 2 have the same overall topology as found in the type 2 periplasmic-binding proteins (PBP2), many of which are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor.	274
198379	cd00495	Ribosomal_L25_TL5_CTC	Ribosomal L25/TL5/CTC N-terminal 5S rRNA binding domain. L25 is a single-domain protein, homologous to the N-terminal domain of TL5 and CTC. CTC is a known stress protein, and proteins of this family are believed to have two functions, acting as both ribosomal and stress proteins. In Escherichia coli, cells deleted for L25 were found to be viable; however, these cells grew slowly and had impaired protein synthesis capability. In Bacillus subtilis, CTC is induced under stress conditions and located in the ribosome; it has been proposed that CTC may be necessary for accurate translation under stress conditions. Ribosomal_L25_TL5_CTC is mostly found in bacteria, with a few exceptions such as plants or stramenopiles. Due to its limited taxonomic diversity and the viability of cells deleted for L25, this protein is not believed to be necessary for ribosomal assembly. Eukaryotes contain a protein called L25, which is not homologous to bacterial L25, but rather to bacterial L23.	90
238277	cd00496	PheRS_alpha_core	Phenylalanyl-tRNA synthetase (PheRS) alpha chain catalytic core domain. PheRS belongs to class II aminoacyl-tRNA synthetases (aaRS) based upon its structure and the presence of three characteristic sequence motifs. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. While class II aaRSs generally aminoacylate the 3'-OH ribose of the appropriate tRNA,  PheRS is an exception in that it attaches the amino acid at the 2'-OH group, like class I aaRSs.  PheRS is an alpha-2/ beta-2 tetramer.	218
211322	cd00497	PseudoU_synth_TruA_like	Pseudouridine synthase, TruA family. This group consists of eukaryotic, bacterial and archeal pseudouridine synthases similar to Escherichia coli TruA, Saccharomyces cerevisiae Pus1p, S. cerevisiae Pus3p Caenorhabditis elegans Pus1p and human PUS1. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi).  No cofactors are required. S. cerevisiae PUS1 catalyzes the formation of psi34 and psi36 in the intron containing tRNAIle, psi35 in the intron containing tRNATyr, psi27 and/or psi28 in several yeast cytoplasmic tRNAs and, psi44 in U2 small nuclear RNA (U2 snRNA). The presence of the intron is required for the formation of psi 34, 35 and 36. In addition S. cerevisiae PUS1 makes psi 26, 65 and 67.  C. elegans Pus1p does not modify psi44 in U2 snRNA. S. cerevisiae Pus3p makes psi38 and psi39 in tRNAs. Psi44 in U2 snRNA and, psi38 and psi39 in tRNAs are highly phylogenetically conserved.  Psi 26,27,28,34,35,36,65 and 67 in tRNAs are less highly conserved. Mouse Pus1p regulates nuclear receptor activity through pseudouridylation of Steroid Receptor RNA Activator. Missense mutation in human PUS1 causes mitochondrial myopathy and sideroblastic anemia (MLASA).	215
238278	cd00498	Hsp33	Heat shock protein 33 (Hsp33):  Cytosolic protein that acts as a molecular chaperone under oxidative conditions.  In normal (reducing) cytosolic conditions, four conserved Cys residues are coordinated by a Zn ion.  Under oxidative stress (such as heat shock), the Cys are reversibly oxidized to disulfide bonds, which causes the chaperone activity to be turned on.  Hsp33 is homodimeric in its functional form.	275
238279	cd00501	Peptidase_C15	Pyroglutamyl peptidase (PGP) type I, also known as pyrrolidone carboxyl peptidase (pcp) type I:  Enzymes responsible for cleaving pyroglutamate (pGlu) from the N-terminal end of specialized proteins. The N-terminal pGlu protects these proteins from proteolysis by other proteases until the pGlu is removed by a PGP.  PGPs are cysteine proteases with a Cys-His-Glu/Asp catalytic triad. Type I PGPs are found in a wide variety of prokaryotes and eukaryotes. It is not clear whether the functional form is a monomer, a homodimer, or a homotetramer.	194
188633	cd00502	DHQase_I	Type I 3-dehydroquinase, (3-dehydroquinate dehydratase or DHQase). Type I 3-dehydroquinase, (3-dehydroquinate dehydratase or DHQase). Catalyzes the cis-dehydration of 3-dehydroquinate via a covalent imine intermediate to produce dehydroshikimate. Dehydroquinase is the third enzyme in the shikimate pathway, which is involved in the biosynthesis of aromatic amino acids. Type I DHQase exists as a homodimer. Type II 3-dehydroquinase also catalyzes the same overall reaction, but is unrelated in terms of sequence and structure, and utilizes a completely different reaction mechanism.	225
238280	cd00503	Frataxin	Frataxin is a nuclear-encoded mitochondrial protein implicated in Friedreich's ataxia (FRDA), a human autosomal recessive neurodegenerative disease; Frataxin is found in eukaryotes and in purple bacteria; lack of frataxin causes iron to accumulate in the mitochondrial matrix, suggesting that frataxin is involved in mitochondrial iron homeostasis and possibly in iron transport; the domain has an alpha-beta fold consisting of two helices flanking an antiparallel beta sheet.	105
238281	cd00504	GXGXG	GXGXG domain. This domain of unknown function is found at the C-terminus of the large subunit (gltB) of glutamate synthase (GltS),  in subunit C of tungsten formylmethanofuran dehydrogenase (FwdC) and in subunit C of molybdenum formylmethanofuran dehydrogenase (FmdC). It is also found in a primarily archaeal group of proteins predicted to encode part of the large subunit of GltS. It is characterized by a repeated GXXGXXXG motif. GltS is a complex iron-sulfur flavoprotein that catalyzes the synthesis of L-glutamate from L-glutamine and 2-oxoglutarate. It requires the transfer of ammonia and electrons among three distinct active centers that carry out L-Gln hydrolysis, conversion of 2-oxoglutarate into L-Glu, and electron uptake from a donor. These catalytic sites occur in other domains within the protein or are encoded by separate genes, and are not present in the domain in this CD. FwdC and FmdC are reversible ion pumps that catalyze the formylation and deformylation of methanofuran in hyperthermophiles and bacteria.  They require the presence of either tungsten (FwdC) or molybdenum (FmdC). The specific function of this domain also remains unidentified in the formylmethanofuran dehydrogenases.	149
132996	cd00505	Glyco_transf_8	Members of glycosyltransferase family 8 (GT-8) are involved in lipopolysaccharide biosynthesis and glycogen synthesis. Members of this family are involved in lipopolysaccharide biosynthesis and glycogen synthesis. GT-8 comprises enzymes with a number of known activities: lipopolysaccharide galactosyltransferase, lipopolysaccharide glucosyltransferase 1, glycogenin glucosyltransferase, and N-acetylglucosaminyltransferase. GT-8 enzymes contain a conserved DXD motif which is essential in the coordination of a catalytic divalent cation, most commonly Mn2+.	246
211323	cd00506	PseudoU_synth_TruB_like	Pseudouridine synthase, TruB family. This group consists of eukaryotic, bacterial and archaeal pseudouridine synthases similar to Escherichia coli TruB, Saccharomyces cerevisiae Pus4, M. tuberculosis TruB, S. cerevisiae Cbf5 and human dyskerin. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). No cofactors are required.  E. coli TruB, M. tuberculosis TruB and S. cerevisiae Pus4 make psi55 in the T loop of tRNAs. Pus4 catalyses the formation of psi55 in both cytoplasmic and mitochondrial tRNAs. Psi55 is almost universally conserved. S. cerevisiae Cbf5 and human dyskerin are nucleolar proteins that, with the help of guide RNAs, make the hundreds of pseudouridines present in rRNA and small nuclear RNAs (snRNAs).  Cbf5/Dyskerin is the catalytic subunit of eukaryotic box H/ACA small nucleolar ribonucleoprotein (snoRNP) particles. Mutations in human dyskerin cause X-linked dyskeratosis congenita.	210
238282	cd00508	MopB_CT_Fdh-Nap-like	This CD includes formate dehydrogenases (Fdh) H and N; nitrate reductases, Nap and Nas; and other related proteins. Formate dehydrogenase H is a component of the anaerobic formate hydrogen lyase complex  and catalyzes the reversible oxidation of formate to CO2 with the release of a proton and two electrons. Formate dehydrogenase N (alpha subunit) is the major electron donor to the bacterial nitrate respiratory chain and nitrate reductases, Nap and Nas, catalyze the reduction of nitrate to nitrite. This CD (MopB_CT_Fdh-Nap-like) is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs.	120
238283	cd00512	MM_CoA_mutase	Coenzyme B12-dependent-methylmalonyl coenzyme A (CoA) mutase (MCM)-like family; contains proteins similar to MCM, and the large subunit of Streptomyces coenzyme B12-dependent isobutyryl-CoA mutase (ICM). MCM catalyzes the isomerization of methylmalonyl-CoA to succinyl-CoA. The reaction proceeds via radical intermediates beginning with a substrate-induced homolytic cleavage of the Co-C bond of coenzyme B12 to produce cob(II)alamin and the deoxyadenosyl radical. MCM plays an important role in the conversion of propionyl-CoA to succinyl-CoA during the degradation of propionate for the Krebs cycle. In higher animals, MCM is involved in the breakdown of odd-chain fatty acids, several amino acids, and cholesterol. Methylobacterium extorquens MCM participates in the glyoxylate regeneration pathway. In M. extorquens, MCM forms a complex with MeaB; MeaB may protect MCM from irreversible inactivation. In some bacteria, MCM is involved in the reverse metabolic reaction, the rearrangement of succinyl-CoA to methylmalonyl-CoA. Examples include Propionibacterium shermanii MCM during propionic acid fermentation, E. coli MCM in a pathway for the conversion of succinate to propionate and Streptomyces MCM in polyketide biosynthesis. P. shermanii and Streptomyces cinnamonensis MCMs are alpha/beta heterodimers, with both subunits being homologous members of this family. It has been shown for P. shermanii MCM that only the alpha subunit binds coenzyme B12 and substrates. Human MCM is a homodimer with two active sites. Mouse and E. coli MCMs are also homodimers. ICM from S. cinnamonensis is comprised of a large and a small subunit. The holoenzyme appears to be an alpha2beta2 heterotetramer with up to 2 molecules of coenzyme B12 bound. The small subunit binds coenzyme B12. ICM catalyzes the reversible rearrangement of n-butyryl-CoA to isobutyryl-CoA (intermediates in fatty acid and valine catabolism, which in S. cinnamonensis can be converted to methylmalonyl-CoA and used in polyketide synthesis). In humans, impaired activity of MCM results in methylmalonic aciduria, a disorder of propionic acid metabolism.	399
238284	cd00513	Ribosomal_L32_L32e	Ribosomal_L32_L32e: L32 is a protein from the large subunit that contains a surface-exposed globular domain and a finger-like projection that extends into the RNA core to stabilize the tertiary structure. L32 does not appear to play a role in forming the A (aminoacyl), P (peptidyl) or E (exit) sites of the ribosome, but does interact with 23S rRNA, which has a "kink-turn" secondary structure motif. L32 is overexpressed in human prostate cancer and has been identified as a stably expressed housekeeping gene in macrophages of human chronic obstructive pulmonary disease (COPD) patients. In Schizosaccharomyces pombe, L32 has also been suggested to play a role as a transcriptional regulator in the nucleus. Found in archaea and eukaryotes, this protein is known as L32 in eukaryotes and L32e in archaea.	107
238285	cd00515	HAM1	NTPase/HAM1.  This family consists of the HAM1 protein and pyrophosphate-releasing xanthosine/inosine triphosphatase. HAM1 protects the cell against mutagenesis by the base analog 6-N-hydroxylaminopurine (HAP) in E. coli and S. cerevisiae. A Ham1-related protein from Methanococcus jannaschii is a novel NTPase that has been shown to hydrolyze nonstandard nucleotides such as XTP to XMP and ITP to IMP, but not the standard nucleotides, in the presence of Mg or Mn ions. The enzyme exists as a homodimer. The HAM1 protein may be acting as an NTPase by hydrolyzing the HAP triphosphate.	183
238286	cd00516	PRTase_typeII	Phosphoribosyltransferase (PRTase) type II; This family contains two enzymes that play an important role in NAD production by either allowing quinolinic acid (QA), quinolinate phosphoribosyl transferase (QAPRTase), or nicotinic acid (NA), nicotinate phosphoribosyltransferase (NAPRTase), to be used in the synthesis of NAD. QAPRTase catalyses the reaction of quinolinic acid (QA) with 5-phosphoribosyl-1-pyrophosphate (PRPP) in the presence of Mg2+ to produce nicotinic acid mononucleotide (NAMN), pyrophosphate and carbon dioxide, an important step in the de novo synthesis of NAD. NAPRTase catalyses a similar reaction leading to NAMN and pyrophosphate, using nicotinic acid and PRPP as substrates, used in the NAD salvage pathway.	281
173895	cd00517	ATPS	ATP-sulfurylase. ATP-sulfurylase (ATPS), also known as sulfate adenylate transferase, catalyzes the transfer of an adenylyl group from ATP to sulfate, forming adenosine 5'-phosphosulfate (APS).  This reaction is generally accompanied by a further reaction, catalyzed by APS kinase, in which APS is phosphorylated to yield 3'-phospho-APS (PAPS).  In some organisms the APS kinase is a separate protein, while in others it is incorporated with ATP sulfurylase in a bifunctional enzyme that catalyzes both reactions.  In bifunctional proteins, the domain that performs the kinase activity can be attached at the N-terminal end of the sulfurylase unit or at the C-terminal end, depending on the organism. While the reaction is ubiquitous among organisms, the physiological role of the reaction varies.  In some organisms it is used to generate APS from sulfate and ATP, while in others it proceeds in the opposite direction to generate ATP from APS and pyrophosphate.  ATP sulfurylase can be a monomer, a homodimer, or a homo-oligomer, depending on the organism.  ATPS belongs to a large superfamily of nucleotidyltransferases that includes pantothenate synthetase (PanC), phosphopantetheine adenylyltransferase (PPAT), and the amino-acyl tRNA synthetases. The enzymes of this family are structurally similar and share a dinucleotide-binding domain.	353
99872	cd00518	H2MP	Hydrogenase specific C-terminal endopeptidases, also called Hydrogen Maturation Proteases (H2MP). These enzymes belong to the peptidase family M52. Maturation of [FeNi] hydrogenases includes formation of the nickel metallocenter, proteolytic processing and assembly with other subunits. Hydrogenase maturation endopeptidases are responsible for the proteolytic processing, liberating a short C-terminal peptide by cleaving after a His or an Arg residue, e.g., HycI (E. coli) is involved in processing of HypE, the large subunit of hydrogenase 3. This cleavage is nickel dependent. This CD also includes such hydrogenase-processing proteins as HydD, HupW, and HoxW, as well as proteins of the F420-reducing hydrogenase of methanogens (e.g., FrcD). Also included is the Pyrococcus furiosus FrxA protein, a bifunctional endopeptidase/sulfhydrogenase found in NADP-reducing hyperthermophiles. The Pyrococcus FrxA is not related to those found in Helicobacter pylori.	139
238287	cd00519	Lipase_3	Lipase (class 3).  Lipases are esterases that can hydrolyze long-chain acyl-triglycerides into di- and monoglycerides, glycerol, and free fatty acids at a water/lipid interface.  A typical feature of lipases is "interfacial activation," the process of becoming active at the lipid/water interface, although several examples of lipases have been identified that do not undergo interfacial activation .  The active site of a lipase contains a catalytic triad consisting of Ser - His - Asp/Glu, but unlike most serine proteases, the active site is buried inside the structure.  A "lid" or "flap" covers the active site, making it inaccessible to solvent and substrates. The lid opens during the process of interfacial activation, allowing the lipid substrate access to the active site. 	229
238288	cd00520	RRF	Ribosome recycling factor (RRF). Ribosome recycling factor dissociates the posttermination complex, composed of the ribosome, deacylated tRNA, and mRNA, after termination of translation.  Thus ribosomes are "recycled" and ready for another round of protein synthesis.  RRF is believed to bind the ribosome at the A-site in a manner that mimics tRNA, but the specific mechanisms remain unclear.  RRF is essential for bacterial growth.  It is not necessary for cell growth in archaea or eukaryotes, but is found in mitochondria or chloroplasts of some eukaryotic species.	179
213981	cd00522	Hemerythrin-like	Hemerythrin family. Hemerythrin (Hr) and related proteins are found in bacteria, archaea and eukaryotes. They are non-heme diiron oxygen transport proteins. In addition to oxygen transport, members are involved in cadmium fixation and host anti-bacterial defense. They have the same "four alpha helix bundle" motif and similar active site structures. Some members, like Hr, form oligomers, the octameric form being most prevalent, while others are monomeric.	103
411703	cd00523	Holliday_junction_resolvase	Holliday junction resolvase. Holliday junction resolvases (HJRs) are endonucleases that specifically resolve Holliday junction DNA intermediates during homologous recombination. HJRs occur in archaea, bacteria, and in the mitochondria of certain eukaryotes; however, this CD includes only the archeal HJRs. The bacterial and archeal HJRs perform a similar function but differ in both sequence and structure. Structural similarity does however, exist between the archeal HJRs and type II restriction endonucleases, such as EcoRV, BglII, and Fok, and this similarity includes their active site configurations.	126
238290	cd00524	SORL	Superoxide reductase-like (SORL) domain; present in a family of mononuclear non-heme iron proteins that includes superoxide reductase and desulfoferrodoxin.  Superoxide reductase-like proteins scavenge superoxide anion radicals as a defense mechanism against reactive oxygen species and are found in anaerobic bacteria and archaea, and microaerophilic Treponema pallidum. The SORL domain contains an active iron site, Fe[His4Cys(Glu)], which in the reduced state loses the glutamate ligand. Superoxide reductase (class II) forms a homotetramer with four Fe[His4Cys(Glu)] centers. Desulfoferrodoxin (class I) is a homodimeric protein, with each protomer comprised of two domains, the N-terminal desulforedoxin (DSRD) domain and C-terminal SORL domain. Each domain has a distinct iron center: the DSRD iron center I, Fe(S-Cys)4; and the SORL iron center II, Fe[His4Cys(Glu)].	86
238291	cd00525	AE_Prim_S_like	AE_Prim_S_like: primase domain similar to that found in the small subunit of archaeal and eukaryotic (A/E) DNA primases. The replication machineries of A/Es are distinct from that of bacteria. Primases are DNA-dependent RNA polymerases which synthesize the short RNA primers required for DNA replication. In eukaryotes, this small catalytically active primase subunit (p50) and a larger primase subunit (p60), referred to jointly as the core primase, associate with the B subunit and the DNA polymerase alpha subunit in a complex, called Pol alpha-pri. In addition to its catalytic role in replication, eukaryotic DNA primase may play a role in coupling replication to DNA damage repair and in checkpoint control during S phase. Pfu41 and Pfu46 comprise the primase complex of the archaeon Pyrococcus furiosus; these proteins have sequence identity to the eukaryotic p50 and p60 primase proteins respectively. Pfu41 preferentially uses dNTPs as substrate. Pfu46 regulates the primase activity of Pfu41. Also found in this group is the primase-polymerase (primpol) domain of replicases from archaeal plasmids including the ORF904 protein of pRN1 from Sulfolobus islandicus (pRN1 primpol). The pRN1 primpol domain exhibits DNA polymerase and primase activities; a cluster of active site residues (three acidic residues, and a histidine) is required for both these activities. The pRN1 primpol primase activity prefers dNTPs to rNTPs; however incorporation of dNTPs requires rNTP as cofactor. This group also includes the Pol domain of bacterial LigD proteins such as Mycobacterium tuberculosis (Mt)LigD. MtLigD contains an N-terminal Pol domain, a central phosphoesterase module, and a C-terminal ligase domain. LigD Pol plays a role in non-homologous end joining (NHEJ)-mediated repair of DNA double-strand breaks (DSB) in vivo, perhaps by filling in short 5'-overhangs with ribonucleotides; the filled in termini would be sealed by the associated LigD ligase domain. The MtLigD Pol domain is stimulated by manganese, is error-prone, and prefers adding rNTPs to dNTPs in vitro.	136
238292	cd00527	IF6	Ribosome anti-association factor IF6 binds the large ribosomal subunit and prevents the two subunits from associating during translation initiation. IF6 comprises a family of translation factors that includes both eukaryotic (eIF6) and archeal (aIF6) members.  All members of this family have a conserved pentameric fold referred to as a beta/alpha propeller. The eukaryotic IF6 members have a moderately conserved C-terminal extension which is not required for ribosomal binding, and may have an alternative function.	220
238293	cd00528	MoaC	MoaC family. Members of this family are involved in molybdenum cofactor (Moco) biosynthesis, an essential cofactor of a diverse group of redox enzymes. MoaC, a small hexameric protein, converts, together with MoaA, a guanosine derivative to the precursor Z by inserting the carbon-8 of the purine between the 2' and 3' ribose carbon atoms, which is the first of three phases of Moco biosynthesis.	136
340812	cd00529	RuvC_like	Crossover junction endodeoxyribonuclease RuvC and similar proteins. The RuvC-like family consists of bacterial RuvC, fungal Cruciform cutting endonuclease 1 (CCE1), and bacterial YqgF. RuvC and CCE1 are Holliday junction resolvases (HJRs), endonucleases that specifically resolve Holliday junction DNA intermediates during homologous recombination. RuvC is part of the RuvABC pathway in Escherichia coli and other Gram-negative bacteria that is involved in processing Holliday junctions, which are formed by the reciprocal exchange of strands between two DNA duplexes. CCE1 is a HJR specific for 4-way junctions; it is involved in the maintenance of mitochondrial DNA. Escherichia coli YqgF has been shown to act as a pre-16S rRNA nuclease, presumably as a monomer. It is involved in the processing of pre-16S rRNA during ribosome maturation. HJRs occur in archaea, bacteria, and in the mitochondria of certain fungi. RuvC and its orthologs are homodimers and display structural similarity to RNase H and Hsp70.	117
238295	cd00530	PTE	Phosphotriesterase (PTE) catalyzes the hydrolysis of organophosphate nerve agents, including the chemical warfare agents VX, soman, and sarin as well as the insecticide paraoxon. PTE exists as a homodimer with one active site per monomer. The active site is located next to a binuclear metal center, at the C-terminal end of a TIM alpha- beta barrel motif.  The native enzyme contains two zinc ions at the active site however these can be replaced with other metals such as cobalt, cadmium, nickel or manganese and the enzyme remains active.	293
238296	cd00531	NTF2_like	Nuclear transport factor 2 (NTF2-like) superfamily. This family includes members of the NTF2 family, Delta-5-3-ketosteroid isomerases, Scytalone Dehydratases, and the beta subunit of Ring hydroxylating dioxygenases. This family is a classic example of divergent evolution wherein the proteins have many common structural details but diverge greatly in their function. For example,  nuclear transport factor 2 (NTF2) mediates the nuclear import of RanGDP and  binds to both RanGDP and FxFG repeat-containing nucleoporins while Ketosteroid isomerases catalyze the isomerization of delta-5-3-ketosteroid to delta-4-3-ketosteroid, by intramolecular transfer of the C4-beta proton to the C6-beta position. While the function of the beta sub-unit of the Ring hydroxylating dioxygenases is not known, Scytalone Dehydratases catalyzes two reactions in the biosynthetic pathway that produces fungal melanin. Members of the NTF2-like superfamily are widely distributed among bacteria, archaea and eukaryotes.	124
238297	cd00532	MGS-like	MGS-like domain. This domain composes the whole protein of methylglyoxal synthetase, which catalyzes the enolization of dihydroxyacetone phosphate (DHAP) to produce methylglyoxal. The family also includes the C-terminal domain in carbamoyl phosphate synthetase (CPS) where it catalyzes the last phosphorylation of a carboxyphosphate intermediate to form the product carbamoyl phosphate and may also play a regulatory role. This family also includes inosine monophosphate cyclohydrolase. The known structures in this family show a common phosphate binding site.	112
238298	cd00534	DHNA_DHNTPE	Dihydroneopterin aldolase (DHNA) and 7,8-dihydroneopterin triphosphate epimerase domain (DHNTPE); these enzymes have been designated folB and folX, respectively. Folate derivatives are essential cofactors in the biosynthesis of purines, pyrimidines, and amino acids, as well as formyl-tRNA. Mammalian cells are able to utilize pre-formed folates after uptake by a carrier-mediated active transport system. Most microbes and plants lack this system and must synthesize folates de novo from guanosine triphosphate. One enzyme from this pathway is DHNA which catalyses the conversion of 7,8-dihydroneopterin to 6-hydroxymethyl-7,8-dihydropterin in the biosynthetic pathway of tetrahydrofolate.  Though it is known that DHNTPE catalyzes the epimerization of dihydroneopterin triphosphate to dihydromonapterin triphosphate, the biological role of this enzyme is still unclear. It is hypothesized that it is not an essential protein, since a folX knockout in E. coli has a normal phenotype and folX is not present in H. influenzae. In addition, both enzymes have been shown to be able to compensate for the other's activity, albeit at slower reaction rates.  The functional enzyme for both is an octamer of identical subunits. Mammals lack many of the enzymes in the folate pathway, including DHNA and DHNTPE.	118
238299	cd00537	MTHFR	Methylenetetrahydrofolate reductase (MTHFR). 5,10-Methylenetetrahydrofolate is reduced to 5-methyltetrahydrofolate by methylenetetrahydrofolate reductase, a cytoplasmic, NAD(P)-dependent enzyme. 5-methyltetrahydrofolate is utilized by methionine synthase to convert homocysteine to methionine. The enzymatic mechanism is a ping-pong bi-bi mechanism, in which NAD(P)+ release precedes the binding of methylenetetrahydrofolate and the acceptor is free FAD. The family includes the 5,10-methylenetetrahydrofolate reductase EC:1.7.99.5 from prokaryotes and methylenetetrahydrofolate reductase EC: 1.5.1.20 from eukaryotes. The bacterial enzyme is a homotetramer and NADH is the preferred reductant while the eukaryotic enzyme is a homodimer and NADPH is the preferred reductant. In humans, there are several clinically significant mutations in MTHFR that result in hyperhomocysteinemia, which is a risk factor for the development of cardiovascular disease.	274
238300	cd00538	PA	PA: Protease-associated (PA) domain. The PA domain is an insert domain in a diverse fraction of proteases. The significance of the PA domain to many of the proteins in which it is inserted is undetermined. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. Proteins into which the PA domain is inserted include the following: i) various signal peptide peptidases including, hSPPL2a and 2b which catalyze the intramembrane proteolysis of tumor necrosis factor alpha, ii) various proteins containing a C3H2C3 RING finger including, Arabidopsis ReMembR-H2 protein and various E3 ubiquitin ligases such as human GRAIL (gene related to anergy in lymphocytes), iii) EDEM3 (ER-degradation-enhancing mannosidase-like 3 protein), iv) various plant vacuolar sorting receptors such as Pisum sativum BP-80, v) glutamate carboxypeptidase II (GCPII), vi) yeast aminopeptidase Y, vii) Vibrio metschnikovii VapT, a sodium dodecyl sulfate (SDS) resistant extracellular alkaline serine protease, viii) lactocepin (a cell envelope-associated  protease from Lactobacillus paracasei subsp. paracasei NCDO 151), ix) various subtilisin-like proteases such as melon Cucumisin, and x) human TfR (transferrin receptor) 1 and 2.	126
238301	cd00539	MCR_gamma	Methyl-coenzyme M reductase (MCR) gamma subunit. MCR catalyzes the terminal step of methane formation in the energy metabolism of all methanogenic archaea, in which methyl-coenzyme M and coenzyme B are converted to methane and the heterodisulfide of coenzyme M and coenzyme B (CoM-S-S-CoB). MCR is a dimer of trimers, each of which consists of one alpha, one beta, and one gamma subunit, with two identical active sites containing nickel porphinoid factor 430 (F430).	246
187726	cd00540	AAG	Alkyladenine DNA glycosylase catalyzes the first step in base excision repair. Alkyladenine DNA glycosylase (AAG), also known as 3-methyladenine DNA glycosylase, catalyzes the first step in base excision repair (BER) by cleaving damaged DNA bases within double-stranded DNA to produce an abasic site. AAG bends DNA by intercalating between the base pairs, causing the damaged base to flip out of the double helix and into the enzyme active site for cleavage. Although AAG represents one of six DNA glycosylase classes, it lacks the helix-hairpin-helix active site motif associated with other BER glycosylases and is structurally distinct from them.	187
238302	cd00541	OMPLA	The outer membrane phospholipase A (OMPLA) is an integral membrane enzyme that catalyses the hydrolysis of acylester bonds in phospholipids using calcium as a cofactor. The enzyme has a fold of transmembrane beta-barrels and is widespread among Gram-negative bacteria, both in pathogens and nonpathogens. In pathogenic bacteria such as Campylobacter coli and Helicobacter pylori OMPLA is involved in pathogenesis and virulence. In nonpathogenic bacteria the physiological function of OMPLA is less clear. The Escherichia coli enzyme is involved in the secretion of bacteriocins, antibacterial peptides that are produced in order to survive under starvation conditions. The enzyme activity of OMPLA is strictly regulated to prevent uncontrolled breakdown of the surrounding phospholipids. The activity of OMPLA can be induced by membrane perturbation and concurs with dimerization of the enzyme.	231
238303	cd00542	Ntn_PVA	Penicillin V acylase (PVA), also known as conjugated bile salt acid hydrolase (CBAH), catalyzes the hydrolysis of penicillin V to yield 6-amino penicillanic acid (6-APA), an important key intermediate of semisynthetic penicillins.  PVA has an N-terminal nucleophilic cysteine, as do other members of the Ntn hydrolase family to which PVA belongs.  This nucleophilic cysteine is exposed by post-translational processing of the PVA precursor. PVA forms a homotetramer.	303
410861	cd00544	CobU	Adenosylcobinamide kinase / adenosylcobinamide phosphate guanyltransferase (CobU). CobU is a bacterial bifunctional cobalamin biosynthesis enzyme which displays adenosylcobinamide kinase and adenosylcobinamide phosphate guanyltransferase activity and is a key participant in the final stages of cobalamin biosynthesis where it is involved in nucleotide loop assembly. CobU is a homotrimer which functions both as a kinase and as a nucleotidyl transferase. It phosphorylates adenosylcobinamide to form adenosylcobinamide phosphate (using a variety of nucleotides as the phosphate donor) and then adds GMP to adenosylcobinamide phosphate to form adenosylcobinamide-GDP, specifically using GTP.	166
238305	cd00545	MCH	Methenyltetrahydromethanopterin (methenyl-H4MPT) cyclohydrolase (MCH). MCH is a cytoplasmic enzyme that has been identified in methanogenic archaea, sulfate-reducing archaea, and methylotrophic bacteria.  It catalyzes the reversible formation of N(5),N(10)-methenyltetrahydromethanopterin (methenyl-H4MPT+) from N(5)-formyltetrahydromethanopterin (formyl-H4MPT), in the third step of the reaction to reduce CO2 to CH4. The protein functions as a homodimer or homotrimer, depending on the organism.	312
238306	cd00546	QFR_TypeD_subunitC	Quinol:fumarate reductase (QFR) Type D subfamily, 15kD hydrophobic subunit C;  QFR couples the reduction of fumarate to succinate to the oxidation of quinol to quinone, the opposite reaction to that catalyzed by the related protein, succinate:quinone oxidoreductase (SQR). QFRs oxidize low potential quinols such as menaquinol and are involved in anaerobic respiration with fumarate as the terminal electron acceptor. SQR and QFR share a common subunit arrangement, composed of a flavoprotein catalytic subunit, an iron-sulfur protein and one or two hydrophobic transmembrane subunits. Members of this subfamily are classified as Type D as they contain two transmembrane subunits (C and D) and no heme groups.  The structural arrangement allows efficient electron transfer between the catalytic subunit, through iron-sulfur centers, and the transmembrane subunit containing the electron donor (quinol). The quinone binding site resides in the transmembrane subunits.	124
238307	cd00547	QFR_TypeD_subunitD	Quinol:fumarate reductase (QFR) Type D subfamily, 13kD hydrophobic subunit D; QFR couples the reduction of fumarate to succinate to the oxidation of quinol to quinone, the opposite reaction to that catalyzed by the related protein, succinate:quinone oxidoreductase (SQR). QFRs oxidize low potential quinols such as menaquinol and are involved in anaerobic respiration with fumarate as the terminal electron acceptor. SQR and QFR share a common subunit arrangement, composed of a flavoprotein catalytic subunit, an iron-sulfur protein and one or two hydrophobic transmembrane subunits. Members of this subfamily are classified as Type D as they contain two transmembrane subunits (C and D) and no heme groups.  The structural arrangement allows efficient electron transfer between the catalytic subunit, through iron-sulfur centers, and the transmembrane subunit containing the electron donor (quinol). The quinone binding site resides in the transmembrane subunits.	115
349426	cd00548	NrfA-like	cytochrome c nitrite reductase and similar proteins. This family contains cytochrome c nitrite reductase (also known as cytochrome c552, or NrfA) and similar proteins. The pentaheme enzyme NrfA catalyzes the electron reduction of nitrite to ammonia in the nitrogen cycle. This enzyme can also transform nitrogen monoxide and hydroxylamine, two potential bound reaction intermediates, into ammonia. It is a homodimer, with each monomer containing four classical CXXCH type heme-binding sites along with an alternative CXXCK heme-binding motif, which is important for catalysis. This family also includes octaheme nitrite reductase (TvNiR) from the haloalkaliphilic bacterium Thioalkalivibrio paradoxus which catalyzes the reduction of nitrite and hydroxylamine to ammonia as well as the reduction of sulfite to sulfide.	370
200451	cd00551	AmyAc_family	Alpha amylase catalytic domain family. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; and C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost this catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	260
238308	cd00552	RaiA	RaiA ("ribosome-associated inhibitor A"), also known as Protein Y (PY), YfiA, and SpotY, is a stress-response protein that binds the ribosomal subunit interface and arrests translation by interfering with aminoacyl-tRNA binding to the ribosomal A site.  RaiA is also thought to counteract miscoding at the A site thus reducing translation errors. The RaiA fold structurally resembles the double-stranded RNA-binding domain (dsRBD).	93
238309	cd00553	NAD_synthase	NAD+ synthase is a homodimer, which catalyzes the final step in de novo nicotinamide adenine dinucleotide (NAD+) biosynthesis, an amide transfer from either ammonia or glutamine to nicotinic acid adenine dinucleotide (NaAD). The conversion of NaAD to NAD+ occurs via an NAD-adenylate intermediate and requires ATP and Mg2+. The intermediate is subsequently cleaved into NAD+ and AMP. In many prokaryotes, such as E. coli, NAD synthetase consists of a single domain and is strictly ammonia dependent. In contrast, eukaryotes and other prokaryotes have an additional N-terminal amidohydrolase domain that prefers glutamine. Interestingly, NAD+ synthases in these prokaryotes can also utilize ammonia as an amide source.	248
100025	cd00554	MECDP_synthase	MECDP_synthase (2-C-methyl-D-erythritol-2,4-cyclodiphosphate synthase), encoded by the ispF gene, catalyzes the formation of 2-C-methyl-D-erythritol 2,4-cyclodiphosphate (MEC) in the non-mevalonate deoxyxylulose (DOXP) pathway for isoprenoid biosynthesis. This pathway is present in bacteria, plants and some protozoa but is distinct from that used by mammals and Archaea.  MECDP_synthase forms a homotrimer, carrying three active sites, each of which is formed in a cleft between pairs of subunits.	153
238310	cd00555	Maf	Nucleotide binding protein Maf. Maf has been implicated in inhibition of septum formation in eukaryotes, bacteria and archaea, but homologs in B.subtilis and S.cerevisiae are nonessential for cell division. Maf has been predicted to be a nucleotide- or nucleic acid-binding protein with structural similarity to the hypoxanthine/xanthine NTP pyrophosphatase Ham1 from Methanococcus jannaschii, RNase H from Escherichia coli, and some other nucleotide or RNA-binding proteins.	180
238311	cd00556	Thioesterase_II	Thioesterase II (TEII) is thought to regenerate misprimed nonribosomal peptide synthetases (NRPSs) as well as modular polyketide synthases (PKSs) by hydrolyzing acetyl groups bound to the peptidyl carrier protein (PCP) and acyl carrier protein (ACP) domains, respectively. TEII has two tandem asymmetric hot dog folds that are structurally similar to one found in PaaI thioesterase, 4-hydroxybenzoyl-CoA thioesterase (4HBT) and beta-hydroxydecanoyl-ACP dehydratase and thus, the TEII monomer is equivalent to the homodimeric form of the latter three enzymes. Human TEII is expressed in T cells and has been shown to bind the product of the HIV-1 Nef gene.	99
238312	cd00557	Translocase_SecB	Preprotein translocase subunit SecB. SecB is a cytoplasmic component of the multisubunit membrane-bound enzyme termed Sec protein translocase, which is the main constituent of the General Secretory (type II) Pathway involved in translocation of nascent polypeptides across the cytoplasmic membrane. SecB has been shown to function as export-specific molecular chaperone that selectively binds preproteins, maintains them in a translocation competent state and delivers them to SecA, the membrane-bound ATPase, that drives the translocation reaction. In solution, SecB exists as homotetramer, which is organized as a dimer of dimers.	131
238313	cd00559	Cyanase_C	Cyanase C-terminal domain. Cyanase (Cyanate lyase) is responsible for the hydrolysis of cyanate.  It catalyzes the reaction of cyanate with bicarbonate to produce ammonia and carbon dioxide. This allows organisms that possess the enzyme to overcome the toxicity of environmental cyanate and to use cyanate as a source of nitrogen for growth. This enzyme is a homodecamer, formed by five dimers. Each monomer is composed of two domains, an N-terminal helix-turn-helix and this structurally unique C-terminal domain.	69
185673	cd00560	PanC	Pantoate-beta-alanine ligase. PanC (pantoate-beta-alanine ligase), also known as pantothenate synthase, catalyzes the formation of pantothenate from pantoate and beta-alanine.  PanC belongs to a large superfamily of nucleotidyltransferases that includes ATP sulfurylase (ATPS), phosphopantetheine adenylyltransferase (PPAT), and the amino-acyl tRNA synthetases. The enzymes of this family are structurally similar and share a dinucleotide-binding domain.	277
410862	cd00561	CobA_ACA	CobA-type ATP:corrinoid adenosyltransferase. CobA-ATP:corrinoid adenosyltransferase is one of three sequence and structurally unrelated groups of ATP:corrinoid adenosyltransferase, PduO and EutT being the other two. CobA has been shown to be involved in cobalamin (vitamin B12) biosynthesis and scavenging of incomplete corrinoids. This enzyme is a homodimer,  which catalyzes the adenosylation reaction: ATP + cob(I)alamin + H2O 	170
238315	cd00562	NifX_NifB	This CD represents a family of iron-molybdenum cluster-binding proteins that includes NifB, NifX, and NifY, all of which are involved in the synthesis of an iron-molybdenum cofactor (FeMo-co) that binds the active site of the dinitrogenase enzyme.  This domain is a predicted small-molecule-binding domain (SMBD) with an alpha/beta fold that is present either as a stand-alone domain (e.g. NifX and NifY) or fused to another conserved domain (e.g. NifB); however, its function is still undetermined. The SCOP database suggests that this domain is most similar to structures within the ribonuclease H superfamily.  This conserved domain is represented in two of the three major divisions of life (bacteria and archaea).	102
238316	cd00563	Dtyr_deacylase	D-Tyrosyl-tRNAtyr deacylases; a class of tRNA-dependent hydrolases which are capable of hydrolyzing the ester bond of D-Tyrosyl-tRNA reducing the level of cellular D-Tyrosine while recycling the peptidyl-tRNA; found in bacteria and in eukaryotes but not in archaea; beta barrel-like fold structure; forms homodimers in which two surface cavities serve as the active site for tRNA binding	145
238317	cd00564	TMP_TenI	Thiamine monophosphate synthase (TMP synthase)/TenI. TMP synthase catalyzes an important step in the thiamine biosynthesis pathway, the substitution of the pyrophosphate of 2-methyl-4-amino-5-hydroxymethylpyrimidine pyrophosphate by 4-methyl-5-(beta-hydroxyethyl)thiazole phosphate to yield thiamine phosphate. TenI is an enzymatically inactive regulatory protein involved in the regulation of several extracellular enzymes. This superfamily also contains other enzymatically inactive proteins with unknown functions.	196
340451	cd00565	Ubl_ThiS	ubiquitin-like (Ubl) domain found in sulfur carrier protein ThiS. ThiS, also termed Thiamine biosynthesis protein (ThiaminS), is a sulfur carrier protein involved in thiamin biosynthesis in prokaryotes. It has the beta-grasp ubiquitin-like (Ubl) fold with low sequence similarity to ubiquitin (Ub), and is activated in an ATP-dependent manner by sulfurtransferases, similar to the activation mechanism of Ub-activating enzyme E1. ThiS has common evolutionary origin with Ub-related protein modifiers in eukaryotes, a beta-grasp fold as Ub, and is closely related to proteins MoaD and Urm1.	64
173838	cd00567	ACAD	Acyl-CoA dehydrogenase. Both mitochondrial acyl-CoA dehydrogenases (ACAD) and peroxisomal acyl-CoA oxidases (AXO) catalyze the alpha,beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. The reduced form of ACAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. In contrast, AXO catalyzes a different oxidative half-reaction, in which the reduced FAD is reoxidized by molecular oxygen. The ACAD family includes the eukaryotic beta-oxidation enzymes, short (SCAD), medium (MCAD), long (LCAD) and very-long (VLCAD) chain acyl-CoA dehydrogenases. These enzymes all share high sequence similarity, but differ in their substrate specificities.  The ACAD family also includes amino acid catabolism enzymes such as Isovaleryl-CoA dehydrogenase (IVD), short/branched chain acyl-CoA dehydrogenases (SBCAD), Isobutyryl-CoA dehydrogenase (IBDH), glutaryl-CoA dehydrogenase (GCD) and Crotonobetainyl-CoA dehydrogenase.  The mitochondrial ACADs are generally homotetramers, except for VLCAD, which is a homodimer. Related enzymes include the SOS adaptive response protein aidB, Naphthocyclinone hydroxylase (NcnH), and Dibenzothiophene (DBT) desulfurization enzyme C (DszC).	327
238318	cd00568	TPP_enzymes	Thiamine pyrophosphate (TPP) enzyme family, TPP-binding module; found in many key metabolic enzymes which use TPP (also known as thiamine diphosphate) as a cofactor. These enzymes include, among others, the E1 components of the pyruvate, the acetoin and the branched chain alpha-keto acid dehydrogenase complexes.	168
259851	cd00569	HTH_Hin_like	Helix-turn-helix domain of Hin and related proteins. This domain model summarizes a family of DNA-binding domains unique to bacteria and represented by the Hin protein of Salmonella. The basic HTH domain is a simple fold comprised of three core helices that form a right-handed helical bundle. The principal DNA-protein interface is formed by the third helix, the recognition helix, inserting itself into the major groove of the DNA. A diverse array of HTH domains participate in a variety of functions that depend on their DNA-binding properties. HTH_Hin represents one of the simplest versions of the HTH domains; the characterization of homologous relationships between various sequence-diverse HTH domain families remains difficult. The Hin recombinase induces the site-specific inversion of a chromosomal DNA segment containing a promoter, which controls the alternate expression of two genes by reversibly switching orientation. The Hin recombinase consists of a single polypeptide chain containing a C-terminal DNA-binding domain (HTH_Hin) and a catalytic domain.	42
238319	cd00570	GST_N_family	Glutathione S-transferase (GST) family, N-terminal domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of  glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK subfamily, a member of the DsbA family). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction  and isomerization of certain compounds. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxin 2 and stringent starvation protein A.	71
238320	cd00571	UreE	UreE urease accessory protein. UreE is a metallochaperone assisting the insertion of a Ni2+ ion in the active site of urease, an important step in the in vivo assembly of urease, an enzyme that hydrolyses urea into ammonia and carbamic acid. The C-terminal region of UreE contains a histidine rich nickel binding site.	136
238321	cd00575	NOS_oxygenase	Nitric oxide synthase (NOS) produces nitric oxide (NO) by catalyzing a five-electron heme-based oxidation of a guanidine nitrogen of L-arginine to L-citrulline via two successive monooxygenation reactions producing N(omega)-hydroxy-L-arginine (NHA) as an intermediate. In mammals, there are three distinct NOS isozymes: neuronal (nNOS or NOS-1), cytokine-inducible (iNOS or NOS-2) and endothelial (eNOS or NOS-3). Nitric oxide synthases are homodimers. In eukaryotes, each monomer has an N-terminal oxygenase domain which binds to the substrate L-Arg, zinc, and to the cofactors heme and (6R)-5,6,7,8-tetrahydrobiopterin (BH4). Eukaryotic NOSs also have a C-terminal electron supplying reductase region, which is homologous to cytochrome P450 reductase and binds NADH, FAD and FMN.  While prokaryotes can produce NO as a byproduct of denitrification, using a completely different set of enzymes than NOS, a few prokaryotes also have a NOS which consists solely of the NOS oxygenase domain. Prokaryotic NOS binds to the substrate L-Arg, zinc, and to the cofactors heme and tetrahydrofolate.	356
153083	cd00576	RNR_PFL	Ribonucleotide reductase and Pyruvate formate lyase. Ribonucleotide reductase (RNR) and pyruvate formate lyase (PFL) are believed to have diverged from a common ancestor. They have a structurally similar ten-stranded alpha-beta barrel domain that hosts the active site, and are radical enzymes. RNRs are found in all organisms and provide the only mechanism by which nucleotides are converted to deoxynucleotides. RNRs are separated into three classes based on their metallocofactor usage. Class I RNRs use a diiron-tyrosyl radical while Class II RNRs use coenzyme B12 (adenosylcobalamin, AdoCbl). Class III RNRs use an FeS cluster and S-adenosylmethionine to generate a glycyl radical. PFL, an essential enzyme in anaerobic bacteria, catalyzes the conversion of pyruvate and CoA to acetyl-CoA and formate in a mechanism that uses a glycyl radical.	401
238322	cd00577	PCNA	Proliferating Cell Nuclear Antigen (PCNA) domain found in eukaryotes and archaea.  These polymerase processivity factors play a role in DNA replication and repair.  PCNA encircles duplex DNA in its central cavity, providing a DNA-bound platform for the attachment of the polymerase. The trimeric PCNA ring is structurally similar to the dimeric ring formed by the DNA polymerase processivity factors in bacteria (beta subunit DNA polymerase III holoenzyme) and in bacteriophages (catalytic subunits in T4 and RB69). This structural correspondence further substantiates the mechanistic connection between eukaryotic and prokaryotic DNA replication that has been suggested on biochemical grounds.   PCNA is also involved with proteins involved in cell cycle processes such as DNA repair and apoptosis. Many of these proteins contain a highly conserved motif known as the PIP-box (PCNA interacting protein box) which contains the sequence Qxx[LIM]xxF[FY]. 	248
238323	cd00578	L-fuc_L-ara-isomerases	L-fucose isomerase (FucIase) and L-arabinose isomerase (AI) family; composed of FucIase, AI and similar proteins. FucIase converts L-fucose, an aldohexose, to its ketose form, which prepares it for aldol cleavage (similar to the isomerization of glucose in glycolysis). L-fucose (or 6-deoxy-L-galactose) is found in various oligo- and polysaccharides in mammals, bacteria and plants. AI catalyzes the isomerization of L-arabinose to L-ribulose, the first reaction in its conversion to D-xylulose-5-phosphate, an intermediate in the pentose phosphate pathway, which allows L-arabinose to be used as a carbon source. AI can also convert D-galactose to D-tagatose at elevated temperatures in the presence of divalent metal ions. D-tagatose, rarely found in nature, is of commercial interest as a low-calorie sugar substitute.	452
238324	cd00580	CHMI	5-carboxymethyl-2-hydroxymuconate isomerase (CHMI) is a trimeric enzyme catalyzing the isomerization of the unsaturated ketone 5-(carboxymethyl)-2-hydroxymuconate to 5-(carboxymethyl)-2-oxo-3-hexene-1,6-dionate. This is one step in the homoprotocatechuate pathway, one of the microbial meta-fission pathways that degrade aromatic carbon sources to citric acid cycle intermediates.  Despite the structural similarity of CHMI with 4-oxalocrotonate tautomerase (4-OT) and macrophage migration inhibitory factor (MIF), there is no significant sequence similarity among these protein families, and therefore, they are not combined in one hierarchy.	113
238325	cd00581	QFR_TypeB_TM	Quinol:fumarate reductase (QFR) Type B subfamily, transmembrane subunit;  QFR couples the reduction of fumarate to succinate to the oxidation of quinol to quinone, the opposite reaction to that catalyzed by the related protein, succinate:quinone oxidoreductase (SQR). QFRs oxidize low potential quinols such as menaquinol and rhodoquinol and are involved in anaerobic respiration with fumarate as the terminal electron acceptor. SQR and QFR share a common subunit arrangement, composed of a flavoprotein catalytic subunit, an iron-sulfur protein and one or two hydrophobic transmembrane subunits.  Members of this subfamily are classified as Type B as they contain one transmembrane subunit and two heme groups. The heme and quinone binding sites reside in the transmembrane subunit. The structural arrangement allows efficient electron transfer between the catalytic subunit, through iron-sulfur centers, and the transmembrane subunit containing the electron donor (quinol). The Type B enzyme from Desulfovibrio gigas is capable of fumarate reduction and succinate oxidation.	206
411704	cd00583	MutH-like	Restriction endonuclease MutH and similar endonucleases. MutH is a 28kD endonuclease involved in methyl-directed DNA mismatch repair in gram negative bacteria. MutH is both sequence-specific and methylation-specific, introducing a nick in the unmethylated strand of a hemi-methylated d(GATC) DNA duplex. MutH is homologous to the type II restriction endonuclease Sau3AI, which also recognizes the d(GATC) sequence; however, Sau3AI cleaves both strands regardless of their methylation state. The active form of MutH is monomeric while that of Sau3AI is homodimeric. In addition to MutH, MutS, involved in mismatch recognition, and MutL, involved in mediating the interactions between MutH and MutS, are essential in initiating mismatch repair in Escherichia coli.	208
238327	cd00584	Prefoldin_alpha	Prefoldin alpha subunit; Prefoldin is a hexameric molecular chaperone complex, found in both eukaryotes and archaea, that binds and stabilizes newly synthesized polypeptides allowing them to fold correctly.  The complex contains two alpha and four beta subunits, the two subunits being evolutionarily related. In archaea, there is usually only one gene for each subunit while in eukaryotes there are two or more paralogous genes encoding each subunit, adding heterogeneity to the structure of the hexamer. The structure of the complex consists of a double beta barrel assembly with six protruding coiled-coils.	129
238328	cd00585	Peptidase_C1B	Peptidase C1B subfamily (MEROPS database nomenclature); composed of eukaryotic bleomycin hydrolases (BH) and bacterial aminopeptidases C (pepC). The proteins of this subfamily contain a large insert relative to the C1A peptidase (papain) subfamily. BH is a cysteine peptidase that detoxifies bleomycin by hydrolysis of an amide group. It acts as a carboxypeptidase on its C-terminus to convert itself into an aminopeptidase and peptide ligase. BH is found in all tissues in mammals as well as in many other eukaryotes. Bleomycin, a glycopeptide derived from the bacterium Streptomyces verticillus, is an effective anticancer drug due to its ability to induce DNA strand breaks. Human BH is the major cause of tumor cell resistance to bleomycin chemotherapy, and is also genetically linked to Alzheimer's disease. In addition to its peptidase activity, the yeast BH (Gal6) binds DNA and acts as a repressor in the Gal4 regulatory system. BH forms a hexameric ring barrel structure with the active sites imbedded in the central channel. The bacterial homolog of BH, called pepC, is a cysteine aminopeptidase possessing broad specificity. Although its crystal structure has not been solved, biochemical analysis shows that pepC also forms a hexamer.	437
238329	cd00586	4HBT	4-hydroxybenzoyl-CoA thioesterase (4HBT). Catalyzes the final step in the 4-chlorobenzoate degradation pathway in which 4-chlorobenzoate is converted to 4-hydroxybenzoate in certain soil-dwelling bacteria. 4HBT forms a homotetramer with four active sites.  There is no evidence to suggest that 4HBT is related to the type I thioesterases functioning in primary or secondary metabolic pathways. Each subunit of the 4HBT tetramer adopts a so-called hot-dog fold similar to those of beta-hydroxydecanoyl-ACP dehydratase, (R)-specific enoyl-CoA hydratase, and type II thioesterase (TEII).	110
238330	cd00587	HCP_like	The HCP family of iron-sulfur proteins includes hybrid cluster protein (HCP), acetyl-CoA synthase (ACS), and carbon monoxide dehydrogenase (CODH), all of which contain [Fe4-S4] metal clusters at their active sites. These proteins have a conserved alpha-beta Rossmann fold domain. HCP, formerly known as prismane, is thought to play a role in nitrogen metabolism but its specific function is unknown.  Acetyl-CoA synthase (ACS) is found in acetogenic and methanogenic organisms and is responsible for the synthesis and breakdown of acetyl-CoA. ACS forms a heterotetramer with carbon monoxide dehydrogenase (CODH) consisting of two ACS and two CODH subunits. CODH reduces carbon dioxide to carbon monoxide and ACS then synthesizes acetyl-CoA from carbon monoxide and CoA.	258
238331	cd00588	CheW_like	CheW-like domain. CheW proteins are part of the chemotaxis signalling mechanism in bacteria. CheW interacts with the methyl accepting chemotaxis proteins (MCPs) and relays signals to CheY, which affects flagellar rotation. This family includes CheW and other related proteins that are involved in chemotaxis. The CheW-like regulatory domain in the chemotaxis associated histidine kinase CheA binds to CheW, suggesting that these domains can interact with each other.	136
409669	cd00590	RRM_SF	RNA recognition motif (RRM) superfamily. RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs).	72
259852	cd00591	HU_IHF	DNA sequence specific (IHF) and non-specific (HU) domains. This family includes integration host factor (IHF) and HU, also called type II DNA-binding proteins (DNABII), which are small dimeric proteins that specifically bind the DNA minor groove, inducing large bends in the DNA and serving as architectural factors in a variety of cellular processes such as recombination, initiation of replication/transcription and gene regulation. IHF binds DNA in a sequence specific manner while HU displays little or no sequence preference. IHF homologs are usually heterodimers, while HU homologs are typically homodimers (except HU heterodimers from E. coli and other enterobacteria). HU is highly basic and contributes to chromosomal compaction and maintenance of negative supercoiling, thus often referred to as histone-like protein. IHF is an essential cofactor in phage lambda site-specific recombination, having an architectural role during assembly of specialized nucleoprotein structures (snups). Bacillus phage SPO1-encoded transcription factor 1 (TF1) is another related type II DNA-binding protein. Like IHF, TF1 binds DNA specifically and bends DNA sharply.	85
133378	cd00592	HTH_MerR-like	Helix-Turn-Helix DNA binding domain of MerR-like transcription regulators. Helix-turn-helix (HTH) MerR-like transcription regulator, N-terminal domain. The MerR family transcription regulators have been shown to mediate responses to stress including exposure to heavy metals, drugs, or oxygen radicals in eubacterial and some archaeal species. They regulate transcription of multidrug/metal ion transporter genes and oxidative stress regulons by reconfiguring the spacer between the -35 and -10 promoter elements.  A typical MerR regulator is comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their N-terminal domains are homologous and contain a DNA-binding winged HTH motif, while the C-terminal domains are often dissimilar and bind specific coactivator molecules such as metal ions, drugs, and organic substrates.	100
238333	cd00593	RIBOc	RIBOc. Ribonuclease III C terminal domain. This group consists of eukaryotic, bacterial and archaeal ribonuclease III (RNAse III) proteins. RNAse III is a double stranded RNA-specific endonuclease. Prokaryotic RNAse III is important in post-transcriptional control of mRNA stability and translational efficiency. It is involved in the processing of ribosomal RNA precursors. Prokaryotic RNAse III also plays a role in the maturation of tRNA precursors and in the processing of phage and plasmid transcripts. Eukaryotic RNase IIIs participate (through direct cleavage) in rRNA processing, in processing of small nucleolar RNAs (snoRNAs) and snRNAs (components of the spliceosome). In eukaryotes RNase III or RNase III-like enzymes such as Dicer are involved in RNAi (RNA interference) and miRNA (micro-RNA) gene silencing.	133
238334	cd00594	KU	Ku-core domain; includes the central DNA-binding beta-barrels, polypeptide rings, and the C-terminal arm of Ku proteins. The Ku protein consists of two tightly associated homologous subunits, Ku70 and Ku80, and was originally identified as an autoantigen recognized by the sera of patients with an autoimmune disease. In eukaryotes, the Ku heterodimer contributes to genomic integrity through its ability to bind DNA double-strand breaks and facilitate repair by non-homologous end-joining. The bacterial Ku homologs do not contain the conserved N-terminal extension that is present in the eukaryotic Ku protein.	272
238335	cd00595	NDPk	Nucleoside diphosphate kinases (NDP kinases, NDPks): NDP kinases, responsible for the synthesis of nucleoside triphosphates (NTPs), are involved in numerous regulatory processes associated with proliferation, development, and differentiation. They are vital for DNA/RNA synthesis, cell division, macromolecular metabolism and growth. The enzymes generate NTPs or their deoxy derivatives by terminal (gamma) phosphotransfer from an NTP such as ATP or GTP to any nucleoside diphosphate (NDP) or its deoxy derivative. The sequence of NDPk has been highly conserved through evolution. There is a single histidine residue conserved in all known NDK isozymes, which is involved in the catalytic mechanism. The first confirmed metastasis suppressor gene was the NDP kinase protein encoded by the nm23 gene. Unicellular organisms generally possess only one gene encoding NDP kinase, while most multicellular organisms possess not only an ortholog that provides most of the NDP kinase enzymatic activity but also multiple divergent paralogous genes. The human genome codes for at least nine NDP kinases, which can be classified into two groups, Groups I and II, according to their genomic architecture and distinct enzymatic activity. Group I isoforms (A-D) are well-conserved, catalytically active, and share 58-88% identity between each other, while Group II are more divergent, with only NDPk6 shown to be active. NDP kinases exist in two different quaternary structures; all known eukaryotic enzymes are hexamers, while some bacterial enzymes are tetramers, as in Myxococcus. The hexamer can be viewed as a trimer of dimers, while tetramers are dimers of dimers, with the dimerization interface conserved.	133
349427	cd00596	Peptidase_M14_like	M14 family of metallocarboxypeptidases and related proteins. The M14 family of metallocarboxypeptidases (MCPs), also known as funnelins, are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing a preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavage. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily is that of succinylglutamate desuccinylase/aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism.	216
119349	cd00598	GH18_chitinase-like	The GH18 (glycosyl hydrolase, family 18) type II chitinases hydrolyze chitin, an abundant polymer of beta-1,4-linked N-acetylglucosamine (GlcNAc) which is a major component of the cell wall of fungi and the exoskeleton of arthropods.  Chitinases have been identified in viruses, bacteria, fungi, protozoan parasites, insects, and plants. The structure of the GH18 domain is an eight-stranded beta/alpha barrel with a pronounced active-site cleft at the C-terminal end of the beta-barrel.  The GH18 family includes chitotriosidase, chitobiase, hevamine, zymocin-alpha, narbonin, SI-CLP (stabilin-1 interacting chitinase-like protein), IDGF (imaginal disc growth factor), CFLE (cortical fragment-lytic enzyme) spore hydrolase, the type III and type V plant chitinases, the endo-beta-N-acetylglucosaminidases, and the chitolectins.  The GH85 (glycosyl hydrolase, family 85) ENGases (endo-beta-N-acetylglucosaminidases) are closely related to the GH18 chitinases and are included in this alignment model.	210
119373	cd00599	GH25_muramidase	Endo-N-acetylmuramidases (muramidases) are lysozymes (also referred to as peptidoglycan hydrolases) that degrade bacterial cell walls by catalyzing the hydrolysis of 1,4-beta-linkages between N-acetylmuramic acid and N-acetyl-D-glucosamine residues.  This family of muramidases contains a glycosyl hydrolase family 25 (GH25) catalytic domain and is found in bacteria, fungi, slime molds, round worms, protozoans and bacteriophages.  The bacteriophage members are referred to as endolysins which are involved in lysing the host cell at the end of the replication cycle to allow release of mature phage particles.  Endolysins are typically modular enzymes consisting of a catalytically active domain that hydrolyzes the peptidoglycan cell wall and a cell wall-binding domain that anchors the protein to the cell wall.  Endolysins generally have narrow substrate specificities with either intra-species or intra-genus bacteriolytic activity.	186
212462	cd00600	Sm_like	Sm and related proteins. The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins also exist in archaea and bacteria, where they form heptameric and hexameric ring structures similar to those found in eukaryotes.	63
238336	cd00602	IPT_TF	IPT domain of eukaryotic transcription factors NF-kappaB/Rel, nuclear factor of activated T cells (NFAT), and recombination signal J-kappa binding protein (RBP-Jkappa). The IPT domains in these proteins are involved in DNA binding. Most NF-kappaB/Rel proteins form homo- and heterodimers, while NFAT proteins are largely monomeric (with TonEBP being an exception). While the majority of sequence-specific DNA binding elements are found in the N-terminal domain, several are found in the IPT domain in loops adjacent to, and including, the linker region.	101
238337	cd00603	IPT_PCSR	IPT domain of Plexins and Cell Surface Receptors (PCSR) and related proteins. This subgroup contains IPT domains of plexins, receptors, like the plasminogen-related growth factor receptors, the hepatocyte growth factor-scatter factors, and the macrophage-stimulating receptors and of fibrocystin. Plexins are involved in the regulation of cell proliferation and of cellular adhesion and repulsion receptors. In general, there are three copies of the IPT_PCSR domain present preceded by SEMA (semaphorin) and PSI (plexin, semaphorin, integrin) domains.	90
238338	cd00604	IPT_CGTD	IPT domain (domain D) of cyclodextrin glycosyltransferase (CGTase) and similar enzymes. These enzymes are involved in the enzymatic hydrolysis of alpha-1,4 linkages of starch polymers and belong to the glycosyl hydrolase family 13. Most consist of three domains (A,B,C) but CGTase is more complex and has two additional domains (D,E). The function of the IPT/D domain is unknown.	81
238339	cd00606	fungal_RNase	fungal type ribonuclease. Ribonucleases (RNAses) cleave phosphodiester bonds in RNA and are essential for both non-specific RNA degradation and for numerous forms of RNA processing. The members of this CD belong to the superfamily of microbial ribonucleases which are predominantly guanyl specific nucleases. Guanyl specific RNAses are endonucleases which split RNA phosphodiester bonds at the 3' oxygen end of guanosine residues to yield oligonucleotides with the guanosine-2',3'-cyclophosphate at the 3' end and the hydroxyl group at the 5' end. The terminal guanosine-2',3'-cyclophosphate is hydrolysed by guanyl RNAses to give guanosine-3'-phosphate. The alignment also contains ribotoxins, a fungal group of cytotoxins, specifically cleaving the sarcin/ricin loop (SRL) structure of the 23-28S rRNA and therefore being very potent inhibitors of protein synthesis.	100
238340	cd00607	RNase_Sa	RNase_Sa. Ribonucleases first isolated from Streptomyces aureofaciens. In general, ribonucleases cleave phosphodiester bonds in RNA and are essential for both non-specific RNA degradation and for numerous forms of RNA processing. RNAse Sa is a guanylate specific endoribonuclease which belongs to the superfamily of microbial ribonucleases. Typical of this sub-family, the enzyme hydrolyses the phosphodiester bonds of RNA at the 3' oxygen end of guanosine residues to yield oligonucleotides with the guanosine-2',3'-cyclophosphate at the 3' end and the hydroxyl group at the 5' end. The terminal guanosine-2',3'-cyclophosphate is hydrolysed by guanyl RNAses to give guanosine-3'-phosphate.	95
238341	cd00608	GalT	Galactose-1-phosphate uridyl transferase (GalT): This enzyme plays a key role in galactose metabolism by catalysing the transfer of a uridine 5'-phosphoryl group from UDP-glucose to galactose 1-phosphate. The structure of E. coli GalT reveals that the enzyme contains two identical subunits. It also demonstrates that the active site is formed by amino acid residues from both subunits of the dimer.	329
99734	cd00609	AAT_like	Aspartate aminotransferase family. This family belongs to pyridoxal phosphate (PLP)-dependent aspartate aminotransferase superfamily (fold I). Pyridoxal phosphate combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of  the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. The major groups in this CD corresponds to Aspartate aminotransferase a, b and c, Tyrosine, Alanine, Aromatic-amino-acid, Glutamine phenylpyruvate, 1-Aminocyclopropane-1-carboxylate synthase, Histidinol-phosphate, gene products of malY and cobC, Valine-pyruvate aminotransferase and Rhizopine catabolism regulatory protein.	350
99735	cd00610	OAT_like	Acetyl ornithine aminotransferase family. This family belongs to pyridoxal phosphate (PLP)-dependent aspartate aminotransferase superfamily (fold I). The major groups in this CD correspond to ornithine aminotransferase, acetylornithine aminotransferase, alanine-glyoxylate aminotransferase, dialkylglycine decarboxylase, 4-aminobutyrate aminotransferase, beta-alanine-pyruvate aminotransferase, adenosylmethionine-8-amino-7-oxononanoate aminotransferase, and glutamate-1-semialdehyde 2,1-aminomutase. All the enzymes belonging to this family act on basic amino acids and their derivatives, and are involved in transamination or decarboxylation.	413
99736	cd00611	PSAT_like	Phosphoserine aminotransferase (PSAT) family. This family belongs to pyridoxal phosphate (PLP)-dependent aspartate aminotransferase superfamily (fold I). The major group in this CD corresponds to phosphoserine aminotransferase (PSAT).  PSAT is active as a dimer and catalyzes the conversion of phosphohydroxypyruvate to phosphoserine.	355
99737	cd00613	GDC-P	Glycine cleavage system P-protein, alpha- and beta-subunits. This family consists of Glycine cleavage system P-proteins EC:1.4.4.2 from bacterial, mammalian and plant sources. The P protein is part of the glycine decarboxylase multienzyme complex EC:2.1.2.10 (GDC) also annotated as glycine cleavage system or glycine synthase. GDC consists of four proteins P, H, L and T. The reaction catalysed by this protein is: Glycine + lipoylprotein <=> S-aminomethyldihydrolipoylprotein + CO2. Alpha-beta-type dimers associate to form an alpha(2)beta(2) tetramer, where the alpha- and beta-subunits are structurally similar and appear to have arisen by gene duplication and subsequent divergence with a loss of one active site. The members of this CD are widely dispersed among all three forms of cellular life.	398
99738	cd00614	CGS_like	CGS_like: Cystathionine gamma-synthase is a PLP dependent enzyme and catalyzes the committed step of methionine biosynthesis. This pathway is unique to microorganisms and plants, rendering the enzyme an attractive target for the development of antimicrobials and herbicides. This subgroup also includes cystathionine gamma-lyases (CGL), O-acetylhomoserine sulfhydrylases and O-acetylhomoserine thiol lyases. CGL's are very similar to CGS's. Members of this group are widely distributed among all three forms of life.	369
99739	cd00615	Orn_deC_like	Ornithine decarboxylase family. This family belongs to pyridoxal phosphate (PLP)-dependent aspartate aminotransferase superfamily (fold I). The major groups in this CD correspond to ornithine decarboxylase (ODC), arginine decarboxylase (ADC) and lysine decarboxylase (LDC). ODC is a dodecamer composed of six homodimers and catalyzes the decarboxylation of ornithine. ADC catalyzes the decarboxylation of arginine and LDC catalyzes the decarboxylation of lysine. Members of this family are widely found in all three forms of life.	294
99740	cd00616	AHBA_syn	3-amino-5-hydroxybenzoic acid synthase family (AHBA_syn). AHBA_syn family belongs to pyridoxal phosphate (PLP)-dependent aspartate aminotransferase superfamily (fold I). The members of this CD are involved in various biosynthetic pathways for secondary metabolites. Some well studied proteins in this CD are AHBA_synthase, protein product of pleiotropic regulatory gene degT,  Arnb aminotransferase and pilin glycosylation protein. The prototype of this family, the AHBA_synthase, is a dimeric PLP dependent enzyme. AHBA_syn is the terminal enzyme of 3-amino-5-hydroxybenzoic acid (AHBA) formation which is involved in the biosynthesis of ansamycin antibiotics, including rifamycin B. Some members of this CD are involved in 4-amino-6-deoxy-monosaccharide D-perosamine synthesis. Perosamine is an important element in the glycosylation of several cell products, such as antibiotics and lipopolysaccharides of gram-positive and gram-negative bacteria. The pilin glycosylation protein encoded by gene pglA, is a galactosyltransferase involved in pilin glycosylation. Additionally, this CD consists of ArnB (PmrH) aminotransferase, a 4-amino-4-deoxy-L-arabinose lipopolysaccharide-modifying enzyme. This CD also consists of several predicted pyridoxal phosphate-dependent enzymes apparently involved in regulation of cell wall biogenesis. The catalytic lysine which is present in all characterized PLP dependent enzymes is replaced by histidine in some members of this CD.	352
99741	cd00617	Tnase_like	Tryptophanase family (Tnase). This family belongs to pyridoxal phosphate (PLP)-dependent aspartate aminotransferase superfamily (fold I). The major groups in this CD correspond to tryptophanase (Tnase) and tyrosine phenol-lyase (TPL). Tnase and TPL are active as tetramers and catalyze beta-elimination reactions. Tnase catalyzes degradation of L-tryptophan to yield indole, pyruvate and ammonia and TPL catalyzes degradation of L-tyrosine to yield phenol, pyruvate and ammonia.	431
153092	cd00618	PLA2_like	PLA2_like: Phospholipase A2, a super-family of secretory and cytosolic enzymes; the latter are either Ca dependent or Ca independent. PLA2 cleaves the sn-2 position of the glycerol backbone of phospholipids (PC or phosphatidylethanolamine), usually in a metal-dependent reaction, to generate lysophospholipid (LysoPL) and a free fatty acid (FA). The resulting products are either dietary or used in synthetic pathways for leukotrienes and prostaglandins. Often, arachidonic acid is released as a free fatty acid and acts as second messenger in signaling networks. Secreted PLA2s have also been found to specifically bind to a variety of soluble and membrane proteins in mammals, including receptors. As a toxin, PLA2 is a potent presynaptic neurotoxin which blocks nerve terminals by binding to the nerve membrane and hydrolyzing stable membrane lipids. The products of the hydrolysis (LysoPL and FA) cannot form bilayers leading to a change in membrane conformation and ultimately to a block in the release of neurotransmitters. PLA2 may form dimers or oligomers.	83
238342	cd00619	Terminator_NusB	Transcription termination factor NusB (N protein-Utilization Substance B). NusB plays a key role in the regulation of ribosomal RNA biosynthesis in eubacteria by modulating the efficiency of transcriptional antitermination. NusB along with other Nus factors (NusA, NusE/S10 and NusG) forms the core complex with the boxA element of the nut site of the rRNA operons. These interactions help RNA polymerase to counteract polarity during transcription of rRNA operons and allow stable antitermination. The transcription antitermination system can be appropriated by some bacteriophages such as lambda, which use the system to switch between the lysogenic and lytic modes of phage propagation.	130
238343	cd00620	Methyltransferase_Sun	N-terminal RNA binding domain of the methyltransferase Sun. The rRNA-specific 5-methylcytidine transferase Sun, also known as RrmB or Fmu shares the RNA-binding non-catalytic domain with the transcription termination factor NusB. The precise biological role of this domain in Sun is unknown, although it is likely to be involved in sequence-specific RNA binding. The C-terminal methyltransferase domain of Sun has been shown to catalyze formation of m5C at position 967 of 16S rRNA in Escherichia coli.	126
143482	cd00622	PLPDE_III_ODC	Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzyme Ornithine Decarboxylase. This subfamily is composed mainly of eukaryotic ornithine decarboxylases (ODC, EC 4.1.1.17) and ODC-like enzymes from prokaryotes represented by Vibrio vulnificus Lysine/Ornithine decarboxylase. These are fold type III PLP-dependent enzymes that differ from most bacterial ODCs which are fold type I PLP-dependent enzymes. ODC participates in the formation of putrescine by catalyzing the decarboxylation of ornithine, the first step in polyamine biosynthesis. Members of this subfamily contain an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain, similar to bacterial alanine racemases. They exist as homodimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. Homodimer formation and the presence of the PLP cofactor are required for catalytic activity. Also members of this subfamily are proteins that have homology to ODC but do not possess any catalytic activity, the Antizyme inhibitor (AZI) and ODC-paralogue (ODC-p). AZI binds to the regulatory protein Antizyme with a higher affinity than ODC and prevents ODC degradation. ODC-p is a novel ODC-like protein, present only in mammals, that is specifically expressed in the brain and testes. ODC-p may function as a tissue-specific antizyme inhibitory protein.	362
238344	cd00625	ArsB_NhaD_permease	Anion permease ArsB/NhaD.  These permeases have been shown to translocate sodium, arsenate, antimonite, sulfate and organic anions across biological membranes in all three kingdoms of life.  A typical anion permease contains 8-13 transmembrane helices and can function either independently as a chemiosmotic transporter or as a channel-forming subunit of an ATP-driven anion pump.	396
132719	cd00630	RNAP_largest_subunit_C	Largest subunit of RNA polymerase (RNAP), C-terminal domain. RNA polymerase (RNAP) is a large multi-subunit complex responsible for the synthesis of RNA. It is the principal enzyme of the transcription process, and is the final target in many regulatory pathways that control gene expression in all living cells. At least three distinct RNAP complexes are found in eukaryotic nuclei, RNAP I, RNAP II, and RNAP III, for the synthesis of ribosomal RNA precursor, mRNA precursor, and 5S and tRNA, respectively. A single distinct RNAP complex is found in prokaryotes and archaea, which may be responsible for the synthesis of all RNAs. Structure studies revealed that prokaryotic and eukaryotic RNAPs share a conserved crab-claw-shape structure. The largest and the second largest subunits each make up one clamp, one jaw, and part of the cleft. The largest RNAP subunit (Rpb1) interacts with the second-largest RNAP subunit (Rpb2) to form the DNA entry and RNA exit channels in addition to the catalytic center of RNA synthesis. The region covered by this domain makes up part of the foot and jaw structures. In archaea, some photosynthetic organisms, and some organelles, this domain exists as a separate subunit, while it forms the C-terminal region of the RNAP largest subunit in eukaryotes and bacteria.	158
238345	cd00632	Prefoldin_beta	Prefoldin beta; Prefoldin is a hexameric molecular chaperone complex, composed of two evolutionarily related subunits (alpha and beta), which are found in both eukaryotes and archaea. Prefoldin binds and stabilizes newly synthesized polypeptides allowing them to fold correctly. The hexameric structure consists of a double beta barrel assembly with six protruding coiled-coils. The alpha prefoldin subunits have two beta hairpin structures while the beta prefoldin subunits (this CD) have only one hairpin that is most similar to the second hairpin of the alpha subunit. The prefoldin hexamer consists of two alpha and four beta subunits and is assembled from the beta hairpins of all six subunits. The alpha subunits initially dimerize providing a structural nucleus for the assembly of the beta subunits. In archaea, there is usually only one gene for each subunit while in eukaryotes there are two or more paralogous genes encoding each subunit, adding heterogeneity to the structure of the hexamer.	105
238346	cd00633	Secretoglobin	Secretoglobins are relatively small, secreted, disulphide-bridged dimeric proteins with encoding genes sharing substantial sequence similarity. Their family subunits may be grouped into five subfamilies, A-E. Uteroglobin (subfamily A), which is identical to Clara cell protein (CC10), forms a globular shaped homodimer with a large hydrophobic pocket located between the two dimers. The uteroglobin monomer structure is composed of four alpha helices that do not form a canonical four helix-bundle motif but rather a boomerang-shaped structure in which helices H1, H3, and H4 are able to bind a homodimeric partner. The hydrophobic pocket binds steroids, particularly progesterone, with high specificity. However, the true biological function of uteroglobin is poorly understood. In mammals, uteroglobin has immunosuppressive and anti-inflammatory properties through the inhibition of phospholipase A2. The other four main subfamilies of secretoglobins are found in heterodimeric combinations, with B and C subfamilies disulphide-bridged to the E and D subfamilies, respectively. [See review by Laukaitis C.M. & Karn R.C. (2005). Biological Journal of the Linnean Society 84, 493]. These include rat prostatic steroid-binding protein (PBP or prostatein), human mammaglobin (or heteroglobin), lipophilins, major cat allergen Fel dI, the hamster Harderian gland proteins and mouse salivary androgen-binding protein (ABP). Example of such a heterodimer: ABPalpha-like sequences are closely related to cat Fel dI chain 1, whereas ABPbeta-gamma-like sequences are closely related to Fel dI chain 2. Thus, the heterodimeric structure of ABPalpha-beta and ABPalpha-gamma is recapitulated by the sequence-similar Fel dI chains 1 and 2. This conservation of primary and quaternary structure indicates that the genome of the eutherian common ancestor of cats, rodents, and primates contained a similar gene pair.	67
143483	cd00635	PLPDE_III_YBL036c_like	Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzymes, YBL036c-like proteins. This family contains mostly uncharacterized proteins, widely distributed among eukaryotes, bacteria and archaea, that bear similarity to the yeast hypothetical protein YBL036c, which is homologous to a Pseudomonas aeruginosa gene that is co-transcribed with a known proline biosynthetic gene. YBL036c is a single domain monomeric protein with a typical TIM barrel fold. It binds the PLP cofactor and has been shown to exhibit amino acid racemase activity. The YBL036c structure is similar to the N-terminal domain of the fold type III PLP-dependent enzymes, bacterial alanine racemase and eukaryotic ornithine decarboxylase, which are two-domain dimeric proteins. The lack of a second domain in YBL036c may explain limited D- to L-alanine racemase or non-specific racemase activity.	222
238347	cd00636	TroA-like	Helical backbone metal receptor (TroA-like domain). These proteins have been shown to function in the ABC transport of ferric siderophores and metal ions such as Mn2+, Fe3+, Cu2+ and/or Zn2+.  Their ligand binding site is formed in the interface between two globular domains linked by a single helix.  Many of these proteins also possess a low complexity region containing a metal-binding histidine-rich motif (repetitive HDH sequence).  The TroA-like proteins differ in their fold and ligand-binding mechanism from the PBPI and PBPII proteins, but are structurally similar, however, to the beta-subunit of the nitrogenase molybdenum-iron protein MoFe.   Most TroA-like proteins are encoded by ABC-type operons and appear to function as periplasmic components of ABC transporters in metal ion uptake.	148
410626	cd00637	7tm_classA_rhodopsin-like	rhodopsin receptor-like class A family of the seven-transmembrane G protein-coupled receptor superfamily. Class A rhodopsin-like receptors constitute about 90% of all GPCRs. The class A GPCRs include the light-sensitive rhodopsin as well as receptors for biogenic amines, lipids, nucleotides, odorants, peptide hormones, and a variety of other ligands. All GPCRs have a common structural architecture comprising seven transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. Based on sequence similarity, GPCRs can be divided into six major classes: class A (rhodopsin-like family), class B (Methuselah-like, adhesion and secretin-like receptor family), class C (metabotropic glutamate receptor family), class D (fungal mating pheromone receptors), class E (cAMP receptor family), and class F (frizzled/smoothened receptor family). Nearly 800 human GPCR genes have been identified and are involved in essentially all major physiological processes. Approximately 40% of clinically marketed drugs mediate their effects through modulation of GPCR function for the treatment of a variety of human diseases including bacterial infections.	275
107202	cd00640	Trp-synth-beta_II	Tryptophan synthase beta superfamily (fold type II); this family of pyridoxal phosphate (PLP)-dependent enzymes catalyzes beta-replacement and beta-elimination reactions. This CD corresponds to aminocyclopropane-1-carboxylate deaminase (ACCD), tryptophan synthase beta chain (Trp-synth_B), cystathionine beta-synthase (CBS), O-acetylserine sulfhydrylase (CS), serine dehydratase (Ser-dehyd), threonine dehydratase (Thr-dehyd), diaminopropionate ammonia lyase (DAL), and threonine synthase (Thr-synth). ACCD catalyzes the conversion of 1-aminocyclopropane-1-carboxylate  to alpha-ketobutyrate and ammonia. Tryptophan synthase folds into a tetramer, where the beta chain is the catalytic PLP-binding subunit and catalyzes the formation of L-tryptophan from indole and L-serine. CBS is a tetrameric hemeprotein that catalyzes condensation of serine and homocysteine to cystathionine. CS is a homodimer that catalyzes the formation of L-cysteine from O-acetyl-L-serine. Ser-dehyd catalyzes the conversion of L- or D-serine  to pyruvate and ammonia. Thr-dehyd is active as a homodimer and catalyzes the conversion of L-threonine to 2-oxobutanoate and ammonia. DAL is also a homodimer and catalyzes the alpha, beta-elimination reaction of both L- and D-alpha, beta-diaminopropionate to form pyruvate and ammonia. Thr-synth catalyzes the formation of threonine and inorganic phosphate from O-phosphohomoserine.	244
238348	cd00641	GTP_cyclohydro2	GTP cyclohydrolase II (RibA).  GTP cyclohydrolase II catalyzes the conversion of GTP to 2,5-diamino-6-ribosylamino-4(3H)-pyrimidinone 5' phosphate, formate, pyrophosphate (APy), and GMP in the biosynthetic pathway of riboflavin. Riboflavin is the precursor molecule for the synthesis of  the coenzymes flavin mononucleotide (FMN) and flavin adenine dinucleotide (FAD) which are essential to cell metabolism. The enzyme is present in plants and numerous pathogenic bacteria, especially gram negative organisms, who are dependent on endogenous synthesis of the vitamin because they lack an appropriate uptake system.  For animals and humans, which lack this biosynthetic pathway, riboflavin is the essential vitamin B2. GTP cyclohydrolase II requires magnesium ions for activity and has a bound catalytic zinc. The functionally active form is thought to be a homodimer. A paralogous protein is encoded in the genome of Streptomyces coelicolor, which converts GTP to 2-amino-5-formylamino-6-ribosylamino-4(3H)-pyrimidinone 5'-phosphate (FAPy), an activity that has otherwise been reported for unrelated GTP cyclohydrolases III.	193
238349	cd00642	GTP_cyclohydro1	GTP cyclohydrolase I (GTP-CH-I) catalyzes the conversion of GTP into dihydroneopterin triphosphate.  The enzyme product is the precursor of tetrahydrofolate in eubacteria, fungi, and plants and of the folate analogs in methanogenic bacteria.  In vertebrates and insects it is the biosynthetic precursor of tetrahydrobiopterin (BH4) which is involved in the formation of catecholamines, nitric oxide, and the stimulation of T lymphocytes. The biosynthetic reaction of BH4 is controlled by a regulatory protein GFRP which mediates feedback inhibition of GTP-CH-I by BH4.  This inhibition is reversed by phenylalanine. The decameric GTP-CH-I forms a complex with two pentameric GFRP in the presence of phenylalanine or a combination of GTP and BH4, respectively.	185
153081	cd00643	HMG-CoA_reductase_classI	Class I hydroxymethylglutaryl-coenzyme A (HMG-CoA) reductase (HMGR). Hydroxymethylglutaryl-coenzyme A (HMG-CoA) reductase (HMGR), class I enzyme, homotetramer. Catalyzes the synthesis of coenzyme A and mevalonate in isoprenoid synthesis. In mammals this is the rate limiting committed step in cholesterol biosynthesis. Class I enzymes are found predominantly in eukaryotes and contain N-terminal membrane regions. With the exception of Archaeoglobus fulgidus, most archaea are assigned to class I, based on sequence similarity of the active site, even though they lack membrane regions. Yeast and human HMGR are divergent in their N-terminal regions, but are conserved in their active site. In contrast, human and bacterial HMGR differ in their active site architecture.	403
153082	cd00644	HMG-CoA_reductase_classII	Class II hydroxymethylglutaryl-coenzyme A (HMG-CoA) reductase (HMGR). Hydroxymethylglutaryl-coenzyme A (HMG-CoA) reductase (HMGR), class II, prokaryotic enzyme is a homodimer. Class II enzymes are found primarily in prokaryotes and Archaeoglobus fulgidus and are soluble as they lack the membrane region. Enzymes catalyze the synthesis of coenzyme A and mevalonate in isoprenoid synthesis. Bacteria, such as Pseudomonas mevalonii, which rely solely on mevalonate for their carbon source, catalyze the reverse reaction, using an NAD-dependent HMGR to deacetylate mevalonate into 3-hydroxy-3-methylglutaryl-CoA. Human and bacterial HMGR differ in their active site architecture.	417
238350	cd00645	AsnA	Asparagine synthetase (aspartate-ammonia ligase) (AsnA) catalyses the conversion of L-aspartate to L-asparagine in the presence of ATP and ammonia.  AsnA is a homodimeric enzyme which is structurally similar to the catalytic core domain of class II aminoacyl-tRNA synthetases. Ammonia-dependent AsnA is not homologous to the glutamine-dependent asparagine synthetase AsnB.	309
270214	cd00648	Periplasmic_Binding_Protein_Type_2	Type 2 periplasmic binding fold superfamily. This evolutionary model and hierarchy represent the ligand-binding domains found in solute binding proteins that serve as initial receptors in transport, signal transduction and channel gating.  The PBP2 proteins share the same architecture as periplasmic binding proteins type 1 (PBP1), but have a different topology.  They are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The origin of the PBP module can be traced across distant phyla, including eukaryotes, archaebacteria, and prokaryotes.  The majority of PBP2 proteins are involved in the uptake of a variety of soluble substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.  Besides transport proteins, the family includes ionotropic glutamate receptors and unorthodox sensor proteins involved in signal transduction. The substrate binding domain of the LysR transcriptional regulators and the oligopeptide-like transport systems also contain the type 2 periplasmic binding fold and thus they are significantly homologous to that of the PBP2; however, these two families are grouped into a separate hierarchy of the PBP2 superfamily due to the large number of protein sequences.	196
173824	cd00649	catalase_peroxidase_1	N-terminal catalytic domain of catalase-peroxidases. This is a subgroup of heme-dependent peroxidases of the plant superfamily that share a heme prosthetic group and catalyze a multistep oxidative reaction involving hydrogen peroxide as the electron acceptor. Catalase-peroxidases can exhibit both catalase and broad-spectrum peroxidase activities depending on the steady-state concentration of hydrogen peroxide. These enzymes are found in many archaeal and bacterial organisms, where they neutralize potentially lethal hydrogen peroxide molecules generated during photosynthesis or stationary phase. Along with related intracellular fungal and plant peroxidases, catalase-peroxidases belong to class I of the plant peroxidase superfamily. Unlike the eukaryotic enzymes, they are typically comprised of two homologous domains that probably arose via a single gene duplication event. The heme binding motif is present only in the N-terminal domain; the function of the C-terminal domain is not clear.	409
133419	cd00650	LDH_MDH_like	NAD-dependent, lactate dehydrogenase-like, 2-hydroxycarboxylate dehydrogenase family. Members of this family include ubiquitous enzymes like L-lactate dehydrogenases (LDH), L-2-hydroxyisocaproate dehydrogenases, and some malate dehydrogenases (MDH). LDH catalyzes the last step of glycolysis in which pyruvate is converted to L-lactate. MDH is one of the key enzymes in the citric acid cycle, facilitating both the conversion of malate to oxaloacetate and replenishing levels of oxalacetate by reductive carboxylation of pyruvate. The LDH/MDH-like proteins are part of the NAD(P)-binding Rossmann fold superfamily, which includes a wide variety of protein families including the NAD(P)-binding domains of alcohol dehydrogenases, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate dehydrogenases, formate/glycerate dehydrogenases, siroheme synthases, 6-phosphogluconate dehydrogenases, aminoacid dehydrogenases, repressor rex, and NAD-binding potassium channel domains, among others.	263
238351	cd00651	TFold	Tunnelling fold (T-fold). The five known T-folds are found in five different enzymes with different functions: dihydroneopterin-triphosphate epimerase (DHNTPE), dihydroneopterin aldolase (DHNA), GTP cyclohydrolase I (GTPCH-1), 6-pyruvoyl tetrahydropterin synthetase (PTPS), and uricase (UO, uroate/urate oxidase). They bind to substrates belonging to the purine or pterin families, and share a fold-related binding site with a glutamate or glutamine residue anchoring the substrate and many conserved interactions. They also share a similar oligomerization mode: several T-folds join together to form a beta(2n)alpha(n) barrel, then two barrels join together in a head-to-head fashion to make up the native enzymes. The functional enzyme is a tetramer for UO, a hexamer for PTPS, an octamer for DHNA/DHNTPE and a decamer for GTPCH-1. The substrate is located in a deep and narrow pocket at the interface between monomers. In PTPS, the active site is located at the interface of three monomers, two from one trimer and one from the other trimer. In GTPCH-1, it is also located at the interface of three subunits, two from one pentamer and one from the other pentamer. There are four equivalent active sites in UO, six in PTPS, eight in DHNA/DHNTPE and ten in GTPCH-1.   Each globular multimeric enzyme encloses a tunnel which is lined with charged residues for DHNA and UO, and with basic residues in PTPS. The N and C-terminal ends are located on one side of the T-fold while the residues involved in the catalytic activity are located at the opposite side. In PTPS, UO and DHNA/DHNTPE, the N and C-terminal extremities of the enzyme are located on the exterior side of the functional multimeric enzyme. In GTPCH-1, the extra C-terminal helix places the extremity inside the tunnel.	122
238352	cd00652	TBP_TLF	TATA box binding protein (TBP): Present in archaea and eukaryotes, TBPs are transcription factors that recognize promoters and initiate transcription. TBP has been shown to be an essential component of three different transcription initiation complexes: SL1, TFIID and TFIIIB, directing transcription by RNA polymerases I, II and III, respectively. TBP binds directly to the TATA box promoter element, where it nucleates polymerase assembly, thus defining the transcription start site. TBP's binding in the minor groove induces a dramatic DNA bending while its own structure barely changes. The conserved core domain of TBP, which binds to the TATA box, has a bipartite structure, with intramolecular symmetry generating a saddle-shaped structure that sits astride the DNA. New members of the TBP family, called TBP-like proteins (TBLP, TLF, TLP) or TBP-related factors (TRF1, TRF2,TRP), are similar to the core domain of TBPs, with identical or chemically similar amino acids at many equivalent positions, suggesting similar structure. However, TLFs contain distinct, conserved amino acids at several positions that distinguish them from TBP.	174
238353	cd00653	RNA_pol_B_RPB2	RNA polymerase beta subunit. RNA polymerases catalyse the DNA dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). Each RNA polymerase complex contains two related members of this family; in each case they are the two largest subunits. The clamp is a mobile structure that grips DNA during elongation.	866
238354	cd00655	RNAP_Rpb7_N_like	RNAP_Rpb7_N_like: This conserved domain represents the N-terminal ribonucleoprotein (RNP) domain of the Rpb7 subunit of eukaryotic RNA polymerase (RNAP) II and its homologs, Rpa43 of eukaryotic RNAP I, Rpc25 of eukaryotic RNAP III, and RpoE (subunit E) of archaeal RNAP. These proteins have, in addition to their N-terminal RNP domain, a C-terminal oligonucleotide-binding (OB) domain. Each of these subunits heterodimerizes with another RNAP subunit (Rpb7 to Rpb4, Rpc25 to Rpc17, RpoE to RpoF, and Rpa43 to Rpa14). The heterodimer is thought to tether the RNAP to a given promoter via its interactions with a promoter-bound transcription factor. The heterodimer is also thought to bind and position nascent RNA as it exits the polymerase complex.	80
259791	cd00656	Zn-ribbon	C-terminal zinc ribbon domain of RNA polymerase intrinsic transcript cleavage subunit. The homologous C-terminal zinc ribbon domains of subunits A12.2, Rpb9, and C11 in RNA Polymerases (Pol) I, II, and III, respectively are required for intrinsic transcript cleavage. TFS is a related archaeal protein that is involved in RNA cleavage by archaeal polymerase. These proteins have two zinc-binding beta-ribbon domains, N-terminal zinc ribbon (N-ribbon) and C-terminal zinc ribbon (C-ribbon). Transcription Factor IIS (TFIIS) domain III is homologous to the C-ribbon domain that stimulates the weak cleavage activity of Rpb9 for Pol II.	45
153097	cd00657	Ferritin_like	Ferritin-like superfamily of diiron-containing four-helix-bundle proteins. Ferritin-like, diiron-carboxylate proteins participate in a range of functions including iron regulation, mono-oxygenation, and reactive radical production. These proteins are characterized by the fact that they catalyze dioxygen-dependent oxidation-hydroxylation reactions within diiron centers; one exception is manganese catalase, which catalyzes peroxide-dependent oxidation-reduction within a dimanganese center. Diiron-carboxylate proteins are further characterized by the presence of duplicate metal ligands, glutamates and histidines (ExxH) and two additional glutamates within a four-helix bundle. Outside of these conserved residues there is little obvious homology. Members include bacterioferritin, ferritin, rubrerythrin, aromatic and alkene monooxygenase hydroxylases (AAMH), ribonucleotide reductase R2 (RNRR2), acyl-ACP-desaturases (Acyl_ACP_Desat), manganese (Mn) catalases, demethoxyubiquinone hydroxylases (DMQH), DNA protecting proteins (DPS), and ubiquinol oxidases (AOX), and the aerobic cyclase system, Fe-containing subunit (ACSF).	130
271176	cd00659	Topo_IB_C	DNA topoisomerase IB, C-terminal catalytic domain. Topoisomerase I promotes the relaxation of both positive and negative DNA superhelical tension by introducing a transient single-stranded break in duplex DNA. This function is vital for the processes of replication, transcription, and recombination. Unlike Topo IA enzymes, Topo IB enzymes do not require a single-stranded region of DNA or metal ions for their function. The type IB family of DNA topoisomerases includes eukaryotic nuclear topoisomerase I, topoisomerases of poxviruses, and bacterial versions of Topo IB. They belong to the superfamily of DNA breaking-rejoining enzymes, which share the same fold in their C-terminal catalytic domain and the overall reaction mechanism with tyrosine recombinases. The C-terminal catalytic domain in topoisomerases is linked to a divergent N-terminal domain that shows no sequence or structure similarity to the N-terminal domains of tyrosine recombinases.	210
238356	cd00660	Topoisomer_IB_N	Topoisomer_IB_N: N-terminal DNA binding fragment found in eukaryotic DNA topoisomerase (topo) IB proteins similar to the monomeric yeast and human topo I and heterodimeric topo I from Leishmania donovani. Topo I enzymes are divided into topo type IA (bacterial) and type IB (eukaryotic). Topo I relaxes superhelical tension in duplex DNA by creating a single-strand nick; the broken strand can then rotate around the unbroken strand to remove DNA supercoils, and the nick is religated, liberating topo I. These enzymes regulate the topological changes that accompany DNA replication, transcription and other nuclear processes.  Human topo I is the target of a diverse set of anticancer drugs including camptothecins (CPTs). CPTs bind to the topo I-DNA complex and inhibit re-ligation of the single-strand nick, resulting in the accumulation of topo I-DNA adducts.  In addition to differences in structure and some biochemical properties, Trypanosomatid parasite topo I differ from human topo I in their sensitivity to CPTs and other classical topo I inhibitors. Trypanosomatid topos I play putative roles in organizing the kinetoplast DNA network unique to these parasites.  This family may represent more than one structural domain.	215
238357	cd00667	ring_hydroxylating_dioxygenases_beta	Ring hydroxylating dioxygenase beta subunit. This subunit has a similar structure to NTF-2, ketosteroid isomerase and scytalone dehydratase. The degradation of aromatic compounds by aerobic bacteria frequently begins with the dihydroxylation of the substrate by nonheme iron-containing dioxygenases. These enzymes consist of two or three soluble proteins that interact to form an electron-transport chain that transfers electrons from reduced nucleotides (NADH) via flavin and [2Fe-2S] redox centers to a terminal dioxygenase. Aromatic-ring-hydroxylating dioxygenases oxidize aromatic hydrocarbons and related compounds to cis-arene diols. These enzymes utilize a mononuclear non-heme iron center to catalyze the addition of dioxygen to their respective substrates. The active site of these enzymes, however, is in the alpha subunit. No functional role has been attributed to the beta subunit except for a structural role.	160
185674	cd00668	Ile_Leu_Val_MetRS_core	catalytic core domain of isoleucyl, leucyl, valyl and methioninyl tRNA synthetases. Catalytic core domain of isoleucyl, leucyl, valyl and methioninyl tRNA synthetases. These class I enzymes are all monomers. However, in some species, MetRS functions as a homodimer, as a result of an additional C-terminal domain. These enzymes aminoacylate the 2'-OH of the nucleotide at the 3' end of the appropriate tRNA. The core domain is based on the Rossmann fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. It contains the characteristic class I HIGH and KMSKS motifs, which are involved in ATP binding.  Enzymes in this subfamily share an insertion in the core domain, which is subject to both deletions and rearrangements. This editing region hydrolyzes mischarged cognate tRNAs and thus prevents the incorporation of chemically similar amino acids. MetRS has a significantly shorter insertion, which lacks the editing function.	312
238358	cd00669	Asp_Lys_Asn_RS_core	Asp_Lys_Asn_tRNA synthetase class II core domain. This domain is the core catalytic domain of class II aminoacyl-tRNA synthetases of the subgroup containing aspartyl, lysyl, and asparaginyl tRNA synthetases. It is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. Class II assignment is based upon its structure and the presence of three characteristic sequence motifs. Nearly all class II tRNA synthetases are dimers and enzymes in this subgroup are homodimers. These enzymes attach a specific amino acid to the 3' OH group of ribose of the appropriate tRNA.	269
238359	cd00670	Gly_His_Pro_Ser_Thr_tRS_core	Gly_His_Pro_Ser_Thr_tRNA synthetase class II core domain. This domain is the core catalytic domain of tRNA synthetases of the subgroup containing glycyl, histidyl, prolyl, seryl and threonyl tRNA synthetases. It is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. These enzymes belong to class II aminoacyl-tRNA synthetases (aaRS) based upon their structure and the presence of three characteristic sequence motifs in the core domain. This domain is also found at the C-terminus of eukaryotic GCN2 protein kinase and at the N-terminus of the ATP phosphoribosyltransferase accessory subunit, HisZ and the accessory subunit of mitochondrial polymerase gamma (Pol gamma b) . Most class II tRNA synthetases are dimers, with this subgroup consisting of mostly homodimers. These enzymes attach a specific amino acid to the 3' OH group of ribose of the appropriate tRNA.	235
185675	cd00671	ArgRS_core	catalytic core domain of arginyl-tRNA synthetases. Arginyl tRNA synthetase (ArgRS) catalytic core domain. This class I enzyme is a monomer which aminoacylates the 2'-OH of the nucleotide at the 3' end of the appropriate tRNA. The core domain is based on the Rossmann fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. There are at least three subgroups of ArgRS. One type contains both characteristic class I HIGH and KMSKS motifs, which are involved in ATP binding. The second subtype lacks the KMSKS motif; however, it has a lysine N-terminal to the HIGH motif, which serves as the functional counterpart to the second lysine of the KMSKS motif. A third group, which is found primarily in archaea and a few bacteria, lacks both the KMSKS motif and the HIGH loop lysine.	212
173899	cd00672	CysRS_core	catalytic core domain of cysteinyl tRNA synthetase. Cysteinyl tRNA synthetase (CysRS) catalytic core domain. This class I enzyme is a monomer which aminoacylates the 2'-OH of the nucleotide at the 3' end of the appropriate tRNA. The core domain is based on the Rossmann fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. It contains the characteristic class I HIGH and KMSKS motifs, which are involved in ATP binding.	213
238360	cd00673	AlaRS_core	Alanyl-tRNA synthetase (AlaRS) class II core catalytic domain. AlaRS is a homodimer. It is responsible for the attachment of alanine to the 3' OH group of ribose of the appropriate tRNA. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. Class II assignment is based upon its predicted structure and the presence of three characteristic sequence motifs.	232
173900	cd00674	LysRS_core_class_I	catalytic core domain of class I lysyl tRNA synthetase. Class I lysyl tRNA synthetase (LysRS) catalytic core domain. This class I enzyme is a monomer which aminoacylates the 2'-OH of the nucleotide at the 3' end of the appropriate tRNA. The core domain is based on the Rossmann fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. It contains the characteristic class I HIGH and KMSKS motifs, which are involved in ATP binding. The class I LysRS is found only in archaea and some bacteria and has evolved separately from class II LysRS, as the two do not share structural or sequence similarity.	353
238361	cd00677	S15_NS1_EPRS_RNA-bind	S15/NS1/EPRS_RNA-binding domain. This short domain consists of a helix-turn-helix structure, which can bind to several types of RNA. It is found in the ribosomal protein S15, the influenza A viral nonstructural protein (NSA) and in several eukaryotic aminoacyl tRNA synthetases (aaRSs), where it occurs as a single or a repeated unit. It is involved in both protein-RNA interactions by binding tRNA and protein-protein interactions in the formation of tRNA-synthetases into multienzyme complexes. While this domain lacks significant sequence similarity between the subgroups in which it is found, they share similar electrostatic surface potentials and thus are likely to bind to RNA via the same mechanism.	46
176852	cd00680	RHO_alpha_C	C-terminal catalytic domain of the oxygenase alpha subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases. C-terminal catalytic domain of the oxygenase alpha subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenase (RHO) family. RHOs, also known as aromatic ring hydroxylating dioxygenases, utilize non-heme Fe(II) to catalyze the addition of hydroxyl groups to the aromatic ring, an initial step in the oxidative degradation of aromatic compounds. RHOs are composed of either two or three protein components, and are comprised of an electron transport chain (ETC), and an oxygenase. The ETC transfers reducing equivalents from the electron donor to the oxygenase component, which in turn transfers electrons to the oxygen molecules. The oxygenase components are oligomers, either (alpha)n or (alpha)n(beta)n. The alpha subunits are the catalytic components and have an N-terminal domain, which binds a Rieske-like 2Fe-2S cluster, and a C-terminal domain which binds the non-heme Fe(II). The Fe(II) is co-ordinated by conserved His and Asp residues. Oxygenases belonging to this family include the alpha subunits of Pseudomonas resinovorans strain CA10 anthranilate 1,2-dioxygenase, Stenotrophomonas maltophilia dicamba O-demethylase, Ralstonia sp. U2 salicylate-5-hydroxylase, Cycloclasticus sp. strain A5 polycyclic aromatic hydrocarbon dioxygenase, toluene 2,3-dioxygenase from Pseudomonas putida F1, dioxin dioxygenase of Sphingomonas sp. Strain RW1, plant choline monooxygenase, and the polycyclic aromatic hydrocarbon (PAH)-degrading ring-hydroxylating dioxygenase from Sphingomonas CHY-1. This group also includes the C-terminal catalytic domains of MupW, part of the mupirocin biosynthetic gene cluster in Pseudomonas fluorescens, and Pseudomonas aeruginosa GbcA (glycine betaine catabolism A). This family belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket.	188
173831	cd00683	Trans_IPPS_HH	Trans-Isoprenyl Diphosphate Synthases, head-to-head. These trans-Isoprenyl Diphosphate Synthases (Trans_IPPS) catalyze a head-to-head (HH) (1'-1) condensation reaction. This CD includes squalene and phytoene synthases which catalyze the 1'-1 condensation of two 15-carbon (farnesyl) and 20-carbon (geranylgeranyl) isoprenyl diphosphates, respectively. The catalytic site consists of a large central cavity formed by mostly antiparallel alpha helices with two aspartate-rich regions (DXXXD) located on opposite walls. These residues mediate binding of prenyl phosphates. A two-step reaction has been proposed for squalene synthase (farnesyl-diphosphate farnesyltransferase) in which two molecules of FPP react to form a stable cyclopropylcarbinyl diphosphate intermediate, and then the intermediate undergoes heterolysis, isomerization, and reduction with NADPH to form squalene, a precursor of cholesterol. The carotenoid biosynthesis enzyme, phytoene synthase (CrtB), catalyzes the condensation reaction of two molecules of geranylgeranyl diphosphate to produce phytoene, a precursor of beta-carotene. These enzymes produce the triterpene and tetraterpene precursors for many diverse sterol and carotenoid end products and are widely distributed among eukarya, bacteria, and archaea.	265
173832	cd00684	Terpene_cyclase_plant_C1	Plant Terpene Cyclases, Class 1. This CD includes a diverse group of monomeric plant terpene cyclases (Tpsa-Tpsf) that convert the acyclic isoprenoid diphosphates, geranyl diphosphate (GPP), farnesyl diphosphate (FPP), or geranylgeranyl diphosphate (GGPP) into cyclic monoterpenes, sesquiterpenes, or diterpenes, respectively; a few form acyclic species. Terpenoid cyclases are soluble enzymes localized to the cytosol (sesquiterpene synthases) or plastids (mono- and diterpene synthases). All monoterpene and diterpene synthases have strict substrate specificity; however, some sesquiterpene synthases can accept both FPP and GPP. The catalytic site consists of a large central cavity formed by mostly antiparallel alpha helices with two aspartate-rich regions located on opposite walls. These residues mediate binding of prenyl diphosphates, via bridging Mg2+ ions (K+ preferred by gymnosperm cyclases), inducing conformational changes such that an N-terminal region forms a cap over the catalytic core. Loss of diphosphate from the enzyme-bound substrate (GPP, FPP, or GGPP) results in an allylic carbocation that electrophilically attacks a double bond further down the terpene chain to effect the first ring closure. Unlike monoterpene, sesquiterpene, and macrocyclic diterpene synthases, which undergo substrate ionization by diphosphate ester scission, Tpsc-like diterpene synthases catalyze cyclization reactions by an initial protonation step producing a copalyl diphosphate intermediate. These enzymes lack the aspartate-rich sequences mentioned above. Most diterpene synthases have an N-terminal, internal element (approx 210 aa) whose function is unknown.	542
173833	cd00685	Trans_IPPS_HT	Trans-Isoprenyl Diphosphate Synthases, head-to-tail. These trans-Isoprenyl Diphosphate Synthases (Trans_IPPS) catalyze head-to-tail (HT) (1'-4) condensation reactions. This CD includes all-trans (E)-isoprenyl diphosphate synthases which synthesize various chain length (C10, C15, C20, C25, C30, C35, C40, C45, and C50) linear isoprenyl diphosphates from precursors, isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP). They catalyze the successive 1'-4 condensation of the 5-carbon IPP to allylic substrates geranyl-, farnesyl-, or geranylgeranyl-diphosphate. Isoprenoid chain elongation reactions proceed via electrophilic alkylations in which a new carbon-carbon single bond is generated through interaction between a highly reactive electron-deficient allylic carbocation and an electron-rich carbon-carbon double bond. The catalytic site consists of a large central cavity formed by mostly antiparallel alpha helices with two aspartate-rich regions (DDXX(XX)D) located on opposite walls. These residues mediate binding of prenyl phosphates via bridging Mg2+ ions, inducing proposed conformational changes that close the active site to solvent, protecting and stabilizing reactive carbocation intermediates. Farnesyl diphosphate synthases produce the precursors of steroids, cholesterol, sesquiterpenes, farnesylated proteins, heme, and vitamin K12; and geranylgeranyl diphosphate and longer chain synthases produce the precursors of carotenoids, retinoids, diterpenes, geranylgeranylated chlorophylls, ubiquinone, and archaeal ether linked lipids. Isoprenyl diphosphate synthases are widely distributed among archaea, bacteria, and eukarya.	259
173834	cd00686	Terpene_cyclase_cis_trans_C1	Cis, Trans, Terpene Cyclases, Class 1. This CD includes the terpenoid cyclase, trichodiene synthase, which catalyzes the cyclization of farnesyl diphosphate (FPP) to trichodiene using a cis-trans pathway, and is the first committed step in the biosynthesis of trichothecene toxins and antibiotics. As with other enzymes with the 'terpenoid synthase fold', this enzyme has two conserved metal binding motifs that coordinate Mg2+ ion-bridged binding of the diphosphate moiety of FPP. Metal-triggered substrate ionization initiates catalysis, and the alpha-barrel active site serves as a template to channel and stabilize the conformations of reactive carbocation intermediates through a complex cyclization cascade. These enzymes function as homodimers and are found in several genera of fungi.	357
173835	cd00687	Terpene_cyclase_nonplant_C1	Non-plant Terpene Cyclases, Class 1. This CD includes terpenoid cyclases such as pentalenene synthase and aristolochene synthase which, using an all-trans pathway, catalyze the ionization of farnesyl diphosphate, followed by the formation of a macrocyclic intermediate by bond formation between C1 with either C10 (aristolochene synthase) or C11 (pentalenene synthase), resulting in production of tricyclic hydrocarbon pentalenene or bicyclic hydrocarbon aristolochene. As with other enzymes with the 'terpenoid synthase fold', they have two conserved metal binding motifs, proposed to coordinate Mg2+ ion-bridged binding of the diphosphate moiety of FPP to the enzymes. Metal-triggered substrate ionization initiates catalysis, and the alpha-barrel active site serves as a template to channel and stabilize the conformations of reactive carbocation intermediates through a complex cyclization cascade. These enzymes function in the monomeric form and are found in fungi, bacteria and Dictyostelium.	303
238362	cd00688	ISOPREN_C2_like	This group contains class II terpene cyclases, protein prenyltransferases beta subunit, two broadly specific proteinase inhibitors, alpha2-macroglobulin (alpha (2)-M) and pregnancy zone protein (PZP), and the C3, C4 and C5 components of vertebrate complement. Class II terpene cyclases include squalene cyclase (SQCY) and 2,3-oxidosqualene cyclase (OSQCY); these integral membrane proteins catalyze a cationic cyclization cascade converting linear triterpenes to fused ring compounds.  The protein prenyltransferases include protein farnesyltransferase (FTase) and geranylgeranyltransferase types I and II (GGTase-I and GGTase-II) which catalyze the carboxyl-terminal lipidation of Ras, Rab, and several other cellular signal transduction proteins, facilitating membrane associations and specific protein-protein interactions. Alpha (2)-M is a major carrier protein in serum and involved in the immobilization and entrapment of proteases. PZP is a pregnancy associated protein. Alpha (2)-M and PZP are known to bind to, and may modulate, the activity of placental protein-14 in T-cell growth and cytokine production thereby protecting the allogeneic fetus from attack by the maternal immune system.	300
173825	cd00691	ascorbate_peroxidase	Ascorbate peroxidases and cytochrome C peroxidases. Ascorbate peroxidases are a subgroup of heme-dependent peroxidases of the plant superfamily that share a heme prosthetic group and catalyze a multistep oxidative reaction involving hydrogen peroxide as the electron acceptor. Along with related catalase-peroxidases, ascorbate peroxidases belong to class I of the plant superfamily. Ascorbate peroxidases are found in the chloroplasts and/or cytosol of algae and plants, where they have been shown to control the concentration of lethal hydrogen peroxide molecules. The yeast cytochrome c peroxidase is a divergent member of the family; it forms a complex with cytochrome c to catalyze the reduction of hydrogen peroxide to water.	253
173826	cd00692	ligninase	Ligninase and other manganese-dependent fungal peroxidases. Ligninases and related extracellular fungal peroxidases belong to class II of the plant heme-dependent peroxidase superfamily. All members of the superfamily share a heme prosthetic group and catalyze a multistep oxidative reaction involving hydrogen peroxide as the electron acceptor. Class II peroxidases are fungal glycoproteins that have been implicated in the oxidative breakdown of lignin, the main cell wall component of woody plants. They contain four conserved disulphide bridges and two conserved calcium binding sites.	328
173827	cd00693	secretory_peroxidase	Horseradish peroxidase and related secretory plant peroxidases. Secretory peroxidases belong to class III of the plant heme-dependent peroxidase superfamily. All members of the superfamily share a heme prosthetic group and catalyze a multistep oxidative reaction involving hydrogen peroxide as the electron acceptor. Class III peroxidases are found in the extracellular space or in the vacuole in plants where they have been implicated in hydrogen peroxide detoxification, auxin catabolism and lignin biosynthesis, and stress response. Class III peroxidases contain four conserved disulphide bridges and two conserved calcium binding sites.	298
133420	cd00704	MDH	Malate dehydrogenase. Malate dehydrogenase (MDH) is one of the key enzymes in the citric acid cycle, facilitating both the conversion of malate to oxaloacetate and replenishing levels of oxalacetate by reductive carboxylation of pyruvate. MDHs belong to the NAD-dependent, lactate dehydrogenase (LDH)-like, 2-hydroxycarboxylate dehydrogenase family, which also includes the GH4 family of glycoside hydrolases. They are part of the NAD(P)-binding Rossmann fold superfamily, which includes a wide variety of protein families including the NAD(P)-binding domains of alcohol dehydrogenases, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate dehydrogenases, formate/glycerate dehydrogenases, siroheme synthases, 6-phosphogluconate dehydrogenases, aminoacid dehydrogenases, repressor rex, and NAD-binding potassium channel domains, among others.	323
238363	cd00707	Pancreat_lipase_like	Pancreatic lipase-like enzymes.  Lipases are esterases that can hydrolyze long-chain acyl-triglycerides into di- and monoglycerides, glycerol, and free fatty acids at a water/lipid interface.  A typical feature of lipases is "interfacial activation," the process of becoming active at the lipid/water interface, although several examples of lipases have been identified that do not undergo interfacial activation.  The active site of a lipase contains a catalytic triad consisting of Ser - His - Asp/Glu, but unlike most serine proteases, the active site is buried inside the structure.  A "lid" or "flap" covers the active site, making it inaccessible to solvent and substrates. The lid opens during the process of interfacial activation, allowing the lipid substrate access to the active site.	275
100039	cd00710	LbH_gamma_CA	Gamma carbonic anhydrases (CA): Carbonic anhydrases are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism, involving the nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three distinct groups of  carbonic anhydrases - alpha, beta and gamma - which show no significant sequence identity or structural similarity. Gamma CAs are homotrimeric enzymes, with each subunit containing a left-handed parallel beta helix (LbH) structural domain.	167
238364	cd00712	AsnB	Glutamine amidotransferases class-II (GATase) asparagine synthase_B type.  Asparagine synthetase B catalyses the ATP-dependent conversion of aspartate to asparagine. This enzyme is a homodimer, with each monomer composed of a  glutaminase domain and a synthetase domain. The N-terminal glutaminase domain hydrolyzes glutamine to glutamic acid and ammonia.	220
238365	cd00713	GltS	Glutamine amidotransferases class-II (Gn-AT), glutamate synthase (GltS)-type. GltS is a homodimer that synthesizes L-glutamate from 2-oxoglutarate and L-glutamine, an important step in ammonia assimilation in bacteria, cyanobacteria and plants. The N-terminal glutaminase domain catalyzes the hydrolysis of glutamine to glutamic acid and ammonia, and has a fold similar to that of other glutamine amidotransferases such as glucosamine-fructose 6-phosphate synthase (GLMS or GFAT), glutamine phosphoribosylpyrophosphate (Prpp) amidotransferase (GPATase), asparagine synthetase B (AsnB), and beta lactam synthetase (beta-LS), as well as the Ntn hydrolase folds of the proteasomal alpha and beta subunits.	413
238366	cd00714	GFAT	Glutamine amidotransferases class-II (Gn-AT)_GFAT-type. This domain is found at the N-terminus of glucosamine-6P synthase (GlmS, or GFAT in humans).  The glutaminase domain catalyzes amide nitrogen transfer from glutamine to the appropriate substrate. In this process, glutamine is hydrolyzed to glutamic acid and ammonia. In humans, GFAT catalyzes the first and rate-limiting step of hexosamine metabolism, the conversion of D-fructose-6P (Fru6P) into D-glucosamine-6P using L-glutamine as a nitrogen source.  The end product of this pathway, UDP-N-acetyl glucosamine, is a major building block of the bacterial peptidoglycan and fungal chitin.	215
238367	cd00715	GPATase_N	Glutamine amidotransferases class-II (GN-AT)_GPAT-type. This domain is found at the N-terminus of glutamine phosphoribosylpyrophosphate (Prpp) amidotransferase (GPATase). The glutaminase domain catalyzes amide nitrogen transfer from glutamine to the appropriate substrate. In this process, glutamine is hydrolyzed to glutamic acid and ammonia. GPATase catalyzes the first step in purine biosynthesis, an amide transfer from glutamine to PRPP, resulting in phosphoribosylamine, pyrophosphate and glutamate. GPATase crystallizes as a homotetramer, but can also exist as a homodimer.	252
153076	cd00716	creatine_kinase_like	Phosphagen (guanidino) kinases such as creatine kinase and similar enzymes. Eukaryotic creatine kinase-like phosphagen (guanidino) kinases are enzymes that transphosphorylate a high energy phosphoguanidino compound, like phosphocreatine (PCr) in the case of creatine kinase (CK), which is used as an energy-storage and -transport metabolite, to ADP, thereby creating ATP. The substrate binding site is located in the cleft between the N and C-terminal domains, but most of the catalytic residues are found in the larger C-terminal domain. In higher eukaryotes, CKs are found as tissue-specific (muscle, brain), as well as compartment-specific (mitochondrial, cytosolic, and flagellar) isoforms. Mitochondrial and cytoplasmic CKs are dimeric or octameric, while the flagellar isoforms are trimers with three CD domains fused as a single protein chain. CKs are either coupled to glycolysis (cytosolic form) or oxidative phosphorylation (mitochondrial form). Besides CK, one of the most studied members of this family, this model also represents other phosphagen kinases with different substrate specificities, like glycocyamine kinase (GK), lombricine kinase (LK), taurocyamine kinase (TK), and echinoderm arginine kinase (AK).	357
238368	cd00717	URO-D	Uroporphyrinogen decarboxylase (URO-D) is a dimeric cytosolic enzyme that decarboxylates the four acetate side chains of uroporphyrinogen III (uro-III) to create coproporphyrinogen III, without requiring any prosthetic groups or cofactors. This reaction is located at the branching point of the tetrapyrrole biosynthetic pathway, leading to the biosynthesis of heme, chlorophyll or bacteriochlorophyll. URO-D deficiency is responsible for the human genetic diseases familial porphyria cutanea tarda (fPCT) and hepatoerythropoietic porphyria (HEP).	335
198380	cd00719	GIY-YIG_SF	GIY-YIG nuclease domain superfamily. The GIY-YIG nuclease domain superfamily includes a large and diverse group of proteins involved in many cellular processes, such as class I homing GIY-YIG family endonucleases, prokaryotic nucleotide excision repair proteins UvrC and Cho, type II restriction enzymes, the endonuclease/reverse transcriptase of eukaryotic retrotransposable elements, and a family of eukaryotic enzymes that repair stalled replication forks. All of these members contain a conserved GIY-YIG nuclease domain that may serve as a scaffold for the coordination of a divalent metal ion required for catalysis of the phosphodiester bond cleavage. By combining with different specificity, targeting, or other domains, the GIY-YIG nucleases may perform different functions.	69
238369	cd00727	malate_synt_A	Malate synthase A (MSA), present in some bacteria, plants and fungi. Prokaryotic MSAs tend to be monomeric, whereas eukaryotic enzymes are homomultimers. In general, malate synthase catalyzes the Claisen condensation of glyoxylate and acetyl-CoA to malyl-CoA, which hydrolyzes to malate and CoA. This reaction is part of the glyoxylate cycle, which allows certain organisms, like plants and fungi, to derive their carbon requirements from two-carbon compounds, by bypassing the two carboxylation steps of the citric acid cycle.	511
238370	cd00728	malate_synt_G	Malate synthase G (MSG), monomeric enzyme present in some bacteria. In general, malate synthase catalyzes the Claisen condensation of glyoxylate and acetyl-CoA to malyl-CoA, which hydrolyzes to malate and CoA. This reaction is part of the glyoxylate cycle, which allows certain organisms to derive their carbon requirements from two-carbon compounds, by bypassing the two carboxylation steps of the citric acid cycle.	712
238371	cd00729	rubredoxin_SM	Rubredoxin, Small Modular nonheme iron binding domain containing a [Fe(SCys)4] center, present in rubrerythrin and nigerythrin and detected either N- or C-terminal to such proteins as flavin reductase, NAD(P)H-nitrite reductase, and ferredoxin-thioredoxin reductase. In rubredoxin, the iron atom is coordinated by four cysteine residues (Fe(S-Cys)4), and  believed to be involved in electron transfer. Rubrerythrins and nigerythrins are small homodimeric proteins, generally consisting of 2 domains: a rubredoxin domain C-terminal to a non-sulfur, oxo-bridged diiron site in the N-terminal rubrerythrin domain. Rubrerythrins and nigerythrins have putative peroxide activity.	34
238372	cd00730	rubredoxin	Rubredoxin; nonheme iron binding domains containing a [Fe(SCys)4] center. Rubredoxins are small nonheme iron proteins. The iron atom is coordinated by four cysteine residues (Fe(S-Cys)4), but iron can also be replaced by cobalt, nickel or zinc. They are believed to be involved in electron transfer.	50
238373	cd00731	CheA_reg	CheA regulatory domain; CheA is a histidine protein kinase present in bacteria and archaea. Once CheA is activated by the chemotaxis receptor, a histidine phosphoryl group from CheA is passed directly to an aspartate in the response regulator CheY. This signalling mechanism is modulated by the methyl accepting chemotaxis proteins (MCPs). MCPs form a highly interconnected, tightly packed array within the membrane that is organized, at least in part, through interactions with CheW and CheA. The CheA regulatory domain belongs to the family of CheW_like proteins and has been proposed to mediate interaction with the kinase regulator CheW.	132
238374	cd00732	CheW	CheW, a small regulator protein, unique to the chemotaxis signalling in prokaryotes and archaea. CheW interacts with the histidine kinase CheA, most likely with the related regulatory domain of CheA. CheW is proposed to form signalling arrays together with CheA and the methyl-accepting chemotaxis proteins (MCPs), which are involved in response modulation.	140
238375	cd00733	GlyRS_alpha_core	Class II Glycyl-tRNA synthetase (GlyRS) alpha subunit core catalytic domain. GlyRS functions as a homodimer in eukaryotes, archaea and some bacteria and as a heterotetramer in the remainder of prokaryotes and in Arabidopsis. It is responsible for the attachment of glycine to the 3' OH group of ribose of the appropriate tRNA. This domain is primarily responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. This alignment contains only sequences from the GlyRS form which heterotetramerizes. The homodimer form of GlyRS is in a different family of class II aaRS. Class II assignment is based upon structure and the presence of three characteristic sequence motifs.	279
381597	cd00735	T4-like_lys	bacteriophage T4-like lysozymes. Bacteriophage T4-like lysozymes hydrolyze the beta-1,4-glycosidic bond between N-acetylmuramic acid (MurNAc) and N-acetylglucosamine (GlcNAc) in peptidoglycan heteropolymers of prokaryotic cell walls. Members include a variety of bacteriophages (T4, RB49, RB69, Aeh1), as well as Dictyostelium.	146
381598	cd00736	lambda_lys-like	Bacteriophage lambda lysozyme and similar proteins. Lysozyme from bacteriophage lambda hydrolyzes the beta-1,4-glycosidic bond between N-acetylmuramic acid (MurNAc) and N-acetylglucosamine (GlcNAc), as do other lysozymes. However, unlike other lysozymes, bacteriophage lambda does not produce a reducing end upon cleavage of the peptidoglycan, but rather uses the 6-OH of the same MurNAc residue to produce a 1,6-anhydromuramic acid terminal residue and is therefore a lytic transglycosylase. An identical 1,6-anhydro bond is formed in bacterial peptidoglycans by the action of the lytic transglycosylases of E. coli, though they differ structurally.	141
381599	cd00737	lyz_endolysin_autolysin	endolysin and autolysin. The dsDNA phages of eubacteria use endolysins or muralytic enzymes in conjunction with holin, a small membrane protein, to degrade the peptidoglycan found in bacterial cell walls. Similarly, bacteria produce autolysins to facilitate the biosynthesis of their cell wall heteropolymer peptidoglycan and cell division. Endolysins and autolysins are found in viruses and bacteria, respectively. Both endolysin and autolysin enzymes cleave the glycosidic beta 1,4-bonds between the N-acetylmuramic acid and the N-acetylglucosamine of the peptidoglycan.	136
238379	cd00738	HGTP_anticodon	HGTP anticodon binding domain, as found at the C-terminus of histidyl, glycyl, threonyl and prolyl tRNA synthetases, which are classified as a group of class II aminoacyl-tRNA synthetases (aaRS). In aaRSs, the anticodon binding domain is responsible for specificity in tRNA-binding, so that the activated amino acid is transferred to a ribose 3' OH group of the appropriate tRNA only. This domain is also found in the accessory subunit of mitochondrial polymerase gamma (Pol gamma b).	94
238380	cd00739	DHPS	DHPS subgroup of Pterin binding enzymes. DHPS (dihydropteroate synthase), a functional homodimer, catalyzes the condensation of p-aminobenzoic acid (pABA) in the de novo biosynthesis of folate, which is an essential cofactor in both nucleic acid and protein biosynthesis. Prokaryotes (and some lower eukaryotes) must synthesize folate de novo, while higher eukaryotes are able to utilize dietary folate and therefore lack DHPS.  Sulfonamide drugs, which are substrate analogs of pABA, target DHPS.	257
238381	cd00740	MeTr	MeTr subgroup of pterin binding enzymes. This family includes cobalamin-dependent methyltransferases such as methyltetrahydrofolate, corrinoid iron-sulfur protein methyltransferase (MeTr) and methionine synthase (MetH).  Cobalamin-dependent methyltransferases catalyze the transfer of a methyl group via a methyl- cob(III)amide intermediate.  These include MeTr, a functional heterodimer, and the folate binding domain of MetH.	252
238382	cd00741	Lipase	Lipase.  Lipases are esterases that can hydrolyze long-chain acyl-triglycerides into di- and monoglycerides, glycerol, and free fatty acids at a water/lipid interface.  A typical feature of lipases is "interfacial activation", the process of becoming active at the lipid/water interface, although several examples of lipases have been identified that do not undergo interfacial activation. The active site of a lipase contains a catalytic triad consisting of Ser - His - Asp/Glu, but unlike most serine proteases, the active site is buried inside the structure.  A "lid" or "flap" covers the active site, making it inaccessible to solvent and substrates. The lid opens during the process of interfacial activation, allowing the lipid substrate access to the active site.	153
381183	cd00742	FABP	intracellular fatty acid-binding protein family. Members of this family are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner. They protect and shuttle fatty acids within the cell and are involved in acquisition and removal of fatty acids from intracellular sites. They include cellular retinol-binding proteins (CRBPs) which participate in the cellular uptake of vitamin A in the form of free retinol, cellular retinoic acid-binding proteins (CRABPs) which participate in the metabolism of vitamin A and retinoic acid, and bind all trans retinoic acid, but not retinol, and FABPs similar to FABP3 which plays an important role in fatty acid transportation, cell growth, cell signaling, and gene transcription.	129
381184	cd00743	lipocalin_RBP_like	retinol-binding protein 4 and similar proteins. Retinol-Binding Protein 4 (RBP4) is a plasma protein that transports retinol (vitamin A) from the liver stores to the peripheral tissues. The RBP4-retinol complex interacts with transthyretin (TTR - transports thyroxine and retinol) which protects it from renal excretion. In addition to retinol, other endogenous and synthetic retinoids bind RBP4, including all-trans and 13-cis retinoic acid, retinyl acetate, N-(ethyl)retinamide, and fenretinide. This group also includes purpurin, a retinol-specific protein that plays a role in neural retina cell adhesion during development of the chicken retina; it also binds retinol and may participate in retinol transport in the retina. This group belongs to the lipocalin/cytosolic fatty-acid binding protein family, whose members have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane-bound receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	171
238383	cd00751	thiolase	Thiolases are ubiquitous enzymes that catalyze the reversible thiolytic cleavage of 3-ketoacyl-CoA into acyl-CoA and acetyl-CoA, a 2-step reaction involving a covalent intermediate formed with a catalytic cysteine. They are found in prokaryotes and eukaryotes (cytosol, microbodies and mitochondria). There are 2 functionally different classes: thiolase-I (3-ketoacyl-CoA thiolase) and thiolase-II (acetoacetyl-CoA thiolase). Thiolase-I can cleave longer fatty acid molecules and plays an important role in the beta-oxidative degradation of fatty acids. Thiolase-II has a high substrate specificity. Although it can cleave acetoacetyl-CoA, its main function is the synthesis of acetoacetyl-CoA from two molecules of acetyl-CoA, which gives it importance in several biosynthetic pathways.	386
340452	cd00754	Ubl_MoaD	ubiquitin-like (Ubl) domain found in molybdenum cofactor biosynthesis protein D (MoaD) and similar proteins. MoaD, also termed molybdopterin synthase sulfur carrier subunit, or MPT synthase subunit 1, or MPT synthase small subunit, or molybdopterin-converting factor small subunit, or molybdopterin-converting factor subunit 1, is a conserved small sulfur carrier protein that has beta-grasp ubiquitin-like (Ubl) fold involved in biosynthesis of the molybdenum cofactor (Moco), an essential cofactor of a diverse group of redox enzymes. MoaD is activated in an ATP-dependent manner by sulfurtransferases similar to the activation mechanism of ubiquitin-activating enzyme E1.	79
238384	cd00755	YgdL_like	Family of activating enzymes (E1) of ubiquitin-like proteins related to the E.coli hypothetical protein ygdL. The common reaction mechanism catalyzed by E1-like enzymes begins with a nucleophilic attack of the C-terminal carboxylate of the ubiquitin-like substrate, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of the substrate. The exact function of this family is unknown.	231
238385	cd00756	MoaE	MoaE family. Members of this family are involved in biosynthesis of the molybdenum cofactor (Moco), an essential cofactor for a diverse group of redox enzymes. Moco biosynthesis is an evolutionarily conserved pathway present in eubacteria, archaea and eukaryotes. Moco contains a tricyclic pyranopterin, termed molybdopterin (MPT), which carries the cis-dithiolene group responsible for molybdenum ligation. This dithiolene group is generated by MPT synthase in the second major step in Moco biosynthesis. MPT synthase is a heterotetramer consisting of two large (MoaE) and two small (MoaD) subunits.	124
238386	cd00757	ThiF_MoeB_HesA_family	ThiF_MoeB_HesA. Family of E1-like enzymes involved in molybdopterin and thiamine biosynthesis. The common reaction mechanism catalyzed by MoeB and ThiF, like other E1 enzymes, begins with a nucleophilic attack of the C-terminal carboxylate of MoaD and ThiS, respectively, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of MoaD and ThiS. MoeB, as the MPT synthase (MoaE/MoaD complex) sulfurase, is involved in the biosynthesis of the molybdenum cofactor, a derivative of the tricyclic pterin, molybdopterin (MPT). ThiF catalyzes the adenylation of ThiS, as part of the biosynthesis pathway of thiamin pyrophosphate (vitamin B1).	228
238387	cd00758	MoCF_BD	MoCF_BD: molybdenum cofactor (MoCF) binding domain (BD). This domain is found in a variety of proteins involved in biosynthesis of molybdopterin cofactor, like MoaB, MogA, and MoeA. The domain is presumed to bind molybdopterin.	133
132997	cd00761	Glyco_tranf_GTA_type	Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold. Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein.  Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold.  This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities.	156
133442	cd00762	NAD_bind_malic_enz	NAD(P) binding domain of malic enzyme. Malic enzyme (ME), a member of the amino acid dehydrogenase (DH)-like domain family, catalyzes the oxidative decarboxylation of L-malate to pyruvate in the presence of cations (typically Mg++ or Mn++) with the concomitant reduction of cofactor NAD+ or NADP+.  ME has been found in all organisms and plays important roles in diverse metabolic pathways such as photosynthesis and lipogenesis. This enzyme generally forms homotetramers. The conversion of malate to pyruvate by ME typically involves oxidation of malate to produce oxaloacetate, followed by decarboxylation of oxaloacetate to produce pyruvate and CO2.  Amino acid DH-like NAD(P)-binding domains are members of the Rossmann fold superfamily and include glutamate, leucine, and phenylalanine DHs, methylene tetrahydrofolate DH, methylene-tetrahydromethanopterin DH, methylene-tetrahydrofolate DH/cyclohydrolase, Shikimate DH-like proteins, malate oxidoreductases, and glutamyl tRNA reductase. Amino acid DHs catalyze the deamination of amino acids to keto acids with NAD(P)+ as a cofactor. The NAD(P)-binding Rossmann fold superfamily includes a wide variety of protein families including NAD(P)-binding domains of alcohol DHs, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate DH, lactate/malate DHs, formate/glycerate DHs, siroheme synthases, 6-phosphogluconate DH, amino acid DHs, repressor rex, NAD-binding potassium channel domain, CoA-binding, and ornithine cyclodeaminase-like domains. These domains have an alpha-beta-alpha configuration. NAD binding involves numerous hydrogen and van der Waals contacts.	254
238388	cd00763	Bacterial_PFK	Phosphofructokinase, a key regulatory enzyme in glycolysis, catalyzes the phosphorylation of fructose-6-phosphate to fructose-1,6-biphosphate. The members belong to a subfamily of the PFKA family (cd00363) and include bacterial ATP-dependent phosphofructokinases. These are allosterically regulated homotetramers; the subunits are of about 320 amino acids.	317
238389	cd00764	Eukaryotic_PFK	Phosphofructokinase, a key regulatory enzyme in glycolysis, catalyzes the phosphorylation of fructose-6-phosphate to fructose-1,6-biphosphate. The members belong to a subfamily of the PFKA family (cd00363) and include eukaryotic ATP-dependent phosphofructokinases. These have evolved from the bacterial PFKs by gene duplication and fusion events and exhibit complex allosteric behavior.	762
238390	cd00765	Pyrophosphate_PFK	Phosphofructokinase, a key regulatory enzyme in glycolysis, catalyzes the phosphorylation of fructose-6-phosphate to fructose-1,6-biphosphate. The members belong to a subfamily of the PFKA family (cd00363) and include pyrophosphate-dependent phosphofructokinases. These are found in bacteria as well as plants. These may be dimeric nonallosteric enzymes as in bacteria or allosteric heterotetramers as in plants.	550
238391	cd00768	class_II_aaRS-like_core	Class II tRNA amino-acyl synthetase-like catalytic core domain. Class II amino acyl-tRNA synthetases (aaRS) share a common fold and generally attach an amino acid to the 3' OH of ribose of the appropriate tRNA.   PheRS is an exception in that it attaches the amino acid at the 2'-OH group, like class I aaRSs. These enzymes are usually homodimers. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. The substrate specificity of this reaction is further determined by additional domains. Interestingly, this domain is also found in asparagine synthase A (AsnA), in the accessory subunit of mitochondrial polymerase gamma and in the bacterial ATP phosphoribosyltransferase regulatory subunit HisZ.	211
238392	cd00769	PheRS_beta_core	Phenylalanyl-tRNA synthetase (PheRS) beta chain core domain. PheRS belongs to class II aminoacyl-tRNA synthetases (aaRS) based upon its structure. While class II aaRSs generally aminoacylate the 3'-OH ribose of the appropriate tRNA,  PheRS is an exception in that it attaches the amino acid at the 2'-OH group, like class I aaRSs. PheRS is an alpha-2/ beta-2 tetramer. While the alpha chain contains a catalytic core domain, the beta chain has a non-catalytic core domain.	198
238393	cd00770	SerRS_core	Seryl-tRNA synthetase (SerRS) class II core catalytic domain. SerRS is responsible for the attachment of serine to the 3' OH group of ribose of the appropriate tRNA. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. Class II assignment is based upon its structure and the presence of three characteristic sequence motifs in the core domain. SerRS is a homodimer.	297
238394	cd00771	ThrRS_core	Threonyl-tRNA synthetase (ThrRS) class II core catalytic domain. ThrRS is a homodimer. It is responsible for the attachment of threonine to the 3' OH group of ribose of the appropriate tRNA. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. Class II assignment is based upon its structure and the presence of three characteristic sequence motifs in the core domain.	298
238395	cd00772	ProRS_core	Prolyl-tRNA synthetase (ProRS) class II core catalytic domain. ProRS is a homodimer. It is responsible for the attachment of proline to the 3' OH group of ribose of the appropriate tRNA. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. Class II assignment is based upon its structure and the presence of three characteristic sequence motifs in the core domain.	264
238396	cd00773	HisRS-like_core	Class II Histidinyl-tRNA synthetase (HisRS)-like catalytic core domain. HisRS is a homodimer. It is responsible for the attachment of histidine to the 3' OH group of ribose of the appropriate tRNA. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. Class II assignment is based upon its structure and the presence of three characteristic sequence motifs. This domain is also found at the C-terminus of eukaryotic GCN2 protein kinase and at the N-terminus of the ATP phosphoribosyltransferase accessory subunit, HisZ. HisZ along with HisG catalyze the first reaction in histidine biosynthesis. HisZ is found only in a subset of bacteria and differs from HisRS in lacking a C-terminal anti-codon binding domain.	261
238397	cd00774	GlyRS-like_core	Glycyl-tRNA synthetase (GlyRS)-like class II core catalytic domain. GlyRS functions as a homodimer in eukaryotes, archaea and some bacteria and as a heterotetramer in the remainder of prokaryotes. It is responsible for the attachment of glycine to the 3' OH group of ribose of the appropriate tRNA. This domain is primarily responsible for ATP binding and hydrolysis. This alignment contains only sequences from the GlyRS form which homodimerizes. The heterotetramer glyQ is in a different family of class II aaRS. Class II assignment is based upon its structure and the presence of three characteristic sequence motifs. This domain is also found at the N-terminus of the accessory subunit of mitochondrial polymerase gamma (Pol gamma b). Pol gamma b stimulates processive DNA synthesis and is functional as a homodimer, which can associate with the catalytic subunit Pol gamma alpha to form a heterotrimer. Despite significant structural and sequence similarity with GlyRS, Pol gamma b lacks conservation of several class II functional residues.	254
238398	cd00775	LysRS_core	Lys_tRNA synthetase (LysRS) class II core domain.  Class II LysRS is a dimer which attaches a lysine to the 3' OH group of ribose of the appropriate tRNA. Its assignment to class II aaRS is based upon its structure and the presence of three characteristic sequence motifs in the core domain. It is found in eukaryotes as well as some prokaryotes and archaea.  However, LysRS belongs to class I aaRS's  in some prokaryotes and archaea. The catalytic core domain is primarily responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate.	329
238399	cd00776	AsxRS_core	Asx tRNA synthetase (AspRS/AsnRS) class II core domain. Assignment to class II aminoacyl-tRNA synthetases (aaRS) is based upon its structure and the presence of three characteristic sequence motifs in the core domain. This family includes AsnRS as well as a subgroup of AspRS. AsnRS and AspRS are homodimers, which attach either asparagine or aspartate to the 3' OH group of ribose of the appropriate tRNA. While archaea lack AsnRS, they possess a non-discriminating AspRS, which can mischarge tRNA(Asn) with Asp. Subsequently, a tRNA-dependent aspartate amidotransferase converts the bound aspartate to asparagine. The catalytic core domain is primarily responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate.	322
238400	cd00777	AspRS_core	Asp tRNA synthetase (aspRS) class II core domain. Class II assignment is based upon its structure and the presence of three characteristic sequence motifs. AspRS is a homodimer, which attaches a specific amino acid to the 3' OH group of ribose of the appropriate tRNA. The catalytic core domain is primarily responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. AspRS in this family differ from those found in the AsxRS family by a GAD insert in the core domain.	280
238401	cd00778	ProRS_core_arch_euk	Prolyl-tRNA synthetase (ProRS) class II core catalytic domain. ProRS is a homodimer. It is responsible for the attachment of proline to the 3' OH group of ribose of the appropriate tRNA. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. Class II assignment is based upon its structure and the presence of three characteristic sequence motifs in the core domain. This subfamily contains the core domain of ProRS from archaea, the cytoplasm of eukaryotes and some bacteria.	261
238402	cd00779	ProRS_core_prok	Prolyl-tRNA synthetase (ProRS) class II core catalytic domain. ProRS is a homodimer. It is responsible for the attachment of proline to the 3' OH group of ribose of the appropriate tRNA. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. Class II assignment is based upon its structure and the presence of three characteristic sequence motifs in the core domain. This subfamily contains the core domain of ProRS from prokaryotes and from the mitochondria of eukaryotes.	255
238403	cd00780	NTF2	Nuclear transport factor 2 (NTF2) domain plays an important role in the trafficking of macromolecules, ions and small molecules between the cytoplasm and nucleus. This bi-directional transport of macromolecules across the nuclear envelope requires many soluble factors that includes GDP-binding protein Ran (RanGDP). RanGDP is required for both import and export of proteins and poly(A) RNA. RanGDP also has been implicated in cell cycle control, specifically in mitotic spindle assembly. In interphase cells, RanGDP is predominately nuclear and thought to be GTP bound, but it is also present in the cytoplasm, probably in the GDP-bound state. NTF2 mediates the nuclear import of RanGDP. NTF2 binds to both RanGDP and FxFG repeat-containing nucleoporins.	119
238404	cd00781	ketosteroid_isomerase	ketosteroid isomerase: Many biological reactions proceed by enzymatic cleavage of a C-H bond adjacent to a carbonyl or a carboxyl group, leading to an enol or an enolate intermediate that is subsequently re-protonated at the same or an adjacent carbon. Ketosteroid isomerases are important members of this class of enzymes, which are the most proficient of all enzymes known, and have served as a paradigm for enzymatic enolizations since their discovery in 1954. This CD includes members of this class that catalyze the isomerization of various beta,gamma-unsaturated isomers at nearly a diffusion-controlled rate. These enzymes are widely distributed in bacteria.	122
238405	cd00782	MutL_Trans	MutL_Trans: transducer domain, having a ribosomal S5 domain 2-like fold, conserved in the C-terminal domain of DNA mismatch repair (MutL/MLH1/PMS2) family. This transducer domain is homologous to the second domain of the DNA gyrase B subunit, which is known to be important in nucleotide hydrolysis and the transduction of structural signals from ATP-binding site to the DNA breakage/reunion regions of the enzymes. Included in this group are proteins similar to human MLH1, hPMS2, hPMS1, hMLH3 and E. coli MutL. MLH1 forms heterodimers with PMS2, PMS1 and MLH3. These three complexes have distinct functions in meiosis. hMLH1-hPMS2 also participates in the repair of all DNA mismatch repair (MMR) substrates. Roles for hMLH1-hPMS1 or hMLH1-hMLH3 in MMR have not been established. Cells lacking either hMLH1 or hPMS2 have a strong mutator phenotype and display microsatellite instability (MSI). Mutation in hMLH1 causes predisposition to HNPCC, Muir-Torre syndrome and Turcot syndrome (HNPCC variant). Mutation in hPMS2 causes predisposition to HNPCC and Turcot syndrome. Mutation in hMLH1 accounts for a large fraction of HNPCC families. There is no convincing evidence to support hPMS1 having a role in HNPCC predisposition. It has been suggested that hMLH3 may be a low risk gene for colorectal cancer; however there is little evidence to support it having a role in classical HNPCC. It has been suggested that during initiation of DNA mismatch repair in E. coli, the mismatch recognition protein MutS recruits MutL in the presence of ATP. The MutS(ATP)-MutL ternary complex formed then recruits the latent endonuclease MutH.	122
238406	cd00786	cytidine_deaminase-like	Cytidine and deoxycytidylate deaminase zinc-binding region. The family contains cytidine deaminases, nucleoside deaminases, deoxycytidylate deaminases and riboflavin deaminases. Also included are the apoBec family of mRNA editing enzymes.  All members are Zn dependent. The zinc ion in the active site plays a central role in the proposed catalytic mechanism, activating a water molecule to form a hydroxide ion that performs a nucleophilic attack on the substrate.	96
238407	cd00788	KU70	Ku-core domain, Ku70 subfamily; Ku70 is a subunit of the Ku protein, which plays a key role in multiple nuclear processes such as DNA repair, chromosome maintenance, transcription regulation, and V(D)J recombination. The mechanism underlying the regulation of all the diverse functions of Ku is still unclear, although it seems that Ku is a multifunctional protein that works in the nuclei. In mammalian cells, the Ku heterodimer recruits the catalytic subunit of DNA-dependent protein kinase (DNA-PK), which is dependent on its association with the Ku70/80 heterodimer bound to DNA for its protein kinase activity.	287
238408	cd00789	KU_like	Ku-core domain, Ku-like subfamily; composed of prokaryotic homologs of the eukaryotic DNA binding protein Ku. The alignment includes the core domain shared by the prokaryotic YkoV-like proteins and the eukaryotic Ku70 and Ku80. The prokaryotic Ku homologs are predicted to form homodimers. It is proposed that the Ku homologs are functionally associated with ATP-dependent DNA ligase and the eukaryotic-type primase, probably as components of a double-strand break repair system.	256
238409	cd00794	NOS_oxygenase_prok	Nitric oxide synthase (NOS) prokaryotic oxygenase domain. NOS produces nitric oxide (NO) by catalyzing a five-electron heme-based oxidation of a guanidine nitrogen of L-arginine to L-citrulline via two successive monooxygenation reactions producing N(omega)-hydroxy-L-arginine (NHA) as an intermediate. Nitric oxide synthases are homodimers. Most prokaryotes produce NO as a byproduct of denitrification, using a completely different set of enzymes than NOS. However, a few prokaryotes also have a NOS, consisting solely of the NOS oxygenase domain. Prokaryotic NOS binds to the substrate L-Arg, zinc, and to the cofactors heme and tetrahydrofolate.	353
238410	cd00795	NOS_oxygenase_euk	Nitric oxide synthase (NOS) eukaryotic oxygenase domain. NOS produces nitric oxide (NO) by catalyzing a five-electron heme-based oxidation of a guanidine nitrogen of L-arginine to L-citrulline via two successive monooxygenation reactions producing N(omega)-hydroxy-L-arginine (NHA) as an intermediate. In mammals, there are three distinct NOS isozymes: neuronal (nNOS or NOS-1), cytokine-inducible (iNOS or NOS-2) and endothelial (eNOS or NOS-3). Nitric oxide synthases are homodimers. In eukaryotes, each monomer has an N-terminal oxygenase domain, which binds to the substrate L-Arg, zinc, and to the cofactors heme and (6R)-5,6,7,8-tetrahydrobiopterin (BH4). Eukaryotic NOSs also have a C-terminal electron supplying reductase region, which is homologous to cytochrome P450 reductase and binds NADPH, FAD and FMN.	412
271177	cd00796	INT_Rci_Hp1_C	Shufflon-specific DNA recombinase Rci and Bacteriophage Hp1_like integrase, C-terminal catalytic domain. Rci protein is a tyrosine recombinase specifically involved in Shufflon type of DNA rearrangement in bacteria. The shufflon of plasmid R64 consists of four invertible DNA segments which are separated and flanked by seven 19-bp repeat sequences. Rci recombinase facilitates the site-specific recombination between any inverted repeats, which results in an inversion of the DNA segment(s) either independently or in groups. HP1 integrase promotes site-specific recombination of the HP1 genome into that of Haemophilus influenzae. Bacteriophage Hp1_like integrases are tyrosine based site specific recombinases. They belong to the superfamily of DNA breaking-rejoining enzymes, which share the same fold in their catalytic domain and the overall reaction mechanism. The catalytic domain contains six conserved active site residues. Their overall reaction mechanism is essentially identical and involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA.	162
271178	cd00797	INT_RitB_C_like	C-terminal catalytic domain of recombinase RitB, a component of the recombinase trio. Recombinases belonging to the RitA (also known as pAE1 due to its presence in the deletion prone region of plasmid pAE1 of Alcaligenes eutrophus H1), RitB, and RitC families are associated in a complex referred to as a Recombinase in Trio (RIT) element.  These RIT elements consist of three adjacent and unidirectional overlapping genes, one from each family (ritABC in order of transcription).  All three integrases contain a catalytic motif, suggesting that they are all active enzymes.  However, their specific roles are not yet fully understood.  All three families belong to the superfamily of DNA breaking-rejoining enzymes, which share the same fold in their catalytic domain and the overall reaction mechanism.	198
271179	cd00798	INT_XerDC_C	XerD and XerC integrases, C-terminal catalytic domains. XerDC-like integrases are involved in the site-specific integration and excision of lysogenic bacteriophage genomes, transposition of conjugative transposons, termination of chromosomal replication, and stable plasmid inheritance. They share the same fold in their catalytic domain containing six conserved active site residues and the overall reaction mechanism with the DNA breaking-rejoining enzyme superfamily. In Escherichia coli, the Xer site-specific recombination system acts to convert dimeric chromosomes, which are formed by homologous recombination, to monomers. Two related recombinases, XerC and XerD, bind cooperatively to a recombination site present in the E. coli chromosome. Each recombinase catalyzes the exchange of one pair of DNA strands in a reaction that proceeds through a Holliday junction intermediate. These enzymes can bridge two different and well-separated DNA sequences called arm- and core-sites. The C-terminal domain binds, cleaves, and re-ligates DNA strands at the core-sites, while the N-terminal domain is largely responsible for high-affinity binding to the arm-type sites.	172
271180	cd00799	INT_Cre_C	C-terminal catalytic domain of Cre recombinase (also called integrase). Cre-like recombinases are tyrosine based site specific recombinases. They belong to the superfamily of DNA breaking-rejoining enzymes, which share the same fold in their catalytic domain and the overall reaction mechanism. The bacteriophage P1 Cre recombinase maintains the circular phage replicon in a monomeric state by catalyzing a site-specific recombination between two loxP sites. The catalytic core domain of Cre recombinase is linked to a more divergent helical N-terminal domain, which interacts primarily with the DNA major groove proximal to the crossover region.	188
271181	cd00800	INT_Lambda_C	C-terminal catalytic domain of Lambda integrase, a tyrosine-based site-specific recombinase. Lambda-type integrases catalyze site-specific integration and excision of temperate bacteriophages and other mobile genetic elements to and from the bacterial host chromosome. They are tyrosine-based site-specific recombinases and belong to the superfamily of DNA breaking-rejoining enzymes, which share the same fold in their catalytic domain and the overall reaction mechanism. The phage lambda integrase can bridge two different and well-separated DNA sequences called arm- and core-sites. The C-terminal domain binds, cleaves and re-ligates DNA strands at the core-sites, while the N-terminal domain is largely responsible for high-affinity binding to the arm-type sites.	161
271182	cd00801	INT_P4_C	Bacteriophage P4 integrase, C-terminal catalytic domain. P4-like integrases are found in temperate bacteriophages, integrative plasmids, pathogenicity and symbiosis islands, and other mobile genetic elements. The P4 integrase mediates integrative and excisive site-specific recombination between two sites, called attachment sites, located on the phage genome and the bacterial chromosome. The phage attachment site is often found adjacent to the integrase gene, while the host attachment sites are typically situated near tRNA genes. This family belongs to the superfamily of DNA breaking-rejoining enzymes, which share the same fold in their catalytic domain and the overall reaction mechanism. The catalytic domain contains six conserved active site residues. Their overall reaction mechanism involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA.	180
173901	cd00802	class_I_aaRS_core	catalytic core domain of class I amino acyl-tRNA synthetase. Class I amino acyl-tRNA synthetase (aaRS) catalytic core domain. These enzymes are mostly monomers which aminoacylate the 2'-OH of the nucleotide at the 3' end of the appropriate tRNA. The core domain is based on the Rossmann fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. It contains the characteristic class I HIGH and KMSKS motifs, which are involved in ATP binding.	143
173902	cd00805	TyrRS_core	catalytic core domain of tyrosinyl-tRNA synthetase. Tyrosinyl-tRNA synthetase (TyrRS) catalytic core domain. TyrRS is a homodimer which attaches Tyr to the appropriate tRNA. TyrRS is a class I tRNA synthetase, so it aminoacylates the 2'-OH of the nucleotide at the 3' end of the tRNA. The core domain is based on the Rossmann fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. It contains the class I characteristic HIGH and KMSKS motifs, which are involved in ATP binding.	269
173903	cd00806	TrpRS_core	catalytic core domain of tryptophanyl-tRNA synthetase. Tryptophanyl-tRNA synthetase (TrpRS) catalytic core domain. TrpRS is a homodimer which attaches Trp to the appropriate tRNA. TrpRS is a class I tRNA synthetase, so it aminoacylates the 2'-OH of the nucleotide at the 3' end of the tRNA. The core domain is based on the Rossmann fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. It contains the class I characteristic HIGH and KMSKS motifs, which are involved in ATP binding.	280
185676	cd00807	GlnRS_core	catalytic core domain of glutaminyl-tRNA synthetase. Glutaminyl-tRNA synthetase (GlnRS) catalytic core domain. These enzymes attach Gln to the appropriate tRNA. Like other class I tRNA synthetases, they aminoacylate the 2'-OH of the nucleotide at the 3' end of the tRNA. The core domain is based on the Rossmann fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. GlnRS contains the characteristic class I HIGH and KMSKS motifs, which are involved in ATP binding. These enzymes function as monomers. Archaea and most bacteria lack GlnRS. In these organisms, the "non-discriminating" form of GluRS aminoacylates both tRNA(Glu) and tRNA(Gln) with Glu, which is converted to Gln when appropriate by a transamidation enzyme.	238
173905	cd00808	GluRS_core	catalytic core domain of discriminating glutamyl-tRNA synthetase. Discriminating Glutamyl-tRNA synthetase (GluRS) catalytic core domain. The discriminating form of GluRS is only found in bacteria and cellular organelles. GluRS is a monomer that attaches Glu to the appropriate tRNA. Like other class I tRNA synthetases, GluRS aminoacylates the 2'-OH of the nucleotide at the 3' end of the tRNA. The core domain is based on the Rossmann fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. It contains the characteristic class I HIGH and KMSKS motifs, which are involved in ATP binding.	239
173906	cd00812	LeuRS_core	catalytic core domain of leucyl-tRNA synthetases. Leucyl tRNA synthetase (LeuRS) catalytic core domain. This class I enzyme is a monomer which aminoacylates the 2'-OH of the nucleotide at the 3' end of the appropriate tRNA. The core domain is based on the Rossmann fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. It contains the characteristic class I HIGH and KMSKS motifs, which are involved in ATP binding. In Aquifex aeolicus, the gene encoding LeuRS is split in two, just before the KMSKS motif. Consequently, LeuRS is a heterodimer, which likely superimposes with the LeuRS monomer found in most other organisms. LeuRS has an insertion in the core domain, which is subject to both deletions and rearrangements and thus differs between prokaryotic LeuRS and archaeal/eukaryotic LeuRS. This editing region hydrolyzes mischarged cognate tRNAs and thus prevents the incorporation of chemically similar amino acids.	314
173907	cd00814	MetRS_core	catalytic core domain of methioninyl-tRNA synthetases. Methionine tRNA synthetase (MetRS) catalytic core domain. This class I enzyme aminoacylates the 2'-OH of the nucleotide at the 3' end of the appropriate tRNA. MetRS, which consists of the core domain and an anti-codon binding domain, functions as a monomer. However, in some species the anti-codon binding domain is followed by an EMAP domain. In this case, MetRS functions as a homodimer. The core domain is based on the Rossmann fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. It contains the characteristic class I HIGH and KMSKS motifs, which are involved in ATP binding. As a result of a deletion event, MetRS has a significantly shorter core domain insertion than IleRS, ValRS, and LeuRS. Consequently, the MetRS insertion lacks the editing function.	319
185677	cd00817	ValRS_core	catalytic core domain of valyl-tRNA synthetases. Valine amino-acyl tRNA synthetase (ValRS) catalytic core domain. This enzyme is a monomer which aminoacylates the 2'-OH of the nucleotide at the 3' end of the appropriate tRNA. The core domain is based on the Rossmann fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. It contains the characteristic class I HIGH and KMSKS motifs, which are involved in ATP binding. ValRS has an insertion in the core domain, which is subject to both deletions and rearrangements. This editing region hydrolyzes mischarged cognate tRNAs and thus prevents the incorporation of chemically similar amino acids.	382
173909	cd00818	IleRS_core	catalytic core domain of isoleucyl-tRNA synthetases. Isoleucine amino-acyl tRNA synthetases (IleRS) catalytic core domain. This class I enzyme is a monomer which aminoacylates the 2'-OH of the nucleotide at the 3' end of the appropriate tRNA. The core domain is based on the Rossmann fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. It contains the characteristic class I HIGH and KMSKS motifs, which are involved in ATP binding. IleRS has an insertion in the core domain, which is subject to both deletions and rearrangements. This editing region hydrolyzes mischarged cognate tRNAs and thus prevents the incorporation of chemically similar amino acids.	338
238417	cd00819	PEPCK_GTP	Phosphoenolpyruvate carboxykinase (PEPCK), a critical gluconeogenic enzyme, catalyzes the first committed step in the diversion of tricarboxylic acid cycle intermediates toward gluconeogenesis. It catalyzes the reversible decarboxylation and phosphorylation of oxaloacetate to yield phosphoenolpyruvate and carbon dioxide, using a nucleotide molecule (GTP) for the phosphoryl transfer, and has a strict requirement for divalent metal ions for activity. PEPCKs separate into two phylogenetic groups based on their nucleotide substrate specificity; this model describes the GTP-dependent group.	579
238418	cd00820	PEPCK_HprK	Phosphoenolpyruvate carboxykinase (PEPCK), a critical gluconeogenic enzyme, catalyzes the first committed step in the diversion of tricarboxylic acid cycle intermediates toward gluconeogenesis. It catalyzes the reversible decarboxylation and phosphorylation of oxaloacetate to yield phosphoenolpyruvate and carbon dioxide, using a nucleotide molecule (ATP or GTP) for the phosphoryl transfer, and has a strict requirement for divalent metal ions for activity. PEPCKs separate into two phylogenetic groups based on their nucleotide substrate specificity (the ATP- and GTP-dependent groups). HprK/P, the bifunctional histidine-containing protein kinase/phosphatase, controls the phosphorylation state of the phosphocarrier protein HPr and regulates the utilization of carbon sources by gram-positive bacteria. It catalyzes both the ATP-dependent phosphorylation of HPr and its dephosphorylation by phosphorolysis. PEPCK and the C-terminal catalytic domain of HprK/P are structurally similar with conserved active site residues, suggesting that these two phosphotransferases have related functions.	107
275388	cd00821	PH	Pleckstrin homology (PH) domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	92
238419	cd00822	TopoII_Trans_DNA_gyrase	TopoIIA_Trans_DNA_gyrase: Transducer domain, having a ribosomal S5 domain 2-like fold, of the type found in proteins of the type IIA family of DNA topoisomerases similar to the B subunits of E. coli DNA gyrase and E. coli Topoisomerase IV which are heterodimers composed of two subunits. The type IIA enzymes are the predominant form of topoisomerase and are found in some bacteriophages, viruses and archaea, and in all bacteria and eukaryotes. All type IIA topoisomerases are related to each other at amino acid sequence level, though their oligomeric organization sometimes differs. TopoIIA enzymes cut both strands of the duplex DNA to remove (relax) both positive and negative supercoils in DNA. These enzymes covalently attach to the 5' ends of the cut DNA, separate the free ends of the cleaved strands, pass another region of the duplex through this gap, then rejoin the ends. TopoIIA enzymes also catenate/decatenate duplex rings. E. coli DNA gyrase is a heterodimer composed of two subunits. E. coli DNA gyrase B subunit is known to be important in nucleotide hydrolysis and the transduction of structural signals from ATP-binding site to the DNA breakage/reunion regions of the enzymes.	172
238420	cd00823	TopoIIB_Trans	TopoIIB_Trans: Transducer domain, having a ribosomal S5 domain 2-like fold, of the type found in proteins of the type IIB family of DNA topoisomerases similar to Sulfolobus shibatae topoisomerase VI (topoVI). The sole representative of the Type IIB family is topo VI. Topo VI enzymes are heterotetramers found in archaea and plants.  S. shibatae topoVI relaxes both positive and negative supercoils, and in addition has a strong decatenase activity. This transducer domain is homologous to the second domain of the DNA gyrase B subunit, which is known to be important in nucleotide hydrolysis and the transduction of structural signals from ATP-binding site to the DNA breakage/reunion regions of the enzymes.	151
238421	cd00825	decarbox_cond_enzymes	decarboxylating condensing enzymes; Family of enzymes that catalyze the formation of a new carbon-carbon bond by a decarboxylating Claisen-like condensation reaction. Members are involved in the synthesis of fatty acids and polyketides, a diverse group of natural products. Both pathways are an iterative series of additions of small carbon units, usually acetate, to a nascent acyl group. There are 2 classes of decarboxylating condensing enzymes, which can be distinguished by sequence similarity, type of active site residues and type of primer units (acetyl CoA or acyl carrier protein (ACP) linked units).	332
238422	cd00826	nondecarbox_cond_enzymes	nondecarboxylating condensing enzymes; In general, thiolases catalyze the reversible thiolytic cleavage of 3-ketoacyl-CoA into acyl-CoA and acetyl-CoA, a 2-step reaction involving a covalent intermediate formed with a catalytic cysteine. There are 2 functionally different classes: thiolase-I (3-ketoacyl-CoA thiolase) and thiolase-II (acetoacetyl-CoA thiolase). Thiolase-I can cleave longer fatty acid molecules and plays an important role in the beta-oxidative degradation of fatty acids. Thiolase-II has a high substrate specificity. Although it can cleave acetoacetyl-CoA, its main function is the synthesis of acetoacetyl-CoA from two molecules of acetyl-CoA, which gives it importance in several biosynthetic pathways.	393
238423	cd00827	init_cond_enzymes	"initiating" condensing enzymes are a subclass of decarboxylating condensing enzymes, including beta-ketoacyl [ACP] synthase, type III and polyketide synthases, type III, which include chalcone synthase and related enzymes. They are characterized by the utilization of CoA substrate primers, as well as the nature of their active site residues.	324
238424	cd00828	elong_cond_enzymes	"elongating" condensing enzymes are a subclass of decarboxylating condensing enzymes, including beta-ketoacyl [ACP] synthase, type I and II and polyketide synthases. They are characterized by the utilization of acyl carrier protein (ACP) thioesters as primer substrates, as well as the nature of their active site residues.	407
238425	cd00829	SCP-x_thiolase	Thiolase domain associated with sterol carrier protein (SCP)-x isoform and related proteins; SCP-2 has multiple roles in intracellular lipid circulation and metabolism. The N-terminal presequence in the SCP-x isoform represents a peroxisomal 3-ketoacyl-CoA thiolase specific for branched-chain acyl CoAs, which is proteolytically cleaved from the sterol carrier protein.	375
238426	cd00830	KAS_III	Ketoacyl-acyl carrier protein synthase III (KASIII) initiates the elongation in type II fatty acid synthase systems. It is found in bacteria and plants. Elongation of fatty acids in the type II systems occurs by Claisen condensation of malonyl-acyl carrier protein (ACP) with acyl-ACP. KASIII initiates this process by specifically using acetyl-CoA over acyl-CoA.	320
238427	cd00831	CHS_like	Chalcone and stilbene synthases; plant-specific polyketide synthases (PKS) and related enzymes, also called type III PKSs. PKS generate an array of different products, dependent on the nature of the starter molecule. They share a common chemical strategy: after the starter molecule is loaded onto the active site cysteine, a carboxylative condensation reaction extends the polyketide chain. Plant-specific PKS are dimeric iterative PKSs, using coenzyme A esters to deliver substrate to the active site, but they differ in the choice of starter molecule and the number of condensation reactions.	361
238428	cd00832	CLF	Chain-length factor (CLF) is a factor required for polyketide chain initiation of aromatic antibiotic-producing polyketide synthases (PKSs) of filamentous bacteria. CLFs have been shown to have decarboxylase activity towards malonyl-acyl carrier protein (ACP). CLFs are similar to other elongation ketosynthase domains, but their active site cysteine is replaced by a conserved glutamine.	399
238429	cd00833	PKS	polyketide synthases (PKSs) polymerize simple fatty acids into a large variety of different products, called polyketides, by successive decarboxylating Claisen condensations. PKSs can be divided into 2 groups, modular type I PKSs consisting of one or more large multifunctional proteins and iterative type II PKSs, complexes of several monofunctional subunits.	421
238430	cd00834	KAS_I_II	Beta-ketoacyl-acyl carrier protein (ACP) synthase (KAS), type I and II. KASs are responsible for the elongation steps in fatty acid biosynthesis. KASIII catalyses the initial condensation and KAS I and II catalyze further elongation steps by Claisen condensation of malonyl-acyl carrier protein (ACP) with acyl-ACP.	406
269907	cd00835	RanBD_family	Ran-binding domain. The RanBD is present in RanBP1, RanBP2, RanBP3, Nuc2, and Nuc50. Most of these proteins have a single RanBD, with the exception of RanBP2 which has 4 RanBDs. Ran is a Ras-like nuclear small GTPase, which regulates receptor-mediated transport between the nucleus and the cytoplasm. RanGTP hydrolysis is stimulated by RanGAP together with the Ran-binding domain containing accessory proteins RanBP1 and RanBP2. These accessory proteins stabilize the active GTP-bound form of Ran. The Ran-binding domain is found in multiple copies in Nuclear pore complex proteins. RanBD shares structural similarity to the PH domain, but lacks detectable sequence similarity. The RanBD proteins of the nuclear pore complex (NPC): nucleoporin 1 (NUP1), NUP2, NUP61, and Nuclear Pore complex Protein 9 (npp-9) are present in the parent, but specific models were not made due to lineage. To date there have been no reports of inositol phosphate or phosphoinositide binding by Ran-binding proteins.	118
275389	cd00836	FERM_C-lobe	FERM domain C-lobe. The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites.	93
269909	cd00837	EVH1_family	EVH1 (Drosophila Enabled (Ena)/Vasodilator-stimulated phosphoprotein (VASP) homology 1) domain. The EVH1 domains are part of the PH domain superfamily. EVH1 subfamilies include Enables/VASP, Homer/Vesl, WASP, and Spred. Ligands are known for three of the EVH1 subfamilies, all of which bind proline-rich sequences: the Enabled/VASP family binds to FPPPP peptides, the Homer/Vesl family binds PPxxF peptides, and the WASP family binds LPPPEP peptides. EVH1 has a PH-like fold, despite having minimal sequence similarity to PH or PTB domains.	103
277317	cd00838	MPP_superfamily	metallophosphatase superfamily, metallophosphatase domain. Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets.  This domain is thought to allow for productive metal coordination.	130
277318	cd00839	MPP_PAPs	purple acid phosphatases of the metallophosphatase superfamily, metallophosphatase domain. Purple acid phosphatases (PAPs) belong to a diverse family of binuclear metallohydrolases that have been identified and characterized in plants, animals, and fungi. PAPs contain a binuclear metal center and their characteristic pink or purple color derives from a charge-transfer transition between a tyrosine residue and a chromophoric ferric ion within the binuclear center. PAPs catalyze the hydrolysis of a wide range of activated phosphoric acid mono- and di-esters and anhydrides. PAPs are distinguished from the other phosphatases by their insensitivity to L-(+) tartrate inhibition and are therefore also known as tartrate resistant acid phosphatases (TRAPs). While only a few copies of PAP-like genes are present in mammalian and fungal genomes, multiple copies are present in plant genomes. PAPs belong to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	296
277319	cd00840	MPP_Mre11_N	Mre11 nuclease, N-terminal metallophosphatase domain. Mre11 (also known as SbcD in Escherichia coli) is a subunit of the MRX protein complex. This complex includes: Mre11, Rad50, and Xrs2/Nbs1, and plays a vital role in several nuclear processes including DNA double-strand break repair, telomere length maintenance, cell cycle checkpoint control, and meiotic recombination, in eukaryotes. During double-strand break repair, the MRX complex is required to hold the two ends of a broken chromosome together. In vitro studies show that Mre11 has 3'-5' exonuclease activity on dsDNA templates and endonuclease activity on dsDNA and ssDNA templates. In addition to the N-terminal phosphatase domain, the eukaryotic MRE11 members of this family have a C-terminal DNA binding domain (not included in this alignment model). MRE11-like proteins are found in prokaryotes and archaea as well as in eukaryotes. Mre11 belongs to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	186
277320	cd00841	MPP_YfcE	Escherichia coli YfcE and related proteins, metallophosphatase domain. YfcE is a manganese-dependent metallophosphatase, found in bacteria and archaea, that cleaves bis-p-nitrophenyl phosphate, thymidine 5'-monophosphate-p-nitrophenyl ester, and p-nitrophenyl phosphorylcholine, but is unable to hydrolyze 2',3' or 3',5' cyclic nucleic phosphodiesters, and various phosphomonoesters, including p-nitrophenyl phosphate. This family also includes the Bacillus subtilis YsnB and Methanococcus jannaschii MJ0936 proteins. This domain family belongs to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	156
277321	cd00842	MPP_ASMase	acid sphingomyelinase and related proteins, metallophosphatase domain. Acid sphingomyelinase (ASMase) is a ubiquitously expressed phosphodiesterase which hydrolyzes sphingomyelin in acid pH conditions to form ceramide, a bioactive second messenger, as part of the sphingomyelin signaling pathway.  ASMase is localized at the noncytosolic leaflet of biomembranes (for example the luminal leaflet of endosomes, lysosomes and phagosomes, and the extracellular leaflet of plasma membranes).  ASMase-deficient humans develop Niemann-Pick disease. This disease is characterized by lysosomal storage of sphingomyelin in all tissues.  Although ASMase-deficient mice are resistant to stress-induced apoptosis, they have greater susceptibility to bacterial infection. The latter correlates with defective phagolysosomal fusion and antibacterial killing activity in ASMase-deficient macrophages.  ASMase belongs to the metallophosphatase (MPP) superfamily.  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	294
277322	cd00844	MPP_Dbr1_N	Dbr1 RNA lariat debranching enzyme, N-terminal metallophosphatase domain. Dbr1 is an RNA lariat debranching enzyme that hydrolyzes 2'-5' phosphodiester bonds at the branch points of excised intron lariats. This alignment model represents the N-terminal metallophosphatase domain of Dbr1. This domain belongs to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	271
277323	cd00845	MPP_UshA_N_like	Escherichia coli UshA-like family, N-terminal metallophosphatase domain. This family includes the bacterial enzyme UshA, and related enzymes including SoxB, CpdB, YhcR, and CD73. All members have a similar domain architecture which includes an N-terminal metallophosphatase domain and a C-terminal nucleotidase domain. The N-terminal metallophosphatase domain belongs to a large superfamily of distantly related metallophosphatases (MPPs) that includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	255
238431	cd00851	MTH1175	This uncharacterized conserved protein belongs to a family of iron-molybdenum cluster-binding proteins that includes NifX, NifB, and NifY, all of which are involved in the synthesis of an iron-molybdenum cofactor (FeMo-co) that binds the active site of the dinitrogenase enzyme. This domain is a predicted small-molecule-binding domain (SMBD) with an alpha/beta fold that is present either as a stand-alone domain (e.g. NifX and NifY) or fused to another conserved domain (e.g. NifB); however, its function is still undetermined. The SCOP database suggests that this domain is most similar to structures within the ribonuclease H superfamily. This conserved domain is represented in two of the three major divisions of life (bacteria and archaea).	103
238432	cd00852	NifB	NifB belongs to a family of iron-molybdenum cluster-binding proteins that includes NifX, and NifY, all of which are involved in the synthesis of an iron-molybdenum cofactor (FeMo-co) that binds the active site of the dinitrogenase enzyme as part of nitrogen fixation in bacteria. This domain is sometimes found fused to a N-terminal domain (the Radical SAM domain) in nifB-like proteins.	106
238433	cd00853	NifX	NifX belongs to a family of iron-molybdenum cluster-binding proteins that includes NifB,  and NifY, all of which are involved in the synthesis of an iron-molybdenum cofactor (FeMo-co) that binds the active site of the dinitrogenase enzyme.  The protein is part of the nitrogen fixation gene cluster in nitrogen-fixing bacteria and has sequence similarity to other members of the cluster.	102
238434	cd00854	NagA	N-acetylglucosamine-6-phosphate deacetylase, NagA, catalyzes the hydrolysis of the N-acetyl group of N-acetyl-glucosamine-6-phosphate (GlcNAc-6-P) to glucosamine 6-phosphate and acetate. This is the first committed step in the biosynthetic pathway to amino-sugar-nucleotides, which is needed for cell wall peptidoglycan and teichoic acid biosynthesis. Deacetylation of N-acetylglucosamine is also important in lipopolysaccharide synthesis and cell wall recycling.	374
349487	cd00855	SWIB-MDM2	SWIB/MDM2 domain family. The SWIB/MDM2 protein domain, short for SWI/SNF complex B/MDM2, has been found in both SWI/SNF complex B (SWIB) and the negative regulator of the p53 tumor suppressor MDM2, which are homologous and share a common fold. The SWIB domain is a conserved region found within proteins in the SWI/SNF (SWItch/Sucrose Non-Fermentable) family of complexes. SWI/SNF complex proteins display helicase and ATPase activities and are thought to regulate transcription of certain genes by altering the chromatin structure around those genes. The mammalian complexes are made up of 9-12 proteins called BAFs (BRG1-associated factors). MDM2 is an inhibitor of p53 tumor repressor. It binds to the transactivation domain and down-regulates the ability of p53 to activate transcription. This family corresponds to the SWIB domain and the p53 binding domain of MDM2.	69
238435	cd00858	GlyRS_anticodon	GlyRS Glycyl-anticodon binding domain. GlyRS belongs to class II aminoacyl-tRNA synthetases (aaRS). This alignment contains the anticodon binding domain, which is responsible for specificity in tRNA-binding, so that the activated amino acid is transferred to a ribose 3' OH group of the appropriate tRNA only.	121
238436	cd00859	HisRS_anticodon	HisRS Histidyl-anticodon binding domain. HisRS belongs to class II aminoacyl-tRNA synthetases (aaRS). This alignment contains the anticodon binding domain, which is responsible for specificity in tRNA-binding, so that the activated amino acid is transferred to a ribose 3' OH group of the appropriate tRNA only.	91
238437	cd00860	ThrRS_anticodon	ThrRS Threonyl-anticodon binding domain. ThrRS belongs to class II aminoacyl-tRNA synthetases (aaRS). This alignment contains the anticodon binding domain, which is responsible for specificity in tRNA-binding, so that the activated amino acid is transferred to a ribose 3' OH group of the appropriate tRNA only.	91
238438	cd00861	ProRS_anticodon_short	ProRS Prolyl-anticodon binding domain, short version found predominantly in bacteria. ProRS belongs to class II aminoacyl-tRNA synthetases (aaRS). This alignment contains the anticodon binding domain, which is responsible for specificity in tRNA-binding, so that the activated amino acid is transferred to a ribose 3' OH group of the appropriate tRNA only.	94
238439	cd00862	ProRS_anticodon_zinc	ProRS Prolyl-anticodon binding domain, long version found predominantly in eukaryotes and archaea. ProRS belongs to class II aminoacyl-tRNA synthetases (aaRS). This alignment contains the anticodon binding domain, which is responsible for specificity in tRNA-binding, so that the activated amino acid is transferred to a ribose 3' OH group of the appropriate tRNA only, and an additional C-terminal zinc-binding domain specific to this subfamily of aaRSs.	202
238440	cd00864	PI3Ka	Phosphoinositide 3-kinase family, accessory domain (PIK domain); PIK domain is conserved in PI3 and PI4-kinases. Its role is unclear, but it has been suggested to be involved in substrate presentation. Phosphoinositide 3-kinases play an important role in a variety of fundamental cellular processes and can be divided into three main classes, defined by their substrate specificity and domain architecture.	152
176643	cd00865	PEBP_bact_arch	PhosphatidylEthanolamine-Binding Protein (PEBP) domain present in bacteria and archaea. PhosphatidylEthanolamine-Binding Proteins (PEBPs) are represented in all three major phylogenetic divisions (eukaryotes, bacteria, archaea). The members in this subgroup are present in bacteria and archaea. Members here include Escherichia coli YBHB and YBCL which are thought to regulate protein phosphorylation as well as Sulfolobus solfataricus SsCEI which inhibits serine proteases alpha-chymotrypsin and elastase. Although their overall structures are similar, the members of the PEBP family have very different substrates and oligomerization states (monomer/dimer/tetramer). In a few of the bacterial members present here the dimerization interface is proposed to form the ligand binding site, unlike in other PEBP members.	150
176644	cd00866	PEBP_euk	PhosphatidylEthanolamine-Binding Protein (PEBP) domain present in eukaryotes. PhosphatidylEthanolamine-Binding Proteins (PEBPs) are represented in all three major phylogenetic divisions (eukaryotes, bacteria, archaea). The members in this subgroup are present in eukaryotes. Members here include those in plants such as Arabidopsis thaliana FLOWERING LOCUS T (FT) and TERMINAL FLOWER1 (TFL1) which function as a promoter and a repressor of the floral transitions, respectively, as well as the mammalian Raf kinase inhibitory protein (RKIP) which inhibits MAP kinase (Raf-MEK-ERK), G protein-coupled receptor (GPCR) kinase and NFkappaB signaling cascades. Although their overall structures are similar, the members of the PEBP family have very different substrates and oligomerization states (monomer/dimer/tetramer).	154
173836	cd00867	Trans_IPPS	Trans-Isoprenyl Diphosphate Synthases. Trans-Isoprenyl Diphosphate Synthases (Trans_IPPS) of class 1 isoprenoid biosynthesis enzymes which either synthesize geranyl/farnesyl diphosphates (GPP/FPP) or longer chained products from isoprene precursors, isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP), or use geranyl (C10)-, farnesyl (C15)-, or geranylgeranyl (C20)-diphosphate as substrate. These enzymes produce a myriad of precursors for such end products as steroids, cholesterol, sesquiterpenes, heme, carotenoids, retinoids, diterpenes, ubiquinone, and archaeal ether linked lipids; and are widely distributed among archaea, bacteria, and eukarya. The enzymes in this family share the same 'isoprenoid synthase fold' and include the head-to-tail (HT) IPPS which catalyze the successive 1'-4 condensation of the 5-carbon IPP to the growing isoprene chain to form linear, all-trans, C10-, C15-, C20-, C25-, C30-, C35-, C40-, C45-, or C50-isoprenoid diphosphates. The head-to-head (HH) IPPS catalyze the successive 1'-1 condensation of 2 farnesyl or 2 geranylgeranyl isoprenoid diphosphates. Isoprenoid chain elongation reactions proceed via electrophilic alkylations in which a new carbon-carbon single bond is generated through interaction between a highly reactive electron-deficient allylic carbocation and an electron-rich carbon-carbon double bond. The catalytic site consists of a large central cavity formed by mostly antiparallel alpha helices with two aspartate-rich regions located on opposite walls. These residues mediate binding of prenyl phosphates via bridging Mg2+ ions, inducing proposed conformational changes that close the active site to solvent, stabilizing reactive carbocation intermediates. Mechanistically and structurally distinct, cis-IPPS are not included in this CD.	236
173837	cd00868	Terpene_cyclase_C1	Terpene cyclases, Class 1. Terpene cyclases, Class 1 (C1) of the class 1 family of isoprenoid biosynthesis enzymes, which share the 'isoprenoid synthase fold' and convert linear, all-trans, isoprenoids, geranyl (C10)-, farnesyl (C15)-, or geranylgeranyl (C20)-diphosphate into numerous cyclic forms of monoterpenes, diterpenes, and sesquiterpenes. Also included in this CD are the cis-trans terpene cyclases such as trichodiene synthase. The class I terpene cyclization reactions proceed via electrophilic alkylations in which a new carbon-carbon single bond is generated through interaction between a highly reactive electron-deficient allylic carbocation and an electron-rich carbon-carbon double bond. The catalytic site consists of a large central cavity formed by mostly antiparallel alpha helices with two aspartate-rich regions located on opposite walls. These residues mediate binding of prenyl phosphates via bridging Mg2+ ions, inducing proposed conformational changes that close the active site to solvent, stabilizing reactive carbocation intermediates. Mechanistically and structurally distinct, class II terpene cyclases and cis-IPPS are not included in this CD. Taxonomic distribution includes bacteria, fungi and plants.	284
238441	cd00869	PI3Ka_II	Phosphoinositide 3-kinase (PI3K) class II, accessory domain (PIK domain); PIK domain is conserved in all PI3 and PI4-kinases. Its role is unclear but it has been suggested to be involved in substrate presentation. In general,  class II PI3-kinases phosphorylate phosphoinositol (PtdIns), PtdIns(4)-phosphate, but not PtdIns(4,5)-bisphosphate. They are larger, having a C2 domain at the C-terminus.	169
238442	cd00870	PI3Ka_III	Phosphoinositide 3-kinase (PI3K) class III, accessory domain (PIK domain); PIK domain is conserved in all PI3 and PI4-kinases. Its role is unclear but it has been suggested to be involved in substrate presentation. In general, PI3Ks class III phosphorylate phosphoinositol (PtdIns) only. The prototypical PI3K class III, yeast Vps34, is involved in trafficking proteins from Golgi to the vacuole.	166
238443	cd00871	PI4Ka	Phosphoinositide 4-kinase (PI4K), accessory domain (PIK domain); PIK domain is conserved in PI3 and PI4-kinases. Its role is unclear but it has been suggested to be involved in substrate presentation. PI4K phosphorylates the hydroxyl group at position 4 on the inositol ring of phosphoinositide, the first committed step in the phosphatidylinositol cycle.	175
238444	cd00872	PI3Ka_I	Phosphoinositide 3-kinase (PI3K) class I, accessory domain; PIK domain is conserved in all PI3 and PI4-kinases. Its role is unclear but it has been suggested to be involved in substrate presentation. In general, PI3K class I prefer phosphoinositol (4,5)-bisphosphate as a substrate. Mammalian members interact with active Ras. They form heterodimers with adapter molecules linking them to different signaling pathways.	171
238445	cd00873	KU80	Ku-core domain, Ku80 subfamily; Ku80 is a subunit of the Ku protein, which plays a key role in multiple nuclear processes such as DNA repair, chromosome maintenance, transcription regulation, and V(D)J recombination. The mechanism underlying the regulation of all the diverse functions of Ku is still unclear, although it seems that Ku is a multifunctional protein that works in nuclei. In mammalian cells, the Ku heterodimer recruits the catalytic subunit of DNA-dependent protein kinase (DNA-PK), which is dependent on its association with the Ku70/80 heterodimer bound to DNA for its protein kinase activity.	300
238446	cd00874	RNA_Cyclase_Class_II	RNA 3' phosphate cyclase domain (class II). These proteins function as RNA cyclase to catalyze the ATP-dependent conversion of 3'-phosphate to a 2',3'-cyclic phosphodiester at the end of RNA molecule. A conserved catalytic histidine residue is found in all members of this subfamily.	326
238447	cd00875	RNA_Cyclase_Class_I	RNA 3' phosphate cyclase domain (class I). This subfamily of cyclase-like proteins is encoded in eukaryotic genomes. They lack a conserved catalytic histidine residue required for cyclase activity, so probably do not function as cyclases. They are believed to play a role in ribosomal RNA processing and assembly.	341
206642	cd00876	Ras	Rat sarcoma (Ras) family of small guanosine triphosphatases (GTPases). The Ras family of the Ras superfamily includes classical N-Ras, H-Ras, and K-Ras, as well as R-Ras, Rap, Ral, Rheb, Rhes, ARHI, RERG, Rin/Rit, RSR1, RRP22, Ras2, Ras-dva, and RGK proteins. Ras proteins regulate cell growth, proliferation and differentiation. Ras is activated by guanine nucleotide exchange factors (GEFs) that release GDP and allow GTP binding. Many RasGEFs have been identified. These are sequestered in the cytosol until activation by growth factors triggers recruitment to the plasma membrane or Golgi, where the GEF colocalizes with Ras. Active GTP-bound Ras interacts with several effector proteins: among the best characterized are the Raf kinases, phosphatidylinositol 3-kinase (PI3K), RalGEFs and NORE/MST1. Most Ras proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Ras proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation.	160
206643	cd00877	Ran	Ras-related nuclear proteins (Ran)/TC4 family of small GTPases. Ran GTPase is involved in diverse biological functions, such as nuclear transport, spindle formation during mitosis, DNA replication, and cell division. Among the Ras superfamily, Ran is a unique small G protein. It does not have a lipid modification motif at the C-terminus to bind to the membrane, which is often observed within the Ras superfamily. Ran may therefore interact with a wide range of proteins in various intracellular locations. Like other GTPases, Ran exists in GTP- and GDP-bound conformations that interact differently with effectors. Conversion between these forms and the assembly or disassembly of effector complexes requires the interaction of regulator proteins. The intrinsic GTPase activity of Ran is very low, but it is greatly stimulated by a GTPase-activating protein (RanGAP1) located in the cytoplasm. By contrast, RCC1, a guanine nucleotide exchange factor that generates RanGTP, is bound to chromatin and confined to the nucleus. Ran itself is mobile and is actively imported into the nucleus by a mechanism involving NTF-2. Together with the compartmentalization of its regulators, this is thought to produce a relatively high concentration of RanGTP in the nucleus.	166
206644	cd00878	Arf_Arl	ADP-ribosylation factor(Arf)/Arf-like (Arl) small GTPases. Arf (ADP-ribosylation factor)/Arl (Arf-like) small GTPases. Arf proteins are activators of phospholipase D isoforms. Unlike Ras proteins they lack cysteine residues at their C-termini and therefore are unlikely to be prenylated. Arfs are N-terminally myristoylated. Members of the Arf family are regulators of vesicle formation in intracellular traffic that interact reversibly with membranes of the secretory and endocytic compartments in a GTP-dependent manner. They depart from other small GTP-binding proteins by a unique structural device, interswitch toggle, that implements front-back communication from N-terminus to the nucleotide binding site. Arf-like (Arl) proteins are close relatives of the Arf, but only Arl1 has been shown to function in membrane traffic like the Arf proteins. Arl2 has an unrelated function in the folding of native tubulin, and Arl4 may function in the nucleus. Most other Arf family proteins are so far relatively poorly characterized. Thus, despite their significant sequence homologies, Arf family proteins may regulate unrelated functions.	158
206645	cd00879	Sar1	Sar1 is an essential component of COPII vesicle coats. Sar1 is an essential component of COPII vesicle coats involved in export of cargo from the ER. The GTPase activity of Sar1 functions as a molecular switch to control protein-protein and protein-lipid interactions that direct vesicle budding from the ER. Activation of the GDP to the GTP-bound form of Sar1 involves the membrane-associated guanine nucleotide exchange factor (GEF) Sec12. Sar1 is unlike all Ras superfamily GTPases that use either myristoyl or prenyl groups to direct membrane association and function, in that Sar1 lacks such modification. Instead, Sar1 contains a unique nine-amino-acid N-terminal extension. This extension contains an evolutionarily conserved cluster of bulky hydrophobic amino acids, referred to as the Sar1-N-terminal activation recruitment (STAR) motif. The STAR motif mediates the recruitment of Sar1 to ER membranes and facilitates its interaction with mammalian Sec12 GEF leading to activation.	191
206646	cd00880	Era_like	E. coli Ras-like protein (Era)-like GTPase. The Era (E. coli Ras-like protein)-like family includes several distinct subfamilies (TrmE/ThdF, FeoB, YihA (EngB), Era, and EngA/YfgK) that generally show sequence conservation in the region between the Walker A and B motifs (G1 and G3 box motifs), to the exclusion of other GTPases. TrmE is ubiquitous in bacteria and is a widespread mitochondrial protein in eukaryotes, but is absent from archaea. The yeast member of TrmE family, MSS1, is involved in mitochondrial translation; bacterial members are often present in translation-related operons. FeoB represents an unusual adaptation of GTPases for high-affinity iron (II) transport. YihA (EngB) family of GTPases is typified by the E. coli YihA, which is an essential protein involved in cell division control. Era is characterized by a distinct derivative of the KH domain (the pseudo-KH domain) which is located C-terminal to the GTPase domain. EngA and its orthologs are composed of two GTPase domains and, since the sequences of the two domains are more similar to each other than to other GTPases, it is likely that an ancient gene duplication, rather than a fusion of evolutionarily distinct GTPases, gave rise to this family.	161
206647	cd00881	GTP_translation_factor	GTP translation factor family primarily contains translation initiation, elongation and release factors. The GTP translation factor family consists primarily of translation initiation, elongation, and release factors, which play specific roles in protein translation. In addition, the family includes Snu114p, a component of the U5 small nuclear riboprotein particle which is a component of the spliceosome and is involved in excision of introns, TetM, a tetracycline resistance gene that protects the ribosome from tetracycline binding, and the unusual subfamily CysN/ATPS, which has an unrelated function (ATP sulfurylase) acquired through lateral transfer of the EF1-alpha gene and development of a new function.	183
206648	cd00882	Ras_like_GTPase	Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases). Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions.	161
238448	cd00883	beta_CA_cladeA	Carbonic anhydrases (CA) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism in which the nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide is followed by the regeneration of an active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. CAs are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionarily distinct families of CAs (the alpha-, beta-, and gamma-CAs) which show no significant sequence identity or structural similarity.  Within the beta-CA family there are four evolutionarily distinct clades (A through D). The beta-CAs are multimeric enzymes (forming dimers, tetramers, hexamers and octamers) which are present in higher plants, algae, fungi, archaea and prokaryotes.	182
238449	cd00884	beta_CA_cladeB	Carbonic anhydrases (CA) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism in which the nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide is followed by the regeneration of an active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. CAs are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionarily distinct families of CAs (the alpha-, beta-, and gamma-CAs) which show no significant sequence identity or structural similarity.  Within the beta-CA family there are four evolutionarily distinct clades (A through D). The beta-CAs are multimeric enzymes (forming dimers, tetramers, hexamers and octamers) which are present in higher plants, algae, fungi, archaea and prokaryotes.	190
238450	cd00885	cinA	Competence-damaged protein. CinA is the first gene in the competence- inducible (cin) operon and is thought to be specifically required at some stage in the process of transformation. This domain is closely related to a domain, found in a variety of proteins involved in biosynthesis of molybdopterin cofactor, where the domain is presumed to bind molybdopterin.	170
238451	cd00886	MogA_MoaB	MogA_MoaB family. Members of this family are involved in biosynthesis of the molybdenum cofactor (MoCF) an essential cofactor of a diverse group of redox enzymes. MoCF biosynthesis is an evolutionarily conserved pathway present in eubacteria, archaea, and eukaryotes. MoCF contains a tricyclic pyranopterin, termed molybdopterin (MPT).  MogA, together with MoeA, is responsible for the metal incorporation into MPT, the third step in MoCF biosynthesis. The plant homolog Cnx1 is a MoeA-MogA fusion protein.  The mammalian homolog gephyrin is a MogA-MoeA fusion protein, that plays a critical role in postsynaptic anchoring of inhibitory glycine receptors and major GABAa receptor subtypes. In contrast, MoaB shows high similarity to MogA, but little is known about its physiological role. All well studied members of this family form highly stable trimers.	152
238452	cd00887	MoeA	MoeA family. Members of this family are involved in biosynthesis of the molybdenum cofactor (MoCF), an essential cofactor of a diverse group of redox enzymes. MoCF biosynthesis is an evolutionarily conserved pathway present in eubacteria, archaea and eukaryotes. MoCF contains a tricyclic pyranopterin, termed molybdopterin (MPT).  MoeA, together with MoaB, is responsible for the metal incorporation into MPT, the third step in MoCF biosynthesis. The plant homolog Cnx1 is a MoeA-MogA fusion protein.  The mammalian homolog gephyrin is a MogA-MoeA fusion protein, that plays a critical role in postsynaptic anchoring of inhibitory glycine receptors and major GABAa receptor subtypes.	394
238453	cd00890	Prefoldin	Prefoldin is a hexameric molecular chaperone complex, found in both eukaryotes and archaea, that binds and stabilizes newly synthesized polypeptides allowing them to fold correctly.  The complex contains two alpha and four beta subunits, the two subunits being evolutionarily related. In archaea, there is usually only one gene for each subunit while in eukaryotes there are two or more paralogous genes encoding each subunit adding heterogeneity to the structure of the hexamer. The structure of the complex consists of a double beta barrel assembly with six protruding coiled-coils.	129
270624	cd00891	PI3Kc	Catalytic domain of Phosphoinositide 3-kinase. PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives. PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. They can be divided into three main classes (I, II, and III), defined by their substrate specificity, regulation, and domain structure. Class I PI3Ks are the only enzymes capable of converting PtdIns(4,5)P2 to the critical second messenger PtdIns(3,4,5)P3. Class I enzymes are heterodimers and exist in multiple isoforms consisting of one catalytic subunit (out of four isoforms) and one of several regulatory subunits. Class II PI3Ks comprise three catalytic isoforms that do not associate with any regulatory subunits. They selectively use PtdIns as a substrate to produce PtdIns(3)P. The PI3K catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases.	334
270625	cd00892	PIKKc_ATR	Catalytic domain of Ataxia telangiectasia and Rad3-related proteins. ATR is also referred to as Mei-41 (Drosophila), Esr1/Mec1p (Saccharomyces cerevisiae), Rad3 (Schizosaccharomyces pombe), and FRAP-related protein (human). ATR contains a UME domain of unknown function, a FAT (FRAP, ATM and TRRAP) domain, a catalytic domain, and a FATC domain at the C-terminus. Together with its downstream effector kinase, Chk1, ATR plays a central role in regulating the replication checkpoint. ATR stabilizes replication forks by promoting the association of DNA polymerases with the fork. Preventing fork collapse is essential in preserving genomic integrity. ATR also plays a role in normal cell growth and in response to DNA damage. ATR is a member of the phosphoinositide 3-kinase-related protein kinase (PIKK) subfamily. PIKKs have intrinsic serine/threonine kinase activity and are distinguished from other PKs by their unique catalytic domain, similar to that of lipid PI3K, and their large molecular weight (240-470 kDa). The ATR catalytic domain subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases.	237
270626	cd00893	PI4Kc_III	Catalytic domain of Type III Phosphoinositide 4-kinase. PI4Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 4-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) to generate PtdIns(4)P, the major precursor in the synthesis of other phosphoinositides including PtdIns(4,5)P2, PtdIns(3,4)P2, and PtdIns(3,4,5)P3. There are two types of PI4Ks, types II and III. Type II PI4Ks lack the characteristic catalytic kinase domain present in PI3Ks and type III PI4Ks, and are excluded from this family. Two isoforms of type III PI4K, alpha and beta, exist in most eukaryotes. The PI4K catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases.	286
270627	cd00894	PI3Kc_IB_gamma	Catalytic domain of Class IB Phosphoinositide 3-kinase gamma. PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives. PI3Kgamma signaling controls diverse immune and vascular functions including cell recruitment, mast cell activation, platelet aggregation, and smooth muscle contractility. It associates with one of two regulatory subunits, p101 and p84, and is activated by G-protein-coupled receptors (GPCRs) by direct binding to their betagamma subunits. It contains an N-terminal Ras binding domain, a lipid binding C2 domain, a PI3K homology domain of unknown function, and a C-terminal ATP-binding catalytic domain. PI3Ks can be divided into three main classes (I, II, and III), defined by their substrate specificity, regulation, and domain structure. Class I PI3Ks are the only enzymes capable of converting PtdIns(4,5)P2 to the critical second messenger PtdIns(3,4,5)P3. Class I enzymes are heterodimers and exist in multiple isoforms consisting of one catalytic subunit (out of four isoforms) and one of several regulatory subunits. They are further classified into class IA (alpha, beta and delta) and IB (gamma). The PI3K catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases.	367
119421	cd00895	PI3Kc_C2_beta	Catalytic domain of Class II Phosphoinositide 3-kinase beta. PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives. The class II beta isoform, PI3K-C2beta, contributes to the migration and survival of cancer cells. It regulates Rac activity and impacts membrane ruffling, cell motility, and cadherin-mediated cell-cell adhesion. PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. They can be divided into three main classes (I, II, and III), defined by their substrate specificity, regulation, and domain structure. Class II PI3Ks preferentially use PtdIns as a substrate to produce PtdIns(3)P, but can also phosphorylate PtdIns(4)P. They function as monomers and do not associate with any regulatory subunits. Class II enzymes contain an N-terminal Ras binding domain, a lipid binding C2 domain, a PI3K homology domain of unknown function, an ATP-binding catalytic domain, a Phox homology (PX) domain, and a second C2 domain at the C-terminus. The PI3K catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases.	354
270628	cd00896	PI3Kc_III	Catalytic domain of Class III Phosphoinositide 3-kinase. PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives. Class III PI3Ks, also called Vps34 (vacuolar protein sorting 34), contain an N-terminal lipid binding C2 domain, a PI3K homology domain of unknown function, and a C-terminal ATP-binding catalytic domain. They phosphorylate only the substrate PtdIns. They interact with a regulatory subunit, Vps15, to form a membrane-associated complex. Class III PI3Ks are involved in protein and vesicular trafficking and sorting, autophagy, trimeric G-protein signaling, and phagocytosis. PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. They can be divided into three main classes (I, II, and III), defined by their substrate specificity, regulation, and domain structure. The PI3K catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases.	346
132998	cd00897	UGPase_euk	Eukaryotic UGPase catalyses the synthesis of UDP-Glucose. UGPase (UDP-Glucose Pyrophosphorylase) catalyzes the reversible production of UDP-Glucose and pyrophosphate (PPi) from Glucose-1-phosphate and UTP.  UDP-glucose plays pivotal roles in galactose utilization, in glycogen synthesis, and in the synthesis of the carbohydrate moieties of glycolipids, glycoproteins, and proteoglycans. UGPase is found in both prokaryotes and eukaryotes. Interestingly, while the prokaryotic and eukaryotic forms of UGPase catalyze the same reaction, they share low sequence similarity.  This family consists of mainly eukaryotic UTP-glucose-1-phosphate uridylyltransferases.	300
132999	cd00899	b4GalT	Beta-4-Galactosyltransferase is involved in the formation of the poly-N-acetyllactosamine core structures present in glycoproteins and glycosphingolipids. Beta-4-Galactosyltransferase transfers galactose from uridine diphosphogalactose to the terminal beta-N-acetylglucosamine residues, hereby forming the poly-N-acetyllactosamine core structures present in glycoproteins and glycosphingolipids. At least seven homologous beta-4-galactosyltransferase isoforms have been identified that use different types of glycoproteins and glycolipids as substrates. Of the seven identified members of the beta-1,4-galactosyltransferase subfamily (beta1,4-Gal-T1 to -T7), b1,4-Gal-T1 is most characterized (biochemically). It is a Golgi-resident type II membrane enzyme with a cytoplasmic domain, membrane spanning region, and a stem region and catalytic domain facing the lumen.	219
275390	cd00900	PH-like	Pleckstrin homology-like domain. The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins.	89
153098	cd00904	Ferritin	Ferritin iron storage proteins. Ferritins are the primary iron storage proteins of most living organisms and members of a broad superfamily of ferritin-like diiron-carboxylate proteins. The iron-free (apoferritin) ferritin molecule is a protein shell composed of 24 protein chains arranged in 432 symmetry. Iron storage involves the uptake of iron (II) at the protein shell, its oxidation by molecular oxygen at the dinuclear ferroxidase centers, and the movement of iron (III) into the cavity for deposition as ferrihydrite; the protein shell can hold up to 4500 iron atoms. In vertebrates, two types of chains (subunits) have been characterized, H or M (fast) and L (slow), which differ in rates of iron uptake and mineralization. Bacterial non-heme ferritins are composed only of H chains. Fe(II) oxidation in the H/M subunits takes place initially at the ferroxidase center, a carboxylate-bridged diiron center, located within the subunit four-helix bundle. In a complementary role, negatively charged residues on the protein shell inner surface of the L subunits promote ferrihydrite nucleation. Most plant ferritins combine both oxidase and nucleation functions in one chain: they have four interior glutamate residues as well as seven ferroxidase center residues.	160
153099	cd00907	Bacterioferritin	Bacterioferritin, ferritin-like diiron-binding domain. Bacterioferritins, also known as cytochrome b1, are members of a broad superfamily of ferritin-like diiron-carboxylate proteins. Similar to ferritin in architecture, Bfr forms an oligomer of 24 subunits that assembles to form a hollow sphere with 432 symmetry. Up to 12 heme cofactor groups (iron protoporphyrin IX or coproporphyrin III) are bound between dimer pairs. The role of the heme is unknown, although it may be involved in mediating iron-core reduction and iron release. Each subunit is composed of a four-helix bundle which carries a diiron ferroxidase center; it is here that initial oxidation of ferrous iron by molecular oxygen occurs, facilitating the detoxification of iron, protection against dioxygen and radical products, and storage of ferric-hydroxyphosphate at the core. Some bacterioferritins are composed of two subunit types, one conferring heme-binding ability (alpha) and the other (beta) bestowing ferroxidase activity.	153
238454	cd00912	ML	The ML (MD-2-related lipid-recognition) domain is present in MD-1, MD-2, GM2 activator protein, Niemann-Pick type C2 (Npc2) protein, phosphatidylinositol/phosphatidylglycerol transfer protein (PG/PI-TP), mite allergen Der p 2  and several proteins of unknown function in plants, animals and fungi. These single-domain proteins form two anti-parallel beta-pleated sheets stabilized by three disulfide bonds and with an accessible central hydrophobic cavity, and are predicted to mediate diverse biological functions through interaction with specific lipids.	127
238455	cd00913	PCD_DCoH_subfamily_a	PCD_DCoH: The bifunctional protein pterin-4alpha-carbinolamine dehydratase (PCD), also known as DCoH (dimerization cofactor of hepatocyte nuclear factor-1), is both a transcription activator and a metabolic enzyme.  DCoH stimulates gene expression by associating with specific DNA binding proteins such as HNF-1alpha (hepatocyte nuclear factor-1) and Xenopus enhancer of rudimentary homologue (XERH).  DCoH also catalyzes the dehydration of 4alpha-hydroxy-tetrahydrobiopterin (4alpha-OH-BH4) to quinoiddihydrobiopterin, a precursor of the phenylalanine hydroxylase cofactor BH4 (tetrahydrobiopterin). The DCoH homodimer has a saddle-shaped structure similar to that of TBP (TATA binding protein).	76
238456	cd00914	PCD_DCoH_subfamily_b	PCD_DCoH: The bifunctional protein pterin-4alpha-carbinolamine dehydratase (PCD), also known as DCoH (dimerization cofactor of hepatocyte nuclear factor-1), is both a transcription activator and a metabolic enzyme.  DCoH stimulates gene expression by associating with specific DNA binding proteins such as HNF-1alpha (hepatocyte nuclear factor-1) and Xenopus enhancer of rudimentary homologue (XERH).  DCoH also catalyzes the dehydration of 4alpha-hydroxy-tetrahydrobiopterin (4alpha-OH-BH4) to quinoiddihydrobiopterin, a precursor of the phenylalanine hydroxylase cofactor BH4 (tetrahydrobiopterin). The DCoH homodimer has a saddle-shaped structure similar to that of TBP (TATA binding protein). Two DCoH proteins have been identified in humans: DCoH1 and DCoH2. Mutations in human DCoH1 cause hyperphenylalaninemia. Loss of enzymic activity of DCoH in humans is associated with the depigmentation disorder vitiligo. DCoH1 has been reported to be overexpressed in colon cancer carcinomas and in malignant melanomas.	76
238457	cd00915	MD-1_MD-2	MD-1 and MD-2 are cofactors required for LPS signaling through cell surface receptors. MD-2 and its binding partner, Toll-like receptor 4 (TLR4), are essential for the innate immune responses of mammalian cells to bacterial lipopolysaccharide (LPS); MD-2 directly binds the lipid A moiety of LPS. The TLR4-like receptor, RP105, which mediates LPS-induced lymphocyte proliferation, interacts with MD-1; MD-1 enhances RP105-mediated LPS-induced growth of B cells. These proteins belong to the ML domain family.	130
238458	cd00916	Npc2_like	Niemann-Pick type C2 (Npc2) is a lysosomal protein in which a mutation in the gene causes a rare form of Niemann-Pick type C disease, an autosomal recessive lipid storage disorder characterized by accumulation of low-density lipoprotein-derived cholesterol in lysosomes. Although Npc2 is known to bind cholesterol, the function of this protein is unknown. These proteins belong to the ML domain family. 	123
238459	cd00917	PG-PI_TP	The phosphatidylinositol/phosphatidylglycerol transfer protein (PG/PI-TP) has been shown to bind phosphatidylglycerol and phosphatidylinositol, but the biological significance of this is still obscure. These proteins belong to the ML domain family.	122
238460	cd00918	Der-p2_like	Several group 2 allergen proteins belong to the ML domain family. They include Dermatophagoides pteronyssinus, group 2 (Der p 2) and D. farinae, group 2 (Der f 2) allergens. These house dust mites cause heavy atopic diseases such as asthma and dermatitis. Although the allergenic properties of these proteins have been well characterized, their biological function in mites is unknown.	120
238461	cd00919	Heme_Cu_Oxidase_I	Heme-copper oxidase subunit I.  Heme-copper oxidases are transmembrane protein complexes in the respiratory chains of prokaryotes and mitochondria which catalyze the reduction of O2 and simultaneously pump protons across the membrane.  The superfamily is diverse in terms of electron donors, subunit composition, and heme types. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria.  It has been proposed that Archaea acquired heme-copper oxidases through gene transfer from Gram-positive bacteria. Membership in the superfamily is defined by subunit I, which contains a heme-copper binuclear center (the active site where O2 is reduced to water) formed by a high-spin heme and a copper ion.  It also contains a low-spin heme, believed to participate in the transfer of electrons to the binuclear center.  Only subunit I is common to the entire superfamily.  For every reduction of an O2 molecule, eight protons are taken from the inside aqueous compartment and four electrons are taken from the electron donor on the opposite side of the membrane.  The four electrons and four of the protons are used in the reduction of O2; the four remaining protons are pumped across the membrane.  This charge separation of four charges contributes to the electrochemical gradient used for ATP synthesis. Two proton channels, the D-pathway and K-pathway, leading to the binuclear center have been identified in subunit I of cytochrome c oxidase (CcO) and ubiquinol oxidase.  A well-defined pathway for the transfer of pumped protons beyond the binuclear center has not been identified. Electron transfer occurs in two segments:  from the electron donor to the low-spin heme, and from the low-spin heme to the binuclear center.  The first segment can be a multi-step process and varies among the different families, while the second segment, a direct transfer, is consistent throughout the superfamily.	463
259860	cd00920	Cupredoxin	Cupredoxin superfamily. Cupredoxins contain type I copper centers and are involved in inter-molecular electron transfer reactions. Cupredoxins are blue copper proteins, having an intense blue color due to the presence of a mononuclear type 1 (T1) copper site. Structurally, the cupredoxin-like fold consists of a beta-sandwich with 7 strands in 2 beta-sheets, which is arranged in a Greek-key beta-barrel. Some of these proteins have lost the ability to bind copper. The majority of family members contain multiple cupredoxin domain repeats: ceruloplasmin and the coagulation factors V/VIII have six repeats; laccase, ascorbate oxidase, spore coat protein A, and multicopper oxidase CueO contain three repeats; and nitrite reductase has two repeats. Others are mono-domain cupredoxins, such as plastocyanin, pseudoazurin, plantacyanin, azurin, rusticyanin, stellacyanin, quinol oxidase, and the periplasmic domain of cytochrome c oxidase subunit II.	110
238462	cd00922	Cyt_c_Oxidase_IV	Cytochrome c oxidase subunit IV. Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Found only in eukaryotes, subunit IV is the largest of the nuclear-encoded subunits. It binds ATP at the matrix side, leading to an allosteric inhibition of enzyme activity at high intramitochondrial ATP/ADP ratios. In mammals, subunit IV has a lung-specific isoform and a ubiquitously expressed isoform.	136
238463	cd00923	Cyt_c_Oxidase_Va	Cytochrome c oxidase subunit Va. Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Found only in eukaryotes, subunit Va is one of three mammalian subunits that lacks a transmembrane region. Subunit Va is located on the matrix side of the membrane and binds thyroid hormone T2, releasing allosteric inhibition caused by the binding of ATP to subunit IV and allowing high turnover at elevated intramitochondrial ATP/ADP ratios.	103
238464	cd00924	Cyt_c_Oxidase_Vb	Cytochrome c oxidase subunit Vb.  Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes.  It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane.  The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome.  Found only in eukaryotes, subunit Vb is one of three mammalian subunits that lacks a transmembrane region.  Subunit Vb is located on the matrix side of the membrane and binds the regulatory subunit of protein kinase A.  The abnormally extended conformation is stable only in the CcO assembly.	97
238465	cd00925	Cyt_c_Oxidase_VIa	Cytochrome c oxidase subunit VIa.   Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes.  It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane.  The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome.  Found only in eukaryotes, subunit VIa is expressed in two tissue-specific isoforms in mammals but not fish. VIa-H is the heart and skeletal muscle isoform; VIa-L is the liver or non-muscle isoform.  Mammalian VIa-H induces a slip in CcO (decrease in proton/electron stoichiometry) at high intramitochondrial ATP/ADP ratios, while VIa-L induces a permanent slip in CcO, depending on the presence of cardiolipin and palmitate.	86
238466	cd00926	Cyt_c_Oxidase_VIb	Cytochrome c oxidase subunit VIb. Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Found only in eukaryotes, subunit VIb is one of three mammalian subunits that lacks a transmembrane region. It is located on the cytosolic side of the membrane and helps form the dimer interface with the corresponding subunit on the other monomer complex.	75
238467	cd00927	Cyt_c_Oxidase_VIc	Cytochrome c oxidase subunit VIc. Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. The VIc subunit is found only in eukaryotes and its specific function remains unclear. It has been reported that the relative concentrations of some nuclear encoded CcO subunits, including subunit VIc, compared to those of the mitochondrial encoded subunits, are altered significantly during the progression of prostate cancer.	70
238468	cd00928	Cyt_c_Oxidase_VIIa	Cytochrome c oxidase subunit VIIa. Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Found only in eukaryotes, subunit VIIa has two tissue-specific isoforms that are expressed in a developmental manner. VIIa-H is expressed in heart and skeletal muscle but not smooth muscle. VIIa-L is expressed in liver and non-muscle tissues.	55
238469	cd00929	Cyt_c_Oxidase_VIIc	Cytochrome c oxidase subunit VIIc. Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. The VIIc subunit is found only in eukaryotes and its specific function remains unclear. Peroxide inactivation of bovine CcO coincides with the direct oxidation of tryptophan (W19) within subunit VIIc, along with other structural changes in other subunits.	46
238470	cd00930	Cyt_c_Oxidase_VIII	Cytochrome c oxidase subunit VIII. Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Found only in eukaryotes, subunit VIII is the smallest of the nuclear-encoded subunits. It exists in muscle-specific and non-muscle-specific isoforms that are differently expressed in different species, suggesting species-specific regulation of energy metabolism.	43
238471	cd00933	barnase	Barnase, a member of the family of homologous microbial ribonucleases, catalyses the cleavage of single-stranded RNA via a two-step mechanism thought to be similar to that of pancreatic ribonuclease. The mechanism involves a transesterification to give a 2',3'-cyclic phosphate intermediate, followed by hydrolysis to yield a 3' nucleotide. The active site residues His and Glu act as general acid-base groups during catalysis, while the Arg and Lys residues are important in binding the reactive phosphate, the latter probably binding the phosphate in the transition state. Barstar, a small 89-residue intracellular protein, is a natural inhibitor of barnase.	107
269911	cd00934	PTB	Phosphotyrosine-binding (PTB) PH-like fold. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides that lack tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains.	120
238472	cd00935	GlyRS_RNA	GlyRS_RNA binding domain. This short RNA-binding domain is found at the N-terminus of GlyRS in several higher eukaryote aminoacyl-tRNA synthetases (aaRSs). This domain consists of a helix-turn-helix structure, which is similar to other RNA-binding proteins. It is involved in both protein-RNA interactions by binding tRNA and protein-protein interactions, which are important for the formation of aaRSs into multienzyme complexes.	51
238473	cd00936	WEPRS_RNA	WEPRS_RNA binding domain. This short RNA-binding domain is found in several higher eukaryote aminoacyl-tRNA synthetases (aaRSs). It is found in multiple copies in eukaryotic bifunctional glutamyl-prolyl-tRNA synthetases (EPRS) in a region that separates the N-terminal glutamyl-tRNA synthetase (GluRS) from the C-terminal prolyl-tRNA synthetase (ProRS). It is also found at the N-terminus of vertebrate tryptophanyl-tRNA synthetases (TrpRS). This domain  consists of a helix-turn-helix structure, which is similar to other RNA-binding proteins. It is involved in both protein-RNA interactions by binding tRNA and protein-protein interactions, which are important for the formation of aaRSs into multienzyme complexes.	50
238474	cd00938	HisRS_RNA	HisRS_RNA binding domain. This short RNA-binding domain is found at the N-terminus of HisRS in several higher eukaryote aminoacyl-tRNA synthetases (aaRSs). This domain consists of a helix-turn-helix structure, which is similar to other RNA-binding proteins. It is involved in both protein-RNA interactions by binding tRNA and protein-protein interactions, which are important for the formation of aaRSs into multienzyme complexes.	45
238475	cd00939	MetRS_RNA	MetRS_RNA binding domain. This short RNA-binding domain is found at the C-terminus of MetRS in several higher eukaryote aminoacyl-tRNA synthetases (aaRSs). It is repeated in Drosophila MetRS. This domain consists of a helix-turn-helix structure, which is similar to other RNA-binding proteins. It is involved in both protein-RNA interactions by binding tRNA and protein-protein interactions, which are important for the formation of aaRSs into multienzyme complexes.	45
411993	cd00941	FokI_N	N-terminal DNA recognition domain of restriction endonuclease FokI and similar proteins. Restriction endonuclease FokI (EC 3.1.21.4), also called R.FokI, or endonuclease FokI, is a type IIS restriction enzyme that requires only divalent metals (such as Mg2+ or Mn2+) as cofactors to catalyze the hydrolysis of DNA. FokI recognizes the double-stranded sequence 5'-GGATG-3'/3'-CATCC-5' and cleaves 14 bases after G-1 and 13 bases before C-1, respectively. It contains an N-terminal DNA recognition domain and a C-terminal endonuclease domain. This model describes the DNA recognition domain. The family also includes endonuclease StsI, a type IIS restriction endonuclease found in Streptococcus sanguinis 54. It recognizes the same sequence as FokI but cleaves at different positions.	373
411705	cd00942	BamHI-like	Restriction endonuclease BamHI and similar proteins. Restriction endonuclease BamHI (EC 3.1.21.4), also termed R.BamHI, or endonuclease BamHI, is a type II restriction enzyme that requires only divalent metals (such as Mg2+ or Mn2+) as cofactors to catalyze the hydrolysis of DNA. BamHI recognizes the double-stranded sequence GGATCC and cleaves after G-1. Its structure strikingly resembles that of EcoRI, although the two enzymes lack sequence similarity. The family also includes a BamHI isoschizomer, OkrAI endonuclease, which recognizes and cleaves the same DNA sequence (TATGGATCCATA) as BamHI. However, OkrAI does not have the equivalent of N- and C-terminal helices of BamHI, and it has higher star activity compared to BamHI.	196
411706	cd00943	EcoRI-like	Restriction endonuclease EcoRI and similar proteins. Restriction endonuclease EcoRI (EC 3.1.21.4), also termed R.EcoRI, or endonuclease EcoRI, is a type II restriction enzyme that requires only divalent metals (such as Mg2+ or Mn2+) as cofactors to catalyze the hydrolysis of DNA. EcoRI recognizes the double-stranded sequence GAATTC and cleaves after G-1. The family also includes an EcoRI isoschizomer, RsrI endonuclease, which also catalyzes the cleavage of duplex DNA and oligodeoxyribonucleotides between the first two residues of the sequence GAATTC. RsrI differs from EcoRI in its N-terminal amino acid sequence, susceptibility to inhibition by antibodies, sensitivity to N-ethylmaleimide, isoelectric point, state of aggregation at high concentrations, temperature lability, and conditions for optimal reaction. It displays a reduction of specificity ("star activity") under conditions that also relax the specificity of EcoRI.	254
188634	cd00945	Aldolase_Class_I	Class I aldolases. Class I aldolases. The class I aldolases use an active-site lysine which stabilizes reaction intermediates via Schiff base formation, and have a TIM beta/alpha barrel fold. The members of this family include 2-keto-3-deoxy-6-phosphogluconate (KDPG) and 2-keto-4-hydroxyglutarate (KHG) aldolases, transaldolase, dihydrodipicolinate synthase sub-family, Type I 3-dehydroquinate dehydratase, DeoC and DhnA proteins, and metal-independent fructose-1,6-bisphosphate aldolase. Although structurally similar, the class II aldolases use a different mechanism and are believed to have an independent evolutionary origin.	201
238476	cd00946	FBP_aldolase_IIA	Class II Type A, Fructose-1,6-bisphosphate (FBP) aldolases. The enzyme catalyses the zinc-dependent, reversible aldol condensation of dihydroxyacetone phosphate with glyceraldehyde-3-phosphate to form fructose-1,6-bisphosphate. FBP aldolase is homodimeric and used in gluconeogenesis and glycolysis. The type A and type B Class II FBPA's differ in the presence and absence of distinct indels in the sequence that result in differing loop lengths in the structures.	345
238477	cd00947	TBP_aldolase_IIB	Tagatose-1,6-bisphosphate (TBP) aldolase and related Type B Class II aldolases. TBP aldolase is a tetrameric class II aldolase that catalyzes the reversible condensation of dihydroxyacetone phosphate with glyceraldehyde 3-phosphate to produce tagatose 1,6-bisphosphate. There is an absolute requirement for a divalent metal ion, usually zinc, and in addition the enzymes are activated by monovalent cations such as Na+. The type A and type B Class II FBPA's differ in the presence and absence of distinct indels in the sequence that result in differing loop lengths in the structures.	276
188635	cd00948	FBP_aldolase_I_a	Fructose-1,6-bisphosphate aldolase. Fructose-1,6-bisphosphate aldolase. The enzyme catalyzes the cleavage of fructose 1,6-bisphosphate to glyceraldehyde 3-phosphate and dihydroxyacetone phosphate (DHAP). This family includes proteins found in vertebrates, plants, and bacterial plant pathogens. Mutations in the aldolase genes in humans cause hemolytic anemia and hereditary fructose intolerance. The enzyme is a member of the class I aldolase family, which utilizes covalent catalysis through a Schiff base formed between a lysine residue of the enzyme and ketose substrates.	330
188636	cd00949	FBP_aldolase_I_bact	Fructose-1,6-bisphosphate aldolase found in gram +/- bacteria. Fructose-1,6-bisphosphate aldolase found in gram +/- bacteria. The enzyme catalyzes the cleavage of fructose 1,6-bisphosphate to glyceraldehyde 3-phosphate and dihydroxyacetone phosphate (DHAP). The enzyme is a member of the class I aldolase family, which utilizes covalent catalysis through a Schiff base formed between a lysine residue of the enzyme and ketose substrates.	292
188637	cd00950	DHDPS	Dihydrodipicolinate synthase (DHDPS). Dihydrodipicolinate synthase (DHDPS) is a key enzyme in lysine biosynthesis. It catalyzes the aldol condensation of L-aspartate-beta-semialdehyde and pyruvate to dihydrodipicolinic acid via a Schiff base formation between pyruvate and a lysine residue. The functional enzyme is a homotetramer consisting of a dimer of dimers. DHDPS is a member of the dihydrodipicolinate synthase family that comprises several pyruvate-dependent class I aldolases that use the same catalytic step to catalyze different reactions in different pathways.	284
188638	cd00951	KDGDH	5-dehydro-4-deoxyglucarate dehydratase, also called 5-keto-4-deoxy-glucarate dehydratase (KDGDH). 5-dehydro-4-deoxyglucarate dehydratase, also called 5-keto-4-deoxy-glucarate dehydratase (KDGDH), which is a member of the dihydrodipicolinate synthase (DHDPS) family that comprises several pyruvate-dependent class I aldolases. The enzyme is involved in glucarate metabolism, and its mechanism presumably involves a Schiff-base intermediate similar to members of the DHDPS family. While in the case of Pseudomonas sp. 5-dehydro-4-deoxy-D-glucarate is degraded by KDGDH to 2,5-dioxopentanoate, in certain species of Enterobacteriaceae it is degraded instead to pyruvate and glycerate.	289
188639	cd00952	CHBPH_aldolase	Trans-o-hydroxybenzylidenepyruvate hydratase-aldolase (HBPHA) and trans-2'-carboxybenzalpyruvate hydratase-aldolase (CBPHA). Trans-o-hydroxybenzylidenepyruvate hydratase-aldolase (HBPHA) and trans-2'-carboxybenzalpyruvate hydratase-aldolase (CBPHA). HBPHA catalyzes the conversion of HBP to salicylaldehyde and pyruvate. This reaction is part of the degradative pathways for naphthalene and naphthalenesulfonates by bacteria. CBPHA is homologous to HBPHA and catalyzes the cleavage of CBP to 2-carboxybenzaldehyde and pyruvate during the degradation of phenanthrene. They are members of the DHDPS family of Schiff-base-dependent class I aldolases.	309
188640	cd00953	KDG_aldolase	KDG (2-keto-3-deoxygluconate) aldolases found in archaea. KDG (2-keto-3-deoxygluconate) aldolases found in archaea. This subfamily of enzymes is adapted for high thermostability and shows specificity for non-phosphorylated substrates. The enzyme catalyses the reversible aldol cleavage of 2-keto-3-deoxygluconate to pyruvate and glyceraldehyde, the third step of a modified non-phosphorylated Entner-Doudoroff pathway of glucose oxidation. KDG aldolase shows no significant sequence similarity to microbial 2-keto-3-deoxyphosphogluconate (KDPG) aldolases, and the enzyme shows no activity with glyceraldehyde 3-phosphate as substrate. The enzyme is a tetramer and a member of the DHDPS family of Schiff-base-dependent class I aldolases.	279
188641	cd00954	NAL	N-Acetylneuraminic acid aldolase, also called N-acetylneuraminate lyase (NAL). N-Acetylneuraminic acid aldolase, also called N-acetylneuraminate lyase (NAL), which catalyses the reversible aldol reaction of N-acetyl-D-mannosamine and pyruvate to give N-acetyl-D-neuraminic acid (D-sialic acid). It has a widespread application as a biocatalyst for the synthesis of sialic acid and its derivatives. This enzyme has been shown to be quite specific for pyruvate as the donor, but flexible to a variety of D- and, to some extent, L-hexoses and pentoses as acceptor substrates. NAL is a member of the dihydrodipicolinate synthase family that comprises several pyruvate-dependent class I aldolases.	288
188642	cd00955	Transaldolase_like	Transaldolase-like proteins from plants and bacteria. Transaldolase-like proteins from plants and bacteria. Transaldolase, found in the non-oxidative branch of the pentose phosphate pathway, catalyzes the reversible transfer of a dihydroxyacetone group from fructose-6-phosphate to erythrose-4-phosphate yielding sedoheptulose-7-phosphate and glyceraldehyde-3-phosphate. They are members of the class I aldolases, which are characterized by the use of a Schiff-base mechanism for stabilization of the reaction intermediates.	338
188643	cd00956	Transaldolase_FSA	Transaldolase-like fructose-6-phosphate aldolases (FSA) found in bacteria and archaea. Transaldolase-like fructose-6-phosphate aldolases (FSA) found in bacteria and archaea, which are members of the MipB/TalC subfamily of class I aldolases. FSA catalyzes an aldol cleavage of fructose 6-phosphate and does not utilize fructose, fructose 1-phosphate, fructose 1,6-bisphosphate, or dihydroxyacetone phosphate. The enzymes belong to the transaldolase family that serves in transfer reactions in the pentose phosphate cycle, and are more distantly related to fructose 1,6-bisphosphate aldolase.	211
188644	cd00957	Transaldolase_TalAB	Transaldolases including both TalA and TalB. Transaldolases including both TalA and TalB. The enzyme catalyses the reversible transfer of a dihydroxyacetone moiety, derived from fructose-6-phosphate, to erythrose-4-phosphate yielding sedoheptulose-7-phosphate and glyceraldehyde-3-phosphate. The catalytic mechanism is similar to other class I aldolases. The enzyme is found in the non-oxidative branch of the pentose phosphate pathway and forms a dimer in solution.	313
188645	cd00958	DhnA	Class I fructose-1,6-bisphosphate (FBP) aldolases of the archaeal type (DhnA homologs). Class I fructose-1,6-bisphosphate (FBP) aldolases of the archaeal type (DhnA homologs) found in bacteria and archaea. Catalysis of the enzymes proceeds via a Schiff-base mechanism like other class I aldolases, although this subfamily is clearly divergent based on sequence similarity to other class I and class II  (metal dependent) aldolase subfamilies.	235
188646	cd00959	DeoC	2-deoxyribose-5-phosphate aldolase (DERA) of the DeoC family. 2-deoxyribose-5-phosphate aldolase (DERA) of the DeoC family. DERA belongs to the class I aldolases and catalyzes a reversible aldol reaction between acetaldehyde and glyceraldehyde 3-phosphate to generate 2-deoxyribose 5-phosphate. DERA is unique in catalyzing the aldol reaction between two aldehydes, and its broad substrate specificity confers considerable utility as a biocatalyst, offering an environmentally benign alternative to chiral transition metal catalysis of the asymmetric aldol reaction.	203
238478	cd00974	DSRD	Desulforedoxin (DSRD) domain; a small non-heme iron domain present in the desulforedoxin (rubredoxin oxidoreductase) and desulfoferrodoxin proteins of some archaeal and bacterial methanogens and sulfate/sulfur reducers. Desulforedoxin is a small, single-domain homodimeric protein; each subunit contains an iron atom bound to four cysteinyl sulfur atoms, Fe(S-Cys)4, in a distorted tetrahedral coordination. Its metal center is similar to that found in rubredoxin type proteins. Desulforedoxin is regarded as a potential redox partner for rubredoxin. Desulfoferrodoxin forms a homodimeric protein, with each protomer comprised of two domains, the N-terminal DSRD domain and C-terminal superoxide reductase-like (SORL) domain. Each domain has a distinct iron center: the DSRD iron center I, Fe(S-Cys)4; and the SORL iron center II, Fe[His4Cys(Glu)].	34
381600	cd00978	chitosanase_GH46	chitosanase belonging to the glycosyl hydrolase 46 family. This family is composed of the chitosanase enzymes which hydrolyze chitosan, a biopolymer of beta (1,4)-linked-D-glucosamine (GlcN) residues produced by partial or full deacetylation of chitin. Chitosanases play a role in defense against pathogens such as fungi and are found in microorganisms, fungi, viruses, and plants. Microbial chitosanases can be divided into 3 subclasses based on the specificity of the cleavage positions for partially acetylated chitosan. Subclass I chitosanases such as N174 can split GlcN-GlcN and GlcNAc-GlcN linkages, whereas subclass II chitosanases such as Bacillus sp. no. 7-M can cleave only GlcN-GlcN linkages. Subclass III chitosanases such as MH-K1 chitosanase are the most versatile and can split both GlcN-GlcN and GlcN-GlcNAc linkages.	222
238480	cd00980	FwdC/FmdC	FwdC/FmdC. This domain of unknown function is found in the subunit C of formylmethanofuran dehydrogenase, an enzyme that catalyzes the first step in methane formation from CO2 in methanogenic archaea, hyperthermophiles and bacteria. There are two isoenzymes, a tungsten-containing isoenzyme (Fwd) and a molybdenum-containing isoenzyme (Fmd). The subunits C of both isoenzymes (FwdC/FmdC) are characterized by a repeated GXXGXXXG motif.	203
238481	cd00981	arch_gltB	Archaeal-type gltB domain. This domain shares sequence similarity with a region of unknown function found in the large subunit of glutamate synthase, which is encoded by gltB and found in most bacteria and eukaryotes.  It is predicted to be homologous to the C-terminal domain of glutamate synthase based upon sequence similarity coupled with genome organization data, showing that this domain is found in a gene cluster with other domains of Glts, which are annotated. This domain is found primarily in archaea, but is also present in a few bacteria, likely as a result of lateral gene transfer.	232
238482	cd00982	gltB_C	gltb_C. This domain is found at the C-terminus of the large subunit (gltB) of glutamate synthase (GltS).  GltS encodes a complex iron-sulfur flavoprotein that catalyzes the synthesis of L-glutamate from L-glutamine and 2-oxoglutarate. It requires the transfer of ammonia and electrons among three distinct active centers that carry out L-Gln hydrolysis, conversion of 2-oxoglutarate into L-Glu, and electron uptake from a donor. These catalytic sites appear to occur in other domains within the protein, and not the domain in this CD. This particular domain has no known function, but it likely has a structural role as it interacts with the amidotransferase and FMN-binding domains of gltS.	251
410863	cd00983	RecA	recombinase A. RecA is a bacterial enzyme which has roles in homologous recombination, DNA repair, and the induction of the SOS response.  RecA couples ATP hydrolysis to DNA strand exchange.	235
410864	cd00984	DnaB_C	C-terminal domain of DnaB helicase. DnaB helicase C-terminal domain. The hexameric helicase DnaB unwinds the DNA duplex at the chromosome replication fork. Although the mechanism by which DnaB both couples ATP hydrolysis to translocation along DNA and denatures the duplex is unknown, a change in the quaternary structure of the protein involving dimerization of the N-terminal domain has been observed and may occur during the enzymatic cycle. This C-terminal domain contains an ATP-binding site and is therefore probably the site of ATP hydrolysis.	256
238485	cd00985	Maf_Ham1	Maf_Ham1. Maf, a nucleotide binding protein, has been implicated in inhibition of septum formation in eukaryotes, bacteria and archaea. A Ham1-related protein from Methanococcus jannaschii is a novel NTPase that has been shown to hydrolyze nonstandard nucleotides, such as hypoxanthine/xanthine NTP, but not standard nucleotides.	131
238486	cd00986	PDZ_LON_protease	PDZ domain of ATP-dependent LON serine proteases. Most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this bacterial subfamily of protease-associated PDZ domains a C-terminal beta-strand is thought to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.	79
238487	cd00987	PDZ_serine_protease	PDZ domain of trypsin-like serine proteases, such as DegP/HtrA, which are oligomeric proteins involved in heat-shock response, chaperone function, and apoptosis. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.	90
238488	cd00988	PDZ_CTP_protease	PDZ domain of C-terminal processing-, tail-specific-, and tricorn proteases, which function in posttranslational protein processing, maturation, and disassembly or degradation, in Bacteria, Archaea, and plant chloroplasts. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.	85
238489	cd00989	PDZ_metalloprotease	PDZ domain of bacterial and plant zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.	79
238490	cd00990	PDZ_glycyl_aminopeptidase	PDZ domain associated with archaeal and bacterial M61 glycyl-aminopeptidases. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand is presumed to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.	80
238491	cd00991	PDZ_archaeal_metalloprotease	PDZ domain of archaeal zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.	79
238492	cd00992	PDZ_signaling	PDZ domain found in a variety of Eumetazoan signaling molecules, often in tandem arrangements. May be responsible for specific protein-protein interactions, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of PDZ domains an N-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in proteases.	82
270215	cd00993	PBP2_ModA_like	Substrate binding domain of molybdate-binding proteins, the type 2 periplasmic binding protein fold. Molybdate binding domain ModA. The molybdate transport system is comprised of a periplasmic binding protein, an integral membrane protein, and an energizer protein. These three proteins are coded by modA, modB, and modC genes, respectively. ModA proteins serve as initial receptors in the ABC transport of molybdate mostly in eubacteria and archaea. Bacteria and archaea import molybdenum and tungsten from the environment in the form of the oxyanions molybdate (MoO(4) (2-)) and tungstate (WO(4) (2-)). After binding molybdate with high affinity, they interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. In contrast to the structure of the two ModA homologs from Escherichia coli and Azotobacter vinelandii, where the oxygen atoms are tetrahedrally arranged around the metal center, the structure of Pyrococcus furiosus ModA/WtpA (PfModA) has revealed a binding site for molybdate and tungstate where the central metal atom is in a hexacoordinate configuration. This octahedral geometry was rather unexpected. The ModA proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	225
270216	cd00994	PBP2_GlnH	Glutamine binding domain of ABC-type transporter; the type 2 periplasmic binding protein fold. This periplasmic substrate-binding component serves as an initial receptor in the ABC transport of glutamine in bacteria and eukaryota. GlnH belongs to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. PBP2 proteins typically comprise two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	218
173853	cd00995	PBP2_NikA_DppA_OppA_like	The substrate-binding domain of an ABC-type nickel/oligopeptide-like import system contains the type 2 periplasmic binding fold. This family represents the periplasmic substrate-binding domain of nickel/dipeptide/oligopeptide transport systems, which function in the import of nickel and peptides, and other closely related proteins. The oligopeptide-binding protein OppA is a periplasmic component of an ATP-binding cassette (ABC) transport system OppABCDF consisting of five subunits: two homologous integral membrane proteins OppB and OppC that form the translocation pore; two homologous nucleotide-binding domains OppD and OppF that drive the transport process through binding and hydrolysis of ATP; and the substrate-binding protein or receptor OppA that determines the substrate specificity of the transport system. The dipeptide (DppA) and oligopeptide (OppA) binding proteins differ in several ways. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Similar to the ABC-type dipeptide and oligopeptide import systems, the nickel transporter is comprised of five subunits NikABCDE: the two pore-forming integral inner membrane proteins NikB and NikC; the two inner membrane-associated proteins with ATPase activity NikD and NikE; and the periplasmic nickel binding NikA, which is the initial nickel receptor that controls the chemotactic response away from nickel. Most other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the ligand binding domains of ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction.	466
270217	cd00996	PBP2_AatB_like	Polar amino acids-binding domain of ATP-binding cassette transporter-like systems that belong to the type 2 periplasmic binding fold protein superfamily. This subfamily includes the periplasmic binding domains of ATP-binding cassette transporter-like systems that serve as initial receptors in the ABC transport of amino acids and their derivatives in eubacteria.  After binding their ligand with high affinity, they interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically-located ATPase domains.  This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.  The Abp proteins belong to the PBPI superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	227
270218	cd00997	PBP2_GluR0	Bacterial GluR0 ligand-binding domain; the type 2 periplasmic binding protein fold. Glutamate receptor domain GluR0. These domains are found in the GluR0 proteins that have been shown to function as prokaryotic L-glutamate activated potassium channels, also known as ionotropic glutamate receptors or iGluRs. In addition to two ligand binding core domains, iGluRs typically have a channel-like domain inserted in the middle of the GluR-like domain.  The GluR0 proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	218
270219	cd00998	PBP2_iGluR_ligand_binding	The ligand-binding domain of ionotropic glutamate receptor family, a member of the periplasmic binding protein type II superfamily. This subfamily represents the ligand-binding domain of ionotropic glutamate receptors. iGluRs are heterotetrameric ion channels that comprise three functionally distinct subtypes based on their pharmacology and structural similarities:  AMPA (alpha-amino-3-hydroxyl-5-methyl-4-isoxazolepropionic acid), NMDA (N-methyl-D-aspartate), and kainate receptors.  All three types of channels are also activated by the physiological neurotransmitter, glutamate. iGluRs are concentrated at postsynaptic sites, where they exert a variety of different functions.  While this ligand-binding domain of iGluRs is structurally homologous to the periplasmic binding fold type II superfamily, the N-terminal leucine/isoleucine/valine-binding protein (LIVBP)-like domain belongs to the periplasmic-binding fold type I.	243
270220	cd00999	PBP2_ArtJ	The solute binding domain of ArtJ protein, a member of the type 2 periplasmic binding fold protein superfamily. An arginine-binding protein found in Chlamydia trachomatis (CT-ArtJ) and Chlamydia pneumoniae (CPn-ArtJ) and its closely related proteins. CT- and CPn-ArtJ are shown to have different immunogenic properties despite a high sequence similarity. The ArtJ proteins display the type 2 periplasmic binding fold organized in two alpha-beta domains with arginine-binding region at their interface.	223
270221	cd01000	PBP2_Cys_DEBP_like	Substrate-binding domain of cysteine- and aspartate/glutamate-binding proteins; the type 2 periplasmic-binding protein fold. This family comprises the periplasmic-binding protein component of ABC transporters specific for cysteine and carboxylic amino acids, as well as their closely related proteins.  The cysteine and aspartate-glutamate binding domains belong to the type 2 periplasmic binding protein fold superfamily (PBP2), whose many members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	228
270222	cd01001	PBP2_HisJ_LAO_like	Substrate binding domain of ABC-type histidine/lysine/arginine/ornithine transporters and related proteins; the type 2 periplasmic-binding protein fold. This family comprises the periplasmic substrate-binding proteins, including the lysine-, arginine-, ornithine-binding protein (LAO) and the histidine-binding protein (HisJ), which serve as initial receptors for active transport. HisJ and LAO proteins belong to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. PBP2 proteins typically comprise two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	228
270223	cd01002	PBP2_Ehub_like	Substrate binding domain of ectoine/hydroxyectoine specific ABC transport system; the type 2 periplasmic binding protein fold. This family represents the periplasmic substrate-binding component of ABC transport systems that are involved in uptake of osmoprotectants (also termed compatible solutes) such as ectoine and hydroxyectoine. To counteract the efflux of water, bacteria and archaea accumulate the compatible solutes for a sustained adjustment to high osmolarity surroundings. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	242
270224	cd01003	PBP2_YckB	Substrate binding domain of an ABC cystine transporter; the type 2 periplasmic binding protein fold. Periplasmic cystine-binding domain (YckB) of an ATP-binding cassette (ABC) transporter from Bacillus subtilis and its related proteins. Cystine is an oxidized dimeric form of cysteine that is required for optimal bacterial growth. In Bacillus subtilis, three ABC transporters, TcyJKLMN (YtmJKLMN), TcyABC (YckKJI), and YxeMNO are involved in uptake of cystine. Likewise, three uptake systems were identified in Salmonella enterica serovar Typhimurium, while in Escherichia coli, two transport systems seem to be involved in cystine uptake.  Moreover, L-cystine limitation was shown to prevent virulence of Neisseria gonorrhoeae; thus, its L-cystine solute receptor (Ngo0372) may be suited as target for an antimicrobial vaccine. The cystine receptor belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	229
270225	cd01004	PBP2_MidA_like	Mimosine binding domain of ABC-type transporter MidA and similar proteins; the type 2 periplasmic binding protein fold. This subgroup includes MidA, the periplasmic binding component of an ABC transporter involved in uptake of mimosine, and similar proteins. This periplasmic binding domain belongs to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. PBP2 proteins typically comprise two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	230
270226	cd01005	PBP2_CysP	Substrate binding domain of an active sulfate transporter, a member of the type 2 periplasmic binding fold superfamily. This family contains sulfate binding domain of CysP proteins that serve as initial receptors in the ABC transport of sulfate and thiosulfate in eubacteria. After binding the ligand, CysP interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The CysP proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	307
270227	cd01006	PBP2_phosphate_binding	Substrate binding domain of ABC-type phosphate transporter, a member of the type 2 periplasmic-binding fold superfamily. This phosphate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	253
270228	cd01007	PBP2_BvgS_HisK_like	The type 2 periplasmic ligand-binding protein domain of the sensor-kinase BvgS and histidine kinase receptors, and related proteins. This family comprises the periplasmic sensor domain of the two-component sensor-kinase systems, such as the sensor protein BvgS of Bordetella pertussis and histidine kinase receptors (HisK), and uncharacterized related proteins. Typically, the two-component system consists of a membrane spanning sensor-kinase and a cytoplasmic response regulator. It serves as a stimulus-response coupling mechanism to enable microorganisms to sense and respond to changes in environmental conditions. The N-terminal sensing domain of the sensor kinase detects extracellular signals, such as small molecule ligands and ions, which then modulate the catalytic activity of the cytoplasmic kinase domain through a phosphorylation cascade. The periplasmic sensor domain belongs to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. PBP2 proteins typically comprise two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	220
270229	cd01008	PBP2_NrtA_SsuA_CpmA_like	Substrate binding domain of ABC-type nitrate/sulfonate/bicarbonate transporters, a member of the type 2 periplasmic binding fold superfamily. This family represents the periplasmic binding proteins involved in nitrate, alkanesulfonate, and bicarbonate transport. These domains are found in eubacterial periplasmic-binding proteins that serve as initial receptors in the ABC transport of bicarbonate, nitrate, taurine, or a wide range of aliphatic sulfonates. Other close homologs involved in the thiamine (vitamin B1) biosynthetic pathway and desulfurization (DszB) are also included in this family. After binding their ligand with high affinity, they interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. These binding proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	212
270230	cd01009	PBP2_YfhD_N	The solute binding domain of YfhD proteins, a member of the type 2 periplasmic binding fold protein superfamily. This subfamily includes the solute binding domain YfhD_N. These domains are found in the YfhD proteins that are predicted to function as lytic transglycosylases that cleave the glycosidic bond between N-acetylmuramic acid and N-acetylglucosamine in peptidoglycan, while the YfhD_N domain might act as an auxiliary or regulatory subunit. In addition to the periplasmic solute binding domain, they have an SLT domain, typically found in soluble lytic transglycosylases, and a C-terminal low complexity domain.  The YfhD proteins might have been recruited to create localized cell wall openings required for transport of large substrates such as DNA. They belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	223
238493	cd01011	nicotinamidase	Nicotinamidase/pyrazinamidase (PZase).  Nicotinamidase, a ubiquitous enzyme in prokaryotes, converts nicotinamide to nicotinic acid (niacin) and ammonia, which in turn can be recycled to make nicotinamide adenine dinucleotide (NAD). The same enzyme is also called pyrazinamidase, because it converts the tuberculosis drug pyrazinamide (PZA) into its active form pyrazinoic acid (POA).	196
238494	cd01012	YcaC_related	YcaC related amidohydrolases; E. coli YcaC is a homooctameric hydrolase with unknown specificity. Despite its weak sequence similarity, it is structurally related to other amidohydrolases and shares conserved active site residues with them. The multimerisation interface seems not to be conserved in all members.	157
238495	cd01013	isochorismatase	Isochorismatase, also known as 2,3 dihydro-2,3 dihydroxybenzoate synthase, catalyses the conversion of isochorismate, in the presence of water, to 2,3-dihydroxybenzoate and pyruvate, via the hydrolysis of a vinyl ether, an uncommon reaction in biological systems. Isochorismatase is part of the phenazine biosynthesis pathway. Phenazines are antimicrobial compounds that provide the competitive advantage for certain bacteria.	203
238496	cd01014	nicotinamidase_related	Nicotinamidase-related amidohydrolases.  Cysteine hydrolases of unknown function that share the catalytic triad with other amidohydrolases, like nicotinamidase, which converts nicotinamide to nicotinic acid and ammonia.	155
238497	cd01015	CSHase	N-carbamoylsarcosine amidohydrolase (CSHase) hydrolyzes N-carbamoylsarcosine to sarcosine, carbon dioxide and ammonia. CSHase is involved in one of the two alternative pathways for creatinine degradation to glycine in microorganisms. This CSHase-containing pathway degrades creatinine via N-methylhydantoin, N-carbamoylsarcosine and sarcosine to glycine. Enzymes of this pathway are used in the diagnosis of renal dysfunction, for determining creatinine levels in urine and serum.	179
238498	cd01016	TroA	Metal binding protein TroA. These proteins have been shown to function as initial receptors in ABC transport of Zn2+ and possibly Fe3+ in many eubacterial species.  The TroA proteins belong to the TroA superfamily of periplasmic metal binding proteins that share a distinct fold and ligand binding mechanism. A typical TroA protein is comprised of two globular subdomains connected by a single helix and can bind the metal ion in the cleft between these domains. In addition, these proteins sometimes have a low complexity region containing a metal-binding histidine-rich motif (repetitive HDH sequence).	276
238499	cd01017	AdcA	Metal binding protein AdcA.  These proteins have been shown to function in the ABC uptake of Zn2+ and Mn2+ and in competence for genetic transformation and adhesion.  The AdcA proteins belong to the TroA superfamily of helical backbone metal receptor proteins that share a distinct fold and ligand binding mechanism.  They are comprised of two globular subdomains connected by a long alpha helix and they bind their ligand in the cleft between these domains.  In addition, many of these proteins have a low complexity region containing metal binding histidine-rich motif (repetitive HDH sequence).	282
238500	cd01018	ZntC	Metal binding protein ZntC.  These proteins are predicted to function as initial receptors in ABC transport of metal ions.  They belong to the TroA superfamily of helical backbone metal receptor proteins that share a distinct fold and ligand binding mechanism.  They are comprised of two globular subdomains connected by a long alpha helix and bind their specific ligands in the cleft between these domains.  In addition, many of these proteins possess a metal-binding histidine-rich motif (repetitive HDH sequence).	266
238501	cd01019	ZnuA	Zinc binding protein ZnuA. These proteins have been shown to function as initial receptors in the ABC uptake of Zn2+.  They belong to the TroA superfamily of periplasmic metal binding proteins that share a distinct fold and ligand binding mechanism.  They are comprised of two globular subdomains connected by a single helix and bind their specific ligands in the cleft between these domains.  In addition, these proteins sometimes have a low complexity region containing a metal-binding histidine-rich motif (repetitive HDH sequence).	286
238502	cd01020	TroA_b	Metal binding protein TroA_b.  These proteins are predicted to function as initial receptors in ABC transport of metal ions.  They belong to the TroA superfamily of helical backbone metal receptor proteins that share a distinct fold and ligand binding mechanism.  A typical TroA protein is comprised of two globular subdomains connected by a single helix and can bind the metal ion in the cleft between these domains. In addition, these proteins sometimes have a low complexity region containing a metal-binding histidine-rich motif (repetitive HDH sequence).	264
381601	cd01021	GEWL	Goose egg-white lysozyme. Eukaryotic goose-type or G-type lysozyme (goose egg-white lysozyme; GEWL) catalyzes the cleavage of the beta-1,4-glycosidic bond between N-acetylmuramic acid (MurNAc) and N-acetylglucosamine (GlcNAc). Mammals have two lysozymes. This family corresponds to human and mouse lysozyme G-like protein 2.	174
212096	cd01022	GH57N_like	N-terminal catalytic domain of heat stable retaining glycoside hydrolase family 57. Glycoside hydrolase family 57(GH57) is a chiefly prokaryotic family with the majority of thermostable enzymes coming from extremophiles (many of these are archaeal hyperthermophiles), which exhibit the enzyme specificities of alpha-amylase (EC 3.2.1.1), 4-alpha-glucanotransferase (EC 2.4.1.25), amylopullulanase (EC 3.2.1.1/41), and alpha-galactosidase (EC 3.2.1.22). This family also includes many hypothetical proteins with uncharacterized activity and specificity. GH57s cleave alpha-glycosidic bonds by employing a retaining mechanism, which involves a glycosyl-enzyme intermediate, allowing transglycosylation.	313
173775	cd01025	TOPRIM_recR	TOPRIM_recR: topoisomerase-primase (TOPRIM) nucleotidyl transferase/hydrolase domain of the type found in Escherichia coli RecR.  RecR participates in the RecFOR pathway of homologous recombinational repair in prokaryotes. This pathway provides a single-stranded DNA molecule coated with RecA to allow invasion of a homologous molecule. The RecFOR system directs the loading of RecA onto gapped DNA coated with SSB protein. The TOPRIM domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD).  In RecR sequences this glutamate in the first turn of the TOPRIM domain is semiconserved, whereas the DXD motif is not conserved.	112
173776	cd01026	TOPRIM_OLD	TOPRIM_OLD: topoisomerase-primase (TOPRIM) nucleotidyl transferase/hydrolase domain of the type found in bacterial and archaeal nucleases of the OLD (overcome lysogenization defect) family.  The bacteriophage P2 OLD protein, which has DNase as well as RNase activity, consists of an N-terminal ABC-type ATPase domain and a C-terminal Toprim domain; the nuclease activity of OLD is stimulated by ATP, though the ATPase activity is not DNA-dependent. Functional details on OLD are scant and further experimentation is required to define the relationship between the ATPase and Toprim nuclease domains.  The TOPRIM domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD).  The conserved glutamate may act as a general acid in strand cleavage by nucleases. The DXD motif may co-ordinate Mg2+, a cofactor required for full catalytic function.	97
173777	cd01027	TOPRIM_RNase_M5_like	TOPRIM_RNase_M5_like: The topoisomerase-primase (TOPRIM) nucleotidyl transferase/hydrolase domain found in Ribonuclease M5 (RNase M5) and other small primase-like proteins from bacteria and archaea.  RNase M5 catalyzes the maturation of 5S rRNA in low G+C Gram-positive bacteria. The TOPRIM domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD). The conserved glutamate may act as a general base in nucleotide polymerization by primases. The DXD motif may co-ordinate Mg2+, a cofactor required for full catalytic function.	81
173778	cd01028	TOPRIM_TopoIA	TOPRIM_TopoIA: topoisomerase-primase (TOPRIM) nucleotidyl transferase/hydrolase domain of the type found in the type IA family of DNA topoisomerases (TopoIA).  This subgroup contains proteins similar to the Type I DNA topoisomerases: E. coli topoisomerases I and III, eukaryotic topoisomerase III, and ATP-dependent reverse gyrase found in archaea and thermophilic bacteria.  Type IA DNA topoisomerases remove (relax) negative supercoils in the DNA. These enzymes cleave one strand of the DNA duplex, covalently link to the 5' phosphoryl end of the DNA break and allow the other strand of the duplex to pass through the gap. Reverse gyrase is also able to insert positive supercoils in the presence of ATP and negative supercoils in the presence of AMPPNP.  The TOPRIM domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD).  For topoisomerases the conserved glutamate is believed to act as a general base in strand joining and as a general acid in strand cleavage. The DXD motif may co-ordinate Mg2+, a cofactor required for full catalytic function.	142
173779	cd01029	TOPRIM_primases	TOPRIM_primases: The topoisomerase-primase (TOPRIM) nucleotidyl transferase/hydrolase domain found in the active site regions of bacterial DnaG-type primases and their homologs. Primases synthesize RNA primers for the initiation of DNA replication. DnaG type primases are often closely associated with DNA helicases in primosome assemblies.  The TOPRIM domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD). This glutamate and the two aspartates cluster together to form a highly acidic surface patch. The conserved glutamate may act as a general base in nucleotide polymerization by primases. The DXD motif may co-ordinate Mg2+, a cofactor required for full catalytic function. The prototypical bacterial primase, Escherichia coli DnaG, is a single-subunit enzyme.	79
173780	cd01030	TOPRIM_TopoIIA_like	TOPRIM_TopoIIA_like: topoisomerase-primase (TOPRIM) nucleotidyl transferase/hydrolase domain of the type found in proteins of the type IIA family of DNA topoisomerases similar to Saccharomyces cerevisiae Topoisomerase II. TopoIIA enzymes cut both strands of the duplex DNA to remove (relax) both positive and negative supercoils in DNA.  These enzymes covalently attach to the 5' ends of the cut DNA, separate the free ends of the cleaved strands, pass another region of the duplex through this gap, then rejoin the ends. These proteins also catenate/decatenate duplex rings.  The TOPRIM domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD). The conserved glutamate may act as a general base in strand joining and as a general acid in strand cleavage by topoisomerases.  The DXD motif may co-ordinate Mg2+, a cofactor required for full catalytic function.	115
238504	cd01031	EriC	ClC chloride channel EriC.  This domain is found in the EriC chloride transporters that mediate the extreme acid resistance response in eubacteria and archaea. This response allows bacteria to survive in the acidic environments by decarboxylation-linked proton utilization. As shown for Escherichia coli EriC, these channels can counterbalance the electric current produced by the outwardly directed virtual proton pump linked to amino acid decarboxylation.  The EriC proteins belong to the ClC superfamily of chloride ion channels, which share a unique double-barreled architecture and voltage-dependent gating mechanism.  The voltage-dependent gating is conferred by the permeating anion itself, acting as the gating charge. In Escherichia coli EriC, a glutamate residue that protrudes into the pore is thought to participate in gating by binding to a Cl- ion site within the selectivity filter.	402
238505	cd01033	ClC_like	Putative ClC chloride channel.  ClC proteins are putative halogen ion (Cl-, Br- and I-) transporters found in eubacteria. They belong to the ClC superfamily of halogen ion channels, which share a unique double-barreled architecture and voltage-dependent gating mechanism.  This superfamily lacks any structural or sequence similarity to other known ion channels and exhibits unique properties of ion permeation and gating.  The voltage-dependent gating is conferred by the permeating anion itself, acting as the gating charge.	388
238506	cd01034	EriC_like	ClC chloride channel family. These protein sequences, closely related to the ClC EriC family, are putative halogen ion (Cl-, Br- and I-) transport proteins found in eubacteria. They belong to the ClC superfamily of chloride ion channels, which share a unique double-barreled architecture and voltage-dependent gating mechanism.  This superfamily lacks any structural or sequence similarity to other known ion channels and exhibits unique properties of ion permeation and gating.  The voltage-dependent gating is conferred by the permeating anion itself, acting as the gating charge.	390
238507	cd01036	ClC_euk	Chloride channel, ClC.  These domains are found in the eukaryotic halogen ion (Cl-, Br- and I-) channel proteins that perform a variety of functions including cell volume regulation, membrane potential stabilization, charge compensation necessary for the acidification of intracellular organelles, signal transduction and transepithelial transport.  They are also involved in many pathophysiological processes and are responsible for a number of human diseases.  These proteins belong to the ClC superfamily of chloride ion channels, which share the unique double-barreled architecture and voltage-dependent gating mechanism.  The gating is conferred by the permeating anion itself, acting as the gating charge.  Some proteins possess long C-terminal cytoplasmic regions containing two CBS (cystathionine beta synthase) domains of putative regulatory function.	416
411707	cd01037	PDDEXK_nuclease-like	PDDEXK family nucleases. Superfamily of PDDEXK nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	83
411708	cd01038	Endonuclease_DUF559	Putative endonuclease. Domain of unknown function 559 (DUF559) is a putative endonuclease of unknown function that belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	97
381254	cd01040	Mb-like	myoglobin-like; M family globin domain. This family includes chimeric (FHbs/flavohemoglobins) and single-domain globins: FHbs, Ngbs/neuroglobins, Cygb/cytoglobins, GbE/avian eye specific globin E, GbX/globin X, amphibian GbY/globin Y, Mb/myoglobin, HbA/hemoglobin-alpha, HbB/hemoglobin-beta, SDgbs/single-domain globins related to FHbs, and Adgb/androglobin. The M family exhibits the canonical secondary structure of hemoglobins, a 3-over-3 alpha-helical sandwich structure (3/3 Mb-fold), built by eight alpha-helical segments (named A through H). In Adgbs, the globin domain is split into two: helices C-H are followed by helices A-B and the two parts are separated by the IQ motif. Although rearranged, the globin domain of most Adgbs contains a number of conserved residues which play critical roles in heme-coordination and gas ligand binding. Adgbs have been omitted from this A-H helix cd.	133
153100	cd01041	Rubrerythrin	Rubrerythrin, ferritin-like diiron-binding domain. Rubrerythrin domain is a nonheme iron binding domain found in many air-sensitive bacteria and archaea and member of a broad superfamily of ferritin-like diiron-carboxylate proteins. The homodimeric rubrerythrin protein contains a binuclear metal center located within a four helix bundle. Many, but not all, rubrerythrin proteins have a second domain with a rubredoxin-like hexacoordinated iron center. Rubrerythrin is thought to reduce hydrogen peroxide as part of an oxidative stress protection system but its function is still poorly understood.	134
153101	cd01042	DMQH	Demethoxyubiquinone hydroxylase, ferritin-like diiron-binding domain. Demethoxyubiquinone hydroxylases (DMQH) are members of the ferritin-like, diiron-carboxylate family which are present in eukaryotes (the CLK-1/CAT5 family) and prokaryotes (the Coq7 family). DMQH participates in one of the last steps of ubiquinone biosynthesis and is responsible for DMQ hydroxylation, resulting in the formation of hydroxyubiquinone, a precursor of ubiquinone. CLK-1 is a mitochondrial inner membrane protein and Coq7 is a proposed interfacial integral membrane protein. Mutations in the Caenorhabditis elegans gene clk-1 affect biological timing and extend longevity. The conserved residues of a diiron center are present in this domain.	165
153102	cd01043	DPS	DPS protein, ferritin-like diiron-binding domain. DPS (DNA Protecting protein under Starved conditions) domain is a member of a broad superfamily of ferritin-like diiron-carboxylate proteins. Some DPS proteins nonspecifically bind DNA, protecting it from cleavage caused by reactive oxygen species such as the hydroxyl radicals produced during oxidation of Fe(II) by hydrogen peroxide. These proteins assemble into dodecameric structures, some form DPS-DNA co-crystalline complexes, and possess iron and H2O2 detoxification capabilities. Expression of DPS is induced by oxidative or nutritional stress, including metal ion starvation. Members of the DPS family are homopolymers formed by 12 four-helix bundle subunits that assemble with 23 symmetry into a hollow shell. The DPS ferroxidase site is unusual in that it is not located in a four-helix bundle as in ferritin, but is shared by 2-fold symmetry-related subunits providing the iron ligands. Many DPS sequences (e.g., E. coli) display an N-terminal extension of variable length that contains two or three positively charged lysine residues that extends into the solvent and is thought to play an important role in the stabilization of the complex with DNA. DPS Listeria Flp, Bacillus anthracis Dlp-1 and Dlp-2, and Helicobacter pylori HP-NAP which lack the N-terminal extension, do not bind DNA. DPS proteins from Helicobacter pylori, Treponema pallidum, and Borrelia burgdorferi are highly immunogenic.	139
153103	cd01044	Ferritin_CCC1_N	Ferritin-CCC1, N-terminal ferritin-like diiron-binding domain. Ferritin-like N-terminal domain present in an uncharacterized family of proteins found in bacteria and archaea.  These proteins also have a C-terminal CCC1-like transmembrane domain and are thought to be involved in iron and/or manganese transport.  This domain has the conserved residues of a diiron center found in other ferritin-like proteins.	125
153104	cd01045	Ferritin_like_AB	Uncharacterized family of ferritin-like proteins found in archaea and bacteria. Ferritin-like domain found in archaea and bacteria (Ferritin_like_AB).  This uncharacterized domain is a member of a broad superfamily of ferritin-like diiron-carboxylate proteins whose function is unknown.  This family includes unknown or hypothetical proteins which were sequenced from mostly anaerobic or microaerophilic metal-metabolizing and/or nitrogen-fixing microbes. The family includes sequences from ferric-, sulfate-, and arsenic-reducing bacteria, Geobacter, Magnetospirillum, Desulfovibrio, and Desulfitobacterium.  Also included are several nitrogen-fixing endosymbiotic bacteria, Rhizobium, Mesorhizobium, and Bradyrhizobium; also phototrophic purple nonsulfur bacteria, Rhodobacter and Rhodopseudomonas, as well as, obligate thermophiles, Thermotoga, Thermoanaerobacter, and Pyrococcus. The conserved residues of a diiron center are present in this uncharacterized domain.	139
153105	cd01046	Rubrerythrin_like	rubrerythrin-like, diiron-binding domain. Rubrerythrin-like domain, similar to rubrerythrin, a nonheme iron binding domain found in many air-sensitive bacteria and archaea, and member of a broad superfamily of ferritin-like diiron-carboxylate proteins. Rubrerythrin is thought to reduce hydrogen peroxide as part of an oxidative stress protection system. The rubrerythrin protein has two domains, a binuclear metal center located within a four-helix bundle of the rubrerythrin domain, and a rubredoxin domain. The Rubrerythrin-like domains in this CD are singular domains (no C-terminus rubredoxin domain) and are phylogenetically distinct from rubrerythrin domains of rubrerythrin-rubredoxin proteins.	123
153106	cd01047	ACSF	Aerobic Cyclase System Fe-containing subunit (ACSF), ferritin-like diiron-binding domain. Aerobic Cyclase System, Fe-containing subunit (ACSF) is a member of a broad superfamily of ferritin-like diiron-carboxylate proteins. Rubrivivax gelatinosus acsF codes for a conserved, putative binuclear iron-cluster-containing protein involved in aerobic oxidative cyclization of Mg-protoporphyrin IX monomethyl ester. AcsF and homologs have a leucine zipper and two copies of the conserved glutamate and histidine residues predicted to act as ligands for iron in the Ex(29-35)DExRH motifs. Several homologs of AcsF are found in a wide range of photosynthetic organisms, including Chlamydomonas reinhardtii Crd1 and Pharbitis nil PNZIP, suggesting that this aerobic oxidative cyclization mechanism is conserved from bacteria to plants.	323
153107	cd01048	Ferritin_like_AB2	Uncharacterized family of ferritin-like proteins found in archaea and bacteria. Ferritin-like domain found in archaea and bacteria, subgroup 2 (Ferritin_like_AB2).  This uncharacterized domain is a member of a broad superfamily of ferritin-like diiron-carboxylate proteins whose function is unknown. The conserved residues of a diiron center are present within the putative active site.	135
153108	cd01049	RNRR2	Ribonucleotide Reductase, R2/beta subunit, ferritin-like diiron-binding domain. Ribonucleotide Reductase, R2/beta subunit (RNRR2) is a member of a broad superfamily of ferritin-like diiron-carboxylate proteins. The RNR protein catalyzes the conversion of ribonucleotides to deoxyribonucleotides and is found in all eukaryotes, many prokaryotes, several viruses, and few archaea. The catalytically active form of RNR is a proposed alpha2-beta2 tetramer. The homodimeric alpha subunit (R1) contains the active site and redox active cysteines as well as the allosteric binding sites. The beta subunit (R2) contains a diiron cluster that, in its reduced state, reacts with dioxygen to form a stable tyrosyl radical and a diiron(III) cluster. This essential tyrosyl radical is proposed to generate a thiyl radical, located on a cysteine residue in the R1 active site that initiates ribonucleotide reduction. The beta subunit is composed of 10-13 helices, the 8 longest helices form an alpha-helical bundle; some have 2 additional beta strands. Yeast is unique in that it assembles both homodimers and heterodimers of RNRR2. The yeast heterodimer, Y2Y4, contains R2 (Y2) and a R2 homolog (Y4) that lacks the diiron center and is proposed to only assist in cofactor assembly, and perhaps stabilize R1 (Y1) in its active conformation.	288
153109	cd01050	Acyl_ACP_Desat	Acyl ACP desaturase, ferritin-like diiron-binding domain. Acyl-Acyl Carrier Protein Desaturase (Acyl_ACP_Desat) is a mu-oxo-bridged diiron-carboxylate enzyme, which belongs to a broad superfamily of ferritin-like proteins and catalyzes the NADPH and O2-dependent formation of a cis-double bond in acyl-ACPs.  Acyl-ACP desaturases are found in higher plants and a few bacterial species (Mycobacterium tuberculosis, M. leprae, M. avium and Streptomyces avermitilis, S. coelicolor). In plants, Acyl-ACP desaturase is a plastid-localized, covalently ACP linked, soluble desaturase that introduces the first double bond into saturated fatty acids, resulting in the corresponding monounsaturated fatty acid.  Members of this class of soluble desaturases are specific for a particular substrate chain length and introduce the double bond between specific carbon atoms. For example, delta 9 stearoyl-ACP is specific for stearic acid and introduces a double bond between carbon 9 and 10 to yield oleic acid in the ACP-bound form. The enzymatic reaction requires molecular oxygen, NAD(P)H, NAD(P)H ferredoxin oxido-reductase and ferredoxin. The enzyme is active in the homodimeric form; the monomer consists mainly of alpha-helices with the catalytic diiron center buried within a four-helix bundle. Integral membrane fatty acid desaturases that introduce double bonds into fatty acid chains, acyl-CoA desaturases of animals, yeasts, and fungi, and acyl-lipid desaturases of cyanobacteria and higher plants, are distinct from soluble acyl-ACP desaturases, lack diiron centers, and are not included in this CD.	297
153110	cd01051	Mn_catalase	Manganese catalase, ferritin-like diiron-binding domain. Manganese (Mn) catalase is a member of a broad superfamily of ferritin-like diiron enzymes. While many diiron enzymes catalyze dioxygen-dependent reactions, manganese catalase performs peroxide-dependent oxidation-reduction. Catalases are important antioxidant metalloenzymes that catalyze disproportionation of hydrogen peroxide, forming dioxygen and water. Manganese catalase, a nonheme type II catalase, contains a binuclear manganese cluster that catalyzes the redox dismutation of hydrogen peroxide, interconverting between dimanganese(II) [(2,2)] and dimanganese(III) [(3,3)] oxidation states during turnover. Mn catalases are found in a broad range of microorganisms in microaerophilic environments, including the mesophilic lactic acid bacteria (e.g., Lactobacillus plantarum) and bacterial and archaeal thermophiles (e.g., Thermus thermophilus and Pyrobaculum caldifontis). L. plantarum and T. thermophilus holoenzymes are homohexameric structures; each subunit contains a dimanganese active site. The manganese ions are linked by a mu 1,3-bridging glutamate carboxylate and two mu-bridging solvent oxygens that electronically couple the metal centers. Several members of this CD lack the C-terminal strands that pack against the neighboring catalytic domains as seen in L. plantarum. One such sequence, Bacillus subtilis CotJC, is known to be a component of the inner spore coat that interacts with spore coat protein, CotJA. It has been suggested that CotJC could modulate the degree of Mn SodA-dependent cross-linking of an outer coat component, or the two enzymes could serve to protect specific cellular structures during the developmental process.	156
153111	cd01052	DPSL	DPS-like protein, ferritin-like diiron-binding domain. DPSL (DPS-like).  DPSL is a phylogenetically distinct class within the ferritin-like superfamily, and similar in many ways to the DPS (DNA Protecting protein under Starved conditions) proteins. Like DPS, these proteins are expressed in response to oxidative stress, form dodecameric cage-like particles, preferentially utilize hydrogen peroxide in the controlled oxidation of iron, and possess a short N-terminal extension implicated in stabilizing cellular DNA.  This domain is a member of a broad superfamily of ferritin-like diiron-carboxylate proteins. These proteins are distantly related to bacterial ferritins which assemble 24 monomers,  each of which have a four-helix bundle with a fifth shorter helix at the C terminus and a diiron (ferroxidase) center. Ferritins contain a center where oxidation of ferrous iron by molecular oxygen occurs, facilitating the detoxification of iron, protection against dioxygen and radical products, and storage of iron in the ferric form. Many of the conserved residues of a diiron center are present in this domain.	148
153112	cd01053	AOX	Alternative oxidase, ferritin-like diiron-binding domain. Alternative oxidase (AOX) is a mitochondrial ubiquinol oxidase found in plants and some fungi and protists. AOX is a member of the ferritin-like diiron-carboxylate superfamily. The plant mitochondrial protein alternative oxidase catalyses dioxygen dependent ubiquinol oxidation to yield ubiquinone and water. AOX is a cyanide-resistant, salicylhydroxamic acid-sensitive oxidase that transfers electrons from ubiquinol to oxygen, bypassing the cytochrome chain. AOX has been proposed to contain a hydroxo-bridged diiron center within a four-helix bundle and a proximal redox-active tyrosine residue. AOX is proposed to be peripherally associated with the matrix side of the inner mitochondrial membrane. Fungal and protozoan AOXs generally exist as monomers. In plants, AOX is dimeric. Pyruvate is an allosteric activator of plant AOX involved in the reversible inactivation of the enzyme through the formation of an intermolecular disulfide bridge between monomeric subunits. The enzyme is non-proton-motive and does not contribute to the conservation of energy. The heat that dissipates from AOX activity is used in thermogenic plants to volatilize primary amines to attract pollinating insects. Other functions have been proposed: i) that the alternative oxidase allows Krebs-cycle turnover when the energy charge of the cell is high, and ii) that the enzyme protects against oxidative stress. The expression of AOX is induced when plants are exposed to a variety of stresses including chilling, pathogen attack, senescence and fruit ripening.	168
153113	cd01055	Nonheme_Ferritin	nonheme-containing ferritins. Nonheme Ferritin domain, found in archaea and bacteria, is a member of a broad superfamily of ferritin-like diiron-carboxylate proteins. The ferritin protein shell is composed of 24 protein subunits arranged in 432 symmetry. Each protein subunit, a four-helix bundle with a fifth short terminal helix, contains a dinuclear ferroxidase center (H type). Unique to this group of proteins is a third metal site in the ferroxidase center. Iron storage involves the uptake of iron (II) at the protein shell, its oxidation by molecular oxygen at the ferroxidase centers, and the movement of iron (III) into the cavity for deposition as ferrihydrite.	156
153114	cd01056	Euk_Ferritin	eukaryotic ferritins. Eukaryotic Ferritin (Euk_Ferritin) domain. Ferritins are the primary iron storage proteins of most living organisms and members of a broad superfamily of ferritin-like diiron-carboxylate proteins. The iron-free (apoferritin) ferritin molecule is a protein shell composed of 24 protein chains arranged in 432 symmetry. Iron storage involves the uptake of iron (II) at the protein shell, its oxidation by molecular oxygen at the dinuclear ferroxidase centers, and the movement of iron (III) into the cavity for deposition as ferrihydrite; the protein shell can hold up to 4500 iron atoms. In vertebrates, two types of chains (subunits) have been characterized, H or M (fast) and L (slow), which differ in rates of iron uptake and mineralization. Fe(II) oxidation in the H/M subunits take place initially at the ferroxidase center, a carboxylate-bridged diiron center, located within the subunit four-helix bundle. In a complementary role, negatively charged residues on the protein shell inner surface of the L subunits promote ferrihydrite nucleation. Most plant ferritins combine both oxidase and nucleation functions in one chain: they have four interior glutamate residues as well as seven ferroxidase center residues.	161
153115	cd01057	AAMH_A	Aromatic and Alkene Monooxygenase Hydroxylase, subunit A, ferritin-like diiron-binding domain. Aromatic and Alkene Monooxygenase Hydroxylases, subunit A  (AAMH_A). Subunit A of the soluble hydroxylase of multicomponent, aromatic and alkene monooxygenases are members of a superfamily of ferritin-like iron-storage proteins. AAMH exists as a hexamer (an alpha2-beta2-gamma2 homodimer) with each alpha-subunit housing one nonheme diiron center embedded in a four-helix bundle. The N-terminal domain of the alpha- and noncatalytic beta-subunits possess nearly identical folds, however, the beta-subunit lacks critical diiron ligands and a C-terminal domain found in the alpha-subunit. Methane monooxygenase is a multicomponent enzyme found in methanotrophic bacteria that catalyzes the hydroxylation of methane and higher alkenes (as large as octane). Phenol monooxygenase, found in a diverse group of bacteria, catalyses the hydroxylation of phenol, chloro- and methyl-phenol and naphthol. Both enzyme systems consist of three components: the hydroxylase, a coupling protein and a reductase. In the MMO hydroxylase, dioxygen and substrate interact with the diiron center in a hydrophobic cavity at the active site. The reductase component and protein coupling factor provide electrons from NADH for reducing the oxidized binuclear iron-oxo cluster to its reduced form. Reaction with dioxygen produces a peroxy-bridged complex and dehydration leads to the formation of complex Q, which is thought to be the oxygenating species that carries out the insertion of an oxygen atom into a C-H bond of the substrate. The toluene monooxygenase systems, toluene 2-, 3-, and 4-monooxygenase, are similar to MMO but with an additional component, a Rieske-type ferredoxin. The alkene monooxygenase from Xanthobacter strain Py2 is closely related to aromatic monooxygenases and catalyzes aromatic monohydroxylation of benzene, toluene, and phenol. Alkane omega-hydroxylase (AlkB) and xylene monooxygenase are members of a distinct class of integral membrane diiron proteins and are not included in this CD.	465
153116	cd01058	AAMH_B	Aromatic and Alkene Monooxygenase Hydroxylase, subunit B, ferritin-like diiron-binding domain. Aromatic and Alkene Monooxygenase Hydroxylases, subunit B (AAMH_B). Subunit B (beta) of the soluble hydroxylase of multicomponent, aromatic and alkene monooxygenases are members of a superfamily of ferritin-like iron-storage proteins. AAMH exists as a hexamer (an alpha2-beta2-gamma2 homodimer) with each alpha-subunit housing one nonheme diiron center embedded in a four-helix bundle. The N-terminal domain of the alpha- and noncatalytic beta-subunits possess nearly identical folds; the beta-subunit lacks the C-terminal domain found in the alpha-subunit. Methane monooxygenase is a multicomponent enzyme found in methanotrophic bacteria that catalyzes the hydroxylation of methane and higher alkenes (as large as octane). Phenol monooxygenase, found in a diverse group of bacteria, catalyses the hydroxylation of phenol, chloro- and methyl-phenol and naphthol. Both enzyme systems consist of three components: the hydroxylase, a coupling protein and a reductase. In the MMO hydroxylase, dioxygen and substrate interact with the diiron center in a hydrophobic cavity at the active site. The reductase component and protein coupling factor provide electrons from NADH for reducing the oxidized binuclear iron-oxo cluster to its reduced form. Reaction with dioxygen produces a peroxy-bridged complex and dehydration leads to the formation of complex Q, which is thought to be the oxygenating species that carries out the insertion of an oxygen atom into a C-H bond of the substrate. The toluene monooxygenase systems, toluene 2-, 3-, and 4-monooxygenase, are similar to MMO but with an additional component, a Rieske-type ferredoxin. The alkene monooxygenase from Xanthobacter strain Py2 is closely related to aromatic monooxygenases and catalyzes aromatic monohydroxylation of benzene, toluene, and phenol. Alkane omega-hydroxylase (AlkB) and xylene monooxygenase are members of a distinct class of integral membrane diiron proteins and are not included in this CD.	304
153121	cd01059	CCC1_like	CCC1-related family of proteins. CCC1_like: This protein family includes the proteins related to CCC1, a yeast vacuole transmembrane protein responsible for the iron and manganese transport from the cytosol into vacuole. It also includes the proteins similar to nodulin-21, a plant nodule-specific protein that may be involved in symbiotic nitrogen fixation. 	143
238511	cd01060	Membrane-FADS-like	The membrane fatty acid desaturase (Membrane_FADS)-like CD includes membrane FADSs, alkane hydroxylases, beta carotene ketolases (CrtW-like), hydroxylases (CrtR-like), and other related proteins. They are present in all groups of organisms with the exception of archaea. Membrane FADSs are non-heme, iron-containing, oxygen-dependent enzymes involved in regioselective introduction of double bonds in fatty acyl aliphatic chains. They play an important role in the maintenance of the proper structure and functioning of biological membranes. Alkane hydroxylases are bacterial, integral-membrane di-iron enzymes that share a requirement for iron and oxygen for activity similar to that of membrane FADSs, and are involved in the initial oxidation of inactivated alkanes. Beta-carotene ketolase and beta-carotene hydroxylase are carotenoid biosynthetic enzymes for astaxanthin and zeaxanthin, respectively. This superfamily domain has extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. Comparison of these sequences also reveals three regions of conserved histidine cluster motifs that contain eight histidine residues: HXXX(X)H, HXX(X)HH, and HXXHH (an additional conserved histidine residue is seen between clusters 2 and 3). Spectroscopic and genetic evidence point to a nitrogen-rich coordination environment located in the cytoplasm with as many as eight histidines coordinating the two iron ions and a carboxylate residue bridging the two metals in the Pseudomonas oleovorans alkane hydroxylase (AlkB). In addition, the eight histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within the rat stearoyl CoA delta-9 desaturase.	122
238512	cd01061	RNase_T2_euk	Ribonuclease T2 (RNase T2) is a widespread family of secreted RNases found in every organism examined thus far.  This family includes RNase Rh, RNase MC1, RNase LE, and self-incompatibility RNases (S-RNases).  Plant T2 RNases are expressed during leaf senescence in order to scavenge phosphate from ribonucleotides. They are also expressed in response to wounding or pathogen invasion. S-RNases are thought to prevent self-fertilization by acting as selective cytotoxins of "self" pollen. Generally, RNases have two distinct binding sites: the primary site (B1 site) and the subsite (B2 site), for nucleotides located at the 5'- and 3'- terminal ends of the scissile bond, respectively. This CD includes the eukaryotic RNase T2 family members.	195
238513	cd01062	RNase_T2_prok	Ribonuclease T2 (RNase T2) is a widespread family of secreted RNases found in every organism examined thus far.  This family includes RNase Rh, RNase MC1, RNase LE, and self-incompatibility RNases (S-RNases).  Plant T2 RNases are expressed during leaf senescence in order to scavenge phosphate from ribonucleotides. They are also expressed in response to wounding or pathogen invasion. S-RNases are thought to prevent self-fertilization by acting as selective cytotoxins of "self" pollen. Generally, RNases have two distinct binding sites: the primary site (B1 site) and the subsite (B2 site), for nucleotides located at the 5'- and 3'- terminal ends of the scissile bond, respectively. This CD includes the prokaryotic RNase T2 family members.	184
133443	cd01065	NAD_bind_Shikimate_DH	NAD(P) binding domain of Shikimate dehydrogenase. Shikimate dehydrogenase (DH) is an amino acid DH family member. Shikimate pathway links metabolism of carbohydrates to de novo biosynthesis of aromatic amino acids, quinones and folate. It is essential in plants, bacteria, and fungi but absent in mammals, thus making enzymes involved in this pathway ideal targets for broad spectrum antibiotics and herbicides. Shikimate DH catalyzes the reduction of 3-dehydroshikimate to shikimate using the cofactor NADH. Amino acid DH-like NAD(P)-binding domains are members of the Rossmann fold superfamily and include glutamate, leucine, and phenylalanine DHs, methylene tetrahydrofolate DH, methylene-tetrahydromethanopterin DH, methylene-tetrahydrofolate DH/cyclohydrolase, Shikimate DH-like proteins, malate oxidoreductases, and glutamyl tRNA reductase. Amino acid DHs catalyze the deamination of amino acids to keto acids with NAD(P)+ as a cofactor. The NAD(P)-binding Rossmann fold superfamily includes a wide variety of protein families including NAD(P)- binding domains of alcohol DHs, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate DH, lactate/malate DHs, formate/glycerate DHs, siroheme synthases, 6-phosphogluconate DHs, amino acid DHs, repressor rex, NAD-binding potassium channel  domain, CoA-binding, and ornithine cyclodeaminase-like domains. These domains have an alpha-beta-alpha configuration. NAD binding involves numerous hydrogen and van der Waals contacts.	155
238514	cd01066	APP_MetAP	A family including aminopeptidase P, aminopeptidase M, and prolidase. Also known as metallopeptidase family M24. This family of enzymes is able to cleave amido-, imido- and amidino-containing bonds. Members exhibit relatively narrow substrate specificity compared to other metallo-aminopeptidases, suggesting they play roles in regulation of biological processes rather than general protein degradation.	207
381255	cd01067	Globin-like	Globin-like protein superfamily. This globin-like domain superfamily contains a wide variety of all-helical proteins that bind porphyrins, phycobilins, and other non-heme cofactors, and play various roles in all three kingdoms of life, including sensors or transporters of oxygen. It includes the M/myoglobin-like, S/sensor globin, and T/truncated globin (TrHb) families, and the phycobiliproteins (PBPs). The M family includes chimeric (FHbs/flavohemoglobins) and single-domain globins: FHbs, Ngbs/neuroglobins, Cygb/cytoglobins, GbE/avian eye specific globin E, GbX/globin X, amphibian GbY/globin Y, Mb/myoglobin, HbA/hemoglobin-alpha, HbB/hemoglobin-beta, SDgbs/single-domain globins related to FHbs, and Adgb/androglobin. The S family includes GCS/globin-coupled sensors, Pgbs/protoglobins, and SSDgbs/sensor single domain globins. The T family is classified into three main groups: TrHb1s (N), TrHb2s (O) and TrHb3s (P). The M- and S families exhibit the canonical secondary structure of hemoglobins, a 3-over-3 alpha-helical sandwich structure (3/3 Mb-fold), built by eight alpha-helical segments (named A through H). For M family Adgbs, this globin domain is permuted, such that C-H are followed by A-B. The T family globins adopt a 2-on-2 alpha-helical sandwich structure, resulting from extensive and complex modifications of the canonical 3-on-3 alpha-helical sandwich that are distributed throughout the whole protein molecule. PBPs bind the linear tetrapyrrole chromophore, phycobilin, a prosthetic group chemically and metabolically related to iron protoporphyrin IX/protoheme. Examples of other globin-like domains which bind non-heme cofactors include those of the Bacillus anthracis sporulation inhibitors pXO1-118 and pXO2-61 which bind fatty acid and halide in vitro, and the globin-like domain of Bacillus subtilis RsbRA which is presumed to channel sensory input to the C-terminal sulfate transporter/ anti-sigma factor antagonist (STAT) domain. RsbRA is a component of the sigma B-activating stressosome, and a regulator of the RNA polymerase sigma factor subunit sigma (B).	119
381256	cd01068	globin_sensor	Globin sensor domain of globin-coupled-sensors (GCSs), protoglobins (Pgbs), and sensor single-domain globins (SSDgbs); S family. This family includes sensor domains which bind porphyrins, and other non-heme cofactors. GCSs have an N-terminal sensor domain coupled to a functional domain. For heme-bound oxygen sensing/binding globin domains, O2 binds to/dissociates from the heme iron complex inducing a structural change in the sensor domain, which is then transduced to the functional domain, switching on (or off) the function of the latter. Functional domains include DGC/GGDEF, EAL, histidine kinase, MCP, PAS, and GAF domains. Characterized members include Bacillus subtilis heme-based aerotaxis transducer (HemAT-Bs) which has a sensor domain coupled to an MCP domain. HemAT-Bs mediates an aerophilic response, and may control the movement direction of bacteria and archaea. Its MCP domain interacts with the CheA histidine kinase, a component of the CheA/CheY signal transduction system that regulates the rotational direction of flagellar motors. Another GCS having the sensor domain coupled to an MCP domain is Caulobacter crescentus McpB. McpB is encoded by a gene which lies adjacent to the major chemotaxis operon. Like McpA (encoded on this operon), McpB has three potential methylation sites, a C-terminal CheBR docking motif, and a motif needed for proteolysis via a ClpX-dependent pathway during the swarmer-to-stalked cell transition. Also included is Geobacter sulfurreducens GCS, a GCS of unknown function, in which the sensor domain is coupled to a transmembrane signal-transduction domain. Pgbs are single-domain globins of unknown function. Methanosarcina acetivorans Pgbs is dimeric and has an N-terminal extension, which together with other Pgb-specific loops, buries the heme within the protein; small ligand molecules gain access to the heme via two orthogonal apolar tunnels. Pgbs and other single-domain globins can function as sensors, when coupled to an appropriate regulator domain.	146
270231	cd01069	PBP2_PheC	Cyclohexadienyl dehydratase, a member of the type 2 periplasmic binding fold protein superfamily. This subfamily includes cyclohexadienyl dehydratase PheC. These proteins catalyze the decarboxylation of prephenate to phenylpyruvate in the alternative phenylalanine biosynthesis pathway in some proteobacteria and archaea.  The PheC proteins belong to the PBPII superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.  Since the PheC proteins are so similar to periplasmic binding proteins (PBP), it is evolutionarily plausible that several pre-existing PBP proteins might have been recruited to perform the enzymatic function.	232
270232	cd01071	PBP2_PhnD_like	Substrate binding domain of phosphonate uptake system-like, a member of the type 2 periplasmic-binding fold superfamily. This family includes alkylphosphonate binding domain PhnD. These domains are found in PhnD-like proteins that are predicted to function as initial receptors in hypophosphite, phosphonate, or phosphate ABC transport in archaea and eubacteria. PhnD is the periplasmic binding component of an ABC-type phosphonate uptake system (PhnCDE) that recognizes and binds phosphonate. PhnD belongs to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. The PBP2 have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	253
270233	cd01072	PBP2_SMa0082_like	The substrate-binding domain of putative amino acid transporter; the type 2 periplasmic binding protein fold. This group includes the periplasmic-binding protein component of a putative amino acid ABC transporter from Sinorhizobium meliloti and its related proteins. The putative SMa0082-like domain belongs to the type 2 periplasmic binding protein fold superfamily (PBP2), whose many members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	238
133444	cd01075	NAD_bind_Leu_Phe_Val_DH	NAD(P) binding domain of leucine dehydrogenase, phenylalanine dehydrogenase, and valine dehydrogenase. Amino acid dehydrogenase (DH) is a widely distributed family of enzymes that catalyzes the oxidative deamination of an amino acid to its keto acid and ammonia with concomitant reduction of NADP+. For example, leucine DH catalyzes the reversible oxidative deamination of L-leucine and several other straight or branched chain amino acids to the corresponding 2-oxoacid derivative. Amino acid DH -like NAD(P)-binding domains are members of the Rossmann fold superfamily and include glutamate, leucine, and phenylalanine DHs, methylene tetrahydrofolate DH, methylene-tetrahydromethanopterin DH, methylene-tetrahydropholate DH/cyclohydrolase, Shikimate DH-like proteins, malate oxidoreductases, and glutamyl tRNA reductase. Amino acid DHs catalyze the deamination of amino acids to keto acids with NAD(P)+ as a cofactor. The NAD(P)-binding Rossmann fold superfamily includes a wide variety of protein families including NAD(P)- binding domains of alcohol DHs, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate DH, lactate/malate DHs, formate/glycerate DHs, siroheme synthases, 6-phosphogluconate DH, amino acid DHs, repressor rex, NAD-binding potassium channel  domain, CoA-binding, and ornithine cyclodeaminase-like domains. These domains have an alpha-beta-alpha configuration. NAD binding involves numerous hydrogen and van der Waals contacts.	200
133445	cd01076	NAD_bind_1_Glu_DH	NAD(P) binding domain of glutamate dehydrogenase, subgroup 1. Amino acid dehydrogenase (DH) is a widely distributed family of enzymes that catalyzes the oxidative deamination of an amino acid to its keto acid and ammonia with concomitant reduction of NADP+. Glutamate DH is a multidomain enzyme that catalyzes the reaction from glutamate to 2-oxoglutarate and ammonia in the presence of NAD or NADP. It is present in all organisms. Enzymes involved in ammonia assimilation are typically NADP+-dependent, while those involved in glutamate catabolism are generally NAD+-dependent. Amino acid DH-like NAD(P)-binding domains are members of the Rossmann fold superfamily and include glutamate, leucine, and phenylalanine DHs, methylene tetrahydrofolate DH, methylene-tetrahydromethanopterin DH, methylene-tetrahydrofolate DH/cyclohydrolase, Shikimate DH-like proteins, malate oxidoreductases, and glutamyl tRNA reductase. Amino acid DHs catalyze the deamination of amino acids to keto acids with NAD(P)+ as a cofactor. The NAD(P)-binding Rossmann fold superfamily includes a wide variety of protein families including NAD(P)-binding domains of alcohol DHs, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate DH, lactate/malate DHs, formate/glycerate DHs, siroheme synthases, 6-phosphogluconate DH, amino acid DHs, repressor rex, NAD-binding potassium channel domain, CoA-binding, and ornithine cyclodeaminase-like domains. These domains have an alpha-beta-alpha configuration. NAD binding involves numerous hydrogen and van der Waals contacts.	227
133446	cd01078	NAD_bind_H4MPT_DH	NADP binding domain of methylene tetrahydromethanopterin dehydrogenase. Methylene Tetrahydromethanopterin Dehydrogenase (H4MPT DH) NADP binding domain. NADP-dependent H4MPT DH catalyzes the dehydrogenation of methylene-H4MPT and methylene-tetrahydrofolate (H4F) with NADP+ as cofactor. H4F and H4MPT are both cofactors that carry the one-carbon units between the formyl and methyl oxidation level. H4F and H4MPT are structurally analogous to each other with respect to the pterin moiety, but each has a distinct side chain. H4MPT is present only in anaerobic methanogenic archaea and aerobic methylotrophic proteobacteria. H4MPT seems to have evolved independently from H4F and functions as a distinct carrier in C1 metabolism. Amino acid DH-like NAD(P)-binding domains are members of the Rossmann fold superfamily and include glutamate, leucine, and phenylalanine DHs, methylene tetrahydrofolate DH, methylene-tetrahydromethanopterin DH, methylene-tetrahydrofolate DH/cyclohydrolase, Shikimate DH-like proteins, malate oxidoreductases, and glutamyl tRNA reductase. Amino acid DHs catalyze the deamination of amino acids to keto acids with NAD(P)+ as a cofactor. The NAD(P)-binding Rossmann fold superfamily includes a wide variety of protein families including NAD(P)-binding domains of alcohol DHs, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate DH, lactate/malate DHs, formate/glycerate DHs, siroheme synthases, 6-phosphogluconate DH, amino acid DHs, repressor rex, NAD-binding potassium channel domain, CoA-binding, and ornithine cyclodeaminase-like domains. These domains have an alpha-beta-alpha configuration. NAD binding involves numerous hydrogen and van der Waals contacts.	194
133447	cd01079	NAD_bind_m-THF_DH	NAD binding domain of methylene-tetrahydrofolate dehydrogenase. The NAD-binding domain of methylene-tetrahydrofolate dehydrogenase (m-THF DH). M-THF is a versatile carrier of activated one-carbon units. The major one-carbon folate donors are N-5 methyltetrahydrofolate, N5,N10-m-THF, and N10-formyltetrahydrofolate. The oxidation of the metabolic intermediate m-THF to 5,10-methenyl-THF requires the enzyme m-THF DH. M-THF DH is a component of an unusual monofunctional enzyme; in eukaryotes, m-THF DH is typically found as part of a multifunctional protein. NADP-dependent m-THF DHs in mammals, birds and yeast are components of a trifunctional enzyme with DH, cyclohydrolase, and synthetase activities. Certain eukaryotic cells also contain a homodimeric bifunctional DH/cyclohydrolase form. In bacteria, monofunctional DH, as well as bifunctional DH/cyclohydrolase, are found. In addition, yeast (S. cerevisiae) also expresses a monofunctional DH. This family contains only the monofunctional DHs from S. cerevisiae and certain bacteria. M-THF DH, like other amino acid DH-like NAD(P)-binding domains, is a member of the Rossmann fold superfamily which includes glutamate, leucine, and phenylalanine DHs, m-THF DH, methylene-tetrahydromethanopterin DH, m-THF DH/cyclohydrolase, Shikimate DH-like proteins, malate oxidoreductases, and glutamyl tRNA reductase. Amino acid DHs catalyze the deamination of amino acids to keto acids with NAD(P)+ as a cofactor. The NAD(P)-binding Rossmann fold superfamily includes a wide variety of protein families including NAD(P)-binding domains of alcohol DHs, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate DH, lactate/malate DHs, formate/glycerate DHs, siroheme synthases, 6-phosphogluconate DH, amino acid DHs, repressor rex, NAD-binding potassium channel domain, CoA-binding, and ornithine cyclodeaminase-like domains. These domains have an alpha-beta-alpha configuration. NAD binding involves numerous hydrogen and van der Waals contacts.	197
133448	cd01080	NAD_bind_m-THF_DH_Cyclohyd	NADP binding domain of methylene-tetrahydrofolate dehydrogenase/cyclohydrolase. NADP binding domain of the Methylene-Tetrahydrofolate Dehydrogenase/cyclohydrolase (m-THF DH/cyclohydrolase) bifunctional enzyme. Tetrahydrofolate is a versatile carrier of activated one-carbon units. The major one-carbon folate donors are N-5 methyltetrahydrofolate, N5,N10-m-THF, and N10-formyltetrahydrofolate. The oxidation of the metabolic intermediate m-THF to 5,10-methenyl-THF requires the enzyme m-THF DH. In addition, most DHs also have an associated cyclohydrolase activity which catalyzes its hydrolysis to N10-formyltetrahydrofolate. m-THF DH is typically found as part of a multifunctional protein in eukaryotes. NADP-dependent m-THF DHs in mammals, birds and yeast are components of a trifunctional enzyme with DH, cyclohydrolase, and synthetase activities. Certain eukaryotic cells also contain a homodimeric bifunctional DH/cyclohydrolase form. In bacteria, monofunctional DH, as well as bifunctional DH/cyclohydrolase, are found. In addition, yeast (S. cerevisiae) also expresses a monofunctional DH. This family contains the bifunctional DH/cyclohydrolase. M-THF DH, like other amino acid DH-like NAD(P)-binding domains, is a member of the Rossmann fold superfamily which includes glutamate, leucine, and phenylalanine DHs, m-THF DH, methylene-tetrahydromethanopterin DH, m-THF DH/cyclohydrolase, Shikimate DH-like proteins, malate oxidoreductases, and glutamyl tRNA reductase. Amino acid DHs catalyze the deamination of amino acids to keto acids with NAD(P)+ as a cofactor. The NAD(P)-binding Rossmann fold superfamily includes a wide variety of protein families including NAD(P)-binding domains of alcohol DHs, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate DH, lactate/malate DHs, formate/glycerate DHs, siroheme synthases, 6-phosphogluconate DH, amino acid DHs, repressor rex, NAD-binding potassium channel domain, CoA-binding, and ornithine cyclodeaminase-like domains.	168
185695	cd01081	Aldose_epim	aldose 1-epimerase superfamily. Aldose 1-epimerases or mutarotases are key enzymes of carbohydrate metabolism; they catalyze the interconversion of the alpha- and beta-anomers of hexose sugars such as glucose and galactose. This interconversion is an important step that allows anomer specific metabolic conversion of sugars. Studies of the catalytic mechanism of the best known member of the family, galactose mutarotase, have shown a glutamate and a histidine residue to be critical for catalysis; the glutamate serves as the active site base to initiate the reaction by removing the proton from the C-1 hydroxyl group of the sugar substrate and the histidine as the active site acid to protonate the C-5 ring oxygen.	284
238517	cd01083	GAG_Lyase	Glycosaminoglycan (GAG) polysaccharide lyase family. This family consists of a group of secreted bacterial lyase enzymes capable of acting on glycosaminoglycans, such as hyaluronan and chondroitin, in the extracellular matrix of host tissues, contributing to the invasive capacity of the pathogen. These are broad-specificity glycosaminoglycan lyases which recognize uronyl residues in polysaccharides and cleave their glycosidic bonds via a beta-elimination reaction to form a double bond between C-4 and C-5 of the non-reducing terminal uronyl residues of released products. Substrates include chondroitin, chondroitin 4-sulfate, chondroitin 6-sulfate, and hyaluronic acid. Family members include chondroitin AC lyase, chondroitin ABC lyase, xanthan lyase, and hyaluronate lyase.	693
238518	cd01085	APP	X-Prolyl Aminopeptidase 2. E.C. 3.4.11.9. Also known as X-Pro aminopeptidase, proline aminopeptidase, aminopeptidase P, and aminoacylproline aminopeptidase. Catalyses release of any N-terminal amino acid, including proline, that is linked with proline, even from a dipeptide or tripeptide.	224
238519	cd01086	MetAP1	Methionine Aminopeptidase 1. E.C. 3.4.11.18. Also known as methionyl aminopeptidase and Peptidase M. Catalyzes release of N-terminal amino acids, preferentially methionine, from peptides and arylamides.	238
238520	cd01087	Prolidase	Prolidase. E.C. 3.4.13.9. Also known as Xaa-Pro dipeptidase, X-Pro dipeptidase, proline dipeptidase, imidodipeptidase, peptidase D, and gamma-peptidase. Catalyses hydrolysis of Xaa-Pro dipeptides; also acts on aminoacyl-hydroxyproline analogs. No action on Pro-Pro.	243
238521	cd01088	MetAP2	Methionine Aminopeptidase 2. E.C. 3.4.11.18. Also known as methionyl aminopeptidase and peptidase M. Catalyzes release of N-terminal amino acids, preferentially methionine, from peptides and arylamides.	291
238522	cd01089	PA2G4-like	Related to aminopeptidase M, this family contains proliferation-associated protein 2G4. Family members have been implicated in cell cycle control.	228
238523	cd01090	Creatinase	Creatine amidinohydrolase. E.C.3.5.3.3. Hydrolyzes creatine to sarcosine and urea.	228
238524	cd01091	CDC68-like	Related to aminopeptidase P and aminopeptidase M, a member of this domain family is present in cell division control protein 68, a transcription factor.	243
238525	cd01092	APP-like	Similar to Prolidase and Aminopeptidase P. The members of this subfamily presumably catalyse hydrolysis of Xaa-Pro dipeptides and/or release of any N-terminal amino acid, including proline, that is linked with proline.	208
238526	cd01093	CRIB_PAK_like	PAK (p21 activated kinase) Binding Domain (PBD), binds Cdc42p- and/or Rho-like small GTPases; also known as the Cdc42/Rac interactive binding (CRIB) motif; has been shown to inhibit transcriptional activation and cell transformation mediated by the Ras-Rac pathway. This subgroup of CRIB/PBD-domains is found N-terminal of Serine/Threonine kinase domains in PAK and PAK-like proteins.	46
238527	cd01094	Alkanesulfonate_monoxygenase	Alkanesulfonate monoxygenase is the monoxygenase of a two-component system that catalyzes the conversion of alkanesulfonates to the corresponding aldehyde and sulfite. Alkanesulfonate monoxygenase (SsuD) has an absolute requirement for reduced flavin mononucleotide (FMNH2), which is provided by the NADPH-dependent FMN oxidoreductase (SsuE).	244
238528	cd01095	Nitrilotriacetate_monoxgenase	Nitrilotriacetate monoxygenase oxidizes nitrilotriacetate utilizing reduced flavin mononucleotide (FMNH2) and oxygen. The FMNH2 is provided by an NADH:flavin mononucleotide (FMN) oxidoreductase that uses NADH to reduce FMN to FMNH2.	358
238529	cd01096	Alkanal_monooxygenase	Alkanal monooxygenases are flavin monooxygenases. Molecular oxygen is activated by reaction with reduced flavin mononucleotide (FMNH2) and reacts with an aldehyde to yield the carboxylic acid, oxidized flavin (FMN) and a blue-green light. Bacterial luciferases are heterodimers made of alpha and beta subunits which are homologous. The single active center is on the alpha subunit. The alpha subunit has a stretch of 30 amino acid residues that is not present in the beta subunit. The beta subunit does not contain the active site and is required for the formation of the fully active heterodimer. The beta subunit does not contribute anything directly to the active site. Its role is probably to stabilize the high quantum yield conformation of the alpha subunit through interactions across the subunit interface.	315
238530	cd01097	Tetrahydromethanopterin_reductase	N5,N10-methylenetetrahydromethanopterin reductase (Mer) catalyzes the reduction of N5,N10-methylenetetrahydromethanopterin with reduced coenzyme F420 to N5-methyltetrahydromethanopterin and oxidized coenzyme F420.	202
238531	cd01098	PAN_AP_plant	Plant PAN/APPLE-like domain; present in plant S-receptor protein kinases and secreted glycoproteins. PAN/APPLE domains fulfill diverse biological functions by mediating protein-protein or protein-carbohydrate interactions. S-receptor protein kinases and S-locus glycoproteins are involved in sporophytic self-incompatibility response in Brassica, one of probably many molecular mechanisms, by which hermaphrodite flowering plants avoid self-fertilization.	84
238532	cd01099	PAN_AP_HGF	Subfamily of PAN/APPLE-like domains; present in N-terminal (N) domains of plasminogen/hepatocyte growth factor proteins, and various proteins found in Bilateria, such as leech anti-platelet proteins. PAN/APPLE domains fulfill diverse biological functions by mediating protein-protein or protein-carbohydrate interactions.	80
238533	cd01100	APPLE_Factor_XI_like	Subfamily of PAN/APPLE-like domains; present in plasma prekallikrein/coagulation factor XI, microneme antigen proteins, and a few prokaryotic proteins. PAN/APPLE domains fulfill diverse biological functions by mediating protein-protein or protein-carbohydrate interactions.	73
238534	cd01102	Link_Domain	The link domain is a hyaluronan (HA)-binding domain. It functions to mediate adhesive interactions during inflammatory leukocyte homing and tumor metastasis. It is found in the CD44 receptor and in human TSG-6. TSG-6 is the protein product of the tumor necrosis factor-stimulated gene-6. TSG-6 has a strong anti-inflammatory effect in models of acute inflammation and autoimmune arthritis and plays an essential role in female fertility. This group also contains the link domains of the chondroitin sulfate proteoglycan core proteins (CSPG) including aggrecan, versican, neurocan, and brevican and the link domains of the vertebrate HAPLN (HA and proteoglycan binding link) protein family. In cartilage, aggrecan forms cartilage link protein stabilized aggregates with HA. These aggregates contribute to the tissue's load bearing properties. Aggregates in which other CSPGs substitute for aggrecan might contribute to the structural integrity of many different tissues. Members of the vertebrate HAPLN gene family are physically linked adjacent to CSPG genes. TSG-6 contains a single link module which supports high affinity binding with HA. The functional HA-binding domain of CD44 is an extended domain comprised of a link module flanked with N- and C-extensions. These extensions are essential for folding and functional activity. CSPGs are characterized by an N-terminal globular domain (G1 domain) containing two contiguous link modules (modules 1 and 2). Both link modules of the G1 domain of the CSPG aggrecan are involved in interaction with HA. Aggrecan in addition contains a second globular domain (G2) which contains link modules 3 and 4 which lack HA-binding activity. HAPLNs contain two contiguous link modules.	92
133379	cd01104	HTH_MlrA-CarA	Helix-Turn-Helix DNA binding domain of the transcription regulators MlrA and CarA. Helix-turn-helix (HTH) transcription regulator MlrA (merR-like regulator A), N-terminal domain. The MlrA protein, also known as YehV, has been shown to control cell-cell aggregation by co-regulating the expression of curli and extracellular matrix production in Escherichia coli and Salmonella typhimurium.  Its close homolog, CarA from Myxococcus xanthus, is involved in activation of the carotenoid biosynthesis genes by light. These proteins belong to the MerR superfamily of transcription regulators that promote expression of several stress regulon genes by reconfiguring the spacer between the -35 and -10 promoter elements. Their conserved N-terminal domains contain predicted HTH motifs that mediate DNA binding, while the dissimilar C-terminal domains bind specific coactivator molecules. Many MlrA- and CarA-like proteins in this group appear to lack the long dimerization helix seen in the N-terminal domains of typical MerR-like proteins.	68
133380	cd01105	HTH_GlnR-like	Helix-Turn-Helix DNA binding domain of GlnR-like transcription regulators. Helix-turn-helix (HTH) transcription regulator GlnR and related proteins, N-terminal domain. The GlnR and TnrA (also known as ScgR) proteins have been shown to regulate expression of glutamine synthetase as well as several genes involved in nitrogen metabolism. These proteins share the N-terminal DNA binding domain with other transcription regulators of the MerR superfamily that promote transcription by reconfiguring the spacer between the -35 and -10 promoter elements.  A typical MerR regulator is comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their conserved N-terminal domains contain predicted winged HTH motifs that mediate DNA binding, while the dissimilar C-terminal domains bind specific coactivator molecules.	88
133381	cd01106	HTH_TipAL-Mta	Helix-Turn-Helix DNA binding domain of the transcription regulators TipAL, Mta, and SkgA. Helix-turn-helix (HTH) TipAL, Mta, and SkgA transcription regulators, and related proteins, N-terminal domain. TipAL regulates resistance to and activation by numerous cyclic thiopeptide antibiotics, such as thiostrepton. Mta is a global transcriptional regulator; the N-terminal DNA-binding domain of Mta interacts directly with the promoters of mta, bmr, blt, and ydfK, and induces transcription of these multidrug-efflux transport genes. SkgA has been shown to control stationary-phase expression of catalase-peroxidase in Caulobacter crescentus. These proteins are comprised of distinct domains that harbor an  N-terminal active (DNA-binding) site and a regulatory (effector-binding) site. The conserved N-terminal domain of these transcription regulators contains winged HTH motifs that mediate DNA binding. These proteins share the N-terminal DNA binding domain with other transcription regulators of the MerR superfamily that promote transcription by reconfiguring the spacer between the -35 and -10 promoter elements. Unique to this family, is a TipAL-like, lineage specific Bacilli subgroup, which has five conserved cysteines in the C-terminus of the protein.	103
133382	cd01107	HTH_BmrR	Helix-Turn-Helix DNA binding domain of the BmrR transcription regulator. Helix-turn-helix (HTH) multidrug-efflux transporter transcription regulator, BmrR and YdfL of Bacillus subtilis, and related proteins; N-terminal domain. Bmr is a membrane protein which causes the efflux of a variety of toxic substances and antibiotics. BmrR is comprised of two distinct domains that harbor a regulatory (effector-binding) site and an active (DNA-binding) site. The conserved N-terminal domain contains a winged HTH motif  that mediates DNA binding, while the C-terminal domain binds coactivating, toxic compounds. BmrR shares the N-terminal DNA binding domain with other transcription regulators of the MerR superfamily that promote transcription by reconfiguring the spacer between the -35 and -10 promoter elements.	108
133383	cd01108	HTH_CueR	Helix-Turn-Helix DNA binding domain of CueR-like transcription regulators. Helix-turn-helix (HTH) transcription regulators CueR and ActP, copper efflux regulators. In Bacillus subtilis, copper induced CueR regulates the copZA operon, preventing copper toxicity. In Rhizobium leguminosarum, ActP controls copper homeostasis; it detects cytoplasmic copper stress and activates transcription in response to increasing copper concentrations. These proteins are comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their conserved N-terminal domains contain winged HTH motifs that mediate DNA binding, while the C-terminal domains have two conserved cysteines that define a monovalent copper ion binding site. These proteins share the N-terminal DNA binding domain with other transcription regulators of the MerR superfamily that promote transcription by reconfiguring the spacer between the -35 and -10 promoter elements.	127
133384	cd01109	HTH_YyaN	Helix-Turn-Helix DNA binding domain of the MerR-like transcription regulators YyaN and YraB. Putative helix-turn-helix (HTH) MerR-like transcription regulators of Bacillus subtilis, YyaN and YraB, and related proteins; N-terminal domain. Based on sequence similarity, these proteins are predicted to function as transcription regulators that mediate responses to stress in eubacteria. They belong to the MerR superfamily of transcription regulators that promote transcription of various stress regulons by reconfiguring the operator sequence located between the -35 and -10 promoter elements. A typical MerR regulator is comprised of distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their N-terminal domains are homologous and contain a DNA-binding winged HTH motif, while the C-terminal domains are often dissimilar and bind specific coactivator molecules such as metal ions, drugs, and organic substrates.	113
133385	cd01110	HTH_SoxR	Helix-Turn-Helix DNA binding domain of the SoxR transcription regulator. Helix-turn-helix (HTH) transcriptional regulator SoxR. The global regulator, SoxR, up-regulates gene expression of another transcription activator, SoxS, which directly stimulates the oxidative stress regulon genes in E. coli. The soxRS response renders the bacterial cell resistant to superoxide-generating agents, macrophage-generated nitric oxide, organic solvents, and antibiotics. The SoxR proteins share the N-terminal DNA binding domain with other transcription regulators of the MerR superfamily that promote transcription by reconfiguring the unusually long spacer between the -35 and -10 promoter elements. They also harbor a regulatory C-terminal domain containing an iron-sulfur center.	139
133386	cd01111	HTH_MerD	Helix-Turn-Helix DNA binding domain of the MerD transcription regulator. Helix-turn-helix (HTH) transcription regulator MerD. The putative secondary regulator of mercury resistance (mer) operons, MerD, has been shown to down-regulate the expression of this operon in gram-negative bacteria. It binds to the same operator DNA as MerR that activates transcription of the operon in the presence of mercury ions. The MerD protein shares the N-terminal DNA binding domain with other transcription regulators of the MerR superfamily, which promote transcription by reconfiguring the spacer between the -35 and -10 promoter elements. A typical MerR regulator is comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their N-terminal domains are conserved and contain predicted winged HTH motifs that mediate DNA binding, while the dissimilar C-terminal domains bind specific coactivator molecules such as metal ions, drugs, and organic substrates.	107
238535	cd01115	SLC13_permease	Permease SLC13 (solute carrier 13). The sodium/dicarboxylate cotransporter NaDC-1 has been shown to translocate Krebs cycle intermediates such as succinate, citrate, and alpha-ketoglutarate across plasma membranes in rabbit, human, and rat kidney. It is related to renal and intestinal Na+/sulfate cotransporters and a few putative bacterial permeases. The SLC13-type proteins belong to the ArsB/NhaD superfamily of permeases that translocate sodium and various anions across biological membranes in all three kingdoms of life. A typical ArsB/NhaD permease is composed of 8-13 transmembrane helices.	382
238536	cd01116	P_permease	Permease P (pink-eyed dilution). Mutations in the human melanosomal P gene were responsible for the classic phenotype of oculocutaneous albinism type 2 (OCA2). Although the precise function of the P protein is unknown, it was predicted to regulate the intraorganelle pH, together with the ATP-driven proton pump. It shows significant sequence similarity to the Na+/H+ antiporter NhaD from Vibrio parahaemolyticus. Both proteins belong to the ArsB/NhaD superfamily of permeases that translocate sodium, arsenate, sulfate, and organic anions across biological membranes in all three kingdoms of life. A typical ArsB/NhaD permease contains 8-13 transmembrane domains.	413
238537	cd01117	YbiR_permease	Putative anion permease YbiR.  Based on sequence similarity, YbiR proteins are predicted to function as anion translocating permeases in eubacteria, archaea and plants. They belong to ArsB/NhaD superfamily of permeases that have been shown to translocate sodium, sulfate, arsenite and organic anions. A typical ArsB/NhaD permease is composed of 8-13 transmembrane domains.	384
238538	cd01118	ArsB_permease	Anion permease ArsB. These permeases have been shown to export arsenite and antimonite in eubacteria and archaea. A typical ArsB permease contains 8-13 transmembrane helices and can function either independently as a chemiosmotic transporter or as a channel-forming subunit of an ATP-driven anion pump (ArsAB). The ArsAB complex is similar in many ways to ATP-binding cassette transporters, which have two groups of six transmembrane-spanning helical segments and two nucleotide-binding domains. The ArsB proteins belong to the ArsB/NhaD superfamily of permeases that translocate sodium, arsenate, sulfate, and organic anions across biological membranes in all three kingdoms of life.	416
238539	cd01119	Chemokine_CC_DCCL	Chemokine_CC_DCCL: subgroup of the Chemokine_CC subgroup based on the presence of a DCCL motif involving the two N-terminal cysteine residues; includes a number of small inducible cytokines capable of reversibly inhibiting normal hematopoietic progenitor proliferation by blocking progression through the cell cycle; DCCL subgroup contains Exodus-1 (also known as CCL20, MIP-3alpha, LARC, ST38 (mouse)), Exodus-2 (also known as CCL21, SLC, 6-Ckine, TCA4, CKbeta9), and Exodus-3 (also known as CCL-19, ELC, MIP-3beta, CKbeta11). Exodus-3 was shown to inhibit the growth of human breast cancer cells in vivo in a mouse model; Exodus-1, -2, and -3 were all shown to significantly inhibit chronic myelogenous leukemia progenitor cell proliferation; Exodus-2 and -3 show potent immunotherapeutic activity toward solid tumors; chemotactic for T cells, B cells, dendritic cells, macrophage progenitor cells, and NK cells; exist as monomers and dimers, but are believed to be functional as monomers; found only in vertebrates. See CDs: Chemokine_CC (cd00272) for the entire CC subgroup, Chemokine (cd00169) for the general alignment of chemokines, or Chemokine_CXC (cd00273), Chemokine_C (cd00271), and Chemokine_CX3C (cd00274) for the additional chemokine subgroups.	61
410865	cd01120	RecA-like_superfamily	RecA-like_NTPases. RecA-like NTPases. This superfamily includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. This group also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion.	119
410866	cd01121	RadA_SMS_N	bacterial RadA DNA repair protein. Sms or bacterial RadA is a DNA repair protein that plays a role in recombination and recombinational repair of DNA damaged by UV radiation, X-rays, and chemical agents and is responsible for the stabilization or processing of branched DNA molecules.	268
410867	cd01122	Twinkle_C	C-terminal domain of Twinkle. Twinkle (T7 gp4-like protein with intramitochondrial nucleoid localization, also known as C10orf2, PEO1, SCA8, ATXN8, IOSCA, PEOA3 or SANDO) is a homohexameric DNA helicase which unwinds short stretches of double-stranded DNA in the 5' to 3' direction and, along with mitochondrial single-stranded DNA binding protein and mtDNA polymerase gamma, is thought to play a key role in mtDNA replication. Mutations in the human gene cause infantile onset spinocerebellar ataxia (IOSCA) and progressive external ophthalmoplegia (PEO) and are also associated with several mitochondrial depletion syndromes. This group also contains viral GP4-like and related bacterial helicases.	266
410868	cd01123	Rad51_DMC1_archRadA	recombinase Rad51, DMC1, and archaeal RadA. This group of recombinases includes the eukaryotic proteins RAD51, RAD55/57 and the meiosis-specific protein DMC1, and the archaeal protein RadA. They are closely related to the bacterial RecA group. Rad51 proteins catalyze a similar recombination reaction as RecA, using ATP-dependent DNA binding activity and a DNA-dependent ATPase. However, this reaction is less efficient and requires accessory proteins such as RAD55/57.	234
410869	cd01124	KaiC-like	Circadian Clock Protein KaiC. KaiC is a circadian clock protein, most studied in cyanobacteria.  KaiC, an autokinase, autophosphatase, and ATPase, is part of the core oscillator, composed of three proteins: KaiA, KaiB, and KaiC. The circadian oscillation is regulated via KaiC phosphorylation.	222
410870	cd01125	RepA_RSF1010_like	Hexameric Replicative Helicase RepA of plasmid RSF1010 and related proteins. This family includes the homo-hexameric replicative helicase RepA encoded by plasmid RSF1010. RSF1010 is found in most Gram-negative bacteria and some Gram-positive bacteria. The RepA protein of Plasmid RSF1010 is a 5'-3' DNA helicase which can utilize ATP, dATP, GTP and dGTP (and CTP and dCTP to a lesser extent).	238
410871	cd01127	TrwB_TraG_TraD_VirD4	TrwB/TraG/TraD/VirD4 family of bacterial conjugation proteins. The TraG/TraD/VirD4 family are bacterial conjugation proteins involved in type IV secretion (T4S) systems, versatile bacterial secretion systems mediating transport of protein and/or DNA. They are present in gram-negative and gram-positive bacteria, as well as archaea. They form hexameric rings and belong to the RecA-like NTPases superfamily, which also includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases.	144
410872	cd01128	rho_factor_C	C-terminal ATP binding domain of transcription termination factor rho. Transcription termination factor rho is a bacterial ATP-dependent RNA/DNA helicase. It is a homohexamer. Each monomer consists of an N-terminal oligonucleotide/oligosaccharide binding fold (OB-fold) domain which binds cytosine-rich nucleotides, and a C-terminal ATP binding domain. This alignment is of the C-terminal ATP binding domain.	249
410873	cd01129	PulE-GspE-like	PulE-GspE family. PulE and General secretory pathway protein GspE are ATPases of the type II secretory pathway, the main terminal branch of the general secretory pathway (GSP). PulE is a cytoplasmic protein of the GSP, which contains an ATP binding site and a tetracysteine motif. This subgroup also includes PilB, a type IV pilus assembly ATPase, DotB, an ATPase of the type IVb secretion system, also known as the dot/icm system, Escherichia coli IncI plasmid-encoded conjugative transfer ATPase TraJ, and HofB.	159
410874	cd01130	VirB11-like_ATPase	Type IV secretory pathway component VirB11-like. Type IV secretory pathway component VirB11, and related ATPases. The homohexamer, VirB11 is one of eleven Vir (virulence) proteins, which are required for T-pilus biogenesis and virulence in the transfer of T-DNA from the bacterial Ti (tumor-inducing)-plasmid into plant cells. The pilus is a fibrous cell surface organelle, which mediates adhesion between bacteria during conjugative transfer or between bacteria and host eukaryotic cells during infection. VirB11-related ATPases include Sulfolobus acidocaldarius FlaI, which plays key roles in archaellum (archaeal flagellum) assembly and motility functions, and the pilus assembly proteins CpaF/TadA and TrbB. This alignment contains the C-terminal domain, which is the ATPase.	177
410875	cd01131	PilT	Pilus retraction ATPase PilT. Pilus retraction ATPase PilT is a nucleotide-binding protein responsible for the retraction of type IV pili, likely by pili disassembly. This retraction provides the force required for travel of bacteria in low water environments by a mechanism known as twitching motility.	223
410876	cd01132	F1-ATPase_alpha_CD	F1 ATP synthase alpha subunit, central domain. The F-ATPase is found in bacterial plasma membranes, mitochondrial inner membranes and in chloroplast thylakoid membranes. It has also been found in the archaea Methanosarcina barkeri. It uses a proton gradient to drive ATP synthesis and hydrolyzes ATP to build the proton gradient. The mitochondrial extrinsic membrane domain, F1, is composed of alpha, beta, gamma, delta and epsilon subunits with a stoichiometry of 3:3:1:1:1. The alpha subunit of the F1 ATP synthase can bind nucleotides, but is non-catalytic. Alpha and beta subunits form the globular catalytic moiety, a hexameric ring of alternating alpha and beta subunits. Gamma, delta and epsilon subunits form a stalk, connecting F1 to F0, the integral membrane proton-translocating domain.	274
410877	cd01133	F1-ATPase_beta_CD	F1 ATP synthase beta subunit, central domain. The F-ATPase is found in bacterial plasma membranes, mitochondrial inner membranes and in chloroplast thylakoid membranes. It has also been found in the archaea Methanosarcina barkeri. It uses a proton gradient to drive ATP synthesis and hydrolyzes ATP to build the proton gradient. The mitochondrial extrinsic membrane domain, F1,  is composed of alpha, beta, gamma, delta and epsilon subunits with a stoichiometry of 3:3:1:1:1. The beta subunit of ATP synthase is catalytic. Alpha and beta subunits form the globular catalytic moiety, a hexameric ring of alternating alpha and beta subunits. Gamma, delta and epsilon subunits form a stalk, connecting F1 to F0, the integral membrane proton-translocating domain.	277
410878	cd01134	V_A-ATPase_A	V/A-type ATP synthase catalytic subunit A. V/A-type ATP synthase catalytic subunit A. These ATPases couple ATP hydrolysis to the build up of a H+ gradient, but V-type ATPases do not catalyze the reverse reaction. Vacuolar (V-type) ATPases play major roles in endomembrane and plasma membrane proton transport in eukaryotes. They are found in multiple intracellular membranes including vacuoles, endosomes, lysosomes, Golgi-derived vesicles, secretory vesicles, as well as the plasma membrane. Archaea have a protein which is similar in sequence to V-ATPases, but functions like an F-ATPase (called A-ATPase).  A similar protein is also found in a few bacteria.	288
410879	cd01135	V_A-ATPase_B	V/A-type ATP synthase subunit B. V/A-type ATP synthase (non-catalytic) subunit B. These ATPases couple ATP hydrolysis to the build up of a H+ gradient, but V-type ATPases do not catalyze the reverse reaction. Vacuolar (V-type) ATPases play major roles in endomembrane and plasma membrane proton transport in eukaryotes. They are found in multiple intracellular membranes including vacuoles, endosomes, lysosomes, Golgi-derived vesicles, secretory vesicles, as well as the plasma membrane. Archaea have a protein which is similar in sequence to V-ATPases, but functions like an F-ATPase (called A-ATPase).  A similar protein is also found in a few bacteria. This subfamily consists of the non-catalytic beta subunit.	282
410880	cd01136	ATPase_flagellum-secretory_path_III	Flagellum-specific ATPase/type III secretory pathway virulence-related protein. Flagellum-specific ATPase/type III secretory pathway virulence-related protein. This group of ATPases is responsible for the export of flagellum and virulence-related proteins. The bacterial flagellar motor is similar to the F0F1-ATPase, in that they both are proton-driven rotary molecular devices. However, the main function of the bacterial flagellar motor is to rotate the flagellar filament for cell motility. Intracellular pathogens such as Salmonella and Chlamydia also have proteins which are similar to the flagellar-specific ATPase, but function in the secretion of virulence-related proteins via the type III secretory pathway.	265
238557	cd01137	PsaA	Metal binding protein PsaA.  These proteins have been shown to function as initial receptors in ABC transport of Mn2+ and as surface adhesins in some eubacterial species.  They belong to the TroA superfamily of periplasmic metal binding proteins that share a distinct fold and ligand binding mechanism. A typical TroA protein is comprised of two globular subdomains connected by a single helix and can bind the metal ion in the cleft between these domains. In addition, these proteins sometimes have a low complexity region containing a metal-binding histidine-rich motif (repetitive HDH sequence).	287
238558	cd01138	FeuA	Periplasmic binding protein FeuA.  These proteins are predicted to function as initial receptors in ABC transport of metal ions in some eubacterial species.  They belong to the TroA superfamily of periplasmic metal binding proteins that share a distinct fold and ligand binding mechanism. A typical TroA protein is comprised of two globular subdomains connected by a single helix and can bind their ligands in the cleft between these domains.	248
238559	cd01139	TroA_f	Periplasmic binding protein TroA_f.  These proteins are predicted to function as initial receptors in the ABC metal ion uptake in eubacteria and archaea.  They belong to the TroA superfamily of helical backbone metal receptor proteins that share a distinct fold and ligand binding mechanism.  A typical TroA protein is comprised of two globular subdomains connected by a single helix and can bind their ligands in the cleft between these domains.	342
238560	cd01140	FatB	Siderophore binding protein FatB.  These proteins have been shown to function as ABC-type initial receptors in the siderophore-mediated iron uptake in some eubacterial species.  They belong to the TroA superfamily of periplasmic metal binding proteins that share a distinct fold and ligand binding mechanism. A typical TroA protein is comprised of two globular subdomains connected by a single helix and can bind their ligands in the cleft between these domains.	270
238561	cd01141	TroA_d	Periplasmic binding protein TroA_d.  These proteins are predicted to function as initial receptors in the ABC metal ion uptake in eubacteria and archaea.  They belong to the TroA superfamily of helical backbone metal receptor proteins that share a distinct fold and ligand binding mechanism.  A typical TroA protein is comprised of two globular subdomains connected by a single helix and can bind their ligands in the cleft between these domains.	186
238562	cd01142	TroA_e	Periplasmic binding protein TroA_e.  These proteins are predicted to function as initial receptors in the ABC metal ion uptake in eubacteria and archaea.  They belong to the TroA superfamily of helical backbone metal receptor proteins that share a distinct fold and ligand binding mechanism.  A typical TroA protein is comprised of two globular subdomains connected by a single helix and can bind their ligands in the cleft between these domains.	289
238563	cd01143	YvrC	Periplasmic binding protein YvrC.  These proteins are predicted to function as initial receptors in ABC transport of metal ions in eubacteria and archaea.  They belong to the TroA superfamily of periplasmic metal binding proteins that share a distinct fold and ligand binding mechanism. A typical TroA protein is comprised of two globular subdomains connected by a single helix and can bind the metal ion in the cleft between these domains.	195
238564	cd01144	BtuF	Cobalamin binding protein BtuF.  These proteins have been shown to function as initial receptors in ABC transport of vitamin B12 (cobalamin) in eubacterial and some archaeal species.  They belong to the TroA superfamily of helical backbone metal receptor proteins that share a distinct fold and ligand binding mechanism.  A typical TroA protein is comprised of two globular subdomains connected by a single helix and can bind the metal ion in the cleft between these domains. In addition, these proteins sometimes have a low complexity region containing a metal-binding histidine-rich motif (repetitive HDH sequence).	245
238565	cd01145	TroA_c	Periplasmic binding protein TroA_c.  These proteins are predicted to function as initial receptors in the ABC metal ion uptake in eubacteria and archaea.  They belong to the TroA superfamily of helical backbone metal receptor proteins that share a distinct fold and ligand binding mechanism.  A typical TroA protein is comprised of two globular subdomains connected by a single helix and can bind their ligands in the cleft between these domains.	203
238566	cd01146	FhuD	Fe3+-siderophore binding domain FhuD.  These proteins have been shown to function as initial receptors in ABC transport of Fe3+-siderophores in many eubacterial species. They belong to the TroA-like superfamily of helical backbone metal receptor proteins that share a distinct fold and ligand binding mechanism.  A typical TroA-like protein is comprised of two globular subdomains connected by a long alpha helix and binds its specific ligands in the cleft between these domains.	256
238567	cd01147	HemV-2	Metal binding protein HemV-2.  These proteins are predicted to function as initial receptors in ABC transport of metal ions.  They belong to the TroA superfamily of helical backbone metal receptor proteins that share a distinct fold and ligand binding mechanism.  A typical TroA protein is comprised of two globular subdomains connected by a single helix and can bind the metal ion in the cleft between these domains. In addition, these proteins sometimes have a low complexity region containing a metal-binding histidine-rich motif (repetitive HDH sequence).	262
238568	cd01148	TroA_a	Metal binding protein TroA_a.  These proteins are predicted to function as initial receptors in ABC transport of metal ions in eubacteria. They belong to the TroA superfamily of helical backbone metal receptor proteins that share a distinct fold and ligand binding mechanism.  A typical TroA protein is comprised of two globular subdomains connected by a single helix and can bind the metal ion in the cleft between these domains.	284
238569	cd01149	HutB	Hemin binding protein HutB.  These proteins have been shown to function as initial receptors in ABC transport of hemin and hemoproteins in many eubacterial species.  They belong to the TroA superfamily of periplasmic metal binding proteins that share a distinct fold and ligand binding mechanism. A typical TroA protein is comprised of two globular subdomains connected by a single helix and can bind the metal ion in the cleft between these domains.	235
173839	cd01150	AXO	Peroxisomal acyl-CoA oxidase. Peroxisomal acyl-CoA oxidases (AXO) catalyze the first step in the peroxisomal fatty acid beta-oxidation, the alpha,beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. In a second oxidative half-reaction, the reduced FAD is reoxidized by molecular oxygen. AXO is generally a homodimer, but it has been reported to form a different type of oligomer in yeast. There are several subtypes of AXO, based on substrate specificity. Palmitoyl-CoA oxidase acts on straight-chain fatty acids and prostanoids, whereas the closely related trihydroxycoprostanoyl-CoA oxidase has the greatest activity for 2-methyl branched side chains of bile precursors. Pristanoyl-CoA oxidase acts on 2-methyl branched fatty acids.  AXO has an additional domain, C-terminal to the region with similarity to acyl-CoA dehydrogenases, which is included in this alignment.	610
173840	cd01151	GCD	Glutaryl-CoA dehydrogenase. Glutaryl-CoA dehydrogenase (GCD). GCD is an acyl-CoA dehydrogenase, which catalyzes the oxidative decarboxylation of glutaryl-CoA to crotonyl-CoA and carbon dioxide in the catabolism of lysine, hydroxylysine, and tryptophan. It uses electron transfer flavoprotein (ETF) as an electron acceptor. GCD is a homotetramer. GCD deficiency leads to a severe neurological disorder in humans.	386
173841	cd01152	ACAD_fadE6_17_26	Putative acyl-CoA dehydrogenases similar to fadE6, fadE17, and fadE26. Putative acyl-CoA dehydrogenases (ACAD). Mitochondrial acyl-CoA dehydrogenases (ACAD) catalyze the alpha, beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. The reduced form of ACAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. The ACD family includes the eukaryotic beta-oxidation, as well as amino acid catabolism enzymes. These enzymes share high sequence similarity, but differ in their substrate specificities. The mitochondrial ACD's are generally homotetramers and have an active site glutamate at a conserved position.	380
173842	cd01153	ACAD_fadE5	Putative acyl-CoA dehydrogenases similar to fadE5. Putative acyl-CoA dehydrogenase (ACAD). Mitochondrial acyl-CoA dehydrogenases (ACAD) catalyze the alpha,beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. The reduced form of ACAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. The ACD family includes the eukaryotic beta-oxidation, as well as amino acid catabolism enzymes. These enzymes share high sequence similarity, but differ in their substrate specificities. The mitochondrial ACD's are generally homotetramers and have an active site glutamate at a conserved position.	407
173843	cd01154	AidB	Proteins involved in DNA damage response, similar to the AidB gene product. AidB is one of several genes involved in the SOS adaptive response to DNA alkylation damage, whose expression is activated by the Ada protein. Its function has not been entirely elucidated; however, it is similar in sequence and function to acyl-CoA dehydrogenases. It has been proposed that aidB directly destroys DNA alkylating agents such as nitrosoguanidines (nitrosated amides) or their reaction intermediates.	418
173844	cd01155	ACAD_FadE2	Acyl-CoA dehydrogenases similar to fadE2. FadE2-like Acyl-CoA dehydrogenase (ACAD). Acyl-CoA dehydrogenases (ACAD) catalyze the alpha,beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. The reduced form of ACAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. The ACAD family includes the eukaryotic beta-oxidation, as well as amino acid catabolism enzymes. These enzymes share high sequence similarity, but differ in their substrate specificities. ACAD's are generally homotetramers and have an active site glutamate at a conserved position.	394
173845	cd01156	IVD	Isovaleryl-CoA dehydrogenase. Isovaleryl-CoA dehydrogenase (IVD) is an acyl-CoA dehydrogenase, which catalyzes the third step in leucine catabolism, the conversion of isovaleryl-CoA (3-methylbutyryl-CoA) into 3-methylcrotonyl-CoA. IVD is a homotetramer and has the greatest affinity for small branched chain substrates.	376
173846	cd01157	MCAD	Medium chain acyl-CoA dehydrogenase. MCADs are mitochondrial beta-oxidation enzymes, which catalyze the alpha,beta dehydrogenation of the corresponding medium chain acyl-CoA by FAD, which becomes reduced. The reduced form of MCAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. MCAD is a  homotetramer.	378
173847	cd01158	SCAD_SBCAD	Short chain acyl-CoA dehydrogenases and eukaryotic short/branched chain acyl-CoA dehydrogenases. Short chain acyl-CoA dehydrogenase (SCAD). SCAD is a mitochondrial beta-oxidation enzyme. It catalyzes the alpha,beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. The reduced form of SCAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis.  This subgroup also contains the eukaryotic short/branched chain acyl-CoA dehydrogenase (SBCAD), the bacterial butyryl-CoA dehydrogenase (BCAD) and 2-methylbutyryl-CoA dehydrogenase, which is involved in isoleucine catabolism.  These enzymes are homotetramers.	373
173848	cd01159	NcnH	Naphthocyclinone hydroxylase. Naphthocyclinone is an aromatic polyketide and an antibiotic, which is active against Gram-positive bacteria.  Polyketides are secondary metabolites, which have important biological functions such as antitumor, immunosuppressive or antibiotic activities. NcnH is a hydroxylase involved in the biosynthesis of naphthocyclinone and possibly other polyketides.	370
173849	cd01160	LCAD	Long chain acyl-CoA dehydrogenase. LCAD is an acyl-CoA dehydrogenase (ACAD), which is found in the mitochondria of eukaryotes and in some prokaryotes.  It catalyzes the alpha, beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. The reduced form of LCAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. LCAD acts as a homodimer.	372
173850	cd01161	VLCAD	Very long chain acyl-CoA dehydrogenase. VLCAD is an acyl-CoA dehydrogenase (ACAD), which is found in the mitochondria of eukaryotes and in some bacteria.  It catalyzes the alpha,beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. The reduced form of ACAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. VLCAD acts as a homodimer.	409
173851	cd01162	IBD	Isobutyryl-CoA dehydrogenase. Isobutyryl-CoA dehydrogenase  (IBD) catalyzes the alpha, beta- dehydrogenation of short branched chain acyl-CoA intermediates in valine catabolism. It is predicted to be a homotetramer.	375
173852	cd01163	DszC	Dibenzothiophene (DBT) desulfurization enzyme C. DszC is a flavin reductase dependent enzyme, which catalyzes the first two steps of DBT desulfurization in mesophilic bacteria. DszC converts DBT to DBT-sulfoxide, which is then converted to DBT-sulfone. Bacteria with this enzyme are candidates for the removal of organic sulfur compounds from fossil fuels, which pollute the environment. An equivalent enzyme tdsC, is found in thermophilic bacteria. This alignment also contains a closely related uncharacterized subgroup.	377
238570	cd01164	FruK_PfkB_like	1-phosphofructokinase (FruK), minor 6-phosphofructokinase (pfkB) and related sugar kinases. FruK plays an important role in the predominant pathway for fructose utilisation. This group also contains tagatose-6-phosphate kinase, an enzyme of the tagatose 6-phosphate pathway, which is responsible for breakdown of the galactose moiety during lactose metabolism by bacteria such as L. lactis.	289
349496	cd01165	BTB_POZ	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain superfamily. Proteins in this superfamily are characterized by the presence of a common protein-protein interaction motif of about 100 amino acids, known as the BTB/POZ domain. Members include transcription factors, oncogenic proteins, ion channel proteins, and potassium channel tetramerization domain (KCTD) proteins. They have been identified in poxviruses and many eukaryotes, and have diverse functions, such as transcriptional regulation, chromatin remodeling, protein degradation and cytoskeletal regulation. Many BTB/POZ proteins contain one or two additional domains, such as kelch repeats, zinc-finger domains, FYVE (Fab1, YOTB, Vac1, and EEA1) fingers, or ankyrin repeats, among others. These special additional domains or interaction partners provide unique characteristics and functions to BTB/POZ proteins. In ion channel proteins and KCTD proteins, the BTB/POZ domain is also called the tetramerization (T1) domain.	79
238571	cd01166	KdgK	2-keto-3-deoxygluconate kinase (KdgK) phosphorylates 2-keto-3-deoxygluconate (KDG) to form 2-keto-3-deoxy-6-phosphogluconate (KDGP). KDG is the common intermediate product that allows organisms to channel D-glucuronate and/or D-galacturonate into glycolysis and therefore use polymers like pectin and xylan as carbon sources.	294
238572	cd01167	bac_FRK	Fructokinases (FRKs) mainly from bacteria and plants are enzymes with high specificity for fructose, as are all FRKs, but they catalyze the conversion of fructose to fructose-6-phosphate, which is an entry point into glycolysis via conversion into glucose-6-phosphate. This is in contrast to FRKs [or ketohexokinases (KHKs)] from mammalia and halophilic archaebacteria, which phosphorylate fructose to fructose-1-phosphate.	295
238573	cd01168	adenosine_kinase	Adenosine kinase (AK) catalyzes the phosphorylation of ribofuranosyl-containing nucleoside analogues at the 5'-hydroxyl using ATP or GTP as the phosphate donor. The physiological function of AK is associated with the regulation of extracellular adenosine levels and the preservation of intracellular adenylate pools. Adenosine kinase is involved in the purine salvage pathway.	312
238574	cd01169	HMPP_kinase	4-amino-5-hydroxymethyl-2-methyl-pyrimidine phosphate kinase (HMPP-kinase) catalyzes two consecutive phosphorylation steps in the thiamine phosphate biosynthesis pathway, leading to the synthesis of vitamin B1. The first step is the phosphorylation of the hydroxyl group of HMP to form 4-amino-5-hydroxymethyl-2-methyl-pyrimidine phosphate (HMP-P) and then the phosphorylation of HMP-P to form 4-amino-5-hydroxymethyl-2-methyl-pyrimidine pyrophosphate (HMP-PP), which is the substrate for the thiamine synthase coupling reaction.	242
238575	cd01170	THZ_kinase	4-methyl-5-beta-hydroxyethylthiazole (Thz) kinase catalyzes the phosphorylation of the hydroxyl group of Thz, a reaction that allows cells to recycle Thz into the thiamine biosynthesis pathway, as an alternative to its synthesis from cysteine, tyrosine and 1-deoxy-D-xylulose-5-phosphate.	242
238576	cd01171	YXKO-related	B.subtilis YXKO protein of unknown function and related proteins. Based on the conservation of the ATP binding site, the substrate binding site and the Mg2+ binding site and structural homology this group is a member of the ribokinase-like superfamily.	254
238577	cd01172	RfaE_like	RfaE encodes a bifunctional ADP-heptose synthase involved in the biosynthesis of the lipopolysaccharide (LPS) core precursor ADP-L-glycero-D-manno-heptose. LPS plays an important role in maintaining the structural integrity of the bacterial outer membrane of gram-negative bacteria. RfaE consists of two domains, a sugar kinase domain, represented here, and a domain belonging to the cytidylyltransferase superfamily.	304
238578	cd01173	pyridoxal_pyridoxamine_kinase	Pyridoxal kinase plays a key role in the synthesis of the active coenzyme pyridoxal-5'-phosphate  (PLP), by catalyzing the phosphorylation of the precursor vitamin B6  in the presence of Zn2+ and ATP. Mammals are unable to synthesize PLP de novo and require its precursors in the form of vitamin B6 (pyridoxal, pyridoxine, and pyridoxamine) from their diet. Pyridoxal kinase encoding genes are also found in many other species including yeast and bacteria.	254
238579	cd01174	ribokinase	Ribokinase catalyses the phosphorylation of ribose to ribose-5-phosphate using ATP. This reaction is the first step in the ribose metabolism. It traps ribose within the cell after uptake and also prepares the sugar for use in the synthesis of nucleotides and histidine, and for entry into the pentose phosphate pathway. Ribokinase is dimeric in solution.	292
238580	cd01175	IPT_COE	IPT domain of the COE family (Col/Olf-1/EBF) of non-basic, helix-loop-helix (HLH)-containing transcription factors. COE family proteins are all transcription factors and play an important role in a variety of developmental processes. Mouse EBF is involved in the regulation of the early stages of B-cell differentiation, Drosophila collier is a regulator of the head patterning, and a related protein in Xenopus is involved in primary neurogenesis. All COE family members have a well conserved DNA binding domain that contains an atypical Zn finger motif. The function of the IPT domain is unknown.	85
238581	cd01176	IPT_RBP-Jkappa	IPT domain of the recombination signal Jkappa binding protein (RBP-Jkappa). RBP-J kappa, was initially considered to be involved in V(D)J recombination because of its DNA binding specificity and structural similarity to site-specific recombinases known as the integrase family. Further studies indicated that RBP-J kappa functions as a repressor of transcription, via destabilization of the general transcription factor IID and recruitment of histone deacetylase complexes.	97
238582	cd01177	IPT_NFkappaB	IPT domain of the transcription factor NFkappaB and related transcription factors. NFkappaB is considered a central regulator of stress responses, activated by different stressful conditions, including physical stress, oxidative stress, and exposure to certain chemicals. By blocking apoptosis in several cell types, NFkappaB also plays an important role in cell proliferation and differentiation.	102
238583	cd01178	IPT_NFAT	IPT domain of the NFAT family of transcription factors. NFAT transcription complexes are a target of calcineurin, a calcium dependent phosphatase, and activate genes mainly involved in cell-cell-interaction.	101
238584	cd01179	IPT_plexin_repeat2	Second repeat of the IPT domain of Plexins and Cell Surface Receptors (PCSR). Plexins are involved in the regulation of cell proliferation and of cellular adhesion and repulsion receptors. In general, there are three copies of the IPT domain present preceded by SEMA (semaphorin) and PSI (plexin, semaphorin, integrin) domains.	85
238585	cd01180	IPT_plexin_repeat1	First repeat of the IPT domain of Plexins and Cell Surface Receptors (PCSR). Plexins are involved in the regulation of cell proliferation and of cellular adhesion and repulsion receptors. In general, there are three copies of the IPT domain present preceded by SEMA (semaphorin) and PSI (plexin, semaphorin, integrin) domains.	94
238586	cd01181	IPT_plexin_repeat3	Third repeat of the IPT domain of Plexins and Cell Surface Receptors (PCSR). Plexins are involved in the regulation of cell proliferation and of cellular adhesion and repulsion receptors. In general, there are three copies of the IPT domain present preceded by SEMA (semaphorin) and PSI (plexin, semaphorin, integrin) domains.	99
271183	cd01182	INT_RitC_C_like	C-terminal catalytic domain of recombinase RitC, a component of the recombinase trio. Recombinases belonging to the RitA (also known as pAE1 due to its presence in the deletion prone region of plasmid pAE1 of Alcaligenes eutrophus H1), RitB, and RitC families are associated in a complex referred to as a Recombinase in Trio (RIT) element.  These RIT elements consist of three adjacent and unidirectional overlapping genes, one from each family (ritABC in order of transcription).  All three integrases contain a catalytic motif, suggesting that they are all active enzymes.  However, their specific roles are not yet fully understood.  All three families belong to the superfamily of DNA breaking-rejoining enzymes, which share the same fold in their catalytic domain and the overall reaction mechanism.	186
271184	cd01184	INT_C_like_1	Uncharacterized site-specific tyrosine recombinase, C-terminal catalytic domain. Tyrosine recombinase (integrase) belongs to a DNA breaking-rejoining enzyme superfamily. The catalytic domain contains six conserved active site residues. The recombination reaction involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. Many DNA breaking-rejoining enzymes also have N-terminal domains, which show little sequence or structure similarity.	180
271185	cd01185	INTN1_C_like	Integrase IntN1 of Bacteroides mobilizable transposon NBU1 and similar proteins, C-terminal catalytic domain. IntN1 is a tyrosine recombinase for the integration and excision of Bacteroides mobilizable transposon NBU1 from the host chromosome. IntN1 does not require strict homology between the recombining sites seen with other tyrosine recombinases. This family belongs to the superfamily of DNA breaking-rejoining enzymes, which share the same fold in their catalytic domain and the overall reaction mechanism. The catalytic domain contains six conserved active site residues. Their overall reaction mechanism involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA.	161
271186	cd01186	INT_tnpA_C_Tn554	Putative Transposase A from transposon Tn554, C-terminal catalytic domain. This family includes putative Transposase A from transposon Tn554. It belongs to a DNA breaking-rejoining enzyme superfamily. The catalytic domain contains six conserved active site residues. The recombination reaction involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. Many DNA breaking-rejoining enzymes also have N-terminal domains, which show little sequence or structure similarity.	184
271187	cd01187	INT_tnpB_C_Tn554	Putative Transposase B from transposon Tn554, C-terminal catalytic domain. This family includes putative Transposase B from transposon Tn554. It belongs to a DNA breaking-rejoining enzyme superfamily. The catalytic domain contains six conserved active site residues. The recombination reaction involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. Many DNA breaking-rejoining enzymes also have N-terminal domains, which show little sequence or structure similarity.	142
271188	cd01188	INT_RitA_C_like	C-terminal catalytic domain of recombinase RitA, a component of the recombinase trio. Recombinases RitA (also known as pAE1), RitB, and RitC are encoded by three adjacent and overlapping genes. Collectively they are known as the Recombinase in Trio (RIT). This RitA family includes various bacterial integrases and integrases from the deletion-prone region of plasmid pAE1 of Alcaligenes eutrophus H1. All three integrases contain a catalytic motif, suggesting that they are all active enzymes. However, their specific roles are not fully understood. All three families belong to the superfamily of DNA breaking-rejoining enzymes, which share the same fold in their catalytic domain and the overall reaction mechanism. The catalytic domain contains six conserved active site residues. Their overall reaction mechanism is essentially identical and involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA.	179
271189	cd01189	INT_ICEBs1_C_like	C-terminal catalytic domain of integrases from bacterial phages and conjugative transposons. This family of tyrosine-based site-specific integrases has origins in bacterial phages and conjugative transposons. One member is the integrase from Bacillus subtilis conjugative transposon ICEBs1. ICEBs1 can be excised and transferred to various recipients in response to DNA damage or high concentrations of potential mating partners. The family belongs to the superfamily of DNA breaking-rejoining enzymes, which share the same fold in their catalytic domain and the overall reaction mechanism. The catalytic domain contains six conserved active site residues. Their overall reaction mechanism involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA.	147
271190	cd01190	INT_StrepXerD_C_like	Putative XerD in Streptococcus pneumoniae and similar proteins, C-terminal catalytic domain. This family includes a putative XerD recombinase in Streptococcus pneumoniae and similar tyrosine recombinases. However, the members of this family contain active site motifs that differ from those of XerD from Escherichia coli. E. coli XerD and homologous enzymes show four conserved amino acids, R-H-R-H, that are spaced along the C-terminal domain. The putative S. pneumoniae XerD contains three unique replacements at the conserved positions, resulting in L-Q-R-L. Severe growth defects in a loss-of-function xerD mutant demonstrate an important in vivo function of the S. pneumoniae XerD protein. This family belongs to the superfamily of DNA breaking-rejoining enzymes, which share the same fold in their catalytic domain and the overall reaction mechanism. The catalytic domain contains six conserved active site residues. Their overall reaction mechanism involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA.	150
271191	cd01191	INT_C_like_2	Uncharacterized site-specific tyrosine recombinase, C-terminal catalytic domain. Tyrosine recombinase (integrase) belongs to a DNA breaking-rejoining enzyme superfamily. The catalytic domain contains six conserved active site residues. The recombination reaction involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. Many DNA breaking-rejoining enzymes also have N-terminal domains, which show little sequence or structure similarity.	176
271192	cd01192	INT_C_like_3	Uncharacterized site-specific tyrosine recombinase, C-terminal catalytic domain. Tyrosine recombinase (integrase) belongs to a DNA breaking-rejoining enzyme superfamily. The catalytic domain contains six conserved active site residues. The recombination reaction involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. Many DNA breaking-rejoining enzymes also have N-terminal domains, which show little sequence or structure similarity.	178
271193	cd01193	INT_IntI_C	Integron integrase and similar proteins, C-terminal catalytic domain. Integron integrases mediate site-specific DNA recombination between a proximal primary site (attI) and a secondary target site (attC) found within mobile gene cassettes encoding resistance or virulence factors. Unlike the target sites of other site-specific recombinases, the attC sites lack sequence conservation. Integron integrase exhibits broader DNA specificity by recognizing the non-conserved attC sites. The structure shows that DNA target site recognition is not dependent on canonical DNA but on the position of two flipped-out bases that interact in cis and in trans with the integrase. Integron integrases are present in many naturally occurring mobile elements, including transposons and conjugative plasmids. Vibrio, Shewanella, Xanthomonas, and Pseudomonas species harbor chromosomal super-integrons. All integron integrases carry large inserts, unlike the TnpF ermF-like proteins also seen in this group.	176
271194	cd01194	INT_C_like_4	Uncharacterized site-specific tyrosine recombinase, C-terminal catalytic domain. Tyrosine recombinase (integrase) belongs to a DNA breaking-rejoining enzyme superfamily. The catalytic domain contains six conserved active site residues. The recombination reaction involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. Many DNA breaking-rejoining enzymes also have N-terminal domains, which show little sequence or structure similarity.	174
271195	cd01195	INT_C_like_5	Uncharacterized site-specific tyrosine recombinase, C-terminal catalytic domain. Tyrosine recombinase (integrase) belongs to a DNA breaking-rejoining enzyme superfamily. The catalytic domain contains six conserved active site residues. The recombination reaction involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. Many DNA breaking-rejoining enzymes also have N-terminal domains, which show little sequence or structure similarity.	170
271196	cd01196	INT_C_like_6	Uncharacterized site-specific tyrosine recombinase, C-terminal catalytic domain. Tyrosine recombinase (integrase) belongs to a DNA breaking-rejoining enzyme superfamily. The catalytic domain contains six conserved active site residues. The recombination reaction involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. Many DNA breaking-rejoining enzymes also have N-terminal domains, which show little sequence or structure similarity.	183
271197	cd01197	INT_FimBE_like	FimB and FimE and related proteins, integrase/recombinases. This CD includes proteins similar to E. coli FimE and FimB and Proteus mirabilis MrpI. FimB and FimE are the regulatory proteins during expression of type 1 fimbriae in Escherichia coli. The FimB and FimE proteins direct the phase switch into the 'on' and 'off' position. MrpI is the regulatory protein of Proteus mirabilis fimbriae expression. This family belongs to the integrase/recombinase superfamily.	181
238605	cd01200	WHEPGMRS_RNA	EPRS-like_RNA binding domain. This short RNA-binding domain is found in several higher eukaryote aminoacyl-tRNA synthetases (aaRSs). It is found in three copies in the mammalian bifunctional EPRS in a region that separates the N-terminal GluRS from the C-terminal ProRS. In the Drosophila EPRS, this domain is repeated six times. It is found at the N-terminus of TrpRS, HisRS and GlyRS and at the C-terminus of MetRS. This domain consists of a helix-turn-helix structure, which is similar to other RNA-binding proteins. It is involved in both protein-RNA interactions by binding tRNA and protein-protein interactions, which are important for the formation of aaRSs into multienzyme complexes.	42
275391	cd01201	PH_BEACH	Pleckstrin homology domain in BEACH domain containing proteins. The BEACH domain is present in several eukaryotic proteins CHS, neurobeachin (Nbea), LRBA (also called BGL, beige-like, or CDC4L), FAN, KIAA1607, and LvsA-LvsF. CHS is a rare, autosomal recessive disorder that can cause severe immunodeficiency and albinism in mammals and beige is the name for the CHS disease in mice. The CHS disease is associated with the presence of giant, perinuclear vesicles (lysosomes, melanosomes, and others) and CHS protein is thought to play an important role in the fusion, fission, or trafficking of these vesicles. All BEACH proteins contain the following domains: PH, BEACH, and WD40. The WD40 domain is involved in mediating protein-protein interactions involved in targeting proteins to subcellular compartments. The combined PH-BEACH motifs may present a single continuous structural unit involved in protein binding. Some members have an additional N-terminal Laminin G-like (LamG) domains Ca++ mediated receptors or an additional C-terminal FYVE zinc-binding domain which targets proteins to membrane lipids via interaction with phosphatidylinositol-3-phosphate, PI3P. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3, which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinases, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	112
269913	cd01202	PTB_FRS2	Fibroblast growth factor receptor substrate 2 phosphotyrosine-binding domain. FRS2 (also called Suc1-associated neurotrophic factor (SNT)-induced tyrosine-phosphorylated target) proteins are membrane-anchored adaptor proteins. They are composed of an N-terminal myristoylation site followed by a phosphotyrosine binding (PTB) domain, which has a PH-like fold, and a C-terminal effector domain containing multiple tyrosine and serine/threonine phosphorylation sites. The FRS2/SNT proteins show increased tyrosine phosphorylation by activated receptors, such as fibroblast growth factor receptor (FGFR) and TrkA, recruit SH2 domain containing proteins such as Grb2, and mediate signals from activated receptors to a variety of downstream pathways. The PTB domains of the SNT proteins directly interact with the canonical NPXpY motif of TrkA in a phosphorylation-dependent manner, whereas they directly bind to the juxtamembrane region of FGFR in a phosphorylation-independent manner. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides lacking tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the IRS-like subgroup.	92
269914	cd01203	PTB_DOK1_DOK2_DOK3	Downstream of tyrosine kinase 1, 2, and 3 proteins phosphotyrosine-binding domain (PTBi). The Dok family adapters are phosphorylated by different protein tyrosine kinases. Dok proteins are involved in processes such as modulation of cell differentiation and proliferation, as well as in control of the cell spreading and migration. The Dok protein contains an N-terminal pleckstrin homology (PH) domain followed by a central phosphotyrosine binding (PTB) domain, which has a PH-like fold, and a proline- and tyrosine-rich C-terminal tail. The PH domain binds to acidic phospholipids and localizes proteins to the plasma membrane, while the PTB domain mediates protein-protein interactions by binding to phosphotyrosine-containing motifs. The C-terminal part of Dok contains multiple tyrosine phosphorylation sites that serve as potential docking sites for Src homology 2-containing proteins such as ras GTPase-activating protein and Nck, leading to inhibition of ras signaling pathway activation and the c-Jun N-terminal kinase (JNK) and c-Jun activation, respectively. There are 7 mammalian Dok members: Dok-1 to Dok-7. Dok-1 and Dok-2 act as negative regulators of the Ras-Erk pathway downstream of many immunoreceptor-mediated signaling systems, and it is believed that recruitment of p120 rasGAP by Dok-1 and Dok-2 is critical to their negative regulation. Dok-3 is a negative regulator of the activation of JNK and mobilization of Ca2+ in B-cell receptor-mediated signaling, interacting with SHIP-1 and Grb2. Dok-4 to Dok-6 play roles in protein tyrosine kinase (PTK)-mediated signaling in neural cells and Dok-7 is the key cytoplasmic activator of MuSK (Muscle-Specific Protein Tyrosine Kinase). PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides lacking tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the IRS-like subgroup.	99
269915	cd01204	PTB_IRS	Insulin receptor substrate phosphotyrosine-binding domain (PTBi). Insulin receptor substrate (IRS) molecules are mediators in insulin signaling and play a role in maintaining basic cellular functions such as growth and metabolism. They act as docking proteins between the insulin receptor and a complex network of intracellular signaling molecules containing Src homology 2 (SH2) domains. Four members (IRS-1, IRS-2, IRS-3, IRS-4) of this family have been identified that differ as to tissue distribution, subcellular localization, developmental expression, binding to the insulin receptor, and interaction with SH2 domain-containing proteins. IRS molecules have an N-terminal PH domain, followed by an IRS-like PTB domain which has a PH-like fold. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides lacking tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the IRS-like subgroup.	106
269916	cd01205	EVH1_WASP-like	WASP family proteins EVH1 domain. The Wiskott-Aldrich Syndrome Protein (WASP; also called Bee1p) and its homolog N (neuronal)-WASP are signal transduction proteins that promote actin polymerization in response to upstream intracellular signals. WAS is an X-linked recessive disease, characterized by eczema, immunodeficiency, and thrombocytopenia. The majority of patients with WAS, or a milder version of the disorder, X-linked thrombocytopenia (XLT), have point mutations in the EVH1 domain of WASP. WASP is an actin regulatory protein consisting of an N-terminal EVH1 domain called WH1 which binds LPPPEP peptides, a basic region (B), a GTP binding domain (GBD), a proline rich region, a WH2 domain, and a verprolin-cofilin-acidic motif (VCA) which activates the actin-related protein (Arp)2/3 actin nucleating complex. The B, GBD, and the proline-rich region are involved in autoinhibitory interactions that repress or block the activity of the VCA. Yeast members lack the GTP binding domain. The EVH1 domains are part of the PH domain superfamily. There are 5 EVH1 subfamilies: Enabled/VASP, Homer/Vesl, WASP, Dcp1, and Spred. Ligands are known for three of the EVH1 subfamilies, all of which bind proline-rich sequences: the Enabled/VASP family binds to FPPPP peptides, the Homer/Vesl family binds PPxxF peptides, and the WASP family binds LPPPEP peptides. EVH1 has a PH-like fold, despite having minimal sequence similarity to PH or PTB domains.	101
269917	cd01206	EVH1_Homer_Vesl	Homer/Vesl family proteins EVH1 domain. Homer/Vesl proteins are synaptic scaffolding proteins, required for long-term potentiation, a form of synaptic plasticity thought to underlie memory formation. They contain an N-terminal EVH1 domain and bind both to neurotransmitter receptors, such as the metabotropic group 1 glutamate receptor (mGluR), and to other scaffolding proteins via PPXXF motifs, in order to target them to the synaptic junction. These mGluRs possess a long C-terminal intracellular tail that may be important for subcellular localization of the receptor. The C-terminus is also the site of binding by the immediate early gene (IEG), Homer 1a. In contrast to Homer 1a, other Homer members additionally encode a C-terminal coiled-coil (CC) domain and form multivalent complexes that bind group 1 mGluRs. Homer 1a competes with constitutively expressed CC-Homers to modify the association of group 1 mGluRs with CC-Homer complexes. Since Homer proteins are strikingly enriched at the postsynaptic density (PSD), these observations suggest a role for the Homer family in regulating synaptic metabotropic receptor function. PSD-Zip45 (also named Homer 1c/Vesl-1L) has an EVH1 domain with a longer alpha-helix and its linking part included in the conserved region of Homer 1 (CRH1) interacts with the EVH1 domain of the neighbour CRH1 molecule in the crystal, suggesting that the EVH1 domain recognizes the PPXXF motif found in the binding partners, and the SPLTP sequence (P-motif) in the linking region of the CRH1. The two types of binding are partly overlapped in the EVH1 domain, implying a mechanism to regulate multimerization of Homer 1 family proteins. Homer 2 and Homer 3 are negative regulators of T cell activation. They bind the nuclear factor of activated T cells (NFAT) and compete with calcineurin binding. NFAT plays a critical role in calcium-dependent signaling in other cell types, including muscle and neurons. Homer-NFAT binding is also antagonized by active serine-threonine kinase AKT, enhancing TCR signaling via calcineurin-dependent dephosphorylation of NFAT resulting in changes in cytokine expression and an increase in effector-memory T cell populations in Homer-deficient mice. The EVH1 domains are part of the PH domain superfamily. There are 5 EVH1 subfamilies: Enabled/VASP, Homer/Vesl, WASP, Dcp1, and Spred. Ligands are known for three of the EVH1 subfamilies, all of which bind proline-rich sequences: the Enabled/VASP family binds to FPPPP peptides, the Homer/Vesl family binds PPxxF peptides, and the WASP family binds LPPPEP peptides. EVH1 has a PH-like fold, despite having minimal sequence similarity to PH or PTB domains.	109
269918	cd01207	EVH1_Ena_VASP-like	Enabled/VASP family EVH1 domain. The Ena/VASP family includes proteins such as vasodilator-stimulated phosphoprotein (VASP), enabled gene product from Drosophila (Ena), mammalian enabled (Mena) and Ena/VASP-Like protein (EVL), which localize to focal adhesions and to sites of actin filament dynamics. These proteins share a common modular organization with highly conserved N- and C-terminal domains, termed Ena/VASP homology domains 1 and 2 (EVH1 and EVH2), that are separated by a central proline-rich domain. The EVH1 domain binds to other proteins at proline rich sequences. The majority of Ena-VASP type EVH1 domains recognize FPPPP motifs such as in the focal adhesion proteins zyxin and vinculin, and the ActA surface protein of Listeria monocytogenes; however, the LIM3 domain of Tes lacks the FPPPP motif but still binds the EVH1 domain of Mena. It has a PH-like fold, despite having minimal sequence similarity to PH or PTB domains. EVH2 mediates oligomerization within the family. The proline-rich region binds SH3 and WW domains as well as profilin, a protein that regulates actin filament dynamics. The EVH1 domains are part of the PH domain superfamily. There are 5 EVH1 subfamilies: Enabled/VASP, Homer/Vesl, WASP, Dcp1, and Spred. Ligands are known for three of the EVH1 subfamilies, all of which bind proline-rich sequences: the Enabled/VASP family binds to FPPPP peptides, the Homer/Vesl family binds PPxxF peptides, and the WASP family binds LPPPEP peptides. EVH1 has a PH-like fold, despite having minimal sequence similarity to PH or PTB domains.	108
269919	cd01208	PTB_X11	X11-like Phosphotyrosine-binding (PTB) domain. The function of the neuronal protein X11 is unknown to date. X11 has a PTB domain followed by two PDZ domains. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides lacking tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the Dab-like subgroup.	161
269920	cd01209	PTB_Shc	Shc-like phosphotyrosine-binding (PTB) domain. Shc is a substrate for receptor tyrosine kinases, which can interact with phosphoproteins at NPXY motifs. Shc contains a PTB domain followed by an SH2 domain. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides lacking tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the Shc-like subgroup.	170
269921	cd01210	PTB_EPS8	Epidermal growth factor receptor kinase substrate (EPS8)-like Phosphotyrosine-binding (PTB) domain. EPS8 is a regulator of Rac signaling. It consists of a PTB and an SH3 domain. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides lacking tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the Dab-like subgroup.	131
269922	cd01211	PTB_Rab6GAP	GTPase activating protein for Rab 6 Phosphotyrosine-binding (PTB) domain. GAPCenA is a centrosome-associated GTPase activating protein (GAP) for Rab 6. It consists of an N-terminal PTB domain and a C-terminal TBC domain. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides lacking tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the Dab-like subgroup.	129
269923	cd01212	PTB_JIP	JNK-interacting protein-like (JIP) Phosphotyrosine-binding (PTB) domain. JIP is a mitogen-activated protein kinase scaffold protein. JIP consists of a C-terminal SH3 domain, followed by a PTB domain. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides lacking tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the Dab-like subgroup.	149
269924	cd01213	PTB_tensin	Tensin Phosphotyrosine-binding (PTB) domain. Tensin is a focal adhesion protein, which contains a C-terminal SH2 domain followed by a PTB domain. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides lacking tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the Dab-like subgroup.	136
269925	cd01214	PTB_FAM43A	Family with sequence similarity 43, member A (FAM43A) Phosphotyrosine-binding (PTB) domain. The function of FAM43A is currently unknown. Human FAM43A is located on chromosome 3 at location 3q29. It encodes a 3182 base pair mRNA which possesses one Pleckstrin homology-like domain. The mRNA translates into LOC131583, a hydrophilic protein that is predicted to localize in the nucleus. The FAM43A gene is conserved through a broad range of vertebrates. It is highly conserved from chimpanzees to zebrafish. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides lacking tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains.	125
269926	cd01215	PTB_Dab	Disabled (Dab) Phosphotyrosine-binding domain. Dab is a cytosolic adaptor protein, which binds to the cytoplasmic tails of lipoprotein receptors, such as ApoER2 and VLDLR, via its PTB domain. The Dab PTB domain has a preference for unphosphorylated tyrosine within an NPxY motif. Additionally, the Dab PTB domain, which is structurally similar to PH domains, binds to phosphatidylinositol 4,5-bisphosphate in a manner characteristic of phosphoinositide binding PH domains. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides lacking tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the Dab-like subgroup.	147
241252	cd01217	PTB_CG12581	CG12581 Phosphotyrosine-binding (PTB) domain. The functions of CG12581 and its related proteins are unknown to date. Members here contain a single N-terminal PTB domain. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides lacking tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the Dab-like subgroup.	166
269927	cd01218	PH_Phafin2-like	Phafin2 (also called EAPF, FLJ13187, ZFYVE18 or PLEKHF2) Pleckstrin Homology (PH) domain. Phafin2 is differentially expressed in liver cancer cells and regulates the structure and function of the endosomes through Rab5-dependent processes. Phafin2 modulates the cell's response to extracellular stimulation by modulating the receptor density on the cell surface. Phafin2 contains a PH domain and a FYVE domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3, which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinases, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	123
275392	cd01219	PH1_FGD1	FYVE, RhoGEF and PH domain containing/faciogenital dysplasia protein 1, N-terminal Pleckstrin homology (PH) domain. In general, FGDs have a RhoGEF (DH) domain, followed by an N-terminal PH domain, a FYVE domain and a C-terminal PH domain. All FGDs are guanine nucleotide exchange factors that activate the Rho GTPase Cdc42, an important regulator of membrane trafficking. The RhoGEF domain is responsible for GEF catalytic activity, while the N-terminal PH domain is involved in intracellular targeting of the DH domain. Mutations in the FGD1 gene are responsible for the X-linked disorder known as faciogenital dysplasia (FGDY). Both FGD1 and FGD3 are targeted by the ubiquitin ligase SCF(FWD1/beta-TrCP) upon phosphorylation of two serine residues in their DSGIDS motif and subsequently degraded by the proteasome. However, FGD1 and FGD3 induced significantly different morphological changes in HeLa Tet-Off cells: while FGD1 induced long finger-like protrusions, FGD3 induced broad sheet-like protrusions when the level of GTP-bound Cdc42 was significantly increased by the inducible expression of FGD3. They also reciprocally regulated cell motility when inducibly expressed in HeLa Tet-Off cells: FGD1 stimulated cell migration while FGD3 inhibited it. FGD1 and FGD3 therefore play different roles to regulate cellular functions, even though their intracellular levels are tightly controlled by the same destruction pathway through SCF(FWD1/beta-TrCP). PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. 
PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	108
269928	cd01220	PH1_FARP1-like	FERM, RhoGEF and pleckstrin domain-containing protein 1 and related proteins Pleckstrin Homology (PH) domain, repeat 1. Members here include FARP1 (also called Chondrocyte-derived ezrin-like protein; PH domain-containing family C member 2), FARP2 (also called FIR/FERM domain including RhoGEF; FGD1-related Cdc42-GEF/FRG), and FARP6 (also called Zinc finger FYVE domain-containing protein 24). They are members of the Dbl family guanine nucleotide exchange factors (GEFs) which are upstream positive regulators of Rho GTPases. Little is known about FARP1 and FARP6, though FARP1 has increased expression in differentiated chondrocytes. FARP2 is thought to regulate neurite remodeling by mediating the signaling pathways from membrane proteins to Rac. It is found in brain, lung, and testis, as well as embryonic hippocampal and cortical neurons. FARP1 and FARP2 are composed of a N-terminal FERM domain, a proline-rich (PR) domain, Dbl-homology (DH), and two C-terminal PH domains. FARP6 is composed of Dbl-homology (DH), and two C-terminal PH domains separated by a FYVE domain. This hierarchy contains the first PH repeat. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	109
269929	cd01221	PH_ephexin	Ephexin Pleckstrin homology (PH) domain. Ephexin-1 (also called NGEF/ neuronal guanine nucleotide exchange factor) plays a role in the homeostatic modulation of presynaptic neurotransmitter release. Specific functions are still unknown for Ephexin-2 (also called RhoGEF19) and Ephexin-3 (also called Rho guanine nucleotide exchange factor 5/RhoGEF5, Transforming immortalized mammary oncogene/p60 TIM, and NGEF/neuronalGEF). Ephexin-4 (also called RhoGEF16) acts downstream of EphA2 to promote ligand-independent breast cancer cell migration and invasion toward epidermal growth factor through activation of RhoG. This in turn results in the activation of RhoG which recruits ELMO2 and Dock4 to form a complex with EphA2 at the tips of cortactin-rich protrusions in migrating breast cancer cells. Ephexin-5 is the specific GEF for RhoA activation and the regulation of vascular smooth muscle contractility. It interacts with EPHA4. The members of the Ephexin family contain a RhoGEF (DH) domain followed by a PH domain and an SH3 domain. The ephexin PH domain is believed to act with the DH domain in mediating protein-protein interactions. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. 
A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	131
269930	cd01223	PH_Vav	Vav pleckstrin homology (PH) domain. Vav acts as a guanosine nucleotide exchange factor (GEF) for Rho/Rac proteins. They control processes including T cell activation, phagocytosis, and migration of cells. The Vav subgroup of Dbl GEFs consists of three family members (Vav1, Vav2, and Vav3) in mammals. Vav1 is preferentially expressed in the hematopoietic system, while Vav2 and Vav3 are described by broader expression patterns. Mammalian Vav proteins consist of a calponin homology (CH) domain, an acidic region, a catalytic Dbl homology (DH) domain, a PH domain, a zinc finger cysteine rich domain (C1/CRD), and an SH2 domain, flanked by two SH3 domains. In invertebrates such as Drosophila and C. elegans, Vav is missing the N-terminal SH3 domain. The DH domain is involved in RhoGTPase recognition and selectivity and stimulates the reorganization of the switch regions for GDP/GTP exchange. The PH domain is implicated in directing membrane localization, allosteric regulation of guanine nucleotide exchange activity, and as a phospholipid- dependent regulator of GEF activity. Vavs bind RhoGTPases including Rac1, RhoA, RhoG, and Cdc42, while other members of the GEF family are specific for a single RhoGTPase. This promiscuity is thought to be a result of its CRD. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. 
Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	127
269931	cd01224	PH_Collybistin_ASEF	Collybistin/APC-stimulated guanine nucleotide exchange factor pleckstrin homology (PH) domain. Collybistin (also called PEM2) is homologous to the Dbl proteins ASEF (also called ARHGEF4/RhoGEF4) and SPATA13 (Spermatogenesis-associated protein 13; also called ASEF2). It activates CDC42 specifically and not any other Rho-family GTPases. Collybistin consists of an SH3 domain, followed by a RhoGEF/DH and PH domain. In Dbl proteins, the DH and PH domains catalyze the exchange of GDP for GTP in Rho GTPases, allowing them to signal to downstream effectors. It induces submembrane clustering of the receptor-associated peripheral membrane protein gephyrin, which is thought to form a scaffold underneath the postsynaptic membrane linking receptors to the cytoskeleton. It also acts as a tumor suppressor that links adenomatous polyposis coli (APC) protein, a negative regulator of the Wnt signaling pathway and promotes the phosphorylation and degradation of beta-catenin, to Cdc42. Autoinhibition of collybistin is accomplished by the binding of its SH3 domain with both the RhoGEF and PH domains to block access of Cdc42 to the GTPase-binding site. Inactivation promotes cancer progression. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. 
Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	138
269932	cd01225	PH_Cool_Pix	Cloned out of library/PAK-interactive exchange factor pleckstrin homology (PH) domain. There are two forms of Pix proteins: alpha Pix (also called Rho guanine nucleotide exchange factor (GEF) 6/90Cool-2) and beta Pix (GEF7/p85Cool-1). betaPix contains an N-terminal SH3 domain, a RhoGEF/DH domain, a PH domain, a GIT1 binding domain (GBD), and a C-terminal coiled-coil (CC) domain. alphaPix differs in that it contains a calponin homology (CH) domain, which interacts with beta-parvin, N-terminal to the SH3 domain. alphaPix is an exchange factor for Rac1 and Cdc42 and mediates Pak activation on cell adhesion to fibronectin. Mutations in alphaPix can cause X-linked mental retardation. alphaPix also interacts with Huntington's disease protein (htt), and enhances the aggregation of mutant htt (muthtt) by facilitating SDS-soluble muthtt-muthtt interactions. The DH-PH domain of a Pix was required for its binding to htt. In the majority of Rho GEF proteins, the DH-PH domain is responsible for the exchange activity. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	100
269933	cd01226	PH_RalBD_exo84	Exocyst complex 84-kDa subunit Ral-binding domain/Pleckstrin Homology (PH) domain. The Sec6/8 complex, also called the exocyst complex, forms an octameric protein (Sec3, Sec5, Sec6, Sec8, Sec10, Sec15, Exo70 and Exo84) involved in the tethering of secretory vesicles to specific regions on the plasma membrane. The regulation of Sec6/8 complex differs between mammals and yeast. Mammalian Exo84 and Sec5 are effector targets for active Ral GTPases which are not present in yeast. Ral GTPases are members of the Ras superfamily, and as such cycle between an active GTP-bound state and an inactive GDP-bound state. The Exo84 Ral-binding domain adopts a PH domain fold. Mammalian Exo84 and Sec5 competitively bind to active RalA. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	115
269934	cd01227	PH_Dbs	DBL's big sister protein pleckstrin homology (PH) domain. Dbs (also called MCF2-transforming sequence-like protein 2) is a guanine nucleotide exchange factor (GEF), which contains spectrin repeats, a rhoGEF (DH) domain and a PH domain. The Dbs PH domain participates in binding to both the Cdc42 and RhoA GTPases. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	126
269935	cd01228	PH_BCR-related	Breakpoint Cluster Region-related pleckstrin homology (PH) domain. The BCR gene is one of the two genes in the BCR-ABL complex, which is associated with the Philadelphia chromosome, a product of a reciprocal translocation between chromosomes 22 and 9. BCR is a GTPase-activating protein (GAP) for RAC1 (primarily) and CDC42. The Dbl region of BCR has the most RhoGEF activity for Cdc42, and less activity towards Rac and Rho. Since BCR possesses both GAP and GEF activities, it may function to temporally regulate the activity of these GTPases. It also displays serine/threonine kinase activity. The BCR protein contains multiple domains including an N-terminal kinase domain, a RhoGEF domain, a PH domain, a C1 domain, a C2 domain, and a C-terminal RhoGAP domain. ABR, a related smaller protein, is structurally similar to BCR, but lacks the N-terminal kinase domain and has GAP activity for both Rac and Cdc42. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	166
269936	cd01229	PH_Ect2	Epithelial cell transforming 2 (Ect2) pleckstrin homology (PH) domain. Ect2, a mammalian ortholog of Drosophila pebble, plays a role in neuronal differentiation and brain development. Pebble and Ect2 have been identified as Rho-family guanine nucleotide exchange factors (GEF) that mediate activation of Rho during cytokinesis, but are proposed to play slightly different roles. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	180
269937	cd01230	PH1_Tiam1_2	T-lymphoma invasion and metastasis 1 and 2 Pleckstrin Homology (PH) domain, N-terminal domain. Tiam1 activates Rac GTPases to induce membrane ruffling and cell motility while Tiam2 (also called STEF (SIF (still life) and Tiam1 like-exchange factor)) contributes to neurite growth. Tiam1/2 are Dbl-family GEFs that possess a Dbl (DH) domain with a PH domain in tandem. The DH-PH domain catalyzes the GDP/GTP exchange reaction in the GTPase cycle, facilitating the switch between inactive GDP-bound and active GTP-bound states. Tiam1/2 possess two PH domains, which are often referred to as PHn and PHc domains. The DH-PH tandem domain is made up of the PHc domain while the PHn is part of a novel N-terminal PHCCEx domain which is made up of the PHn domain, a coiled coil region (CC), and an extra region (Ex). PHCCEx mediates binding to plasma membranes and signalling proteins in the activation of Rac GTPases. The PH domain resembles the beta-spectrin PH domain, suggesting non-canonical phosphatidylinositol binding. CC and Ex form a positively charged surface for protein binding. There are 2 motifs in Tiam1/2-interacting proteins that bind to the PHCCEx domain: Motif-I in CD44, ephrinBs, and the NMDA receptor and Motif-II in Par3 and JIP2. Neither of these falls in the PHn domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. 
A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	127
269938	cd01231	PH_SH2B_family	SH2B adapter protein 1, 2, and 3 Pleckstrin homology (PH) domain. SH2B family/APS proteins are a family of intracellular adaptor proteins that influence a variety of signaling pathways mediated by Janus kinase (JAK) and receptor tyrosine kinases (RTKs) including receptors for insulin, insulin-like growth factor-1, Janus kinase 2 (Jak2), platelet derived growth factor, fibroblast growth factor and nerve growth factor. They function in glucose homeostasis, energy metabolism, hematopoiesis and reproduction. Mutations in human SH2B orthologs are associated with metabolic dysregulation and obesity. There are several SH2B members in mammals: SH2B1 (splice variants: SH2B1alpha, SH2B1beta, SH2B1gamma, and SH2B1delta), SH2B2 (APS) and SH2B3 (Lnk). They contain a PH domain, a SH2 domain, a proline rich region, multiple consensus sites for tyrosine and serine/threonine phosphorylation and a highly conserved c-Cbl recognition motif. These domains function as protein-protein interaction motifs which allow SH2B proteins to integrate and transduce intracellular signals from multiple signaling networks in the absence of intrinsic catalytic activity. SH2B proteins bind via their SH2 domains to phosphotyrosine residues within the intracellular tails of several activated RTKs thereby contributing to receptor activation. SH2B proteins have been shown to interact with insulin receptor substrates IRS1 and IRS2, Grb2, Shc and c-Cbl which may or may not require RTK-stimulated tyrosine phosphorylation of SH2B, positively and negatively regulating RTK signaling. Understanding the physiological functions of SH2B proteins in mammals has been complicated by the presence of multiple SH2B isoforms and conflicting data. Both SH2-Bbeta and APS associate with JAKs, but the former facilitates JAK/STAT signaling while the latter inhibits it. 
Lnk plays a role in cell growth and proliferation with mutations resulting in growth reduction, developmental delay and female sterility. Recently, Drosophila Lnk has been shown to be an important regulator of the insulin/insulin-like growth factor (IGF)-1 signaling (IIS) pathway during growth. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	115
269939	cd01233	PH_KIFIA_KIFIB	KIF1A and KIF1B protein pleckstrin homology (PH) domain. The kinesin-3 family motors KIF1A (Caenorhabditis elegans homolog unc-104) and KIF1B transport synaptic vesicle precursors that contain synaptic vesicle proteins, such as synaptophysin, synaptotagmin and the small GTPase RAB3A, but they do not transport organelles that contain plasma membrane proteins. They have an N-terminal motor domain, followed by a coiled-coil domain, and a C-terminal PH domain. KIF1A adopts a monomeric form in vitro, but acts as a processive dimer in vivo. KIF1B has alternatively spliced isoforms distinguished by the presence or absence of insertion sequences in the conserved amino-terminal region of the protein; this results in their different motor activities. KIF1A and KIF1B bind to RAB3 proteins through the adaptor protein mitogen-activated protein kinase (MAPK)-activating death domain (MADD; also called DENN), which was first identified as a RAB3 guanine nucleotide exchange factor (GEF). PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	103
269940	cd01234	PH_CADPS	Ca2+-dependent activator protein (also called CAPS) Pleckstrin homology (PH) domain. CADPS/CAPS consists of two members, CAPS1 which regulates catecholamine release from neuroendocrine cells and CAPS2 which is involved in the release of two neurotrophins, brain-derived neurotrophic factor (BDNF) and neurotrophin-3 (NT-3) from cerebellar granule cells. CADPS plays an important role in vesicle exocytosis in neurons and endocrine cells where it functions to prime the exocytic machinery for Ca2+-triggered fusion. Priming involves the assembly of trans SNARE complexes. The initial interaction of vesicles with target membranes is mediated by diverse stage-specific tethering factors or multi-subunit tethering complexes. CADPS and Munc13 proteins are proposed to be the functional homologs of the stage-specific tethering factors that prime membrane fusion. Interestingly, regions in the C-terminal half of CADPS are similar to the C-terminal region of Munc13-1 that was reported to bind syntaxin-1. CADPS has independent interactions with each of the SNARE proteins (Q-SNARE and R-SNARE) required for vesicle fusion. CADPS interacts with Q-SNARE proteins syntaxin-1 (H3 SNARE) and SNAP-25 (SN1) and might promote Q-SNARE heterodimer formation. Through its N-terminal R-SNARE VAMP-2 interactions, CADPS bound to heterodimeric Q-SNARE complexes could be involved in catalyzing the zippering of VAMP-2 into recipient complexes. It also contains a central PH domain that binds to phosphoinositide 4,5 bisphosphate containing liposomes. Membrane association may also be mediated by binding to phosphatidylserine via general electrostatic interactions. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. 
Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	122
269941	cd01235	PH_Sbf1_hMTMR5	Set binding factor 1 (also called Human MTMR5) Pleckstrin Homology (PH) domain. Sbf1 is a myotubularin-related pseudo-phosphatase. Both Sbf1 and myotubularin interact with the SET domains of Hrx and other epigenetic regulatory proteins, but Sbf1 lacks phosphatase activity due to several amino acid changes in its structurally preserved catalytic pocket. It contains pleckstrin (PH), GEF, and myotubularin homology domains that are thought to be responsible for signaling and growth control. Sbf1 functions as an inhibitor of cellular growth. The N-terminal GEF homology domain serves to inhibit the transforming effects of Sbf1. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	106
269942	cd01236	PH_RIP	Rho-Interacting Protein Pleckstrin homology (PH) domain. RIP1-RhoGDI2 was obtained in a screen for proteins that bind to wild-type RhoA. RIP2, RIP3, and RIP4 were isolated from cDNA libraries with constitutively active V14RhoA (containing the C190R mutation). RIP2 represents a novel GDP/GTP exchange factor (RhoGEF), while RIP3 (p116Rip) and RIP4 are thought to be structural proteins. RhoGEF contains a Dbl(DH)/PH region, a zinc finger motif, a leucine-rich domain, and a coiled-coil region. The last 2 domains are thought to be involved in mediating protein-protein interactions. RIP3 is a negative regulator of RhoA signaling that inhibits, either directly or indirectly, RhoA-stimulated actomyosin contractility. In plants RIP3 is localized at microtubules and interacts with the kinesin-13 family member AtKinesin-13A, suggesting a role for RIP3 in microtubule reorganization and a possible function in Rho proteins of plants (ROP)-regulated polar growth. It has a PH domain, two proline-rich regions which are putative binding sites for SH3 domains, and a COOH-terminal coiled-coil region which overlaps with the RhoA-binding region. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. 
Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	136
269943	cd01237	PH_fermitin	Fermitin family pleckstrin homology (PH) domain. Fermitin functions as a mediator of integrin inside-out signalling. The recruitment of Fermitin proteins and Talin to the membrane mediates the terminal event of integrin signalling, via interaction with integrin beta subunits. Fermitin has a FERM domain interrupted with a pleckstrin homology (PH) domain. Fermitin family homologs (Fermt1, 2, and 3, also known as Kindlins) are each encoded by a different gene. In mammalian studies, Fermt1 is generally expressed in epithelial cells, Fermt2 is expressed in muscle tissues, and Fermt3 is expressed in hematopoietic lineages. Specifically, Fermt2 is expressed in smooth and striated muscle tissues in mice and in the somites (a trunk muscle precursor) and neural crest in Xenopus embryos. As such it has been proposed that Fermt2 plays a role in cardiomyocyte and neural crest differentiation. Expression of mammalian Fermt3 is associated with hematopoietic lineages: the anterior ventral blood islands, vitelline veins, and early myeloid cells. In Xenopus embryos, this expression also includes the notochord and cement gland. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	125
269944	cd01238	PH_Btk	Bruton's tyrosine kinase pleckstrin homology (PH) domain. Btk is a member of the Tec family of cytoplasmic protein tyrosine kinases that includes BMX, IL2-inducible T-cell kinase (Itk) and Tec. Btk plays a role in the maturation of B cells. Tec proteins generally have an N-terminal PH domain, followed by a Tec homology (TH) domain, a SH3 domain, a SH2 domain and a kinase domain. The Btk PH domain binds phosphatidylinositol 3,4,5-trisphosphate and responds to signalling via phosphatidylinositol 3-kinase. The PH domain is also involved in membrane anchoring which is confirmed by the discovery of a mutation of a critical arginine residue in the BTK PH domain. This results in a severe immunodeficiency known as X-linked agammaglobulinemia (XLA) in humans and a related disorder in mice. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	140
269945	cd01239	PH_PKD	Protein kinase D (PKD/PKCmu) pleckstrin homology (PH) domain. The Protein Kinase D family is composed of three members, PKD1 (PKCmu), PKD2 and PKD3 (PKCnu). Like the C-type protein kinases (PKCs), PKDs are activated by diacylglycerol (DAG). They are involved in vesicular transport, cell proliferation, survival, migration and immune responses. PKD consists of tandem C1 domains, followed by a PH domain and a kinase domain. While the PKD PH domain has not been shown to bind phosphorylated inositol lipids and is not required for membrane translocation, it is required for nuclear export. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	127
269946	cd01240	PH_GRK2_subgroup	G Protein-Coupled Receptor Kinase 2 subgroup pleckstrin homology (PH) domain. GRKs are a family of serine-threonine kinases which phosphorylate activated G-protein coupled receptors leading to the release of the previously bound heterotrimeric G protein agonist and thus signal termination. There are seven mammalian GRKs (GRK1-7) grouped into three subfamilies: GRK1 (GRK1 and 7), GRK2 (GRK2 and 3), and GRK4 (GRK4-6). GRKs have three functional components: an N-terminal Regulator of G-protein signaling (RGS) domain which interacts with the seven-trans-membrane helical receptor protein and/or other membrane targets, a central catalytic protein kinase C (PKc) domain, and a C-terminal section containing an autophosphorylation region and a variable region that mediates membrane association. In both GRK2 (also known as beta-adrenergic receptor kinase-1) and GRK3 (beta-adrenergic receptor kinase-2), the C-terminal variable region contains a PH domain which gives binding specificity to Gbetagamma proteins. The GRK2 PH domain has an extended C-terminal helix, which mediates interactions with G beta gamma subunits. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	118
269947	cd01241	PH_PKB	Protein Kinase B-like pleckstrin homology (PH) domain. PKB (also called Akt), a member of the AGC kinase family, is a phosphatidylinositol 3'-kinase (PI3K)-dependent Ser/Thr kinase which alters the activity of the targeted protein. The name AGC is based on the three proteins that it is most similar to: cAMP-dependent protein kinase 1 (PKA; also known as PKAC), cGMP-dependent protein kinase (PKG; also known as CGK1) and protein kinase C (PKC). Human Akt has three isoforms derived from distinct genes: Akt1/PKBalpha, Akt2/PKBbeta, and Akt3/PKBgamma. All Akts have an N-terminal PH domain with an activating Thr phosphorylation site, a kinase domain, and a short C-terminal regulatory tail with an activating Ser phosphorylation site. The PH domain recruits Akt to the plasma membrane by binding to phosphoinositides (PtdIns-3,4-P2) and is required for activation. The phosphorylation of Akt at its Thr and Ser phosphorylation sites leads to increased Akt activity toward forkhead transcription factors, the mammalian target of rapamycin (mTOR), and the Bcl-xL/Bcl-2-associated death promoter (BAD), all of which possess a consensus motif R-X-R-XX-ST-B (X = amino acid, B = bulky hydrophobic residue) for Akt phosphorylation. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. 
Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	107
269948	cd01242	PH_ROCK	Rho-associated coiled-coil containing protein kinase pleckstrin homology (PH) domain. ROCK is a serine/threonine kinase that binds GTP-Rho. It consists of a kinase domain, a coiled coil region and a PH domain. The ROCK PH domain is interrupted by a C1 domain. ROCK plays a role in cellular functions, such as contraction, adhesion, migration, and proliferation and in the regulation of apoptosis. There are two ROCK isoforms, ROCK1 and ROCK2. In ROCK2 the Rho Binding Domain (RBD) and the PH domain work together in membrane localization with RBD receiving the RhoA signal and the PH domain receiving the phospholipid signal. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	110
269949	cd01243	PH_MRCK	MRCK (myotonic dystrophy-related Cdc42-binding kinase) pleckstrin homology (PH) domain. MRCK is thought to be a coincidence detector of signaling by Cdc42 and phosphoinositides. It has been shown to promote cytoskeletal reorganization, which affects many biological processes. There are 2 members of this family: MRCKalpha and MRCKbeta. MRCK consists of a serine/threonine kinase domain, a cysteine rich (C1) region, a PH domain and a p21 binding motif. The MRCK PH domain is responsible for its targeting to cell to cell junctions. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	135
269950	cd01244	PH_GAP1-like	RAS p21 protein activator (GTPase activating protein) family pleckstrin homology (PH) domain. RASAL1, GAP1(m), GAP1(IP4BP), and CAPRI are all members of the GAP1 family of GTPase-activating proteins. They contain N-terminal SH2-SH3-SH2 domains, followed by two C2 domains, a PH domain, a RasGAP domain, and a BTK domain. With the notable exception of GAP1(m), they all possess an arginine finger-dependent GAP activity on the Ras-related protein Rap1. They act as suppressors of RAS, enhancing the weak intrinsic GTPase activity of RAS proteins, resulting in the inactive GDP-bound form of RAS and allowing control of cellular proliferation and differentiation. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	107
269951	cd01247	PH_FAPP1_FAPP2	Four phosphate adaptor protein 1 and 2 Pleckstrin homology (PH) domain. Human FAPP1 (also called PLEKHA3/Pleckstrin homology domain-containing, family A member 3) regulates secretory transport from the trans-Golgi network to the plasma membrane. It is recruited through binding of PH domain to phosphatidylinositol 4-phosphate (PtdIns(4)P) and a small GTPase ADP-ribosylation factor 1 (ARF1). These two binding sites have little overlap, allowing the FAPP1 PH domain to associate with both ligands simultaneously and independently. FAPP1 has an N-terminal PH domain followed by a short proline-rich region. FAPP1 is a member of the oxysterol binding protein (OSBP) family which includes OSBP, OSBP-related proteins (ORP), and Goodpasture antigen binding protein (GPBP). They have a wide range of purported functions including sterol transport, cell cycle control, pollen development and vesicle transport from Golgi recognize both PI lipids and ARF proteins. FAPP2 (also called PLEKHA8/Pleckstrin homology domain-containing, family A member 8), a member of the Glycolipid lipid transfer protein (GLTP) family, has an N-terminal PH domain that targets the TGN and a C-terminal GLTP domain. FAPP2 functions to traffic glucosylceramide (GlcCer) which is made in the Golgi. Its interaction with vesicle-associated membrane protein-associated protein (VAP) could be a means of regulation. Some FAPP2s share the FFAT-like motifs found in GLTP. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. 
PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	100
269952	cd01248	PH_PLC_ELMO1	Phospholipase C and Engulfment and cell motility protein 1 pleckstrin homology domain. The C-terminal region of ELMO1, the PH domain and Pro-rich sequences, binds the SH3-containing region of DOCK2 forming an intermolecular five-helix bundle allowing for DOCK mediated Rac1 activation. ELMO1, a mammalian homolog of C. elegans CED-12, contains an N-terminal RhoG-binding region, an ELMO domain, a PH domain, and a C-terminal sequence with three PxxP motifs. Specifically, PLCs catalyze the cleavage of phosphatidylinositol-4,5-bisphosphate (PIP2) and result in the release of 1,2-diacylglycerol (DAG) and inositol 1,4,5-trisphosphate (IP3). These products trigger the activation of protein kinase C (PKC) and the release of Ca2+ from intracellular stores. There are fourteen kinds of mammalian phospholipase C which are classified into six isotypes (beta, gamma, delta, epsilon, zeta, eta). All PLCs, except for PLCzeta, have a PH domain which is for the most part N-terminally located, though lipid binding specificity is not conserved between them. In addition PLC gamma contains a split PH domain within its catalytic domain that is separated by 2 SH2 domains and a single SH3 domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. 
Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	108
269953	cd01249	BAR-PH_GRAF_family	GTPase Regulator Associated with Focal adhesion and related proteins Pleckstrin homology (PH) domain. This hierarchy contains GRAF family members: OPHN1/oligophrenin1, GRAF1 (also called ARHGAP26/Rho GTPase activating protein 26), GRAF2 (also called ARHGAP10/ARHGAP42), AK057372, and LOC129897, all of which are members of the APPL family. OPHN1 is a RhoGAP involved in X-linked mental retardation, epilepsy, rostral ventricular enlargement, and cerebellar hypoplasia. Affected individuals have morphological abnormalities of their brain with enlargement of the cerebral ventricles and cerebellar hypoplasia. OPHN1 negatively regulates RhoA, Cdc42, and Rac1 in neuronal and non-neuronal cells. GRAF1 sculpts the endocytic membranes of the CLIC/GEEC (clathrin-independent carriers/GPI-enriched early endosomal compartments) endocytic pathway. It strongly interacts with dynamin and inhibition of dynamin abolishes CLIC/GEEC endocytosis. GRAF2, GRAF3 and oligophrenin are likely to play similar roles during clathrin-independent endocytic events. GRAF1 mutations are linked to leukaemia. All members are composed of an N-terminal BAR-PH domain, followed by a RhoGAP domain, a proline rich region, and a C-terminal SH3 domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. 
Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	105
241281	cd01250	PH_AGAP	Arf-GAP with GTPase, ANK repeat and PH domain-containing protein Pleckstrin homology (PH) domain. AGAP (also called centaurin gamma; PIKE/Phosphatidylinositol-3-kinase enhancer) proteins reside mainly in the nucleus and are known to activate phosphoinositide 3-kinase, a key regulator of cell proliferation, motility and vesicular trafficking. There are 3 isoforms of AGAP (PIKE-A, PIKE-L, and PIKE-S), the longest of which, PIKE-L, consists of N-terminal proline rich domains (PRDs), followed by a GTPase domain, a split PH domain (PHN and PHC), an ArfGAP domain and two ankyrin repeats. PIKE-S terminates after the PHN domain and PIKE-A is missing the PRD region. Centaurin binds phosphatidylinositol (3,4,5)P3. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	114
241282	cd01251	PH2_ADAP	ArfGAP with dual PH domains Pleckstrin homology (PH) domain, repeat 2. ADAP (also called centaurin alpha) is a phosphatidylinositide binding protein consisting of an N-terminal ArfGAP domain and two PH domains. In response to growth factor activation, PI3K phosphorylates phosphatidylinositol 4,5-bisphosphate to phosphatidylinositol 3,4,5-trisphosphate. Centaurin alpha 1 is recruited to the plasma membrane following growth factor stimulation by specific binding of its PH domain to phosphatidylinositol 3,4,5-trisphosphate. Centaurin alpha 2 is constitutively bound to the plasma membrane since it binds phosphatidylinositol 4,5-bisphosphate and phosphatidylinositol 3,4,5-trisphosphate with equal affinity. This cd contains the second PH domain repeat. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	105
269954	cd01252	PH_GRP1-like	General Receptor for Phosphoinositides-1-like Pleckstrin homology (PH) domain. GRP1/cytohesin3 and the related proteins ARNO (ARF nucleotide-binding site opener)/cytohesin-2 and cytohesin-1 are ARF exchange factors that contain a pleckstrin homology (PH) domain thought to target these proteins to cell membranes through binding polyphosphoinositides. The PH domains of all three proteins exhibit relatively high affinity for PtdIns(3,4,5)P3. Within the Grp1 family, diglycine (2G) and triglycine (3G) splice variants, differing only in the number of glycine residues in the PH domain, strongly influence the affinity and specificity for phosphoinositides. The 2G variants selectively bind PtdIns(3,4,5)P3 with high affinity, whereas the 3G variants bind PtdIns(3,4,5)P3 with about 30-fold lower affinity and require the polybasic region for plasma membrane targeting. These ARF-GEFs share a common, tripartite structure consisting of an N-terminal coiled-coil domain, a central domain with homology to the yeast protein Sec7, a PH domain, and a C-terminal polybasic region. The Sec7 domain is autoinhibited by conserved elements proximal to the PH domain. GRP1 binds to the DNA binding domain of certain nuclear receptors (TRalpha, TRbeta, AR, ER, but not RXR), and can repress thyroid hormone receptor (TR)-mediated transactivation by decreasing TR-complex formation on thyroid hormone response elements. ARNO promotes sequential activation of Arf6, Cdc42 and Rac1 and insulin secretion. Cytohesin acts as a PI 3-kinase effector mediating biological responses including cell spreading and adhesion, chemotaxis, protein trafficking, and cytoskeletal rearrangements, only some of which appear to depend on their ability to activate ARFs. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. 
They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	119
269955	cd01253	PH_ARHGAP21-like	ARHGAP21 and related proteins pleckstrin homology (PH) domain. ARHGAP family genes encode Rho/Rac/Cdc42-like GTPase activating proteins with a RhoGAP domain. These proteins function as GTPase-activating proteins (GAPs) for RHOA and CDC42. ARHGAP21 controls the Arp2/3 complex and F-actin dynamics at the Golgi complex by regulating the activity of the small GTPase Cdc42. It is recruited to the Golgi by the GTPase ARF1, through its PH domain and its helical motif. It is also required for CTNNA1 recruitment to adherens junctions. ARHGAP21 and its related proteins all contain a PH domain and a RhoGAP domain. Some of the members have additional N-terminal domains including PDZ, SH3, and SPEC. The ARHGAP21 PH domain interacts with the GTP-bound forms of both ARF1 and ARF6 ARF-binding domain/ArfBD. The members here include: ARHGAP15, ARHGAP21, and ARHGAP23. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	113
269956	cd01254	PH_PLD	Phospholipase D pleckstrin homology (PH) domain. PLD hydrolyzes phosphatidylcholine to phosphatidic acid (PtdOH), which can bind target proteins. PLD contains a PH domain, a PX domain and four conserved PLD signature domains. The PLD PH domain is specific for bisphosphorylated inositides. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	136
269957	cd01255	PH2_Tiam1_2	T-lymphoma invasion and metastasis 1 and 2 Pleckstrin Homology (PH) domain, C-terminal domain. Tiam1 activates Rac GTPases to induce membrane ruffling and cell motility while Tiam2 (also called STEF (SIF (still life) and Tiam1 like-exchange factor)) contributes to neurite growth. Tiam1/2 are Dbl-family GEFs that possess a Dbl(DH) domain with a PH domain in tandem. The DH-PH domain catalyzes the GDP/GTP exchange reaction in the GTPase cycle, facilitating the switch between inactive GDP-bound and active GTP-bound states. The DH domain of Tiam1 interacts with Switch regions 1 and 2 of Rac1 which blocks magnesium binding and GDP is released. Tiam1/2 possess two PH domains, which are often referred to as PHn and PHc domains. The DH-PH tandem domain is made up of the PHc domain while the PHn is part of a novel N-terminal PHCCEx domain which is made up of the PHn domain, a coiled coil region (CC), and an extra region (Ex). PHCCEx mediates binding to plasma membranes and signalling proteins in the activation of Rac GTPases. The PH domain resembles the beta-spectrin PH domain, suggesting non-canonical phosphatidylinositol binding. CC and Ex form a positively charged surface for protein binding. There are 2 motifs in Tiam1/2-interacting proteins that bind to the PHCCEx domain: Motif-I in CD44, ephrinBs, and the NMDA receptor and Motif-II in Par3 and JIP2. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. 
PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	172
269958	cd01256	PH_dynamin	Dynamin pleckstrin homology (PH) domain. Dynamin is a GTPase that regulates endocytic vesicle formation. It has an N-terminal GTPase domain, followed by a PH domain, a GTPase effector domain and a C-terminal proline arginine rich domain. Dynamin-like proteins, which are found in metazoa, plants and yeast have the same domain architecture as dynamin, but lack the PH domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	112
269959	cd01257	PH_IRS	Insulin receptor substrate (IRS) pleckstrin homology (PH) domain. Insulin receptor substrate (IRS) molecules are mediators in insulin signaling and play a role in maintaining basic cellular functions such as growth and metabolism. They act as docking proteins between the insulin receptor and a complex network of intracellular signaling molecules containing Src homology 2 (SH2) domains. Four members (IRS-1, IRS-2, IRS-3, IRS-4) of this family have been identified that differ as to tissue distribution, subcellular localization, developmental expression, binding to the insulin receptor, and interaction with SH2 domain-containing proteins. IRS molecules have an N-terminal PH domain, followed by an IRS-like PTB domain which has a PH-like fold. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	106
269960	cd01258	PHsplit_syntrophin	Syntrophin Split Pleckstrin homology (PH) domain. Syntrophins are scaffold proteins that associate with the Duchenne muscular dystrophy protein dystrophin and the dystrophin-related proteins, utrophin and dystrobrevin to form the dystrophin glycoprotein complex (DGC). There are 5 members (alpha, beta1, beta2, gamma1, and gamma2), all of which contain a split (also called joined) PH domain and a PDZ domain (PHN-PDZ-PHC). The split PH domain of alpha-syntrophin adopts a canonical PH domain fold and together with PDZ forms a supramodule functioning synergistically in binding to inositol phospholipids. The alpha-syntrophin PH-PDZ supramodule showed strong binding to phosphoinositides PI(3,5)P2 and PI(5)P, modest binding to PI(3,4)P2 and PI(4,5)P2, and weak binding to PI(3)P, PI(4)P, and PI(3,4,5)P. There are a large number of signaling proteins that bind to the PDZ domain of syntrophins: nitric oxide synthase (nNOS), aquaporin-4, voltage-gated sodium channels, potassium channels, serine/threonine protein kinases, and the ATP-binding cassette transporter A1. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	89
269961	cd01259	PH_APBB1IP	Amyloid beta (A4) Precursor protein-Binding, family B, member 1 Interacting Protein pleckstrin homology (PH) domain. APBB1IP consists of a Ras-associated (RA) domain, a PH domain, a family-specific BPS region, and a C-terminal SH2 domain. Grb7, Grb10 and Grb14 are paralogs that are also present in this hierarchy. These adapter proteins bind a variety of receptor tyrosine kinases, including the insulin and insulin-like growth factor-1 (IGF1) receptors. Grb10 and Grb14 are important tissue-specific negative regulators of insulin and IGF1 signaling and may contribute to type 2 (non-insulin-dependent) diabetes in humans. RA-PH functions as a single structural unit and is dimerized via a helical extension of the PH domain. The PH domains here are proposed to bind phosphoinositides non-canonically and are unlikely to bind an activated GTPase. The tandem RA-PH domains are present in a second adapter-protein family, MRL proteins, Caenorhabditis elegans protein MIG-1012, the mammalian proteins RIAM and lamellipodin and the Drosophila melanogaster protein Pico12, all of which are Ena/VASP-binding proteins involved in actin-cytoskeleton rearrangement. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. 
Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	124
269962	cd01260	PH_CNK_mammalian-like	Connector enhancer of KSR (Kinase suppressor of ras) (CNK) pleckstrin homology (PH) domain. CNK family members function as protein scaffolds, regulating the activity and the subcellular localization of RAS activated RAF. There is a single CNK protein present in Drosophila and Caenorhabditis elegans in contrast to mammals which have 3 CNK proteins (CNK1, CNK2, and CNK3). All of the CNK members contain a sterile a motif (SAM), a conserved region in CNK (CRIC) domain, and a PSD-95/DLG-1/ZO-1 (PDZ) domain, and, with the exception of CNK3, a PH domain. A CNK2 splice variant CNK2A also has a PDZ domain-binding motif at its C terminus and Drosophila CNK (D-CNK) also has a domain known as the Raf-interacting region (RIR) that mediates binding of the Drosophila Raf kinase. This cd contains CNKs from mammals, chickens, amphibians, fish, and crustacea. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	114
269963	cd01261	PH_SOS	Son of Sevenless (SOS) Pleckstrin homology (PH) domain. SOS is a Ras guanine nucleotide exchange factor. SOS is thought to transmit signals from activated receptor tyrosine kinases to the Ras signaling pathway. SOS contains a histone domain, Dbl-homology (DH), a PH domain, Rem domain, Cdc25 domain, and a Grb2 binding domain. The SOS PH domain binds to phosphatidylinositol-4,5-bisphosphate (PIP2) and phosphatidic acid (PA). SOS is dependent on Ras binding to the allosteric site via its histone domain for both a lower level of activity (Ras GDP) and maximal activity (Ras GTP). The DH domain blocks the allosteric Ras binding site in SOS. The PH domain is closely associated with the DH domain and the action of the DH-PH unit gates a reciprocal interaction between Ras and SOS. The C-terminal proline-rich domain of SOS binds to the adapter protein Grb2 which localizes the Sos protein to the plasma membrane and diminishes the negative effect of the C-terminal domain on the guanine nucleotide exchange activity of the CDC25-homology domain of SOS. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	109
241293	cd01262	PH_PDK1	3-Phosphoinositide dependent protein kinase 1 (PDK1) pleckstrin homology (PH) domain. PDK1 plays an important role in insulin and growth factor signalling cascades. It phosphorylates and activates many AGC (cAMP-dependent, cGMP-dependent, protein kinase C (PKC)) family of protein kinases members, including protein kinase B (PKB, also known as Akt), p70 ribosomal S6-kinase (S6K), serum and glucocorticoid responsive kinase (SGK), p90 ribosomal S6 kinase (RSK), and PKC. PDK1 contains an N-terminal serine/threonine kinase domain followed by a PH domain. Following binding of the PH domain to PtdIns(3,4,5)P3 and PtdIns(3,4)P2, PDK1 activates these enzymes by phosphorylating a Ser/Thr residue in their activation loop. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	107
269964	cd01263	PH_anillin	Anillin Pleckstrin homology (PH) domain. Anillin (Rhotekin/RTKN; also called PLEKHK/Pleckstrin homology domain-containing family K) is an actin binding protein involved in cytokinesis. It interacts with GTP-bound Rho proteins and results in the inhibition of their GTPase activity. Dysregulation of the Rho signal transduction pathway has been implicated in many forms of cancer. Anillin proteins have an N-terminal HRI domain/ACC (anti-parallel coiled-coil) finger domain or Rho-binding domain that binds small GTPases from the Rho family. The C-terminal PH domain helps target anillin to ectopic septin containing foci. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	121
269965	cd01264	PH_MELT_VEPH1	Melted pleckstrin homology (PH) domain. The melted protein (also called Ventricular zone expressed PH domain-containing protein homolog 1) is expressed in the developing central nervous system of vertebrates. It contains a single C-terminal PH domain that is required for membrane targeting. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	105
269966	cd01265	PH_TBC1D2A	TBC1 domain family member 2A pleckstrin homology (PH) domain. TBC1D2A (also called PARIS-1/Prostate antigen recognized and identified by SEREX 1 and ARMUS) contains a PH domain and a TBC-type GTPase catalytic domain. TBC1D2A integrates signaling between Arf6, Rac1, and Rab7 during junction disassembly. Activated Rac1 recruits TBC1D2A to locally inactivate Rab7 via its C-terminal TBC/RabGAP domain and facilitate E-cadherin degradation in lysosomes. The TBC1D2A PH domain mediates localization at cell-cell contacts and coprecipitates with cadherin complexes. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	102
241297	cd01266	PH_Gab1_Gab2	Grb2-associated binding proteins 1 and 2 pleckstrin homology (PH) domain. The Gab subfamily includes several Gab proteins, Drosophila DOS and C. elegans SOC-1. They are scaffolding adaptor proteins, which possess N-terminal PH domains and a C-terminus with proline-rich regions and multiple phosphorylation sites. Following activation of growth factor receptors, Gab proteins are tyrosine phosphorylated and activate PI3K, which generates 3-phosphoinositide lipids. By binding to these lipids via the PH domain, Gab proteins remain in proximity to the receptor, leading to further signaling. While not all Gab proteins depend on the PH domain for recruitment, it is required for Gab activity. The members in this cd include the Gab1 and Gab2 proteins. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	123
241298	cd01268	PTB_Numb	Numb Phosphotyrosine-binding (PTB) domain. Numb is a membrane associated adaptor protein which plays critical roles in cell fate determination. Numb proteins are involved in control of asymmetric cell division and cell fate choice, endocytosis, cell adhesion, cell migration, ubiquitination of specific substrates and a number of signaling pathways (Notch, Hedgehog, p53). Mutations in Numb play a critical role in disease (cancer). Numb has an N-terminal PTB domain and a C-terminal NumbF domain. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides lacking tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the Dab-like subgroup.	135
269967	cd01269	PTB_TBC1D1_like	TBC1 domain family member 1 and related proteins Phosphotyrosine-binding (PTB) domain. The TBC1D1-like members here include TBC1D1, TBC1D4 (also called Akt substrate of 160 kDa or AS160), and pollux (PLX), a calmodulin-binding protein, and are thought to have a role in regulating cell growth and differentiation. These proteins are thought to function as GTPase-activating proteins for Rab family protein(s). They may play a role in the cell cycle and differentiation of various tissues. They all contain an N-terminal PTB domain, a calmodulin CBD domain, and a C-terminal TBC domain which is thought to be a GTPase activator protein of Rab-like small GTPases. Recently, TBC1D1 and TBC1D4 were recognized to potentially link the proximal signalling of insulin and/or exercise with GLUT4. TBC1D4 is thought to be involved in contraction-stimulated glucose uptake, but TBC1D4-independent mechanisms (potentially involving TBC1D1) are likely to be essential for most of the contraction's effect. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides lacking tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the Dab-like subgroup.	143
269968	cd01270	PTB_CAPON-like	Carboxyl-terminal PDZ ligand of neuronal nitric oxide synthase protein (CAPON) Phosphotyrosine-binding (PTB) domain. CAPON (also known as Nitric oxide synthase 1 adaptor protein, NOS1AP) encodes a cytosolic protein that binds to the signaling molecule, neuronal NOS (nNOS). It contains an N-terminal PTB domain that binds to the small monomeric G protein, Dexras1 and a C-terminal PDZ-binding domain that mediates interactions with nNOS. Included in this cd are C. elegans proteins dystrobrevin, DYB-1, which controls neurotransmitter release and muscle Ca(2+) transients by localizing BK channels and DYstrophin-like phenotype and CAPON related, DYC-1, which is functionally related to dystrophin homolog, DYS-1. Mutations in the dystrophin gene cause Duchenne muscular dystrophy. DYS-1 shares sequence similarity, including key motifs, with its mammalian counterparts. These CAPON-like proteins all have a single PTB domain. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides lacking tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the Dab-like subgroup.	179
269969	cd01271	PTB2_Fe65	Fe65 C-terminal Phosphotyrosine-binding (PTB) domain. The neuronal adaptor protein Fe65 is involved in brain development, Alzheimer disease amyloid precursor protein (APP) signaling, and proteolytic processing of APP. It contains three protein-protein interaction domains, one WW domain, and a unique tandem array of phosphotyrosine-binding (PTB) domains. The C-terminal PTB domain is responsible for APP binding. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides lacking tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the Dab-like subgroup.	127
269970	cd01272	PTB1_Fe65	Fe65 N-terminal Phosphotyrosine-binding (PTB) domain. The neuronal adaptor protein Fe65 is involved in brain development, Alzheimer disease amyloid precursor protein (APP) signaling, and proteolytic processing of APP. It contains three protein-protein interaction domains, one WW domain, and a unique tandem array of phosphotyrosine-binding (PTB) domains. The N-terminal PTB domain was shown to interact with a variety of proteins, including the low density lipoprotein receptor-related protein (LRP-1), the ApoEr2 receptor, and the histone acetyltransferase Tip60. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides lacking tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the Dab-like subgroup.	138
269971	cd01273	PTB_CED-6	Cell death protein 6 homolog (CED-6/GULP1) Phosphotyrosine-binding (PTB) domain. CED6 (also known as GULP1: engulfment adaptor PTB domain containing 1) is an adaptor protein involved in the specific recognition and engulfment of apoptotic cells. CED6 has been shown to interact with the cytoplasmic tail of another protein involved in the engulfment of apoptotic cells, CED1. CED6 has a C-terminal PTB domain, which can bind to NPXY motifs. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides lacking tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the Dab-like subgroup.	144
269972	cd01274	PTB_Anks	Ankyrin repeat and sterile alpha motif (SAM) domain-containing (Anks) protein family Phosphotyrosine-binding (PTB) domain. Both AIDA-1b (AbetaPP intracellular domain-associated protein 1b) and Odin (also known as ankyrin repeat and sterile alpha motif domain-containing 1A; ANKS1A) belong to the Anks protein family. Both of these family members interacts with the EphA8 receptor. Ank members consists of ankyrin repeats, a SAM domain and a C-terminal PTB domain which is crucial for interaction with the juxtamembrane (JM) region of EphA8. PTB domains are classified into three groups, namely, phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains of which the Anks PTB is a member. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to binds peptides with a NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides lack tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the Dab-like subgroup.	146
238606	cd01275	FHIT	FHIT (fragile histidine family): FHIT proteins, related to the HIT family, carry a motif HxHxH/Qxx (x is a hydrophobic amino acid). On the basis of sequence, substrate specificity, structure, evolution and mechanism, HIT proteins are classified into three branches: the Hint branch, which consists of adenosine 5'-monophosphoramide hydrolases, the Fhit branch, which consists of diadenosine polyphosphate hydrolases, and the GalT branch, consisting of specific nucleoside monophosphate transferases. Fhit plays a very important role in the development of tumours. In fact, Fhit deletions are among the earliest and most frequent genetic alterations in the development of tumours.	126
238607	cd01276	PKCI_related	Protein Kinase C Interacting protein related (PKCI): PKCI and related proteins belong to the ubiquitous HIT family of hydrolases that act on alpha-phosphates of ribonucleotides. The members of this subgroup have a conserved HxHxHxx motif (x is a hydrophobic residue) that is a signature for this family. No enzymatic activity has been reported however, for PKCI and its related members.	104
238608	cd01277	HINT_subgroup	HINT (histidine triad nucleotide-binding protein) subgroup: Members of this CD belong to the superfamily of histidine triad hydrolases that act on alpha-phosphate of ribonucleotides. This subgroup includes members from all three forms of cellular life. Although the biochemical function has not been characterised for many of the members of this subgroup, the proteins from Yeast have been shown to be involved in secretion, peroxisome formation and gene expression.	103
238609	cd01278	aprataxin_related	aprataxin related: Aprataxin, a HINT family hydrolase is mutated in ataxia oculomotor apraxia syndrome. All the members of this subgroup have the conserved HxHxHxx (where x is a hydrophobic residue) signature motif. Members of this subgroup are predominantly eukaryotic in origin.	104
133387	cd01279	HTH_HspR-like	Helix-Turn-Helix DNA binding domain of HspR-like transcription regulators. Helix-turn-helix (HTH) transcription regulator HspR and related proteins, N-terminal domain. Heat shock protein regulators (HspR) have been shown to regulate expression of specific regulons in response to high temperature or high osmolarity in Streptomyces and Helicobacter, respectively. These proteins share the N-terminal DNA binding domain with other transcription regulators of the MerR superfamily that promote transcription by reconfiguring the spacer between the -35 and -10 promoter elements. A typical MerR regulator is comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their conserved N-terminal domains contain predicted winged HTH motifs that mediate DNA binding, while the dissimilar C-terminal domains bind specific coactivator molecules.	98
133388	cd01282	HTH_MerR-like_sg3	Helix-Turn-Helix DNA binding domain of putative transcription regulators from the MerR superfamily. Putative helix-turn-helix (HTH) MerR-like transcription regulators (subgroup 3). Based on sequence similarity, these proteins are predicted to function as transcription regulators that mediate responses to stress in eubacteria. They belong to the MerR superfamily of transcription regulators that promote transcription of various stress regulons by reconfiguring the operator sequence located between the -35 and -10 promoter elements. A typical MerR regulator is comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their N-terminal domains are homologous and contain a DNA-binding winged HTH motif, while the C-terminal domains are often dissimilar and bind specific coactivator molecules such as metal ions, drugs, and organic substrates.	112
238610	cd01283	cytidine_deaminase	Cytidine deaminase zinc-binding domain. These enzymes are Zn dependent. The zinc ion in the active site plays a central role in the proposed catalytic mechanism, activating a water molecule to form a hydroxide ion that performs a nucleophilic attack on the substrate. Cytidine deaminases catalyze the deamination of cytidine to uridine and are important in the pyrimidine salvage pathway in many cell types, from bacteria to humans. This family also includes the apoBec proteins, which are a mammal-specific expansion of RNA editing enzymes, the closely related phorbolins, and the AID (activation-induced) enzymes.	112
238611	cd01284	Riboflavin_deaminase-reductase	Riboflavin-specific deaminase. Riboflavin biosynthesis protein RibD (Diaminohydroxyphosphoribosylaminopyrimidine deaminase) catalyzes the deamination of 2,5-diamino-6-ribosylamino-4(3H)-pyrimidinone 5'-phosphate, which is an intermediate step in the biosynthesis of riboflavin.The ribG gene of Bacillus subtilis and the ribD gene of E. coli are bifunctional and contain this deaminase domain and a reductase domain which catalyzes the subsequent reduction of the ribosyl side chain.	115
238612	cd01285	nucleoside_deaminase	Nucleoside deaminases include adenosine, guanine and cytosine deaminases. These enzymes are Zn dependent and catalyze the deamination of nucleosides. The zinc ion in the active site plays a central role in the proposed catalytic mechanism, activating a water molecule to form a hydroxide ion that performs a nucleophilic attack on the substrate. The functional enzyme is a homodimer. Cytosine deaminase catalyzes the deamination of cytosine to uracil and ammonia and is a member of the pyrimidine salvage pathway. Cytosine deaminase is found in bacteria and fungi but is not present in mammals; for this reason, the enzyme is currently of interest for antimicrobial drug design and gene therapy applications against tumors. Some members of this family are tRNA-specific adenosine deaminases that generate inosine at the first position of their anticodon (position 34) of specific tRNAs; this modification is thought to enlarge the codon recognition capacity during protein synthesis. Other members of the family are guanine deaminases which deaminate guanine to xanthine as part of the utilization of guanine as a nitrogen source.	109
238613	cd01286	deoxycytidylate_deaminase	Deoxycytidylate deaminase domain. Deoxycytidylate deaminase catalyzes the deamination of dCMP to dUMP,  providing the nucleotide substrate for thymidylate synthase. The enzyme binds Zn++, which is required for catalytic activity. The activity of the enzyme is allosterically regulated by the ratio of dCTP to dTTP not only in eukaryotic cells but also in T-even phage-infected Escherichia coli, with dCTP acting as an activator and dTTP as an inhibitor.	131
238614	cd01287	FabA	FabA, beta-hydroxydecanoyl-acyl carrier protein (ACP)-dehydratase: Bacterial protein of the type II, fatty acid synthase system that binds ACP and catalyzes both dehydration and isomerization reactions, apparently in the same active site. The FabA structure is a homodimer with two independent active sites located at the dimer interface.  Each active site is tunnel-shaped and completely inaccessible to solvent.  No metal ions or cofactors are required for ligand binding or catalysis.	150
238615	cd01288	FabZ	FabZ is a 17kD beta-hydroxyacyl-acyl carrier protein (ACP) dehydratase that primarily catalyzes the dehydration of beta-hydroxyacyl-ACP to trans-2-acyl-ACP, the third step in the elongation phase of the bacterial/ plastid, type II, fatty-acid biosynthesis pathway.	131
238616	cd01289	FabA_like	Domain of unknown function, appears to be related to a diverse group of beta-hydroxydecanoyl ACP dehydratases (FabA) and beta-hydroxyacyl ACP dehydratases (FabZ). This group appears to lack the conserved active site histidine of FabA and FabZ.	138
211324	cd01291	PseudoU_synth	Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). Pseudouridine synthases contain the RsuA/RluD, TruA, TruB and TruD families. This group consists of eukaryotic, bacterial and archaeal pseudouridine synthases. Some psi sites, such as psi55, 13, 38 and 39 in tRNA, are highly conserved, being in the same position in eubacteria, archaebacteria and eukaryotes. Other psi sites occur in a more restricted fashion; for example, psi2604 in 23S RNA made by E.coli RluF has only been detected in E.coli. Human dyskerin, with the help of guide RNAs, makes the hundreds of pseudouridines present in rRNA and small nuclear RNAs (snRNAs). Mutations in human dyskerin cause X-linked dyskeratosis congenita. Missense mutation in human PUS1 causes mitochondrial myopathy and sideroblastic anemia (MLASA).	87
238617	cd01292	metallo-dependent_hydrolases	Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase  dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others.	275
238618	cd01293	Bact_CD	Bacterial cytosine deaminase and related metal-dependent hydrolases. Cytosine deaminases (CDs) catalyze the deamination of cytosine, producing uracil and ammonia. They play an important role in pyrimidine salvage. CDs are present in prokaryotes and fungi, but not mammalian cells. The bacterial enzymes, but not the fungal enzymes, are related to the adenosine deaminases (ADA). The bacterial enzymes are iron dependent and hexameric.	398
238619	cd01294	DHOase	Dihydroorotase (DHOase) catalyzes the reversible interconversion of carbamoyl aspartate to dihydroorotate, a key reaction in the pyrimidine biosynthesis. In contrast to the large polyfunctional CAD proteins of higher organisms, this group of DHOases is monofunctional and mainly dimeric.	335
238620	cd01295	AdeC	Adenine deaminase (AdeC) directly deaminates adenine to form hypoxanthine. This reaction is part of one of the adenine salvage pathways, as well as the degradation pathway. It is important for adenine utilization as a purine, as well as a nitrogen source in bacteria and archea.	422
238621	cd01296	Imidazolone-5PH	Imidazolonepropionase/imidazolone-5-propionate hydrolase (Imidazolone-5PH) catalyzes the third step in the histidine degradation pathway, the hydrolysis of (S)-3-(5-oxo-4,5-dihydro-3H-imidazol-4-yl)propanoate to N-formimidoyl-L-glutamate. In bacteria, the enzyme is part of histidine utilization (hut) operon.	371
238622	cd01297	D-aminoacylase	D-aminoacylases (N-acyl-D-Amino acid amidohydrolases) catalyze the hydrolysis of N-acyl-D-amino acids to produce the corresponding D-amino acids, which are used as intermediates in the synthesis of pesticides, bioactive peptides, and antibiotics.	415
238623	cd01298	ATZ_TRZ_like	TRZ/ATZ family contains enzymes from the atrazine degradation pathway and related hydrolases. Atrazine, a chlorinated herbicide, can be catabolized by a variety of different bacteria. The first three steps of the atrazine dehalogenation pathway are catalyzed by atrazine chlorohydrolase (AtzA), hydroxyatrazine ethylaminohydrolase (AtzB), and N-isopropylammelide N-isopropylaminohydrolase (AtzC). All three enzymes belong to the superfamily of metal dependent hydrolases. AtzA and AtzB, besides other related enzymes, are represented in this CD.	411
238624	cd01299	Met_dep_hydrolase_A	Metallo-dependent hydrolases, subgroup A is part of the superfamily of metallo-dependent hydrolases, a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The function of this subgroup is unknown.	342
238625	cd01300	YtcJ_like	YtcJ_like metal dependent amidohydrolases. YtcJ is a Bacillus subtilis ORF of unknown function. The Arabidopsis homolog LAF3 has been identified as a factor required for phytochrome A signalling.	479
238626	cd01301	rDP_like	renal dipeptidase (rDP), best studied in mammals and also called membrane or microsomal dipeptidase, is a membrane-bound glycoprotein hydrolyzing dipeptides and is involved in hydrolytic metabolism of penem and carbapenem beta-lactam antibiotics. Although the biological function of the enzyme is still unknown, it has been suggested to play a role in the renal glutathione metabolism.	309
238627	cd01302	Cyclic_amidohydrolases	Cyclic amidohydrolases, including hydantoinase, dihydropyrimidinase, allantoinase, and dihydroorotase, are involved in the metabolism of pyrimidines and purines, sharing the property of hydrolyzing the cyclic amide bond of each substrate to the corresponding N-carbamyl amino acids. Allantoinases catalyze the degradation of purines, while dihydropyrimidinases and hydantoinases, a microbial counterpart of dihydropyrimidinase, are involved in pyrimidine degradation. Dihydroorotase participates in the de novo synthesis of pyrimidines.	337
238628	cd01303	GDEase	Guanine deaminase (GDEase). Guanine deaminase is an aminohydrolase responsible for the conversion of guanine to xanthine and ammonia, the first step to utilize guanine as a nitrogen source. This reaction also removes the guanine base from the pool and therefore can play a role in the regulation of cellular GTP and the guanylate nucleotide pool.	429
238629	cd01304	FMDH_A	Formylmethanofuran dehydrogenase (FMDH) subunit A;  Methanogenic bacteria and archea derive the energy for autotrophic growth from methanogenesis, the reduction of CO2 with molecular hydrogen as the electron donor. FMDH catalyzes the first step in methanogenesis, the formyl-methanofuran synthesis. In this step, CO2 is bound to methanofuran and subsequently reduced to the formyl state with electrons derived from hydrogen.	541
238630	cd01305	archeal_chlorohydrolases	Predicted chlorohydrolases. These metallo-dependent hydrolases from archaea are part of the superfamily of metallo-dependent hydrolases, a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. They have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. Some members of this subgroup are predicted to be chlorohydrolases.	263
238631	cd01306	PhnM	PhnM is believed to be a subunit of the membrane associated C-P lyase complex. C-P lyase is thought to catalyze the direct cleavage of inactivated C-P bonds to yield inorganic phosphate and the corresponding hydrocarbons. It is responsible for cleavage of alkylphosphonates, which are utilized as sole phosphorus sources by many bacteria.	325
238632	cd01307	Met_dep_hydrolase_B	Metallo-dependent hydrolases, subgroup B is part of the superfamily of metallo-dependent hydrolases, a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The function of this subgroup is unknown.	338
238633	cd01308	Isoaspartyl-dipeptidase	Isoaspartyl dipeptidase hydrolyzes the beta-L-isoaspartyl linkages in dipeptides, as part of the degradative pathway to eliminate proteins with beta-L-isoaspartyl peptide bonds, bonds whereby the beta-group of an aspartate forms the peptide link with the amino group of the following amino acid. Formation of this bond is a spontaneous nonenzymatic reaction in nature and can profoundly affect the function of the protein. Isoaspartyl dipeptidase is an octameric enzyme that contains a binuclear zinc center in the active site of each subunit and shows a strong preference for hydrolyzing Asp-Leu dipeptides.	387
238634	cd01309	Met_dep_hydrolase_C	Metallo-dependent hydrolases, subgroup C is part of the superfamily of metallo-dependent hydrolases, a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The function of this subgroup is unknown.	359
238635	cd01310	TatD_DNAse	TatD like proteins;  E.coli TatD is a cytoplasmic protein, shown to have magnesium dependent DNase activity.	251
238636	cd01311	PDC_hydrolase	2-pyrone-4,6-dicarboxylic acid (PDC) hydrolase hydrolyzes PDC to yield 4-oxalomesaconic acid (OMA) or its tautomer, 4-carboxy-2-hydroxymuconic acid (CHM). This reaction is part of the protocatechuate (PCA) 4,5-cleavage pathway. PCA is one of the most important intermediate metabolites in the bacterial pathways for various phenolic compounds, including lignin, which is the most abundant aromatic material in nature.	263
238637	cd01312	Met_dep_hydrolase_D	Metallo-dependent hydrolases, subgroup D is part of the superfamily of metallo-dependent hydrolases, a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The function of this subgroup is unknown.	381
238638	cd01313	Met_dep_hydrolase_E	Metallo-dependent hydrolases, subgroup E is part of the superfamily of metallo-dependent hydrolases, a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The function of this subgroup is unknown.	418
238639	cd01314	D-HYD	D-hydantoinases (D-HYD) also called dihydropyrimidases (DHPase) and related proteins; DHPases are a family of enzymes that catalyze the reversible hydrolytic ring opening of the amide bond in five- or six-membered cyclic diamides, like dihydropyrimidine or hydantoin. The hydrolysis of dihydropyrimidines is the second step of reductive catabolism of pyrimidines in human. The hydrolysis of 5-substituted hydantoins in microorganisms leads to enantiomerically pure N-carbamyl amino acids, which are used for the production of antibiotics, peptide hormones, pyrethroids, and pesticides. HYDs are classified depending on their stereoselectivity. This family also includes collapsin response regulators (CRMPs), cytosolic proteins involved in neuronal differentiation and axonal guidance which have strong homology to DHPases, but lack most of the active site residues.	447
238640	cd01315	L-HYD_ALN	L-Hydantoinases (L-HYDs) and Allantoinase (ALN); L-Hydantoinases are a member of the dihydropyrimidinase family, which catalyzes the reversible hydrolytic ring opening of dihydropyrimidines and hydantoins (five-membered cyclic diamides used in biotechnology). But L-HYDs differ by having an L-enantio specificity and by lacking activity on possible natural substrates such as dihydropyrimidines. Allantoinase catalyzes the hydrolytic cleavage of the five-member ring of allantoin (5-ureidohydantoin) to form allantoic acid.	447
238641	cd01316	CAD_DHOase	The eukaryotic CAD protein is a trifunctional enzyme of carbamoylphosphate synthetase-aspartate transcarbamoylase-dihydroorotase, which catalyzes the first three steps of de novo pyrimidine nucleotide biosynthesis. Dihydroorotase (DHOase) catalyzes the third step, the reversible interconversion of carbamoyl aspartate to dihydroorotate.	344
238642	cd01317	DHOase_IIa	Dihydroorotase (DHOase), subgroup IIa; DHOases catalyze the reversible interconversion of carbamoyl aspartate to dihydroorotate, a key reaction in pyrimidine biosynthesis. This subgroup also contains proteins that lack the active site, like unc-33, a C.elegans protein involved in axon growth.	374
238643	cd01318	DHOase_IIb	Dihydroorotase (DHOase), subgroup IIb; DHOases catalyze the reversible interconversion of carbamoyl aspartate to dihydroorotate, a key reaction in pyrimidine biosynthesis. This group contains the archeal members of the DHOase family.	361
238644	cd01319	AMPD	AMP deaminase (AMPD) catalyzes the hydrolytic deamination of adenosine monophosphate (AMP) at position 6 of the adenine nucleotide ring. AMPD is a diverse and highly regulated eukaryotic key enzyme of the adenylate catabolic pathway.	496
238645	cd01320	ADA	Adenosine deaminase (ADA) is a monomeric zinc dependent enzyme which catalyzes the irreversible hydrolytic deamination of both adenosine, as well as desoxyadenosine, to ammonia and inosine or desoxyinosine, respectively. ADA plays an important role in the purine pathway. Low, as well as high levels of ADA activity have been linked to several diseases.	325
238646	cd01321	ADGF	Adenosine deaminase-related growth factors (ADGF), a novel family of secreted growth factors with sequence similarity to adenosine deaminase.	345
238647	cd01324	cbb3_Oxidase_CcoQ	Cytochrome cbb oxidase CcoQ.  Cytochrome cbb3 oxidase, the terminal oxidase in the respiratory chains of proteobacteria, is a multi-chain transmembrane protein located in the cell membrane. Like other cytochrome oxidases, it catalyzes the reduction of O2 and simultaneously pumps protons across the membrane.  Found exclusively in proteobacteria, cbb3 is believed to be a modern enzyme that has evolved independently to perform a specialized function in microaerobic energy metabolism. The cbb3 operon contains four genes (ccoNOQP or fixNOQP), with ccoN coding for subunit I.  Instead of a CuA-containing subunit II analogous to other cytochrome oxidases, cbb3 utilizes subunits ccoO and ccoP, which contain one and two hemes, respectively, to transfer electrons to the binuclear center.  ccoQ, the fourth subunit, is a single transmembrane helix protein.  It has been shown to protect the core complex from proteolytic degradation by serine proteases.  See cd00919, cd01322, or cd01323 for more information on cbb3 oxidase.	48
238648	cd01327	KAZAL_PSTI	Kazal-type pancreatic secretory trypsin inhibitors (PSTI) and related proteins, including the second domain of the ovomucoid turkey inhibitor and the C-terminal domain of the esophagus cancer-related gene-2 protein (ECRG-2), are members of the superfamily of kazal-type proteinase inhibitors and follistatin-like proteins.	45
238649	cd01328	FSL_SPARC	Follistatin-like SPARC (secreted protein, acidic, and rich in cysteines) domain; SPARC/BM-40/osteonectin is a multifunctional glycoprotein which modulates cellular interaction with the extracellular matrix by its binding to structural matrix proteins such as collagen and vitronectin. The protein is composed of an N-terminal acidic region, a follistatin (FS) domain and an EF-hand calcium binding domain. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a small hydrophobic core of alpha/beta structure (Kazal domain) and has five disulfide bonds and a conserved N-glycosylation site. The FSL_SPARC domain is a member of the superfamily of kazal-like proteinase inhibitors and follistatin-like proteins.	86
238650	cd01330	KAZAL_SLC21	The kazal-type serine protease inhibitor domain has been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The KAZAL_SLC21 domain is a member of the superfamily of kazal-like proteinase inhibitors and follistatin-like proteins.	54
176461	cd01334	Lyase_I	Lyase class I family; a group of proteins which catalyze similar beta-elimination reactions. The Lyase class I family contains class II fumarase, aspartase, adenylosuccinate lyase (ASL), argininosuccinate lyase (ASAL), prokaryotic-type 3-carboxy-cis,cis-muconate cycloisomerase (pCMLE), and related proteins. It belongs to the Lyase_I superfamily. Proteins of this family for the most part catalyze similar beta-elimination reactions in which a C-N or C-O bond is cleaved with the release of fumarate as one of the products. These proteins are active as tetramers. The four active sites of the homotetrameric enzyme are each formed by residues from three different subunits.	325
100105	cd01335	Radical_SAM	Radical SAM superfamily. Enzymes of this family generate radicals by combining a 4Fe-4S cluster and S-adenosylmethionine (SAM) in close proximity. They are characterized by a conserved CxxxCxxC motif, which coordinates the conserved iron-sulfur cluster. Mechanistically, they share the transfer of a single electron from the iron-sulfur cluster to SAM, which leads to its reductive cleavage to methionine and a 5'-deoxyadenosyl radical, which, in turn, abstracts a hydrogen from the appropriately positioned carbon atom. Depending on the enzyme, SAM is consumed during this process or it is restored and reused. Radical SAM enzymes catalyze steps in metabolism, DNA repair, the biosynthesis of vitamins and coenzymes, and the biosynthesis of many antibiotics. Examples are biotin synthase (BioB), lipoyl synthase (LipA), pyruvate formate-lyase (PFL), coproporphyrinogen oxidase (HemN), lysine 2,3-aminomutase (LAM), anaerobic ribonucleotide reductase (ARR), and  MoaA, an enzyme of the biosynthesis of molybdopterin.	204
133421	cd01336	MDH_cytoplasmic_cytosolic	Cytoplasmic and cytosolic Malate dehydrogenases. MDH is one of the key enzymes in the citric acid cycle, facilitating both the conversion of malate to oxaloacetate and replenishing levels of oxalacetate by reductive carboxylation of pyruvate. Members of this subfamily are eukaryotic MDHs localized to the cytoplasm and cytosol. MDHs are part of the NAD(P)-binding Rossmann fold superfamily, which includes a wide variety of protein families including the NAD(P)-binding domains of alcohol dehydrogenases, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate dehydrogenases, formate/glycerate dehydrogenases, siroheme synthases, 6-phosphogluconate dehydrogenases, aminoacid dehydrogenases, repressor rex, and NAD-binding potassium channel domains, among others.	325
133422	cd01337	MDH_glyoxysomal_mitochondrial	Glyoxysomal and mitochondrial malate dehydrogenases. MDH is one of the key enzymes in the citric acid cycle, facilitating both the conversion of malate to oxaloacetate and replenishing levels of oxalacetate by reductive carboxylation of pyruvate. Members of this subfamily are localized to the glyoxysome and mitochondria. MDHs are part of the NAD(P)-binding Rossmann fold superfamily, which includes a wide variety of protein families including the NAD(P)-binding domains of alcohol dehydrogenases, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate dehydrogenases, formate/glycerate dehydrogenases, siroheme synthases, 6-phosphogluconate dehydrogenases, aminoacid dehydrogenases, repressor rex, and NAD-binding potassium channel domains, among others.	310
133423	cd01338	MDH_choloroplast_like	Chloroplast-like malate dehydrogenases. MDH is one of the key enzymes in the citric acid cycle, facilitating both the conversion of malate to oxaloacetate and replenishing levels of oxalacetate by reductive carboxylation of pyruvate. Members of this subfamily are bacterial MDHs, and plant MDHs localized to the chloroplasts. MDHs are part of the NAD(P)-binding Rossmann fold superfamily, which includes a wide variety of protein families including the NAD(P)-binding domains of alcohol dehydrogenases, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate dehydrogenases, formate/glycerate dehydrogenases, siroheme synthases, 6-phosphogluconate dehydrogenases, aminoacid dehydrogenases, repressor rex, and NAD-binding potassium channel domains, among others.	322
133424	cd01339	LDH-like_MDH	L-lactate dehydrogenase-like malate dehydrogenase proteins. Members of this subfamily have an LDH-like structure and an MDH enzymatic activity. Some members, like MJ0490 from Methanococcus jannaschii, exhibit both MDH and LDH activities. Tetrameric MDHs, including those from phototrophic bacteria, are more similar to LDHs than to other MDHs. LDH catalyzes the last step of glycolysis in which pyruvate is converted to L-lactate. MDH is one of the key enzymes in the citric acid cycle, facilitating both the conversion of malate to oxaloacetate and replenishing levels of oxalacetate by reductive carboxylation of pyruvate. The LDH-like MDHs are part of the NAD(P)-binding Rossmann fold superfamily, which includes a wide variety of protein families including the NAD(P)-binding domains of alcohol dehydrogenases, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate dehydrogenases, formate/glycerate dehydrogenases, siroheme synthases, 6-phosphogluconate dehydrogenases, aminoacid dehydrogenases, repressor rex, and NAD-binding potassium channel domains, among others.	300
238651	cd01341	ADP_ribosyl	ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active.	137
293888	cd01342	Translation_Factor_II_like	Domain II of Elongation factor Tu (EF-Tu)-like proteins. Elongation factor Tu consists of three structural domains. Domain II adopts a beta barrel structure and is involved in binding to charged tRNA. Domain II is found in other proteins such as elongation factor G and translation initiation factor IF-2. This group also includes the C2 subdomain of domain IV of IF-2 that has the same fold as domain II of (EF-Tu). Like IF-2 from certain prokaryotes such as Thermus thermophilus, mitochondrial IF-2 lacks domain II, which is thought  to be involved in binding of E. coli IF-2 to 30S subunits.	80
238653	cd01343	PL1_Passenger_AT	Pertactin-like passenger domains (virulence factors), C-terminal, subgroup 1, of autotransporter proteins of the type V secretion system of Gram-negative bacteria. This subgroup includes the passenger domains of Neisseria and Haemophilus IgA1 proteases, SPATEs (serine protease autotransporters secreted by Enterobacteriaceae), Bordetella pertacins, and nonprotease autotransporters, TibA and similar AIDA-like proteins.	233
238654	cd01344	PL2_Passenger_AT	Pertactin-like passenger domains (virulence factors), C-terminal, subgroup 2, of autotransporter proteins of the type V secretion system of Gram-negative bacteria. This subgroup includes the passenger domains of the nonprotease autotransporters, Ag43, AIDA-1 and IcsA, as well as, the less characterized ShdA, MisL, and BapA autotransporters.	188
238655	cd01345	OM_channels	Porin superfamily. These outer membrane channels share a beta-barrel structure that differs in strand and shear number. Classical (Gram-negative) porins are non-specific channels for small hydrophilic molecules and form 16 beta-stranded barrels (16,20), which associate as trimers. Maltoporin-like channels have specificities for various sugars and form 18 beta-stranded barrels (18,22), which associate as trimers. Ligand-gated protein channels cooperate with a TonB associated inner membrane complex to actively transport ligands via the proton motive force and they form monomeric, (22,24) barrels. The 150-200 N-terminal residues form a plug that blocks the channel from the periplasmic end.	253
238656	cd01346	Maltoporin-like	The Maltoporin-like channels (LamB porin) form a trimeric structure that facilitates the diffusion of maltodextrins and other sugars across the outer membrane of Gram-negative bacteria. The membrane channel is formed by an 18-strand antiparallel beta-barrel (18,22). Loop 3 folds into the core to constrict pore size. Long irregular loops are found on the extracellular side, while short turns are in the periplasm. Tightly-bound water molecules are found in the eyelet of the passage, and only substrates that can displace and replace the broken hydrogen bonds are likely to enter the pore. In the MPR structure, loops 4, 6, and 9 have the greatest mobility and are highly variable; these are postulated to attract maltodextrins.	392
238657	cd01347	ligand_gated_channel	TonB dependent/Ligand-Gated channels are created by a monomeric 22 strand (22,24) anti-parallel beta-barrel. Ligands apparently bind to the large extracellular loops. The N-terminal 150-200 residues form a plug from the periplasmic end of barrel. Energy (proton-motive force) and TonB-dependent conformational alteration of channel (parts of plug, and loops 7 and 8) allow passage of ligand. FepA residues 12-18 form the TonB box, which mediates the interaction with the TonB-containing inner membrane complex. TonB preferentially interacts with ligand-bound receptors. Transport through the channel may resemble passage through an air lock. In this model, ligand binding leads to closure of the extracellular end of pore, then a TonB-mediated signal facilitates opening of the interior side of pore, deforming the N-terminal plug and allowing passage of the ligand to the periplasm. Such a mechanism would prevent the free diffusion of small molecules through the pore.	635
153129	cd01351	Aconitase	Aconitase catalytic domain; Aconitase catalyzes the reversible isomerization of citrate and isocitrate as part of the TCA cycle. Aconitase catalytic domain. Aconitase (aconitate hydratase) catalyzes the reversible isomerization of citrate and isocitrate as part of the TCA cycle. Cis-aconitate is formed as an intermediate product during the course of the reaction. In eukaryotes two isozymes of aconitase are known to exist: one found in the mitochondrial matrix and the other found in the cytoplasm. Aconitase, in its active form, contains a 4Fe-4S iron-sulfur cluster; three cysteine residues have been shown to be ligands of the 4Fe-4S cluster. This is the Aconitase core domain, including structural domains 1, 2 and 3, which binds the Fe-S cluster. The aconitase family also contains the following proteins: - Iron-responsive element binding protein (IRE-BP), a cytosolic protein that binds to iron-responsive elements (IREs). IREs are stem-loop structures found in the 5'UTR of ferritin, and delta aminolevulinic acid synthase mRNAs, and in the 3'UTR of transferrin receptor mRNA. IRE-BP also expresses aconitase activity. - 3-isopropylmalate dehydratase (isopropylmalate isomerase), the enzyme that catalyzes the second step in the biosynthesis of leucine. - Homoaconitase (homoaconitate hydratase), an enzyme that participates in the alpha-aminoadipate pathway of lysine biosynthesis and that converts cis-homoaconitate into homoisocitric acid.	389
153130	cd01355	AcnX	Putative Aconitase X catalytic domain. Putative Aconitase X catalytic domain. It is predicted by comparative genomic analysis. The proteins are mainly found in archaea and proteobacteria. They are distantly related to the Aconitase family of proteins by sequence similarity and secondary structure prediction. The functions have not yet been experimentally characterized. Thus, the prediction should be treated with caution.	389
238658	cd01356	AcnX_swivel	Putative Aconitase X swivel domain. It is predicted by comparative genomic analysis. The proteins are mainly found in archaea and proteobacteria. They are distantly related to the Aconitase family of proteins by sequence similarity and secondary structure prediction. The functions have not yet been experimentally characterized. Thus, the prediction should be treated with caution.	123
176462	cd01357	Aspartase	Aspartase. This subgroup contains Escherichia coli aspartase (L-aspartate ammonia-lyase), Bacillus aspartase and related proteins. It is a member of the Lyase class I family, which includes both aspartase (L-aspartate ammonia-lyase) and fumarase class II enzymes. Members of this family for the most part catalyze similar beta-elimination reactions in which a C-N or C-O bond is cleaved with the release of fumarate as one of the products. These proteins are active as tetramers. The four active sites of the homotetrameric enzyme are each formed by residues from three different subunits. Aspartase catalyzes the reversible deamination of aspartic acid.	450
176463	cd01359	Argininosuccinate_lyase	Argininosuccinate lyase (argininosuccinase, ASAL). This group contains ASAL and related proteins. It is a member of the Lyase class I family. Members of this family for the most part catalyze similar beta-elimination reactions in which a C-N or C-O bond is cleaved with the release of fumarate as one of the products. These proteins are active as tetramers. The four active sites of the homotetrameric enzyme are each formed by residues from three different subunits. ASAL is a cytosolic enzyme which catalyzes the reversible breakdown of argininosuccinate to arginine and fumarate during arginine biosynthesis. In ureotelic species ASAL also catalyzes a reaction involved in the production of urea. Included in this group are the major soluble avian eye lens proteins from duck, delta 1 and delta 2 crystallin. Of these two isoforms only delta 2 has retained ASAL activity. These crystallins may have evolved by gene recruitment of ASAL followed by gene duplication. In humans, mutations in ASAL result in the autosomal recessive disorder argininosuccinic aciduria.	435
176464	cd01360	Adenylsuccinate_lyase_1	Adenylsuccinate lyase (ASL)_subgroup 1. This subgroup contains bacterial and archaeal proteins similar to ASL, a member of the Lyase class I family. Members of this family for the most part catalyze similar beta-elimination reactions in which a C-N or C-O bond is cleaved with the release of fumarate as one of the products. These proteins are active as tetramers. The four active sites of the homotetrameric enzyme are each formed by residues from three different subunits. ASL catalyzes two steps in de novo purine biosynthesis: the conversion of 5-aminoimidazole-(N-succinylocarboxamide) ribotide (SAICAR) into 5-aminoimidazole-4-carboxamide ribotide (AICAR) and the conversion of adenylsuccinate (SAMP) into adenosine monophosphate (AMP).	387
176465	cd01362	Fumarase_classII	Class II fumarases. This subgroup contains Escherichia coli fumarase C, human mitochondrial fumarase, and related proteins.  It is a member of the Lyase class I family. Members of this family for the most part catalyze similar beta-elimination reactions in which a C-N or C-O bond is cleaved with the release of fumarate as one of the products. These proteins are active as tetramers. The four active sites of the homotetrameric enzyme are each formed by residues from three different subunits. Fumarase catalyzes the reversible hydration/dehydration of fumarate to L-malate during the Krebs cycle.	455
276814	cd01363	Motor_domain	Myosin and Kinesin motor domain. Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Some of the names do not match with what is given in the sequence list.  This is because they are based on the current nomenclature by Kollmar/Sebe-Pedros.	170
276815	cd01364	KISc_BimC_Eg5	Kinesin motor domain, BimC/Eg5 spindle pole proteins. Kinesin motor domain, BimC/Eg5 spindle pole proteins, participate in spindle assembly and chromosome segregation during cell division. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. In most kinesins, the motor domain is found at the N-terminus (N-type). N-type kinesins are (+) end-directed motors, i.e. they transport cargo towards the (+) end of the microtubule. Kinesin motor domains hydrolyze ATP at a rate of about 80 per second, and move along the microtubule at a speed of about 6400 Angstroms per second. To achieve that, kinesin head groups work in pairs. Upon replacing ADP with ATP, a kinesin motor domain increases its affinity for microtubule binding and locks in place. Also, the neck linker binds to the motor domain, which repositions the other head domain through the coiled-coil domain close to a second tubulin dimer, about 80 Angstroms along the microtubule. Meanwhile, ATP hydrolysis takes place, and when the second head domain binds to the microtubule, the first domain again replaces ADP with ATP, triggering a conformational change that pulls the first domain forward.	353
276816	cd01365	KISc_KIF1A_KIF1B	Kinesin motor domain, KIF1_like proteins. Kinesin motor domain, KIF1_like proteins. KIF1A (Unc104) transports synaptic vesicles to the nerve  terminal, KIF1B has been implicated in transport of mitochondria. Both proteins are expressed in neurons. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. In most kinesins, the motor domain is found at the N-terminus (N-type). N-type kinesins are (+) end-directed motors, i.e. they transport cargo towards the (+) end of the microtubule. In contrast to the majority of dimeric kinesins, most KIF1A/Unc104 kinesins are monomeric motors. A lysine-rich loop in KIF1A binds to the negatively charged C-terminus of tubulin and compensates for the lack of a second motor domain, allowing KIF1A to move processively.	361
276817	cd01366	KISc_C_terminal	Kinesin motor domain, KIFC2/KIFC3/ncd-like carboxy-terminal kinesins. Kinesin motor domain, KIFC2/KIFC3/ncd-like carboxy-terminal kinesins. Ncd is a spindle motor protein necessary for chromosome segregation in meiosis. KIFC2/KIFC3-like kinesins have been implicated in motility of the Golgi apparatus as well as dendritic and axonal transport in neurons. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. In this subgroup the motor domain is found at the C-terminus (C-type). C-type kinesins are (-) end-directed motors, i.e. they transport cargo towards the (-) end of the microtubule. Kinesin motor domains hydrolyze ATP at a rate of about 80 per second, and move along the microtubule at a speed of about 6400 Angstroms per second. To achieve that, kinesin head groups work in pairs. Upon replacing ADP with ATP, a kinesin motor domain increases its affinity for microtubule binding and locks in place. Also, the neck linker binds to the motor domain, which repositions the other head domain through the coiled-coil domain close to a second tubulin dimer, about 80 Angstroms along the microtubule. Meanwhile, ATP hydrolysis takes place, and when the second head domain binds to the microtubule, the first domain again replaces ADP with ATP, triggering a conformational change that pulls the first domain forward.	329
276818	cd01367	KISc_KIF2_like	Kinesin motor domain, KIF2-like group. Kinesin motor domain, KIF2-like group. KIF2 is a protein expressed in neurons, which has been associated with axonal transport and neuron development; alternative splice forms have been implicated in lysosomal translocation. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. In this subgroup the motor domain is found in the middle (M-type) of the protein chain. M-type kinesins are (+) end-directed motors, i.e. they transport cargo towards the (+) end of the microtubule. Kinesin motor domains hydrolyze ATP at a rate of about 80 per second, and move along the microtubule at a speed of about 6400 Angstroms per second (KIF2 may be slower). To achieve that, kinesin head groups work in pairs. Upon replacing ADP with ATP, a kinesin motor domain increases its affinity for microtubule binding and locks in place. Also, the neck linker binds to the motor domain, which repositions the other head domain through the coiled-coil domain close to a second tubulin dimer, about 80 Angstroms along the microtubule. Meanwhile, ATP hydrolysis takes place, and when the second head domain binds to the microtubule, the first domain again replaces ADP with ATP, triggering a conformational change that pulls the first domain forward.	328
276819	cd01368	KISc_KIF23_like	Kinesin motor domain, KIF23-like subgroup. Kinesin motor domain, KIF23-like subgroup. Members of this group may play a role in mitosis. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. In most kinesins, the motor domain is found at the N-terminus (N-type). N-type kinesins are (+) end-directed motors, i.e. they transport cargo towards the (+) end of the microtubule. Kinesin motor domains hydrolyze ATP at a rate of about 80 per second, and move along the microtubule at a speed of about 6400 Angstroms per second. To achieve that, kinesin head groups work in pairs. Upon replacing ADP with ATP, a kinesin motor domain increases its affinity for microtubule binding and locks in place. Also, the neck linker binds to the motor domain, which repositions the other head domain through the coiled-coil domain close to a second tubulin dimer, about 80 Angstroms along the microtubule. Meanwhile, ATP hydrolysis takes place, and when the second head domain binds to the microtubule, the first domain again replaces ADP with ATP, triggering a conformational change that pulls the first domain forward.	345
276820	cd01369	KISc_KHC_KIF5	Kinesin motor domain, kinesin heavy chain (KHC) or KIF5-like subgroup. Kinesin motor domain, kinesin heavy chain (KHC) or KIF5-like subgroup. Members of this group have been associated with organelle transport. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. In most kinesins, the motor domain is found at the N-terminus (N-type). N-type kinesins are (+) end-directed motors, i.e. they transport cargo towards the (+) end of the microtubule. Kinesin motor domains hydrolyze ATP at a rate of about 80 per second, and move along the microtubule at a speed of about 6400 Angstroms per second. To achieve that, kinesin head groups work in pairs. Upon replacing ADP with ATP, a kinesin motor domain increases its affinity for microtubule binding and locks in place. Also, the neck linker binds to the motor domain, which repositions the other head domain through the coiled-coil domain close to a second tubulin dimer, about 80 Angstroms along the microtubule. Meanwhile, ATP hydrolysis takes place, and when the second head domain binds to the microtubule, the first domain again replaces ADP with ATP, triggering a conformational change that pulls the first domain forward.	325
276821	cd01370	KISc_KIP3_like	Kinesin motor domain, KIP3-like subgroup. Kinesin motor domain, KIP3-like subgroup. The yeast kinesin KIP3 plays a role in positioning the mitotic spindle. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. In most kinesins, the motor domain is found at the N-terminus (N-type). N-type kinesins are (+) end-directed motors, i.e. they transport cargo towards the (+) end of the microtubule. Kinesin motor domains hydrolyze ATP at a rate of about 80 per second, and move along the microtubule at a speed of about 6400 Angstroms per second. To achieve that, kinesin head groups work in pairs. Upon replacing ADP with ATP, a kinesin motor domain increases its affinity for microtubule binding and locks in place. Also, the neck linker binds to the motor domain, which repositions the other head domain through the coiled-coil domain close to a second tubulin dimer, about 80 Angstroms along the microtubule. Meanwhile, ATP hydrolysis takes place, and when the second head domain binds to the microtubule, the first domain again replaces ADP with ATP, triggering a conformational change that pulls the first domain forward.	345
276822	cd01371	KISc_KIF3	Kinesin motor domain, kinesins II or KIF3_like proteins. Kinesin motor domain, kinesins II or KIF3_like proteins. Subgroup of kinesins, which form heterotrimers composed of 2 kinesins and one non-motor accessory subunit. Kinesins II play important roles in ciliary transport, and have been implicated in neuronal transport, melanosome transport, the secretory pathway, and mitosis. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. In this group the motor domain is found at the N-terminus (N-type). N-type kinesins are (+) end-directed motors, i.e. they transport cargo towards the (+) end of the microtubule. Kinesin motor domains hydrolyze ATP at a rate of about 80 per second, and move along the microtubule at a speed of about 6400 Angstroms per second. To achieve that, kinesin head groups work in pairs. Upon replacing ADP with ATP, a kinesin motor domain increases its affinity for microtubule binding and locks in place. Also, the neck linker binds to the motor domain, which repositions the other head domain through the coiled-coil domain close to a second tubulin dimer, about 80 Angstroms along the microtubule. Meanwhile, ATP hydrolysis takes place, and when the second head domain binds to the microtubule, the first domain again replaces ADP with ATP, triggering a conformational change that pulls the first domain forward.	334
276823	cd01372	KISc_KIF4	Kinesin motor domain, KIF4-like subfamily. Kinesin motor domain, KIF4-like subfamily. Members of this group seem to perform a variety of functions, and have been implicated in neuronal organelle transport and chromosome segregation during mitosis. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. In most kinesins, the motor domain is found at the N-terminus (N-type). N-type kinesins are (+) end-directed motors, i.e. they transport cargo towards the (+) end of the microtubule. Kinesin motor domains hydrolyze ATP at a rate of about 80 per second, and move along the microtubule at a speed of about 6400 Angstroms per second. To achieve that, kinesin head groups work in pairs. Upon replacing ADP with ATP, a kinesin motor domain increases its affinity for microtubule binding and locks in place. Also, the neck linker binds to the motor domain, which repositions the other head domain through the coiled-coil domain close to a second tubulin dimer, about 80 Angstroms along the microtubule. Meanwhile, ATP hydrolysis takes place, and when the second head domain binds to the microtubule, the first domain again replaces ADP with ATP, triggering a conformational change that pulls the first domain forward.	341
276824	cd01373	KISc_KLP2_like	Kinesin motor domain, KIF15-like subgroup. Kinesin motor domain, KIF15-like subgroup. Members of this subgroup seem to play a role in mitosis and meiosis. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. In most kinesins, the motor domain is found at the N-terminus (N-type). N-type kinesins are (+) end-directed motors, i.e. they transport cargo towards the (+) end of the microtubule. Kinesin motor domains hydrolyze ATP at a rate of about 80 per second, and move along the microtubule at a speed of about 6400 Angstroms per second. To achieve that, kinesin head groups work in pairs. Upon replacing ADP with ATP, a kinesin motor domain increases its affinity for microtubule binding and locks in place. Also, the neck linker binds to the motor domain, which repositions the other head domain through the coiled-coil domain close to a second tubulin dimer, about 80 Angstroms along the microtubule. Meanwhile, ATP hydrolysis takes place, and when the second head domain binds to the microtubule, the first domain again replaces ADP with ATP, triggering a conformational change that pulls the first domain forward.	347
276825	cd01374	KISc_CENP_E	Kinesin motor domain, CENP-E/KIP2-like subgroup. Kinesin motor domain, CENP-E/KIP2-like subgroup, involved in chromosome movement and/or spindle elongation during mitosis. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. In most kinesins, the motor domain is found at the N-terminus (N-type). N-type kinesins are (+) end-directed motors, i.e. they transport cargo towards the (+) end of the microtubule. Kinesin motor domains hydrolyze ATP at a rate of about 80 per second, and move along the microtubule at a speed of about 6400 Angstroms per second. To achieve that, kinesin head groups work in pairs. Upon replacing ADP with ATP, a kinesin motor domain increases its affinity for microtubule binding and locks in place. Also, the neck linker binds to the motor domain, which repositions the other head domain through the coiled-coil domain close to a second tubulin dimer, about 80 Angstroms along the microtubule. Meanwhile, ATP hydrolysis takes place, and when the second head domain binds to the microtubule, the first domain again replaces ADP with ATP, triggering a conformational change that pulls the first domain forward.	321
276826	cd01375	KISc_KIF9_like	Kinesin motor domain, KIF9-like subgroup. Kinesin motor domain, KIF9-like subgroup; might play a role in cell shape remodeling. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. In most kinesins, the motor domain is found at the N-terminus (N-type). N-type kinesins are (+) end-directed motors, i.e. they transport cargo towards the (+) end of the microtubule. Kinesin motor domains hydrolyze ATP at a rate of about 80 per second, and move along the microtubule at a speed of about 6400 Angstroms per second. To achieve that, kinesin head groups work in pairs. Upon replacing ADP with ATP, a kinesin motor domain increases its affinity for microtubule binding and locks in place. Also, the neck linker binds to the motor domain, which repositions the other head domain through the coiled-coil domain close to a second tubulin dimer, about 80 Angstroms along the microtubule. Meanwhile, ATP hydrolysis takes place, and when the second head domain binds to the microtubule, the first domain again replaces ADP with ATP, triggering a conformational change that pulls the first domain forward.	334
276827	cd01376	KISc_KID_like	Kinesin motor domain, KIF22/Kid-like subgroup. Kinesin motor domain, KIF22/Kid-like subgroup. Members of this group might play a role in regulating chromosomal movement along microtubules in mitosis. This catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. In most kinesins, the motor domain is found at the N-terminus (N-type). N-type kinesins are (+) end-directed motors, i.e. they transport cargo towards the (+) end of the microtubule. Kinesin motor domains hydrolyze ATP at a rate of about 80 per second, and move along the microtubule at a speed of about 6400 Angstroms per second. To achieve that, kinesin head groups work in pairs. Upon replacing ADP with ATP, a kinesin motor domain increases its affinity for microtubule binding and locks in place. Also, the neck linker binds to the motor domain, which repositions the other head domain through the coiled-coil domain close to a second tubulin dimer, about 80 Angstroms along the microtubule. Meanwhile, ATP hydrolysis takes place, and when the second head domain binds to the microtubule, the first domain again replaces ADP with ATP, triggering a conformational change that pulls the first domain forward.	319
276951	cd01377	MYSc_class_II	class II myosins, motor domain. Myosin motor domain in class II myosins. Class II myosins, also called conventional myosins, are the myosin type responsible for producing actomyosin contraction in metazoan muscle and non-muscle cells. Myosin II contains two heavy chains made up of the head (N-terminal) and tail (C-terminal) domains with a coiled-coil morphology that holds the two heavy chains together. Thus, myosin II has two heads. The intermediate neck domain is the region creating the angle between the head and tail. It also contains 4 light chains which bind the heavy chains in the "neck" region between the head and tail. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. Class-II myosins are regulated by phosphorylation of the myosin light chain or by binding of Ca2+. A cyclical interaction between myosin and actin provides the driving force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 	662
276829	cd01378	MYSc_Myo1	class I myosin, motor domain. Myosin I generates movement at the leading edge in cell motility, and class I myosins have been implicated in phagocytosis and vesicle transport. Myosin I, an unconventional myosin, does not form dimers. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. There are 5 myosin subclasses, of which subclasses c/h, d/g, and a/b have an IQ domain and a TH1 domain. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 	652
276830	cd01379	MYSc_Myo3	class III myosin, motor domain. Myosin III has been shown to play a role in the vision process in insects and in hearing in mammals. Myosin III, an unconventional myosin, does not form dimers. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. They are characterized by an N-terminal protein kinase domain and several IQ domains.  Some members also contain WW, SH2, PH, and Y-phosphatase domains.  Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 	633
276831	cd01380	MYSc_Myo5	class V myosin, motor domain. Myo5, also called heavy chain 12, myoxin, are dimeric myosins that transport a variety of intracellular cargo processively along actin filaments, such as melanosomes, synaptic vesicles, vacuoles, and mRNA. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. It also contains a IQ domain and a globular DIL domain. Myosin V is a class of actin-based motor proteins involved in cytoplasmic vesicle transport and anchorage, spindle-pole alignment and mRNA translocation. The protein encoded by this gene is abundant in melanocytes and nerve cells. Mutations in this gene cause Griscelli syndrome type-1 (GS1), Griscelli syndrome type-3 (GS3) and neuroectodermal melanolysosomal disease, or Elejalde disease. Multiple alternatively spliced transcript variants encoding different isoforms have been reported, but the full-length nature of some variants has not been determined. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. Note that the Dictyostelium myoVs are not contained in this child group. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy.	629
276832	cd01381	MYSc_Myo7	class VII myosin, motor domain. These monomeric myosins have been associated with functions in sensory systems such as vision and hearing. Mammalian myosin VII has a tail with 2 MyTH4 domains, 2 FERM domains, and an SH3 domain. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 	648
276833	cd01382	MYSc_Myo6	class VI myosin, motor domain. Myosin VI is a monomeric myosin, which moves towards the minus-end of actin filaments, in contrast to most other myosins, which move towards the plus-end of actin filaments. It is thought that myosin VI, unlike plus-end directed myosins, does not use a pure lever arm mechanism, but instead steps with a mechanism analogous to the kinesin neck-linker uncoupling model. It has been implicated in a myriad of functions including: the transport of cytoplasmic organelles, maintenance of normal Golgi morphology, endocytosis, secretion, cell migration, border cell migration during development, and cancer metastasis; it also plays roles in deafness and retinal development, among others. While how this is accomplished is largely unknown, several interacting proteins have been identified, such as disabled homolog 2 (DAB2), GIPC1, synapse-associated protein 97 (SAP97; also known as DLG1) and optineurin, which have been found to target myosin VI to different cellular compartments. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the minus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 	649
276834	cd01383	MYSc_Myo8	class VIII myosin, motor domain. These plant-specific type VIII myosins have been associated with endocytosis, cytokinesis, cell-to-cell coupling and gating at plasmodesmata. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. It also contains IQ domains. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 	647
276835	cd01384	MYSc_Myo11	class XI myosin, motor domain. These plant-specific type XI myosins are involved in organelle transport. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle.	647
276836	cd01385	MYSc_Myo9	class IX myosin, motor domain. Myosin IX is a processive single-headed motor, which might play a role in signalling. It has an N-terminal RA domain, an IQ domain, a C1_1 domain, and a RhoGAP domain. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 	690
276837	cd01386	MYSc_Myo18	class XVIII myosin, motor domain. Many members of this class contain an N-terminal PDZ domain, which is commonly found in proteins establishing molecular complexes. The motor domain itself does not exhibit ATPase activity, suggesting that it functions as an actin tether protein. It also has two IQ domains that probably bind light chains or related calmodulins and a C-terminal tail with two sections of coiled-coil domains, which are thought to mediate homodimerization. The function of these myosins is largely unknown. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 	689
276838	cd01387	MYSc_Myo15	class XV mammal-like myosin, motor domain. The class XV myosins are monomeric. In vertebrates, myosin XV appears to be expressed in sensory tissue and play a role in hearing. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. C-terminal to the head domain are 2 MyTH4 domains, a FERM domain, and an SH3 domain. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy.	657
238684	cd01388	SOX-TCF_HMG-box	SOX-TCF_HMG-box, class I member of the HMG-box superfamily of DNA-binding proteins. These proteins contain a single HMG box, and bind the minor groove of DNA in a highly sequence-specific manner. Members include SRY and its homologs in insects and vertebrates, and transcription factor-like proteins, TCF-1, -3, -4, and LEF-1. They appear to bind the minor groove of the A/T C A A A G/C-motif.	72
238685	cd01389	MATA_HMG-box	MATA_HMG-box, class I member of the HMG-box superfamily of DNA-binding proteins. These proteins contain a single HMG box, and bind the minor groove of DNA in a highly sequence-specific manner. Members include the fungal mating type gene products MC, MATA1 and Ste11.	77
238686	cd01390	HMGB-UBF_HMG-box	HMGB-UBF_HMG-box, class II and III members of the HMG-box superfamily of DNA-binding proteins. These proteins bind the minor groove of DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions.	66
380477	cd01391	Periplasmic_Binding_Protein_type1	Type 1 periplasmic binding fold superfamily. Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein coupled receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulators and the bacterial periplasmic binding proteins, the ligands are monosaccharides, including lactose, ribose, fructose, xylose, arabinose, galactose/glucose and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold.	280
143331	cd01392	HTH_LacI	Helix-turn-helix (HTH) DNA binding domain of the LacI family of transcriptional regulators. HTH-DNA binding domain of the LacI (lactose operon repressor) family of bacterial transcriptional regulators and their putative homologs found in plants. The LacI family has more than 500 members distributed among almost all bacterial species. The monomeric proteins of the LacI family contain common structural features that include a small DNA-binding domain with a helix-turn-helix motif in the N-terminus, a regulatory ligand-binding domain which exhibits the type I periplasmic binding protein fold in the C-terminus for oligomerization and for effector binding, and an approximately 18-amino acid linker connecting these two functional domains. In LacI-like transcriptional regulators, the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. When the C-terminal domain of the LacI family repressor binds its ligand, it undergoes a conformational change which affects the DNA-binding affinity of the repressor. In Escherichia coli, LacI represses transcription by binding with high affinity to the lac operon at a specific operator DNA sequence until it interacts with the physiological inducer allolactose or a non-degradable analog IPTG (isopropyl-beta-D-thiogalactopyranoside). Induction of the repressor lowers its affinity for the operator sequence, thereby allowing transcription of the lac operon structural genes (lacZ, lacY, and lacA). The lac repressor occurs as a tetramer made up of two functional dimers. Thus, two DNA binding domains of a dimer are required to bind the inverted repeat sequences of the operator DNA binding sites.	52
410881	cd01393	RecA-like	RecA family. RecA is a bacterial enzyme which has roles in homologous recombination, DNA repair, and the induction of the SOS response.  RecA couples ATP hydrolysis to DNA strand exchange. While prokaryotes have a single RecA protein, eukaryotes have multiple RecA homologs such as Rad51, DMC1 and Rad55/57.  Archaea have the RecA-like homologs RadA and RadB.	185
410882	cd01394	archRadB	archaeal RadB. The archaeal protein RadB shares similarity with RadA, the archaeal functional homologue of the bacterial RecA. The precise function of RadB is unclear.	216
238689	cd01395	HMT_MBD	Methyl-CpG binding domains (MBD) present in putative histone methyltransferases (HMT) such as CLLD8 and SETDB1 proteins; CLLD8 contains an MBD, a PreSET and a bifurcated SET domain, suggesting that CLLD8 might be associated with methylation-mediated transcriptional repression. SETDB1 and other proteins in this group have a similar domain architecture. SETDB1 is a novel KAP-1-associated histone H3, lysine 9-specific methyltransferase that contributes to HP1-mediated silencing of euchromatic genes by KRAB zinc-finger proteins.	60
238690	cd01396	MeCP2_MBD	MeCP2, MBD1, MBD2, MBD3, and MBD4 are members of a protein family that share the methyl-CpG-binding domain (MBD). The MBD consists of about 70 residues and is defined as the minimal region required for binding to methylated DNA by a methyl-CpG-binding protein which binds specifically to methylated DNA. The MBD can recognize a single symmetrically methylated CpG either as naked DNA or within chromatin.  MeCP2, MBD1 and MBD2 (and likely MBD3) form complexes with histone deacetylase and are involved in histone deacetylase-dependent repression of transcription. MBD4 is an endonuclease that forms a complex with the DNA mismatch-repair protein MLH1.	77
238691	cd01397	HAT_MBD	Methyl-CpG binding domains (MBD) present in putative chromatin remodelling factors such as BAZ2A; BAZ2A contains an MBD, DDT, PHD-type zinc finger and Bromo domain, suggesting that BAZ2A might be associated with histone acetyltransferase (HAT) activity. The Drosophila melanogaster toutatis protein, a putative subunit of the chromatin-remodeling complex, and other such proteins in this group share a similar domain architecture with BAZ2A, as does the Caenorhabditis elegans flectin homolog.	73
238692	cd01398	RPI_A	RPI_A: Ribose 5-phosphate isomerase type A (RPI_A) subfamily; RPI catalyzes the reversible conversion of ribose-5-phosphate to ribulose 5-phosphate, the first step of the non-oxidative branch of the pentose phosphate pathway. This reaction leads to the conversion of phosphosugars into glycolysis intermediates, which are precursors for the synthesis of amino acids, vitamins, nucleotides, and cell wall components. In plants, RPI is part of the Calvin cycle as ribulose 5-phosphate is the carbon dioxide receptor in the first dark reaction of photosynthesis. There are two unrelated types of RPIs (A and B), which catalyze the same reaction; at least one type of RPI is present in an organism. RPI_A is more widely distributed than RPI_B in bacteria, eukaryotes, and archaea.	213
238693	cd01399	GlcN6P_deaminase	GlcN6P_deaminase: Glucosamine-6-phosphate (GlcN6P) deaminase subfamily; GlcN6P deaminase catalyzes the reversible conversion of GlcN6P to D-fructose-6-phosphate (Fru6P) and ammonium. The reaction is an aldo-keto isomerization coupled with an amination or deamination. It is the last step of the metabolic pathway of N-acetyl-D-glucosamine-6-phosphate (GlcNAc6P). GlcN6P deaminase is a hexameric enzyme that is allosterically activated by GlcNAc6P.	232
238694	cd01400	6PGL	6PGL: 6-Phosphogluconolactonase (6PGL) subfamily; 6PGL catalyzes the second step of the oxidative phase of the pentose phosphate pathway, the hydrolysis of 6-phosphoglucono-1,5-lactone (delta form) to 6-phosphogluconate. 6PGL is thought to guard against the accumulation of the delta form of the lactone, which may be toxic through its reaction with endogenous cellular nucleophiles.	219
238695	cd01401	PncB_like	Nicotinate phosphoribosyltransferase (NAPRTase), related to PncB. Nicotinate phosphoribosyltransferase catalyses the formation of NAMN and PPi from 5-phosphoribosyl-1-pyrophosphate (PRPP) and nicotinic acid; this is the first, and also rate-limiting, reaction in the NAD salvage synthesis. This salvage pathway serves to recycle NAD degradation products. This subgroup is present in bacteria, archaea and fungi.	377
238696	cd01403	Cyt_c_Oxidase_VIIb	Cytochrome C oxidase chain VIIb. Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane.  The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. The VIIb subunit is found only in eukaryotes and its specific function remains unclear. A rare polymorphism of the CcO VIIb gene may be associated with the high risk of nasopharyngeal carcinoma in a Cantonese family.	51
238697	cd01406	SIR2-like	Sir2-like: Prokaryotic group of uncharacterized Sir2-like proteins which lack certain key catalytic residues and conserved zinc binding cysteines; and are members of the SIR2 superfamily of proteins, silent information regulator 2 (Sir2) enzymes which catalyze NAD+-dependent protein/histone deacetylation.	242
238698	cd01407	SIR2-fam	SIR2 family of proteins includes silent information regulator 2 (Sir2) enzymes which catalyze NAD+-dependent protein/histone deacetylation, where the acetyl group from the lysine epsilon-amino group is transferred to the ADP-ribose moiety of NAD+, producing nicotinamide and the novel metabolite O-acetyl-ADP-ribose. Sir2 proteins, also known as sirtuins, are found in all eukaryotes and many archaea and prokaryotes and have been shown to regulate gene silencing, DNA repair, metabolic enzymes, and life span. The most-studied function, gene silencing, involves the inactivation of chromosome domains containing key regulatory genes by packaging them into a specialized chromatin structure that is inaccessible to DNA-binding proteins. The oligomerization state of Sir2 appears to be organism-dependent, sometimes occurring as a monomer and sometimes as a multimer.	218
238699	cd01408	SIRT1	SIRT1: Eukaryotic group (class1) which includes human sirtuins SIRT1-3 and yeast Hst1-4; and are members of the SIR2 family of proteins, silent information regulator 2 (Sir2) enzymes which catalyze NAD+-dependent protein/histone deacetylation. Sir2 proteins have been shown to regulate gene silencing, DNA repair, and life span. The most-studied function, gene silencing, involves the inactivation of chromosome domains containing key regulatory genes by packaging them into a specialized chromatin structure that is inaccessible to DNA-binding proteins. The nuclear SIRT1 has been shown to target the p53 tumor suppressor protein for deacetylation to suppress DNA damage, and the cytoplasmic SIRT2 homolog has been shown to target alpha-tubulin for deacetylation for the maintenance of cell integrity.	235
238700	cd01409	SIRT4	SIRT4: Eukaryotic and prokaryotic group (class2) which includes human sirtuin SIRT4 and several bacterial homologs; and are members of the SIR2 family of proteins, silent information regulator 2 (Sir2) enzymes which catalyze NAD+-dependent protein/histone deacetylation. Sir2 proteins have been shown to regulate gene silencing, DNA repair, metabolic enzymes, and life span.	260
238701	cd01410	SIRT7	SIRT7: Eukaryotic and prokaryotic group (class4) which includes human sirtuin SIRT6, SIRT7, and several bacterial homologs; and are members of the SIR2 family of proteins, silent information regulator 2 (Sir2) enzymes which catalyze NAD+-dependent protein/histone deacetylation. Sir2 proteins have been shown to regulate gene silencing, DNA repair, metabolic enzymes, and life span.	206
238702	cd01411	SIR2H	SIR2H: Uncharacterized prokaryotic Sir2 homologs from several gram positive bacterial species and Fusobacteria; and are members of the SIR2 family of proteins, silent information regulator 2 (Sir2) enzymes which catalyze NAD+-dependent protein/histone deacetylation. Sir2 proteins have been shown to regulate gene silencing, DNA repair, metabolic enzymes, and life span.	225
238703	cd01412	SIRT5_Af1_CobB	SIRT5_Af1_CobB: Eukaryotic, archaeal and prokaryotic group (class3) which includes human sirtuin SIRT5, Archaeoglobus fulgidus Sir2-Af1, and E. coli CobB; and are members of the SIR2 family of proteins, silent information regulator 2 (Sir2) enzymes which catalyze NAD+-dependent protein/histone deacetylation. Sir2 proteins have been shown to regulate gene silencing, DNA repair, metabolic enzymes, and life span. CobB is a bacterial sirtuin that deacetylates acetyl-CoA synthetase at an active site lysine to stimulate its enzymatic activity. 	224
238704	cd01413	SIR2_Af2	SIR2_Af2: Archaeal and prokaryotic group which includes Archaeoglobus fulgidus Sir2-Af2, Sulfolobus solfataricus ssSir2, and several bacterial homologs; and are members of the SIR2 family of proteins, silent information regulator 2 (Sir2) enzymes which catalyze NAD+-dependent protein/histone deacetylation. Sir2 proteins have been shown to regulate gene silencing, DNA repair, metabolic enzymes, and life span. The Sir2 homolog from the archaeon Sulfolobus solfataricus deacetylates the non-specific DNA protein Alba to mediate transcription repression.	222
133469	cd01414	SAICAR_synt_Sc	non-metazoan 5-aminoimidazole-4-(N-succinylcarboxamide) ribonucleotide (SAICAR) synthase. Eukaryotic, bacterial, and archaeal group of SAICAR synthetases represented by the Saccharomyces cerevisiae (Sc) enzyme, mostly absent in metazoans. SAICAR synthetase catalyzes the seventh step of the de novo biosynthesis of purine nucleotides (also reported as eighth step). It converts 5-aminoimidazole-4-carboxyribonucleotide (CAIR), ATP, and L-aspartate into 5-aminoimidazole-4-(N-succinylcarboxamide) ribonucleotide (SAICAR), ADP, and phosphate.	279
133470	cd01415	SAICAR_synt_PurC	bacterial and archaeal 5-aminoimidazole-4-(N-succinylcarboxamide) ribonucleotide (SAICAR) synthase. A subfamily of SAICAR synthetases represented by the Thermotoga maritima (Tm) enzyme and E. coli PurC. SAICAR synthetase catalyzes the seventh step of the de novo biosynthesis of purine nucleotides (also reported as eighth step). It converts 5-aminoimidazole-4-carboxyribonucleotide (CAIR), ATP, and L-aspartate into 5-aminoimidazole-4-(N-succinylcarboxamide) ribonucleotide (SAICAR), ADP, and phosphate.	230
133471	cd01416	SAICAR_synt_Ade5	Ade5_like 5-aminoimidazole-4-(N-succinylcarboxamide) ribonucleotide (SAICAR) synthase. Eukaryotic group of SAICAR synthetases represented by the Drosophila melanogaster, N-terminal, SAICAR synthetase domain of bifunctional Ade5. The Ade5 gene product (CAIR-SAICARs) catalyzes the sixth and seventh steps of the de novo biosynthesis of purine nucleotides (also reported as seventh and eighth steps). SAICAR synthetase converts 5-aminoimidazole-4-carboxyribonucleotide (CAIR), ATP, and L-aspartate into 5-aminoimidazole-4-(N-succinylcarboxamide) ribonucleotide (SAICAR), ADP, and phosphate.	252
238705	cd01417	Ribosomal_L19e_E	Ribosomal protein L19e, eukaryotic.  L19e is found in the large ribosomal subunit of eukaryotes and archaea. L19e is distinct from the ribosomal subunit L19, which is found in prokaryotes. It consists of two small globular domains connected by an extended segment. It is located toward the surface of the large subunit, with one exposed end involved in forming the intersubunit bridge with the small subunit.  The other exposed end is involved in forming the translocon binding site, along with L22, L23, L24, L29, and L31e subunits.	164
238706	cd01418	Ribosomal_L19e_A	Ribosomal protein L19e, archaeal.  L19e is found in the large ribosomal subunit of eukaryotes and archaea. L19e is distinct from the ribosomal subunit L19, which is found in prokaryotes. It consists of two small globular domains connected by an extended segment. It is located toward the surface of the large subunit, with one exposed end involved in forming the intersubunit bridge with the small subunit.  The other exposed end is involved in forming the translocon binding site, along with L22, L23, L24, L29, and L31e subunits.	145
238707	cd01419	MoaC_A	MoaC family, archaeal. Members of this family are involved in molybdenum cofactor (Moco) biosynthesis, an essential cofactor of a diverse group of redox enzymes. MoaC, a small hexameric protein, converts, together with MoaA, a guanosine derivative to the precursor Z by inserting the carbon-8 of the purine between the 2' and 3' ribose carbon atoms, which is the first of three phases of Moco biosynthesis.	141
238708	cd01420	MoaC_PE	MoaC family, prokaryotic and eukaryotic. Members of this family are involved in molybdenum cofactor (Moco) biosynthesis, an essential cofactor of a diverse group of redox enzymes. MoaC, a small hexameric protein, converts, together with MoaA, a guanosine derivative to the precursor Z by inserting the carbon-8 of the purine between the 2' and 3' ribose carbon atoms, which is the first of three phases of Moco biosynthesis.	140
238709	cd01421	IMPCH	Inosine monophosphate cyclohydrolase domain. This is the N-terminal domain in the purine biosynthesis pathway protein ATIC (purH). The bifunctional ATIC protein contains a C-terminal  ATIC formylase domain that formylates 5-aminoimidazole-4-carboxamide-ribonucleotide. The IMPCH domain then converts the formyl-5-aminoimidazole-4-carboxamide-ribonucleotide to inosine monophosphate. This is the final step in de novo purine production.	187
238710	cd01422	MGS	Methylglyoxal synthase catalyzes the enolization of dihydroxyacetone phosphate (DHAP) to produce methylglyoxal. The first part of the catalytic mechanism is believed to be similar to TIM (triosephosphate isomerase) in that both enzymes utilize DHAP to form an ene-diolate phosphate intermediate. In MGS, the second catalytic step is characterized by the elimination of phosphate and collapse of the enediolate to form methylglyoxal instead of reprotonation to form the isomer glyceraldehyde 3-phosphate, as in TIM. This is the first reaction in the methylglyoxal bypass of the Embden-Meyerhof glycolytic pathway and is believed to provide physiological benefits under non-ideal growth conditions in bacteria.	115
238711	cd01423	MGS_CPS_I_III	Methylglyoxal synthase-like domain found in pyr1 and URA1-like carbamoyl phosphate synthetases (CPS), including ammonia-dependent CPS Type I, and glutamine-dependent CPS Type III. These are multidomain proteins, in which MGS is the C-terminal domain.	116
238712	cd01424	MGS_CPS_II	Methylglyoxal synthase-like domain from type II glutamine-dependent carbamoyl phosphate synthetase (CPS). CPS, a CarA and CarB heterodimer, catalyzes the production of carbamoyl phosphate which is subsequently employed in the metabolic pathways responsible for the synthesis of pyrimidine nucleotides or arginine. The MGS-like domain is the C-terminal domain of CarB and appears to play a regulatory role in CPS function by binding allosteric effector molecules, including UMP and ornithine.	110
100106	cd01425	RPS2	Ribosomal protein S2 (RPS2), involved in formation of the translation initiation complex, where it might contact the messenger RNA and several components of the ribosome. It has been shown that in Escherichia coli RPS2 is essential for the binding of ribosomal protein S1 to the 30S ribosomal subunit. In humans, most likely in all vertebrates, and perhaps in all metazoans, the protein also functions as the 67 kDa laminin receptor (LAMR1 or 67LR), which is formed from a 37 kDa precursor, and is overexpressed in many tumors. 67LR is a cell surface receptor which interacts with a variety of ligands, laminin-1 and others. It is assumed that the ligand interactions are mediated via the conserved C-terminus, which becomes extracellular as the protein undergoes conformational changes which are not well understood. Specifically, a conserved palindromic motif, LMWWML, may participate in the interactions. 67LR plays essential roles in the adhesion of cells to the basement membrane and subsequent signalling events, and has been linked to several diseases. Some evidence also suggests that the precursor of 67LR, 37LRP, is also present in the nucleus in animals, where it appears associated with histones.	193
349738	cd01426	ATP-synt_F1_V1_A1_AB_FliI_N	ATP synthase, alpha/beta subunits of F1/V1/A1 complex, flagellum-specific ATPase FliI, N-terminal domain. The alpha and beta (or A and B) subunits are primarily found in the F1, V1, and A1 complexes of the F-, V- and A-type family of ATPases with rotary motors. These ion-transporting rotary ATPases are composed of two linked multi-subunit complexes: the F1, V1, or A1 complex which contains three copies each of the alpha and beta subunits that form the soluble catalytic core involved in ATP synthesis/hydrolysis, and the Fo, Vo, or Ao complex which forms the membrane-embedded proton pore. The F-ATP synthases (also called FoF1-ATPases) are found in the inner membranes of eukaryotic mitochondria, in the thylakoid membranes of chloroplasts, or in the plasma membranes of bacteria. F-ATPases are the primary producers of ATP, using the proton gradient generated by oxidative phosphorylation (mitochondria) or photosynthesis (chloroplasts). Alternatively, under conditions of low driving force, ATP synthases function as ATPases, thus generating a transmembrane proton or Na(+) gradient at the expense of energy derived from ATP hydrolysis.  The A-ATP synthases (AoA1-ATPases), a different class of proton-translocating ATP synthases, are found in archaea and function like F-ATP synthases. Structurally, however, the A-ATP synthases are more closely related to the V-ATP synthases (vacuolar VoV1-ATPases), which are a proton-translocating ATPase responsible for acidification of eukaryotic intracellular compartments and for ATP synthesis in archaea and some eubacteria. Collectively, F-, V-, and A-type synthases can function in both ATP synthesis and hydrolysis modes. This family also includes the flagellum-specific ATPase/type III secretory pathway virulence-related protein, which shows extensive similarity to the alpha and beta subunits of F1-ATP synthase.	73
319763	cd01427	HAD_like	Haloacid dehalogenase-like hydrolases. The haloacid dehalogenase-like (HAD) superfamily includes L-2-haloacid dehalogenase, epoxide hydrolase, phosphoserine phosphatase, phosphomannomutase, phosphoglycolate phosphatase, P-type ATPase, and many others. This superfamily includes a variety of enzymes that catalyze the cleavage of substrate C-Cl, P-C, and P-OP bonds via nucleophilic substitution pathways. All of them use a nucleophilic aspartate in their phosphoryl transfer reaction. They catalyze nucleophilic substitution reactions at phosphorus or carbon centers, using a conserved Asp carboxylate in covalent catalysis. All members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. Members of this superfamily are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	106
238713	cd01428	ADK	Adenylate kinase (ADK) catalyzes the reversible phosphoryl transfer from adenosine triphosphate (ATP) to adenosine monophosphate (AMP) to yield adenosine diphosphate (ADP). This enzyme is required for the biosynthesis of ADP and is essential for homeostasis of adenosine phosphates.	194
349744	cd01429	ATP-synt_F1_V1_A1_AB_FliI_C	ATP synthase, alpha/beta subunits of F1/V1/A1 complex, flagellum-specific ATPase FliI, C-terminal domain. The alpha and beta (also called A and B) subunits are primarily found in the F1, V1, and A1 complexes of F-, V- and A-type family of ATPases with rotary motors. These ion-transporting rotary ATPases are composed of two linked multi-subunit complexes: the F1, V1, and A1 complexes contain three copies each of the alpha and beta subunits that form the soluble catalytic core, which is involved in ATP synthesis/hydrolysis, and the Fo, Vo, or Ao complex that forms the membrane-embedded proton pore. The F-ATP synthases (also called FoF1-ATPases) are found in the inner membranes of eukaryotic mitochondria, in the thylakoid membranes of chloroplasts, or in the plasma membranes of bacteria. F-ATPases are the primary producers of ATP, using the proton gradient generated by oxidative phosphorylation (mitochondria) or photosynthesis (chloroplasts). Alternatively, under conditions of low driving force, ATP synthases function as ATPases, thus generating a transmembrane proton or Na(+) gradient at the expense of energy  derived from ATP hydrolysis. The A-ATP synthases (AoA1-ATPases), a different class of proton-translocating ATP synthases, are found in archaea and function like F-ATP synthases. Structurally, however, the A-ATP synthases are more closely related to the V-ATP synthases (vacuolar VoV1-ATPases), which are a proton-translocating ATPase responsible for acidification of eukaryotic intracellular compartments and for ATP synthesis in archaea and some eubacteria. Collectively, F-, V-, and A-type synthases can function in both ATP synthesis and hydrolysis modes. This family also includes the flagellum-specific ATPase/type III secretory pathway virulence-related protein, which shows extensive similarity to the alpha and beta subunits of F1-ATP synthase.	70
319764	cd01431	P-type_ATPases	ATP-dependent membrane-bound cation and aminophospholipid transporters. The P-type ATPases are a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids. They are distinguished from other main classes of transport ATPases (F-, V-, and ABC-type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle. A general characteristic of P-type ATPases is a bundle of transmembrane helices which make up the transport path, and three domains on the cytoplasmic side of the membrane. Members include pumps that transport various light metal ions, such as H(+), Na(+), K(+), Ca(2+), and Mg(2+), pumps that transport indispensable trace elements, such as Zn(2+) and Cu(2+), pumps that remove toxic heavy metal ions, such as Cd(2+), and pumps such as aminophospholipid translocases which transport phosphatidylserine and phosphatidylethanolamine.	319
238714	cd01433	Ribosomal_L16_L10e	Ribosomal_L16_L10e: L16 is an essential protein in the large ribosomal subunit of bacteria, mitochondria, and chloroplasts. Large subunits that lack L16 are defective in peptidyl transferase activity, peptidyl-tRNA hydrolysis activity, association with the 30S subunit, binding of aminoacyl-tRNA and interaction with antibiotics. L16 is required for the function of elongation factor P (EF-P), a protein involved in peptide bond synthesis through the stimulation of peptidyl transferase activity by the ribosome. Mutations in L16 and the adjoining bases of 23S rRNA confer antibiotic resistance in bacteria, suggesting a role for L16 in the formation of the antibiotic binding site. The GTPase RbgA (YlqF) is essential for the assembly of the large subunit, and it is believed to regulate the incorporation of L16. L10e is the archaeal and eukaryotic cytosolic homolog of bacterial L16. L16 and L10e exhibit structural differences at the N-terminus.	112
238715	cd01434	EFG_mtEFG1_IV	EFG_mtEFG1_IV: domains similar to domain IV of the bacterial translational elongation factor (EF) EF-G. Included in this group is a domain of mitochondrial Elongation factor G1 (mtEFG1) proteins homologous to domain IV of EF-G. Eukaryotic cells harbor 2 protein synthesis systems: one localized in the cytoplasm, the other in the mitochondria. Most factors regulating mitochondrial protein synthesis are encoded by nuclear genes, translated in the cytoplasm, and then transported to the mitochondria. The eukaryotic system of elongation factor (EF) components is more complex than that in prokaryotes, with both cytoplasmic and mitochondrial elongation factors and multiple isoforms being expressed in certain species. During the process of peptide synthesis and tRNA site changes, the ribosome is moved along the mRNA a distance equal to one codon with the addition of each amino acid. In bacteria this translocation step is catalyzed by EF-G_GTP, which is hydrolyzed to provide the required energy. Thus, this action releases the uncharged tRNA from the P site and transfers the newly formed peptidyl-tRNA from the A site to the P site. Eukaryotic mtEFG1 proteins show significant homology to bacterial EF-Gs. Mutants in yeast mtEFG1 have impaired mitochondrial protein synthesis, respiratory defects and a tendency to lose mitochondrial DNA. There are two forms of mtEFG present in mammals (designated mtEFG1s and mtEFG2s); mtEFG2s are not present in this group.	116
259844	cd01435	RNAP_I_RPA1_N	Largest subunit (RPA1) of eukaryotic RNA polymerase I (RNAP I), N-terminal domain. RPA1 is the largest subunit of the eukaryotic RNA polymerase I (RNAP I). RNAP I is a multi-subunit protein complex responsible for the synthesis of rRNA precursors. RNAP I consists of at least 14 different subunits, the largest being homologous to subunit Rpb1 of yeast RNAP II and subunit beta' of bacterial RNAP. The yeast member of this family is known as Rpb190. Structure studies suggest that different RNA polymerase complexes share a similar crab-claw-shaped structure. The N-terminal domain of Rpb1, the largest subunit of RNAP II in yeast, forms part of the active site. It makes up the head and core of one clamp, as well as the pore and funnel structures of RNAP II. The strong homology between RPA1 and Rpb1 suggests a similar functional and structural role.	779
238716	cd01436	Dipth_tox_like	Mono-ADP-ribosylating toxins catalyze the transfer of ADP-ribose from NAD+ to eukaryotic Elongation Factor 2, halting protein synthesis. A single molecule of delivered toxin is sufficient to kill a cell. These toxins share mono-ADP-ribosylating activity with a variety of bacterial toxins, such as cholera toxin and pertussis toxin. The structural core is homologous to the poly-ADP-ribosylating enzymes such as the PARP enzymes and Tankyrase. Diphtheria toxin is encoded by a lysogenic bacteriophage. Both diphtheria toxin and Pseudomonas aeruginosa exotoxin A are multi-domain proteins. These domains provide EF2 ADP-ribosylating, receptor-binding, and intracellular trafficking/transmembrane functions.	147
238717	cd01437	parp_like	Poly(ADP-ribose) polymerase (parp) catalytic domain catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active. Poly(ADP-ribose)-like polymerases (PARPs 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP-ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length through interactions with telomere repeat binding factor 1.	347
238718	cd01438	tankyrase_like	Tankyrases interact with the telomere reverse transcriptase complex (TERT). Tankyrase 1 poly-ADP-ribosylates Telomere Repeat Binding Factor 1 (TRF1) while Tankyrase 2 can poly-ADP-ribosylate itself or TRF1. The tankyrases also contain multiple ankyrin repeats that mediate protein-protein interaction (binding TRF1 and insulin-responsive aminopeptidase) and may function as a complex. Overexpression of Tank1 promotes increased telomere length, while overexpressed Tank2 has been shown to promote PARP cleavage-independent cell death (necrosis).	223
238719	cd01439	TCCD_inducible_PARP_like	Poly(ADP-ribose) polymerases catalyse the covalent attachment of ADP-ribose units from NAD+ to themselves and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. 2,3,7,8-Tetrachlorodibenzo-p-dioxin (TCDD) causes pleiotropic effects in mammalian species through modulating gene expression. TCDD-inducible PARP (TiPARP) is a target of TCDD that may contribute to multiple responses to TCDD by modulating protein function through poly-ADP-ribosylation.	121
238720	cd01443	Cdc25_Acr2p	Cdc25 enzymes are members of the Rhodanese Homology Domain (RHOD) superfamily. Also included in this CD are eukaryotic arsenate resistance proteins such as Saccharomyces cerevisiae Acr2p and similar proteins. Cdc25 phosphatases activate the cell division kinases throughout the cell cycle progression. Cdc25 phosphatases dephosphorylate phosphotyrosine and phosphothreonine residues, in order to activate their Cdk/cyclin substrates. The Cdc25 and Acr2p RHOD domains have the signature motif (H/YCxxxxxR).	113
238721	cd01444	GlpE_ST	GlpE sulfurtransferase (ST) and homologs are members of the Rhodanese Homology Domain superfamily. Unlike other rhodanese sulfurtransferases, GlpE is a single domain protein but indications are that it functions as a dimer. The active site contains a catalytically active cysteine.	96
238722	cd01445	TST_Repeats	Thiosulfate sulfurtransferases (TST) contain 2 copies of the Rhodanese Homology Domain. Only the second repeat contains the catalytically active Cys residue. The role of the 1st repeat is uncertain, but believed to be involved in protein interaction. This CD aligns the 1st and 2nd repeats.	138
238723	cd01446	DSP_MapKP	N-terminal regulatory rhodanese domain of dual specificity phosphatases (DSP), such as Mapk Phosphatase. This domain is believed to determine substrate specificity by binding the substrate, such as ERK2, and activating the C-terminal catalytic domain by inducing a conformational change. This domain has homology to the Rhodanese Homology Domain.	132
238724	cd01447	Polysulfide_ST	Polysulfide-sulfurtransferase - Rhodanese Homology Domain. This domain is believed to serve as a polysulfide binding and transferase domain in anaerobic gram-negative bacteria, functioning in oxidative phosphorylation with polysulfide-sulfur as a terminal electron acceptor. The active site contains the same conserved cysteine that is the catalytic residue in other Rhodanese Homology Domain proteins.	103
238725	cd01448	TST_Repeat_1	Thiosulfate sulfurtransferase (TST), N-terminal, inactive domain. TST contains 2 copies of the Rhodanese Homology Domain; this is the 1st repeat, which does not contain the catalytically active Cys residue. The role of the 1st repeat is uncertain, but it is believed to be involved in protein interaction.	122
238726	cd01449	TST_Repeat_2	Thiosulfate sulfurtransferase (TST), C-terminal, catalytic domain. TST contains 2 copies of the Rhodanese Homology Domain; this is the second repeat. Only the second repeat contains the catalytically active Cys residue.	118
238727	cd01450	vWFA_subfamily_ECM	Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains.	161
238728	cd01451	vWA_Magnesium_chelatase	Magnesium chelatase: Mg-chelatase catalyses the insertion of Mg into protoporphyrin IX (Proto). In chlorophyll biosynthesis, insertion of Mg2+ into protoporphyrin IX is catalysed by magnesium chelatase in an ATP-dependent reaction. Magnesium chelatase is a three sub-unit (BchI, BchD and BchH) enzyme with a novel arrangement of domains: the C-terminal helical domain is located behind the nucleotide binding site. The BchD domain contains a AAA domain at its N-terminus and a VWA domain at its C-terminus. The VWA domain has been speculated to be involved in mediating protein-protein interactions.	178
238729	cd01452	VWA_26S_proteasome_subunit	26S proteasome plays a major role in eukaryotic protein breakdown, especially for ubiquitin-tagged proteins. It is an ATP-dependent protease responsible for the bulk of non-lysosomal proteolysis in eukaryotes, often using covalent modification of proteins by ubiquitylation. It consists of a 20S proteolytic core particle (CP) and a 19S regulatory particle (RP). The CP is an ATP independent peptidase consisting of hydrolyzing activities. One or both ends of CP carry the RP that confers both ubiquitin and ATP dependence to the 26S proteasome. The RP's proposed functions include recognition of substrates and translocation of these to CP for proteolysis. The RP can dissociate into stable lid and base subcomplexes. The base is composed of three non-ATPase subunits (Rpn 1, 2 and 10). A single residue in the vWA domain of Rpn10 has been implicated in stabilizing the lid-base association.	187
238730	cd01453	vWA_transcription_factor_IIH_type	Transcription factors IIH type: TFIIH is a multiprotein complex that is one of the five general transcription factors that binds RNA polymerase II holoenzyme. Orthologues of these genes are found in all completed eukaryotic genomes and all these proteins contain a VWA domain. The p44 subunit of TFIIH functions as a DNA helicase in RNA polymerase II transcription initiation and DNA repair, and its transcriptional activity is dependent on its C-terminal Zn-binding domains. The function of the vWA domain is unclear, but may be involved in complex assembly. The MIDAS motif is not conserved in this sub-group.	183
238731	cd01454	vWA_norD_type	norD type: Denitrifying bacteria contain both membrane bound and periplasmic nitrate reductases. Denitrification plays a major role in completing the nitrogen cycle by converting nitrate or nitrite to nitrogen gas. The pathway for microbial denitrification has been established as NO3- -> NO2- -> NO -> N2O -> N2. This reaction generally occurs under oxygen limiting conditions. Genetic and biochemical studies have shown that the first step of the biochemical pathway is catalyzed by periplasmic nitrate reductases. This family is widely present in proteobacteria and firmicutes. This version of the domain is also present in some archaeal members. The function of the vWA domain in this sub-group is not known. Members of this subgroup have a conserved MIDAS motif.	174
238732	cd01455	vWA_F11C1-5a_type	Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains. Not much is known about the functions of the members of this subgroup. The members of this subgroup are fused to the ancient AAA domain.	191
238733	cd01456	vWA_ywmD_type	VWA ywmD type: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains. Not much is known about the function of the members of this subgroup. All members of this subgroup however have a conserved MIDAS motif.	206
238734	cd01457	vWA_ORF176_type	VWA ORF176 type: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains. The members of this subgroup are Eubacterial in origin and have a conserved MIDAS motif. Not much is known about the biochemistry of these.	199
238735	cd01458	vWA_ku	Ku70/Ku80 N-terminal domain. The Ku78 heterodimer (composed of Ku70 and Ku80) contributes to genomic integrity through its ability to bind DNA double-strand breaks (DSB) in a preferred orientation. DSBs are repaired by either homologous recombination or non-homologous end joining, and Ku facilitates repair by the non-homologous end-joining pathway (NHEJ). The Ku heterodimer is required for an accurate process that tends to preserve the sequence at the junction. Ku78 is found in all three kingdoms of life. However, only the eukaryotic proteins have a vWA domain fused to them at their N-termini. The vWA domain is not involved in DNA binding but may very likely mediate Ku78's interactions with other proteins. Members of this subgroup lack the conserved MIDAS motif.	218
238736	cd01459	vWA_copine_like	VWA Copine: Copines are phospholipid-binding proteins originally identified in Paramecium. They are found in humans, and orthologues have been found in C. elegans and Arabidopsis thaliana. None have been found in D. melanogaster or S. cerevisiae. Phylogenetic distribution suggests that copines have been lost in some eukaryotes. No functional properties have been assigned to the VWA domains present in copines. The members of this subgroup contain a functional MIDAS motif based on their preferential binding to magnesium and manganese. However, the MIDAS motif is not totally conserved; in most cases the MIDAS consists of the sequence DxTxS instead of the usual DxSxS motif. The C2 domains present in copines mediate phospholipid binding.	254
238737	cd01460	vWA_midasin	VWA_Midasin: Midasin is a member of the AAA ATPase family. The proteins of this family are unified by their common architectural organization that is based upon a conserved ATPase domain. The AAA domain of midasin contains six tandem AAA protomers. The AAA domains in midasin are followed by a D/E rich domain that is followed by a VWA domain. The members of this subgroup have a conserved MIDAS motif. The function of this domain is not exactly known although it has been speculated to play a crucial role in midasin function.	266
238738	cd01461	vWA_interalpha_trypsin_inhibitor	vWA_interalpha trypsin inhibitor (ITI): ITI is a glycoprotein composed of three polypeptides- two heavy chains and one light chain (bikunin). Bikunin confers the protease-inhibitor function while the heavy chains are involved in rendering stability to the extracellular matrix by binding to hyaluronic acid. The heavy chains carry the VWA domain with a conserved MIDAS motif. Although the exact role of the VWA domains remains unknown, it has been speculated to be involved in mediating protein-protein interactions with the components of the extracellular matrix.	171
238739	cd01462	VWA_YIEM_type	VWA YIEM type: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains. Members of this subgroup have a conserved MIDAS motif; however, their biochemical function is not well characterised.	152
238740	cd01463	vWA_VGCC_like	VWA Voltage gated Calcium channel like: Voltage-gated calcium channels are a complex of five proteins: alpha 1, beta 1, gamma, alpha 2 and delta. The alpha 2 and delta subunits result from proteolytic processing of a single gene product, which carries at its N-terminus the VWA and cache domains. The alpha 2 delta gene family has orthologues in D. melanogaster and C. elegans but none have been detected in either A. thaliana or yeast. The exact biochemical function of the VWA domain is not known but the alpha 2 delta complex has been shown to regulate various functional properties of the channel complex.	190
238741	cd01464	vWA_subfamily	VWA subfamily: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains. Members of this subgroup have no assigned function. This subfamily is typified by the presence of a conserved MIDAS motif.	176
238742	cd01465	vWA_subgroup	VWA subgroup: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains. Not much is known about the function of the VWA domain in these proteins. The members do have a conserved MIDAS motif. The biochemical function however is not known.	170
238743	cd01466	vWA_C3HC4_type	VWA C3HC4-type: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains. Members of this subgroup belong to Zinc-finger family as they are found fused to RING finger domains. The MIDAS motif is not conserved in all the members of this family. The function of vWA domains however is not known.	155
238744	cd01467	vWA_BatA_type	VWA BatA type: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains. Members of this subgroup are bacterial in origin. They are typified by the presence of a MIDAS motif.	180
238745	cd01468	trunk_domain	trunk domain. COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is known as the trunk domain and has an alpha/beta vWA fold and forms the dimer interface. Some members of this family possess a partial MIDAS motif that is a characteristic feature of most vWA domain proteins.	239
238746	cd01469	vWA_integrins_alpha_subunit	Integrins are a class of adhesion receptors that link the extracellular matrix to the cytoskeleton and cooperate with growth factor receptors to promote cell survival, cell cycle progression and cell migration. Integrins consist of an alpha and a beta sub-unit. Each sub-unit has a large extracellular portion, a single transmembrane segment and a short cytoplasmic domain. The N-terminal domains of the alpha and beta subunits associate to form the integrin headpiece, which contains the ligand binding site, whereas the C-terminal segments traverse the plasma membrane and mediate interaction with the cytoskeleton and with signalling proteins. The VWA domains present in the alpha subunits of integrins seem to be a chordate specific radiation of the gene family being found only in vertebrates. They mediate protein-protein interactions.	177
238747	cd01470	vWA_complement_factors	Complement factors B and C2 are two critical proteases for complement activation. They both contain three CCP or Sushi domains, a trypsin-type serine protease domain and a single VWA domain with a conserved metal ion dependent adhesion site referred commonly as the MIDAS motif. Orthologues of these molecules are found from echinoderms to chordates. During complement activation, the CCP domains are cleaved off, resulting in the formation of an active protease that cleaves and activates complement C3. Complement C2 is in the classical pathway and complement B is in the alternative pathway. The interaction of C2 with C4 and of factor B with C3b are both dependent on Mg2+ binding sites within the VWA domains and the VWA domain of factor B has been shown to mediate the binding of C3. This is consistent with the common inferred function of VWA domains as magnesium-dependent protein interaction domains.	198
238748	cd01471	vWA_micronemal_protein	Micronemal proteins: The Toxoplasma lytic cycle begins when the parasite actively invades a target cell. In association with invasion, T. gondii sequentially discharges three sets of secretory organelles beginning with the micronemes, which contain adhesive proteins involved in parasite attachment to a host cell. Deployed as protein complexes, several micronemal proteins possess vertebrate-derived adhesive sequences that function in binding receptors. The VWA domain likely mediates the protein-protein interactions of these with their interacting partners.	186
238749	cd01472	vWA_collagen	von Willebrand factor (vWF) type A domain; equivalent to the I-domain of integrins.  This domain has a variety of functions including: intermolecular adhesion, cell migration, signalling, transcription, and DNA repair. In integrins these domains form heterodimers while in vWF it forms homodimers and multimers. There are different interaction surfaces of this domain as seen by its complexes with collagen with either integrin or human vWFA. In integrins collagen binding occurs via  the metal ion-dependent adhesion site (MIDAS) and involves three surface loops located on the upper surface of the molecule. In human vWFA, collagen binding is thought to occur on the bottom of the molecule and does not involve the vestigial MIDAS motif.	164
238750	cd01473	vWA_CTRP	CTRP for CS protein-TRAP-related protein: Adhesion of Plasmodium to host cells is an important phenomenon in parasite invasion and in malaria associated pathology. CTRP encodes a protein containing a putative signal sequence followed by a long extracellular region of 1990 amino acids, a transmembrane domain, and a short cytoplasmic segment. The extracellular region of CTRP contains two separated adhesive domains. The first domain contains six 210-amino acid-long homologous VWA domain repeats. The second domain contains seven repeats of 87-60 amino acids in length, which share similarities with the thrombospondin type 1 domain found in a variety of adhesive molecules. Finally, CTRP also contains consensus motifs found in the superfamily of haematopoietin receptors. The VWA domains in these proteins likely mediate protein-protein interactions.	192
238751	cd01474	vWA_ATR	ATR (Anthrax Toxin Receptor): Anthrax toxin is a key virulence factor for Bacillus anthracis, the causative agent of anthrax. ATR is the cellular receptor for the anthrax protective antigen and facilitates entry of the toxin into cells. The VWA domain in ATR contains the toxin binding site and mediates interaction with protective antigen. The binding is mediated by divalent cations that bind to the MIDAS motif. These proteins are a family of vertebrate ECM receptors expressed by endothelial cells.	185
238752	cd01475	vWA_Matrilin	VWA_Matrilin: In cartilaginous plate, extracellular matrix molecules mediate cell-matrix and matrix-matrix interactions thereby providing tissue integrity. Some members of the matrilin family are expressed specifically in developing cartilage rudiments. The matrilin family consists of at least four members. All the members of the matrilin family contain VWA domains, EGF-like domains and a heptad repeat coiled-coil domain at the carboxy terminus which is responsible for the oligomerization of the matrilins. The VWA domains have been shown to be essential for matrilin network formation by interacting with matrix ligands.	224
238753	cd01476	VWA_integrin_invertebrates	VWA_integrin (invertebrates): Integrins are a family of cell surface receptors that have diverse functions in cell-cell and cell-extracellular matrix interactions. Because of their involvement in many biologically important adhesion processes, integrins are conserved across a wide range of multicellular animals. Integrins from invertebrates have been identified from six phyla. There are no data to date to suggest any immunological functions for the invertebrate integrins. The members of this sub-group have the conserved MIDAS motif that is characteristic of this domain suggesting the involvement of the integrins in the recognition and binding of multi-ligands.	163
238754	cd01477	vWA_F09G8-8_type	VWA F09G8.8 type: Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains. The members of this subgroup lack the MIDAS motif. This subgroup is found only in C. elegans and the members identified thus far are always found fused to a C-Lectin type domain. Biochemical function thus far has not been attributed to any of the members of this subgroup.	193
238755	cd01478	Sec23-like	Sec23-like: Protein and membrane traffic in eukaryotes is mediated at least in part by the budding and fusion of intracellular transport vesicles that selectively carry cargo proteins and lipids from donor to acceptor organelles. The two main classes of vesicular carriers within the endocytic and the biosynthetic pathways are COP- and clathrin-coated vesicles. Formation of COPII vesicles requires the ordered assembly of the coat built from several cytosolic components GTPase Sar1, complexes of Sec23-Sec24 and Sec13-Sec31. The process is initiated by the conversion of GDP to GTP by the GTPase Sar1 which then recruits the heterodimeric complex of Sec23 and Sec24. This heterodimeric complex generates the pre-budding complex. The final step leading to membrane deformation and budding of COPII-coated vesicles is carried by the heterodimeric complex Sec13-Sec31. The members of this CD belong to the Sec23-like family. Sec23 is very similar to Sec24. The Sec23 and Sec24 polypeptides fold into five distinct domains: a beta-barrel, a zinc finger, a vWA or trunk, an all helical region and a carboxy Gelsolin domain. The members of this subgroup lack the consensus MIDAS motif but have the overall Para-Rossmann type fold that is characteristic of this superfamily.	267
238756	cd01479	Sec24-like	Sec24-like: Protein and membrane traffic in eukaryotes is mediated at least in part by the budding and fusion of intracellular transport vesicles that selectively carry cargo proteins and lipids from donor to acceptor organelles. The two main classes of vesicular carriers within the endocytic and the biosynthetic pathways are COP- and clathrin-coated vesicles. Formation of COPII vesicles requires the ordered assembly of the coat built from several cytosolic components GTPase Sar1, complexes of Sec23-Sec24 and Sec13-Sec31. The process is initiated by the conversion of GDP to GTP by the GTPase Sar1 which then recruits the heterodimeric complex of Sec23 and Sec24. This heterodimeric complex generates the pre-budding complex. The final step leading to membrane deformation and budding of COPII-coated vesicles is carried by the heterodimeric complex Sec13-Sec31. The members of this CD belong to the Sec23-like family. Sec24 is very similar to Sec23. The Sec23 and Sec24 polypeptides fold into five distinct domains: a beta-barrel, a zinc finger, a vWA or trunk, an all helical region and a carboxy Gelsolin domain. The members of this subgroup carry a partial MIDAS motif and have the overall Para-Rossmann type fold that is characteristic of this superfamily.	244
238757	cd01480	vWA_collagen_alpha_1-VI-type	VWA_collagen alpha(VI) type: The extracellular matrix represents a complex alloy of variable members of diverse protein families defining structural integrity and various physiological functions. The most abundant family is the collagens with more than 20 different collagen types identified thus far.  Collagens are centrally involved in the formation of fibrillar and microfibrillar networks of the extracellular matrix, basement membranes as well as other structures of the extracellular matrix. Some collagens have about 15-18 vWA domains in them. The VWA domains present in these collagens mediate protein-protein interactions.	186
238758	cd01481	vWA_collagen_alpha3-VI-like	VWA_collagen alpha 3(VI) like: The extracellular matrix represents a complex alloy of variable members of diverse protein families defining structural integrity and various physiological functions. The most abundant family is the collagens with more than 20 different collagen types identified thus far.  Collagens are centrally involved in the formation of fibrillar and microfibrillar networks of the extracellular matrix, basement membranes as well as other structures of the extracellular matrix. Some collagens have about 15-18 vWA domains in them. The VWA domains present in these collagens mediate protein-protein interactions.	165
238759	cd01482	vWA_collagen_alphaI-XII-like	Collagen: The extracellular matrix represents a complex alloy of variable members of diverse protein families defining structural integrity and various physiological functions. The most abundant family is the collagens with more than 20 different collagen types identified thus far. Collagens are centrally involved in the formation of fibrillar and microfibrillar networks of the extracellular matrix, basement membranes as well as other structures of the extracellular matrix. Some collagens have about 15-18 vWA domains in them. The VWA domains present in these collagens mediate protein-protein interactions.	164
238760	cd01483	E1_enzyme_family	Superfamily of activating enzymes (E1) of the ubiquitin-like proteins. This family includes classical ubiquitin-activating enzymes E1, ubiquitin-like (ubl) activating enzymes and other mechanistic homologs, like MoeB, Thif1 and others. The common reaction mechanism catalyzed by MoeB, ThiF and the E1 enzymes begins with a nucleophilic attack of the C-terminal carboxylate of MoaD, ThiS and ubiquitin, respectively, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of MoaD and ThiS.	143
238761	cd01484	E1-2_like	Ubiquitin activating enzyme (E1), repeat 2-like. E1, a highly conserved small protein present universally in eukaryotic cells, is part of cascade to attach ubiquitin (Ub) covalently to substrate proteins. This cascade consists of activating (E1), conjugating (E2), and/or ligating (E3) enzymes and then targets them for degradation by the 26S proteasome. E1 activates ubiquitin by C-terminal adenylation, and subsequently forms a highly reactive thioester bond between its catalytic cysteine and ubiquitin's C-terminus. E1 also associates with E2 and promotes ubiquitin transfer to the E2's catalytic cysteine. A set of novel molecules with a structural similarity to Ub, called Ub-like proteins (Ubls), have similar conjugation cascades. In contrast to ubiquitin-E1, which is a single-chain protein with a weakly conserved two-fold repeat, many of the Ubls-E1 are a heterodimer where each subunit corresponds to one half of a single-chain E1. This CD represents the family homologous to the second repeat of Ub-E1.	234
238762	cd01485	E1-1_like	Ubiquitin activating enzyme (E1), repeat 1-like. E1, a highly conserved small protein present universally in eukaryotic cells, is part of cascade to attach ubiquitin (Ub) covalently to substrate proteins. This cascade consists of activating (E1), conjugating (E2), and/or ligating (E3) enzymes and then targets them for degradation by the 26S proteasome. E1 activates ubiquitin by C-terminal adenylation, and subsequently forms a highly reactive thioester bond between its catalytic cysteine and ubiquitin's C-terminus. The E1 also associates with E2 and promotes ubiquitin transfer to the E2's catalytic cysteine. A set of novel molecules with a structural similarity to Ub, called Ub-like proteins (Ubls), have similar conjugation cascades. In contrast to ubiquitin-E1, which is a single-chain protein with a weakly conserved two-fold repeat, many of the Ubls-E1 are a heterodimer where each subunit corresponds to one half of a single-chain E1. This CD represents the family homologous to the first repeat of Ub-E1.	198
238763	cd01486	Apg7	Apg7 is an E1-like protein, that activates two different ubiquitin-like proteins, Apg12 and Apg8, and assigns them to specific E2 enzymes, Apg10 and Apg3, respectively. This leads to the covalent conjugation of Apg8 with phosphatidylethanolamine, an important step in autophagy. Autophagy is a dynamic membrane phenomenon for bulk protein degradation in the lysosome/vacuole.	307
238764	cd01487	E1_ThiF_like	E1_ThiF_like. Member of superfamily of activating enzymes (E1) of the ubiquitin-like proteins. The common reaction mechanism catalyzed by E1-like enzymes begins with a nucleophilic attack of the C-terminal carboxylate of the ubiquitin-like substrate, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of the substrate. The exact function of this family is unknown.	174
238765	cd01488	Uba3_RUB	Ubiquitin activating enzyme (E1) subunit UBA3. UBA3 is part of the heterodimeric activating enzyme (E1), specific for the Rub family of ubiquitin-like proteins (Ubls). E1 enzymes are part of a conjugation cascade to attach Ub or Ubls, covalently to substrate proteins consisting of activating (E1), conjugating (E2), and/or ligating (E3) enzymes. E1 activates ubiquitin(-like) by C-terminal adenylation, and subsequently forms a highly reactive thioester bond between its catalytic cysteine and Ubls C-terminus. E1 also associates with E2 and promotes ubiquitin transfer to the E2's catalytic cysteine. Post-translational modification by Rub family of ubiquitin-like proteins (Ublps) activates SCF ubiquitin ligases and is involved in cell cycle control, signaling and embryogenesis. UBA3 contains both the nucleotide-binding motif involved in adenylation and the catalytic cysteine involved in the thioester intermediate and Ublp transfer to E2.	291
238766	cd01489	Uba2_SUMO	Ubiquitin activating enzyme (E1) subunit UBA2. UBA2 is part of the heterodimeric activating enzyme (E1), specific for the SUMO family of ubiquitin-like proteins (Ubls). E1 enzymes are part of a conjugation cascade to attach Ub or Ubls, covalently to substrate proteins consisting of activating (E1), conjugating (E2), and/or ligating (E3) enzymes. E1 activates ubiquitin by C-terminal adenylation, and subsequently forms a highly reactive thioester bond between its catalytic cysteine and Ubls C-terminus. The E1 also associates with E2 and promotes ubiquitin transfer to the E2's catalytic cysteine. Post-translational modification by SUMO family of ubiquitin-like proteins (Ublps) is involved in cell division, nuclear transport, the stress response and signal transduction. UBA2 contains both the nucleotide-binding motif involved in adenylation and the catalytic cysteine involved in the thioester intermediate and Ublp transfer to E2.	312
238767	cd01490	Ube1_repeat2	Ubiquitin activating enzyme (E1), repeat 2. E1, a highly conserved small protein present universally in eukaryotic cells, is part of cascade to attach ubiquitin (Ub) covalently to substrate proteins. This cascade consists of activating (E1), conjugating (E2), and/or ligating (E3) enzymes and then targets them for degradation by the 26S proteasome. E1 activates ubiquitin by C-terminal adenylation, and subsequently forms a highly reactive thioester bond between its catalytic cysteine and ubiquitin's C-terminus. E1 also associates with E2 and promotes ubiquitin transfer to the E2's catalytic cysteine. Ubiquitin-E1 is a single-chain protein with a weakly conserved two-fold repeat. This CD represents the second repeat of Ub-E1.	435
238768	cd01491	Ube1_repeat1	Ubiquitin activating enzyme (E1), repeat 1. E1, a highly conserved small protein present universally in eukaryotic cells, is part of cascade to attach ubiquitin (Ub) covalently to substrate proteins. This cascade consists of activating (E1), conjugating (E2), and/or ligating (E3) enzymes and then targets them for degradation by the 26S proteasome. E1 activates ubiquitin by C-terminal adenylation, and subsequently forms a highly reactive thioester bond between its catalytic cysteine and ubiquitin's C-terminus. E1 also associates with E2 and promotes ubiquitin transfer to the E2's catalytic cysteine. Ubiquitin-E1 is a single-chain protein with a weakly conserved two-fold repeat. This CD represents the first repeat of Ub-E1.	286
238769	cd01492	Aos1_SUMO	Ubiquitin activating enzyme (E1) subunit Aos1. Aos1 is part of the heterodimeric activating enzyme (E1), specific for the SUMO family of ubiquitin-like proteins (Ubls). E1 enzymes are part of a conjugation cascade to attach Ub or Ubls, covalently to substrate proteins consisting of activating (E1), conjugating (E2), and/or ligating (E3) enzymes. E1 activates ubiquitin by C-terminal adenylation, and subsequently forms a highly reactive thioester bond between its catalytic cysteine and Ubls C-terminus. The E1 also associates with E2 and promotes ubiquitin transfer to the E2's catalytic cysteine. Post-translational modification by SUMO family of ubiquitin-like proteins (Ublps) is involved in cell division, nuclear transport, the stress response and signal transduction. Aos1 contains part of the adenylation domain.	197
238770	cd01493	APPBP1_RUB	Ubiquitin activating enzyme (E1) subunit APPBP1. APPBP1 is part of the heterodimeric activating enzyme (E1), specific for the Rub family of ubiquitin-like proteins (Ubls). E1 enzymes are part of a conjugation cascade to attach Ub or Ubls, covalently to substrate proteins consisting of activating (E1), conjugating (E2), and/or ligating (E3) enzymes. E1 activates ubiquitin(-like) by C-terminal adenylation, and subsequently forms a highly reactive thioester bond between its catalytic cysteine and Ubls C-terminus. E1 also associates with E2 and promotes ubiquitin transfer to the E2's catalytic cysteine. Post-translational modification by Rub family of ubiquitin-like proteins (Ublps) activates SCF ubiquitin ligases and is involved in cell cycle control, signaling and embryogenesis. APPBP1 contains part of the adenylation domain.	425
99742	cd01494	AAT_I	Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V).	170
275447	cd01513	Translation_factor_III	Domain III of Elongation factor (EF) Tu (EF-TU) and related proteins. Elongation factor (EF) EF-Tu participates in the elongation phase during protein biosynthesis on the ribosome. Its functional cycles depend on GTP binding and its hydrolysis. The EF-Tu complexed with GTP and aminoacyl-tRNA delivers tRNA to the ribosome, whereas EF-G stimulates translocation, a process in which tRNA and mRNA movements occur in the ribosome. Experimental findings indicate an essential contribution of domain III to activation of GTP hydrolysis. This domain III, which is distinct from the domain III in EFG and related elongation factors, is found in several eukaryotic translation factors, like peptide chain release factors RF3, elongation factor 1, selenocysteine (Sec)-specific elongation factor, and in GT-1 family of GTPase (GTPBP1).	102
238772	cd01514	Elongation_Factor_C	Elongation factor G C-terminus. This domain includes the carboxyl terminal regions of elongation factors (EFs) bacterial EF-G, eukaryotic and archeal EF-2 and eukaryotic mitochondrial mtEFG1s and mtEFG2s. This group also includes proteins similar to the ribosomal protection proteins Tet(M) and Tet(O), BipA, LepA and, spliceosomal proteins: human 116kD U5 small nuclear ribonucleoprotein (snRNP) protein (U5-116 kD) and yeast counterpart Snu114p.  This domain adopts a ferredoxin-like fold consisting of an alpha-beta sandwich with anti-parallel beta-sheets, resembling the topology of domain III found in the elongation factors EF-G and eukaryotic EF-2, with which it forms the C-terminal block. The two domains however are not superimposable and domain III lacks some of the characteristics of this domain.  EF-2/EF-G in complex with GTP, promotes the translocation step of translation. During translocation the peptidyl-tRNA is moved from the A site to the P site, the uncharged tRNA from the P site to the E-site and, the mRNA is shifted one codon relative to the ribosome. Tet(M) and Tet(O) mediate Tc resistance. Typical Tcs bind to the ribosome and inhibit the elongation phase of protein synthesis, by inhibiting the occupation of site A by aminoacyl-tRNA. Tet(M) and Tet(O) catalyze the release of tetracycline (Tc) from the ribosome in a GTP-dependent manner.  BipA is a highly conserved protein with global regulatory properties in Escherichia coli. Yeast Snu114p is essential for cell viability and for splicing in vivo. Experiments suggest that GTP binding and probably GTP hydrolysis is important for the function of the U5-116 kD/Snu114p. The function of LepA proteins is unknown.	79
238773	cd01515	Arch_FBPase_1	Archaeal fructose-1,6-bisphosphatase and related enzymes of inositol monophosphatase family (FBPase class IV). These are Mg++ dependent phosphatases. Members in this family may have both fructose-1,6-bisphosphatase and inositol-monophosphatase activity. In hyperthermophilic archaea, inositol monophosphatase is thought to play a role in the biosynthesis of di-myo-inositol-1,1'-phosphate, an osmolyte unique to hyperthermophiles.	257
238774	cd01516	FBPase_glpX	Bacterial fructose-1,6-bisphosphatase, glpX-encoded. A dimeric enzyme dependent on Mg(2+). glpX-encoded FBPase (FBPase class II) differs from other members of the inositol-phosphatase superfamily by permutation of secondary structure elements. The core structure around the active site is well preserved. In E. coli, FBPase II is part of the glp regulon, which mediates growth on glycerol or sn-glycerol 3-phosphate as the sole carbon source.	309
238775	cd01517	PAP_phosphatase	PAP-phosphatase_like domains. PAP-phosphatase is a member of the inositol monophosphatase family, and catalyses the hydrolysis of 3'-phosphoadenosine-5'-phosphate (PAP) to AMP. In Saccharomyces cerevisiae, HAL2 (MET22) is involved in methionine biosynthesis and provides increased salt tolerance when over-expressed. Bacterial members of this domain family may differ in their substrate specificity and dephosphorylate different targets, as the substrate binding site does not appear to be conserved in that sub-set.	274
238776	cd01518	RHOD_YceA	Member of the Rhodanese Homology Domain superfamily. This CD includes Escherichia coli YceA, Bacillus subtilis YbfQ, and similar uncharacterized proteins.	101
238777	cd01519	RHOD_HSP67B2	Member of the Rhodanese Homology Domain superfamily. This CD includes the heat shock protein 67B2 of Drosophila melanogaster and other similar proteins, many of which are uncharacterized.	106
238778	cd01520	RHOD_YbbB	Member of the Rhodanese Homology Domain superfamily. This CD includes several putative ATP/GTP binding proteins including E. coli YbbB.	128
238779	cd01521	RHOD_PspE2	Member of the Rhodanese Homology Domain superfamily. This CD includes the putative rhodanese-like protein, Psp2, of Yersinia pestis biovar Medievalis and other similar uncharacterized proteins.	110
238780	cd01522	RHOD_1	Member of the Rhodanese Homology Domain superfamily, subgroup 1. This CD includes the putative rhodanese-related sulfurtransferases of several uncharacterized proteins.	117
238781	cd01523	RHOD_Lact_B	Member of the Rhodanese Homology Domain superfamily. This CD includes predicted proteins with rhodanese-like domains found N-terminal of the metallo-beta-lactamase domain.	100
238782	cd01524	RHOD_Pyr_redox	Member of the Rhodanese Homology Domain superfamily. Included in this CD are the Lactococcus lactis NADH oxidase, Bacillus cereus NADH dehydrogenase, and Bacteroides thetaiotaomicron pyridine nucleotide-disulphide oxidoreductase, and similar rhodanese-like domains found C-terminal of the pyridine nucleotide-disulphide oxidoreductase (Pyr-redox) domain and the Pyr-redox dimerization domain.	90
238783	cd01525	RHOD_Kc	Member of the Rhodanese Homology Domain superfamily. Included in this CD are the rhodanese-like domains found C-terminal of the serine/threonine protein kinases catalytic (S_TKc) domain and the Tre-2, BUB2p, Cdc16p (TBC) domain. The putative active site Cys residue is not present in this CD.	105
238784	cd01526	RHOD_ThiF	Member of the Rhodanese Homology Domain superfamily. This CD includes several putative molybdopterin synthase sulfurylases including the molybdenum cofactor biosynthetic protein (CnxF) of Aspergillus nidulans and the molybdenum cofactor synthesis protein 3 (MOCS3) of Homo sapiens. These rhodanese-like domains are found C-terminal of the ThiF and MoeZ_MoeB domains.	122
238785	cd01527	RHOD_YgaP	Member of the Rhodanese Homology Domain superfamily. This CD includes Escherichia coli YgaP, and similar uncharacterized putative rhodanese-related sulfurtransferases.	99
238786	cd01528	RHOD_2	Member of the Rhodanese Homology Domain superfamily, subgroup 2. Subgroup 2 includes uncharacterized putative rhodanese-related domains.	101
238787	cd01529	4RHOD_Repeats	Member of the Rhodanese Homology Domain superfamily. This CD includes putative rhodanese-related sulfurtransferases which contain 4 copies of the Rhodanese Homology Domain. Only the second and most of the fourth repeats contain the putative catalytic Cys residue. This CD aligns the 1st, 2nd, 3rd, and 4th repeats.	96
238788	cd01530	Cdc25	Cdc25 phosphatases are members of the Rhodanese Homology Domain superfamily. They activate the cell division kinases throughout the cell cycle progression. Cdc25 phosphatases dephosphorylate phosphotyrosine and phosphothreonine residues, in order to activate their Cdk/cyclin substrates. Cdc25A phosphatase functions to regulate S phase entry and Cdc25B is required for G2/M phase transition of the cell cycle. The Cdc25 domain binds oxyanions at the catalytic site and has the signature motif (H/YCxxxxxR).	121
238789	cd01531	Acr2p	Eukaryotic arsenate resistance proteins are members of the Rhodanese Homology Domain superfamily. Included in this CD is the Saccharomyces cerevisiae arsenate reductase protein, Acr2p, and other yeast and plant homologs.	113
238790	cd01532	4RHOD_Repeat_1	Member of the Rhodanese Homology Domain superfamily, repeat 1. This CD includes putative rhodanese-related sulfurtransferases which contain 4 copies of the Rhodanese Homology Domain. This CD aligns the 1st repeat which does not contain the putative catalytic Cys residue.	92
238791	cd01533	4RHOD_Repeat_2	Member of the Rhodanese Homology Domain superfamily, repeat 2. This CD includes putative rhodanese-related sulfurtransferases which contain 4 copies of the Rhodanese Homology Domain. This CD aligns the 2nd repeat which does contain the putative catalytic Cys residue.	109
238792	cd01534	4RHOD_Repeat_3	Member of the Rhodanese Homology Domain superfamily, repeat 3. This CD includes putative rhodanese-related sulfurtransferases which contain 4 copies of the Rhodanese Homology Domain. This CD aligns the 3rd repeat which does not contain the putative catalytic Cys residue.	95
238793	cd01535	4RHOD_Repeat_4	Member of the Rhodanese Homology Domain superfamily, repeat 4. This CD includes putative rhodanese-related sulfurtransferases which contain 4 copies of the Rhodanese Homology Domain. This CD aligns the 4th repeat which, in general, contains the putative catalytic Cys residue.	145
380478	cd01536	PBP1_ABC_sugar_binding-like	periplasmic sugar-binding domain of active transport systems that are members of the type 1 periplasmic binding protein (PBP1) superfamily. Periplasmic sugar-binding domain of active transport systems that are members of the type 1 periplasmic binding protein (PBP1) superfamily. The members of this family function as the primary receptors for chemotaxis and transport of many sugar based solutes in bacteria and archaea. The sugar binding domain is also homologous to the ligand-binding domain of eukaryotic receptors such as glutamate receptor (GluR) and DNA-binding transcriptional repressors such as LacI and GalR. Moreover, this periplasmic binding domain, also known as Venus flytrap domain, undergoes transition from an open to a closed conformational state upon the binding of ligands such as lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars. This family also includes the periplasmic binding domain of autoinducer-2 (AI-2) receptors such as LsrB and LuxP which are highly homologous to periplasmic pentose/hexose sugar-binding proteins.	268
380479	cd01537	PBP1_repressor_sugar_binding-like	Ligand-binding domain of the LacI-GalR family of transcription regulators and the sugar-binding domain of ABC-type transport systems. Ligand-binding domain of the LacI-GalR family of transcription regulators and the sugar-binding domain of ABC-type transport systems, all of which contain the type 1 periplasmic binding protein-like fold. Their specific ligands include lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars. The LacI family of proteins consists of transcriptional regulators related to the lac repressor; in general the sugar binding domain in this family binds a sugar, which in turn changes the DNA binding activity of the repressor domain. The core structure of the periplasmic binding proteins is classified into two types and they differ in number and order of beta strands in each domain: type 1, which has six beta strands, and type 2, which has five beta strands. These two distinct structural arrangements may have originated from a common ancestor.	265
380480	cd01538	PBP1_ABC_xylose_binding-like	periplasmic xylose-like sugar-binding component of the ABC-type transport systems that belong to a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein (PBP1) superfamily. Periplasmic xylose-like sugar-binding component of the ABC-type transport systems that belong to a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein (PBP1) superfamily, which consists of two alpha/beta globular domains connected by a three-stranded hinge. This Venus flytrap-like domain undergoes a transition from an open to a closed conformational state upon ligand binding. Moreover, the periplasmic xylose-binding protein is homologous to the ligand-binding domain of eukaryotic receptors such as glutamate receptor (GluR) and DNA-binding transcriptional repressors such as LacI and GalR.	283
380481	cd01539	PBP1_GGBP	periplasmic glucose/galactose-binding protein (GGBP) involved in chemotaxis towards, and active transport of, glucose and galactose in various bacterial species. Periplasmic glucose/galactose-binding protein (GGBP) involved in chemotaxis towards, and active transport of, glucose and galactose in various bacterial species. GGBP is a member of the pentose/hexose sugar-binding protein family of the type 1 periplasmic binding protein superfamily which consists of two alpha/beta globular domains connected by a three-stranded hinge. This Venus flytrap-like domain undergoes transition from an open to a closed conformational state upon ligand binding. Moreover, the periplasmic GGBP is homologous to the ligand-binding domain of eukaryotic receptors such as glutamate receptor (GluR) and DNA-binding transcriptional repressors such as LacI and GalR.	302
380482	cd01540	PBP1_arabinose_binding	periplasmic L-arabinose-binding protein (ABP), a member of a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily. Periplasmic L-arabinose-binding protein (ABP), a member of a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily. ABP is only involved in transport contrary to other related sugar-binding proteins such as the glucose/galactose-binding protein (GGBP) and the ribose-binding protein (RBP), both of which are involved in chemotaxis as well as transport. The periplasmic ABP consists of two alpha/beta globular domains connected by a three-stranded hinge, a Venus flytrap-like domain, which undergoes a transition from an open to a closed conformational state upon ligand binding. Moreover, ABP is homologous to the ligand-binding domain of eukaryotic receptors such as metabotropic glutamate receptor (mGluR) and DNA-binding transcriptional repressors such as LacI and GalR.	294
380483	cd01541	PBP1_AraR	ligand-binding domain of DNA transcription repressor specific for arabinose (AraR) which is a member of the LacI-GalR family of bacterial transcription regulators. Ligand-binding domain of DNA transcription repressor specific for arabinose (AraR) which is a member of the LacI-GalR family of bacterial transcription regulators. The ligand-binding domain of AraR is structurally homologous to the periplasmic sugar-binding domain of ABC-type transporters and both domains contain the type 1 periplasmic binding protein-like fold. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the type 1 periplasmic binding proteins. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor.	274
380484	cd01542	PBP1_TreR-like	ligand-binding domain of DNA transcription repressor specific for trehalose (TreR) which is a member of the LacI-GalR family of bacterial transcription regulators. Ligand-binding domain of DNA transcription repressor specific for trehalose (TreR) which is a member of the LacI-GalR family of bacterial transcription regulators. The ligand-binding domain of TreR is structurally homologous to the periplasmic sugar-binding domain of ABC-type transporters and both domains contain the type 1 periplasmic binding protein-like fold. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the type 1 periplasmic binding proteins. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor.	259
380485	cd01543	PBP1_XylR	ligand-binding domain of DNA transcription repressor specific for xylose (XylR). Ligand-binding domain of DNA transcription repressor specific for xylose (XylR), a member of the LacI-GalR family of bacterial transcription regulators. The ligand-binding domain of XylR is structurally homologous to the periplasmic sugar-binding domain of ABC-type transporters and both domains contain the type 1 periplasmic binding protein-like fold. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the type 1 periplasmic binding proteins. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor.	265
380486	cd01544	PBP1_GalR	ligand-binding domain of DNA transcription repressor GalR which is one of two regulatory proteins involved in galactose transport and metabolism. Ligand-binding domain of DNA transcription repressor GalR which is one of two regulatory proteins involved in galactose transport and metabolism. Transcription of the galactose regulon genes is regulated by Gal iso-repressor (GalS) and Gal repressor (GalR) in different ways, but both repressors recognize the same DNA binding site in the absence of D-galactose. GalR is a dimeric protein like GalS and is exclusively involved in the regulation of galactose permease, the low-affinity galactose transporter. GalS is involved in regulating expression of the high-affinity galactose transporter encoded by the mgl operon. GalS and GalR are members of the LacI-GalR family of transcription regulators and both contain the type 1 periplasmic binding protein-like fold. Hence, they are structurally homologous to the periplasmic sugar-binding domain of ABC-type transport systems.	269
380487	cd01545	PBP1_SalR	ligand-binding domain of DNA transcription repressor SalR, a member of the LacI-GalR family of bacterial transcription regulators. Ligand-binding domain of DNA transcription repressor SalR, a member of the LacI-GalR family of bacterial transcription regulators. SalR binds the glucose-based compound salicin, which is chemically related to aspirin. The ligand-binding domain of SalR is structurally homologous to the periplasmic sugar-binding domain of ABC-type transporters and both domains contain the type 1 periplasmic binding protein-like fold. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the type 1 periplasmic binding proteins. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor.	270
238794	cd01553	EPT_RTPC-like	This domain family includes the Enolpyruvate transferase (EPT) family and the RNA 3' phosphate cyclase family (RTPC). These 2 families differ in that EPT is formed by 3 repeats of an alpha-beta structural domain while  RTPC has 3 similar repeats with a 4th slightly different domain inserted between the 2nd and 3rd repeat. They evidently share the same active site location, although the catalytic residues differ.	211
238795	cd01554	EPT-like	Enol pyruvate transferases family includes EPSP synthases and UDP-N-acetylglucosamine enolpyruvyl transferase. Both enzymes catalyze the reaction of enolpyruvyl transfer.	408
238796	cd01555	UdpNAET	UDP-N-acetylglucosamine enolpyruvyl transferase catalyzes enolpyruvyl transfer as part of the first step in the biosynthesis of peptidoglycan, a component of the bacterial cell wall. The reaction is phosphoenolpyruvate + UDP-N-acetyl-D-glucosamine = phosphate + UDP-N-acetyl-3-(1-carboxyvinyl)-D-glucosamine. This enzyme is of interest as a potential target for anti-bacterial agents. The only other known enolpyruvyl transferase is the related 5-enolpyruvylshikimate-3-phosphate synthase.	400
238797	cd01556	EPSP_synthase	EPSP synthase domain. 3-phosphoshikimate 1-carboxyvinyltransferase (5-enolpyruvylshikimate-3-phosphate synthase) (EC 2.5.1.19) catalyses the reaction between shikimate-3-phosphate (S3P) and phosphoenolpyruvate (PEP) to form 5-enolpyruvylshikimate-3-phosphate (EPSP), an intermediate in the shikimate pathway leading to aromatic amino acid biosynthesis. The reaction is phosphoenolpyruvate + 3-phosphoshikimate = phosphate + 5-O-(1-carboxyvinyl)-3-phosphoshikimate. It is found in bacteria and plants but not animals. The enzyme is the target of the widely used herbicide glyphosate, which has been shown to occupy the active site. In bacteria and plants, it is a single domain protein, while in fungi, the domain is found as part of a multidomain protein with functions that are all part of the shikimate pathway.	409
238798	cd01557	BCAT_beta_family	BCAT_beta_family: Branched-chain aminotransferase catalyses the transamination of the branched-chain amino acids leucine, isoleucine and valine to their respective alpha-keto acids, alpha-ketoisocaproate, alpha-keto-beta-methylvalerate and alpha-ketoisovalerate. The enzyme requires pyridoxal 5'-phosphate (PLP) as a cofactor to catalyze the reaction. It has been found that mammals have two forms of the enzyme, mitochondrial and cytosolic, while bacteria contain only one form of the enzyme. The mitochondrial form plays a significant role in skeletal muscle glutamine and alanine synthesis and in interorgan nitrogen metabolism. Members of this subgroup are widely distributed in all three forms of life.	279
238799	cd01558	D-AAT_like	D-Alanine aminotransferase (D-AAT_like): D-amino acid aminotransferase catalyzes transamination between D-amino acids and their respective alpha-keto acids. It plays a major role in the synthesis of bacterial cell wall components like D-alanine and D-glutamate in addition to other D-amino acids. The enzyme like other members of this superfamily requires PLP as a cofactor. Members of this subgroup are found in all three forms of life.	270
238800	cd01559	ADCL_like	ADCL_like: 4-Amino-4-deoxychorismate lyase is a member of the fold-type IV PLP-dependent enzymes and converts 4-amino-4-deoxychorismate (ADC) to p-aminobenzoate and pyruvate. Based on the information available from the crystal structure, most members of this subgroup are likely to function as dimers. The enzyme from E. coli, the structure of which is available, is a homodimer that is folded into a small and a larger domain. The coenzyme pyridoxal 5'-phosphate resides at the interface of the two domains, which are linked by a flexible loop. Members of this subgroup are found in eukaryotes and bacteria.	249
107203	cd01560	Thr-synth_2	Threonine synthase catalyzes the final step of threonine biosynthesis. The conversion of O-phosphohomoserine into threonine and inorganic phosphate is pyridoxal 5'-phosphate dependent. The Thr-synth_1 CD includes members from higher plants, cyanobacteria, archaebacteria and eubacterial groups. This CD, Thr-synth_2, includes enzymes from fungi and eubacterial groups, as well as metazoan threonine synthase-like proteins.	460
107204	cd01561	CBS_like	CBS_like: This subgroup includes Cystathionine beta-synthase (CBS) and Cysteine synthase. CBS is a unique heme-containing enzyme that catalyzes a pyridoxal 5'-phosphate (PLP)-dependent condensation of serine and homocysteine to give cystathionine. Deficiency of CBS leads to homocystinuria, an inherited disease of sulfur metabolism characterized by increased levels of the toxic metabolite homocysteine. Cysteine synthase on the other hand catalyzes the last step of cysteine biosynthesis.  This subgroup also includes an O-Phosphoserine sulfhydrylase found in hyperthermophilic archaea which produces L-cysteine from sulfide and the more thermostable O-phospho-L-serine.	291
107205	cd01562	Thr-dehyd	Threonine dehydratase: The first step in amino acid degradation is the removal of nitrogen. Although the nitrogen atoms of most amino acids are transferred to alpha-ketoglutarate before removal, the alpha-amino group of threonine can be directly converted into NH4+. The direct deamination is catalyzed by threonine dehydratase, in which pyridoxal phosphate (PLP) is the prosthetic group. Threonine dehydratase is widely distributed in all three major phylogenetic divisions.	304
107206	cd01563	Thr-synth_1	Threonine synthase is a pyridoxal phosphate (PLP) dependent enzyme that catalyses the last reaction in the synthesis of  threonine from aspartate. It proceeds by converting O-phospho-L-homoserine (OPH) into threonine and inorganic phosphate. In plants, OPH is an intermediate between the methionine and threonine/isoleucine pathways. Thus threonine synthase competes for OPH with cystathionine-gamma-synthase, the first enzyme in the methionine pathway. These enzymes are in general dimers. Members of this CD, Thr-synth_1, are widely distributed in bacteria, archaea and higher plants.	324
238801	cd01567	NAPRTase_PncB	Nicotinate phosphoribosyltransferase (NAPRTase) family. Nicotinate phosphoribosyltransferase catalyses the formation of NAMN and PPi from 5-phosphoribosyl-1-pyrophosphate (PRPP) and nicotinic acid; this is the first, and also rate-limiting, reaction in the NAD salvage synthesis. This salvage pathway serves to recycle NAD degradation products.	343
238802	cd01568	QPRTase_NadC	Quinolinate phosphoribosyl transferase (QAPRTase or QPRTase), also called nicotinate-nucleotide pyrophosphorylase, is involved in the de novo synthesis of NAD in both prokaryotes and eukaryotes. It catalyses the reaction of quinolinic acid (QA) with 5-phosphoribosyl-1-pyrophosphate (PRPP) in the presence of Mg2+ to produce nicotinic acid mononucleotide (NAMN), pyrophosphate and carbon dioxide. QPRTase functions as a homodimer with two active sites, each formed by the C-terminal region of one subunit and the N-terminal region of the other.	269
238803	cd01569	PBEF_like	pre-B-cell colony-enhancing factor (PBEF)-like. The mammalian members of this group of nicotinate phosphoribosyltransferases (NAPRTases) were originally identified as genes whose expression is upregulated upon activation in lymphoid cells. In general, nicotinate phosphoribosyltransferase catalyses the formation of NAMN and PPi from 5-phosphoribosyl-1-pyrophosphate (PRPP) and nicotinic acid; this is the first, and also rate-limiting, reaction in the NAD salvage synthesis.	407
238804	cd01570	NAPRTase_A	Nicotinate phosphoribosyltransferase (NAPRTase), subgroup A. Nicotinate phosphoribosyltransferase catalyses the formation of NAMN and PPi from 5-phosphoribosyl-1-pyrophosphate (PRPP) and nicotinic acid; this is the first, and also rate-limiting, reaction in the NAD salvage synthesis. This salvage pathway serves to recycle NAD degradation products. This subgroup is present in bacteria and eukaryota (except fungi).	327
238805	cd01571	NAPRTase_B	Nicotinate phosphoribosyltransferase (NAPRTase), subgroup B. Nicotinate phosphoribosyltransferase catalyses the formation of NAMN and PPi from 5-phosphoribosyl-1-pyrophosphate (PRPP) and nicotinic acid; this is the first, and also rate-limiting, reaction in the NAD salvage synthesis. This salvage pathway serves to recycle NAD degradation products.	302
238806	cd01572	QPRTase	Quinolinate phosphoribosyl transferase (QAPRTase or QPRTase), also called nicotinate-nucleotide pyrophosphorylase, is involved in the de novo synthesis of NAD in both prokaryotes and eukaryotes. It catalyses the reaction of quinolinic acid (QA) with 5-phosphoribosyl-1-pyrophosphate (PRPP) in the presence of Mg2+ to produce nicotinic acid mononucleotide (NAMN), pyrophosphate and carbon dioxide. QPRTase functions as a homodimer with two active sites, each formed by the C-terminal region of one subunit and the N-terminal region of the other.	268
238807	cd01573	modD_like	ModD; Quinolinate phosphoribosyl transferase (QAPRTase or QPRTase) present in some modABC operons in bacteria, which are involved in molybdate transport. In general, QPRTases are part of the de novo synthesis pathway of NAD in both prokaryotes and eukaryotes. They catalyse the reaction of quinolinic acid (QA) with 5-phosphoribosyl-1-pyrophosphate (PRPP) in the presence of Mg2+ to produce nicotinic acid mononucleotide (NAMN), pyrophosphate and carbon dioxide.	272
380488	cd01574	PBP1_LacI	ligand-binding domain of DNA transcription repressor LacI specific for lactose, a member of the LacI-GalR family of bacterial transcription regulators. Ligand-binding domain of DNA transcription repressor LacI specific for lactose, a member of the LacI-GalR family of bacterial transcription regulators. The ligand-binding domain of LacI is structurally homologous to the periplasmic sugar-binding domain of ABC-type transporters and both domains contain the type 1 periplasmic binding protein-like fold. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the type 1 periplasmic binding proteins. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor.	265
380489	cd01575	PBP1_GntR	ligand-binding domain of DNA transcription repressor GntR specific for gluconate, a member of the LacI-GalR family of bacterial transcription regulators. This group represents the ligand-binding domain of DNA transcription repressor GntR specific for gluconate, a member of the LacI-GalR family of bacterial transcription regulators. The ligand-binding domain of GntR is structurally homologous to the periplasmic sugar-binding domain of ABC-type transporters and both domains contain the type 1 periplasmic binding protein-like fold. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the type 1 periplasmic binding proteins. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding, which in turn changes the DNA binding affinity of the repressor.	269
238808	cd01576	AcnB_Swivel	Aconitase B swivel domain. Aconitate hydratase B is involved in energy metabolism as part of the TCA cycle. It catalyses the formation of cis-aconitate from citrate. This is the aconitase swivel domain, which undergoes swivelling conformational change in the enzyme mechanism. The domain structure of Aconitase B is different from other Aconitases in that the swivel domain that is found at N-terminus of B family is normally found at C-terminus for other Aconitases. In most members of the family, there is also a HEAT domain before domain 4, which is believed to play a role in protein-protein interaction.	131
238809	cd01577	IPMI_Swivel	Aconitase-like swivel domain of 3-isopropylmalate dehydratase and related uncharacterized proteins. 3-isopropylmalate dehydratase catalyzes the isomerization between 2-isopropylmalate and 3-isopropylmalate, via the formation of 2-isopropylmaleate 3-isopropylmalate. IPMI is involved in fungal and bacterial leucine biosynthesis and is also found in eukaryotes. This is the aconitase-like swivel domain, which is believed to undergo swivelling conformational change in the enzyme mechanism.	91
238810	cd01578	AcnA_Mitochon_Swivel	Mitochondrial aconitase A swivel domain. Aconitase (also known as aconitate hydratase and citrate hydro-lyase) catalyzes the reversible isomerization of citrate and isocitrate as part of the TCA cycle. This is the aconitase swivel domain, which undergoes swivelling conformational change in the enzyme mechanism. In eukaryotes two isozymes of aconitase are known to exist: one found in the mitochondrial matrix and the other found in the cytoplasm.  This is the mitochondrial form. The mitochondrial product is coded by a nuclear gene. Most members of this subfamily are mitochondrial but there are some bacterial members.	149
238811	cd01579	AcnA_Bact_Swivel	Bacterial Aconitase-like swivel domain. Aconitase (aconitate hydratase or citrate hydrolyase) catalyzes the reversible isomerization of citrate and isocitrate as part of the TCA cycle. Cis-aconitate is formed as an intermediate product during the course of the reaction. This is the aconitase-like swivel domain, which is believed to undergo swivelling conformational change in the enzyme mechanism. This distinct subfamily is found only in bacteria and archaea. Its exact characteristics are not known.	121
238812	cd01580	AcnA_IRP_Swivel	Aconitase A swivel domain. This is the major form of the TCA cycle enzyme aconitate hydratase, also known as aconitase and citrate hydro-lyase. It includes bacterial and archaeal aconitase A, and the eukaryotic cytosolic form of aconitase. This group also includes sequences that have been shown to act as an iron-responsive element (IRE) binding protein in animals and may have the same role in other eukaryotes. This is the aconitase-like swivel domain, which is believed to undergo swivelling conformational change in the enzyme mechanism.	171
153131	cd01581	AcnB	Aconitate hydratase B catalyses the formation of cis-aconitate from citrate as part of the TCA cycle. Aconitase B catalytic domain. Aconitate hydratase B catalyses the formation of cis-aconitate from citrate as part of the TCA cycle. Aconitase has an active (4Fe-4S) and an inactive (3Fe-4S) form. The active cluster is part of the catalytic site that interconverts citrate, cis-aconitate and isocitrate. The domain architecture of aconitase B is different from other aconitases in that the catalytic domain is normally found at C-terminus for other aconitases, but it is at N-terminus for B family. It also has a HEAT domain before domain 4 which plays a role in protein-protein interaction. This alignment is the core domain including domains 1, 2 and 3.	436
153132	cd01582	Homoaconitase	Homoaconitase and other uncharacterized proteins of the Aconitase family. Homoaconitase catalytic domain. Homoaconitase and other uncharacterized proteins of the Aconitase family. Homoaconitase is part of an unusual lysine biosynthesis pathway found only in filamentous fungi, in which lysine is synthesized via the alpha-aminoadipate pathway. In this pathway, homoaconitase catalyzes the conversion of cis-homoaconitic acid into homoisocitric acid. The reaction mechanism is believed to be similar to that of other aconitases.	363
153133	cd01583	IPMI	3-isopropylmalate dehydratase catalyzes the isomerization between 2-isopropylmalate and 3-isopropylmalate. Aconitase-like catalytic domain of 3-isopropylmalate dehydratase and related uncharacterized proteins. 3-isopropylmalate dehydratase catalyzes the isomerization between 2-isopropylmalate and 3-isopropylmalate, via the formation of 2-isopropylmaleate 3-isopropylmalate. IPMI is involved in fungal and bacterial leucine biosynthesis and is also found in eukaryotes.	382
153134	cd01584	AcnA_Mitochondrial	Aconitase catalyzes the reversible isomerization of citrate and isocitrate as part of the TCA cycle. Mitochondrial aconitase A catalytic domain. Aconitase (also known as aconitate hydratase and citrate hydro-lyase) catalyzes the reversible isomerization of citrate and isocitrate as part of the TCA cycle. Cis-aconitate is formed as an intermediary product during the course of the reaction. In eukaryotes two isozymes of aconitase are known to exist: one found in the mitochondrial matrix and the other found in the cytoplasm. This is the mitochondrial form. The mitochondrial product is coded by a nuclear gene. Most members of this subfamily are mitochondrial but there are some bacterial members.	412
153135	cd01585	AcnA_Bact	Aconitase catalyzes the reversible isomerization of citrate and isocitrate as part of the TCA cycle. Bacterial Aconitase-like catalytic domain. Aconitase (aconitate hydratase or citrate hydrolyase) catalyzes the reversible isomerization of citrate and isocitrate as part of the TCA cycle. Cis-aconitate is formed as an intermediate product during the course of the reaction. This distinct subfamily is found only in bacteria and Archaea. Its exact characteristics are not known.	380
153136	cd01586	AcnA_IRP	Aconitase A catalytic domain. Aconitase A catalytic domain. This is the major form of the TCA cycle enzyme aconitate hydratase, also known as aconitase and citrate hydrolyase. It includes bacterial and archaeal aconitase A, and the eukaryotic cytosolic form of aconitase. This group also includes sequences that have been shown to act as an iron-responsive element (IRE) binding protein in animals and may have the same role in other eukaryotes.	404
176466	cd01594	Lyase_I_like	Lyase class I_like superfamily: contains the lyase class I family, histidine ammonia-lyase and phenylalanine ammonia-lyase, which catalyze similar beta-elimination reactions. Lyase class I_like superfamily of enzymes that catalyze beta-elimination reactions and are active as homotetramers. The four active sites of the homotetrameric enzyme are each formed by residues from three different subunits. This superfamily contains the lyase class I family, histidine ammonia-lyase and phenylalanine ammonia-lyase. The lyase class I family comprises proteins similar to class II fumarase, aspartase, adenylosuccinate lyase, argininosuccinate lyase, and 3-carboxy-cis, cis-muconate lactonizing enzyme which, for the most part catalyze similar beta-elimination reactions in which a C-N or C-O bond is cleaved with the release of fumarate as one of the products. Histidine or phenylalanine ammonia-lyase catalyze a beta-elimination of ammonia from histidine and phenylalanine, respectively.	231
176467	cd01595	Adenylsuccinate_lyase_like	Adenylsuccinate lyase (ASL)_like. This group contains ASL, prokaryotic-type 3-carboxy-cis,cis-muconate cycloisomerase (pCMLE), and related proteins. These proteins are members of the Lyase class I family. Members of this family for the most part catalyze similar beta-elimination reactions in which a C-N or C-O bond is cleaved with the release of fumarate as one of the products. These proteins are active as tetramers. The four active sites of the homotetrameric enzyme are each formed by residues from three different subunits. ASL catalyzes two steps in the de novo purine biosynthesis: the conversion of 5-aminoimidazole-(N-succinylocarboxamide) ribotide (SAICAR) into 5-aminoimidazole-4-carboxamide ribotide (AICAR) and; the conversion of adenylsuccinate (SAMP) into adenosine monophosphate (AMP). pCMLE catalyzes the cyclization of 3-carboxy-cis,cis-muconate (3CM) to 4-carboxy-muconolactone, in the beta-ketoadipate pathway. ASL deficiency has been linked to several pathologies including psychomotor retardation with autistic features, epilepsy and muscle wasting.	381
176468	cd01596	Aspartase_like	aspartase (L-aspartate ammonia-lyase) and fumarase class II enzymes. This group contains aspartase (L-aspartate ammonia-lyase), fumarase class II enzymes, and related proteins. It is a member of the Lyase class I family. Members of this family for the most part catalyze similar beta-elimination reactions in which a C-N or C-O bond is cleaved with the release of fumarate as one of the products. These proteins are active as tetramers. The four active sites of the homotetrameric enzyme are each formed by residues from three different subunits. Aspartase catalyzes the reversible deamination of aspartic acid. Fumarase catalyzes the reversible hydration/dehydration of fumarate to L-malate during the Krebs cycle.	450
176469	cd01597	pCLME	prokaryotic 3-carboxy-cis,cis-muconate cycloisomerase (CMLE)_like. This subgroup contains pCLME and related proteins, and belongs to the Lyase class I family. Members of this family for the most part catalyze similar beta-elimination reactions in which a C-N or C-O bond is cleaved with the release of fumarate as one of the products. These proteins are active as tetramers. The four active sites of the homotetrameric enzyme are each formed by residues from three different subunits. CMLE catalyzes the cyclization of 3-carboxy-cis,cis-muconate (3CM) to 4-carboxy-muconolactone in the beta-ketoadipate pathway. This pathway is responsible for the catabolism of a variety of aromatic compounds into intermediates of the citric acid cycle in prokaryotic and eukaryotic micro-organisms.	437
176470	cd01598	PurB	PurB_like adenylosuccinases (adenylosuccinate lyase, ASL). This subgroup contains EcASL, the product of the purB gene in Escherichia coli, and related proteins. It is a member of the Lyase class I family of the Lyase_I superfamily. Members of the Lyase class I family function as homotetramers to catalyze similar beta-elimination reactions in which a Calpha-N or Calpha-O bond is cleaved with the subsequent release of fumarate as one of the products. The four active sites of the homotetrameric enzyme are each formed by residues from three different subunits. ASL catalyzes two non-sequential steps in the de novo purine biosynthesis pathway: the conversion of 5-aminoimidazole-(N-succinylocarboxamide) ribotide (SAICAR) into 5-aminoimidazole-4-carboxamide ribotide (AICAR) and; the conversion of adenylosuccinate (SAMP) into adenosine monophosphate (AMP).	425
259845	cd01609	RNAP_beta'_N	Largest subunit (beta') of bacterial DNA-dependent RNA polymerase (RNAP), N-terminal domain. Beta' is the largest subunit of bacterial DNA-dependent RNA polymerase (RNAP). This family also includes the eukaryotic plastid-encoded RNAP beta' subunit. Bacterial RNAP is a large multi-subunit complex responsible for the synthesis of all RNAs in the cell. Structure studies suggest that RNA polymerase complexes from different organisms share a crab-claw-shaped structure with two "pincers" defining a central cleft. Beta' and beta, the largest and the second largest subunits of bacterial RNAP, each makes up one pincer and part of the base of the cleft. Beta' contains part of the active site and binds two zinc ions that have a structural role in the formation of the active polymerase.	659
238813	cd01610	PAP2_like	PAP2_like proteins, a super-family of histidine phosphatases and vanadium haloperoxidases, includes type 2 phosphatidic acid phosphatase or lipid phosphate phosphatase (LPP), Glucose-6-phosphatase, Phosphatidylglycerophosphatase B and bacterial acid phosphatase, vanadium chloroperoxidases, vanadium bromoperoxidases, and several other mostly uncharacterized subfamilies. Several members of this superfamily have been predicted to be transmembrane proteins.	122
340453	cd01611	Ubl_Autophagy_like	ubiquitin-like (Ubl) domain found in autophagy-related ubiquitin-like protein. Autophagy is an essential intracellular process that targets large protein complexes, bacterial pathogens, and organelles for degradation. The autophagy-related ubiquitin-like proteins, such as Saccharomyces cerevisiae Atg8p, undergo a unique ubiquitin-like (Ubl) conjugation, a process essential for autophagosome formation. Ubiquitin is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair. The ubiquitination process comprises a cascade of E1, E2 and E3 enzymes that results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. ATG8 family proteins undergo multistep modifications by the E1-like (ubiquitin activating) enzyme ATG7, and the E2-like (ubiquitin conjugating) enzyme ATG3. The mammalian ATG8 family is classified into three subfamilies: i) MAP1LC3 (microtubule associated protein 1 light chain 3) which includes MAP1LC3A, MAP1LC3B, MAP1LC3B2, and MAP1LC3C, ii) GABARAP (GABA type A receptor associated protein) which includes GABARAP, GABARAPL1, and GABARAPL3, and iii) GABARAPL2 (GABA type A receptor associated protein like 2), also known as GATE-16 (golgi-associated adenosine triphosphatase enhancer of 16 kDa).	84
340454	cd01612	Ubl_ATG12	ubiquitin-like (Ubl) domain found in autophagy-related protein 12 (ATG12). Autophagy is an essential intracellular process that targets large protein complexes, bacterial pathogens, and organelles for degradation. The autophagy-related ubiquitin-like (Ubl) proteins such as ATG12 protein have a conserved Ubl fold structure and undergo a unique Ubl conjugation, a process essential for autophagosome formation. ATG12 is conjugated to ATG5 by multistep modifications of the E1-like (ubiquitin activating) enzyme ATG7, and the E2-like (ubiquitin conjugating) enzyme ATG10. The ATG12-ATG5 conjugate facilitates the lipidation of ATG8 and directs its correct subcellular localization. ATG12 is localized at the developing autophagosome.	86
133473	cd01614	EutN_CcmL	Ethanolamine utilisation protein and carboxysome structural protein domain family. Besides the Escherichia coli ethanolamine utilization protein EutN and the Synechocystis sp. carboxysome (beta-type) structural protein CcmL, this family also includes alpha-type carboxysome structural proteins CsoS4A and CsoS4B (previously known as OrfA and OrfB), propanediol utilization protein PduN, and some hypothetical homologs from various bacterial microcompartments. The carboxysome, a polyhedral organelle, participates in carbon fixation by sequestering enzymes. It is the prototypical bacterial microcompartment. Its enzymatic components, ribulose bisphosphate carboxylase/oxygenase (RuBisCO) and carbonic anhydrase (CA), are surrounded by a polyhedral protein shell. Similarly, the ethanolamine utilization (eut) microcompartment and the 1,2-propanediol utilization (pdu) microcompartment encapsulate the enzymes necessary for cobalamin-dependent ethanolamine degradation and coenzyme B12-dependent degradation of 1,2-propanediol, respectively, within their polyhedral protein shells. Both carboxysome structural proteins CcmL and CsoS4A assemble as pentamers in the crystal structures, which might constitute the twelve pentameric vertices of a regular icosahedral carboxysome. However, the reported EutN structure is hexameric rather than pentameric. The absence of pentamers in Eut microcompartments might lead to less-regular icosahedral shell shapes. Due to the lack of structural evidence, the functional roles of the CsoS4A adjacent paralog, CsoS4B, and propanediol utilization protein PduN are not yet clear.	83
119367	cd01615	CIDE_N	CIDE_N domain, found at the N-terminus of the CIDE (cell death-inducing DFF45-like effector) proteins, as well as CAD nuclease (caspase-activated DNase/DNA fragmentation factor, DFF40) and its inhibitor, ICAD(DFF45). These proteins are associated with the chromatin condensation and DNA fragmentation events of apoptosis; the CIDE_N domain is thought to regulate the activity of ICAD/DFF45, and the CAD/DFF40 and CIDE nucleases during apoptosis. The CIDE-N domain is also found in the FSP27/CIDE-C protein.	78
340455	cd01616	TGS	TGS (ThrRS, GTPase and SpoT) domain structurally similar to a beta-grasp ubiquitin-like fold. This family includes eukaryotic and some bacterial threonyl-tRNA synthetases (ThrRSs), a distinct Obg family of GTPases, and guanosine polyphosphate hydrolase (SpoT) and synthetase (RelA), which are involved in stringent response in bacteria, as well as uridine kinase (UDK) from Thermotogales. All family members contain a TGS domain named after the ThrRS, GTPase, and SpoT/RelA proteins where it occurs. It is a small domain with a beta-grasp ubiquitin-like fold, a common structure involved in protein-protein interactions. The function of the TGS domain remains unclear, but its presence in two types of regulatory proteins (the GTPases and guanosine polyphosphate phosphohydrolases/synthetases) suggests ligand (most likely nucleotide) binding with a regulatory role.	61
340456	cd01617	DCX	Doublecortin-like domain structurally similar to a beta-grasp ubiquitin-like fold. Doublecortin (DCX) is a microtubule-associated protein (MAP) with a stable ubiquitin-like tertiary fold. Ubiquitin (Ub) is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair. Microtubules are key components of the cytoskeleton that are involved in cell movement, shape determination, division and transport. The DCX gene family consists of eleven paralogs in human and mouse, and its DCX protein domains can occur in double tandem or as single DCX repeats. Proteins with DCX tandem domains in general have roles in microtubule (MT) regulation and signal transduction such as X-linked doublecortin (DCX), retinitis pigmentosa-1 (RP1) and doublecortin-like kinase (DCLK). Single DCX repeat proteins are normally localized to actin-rich subcellular structures, or the nucleus such as DCDC2. DCX is not only a unique MAP in terms of structure, it also interacts with multiple additional proteins. Mutations in human DCX genes are associated with abnormal neuronal migration, epilepsy, and mental retardation.	73
240620	cd01619	LDH_like	D-Lactate and related Dehydrogenases, NAD-binding and catalytic domains. D-Lactate dehydrogenase (LDH) catalyzes the interconversion of pyruvate and lactate, and is a member of the 2-hydroxyacid dehydrogenase family. LDH is homologous to D-2-Hydroxyisocaproic acid dehydrogenase (D-HicDH) and shares the 2 domain structure of formate dehydrogenase. D-HicDH is a NAD-dependent member of the hydroxycarboxylate dehydrogenase family, and shares the Rossmann fold typical of many NAD binding proteins. D-HicDH from Lactobacillus casei forms a monomer and catalyzes the reaction R-CO-COO(-) + NADH + H+ to R-COH-COO(-) + NAD+. Similar to the structurally distinct L-HicDH, D-HicDH exhibits low side-chain R specificity, accepting a wide range of 2-oxocarboxylic acid side chains. (R)-2-hydroxyglutarate dehydrogenase (HGDH) catalyzes the NAD-dependent reduction of 2-oxoglutarate to (R)-2-hydroxyglutarate. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain.	323
240621	cd01620	Ala_dh_like	Alanine dehydrogenase and related dehydrogenases. Alanine dehydrogenase/Transhydrogenase, such as the hexameric L-alanine dehydrogenase of Phormidium lapideum, contain 2 Rossmann fold-like domains linked by an alpha helical region. Related proteins include Saccharopine Dehydrogenase (SDH), bifunctional lysine ketoglutarate reductase/saccharopine dehydrogenase enzyme, N(5)-(carboxyethyl)ornithine synthase, and Rubrum transhydrogenase. Alanine dehydrogenase (L-AlaDH) catalyzes the NAD-dependent conversion of pyruvate to L-alanine via reductive amination. Transhydrogenases found in bacterial and inner mitochondrial membranes link NAD(P)(H)-dependent redox reactions to proton translocation. The energy of the proton electrochemical gradient (delta-p), generated by the respiratory electron transport chain, is consumed by transhydrogenase in NAD(P)+ reduction. Transhydrogenase is likely involved in the regulation of the citric acid cycle. Rubrum transhydrogenase has 3 components, dI, dII, and dIII. dII spans the membrane while dI and dIII protrude on the cytoplasmic/matrix side. dI contains 2 domains with Rossmann folds, linked by a long alpha helix, and contains a NAD binding site. Two dI polypeptides (represented in this sub-family) spontaneously form a heterotrimer with one dIII in the absence of dII. In the heterotrimer, both dI chains may bind NAD, but only one is well-ordered. dIII also binds a well-ordered NADP, but in a different orientation than classical Rossmann domains.	317
319765	cd01624	HAD_VSP_like	vegetative storage proteins and related proteins, similar to soybean VSPalpha and VSPbeta proteins; belongs to the haloacid dehalogenase-like superfamily. Soybean [Glycine max (L.) Merr.] vegetative storage proteins VSPalpha and VSPbeta were identified as storage proteins due to their abundance and pattern of expression in plant tissues; they accumulate to almost one-half the amount of soluble leaf protein when soybean plants are continually depodded. They possess acid phosphatase activity which appears to be low compared to several other plant acid phosphatases; it increases in the leaves of depodded soybean plants, but to no more than 0.1% of the total acid phosphatase activity in these leaves. This acid phosphatase activity is maximal at pH 5.0 - 5.5, and can liberate Pi from different substrates such as naphthyl acid phosphate, carboxyphenyl phosphate, sugar-phosphates, glyceraldehyde 3-phosphate, dihydroxyacetone phosphate, phosphoenolpyruvate, ATP, ADP, PPi, and short chain polyphosphates; they cleave phosphoenolpyruvate, ATP, ADP, PPi, and polyphosphates most efficiently. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Soybean VSPalpha and VSPbeta lack this active site aspartate; other members of this family have this aspartate and may be more active. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	160
319766	cd01625	HAD_PNP	polynucleotide 3'-phosphatase domain similar to the phosphatase domain of the bifunctional enzyme polynucleotide 5'-kinase/3'-phosphatase. Polynucleotide 3'-phosphatase (PNP) domain. This domain dephosphorylates single-stranded as well as double-stranded 3'-phospho termini. It is found in bifunctional enzyme polynucleotide kinase/phosphatase (PNKP) which contain both kinase and phosphatase domains. PNKP plays a key role in both base excision repair and non-homologous end-joining DNA repair pathway. DNA strand breaks can result from DNA damage by ionizing radiation and chemical agents, such as alkylating agents or anticancer agents. Such DNA damage often results in DNA strands with 5'-hydroxyl and 3'-phosphate termini. However, the repair of DNA damage by DNA polymerases and ligases requires 5'-phosphate and 3'-hydroxyl termini. PNKP acts as a 5'-kinase/3'-phosphatase to create 5'-phosphate/3'-hydroxyl termini, which are a necessary prerequisite for ligation during repair. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	154
319767	cd01627	HAD_TPP	trehalose-phosphate phosphatase similar to Escherichia coli trehalose-6-phosphate phosphatase OtsB and Saccharomyces cerevisiae trehalose-phosphatase TPS2. Trehalose biosynthesis in bacteria is known through three pathways - OtsAB, TreYZ and TreS. The OtsAB pathway, also known as the trehalose 6-phosphate synthase (TPS)/trehalose-6-phosphate phosphatase (TPP) pathway, is the most common route known to be involved in the stress response of Escherichia coli. It involves converting glucose-6-phosphate and UDP-glucose to form trehalose-6-phosphate (T6P), catalyzed by TPS, the product of the otsA gene; this step is followed by the dephosphorylation of T6P to yield trehalose and inorganic phosphate, catalyzed by a specific TPP, the product of the otsB gene. This OtsAB (or TPS/TPP) pathway is also the most common route known to be involved in the stress response of yeast. In Saccharomyces cerevisiae, the corresponding enzymes, TPS1p and TPS2p, form a multimeric synthase complex together with additional regulatory subunits encoded by Tsl1 and Tps3. Trehalose is a common disaccharide accumulated by organisms as a carbohydrate reserve and in response to unfavorable growth conditions. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	228
319768	cd01629	HAD_EP	Enolase-phosphatase similar to human enolase-phosphatase E1 and Xanthomonas oryzae pv. oryzae enolase-phosphatase Xep. Enolase-phosphatase E1 (also called MASA) is a bifunctional enolase-phosphatase which promotes the conversion of 2,3-diketo-5-methylthio-1-phosphopentane to 1,2-dihydroxy-3-keto-5-methylthiopentene anion (an aci-reductone) in the methionine salvage pathway. The catalytic reaction is carried out continuously by enolization and dephosphorylation, and the enolase activity cannot be classified as typical enzymatic enolization. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	204
319769	cd01630	HAD_KDO-like	haloacid dehalogenase-like (HAD) hydrolase, similar to Escherichia coli 3-deoxy-D-manno-octulosonate 8-phosphate (KDO 8-P) phosphatase KdsC, and rainbow trout N-acylneuraminate cytidylyltransferase. KDO 8-P phosphatase catalyzes the hydrolysis of KDO 8-P to KDO (3-deoxy-D-manno-octulosonate) and inorganic phosphate and is the last enzyme in the KDO biosynthetic pathway. KDO is an 8-carbon sugar that links the lipid A and polysaccharide moieties of the lipopolysaccharide region in Gram-negative bacteria. An interruption in KDO biosynthesis leads to the accumulation of lipid A precursors and subsequent arrest in cell growth. The KDO biosynthesis pathway involves five sequential enzymatic reactions. This family also includes rainbow trout CMP-sialic acid synthetase which effectively converts both deaminoneuraminic acid (KDN, 2-keto-3-deoxy-D-glycero-D-galacto-nononic acid) and N-acetylneuraminic acid (Neu5Ac) to CMP-KDN and CMP-Neu5Ac, respectively. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	146
340816	cd01635	Glycosyltransferase_GTB-type	glycosyltransferase family 1 and related proteins with GTB topology. Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. The structures of the formed glycoconjugates are extremely diverse, reflecting a wide range of biological functions. The members of this family share a common GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility.	235
238814	cd01636	FIG	FIG, FBPase/IMPase/glpX-like domain. A superfamily of metal-dependent phosphatases with various substrates. Fructose-1,6-bisphosphatases (both the major and the glpX-encoded variant) hydrolyze fructose-1,6-bisphosphate to fructose-6-phosphate in gluconeogenesis. Inositol-monophosphatases and inositol polyphosphatases play vital roles in eukaryotic signalling, as they participate in metabolizing the messenger molecule inositol-1,4,5-trisphosphate. Many of these enzymes are inhibited by Li+.	184
238815	cd01637	IMPase_like	Inositol-monophosphatase-like domains. This family of phosphatases is dependent on bivalent metal ions such as Mg++, and many members are inhibited by Li+ (which is thought to displace a bivalent ion in the active site). Substrates include fructose-1,6-bisphosphate, inositol poly- and monophosphates, PAP and PAPS, sedoheptulose-1,7-bisphosphate and probably others.	238
238816	cd01638	CysQ	CysQ, a 3'-Phosphoadenosine-5'-phosphosulfate (PAPS) 3'-phosphatase, is a bacterial member of the inositol monophosphatase family. It has been proposed that CysQ helps control intracellular levels of PAPS, which is an intermediate in cysteine biosynthesis (a principal route of sulfur assimilation).	242
238817	cd01639	IMPase	IMPase, inositol monophosphatase and related domains. A family of Mg++ dependent phosphatases, inhibited by lithium, many of which may act on inositol monophosphate substrate. They dephosphorylate inositol phosphate to generate inositol, which may be recycled into inositol lipids; in eukaryotes IMPase plays a vital role in intracellular signaling. IMPase is one of the proposed targets of Li+ therapy in manic-depressive illness. This family contains some bacterial members of the inositol monophosphatase family classified as SuhB-like. E. coli SuhB has been suggested to participate in posttranscriptional control of gene expression, and its inositol monophosphatase activity doesn't appear to be sufficient for its cellular function. It has been proposed that SuhB plays a role in the biosynthesis of phosphatidylinositol in mycobacteria.	244
238818	cd01640	IPPase	IPPase; Inositol polyphosphate-1-phosphatase, a member of the Mg++ dependent family of inositol monophosphatase-like domains, hydrolyzes the 1' position phosphate from inositol 1,3,4-trisphosphate and inositol 1,4-bisphosphate. Members in this group may also exhibit 3'-phosphoadenosine 5'-phosphate phosphatase activity, and they all appear to be inhibited by lithium. IPPase is one of the proposed targets of Li+ therapy in manic-depressive illness.	293
238819	cd01641	Bacterial_IMPase_like_1	Predominantly bacterial family of Mg++ dependent phosphatases, related to inositol monophosphatases. These enzymes may dephosphorylate fructose-1,6-bisphosphate, inositol monophosphate, 3'-phosphoadenosine-5'-phosphate, or similar substrates.	248
238820	cd01642	Arch_FBPase_2	Putative fructose-1,6-bisphosphatase or related enzymes of inositol monophosphatase family. These are Mg++ dependent phosphatases. Members in this family may have fructose-1,6-bisphosphatase and/or inositol-monophosphatase activity. Fructose-1,6-bisphosphatase catalyzes the hydrolysis of fructose-1,6-bisphosphate into fructose-6-phosphate and is critical in the gluconeogenesis pathway.	244
238821	cd01643	Bacterial_IMPase_like_2	Bacterial family of Mg++ dependent phosphatases, related to inositol monophosphatases. These enzymes may dephosphorylate inositol monophosphate or similar substrates.	242
238822	cd01644	RT_pepA17	RT_pepA17: Reverse transcriptase (RTs) in retrotransposons. This subfamily represents the RT domain of a multifunctional enzyme. C-terminal to the RT domain is a domain homologous to aspartic proteinases (corresponding to Merops family A17) encoded by retrotransposons and retroviruses. RT catalyzes DNA replication from an RNA template and is responsible for the replication of retroelements.	213
238823	cd01645	RT_Rtv	RT_Rtv: Reverse transcriptases (RTs) from retroviruses (Rtvs). RTs catalyze the conversion of single-stranded RNA into double-stranded viral DNA for integration into host chromosomes. Proteins in this subfamily contain long terminal repeats (LTRs) and are multifunctional enzymes with RNA-directed DNA polymerase, DNA directed DNA polymerase, and ribonuclease hybrid (RNase H) activities. The viral RNA genome enters the cytoplasm as part of a nucleoprotein complex, and reverse transcription in the cytoplasm generates a linear DNA duplex via an intricate series of steps. This duplex DNA is colinear with its RNA template, but contains terminal duplications known as LTRs that are not present in viral RNA. It has been proposed that two specialized template switches, known as strand-transfer reactions or "jumps", are required to generate the LTRs.	213
238824	cd01646	RT_Bac_retron_I	RT_Bac_retron_I: Reverse transcriptases (RTs) in bacterial retrotransposons or retrons. The polymerase reaction of this enzyme leads to the production of a unique RNA-DNA complex called msDNA (multicopy single-stranded (ss)DNA) in which a small ssDNA branches out from a small ssRNA molecule via a 2'-5'phosphodiester linkage. Bacterial retron RTs produce cDNA corresponding to only a small portion of the retron genome.	158
238825	cd01647	RT_LTR	RT_LTR: Reverse transcriptases (RTs) from retrotransposons and retroviruses which have long terminal repeats (LTRs) in their DNA copies but not in their RNA template. RT catalyzes DNA replication from an RNA template, and is responsible for the replication of retroelements. An RT gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. RTs are present in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and Caulimoviruses.	177
238826	cd01648	TERT	TERT: Telomerase reverse transcriptase (TERT). Telomerase is a ribonucleoprotein (RNP) that synthesizes telomeric DNA repeats. The telomerase RNA subunit provides the template for synthesis of these repeats. The catalytic subunit of RNP is known as telomerase reverse transcriptase (TERT). The reverse transcriptase (RT) domain is located in the C-terminal region of the TERT polypeptide. Single amino acid substitutions in this region lead to telomere shortening and senescence. Telomerase is an enzyme that, in certain cells, maintains the physical ends of chromosomes (telomeres) during replication. In somatic cells, replication of the lagging strand requires the continual presence of an RNA primer approximately 200 nucleotides upstream, which is complementary to the template strand. Since there is a region of DNA less than 200 base pairs from the end of the chromosome where this is not possible, the chromosome is continually shortened. However, a surplus of repetitive DNA at the chromosome ends protects against the erosion of gene-encoding DNA. Telomerase is not normally expressed in somatic cells. It has been suggested that exogenous TERT may extend the lifespan of, or even immortalize, the cell. However, recent studies have shown that telomerase activity can be induced by a number of oncogenes. Conversely, the oncogene c-myc can be activated in human TERT immortalized cells. Sequence comparisons place the telomerase proteins in the RT family but reveal hallmarks that distinguish them from retroviral and retrotransposon relatives.	119
238827	cd01650	RT_nLTR_like	RT_nLTR: Non-LTR (long terminal repeat) retrotransposon and non-LTR retrovirus reverse transcriptase (RT). This subfamily contains both non-LTR retrotransposons and non-LTR retrovirus RTs. RTs catalyze the conversion of single-stranded RNA into double-stranded DNA for integration into host chromosomes. RT is a multifunctional enzyme with RNA-directed DNA polymerase, DNA directed DNA polymerase and ribonuclease hybrid (RNase H) activities.	220
238828	cd01651	RT_G2_intron	RT_G2_intron: Reverse transcriptases (RTs) with group II intron origin. RT transcribes DNA using RNA as template. Proteins in this subfamily are found in bacterial and mitochondrial group II introns. Their most probable ancestor was a retrotransposable element with both gag-like and pol-like genes. This subfamily of proteins appears to have captured the RT sequences from transposable elements, which lack long terminal repeats (LTRs).	226
153210	cd01653	GATase1	Type 1 glutamine amidotransferase (GATase1)-like domain. Type 1 glutamine amidotransferase (GATase1)-like domain. This group includes proteins similar to Class I glutamine amidotransferases, the intracellular PH1704 from Pyrococcus horikoshii, the C-terminal of the large catalase: Escherichia coli HP-II, Sinorhizobium meliloti Rm1021 ThuA, and the A4 beta-galactosidase middle domain. The majority of proteins in this group have a reactive Cys found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. For Class I glutamine amidotransferase proteins, which transfer ammonia from the amide side chain of glutamine to an acceptor substrate, this Cys forms a Cys-His-Glu catalytic triad in the active site. Glutamine amidotransferase activity can be found in a range of biosynthetic enzymes included in this cd: glutamine amidotransferase, formylglycinamide ribonucleotide, GMP synthetase, anthranilate synthase component II, glutamine-dependent carbamoyl phosphate synthase, cytidine triphosphate synthetase, gamma-glutamyl hydrolase, imidazole glycerol phosphate synthase, and cobyric acid synthase. For Pyrococcus horikoshii PH1704, the Cys of the nucleophile elbow together with a different His and a Glu from an adjacent monomer form a catalytic triad different from the typical GATase1 triad. The E. coli HP-II C-terminal domain, S. meliloti Rm1021 ThuA and the A4 beta-galactosidase middle domain lack the catalytic triad typical of GATase1 domains. GATase1-like domains can occur either as single polypeptides, as in Class I glutamine amidotransferases, or as domains in a much larger multifunctional synthase protein, such as CPSase.	115
100099	cd01657	Ribosomal_L7_archeal_euk	Ribosomal protein L7, which is found in archaea and eukaryotes but not in prokaryotes, binds domain II of the 23S rRNA as well as the 5S rRNA and is one of five ribosomal proteins that mediate the interactions 5S rRNA makes with the ribosome. The eukaryotic L7 members have an N-terminal extension not found in the archaeal L7 orthologs. L7 is closely related to the ribosomal L30 protein found in eukaryotes and prokaryotes.	159
100100	cd01658	Ribosomal_L30	Ribosomal protein L30, which is found in eukaryotes and prokaryotes but not in archaea, is one of the smallest ribosomal proteins with a molecular mass of about 7 kDa. L30 binds the 23S rRNA as well as the 5S rRNA and is one of five ribosomal proteins that mediate the interactions 5S rRNA makes with the ribosome. The eukaryotic L30 members have N- and/or C-terminal extensions not found in their prokaryotic orthologs. L30 is closely related to the ribosomal L7 protein found in eukaryotes and archaea.	54
238829	cd01659	TRX_superfamily	Thioredoxin (TRX) superfamily; a large, diverse group of proteins containing a TRX-fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include TRX, protein disulfide isomerase (PDI), tlpA-like, glutaredoxin, NrdH redoxin, and the bacterial Dsb (DsbA, DsbC, DsbG, DsbE, DsbDgamma) protein families. Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins and glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others.	69
238830	cd01660	ba3-like_Oxidase_I	ba3-like heme-copper oxidase subunit I.  The ba3 family of heme-copper oxidases are transmembrane protein complexes in the respiratory chains of prokaryotes and some archaea which catalyze the reduction of O2 and simultaneously pump protons across the membrane.  It has been proposed that Archaea acquired heme-copper oxidases through gene transfer from Gram-positive bacteria.  The ba3 family contains oxidases that lack the conserved residues that form the D- and K-pathways in CcO and ubiquinol oxidase. Instead they contain a potential alternative K-pathway.  Additional proton channels have been proposed for this family of oxidases but none have been identified definitively.  For general information on the heme-copper oxidase superfamily, please see cd00919.	473
238831	cd01661	cbb3_Oxidase_I	Cytochrome cbb3 oxidase subunit I.  Cytochrome cbb3 oxidase, the terminal oxidase in the respiratory chains of proteobacteria, is a multi-chain transmembrane protein located in the cell membrane. Like other cytochrome oxidases, it catalyzes the reduction of O2 and simultaneously pumps protons across the membrane.  Found mainly in proteobacteria, cbb3 is believed to be a modern enzyme that has evolved independently to perform a specialized function in microaerobic energy metabolism. Subunit I contains a heme-copper binuclear center (the active site where O2 is reduced to water) formed by a high-spin heme and a copper ion.  It also contains a low-spin heme, believed to participate in the transfer of electrons to the binuclear center.  The cbb3 operon contains four genes (ccoNOQP or fixNOQP), with ccoN coding for subunit I.  Instead of a CuA-containing subunit II analogous to other cytochrome oxidases, cbb3 utilizes subunits ccoO and ccoP, which contain one and two hemes, respectively, to transfer electrons to the binuclear center.  The fourth subunit (ccoQ) has been shown to protect the core complex from proteolytic degradation by serine proteases.  For every reduction of an O2 molecule, eight protons are taken from the inside aqueous compartment and four electrons are taken from cytochrome c on the opposite side of the membrane.  The four electrons and four of the protons are used in the reduction of O2; the four remaining protons are pumped across the membrane.  This charge separation of four charges contributes to the electrochemical gradient used for ATP synthesis. The polar residues that form the D- and K-pathways in subunit I of other cytochrome c and ubiquinol oxidases are absent in cbb3.  The proton pathways remain undefined.  A pathway for the transfer of pumped protons beyond the binuclear center also remains undefined.  It is believed that electrons are passed from cytochrome c (the electron donor) to the low-spin heme via ccoP and ccoO, respectively, and directly from the low-spin heme to the binuclear center.	493
238832	cd01662	Ubiquinol_Oxidase_I	Ubiquinol oxidase subunit I.  Ubiquinol oxidase, the terminal oxidase in the respiratory chains of aerobic bacteria, is a multi-chain transmembrane protein located in the cell membrane.  It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits in ubiquinol oxidase varies from two to five. Subunit I contains a heme-copper binuclear center (the active site where O2 is reduced to water) formed by a high-spin heme and a copper ion.  It also contains a low-spin heme, believed to participate in the transfer of electrons from ubiquinol to the binuclear center.  For every reduction of an O2 molecule, eight protons are taken from the inside aqueous compartment and four electrons are taken from ubiquinol on the opposite side of the membrane.  The four electrons and four of the protons are used in the reduction of O2; the four remaining protons are pumped across the membrane. This charge separation of four charges contributes to the electrochemical gradient used for ATP synthesis.  Two proton channels, the D-pathway and K-pathway, leading to the binuclear center have been identified in subunit I.  It is generally believed that the channels contain water molecules that act as 'proton wires' to transfer the protons.  A well-defined pathway for the transfer of pumped protons beyond the binuclear center has not been identified.  Electrons are believed to be transferred directly from ubiquinol (the electron donor) to the low-spin heme, and directly from the low-spin heme to the binuclear center.	501
238833	cd01663	Cyt_c_Oxidase_I	Cytochrome C oxidase subunit I.  Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes.  It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane.  The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Only subunits I and II are essential for function, but subunit III, which is also conserved, may play a role in assembly or oxygen delivery to the active site. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Subunit I contains a heme-copper binuclear center (the active site where O2 is reduced to water) formed by a high-spin heme (heme a3) and a copper ion (CuB).  It also contains a low-spin heme (heme a), believed to participate in the transfer of electrons to the binuclear center.  For every reduction of an O2 molecule, eight protons are taken from the inside aqueous compartment and four electrons are taken from cytochrome c on the opposite side of the membrane.  The four electrons and four of the protons are used in the reduction of O2; the four remaining protons are pumped across the membrane.  This charge separation of four charges contributes to the electrochemical gradient used for ATP synthesis. Two proton channels, the D-pathway and K-pathway, leading to the binuclear center have been identified in subunit I.  A well-defined pathway for the transfer of pumped protons beyond the binuclear center has not been identified. Electrons are transferred from cytochrome c (the electron donor) to heme a via the CuA binuclear site in subunit II, and directly from heme a to the binuclear center.	488
238834	cd01665	Cyt_c_Oxidase_III	Cytochrome c oxidase subunit III.  Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. CcO catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria.  Only subunits I and II are essential for function, but subunit III, which is also conserved, is believed to play a role in assembly of the multimer complex. Rhodobacter CcO subunit III stabilizes the integrity of the binuclear center in subunit I.  Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Subunit III contains bound phospholipids in several crystal structures and is proposed to contain a "lipid pool."  These phospholipids are believed to be intrinsic constituents similar to cofactors of the enzyme.	243
340457	cd01666	TGS_DRG	TGS (ThrRS, GTPase and SpoT) domain found in developmentally regulated GTP binding protein (DRG) family. DRG-1 and DRG-2 comprise a highly conserved DRG subfamily of GTP-binding proteins found in archaea, plants, fungi and animals. The exact function of DRG proteins is unknown, although phylogenetic and biochemical fraction studies have linked them to translation, differentiation and growth. Their abnormal expressions may trigger cell transformation or cell cycle arrest. DRG-1 and DRG-2 bind to DFRP1 (DRG family regulatory protein 1) and DFRP2, respectively. Both DRG-1 and DRG-2 contain a domain of characteristic Obg-type G-motifs that may be the core of GTPase activity, as well as the C-terminal TGS (ThrRS, GTPase and SpoT) domain, which has a predominantly beta-grasp ubiquitin-like fold and may be related to RNA binding. DRG subfamily belongs to the Obg family of GTPases.	77
340458	cd01667	TGS_ThrRS	TGS (ThrRS, GTPase and SpoT) domain found in threonyl-tRNA synthetase (ThrRS) and similar proteins. ThrRS, also termed cytoplasmic threonine--tRNA ligase, is a class II aminoacyl-tRNA synthetase (aaRS) that plays an essential role in protein synthesis by catalyzing the aminoacylation of tRNA(Thr), generating aminoacyl-tRNA, and editing misacylation. In addition to its catalytic and anticodon-binding domains, ThrRS has an N-terminal TGS domain, named after the ThrRS, GTPase, and SpoT/RelA proteins where it occurs. TGS is a small domain with a beta-grasp ubiquitin-like fold, a common structure involved in protein-protein interactions.	65
340459	cd01668	TGS_RSH	TGS (ThrRS, GTPase and SpoT) domain found in the RelA/SpoT homolog (RSH) family. The RelA/SpoT homolog (RSH) family consists of long RSH proteins and short RSH proteins. Long RSH proteins have been characterized as containing an N-terminal region and a C-terminal region. The N-terminal region contains a pseudo-hydrolase (inactive-hydrolase) domain and a (p)ppGpp synthetase domain. The C-terminal region contains a ubiquitin-like TGS (ThrRS, GTPase and SpoT) domain, a conserved cysteine domain (CC), helical and ACT (aspartate kinase, chorismate mutase, TyrA domain) domains connected by a linker region. Short RSH proteins have a truncated C-terminal region without ACT domain. The RSH family includes two classes of enzyme: i) monofunctional (p)ppGpp synthetase I, RelA, and ii) bifunctional (p)ppGpp synthetase II/hydrolase, SpoT (also called Rel). Both classes are capable of synthesizing (p)ppGpp but only bifunctional enzymes are capable of (p)ppGpp hydrolysis. SpoT is a ribosome-associated protein that is activated during amino acid starvation and thought to mediate the stringent response. The function of the TGS domain of SpoT is in transcription of survival and virulence genes in response to environmental stress.  RelA is an ATP:GTP(GDP) pyrophosphate transferase that is recruited to stalled ribosomes and activated to synthesize (p)ppGpp, which acts as a pleiotropic secondary messenger.	59
340460	cd01669	TGS_MJ1332_like	TGS (ThrRS, GTPase and SpoT) domain found in Methanocaldococcus jannaschii uncharacterized GTP-binding protein MJ1332 and similar proteins. This family includes a group of uncharacterized GTP-binding proteins from archaea, which belong to the Obg family of GTPases. The family members contain a domain of characteristic Obg-type G-motifs that may be the core of GTPase activity, as well as a C-terminal TGS (ThrRS, GTPase and SpoT) domain that has a predominantly beta-grasp ubiquitin-like fold.	78
260017	cd01670	Death	Death Domain: a protein-protein interaction domain. Death Domains (DDs) are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. Structural analysis of DD-DD complexes shows that the domains interact with each other in many different ways. DD-containing proteins serve as adaptors in signaling pathways and they can recruit other proteins into signaling complexes. In mammals, they are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways. In invertebrates, they are involved in transcriptional regulation of zygotic patterning genes in insect embryogenesis, and are components of the Toll/NF-kappaB pathway, a conserved innate immune pathway in animal cells.	79
260018	cd01671	CARD	Caspase activation and recruitment domain: a protein-protein interaction domain. Caspase activation and recruitment domains (CARDs) are death domains (DDs) found associated with caspases. Caspases are aspartate-specific cysteine proteases with functions in apoptosis, immune signaling, inflammation, and host-defense mechanisms. In addition to caspases, proteins containing CARDs include adaptor proteins such as RAIDD, CARD9, and RIG-I-like helicases, which can form multiprotein complexes and play important roles in mediating the signals to induce immune and inflammatory responses. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	79
238835	cd01672	TMPK	Thymidine monophosphate kinase (TMPK), also known as thymidylate kinase, catalyzes the phosphorylation of thymidine monophosphate (TMP) to thymidine diphosphate (TDP) utilizing ATP as its preferred phosphoryl donor. TMPK represents the rate-limiting step in either de novo or salvage biosynthesis of thymidine triphosphate (TTP).	200
238836	cd01673	dNK	Deoxyribonucleoside kinase (dNK) catalyzes the phosphorylation of deoxyribonucleosides to yield corresponding monophosphates (dNMPs). This family consists of various deoxynucleoside kinases including deoxyribo- cytidine (EC 2.7.1.74), guanosine (EC 2.7.1.113), adenosine (EC 2.7.1.76), and thymidine (EC 2.7.1.21) kinases. They are key enzymes in the salvage of deoxyribonucleosides originating from extra- or intracellular breakdown of DNA.	193
238837	cd01674	Homoaconitase_Swivel	Homoaconitase swivel domain. This family includes homoaconitase and other uncharacterized proteins of the Aconitase family. Homoaconitase is part of an unusual lysine biosynthesis pathway found only in filamentous fungi, in which lysine is synthesized via the alpha-aminoadipate pathway. In this pathway, homoaconitase catalyzes the conversion of cis-homoaconitic acid into homoisocitric acid. The reaction mechanism is believed to be similar to that of other aconitases. This is the swivel domain, which is believed to undergo swivelling conformational change in the enzyme mechanism.	129
153084	cd01675	RNR_III	Class III ribonucleotide reductase. Ribonucleotide reductase (RNR) catalyzes the reductive synthesis of deoxyribonucleotides from their corresponding ribonucleotides. It provides the precursors necessary for DNA synthesis. RNRs are separated into three classes based on their metallocofactor usage. Class I RNRs, found in eukaryotes, bacteria, and bacteriophage, use a diiron-tyrosyl radical. Class II RNRs, found in bacteria, bacteriophage, algae and archaea, use coenzyme B12 (adenosylcobalamin, AdoCbl). Class III RNRs, found in strict or facultative anaerobic bacteria, bacteriophage, and archaea, use an FeS cluster and S-adenosylmethionine to generate a glycyl radical. Many organisms have more than one class of RNR present in their genomes. All three RNRs have a ten-stranded alpha-beta barrel domain that is structurally similar to the domain of PFL (pyruvate formate lyase). The class III enzyme from phage T4 consists of two subunits; this model covers the larger subunit, which contains the active and allosteric sites.	555
153085	cd01676	RNR_II_monomer	Class II ribonucleotide reductase, monomeric form. Ribonucleotide reductase (RNR) catalyzes the reductive synthesis of deoxyribonucleotides from their corresponding ribonucleotides. It provides the precursors necessary for DNA synthesis. RNRs are separated into three classes based on their metallocofactor usage. Class I RNRs, found in eukaryotes, bacteria, and bacteriophage, use a diiron-tyrosyl radical. Class II RNRs, found in bacteria, bacteriophage, algae and archaea, use coenzyme B12 (adenosylcobalamin, AdoCbl). Class III RNRs, found in anaerobic bacteria, bacteriophage, and archaea, use an FeS cluster and S-adenosylmethionine to generate a glycyl radical. Many organisms have more than one class of RNR present in their genomes. All three RNRs have a ten-stranded alpha-beta barrel domain that is structurally similar to  the domain of PFL (pyruvate formate lyase). Class II RNRs are found in bacteria that can live under both aerobic and anaerobic conditions. Many, but not all members of this class, are found to be homodimers. This particular subfamily is found to be active as a monomer. Adenosylcobalamin interacts directly with an active site cysteine to form the reactive cysteine radical.	658
153086	cd01677	PFL2_DhaB_BssA	Pyruvate formate lyase 2 and related enzymes. This family includes pyruvate formate lyase 2 (PFL2), B12-independent glycerol dehydratase (DhaB) and the alpha subunit of benzylsuccinate synthase (BssA), all of which have a highly conserved ten-stranded alpha/beta barrel domain, which is similar to those of PFL1 (pyruvate formate lyase 1) and RNR (ribonucleotide reductase). Pyruvate formate lyase catalyzes a key step in anaerobic glycolysis, the conversion of pyruvate and CoenzymeA to formate and acetylCoA. DhaB catalyzes the first step in the conversion of glycerol to 1,3-propanediol while BssA catalyzes the first step in the anaerobic mineralization of both toluene and m-xylene.	781
153087	cd01678	PFL1	Pyruvate formate lyase 1. Pyruvate formate lyase catalyzes a key step in anaerobic glycolysis, the conversion of pyruvate and CoenzymeA to formate and acetylCoA. The PFL mechanism involves an unusual radical cleavage of pyruvate in which two cysteines and one glycine form radicals that are required for catalysis. PFL has a ten-stranded alpha/beta barrel domain that is structurally similar to those of all three ribonucleotide reductase (RNR) classes as well as benzylsuccinate synthase and B12-independent glycerol dehydratase.	738
153088	cd01679	RNR_I	Class I ribonucleotide reductase. Ribonucleotide reductase (RNR) catalyzes the reductive synthesis of deoxyribonucleotides from their corresponding ribonucleotides. It provides the precursors necessary for DNA synthesis. RNRs are separated into three classes based on their metallocofactor usage. Class I RNRs, found in eukaryotes, bacteria, and many viruses, use a diiron-tyrosyl radical. Class II RNRs, found in bacteria, bacteriophage, algae and archaea, use coenzyme B12 (adenosylcobalamin, AdoCbl). Class III RNRs, found in anaerobic bacteria, bacteriophages, and archaea, use an FeS cluster and S-adenosylmethionine to generate a glycyl radical. Many organisms have more than one class of RNR present in their genomes. All three RNRs have a ten-stranded alpha-beta barrel domain that is structurally similar to  the domain of PFL (pyruvate formate lyase). Class I RNR is oxygen-dependent and can be subdivided into classes Ia (eukaryotes, prokaryotes, viruses and phages) and Ib (which is found in prokaryotes only). It is a tetrameric enzyme of two alpha and two beta subunits; this model covers the major part of the alpha or large subunit, called R1 in class Ia and R1E in class Ib.	460
238838	cd01680	EFG_like_IV	Elongation Factor G-like domain IV. This family includes the translational elongation factor termed EF-2 (for Archaea and Eukarya) and EF-G (for Bacteria), ribosomal protection proteins that mediate tetracycline resistance, and an evolutionarily conserved U5 snRNP-specific protein (U5-116kD). In complex with GTP, EF-G/EF-2 promotes the translocation step of translation. During translocation the peptidyl-tRNA is moved from the A site to the P site of the small subunit of ribosome and the mRNA is shifted one codon relative to the ribosome. It has been shown that EF-G/EF-2_IV domain mimics the shape of anticodon arm of the tRNA in the structurally homologous ternary complex of Phe-tRNA, EF-Tu (another translational elongation factor) and GTP analog. The tip portion of this domain is found in a position that overlaps the anticodon arm of the A-site tRNA, implying that EF-G/EF-2 displaces the A-site tRNA to the P-site by physical interaction with the anticodon arm.	116
238839	cd01681	aeEF2_snRNP_like_IV	This family represents domain IV of archaeal and eukaryotic elongation factor 2 (aeEF-2) and of an evolutionarily conserved U5 snRNP-specific protein. U5 snRNP is a GTP-binding factor closely related to the ribosomal translocase EF-2. In complex with GTP, EF-2 promotes the translocation step of translation. During translocation the peptidyl-tRNA is moved from the A site to the P site of the small subunit of ribosome and the mRNA is shifted one codon relative to the ribosome. It has been shown that EF-2_IV domain mimics the shape of anticodon arm of the tRNA in the structurally homologous ternary complex of Phe-tRNA, EF-1 (another translational elongation factor) and GTP analog. The tip portion of this domain is found in a position that overlaps the anticodon arm of the A-site tRNA, implying that EF-2 displaces the A-site tRNA to the P-site by physical interaction with the anticodon arm.	177
238840	cd01683	EF2_IV_snRNP	EF-2_domain IV_snRNP domain is a part of the 116kD U5-specific protein of the U5 small nuclear ribonucleoprotein (snRNP) particle, an essential component of the spliceosome. The protein is structurally closely related to the eukaryotic translational elongation factor EF2. This domain has also been identified in the 114kD U5-specific protein of Saccharomyces cerevisiae and may play an important role either in the splicing process itself or in the recycling of spliceosomal snRNP.	178
238841	cd01684	Tet_like_IV	EF-G_domain IV_RPP domain is a part of the bacterial ribosomal protection protein (RPP) family. RPPs such as tetracycline resistance proteins Tet(M) and Tet(O) mediate tetracycline resistance in both gram-positive and -negative species. Tetracyclines inhibit the accommodation of aminoacyl-tRNA into the ribosomal A site and therefore prevent the addition of new amino acids to the growing polypeptide. RPPs such as Tet(M) confer tetracycline resistance by releasing tetracycline from the ribosome and thereby freeing the ribosome from inhibitory effects of the drug, such that aa-tRNA can bind to the A site and protein synthesis can continue.	115
238842	cd01693	mtEFG2_like_IV	mtEF-G2 domain IV. This subfamily is a part of the mitochondrial translational elongation factor, mtEF-G2. Mitochondrial translation is crucial for maintaining mitochondrial function and mutations in this system lead to a breakdown in the respiratory chain-oxidative phosphorylation system and to impaired maintenance of mitochondrial DNA. In complex with GTP, EF-G promotes the translocation step of translation. During translocation the peptidyl-tRNA is moved from the A site to the P site of the small subunit of ribosome and the mRNA is shifted one codon relative to the ribosome.	120
238843	cd01699	RNA_dep_RNAP	RNA_dep_RNAP: RNA-dependent RNA polymerase (RdRp) is an essential protein encoded in the genomes of all RNA containing viruses with no DNA stage. RdRp catalyzes synthesis of the RNA strand complementary to a given RNA template. RdRps of many viruses are products of processing of polyproteins. Some RdRps consist of one polypeptide chain, and others are complexes of several subunits. The domain organization and the 3D structure of the catalytic center of a wide range of RdRps, including those with a low overall sequence homology, are conserved. The catalytic center is formed by several motifs containing a number of conserved amino acid residues. This subfamily represents the RNA-dependent RNA polymerases from all positive-strand RNA eukaryotic viruses with no DNA stage.	278
176454	cd01700	PolY_Pol_V_umuC	umuC subunit of DNA Polymerase V. umuC subunit of Pol V.   Pol V is a bacterial translesion synthesis (TLS) polymerase that consists of the heterotrimer of one umuC and two umuD subunits.  Translesion synthesis is a process that allows the bypass of a variety of DNA lesions.  TLS polymerases lack proofreading activity and have low fidelity and low processivity.  They use damaged DNA as templates and insert nucleotides opposite the lesions.  Pol V, RecA, single stranded DNA-binding protein, beta sliding clamp, and gamma clamp loading complex are responsible for inducing the SOS response in bacteria to repair UV-induced DNA damage.	344
176455	cd01701	PolY_Rev1	DNA polymerase Rev1. Rev1 is a translesion synthesis (TLS) polymerase found in eukaryotes.  Translesion synthesis is a process that allows the bypass of a variety of DNA lesions.  TLS polymerases lack proofreading activity and have low fidelity and low processivity.  They use damaged DNA as templates and insert nucleotides opposite the lesions.  Rev1 has both structural and enzymatic roles.  Structurally, it is believed to interact with other nonclassical polymerases and replication machinery to act as a scaffold.  Enzymatically, it catalyzes the specific insertion of dCMP opposite abasic sites.  Rev1 interacts with the Rev7 subunit of the B-family TLS polymerase Pol zeta (Rev3/Rev7).  Rev1 is known to actively promote the introduction of mutations, potentially making it a significant target for cancer treatment.	404
176456	cd01702	PolY_Pol_eta	DNA Polymerase eta. Pol eta, also called Rad30A, is a translesion synthesis (TLS) polymerase.  Translesion synthesis is a process that allows the bypass of a variety of DNA lesions.  TLS polymerases lack proofreading activity and have low fidelity and low processivity.  They use damaged DNA as templates and insert nucleotides opposite the lesions.  Unlike other Y-family members, Pol eta can efficiently and accurately replicate DNA past UV-induced lesions. Its activity is initiated by two simultaneous interactions: the PIP box in pol eta interacting with PCNA, and the UBZ (ubiquitin-binding zinc finger) in pol eta interacting with monoubiquitin attached to PCNA.  Pol eta is more efficient in copying damaged DNA than undamaged DNA and seems to recognize when a lesion has been passed, facilitating a lesion-dependent dissociation from the DNA.	359
176457	cd01703	PolY_Pol_iota	DNA Polymerase iota. Pol iota, also called Rad30B, is a translesion synthesis (TLS) polymerase.  Translesion synthesis is a process that allows the bypass of a variety of DNA lesions.  TLS polymerases lack proofreading activity and have low fidelity and low processivity.  They use damaged DNA as templates and insert nucleotides opposite the lesions.  Pol iota is thought to be one of the least efficient polymerases, particularly when opposite pyrimidines; it can incorporate the correct nucleotide opposite a purine much more efficiently than opposite a pyrimidine, and prefers to insert guanosine instead of adenosine opposite thymidine. Pol iota is believed to use Hoogsteen rather than Watson-Crick base pairing, which may explain the varying efficiency for different template nucleotides.	379
238844	cd01709	RT_like_1	RT_like_1: A subfamily of reverse transcriptases (RTs). An RT gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. RTs occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. These elements can be divided into two major groups. One group contains retroviruses and DNA viruses whose propagation involves an RNA intermediate. They are grouped together with transposable elements containing long terminal repeats (LTRs). The other group, also called poly(A)-type retrotransposons, contain fungal mitochondrial introns and transposable elements that lack LTRs.	346
238845	cd01712	ThiI	ThiI is required for thiazole synthesis in the thiamine biosynthesis pathway. It belongs to the Adenosine Nucleotide Hydrolysis superfamily and is predicted to bind to Adenosine nucleotide.	177
238846	cd01713	PAPS_reductase	This domain is found in phosphoadenosine phosphosulphate (PAPS) reductase enzymes or PAPS sulphotransferase. PAPS reductase is part of the adenine nucleotide alpha hydrolases superfamily also including N type ATP PPases and ATP sulphurylases. A highly modified version of the P loop, the fingerprint peptide of mononucleotide-binding proteins, is present in the active site of the protein, which appears to be a positively charged cleft containing a number of conserved arginine and lysine residues. Although PAPS reductase has no ATPase activity, it shows a striking similarity to the structure of the ATP pyrophosphatase (ATP PPase) domain of GMP synthetase, indicating that both enzyme families have evolved from a common ancestral nucleotide-binding fold. The enzyme uses thioredoxin as an electron donor for the reduction of PAPS to phospho-adenosine-phosphate (PAP). It is also found in NodP nodulation protein P from Rhizobium meliloti which has ATP sulphurylase activity (sulphate adenylate transferase).	173
238847	cd01714	ETF_beta	The electron transfer flavoprotein (ETF) serves as a specific electron acceptor for various mitochondrial dehydrogenases. ETF transfers electrons to the main respiratory chain via ETF-ubiquinone oxidoreductase. ETF is a heterodimer that consists of an alpha and a beta subunit which binds one molecule of FAD per dimer. A similar system also exists in some bacteria.  The homologous pair of proteins (FixA/FixB) are essential for nitrogen fixation. The beta subunit protein is distantly related to and forms a heterodimer with the alpha subunit.	202
238848	cd01715	ETF_alpha	The electron transfer flavoprotein (ETF) serves as a specific electron acceptor for various mitochondrial dehydrogenases. ETF transfers electrons to the main respiratory chain via ETF-ubiquinone oxidoreductase. ETF is a heterodimer that consists of an alpha and a beta subunit which binds one molecule of FAD per dimer. A similar system also exists in some bacteria.  The homologous pair of proteins (FixA/FixB) are essential for nitrogen fixation. The alpha subunit of ETF is structurally related to the bacterial nitrogen fixation protein fixB which could play a role in a redox process and feed electrons to ferredoxin.	168
212463	cd01716	Hfq	bacterial Hfq-like. Hfq, an abundant, ubiquitous RNA-binding protein, functions as a pleiotropic regulator of RNA metabolism in prokaryotes, required for transcription of some transcripts and degradation of others. Hfq binds small RNA molecules called riboregulators that modulate the stability or translation efficiency of RNA transcripts. Hfq binds preferentially to unstructured A/U-rich RNA sequences and is similar to the eukaryotic Sm proteins in both sequence and structure. Hfq forms a homo-hexameric ring similar to the heptameric ring of the Sm proteins.	60
212464	cd01717	Sm_B	Sm protein B. The eukaryotic Sm proteins (B/B', D1, D2, D3, E, F and G) assemble into a hetero-heptameric ring around the Sm site of the 2,2,7-trimethyl guanosine (m3G) capped U1, U2, U4 and U5 snRNAs (Sm snRNAs) forming the core of the snRNP particle. The snRNP particle, in turn, assembles with other components onto the pre-mRNA to form the spliceosome which is responsible for the excision of introns and the ligation of exons. Members of this family share a highly conserved Sm fold, containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet.	80
212465	cd01718	Sm_E	Sm protein E. The eukaryotic Sm proteins (B/B', D1, D2, D3, E, F and G) assemble into a hetero-heptameric ring around the Sm site of the 2,2,7-trimethyl guanosine (m3G) capped U1, U2, U4 and U5 snRNAs (Sm snRNAs) forming the core of the snRNP particle. The snRNP particle, in turn, assembles with other components onto the pre-mRNA to form the spliceosome which is responsible for the excision of introns and the ligation of exons. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit E binds subunits F and G to form a trimer which then assembles onto snRNA along with the D1/D2 and D3/B heterodimers forming a seven-membered ring structure.	79
212466	cd01719	Sm_G	Sm protein G. The eukaryotic Sm proteins (B/B', D1, D2, D3, E, F and G) assemble into a hetero-heptameric ring around the Sm site of the 2,2,7-trimethyl guanosine (m3G) capped U1, U2, U4 and U5 snRNAs (Sm snRNAs) forming the core of the snRNP particle. The snRNP particle, in turn, assembles with other components onto the pre-mRNA to form the spliceosome which is responsible for the excision of introns and the ligation of exons. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit G binds subunits E and F to form a trimer which then assembles onto snRNA along with the D1/D2 and D3/B heterodimers forming a seven-membered ring structure.	70
212467	cd01720	Sm_D2	Sm protein D2. The eukaryotic Sm proteins (B/B', D1, D2, D3, E, F and G) assemble into a hetero-heptameric ring around the Sm site of the 2,2,7-trimethyl guanosine (m3G) capped U1, U2, U4 and U5 snRNAs (Sm snRNAs) forming the core of the snRNP particle. The snRNP particle, in turn, assembles with other components onto the pre-mRNA to form the spliceosome which is responsible for the excision of introns and the ligation of exons. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D2 heterodimerizes with subunit D1 and three such heterodimers form a hexameric ring structure with alternating D1 and D2 subunits. The D1 - D2 heterodimer also assembles into a heptameric ring containing D2, D3, E, F, and G subunits.	89
212468	cd01721	Sm_D3	Sm protein D3. The eukaryotic Sm proteins (B/B', D1, D2, D3, E, F and G) assemble into a hetero-heptameric ring around the Sm site of the 2,2,7-trimethyl guanosine (m3G) capped U1, U2, U4 and U5 snRNAs (Sm snRNAs) forming the core of the snRNP particle. The snRNP particle, in turn, assembles with other components onto the pre-mRNA to form the spliceosome which is responsible for the excision of introns and the ligation of exons. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D3 heterodimerizes with subunit B and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits. The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits.	70
212469	cd01722	Sm_F	Sm protein F. The eukaryotic Sm proteins (B/B', D1, D2, D3, E, F and G) assemble into a hetero-heptameric ring around the Sm site of the 2,2,7-trimethyl guanosine (m3G) capped U1, U2, U4 and U5 snRNAs (Sm snRNAs) forming the core of the snRNP particle. The snRNP particle, in turn, assembles with other components onto the pre-mRNA to form the spliceosome which is responsible for the excision of introns and the ligation of exons. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit F is capable of forming both homo- and hetero-heptamer ring structures. To form the hetero-heptamer, Sm subunit F initially binds subunits E and G to form a trimer which then assembles onto snRNA along with the D3/B and D1/D2 heterodimers.	69
212470	cd01723	LSm4	Like-Sm protein 4. The eukaryotic LSm proteins (LSm2-8 or LSm1-7) assemble into a hetero-heptameric ring around the 3'-terminus uridylation tag of the gamma-methyl triphosphate (gamma-m-P3) capped U6 snRNA. LSm2-8 form the core of the snRNP particle that, in turn, assembles with other components onto the pre-mRNA to form the spliceosome which is responsible for the excision of introns and the ligation of exons. LSm1-7 is involved in recognition of the 3' uridylation tag and recruitment of the decapping machinery. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet.	76
212471	cd01724	Sm_D1	Sm protein D1. The eukaryotic Sm proteins (B/B', D1, D2, D3, E, F and G) assemble into a hetero-heptameric ring around the Sm site of the 2,2,7-trimethyl guanosine (m3G) capped U1, U2, U4 and U5 snRNAs (Sm snRNAs) forming the core of the snRNP particle. The snRNP particle, in turn, assembles with other components onto the pre-mRNA to form the spliceosome which is responsible for the excision of introns and the ligation of exons. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D1 heterodimerizes with subunit D2 and three such heterodimers form a hexameric ring structure with alternating D1 and D2 subunits. The D1 - D2 heterodimer also assembles into a heptameric ring containing B, D3, E, F, and G subunits.	92
212472	cd01725	LSm2	Like-Sm protein 2. The eukaryotic LSm proteins (LSm2-8 or LSm1-7) assemble into a hetero-heptameric ring around the 3'-terminus uridylation tag of the gamma-methyl triphosphate (gamma-m-P3) capped U6 snRNA. LSm2-8 form the core of the snRNP particle that, in turn, assembles with other components onto the pre-mRNA to form the spliceosome which is responsible for the excision of introns and the ligation of exons. LSm1-7 is involved in recognition of the 3' uridylation tag and recruitment of the decapping machinery. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet.	89
212473	cd01726	LSm6	Like-Sm protein 6. The eukaryotic LSm proteins (LSm2-8 or LSm1-7) assemble into a hetero-heptameric ring around the 3'-terminus uridylation tag of the gamma-methyl triphosphate (gamma-m-P3) capped U6 snRNA. LSm2-8 form the core of the snRNP particle that, in turn, assembles with other components onto the pre-mRNA to form the spliceosome which is responsible for the excision of introns and the ligation of exons. LSm1-7 is involved in recognition of the 3' uridylation tag and recruitment of the decapping machinery. LSm657 is believed to be an assembly intermediate for both the LSm1-7 and LSm2-8 rings. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet.	68
212474	cd01727	LSm8	Like-Sm protein 8. The eukaryotic LSm proteins (LSm2-8 or LSm1-7) assemble into a hetero-heptameric ring around the 3'-terminus uridylation tag of the gamma-methyl triphosphate (gamma-m-P3) capped U6 snRNA. LSm2-8 form the core of the snRNP particle that, in turn, assembles with other components onto the pre-mRNA to form the spliceosome which is responsible for the excision of introns and the ligation of exons. LSm1-7 is involved in recognition of the 3' uridylation tag and recruitment of the decapping machinery. LSm657 is believed to be an assembly intermediate for both the LSm1-7 and LSm2-8 rings. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet.	91
212475	cd01728	LSm1	Like-Sm protein 1. The eukaryotic LSm proteins (LSm1-7) assemble into a hetero-heptameric ring around the 3'-terminus of the gamma-methyl triphosphate (gamma-m-P3) capped U6 snRNA. Accumulation of uridylated RNAs in an lsm1 mutant suggests an involvement of the LSm1-7 complex in recognition of the 3' uridylation tag and recruitment of the decapping machinery. LSm1-7, together with Pat1, are also called the decapping activator. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet.	74
212476	cd01729	LSm7	Like-Sm protein 7. The eukaryotic LSm proteins (LSm2-8 or LSm1-7) assemble into a hetero-heptameric ring around the 3'-terminus uridylation tag of the gamma-methyl triphosphate (gamma-m-P3) capped U6 snRNA. LSm2-8 form the core of the snRNP particle that, in turn, assembles with other components onto the pre-mRNA to form the spliceosome which is responsible for the excision of introns and the ligation of exons. LSm1-7 is involved in recognition of the 3' uridylation tag and recruitment of the decapping machinery. LSm657 is believed to be an assembly intermediate for both the LSm1-7 and LSm2-8 rings. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet.	89
212477	cd01730	LSm3	Like-Sm protein 3. The eukaryotic LSm proteins (LSm2-8 or LSm1-7) assemble into a hetero-heptameric ring around the 3'-terminus uridylation tag of the gamma-methyl triphosphate (gamma-m-P3) capped U6 snRNA. LSm2-8 form the core of the snRNP particle that, in turn, assembles with other components onto the pre-mRNA to form the spliceosome which is responsible for the excision of introns and the ligation of exons. LSm1-7 is involved in recognition of the 3' uridylation tag and recruitment of the decapping machinery. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet.	82
212478	cd01731	archaeal_Sm1	archaeal Sm protein 1. The archaeal Sm1 proteins: The Sm proteins are conserved in all three domains of life and are always associated with U-rich RNA sequences. They function to mediate RNA-RNA interactions and RNA biogenesis. All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. Eukaryotic Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6). Since archaebacteria do not have any splicing apparatus, their Sm proteins may play a more general role. Archaeal LSm proteins are likely to represent the ancestral Sm domain.	69
212479	cd01732	LSm5	Like-Sm protein 5. The eukaryotic LSm proteins (LSm2-8 or LSm1-7) assemble into a hetero-heptameric ring around the 3'-terminus uridylation tag of the gamma-methyl triphosphate (gamma-m-P3) capped U6 snRNA. LSm2-8 form the core of the snRNP particle that, in turn, assembles with other components onto the pre-mRNA to form the spliceosome which is responsible for the excision of introns and the ligation of exons. LSm1-7 is involved in recognition of the 3' uridylation tag and recruitment of the decapping machinery. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet.	76
212480	cd01733	LSm10	Like-Sm protein 10. The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm10 is an SmD1-like protein which is thought to bind U7 snRNA along with LSm11 and five other Sm subunits to form a 7-membered ring structure. LSm10 and the U7 snRNP of which it is a part are thought to play an important role in histone mRNA 3' processing.	78
212481	cd01734	YlxS_C	Bacillus subtilis YlxS-like, C-terminal domain. YlxS is a Bacillus subtilis gene of unknown function with two domains that each have an alpha/beta fold. The N-terminal domain is composed of two alpha-helices and a three-stranded beta-sheet, while the C-terminal domain is composed of one alpha-helix and a five-stranded beta-sheet. This CD represents the C-terminal domain which has a fold similar to the Sm fold of proteins like Sm-D3.	72
212482	cd01735	LSm12_N	Like-Sm protein 12, N-terminal domain. LSm12 belongs to a family of Sm-like proteins that associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet that associates with other Sm proteins to form hexameric and heptameric ring structures. In addition to the N-terminal Sm-like domain, LSm12 has a novel methyltransferase domain.	61
212483	cd01736	LSm14_N	Like-Sm protein 14, N-terminal domain. LSm14 (also known as RAP55) belongs to a family of Sm-like proteins that associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold, containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet, that associates with other Sm proteins to form hexameric and heptameric ring structures. In addition to the N-terminal Sm-like domain, LSm14 has an uncharacterized C-terminal domain containing a conserved DFDF box. In Xenopus laevis, LSm14 is an oocyte-specific constituent of ribonucleoprotein particles.	74
212484	cd01737	LSm16_N	Like-Sm protein 16, N-terminal domain. LSm16 (also known as enhancer of decapping-3 or EDC3) has been shown to be associated with an mRNA-decapping complex Dcp1-Dcp2, required for removal of the 5-prime cap from mRNA prior to its degradation from the 5-prime end. EDC3 is believed to be a scaffold for decapping complex formation. It belongs to a family of Sm-like proteins that associate with RNA to form complexes involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold, containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet, that associates with other Sm proteins to form hexameric and heptameric ring structures. LSm16 has, in addition to its N-terminal Sm-like domain, a C-terminal Yjef_N-type Rossmann fold domain of unknown function.	65
212485	cd01739	LSm11_M	Like-Sm protein 11, middle domain. The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm11 is an SmD2-like subunit which binds U7 snRNA along with LSm10 and five other Sm subunits to form a 7-membered ring structure. LSm11 and the U7 snRNP of which it is a part are thought to play an important role in histone mRNA 3' processing.	63
153211	cd01740	GATase1_FGAR_AT	Type 1 glutamine amidotransferase (GATase1)-like domain found in Formylglycinamide ribonucleotide amidotransferase. Type 1 glutamine amidotransferase (GATase1)-like domain found in Formylglycinamide ribonucleotide amidotransferase (FGAR-AT). FGAR-AT catalyzes the ATP-dependent conversion of formylglycinamide ribonucleotide (FGAR) and glutamine to formylglycinamidine ribonucleotide (FGAM), ADP, Pi, and glutamate in the fourth step of the purine biosynthetic pathway. FGAR-AT is a glutamine amidotransferase. Glutamine amidotransferase activity catalyses the transfer of ammonia from the amide side chain of glutamine to an acceptor substrate. FGAR-AT belongs to the triad family of amidotransferases having a conserved Cys-His-Glu catalytic triad in the glutaminase active site.	238
153212	cd01741	GATase1_1	Subgroup of proteins having the Type 1 glutamine amidotransferase (GATase1) domain. This group contains a subgroup of proteins having the Type 1 glutamine amidotransferase (GATase1) domain. GATase activity catalyses the transfer of ammonia from the amide side chain of glutamine to an acceptor substrate. Glutamine amidotransferases (GATase) include the triad family of amidotransferases which have a conserved Cys-His-Glu catalytic triad in the glutaminase active site. In this subgroup this triad is conserved. GATase activity can be found in a range of biosynthetic enzymes, including: glutamine amidotransferase, formylglycinamide ribonucleotide, GMP synthetase, anthranilate synthase component II, glutamine-dependent carbamoyl phosphate synthase, cytidine triphosphate synthetase, gamma-glutamyl hydrolase, imidazole glycerol phosphate synthase, and cobyric acid synthase. Glutamine amidotransferase (GATase) domains can occur either as single polypeptides, as in glutamine amidotransferases, or as domains in a much larger multifunctional synthase protein, such as CPSase.	188
153213	cd01742	GATase1_GMP_Synthase	Type 1 glutamine amidotransferase (GATase1) domain found in GMP synthetase. Type 1 glutamine amidotransferase (GATase1) domain found in GMP synthetase. GMP synthetase is a glutamine amidotransferase from the de novo purine biosynthetic pathway. Glutamine amidotransferase (GATase) activity catalyses the transfer of ammonia from the amide side chain of glutamine to an acceptor substrate.  GMP synthetase catalyses the amination of the nucleotide precursor xanthosine 5'-monophosphate to form GMP.  GMP synthetase belongs to the triad family of amidotransferases having a conserved Cys-His-Glu catalytic triad in the glutaminase active site.	181
153214	cd01743	GATase1_Anthranilate_Synthase	Type 1 glutamine amidotransferase (GATase1) domain found in Anthranilate synthase. Type 1 glutamine amidotransferase (GATase1) domain found in Anthranilate synthase (ASase). This group contains proteins similar to para-aminobenzoate (PABA) synthase and ASase.  These enzymes catalyze similar reactions and produce similar products, PABA and ortho-aminobenzoate (anthranilate). Each enzyme is composed of non-identical subunits: a glutamine amidotransferase subunit (component II) and a subunit that produces an aminobenzoate product (component I). ASase catalyses the synthesis of anthranilate from chorismate and glutamine and is a tetrameric protein comprising two copies each of components I and II. Component II of ASase belongs to the family of triad GATases which hydrolyze glutamine and transfer nascent ammonia between the active sites. In some bacteria, such as Escherichia coli, component II can be much larger than in other organisms, due to the presence of phosphoribosyl-anthranilate transferase (PRTase) activity. PRTase catalyses the second step in tryptophan biosynthesis and results in the addition of 5-phosphoribosyl-1-pyrophosphate to anthranilate to create N-5'-phosphoribosyl-anthranilate.  In E.coli, the first step in the conversion of chorismate to PABA involves two proteins: PabA and PabB which co-operate to transfer the amide nitrogen of glutamine to chorismate forming 4-amino-4-deoxychorismate (ADC). PabA acts as a glutamine amidotransferase, supplying an amino group to PabB, which carries out the amination reaction. A third protein PabC then mediates elimination of pyruvate and aromatization to give PABA. Several organisms have bipartite proteins containing fused domains homologous to PabA and PabB commonly called PABA synthases. These hybrid PABA synthases may produce ADC and not PABA.	184
153215	cd01744	GATase1_CPSase	Small chain of the glutamine-dependent form of carbamoyl phosphate synthase, CPSase II. This group of sequences represents the small chain of the glutamine-dependent form of carbamoyl phosphate synthase, CPSase II.  CPSase II catalyzes the production of carbamoyl phosphate (CP) from bicarbonate, glutamine and two molecules of MgATP. The reaction is believed to proceed by a series of four biochemical reactions involving a minimum of three discrete highly reactive intermediates. The synthesis of CP is critical for the initiation of two separate biosynthetic pathways. In one CP is coupled to aspartate, its carbon and nitrogen nuclei ultimately incorporated into the aromatic moieties of pyrimidine nucleotides. In the second pathway CP is condensed with ornithine at the start of the urea cycle and is utilized for the detoxification of ammonia and biosynthesis of arginine. CPSases may be encoded by one or by several genes, depending on the species.  The E.coli enzyme is a heterodimer consisting of two polypeptide chains referred to as the small and large subunit. Ammonia, an intermediate in the biosynthesis of carbamoyl phosphate produced by the hydrolysis of glutamine in the small subunit of the enzyme, is delivered via a molecular tunnel to the remotely located carboxyphosphate active site in the large subunit. CPSase IIs belong to the triad family of amidotransferases having a conserved Cys-His-Glu catalytic triad in the glutaminase active site. This group also contains the sequence from the mammalian urea cycle form which has lost the active site Cys, resulting in an ammonia-dependent form, CPSase I.	178
153216	cd01745	GATase1_2	Subgroup of proteins having the Type 1 glutamine amidotransferase (GATase1) domain. This group contains a subgroup of proteins having the Type 1 glutamine amidotransferase (GATase1) domain. GATase activity catalyses the transfer of ammonia from the amide side chain of glutamine to an acceptor substrate. Glutamine amidotransferases (GATase) include the triad family of amidotransferases which have a conserved Cys-His-Glu catalytic triad in the glutaminase active site. In this subgroup this triad is conserved. GATase activity can be found in a range of biosynthetic enzymes, including: glutamine amidotransferase, formylglycinamide ribonucleotide, GMP synthetase, anthranilate synthase component II, glutamine-dependent carbamoyl phosphate synthase, cytidine triphosphate synthetase, gamma-glutamyl hydrolase, imidazole glycerol phosphate synthase, and cobyric acid synthase. Glutamine amidotransferase (GATase) domains can occur either as single polypeptides, as in glutamine amidotransferases, or as domains in a much larger multifunctional synthase protein, such as CPSase.	189
153217	cd01746	GATase1_CTP_Synthase	Type 1 glutamine amidotransferase (GATase1) domain found in Cytidine Triphosphate Synthetase. Type 1 glutamine amidotransferase (GATase1) domain found in Cytidine Triphosphate Synthetase (CTP). CTP is involved in pyrimidine ribonucleotide/ribonucleoside metabolism. CTPs produce CTP from UTP and glutamine and regulate intracellular CTP levels through interactions with four ribonucleotide triphosphates. The enzyme exists as a dimer of identical chains that aggregates as a tetramer. CTP is derived from UTP in three separate steps involving two active sites. In one active site, the UTP O4 oxygen is activated by Mg-ATP-dependent phosphorylation, followed by displacement of the resulting 4-phosphate moiety by ammonia. At a separate site, ammonia is generated via rate limiting glutamine hydrolysis (glutaminase) activity. A gated channel that spans between the glutamine hydrolysis and amidoligase active sites provides a path for ammonia diffusion. CTPs belong to the triad family of amidotransferases having a conserved Cys-His-Glu catalytic triad in the glutaminase active site.	235
153218	cd01747	GATase1_Glutamyl_Hydrolase	Type 1 glutamine amidotransferase (GATase1) domain found in gamma-Glutamyl Hydrolase. Type 1 glutamine amidotransferase (GATase1) domain found in gamma-Glutamyl Hydrolase. gamma-Glutamyl Hydrolase catalyzes the cleavage of the gamma-glutamyl chain of folylpoly-gamma-glutamyl substrates and is a central enzyme in folyl and antifolyl poly-gamma-glutamate metabolism. GATase activity involves the removal of the ammonia group from a glutamate molecule and its subsequent transfer to a specific substrate, thus creating a new carbon-nitrogen group on the substrate.  gamma-Glutamyl hydrolases belong to the triad family of amidotransferases having a conserved Cys-His-Glu catalytic triad in the glutaminase active site.	273
153219	cd01748	GATase1_IGP_Synthase	Type 1 glutamine amidotransferase (GATase1) domain found in imidazole glycerol phosphate synthase (IGPS). Type 1 glutamine amidotransferase (GATase1) domain found in imidazole glycerol phosphate synthase (IGPS). IGPS incorporates ammonia derived from glutamine into N1-[(5'-phosphoribulosyl)-formimino]-5-aminoimidazole-4-carboxamide ribonucleotide (PRFAR) to form 5'-(5-aminoimidazole-4-carboxamide) ribonucleotide (AICAR) and imidazole glycerol phosphate (IGP). The glutamine amidotransferase domain generates the ammonia nucleophile which is channeled from the glutaminase active site to the PRFAR active site. IGPS belongs to the triad family of amidotransferases having a conserved Cys-His-Glu catalytic triad in the glutaminase active site.	198
153220	cd01749	GATase1_PB	Glutamine Amidotransferase (GATase_I) involved in pyridoxine biosynthesis. Glutamine Amidotransferase (GATase_I) involved in pyridoxine biosynthesis. Glutamine amidotransferase (GATase) activity involves the removal of the ammonia group from a glutamate molecule and its subsequent transfer to a specific substrate, thus creating a new carbon-nitrogen group on the substrate.  This group contains proteins like Bacillus subtilis YaaE and Plasmodium falciparum Pdx2 which are members of the triad glutamine amidotransferase family and function in a pathway for the biosynthesis of vitamin B6.	183
153221	cd01750	GATase1_CobQ	Type 1 glutamine amidotransferase (GATase1) domain found in Cobyric Acid Synthase (CobQ). Type 1 glutamine amidotransferase (GATase1) domain found in Cobyric Acid Synthase (CobQ).  CobQ plays a role in cobalamin biosynthesis.  CobQ catalyses amidations at positions B, D, E, and G on adenosylcobyrinic A,C-diamide in the biosynthesis of cobalamin.  CobQ belongs to the triad family of amidotransferases.  Two of the three residues of the catalytic triad that are involved in glutamine binding, hydrolysis and transfer of the resulting ammonia to the acceptor substrate in other triad amidotransferases are conserved in CobQ.	194
238849	cd01751	PLAT_LH2	PLAT/ LH2 domain of plant lipoxygenase related proteins. Lipoxygenases are nonheme, nonsulfur iron dioxygenases that act on lipid substrates containing one or more (Z,Z)-1,4-pentadiene moieties. In plants, the immediate products are involved in defense mechanisms against pathogens and may be precursors of metabolic regulators. The generally proposed function of PLAT/LH2 domains is to mediate interaction with lipids or membrane bound proteins.	137
238850	cd01752	PLAT_polycystin	PLAT/LH2 domain of polycystin-1 like proteins.  Polycystins are a large family of membrane proteins composed of multiple domains, present in fish, invertebrates, mammals, and humans that are widely expressed in various cell types and whose biological functions remain poorly defined. In humans, mutations in polycystin-1 (PKD1) and polycystin-2 (PKD2) have been shown to be the cause of autosomal dominant polycystic kidney disease (ADPKD).  The generally proposed function of PLAT/LH2 domains is to mediate interaction with lipids or membrane bound proteins.	120
238851	cd01753	PLAT_LOX	PLAT domain of 12/15-lipoxygenase. As a unique subfamily of the mammalian lipoxygenases, they catalyze enzymatic lipid peroxidation in complex biological structures via direct dioxygenation of phospholipids and cholesterol esters of biomembranes and plasma lipoproteins. Both types of enzymes are cytosolic but need this domain to access their sequestered membrane or micelle bound substrates.	113
238852	cd01754	PLAT_plant_stress	PLAT/LH2 domain of plant-specific single domain protein family with unknown function. Many of its members are stress induced. In general, PLAT/LH2 consists of an eight stranded beta-barrel and its proposed function is to mediate interaction with lipids or membrane bound proteins.	129
238853	cd01755	PLAT_lipase	PLAT/ LH2 domain present in connection with a lipase domain. This family contains two major subgroups, the  lipoprotein lipase (LPL) and the pancreatic triglyceride lipase.  LPL is a key enzyme in catabolism of plasma lipoprotein triglycerides (TGs). The central role of triglyceride lipases is in energy production. In general, PLAT/LH2 domain's proposed function is to mediate interaction with lipids or membrane bound proteins.	120
238854	cd01756	PLAT_repeat	PLAT/LH2 domain repeats of family of proteins with unknown function. In general, PLAT/LH2 consists of an eight stranded beta-barrel and its proposed function is to mediate interaction with lipids or membrane bound proteins.	120
238855	cd01757	PLAT_RAB6IP1	PLAT/LH2 domain present in RAB6 interacting protein 1 (Rab6IP1)-like family. PLAT/LH2 domains consist of an eight stranded beta-barrel. In Rab6IP1 this domain may participate in lipid-mediated modulation of Rab6IP1's function via its generally proposed function of mediating interaction with lipids or membrane bound proteins.	114
238856	cd01758	PLAT_LPL	PLAT/ LH2 domain present in lipoprotein lipase (LPL).  LPL is a key enzyme in catabolism of plasma lipoprotein triglycerides (TGs) and therefore has a profound influence on triglyceride and high-density lipoprotein (HDL) cholesterol levels in the blood. In general, PLAT/LH2 domain's proposed function is to mediate interaction with lipids or membrane bound proteins.	137
238857	cd01759	PLAT_PL	PLAT/LH2 domain of pancreatic triglyceride lipase.  Lipases hydrolyze phospholipids and triglycerides to generate fatty acids for energy production or for storage and to release inositol phosphates that act as second messengers. The central role of triglyceride lipases is in energy production. The proposed function of PLAT/LH2 domains is to mediate interaction with lipids or membrane bound proteins.	113
340461	cd01760	RBD	Ras-binding domain (RBD), structurally similar to a beta-grasp ubiquitin-like fold. The RBD of the serine/threonine kinase Raf is structurally similar to the beta-grasp fold of ubiquitin, a common structure involved in protein-protein interactions. Ubiquitin (Ub) is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair. A Raf-like RBD is also present in Regulator of G protein Signaling (RGS12 and RGS14) members of GTPase activating proteins.	71
340462	cd01763	Ubl_SUMO_like	ubiquitin-like (Ubl) domain found in small ubiquitin-related modifier (SUMO) and similar proteins. SUMO (also known as "Smt3" and "sentrin" in other organisms) resembles ubiquitin (Ub) in structure, ligation to other proteins, and the mechanism of ligation. Ubiquitin is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair. Ubiquitination is comprised of a cascade of E1, E2 and E3 enzymes that results in a covalent bond between the C-terminus of Ub and the epsilon-amino group of a substrate lysine. SUMOs, like Ub, are covalently conjugated to lysine residues in a wide variety of target proteins in eukaryotic cells and regulate numerous cellular processes, such as transcription, epigenetic gene control, genomic instability, and protein degradation. The mammalian SUMOs have four paralogs, SUMO1 through SUMO4, which all regulate different cellular functions by conjugating to different proteins. SUMO2-4 are more closely related to each other than to SUMO1.	72
340463	cd01764	Ubl_Urm1	ubiquitin-like (Ubl) domain found in ubiquitin-related modifier 1 (Urm1). Urm1 acts as a sulfur carrier in the thiolation of eukaryotic tRNA via a mechanism that requires the formation of a thiocarboxylated Urm1, which is similar to that of prokaryotic sulfur carrier proteins such as ThiS and MoaD, containing the beta-grasp ubiquitin-like (Ubl) fold. Urm1 can be covalently conjugated to lysine residues of other proteins through a mechanism involving the E1-like protein Uba4. Urm1 is involved in yeast bioprocesses such as budding, nutrient sensing, high temperature sensitivity, antioxidant stress response and post-translation modification of the elongator subunit.	94
340464	cd01765	FERM_F0_F1	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F0 sub-domain and F1 sub-domain, found in FERM (Four.1/Ezrin/Radixin/Moesin) family proteins. FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain is present at the N-terminus of a large and diverse group of proteins that mediate linkage of the cytoskeleton to the plasma membrane. FERM-containing proteins are ubiquitous components of the cytocortex and are involved in cell transport, cell structure and signaling functions. The FERM domain is made up of three sub-domains, F1, F2, and F3. The family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N), which is structurally similar to ubiquitin.	80
340465	cd01766	Ubl_UFM1	ubiquitin-like (Ubl) domain found in ubiquitin fold modifier 1 (UFM1). UFM1 belongs to the ubiquitin-like protein family with similar ubiquitin beta-grasp folds and mechanism of ligation to other proteins. UFM1 is present in nearly all eukaryotic organisms except fungi. Ubiquitin is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair. The UFM1 cascade has been implicated in endoplasmic reticulum functions, cell cycle control and cell differentiation. The involvement of the UFM1 cascade in diseases is diverse; reports include its involvement in ischemic heart diseases, diabetes, gastric lesions, schizophrenia, hip dysplasia and cancer.	75
340466	cd01767	UBX	Ubiquitin regulatory domain X (UBX) structurally similar to a beta-grasp ubiquitin-like fold. The UBXD family of proteins contains the ubiquitin regulatory domain X (UBX) with a beta-grasp ubiquitin-like fold, but without the C-terminal double glycine motif. UBX domain is typically located at the carboxyl terminus of proteins, and participates broadly in the regulation of protein degradation. Members in this family function as cofactors of p97 (also known as VCP or Cdc48), which is a homohexameric AAA ATPase (ATPase associated with a variety of activities) involved in a variety of functions ranging from cell-cycle regulation to membrane fusion and protein degradation. Based on domain composition, UBXD proteins can be divided into two main groups, with and without ubiquitin-associated (UBA) domain.	74
340467	cd01768	RA_FERM_F0_F1_like	Ras-associating (RA) domain,  FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F0/F1 sub-domain, structurally similar to a beta-grasp ubiquitin-like fold. RA domain-containing proteins function by interacting, directly or indirectly, with Ras proteins and are involved in several different functions ranging from tumor suppression to being oncoproteins. Ras protein is a small GTPase that is involved in cellular signal transduction. The RA domain has the beta-grasp ubiquitin-like (Ubl) fold with low sequence similarity to ubiquitin (Ub). Ub is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair in eukaryotes. RA-containing proteins include RalGDS, AF6, RIN, RASSF1, SNX27, CYR1, STE50, and phospholipase C epsilon. The FERM domain is present at the N-terminus of a large and diverse group of proteins that mediate linkage of the cytoskeleton to the plasma membrane. FERM-containing proteins are ubiquitous components of the cytocortex and are involved in cell transport, cell structure and signaling. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, also known as the N-terminal Ubl-like structural domain of the FERM domain (FERM_N), which is structurally similar to Ub. Some FERM domain-containing proteins contain an N-terminal region, which also has the beta-grasp Ub-like fold, precedes the FERM domain and has been referred to as the F0 domain.	110
340468	cd01770	UBX_UBXN2	Ubiquitin regulatory domain X (UBX) found in UBX domain-containing proteins UBXN2A, UBXN2B, NSFL1C/UBXN2C, and similar proteins. This family includes UBX domain-containing proteins UBXN2A, UBXN2B, and NSFL1C/UBXN2C, which contain a SEP (Saccharomyces cerevisiae Shp1, Drosophila melanogaster eyes closed gene (eyc), and vertebrate p47) domain, and a ubiquitin regulatory domain X (UBX) with a beta-grasp ubiquitin-like fold at the C-terminus. UBX domain participates broadly in the regulation of protein degradation. UBXN2A, UBXN2B, and UBXN2C function as the adaptor proteins of p97 (also known as VCP or Cdc48), which is a homohexameric AAA ATPase (ATPase associated with a variety of activities) involved in a variety of functions ranging from cell-cycle regulation to membrane fusion and protein degradation.	71
340469	cd01771	UBX_UBXN3A	Ubiquitin regulatory domain X (UBX) found in FAS associated factor 1 (FAF1, also known as UBXN3A) and similar proteins. UBX domain-containing protein 3A (UBXN3A), also termed UBX domain-containing protein 12 (UBXD12), or FAF1, belongs to the UBXD family of proteins that contains the ubiquitin regulatory domain X (UBX) with a beta-grasp ubiquitin-like fold, but without the C-terminal double glycine motif. UBX domain is typically located at the carboxyl terminus of proteins, and participates broadly in the regulation of protein degradation. In addition, FAF1 contains two tandem ubiquitin-like (Ubl) domains, which show high structural similarity to the UBX domain. FAF1 functions as a cofactor of p97 (also known as VCP or Cdc48), which is a homohexameric AAA ATPase (ATPase associated with a variety of activities) involved in a variety of functions ranging from cell-cycle regulation to membrane fusion and protein degradation. The FAF1-p97 complex inhibits the proteasomal protein degradation in which p97 acts as a co-chaperone. Moreover, FAF1 is an apoptotic signaling molecule that acts downstream in the Fas signal transduction pathway. It interacts with the cytoplasmic domain of Fas, but not with a Fas mutant that is deficient in signal transduction. FAF1 is widely expressed in adult and embryonic tissues, and in tumor cell lines, and is localized not only in the cytoplasm where it interacts with Fas, but also in the nucleus. FAF1 contains phosphorylation sites for protein kinase CK2 within the nuclear targeting domain. Phosphorylation influences nuclear localization of FAF1 but does not affect its potentiation of Fas-induced apoptosis. Other functions have also been attributed to FAF1. It inhibits nuclear factor-kappaB (NF-kappaB) by interfering with the nuclear translocation of the p65 subunit. Although the precise role of FAF1 in the ubiquitination pathway remains unclear, FAF1 interacts with valosin-containing protein (VCP), which is involved in the ubiquitin-proteasome pathway. This family corresponds to the UBX domain.	80
340470	cd01772	UBX_UBXN1	Ubiquitin regulatory domain X (UBX) found in UBX domain protein 1 (UBXN1) and similar proteins. UBXN1, also termed SAPK substrate protein 1 (SAKS1), UBA/UBX 33.3 kDa protein (Y33K), or UBXD10, is a widely expressed protein containing an N-terminal ubiquitin-associated (UBA) domain, a coiled-coil region, and a C-terminal ubiquitin-like (Ubl or UBX) domain that has a beta-grasp ubiquitin-like fold without the C-terminal double glycine motif. UBXN1 has been identified as a substrate for stress-activated protein kinases (SAPKs). It binds polyubiquitin and valosin-containing protein (VCP), suggesting a role as an adaptor that directs VCP to polyubiquitinated proteins facilitating its destruction by the proteasome. In addition, UBXN1 specifically binds to Homer2b. It may also interact with ubiquitin (Ub) and be involved in the Ub-proteasome proteolytic pathways. UBXN1 can also associate with autoubiquitinated BRCA1 tumor suppressor and inhibit its enzymatic function through its UBA domains.	81
340471	cd01773	UBX_UBXN7	Ubiquitin regulatory domain X (UBX) found in UBX domain protein 7 (UBXN7) and similar proteins. UBXN7, also termed UBX domain-containing protein 7 (UBXD7), belongs to the UBXD family of proteins that contains the ubiquitin regulatory domain X (UBX) with a beta-grasp ubiquitin-like fold, but without the C-terminal double glycine motif. UBX domain is typically located at the carboxyl terminus of proteins, and participates broadly in the regulation of protein degradation. UBXN7 functions as a ubiquitin-binding adaptor that mediates the interaction between the AAA+ ATPase p97 (also known as VCP or Cdc48) and the transcription factor HIF1-alpha. It binds only to the active, NEDD8- or Rub1-modified form of cullins. In addition to having a UBX domain, UBXD7 contains a ubiquitin-associated (UBA), ubiquitin-associating (UAS), and ubiquitin-interacting motif (UIM) domains. Either UBA or UIM could serve as a docking site for neddylated-cullins. UBA domain is required for binding ubiquitylated-protein substrates, while the UIM motif is responsible for the binding to cullin RING ligases (CRLs), and the UBX domain is essential for p97 binding.	76
340472	cd01774	UBX_UBXN8	Ubiquitin regulatory domain X (UBX) found in UBX domain protein 8 (UBXN8) and similar proteins. UBXN8, also termed reproduction 8 protein (Rep8), or UBX domain-containing protein 6 (UBXD6), or D8S2298E, belongs to the UBXD family of proteins that contains the ubiquitin regulatory domain X (UBX) with a beta-grasp ubiquitin-like fold, but without the C-terminal double glycine motif. UBX domain is typically located at the carboxyl terminus of proteins, and participates broadly in the regulation of protein degradation. UBXN8 functions as a cofactor of p97 (also known as VCP or Cdc48), which is a homohexameric AAA ATPase (ATPase associated with a variety of activities) involved in a variety of functions ranging from cell-cycle regulation to membrane fusion and protein degradation. UBXN8 is a transmembrane protein that localizes to the endoplasmic reticulum (ER) membrane with its UBX domain facing the cytoplasm. It facilitates efficient ER-associated degradation (ERAD) by tethering p97 to the ER membrane.	76
340473	cd01775	RA_PHLPP_like	Ras-associating (RA) domain found in PH domain leucine-rich repeat-containing protein phosphatases, fungal adenylate cyclase, and similar proteins. PHLPP represents a novel family of Ser/Thr protein phosphatases, which is involved in two key signaling pathways, the phosphatidylinositol 3-kinase and diacylglycerol signaling pathways, by directly dephosphorylating and inactivating Akt serine-threonine kinases (Akt1, Akt2, Akt3) and protein kinase C (PKC) isoforms. PHLPP contains a putative Ras-associating (RA) domain followed by a pleckstrin homology (PH) domain, a series of leucine-rich repeats and a protein phosphatase 2C (PP2C) domain. Fungal adenylate cyclase regulates developmental processes such as hyphal growth, biofilm formation, and phenotypic switching. It plays an essential role in regulation of cellular metabolism by catalyzing the synthesis of a second messenger, cAMP. Fungal adenylate cyclase has at least four domains, including an N-terminal adenylate cyclase G-alpha binding domain, a Ras-associating (RA) domain, a middle leucine-rich repeat region, and a catalytic domain. The RA domain of adenylate cyclase post-translationally modifies a small GTPase called Ras, which is involved in cellular signal transduction. The activity of adenylate cyclase is stimulated directly by regulatory proteins (Ras1 and Gpa2), peptidoglycan fragments and carbon dioxide.	99
340474	cd01776	RA_Rin	Ras-associating (RA) domain of Ras and Rab interactor (Rin) protein family. Family of Ras-interaction/interference (Rin) proteins, also known as Ras and Rab interactors, is composed of Rin1, Rin2, and Rin3, which have multifunctional domains, including SH2 and proline-rich domains in the N-terminal region, and RH, VPS9, and RA domains in the C-terminal region. RA domain-containing proteins function by interacting with Ras proteins directly or indirectly and are involved in several different functions ranging from tumor suppression to being oncoproteins. Ras proteins are small GTPases that are involved in cellular signal transduction. The RA domain has the beta-grasp ubiquitin-like (Ubl) fold with low sequence similarity to ubiquitin; ubiquitin is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair. The RA domains of Rin1, Rin2, and Rin3 are well conserved and they all have Ras binding characteristics.	90
340475	cd01777	FERM_F1_SNX27	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in sorting nexin protein 27 (SNX27). SNX27 is a member of the family of cytoplasmic sorting nexin adaptor proteins that regulate endosomal trafficking of cell surface proteins. In addition to a PX (Phox homology) domain that regulates its endosomal localization, SNX27 has a unique PDZ (Psd-95/Dlg/ZO1) domain and an atypical FERM (4.1, ezrin, radixin, moesin) domain that both function to bind short peptide sequence motifs in the cytoplasmic domains of the cargo receptors. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N).	92
340476	cd01778	RA_RASSF1_like	Ras-associating (RA) domain found in Ras-association domain family members, RASSF1, RASSF3, and RASSF5. The RASSF family of proteins shares a conserved RalGDS/AF6 Ras association (RA) domain which is located either at the C-terminus (RASSF1-6, the classical group) or at the N-terminus (RASSF7-10). RASSF1-6 contains a conserved SARAH (Salvador/RASSF/Hpo) motif adjacent to the RA domain that functions in scaffolding and regulatory interactions. The RA domain of the classical RASSF proteins has a beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin. Classical RASSF members interact either directly or indirectly with activated Ras. Ras proteins are small GTPases that are involved in cellular signal transduction. The classical RASSF proteins seem to modulate some of the growth inhibitory responses mediated by Ras and may serve as tumor suppressor genes. This family contains RASSF1, RASSF3, and RASSF5.	130
340477	cd01779	RA_Myosin-IX	Ras-associating (RA) domain found in Myosin-IX. Myosins IX (Myo9) is a class of unique motor proteins with a common structure of an N-terminal extension preceding a myosin head homologous to the Ras-association (RA) domain, a head (motor) domain, a neck with IQ motifs that bind light chains and a C-terminal tail containing a Rho-GTPase activating protein (RhoGAP) domain. The RA domain is located at its head domain and has the beta-grasp ubiquitin-like fold with unknown function. There are two genes for myosins IX in humans, IXa and IXb, that are different in their expression and localization. IXa is expressed abundantly in brain and testis and IXb is expressed abundantly in tissues of the immune system.	97
340478	cd01780	RA2_PLC-epsilon	Ras-associating (RA) domain 2 found in Phosphatidylinositide-specific phospholipase C (PLC)-epsilon. PLC is a signaling enzyme that hydrolyzes membrane phospholipids to generate inositol triphosphate. PLC-epsilon represents a novel fourth class of PLC that has a PLC catalytic core domain, a CDC25 guanine nucleotide exchange factor domain and two RA (Ras-association) domains. RA domain has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin. Although PLC RA1 and RA2 have homologous ubiquitin-like folds, only RA2 can bind Ras and activate it. RA domain-containing proteins function by interacting with Ras proteins directly or indirectly and are involved in several different functions ranging from tumor suppression to being oncoproteins. Ras proteins are small GTPases that are involved in cellular signal transduction. This family corresponds to the second RA domain of PLC-epsilon.	102
340479	cd01781	RA2_Afadin	Ras-associating (RA) domain 2 found in Afadin. Afadin, also termed ALL1-fused gene from chromosome 6 protein (AF-6), or canoe, is involved in many fundamental signaling cascades in cells. In addition, it is involved in oncogenesis and metastasis. Afadin has multiple domains: from the N-terminus to the C-terminus it has two Ras-associated (RA) domains, a forkhead-associated domain, a dilute domain, a PDZ domain, three proline-rich domains, and an F-actin binding domain. RA domain-containing proteins function by interacting with Ras proteins directly or indirectly and are involved in several different functions ranging from tumor suppression to being oncoproteins. Ras proteins are small GTPases that are involved in cellular signal transduction. The RA domain has a beta-grasp ubiquitin-like (Ubl) fold with low sequence similarity to ubiquitin (Ub). Ub is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair in eukaryotes. Afadin is abundant at cadherin-based adherens junctions in epithelial cells, endothelial cells, and fibroblasts. This family corresponds to the second RA domain of afadin.	102
340480	cd01782	RA1_Afadin	Ras-associating (RA) domain 1 found in Afadin. Afadin, also termed ALL1-fused gene from chromosome 6 protein (AF-6), or canoe, is involved in many fundamental signaling cascades in cells. In addition, it is involved in oncogenesis and metastasis. Afadin has multiple domains: from the N-terminus to the C-terminus it has two Ras-associated (RA) domains, a forkhead-associated domain, a dilute domain, a PDZ domain, three proline-rich domains, and an F-actin-binding domain. RA domain-containing proteins function by interacting with Ras proteins directly or indirectly and are involved in several different functions ranging from tumor suppression to being oncoproteins. Ras proteins are small GTPases that are involved in cellular signal transduction. The RA domain has a beta-grasp ubiquitin-like (Ubl) fold with low sequence similarity to ubiquitin (Ub). Ub is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair in eukaryotes. Afadin is abundant at cadherin-based adherens junctions in epithelial cells, endothelial cells, and fibroblasts. This family corresponds to the first RA domain of afadin, which mediates its self-association.	112
340481	cd01783	RA2_DAGK-theta	Ras-associating (RA) domain 2 found in diacylglycerol kinase theta (DAGK-theta) and similar proteins. DAGK phosphorylates the second messenger diacylglycerol to phosphatidic acid as part of a protein kinase C pathway. DAGK-theta is characterized as a type V DAGK that has three cysteine-rich domains (all other isoforms have two), a proline/glycine-rich domain at its N-terminal, and a proposed Ras-associating (RA) domain. RA domain-containing proteins function by interacting with Ras proteins directly or indirectly and are involved in several different functions ranging from tumor suppression to being oncoproteins. Ras proteins are small GTPases that are involved in cellular signal transduction. The RA domain has a beta-grasp ubiquitin-like (Ubl) fold with low sequence similarity to ubiquitin (Ub). Ub is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair in eukaryotes. Ten mammalian isoforms of DAGK have been identified to date; these are organized into five categories based on the domain architecture. DAGK-theta also contains a pleckstrin homology (PH) domain. The subcellular localization and the activity of DAGK-theta are regulated in a complex (stimulation- and cell type-dependent) manner. This family corresponds to the second RA domain of DAGK-theta.	95
340482	cd01784	RA_RASSF2_like	Ras-associating (RA) domain found in Ras-association domain family members, RASSF2, RASSF4, and RASSF6. The RASSF family of proteins shares a conserved RalGDS/AF6 RA domain either in the C-terminus (RASSF1-6) or N-terminus (RASSF7-10). The classical family members (RASSF1-6) contain a conserved SARAH (Salvador/RASSF/Hpo) motif adjacent to the RA domain that functions in scaffolding and regulatory interactions. The RA domain of the classical RASSF protein family has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin. Classical RASSF members interact either directly or indirectly with activated Ras. Ras proteins are small GTPases that are involved in cellular signal transduction. The classical RASSF protein family seems to modulate some of the growth inhibitory responses mediated by Ras and may serve as tumor suppressor genes. This family contains RASSF2, RASSF4, and RASSF6.	87
340483	cd01785	RA_PDZ-GEF1	Ras-associating (RA) domain found in PDZ domain-containing guanine nucleotide exchange factor 1 (PDZ-GEF1) and similar proteins. PDZ-GEF1, also termed Rap guanine nucleotide exchange factor 2, or cyclic nucleotide ras GEF (CNrasGEF), or neural RAP guanine nucleotide exchange protein (nRap GEP), or Ras/Rap1-associating GEF-1 (RA-GEF-1), is a Rap-specific guanine nucleotide exchange factor (GEF) that has a PSD-95/DlgA/ZO-1 (PDZ) domain, a RA domain and a region related to a cyclic nucleotide binding domain (RCBD). The RA domain of PDZ-GEF interacts with Rap1 and also contributes to the membrane localization of PDZ-GEF. RA domain-containing proteins function by interacting with Ras proteins directly or indirectly and are involved in several different functions ranging from tumor suppression to being oncoproteins. Ras proteins are small GTPases that are involved in cellular signal transduction. RA domain has the beta-grasp ubiquitin-like (Ubl) fold with low sequence similarity to ubiquitin (Ub). Ub is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair in eukaryotes.	85
340484	cd01786	RA_STE50	Ras-associating (RA) domain found in the fungal adaptor protein STE50. The fungal adaptor protein STE50 is an essential component of three MAPK-mediated signaling pathways that control the mating response, invasive/filamentous growth and osmotolerance (HOG pathway), respectively. STE50 functions in cell signaling between the activated G protein and STE11. The domain architecture of STE50 includes an amino-terminal SAM (sterile alpha motif) domain in addition to the carboxy-terminal ubiquitin-like RA (RAS-associated) domain. RA domain of STE50 interacts with the small GTPase Cdc42p, a member of Rho type of the Ras superfamily. This interaction activates the Ste11p/Ste7p/Kss1p MAP kinase cascade that controls filamentous growth. RA domain has the beta-grasp ubiquitin-like (Ubl) fold with low sequence similarity to ubiquitin (Ub). Ub is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair in eukaryotes.	101
340485	cd01787	RA_MRL	Ras-associating (RA) domain of Mig10/RIAM/Lpd (MRL) family. MRL proteins share a common structural architecture, including a central structural unit consisting of a Ras-associating (RA) domain and a pleckstrin homology (PH) domain, an upstream coiled-coil region, and a number of polyproline motifs. RA domain-containing proteins function by interacting with Ras proteins directly or indirectly and are involved in several different functions ranging from tumor suppression to being oncoproteins. Ras proteins are small GTPases that are involved in cellular signal transduction. RA domain has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin (Ub). Ub is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair in eukaryotes. RA and PH form a tandem domain pair (RA-PH), and serve tightly coordinated functions in both Ras GTPase signaling via the RA domain and membrane translocalization via the PH domain. MRL proteins have distinct functions in cell migration and adhesion, signaling, and in cell growth.	85
340486	cd01788	Ubl_ElonginB	ubiquitin-like (Ubl) domain found in transcription elongation factor B (Elongin B) and similar proteins. Elongin B, also termed Elongin 18 kDa subunit, or EloB, or RNA polymerase II transcription factor SIII subunit B (SIII p18), is part of an E3 ubiquitin ligase complex called VEC that activates ubiquitination by the E2 ubiquitin-conjugating enzyme Ubc5. VEC is composed of von Hippel-Lindau tumor suppressor protein (pVHL), elongin C, cullin 2, NEDD8, and Rbx1. ElonginB binds elonginC to form the elonginBC complex which is a positive regulator of RNA polymerase II elongation factor Elongin A. The BC complex then binds VHL (von Hippel-Lindau) tumor suppressor protein to form a VCB ternary complex. Elongin B has a ubiquitin-like (Ubl) domain. Ub has a beta-grasp Ubl fold, a common structure involved in protein-protein interactions. Ub is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair.	101
340487	cd01789	Ubl_TBCB	ubiquitin-like (Ubl) domain found in tubulin-folding cofactor B (TBCB) and similar proteins. TBCB, also termed cytoskeleton-associated protein 1, or cytoskeleton-associated protein CKAPI, or tubulin-specific chaperone B, is one of protein cofactors A through E that is required for the folding of tubulins prior to their incorporation into microtubules and heterodimer assembly. TBCB comprises an N-terminal ubiquitin-like (Ubl) domain and a C-terminal cytoskeleton-associated protein with glycine-rich segment (CAP-Gly) domain. The Ubl domain of TBCB is essential for proper folding and assembly of tubulin alpha. It has a beta-grasp Ubl fold, a common structure involved in protein-protein interactions. Ubiquitin (Ub) is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair. TBC-A through E are necessary for the biogenesis of microtubules and for cell viability.	80
340488	cd01790	Ubl_HERP	ubiquitin-like (Ubl) domain found in homocysteine-inducible endoplasmic reticulum stress protein HERP. HERP is an endoplasmic reticulum (ER) integral membrane protein containing an N-terminal ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold. The Ubl domain is required for the degradation of HERP itself as well as for HERP-mediated anti-apoptotic effects. HERP is induced by the ER stress response pathway and is involved in improving the balance of folding capacity and protein loads in the ER. There are two types of HERP, HERP1 and HERP2, which are encoded by the HERPUD1 and HERPUD2 genes, respectively.	78
340489	cd01791	Ubl_UBL5	ubiquitin-like (Ubl) domain found in ubiquitin-like protein 5 (UBL5) and similar proteins. UBL5, known as Hub1 in yeast, is an atypical ubiquitin-like (Ubl) post-translational modifier that contains a conserved Ubl domain with a beta-grasp Ubl fold. At the C-terminal end of its Ubl fold is a di-tyrosine motif followed by a single variable residue instead of the characteristic di-glycine found in all other Ubl modifiers, and thus UBL5 does not form covalent conjugates with cellular proteins. The yeast Hub1p binds non-covalently to the HIND element of spliceosomal protein Snu66p (Snu66p is termed SART1 in mammals) and modifies the spliceosome by this unconventional Ubl modifier. In higher eukaryotes, UBL5/Hub1 plays a role in modulating pre-mRNA splicing. It also is required for signaling in the mitochondrial unfolded protein response, through interaction with the transcription factor DVE-1 and upregulation of chaperone genes in response to mitochondrial stress. Moreover, UBL5 functions as a factor that directly binds to and stabilizes FANCI, and promotes the functionality of the Fanconi anemia (FA) DNA repair pathway.	71
340490	cd01792	Ubl1_ISG15	ubiquitin-like (Ubl) domain 1 found in interferon-stimulated gene 15 (ISG15) and similar proteins. ISG15, also termed interferon-induced 15 kDa protein, or interferon-induced 17 kDa protein (IP17), or ubiquitin cross-reactive protein (UCRP), is an antiviral interferon-induced ubiquitin-like protein (Ubl) that upon viral infection, modifies cellular and viral proteins by mechanisms similar to ubiquitination. Although ISG15 has properties similar to those of other Ubl molecules, it is a unique member of the Ubl superfamily, whose expression and conjugation to target proteins are tightly regulated by specific signaling pathways, indicating it may have specialized functions in the immune system. ISG15 contains two tandem Ubl domains with a beta-grasp Ubl fold. This family corresponds to the first Ubl domain.	75
340491	cd01793	Ubl_FUBI	ubiquitin-like (Ubl) domain found in ubiquitin-like protein FUBI and similar proteins. FUBI is a pro-apoptotic regulatory gene FAU encoding ubiquitin-like protein with ribosomal protein S30 as a C-terminal extension. FUBI functions as a tumor suppressor protein that may be involved in the ATP-dependent proteolytic activity of ubiquitin. The N-terminal ubiquitin-like (Ubl) domain of FUBI has the beta-grasp Ubl fold, and it may act as a substitute or an inhibitor of ubiquitin or one of ubiquitin's close relatives UCRP, FAT10, and Nedd8.	74
340492	cd01794	Ubl_UBTD	ubiquitin-like (Ubl) domain found in ubiquitin domain-containing proteins UBTD1, UBTD2, and similar proteins. This family represents a group of ubiquitin-like (Ubl) domain-containing proteins evolutionarily conserved and found in metazoa, fungi, and plants. They may regulate the activity and/or specificity of E2 ubiquitin conjugating enzymes belonging to the UBE2D family. Members in this family contain an N-terminal ubiquitin binding domain (UBD) and a C-terminal Ubl domain with a beta-grasp Ubl fold, a common structure involved in protein-protein interactions.	69
340493	cd01795	Ubl_USP48	ubiquitin-like (Ubl) domain found in ubiquitin-specific-processing protease 48 (USP48) and similar proteins. USP48, also termed USP31, or deubiquitinating enzyme 48, or ubiquitin thioesterase 48, or ubiquitin carboxyl-terminal hydrolase 48, belongs to the ubiquitin specific protease (USP) family that is one of at least seven deubiquitylating enzyme (DUB) families capable of deconjugating ubiquitin (Ub) and ubiquitin-like (Ubl) adducts. While the USP proteins have a conserved catalytic core domain, USP48 differs in its domain architecture. It contains an N-terminal USP domain, three DUSP (domain present in ubiquitin-specific protease) domains, and a C-terminal Ubl domain with a beta-grasp Ubl fold, a common structure involved in protein-protein interactions. USP48 is a deubiquitinating enzyme that interacts with TNF receptor-associated factor 2 (TRAF2) and has been implicated in activation of nuclear factor-kappaB (NF-kappaB). Moreover, as a nuclear deubiquitinase regulated by casein kinase 2 (CK2), USP48 controls the ubiquitin/proteasome-system (UPS)-dependent turnover of activated NF-kappaB/RelA in the nucleus together with the COP9 signalosome, suggesting a role of USP48 in a timely control of immune responses.	99
340494	cd01796	Ubl_Ddi1_like	ubiquitin-like (Ubl) domain found in the eukaryotic Ddi1 family. The eukaryotic Ddi1 family, including yeast aspartyl protease DNA-damage inducible 1 (Ddi1) and Ddi1-like proteins from vertebrates and other eukaryotes, has been characterized by containing an N-terminal ubiquitin-like (Ubl) domain and a conserved retroviral aspartyl-protease-like domain (RVP) that is important in cell-cycle control. Yeast Ddi1 and many family members also contain a C-terminal ubiquitin-association (UBA) domain, however, Ddi1-like proteins from all vertebrates lack the UBA domain. Ddi1, also termed v-SNARE-master 1 (Vsm1), is an ubiquitin receptor involved in the cell cycle and late secretory pathway in Saccharomyces cerevisiae. It functions as an UBA-Ubl shuttle protein that is required for the proteasome to enable ubiquitin-dependent degradation of its ligands. For instance, Ddi1 plays an essential role in the final stages of proteasomal degradation of Ho endonuclease and of its cognate FBP, Ufo1. Moreover, Ddi1 and its associated protein Rad23p play a cooperative role as negative regulators in yeast PHO pathway. This family also includes mammalian regulatory solute carrier protein family 1 member 1 (RSC1A1), also termed transporter regulator RS1 (RS1), which mediates transcriptional and post-transcriptional regulation of Na(+)-D-glucose cotransporter SGLT1. Ddi1-like proteins play a significant role in cell cycle control, growth control, and trafficking in yeast and may play a crucial role in embryogenesis in higher eukaryotes.	73
340495	cd01797	Ubl_UHRF	ubiquitin-like (Ubl) domain found in ubiquitin-like PHD and RING finger domain-containing proteins, UHRF1 and UHRF2, and similar proteins. UHRF1 is a unique chromatin effector protein that integrates the recognition of both histone PTMs and DNA methylation. It is essential for cell proliferation and plays a critical role in the development and progression of many human carcinomas, such as laryngeal squamous cell carcinoma, gastric cancer, esophageal squamous cell carcinoma, colorectal cancer, prostate cancer, and breast cancer. UHRF1 can act as a transcriptional repressor through its binding to histone H3 when it is unmodified at Arg2. Its overexpression in human lung fibroblasts results in downregulation of expression of the tumor suppressor pRB. It also plays a role in transcriptional repression of the cell cycle regulator p21. Moreover, UHRF1-dependent repression of factors can facilitate the G1-S transition. It interacts with Tat-interacting protein of 60 kDa (TIP60) and induces degradation-independent ubiquitination of TIP60. It is also an N-methylpurine DNA glycosylase (MPG)-interacting protein that binds MPG in a p53 status-independent manner in the DNA base excision repair (BER) pathway. In addition, UHRF1 functions as an epigenetic regulator that is important for multiple aspects of epigenetic regulation, including maintenance of DNA methylation patterns and recognition of various histone modifications. UHRF2 was originally identified as a ubiquitin ligase acting as a small ubiquitin-like modifier (SUMO) E3 ligase that enhances zinc finger protein 131 (ZNF131) SUMOylation but does not enhance ZNF131 ubiquitination. It also ubiquitinates PCNP, a PEST-containing nuclear protein. Moreover, UHRF2 functions as a nuclear protein involved in cell-cycle regulation and has been implicated in tumorigenesis. It interacts with cyclins, CDKs, p53, pRB, PCNA, HDAC1, DNMTs, G9a, methylated histone H3 lysine 9, and methylated DNA. It interacts with the cyclin E-CDK2 complex, ubiquitinates cyclins D1 and E1, induces G1 arrest, and is involved in the G1/S transition regulation. Furthermore, UHRF2 is a direct transcriptional target of the transcription factor E2F-1 in the induction of apoptosis. It recruits HDAC1 and binds to methyl-CpG. UHRF2 also participates in the maturation of Hepatitis B virus (HBV) through interacting with HBV core protein and promoting its degradation. Both UHRF1 and UHRF2 contain an N-terminal ubiquitin-like domain (Ubl), a tandem Tudor domain (TTD), a plant homeodomain (PHD) finger, a set- and ring-associated (SRA) domain, and a C-terminal RING finger.	74
340496	cd01798	Ubl_parkin	ubiquitin-like (Ubl) domain found in parkin and similar proteins. Parkin, also termed Parkinson juvenile disease protein 2, is a RBR-type E3 ubiquitin-protein ligase that is associated with recessive early onset Parkinson's disease (PD), and exerts a protective effect against dopamine-induced alpha-synuclein-dependent cell toxicity. Mutations in the parkin gene cause autosomal recessive juvenile parkinsonism. Parkin functions within a multiprotein E3 ubiquitin ligase complex, catalyzing the covalent attachment of ubiquitin moieties onto substrate proteins, such as BCL2, SYT11, CCNE1, GPR37, RHOT1/MIRO1, MFN1, MFN2, STUB1, SNCAIP, SEPT5, TOMM20, USP30, ZNF746 and AIMP2. It mediates monoubiquitination as well as Lys-6-, Lys-11-, Lys-48- and Lys-63-linked polyubiquitination of substrates depending on the context. Parkin may enhance cell viability and protect dopaminergic neurons from oxidative stress-mediated death by regulating mitochondrial function. It also limits the production of reactive oxygen species (ROS), and regulates cyclin-E during neuronal apoptosis. Moreover, parkin displays a ubiquitin ligase-independent function in transcriptional repression of p53. Parkin contains an N-terminal ubiquitin-like (Ubl) domain and a C-terminal RBR domain that was previously known as RING-BetweenRING-RING domain or TRIAD [two RING fingers and a DRIL (double RING finger linked)] domain. Based on current understanding of the structural biology of RBR ligases, the nomenclature of RBR has been corrected as RING-BRcat (benign-catalytic)-Rcat (required-for-catalysis) recently. The RBR (RING1-BRcat-Rcat) domain uses an auto-inhibitory mechanism to modulate ubiquitination activity, as well as a hybrid mechanism that combines aspects from both RING and HECT E3 ligase function to facilitate the ubiquitination reaction.	74
340497	cd01799	Ubl_HOIL1	ubiquitin-like (Ubl) domain found in heme-oxidized IRP2 ubiquitin ligase 1 (HOIL-1) and similar proteins. HOIL-1, also termed RBCK1, or HOIL-1L, or RanBP-type and C3HC4-type zinc finger-containing protein 1, HBV-associated factor 4, or Hepatitis B virus X-associated protein 4, or RING finger protein 54 (RNF54), or ubiquitin-conjugating enzyme 7-interacting protein 3, or UbcM4-interacting protein 28 (UIP28), together with E3 ubiquitin-protein ligase RNF31 (also known as HOIP) and SHANK-associated RH domain interacting protein (SHARPIN), forms the E3-ligase complex (also known as linear-ubiquitin-chain assembly complex LUBAC) that regulates NF-kappaB activity and apoptosis through conjugation of linear polyubiquitin chains to NF-kappaB essential modulator (also known as NEMO or IKBKG). HOIL-1 plays a crucial role in TNF-alpha-mediated NF-kappaB activation. It also functions as an ubiquitin-protein ligase E3 that interacts with not only PKCbeta but also PKCzeta. It can recognize heme-oxidized IRP2 (iron regulatory protein2) and is thought to affect the turnover of oxidatively damaged proteins. HOIL-1 contains an N-terminal ubiquitin-like (UBL) domain and an Npl4 zinc-finger (NZF) domain, which regulate the interaction with the LUBAC subunit RNF31 and ubiquitin, respectively. The NZF domain belongs to RanBP2-type zinc finger (zf-RanBP2) domain superfamily. In addition, HOIL-1 has a RBR domain that was previously known as RING-BetweenRING-RING domain or TRIAD [two RING fingers and a DRIL (double RING finger linked)] domain. Based on current understanding of the structural biology of RBR ligases, the nomenclature of RBR has been corrected as RING-BRcat (benign-catalytic)-Rcat (required-for-catalysis) recently. The RBR (RING1-BRcat-Rcat) domain uses an auto-inhibitory mechanism to modulate ubiquitination activity, as well as a hybrid mechanism that combines aspects from both RING and HECT E3 ligase function to facilitate the ubiquitination reaction.	81
340498	cd01800	Ubl_SF3a120	ubiquitin-like (Ubl) domain found in splicing factor 3A 120kDa subunit (SF3a120) and similar proteins. Mammalian splicing factor SF3a consists of three subunits of 60, 66, and 120 kDa and functions early during pre-mRNA splicing by converting the U2 snRNP to its active form. The 120kDa subunit SF3a120, also termed splicing factor 3A subunit 1 (SF3A1), or spliceosome-associated protein 114 (SAP114), is the U2 snRNP-specific protein that is critical for spliceosome assembly and normal splicing events. During splicing, SF3a120, together with the U2 snRNP and other proteins, are recruited to the 3' splicing site to generate the splicing complex A after the recognition of the 3' splicing site. SF3a120 contains two N-terminal SWAP (suppressor-of-white-apricot) domains, referred to collectively as the SURP module, as well as a C-terminal ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, a common structure involved in protein-protein interactions.	84
340499	cd01801	Ubl_TECR_like	ubiquitin-like (Ubl) domain found in trans-2,3-enoyl-CoA reductase (TECR) and similar proteins. This family includes TECR and many TECR-like proteins, such as TECRL. TECR, also termed very-long-chain enoyl-CoA reductase, or synaptic glycoprotein SC2, or TER, or GPSN2, is a synaptic glycoprotein that catalyzes the fourth reaction in the synthesis of very long-chain fatty acids (VLCFA) which is the reduction step of the microsomal fatty acyl-elongation process. Diseases involving perturbations to normal synthesis and degradation of VLCFA (e.g. adrenoleukodystrophy and Zellweger syndrome) have significant neurological consequences. The mammalian TECR P182L mutation causes nonsyndromic mental retardation. Deletion of the yeast TECR (TSC13) homolog is lethal. TECR contains an N-terminal ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, a common structure involved in protein-protein interactions, as well as a C-terminal catalytic domain. TECRL, also termed steroid 5-alpha-reductase 2-like 2 protein (SRD5A2L2), is associated with life-threatening inherited arrhythmias displaying features of both long QT syndrome (LQTS) and catecholaminergic polymorphic ventricular tachycardia (CPVT). Both TECR and TECRL contain an N-terminal Ubl domain with a beta-grasp Ubl fold, and a C-terminal catalytic domain.	77
340500	cd01802	Ubl_ZFAND4	ubiquitin-like (Ubl) domain found in AN1-type zinc finger protein 4 (ZFAND4) and similar proteins. ZFAND4, also termed AN1-type zinc finger and ubiquitin domain-containing protein-like 1 (ANUBL1), may function as an oncogene that promotes proliferation and regulates relevant tumor suppressor genes in gastric cancer, suggesting a role in gastric cancer initiation and progression. ZFAND4 contains an N-terminal ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, a common structure involved in protein-protein interactions, as well as a C-terminal AN1-type zinc finger. Unlike ubiquitin polyproteins and most ubiquitin fusion proteins, the N-terminal Ubl domain of ZFAND4 does not undergo proteolytic processing.	74
340501	cd01803	Ubl_ubiquitin	ubiquitin-like (Ubl) domain found in ubiquitin. Ubiquitin is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair. Ubiquitination is comprised of a cascade of E1, E2 and E3 enzymes that results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. Ubiquitin-like (Ubl) proteins have similar ubiquitin beta-grasp fold and attach to other proteins in a Ubl manner but with biochemically distinct roles. Ubiquitin (Ub) and Ubl proteins conjugate and deconjugate via ligases and peptidases to covalently modify target polypeptides. Ub includes Ubq/RPL40e and Ubq/RPS27a fusions as well as homopolymeric multiubiquitin protein chains.	76
340502	cd01804	Ubl_midnolin	ubiquitin-like (Ubl) domain found in midnolin and similar proteins. Midnolin, also termed midbrain nucleolar protein, is a nucleolar protein that may be involved in regulation of genes related to neurogenesis in the nucleolus. It is strongly expressed at the mesencephalon (midbrain) of the embryo in day 12.5 (E12.5) mice and its expression is developmentally regulated. Midnolin plays a role in cellular signaling of adult tissues and regulates glucokinase enzyme activity in pancreatic beta cells. It can also control development via regulation of mRNA transport in cells. Midnolin contains an N-terminal conserved ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, a common structure involved in protein-protein interactions.	70
340503	cd01805	Ubl_Rad23	ubiquitin-like (Ubl) domain found in the Rad23 protein family. The Rad23 family includes the yeast nucleotide excision repair (NER) proteins, Rad23p (in Saccharomyces cerevisiae) and Rhp23p (in Schizosaccharomyces pombe), their mammalian orthologs HR23A and HR23B, and putative DNA repair proteins from plants. Rad23 proteins play dual roles in DNA repair as well as in proteasomal degradation. They have affinity for both the proteasome and ubiquitinylated proteins and participate in translocating polyubiquitinated proteins to the proteasome. Rad23 proteins carry an ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, and two ubiquitin-associated (UBA) domains, as well as a xeroderma pigmentosum group C (XPC) protein-binding domain. The Ubl domain is responsible for the binding to proteasome. The UBA domains are important for binding of ubiquitin (Ub) or multi-ubiquitinated substrates, which suggests Rad23 proteins might be involved in certain pathways of Ub metabolism. Both the Ubl domain and the XPC-binding domain are necessary for efficient NER function of Rad23 proteins.	72
340504	cd01806	Ubl_NEDD8	ubiquitin-like (Ubl) domain found in neural precursor cell expressed developmentally down-regulated protein 8 (NEDD8) and similar proteins. NEDD8, also termed Neddylin, or RELATED TO UBIQUITIN (RUB/Rub1p) in plant and yeast, is a ubiquitin-like protein that conjugates to nuclear proteins in a manner analogous to ubiquitination and sentrinization. It modifies a family of molecular scaffold proteins called cullins that are responsible for assembling the ROC1/Rbx1 RING-based E3 ubiquitin ligases, of which several play a direct role in tumorigenesis. NEDD8 deamidation and its inhibition of Cullin-RING ubiquitin ligases (CRLs) activity are responsible for Cycle-inhibiting factor (Cif)/Cif homolog in Burkholderia pseudomallei (CHBP)-induced cytopathic effect. NEDD8 contains a single conserved ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, a common structure involved in protein-protein interactions. Polyubiquitination signals for a diverse set of cellular events via different isopeptide linkages formed between the C terminus of one ubiquitin (Ub) and the epsilon-amine of K6, K11, K27, K29, K33, K48, or K63 of a second Ub. The Ubl NEDD8 contains many of the same lysines (K6, K11, K27, K33, K48) as Ub, where K27 has a role (other than conjugation) in the mechanism of protein neddylation.	74
340505	cd01807	Ubl_UBL4A_like	ubiquitin-like (Ubl) domain found in ubiquitin-like proteins UBL4A and similar proteins. UBL4A, also termed GdX, is a ubiquitously expressed ubiquitin-like (Ubl) protein that forms a complex with partner proteins and participates in the protein processing through endoplasmic reticulum (ER), acting as a chaperone. As a key component of the BCL2-associated athanogene 6 (BAG6) chaperone complex, UBL4A plays a role in mediating DNA damage signaling and cell death. UBL4A also regulates insulin-induced Akt plasma membrane translocation through promotion of Arp2/3-dependent actin branching. Moreover, UBL4A specifically stabilizes the TC45/STAT3 association and promotes dephosphorylation of STAT3 to repress tumorigenesis. UBL4B is testis-specific, and encoded by an X-derived retrogene Ubl4b, which is specifically expressed in post-meiotic germ cells in mammals. As a germ cell-specific cytoplasmic protein, UBL4B is not present in somatic cells. Moreover, UBL4B is present in elongated spermatids, but not in spermatocytes and round spermatids, suggesting its function is restricted to late spermiogenesis. The function of UBL4A may be compensated by either UBL4B or other Ubl proteins in normal conditions. Both UBL4A and UBL4B contain a conserved Ubl domain with a beta-grasp Ubl fold, a common structure involved in protein-protein interactions.	72
340506	cd01808	Ubl_PLICs	ubiquitin-like (Ubl) domain found in eukaryotic protein linking integrin-associated protein (IAP, also known as CD47) with cytoskeleton (PLIC) proteins. The PLIC proteins (or ubiquilins) family contains human homologs of the yeast ubiquitin-like (Ubl) Dsk2 protein, PLIC-1 (also termed ubiquilin-1), PLIC-2 (also termed ubiquilin-2, or Chap1), PLIC-3 (also termed ubiquilin-3) and PLIC-4 (also termed ubiquilin-4, ataxin-1 interacting ubiquitin-like protein, A1Up, connexin43-interacting protein of 75 kDa, or CIP75), and mouse PLIC proteins. They are ubiquitin (Ub)-binding adaptor proteins involved in all protein degradation pathways through delivering ubiquitinated substrates to proteasomes. They also promote autophagy-dependent cell survival during nutrient starvation. PLIC-1 regulates the function of the thrombospondin receptor CD47 and G protein signaling. It plays a role in TLR4-mediated signaling through interacting with the Toll/interleukin-1 receptor (TIR) domain of TLR4. It also inhibits the TLR3-Trif antiviral pathway by reducing the abundance of Trif. Moreover, PLIC-1 binds to gamma-aminobutyric acid receptors (GABAARs) and modulates the Ub-dependent, proteasomal degradation of GABAARs. Furthermore, PLIC-1 acts as a molecular chaperone regulating amyloid precursor protein (APP) biosynthesis, trafficking, and degradation by stimulating K63-linked polyubiquitination of lysine 688 in the APP intracellular domain. In addition, PLIC-1 is involved in the protein aggregation-stress pathway via associating with the Ub-interacting motif (UIM) proteins ataxin 3, HSJ1a, and epidermal growth factor substrate 15 (EPS15). PLIC-2 is a protein that binds the ATPase domain of the HSP70-like Stch protein. It functions as a negative regulator of G protein-coupled receptor (GPCR) endocytosis. It is also involved in amyotrophic lateral sclerosis (ALS)-related dementia. PLIC-3 is encoded by UbiquilinN3, a testis-specific gene. It shows high sequence similarity with the Xenopus protein XDRP1, a nuclear phosphoprotein that binds to the N-terminus of cyclin A and inhibits Ca2+-induced degradation of cyclin A, but not cyclin B. PLIC-4 is an ubiquitin-like (Ubl) nuclear protein that interacts with ataxin-1 and further links ataxin-1 with the chaperone and Ub-proteasome pathways. It also binds to the non-ubiquitinated gap junction protein connexin43 (Cx43) and regulates the turnover of Cx43 through the proteasomal pathway. PLIC proteins contain an N-terminal Ubl domain that is responsible for the binding of Ub-interacting motifs (UIMs) expressed by proteasomes and endocytic adaptors, and C-terminal Ub-associated (UBA) domain that interacts with Ub chains present on proteins destined for proteasomal degradation. In addition, mammalian PLIC2 proteins have an extra collagen-like motif region, which is absent in other PLIC proteins and the yeast Dsk2 protein.	73
340507	cd01809	Ubl_BAG6	ubiquitin-like (Ubl) domain found in BCL2-associated athanogene 6 (BAG6) and similar proteins. BAG6, also termed large proline-rich protein BAG6, or BAG family molecular chaperone regulator 6, or HLA-B-associated transcript 3 (Bat3), or protein Scythe, or protein G3, is a nucleo-cytoplasmic shuttling chaperone protein that is highly conserved in eukaryotes. It functions in two distinct biological pathways, ubiquitin-mediated protein degradation of defective polypeptides and tail-anchored transmembrane protein biogenesis in mammals. BAG6 is a component of the heterotrimeric BAG6 sortase complex composed of BAG6, transmembrane recognition complex 35 (TRC35) and ubiquitin-like protein 4A (UBL4A). The BAG6 complex together with the cochaperone small, glutamine-rich, tetratricopeptide repeat-containing, protein alpha (SGTA) plays a role in the biogenesis of tail-anchored membrane proteins and subsequently shown to regulate the ubiquitination and proteasomal degradation of mislocalized proteins. Moreover, BAG6 acts as an apoptotic regulator that binds reaper, a potent apoptotic inducer. BAG6/reaper is thought to signal apoptosis, in part through regulating the folding and activity of apoptotic signaling molecules. It is also likely a key regulator of the molecular chaperone Heat Shock Protein A2 (HSPA2) stability/function in human germ cells. Furthermore, aspartyl protease-mediated cleavage of BAG6 is necessary for autophagy and fungal resistance in plants. BAG6 contains a ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, which provides a platform for discriminating substrates with shorter hydrophobicity stretches as a signal for defective proteins.	71
340508	cd01810	Ubl2_ISG15	ubiquitin-like (Ubl) domain 2 found in interferon-stimulated gene 15 (ISG15) and similar proteins. ISG15, also termed interferon-induced 15 kDa protein, or interferon-induced 17 kDa protein (IP17), or ubiquitin cross-reactive protein (UCRP), is an antiviral interferon-induced ubiquitin-like protein that upon viral infection it modifies cellular and viral proteins by mechanisms similar to ubiquitination. Although ISG15 has properties similar to those of other ubiquitin-like (Ubl) molecules, it is a unique member of the Ubl superfamily, whose expression and conjugation to target proteins are tightly regulated by specific signaling pathways, indicating it may have specialized functions in the immune system. ISG15 contains two tandem Ubl domains with a beta-grasp Ubl fold. This family corresponds to the second Ubl domain.	74
340509	cd01811	Ubl1_OASL	ubiquitin-like (Ubl) domain 1 found in 2'-5'-oligoadenylate synthase-like protein (OASL) and similar proteins. OASL, also termed 2'-5'-OAS-related protein (2'-5'-OAS-RP), or 59 kDa 2'-5'-oligoadenylate synthase-like protein, or thyroid receptor-interacting protein 14, or TR-interacting protein 14 (TRIP-14), or p59 OASL (p59OASL), is an interferon (IFN)-induced antiviral protein that plays an important role in the IFNs-mediated antiviral signaling pathway. It inhibits respiratory syncytial virus replication and is targeted by the viral nonstructural protein 1 (NS1). It also displays antiviral activity against encephalomyocarditis virus (EMCV) and hepatitis C virus (HCV) via an alternative antiviral pathway independent of RNase L. Moreover, OASL does not have 2'-5'-OAS activity, but can bind double-stranded RNA (dsRNA) to enhance RIG-I signaling. OASL belongs to the 2'-5' oligoadenylate synthase (OAS) family. While each member of this family has a conserved N-terminal OAS catalytic domain, only OASL has two tandem C-terminal ubiquitin-like (Ubl) repeats, which are required for its antiviral activity. This family corresponds to the first Ubl domain.	75
340510	cd01812	Ubl_BAG1	ubiquitin-like (Ubl) domain found in BAG family molecular chaperone regulator 1 (BAG1) and similar proteins. BAG1, also termed Bcl-2-associated athanogene 1, or HAP, is a multifunctional protein involved in a variety of cellular functions such as apoptosis, transcription, and proliferative pathways, as well as in cell signaling and differentiation. It delivers chaperone-recognized unfolded substrates to the proteasome for degradation. BAG1 functions as a co-chaperone for Hsp70/Hsc70 to increase Hsp70 foldase activity. It also suppresses apoptosis and enhances neuronal differentiation. As an anti-apoptotic factor, BAG1 interacts with tau and regulates its proteasomal degradation. It also binds to BCR-ABL with a high affinity, and directly routes immature BCR-ABL for proteasomal degradation. It acts as a potential therapeutic target in Parkinson's disease. It also modulates huntingtin toxicity, aggregation, degradation, and subcellular distribution, suggesting a role in Huntington's disease. There are at least four isoforms of Bag1 protein that are formed by alternative initiation of translation within a common mRNA. BAG1 contains an N-terminal ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, and a C-terminal BAG domain.	77
340511	cd01813	Ubl_UBLCP1	ubiquitin-like (Ubl) domain found in ubiquitin-like domain-containing CTD phosphatase 1 (UBLCP1) and similar proteins. UBLCP1 is a 26S proteasome phosphatase that regulates nuclear proteasome activity. It is localized in the nucleus and contains a conserved ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, which directly interacts with the proteasome. Knockdown of UBLCP1 in cells promotes 26S proteasome assembly and selectively enhances nuclear proteasome activity. UBLCP1 may also play a role in the regulation of phosphorylation state of RNA polymerase II C-terminal domain, a key event during mRNA metabolism.	74
340512	cd01814	Ubl_MUBs_plant	ubiquitin-like (Ubl) domain found in plant membrane-anchored ubiquitin-fold proteins (MUBs). The plant MUBs belong to a family of ubiquitin-fold proteins that are plasma membrane-anchored by prenylation. They may serve as docking site to facilitate the association of specific E2s to the plasma membrane. MUBs contain a ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold.	89
340513	cd01815	Ubl_UBL7	ubiquitin-like (Ubl) domain found in ubiquitin-like protein 7 (UBL7) and similar proteins. UBL7, also termed bone marrow stromal cell ubiquitin-like (Ubl) protein (BMSC-UbP), or ubiquitin-like protein SB132, is a novel Ubl protein that may play roles in regulation of bone marrow stromal cell (BMSC) function or cell differentiation via an evocator-associated and cell-specific pattern. UBL7 contains an N-terminal Ubl domain with a beta-grasp Ubl fold, and a C-terminal ubiquitin-associated (UBA) domain. The Ubl domain interacts with 26S proteasome-dependent degradation, and the UBA domain links cellular processes and the ubiquitin system.	92
340514	cd01816	RBD_RAF	Ras-binding domain (RBD) found in RAF family serine/threonine kinases. The RAF family includes three RAF serine/threonine kinases ARAF, BRAF, and RAF1/CRAF. These are encoded by proto-oncogenes, and activate the mitogen-activated protein kinase/extracellular-signal-regulated kinase (MAPK/ERK) cascade downstream of RAS. They share a common structure consisting of an N-terminal regulatory domain and a C-terminal kinase domain. There are three conserved regions (CR1-3) in the regulatory domain, CR1 contains a Ras-binding domain (RBD) and a cysteine-rich domain (CRD), CR2 is a serine/threonine-rich domain, and CR3 encodes the kinase domain required for RAF. The RBD of RAF has a beta-grasp ubiquitin-like fold, a common structure involved in protein-protein interactions.	71
340515	cd01817	RBD1_RGS12_like	Ras-binding domain (RBD) 1 of regulator of G protein signaling 12 (RGS12) and similar proteins. Regulator of G protein signaling (RGS) proteins belong to a large family of GTPase-accelerating proteins (GAPs) which act as key inhibitors of G-protein-mediated cell responses in eukaryotes. This RGS12-like subfamily is composed of RGS12 and RGS14, with multidomain architectures including a RGS domain, two tandem Ras-binding domains (RBDs), and a second Galpha interacting domain, the GoLoco motif. The RBD is structurally similar to the beta-grasp fold of ubiquitin, a common structure involved in protein-protein interactions. Ubiquitin is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair.	70
132836	cd01819	Patatin_and_cPLA2	Patatins and Phospholipases. Patatin-like phospholipase. This family consists of various patatin glycoproteins from plants. The patatin protein accounts for up to 40% of the total soluble protein in potato tubers. Patatin is a storage protein, but it also has the enzymatic activity of a lipid acyl hydrolase, catalyzing the cleavage of fatty acids from membrane lipids. Members of this family have also been found in vertebrates. This family also includes the catalytic domain of cytosolic phospholipase A2 (PLA2; EC 3.1.1.4), which hydrolyzes the sn-2-acyl ester bond of phospholipids to release arachidonic acid. At the active site, cPLA2 contains a serine nucleophile through which the catalytic mechanism is initiated. The active site is partially covered by a solvent-accessible flexible lid. cPLA2 displays interfacial activation as it exists in both "closed lid" and "open lid" forms.	155
238858	cd01820	PAF_acetylesterase_like	PAF_acetylhydrolase (PAF-AH)_like subfamily of SGNH-hydrolases. Platelet-activating factor (PAF) and PAF-AH are key players in inflammation and in atherosclerosis. PAF-AH is a calcium independent phospholipase A2 which exhibits strong substrate specificity towards PAF, hydrolyzing an acetyl ester at the sn-2 position. PAF-AH also degrades a family of oxidized PAF-like phospholipids with short sn-2 residues.  In addition,  PAF and PAF-AH are associated with neural migration and mammalian reproduction.	214
238859	cd01821	Rhamnogalacturan_acetylesterase_like	Rhamnogalacturan_acetylesterase_like subgroup of SGNH-hydrolases. Rhamnogalacturan acetylesterase removes acetyl esters from rhamnogalacturonan substrates, and renders them susceptible to degradation by rhamnogalacturonases. Rhamnogalacturonans are highly branched regions in pectic polysaccharides, consisting of repeating -(1,2)-L-Rha-(1,4)-D-GalUA disaccharide units, with many rhamnose residues substituted by neutral oligosaccharides such as arabinans, galactans and arabinogalactans. Extracellular enzymes participating in the degradation of plant cell wall polymers, such as Rhamnogalacturonan acetylesterase, would typically be found in saprophytic and plant pathogenic fungi and bacteria.	198
238860	cd01822	Lysophospholipase_L1_like	Lysophospholipase L1-like subgroup of SGNH-hydrolases. The best characterized member in this family is TesA, an E. coli periplasmic protein with thioesterase, esterase, arylesterase, protease and lysophospholipase activity.	177
238861	cd01823	SEST_like	SEST_like. A family of secreted SGNH-hydrolases similar to Streptomyces scabies esterase (SEST), a causal agent of the potato scab disease, which hydrolyzes a specific ester bond in suberin, a plant lipid. The tertiary fold of this enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles two of the three components of typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid.	259
238862	cd01824	Phospholipase_B_like	Phospholipase-B_like. This subgroup of the SGNH-family of lipolytic enzymes may have both esterase and phospholipase-A/lysophospholipase activity. Its members may be involved in the conversion of phosphatidylcholine to fatty acids and glycerophosphocholine, perhaps in the context of dietary lipid uptake. Members may be membrane proteins. The tertiary fold of the SGNH-hydrolases is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles two of the three components of typical Ser-His-Asp(Glu) triad from other serine hydrolases.	288
238863	cd01825	SGNH_hydrolase_peri1	SGNH_peri1; putative periplasmic member of the SGNH-family of hydrolases, a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases.	189
238864	cd01826	acyloxyacyl_hydrolase_like	Acyloxyacyl-hydrolase like subfamily of the SGNH-hydrolase family. Acyloxyacyl-hydrolase is a leukocyte-secreted enzyme that deacetylates bacterial lipopolysaccharides.	305
238865	cd01827	sialate_O-acetylesterase_like1	sialate O-acetylesterase_like family of the SGNH hydrolases, a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases.	188
238866	cd01828	sialate_O-acetylesterase_like2	sialate_O-acetylesterase_like subfamily of the SGNH-hydrolases, a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases.	169
238867	cd01829	SGNH_hydrolase_peri2	SGNH_peri2; putative periplasmic member of the SGNH-family of hydrolases, a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases.	200
238868	cd01830	XynE_like	SGNH_hydrolase subfamily, similar to the putative arylesterase/acylhydrolase from the rumen anaerobe Prevotella bryantii XynE. The P. bryantii XynE gene is located in a xylanase gene cluster. SGNH hydrolases are a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases.	204
238869	cd01831	Endoglucanase_E_like	Endoglucanase E-like members of the SGNH hydrolase family; Endoglucanase E catalyzes the endohydrolysis of 1,4-beta-glucosidic linkages in cellulose, lichenin and cereal beta-D-glucans.	169
238870	cd01832	SGNH_hydrolase_like_1	Members of the SGNH-hydrolase superfamily, a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid. Myxobacterial members of this subfamily have been reported to be involved in adventurous gliding motility.	185
238871	cd01833	XynB_like	SGNH_hydrolase subfamily, similar to Ruminococcus flavefaciens XynB. Most likely a secreted hydrolase with xylanase activity. SGNH hydrolases are a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases.	157
238872	cd01834	SGNH_hydrolase_like_2	SGNH_hydrolase subfamily. SGNH hydrolases are a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases.	191
238873	cd01835	SGNH_hydrolase_like_3	SGNH_hydrolase subfamily. SGNH hydrolases are a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases.	193
238874	cd01836	FeeA_FeeB_like	SGNH_hydrolase subfamily, FeeA, FeeB and similar esterases/lipases. FeeA and FeeB are part of a biosynthetic gene cluster and may participate in the biosynthesis of long-chain N-acyltyrosines by providing saturated and unsaturated fatty acids, which in turn are loaded onto the acyl carrier protein FeeL. SGNH hydrolases are a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases.	191
238875	cd01837	SGNH_plant_lipase_like	SGNH_plant_lipase_like, a plant specific subfamily of the SGNH-family of hydrolases, a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases.	315
238876	cd01838	Isoamyl_acetate_hydrolase_like	Isoamyl-acetate hydrolyzing esterase-like proteins. SGNH_hydrolase subfamily similar to the Saccharomyces cerevisiae IAH1. IAH1 may be the major esterase that hydrolyses isoamyl acetate in sake mash. The SGNH-family of hydrolases is a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases.	199
238877	cd01839	SGNH_arylesterase_like	SGNH_hydrolase subfamily, similar to arylesterase (7-aminocephalosporanic acid-deacetylating enzyme) of A. tumefaciens. SGNH hydrolases are a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases.	208
238878	cd01840	SGNH_hydrolase_yrhL_like	yrhL-like subfamily of SGNH-hydrolases, a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases. Most members of this sub-family appear to co-occur with N-terminal acyltransferase domains. Might be involved in lipid metabolism.	150
238879	cd01841	NnaC_like	NnaC (CMP-NeuNAc synthetase) _like subfamily of SGNH_hydrolases, a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles two of the three components of typical Ser-His-Asp(Glu) triad from other serine hydrolases. E. coli NnaC appears to be involved in polysaccharide synthesis.	174
238880	cd01842	SGNH_hydrolase_like_5	SGNH_hydrolase subfamily. SGNH hydrolases are a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases.	183
238881	cd01844	SGNH_hydrolase_like_6	SGNH_hydrolase subfamily. SGNH hydrolases are a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases.	177
238882	cd01846	fatty_acyltransferase_like	Fatty acyltransferase-like subfamily of the SGNH hydrolases, a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases. Might catalyze fatty acid transfer between phosphatidylcholine and sterols.	270
238883	cd01847	Triacylglycerol_lipase_like	Triacylglycerol lipase-like subfamily of the SGNH hydrolases, a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad found in other serine hydrolases. Members of this subfamily might hydrolyze triacylglycerol into diacylglycerol and fatty acid anions.	281
206746	cd01849	YlqF_related_GTPase	Circularly permuted YlqF-related GTPases. These proteins are found in bacteria, eukaryotes, and archaea.  They all exhibit a circular permutation of the GTPase signature motifs so that the order of the conserved G box motifs is G4-G5-G1-G2-G3, with G4 and G5 being permuted from the C-terminal region of proteins in the Ras superfamily to the N-terminus of YlqF-related GTPases.	146
206649	cd01850	CDC_Septin	CDC/Septin GTPase family. Septins are a conserved family of GTP-binding proteins associated with diverse processes in dividing and non-dividing cells. They were first discovered in the budding yeast S. cerevisiae as a set of genes (CDC3, CDC10, CDC11 and CDC12) required for normal bud morphology. Septins are also present in metazoan cells, where they are required for cytokinesis in some systems, and implicated in a variety of other processes involving organization of the cell cortex and exocytosis. In humans, 12 septin genes generate dozens of polypeptides, many of which comprise heterooligomeric complexes. Since septin mutants are commonly defective in cytokinesis and formation of the neck filaments/septin rings, septins have been considered to be the primary constituents of the neck filaments. Septins belong to the GTPase superfamily for their conserved GTPase motifs and enzymatic activities.	275
206650	cd01851	GBP	Guanylate-binding protein (GBP) family (N-terminal domain). Guanylate-binding protein (GBP), N-terminal domain. Guanylate-binding proteins (GBPs) define a group of proteins that are synthesized after activation of the cell by interferons. The biochemical properties of GBPs are clearly different from those of Ras-like and heterotrimeric GTP-binding proteins. They bind guanine nucleotides with low affinity (micromolar range), are stable in their absence and have a high turnover GTPase. In addition to binding GDP/GTP, they have the unique ability to bind GMP with equal affinity and hydrolyze GTP not only to GDP, but also to GMP. Furthermore, two unique regions around the base and the phosphate-binding areas, the guanine and the phosphate caps, respectively, give the nucleotide-binding site a unique appearance not found in the canonical GTP-binding proteins. The phosphate cap, which constitutes the region analogous to switch I, completely shields the phosphate-binding site from solvent such that a potential GTPase-activating protein (GAP) cannot approach.	224
206651	cd01852	AIG1	AvrRpt2-Induced Gene 1 (AIG1). This group represents Arabidopsis protein AIG1 (avrRpt2-induced gene 1) that appears to be involved in plant resistance to bacteria. The Arabidopsis disease resistance gene RPS2 is involved in recognition of bacterial pathogens carrying the avirulence gene avrRpt2. AIG1 exhibits RPS2- and avrRpt1-dependent induction early after infection with Pseudomonas syringae carrying avrRpt2. This subfamily also includes IAN-4 protein, which has GTP-binding activity and shares sequence homology with a novel family of putative GTP-binding proteins: the immuno-associated nucleotide (IAN) family. The evolutionary conservation of the IAN family provides a unique example of a plant pathogen response gene conserved in animals. The IAN/IMAP subfamily has been proposed to regulate apoptosis in vertebrates and angiosperm plants, particularly in relation to cancer, diabetes, and infections. The human IAN genes were renamed GIMAP (GTPase of the immunity associated proteins).	201
206652	cd01853	Toc34_like	Translocon at the Outer-envelope membrane of Chloroplasts 34-like (Toc34-like). The Toc34-like (Translocon at the Outer-envelope membrane of Chloroplasts) family contains several Toc proteins, including Toc34, Toc33, Toc120, Toc159, Toc86, Toc125, and Toc90. The Toc complex at the outer envelope membrane of chloroplasts is a molecular machine of ~500 kDa that contains a single Toc159 protein, four Toc75 molecules, and four or five copies of Toc34. Toc64 and Toc12 are associated with the translocon, but do not appear to be part of the core complex. The Toc translocon initiates the import of nuclear-encoded preproteins from the cytosol into the organelle. Toc34 and Toc159 are both GTPases, while Toc75 is a beta-barrel integral membrane protein. Toc159 is equally distributed between a soluble cytoplasmic form and a membrane-inserted form, suggesting that assembly of the Toc complex is dynamic. Toc34 and Toc75 act sequentially to mediate docking and insertion of Toc159 resulting in assembly of the functional translocon.	248
206747	cd01854	YjeQ_EngC	Ribosomal interacting GTPase YjeQ/EngC, a circularly permuted subfamily of the Ras GTPases. YjeQ (YloQ in Bacillus subtilis) is a ribosomal small subunit-dependent GTPase; hence also known as RsgA. YjeQ is a late-stage ribosomal biogenesis factor involved in the 30S subunit maturation, and it represents a protein family whose members are broadly conserved in bacteria and have been shown to be essential to the growth of E. coli and B. subtilis. Proteins of the YjeQ family contain all sequence motifs typical of the vast class of P-loop-containing GTPases, but show a circular permutation, with a G4-G1-G3 pattern of motifs as opposed to the regular G1-G3-G4 pattern seen in most GTPases. All YjeQ family proteins display a unique domain architecture, which includes an N-terminal OB-fold RNA-binding domain, the central permuted GTPase domain, and a zinc knuckle-like C-terminal cysteine domain.	211
206748	cd01855	YqeH	Circularly permuted YqeH GTPase. YqeH is an essential GTP-binding protein. Depletion of YqeH induces an excess initiation of DNA replication, suggesting that it negatively controls initiation of chromosome replication. The YqeH subfamily is common in eukaryotes and sporadically present in bacteria with probable acquisition by plants from chloroplasts. Proteins of the YqeH family contain all sequence motifs typical of the vast class of P-loop-containing GTPases, but show a circular permutation, with a G4-G1-G3 pattern of motifs as opposed to the regular G1-G3-G4 pattern seen in most GTPases.	191
206749	cd01856	YlqF	Circularly permuted YlqF GTPase. Proteins of the YlqF family contain all sequence motifs typical of the vast class of P-loop-containing GTPases, but show a circular permutation, with a G4-G1-G3 pattern of motifs as opposed to the regular G1-G3-G4 pattern seen in most GTPases. The YlqF subfamily is represented in all eukaryotes as well as a phylogenetically diverse array of bacteria (including gram-positive bacteria, proteobacteria, Synechocystis, Borrelia, and Thermotoga).	171
206750	cd01857	HSR1_MMR1	A circularly permuted subfamily of the Ras GTPases. Human HSR1 is localized to the human MHC class I region and is highly homologous to a putative GTP-binding protein, MMR1 from mouse. These proteins represent a new subfamily of GTP-binding proteins that has only eukaryote members. This subfamily shows a circular permutation of the GTPase signature motifs so that the C-terminal strands 5, 6, and 7 (strand 6 contains the G4 box with sequence NKXD) are relocated to the N-terminus.	140
206751	cd01858	NGP_1	A novel nucleolar GTP-binding protein, circularly permuted subfamily of the Ras GTPases. Autoantigen NGP-1 (Nucleolar G-protein gene 1) has been shown to localize in the nucleolus and nucleolar organizers in all cell types analyzed, which is indicative of a function in ribosomal assembly. NGP-1 and its homologs show a circular permutation of the GTPase signature motifs so that the C-terminal strands 5, 6, and 7 (strand 6 contains the G4 box with NKXD motif) are relocated to the N terminus.	157
206752	cd01859	MJ1464	An uncharacterized, circularly permuted subfamily of the Ras GTPases. This family represents archaeal GTPase typified by the protein MJ1464 from Methanococcus jannaschii. The members of this family show a circular permutation of the GTPase signature motifs so that C-terminal strands 5, 6, and 7 (strand 6 contains the NKxD motif) are relocated to the N terminus.	157
206653	cd01860	Rab5_related	Rab-related GTPase family includes Rab5 and Rab22; regulates early endosome fusion. The Rab5-related subfamily includes Rab5 and Rab22 of mammals, Ypt51/Ypt52/Ypt53 of yeast, and RabF of plants. The members of this subfamily are involved in endocytosis and endocytic-sorting pathways. In mammals, Rab5 GTPases localize to early endosomes and regulate fusion of clathrin-coated vesicles to early endosomes and fusion between early endosomes. In yeast, Ypt51p family members similarly regulate membrane trafficking through prevacuolar compartments. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation.	163
206654	cd01861	Rab6	Rab GTPase family 6 (Rab6). Rab6 is involved in microtubule-dependent transport pathways through the Golgi and from endosomes to the Golgi. Rab6A of mammals is implicated in retrograde transport through the Golgi stack, and is also required for a slow, COPI-independent, retrograde transport pathway from the Golgi to the endoplasmic reticulum (ER). This pathway may allow Golgi residents to be recycled through the ER for scrutiny by ER quality-control systems. Yeast Ypt6p, the homolog of the mammalian Rab6 GTPase, is not essential for cell viability. Ypt6p acts in endosome-to-Golgi, in intra-Golgi retrograde transport, and possibly also in Golgi-to-ER trafficking. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation.	161
206655	cd01862	Rab7	Rab GTPase family 7 (Rab7). Rab7 subfamily. Rab7 is a small Rab GTPase that regulates vesicular traffic from early to late endosomal stages of the endocytic pathway. The yeast Ypt7 and mammalian Rab7 are both involved in transport to the vacuole/lysosome, whereas Ypt7 is also required for homotypic vacuole fusion. Mammalian Rab7 is an essential participant in the autophagic pathway for sequestration and targeting of cytoplasmic components to the lytic compartment. Mammalian Rab7 is also proposed to function as a tumor suppressor. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation.	172
206656	cd01863	Rab18	Rab GTPase family 18 (Rab18). Rab18 subfamily. Mammalian Rab18 is implicated in endocytic transport and is expressed most highly in polarized epithelial cells. However, trypanosomal Rab, TbRAB18, is upregulated in the BSF (Blood Stream Form) stage and localized predominantly to elements of the Golgi complex. In human and mouse cells, Rab18 has been identified in lipid droplets, organelles that store neutral lipids. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation.	161
133267	cd01864	Rab19	Rab GTPase family 19 (Rab19). Rab19 subfamily. Rab19 proteins are associated with Golgi stacks. Similarity analysis indicated that Rab41 is closely related to Rab19. However, the function of these Rabs is not yet characterized. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation.	165
206657	cd01865	Rab3	Rab GTPase family 3 contains Rab3A, Rab3B, Rab3C and Rab3D. The Rab3 subfamily contains Rab3A, Rab3B, Rab3C, and Rab3D. All four isoforms were found in mouse brain and endocrine tissues, with varying levels of expression. Rab3A, Rab3B, and Rab3C localized to synaptic and secretory vesicles; Rab3D was expressed at high levels only in adipose tissue, exocrine glands, and the endocrine pituitary, where it is localized to cytoplasmic secretory granules. Rab3 appears to control Ca2+-regulated exocytosis. The appropriate GDP/GTP exchange cycle of Rab3A is required for Ca2+-regulated exocytosis to occur, and interaction of the GTP-bound form of Rab3A with effector molecule(s) is widely believed to be essential for this process. Functionally, most studies point toward a role for Rab3 in the secretion of hormones and neurotransmitters. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation.	165
206658	cd01866	Rab2	Rab GTPase family 2 (Rab2). Rab2 is localized on cis-Golgi membranes and interacts with Golgi matrix proteins. Rab2 is also implicated in the maturation of vesicular tubular clusters (VTCs), which are microtubule-associated intermediates in transport between the ER and Golgi apparatus. In plants, Rab2 regulates vesicle trafficking between the ER and the Golgi bodies and is important to pollen tube growth. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation.	168
206659	cd01867	Rab8_Rab10_Rab13_like	Rab GTPase families 8, 10, 13 (Rab8, Rab10, Rab13). Rab8/Sec4/Ypt2 are known or suspected to be involved in post-Golgi transport to the plasma membrane. It is likely that these Rabs have functions that are specific to the mammalian lineage and have no orthologs in plants. Rab8 modulates polarized membrane transport through reorganization of actin and microtubules, induces the formation of new surface extensions, and has an important role in directed membrane transport to cell surfaces. The Ypt2 gene of the fission yeast Schizosaccharomyces pombe encodes a member of the Ypt/Rab family of small GTP-binding proteins, related in sequence to Sec4p of Saccharomyces cerevisiae but closer to mammalian Rab8. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation.	167
206660	cd01868	Rab11_like	Rab GTPase family 11 (Rab11)-like includes Rab11a, Rab11b, and Rab25. Rab11a, Rab11b, and Rab25 are closely related, evolutionary conserved Rab proteins that are differentially expressed. Rab11a is ubiquitously synthesized, Rab11b is enriched in brain and heart and Rab25 is only found in epithelia. Rab11/25 proteins seem to regulate recycling pathways from endosomes to the plasma membrane and to the trans-Golgi network. Furthermore, Rab11a is thought to function in the histamine-induced fusion of tubulovesicles containing H+, K+ ATPase with the plasma membrane in gastric parietal cells and in insulin-stimulated insertion of GLUT4 in the plasma membrane of cardiomyocytes. Overexpression of Rab25 has recently been observed in ovarian cancer and breast cancer, and has been correlated with worsened outcomes in both diseases. In addition, Rab25 overexpression has also been observed in prostate cancer, transitional cell carcinoma of the bladder, and invasive breast tumor cells. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation.	165
206661	cd01869	Rab1_Ypt1	Rab GTPase family 1 includes the yeast homolog Ypt1. Rab1/Ypt1 subfamily. Rab1 is found in every eukaryote and is a key regulatory component for the transport of vesicles from the ER to the Golgi apparatus. Studies on mutations of Ypt1, the yeast homolog of Rab1, showed that this protein is necessary for the budding of vesicles of the ER as well as for their transport to, and fusion with, the Golgi apparatus. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation.	166
206662	cd01870	RhoA_like	Ras homology family A (RhoA)-like includes RhoA, RhoB and RhoC. The RhoA subfamily consists of RhoA, RhoB, and RhoC. RhoA promotes the formation of stress fibers and focal adhesions, regulating cell shape, attachment, and motility. RhoA can bind to multiple effector proteins, thereby triggering different downstream responses. In many cell types, RhoA mediates local assembly of the contractile ring, which is necessary for cytokinesis. RhoA is vital for muscle contraction; in vascular smooth muscle cells, RhoA plays a key role in cell contraction, differentiation, migration, and proliferation. RhoA activities appear to be elaborately regulated in a time- and space-dependent manner to control cytoskeletal changes. Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Rho proteins. RhoA and RhoC are observed only in geranylgeranylated forms; however, RhoB can be present in palmitoylated, farnesylated, and geranylgeranylated forms. RhoA and RhoC are highly relevant for tumor progression and invasiveness; however, RhoB has recently been suggested to be a tumor suppressor. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation.	175
206663	cd01871	Rac1_like	Ras-related C3 botulinum toxin substrate 1 (rho family, small GTP binding protein Rac1)-like consists of Rac1, Rac2 and Rac3. The Rac1-like subfamily consists of Rac1, Rac2, and Rac3 proteins, plus the splice variant Rac1b that contains a 19-residue insertion near switch II relative to Rac1. While Rac1 is ubiquitously expressed, Rac2 and Rac3 are largely restricted to hematopoietic and neural tissues respectively. Rac1 stimulates the formation of actin lamellipodia and membrane ruffles. It also plays a role in cell-matrix adhesion and cell anoikis. In intestinal epithelial cells, Rac1 is an important regulator of migration and mediates apoptosis. Rac1 is also essential for RhoA-regulated actin stress fiber and focal adhesion complex formation. In leukocytes, Rac1 and Rac2 have distinct roles in regulating cell morphology, migration, and invasion, but are not essential for macrophage migration or chemotaxis. Rac3 has biochemical properties that are closely related to Rac1, such as effector interaction, nucleotide binding, and hydrolysis; Rac2 has a slower nucleotide association and is more efficiently activated by the RacGEF Tiam1. Both Rac1 and Rac3 have been implicated in the regulation of cell migration and invasion in human metastatic breast cancer. Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Rho proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation.	174
133275	cd01873	RhoBTB	RhoBTB protein is an atypical member of the Rho family of small GTPases. Members of the RhoBTB subfamily of Rho GTPases are present in vertebrates, Drosophila, and Dictyostelium. RhoBTB proteins are characterized by a modular organization, consisting of a GTPase domain, a proline rich region, a tandem of two BTB (Broad-Complex, Tramtrack, and Bric a brac) domains, and a C-terminal region of unknown function. RhoBTB proteins may act as docking points for multiple components participating in signal transduction cascades. RhoBTB genes appeared upregulated in some cancer cell lines, suggesting a participation of RhoBTB proteins in the pathogenesis of particular tumors. Note that the Dictyostelium RacA GTPase domain is more closely related to Rac proteins than to RhoBTB proteins, where RacA actually belongs. Thus, the Dictyostelium RacA is not included here. Most Rho proteins contain a lipid modification site at the C-terminus; however, RhoBTB is one of few Rho subfamilies that lack this feature.	195
206664	cd01874	Cdc42	cell division cycle 42 (Cdc42) is a small GTPase of the Rho family. Cdc42 is an essential GTPase that belongs to the Rho family of Ras-like GTPases. These proteins act as molecular switches by responding to exogenous and/or endogenous signals and relaying those signals to activate downstream components of a biological pathway. Cdc42 transduces signals to the actin cytoskeleton to initiate and maintain polarized growth and to mitogen-activated protein morphogenesis. In the budding yeast Saccharomyces cerevisiae, Cdc42 plays an important role in multiple actin-dependent morphogenetic events such as bud emergence, mating-projection formation, and pseudohyphal growth. In mammalian cells, Cdc42 regulates a variety of actin-dependent events and induces the JNK/SAPK protein kinase cascade, which leads to the activation of transcription factors within the nucleus. Cdc42 mediates these processes through interactions with a myriad of downstream effectors, whose number and regulation we are just starting to understand. In addition, Cdc42 has been implicated in a number of human diseases through interactions with its regulators and downstream effectors. Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Rho proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation.	175
133277	cd01875	RhoG	Ras homolog family, member G (RhoG) of small guanosine triphosphatases (GTPases). RhoG is a GTPase with high sequence similarity to members of the Rac subfamily, including the regions involved in effector recognition and binding. However, RhoG does not bind to known Rac1 and Cdc42 effectors, including proteins containing a Cdc42/Rac interacting binding (CRIB) motif. Instead, RhoG interacts directly with Elmo, an upstream regulator of Rac1, in a GTP-dependent manner and forms a ternary complex with Dock180 to induce activation of Rac1. The RhoG-Elmo-Dock180 pathway is required for activation of Rac1 and cell spreading mediated by integrin, as well as for neurite outgrowth induced by nerve growth factor. Thus RhoG activates Rac1 through Elmo and Dock180 to control cell morphology. RhoG has also been shown to play a role in caveolar trafficking and has a novel role in signaling the neutrophil respiratory burst stimulated by G protein-coupled receptor (GPCR) agonists. Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Rho proteins.	191
206665	cd01876	YihA_EngB	YihA (EngB) GTPase family. The YihA (EngB) subfamily of GTPases is typified by the E. coli YihA, an essential protein involved in cell division control. YihA and its orthologs are small proteins that typically contain fewer than 200 amino acid residues and consist of the GTPase domain only (some of the eukaryotic homologs contain an N-terminal extension of about 120 residues that might be involved in organellar targeting). Homologs of yihA are found in most Gram-positive and Gram-negative pathogenic bacteria, with the exception of Mycobacterium tuberculosis. The broad-spectrum nature of YihA and its essentiality for cell viability in bacteria make it an attractive antibacterial target.	170
206666	cd01878	HflX	HflX GTPase family. HflX subfamily. A distinct conserved domain with a glycine-rich segment N-terminal of the GTPase domain characterizes the HflX subfamily. The E. coli HflX has been implicated in the control of the lambda cII repressor proteolysis, but the actual biological functions of these GTPases remain unclear. HflX is widespread, but not universally represented in all three superkingdoms.	204
206667	cd01879	FeoB	Ferrous iron transport protein B (FeoB) family. Ferrous iron transport protein B (FeoB) subfamily. E. coli has an iron(II) transport system, known as feo, which may make an important contribution to the iron supply of the cell under anaerobic conditions. FeoB has been identified as part of this transport system. FeoB is a large 700-800 amino acid integral membrane protein. The N terminus contains a P-loop motif suggesting that iron transport may be ATP dependent.	159
206668	cd01881	Obg_like	Obg-like family of GTPases consists of five subfamilies: Obg, DRG, YyaF/YchF, Ygr210, and NOG1. The Obg-like subfamily consists of five well-delimited, ancient subfamilies, namely Obg, DRG, YyaF/YchF, Ygr210, and NOG1. Four of these groups (Obg, DRG, YyaF/YchF, and Ygr210) are characterized by a distinct glycine-rich motif immediately following the Walker B motif (G3 box). Obg/CgtA is an essential gene that is involved in the initiation of sporulation and DNA replication in the bacteria Caulobacter and Bacillus, but its exact molecular role is unknown. Furthermore, several OBG family members possess a C-terminal RNA-binding domain, the TGS domain, which is also present in threonyl-tRNA synthetase and in bacterial guanosine polyphosphatase SpoT. Nog1 is a nucleolar protein that might function in ribosome assembly. The DRG and Nog1 subfamilies are ubiquitous in archaea and eukaryotes, the Ygr210 subfamily is present in archaea and fungi, and the Obg and YyaF/YchF subfamilies are ubiquitous in bacteria and eukaryotes. The Obg/Nog1 and DRG subfamilies appear to form one major branch of the Obg family and the Ygr210 and YchF subfamilies form another branch. No GEFs, GAPs, or GDIs for Obg have been identified.	167
206669	cd01882	BMS1	Bms1, an essential GTPase, promotes assembly of preribosomal RNA processing complexes. Bms1 is an essential, evolutionarily conserved, nucleolar protein. Its depletion interferes with processing of the 35S pre-rRNA at sites A0, A1, and A2, and the formation of 40S subunits. Bms1, the putative endonuclease Rc11, and the essential U3 small nucleolar RNA form a stable subcomplex that is believed to control an early step in the formation of the 40S subunit. The C-terminal domain of Bms1 contains a GTPase-activating protein (GAP) that functions intramolecularly. It is believed that Rc11 activates Bms1 by acting as a guanine-nucleotide exchange factor (GEF) to promote GDP/GTP exchange, and that activated (GTP-bound) Bms1 delivers Rc11 to the preribosomes.	231
206670	cd01883	EF1_alpha	Elongation Factor 1-alpha (EF1-alpha) protein family. EF1 is responsible for the GTP-dependent binding of aminoacyl-tRNAs to the ribosomes. EF1 is composed of four subunits: the alpha chain which binds GTP and aminoacyl-tRNAs, the gamma chain that probably plays a role in anchoring the complex to other cellular components and the beta and delta (or beta') chains. This subfamily is the alpha subunit, and represents the counterpart of bacterial EF-Tu for the archaea (aEF1-alpha) and eukaryotes (eEF1-alpha). eEF1-alpha interacts with the actin of the eukaryotic cytoskeleton and may thereby play a role in cellular transformation and apoptosis. EF-Tu can have no such role in bacteria. In humans, the isoform eEF1A2 is overexpressed in 2/3 of breast cancers and has been identified as a putative oncogene. This subfamily also includes Hbs1, a G protein known to be important for efficient growth and protein synthesis under conditions of limiting translation initiation in yeast, and to associate with Dom34. It has been speculated that yeast Hbs1 and Dom34 proteins may function as part of a complex with a role in gene expression.	219
206671	cd01884	EF_Tu	Elongation Factor Tu (EF-Tu) GTP-binding proteins. EF-Tu subfamily. This subfamily includes orthologs of translation elongation factor EF-Tu in bacteria, mitochondria, and chloroplasts. It is one of several GTP-binding translation factors found in the larger family of GTP-binding elongation factors. The eukaryotic counterpart, eukaryotic translation elongation factor 1 (eEF-1 alpha), is excluded from this family. EF-Tu is one of the most abundant proteins in bacteria, as well as one of the most highly conserved, and in a number of species the gene is duplicated with identical function. When bound to GTP, EF-Tu can form a complex with any (correctly) aminoacylated tRNA except those for initiation and for selenocysteine, in which case EF-Tu is replaced by other factors. Transfer RNA is carried to the ribosome in these complexes for protein translation.	195
206672	cd01885	EF2	Elongation Factor 2 (EF2) in archaea and eukarya. Translocation requires hydrolysis of a molecule of GTP and is mediated by EF-G in bacteria and by eEF2 in eukaryotes. The eukaryotic elongation factor eEF2 is a GTPase involved in the translocation of the peptidyl-tRNA from the A site to the P site on the ribosome. The 95-kDa protein is highly conserved, with 60% amino acid sequence identity between the human and yeast proteins. Two major mechanisms are known to regulate protein elongation and both involve eEF2. First, eEF2 can be modulated by reversible phosphorylation. Increased levels of phosphorylated eEF2 reduce elongation rates presumably because phosphorylated eEF2 fails to bind the ribosomes. Treatment of mammalian cells with agents that raise the cytoplasmic Ca2+ and cAMP levels reduces elongation rates by activating the kinase responsible for phosphorylating eEF2. In contrast, treatment of cells with insulin increases elongation rates by promoting eEF2 dephosphorylation. Second, the protein can be post-translationally modified by ADP-ribosylation. Various bacterial toxins perform this reaction after modification of a specific histidine residue to diphthamide, but there is evidence for endogenous ADP ribosylase activity. Similar to the bacterial toxins, it is presumed that modification by the endogenous enzyme also inhibits eEF2 activity.	218
206673	cd01886	EF-G	Elongation factor G (EF-G) family involved in both the elongation and ribosome recycling phases of protein synthesis. Translocation is mediated by EF-G (also called translocase). The structure of EF-G closely resembles that of the complex between EF-Tu and tRNA. This is an example of molecular mimicry; a protein domain evolved so that it mimics the shape of a tRNA molecule. EF-G in the GTP form binds to the ribosome, primarily through the interaction of its EF-Tu-like domain with the 50S subunit. The binding of EF-G to the ribosome in this manner stimulates the GTPase activity of EF-G. On GTP hydrolysis, EF-G undergoes a conformational change that forces its arm deeper into the A site on the 30S subunit. To accommodate this domain, the peptidyl-tRNA in the A site moves to the P site, carrying the mRNA and the deacylated tRNA with it. The ribosome may be prepared for these rearrangements by the initial binding of EF-G as well. The dissociation of EF-G leaves the ribosome ready to accept the next aminoacyl-tRNA into the A site. This group contains both eukaryotic and bacterial members.	270
206674	cd01887	IF2_eIF5B	Initiation Factor 2 (IF2)/ eukaryotic Initiation Factor 5B (eIF5B) family. IF2/eIF5B contribute to ribosomal subunit joining and function as GTPases that are maximally activated by the presence of both ribosomal subunits. As seen in other GTPases, IF2/eIF5B undergoes conformational changes between its GTP- and GDP-bound states. Eukaryotic IF2/eIF5Bs possess three characteristic segments, including a divergent N-terminal region followed by conserved central and C-terminal segments. This core region is conserved among all known eukaryotic and archaeal IF2/eIF5Bs and eubacterial IF2s.	169
206675	cd01888	eIF2_gamma	Gamma subunit of initiation factor 2 (eIF2 gamma). eIF2 is a heterotrimeric translation initiation factor that consists of alpha, beta, and gamma subunits. The GTP-bound gamma subunit also binds initiator methionyl-tRNA and delivers it to the 40S ribosomal subunit. Following hydrolysis of GTP to GDP, eIF2:GDP is released from the ribosome. The gamma subunit has no intrinsic GTPase activity, but is stimulated by the GTPase activating protein (GAP) eIF5, and GDP/GTP exchange is stimulated by the guanine nucleotide exchange factor (GEF) eIF2B. eIF2B is a heteropentamer, and the epsilon chain binds eIF2. Both eIF5 and eIF2B-epsilon are known to bind strongly to eIF2-beta, but have also been shown to bind directly to eIF2-gamma. It is possible that eIF2-beta serves simply as a high-affinity docking site for eIF5 and eIF2B-epsilon, or that eIF2-beta serves a regulatory role. eIF2-gamma is found only in eukaryotes and archaea. It is closely related to SelB, the selenocysteine-specific elongation factor from eubacteria. The translational factor components of the ternary complex, IF2 in eubacteria and eIF2 in eukaryotes are not the same protein (despite their unfortunately similar names). Both factors are GTPases; however, eubacterial IF-2 is a single polypeptide, while eIF2 is heterotrimeric. eIF2-gamma is a member of the same family as eubacterial IF2, but the two proteins are only distantly related. This family includes translation initiation, elongation, and release factors.	197
206676	cd01889	SelB_euk	SelB, the dedicated elongation factor for delivery of selenocysteinyl-tRNA to the ribosome. SelB is an elongation factor needed for the co-translational incorporation of selenocysteine. Selenocysteine is coded by a UGA stop codon in combination with a specific downstream mRNA hairpin. In bacteria, the C-terminal part of SelB recognizes this hairpin, while the N-terminal part binds GTP and tRNA in analogy with elongation factor Tu (EF-Tu). It specifically recognizes the selenocysteine charged tRNAsec, which has a UCA anticodon, in an EF-Tu like manner. This allows insertion of selenocysteine at in-frame UGA stop codons. In E. coli SelB binds GTP, selenocysteyl-tRNAsec and a stem-loop structure immediately downstream of the UGA codon (the SECIS sequence). The absence of active SelB prevents the participation of selenocysteyl-tRNAsec in translation. Archaeal and animal mechanisms of selenocysteine incorporation are more complex. Although the SECIS elements have different secondary structures and conserved elements between archaea and eukaryotes, they do share a common feature. Unlike in E. coli, these SECIS elements are located in the 3' UTRs. This group contains eukaryotic SelBs and some from archaea.	192
206677	cd01890	LepA	LepA also known as Elongation Factor 4 (EF4). LepA (also known as elongation factor 4, EF4) belongs to the GTPase family and exhibits significant homology to the translation factors EF-G and EF-Tu, indicating its possible involvement in translation and association with the ribosome. LepA is ubiquitous in bacteria and eukaryota (e.g. yeast GUF1p), but is missing from archaea. This pattern of phyletic distribution suggests that LepA evolved through a duplication of the EF-G gene in bacteria, followed by early transfer into the eukaryotic lineage, most likely from the promitochondrial endosymbiont. Yeast GUF1p is not essential and mutant cells did not reveal any marked phenotype.	179
206678	cd01891	TypA_BipA	Tyrosine phosphorylated protein A (TypA)/BipA family belongs to ribosome-binding GTPases. BipA is a protein belonging to the ribosome-binding family of GTPases and is widely distributed in bacteria and plants. BipA was originally described as a protein that is induced in Salmonella typhimurium after exposure to bactericidal/permeability-inducing protein (a cationic antimicrobial protein produced by neutrophils), and has since been identified in E. coli as well. The properties thus far described for BipA are related to its role in the process of pathogenesis by enteropathogenic E. coli. It appears to be involved in the regulation of several processes important for infection, including rearrangements of the cytoskeleton of the host, bacterial resistance to host defense peptides, flagellum-mediated cell motility, and expression of K5 capsular genes. It has been proposed that BipA may utilize a novel mechanism to regulate the expression of target genes. In addition, BipA from enteropathogenic E. coli has been shown to be phosphorylated on a tyrosine residue, while BipA from Salmonella and from E. coli K12 strains is not phosphorylated under the conditions assayed. The phosphorylation apparently modifies the rate of nucleotide hydrolysis, with the phosphorylated form showing greatly increased GTPase activity.	194
206679	cd01892	Miro2	Mitochondrial Rho family 2 (Miro2), C-terminal. Miro2 subfamily. Miro (mitochondrial Rho) proteins have tandem GTP-binding domains separated by a linker region containing putative calcium-binding EF hand motifs. Genes encoding Miro-like proteins were found in several eukaryotic organisms. This CD represents the putative GTPase domain in the C terminus of Miro proteins. These atypical Rho GTPases have roles in mitochondrial homeostasis and apoptosis. Most Rho proteins contain a lipid modification site at the C-terminus; however, Miro is one of few Rho subfamilies that lack this feature.	180
206680	cd01893	Miro1	Mitochondrial Rho family 1 (Miro1), N-terminal. Miro1 subfamily. Miro (mitochondrial Rho) proteins have tandem GTP-binding domains separated by a linker region containing putative calcium-binding EF hand motifs. Genes encoding Miro-like proteins were found in several eukaryotic organisms. This CD represents the N-terminal GTPase domain of Miro proteins. These atypical Rho GTPases have roles in mitochondrial homeostasis and apoptosis. Most Rho proteins contain a lipid modification site at the C-terminus; however, Miro is one of few Rho subfamilies that lack this feature.	168
206681	cd01894	EngA1	EngA1 GTPase contains the first domain of EngA. This EngA1 subfamily CD represents the first GTPase domain of EngA and its orthologs, which are composed of two adjacent GTPase domains. Since the sequences of the two domains are more similar to each other than to other GTPases, it is likely that an ancient gene duplication, rather than a fusion of evolutionarily distinct GTPases, gave rise to this family. Although the exact function of these proteins has not been elucidated, studies have revealed that the E. coli EngA homolog, Der, and Neisseria gonorrhoeae EngA are essential for cell viability. A recent report suggests that E. coli Der functions in ribosome assembly and stability.	157
206682	cd01895	EngA2	EngA2 GTPase contains the second domain of EngA. This EngA2 subfamily CD represents the second GTPase domain of EngA and its orthologs, which are composed of two adjacent GTPase domains. Since the sequences of the two domains are more similar to each other than to other GTPases, it is likely that an ancient gene duplication, rather than a fusion of evolutionarily distinct GTPases, gave rise to this family. Although the exact function of these proteins has not been elucidated, studies have revealed that the E. coli EngA homolog, Der, and Neisseria gonorrhoeae EngA are essential for cell viability. A recent report suggests that E. coli Der functions in ribosome assembly and stability.	174
206683	cd01896	DRG	Developmentally Regulated GTP-binding protein (DRG). The developmentally regulated GTP-binding protein (DRG) subfamily is an uncharacterized member of the Obg family, an evolutionary branch of GTPase superfamily proteins. GTPases act as molecular switches regulating diverse cellular processes. DRG2 and DRG1 comprise the DRG subfamily in eukaryotes. In view of their widespread expression in various tissues and high conservation among distantly related species in eukaryotes and archaea, DRG proteins may regulate fundamental cellular processes. It is proposed that the DRG subfamily proteins play their physiological roles through RNA binding.	233
206684	cd01897	NOG	Nucleolar GTP-binding protein (NOG). NOG1 is a nucleolar GTP-binding protein present in eukaryotes ranging from trypanosomes to humans. NOG1 is functionally linked to ribosome biogenesis and found in association with the nuclear pore complexes and identified in many preribosomal complexes. Thus, defects in NOG1 can lead to defects in 60S biogenesis. The S. cerevisiae NOG1 gene is essential for cell viability, and mutations in the predicted G motifs abrogate function. It is a member of the ODN family of GTP-binding proteins that also includes the bacterial Obg and DRG proteins.	167
206685	cd01898	Obg	Obg GTPase. The Obg nucleotide binding protein subfamily has been implicated in stress response, chromosome partitioning, replication initiation, mycelium development, and sporulation. Obg proteins are among a large group of GTP binding proteins conserved from bacteria to humans. The E. coli homolog, ObgE is believed to function in ribosomal biogenesis. Members of the subfamily contain two equally and highly conserved domains, a C-terminal GTP binding domain and an N-terminal glycine-rich domain.	170
206686	cd01899	Ygr210	Ygr210 GTPase. Ygr210 is a member of the Obg-like family and is present in archaea and fungi. They are characterized by a distinct glycine-rich motif immediately following the Walker B motif. The Ygr210 and YyaF/YchF subfamilies appear to form one major branch of the Obg-like family. Among eukaryotes, the Ygr210 subfamily is represented only in fungi. These fungal proteins form a tight cluster with their archaeal orthologs, which suggests the possibility of horizontal transfer from archaea to fungi.	318
206687	cd01900	YchF	YchF GTPase. YchF is a member of the Obg family, which includes four other subfamilies of GTPases: Obg, DRG, Ygr210, and NOG1. Obg is an essential gene that is involved in DNA replication in C. crescentus and Streptomyces griseus and is associated with the ribosome. Several members of the family, including YchF, possess the TGS domain related to the RNA-binding proteins. Experimental data and genomic analysis suggest that YchF may be part of a nucleoprotein complex and may function as a GTP-dependent translational factor.	274
238884	cd01901	Ntn_hydrolase	The Ntn hydrolases (N-terminal nucleophile) are a diverse superfamily of enzymes that are activated autocatalytically via an N-terminally located nucleophilic amino acid.  N-terminal nucleophile (NTN-) hydrolase superfamily, which contains a four-layered alpha, beta, beta, alpha core structure. This family of hydrolases includes penicillin acylase, the 20S proteasome alpha and beta subunits, and glutamate synthase. The mechanism of activation of these proteins is conserved, although they differ in their substrate specificities. All known members catalyze the hydrolysis of amide bonds in either proteins or small molecules, and each one of them is synthesized as a preprotein. For each, an autocatalytic endoproteolytic process generates a new N-terminal residue. This mature N-terminal residue is central to catalysis and acts as both a polarizing base and a nucleophile during the reaction. The N-terminal amino group acts as the proton acceptor and activates either the nucleophilic hydroxyl in a Ser or Thr residue or the nucleophilic thiol in a Cys residue. The position of the N-terminal nucleophile in the active site and the mechanism of catalysis are conserved in this family, despite considerable variation in the protein sequences.	164
238885	cd01902	Ntn_CGH	Choloylglycine hydrolase (CGH) is a bile salt-modifying enzyme that hydrolyzes non-peptide carbon-nitrogen bonds in choloylglycine and choloyltaurine, both of which are present in bile.  CGH is present in a number of probiotic microbial organisms that inhabit the gut.  CGH has an N-terminal nucleophilic cysteine, as do other members of the Ntn hydrolase family to which CGH belongs.	291
238886	cd01903	Ntn_AC_NAAA	AC_NAAA This conserved domain includes two closely related proteins, acid ceramidase (AC, also known as N-acylsphingosine amidohydrolase), and N-acylethanolamine-hydrolyzing acid amidase (NAAA).  AC catalyzes the hydrolysis of ceramide to sphingosine and fatty acid. Ceramide is required for the biosynthesis of most sphingolipids and plays an important role in many signal transduction pathways by inducing apoptosis and/or arresting cell growth. An inherited deficiency of AC activity leads to the lysosomal storage disorder known as Farber disease.  AC is considered a "rheostat" important for maintaining the proper intracellular levels of these lipids since hydrolysis of ceramide is the only source of sphingosine in cells.  NAAA is a eukaryotic glycoprotein that hydrolyzes bioactive N-acylethanolamines, including anandamide (an endocannabinoid) and N-palmitoylethanolamine (an anti-inflammatory and neuroprotective substance), to fatty acids and ethanolamine at acidic pH.  NAAA shows structural and functional similarity to acid ceramidase, but lacks the ceramide-hydrolyzing activity of AC.	231
238887	cd01906	proteasome_protease_HslV	proteasome_protease_HslV. This group contains the eukaryotic proteasome alpha and beta subunits and the prokaryotic protease hslV subunit. Proteasomes are large multimeric self-compartmentalizing proteases, involved in the clearance of misfolded proteins, the breakdown of regulatory proteins, and the processing of proteins such as the preparation of peptides for immune presentation. Two main proteasomal types are distinguished by their different tertiary structures: the eukaryotic/archaeal 20S proteasome and the prokaryotic proteasome-like heat shock protein encoded by heat shock locus V, hslV.  The proteasome core particle is a highly conserved cylindrical structure made up of non-identical subunits that have their active sites on the inner walls of a large central cavity. The proteasome subunits of bacteria, archaea, and eukaryotes all share a conserved Ntn (N terminal nucleophile) hydrolase fold and a catalytic mechanism involving an N-terminal nucleophilic threonine that is exposed by post-translational processing of an inactive propeptide.	182
238888	cd01907	GlxB	Glutamine amidotransferases class-II (Gn-AT)_GlxB-type.  GlxB is a glutamine amidotransferase-like protein of unknown function found in bacteria and archaea. GlxB has a structural fold similar to that of other class II glutamine amidotransferases including glucosamine-fructose 6-phosphate synthase (GLMS or GFAT), glutamine phosphoribosylpyrophosphate (Prpp) amidotransferase (GPATase),  asparagine synthetase B (AsnB), beta lactam synthetase (beta-LS) and glutamate synthase (GltS).   The GlxB fold is also somewhat similar to the Ntn (N-terminal nucleophile) hydrolase fold of the proteasomal alpha and beta subunits.	249
238889	cd01908	YafJ	Glutamine amidotransferases class-II (Gn-AT)_YafJ-type.  YafJ is a glutamine amidotransferase-like protein of unknown function found in prokaryotes, eukaryotes and archaea.  YafJ has a conserved structural fold similar to those of other class II glutamine amidotransferases including glucosamine-fructose 6-phosphate synthase (GLMS or GFAT), glutamine phosphoribosylpyrophosphate (Prpp) amidotransferase (GPATase),  asparagine synthetase B (AsnB), beta lactam synthetase (beta-LS) and glutamate synthase (GltS).  The YafJ fold is also somewhat similar to the Ntn (N-terminal nucleophile) hydrolase fold of the proteasomal alpha and beta subunits.	257
238890	cd01909	betaLS_CarA_N	Glutamine amidotransferases class-II (GATase) asparagine synthase_betaLS-type.  Carbapenam synthetase (CarA) is an ATP/Mg2+-dependent enzyme that catalyzes the formation of the beta-lactam ring in (5R)-carbapenem-3-carboxylic acid biosynthesis.  CarA is homologous to beta-lactam synthetase (beta-LS), which is involved in the biosynthesis of clavulanic acid, a clinically important beta-lactamase inhibitor. CarA and beta-LS each have two distinct domains, an N-terminal Ntn hydrolase domain and a C-terminal synthetase domain, a domain architecture similar to that of the class-B asparagine synthetases (AS-B's). The N-terminal domain of these enzymes hydrolyzes glutamine to glutamate and ammonia. CarA forms a homotetramer while  betaLS forms a heterodimer.   The N-terminal folds of CarA and beta-LS are similar to those of other class II glutamine amidotransferases including glucosamine-fructose 6-phosphate synthase (GLMS or GFAT), glutamine phosphoribosylpyrophosphate (Prpp) amidotransferase (GPATase),  asparagine synthetase B (AsnB), and glutamate synthase (GltS).  This fold is also somewhat similar to the Ntn (N-terminal nucleophile) hydrolase fold of the proteasomal alpha and beta subunits.	199
238891	cd01910	Wali7	This domain is present in Wali7, a protein of unknown function, expressed in wheat and induced by aluminum.  Wali7 has a single domain similar to the glutamine amidotransferase domain of glucosamine-fructose 6-phosphate synthase (GLMS or GFAT), glutamine phosphoribosylpyrophosphate (Prpp) amidotransferase (GPATase),  asparagine synthetase B (AsnB), beta lactam synthetase (beta-LS) and glutamate synthase (GltS).  The Wali7 domain is also somewhat similar to the Ntn hydrolase fold of the proteasomal alpha and beta subunits.	224
238892	cd01911	proteasome_alpha	proteasome alpha subunit. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 different alpha and 10 different beta proteasome subunit genes while archaea have one of each.	209
238893	cd01912	proteasome_beta	proteasome beta subunit. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each.	189
238894	cd01913	protease_HslV	Protease HslV and the ATPase/chaperone HslU are part of an ATP-dependent proteolytic system that is the prokaryotic homolog of the proteasome. HslV is a dimer of hexamers (a dodecamer) that forms a central proteolytic chamber with active sites on the interior walls of the cavity. HslV shares significant sequence and structural similarity with the proteasomal beta-subunit and both are members of the Ntn-family of hydrolases.  HslV has a nucleophilic threonine residue at its N-terminus that is exposed after processing of the propeptide and is directly involved in active site catalysis.	171
238895	cd01914	HCP	Hybrid cluster protein (HCP), formerly known as prismane, is thought to play a role in nitrogen metabolism but its specific function is unknown. HCP has three structural domains, an N-terminal alpha-helical domain, and two similar domains comprising a central beta-sheet flanked by alpha-helices. HCP contains two iron-sulfur clusters, one of which is a [Fe4-S4] cubane cluster similar to that of carbon monoxide dehydrogenase (CODH).  The second cluster, referred to as the hybrid cluster, is a hybrid [Fe4-S2-O2] center located at the interface of the three domains. Although the hybrid cluster is buried within the protein, it is accessible through a large hydrophobic cavity.	423
238896	cd01915	CODH	Carbon monoxide dehydrogenase (CODH) is found in acetogenic and methanogenic organisms and is responsible for the synthesis and breakdown of acetyl-CoA, respectively. CODH has two types of metal clusters, a cubane [Fe4-S4] center (B-cluster) similar to that of hybrid cluster protein (HCP) and a Ni-Fe-S center (C-cluster) where carbon monoxide oxidation occurs.  Bifunctional CODH forms a heterotetramer with acetyl-CoA synthase (ACS) consisting of two CODH and two ACS subunits while monofunctional CODH forms a homodimer. Bifunctional CODH reduces carbon dioxide to carbon monoxide and ACS then synthesizes acetyl-CoA from carbon monoxide, CoA, and a methyl group donated by another protein (CoFeSP), while monofunctional CODH oxidizes carbon monoxide to carbon dioxide. CODH and ACS each have a metal cluster referred to as the C- and A-clusters, respectively.	613
238897	cd01916	ACS_1	Acetyl-CoA synthase (ACS), also known as acetyl-CoA decarbonylase, is found in acetogenic and methanogenic organisms and is responsible for the synthesis and breakdown of acetyl-CoA.  ACS forms a heterotetramer with carbon monoxide dehydrogenase (CODH) consisting of two ACS and two CODH subunits. CODH reduces carbon dioxide to carbon monoxide and ACS then synthesizes acetyl-CoA from carbon monoxide, CoA, and a methyl group donated by another protein (CoFeSP).  ACS has three structural domains, an N-terminal Rossmann fold domain with a helical region at its N-terminus which interacts with CODH, and two alpha + beta fold domains.  A Ni-Fe-S center referred to as the A-cluster is located in the C-terminal domain. A large cavity exists between the three domains which may bind CoA.	731
238898	cd01917	ACS_2	Acetyl-CoA synthase (ACS), also known as acetyl-CoA decarbonylase, is found in acetogenic and methanogenic organisms and is responsible for the synthesis and breakdown of acetyl-CoA.  ACS forms a heterotetramer with carbon monoxide dehydrogenase (CODH) consisting of two ACS and two CODH subunits. CODH reduces carbon dioxide to carbon monoxide and ACS then synthesizes acetyl-CoA from carbon monoxide, CoA, and a methyl group donated by another protein (CoFeSP).  ACS has three structural domains, an N-terminal Rossmann fold domain with a helical region at its N-terminus which interacts with CODH, and two alpha + beta fold domains.  A Ni-Fe-S center referred to as the A-cluster is located in the C-terminal domain. A large cavity exists between the three domains which may bind CoA.	287
238899	cd01918	HprK_C	HprK/P, the bifunctional histidine-containing protein kinase/phosphatase, controls the phosphorylation state of the phosphocarrier protein HPr and regulates the utilization of carbon sources by gram-positive bacteria. It catalyzes both the ATP-dependent phosphorylation of Ser-46 of HPr and its dephosphorylation by phosphorolysis. The latter reaction uses inorganic phosphate as substrate and produces pyrophosphate. Phosphoenolpyruvate carboxykinase (PEPCK) and the C-terminal catalytic domain of HprK/P are structurally similar with conserved active site residues suggesting these two phosphotransferases have related functions.  The HprK/P N-terminal domain is structurally similar to the N-terminal domains of the MurE and MurF amino acid ligases.	149
238900	cd01919	PEPCK	Phosphoenolpyruvate carboxykinase (PEPCK), a critical gluconeogenic enzyme, catalyzes the first committed step in the diversion of tricarboxylic acid cycle intermediates toward gluconeogenesis. It catalyzes the reversible decarboxylation and phosphorylation of oxaloacetate to yield phosphoenolpyruvate and carbon dioxide, using a nucleotide molecule (ATP  or GTP) for the phosphoryl transfer, and has a strict requirement for divalent metal ions for activity.  PEPCK's separate into two phylogenetic groups based on their nucleotide substrate specificity (the ATP-, and GTP-dependent groups).	515
238901	cd01920	cyclophilin_EcCYP_like	cyclophilin_EcCYP_like: cyclophilin-type A-like peptidylprolyl cis- trans isomerase (PPIase) domain similar to the cytosolic E. coli cyclophilin A and Streptomyces antibioticus SanCyp18. Compared to the archetypal cyclophilin Human cyclophilin A, these have reduced affinity for cyclosporin A.  E. coli cyclophilin A has a similar peptidylprolyl cis- trans isomerase activity to the human cyclophilin A. Most members of this subfamily contain a phenylalanine residue at the position equivalent to Human cyclophilin W121, where a tryptophan has been shown to be important for cyclophilin binding.	155
238902	cd01921	cyclophilin_RRM	cyclophilin_RRM: cyclophilin-type peptidylprolyl cis- trans isomerase domain occurring with a C-terminal RNA recognition motif domain (RRM). This subfamily of the cyclophilin domain family contains a number of eukaryotic cyclophilins having the RRM domain including the nuclear proteins: human hCyP-57, Arabidopsis thaliana AtCYP59, Caenorhabditis elegans CeCyP-44 and Paramecium tetraurelia Kin241. The Kin241 protein has been shown to have a role in cell morphogenesis.	166
238903	cd01922	cyclophilin_SpCYP2_like	cyclophilin_SpCYP2_like: cyclophilin 2-like peptidylprolyl cis- trans isomerase (PPIase) domain similar to Schizosaccharomyces pombe cyp-2. These proteins bind their respective SNW chromatin binding protein in autologous systems, in a CsA independent manner indicating interaction with a surface outside the PPIase active site. SNW proteins play a basic and broad range role in signaling.	146
238904	cd01923	cyclophilin_RING	cyclophilin_RING: cyclophilin-type peptidylprolyl cis- trans isomerases (cyclophilins) having a modified RING finger domain. This group includes the nuclear proteins, Human hCyP-60 and Caenorhabditis elegans MOG-6 which, compared to the archetypal cyclophilin Human cyclophilin A exhibit reduced peptidylprolyl cis- trans isomerase activity and lack a residue important for cyclophilin binding. Human hCyP-60 has been shown to physically interact with the proteinase inhibitor peptide eglin c and; C. elegans MOG-6 to physically interact with MEP-1, a nuclear zinc finger protein. MOG-6 has been shown to function in germline sex determination.	159
238905	cd01924	cyclophilin_TLP40_like	cyclophilin_TLP40_like: cyclophilin-type peptidylprolyl cis- trans isomerases (cyclophilins) similar to the Spinach thylakoid lumen protein TLP40.  Compared to the archetypal cyclophilin Human cyclophilin A, these proteins have similar peptidylprolyl cis- trans isomerase activity and reduced affinity for cyclosporin A. Spinach TLP40 has been shown to have a dual function as a folding catalyst and regulator of dephosphorylation.	176
238906	cd01925	cyclophilin_CeCYP16-like	cyclophilin_CeCYP16-like: cyclophilin-type peptidylprolyl cis- trans isomerase) (PPIase) domain similar to Caenorhabditis elegans cyclophilin 16. C. elegans CeCYP-16, compared to the archetypal cyclophilin Human cyclophilin A has, a reduced peptidylprolyl cis- trans isomerase activity, is cyclosporin insensitive and shows an altered substrate preference favoring, hydrophobic, acidic or amide amino acids. Most members of this subfamily have a glutamate residue in the active site at the position equivalent to a tryptophan (W121 in Human cyclophilin A), which has been shown to be important for cyclophilin binding.	171
238907	cd01926	cyclophilin_ABH_like	cyclophilin_ABH_like: Cyclophilin  A, B and H-like cyclophilin-type peptidylprolyl cis- trans isomerase (PPIase) domain. This family represents the archetypal cytosolic cyclophilin similar to human cyclophilins A, B and H. PPIase is an enzyme which accelerates protein folding by catalyzing the cis-trans isomerization of the peptide bonds preceding proline residues. These enzymes have been implicated in protein folding processes which depend on catalytic /chaperone-like activities. As cyclophilins, Human hCyP-A, human cyclophilin-B (hCyP-19), S. cerevisiae Cpr1 and C. elegans Cyp-3, are inhibited by the immunosuppressive drug cyclosporin A (CsA). CsA binds to the PPIase active site. Cyp-3. S. cerevisiae Cpr1 interacts with the Rpd3 - Sin3 complex and in addition is a component of the Set3 complex. S. cerevisiae Cpr1 has also been shown to have a role in Zpr1p nuclear transport. Human cyclophilin H associates with the [U4/U6.U5] tri-snRNP particles of the spliceosome.	164
238908	cd01927	cyclophilin_WD40	cyclophilin_WD40: cyclophilin-type peptidylprolyl cis- trans isomerases (cyclophilins) having a WD40 domain. This group consists of several hypothetical and putative eukaryotic and bacterial proteins which have a cyclophilin domain and a WD40 domain. Function of the protein is not known.	148
238909	cd01928	Cyclophilin_PPIL3_like	Cyclophilin_PPIL3_like. Proteins similar to Human cyclophilin-like peptidylprolyl cis- trans isomerase (PPIL3). Members of this family lack a key residue important for cyclosporin binding: the tryptophan residue corresponding to W121 in human hCyP-18a; most members have a histidine at this position. The exact function of the protein is not known.	153
238910	cd01935	Ntn_CGH_like	Choloylglycine hydrolase (CGH)_like. This family of choloylglycine hydrolase-like proteins includes conjugated bile acid hydrolase (CBAH), penicillin V acylase (PVA), acid ceramidase (AC), and N-acylethanolamine-hydrolyzing acid amidase (NAAA) which cleave non-peptide carbon-nitrogen bonds in bile salt constituents.  These enzymes have an N-terminal nucleophilic cysteine, as do other members of the Ntn hydrolase family to which they belong.  This nucleophilic cysteine is exposed by post-translational processing of the precursor protein.	229
238911	cd01936	Ntn_CA	Cephalosporin acylase (CA) belongs to a family of beta-lactam acylases that includes penicillin G acylase (PGA) and aculeacin A acylase. PGA and CA are crucial for the production of backbone chemicals like 6-aminopenicillanic acid and 7-aminocephalosporanic acid (7-ACA), which can be used to synthesize semi-synthetic penicillins and cephalosporins, respectively.  While both PGA and CA have a conserved Ntn (N-terminal nucleophile) hydrolase fold and the structural similarity at their active sites is very high, their sequence similarity to other Ntn's is low.	469
238912	cd01937	ribokinase_group_D	Ribokinase-like subgroup D.  Found in bacteria and archaea, this subgroup is part of the ribokinase/pfkB superfamily.  Its oligomerization state is unknown at this time.	254
238913	cd01938	ADPGK_ADPPFK	ADP-dependent glucokinase (ADPGK) and phosphofructokinase (ADPPFK). ADPGK and ADPPFK are proteins that rely on ADP rather than ATP to donate a phosphoryl group.  They are found in certain hyperthermophilic archaea and in higher eukaryotes.  A functional ADPGK has been characterized in mouse and is assumed to be desirable during ischemia/hypoxia.  ADPGK and ADPPFK contain a large and a small domain with the binding site located in a groove between the domains. Partial domain closing is seen when ADP is bound, and further domain closing is observed when glucose is also bound.  The oligomerization state apparently varies depending on the species, with some existing as monomers, some as dimers, and some as tetramers.	445
238914	cd01939	Ketohexokinase	Ketohexokinase (fructokinase, KHK) catalyzes the phosphorylation of fructose to fructose-1-phosphate (F1P), the first step in the metabolism of dietary fructose.  KHK can also phosphorylate several other furanose sugars.  It is found in higher eukaryotes where it is believed to function as a dimer and requires K(+) and ATP to be active.  In humans, hepatic KHK deficiency causes fructosuria, a benign inborn error of metabolism.	290
238915	cd01940	Fructoselysine_kinase_like	Fructoselysine kinase-like.  Fructoselysine is a fructoseamine formed by glycation, a non-enzymatic reaction of glucose with a primary amine followed by an Amadori rearrangement, resulting in a protein that is modified at the amino terminus and at the lysine side chains. Fructoseamines are typically metabolized by fructoseamine-3-kinase, especially in higher eukaryotes. In E. coli, fructoselysine kinase has been shown in vitro to catalyze the phosphorylation of fructoselysine. It is proposed that fructoselysine is released from glycated proteins during human digestion and is partly metabolized by bacteria in the hind gut using a protein such as fructoselysine kinase.  This family is found only in bacterial sequences, and its oligomeric state is currently unknown.	264
238916	cd01941	YeiC_kinase_like	YeiC-like sugar kinase.  Found in eukaryotes and bacteria, YeiC-like kinase is part of the ribokinase/pfkB sugar kinase superfamily. Its oligomerization state is unknown at this time.	288
238917	cd01942	ribokinase_group_A	Ribokinase-like subgroup A.  Found in bacteria and archaea, this subgroup is part of the ribokinase/pfkB superfamily.  Its oligomerization state is unknown at this time.	279
238918	cd01943	MAK32	MAK32 kinase.  MAK32 is a protein found primarily in fungi that is necessary for the structural stability of L-A particles.  The L-A virus particle is a specialized compartment for the transcription and replication of double-stranded RNA, known to infect yeast and other fungi.  MAK32 is part of the host machinery used by the virus to multiply.	328
238919	cd01944	YegV_kinase_like	YegV-like sugar kinase.  Found only in bacteria, YegV-like kinase is part of the ribokinase/pfkB sugar kinase superfamily. Its oligomerization state is unknown at this time.	289
238920	cd01945	ribokinase_group_B	Ribokinase-like subgroup B.  Found in bacteria and plants, this subgroup is part of the ribokinase/pfkB superfamily.  Its oligomerization state is unknown at this time.	284
238921	cd01946	ribokinase_group_C	Ribokinase-like subgroup C.  Found only in bacteria, this subgroup is part of the ribokinase/pfkB superfamily.  Its oligomerization state is unknown at this time.	277
238922	cd01947	Guanosine_kinase_like	Guanosine kinase-like sugar kinases.  Found in bacteria and archaea, the guanosine kinase-like group is part of the ribokinase/pfkB sugar kinase superfamily. Its oligomerization state is unknown at this time.	265
238923	cd01948	EAL	EAL domain. This domain is found in diverse bacterial signaling proteins. It is called EAL after its conserved residues and is also known as domain of unknown function 2 (DUF2).  The EAL domain has been shown to stimulate degradation of a second messenger, cyclic di-GMP, and is a good candidate for a diguanylate phosphodiesterase function. Together with the GGDEF domain, EAL might be involved in regulating cell surface adhesiveness in bacteria.	240
143635	cd01949	GGDEF	Diguanylate-cyclase (DGC) or GGDEF domain. Diguanylate-cyclase (DGC) or GGDEF domain: Originally named after a conserved residue pattern, and initially described as a domain of unknown function 1 (DUF1). This domain is widely present in bacteria, linked to a wide range of non-homologous domains in a variety of cell signaling proteins. The domain shows homology to the adenylyl cyclase catalytic domain. This correlates with the functional information available on two GGDEF-containing proteins, namely diguanylate cyclase and phosphodiesterase A of Acetobacter xylinum, both of which regulate the turnover of cyclic diguanosine monophosphate. Together with the EAL domain, GGDEF might be involved in regulating cell surface adhesion in bacteria.	158
173886	cd01951	lectin_L-type	legume lectins. The L-type (legume-type) lectins are a highly diverse family of carbohydrate binding proteins that generally display no enzymatic activity toward the sugars they bind.  This family includes arcelin, concanavalinA, the lectin-like receptor kinases, the ERGIC-53/VIP36/EMP46 type1 transmembrane proteins, and an alpha-amylase inhibitor.  L-type lectins have a dome-shaped beta-barrel carbohydrate recognition domain with a curved seven-stranded beta-sheet referred to as the "front face" and a flat six-stranded beta-sheet referred to as the "back face".  This domain homodimerizes so that adjacent back sheets form a contiguous 12-stranded sheet and homotetramers occur by a back-to-back association of these homodimers.  Though L-type lectins exhibit both sequence and structural similarity to one another, their carbohydrate binding specificities differ widely.	223
238924	cd01958	HPS_like	HPS_like: Hydrophobic Protein from Soybean (HPS)-like subfamily; composed of proteins with similarity to HPS, a small hydrophobic protein with unknown function related to cereal-type alpha-amylase inhibitors and lipid transfer proteins. In addition to HPS, members of this subfamily include a hybrid proline-rich protein (HyPRP) from maize, a dark-inducible protein (LeDI-2) from Lithospermum erythrorhizon, maize ZRP3 protein, and rice RcC3 protein. HyPRP is an embryo-specific protein that contains an N-terminal proline-rich domain and a C-terminal HPS-like cysteine-rich domain. It has been suggested that HyPRP may be involved in the stability and defense of the developing embryo. LeDI-2 is a root-specific protein that may be involved in regulating the biosynthesis of shikonin derivatives in L. erythrorhizon. Maize ZRP3 and rice RcC3 are root-specific proteins whose functions are yet to be determined. It has been reported that ZRP3 largely accumulates in a distinct subset of cortical cells.	85
238925	cd01959	nsLTP2	nsLTP2: Non-specific lipid-transfer protein type 2 (nsLTP2) subfamily; Plant nsLTPs are small, soluble proteins that facilitate the transfer of fatty acids, phospholipids, glycolipids, and steroids between membranes. In addition to lipid transport and assembly, nsLTPs also play a key role in the defense of plants against pathogens. There are two closely-related types of nsLTPs, types 1 and 2, which differ in protein sequence, molecular weight, and biological properties. nsLTPs contain an internal hydrophobic cavity, which serves as the binding site for lipids. nsLTP2 can bind lipids and sterols. Structure studies of rice nsLTPs show that the plasticity of the hydrophobic cavity is an important factor in ligand binding. The flexibility of the nsLTP2 cavity allows its binding to rigid sterol molecules, whereas nsLTP1 cannot bind sterols despite its larger cavity size. The resulting nsLTP2/sterol complexes may bind to receptors that trigger defense responses. nsLTP2 gene expression has been observed in barley and rice developing seeds, during Zinnia elegans cell differentiation, and under abiotic stress conditions in barley roots. The nsLTP2 of Brassica rapa has also been identified as a potent allergen.	66
238926	cd01960	nsLTP1	nsLTP1: Non-specific lipid-transfer protein type 1 (nsLTP1) subfamily; Plant nsLTPs are small, soluble proteins that facilitate the transfer of fatty acids, phospholipids, glycolipids, and steroids between membranes. In addition to lipid transport and assembly, nsLTPs also play a key role in the defense of plants against pathogens. There are two closely-related types of nsLTPs, types 1 and 2, which differ in protein sequence, molecular weight, and biological properties. nsLTPs contain an internal hydrophobic cavity, which serves as the binding site for lipids. The hydrophobic cavity accommodates various fatty acid ligands containing from ten to 18 carbon atoms. In general, the cavity is larger in nsLTP1 than in nsLTP2. nsLTP1 proteins are located in extracellular layers and in vacuolar structures. They may be involved in the formation of cutin layers on plant surfaces by transporting cutin monomers. Many nsLTP1 proteins have been characterized as allergens in humans.	89
238927	cd01965	Nitrogenase_MoFe_beta_like	Nitrogenase_MoFe_beta_like: Nitrogenase MoFe protein, beta subunit_like. The nitrogenase enzyme catalyzes the ATP-dependent reduction of dinitrogen (N2) to ammonia.  This group contains the beta subunits of component 1 of the three known genetically distinct types of nitrogenase systems: a molybdenum-dependent  nitrogenase (Mo-nitrogenase), a vanadium-dependent nitrogenase (V-nitrogenase), and an iron-only nitrogenase (Fe-nitrogenase). These nitrogenase systems consist of component 1 (MoFe protein, VFe protein or, FeFe protein respectively) and, component 2 (Fe protein). The most widespread and best characterized of these systems is the Mo-nitrogenase. MoFe is an alpha2beta2 tetramer, the alternative nitrogenases are alpha2beta2delta2 hexamers having  alpha and beta subunits similar to the alpha and beta subunits of MoFe. For MoFe, each alphabeta pair contains one P-cluster (at the alphabeta interface) and, one molecule of iron molybdenum cofactor (FeMoco) contained within the alpha subunit. The Fe protein contains, a single [4Fe-4S] cluster from which electrons are transferred  to the P-cluster of the MoFe and in turn, to FeMoCo, the site of substrate reduction. The V-nitrogenase requires an iron-vanadium cofactor (FeVco), the iron only-nitrogenase an iron only cofactor (FeFeco). These cofactors are analogous to the FeMoco. The V-nitrogenase has P clusters identical to those of  MoFe. In addition to N2, nitrogenase also catalyzes the reduction of a variety of other substrates such as acetylene  The V-nitrogenase differs from the Mo-nitrogenase in that it produces free hydrazine, as a minor product during N2-reduction and, ethane as a minor product during acetylene reduction	428
238928	cd01966	Nitrogenase_NifN_1	Nitrogenase_nifN1: A subgroup of the NifN subunit of the NifEN complex: NifN forms an alpha2beta2 tetramer with NifE.  NifN and nifE are structurally homologous to nitrogenase MoFe protein beta and alpha subunits respectively.  NifEN participates in the synthesis of the iron-molybdenum cofactor (FeMoco) of the MoFe protein.  NifB-co (an iron and sulfur containing precursor of the FeMoco) from NifB is transferred to the NifEN complex where it is further processed to FeMoco. The nifEN bound precursor of FeMoco has been identified as a molybdenum-free, iron- and sulfur- containing analog of FeMoco. It has been suggested that this nifEN bound precursor also acts as a cofactor precursor in nitrogenase systems which require a cofactor other than FeMoco: i.e. iron-vanadium cofactor (FeVco) or iron only cofactor (FeFeco).	417
238929	cd01967	Nitrogenase_MoFe_alpha_like	Nitrogenase_MoFe_alpha_like: Nitrogenase MoFe protein, alpha subunit_like. The nitrogenase enzyme catalyzes the ATP-dependent reduction of dinitrogen to ammonia.  Three genetically distinct types of nitrogenase systems are known to exist: a molybdenum-dependent  nitrogenase (Mo-nitrogenase), a vanadium dependent nitrogenase (V-nitrogenase), and an iron-only nitrogenase (Fe-nitrogenase). These nitrogenase systems consist of component 1 (MoFe protein, VFe protein or, FeFe protein respectively) and, component 2 (Fe protein). This group contains the alpha subunit of component 1 of all three different forms. The most widespread and best characterized of these systems is the Mo-nitrogenase. MoFe is an alpha2beta2 tetramer, the alternative nitrogenases are alpha2beta2delta2 hexamers having  alpha and beta subunits similar to the alpha and beta subunits of MoFe.  The role of the delta subunit is unknown. For MoFe, each alphabeta pair of subunits contains one P-cluster (located at the alphabeta interface) and, one molecule of iron molybdenum cofactor (FeMoco) contained within the alpha subunit. The Fe protein is a homodimer which contains, a single [4Fe-4S] cluster from which electrons are transferred  to the P-cluster of the MoFe and in turn, to FeMoCo the site of substrate reduction. The V-nitrogenase requires an iron-vanadium cofactor (FeVco), the iron only-nitrogenase an iron only cofactor (FeFeco). These cofactors are analogous to the FeMoco. The V-nitrogenase has P clusters identical to those of  MoFe. In addition to N2, nitrogenase also catalyzes the reduction of a variety of other substrates such as acetylene  The V-nitrogenase differs from the Mo- nitrogenase in that it produces free hydrazine, as a minor product during  dinitrogen reduction and, ethane as a minor product during acetylene reduction.	406
238930	cd01968	Nitrogenase_NifE_I	Nitrogenase_NifE_I: a subgroup of the NifE subunit of the NifEN complex: NifE forms an alpha2beta2 tetramer with NifN.  NifE and NifN are structurally homologous to nitrogenase MoFe protein alpha and beta subunits respectively.  NifEN participates in the synthesis of the iron-molybdenum cofactor (FeMoco) of the MoFe protein.  NifB-co (an iron and sulfur containing precursor of the FeMoco) from NifB is transferred to the NifEN complex where it is further processed to FeMoco. The NifEN bound precursor of FeMoco has been identified as a molybdenum-free, iron- and sulfur- containing analog of FeMoco. It has been suggested that this NifEN bound precursor also acts as a cofactor precursor in nitrogenase systems which require a cofactor other than FeMoco: i.e. iron-vanadium cofactor (FeVco) or iron only cofactor (FeFeco).	410
238931	cd01971	Nitrogenase_VnfN_like	Nitrogenase_vnfN_like: VnfN subunit of the VnfEN complex-like.  This group in addition to VnfN contains a subset of the beta subunit of the nitrogenase MoFe protein and NifN-like proteins. The nitrogenase enzyme system catalyzes the ATP-dependent reduction of dinitrogen to ammonia.  NifEN participates in the synthesis of the iron-molybdenum cofactor (FeMoco) of MoFe protein of the molybdenum(Mo)-nitrogenase.  NifB-co (an iron and sulfur containing precursor of the FeMoco) from NifB is transferred to NifEN where it is further processed to FeMoco. VnfEN  may similarly be a scaffolding protein for the iron-vanadium cofactor (FeVco) of  the vanadium-dependent (V)-nitrogenase.  NifE and NifN are essential for the Mo-nitrogenase, VnfE and VnfN are not essential for the V-nitrogenase. NifE and NifN can substitute when the vnfEN genes are inactivated.	427
238932	cd01972	Nitrogenase_VnfE_like	Nitrogenase_VnfE_like: VnfE subunit of the VnfEN complex_like. This group in addition to VnfE contains a subset of the alpha subunit of the nitrogenase MoFe protein and NifE-like proteins.  The nitrogenase enzyme system catalyzes the ATP-dependent reduction of dinitrogen to ammonia.  NifEN participates in the synthesis of the iron-molybdenum cofactor (FeMoco) of MoFe protein of the molybdenum(Mo)-nitrogenase.  NifB-co (an iron and sulfur containing precursor of the FeMoco) from NifB is transferred to NifEN where it is further processed to FeMoco. VnfEN  may similarly be a scaffolding protein for the iron-vanadium cofactor (FeVco) of  the vanadium-dependent (V)-nitrogenase.  NifE and NifN are essential for the Mo-nitrogenase, VnfE and VnfN are not essential for the V-nitrogenase. NifE and NifN can substitute when the vnfEN genes are inactivated.	426
238933	cd01973	Nitrogenase_VFe_beta_like	Nitrogenase_VFe_beta -like: Nitrogenase VFe protein, beta subunit like. This group contains proteins similar to the beta subunits of  the VFe protein of the vanadium-dependent (V-) nitrogenase.  Nitrogenase catalyzes the ATP-dependent reduction of dinitrogen (N2) to ammonia. In addition to V-nitrogenase there is a molybdenum (Mo)-dependent nitrogenase and an iron only (Fe-) nitrogenase.  The Mo-nitrogenase is the most widespread and best characterized of these systems.  These systems consist of component 1 (VFe protein, FeFe protein or, MoFe protein  respectively) and, component 2 (Fe protein). MoFe is an alpha2beta2 tetramer, V-and Fe- nitrogenases are alpha2beta2delta2 hexamers. The alpha and beta subunits of VFe and FeFe are similar to the alpha and beta subunits of MoFe. For MoFe each alphabeta pair contains one P-cluster (at the alphabeta interface) and, one molecule of iron molybdenum cofactor (FeMoco) contained within the alpha subunit. The Fe protein which has a practically identical structure in all three systems, it contains a single [4Fe-4S] cluster.  Electrons are transferred from the [4Fe-4S] cluster of the Fe protein to the P-cluster of the MoFe and in turn to FeMoCo, the site of substrate reduction.  The V-nitrogenase requires an iron-vanadium cofactor (FeVco), the iron only-nitrogenase an iron only cofactor (FeFeco). These cofactors are analogous to the FeMoco. The V-nitrogenase has P clusters identical to those of  MoFe. In addition to N2, nitrogenase also catalyzes the reduction of a variety of other substrates such as acetylene  The V-nitrogenase differs from the Mo-nitrogenase in that it produces free hydrazine, as a minor product during  dinitrogen reduction and, ethane as a minor product during acetylene reduction.	454
238934	cd01974	Nitrogenase_MoFe_beta	Nitrogenase_MoFe_beta: Nitrogenase MoFe protein, beta subunit. The nitrogenase enzyme catalyzes the ATP-dependent reduction of dinitrogen to ammonia. The Molybdenum (Mo-) nitrogenase is the most widespread and best characterized of these systems.  Mo-nitrogenase consists of the MoFe protein (component 1) and the Fe protein (component 2).  MoFe is an alpha2beta2 tetramer. This group contains the beta subunit of the MoFe protein. Each alphabeta pair of MoFe contains one P-cluster (at the alphabeta interface) and, one molecule of iron molybdenum cofactor (FeMoco) contained within the alpha subunit. The Fe protein contains a single [4Fe-4S] cluster.  Electrons are transferred from the [4Fe-4S] cluster of the Fe protein to the P-cluster of the MoFe and in turn to FeMoCo, the site of substrate reduction.	435
238935	cd01976	Nitrogenase_MoFe_alpha	Nitrogenase_MoFe_alpha_II: Nitrogenase MoFe protein, beta subunit. A group of proteins similar to the alpha subunit of the MoFe protein of the molybdenum (Mo-) nitrogenase. The nitrogenase enzyme catalyzes the ATP-dependent reduction of dinitrogen to ammonia. The Mo-nitrogenase is the most widespread and best characterized of these systems.  Mo-nitrogenase consists of the MoFe protein (component 1) and the Fe protein (component 2).  MoFe is an alpha2beta2 tetramer. Each alphabeta pair of MoFe contains one P-cluster (at the alphabeta interface) and, one molecule of iron molybdenum cofactor (FeMoco) contained within the alpha subunit. The Fe protein contains a single [4Fe-4S] cluster.  Electrons are transferred from the [4Fe-4S] cluster of the Fe protein to the P-cluster of the MoFe and in turn to FeMoCo, the site of substrate reduction.	421
238936	cd01977	Nitrogenase_VFe_alpha	Nitrogenase_VFe_alpha -like: Nitrogenase VFe protein, alpha subunit like. This group contains proteins similar to the alpha subunits of,  the VFe protein of the vanadium-dependent (V-) nitrogenase and the FeFe protein of the iron only (Fe-) nitrogenase Nitrogenase catalyzes the ATP-dependent reduction of dinitrogen (N2) to ammonia. In addition to V- and Fe- nitrogenases there is a molybdenum (Mo)-dependent nitrogenase which is the most widespread and best characterized of these systems.  These systems consist of component 1 (VFe protein, FeFe protein or, MoFe protein  respectively) and, component 2 (Fe protein). MoFe is an alpha2beta2 tetramer, V-and Fe- nitrogenases are alpha2beta2delta2 hexamers. The alpha and beta subunits of VFe and FeFe are similar to the alpha and beta subunits of MoFe. For MoFe each alphabeta pair contains one P-cluster (at the alphabeta interface) and, one molecule of iron molybdenum cofactor (FeMoco) contained within the alpha subunit. The Fe protein which has a practically identical structure in all three systems, it contains a single [4Fe-4S] cluster.  Electrons are transferred from the [4Fe-4S] cluster of the Fe protein to the P-cluster of the MoFe and in turn to FeMoCo, the site of substrate reduction.  The V-nitrogenase requires an iron-vanadium cofactor (FeVco), the iron only-nitrogenase an iron only cofactor (FeFeco). These cofactors are analogous to the FeMoco. The V-nitrogenase has P clusters identical to those of  MoFe. In addition to N2, nitrogenase also catalyzes the reduction of a variety of other substrates such as acetylene  The V-nitrogenase differs from the Mo-nitrogenase in that it produces free hydrazine, as a minor product during  dinitrogen reduction and, ethane as a minor product during acetylene reduction.	415
238937	cd01979	Pchlide_reductase_N	Pchlide_reductase_N: N protein of the NB protein complex of Protochlorophyllide (Pchlide)_reductase. Pchlide reductase catalyzes the reductive formation of chlorophyllide (chlide) from protochlorophyllide (pchlide) during biosynthesis of chlorophylls and bacteriochlorophylls. This group contains both the light-independent Pchlide reductase (DPOR) and light-dependent Pchlide reductase (LPOR).  Angiosperms contain only LPOR, cyanobacteria, algae and gymnosperms contain both DPOR and LPOR, primitive anoxygenic photosynthetic bacteria contain only DPOR. NB is structurally similar to the FeMo protein of nitrogenase, forming an N2B2 heterotetramer. N and B are homologous to the FeMo alpha and beta subunits respectively. Also in common with nitrogenase in vitro DPOR activity requires ATP hydrolysis and dithionite or ferredoxin as electron donor. The NB protein complex may serve as a catalytic site for Pchlide reduction similar to MoFe for nitrogen reduction.	396
238938	cd01980	Chlide_reductase_Y	Chlide_reductase_Y: Y subunit of chlorophyllide (chlide) reductase (BchY). Chlide reductase participates in photosynthetic pigment synthesis, playing a role in the conversion of chlorophylls (Chl) into bacteriochlorophylls (BChl). Chlide reductase catalyzes the reduction of the B-ring of the tetrapyrrole. Chlide reductase is a three subunit enzyme (subunits are designated BchX, BchY and BchZ). The similarity between these three subunits and the subunits for nitrogenase suggests that BchX serves as an electron donor for the BchY-BchZ catalytic subunits.	416
238939	cd01981	Pchlide_reductase_B	Pchlide_reductase_B: B protein of the NB protein complex of Protochlorophyllide (Pchlide)_reductase. Pchlide reductase catalyzes the reductive formation of chlorophyllide (chlide) from protochlorophyllide (pchlide) during biosynthesis of chlorophylls and bacteriochlorophylls. This group contains both the light-independent Pchlide reductase (DPOR) and light-dependent Pchlide reductase (LPOR). Angiosperms contain only LPOR; cyanobacteria, algae and gymnosperms contain both DPOR and LPOR; primitive anoxygenic photosynthetic bacteria contain only DPOR. NB is structurally similar to the MoFe protein of nitrogenase, forming an N2B2 heterotetramer. N and B are homologous to the MoFe alpha and beta subunits, respectively. Also in common with nitrogenase, in vitro DPOR activity requires ATP hydrolysis and dithionite or ferredoxin as electron donor. The NB protein complex may serve as a catalytic site for Pchlide reduction similar to MoFe for nitrogen reduction.	430
238940	cd01982	Chlide_reductase_Z	Chlide_reductase_Z: Z subunit of chlorophyllide (chlide) reductase (BchZ). Chlide reductase participates in photosynthetic pigment synthesis, playing a role in the conversion of chlorophylls (Chl) into bacteriochlorophylls (BChl). Chlide reductase catalyzes the reduction of the B-ring of the tetrapyrrole. Chlide reductase is a three subunit enzyme (subunits are designated BchX, BchY and BchZ). The similarity between these three subunits and the subunits for nitrogenase suggests that BchX serves as an electron donor for the BchY-BchZ catalytic subunits.	412
349751	cd01983	SIMIBI	SIMIBI (signal recognition particle, MinD and BioD)-class NTPases. SIMIBI (after signal recognition particle, MinD, and BioD) consists of signal recognition particle (SRP) GTPases, the assemblage of MinD-like ATPases, which are involved in protein localization, chromosome partitioning, and membrane transport, and a group of metabolic enzymes with kinase or related phosphate transferase activity. Functionally, proteins in this superfamily use the energy from NTP hydrolysis to transfer electrons or ions.	107
238942	cd01984	AANH_like	Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases, Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to adenosine nucleotide.	86
238943	cd01985	ETF	The electron transfer flavoprotein (ETF) serves as a specific electron acceptor for various mitochondrial dehydrogenases. ETF transfers electrons to the main respiratory chain via ETF-ubiquinone oxidoreductase. ETF is a heterodimer that consists of an alpha and a beta subunit which binds one molecule of FAD per dimer. A similar system also exists in some bacteria. The homologous pair of proteins (FixA/FixB) are essential for nitrogen fixation. The alpha subunit of ETF is structurally related to the bacterial nitrogen fixation protein fixB which could play a role in a redox process and feed electrons to ferredoxin. The beta subunit protein is distantly related to and forms a heterodimer with the alpha subunit.	181
238944	cd01986	Alpha_ANH_like	Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases and ATP sulphurylases. The domain forms an alpha/beta/alpha fold which binds to the adenosine group.	103
238945	cd01987	USP_OKCHK	USP domain is located between the N-terminal sensor domain and C-terminal catalytic domain of this Osmosensitive K+ channel histidine kinase family. The family of KdpD sensor kinase proteins regulates the kdpFABC operon responsible for potassium transport. The USP domain is homologous to the universal stress protein Usp. Usp is a small cytoplasmic bacterial protein whose expression is enhanced when the cell is exposed to stress agents. Usp enhances the rate of cell survival during prolonged exposure to such conditions, and may provide a general "stress endurance" activity.	124
238946	cd01988	Na_H_Antiporter_C	The C-terminal domain of a subfamily of Na+/H+ antiporters found in bacteria and archaea. Na+/H+ exchange proteins eject protons from cells, effectively eliminating excess acid from actively metabolising cells. Na+/H+ exchange activity is also crucial for the regulation of cell volume, and for the reabsorption of NaCl across renal, intestinal, and other epithelia. These antiports exchange Na+ for H+ in an electroneutral manner, and this activity is carried out by a family of Na+/H+ exchangers, or NHEs, which are known to be present in both prokaryotic and eukaryotic cells. These exchangers are highly-regulated (glyco)phosphoproteins, which, based on their primary structure, appear to contain 10-12 membrane-spanning regions (M) at the N-terminus and a large cytoplasmic region at the C-terminus. The transmembrane regions M3-M12 share identity with other members of the family. The M6 and M7 regions are highly conserved. Thus, this is thought to be the region that is involved in the transport of sodium and hydrogen ions. The cytoplasmic region or C-terminal has homology with the universal stress protein (Usp) family. Usp is a small cytoplasmic bacterial protein whose expression is enhanced when the cell is exposed to stress agents. Usp enhances the rate of cell survival during prolonged exposure to such conditions, and may provide a general "stress endurance" activity.	132
238947	cd01989	STK_N	The N-terminal domain of Eukaryotic Serine Threonine kinases. The Serine Threonine kinases are enzymes that belong to a very extensive family of proteins which share a conserved catalytic core common with both serine/threonine and tyrosine protein kinases. The N-terminal domain is homologous to the USP family which has an ATP binding fold. The N-terminal domain is predicted to be involved in ATP binding.	146
238948	cd01990	Alpha_ANH_like_I	This is a subfamily of Adenine nucleotide alpha hydrolases superfamily. Adenine nucleotide alpha hydrolases superfamily includes N type ATP PPases and ATP sulphurylases. It forms an alpha/beta/alpha fold which binds to the adenosine group. This subfamily of proteins probably binds ATP. This domain is about 200 amino acids long with a strongly conserved motif SGGKD at the N terminus.	202
238949	cd01991	Asn_Synthase_B_C	The C-terminal domain of Asparagine Synthase B. This domain is always found associated with an N-terminal amidotransferase domain. Family members that contain this domain catalyse the conversion of aspartate to asparagine. Asparagine synthetase B catalyzes the assembly of asparagine from aspartate, Mg(2+)ATP, and glutamine. The three-dimensional architecture of the N-terminal domain of asparagine synthetase B is similar to that observed for glutamine phosphoribosylpyrophosphate amidotransferase while the molecular motif of the C-domain is reminiscent of that observed for GMP synthetase.	269
238950	cd01992	PP-ATPase	N-terminal domain of predicted ATPase of the PP-loop family implicated in cell cycle control [Cell division and chromosome partitioning]. This is a subfamily of Adenine nucleotide alpha hydrolases superfamily. Adenine nucleotide alpha hydrolases superfamily includes N type ATP PPases and ATP sulphurylases. It forms an alpha/beta/alpha fold which binds to the adenosine group. This domain has a strongly conserved motif SGGXD at the N terminus.	185
238951	cd01993	Alpha_ANH_like_II	This is a subfamily of Adenine nucleotide alpha hydrolases superfamily. Adenine nucleotide alpha hydrolases superfamily includes N type ATP PPases and ATP sulphurylases. It forms an alpha/beta/alpha fold which binds to the adenosine group. This subfamily of proteins is predicted to bind ATP. This domain has a strongly conserved motif SGGKD at the N terminus.	185
238952	cd01994	Alpha_ANH_like_IV	This is a subfamily of Adenine nucleotide alpha hydrolases superfamily. Adenine nucleotide alpha hydrolases superfamily includes N type ATP PPases and ATP sulphurylases. It forms an alpha/beta/alpha fold which binds to the adenosine group. This subfamily of proteins is predicted to bind ATP. This domain has a strongly conserved motif SGGKD at the N terminus.	194
238953	cd01995	ExsB	ExsB is a transcription regulator related protein. It is a subfamily of an adenosine nucleotide binding superfamily of proteins. This protein family is represented by a single member in nearly every completed large (> 1000 genes) prokaryotic genome. In Rhizobium meliloti, a species in which the exo genes make succinoglycan, a symbiotically important exopolysaccharide, exsB is located nearby and affects succinoglycan levels, probably through polar effects on exsA expression or the same polycistronic mRNA. In Arthrobacter viscosus, the homologous gene is designated ALU1 and is associated with an aluminum tolerance phenotype. The function is unknown.	169
238954	cd01996	Alpha_ANH_like_III	This is a subfamily of Adenine nucleotide alpha hydrolases superfamily. Adenine nucleotide alpha hydrolases superfamily includes N type ATP PPases and ATP sulphurylases. It forms an alpha/beta/alpha fold which binds to the adenosine group. This subfamily of proteins is predicted to bind ATP. This domain has a strongly conserved motif SGGKD at the N terminus.	154
238955	cd01997	GMP_synthase_C	The C-terminal domain of GMP synthetase. It contains two subdomains: the ATP pyrophosphatase domain, which is close to the N-terminus, and the dimerization domain at the C-terminal end. The ATP-PPase is a twisted, five-stranded parallel beta-sheet sandwiched between helical layers. It has a signature nucleotide-binding motif, or P-loop, at the end of the first beta-strand. The dimerization domain is formed by the C-terminal 115 amino acids in prokaryotic proteins. It is adjacent to the ATP-binding site of the ATP-PPase subdomain. The largest differences between the primary sequences of prokaryotic and eukaryotic GMP synthetases map to the dimerization domain. Eukaryotic GMP synthetase has several large insertions relative to prokaryotes.	295
238956	cd01998	tRNA_Me_trans	tRNA methyl transferase. This family represents tRNA(5-methylaminomethyl-2-thiouridine)-methyltransferase which is involved in the biosynthesis of the modified nucleoside 5-methylaminomethyl-2-thiouridine present in the wobble position of some tRNAs. This family of enzymes is present only in bacteria and eukaryotes. The archaeal counterpart of this enzyme performs the same function, but is completely unrelated in sequence.	349
238957	cd01999	Argininosuccinate_Synthase	Argininosuccinate synthase. The Argininosuccinate synthase is a urea cycle enzyme that catalyzes the penultimate step in arginine biosynthesis: the ATP-dependent ligation of citrulline to aspartate to form argininosuccinate, AMP and pyrophosphate. In humans, a defect in the AS gene causes citrullinemia, a genetic disease characterized by severe vomiting spells and mental retardation. AS is a homotetrameric enzyme of chains of about 400 amino-acid residues. An arginine seems to be important for the enzyme's catalytic mechanism. The sequences of AS from various prokaryotes, archaebacteria and eukaryotes show significant similarity.	385
238958	cd02000	TPP_E1_PDC_ADC_BCADC	Thiamine pyrophosphate (TPP) family, E1 of PDC_ADC_BCADC subfamily, TPP-binding module; composed of proteins similar to the E1 components of the human pyruvate dehydrogenase complex (PDC), the acetoin dehydrogenase complex (ADC) and the branched chain alpha-keto acid dehydrogenase/2-oxoisovalerate dehydrogenase complex (BCADC). PDC catalyzes the irreversible oxidative decarboxylation of pyruvate to produce acetyl-CoA in the bridging step between glycolysis and the citric acid cycle. ADC participates in the breakdown of acetoin while BCADC participates in the breakdown of branched chain amino acids. BCADC catalyzes the oxidative decarboxylation of 4-methyl-2-oxopentanoate, 3-methyl-2-oxopentanoate and 3-methyl-2-oxobutanoate (branched chain 2-oxo acids derived from the transamination of leucine, valine and isoleucine).	293
238959	cd02001	TPP_ComE_PpyrDC	Thiamine pyrophosphate (TPP) family, ComE and PpyrDC subfamily, TPP-binding module; composed of proteins similar to sulfopyruvate decarboxylase beta subunit (ComE) and phosphonopyruvate decarboxylase (Ppyr decarboxylase). Methanococcus jannaschii sulfopyruvate decarboxylase (ComDE) is a dodecamer of six alpha (D) subunits and six beta (E) subunits which catalyzes the decarboxylation of sulfopyruvic acid to sulfoacetaldehyde in the coenzyme M pathway. Ppyr decarboxylase is a homotrimeric enzyme which functions in the biosynthesis of C-P compounds such as bialaphos tripeptide in Streptomyces hygroscopicus. Ppyr decarboxylase and ComDE require TPP and divalent metal cation cofactors.	157
238960	cd02002	TPP_BFDC	Thiamine pyrophosphate (TPP) family, BFDC subfamily, TPP-binding module; composed of proteins similar to Pseudomonas putida benzoylformate decarboxylase (BFDC). P. putida BFDC plays a role in the mandelate pathway, catalyzing the conversion of benzoylformate to benzaldehyde and carbon dioxide. This enzyme is dependent on TPP and a divalent metal cation as cofactors.	178
238961	cd02003	TPP_IolD	Thiamine pyrophosphate (TPP) family, IolD subfamily, TPP-binding module; composed of proteins similar to Rhizobium leguminosarum bv. viciae IolD. IolD plays an important role in myo-inositol catabolism.	205
238962	cd02004	TPP_BZL_OCoD_HPCL	Thiamine pyrophosphate (TPP) family, BZL_OCoD_HPCL subfamily, TPP-binding module; composed of proteins similar to benzaldehyde lyase (BZL), oxalyl-CoA decarboxylase (OCoD) and 2-hydroxyphytanoyl-CoA lyase (2-HPCL). Pseudomonas fluorescens biovar I BZL cleaves the acyloin linkage of benzoin producing 2 molecules of benzaldehyde and enabling the Pseudomonas to grow on benzoin as the sole carbon and energy source. OCoD has a role in the detoxification of oxalate, catalyzing the decarboxylation of oxalyl-CoA to formate. 2-HPCL is a peroxisomal enzyme which plays a role in the alpha-oxidation of 3-methyl-branched fatty acids, catalyzing the cleavage of 2-hydroxy-3-methylacyl-CoA into formyl-CoA and a 2-methyl-branched fatty aldehyde. All these enzymes depend on Mg2+ and TPP for activity.	172
238963	cd02005	TPP_PDC_IPDC	Thiamine pyrophosphate (TPP) family, PDC_IPDC subfamily, TPP-binding module; composed of proteins similar to pyruvate decarboxylase (PDC) and indolepyruvate decarboxylase (IPDC). PDC, a key enzyme in alcoholic fermentation, catalyzes the conversion of pyruvate to acetaldehyde and CO2. It is able to utilize other 2-oxo acids as substrates. In plants and various plant-associated bacteria, IPDC plays a role in the indole-3-pyruvic acid (IPA) pathway, a tryptophan-dependent biosynthetic route to indole-3-acetaldehyde (IAA). IPDC catalyzes the decarboxylation of IPA to IAA. Both PDC and IPDC depend on TPP and Mg2+ as cofactors.	183
238964	cd02006	TPP_Gcl	Thiamine pyrophosphate (TPP) family, Gcl subfamily, TPP-binding module; composed of proteins similar to Escherichia coli glyoxylate carboligase (Gcl). E. coli glyoxylate carboligase plays a key role in glyoxylate metabolism, where it catalyzes the condensation of two molecules of glyoxylate to give tartronic semialdehyde and carbon dioxide. This enzyme requires TPP, magnesium ion and FAD as cofactors.	202
238965	cd02007	TPP_DXS	Thiamine pyrophosphate (TPP) family, DXS subfamily, TPP-binding module; 1-Deoxy-D-xylulose-5-phosphate synthase (DXS) is a regulatory enzyme of the mevalonate-independent pathway involved in terpenoid biosynthesis. Terpenoids are plant natural products with important pharmaceutical activity. DXS catalyzes a transketolase-type condensation of pyruvate with D-glyceraldehyde-3-phosphate to form 1-deoxy-D-xylulose-5-phosphate (DXP) and carbon dioxide. The formation of DXP leads to the formation of the terpene precursor IPP (isopentenyl diphosphate) and to the formation of thiamine (vitamin B1) and pyridoxal (vitamin B6).	195
238966	cd02008	TPP_IOR_alpha	Thiamine pyrophosphate (TPP) family, IOR-alpha subfamily, TPP-binding module; composed of proteins similar to indolepyruvate ferredoxin oxidoreductase (IOR) alpha subunit. IOR catalyzes the oxidative decarboxylation of arylpyruvates, such as indolepyruvate or phenylpyruvate, which are generated by the transamination of aromatic amino acids, to the corresponding aryl acetyl-CoA.	178
238967	cd02009	TPP_SHCHC_synthase	Thiamine pyrophosphate (TPP) family, SHCHC synthase subfamily, TPP-binding module; composed of proteins similar to Escherichia coli 2-succinyl-6-hydroxyl-2,4-cyclohexadiene-1-carboxylic acid (SHCHC) synthase (also called MenD). SHCHC synthase plays a key role in the menaquinone biosynthetic pathway, converting isochorismate and 2-oxoglutarate to SHCHC, pyruvate and carbon dioxide. The enzyme requires TPP and a divalent metal cation for activity.	175
238968	cd02010	TPP_ALS	Thiamine pyrophosphate (TPP) family, Acetolactate synthase (ALS) subfamily, TPP-binding module; composed of proteins similar to Klebsiella pneumoniae ALS, a catabolic enzyme required for butanediol fermentation. ALS catalyzes the conversion of 2 molecules of pyruvate to acetolactate and carbon dioxide. ALS does not contain FAD, and requires TPP and a divalent metal cation for activity.	177
238969	cd02011	TPP_PK	Thiamine pyrophosphate (TPP) family, Phosphoketolase (PK) subfamily, TPP-binding module; PK catalyzes the conversion of D-xylulose 5-phosphate and phosphate to acetyl phosphate, D-glyceraldehyde-3-phosphate and H2O. This enzyme requires divalent magnesium ions and TPP for activity.	227
238970	cd02012	TPP_TK	Thiamine pyrophosphate (TPP) family, Transketolase (TK) subfamily, TPP-binding module; TK catalyzes the transfer of a two-carbon unit from ketose phosphates to aldose phosphates. In heterotrophic organisms, TK provides a link between glycolysis and the pentose phosphate pathway and provides precursors for nucleotide, aromatic amino acid and vitamin biosynthesis. In addition, the enzyme plays a central role in the Calvin cycle in plants. Typically, TKs are homodimers. They require TPP and divalent cations, such as magnesium ions, for activity.	255
238971	cd02013	TPP_Xsc_like	Thiamine pyrophosphate (TPP) family, Xsc-like subfamily, TPP-binding module; composed of proteins similar to Alcaligenes defragrans sulfoacetaldehyde acetyltransferase (Xsc). Xsc plays a key role in the degradation of taurine, catalyzing the desulfonation of 2-sulfoacetaldehyde into sulfite and acetyl phosphate. This enzyme requires TPP and divalent metal ions for activity.	196
238972	cd02014	TPP_POX	Thiamine pyrophosphate (TPP) family, Pyruvate oxidase (POX) subfamily, TPP-binding module; composed of proteins similar to Lactobacillus plantarum POX, which plays a key role in controlling acetate production under aerobic conditions. POX decarboxylates pyruvate, producing hydrogen peroxide and the energy-storage metabolite acetylphosphate. It requires FAD in addition to TPP and a divalent cation as cofactors.	178
238973	cd02015	TPP_AHAS	Thiamine pyrophosphate (TPP) family, Acetohydroxyacid synthase (AHAS) subfamily, TPP-binding module; composed of proteins similar to the large catalytic subunit of AHAS. AHAS catalyzes the condensation of two molecules of pyruvate to give the acetohydroxyacid, 2-acetolactate. 2-Acetolactate is the precursor of the branched chain amino acids, valine and leucine. AHAS also catalyzes the condensation of pyruvate and 2-ketobutyrate to form 2-aceto-2-hydroxybutyrate in isoleucine biosynthesis. In addition to requiring TPP and a divalent metal ion as cofactors, AHAS requires FAD.	186
238974	cd02016	TPP_E1_OGDC_like	Thiamine pyrophosphate (TPP) family, E1 of OGDC-like subfamily, TPP-binding module; composed of proteins similar to the E1 component of the 2-oxoglutarate dehydrogenase multienzyme complex (OGDC). OGDC catalyzes the oxidative decarboxylation of 2-oxoglutarate to succinyl-CoA and carbon dioxide, a key reaction of the tricarboxylic acid cycle.	265
238975	cd02017	TPP_E1_EcPDC_like	Thiamine pyrophosphate (TPP) family, E1 of E. coli PDC-like subfamily, TPP-binding module; composed of proteins similar to the E1 component of the Escherichia coli pyruvate dehydrogenase multienzyme complex (PDC). PDC catalyzes the oxidative decarboxylation of pyruvate and the subsequent acetylation of coenzyme A to acetyl-CoA. The E1 component of PDC catalyzes the first step of the multistep process, using TPP and a divalent cation as cofactors. E. coli PDC is a homodimeric enzyme.	386
238976	cd02018	TPP_PFOR	Thiamine pyrophosphate (TPP) family, Pyruvate ferredoxin/flavodoxin oxidoreductase (PFOR) subfamily, TPP-binding module; PFOR catalyzes the oxidative decarboxylation of pyruvate to form acetyl-CoA, a crucial step in many metabolic pathways. Archaea, anaerobic bacteria and eukaryotes that lack mitochondria (and therefore pyruvate dehydrogenase) use PFOR to oxidatively decarboxylate pyruvate, with ferredoxin or flavodoxin as the electron acceptor. PFORs can be homodimeric, heterodimeric, or heterotetrameric, depending on the organism. These enzymes are dependent on TPP and a divalent metal cation as cofactors.	237
238977	cd02019	NK	Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate.	69
238978	cd02020	CMPK	Cytidine monophosphate kinase (CMPK) catalyzes the reversible phosphorylation of cytidine monophosphate (CMP) to produce cytidine diphosphate (CDP), using ATP as the preferred phosphoryl donor.	147
238979	cd02021	GntK	Gluconate kinase (GntK) catalyzes the phosphoryl transfer from ATP to gluconate. The resulting product gluconate-6-phosphate is an important precursor of gluconate metabolism. GntK acts as a dimer composed of two identical subunits.	150
238980	cd02022	DPCK	Dephospho-coenzyme A kinase (DPCK, EC 2.7.1.24) catalyzes the phosphorylation of dephosphocoenzyme A (dCoA) to yield CoA, which is the final step in CoA biosynthesis.	179
238981	cd02023	UMPK	Uridine monophosphate kinase (UMPK, EC 2.7.1.48), also known as uridine kinase or uridine-cytidine kinase (UCK), catalyzes the reversible phosphoryl transfer from ATP to uridine or cytidine to yield UMP or CMP. In the pyrimidine nucleotide-salvage pathway, this enzyme combined with nucleoside diphosphate kinases further phosphorylates UMP and CMP to form UTP and CTP. This kinase also catalyzes the phosphorylation of several cytotoxic ribonucleoside analogs such as 5-fluorouridine and cyclopentenyl-cytidine.	198
238982	cd02024	NRK1	Nicotinamide riboside kinase (NRK) is an enzyme involved in the metabolism of nicotinamide adenine dinucleotide (NAD+). This enzyme catalyzes the phosphorylation of nicotinamide riboside (NR) to form nicotinamide mononucleotide (NMN). It defines the NR salvage pathway of NAD+ biosynthesis in addition to the pathways through nicotinic acid mononucleotide (NaMN). This enzyme can also phosphorylate the anticancer drug tiazofurin, which is an analog of nicotinamide riboside.	187
238983	cd02025	PanK	Pantothenate kinase (PanK) catalyzes the phosphorylation of pantothenic acid to form 4'-phosphopantothenic acid, which is the first of five steps in the coenzyme A (CoA) biosynthetic pathway. The reaction carried out by this enzyme is a key regulatory point in CoA biosynthesis.	220
238984	cd02026	PRK	Phosphoribulokinase (PRK) is an enzyme involved in the Benson-Calvin cycle in chloroplasts or photosynthetic prokaryotes. This enzyme catalyzes the phosphorylation of D-ribulose 5-phosphate to form D-ribulose 1,5-bisphosphate, using ATP and NADPH produced by the primary reactions of photosynthesis.	273
238985	cd02027	APSK	Adenosine 5'-phosphosulfate kinase (APSK) catalyzes the phosphorylation of adenosine 5'-phosphosulfate to form 3'-phosphoadenosine 5'-phosphosulfate (PAPS). The end-product PAPS is a biologically "activated" sulfate form important for the assimilation of inorganic sulfate.	149
238986	cd02028	UMPK_like	Uridine monophosphate kinase_like (UMPK_like) is a family of proteins highly similar to the uridine monophosphate kinase (UMPK, EC 2.7.1.48), also known as uridine kinase or uridine-cytidine kinase (UCK).	179
238987	cd02029	PRK_like	Phosphoribulokinase-like (PRK-like) is a family of proteins similar to phosphoribulokinase (PRK), the enzyme involved in the Benson-Calvin cycle in chloroplasts or photosynthetic prokaryotes. PRK catalyzes the phosphorylation of D-ribulose 5-phosphate to form D-ribulose 1,5-bisphosphate, using ATP and NADPH produced by the primary reactions of photosynthesis.	277
238988	cd02030	NDUO42	NADH:Ubiquinone oxidoreductase, 42 kDa (NDUO42) is a family of proteins that are highly similar to deoxyribonucleoside kinases (dNK). Members of this family have been identified as one of the subunits of NADH:Ubiquinone oxidoreductase (complex I), a multi-protein complex located in the inner mitochondrial membrane. The main function of the complex is to transport electrons from NADH to ubiquinone, which is accompanied by the translocation of protons from the mitochondrial matrix to the intermembrane space.	219
349752	cd02032	Bchl-like	L-subunit of protochlorophyllide reductase. This family of proteins contains BchL and ChlL. Protochlorophyllide reductase catalyzes the reductive formation of chlorophyllide from protochlorophyllide during biosynthesis of chlorophylls and bacteriochlorophylls. Three genes, bchL, bchN and bchB, are involved in light-independent protochlorophyllide reduction in bacteriochlorophyll biosynthesis. In cyanobacteria, algae, and gymnosperms, three similar genes, chlL, chlN and chlB, are involved in protochlorophyllide reduction during chlorophyll biosynthesis. BchL/chlL, bchN/chlN and bchB/chlB exhibit significant sequence similarity to the nifH, nifD and nifK subunits of nitrogenase, respectively. Nitrogenase catalyzes the reductive formation of ammonia from dinitrogen.	267
349753	cd02033	BchX	X-subunit of protochlorophyllide reductase. Chlorophyllide reductase converts chlorophylls into bacteriochlorophylls by reducing the chlorin B-ring. This family contains the X subunit of this three-subunit enzyme. Sequence and structure similarity between bchX, protochlorophyllide reductase L subunit (bchL and chlL) and nitrogenase Fe protein (nifH gene) suggest their functional similarity. Members of the BchX family serve as the unique electron donors to their respective catalytic subunits (bchN-bchB, bchY-bchZ and nitrogenase component 1). Mechanistically, they hydrolyze ATP and transfer electrons through an Fe4-S4 cluster.	329
349754	cd02034	CooC1	accessory protein CooC1. The accessory protein CooC1, a nickel-binding ATPase, participates in the incorporation of nickel into the complex active site ([Ni-4Fe-4S]) cluster of Ni,Fe-dependent carbon monoxide dehydrogenase (CODH). CODH from Rhodospirillum rubrum catalyzes the reversible oxidation of CO to CO2. CODH contains a nickel-iron-sulfur cluster (C-center) and an iron-sulfur cluster (B-center). CO oxidation occurs at the C-center. Three accessory proteins encoded by cooCTJ genes are involved in nickel incorporation into a nickel site. CooC functions as a nickel insertase that mobilizes nickel to apoCODH using energy released from ATP hydrolysis. CooC is a homodimer and has NTPase activities. Mutation at the P-loop abolishes its function.	249
349755	cd02035	ArsA	Arsenical pump-driving ATPase ArsA. ArsA ATPase functions as an efflux pump located on the inner membrane of the cell. This ATP-driven oxyanion pump catalyzes the extrusion of arsenite, antimonite and arsenate. Maintenance of a low intracellular concentration of oxyanion produces resistance to the toxic agents. The pump is composed of two subunits, the catalytic ArsA subunit and the membrane subunit ArsB, which are encoded by arsA and arsB genes, respectively. Arsenic efflux in bacteria is catalyzed by either ArsB alone or by ArsAB complex. The ATP-coupled pump, however, is more efficient. ArsA is composed of two homologous halves, A1 and A2, connected by a short linker sequence.	250
349756	cd02036	MinD	septum site-determining protein MinD. Septum site-determining protein MinD is part of the operon MinCDE that determines the site of the formation of a septum at mid-cell, an important part of bacterial cell division. MinC is a nonspecific inhibitor of the septum protein FtsZ. MinE is the suppressor of MinC. MinD plays a pivotal role, selecting the mid-cell over other sites through the activation and regulation of MinC and MinE. MinD is a membrane-associated ATPase, related to nitrogenase iron protein.	236
349757	cd02037	Mrp_NBP35	Mrp/NBP35 ATP-binding protein family. Mrp/NBP35 ATP-binding family proteins are typically iron-sulfur (FeS) cluster scaffolds that function to assemble nascent FeS clusters for transfer to FeS-requiring enzymes. Members include the eukaryotic nucleotide-binding protein 1 (NUBP1) which is a component of the cytosolic iron-sulfur (Fe/S) protein assembly (CIA) machinery and the archaeal [NiFe] hydrogenase maturation protein HypB which is required for nickel insertion into [NiFe] hydrogenase.	213
349758	cd02038	FlhG-like	MinD-like ATPase FlhG. FlhG is a member of the SIMIBI superfamily. FlhG (also known as YlxH) is a major determinant for a variety of flagellation patterns. It affects the location and number of bacterial flagella during C-ring assembly.	230
185678	cd02039	cytidylyltransferase_like	Cytidylyltransferase-like domain. Cytidylyltransferase-like domain. Many of these proteins are known to use CTP or ATP and release pyrophosphate. Protein families that contain at least one copy of this domain include citrate lyase ligase, pantoate-beta-alanine ligase, glycerol-3-phosphate cytidyltransferase, ADP-heptose synthase, phosphocholine cytidylyltransferase, lipopolysaccharide core biosynthesis protein KdtB, the bifunctional protein NadR, and a number whose function is unknown.	143
349759	cd02040	NifH	nitrogenase component II NifH. NifH gene encodes component II (iron protein) of nitrogenase. Nitrogenase is responsible for the biological nitrogen fixation, i.e. reduction of molecular nitrogen to ammonia. Nitrogenase consists of two oxygen-sensitive metallosulfur proteins: the molybdenum-iron (alternatively, vanadium-iron or iron-iron) protein (commonly referred to as component 1), and the iron protein (commonly referred to as component 2). The iron protein is a homodimer, with an Fe4S4 cluster bound between the subunits and two ATP-binding domains. It supplies energy by ATP hydrolysis, and transfers electrons from reduced ferredoxin or flavodoxin to component 1 for the reduction of molecular nitrogen to ammonia.	265
349760	cd02042	ParAB_family	partition proteins ParAB family. ParA and ParB of Caulobacter crescentus belong to a conserved family of bacterial proteins implicated in chromosome segregation. ParB binds to DNA sequences adjacent to the origin of replication and localizes to opposite cell poles shortly following the initiation of DNA replication. ParB regulates the ParA ATPase activity by promoting nucleotide exchange in a fashion reminiscent of the exchange factors of eukaryotic G proteins. ADP-bound ParA binds single-stranded DNA, whereas the ATP-bound form dissociates ParB from its DNA binding sites. Increasing the fraction of ParA-ADP in the cell inhibits cell division, suggesting that this simple nucleotide switch may regulate cytokinesis. ParA shares sequence similarity to a conserved and widespread family of ATPases which includes the repA protein of the repABC operon in Rhizobium etli symbiotic plasmid. This operon is involved in the plasmid replication and partition.	130
381001	cd02043	serpinP_plants	serpin family P, plant serpins. Plant SERine Proteinase INhibitors (serpins) are potent inhibitors of a range of mammalian serine proteases in vitro, and at least seven serpin genes are expressed in Arabidopsis. Serpins from plants display a wide range of functions including protection of storage proteins from degradation by exogenous proteases and seed survival within the herbivore digestive tract. Comparison between Arabidopsis AtSerpin1 and other serpins reveals several distinguishing features including a plant-specific insertion between s2B and s3B, with a plant-specific motif YXXGXDXRXF and the presence of a beta-bulge in strand s2C. The conserved Asp-230 and Arg-232 in the motif form a network of hydrogen bonds that stabilizes a loop region, which is otherwise disordered in many other serpin structures. AtSerpin1 is targeted to the secretory pathway and was shown to interact with cysteine protease RD21 (RESPONSIVE TO DESICCATION-21). RD21 accepts peptides and ligates them to the N termini of acceptor proteins so it has been proposed that AtSerpin1 functions to curb this activity. This subgroup corresponds to clade P of the serpin superfamily. In general, serpins exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	382
381002	cd02045	serpinC1_AT3	serpin family C member 1, antithrombin III. Antithrombin III (AT3/ATIII) is a non-vitamin K-dependent serine protease inhibitor that inhibits coagulation by neutralizing the enzymatic activity of thrombin (factor IIa) and of factors IXa and Xa. It is the most important anticoagulant molecule in mammalian circulation systems, controlled by its interaction with the cofactor, heparin, which accelerates its interaction with target proteases. This subgroup corresponds to clade C of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants can cause blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	395
381003	cd02046	serpinH1_CBP1	serpin family H member 1, collagen-binding protein 1. Collagen-binding protein 1 (CBP1, also called heat shock protein 47/hsp47 or colligin), because of its collagen binding ability, is a chaperone specific protein for the correct folding of types I-V procollagen in the endoplasmic reticulum (ER). It is induced under stress conditions through heat shock element-heat shock factor interaction and has been shown to be essential for collagen biosynthesis. Hsp47 transiently binds to procollagen in the ER, dissociates in the cis-Golgi or ER-Golgi intermediate compartment, and is then transported back to the ER via its RDEL retention sequence. Hsp47 recognizes collagenous (Gly-Xaa-Arg) repeats on triple-helical procollagen and can prevent local unfolding and/or aggregate formation of procollagen. Hsp47 is a non-inhibitory member of the SERPIN superfamily and corresponds to clade H. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants can cause blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	382
381004	cd02047	serpinD1_HCF2	serpin family D member 1, Heparin cofactor II. Heparin cofactor II (HCF2/HC-II, also called protease inhibitor leuserpin-2/hLS2) is a protein encoded by the SERPIND1 gene that inhibits thrombin, the final protease of the coagulation cascade. HCII is allosterically activated by binding to cell surface glycosaminoglycans (GAGs). The specificity of HCII for thrombin is conferred by a highly acidic hirudin-like N-terminal tail, which becomes available after GAG binding for interaction with the anion-binding exosite I of thrombin. HCII deficiency can lead to increased thrombin generation and a hypercoagulable state. This subgroup corresponds to clade D of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	449
381005	cd02048	serpinI1_NSP	serpin family I member 1, neuroserpin. Neuroserpin (NSP, also called proteinase inhibitor 12/PI-12) is an inhibitory member of the serpin family that reacts preferentially with tissue-type plasminogen activator (tPA). It is located in neurons in regions of the brain where tPA is also found, suggesting that neuroserpin is the selective inhibitor of tPA in the central nervous system (CNS). This subgroup corresponds to clade I of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	372
381006	cd02050	serpinG1_C1-INH	serpin family G member 1, plasma proteinase C1 inhibitor. Plasma proteinase C1 inhibitor (C1-INH/C1IN) is a protease inhibitor of the serpin family. It plays a pivotal role in regulating the activation of the classical complement pathway and of the contact system, via regulating bradykinin formation, inhibiting factor XII and kallikrein of the contact system, and via acting on factor XI in the coagulation cascade. This subgroup corresponds to clade G of the serpin superfamily. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	362
381007	cd02051	serpinE1_PAI-1	serpin family E member 1, plasminogen activator inhibitor-1. Plasminogen activator inhibitor-1 (PAI-1/PLANH1, also called endothelial PAI) is the primary, fast-acting inhibitor of plasminogen activators. It is often bound to vitronectin, an abundant component of the extracellular matrix in many tissues. PAI1 deficiency is a rare bleeding disorder that causes excessive or prolonged bleeding due to blood clots being broken down too early. PAI-1 is a member of the serpin superfamily and belongs to clade E. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	374
381008	cd02052	serpinF1_PEDF	serpin family F member 1, Pigment epithelium-derived factor (PEDF). Pigment epithelium-derived factor (PEDF, also called capsin or EPC-1) is an extracellular component of the retinal interphotoreceptor matrix, vitreous humor, and aqueous humor of the adult eye. PEDF is a non-inhibitory member of the serpin superfamily. It exhibits neurotrophic, neuroprotective and antiangiogenic properties and is widely expressed in the developing and adult nervous systems. This subgroup corresponds to clade F1 of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	373
381009	cd02053	serpinF2_A2AP	serpin family F member 2, alpha2-antiplasmin inhibitor. Alpha2-antiplasmin inhibitor (A2AP/API, also called plasmin inhibitor/PLI or alpha-2-antiplasmin) is the primary inhibitor of plasmin, a proteinase that digests fibrin, the main component of blood clots. Alpha2AP forms an inactive 1:1 stoichiometric complex with plasmin. It also rapidly crosslinks to fibrin during blood clotting by activated coagulation factor XIII, and as a consequence fibrin becomes more resistant to fibrinolysis. Therefore alpha2AP is important in modulating the effectiveness and persistence of fibrin with respect to its susceptibility to digestion and removal by plasmin. This subgroup corresponds to clade F2 of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	363
381010	cd02054	serpinA8_AGT	serpin family A member 8, angiotensinogen. Angiotensinogen (AGT) is part of the renin-angiotensin system (RAS), which plays an important role in blood pressure regulation, renal hemodynamics, as well as fluid and electrolyte homeostasis. It is also involved in normal and abnormal growth processes. The growth promoting actions of angiotensin have been shown in a variety of cells and tissues. This subgroup represents clade A8 of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	446
381011	cd02055	serpinA10_PZI	serpin family A member 10, protein Z-dependent protease inhibitor. Protein Z-dependent protease inhibitor (ZPI) is a member of the serpin superfamily of proteinase inhibitors (clade A10). ZPI inhibits coagulation factor Xa, dependent on protein Z (PZ), a vitamin K-dependent plasma protein. ZPI also inhibits factor XIa in a process that does not require PZ. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	380
381012	cd02056	serpinA1_A1AT	serpin family A member 1, alpha-1-antitrypsin. Alpha-1-antitrypsin (also called A1AT, A1A, AAT, alpha1-proteinase inhibitor/A1PI, alpha1-antiproteinase/A1AP, proteinase inhibitor/PI, and serum trypsin inhibitor) is a protease inhibitor that belongs to the serpin superfamily. It is encoded in humans by the SERPINA1 gene. When the blood contains inadequate amounts of A1AT or functionally defective A1AT (such as in alpha-1 antitrypsin deficiency), neutrophil elastase is excessively free to break down elastin, degrading the elasticity of the lungs, which results in respiratory complications, such as chronic obstructive pulmonary disease. Normally, A1AT leaves its site of origin, the liver, and joins the systemic circulation; defective A1AT fails to do so, building up in the liver, which results in cirrhosis. This family contains other A1AT-like members of clade A of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	368
381013	cd02057	serpinB5_maspin	serpin family B member 5, mammary serine proteinase inhibitor. Mammary serine proteinase inhibitor (maspin, also known as proteinase inhibitor 5/PI5), a member of the serpin superfamily, is related to the ov-serpins, with a multitude of effects on cells and tissues at an assortment of developmental stages. Maspin has tumor suppressing activity against breast and prostate cancer. All true inhibitory serpins rely on an exposed reactive center loop (RCL) to inhibit their target proteinase, in which the proteinase cleaves the RCL and becomes incorporated into a serpin-proteinase complex. Maspin differs from other serpins in that its RCL is necessary for activity, but it is not cleaved or rearranged. The ovalbumin family of serpins (ov-serpins) is a family of closely related proteins, whose members can be secreted (ovalbumin), cytosolic (leukocyte elastase inhibitor, LEI), or targeted to both compartments (plasminogen activator inhibitor 2, PAI-2). It is also characterized by N- and C-terminal extensions, the absence of a signal peptide, and a Ser rather than an Asn residue at the penultimate position. The ov-serpins correspond to clade B of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants can cause blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	375
381014	cd02058	serpinB_MENT-like	serpin family B, Myeloid and Erythroid Nuclear Termination stage-specific protein (MENT) and similar proteins. Gallus gallus Myeloid and Erythroid Nuclear Termination stage-specific protein (MENT) is a nonhistone heterochromatin-associated serpin that is an effective inhibitor of cathepsin L as well as the papain-like cysteine proteases cathepsins K, L, and V in vitro. Its reactive center loop, which is essential for chromatin bridging, is able to mediate formation of a loop-sheet oligomer. It also contains an M-loop which contains two critical functional motifs: a classical nuclear localization signal (NLS) that is required for nuclear import and an AT-hook motif that is involved in chromatin and DNA binding. MENT belongs to clade B of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants can cause blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	406
381015	cd02059	serpinB14_OVA	serpin family B member 14, ovalbumin. The chicken protein ovalbumin (OVA3), a storage protein from egg white, lacking a loop insertion mechanism and therefore protease inhibitory activity, is a historical member of the serpin superfamily and the founding member of the subgroup known as ov-serpins (ovalbumin-related serpins). It has several modifications, including N-terminal acetylation, phosphorylation, and glycosylation. Ovalbumin is secreted from the cell, targeted by an internal signal sequence, rather than the N-terminal signal sequence commonly found in other secreted proteins. The ovalbumin family of serpins (ov-serpins) is a family of closely related proteins, whose members can be secreted (ovalbumin), cytosolic (leukocyte elastase inhibitor, LEI), or targeted to both compartments (plasminogen activator inhibitor 2, PAI-2). It is also characterized by N- and C-terminal extensions, the absence of a signal peptide, and a Ser rather than an Asn residue at the penultimate position. The ov-serpins correspond to clade B of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants can cause blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	385
380311	cd02062	Nitro_FMN_reductase	nitroreductase family protein. Proteins of this family catalyze the reduction of flavin or nitrocompounds using NAD(P)H as electron donor in an obligatory two-electron transfer, utilizing FMN or FAD as cofactor. They are often found to be homodimers. Enzymes of this family are described as NAD(P)H:FMN oxidoreductases, oxygen-insensitive nitroreductase, flavin reductase P, dihydropteridine reductase, NADH oxidase or NADH dehydrogenase.	139
185679	cd02064	FAD_synthetase_N	FAD synthetase, N-terminal domain of the bifunctional enzyme. FAD synthetase_N. N-terminal domain of the bifunctional riboflavin biosynthesis protein riboflavin kinase/FAD synthetase. These enzymes have both ATP:riboflavin 5'-phosphotransferase and ATP:FMN-adenylyltransferase activities. The N-terminal domain is believed to play a role in the adenylylation reaction of FAD synthetases. The C-terminal domain is thought to have kinase activity. FAD synthetase is present among all kingdoms of life. However, the bifunctional enzyme is not found in mammals, which use separate enzymes for FMN and FAD formation.	180
239016	cd02065	B12-binding_like	B12 binding domain (B12-BD). Most of the members bind different cobalamid derivatives, like B12 (adenosylcobamide) or methylcobalamin or methyl-Co(III) 5-hydroxybenzimidazolylcobamide. This domain is found in several enzymes, such as glutamate mutase, methionine synthase and methylmalonyl-CoA mutase. Cobalamin undergoes a conformational change on binding the protein; the dimethylbenzimidazole group, which is coordinated to the cobalt in the free cofactor, moves away from the corrin and is replaced by a histidine contributed by the protein. The sequence Asp-X-His-X-X-Gly, which contains this histidine ligand, is conserved in many cobalamin-binding proteins. Not all members of this family contain the conserved binding motif.	125
239017	cd02066	GRX_family	Glutaredoxin (GRX) family; composed of GRX, approximately 10 kDa in size, and proteins containing a GRX or GRX-like domain. GRX is a glutathione (GSH) dependent reductase, catalyzing the disulfide reduction of target proteins such as ribonucleotide reductase. It contains a redox active CXXC motif in a TRX fold and uses a similar dithiol mechanism employed by TRXs for intramolecular disulfide bond reduction of protein substrates. Unlike TRX, GRX has preference for mixed GSH disulfide substrates, in which it uses a monothiol mechanism where only the N-terminal cysteine is required. The flow of reducing equivalents in the GRX system goes from NADPH -> GSH reductase -> GSH -> GRX -> protein substrates. By altering the redox state of target proteins, GRX is involved in many cellular functions including DNA synthesis, signal transduction and the defense against oxidative stress. Different classes are known including human GRX1 and GRX2, as well as E. coli GRX1 and GRX3, which are members of this family. E. coli GRX2, however, is a 24-kDa protein that belongs to the GSH S-transferase (GST) family.	72
239018	cd02067	B12-binding	B12 binding domain (B12-BD). This domain binds different cobalamid derivatives, like B12 (adenosylcobamide) or methylcobalamin or methyl-Co(III) 5-hydroxybenzimidazolylcobamide. It is found in several enzymes, such as glutamate mutase, methionine synthase and methylmalonyl-CoA mutase. Cobalamin undergoes a conformational change on binding the protein; the dimethylbenzimidazole group, which is coordinated to the cobalt in the free cofactor, moves away from the corrin and is replaced by a histidine contributed by the protein. The sequence Asp-X-His-X-X-Gly, which contains this histidine ligand, is conserved in many cobalamin-binding proteins.	119
239019	cd02068	radical_SAM_B12_BD	B12 binding domain_like associated with radical SAM domain. This domain shows similarity with B12 (adenosylcobamide) binding domains found in several enzymes, such as glutamate mutase, methionine synthase and methylmalonyl-CoA mutase, but it lacks the signature motif Asp-X-His-X-X-Gly, which contains the histidine that acts as a cobalt ligand. The function of this domain remains unclear.	127
239020	cd02069	methionine_synthase_B12_BD	B12 binding domain of methionine synthase. This domain binds methylcobalamin, which it uses as an intermediate methyl carrier from methyltetrahydrofolate (CH3H4folate) to homocysteine (Hcy).	213
239021	cd02070	corrinoid_protein_B12-BD	B12 binding domain of corrinoid proteins. A family of small methanogenic corrinoid proteins that bind methyl-Co(III) 5-hydroxybenzimidazolylcobamide as a cofactor. They play a role in methanogenesis from trimethylamine, dimethylamine or monomethylamine, which is initiated by a series of corrinoid-dependent methyltransferases.	201
239022	cd02071	MM_CoA_mut_B12_BD	methylmalonyl CoA mutase B12 binding domain. This domain binds to B12 (adenosylcobamide), which initiates the conversion of succinyl CoA and methylmalonyl CoA by forming an adenosyl radical, which then undergoes a rearrangement exchanging a hydrogen atom with a group attached to a neighboring carbon atom. This family is present in both mammals and bacteria. Bacterial members are heterodimers and involved in the fermentation of pyruvate to propionate. Mammalian members are homodimers and responsible for the conversion of odd-chain fatty acids and branched-chain amino acids via propionyl CoA to succinyl CoA for further degradation.	122
239023	cd02072	Glm_B12_BD	B12 binding domain of glutamate mutase (Glm). Glutamate mutase catalyzes the interconversion of (S)-glutamate and (2S,3S)-3-methylaspartate. The rearrangement reaction is initiated by the abstraction of a hydrogen from the protein-bound substrate by a 5'-deoxyadenosyl radical, which is generated by the homolytic cleavage of the organometallic bond of the cofactor B12. Glm is a heterotetrameric molecule consisting of two alpha and two epsilon polypeptide chains.	128
319770	cd02073	P-type_ATPase_APLT_Dnf-like	Aminophospholipid translocases (APLTs), similar to Saccharomyces cerevisiae Dnf1-3p, Drs2p, and human ATP8A2, -10D, -11B, -11C. Aminophospholipid translocases (APLTs), also known as type 4 P-type ATPases, act as flippases, and translocate specific phospholipids from the exoplasmic leaflet to the cytoplasmic leaflet of biological membranes. Yeast Dnf1 and Dnf2 mediate the transport of phosphatidylethanolamine, phosphatidylserine, and phosphatidylcholine from the outer to the inner leaflet of the plasma membrane. This subfamily includes mammalian flippases such as ATP11C which may selectively transport PS and PE from the outer leaflet of the plasma membrane to the inner leaflet. It also includes Arabidopsis phospholipid flippases including ALA1, and Caenorhabditis elegans flippases, including TAT-1, which has been shown to facilitate the inward transport of phosphatidylserine. This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle.	836
319771	cd02076	P-type_ATPase_H	plant and fungal plasma membrane H(+)-ATPases, and related bacterial and archaeal putative H(+)-ATPases. This subfamily includes eukaryotic plasma membrane H(+)-ATPase which transports H(+) from the cytosol to the extracellular space, thus energizing the plasma membrane for the uptake of ions and nutrients, and is expressed in plants and fungi. This H(+)-ATPase consists of four domains: a transmembrane domain and three cytosolic domains: nucleotide-binding domain, phosphorylation domain and actuator domain, and belongs to the P-type ATPase type III subfamily. This subfamily also includes the putative P-type H(+)-ATPase, MJ1226p of the anaerobic hyperthermophilic archaeon Methanococcus jannaschii. The P-type ATPases are a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle.	781
319772	cd02077	P-type_ATPase_Mg	magnesium transporting ATPase (MgtA), similar to Escherichia coli MgtA and Salmonella typhimurium MgtA. MgtA is a membrane protein which actively transports Mg(2+) into the cytosol with its electro-chemical gradient rather than against the gradient as other cation transporters do. It may act both as a transporter and as a sensor for Mg(2+). In Salmonella typhimurium and Escherichia coli, the two-component system PhoQ/PhoP regulates the transcription of the mgtA gene by sensing Mg(2+) concentrations in the periplasm. MgtA is activated by cardiolipin and it is highly sensitive to free magnesium in vitro. It consists of a transmembrane domain and three cytosolic domains: nucleotide-binding domain, phosphorylation domain and actuator domain, and belongs to the P-type ATPase type III subfamily. The P-type ATPases are a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle.	768
319773	cd02078	P-type_ATPase_K	potassium-transporting ATPase ATP-binding subunit, KdpB, a subunit of the prokaryotic high-affinity potassium uptake system KdpFABC; similar to Escherichia coli KdpB. KdpFABC is a prokaryotic high-affinity potassium uptake system. It is expressed under K(+) limiting conditions when the other potassium transport systems are not able to provide a sufficient flow of K(+) into the bacteria. The KdpB subunit represents the catalytic subunit performing ATP hydrolysis. KdpB is comprised of four domains: the transmembrane domain, the nucleotide-binding domain, the phosphorylation domain, and the actuator domain. The P-type ATPases are a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle.	667
319774	cd02079	P-type_ATPase_HM	P-type heavy metal-transporting ATPase. Heavy metal-transporting ATPases (Type IB ATPases) transport heavy metal ions (Cu(+), Cu(2+), Zn(2+), Cd(2+), Co(2+), etc.) across biological membranes. These ATPases include mammalian copper-transporting ATPases, ATP7A and ATP7B, Bacillus subtilis CadA which transports cadmium, zinc and cobalt out of the cell, Bacillus subtilis ZosA/PfeT which transports copper, and perhaps also zinc and ferrous iron, Archaeoglobus fulgidus CopA and CopB, Staphylococcus aureus plasmid pI258 CadA, a cadmium-efflux ATPase, and Escherichia coli ZntA which is selective for Pb(2+), Zn(2+), and Cd(2+). The characteristic N-terminal heavy metal associated (HMA) domain of this group is essential for the binding of metal ions. This family belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle.	617
319775	cd02080	P-type_ATPase_cation	P-type cation-transporting ATPase similar to Exiguobacterium aurantiacum Mna, an Na(+)-ATPase, and Synechocystis sp. PCC 6803 PMA1, a putative Ca(2+)-ATPase. This subfamily includes the P-type Na(+)-ATPase of an alkaliphilic bacterium Exiguobacterium aurantiacum Mna and cyanobacterium Synechocystis sp. PCC 6803 PMA1, a cation-transporting ATPase which may translocate calcium. The P-type ATPases, are a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle.	819
319776	cd02081	P-type_ATPase_Ca_PMCA-like	animal plasma membrane Ca2(+)-ATPases (PMCA), similar to human ATP2B1-4/PMCA1-4, and related Ca2(+)-ATPases including Saccharomyces cerevisiae vacuolar PMC1. Animal PMCAs function to export Ca(2+) from cells and play a role in regulating Ca(2+) signals following stimulus induction and in preventing calcium toxicity. Many PMCA pump variants exist due to alternative splicing of transcripts. PMCAs are regulated by the binding of calmodulin or by kinase-mediated phosphorylation. Saccharomyces cerevisiae vacuolar transporter Pmc1p facilitates the accumulation of Ca2+ into vacuoles. Pmc1p is not regulated by direct calmodulin binding but responds to the calmodulin/calcineurin-signaling pathway and is controlled by the transcription factor complex Tcn1p/Crz1p.  Similarly, the expression of the gene for Dictyostelium discoideum Ca(2+)-ATPase PAT1, patA, is under the control of a calcineurin-dependent transcription factor. Plant vacuolar Ca(2+)-ATPases, are regulated by direct-calmodulin binding. Plant Ca(2+)-ATPases are present at various cellular locations including the plasma membrane, endoplasmic reticulum, chloroplast and vacuole. This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle.	721
319777	cd02082	P-type_ATPase_cation	P-type cation-transporting ATPases, similar to human ATPase type 13A1-A4 (ATP13A1-A4) proteins and Saccharomyces cerevisiae Ypk9p and Spf1p. Saccharomyces cerevisiae Ypk9p localizes to the yeast vacuole and may play a role in sequestering heavy metal ions; its deletion confers growth sensitivity to cadmium, manganese, nickel or selenium. Saccharomyces cerevisiae Spf1p may mediate manganese transport into the endoplasmic reticulum. Human ATP13A2 (PARK9/CLN12) is a lysosomal transporter with zinc as the possible substrate. Mutation in the ATP13A2 gene has been linked to Parkinson's disease and Kufor-Rakeb syndrome, and to neuronal ceroid lipofuscinoses. ATP13A3/AFURS1 is a candidate gene for oculo auriculo vertebral spectrum (OAVS), being one of nine genes included in a 3q29 microduplication in a patient with OAVS. Mutation in the human ATP13A4 may be involved in a speech-language disorder. The expression of ATP13A1 has been followed during mouse development; ATP13A1 transcript expression showed an increase as development progressed, with the highest expression at the peak of neurogenesis. This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle.	786
319778	cd02083	P-type_ATPase_SERCA	sarco/endoplasmic reticulum Ca(2+)-ATPase (SERCA), similar to mammalian ATP2A1-3/SERCA1-3. SERCA is a transmembrane (Ca2+)-ATPase and a major regulator of Ca(2+) homeostasis and contractility in cardiac and skeletal muscle. It re-sequesters cytoplasmic Ca(2+) to the sarco/endoplasmic reticulum store, thereby also terminating Ca(2+)-induced signaling such as in muscle contraction. Three genes (ATP2A1-3/SERCA1-3) encode SERCA pumps in mammals, further isoforms exist due to alternative splicing of transcripts. The activity of SERCA is regulated by two small membrane proteins called phospholamban and sarcolipin. This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle.	979
319779	cd02085	P-type_ATPase_SPCA	golgi-associated secretory pathway Ca(2+) transport ATPases, similar to human ATPase secretory pathway Ca(2+) transporting 1/hSPCA1 and Saccharomyces cerevisiae Ca(2+)/Mn(2+)-transporting P-type ATPase, Pmr1p. SPCAs are Ca(2+) pumps important for the golgi-associated secretion pathway, in addition some function as Mn(2+) pumps in Mn(2+) detoxification. Saccharomyces cerevisiae Pmr1p is a high affinity Ca(2+)/Mn(2+) ATPase which transports Ca(2+) and Mn(2+) from the cytoplasm into the Golgi. Pmr1p also contributes to Cd(2+) detoxification.  This subfamily includes human SPCA1 and SPCA2, encoded by the ATP2C1 and ATP2C2 genes; autosomal dominant Hailey-Hailey disease is caused by mutations in the human ATP2C1 gene. It also includes Strongylocentrotus purpuratus testis secretory pathway calcium transporting ATPase SPCA which plays an important role in fertilization. This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle.	804
319780	cd02086	P-type_ATPase_Na_ENA	fungal-type Na(+)-ATPase, similar to the plasma membrane sodium transporters Saccharomyces cerevisiae Ena1p, Ena2p and Ustilago maydis Ena1, and the endoplasmic reticulum sodium transporter Ustilago maydis Ena2. Fungal-type Na(+)-ATPase (also called ENA ATPases). This subfamily includes the Saccharomyces cerevisiae plasma membrane transporters: Na(+)/Li(+)-exporting ATPase Ena1p which may also extrude K(+), and Na(+)-exporting P-type ATPase Ena2p. It also includes Ustilago maydis plasma membrane Ena1, a K(+)/Na(+)-ATPase whose chief role is to pump Na(+) and K(+) out of the cytoplasm, especially at high pH values, and endoplasmic reticulum Ena2 ATPase which mediates Na(+) or K(+) fluxes in the ER or in other endomembranes. This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle.	920
319781	cd02089	P-type_ATPase_Ca_prok	prokaryotic P-type Ca(2+)-ATPase similar to Synechococcus elongatus sp. strain PCC 7942 PacL and Listeria monocytogenes LMCA1. Ca(2+) transport ATPase is a plasma membrane protein which pumps Ca(2+) ion out of the cytoplasm. This prokaryotic subfamily includes the Ca(2+)-ATPase Synechococcus elongatus PacL, Listeria monocytogenes Ca(2+)-ATPase 1 (LMCA1) which has a low Ca(2+) affinity and a high pH optimum (pH about 9) and may remove Ca(2+) from the microorganism in environmental conditions when e.g. stressed by high Ca(2+) and alkaline pH, and the Bacillus subtilis putative P-type Ca(2+)-transport ATPase encoded by the yloB gene, which is expressed during sporulation. This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle.	674
319782	cd02092	P-type_ATPase_FixI-like	Rhizobium meliloti FixI and related proteins; belongs to P-type heavy metal-transporting ATPase subfamily. FixI may be a pump of a specific cation involved in symbiotic nitrogen fixation. The Rhizobium fixI gene is part of an operon conserved among rhizobia, fixGHIS. FixG, FixH, FixI, and FixS may participate in a membrane-bound complex coupling the FixI cation pump with a redox process catalyzed by FixG, an iron-sulfur protein. This subclass of P-type ATPase is also referred to as CPx-type ATPases because their amino acid sequences contain a characteristic CPC or CPH motif associated with a stretch of hydrophobic amino acids and N-terminal ion-binding sequences. This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle.	605
319783	cd02094	P-type_ATPase_Cu-like	P-type heavy metal-transporting ATPase, similar to human copper-transporting ATPases, ATP7A and ATP7B. The mammalian copper-transporting P-type ATPases, ATP7A and ATP7B are key molecules required for the regulation and maintenance of copper homeostasis. Menkes and Wilson diseases are caused by mutation in ATP7A and ATP7B respectively. This subfamily includes other copper-transporting ATPases such as: Bacillus subtilis CopA, Archaeoglobus fulgidus CopA, and Saccharomyces cerevisiae Ccc2p. This subclass of P-type ATPase is also referred to as CPx-type ATPases because their amino acid sequences contain a characteristic CPC or CPH motif associated with a stretch of hydrophobic amino acids and N-terminal ion-binding sequences. This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle.	647
259797	cd02106	SPFH_like	core domain of the SPFH (stomatin, prohibitin, flotillin, and HflK/C) superfamily. This model summarizes proteins similar to stomatin, prohibitin, flotillin, HflK/C (SPFH) and podocin. The conserved domain common to the SPFH superfamily has also been referred to as the Band 7 domain. Many superfamily members are associated with lipid rafts. Individual proteins of the SPFH superfamily may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Microdomains formed from flotillin proteins may in addition be dynamic units with their own regulatory functions. Flotillins have been implicated in signal transduction, vesicle trafficking, cytoskeleton rearrangement and are known to interact with a variety of proteins. Stomatin interacts with and regulates members of the degenerin/epithelial Na+ channel family in mechanosensory cells of Caenorhabditis elegans and vertebrate neurons, and participates in trafficking of Glut1 glucose transporters. Prohibitin may act as a chaperone for the stabilization of mitochondrial proteins. Prokaryotic HflK/C plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection. Flotillins have been implicated in the progression of prion disease, in the pathogenesis of neurodegenerative diseases such as Parkinson's and Alzheimer's disease, and in cancer invasion and metastasis. Mutations in the podocin gene give rise to autosomal recessive steroid-resistant nephrotic syndrome.	110
239025	cd02107	YedY_like_Moco	YedY_like molybdopterin cofactor (Moco) binding domain, a subgroup of the sulfite oxidase (SO) family of molybdopterin binding domains. Escherichia coli YedY has been proposed to form a heterodimer, consisting of a soluble catalytic subunit termed YedY, which is likely membrane-anchored by a heme-containing trans-membrane subunit YedZ. Preliminary results indicate that YedY may represent a new type of membrane-associated bacterial reductase. Common features of all known members of this family are that they contain one single pterin cofactor and part of the coordination of the metal (Mo) is a cysteine ligand of the protein and that they catalyze the transfer of an oxygen to or from a lone pair of electrons on the substrate.	218
239026	cd02108	bact_SO_family_Moco	bacterial subgroup of the sulfite oxidase (SO) family of molybdopterin binding domains. This domain is found in a variety of oxidoreductases. Common features of all known members of this family, like sulfite oxidase and nitrite reductase, are that they contain one single pterin cofactor and part of the coordination of the metal (Mo) is a cysteine ligand of the protein and that they catalyze the transfer of an oxygen to or from a lone pair of electrons on the substrate. The specific function of this subgroup is unknown.	185
239027	cd02109	arch_bact_SO_family_Moco	bacterial and archaeal members of the sulfite oxidase (SO) family of molybdopterin binding domains. This molybdopterin cofactor (Moco) binding domain is found in a variety of oxidoreductases; the main members of this family are nitrate reductase (NR) and sulfite oxidase (SO). Common features of all known members of this family are that they contain one single pterin cofactor and part of the coordination of the metal (Mo) is a cysteine ligand of the protein and that they catalyze the transfer of an oxygen to or from a lone pair of electrons on the substrate. The specific function of this subgroup is unknown.	180
239028	cd02110	SO_family_Moco_dimer	Subgroup of sulfite oxidase (SO) family molybdopterin binding domains that contains conserved dimerization domain. This molybdopterin cofactor (Moco) binding domain is found in a variety of oxidoreductases, main members of this family are nitrate reductase (NR) and sulfite oxidase (SO). 	317
239029	cd02111	eukary_SO_Moco	molybdopterin binding domain of sulfite oxidase (SO). SO catalyzes the terminal reaction in the oxidative degradation of the sulfur-containing amino acids cysteine and methionine. Common features of all known members of the sulfite oxidase (SO) family of molybdopterin binding domains are that they contain one single pterin cofactor and part of the coordination of the metal (Mo) is a cysteine ligand of the protein and that they catalyze the transfer of an oxygen to or from a lone pair of electrons on the substrate.	365
239030	cd02112	eukary_NR_Moco	molybdopterin binding domain of eukaryotic nitrate reductase (NR). Assimilatory NRs catalyze the reduction of nitrate to nitrite which is subsequently converted to NH4+ by nitrite reductase. Eukaryotic assimilatory nitrate reductases are cytosolic homodimeric enzymes with three prosthetic groups, flavin adenine dinucleotide (FAD), cytochrome b557, and Mo cofactor, which are located in three functional domains. Common features of all known members of the sulfite oxidase (SO) family of molybdopterin binding domains are that they contain one single pterin cofactor and part of the coordination of the metal (Mo) is a cysteine ligand of the protein and that they catalyze the transfer of an oxygen to or from a lone pair of electrons on the substrate.	386
239031	cd02113	bact_SoxC_Moco	bacterial SoxC is a member of the sulfite oxidase (SO) family of molybdopterin binding domains. SoxC is involved in oxidation of sulfur compounds during chemolithotrophic growth. Together with SoxD, a small c-type heme containing subunit, it forms a heterotetrameric sulfite dehydrogenase. This molybdopterin cofactor (Moco) binding domain is found in a variety of oxidoreductases; the main members of this family are nitrate reductase (NR) and sulfite oxidase (SO). Common features of all known members of this family are that they contain one single pterin cofactor and part of the coordination of the metal (Mo) is a cysteine ligand of the protein and that they catalyze the transfer of an oxygen to or from a lone pair of electrons on the substrate.	326
239032	cd02114	bact_SorA_Moco	sulfite:cytochrome c oxidoreductase subunit A (SorA), molybdopterin binding domain. SorA is involved in oxidation of sulfur compounds during chemolithotrophic growth. Together with SorB, a small c-type heme containing subunit, it forms a heterodimer. It is a member of the sulfite oxidase (SO) family of molybdopterin binding domains. This molybdopterin cofactor (Moco) binding domain is found in a variety of oxidoreductases; the main members of this family are nitrate reductase (NR) and sulfite oxidase (SO). Common features of all known members of this family are that they contain one single pterin cofactor and part of the coordination of the metal (Mo) is a cysteine ligand of the protein and that they catalyze the transfer of an oxygen to or from a lone pair of electrons on the substrate.	367
239033	cd02115	AAK	Amino Acid Kinases (AAK) superfamily, catalytic domain; present in such enzymes like N-acetylglutamate kinase (NAGK), carbamate kinase (CK), aspartokinase (AK), glutamate-5-kinase (G5K) and UMP kinase (UMPK). The AAK superfamily includes kinases that phosphorylate a variety of amino acid substrates. These kinases catalyze the formation of phosphoric anhydrides, generally with a carboxylate, and use ATP as the source of the phosphoryl group; are involved in amino acid biosynthesis. Some of these kinases control the process via allosteric feed-back inhibition.	248
153139	cd02116	ACT	ACT domains are commonly involved in specifically binding an amino acid or other small ligand leading to regulation of the enzyme. Members of this CD belong to the superfamily of ACT regulatory domains. Pairs of ACT domains are commonly involved in specifically binding an amino acid or other small ligand leading to regulation of the enzyme. The ACT domain has been detected in a number of diverse proteins; some of these proteins are involved in amino acid and purine biosynthesis, phenylalanine hydroxylation, regulation of bacterial metabolism and transcription, and many remain to be characterized. ACT domain-containing enzymes involved in amino acid and purine synthesis are in many cases allosteric enzymes with complex regulation enforced by the binding of ligands. The ACT domain is commonly involved in the binding of a small regulatory molecule, such as the amino acids L-Ser and L-Phe in the case of D-3-phosphoglycerate dehydrogenase and the bifunctional chorismate mutase-prephenate dehydratase enzyme (P-protein), respectively. Aspartokinases typically consist of two C-terminal ACT domains in a tandem repeat, but the second ACT domain is inserted within the first, resulting in, what is normally the terminal beta strand of ACT2, formed from a region N-terminal of ACT1. ACT domain repeats have been shown to have nonequivalent ligand-binding sites with complex regulatory patterns such as those seen in the bifunctional enzyme, aspartokinase-homoserine dehydrogenase (ThrA). In other enzymes, such as phenylalanine hydroxylases, the ACT domain appears to function as a flexible small module providing allosteric regulation via transmission of conformational changes, these conformational changes are not necessarily initiated by regulatory ligand binding at the ACT domain itself. ACT domains are present either singularly, N- or C-terminal, or in pairs present C-terminal or between two catalytic domains. Unique to cyanobacteria are four ACT domains C-terminal to an aspartokinase domain. A few proteins are composed almost entirely of ACT domain repeats as seen in the four ACT domain protein, the ACR protein, found in higher plants; and the two ACT domain protein, the glycine cleavage system transcriptional repressor (GcvR) protein, found in some bacteria. Also seen are single ACT domain proteins similar to the Streptococcus pneumoniae ACT domain protein (uncharacterized pdb structure 1ZPV) found in both bacteria and archaea. Purportedly, the ACT domain is an evolutionarily mobile ligand binding regulatory module that has been fused to different enzymes at various times.	60
349761	cd02117	NifH-like	NifH family. This family contains the NifH (iron protein) of nitrogenase, the L subunit (BchL/ChlL) of the protochlorophyllide reductase, and the BchX subunit of the chlorophyllide reductase. Members of this family use energy from ATP hydrolysis and transfer electrons through a Fe4-S4 cluster to other subunits for substrate reduction.	266
239035	cd02120	PA_subtilisin_like	PA_subtilisin_like: Protease-associated domain containing subtilisin-like proteases. This group contains various PA domain-containing subtilisin-like proteases including melon cucumisin, Arabidopsis thaliana Ara12, a nodule specific serine protease from Alnus glutinosa ag12, members of the tomato P69 family, and tomato LeSBT2. These proteins belong to the peptidase S8 family. Cucumisin from the juice of melon fruits is a thermostable serine peptidase, with a broad substrate specificity for oligopeptides and proteins. A. thaliana Ara12 is a thermostable, extracellular serine protease, found chiefly in silique tissue and stem tissue. Ara12 is stimulated by Ca2+ ions. A. glutinosa ag12 is expressed at high levels in the nodules, and at low levels in the shoot tips; it is implicated in both symbiotic and non-symbiotic processes in plant development. The tomato P69 protease family is comprised of various protein isoforms of approximately 69KDa. These isoforms accumulate extracellularly. Some of the P69 genes are tightly regulated in a tissue specific fashion, and by environmental and developmental signals. For example: infection with avirulent bacteria activates transcription of the genes for the P69 B and C isoforms, the P69 E transcript was detected only in roots, and the P69F transcript only in hydathodes. The Tomato LeSBT2 subtilase transcript was not detected in flowers and roots, but was present in cotyledons and leaves. The significance of the PA domain to these proteins has not been ascertained. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate.	126
239036	cd02121	PA_GCPII_like	PA_GCPII_like: Protease-associated domain containing protein, glutamate carboxypeptidase II (GCPII)-like. This group contains various PA domain-containing proteins similar to GCPII including GCPIII (NAALADase2) and NAALADase L. These proteins belong to the peptidase M28 family. GCPII is also known as N-acetylated-alpha-linked acidic dipeptidase (NAALADase1), folate hydrolase or prostate-specific membrane antigen (PSMA). GCPII is found in various human tissues including prostate, small intestine, and the central nervous system. In the brain, GCPII is known as NAALADase1; it functions as a NAALADase hydrolyzing the neuropeptide N-acetyl-L-aspartyl-L-glutamate (alpha-NAAG), to release free glutamate. In the small intestine, GCPII releases the terminal glutamate from poly-gamma-glutamated folates. GCPII (PSMA) is a useful cancer marker; its expression is markedly increased in prostate cancer and in tumor-associated neovasculature. GCPIII hydrolyzes alpha-NAAG with a lower efficiency than does GCPII; NAALADase L is not able to hydrolyze alpha-NAAG. The GCPII PA domain (referred to as the apical domain) participates in substrate binding and may act as a protein-protein interaction domain.	220
239037	cd02122	PA_GRAIL_like	PA_GRAIL_like: Protease-associated (PA) domain GRAIL-like. This group includes PA domain containing E3 (ubiquitin ligases) similar to human GRAIL (gene related to anergy in lymphocytes) protein. Proteins in this group contain a C3H2C3 RING finger. E3 ubiquitin ligase is part of an enzymic cascade, the end result of which is the ubiquitination of proteins. In this cascade, E1 activates the ubiquitin, the activated ubiquitin is carried by E2, and E3 recognizes the acceptor protein as well as catalyzes the transfer of the activated ubiquitin from E2 to this acceptor. GRAIL, a transmembrane protein localized in the endosomes, controls the development of T cell clonal anergy, and may ubiquitinate membrane-associated targets for T cell activation. GRAIL1 is associated with, and regulated by, two isoforms of otubain 1 (the ubiquitin-specific protease). Additional E3s belonging to this group include human (h)Goliath and Xenopus GREUL1 (Goliath Related E3 Ubiquitin Ligase 1). hGoliath and GRAIL both have the property of self-ubiquitination. hGoliath is expressed in leukocytes; its expression and localization is not modified in leukemia. GREUL1 may play a role in the generation of anterior ectoderm. The significance of the PA domain to these proteins has not been ascertained. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate.	138
239038	cd02123	PA_C_RZF_like	PA_C_RZF_like: Protease-associated (PA) domain C_RZF-like. This group includes various PA domain-containing proteins similar to C-RZF (chicken embryo RING zinc finger) protein. These proteins contain a C3H2C3 RING finger. C-RZF is expressed in embryo cells and is restricted mainly to brain and heart; it is localized to both the nucleus and endosomes. Additional C3H2C3 RING finger proteins belonging to this group include Arabidopsis ReMembR-H2 protein and mouse sperizin. ReMembR-H2 is likely to be an integral membrane protein, and to traffic through the endosomal pathway. Sperizin is expressed in haploid germ cells and localized in the cytoplasm; it may participate in spermatogenesis. The significance of the PA domain to these proteins has not been ascertained. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate.	153
239039	cd02124	PA_PoS1_like	PA_PoS1_like: Protease-associated (PA) domain PoS1-like. This group includes various PA domain-containing proteins similar to Pleurotus ostreatus (Po)S1. PoS1, the main extracellular protease in P. ostreatus, is a subtilisin-like serine protease belonging to the peptidase S8 family. Ca2+ and Mn2+ both stimulate the protease activity of (Po)S1. Ca2+ protects PoS1 from autolysis. PoS1 is a monomeric glycoprotein, which may play a role in the regulation of laccases in lignin formation. (Po)S1 participates in the degradation of POXA1b, and in the activation of POXA3, (POXA1b and POXA3 are laccase isoenzymes), but its effect may be indirect. The significance of the PA domain to PoS1 has not been ascertained. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate.	129
239040	cd02125	PA_VSR	PA_VSR: Protease-associated (PA) domain-containing plant vacuolar sorting receptor (VSR). This group includes various PA domain-containing VSRs such as garden pea BP-80, pumpkin PV72, and various Arabidopsis VSRs including AtVSR1. In contrast to most eukaryotes, which only have one or two VSRs, plants have several. This may in part be a reflection of having a more complex vacuolar system with both lytic vacuoles and storage vacuoles. The lytic vacuole is thought to be equivalent to the mammalian lysosome and the yeast vacuole. Pea BP-80 is a type 1 transmembrane protein, involved in the targeting of proteins to the lytic vacuole; it has been suggested that this protein also mediates targeting to the storage vacuole. PV72 and AtVSR1 may mediate transport of seed storage proteins to protein storage vacuoles. The significance of the PA domain to VSRs has not been ascertained. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate.	127
239041	cd02126	PA_EDEM3_like	PA_EDEM3_like: protease-associated (PA) domain-containing EDEM3-like proteins. This group contains various PA domain-containing proteins similar to mouse EDEM3 (ER-degradation-enhancing mannosidase-like 3 protein). EDEM3 contains a region, similar to Class I alpha-mannosidases (glycosyl hydrolase family 47), N-terminal to the PA domain. EDEM3 accelerates glycoprotein ERAD (ER-associated degradation). In transfected mammalian cells, overexpression of EDEM3 enhances the mannose trimming from the N-glycans of a model misfolded protein [alpha1-antitrypsin null (Hong Kong)] as well as from total glycoproteins. Mannose trimming appears to be involved in the selection of ERAD substrates. EDEM3 has a different specificity of trimming than ER alpha-mannosidase 1. The significance of the PA domain to EDEM3 has not been ascertained. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate.	126
239042	cd02127	PA_hPAP21_like	PA_hPAP21_like: Protease-associated domain containing proteins like the human secreted glycoprotein hPAP21 (human protease-associated domain-containing protein, 21kDa). This group contains various PA domain-containing proteins similar to hPAP21. Complex N-glycosylation may be required for the secretion of hPAP21. The significance of the PA domain to hPAP21 has not been ascertained. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate.	118
239043	cd02128	PA_TfR	PA_TfR: Protease-associated domain containing proteins like transferrin receptor (TfR). This group contains various PA domain-containing proteins similar to human TfR1 and TfR2. TfR1 and TfR2 are type II membrane proteins, belonging to the peptidase M28 family. TfR1 is homodimeric, widely expressed, and a key player in the uptake of iron-loaded transferrin (Tf) into cells. The TfR1 homodimer binds two molecules of Tf and this complex is internalized. In addition to its role in iron uptake, TfR1 may participate in cell growth and proliferation. TfR2 also binds Tf but with a significantly lower affinity than does TfR1. TfR2 is expressed chiefly in hepatocytes, hematopoietic cells, and duodenal crypt cells; its expression overlaps with that of hereditary hemochromatosis protein (HFE). TfR2 is involved in iron homeostasis. HFE and TfR2 interact in cells. By one model for serum iron sensing, at low or basal iron concentrations, HFE and TfR1 form a complex at the plasma membrane; at increased Tf, Tf competes with HFE for binding of TfR1, resulting in HFE dissociating from TfR1 and associating with TfR2. The TfR1-TfR2 association might initiate a signal cascade leading to the induction of hepcidin (a small peptide hormone that controls systemic iron levels). Human mutations in TfR2 are associated with a form of hemochromatosis (HFE3). The significance of the PA domain to TfRs has not been ascertained. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate.	183
239044	cd02129	PA_hSPPL_like	PA_hSPPL_like: Protease-associated domain containing human signal peptide peptidase-like (hSPPL)-like. This group contains various PA domain-containing proteins similar to hSPPL2a and 2b. These SPPLs are GxGD aspartic proteases. SPPL2a is sorted to the late endosomes, SPPL2b to the plasma membrane. In activated dendritic cells, hSPPL2a and 2b catalyze the intramembrane proteolysis of tumor necrosis factor alpha triggering IL-12 production. hSPPL2a and 2b may have a broad substrate spectrum. The significance of the PA domain to these SPPLs has not been ascertained. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate.	120
239045	cd02130	PA_ScAPY_like	PA_ScAPY_like: Protease-associated domain containing proteins like Saccharomyces cerevisiae aminopeptidase Y (ScAPY). This group contains various PA domain-containing proteins similar to the S. cerevisiae APY, including Trichophyton rubrum leucine aminopeptidase 1 (LAP1). Proteins in this group belong to the peptidase M28 family. ScAPY hydrolyzes amino acid-4-methylcoumaryl-7-amides (MCAs). ScAPY more rapidly hydrolyzes dipeptidyl-MCAs. Hydrolysis of amino acid-MCAs or dipeptides is stimulated by Co2+ while the hydrolysis of dipeptidyl-MCAs, tripeptides, and longer peptides is inhibited by Co2+. ScAPY is vacuolar and is activated by proteolytic processing. LAP1 is a secreted leucine aminopeptidase. The significance of the PA domain to these proteins has not been ascertained. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate.	122
239046	cd02131	PA_hNAALADL2_like	PA_hNAALADL2_like: Protease-associated domain containing proteins like human N-acetylated alpha-linked acidic dipeptidase-like 2 protein (hNAALADL2). This group contains various PA domain-containing proteins similar to hNAALADL2. The function of hNAALADL2 is unknown. This gene has been mapped to a chromosomal region associated with Cornelia de Lange syndrome. The significance of the PA domain to hNAALADL2 has not been ascertained. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate.	153
239047	cd02132	PA_GO-like	PA_GO-like: Protease-associated domain containing proteins like Arabidopsis thaliana growth-on protein GRO10. This group contains various PA domain-containing proteins similar to the functionally uncharacterized Arabidopsis GRO10. The PA domain may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate.	139
239048	cd02133	PA_C5a_like	PA_C5a_like: Protease-associated domain containing proteins like Streptococcus pyogenes C5a peptidase. This group contains various PA domain-containing proteins similar to S. pyogenes C5a, including, i) Vpr, a minor extracellular serine protease from Bacillus subtilis, ii) a large molecular mass collagenolytic protease from Geobacillus collagenovorans MO-1, and iii) PrtS, a cell envelope protease from Streptococcus thermophilus CNRZ 385. Proteins in this group belong to the peptidase S8 family. C5a peptidase is a cell surface serine protease which specifically inactivates C5a [a chemotactic peptide, which attracts polymorphonuclear leukocytes (PMNs)], by cleaving it to release a 7-residue carboxy-terminal fragment which contains the PMN binding site. The significance of the PA domain to these proteins has not been ascertained. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate.	143
411779	cd02134	KH-II_NusA_rpt1	first type II K-homology (KH) RNA-binding domain found in transcription termination/antitermination protein NusA and similar proteins. NusA, also called N utilization substance protein A or transcription termination/antitermination L factor, is an essential multifunctional transcription elongation factor that participates in both transcription termination and antitermination. NusA anti-termination function plays an important role in the expression of ribosomal rrn operons. During transcription of many other genes, NusA-induced RNA polymerase pausing provides a mechanism for synchronizing transcription and translation. In prokaryotes, the N-terminal RNA polymerase-binding domain (NTD) is connected through a flexible hinge helix to three globular domains, the S1 and two K-homology (KH), KH1 and KH2. The KH domains of NusA belong to the type II KH RNA-binding domain superfamily. This model corresponds to the first KH domain of NusA and similar proteins.	76
380312	cd02135	YdjA-like	nitroreductase family protein similar to Escherichia coli YdjA. A subfamily of the nitroreductase family containing uncharacterized proteins that are similar to nitroreductase YdjA from Escherichia coli. Nitroreductase catalyzes the reduction of nitroaromatic compounds such as nitrotoluenes, nitrofurans and nitroimidazoles. This process requires NAD(P)H as electron donor in an obligatory two-electron transfer and uses FMN as cofactor.  The enzyme is typically a homodimer. Members of this family are also called NADH dehydrogenase, oxygen-insensitive NAD(P)H nitrogenase or dihydropteridine reductase.	162
380313	cd02136	PnbA_NfnB-like	nitroreductase similar to Mycobacterium smegmatis NfnB. Members of this family utilize FMN as a cofactor and catalyze reduction of a variety of nitroaromatic compounds, including nitrofurans, nitrobenzenes, nitrophenol, nitrobenzoate and quinones by using either NADH or NADPH as a source of reducing equivalents in an obligatory two-electron transfer mechanism. The enzyme is typically a homodimer. Mycobacterium smegmatis nitroreductase NfnB plays a role in resistance to benzothiazinone.	152
380314	cd02137	MhqN-like	nitroreductase family protein similar to the NAD(P)H nitroreductase MhqN. A diverse subfamily of the nitroreductase family containing uncharacterized proteins; includes nitroreductases MhqN, YodC, YdgI, DrgA. Nitroreductase catalyzes the reduction of nitroaromatic compounds such as nitrotoluenes, nitrofurans and nitroimidazoles. This process requires NAD(P)H as electron donor in an obligatory two-electron transfer and uses FMN as cofactor.  The enzyme is typically a homodimer.	147
380315	cd02138	TdsD-like	nitroreductase similar to Burkholderia pseudomallei TdsD. A subfamily of the nitroreductase family containing uncharacterized proteins that are similar to Burkholderia pseudomallei TdsD, may be involved in the processing of organosulfur compounds. Nitroreductase catalyzes the reduction of nitroaromatic compounds such as nitrotoluenes, nitrofurans and nitroimidazoles. This process requires NAD(P)H as electron donor in an obligatory two-electron transfer and uses FMN as cofactor.  The enzyme is typically a homodimer.	174
380316	cd02139	nitroreductase	nitroreductase family protein. A subfamily of the nitroreductase family containing uncharacterized proteins. Nitroreductase catalyzes the reduction of nitroaromatic compounds such as nitrotoluenes, nitrofurans and nitroimidazoles. This process requires NAD(P)H as electron donor in an obligatory two-electron transfer and uses FMN as cofactor.  The enzyme is typically a homodimer.	165
380317	cd02140	Frm2-like	nitroreductase family protein. A subfamily of the nitroreductase family containing uncharacterized proteins that are similar to nitroreductase. Nitroreductase catalyzes the reduction of nitroaromatic compounds such as nitrotoluenes, nitrofurans and nitroimidazoles. This process requires NAD(P)H as electron donor in an obligatory two-electron transfer and uses FMN as cofactor. The enzyme is typically a homodimer. Members of this family are also called NADH dehydrogenase, oxygen-insensitive NAD(P)H nitrogenase or dihydropteridine reductase.	192
380318	cd02142	McbC_SagB-like_oxidoreductase	oxidase similar to the microcin B17 processing protein McbC. This family is the oxidase domain of NRPS (non-ribosomal peptide synthetase) and other systems that modify polypeptides by cyclizing a thioester to form a ring. These include EpoB, part of the epothilone biosynthesis pathway; TubD, part of the tubulysin biosynthesis pathway; MtsD, part of the myxothiazol biosynthesis pathway; IndC, part of the indigoidine biosynthesis pathway; and TfxB, part of the trifolitoxin processing pathway. All are FMN-dependent and oxidize the product of the cyclization of thioesters in short polypeptides.	200
380319	cd02143	nitroreductase_FeS-like	nitroreductases with an N-terminal iron-sulfur cluster-binding domain. Members of this family utilize FMN as a cofactor. This family may be involved in the reduction of flavin or nitroaromatic compounds via an obligatory two-electron transfer. The nitroreductase is a homodimer. Each subunit contains one FMN molecule.	187
380320	cd02144	iodotyrosine_dehalogenase	iodotyrosine dehalogenase. Iodotyrosine dehalogenase catalyzes the removal of iodine from the 3 and 5 positions of L-tyrosine in thyroid, liver and kidney, using NADPH as electron donor. This enzyme is a homolog of the nitroreductase family. These enzymes are usually homodimers.	192
380321	cd02145	BluB	5,6-dimethylbenzimidazole synthase. BluB catalyzes the O2-dependent conversion of FMNH2 to 5,6-dimethylbenzimidazole (DMB), a component of vitamin B12. It is a subfamily of the nitroreductase family; nitroreductases typically reduce their substrates by using NAD(P)H as electron donor and often use FMN as a cofactor.	196
380322	cd02146	NfsA-like	nitroreductase similar to Escherichia coli NfsA. This family contains NADPH-dependent flavin reductase and oxygen-insensitive nitroreductase. These enzymes are homodimeric flavoproteins that contain one FMN per monomer as a cofactor. Flavin reductase catalyzes the reduction of flavin by using NADPH as an electron donor. Oxygen-insensitive nitroreductase, such as NfsA protein in Escherichia coli, catalyzes reduction of nitrocompounds using NADPH as electron donor.	229
380323	cd02148	RutE-like	nitroreductase similar to Escherichia coli RutE. A subfamily of the nitroreductase family containing uncharacterized proteins that are similar to nitroreductase. Nitroreductase catalyzes the reduction of nitroaromatic compounds such as nitrotoluenes, nitrofurans and nitroimidazoles. This process requires NAD(P)H as electron donor in an obligatory two-electron transfer and uses FMN as cofactor. The enzyme is typically a homodimer. RutE is involved in the utilization of uracil as the sole nitrogen source; it appears to have the same function as YdfG, which reduces malonic semialdehyde to 3-hydroxypropionic acid.	186
380324	cd02149	NfsB-like	nitroreductase similar to Escherichia coli NfsB. NAD(P)H:FMN oxidoreductase family. This domain catalyzes the reduction of flavin, nitrocompounds, quinones and azo compounds using NADH or NADPH as an electron donor. The enzyme is a homodimer, and each monomer binds an FMN as cofactor. This family includes FRase I in Vibrio fischeri, which reduces FMN into FMNH2 as part of the bioluminescent reaction. The family also includes oxygen-insensitive nitroreductases that use NADH or NADPH as an electron donor in the ping pong bi bi mechanism. This type of nitroreductase can be used in cancer chemotherapy to activate a range of prodrugs.	156
380325	cd02150	nitroreductase	nitroreductase family protein. A subfamily of the nitroreductase family containing uncharacterized proteins. Nitroreductase catalyzes the reduction of nitroaromatic compounds such as nitrotoluenes, nitrofurans and nitroimidazoles. This process requires NAD(P)H as electron donor in an obligatory two-electron transfer and uses FMN as cofactor.  The enzyme is typically a homodimer.	156
380326	cd02151	nitroreductase	nitroreductase family protein. A subfamily of the nitroreductase family containing uncharacterized proteins. Nitroreductase catalyzes the reduction of nitroaromatic compounds such as nitrotoluenes, nitrofurans and nitroimidazoles. This process requires NAD(P)H as electron donor in an obligatory two-electron transfer and uses FMN as cofactor.  The enzyme is typically a homodimer.	157
239065	cd02152	OAT	Ornithine acetyltransferase (OAT) family; also referred to as ArgJ. OAT catalyzes the first and fifth steps in arginine biosynthesis, coupling acetylation of glutamate with deacetylation of N-acetylornithine, which allows recycling of the acetyl group in the arginine biosynthetic pathway. Members of this family may experience feedback inhibition by L-arginine. The active enzyme is a heterotetramer of two alpha and two beta chains, where the alpha and beta chains are the result of autocatalytic cleavage. OATs found in the clavulanic acid biosynthesis gene cluster catalyze the fifth step only, and may utilize acetyl acceptors other than glutamate.	390
239066	cd02153	tRNA_bindingDomain	The tRNA binding domain is also known as the Myf domain in literature. This domain is found in a diverse collection of tRNA binding proteins, including prokaryotic phenylalanyl tRNA synthetases (PheRS), methionyl-tRNA synthetases (MetRS), human tyrosyl-tRNA synthetase (hTyrRS), Saccharomyces cerevisiae Arc1p, Thermus thermophilus CsaA, Aquifex aeolicus Trbp111, human p43 and human EMAP-II. PheRS, MetRS and hTyrRS aminoacylate their cognate tRNAs.  Arc1p is a transactivator of yeast methionyl-tRNA and glutamyl-tRNA synthetases.  The molecular chaperones Trbp111 and CsaA also contain this domain.  CsaA has export-related activities; Trbp111 is structure-specific, recognizing the L-shape of the tRNA fold. This domain has general tRNA binding properties.  In a subset of this family this domain has the added capability of a cytokine. For example, the p43 component of the human aminoacyl-tRNA synthetase complex is cleaved to release EMAP-II cytokine. EMAP-II has multiple activities during apoptosis, angiogenesis and inflammation and participates in malignant transformation. An EMAP-II-like cytokine is released from hTyrRS upon cleavage. The active cytokine heptapeptide locates to this domain. For homodimeric members of this group, which include CsaA, Trbp111 and Escherichia coli MetRS, this domain acts as a dimerization domain.	99
173912	cd02156	nt_trans	nucleotidyl transferase superfamily. nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain.	105
173914	cd02163	PPAT	Phosphopantetheine adenylyltransferase. Phosphopantetheine adenylyltransferase (PPAT). PPAT is an essential enzyme in bacteria, responsible for catalyzing the rate-limiting step in coenzyme A (CoA) biosynthesis.  The dinucleotide-binding fold of PPAT is homologous to class I aminoacyl-tRNA synthetases. CoA has been shown to inhibit PPAT and competes with ATP, PhP, and dPCoA. PPAT is a homohexamer in E. coli.	153
173915	cd02164	PPAT_CoAS	phosphopantetheine adenylyltransferase domain of eukaryotic and archaeal bifunctional enzymes. The PPAT domain of the bifunctional enzyme with PPAT and DPCK functions. The final two steps of the CoA biosynthesis pathway are catalyzed by phosphopantetheine adenylyltransferase (PPAT) and dephospho-CoA (dPCoA) kinase (DPCK). The PPAT reaction involves the reversible adenylation of 4'-phosphopantetheine to form 3'-dPCoA and PPi, and DPCK catalyses phosphorylation of the 3'-hydroxy group of the ribose moiety of dPCoA.  In eukaryotes the two enzymes are part of a large multienzyme complex. Studies in Corynebacterium ammoniagenes suggested that separate enzymes were present, and this was confirmed through identification of the bacterial PPAT/CoAD.	143
185680	cd02165	NMNAT	Nicotinamide/nicotinate mononucleotide adenylyltransferase. Nicotinamide/nicotinate mononucleotide (NMN/NaMN) adenylyltransferase (NMNAT).  NMNAT represents the primary bacterial and eukaryotic adenylyltransferases for nicotinamide-nucleotide and for the deamido form, nicotinate nucleotide.  It is an indispensable enzyme in the biosynthesis of NAD(+) and NADP(+). Nicotinamide-nucleotide adenylyltransferase synthesizes NAD via the salvage pathway, while nicotinate-nucleotide adenylyltransferase synthesizes the immediate precursor of NAD via the de novo pathway. Human NMNAT displays unique dual substrate specificity toward both NMN and NaMN, and can participate in both de novo and salvage pathways of NAD synthesis.	192
173917	cd02166	NMNAT_Archaea	Nicotinamide/nicotinate mononucleotide adenylyltransferase, archaeal. This family of archaeal proteins exhibits nicotinamide-nucleotide adenylyltransferase (NMNAT) activity utilizing the salvage pathway to synthesize NAD. In some cases, the enzyme was tested and found also to have the activity of nicotinate-nucleotide adenylyltransferase, an enzyme of NAD de novo biosynthesis, although with a higher Km. In some archaeal species, a number of proteins that are uncharacterized with respect to activity are also present.	163
173918	cd02167	NMNAT_NadR	Nicotinamide/nicotinate mononucleotide adenylyltransferase of bifunctional NadR-like proteins. NMNAT domain of NadR protein. The NadR protein (NadR) is a bifunctional enzyme possessing both NMN adenylyltransferase (NMNAT) and ribosylnicotinamide kinase (RNK) activities. Its function is essential for the growth and survival of H. influenzae and thus may present a new highly specific anti-infectious drug target. The N-terminal domain that hosts the NMNAT activity is closely related to archaeal NMNAT. The bound NAD at the active site of the NMNAT domain reveals several critical interactions between NAD and the protein. The NMNAT domain of hiNadR defines yet another member of the pyridine nucleotide adenylyltransferase	158
173919	cd02168	NMNAT_Nudix	Nicotinamide/nicotinate mononucleotide adenylyltransferase of bifunctional proteins, also containing a Nudix hydrolase domain. N-terminal NMNAT (Nicotinamide/nicotinate mononucleotide adenylyltransferase) domain of a novel bifunctional enzyme endowed with NMN adenylyltransferase and Nudix hydrolase activities.  This domain is highly homologous to the archaeal NMN adenylyltransferase that catalyzes NAD synthesis from NMN and ATP.  NMNAT is an essential enzyme in the biosynthesis of NAD(+) and NADP(+). Nicotinamide-nucleotide adenylyltransferase synthesizes NAD via the salvage pathway, while nicotinate-nucleotide adenylyltransferase synthesizes the immediate precursor of NAD via the de novo pathway.  The C-terminal domain of this enzyme shares homology with the archaeal ADP-ribose pyrophosphatase, a member of the 'Nudix' hydrolase family.	181
173920	cd02169	Citrate_lyase_ligase	Citrate lyase ligase. Citrate lyase ligase, also known as [Citrate (pro-3S)-lyase] ligase, is responsible for acetylation of the (2-(5''-phosphoribosyl)-3'-dephosphocoenzyme-A) prosthetic group of the gamma subunit of citrate lyase, converting the inactive thiol form of this enzyme to the active form. The acetylation of 1 molecule of deacetyl-citrate lyase to enzymatically active citrate lyase requires 6 molecules of ATP. The adenylyltransferase activity of the enzyme involves the formation of AMP and pyrophosphate in the acetylation reaction.	297
173921	cd02170	cytidylyltransferase	cytidylyltransferase. The cytidylyltransferase family includes cholinephosphate cytidylyltransferase (CCT), glycerol-3-phosphate cytidylyltransferase, RafE and  phosphoethanolamine cytidylyltransferase (ECT). All enzymes catalyze the transfer of a cytidylyl group from CTP to various substrates.	136
173922	cd02171	G3P_Cytidylyltransferase	glycerol-3-phosphate cytidylyltransferase. Glycerol-3-phosphate cytidylyltransferase,(CDP-glycerol pyrophosphorylase). Glycerol-3-phosphate cytidyltransferase acts in pathways of teichoic acid biosynthesis. Teichoic acids are substituted polymers, linked by phosphodiester bonds, of glycerol, ribitol, etc. An example is poly(glycerol phosphate), the major teichoic acid of the Bacillus subtilis cell wall. Most, but not all, species encoding proteins in this family are Gram-positive bacteria.  A closely related protein assigned a different function experimentally is a human ethanolamine-phosphate cytidylyltransferase.	129
173923	cd02172	RfaE_N	N-terminal domain of RfaE. RfaE is a protein involved in the biosynthesis of ADP-L-glycero-D-manno-heptose, a precursor for LPS inner core biosynthesis. RfaE is a bifunctional protein in Escherichia coli, and separate proteins in other organisms. Domain I is suggested to act in D-glycero-D-manno-heptose 1-phosphate biosynthesis, while domain II (this family) adds ADP to yield ADP-D-glycero-D-manno-heptose.	144
173924	cd02173	ECT	CTP:phosphoethanolamine cytidylyltransferase (ECT). CTP:phosphoethanolamine cytidylyltransferase (ECT) catalyzes the conversion of phosphoethanolamine to CDP-ethanolamine as part of the CDP-ethanolamine biosynthesis pathway.  ECT expression in hepatocytes is localized predominantly to areas of the cytoplasm that are rich in rough endoplasmic reticulum. Several ECTs, including yeast and human ECT, have large repetitive sequences located within their N- and C-termini.	152
173925	cd02174	CCT	CTP:phosphocholine cytidylyltransferase. CTP:phosphocholine cytidylyltransferase (CCT) catalyzes the condensation of CTP and phosphocholine to form CDP-choline as the rate-limiting and regulatory step in the CDP-choline pathway. CCT is unique in that its enzymatic activity is regulated by the extent of its association with membrane structures. A current model posits that the elastic stress of the bilayer curvature is sensed by CCT and this governs the degree of membrane association, thus providing a mechanism for both positive and negative regulation of activity.	150
185684	cd02175	GH16_lichenase	lichenase, member of glycosyl hydrolase family 16. Lichenase, also known as 1,3-1,4-beta-glucanase, is a member of glycosyl hydrolase family 16 that specifically cleaves 1,4-beta-D-glucosidic bonds in mixed-linked beta glucans that also contain 1,3-beta-D-glucosidic linkages.  Natural substrates of beta-glucanase are beta-glucans from grain endosperm cell walls or lichenan from the Iceland moss, Cetraria islandica.  This protein is found not only in bacteria but also in anaerobic fungi.  This domain includes two seven-stranded antiparallel beta-sheets that are adjacent to one another forming a compact, jellyroll beta-sandwich structure.	212
185685	cd02176	GH16_XET	Xyloglucan endotransglycosylase, member of glycosyl hydrolase family 16. Xyloglucan endotransglycosylases (XETs) cleave and religate xyloglucan polymers in plant cell walls via a transglycosylation mechanism. Xyloglucan is a soluble hemicellulose with a backbone of beta-1,4-linked glucose units, partially substituted with alpha-1,6-linked xylopyranose branches. It binds noncovalently to cellulose, cross-linking the adjacent cellulose microfibrils, giving it a key structural role as a matrix polymer. Therefore, XET plays an important role in all plant processes that require cell wall remodeling.	263
185686	cd02177	GH16_kappa_carrageenase	Kappa-carrageenase, member of glycosyl hydrolase family 16. Kappa-carrageenase is a glycosyl hydrolase family 16 (GH16) member that hydrolyzes the internal beta-1,4-linkage of kappa-carrageenans, a hydrophilic polysaccharide found in the cell wall of Rhodophyceae, marine red algae. Carrageenans are linear chains of galactose units linked by alternating D-alpha-1,3- and D-beta-1,4-linkages that are additionally modified by a 3,6-anhydro-bridge. Depending on the position and number of sulfate ester modifications they are subdivided into kappa-, iota-, and lambda-carrageenans, kappa being modified once. Carrageenans form thermo-reversible gels widely used for industrial applications. Kappa-carrageenases exist in bacteria belonging to at least three phylogenetically distant branches, including Pseudoalteromonas, Planctomycetes, and Bacteroidetes.  This domain adopts a curved beta-sandwich conformation, with a tunnel-shaped active site cavity, referred to as a jellyroll fold.	269
185687	cd02178	GH16_beta_agarase	Beta-agarase, member of glycosyl hydrolase family 16. Beta-agarase is a glycosyl hydrolase family 16 (GH16) member that hydrolyzes the internal beta-1,4-linkage of agarose, a hydrophilic polysaccharide found in the cell wall of Rhodophyceae, marine red algae. Agarose is a linear chain of galactose units linked by alternating L-alpha-1,3- and D-beta-1,4-linkages that are additionally modified by a 3,6-anhydro-bridge. Agarose forms thermo-reversible gels that are widely used in the food industry or as a laboratory medium. While beta-agarases are also found in two other families derived from the sequence-based classification of glycosyl hydrolases (GH50 and GH86), the GH16 members are most abundant.  This domain adopts a curved beta-sandwich conformation, with a tunnel-shaped active site cavity, referred to as a jellyroll fold.	258
185688	cd02179	GH16_beta_GRP	beta-1,3-glucan recognition protein, member of glycosyl hydrolase family 16. Beta-GRP (beta-1,3-glucan recognition protein) is one of several pattern recognition receptors (PRRs), also referred to as biosensor proteins, that complexes with pathogen-associated beta-1,3-glucans and then transduces signals necessary for activation of an appropriate innate immune response. They are present in insects and lack all catalytic residues. This subgroup also contains related proteins of unknown function that still contain the active site. Their structures adopt a jelly roll fold with a deep active site channel harboring the catalytic residues, like those of other glycosyl hydrolase family 16 members.	321
185689	cd02180	GH16_fungal_KRE6_glucanase	Saccharomyces cerevisiae KRE6 and related glucanases, member of glycosyl hydrolase family 16. KRE6 is a Saccharomyces cerevisiae glucanase that participates in the synthesis of beta-1,6-glucan, a major structural component of the cell wall.  It is a Golgi membrane protein required for normal beta-1,6-glucan levels in the cell wall.  KRE6 is closely related to laminarinase, a glycosyl hydrolase family 16 member that hydrolyzes 1,3-beta-D-glucosidic linkages in 1,3-beta-D-glucans such as laminarins, curdlans, paramylons, and pachymans, with very limited action on mixed-link (1,3-1,4-)-beta-D-glucans.	295
185690	cd02181	GH16_fungal_Lam16A_glucanase	fungal 1,3(4)-beta-D-glucanases, similar to Phanerochaete chrysosporium laminarinase 16A. Group of fungal 1,3(4)-beta-D-glucanases, similar to Phanerochaete chrysosporium laminarinase 16A. Lam16A belongs to the 'nonspecific' 1,3(4)-beta-glucanase subfamily, although beta-1,6 branching and beta-1,4 bonds specifically define where Lam16A hydrolyzes its substrates, like curdlan (beta-1,3-glucan), lichenin (beta-1,3-1,4-mixed linkage glucan), and laminarin (beta-1,6-branched-1,3-glucan).	293
185691	cd02182	GH16_Strep_laminarinase_like	Streptomyces laminarinase-like, member of glycosyl hydrolase family 16. Proteins similar to Streptomyces sioyaensis beta-1,3-glucanase (laminarinase) present in Actinomycetales as well as Pezizomycotina. Laminarinases belong to glycosyl hydrolase family 16 and hydrolyze the glycosidic bond of the 1,3-beta-linked glucan, a major component of fungal and plant cell walls and the structural and storage polysaccharides (laminarin) of marine macro-algae. Members of the GH16 family have a conserved jelly roll fold with an active site channel.	259
185692	cd02183	GH16_fungal_CRH1_transglycosylase	glycosylphosphatidylinositol-glucanosyltransferase. Group of fungal GH16 members related to Saccharomyces cerevisiae Crh1p. Crh1p and Crh2p are transglycosylases that are required for the linkage of chitin to beta(1-3)glucose branches of beta(1-6)glucan, an important step in the assembly of new cell wall. Both have been shown to be glycosylphosphatidylinositol (GPI)-anchored. A third homologous protein, Crr1p, functions in the formation of the spore wall. They belong to family 16 of glycosyl hydrolases, which includes lichenase, xyloglucan endotransglycosylase (XET), beta-agarase, kappa-carrageenase, endo-beta-1,3-glucanase, endo-beta-1,3-1,4-glucanase, and endo-beta-galactosidase, all of which have a conserved jelly roll fold with a deep active site channel harboring the catalytic residues.	203
100026	cd02185	AroH	Chorismate mutase (AroH) is one of at least five chorismate-utilizing enzymes present in microorganisms that catalyze the rearrangement of chorismate to prephenic acid, the first committed step in the biosynthesis of aromatic amino acids. In prokaryotes, chorismate mutase may be fused to prephenate dehydratase, prephenate dehydrogenase, or 3-deoxy-D-arabino-heptulosonate-7-phosphate (DAHP) as part of a bifunctional enzyme.  The AroH domain forms a homotrimer with three-fold symmetry.	117
276955	cd02186	alpha_tubulin	The alpha-tubulin family. The tubulin superfamily includes five distinct families, the alpha-, beta-, gamma-, delta-, and epsilon-tubulins and a sixth family (zeta-tubulin) which is present only in kinetoplastid protozoa. The alpha- and beta-tubulins are the major components of microtubules, while gamma-tubulin plays a major role in the nucleation of microtubule assembly.  The delta- and epsilon-tubulins are widespread but unlike the alpha, beta, and gamma-tubulins they are not ubiquitous among eukaryotes. The alpha/beta-tubulin heterodimer is the structural subunit of microtubules.  The alpha- and beta-tubulins share 40% amino-acid sequence identity, exist in several isotype forms, and undergo a variety of posttranslational modifications.  The structures of alpha- and beta-tubulin are basically identical: each monomer is formed by a core of two beta-sheets surrounded by alpha-helices. The monomer structure is very compact, but can be divided into three regions based on function: the amino-terminal nucleotide-binding region, an intermediate taxol-binding region and the carboxy-terminal region which probably constitutes the binding surface for motor proteins.	434
276956	cd02187	beta_tubulin	The beta-tubulin family. The tubulin superfamily includes five distinct families, the alpha-, beta-, gamma-, delta-, and epsilon-tubulins and a sixth family (zeta-tubulin) which is present only in kinetoplastid protozoa. The alpha- and beta-tubulins are the major components of microtubules, while gamma-tubulin plays a major role in the nucleation of microtubule assembly.  The delta- and epsilon-tubulins are widespread but unlike the alpha, beta, and gamma-tubulins they are not ubiquitous among eukaryotes. The alpha/beta-tubulin heterodimer is the structural subunit of microtubules.  The alpha- and beta-tubulins share 40% amino-acid sequence identity, exist in several isotype forms, and undergo a variety of posttranslational modifications.  The structures of alpha- and beta-tubulin are basically identical: each monomer is formed by a core of two beta-sheets surrounded by alpha-helices. The monomer structure is very compact, but can be divided into three regions based on function: the amino-terminal nucleotide-binding region, an intermediate taxol-binding region and the carboxy-terminal region which probably constitutes the binding surface for motor proteins.	425
276957	cd02188	gamma_tubulin	The gamma-tubulin family. Gamma-tubulin is a ubiquitous phylogenetically conserved member of tubulin superfamily.  Gamma is a low abundance protein present within the cells in both various types of microtubule-organizing centers and cytoplasmic protein complexes.  Gamma-tubulin recruits the alpha/beta-tubulin dimers that form the minus ends of microtubules and is thought to be involved in microtubule nucleation and capping.	430
276958	cd02189	delta_zeta_tubulin-like	The delta- and zeta-tubulin families. The tubulin superfamily includes five distinct families, the alpha-, beta-, gamma-, delta-, and epsilon-tubulins and a sixth family (zeta-tubulin) which is present only in kinetoplastid protozoa. The alpha- and beta-tubulins are the major components of microtubules, while gamma-tubulin plays a major role in the nucleation of microtubule assembly.  The delta- and epsilon-tubulins are widespread but unlike the alpha, beta, and gamma-tubulins they are not ubiquitous among eukaryotes.  Delta-tubulin plays an essential role in forming the triplet microtubules of centrioles and basal bodies.	433
276959	cd02190	epsilon_tubulin	The epsilon-tubulin family. The tubulin superfamily includes five distinct families, the alpha-, beta-, gamma-, delta-, and epsilon-tubulins and a sixth family (zeta-tubulin) which is present only in kinetoplastid protozoa. The epsilon-tubulins which are widespread but not ubiquitous among eukaryotes play a role in basal body/centriole morphogenesis.	449
276960	cd02191	FtsZ_CetZ-like	Subfamily of FtsZ and Cell-structure-related euryarchaeota tubulin/FtsZ homolog-like. FtsZ is a GTPase that is similar to the eukaryotic tubulins and is essential for cell division in prokaryotes.  CetZ-like proteins are related to tubulin and FtsZ and co-exist with FtsZ in many archaea. However, a recent study found that CetZ proteins (formerly annotated FtsZ type 2) are not required for cell division. Instead, CetZ proteins are shown to be involved in controlling archaeal cell shape dynamics.  The results from inactivation studies of CetZ proteins in Haloferax volcanii suggest that CetZ1 is essential for normal swimming motility and rod-cell development.	308
100028	cd02192	PurM-like3	AIR synthase (PurM) related protein, subgroup 3 of unknown function. The family of PurM related proteins includes Hydrogen expression/formation protein HypE, AIR synthases, FGAM synthase and Selenophosphate synthetase (SelD). They all contain two conserved domains and seem to dimerize. The N-terminal domain forms the dimer interface and is a putative ATP binding domain.	283
100029	cd02193	PurL	Formylglycinamide ribonucleotide amidotransferase (FGAR-AT) catalyzes the ATP-dependent conversion of formylglycinamide ribonucleotide (FGAR) and glutamine to formylglycinamidine ribonucleotide (FGAM), ADP, phosphate, and glutamate in the fourth step of the purine biosynthetic pathway. In eukaryotes and Gram-negative bacteria, FGAR-AT is encoded by the purL gene as a multidomain protein with a molecular mass of about 140 kDa. In Gram-positive bacteria and archaea FGAR-AT is a complex of three proteins: PurS, PurL, and PurQ. PurL itself contains two tandem N- and C-terminal domains (four domains altogether).  The N-terminal domains bind ATP and are related to the ATP-binding domains of HypE, ThiL, SelD and PurM.	272
100030	cd02194	ThiL	ThiL (Thiamine-monophosphate kinase) plays a dual role in de novo biosynthesis and in salvage of exogenous thiamine. Thiamine salvage occurs in two steps, with thiamine kinase catalyzing the formation of thiamine phosphate, and ThiL catalyzing the conversion of this intermediate to thiamine pyrophosphate. The N-terminal domain of ThiL binds ATP and is related to the ATP-binding domains of hydrogen expression/formation protein HypE, the AIR synthases, FGAM synthase and selenophosphate synthetase (SelD).	291
100031	cd02195	SelD	Selenophosphate synthetase  (SelD) catalyzes the conversion of selenium to selenophosphate which is required by a number of bacterial, archaeal and eukaryotic organisms for synthesis of Secys-tRNA, the precursor of selenocysteine in selenoenzymes. The N-terminal domain of SelD is related to the ATP-binding domains of hydrogen expression/formation protein HypE, the AIR synthases, and FGAM synthase and is thought to bind ATP.	287
100032	cd02196	PurM	PurM (Aminoimidazole Ribonucleotide [AIR] synthetase), one of eleven enzymes required for purine biosynthesis, catalyzes the conversion of formylglycinamidine ribonucleotide (FGAM) and ATP to AIR, ADP, and Pi, the fifth step in de novo purine biosynthesis. The N-terminal domain of PurM is related to the ATP-binding domains of hydrogen expression/formation protein HypE, the AIR synthases, selenophosphate synthetase (SelD), and FGAM synthase and is thought to bind ATP.	297
100033	cd02197	HypE	HypE (Hydrogenase expression/formation protein). HypE is involved in Ni-Fe hydrogenase biosynthesis.  HypE dehydrates its own carbamoyl moiety in an ATP-dependent process to yield the enzyme thiocyanate. The N-terminal domain of HypE is related to the ATP-binding domains of the AIR synthases, selenophosphate synthetase (SelD), and FGAM synthase and is thought to bind ATP.	293
100005	cd02198	YjgH_like	YjgH belongs to a large family of YjgF/YER057c/UK114-like proteins present in bacteria, archaea, and eukaryotes with no definitive function. The conserved domain is similar in structure to chorismate mutase but there is no sequence similarity and no functional connection. Members of this family have been implicated in isoleucine (Yeo7, Ibm1, aldR) and purine (YjgF) biosynthesis, as well as threonine anaerobic degradation (tdcF) and mitochondrial DNA maintenance (Ibm1). This domain homotrimerizes forming a distinct intersubunit cavity that may serve as a small molecule binding site.	111
100006	cd02199	YjgF_YER057c_UK114_like_1	This group of proteins belongs to a large family of YjgF/YER057c/UK114-like proteins present in bacteria, archaea, and eukaryotes with no definitive function.  The conserved domain is similar in structure to chorismate mutase but there is no sequence similarity and no functional connection. Members of this family have been implicated in isoleucine (Yeo7, Ibm1, aldR) and purine (YjgF) biosynthesis, as well as threonine anaerobic degradation (tdcF) and mitochondrial DNA maintenance (Ibm1). This domain homotrimerizes forming a distinct intersubunit cavity that may serve as a small molecule binding site.	142
276961	cd02201	FtsZ_type1	Filamenting temperature sensitive mutant Z, type 1. FtsZ is a GTPase that is similar to the eukaryotic tubulins and is essential for cell division in prokaryotes.  FtsZ is capable of polymerizing in a GTP-driven process into structures similar to those formed by tubulin. FtsZ forms a ring-shaped septum at the site of bacterial cell division, which is required for constriction of cell membrane and cell envelope to yield two daughter cells.	303
276962	cd02202	CetZ_tubulin-like	Cell-structure-related euryarchaeota tubulin/FtsZ homologs. CetZ proteins comprise a distinct tubulin/FtsZ family. The crystal structures of CetZ contain the FtsZ/tubulin superfamily fold and its family members have mosaic of tubulin-like and FtsZ-like amino acid residues. However, a recent study found that CetZ proteins (formerly annotated FtsZ type 2) are not required for cell division, whereas FtsZ proteins play an important role. Instead, CetZ proteins are shown to be involved in controlling archaeal cell shape dynamics. The results from inactivation studies of CetZ proteins in Haloferax volcanii suggest that CetZ1 is essential for normal swimming motility and rod-cell development.	357
100034	cd02203	PurL_repeat1	PurL subunit of the formylglycinamide ribonucleotide amidotransferase (FGAR-AT), first repeat. FGAR-AT catalyzes the ATP-dependent conversion of formylglycinamide ribonucleotide (FGAR) and glutamine to formylglycinamidine ribonucleotide (FGAM), ADP, phosphate, and glutamate in the fourth step of the purine biosynthetic pathway. In eukaryotes and Gram-negative bacteria, FGAR-AT is encoded by the purL gene as a multidomain protein with a molecular mass of about 140 kDa. In Gram-positive bacteria and archaea FGAR-AT is a complex of three proteins: PurS, PurL, and PurQ. PurL itself contains two tandem N- and C-terminal domains (four domains altogether). The N-terminal domains bind ATP and are related to the ATP-binding domains of HypE, ThiL, SelD and PurM.	313
100035	cd02204	PurL_repeat2	PurL subunit of the formylglycinamide ribonucleotide amidotransferase (FGAR-AT), second repeat. FGAR-AT catalyzes the ATP-dependent conversion of formylglycinamide ribonucleotide (FGAR) and glutamine to formylglycinamidine ribonucleotide (FGAM), ADP, phosphate, and glutamate in the fourth step of the purine biosynthetic pathway. In eukaryotes and Gram-negative bacteria, FGAR-AT is encoded by the purL gene as a multidomain protein with a molecular mass of about 140 kDa. In Gram-positive bacteria and archaea FGAR-AT is a complex of three proteins: PurS, PurL, and PurQ. PurL itself contains two tandem N- and C-terminal domains (four domains altogether). The N-terminal domains bind ATP and are related to the ATP-binding domains of HypE, ThiL, SelD and PurM.	264
341358	cd02205	CBS_pair_SF	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains superfamily. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	113
380338	cd02208	cupin_RmlC-like	RmlC-like cupin superfamily. This superfamily contains proteins similar to the RmlC (dTDP (deoxythymidine diphosphates)-4-dehydrorhamnose 3,5-epimerase)-like cupins. RmlC is a dTDP-sugar isomerase involved in the synthesis of L-rhamnose, a saccharide required for the virulence of some pathogenic bacteria. Cupins are a functionally diverse superfamily originally discovered based on the highly conserved motif found in germin and germin-like proteins. This conserved motif forms a beta-barrel fold found in all of the cupins, giving rise to the name cupin ('cupa' is the Latin term for small barrel). The active site of members of this superfamily is generally located at the center of a conserved barrel and usually includes a metal ion. The different functional classes in this superfamily include single domain bacterial isomerases and epimerases involved in the modification of cell wall carbohydrates, two domain bicupins such as the desiccation-tolerant seed storage globulins, and multidomain nuclear transcription factors involved in legume root nodulation.	73
380339	cd02209	cupin_XRE_C	XRE (Xenobiotic Response Element) family transcriptional regulators, C-terminal cupin domain. This family contains transcriptional regulators containing an N-terminal XRE (Xenobiotic Response Element) family helix-turn-helix (HTH) DNA-binding domain and a C-terminal cupin domain. Included in this family is Escherichia coli transcription factor SutR (YdcN) that plays a regulatory role in sulfur utilization; it regulates a set of genes involved in the generation of sulfate and its reduction, the synthesis of cysteine, the synthesis of enzymes containing Fe-S as cofactors, and the modification of tRNA with use of sulfur-containing substrates. This family belongs to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	90
380340	cd02210	cupin_BLR2406-like	Bradyrhizobium japonicum BLR2406 and related proteins, cupin domain. This family includes bacterial and fungal proteins homologous to BLR2406, a Bradyrhizobium japonicum protein of unknown function with a cupin beta barrel domain. Proteins in this subfamily appear to align closest to RmlC carbohydrate epimerase which is involved in dTDP-L-rhamnose production, and belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	98
380341	cd02211	cupin_UGlyAH_N	(S)-ureidoglycine aminohydrolase and related proteins, N-terminal cupin domain. This family includes the N-terminal cupin domain of (S)-ureidoglycine aminohydrolase (UGlyAH), an enzyme that converts (S)-ureidoglycine into (S)-ureidoglycolate and ammonia, providing the final substrate to the ureide pathway. The ureide pathway has recently been identified as the metabolic route of purine catabolism in plants and some bacteria, where uric acid, which is a major product of the early stage of purine catabolism, is degraded into glyoxylate and ammonia via stepwise reactions by seven different enzymes. Thus, this pathway has a possible physiological role in mobilization of purine ring nitrogen for further assimilation. This enzyme from Arabidopsis thaliana (AtUGlyAH) has been shown to bind a Mn2+ ion, via the C-terminal cupin domain, which acts as a molecular anchor to bind (S)-ureidoglycine, and its binding mode dictates the enantioselectivity of the reaction. The structure of AtUGlyAH shows a bi-cupin fold with a conserved "jelly roll-like" beta-barrel fold and an octameric functional unit. Several structural homologs of UGlyAH, including the Escherichia coli ortholog YlbA (also known as GlxB6), also exhibit similar features.	117
380342	cd02212	cupin_UGlyAH_C	(S)-ureidoglycine aminohydrolase and related proteins, C-terminal cupin domain. This family includes the C-terminal cupin domain of (S)-ureidoglycine aminohydrolase (UGlyAH), an enzyme that converts (S)-ureidoglycine into (S)-ureidoglycolate and ammonia, providing the final substrate to the ureide pathway. The ureide pathway has recently been identified as the metabolic route of purine catabolism in plants and some bacteria, where uric acid, which is a major product of the early stage of purine catabolism, is degraded into glyoxylate and ammonia via stepwise reactions by seven different enzymes. Thus, this pathway has a possible physiological role in mobilization of purine ring nitrogen for further assimilation. This enzyme from Arabidopsis thaliana (AtUGlyAH) has been shown to bind a Mn2+ ion, via the C-terminal cupin domain, which acts as a molecular anchor to bind (S)-ureidoglycine, and its binding mode dictates the enantioselectivity of the reaction. The structure of AtUGlyAH shows a bi-cupin fold with a conserved "jelly roll-like" beta-barrel fold and an octameric functional unit. Several structural homologs of UGlyAH, including the Escherichia coli ortholog YlbA (also known as GlxB6), also exhibit similar features.	92
380343	cd02213	cupin_PMI_typeII_C	Phosphomannose isomerase type II, C-terminal cupin domain. This family includes the C-terminal cupin domain of mannose-6-phosphate isomerases (MPIs) which have been classified broadly into two groups, type I and type II, based on domain organization. This family contains type II phosphomannose isomerase (also known as PMI-GDP, phosphomannose isomerase/GDP-D-mannose pyrophosphorylase), a bifunctional enzyme with two domains that catalyze the first and third steps in the GDP-mannose pathway in which fructose 6-phosphate is converted to GDP-D-mannose. The N-terminal domain catalyzes the first and rate-limiting step, the isomerization from D-fructose-6-phosphate to D-mannose-6-phosphate, while the C-terminal cupin domain (represented in this alignment model) converts mannose 1-phosphate to GDP-D-mannose in the final step of the reaction. Although these two domains occur together in one protein in most organisms, they occur as separate proteins in certain cyanobacterial organisms. Also, although type I and type II MPIs have no overall sequence similarity, they share a conserved catalytic motif.	126
380344	cd02214	cupin_MJ1618	Methanocaldococcus jannaschii MJ1618 and related proteins, cupin domain. This family includes bacterial and archaeal proteins homologous to MJ1618, a Methanocaldococcus jannaschii protein of unknown function with a cupin beta barrel domain. The active site of members of the cupin superfamily is generally located at the center of a conserved barrel and usually includes a metal ion.	100
380345	cd02215	cupin_QDO_N_C	quercetinase, N- and C-terminal cupin domains. This family contains quercetinase (also known as quercetin 2,3-dioxygenase, 2,3QD, QDO and YxaG; EC 1.13.11.24), a mononuclear copper-dependent dioxygenase that catalyzes the cleavage of the flavonol quercetin (5,7,3',4'-tetrahydroxyflavonol) heterocyclic ring to produce 2-protocatechuoyl-phloroglucinol carboxylic acid and carbon monoxide. Bacillus subtilis quercetin 2,3-dioxygenase (QDO) is a homodimer that shows oxygenase activity with several divalent metals such as Mn2+, Co2+, Fe2+, and Cu2+, although the preferred one appears to be Mn2+. The dioxygen binds to the metal ion of the Cu-QDO-quercetin complex, yielding a Cu2+-superoxo quercetin radical intermediate, which then forms a Cu2+-alkylperoxo complex which then evolves into endoperoxide intermediate that decomposes to the product. Quercetinase is a bicupin with two tandem cupin beta-barrel domains, both of which are included in this alignment model. The pirins, which also belong to the cupin domain family, have been shown to catalyze a reaction involving quercetin and may have a function similar to that of quercetinase.	122
380346	cd02216	cupin_GDO-like_N	gentisate 1,2-dioxygenase, 1-hydroxy-2-naphthoate dioxygenase, and salicylate 1,2-dioxygenase, N-terminal cupin domain. This family includes the N-terminal cupin domains of three closely related bicupin aromatic ring-cleaving dioxygenases: gentisate 1,2-dioxygenase (GDO), salicylate 1,2-dioxygenase (SDO), and 1-hydroxy-2-naphthoate dioxygenase (NDO). GDO catalyzes the cleavage of the gentisate (2,5-dihydroxybenzoate) aromatic ring, a key step in the gentisate degradation pathway allowing soil bacteria to utilize 2,5-xylenol, 3,5-xylenol, and m-cresol as sole carbon and energy sources. NDO catalyzes the cleavage of 1-hydroxy-2-naphthoate as part of the bacterial phenanthrene degradation pathway. SDO is a ring cleavage dioxygenase from Pseudaminobacter salicylatoxidans that oxidizes salicylate to 2-oxohepta-3,5-dienedioic acid via a novel ring fission mechanism. SDO differs from other known GDOs and NDOs in its unique ability to oxidatively cleave many different salicylate, gentisate and 1-hydroxy-2-naphthoate substrates with high catalytic efficiency. The active site of these enzymes is located in the N-terminal domain but could be influenced by changes in the C-terminal domain, which lacks the strictly conserved metal-binding residues found in other cupin domains and is thought to be an inactive vestigial remnant.	108
380347	cd02218	cupin_PGI	cupin-type phosphoglucose isomerase. The cupin-type phosphoglucose isomerase (also called cupin-like glucose-6-phosphate isomerase or cPGI; EC 5.3.1.9) family is found in archaea and certain prokaryotes where they catalyze the reversible aldose-ketose isomerization of glucose 6-phosphate (G6P) and fructose 6-phosphate (F6P) as part of a unique variation of the Embden-Meyerhof glycolytic pathway. Cupin-PGIs represent a separate lineage in the evolution of phosphoglucose isomerases. Pyrococcus furiosus phosphoglucose isomerase (PfPGI) has been shown to be a metal-containing enzyme which catalyzes the interconversion of glucose 6-phosphate (G6P) and fructose 6-phosphate (F6P). These domains have a cupin beta-barrel fold capable of homodimerization.	168
380348	cd02219	cupin_YjlB-like	Bacillus subtilis YjlB and related proteins, cupin domain. This family includes bacterial and fungal proteins homologous to YjlB, a Bacillus subtilis protein of unknown function with a cupin beta barrel fold. The active site of members of the cupin superfamily is generally located at the center of a conserved barrel and usually includes a metal ion. The different functional classes in this superfamily include single domain bacterial isomerases and epimerases involved in the modification of cell wall carbohydrates, two-domain bicupins such as the desiccation-tolerant seed storage globulins, and multidomain nuclear transcription factors involved in legume root nodulation.	154
380349	cd02220	cupin_ABP1	auxin-binding protein 1, cupin domain. Auxin-binding protein 1 (ABP1) is a soluble glycoprotein receptor that binds the plant hormone auxin, indole-3-acetic acid (IAA). ABP1 belongs to the ancient and functionally diverse germin/seed storage 7S protein superfamily. It is an important mediator of auxin action in plants and is essential for cell cycle control. Cellular auxin responses typically depend on auxin concentrations that mainly result from intercellular auxin transport and auxin biosynthesis, as well as metabolism. The functional inactivation of ABP1 results in cell cycle arrest, showing that ABP1 plays a critical role in cell cycle regulation, acting at both the G1/S and G2/M checkpoints. ABP1 is ubiquitous among green plants, found mainly within the endoplasmic reticulum (ER) and in smaller quantities at the cell surface associated with the plasma membrane. In Arabidopsis thaliana, ABP1 null mutations result in embryonic lethality while decreased ABP1 expression leads to severe retardation of leaf growth.	151
380350	cd02221	cupin_TM1287-like	Thermotoga maritima TM1287 decarboxylase, cupin domain. This family includes bacterial proteins homologous to TM1287 decarboxylase, a Thermotoga maritima manganese-containing cupin thought to catalyze the conversion of oxalate to formate and carbon dioxide, due to its similarity to oxalate decarboxylase (OXDC) from Bacillus subtilis. TM1287 shows a cupin fold with a conserved "jelly roll-like" beta-barrel fold and forms a homodimer.	93
380351	cd02222	cupin_TM1459-like	Thermotoga maritima TM1459 and related proteins, cupin domain. This family includes bacterial and archaeal proteins homologous to Thermotoga maritima TM1459, a manganese-containing cupin that has been shown to cleave C=C bonds in the presence of alkylperoxide as oxidant in vitro. Its biological function is still unknown. This family also includes Halorhodospira halophila Hhal_0468. Structures of these proteins show a cupin fold with a conserved "jelly roll-like" beta-barrel fold that forms a homodimer.	91
380352	cd02223	cupin_Bh2720-like	Bacillus halodurans Bh2720 and related proteins, cupin domain. This family includes bacterial, archaeal, and eukaryotic proteins similar to Bh2720, a Bacillus halodurans protein of unknown function with a cupin beta-barrel fold.	98
380353	cd02224	cupin_SPO2919-like	Silicibacter pomeroyi SPO2919 and related proteins, uncharacterized sugar phosphate isomerase with a cupin domain. This family includes proteins similar to sugar phosphate isomerase SPO2919 from Silicibacter pomeroyi and Afe_0303 from Acidithiobacillus ferrooxidans, but are as yet uncharacterized. Structures of these proteins show a cupin fold with a conserved "jelly roll-like" beta-barrel fold that forms a homodimer.	105
380354	cd02225	cupin_PA3510-like	Pseudomonas aeruginosa PA3510 and related proteins, cupin domain. This family includes bacterial proteins homologous to PA3510, a Pseudomonas aeruginosa protein of unknown function with a beta-barrel fold that belongs to the cupin superfamily.	150
380355	cd02226	cupin_YdbB-like	Bacillus subtilis YdbB and related proteins, cupin domain. This family includes bacterial proteins homologous to YdbB, a Bacillus subtilis protein of unknown function. It also includes protein Nmb1881 from Neisseria meningitidis, also of unknown function. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	94
380356	cd02227	cupin_TM1112-like	Thermotoga maritima TM1112 and related proteins, cupin domain. This family includes bacterial and plant proteins homologous to TM1112, a Thermotoga maritima protein of unknown function with a cupin beta barrel domain. TM1112 (also known as DUF861) is a subfamily of RmlC-like cupins with a conserved "jelly roll-like" beta-barrel fold; structures indicate that a monomer is the biologically-relevant form.	69
380357	cd02228	cupin_EutQ	Clostridium difficile EutQ and related proteins, cupin domain. This family includes bacterial and fungal proteins homologous to ethanolamine utilization protein EutQ found in Clostridium difficile, as well as in other bacteria, including the enteric pathogens Salmonella enterica and Enterococcus faecalis. EutQ is encoded by the eutQ gene which is part of the eut (ethanolamine utilization) operon found to be essential during anoxic growth of S. enterica on ethanolamine and tetrathionate. In C. difficile, inability to utilize ethanolamine results in greater virulence and a shorter time to morbidity in the animal model, suggesting that, in contrast to other intestinal pathogens, the metabolism of ethanolamine can delay the onset of disease. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization. In contrast to the metal-binding catalytic cupins, the EutQ family does not possess the histidine residues that are responsible for metal coordination in the oxidoreductase and epimerase classes of cupins.	84
380358	cd02230	cupin_HP0902-like	Helicobacter pylori HP0902 and related proteins, cupin domain. This family includes prokaryotic and archaeal proteins homologous to HP0902, a functionally uncharacterized protein from Helicobacter pylori and Spy1581, a protein of unknown function from Streptococcus pyogenes. These proteins demonstrate all-beta cupin folds that cannot bind metal ions due to the absence of a metal-binding histidine that is conserved in many metallo-cupins. HP0902 is able to bind bacterial endotoxin lipopolysaccharides (LPS) through its surface-exposed loops, where metal-binding sites are usually found in other metallo-cupins, and thus may have a putative role in H. pylori pathogenicity.	83
380359	cd02231	cupin_BLL6423-like	Bradyrhizobium japonicum BLL6423 and related proteins, cupin domain. This family includes bacterial and fungal proteins homologous to BLL6423, a Bradyrhizobium japonicum protein of unknown function; it includes a structure of an uncharacterized protein from Novosphingobium aromaticivorans. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	108
380360	cd02232	cupin_ARD	acireductone dioxygenase (ARD), cupin domain. Acireductone dioxygenase (ARD; also known as 1,2-dihydroxy-3-keto-5-methylthiopentene dioxygenase) catalyzes the oxidation of 1,2-dihydroxy-3-keto-5-methylthiopentene to yield two different products depending on which active site metal is present (Fe2+ or Ni2+) as part of the methionine salvage pathway. The ARD apo-enzyme, obtained after the metal is removed, is catalytically inactive. The Fe(II)-ARD reaction yields an alpha-keto acid and formic acid, while Ni(II)-ARD instead catalyzes a shunt out of the methionine salvage pathway, yielding methylthiocarboxylic acid, formic acid, and CO. ARD belongs to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	134
380361	cd02233	cupin_HNL-like	Granulicella tundricola hydroxynitrile lyase (GtHNL) and related proteins, cupin domain. This family includes archaeal, eukaryotic, and bacterial proteins homologous to hydroxynitrile lyase from Granulicella tundricola (GtHNL), a novel class of HNLs that does not show any sequence or structural similarity to any other HNL and does not contain conserved motifs typical of HNLs. HNLs comprise a diverse group of enzymes that vary in terms of their substrate specificity, enantioselectivity and the need for a co-factor. In plants, they catalyze the reversible cleavage of cyanohydrins, yielding HCN and aldehydes or ketones. Also included in this family is TM1010 from Thermotoga maritima, a protein of unknown function. Some but not all members of this family have N- or C-terminal carboxymuconolactone decarboxylase domains in addition to the cupin domain. Members of this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	106
380362	cd02234	cupin_BLR7677-like	Bradyrhizobium japonicum BLR7677 and related proteins, cupin domain. This family includes bacterial and fungal proteins homologous to BLR7677, a Bradyrhizobium japonicum protein of unknown function. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	103
380363	cd02235	cupin_BLL4011-like	Bradyrhizobium diazoefficiens BLL4011 and related proteins, cupin domain. This family includes bacterial and fungal proteins homologous to BLL4011, a Bradyrhizobium diazoefficiens protein of unknown function. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	100
380364	cd02236	cupin_CV2614-like	Chromobacterium violaceum CV2614 and related proteins, cupin domain. This family includes mostly bacterial proteins homologous to CV2614, a Chromobacterium violaceum protein of unknown function. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	102
380365	cd02237	cupin_DAD_ChrR	2,4'-Dihydroxyacetophenone dioxygenase (DAD) and anti-sigma factor ChrR, and similar proteins; cupin domain. This family includes the proteins 2,4'-Dihydroxyacetophenone dioxygenase (DAD) and anti-sigma factor ChrR. DAD catalyzes the oxidation of 2,4'-dihydroxyacetophenone to 4-hydroxybenzoate and formate as part of the 4-hydroxyacetophenone catabolic pathway. The enzyme is a homotetramer containing one iron per molecule of enzyme. Anti-sigma factor ChrR is a member of the ZAS (Zn2+ anti-sigma) subfamily of group IV anti-sigmas. It inhibits transcriptional activity by binding to the Rsp extra cytoplasmic function (ECF) sigma factor E (sigmaE). Some ChrR members contain tandem repeats of two distinct homologous functional domains. Members of this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold.	82
380366	cd02238	cupin_KdgF	pectin degradation protein KdgF and related proteins, cupin domain. This family includes bacterial and archaeal pectin degradation protein KdgF that catalyzes the linearization of unsaturated uronates from both pectin and alginate, which are polysaccharides found in the cell walls of plants and brown algae, respectively, and represent an important source of carbon. These polysaccharides, mostly consisting of chains of uronates, can be metabolized by bacteria through a pathway of enzymatic steps to the key metabolite 2-keto-3-deoxygluconate (KDG). Pectin degradation is used by many plant-pathogenic bacteria during infection, and also, pectin and alginate can both represent abundant sources of carbohydrate for the production of biofuels. These proteins belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold.	104
380367	cd02240	cupin_OxDC	Oxalate decarboxylase (OxDC), cupin domain. Oxalate decarboxylase (OxDC; EC 4.1.1.2) is a manganese-dependent bicupin that catalyzes the conversion of oxalate to formate and carbon dioxide, utilizing dioxygen as a cofactor. It is evolutionarily related to oxalate oxidase (OxOx or germin; EC 1.2.3.4) which, in contrast, converts oxalate and dioxygen to carbon dioxide and hydrogen peroxide. OxDC is classified as a bicupin because it contains two cupin folds and both domains are included in this alignment.  Each OxDC cupin domain contains one manganese binding site, with four manganese binding residues (three histidines and one glutamate) conserved as well as a number of hydrophobic residues. Members of this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold.	145
380368	cd02241	cupin_OxOx	Oxalate oxidase (germin), cupin domain. Oxalate oxidase (OxOx, also known as germin; EC 1.2.3.4) catalyzes the manganese-dependent oxidative decarboxylation of oxalate to carbon dioxide and hydrogen peroxide (H2O2). It is widespread in fungi and various plant tissues and may play a role in plant signaling and defense. This enzyme has been employed in a widely used assay for detecting urinary oxalate levels. Also, the gene encoding OxOx from barley roots has been expressed in oilseed rape in order to provide a defense against externally supplied oxalic acid.  In germin, the predominant protein produced during the early phase of wheat germination, it is believed that H2O2 production is employed as a defense mechanism in response to infection by pathogens. Germin is also a marker of growth onset in cell walls in germinating cereals. The H2O2 produced by OxOx, together with the Ca2+ released by degradation of calcium oxalate, are thought to mediate cell wall cross-linking at high concentrations. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	191
380369	cd02242	cupin_11S_legumin_N	11S legumin seed storage globulin, N-terminal cupin domain. This family contains the N-terminal domains of 11S legumin seed storage proteins that supply nutrition for seed germination, such as glycinin and legumin, including many common food allergens such as the peanut major allergen Ara h 3, almond allergen Pru du 6, Pecan allergen Car i 4, hazelnut nut allergen Cor a 9, Brazil nut allergen Ber e 2, cashew allergen Ana o 2, pistachio allergen Pis v 2/5, and walnut allergen Jug n/r 4. These plant seed storage globulins have tandem cupin-like beta-barrel folds (referred to as a bicupin). They are synthesized as propeptides in the endoplasmic reticulum and transported to the secretory vesicles as a homotrimer. The propeptides are processed as they are sorted in the secretory vesicles. The homotrimer binds another homotrimer to form a homohexamer with 32-point symmetry formed by a face-to-face stacking of the two trimers. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold.	209
380370	cd02243	cupin_11S_legumin_C	11S legumin seed storage globulin, C-terminal cupin domain. This family contains the C-terminal domains of 11S legumin seed storage proteins that supply nutrition for seed germination, such as glycinin and legumin, including many common food allergens such as the peanut major allergen Ara h 3, almond allergen Pru du 6, Pecan allergen Car i 4, hazelnut nut allergen Cor a 9, Brazil nut allergen Ber e 2, cashew allergen Ana o 2, pistachio allergen Pis v 2/5, and walnut allergen Jug n/r 4. These plant seed storage globulins have tandem cupin-like beta-barrel folds (referred to as a bicupin). They are synthesized as propeptides in the endoplasmic reticulum and transported to the secretory vesicles as a homotrimer. The propeptides are processed as they are sorted in the secretory vesicles. The homotrimer binds another homotrimer to form a homohexamer with 32-point symmetry formed by a face-to-face stacking of the two trimers. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold.	155
380371	cd02244	cupin_7S_vicilin-like_N	7S vicilin seed storage globulin, N-terminal cupin domain. This family contains the N-terminal domains of plant 7S seed storage proteins such as vicilin, and includes beta-conglycinin, phaseolin, canavalin, conglutin-beta, a chromatin protein in Pisum sativum called P54, and a sucrose binding protein in soybean called SBP. These 7S globulins also include soybean allergen beta-conglycinin, peanut allergen conarachin (Ara h 1), walnut allergen Jug r 2, and lentil allergen Len c 1. Proteins in this family perform various functions, including a role in sucrose binding, desiccation, defense against microbes and oxidative stress. The vicilin peptides formed by trypsin or chymotrypsin digestion exhibit antihypertensive effects. These plant seed storage globulins have tandem cupin-like beta-barrel folds (referred to as a bicupin).  Storage proteins are the cause of well-known allergic reactions to peanuts and cereals. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold.	178
380372	cd02245	cupin_7S_vicilin-like_C	7S vicilin seed storage globulin, C-terminal cupin domain. This family contains C-terminal domain of plant 7S seed storage protein such as vicilin and includes beta-conglycinin, phaseolin, canavalin, conglutin-beta, a chromatin protein in Pisum sativum called P54, and a sucrose binding protein in soybean called SBP. These 7S globulins also include soybean allergen beta-conglycinin, peanut allergen conarachin (Ara h 1), walnut allergen Jug r 2 and lentil allergen Len c 1. Proteins in this family perform various functions, including a role in sucrose binding, desiccation, defense against microbes and oxidative stress. The vicilin peptides formed by trypsin or chymotrypsin digestion exhibit antihypertensive effects. These plant seed storage globulins have tandem cupin-like beta-barrel folds (referred to as a bicupin).  Storage proteins are the cause of well-known allergic reactions to peanuts and cereals. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold.	166
380373	cd02247	cupin_pirin_C	pirin, C-terminal cupin domain. This family contains the C-terminal domain of pirin, a nuclear protein that is highly conserved among mammals, plants, fungi, and prokaryotes. It is widely expressed in dot-like subnuclear structures in human tissues such as liver and heart. Pirin functions as both a transcriptional cofactor and an apoptosis-related protein in mammals and is involved in seed germination and seedling development in plants. The pirins have been assigned as a subfamily of the cupin superfamily based on structure and sequence similarity. The pirins have two tandem cupin-like folds but the C-terminal cupin fold has diverged considerably and does not have a metal binding site. The exact functions of pirins are unknown but they have quercitinase activity in Escherichia coli and are thought to play important roles in transcription and apoptosis. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold.	76
239068	cd02248	Peptidase_C1A	Peptidase C1A subfamily (MEROPS database nomenclature); composed of cysteine peptidases (CPs) similar to papain, including the mammalian CPs (cathepsins B, C, F, H, L, K, O, S, V, X and W). Papain is an endopeptidase with specific substrate preferences, primarily for bulky hydrophobic or aromatic residues at the S2 subsite, a hydrophobic pocket in papain that accommodates the P2 sidechain of the substrate (the second residue away from the scissile bond). Most members of the papain subfamily are endopeptidases. Some exceptions to this rule can be explained by specific details of the catalytic domains like the occluding loop in cathepsin B which confers an additional carboxydipeptidyl activity and the mini-chain of cathepsin H resulting in an N-terminal exopeptidase activity. Papain-like CPs have different functions in various organisms. Plant CPs are used to mobilize storage proteins in seeds. Parasitic CPs act extracellularly to help invade tissues and cells, to hatch or to evade the host immune system. Mammalian CPs are primarily lysosomal enzymes with the exception of cathepsin W, which is retained in the endoplasmic reticulum. They are responsible for protein degradation in the lysosome. Papain-like CPs are synthesized as inactive proenzymes with N-terminal propeptide regions, which are removed upon activation. In addition to its inhibitory role, the propeptide is required for proper folding of the newly synthesized enzyme and its stabilization in denaturing pH conditions. Residues within the propeptide region also play a role in the transport of the proenzyme to lysosomes or acidified vesicles. Also included in this subfamily are proteins classified as non-peptidase homologs, which lack peptidase activity or have missing active site residues.	210
239069	cd02249	ZZ	Zinc finger, ZZ type. Zinc finger present in dystrophin, CBP/p300 and many other proteins. The ZZ motif coordinates one or two zinc ions and most likely participates in ligand binding or molecular scaffolding. Many proteins containing ZZ motifs have other zinc-binding motifs as well, and the majority serve as scaffolds in pathways involving acetyltransferase, protein kinase, or ubiquitin-related activity. ZZ proteins can be grouped into the following functional classes: chromatin modifying, cytoskeletal scaffolding, ubiquitin binding or conjugating, and membrane receptor or ion-channel modifying proteins.	46
239070	cd02252	nylC_like	nylC-like family; composed of proteins with similarity to Flavobacterium endo-type 6-aminohexanoate-oligomer hydrolase (EIII), the product of the nylon oligomer degradation gene, nylC. EIII is an amide hydrolase that catalyzes the degradation of highly-polymerized 6-aminohexanoate oligomers. Together with other nylon degradation enzymes, such as 6-aminohexanoate cyclic dimer hydrolase (EI) and 6-aminohexanoate dimer hydrolase (EII), EIII plays a role in the detoxification and biological removal of the synthetic by-products of nylon manufacture. EIII shows sequence similarity to L-aminopeptidase D-amidase/D-esterase (DmpA), an aminopeptidase that releases N-terminal D and L amino acids from peptide substrates. Like DmpA, EIII undergoes autocatalytic cleavage in front of a nucleophile to form a heterodimer. DmpA shows similarity in catalytic mechanism to N-terminal nucleophile (Ntn) hydrolases, which are enzymes that catalyze the cleavage of amide bonds through the nucleophilic attack of the side chain of an N-terminal serine, threonine, or cysteine.	260
239071	cd02253	DmpA	L-Aminopeptidase D-amidase/D-esterase (DmpA) family; DmpA catalyzes the release of N-terminal D and L amino acids from peptide substrates. DmpA is synthesized as a single polypeptide precursor, which is autocatalytically cleaved to the active heterodimeric form. The cleavage results in two polypeptide chains, with one chain containing an N-terminal nucleophile. This group represents one of the rare aminopeptidases that are not metalloenzymes. DmpA shows similarity in catalytic mechanism to N-terminal nucleophile (Ntn) hydrolases, which are enzymes that catalyze the cleavage of amide bonds through the nucleophilic attack of the side chain of an N-terminal serine, threonine, or cysteine.	339
187736	cd02255	Peptidase_C12	Cysteine peptidase C12 contains ubiquitin carboxyl-terminal hydrolase (UCH) families L1, L3, L5 and BAP1. The ubiquitin C-terminal hydrolase (UCH; ubiquitinyl hydrolase; ubiquitin thiolesterase) family of deubiquitinating enzymes (DUBs) consists of four members to date: UCH-L1, UCH-L3, UCH-L5 (UCH37) and BRCA1-associated protein-1 (BAP1), all containing a conserved catalytic domain with cysteine peptidase activity.  UCH-L1 hydrolyzes carboxyl terminal esters and amides of ubiquitin (Ub). Dysfunction of this hydrolase activity can lead to an accumulation of alpha-synuclein, which is linked to Parkinson's disease (PD) and neurofibrillary tangles, linked to Alzheimer's disease (AD).  UCH-L1, in its dimeric form, has additional enzymatic activity as a ubiquitin ligase. UCH-L3 hydrolyzes isopeptide bonds at the C-terminal glycine of either Ub or Nedd8, a ubiquitin-like protein. UCH-L3 can also interact with Lys48-linked Ub dimers to protect it from degradation while inhibiting its hydrolase activity at the same time.  UCH-L1 and UCH-L3 are the most closely related of the UCH members. UCH-L5 (UCH37) is involved in the deubiquitinating activity in the 19S proteasome regulatory complex. It is also associated with the human Ino80 chromatin-remodeling complex (hINO80) in the nucleus. BAP1 binds to the wild-type BRCA1 RING finger domain, localized in the nucleus.  It consists of the N-terminal UCH domain and two predicted nuclear localization signals (NLSs), only one of which is functional. The full-length human BRCA1 is a ubiquitin ligase. However, BAP1 does not appear to function in the deubiquitination of autoubiquitinated  BRCA1. There is growing evidence that UCH enzymes and human malignancies are closely correlated. Studies show that UCH enzymes play a crucial role in some signaling pathways and in cell-cycle regulation.	222
239072	cd02257	Peptidase_C19	Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome.	255
199210	cd02258	Peptidase_C25_N	Peptidase C25 family N-terminal domain, found in Arg-gingipain (Rgp), Lys-gingipain (Kgp) and related proteins. Peptidase family C25 is a unique class of cysteine proteases, exemplified by gingipain, which is produced by Porphyromonas gingivalis. P. gingivalis is one of the primary gram-negative pathogens that causes periodontitis, a disease that is also associated with other diseases such as diabetes and cardiovascular disease. Gingipains are a group of extracellular Arg- and Lys-specific proteinases called Arg-gingipain (Rgp) and Lys-gingipain (Kgp); RgpA and RgpB are homologous Arg-specific gingipains encoded by two closely related genes, rgpA and rgpB, while Lys-specific gingipain is encoded by the single kgp gene. Mutant studies have shown that, among the large quantities of proteolytic enzymes produced by P. gingivalis, these three proteases are major virulence factors of this bacterium. All three genes encode an N-terminal pre-pro fragment, followed by the protease domain; however, rgpA and kgp also encode additional C-terminal HA (hemagglutinin/adhesion) subunits which consist of several sequence-related adhesion domains. Although unique, their cysteine protease active site residues (His and Cys) forming the catalytic dyad are well-conserved, cleaving the C-terminal peptide bond with Arg or Lys residues. Gingipains are evolutionarily related to other highly specific proteases including caspases, clostripain, legumains, and separase. Gingipains function by dysregulating host defense and inflammatory responses, and degrading host proteins, e.g. tissue, cells, matrix, plasma and immunological proteins. They are proposed to enhance gingival crevicular fluid (GCF) production through activation of the kallikrein/kinin pathways, thus increasing vascular permeability and causing gingival inflammation, a distinctive feature of periodontitis. RgpA and RgpB are also able to cleave and activate coagulation factors IX and X in order to activate prothrombin to produce thrombin, which in turn increases production of GCF. The gingipains also play a pivotal role in the survival of P. gingivalis in the host by attacking the host defense system through cleavage of several immunological molecules, while at the same time evading the host-immune response by dysregulating the cytokine network.	382
239073	cd02259	Peptidase_C39_like	Peptidase family C39 mostly contains bacteriocin-processing endopeptidases from bacteria. The cysteine peptidases in family C39 cleave the "double-glycine" leader peptides from the precursors of various bacteriocins (mostly non-lantibiotic). The cleavage is mediated by the transporter as part of the secretion process. Bacteriocins are antibiotic proteins secreted by some species of bacteria that inhibit the growth of other bacterial species. The bacteriocin is synthesized as a precursor with an N-terminal leader peptide, and processing involves removal of the leader peptide by cleavage at a Gly-Gly bond, followed by translocation of the mature bacteriocin across the cytoplasmic membrane. Most endopeptidases of family C39 are N-terminal domains in larger proteins (ABC transporters) that serve both functions. The proposed protease active site is not conserved in all sub-families. 	122
187535	cd02266	SDR	Short-chain dehydrogenases/reductases (SDR). SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human prostaglandin dehydrogenase (PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, PGDH numbering) and/or an Asn (Asn-107, PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase (KR) domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type KRs have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	186
100064	cd02325	R3H	R3H domain. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. R3H domains are found in proteins together with ATPase domains, SF1 helicase domains, SF2 DEAH helicase domains, Cys-rich repeats, ring-type zinc fingers, and KH domains. The function of the domain is predicted to bind ssDNA or ssRNA in a sequence-specific manner.	59
239074	cd02334	ZZ_dystrophin	Zinc finger, ZZ type. Zinc finger present in dystrophin and dystrobrevin. The ZZ motif coordinates two zinc ions and most likely participates in ligand binding or molecular scaffolding. Dystrophin attaches actin filaments to an integral membrane glycoprotein complex in muscle cells. The ZZ domain in dystrophin has been shown to be essential for binding to the membrane protein beta-dystroglycan.	49
239075	cd02335	ZZ_ADA2	Zinc finger, ZZ type. Zinc finger present in ADA2, a putative transcriptional adaptor, and related proteins. The ZZ motif coordinates two zinc ions and most likely participates in ligand binding or molecular scaffolding.	49
239076	cd02336	ZZ_RSC8	Zinc finger, ZZ type. Zinc finger present in RSC8 and related proteins. RSC8 is a component of the RSC complex, which is closely related to the SWI/SNF complex and is involved in remodeling chromatin structure. The ZZ motif coordinates a zinc ion and most likely participates in ligand binding or molecular scaffolding.	45
239077	cd02337	ZZ_CBP	Zinc finger, ZZ type. Zinc finger present in CBP/p300 and related proteins. The ZZ motif coordinates two zinc ions and most likely participates in ligand binding or molecular scaffolding. CREB-binding protein (CBP) is a large multidomain protein that provides binding sites for transcriptional coactivators; the role of the ZZ domain in CBP/p300 is unclear.	41
239078	cd02338	ZZ_PCMF_like	Zinc finger, ZZ type. Zinc finger present in potassium channel modulatory factor (PCMF) 1  and related proteins. The ZZ motif coordinates two zinc ions and most likely participates in ligand binding or molecular scaffolding. Human potassium channel modulatory factor 1 or FIGC has been shown to possess intrinsic E3 ubiquitin ligase activity and to promote ubiquitination.	49
239079	cd02339	ZZ_Mind_bomb	Zinc finger, ZZ type. Zinc finger present in Drosophila Mind bomb (D-mib) and related proteins. The ZZ motif coordinates two zinc ions and most likely participates in ligand binding or molecular scaffolding. Mind bomb is an E3 ubiquitin ligase that has been shown to regulate signaling by the Notch ligand Delta in Drosophila melanogaster.	45
239080	cd02340	ZZ_NBR1_like	Zinc finger, ZZ type. Zinc finger present in Drosophila ref(2)P, NBR1, Human sequestosome 1 and related proteins. The ZZ motif coordinates two zinc ions and most likely participates in ligand binding or molecular scaffolding. Drosophila ref(2)P appears to control the multiplication of sigma rhabdovirus. NBR1 (Next to BRCA1 gene 1 protein) interacts with fasciculation and elongation protein zeta-1 (FEZ1) and calcium and integrin binding protein (CIB), and may function in cell signalling pathways. Sequestosome 1 is a phosphotyrosine independent ligand for the Lck SH2 domain and binds noncovalently to ubiquitin via its UBA domain.	43
239081	cd02341	ZZ_ZZZ3	Zinc finger, ZZ type. Zinc finger present in ZZZ3 (ZZ finger containing 3) and related proteins. The ZZ motif coordinates two zinc ions and most likely participates in ligand binding or molecular scaffolding.	48
239082	cd02342	ZZ_UBA_plant	Zinc finger, ZZ type. Zinc finger present in plant ubiquitin-associated (UBA) proteins. The ZZ motif coordinates a zinc ion and most likely participates in ligand binding or molecular scaffolding.	43
239083	cd02343	ZZ_EF	Zinc finger, ZZ type. Zinc finger present in proteins with an EF_hand motif. The ZZ motif coordinates two zinc ions and most likely participates in ligand binding or molecular scaffolding.	48
239084	cd02344	ZZ_HERC2	Zinc finger, ZZ type. Zinc finger present in HERC2 and related proteins. HERC2 is a potential E3 ubiquitin protein ligase and/or guanine nucleotide exchange factor. The ZZ motif coordinates two zinc ions and most likely participates in ligand binding or molecular scaffolding.	45
239085	cd02345	ZZ_dah	Zinc finger, ZZ type. Zinc finger present in Drosophila dah and related proteins. The ZZ motif coordinates two zinc ions and most likely participates in ligand binding or molecular scaffolding. Dah (discontinuous actin hexagon) is a membrane associated protein essential for cortical furrow formation in Drosophila. 	49
411803	cd02393	KH-I_PNPase	type I K homology (KH) RNA-binding domain found in polyribonucleotide nucleotidyltransferase (PNPase) and similar proteins. PNPase, also called polynucleotide phosphorylase, is a polyribonucleotide nucleotidyl transferase that degrades mRNA in prokaryotes and plant chloroplasts. It catalyzes the phosphorolysis of single-stranded polyribonucleotides processively in the 3'- to 5'-direction.  It is also involved, along with RNase II, in tRNA processing. The C-terminal region of PNPase contains domains homologous to those in other RNA binding proteins: a KH domain and an S1 domain. The model corresponds to the KH domain.	70
411804	cd02394	KH-I_Vigilin_rpt6	sixth type I K homology (KH) RNA-binding domain found in vigilin and similar proteins. Vigilin, also called high density lipoprotein-binding protein, or HDL-binding protein, is a ubiquitous and highly conserved RNA-binding protein that shuttles between nucleus and cytoplasm presumably in contact with RNA molecules. It may be involved in chromosome partitioning at mitosis, facilitating translation and tRNA transport, and control of mRNA metabolism, including estrogen-mediated stabilization of vitellogenin mRNA. Vigilin is up-regulated by cholesterol loading of cells and functions to protect cells from over-accumulation of cholesterol. It may play a role in cell sterol metabolism. Disruption of human vigilin impairs chromosome condensation and segregation. Vigilin has a unique structure of 14-15 consecutively arranged, but non-identical K-homology (KH) domains which apparently mediate RNA-protein binding. The model corresponds to the sixth one.	68
411805	cd02395	KH-I_BBP	type I K homology (KH) RNA-binding domain found in yeast branchpoint-bridging protein (BBP) and similar proteins. Yeast BBP, also called mud synthetic-lethal 5 protein, or splicing factor 1, or zinc finger protein BBP, is a mammalian splicing factor SF1 ortholog. It is involved in protein-protein interactions that bridge the 3' and 5' splice-site ends of the intron during the early steps of yeast pre-mRNA splicing. BBP interacts specifically with the pre-mRNA branchpoint sequence UACUAAC.	92
411806	cd02396	KH-I_PCBP_rpt2	second type I K homology (KH) RNA-binding domain found in the family of poly(C)-binding proteins (PCBPs). The PCBP family, also known as hnRNP E family, comprises four members, PCBP1-4, which are RNA-binding proteins that interact in a sequence-specific manner with single-stranded poly(C) sequences. They are mainly involved in various posttranscriptional regulations, including mRNA stabilization or translational activation/silencing. Besides, PCBPs may share iron chaperone activity. PCBPs contain three K-homology (KH) RNA-binding domains. The model corresponds to the second one.	72
239090	cd02406	CRS2	Chloroplast RNA splicing 2 (CRS2) is a nuclear-encoded protein required for the splicing of group II introns in the chloroplast. CRS2 forms stable complexes with two CRS2-associated factors, CAF1 and CAF2, which are required for the splicing of distinct subsets of CRS2-dependent introns. CRS2 is closely related to bacterial peptidyl-tRNA hydrolases (PTH).	191
239091	cd02407	PTH2_family	Peptidyl-tRNA hydrolase, type 2 (PTH2)-like. Peptidyl-tRNA hydrolase activity releases tRNA from the premature translation termination product peptidyl-tRNA. Two structurally different enzymes have been reported to encode such activity, Pth present in bacteria and eukaryotes and Pth2 present in archaea and eukaryotes.	115
411780	cd02409	KH-II_SF	type II K-homology (KH) RNA-binding domain superfamily. The K-homology (KH) domain binds single-stranded RNA or DNA, and is found in a wide variety of proteins including ribosomal proteins, transcription factors, and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but share a single "minimal KH motif" which is folded into a beta-alpha-alpha-beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include an N-terminal extension while type I KH domains (e.g. hnRNP K) contain a C-terminal extension, connected to the KH motif by variable loops that are different in different KH domains, whether they are type I or type II. KH-II superfamily members contain one or two KH domains, most of which are canonical type II KH domains that have the signature motif GXXG (where X represents any amino acid). The first KH domain found in archaeal cleavage and polyadenylation specificity factors (CPSFs) is a non-canonical type II KH domain that lacks the GXXG motif. Some others have mutated GXXG motifs which may or may not have nucleic acid binding ability.	67
411781	cd02410	KH-II_CPSF_arch_rpt2	second type II K-homology (KH) RNA-binding domain found in archaeal cleavage and polyadenylation specificity factor (CPSF) and similar proteins. The archaeal CPSFs are predicted to be metal-dependent RNases belonging to the beta-CASP family, a subgroup of enzymes within the metallo-beta-lactamase fold. Within the CPSF family, all archaeal genomes contain one member with two N-terminal type II K-homology (KH) domains and one without. This family includes the CPSF homologs from archaea possessing N-terminal KH domains. This model corresponds to the second KH domain of CPSF, which is a canonical type II KH domain that contains the signature motif GXXG (where X represents any amino acid).	76
411782	cd02411	KH-II_30S_S3_arch	type II K-homology (KH) RNA-binding domain found in archaeal 30S ribosomal protein S3 and similar proteins. 30S ribosomal protein S3, also called small ribosomal subunit protein uS3, is part of the head region of the 30S ribosomal subunit and binds to the lower part of the 30S subunit head. It is believed to interact with mRNA as it threads its way from the latch into the channel. Members of this family are mainly from archaea and contain only one canonical type II K-homology (KH) domain that has the signature motif GXXG (where X represents any amino acid).	87
411783	cd02412	KH-II_30S_S3	type II K-homology (KH) RNA-binding domain found in 30S ribosomal protein S3 and similar proteins. 30S ribosomal protein S3, also called small ribosomal subunit protein uS3, is part of the head region of the 30S ribosomal subunit and binds to the lower part of the 30S subunit head. It may also bind mRNA in the 70S ribosome, positioning it for translation. S3 protein is believed to interact with mRNA as it threads its way from the latch into the channel. Members of this family contain only one canonical type II K-homology (KH) domain that has the signature motif GXXG (where X represents any amino acid).	108
411784	cd02413	KH-II_40S_S3	type II K-homology (KH) RNA-binding domain found in 40S ribosomal protein S3 and similar proteins. 40S ribosomal protein S3, also called small ribosomal subunit protein uS3, is part of the head region of the 40S ribosomal subunit and is involved in translation. It is believed to interact with mRNA as it threads its way from the latch into the channel. 40S ribosomal protein S3 has endonuclease activity and plays a role in repair of damaged DNA. It cleaves phosphodiester bonds of DNAs containing altered bases with broad specificity and cleaves supercoiled DNA more efficiently than relaxed DNA. Members of this family are mainly from prokaryotes and contain only one canonical type II K-homology (KH) domain that has the signature motif GXXG (where X represents any amino acid).	91
411785	cd02414	KH-II_Jag	type II K-homology (KH) RNA-binding domain found in protein Jag and similar proteins. Protein Jag, also called SpoIIIJ-associated protein, is associated with SpoIIIJ and is necessary for the third stage of sporulation. Members of this family are mainly from bacteria and contain only one canonical type II K-homology (KH) domain that has the signature motif GXXG (where X represents any amino acid).	79
239098	cd02417	Peptidase_C39_likeA	A sub-family of peptidase C39 which contains Cyclolysin and Hemolysin processing peptidases.  Peptidase family C39 mostly contains bacteriocin-processing endopeptidases from bacteria. The cysteine peptidases in family C39 cleave the "double-glycine" leader peptides from the precursors of various bacteriocins (mostly non-lantibiotic). The cleavage is mediated by the transporter as part of the secretion process. Bacteriocins are antibiotic proteins secreted by some species of bacteria that inhibit the growth of other bacterial species. The bacteriocin is synthesized as a precursor with an N-terminal leader peptide, and processing involves removal of the leader peptide by cleavage at a Gly-Gly bond, followed by translocation of the mature bacteriocin across the cytoplasmic membrane. Most endopeptidases of family C39 are N-terminal domains in larger proteins (ABC transporters) that serve both functions. The proposed protease active site is not conserved in this sub-family.	121
239099	cd02418	Peptidase_C39B	A sub-family of peptidase family C39. Peptidase family C39 mostly contains bacteriocin-processing endopeptidases from bacteria. The cysteine peptidases in family C39 cleave the "double-glycine" leader peptides from the precursors of various bacteriocins (mostly non-lantibiotic). The cleavage is mediated by the transporter as part of the secretion process. Bacteriocins are antibiotic proteins secreted by some species of bacteria that inhibit the growth of other bacterial species. The bacteriocin is synthesized as a precursor with an N-terminal leader peptide, and processing involves removal of the leader peptide by cleavage at a Gly-Gly bond, followed by translocation of the mature bacteriocin across the cytoplasmic membrane. Most endopeptidases of family C39 are N-terminal domains in larger proteins (ABC transporters) that serve both functions. The proposed protease active site is conserved in this sub-family.	136
239100	cd02419	Peptidase_C39C	A sub-family of peptidase family C39. Peptidase family C39 mostly contains bacteriocin-processing endopeptidases from bacteria. The cysteine peptidases in family C39 cleave the "double-glycine" leader peptides from the precursors of various bacteriocins (mostly non-lantibiotic). The cleavage is mediated by the transporter as part of the secretion process. Bacteriocins are antibiotic proteins secreted by some species of bacteria that inhibit the growth of other bacterial species. The bacteriocin is synthesized as a precursor with an N-terminal leader peptide, and processing involves removal of the leader peptide by cleavage at a Gly-Gly bond, followed by translocation of the mature bacteriocin across the cytoplasmic membrane. Most endopeptidases of family C39 are N-terminal domains in larger proteins (ABC transporters) that serve both functions. The proposed protease active site is conserved in this sub-family.	127
239101	cd02420	Peptidase_C39D	A sub-family of peptidase family C39. Peptidase family C39 mostly contains bacteriocin-processing endopeptidases from bacteria. The cysteine peptidases in family C39 cleave the "double-glycine" leader peptides from the precursors of various bacteriocins (mostly non-lantibiotic). The cleavage is mediated by the transporter as part of the secretion process. Bacteriocins are antibiotic proteins secreted by some species of bacteria that inhibit the growth of other bacterial species. The bacteriocin is synthesized as a precursor with an N-terminal leader peptide, and processing involves removal of the leader peptide by cleavage at a Gly-Gly bond, followed by translocation of the mature bacteriocin across the cytoplasmic membrane. Most endopeptidases of family C39 are N-terminal domains in larger proteins (ABC transporters) that serve both functions. The proposed protease active site is conserved in this sub-family.	125
239102	cd02421	Peptidase_C39_likeD	A sub-family of peptidase family C39. Peptidase family C39 mostly contains bacteriocin-processing endopeptidases from bacteria. The cysteine peptidases in family C39 cleave the "double-glycine" leader peptides from the precursors of various bacteriocins (mostly non-lantibiotic). The cleavage is mediated by the transporter as part of the secretion process. Bacteriocins are antibiotic proteins secreted by some species of bacteria that inhibit the growth of other bacterial species. The bacteriocin is synthesized as a precursor with an N-terminal leader peptide, and processing involves removal of the leader peptide by cleavage at a Gly-Gly bond, followed by translocation of the mature bacteriocin across the cytoplasmic membrane. Most endopeptidases of family C39 are N-terminal domains in larger proteins (ABC transporters) that serve both functions. The proposed protease active site is not conserved in this sub-family.	124
239103	cd02423	Peptidase_C39G	A sub-family of peptidase family C39. Peptidase family C39 mostly contains bacteriocin-processing endopeptidases from bacteria. The cysteine peptidases in family C39 cleave the "double-glycine" leader peptides from the precursors of various bacteriocins (mostly non-lantibiotic). The cleavage is mediated by the transporter as part of the secretion process. Bacteriocins are antibiotic proteins secreted by some species of bacteria that inhibit the growth of other bacterial species. The bacteriocin is synthesized as a precursor with an N-terminal leader peptide, and processing involves removal of the leader peptide by cleavage at a Gly-Gly bond, followed by translocation of the mature bacteriocin across the cytoplasmic membrane. Most endopeptidases of family C39 are N-terminal domains in larger proteins (ABC transporters) that serve both functions. The proposed protease active site is conserved in this sub-family of proteins with a single peptidase domain, which are lacking the nucleotide-binding transporter signature.	129
239104	cd02424	Peptidase_C39E	A sub-family of peptidase family C39. Peptidase family C39 mostly contains bacteriocin-processing endopeptidases from bacteria. The cysteine peptidases in family C39 cleave the "double-glycine" leader peptides from the precursors of various bacteriocins (mostly non-lantibiotic). The cleavage is mediated by the transporter as part of the secretion process. Bacteriocins are antibiotic proteins secreted by some species of bacteria that inhibit the growth of other bacterial species. The bacteriocin is synthesized as a precursor with an N-terminal leader peptide, and processing involves removal of the leader peptide by cleavage at a Gly-Gly bond, followed by translocation of the mature bacteriocin across the cytoplasmic membrane. Most endopeptidases of family C39 are N-terminal domains in larger proteins (ABC transporters) that serve both functions. The proposed protease active site is conserved in this sub-family, which contains Colicin V processing peptidase.	129
239105	cd02425	Peptidase_C39F	A sub-family of peptidase family C39. Peptidase family C39 mostly contains bacteriocin-processing endopeptidases from bacteria. The cysteine peptidases in family C39 cleave the "double-glycine" leader peptides from the precursors of various bacteriocins (mostly non-lantibiotic). The cleavage is mediated by the transporter as part of the secretion process. Bacteriocins are antibiotic proteins secreted by some species of bacteria that inhibit the growth of other bacterial species. The bacteriocin is synthesized as a precursor with an N-terminal leader peptide, and processing involves removal of the leader peptide by cleavage at a Gly-Gly bond, followed by translocation of the mature bacteriocin across the cytoplasmic membrane. Most endopeptidases of family C39 are N-terminal domains in larger proteins (ABC transporters) that serve both functions. The proposed protease active site is conserved in this sub-family.	126
239106	cd02426	Pol_gamma_b_Cterm	C-terminal domain of mitochondrial DNA polymerase gamma B subunit, which is required for processivity. Polymerase gamma replicates and repairs mitochondrial DNA. The c-terminal domain of its B subunit is strikingly similar to the anticodon-binding domain of glycyl tRNA synthetase.	128
239107	cd02429	PTH2_like	Peptidyl-tRNA hydrolase, type 2 (PTH2)_like. Peptidyl-tRNA hydrolase activity releases tRNA from the premature translation termination product peptidyl-tRNA. Two structurally different enzymes have been reported to encode such activity: Pth, present in bacteria and eukaryotes, and Pth2, present in archaea and eukaryotes. There is no functional information for this eukaryote-specific subgroup.	116
239108	cd02430	PTH2	Peptidyl-tRNA hydrolase, type 2 (PTH2). Peptidyl-tRNA hydrolase (PTH) activity releases tRNA from the premature translation termination product peptidyl-tRNA, therefore allowing the tRNA and peptide to be reused in protein synthesis. PTH2 is present in archaea and eukaryotes.	115
153122	cd02431	Ferritin_CCC1_C	CCC1-related domain of ferritin. Ferritin_CCC1_like_C: The proteins of this family contain two domains. This is the C-terminal domain that is closely related to the CCC1, a vacuole transmembrane protein functioning as an iron and manganese transporter. The N-terminal domain is similar to ferritin-like diiron-carboxylate proteins, which are involved in a variety of iron ion related functions, such as iron storage and regulation, mono-oxygenation, and reactive radical production. This family may be unique to certain bacteria and archaea. 	149
153123	cd02432	Nodulin-21_like_1	Nodulin-21 and CCC1-related protein family. Nodulin-21_like_1: This is a family of proteins closely related to nodulin-21, a plant nodule-specific protein that may be involved in symbiotic nitrogen fixation. This family is also related to CCC1, a yeast vacuole transmembrane protein that functions as an iron and manganese transporter. 	218
153124	cd02433	Nodulin-21_like_2	Nodulin-21 and CCC1-related protein family. Nodulin-21_like_2: This is a family of proteins closely related to nodulin-21, a plant nodule-specific protein that may be involved in symbiotic nitrogen fixation. This family is also related to CCC1, a yeast vacuole transmembrane protein that functions as an iron and manganese transporter. 	234
153125	cd02434	Nodulin-21_like_3	Nodulin-21 and CCC1-related protein family. Nodulin-21_like_3: This is a family of proteins closely related to nodulin-21, a plant nodule-specific protein that may be involved in symbiotic nitrogen fixation. This family is also related to CCC1, a yeast vacuole transmembrane protein that functions as an iron and manganese transporter. 	225
153126	cd02435	CCC1	CCC1. CCC1: This domain is present in the CCC1, an iron and manganese transporter of Saccharomyces cerevisiae. CCC1 is a transmembrane protein that is located in the vacuole and transfers the iron and manganese ions from the cytosol to the vacuole. This domain may be unique to certain fungi and plants.	241
153127	cd02436	Nodulin-21	Nodulin-21. Nodulin-21: This is a family of proteins that may be unique to certain plants. The family member in soybean is found to be nodule-specific and is abundant during nodule development. The proteins of this family thus may play a role in symbiotic nitrogen fixation.  	152
153128	cd02437	CCC1_like_1	CCC1-related protein family. CCC1_like_1: This is a protein family closely related to CCC1, a family of proteins involved in iron and manganese transport. Yeast CCC1 is a vacuole transmembrane protein responsible for the iron and manganese accumulation in vacuole.   	175
143332	cd02439	DMB-PRT_CobT	Nicotinate-nucleotide-dimethylbenzimidazole phosphoribosyltransferase (DMB-PRT), also called CobT. Nicotinate-nucleotide-dimethylbenzimidazole phosphoribosyltransferase (DMB-PRT/CobT, not to be confused with the CobT subunit of cobaltochelatase, which does not belong to this group) catalyzes the synthesis of alpha-ribazole-5'-phosphate, from nicotinate mononucleotide (NAMN) and 5,6-dimethylbenzimidazole (DMB). This function is essential to the anaerobic biosynthesis pathway of cobalamin (vitamin B12), which is the largest and most complex cofactor in a number of enzyme-catalyzed reactions in bacteria, archaea and eukaryotes. Only eubacteria and archaebacteria can synthesize vitamin B12; multicellular organisms have lost this ability during evolution. DMB-PRT/CobT works sequentially with CobC (a phosphatase) to couple the lower ligand of cobalamin to a ribosyl moiety. DMB is the most common lower ligand of cobamides; other lower ligands include adenine, 5-methoxybenzimidazole or phenol. It has been suggested that earlier metabolic or enzymatic steps may control which lower ligand is available to DMB-PRT/CobT. In Salmonella enterica, for example, the lower ligand is DMB under aerobic conditions and adenine or 2-methyladenine under anaerobic conditions. Salmonella enterica DMB-PRT/CobT is a homodimer with two active sites, each active site is comprised of residues from both monomers. This group includes two distinct subfamilies, one archaeal-like, the other comprised of bacterial sequences.	315
100107	cd02440	AdoMet_MTases	S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I;  AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.).	107
133000	cd02503	MobA	MobA catalyzes the formation of molybdopterin guanine dinucleotide. The prokaryotic enzyme molybdopterin-guanine dinucleotide biosynthesis protein A (MobA). All mononuclear molybdoenzymes bind molybdenum in complex with an organic cofactor termed molybdopterin (MPT). In many bacteria, including Escherichia coli, molybdopterin can be further modified by attachment of a GMP group to the terminal phosphate of molybdopterin to form molybdopterin guanine dinucleotide (MGD). This GMP attachment step is catalyzed by MobA, by linking a guanosine 5'-phosphate to MPT forming molybdopterin guanine dinucleotide. This reaction requires GTP, MgCl2, and the MPT form of the cofactor. It is a reaction unique to prokaryotes, and therefore may represent a potential drug target.	181
133001	cd02507	eIF-2B_gamma_N_like	The N-terminal domain of eIF-2B_gamma_like is predicted to have glycosyltransferase activity. The N-terminal domain of eIF-2B epsilon and gamma, subunits of eukaryotic translation initiators, is a subfamily of glycosyltransferase 2 and is predicted to have glycosyltransferase activity. eIF-2B is a guanine nucleotide-exchange factor which mediates the exchange of GDP (bound to initiation factor eIF2) for GTP, generating active eIF2.GTP complex. eIF-2B is a complex multimeric protein consisting of five subunits named alpha, beta, gamma, delta and epsilon. Subunit epsilon shares sequence similarity with gamma subunit, and with a family of bifunctional nucleotide-binding enzymes such as ADP-glucose pyrophosphorylase, suggesting that epsilon subunit may play roles in nucleotide binding activity. In yeast, eIF2B gamma enhances the activity of eIF2B-epsilon leading to the idea that these subunits form the catalytic subcomplex.	216
133002	cd02508	ADP_Glucose_PP	ADP-glucose pyrophosphorylase is involved in the biosynthesis of glycogen or starch. ADP-glucose pyrophosphorylase (glucose-1-phosphate adenylyltransferase) catalyzes a very important step in the biosynthesis of alpha 1,4-glucans (glycogen or starch) in bacteria and plants: synthesis of the activated glucosyl donor, ADP-glucose, from glucose-1-phosphate and ATP. ADP-glucose pyrophosphorylase is a tetrameric allosterically regulated enzyme. While a homotetramer in bacteria, in plant chloroplasts and amyloplasts, it is a heterotetramer of two different, yet evolutionarily related, subunits. There are a number of conserved regions in the sequence of bacterial and plant ADP-glucose pyrophosphorylase subunits. It is a subfamily of a very diverse glycosyltransferase family 2.	200
133003	cd02509	GDP-M1P_Guanylyltransferase	GDP-M1P_Guanylyltransferase catalyzes the formation of GDP-Mannose. GDP-mannose-1-phosphate guanylyltransferase, also called GDP-mannose pyrophosphorylase (GDP-MP), catalyzes the formation of GDP-Mannose from mannose-1-phosphate and GTP. Mannose is a key monosaccharide for glycosylation of proteins and lipids. GDP-Mannose is the activated donor for mannosylation of various biomolecules. This enzyme is known to be bifunctional, as both mannose-6-phosphate isomerase and mannose-1-phosphate guanylyltransferase. This CD covers the N-terminal GDP-mannose-1-phosphate guanylyltransferase domain, whereas the isomerase function is located at the C-terminal half. GDP-MP is a member of the nucleotidyltransferase family of enzymes.	274
133004	cd02510	pp-GalNAc-T	pp-GalNAc-T initiates the formation of mucin-type O-linked glycans. UDP-GalNAc: polypeptide alpha-N-acetylgalactosaminyltransferases (pp-GalNAc-T) initiate the formation of mucin-type, O-linked glycans by catalyzing the transfer of alpha-N-acetylgalactosamine (GalNAc) from UDP-GalNAc to hydroxyl groups of Ser or Thr residues of core proteins to form the Tn antigen (GalNAc-a-1-O-Ser/Thr). These enzymes are type II membrane proteins with a GT-A type catalytic domain and a lectin domain located on the lumen side of the Golgi apparatus. In human, there are 15 isozymes of pp-GalNAc-Ts, representing the largest of all glycosyltransferase families. Each isozyme has unique but partially redundant substrate specificity for glycosylation sites on acceptor proteins.	299
133005	cd02511	Beta4Glucosyltransferase	UDP-glucose LOS-beta-1,4 glucosyltransferase is required for biosynthesis of lipooligosaccharide. UDP-glucose: lipooligosaccharide (LOS) beta-1-4-glucosyltransferase catalyzes the addition of the first residue, glucose, of the lacto-N-neotetraose structure to HepI of the LOS inner core. LOS is the major constituent of the outer leaflet of the outer membrane of gram-negative bacteria. It consists of a short oligosaccharide chain of variable composition (alpha chain) attached to a branched inner core which is linked in turn to lipid A. Beta 1,4 glucosyltransferase is required to attach the alpha chain to the inner core.	229
133006	cd02513	CMP-NeuAc_Synthase	CMP-NeuAc_Synthase activates N-acetylneuraminic acid by adding CMP moiety. CMP-N-acetylneuraminic acid synthetase (CMP-NeuAc synthetase) or acylneuraminate cytidylyltransferase catalyzes the transfer of the CMP moiety of CTP to the anomeric hydroxyl group of NeuAc in the presence of Mg++. It is the second to last step in the sialylation of the oligosaccharide component of glycoconjugates by providing the activated sugar-nucleotide cytidine 5'-monophosphate N-acetylneuraminic acid (CMP-Neu5Ac), the substrate for sialyltransferases. Eukaryotic CMP-NeuAc synthetases are predominantly located in the nucleus. The activated CMP-Neu5Ac diffuses from the nucleus into the cytoplasm.	223
133007	cd02514	GT13_GLCNAC-TI	GT13_GLCNAC-TI is involved in an essential step in the synthesis of complex or hybrid-type N-linked oligosaccharides. Alpha-1,3-mannosyl-glycoprotein beta-1,2-N-acetylglucosaminyltransferase (GLCNAC-T I, GNT-I) transfers N-acetyl-D-glucosamine from UDP to high-mannose glycoprotein N-oligosaccharide, an essential step in the synthesis of complex or hybrid-type N-linked oligosaccharides. The enzyme is an integral membrane protein localized to the Golgi apparatus. The catalytic domain is located at the C-terminus. These proteins are members of the glycosyltransferase family 13.	334
133008	cd02515	Glyco_transf_6	Glycosyltransferase family 6 comprises enzymes responsible for the production of the human ABO blood group antigens. Glycosyltransferase family 6, GT_6, comprises enzymes with three known activities: alpha-1,3-galactosyltransferase, alpha-1,3 N-acetylgalactosaminyltransferase, and alpha-galactosyltransferase. UDP-galactose:beta-galactosyl alpha-1,3-galactosyltransferase (alpha3GT) catalyzes the transfer of galactose from UDP-alpha-D-galactose into an alpha-1,3 linkage with beta-galactosyl groups in glycoconjugates. The enzyme exists in most mammalian species but is absent from humans, apes, and old world monkeys as a result of the mutational inactivation of the gene. The alpha-1,3 N-acetylgalactosaminyltransferase and alpha-galactosyltransferase are responsible for the production of the human ABO blood group antigens. An N-acetylgalactosaminyltransferase uses a UDP-GalNAc donor to convert the H-antigen acceptor to the A antigen, whereas a galactosyltransferase uses a UDP-galactose donor to convert the H-antigen acceptor to the B antigen. Alpha-1,3 N-acetylgalactosaminyltransferase and alpha-galactosyltransferase differ only in the identity of four critical amino acid residues.	271
133009	cd02516	CDP-ME_synthetase	CDP-ME synthetase is involved in mevalonate-independent isoprenoid production. 4-diphosphocytidyl-2-methyl-D-erythritol synthase (CDP-ME), also called 2C-methyl-D-erythritol 4-phosphate cytidylyltransferase, catalyzes the third step in the alternative (non-mevalonate) pathway of isopentenyl diphosphate (IPP) biosynthesis: the formation of 4-diphosphocytidyl-2C-methyl-D-erythritol from CTP and 2C-methyl-D-erythritol 4-phosphate. This mevalonate-independent pathway, which utilizes pyruvate and glyceraldehyde 3-phosphate as starting materials for production of IPP, occurs in a variety of bacteria, archaea and plant cells, but is absent in mammals. Thus, CDP-ME synthetase is an attractive target for the structure-based design of selective antibacterial, herbicidal and antimalarial drugs.	218
133010	cd02517	CMP-KDO-Synthetase	CMP-KDO synthetase catalyzes the activation of KDO which is an essential component of the lipopolysaccharide. CMP-KDO Synthetase: 3-Deoxy-D-manno-octulosonate cytidylyltransferase (CMP-KDO synthetase) catalyzes the conversion of CTP and 3-deoxy-D-manno-octulosonate into CMP-3-deoxy-D-manno-octulosonate (CMP-KDO) and pyrophosphate. KDO is an essential component of the lipopolysaccharide found in the outer surface of gram-negative eubacteria. It is also a constituent of the capsular polysaccharides of some gram-negative eubacteria. Its presence in the cell wall polysaccharides of green algae and plants has also been reported. However, it has not been found in yeast or animals. The absence of the enzyme in mammalian cells makes it an attractive target molecule for drug design.	239
133011	cd02518	GT2_SpsF	SpsF is a glycosyltransferase implicated in the synthesis of the spore coat. Spore coat polysaccharide biosynthesis protein F (spsF) is a glycosyltransferase implicated in the synthesis of the spore coat in a variety of bacteria challenged by stress such as starvation. The spsF gene is expressed in the late stage of coat development and is responsible for a terminal step in coat formation that involves the glycosylation of the coat. Mutation of the spsF gene resulted in spores that appeared normal but tended to aggregate and had abnormal adsorption properties, indicating a surface alteration.	233
133012	cd02520	Glucosylceramide_synthase	Glucosylceramide synthase catalyzes the first glycosylation step of glycosphingolipid synthesis. UDP-glucose:N-acylsphingosine D-glucosyltransferase (glucosylceramide synthase or ceramide glucosyltransferase) catalyzes the first glycosylation step of glycosphingolipid synthesis. Its product, glucosylceramide, serves as the core of more than 300 glycosphingolipids (GSL). GSLs are a group of membrane components that have the lipid portion embedded in the outer plasma membrane leaflet and the sugar chains extended to the outer environment. Several lines of evidence suggest the importance of GSLs in various cellular processes such as differentiation, adhesion, proliferation, and cell-cell recognition. In pathogenic fungus Cryptococcus neoformans,  glucosylceramide serves as an antigen that elicits an antibody response in patients and it is essential for fungal growth in host extracellular environment.	196
133013	cd02522	GT_2_like_a	GT_2_like_a represents a glycosyltransferase family-2 subfamily with unknown function. Glycosyltransferase family 2 (GT-2) subfamily of unknown function. GT-2 includes diverse families of glycosyltransferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. These are enzymes that catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. Glycosyltransferases have been classified into more than 90 distinct sequence based families.	221
133014	cd02523	PC_cytidylyltransferase	Phosphocholine cytidylyltransferases catalyze the synthesis of CDP-choline. This family contains proteins similar to prokaryotic phosphocholine (P-cho) cytidylyltransferases. Phosphocholine (PC) cytidylyltransferases catalyze the transfer of a cytidine monophosphate from CTP to phosphocholine to form CDP-choline. PC is the most abundant phospholipid in eukaryotic membranes and it is also important in prokaryotic membranes. For pathogenic prokaryotes, the cell surface PC facilitates the interaction with host surface and induces attachment and invasion. In addition cell wall PC serves as scaffold for a group of choline-binding proteins that are secreted from the cells. Phosphocholine (PC) cytidylyltransferase is a key enzyme in the prokaryotic choline metabolism pathway. It has been hypothesized to consist of a choline transport system, a choline kinase, CTP:phosphocholine cytidylyltransferase, and a choline phosphotransferase that transfers P-Cho from CDP-Cho to either lipoteichoic acid or lipopolysaccharide.	229
133015	cd02524	G1P_cytidylyltransferase	G1P_cytidylyltransferase catalyzes the production of CDP-D-Glucose. Alpha-D-Glucose-1-phosphate Cytidylyltransferase catalyzes the production of CDP-D-Glucose from alpha-D-Glucose-1-phosphate and MgCTP as substrate. CDP-D-Glucose is the precursor for synthesizing four of the five naturally occurring 3,6-dideoxy sugars: abequose (3,6-dideoxy-D-Xylo-hexose), ascarylose (3,6-dideoxy-L-arabino-hexose), paratose (3,6-dideoxy-D-ribohexose), and tyvelose (3,6-dideoxy-D-arabino-hexose). Deoxysugars are ubiquitous in nature where they function in a variety of biological processes, including cell adhesion, immune response, determination of ABO blood groups, fertilization, antibiotic function, and microbial pathogenicity.	253
133016	cd02525	Succinoglycan_BP_ExoA	ExoA is involved in the biosynthesis of succinoglycan. Succinoglycan Biosynthesis Protein ExoA catalyzes the formation of a beta-1,3 linkage of the second sugar (glucose) of the succinoglycan with the galactose on the lipid carrier. Succinoglycan is an acidic exopolysaccharide that is important for invasion of the nodules. Succinoglycan is a high-molecular-weight polymer composed of repeating octasaccharide units. These units are synthesized on membrane-bound isoprenoid lipid carriers, beginning with galactose followed by seven glucose molecules, and modified by the addition of acetate, succinate, and pyruvate. ExoA is a membrane protein with a transmembrane domain at the C-terminus.	249
133017	cd02526	GT2_RfbF_like	RfbF is a putative dTDP-rhamnosyl transferase. Shigella flexneri RfbF protein is a putative dTDP-rhamnosyl transferase. dTDP rhamnosyl  transferases of Shigella flexneri  add rhamnose sugars to N-acetyl-glucosamine in the O-antigen tetrasaccharide repeat. Lipopolysaccharide O antigens are important virulence determinants for many bacteria. The variations of sugar composition, the sequence of the sugars and the linkages in the O antigen provide structural diversity of the O antigen.	237
133018	cd02537	GT8_Glycogenin	Glycogenin belongs to the GT 8 family and initiates the biosynthesis of glycogen. Glycogenin initiates the biosynthesis of glycogen by incorporating glucose residues through a self-glucosylation reaction at a Tyr residue, and then acts as substrate for chain elongation by glycogen synthase and branching enzyme. It contains a conserved DxD motif and an N-terminal beta-alpha-beta Rossmann-like fold that are common to the nucleotide-binding domains of most glycosyltransferases. The DxD motif is essential for coordination of the catalytic divalent cation, most commonly Mn2+. Glycogenin can be classified as a retaining glycosyltransferase, based on the relative anomeric stereochemistry of the substrate and product in the reaction catalyzed. It is placed in glycosyltransferase family 8 which includes lipopolysaccharide glucose and galactose transferases and galactinol synthases.	240
133019	cd02538	G1P_TT_short	G1P_TT_short is the short form of glucose-1-phosphate thymidylyltransferase. This family is the short form of glucose-1-phosphate thymidylyltransferase. Glucose-1-phosphate thymidylyltransferase catalyses the formation of dTDP-glucose from dTTP and glucose 1-phosphate. It is the first enzyme in the biosynthesis of dTDP-L-rhamnose, a cell wall constituent and a feedback inhibitor of the enzyme. There are two forms of glucose-1-phosphate thymidylyltransferase in bacteria and archaea: a short form and a long form. The homotetrameric, feedback-inhibited short form is found in numerous bacterial species that produce dTDP-L-rhamnose. The long form, which has an extra 50 amino acids at the C-terminus, is found in many species for which it serves as a sugar-activating enzyme for antibiotic biosynthesis and/or other, unknown pathways, and in which dTDP-L-rhamnose is not necessarily produced.	240
133020	cd02540	GT2_GlmU_N_bac	N-terminal domain of bacterial GlmU. The N-terminal domain of N-Acetylglucosamine-1-phosphate uridyltransferase (GlmU). GlmU is an essential bacterial enzyme with both an acetyltransferase and an uridyltransferase activity which have been mapped to the C-terminal and N-terminal domains, respectively. This family represents the N-terminal uridyltransferase. GlmU performs the last two steps in the synthesis of UDP-N-acetylglucosamine (UDP-GlcNAc), which is an essential precursor in both the peptidoglycan and the lipopolysaccharide metabolic pathways in Gram-positive and Gram-negative bacteria, respectively.	229
133021	cd02541	UGPase_prokaryotic	Prokaryotic UGPase catalyses the synthesis of UDP-glucose. Prokaryotic UDP-Glucose Pyrophosphorylase (UGPase) catalyzes a reversible production of UDP-Glucose and pyrophosphate (PPi) from glucose-1-phosphate and UTP. UDP-glucose plays pivotal roles in galactose utilization, in glycogen synthesis, and in the synthesis of the carbohydrate moieties of glycolipids, glycoproteins, and proteoglycans. UGPase is found in both prokaryotes and eukaryotes; although prokaryotic and eukaryotic forms of UGPase catalyze the same reaction, they share low sequence similarity.	267
239109	cd02549	Peptidase_C39A	A sub-family of peptidase family C39. Peptidase family C39 mostly contains bacteriocin-processing endopeptidases from bacteria. The cysteine peptidases in family C39 cleave the "double-glycine" leader peptides from the precursors of various bacteriocins (mostly non-lantibiotic). The cleavage is mediated by the transporter as part of the secretion process. Bacteriocins are antibiotic proteins secreted by some species of bacteria that inhibit the growth of other bacterial species. The bacteriocin is synthesized as a precursor with an N-terminal leader peptide, and processing involves removal of the leader peptide by cleavage at a Gly-Gly bond, followed by translocation of the mature bacteriocin across the cytoplasmic membrane. Most endopeptidases of family C39 are N-terminal domains in larger proteins (ABC transporters) that serve both functions. The proposed protease active site is conserved in this sub-family of proteins with a single peptidase domain, which are lacking the nucleotide-binding transporter signature or have different domain architectures.	141
211325	cd02550	PseudoU_synth_Rsu_Rlu_like	Pseudouridine synthase, Rsu/Rlu family. This group is comprised of eukaryotic, bacterial and archeal proteins similar to eight site-specific Escherichia coli pseudouridine synthases: RsuA, RluA, RluB, RluC, RluD, RluE, RluF and TruA. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi) requiring no cofactors.  E. coli RluC for example makes psi955, 2504 and 2580 in 23S RNA.  Some psi sites such as psi1917 in 23S RNA made by RluD are universally conserved.  Other psi sites occur in a more restricted fashion; for example, psi2819 in 21S mitochondrial ribosomal RNA made by S. cerevisiae Pus5p is only found in mitochondrial large subunit rRNAs from some other species and in gram negative bacteria. The E. coli counterpart of this psi residue is psi2580 in 23S rRNA.  psi2604 in 23S RNA made by RluF has only been detected in E. coli.	154
211326	cd02552	PseudoU_synth_TruD_like	Pseudouridine synthase, TruD family. This group consists of eukaryotic, bacterial and archeal pseudouridine synthases similar to Escherichia coli TruD and Saccharomyces cerevisiae Pus7.  Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi).  E. coli TruD and S. cerevisiae Pus7 make psi13 in cytoplasmic tRNAs. In addition S. cerevisiae Pus7 makes psi35 in U2 small nuclear RNA (U2 snRNA) and psi35 in pre-tRNATyr.  Psi35 in U2 snRNA and psi13 in tRNAs are highly phylogenetically conserved.  Psi34 is the mammalian U2 snRNA counterpart of yeast U2 snRNA psi35.	232
211327	cd02553	PseudoU_synth_RsuA	Pseudouridine synthase, Escherichia coli RsuA like. This group is comprised of eukaryotic and bacterial proteins similar to Escherichia coli RsuA. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi).  No cofactors are required. E. coli RsuA makes psi516 in 16S RNA. Psi at this position is not generally conserved in other organisms.	167
211328	cd02554	PseudoU_synth_RluF	Pseudouridine synthase, Escherichia coli RluF like. This group is comprised of bacterial proteins similar to Escherichia coli RluF. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi).  No cofactors are required. E. coli RluF makes psi2604 in 23S RNA. psi2604 has only been detected in E. coli. It is absent from other eubacteria despite a precursor U at that site, and from eukarya and archea, which lack a precursor U at that site.	164
211329	cd02555	PSSA_1	Pseudouridine synthase, a subgroup of the RsuA family. This group is comprised of bacterial proteins assigned to the RsuA family of pseudouridine synthases. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi).  No cofactors are required. The RsuA family is comprised of proteins related to Escherichia coli RsuA.	177
211330	cd02556	PseudoU_synth_RluB	Pseudouridine synthase, Escherichia coli RluB like. This group is comprised of bacterial and eukaryotic proteins similar to E. coli RluB. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi).  No cofactors are required. E. coli RluB makes psi2605 in 23S RNA.  psi2605 has been detected in eubacteria but not in eukarya and archea, despite the presence of a precursor U at that site.	167
211331	cd02557	PseudoU_synth_ScRIB2	Pseudouridine synthases similar to Saccharomyces cerevisiae RIB2. Pseudouridine synthase, Saccharomyces cerevisiae RIB2_like. This group is comprised of eukaryotic and bacterial proteins similar to Saccharomyces cerevisiae RIB2, S. cerevisiae Pus6p and human hRPUDSD2. S. cerevisiae RIB2 displays two distinct catalytic activities. The N-terminal domain of RIB2 is RNA:psi-synthase which makes psi32 on cytoplasmic tRNAs. Psi32 is highly phylogenetically conserved.   The C-terminal domain of RIB2 has a DRAP deaminase activity which catalyses the formation of 5-amino-6-ribitylamino-2,4(1H,3H)-pyrimidinedione 5'-phosphate from 2,5-diamino-6-ribitylamino-4(3H)-pyrimidinone 5'-phosphate during riboflavin biosynthesis. S. cerevisiae Pus6p makes the psi31 of cytoplasmic and mitochondrial tRNAs.	213
211332	cd02558	PSRA_1	Pseudouridine synthase, a subgroup of the RluA family. This group is comprised of bacterial proteins assigned to the RluA family of pseudouridine synthases. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi).  No cofactors are required. The RluA family is comprised of proteins related to Escherichia coli RluA.	246
211333	cd02563	PseudoU_synth_TruC	tRNA pseudouridine isomerase C. Pseudouridine synthases catalyze the isomerization of specific uridines in a tRNA molecule to pseudouridines (5-ribosyluracil, psi).  No cofactors are required. TruC makes psi65 in tRNAs.  This psi residue is not universally conserved.	223
211334	cd02566	PseudoU_synth_RluE	Pseudouridine synthase, Escherichia coli RluE. This group is comprised of bacterial proteins similar to E. coli RluE. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi).  No cofactors are required.  Escherichia coli RluE makes psi2457 in 23S RNA. psi2457 is not universally conserved.	168
211335	cd02568	PseudoU_synth_PUS1_PUS2	Pseudouridine synthase, PUS1/PUS2 like. This group consists of eukaryotic pseudouridine synthases similar to Saccharomyces cerevisiae Pus1p, S. cerevisiae Pus2p, Caenorhabditis elegans Pus1p and human PUS1. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi).  No cofactors are required. S. cerevisiae Pus1p catalyzes the formation of psi34 and psi36 in the intron-containing tRNAIle, psi35 in the intron-containing tRNATyr, psi27 and/or psi28 in several yeast cytoplasmic tRNAs, and psi44 in U2 small nuclear RNA (U2 snRNA). The presence of the intron is required for the formation of psi 34, 35 and 36. In addition S. cerevisiae PUS1 makes psi 26, 65 and 67.  C. elegans Pus1p does not modify psi44 in U2 snRNA. Mouse Pus1p makes psi27/28 in pre-tRNASer, tRNAVal and tRNAIle, psi 34/36 in tRNAIle, and psi 32 and potentially 67 in tRNAVal.  Psi44 in U2 snRNA and psi32 in tRNAs are highly phylogenetically conserved. Psi 26, 27, 28, 34, 35, 36, 65 and 67 in tRNAs are less highly conserved. Mouse Pus1p regulates nuclear receptor activity through pseudouridylation of Steroid Receptor RNA Activator. Missense mutation in human PUS1 causes mitochondrial myopathy and sideroblastic anemia (MLASA).	245
211336	cd02569	PseudoU_synth_ScPus3	Pseudouridine synthase, Saccharomyces cerevisiae Pus3 like. This group consists of eukaryotic pseudouridine synthases similar to S. cerevisiae Pus3p, mouse Pus3p, and human PUS2. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi).  No cofactors are required. S. cerevisiae Pus3p makes psi38 and psi39 in tRNAs. Mouse Pus3p has been shown to make psi38, and possibly also psi39, in tRNAs. Psi38 and psi39 are highly conserved in tRNAs from eubacteria, archea and eukarya.	256
211337	cd02570	PseudoU_synth_EcTruA	Eukaryotic and bacterial pseudouridine synthases similar to E.  coli TruA. This group consists of eukaryotic and bacterial pseudouridine synthases similar to E.  coli TruA, Pseudomonas aeruginosa truA and human pseudouridine synthase-like 1 (PUSL1). Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi).  No cofactors are required. E. coli TruA makes psi38/39 and/or 40 in tRNA.  psi38 and psi39 in tRNAs are highly phylogenetically conserved.  P. aeruginosa truA is required for induction of type III secretory genes and may act through modifying tRNAs critical for the expression of type III genes or their regulators.	239
211338	cd02572	PseudoU_synth_hDyskerin	Pseudouridine synthase, human dyskerin like. This group consists of eukaryotic and archeal pseudouridine synthases similar to human dyskerin, Saccharomyces cerevisiae Cbf5, and Drosophila melanogaster Mfl (minifly protein).  Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi).  No cofactor is required. S. cerevisiae Cbf5 and human dyskerin are nucleolar proteins that, with the help of guide RNAs, make the hundreds of pseudouridines present in rRNA and small nuclear RNAs (snRNAs).  Cbf5/Dyskerin is the catalytic subunit of eukaryotic box H/ACA small nucleolar ribonucleoprotein (snoRNP) particles. D. melanogaster mfl hosts, in its fourth intron, a box H/ACA snoRNA gene.  In addition dyskerin is likely to have a structural role in the telomerase complex.  Mutations in human dyskerin cause X-linked dyskeratosis congenita. Mutations in Drosophila Mfl result in miniflies that suffer abnormalities.	182
211339	cd02573	PseudoU_synth_EcTruB	Pseudouridine synthase, Escherichia coli TruB like. This group consists of bacterial pseudouridine synthases similar to E. coli TruB and Mycobacterium tuberculosis TruB. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi).  E. coli TruB and M.  tuberculosis TruB make psi55 in the T loop of tRNAs. Psi55 is nearly universally conserved.  E. coli TruB is not inhibited by RNA containing 5-fluorouridine.	213
211340	cd02575	PseudoU_synth_EcTruD	Pseudouridine synthase, similar to Escherichia coli TruD. This group consists of bacterial pseudouridine synthases similar to Escherichia coli TruD. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi).  E. coli TruD makes the highly phylogenetically conserved psi13 in tRNAs.	253
211341	cd02576	PseudoU_synth_ScPUS7	Pseudouridine synthase, TruD family. This group consists of eukaryotic pseudouridine synthases similar to Saccharomyces cerevisiae Pus7.  Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi).  Saccharomyces cerevisiae Pus7 makes psi35 in U2 small nuclear RNA (U2 snRNA), psi13 in cytoplasmic tRNAs and psi35 in pre-tRNATyr. Psi35 in yeast U2 snRNA and psi13 in tRNAs are highly phylogenetically conserved.  Psi34 is the mammalian U2 snRNA counterpart of yeast U2 snRNA psi35.	371
211342	cd02577	PSTD1	Pseudouridine synthase, a subgroup of the TruD family. This group consists of several hypothetical archeal pseudouridine synthases assigned to the TruD family of pseudouridine synthases.  Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi).  The TruD family is comprised of proteins related to Escherichia coli TruD.	319
259846	cd02582	RNAP_archeal_A'	A' subunit of archaeal RNA polymerase (RNAP). A' is the largest subunit of the archaeal RNA polymerase (RNAP). Archaeal RNAP is closely related to RNA polymerases in eukaryotes based on the subunit compositions. Archaeal RNAP is a large multi-protein complex, made up of 11 to 13 subunits, depending on the species, that are responsible for the synthesis of RNA. Structure studies suggest that RNAP complexes from different organisms share a crab-claw-shaped structure. The largest eukaryotic RNAP subunit is encoded by two separate archaeal subunits (A' and A'') which correspond to the N- and C-terminal domains of eukaryotic RNAP II Rpb1, respectively. The N-terminal domain of Rpb1 forms part of the active site and includes the head and the core of one clamp as well as the pore and funnel structures of RNAP II. Based on a structural comparison among the archaeal, bacterial and eukaryotic RNAPs the DNA binding channel and the active site are part of A' subunit which is conserved. The strong similarity between subunit A' and the N-terminal domain of Rpb1 suggests a similar functional and structural role for these two proteins.	861
259847	cd02583	RNAP_III_RPC1_N	Largest subunit (RPC1) of eukaryotic RNA polymerase III (RNAP III), N-terminal domain. Rpc1 (C160) subunit forms part of the active site region of RNAP III. RNAP III is one of the three distinct classes of nuclear RNAP in eukaryotes that is responsible for the synthesis of tRNAs, 5SrRNA, Alu-RNA, U6 snRNA genes, and some others. RNAP III is the largest nuclear RNA polymerase with 17 subunits. Structure studies suggest that different RNA polymerase complexes share a similar crab-claw-shaped structure. The N-terminal domain of Rpb1, the largest subunit of RNAP II in yeast, forms part of the active site, making up the head and core of the one clamp, as well as the pore and funnel structures of RNAP II. The strong homology between Rpc1 and Rpb1 suggests a similar functional and structural role.	816
132720	cd02584	RNAP_II_Rpb1_C	Largest subunit (Rpb1) of Eukaryotic RNA polymerase II (RNAP II), C-terminal domain. RNA polymerase II (RNAP II) is a large multi-subunit complex responsible for the synthesis of mRNA. RNAP II consists of a 10-subunit core enzyme and a peripheral heterodimer of two subunits. The largest core subunit (Rpb1) of yeast RNAP II is the best characterized member of this family. Structure studies suggest that RNAP complexes from different organisms share a crab-claw-shape structure. In yeast, Rpb1 and Rpb2, the largest and the second largest subunits, each makes up one clamp, one jaw, and part of the cleft. Rpb1 interacts with Rpb2 to form the DNA entry and RNA exit channels in addition to the catalytic center of RNA synthesis. The C-terminal domain of Rpb1 makes up part of the foot and jaw structures.	410
319784	cd02585	HAD_PMM	phosphomannomutase, similar to human PMM1 and PMM2, Saccharomyces Sec53p, and Arabidopsis thaliana PMM. PMM catalyzes the interconversion of mannose-6-phosphate (M6P) to mannose-1-phosphate (M1P); the conversion of M6P to M1P is an essential step in mannose activation and the biosynthesis of glycoconjugates in all eukaryotes. M1P is the substrate for the synthesis of GDP-mannose, which is an intermediate for protein glycosylation, protein sorting and secretion, and maintaining a functional endomembrane system in eukaryotic cells.  Proteins in this family contain a conserved phosphorylated motif DxDx(T/V) shared with some other phosphotransferases. This family contains two human homologs, PMM1 and PMM2; PMM2 deficiency causes congenital disorder of glycosylation type I-a, also known as Jaeken syndrome. PMM1 can also act as glucose-1,6-bisphosphatase in the brain after stimulation with inosine monophosphate; PMM2 on the other hand, is insensitive to IMP and demonstrates low glucose-1,6-bisphosphatase activity.  Arabidopsis thaliana PMM converted M1P into M6P and glucose-1-phosphate into glucose-6-phosphate, with the latter reaction being less efficient. Arabidopsis thaliana and Nicotiana benthamiana PMMs are involved in ascorbic acid biosynthesis. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	238
319785	cd02586	HAD_PHN	Phosphonoacetaldehyde hydrolase (phosphonatase); similar to Bacillus cereus phosphonatase. Degradation of the ubiquitous natural phosphonate 2-aminoethylphosphonate (AEP) into useable forms of nitrogen, carbon, and phosphorus is a two-step metabolic pathway. The first step, catalyzed by AEP transaminase, involves the transfer of NH3 from AEP to pyruvate, yielding phosphonoacetaldehyde (P-Ald) and alanine. In the second step, phosphonatase catalyzes the hydrolytic P-C bond cleavage of P-Ald to form orthophosphate and acetaldehyde. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	242
319786	cd02587	HAD_5-3dNT	5'(3')-deoxyribonucleotidase. This family includes cytosolic 5'(3')-deoxyribonucleotidase (cdN) and mitochondrial 5'(3')-deoxyribonucleotidase (mdN). cdN and mdN specifically dephosphorylate the deoxyribo form of nucleoside monophosphates, which helps maintain homeostasis of deoxynucleosides required for mitochondrial DNA synthesis. Their preferred substrates are dUMP and dTMP. cdN also dephosphorylates dGMP and dIMP efficiently. They can also dephosphorylate the 5'- or 3'-phosphates of pyrimidine ribonucleotides.  This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	161
319787	cd02588	HAD_L2-DEX	L-2-haloacid dehalogenase. L-2-Haloacid dehalogenase catalyzes the hydrolytic dehalogenation of L-2-haloacids to produce the corresponding D-2-hydroxyacids with an inversion of the C2-configuration. 2-haloacid dehalogenases are of interest for their potential to degrade recalcitrant halogenated environmental pollutants and their use in the synthesis of industrial chemicals. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	216
319788	cd02598	HAD_BPGM	beta-phosphoglucomutase, similar to Lactococcus lactis beta-phosphoglucomutase (beta-PGM). Lactococcus lactis beta-PGM catalyzes the interconversion of beta-D-glucose 1-phosphate (G1P) and D-glucose 6-phosphate (G6P), forming beta-D-glucose 1,6-(bis)phosphate as an intermediate. In the forward G6P-forming direction, this reaction links polysaccharide phosphorolysis to glycolysis, in the reverse direction, the reaction provides G1P for the biosynthesis of exo-polysaccharides. This subfamily belongs to the beta-phosphoglucomutase-like family whose other members include Saccharomyces cerevisiae phosphatases GPP1 and GPP2 that dephosphorylate DL-glycerol-3-phosphate and DOG1 and DOG2 that dephosphorylate 2-deoxyglucose-6-phosphate, and Escherichia coli 6-phosphogluconate phosphatase YieH. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	174
319789	cd02601	HAD_Eya	protein tyrosine phosphatase domain of the nuclear transcription factor of Eyes absent (Eya) and related phosphatase domains. Eyes absent (Eya) is a transcriptional coactivator, and an aspartyl-based protein tyrosine phosphatase. Eya and Six operate as a composite transcription factor, within a conserved network of transcription factors called the retinal determination (RD) network. The RD network interacts with a broad variety of signaling pathways to regulate the development and homeostasis of organs and tissues such as eye, muscle, kidney and ear. To date it is not clear what the physiologically relevant substrates of the Eya protein tyrosine phosphatase are, or whether this phosphatase activity plays a role in transcription. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	271
319790	cd02603	HAD_sEH-N_like	N-terminal lipase phosphatase domain of human soluble epoxide hydrolase, Escherichia coli YihX/HAD4 alpha-D-glucose 1-phosphate phosphatase, and related domains, may be inactive. This family includes the N-terminal phosphatase domain of human soluble epoxide hydrolase (sEH). sEH is a bifunctional enzyme with two distinct enzyme activities, the C-terminal domain has epoxide hydrolysis activity and the N-terminal domain (Ntermphos), which belongs to this family, has lipid phosphatase activity. The latter prefers mono-phosphate esters, and lysophosphatidic acids (LPAs) are the best natural substrates found to date.  In addition this family includes Gallus gallus sEH and Xenopus sEH which appears to lack phosphatase activity, and Escherichia coli YihX/HAD4 which selectively hydrolyzes alpha-Glucose-1-P, phosphatase, has significant phosphatase activity against pyridoxal phosphate, and has low beta phosphoglucomutase activity. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	195
319791	cd02604	HAD_5NT	haloacid dehalogenase (HAD)-like 5'-nucleotidases similar to Saccharomyces cerevisiae Phm8p and Sdt1p. This family includes Saccharomyces cerevisiae Phm8p (phosphate metabolism protein 8) and Sdt1p (Suppressor of disruption of TFIIS). Phm8p participates in the ribose salvage pathway; it catalyzes the dephosphorylation of nucleotide monophosphates to nucleosides, and its preferred substrates are the nucleotide monophosphates AMP, GMP, CMP, and UMP. Phm8p is also a lysophosphatidic acid phosphatase, dephosphorylating lysophosphatidic acids (LPAs) to monoacylglycerol in response to phosphate starvation. Sdt1p is a pyrimidine and pyridine-specific 5'-nucleotidase; it is an NMN/NaMN 5'-nucleotidase involved in the production of nicotinamide riboside and nicotinic acid riboside, and is a pyrimidine 5'-nucleotidase with high specificity for UMP and CMP. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	182
319792	cd02605	HAD_SPP	sucrose-phosphatase, similar to Synechocystis sp PCC 6803 SPP. Sucrose-phosphatase (SPP; EC 3.1.3.24) catalyzes the dephosphorylation of sucrose-6(F)-phosphate (Suc6P)-the final step in the pathway of sucrose biosynthesis in plants and cyanobacteria. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	245
319793	cd02607	HAD_ThrH_like	bifunctional phosphoserine phosphatase/phosphoserine:homoserine phosphotransferase, similar to Pseudomonas aeruginosa ThrH. This family includes Pseudomonas aeruginosa ThrH, which is a dual-activity enzyme having both phosphoserine phosphatase and phosphoserine:homoserine phosphotransferase activities, i.e. it can dephosphorylate phosphoserine, and can transfer phosphate from phosphoserine to homoserine. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	195
319794	cd02608	P-type_ATPase_Na-K_like	alpha-subunit of Na(+)/K(+)-ATPases and of gastric H(+)/K(+)-ATPase, similar to the human Na(+)/K(+)-ATPase alpha subunits 1-4. This subfamily includes the alpha subunit of Na(+)/K(+)-ATPase, a heteromeric transmembrane protein composed of an alpha- and beta-subunit and an optional third subunit belonging to the FXYD proteins, which are more tissue-specific regulatory subunits of the enzyme. The alpha-subunit is the catalytic subunit responsible for transport activities of the enzyme. This subfamily includes all four isoforms of the human alpha subunit (alpha1-alpha4, encoded by the ATP1A1-ATP1A4 genes).  Na(+)/K(+)-ATPase functions chiefly as an ion pump, hydrolyzing one molecule of ATP to pump three Na(+) out of the cell in exchange for two K(+) entering the cell per pump cycle. In addition Na(+)/K(+)-ATPase acts as a signal transducer. This subfamily also includes Oreochromis mossambicus (tilapia) Na(+)/K(+)-ATPase alpha 1 and alpha 3 subunits, and gastric H(+)/K(+)-ATPase which exchanges hydronium ion with potassium and is responsible for gastric acid secretion. Gastric H(+)/K(+)-ATPase is an alpha,beta-heterodimeric enzyme. This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle.	905
319795	cd02609	P-type_ATPase	uncharacterized subfamily of P-type ATPase transporter, similar to uncharacterized Streptococcus pneumoniae exported protein 7, Exp7. This subfamily contains P-type ATPase transporters of unknown function, similar to Streptococcus pneumoniae Exp7. The P-type ATPases, are a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids. They are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle.  A general characteristic of P-type ATPases is a bundle of transmembrane helices which make up the transport path, and three domains on the cytoplasmic side of the membrane. Members include pumps that transport various light metal ions, such as H(+), Na(+), K(+), Ca(2+), and Mg(2+), pumps that transport indispensable trace elements, such as Zn(2+) and Cu(2+), pumps that remove toxic heavy metal ions, such as Cd(2+), and pumps such as aminophospholipid translocases which transport phosphatidylserine and phosphatidylethanolamine.	661
319796	cd02612	HAD_PGPPase	phosphatidylglycerol-phosphate phosphatase, similar to Escherichia coli K-12 phosphatidylglycerol-phosphate phosphatase C. This family includes Escherichia coli K-12 phosphatidylglycerol-phosphate phosphatase C, PgpC (previously named yfhB) which catalyzes the dephosphorylation of phosphatidylglycerol-phosphate (PGP) to phosphatidylglycerol (PG). This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	195
319797	cd02616	HAD_PPase	pyrophosphatase similar to Bacillus subtilis PpaX. This family includes Bacillus subtilis PpaX which hydrolyzes pyrophosphate formed during serine-46-phosphorylated HPr (P-Ser-HPr) dephosphorylation by the bifunctional enzyme HPr kinase/phosphorylase. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	207
239110	cd02619	Peptidase_C1	C1 Peptidase family (MEROPS database nomenclature), also referred to as the papain family; composed of two subfamilies of cysteine peptidases (CPs), C1A (papain) and C1B (bleomycin hydrolase). Papain-like enzymes are mostly endopeptidases with some exceptions like cathepsins B, C, H and X, which are exopeptidases. Papain-like CPs have different functions in various organisms. Plant CPs are used to mobilize storage proteins in seeds while mammalian CPs are primarily lysosomal enzymes responsible for protein degradation in the lysosome. Papain-like CPs are synthesized as inactive proenzymes with N-terminal propeptide regions, which are removed upon activation. Bleomycin hydrolase (BH) is a CP that detoxifies bleomycin by hydrolysis of an amide group. It acts as a carboxypeptidase on its C-terminus to convert itself into an aminopeptidase and peptide ligase. BH is found in all tissues in mammals as well as in many other eukaryotes. It forms a hexameric ring barrel structure with the active sites imbedded in the central channel. Some members of the C1 family are proteins classified as non-peptidase homologs which lack peptidase activity or have missing active site residues.	223
239111	cd02620	Peptidase_C1A_CathepsinB	Cathepsin B group; composed of cathepsin B and similar proteins, including tubulointerstitial nephritis antigen (TIN-Ag). Cathepsin B is a lysosomal papain-like cysteine peptidase which is expressed in all tissues and functions primarily as an exopeptidase through its carboxydipeptidyl activity. Together with other cathepsins, it is involved in the degradation of proteins, proenzyme activation, Ag processing, metabolism and apoptosis. Cathepsin B has been implicated in a number of human diseases such as cancer, rheumatoid arthritis, osteoporosis and Alzheimer's disease. The unique carboxydipeptidyl activity of cathepsin B is attributed to the presence of an occluding loop in its active site which favors the binding of the C-termini of substrate proteins. Some members of this group do not possess the occluding loop. TIN-Ag is an extracellular matrix basement protein which was originally identified as a target Ag involved in anti-tubular basement membrane antibody-mediated interstitial nephritis. It plays a role in renal tubulogenesis and is defective in hereditary tubulointerstitial disorders. TIN-Ag is exclusively expressed in kidney tissues. 	236
239112	cd02621	Peptidase_C1A_CathepsinC	Cathepsin C; also known as Dipeptidyl Peptidase I (DPPI), an atypical papain-like cysteine peptidase with chloride dependency and dipeptidyl aminopeptidase activity, resulting from its tetrameric structure which limits substrate access. Each subunit of the tetramer is composed of three peptides: the heavy and light chains, which together adopt the papain fold and form the catalytic domain; and the residual propeptide region, which forms a beta barrel and points towards the substrate's N-terminus. The subunit composition is the result of the unique characteristic of procathepsin C maturation involving the cleavage of the catalytic domain and the non-autocatalytic excision of an activation peptide within its propeptide region. By removing N-terminal dipeptide extensions, cathepsin C activates granule serine peptidases (granzymes) involved in cell-mediated apoptosis, inflammation and tissue remodelling. Loss-of-function mutations in cathepsin C are associated with Papillon-Lefevre and Haim-Munk syndromes, rare diseases characterized by hyperkeratosis and early-onset periodontitis. Cathepsin C is widely expressed in many tissues with high levels in lung, kidney and placenta. It is also highly expressed in cytotoxic lymphocytes and mature myeloid cells.	243
100065	cd02636	R3H_sperm-antigen	R3H domain of a group of metazoan proteins that is related to the sperm-associated antigen 7. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the domain is predicted to bind ssDNA or ssRNA in a sequence-specific manner.	61
100066	cd02637	R3H_PARN	R3H domain of Poly(A)-specific ribonuclease (PARN). PARN is a poly(A)-specific 3' exonuclease from the RNase D family that, in Xenopus, deadenylates a specific class of maternal mRNAs which results in their translational repression. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the domain is predicted to bind ssDNA or ssRNA.	65
100067	cd02638	R3H_unknown_1	R3H domain of a group of eukaryotic proteins with unknown function. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the domain is predicted to bind ssDNA or ssRNA in a sequence-specific manner.	62
100068	cd02639	R3H_RRM	R3H domain of mainly fungal proteins which are associated with a RNA recognition motif (RRM) domain. Present in this group is the RNA-binding post-transcriptional regulator Cip2 (Csx1-interacting protein 2) involved in counteracting Csx1 function. Csx1 plays a central role in controlling gene expression during oxidative stress. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the domain is predicted to bind ssDNA or ssRNA in a sequence-specific manner.	60
100069	cd02640	R3H_NRF	R3H domain of the NF-kappaB-repression factor (NRF). NRF is a nuclear inhibitor of NF-kappaB proteins that can silence the IFNbeta promoter via binding to a negative regulatory element (NRE). Besides the R3H domain, NRF also contains a G-patch domain. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the domain is predicted to bind ssDNA or ssRNA in a sequence-specific manner.	60
100070	cd02641	R3H_Smubp-2_like	R3H domain of Smubp-2_like proteins.  Smubp-2_like proteins also contain a helicase_like and an AN1-like Zinc finger domain and have been shown to bind single-stranded DNA. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the domain is predicted to bind ssDNA or ssRNA.	60
100071	cd02642	R3H_encore_like	R3H domain of encore-like and DIP1-like proteins. Drosophila encore is involved in the germline exit after four mitotic divisions, by facilitating SCF-ubiquitin-proteasome-dependent proteolysis. Maize DBF1-interactor protein 1 (DIP1) containing an R3H domain is a potential regulator of DBF1 activity in stress responses. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the domain is predicted to bind ssDNA or ssRNA in a sequence-specific manner.	63
100072	cd02643	R3H_NF-X1	R3H domain of the X1 box binding protein (NF-X1) and related proteins. Human NF-X1 is a transcription factor that regulates the expression of class II major histocompatibility complex (MHC) genes. The Drosophila homolog shuttle craft (STC) has been shown to be a DNA- or RNA-binding protein required for proper axon guidance in the central nervous system, and the yeast homolog FAP1 encodes a dosage suppressor of rapamycin toxicity. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the domain is predicted to bind ssDNA or ssRNA in a sequence-specific manner.	74
100073	cd02644	R3H_jag	R3H domain found in proteins homologous to Bacillus subtilis Jag, which is associated with SpoIIIJ. SpoIIIJ is necessary for the third stage of sporulation. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the domain is predicted to bind ssDNA or ssRNA in a sequence-specific manner.	67
100074	cd02645	R3H_AAA	R3H domain of a group of proteins with unknown function, which also contain an AAA-ATPase (AAA) domain. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the domain is predicted to bind ssDNA or ssRNA in a sequence-specific manner.	60
100075	cd02646	R3H_G-patch	R3H domain of a group of fungal and plant proteins with unknown function, which also contain a G-patch domain. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the R3H domain is predicted to bind ssDNA or ssRNA in a sequence-specific manner.	58
239113	cd02647	nuc_hydro_TvIAG	nuc_hydro_TvIAG: Nucleoside hydrolases similar to the inosine-adenosine-guanosine-preferring nucleoside hydrolase from Trypanosoma vivax. Nucleoside hydrolases cleave the N-glycosidic bond in nucleosides generating ribose and the respective base. Nucleoside hydrolases vary in their substrate specificity. This group contains eukaryotic and bacterial proteins similar to the purine-specific inosine-adenosine-guanosine-preferring nucleoside hydrolase (IAG-NH) from T. vivax. T. vivax IAG-NH is on the order of a thousand- to ten-thousand-fold more specific towards the naturally occurring purine nucleosides than towards the pyrimidine nucleosides.	312
239114	cd02648	nuc_hydro_1	NH_1: A subgroup of nucleoside hydrolases. This group contains fungal proteins similar to nucleoside hydrolases. Nucleoside hydrolases cleave the N-glycosidic bond in nucleosides generating ribose and the respective base. These enzymes vary in their substrate specificity.  	367
239115	cd02649	nuc_hydro_CeIAG	nuc_hydro_CeIAG: Nucleoside hydrolases similar to the inosine-adenosine-guanosine-preferring nucleoside hydrolase from Caenorhabditis elegans. Nucleoside hydrolases cleave the N-glycosidic bond in nucleosides generating ribose and the respective base. These enzymes vary in their substrate specificity. This group contains eukaryotic, bacterial and archaeal proteins similar to the purine-preferring nucleoside hydrolase (IAG-NH) from C. elegans and the salivary purine nucleosidase from Aedes aegypti. C. elegans IAG-NH exhibits a high affinity for the substrate analogue p-nitrophenylriboside (p-NPR).	306
239116	cd02650	nuc_hydro_CaPnhB	NH_hydro_CaPnhB: A subgroup of nucleoside hydrolases similar to Corynebacterium ammoniagenes Purine/pyrimidine nucleoside hydrolase (pnhB). Nucleoside hydrolases cleave the N-glycosidic bond in nucleosides generating ribose and the respective base. These enzymes vary in their substrate specificity. 	304
239117	cd02651	nuc_hydro_IU_UC_XIUA	nuc_hydro_IU_UC_XIUA: inosine-uridine preferring, xanthosine-inosine-uridine-adenosine-preferring, and uridine-cytidine preferring nucleoside hydrolases. Nucleoside hydrolases cleave the N-glycosidic bond in nucleosides generating ribose and the respective base. These enzymes vary in their substrate specificity. This group contains proteins similar to nucleoside hydrolases which hydrolyze both pyrimidine and purine ribonucleosides: the inosine-uridine preferring nucleoside hydrolase from Crithidia fasciculata, the inosine-uridine-xanthosine preferring nucleoside hydrolase RihC from Escherichia coli and the xanthosine-inosine-uridine-adenosine-preferring nucleoside hydrolase RihC from Salmonella enterica serovar Typhimurium. This group also contains proteins similar to the pyrimidine-specific uridine-cytidine preferring nucleoside hydrolases URH1 from Saccharomyces cerevisiae, E. coli RihA and E. coli RihB. E. coli RihA is equally efficient with uridine and cytidine, whereas E. coli RihB prefers cytidine over uridine. S. cerevisiae URH1 prefers uridine over cytidine.	302
239118	cd02652	nuc_hydro_2	NH_2: A subgroup of nucleoside hydrolases. This group contains eukaryotic and bacterial proteins similar to nucleoside hydrolases. Nucleoside hydrolases cleave the N-glycosidic bond in nucleosides generating ribose and the respective base. These enzymes vary in their substrate specificity.  	293
239119	cd02653	nuc_hydro_3	NH_3: A subgroup of nucleoside hydrolases. This group contains eukaryotic and bacterial proteins similar to nucleoside hydrolases. Nucleoside hydrolases cleave the N-glycosidic bond in nucleosides generating ribose and the respective base. These enzymes vary in their substrate specificity.  	320
239120	cd02654	nuc_hydro_CjNH	nuc_hydro_CjNH. Nucleoside hydrolases similar to Campylobacter jejuni nucleoside hydrolase.  This group contains eukaryotic and bacterial proteins similar to C. jejuni nucleoside hydrolase. Nucleoside hydrolases cleave the N-glycosidic bond in nucleosides generating ribose and the respective base. These enzymes vary in their substrate specificity. C. jejuni nucleoside hydrolase is inactive against natural nucleosides or against common nucleoside analogues. 	318
132721	cd02655	RNAP_beta'_C	Largest subunit (beta') of Bacterial DNA-dependent RNA polymerase (RNAP), C-terminal domain. Bacterial RNA polymerase (RNAP) is a large multi-subunit complex responsible for the synthesis of all RNAs in the cell. This family also includes the eukaryotic plastid-encoded RNAP beta" subunit. Structure studies suggest that RNAP complexes from different organisms share a crab-claw-shape structure with two pincers defining a central cleft. Beta' and beta, the largest and the second largest subunits of bacterial RNAP, each makes up one pincer and part of the base of the cleft. The C-terminal domain includes a G loop that forms part of the floor of the downstream DNA-binding cavity. The position of the G loop may determine the switch of the bridge helix between flipped-out and normal alpha-helical conformations.	204
239121	cd02656	MIT	MIT: domain contained within Microtubule Interacting and Trafficking molecules. The MIT domain is found in sorting nexins, the nuclear thiol protease PalBH, the AAA protein spastin and archaebacterial proteins with similar domain architecture, vacuolar sorting proteins and others. The molecular function of the MIT domain is unclear.	75
239122	cd02657	Peptidase_C19A	A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome.	305
239123	cd02658	Peptidase_C19B	A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome.	311
239124	cd02659	peptidase_C19C	A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome.	334
239125	cd02660	Peptidase_C19D	A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome.	328
239126	cd02661	Peptidase_C19E	A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome.	304
239127	cd02662	Peptidase_C19F	A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome.	240
239128	cd02663	Peptidase_C19G	A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome.	300
239129	cd02664	Peptidase_C19H	A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome.	327
239130	cd02665	Peptidase_C19I	A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome.	228
239131	cd02666	Peptidase_C19J	A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome.	343
239132	cd02667	Peptidase_C19K	A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome.	279
239133	cd02668	Peptidase_C19L	A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome.	324
239134	cd02669	Peptidase_C19M	A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome.	440
239135	cd02670	Peptidase_C19N	A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome.	241
239136	cd02671	Peptidase_C19O	A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome.	332
239137	cd02672	Peptidase_C19P	A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome.	268
239138	cd02673	Peptidase_C19Q	A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome.	245
239139	cd02674	Peptidase_C19R	A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome.	230
259861	cd02675	Ephrin_ectodomain	Ectodomain of Ephrins. Ephrins and their receptors EphR play an important role in cell communication in normal physiology, as well as in disease pathogenesis. Binding of the ephrin (Eph) ligand to EphR requires cell-cell contact, since both molecules are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling, depending on Eph kinase activity) and ephrin-expressing cells (reverse signaling). Eph signaling controls cell morphology, adhesion, migration and invasion. Ephrins can be subdivided into 2 groups, A and B, depending on their respective receptors EphA or EphB. The nine human EphA receptors bind to five GPI-linked ephrin-A ligands and the five EphB receptors bind to three transmembrane ephrin-B ligands. Interactions are promiscuous within each class, and some Eph receptors can also bind to ephrins of the other class. All ephrins contain a highly conserved ectodomain for receptor binding, which is characterized by this domain hierarchy.	136
239140	cd02677	MIT_SNX15	MIT: domain contained within Microtubule Interacting and Trafficking molecules. This MIT domain sub-family is found in sorting nexin 15 and related proteins. The molecular function of the MIT domain is unclear.	75
239141	cd02678	MIT_VPS4	MIT: domain contained within Microtubule Interacting and Trafficking molecules. This sub-family of MIT domains is found in intracellular protein transport proteins of the AAA-ATPase family. The molecular function of the MIT domain is unclear.	75
239142	cd02679	MIT_spastin	MIT: domain contained within Microtubule Interacting and Trafficking molecules. This MIT domain sub-family is found in the AAA protein spastin, a probable ATPase involved in the assembly or function of nuclear protein complexes; spastins might also be involved in microtubule dynamics. The molecular function of the MIT domain is unclear.	79
239143	cd02680	MIT_calpain7_2	MIT: domain contained within Microtubule Interacting and Trafficking molecules. This sub-family of MIT domains is found in the nuclear thiol protease PalBH. The molecular function of the MIT domain is unclear.	75
239144	cd02681	MIT_calpain7_1	MIT: domain contained within Microtubule Interacting and Trafficking molecules. This sub-family of MIT domains is found in the nuclear thiol protease PalBH. The molecular function of the MIT domain is unclear.	76
239145	cd02682	MIT_AAA_Arch	MIT: domain contained within Microtubule Interacting and Trafficking molecules. This sub-family of MIT domains is found in mostly archaebacterial AAA-ATPases. The molecular function of the MIT domain is unclear.	75
239146	cd02683	MIT_1	MIT: domain contained within Microtubule Interacting and Trafficking molecules. This sub-family of MIT domains is found in proteins with unknown function, co-occurring with an as yet undescribed domain. The molecular function of the MIT domain is unclear.	77
239147	cd02684	MIT_2	MIT: domain contained within Microtubule Interacting and Trafficking molecules. This sub-family of MIT domains is found in proteins with an N-terminal serine/threonine kinase domain. The molecular function of the MIT domain is unclear.	75
239148	cd02685	MIT_C	MIT_C; domain found C-terminal to MIT (contained within Microtubule Interacting and Trafficking molecules) domains, as well as in some bacterial proteins. The function of this domain is unknown.	148
199878	cd02688	E_set	Early set domain associated with the catalytic domain of sugar utilizing enzymes at either the N or C terminus. The E or "early" set domains of sugar utilizing enzymes are associated with different types of catalytic domains at either the N-terminal or C-terminal end. These domains may be related to the immunoglobulin and/or fibronectin type III superfamilies. Members of this family include alpha amylase, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase. A subset of these members were recently identified as members of the CBM48 (Carbohydrate Binding Module 48) family. Members of the CBM48 family include pullulanase, maltooligosyl trehalose synthase, starch branching enzyme, glycogen branching enzyme, glycogen debranching enzyme, isoamylase, and the beta subunit of AMP-activated protein kinase.	82
349868	cd02690	M28	M28 Zn-peptidases include aminopeptidases and carboxypeptidases. Peptidase M28 family (also called aminopeptidase Y family) contains aminopeptidases as well as carboxypeptidases. They have co-catalytic zinc ions; each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. The aminopeptidases in this family are also called bacterial leucyl aminopeptidases, but are able to release a variety of N-terminal amino acids. IAP aminopeptidase and aminopeptidase Y preferentially release basic amino acids while glutamate carboxypeptidase II preferentially releases C-terminal glutamates. Plasma glutamate carboxypeptidase (PGCP) and glutamate carboxypeptidase II (NAALADase) hydrolyze dipeptides. Several members of the M28 peptidase family have PA domain inserts which may participate in substrate binding and/or in promoting conformational changes, which influence the stability and accessibility of the site to substrate. These include prostate-specific membrane antigen (PSMA), yeast aminopeptidase S (SGAP), human transferrin receptors (TfR1 and TfR2), plasma glutamate carboxypeptidase (PGCP) and several predicted aminopeptidases where relatively little is known about them. Also included in the M28 family are glutaminyl cyclases (QC), which are involved in N-terminal glutamine cyclization of many endocrine peptides. Nicastrin and nicalin belong to this family but lack the amino-acid conservation required for catalytically active aminopeptidases.	202
100036	cd02691	PurM-like2	AIR synthase (PurM) related protein, archaeal subgroup 2 of unknown function. The family of PurM related proteins includes Hydrogen expression/formation protein HypE, AIR synthases, FGAM synthase and Selenophosphate synthetase (SelD). They all contain two conserved domains and seem to dimerize. The N-terminal domain forms the dimer interface and is a putative ATP binding domain.	346
119407	cd02696	MurNAc-LAA	N-acetylmuramoyl-L-alanine amidase or MurNAc-LAA (also known as peptidoglycan aminohydrolase, NAMLA amidase, NAMLAA, Amidase 3, and peptidoglycan amidase; EC 3.5.1.28) is an autolysin that hydrolyzes the amide bond between N-acetylmuramoyl and L-amino acids in certain cell wall glycopeptides. These proteins are Zn-dependent peptidases with highly conserved residues involved in cation co-ordination. MurNAc-LAA in this family is one of several peptidoglycan hydrolases (PGHs) found in bacterial and bacteriophage or prophage genomes that are involved in the degradation of the peptidoglycan. In Escherichia coli, there are five MurNAc-LAAs present: AmiA, AmiB, AmiC and AmiD that are periplasmic, and AmpD that is cytoplasmic. Three of these (AmiA, AmiB and AmiC) belong to this family, the other two (AmiD and AmpD) do not. E. coli AmiA, AmiB and AmiC play an important role in cleaving the septum to release daughter cells after cell division. In general, bacterial MurNAc-LAAs are members of the bacterial autolytic system and carry a signal peptide in their N-termini that allows their transport across the cytoplasmic membrane. However, the bacteriophage MurNAc-LAAs are endolysins since these phage-encoded enzymes break down bacterial peptidoglycan at the terminal stage of the phage reproduction cycle. As opposed to autolysins, almost all endolysins have no signal peptides and their translocation through the cytoplasmic membrane is thought to proceed with the help of phage-encoded holin proteins. The amidase catalytic module is fused to another functional module (cell wall binding module or CWBM) either at the N- or C-terminus, which is responsible for high affinity binding of the protein to the cell wall.	172
349869	cd02697	M20_like	M20 Zn-peptidases include exopeptidases. Peptidase M20 family; uncharacterized subfamily. These hypothetical proteins have been inferred by homology to be exopeptidases: carboxypeptidases, dipeptidases and a specialized aminopeptidase. In general, the peptidase hydrolyzes the late products of protein degradation in order to complete the conversion of proteins to free amino acids. Members of this subfamily may bind metal ions such as zinc.	394
239149	cd02698	Peptidase_C1A_CathepsinX	Cathepsin X; the only papain-like lysosomal cysteine peptidase exhibiting carboxymonopeptidase activity. It can also act as a carboxydipeptidase, like cathepsin B, but has been shown to preferentially cleave substrates through a monopeptidyl carboxypeptidase pathway. The propeptide region of cathepsin X, the shortest among papain-like peptidases, is covalently attached to the active site cysteine in the inactive form of the enzyme. Little is known about the biological function of cathepsin X. Some studies point to a role in early tumorigenesis. A more recent study indicates that cathepsin X expression is restricted to immune cells suggesting a role in phagocytosis and the regulation of the immune response.	239
341048	cd02699	M4_M36	Peptidase M4 family (includes thermolysin, aureolysin, neutral protease and bacillolysin) and Peptidase M36 family (also known as fungalysin). This family includes the peptidases M4 as well as M36, both belonging to the Gluzincin family. The M4 peptidase family includes numerous zinc-dependent metallopeptidases that hydrolyze peptide bonds, such as thermolysin (EC 3.4.24.27), pseudolysin (the extracellular elastase of Pseudomonas aeruginosa), aureolysin (the extracellular metalloproteinase from Staphylococcus aureus), neutral protease from Bacillus cereus, as well as bacillolysin (EC 3.4.24.28). The M36 family, also known as the fungalysin (elastinolytic metalloproteinase) family, includes endopeptidases from pathogenic fungi. Both M4 and M36 families have similar folds and contain the Zn-binding site and the active site HEXXH motif. The eukaryotic M36 and bacterial M4 families of metalloproteases also share a conserved domain in their propeptides called FTP (fungalysin/thermolysin propeptide).	313
259848	cd02733	RNAP_II_RPB1_N	Largest subunit (Rpb1) of eukaryotic RNA polymerase II (RNAP II), N-terminal domain. The two largest subunits of RNA polymerase II (RNAP II), Rpb1 and Rpb2, form the active site, DNA entry channel and RNA exit channel. RNAP II is a large multi-subunit complex responsible for the synthesis of mRNA in eukaryotes. RNAP II consists of a 10-subunit core enzyme and a peripheral heterodimer of two subunits. Structure studies suggest that RNAP complexes from different organisms share a crab-claw-shape structure. In yeast, Rpb1 and Rpb2, each makes up one clamp, one jaw, and part of the cleft. Rpb1_N contains part of the active site, forms the head and core of the one clamp, and makes up the pore and funnel regions of RNAP II.	751
132722	cd02735	RNAP_I_Rpa1_C	Largest subunit (Rpa1) of Eukaryotic RNA polymerase I (RNAP I), C-terminal domain. RNA polymerase I (RNAP I) is a multi-subunit protein complex responsible for the synthesis of rRNA precursor. It consists of at least 14 different subunits, and the largest one is homologous to subunit Rpb1 of yeast RNAP II and subunit beta' of bacterial RNAP. Rpa1 is also known as Rpa190 in yeast. Structure studies suggest that different RNAP complexes share a similar crab-claw-shape structure. The C-terminal domain of Rpb1, the largest subunit of RNAP II, makes up part of the foot and jaw structures of RNAP II. The similarity between this domain and the C-terminal domain of Rpb1, its counterpart in RNAP II, suggests a similar functional and structural role.	309
132723	cd02736	RNAP_III_Rpc1_C	Largest subunit (Rpc1) of Eukaryotic RNA polymerase III (RNAP III), C-terminal domain. Eukaryotic RNA polymerase III (RNAP III) is a large multi-subunit complex responsible for the synthesis of tRNAs, 5S rRNA, Alu-RNA, and U6 snRNA, among others. Rpc1 is also known as C160 in yeast. Structure studies suggest that different RNA polymerase complexes share a similar crab-claw-shape structure. The C-terminal domain of Rpb1, the largest subunit of RNAP II, makes up part of the foot and jaw structures of RNAP II. The similarity between this domain and the C-terminal domain of Rpb1, its counterpart in RNAP II, suggests a similar functional and structural role.	300
132724	cd02737	RNAP_IV_NRPD1_C	Largest subunit (NRPD1) of Higher plant RNA polymerase IV, C-terminal domain. Higher plants have five multi-subunit nuclear RNA polymerases: RNAP I, RNAP II and RNAP III, which are essential for viability; plus the two isoforms of the non-essential polymerase RNAP IV (IVa and IVb), which specialize in small RNA-mediated gene silencing pathways. RNAP IVa and/or RNAP IVb might be involved in RNA-directed DNA methylation of endogenous repetitive elements, silencing of transgenes, regulation of flowering-time genes, inducible regulation of adjacent gene pairs, and spreading of mobile silencing signals. NRPD1a is the largest subunit of RNAP IVa, whereas NRPD1b is the largest subunit of RNAP IVb. The full subunit compositions of RNAP IVa and RNAP IVb are not known, nor are their templates or enzymatic products. However, it has been shown that RNAP IVa and, to a lesser extent, RNAP IVb are crucial for several RNA-mediated gene silencing phenomena.	381
119331	cd02742	GH20_hexosaminidase	Beta-N-acetylhexosaminidases of glycosyl hydrolase family 20 (GH20) catalyze the removal of beta-1,4-linked N-acetyl-D-hexosamine residues from the non-reducing ends of N-acetyl-beta-D-hexosaminides including N-acetylglucosides and N-acetylgalactosides.  These enzymes are broadly distributed in microorganisms, plants and animals, and play roles in various key physiological and pathological processes. These processes include cell structural integrity, energy storage, cellular signaling, fertilization, pathogen defense, viral penetration, the development of carcinomas, inflammatory events and lysosomal storage disorders. The GH20 enzymes include the eukaryotic beta-N-acetylhexosaminidases A and B, the bacterial chitobiases, dispersin B, and lacto-N-biosidase.  The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by the solvent or the enzyme, but by the substrate itself.	303
394871	cd02749	Macro_SF	macrodomain superfamily. Macrodomains are found in a variety of proteins with diverse cellular functions, as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Macrodomains can recognize ADP-ribose (ADPr) in both its free and protein-linked forms, in related ligands, such as O-acyl-ADP-ribose (OAADPr), and even in ligands unrelated to ADPr. Macrodomains include the yeast macrodomain Poa1 which is a phosphatase of ADP-ribose-1"-phosphate, a by-product of tRNA splicing. Some macrodomains have ADPr-unrelated binding partners such as the coronavirus SUD-N (N-terminal subdomain) and SUD-M (middle subdomain) of the SARS-unique domain (SUD) which bind G-quadruplexes (unusual nucleic-acid structures formed by consecutive guanosine nucleotides). Macrodomains regulate a wide variety of cellular and organismal processes, including DNA damage repair, signal transduction, and immune response.	121
239151	cd02750	MopB_Nitrate-R-NarG-like	Respiratory nitrate reductase A (NarGHI), alpha chain (NarG) and related proteins. Under anaerobic conditions in the presence of nitrate, E. coli synthesizes the cytoplasmic membrane-bound quinol-nitrate oxidoreductase (NarGHI), which reduces nitrate to nitrite and forms part of a redox loop generating a proton-motive force. Found in prokaryotes and some archaea, NarGHI usually functions as a heterotrimer. The alpha chain contains the molybdenum cofactor-containing Mo-bisMGD catalytic subunit. Members of the MopB_Nitrate-R-NarG-like CD belong to the molybdopterin_binding (MopB) superfamily of proteins.	461
239152	cd02751	MopB_DMSOR-like	The MopB_DMSOR-like CD contains dimethylsulfoxide reductase (DMSOR), biotin sulfoxide reductase (BSOR), trimethylamine N-oxide reductase (TMAOR) and other related proteins. DMSOR catalyzes the reduction of DMSO to dimethylsulfide, but its cellular location and oligomerization state are organism-dependent. For example, in Rhodobacter sphaeroides and Rhodobacter capsulatus, it is an 82-kDa monomeric soluble protein found in the periplasmic space; in E. coli, it is membrane-bound and exists as a heterotrimer. BSOR catalyzes the reduction of biotin sulfoxide to biotin, and is unique among Mo enzymes because no additional auxiliary proteins or cofactors are required. TMAOR is similar to DMSOR, but its only natural substrate is TMAO. Also included in this group is the pyrogallol-phloroglucinol transhydroxylase from Pelobacter acidigallici. Members of the MopB_DMSOR-like CD belong to the molybdopterin_binding (MopB) superfamily of proteins.	609
239153	cd02752	MopB_Formate-Dh-Na-like	Formate dehydrogenase N, alpha subunit (Formate-Dh-Na) is a major component of nitrate respiration in bacteria such as in the E. coli formate dehydrogenase N (Fdh-N). Fdh-N is a membrane protein that is a complex of three different subunits and is the major electron donor to the nitrate respiratory chain. Also included in this CD is the Desulfovibrio gigas tungsten formate dehydrogenase, DgW-FDH. In contrast to Fdh-N, which is a  functional heterotrimer, DgW-FDH is a heterodimer. The DgW-FDH complex is composed of a large subunit carrying the W active site and one [4Fe-4S] center, and a small subunit that harbors a series of three [4Fe-4S] clusters as well as a putative vacant binding site for a fourth cluster. The smaller subunit is not included in this alignment. Members of the MopB_Formate-Dh-Na-like CD belong to the molybdopterin_binding (MopB) superfamily of proteins.	649
239154	cd02753	MopB_Formate-Dh-H	Formate dehydrogenase H (Formate-Dh-H) catalyzes the reversible oxidation of formate to CO2 with the release of a proton and two electrons. It is a component of the anaerobic formate hydrogen lyase complex. The E. coli formate dehydrogenase H (Fdh-H) is a monomer composed of a single polypeptide chain with a  Mo active site region and a [4Fe-4S] center. Members of the MopB_Formate-Dh-H CD belong to the molybdopterin_binding (MopB) superfamily of proteins.	512
239155	cd02754	MopB_Nitrate-R-NapA-like	Nitrate reductases, NapA (Nitrate-R-NapA), NasA, and NarB catalyze the reduction of nitrate to nitrite. Monomeric Nas is located in the cytoplasm and participates in nitrogen assimilation. Dimeric Nap is located in the periplasm and is coupled to quinol oxidation via a membrane-anchored tetraheme cytochrome. Members of the MopB_Nitrate-R-NapA-like CD belong to the molybdopterin_binding (MopB) superfamily of proteins.	565
239156	cd02755	MopB_Thiosulfate-R-like	The MopB_Thiosulfate-R-like CD contains thiosulfate-, sulfur-, and polysulfide-reductases, and other related proteins. Thiosulfate reductase catalyzes the cleavage of sulfur-sulfur bonds in thiosulfate. Polysulfide reductase is a membrane-bound enzyme that catalyzes the reduction of polysulfide using either hydrogen or formate as the electron donor. Members of the MopB_Thiosulfate-R-like CD belong to the molybdopterin_binding (MopB) superfamily of proteins.	454
239157	cd02756	MopB_Arsenite-Ox	Arsenite oxidase (Arsenite-Ox) oxidizes arsenite to the less toxic arsenate; it transfers the electrons obtained from the oxidation of arsenite towards the soluble periplasmic electron carriers cytochrome c and/or amicyanin.  Arsenite oxidase is a heterodimeric enzyme containing a large and a small subunit. The large catalytic subunit harbors the molybdopterin cofactor and the [3Fe-4S] cluster; and the small subunit belongs to the structural class of the Rieske proteins. The small subunit is not included in this alignment. Members of MopB_Arsenite-Ox CD belong to the molybdopterin_binding (MopB) superfamily of proteins.	676
239158	cd02757	MopB_Arsenate-R	This CD includes the respiratory arsenate reductase, As(V), catalytic subunit (ArrA) and other related proteins. These members belong to the molybdopterin_binding (MopB) superfamily of proteins.	523
239159	cd02758	MopB_Tetrathionate-Ra	The MopB_Tetrathionate-Ra CD contains tetrathionate reductase, subunit A, (TtrA) and other related proteins. The Salmonella enterica tetrathionate reductase catalyses the reduction of trithionate but not sulfur or thiosulfate. Members of this CD belong to the molybdopterin_binding (MopB) superfamily of proteins.	735
239160	cd02759	MopB_Acetylene-hydratase	The MopB_Acetylene-hydratase CD contains acetylene hydratase (Ahy) and other related proteins. The acetylene hydratase of Pelobacter acetylenicus is a tungsten iron-sulfur protein involved in the fermentation of acetylene to ethanol and acetate. Members of this CD belong to the molybdopterin_binding (MopB) superfamily of proteins.	477
239161	cd02760	MopB_Phenylacetyl-CoA-OR	The MopB_Phenylacetyl-CoA-OR CD contains the phenylacetyl-CoA:acceptor oxidoreductase, large subunit (PadB2), and other related proteins. The phenylacetyl-CoA:acceptor oxidoreductase has been characterized as a membrane-bound molybdenum-iron-sulfur enzyme involved in anaerobic metabolism of phenylalanine in the denitrifying bacterium Thauera aromatica. Members of this CD belong to the molybdopterin_binding (MopB) superfamily of proteins.	760
239162	cd02761	MopB_FmdB-FwdB	The MopB_FmdB-FwdB CD contains the molybdenum/tungsten formylmethanofuran dehydrogenases, subunit B (FmdB/FwdB), and other related proteins. Formylmethanofuran dehydrogenase catalyzes the first step in methane formation from CO2 in methanogenic archaea and some eubacteria. Members of this CD belong to the molybdopterin_binding (MopB) superfamily of proteins.	415
239163	cd02762	MopB_1	The MopB_1 CD includes a group of related uncharacterized bacterial molybdopterin-binding oxidoreductase-like domains with a putative N-terminal iron-sulfur [4Fe-4S] cluster binding site and molybdopterin cofactor binding site. These members belong to the molybdopterin_binding (MopB) superfamily of proteins.	539
239164	cd02763	MopB_2	The MopB_2 CD includes a group of related uncharacterized bacterial molybdopterin-binding oxidoreductase-like domains with a putative N-terminal iron-sulfur [4Fe-4S] cluster binding site and molybdopterin cofactor binding site. These members belong to the molybdopterin_binding (MopB) superfamily of proteins.	679
239165	cd02764	MopB_PHLH	The MopB_PHLH CD includes a group of related uncharacterized putative hydrogenase-like homologs (PHLH) of molybdopterin binding (MopB) proteins. This CD is of the PHLH region homologous to the catalytic molybdopterin-binding subunit of MopB homologs.	524
239166	cd02765	MopB_4	The MopB_4 CD includes a group of related uncharacterized bacterial and archaeal molybdopterin-binding oxidoreductase-like domains with a putative N-terminal iron-sulfur [4Fe-4S] cluster binding site and molybdopterin cofactor binding site. These members belong to the molybdopterin_binding (MopB) superfamily of proteins.	567
239167	cd02766	MopB_3	The MopB_3 CD includes a group of related uncharacterized bacterial and archaeal molybdopterin-binding oxidoreductase-like domains with a putative N-terminal iron-sulfur [4Fe-4S] cluster binding site and molybdopterin cofactor binding site. These members belong to the molybdopterin_binding (MopB) superfamily of proteins.	501
239168	cd02767	MopB_ydeP	The MopB_ydeP CD includes a group of related uncharacterized bacterial molybdopterin-binding oxidoreductase-like domains with a putative molybdopterin cofactor binding site. These members belong to the molybdopterin_binding (MopB) superfamily of proteins.	574
239169	cd02768	MopB_NADH-Q-OR-NuoG2	MopB_NADH-Q-OR-NuoG2: The NuoG/Nad11/75-kDa subunit (second domain) of the NADH-quinone oxidoreductase (NADH-Q-OR)/respiratory complex I/NADH dehydrogenase-1 (NDH-1). The NADH-Q-OR is the first energy-transducing complex in the respiratory chains of many prokaryotes and eukaryotes. Mitochondrial complex I and its bacterial counterpart, NDH-1, function as a redox pump that uses the redox energy to translocate H+ ions across the membrane, resulting in a significant contribution to energy production. The atomic structure of complex I is not known and the mechanisms of electron transfer and proton pumping are not established. The nad11 gene codes for the largest (75-kDa) subunit of the mitochondrial NADH:ubiquinone oxidoreductase; it constitutes the electron input part of the enzyme, or the so-called NADH dehydrogenase fragment. In Escherichia coli, this subunit is encoded by the nuoG gene, and is part of the 14 distinct subunits constituting the 'minimal' functional enzyme. The nad11 gene is nuclear-encoded in animals, plants, and fungi, but is still encoded in the mitochondrial genome of some protists. The Nad11/NuoG subunit is made of two domains: the first contains three binding sites for FeS clusters (the fer2 domain); the second domain (this CD) is of unknown function or, as postulated, has lost an ancestral formate dehydrogenase activity that became redundant during the evolution of the complex I enzyme. Although only vestigial sequence evidence remains of a molybdopterin binding site, this protein domain family belongs to the molybdopterin_binding (MopB) superfamily of proteins. Bacterial type II NADH-quinone oxidoreductases and NQR-type sodium-motive NADH-quinone oxidoreductases are not homologs of this domain family.	386
239170	cd02769	MopB_DMSOR-BSOR-TMAOR	The MopB_DMSOR-BSOR-TMAOR CD contains dimethylsulfoxide reductase (DMSOR), biotin sulfoxide reductase (BSOR), trimethylamine N-oxide reductase (TMAOR) and other related proteins. DMSOR always catalyzes the reduction of DMSO to dimethylsulfide, but its cellular location and oligomerization state are organism-dependent. For example, in Rhodobacter sphaeroides and Rhodobacter capsulatus, it is an 82-kDa monomeric soluble protein found in the periplasmic space; in E. coli, it is membrane-bound and exists as a heterotrimer. BSOR catalyzes the reduction of biotin sulfoxide to biotin, and is unique among Mo enzymes because no additional auxiliary proteins or cofactors are required. TMAOR is similar to DMSOR, but its only natural substrate is TMAO. Members of this CD belong to the molybdopterin_binding (MopB) superfamily of proteins.	609
239171	cd02770	MopB_DmsA-EC	This CD (MopB_DmsA-EC) includes the DmsA enzyme of the dmsABC operon encoding the anaerobic dimethylsulfoxide reductase (DMSOR) of Escherichia coli and other related DMSOR-like enzymes. Unlike other DMSOR-like enzymes, this group has a  predicted N-terminal iron-sulfur [4Fe-4S] cluster  binding site. These members belong to the molybdopterin_binding (MopB) superfamily of proteins.	617
239172	cd02771	MopB_NDH-1_NuoG2-N7	MopB_NDH-1_NuoG2-N7: The second domain of the NuoG subunit (with a [4Fe-4S] cluster, N7) of the NADH-quinone oxidoreductase/NADH dehydrogenase-1 (NDH-1) found in various bacteria. The NDH-1 is the first energy-transducing complex in the respiratory chain and functions as a redox pump that uses the redox energy to translocate H+ ions across the membrane, resulting in a significant contribution to energy production. In Escherichia coli NDH-1, the largest subunit is encoded by the nuoG gene, and is part of the 14 distinct subunits constituting the functional enzyme. The NuoG subunit is made of two domains: the first contains three binding sites for FeS clusters (the fer2 domain); the second domain (this CD) is of unknown function or, as postulated, has lost an ancestral formate dehydrogenase activity that became redundant during the evolution of the complex I enzyme. Unique to this group, compared to the other prokaryotic and eukaryotic groups in this domain protein family (NADH-Q-OR-NuoG2), is an N-terminal [4Fe-4S] cluster (N7/N1c) present in the second domain. Although only vestigial sequence evidence remains of a molybdopterin binding site, this protein domain belongs to the molybdopterin_binding (MopB) superfamily of proteins.	472
239173	cd02772	MopB_NDH-1_NuoG2	MopB_NDH-1_NuoG2: The second domain of the NuoG subunit of the NADH-quinone oxidoreductase/NADH dehydrogenase-1 (NDH-1), found in beta- and gammaproteobacteria. The NDH-1 is the first energy-transducing complex in the respiratory chain and functions as a redox pump that uses the redox energy to translocate H+ ions across the membrane, resulting in a significant contribution to energy production. In Escherichia coli NDH-1, the largest subunit is encoded by the nuoG gene, and is part of the 14 distinct subunits constituting the functional enzyme. The NuoG subunit is made of two domains: the first contains three binding sites for FeS clusters (the fer2 domain); the second domain (this CD) is of unknown function or, as postulated, has lost an ancestral formate dehydrogenase activity that became redundant during the evolution of the complex I enzyme. Although only vestigial sequence evidence remains of a molybdopterin binding site, this protein domain belongs to the molybdopterin_binding (MopB) superfamily of proteins.	414
239174	cd02773	MopB_Res-Cmplx1_Nad11	MopB_Res_Cmplx1_Nad11: The second domain of the Nad11/75-kDa subunit of the NADH-quinone oxidoreductase/respiratory complex I/NADH dehydrogenase-1 (NDH-1) of eukaryotes and the Nqo3/G subunit of alphaproteobacteria NDH-1. The NADH-quinone oxidoreductase is the first energy-transducing complex in the respiratory chains of many prokaryotes and eukaryotes. Mitochondrial complex I and its bacterial counterpart, NDH-1, function as a redox pump that uses the redox energy to translocate H+ ions across the membrane, resulting in a significant contribution to energy production. The nad11 gene codes for the largest (75 kDa) subunit of the mitochondrial NADH:ubiquinone oxidoreductase; it constitutes the electron input part of the enzyme, or the so-called NADH dehydrogenase fragment. In Paracoccus denitrificans, this subunit is encoded by the nqo3 gene, and is part of the 14 distinct subunits constituting the 'minimal' functional enzyme. The Nad11/Nqo3 subunit is made of two domains: the first contains three binding sites for FeS clusters (the fer2 domain); the second domain (this CD) is of unknown function or, as postulated, has lost an ancestral formate dehydrogenase activity that became redundant during the evolution of the complex I enzyme. Although only vestigial sequence evidence remains of a molybdopterin binding site, this protein domain belongs to the molybdopterin_binding (MopB) superfamily of proteins.	375
239175	cd02774	MopB_Res-Cmplx1_Nad11-M	MopB_Res_Cmplx1_Nad11_M: Mitochondrial-encoded NADH-quinone oxidoreductase/respiratory complex I, the second domain of the Nad11/75-kDa subunit of some protists. NADH-quinone oxidoreductase is the first energy-transducing complex in the respiratory chain and functions as a redox pump that uses the redox energy to translocate H+ ions across the membrane, resulting in a significant contribution to energy production. The nad11 gene codes for the largest (75-kDa) subunit of the mitochondrial NADH-quinone oxidoreductase; it constitutes the electron input part of the enzyme, or the so-called NADH dehydrogenase fragment. The Nad11 subunit is made of two domains: the first contains three binding sites for FeS clusters (the fer2 domain); the second domain (this CD) is of unknown function or, as postulated, has lost an ancestral formate dehydrogenase activity that became redundant during the evolution of the complex I enzyme. Although only vestigial sequence evidence remains of a molybdopterin binding site, this protein domain belongs to the molybdopterin_binding (MopB) superfamily of proteins.	366
239176	cd02775	MopB_CT	Molybdopterin-Binding, C-terminal (MopB_CT) domain of the MopB superfamily of proteins, a  large, diverse, heterogeneous superfamily of enzymes that, in general, bind molybdopterin as a cofactor. The MopB domain is found in a wide variety of molybdenum- and tungsten-containing enzymes, including formate dehydrogenase-H (Fdh-H) and -N (Fdh-N), several forms of nitrate reductase (Nap, Nas, NarG), dimethylsulfoxide reductase (DMSOR), thiosulfate reductase, formylmethanofuran dehydrogenase, and arsenite oxidase. Molybdenum is present in most of these enzymes in the form of molybdopterin, a modified pterin ring with a dithiolene side chain, which is responsible for ligating the Mo. In many bacterial and archaeal species, molybdopterin is in the form of a dinucleotide, with two molybdopterin dinucleotide units per molybdenum. These proteins can function as monomers, heterodimers, or heterotrimers, depending on the protein and organism. Also included in the MopB superfamily is the eukaryotic/eubacterial protein domain family of the 75-kDa subunit/Nad11/NuoG (second domain) of respiratory complex 1/NADH-quinone oxidoreductase which is postulated to have lost an ancestral formate dehydrogenase activity and only vestigial sequence evidence remains of a molybdopterin binding site. This hierarchy is of the conserved MopB_CT domain present in many, but not all, MopB homologs.	101
239177	cd02776	MopB_CT_Nitrate-R-NarG-like	Respiratory nitrate reductase A (NarGHI), alpha chain (NarG) and related proteins. Under anaerobic conditions in the presence of nitrate, E. coli synthesizes the cytoplasmic membrane-bound quinol-nitrate oxidoreductase (NarGHI), which reduces nitrate to nitrite and forms part of a redox loop generating a proton-motive force. Found in prokaryotes and some archaea, NarGHI usually functions as a heterotrimer. The alpha chain contains the molybdenum cofactor-containing Mo-bisMGD catalytic subunit. This CD (MopB_CT_Nitrate-R-NarG-like) is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs.	141
239178	cd02777	MopB_CT_DMSOR-like	The MopB_CT_DMSOR-like CD contains dimethylsulfoxide reductase (DMSOR), biotin sulfoxide reductase (BSOR), trimethylamine N-oxide reductase (TMAOR) and other related proteins. DMSOR always catalyzes the reduction of DMSO to dimethylsulfide, but its cellular location and oligomerization state are organism-dependent. For example, in Rhodobacter sphaeroides and Rhodobacter capsulatus, it is an 82-kDa monomeric soluble protein found in the periplasmic space; in E. coli, it is membrane-bound and exists as a heterotrimer. BSOR catalyzes the reduction of biotin sulfoxide to biotin, and is unique among Mo enzymes because no additional auxiliary proteins or cofactors are required. TMAOR is similar to DMSOR, but its only natural substrate is TMAO. Also included in this group is the pyrogallol-phloroglucinol transhydroxylase from Pelobacter acidigallici. This CD is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs.	127
239179	cd02778	MopB_CT_Thiosulfate-R-like	The MopB_CT_Thiosulfate-R-like CD contains thiosulfate-, sulfur-, and polysulfide-reductases, and other related proteins. Thiosulfate reductase catalyzes the cleavage of sulfur-sulfur bonds in thiosulfate. Polysulfide reductase is a membrane-bound enzyme that catalyzes the reduction of polysulfide using either hydrogen or formate as the electron donor. Also included in this CD is the phenylacetyl-CoA:acceptor oxidoreductase, large subunit (PadB2), which has been characterized as a membrane-bound molybdenum-iron-sulfur enzyme involved in anaerobic metabolism of phenylalanine in the denitrifying bacterium Thauera aromatica. The MopB_CT_Thiosulfate-R-like CD is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs.	123
239180	cd02779	MopB_CT_Arsenite-Ox	This CD contains the molybdopterin_binding C-terminal (MopB_CT) region of Arsenite oxidase (Arsenite-Ox) and related proteins. Arsenite oxidase oxidizes arsenite to the less toxic arsenate; it transfers the electrons obtained from the oxidation of arsenite towards the soluble periplasmic electron carriers cytochrome c and/or amicyanin.	115
239181	cd02780	MopB_CT_Tetrathionate_Arsenate-R	This CD contains the molybdopterin_binding C-terminal (MopB_CT) region of tetrathionate reductase, subunit A, (TtrA); respiratory arsenate As(V) reductase, catalytic subunit (ArrA); and other related proteins.	143
239182	cd02781	MopB_CT_Acetylene-hydratase	The MopB_CT_Acetylene-hydratase CD contains acetylene hydratase (Ahy) and other related proteins. The acetylene hydratase of Pelobacter acetylenicus is a tungsten iron-sulfur protein involved in the fermentation of acetylene to ethanol and acetate. This CD is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs.	130
239183	cd02782	MopB_CT_1	The MopB_CT_1 CD includes a group of related uncharacterized bacterial molybdopterin-binding oxidoreductase-like domains with a putative N-terminal iron-sulfur [4Fe-4S] cluster binding site and molybdopterin cofactor binding site. This CD is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs.	129
239184	cd02783	MopB_CT_2	The MopB_CT_2 CD includes a group of related uncharacterized bacterial and archaeal molybdopterin-binding oxidoreductase-like domains with a putative N-terminal iron-sulfur [4Fe-4S] cluster binding site and molybdopterin cofactor binding site. This CD is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs.	156
239185	cd02784	MopB_CT_PHLH	The MopB_CT_PHLH CD includes a group of related uncharacterized putative hydrogenase-like homologs (PHLH) of molybdopterin binding proteins. This CD is of the PHLH region homologous to the conserved molybdopterin-binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs.	137
239186	cd02785	MopB_CT_4	The MopB_CT_4 CD includes a group of related uncharacterized bacterial and archaeal molybdopterin-binding oxidoreductase-like domains with a putative N-terminal iron-sulfur [4Fe-4S] cluster binding site and molybdopterin cofactor binding site. This CD is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs.	124
239187	cd02786	MopB_CT_3	The MopB_CT_3 CD includes a group of related uncharacterized bacterial molybdopterin-binding oxidoreductase-like domains with a putative N-terminal iron-sulfur [4Fe-4S] cluster binding site and molybdopterin cofactor binding site. This CD is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs.	116
239188	cd02787	MopB_CT_ydeP	The MopB_CT_ydeP CD includes a group of related uncharacterized bacterial molybdopterin-binding oxidoreductase-like domains with a putative molybdopterin cofactor binding site. This CD is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs.	112
239189	cd02788	MopB_CT_NDH-1_NuoG2-N7	MopB_CT_NDH-1_NuoG2-N7: C-terminal region of the NuoG-like subunit (of the variant with a [4Fe-4S] cluster, N7) of the NADH-quinone oxidoreductase/NADH dehydrogenase-1 (NDH-1) found in various bacteria. The NDH-1 is the first energy-transducing complex in the respiratory chain and functions as a redox pump that uses the redox energy to translocate H+ ions across the membrane, resulting in a significant contribution to energy production. In Escherichia coli NDH-1, the largest subunit is encoded by the nuoG gene, and is part of the 14 distinct subunits constituting the functional enzyme. The NuoG subunit is made of two domains: the first contains three binding sites for FeS clusters (the fer2 domain); the second domain is of unknown function or, as postulated, has lost an ancestral formate dehydrogenase activity that became redundant during the evolution of the complex I enzyme. Unique to this group, compared to the other prokaryotic and eukaryotic groups in this domain protein family (NADH-Q-OR-NuoG2), is an N-terminal [4Fe-4S] cluster (N7/N1c) present in the second domain and a C-terminal region (this CD) homologous to the formate dehydrogenase C-terminal molybdopterin_binding (MopB) region.	96
239190	cd02789	MopB_CT_FmdC-FwdD	The MopB_FmdC-FwdD CD includes the  C-terminus of subunit C of molybdenum formylmethanofuran dehydrogenase (FmdC) and subunit D of tungsten formylmethanofuran dehydrogenase (FwdD), and other related proteins. Formylmethanofuran dehydrogenase catalyzes the first step in methane formation from CO2 in methanogenic archaea and some eubacteria. Members of this CD belong to the molybdopterin_binding superfamily of proteins. This CD is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs.	106
239191	cd02790	MopB_CT_Formate-Dh_H	Formate dehydrogenase H (Formate-Dh-H) catalyzes the reversible oxidation of formate to CO2 with the release of a proton and two electrons. It is a component of the anaerobic formate hydrogen lyase complex. The E. coli formate dehydrogenase H (Fdh-H) is a monomer composed of a single polypeptide chain with a  Mo active site region and a [4Fe-4S] center. This CD (MopB_CT_Formate-Dh_H) is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs.	116
239192	cd02791	MopB_CT_Nitrate-R-NapA-like	Nitrate reductases, NapA (Nitrate-R-NapA), NasA, and NarB catalyze the reduction of nitrate to nitrite. Monomeric Nas is located in the cytoplasm and participates in nitrogen assimilation. Dimeric Nap is located in the periplasm and is coupled to quinol oxidation via a membrane-anchored tetraheme cytochrome. This CD (MopB_CT_Nitrate-R-Nap) is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs.	122
239193	cd02792	MopB_CT_Formate-Dh-Na-like	Formate dehydrogenase N, alpha subunit (Formate-Dh-Na) is a major component of nitrate respiration in bacteria such as in the E. coli formate dehydrogenase N (Fdh-N). Fdh-N is a membrane protein that is a complex of three different subunits and is the major electron donor to the nitrate respiratory chain. Also included in this CD is the Desulfovibrio gigas tungsten formate dehydrogenase, DgW-FDH. In contrast to Fdh-N, which is a  functional heterotrimer, DgW-FDH is a heterodimer. The DgW-FDH complex is composed of a large subunit carrying the W active site and one [4Fe-4S] center, and a small subunit that harbors a series of three [4Fe-4S] clusters as well as a putative vacant binding site for a fourth cluster. The smaller subunit is not included in this alignment. This CD (MopB_CT_Formate-Dh-Na-like) is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs.	122
239194	cd02793	MopB_CT_DMSOR-BSOR-TMAOR	The MopB_DMSOR-BSOR-TMAOR CD contains dimethylsulfoxide reductase (DMSOR), biotin sulfoxide reductase (BSOR), trimethylamine N-oxide reductase (TMAOR) and other related proteins. DMSOR always catalyzes the reduction of DMSO to dimethylsulfide, but its cellular location and oligomerization state are organism-dependent. For example, in Rhodobacter sphaeroides and Rhodobacter capsulatus, it is an 82-kDa monomeric soluble protein found in the periplasmic space; in E. coli, it is membrane-bound and exists as a heterotrimer. BSOR catalyzes the reduction of biotin sulfoxide to biotin, and is unique among Mo enzymes because no additional auxiliary proteins or cofactors are required. TMAOR is similar to DMSOR, but its only natural substrate is TMAO. This CD is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs.	129
239195	cd02794	MopB_CT_DmsA-EC	The MopB_CT_DmsA-EC CD includes the DmsA enzyme of the dmsABC operon encoding the anaerobic dimethylsulfoxide reductase (DMSOR) of Escherichia coli and other related DMSOR-like enzymes. Unlike other DMSOR-like enzymes, this group has a  predicted N-terminal iron-sulfur [4Fe-4S] cluster binding site. This CD is of the conserved molybdopterin_binding C-terminal (MopB_CT) region present in many, but not all, MopB homologs.	121
271143	cd02795	CBM6-CBM35-CBM36_like	Carbohydrate Binding Module 6 (CBM6) and CBM35_like superfamily. Carbohydrate binding module family 6 (CBM6, family 6 CBM), also known as cellulose binding domain family VI (CBD VI), and related CBMs (CBM35 and CBM36). These are non-catalytic carbohydrate binding domains found in a range of enzymes that display activities against a diverse range of carbohydrate targets, including mannan, xylan, beta-glucans, cellulose, agarose, and arabinans. These domains facilitate the strong binding of the appended catalytic modules to their dedicated, insoluble substrates. Many of these CBMs are associated with glycoside hydrolase (GH) domains. CBM6 is an unusual CBM as it represents a chimera of two distinct binding sites with different modes of binding: binding site I within the loop regions and binding site II on the concave face of the beta-sandwich fold. CBM36s are calcium-dependent xylan binding domains. CBM35s display conserved specificity through extensive sequence similarity, but divergent function through their appended catalytic modules. This alignment model also contains the C-terminal domains of bacterial insecticidal toxins, where they may be involved in determining insect specificity through carbohydrate binding functionality.	124
239196	cd02796	tRNA_bind_bactPheRS	tRNA-binding-domain-containing prokaryotic phenylalanyl-tRNA synthetase (PheRS) beta chain.  PheRS aminoacylates phenylalanine transfer RNAs (tRNAphe).  PheRSs belong structurally to class II aminoacyl tRNA synthetases (aaRSs) but, as they aminoacylate the 2'OH of the terminal ribose of tRNA, they belong functionally to class I aaRSs.  This domain has general tRNA binding properties and is believed to direct tRNAphe to the active site of the enzyme.	103
239197	cd02798	tRNA_bind_CsaA	tRNA-binding-domain-containing CsaA-like proteins.  CsaA is a molecular chaperone with export related activities. CsaA has a putative tRNA binding activity. The functional unit of CsaA is a homodimer and this domain acts as a dimerization domain.	107
239198	cd02799	tRNA_bind_EMAP-II_like	tRNA-binding-domain-containing EMAP2-like proteins. This family contains a diverse fraction of tRNA binding proteins, including Caenorhabditis elegans methionyl-tRNA synthetase (CeMetRS), human tyrosyl-tRNA synthetase (hTyrRS), Saccharomyces cerevisiae Arc1p, human p43 and EMAP2.  CeMetRS and hTyrRS aminoacylate their cognate tRNAs.  Arc1p is a transactivator of yeast methionyl-tRNA and glutamyl-tRNA synthetases.  This domain has general tRNA binding properties.  In a subset of this family this domain has the added capability of a cytokine. For example, the p43 component of the human aminoacyl-tRNA synthetase complex is cleaved to release EMAP-II cytokine. EMAP-II has multiple activities during apoptosis, angiogenesis and inflammation and participates in malignant transformation. An EMAP-II-like cytokine also is released from hTyrRS upon cleavage. The active cytokine heptapeptide locates to this domain.	105
239199	cd02800	tRNA_bind_EcMetRS_like	tRNA-binding-domain-containing Escherichia coli methionyl-tRNA synthetase (EcMetRS)-like proteins.  This family includes EcMetRS and Aquifex aeolicus Trbp111 (AaTrbp111). This domain has general tRNA binding properties.  MetRS aminoacylates methionine transfer RNAs (tRNAmet). AaTrbp111 is a structure-specific molecular chaperone recognizing the L-shape of the tRNA fold. AaTrbp111 plays a role in nuclear trafficking of tRNAs. The functional unit of EcMetRS and AaTrbp111 is a homodimer; this domain acts as the dimerization domain.	105
239200	cd02801	DUS_like_FMN	Dihydrouridine synthase-like (DUS-like) FMN-binding domain. Members of this family catalyze the reduction of the 5,6-double bond of a uridine residue on tRNA. Dihydrouridine modification of tRNA is widely observed in prokaryotes and eukaryotes, and also in some archaea. Most dihydrouridines are found in the D loop of tRNAs. The role of dihydrouridine in tRNA is currently unknown, but it may increase conformational flexibility of the tRNA. It is likely that different family members have different substrate specificities, which may overlap. 1VHN, a putative flavin oxidoreductase, has high sequence similarity to DUS.  The enzymatic mechanism of 1VHN is not known at present.	231
239201	cd02803	OYE_like_FMN_family	Old yellow enzyme (OYE)-like FMN binding domain. OYE was the first flavin-dependent enzyme identified; however, its true physiological role remains elusive to this day.  Each monomer of OYE contains FMN as a non-covalently bound cofactor and uses NADPH as a reducing agent; oxygen, quinones, and alpha,beta-unsaturated aldehydes and ketones can act as electron acceptors in the catalytic reaction.  Members of the OYE family include trimethylamine dehydrogenase, 2,4-dienoyl-CoA reductase, enoate reductase, pentaerythritol tetranitrate reductase, xenobiotic reductase, and morphinone reductase.	327
239202	cd02808	GltS_FMN	Glutamate synthase (GltS) FMN-binding domain.  GltS is a complex iron-sulfur flavoprotein that catalyzes the reductive synthesis of L-glutamate from 2-oxoglutarate and L-glutamine via intramolecular channelling of ammonia, a reaction in the plant, yeast and bacterial pathway for ammonia assimilation. It is a multifunctional enzyme that functions through three distinct active centers, carrying out  L-glutamine hydrolysis, conversion of 2-oxoglutarate into L-glutamate, and electron uptake from an electron donor.	392
239203	cd02809	alpha_hydroxyacid_oxid_FMN	Family of homologous FMN-dependent alpha-hydroxyacid oxidizing enzymes. This family occurs in both prokaryotes and eukaryotes. Members of this family include flavocytochrome b2 (FCB2), glycolate oxidase (GOX), lactate monooxygenase (LMO), mandelate dehydrogenase (MDH), and long chain hydroxyacid oxidase (LCHAO). In green plants, glycolate oxidase is one of the key enzymes in photorespiration where it oxidizes glycolate to glyoxylate. LMO catalyzes the oxidation of L-lactate to acetate and carbon dioxide. MDH oxidizes (S)-mandelate to phenylglyoxalate. It is an enzyme in the mandelate pathway that occurs in several strains of Pseudomonas which converts (R)-mandelate to benzoate.	299
239204	cd02810	DHOD_DHPD_FMN	Dihydroorotate dehydrogenase (DHOD) and Dihydropyrimidine dehydrogenase (DHPD) FMN-binding domain.  DHOD catalyzes the oxidation of (S)-dihydroorotate to orotate. This is the fourth step and the only redox reaction in the de novo biosynthesis of UMP, the precursor of all pyrimidine nucleotides. DHOD requires FMN as co-factor. DHOD divides into class 1 and class 2 based on their amino acid sequences and cellular location. Members of class 1 are cytosolic enzymes and multimers while class 2 enzymes are membrane associated and monomeric. The class 1 enzymes can be further divided into subtypes 1A and 1B which are homodimers and heterotetrameric proteins, respectively. DHPD catalyzes the first step in pyrimidine degradation: the NADPH-dependent reduction of uracil and thymine to the corresponding 5,6-dihydropyrimidines. DHPD contains two FAD, two FMN and eight [4Fe-4S] clusters, arranged in two electron transfer chains that pass its homodimeric interface twice. Two of the Fe-S clusters show a hitherto unobserved coordination involving a glutamine residue.	289
239205	cd02811	IDI-2_FMN	Isopentenyl-diphosphate:dimethylallyl diphosphate isomerase type 2 (IDI-2) FMN-binding domain. Two types of IDIs have been characterized at present. The long known IDI-1 is only dependent on divalent metals for activity, whereas IDI-2 requires a metal, FMN and NADPH. IDI-2 catalyzes the interconversion of isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP) in the mevalonate pathway.	326
239206	cd02812	PcrB_like	PcrB_like proteins. One member of this family, a protein from Archaeoglobus fulgidus, has been characterized as a (S)-3-O-geranylgeranylglyceryl phosphate synthase (AfGGGPS). AfGGGPS catalyzes the formation of an ether linkage between sn-glycerol-1-phosphate (G1P) and geranylgeranyl diphosphate (GGPP), the committed step in archaeal lipid biosynthesis. Therefore, it has been proposed that PcrB-like proteins are either prenyltransferases or are involved in lipoteichoic acid biosynthesis although the exact function is still unknown.	219
239207	cd02825	PAZ	PAZ domain, named PAZ after the proteins Piwi Argonaut and Zwille. PAZ is found in two families of proteins that are essential components of RNA-mediated gene-silencing pathways, including RNA interference, the piwi and Dicer families. PAZ functions as a nucleic-acid binding domain, with a strong preference for single-stranded nucleic acids (RNA or DNA) or RNA duplexes with single-stranded 3' overhangs. It has been suggested that the PAZ domain provides a unique mode for the recognition of the two 3'-terminal nucleotides in single-stranded nucleic acids and buries the 3' OH group, and that it might recognize characteristic 3' overhangs in siRNAs within RISC (RNA-induced silencing) and other complexes. This parent model also contains structures of an archaeal PAZ domain.	115
239208	cd02826	Piwi-like	Piwi-like: PIWI domain. Domain found in proteins involved in RNA silencing. RNA silencing refers to a group of related gene-silencing mechanisms mediated by short RNA molecules, including siRNAs, miRNAs, and heterochromatin-related guide RNAs. The central component of the RNA-induced silencing complex (RISC) and related complexes is Argonaute. The PIWI domain is the C-terminal portion of Argonaute and consists of two subdomains, one of which provides the 5' anchoring of the guide RNA and the other, the catalytic site for slicing. This domain is also found in closely related proteins, including the Piwi subfamily, where it is believed to perform a crucial role in germline cells, via a similar mechanism.	393
239209	cd02843	PAZ_dicer_like	PAZ domain, dicer_like subfamily. Dicer is an RNAse involved in cleaving dsRNA in the RNA interference pathway. It generates dsRNAs which are approximately 20 bp long (siRNAs), which in turn target hydrolysis of homologous RNAs. PAZ domains are named after the proteins Piwi Argonaut and Zwille. PAZ is found in two families of proteins that are essential components of RNA-mediated gene-silencing pathways, including RNA interference, the piwi and Dicer families. PAZ functions as a nucleic-acid binding domain, with a strong preference for single-stranded nucleic acids (RNA or DNA) or RNA duplexes with single-stranded 3' overhangs. It has been suggested that the PAZ domain provides a unique mode for the recognition of the two 3'-terminal nucleotides in single-stranded nucleic acids and buries the 3' OH group, and that it might recognize characteristic 3' overhangs in siRNAs within RISC (RNA-induced silencing) and other complexes.	122
239210	cd02844	PAZ_CAF_like	PAZ domain, CAF_like subfamily. CAF (for carpel factory) is a plant homolog of Dicer. CAF has been implicated in flower morphogenesis and in early Arabidopsis development and might function through posttranscriptional regulation of specific mRNA molecules. PAZ domains are named after the proteins Piwi, Argonaut, and Zwille. PAZ is found in two families of proteins that are essential components of RNA-mediated gene-silencing pathways, including RNA interference, the Piwi and Dicer families. PAZ functions as a nucleic-acid binding domain, with a strong preference for single-stranded nucleic acids (RNA or DNA) or RNA duplexes with single-stranded 3' overhangs. It has been suggested that the PAZ domain provides a unique mode for the recognition of the two 3'-terminal nucleotides in single-stranded nucleic acids and buries the 3' OH group, and that it might recognize characteristic 3' overhangs in siRNAs within RISC (RNA-induced silencing) and other complexes.	135
239211	cd02845	PAZ_piwi_like	PAZ domain,  Piwi_like subfamily. In multi-cellular organisms, the Piwi protein appears to be essential for the maintenance of germline stem cells. In the Drosophila male germline, Piwi was shown to be involved in the silencing of retrotransposons in the male gametes. The Piwi proteins share their domain architecture with other members of the argonaute family. The PAZ domain has been named after the proteins Piwi, Argonaut, and Zwille. PAZ is found in two families of proteins that are essential components of RNA-mediated gene-silencing pathways, including RNA interference, the Piwi and Dicer families. PAZ functions as a nucleic acid binding domain, with a strong preference for single-stranded nucleic acids (RNA or DNA) or RNA duplexes with single-stranded 3' overhangs. It has been suggested that the PAZ domain provides a unique mode for the recognition of the two 3'-terminal nucleotides in single-stranded nucleic acids and buries the 3' OH group, and that it might recognize characteristic 3' overhangs in siRNAs within RISC (RNA-induced silencing) and other complexes.	117
239212	cd02846	PAZ_argonaute_like	PAZ domain, argonaute_like subfamily. Argonaute is part of the RNA-induced silencing complex (RISC), and is an endonuclease that plays a key role in the RNA interference pathway. The PAZ domain has been named after the proteins Piwi, Argonaut, and Zwille. PAZ is found in two families of proteins that are essential components of RNA-mediated gene-silencing pathways, including RNA interference, the Piwi and Dicer families. PAZ functions as a nucleic acid binding domain, with a strong preference for single-stranded nucleic acids (RNA or DNA) or RNA duplexes with single-stranded 3' overhangs. It has been suggested that the PAZ domain provides a unique mode for the recognition of the two 3'-terminal nucleotides in single-stranded nucleic acids and buries the 3' OH group, and that it might recognize characteristic 3' overhangs in siRNAs within RISC (RNA-induced silencing) and other complexes.	114
199879	cd02847	E_set_Chitobiase_C	C-terminal Early set domain associated with the catalytic domain of chitobiase (also called N-acetylglucosaminidase). E or "early" set domains are associated with the catalytic domain of chitobiase at the C-terminus. Chitobiase digests the beta-1,4 glycosidic bonds of the N-acetylglucosamine (NAG) oligomers found in chitin, an important structural element of fungal cell walls and arthropod exoskeletons. It is thought to proceed through an acid-base reaction mechanism, in which one protein carboxylate acts as the catalytic acid, while the nucleophile is the polar acetamido group of the sugar in a substrate-assisted reaction with retention of the anomeric configuration. The C-terminus of chitobiase may be related to the immunoglobulin and/or fibronectin type III superfamilies. E set domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase, among others.	62
199880	cd02848	E_set_Chitinase_N	N-terminal Early set domain associated with the catalytic domain of chitinase. E or "early" set domains are associated with the catalytic domain of chitinase at the N-terminal end. Chitinases hydrolyze the abundant natural biopolymer chitin, producing smaller chito-oligosaccharides. Chitin consists of multiple N-acetyl-D-glucosamine (NAG) residues connected via beta-1,4-glycosidic linkages and is an important structural element of fungal cell wall and arthropod exoskeletons. On the basis of the mode of chitin hydrolysis, chitinases are classified as random, endo-, and exo-chitinases and belong to families 18 and 19 of glycosyl hydrolases based on sequence criteria. The N-terminal domain of chitinase may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase, among others.	105
199881	cd02850	E_set_Cellulase_N	N-terminal Early set domain associated with the catalytic domain of cellulase. E or "early" set domains are associated with the catalytic domain of cellulases at the N-terminal end. Cellulases are O-glycosyl hydrolases (GHs) that hydrolyze beta 1-4 glucosidic bonds in cellulose. They are usually categorized into either exoglucanases, which sequentially release terminal sugar units from the cellulose chain, or endoglucanases, which also attack the chain internally. The N-terminal domain of cellulase may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase, among others.	86
199882	cd02851	E_set_GO_C	C-terminal Early set domain associated with the catalytic domain of galactose oxidase. E or "early" set domains are associated with the catalytic domain of galactose oxidase at the C-terminal end. Galactose oxidase is an extracellular monomeric enzyme which catalyzes the stereospecific oxidation of a broad range of primary alcohol substrates and possesses a unique mononuclear copper site essential for catalyzing a two-electron transfer reaction during the oxidation of primary alcohols to corresponding aldehydes. The second redox active center necessary for the reaction was found to be situated at a tyrosine residue. The C-terminal domain of galactose oxidase may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase, among others.	103
199883	cd02853	E_set_MTHase_like_N	N-terminal Early set domain associated with the catalytic domain of Maltooligosyl trehalose trehalohydrolase (also called Glycosyltrehalose trehalohydrolase) and similar proteins. E or "early" set domains are associated with the catalytic domain of Maltooligosyl trehalose trehalohydrolase (MTHase) and similar proteins at the N-terminal end. This subfamily also includes bacterial alpha amylases and 1,4-alpha-glucan branching enzymes which are highly similar to MTHase. Maltooligosyl trehalose synthase (MTSase) and MTHase work together to produce trehalose. MTSase is responsible for converting the alpha-1,4-glucosidic linkage to an alpha,alpha-1,1-glucosidic linkage at the reducing end of the maltooligosaccharide through an intramolecular transglucosylation reaction, while MTHase hydrolyzes the penultimate alpha-1,4 linkage of the reducing end, resulting in the release of trehalose. The N-terminal domain of MTHase may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase, among others.	84
199884	cd02854	E_set_GBE_euk_N	N-terminal Early set domain associated with the catalytic domain of eukaryotic glycogen branching enzyme (also called 1,4 alpha glucan branching enzyme). This subfamily is composed of predominantly eukaryotic 1,4 alpha glucan branching enzymes, also called glycogen branching enzymes or starch binding enzymes in plants. E or "early" set domains are associated with the catalytic domain of the 1,4 alpha glucan branching enzymes at the N-terminal end. These enzymes catalyze the formation of alpha-1,6 branch points in either glycogen or starch by cleavage of the alpha-1,4 glucosidic linkage, yielding a non-reducing end oligosaccharide chain, as well as the subsequent attachment of short glucosyl chains to the alpha-1,6 position. Starch is composed of two types of glucan polymer: amylose and amylopectin. Amylose is mainly composed of linear chains of alpha-1,4 linked glucose residues and amylopectin consists of shorter alpha-1,4 linked chains connected by alpha-1,6 linkages. Amylopectin is synthesized from linear chains by starch branching enzyme. The N-terminal domains of the branching enzyme proteins may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase, among others.	95
199885	cd02855	E_set_GBE_prok_N	N-terminal Early set domain associated with the catalytic domain of prokaryotic glycogen branching enzyme. This subfamily is composed of predominantly prokaryotic 1,4 alpha glucan branching enzymes, also called glycogen branching enzymes. E or "early" set domains are associated with the catalytic domain of glycogen branching enzymes at the N-terminal end. Glycogen branching enzyme catalyzes the formation of alpha-1,6 branch points in either glycogen or starch by cleavage of the alpha-1,4 glucosidic linkage, yielding a non-reducing end oligosaccharide chain, as well as the subsequent attachment of short glucosyl chains to the alpha-1,6 position. By increasing the number of non-reducing ends, glycogen is more reactive to synthesis and digestion as well as being more soluble. The N-terminal domain of the 1,4 alpha glucan branching enzyme may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at  either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions.  Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase, among others.	105
199886	cd02856	E_set_GDE_Isoamylase_N	N-terminal Early set domain associated with the catalytic domain of Glycogen debranching enzyme and bacterial isoamylase (also called glycogen 6-glucanohydrolase). E or "early" set domains are associated with the catalytic domain of the glycogen debranching enzyme at the N-terminal end. Glycogen debranching enzymes have both 4-alpha-glucanotransferase and amylo-1,6-glucosidase activities. As a transferase, it transfers a segment of the 1,4-alpha-D-glucan to a new 4-position in an acceptor, which may be glucose or another 1,4-alpha-D-glucan. As a glucosidase, it catalyzes the endohydrolysis of 1,6-alpha-D-glucoside linkages at points of branching in chains of 1,4-linked alpha-D-glucose residues. Bacterial isoamylases are also included in this subfamily. Isoamylase is one of the starch-debranching enzymes that catalyze the hydrolysis of alpha-1,6-glucosidic linkages specific in alpha-glucans such as amylopectin or glycogen. Isoamylase contains a bound calcium ion, but this is not in the same position as the conserved calcium ion that has been reported in other alpha-amylase family enzymes. The N-terminal domain of glycogen debranching enzyme may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase, among others.	130
199887	cd02857	E_set_CDase_PDE_N	N-terminal Early set domain associated with the catalytic domain of cyclomaltodextrinase and pullulan-degrading enzymes. E or "early" set domains are associated with the catalytic domain of the cyclomaltodextrinase (CDase) and pullulan-degrading enzymes at the N-terminal end. Members of this subgroup include CDase, maltogenic amylase, and neopullulanase, all of which are capable of hydrolyzing all or two of the following three types of substrates: cyclomaltodextrins (CDs), pullulan, and starch. These enzymes hydrolyze CDs and starch to maltose and pullulan to panose by cleavage of alpha-1,4 glycosidic bonds whereas alpha-amylases essentially lack activity on CDs and pullulan. They also catalyze transglycosylation of oligosaccharides to the C3-, C4- or C6-hydroxyl groups of various acceptor sugar molecules. The N-terminal domain of the CDase and pullulan-degrading enzymes may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions.  Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase, among others.	109
199888	cd02858	E_set_Esterase_N	N-terminal Early set domain associated with the catalytic domain of esterase. E or "early" set domains are associated with the catalytic domain of esterase at the N-terminal end. Esterases catalyze the hydrolysis of organic esters to release an alcohol or thiol and acid. The term esterase can be applied to enzymes that hydrolyze carboxylate, phosphate and sulphate esters, but is more often restricted to the first class of substrate. The N-terminal domain of esterase may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at  either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase, among others.	78
199889	cd02859	E_set_AMPKbeta_like_N	N-terminal Early set domain, a glycogen binding domain, associated with the catalytic domain of AMP-activated protein kinase beta subunit. E or "early" set domains are associated with the catalytic domain of AMP-activated protein kinase beta subunit glycogen binding domain at the N-terminal end. AMPK is a metabolic stress sensing protein that senses AMP/ATP and has recently been found to act as a glycogen sensor as well. The protein functions as an alpha-beta-gamma heterotrimer. This N-terminal domain is the glycogen binding domain of the beta subunit. This domain is also a member of the CBM48 (Carbohydrate Binding Module 48) family whose members include pullulanase, maltooligosyl trehalose synthase, starch branching enzyme, glycogen branching enzyme, glycogen debranching enzyme, and isoamylase.	80
199890	cd02860	E_set_Pullulanase	Early set domain associated with the catalytic domain of pullulanase (also called dextrinase and alpha-dextrin endo-1,6-alpha glucosidase). E or "early" set domains are associated with the catalytic domain of pullulanase at either the N-terminal or C-terminal end, and in a few instances at both ends. Pullulanase is an enzyme with activity similar to that of isoamylase; it cleaves 1,6-alpha-glucosidic linkages in pullulan, amylopectin, and glycogen, and in alpha-and beta-amylase limit-dextrins of amylopectin and glycogen. The E set domain of pullulanase may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase. This domain is also a member of the CBM48 (Carbohydrate Binding Module 48) family whose members include maltooligosyl trehalose synthase, starch branching enzyme, glycogen branching enzyme, glycogen debranching enzyme, isoamylase, and the beta subunit of AMP-activated protein kinase.	97
199891	cd02861	E_set_pullulanase_like	Early set domain associated with the catalytic domain of pullulanase-like proteins. E or "early" set domains are associated with the catalytic domain of pullulanase at either the N-terminal or C-terminal end, and in a few instances at both ends. Pullulanase (also called dextrinase or alpha-dextrin endo-1,6-alpha glucosidase) is an enzyme with action similar to that of isoamylase; it cleaves 1,6-alpha-glucosidic linkages in pullulan, amylopectin, and glycogen, and in alpha-and beta-amylase limit-dextrins of amylopectin and glycogen. The E set domain of pullulanase may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase. This domain is also a member of the CBM48 (Carbohydrate Binding Module 48) family whose members include maltooligosyl trehalose synthase, starch branching enzyme, glycogen branching enzyme, glycogen debranching enzyme, isoamylase, and the beta subunit of AMP-activated protein kinase.	88
239213	cd02862	NorE_like	NorE_like subfamily of heme-copper oxidase subunit III.  Heme-copper oxidases include cytochrome c and ubiquinol oxidases.  Alcaligenes faecalis norE is found in a gene cluster containing norCB. norCB encodes the cytochrome c and cytochrome b subunits of nitric oxide reductase (NOR). Based on this and on its similarity to subunit III of cytochrome c oxidase (CcO) and ubiquinol oxidase, NorE has been speculated to be a subunit of NOR.	186
239214	cd02863	Ubiquinol_oxidase_III	Ubiquinol oxidase subunit III subfamily. Ubiquinol oxidase, the terminal oxidase in the respiratory chains of aerobic bacteria, is a multi-chain transmembrane protein located in the cell membrane.  It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane.  Ubiquinol oxidases feature four subunits in contrast to the 13 subunit bovine cytochrome c oxidase (CcO). Subunits I, II, and III of bovine CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Subunits I, II and III of ubiquinol oxidase are homologous to the corresponding subunits in bovine CcO.  Although not required for catalytic activity, subunit III appears to be involved in assembly of the multimer complex.	186
239215	cd02864	Heme_Cu_Oxidase_III_1	Heme-copper oxidase subunit III subfamily.  Heme-copper oxidases are transmembrane protein complexes in the respiratory chains of prokaryotes and mitochondria which couple the reduction of molecular oxygen to water to proton pumping across the membrane. The heme-copper oxidase superfamily is diverse in terms of electron donors, subunit composition, and heme types.  This superfamily includes cytochrome c and ubiquinol oxidases.  Bacterial oxidases typically contain 3 or 4 subunits in contrast to the 13-subunit bovine cytochrome c oxidase (CcO). Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Subunits I, II and III of ubiquinol oxidase are homologous to the corresponding subunits in CcO.  Although not required for catalytic activity, subunit III is believed to play a role in assembly of the multimer complex. Rhodobacter CcO subunit III stabilizes the integrity of the binuclear center in subunit I.  It has been proposed that Archaea acquired heme-copper oxidases through gene transfer from Gram-positive bacteria.	202
239216	cd02865	Heme_Cu_Oxidase_III_2	Heme-copper oxidase subunit III subfamily.  Heme-copper oxidases are transmembrane protein complexes in the respiratory chains of prokaryotes and mitochondria which couple the reduction of molecular oxygen to water to proton pumping across the membrane. The heme-copper oxidase superfamily is diverse in terms of electron donors, subunit composition, and heme types.  This superfamily includes cytochrome c and ubiquinol oxidases.  Bacterial oxidases typically contain 3 or 4 subunits in contrast to the 13-subunit bovine cytochrome c oxidase (CcO). Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Subunits I, II and III of ubiquinol oxidase are homologous to the corresponding subunits in CcO.  Although not required for catalytic activity, subunit III is believed to play a role in assembly of the multimer complex. Rhodobacter CcO subunit III stabilizes the integrity of the binuclear center in subunit I.  It has been proposed that Archaea acquired heme-copper oxidases through gene transfer from Gram-positive bacteria.	184
211343	cd02866	PseudoU_synth_TruA_Archea	Archaeal pseudouridine synthases. This group consists of archaeal pseudouridine synthases. Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi).  No cofactors are required. This group of proteins makes pseudouridine in tRNAs.	219
211344	cd02867	PseudoU_synth_TruB_4	Pseudouridine synthase homolog 4. This group consists of eukaryotic TruB proteins similar to Saccharomyces cerevisiae Pus4. S. cerevisiae Pus4 makes psi55 in the T loop of both cytoplasmic and mitochondrial tRNAs. Psi55 is almost universally conserved.  Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi).	312
211345	cd02868	PseudoU_synth_hTruB2_like	Pseudouridine synthase, humanTRUB2_like. This group consists of eukaryotic pseudouridine synthases similar to human TruB pseudouridine synthase homolog 2 (TRUB2). Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi).	226
211346	cd02869	PseudoU_synth_RluA_like	Pseudouridine synthase, RluA family. This group is composed of eukaryotic, bacterial and archaeal proteins similar to eight site-specific Escherichia coli pseudouridine synthases: RsuA, RluA, RluB, RluC, RluD, RluE, RluF and TruA.  Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi), requiring no cofactors.  E. coli RluC, for example, makes psi955, psi2504 and psi2580 in 23S RNA.  Some psi sites, such as psi1917 in 23S RNA made by RluD, are universally conserved.  Other psi sites occur in a more restricted fashion; for example, psi2819 in 21S mitochondrial ribosomal RNA made by S. cerevisiae Pus5p is only found in mitochondrial large subunit rRNAs from some other species and in Gram-negative bacteria. The E. coli counterpart of this psi residue is psi2580 in 23S rRNA.  psi2604 in 23S RNA made by RluF has only been detected in E. coli.	185
211347	cd02870	PseudoU_synth_RsuA_like	Pseudouridine synthases, RsuA subfamily. Pseudouridine synthases are responsible for the synthesis of pseudouridine from uracil in ribosomal RNA. The RsuA subfamily includes pseudouridine synthases similar to ribosomal small subunit pseudouridine 516 synthase. Most of the proteins in this family are bacterial proteins.	146
119350	cd02871	GH18_chitinase_D-like	GH18 domain of Chitinase D (ChiD).  ChiD, a chitinase found in Bacillus circulans, hydrolyzes the 1,4-beta-linkages of N-acetylglucosamine in chitin and chitodextrins.  The domain architecture of ChiD includes a catalytic glycosyl hydrolase family 18 (GH18) domain, a chitin-binding domain, and a fibronectin type III domain. The chitin-binding and fibronectin type III domains are located either N-terminal or C-terminal to the catalytic domain.  This family includes exochitinase Chi36 from Bacillus cereus.	312
119351	cd02872	GH18_chitolectin_chitotriosidase	This conserved domain family includes a large number of catalytically inactive chitinase-like lectins (chitolectins) including YKL-39, YKL-40 (HCGP39), YM1, oviductin, and AMCase (acidic mammalian chitinase), as well as catalytically active chitotriosidases.  The conserved domain is an eight-stranded alpha/beta barrel fold belonging to the family 18 glycosyl hydrolases.  The fold has a pronounced active-site cleft at the C-terminal end of the beta-barrel.  The chitolectins lack a key active site glutamate (the proton donor required for hydrolytic activity) but retain highly conserved residues involved in oligosaccharide binding.  Chitotriosidase is a chitinolytic enzyme expressed in maturing macrophages, which suggests that it plays a part in antimicrobial defense.  Chitotriosidase hydrolyzes chitotriose, as well as colloidal chitin to yield chitobiose and is therefore considered an exochitinase. Chitotriosidase occurs in two major forms, the large form being converted to the small form by either RNA or post-translational processing.  Although the small form, containing the chitinase domain alone, is sufficient for the chitinolytic activity, the additional C-terminal chitin-binding domain of the large form plays a role in processing colloidal chitin. The chitotriosidase gene is nonessential in humans, as about 35% of the population are heterozygous and 6% homozygous for an inactivated form of the gene.  HCGP39 is a 39-kDa human cartilage glycoprotein thought to play a role in connective tissue remodeling and defense against pathogens.	362
119352	cd02873	GH18_IDGF	The IDGFs (imaginal disc growth factors) are a family of growth factors identified in insects that include at least five members, some of which are encoded by genes in a tight cluster. The IDGFs have an eight-stranded alpha/beta barrel fold and are related to the glycosyl hydrolase family 18 (GH18) chitinases, but they have an amino acid substitution known to abolish chitinase catalytic activity. IDGFs may have evolved from chitinases to gain new functions as growth factors, interacting with cell surface glycoproteins involved in growth-promoting processes.	413
119353	cd02874	GH18_CFLE_spore_hydrolase	Cortical fragment-lytic enzyme (CFLE) is a peptidoglycan hydrolase involved in  bacterial endospore germination.  CFLE is expressed as an inactive preprotein (called SleB) in the forespore compartment of sporulating cells.  SleB translocates across the forespore inner membrane and is deposited as a mature enzyme in the cortex layer of the spore.  As part of a sensory mechanism capable of initiating germination, CFLE degrades a spore-specific peptidoglycan constituent called muramic-acid delta-lactam that comprises the outer cortex.  CFLE has a C-terminal glycosyl hydrolase family 18 (GH18) catalytic domain as well as two N-terminal LysM peptidoglycan-binding domains.  In addition to SleB, this family includes YaaH, YdhD, and YvbX from Bacillus subtilis.	313
119354	cd02875	GH18_chitobiase	Chitobiase (also known as di-N-acetylchitobiase) is a lysosomal glycosidase that hydrolyzes the reducing-end N-acetylglucosamine from the chitobiose core of oligosaccharides during the ordered degradation of asparagine-linked glycoproteins in eukaryotes. Chitobiase can only do so if the asparagine that joins the oligosaccharide to protein is previously removed by a glycosylasparaginase. Chitobiase is therefore the final step in the lysosomal degradation of the protein/carbohydrate linkage component of asparagine-linked glycoproteins. The catalytic domain of chitobiase is an eight-stranded alpha/beta barrel fold similar to that of other family 18 glycosyl hydrolases such as hevamine and chitotriosidase.	358
119355	cd02876	GH18_SI-CLP	Stabilin-1 interacting chitinase-like protein (SI-CLP) is a eukaryotic chitinase-like protein of unknown function that interacts with the endocytic/sorting transmembrane receptor stabilin-1 and is secreted from the lysosome.  SI-CLP has a glycosyl hydrolase family 18 (GH18) domain but lacks a chitin-binding domain. The catalytic amino acids of the GH18 domain are not conserved in SI-CLP, similar to the chitolectins YKL-39, YKL-40, and YM1/2.  Human SI-CLP is sorted to late endosomes and secretory lysosomes in alternatively activated macrophages.	318
119356	cd02877	GH18_hevamine_XipI_class_III	This conserved domain family includes xylanase inhibitor Xip-I, and the class III plant chitinases such as hevamine, concanavalin B, and PPL2, all of which have a glycosyl hydrolase family 18 (GH18) domain. Hevamine is a class III endochitinase that hydrolyzes the linear polysaccharide chains of chitin and peptidoglycan and is important for defense against pathogenic bacteria and fungi.  PPL2 (Parkia platycephala lectin 2) is a class III chitinase from Parkia platycephala seeds that hydrolyzes beta(1-4) glycosidic bonds linking 2-acetoamido-2-deoxy-beta-D-glucopyranose units in chitin.	280
119357	cd02878	GH18_zymocin_alpha	Zymocin, alpha subunit.  Zymocin is a heterotrimeric enzyme that inhibits yeast cell cycle progression. The zymocin alpha subunit has a chitinase activity that is essential for holoenzyme action from the cell exterior while the gamma subunit contains the intracellular toxin responsible for G1 phase cell cycle arrest.  The zymocin alpha and beta subunits are thought to act from the cell's exterior by docking to the cell wall-associated chitin, thus mediating gamma-toxin translocation.  The alpha subunit has an eight-stranded TIM barrel fold similar to that of family 18 glycosyl hydrolases such as hevamine, chitolectin, and chitobiase.	345
119358	cd02879	GH18_plant_chitinase_class_V	The class V plant chitinases have a glycosyl hydrolase family 18 (GH18) domain, but lack the chitin-binding domain present in other GH18 enzymes.  The GH18 domain of the class V chitinases has endochitinase activity in some cases and no catalytic activity in others.  Included in this family is a lectin found in black locust (Robinia pseudoacacia) bark, which binds chitin but lacks chitinase activity.  Also included is a chitinase-related receptor-like kinase (CHRK1) from tobacco (Nicotiana tabacum), with an N-terminal GH18 domain and a C-terminal kinase domain, which is thought to be part of a plant signaling pathway.  The GH18 domain of CHRK1 is expressed extracellularly where it binds chitin but lacks chitinase activity.	299
239217	cd02883	Nudix_Hydrolase	Nudix hydrolase is a superfamily of enzymes found in all three kingdoms of life, and it catalyzes the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+ for their activity. Members of this family are recognized by a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which forms a structural motif that functions as a metal binding and catalytic site. Substrates of nudix hydrolase include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance and "house-cleaning" enzymes. Substrate specificity is used to define child families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. This superfamily consists of at least nine families: IPP (isopentenyl diphosphate) isomerase, ADP ribose pyrophosphatase, mutT pyrophosphohydrolase, coenzyme-A pyrophosphatase, MTH1-7,8-dihydro-8-oxoguanine-triphosphatase, diadenosine tetraphosphate hydrolase, NADH pyrophosphatase, GDP-mannose hydrolase and the c-terminal portion of the mutY adenine glycosylase.	123
239218	cd02885	IPP_Isomerase	Isopentenyl diphosphate (IPP) isomerase, a member of the Nudix hydrolase superfamily, is a key enzyme in the isoprenoid biosynthetic pathway. Isoprenoids comprise a large family of natural products including sterols, carotenoids, dolichols and prenylated proteins. These compounds are synthesized from two precursors: isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP). IPP isomerase catalyzes the interconversion of IPP and DMAPP by a stereoselective antarafacial transposition of hydrogen. The enzyme requires one Mn2+ or Mg2+ ion in its active site to fold into an active conformation and also contains the Nudix motif, a highly conserved 23-residue block (GX5EX7REUXEEXGU, where U = I, L or V), that functions as a metal binding and catalytic site. The metal binding site is present within the active site and plays structural and catalytical roles. IPP isomerase is well represented in several bacteria, archaebacteria and eukaryotes, including fungi, mammals and plants. Despite sequence variations (mainly at the N-terminus), the core structure is highly conserved.	165
153089	cd02888	RNR_II_dimer	Class II ribonucleotide reductase, dimeric form. Ribonucleotide reductase (RNR) catalyzes the reductive synthesis of deoxyribonucleotides from their corresponding ribonucleotides. It provides the precursors necessary for DNA synthesis. RNRs are separated into three classes based on their metallocofactor usage. Class I RNRs, found in eukaryotes, bacteria, and bacteriophage, use a diiron-tyrosyl radical. Class II RNRs, found in bacteria, bacteriophage, algae and archaea, use coenzyme B12 (adenosylcobalamin, AdoCbl). Class III RNRs, found in anaerobic bacteria, bacteriophage, and archaea, use an FeS cluster and S-adenosylmethionine to generate a glycyl radical. Many organisms have more than one class of RNR present in their genomes. All three RNRs have a ten-stranded alpha-beta barrel domain that is structurally similar to the domain of PFL (pyruvate formate lyase). Class II RNRs are found in bacteria that can live under both aerobic and anaerobic conditions. Many, but not all members of this class are found to be homodimers. Adenosylcobalamin interacts directly with an active site cysteine to form the reactive cysteine radical.	464
239219	cd02889	SQCY	Squalene cyclase (SQCY) domain; found in class II terpene cyclases that have an alpha 6 - alpha 6 barrel fold. Squalene cyclase (SQCY) and 2,3-oxidosqualene cyclase (OSQCY) are integral membrane proteins that catalyze a cationic cyclization cascade converting linear triterpenes to fused ring compounds. Bacterial SQCY catalyzes the conversion of squalene to hopene or diplopterol. Eukaryotic OSQCY transforms the 2,3-epoxide of squalene to compounds such as lanosterol (a metabolic precursor of cholesterol and steroid hormones) in mammals and fungi or cycloartenol in plants. Deletion of a single glycine residue of Alicyclobacillus acidocaldarius SQCY alters its substrate specificity into that of eukaryotic OSQCY. Both enzymes have a second minor domain, which forms an alpha-alpha barrel that is inserted into the major domain. This group also contains SQCY-like archaeal sequences and some bacterial SQCYs which lack this minor domain.	348
239220	cd02890	PTase	Protein prenyltransferase (PTase) domain, beta subunit (alpha 6 - alpha 6 barrel fold). The protein prenyltransferase family of lipid-modifying enzymes includes protein farnesyltransferase (FTase) and geranylgeranyltransferase types I and II (GGTase-I and GGTase-II). They catalyze the carboxyl-terminal lipidation of Ras, Rab, and several other cellular signal transduction proteins, facilitating membrane associations and specific protein-protein interactions. Prenyltransferases employ a Zn2+ ion to alkylate a thiol group catalyzing the formation of thioether linkages between the C1 atom of farnesyl (15-carbon by FTase) or geranylgeranyl (20-carbon by GGTase-I, II) isoprenoid lipids and cysteine residues at or near the C-terminus of protein acceptors. FTase and GGTase-I prenylate the cysteine in the terminal sequence, "CAAX"; and GGTase-II prenylates both cysteines in the "CC" (or "CXC") terminal sequence. These enzymes are heterodimeric with both alpha and beta subunits required for catalytic activity. In contrast to other prenyltransferases, GGTase-II does not recognize its protein acceptor directly but requires Rab to complex with REP (Rab escort protein) before prenylation can occur. These enzymes are found exclusively in eukaryotes.	286
239221	cd02891	A2M_like	Proteins similar to alpha2-macroglobulin (alpha (2)-M).  Alpha (2)-M is a major carrier protein in serum. It is a broadly specific proteinase inhibitor.  The structural thioester of alpha (2)-M is involved in the immobilization and entrapment of proteases. This group contains another broadly specific proteinase inhibitor: pregnancy zone protein (PZP).  PZP is a trace protein in the plasma of non-pregnant females and males which is elevated in pregnancy. Alpha (2)-M and PZP bind to placental protein-14 and may modulate its activity in T-cell growth and cytokine production, thereby protecting the allogeneic fetus from attack by the maternal immune system. This group also contains C3, C4 and C5 of vertebrate complement.  The vertebrate complement system is an effector of both the acquired and innate immune systems.  The point of convergence of the classical, alternative and lectin pathways of the complement system is the proteolytic activation of C3. C4 plays a key role in propagating the classical and lectin pathways. C5 participates in the classical and alternative pathways. The thioester bond located within the structure of C3 and C4 is central to the function of complement. C5 does not contain an active thioester bond.	282
239222	cd02892	SQCY_1	Squalene cyclase (SQCY) domain subgroup 1; found in class II terpene cyclases that have an alpha 6 - alpha 6 barrel fold. Squalene cyclase (SQCY) and 2,3-oxidosqualene cyclase (OSQCY) are integral membrane proteins that catalyze a cationic cyclization cascade converting linear triterpenes to fused ring compounds. This group contains bacterial SQCY, which catalyzes the conversion of squalene to hopene or diplopterol, and eukaryotic OSQCY, which transforms the 2,3-epoxide of squalene to compounds such as lanosterol in mammals and fungi or cycloartenol in plants. Deletion of a single glycine residue of Alicyclobacillus acidocaldarius SQCY alters its substrate specificity into that of eukaryotic OSQCY. Both enzymes have a second minor domain, which forms an alpha-alpha barrel that is inserted into the major domain.	634
239223	cd02893	FTase	Protein farnesyltransferase (FTase)_like proteins containing the protein prenyltransferase (PTase) domain, beta subunit (alpha 6 - alpha 6 barrel fold). FTases are a subgroup of the PTase family of lipid-modifying enzymes. PTases catalyze the carboxyl-terminal lipidation of Ras, Rab, and several other cellular signal transduction proteins, facilitating membrane associations and specific protein-protein interactions. These proteins are heterodimers of alpha and beta subunits. Both subunits are required for catalytic activity. Prenyltransferases employ a Zn2+ ion to alkylate a thiol group, catalyzing the formation of thioether linkages between cysteine residues at or near the C-terminus of protein acceptors and the C1 atom of isoprenoid lipids. FTase attaches a 15-carbon farnesyl group to the cysteine within the C-terminal CaaX motif of substrate proteins when X is Ala, Met, Ser, Cys or Gln. Protein farnesylation has been shown to play critical roles in a variety of cellular processes, including Ras/mitogen activated protein kinase signaling pathways in mammals and abscisic acid signal transduction in Arabidopsis.	299
239224	cd02894	GGTase-II	Geranylgeranyltransferase type II (GGTase-II)_like proteins containing the protein prenyltransferase (PTase) domain, beta subunit (alpha 6 - alpha 6 barrel fold). GGTase-IIs are a subgroup of the protein prenyltransferase family of lipid-modifying enzymes. PTases catalyze the carboxyl-terminal lipidation of Ras, Rab, and several other cellular signal transduction proteins, facilitating membrane associations and specific protein-protein interactions. Prenyltransferases employ a Zn2+ ion to alkylate a thiol group, catalyzing the formation of thioether linkages between cysteine residues at or near the C-terminus of protein acceptors and the C1 atom of isoprenoid lipids (geranylgeranyl (20-carbon) in the case of GGTase-II). GGTase-II catalyzes alkylation of both cysteine residues in Rab proteins containing carboxy-terminal "CC", "CXCX" or "CXC" motifs. PTases are heterodimeric with both alpha and beta subunits required for catalytic activity. In contrast to other prenyltransferases, GGTase-II requires an escort protein to bring the substrate protein to the catalytic heterodimer and to escort the geranylgeranylated product to the membrane.	287
239225	cd02895	GGTase-I	Geranylgeranyltransferase type I (GGTase-I)-like proteins containing the protein prenyltransferase (PTase) domain, beta subunit (alpha 6 - alpha 6 barrel fold). GGTase-Is are a subgroup of the protein prenyltransferase family of lipid-modifying enzymes. PTases catalyze the carboxyl-terminal lipidation of Ras, Rab, and several other cellular signal transduction proteins, facilitating membrane associations and specific protein-protein interactions. Prenyltransferases employ a Zn2+ ion to alkylate a thiol group, catalyzing the formation of thioether linkages between cysteine residues at or near the C-terminus of protein acceptors and the C1 atom of isoprenoid lipids (geranylgeranyl (20-carbon) in the case of GGTase-I). GGTase-I prenylates the cysteine in the terminal sequence "CAAX" when X is Leu or Phe. Substrates for GGTase-I include the gamma subunit of neural G-proteins and several Ras-related G-proteins.  PTases are heterodimeric with both alpha and beta subunits required for catalytic activity.	307
239226	cd02896	complement_C3_C4_C5	Proteins similar to C3, C4 and C5 of vertebrate complement.  The vertebrate complement system, comprised of a large number of distinct plasma proteins, is an effector of both the acquired and innate immune systems.  The point of convergence of the classical, alternative and lectin pathways of the complement system is the proteolytic activation of C3. C4 plays a key role in propagating the classical and lectin pathways. C5 participates in the classical and alternative pathways. The thioester bond located within the structure of C3 and C4 is central to the function of complement. C5 does not contain an active thioester bond.	297
239227	cd02897	A2M_2	Proteins similar to alpha2-macroglobulin (alpha (2)-M). This group also contains the pregnancy zone protein (PZP).  Alpha (2)-M and PZP are broadly specific proteinase inhibitors. Alpha (2)-M is a major carrier protein in serum. The structural thioester of alpha (2)-M is involved in the immobilization and entrapment of proteases.  PZP is a trace protein in the plasma of non-pregnant females and males which is elevated in pregnancy. Alpha (2)-M and PZP bind to placental protein-14 and may modulate its activity in T-cell growth and cytokine production, contributing to fetal survival. It has been suggested that thioester bond cleavage promotes the binding of PZP and alpha (2)-M to the CD91 receptor, clearing them from circulation.	292
239228	cd02899	PLAT_SR	Scavenger receptor protein. A subfamily of PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2)  domain.  It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates. This subfamily contains Toxoplasma gondii Scavenger protein TgSR1.	109
394872	cd02900	Macro_Appr_pase	macrodomain, Appr-1"-pase family. Macrodomains are found in a variety of proteins with diverse cellular functions, as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Macrodomains can recognize ADP-ribose (ADPr) in both its free and protein-linked forms, in related ligands, such as O-acyl-ADP-ribose (OAADPr), and even in ligands unrelated to ADPr. The yeast protein Ymx7 and related proteins in this family contain a stand-alone macrodomain and may be specific phosphatases catalyzing the conversion of ADP-ribose-1"-monophosphate (Appr-1"-p) to ADP-ribose. Appr-1"-p is an intermediate in a metabolic pathway involved in pre-tRNA splicing.	195
394873	cd02901	Macro_Poa1p-like	macrodomain, Poa1p-like family. Macrodomains are found in a variety of proteins with diverse cellular functions, as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Macrodomains can recognize ADP-ribose (ADPr) in both its free and protein-linked forms, in related ligands, such as O-acyl-ADP-ribose (OAADPr), and even in ligands unrelated to ADPr. Members of this family show similarity to the yeast protein Poa1p, reported to be a phosphatase specific for Appr-1"-p, a tRNA splicing metabolite. Poa1p may play a role in tRNA splicing regulation.	135
394874	cd02903	Macro_BAL-like	macrodomain, B-aggressive lymphoma (BAL)-like family. Macrodomains are found in a variety of proteins with diverse cellular functions, as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Macrodomains can recognize ADP-ribose (ADPr) in both its free and protein-linked forms, in related ligands, such as O-acyl-ADP-ribose (OAADPr), and even in ligands unrelated to ADPr. Members of this family show similarity to BAL (B-aggressive lymphoma) proteins, which contain one to three macrodomains. Most BAL family macrodomains belong to this family except for the most N-terminal domain in multiple-domain containing proteins. This family includes the second and third macrodomains of mono-ADP-ribosyltransferase PARP14 (PARP-14, also known as ADP-ribosyltransferase diphtheria toxin-like 8, ATRD8, B aggressive lymphoma protein 2, or BAL2). Most BAL proteins also contain a C-terminal PARP active site and are also named as PARPs. Human BAL1 (or PARP-9) was originally identified as a risk-related gene in diffuse large B-cell lymphoma that promotes malignant B-cell migration. Some BAL family proteins exhibit PARP activity. Poly (ADP-ribosyl)ation is an immediate DNA-damage-dependent post-translational modification of histones and other nuclear proteins. BAL proteins may also function as transcriptional repressors.	175
394875	cd02904	Macro_H2A-like	macrodomain, macroH2A-like family. Macrodomains are found in a variety of proteins with diverse cellular functions, as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Macrodomains can recognize ADP-ribose (ADPr) in both its free and protein-linked forms, in related ligands, such as O-acyl-ADP-ribose (OAADPr), and even in ligands unrelated to ADPr. Members of this family are similar to macroH2A, a variant of the major-type core histone H2A, which contains an N-terminal H2A domain and a C-terminal nonhistone macrodomain. Histone macroH2A is enriched on the inactive X chromosome of mammalian female cells. It does not bind poly ADP-ribose, but does bind the monomeric SirT1 metabolite O-acetyl-ADP-ribose (OAADPR) with high affinity through its macrodomain. This family also includes the ADP-ribose binding macrodomain of the macroH2A variant, macroH2A1.1. The macroH2A1.1 isoform inhibits PARP1-dependent DNA-damage induced chromatin dynamics. The putative ADP-ribose binding pocket of the human macroH2A2 macrodomain exhibits marked structural differences compared with the macroH2A1.1 variant.	188
394876	cd02905	Macro_GDAP2-like	macrodomain, GDAP2-like family. Macrodomains are found in a variety of proteins with diverse cellular functions, as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Macrodomains can recognize ADP-ribose (ADPr) in both its free and protein-linked forms, in related ligands, such as O-acyl-ADP-ribose (OAADPr), and even in ligands unrelated to ADPr. This family contains proteins similar to human GDAP2, the ganglioside induced differentiation associated protein 2, whose gene is expressed at a higher level in differentiated Neuro2a cells compared with non-differentiated cells. GDAP2 contains an N-terminal macrodomain and a C-terminal Sec14p-like lipid binding domain. It is specifically expressed in brain and testis.	169
394877	cd02907	Macro_Af1521_BAL-like	macrodomain, Af1521-like family. Macrodomains are found in a variety of proteins with diverse cellular functions, as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Macrodomains can recognize ADP-ribose (ADPr) in both its free and protein-linked forms, in related ligands, such as O-acyl-ADP-ribose (OAADPr), and even in ligands unrelated to ADPr. The macrodomains in this family show similarity to Af1521, a protein from Archaeoglobus fulgidus containing a stand-alone macrodomain. Af1521 binds ADP-ribose and exhibits phosphatase activity toward ADP-ribose-1"-monophosphate (Appr-1"-p). Also included in this family are the N-terminal (or first) macrodomains of BAL (B-aggressive lymphoma) proteins which contain multiple macrodomains, such as the first macrodomain of mono-ADP-ribosyltransferase PARP14 (PARP-14, also known as ADP-ribosyltransferase diphtheria toxin-like 8, ATRD8, B aggressive lymphoma protein 2, or BAL2). Most BAL proteins also contain a C-terminal PARP active site and are also named as PARPs. Human BAL1 (or PARP-9) was originally identified as a risk-related gene in diffuse large B-cell lymphoma that promotes malignant B-cell migration. Some BAL family proteins exhibit PARP activity. Poly (ADP-ribosyl)ation is an immediate DNA-damage-dependent post-translational modification of histones and other nuclear proteins. BAL proteins may also function as transcriptional repressors.	158
394878	cd02908	Macro_OAADPr_deacetylase	macrodomain, O-acetyl-ADP-ribose (OAADPr) family. Macrodomains are found in a variety of proteins with diverse cellular functions, as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Macrodomains can recognize ADP-ribose (ADPr) in both its free and protein-linked forms, in related ligands, such as O-acyl-ADP-ribose (OAADPr), and even in ligands unrelated to ADPr. This family includes eukaryotic macrodomain proteins such as human MacroD1 and MacroD2, and bacterial proteins such as Escherichia coli YmdB; these have been shown to be O-acetyl-ADP-ribose (OAADPr) deacetylases that efficiently catalyze the hydrolysis of OAADPr to produce ADP-ribose and free acetate. OAADPr is a sirtuin reaction product generated from the NAD+-dependent protein deacetylation reactions and has been implicated as a signaling molecule. By acting on mono-ADP-ribosylated substrates, OAADPr deacetylases may reverse cellular ADP-ribosylation.	166
380374	cd02909	cupin_pirin_N	pirin, N-terminal cupin domain. This family contains the N-terminal domain of pirin, a nuclear protein that is highly conserved among mammals, plants, fungi, and prokaryotes. It is widely expressed in dot-like subnuclear structures in human tissues such as liver and heart. Pirin functions as both a transcriptional cofactor and an apoptosis-related protein in mammals and is involved in seed germination and seedling development in plants. The pirins have been assigned as a subfamily of the cupin superfamily based on structure and sequence similarity. The pirins have two tandem cupin-like folds but the C-terminal cupin fold has diverged considerably and does not have a metal binding site. The exact functions of pirins are unknown but they have quercetinase activity in Escherichia coli and are thought to play important roles in transcription and apoptosis. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold generally capable of homodimerization.	104
380375	cd02910	cupin_Yhhw_N	Escherichia coli YhhW and YhaK and related proteins, pirin-like bicupin, N-terminal cupin domain. This family includes the N-terminal cupin domains of YhhW and YhaK, Escherichia coli pirin-like proteins with unknown function. YhhW is structurally similar not only to human pirin but also to quercetin 2,3-dioxygenase (quercetinase). Although the function of YhhW is not completely understood, YhhW and its human ortholog have quercetinase activity and are likely to play an important role in transcription and apoptosis. This N-terminal cupin domain of YhhW has a metal coordination site and is thought to have catalytic activity while the C-terminal cupin-like domain has diverged considerably and has closer alignment with C-terminal pirin. YhaK is found in low abundance in the cytosol of E. coli and is strongly up-regulated by nitroso-glutathione (GSNO). There are major structural differences at the N-terminus of YhaK compared with YhhW; YhaK lacks the canonical cupin metal-binding residues of pirins and may be involved in chloride binding and/or sensing of oxidative stress in enterobacteria. YhaK showed no quercetinase or peroxidase activity; however, reduced YhaK was very sensitive to reactive oxygen species (ROS). Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold.	119
239237	cd02911	arch_FMN	Archaeal FMN-binding domain. This family of archaeal proteins is part of the NAD(P)H-dependent flavin oxidoreductase (oxidored) FMN-binding family, whose members reduce a range of alternative electron acceptors. Most use FAD/FMN as a cofactor and NAD(P)H as electron donor. Some contain a 4Fe-4S cluster to transfer electrons from FAD to FMN. The specific function of this group is unknown.	233
239238	cd02922	FCB2_FMN	Flavocytochrome b2 (FCB2) FMN-binding domain.  FCB2 (AKA L-lactate:cytochrome c oxidoreductase) is a respiratory enzyme located in the intermembrane space of fungal mitochondria which catalyzes the oxidation of L-lactate to pyruvate. FCB2 also participates in a short electron-transport chain involving cytochrome c and cytochrome oxidase which ultimately directs the reducing equivalents gained from L-lactate oxidation to oxygen, yielding one molecule of ATP for every L-lactate molecule consumed. FCB2 is composed of 2 domains: a C-terminal flavin-binding domain, which includes the active site for lactate oxidation, and an N-terminal b2-cytochrome domain, required for efficient cytochrome c reduction. FCB2 is a homotetramer and contains two noncovalently bound cofactors, FMN and heme, per subunit.	344
239239	cd02929	TMADH_HD_FMN	Trimethylamine dehydrogenase (TMADH) and histamine dehydrogenase (HD) FMN-binding domain.  TMADH is an iron-sulfur flavoprotein that catalyzes the oxidative demethylation of trimethylamine to form dimethylamine and formaldehyde. The protein forms a symmetrical dimer with each subunit containing one 4Fe-4S cluster and one FMN cofactor.  It contains a unique flavin, in the form of a 6-S-cysteinyl FMN, which is bent by ~25 degrees along the N5-N10 axis of the flavin isoalloxazine ring. This modification of the conformation of the flavin is thought to facilitate catalysis. The closely related histamine dehydrogenase catalyzes oxidative deamination of histamine.	370
239240	cd02930	DCR_FMN	2,4-dienoyl-CoA reductase (DCR) FMN-binding domain.  DCR in E. coli is an iron-sulfur flavoenzyme which contains FMN, FAD, and a 4Fe-4S cluster. It is also a monomer, unlike its eukaryotic counterparts, which form homotetramers and lack the flavin and iron-sulfur cofactors. Metabolism of unsaturated fatty acids requires auxiliary enzymes in addition to those used in beta-oxidation. After a given number of cycles through the beta-oxidation pathway, those unsaturated fatty acyl-CoAs with double bonds at even-numbered carbon positions contain 2-trans, 4-cis double bonds that cannot be modified by enoyl-CoA hydratase. DCR utilizes NADPH to remove the C4-C5 double bond. DCR can catalyze the reduction of both natural fatty acids with cis double bonds, as well as substrates containing trans double bonds. The reaction is initiated by hydride transfer from NADPH to FAD, which in turn transfers electrons, one at a time, to FMN via the 4Fe-4S cluster. The fully reduced FMN provides a hydride ion to the C5 atom of substrate, and Tyr and His are proposed to form a catalytic dyad that protonates the C4 atom of the substrate and completes the reaction.	353
239241	cd02931	ER_like_FMN	Enoate reductase (ER)-like FMN-binding domain.  Enoate reductase catalyzes the NADH-dependent reduction of carbon-carbon double bonds of several molecules, including nonactivated 2-enoates, alpha,beta-unsaturated aldehydes, cyclic ketones, and methylketones. ERs are similar to 2,4-dienoyl-CoA reductase from E. coli and to the old yellow enzyme from Saccharomyces cerevisiae.	382
239242	cd02932	OYE_YqiM_FMN	Old yellow enzyme (OYE) YqjM-like FMN binding domain. YqjM is involved in the oxidative stress response of Bacillus subtilis.  Like the other OYE members, each monomer of YqjM contains FMN as a non-covalently bound cofactor and uses NADPH as a reducing agent.   The YqjM enzyme exists as a homotetramer that is assembled as a dimer of catalytically dependent dimers, while other OYE members exist only as monomers or dimers. Moreover, the protein displays a shared active site architecture where an arginine finger at the COOH terminus of one monomer extends into the active site of the adjacent monomer and is directly involved in substrate recognition. Another remarkable difference in the binding of the ligand in YqjM is represented by the contribution of the NH2-terminal tyrosine instead of a COOH-terminal tyrosine in OYE and its homologs.	336
239243	cd02933	OYE_like_FMN	Old yellow enzyme (OYE)-like FMN binding domain. OYE was the first flavin-dependent enzyme identified; however, its true physiological role remains elusive to this day. Each monomer of OYE contains FMN as a non-covalently bound cofactor and uses NADPH as a reducing agent; oxygen, quinones, and alpha,beta-unsaturated aldehydes and ketones can act as electron acceptors in the catalytic reaction.  Members of the OYE family include 12-oxophytodienoate reductase, pentaerythritol tetranitrate reductase, morphinone reductase, and related enzymes.	338
239244	cd02940	DHPD_FMN	Dihydropyrimidine dehydrogenase (DHPD) FMN-binding domain.  DHPD catalyzes the first step in pyrimidine degradation: the NADPH-dependent reduction of uracil and thymine to the corresponding 5,6-dihydropyrimidines. DHPD contains two FAD, two FMN, and eight [4Fe-4S] clusters, arranged in two electron transfer chains that pass the dimer interface twice. Two of the Fe-S clusters show a hitherto unobserved coordination involving a glutamine residue.	299
239245	cd02947	TRX_family	TRX family; composed of two groups: Group I, which includes proteins that exclusively encode a TRX domain; and Group II, which are composed of fusion proteins of TRX and additional domains. Group I TRX is a small ancient protein that alters the redox state of target proteins via the reversible oxidation of an active site dithiol, present in a CXXC motif, partially exposed at the protein's surface. TRX reduces protein disulfide bonds, resulting in a disulfide bond at its active site. Oxidized TRX is converted to the active form by TRX reductase, using reducing equivalents derived from either NADPH or ferredoxins. By altering their redox state, TRX regulates the functions of at least 30 target proteins, some of which are enzymes and transcription factors. It also plays an important role in the defense against oxidative stress by directly reducing hydrogen peroxide and certain radicals, and by serving as a reductant for peroxiredoxins. At least two major types of functional TRXs have been reported in most organisms; in eukaryotes, they are located in the cytoplasm and the mitochondria. Higher plants contain more types (at least 20 TRX genes have been detected in the genome of Arabidopsis thaliana), two of which (types f and m) are located in the same compartment, the chloroplast. Also included in the alignment are TRX-like domains which show sequence homology to TRX but do not contain the redox active CXXC motif. Group II proteins, in addition to either a redox active TRX or a TRX-like domain, also contain additional domains, which may or may not possess homology to known proteins.	93
239246	cd02948	TRX_NDPK	TRX domain, TRX and NDP-kinase (NDPK) fusion protein family; most members of this group are fusion proteins which contain one redox active TRX domain containing a CXXC motif and three NDPK domains, and are characterized as intermediate chains (ICs) of axonemal outer arm dynein. Dyneins are molecular motors that generate force against microtubules to produce cellular movement, and are divided into two classes: axonemal and cytoplasmic. They are supramolecular complexes consisting of three protein groups classified according to size: dynein heavy, intermediate and light chains. Axonemal dyneins form two structures, the inner and outer arms, which are attached to doublet microtubules throughout the cilia and flagella. The human homolog is the sperm-specific Sptrx-2, presumed to be a  component of the human sperm axoneme architecture. Included in this group is another human protein, TRX-like protein 2, a smaller fusion protein containing one TRX and one NDPK domain, which is also associated with microtubular structures. The other members of this group are hypothetical insect proteins containing a TRX domain and outer arm dynein light chains (14 and 16kDa) of Chlamydomonas reinhardtii. Using standard assays, the fusion proteins have shown no TRX enzymatic activity.	102
239247	cd02949	TRX_NTR	TRX domain, novel NADPH thioredoxin reductase (NTR) family; composed of fusion proteins found only in oxygenic photosynthetic organisms containing both TRX and NTR domains. The TRX domain functions as a protein disulfide reductase via the reversible oxidation of an active center dithiol present in a CXXC motif, while the NTR domain functions as a reductant to oxidized TRX. The fusion protein is  bifunctional, showing both TRX and NTR activities, but it is not an independent NTR/TRX system. In plants, the protein is found exclusively in shoots and mature leaves and is localized in the chloroplast. It is involved in plant protection against oxidative stress.	97
239248	cd02950	TxlA	TRX-like protein A (TxlA) family; TxlA was originally isolated from the cyanobacterium Synechococcus. It is found only in oxygenic photosynthetic organisms. TRX is a small enzyme that participates in redox reactions, via the reversible oxidation of an active site dithiol present in a CXXC motif. Disruption of the txlA gene suggests that the protein is involved in the redox regulation of the structure and function of the photosynthetic apparatus. The plant homolog (designated as HCF164) is localized in the chloroplast and is involved in the assembly of the cytochrome b6f complex, which takes a central position in photosynthetic electron transport.	142
239249	cd02951	SoxW	SoxW family; SoxW is a bacterial periplasmic TRX, containing a redox active CXXC motif, encoded by a genetic locus (sox operon) involved in thiosulfate oxidation. Sulfur bacteria oxidize sulfur compounds to provide reducing equivalents for carbon dioxide fixation during autotrophic growth and the respiratory electron transport chain. It is unclear what the role of SoxW is, since it has been found to be dispensable in the oxidation of thiosulfate to sulfate. SoxW is specifically kept in the reduced state by SoxV, which is essential in thiosulfate oxidation.	125
239250	cd02952	TRP14_like	Human TRX-related protein 14 (TRP14)-like family; composed of proteins similar to TRP14, a 14kD cytosolic protein that shows disulfide reductase activity in vitro with a different substrate specificity compared with another human cytosolic protein, TRX1. TRP14 catalyzes the reduction of small disulfide-containing peptides but does not reduce disulfides of ribonucleotide reductase, peroxiredoxin and methionine sulfoxide reductase, which are TRX1 substrates. TRP14 also plays a role in tumor necrosis factor (TNF)-alpha signaling pathways, distinct from that of TRX1. Its depletion promoted TNF-alpha induced activation of c-Jun N-terminal kinase and mitogen-activated protein kinases.	119
239251	cd02953	DsbDgamma	DsbD gamma family; DsbD gamma is the C-terminal periplasmic domain of the bacterial protein DsbD. It contains a CXXC motif in a TRX fold and shuttles the reducing potential from the membrane domain (DsbD beta) to the N-terminal periplasmic domain (DsbD alpha).  DsbD beta, a transmembrane domain comprising eight helices, acquires its reducing potential from the cytoplasmic thioredoxin. DsbD alpha transfers the acquired reducing potential from DsbD gamma to target proteins such as the periplasmic protein disulphide isomerases, DsbC and DsbG. This flow of reducing potential from the cytoplasm through DsbD allows DsbC and DsbG to act as isomerases in the oxidizing environment of the bacterial periplasm. DsbD also transfers reducing potential from the cytoplasm to specific reductases in the periplasm which are involved in the maturation of cytochromes.	104
239252	cd02954	DIM1	Dim1 family; Dim1 is also referred to as U5 small nuclear ribonucleoprotein particle (snRNP)-specific 15kD protein. It is a component of U5 snRNP, which pre-assembles with U4/U6 snRNPs to form a [U4/U6:U5] tri-snRNP complex required for pre-mRNA splicing. Dim1 interacts with multiple splicing-associated proteins, suggesting that it functions at multiple control points in the splicing of pre-mRNA as part of a large spliceosomal complex involving many protein-protein interactions. U5 snRNP contains seven core proteins (common to all snRNPs) and nine U5-specific proteins, one of which is Dim1. Dim1 adopts a thioredoxin fold but does not contain the redox active CXXC motif. It is essential for G2/M phase transition, as a consequence of its role in pre-mRNA splicing.	114
239253	cd02955	SSP411	TRX domain, SSP411 protein family; members of this family are highly conserved proteins present in eukaryotes, bacteria and archaea, about 600-800 amino acids in length, which contain a TRX domain with a redox active CXXC motif. The human/rat protein, called SSP411, is specifically expressed in the testis in an age-dependent manner. The SSP411 mRNA is increased during spermiogenesis and is localized in round and elongated spermatids, suggesting a function in fertility regulation.	124
239254	cd02956	ybbN	ybbN protein family; ybbN is a hypothetical protein containing a redox-inactive TRX-like domain. Its gene has been sequenced from several gammaproteobacteria and actinobacteria.	96
239255	cd02957	Phd_like	Phosducin (Phd)-like family; composed of Phd and Phd-like proteins (PhLP), characterized as cytosolic regulators of G protein functions. Phd and PhLPs specifically bind G protein betagamma (Gbg)-subunits with high affinity, resulting in the solubilization of Gbg from the plasma membrane and impeding G protein-mediated signal transduction by inhibiting the formation of a functional G protein trimer (G protein alphabetagamma). Phd also inhibits the GTPase activity of G protein alpha. Phd can be phosphorylated by protein kinase A and G protein-coupled receptor kinase 2, leading to its inactivation. Phd was originally isolated from the retina, where it is highly expressed and has been implicated to play an important role in light adaptation. It is also found in the pineal gland, liver, spleen, striated muscle and the brain. The C-terminal domain of Phd adopts a thioredoxin fold, but it does not contain a CXXC motif. Phd interacts with G protein beta mostly through the N-terminal helical domain. Also included in this family is a PhLP characterized as a viral inhibitor of apoptosis (IAP)-associated factor, named VIAF, that functions in caspase activation during apoptosis.	113
239256	cd02958	UAS	UAS family; UAS is a domain of unknown function. Most members of this family are uncharacterized proteins with similarity to FAS-associated factor 1 (FAF1) and ETEA because of the presence of a UAS domain N-terminal to a ubiquitin-associated UBX domain. FAF1 is a longer protein, compared to the other members of this family, having additional N-terminal domains, a ubiquitin-associated UBA domain and a nuclear targeting domain. FAF1 is an apoptotic signaling molecule that acts downstream in the Fas signal transduction pathway. It interacts with the cytoplasmic domain of Fas, but not with a Fas mutant that is deficient in signal transduction. ETEA is the protein product of a highly expressed gene in T-cells and eosinophils of atopic dermatitis patients. The presence of the ubiquitin-associated UBX domain in the proteins of this family suggests the possibility of their involvement in ubiquitination. Recently, FAF1 has been shown to interact with valosin-containing protein (VCP), which is involved in the ubiquitin-proteasome pathway. Some members of this family are uncharacterized proteins containing only a UAS domain.	114
239257	cd02959	ERp19	Endoplasmic reticulum protein 19 (ERp19) family; ERp19 is also known as ERp18, a protein located in the ER containing one redox active TRX domain. Denaturation studies indicate that the reduced form is more stable than the oxidized form, suggesting that the protein is involved in disulfide bond formation. In vitro, ERp19 has been shown to possess thiol-disulfide oxidase activity which is dependent on the presence of both active site cysteines. Although described as protein disulfide isomerase (PDI)-like, the protein does not complement for PDI activity. ERp19 shows a wide tissue distribution but is most abundant in liver, testis, heart and kidney.	117
239258	cd02960	AGR	Anterior Gradient (AGR) family; members of this family are similar to secreted proteins encoded by the cement gland-specific genes XAG-1 and XAG-2, expressed in the anterior region of dorsal ectoderm of Xenopus. They are implicated in the formation of the cement gland and the induction of forebrain fate. The human homologs, hAG-2 and hAG-3, are secreted proteins associated with estrogen-positive breast tumors. Yeast two-hybrid studies identified the metastasis-associated C4.4a protein and dystroglycan as binding partners, indicating possible roles in the development and progression of breast cancer. hAG-2 has also been implicated in prostate cancer. Its gene was cloned as an androgen-inducible gene and it was shown to be overexpressed in prostate cancer cells at the mRNA and protein levels. AGR proteins contain one conserved cysteine corresponding to the first cysteine in the CXXC motif of TRX. They show high sequence similarity to ERp19.	130
239259	cd02961	PDI_a_family	Protein Disulfide Isomerase (PDIa) family, redox active TRX domains; composed of eukaryotic proteins involved in oxidative protein folding in the endoplasmic reticulum (ER) by acting as catalysts and folding assistants. Members of this family include PDI and PDI-related proteins like ERp72, ERp57 (or ERp60), ERp44, P5, PDIR, ERp46 and the transmembrane PDIs. PDI, ERp57, ERp72, P5, PDIR and ERp46 are all oxidases, catalyzing the formation of disulfide bonds of newly synthesized polypeptides in the ER. They also exhibit reductase activity in acting as isomerases to correct any non-native disulfide bonds, as well as chaperone activity to prevent protein aggregation and facilitate the folding of newly synthesized proteins. These proteins usually contain multiple copies of a redox active TRX (a) domain containing a CXXC motif, and may also contain one or more redox inactive TRX-like (b) domains. Only one a domain is required for the oxidase function but multiple copies are necessary for the isomerase function. The different types of PDIs may show different substrate specificities and tissue-specific expression, or may be induced by stress. PDIs are in their reduced form at steady state and are oxidized to the active form by Ero1, which is localized in the ER through ERp44. Some members of this family also contain a DnaJ domain in addition to the redox active a domains; examples are ERdj5 and Pfj2. Also included in the family is the redox inactive N-terminal TRX-like domain of ERp29.	101
239260	cd02962	TMX2	TMX2 family; composed of proteins similar to human TMX2, a 372-amino acid TRX-related transmembrane protein, identified and characterized through the cloning of its cDNA from a human fetal library. It contains a TRX domain but the redox active CXXC motif is replaced with SXXC. Sequence analysis predicts that TMX2 may be a Type I membrane protein, with its C-terminal half protruding on the luminal side of the endoplasmic reticulum (ER). In addition to the TRX domain, transmembrane region and ER-retention signal, TMX2 also contains a Myb DNA-binding domain repeat signature and a dileucine motif in the tail.	152
239261	cd02963	TRX_DnaJ	TRX domain, DnaJ domain containing protein family; composed of uncharacterized proteins of about 500-800 amino acids, containing an N-terminal DnaJ domain followed by one redox active TRX domain. DnaJ is a member of the 40 kDa heat-shock protein (Hsp40) family of molecular chaperones, which regulate the activity of Hsp70s. TRX is involved in the redox regulation of many protein substrates through the reduction of disulfide bonds. TRX has been implicated to catalyse the reduction of Hsp33, a chaperone holdase that binds to unfolded protein intermediates. The presence of DnaJ and TRX domains in members of this family suggests that they could be involved in a redox-regulated chaperone network.	111
239262	cd02964	TryX_like_family	Tryparedoxin (TryX)-like family; composed of TryX and related proteins including nucleoredoxin (NRX), rod-derived cone viability factor (RdCVF) and the nematode homolog described as a 16-kD class of TRX. Most members of this family, except RdCVF, are protein disulfide oxidoreductases containing an active site CXXC motif, similar to TRX.	132
239263	cd02965	HyaE	HyaE family; HyaE is also called HupG and HoxO. They are proteins serving a critical role in the assembly of multimeric [NiFe] hydrogenases, the enzymes that catalyze the oxidation of molecular hydrogen to enable microorganisms to utilize hydrogen as the sole energy source. The E. coli HyaE protein is a chaperone that specifically interacts with the twin-arginine translocation (Tat) signal peptide of the [NiFe] hydrogenase-1 beta subunit precursor. Tat signal peptides target precursor proteins to the Tat protein export system, which facilitates the transport of fully folded proteins across the inner membrane. HyaE may be involved in regulating the traffic of [NiFe] hydrogenase-1 on the Tat transport pathway.	111
239264	cd02966	TlpA_like_family	TlpA-like family; composed of TlpA, ResA, DsbE and similar proteins. TlpA, ResA and DsbE are bacterial protein disulfide reductases with important roles in cytochrome maturation. They are membrane-anchored proteins with a soluble TRX domain containing a CXXC motif located in the periplasm. The TRX domains of this family contain an insert, approximately 25 residues in length, which corresponds to an extra alpha helix and a beta strand when compared with TRX. TlpA catalyzes an essential reaction in the biogenesis of cytochrome aa3, while ResA and DsbE are essential proteins in cytochrome c maturation. Also included in this family are proteins containing a TlpA-like TRX domain with domain architectures similar to E. coli DipZ protein, and the N-terminal TRX domain of PilB protein from Neisseria which acts as a disulfide reductase that can recycle methionine sulfoxide reductases.	116
239265	cd02967	mauD	Methylamine utilization (mau) D family; mauD protein is the translation product of the mauD gene found in methylotrophic bacteria, which are able to use methylamine as a sole carbon source and a nitrogen source. mauD is an essential accessory protein for the biosynthesis of methylamine dehydrogenase (MADH), the enzyme that catalyzes the oxidation of methylamine and other primary amines. MADH possesses an alpha2beta2 subunit structure; the alpha subunit is also referred to as the large subunit. Each beta (small) subunit contains a tryptophan tryptophylquinone (TTQ) prosthetic group. Accessory proteins are essential for the proper transport of MADH to the periplasm, TTQ synthesis and the formation of several structural disulfide bonds. Bacterial mutants containing an insertion on the mauD gene were unable to grow on methylamine as a sole carbon source, were found to lack the MADH small subunit and had decreased amounts of the MADH large subunit.	114
239266	cd02968	SCO	SCO (an acronym for Synthesis of Cytochrome c Oxidase) family; composed of proteins similar to Sco1, a membrane-anchored protein possessing a soluble domain with a TRX fold. Members of this family are required for the proper assembly of cytochrome c oxidase (COX). They contain a metal binding motif, typically CXXXC, which is located in a flexible loop. COX, the terminal enzyme in the respiratory chain, is imbedded in the inner mitochondrial membrane of all eukaryotes and in the plasma membrane of some prokaryotes. It is composed of two subunits, COX I and COX II. It has been proposed that Sco1 specifically delivers copper to the CuA site, a dinuclear copper center, of the COX II subunit. Mutations in human Sco1 and Sco2 cause fatal infantile hepatoencephalomyopathy and cardioencephalomyopathy, respectively. Both disorders are associated with severe COX deficiency in affected tissues. More recently, it has been argued that the redox sensitivity of the copper binding properties of Sco1 implies that it participates in signaling events rather than functioning as a chaperone that transfers copper to COX II.	142
239267	cd02969	PRX_like1	Peroxiredoxin (PRX)-like 1 family; hypothetical proteins that show sequence similarity to PRXs. Members of this group contain a conserved cysteine that aligns to the first cysteine in the CXXC motif of TRX. This does not correspond to the peroxidatic cysteine found in PRXs, which aligns to the second cysteine in the CXXC motif of TRX. In addition, these proteins do not contain the other two conserved residues of the catalytic triad of PRX. PRXs confer a protective antioxidant role in cells through their peroxidase activity in which hydrogen peroxide, peroxynitrate, and organic hydroperoxides are reduced and detoxified using reducing equivalents derived from either thioredoxin, glutathione, trypanothione and AhpF.	171
239268	cd02970	PRX_like2	Peroxiredoxin (PRX)-like 2 family; hypothetical proteins that show sequence similarity to PRXs. Members of this group contain a CXXC motif, similar to TRX. The second cysteine in the motif corresponds to the peroxidatic cysteine of PRX, however, these proteins do not contain the other two residues of the catalytic triad of PRX. PRXs confer a protective antioxidant role in cells through their peroxidase activity in which hydrogen peroxide, peroxynitrate, and organic hydroperoxides are reduced and detoxified using reducing equivalents derived from either thioredoxin, glutathione, trypanothione and AhpF. TRXs alter the redox state of target proteins by catalyzing the reduction of their disulfide bonds via the CXXC motif using reducing equivalents derived from either NADPH or ferredoxins.	149
239269	cd02971	PRX_family	Peroxiredoxin (PRX) family; composed of the different classes of PRXs including many proteins originally known as bacterioferritin comigratory proteins (BCP), based on their electrophoretic mobility before their function was identified. PRXs are thiol-specific antioxidant (TSA) proteins also known as TRX peroxidases and alkyl hydroperoxide reductase C22 (AhpC) proteins. They confer a protective antioxidant role in cells through their peroxidase activity in which hydrogen peroxide, peroxynitrate, and organic hydroperoxides are reduced and detoxified using reducing equivalents derived from either TRX, glutathione, trypanothione and AhpF. They are distinct from other peroxidases in that they have no cofactors such as metals or prosthetic groups. The first step of catalysis, common to all PRXs, is the nucleophilic attack by the catalytic cysteine (also known as the peroxidatic cysteine) on the peroxide leading to cleavage of the oxygen-oxygen bond and the formation of a cysteine sulfenic acid intermediate. The second step of the reaction, the resolution of the intermediate, distinguishes the different types of PRXs. The presence or absence of a second cysteine (the resolving cysteine) classifies PRXs as either belonging to the 2-cys or 1-cys type. The resolving cysteine of 2-cys PRXs is either on the same chain (atypical) or on the second chain (typical) of a functional homodimer. Structural and motif analysis of this growing family supports the need for a new classification system. The peroxidase activity of PRXs is regulated in vivo by irreversible cysteine over-oxidation into a sulfinic acid, phosphorylation and limited proteolysis.	140
239270	cd02972	DsbA_family	DsbA family; consists of DsbA and DsbA-like proteins, including DsbC, DsbG, glutathione (GSH) S-transferase kappa (GSTK), 2-hydroxychromene-2-carboxylate (HCCA) isomerase, an oxidoreductase (FrnE) presumed to be involved in frenolicin biosynthesis, a 27-kDa outer membrane protein, and similar proteins. Members of this family contain a redox active CXXC motif (except GSTK and HCCA isomerase) imbedded in a TRX fold, and an alpha helical insert of about 75 residues (shorter in DsbC and DsbG) relative to TRX. DsbA is involved in the oxidative protein folding pathway in prokaryotes, catalyzing disulfide bond formation of proteins secreted into the bacterial periplasm. DsbC and DsbG function as protein disulfide isomerases and chaperones to correct non-native disulfide bonds formed by DsbA and prevent aggregation of incorrectly folded proteins.	98
239271	cd02973	TRX_GRX_like	Thioredoxin (TRX)-Glutaredoxin (GRX)-like family; composed of archaeal and bacterial proteins that show similarity to both TRX and GRX, including the C-terminal TRX-fold subdomain of Pyrococcus furiosus protein disulfide oxidoreductase (PfPDO). All members contain a redox-active CXXC motif and may function as PDOs. The archaeal proteins Mj0307 and Mt807 show structures more similar to GRX, but activities more similar to TRX. Some members of the family are similar to PfPDO in that they contain a second CXXC motif located in a second TRX-fold subdomain at the N-terminus; the superimposable N- and C-terminal TRX subdomains form a compact structure. PfPDO is postulated to be the archaeal counterpart of bacterial DsbA and eukaryotic protein disulfide isomerase (PDI). The C-terminal CXXC motif of PfPDO is required for its oxidase, reductase and isomerase activities. Also included in the family is the C-terminal TRX-fold subdomain of the N-terminal domain (NTD) of bacterial AhpF, which has a similar fold as PfPDO with two TRX-fold subdomains but without the second CXXC motif.	67
239272	cd02974	AhpF_NTD_N	Alkyl hydroperoxide reductase F subunit (AhpF) N-terminal domain (NTD) family, N-terminal TRX-fold subdomain; AhpF is a homodimeric flavoenzyme which catalyzes the NADH-dependent reduction of the peroxiredoxin AhpC, which in turn catalyzes the reduction of hydrogen peroxide and organic hydroperoxides. AhpF contains an NTD forming two contiguous TRX-fold subdomains similar to Pyrococcus furiosus protein disulfide oxidoreductase (PfPDO). It also contains a catalytic core similar to TRX reductase containing FAD and NADH binding domains with an active site disulfide. The proposed mechanism of action of AhpF is similar to a TRX/TRX reductase system. The flow of reducing equivalents goes from NADH -> catalytic core of AhpF -> NTD of AhpF -> AhpC -> peroxide substrates. The N-terminal TRX-fold subdomain of AhpF NTD is redox inactive, but is proposed to contain an important residue that aids in the catalytic function of the redox-active CXXC motif contained in the C-terminal TRX-fold subdomain.	94
239273	cd02975	PfPDO_like_N	Pyrococcus furiosus protein disulfide oxidoreductase (PfPDO)-like family, N-terminal TRX-fold subdomain; composed of proteins with similarity to PfPDO, a redox active thermostable protein believed to be the archaeal counterpart of bacterial DsbA and eukaryotic protein disulfide isomerase (PDI), which are both involved in oxidative protein folding. PfPDO contains two redox active CXXC motifs in two contiguous TRX-fold subdomains. The active site in the N-terminal TRX-fold subdomain is required for isomerase but not for reductase activity of PfPDO. The exclusive presence of PfPDO-like proteins in extremophiles may suggest that they have a special role in adaptation to extreme conditions.	113
239274	cd02976	NrdH	NrdH-redoxin (NrdH) family; NrdH is a small monomeric protein with a conserved redox active CXXC motif within a TRX fold, characterized by a glutaredoxin (GRX)-like sequence and TRX-like activity profile. In vitro, it displays protein disulfide reductase activity that is dependent on TRX reductase, not glutathione (GSH). It is part of the NrdHIEF operon, where NrdEF codes for class Ib ribonucleotide reductase (RNR-Ib), an efficient enzyme at low oxygen levels. Under these conditions when GSH is mostly conjugated to spermidine, NrdH can still function and act as a hydrogen donor for RNR-Ib. It has been suggested that the NrdHEF system may be the oldest RNR reducing system, capable of functioning in a microaerophilic environment, where GSH was not yet available. NrdH from Corynebacterium ammoniagenes can form domain-swapped dimers, although it is unknown if this happens in vivo. Domain-swapped dimerization, which results in the blocking of the TRX reductase binding site, could be a mechanism for regulating the oxidation state of the protein.	73
239275	cd02977	ArsC_family	Arsenate Reductase (ArsC) family; composed of TRX-fold arsenic reductases and similar proteins including the transcriptional regulator, Spx. ArsC catalyzes the reduction of arsenate [As(V)] to arsenite [As(III)], using reducing equivalents derived from glutathione (GSH) via glutaredoxin (GRX), through a single catalytic cysteine. This family of predominantly bacterial enzymes is unrelated to two other families of arsenate reductases which show similarity to low-molecular-weight acid phosphatases and phosphotyrosyl phosphatases. Spx is a general regulator that exerts negative and positive control over transcription initiation by binding to the C-terminal domain of the alpha subunit of RNA polymerase.	105
239276	cd02978	KaiB_like	KaiB-like family; composed of the circadian clock proteins, KaiB and the N-terminal KaiB-like sensory domain of SasA. KaiB is an essential protein in maintaining circadian rhythm. It was originally discovered from the cyanobacterium Synechococcus as part of the circadian clock gene cluster, kaiABC. KaiB attenuates KaiA-enhanced KaiC autokinase activity by interacting with KaiA-KaiC complexes in a circadian fashion. KaiB is membrane-associated as well as cytosolic. The amount of membrane-associated protein peaks in the evening (at circadian time (CT) 12-16) while the cytosolic form peaks later (at CT 20). The rhythmic localization of KaiB may function in regulating the formation of Kai complexes. SasA is a sensory histidine kinase which associates with KaiC. Although it is not an essential oscillator component, it is important in enhancing kaiABC expression and is important in metabolic growth control under day/night cycle conditions. SasA contains an N-terminal sensory domain with a TRX fold which  is involved in the SasA-KaiC interaction. This domain shows high sequence similarity with KaiB. However, the KaiB structure does not show a classical TRX fold. The N-terminal half of KaiB shares the same beta-alpha-beta topology as TRX, but the topology of its C-terminal half diverges.	72
239277	cd02979	PHOX_C	FAD-dependent Phenol hydroxylase (PHOX) family, C-terminal TRX-fold domain; composed of proteins similar to PHOX from the aerobic topsoil yeast Trichosporon cutaneum. PHOX is a flavoprotein monooxygenase that catalyzes the hydroxylation of phenol and simple phenol derivatives in the ortho position with the consumption of NADPH and oxygen. This is the first step in the biodegradation and detoxification of phenolic compounds. PHOX contains three domains. The substrate and FAD/NAD(P) binding sites are contained in the first two domains, which adopt a complicated folding pattern. The third or C-terminal domain contains a TRX fold and is involved in dimerization. The functional unit of PHOX is a dimer, although active tetramers of the recombinant enzyme can be isolated when overproduced in bacteria.	167
239278	cd02980	TRX_Fd_family	Thioredoxin (TRX)-like [2Fe-2S] Ferredoxin (Fd) family; composed of [2Fe-2S] Fds with a TRX fold (TRX-like Fds) and proteins containing domains similar to TRX-like Fd including formate dehydrogenases, NAD-reducing hydrogenases and the subunit E of NADH:ubiquinone oxidoreductase (NuoE). TRX-like Fds are soluble low-potential electron carriers containing a single [2Fe-2S] cluster. The exact role of TRX-like Fd is still unclear. It has been suggested that it may be involved in nitrogen fixation. Its homologous domains in large redox enzymes (such as Nuo and hydrogenases) function as electron carriers.	77
239279	cd02981	PDI_b_family	Protein Disulfide Isomerase (PDIb) family, redox inactive TRX-like domain b; composed of eukaryotic proteins involved in oxidative protein folding in the endoplasmic reticulum (ER) by acting as catalysts and folding assistants. Members of this family include PDI, calsequestrin and other PDI-related proteins like ERp72, ERp57, ERp44 and PDIR. PDI, ERp57 (or ERp60), ERp72 and PDIR are all oxidases, catalyzing the formation of disulfide bonds of newly synthesized polypeptides in the ER. They also exhibit reductase activity in acting as isomerases to correct any non-native disulfide bonds, as well as chaperone activity to prevent protein aggregation and facilitate the folding of newly synthesized proteins. These proteins contain multiple copies of a redox active TRX (a) domain containing a CXXC motif, and one or more redox inactive TRX-like (b) domains. The molecular structure of PDI is abb'a'. Also included in this family is the PDI-related protein ERp27, which contains only redox-inactive TRX-like (b and b') domains. The redox inactive b domains are implicated in substrate recognition.	97
239280	cd02982	PDI_b'_family	Protein Disulfide Isomerase (PDIb') family, redox inactive TRX-like domain b'; composed of eukaryotic proteins involved in oxidative protein folding in the endoplasmic reticulum (ER) by acting as catalysts and folding assistants. Members of this family include PDI, calsequestrin and other PDI-related proteins like ERp72, ERp57 (or ERp60), ERp44, P5 and PDIR. PDI, ERp57, ERp72, P5 and PDIR are all oxidases, catalyzing the formation of disulfide bonds of newly synthesized polypeptides in the ER. They also exhibit reductase activity in acting as isomerases to correct any non-native disulfide bonds, as well as chaperone activity to prevent protein aggregation and facilitate the folding of newly synthesized proteins. These proteins contain multiple copies of a redox active TRX (a) domain containing a CXXC motif, and one or more redox inactive TRX-like (b) domains. The molecular structure of PDI is abb'a'. Also included in this family is the PDI-related protein ERp27, which contains only redox-inactive TRX-like (b and b') domains. The redox inactive domains are implicated in substrate recognition with the b' domain serving as the primary substrate binding site. Only the b' domain is necessary for the binding of small peptide substrates. In addition to the b' domain, other domains are required for the binding of larger polypeptide substrates. The b' domain is also implicated in chaperone activity.	103
239281	cd02983	P5_C	P5 family, C-terminal redox inactive TRX-like domain; P5 is a protein disulfide isomerase (PDI)-related protein with a domain structure of aa'b (where a and a' are redox active TRX domains and b is a redox inactive TRX-like domain). Like PDI, P5 is located in the endoplasmic reticulum (ER) and displays both isomerase and chaperone activities, which are independent of each other. Compared to PDI, the isomerase and chaperone activities of P5 are lower. The first cysteine in the CXXC motif of both redox active domains in P5 is necessary for isomerase activity. The P5 gene was first isolated as an amplified gene from a hydroxyurea-resistant hamster cell line. The zebrafish P5 homolog has been implicated to play a critical role in establishing left/right asymmetries in the embryonic midline. The C-terminal domain is likely involved in substrate binding, similar to the b and b' domains of PDI.	130
239282	cd02984	TRX_PICOT	TRX domain, PICOT (for PKC-interacting cousin of TRX) subfamily; PICOT is a protein that interacts with protein kinase C (PKC) theta, a calcium independent PKC isoform selectively expressed in skeletal muscle and T lymphocytes. PICOT contains an N-terminal TRX-like domain, which does not contain the catalytic CXXC motif, followed by one to three glutaredoxin domains. The TRX-like domain is required for interaction with PKC theta. PICOT inhibits the activation of c-Jun N-terminal kinase and the transcription factors, AP-1 and NF-kB, induced by PKC theta or T-cell activating stimuli.	97
239283	cd02985	TRX_CDSP32	TRX family, chloroplastic drought-induced stress protein of 32 kD (CDSP32); CDSP32 is composed of two TRX domains, a C-terminal TRX domain which contains a redox active CXXC motif and an N-terminal TRX-like domain which contains an SXXS sequence instead of the redox active motif. CDSP32 is a stress-inducible TRX, i.e., it acts as a TRX by reducing protein disulfides and is induced by environmental and oxidative stress conditions. It plays a critical role in plastid defense against oxidative damage, a role related to its function as a physiological electron donor to BAS1, a plastidic 2-cys peroxiredoxin. Plants lacking CDSP32 exhibit decreased photosystem II photochemical efficiencies and chlorophyll retention compared to WT controls, as well as an increased proportion of BAS1 in its overoxidized monomeric form.	103
239284	cd02986	DLP	Dim1 family, Dim1-like protein (DLP) subfamily; DLP is a novel protein which shares 38% sequence identity to Dim1. Like Dim1, it is also implicated in pre-mRNA splicing and cell cycle progression. DLP is located in the nucleus and has been shown to interact with the U5 small nuclear ribonucleoprotein particle (snRNP)-specific 102kD protein (or Prp6). Dim1 protein, also known as U5 snRNP-specific 15kD protein is a component of U5 snRNP, which pre-assembles with U4/U6 snRNPs to form a [U4/U6:U5] tri-snRNP complex required for pre-mRNA splicing. Dim1 adopts a thioredoxin fold but does not contain the redox active CXXC motif.	114
239285	cd02987	Phd_like_Phd	Phosducin (Phd)-like family, Phd subfamily; Phd is a cytosolic regulator of G protein functions. It specifically binds G protein betagamma (Gbg)-subunits with high affinity, resulting in the solubilization of Gbg from the plasma membrane. This impedes the formation of a functional G protein trimer (G protein alphabetagamma), thereby inhibiting G protein-mediated signal transduction. Phd also inhibits the GTPase activity of G protein alpha. Phd can be phosphorylated by protein kinase A and G protein-coupled receptor kinase 2, leading to its inactivation. Phd was originally isolated from the retina, where it is highly expressed and has been implicated to play an important role in light adaptation. It is also found in the pineal gland, liver, spleen, striated muscle and the brain. The C-terminal domain of Phd adopts a thioredoxin fold, but it does not contain a CXXC motif. Phd interacts with G protein beta mostly through the N-terminal helical domain.	175
239286	cd02988	Phd_like_VIAF	Phosducin (Phd)-like family, Viral inhibitor of apoptosis (IAP)-associated factor (VIAF) subfamily; VIAF is a Phd-like protein that functions in caspase activation during apoptosis. It was identified as an IAP binding protein through a screen of a human B-cell library using a prototype IAP. VIAF lacks a consensus IAP binding motif and while it does not function as an IAP antagonist, it still plays a regulatory role in the complete activation of caspases. VIAF itself is a substrate for IAP-mediated ubiquitination, suggesting that it may be a target of IAPs in the prevention of cell death. The similarity of VIAF to Phd points to a potential role distinct from apoptosis regulation. Phd functions as a cytosolic regulator of G protein by specifically binding to G protein betagamma (Gbg)-subunits. The C-terminal domain of Phd adopts a thioredoxin fold, but it does not contain a CXXC motif. Phd interacts with G protein beta mostly through the N-terminal helical domain.	192
239287	cd02989	Phd_like_TxnDC9	Phosducin (Phd)-like family, Thioredoxin (TRX) domain containing protein 9 (TxnDC9) subfamily; composed of predominantly uncharacterized eukaryotic proteins, containing a TRX-like domain without the redox active CXXC motif. The gene name for the human protein is TxnDC9. The two characterized members are described as Phd-like proteins, PLP1 of Saccharomyces cerevisiae and PhLP3 of Dictyostelium discoideum. Gene disruption experiments show that both PLP1 and PhLP3 are non-essential proteins. Unlike Phd and most Phd-like proteins, members of this group do not contain the Phd N-terminal helical domain which is implicated in binding to the G protein betagamma subunit.	113
239288	cd02990	UAS_FAF1	UAS family, FAS-associated factor 1 (FAF1) subfamily; FAF1 contains a UAS domain of unknown function N-terminal to a ubiquitin-associated UBX domain. FAF1 also contains ubiquitin-associated UBA and nuclear targeting domains, N-terminal to the UAS domain. FAF1 is an apoptotic signaling molecule that acts downstream in the Fas signal transduction pathway. It interacts with the cytoplasmic domain of Fas, but not with a Fas mutant that is deficient in signal transduction. It is widely expressed in adult and embryonic tissues, and in tumor cell lines, and is localized not only in the cytoplasm where it interacts with Fas, but also in the nucleus. FAF1 contains phosphorylation sites for protein kinase CK2 within the nuclear targeting domain. Phosphorylation influences nuclear localization of FAF1 but does not affect its potentiation of Fas-induced apoptosis. Other functions have also been attributed to FAF1. It inhibits nuclear factor-kB (NF-kB) by interfering with the nuclear translocation of the p65 subunit. FAF1 also interacts with valosin-containing protein (VCP), which is involved in the ubiquitin-proteasome pathway.	136
239289	cd02991	UAS_ETEA	UAS family, ETEA subfamily; composed of proteins similar to human ETEA protein, the translation product of a highly expressed gene in the T-cells and eosinophils of atopic dermatitis patients compared with those of normal individuals. ETEA shows homology to Fas-associated factor 1 (FAF1); both containing UAS and UBX (ubiquitin-associated) domains. Compared to FAF1, however, ETEA lacks the ubiquitin-associated UBA domain and a nuclear targeting domain. The function of ETEA is still unknown. A yeast two-hybrid assay showed that it can interact with Fas. Because of its homology to FAF1, it is postulated that ETEA could be involved in modulating Fas-mediated apoptosis of T-cells and eosinophils of atopic dermatitis patients, making them more resistant to apoptosis.	116
239290	cd02992	PDI_a_QSOX	PDIa family, Quiescin-sulfhydryl oxidase (QSOX) subfamily; QSOX is a eukaryotic protein containing an N-terminal redox active TRX domain, similar to that of PDI, and a small C-terminal flavin adenine dinucleotide (FAD)-binding domain homologous to the yeast ERV1p protein. QSOX oxidizes thiol groups to disulfides like PDI, however, unlike PDI, this oxidation is accompanied by the reduction of oxygen to hydrogen peroxide. QSOX is localized in high concentrations in cells with heavy secretory load and prefers peptides and proteins as substrates, not monothiols like glutathione. Inside the cell, QSOX is found in the endoplasmic reticulum and Golgi. The flow of reducing equivalents in a QSOX-catalyzed reaction goes from the dithiol substrate -> dithiol of the QSOX TRX domain -> dithiols of the QSOX ERV1p domain -> FAD -> oxygen.	114
239291	cd02993	PDI_a_APS_reductase	PDIa family, 5'-Adenylylsulfate (APS) reductase subfamily; composed of plant-type APS reductases containing a C-terminal redox active TRX domain and an N-terminal reductase domain which is part of a superfamily that includes N type ATP PPases. APS reductase catalyzes the reduction of activated sulfate to sulfite, a key step in the biosynthesis of sulfur-containing metabolites. Sulfate is first activated by ATP sulfurylase, forming APS, which can be phosphorylated to 3'-phosphoadenosine-5'-phosphosulfate (PAPS). Depending on the organism, either APS or PAPS can be used for sulfate reduction. Prokaryotes and fungi use PAPS, whereas plants use both APS and PAPS. Since plant-type APS reductase uses glutathione (GSH) as its electron donor, the C-terminal domain may function like glutaredoxin, a GSH-dependent member of the TRX superfamily. The flow of reducing equivalents goes from GSH -> C-terminal TRX domain -> N-terminal reductase domain -> APS. Plant-type APS reductase shows no homology to that of dissimilatory sulfate-reducing bacteria, which is an iron-sulfur flavoenzyme. Also included in the alignment is EYE2 from Chlamydomonas reinhardtii, a protein required for eyespot assembly.	109
239292	cd02994	PDI_a_TMX	PDIa family, TMX subfamily; composed of proteins similar to the TRX-related human transmembrane protein, TMX. TMX is a type I integral membrane protein; the N-terminal redox active TRX domain is present in the endoplasmic reticulum (ER) lumen while the C-terminus is oriented towards the cytoplasm. It is expressed in many cell types and its active site motif (CPAC) is unique. In vitro, TMX reduces interchain disulfides of insulin and renatures inactive RNase containing incorrect disulfide bonds. The C. elegans homolog, DPY-11, is expressed only in the hypodermis and resides in the cytoplasm. It is required for body and sensory organ morphogenesis. Another uncharacterized TRX-related transmembrane protein, human TMX4, is included in the alignment. The active site sequence of TMX4 is CPSC.	101
239293	cd02995	PDI_a_PDI_a'_C	PDIa family, C-terminal TRX domain (a') subfamily; composed of the C-terminal redox active a' domains of PDI, ERp72, ERp57 (or ERp60) and EFP1. PDI, ERp72 and ERp57 are endoplasmic reticulum (ER)-resident eukaryotic proteins involved in oxidative protein folding. They are oxidases, catalyzing the formation of disulfide bonds of newly synthesized polypeptides in the ER. They also exhibit reductase activity in acting as isomerases to correct any non-native disulfide bonds, as well as chaperone activity to prevent protein aggregation and facilitate the folding of newly synthesized proteins. PDI and ERp57 have the abb'a' domain structure (where a and a' are redox active TRX domains while b and b' are redox inactive TRX-like domains). PDI also contains an acidic region (c domain) after the a' domain that is absent in ERp57. ERp72 has an additional a domain at the N-terminus (a"abb'a' domain structure). ERp57 interacts with the lectin chaperones, calnexin and calreticulin, and specifically promotes the oxidative folding of glycoproteins, while PDI shows a wider substrate specificity. ERp72 associates with several ER chaperones and folding factors to form complexes in the ER that bind nascent proteins. EFP1 is a binding partner protein of thyroid oxidase, which is responsible for the generation of hydrogen peroxide, a crucial substrate of thyroperoxidase, which functions to iodinate thyroglobulin and synthesize thyroid hormones.	104
239294	cd02996	PDI_a_ERp44	PDIa family, endoplasmic reticulum protein 44 (ERp44) subfamily; ERp44 is an ER-resident protein, induced during stress, involved in thiol-mediated ER retention. It contains an N-terminal TRX domain, similar to that of PDIa, with a CXFS motif followed by two redox inactive TRX-like domains, homologous to the b and b' domains of PDI. The CXFS motif in the N-terminal domain allows ERp44 to form stable reversible mixed disulfides with its substrates. Through this activity, ERp44 mediates the ER localization of Ero1alpha, a protein that oxidizes protein disulfide isomerases into their active form. ERp44 also prevents the secretion of unassembled cargo protein with unpaired cysteines. It also modulates the activity of inositol 1,4,5-triphosphate type I receptor (IP3R1), an intracellular channel protein that mediates calcium release from the ER to the cytosol.	108
239295	cd02997	PDI_a_PDIR	PDIa family, PDIR subfamily; composed of proteins similar to human PDIR (for Protein Disulfide Isomerase Related). PDIR is composed of three redox active TRX (a) domains and an N-terminal redox inactive TRX-like (b) domain. Similar to PDI, it is involved in oxidative protein folding in the endoplasmic reticulum (ER) through its isomerase and chaperone activities. These activities are lower compared to PDI, probably due to PDIR acting only on a subset of proteins. PDIR is preferentially expressed in cells actively secreting proteins and its expression is induced by stress. Similar to PDI, the isomerase and chaperone activities of PDIR are independent; CXXC mutants lacking isomerase activity retain chaperone activity.	104
239296	cd02998	PDI_a_ERp38	PDIa family, endoplasmic reticulum protein 38 (ERp38) subfamily; composed of proteins similar to the P5-like protein first isolated from alfalfa, which contains two redox active TRX (a) domains at the N-terminus, like human P5, and a C-terminal domain with homology to the C-terminal domain of ERp29, unlike human P5. The cDNA clone of this protein (named G1) was isolated from an alfalfa cDNA library by screening with human protein disulfide isomerase (PDI) cDNA. The G1 protein is constitutively expressed in all major organs of the plant and its expression is induced by treatment with tunicamycin, indicating that it may be a glucose-regulated protein. The G1 homolog in the eukaryotic social amoeba Dictyostelium discoideum is also described as a P5-like protein, which is located in the endoplasmic reticulum (ER) despite the absence of an ER-retrieval signal. G1 homologs from Aspergillus niger and Neurospora crassa have also been characterized, and are named TIGA and ERp38, respectively. Also included in the alignment is an atypical PDI from Leishmania donovani containing a single a domain, and the C-terminal a domain of a P5-like protein from Entamoeba histolytica.	105
239297	cd02999	PDI_a_ERp44_like	PDIa family, endoplasmic reticulum protein 44 (ERp44)-like subfamily; composed of uncharacterized PDI-like eukaryotic proteins containing only one redox active TRX (a) domain with a CXXS motif, similar to ERp44. CXXS is still a redox active motif; however, the mixed disulfide formed with the substrate is more stable than those formed by CXXC motif proteins. PDI-related proteins are usually involved in the oxidative protein folding in the ER by acting as catalysts and folding assistants. ERp44 is involved in thiol-mediated retention in the ER.	100
239298	cd03000	PDI_a_TMX3	PDIa family, TMX3 subfamily; composed of eukaryotic proteins similar to human TMX3, a TRX related transmembrane protein containing one redox active TRX domain at the N-terminus and a classical ER retrieval sequence for type I transmembrane proteins at the C-terminus. The TMX3 transcript is found in a variety of tissues with the highest levels detected in skeletal muscle and the heart. In vitro, TMX3 showed oxidase activity albeit slightly lower than that of protein disulfide isomerase.	104
239299	cd03001	PDI_a_P5	PDIa family, P5 subfamily; composed of eukaryotic proteins similar to human P5, a PDI-related protein with a domain structure of aa'b (where a and a' are redox active TRX domains and b is a redox inactive TRX-like domain). Like PDI, P5 is located in the endoplasmic reticulum (ER) and displays both isomerase and chaperone activities, which are independent of each other. Compared to PDI, the isomerase and chaperone activities of P5 are lower. The first cysteine in the CXXC motif of both redox active domains in P5 is necessary for isomerase activity. The P5 gene was first isolated as an amplified gene from a hydroxyurea-resistant hamster cell line. The zebrafish P5 homolog has been implicated to play a critical role in establishing left/right asymmetries in the embryonic midline. Some members of this subfamily are P5-like proteins containing only one redox active TRX domain.	103
239300	cd03002	PDI_a_MPD1_like	PDI family, MPD1-like subfamily; composed of eukaryotic proteins similar to Saccharomyces cerevisiae MPD1 protein, which contains a single redox active TRX domain located at the N-terminus, and an ER retention signal at the C-terminus indicative of an ER-resident protein. MPD1 has been shown to suppress the maturation defect of carboxypeptidase Y caused by deletion of the yeast PDI1 gene. Other characterized members of this subfamily include the Aspergillus niger prpA protein and Giardia PDI-1. PrpA is non-essential to strain viability; however, its transcript level is induced by heterologous protein expression, suggesting a possible role in oxidative protein folding during high protein production. Giardia PDI-1 has the ability to refold scrambled RNase and exhibits transglutaminase activity.	109
239301	cd03003	PDI_a_ERdj5_N	PDIa family, N-terminal ERdj5 subfamily; ERdj5, also known as JPDI and macrothioredoxin, is a protein containing an N-terminal DnaJ domain and four redox active TRX domains. This subfamily is composed of the first TRX domain of ERdj5 located after the DnaJ domain at the N-terminal half of the protein. ERdj5 is a ubiquitous protein localized in the endoplasmic reticulum (ER) and is abundant in secretory cells. Its transcription is induced during ER stress. It interacts with BiP through its DnaJ domain in an ATP-dependent manner. BiP, an ER-resident member of the Hsp70 chaperone family, functions in ER-associated degradation and protein translocation.	101
239302	cd03004	PDI_a_ERdj5_C	PDIa family, C-terminal ERdj5 subfamily; ERdj5, also known as JPDI and macrothioredoxin, is a protein containing an N-terminal DnaJ domain and four redox active TRX domains. This subfamily is composed of the three TRX domains located at the C-terminal half of the protein. ERdj5 is a ubiquitous protein localized in the endoplasmic reticulum (ER) and is abundant in secretory cells. Its transcription is induced during ER stress. It interacts with BiP through its DnaJ domain in an ATP-dependent manner. BiP, an ER-resident member of the Hsp70 chaperone family, functions in ER-associated degradation and protein translocation. Also included in the alignment is the single complete TRX domain of an uncharacterized protein from Tetraodon nigroviridis, which also contains a DnaJ domain at its N-terminus.	104
239303	cd03005	PDI_a_ERp46	PDIa family, endoplasmic reticulum protein 46 (ERp46) subfamily; ERp46 is an ER-resident protein containing three redox active TRX domains. Yeast complementation studies show that ERp46 can substitute for protein disulfide isomerase (PDI) function in vivo. It has been detected in many tissues; however, transcript and protein levels do not correlate in all tissues, suggesting regulation at a posttranscriptional level. An identical protein, named endoPDI, has been identified as an endothelial PDI that is highly expressed in the endothelium of tumors and hypoxic lesions. It has a protective effect on cells exposed to hypoxia.	102
239304	cd03006	PDI_a_EFP1_N	PDIa family, N-terminal EFP1 subfamily; EFP1 is a binding partner protein of thyroid oxidase (ThOX), also called Duox. ThOX proteins are responsible for the generation of hydrogen peroxide, a crucial substrate of thyroperoxidase, which functions to iodinate thyroglobulin and synthesize thyroid hormones. EFP1 was isolated through a yeast two-hybrid method using the EF-hand fragment of dog Duox1 as a bait. It could be one of the partners in the assembly of a multiprotein complex constituting the thyroid hydrogen peroxide generating system. EFP1 contains two TRX domains related to the redox active TRX domains of protein disulfide isomerase (PDI). This subfamily is composed of the N-terminal TRX domain of EFP1, which contains a CXXS sequence in place of the typical CXXC motif, similar to ERp44. The CXXS motif allows the formation of stable mixed disulfides, crucial for the ER-retention function of ERp44.	113
239305	cd03007	PDI_a_ERp29_N	PDIa family, endoplasmic reticulum protein 29 (ERp29) subfamily; ERp29 is a ubiquitous ER-resident protein expressed in high levels in secretory cells. It forms homodimers and higher oligomers in vitro and in vivo. It contains a redox inactive TRX-like domain at the N-terminus, which is homologous to the redox active TRX (a) domains of PDI, and a C-terminal helical domain similar to the C-terminal domain of P5. The expression profile of ERp29 suggests a role in secretory protein production distinct from that of PDI. It has also been identified as a member of the thyroglobulin folding complex. The Drosophila homolog, Wind, is the product of windbeutel, an essential gene in the development of dorsal-ventral patterning. Wind is required for correct targeting of Pipe, a Golgi-resident type II transmembrane protein with homology to 2-O-sulfotransferase.	116
239306	cd03008	TryX_like_RdCVF	Tryparedoxin (TryX)-like family, Rod-derived cone viability factor (RdCVF) subfamily; RdCVF is a thioredoxin (TRX)-like protein specifically expressed in photoreceptors. RdCVF was isolated and identified as a factor that supports cone survival in retinal cultures. Cone photoreceptor loss is responsible for the visual handicap resulting from the inherited disease, retinitis pigmentosa. RdCVF shows 33% similarity to TRX but does not exhibit any detectable thiol oxidoreductase activity.	146
239307	cd03009	TryX_like_TryX_NRX	Tryparedoxin (TryX)-like family, TryX and nucleoredoxin (NRX) subfamily; TryX and NRX are thioredoxin (TRX)-like protein disulfide oxidoreductases that alter the redox state of target proteins via the reversible oxidation of an active center CXXC motif. TryX is involved in the regulation of oxidative stress in parasitic trypanosomatids by reducing TryX peroxidase, which in turn catalyzes the reduction of hydrogen peroxide and organic hydroperoxides. TryX derives reducing equivalents from reduced trypanothione, a polyamine peptide conjugate unique to trypanosomatids, which is regenerated by the NADPH-dependent flavoprotein trypanothione reductase. Vertebrate NRX is a 400-amino acid nuclear protein with one redox active TRX domain containing a CPPC active site motif followed by one redox inactive TRX-like domain. Mouse NRX transcripts are expressed in all adult tissues but are restricted to the nervous system and limb buds in embryos. Plant NRX, longer than the vertebrate NRX by about 100-200 amino acids, is a nuclear protein containing a redox inactive TRX-like domain between two redox active TRX domains. Both vertebrate and plant NRXs show thiol oxidoreductase activity in vitro. Their localization in the nucleus suggests a role in the redox regulation of nuclear proteins such as transcription factors.	131
239308	cd03010	TlpA_like_DsbE	TlpA-like family, DsbE (also known as CcmG and CycY) subfamily; DsbE is a membrane-anchored, periplasmic TRX-like reductase containing a CXXC motif that specifically donates reducing equivalents to apocytochrome c via CcmH, another cytochrome c maturation (Ccm) factor with a redox active CXXC motif. Assembly of cytochrome c requires the ligation of heme to reduced thiols of the apocytochrome. In bacteria, this assembly occurs in the periplasm. The reductase activity of DsbE in the oxidizing environment of the periplasm is crucial in the maturation of cytochrome c.	127
239309	cd03011	TlpA_like_ScsD_MtbDsbE	TlpA-like family, suppressor for copper sensitivity D protein (ScsD) and actinobacterial DsbE homolog subfamily; composed of ScsD, the DsbE homolog of Mycobacterium tuberculosis (MtbDsbE) and similar proteins, all containing a redox-active CXXC motif. The Salmonella typhimurium ScsD is a thioredoxin-like protein which confers copper tolerance to copper-sensitive mutants of E. coli. MtbDsbE has been characterized as an oxidase in vitro, catalyzing the disulfide bond formation of substrates like hirudin. The reduced form of MtbDsbE is more stable than its oxidized form, consistent with an oxidase function. This is in contrast to the function of DsbE from gram-negative bacteria which is a specific reductase of apocytochrome c.	123
239310	cd03012	TlpA_like_DipZ_like	TlpA-like family, DipZ-like subfamily; composed of uncharacterized proteins containing a TlpA-like TRX domain. Some members show domain architectures similar to that of E. coli DipZ protein (also known as DsbD). The only eukaryotic members of the TlpA family belong to this subfamily. TlpA is a disulfide reductase known to have a crucial role in the biogenesis of cytochrome aa3.	126
239311	cd03013	PRX5_like	Peroxiredoxin (PRX) family, PRX5-like subfamily; members are similar to the human protein, PRX5, a homodimeric TRX peroxidase, widely expressed in tissues and found cellularly in mitochondria, peroxisomes and the cytosol. The cellular location of PRX5 suggests that it may have an important antioxidant role in organelles that are major sources of reactive oxygen species (ROS), as well as a role in the control of signal transduction. PRX5 has been shown to reduce hydrogen peroxide, alkyl hydroperoxides and peroxynitrite. As with all other PRXs, the N-terminal peroxidatic cysteine of PRX5 is oxidized into a sulfenic acid intermediate upon reaction with peroxides. Human PRX5 is able to resolve this intermediate by forming an intramolecular disulfide bond with its C-terminal cysteine (the resolving cysteine), which can then be reduced by TRX, just like an atypical 2-cys PRX. This resolving cysteine, however, is not conserved in other members of the subfamily. In such cases, it is assumed that the oxidized cysteine is directly resolved by an external small-molecule or protein reductant, typical of a 1-cys PRX. In the case of the H. influenzae PRX5 hybrid, the resolving glutaredoxin domain is on the same protein chain as PRX. PRX5 homodimers show an A-type interface, similar to atypical 2-cys PRXs.	155
239312	cd03014	PRX_Atyp2cys	Peroxiredoxin (PRX) family, Atypical 2-cys PRX subfamily; composed of PRXs containing peroxidatic and resolving cysteines, similar to the homodimeric thiol specific antioxidant (TSA) protein also known as TRX-dependent thiol peroxidase (Tpx). Tpx is a bacterial periplasmic peroxidase which differs from other PRXs in that it shows substrate specificity toward alkyl hydroperoxides over hydrogen peroxide. As with all other PRXs, the peroxidatic cysteine (N-terminal) of Tpx is oxidized into a sulfenic acid intermediate upon reaction with peroxides. Tpx is able to resolve this intermediate by forming an intramolecular disulfide bond with a conserved C-terminal cysteine (the resolving cysteine), which can then be reduced by thioredoxin. This differs from the typical 2-cys PRX which resolves the oxidized cysteine by forming an intermolecular disulfide bond with the resolving cysteine from the other subunit of the homodimer. Atypical 2-cys PRX homodimers have a loop-based interface (A-type for alternate), in contrast with the B-type interface of typical 2-cys and 1-cys PRXs.	143
239313	cd03015	PRX_Typ2cys	Peroxiredoxin (PRX) family, Typical 2-Cys PRX subfamily; PRXs are thiol-specific antioxidant (TSA) proteins, which confer a protective role in cells through their peroxidase activity by reducing hydrogen peroxide, peroxynitrite, and organic hydroperoxides. The functional unit of typical 2-cys PRX is a homodimer. A unique intermolecular redox-active disulfide center is utilized for its activity. Upon reaction with peroxides, its peroxidatic cysteine is oxidized into a sulfenic acid intermediate which is resolved by bonding with the resolving cysteine from the other subunit of the homodimer. This intermolecular disulfide bond is then reduced by thioredoxin, tryparedoxin or AhpF. Typical 2-cys PRXs, like 1-cys PRXs, form decamers which are stabilized by reduction of the active site cysteine. Typical 2-cys PRX interacts through beta strands at one edge of the monomer (B-type interface) to form the functional homodimer, and uses an A-type interface (similar to the dimeric interface in atypical 2-cys PRX and PRX5) at the opposite end of the monomer to form the stable decameric (pentamer of dimers) structure.	173
239314	cd03016	PRX_1cys	Peroxiredoxin (PRX) family, 1-cys PRX subfamily; composed of PRXs containing only one conserved cysteine, which serves as the peroxidatic cysteine. They are homodimeric thiol-specific antioxidant (TSA) proteins that confer a protective role in cells by reducing and detoxifying hydrogen peroxide, peroxynitrite, and organic hydroperoxides. As with all other PRXs, a cysteine sulfenic acid intermediate is formed upon reaction of 1-cys PRX with its substrates. Having no resolving cysteine, the oxidized enzyme is resolved by an external small-molecule or protein reductant such as thioredoxin or glutaredoxin. Similar to typical 2-cys PRX, 1-cys PRX forms a functional dimeric unit with a B-type interface, as well as a decameric structure which is stabilized in the reduced form of the enzyme. Other oligomeric forms, tetramers and hexamers, have also been reported. Mammalian 1-cys PRX is localized cellularly in the cytosol and is expressed at high levels in brain, eye, testes and lung. The seed-specific plant 1-cys PRXs protect tissues from reactive oxygen species during desiccation and are also called rehydrins.	203
239315	cd03017	PRX_BCP	Peroxiredoxin (PRX) family, Bacterioferritin comigratory protein (BCP) subfamily; composed of thioredoxin-dependent thiol peroxidases, widely expressed in pathogenic bacteria, that protect cells against toxicity from reactive oxygen species by reducing and detoxifying hydroperoxides. The protein was named BCP based on its electrophoretic mobility before its function was known. BCP shows substrate selectivity toward fatty acid hydroperoxides rather than hydrogen peroxide or alkyl hydroperoxides. BCP contains the peroxidatic cysteine but appears not to possess a resolving cysteine (some sequences, not all, contain a second cysteine but its role is still unknown). Unlike other PRXs, BCP exists as a monomer. The plant homolog of BCP is PRX Q, which is expressed only in leaves and is cellularly localized in the chloroplasts and the guard cells of stomata. Also included in this subfamily is the fungal nuclear protein, Dot5p (for disrupter of telomere silencing protein 5), which functions as an alkyl-hydroperoxide reductase during post-diauxic growth.	140
239316	cd03018	PRX_AhpE_like	Peroxiredoxin (PRX) family, AhpE-like subfamily; composed of proteins similar to Mycobacterium tuberculosis AhpE. AhpE is described as a 1-cys PRX because of the absence of a resolving cysteine. The structure and sequence of AhpE, however, show greater similarity to 2-cys PRXs than 1-cys PRXs. PRXs are thiol-specific antioxidant (TSA) proteins that confer a protective role in cells through their peroxidase activity in which hydrogen peroxide, peroxynitrite, and organic hydroperoxides are reduced and detoxified using reducing equivalents derived from either thioredoxin, glutathione, trypanothione or AhpF. The first step of catalysis is the nucleophilic attack by the peroxidatic cysteine on the peroxide leading to the formation of a cysteine sulfenic acid intermediate. The absence of a resolving cysteine suggests that functional AhpE is regenerated by an external reductant. The solution behavior and crystal structure of AhpE show that it forms dimers and octamers.	149
239317	cd03019	DsbA_DsbA	DsbA family, DsbA subfamily; DsbA is a monomeric thiol disulfide oxidoreductase protein containing a redox active CXXC motif imbedded in a TRX fold. It is involved in the oxidative protein folding pathway in prokaryotes, and is the strongest thiol oxidant known, due to the unusual stability of the thiolate anion form of the first cysteine in the CXXC motif. The highly unstable oxidized form of DsbA directly donates disulfide bonds to reduced proteins secreted into the bacterial periplasm. This rapid and unidirectional process helps to catalyze the folding of newly-synthesized polypeptides. To regain catalytic activity, reduced DsbA is then reoxidized by the membrane protein DsbB, which generates its disulfides from oxidized quinones, which in turn are reoxidized by the electron transport chain.	178
239318	cd03020	DsbA_DsbC_DsbG	DsbA family, DsbC and DsbG subfamily; V-shaped homodimeric proteins containing a redox active CXXC motif imbedded in a TRX fold. They function as protein disulfide isomerases and chaperones in the bacterial periplasm to correct non-native disulfide bonds formed by DsbA and prevent aggregation of incorrectly folded proteins. DsbC and DsbG are kept in their reduced state by the cytoplasmic membrane protein DsbD, which utilizes the TRX/TRX reductase system in the cytosol as a source of reducing equivalents. DsbG differs from DsbC in that it has a more limited substrate specificity, and it may preferentially act later in the folding process to catalyze disulfide rearrangements in folded or partially folded proteins. Also included in the alignment is the predicted protein TrbB, whose gene was sequenced from the enterohemorrhagic E. coli type IV pilus gene cluster, which is required for efficient plasmid transfer.	197
239319	cd03021	DsbA_GSTK	DsbA family, Glutathione (GSH) S-transferase Kappa (GSTK) subfamily; GSTK is a member of the GST family of enzymes which catalyzes the transfer of the thiol of GSH to electrophilic substrates. It is specifically located in the mitochondria and peroxisomes, unlike other members of the canonical GST family, which are mainly cytosolic. The biological substrates of GSTK are not yet known. It is presumed to have a protective role during respiration when large amounts of reactive oxygen species are generated. GSTK has the same general fold as DsbA, consisting of a thioredoxin domain interrupted by an alpha-helical domain and its biological unit is a homodimer. GSTK is closely related to the bacterial enzyme, 2-hydroxychromene-2-carboxylate (HCCA) isomerase. It shows little sequence similarity to the other members of the GST family.	209
239320	cd03022	DsbA_HCCA_Iso	DsbA family, 2-hydroxychromene-2-carboxylate (HCCA) isomerase subfamily; HCCA isomerase is a glutathione (GSH) dependent enzyme involved in the naphthalene catabolic pathway. It converts HCCA, a hemiketal formed spontaneously after ring cleavage of 1,2-dihydroxynaphthalene by a dioxygenase, into cis-o-hydroxybenzylidenepyruvate (cHBPA). This is the fourth reaction in a six-step pathway that converts naphthalene into salicylate. HCCA isomerase is unique to bacteria that degrade polycyclic aromatic compounds. It is closely related to the eukaryotic protein, GSH transferase kappa (GSTK).	192
239321	cd03023	DsbA_Com1_like	DsbA family, Com1-like subfamily; composed of proteins similar to Com1, a 27-kDa outer membrane-associated immunoreactive protein originally found in both acute and chronic disease strains of the pathogenic bacterium Coxiella burnetii. It contains a CXXC motif, assumed to be imbedded in a DsbA-like structure. Its homology to DsbA suggests that the protein is a protein disulfide oxidoreductase. The role of such a protein in pathogenesis is unknown.	154
239322	cd03024	DsbA_FrnE	DsbA family, FrnE subfamily; FrnE is a DsbA-like protein containing a CXXC motif. It is presumed to be a thiol oxidoreductase involved in polyketide biosynthesis, specifically in the production of the aromatic antibiotics frenolicin and nanaomycins.	201
239323	cd03025	DsbA_FrnE_like	DsbA family, FrnE-like subfamily; composed of uncharacterized proteins containing a CXXC motif with similarity to DsbA and FrnE. FrnE is presumed to be a thiol oxidoreductase involved in polyketide biosynthesis, specifically in the production of the aromatic antibiotics frenolicin and nanaomycins.	193
239324	cd03026	AhpF_NTD_C	TRX-GRX-like family, Alkyl hydroperoxide reductase F subunit (AhpF) N-terminal domain (NTD) subfamily, C-terminal TRX-fold subdomain; AhpF is a homodimeric flavoenzyme which catalyzes the NADH-dependent reduction of the peroxiredoxin AhpC, which then reduces hydrogen peroxide and organic hydroperoxides. AhpF contains an NTD containing two contiguous TRX-fold subdomains similar to Pyrococcus furiosus protein disulfide oxidoreductase (PfPDO). It also contains a catalytic core similar to TRX reductase containing FAD and NADH binding domains with an active site disulfide. The proposed mechanism of action of AhpF is similar to a TRX/TRX reductase system. The flow of reducing equivalents goes from NADH -> catalytic core of AhpF -> NTD of AhpF -> AhpC -> peroxide substrates. The catalytic CXXC motif of the NTD of AhpF is contained in its C-terminal TRX subdomain.	89
239325	cd03027	GRX_DEP	Glutaredoxin (GRX) family, Dishevelled, Egl-10, and Pleckstrin (DEP) subfamily; composed of uncharacterized proteins containing a GRX domain and additional domains DEP and DUF547, both of which have unknown functions. GRX is a glutathione (GSH) dependent reductase containing a redox active CXXC motif in a TRX fold. It has preference for mixed GSH disulfide substrates, in which it uses a monothiol mechanism where only the N-terminal cysteine is required. By altering the redox state of target proteins, GRX is involved in many cellular functions.	73
239326	cd03028	GRX_PICOT_like	Glutaredoxin (GRX) family, PKC-interacting cousin of TRX (PICOT)-like subfamily; composed of PICOT and GRX-PICOT-like proteins. The non-PICOT members of this family contain only the GRX-like domain, whereas PICOT contains an N-terminal TRX-like domain followed by one to three GRX-like domains. It is interesting to note that PICOT from plants contains three repeats of the GRX-like domain, metazoan proteins (except for insect) have two repeats, while fungal sequences contain only one copy of the domain. PICOT is a protein that interacts with protein kinase C (PKC) theta, a calcium independent PKC isoform selectively expressed in skeletal muscle and T lymphocytes. PICOT inhibits the activation of c-Jun N-terminal kinase and the transcription factors, AP-1 and NF-kB, induced by PKC theta or T-cell activating stimuli. Both GRX and TRX domains of PICOT are required for its activity. Characterized non-PICOT members of this family include CXIP1, a CAX-interacting protein in Arabidopsis thaliana, and PfGLP-1, a GRX-like protein from Plasmodium falciparum.	90
239327	cd03029	GRX_hybridPRX5	Glutaredoxin (GRX) family, PRX5 hybrid subfamily; composed of hybrid proteins containing peroxiredoxin (PRX) and GRX domains, which are found in some pathogenic bacteria and cyanobacteria. PRXs are thiol-specific antioxidant (TSA) proteins that confer a protective antioxidant role in cells through their peroxidase activity in which hydrogen peroxide, peroxynitrite, and organic hydroperoxides are reduced and detoxified using reducing equivalents derived from either thioredoxin, glutathione, trypanothione or AhpF. GRX is a glutathione (GSH) dependent reductase, catalyzing the disulfide reduction of target proteins. PRX-GRX hybrid proteins from Haemophilus influenzae and Neisseria meningitidis exhibit GSH-dependent peroxidase activity. The flow of reducing equivalents in the catalytic cycle of the hybrid protein goes from NADPH -> GSH reductase -> GSH -> GRX domain of hybrid -> PRX domain of hybrid -> peroxide substrate.	72
239328	cd03030	GRX_SH3BGR	Glutaredoxin (GRX) family, SH3BGR (SH3 domain binding glutamic acid-rich protein) subfamily; a recently-identified subfamily composed of SH3BGR and similar proteins possessing significant sequence similarity to GRX, but without a redox active CXXC motif. The SH3BGR gene was cloned in an effort to identify genes mapping to chromosome 21, which could be involved in the pathogenesis of congenital heart disease affecting Down syndrome newborns. Several human SH3BGR-like (SH3BGRL) genes have been identified since, mapping to different locations in the chromosome. Of these, SH3BGRL3 was identified as a tumor necrosis factor (TNF) alpha inhibitory protein and was also named TIP-B1. Upregulation of expression of SH3BGRL3 is associated with differentiation. It has been suggested that it functions as a regulator of differentiation-related signal transduction pathways.	92
239329	cd03031	GRX_GRX_like	Glutaredoxin (GRX) family, GRX-like domain containing protein subfamily; composed of uncharacterized eukaryotic proteins containing a GRX-like domain having only one conserved cysteine, aligning to the C-terminal cysteine of the CXXC motif of GRXs. This subfamily is predominantly composed of plant proteins. GRX is a glutathione (GSH) dependent reductase, catalyzing the disulfide reduction of target proteins via a redox active CXXC motif using a similar dithiol mechanism employed by TRXs. GRX has preference for mixed GSH disulfide substrates, in which it uses a monothiol mechanism where only the N-terminal cysteine is required. Proteins containing only the C-terminal cysteine are generally redox inactive.	147
239330	cd03032	ArsC_Spx	Arsenate Reductase (ArsC) family, Spx subfamily; Spx is a unique RNA polymerase (RNAP)-binding protein present in bacilli and some mollicutes. It inhibits transcription by binding to the C-terminal domain of the alpha subunit of RNAP, disrupting complex formation between RNAP and certain transcriptional activator proteins like ResD and ComA. In response to oxidative stress, Spx can also activate transcription, making it a general regulator that exerts both positive and negative control over transcription initiation. Spx has been shown to exert redox-sensitive transcriptional control over genes like trxA (TRX) and trxB (TRX reductase), genes that function in thiol homeostasis. This redox-sensitive activity is dependent on the presence of a CXXC motif, present in some members of the Spx subfamily, that acts as a thiol/disulfide switch. Spx has also been shown to repress genes in a sulfate-dependent manner independent of the presence of the CXXC motif.	115
239331	cd03033	ArsC_15kD	Arsenate Reductase (ArsC) family, 15kD protein subfamily; composed of proteins of unknown function with similarity to thioredoxin-fold arsenic reductases, ArsC. It is encoded by an ORF present in a gene cluster associated with nitrogen fixation that also encodes dinitrogenase reductase ADP-ribosyltransferase (DRAT) and dinitrogenase reductase activating glycohydrolase (DRAG). ArsC catalyzes the reduction of arsenate [As(V)] to arsenite [As(III)], using reducing equivalents derived from glutathione via glutaredoxin, through a single catalytic cysteine.	113
239332	cd03034	ArsC_ArsC	Arsenate Reductase (ArsC) family, ArsC subfamily; arsenic reductases similar to that encoded by arsC on the R733 plasmid of Escherichia coli. E. coli ArsC catalyzes the reduction of arsenate [As(V)] to arsenite [As(III)], the first step in the detoxification of arsenic, using reducing equivalents derived from glutathione (GSH) via glutaredoxin (GRX). ArsC contains a single catalytic cysteine, within a thioredoxin fold, that forms a covalent thiolate-As(V) intermediate, which is reduced by GRX through a mixed GSH-arsenate intermediate. This family of predominantly bacterial enzymes is unrelated to two other families of arsenate reductases which show similarity to low-molecular-weight acid phosphatases and phosphotyrosyl phosphatases.	112
239333	cd03035	ArsC_Yffb	Arsenate Reductase (ArsC) family, Yffb subfamily; Yffb is an uncharacterized bacterial protein encoded by the yffb gene, related to the thioredoxin-fold arsenic reductases, ArsC. The structure of Yffb and the conservation of the catalytic cysteine suggest that it is likely to function as a glutathione (GSH)-dependent thiol reductase. ArsC catalyzes the reduction of arsenate [As(V)] to arsenite [As(III)], using reducing equivalents derived from GSH via glutaredoxin, through a single catalytic cysteine.	105
239334	cd03036	ArsC_like	Arsenate Reductase (ArsC) family, unknown subfamily; uncharacterized proteins containing a CXXC motif with similarity to thioredoxin (TRX)-fold arsenic reductases, ArsC. Proteins containing a redox active CXXC motif like TRX and glutaredoxin (GRX) function as protein disulfide oxidoreductases, altering the redox state of target proteins via the reversible oxidation of the active site dithiol. ArsC catalyzes the reduction of arsenate [As(V)] to arsenite [As(III)], using reducing equivalents derived from glutathione via GRX, through a single catalytic cysteine.	111
239335	cd03037	GST_N_GRX2	GST_N family, Glutaredoxin 2 (GRX2) subfamily; composed of bacterial proteins similar to E. coli GRX2, an atypical GRX with a molecular mass of about 24kD, compared with other GRXs which are 9-12kD in size. GRX2 adopts a GST fold containing an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain. It contains a redox active CXXC motif located in the N-terminal domain but is not able to reduce ribonucleotide reductase like other GRXs. However, it catalyzes GSH-dependent protein disulfide reduction of other substrates efficiently. GRX2 is thought to function primarily in catalyzing the reversible glutathionylation of proteins in cellular redox regulation including stress responses.	71
239336	cd03038	GST_N_etherase_LigE	GST_N family, Beta etherase LigE subfamily; composed of proteins similar to Sphingomonas paucimobilis beta etherase, LigE, a GST-like protein that catalyzes the cleavage of the beta-aryl ether linkages present in low-molecular weight lignins using GSH as the hydrogen donor. This reaction is an essential step in the degradation of lignin, a complex phenolic polymer that is the most abundant aromatic material in the biosphere. The beta etherase activity of LigE is enantioselective and it complements the activity of the other GST family beta etherase, LigF.	84
239337	cd03039	GST_N_Sigma_like	GST_N family, Class Sigma_like; composed of GSTs belonging to class Sigma and similar proteins, including GSTs from class Mu, Pi and Alpha. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. Vertebrate class Sigma GSTs are characterized as GSH-dependent hematopoietic prostaglandin (PG) D synthases and are responsible for the production of PGD2 by catalyzing the isomerization of PGH2. The functions of PGD2 include the maintenance of body temperature, inhibition of platelet aggregation, bronchoconstriction, vasodilation and mediation of allergy and inflammation. Other class Sigma members include the class II insect GSTs, S-crystallins from cephalopods and 28-kDa GSTs from parasitic flatworms. Drosophila GST2 is associated with indirect flight muscle and exhibits preference for catalyzing GSH conjugation to lipid peroxidation products, indicating an anti-oxidant role. S-crystallin constitutes the major lens protein in cephalopod eyes and is responsible for lens transparency and proper refractive index. The 28-kDa GST from Schistosoma is a multifunctional enzyme, exhibiting GSH transferase, GSH peroxidase and PGD2 synthase activities, and may play an important role in host-parasite interactions. Also included are novel GSTs from the fungus Cunninghamella elegans, designated as class Gamma, and from the protozoan Blepharisma japonicum, described as a light-inducible GST.	72
239338	cd03040	GST_N_mPGES2	GST_N family, microsomal Prostaglandin E synthase Type 2 (mPGES2) subfamily; mPGES2 is a membrane-anchored dimeric protein containing a CXXC motif which catalyzes the isomerization of PGH2 to PGE2. Unlike cytosolic PGE synthase (cPGES) and microsomal PGES Type 1 (mPGES1), mPGES2 does not require glutathione (GSH) for its activity, although its catalytic rate is increased two- to four-fold in the presence of DTT, GSH or other thiol compounds. PGE2 is widely distributed in various tissues and is implicated in the sleep/wake cycle, relaxation/contraction of smooth muscle, excretion of sodium ions, maintenance of body temperature and mediation of inflammation. mPGES2 contains an N-terminal hydrophobic domain which is membrane associated, and a C-terminal soluble domain with a GST-like structure.	77
239339	cd03041	GST_N_2GST_N	GST_N family, 2 repeats of the N-terminal domain of soluble GSTs (2 GST_N) subfamily; composed of uncharacterized proteins. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains.	77
239340	cd03042	GST_N_Zeta	GST_N family, Class Zeta subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. Class Zeta GSTs, also known as maleylacetoacetate (MAA) isomerases, catalyze the isomerization of MAA to fumarylacetoacetate, the penultimate step in tyrosine/phenylalanine catabolism, using GSH as a cofactor. They show little GSH-conjugating activity towards traditional GST substrates but display modest GSH peroxidase activity. They are also implicated in the detoxification of the carcinogen dichloroacetic acid by catalyzing its dechlorination to glyoxylic acid.	73
239341	cd03043	GST_N_1	GST_N family, unknown subfamily 1; composed of uncharacterized proteins, predominantly from bacteria, with similarity to GSTs. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains.	73
239342	cd03044	GST_N_EF1Bgamma	GST_N family, Gamma subunit of Elongation Factor 1B (EF1Bgamma) subfamily; EF1Bgamma is part of the eukaryotic translation elongation factor-1 (EF1) complex which plays a central role in the elongation cycle during protein biosynthesis. EF1 consists of two functionally distinct units, EF1A and EF1B. EF1A catalyzes the GTP-dependent binding of aminoacyl-tRNA to the ribosomal A site concomitant with the hydrolysis of GTP. The resulting inactive EF1A:GDP complex is recycled to the active GTP form by the guanine-nucleotide exchange factor EF1B, a complex composed of at least two subunits, alpha and gamma. Metazoan EF1B contains a third subunit, beta. The EF1B gamma subunit contains a GST fold consisting of an N-terminal TRX-fold domain and a C-terminal alpha helical domain. The GST-like domain of EF1Bgamma is believed to mediate the dimerization of the EF1 complex, which in yeast is a dimer of the heterotrimer EF1A:EF1Balpha:EF1Bgamma. In addition to its role in protein biosynthesis, EF1Bgamma may also display other functions. The recombinant rice protein has been shown to possess GSH conjugating activity. The yeast EF1Bgamma binds membranes in a calcium dependent manner and is also part of a complex that binds to the msrA (methionine sulfoxide reductase) promoter suggesting a function in the regulation of its gene expression.	75
239343	cd03045	GST_N_Delta_Epsilon	GST_N family, Class Delta and Epsilon subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. The class Delta and Epsilon subfamily is made up primarily of insect GSTs, which play major roles in insecticide resistance by facilitating reductive dehydrochlorination of insecticides or conjugating them with GSH to produce water-soluble metabolites that are easily excreted. They are also implicated in protection against cellular damage by oxidative stress.	74
239344	cd03046	GST_N_GTT1_like	GST_N family, Saccharomyces cerevisiae GTT1-like subfamily; composed of predominantly uncharacterized proteins with similarity to the S. cerevisiae GST protein, GTT1, and the Schizosaccharomyces pombe GST-III. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GTT1, a homodimer, exhibits GST activity with standard substrates and associates with the endoplasmic reticulum. Its expression is induced after diauxic shift and remains high throughout the stationary phase. S. pombe GST-III is implicated in the detoxification of various metals.	76
239345	cd03047	GST_N_2	GST_N family, unknown subfamily 2; composed of uncharacterized bacterial proteins with similarity to GSTs. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. The sequence from Burkholderia cepacia was identified as part of a gene cluster involved in the degradation of 2,4,5-trichlorophenoxyacetic acid. Some GSTs (e.g. Class Zeta and Delta) are known to catalyze dechlorination reactions.	73
239346	cd03048	GST_N_Ure2p_like	GST_N family, Ure2p-like subfamily; composed of the Saccharomyces cerevisiae Ure2p and related GSTs. Ure2p is a regulator for nitrogen catabolism in yeast. It represses the expression of several gene products involved in the use of poor nitrogen sources when rich sources are available. A transmissible conformational change of Ure2p results in a prion called [Ure3], an inactive, self-propagating and infectious amyloid. Ure2p displays a GST fold containing an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. The N-terminal TRX-fold domain is sufficient to induce the [Ure3] phenotype and is also called the prion domain of Ure2p. In addition to its role in nitrogen regulation, Ure2p confers protection to cells against heavy metal ion and oxidant toxicity, and shows glutathione (GSH) peroxidase activity. Characterized GSTs in this subfamily include Aspergillus fumigatus GSTs 1 and 2, and Schizosaccharomyces pombe GST-I. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of GSH with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes.	81
239347	cd03049	GST_N_3	GST_N family, unknown subfamily 3; composed of uncharacterized bacterial proteins with similarity to GSTs. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains.	73
239348	cd03050	GST_N_Theta	GST_N family, Class Theta subfamily; composed of eukaryotic class Theta GSTs and bacterial dichloromethane (DCM) dehalogenase. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. Mammalian class Theta GSTs show poor GSH conjugating activity towards the standard substrates, CDNB and ethacrynic acid, differentiating them from other mammalian GSTs. GSTT1-1 shows catalytic activity similar to that of bacterial DCM dehalogenase, catalyzing the GSH-dependent hydrolytic dehalogenation of dihalomethanes. This is an essential process in methylotrophic bacteria to enable them to use chloromethane and DCM as sole carbon and energy sources. The presence of polymorphisms in human GSTT1-1 and its relationship to the onset of diseases including cancer is the subject of many studies. Human GSTT2-2 exhibits a highly specific sulfatase activity, catalyzing the cleavage of sulfate ions from aralkyl sulfate esters, but not from aryl or alkyl sulfate esters.	76
239349	cd03051	GST_N_GTT2_like	GST_N family, Saccharomyces cerevisiae GTT2-like subfamily; composed of predominantly uncharacterized proteins with similarity to the S. cerevisiae GST protein, GTT2. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GTT2, a homodimer, exhibits GST activity with standard substrates. Strains with deleted GTT2 genes are viable but exhibit increased sensitivity to heat shock.	74
239350	cd03052	GST_N_GDAP1	GST_N family, Ganglioside-induced differentiation-associated protein 1 (GDAP1) subfamily; GDAP1 was originally identified as a highly expressed gene at the differentiated stage of GD3 synthase-transfected cells. More recently, mutations in GDAP1 have been reported to cause both axonal and demyelinating autosomal-recessive Charcot-Marie-Tooth (CMT) type 4A neuropathy. CMT is characterized by slow and progressive weakness and atrophy of muscles. Sequence analysis of GDAP1 shows similarities and differences with GSTs; it appears to contain both N-terminal TRX-fold and C-terminal alpha helical domains of GSTs, however, it also contains additional C-terminal transmembrane domains unlike GSTs. GDAP1 is mainly expressed in neuronal cells and is localized in the mitochondria through its transmembrane domains. It does not exhibit GST activity using standard substrates.	73
239351	cd03053	GST_N_Phi	GST_N family, Class Phi subfamily; composed of plant-specific class Phi GSTs and related fungal and bacterial proteins. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. The class Phi GST subfamily has experienced extensive gene duplication. The Arabidopsis and Oryza genomes contain 13 and 16 Phi GSTs, respectively. They are primarily responsible for herbicide detoxification together with class Tau GSTs, showing class specificity in substrate preference. Phi enzymes are highly reactive toward chloroacetanilide and thiocarbamate herbicides. Some Phi GSTs have other functions including transport of flavonoid pigments to the vacuole, shoot regeneration and GSH peroxidase activity.	76
239352	cd03054	GST_N_Metaxin	GST_N family, Metaxin subfamily; composed of metaxins and related proteins. Metaxin 1 is a component of a preprotein import complex of the mitochondrial outer membrane. It extends to the cytosol and is anchored to the mitochondrial membrane through its C-terminal domain. In mice, metaxin is required for embryonic development. In humans, alterations in the metaxin gene may be associated with Gaucher disease. Metaxin 2 binds to metaxin 1 and may also play a role in protein translocation into the mitochondria. Genome sequencing shows that a third metaxin gene also exists in zebrafish, Xenopus, chicken and mammals. Sequence analysis suggests that all three metaxins share a common ancestry and that they possess similarity to GSTs. Also included in the subfamily are uncharacterized proteins with similarity to metaxins, including a novel GST from Rhodococcus with toluene o-monooxygenase and glutamylcysteine synthetase activities.	72
239353	cd03055	GST_N_Omega	GST_N family, Class Omega subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. Class Omega GSTs show little or no GSH-conjugating activity towards standard GST substrates. Instead, they catalyze the GSH dependent reduction of protein disulfides, dehydroascorbate and monomethylarsonate, activities which are more characteristic of glutaredoxins. They contain a conserved cysteine equivalent to the first cysteine in the CXXC motif of glutaredoxins, which is a redox active residue capable of reducing GSH mixed disulfides in a monothiol mechanism. Polymorphisms of the class Omega GST genes may be associated with the development of some types of cancer and the age-at-onset of both Alzheimer's and Parkinson's diseases.	89
239354	cd03056	GST_N_4	GST_N family, unknown subfamily 4; composed of uncharacterized bacterial proteins with similarity to GSTs. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains.	73
239355	cd03057	GST_N_Beta	GST_N family, Class Beta subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. Unlike mammalian GSTs which detoxify a broad range of compounds, the bacterial class Beta GSTs exhibit limited GSH conjugating activity with a narrow range of substrates. In addition to GSH conjugation, they also bind antibiotics and reduce the antimicrobial activity of beta-lactam drugs. The structure of the Proteus mirabilis enzyme reveals that the cysteine in the active site forms a covalent bond with GSH.	77
239356	cd03058	GST_N_Tau	GST_N family, Class Tau subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. The plant-specific class Tau GST subfamily has undergone extensive gene duplication. The Arabidopsis and Oryza genomes contain 28 and 40 Tau GSTs, respectively. They are primarily responsible for herbicide detoxification together with class Phi GSTs, showing class specificity in substrate preference. Tau enzymes are highly efficient in detoxifying diphenylether and aryloxyphenoxypropionate herbicides. In addition, Tau GSTs play important roles in intracellular signalling, biosynthesis of anthocyanin, responses to soil stresses and responses to auxin and cytokinin hormones.	74
239357	cd03059	GST_N_SspA	GST_N family, Stringent starvation protein A (SspA) subfamily; SspA is an RNA polymerase (RNAP)-associated protein required for the lytic development of phage P1 and for stationary phase-induced acid tolerance of E. coli. It is implicated in survival during nutrient starvation. SspA adopts the GST fold with an N-terminal TRX-fold domain and a C-terminal alpha helical domain, but it does not bind glutathione (GSH) and lacks GST activity. SspA is highly conserved among gram-negative bacteria. Related proteins found in Neisseria (called RegF), Francisella and Vibrio regulate the expression of virulence factors necessary for pathogenesis.	73
239358	cd03060	GST_N_Omega_like	GST_N family, Omega-like subfamily; composed of uncharacterized proteins with similarity to class Omega GSTs. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. Class Omega GSTs show little or no GSH-conjugating activity towards standard GST substrates. Instead, they catalyze the GSH dependent reduction of protein disulfides, dehydroascorbate and monomethylarsonate, activities which are more characteristic of glutaredoxins. Like Omega enzymes, proteins in this subfamily contain a conserved cysteine equivalent to the first cysteine in the CXXC motif of glutaredoxins, which is a redox active residue capable of reducing GSH mixed disulfides in a monothiol mechanism.	71
239359	cd03061	GST_N_CLIC	GST_N family, Chloride Intracellular Channel (CLIC) subfamily; composed of CLIC1-5, p64, parchorin and similar proteins. They are auto-inserting, self-assembling intracellular anion channels involved in a wide variety of functions including regulated secretion, cell division and apoptosis. They can exist in both water-soluble and membrane-bound states, and are found in various vesicles and membranes. Biochemical studies of the C. elegans homolog, EXC-4, show that the membrane localization domain is present in the N-terminal part of the protein. The structure of soluble human CLIC1 reveals that it is monomeric and it adopts a fold similar to GSTs, containing an N-terminal domain with a TRX fold and a C-terminal alpha helical domain. Upon oxidation, the N-terminal domain of CLIC1 undergoes a structural change to form a non-covalent dimer stabilized by the formation of an intramolecular disulfide bond between two cysteines that are far apart in the reduced form. The CLIC1 dimer bears no similarity to GST dimers. The redox-controlled structural rearrangement exposes a large hydrophobic surface, which is masked by dimerization in vitro. In vivo, this surface may represent the docking interface of CLIC1 in its membrane-bound state. The two cysteines in CLIC1 that form the disulfide bond in oxidizing conditions are essential for dimerization and chloride channel activity, however, in other subfamily members, the second cysteine is not conserved.	91
239360	cd03062	TRX_Fd_Sucrase	TRX-like [2Fe-2S] Ferredoxin (Fd) family, Sucrase subfamily; composed of proteins with similarity to a novel plant enzyme, isolated from potato, which contains a Fd-like domain and exhibits sucrolytic activity. The putative active site of the Fd-like domain of the enzyme contains two cysteines and two histidines for possible binding to iron-sulfur clusters, compared to four cysteines present in the active site of Fd.	97
239361	cd03063	TRX_Fd_FDH_beta	TRX-like [2Fe-2S] Ferredoxin (Fd) family, NAD-dependent formate dehydrogenase (FDH) beta subunit; composed of proteins similar to the beta subunit of NAD-linked FDH of Ralstonia eutropha, a soluble enzyme that catalyzes the irreversible oxidation of formate to carbon dioxide accompanied by the reduction of NAD to NADH. FDH is a heteromeric enzyme composed of four nonidentical subunits (alpha, beta, gamma and delta). The FDH beta subunit contains a NADH:ubiquinone oxidoreductase (Nuo) F domain C-terminal to a Fd-like domain without the active site cysteines. The absence of conserved metal-binding residues in the putative active site suggests that members of this subfamily have lost the ability to bind iron-sulfur clusters in the N-terminal Fd-like domain. The C-terminal NuoF domain is a component of Nuo, a multisubunit complex catalyzing the electron transfer of NADH to quinone coupled with the transfer of protons across the membrane. NuoF contains one [4Fe-4S] cluster and binds NADH and FMN.	92
239362	cd03064	TRX_Fd_NuoE	TRX-like [2Fe-2S] Ferredoxin (Fd) family, NADH:ubiquinone oxidoreductase (Nuo) subunit E subfamily; Nuo, also called respiratory chain Complex 1, is the entry point for electrons into the respiratory chains of bacteria and the mitochondria of eukaryotes. It is a multisubunit complex with at least 14 core subunits. It catalyzes the electron transfer of NADH to quinone coupled with the transfer of protons across the membrane, providing the proton motive force required for energy-consuming processes. Electrons are transferred from NADH to quinone through a chain of iron-sulfur clusters in Nuo, including the [2Fe-2S] cluster present in the NuoE core subunit, also called the 24 kD subunit of Complex 1. This subfamily also includes formate dehydrogenases, NiFe hydrogenases and NAD-reducing hydrogenases that contain a NuoE domain. A subset of these proteins contains both NuoE and NuoF in a single chain. NuoF, also called the 51 kD subunit of Complex 1, contains one [4Fe-4S] cluster and also binds the NADH substrate and FMN.	80
239363	cd03065	PDI_b_Calsequestrin_N	PDIb family, Calsequestrin subfamily, N-terminal TRX-fold domain; Calsequestrin is the major calcium storage protein in the sarcoplasmic reticulum (SR) of skeletal and cardiac muscle. It stores calcium ions in sufficient quantities (up to 20 mM) to allow repetitive contractions and is essential to maintain movement, respiration and heart beat. A missense mutation in human cardiac calsequestrin is associated with catecholamine-induced polymorphic ventricular tachycardia (CPVT), a rare disease characterized by seizures or sudden death in response to physiologic or emotional stress. Calsequestrin is a highly acidic protein with up to 50 calcium binding sites formed simply by the clustering of two or more acidic residues. The monomer contains three redox inactive TRX-fold domains. Calsequestrin is condensed as a linear polymer in the SR lumen and is membrane-anchored through binding with intra-membrane proteins triadin, junctin and ryanodine receptor (RyR) Ca2+ release channel. In addition to its role as a calcium ion buffer, calsequestrin also regulates the activity of the RyR channel, coordinating the release of calcium ions from the SR with the loading of the calcium store. The N-terminal TRX-fold domain (or domain I) mediates front-to-front dimer interaction, an important feature in the formation of calsequestrin polymers.	120
239364	cd03066	PDI_b_Calsequestrin_middle	PDIb family, Calsequestrin subfamily, Middle TRX-fold domain; Calsequestrin is the major calcium storage protein in the sarcoplasmic reticulum (SR) of skeletal and cardiac muscle. It stores calcium ions in sufficient quantities (up to 20 mM) to allow repetitive contractions and is essential to maintain movement, respiration and heart beat. A missense mutation in human cardiac calsequestrin is associated with catecholamine-induced polymorphic ventricular tachycardia (CPVT), a rare disease characterized by seizures or sudden death in response to physiologic or emotional stress. Calsequestrin is a highly acidic protein with up to 50 calcium binding sites formed simply by the clustering of two or more acidic residues. The monomer contains three redox inactive TRX-fold domains. Calsequestrin is condensed as a linear polymer in the SR lumen and is membrane-anchored through binding with intra-membrane proteins triadin, junctin and ryanodine receptor (RyR) Ca2+ release channel. In addition to its role as a calcium ion buffer, calsequestrin also regulates the activity of the RyR channel, coordinating the release of calcium ions from the SR with the loading of the calcium store.	102
239365	cd03067	PDI_b_PDIR_N	PDIb family, PDIR subfamily, N-terminal TRX-like b domain; composed of proteins similar to human PDIR (for Protein Disulfide Isomerase Related). PDIR is composed of three redox active TRX (a) domains and an N-terminal redox inactive TRX-like (b) domain. Similar to PDI, it is involved in oxidative protein folding in the endoplasmic reticulum (ER) through its isomerase and chaperone activities. These activities are lower compared to PDI, probably due to PDIR acting only on a subset of proteins. PDIR is preferentially expressed in cells actively secreting proteins and its expression is induced by stress. Similar to PDI, the isomerase and chaperone activities of PDIR are independent; CXXC mutants lacking isomerase activity retain chaperone activity. The TRX-like b domain of PDIR is critical for its chaperone activity.	112
239366	cd03068	PDI_b_ERp72	PDIb family, ERp72 subfamily, first redox inactive TRX-like domain b; ERp72 exhibits both disulfide oxidase and reductase functions like PDI, by catalyzing the formation of disulfide bonds of newly synthesized polypeptides in the ER and acting as isomerases to correct any non-native disulfide bonds. It also displays chaperone activity to prevent protein aggregation and facilitate the folding of newly synthesized proteins. ERp72 contains three redox-active TRX (a) domains and two redox inactive TRX-like (b) domains. Its molecular structure is a"abb'a', compared to the abb'a' structure of PDI. ERp72 associates with several ER chaperones and folding factors to form complexes in the ER that bind nascent proteins. Similar to PDI, the b domain of ERp72 is likely involved in binding to substrates.	107
239367	cd03069	PDI_b_ERp57	PDIb family, ERp57 subfamily, first redox inactive TRX-like domain b; ERp57 (or ERp60) exhibits both disulfide oxidase and reductase functions like PDI, by catalyzing the formation of disulfide bonds of newly synthesized polypeptides in the ER and acting as isomerases to correct any non-native disulfide bonds. It also displays chaperone activity to prevent protein aggregation and facilitate the folding of newly synthesized proteins. ERp57 contains two redox-active TRX (a) domains and two redox inactive TRX-like (b) domains. It shares the same domain arrangement of abb'a' as PDI, but lacks the C-terminal acid-rich region (c domain) that is present in PDI. ERp57 interacts with the lectin chaperones, calnexin and calreticulin, and specifically promotes the oxidative folding of glycoproteins. Similar to PDI, the b domain of ERp57 is likely involved in binding to substrates.	104
239368	cd03070	PDI_b_ERp44	PDIb family, ERp44 subfamily, first redox inactive TRX-like domain b; ERp44 is an endoplasmic reticulum (ER)-resident protein, induced during stress, involved in thiol-mediated ER retention. It contains an N-terminal TRX domain with a CXFS motif followed by two redox inactive TRX-like domains, homologous to the b and b' domains of PDI. Through the formation of reversible mixed disulfides, ERp44 mediates the ER localization of Ero1alpha, a protein that oxidizes protein disulfide isomerases into their active form. ERp44 also prevents the secretion of unassembled cargo protein with unpaired cysteines. ERp44 also modulates the activity of inositol 1,4,5-triphosphate type I receptor (IP3R1), an intracellular channel protein that mediates calcium release from the ER to the cytosol. Similar to PDI, the b domain of ERp44 is likely involved in binding to substrates.	91
239369	cd03071	PDI_b'_NRX	PDIb' family, NRX subgroup, redox inactive TRX-like domain b'; composed of vertebrate nucleoredoxins (NRX). NRX is a 400-amino acid nuclear protein with one redox active TRX domain followed by one redox inactive TRX-like domain homologous to the b' domain of PDI. In vitro studies show that NRX has thiol oxidoreductase activity and that it may be involved in the redox regulation of transcription, in a manner different from that of TRX or glutaredoxin. NRX enhances the activation of NF-kB by TNFalpha, as well as PMA-1 induced AP-1 and FK-induced CREB activation. Mouse NRX transcripts are expressed in all adult tissues but are restricted to the nervous system and limb buds in embryos. The mouse NRX gene is implicated in streptozotocin-induced diabetes. Similar to PDI, the b' domain of NRX is likely involved in substrate recognition.	116
239370	cd03072	PDI_b'_ERp44	PDIb' family, ERp44 subfamily, second redox inactive TRX-like domain b'; ERp44 is an endoplasmic reticulum (ER)-resident protein, induced during stress, involved in thiol-mediated ER retention. It contains an N-terminal TRX domain with a CXFS motif followed by two redox inactive TRX-like domains, homologous to the b and b' domains of PDI. Through the formation of reversible mixed disulfides, ERp44 mediates the ER localization of Ero1alpha, a protein that oxidizes protein disulfide isomerases into their active form. ERp44 also prevents the secretion of unassembled cargo protein with unpaired cysteines. ERp44 also modulates the activity of inositol 1,4,5-triphosphate type I receptor (IP3R1), an intracellular channel protein that mediates calcium release from the ER to the cytosol. Similar to PDI, the b' domain of ERp44 is likely involved in substrate recognition and may be the primary binding site.	111
239371	cd03073	PDI_b'_ERp72_ERp57	PDIb' family, ERp72 and ERp57 subfamily, second redox inactive TRX-like domain b'; ERp72 and ERp57 are involved in oxidative protein folding in the ER, like PDI. They exhibit both disulfide oxidase and reductase functions, by catalyzing the formation of disulfide bonds of newly synthesized polypeptides and acting as isomerases to correct any non-native disulfide bonds. They also display chaperone activity to prevent protein aggregation and facilitate the folding of newly synthesized proteins. ERp57 contains two redox-active TRX (a) domains and two redox inactive TRX-like (b) domains.  It shares the same domain arrangement of abb'a' as PDI, but lacks the C-terminal acid-rich region (c domain) that is present in PDI. ERp72 contains one additional redox-active TRX (a) domain at the N-terminus with a molecular structure of a"abb'a'. ERp57 interacts with the lectin chaperones, calnexin and calreticulin, and specifically promotes the oxidative folding of glycoproteins. ERp72 associates with several ER chaperones and folding factors to form complexes in the ER that bind nascent proteins. The b' domain of ERp57 is the primary binding site and is adapted for ER lectin association. Similarly, the b' domain of ERp72 is likely involved in substrate recognition.	111
239372	cd03074	PDI_b'_Calsequestrin_C	Protein Disulfide Isomerase (PDIb') family, Calsequestrin subfamily, C-terminal TRX-fold domain; Calsequestrin is the major calcium storage protein in the sarcoplasmic reticulum (SR) of skeletal and cardiac muscle. It stores calcium ions in sufficient quantities (up to 20 mM) to allow repetitive contractions and is essential to maintain movement, respiration and heart beat. A missense mutation in human cardiac calsequestrin is associated with catecholamine-induced polymorphic ventricular tachycardia (CPVT), a rare disease characterized by seizures or sudden death in response to physiologic or emotional stress. Calsequestrin is a highly acidic protein with up to 50 calcium binding sites formed simply by the clustering of two or more acidic residues. The monomer contains three redox inactive TRX-fold domains. Calsequestrin is condensed as a linear polymer in the SR lumen and is membrane-anchored through binding with intra-membrane proteins triadin, junctin and ryanodine receptor (RyR) Ca2+ release channel. In addition to its role as a calcium ion buffer, calsequestrin also regulates the activity of the RyR channel, coordinating the release of calcium ions from the SR with the loading of the calcium store. The C-terminal TRX-fold domain (or domain III) mediates back-to-back dimer interaction and also contributes to the front-to-front dimer interface, both of which are important features in the formation of calsequestrin polymers.	120
239373	cd03075	GST_N_Mu	GST_N family, Class Mu subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. The class Mu subfamily is composed of eukaryotic GSTs. In rats, at least six distinct class Mu subunits have been identified, with homologous genes in humans for five of these subunits. Class Mu GSTs can form homodimers and heterodimers, giving a large number of possible isoenzymes that can be formed, all with overlapping activities but different substrate specificities. They are the most abundant GSTs in human liver, skeletal muscle and brain, and are believed to provide protection against diseases including cancer and neurodegenerative disorders. Some isoenzymes have additional specific functions. Human GST M1-1 acts as an endogenous inhibitor of ASK1 (apoptosis signal-regulating kinase 1), thereby suppressing ASK1-mediated cell death. Human GSTM2-2 and 3-3 have been identified as prostaglandin E2 synthases in the brain and may play crucial roles in temperature and sleep-wake regulation.	82
239374	cd03076	GST_N_Pi	GST_N family, Class Pi subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. Class Pi GST is a homodimeric eukaryotic protein. The human GSTP1 is mainly found in erythrocytes, kidney, placenta and fetal liver. It is involved in stress responses and in cellular proliferation pathways as an inhibitor of JNK (c-Jun N-terminal kinase). Following oxidative stress, monomeric GSTP1 dissociates from JNK and dimerizes, losing its ability to bind JNK and causing an increase in JNK activity, thereby promoting apoptosis. GSTP1 is expressed in various tumors and is the predominant GST in a wide range of cancer cells. It has been implicated in the development of multidrug-resistant tumours.	73
239375	cd03077	GST_N_Alpha	GST_N family, Class Alpha subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. The GST fold contains an N-terminal TRX-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. The class Alpha subfamily is composed of eukaryotic GSTs which can form homodimers and heterodimers. There are at least six types of class Alpha GST subunits in rats, four of which have human counterparts, resulting in many possible isoenzymes with different activities, tissue distribution and substrate specificities. Human GSTA1-1 and GSTA2-2 show high GSH peroxidase activity. GSTA3-3 catalyzes the isomerization of intermediates in steroid hormone biosynthesis. GSTA4-4 preferentially catalyzes the GSH conjugation of alkenals.	79
239376	cd03078	GST_N_Metaxin1_like	GST_N family, Metaxin subfamily, Metaxin 1-like proteins; composed of metaxins 1 and 3, and similar proteins including Tom37 from fungi. Mammalian metaxin (or metaxin 1) and the fungal protein Tom37 are components of preprotein import complexes of the mitochondrial outer membrane. Metaxin extends to the cytosol and is anchored to the mitochondrial membrane through its C-terminal domain. In mice, metaxin is required for embryonic development. Like the murine gene, the human metaxin gene is located downstream to the glucocerebrosidase (GBA) pseudogene and is convergently transcribed. Inherited deficiency of GBA results in Gaucher disease, which presents many diverse clinical phenotypes. Alterations in the metaxin gene, in addition to GBA mutations, may be associated with Gaucher disease. Genome sequencing shows that a third metaxin gene also exists in zebrafish, Xenopus, chicken and mammals.	73
239377	cd03079	GST_N_Metaxin2	GST_N family, Metaxin subfamily, Metaxin 2; a metaxin 1 binding protein identified through a yeast two-hybrid system using metaxin 1 as the bait. Metaxin 2 shares sequence similarity with metaxin 1 but does not contain a C-terminal mitochondrial outer membrane signal-anchor domain. It associates with mitochondrial membranes through its interaction with metaxin 1, which is a component of the mitochondrial preprotein import complex of the outer membrane. The biological function of metaxin 2 is unknown. It is likely that it also plays a role in protein translocation into the mitochondria. However, this has not been experimentally validated. In a recent proteomics study, it has been shown that metaxin 2 is overexpressed in response to lipopolysaccharide-induced liver injury.	74
239378	cd03080	GST_N_Metaxin_like	GST_N family, Metaxin subfamily, Metaxin-like proteins; a heterogenous group of proteins, predominantly uncharacterized, with similarity to metaxins and GSTs. Metaxin 1 is a component of a preprotein import complex of the mitochondrial outer membrane. It extends to the cytosol and is anchored to the mitochondrial membrane through its C-terminal domain. In mice, metaxin is required for embryonic development. In humans, alterations in the metaxin gene may be associated with Gaucher disease. One characterized member of this subgroup is a novel GST from Rhodococcus with toluene o-monooxygenase and gamma-glutamylcysteine synthetase activities. Also members are the cadmium-inducible lysosomal protein CDR-1 and its homologs from C. elegans, and the failed axon connections (fax) protein from Drosophila. CDR-1 is an integral membrane protein that functions to protect against cadmium toxicity and may also have a role in osmoregulation to maintain salt balance in C. elegans. The fax gene of Drosophila was identified as a genetic modifier of Abelson (Abl) tyrosine kinase. The fax protein is localized in cellular membranes and is expressed in embryonic mesoderm and axons of the central nervous system.	75
239379	cd03081	TRX_Fd_NuoE_FDH_gamma	TRX-like [2Fe-2S] Ferredoxin (Fd) family, NADH:ubiquinone oxidoreductase (Nuo) subunit E subfamily, NAD-dependent formate dehydrogenase (FDH) gamma subunit; composed of proteins similar to the gamma subunit of NAD-linked FDH of Ralstonia eutropha, a soluble enzyme that catalyzes the irreversible oxidation of formate to carbon dioxide accompanied by the reduction of NAD+ to NADH. FDH is a heteromeric enzyme composed of four nonidentical subunits (alpha, beta, gamma and delta). The FDH gamma subunit is closely related to NuoE, which is part of a multisubunit complex (Nuo) catalyzing the electron transfer of NADH to quinone coupled with the transfer of protons across the membrane. Electrons are transferred from NADH to quinone through a chain of iron-sulfur clusters in Nuo, including the [2Fe-2S] cluster present in NuoE. Similarly, the FDH gamma subunit is hypothesized to be involved in an electron transport chain involving other FDH subunits, upon the oxidation of formate.	80
239380	cd03082	TRX_Fd_NuoE_W_FDH_beta	TRX-like [2Fe-2S] Ferredoxin (Fd) family, NADH:ubiquinone oxidoreductase (Nuo) subunit E family, Tungsten-containing formate dehydrogenase (W-FDH) beta subunit; composed of proteins similar to the W-FDH beta subunit of Methylobacterium extorquens. W-FDH is a heterodimeric NAD-dependent enzyme catalyzing the conversion of formate to carbon dioxide. The beta subunit is a fusion protein containing an N-terminal NuoE domain and a C-terminal NuoF domain. NuoE and NuoF are components of Nuo, a multisubunit complex catalyzing the electron transfer of NADH to quinone coupled with the transfer of protons across the membrane. Electrons are transferred from NADH to quinone through a chain of iron-sulfur clusters in Nuo, including the [2Fe-2S] cluster in NuoE and the [4Fe-4S] cluster in NuoF. In addition, NuoF is also the NADH- and FMN-binding subunit. Similarly, the beta subunit of W-FDH is most likely involved in the electron transport chain during the NAD-dependent oxidation of formate.	72
239381	cd03083	TRX_Fd_NuoE_hoxF	TRX-like [2Fe-2S] Ferredoxin (Fd) family, NADH:ubiquinone oxidoreductase (Nuo) subunit E subfamily, hoxF; composed of proteins similar to the NAD-reducing hydrogenase (hoxS) alpha subunit of Alcaligenes eutrophus H16. HoxS is a cytoplasmic hydrogenase catalyzing the oxidation of molecular hydrogen accompanied by the reduction of NAD. It is composed of four structural subunits encoded by the genes hoxF, hoxU, hoxY and hoxH. The hoxF protein (or alpha subunit) is a fusion protein containing an N-terminal NuoE-like domain and a C-terminal NuoF domain. NuoE and NuoF are components of Nuo, a multisubunit complex catalyzing the electron transfer of NADH to quinone coupled with the transfer of protons across the membrane. Electrons are transferred from NADH to quinone through a chain of iron-sulfur clusters in Nuo, including the [2Fe-2S] cluster in NuoE and the [4Fe-4S] cluster in NuoF. In addition, NuoF is also the NADH- and FMN-binding subunit. HoxF may be involved in the electron transport chain during the NAD-dependent oxidation of hydrogen through its NuoF domain. The NuoE-like domain of hoxF contains only one conserved cysteine in its putative active site, compared to four cysteines in NuoE, and may have lost the ability to bind [2Fe-2S] clusters.	80
100086	cd03084	phosphohexomutase	The alpha-D-phosphohexomutase superfamily includes several related enzymes that catalyze a reversible intramolecular phosphoryl transfer on their sugar substrates. Members of this family include the phosphoglucomutases (PGM1 and PGM2), phosphoglucosamine mutase (PNGM), phosphoacetylglucosamine mutase (PAGM), the bacterial phosphomannomutase ManB, the bacterial phosphoglucosamine mutase GlmM, and the bifunctional phosphomannomutase/phosphoglucomutase (PMM/PGM). These enzymes play important and diverse roles in carbohydrate metabolism in organisms from bacteria to humans. Each of these enzymes has four domains with a centrally located active site formed by four loops, one from each domain. All four domains are included in this alignment model.	355
100087	cd03085	PGM1	Phosphoglucomutase 1 (PGM1) catalyzes the bidirectional interconversion of glucose-1-phosphate (G-1-P) and glucose-6-phosphate (G-6-P) via a glucose 1,6-diphosphate intermediate, an important metabolic step in prokaryotes and eukaryotes. In one direction, G-1-P produced from sucrose catabolism is converted to G-6-P, the first intermediate in glycolysis. In the other direction, conversion of G-6-P to G-1-P generates a substrate for synthesis of UDP-glucose which is required for synthesis of a variety of cellular constituents including cell wall polymers and glycoproteins. The PGM1 family also includes a non-enzymatic PGM-related protein (PGM-RP) thought to play a structural role in eukaryotes, as well as pp63/parafusin, a phosphoglycoprotein that plays an important role in calcium-regulated exocytosis in ciliated protozoans. PGM1 belongs to the alpha-D-phosphohexomutase superfamily which includes several related enzymes that catalyze a reversible intramolecular phosphoryl transfer on their sugar substrates. Other members of this superfamily include phosphoglucosamine mutase (PNGM), phosphoacetylglucosamine mutase (PAGM), the bacterial phosphomannomutase ManB, the bacterial phosphoglucosamine mutase GlmM, and the bifunctional phosphomannomutase/phosphoglucomutase (PMM/PGM). Each of these enzymes has four domains with a centrally located active site formed by four loops, one from each domain. All four domains are included in this alignment model.	548
100088	cd03086	PGM3	PGM3 (phosphoglucomutase 3), also known as PAGM (phosphoacetylglucosamine mutase) and AGM1 (N-acetylglucosamine-phosphate mutase), is an essential enzyme found in eukaryotes that reversibly catalyzes the conversion of GlcNAc-6-phosphate into GlcNAc-1-phosphate as part of the UDP-N-acetylglucosamine (UDP-GlcNAc) biosynthetic pathway. UDP-GlcNAc is an essential metabolite that serves as the biosynthetic precursor of many glycoproteins and mucopolysaccharides. AGM1 is a member of the alpha-D-phosphohexomutase superfamily, which catalyzes the intramolecular phosphoryl transfer of sugar substrates. The alpha-D-phosphohexomutases have four domains with a centrally located active site formed by four loops, one from each domain. All four domains are included in this alignment model.	513
100089	cd03087	PGM_like1	This archaeal PGM-like (phosphoglucomutase-like) protein of unknown function belongs to the alpha-D-phosphohexomutase superfamily which includes several related enzymes that catalyze a reversible intramolecular phosphoryl transfer on their sugar substrates. The alpha-D-phosphohexomutases include several related enzymes that catalyze a reversible intramolecular phosphoryl transfer on their sugar substrates. Members of this superfamily include the phosphoglucomutases (PGM1 and PGM2), phosphoglucosamine mutase (PNGM), phosphoacetylglucosamine mutase (PAGM), the bacterial phosphomannomutase ManB, the bacterial phosphoglucosamine mutase GlmM, and the bifunctional phosphomannomutase/phosphoglucomutase (PMM/PGM). Each of these enzymes has four domains with a centrally located active site formed by four loops, one from each domain. All four domains are included in this alignment model.	439
100090	cd03088	ManB	ManB is a bacterial phosphomannomutase (PMM) that catalyzes the conversion of mannose-6-phosphate to mannose-1-phosphate in the second of three steps in the GDP-mannose pathway, in which GDP-D-mannose is synthesized from fructose-6-phosphate. In Mycobacterium tuberculosis, the causative agent of tuberculosis, PMM is involved in the biosynthesis of mannosylated lipoglycans that participate in the association of mycobacteria with host macrophage phagocytic receptors. ManB belongs to the alpha-D-phosphohexomutase superfamily which includes several related enzymes that catalyze a reversible intramolecular phosphoryl transfer on their sugar substrates. Other members of this superfamily include the phosphoglucomutases (PGM1 and PGM2), phosphoglucosamine mutase (PNGM), phosphoacetylglucosamine mutase (PAGM), the bacterial phosphoglucosamine mutase GlmM, and the bifunctional phosphomannomutase/phosphoglucomutase (PMM/PGM). Each of these enzymes has four domains with a centrally located active site formed by four loops, one from each domain. All four domains are included in this alignment model.	459
100091	cd03089	PMM_PGM	The phosphomannomutase/phosphoglucomutase (PMM/PGM) bifunctional enzyme catalyzes the reversible conversion of 1-phospho to 6-phospho-sugars (e.g. between mannose-1-phosphate and mannose-6-phosphate or glucose-1-phosphate and glucose-6-phosphate) via a bisphosphorylated sugar intermediate. The reaction involves two phosphoryl transfers, with an intervening 180 degree reorientation of the reaction intermediate during catalysis. Reorientation of the intermediate occurs without dissociation from the active site of the enzyme and is thus, a simple example of processivity, as defined by multiple rounds of catalysis without release of substrate. Glucose-6-phosphate and glucose-1-phosphate are known to be utilized for energy metabolism and cell surface construction, respectively. PMM/PGM belongs to the alpha-D-phosphohexomutase superfamily which includes several related enzymes that catalyze a reversible intramolecular phosphoryl transfer on their sugar substrates. Other members of this superfamily include phosphoglucosamine mutase (PNGM), phosphoacetylglucosamine mutase (PAGM), the bacterial phosphomannomutase ManB, the bacterial phosphoglucosamine mutase GlmM, and the phosphoglucomutases (PGM1 and PGM2). Each of these enzymes has four domains with a centrally located active site formed by four loops, one from each domain. All four domains are included in this alignment model.	443
349762	cd03108	AdSS	adenylosuccinate synthetase. Adenylosuccinate synthetase (AdSS) catalyzes the first step in the de novo biosynthesis of AMP. IMP and L-aspartate are conjugated in a two-step reaction accompanied by the hydrolysis of GTP to GDP in the presence of Mg2+. In the first step, the gamma-phosphate group of GTP is transferred to the 6-oxygen atom of IMP. An aspartate then displaces this 6-phosphate group to form the product adenylosuccinate. Because of its critical role in purine biosynthesis, AdSS is a target of antibiotics, herbicides and antitumor drugs.	316
349763	cd03109	DTBS	dethiobiotin synthetase. Dethiobiotin synthetase (DTBS) is the penultimate enzyme in the biotin biosynthesis pathway in Escherichia coli and other microorganisms. The enzyme catalyzes formation of the ureido ring of dethiobiotin from (7R,8S)-7,8-diaminononanoic acid (DAPA) and carbon dioxide. The enzyme utilizes carbon dioxide instead of hydrogen carbonate as substrate and is dependent on ATP and divalent metal ions as cofactors.	189
349764	cd03110	SIMIBI_bact_arch	bacterial and archaeal subfamily of SIMIBI. Uncharacterized bacterial and archaeal subfamily of the SIMIBI superfamily. Proteins in this superfamily contain an ATP-binding domain and use energy from hydrolysis of ATP to transfer electrons or ions. The specific function of this family is unknown.	246
349765	cd03111	CpaE-like	pilus assembly ATPase CpaE. This protein family consists of proteins similar to the cpaE protein of the Caulobacter pilus assembly and the orf4 protein of Actinobacillus pilus formation gene cluster. The function of these proteins is unknown. The Caulobacter pilus assembly contains 7 genes: pilA, cpaA, cpaB, cpaC, cpaD, cpaE and cpaF. These genes are clustered together on the chromosome.	235
349766	cd03112	CobW-like	cobalamin synthesis protein CobW. The function of this protein family is unknown. The amino acid sequence of the YjiA protein in E. coli contains several conserved motifs that characterize it as a P-loop GTPase. The YjiA gene is among the genes significantly induced in response to DNA damage caused by mitomycin. The YjiA gene is a homologue of the CobW gene which encodes the cobalamin synthesis protein/P47K.	198
349767	cd03113	CTPS_N	N-terminal domain of cytidine 5'-triphosphate synthase. Cytidine 5'-triphosphate synthase (CTPS) is a two-domain protein, which consists of an N-terminal synthetase domain and C-terminal glutaminase domain. The enzymes hydrolyze the amide bond of glutamine to ammonia and glutamate at the glutaminase domains and transfer nascent ammonia to the acceptor substrate at the synthetase domain to form an aminated product.	261
349768	cd03114	MMAA-like	methylmalonic aciduria associated protein. Methylmalonyl Co-A mutase-associated GTPase MeaB and its human homolog, methylmalonic aciduria associated protein (MMAA) are metallochaperones that function as a G-protein chaperone that assists AdoCbl cofactor delivery to the methylmalonyl-CoA mutase (MCM) and reactivation of the enzyme during catalysis. A member of the family, Escherichia coli ArgK, was previously thought to be a membrane ATPase which is required for transporting arginine, ornithine and lysine into the cells by the arginine and ornithine (AO system) and lysine, arginine and ornithine (LAO) transport systems.	252
349769	cd03115	SRP_G_like	GTPase domain similar to the signal recognition particle subunit 54. The signal recognition particle (SRP) mediates the transport to or across the plasma membrane in bacteria and the endoplasmic reticulum in eukaryotes. SRP recognizes N-terminal signal sequences of newly synthesized polypeptides at the ribosome. The SRP-polypeptide complex is then targeted to the membrane by an interaction between SRP and its cognate receptor (SR). In mammals, SRP consists of six protein subunits and a 7SL RNA. One of these subunits is a 54 kd protein (SRP54), which is a GTP-binding protein that interacts with the signal sequence when it emerges from the ribosome. SRP54 is a multidomain protein that consists of an N-terminal domain, followed by a central G (GTPase) domain and a C-terminal M domain.	193
349770	cd03116	MobB	molybdopterin-guanine dinucleotide biosynthesis protein B. Molybdenum is an essential trace element in the form of molybdenum cofactor (Moco) which is associated with the metabolism of nitrogen, carbon and sulfur by redox active enzymes. In Escherichia coli, the synthesis of Moco involves genes from several loci: moa, mob, mod, moe and mog. The mob locus contains mobA and mobB genes. MobB catalyzes the attachment of the guanine dinucleotide to molybdopterin.	157
239391	cd03117	alpha_CA_IV_XV_like	Carbonic anhydrase alpha, CA_IV, CA_XV, like isozymes. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues. This subgroup, restricted to animals, contains isozyme IV and similar proteins such as mouse CA XV. Isozyme IV is attached to membranes via a glycosylphosphatidylinositol (GPI) tail. In mammals, Isozyme IV plays crucial roles in kidney and lung function, amongst others. This subgroup also contains the dual domain CA from the giant clam, Tridacna gigas. T. gigas CA plays a role in the movement of inorganic carbon from the surrounding seawater to the symbiotic algae found in the clam's tissues. CA XV is expressed in several species but not in humans or chimps. Similar to isozyme CA IV, CA XV attaches to membranes via a GPI tail.	234
239392	cd03118	alpha_CA_V	Carbonic anhydrase alpha, CA isozyme V_like subgroup.  Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidines. This vertebrate subgroup comprises isozyme V. CA V is the mitochondrial isozyme, which may play a role in gluconeogenesis and ureagenesis and possibly also in lipogenesis.	236
239393	cd03119	alpha_CA_I_II_III_XIII	Carbonic anhydrase alpha, isozymes I, II, III and XIII.  Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. Most alpha CAs are monomeric enzymes.  The zinc ion is complexed by three histidines. This vertebrate subgroup comprises isozymes I, II, and III, which are cytoplasmic enzymes. CA I, for example, is expressed in erythrocytes of many vertebrates; CA II is the most active cytosolic isozyme and, while expressed nearly ubiquitously, it comprises 95% of the renal carbonic anhydrase and is required for renal acidification; CA III has been implicated in protection from the damaging effect of oxidizing agents in hepatocytes. CA XIII may play important physiological roles in several organs.	259
239394	cd03120	alpha_CARP_VIII	Carbonic anhydrase alpha related protein, group VIII. Carbonic anhydrase related proteins (CARPs) are sequence similar to carbonic anhydrases. Carbonic anhydrases are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism. CARPs have lost conserved histidines involved in zinc binding and consequently their catalytic activity. CARP VIII may play roles in various biological processes of the central nervous system, and could be involved in protein-protein interactions. CARP VIII has been shown to bind inositol 1,4,5-triphosphate (IP3) receptor type I (IP3RI), reducing the affinity of the receptor for IP3. IP3RI is an intracellular IP3-gated Ca2+ channel located on intracellular Ca2+ stores. IP3RI converts IP3 signaling into Ca2+ signaling thereby participating in a variety of cell functions.	256
239395	cd03121	alpha_CARP_X_XI_like	Carbonic anhydrase alpha related protein: groups X, XI and related proteins. This subgroup contains carbonic anhydrase related proteins (CARPs) X and XI, which have been implicated in various biological processes of the central nervous system. CARPs are sequence similar to carbonic anhydrases. Carbonic anhydrases are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism. CARPs have lost conserved histidines involved in zinc binding and consequently their catalytic activity. CARP XI plays a role in the development of gastrointestinal stromal tumors.	256
239396	cd03122	alpha_CARP_receptor_like	Carbonic anhydrase alpha related protein, receptor_like subfamily. Carbonic anhydrase related proteins (CARPs) are sequence similar to carbonic anhydrases. Carbonic anhydrases are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism. CARPs have lost conserved histidines involved in zinc binding and consequently their catalytic activity. This sub-family of carbonic anhydrase-related domains found in tyrosine phosphatase receptors may play a role in cell adhesion.	253
239397	cd03123	alpha_CA_VI_IX_XII_XIV	Carbonic anhydrase alpha, isozymes VI, IX, XII and XIV. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Alpha CAs are mostly monomeric enzymes. The zinc ion is complexed by three histidine residues. This sub-family comprises the secreted CA VI, which is found in saliva, for example, and the membrane proteins CA IX, XII, and XIV.	248
239398	cd03124	alpha_CA_prokaryotic_like	Carbonic anhydrase alpha, prokaryotic-like subfamily. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidines. This sub-family includes bacterial carbonic anhydrase alpha, as well as plant enzymes such as tobacco nectarin III and yam dioscorin, and carbonic anhydrases from molluscs, such as nacrein, which are part of the organic matrix layer in shells. Other members of this family may be involved in maintaining pH balance, in facilitating transport of carbon dioxide or carbonic acid, or in sensing carbon dioxide levels in the environment. Dioscorin is the major storage protein of yam tubers and may play a role as an antioxidant. Tobacco nectarin may play a role in the maintenance of pH and oxidative balance in nectar. Mollusc nacrein may participate in calcium carbonate crystal formation of the nacreous layer. This subfamily also includes three alpha carbonic anhydrases from Chlamydomonas reinhardtii (CAH 1-3). CAH1 and CAH2 are localized in the periplasmic space. CAH1 facilitates the movement of carbon dioxide across the plasma membrane when the medium is alkaline. CAH3 is localized to the thylakoid lumen and provides CO2 to Rubisco.	216
239399	cd03125	alpha_CA_VI	Carbonic anhydrase alpha, isozyme VI. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes.  The zinc ion is complexed by three histidine residues. This sub-family comprises the secreted CA VI, which is found in saliva.	249
239400	cd03126	alpha_CA_XII_XIV	Carbonic anhydrase alpha, isozymes XII and XIV. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues. This sub-family comprises the membrane proteins CA XII and XIV.	249
239401	cd03127	tetraspanin_LEL	Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins.	90
153222	cd03128	GAT_1	Type 1 glutamine amidotransferase (GATase1)-like domain. Type 1 glutamine amidotransferase (GATase1)-like domain. This group contains proteins similar to Class I glutamine amidotransferases, the intracellular PH1704 from Pyrococcus horikoshii, the C-terminal of the large catalase: Escherichia coli HP-II, Sinorhizobium meliloti Rm1021 ThuA, the A4 beta-galactosidase middle domain and peptidase E. The majority of proteins in this group have a reactive Cys found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. For Class I glutamine amidotransferases, proteins which transfer ammonia from the amide side chain of glutamine to an acceptor substrate, this Cys forms a Cys-His-Glu catalytic triad in the active site. Glutamine amidotransferase activity can be found in a range of biosynthetic enzymes included in this cd: glutamine amidotransferase, formylglycinamide ribonucleotide, GMP synthetase, anthranilate synthase component II, glutamine-dependent carbamoyl phosphate synthase (CPSase), cytidine triphosphate synthetase, gamma-glutamyl hydrolase, imidazole glycerol phosphate synthase and cobyric acid synthase. For Pyrococcus horikoshii PH1704, the Cys of the nucleophile elbow together with a different His and a Glu from an adjacent monomer form a catalytic triad different from the typical GATase1 triad. Peptidase E is believed to be a serine peptidase having a Ser-His-Glu catalytic triad which differs from the Cys-His-Glu catalytic triad of typical GATase1 domains by having a Ser in place of the reactive Cys at the nucleophile elbow. The E. coli HP-II C-terminal domain, S. meliloti Rm1021 ThuA and the A4 beta-galactosidase middle domain lack the catalytic triad typical of GATase1 domains. GATase1-like domains can occur either as single polypeptides, as in Class I glutamine amidotransferases, or as domains in a much larger multifunctional synthase protein, such as CPSase. Peptidase E has a circular permutation in the common core of a typical GATase1 domain.	92
153223	cd03129	GAT1_Peptidase_E_like	Type 1 glutamine amidotransferase (GATase1)-like domain found in peptidase E_like proteins. Type 1 glutamine amidotransferase (GATase1)-like domain found in peptidase E_like proteins. This group contains proteins similar to the aspartyl dipeptidases Salmonella typhimurium peptidase E and Xenopus laevis peptidase E and, extracellular cyanophycinases from Pseudomonas anguilliseptica BI (CphE) and Synechocystis sp. PCC 6803 CphB. In bacteria peptidase E is believed to play a role in degrading peptides generated by intracellular protein breakdown or imported into the cell as nutrient sources. Peptidase E uniquely hydrolyses only Asp-X dipeptides (where X is any amino acid), and one tripeptide Asp-Gly-Gly.  Cyanophycinases are intracellular exopeptidases which hydrolyze the polymer cyanophycin (multi L-arginyl-poly-L-aspartic acid) to the dipeptide beta-Asp-Arg. Peptidase E and cyanophycinases are thought to have a Ser-His-Glu catalytic triad which differs from the Cys-His-Glu catalytic triad typical of GATase1 domains by having a Ser in place of the reactive Cys at the nucleophile elbow. Xenopus peptidase E is developmentally regulated in response to thyroid hormone and, it is thought to play a role in apoptosis during tail reabsorption.	210
153224	cd03130	GATase1_CobB	Type 1 glutamine amidotransferase (GATase1) domain found in Cobyrinic Acid a,c-Diamide Synthase. Type 1 glutamine amidotransferase (GATase1) domain found in Cobyrinic Acid a,c-Diamide Synthase. CobB plays a role in cobalamin biosynthesis, catalyzing the conversion of cobyrinic acid to cobyrinic acid a,c-diamide. CobB belongs to the triad family of amidotransferases. Two of the three residues of the catalytic triad that are involved in glutamine binding, hydrolysis and transfer of the resulting ammonia to the acceptor substrate in other triad amidotransferases are conserved in CobB.	198
153225	cd03131	GATase1_HTS	Type 1 glutamine amidotransferase (GATase1)-like domain found in homoserine trans-succinylase (HTS). Type 1 glutamine amidotransferase (GATase1)-like domain found in homoserine trans-succinylase (HTS). HTS, the first enzyme in methionine biosynthesis in Escherichia coli, transfers a succinyl group from succinyl-CoA to homoserine forming succinyl homoserine.  It has been suggested that the succinyl group of succinyl-CoA is initially transferred to an enzyme nucleophile before subsequent transfer to homoserine. The catalytic triad typical of GATase1 domains is not conserved in this GATase1-like domain. However, in common with GATase1 domains a reactive cys residue is found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. It has been proposed that this cys is in the active site of the molecule. However, as succinyl has been found bound to a conserved lysine residue, this conserved cys may play a role in dimer formation.  HTS activity is tightly regulated by several mechanisms including feedback inhibition and proteolysis. It represents a critical control point for cell growth and viability.	175
153226	cd03132	GATase1_catalase	Type 1 glutamine amidotransferase (GATase1)-like domain found at the C-terminal of several large catalases. Type 1 glutamine amidotransferase (GATase1)-like domain found at the C-terminal of several large catalases. Catalase catalyzes the dismutation of hydrogen peroxide (H2O2) to water and oxygen. This group includes the large catalases: Neurospora crassa Catalase-1 and Catalase-3 and Escherichia coli HP-II. This GATase1-like domain has an essential role in HP-II catalase activity. However, it lacks enzymatic activity and the catalytic triad typical of GATase1 domains. Catalase-1 and -3 are homotetrameric, HP-II is homohexameric. It has been proposed that this domain may facilitate the folding and oligomerization process. The interface between this GATase1-like domain of HP-II and the core of the subunit forms part of a channel which provides access to the deeply buried catalase active sites of HP-II. Catalase-1 is associated with non-growing cells; Catalase-3 is associated with growing conditions. HP-II is produced in stationary phase. Catalase-1 is induced by ethanol and heat shock. Catalase-3 is induced under stress conditions such as hydrogen peroxide, paraquat, cadmium, heat shock, uric acid and nitrate treatment.	142
153227	cd03133	GATase1_ES1	Type 1 glutamine amidotransferase (GATase1)-like domain found in zebrafish ES1. Type 1 glutamine amidotransferase (GATase1)-like domain found in zebrafish ES1. This group includes proteins similar to ES1, Escherichia coli enhancing lycopene biosynthesis protein 2, Azospirillum brasilense iaaC and human HES1. The catalytic triad typical of GATase1 domains is not conserved in this GATase1-like domain. However, in common with GATase1 domains, a reactive Cys residue is found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. Zebrafish ES1 is expressed specifically in adult photoreceptor cells and appears to be a cytoplasmic protein. A. brasilense iaaC is involved in controlling IAA biosynthesis.	213
153228	cd03134	GATase1_PfpI_like	A type 1 glutamine amidotransferase (GATase1)-like domain found in PfpI from Pyrococcus furiosus. A type 1 glutamine amidotransferase (GATase1)-like domain found in PfpI from Pyrococcus furiosus. This group includes proteins similar to PfpI from P. furiosus and PH1704 from Pyrococcus horikoshii. These enzymes are ATP-independent intracellular proteases and may hydrolyze small peptides to provide a nutritional source. Only Cys of the catalytic triad typical of GATase1 domains is conserved in this group. This Cys residue is found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. For PH1704, it is believed that this Cys together with a different His in one monomer and Glu (from an adjacent monomer) forms a different catalytic triad from the typical GATase1 domain. PfpI is homooligomeric. Protease activity is only found for oligomeric forms of PH1704.	165
153229	cd03135	GATase1_DJ-1	Type 1 glutamine amidotransferase (GATase1)-like domain found in Human DJ-1. Type 1 glutamine amidotransferase (GATase1)-like domain found in Human DJ-1. DJ-1 is involved in multiple physiological processes including cancer, Parkinson's disease and male fertility. It is unclear how DJ-1 functions in these. DJ-1 has been shown to possess chaperone activity. DJ-1 is preferentially expressed in the testis and moderately in other tissues; it is induced together with genes involved in oxidative stress response. The Drosophila homologue (DJ-1A) plays an essential role in oxidative stress response and neuronal maintenance. Inhibition of DJ-1A function through RNAi results in the cellular accumulation of reactive oxygen species, organismal hypersensitivity to oxidative stress, and dysfunction and degeneration of dopaminergic and photoreceptor neurons. DJ-1 lacks enzymatic activity and the catalytic triad of typical GATase1 domains; however, it does contain the highly conserved cysteine located at the nucleophile elbow region typical of these domains. This cysteine has been proposed to be a site of regulation of DJ-1 activity by oxidation. DJ-1 is a dimeric enzyme.	163
153230	cd03136	GATase1_AraC_ArgR_like	AraC transcriptional regulators having an N-terminal Type 1 glutamine amidotransferase (GATase1)-like domain. A subgroup of AraC transcriptional regulators having an N-terminal Type 1 glutamine amidotransferase (GATase1)-like domain. This group contains proteins similar to the Pseudomonas aeruginosa ArgR regulator. ArgR functions in the control of expression of certain genes of arginine biosynthesis and catabolism. AraC regulators are defined by an AraC-type helix-turn-helix DNA binding domain at their C-terminal. AraC family transcriptional regulators are widespread among bacteria and are involved in regulating diverse and important biological functions, including carbon metabolism, stress responses and virulence in different microorganisms. The catalytic triad typical of GATase1 domains is not conserved in this GATase1-like domain. However, in common with typical GATase1 domains, a reactive Cys residue is found in some sequences in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow.	185
153231	cd03137	GATase1_AraC_1	AraC transcriptional regulators having a Type 1 glutamine amidotransferase (GATase1)-like domain. A subgroup of AraC transcriptional regulators having a Type 1 glutamine amidotransferase (GATase1)-like domain. AraC regulators are defined by an AraC-type helix-turn-helix DNA binding domain at their C-terminal. AraC family transcriptional regulators are widespread among bacteria and are involved in regulating diverse and important biological functions, including carbon metabolism, stress responses and virulence in different microorganisms. The catalytic triad typical of GATase1 domains is not conserved in this GATase1-like domain. However, in common with typical GATase1 domains, a reactive Cys residue is found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow.	187
153232	cd03138	GATase1_AraC_2	AraC transcriptional regulators having a Type 1 glutamine amidotransferase (GATase1)-like domain. A subgroup of AraC transcriptional regulators having a Type 1 glutamine amidotransferase (GATase1)-like domain. AraC regulators are defined by an AraC-type helix-turn-helix DNA binding domain at their C-terminal. AraC family transcriptional regulators are widespread among bacteria and are involved in regulating diverse and important biological functions, including carbon metabolism, stress responses and virulence in different microorganisms. The catalytic triad typical of GATase1 domains is not conserved in this GATase1-like domain. However, in common with typical GATase1 domains, a reactive Cys residue is found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow.	195
153233	cd03139	GATase1_PfpI_2	Type 1 glutamine amidotransferase (GATase1)-like domain found in a subgroup of proteins similar to PfpI from Pyrococcus furiosus. Type 1 glutamine amidotransferase (GATase1)-like domain found in a subgroup of proteins similar to PfpI from Pyrococcus furiosus. PfpI is an ATP-independent intracellular protease which may hydrolyze small peptides to provide a nutritional source. Only Cys of the catalytic triad typical of GATase1 domains is conserved in this group. This Cys residue is found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow.	183
153234	cd03140	GATase1_PfpI_3	Type 1 glutamine amidotransferase (GATase1)-like domain found in a subgroup of proteins similar to PfpI from Pyrococcus furiosus. Type 1 glutamine amidotransferase (GATase1)-like domain found in a subgroup of proteins similar to PfpI from Pyrococcus furiosus. PfpI is an ATP-independent intracellular protease which may hydrolyze small peptides to provide a nutritional source. Only Cys of the catalytic triad typical of GATase1 domains is conserved in this group. This Cys residue is found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow.	170
153235	cd03141	GATase1_Hsp31_like	Type 1 glutamine amidotransferase (GATase1)-like domain found in proteins similar to Escherichia coli Hsp31 protein. Type 1 glutamine amidotransferase (GATase1)-like domain found in proteins similar to Escherichia coli Hsp31 protein (EcHsp31).  This group includes EcHsp31 and Saccharomyces cerevisiae Ydr533c protein.  EcHsp31 has chaperone activity.  Ydr533c is upregulated in response to various stress conditions along with the heat shock family.  EcHsp31 coordinates a metal ion using a 2-His-1-carboxylate motif present in various ions that use iron as a cofactor such as Carboxypeptidase A.   The catalytic triad typical of GATase1 domains is not conserved in this GATase1-like domain. However, in common with a typical GATase1 domain, a reactive Cys residue is found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. For EcHsp31, this Cys together with a different His and, an Asp (rather than a Glu) residue form a different catalytic triad from the typical GATase1 domain.  For Ydr533c a catalytic triad forms from the conserved Cys together with a different His and Glu from that of the typical GATase1domain. Ydr533c protein and EcHsp31 are homodimers.	221
153236	cd03142	GATase1_ThuA	Type 1 glutamine amidotransferase (GATase1)-like domain found in Sinorhizobium meliloti Rm1021 ThuA (SmThuA). Type 1 glutamine amidotransferase (GATase1)-like domain found in Sinorhizobium meliloti Rm1021 ThuA (SmThuA). This group includes proteins similar to SmThuA, which plays a role in a major pathway for trehalose catabolism. SmThuA is induced by trehalose but not by related structurally similar disaccharides like sucrose or maltose. Proteins in this group lack the catalytic triad of typical GATase1 domains: a His replaces the reactive Cys found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. S. meliloti Rm1021 thuA mutants are impaired in competitive colonization of Medicago sativa roots but are more competitive than the wild-type Rm1021 in infecting alfalfa roots and forming nitrogen-fixing nodules.	215
153237	cd03143	A4_beta-galactosidase_middle_domain	A4 beta-galactosidase middle domain: a type 1 glutamine amidotransferase (GATase1)-like domain. A4 beta-galactosidase middle domain: a type 1 glutamine amidotransferase (GATase1)-like domain. This group includes proteins similar to beta-galactosidase from Thermus thermophilus. Beta-Galactosidase hydrolyzes the beta-1,4-D-galactosidic linkage of lactose, as well as those of related chromogens, o-nitrophenyl-beta-D-galactopyranoside (ONP-Gal) and 5-bromo-4-chloro-3-indolyl-beta-D-galactoside (X-gal).  This A4 beta-galactosidase middle domain lacks the catalytic triad of typical GATase1 domains. The reactive Cys residue found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow in typical GATase1 domains is not conserved in this group.	154
153238	cd03144	GATase1_ScBLP_like	Type 1 glutamine amidotransferase (GATase1)-like domain found in proteins similar to Saccharomyces cerevisiae biotin-apoprotein ligase (ScBLP). Type 1 glutamine amidotransferase (GATase1)-like domain found in proteins similar to Saccharomyces cerevisiae biotin-apoprotein ligase (ScBLP). Biotin-apoprotein ligase modifies proteins by covalently attaching biotin. ScBLP is known to biotinylate acetyl-CoA carboxylase and pyruvate carboxylase. The catalytic triad typical of GATase1 domains is not conserved in this GATase1-like domain. However, the Cys residue found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow in a typical GATase1 domain is conserved.	114
153239	cd03145	GAT1_cyanophycinase	Type 1 glutamine amidotransferase (GATase1)-like domain found in cyanophycinase. Type 1 glutamine amidotransferase (GATase1)-like domain found in cyanophycinase. This group contains proteins similar to the extracellular cyanophycinases from Pseudomonas anguilliseptica BI (CphE) and Synechocystis sp. PCC 6803 CphB.  Cyanophycinases are intracellular exopeptidases which hydrolyze the polymer cyanophycin (multi L-arginyl-poly-L-aspartic acid) to the dipeptide beta-Asp-Arg. Cyanophycinase is believed to be a serine-type exopeptidase having a Ser-His-Glu catalytic triad which differs from the Cys-His-Glu catalytic triad typical of GATase1 domains by having a Ser in place of the reactive Cys at the nucleophile elbow.	217
153240	cd03146	GAT1_Peptidase_E	Type 1 glutamine amidotransferase (GATase1)-like domain found in peptidase E. Type 1 glutamine amidotransferase (GATase1)-like domain found in peptidase E. This group contains proteins similar to the aspartyl dipeptidases Salmonella typhimurium peptidase E and Xenopus laevis peptidase E. In bacteria peptidase E is believed to play a role in degrading peptides generated by intracellular protein breakdown or imported into the cell as nutrient sources. Peptidase E uniquely hydrolyses only Asp-X dipeptides (where X is any amino acid), and one tripeptide Asp-Gly-Gly.  Peptidase E is believed to be a serine peptidase having a Ser-His-Glu catalytic triad which differs from the Cys-His-Glu catalytic triad typical of GATase1 domains by having a Ser in place of the reactive Cys at the nucleophile elbow. Xenopus PepE  is developmentally regulated in response to thyroid hormone and, it is thought to play a role in apoptosis during tail reabsorption.	212
153241	cd03147	GATase1_Ydr533c_like	Type 1 glutamine amidotransferase (GATase1)-like domain found in Saccharomyces cerevisiae Ydr533c protein. Type 1 glutamine amidotransferase (GATase1)-like domain found in Saccharomyces cerevisiae Ydr533c protein. This group includes proteins similar to S. cerevisiae Ydr533c. Ydr533c is upregulated in response to various stress conditions along with the heat shock family. The catalytic triad typical of GATase1 domains is not conserved in this GATase1-like domain. However, in common with a typical GATase1 domain, a reactive Cys residue is found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. This Cys together with a different His and Glu residue form a different catalytic triad from the typical GATase1 domain. Ydr533c protein is a homodimer.	231
153242	cd03148	GATase1_EcHsp31_like	Type 1 glutamine amidotransferase (GATase1)-like domain found in Escherichia coli Hsp31 protein (EcHsp31). Type 1 glutamine amidotransferase (GATase1)-like domain found in Escherichia coli Hsp31 protein (EcHsp31).  This group includes proteins similar to EcHsp31.  EcHsp31 has chaperone activity.  EcHsp31 coordinates a metal ion using a 2-His-1-carboxylate motif present in various ions that use iron as a cofactor such as Carboxypeptidase A.   The catalytic triad typical of GATase1 domains is not conserved in this GATase1-like domain. However, in common with a typical GATase1domain, a reactive Cys residue is found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. This Cys together with a different His and, an Asp (rather than a Glu) residue form a different catalytic triad from the typical GATase1 domain.  EcHsp31 is a homodimer.	232
239402	cd03149	alpha_CA_VII	Carbonic anhydrase alpha, CA isozyme VII_like subgroup.  Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidines. This vertebrate subgroup comprises isozyme VII. CA VII is the most active cytosolic enzyme after CA II, and may be highly expressed in the brain. Human CA VII may be a target of antiepileptic sulfonamides/sulfamates.	236
239403	cd03150	alpha_CA_IX	Carbonic anhydrase alpha, isozyme IX. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Alpha CAs are strictly monomeric enzymes. The zinc ion is complexed by three histidine residues. This sub-family comprises the membrane protein CA IX. CA IX is functionally implicated in tumor growth and survival. CA IX is mainly present in solid tumors and its expression in normal tissues is limited to the mucosa of alimentary tract. CA IX is a transmembrane protein with two extracellular domains: carbonic anhydrase and,  a proteoglycan-like segment mediating cell-cell adhesion. There is evidence for an involvement of the MAPK pathway in the regulation of CA9 expression.	247
239404	cd03151	CD81_like_LEL	Tetraspanin, extracellular domain or large extracellular loop (LEL), CD81_like subfamily. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". CD81, also referred to as Target for anti-proliferative antigen-1, TAPA-1, is found in virtually all tissues, may be involved in regulation of cell growth and has been described as a member of the CD19/CD21/Leu-13 signal transduction complex identified on B cells (the B-Cell co-receptor).	84
239405	cd03152	CD9_LEL	Tetraspanin, extracellular domain or large extracellular loop (LEL), CD9 family. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". CD9 is found in virtually all tissues and is potentially involved in developmental processes. It associates with the tetraspanins CD81 and CD63, as well as with some integrins, and has been shown to be involved in a variety of activation, adhesion, and cell motility functions, as well as cell-cell interactions, such as during fertilization.	84
239406	cd03153	PHEMX_like_LEL	Tetraspanin, extracellular domain or large extracellular loop (LEL), PHEMX_like family. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". Phemx (pan hematopoietic expression) or TSSC6 may play a role in hematopoietic cell function.	87
239407	cd03154	TM4SF3_like_LEL	Tetraspanin, extracellular domain or large extracellular loop (LEL), TM4SF3_like subfamily. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". This subfamily contains transmembrane 4 superfamily 3 (TM4SF3) or D6.1a and related proteins. D6.1a associates with alpha6beta4 integrin and supports cell motility; it has been ascribed a role in tumor progression and metastasis.	100
239408	cd03155	CD151_like_LEL	Tetraspanin, extracellular domain or large extracellular loop (LEL), CD151_Like family. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". CD151 strongly associates with integrins, especially alpha3beta1, alpha6beta1, alpha7beta1, and alpha6beta4; it may play roles in cell-cell adhesion, cell migration, platelet aggregation, and angiogenesis. For example, CD151 is involved in regulation of migration of neutrophils, endothelial cells, and various tumor cell lines; it associates specifically with laminin-binding integrins and strengthens alpha6beta1 integrin-mediated adhesion to laminin-1; CD151 also specifically attenuates adhesion-dependent activation of Ras and corresponding downstream effects, and is involved in epithelial cell-cell adhesion as a modulator of PKC- and Cdc42-dependent actin cytoskeletal reorganization.	110
239409	cd03156	uroplakin_I_like_LEL	Tetraspanin, extracellular domain or large extracellular loop (LEL), uroplakin_I_like family. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". Uroplakin Ia and Ib are components of the 16nm protein particles, which are packed hexagonally to form 2D crystals of asymmetric unit membranes, and cover the apical surface of mammalian urothelium, contributing to the urinary bladder's permeability barrier function. Uroplakins Ia and Ib are maturation facilitators. They trigger conformational changes in their single-transmembrane-domain binding partner proteins uroplakin II and IIIa, which in turn may lead to ER-exit, stabilization, and cell-surface expression.	114
239410	cd03157	TM4SF12_like_LEL	Tetraspanin, extracellular domain or large extracellular loop (LEL), TM4SF12_like family. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". This sub-family contains proteins similar to human transmembrane 4 superfamily member 12 (TM4SF12).	103
239411	cd03158	penumbra_like_LEL	Tetraspanin, extracellular domain or large extracellular loop (LEL), penumbra_like family. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". Human Penumbra exhibits growth-suppressive activity in vitro and has been associated with myeloid malignancies.	119
239412	cd03159	TM4SF9_like_LEL	Tetraspanin, extracellular domain or large extracellular loop (LEL), TM4SF9_like subfamily. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". This subfamily contains transmembrane 4 superfamily 9 (TM4SF9) or Tetraspanin-5 and related proteins. TM4SF9 is strongly expressed within the central nervous system, and expression levels appear to correlate with differentiation status of particular neurons, hinting at a role in neuronal maturation.	121
239413	cd03160	CD37_CD82_like_LEL	Tetraspanin, extracellular domain or large extracellular loop (LEL), CD37_CD82_Like family. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". CD37 is a leukocyte-specific protein, and its restricted expression pattern suggests a role in the immune system. A regulatory role in T-cell proliferation has been suggested. CD82 is a metastasis suppressor implicated in biological processes ranging from fusion, adhesion, and migration to apoptosis and alterations of cell morphology.	117
239414	cd03161	TM4SF2_6_like_LEL	Tetraspanin, extracellular domain or large extracellular loop (LEL), TM4SF2_6_like subfamily. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". This subfamily contains transmembrane 4 superfamily 2 (TM4SF2) or Tspan-7, transmembrane 4 superfamily 6 (TM4SF6) or Tspan-6, and related proteins. TM4SF2 has been identified as involved in some forms of X-linked mental retardation.	104
239415	cd03162	peripherin_like_LEL	Tetraspanin, extracellular domain or large extracellular loop (LEL), peripherin_like family. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". Peripherin, or RDS (retinal degeneration slow), is a glycoprotein expressed in vertebrate photoreceptors, located at the rim of the disc membranes of the photoreceptor outer segments. RDS is thought to play a major role in folding and stacking of the discs. Mutations in RDS have been linked to hereditary retinal dystrophies, which typically exhibit a wide phenotypic spectrum.	143
239416	cd03163	TM4SF8_like_LEL	Tetraspanin, extracellular domain or large extracellular loop (LEL), TM4SF8_like subfamily. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". This subfamily contains transmembrane 4 superfamily 8 (TM4SF8) or Tspan-3 and related proteins. Tspan-3 has been reported to form a complex with integrin beta1 and OSP/claudin-11, which may be involved in oligodendrocyte proliferation and migration.	105
239417	cd03164	CD53_like_LEL	Tetraspanin, extracellular domain or large extracellular loop (LEL), CD53_Like family. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". CD53 is a tetraspanin of the lymphoid-myeloid lineage and has been implicated in apoptosis protection. It associates with integrin alpha4beta1. Some of the cellular responses modulated by CD53 may be mediated by JNK activation and/or via the AKT pathway.	86
239418	cd03165	NET-5_like_LEL	Tetraspanin, extracellular domain or large extracellular loop (LEL), NET-5_like family. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". This sub-family contains proteins similar to human tetraspan NET-5.	98
239419	cd03166	CD63_LEL	Tetraspanin, extracellular domain or large extracellular loop (LEL), CD63 family. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". CD63 is present in platelets, neutrophils, and endothelial cells, amongst others. In platelets it associates with the integrin alphaIIbbeta3 and may modulate alphaIIbbeta3-dependent cytoskeletal reorganization.	99
239420	cd03167	oculospanin_like_LEL	Tetraspanin, extracellular domain or large extracellular loop (LEL), oculospanin_like family. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". This subfamily contains sequences similar to oculospanin, which is found to be expressed in retinal pigment epithelium, iris, ciliary body, and retinal ganglion cells.	120
153243	cd03169	GATase1_PfpI_1	Type 1 glutamine amidotransferase (GATase1)-like domain found in a subgroup of proteins similar to PfpI from Pyrococcus furiosus. Type 1 glutamine amidotransferase (GATase1)-like domain found in a subgroup of proteins similar to PfpI from Pyrococcus furiosus.   PfpI is an ATP-independent intracellular protease which may hydrolyze small peptides to provide a nutritional source.  Only Cys of the catalytic triad typical of GATase1 domains is conserved in this group. This Cys residue is found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow.	180
239421	cd03171	SORL_Dfx_classI	Superoxide reductase-like (SORL) domain, class I; SORL-domains are present in a family of mononuclear non-heme iron proteins that includes superoxide reductase and desulfoferrodoxin.  Superoxide reductase-like proteins scavenge superoxide anion radicals as a defense mechanism against reactive oxygen species and are found in anaerobic bacteria and archaea, and microaerophilic Treponema pallidum. Desulfoferrodoxin (class I) is a homodimeric protein, with each protomer comprised of two domains, the N-terminal desulforedoxin (DSRD) domain and C-terminal SORL domain. Each domain has a distinct iron center: the DSRD iron center I, Fe(S-Cys)4; and the SORL iron center II, Fe[His4Cys(Glu)].	78
239422	cd03172	SORL_classII	Superoxide reductase-like (SORL) domain, class II; SORL-domains are present in a family of mononuclear non-heme iron proteins that includes superoxide reductase and desulfoferrodoxin.  Superoxide reductase-like proteins scavenge superoxide anion radicals as a defense mechanism against reactive oxygen species and are found in anaerobic bacteria and archaea, and microaerophilic Treponema pallidum. The SORL domain contains an active iron site, Fe[His4Cys(Glu)], which in the reduced state loses the glutamate ligand. Superoxide reductase (class II) forms a homotetramer with four Fe[His4Cys(Glu)] centers.	104
176264	cd03173	DUF619-like	DUF619 domain of various N-acetylglutamate Kinases and N-acetylglutamate Synthases. DUF619-like: This family includes the DUF619 domain of various N-acetylglutamate synthases (NAGS) of the urea cycle found in humans and fish, the DUF619 domain of the NAGS of the fungal arginine-biosynthetic pathway (FABP), as well as the DUF619 domain present C-terminal of a NAG kinase-like domain in a limited number of predicted NAGSs found in bacteria and Dictyostelium. Ureogenic NAGS is a mitochondrial enzyme catalyzing the formation of NAG from acetylcoenzyme A and L-glutamate. NAGS is an essential allosteric activator of carbamylphosphate synthase I, the first and rate limiting enzyme of the urea cycle. Domain architecture of ureogenic and fungal NAGS consists of an N-terminal NAG kinase-like domain and a C-terminal DUF619 domain. This subgroup also includes the DUF619 domain of the FABP N-acetylglutamate kinase (NAGK), the enzyme that catalyzes the second reaction of arginine biosynthesis; the phosphorylation of the gamma-carboxyl group of NAG to produce N-acetylglutamylphosphate (NAGP) which is subsequently converted to ornithine in two more steps. The nuclear-encoded, mitochondrial polyprotein precursor (ARG5,6) consists of an N-terminal NAGK (ArgB) domain, a central DUF619 domain, and a C-terminal reductase domain (ArgC, N-acetylglutamate phosphate reductase). The DUF619 domain function has yet to be characterized.	98
163674	cd03174	DRE_TIM_metallolyase	DRE-TIM metallolyase superfamily. The DRE-TIM metallolyase superfamily includes 2-isopropylmalate synthase (IPMS), alpha-isopropylmalate synthase (LeuA), 3-hydroxy-3-methylglutaryl-CoA lyase, homocitrate synthase, citramalate synthase, 4-hydroxy-2-oxovalerate aldolase, re-citrate synthase, transcarboxylase 5S, pyruvate carboxylase, AksA, and FrbC.  These members all share a conserved  triose-phosphate isomerase (TIM) barrel domain consisting of a core beta(8)-alpha(8) motif with the eight parallel beta strands forming an enclosed barrel surrounded by eight alpha helices.  The domain has a catalytic center containing a divalent cation-binding site formed by a cluster of invariant residues that cap the core of the barrel.  In addition, the catalytic site includes three invariant residues - an aspartate (D), an arginine (R), and a glutamate (E) - which is the basis for the domain name "DRE-TIM".	265
198287	cd03177	GST_C_Delta_Epsilon	C-terminal, alpha helical domain of Class Delta and Epsilon Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, Class Delta and Epsilon subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. The class Delta and Epsilon subfamily is made up primarily of insect GSTs, which play major roles in insecticide resistance by facilitating reductive dehydrochlorination of insecticides or conjugating them with GSH to produce water-soluble metabolites that are easily excreted. They are also implicated in protection against cellular damage by oxidative stress.	117
198288	cd03178	GST_C_Ure2p_like	C-terminal, alpha helical domain of Ure2p and related Glutathione S-transferase-like proteins. Glutathione S-transferase (GST) C-terminal domain family, Ure2p-like subfamily; composed of the Saccharomyces cerevisiae Ure2p, YfcG and YghU from Escherichia coli, and related GST-like proteins. Ure2p is a regulator for nitrogen catabolism in yeast. It represses the expression of several gene products involved in the use of poor nitrogen sources when rich sources are available. A transmissible conformational change of Ure2p results in a prion called [Ure3], an inactive, self-propagating and infectious amyloid. Ure2p displays a GST fold containing an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain. The N-terminal thioredoxin-fold domain is sufficient to induce the [Ure3] phenotype and is also called the prion domain of Ure2p. In addition to its role in nitrogen regulation, Ure2p confers protection to cells against heavy metal ion and oxidant toxicity, and shows glutathione (GSH) peroxidase activity. YfcG and YghU are two of the nine GST homologs in the genome of Escherichia coli. They display very low or no GSH transferase activity, but show very good disulfide bond oxidoreductase activity. YghU also shows modest organic hydroperoxide reductase activity. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of GSH with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST active site is located in a cleft between the N- and C-terminal domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain.	110
198289	cd03180	GST_C_2	C-terminal, alpha helical domain of an unknown subfamily 2 of Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, unknown subfamily 2; composed of uncharacterized bacterial proteins, with similarity to GSTs. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain.	110
198290	cd03181	GST_C_EF1Bgamma_like	Glutathione S-transferase C-terminal-like, alpha helical domain of the Gamma subunit of Elongation Factor 1B and similar proteins. Glutathione S-transferase (GST) C-terminal domain family, Gamma subunit of Elongation Factor 1B (EF1Bgamma) subfamily; EF1Bgamma is part of the eukaryotic translation elongation factor-1 (EF1) complex which plays a central role in the elongation cycle during protein biosynthesis. EF1 consists of two functionally distinct units, EF1A and EF1B. EF1A catalyzes the GTP-dependent binding of aminoacyl-tRNA to the ribosomal A site concomitant with the hydrolysis of GTP. The resulting inactive EF1A:GDP complex is recycled to the active GTP form by the guanine-nucleotide exchange factor EF1B, a complex composed of at least two subunits, alpha and gamma. Metazoan EF1B contains a third subunit, beta. The EF1B gamma subunit contains a GST fold consisting of an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain. The GST-like domain of EF1Bgamma is believed to mediate the dimerization of the EF1 complex, which in yeast is a dimer of the heterotrimer EF1A:EF1Balpha:EF1Bgamma. In addition to its role in protein biosynthesis, EF1Bgamma may also display other functions. The recombinant rice protein has been shown to possess GSH conjugating activity. The yeast EF1Bgamma binds to membranes in a calcium dependent manner and is also part of a complex that binds to the msrA (methionine sulfoxide reductase) promoter suggesting a function in the regulation of its gene expression. Also included in this subfamily is the GST_C-like domain at the N-terminus of human valyl-tRNA synthetase (ValRS) and its homologs. Metazoan ValRS forms a stable complex with Elongation Factor-1H (EF-1H), and together, they catalyze consecutive steps in protein biosynthesis, tRNA aminoacylation and its transfer to EF.	123
198291	cd03182	GST_C_GTT2_like	C-terminal, alpha helical domain of GTT2-like Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, Saccharomyces cerevisiae GTT2-like subfamily; composed of predominantly uncharacterized proteins with similarity to the Saccharomyces cerevisiae GST protein, GTT2. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. GTT2, a homodimer, exhibits GST activity with standard substrates. Strains with deleted GTT2 genes are viable but exhibit increased sensitivity to heat shock.	116
198292	cd03183	GST_C_Theta	C-terminal, alpha helical domain of Class Theta Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, Class Theta subfamily; composed of eukaryotic class Theta GSTs and bacterial dichloromethane (DCM) dehalogenase. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Mammalian class Theta GSTs show poor GSH conjugating activity towards the standard substrates, CDNB and ethacrynic acid, differentiating them from other mammalian GSTs. GSTT1-1 shows catalytic activity similar to that of bacterial DCM dehalogenase, catalyzing the GSH-dependent hydrolytic dehalogenation of dihalomethanes. This is an essential process in methylotrophic bacteria to enable them to use chloromethane and DCM as sole carbon and energy sources. The presence of polymorphisms in human GSTT1-1 and its relationship to the onset of diseases including cancer is the subject of many studies. Human GSTT2-2 exhibits a highly specific sulfatase activity, catalyzing the cleavage of sulfate ions from aralkyl sulfate esters, but not from the aryl or alkyl sulfate esters.	126
198293	cd03184	GST_C_Omega	C-terminal, alpha helical domain of Class Omega Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, Class Omega subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Class Omega GSTs show little or no GSH-conjugating activity towards standard GST substrates. Instead, they catalyze the GSH dependent reduction of protein disulfides, dehydroascorbate and monomethylarsonate, activities which are more characteristic of glutaredoxins. They contain a conserved cysteine equivalent to the first cysteine in the CXXC motif of glutaredoxins, which is a redox active residue capable of reducing GSH mixed disulfides in a monothiol mechanism. Polymorphisms of the class Omega GST genes may be associated with the development of some types of cancer and the age-at-onset of both Alzheimer's and Parkinson's diseases.	124
198294	cd03185	GST_C_Tau	C-terminal, alpha helical domain of Class Tau Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, Class Tau subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. The plant-specific class Tau GST subfamily has undergone extensive gene duplication. The Arabidopsis and Oryza genomes contain 28 and 40 Tau GSTs, respectively. They are primarily responsible for herbicide detoxification together with class Phi GSTs, showing class specificity in substrate preference. Tau enzymes are highly efficient in detoxifying diphenylether and aryloxyphenoxypropionate herbicides. In addition, Tau GSTs play important roles in intracellular signalling, biosynthesis of anthocyanin, responses to soil stresses and responses to auxin and cytokinin hormones.	127
198295	cd03186	GST_C_SspA	C-terminal, alpha helical domain of Stringent starvation protein A. Glutathione S-transferase (GST) C-terminal domain family, Stringent starvation protein A (SspA) subfamily; SspA is an RNA polymerase (RNAP)-associated protein required for the lytic development of phage P1 and for stationary phase-induced acid tolerance of E. coli. It is implicated in survival during nutrient starvation. SspA adopts the GST fold with an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, but it does not bind glutathione (GSH) and lacks GST activity. SspA is highly conserved among gram-negative bacteria. Related proteins found in Neisseria (called RegF), Francisella and Vibrio regulate the expression of virulence factors necessary for pathogenesis.	108
198296	cd03187	GST_C_Phi	C-terminal, alpha helical domain of Class Phi Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, Class Phi subfamily; composed of plant-specific class Phi GSTs and related fungal and bacterial proteins. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. The class Phi GST subfamily has experienced extensive gene duplication. The Arabidopsis and Oryza genomes contain 13 and 16 Phi GSTs, respectively. They are primarily responsible for herbicide detoxification together with class Tau GSTs, showing class specificity in substrate preference. Phi enzymes are highly reactive toward chloroacetanilide and thiocarbamate herbicides. Some Phi GSTs have other functions including transport of flavonoid pigments to the vacuole, shoot regeneration and GSH peroxidase activity.	118
198297	cd03188	GST_C_Beta	C-terminal, alpha helical domain of Class Beta Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, Class Beta subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Unlike mammalian GSTs which detoxify a broad range of compounds, the bacterial class Beta GSTs exhibit GSH conjugating activity with a narrow range of substrates. In addition to GSH conjugation, they are involved in the protection against oxidative stress and are able to bind antibiotics and reduce the antimicrobial activity of beta-lactam drugs, contributing to antibiotic resistance. The structure of the Proteus mirabilis enzyme reveals that the cysteine in the active site forms a covalent bond with GSH. One member of this subfamily is a GST from Burkholderia xenovorans LB400 that is encoded by the bphK gene and is part of the biphenyl catabolic pathway.	113
198298	cd03189	GST_C_GTT1_like	C-terminal, alpha helical domain of GTT1-like Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, Saccharomyces cerevisiae GTT1-like subfamily; composed of predominantly uncharacterized proteins with similarity to the S. cerevisiae GST protein, GTT1, and the Schizosaccharomyces pombe GST-III. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. GTT1, a homodimer, exhibits GST activity with standard substrates and associates with the endoplasmic reticulum. Its expression is induced after diauxic shift and remains high throughout the stationary phase. S. pombe GST-III is implicated in the detoxification of various metals.	123
198299	cd03190	GST_C_Omega_like	C-terminal, alpha helical domain of Class Omega-like Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, Saccharomyces cerevisiae Omega-like subfamily; composed of three Saccharomyces cerevisiae GST omega-like (Gto) proteins, Gto1p, Gto2p (also known as Extracellular mutant protein 4 or ECM4p), and Gto3p, as well as similar uncharacterized proteins from fungi and bacteria. The three Saccharomyces cerevisiae Gto proteins are omega-class GSTs with low or no GST activity against standard substrates, but have glutaredoxin/thiol oxidoreductase and dehydroascorbate reductase activity through a single cysteine residue in the active site. Gto1p is located in the peroxisomes while Gto2p and Gto3p are cytosolic. The gene encoding Gto2p, called ECM4, is involved in cell surface biosynthesis and architecture. S. cerevisiae ECM4 mutants show increased amounts of the cell wall hexose, N-acetylglucosamine. More recently, global gene expression analysis shows that ECM4 is upregulated during genotoxic conditions and together with the expression profiles of 18 other genes could potentially differentiate between genotoxic and cytotoxic insults in yeast.	142
198300	cd03191	GST_C_Zeta	C-terminal, alpha helical domain of Class Zeta Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, Class Zeta subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Class Zeta GSTs, also known as maleylacetoacetate (MAA) isomerases, catalyze the isomerization of MAA to fumarylacetoacetate, the penultimate step in tyrosine/phenylalanine catabolism, using GSH as a cofactor. They show little GSH-conjugating activity towards traditional GST substrates, but display modest GSH peroxidase activity. They are also implicated in the detoxification of the carcinogen dichloroacetic acid by catalyzing its dechlorination to glyoxylic acid.	121
198301	cd03192	GST_C_Sigma_like	C-terminal, alpha helical domain of Class Sigma-like Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, Class Sigma_like; composed of GSTs belonging to class Sigma and similar proteins, including GSTs from class Mu, Pi, and Alpha. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Vertebrate class Sigma GSTs are characterized as GSH-dependent hematopoietic prostaglandin (PG) D synthases and are responsible for the production of PGD2 by catalyzing the isomerization of PGH2. The functions of PGD2 include the maintenance of body temperature, inhibition of platelet aggregation, bronchoconstriction, vasodilation, and mediation of allergy and inflammation. Other class Sigma-like members include the class II insect GSTs, S-crystallins from cephalopods, nematode-specific GSTs, and 28-kDa GSTs from parasitic flatworms. Drosophila GST2 is associated with indirect flight muscle and exhibits preference for catalyzing GSH conjugation to lipid peroxidation products, indicating an anti-oxidant role. S-crystallin constitutes the major lens protein in cephalopod eyes and is responsible for lens transparency and proper refractive index. The 28-kDa GST from Schistosoma is a multifunctional enzyme, exhibiting GSH transferase, GSH peroxidase, and PGD2 synthase activities, and may play an important role in host-parasite interactions. Members also include novel GSTs from the fungus Cunninghamella elegans, designated as class Gamma, and from the protozoan Blepharisma japonicum, described as a light-inducible GST.	104
198302	cd03193	GST_C_Metaxin	C-terminal, alpha helical domain of Metaxin and related proteins. Glutathione S-transferase (GST) C-terminal domain family, Metaxin subfamily; composed of metaxins and related proteins. Metaxin 1 is a component of a preprotein import complex of the mitochondrial outer membrane. It extends to the cytosol and is anchored to the mitochondrial membrane through its C-terminal domain. In mice, metaxin is required for embryonic development. In humans, alterations in the metaxin gene may be associated with Gaucher disease. Metaxin 2 binds to metaxin 1 and may also play a role in protein translocation into the mitochondria. Genome sequencing shows that a third metaxin gene also exists in zebrafish, Xenopus, chicken, and mammals. Sequence analysis suggests that all three metaxins share a common ancestry and that they possess similarity to GSTs. Also included in the subfamily are uncharacterized proteins with similarity to metaxins, including a novel GST from Rhodococcus with toluene o-monooxygenase and glutamylcysteine synthetase activities. Other members are the cadmium-inducible lysosomal protein CDR-1 and its homologs from C. elegans, and the failed axon connections (fax) protein from Drosophila. CDR-1 is an integral membrane protein that functions to protect against cadmium toxicity and may also have a role in osmoregulation to maintain salt balance in C. elegans. The fax gene of Drosophila was identified as a genetic modifier of Abelson (Abl) tyrosine kinase. The fax protein is localized in cellular membranes and is expressed in embryonic mesoderm and axons of the central nervous system.	88
198303	cd03194	GST_C_3	C-terminal, alpha helical domain of an unknown subfamily 3 of Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, unknown subfamily 3; composed of uncharacterized proteins with similarity to GSTs. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain.	115
198304	cd03195	GST_C_4	C-terminal, alpha helical domain of an unknown subfamily 4 of Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, unknown subfamily 4; composed of uncharacterized bacterial proteins with similarity to GSTs. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain.	114
198305	cd03196	GST_C_5	C-terminal, alpha helical domain of an unknown subfamily 5 of Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, unknown subfamily 5; composed of uncharacterized bacterial proteins with similarity to GSTs. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain.	115
198306	cd03197	GST_C_mPGES2	C-terminal, alpha helical domain of microsomal Prostaglandin E synthase Type 2. Glutathione S-transferase (GST) C-terminal domain family, microsomal Prostaglandin E synthase Type 2 (mPGES2) subfamily; mPGES2 is a membrane-anchored dimeric protein containing a CXXC motif which catalyzes the isomerization of PGH2 to PGE2. Unlike cytosolic PGE synthase (cPGES) and microsomal PGES Type 1 (mPGES1), mPGES2 does not require glutathione (GSH) for its activity, although its catalytic rate is increased two- to four-fold in the presence of DTT, GSH, or other thiol compounds. PGE2 is widely distributed in various tissues and is implicated in the sleep/wake cycle, relaxation/contraction of smooth muscle, excretion of sodium ions, maintenance of body temperature, and mediation of inflammation. mPGES2 contains an N-terminal hydrophobic domain which is membrane associated and a C-terminal soluble domain with a GST-like structure.  The C-terminal GST-like domain contains two structural domains, an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain. The GST active site is located in a cleft between the two structural domains.	149
198307	cd03198	GST_C_CLIC	C-terminal, alpha helical domain of Chloride Intracellular Channels. Glutathione S-transferase (GST) C-terminal domain family, Chloride Intracellular Channel (CLIC) subfamily; composed of CLICs (CLIC1-6 in vertebrates), p64, parchorin, and similar proteins. They are auto-inserting, self-assembling intracellular anion channels involved in a wide variety of functions including regulated secretion, cell division, and apoptosis. They can exist in both water-soluble and membrane-bound states and are found in various vesicles and membranes, and they may play roles in the maintenance of these intracellular membranes. Biochemical studies of the Caenorhabditis elegans homolog, EXC-4, show that the membrane localization domain is present in the N-terminal part of the protein. CLICs display structural plasticity, with CLIC1 adopting two soluble conformations. The structure of soluble human CLIC1 reveals that it is monomeric and adopts a fold similar to GSTs, containing an N-terminal domain with a thioredoxin fold and a C-terminal alpha helical domain. Upon oxidation, the N-terminal domain of CLIC1 undergoes a structural change to form a non-covalent dimer stabilized by the formation of an intramolecular disulfide bond between two cysteines that are far apart in the reduced form. The CLIC1 dimer bears no similarity to GST dimers. The redox-controlled structural rearrangement exposes a large hydrophobic surface, which is masked by dimerization in vitro. In vivo, this surface may represent the docking interface of CLIC1 in its membrane-bound state. The two cysteines in CLIC1 that form the disulfide bond in oxidizing conditions are essential for dimerization and chloride channel activity; however, in other subfamily members, the second cysteine is not conserved.	119
198308	cd03199	GST_C_GRX2	C-terminal, alpha helical domain of Glutaredoxin 2. Glutathione S-transferase (GST) C-terminal domain family, Glutaredoxin 2 (GRX2) subfamily; composed of Escherichia coli GRX2 and similar proteins. Escherichia coli GRX2 is an atypical GRX with a molecular mass of about 24kD (most GRXs range from 9-12kD). It adopts a GST fold containing an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain. It contains a redox active CXXC motif located in the N-terminal domain, but is not able to reduce ribonucleotide reductase like other GRXs. However, it catalyzes GSH-dependent protein disulfide reduction of other substrates efficiently. GRX2 is thought to function primarily in catalyzing the reversible glutathionylation of proteins in cellular redox regulation including stress responses.	128
198309	cd03200	GST_C_AIMP2	Glutathione S-transferase C-terminal-like, alpha helical domain of Aminoacyl tRNA synthetase complex-Interacting Multifunctional Protein 2. Glutathione S-transferase (GST) C-terminal domain family, Aminoacyl tRNA synthetase complex-Interacting Multifunctional Protein (AIMP) 2 subfamily; AIMPs are non-enzymatic cofactors that play critical roles in the assembly and formation of a macromolecular multi-tRNA synthetase protein complex that functions as a molecular hub to coordinate protein synthesis. There are three AIMPs, named AIMP1-3, which play diverse regulatory roles. AIMP2, also called p38 or JTV-1, contains a C-terminal domain with similarity to the C-terminal alpha helical domain of GSTs. It plays an important role in the control of cell fate via antiproliferative (by enhancing the TGF-beta signal) and proapoptotic (activation of p53 and TNF-alpha) activities. Its roles in the control of cell proliferation and death suggest that it is a potent tumor suppressor. AIMP2 heterozygous mice with lower than normal expression of AIMP2 show high susceptibility to tumorigenesis. AIMP2 is also a substrate of Parkin, an E3 ubiquitin ligase that is involved in the ubiquitylation and proteasomal degradation of its substrates. Mutations in the Parkin gene are found in 50% of patients with autosomal-recessive early-onset parkinsonism. The accumulation of AIMP2, due to impaired Parkin function, may play a role in the pathogenesis of Parkinson's disease.	96
198310	cd03201	GST_C_DHAR	C-terminal, alpha helical domain of Dehydroascorbate Reductase. Glutathione S-transferase (GST) C-terminal domain family, Dehydroascorbate Reductase (DHAR) subfamily; composed of plant-specific DHARs, which are monomeric enzymes catalyzing the reduction of DHA into ascorbic acid (AsA) using glutathione as the reductant. DHAR allows plants to recycle oxidized AsA before it is lost. AsA serves as a cofactor of violaxanthin de-epoxidase in the xanthophyll cycle and as an antioxidant in the detoxification of reactive oxygen species. Because AsA is the major reductant in plants, DHAR serves to regulate their redox state. It has been suggested that a significant portion of DHAR activity is plastidic, acting to reduce the large amounts of ascorbate oxidized during hydrogen peroxide scavenging by ascorbate peroxidase. DHAR contains a conserved cysteine in its active site and in addition to its reductase activity, shows thiol transferase activity similar to glutaredoxins.	121
198311	cd03202	GST_C_etherase_LigE	C-terminal, alpha helical domain of Beta etherase LigE. Glutathione S-transferase (GST) C-terminal domain family, Beta etherase LigE subfamily; composed of proteins similar to Sphingomonas paucimobilis beta etherase, LigE, a GST-like protein that catalyzes the cleavage of the beta-aryl ether linkages present in low-molecular weight lignins using GSH as the hydrogen donor. This reaction is an essential step in the degradation of lignin, a complex phenolic polymer that is the most abundant aromatic material in the biosphere. The beta etherase activity of LigE is enantioselective and it complements the activity of the other GST family beta etherase, LigF. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains.	124
198312	cd03203	GST_C_Lambda	C-terminal, alpha helical domain of Class Lambda Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, Class Lambda subfamily; composed of plant-specific class Lambda GSTs. GSTs are cytosolic, usually dimeric, proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. The class Lambda subfamily was recently discovered, together with dehydroascorbate reductases (DHARs), as two outlying groups of the GST superfamily in Arabidopsis thaliana, which contain conserved active site cysteines. Characterization of recombinant A. thaliana proteins shows that Lambda class GSTs are monomeric, similar to DHARs. They do not exhibit GSH conjugating or DHAR activities, but are active as thiol transferases, similar to glutaredoxins. Members of this subfamily were originally identified as encoded proteins of the In2-1 gene, which can be induced by treatment with herbicide safeners.	120
198313	cd03204	GST_C_GDAP1_like	C-terminal, alpha helical domain of Ganglioside-induced differentiation-associated protein 1-like proteins. Glutathione S-transferase (GST) C-terminal domain family, Ganglioside-induced differentiation-associated protein 1 (GDAP1)-like subfamily; GDAP1 was originally identified as a highly expressed gene at the differentiated stage of GD3 synthase-transfected cells. More recently, mutations in GDAP1 have been reported to cause both axonal and demyelinating autosomal-recessive Charcot-Marie-Tooth (CMT) type 4A neuropathy. CMT is characterized by slow and progressive weakness and atrophy of muscles. Sequence analysis of GDAP1 shows similarities and differences with GSTs: it appears to contain both N-terminal thioredoxin-fold and C-terminal alpha helical domains of GSTs; however, unlike GSTs, it also contains additional C-terminal transmembrane domains. GDAP1 is mainly expressed in neuronal cells and is localized in the mitochondria through its transmembrane domains. It does not exhibit GST activity using standard substrates.	111
198314	cd03205	GST_C_6	C-terminal, alpha helical domain of an unknown subfamily 6 of Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, unknown subfamily 6; composed of uncharacterized bacterial proteins with similarity to GSTs, including Pseudomonas fluorescens GST with a known three-dimensional structure. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Though the three-dimensional structure of Pseudomonas fluorescens GST has been determined, there is no information on its functional characterization.	109
198315	cd03206	GST_C_7	C-terminal, alpha helical domain of an unknown subfamily 7 of Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, unknown subfamily 7; composed of uncharacterized proteins with similarity to GSTs. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain.	100
198316	cd03207	GST_C_8	C-terminal, alpha helical domain of an unknown subfamily 8 of Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, unknown subfamily 8; composed of Agrobacterium tumefaciens GST and other uncharacterized bacterial proteins with similarity to GSTs. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. The three-dimensional structure of Agrobacterium tumefaciens GST has been determined but there is no information on its functional characterization.	101
198317	cd03208	GST_C_Alpha	C-terminal, alpha helical domain of Class Alpha Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, Class Alpha subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. The class Alpha subfamily is composed of vertebrate GSTs which can form homodimers and heterodimers. There are at least six types of class Alpha GST subunits in rats, four of which have human counterparts, resulting in many possible isoenzymes with different activities, tissue distribution and substrate specificities. Human GSTA1-1 and GSTA2-2 show high GSH peroxidase activity. GSTA3-3 catalyzes the isomerization of intermediates in steroid hormone biosynthesis. GSTA4-4 preferentially catalyzes the GSH conjugation of alkenals.	135
198318	cd03209	GST_C_Mu	C-terminal, alpha helical domain of Class Mu Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, Class Mu subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. The class Mu subfamily is composed of eukaryotic GSTs. In rats, at least six distinct class Mu subunits have been identified, with homologous genes in humans for five of these subunits. Class Mu GSTs can form homodimers and heterodimers, giving a large number of possible isoenzymes that can be formed, all with overlapping activities but different substrate specificities. They are the most abundant GSTs in human liver, skeletal muscle and brain, and are believed to provide protection against diseases including cancer and neurodegenerative disorders. Some isoenzymes have additional specific functions. Human GST M1-1 acts as an endogenous inhibitor of ASK1 (apoptosis signal-regulating kinase 1) thereby suppressing ASK1-mediated cell death. Human GSTM2-2 and 3-3 have been identified as prostaglandin E2 synthases in the brain and may play crucial roles in temperature and sleep-wake regulation.	121
198319	cd03210	GST_C_Pi	C-terminal, alpha helical domain of Class Pi Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, Class Pi subfamily; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Class Pi GST is a homodimeric eukaryotic protein. The human GSTP1 is mainly found in erythrocytes, kidney, placenta and fetal liver. It is involved in stress responses and in cellular proliferation pathways as an inhibitor of JNK (c-Jun N-terminal kinase). Following oxidative stress, monomeric GSTP1 dissociates from JNK and dimerizes, losing its ability to bind JNK and causing an increase in JNK activity, thereby promoting apoptosis. GSTP1 is expressed in various tumors and is the predominant GST in a wide range of cancer cells. It has been implicated in the development of multidrug-resistant tumors.	126
198320	cd03211	GST_C_Metaxin2	C-terminal, alpha helical domain of Metaxin 2. Glutathione S-transferase (GST) C-terminal domain family, Metaxin subfamily, Metaxin 2; a metaxin 1 binding protein identified through a yeast two-hybrid system using metaxin 1 as the bait. Metaxin 2 shares sequence similarity with metaxin 1 but does not contain a C-terminal mitochondrial outer membrane signal-anchor domain. It associates with mitochondrial membranes through its interaction with metaxin 1, which is a component of the mitochondrial preprotein import complex of the outer membrane. The biological function of metaxin 2 is unknown. It is likely that it also plays a role in protein translocation into the mitochondria. However, this has not been experimentally validated. In a recent proteomics study, it has been shown that metaxin 2 is overexpressed in response to lipopolysaccharide-induced liver injury.	126
198321	cd03212	GST_C_Metaxin1_3	C-terminal, alpha helical domain of Metaxin 1, Metaxin 3, and similar proteins. Glutathione S-transferase (GST) C-terminal domain family, Metaxin subfamily, Metaxin 1-like proteins; composed of metaxins 1 and 3, and similar proteins. Mammalian metaxin (or metaxin 1) is a component of the preprotein import complex of the mitochondrial outer membrane. Metaxin extends to the cytosol and is anchored to the mitochondrial membrane through its C-terminal domain. In mice, metaxin is required for embryonic development. Like the murine gene, the human metaxin gene is located downstream to the glucocerebrosidase (GBA) pseudogene and is convergently transcribed. Inherited deficiency of GBA results in Gaucher disease, which presents many diverse clinical phenotypes. Alterations in the metaxin gene, in addition to GBA mutations, may be associated with Gaucher disease. Genome sequencing shows that a third metaxin gene also exists in zebrafish, Xenopus, chicken, and mammals.	137
213180	cd03213	ABCG_EPDR	Eye pigment and drug resistance transporter subfamily G of the ATP-binding cassette superfamily. ABCG transporters are involved in eye pigment (EP) precursor transport, regulation of lipid-trafficking mechanisms, and pleiotropic drug resistance (DR). DR is a well-described phenomenon occurring in fungi and shares several similarities with processes in bacteria and higher eukaryotes. Compared to other members of the ABC transporter subfamilies, the ABCG transporter family is composed of proteins that have an ATP-binding cassette domain at the N-terminus and a TM (transmembrane) domain at the C-terminus.	194
213181	cd03214	ABC_Iron-Siderophores_B12_Hemin	ATP-binding component of iron-siderophores, vitamin B12 and hemin transporters and related proteins. ABC transporters, involved in the uptake of siderophores, heme, and vitamin B12, are widely conserved in bacteria and archaea. Only very few species lack representatives of the siderophore family transporters. The E. coli BtuCD protein is an ABC transporter mediating vitamin B12 uptake. The two ATP-binding cassettes (BtuD) are in close contact with each other, as are the two membrane-spanning subunits (BtuC); this arrangement is distinct from that observed for the E. coli lipid flippase MsbA. The BtuC subunits provide 20 transmembrane helices grouped around a translocation pathway that is closed to the cytoplasm by a gate region, whereas the dimer arrangement of the BtuD subunits resembles the ATP-bound form of the Rad50 DNA repair enzyme. A prominent cytoplasmic loop of BtuC forms the contact region with the ATP-binding cassette and represents a conserved motif among the ABC transporters.	180
213182	cd03215	ABC_Carb_Monos_II	Second domain of the ATP-binding cassette component of monosaccharide transport system. This family represents domain II of the carbohydrate uptake proteins that transport only monosaccharides (Monos). The Carb_Monos family is involved in the uptake of monosaccharides, such as pentoses (such as xylose, arabinose, and ribose) and hexoses (such as glucose, galactose, and fructose), that cannot be broken down to simple sugars by hydrolysis. In members of the Carb_Monos family, the single hydrophobic gene product forms a homodimer, while the ABC protein represents a fusion of two nucleotide-binding domains. However, it is assumed that two copies of the ABC domains are present in the assembled transporter.	182
213183	cd03216	ABC_Carb_Monos_I	First domain of the ATP-binding cassette component of monosaccharide transport system. This family represents domain I of the carbohydrate uptake proteins that transport only monosaccharides (Monos). The Carb_Monos family is involved in the uptake of monosaccharides, such as pentoses and hexoses, that cannot be broken down to simple sugars by hydrolysis. Pentoses include xylose, arabinose, and ribose. Important hexoses include glucose, galactose, and fructose. In members of the Carb_Monos family, the single hydrophobic gene product forms a homodimer while the ABC protein represents a fusion of two nucleotide-binding domains. However, it is assumed that two copies of the ABC domains are present in the assembled transporter.	163
213184	cd03217	ABC_FeS_Assembly	ABC-type transport system involved in Fe-S cluster assembly, ATPase component. Biosynthesis of iron-sulfur clusters (Fe-S) depends on multi-protein systems. The SUF system of E. coli and Erwinia chrysanthemi is important for Fe-S biogenesis under stressful conditions. The SUF system is made of six proteins: SufC is an atypical cytoplasmic ABC-ATPase, which forms a complex with SufB and SufD; SufA plays the role of a scaffold protein for assembly of iron-sulfur clusters and delivery to target proteins; SufS is a cysteine desulfurase which mobilizes the sulfur atom from cysteine and provides it to the cluster; SufE has no associated function yet.	200
213185	cd03218	ABC_YhbG	ATP-binding cassette component of YhbG transport system. The ABC transporters belonging to the YhbG family are similar to members of the Mj1267_LivG family, which is involved in the transport of branched-chain amino acids. The genes yhbG and yhbN are located in a single operon and may function together in cell envelope during biogenesis. YhbG is the putative ATP-binding cassette component and YhbN is the putative periplasmic-binding protein. Depletion of each gene product leads to growth arrest, irreversible cell damage and loss of viability in E. coli. The YhbG homolog (NtrA) is essential in Rhizobium meliloti, a symbiotic nitrogen-fixing bacterium.	232
213186	cd03219	ABC_Mj1267_LivG_branched	ATP-binding cassette component of branched chain amino acids transport system. The Mj1267/LivG ABC transporter subfamily is involved in the transport of the hydrophobic amino acids leucine, isoleucine and valine. MJ1267 is a branched-chain amino acid transporter with 29% similarity to both the LivF and LivG components of the E. coli branched-chain amino acid transporter. MJ1267 contains an insertion from residues 114 to 123 characteristic of LivG (Leucine-Isoleucine-Valine) homologs. The branched-chain amino acid transporter from E. coli comprises a heterodimer of ABCs (LivF and LivG), a heterodimer of six-helix TM domains (LivM and LivH), and one of two alternative soluble periplasmic substrate binding proteins (LivK or LivJ).	236
213187	cd03220	ABC_KpsT_Wzt	ATP-binding cassette component of polysaccharide transport system. The KpsT/Wzt ABC transporter subfamily is involved in extracellular polysaccharide export. Among the variety of membrane-linked or extracellular polysaccharides excreted by bacteria, only capsular polysaccharides, lipopolysaccharides, and teichoic acids have been shown to be exported by ABC transporters. A typical system is made of a conserved integral membrane protein and an ABC protein. In addition to these proteins, capsular polysaccharide exporter systems require two 'accessory' proteins to perform their function: a periplasmic (E. coli) or a lipid-anchored outer membrane protein called OMA (Neisseria meningitidis and Haemophilus influenzae) and a cytoplasmic membrane protein MPA2.	224
213188	cd03221	ABCF_EF-3	ATP-binding cassette domain of elongation factor 3, subfamily F. Elongation factor 3 (EF-3) is a cytosolic protein required by fungal ribosomes for in vitro protein synthesis and for in vivo growth. EF-3 stimulates the binding of the EF-1: GTP: aa-tRNA ternary complex to the ribosomal A site by facilitated release of the deacylated tRNA from the E site. The reaction requires ATP hydrolysis. EF-3 contains two ATP nucleotide binding sequence (NBS) motifs. NBSI is sufficient for the intrinsic ATPase activity. NBSII is essential for the ribosome-stimulated functions.	144
213189	cd03222	ABC_RNaseL_inhibitor	ATP-binding cassette domain of RNase L inhibitor. The ABC ATPase RNase L inhibitor (RLI) is a key enzyme in ribosomal biogenesis, formation of translation preinitiation complexes, and assembly of HIV capsids. RLIs are not transport proteins, and thus cluster with a group of soluble proteins that lack the transmembrane components commonly found in other members of the family. Structurally, RLIs have an N-terminal Fe-S domain and two nucleotide-binding domains, which are arranged to form two composite active sites in their interface cleft. RLI is one of the most conserved enzymes between archaea and eukaryotes, with a sequence identity of more than 48%. The high degree of evolutionary conservation suggests that RLI performs a central role in archaeal and eukaryotic physiology.	177
213190	cd03223	ABCD_peroxisomal_ALDP	ATP-binding cassette domain of peroxisomal transporter, subfamily D. Peroxisomal ATP-binding cassette transporter (Pat) is involved in the import of very long-chain fatty acids (VLCFA) into the peroxisome. The peroxisomal membrane forms a permeability barrier for a wide variety of metabolites required for and formed during fatty acid beta-oxidation. To communicate with the cytoplasm and mitochondria, peroxisomes need dedicated proteins to transport such hydrophilic molecules across their membranes. X-linked adrenoleukodystrophy (X-ALD) is caused by mutations in the ALD gene, which encodes ALDP (adrenoleukodystrophy protein), a peroxisomal integral membrane protein that is a member of the ATP-binding cassette (ABC) transporter protein family. The disease is characterized by a striking and unpredictable variation in phenotypic expression. Phenotypes include the rapidly progressive childhood cerebral form (CCALD), the milder adult form, adrenomyeloneuropathy (AMN), and variants without neurologic involvement (i.e. asymptomatic).	166
213191	cd03224	ABC_TM1139_LivF_branched	ATP-binding cassette domain of branched-chain amino acid transporter. LivF (TM1139) is part of the LIV-I bacterial ABC-type two-component transport system that imports neutral, branched-chain amino acids. The E. coli branched-chain amino acid transporter comprises a heterodimer of ABC transporters (LivF and LivG), a heterodimer of six-helix TM domains (LivM and LivH), and one of two alternative soluble periplasmic substrate binding proteins (LivK or LivJ). ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules.	222
213192	cd03225	ABC_cobalt_CbiO_domain1	First domain of the ATP-binding cassette component of cobalt transport system. Domain I of the ABC component of a cobalt transport family found in bacteria, archaea, and eukaryota. The transition metal cobalt is an essential component of many enzymes and must be transported into cells in appropriate amounts when needed. This ABC transport system of the CbiMNQO family is involved in cobalt transport in association with the cobalamin (vitamin B12) biosynthetic pathways. Most of cobalt (Cbi) transport systems possess a separate CbiN component, the cobalt-binding periplasmic protein, and they are encoded by the conserved gene cluster cbiMNQO. Both the CbiM and CbiQ proteins are integral cytoplasmic membrane proteins, and the CbiO protein has the linker peptide and the Walker A and B motifs commonly found in the ATPase components of the ABC-type transport systems.	211
213193	cd03226	ABC_cobalt_CbiO_domain2	Second domain of the ATP-binding cassette component of cobalt transport system. Domain II of the ABC component of a cobalt transport family found in bacteria, archaea, and eukaryota. The transition metal cobalt is an essential component of many enzymes and must be transported into cells in appropriate amounts when needed. The CbiMNQO family ABC transport system is involved in cobalt transport in association with the cobalamin (vitamin B12) biosynthetic pathways. Most cobalt (Cbi) transport systems possess a separate CbiN component, the cobalt-binding periplasmic protein, and they are encoded by the conserved gene cluster cbiMNQO. Both the CbiM and CbiQ proteins are integral cytoplasmic membrane proteins, and the CbiO protein has the linker peptide and the Walker A and B motifs commonly found in the ATPase components of the ABC-type transport systems.	205
213194	cd03227	ABC_Class2	ATP-binding cassette domain of non-transporter proteins. ABC-type Class 2 contains systems involved in cellular processes other than transport. These families are characterized by the fact that the ABC subunit is made up of duplicated, fused ABC modules (ABC2). No known transmembrane proteins or domains are associated with these proteins.	162
213195	cd03228	ABCC_MRP_Like	ATP-binding cassette domain of multidrug resistance protein-like transporters. The MRP (Multidrug Resistance Protein)-like transporters are involved in drug, peptide, and lipid export. They belong to the subfamily C of the ATP-binding cassette (ABC) superfamily of transport proteins. The ABCC subfamily contains transporters with a diverse functional spectrum that includes ion transport, cell surface receptor, and toxin secretion activities. The MRP-like family, similar to all ABC proteins, have a common four-domain core structure constituted by two membrane-spanning domains, each composed of six transmembrane (TM) helices, and two nucleotide-binding domains (NBD). ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins.	171
213196	cd03229	ABC_Class3	ATP-binding cassette domain of the binding protein-dependent transport systems. This class is comprised of all BPD (Binding Protein Dependent) systems that are largely represented in archaea and eubacteria and are primarily involved in scavenging solutes from the environment. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins.	178
213197	cd03230	ABC_DR_subfamily_A	ATP-binding cassette domain of the drug resistance transporter and related proteins, subfamily A. This family of ATP-binding proteins belongs to a multi-subunit transporter involved in drug resistance (BcrA and DrrA), nodulation, lipid transport, and lantibiotic immunity. In bacteria and archaea, these transporters usually include an ATP-binding protein and one or two integral membrane proteins. Eukaryotic systems of the ABCA subfamily display ABC domains that are quite similar to this family. The ATP-binding domain shows the highest similarity between all members of the ABC transporter family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins.	173
213198	cd03231	ABC_CcmA_heme_exporter	Cytochrome c biogenesis ATP-binding export protein. CcmA is the ATP-binding component of the bacterial CcmAB transporter. The CCM family is involved in bacterial cytochrome c biogenesis. Cytochrome c maturation in E. coli requires the ccm operon, which encodes eight membrane proteins (CcmABCDEFGH). CcmE is a periplasmic heme chaperone that binds heme covalently and transfers it onto apocytochrome c in the presence of CcmF, CcmG, and CcmH. The CcmAB proteins represent an ABC transporter and the CcmCD proteins participate in heme transfer to CcmE.	201
213199	cd03232	ABCG_PDR_domain2	Second domain of the pleiotropic drug resistance-like (PDR) subfamily G of ATP-binding cassette transporters. The pleiotropic drug resistance (PDR) is a well-described phenomenon occurring in fungi and shares several similarities with processes in bacteria and higher eukaryotes. This PDR subfamily represents domain II of its (ABC-IM)2 organization. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds including sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins.	192
213200	cd03233	ABCG_PDR_domain1	First domain of the pleiotropic drug resistance-like subfamily G of ATP-binding cassette transporters. The pleiotropic drug resistance (PDR) is a well-described phenomenon occurring in fungi and shares several similarities with processes in bacteria and higher eukaryotes. This PDR subfamily represents domain I of its (ABC-IM)2 organization. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds including sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins.	202
213201	cd03234	ABCG_White	White pigment protein homolog of ABCG transporter subfamily. The White subfamily represents ABC transporters homologous to the Drosophila white gene, which acts as a dimeric importer for eye pigment precursors. The eye pigmentation of Drosophila is developed from the synthesis and deposition in the cells of red pigments, which are synthesized from guanine, and brown pigments, which are synthesized from tryptophan. The pigment precursors are encoded by the white, brown, and scarlet genes, respectively. Evidence from genetic and biochemical studies suggests that the White and Brown proteins function as heterodimers to import guanine, while the White and Scarlet proteins function to import tryptophan. However, a recent study also suggests that White may be involved in the transport of a metabolite, such as 3-hydroxykynurenine, across intracellular membranes. Mammalian ABC transporters belonging to the White subfamily (ABCG1, ABCG5, and ABCG8) have been shown to be involved in the regulation of lipid-trafficking mechanisms in macrophages, hepatocytes, and intestinal mucosa cells. ABCG1 (ABC8), the human homolog of the Drosophila white gene, is induced in monocyte-derived macrophages during cholesterol influx mediated by acetylated low-density lipoprotein. It is possible that human ABCG1 forms heterodimers with several heterologous partners.	226
213202	cd03235	ABC_Metallic_Cations	ATP-binding cassette domain of the metal-type transporters. This family includes transporters involved in the uptake of various metallic cations such as iron, manganese, and zinc. The ATPases of this group of transporters are very similar to members of iron-siderophore uptake family suggesting that they share a common ancestor. The best characterized metal-type ABC transporters are the YfeABCD system of Y. pestis, the SitABCD system of Salmonella enterica serovar Typhimurium, and the SitABCD transporter of Shigella flexneri. Moreover other uncharacterized homologs of these metal-type transporters are mainly found in pathogens like Haemophilus or enteroinvasive E. coli isolates.	213
213203	cd03236	ABC_RNaseL_inhibitor_domain1	The ATP-binding cassette domain 1 of RNase L inhibitor. The ABC ATPase, RNase L inhibitor (RLI), is a key enzyme in ribosomal biogenesis, formation of translation preinitiation complexes, and assembly of HIV capsids. RLIs are not transport proteins and thus cluster with a group of soluble proteins that lack the transmembrane components commonly found in other members of the family. Structurally, RLIs have an N-terminal Fe-S domain and two nucleotide-binding domains which are arranged to form two composite active sites in their interface cleft. RLI is one of the most conserved enzymes between archaea and eukaryotes, with a sequence identity of more than 48%. The high degree of evolutionary conservation suggests that RLI performs a central role in archaeal and eukaryotic physiology.	255
213204	cd03237	ABC_RNaseL_inhibitor_domain2	The ATP-binding cassette domain 2 of RNase L inhibitor. The ABC ATPase, RNase L inhibitor (RLI), is a key enzyme in ribosomal biogenesis, formation of translation preinitiation complexes, and assembly of HIV capsids. RLI's are not transport proteins and thus cluster with a group of soluble proteins that lack the transmembrane components commonly found in other members of the family. Structurally, RLI's have an N-terminal Fe-S domain and two nucleotide-binding domains which are arranged to form two composite active sites in their interface cleft. RLI is one of the most conserved enzymes between archaea and eukaryotes with a sequence identity of more than 48%. The high degree of evolutionary conservation suggests that RLI performs a central role in archaeal and eukaryotic physiology.	246
213205	cd03238	ABC_UvrA	ATP-binding cassette domain of the excision repair protein UvrA. Nucleotide excision repair in eubacteria is a process that repairs DNA damage by the removal of a 12-13-mer oligonucleotide containing the lesion. Recognition and cleavage of the damaged DNA is a multistep ATP-dependent reaction that requires the UvrA, UvrB, and UvrC proteins. Both UvrA and UvrB are ATPases, with UvrA having two ATP binding sites, which have the characteristic signature of the family of ABC proteins, and UvrB having one ATP binding site that is structurally related to that of helicases.	176
213206	cd03239	ABC_SMC_head	The SMC head domain belongs to the ATP-binding cassette superfamily. The structural maintenance of chromosomes (SMC) proteins are essential for successful chromosome transmission during replication and segregation of the genome in all organisms. SMCs are generally present as single proteins in bacteria, and as at least six distinct proteins in eukaryotes. The proteins range in size from approximately 110 to 170 kDa, and each has five distinct domains: amino- and carboxy-terminal globular domains, which contain sequences characteristic of ATPases, two coiled-coil regions separating the terminal domains, and a central flexible hinge. SMC proteins function together with other proteins in a range of chromosomal transactions, including chromosome condensation, sister-chromatid cohesion, recombination, DNA repair, and epigenetic silencing of gene expression.	178
213207	cd03240	ABC_Rad50	ATP-binding cassette domain of Rad50. The catalytic domains of Rad50 are similar to the ATP-binding cassette of ABC transporters, but are not associated with membrane-spanning domains. The conserved ATP-binding motifs common to Rad50 and the ABC transporter family include the Walker A and Walker B motifs, the Q loop, a histidine residue in the switch region, a D-loop, and a conserved LSGG sequence. This conserved sequence, LSGG, is the most specific and characteristic motif of this family and is thus known as the ABC signature sequence.	204
213208	cd03241	ABC_RecN	ATP-binding cassette domain of RecN. RecN ATPase involved in DNA repair; similar to ABC (ATP-binding cassette) transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds including sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins.	276
213209	cd03242	ABC_RecF	ATP-binding cassette domain of RecF. RecF is a recombinational DNA repair ATPase that maintains replication in the presence of DNA damage. When replication is prematurely disrupted by DNA damage, several recF pathway gene products play critical roles processing the arrested replication fork, allowing it to resume and complete its task. This CD represents the nucleotide binding domain of RecF. RecF belongs to a large superfamily of ABC transporters involved in the transport of a wide variety of different compounds including sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases with a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins.	270
213210	cd03243	ABC_MutS_homologs	ATP-binding cassette domain of MutS homologs. The MutS protein initiates DNA mismatch repair by recognizing mispaired and unpaired bases embedded in duplex DNA and activating endo- and exonucleases to remove the mismatch. Members of the MutS family also possess a conserved ATPase activity that belongs to the ATP binding cassette (ABC) superfamily. MutS homologs (MSH) have been identified in most prokaryotic and all eukaryotic organisms examined. Prokaryotes have two homologs (MutS1 and MutS2), whereas seven MSH proteins (MSH1 to MSH7) have been identified in eukaryotes. The homodimer MutS1 and heterodimers MSH2-MSH3 and MSH2-MSH6 are primarily involved in mitotic mismatch repair, whereas MSH4-MSH5 is involved in resolution of Holliday junctions during meiosis. All members of the MutS family contain the highly conserved Walker A/B ATPase domain, and many share a common mechanism of action. MutS1, MSH2-MSH3, MSH2-MSH6, and MSH4-MSH5 dimerize to form sliding clamps, and recognition of specific DNA structures or lesions results in ADP/ATP exchange.	202
213211	cd03244	ABCC_MRP_domain2	ATP-binding cassette domain 2 of multidrug resistance-associated protein. The ABC subfamily C is also known as MRP (multidrug resistance-associated protein). Some of the MRP members have five additional transmembrane segments in their N-terminus, but the function of these additional membrane-spanning domains is not clear. The MRP was found in the multidrug-resistant lung cancer cell in which p-glycoprotein was not overexpressed. MRP exports glutathione by drug stimulation, as well as certain substrates in conjugated forms with anions, such as glutathione, glucuronate, and sulfate.	221
213212	cd03245	ABCC_bacteriocin_exporters	ATP-binding cassette domain of bacteriocin exporters, subfamily C. Many non-lantibiotic bacteriocins of lactic acid bacteria are produced as precursors which have N-terminal leader peptides that share similarities in amino acid sequence and contain a conserved processing site of two glycine residues in positions -1 and -2. A dedicated ATP-binding cassette (ABC) transporter is responsible for the proteolytic cleavage of the leader peptides and subsequent translocation of the bacteriocins across the cytoplasmic membrane.	220
213213	cd03246	ABCC_Protease_Secretion	ATP-binding cassette domain of PrtD, subfamily C. This family represents the ABC component of the protease secretion system PrtD, a 60-kDa integral membrane protein sharing 37% identity with HlyB, the ABC component of the alpha-hemolysin secretion pathway, in the C-terminal domain. They export degradative enzymes by using a type I protein secretion system and lack an N-terminal signal peptide, but contain a C-terminal secretion signal. The Type I secretion apparatus is made up of three components, an ABC transporter, a membrane fusion protein (MFP), and an outer membrane protein (OMP). For the HlyA transporter complex, HlyB (ABC transporter) and HlyD (MFP) reside in the inner membrane of E. coli. The OMP component is TolC, which is thought to interact with the MFP to form a continuous channel across the periplasm from the cytoplasm to the exterior. HlyB belongs to the family of ABC transporters, which are ubiquitous, ATP-dependent transmembrane pumps or channels. The spectrum of transport substrates ranges from inorganic ions, nutrients such as amino acids, sugars, or peptides, hydrophobic drugs, to large polypeptides, such as HlyA.	173
213214	cd03247	ABCC_cytochrome_bd	ATP-binding cassette domain of CydCD, subfamily C. The CYD subfamily is implicated in cytochrome bd biogenesis. The CydC and CydD proteins are important for the formation of cytochrome bd terminal oxidase of E. coli, and it has been proposed that they are necessary for biosynthesis of the cytochrome bd quinol oxidase and for periplasmic c-type cytochromes. CydCD were proposed to form a heterooligomeric complex important for heme export into the periplasm or to be involved in the maintenance of the proper redox state of the periplasmic space. In Bacillus subtilis, the absence of CydCD does not affect the presence of holo-cytochrome c in the membrane, and this observation suggests that CydCD proteins are not involved in the export of heme in this organism.	178
213215	cd03248	ABCC_TAP	ATP-binding cassette domain of the Transporter Associated with Antigen Processing, subfamily C. TAP (Transporter Associated with Antigen Processing) is essential for peptide delivery from the cytosol into the lumen of the endoplasmic reticulum (ER), where these peptides are loaded on major histocompatibility complex (MHC) I molecules. Loaded MHC I leave the ER and display their antigenic cargo on the cell surface to cytotoxic T cells. Subsequently, virus-infected or malignantly transformed cells can be eliminated. TAP belongs to the large family of ATP-binding cassette (ABC) transporters, which translocate a vast variety of solutes across membranes.	226
213216	cd03249	ABC_MTABC3_MDL1_MDL2	ATP-binding cassette domain of a mitochondrial protein MTABC3 and related proteins. MTABC3 (also known as ABCB6) is a mitochondrial ATP-binding cassette protein involved in iron homeostasis and one of four ABC transporters expressed in the mitochondrial inner membrane, the other three being MDL1(ABC7), MDL2, and ATM1. In fact, the yeast MDL1 (multidrug resistance-like protein 1) and MDL2 (multidrug resistance-like protein 2) transporters are also included in this CD. MDL1 is an ATP-dependent permease that acts as a high-copy suppressor of ATM1 and is thought to have a role in resistance to oxidative stress. Interestingly, subfamily B is more closely related to the carboxyl-terminal component of subfamily C than the two halves of ABCC molecules are with one another.	238
213217	cd03250	ABCC_MRP_domain1	ATP-binding cassette domain 1 of multidrug resistance-associated protein, subfamily C. This subfamily is also known as MRP (multidrug resistance-associated protein). Some of the MRP members have five additional transmembrane segments in their N-terminus, but the function of these additional membrane-spanning domains is not clear. The MRP was found in the multidrug-resistant lung cancer cell in which p-glycoprotein was not overexpressed. MRP exports glutathione by drug stimulation, as well as certain substrates in conjugated forms with anions, such as glutathione, glucuronate, and sulfate.	204
213218	cd03251	ABCC_MsbA	ATP-binding cassette domain of the bacterial lipid flippase and related proteins, subfamily C. MsbA is an essential ABC transporter, closely related to eukaryotic MDR proteins. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins.	234
213219	cd03252	ABCC_Hemolysin	ATP-binding cassette domain of hemolysin B, subfamily C. The ABC-transporter hemolysin B is a central component of the secretion machinery that translocates the toxin, hemolysin A, in a Sec-independent fashion across both membranes of E. coli. The hemolysin A (HlyA) transport machinery is composed of the ATP-binding cassette (ABC) transporter HlyB located in the inner membrane, hemolysin D (HlyD), also anchored in the inner membrane, and TolC, which resides in the outer membrane. HlyD apparently forms a continuous channel that bridges the entire periplasm, interacting with TolC and HlyB. This arrangement prevents the appearance of periplasmic intermediates of HlyA during substrate transport. Little is known about the molecular details of HlyA transport, but it is evident that ATP-hydrolysis by the ABC-transporter HlyB is a necessary source of energy.	237
213220	cd03253	ABCC_ATM1_transporter	ATP-binding cassette domain of iron-sulfur clusters transporter, subfamily C. ATM1 is an ABC transporter that is expressed in the mitochondria. Although the specific function of ATM1 is unknown, its disruption results in the accumulation of excess mitochondrial iron, loss of mitochondrial cytochromes, oxidative damage to mitochondrial DNA, and decreased levels of cytosolic heme proteins. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins.	236
213221	cd03254	ABCC_Glucan_exporter_like	ATP-binding cassette domain of glucan transporter and related proteins, subfamily C. Glucan exporter ATP-binding protein. In A. tumefaciens cyclic beta-1, 2-glucan must be transported into the periplasmic space to exert its action as a virulence factor. This subfamily belongs to the MRP-like family and is involved in drug, peptide, and lipid export. The MRP-like family, similar to all ABC proteins, have a common four-domain core structure constituted by two membrane-spanning domains each composed of six transmembrane (TM) helices and two nucleotide-binding domains (NBD). ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins.	229
213222	cd03255	ABC_MJ0796_LolCDE_FtsE	ATP-binding cassette domain of the transporters involved in export of lipoprotein and macrolide, and cell division protein. This family is comprised of MJ0796 ATP-binding cassette, macrolide-specific ABC-type efflux carrier (MacAB), and proteins involved in cell division (FtsE), and release of lipoproteins from the cytoplasmic membrane (LolCDE). They are clustered together phylogenetically. MacAB is an exporter that confers resistance to macrolides, while the LolCDE system is not a transporter at all. FtsE null mutants showed filamentous growth and appeared viable on high-salt medium only, indicating a role for FtsE in cell division and/or salt transport. The LolCDE complex catalyzes the release of lipoproteins from the cytoplasmic membrane prior to their targeting to the outer membrane.	218
213223	cd03256	ABC_PhnC_transporter	ATP-binding cassette domain of the binding protein-dependent phosphonate transport system. Phosphonates are a class of organophosphorus compounds characterized by a chemically stable carbon-to-phosphorus (C-P) bond. Phosphonates are widespread among naturally occurring compounds in all kingdoms of life, but only prokaryotic microorganisms are able to cleave this bond. Certain bacteria such as E. coli can use alkylphosphonates as a phosphorus source. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins.	241
213224	cd03257	ABC_NikE_OppD_transporters	ATP-binding cassette domain of nickel/oligopeptides specific transporters. The ABC transporter subfamily specific for the transport of dipeptides, oligopeptides (OppD), and nickel (NikDE). The NikABCDE system of E. coli belongs to this family and is composed of the periplasmic binding protein NikA, two integral membrane components (NikB and NikC), and two ATPases (NikD and NikE). The NikABCDE transporter is synthesized under anaerobic conditions to meet the increased demand for nickel resulting from hydrogenase synthesis. The molecular mechanism of nickel uptake in many bacteria and most archaea is not known. Many other members of this ABC family are also involved in the uptake of dipeptides and oligopeptides. The oligopeptide transport system (Opp) is a five-component ABC transporter composed of a membrane-anchored substrate-binding protein (SBP), OppA, two transmembrane proteins, OppB and OppC, and two ATP-binding domains, OppD and OppF.	228
213225	cd03258	ABC_MetN_methionine_transporter	ATP-binding cassette domain of methionine transporter. MetN (also known as YusC) is an ABC-type transporter encoded by metN of the metNPQ operon in Bacillus subtilis that is involved in methionine transport. Other members of this system include the MetP permease and the MetQ substrate binding protein. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins.	233
213226	cd03259	ABC_Carb_Solutes_like	ATP-binding cassette domain of the carbohydrate and solute transporters-like. This family is comprised of proteins involved in the transport of apparently unrelated solutes and proteins specific for di- and oligosaccharides and polyols. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins.	213
213227	cd03260	ABC_PstB_phosphate_transporter	ATP-binding cassette domain of the phosphate transport system. Phosphate uptake is of fundamental importance in the cell physiology of bacteria because phosphate is required as a nutrient. The Pst system of E. coli comprises four distinct subunits encoded by the pstS, pstA, pstB, and pstC genes. The PstS protein is a phosphate-binding protein located in the periplasmic space. PstA and PstC are hydrophobic and they form the transmembrane portion of the Pst system. PstB is the catalytic subunit, which couples the energy of ATP hydrolysis to the import of phosphate across cellular membranes through the Pst system, often referred to as ABC-protein. PstB belongs to one of the largest superfamilies of proteins characterized by a highly conserved adenosine triphosphate (ATP) binding cassette (ABC), which is also a nucleotide binding domain (NBD).	227
213228	cd03261	ABC_Org_Solvent_Resistant	ATP-binding cassette transport system involved in resistance to organic solvents. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins.	235
213229	cd03262	ABC_HisP_GlnQ	ATP-binding cassette domain of the histidine and glutamine transporters. HisP and GlnQ are the ATP-binding components of the bacterial periplasmic histidine and glutamine permeases, respectively. Histidine permease is a multi-subunit complex containing the HisQ and HisM integral membrane subunits and two copies of HisP. HisP has properties intermediate between those of integral and peripheral membrane proteins and is accessible from both sides of the membrane, presumably by its interaction with HisQ and HisM. The two HisP subunits form a homodimer within the complex. The domain structure of the amino acid uptake systems is typical for prokaryotic extracellular solute binding protein-dependent uptake systems. All of the amino acid uptake systems also have at least one, and in a few cases, two extracellular solute binding proteins located in the periplasm of Gram-negative bacteria, or attached to the cell membrane of Gram-positive bacteria. The best-studied member of the PAAT (polar amino acid transport) family is the HisJQMP system of S. typhimurium, where HisJ is the extracellular solute binding protein and HisP is the ABC protein.	213
213230	cd03263	ABC_subfamily_A	ATP-binding cassette domain of the lipid transporters, subfamily A. The ABCA subfamily mediates the transport of a variety of lipid compounds. Mutations of members of ABCA subfamily are associated with human genetic diseases, such as, familial high-density lipoprotein (HDL) deficiency, neonatal surfactant deficiency, degenerative retinopathies, and congenital keratinization disorders. The ABCA1 protein is involved in disorders of cholesterol transport and high-density lipoprotein (HDL) biosynthesis. The ABCA4 (ABCR) protein transports vitamin A derivatives in the outer segments of photoreceptor cells, and therefore, performs a crucial step in the visual cycle. The ABCA genes are not present in yeast. However, evolutionary studies of ABCA genes indicate that they arose as transporters that subsequently duplicated and that certain sets of ABCA genes were lost in different eukaryotic lineages.	220
213231	cd03264	ABC_drug_resistance_like	ABC-type multidrug transport system, ATPase component. The biological function of this family is not well characterized, but its members display ABC domains similar to members of the ABCA subfamily. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins.	211
213232	cd03265	ABC_DrrA	Daunorubicin/doxorubicin resistance ATP-binding protein. DrrA is the ATP-binding protein component of a bacterial exporter complex that confers resistance to the antibiotics daunorubicin and doxorubicin. In addition to DrrA, the complex includes an integral membrane protein called DrrB. DrrA belongs to the ABC family of transporters and shares sequence and functional similarities with a protein found in cancer cells called P-glycoprotein. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins.	220
213233	cd03266	ABC_NatA_sodium_exporter	ATP-binding cassette domain of the Na+ transporter. NatA is the ATPase component of a bacterial ABC-type Na+ transport system called NatAB, which catalyzes ATP-dependent electrogenic Na+ extrusion without mechanically coupled proton or K+ uptake. NatB possesses six putative membrane spanning regions at its C-terminus. In B. subtilis, NatAB is inducible by agents such as ethanol and protonophores, which lower the proton-motive force across the membrane. The closest sequence similarity to NatA is exhibited by DrrA of the two-component daunorubicin- and doxorubicin-efflux system. Hence, the functional NatAB is presumably assembled with two copies of a single ATP-binding protein and a single integral membrane protein.	218
213234	cd03267	ABC_NatA_like	ATP-binding cassette domain of an uncharacterized transporter similar in sequence to NatA. NatA is the ATPase component of a bacterial ABC-type Na+ transport system called NatAB, which catalyzes ATP-dependent electrogenic Na+ extrusion without mechanically coupled proton or K+ uptake. NatB possesses six putative membrane spanning regions at its C-terminus. In B. subtilis, NatAB is inducible by agents such as ethanol and protonophores, which lower the proton-motive force across the membrane. The closest sequence similarity to NatA is exhibited by DrrA of the two-component daunorubicin- and doxorubicin-efflux system. Hence, the functional NatAB is presumably assembled with two copies of the single ATP-binding protein and the single integral membrane protein.	236
213235	cd03268	ABC_BcrA_bacitracin_resist	ATP-binding cassette domain of the bacitracin-resistance transporter. The BcrA subfamily represents ABC transporters involved in peptide antibiotic resistance. Bacitracin is a dodecapeptide antibiotic produced by B. licheniformis and B. subtilis. The synthesis of bacitracin is non-ribosomally catalyzed by a multi-enzyme complex BcrABC. Bacitracin has potent antibiotic activity against gram-positive bacteria. The inhibition of peptidoglycan biosynthesis is the best characterized bacterial effect of bacitracin. The bacitracin resistance of B. licheniformis is mediated by the ABC transporter Bcr which is composed of two identical BcrA ATP-binding subunits and one each of the integral membrane proteins, BcrB and BcrC. B. subtilis cells carrying bcr genes on high-copy number plasmids develop collateral detergent sensitivity, a similar phenomenon in human cells with overexpressed multi-drug resistance P-glycoprotein.	208
213236	cd03269	ABC_putative_ATPase	ATP-binding cassette domain of an uncharacterized transporter. This subgroup is related to the subfamily A transporters involved in drug resistance, nodulation, lipid transport, and bacteriocin and lantibiotic immunity. In eubacteria and archaea, the typical organization consists of one ABC and one or two integral membranes. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins.	210
213237	cd03270	ABC_UvrA_I	ATP-binding cassette domain I of the excision repair protein UvrA. Nucleotide excision repair in eubacteria is a process that repairs DNA damage by the removal of a 12-13-mer oligonucleotide containing the lesion. Recognition and cleavage of the damaged DNA is a multistep ATP-dependent reaction that requires the UvrA, UvrB, and UvrC proteins. Both UvrA and UvrB are ATPases, with UvrA having two ATP binding sites, which have the characteristic signature of the family of ABC proteins, and UvrB having one ATP binding site that is structurally related to that of helicases.	226
213238	cd03271	ABC_UvrA_II	ATP-binding cassette domain II of the excision repair protein UvrA. Nucleotide excision repair in eubacteria is a process that repairs DNA damage by the removal of a 12-13-mer oligonucleotide containing the lesion. Recognition and cleavage of the damaged DNA is a multistep ATP-dependent reaction that requires the UvrA, UvrB, and UvrC proteins. Both UvrA and UvrB are ATPases, with UvrA having two ATP binding sites, which have the characteristic signature of the family of ABC proteins and UvrB having one ATP binding site that is structurally related to that of helicases.	261
213239	cd03272	ABC_SMC3_euk	ATP-binding cassette domain of eukaryotic SMC3 proteins. The structural maintenance of chromosomes (SMC) proteins are large (approximately 110 to 170 kDa), and each is arranged into five recognizable domains. Amino-acid sequence homology of SMC proteins between species is largely confined to the amino- and carboxy-terminal globular domains. The amino-terminal domain contains a 'Walker A' nucleotide-binding domain (GxxGxGKS/T, in the single-letter amino-acid code), which by mutational studies has been shown to be essential in several proteins. The carboxy-terminal domain contains a sequence (the DA-box) that resembles a 'Walker B' motif, and a motif with homology to the signature sequence of the ATP-binding cassette (ABC) family of ATPases. The sequence homology within the carboxy-terminal domain is relatively high within the SMC1-SMC4 group, whereas SMC5 and SMC6 show some divergence in both of these sequences. In eukaryotic cells, the proteins are found as heterodimers of SMC1 paired with SMC3, SMC2 with SMC4, and SMC5 with SMC6 (formerly known as Rad18).	243
213240	cd03273	ABC_SMC2_euk	ATP-binding cassette domain of eukaryotic SMC2 proteins. The structural maintenance of chromosomes (SMC) proteins are large (approximately 110 to 170 kDa), and each is arranged into five recognizable domains. Amino-acid sequence homology of SMC proteins between species is largely confined to the amino- and carboxy-terminal globular domains. The amino-terminal domain contains a 'Walker A' nucleotide-binding domain (GxxGxGKS/T, in the single-letter amino-acid code), which by mutational studies has been shown to be essential in several proteins. The carboxy-terminal domain contains a sequence (the DA-box) that resembles a 'Walker B' motif, and a motif with homology to the signature sequence of the ATP-binding cassette (ABC) family of ATPases. The sequence homology within the carboxy-terminal domain is relatively high within the SMC1-SMC4 group, whereas SMC5 and SMC6 show some divergence in both of these sequences. In eukaryotic cells, the proteins are found as heterodimers of SMC1 paired with SMC3, SMC2 with SMC4, and SMC5 with SMC6 (formerly known as Rad18).	251
213241	cd03274	ABC_SMC4_euk	ATP-binding cassette domain of eukaryotic SMC4 proteins. The structural maintenance of chromosomes (SMC) proteins are large (approximately 110 to 170 kDa), and each is arranged into five recognizable domains. Amino-acid sequence homology of SMC proteins between species is largely confined to the amino- and carboxy-terminal globular domains. The amino-terminal domain contains a 'Walker A' nucleotide-binding domain (GxxGxGKS/T, in the single-letter amino-acid code), which by mutational studies has been shown to be essential in several proteins. The carboxy-terminal domain contains a sequence (the DA-box) that resembles a 'Walker B' motif, and a motif with homology to the signature sequence of the ATP-binding cassette (ABC) family of ATPases. The sequence homology within the carboxy-terminal domain is relatively high within the SMC1-SMC4 group, whereas SMC5 and SMC6 show some divergence in both of these sequences. In eukaryotic cells, the proteins are found as heterodimers of SMC1 paired with SMC3, SMC2 with SMC4, and SMC5 with SMC6 (formerly known as Rad18).	212
213242	cd03275	ABC_SMC1_euk	ATP-binding cassette domain of eukaryotic SMC1 proteins. The structural maintenance of chromosomes (SMC) proteins are large (approximately 110 to 170 kDa), and each is arranged into five recognizable domains. Amino-acid sequence homology of SMC proteins between species is largely confined to the amino- and carboxy-terminal globular domains. The amino-terminal domain contains a 'Walker A' nucleotide-binding domain (GxxGxGKS/T, in the single-letter amino-acid code), which by mutational studies has been shown to be essential in several proteins. The carboxy-terminal domain contains a sequence (the DA-box) that resembles a 'Walker B' motif, and a motif with homology to the signature sequence of the ATP-binding cassette (ABC) family of ATPases. The sequence homology within the carboxy-terminal domain is relatively high within the SMC1-SMC4 group, whereas SMC5 and SMC6 show some divergence in both of these sequences. In eukaryotic cells, the proteins are found as heterodimers of SMC1 paired with SMC3, SMC2 with SMC4, and SMC5 with SMC6 (formerly known as Rad18).	247
213243	cd03276	ABC_SMC6_euk	ATP-binding cassette domain of eukaryotic SMC6 proteins. The structural maintenance of chromosomes (SMC) proteins are large (approximately 110 to 170 kDa), and each is arranged into five recognizable domains. Amino-acid sequence homology of SMC proteins between species is largely confined to the amino- and carboxy-terminal globular domains. The amino-terminal domain contains a 'Walker A' nucleotide-binding domain (GxxGxGKS/T, in the single-letter amino-acid code), which by mutational studies has been shown to be essential in several proteins. The carboxy-terminal domain contains a sequence (the DA-box) that resembles a 'Walker B' motif, and a motif with homology to the signature sequence of the ATP-binding cassette (ABC) family of ATPases. The sequence homology within the carboxy-terminal domain is relatively high within the SMC1-SMC4 group, whereas SMC5 and SMC6 show some divergence in both of these sequences. In eukaryotic cells, the proteins are found as heterodimers of SMC1 paired with SMC3, SMC2 with SMC4, and SMC5 with SMC6 (formerly known as Rad18).	198
213244	cd03277	ABC_SMC5_euk	ATP-binding cassette domain of eukaryotic SMC5 proteins. The structural maintenance of chromosomes (SMC) proteins are large (approximately 110 to 170 kDa), and each is arranged into five recognizable domains. Amino-acid sequence homology of SMC proteins between species is largely confined to the amino- and carboxy-terminal globular domains. The amino-terminal domain contains a 'Walker A' nucleotide-binding domain (GxxGxGKS/T, in the single-letter amino-acid code), which by mutational studies has been shown to be essential in several proteins. The carboxy-terminal domain contains a sequence (the DA-box) that resembles a 'Walker B' motif, and a motif with homology to the signature sequence of the ATP-binding cassette (ABC) family of ATPases. The sequence homology within the carboxy-terminal domain is relatively high within the SMC1-SMC4 group, whereas SMC5 and SMC6 show some divergence in both of these sequences. In eukaryotic cells, the proteins are found as heterodimers of SMC1 paired with SMC3, SMC2 with SMC4, and SMC5 with SMC6 (formerly known as Rad18).	213
213245	cd03278	ABC_SMC_barmotin	ATP-binding cassette domain of barmotin, a member of the SMC protein family. Barmotin is a tight junction-associated protein expressed in rat epithelial cells which is thought to have an important regulatory role in tight junction barrier function. Barmotin belongs to the SMC protein family. SMC proteins are large (approximately 110 to 170 kDa), and each is arranged into five recognizable domains. Amino-acid sequence homology of SMC proteins between species is largely confined to the amino- and carboxy-terminal globular domains. The amino-terminal domain contains a 'Walker A' nucleotide-binding domain (GxxGxGKS/T, in the single-letter amino-acid code), which by mutational studies has been shown to be essential in several proteins. The carboxy-terminal domain contains a sequence (the DA-box) that resembles a 'Walker B' motif, and a motif with homology to the signature sequence of the ATP-binding cassette (ABC) family of ATPases. The sequence homology within the carboxy-terminal domain is relatively high within the SMC1-SMC4 group, whereas SMC5 and SMC6 show some divergence in both of these sequences. In eukaryotic cells, the proteins are found as heterodimers of SMC1 paired with SMC3, SMC2 with SMC4, and SMC5 with SMC6 (formerly known as Rad18).	197
213246	cd03279	ABC_sbcCD	ATP-binding cassette domain of sbcCD. SbcCD and other Mre11/Rad50 (MR) complexes are implicated in the metabolism of DNA ends. They cleave ends sealed by hairpin structures and are thought to play a role in removing protein bound to DNA termini.	213
213247	cd03280	ABC_MutS2	ATP-binding cassette domain of MutS2. MutS2 homologs in bacteria and eukaryotes. The MutS protein initiates DNA mismatch repair by recognizing mispaired and unpaired bases embedded in duplex DNA and activating endo- and exonucleases to remove the mismatch. Members of the MutS family also possess a conserved ATPase activity that belongs to the ATP binding cassette (ABC) superfamily. MutS homologs (MSH) have been identified in most prokaryotic and all eukaryotic organisms examined. Prokaryotes have two homologs (MutS1 and MutS2), whereas seven MSH proteins (MSH1 to MSH7) have been identified in eukaryotes. The homodimer MutS1 and heterodimers MSH2-MSH3 and MSH2-MSH6 are primarily involved in mitotic mismatch repair, whereas MSH4-MSH5 is involved in resolution of Holliday junctions during meiosis. All members of the MutS family contain the highly conserved Walker A/B ATPase domain, and many share a common mechanism of action. MutS1, MSH2-MSH3, MSH2-MSH6, and MSH4-MSH5 dimerize to form sliding clamps, and recognition of specific DNA structures or lesions results in ADP/ATP exchange.	200
213248	cd03281	ABC_MSH5_euk	ATP-binding cassette domain of eukaryotic MutS5 homolog. The MutS protein initiates DNA mismatch repair by recognizing mispaired and unpaired bases embedded in duplex DNA and activating endo- and exonucleases to remove the mismatch. Members of the MutS family possess C-terminal domain with a conserved ATPase activity that belongs to the ATP binding cassette (ABC) superfamily. MutS homologs (MSH) have been identified in most prokaryotic and all eukaryotic organisms examined. Prokaryotes have two homologs (MutS1 and MutS2), whereas seven MSH proteins (MSH1 to MSH7) have been identified in eukaryotes. The homodimer MutS1 and heterodimers MSH2-MSH3 and MSH2-MSH6 are primarily involved in mitotic mismatch repair, whereas MSH4-MSH5 is involved in resolution of Holliday junctions during meiosis. All members of the MutS family contain the highly conserved Walker A/B ATPase domain, and many share a common mechanism of action. MutS1, MSH2-MSH3, MSH2-MSH6, and MSH4-MSH5 dimerize to form sliding clamps, and recognition of specific DNA structures or lesions results in ADP/ATP exchange.	213
213249	cd03282	ABC_MSH4_euk	ATP-binding cassette domain of eukaryotic MutS4 homolog. The MutS protein initiates DNA mismatch repair by recognizing mispaired and unpaired bases embedded in duplex DNA and activating endo- and exonucleases to remove the mismatch. Members of the MutS family possess C-terminal domain with a conserved ATPase activity that belongs to the ATP binding cassette (ABC) superfamily. MutS homologs (MSH) have been identified in most prokaryotic and all eukaryotic organisms examined. Prokaryotes have two homologs (MutS1 and MutS2), whereas seven MSH proteins (MSH1 to MSH7) have been identified in eukaryotes. The homodimer MutS1 and heterodimers MSH2-MSH3 and MSH2-MSH6 are primarily involved in mitotic mismatch repair, whereas MSH4-MSH5 is involved in resolution of Holliday junctions during meiosis. All members of the MutS family contain the highly conserved Walker A/B ATPase domain, and many share a common mechanism of action. MutS1, MSH2-MSH3, MSH2-MSH6, and MSH4-MSH5 dimerize to form sliding clamps, and recognition of specific DNA structures or lesions results in ADP/ATP exchange.	204
213250	cd03283	ABC_MutS-like	ATP-binding cassette domain of MutS-like homolog. The MutS protein initiates DNA mismatch repair by recognizing mispaired and unpaired bases embedded in duplex DNA and activating endo- and exonucleases to remove the mismatch. Members of the MutS family possess C-terminal domain with a conserved ATPase activity that belongs to the ATP binding cassette (ABC) superfamily. MutS homologs (MSH) have been identified in most prokaryotic and all eukaryotic organisms examined. Prokaryotes have two homologs (MutS1 and MutS2), whereas seven MSH proteins (MSH1 to MSH7) have been identified in eukaryotes. The homodimer MutS1 and heterodimers MSH2-MSH3 and MSH2-MSH6 are primarily involved in mitotic mismatch repair, whereas MSH4-MSH5 is involved in resolution of Holliday junctions during meiosis. All members of the MutS family contain the highly conserved Walker A/B ATPase domain, and many share a common mechanism of action. MutS1, MSH2-MSH3, MSH2-MSH6, and MSH4-MSH5 dimerize to form sliding clamps, and recognition of specific DNA structures or lesions results in ADP/ATP exchange.	199
213251	cd03284	ABC_MutS1	ATP-binding cassette domain of MutS1 homolog. The MutS protein initiates DNA mismatch repair by recognizing mispaired and unpaired bases embedded in duplex DNA and activating endo- and exonucleases to remove the mismatch. Members of the MutS family possess C-terminal domain with a conserved ATPase activity that belongs to the ATP binding cassette (ABC) superfamily. MutS homologs (MSH) have been identified in most prokaryotic and all eukaryotic organisms examined. Prokaryotes have two homologs (MutS1 and MutS2), whereas seven MSH proteins (MSH1 to MSH7) have been identified in eukaryotes. The homodimer MutS1 and heterodimers MSH2-MSH3 and MSH2-MSH6 are primarily involved in mitotic mismatch repair, whereas MSH4-MSH5 is involved in resolution of Holliday junctions during meiosis. All members of the MutS family contain the highly conserved Walker A/B ATPase domain, and many share a common mechanism of action. MutS1, MSH2-MSH3, MSH2-MSH6, and MSH4-MSH5 dimerize to form sliding clamps, and recognition of specific DNA structures or lesions results in ADP/ATP exchange.	216
213252	cd03285	ABC_MSH2_euk	ATP-binding cassette domain of eukaryotic MutS2 homolog. The MutS protein initiates DNA mismatch repair by recognizing mispaired and unpaired bases embedded in duplex DNA and activating endo- and exonucleases to remove the mismatch. Members of the MutS family possess C-terminal domain with a conserved ATPase activity that belongs to the ATP binding cassette (ABC) superfamily. MutS homologs (MSH) have been identified in most prokaryotic and all eukaryotic organisms examined. Prokaryotes have two homologs (MutS1 and MutS2), whereas seven MSH proteins (MSH1 to MSH7) have been identified in eukaryotes. The homodimer MutS1 and heterodimers MSH2-MSH3 and MSH2-MSH6 are primarily involved in mitotic mismatch repair, whereas MSH4-MSH5 is involved in resolution of Holliday junctions during meiosis. All members of the MutS family contain the highly conserved Walker A/B ATPase domain, and many share a common mechanism of action. MutS1, MSH2-MSH3, MSH2-MSH6, and MSH4-MSH5 dimerize to form sliding clamps, and recognition of specific DNA structures or lesions results in ADP/ATP exchange.	222
213253	cd03286	ABC_MSH6_euk	ATP-binding cassette domain of eukaryotic MutS6 homolog. The MutS protein initiates DNA mismatch repair by recognizing mispaired and unpaired bases embedded in duplex DNA and activating endo- and exonucleases to remove the mismatch. Members of the MutS family possess C-terminal domain with a conserved ATPase activity that belongs to the ATP binding cassette (ABC) superfamily. MutS homologs (MSH) have been identified in most prokaryotic and all eukaryotic organisms examined. Prokaryotes have two homologs (MutS1 and MutS2), whereas seven MSH proteins (MSH1 to MSH7) have been identified in eukaryotes. The homodimer MutS1 and heterodimers MSH2-MSH3 and MSH2-MSH6 are primarily involved in mitotic mismatch repair, whereas MSH4-MSH5 is involved in resolution of Holliday junctions during meiosis. All members of the MutS family contain the highly conserved Walker A/B ATPase domain, and many share a common mechanism of action. MutS1, MSH2-MSH3, MSH2-MSH6, and MSH4-MSH5 dimerize to form sliding clamps, and recognition of specific DNA structures or lesions results in ADP/ATP exchange.	218
213254	cd03287	ABC_MSH3_euk	ATP-binding cassette domain of eukaryotic MutS3 homolog. The MutS protein initiates DNA mismatch repair by recognizing mispaired and unpaired bases embedded in duplex DNA and activating endo- and exonucleases to remove the mismatch. Members of the MutS family possess C-terminal domain with a conserved ATPase activity that belongs to the ATP binding cassette (ABC) superfamily. MutS homologs (MSH) have been identified in most prokaryotic and all eukaryotic organisms examined. Prokaryotes have two homologs (MutS1 and MutS2), whereas seven MSH proteins (MSH1 to MSH7) have been identified in eukaryotes. The homodimer MutS1 and heterodimers MSH2-MSH3 and MSH2-MSH6 are primarily involved in mitotic mismatch repair, whereas MSH4-MSH5 is involved in resolution of Holliday junctions during meiosis. All members of the MutS family contain the highly conserved Walker A/B ATPase domain, and many share a common mechanism of action. MutS1, MSH2-MSH3, MSH2-MSH6, and MSH4-MSH5 dimerize to form sliding clamps, and recognition of specific DNA structures or lesions results in ADP/ATP exchange.	222
213255	cd03288	ABCC_SUR2	ATP-binding cassette domain 2 of the sulfonylurea receptor SUR. The SUR domain 2. The sulfonylurea receptor SUR is an ATP binding cassette (ABC) protein of the ABCC/MRP family. Unlike other ABC proteins, it has no intrinsic transport function, neither active nor passive, but associates with the potassium channel proteins Kir6.1 or Kir6.2 to form the ATP-sensitive potassium (K(ATP)) channel. Within the channel complex, SUR serves as a regulatory subunit that fine-tunes the gating of Kir6.x in response to alterations in cellular metabolism. It constitutes a major pharmaceutical target as it binds numerous drugs, K(ATP) channel openers and blockers, capable of up- or down-regulating channel activity.	257
213256	cd03289	ABCC_CFTR2	ATP-binding cassette domain 2 of CFTR, subfamily C. The cystic fibrosis transmembrane regulator (CFTR), the product of the gene mutated in patients with cystic fibrosis, has adapted the ABC transporter structural motif to form a tightly regulated anion channel at the apical surface of many epithelia. Use of the term assembly of a functional ion channel implies the coming together of subunits or at least smaller not-yet functional components of the active whole. In fact, on the basis of current knowledge only the CFTR polypeptide itself is required to form an ATP- and protein kinase A-dependent low-conductance chloride channel of the type present in the apical membrane of many epithelial cells. CFTR displays the typical organization (IM-ABC)2 and carries a characteristic hydrophilic R-domain that separates IM1-ABC1 from IM2-ABC2.	275
213257	cd03290	ABCC_SUR1_N	ATP-binding cassette domain of the sulfonylurea receptor, subfamily C. The SUR domain 1. The sulfonylurea receptor SUR is an ATP transporter of the ABCC/MRP family with tandem ATPase binding domains. Unlike other ABC proteins, it has no intrinsic transport function, neither active nor passive, but associates with the potassium channel proteins Kir6.1 or Kir6.2 to form the ATP-sensitive potassium (K(ATP)) channel. Within the channel complex, SUR serves as a regulatory subunit that fine-tunes the gating of Kir6.x in response to alterations in cellular metabolism. It constitutes a major pharmaceutical target as it binds numerous drugs, K(ATP) channel openers and blockers, capable of up- or down-regulating channel activity.	218
213258	cd03291	ABCC_CFTR1	ATP-binding cassette domain of the cystic fibrosis transmembrane regulator, subfamily C. The CFTR subfamily domain 1. The cystic fibrosis transmembrane regulator (CFTR), the product of the gene mutated in patients with cystic fibrosis, has adapted the ABC transporter structural motif to form a tightly regulated anion channel at the apical surface of many epithelia. Use of the term assembly of a functional ion channel implies the coming together of subunits, or at least smaller not-yet functional components of the active whole. In fact, on the basis of current knowledge only the CFTR polypeptide itself is required to form an ATP- and protein kinase A-dependent low-conductance chloride channel of the type present in the apical membrane of many epithelial cells. CFTR displays the typical organization (IM-ABC)2 and carries a characteristic hydrophilic R-domain that separates IM1-ABC1 from IM2-ABC2.	282
213259	cd03292	ABC_FtsE_transporter	ATP-binding cassette domain of the cell division transporter. FtsE is a hydrophilic nucleotide-binding protein that binds FtsX to form a heterodimeric ATP-binding cassette (ABC)-type transporter that associates with the bacterial inner membrane. The FtsE/X transporter is thought to be involved in cell division and is important for assembly or stability of the septal ring.	214
213260	cd03293	ABC_NrtD_SsuB_transporters	ATP-binding cassette domain of the nitrate and sulfonate transporters. NrtD and SsuB are the ATP-binding subunits of the bacterial ABC-type nitrate and sulfonate transport systems, respectively. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins.	220
213261	cd03294	ABC_Pro_Gly_Betaine	ATP-binding cassette domain of the osmoprotectant proline/glycine betaine uptake system. This family comprises the glycine betaine/L-proline ATP binding subunit in bacteria and its equivalents in archaea. This transport system belongs to the larger ATP-Binding Cassette (ABC) transporter superfamily. The characteristic feature of these transporters is the obligatory coupling of ATP hydrolysis to substrate translocation. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins.	269
213262	cd03295	ABC_OpuCA_Osmoprotection	ATP-binding cassette domain of the osmoprotectant transporter. OpuCA is the ATP binding component of a bacterial solute transporter that serves a protective role to cells growing in a hyperosmolar environment. ABC (ATP-binding cassette) transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins.	242
213263	cd03296	ABC_CysA_sulfate_importer	ATP-binding cassette domain of the sulfate transporter. Part of the ABC transporter complex cysAWTP involved in sulfate import. Responsible for energy coupling to the transport system. The complex is composed of two ATP-binding proteins (cysA), two transmembrane proteins (cysT and cysW), and a solute-binding protein (cysP). ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins.	239
213264	cd03297	ABC_ModC_molybdenum_transporter	ATP-binding cassette domain of the molybdenum transport system. ModC is an ABC-type transporter and the ATPase component of a molybdate transport system that also includes the periplasmic binding protein ModA and the membrane protein ModB. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins.	214
213265	cd03298	ABC_ThiQ_thiamine_transporter	ATP-binding cassette domain of the thiamine transport system. Part of the binding-protein-dependent transport system tbpA-thiPQ for thiamine and TPP. Probably responsible for the translocation of thiamine across the membrane. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins.	211
213266	cd03299	ABC_ModC_like	ATP-binding cassette domain similar to the molybdate transporter. Archaeal protein closely related to ModC. ModC is an ABC-type transporter and the ATPase component of a molybdate transport system that also includes the periplasmic binding protein ModA and the membrane protein ModB. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins.	235
213267	cd03300	ABC_PotA_N	ATP-binding cassette domain of the polyamine transporter. PotA is an ABC-type transporter and the ATPase component of the spermidine/putrescine-preferential uptake system consisting of PotA, -B, -C, and -D. PotA has two domains with the N-terminal domain containing the ATPase activity and the residues required for homodimerization with PotA and heterodimerization with PotB. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins.	232
213268	cd03301	ABC_MalK_N	The N-terminal ATPase domain of the maltose transporter, MalK. ATP binding cassette (ABC) proteins function from bacteria to human, mediating the translocation of substances into and out of cells or organelles. ABC transporters contain two transmembrane-spanning domains (TMDs) or subunits and two nucleotide binding domains (NBDs) or subunits that couple transport to the hydrolysis of ATP. In the maltose transport system, the periplasmic maltose binding protein (MBP) stimulates the ATPase activity of the membrane-associated transporter, which consists of two transmembrane subunits, MalF and MalG, and two copies of the ATP binding subunit, MalK, and becomes tightly bound to the transporter in the catalytic transition state, ensuring that maltose is passed to the transporter as ATP is hydrolyzed.	213
176471	cd03302	Adenylsuccinate_lyase_2	Adenylsuccinate lyase (ASL)_subgroup 2. This subgroup contains mainly eukaryotic proteins similar to ASL, a member of the Lyase class I family. Members of this family for the most part catalyze similar beta-elimination reactions in which a C-N or C-O bond is cleaved with the release of fumarate as one of the products. These proteins are active as tetramers. The four active sites of the homotetrameric enzyme are each formed by residues from three different subunits. ASL catalyzes two steps in the de novo purine biosynthesis: the conversion of 5-aminoimidazole-(N-succinylocarboxamide) ribotide (SAICAR) into 5-aminoimidazole-4-carboxamide ribotide (AICAR) and the conversion of adenylsuccinate (SAMP) into adenosine monophosphate (AMP). ASL deficiency has been linked to several pathologies including psychomotor retardation with autistic features, epilepsy and muscle wasting.	436
239423	cd03307	Mta_CmuA_like	MtaA_CmuA_like family. MtaA/CmuA, also MtsA, or methyltransferase 2 (MT2) MT2-A and MT2-M isozymes, are methylcobamide:Coenzyme M methyltransferases, which play a role in metabolic pathways of methane formation from various substrates, such as methylated amines and methanol. Coenzyme M, 2-mercaptoethylsulfonate or CoM, is methylated during methanogenesis in a reaction catalyzed by three proteins. A methyltransferase methylates the corrinoid cofactor, which is bound to a second polypeptide, a corrinoid protein. The methylated corrinoid protein then serves as a substrate for MT2-A and related enzymes, which methylate CoM.	326
239424	cd03308	CmuA_CmuC_like	CmuA_CmuC_like: uncharacterized protein family similar to uroporphyrinogen decarboxylase (URO-D) and the methyltransferases CmuA and CmuC.	378
239425	cd03309	CmuC_like	CmuC_like. Proteins similar to the putative corrinoid methyltransferase CmuC. Its function has been inferred from sequence similarity to the methyltransferases CmuA and MtaA. Mutants of Methylobacterium sp. disrupted in cmuC and purU appear deficient in some step of chloromethane metabolism.	321
239426	cd03310	CIMS_like	CIMS - Cobalamine-independent methionine synthase, or MetE. Many members have been characterized as 5-methyltetrahydropteroyltriglutamate-homocysteine methyltransferases, EC:2.1.1.14, mostly from bacteria and plants. This enzyme catalyses the last step in the production of methionine by transferring a methyl group from 5-methyltetrahydrofolate to L-homocysteine without using an intermediate methyl carrier. The active enzyme has a dual (beta-alpha)8-barrel structure, and this model covers both the N- and C-terminal barrel, and some single-barrel sequences, mostly from Archaea. It is assumed that the homologous N-terminal barrel has evolved from the C-terminus via gene duplication and has subsequently lost binding sites, and it seems as if the two barrels forming the active enzyme may sometimes reside on different polypeptides. The C-terminal domain incorporates the Zinc ion, which binds and activates homocysteine. Side chains from both barrels contribute to the binding of the folate substrate.	321
239427	cd03311	CIMS_C_terminal_like	CIMS - Cobalamine-independent methionine synthase, or MetE, C-terminal domain_like. Many members have been characterized as 5-methyltetrahydropteroyltriglutamate-homocysteine methyltransferases, EC:2.1.1.14, mostly from bacteria and plants. This enzyme catalyses the last step in the production of methionine by transferring a methyl group from 5-methyltetrahydrofolate to L-homocysteine without using an intermediate methyl carrier. The active enzyme has a dual (beta-alpha)8-barrel structure, and this model covers the C-terminal barrel, and a few single-barrel sequences most similar to the C-terminal barrel. It is assumed that the homologous N-terminal barrel has evolved from the C-terminus via gene duplication and has subsequently lost binding sites, and it seems as if the two barrels forming the active enzyme may sometimes reside on different polypeptides. The C-terminal domain incorporates the Zinc ion, which binds and activates homocysteine. Side chains from both barrels contribute to the binding of the folate substrate.	332
239428	cd03312	CIMS_N_terminal_like	CIMS - Cobalamine-independent methionine synthase, or MetE, N-terminal domain_like. Many members have been characterized as 5-methyltetrahydropteroyltriglutamate-homocysteine methyltransferases, EC:2.1.1.14, mostly from bacteria and plants. This enzyme catalyses the last step in the production of methionine by transferring a methyl group from 5-methyltetrahydrofolate to L-homocysteine without using an intermediate methyl carrier. The active enzyme has a dual (beta-alpha)8-barrel structure, and this model covers the N-terminal barrel, and a few single-barrel sequences most similar to the N-terminal barrel. It is assumed that the homologous N-terminal barrel has evolved from the C-terminus via gene duplication and has subsequently lost binding sites, and it seems as if the two barrels forming the active enzyme may sometimes reside on different polypeptides. The C-terminal domain incorporates the Zinc ion, which binds and activates homocysteine. Side chains from both barrels contribute to the binding of the folate substrate.	360
239429	cd03313	enolase	Enolase: Enolases are homodimeric enzymes that catalyse the reversible dehydration of 2-phospho-D-glycerate to phosphoenolpyruvate as part of the glycolytic and gluconeogenesis pathways. The reaction is facilitated by the presence of metal ions.	408
239430	cd03314	MAL	Methylaspartate ammonia lyase (3-methylaspartase, MAL) is a homodimeric enzyme, catalyzing the magnesium-dependent reversible alpha,beta-elimination of ammonia from L-threo-(2S,3S)-3-methylaspartic acid to mesaconic acid. This reaction is part of the main catabolic pathway for glutamate. MAL belongs to the enolase superfamily of enzymes, characterized by the presence of an enolate anion intermediate which is generated by abstraction of the alpha-proton of the carboxylate substrate by an active site residue and is stabilized by coordination to the essential Mg2+ ion.	369
239431	cd03315	MLE_like	Muconate lactonizing enzyme (MLE) like subgroup of the enolase superfamily. Enzymes of this subgroup share three conserved carboxylate ligands for the essential divalent metal ion (usually Mg2+), two aspartates and a glutamate, and residues that can function as general acid/base catalysts, a Lys-X-Lys motif and another conserved lysine. Despite these conserved residues, the members of the MLE subgroup, like muconate lactonizing enzyme, o-succinylbenzoate synthase (OSBS) and N-acylamino acid racemase (NAAAR), catalyze different reactions.	265
239432	cd03316	MR_like	Mandelate racemase (MR)-like subfamily of the enolase superfamily. Enzymes of this subgroup share three conserved carboxylate ligands for the essential divalent metal ion (usually Mg2+), two aspartates and a glutamate, and conserved catalytic residues, a Lys-X-Lys motif and a conserved histidine-aspartate dyad. Members of the MR subgroup are mandelate racemase, D-glucarate/L-idarate dehydratase (GlucD), D-altronate/D-mannonate dehydratase, D-galactonate dehydratase (GalD), D-gluconate dehydratase (GlcD), and L-rhamnonate dehydratase (RhamD).	357
239433	cd03317	NAAAR	N-acylamino acid racemase (NAAAR), an octameric enzyme that catalyzes the racemization of N-acylamino acids. NAAARs act on a broad range of N-acylamino acids rather than amino acids. Enantiopure amino acids are of industrial interest as chiral building blocks for antibiotics, herbicides, and drugs. NAAAR is a member of the enolase superfamily, characterized by the presence of an enolate anion intermediate which is generated by abstraction of the alpha-proton of the carboxylate substrate by an active site residue and is stabilized by coordination to the essential Mg2+ ion.	354
239434	cd03318	MLE	Muconate Lactonizing Enzyme (MLE), a homooctameric enzyme, catalyses the conversion of cis,cis-muconate (CCM) to muconolactone (ML) in the catechol branch of the beta-ketoadipate pathway. This pathway is used in soil microbes to break down lignin-derived aromatics, catechol and protocatechuate, to citric acid cycle intermediates. Some bacterial species are also capable of dehalogenating chloroaromatic compounds by the action of chloromuconate lactonizing enzymes (Cl-MLEs). MLEs are members of the enolase superfamily characterized by the presence of an enolate anion intermediate which is generated by abstraction of the alpha-proton of the carboxylate substrate by an active site residue and that is stabilized by coordination to the essential Mg2+ ion.	365
239435	cd03319	L-Ala-DL-Glu_epimerase	L-Ala-D/L-Glu epimerase catalyzes the epimerization of L-Ala-D/L-Glu and other dipeptides. The genomic context and the substrate specificity of characterized members of this family from E.coli and B.subtilis indicate a possible role in the metabolism of the murein peptide of peptidoglycan, of which L-Ala-D-Glu is a component. L-Ala-D/L-Glu epimerase is a member of the enolase-superfamily, which is characterized by the presence of an enolate anion intermediate which is generated by abstraction of the alpha-proton of the carboxylate substrate by an active site residue and is stabilized by coordination to the essential Mg2+ ion.	316
239436	cd03320	OSBS	o-Succinylbenzoate synthase (OSBS) catalyzes the conversion of 2-succinyl-6-hydroxy-2,4-cyclohexadiene-1-carboxylate (SHCHC) to 4-(2'-carboxyphenyl)-4-oxobutyrate (o-succinylbenzoate or OSB), a reaction in the menaquinone biosynthetic pathway. Menaquinone is an essential cofactor for anaerobic growth in eubacteria and some archaea. OSBS belongs to the enolase superfamily of enzymes, characterized by the presence of an enolate anion intermediate which is generated by abstraction of the alpha-proton of the carboxylate substrate by an active site residue and is stabilized by coordination to the essential Mg2+ ion.	263
239437	cd03321	mandelate_racemase	Mandelate racemase (MR) catalyzes the Mg2+-dependent 1,1-proton transfer reaction that interconverts the enantiomers of mandelic acid. MR is the first enzyme in the bacterial pathway that converts mandelic acid to benzoic acid and allows this pathway to utilize either enantiomer of mandelate. MR belongs to the enolase superfamily of enzymes, characterized by the presence of an enolate anion intermediate which is generated by abstraction of the alpha-proton of the carboxylate substrate by an active site residue and is stabilized by coordination to the essential Mg2+ ion.	355
239438	cd03322	RspA	The starvation sensing protein RspA from E.coli and its homologs are lactonizing enzymes whose putative targets are homoserine lactone (HSL) derivatives. They are part of the mandelate racemase (MR)-like subfamily of the enolase superfamily. Enzymes of this subfamily share three conserved carboxylate ligands for the essential divalent metal ion (usually Mg2+), two aspartates and a glutamate, and catalytic residues, a partially conserved Lys-X-Lys motif and a conserved histidine-aspartate dyad.	361
239439	cd03323	D-glucarate_dehydratase	D-Glucarate dehydratase (GlucD) catalyzes the dehydration of both D-glucarate and L-idarate to form 5-keto-4-deoxy-D-glucarate (5-KDG), the initial reaction of the catabolic pathway for (D)-glucarate. GlucD belongs to the enolase superfamily of enzymes, characterized by the presence of an enolate anion intermediate which is generated by abstraction of the alpha-proton of the carboxylate substrate by an active site residue and that is stabilized by coordination to the essential Mg2+ ion.	395
239440	cd03324	rTSbeta_L-fuconate_dehydratase	Human rTS beta is encoded by the rTS gene which, through alternative RNA splicing, also encodes rTS alpha whose mRNA is complementary to thymidylate synthase mRNA. rTS beta expression is associated with the production of small molecules that appear to mediate the down-regulation of thymidylate synthase protein by a novel intercellular signaling mechanism. A member of this family, from Xanthomonas, has been characterized to be an L-fuconate dehydratase. rTS beta belongs to the enolase superfamily of enzymes, characterized by the presence of an enolate anion intermediate which is generated by abstraction of the alpha-proton of the carboxylate substrate by an active site residue and is stabilized by coordination to the essential Mg2+ ion.	415
239441	cd03325	D-galactonate_dehydratase	D-galactonate dehydratase catalyses the dehydration of galactonate to 2-keto-3-deoxygalactonate (KDGal), as part of the D-galactonate nonphosphorolytic catabolic Entner-Doudoroff pathway. D-galactonate dehydratase belongs to the enolase superfamily of enzymes, characterized by the presence of an enolate anion intermediate which is generated by abstraction of the alpha-proton of the carboxylate substrate by an active site residue and is stabilized by coordination to the essential Mg2+ ion.	352
239442	cd03326	MR_like_1	Mandelate racemase (MR)-like subfamily of the enolase superfamily, subgroup 1. Enzymes of this subgroup share three conserved carboxylate ligands for the essential divalent metal ion (usually Mg2+), two aspartates and a glutamate, and conserved catalytic residues,  a Lys-X-Lys motif and a conserved histidine-aspartate dyad. This subgroup's function is unknown.	385
239443	cd03327	MR_like_2	Mandelate racemase (MR)-like subfamily of the enolase superfamily, subgroup 2. Enzymes of this subgroup share three conserved carboxylate ligands for the essential divalent metal ion (usually Mg2+), two aspartates and a glutamate, and conserved catalytic residues,  a Lys-X-Lys motif and a conserved histidine-aspartate dyad. This subgroup's function is unknown.	341
239444	cd03328	MR_like_3	Mandelate racemase (MR)-like subfamily of the enolase superfamily, subgroup 3. Enzymes of this subgroup share three conserved carboxylate ligands for the essential divalent metal ion (usually Mg2+), two aspartates and a glutamate, and conserved catalytic residues,  a Lys-X-Lys motif and a conserved histidine-aspartate dyad. This subgroup's function is unknown.	352
239445	cd03329	MR_like_4	Mandelate racemase (MR)-like subfamily of the enolase superfamily, subgroup 4. Enzymes of this subgroup share three conserved carboxylate ligands for the essential divalent metal ion (usually Mg2+), two aspartates and a glutamate, and conserved catalytic residues,  a Lys-X-Lys motif and a conserved histidine-aspartate dyad. This subgroup's function is unknown.	368
394879	cd03330	Macro_Ttha0132-like	Macrodomain, uncharacterized family similar to Thermus thermophilus hypothetical protein Ttha0132. Macrodomains are found in a variety of proteins with diverse cellular functions, as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Macrodomains can recognize ADP-ribose (ADPr) in both its free and protein-linked forms, in related ligands, such as O-acyl-ADP-ribose (OAADPr), and even in ligands unrelated to ADPr. Macrodomains include the yeast macrodomain Poa1 which is a phosphatase of ADP-ribose-1"-phosphate, a by-product of tRNA splicing. Some macrodomains have ADPr-unrelated binding partners such as the coronavirus SUD-N (N-terminal subdomain) and SUD-M (middle subdomain) of the SARS-unique domain (SUD) which bind G-quadruplexes (unusual nucleic-acid structures formed by consecutive guanosine nucleotides). Macrodomains regulate a wide variety of cellular and organismal processes, including DNA damage repair, signal transduction, and immune response. This family is composed of uncharacterized proteins containing a stand-alone macrodomain, similar to Thermus thermophilus hypothetical protein Ttha0132.	147
394880	cd03331	Macro_Poa1p-like_SNF2	macrodomain, Poa1p-like family, SNF2 subfamily. Macrodomains are found in a variety of proteins with diverse cellular functions, as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Macrodomains can recognize ADP-ribose (ADPr) in both its free and protein-linked forms, in related ligands, such as O-acyl-ADP-ribose (OAADPr), and even in ligands unrelated to ADPr. Members of this subfamily contain a C-terminal macrodomain that shows similarity to the yeast protein Poa1p, reported to be a phosphatase specific for Appr-1"-p, a tRNA splicing metabolite. In addition, they also contain an SNF2 domain, defined by the presence of seven motifs with sequence similarity to DNA helicases. SNF2 proteins have the capacity to use the energy released by their DNA-dependent ATPase activity to stabilize or perturb protein-DNA interactions and play important roles in transcriptional regulation, maintenance of chromosome integrity and DNA repair.	152
239448	cd03332	LMO_FMN	L-Lactate 2-monooxygenase (LMO) FMN-binding domain. LMO is an FMN-containing enzyme that catalyzes the conversion of L-lactate and oxygen to acetate, carbon dioxide, and water. LMO is a member of the family of alpha-hydroxy acid oxidases. It is thought to be a homooctamer with two- and four-fold axes in the center of the octamer.	383
239449	cd03333	chaperonin_like	chaperonin_like superfamily. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings, each composed of 7-9 subunits. There are 2 main chaperonin groups. The symmetry of type I is seven-fold and they are found in eubacteria (GroEL) and in organelles of eubacterial descent (hsp60 and RBP). The symmetry of type II is eight- or nine-fold and they are found in archaea (thermosome), thermophilic bacteria (TF55) and in the eukaryotic cytosol (CTT). Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. This superfamily also contains related domains from Fab1-like phosphatidylinositol 3-phosphate (PtdIns3P) 5-kinases that only contain the intermediate and apical domains.	209
239450	cd03334	Fab1_TCP	TCP-1 like domain of the eukaryotic phosphatidylinositol 3-phosphate (PtdIns3P) 5-kinase Fab1. Fab1p is important for vacuole size regulation, presumably by modulating PtdIns(3,5)P2 effector activity. In the human homolog p235/PIKfyve, deletion of this domain leads to loss of catalytic activity. However, no exact function of this domain has been defined. In general, chaperonins are involved in productive folding of proteins.	261
239451	cd03335	TCP1_alpha	TCP-1 (CTT or eukaryotic type II) chaperonin family, alpha subunit. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings. In contrast to bacterial group I chaperonins (GroEL), each ring of the eukaryotic cytosolic chaperonin (CTT) consists of eight different, but homologous subunits. Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. The best studied in vivo substrates of CTT are actin and tubulin.	527
239452	cd03336	TCP1_beta	TCP-1 (CTT or eukaryotic type II) chaperonin family, beta subunit. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings. In contrast to bacterial group I chaperonins (GroEL), each ring of the eukaryotic cytosolic chaperonin (CTT) consists of eight different, but homologous subunits. Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. The best studied in vivo substrates of CTT are actin and tubulin.	517
239453	cd03337	TCP1_gamma	TCP-1 (CTT or eukaryotic type II) chaperonin family, gamma subunit. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings. In contrast to bacterial group I chaperonins (GroEL), each ring of the eukaryotic cytosolic chaperonin (CTT) consists of eight different, but homologous subunits. Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. The best studied in vivo substrates of CTT are actin and tubulin.	480
239454	cd03338	TCP1_delta	TCP-1 (CTT or eukaryotic type II) chaperonin family, delta subunit. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings. In contrast to bacterial group I chaperonins (GroEL), each ring of the eukaryotic cytosolic chaperonin (CTT) consists of eight different, but homologous subunits. Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. The best studied in vivo substrates of CTT are actin and tubulin.	515
239455	cd03339	TCP1_epsilon	TCP-1 (CTT or eukaryotic type II) chaperonin family, epsilon subunit. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings. In contrast to bacterial group I chaperonins (GroEL), each ring of the eukaryotic cytosolic chaperonin (CTT) consists of eight different, but homologous subunits. Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. The best studied in vivo substrates of CTT are actin and tubulin.	526
239456	cd03340	TCP1_eta	TCP-1 (CTT or eukaryotic type II) chaperonin family, eta subunit. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings. In contrast to bacterial group I chaperonins (GroEL), each ring of the eukaryotic cytosolic chaperonin (CTT) consists of eight different, but homologous subunits. Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. The best studied in vivo substrates of CTT are actin and tubulin.	522
239457	cd03341	TCP1_theta	TCP-1 (CTT or eukaryotic type II) chaperonin family, theta subunit. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings. In contrast to bacterial group I chaperonins (GroEL), each ring of the eukaryotic cytosolic chaperonin (CTT) consists of eight different, but homologous subunits. Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. The best studied in vivo substrates of CTT are actin and tubulin.	472
239458	cd03342	TCP1_zeta	TCP-1 (CTT or eukaryotic type II) chaperonin family, zeta subunit. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings. In contrast to bacterial group I chaperonins (GroEL), each ring of the eukaryotic cytosolic chaperonin (CTT) consists of eight different, but homologous subunits. Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. The best studied in vivo substrates of CTT are actin and tubulin.	484
239459	cd03343	cpn60	cpn60 chaperonin family. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings. Archaeal cpn60 (thermosome), together with TF55 from thermophilic bacteria and the eukaryotic cytosol chaperonin (CTT), belong to the type II group of chaperonins. Cpn60 consists of two stacked octameric rings, which are composed of one or two different subunits.  Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis.	517
239460	cd03344	GroEL	GroEL_like type I chaperonin. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings, each composed of 7-9 subunits. The symmetry of type I is seven-fold and they are found in eubacteria (GroEL) and in organelles of eubacterial descent (hsp60 and RBP). With the aid of cochaperonin GroES, GroEL encapsulates non-native substrate proteins inside the cavity of the GroEL-ES complex and promotes folding by using energy derived from ATP hydrolysis.	520
239461	cd03345	eu_TyrOH	Eukaryotic tyrosine hydroxylase (TyrOH); a member of the biopterin-dependent aromatic amino acid hydroxylase family of non-heme, iron(II)-dependent enzymes that also includes prokaryotic and eukaryotic phenylalanine-4-hydroxylase (PheOH) and eukaryotic tryptophan hydroxylase (TrpOH). TyrOH catalyzes the conversion of tyrosine to L-dihydroxyphenylalanine (L-DOPA), the rate-limiting step in the biosynthesis of the catecholamines dopamine, noradrenaline, and adrenaline.	298
239462	cd03346	eu_TrpOH	Eukaryotic tryptophan hydroxylase (TrpOH); a member of the biopterin-dependent aromatic amino acid hydroxylase family of non-heme, iron(II)-dependent enzymes that also includes prokaryotic and eukaryotic phenylalanine-4-hydroxylase (PheOH) and eukaryotic tyrosine hydroxylase (TyrOH). TrpOH oxidizes L-tryptophan to 5-hydroxy-L-tryptophan, the rate-limiting step in the biosynthesis of serotonin (5-hydroxytryptamine), a widely distributed hormone and neurotransmitter.	287
239463	cd03347	eu_PheOH	Eukaryotic phenylalanine-4-hydroxylase (eu_PheOH); a member of the biopterin-dependent aromatic amino acid hydroxylase family of non-heme, iron(II)-dependent enzymes that also includes prokaryotic phenylalanine-4-hydroxylase (pro_PheOH), eukaryotic tyrosine hydroxylase (TyrOH) and eukaryotic tryptophan hydroxylase (TrpOH).  PheOH catalyzes the first and rate-limiting step in the metabolism of the amino acid L-phenylalanine (L-Phe), the hydroxylation of L-Phe to L-tyrosine (L-Tyr). It uses (6R)-L-erythro-5,6,7,8-tetrahydrobiopterin (BH4) as the physiological electron donor. The catalytic activity of the tetrameric enzyme is tightly regulated by the binding of L-Phe and BH4 as well as by phosphorylation. Mutations in the human enzyme are linked to a severe variant of phenylketonuria.	306
239464	cd03348	pro_PheOH	Prokaryotic phenylalanine-4-hydroxylase (pro_PheOH); a member of the biopterin-dependent aromatic amino acid hydroxylase family of non-heme, iron(II)-dependent enzymes that also includes the eukaryotic proteins, phenylalanine-4-hydroxylase (eu_PheOH), tyrosine hydroxylase (TyrOH) and tryptophan hydroxylase (TrpOH). PheOH catalyzes the hydroxylation of L-Phe to L-tyrosine (L-Tyr). It uses (6R)-L-erythro-5,6,7,8-tetrahydrobiopterin (BH4) as the physiological electron donor.	228
100040	cd03349	LbH_XAT	Xenobiotic acyltransferase (XAT): The XAT class of hexapeptide acyltransferases is composed of a large number of microbial enzymes that catalyze the CoA-dependent acetylation of a variety of hydroxyl-bearing acceptors such as chloramphenicol and streptogramin, among others. Members of this class of enzymes include Enterococcus faecium streptogramin A acetyltransferase and Pseudomonas aeruginosa chloramphenicol acetyltransferase. They contain repeated copies of a six-residue hexapeptide repeat sequence motif (X-[STAV]-X-[LIV]-[GAED]-X) and adopt a left-handed parallel beta helix (LbH) structure. The active enzyme is a trimer with CoA and substrate binding sites at the interface of two separate LbH subunits. XATs are implicated in inactivating xenobiotics leading to xenobiotic resistance in patients.	145
100041	cd03350	LbH_THP_succinylT	2,3,4,5-tetrahydropyridine-2,6-dicarboxylate (THDP) N-succinyltransferase (also called THP succinyltransferase): THDP N-succinyltransferase catalyzes the conversion of tetrahydrodipicolinate and succinyl-CoA to N-succinyltetrahydrodipicolinate and CoA. It is the committed step in the succinylase pathway by which bacteria synthesize L-lysine and meso-diaminopimelate, a component of peptidoglycan. The enzyme is homotrimeric and each subunit contains an N-terminal region with alpha helices and hairpin loops, as well as a C-terminal region with a left-handed parallel beta-helix (LbH) structural motif encoded by hexapeptide repeat motifs.	139
100042	cd03351	LbH_UDP-GlcNAc_AT	UDP-N-acetylglucosamine O-acyltransferase (UDP-GlcNAc acyltransferase): Proteins in this family catalyze the transfer of (R)-3-hydroxymyristic acid from its acyl carrier protein thioester to UDP-GlcNAc. It is the first enzyme in the lipid A biosynthetic pathway and is also referred to as LpxA. Lipid A is essential for the growth of Escherichia coli and related bacteria. It is also essential for maintaining the integrity of the outer membrane. UDP-GlcNAc acyltransferase is a homotrimer of left-handed parallel beta helix (LbH) subunits. Each subunit contains an N-terminal LbH region with 9 turns, each containing three imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X), and a C-terminal alpha-helical region.	254
100043	cd03352	LbH_LpxD	UDP-3-O-acyl-glucosamine N-acyltransferase (LpxD): The enzyme catalyzes the transfer of 3-hydroxymyristic acid or 3-hydroxy-arachidic acid, depending on the organism, from the acyl carrier protein (ACP) to UDP-3-O-acyl-glucosamine to produce UDP-2,3-diacyl-GlcNAc. This constitutes the third step in the lipid A biosynthetic pathway in Gram-negative bacteria. LpxD is a homotrimer, with each subunit consisting of a novel combination of an N-terminal uridine-binding domain, a core lipid-binding left-handed parallel beta helix (LbH) domain, and a C-terminal alpha-helical extension. The LbH domain contains 9 turns, each containing three imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X).	205
100044	cd03353	LbH_GlmU_C	N-acetyl-glucosamine-1-phosphate uridyltransferase (GlmU), C-terminal left-handed beta-helix (LbH) acetyltransferase domain: GlmU is also known as UDP-N-acetylglucosamine pyrophosphorylase. It is a bifunctional bacterial enzyme that catalyzes two consecutive steps in the formation of UDP-N-acetylglucosamine (UDP-GlcNAc), an important precursor in bacterial cell wall formation. The two enzymatic activities, uridyltransferase and acetyltransferase, are carried out by two independent domains. The C-terminal LbH domain possesses the acetyltransferase activity. It catalyzes the CoA-dependent acetylation of GlcN-1-phosphate to GlcNAc-1-phosphate. The LbH domain contains 10 turns, each containing three imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X). The acetyltransferase active site is located at the interface between two subunits of the active LbH trimer.	193
100045	cd03354	LbH_SAT	Serine acetyltransferase (SAT): SAT catalyzes the CoA-dependent acetylation of the side chain hydroxyl group of L-serine to form O-acetylserine, as the first step of a two-step biosynthetic pathway in bacteria and plants leading to the formation of L-cysteine. This reaction represents a key metabolic point of regulation for the cysteine biosynthetic pathway due to its feedback inhibition by cysteine. The enzyme is a 175 kDa homohexamer, composed of a dimer of homotrimers. Each subunit contains an N-terminal alpha helical region and a C-terminal left-handed beta-helix (LbH) subdomain with 5 turns, each containing a hexapeptide repeat motif characteristic of the acyltransferase superfamily of enzymes. The trimer interface mainly involves the C-terminal LbH subdomain while the dimer (of trimers) interface is mediated by the N-terminal alpha helical subdomain.	101
100046	cd03356	LbH_G1P_AT_C_like	Left-handed parallel beta-Helix (LbH) domain of a group of proteins with similarity to glucose-1-phosphate adenylyltransferase: Included in this family are glucose-1-phosphate adenylyltransferase, mannose-1-phosphate guanylyltransferase, and the eukaryotic translation initiation factor eIF-2B subunits, epsilon and gamma. Most members of this family contain an N-terminal catalytic domain that resembles a dinucleotide-binding Rossmann fold, followed by a LbH fold domain with at least 4 turns, each containing three imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X). eIF-2B epsilon contains an additional domain of unknown function at the C-terminus. Proteins containing hexapeptide repeats are often enzymes showing acyltransferase activity.	79
100047	cd03357	LbH_MAT_GAT	Maltose O-acetyltransferase (MAT) and Galactoside O-acetyltransferase (GAT): MAT and GAT catalyze the CoA-dependent acetylation of the 6-hydroxyl group of their respective sugar substrates. MAT acetylates maltose and glucose exclusively at the C6 position of the nonreducing end glucosyl moiety. GAT specifically acetylates galactopyranosides. Furthermore, MAT shows higher affinity toward artificial substrates containing an alkyl or hydrophobic chain as well as a glucosyl unit. Active MAT and GAT are homotrimers, with each subunit consisting of an N-terminal alpha-helical region and a C-terminal left-handed parallel beta-helix (LbH) subdomain with 6 turns, each containing three imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X).	169
100048	cd03358	LbH_WxcM_N_like	WxcM-like, Left-handed parallel beta-Helix (LbH) N-terminal domain: This group is composed of Xanthomonas campestris WxcM and proteins with similarity to the WxcM N-terminal domain. WxcM is thought to be bifunctional, catalyzing both the isomerization and transacetylation reactions of keto-hexoses. It contains an N-terminal LbH domain responsible for the transacetylation function and a C-terminal isomerase domain. The LbH domain contains imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X), typical of enzymes with acyltransferase activity.	119
100049	cd03359	LbH_Dynactin_5	Dynactin 5 (or subunit p25); Dynactin is a major component of the activator complex that stimulates dynein-mediated vesicle transport. Dynactin is a heterocomplex of at least eight subunits, including a 150,000-MW protein called Glued, the actin-capping protein Arp1, and dynamitin. In vitro binding experiments show that dynactin enhances dynein-dependent motility, possibly through interaction with microtubules and vesicles. Subunit p25 is part of the pointed-end subcomplex in dynactin that also includes p26, p27, and Arp11. This subcomplex interacts with membranous cargoes. p25 and p27 contain imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X), indicating a left-handed parallel beta helix (LbH) structural domain. Proteins containing hexapeptide repeats are often enzymes showing acyltransferase activity.	161
100050	cd03360	LbH_AT_putative	Putative Acyltransferase (AT), Left-handed parallel beta-Helix (LbH) domain; This group is composed of mostly uncharacterized proteins containing an N-terminal helical subdomain followed by a LbH domain. The alignment contains 6 turns, each containing three imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X). Proteins containing hexapeptide repeats are often enzymes showing acyltransferase activity. A few members are identified as NeuD, a sialic acid (Sia) O-acetyltransferase that is required for Sia synthesis and surface polysaccharide sialylation.	197
173781	cd03361	TOPRIM_TopoIA_RevGyr	TopoIA_RevGyr : The topoisomerase-primase (TOPRIM) domain found in members of the type IA family of DNA topoisomerases (Topo IA) similar to the ATP-dependent reverse gyrase found in archaea and thermophilic bacteria.   Type IA DNA topoisomerases remove (relax) negative supercoils in the DNA by: cleaving one strand of the DNA duplex, covalently linking to the 5' phosphoryl end of the DNA break and, allowing the other strand of the duplex to pass through the gap. Reverse gyrase is also able to insert positive supercoils in the presence of ATP and negative supercoils in the presence of AMPPNP. The TOPRIM domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD).  For topoisomerases the conserved glutamate is believed to act as a general base in strand joining and, as a general acid in strand cleavage. The DXD motif may co-ordinate Mg2+, a cofactor required for full catalytic function.	170
173782	cd03362	TOPRIM_TopoIA_TopoIII	TOPRIM_TopoIA_TopoIII: The topoisomerase-primase (TOPRIM) domain found in members of the type IA family of DNA topoisomerases (Topo IA) similar to topoisomerase III.   Type IA DNA topoisomerases remove (relax) negative supercoils in the DNA by: cleaving one strand of the DNA duplex, covalently linking to the 5' phosphoryl end of the DNA break and, allowing the other strand of the duplex to pass through the gap.  The TOPRIM domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD).  For topoisomerases the conserved glutamate is believed to act as a general base in strand joining and, as a general acid in strand cleavage. The DXD motif may co-ordinate Mg2+, a cofactor required for full catalytic function.	151
173783	cd03363	TOPRIM_TopoIA_TopoI	TOPRIM_TopoIA_TopoI: The topoisomerase-primase (TOPRIM) domain found in members of the type IA family of DNA topoisomerases (Topo IA) similar to Escherichia coli DNA topoisomerase I.   Type IA DNA topoisomerases remove (relax) negative supercoils in the DNA by: cleaving one strand of the DNA duplex, covalently linking to the 5' phosphoryl end of the DNA break and, allowing the other strand of the duplex to pass through the gap.  The TOPRIM domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD).  For topoisomerases the conserved glutamate is believed to act as a general base in strand joining and, as a general acid in strand cleavage. The DXD motif may co-ordinate Mg2+, a cofactor required for full catalytic function.	123
173784	cd03364	TOPRIM_DnaG_primases	TOPRIM_DnaG_primases: The topoisomerase-primase (TOPRIM) nucleotidyl transferase/hydrolase domain found in the active site regions of proteins similar to Escherichia coli DnaG. Primases synthesize RNA primers for the initiation of DNA replication. DnaG type primases are often closely associated with DNA helicases in primosome assemblies.  The TOPRIM domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD). This glutamate and the two aspartates cluster together to form a highly acidic surface patch. The conserved glutamate may act as a general base in nucleotide polymerization by primases. The DXD motif may co-ordinate Mg2+, a cofactor required for full catalytic function.  E. coli DnaG is a single subunit enzyme.	79
173785	cd03365	TOPRIM_TopoIIA	TOPRIM_TopoIIA: topoisomerase-primase (TOPRIM) nucleotidyl transferase/hydrolase domain of the type found in proteins of the type IIA family of DNA topoisomerases similar to Saccharomyces cerevisiae Topoisomerase II. TopoIIA enzymes cut both strands of the duplex DNA to remove (relax) both positive and negative supercoils in DNA.  These enzymes covalently attach to the 5' ends of the cut DNA, separate the free ends of the cleaved strands, pass another region of the duplex through this gap, then rejoin the ends. These proteins also catenate/decatenate duplex rings.  The TOPRIM domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD). This glutamate and the two aspartates cluster together to form a highly acidic surface patch. The conserved glutamate may act as a general base in strand joining and as a general acid in strand cleavage by topoisomerases.  The DXD motif may co-ordinate Mg2+, a cofactor required for full catalytic function.	120
173786	cd03366	TOPRIM_TopoIIA_GyrB	TOPRIM_TopoIIA_GyrB: topoisomerase-primase (TOPRIM) nucleotidyl transferase/hydrolase domain of the type found in proteins of the type IIA family of DNA topoisomerases similar to the Escherichia coli GyrB subunit. TopoIIA enzymes cut both strands of the duplex DNA to remove (relax) both positive and negative supercoils in DNA.  These enzymes covalently attach to the 5' ends of the cut DNA, separate the free ends of the cleaved strands, pass another region of the duplex through this gap, then rejoin the ends. These proteins also catenate/decatenate duplex rings.  DNA gyrase is more effective at relaxing supercoils than decatenating DNA.  DNA gyrase in addition inserts negative supercoils in the presence of ATP.  The TOPRIM domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD). The conserved glutamate may act as a general base in strand joining and as a general acid in strand cleavage by topoisomerases.  The DXD motif may co-ordinate Mg2+, a cofactor required for full catalytic function.	114
239465	cd03367	Ribosomal_S23	S12-like family, 40S ribosomal protein S23 subfamily; S23 is located at the interface of the large and small ribosomal subunits of eukaryotes, adjacent to the decoding center. It interacts with domain III of the eukaryotic elongation factor 2 (eEF2), which catalyzes the translocation of the growing peptidyl-tRNA to the P site to make room for the next aminoacyl-tRNA at the A (acceptor) site. Through its interaction with eEF2, S23 may play an important role in translocation. Also members of this subfamily are the archaeal 30S ribosomal S12 proteins. Prokaryotic S12 is essential for maintenance of a pretranslocation state and, together with S13, functions as control element for the rRNA- and tRNA-driven movements of translocation. S12 and S23 are also implicated in translation accuracy. Antibiotics such as streptomycin bind S12/S23 and cause the ribosome to misread the genetic code.	115
239466	cd03368	Ribosomal_S12	S12-like family, 30S ribosomal protein S12 subfamily; S12 is located at the interface of the large and small ribosomal subunits of prokaryotes, chloroplasts and mitochondria, where it plays an important role in both tRNA and ribosomal subunit interactions. S12 is essential for maintenance of a pretranslocation state and, together with S13, functions as a control element for the rRNA- and tRNA-driven movements of translocation. Antibiotics such as streptomycin bind S12 and cause the ribosome to misread the genetic code.	108
213269	cd03369	ABCC_NFT1	ATP-binding cassette domain 2 of NFT1, subfamily C. Domain 2 of NFT1 (New full-length MRP-type transporter 1). NFT1 belongs to the MRP (multidrug resistance-associated protein) family of ABC transporters. Some of the MRP members have five additional transmembrane segments in their N-terminus, but the function of these additional membrane-spanning domains is not clear. The MRP was found in the multidrug-resisting lung cancer cell in which p-glycoprotein was not overexpressed. MRP exports glutathione by drug stimulation, as well as, certain substrates in conjugated forms with anions such as glutathione, glucuronate, and sulfate.	207
380327	cd03370	nitroreductase	uncharacterized nitroreductase family proteins. Nitroreductase family containing Thermus thermophilus NADH oxidase and other, uncharacterized proteins. Nitroreductase catalyzes the reduction of nitroaromatic compounds such as nitrotoluenes, nitrofurans and nitroimidazoles. This process requires NAD(P)H as electron donor in an obligatory two-electron transfer and uses FMN as cofactor. The enzyme is typically a homodimer.	191
239468	cd03371	TPP_PpyrDC	Thiamine pyrophosphate (TPP) family, PpyrDC subfamily, TPP-binding module; composed of proteins similar to phosphonopyruvate decarboxylase (PpyrDC) proteins. PpyrDC is a homotrimeric enzyme which functions in the biosynthesis of C-P compounds such as bialaphos tripeptide in Streptomyces hygroscopicus. These proteins require TPP and divalent metal cation cofactors.	188
239469	cd03372	TPP_ComE	Thiamine pyrophosphate (TPP) family, ComE subfamily, TPP-binding module; composed of proteins similar to Methanococcus jannaschii sulfopyruvate decarboxylase beta subunit (ComE). M. jannaschii sulfopyruvate decarboxylase (ComDE) is a dodecamer of six alpha (D) subunits and six (E) beta subunits, which catalyzes the decarboxylation of sulfopyruvic acid to sulfoacetaldehyde in the coenzyme M pathway. ComDE requires TPP and divalent metal cation cofactors.	179
239470	cd03375	TPP_OGFOR	Thiamine pyrophosphate (TPP family), 2-oxoglutarate ferredoxin oxidoreductase (OGFOR) subfamily, TPP-binding module; OGFOR catalyzes the oxidative decarboxylation of 2-oxo-acids, with ferredoxin acting as an electron acceptor. In the TCA cycle, OGFOR catalyzes the oxidative decarboxylation of 2-oxoglutarate to succinyl-CoA. In the reductive tricarboxylic acid cycle found in the anaerobic autotroph Hydrogenobacter thermophilus, OGFOR catalyzes the reductive carboxylation of succinyl-CoA to produce 2-oxoglutarate. Thauera aromatica OGFOR has been shown to provide reduced ferredoxin to benzoyl-CoA reductase, a key enzyme in the anaerobic metabolism of aromatic compounds. OGFOR is dependent on TPP and a divalent metal cation for activity.	193
239471	cd03376	TPP_PFOR_porB_like	Thiamine pyrophosphate (TPP family), PFOR porB-like subfamily, TPP-binding module; composed of proteins similar to the beta subunit (porB) of the Helicobacter pylori four-subunit pyruvate ferredoxin oxidoreductase (PFOR), which are also found in archaea and some hyperthermophilic bacteria. PFOR catalyzes the oxidative decarboxylation of pyruvate to form acetyl-CoA, a crucial step in many metabolic pathways. Archaea, anaerobic bacteria and eukaryotes that lack mitochondria (and therefore pyruvate dehydrogenase) use PFOR to oxidatively decarboxylate pyruvate, with ferredoxin or flavodoxin as the electron acceptor. The 36-kDa porB subunit contains the binding sites for the cofactors, TPP and a divalent metal cation, which are required for activity.	235
239472	cd03377	TPP_PFOR_PNO	Thiamine pyrophosphate (TPP family), PFOR_PNO subfamily, TPP-binding module; composed of proteins similar to the single subunit pyruvate ferredoxin oxidoreductase (PFOR) of Desulfovibrio africanus, present in bacteria and amitochondriate eukaryotes. This subfamily also includes proteins characterized as pyruvate NADP+ oxidoreductase (PNO). These enzymes are dependent on TPP and a divalent metal cation as cofactors. PFOR and PNO catalyze the oxidative decarboxylation of pyruvate to form acetyl-CoA, a crucial step in many metabolic pathways. Archaea, anaerobic bacteria and eukaryotes that lack mitochondria (and therefore pyruvate dehydrogenase) use PFOR to oxidatively decarboxylate pyruvate, with ferredoxin or flavodoxin as the electron acceptor. The PFOR from cyanobacterium Anabaena (NifJ) is required for the transfer of electrons from pyruvate to flavodoxin, which reduces nitrogenase. The facultative anaerobic mitochondrion of the photosynthetic protist Euglena gracilis oxidizes pyruvate with PNO.	365
239473	cd03378	beta_CA_cladeC	Carbonic anhydrases (CA) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism in which the nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide is followed by the regeneration of an active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. CAs are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionarily distinct families of CAs (the alpha-, beta-, and gamma-CAs) which show no significant sequence identity or structural similarity.  Within the beta-CA family there are four evolutionarily distinct clades (A through D). The beta-CAs are multimeric enzymes (forming dimers,tetramers,hexamers and octamers) which are present in higher plants, algae, fungi, archaea and prokaryotes.	154
239474	cd03379	beta_CA_cladeD	Carbonic anhydrases (CA) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism in which the nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide is followed by the regeneration of an active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. CAs are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionarily distinct families of CAs (the alpha-, beta-, and gamma-CAs) which show no significant sequence identity or structural similarity.  Within the beta-CA family there are four evolutionarily distinct clades (A through D). The beta-CAs are multimeric enzymes (forming dimers,tetramers,hexamers and octamers) which are present in higher plants, algae, fungi, archaea and prokaryotes.	142
239475	cd03380	PAP2_like_1	PAP2_like_1 proteins, a sub-family of PAP2, containing bacterial acid phosphatase, vanadium chloroperoxidases and vanadium bromoperoxidases.	209
239476	cd03381	PAP2_glucose_6_phosphatase	PAP2_like proteins, glucose-6-phosphatase subfamily. Glucose-6-phosphatase converts glucose-6-phosphate into free glucose and is active in the lumen of the endoplasmic reticulum, where it is bound to the membrane. The generation of free glucose is an important control point in metabolism, and stands at the end of gluconeogenesis and the release of glucose from glycogen. Deficiency of glucose-6-phosphatase leads to von Gierke's disease.	235
239477	cd03382	PAP2_dolichyldiphosphatase	PAP2_like proteins, dolichyldiphosphatase subfamily. Dolichyldiphosphatase is a membrane-associated protein located in the endoplasmic reticulum and hydrolyzes dolichyl pyrophosphate, as well as dolichylmonophosphate at a low rate. The enzyme is necessary for maintaining proper levels of dolichol-linked oligosaccharides and protein N-glycosylation, and might play a role in re-utilization of the glycosyl carrier lipid for additional rounds of lipid intermediate biosynthesis after its release during protein N-glycosylation reactions.	159
239478	cd03383	PAP2_diacylglycerolkinase	PAP2_like proteins, diacylglycerol_kinase like sub-family. In some prokaryotes, PAP2_like phosphatase domains appear fused to E. coli DAGK-like trans-membrane diacylglycerol kinase domains. The cellular function of these architectures remains to be determined.	109
239479	cd03384	PAP2_wunen	PAP2, wunen subfamily. Most likely a family of membrane associated phosphatidic acid phosphatases. Wunen is a Drosophila protein expressed in the central nervous system, which provides repellent activity towards primordial germ cells (PGCs), controls the survival of PGCs and is essential in the migration process of these cells towards the somatic gonadal precursors.	150
239480	cd03385	PAP2_BcrC_like	PAP2_like proteins, BcrC_like subfamily. Several members of this family have been annotated as bacitracin transport permeases, as it was suspected that they form the permease component of an ABC transporter system. It was shown, however, that BcrC from Bacillus subtilis possesses undecaprenyl pyrophosphate (UPP) phosphatase activity, and it is hypothesized that it competes with bacitracin for UPP, increasing the cell's resistance to bacitracin.	144
239481	cd03386	PAP2_Aur1_like	PAP2_like proteins, Aur1_like subfamily. Yeast Aur1p or Ipc1p is necessary for the addition of inositol phosphate to ceramide, an essential step in yeast sphingolipid synthesis, and is the target of several antifungal compounds such as aureobasidin.	186
239482	cd03388	PAP2_SPPase1	PAP2_like proteins, sphingosine-1-phosphatase subfamily. Sphingosine-1-phosphatase is an intracellular enzyme located in the endoplasmic reticulum, which regulates the level of sphingosine-1-phosphate (S1P), a bioactive lipid. S1P acts as a second messenger in the cell, and extracellularly by binding to G-protein coupled receptors of the endothelial differentiation gene family.	151
239483	cd03389	PAP2_lipid_A_1_phosphatase	PAP2_like proteins, Lipid A 1-phosphatase subfamily. Lipid A 1-phosphatase, or LpxE from Francisella novicida selectively dephosphorylates lipid A at the 1-position. Lipid A is the membrane-anchor component of lipopolysaccharides (LPS), the major constituents of the outer membrane in many gram-negative bacteria.	186
239484	cd03390	PAP2_containing_1_like	PAP2, subfamily similar to human phosphatidic_acid_phosphatase_type_2_domain_containing_1. Most likely membrane-associated phosphatidic acid phosphatases. Plant members of this group are constitutively expressed in many tissues and exhibit both diacylglycerol pyrophosphate phosphatase activity as well as phosphatidate (PA) phosphatase activity, they may have a more generic housekeeping role in lipid metabolism.	193
239485	cd03391	PAP2_containing_2_like	PAP2, subfamily similar to human phosphatidic_acid_phosphatase_type_2_domain_containing_2. PAP2 is a super-family of phosphatases and haloperoxidases. This subgroup, which is specific to eukaryota, lacks functional characterization and may act as a membrane-associated phosphatidic acid phosphatase.	159
239486	cd03392	PAP2_like_2	PAP2_like_2 proteins. PAP2 is a super-family of phosphatases and haloperoxidases. This subgroup, which is specific to bacteria, lacks functional characterization and may act as a membrane-associated lipid phosphatase.	182
239487	cd03393	PAP2_like_3	PAP2_like_3 proteins. PAP2 is a super-family of phosphatases and haloperoxidases. This subgroup, which is specific to bacteria and archaea, lacks functional characterization and may act as a membrane-associated lipid phosphatase.	125
239488	cd03394	PAP2_like_5	PAP2_like_5 proteins. PAP2 is a super-family of phosphatases and haloperoxidases. This subgroup, which is specific to bacteria, lacks functional characterization and may act as a membrane-associated lipid phosphatase.	106
239489	cd03395	PAP2_like_4	PAP2_like_4 proteins. PAP2 is a super-family of phosphatases and haloperoxidases. This subgroup, which is specific to bacteria, lacks functional characterization and may act as a membrane-associated lipid phosphatase.	177
239490	cd03396	PAP2_like_6	PAP2_like_6 proteins. PAP2 is a super-family of phosphatases and haloperoxidases. This subgroup, which mainly contains bacterial proteins, lacks functional characterization and may act as a membrane-associated lipid phosphatase.	197
239491	cd03397	PAP2_acid_phosphatase	PAP2, bacterial acid phosphatase or class A non-specific acid phosphatases. These enzymes catalyze phosphomonoester hydrolysis, with optimal activity in low pH conditions. They are secreted into the periplasmic space, and their physiological role remains to be determined.	232
239492	cd03398	PAP2_haloperoxidase	PAP2, haloperoxidase_like subfamily. Haloperoxidases catalyze the oxidation of halides such as bromide or chloride by hydrogen peroxide, which results in subsequent halogenation of organic substrates, or halide-assisted disproportionation of hydrogen peroxide forming dioxygen. They are likely to participate in the biosynthesis of halogenated natural products, such as volatile halogenated hydrocarbons, chiral halogenated terpenes, acetogenins and indoles.	232
259798	cd03399	SPFH_flotillin	Flotillin or reggie family; SPFH (stomatin, prohibitin, flotillin, and HflK/C) superfamily. The flotillin (reggie) like proteins are lipid raft-associated. Individual proteins of this SPFH family may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. In addition, microdomains formed from flotillin proteins may be dynamic units with their own regulatory functions. Flotillins have been implicated in signal transduction, vesicle trafficking, cytoskeleton rearrangement and interact with a variety of proteins. They may play a role in the progression of prion disease, in the pathogenesis of neurodegenerative diseases such as Parkinson's and Alzheimer's disease and in cancer invasion, and metastasis.	145
259799	cd03401	SPFH_prohibitin	Prohibitin family; SPFH (stomatin, prohibitin, flotillin, and HflK/C) superfamily. This model characterizes proteins similar to prohibitin (a lipid raft-associated integral membrane protein). Individual proteins of the SPFH (band 7) domain superfamily may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. These microdomains, in addition to being stable scaffolds, may also be dynamic units with their own regulatory functions. Prohibitin is a mitochondrial inner-membrane protein which may act as a chaperone for the stabilization of mitochondrial proteins. Human prohibitin forms a hetero-oligomeric complex with Bap-37 (prohibitin 2, an SPFH domain carrying homolog). This complex may protect non-assembled membrane proteins against proteolysis by the m-AAA protease. Prohibitin and Bap-37 yeast homologs have been implicated in yeast longevity and in the maintenance of mitochondrial morphology.	195
259800	cd03402	SPFH_like_u2	Uncharacterized family; SPFH (stomatin, prohibitin, flotillin, and HflK/C) superfamily. This model summarizes an uncharacterized family of proteins similar to stomatin, prohibitin, flotillin, HflK/C (SPFH) and podocin. The conserved domain common to the SPFH superfamily has also been referred to as the Band 7 domain. Many superfamily members are associated with lipid rafts. Individual proteins of the SPFH superfamily may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Microdomains formed from flotillin proteins may in addition be dynamic units with their own regulatory functions. Flotillins have been implicated in signal transduction, vesicle trafficking, and cytoskeleton rearrangement, and are known to interact with a variety of proteins. Stomatin interacts with and regulates members of the degenerin/epithelial Na+ channel family in mechanosensory cells of Caenorhabditis elegans and vertebrate neurons and participates in trafficking of Glut1 glucose transporters. Prohibitin may act as a chaperone for the stabilization of mitochondrial proteins. Prokaryotic HflK/C plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection. Flotillins have been implicated in the progression of prion disease, in the pathogenesis of neurodegenerative diseases such as Parkinson's and Alzheimer's disease, and in cancer invasion and metastasis. Mutations in the podocin gene give rise to autosomal recessive steroid-resistant nephrotic syndrome.	231
259801	cd03403	SPFH_stomatin	Stomatin, a subgroup of the stomatin-like proteins (slipins) family; belonging to the SPFH (stomatin, prohibitin, flotillin, and HflK/C) superfamily. Stomatin (or band 7) is widely expressed and is highly expressed in red blood cells. It localizes predominantly to the plasma membrane and to intracellular vesicles of the endocytic pathway, where it is present in higher order homo-oligomeric complexes (of between 9 and 12 monomers). Stomatin interacts with and regulates members of the degenerin/epithelial Na+ channel family in mechanosensory cells of Caenorhabditis elegans and vertebrate neurons, and is implicated in trafficking of Glut1 glucose transporters. This subgroup, found in animals, also contains proteins similar to Caenorhabditis elegans MEC-2. MEC-2 interacts with MEC-4, which is part of the degenerin channel complex required for response to gentle body touch.	202
259802	cd03404	SPFH_HflK	High frequency of lysogenization K (HflK) family; SPFH (stomatin, prohibitin, flotillin, and HflK/C) superfamily. This model characterizes proteins similar to prokaryotic HflK (High frequency of lysogenization K). Although many members of the SPFH (or band 7) superfamily are lipid raft associated, prokaryote plasma membranes lack cholesterol and are unlikely to have lipid raft domains. Individual proteins of this SPFH domain superfamily may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Escherichia coli HflK is an integral membrane protein which may localize to the plasma membrane. HflK associates with another SPFH superfamily member (HflC) to form an HflKC complex. HflKC interacts with FtsH in a large complex termed the FtsH holo-enzyme. FtsH is an AAA ATP-dependent protease which exerts progressive proteolysis against membrane-embedded and soluble substrate proteins. HflKC can modulate the activity of FtsH. HflKC plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection.	266
259803	cd03405	SPFH_HflC	High frequency of lysogenization C (HflC) family; SPFH (stomatin, prohibitin, flotillin, and HflK/C) superfamily. This model characterizes proteins similar to prokaryotic HflC (High frequency of lysogenization C). Although many members of the SPFH (or band 7) superfamily are lipid raft associated, prokaryote plasma membranes lack cholesterol and are unlikely to have lipid raft domains. Individual proteins of this SPFH domain superfamily may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Escherichia coli HflC is an integral membrane protein which may localize to the plasma membrane. HflC associates with another SPFH superfamily member (HflK) to form an HflKC complex. HflKC interacts with FtsH in a large complex termed the FtsH holo-enzyme. FtsH is an AAA ATP-dependent protease which exerts progressive proteolysis against membrane-embedded and soluble substrate proteins. HflKC can modulate the activity of FtsH. HflKC plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection.	249
259804	cd03406	SPFH_like_u3	Uncharacterized family; SPFH (stomatin, prohibitin, flotillin, and HflK/C) superfamily. This model summarizes an uncharacterized family of proteins similar to stomatin, prohibitin, flotillin, HflK/C (SPFH) and podocin. The conserved domain common to the SPFH superfamily has also been referred to as the Band 7 domain. Many superfamily members are associated with lipid rafts. Individual proteins of the SPFH superfamily may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Microdomains formed from flotillin proteins may in addition be dynamic units with their own regulatory functions. Flotillins have been implicated in signal transduction, vesicle trafficking, and cytoskeleton rearrangement, and are known to interact with a variety of proteins. Stomatin interacts with and regulates members of the degenerin/epithelial Na+ channel family in mechanosensory cells of Caenorhabditis elegans and vertebrate neurons and participates in trafficking of Glut1 glucose transporters. Prohibitin may act as a chaperone for the stabilization of mitochondrial proteins. Prokaryotic HflK/C plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection. Flotillins have been implicated in the progression of prion disease, in the pathogenesis of neurodegenerative diseases such as Parkinson's and Alzheimer's disease, and in cancer invasion and metastasis. Mutations in the podocin gene give rise to autosomal recessive steroid-resistant nephrotic syndrome.	293
259805	cd03407	SPFH_like_u4	Uncharacterized family; SPFH (stomatin, prohibitin, flotillin, and HflK/C) superfamily. This model summarizes an uncharacterized family of proteins similar to stomatin, prohibitin, flotillin, HflK/C (SPFH) and podocin. The conserved domain common to the SPFH superfamily has also been referred to as the Band 7 domain. Many superfamily members are associated with lipid rafts. Individual proteins of the SPFH superfamily may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Microdomains formed from flotillin proteins may in addition be dynamic units with their own regulatory functions. Flotillins have been implicated in signal transduction, vesicle trafficking, and cytoskeleton rearrangement, and are known to interact with a variety of proteins. Stomatin interacts with and regulates members of the degenerin/epithelial Na+ channel family in mechanosensory cells of Caenorhabditis elegans and vertebrate neurons and participates in trafficking of Glut1 glucose transporters. Prohibitin may act as a chaperone for the stabilization of mitochondrial proteins. Prokaryotic HflK/C plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection. Flotillins have been implicated in the progression of prion disease, in the pathogenesis of neurodegenerative diseases such as Parkinson's and Alzheimer's disease, and in cancer invasion and metastasis. Mutations in the podocin gene give rise to autosomal recessive steroid-resistant nephrotic syndrome.	269
259806	cd03408	SPFH_like_u1	Uncharacterized family; SPFH (stomatin, prohibitin, flotillin, and HflK/C) superfamily. This model summarizes an uncharacterized family of proteins similar to stomatin, prohibitin, flotillin, HflK/C (SPFH) and podocin. The conserved domain common to the SPFH superfamily has also been referred to as the Band 7 domain. Many superfamily members are associated with lipid rafts. Individual proteins of the SPFH superfamily may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Microdomains formed from flotillin proteins may in addition be dynamic units with their own regulatory functions. Flotillins have been implicated in signal transduction, vesicle trafficking, and cytoskeleton rearrangement, and are known to interact with a variety of proteins. Stomatin interacts with and regulates members of the degenerin/epithelial Na+ channel family in mechanosensory cells of Caenorhabditis elegans and vertebrate neurons and participates in trafficking of Glut1 glucose transporters. Prohibitin may act as a chaperone for the stabilization of mitochondrial proteins. Prokaryotic HflK/C plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection. Flotillins have been implicated in the progression of prion disease, in the pathogenesis of neurodegenerative diseases such as Parkinson's and Alzheimer's disease, and in cancer invasion and metastasis. Mutations in the podocin gene give rise to autosomal recessive steroid-resistant nephrotic syndrome.	217
239503	cd03409	Chelatase_Class_II	Class II Chelatase: a family of ATP-independent monomeric or homodimeric enzymes that catalyze the insertion of metal into protoporphyrin rings. This family includes protoporphyrin IX ferrochelatase (HemH), sirohydrochlorin ferrochelatase (SirB) and the cobaltochelatases, CbiK and CbiX. HemH and SirB are involved in heme and siroheme biosynthesis, respectively, while the cobaltochelatases are associated with cobalamin biosynthesis. Excluded from this family are the ATP-dependent heterotrimeric chelatases (class I) and the multifunctional homodimeric enzymes with dehydrogenase and chelatase activities (class III).	101
239504	cd03411	Ferrochelatase_N	Ferrochelatase, N-terminal domain: Ferrochelatase (protoheme ferrolyase or HemH) is the terminal enzyme of the heme biosynthetic pathway. It catalyzes the insertion of ferrous iron into the protoporphyrin IX ring yielding protoheme. This enzyme is ubiquitous in nature and widely distributed in bacteria and eukaryotes. Recently, some archaeal members have been identified. The oligomeric state of these enzymes varies depending on the presence of a dimerization motif at the C-terminus.	159
239505	cd03412	CbiK_N	Anaerobic cobalamin biosynthetic cobalt chelatase (CbiK), N-terminal domain. CbiK is part of the cobalt-early path for cobalamin biosynthesis. It catalyzes the insertion of cobalt into the oxidized form of precorrin-2, factor II (sirohydrochlorin), the second step of the anaerobic branch of vitamin B12 biosynthesis. CbiK belongs to the class II family of chelatases and is a homomeric enzyme that does not require ATP for its enzymatic activity.	127
239506	cd03413	CbiK_C	Anaerobic cobalamin biosynthetic cobalt chelatase (CbiK), C-terminal domain. CbiK is part of the cobalt-early path for cobalamin biosynthesis. It catalyzes the insertion of cobalt into the oxidized form of precorrin-2, factor II (sirohydrochlorin), the second step of the anaerobic branch of vitamin B12 biosynthesis. CbiK belongs to the class II family of chelatases, and is a homomeric enzyme that does not require ATP for its enzymatic activity.	103
239507	cd03414	CbiX_SirB_C	Sirohydrochlorin cobalt chelatase (CbiX) and sirohydrochlorin iron chelatase (SirB), C-terminal domain. SirB catalyzes the ferro-chelation of sirohydrochlorin to siroheme, the prosthetic group of sulfite and nitrite reductases. CbiX is a cobaltochelatase, responsible for the chelation of Co2+ into sirohydrochlorin, an important step in the vitamin B12 biosynthetic pathway. CbiX often contains a C-terminal histidine-rich region that may be important for metal delivery and/or storage, and may also contain an iron-sulfur center. Both CbiX and SirB are found in a wide range of bacteria.	117
239508	cd03415	CbiX_CbiC	Archaeal sirohydrochlorin cobalt chelatase (CbiX) single domain. Proteins in this subgroup contain a single CbiX domain N-terminal to a precorrin-8X methylmutase (CbiC) domain. CbiX is a cobaltochelatase, responsible for the chelation of Co2+ into sirohydrochlorin, while CbiC catalyzes the conversion of cobalt-precorrin 8 to cobyrinic acid by methyl rearrangement. Both CbiX and CbiC are involved in vitamin B12 biosynthesis.	125
239509	cd03416	CbiX_SirB_N	Sirohydrochlorin cobalt chelatase (CbiX) and sirohydrochlorin iron chelatase (SirB), N-terminal domain. SirB catalyzes the ferro-chelation of sirohydrochlorin to siroheme, the prosthetic group of sulfite and nitrite reductases. CbiX is a cobaltochelatase, responsible for the chelation of Co2+ into sirohydrochlorin, an important step in the vitamin B12 biosynthetic pathway. CbiX often contains a C-terminal histidine-rich region that may be important for metal delivery and/or storage, and may also contain an iron-sulfur center. Both are found in a wide range of bacteria. This subgroup also contains single domain proteins from archaea and bacteria which may represent the ancestral form of class II chelatases before domain duplication occurred.	101
239510	cd03418	GRX_GRXb_1_3_like	Glutaredoxin (GRX) family, GRX bacterial class 1 and 3 (b_1_3)-like subfamily; composed of bacterial GRXs, approximately 10 kDa in size, and proteins containing a GRX or GRX-like domain. GRX is a glutathione (GSH) dependent reductase, catalyzing the disulfide reduction of target proteins such as ribonucleotide reductase. It contains a redox active CXXC motif in a TRX fold and uses a similar dithiol mechanism employed by TRXs for intramolecular disulfide bond reduction of protein substrates. Unlike TRX, GRX has preference for mixed GSH disulfide substrates, in which it uses a monothiol mechanism where only the N-terminal cysteine is required. The flow of reducing equivalents in the GRX system goes from NADPH -> GSH reductase -> GSH -> GRX -> protein substrates. By altering the redox state of target proteins, GRX is involved in many cellular functions including DNA synthesis, signal transduction and the defense against oxidative stress. Different classes are known including  E. coli GRX1 and GRX3, which are members of this subfamily.	75
239511	cd03419	GRX_GRXh_1_2_like	Glutaredoxin (GRX) family, GRX human class 1 and 2 (h_1_2)-like subfamily; composed of proteins similar to human GRXs, approximately 10 kDa in size, and proteins containing a GRX or GRX-like domain. GRX is a glutathione (GSH) dependent reductase, catalyzing the disulfide reduction of target proteins such as ribonucleotide reductase. It contains a redox active CXXC motif in a TRX fold and uses a similar dithiol mechanism employed by TRXs for intramolecular disulfide bond reduction of protein substrates. Unlike TRX, GRX has preference for mixed GSH disulfide substrates, in which it uses a monothiol mechanism where only the N-terminal cysteine is required. The flow of reducing equivalents in the GRX system goes from NADPH -> GSH reductase -> GSH -> GRX -> protein substrates. By altering the redox state of target proteins, GRX is involved in many cellular functions including DNA synthesis, signal transduction and the defense against oxidative stress. Different classes are known including human GRX1 and GRX2, which are members of this subfamily. Also included in this subfamily are the N-terminal GRX domains of proteins similar to human thioredoxin reductase 1 and 3.	82
239512	cd03420	SirA_RHOD_Pry_redox	SirA_RHOD_Pry_redox.    SirA-like domain located within a multidomain protein of unknown function. Other domains include RHOD (rhodanese homology domain), and Pry_redox (pyridine nucleotide-disulphide oxidoreductase) as well as a C-terminal domain that corresponds to COG2210.  This fold is referred to as a two-layered alpha/beta sandwich, structurally similar to that of translation initiation factor 3.	69
239513	cd03421	SirA_like_N	SirA_like_N, a protein of unknown function with an N-terminal SirA-like domain. The SirA, YedF, YeeD protein family is present in bacteria as well as archaea. SirA (also known as UvrY and YhhP) belongs to a family of two-component response regulators that controls secondary metabolism and virulence. The other member of this two-component system is a sensor kinase called BarA which phosphorylates SirA. A variety of microorganisms have similar proteins, all of which contain a common CPxP sequence motif in the N-terminal region. YhhP is suggested to be important for normal cell division and growth in rich nutrient medium. Moreover, despite a low primary sequence similarity, the YccP structure closely resembles the non-homologous C-terminal RNA-binding domain of E. coli translation initiation factor IF3. The signature CPxP motif serves to stabilize the N-terminal helix as part of the N-capping box and might be important in mRNA-binding.	67
239514	cd03422	YedF	YedF is a bacterial SirA-like protein of unknown function. SirA (also known as UvrY and YhhP) belongs to a family of two-component response regulators that controls secondary metabolism and virulence. The other member of this two-component system is a sensor kinase called BarA which phosphorylates SirA. A variety of microorganisms have similar proteins, all of which contain a common CPxP sequence motif in the N-terminal region. YhhP is suggested to be important for normal cell division and growth in rich nutrient medium. Moreover, despite a low primary sequence similarity, the YccP structure closely resembles the non-homologous C-terminal RNA-binding domain of E. coli translation initiation factor IF3. The signature CPxP motif serves to stabilize the N-terminal helix as part of the N-capping box and might be important in mRNA-binding.	69
239515	cd03423	SirA	SirA (also known as UvrY,  and YhhP) belongs to a family of two-component response regulators that controls secondary metabolism and virulence. The other member of this two-component system is a sensor kinase called BarA which phosphorylates SirA.  A variety of microorganisms have similar proteins, all of which contain a common CPxP sequence motif in the N-terminal region. YhhP is thought to be important for normal cell division and growth in rich nutrient medium.  Moreover, despite a low primary sequence similarity,  the YccP structure closely resembles the non-homologous C-terminal RNA-binding domain of E. coli translation initiation factor IF3. The signature CPxP motif serves to stabilize the N-terminal helix as part of the N-capping box and might be important in mRNA-binding.	69
239516	cd03424	ADPRase_NUDT5	ADP-ribose pyrophosphatase (ADPRase) catalyzes the hydrolysis of ADP-ribose and a variety of additional ADP-sugar conjugates to AMP and ribose-5-phosphate. Like other members of the Nudix hydrolase superfamily, it requires a divalent cation, such as Mg2+, for its activity. It also contains a highly conserved 23-residue Nudix motif (GX5EX7REUXEEXGU, where U = I, L or V) which functions as a metal binding site/catalytic site. In addition to the Nudix motif, there are additional conserved amino acid residues, distal from the signature sequence, that correlate with substrate specificity. In humans, there are four distinct ADPRase activities, three putative cytosolic enzymes (ADPRase-I, -II, and -Mn) and a single mitochondrial enzyme (ADPRase-m). Human ADPRase-II is also referred to as NUDT5. It lacks the N-terminal target sequence unique to mitochondrial ADPRase. The different cytosolic types are distinguished by their specificities for substrate and specific requirement for metal ions. NUDT5 forms a homodimer.	137
239517	cd03425	MutT_pyrophosphohydrolase	The MutT pyrophosphohydrolase is a prototypical Nudix hydrolase that catalyzes the hydrolysis of nucleoside and deoxynucleoside triphosphates (NTPs and dNTPs) by substitution at a beta-phosphorus to yield a nucleotide monophosphate (NMP) and inorganic pyrophosphate (PPi). This enzyme requires two divalent cations for activity; one coordinates the phosphoryl groups of the NTP/dNTP substrate, and the other coordinates to the enzyme. It also contains the Nudix motif, a highly conserved 23-residue block (GX5EX7REUXEEXGU, where U = I, L or V), that functions as metal binding and catalytic site. MutT pyrophosphohydrolase is important in preventing errors in DNA replication by hydrolyzing mutagenic nucleotides such as 8-oxo-dGTP (a product of oxidative damage), which can mispair with template adenine during DNA replication, to guanine nucleotides.	124
239518	cd03426	CoAse	Coenzyme A pyrophosphatase (CoAse), a member of the Nudix hydrolase superfamily, functions to catalyze the elimination of oxidized inactive CoA, which can inhibit CoA-utilizing enzymes. The need for CoAses mainly arises under conditions of oxidative stress. CoAse has a conserved Nudix fold and requires a single divalent cation for catalysis. In addition to a signature Nudix motif G[X5]E[X7]REUXEEXGU, where U is Ile, Leu, or Val, CoAse contains an additional motif upstream called the NuCoA motif (LLTXT(SA)X3RX3GX3FPGG) which is postulated to be involved in CoA recognition. CoA plays a central role in lipid metabolism. It is involved in the initial steps of fatty acid synthesis in the cytosol, in the oxidation of fatty acids and the citric acid cycle in the mitochondria, and in the oxidation of long-chain fatty acids in peroxisomes. CoA has the important role of activating fatty acids for further modification into key biological signalling molecules.	157
239519	cd03427	MTH1	MutT homolog-1 (MTH1) is a member of the Nudix hydrolase superfamily. MTH1, the mammalian counterpart of MutT, hydrolyzes oxidized purine nucleoside triphosphates, such as 8-oxo-dGTP and 2-hydroxy-ATP, to monophosphates, thereby preventing the incorporation of such oxygen radicals during replication. This is an important step in the repair mechanism in genomic and mitochondrial DNA. Like other members of the Nudix family, it requires a divalent cation, such as Mg2+ or Mn2+, for activity, and contains the Nudix motif, a highly conserved 23-residue block (GX5EX7REUXEEXGU, where U = I, L or V), that functions as a metal binding and catalytic site. MTH1 is predominantly localized in the cytoplasm and mitochondria. Structurally, this enzyme adopts a similar fold to MutT despite low sequence similarity outside the conserved Nudix motif. The most distinctive structural difference between MutT and MTH1 is the presence of a beta-hairpin, which is absent in MutT. This results in a much deeper and narrower substrate binding pocket. Mechanistically, MTH1 contains dual specificity for nucleotides that contain 2-OH-adenine bases and those that contain 8-oxo-guanine bases.	137
239520	cd03428	Ap4A_hydrolase_human_like	Diadenosine tetraphosphate (Ap4A) hydrolase is a member of the Nudix hydrolase superfamily. Ap4A hydrolases are well represented in a variety of prokaryotic and eukaryotic organisms. Phylogenetic analysis reveals two distinct subgroups where plant enzymes fall into one subfamily and fungi/animals/archaea enzymes, represented by this subfamily, fall into another. Bacterial enzymes are found in both subfamilies. Ap4A is a potential by-product of aminoacyl tRNA synthesis, and accumulation of Ap4A has been implicated in a range of biological events, such as DNA replication, cellular differentiation, heat shock, metabolic stress, and apoptosis. Ap4A hydrolase cleaves Ap4A asymmetrically into ATP and AMP. It is important in the invasive properties of bacteria and thus presents a potential target for inhibition of such invasive bacteria. Besides the signature nudix motif (G[X5]E[X7]REUXEEXGU, where U is Ile, Leu, or Val) that functions as a metal binding and catalytic site, and a required divalent cation, Ap4A hydrolase is structurally similar to the other members of the nudix superfamily with some degree of variation. Several regions in the sequences are poorly defined and substrate and metal binding sites are only predicted based on kinetic studies.	130
239521	cd03429	NADH_pyrophosphatase	NADH pyrophosphatase, a member of the Nudix hydrolase superfamily, catalyzes the cleavage of NADH into reduced nicotinamide mononucleotide (NMNH) and AMP. Like other members of the Nudix family, it requires a divalent cation, such as Mg2+ or Mn2+, for activity. Members of this family are also recognized by the Nudix motif, a highly conserved 23-residue block (GX5EX7REUXEEXGU, where U = I, L or V), that functions as a metal binding and catalytic site. A block of 8 conserved amino acids downstream of the nudix motif is thought to give NADH pyrophosphatase its specificity for NADH. NADH pyrophosphatase forms a dimer.	131
239522	cd03430	GDPMH	GDP-mannose glycosyl hydrolase (AKA GDP-mannose mannosyl hydrolase (GDPMH)) is a member of the Nudix hydrolase superfamily. This class of enzymes differs from other members of the superfamily in two aspects. First, it contains a modified Nudix signature sequence. The slight changes to the conserved sequence motif (GX5EX7REUXEEXGU, where U = I, L or V) are believed to contribute to the removal of all magnesium binding sites but one, retaining only the metal site that coordinates the pyrophosphate of the substrate. Secondly, it is not a pyrophosphatase that substitutes at a phosphorus; instead, it hydrolyzes nucleotide sugars such as GDP-mannose to GDP and mannose, cleaving the phosphoglycosyl bond by substituting at a carbon position. GDP-mannose provides mannosyl components for cell wall synthesis and is required for the synthesis of other glycosyl donors (such as GDP-fucose and colitose) for the cell wall. The importance of GDP-sugar hydrolase activities is thus closely related to the regulation of cell wall biosynthesis. Enzymes in this family are believed to regulate the concentration of GDP-mannose and GDP-glucose in the bacterial cell wall.	144
239523	cd03431	DNA_Glycosylase_C	DNA glycosylase (MutY in bacteria and hMYH in humans) is responsible for repairing misread A*oxoG residues to C*G by removing the inappropriately paired adenine base from the DNA backbone. It belongs to the Nudix hydrolase superfamily and is important for the repair of various genotoxic lesions. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity. They are also recognized by a highly conserved 23-residue Nudix motif (GX5EX7REUXEEXGU, where U = I, L or V). However, DNA glycosylase does not seem to contain this signature motif. DNA glycosylase consists of 2 domains: the N-terminal domain contains the catalytic properties of the enzyme and the C-terminal domain affects substrate (oxoG) binding and enzymatic turnover. The C-terminal domain is highly similar to MutT, based on secondary structure and topology, despite low sequence identity. MutT sanitizes the nucleotide precursor pool by hydrolyzing oxo-dGTP to oxo-dGMP and inorganic pyrophosphate. The similarity strongly suggests that the two proteins share a common evolutionary origin.	118
239524	cd03440	hot_dog	The hotdog fold was initially identified in the E. coli FabA (beta-hydroxydecanoyl-acyl carrier protein (ACP)-dehydratase) structure and subsequently in 4HBT (4-hydroxybenzoyl-CoA thioesterase) from Pseudomonas. A number of other seemingly unrelated proteins also share the hotdog fold.  These proteins have related, but distinct, catalytic activities that include metabolic roles such as thioester hydrolysis in fatty acid metabolism, and degradation of phenylacetic acid and the environmental pollutant 4-chlorobenzoate.  This superfamily also includes the PaaI-like protein FapR, a non-catalytic bacterial homolog involved in transcriptional regulation of fatty acid biosynthesis.	100
239525	cd03441	R_hydratase_like	(R)-hydratase [(R)-specific enoyl-CoA hydratase].  Catalyzes the hydration of trans-2-enoyl CoA to (R)-3-hydroxyacyl-CoA as part of the PHA (polyhydroxyalkanoate) biosynthetic pathway.  The structure of the monomer includes a five-strand antiparallel beta-sheet wrapped around a central alpha helix, referred to as a hot dog fold.  The active site lies within a substrate-binding tunnel formed by the homodimer.  Other enzymes with this fold include MaoC dehydratase, Hydratase-Dehydrogenase-Epimerase protein (HDE),  and the fatty acid synthase beta subunit.	127
239526	cd03442	BFIT_BACH	Brown fat-inducible thioesterase (BFIT). Brain acyl-CoA hydrolase (BACH). These enzymes deacylate long-chain fatty acids by hydrolyzing acyl-CoA thioesters to free fatty acids and CoA-SH. Eukaryotic members of this family are expressed in brain, testis, and brown adipose tissues. The archaeal and eukaryotic members of this family have two tandem copies of the conserved hot dog fold, while most bacterial members have only one copy.	123
239527	cd03443	PaaI_thioesterase	PaaI_thioesterase is a tetrameric acyl-CoA thioesterase with a hot dog fold and one of several proteins responsible for phenylacetic acid (PA) degradation in bacteria.  Although orthologs of PaaI exist in archaea and eukaryotes, their function has not been determined. Sequence similarity between PaaI, E. coli medium chain acyl-CoA thioesterase II, and human thioesterase III suggests they all belong to the same thioesterase superfamily. The conserved fold present in these thioesterases is referred to as an asymmetric hot dog fold, similar to those of 4-hydroxybenzoyl-CoA thioesterase (4HBT) and the beta-hydroxydecanoyl-ACP dehydratases (FabA/FabZ).	113
239528	cd03444	Thioesterase_II_repeat1	Thioesterase II (TEII) is thought to regenerate misprimed nonribosomal peptide synthetases (NRPSs) as well as modular polyketide synthases (PKSs) by hydrolyzing acetyl groups bound to the peptidyl carrier protein (PCP) and acyl carrier protein (ACP) domains, respectively. TEII has two tandem asymmetric hot dog folds that are structurally similar to one found in PaaI thioesterase, 4-hydroxybenzoyl-CoA thioesterase (4HBT) and beta-hydroxydecanoyl-ACP dehydratase and thus, the TEII monomer is equivalent to the homodimeric form of the latter three enzymes. Human TEII is expressed in T cells and has been shown to bind the product of the HIV-1 Nef gene.	104
239529	cd03445	Thioesterase_II_repeat2	Thioesterase II (TEII) is thought to regenerate misprimed nonribosomal peptide synthetases (NRPSs) as well as modular polyketide synthases (PKSs) by hydrolyzing acetyl groups bound to the peptidyl carrier protein (PCP) and acyl carrier protein (ACP) domains, respectively. TEII has two tandem asymmetric hot dog folds that are structurally similar to one found in PaaI thioesterase, 4-hydroxybenzoyl-CoA thioesterase (4HBT) and beta-hydroxydecanoyl-ACP dehydratase and thus, the TEII monomer is equivalent to the homodimeric form of the latter three enzymes. Human TEII is expressed in T cells and has been shown to bind the product of the HIV-1 Nef gene.	94
239530	cd03446	MaoC_like	MaoC_like. Similar to the MaoC (monoamine oxidase C) dehydratase regulatory protein but without the N-terminal PutA domain. This protein family has a hot-dog fold similar to that of (R)-specific enoyl-CoA hydratase, the peroxisomal Hydratase-Dehydrogenase-Epimerase (HDE) protein, and the fatty acid synthase beta subunit.	140
239531	cd03447	FAS_MaoC	FAS_MaoC, the MaoC-like hot dog fold of the fatty acid synthase, beta subunit. Other enzymes with this fold include MaoC dehydratase, Hydratase-Dehydrogenase-Epimerase protein (HDE), and 17-beta-hydroxysteroid dehydrogenase (HSD).	126
239532	cd03448	HDE_HSD	HDE_HSD. The R-hydratase-like hot dog fold of the 17-beta-hydroxysteroid dehydrogenase (HSD) and Hydratase-Dehydrogenase-Epimerase (HDE) proteins. Other enzymes with this fold include MaoC dehydratase and the fatty acid synthase beta subunit.	122
239533	cd03449	R_hydratase	(R)-hydratase [(R)-specific enoyl-CoA hydratase] catalyzes the hydration of trans-2-enoyl CoA to (R)-3-hydroxyacyl-CoA as part of the PHA (polyhydroxyalkanoate) biosynthetic pathway.  (R)-hydratase contains a hot-dog fold similar to those of thioesterase II, and beta-hydroxydecanoyl-ACP dehydratase, MaoC dehydratase, Hydratase-Dehydrogenase-Epimerase protein (HDE), and the fatty acid synthase beta subunit.  The active site lies within a substrate-binding tunnel formed by the (R)-hydratase homodimer.  A subset of the bacterial (R)-hydratases contain a C-terminal phosphotransacetylase (PTA) domain.	128
239534	cd03450	NodN	NodN (nodulation factor N) contains a single hot dog fold similar to those of the peroxisomal Hydratase-Dehydrogenase-Epimerase (HDE) protein, and the fatty acid synthase beta subunit.  Rhizobium and related species form nodules on the roots of their legume hosts, a symbiotic process that requires production of Nod factors, which are signal molecules involved in root hair deformation and meristematic cell division.  The nodulation gene products, including NodN, are involved in producing the Nod factors, however the role played by NodN is unclear.	149
239535	cd03451	FkbR2	FkbR2 is a Streptomyces hygroscopicus protein with a hot dog fold that belongs to a conserved family of proteins found in prokaryotes and archaea but not in eukaryotes. FkbR2  has sequence similarity to (R)-specific enoyl-CoA hydratase, the peroxisomal Hydratase-Dehydrogenase-Epimerase (HDE) protein, and the fatty acid synthase beta subunit.  The function of FkbR2 is unknown.	146
239536	cd03452	MaoC_C	MaoC_C  The C-terminal hot dog fold of the MaoC (monoamine oxidase C) dehydratase regulatory protein. Orthologs of MaoC include PaaZ [Escherichia coli] and PaaN [Pseudomonas putida], which are putative ring-opening enzymes involved in phenylacetic acid degradation. The C-terminal domain of MaoC has sequence similarity to (R)-specific enoyl-CoA hydratase, Hydratase-Dehydrogenase-Epimerase (HDE) protein, and the fatty acid synthase beta subunit.  MaoC also has an N-terminal PutA domain like that found in the E. coli PutA proline dehydrogenase and other members of the aldehyde dehydrogenase family.	142
239537	cd03453	SAV4209_like	SAV4209_like.  Similar in sequence to the Streptomyces avermitilis SAV4209 protein, with a hot dog fold that is similar to those of (R)-specific enoyl-CoA hydratase, the peroxisomal Hydratase-Dehydrogenase-Epimerase (HDE) protein, and the fatty acid synthase beta subunit.	127
239538	cd03454	YdeM	YdeM is a Bacillus subtilis protein that belongs to a family of prokaryotic proteins of unknown function.  YdeM has sequence similarity to the hot-dog fold of (R)-specific enoyl-CoA hydratase.   Other enzymes with this fold include the peroxisomal Hydratase-Dehydrogenase-Epimerase (HDE) protein, and the fatty acid synthase beta subunit.	140
239539	cd03455	SAV4209	SAV4209 is a Streptomyces avermitilis protein with a hot dog fold that is similar to those of (R)-specific enoyl-CoA hydratase, the peroxisomal Hydratase-Dehydrogenase-Epimerase (HDE) protein, and the fatty acid synthase beta subunit.  The alpha- and gamma-proteobacterial members of this CD have, in addition to a hot dog fold, an N-terminal extension.	123
239540	cd03457	intradiol_dioxygenase_like	Intradiol dioxygenase subgroup. Intradiol dioxygenases catalyze the critical ring-cleavage step in the conversion of catecholate derivatives to citric acid cycle intermediates. They break the catechol C1-C2 bond and utilize Fe3+, as opposed to the extradiol-cleaving enzymes which break the C2-C3 or C1-C6 bond and utilize Fe2+ and Mn+. The family contains catechol 1,2-dioxygenases and protocatechuate 3,4-dioxygenases. The specific function of this subgroup is unknown.	188
239541	cd03458	Catechol_intradiol_dioxygenases	Catechol intradiol dioxygenases can be divided into several subgroups according to their substrate specificity for catechol, chlorocatechols and hydroxyquinols. Almost all members of this family are homodimers containing one ferric ion (Fe3+) per monomer. They belong to the intradiol dioxygenase family, a family of mononuclear non-heme iron intradiol-cleaving enzymes that catalyze the oxygenation of catecholates to aliphatic acids via the cleavage of aromatic rings.	256
239542	cd03459	3,4-PCD	Protocatechuate 3,4-dioxygenase (3,4-PCD) catalyzes the oxidative ring cleavage of 3,4-dihydroxybenzoate to produce beta-carboxy-cis,cis-muconate. 3,4-PCDs are large aggregates of 12 protomers, each composed of an alpha- and beta-subunit and an Fe3+ ion bound in the beta-subunit at the alpha-beta-subunit interface. 3,4-PCD is a member of the aromatic dioxygenases which are non-heme iron intradiol-cleaving enzymes that break the C1-C2 bond and utilize Fe3+.	158
239543	cd03460	1,2-CTD	Catechol 1,2 dioxygenase (1,2-CTD) catalyzes an intradiol cleavage reaction of catechol to form cis,cis-muconate. 1,2-CTDs are homodimers with one catalytic non-heme ferric ion per monomer. They belong to the aromatic dioxygenase family, a family of mononuclear non-heme iron intradiol-cleaving enzymes that catalyze the oxygenation of catecholates to aliphatic acids via the cleavage of aromatic rings.	282
239544	cd03461	1,2-HQD	Hydroxyquinol 1,2-dioxygenase (1,2-HQD) catalyzes the ring cleavage of hydroxyquinol (1,2,4-trihydroxybenzene), an intermediate in the degradation of a large variety of aromatic compounds including some polychloro- and nitroaromatic pollutants, to form 3-hydroxy-cis,cis-muconates. 1,2-HQD belongs to the aromatic dioxygenase family, a family of mononuclear non-heme intradiol-cleaving enzymes.	277
239545	cd03462	1,2-CCD	Chlorocatechol 1,2-dioxygenases (1,2-CCDs) (type II enzymes) are homodimeric intradiol dioxygenases that degrade chlorocatechols via the addition of molecular oxygen and the subsequent cleavage between two adjacent hydroxyl groups. This reaction is part of the modified ortho-cleavage pathway which is a central oxidative bacterial pathway that channels chlorocatechols, derived from the degradation of chlorinated benzoic acids, phenoxyacetic acids, phenols, benzenes, and other aromatics into the energy-generating tricarboxylic acid pathway.	247
239546	cd03463	3,4-PCD_alpha	Protocatechuate 3,4-dioxygenase (3,4-PCD) , alpha subunit. 3,4-PCD catalyzes the oxidative ring cleavage of 3,4-dihydroxybenzoate to produce beta-carboxy-cis,cis-muconate. 3,4-PCDs are large aggregates of 12 protomers, each composed of an alpha- and beta-subunit and an Fe3+ ion bound in the beta-subunit at the alpha-subunit-beta-subunit interface. 3,4-PCD is a member of the aromatic dioxygenases which are non-heme iron intradiol-cleaving enzymes that break the C1-C2 bond and utilize Fe3+.	185
239547	cd03464	3,4-PCD_beta	Protocatechuate 3,4-dioxygenase (3,4-PCD) , beta subunit. 3,4-PCD catalyzes the oxidative ring cleavage of 3,4-dihydroxybenzoate to produce beta-carboxy-cis,cis-muconate. 3,4-PCDs are large aggregates of 12 protomers, each composed of an alpha- and beta-subunit and an Fe3+ ion bound in the beta-subunit at the alpha-subunit-beta-subunit interface. 3,4-PCD is a member of the aromatic dioxygenases which are non-heme iron intradiol-cleaving enzymes that break the C1-C2 bond and utilize Fe3+.	220
239548	cd03465	URO-D_like	The URO-D_like protein superfamily includes bacterial and eukaryotic uroporphyrinogen decarboxylases (URO-D), coenzyme M methyltransferases and other putative bacterial methyltransferases. Uroporphyrinogen decarboxylase (URO-D) decarboxylates the four acetate side chains of uroporphyrinogen III (uro-III) to create coproporphyrinogen III, an important branching point of the tetrapyrrole biosynthetic pathway. The methyltransferases represented here are important for the ability of methanogenic organisms to use compounds other than carbon dioxide for reduction to methane.	330
239549	cd03466	Nitrogenase_NifN_2	Nitrogenase_nifN_2: A subgroup of the NifN subunit of the NifEN complex: NifN forms an alpha2beta2 tetramer with NifE.  NifN and nifE are structurally homologous to nitrogenase MoFe protein beta and alpha subunits respectively.  NifEN participates in the synthesis of the iron-molybdenum cofactor (FeMoco) of the MoFe protein.  NifB-co (an iron and sulfur containing precursor of the FeMoco) from NifB is transferred to the NifEN complex where it is further processed to FeMoco. The nifEN bound precursor of FeMoco has been identified as a molybdenum-free, iron- and sulfur- containing analog of FeMoco. It has been suggested that this nifEN bound precursor also acts as a cofactor precursor in nitrogenase systems which require a cofactor other than FeMoco: i.e. iron-vanadium cofactor (FeVco) or iron only cofactor (FeFeco). This group also contains the Clostridium fused NifN-NifB protein.	429
239550	cd03467	Rieske	Rieske domain; a [2Fe-2S] cluster binding domain commonly found in Rieske non-heme iron oxygenase (RO) systems such as naphthalene and biphenyl dioxygenases, as well as in plant/cyanobacterial chloroplast b6f and mitochondrial cytochrome bc(1) complexes. The Rieske domain can be divided into two subdomains, with an incomplete six-stranded, antiparallel beta-barrel at one end, and an iron-sulfur cluster binding subdomain at the other. The Rieske iron-sulfur center contains a [2Fe-2S] cluster, which is involved in electron transfer, and is liganded to two histidine and two cysteine residues present in conserved sequences called Rieske motifs. In RO systems, the N-terminal Rieske domain of the alpha subunit acts as an electron shuttle that accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron in the alpha subunit C-terminal domain to be used for catalysis.	98
176458	cd03468	PolY_like	DNA Polymerase Y-family. Y-family DNA polymerases are a specialized subset of polymerases that facilitate translesion synthesis (TLS), a process that allows the bypass of a variety of DNA lesions.  Unlike replicative polymerases, TLS polymerases lack proofreading activity and have low fidelity and low processivity.  They use damaged DNA as templates and insert nucleotides opposite the lesions. The active sites of TLS polymerases are large and flexible to allow the accommodation of distorted bases.  Expression of Y-family polymerases is often induced by DNA damage and is believed to be highly regulated. TLS is likely induced by the monoubiquitination of the replication clamp PCNA, which provides a scaffold for TLS polymerases to bind in order to access the lesion.  Because of their high error rates, TLS polymerases are potential targets for cancer treatment and prevention.	335
239551	cd03469	Rieske_RO_Alpha_N	Rieske non-heme iron oxygenase (RO) family, N-terminal Rieske domain of the oxygenase alpha subunit; The RO family comprise a large class of aromatic ring-hydroxylating dioxygenases found predominantly in microorganisms. These enzymes enable microorganisms to tolerate and even exclusively utilize aromatic compounds for growth. ROs consist of two or three components: reductase, oxygenase, and ferredoxin (in some cases) components. The oxygenase component may contain alpha and beta subunits, with the beta subunit having a purely structural function. Some oxygenase components contain only an alpha subunit. The oxygenase alpha subunit has two domains, an N-terminal Rieske domain with an [2Fe-2S] cluster and a C-terminal catalytic domain with a mononuclear Fe(II) binding site. The Rieske [2Fe-2S] cluster accepts electrons from the reductase or ferredoxin component and transfers them to the mononuclear iron for catalysis. Reduced pyridine nucleotide is used as the initial source of two electrons for dioxygen activation.	118
239552	cd03470	Rieske_cytochrome_bc1	Iron-sulfur protein (ISP) component of the bc(1) complex family, Rieske domain; The Rieske domain is a [2Fe-2S] cluster binding domain involved in electron transfer. The bc(1) complex is a multisubunit enzyme found in many different organisms including uni- and multi-cellular eukaryotes, plants (in their mitochondria) and bacteria. The cytochrome bc(1) and b6f complexes are central components of the respiratory and photosynthetic electron transport chains, respectively, which carry out similar core electron and proton transfer steps. The bc(1) and b6f complexes share a common core structure of three catalytic subunits: cyt b, the Rieske ISP, and either a cyt c1 in the bc(1) complex or cyt f in the b6f complex, which are arranged in an integral membrane-bound dimeric complex. While the core of the b6f complex is similar to that of the bc(1) complex, the domain arrangement outside the core and the complement of prosthetic groups are strikingly different.	126
239553	cd03471	Rieske_cytochrome_b6f	Iron-sulfur protein (ISP) component of the b6f complex family, Rieske domain; The Rieske domain is a [2Fe-2S] cluster binding domain involved in electron transfer. The cytochrome b6f complex from Mastigocladus laminosus, a thermophilic cyanobacterium, contains four large subunits, including cytochrome f, cytochrome b6, the Rieske ISP, and subunit IV; as well as four small hydrophobic subunits, PetG, PetL, PetM, and PetN. Rieske ISP, one of the large subunits of the cytochrome bc-type complexes, is involved in respiratory and photosynthetic electron transfer. The core of the chloroplast b6f complex is similar to the analogous respiratory cytochrome bc(1) complex, but the domain arrangement outside the core and the complement of prosthetic groups are strikingly different.	126
239554	cd03472	Rieske_RO_Alpha_BPDO_like	Rieske non-heme iron oxygenase (RO) family, Biphenyl dioxygenase (BPDO)-like subfamily, N-terminal Rieske domain of the oxygenase alpha subunit; composed of the oxygenase alpha subunits of BPDO and similar proteins including cumene dioxygenase (CumDO), nitrobenzene dioxygenase (NBDO), alkylbenzene dioxygenase (AkbDO) and dibenzofuran 4,4a-dioxygenase (DFDO). ROs comprise a large class of aromatic ring-hydroxylating dioxygenases that enable microorganisms to tolerate and utilize aromatic compounds for growth. The oxygenase alpha subunit contains an N-terminal Rieske domain with an [2Fe-2S] cluster and a C-terminal catalytic domain with a mononuclear Fe(II) binding site. The Rieske [2Fe-2S] cluster accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron for catalysis. BPDO degrades biphenyls and polychlorinated biphenyls (PCB's) while CumDO degrades cumene (isopropylbenzene), an aromatic hydrocarbon that is intermediate in size between ethylbenzene and biphenyl. NBDO catalyzes the initial reaction in nitrobenzene degradation, oxidizing the aromatic rings of mono- and dinitrotoluenes to form catechol and nitrite. NBDO belongs to the naphthalene subfamily of ROs. AkbDO is involved in alkylbenzene catabolism, converting o-xylene to 2,3- and 3,4-dimethylphenol and ethylbenzene to cis-dihydrodiol. DFDO is involved in dibenzofuran degradation.	128
239555	cd03473	Rieske_CMP_Neu5Ac_hydrolase_N	Cytidine monophosphate-N-acetylneuraminic acid (CMP Neu5Ac) hydroxylase family, N-terminal Rieske domain; The Rieske domain is a [2Fe-2S] cluster binding domain involved in electron transfer. CMP Neu5Ac hydroxylase is the key enzyme for the synthesis of N-glycolylneuraminic acid (NeuGc) from N-acetylneuraminic acid (Neu5Ac), NeuGc and Neu5Ac are members of a family of cell surface sugars called sialic acids. All mammals except humans have both NeuGc variants on their cell surfaces. In humans, the gene encoding CMP Neu5Ac hydroxylase has a mutation within its coding region that abolishes NeuGc production.	107
239556	cd03474	Rieske_T4moC	Toluene-4-monooxygenase effector protein complex (T4mo), Rieske ferredoxin subunit; The Rieske domain is a [2Fe-2S] cluster binding domain involved in electron transfer. T4mo is a four-protein complex that catalyzes the NADH- and O2-dependent hydroxylation of toluene to form p-cresol. T4mo consists of an NADH oxidoreductase (T4moF), a diiron hydroxylase (T4moH), a catalytic effector protein (T4moD), and a Rieske ferredoxin (T4moC). T4moC contains a Rieske domain and functions as an obligate electron carrier between T4moF and T4moH. Rieske ferredoxins are found as subunits of membrane oxidase complexes, cis-dihydrodiol-forming aromatic dioxygenases, bacterial assimilatory nitrite reductases, and arsenite oxidase. Rieske ferredoxins are also found as soluble electron carriers in bacterial dioxygenase and monooxygenase complexes.	108
239557	cd03475	Rieske_SoxF_SoxL	SoxF and SoxL family, Rieske domain; The Rieske domain is a [2Fe-2S] cluster binding domain involved in electron transfer. SoxF is a subunit of the terminal oxidase supercomplex SoxM in the plasma membrane of Sulfolobus acidocaldarius that combines features of a cytochrome bc(1) complex and a cytochrome. The Rieske domain of SoxF has a 12 residue insertion which is not found in eukaryotic and bacterial Rieske proteins and is thought to influence the redox properties of the iron-sulfur cluster. SoxL is a Rieske protein which may be part of an archaeal bc-complex homologue whose physiological function is still unknown. SoxL has two features not seen in other Rieske proteins; (i) a significantly greater distance between the two cluster-binding sites and  (ii) an unexpected Pro -> Asp substitution at one of the cluster binding sites. SoxF and SoxL are found in archaea and in bacteria.	171
239558	cd03476	Rieske_ArOX_small	Small subunit of Arsenite oxidase (ArOX) family, Rieske domain; ArOX is a molybdenum/iron protein involved in the detoxification of arsenic, oxidizing it to arsenate. It consists of two subunits, a large subunit similar to members of the DMSO reductase family of molybdenum enzymes and a small subunit with a Rieske-type [2Fe-2S] cluster. The large subunit of ArOX contains the molybdenum site at which the oxidation of arsenite occurs. The small subunit contains a domain homologous to the Rieske domains of the cytochrome bc(1) and cytochrome b6f complexes as well as naphthalene 1,2-dioxygenase. The Rieske domain is a [2Fe-2S] cluster binding domain involved in electron transfer.	126
239559	cd03477	Rieske_YhfW_C	YhfW family, C-terminal Rieske domain; YhfW is a protein of unknown function with an N-terminal DadA-like (glycine/D-amino acid dehydrogenase) domain and a C-terminal Rieske domain. The Rieske domain is a [2Fe-2S] cluster binding domain involved in electron transfer. It is commonly found in Rieske non-heme iron oxygenase (RO) systems such as naphthalene and biphenyl dioxygenases, as well as in plant/cyanobacterial chloroplast b6f and mitochondrial cytochrome bc(1) complexes. YhfW is found in bacteria, some eukaryotes and archaea.	91
239560	cd03478	Rieske_AIFL_N	AIFL (apoptosis-inducing factor like) family, N-terminal Rieske domain; members of this family show similarity to human AIFL, containing an N-terminal Rieske domain and a C-terminal pyridine nucleotide-disulfide oxidoreductase domain (Pyr_redox). The Rieske domain is a [2Fe-2S] cluster binding domain involved in electron transfer. AIFL shares 35% homology with human AIF (apoptosis-inducing factor), mainly in the Pyr_redox domain. AIFL is predominantly localized to the mitochondria. AIFL induces apoptosis in a caspase-dependent manner.	95
239561	cd03479	Rieske_RO_Alpha_PhDO_like	Rieske non-heme iron oxygenase (RO) family, Phthalate 4,5-dioxygenase (PhDO)-like subfamily, N-terminal Rieske domain of the oxygenase alpha subunit; composed of the oxygenase alpha subunits of PhDO and similar proteins including 3-chlorobenzoate 3,4-dioxygenase (CBDO), phenoxybenzoate dioxygenase (POB-dioxygenase) and 3-nitrobenzoate oxygenase (MnbA). ROs comprise a large class of aromatic ring-hydroxylating dioxygenases that enable microorganisms to tolerate and utilize aromatic compounds for growth. The oxygenase alpha subunit contains an N-terminal Rieske domain with an [2Fe-2S] cluster and a C-terminal catalytic domain with a mononuclear Fe(II) binding site. The Rieske [2Fe-2S] cluster accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron for catalysis. PhDO and CBDO are two-component RO systems, containing oxygenase and reductase components. PhDO catalyzes the dihydroxylation of phthalate to form the 4,5-dihydro-cis-dihydrodiol of phthalate (DHD). CBDO, together with CbaC dehydrogenase, converts the environmental pollutant 3CBA to protocatechuate (PCA) and 5-Cl-PCA, which are then metabolized by the chromosomal PCA meta (extradiol) ring fission pathway. POB-dioxygenase catalyzes the initial catabolic step in the angular dioxygenation of phenoxybenzoate, converting mono- and dichlorinated phenoxybenzoates to protocatechuate and chlorophenols. These phenoxybenzoates are metabolic products formed during the degradation of pyrethroid insecticides.	144
239562	cd03480	Rieske_RO_Alpha_PaO	Rieske non-heme iron oxygenase (RO) family, Pheophorbide a oxygenase (PaO) subfamily, N-terminal Rieske domain of the oxygenase alpha subunit; composed of the oxygenase alpha subunits of a small subfamily of enzymes found in plants as well as oxygenic cyanobacterial photosynthesizers including LLS1 (lethal leaf spot 1, also known as PaO) and ACD1 (accelerated cell death 1). ROs comprise a large class of aromatic ring-hydroxylating dioxygenases that enable microorganisms to tolerate and utilize aromatic compounds for growth. The oxygenase alpha subunit contains an N-terminal Rieske domain with an [2Fe-2S] cluster and a C-terminal catalytic domain with a mononuclear Fe(II) binding site. The Rieske [2Fe-2S] cluster accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron for catalysis. PaO expression increases upon physical wounding of plant leaves and is thought to catalyze a key step in chlorophyll degradation. The Arabidopsis-accelerated cell death gene ACD1 is involved in oxygenation of PaO.	138
239563	cd03481	TopoIIA_Trans_ScTopoIIA	TopoIIA_Trans_ScTopoIIA: Transducer domain, having a ribosomal S5 domain 2-like fold, of the type found in proteins of the type IIA family of DNA topoisomerases similar to Saccharomyces cerevisiae Topo IIA.  S. cerevisiae Topo IIA is a homodimer encoded by a single gene. The type IIA enzymes are the predominant form of topoisomerase and are found in some bacteriophages, viruses and archaea, and in all bacteria and eukaryotes.  All type IIA topoisomerases are related to each other at amino acid sequence level, though their oligomeric organization sometimes differs. TopoIIA enzymes cut both strands of the duplex DNA to remove (relax) both positive and negative supercoils in DNA.  These enzymes covalently attach to the 5' ends of the cut DNA, separate the free ends of the cleaved strands, pass another region of the duplex through this gap, then rejoin the ends. TopoIIA enzymes also catenate/ decatenate duplex rings. This transducer domain is homologous to the second domain of the DNA gyrase B subunit, which is known to be important in nucleotide hydrolysis and the transduction of structural signals from ATP-binding site to the DNA breakage/reunion regions of the enzymes.	153
239564	cd03482	MutL_Trans_MutL	MutL_Trans_MutL: transducer domain, having a ribosomal S5 domain 2-like fold, found in proteins similar to Escherichia coli MutL.  EcMutL belongs to the DNA mismatch repair (MutL/MLH1/PMS2) family.  This transducer domain is homologous to the second domain of the DNA gyrase B subunit, which is known to be important in nucleotide hydrolysis and the transduction of structural signals from the ATP-binding site to the DNA breakage/reunion regions of the enzymes.  It has been suggested that during initiation of DNA mismatch repair in E. coli, the mismatch recognition protein MutS recruits MutL in the presence of ATP.  The MutS(ATP)-MutL ternary complex formed, then recruits the latent endonuclease MutH. Prokaryotic MutS and MutL are homodimers.	123
239565	cd03483	MutL_Trans_MLH1	MutL_Trans_MLH1: transducer domain, having a ribosomal S5 domain 2-like fold, found in proteins similar to yeast and human MLH1 (MutL homologue 1). This transducer domain is homologous to the second domain of the DNA gyrase B subunit, which is known to be important in nucleotide hydrolysis and the transduction of structural signals from ATP-binding site to the DNA breakage/reunion regions of the enzymes. MLH1 forms heterodimers with PMS2, PMS1 and MLH3. These three complexes have distinct functions in meiosis. hMLH1-hPMS2 also participates in the repair of all DNA mismatch repair (MMR) substrates. Roles for hMLH1-hPMS1 or hMLH1-hMLH3 in MMR have not been established. Cells lacking hMLH1 have a strong mutator phenotype and display microsatellite instability (MSI). Mutation in hMLH1 causes predisposition to HNPCC, Muir-Torre syndrome and Turcot syndrome (HNPCC variant). Mutation in hMLH1 accounts for a large fraction of HNPCC families.	127
239566	cd03484	MutL_Trans_hPMS_2_like	MutL_Trans_hPMS2_like: transducer domain, having a ribosomal S5 domain 2-like fold, found in proteins similar to human PMS2 (hPMS2). hPMS2 belongs to the DNA mismatch repair (MutL/MLH1/PMS2) family.  This transducer domain is homologous to the second domain of the DNA gyrase B subunit, which is known to be important in nucleotide hydrolysis and the transduction of structural signals from ATP-binding site to the DNA breakage/reunion regions of the enzymes. Included in this group are proteins similar to yeast PMS1. The yeast MLH1-PMS1 and the human MLH1-PMS2 heterodimers play a role in meiosis. hMLH1-hPMS2 also participates in the repair of all DNA mismatch repair (MMR) substrates. Cells lacking hPMS2 have a strong mutator phenotype and display microsatellite instability (MSI). Mutation in hPMS2 causes predisposition to HNPCC and Turcot syndrome.	142
239567	cd03485	MutL_Trans_hPMS_1_like	MutL_Trans_hPMS1_like: transducer domain, having a ribosomal S5 domain 2-like fold, found in proteins similar to human PMS1 (hPMS1) and yeast MLH2. hPMS1 and yMLH2 are members of the DNA mismatch repair (MutL/MLH1/PMS2) family.  This transducer domain is homologous to the second domain of the DNA gyrase B subunit, which is known to be important in nucleotide hydrolysis and the transduction of structural signals from ATP-binding site to the DNA breakage/reunion regions of the enzymes. PMS1 forms a heterodimer with MLH1. The MLH1-PMS1 complex functions in meiosis. Loss of yMLH2 results in a small but significant decrease in spore viability and a significant increase in gene conversion frequencies.  A role for hMLH1-hPMS1 in DNA mismatch repair has not been established. Mutation in hMLH1 accounts for a large fraction of Lynch syndrome (HNPCC) families, however there is no convincing evidence to support hPMS1 having a role in HNPCC predisposition.	132
239568	cd03486	MutL_Trans_MLH3	MutL_Trans_MLH3: transducer domain, having a ribosomal S5 domain 2-like fold, found in proteins similar to yeast and human MLH3 (MutL homologue 3). MLH3 belongs to the DNA mismatch repair (MutL/MLH1/PMS2) family. This transducer domain is homologous to the second domain of the DNA gyrase B subunit, which is known to be important in nucleotide hydrolysis and the transduction of structural signals from ATP-binding site to the DNA breakage/reunion regions of the enzymes. MLH1 forms heterodimers with MLH3. The MLH1-MLH3 complex plays a role in meiosis. A role for hMLH1-hMLH3 in DNA mismatch repair (MMR) has not been established. It has been suggested that hMLH3 may be a low risk gene for colorectal cancer; however there is little evidence to support it having a role in classical HNPCC.	141
239569	cd03487	RT_Bac_retron_II	RT_Bac_retron_II: Reverse transcriptases (RTs) in bacterial retrotransposons or retrons. The polymerase reaction of this enzyme leads to the production of a unique RNA-DNA complex called msDNA (multicopy single-stranded (ss)DNA) in which a small ssDNA branches out from a small ssRNA molecule via a 2'-5'phosphodiester linkage. Bacterial retron RTs produce cDNA corresponding to only a small portion of the retron genome.	214
239570	cd03488	Topoisomer_IB_N_htopoI_like	Topoisomer_IB_N_htopoI_like : N-terminal DNA binding fragment found in eukaryotic DNA topoisomerase (topo) IB proteins similar to the monomeric yeast and human topo I.  Topo I enzymes are divided into:  topo type IA (bacterial) and type IB (eukaryotic). Topo I relaxes superhelical tension in duplex DNA by creating a single-strand nick, the broken strand can then rotate around the unbroken strand to remove DNA supercoils and, the nick is religated, liberating topo I. These enzymes regulate the topological changes that accompany DNA replication, transcription and other nuclear processes.  Human topo I is the target of a diverse set of anticancer drugs including camptothecins (CPTs). CPTs bind to the topo I-DNA complex and inhibit religation of the single-strand nick, resulting in the accumulation of topo I-DNA adducts.  This family may represent more than one structural domain.	215
239571	cd03489	Topoisomer_IB_N_LdtopoI_like	Topoisomer_IB_N_LdtopoI_like: N-terminal DNA binding fragment found in eukaryotic DNA topoisomerase (topo) IB proteins similar to the heterodimeric topo I from Leishmania donovani. Topo I enzymes are divided into:  topo type IA (bacterial) and type IB (eukaryotic). Topo I relaxes superhelical tension in duplex DNA by creating a single-strand nick, the broken strand can then rotate around the unbroken strand to remove DNA supercoils and, the nick is religated, liberating topo I. These enzymes regulate the topological changes that accompany DNA replication, transcription and other nuclear processes.  Human topo I is the target of a diverse set of anticancer drugs including camptothecins (CPTs). CPTs bind to the topo I-DNA complex and inhibit re-ligation of the single-strand nick, resulting in the accumulation of topo I-DNA adducts. In addition to differences in structure and some biochemical properties, Trypanosomatid parasite topo I differ from human topo I in their sensitivity to CPTs and other classical topo I inhibitors. Trypanosomatid topo I play putative roles in organizing the kinetoplast DNA network unique to these parasites.  This family may represent more than one structural domain.	212
239572	cd03490	Topoisomer_IB_N_1	Topoisomer_IB_N_1: A subgroup of the N-terminal DNA binding fragment found in eukaryotic DNA topoisomerase (topo) IB. Topo IB proteins include the monomeric yeast and human topo I and heterodimeric topo I from Leishmania donovani. Topo I enzymes are divided into:  topo type IA (bacterial) and type IB (eukaryotic). Topo I relaxes superhelical tension in duplex DNA by creating a single-strand nick, the broken strand can then rotate around the unbroken strand to remove DNA supercoils and, the nick is religated, liberating topo I. These enzymes regulate the topological changes that accompany DNA replication, transcription and other nuclear processes.  Human topo I is the target of a diverse set of anticancer drugs including camptothecins (CPTs). CPTs bind to the topo I-DNA complex and inhibit religation of the single-strand nick, resulting in the accumulation of topo I-DNA adducts.  In addition to differences in structure and some biochemical properties, Trypanosomatid parasite topos I differ from human topo I in their sensitivity to CPTs and other classical topo I inhibitors. Trypanosomatid topos I have putative roles in organizing the kinetoplast DNA network unique to these parasites.  This family may represent more than one structural domain.	217
239573	cd03493	SQR_QFR_TM	Succinate:quinone oxidoreductase (SQR) and Quinol:fumarate reductase (QFR) family, transmembrane subunits; SQR catalyzes the oxidation of succinate to fumarate coupled to the reduction of quinone to quinol, while QFR catalyzes the reverse reaction. SQR, also called succinate dehydrogenase or Complex II, is part of the citric acid cycle and the aerobic respiratory chain, while QFR is involved in anaerobic respiration with fumarate as the terminal electron acceptor. SQRs may reduce either high or low potential quinones while QFRs oxidize only low potential quinols. SQR and QFR share a common subunit arrangement, composed of a flavoprotein catalytic subunit, an iron-sulfur protein and one or two hydrophobic transmembrane subunits. The structural arrangement allows efficient electron transfer between the catalytic subunit, through iron-sulfur centers, and the transmembrane subunit(s) containing the electron donor/acceptor (quinol or quinone). The reversible reduction of quinone is an essential feature of respiration, allowing the transfer of electrons between respiratory complexes. SQRs and QFRs can be classified into five types (A-E) according to the number of their hydrophobic subunits and heme groups. This classification is consistent with the characteristics and phylogeny of the catalytic and iron-sulfur subunits. Type E proteins, e.g. non-classical archaeal SQRs, contain atypical transmembrane subunits and are not included in this hierarchy. The heme and quinone binding sites reside in the transmembrane subunits. Although succinate oxidation and fumarate reduction are carried out by separate enzymes in most organisms, some bifunctional enzymes that exhibit both SQR and QFR activities exist.	98
239574	cd03494	SQR_TypeC_SdhD	Succinate:quinone oxidoreductase (SQR) Type C subfamily, Succinate dehydrogenase D (SdhD) subunit; SQR catalyzes the oxidation of succinate to fumarate coupled to the reduction of quinone to quinol. E. coli SQR, a member of this subfamily, reduces the high potential quinone, ubiquinone. SQR is also called succinate dehydrogenase or Complex II, and is part of the citric acid cycle and the aerobic respiratory chain.  SQR is composed of a flavoprotein catalytic subunit, an iron-sulfur protein and one or two hydrophobic transmembrane subunits. Members of this subfamily are classified as Type C SQRs because they contain two transmembrane subunits and one heme group.  SdhD and SdhC are the two transmembrane proteins of bacterial SQRs. They contain heme and quinone binding sites. The two-electron oxidation of succinate in the flavoprotein active site is coupled to the two-electron reduction of quinone in the membrane anchor subunits via electron transport through FAD and three iron-sulfur centers. The reversible reduction of quinone is an essential feature of respiration, allowing transfer of electrons between respiratory complexes.	99
239575	cd03495	SQR_TypeC_SdhD_like	Succinate:quinone oxidoreductase (SQR) Type C subfamily, Succinate dehydrogenase D (SdhD) subunit-like; composed of predominantly uncharacterized bacterial proteins with similarity to the E. coli SdhD subunit. One characterized protein is the respiratory Complex II SdhD subunit of the only eukaryotic member, Reclinomonas americana. SQR catalyzes the oxidation of succinate to fumarate coupled to the reduction of quinone to quinol. It is also called succinate dehydrogenase or Complex II, and is part of the citric acid cycle and the aerobic respiratory chain. SQR is composed of a flavoprotein catalytic subunit, an iron-sulfur protein and one or two hydrophobic transmembrane subunits. E. coli SQR is classified as Type C SQRs because it contains two transmembrane subunits and one heme group. The SdhD and SdhC subunits are membrane anchor subunits containing heme and quinone binding sites. The two-electron oxidation of succinate in the flavoprotein active site is coupled to the two-electron reduction of quinone in the membrane anchor subunits via electron transport through FAD and three iron-sulfur centers. The reversible reduction of quinone is an essential feature of respiration, allowing transfer of electrons between respiratory complexes.	100
239576	cd03496	SQR_TypeC_CybS	SQR catalyzes the oxidation of succinate to fumarate coupled to the reduction of quinone to quinol. Eukaryotic SQRs reduce high potential quinones such as ubiquinone. SQR is also called succinate dehydrogenase or Complex II, and is part of the citric acid cycle and the aerobic respiratory chain.  SQR is composed of a flavoprotein catalytic subunit, an iron-sulfur protein and one or two hydrophobic transmembrane subunits.  Members of this subfamily are classified as Type C SQRs because they contain two transmembrane subunits and one heme group.  CybS and CybL are the two transmembrane proteins of eukaryotic SQRs. They contain heme and quinone binding sites. CybS is the eukaryotic homolog of the bacterial SdhD subunit.  The two-electron oxidation of succinate in the flavoprotein active site is coupled to the two-electron reduction of quinone in the transmembrane subunits via electron transport through FAD and three iron-sulfur centers.  The reversible reduction of quinone is an essential feature of respiration, allowing transfer of electrons between respiratory complexes.  Mutations in human Complex II result in various physiological disorders including hereditary paraganglioma and pheochromocytoma tumors. The gene encoding for the SdhD subunit is classified as a tumor suppressor gene.	104
239577	cd03497	SQR_TypeB_1_TM	Succinate:quinone oxidoreductase (SQR) Type B subfamily 1, transmembrane subunit; composed of proteins similar to Bacillus subtilis SQR. SQR catalyzes the oxidation of succinate to fumarate coupled to the reduction of quinone to quinol. Bacillus subtilis SQR reduces low potential quinones such as menaquinone. SQR is also called succinate dehydrogenase (Sdh) or Complex II and is part of the citric acid cycle and the aerobic respiratory chain. SQR is composed of a flavoprotein catalytic subunit, an iron-sulfur protein and one or two hydrophobic transmembrane subunits. Members of this subfamily are classified as Type B as they contain one transmembrane subunit and two heme groups.  The heme and quinone binding sites reside on the transmembrane subunit. The transmembrane subunit of Bacillus subtilis SQR is also called Sdh cytochrome b558 subunit. The structural arrangement allows efficient electron transfer between the catalytic subunit, through iron-sulfur centers, and the transmembrane subunit containing the electron acceptor (quinone). The reversible reduction of quinone is an essential feature of respiration, allowing transfer of electrons between respiratory complexes.	207
239578	cd03498	SQR_TypeB_2_TM	Succinate:quinone oxidoreductase (SQR)-like Type B subfamily 2, transmembrane subunit; composed of proteins with similarity to the SQRs of Geobacter metallireducens and Corynebacterium glutamicum. SQR catalyzes the oxidation of succinate to fumarate coupled to the reduction of quinone to quinol. C. glutamicum SQR reduces low potential quinones such as menaquinone. SQR is also called succinate dehydrogenase (Sdh) or Complex II and is part of the citric acid cycle and the aerobic respiratory chain. SQR is composed of a flavoprotein catalytic subunit, an iron-sulfur protein and one or two hydrophobic transmembrane subunits.  Members of this subfamily are classified as Type B as they contain one transmembrane subunit and two heme groups. The heme and quinone binding sites reside in the transmembrane subunit. The transmembrane subunit of members of this subfamily is also called Sdh cytochrome b558 subunit based on the Bacillus subtilis protein. The structural arrangement allows efficient electron transfer between the catalytic subunit, through iron-sulfur centers, and the transmembrane subunit containing the electron acceptor (quinone). The reversible reduction of quinone is an essential feature of respiration, allowing transfer of electrons between respiratory complexes. Proteins in this subfamily from G. metallireducens and G. sulfurreducens are bifunctional enzymes with SQR and QFR activities.	209
239579	cd03499	SQR_TypeC_SdhC	Succinate:quinone oxidoreductase (SQR) Type C subfamily, Succinate dehydrogenase C (SdhC) subunit; composed of bacterial SdhC and eukaryotic large cytochrome b binding (CybL) proteins. SQR catalyzes the oxidation of succinate to fumarate coupled to the reduction of quinone to quinol. Members of this family reduce high potential quinones such as ubiquinone. SQR is also called succinate dehydrogenase or Complex II, and is part of the citric acid cycle and the aerobic respiratory chain.  SQR is composed of a flavoprotein catalytic subunit, an iron-sulfur protein and one or two hydrophobic transmembrane subunits. Proteins in this subfamily are classified as Type C SQRs because they contain two transmembrane subunits and one heme group. The heme and quinone binding sites reside in the transmembrane subunits. The SdhC or CybL protein is one of the  two transmembrane subunits of bacterial and eukaryotic SQRs. The two-electron oxidation of succinate in the flavoprotein active site is coupled to the two-electron reduction of quinone in the membrane anchor subunits via electron transport through FAD and three iron-sulfur centers. The reversible reduction of quinone is an essential feature of respiration, allowing transfer of electrons between respiratory complexes.	117
239580	cd03500	SQR_TypeA_SdhD_like	Succinate:quinone oxidoreductase (SQR) Type A subfamily, Succinate dehydrogenase D (SdhD)-like subunit; SQR catalyzes the oxidation of succinate to fumarate coupled to the reduction of quinone to quinol. Members of this subfamily reduce low potential quinones such as menaquinone and thermoplasmaquinone. SQR is also called succinate dehydrogenase or Complex II, and is part of the citric acid cycle and the aerobic respiratory chain. SQR is composed of a flavoprotein catalytic subunit, an iron-sulfur protein and one or two hydrophobic transmembrane subunits. Members of this subfamily are similar to the Thermoplasma acidophilum SQR and are classified as Type A  because they contain two transmembrane subunits as well as two heme groups. Although there are no structures available for this subfamily, the presence of two hemes has been proven spectroscopically for T. acidophilum. The two membrane anchor subunits are similar to the SdhD and SdhC subunits of bacterial SQRs, which contain heme and quinone binding sites. The two-electron oxidation of succinate in the flavoprotein active site is coupled to the two-electron reduction of quinone in the membrane anchor subunits via electron transport through FAD and three iron-sulfur centers. The reversible reduction of quinone is an essential feature of respiration, allowing transfer of electrons between respiratory complexes.	106
239581	cd03501	SQR_TypeA_SdhC_like	Succinate:quinone oxidoreductase (SQR) Type A subfamily, Succinate dehydrogenase C (SdhC)-like subunit; SQR catalyzes the oxidation of succinate to fumarate coupled to the reduction of quinone to quinol. Members of this subfamily reduce low potential quinones such as menaquinone and thermoplasmaquinone.  SQR is also called succinate dehydrogenase or Complex II, and is part of the citric acid cycle and the aerobic respiratory chain. SQR is composed of a flavoprotein catalytic subunit, an iron-sulfur protein and one or two hydrophobic transmembrane subunits. Members of this subfamily are similar to the Thermoplasma acidophilum SQR and are classified as Type A because they contain two transmembrane subunits as well as two heme groups. Although there are no structures available for this subfamily, the presence of two hemes has been proven spectroscopically for T. acidophilum.  The two membrane anchor subunits are similar to the SdhD and SdhC subunits of bacterial SQRs, which contain heme and quinone binding sites. The two-electron oxidation of succinate in the flavoprotein active site is coupled to the two-electron reduction of quinone in the membrane anchor subunits via electron transport through FAD and three iron-sulfur centers. The reversible reduction of quinone is an essential feature of respiration, allowing transfer of electrons between respiratory complexes.	101
239582	cd03505	Delta9-FADS-like	The Delta9 Fatty Acid Desaturase (Delta9-FADS)-like CD includes the delta-9 and delta-11 acyl CoA desaturases found in various eukaryotes including vertebrates, insects, higher plants, and fungi. The delta-9 acyl-lipid desaturases are found in a wide range of bacteria. These enzymes play essential roles in fatty acid metabolism and the regulation of cell membrane fluidity. Acyl-CoA desaturases are the enzymes involved in the CoA-bound desaturation of fatty acids. Mammalian stearoyl-CoA delta-9 desaturase is a key enzyme in the biosynthesis of monounsaturated fatty acids, and in yeast, the delta-9 acyl-CoA desaturase (OLE1) reaction accounts for all de novo unsaturated fatty acid production in Saccharomyces cerevisiae. These non-heme, iron-containing, ER membrane-bound enzymes are part of a three-component enzyme system involving cytochrome b5, cytochrome b5 reductase, and the delta-9 fatty acid desaturase. This complex catalyzes the NADH- and oxygen-dependent insertion of a cis double bond between carbons 9 and 10 of the saturated fatty acyl substrates, palmitoyl (16:0)-CoA or stearoyl (18:0)-CoA, yielding the monoenoic products palmitoleic (16:1) or oleic (18:1) acids, respectively. In cyanobacteria, the biosynthesis of unsaturated fatty acids is initiated by delta 9 acyl-lipid desaturase (DesC) which introduces the first double bond at the delta-9 position of a saturated fatty acid that has been esterified to a glycerolipid. This domain family has extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. Comparison of sequences also reveals the existence of three regions of conserved histidine cluster motifs that contain the residues: HXXXXH, HXXHH, and H/QXXHH. These histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within the rat stearoyl CoA delta-9 desaturase. Some eukaryotic (Fungi, Euglenozoa, Mycetozoa, Rhodophyta) desaturase domains have an adjacent C-terminal cytochrome b5-like domain.	178
239583	cd03506	Delta6-FADS-like	The Delta6 Fatty Acid Desaturase (Delta6-FADS)-like CD includes the integral-membrane enzymes: delta-4, delta-5, delta-6, delta-8, delta-8-sphingolipid, and delta-11 desaturases found in vertebrates, higher plants, fungi, and bacteria. These desaturases are required for the synthesis of highly unsaturated fatty acids (HUFAs), which are mainly esterified into phospholipids and contribute to maintaining membrane fluidity. While HUFAs may be required for cold tolerance in bacteria, plants and fish, the primary role of HUFAs in mammals is cell signaling. These enzymes are described as front-end desaturases because they introduce a double bond between the pre-existing double bond and the carboxyl (front) end of the fatty acid. Various substrates are involved, with both acyl-coenzyme A (CoA) and acyl-lipid desaturases present in this CD. Acyl-lipid desaturases are localized in the membranes of cyanobacterial thylakoid, plant endoplasmic reticulum (ER), and plastid; and acyl-CoA desaturases are present in ER membrane. ER-bound plant acyl-lipid desaturases and acyl-CoA desaturases require cytochrome b5 as an electron donor. Most of the eukaryotic desaturase domains have an adjacent N-terminal cytochrome b5-like domain. This domain family has extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. Comparison of sequences also reveals the existence of three regions of conserved histidine cluster motifs that contain the residues: HXXXH, HXX(X)HH, and Q/HXXHH. These histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within the homolog, stearoyl CoA desaturase.	204
239584	cd03507	Delta12-FADS-like	The Delta12 Fatty Acid Desaturase (Delta12-FADS)-like CD includes the integral-membrane enzymes, delta-12 acyl-lipid desaturases, oleate 12-hydroxylases, omega3 and omega6 fatty acid desaturases, and other related proteins, found in a wide range of organisms including higher plants, green algae, diatoms, nematodes, fungi, and bacteria. The expression of these proteins appears to be temperature dependent: decreases in temperature result in increased levels of fatty acid desaturation within membrane lipids subsequently altering cell membrane fluidity. An important enzyme for the production of polyunsaturates in plants is the oleate delta-12 desaturase (Arabidopsis FAD2) of the endoplasmic reticulum. This enzyme accepts 1-acyl-2-oleoyl-sn-glycero-3-phosphocholine as substrate and requires NADH:cytochrome b oxidoreductase, cytochrome b, and oxygen for activity. FAD2 converts oleate (18:1) to linoleate (18:2) and is closely related to oleate 12-hydroxylase which catalyzes the hydroxylation of oleate to ricinoleate. Plastid-bound desaturases (Arabidopsis delta-12 desaturase (FAD6), omega-3 desaturase (FAD8), omega-6 desaturase (FAD6)), as well as the cyanobacterial thylakoid-bound FADSs, require oxygen, ferredoxin, and ferredoxin oxidoreductase for activity. As in higher plants, the cyanobacteria delta-12 (DesA) and omega-3 (DesB) FADSs desaturate oleate (18:1) to linoleate (18:2) and linoleate (18:2) to linolenate (18:3), respectively. Omega-3 (DesB/FAD8) and omega-6 (DesD/FAD6) desaturases catalyze reactions that introduce a double bond between carbons three and four, and carbons six and seven, respectively, from the methyl end of fatty acids. As with other members of this superfamily, this domain family has extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. Comparison of sequences also reveals the existence of three regions of conserved histidine cluster motifs that contain eight histidine residues: HXXXH, HXX(X)HH, and HXXHH. These histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within the homologue, stearoyl CoA desaturase. Mutation of any one of four of these histidines in the Synechocystis delta-12 acyl-lipid desaturase resulted in complete inactivity.	222
239585	cd03508	Delta4-sphingolipid-FADS-like	The Delta4-sphingolipid Fatty Acid Desaturase (Delta4-sphingolipid-FADS)-like CD includes the integral-membrane enzymes, dihydroceramide Delta-4 desaturase, involved in the synthesis of sphingosine; and the human membrane fatty acid (lipid) desaturase (MLD), reported to modulate biosynthesis of the epidermal growth factor receptor; and other related proteins. These proteins are found in various eukaryotes including vertebrates, higher plants, and fungi. Studies show that MLD is localized to the endoplasmic reticulum. As with other members of this superfamily, this domain family has extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. Comparison of sequences also reveals the existence of three regions of conserved histidine cluster motifs that contain eight histidine residues: HXXXH, HXXHH, and HXXHH. These histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within the homolog, stearoyl CoA desaturase.	289
239586	cd03509	DesA_FADS-like	Fatty acid desaturase protein family subgroup, a delta-12 acyl-lipid desaturase-like, DesA-like, yet uncharacterized subgroup of membrane fatty acid desaturase proteins found in alpha-, beta-, and gamma-proteobacteria. Sequences of this domain family appear to be structurally related to membrane fatty acid desaturases and alkane hydroxylases. They all share in common extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. Comparison of these sequences also reveals three regions of conserved histidine cluster motifs that contain eight histidine residues: HXXXH, HXXHH, and HXXHH. These histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within homologs, stearoyl CoA desaturase and alkane hydroxylase.	288
239587	cd03510	Rhizobitoxine-FADS-like	This CD includes the dihydrorhizobitoxine fatty acid desaturase (RtxC) characterized in Bradyrhizobium japonicum USDA110, and other related proteins. Dihydrorhizobitoxine desaturase is reported to be involved in the final step of rhizobitoxine biosynthesis. This domain family appears to be structurally related to the membrane fatty acid desaturases and the alkane hydroxylases. They all share in common extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. Comparison of sequences also reveals the existence of three regions of conserved histidine cluster motifs that contain eight histidine residues: HXXXH, HXX(X)HH, and HXXHH. These histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within homologs, stearoyl CoA desaturase and alkane hydroxylase.	175
239588	cd03511	Rhizopine-oxygenase-like	This CD includes the putative hydrocarbon oxygenase, MocD, a bacterial rhizopine (3-O-methyl-scyllo-inosamine, 3-O-MSI) oxygenase, and other related proteins. It has been proposed that MocD, MocE (Rieske-like ferredoxin), and MocF (ferredoxin reductase) under the regulation of MocR, act in concert to form a ferredoxin oxygenase system that demethylates 3-O-MSI to form scyllo-inosamine.  This domain family appears to be structurally related to the membrane fatty acid desaturases and the alkane hydroxylases. They all share in common extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. Comparison of sequences also reveals the existence of three regions of conserved histidine cluster motifs that contain eight histidine residues: HXXXH, HXXHH, and HXXHH. These histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within homologs, stearoyl CoA desaturase and alkane hydroxylase.	285
239589	cd03512	Alkane-hydroxylase	Alkane hydroxylase is a bacterial, integral-membrane di-iron enzyme that shares a requirement for iron and oxygen for activity similar to that of the non-heme integral-membrane acyl coenzyme A (CoA) desaturases and acyl lipid desaturases. The alk genes in Pseudomonas oleovorans encode conversion of alkanes to acyl CoA. The alkane omega-hydroxylase (AlkB) system is responsible for the initial oxidation of unactivated alkanes. It is a three-component system comprising a soluble NADH-rubredoxin reductase (AlkT), a soluble rubredoxin (AlkG), and the integral membrane oxygenase (AlkB). AlkB utilizes the oxygen rebound mechanism to hydroxylate alkanes. This mechanism involves homolytic cleavage of the C-H bond by an electrophilic metal-oxo intermediate to generate a substrate-based radical. As with other members of this superfamily, this domain family has extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. The active site structure of AlkB is not known; however, spectroscopic and genetic evidence points to a nitrogen-rich coordination environment located in the cytoplasm with as many as eight histidines coordinating the two iron ions and a carboxylate residue bridging the two metals. Like all other members of this superfamily, there are eight conserved histidines seen in the histidine cluster motifs: HXXXH, HXXXHH, and HXXHH. These histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within the homolog, stearoyl CoA desaturase. Also included in this CD are terminal alkane hydroxylases (AlkM), xylene monooxygenase hydroxylases (XylM), p-cymene monooxygenase hydroxylases (CymAa), and other related proteins.	314
239590	cd03513	CrtW_beta-carotene-ketolase	Beta-carotene ketolase/oxygenase (CrtW, also known as CrtO), the carotenoid astaxanthin biosynthetic enzyme, initially catalyzes the addition of two keto groups to carbons C4 and C4' of beta-carotene. Carotenoids are important natural pigments produced by many microorganisms and plants. Astaxanthin is reported to be an antioxidant, an anti-cancer agent, and an immune system stimulant. A number of bacteria and green algae can convert beta-carotene into astaxanthin by using several ketocarotenoids as intermediates and CrtW and a beta-carotene hydroxylase (CrtZ). CrtW initially converts beta-carotene to canthaxanthin via echinenone, and CrtZ initially mediates the conversion of beta-carotene to zeaxanthin via beta-cryptoxanthin. After a few more intermediates are formed, CrtW and CrtZ act in combination to produce astaxanthin. Sequences of this domain family appear to be structurally related to membrane fatty acid desaturases and alkane hydroxylases. They all share in common extensive hydrophobic regions that are capable of spanning the membrane bilayer at least twice. Comparison of these sequences also reveals three regions of conserved histidine cluster motifs that contain eight histidine residues: HXXXH, HXXHH, and HXXHH. These histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within homologs, stearoyl CoA desaturase and alkane hydroxylase.	225
239591	cd03514	CrtR_beta-carotene-hydroxylase	Beta-carotene hydroxylase (CrtR), the carotenoid zeaxanthin biosynthetic enzyme catalyzes the addition of hydroxyl groups to the beta-ionone rings of beta-carotene to form zeaxanthin and is found in bacteria and red algae. Carotenoids are important natural pigments; zeaxanthin and lutein are the only dietary carotenoids that accumulate in the macular region of the retina and lens. It is proposed that these carotenoids protect ocular tissues against photooxidative damage. CrtR does not show overall amino acid sequence similarity to the beta-carotene hydroxylases similar to CrtZ, an astaxanthin biosynthetic beta-carotene hydroxylase. However, CrtR does show sequence similarity to the green alga, Haematococcus pluvialis, beta-carotene ketolase (CrtW), which converts beta-carotene to canthaxanthin. Sequences of the CrtR_beta-carotene-hydroxylase domain family, as well as, the CrtW_beta-carotene-ketolase domain family appear to be structurally related to membrane fatty acid desaturases and alkane hydroxylases. They all share in common extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. Comparison of these sequences also reveals three regions of conserved histidine cluster motifs that contain eight histidine residues: HXXXH, HXXHH, and HXXHH. These histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within homologs, stearoyl CoA desaturase and alkane hydroxylase.	207
239592	cd03515	Link_domain_TSG_6_like	This is the extracellular link domain of the type found in human TSG-6. The link domain is a hyaluronan (HA)-binding domain. TSG-6 is the protein product of tumor necrosis factor-stimulated gene-6. TSG-6 is up-regulated in inflammatory lesions and in the ovary during ovulation. It has a strong anti-inflammatory and chondroprotective effect in models of acute inflammation and autoimmune arthritis and plays an essential role in female fertility. Also included in this group are the stabilins: stabilin-1 (FEEL-1, CLEVER-1) and stabilin-2 (FEEL-2). Stabilin-2 functions as the major liver and lymph node-scavenging receptor for HA and related glycosaminoglycans. Stabilin-2 is a scavenger receptor with a broad range of ligands including advanced glycation end (AGE) products, acetylated low density lipoprotein and procollagen peptides. In contrast, stabilin-1 does not bind HA, but binds acetylated low density lipoprotein and AGEs with lower affinity. As AGEs accumulate in vascular tissues during aging and diabetes, these receptors may be implicated in the pathologies of these states. Both stabilins are present in the early endocytic pathway in hepatic sinusoidal epithelium associating with clathrin/AP-2. Stabilin-1 is expressed in macrophages. Stabilin-2 is absent from the latter. In macrophages: stabilin-1 is involved in trafficking between early/sorting endosomes and the trans-Golgi network. Stabilin-1 has also been implicated in angiogenesis and possibly leucocyte trafficking. Both stabilins bind gram-positive and gram-negative bacteria. TSG-6 and stabilins contain a single link module which supports high affinity binding to HA.	93
239593	cd03516	Link_domain_CD44_like	This domain is a hyaluronan (HA)-binding domain. It is found in CD44 receptor and mediates adhesive interactions during inflammatory leukocyte homing and tumor metastasis. It also plays an important role in arteriogenesis. The functional HA-binding domain of CD44 is an extended domain comprised of a single link module flanked with N-and C- extensions. These extensions are essential for folding and for functional activity. This group also contains the cell surface retention sequence (CRS) binding protein-1 (CRSBP-1) and lymph vessel endothelial receptor-1 (LYVE-1). CRSBP-1 is a cell surface binding protein for the CRS motif of PDGF-BB (platelet-derived growth factor-BB) and is responsible for the cell surface retention of PDGF-BB in SSV-transformed cells. CRSBP-1 may play a role in autocrine regulation of cell growth mediated by CRS containing growth regulators. LYVE-1 is preferentially expressed on the lymphatic endothelium and is used as a molecular marker for the detection and characterization of lymphatic vessels in tumors.	144
239594	cd03517	Link_domain_CSPGs_modules_1_3	Link_domain_CSPGs_modules_1_3; this extracellular link domain is found in the first and third link modules of the chondroitin sulfate proteoglycan core protein (CSPG) aggrecan. In addition, it is found in the first link module of three other CSPGs: versican, neurocan, and brevican. The link domain is a hyaluronan (HA)-binding domain. CSPGs are characterized by an N-terminal globular domain (G1 domain) containing two contiguous link modules (modules 1 and 2). Both link modules of the G1 domain of aggrecan are involved in interaction with HA. In addition, aggrecan contains a second globular domain (G2) which contains link modules 3 and 4. G2 appears to lack HA-binding activity. In cartilage, aggrecan forms cartilage link protein stabilized aggregates with HA. These aggregates contribute to the tissue's load bearing properties. Aggregates having other CSPGs substituting for aggrecan may contribute to the structural integrity of many different tissues. Members of the vertebrate HAPLN (hyaluronan/HA and proteoglycan binding link) protein family are physically linked adjacent to CSPG genes.	95
239595	cd03518	Link_domain_HAPLN_module_1	Link_domain_HAPLN_module_1; this link domain is found in the first link module of proteins similar to the vertebrate HAPLN (hyaluronan/HA and proteoglycan binding link) protein family which includes cartilage link protein. The link domain is a HA-binding domain. HAPLNs contain two contiguous link modules. Both link modules of cartilage link protein are involved in interaction with HA. In cartilage, a chondroitin sulfate proteoglycan core protein (CSPG) aggrecan forms cartilage link protein stabilized aggregates with HA. These aggregates contribute to the tissue's load bearing properties. Aggregates with other CSPGs substituting for aggrecan may contribute to the structural integrity of many different tissues. Members of the vertebrate HAPLN gene family are physically linked adjacent to CSPG genes.	95
239596	cd03519	Link_domain_HAPLN_module_2	Link_domain_HAPLN_module_2; this link domain is found in the second link module of proteins similar to the vertebrate HAPLN (hyaluronan/HA and proteoglycan binding link) protein family which includes cartilage link protein. The link domain is a HA-binding domain. HAPLNs contain two contiguous link modules. Both link modules of cartilage link protein are involved in interaction with HA. In cartilage, a chondroitin sulfate proteoglycan core protein (CSPG) aggrecan forms cartilage link protein stabilized aggregates with HA. These aggregates contribute to the tissue's load bearing properties. Aggregates with other CSPGs substituting for aggrecan may contribute to the structural integrity of many different tissues. Members of the vertebrate HAPLN gene family are physically linked adjacent to CSPG genes.	91
239597	cd03520	Link_domain_CSPGs_modules_2_4	Link_domain_CSPGs_modules_2_4; this link domain is found in the second and fourth link modules of the chondroitin sulfate proteoglycan core protein (CSPG) aggrecan and in the second link module of three other CSPGs: versican, neurocan, and brevican. The link domain is a hyaluronan (HA)-binding domain. CSPGs are characterized by an N-terminal globular domain (G1 domain) containing two contiguous link modules (modules 1 and 2). Both link modules of the G1 domain of aggrecan are involved in interaction with HA. Aggrecan in addition contains a second globular domain (G2) having link modules 3 and 4 which lack HA-binding activity. In cartilage, aggrecan forms cartilage link protein stabilized aggregates with HA. These aggregates contribute to the tissue's load bearing properties. Aggregates having other CSPGs substituting for aggrecan may contribute to the structural integrity of many different tissues. Members of the vertebrate HAPLN (hyaluronan/HA and proteoglycan binding link) protein family are physically linked adjacent to CSPG genes.	96
239598	cd03521	Link_domain_KIAA0527_like	Link_domain_KIAA0527_like; this domain is found in the human protein KIAA0527. Sequence-wise, it is highly similar to the link domain. The link domain is a hyaluronan-binding (HA) domain. KIAA0527 contains a single link module. The KIAA0527 gene was originally cloned from human brain tissue.	95
239599	cd03522	MoeA_like	MoeA_like. This domain is similar to a domain found in a variety of proteins involved in biosynthesis of molybdopterin cofactor, like MoaB, MogA, and MoeA. There this domain is presumed to bind molybdopterin. The exact function of this subgroup is unknown.	312
239600	cd03523	NTR_like	NTR_like domain; a beta barrel with an oligosaccharide/oligonucleotide-binding fold found in netrins, complement proteins, tissue inhibitors of metalloproteases (TIMP), and procollagen C-proteinase enhancers (PCOLCE), amongst others. In netrins, the domain plays a role in controlling axon branching in neural development, while the common function of these modules in TIMPs appears to be binding to metzincins. A subset of this family is also known as the C345C domain because it occurs as a C-terminal domain in complement C3, C4 and C5. In C5, the domain interacts with various partners during the formation of the membrane attack complex.	105
239601	cd03524	RPA2_OBF_family	RPA2_OBF_family: A family of oligonucleotide binding (OB) folds with similarity to the OB fold of the single strand (ss) DNA-binding domain (DBD)-D of human RPA2 (also called RPA32). RPA2 is a subunit of Replication protein A (RPA). RPA is a nuclear ssDNA-binding protein (SSB) which appears to be involved in all aspects of DNA metabolism including replication, recombination, and repair. RPA also mediates specific interactions of various nuclear proteins. In animals, plants, and fungi, RPA is a heterotrimer with subunits of 70 kDa (RPA1), 32 kDa (RPA2), and 14 kDa (RPA3). RPA contains six OB folds, which are involved in ssDNA binding and in trimerization. The ssDNA binding mechanism is believed to be multistep and to involve conformational change. This family also includes OB folds similar to those found in Escherichia coli SSB, the wedge domain of E. coli RecG (a branched-DNA-specific helicase), E. coli ssDNA specific exodeoxyribonuclease VII large subunit, Pyrococcus abyssi DNA polymerase II (Pol II) small subunit, Sulfolobus solfataricus SSB, and Bacillus subtilis YhaM (a 3'-to-5' exoribonuclease). It also includes the OB folds of breast cancer susceptibility gene 2 protein (BRCA2), Oxytricha nova telomere end binding protein (TEBP), Saccharomyces cerevisiae telomere-binding protein (Cdc13), and human protection of telomeres 1 protein (POT1).	75
239602	cd03526	SQR_QFR_TypeB_TM	Succinate:quinone oxidoreductase (SQR) and Quinol:fumarate reductase (QFR) Type B subfamily, transmembrane subunit; SQR catalyzes the oxidation of succinate to fumarate coupled to the reduction of quinone to quinol, while QFR catalyzes the reverse reaction. SQR, also called succinate dehydrogenase or Complex II, is part of the citric acid cycle and the aerobic respiratory chain, while QFR is involved in anaerobic respiration with fumarate as the terminal electron acceptor. SQR and QFR share a common subunit arrangement, composed of a flavoprotein catalytic subunit, an iron-sulfur protein and one or two hydrophobic transmembrane subunits. Type B proteins contain one transmembrane subunit and two heme groups. The heme and quinone binding sites reside in the transmembrane subunits. The structural arrangement allows efficient electron transfer between the catalytic subunit, through iron-sulfur centers, and the transmembrane subunit containing the electron donor/acceptor (quinol or quinone). The reversible reduction of quinone is an essential feature of respiration, allowing the transfer of electrons between respiratory complexes.	199
239603	cd03527	RuBisCO_small	Ribulose bisphosphate carboxylase/oxygenase (Rubisco), small subunit. Rubisco is a bifunctional enzyme that catalyzes the initial steps of two opposing metabolic pathways: photosynthetic carbon fixation and the competing process of photorespiration. Rubisco Form I, present in plants and green algae, is composed of eight large and eight small subunits. The nearly identical small subunits are encoded by a family of nuclear genes. After translation, the small subunits are translocated across the chloroplast membrane, where an N-terminal signal peptide is cleaved off. While the large subunits contain the catalytic activities, it has been shown that the small subunits are important for catalysis by enhancing the catalytic rate through inducing conformational changes in the large subunits.	99
239604	cd03528	Rieske_RO_ferredoxin	Rieske non-heme iron oxygenase (RO) family, Rieske ferredoxin component; composed of the Rieske ferredoxin component of some three-component RO systems including biphenyl dioxygenase (BPDO) and carbazole 1,9a-dioxygenase (CARDO). The RO family comprises a large class of aromatic ring-hydroxylating dioxygenases found predominantly in microorganisms. These enzymes enable microorganisms to tolerate and even exclusively utilize aromatic compounds for growth. ROs consist of two or three components: reductase, oxygenase, and ferredoxin (in some cases) components. The ferredoxin component contains either a plant-type or Rieske-type [2Fe-2S] cluster. The Rieske ferredoxin component in this family carries an electron from the RO reductase component to the terminal RO oxygenase component. BPDO degrades biphenyls and polychlorinated biphenyls. BPDO ferredoxin (BphF) has structural features consistent with a minimal and perhaps archetypical Rieske protein in that the insertions that give other Rieske proteins unique structural features are missing. CARDO catalyzes dihydroxylation at the C1 and C9a positions of carbazole. Rieske ferredoxins are found as subunits of membrane oxidase complexes, cis-dihydrodiol-forming aromatic dioxygenases, bacterial assimilatory nitrite reductases, and arsenite oxidase. Rieske ferredoxins are also found as soluble electron carriers in bacterial dioxygenase and monooxygenase complexes.	98
239605	cd03529	Rieske_NirD	Assimilatory nitrite reductase (NirD) family, Rieske domain; Assimilatory nitrate and nitrite reductases convert nitrate through nitrite to ammonium. Members include bacterial and fungal proteins. The bacterial NirD contains a single Rieske domain while fungal proteins have a C-terminal Rieske domain in addition to several other domains. The fungal NirD is involved in nutrient acquisition, functioning at the soil/fungus interface to control nutrient exchange between the fungus and the host plant. The Rieske domain is a [2Fe-2S] cluster binding domain involved in electron transfer. The Rieske [2Fe-2S] cluster is liganded to two histidine and two cysteine residues present in conserved sequences called Rieske motifs. In this family, only a few members contain these residues. Other members may have lost the ability to bind the Rieske [2Fe-2S] cluster.	103
239606	cd03530	Rieske_NirD_small_Bacillus	Small subunit of nitrite reductase (NirD) family, Rieske domain; composed of proteins similar to the Bacillus subtilis small subunit of assimilatory nitrite reductase containing a Rieske domain. The Rieske domain is a [2Fe-2S] cluster binding domain involved in electron transfer. Assimilatory nitrate and nitrite reductases convert nitrate through nitrite to ammonium.	98
239607	cd03531	Rieske_RO_Alpha_KSH	The alignment model represents the N-terminal Rieske iron-sulfur domain of KshA, the oxygenase component of 3-ketosteroid 9-alpha-hydroxylase (KSH). The terminal oxygenase component of KSH is a key enzyme in the microbial steroid degradation pathway, catalyzing the 9 alpha-hydroxylation of 4-androstene-3,17-dione (AD) and 1,4-androstadiene-3,17-dione (ADD). KSH is a two-component class IA monooxygenase, with terminal oxygenase (KshA) and oxygenase reductase (KshB) components. KSH activity has been found in many actino- and proteo- bacterial genera including Rhodococcus, Nocardia, Arthrobacter, Mycobacterium, and Burkholderia.	115
239608	cd03532	Rieske_RO_Alpha_VanA_DdmC	Rieske non-heme iron oxygenase (RO) family, Vanillate-O-demethylase oxygenase (VanA) and dicamba O-demethylase oxygenase (DdmC) subfamily, N-terminal Rieske domain of the oxygenase alpha subunit; ROs comprise a large class of aromatic ring-hydroxylating dioxygenases that enable microorganisms to tolerate and utilize aromatic compounds for growth. The oxygenase alpha subunit contains an N-terminal Rieske domain with an [2Fe-2S] cluster and a C-terminal catalytic domain with a mononuclear Fe(II) binding site. The Rieske [2Fe-2S] cluster accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron for catalysis. Vanillate-O-demethylase is a heterodimeric enzyme consisting of terminal oxygenase (VanA) and reductase (VanB) components. This enzyme reductively catalyzes the conversion of vanillate into protocatechuate and formaldehyde. Protocatechuate and vanillate are important intermediate metabolites in the degradation pathway of lignin-derived compounds such as ferulic acid and vanillin by soil microbes. DdmC is the oxygenase component of a three-component dicamba O-demethylase found in Pseudomonas maltophilia that catalyzes the conversion of the widely used herbicide dicamba (2-methoxy-3,6-dichlorobenzoic acid) to DCSA (3,6-dichlorosalicylic acid).	116
239609	cd03535	Rieske_RO_Alpha_NDO	Rieske non-heme iron oxygenase (RO) family, Naphthalene 1,2-dioxygenase (NDO) subfamily, N-terminal Rieske domain of the oxygenase alpha subunit; ROs comprise a large class of aromatic ring-hydroxylating dioxygenases that enable microorganisms to tolerate and utilize aromatic compounds for growth. The oxygenase alpha subunit contains an N-terminal Rieske domain with an [2Fe-2S] cluster and a C-terminal catalytic domain with a mononuclear Fe(II) binding site. The Rieske [2Fe-2S] cluster accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron for catalysis. NDO is a three-component RO system consisting of a reductase, a ferredoxin, and a hetero-hexameric alpha-beta subunit oxygenase component. NDO catalyzes the oxidation of naphthalene to cis-(1R,2S)-dihydroxy-1,2-dihydronaphthalene (naphthalene cis-dihydrodiol) with the consumption of O2 and NAD(P)H. NDO has a relaxed substrate specificity and can oxidize almost 100 substrates. Included in its varied activities are the enantiospecific cis-dihydroxylation of polycyclic aromatic hydrocarbons and benzocycloalkenes, benzylic hydroxylation, N- and O-dealkylation, sulfoxidation and desaturation reactions.	123
239610	cd03536	Rieske_RO_Alpha_DTDO	This alignment model represents the N-terminal Rieske domain of the oxygenase alpha subunit (DitA) of diterpenoid dioxygenase (DTDO). DTDO is a novel aromatic-ring-hydroxylating dioxygenase found in Pseudomonas and other proteobacteria that degrades dehydroabietic acid (DhA). Specifically, DitA hydroxylates 7-oxodehydroabietic acid to 7-oxo-11,12-dihydroxy-8, 13-abietadien acid. The ditA1 and ditA2 genes encode the alpha and beta subunits of the oxygenase component of DTDO while the ditA3 gene encodes the ferredoxin component of DTDO. The organization of the genes encoding the various diterpenoid dioxygenase components, the phylogenetic distinctiveness of both the alpha subunit and the ferredoxin component, and the unusual iron-sulfur cluster of the ferredoxin all suggest that this enzyme belongs to a new class of aromatic ring-hydroxylating dioxygenases.	123
239611	cd03537	Rieske_RO_Alpha_PrnD	This alignment model represents the N-terminal Rieske domain of the oxygenase alpha subunit of aminopyrrolnitrin oxygenase (PrnD). PrnD is a novel Rieske N-oxygenase that catalyzes the final step in the pyrrolnitrin biosynthetic pathway, the oxidation of the amino group in aminopyrrolnitrin to a nitro group, forming the antibiotic pyrrolnitrin. The biosynthesis of pyrrolnitrin is one of the best examples of enzyme-catalyzed arylamine oxidation. Although arylamine oxygenases are widely distributed within the microbial world and used in a variety of metabolic reactions, PrnD represents one of only two known examples of arylamine oxygenases or N-oxygenases involved in arylnitro group formation, the other being AurF involved in aureothin biosynthesis.	123
239612	cd03538	Rieske_RO_Alpha_AntDO	Rieske non-heme iron oxygenase (RO) family, Anthranilate 1,2-dioxygenase (AntDO) subfamily, N-terminal Rieske domain of the oxygenase alpha subunit; ROs comprise a large class of aromatic ring-hydroxylating dioxygenases that enable microorganisms to tolerate and utilize aromatic compounds for growth. The oxygenase alpha subunit contains an N-terminal Rieske domain with an [2Fe-2S] cluster and a C-terminal catalytic domain with a mononuclear Fe(II) binding site. The Rieske [2Fe-2S] cluster accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron for catalysis. AntDO converts anthranilate to catechol, a naturally occurring compound formed through tryptophan degradation and an important intermediate in the metabolism of many N-heterocyclic compounds such as indole, o-nitrobenzoate, carbazole, and quinaldine.	146
239613	cd03539	Rieske_RO_Alpha_S5H	This alignment model represents the N-terminal Rieske iron-sulfur domain of the oxygenase alpha subunit (NagG) of salicylate 5-hydroxylase (S5H). S5H converts salicylate (2-hydroxybenzoate), a metabolic intermediate of phenanthrene, to gentisate (2,5-dihydroxybenzoate) as part of an alternate pathway for naphthalene catabolism. S5H is a multicomponent enzyme made up of NagGH (the oxygenase components), NagAa (the ferredoxin reductase component), and NagAb (the ferredoxin component). The oxygenase component is made up of alpha (NagG) and beta (NagH) subunits.	129
239614	cd03541	Rieske_RO_Alpha_CMO	Rieske non-heme iron oxygenase (RO) family, Choline monooxygenase (CMO) subfamily, N-terminal Rieske domain of the oxygenase alpha subunit; ROs comprise a large class of aromatic ring-hydroxylating dioxygenases that enable microorganisms to tolerate and utilize aromatic compounds for growth. The oxygenase alpha subunit contains an N-terminal Rieske domain with an [2Fe-2S] cluster and a C-terminal catalytic domain with a mononuclear Fe(II) binding site. The Rieske [2Fe-2S] cluster accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron for catalysis. CMO is a novel RO found in certain plants which catalyzes the first step in betaine synthesis. CMO is not found in animals or bacteria. In these organisms, the first step in betaine synthesis is catalyzed by either the membrane-bound choline dehydrogenase (CDH) or the soluble choline oxidase (COX).	118
239615	cd03542	Rieske_RO_Alpha_HBDO	Rieske non-heme iron oxygenase (RO) family, 2-Halobenzoate 1,2-dioxygenase (HBDO) subfamily, N-terminal Rieske domain of the oxygenase alpha subunit; ROs comprise a large class of aromatic ring-hydroxylating dioxygenases that enable microorganisms to tolerate and utilize aromatic compounds for growth. The oxygenase alpha subunit contains an N-terminal Rieske domain with an [2Fe-2S] cluster and a C-terminal catalytic domain with a mononuclear Fe(II) binding site. The Rieske [2Fe-2S] cluster accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron for catalysis. HBDO catalyzes the double hydroxylation of 2-halobenzoates with concomitant release of halogenide and carbon dioxide, yielding catechol.	123
239616	cd03545	Rieske_RO_Alpha_OHBDO_like	Rieske non-heme iron oxygenase (RO) family, Ortho-halobenzoate-1,2-dioxygenase (OHBDO)-like subfamily, N-terminal Rieske domain of the oxygenase alpha subunit; composed of the oxygenase alpha subunits of OHBDO, salicylate 5-hydroxylase (S5H), terephthalate 1,2-dioxygenase system (TERDOS) and similar proteins. ROs comprise a large class of aromatic ring-hydroxylating dioxygenases that enable microorganisms to tolerate and utilize aromatic compounds for growth. The oxygenase alpha subunit contains an N-terminal Rieske domain with an [2Fe-2S] cluster and a C-terminal catalytic domain with a mononuclear Fe(II) binding site. The Rieske [2Fe-2S] cluster accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron for catalysis. OHBDO converts 2-chlorobenzoate (2-CBA) to catechol as well as 2,4-dCBA and 2,5-dCBA to 4-chlorocatechol, as part of the chlorobenzoate degradation pathway. Although ortho-substituted chlorobenzoates appear to be particularly recalcitrant to biodegradation, several strains utilize 2-CBA and the dCBA derivatives as a sole carbon and energy source. S5H converts salicylate (2-hydroxybenzoate), a metabolic intermediate of phenanthrene, to gentisate (2,5-dihydroxybenzoate) as part of an alternate pathway for naphthalene catabolism. S5H is a multicomponent enzyme made up of NagGH (the oxygenase components), NagAa (the ferredoxin reductase component), and NagAb (the ferredoxin component). The oxygenase component is made up of alpha (NagG) and beta (NagH) subunits. TERDOS is present in gram-positive bacteria and proteobacteria where it converts terephthalate (1,4-dicarboxybenzene) to protocatechuate as part of the terephthalate degradation pathway. The oxygenase component of TERDOS, called TerZ, is a hetero-hexamer with 3 alpha (TerZalpha) and 3 beta (TerZbeta) subunits.	150
239617	cd03548	Rieske_RO_Alpha_OMO_CARDO	Rieske non-heme iron oxygenase (RO) family, 2-Oxoquinoline 8-monooxygenase (OMO) and Carbazole 1,9a-dioxygenase (CARDO) subfamily, N-terminal Rieske domain of the oxygenase alpha subunit; ROs comprise a large class of aromatic ring-hydroxylating dioxygenases that enable microorganisms to tolerate and utilize aromatic compounds for growth. The oxygenase alpha subunit contains an N-terminal Rieske domain with an [2Fe-2S] cluster and a C-terminal catalytic domain with a mononuclear Fe(II) binding site. The Rieske [2Fe-2S] cluster accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron for catalysis. OMO catalyzes the NADH-dependent oxidation of the N-heterocyclic aromatic compound 2-oxoquinoline to 8-hydroxy-2-oxoquinoline, the second step in the bacterial degradation of quinoline. OMO consists of a reductase component (OMR) and an oxygenase component (OMO) that together function to shuttle electrons from the reduced pyridine nucleotide to the active site of OMO, where O2 activation and 2-oxoquinoline hydroxylation occur. CARDO, which contains oxygenase (CARDO-O), ferredoxin (CARDO-F) and ferredoxin reductase (CARDO-R) components, catalyzes the dihydroxylation at the C1 and C9a positions of carbazole. The oxygenase components of OMO and CARDO contain only alpha subunits arranged in a trimeric structure.	136
239618	cd03556	L-fucose_isomerase	L-fucose isomerase (FucIase); FucIase converts L-fucose, an aldohexose, to its ketose form, which prepares it for aldol cleavage (similar to the isomerization of glucose during glycolysis). L-fucose (or 6-deoxy-L-galactose) is found in blood group determinants as well as in various oligo- and polysaccharides, and glycosides in mammals, bacteria and plants.	584
239619	cd03557	L-arabinose_isomerase	L-Arabinose isomerase (AI) catalyzes the isomerization of L-arabinose to L-ribulose, the first reaction in its conversion into D-xylulose-5-phosphate, an intermediate in the pentose phosphate pathway, which allows L-arabinose to be used as a carbon source. AI can also convert D-galactose to D-tagatose at elevated temperatures in the presence of divalent metal ions. D-tagatose, rarely found in nature, is of commercial interest as a low-calorie sugar substitute.	484
349787	cd03558	LGIC_ECD	extracellular domain (ECD) of Cys-loop neurotransmitter-gated ion channels (also known as ligand-gated ion channel (LGIC)). This superfamily contains the extracellular domain (ECD) of Cys-loop neurotransmitter-gated ion channels, which include nicotinic acetylcholine receptor (nAChR), serotonin 5-hydroxytryptamine receptor (5-HT3), type-A gamma-aminobutyric acid receptor (GABAAR) and glycine receptor (GlyR). These ligand-gated ion channels (LGICs) are found across metazoans and have close homologs in bacteria. They are vital for communication throughout the nervous system. GABAAR and GlyR are anionic channels, both mediating fast inhibitory synaptic transmission. Cl- ions are selectively conducted through the GABAAR receptor pore, resulting in hyperpolarization of the neuron. nAChR is a non-selective cation channel that is permeable to Na+ and K+, and some subunit combinations are also permeable to Ca2+. Na+ enters and K+ exits to allow net flow of positively charged ions inward. 5-HT3, a cation-selective channel, binds serotonin and is permeable to Na+, K+, and Ca2+. It mediates neuronal depolarization and excitation within the central and peripheral nervous systems. These ligand-gated chloride channels are critical not only for maintaining appropriate neuronal activity, but have long been important therapeutic targets: benzodiazepines, barbiturates, some intravenous and volatile anaesthetics, alcohol, strychnine, picrotoxin, and ivermectin all derive their biological activity from acting on the inhibitory half of the Cys-loop receptor family. The ECD contains the ligand binding sites for these receptors.	179
349850	cd03559	LGIC_TM	transmembrane domain of Cys-loop neurotransmitter-gated ion channels. This superfamily contains the transmembrane domain of Cys-loop neurotransmitter-gated ion channels, which include nicotinic acetylcholine receptor (nAChR), serotonin 5-hydroxytryptamine receptor (5-HT3), type-A gamma-aminobutyric acid receptor (GABAAR), and glycine receptor (GlyR). These ligand-gated ion channels (LGICs) are found across metazoans and have close homologs in bacteria. They are vital for communication throughout the nervous system where the sign of synaptic connections (excitatory or inhibitory) is determined by the charge of the ions that flow through these channels. In general, channels that conduct positive ions are excitatory, whereas channels that conduct negative ions are inhibitory. The transmembrane region consists of four transmembrane-spanning alpha-helical segments (M1-M4) that are linked by loops. The intracellular loop that links M1 and M2 determines the ion selectivity of the channel. GABAAR and GlyR are anionic channels, both mediating fast inhibitory synaptic transmission. Cl- ions are selectively conducted through the GABAAR receptor pore, resulting in hyperpolarization of the neuron. nAChR is a non-selective cation channel that is permeable to Na+ and K+, and some subunit combinations are also permeable to Ca2+. Na+ enters and K+ exits to allow net flow of positively charged ions inward. 5-HT3, a cation-selective channel, binds serotonin and is permeable to Na+, K+, and Ca2+. It mediates neuronal depolarization and excitation within the central and peripheral nervous systems. These ligand-gated chloride channels are critical not only for maintaining appropriate neuronal activity, but have long been important therapeutic targets: benzodiazepines, barbiturates, some intravenous and volatile anaesthetics, alcohol, strychnine, picrotoxin, and ivermectin all derive their biological activity from acting on the inhibitory half of the Cys-loop receptor family.	116
340765	cd03561	VHS	VHS (Vps27/Hrs/STAM) domain family. The VHS domain is present in Vps27 (Vacuolar Protein Sorting), Hrs (Hepatocyte growth factor-regulated tyrosine kinase substrate) and STAM (Signal Transducing Adaptor Molecule). It has a superhelical structure similar to that of the ARM (Armadillo) repeats and is present at the N-termini of proteins involved in intracellular membrane trafficking. There are four general groups of VHS domain containing proteins based on their association with other domains. The first group consists of proteins of the STAM/EAST/Hbp family, which has the domain composition of VHS-SH3-ITAM. The second consists of proteins with a FYVE domain C-terminal to VHS. The third consists of GGA proteins with a domain composition of VHS-GAT (GGA and TOM)-GAE (Gamma-Adaptin Ear) domain. The fourth consists of proteins with a VHS domain alone or with domains other than those mentioned above. In GGA proteins, VHS domains are involved in cargo recognition in trans-Golgi, thereby having a general membrane targeting/cargo recognition role in vesicular trafficking.	131
340766	cd03562	CID	CID (CTD-Interacting Domain) family. The CTD-Interacting Domain (CID) is present in several eukaryotic RNA-processing factors including yeast proteins, Pcf11 and Nrd1, and vertebrate proteins, CTD-associated factors 8 (SCAF8) and Regulation of nuclear pre-mRNA domain-containing proteins (such as RPRD1 and RPRD2). Pcf11 is a conserved and essential subunit of the yeast cleavage factor IA, which is required for polyadenylation-dependent 3'-RNA processing and transcription termination. Nrd1 is implicated in polyadenylation-independent 3'-RNA processing. CID binds tightly to the carboxy-terminal domain (CTD) of  RNA polymerase (Pol) II (RNAP II). During transcription, RNAP II synthesizes eukaryotic messenger RNA. Transcription is coupled to RNA processing through the CTD, which consists of up to 52 repeats of the sequence Tyr1-Ser2-Pro3-Thr4-Ser5-Pro6-Ser7. CID contains eight alpha-helices in a right-handed superhelical arrangement, which closely resembles that of the VHS domains and ARM (Armadillo) repeat proteins, except for its two amino-terminal helices.	123
340767	cd03564	ANTH_N	ANTH (AP180 N-Terminal Homology) domain family, N-terminal region. The ANTH (AP180 N-Terminal Homology) domain family is composed of Adaptor Protein 180 (AP180), Clathrin Assembly Lymphoid Myeloid Leukemia protein (CALM), and similar proteins. ANTH domains bind both inositol phospholipids and proteins, and contribute to the nucleation and formation of clathrin coats on membranes. ANTH-bearing proteins have recently been shown to function with adaptor protein-1 and GGA adaptors at the Trans-Golgi Network, which suggests that the ANTH domain is a universal component of the machinery for clathrin-mediated membrane budding. The ANTH domain is a unique module whose N-terminal half is structurally similar to the Epsin N-Terminal Homology (ENTH) and Vps27/Hrs/STAM (VHS) domains, containing a superhelix of eight alpha helices. In addition, it contains a coiled-coil C-terminal half with structural similarity to spectrin repeats. It binds phosphoinositide PtdIns(4,5)P2 at a short conserved motif K[X]9[K/R][H/Y] between helices 1 and 2. This model describes the N-terminal region of ANTH domains.	120
340768	cd03565	VHS_Tom1_like	VHS (Vps27/Hrs/STAM) domain of Tom1 subfamily. This subfamily is composed of Tom1 (Target of myb1 - retroviral oncogene) protein, Tom1L1 (Tom1-like1), Tom1L2 (Tom1-like2), and similar proteins. Proteins belonging to this subfamily are characterized by the presence of a VHS (Vps27p/Hrs/Stam) domain in the N-terminal portion followed by a GAT (GGA and Tom) domain. They are novel regulators for post-Golgi trafficking and signaling. Yeast do not contain homologous proteins of the Tom1 subfamily, suggesting these proteins have evolved to accommodate more complex cellular processes. Tom1 is essential for the negative regulation of Interleukin-1 and Tumor Necrosis Factor-induced signaling pathways. The VHS domain has a superhelical structure similar to the structure of the ARM repeats and is present at the very N-termini of proteins. It is a right-handed superhelix of eight alpha helices. The VHS domain has been found in a number of proteins, some of which have been implicated in intracellular trafficking and sorting.	138
340769	cd03567	VHS_GGA_metazoan	VHS (Vps27/Hrs/STAM) domain of metazoan GGA (Golgi-localized, Gamma-ear-containing, Arf-binding) proteins. GGA (Golgi-localized, Gamma-ear-containing, Arf-binding) comprises a subfamily of ubiquitously expressed, monomeric, motif-binding cargo/clathrin adaptor proteins involved in membrane trafficking between the Trans-Golgi Network (TGN) and endosomes. Jawed vertebrates contain as many as three GGA proteins: GGA1, GGA2, and GGA3. The VHS domain has a superhelical structure similar to the structure of the ARM (Armadillo) repeats and is present at the N-termini of proteins. GGA proteins have a multidomain structure consisting of an N-terminal VHS domain linked by a short proline-rich linker to a GAT (GGA and TOM) domain, which is followed by a long flexible linker to the C-terminal appendage, GAE (Gamma-Adaptin Ear) domain. The VHS domain of GGA proteins binds to the acidic-cluster dileucine (DxxLL) motif found on the cytoplasmic tails of cargo proteins trafficked between the Trans-Golgi Network and the endosomal system.	139
340770	cd03568	VHS_STAM	VHS (Vps27/Hrs/STAM) domain of the STAM (Signal Transducing Adaptor Molecule) subfamily. STAM (Signal Transducing Adaptor Molecule) subfamily members have at their N-termini a VHS domain, which is involved in cytokine-mediated intracellular signal transduction and has a superhelical structure similar to the structure of ARM (Armadillo) repeats, followed by a Ubiquitin-Interacting Motif (UIM) and a SH3 (Src Homology 3) domain, which is a well-established protein-protein interaction domain, and a GAT (GGA and TOM) domain. At the C-termini of most vertebrate STAMs, an Immunoreceptor Tyrosine-based Activation Motif (ITAM) is present, which mediates the binding of HRS (hepatocyte growth factor-regulated tyrosine kinase substrate) in endocytic and exocytic machineries. STAM is a component of the ESCRT (Endosomal Sorting Complex Required for Transport)-0 machinery and together with Hrs, functions to bind and sequester cargoes for downstream sorting into intralumenal vesicles. Jawed vertebrates have two STAM subfamily members, STAM1 and STAM2.	132
340771	cd03569	VHS_Hrs	VHS (Vps27/Hrs/STAM) domain of Hepatocyte growth factor-regulated tyrosine kinase substrate, Hrs. Hrs (Hepatocyte growth factor-regulated tyrosine kinase substrate) plays a role in at least three vesicle trafficking events: exocytosis, endocytosis, and endosome to lysosome trafficking. Hrs is involved in promoting rapid recycling of endocytosed signaling receptors to the plasma membrane. Together with STAM or STAM2, it comprises the ESCRT (Endosomal Sorting Complex Required for Transport)-0 machinery, which functions to bind and sequester cargoes for downstream sorting into intralumenal vesicles. Hrs contains an N-terminal VHS domain, which has a superhelical structure similar to the structure of ARM (Armadillo) repeats, a FYVE (Fab1p, YOTB, Vac1p, and EEA1) zinc finger domain, a Double Ubiquitin-Interacting Motif (DUIM), a P(S/T)XP motif that recruits ESCRT-I, a GAT (GGA and TOM) domain, and a short peptide motif near the C-terminus that recruits clathrin.	138
340772	cd03571	ENTH	Epsin N-Terminal Homology (ENTH) domain family. The Epsin N-Terminal Homology (ENTH) domain is an evolutionarily conserved protein module found primarily in proteins that participate in clathrin-mediated endocytosis. ENTH domain is highly similar to the N-terminal region of the AP180 N-Terminal Homology (ANTH_N) domain. ENTH and ANTH_N domains are structurally similar to the VHS domain and are composed of a superhelix of eight alpha helices. ENTH domains bind both inositol phospholipids, with preference for PtdIns(4,5)P2, and proteins, contributing to the nucleation and formation of clathrin coats on membranes. ENTH domains also function in the development of membrane curvature through lipid remodeling during the formation of clathrin-coated vesicles. ENTH and ANTH (E/ANTH)-containing proteins have recently been shown to function with adaptor protein-1 and GGA adaptors at the Trans-Golgi Network, which suggests that E/ANTH domains are universal components of the machinery for clathrin-mediated membrane budding.	117
340773	cd03572	ENTH_like_Tepsin	Epsin N-Terminal Homology (ENTH)-like domain of AP-4 complex accessory subunit Tepsin and similar domains. This family is composed of proteins containing an ENTH-like domain including vertebrate AP-4 complex accessory subunit Tepsin and Arabidopsis thaliana VHS domain-containing protein At3g16270. Tepsin is also called ENTH Domain-containing protein 2 (ENTHD2), Epsin for AP-4, or Tetra-epsin. It associates with the adapter-like complex 4 (AP-4), a heterotetramer composed of two large adaptins (epsilon and beta), a medium adaptin (mu) and a small adaptin (sigma), which forms a non-clathrin coat on vesicles departing the Trans-Golgi Network. The Epsin N-Terminal Homology (ENTH) domain is an evolutionarily conserved protein module found primarily in proteins that participate in clathrin-mediated endocytosis. The ENTH domain is highly similar to the N-terminal region of the AP180 N-Terminal Homology (ANTH_N) domain. ENTH and ANTH_N domains are structurally similar to the VHS domain and are composed of a superhelix of eight alpha helices. ENTH domains bind both inositol phospholipids, with a preference for PtdIns(4,5)P2, and proteins, and contribute to the nucleation and formation of clathrin coats on membranes. ENTH domains also function in the development of membrane curvature through lipid remodeling during the formation of clathrin-coated vesicles. ENTH and ANTH (E/ANTH)-containing proteins have recently been shown to function with adaptor protein-1 and GGA adaptors at the Trans-Golgi Network, which suggests that E/ANTH domains are universal components of the machinery for clathrin-mediated membrane budding.	119
239629	cd03574	NTR_complement_C345C	NTR/C345C domain; The NTR domains that are found in the C-termini of complement C3, C4 and C5 are also called C345C domains. In C5, the domain interacts with various partners during the formation of the membrane attack complex, a fundamental process in the mammalian defense against infection. Its role in component C3 and C4 is not well understood.	147
239630	cd03575	NTR_WFIKKN	NTR domain, WFIKKN subfamily; WFIKKN proteins contain a C-terminal NTR domain and are putative secreted proteins which may be multivalent protease inhibitors that act on serine proteases as well as metalloproteases. Human WFIKKN and a related protein sharing the same domain architecture were observed to have distinct tissue expression patterns. WFIKKN is also referred to as growth and differentiation factor-associated serum protein-1 (GASP-1). It inhibits the activity of mature myostatin, a specific regulator of skeletal muscle mass and a member of the TGFbeta superfamily.	109
239631	cd03576	NTR_PCOLCE	NTR domain, PCOLCE subfamily; Procollagen C-endopeptidase enhancers (PCOLCEs) are extracellular matrix proteins that enhance the activity of procollagen C-proteases by binding to the procollagen I C-peptide. They contain a C-terminal NTR domain, which has been suggested to possess inhibitory functions towards specific serine proteases but not towards metzincins, which are inhibited by the related TIMPs.	124
239632	cd03577	NTR_TIMP_like	NTR domain, TIMP-like subfamily; TIMPs, or tissue inhibitors of metalloproteases, are essential regulators of extracellular matrix turnover and remodeling. They form complexes with matrix metalloproteases (MMPs) and inactivate them irreversibly by non-covalently binding their active zinc-binding sites. This group contains domains similar to the TIMP NTR domain, which binds MMPs. Members of this group may or may not function as MMP inhibitors.	116
239633	cd03578	NTR_netrin-4_like	NTR domain, Netrin-4-like subfamily; composed of the C-terminal NTR domains of netrin-4 (beta netrin) and similar proteins. Netrins are secreted proteins that function as tropic cues in the direction of axon growth and cell migration during neural development. Netrin-4 is a basement membrane component that is important in neural, kidney and vascular development. It may also be involved in regulating the outgrowth and shape of epithelial cells during lung branching morphogenesis.	111
239634	cd03579	NTR_netrin-1_like	NTR domain, Netrin-1-like subfamily; The C-terminal NTR domain of netrins is also called domain C in the context of C. elegans netrin UNC-6. Netrins are secreted proteins that function as tropic cues in the direction of axon growth and cell migration during neural development. These proteins may be chemoattractive to some neurons and chemorepellant for others. In the case of netrin-1, attraction and repulsion responses are mediated by the DCC and UNC-5 receptor families. The biological activities of C. elegans UNC-6, which may either attract or repel migrating cells or axons, are mediated by its different domains. The C-terminal NTR domain of UNC-6 has been shown to inhibit axon branching activity.	115
239635	cd03580	NTR_Sfrp1_like	NTR domain, Secreted frizzled-related protein (Sfrp) 1-like subfamily; composed of proteins similar to human Sfrp1, Sfrp2 and Sfrp5. Sfrps are soluble proteins containing an NTR domain C-terminal to a cysteine-rich Frizzled domain. They show diverse functions and are thought to work in Wnt signaling indirectly, as modulators or antagonists by binding Wnt ligands, and directly, via the Wnt receptor, Frizzled. They participate in regulating the patterning along the anteroposterior axis in vertebrates. Human Sfrp1 has been found frequently to be downregulated in breast cancer and is associated with disease progression and poor prognosis.	126
239636	cd03581	NTR_Sfrp3_like	NTR domain, Secreted frizzled-related protein (Sfrp) 3-like subfamily; composed of proteins similar to human Sfrp3 and Sfrp4. Sfrps are soluble proteins containing an NTR domain C-terminal to a cysteine-rich Frizzled domain. They show diverse functions and are thought to work in Wnt signaling indirectly, as modulators or antagonists by binding Wnt ligands, and directly, via the Wnt receptor, Frizzled. They participate in regulating the patterning along the anteroposterior axis in vertebrates. Human Sfrp3 may suppress the growth and invasiveness of androgen-independent prostate cancer cells.	111
239637	cd03582	NTR_complement_C5	NTR/C345C domain, complement C5 subfamily; The NTR domain found in complement C5 is also known as C345C because it occurs at the C-terminus of complement C3, C4 and C5. Complement C5 is activated by C5 convertase, which itself is a complex between C3b and C3 convertase. The small cleavage fragment, C5a, is the most important small peptide mediator of inflammation, and the larger active fragment, C5b, initiates late events of complement activation. The NTR/C345C domain is important in the function of C5 as it interacts with enzymes that convert C5 to the active form, C5b. The domain has also been found to bind to complement components C6 and C7, and may specifically interact with their factor I modules.	150
239638	cd03583	NTR_complement_C3	NTR/C345C domain, complement C3 subfamily; The NTR domain found in complement C3 is also known as the C345C domain because it occurs at the C-terminus of complement C3, C4 and C5. Complement C3 plays a pivotal role in the activation of the complement systems, as all pathways (classical, alternative, and lectin) result in the processing of C3 by C3 convertase. The larger fragment, activated C3b, contains the NTR/C345C domain and binds covalently, via a reactive thioester, to cell surface carbohydrates including components of bacterial cell walls and immune aggregates. The smaller cleavage product, C3a, acts independently as a diffusible signal to mediate local inflammatory processes. The structure of C3 shows that the NTR/C345C domain is located in an exposed position relative to the rest of the molecule. The function of the domain in complement C3 is poorly understood.	149
239639	cd03584	NTR_complement_C4	NTR/C345C domain, complement C4 subfamily; The NTR domain found in complement C4 is also known as the C345C domain because it occurs at the C-terminus of complement C3, C4 and C5. Complement C4 is a key player in the activation of the complement classical pathway. C4 is cleaved by activated C1 to yield C4a anaphylatoxin, and the larger fragment C4b, an essential component of the C3- and C5-convertase enzymes. C4b binds covalently to the surface of pathogens through a reactive thioester. The role of the NTR/C345C domain in C4 (C4b) is unclear.	153
239640	cd03585	NTR_TIMP	NTR domain, TIMP subfamily; TIMPs, or tissue inhibitors of metalloproteases, are essential regulators of extracellular matrix turnover and remodeling. They form complexes with matrix metalloproteases (MMPs) and inactivate them irreversibly by non-covalently binding their active zinc-binding sites. The levels of activated membrane-type MMPs, MMPs, and free TIMPs determine the balance between matrix degradation and matrix formation or stabilization. Consequently, TIMPs play roles in processes that require the remodeling and degradation of connective tissue, such as development, morphogenesis, and wound healing, as well as in various diseases and pathological states such as tumor cell metastasis, arthritis, and atherosclerosis. Most TIMPs bind to a variety of MMPs. TIMP-1 and TIMP-2 appear to be multifunctional proteins with diverse biological action. They may exhibit growth factor-like activity and can inhibit angiogenesis. TIMP-3 has been implicated in apoptosis.	183
176459	cd03586	PolY_Pol_IV_kappa	DNA Polymerase IV/Kappa. Pol IV, also known as Pol kappa, DinB, and Dpo4, is a translesion synthesis (TLS) polymerase.  Translesion synthesis is a process that allows the bypass of a variety of DNA lesions.  TLS polymerases lack proofreading activity and have low fidelity and low processivity.  They use damaged DNA as templates and insert nucleotides opposite the lesions.  Known primarily as Pol IV in prokaryotes and Pol kappa in eukaryotes, this polymerase has a propensity for generating frameshift mutations.  The eukaryotic Pol kappa differs from Pol IV and Dpo4 by an N-terminal extension of ~75 residues known as the "N-clasp" region.  The structure of Pol kappa shows DNA that is almost totally encircled by Pol kappa, with the N-clasp region augmenting the interactions between DNA and the polymerase. Pol kappa is more resistant than Pol eta and Pol iota to bulky guanine adducts and is efficient at catalyzing the incorporation of dCTP.  Bacterial pol IV has a higher error rate than other Y-family polymerases.	334
239641	cd03587	SOCS	SOCS (suppressors of cytokine signaling) box. The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with an SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as other miscellaneous proteins. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions.	41
153058	cd03588	CLECT_CSPGs	C-type lectin-like domain (CTLD) of the type found in chondroitin sulfate proteoglycan core proteins. CLECT_CSPGs: C-type lectin-like domain (CTLD) of the type found in chondroitin sulfate proteoglycan core proteins (CSPGs) in human and chicken aggrecan, frog brevican, and zebra fish dermacan.  CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins.  In cartilage, aggrecan forms cartilage link protein stabilized aggregates with hyaluronan (HA).  These aggregates contribute to the tissue's load bearing properties.  Aggregates having other CSPGs substituting for aggrecan may contribute to the structural integrity of many different tissues.  Xenopus brevican is expressed in the notochord and the brain during early embryogenesis.  Zebra fish dermacan is expressed in dermal bones and may play a role in dermal bone development.  CSPGs do contain LINK domain(s) which bind HA.  These LINK domains are considered by one classification system to be a variety of CTLD, but are omitted from this hierarchical classification based on insignificant sequence similarity.	124
153059	cd03589	CLECT_CEL-1_like	C-type lectin-like domain (CTLD) of the type found in CEL-1 from Cucumaria echinata and Echinoidin from Anthocidaris crassispina. CLECT_CEL-1_like: C-type lectin-like domain (CTLD) of the type found in CEL-1 from Cucumaria echinata and Echinoidin from Anthocidaris crassispina.  CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins.  The CEL-1 CTLD binds three calcium ions and has a high specificity for N-acetylgalactosamine (GalNAc).  CEL-1 exhibits strong cytotoxicity which is inhibited by GalNAc.  This protein may play a role as a toxin defending against predation.  Echinoidin is found in the coelomic fluid of the sea urchin and is specific for GalBeta1-3GalNAc.  Echinoidin has a cell adhesive activity towards human cancer cells which is not mediated through the CTLD.  Both CEL-1 and Echinoidin are multimeric proteins comprised of multiple dimers linked by disulfide bonds.	137
153060	cd03590	CLECT_DC-SIGN_like	C-type lectin-like domain (CTLD) of the type found in human dendritic cell (DC)-specific intercellular adhesion molecule 3-grabbing non-integrin (DC-SIGN) and the related receptor, DC-SIGN receptor (DC-SIGNR). CLECT_DC-SIGN_like: C-type lectin-like domain (CTLD) of the type found in human dendritic cell (DC)-specific intercellular adhesion molecule 3-grabbing non-integrin (DC-SIGN) and the related receptor, DC-SIGN receptor (DC-SIGNR).  This group also contains proteins similar to hepatic asialoglycoprotein receptor (ASGP-R) and langerin in human.  These proteins are type II membrane proteins with a CTLD ectodomain.  CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins.  DC-SIGN is thought to mediate the initial contact between dendritic cells and resting T cells, and may also mediate the rolling of DCs on epithelium.  DC-SIGN and DC-SIGNR bind to oligosaccharides present on human tissues, as well as, on pathogens including parasites, bacteria, and viruses.  DC-SIGN and DC-SIGNR bind to HIV enhancing viral infection of T cells.  DC-SIGN and DC-SIGNR are homotetrameric, and contain four CTLDs stabilized by a coiled coil of alpha helices.  The hepatic ASGP-R is an endocytic recycling receptor which binds and internalizes desialylated glycoproteins having a terminal galactose or N-acetylgalactosamine residues on their N-linked carbohydrate chains, via the clathrin-coated pit mediated endocytic pathway, and delivers them to lysosomes for degradation.  It has been proposed that glycoproteins bearing terminal Sia (sialic acid) alpha2, 6GalNAc and Sia alpha2, 6Gal are endogenous ligands for ASGP-R and that ASGP-R participates in regulating the relative concentration of serum glycoproteins bearing alpha 2,6-linked Sia.  The human ASGP-R is a hetero-oligomer composed of two subunits, both of which are found within this group.  
Langerin is expressed in a subset of dendritic leukocytes, the Langerhans cells (LC). Langerin induces the formation of Birbeck Granules (BGs) and associates with these BGs following internalization.  Langerin binds, in a calcium-dependent manner, to glyco-conjugates containing mannose and related sugars mediating their uptake and degradation.  Langerin molecules oligomerize as trimers with three CTLDs held together by a coiled-coil of alpha helices.	126
153061	cd03591	CLECT_collectin_like	C-type lectin-like domain (CTLD) of the type found in human collectins including lung surfactant proteins A and D, mannose- or mannan binding lectin (MBL), and CL-L1 (collectin liver 1). CLECT_collectin_like: C-type lectin-like domain (CTLD) of the type found in human collectins including lung surfactant proteins A and D, mannose- or mannan binding lectin (MBL), and CL-L1 (collectin liver 1).  CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. The CTLDs of these collectins bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, or apoptotic cells) and mediate functions associated with killing and phagocytosis.  MBPs recognize high mannose oligosaccharides in a calcium dependent manner, bind to a broad range of pathogens, and trigger cell killing by activating the complement pathway.  MBP also acts directly as an opsonin.  SP-A and SP-D in addition to functioning as host defense components, are components of pulmonary surfactant which play a role in surfactant homeostasis.  Pulmonary surfactant is a phospholipid-protein complex which reduces the surface tension within the lungs.  SP-A binds the major surfactant lipid: dipalmitoylphosphatidylcholine (DPPC).  SP-D binds two minor components of surfactant that contain sugar moieties: glucosylceramide and phosphatidylinositol (PI).  MBP and SP-A, -D monomers are homotrimers with an N-terminal collagen region and three CTLDs.  Multiple homotrimeric units associate to form supramolecular complexes.  MBL deficiency results in an increased susceptibility to a large number of different infections and to inflammatory disease, such as rheumatoid arthritis.	114
153062	cd03592	CLECT_selectins_like	C-type lectin-like domain (CTLD) of the type found in the type 1 transmembrane proteins:  P(platlet)-, E(endothelial)-, and L(leukocyte)- selectins (sels). CLECT_selectins_like: C-type lectin-like domain (CTLD) of the type found in the type 1 transmembrane proteins:  P(platlet)-, E(endothelial)-, and L(leukocyte)- selectins (sels).  CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins.  P- E- and L-sels are cell adhesion receptors that mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration.  L- sel is expressed constitutively on most leukocytes.  P-sel is stored in the Weibel-Palade bodies of endothelial cells and in the alpha granules of platlets.  E- sels are present on endothelial cells.  Following platelet and/or endothelial cell activation P- sel is rapidly translocated to the cell surface and E-sel expression is induced.  The initial step in leukocyte migration involves interactions of selectins with fucosylated, sialylated, and sulfated carbohydrate moieties on target ligands displayed on glycoprotein scaffolds on endothelial cells and leucocytes.  A major ligand of P- E- and L-sels is PSGL-1 (P-sel glycoprotein ligand).  Interactions of E- and P- sels with tumor cells may promote extravasation of cancer cells.   Regulation of L-sel and P-sel function includes proteolytic shedding of the most extracellular portion (containing the CTLD) from the cell surface.  Increased levels of the soluble form of P-sel in the plasma have been found in a number of diseases including coronary disease and diabetes.  E- and P- sel also play roles in the development of synovial inflammation in inflammatory arthritis.  Platelet P-sel, but not endothelial P-sel, plays a role in the inflammatory response and neointimal formation after arterial injury.  
Selectins may also function as signal-transducing receptors.	115
153063	cd03593	CLECT_NK_receptors_like	C-type lectin-like domain (CTLD) of the type found in natural killer cell receptors (NKRs). CLECT_NK_receptors_like: C-type lectin-like domain (CTLD) of the type found in natural killer cell receptors (NKRs), including proteins similar to oxidized low density lipoprotein (OxLDL) receptor (LOX-1), CD94, CD69, NKG2-A and -D, osteoclast inhibitory lectin (OCIL), dendritic cell-associated C-type lectin-1 (dectin-1),  human myeloid inhibitory C-type lectin-like receptor (MICL), mast cell-associated functional antigen (MAFA), killer cell lectin-like receptors: subfamily F, member 1 (KLRF1) and subfamily B, member 1 (KLRB1), and lys49 receptors.  CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins.  NKRs are variously associated with activation or inhibition of natural killer (NK) cells. Activating NKRs stimulate cytolysis by NK cells of virally infected or transformed cells; inhibitory NKRs block cytolysis upon recognition of markers of healthy self cells. Most Lys49 receptors are inhibitory; some are stimulatory.  OCIL inhibits NK cell function via binding to the receptor NKRP1D.  Murine OCIL in addition to inhibiting NK cell function inhibits osteoclast differentiation. MAFA clusters with the type I Fc epsilon receptor (FcepsilonRI) and inhibits the mast cells secretory response to FcepsilonRI stimulus.  CD72 is a negative regulator of B cell receptor signaling.  NKG2D is an activating receptor for stress-induced antigens; human NKG2D ligands include the stress induced MHC-I homologs, MICA, MICB, and ULBP family of glycoproteins  Several NKRs have a carbohydrate-binding capacity which is not mediated through calcium ions (e.g. OCIL binds a range of high molecular weight sulfated glycosaminoglycans including dextran sulfate, fucoidan, and gamma-carrageenan sugars).  Dectin-1 binds fungal beta-glucans and in involved in the innate immune responses to fungal pathogens.  
MAFA binds saccharides having terminal alpha-D mannose residues in a calcium-dependent manner.  LOX-1 is the major receptor for OxLDL in endothelial cells and thought to play a role in the pathology of atherosclerosis.  Some NKRs exist as homodimers (e.g.Lys49, NKG2D, CD69, LOX-1) and some as heterodimers (e.g. CD94/NKG2A).  Dectin-1 can function as a monomer in vitro.	116
153064	cd03594	CLECT_REG-1_like	C-type lectin-like domain (CTLD) of the type found in Human REG-1 (lithostathine), REG-4, and avian eggshell-specific proteins: ansocalcin, structhiocalcin-1(SCA-1), and -2(SCA-2). CLECT_REG-1_like: C-type lectin-like domain (CTLD) of the type found in Human REG-1 (lithostathine), REG-4, and avian eggshell-specific proteins: ansocalcin, structhiocalcin-1(SCA-1), and -2(SCA-2).  CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins.  REG-1 is a proliferating factor which participates in various kinds of tissue regeneration including pancreatic beta-cell regeneration, regeneration of intestinal mucosa, regeneration of motor neurons, and perhaps in tissue regeneration of damaged heart.  REG-1 may play a role in the pathophysiology of Alzheimer's disease and in the development of gastric cancers.  Its expression is correlated with reduced survival from early-stage colorectal cancer.  REG-1 also binds and aggregates several bacterial strains from the intestinal flora and it has been suggested that it is involved in the control of the intestinal bacterial ecosystem.  Rat lithostathine has calcium carbonate crystal inhibitor activity in vitro.  REG-IV is upregulated in pancreatic, gastric, hepatocellular, and prostate adenocarcinomas.  REG-IV activates the EGF receptor/Akt/AP-1 signaling pathway in colorectal carcinoma.  Ansocalcin, SCA-1 and -2 are found at high concentration in the calcified egg shell layer of goose and ostrich, respectively and tend to form aggregates.  Ansocalcin nucleates calcite crystal aggregates in vitro.	129
153065	cd03595	CLECT_chondrolectin_like	C-type lectin-like domain (CTLD) of the type found in the human type-1A transmembrane proteins chondrolectin (CHODL) and layilin. CLECT_chondrolectin_like: C-type lectin-like domain (CTLD) of the type found in the human type-1A transmembrane proteins chondrolectin (CHODL) and layilin.  CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins.  CHODL is predominantly expressed in muscle cells and is associated with T-cell maturation.  Various alternatively spliced isoforms of CHODL have been identified.  The transmembrane form of CHODL is localized in the ER-Golgi apparatus.  Layilin is widely expressed in different cell types.  The extracellular CTLD of layilin binds hyaluronan (HA), a major constituent of the extracellular matrix (ECM).  The cytoplasmic tail of layilin binds various members of the band 4.1/ERM superfamily (talin, radixin, and merlin).  The ERM proteins are cytoskeleton-membrane linker molecules which link actin to receptors in the plasma membrane.  Layilin co-localizes with talin in membrane ruffles and may mediate signals from the ECM to the cell cytoskeleton.	149
153066	cd03596	CLECT_tetranectin_like	C-type lectin-like domain (CTLD) of the type found in the tetranectin (TN), cartilage derived C-type lectin (CLECSF1), and stem cell growth factor (SCGF). CLECT_tetranectin_like: C-type lectin-like domain (CTLD) of the type found in the tetranectin (TN), cartilage derived C-type lectin (CLECSF1), and stem cell growth factor (SCGF).  CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins.  TN binds to plasminogen and stimulates activation of plasminogen, playing a key role in the regulation of proteolytic processes.  The TN CTLD binds two calcium ions.  Its calcium free form binds to various kringle-like protein ligands.  Two residues involved in the coordination of calcium are critical for the binding of TN to the fourth kringle (K4) domain of plasminogen (Plg K4).  TN binds the kringle 1-4 form of angiostatin (AST K1-4).  AST K1-4 is a fragment of Plg, commonly found in cancer tissues.  TN inhibits the binding of Plg and AST K1-4 to the extracellular matrix (EMC) of endothelial cells and counteracts the antiproliferative effects of AST K1-4 on these cells.  TN also binds the tenth kringle domain of apolipoprotein (a).  In addition, TN binds fibrin and complex polysaccharides in a Ca2+ dependent manner.  The binding site for complex sulfated polysaccharides is N-terminal to the CTLD.  TN is homotrimeric; N-terminal to the CTLD is an alpha helical domain responsible for trimerization of monomeric units.  TN may modulate angiogenesis through interactions with angiostatin and coagulation through interaction with fibrin.  TN may play a role in myogenesis and in bone development.  Mice having a deletion in the TN gene exhibit a kyphotic spine abnormality.  TN is a useful prognostic marker of certain cancer types.  CLECSF1 is expressed in cartilage tissue, which is primarily intracellular matrix (ECM), and is a candidate for organizing ECM.  
SCGF is strongly expressed in bone marrow and is a cytokine for primitive hematopoietic progenitor cells.	129
153067	cd03597	CLECT_attractin_like	C-type lectin-like domain (CTLD) of the type found in human and mouse attractin (AtrN) and attractin-like protein (ALP). CLECT_attractin_like: C-type lectin-like domain (CTLD) of the type found in human and mouse attractin (AtrN) and attractin-like protein (ALP).  CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins.  Mouse AtrN (the product of the mahogany gene) has been shown to bind Agouti protein and to function in agouti-induced pigmentation and obesity.  Mutations in AtrN have also been shown to cause spongiform encephalopathy and hypomyelination in rats and hamsters.  The cytoplasmic region of mouse ALP has been shown to bind to melanocortin receptor (MCR4).  Signaling through MCR4 plays a role in appetite suppression.  Attractin may have therapeutic potential in the treatment of obesity.  Human attractin (hAtrN) has been shown to be expressed on activated T cells and released extracellularly.  The circulating serum attractin induces the spreading of monocytes that become the focus of the clustering of non-proliferating T cells.	129
153068	cd03598	CLECT_EMBP_like	C-type lectin-like domain (CTLD) of the type found in the human proteins, eosinophil major basic protein (EMBP) and prepro major basic protein homolog (MBPH). CLECT_EMBP_like: C-type lectin-like domain (CTLD) of the type found in the human proteins, eosinophil major basic protein (EMBP) and prepro major basic protein homolog (MBPH).  CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins.  Eosinophils and basophils carry out various functions in allergic, parasitic, and inflammatory diseases.  EMBP is stored in eosinophil crystalloid granules and is released upon degranulation.  EMBP is also expressed in basophils.  The proform of EMBP is expressed in placental X cells and breast tissue and increases significantly during human pregnancy.  EMBP has cytotoxic properties and damages bacteria and mammalian cells, in vitro, as well as, helminth parasites.  EMBP deposition has been observed in the inflamed tissue of allergy patients in a variety of diseases including asthma, atopic dermatitis, and rhinitis. In addition to its cytotoxic functions, EMBP activates cells and stimulates cytokine production.  EMBP has been shown to bind the proteoglycan heparin.  The binding site is similar to the carbohydrate binding site of other classical CTLD, such as mannose-binding protein (MBP1), however, heparin binding to EMBP is calcium ion independent.  MBPH has reduced potency in cytotoxic and cytostimulatory assays compared with EMBP.	117
153069	cd03599	CLECT_DGCR2_like	C-type lectin-like domain (CTLD) of the type found in DGCR2, an integral membrane protein deleted in DiGeorge Syndrome (DGS). CLECT_DGCR2_like: C-type lectin-like domain (CTLD) of the type found in DGCR2, an integral membrane protein deleted in DiGeorge Syndrome (DGS).  CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins.  DGS is also known velo-cardio-facial syndrome (VCFS).  DGS is a genetic abnormality that results in malformations of the heart, face, and limbs and is associated with schizophrenia and depressive disorders.  DGCR2 is a candidate for involvement in the pathogenesis of DGS since the DGCR2 gene lies within the minimal DGS critical region (MDGRC) of 22q11, which when deleted gives rise to DGS, and the DGCR2 gene is in close proximity to the balanced translocation breakpoint in a DGS patient having a balanced translocation.	153
153070	cd03600	CLECT_thrombomodulin_like	C-type lectin-like domain (CTLD) of the type found in human thrombomodulin(TM), Endosialin, C14orf27, and C1qR. CLECT_thrombomodulin_like: C-type lectin-like domain (CTLD) of the type found in human thrombomodulin(TM), Endosialin, C14orf27, and C1qR.  CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins.  In these thrombomodulin-like proteins the residues involved in coordinating Ca2+ in the classical MBP-A CTLD are not conserved.  TM exerts anti-fibrinolytic and anti-inflammatory activity.  TM also regulates blood coagulation in the anticoagulant protein C pathway.  In this pathway, the procoagulant properties of thrombin (T) are lost when it binds TM.  TM also plays a key role in tumor biology.  It is expressed on endothelial cells and on several type of tumor cell including squamous cell carcinoma.  Loss of TM expression correlates with advanced stage and poor prognosis.  Loss of function of TM function may be associated with arterial or venous thrombosis and with late fetal loss.  Soluble molecules of TM retaining the CTLD are detected in human plasma and urine where higher levels indicate injury and/or enhanced turnover of the endothelium.  C1qR is expressed on endothelial cells and stem cells.  It is also expressed on monocots and neutrophils, where it is subject to ectodomain shedding.  Soluble forms of C1qR retaining the CTLD is detected in human plasma.  C1qR modulates the phagocytosis of apoptotic cells in vivo.  C1qR-deficient mice are defective in clearance of apoptotic cells in vivo.  The cytoplasmic tail of C1qR, C-terminal to the CTLD of CD93, contains a PDZ binding domain which interacts with the PDZ domain-containing adaptor protein, GIPC.  The juxtamembrane region of this tail interacts with the ezrin/radixin/moesin family.  Endosialin functions in the growth and progression of abdominal tumors and is expressed in the stroma of several tumors.	141
153071	cd03601	CLECT_TC14_like	C-type lectin-like domain (CTLD) of the type found in lectins TC14, TC14-2, TC14-3, and TC14-4 from the budding tunicate Polyandrocarpa misakiensis and PfG6 from the Acorn worm. CLECT_TC14_like: C-type lectin-like domain (CTLD) of the type found in lectins TC14, TC14-2, TC14-3, and TC14-4 from the budding tunicate Polyandrocarpa misakiensis and PfG6 from the Acorn worm.  CTLD refers to a domain homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins.  TC14 is homodimeric.  The CTLD of TC14 binds D-galactose and D-fucose.  TC14 is expressed constitutively by multipotent epithelial and mesenchymal cells and plays a role during budding, in inducing the aggregation of undifferentiated mesenchymal cells to give rise to epithelial forming tissue.   TC14-2 and TC14-3 show calcium-dependent galactose binding activity.  TC14-3 is a cytostatic factor which blocks cell growth and dedifferentiation of the atrial epithelium during asexual reproduction.  It may also act as a differentiation inducing factor.  Galactose inhibits the cytostatic activity of TC14-3.  The gene for Acorn worm PfG6 is gill-specific; PfG6 may be a secreted protein.	119
153072	cd03602	CLECT_1	C-type lectin (CTL)/C-type lectin-like (CTLD) domain subgroup 1; a subgroup of protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. CLECT_1: C-type lectin (CTL)/C-type lectin-like (CTLD) domain subgroup 1; a subgroup of protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins.  Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces including CaCO3 and ice.  Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions.  CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers from which ligand-binding sites project in different orientations.  In some CTLDs a loop extends to the adjoining domain to form a loop-swapped dimer.	108
153073	cd03603	CLECT_VCBS	A bacterial subgroup of the C-type lectin-like (CTLD) domain; a subgroup of bacterial protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. CLECT_VCBS: A bacterial subgroup of the C-type lectin-like (CTLD) domain; a subgroup of bacterial protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins.  Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces including CaCO3 and ice.  Bacterial CTLDs within this group are functionally uncharacterized.  Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions.  CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose.  CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers from which ligand-binding sites project in different orientations.  In some CTLDs a loop extends to the adjoining domain to form a loop-swapped dimer.	118
239642	cd03670	ADPRase_NUDT9	ADP-ribose pyrophosphatase (ADPRase) catalyzes the hydrolysis of ADP-ribose to AMP and ribose-5-P.  Like other members of the Nudix hydrolase superfamily of enzymes, it is thought to require a divalent cation, such as Mg2+, for its activity. It also contains a 23-residue Nudix motif (GX5EX7REUXEEXGU, where U = I, L or V) which functions as a metal binding site/catalytic site. In addition to the Nudix motif, there are additional conserved amino acid residues, distal from the signature sequence, that correlate with substrate specificity. In humans, there are four distinct ADPRase activities, three putative cytosolic (ADPRase-I, -II, and -Mn) and a single mitochondrial enzyme (ADPRase-m). ADPRase-m is also known as NUDT9. It can be distinguished from the cytosolic ADPRase by an N-terminal target sequence unique to mitochondrial ADPRase. NUDT9 functions as a monomer.	186
239643	cd03671	Ap4A_hydrolase_plant_like	Diadenosine tetraphosphate (Ap4A) hydrolase is a member of the Nudix hydrolase superfamily. Members of this family are well represented in a variety of prokaryotic and eukaryotic organisms. Phylogenetic analysis reveals two distinct subgroups where plant enzymes fall into one group (represented by this subfamily) and fungi/animals/archaea enzymes fall into another. Bacterial enzymes are found in both subfamilies. Ap4A is a potential by-product of aminoacyl tRNA synthesis, and accumulation of Ap4A has been implicated in a range of biological events, such as DNA replication, cellular differentiation, heat shock, metabolic stress, and apoptosis. Ap4A hydrolase cleaves Ap4A asymmetrically into ATP and AMP. It is important in the invasive properties of bacteria and thus presents a potential target for the inhibition of such invasive bacteria. Besides the signature nudix motif (G[X5]E[X7]REUXEEXGU where U is Ile, Leu, or Val), Ap4A hydrolase is structurally similar to the other members of the nudix superfamily with some degree of variations. Several regions in the sequences are poorly defined and substrate and metal binding sites are only predicted based on kinetic studies.	147
239644	cd03672	Dcp2p	mRNA decapping enzyme 2 (Dcp2p), the catalytic subunit, and Dcp1p are the two components of the decapping enzyme complex. Decapping is a key step in both general and nonsense-mediated 5'->3' mRNA-decay pathways. Dcp2p contains an all-alpha helical N-terminal domain and a C-terminal domain which has the Nudix fold. While decapping is not dependent on the N-terminus of Dcp2p, it does affect its efficiency. Dcp1p binds the N-terminal domain of Dcp2p stimulating the decapping activity of Dcp2p. Decapping permits the degradation of the transcript and is a site of numerous control inputs. It is responsible for nonsense-mediated decay as well as AU-rich element (ARE)-mediated decay. In addition, it may also play a role in the levels of mRNA. Enzymes belonging to the Nudix superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and are recognized by a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V).	145
239645	cd03673	Ap6A_hydrolase	Diadenosine hexaphosphate (Ap6A) hydrolase is a member of the Nudix hydrolase superfamily. Ap6A hydrolase specifically hydrolyzes diadenosine polyphosphates, but not ATP or diadenosine triphosphate, and it generates ATP as the product. Ap6A, the most preferred substrate, hydrolyzes to produce two ATP molecules, which is a novel hydrolysis mode for Ap6A. These results indicate that Ap6A  hydrolase is a diadenosine polyphosphate hydrolase. It requires the presence of a divalent cation, such as Mn2+, Mg2+, Zn2+, and Co2+, for activity. Members of the Nudix superfamily are recognized by a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which forms a structural motif that functions as a metal binding and catalytic site.	131
239646	cd03674	Nudix_Hydrolase_1	Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity. They also contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, U=I, L or V), which forms a structural motif that functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required.	138
239647	cd03675	Nudix_Hydrolase_2	Contains a crystal structure of the Nudix hydrolase from Nitrosomonas europaea, which has an unknown function. In general, members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity. They also contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which forms a structural motif that functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required.	134
239648	cd03676	Nudix_hydrolase_3	Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required.	180
239649	cd03677	MM_CoA_mutase_beta	Coenzyme B12-dependent-methylmalonyl coenzyme A (CoA) mutase (MCM) family, Beta subunit-like subfamily; contains bacterial proteins similar to the beta subunit of MCMs from Propionbacterium shermanni and Streptomyces cinnamonensis, which are alpha/beta heterodimers. For P. shermanni MCM, it is known that only the alpha subunit binds coenzyme B12 and substrates. The role of the beta subunit is unclear. MCM catalyzes the isomerization of methylmalonyl-CoA to succinyl-CoA. The reaction proceeds via radical intermediates beginning with a substrate-induced homolytic cleavage of the Co-C bond of coenzyme B12 to produce cob(II)alamin and the deoxyadenosyl radical. MCM plays an important role in the conversion of propionyl-CoA to succinyl-CoA during the degradation of propionate for the Krebs cycle. Methylobacterium extorquens MCM participates in the glyoxylate regeneration pathway. In M. extorquens, MCM forms a complex with MeaB; MeaB may protect MCM from irreversible inactivation. In some bacteria, MCM is involved in the reverse metabolic reaction, the rearrangement of succinyl-CoA to methylmalonyl-CoA. Examples include P. shermanni MCM during propionic acid fermentation and Streptomyces MCM in polyketide biosynthesis.	424
239650	cd03678	MM_CoA_mutase_1	Coenzyme B12-dependent-methylmalonyl coenzyme A (CoA) mutase (MCM) family, unknown subfamily 1; composed of uncharacterized bacterial proteins containing a C-terminal MCM domain. MCM catalyzes the isomerization of methylmalonyl-CoA to succinyl-CoA. The reaction proceeds via radical intermediates beginning with a substrate-induced homolytic cleavage of the Co-C bond of coenzyme B12 to produce cob(II)alamin and the deoxyadenosyl radical. MCM plays an important role in the conversion of propionyl-CoA to succinyl-CoA during the degradation of propionate for the Krebs cycle. In some bacteria, MCM is involved in the reverse metabolic reaction, the rearrangement of succinyl-CoA to methylmalonyl-CoA. Members of this subfamily also contain an N-terminal coenzyme B12 binding domain followed by a domain similar to the E. coli ArgK membrane ATPase.	495
239651	cd03679	MM_CoA_mutase_alpha_like	Coenzyme B12-dependent-methylmalonyl coenzyme A (CoA) mutase (MCM) family, Alpha subunit-like subfamily; contains proteins similar to the alpha subunit of Propionbacterium shermanni MCM, as well as human and E. coli MCM. Members of this subfamily contain an N-terminal MCM domain and a C-terminal coenzyme B12 binding domain. MCM catalyzes the isomerization of methylmalonyl-CoA to succinyl-CoA. The reaction proceeds via radical intermediates beginning with a substrate-induced homolytic cleavage of the Co-C bond of coenzyme B12 to produce cob(II)alamin and the deoxyadenosyl radical. MCM plays an important role in the conversion of propionyl-CoA to succinyl-CoA during the degradation of propionate for the Krebs cycle. In higher animals, MCM is involved in the breakdown of odd-chain fatty acids, several amino acids, and cholesterol. Methylobacterium extorquens MCM participates in the glyoxylate regeneration pathway. In M. extorquens, MCM forms a complex with MeaB; MeaB may protect MCM from irreversible inactivation. In some bacteria, MCM is involved in the reverse metabolic reaction, the rearrangement of succinyl-CoA to methylmalonyl-CoA. Examples include P. shermanni MCM during propionic acid fermentation, E.coli MCM in a pathway for the conversion of succinate to propionate and Streptomyces MCM in polyketide biosynthesis. Sinorhizobium meliloti strain SU47 MCM plays a role in the polyhydroxyalkanoate degradation pathway. P. shermanni and Streptomyces cinnamonensis MCMs are alpha/beta heterodimers. It has been shown for P. shermanni MCM that only the alpha subunit binds coenzyme B12 and substrates. Human MCM is a homodimer with two active sites. Mouse and E.coli MCMs are also homodimers. In humans, impaired activity of MCM results in methylmalonic aciduria, a disorder of propionic acid metabolism.	536
239652	cd03680	MM_CoA_mutase_ICM_like	Coenzyme B12-dependent-methylmalonyl coenzyme A (CoA) mutase (MCM) family, isobutyryl-CoA mutase (ICM)-like subfamily; contains archaeal and bacterial proteins similar to the large subunit of Streptomyces cinnamonensis coenzyme B12-dependent ICM. ICM from S. cinnamonensis is comprised of a large and a small subunit. The holoenzyme appears to be an alpha2beta2 heterotetramer with up to 2 molecules of coenzyme B12 bound. The small subunit binds coenzyme B12. ICM catalyzes the reversible rearrangement of n-butyryl-CoA to isobutyryl-CoA, intermediates in fatty acid and valine catabolism, which in S. cinnamonensis can be converted to methylmalonyl-CoA and used in polyketide synthesis.	538
239653	cd03681	MM_CoA_mutase_MeaA	Coenzyme B12-dependent-methylmalonyl coenzyme A (CoA) mutase (MCM) family, MeaA-like subfamily; contains various methylmalonyl coenzyme A (CoA) mutase (MCM)-like proteins similar to the Streptomyces cinnamonensis MeaA, Methylobacterium extorquens MeaA and Streptomyces collinus B12-dependent mutase. Members of this subfamily contain an N-terminal MCM domain and a C-terminal coenzyme B12 binding domain. S. cinnamonensis MeaA is a putative B12-dependent mutase which provides methylmalonyl-CoA precursors for the biosynthesis of the monensin polyketide via an unknown pathway. S. collinus B12-dependent mutase may be involved in a pathway for acetate assimilation.	407
239654	cd03682	ClC_sycA_like	ClC sycA-like chloride channel proteins. This ClC family is present in bacteria, where it facilitates acid resistance in acidic soil. Mutation of this gene (sycA) in Rhizobium tropici CIAT899 causes serious deficiencies in nodule development, nodulation competitiveness, and N2 fixation on Phaseolus vulgaris plants, due to its reduced ability for acid resistance.  This family is part of the ClC chloride channel superfamily. These proteins catalyse the selective flow of Cl- ions across cell membranes and Cl-/H+ exchange transport. These proteins share two characteristics that are apparently inherent to the entire ClC chloride channel superfamily: a unique double-barreled architecture and voltage-dependent gating mechanism. The gating is conferred by the permeating anion itself, acting as the gating charge.	378
239655	cd03683	ClC_1_like	ClC-1-like chloride channel proteins. This CD includes isoforms ClC-0, ClC-1, ClC-2 and ClC-K. ClC-1 is expressed in skeletal muscle and its mutation leads to both recessively and dominantly-inherited forms of muscle stiffness or myotonia. ClC-K is exclusively expressed in kidney. Similarly, mutation of ClC-K leads to nephrogenic diabetes insipidus in mice and Bartter's syndrome in humans. These proteins belong to the ClC superfamily of chloride ion channels, which share the unique double-barreled architecture and voltage-dependent gating mechanism.  The gating is conferred by the permeating anion itself, acting as the gating charge. This domain is found in the eukaryotic halogen ion (Cl-, Br- and I-) channel proteins, that perform a variety of functions including cell volume regulation, regulation of intracellular chloride concentration, membrane potential stabilization, charge compensation necessary for the acidification of intracellular organelles and transepithelial chloride transport.	426
239656	cd03684	ClC_3_like	ClC-3-like chloride channel proteins.  This CD  includes ClC-3, ClC-4, ClC-5 and ClC-Y1. ClC-3 was initially cloned from rat kidney. Expression of ClC-3 produces outwardly-rectifying Cl currents that are inhibited by protein kinase C activation. It has been suggested that ClC-3 may be a ubiquitous swelling-activated Cl channel that has very similar characteristics to those of native volume-regulated Cl currents. The function of ClC-4 is unclear. Studies of human ClC-4 have revealed that it gives rise to Cl currents that rapidly activate at positive voltages, and are sensitive to extracellular pH, with currents decreasing when pH falls below 6.5. ClC-4 is broadly distributed, especially in brain and heart.   ClC-5 is predominantly expressed in the kidney, but can be found in the brain and liver. Mutations in the ClC-5 gene cause certain hereditary diseases, including Dent's disease, an X-chromosome linked syndrome characterised by proteinuria, hypercalciuria, and kidney stones (nephrolithiasis), leading to progressive renal failure.   These proteins belong to the ClC superfamily of chloride ion channels, which share the unique double-barreled architecture and voltage-dependent gating mechanism. The gating is conferred by the permeating anion itself, acting as the gating charge. This domain is found in the eukaryotic halogen ion (Cl- and I-) channel proteins, that perform a variety of functions including cell volume regulation, the membrane potential stabilization, transepithelial chloride transport and charge compensation necessary for the acidification of intracellular organelles.	445
239657	cd03685	ClC_6_like	ClC-6-like chloride channel proteins. This CD includes ClC-6, ClC-7 and ClC-B, C, D in plants. Proteins in this family are ubiquitous in eukaryotes and their functions are unclear. They are expressed in intracellular organelle membranes.  This family belongs to the ClC superfamily of chloride ion channels, which share the unique double-barreled architecture and voltage-dependent gating mechanism. The gating is conferred by the permeating anion itself, acting as the gating charge. Members of the ClC chloride ion channel superfamily perform a variety of functions including cellular excitability regulation, cell volume regulation, membrane potential stabilization, acidification of intracellular organelles, signal transduction, and transepithelial transport in animals.	466
239658	cd03687	Dehydratase_LU	Dehydratase large subunit. This family contains the large (alpha) subunit of B12-dependent glycerol dehydratases (GDHs) and B12-dependent diol dehydratases (DDHs). GDH is isofunctional with DDH. These enzymes can each catalyze the conversion of 1,2-propanediol, glycerol, and 1,2-ethanediol to the corresponding aldehydes via a coenzyme B12 (adenosylcobalamin)-dependent radical mechanism. Both enzymes exhibit a subunit composition of alpha2beta2gamma2. The enzymes differ in substrate specificity; glycerol is the preferred substrate for GDH and 1,2-propanediol for DDH. GDH shows almost equal affinity for both (R) and (S)-isomers while DDH prefers the (S) isomer. GDH plays a key role in the dihydroxyacetone (DHA) pathway and DDH in the anaerobic degradation of 1,2-diols. The radical mechanism has been well studied for Klebsiella oxytoca DDH and involves binding of 1,2-propanediol to the enzyme to induce homolytic cleavage of the Co-C5' bond of the coenzyme to form cob(II)alamin and the adenosyl radical. Hydrogen abstraction from the substrate follows producing a substrate generated radical and 5'-deoxyadenosine. Rearrangement to the product radical is then followed by abstraction of a hydrogen atom from 5'-deoxyadenosine to produce the hydrated propionaldehyde and regenerate the adenosyl radical. After the Co-C5' bond is reformed and the hydrated aldehyde dehydrated, the process is complete. GDH has a higher affinity for coenzyme B12 than DDH. Both GDH and DDH are activated by various monovalent cations with K+, NH4+, and Rb+ being the most effective. However, DDH differs from GDH in that it is partially active with Cs+ and Na+. In general, the alpha and beta subunits for both enzymes are on different chains. However, for a subset of the GDHs, alpha and beta subunits appear to be on a single chain.	545
293889	cd03688	eIF2_gamma_II	Domain II of the gamma subunit of eukaryotic translation initiation factor 2. This subfamily represents domain II of the gamma subunit of eukaryotic translation initiation factor 2 (eIF2-gamma) found in eukaryota and archaea. eIF2 is a G protein that delivers the methionyl initiator tRNA to the small ribosomal subunit and releases it upon GTP hydrolysis after the recognition of the initiation codon. eIF2 is composed of three subunits, alpha, beta and gamma. Subunit gamma shows strongest conservation, and it confers both tRNA binding and GTP/GDP binding.	113
293890	cd03689	RF3_II	Domain II of bacterial Release Factor 3. This subfamily represents domain II of bacterial Release Factor 3 (RF3). Termination of protein synthesis by the ribosome requires two release factor (RF) classes. The class II RF3 is a GTPase that removes class I RFs (RF1 or RF2) from the ribosome after release of the nascent polypeptide. RF3 in the GDP state binds to the ribosomal class I RF complex, followed by an exchange of GDP for GTP and release of the class I RF. Sequence comparison of class II release factors with elongation factors shows that prokaryotic RF3 is more similar to EF-G whereas eukaryotic eRF3 is more similar to eEF1A, implying that their precise function may differ.	87
293891	cd03690	Tet_II	Domain II of ribosomal protection proteins Tet(M) and Tet(O). This subfamily represents domain II of ribosomal protection proteins Tet(M) and Tet(O). This domain has homology to domain II of the elongation factors EF-G and EF-2. Tet(M) and Tet(O) catalyze the release of tetracycline (Tc) from the ribosome in a GTP-dependent manner thereby mediating Tc resistance. Tcs are broad-spectrum antibiotics.  Typical Tcs bind to the ribosome and inhibit the elongation phase of protein synthesis, by inhibiting the occupation of site A by aminoacyl-tRNA.	86
293892	cd03691	BipA_TypA_II	Domain II of BipA. BipA (also called TypA) is a highly conserved protein with global regulatory properties in Escherichia coli.  BipA is phosphorylated on a tyrosine residue under some cellular conditions. Mutants show altered regulation of some pathways. BipA functions as a translation factor that is required specifically for the expression of the transcriptional modulator Fis. BipA binds to ribosomes at a site that coincides with that of EF-G and has a GTPase activity that is sensitive to high GDP:GTP ratios and is stimulated  by 70S ribosomes programmed with mRNA and aminoacylated tRNAs. The growth rate-dependent induction of BipA allows the efficient expression of Fis, thereby modulating a range of downstream processes, including DNA metabolism and type III secretion. The domain II of BipA shows similarity to the domain II of the elongation factors (EFs) EF-G and EF-Tu.	94
293893	cd03692	mtIF2_IVc	C2 subdomain of domain IV in mitochondrial translation initiation factor 2. This model represents the C2 subdomain of domain IV of mitochondrial translation initiation factor 2 (mtIF2) which adopts a beta-barrel fold displaying a high degree of structural similarity with domain II of the translation elongation factor EF-Tu. The C-terminal part of mtIF2 contains the entire fMet-tRNAfmet binding site of IF-2 and is resistant to proteolysis. This C-terminal portion consists of two domains, IF2 C1 and IF2 C2.  IF2 C2 has been shown to contain all molecular determinants necessary and sufficient for the recognition and binding of fMet-tRNAfMet. Like IF2 from certain prokaryotes such as Thermus thermophilus, mtIF2 lacks domain II which is thought to be involved in binding of E.coli IF-2 to 30S subunits.	84
293894	cd03693	EF1_alpha_II	Domain II of elongation factor 1-alpha. This family represents domain II of elongation factor 1-alpha (EF-1A) that is found in archaea and all eukaryotic lineages. EF-1A is very abundant in the cytosol, where it is involved in the GTP-dependent binding of aminoacyl-tRNAs to the A site of the ribosomes in the second step of translation from mRNAs to proteins. Both domain II of EF-1A and domain IV of IF2/eIF5B have been implicated in recognition of the 3'-ends of tRNA. More than 61% of eukaryotic elongation factor 1A (eEF-1A) in cells is estimated to be associated with actin cytoskeleton. The binding of eEF-1A to actin is a noncanonical function that may link two distinct cellular processes, cytoskeleton organization and gene expression.	91
293895	cd03694	GTPBP_II	Domain II of the GTPBP family of GTP binding proteins. This group includes proteins similar to GTPBP1 and GTPBP2. GTPBP1 is structurally related to elongation factor 1 alpha, a key component of the protein biosynthesis machinery. Immunohistochemical analyses on mouse tissues revealed that GTPBP1 is expressed in some neurons and smooth muscle cells of various organs as well as macrophages. Immunofluorescence analyses revealed that GTPBP1 is localized exclusively in cytoplasm and shows a diffuse granular network forming a gradient from the nucleus to the periphery of the cells in smooth muscle cell lines and macrophages. No significant difference was observed in the immune response to protein antigen between mutant mice and wild-type mice, suggesting normal function of antigen-presenting cells of the mutant mice. The absence of an eminent phenotype in GTPBP1-deficient mice may be due to functional compensation by GTPBP2, which is similar to GTPBP1 in structure and tissue distribution.	87
293896	cd03695	CysN_NodQ_II	Domain II of the large subunit of ATP sulfurylase. This subfamily represents domain II of the large subunit of ATP sulfurylase (ATPS): CysN or the N-terminal portion of NodQ, found mainly in proteobacteria and homologous to the domain II of EF-Tu. Escherichia coli ATPS consists of CysN and a smaller subunit CysD. ATPS produces adenosine-5'-phosphosulfate (APS) from ATP and sulfate, coupled with GTP hydrolysis. In the subsequent reaction, APS is phosphorylated by an APS kinase (CysC), to produce 3'-phosphoadenosine-5'-phosphosulfate (PAPS) for use in amino acid (aa) biosynthesis. The Rhizobiaceae group (alpha-proteobacteria) appears to carry out the same chemistry for the sulfation of a nodulation factor. In Rhizobium meliloti, the heterodimeric complex comprised of NodP and NodQ appears to possess both ATPS and APS kinase activities. The N and C termini of NodQ correspond to CysN and CysC, respectively. Other eubacteria, archaea, and eukaryotes use a different ATP sulfurylase, which shows no amino acid sequence similarity to CysN or NodQ. CysN and the N-terminal portion of NodQ show similarity to GTPases involved in translation, in particular, EF-Tu and EF-1alpha.	81
293897	cd03696	SelB_II	Domain II of elongation factor SelB. This subfamily represents the domain of elongation factor SelB that is homologous to domain II of EF-Tu. SelB may function by replacing EF-Tu. In prokaryotes, the incorporation of selenocysteine as the 21st amino acid, encoded by TGA, requires several elements: SelC is the tRNA itself, SelD acts as a donor of reduced selenium, SelA modifies a serine residue on SelC into selenocysteine, and SelB is a selenocysteine-specific translation elongation factor. 3' or 5' non-coding elements of mRNA have been found as probable structures for directing selenocysteine incorporation.	83
293898	cd03697	EFTU_II	Domain II of elongation factor Tu. Elongation factors Tu (EF-Tu) are three-domain GTPases with an essential function in the elongation phase of mRNA translation. The GTPase center of EF-Tu is in the N-terminal domain (domain I), also known as the catalytic or G-domain. The G-domain is composed of about 200 amino acid residues, arranged into a predominantly parallel six-stranded beta-sheet core surrounded by seven alpha helices. Non-catalytic domains II and III are beta-barrels of seven and six, respectively, antiparallel beta-strands that share an extended interface. Both non-catalytic domains are composed of about 100 amino acid residues. EF-Tu proteins exist in two principal conformations: a compact one, EF-Tu*GTP, with tight interfaces between all three domains and a high affinity for aminoacyl-tRNA; and an open one, EF-Tu*GDP, with essentially no G-domain-domain II interactions and a low affinity for aminoacyl-tRNA. EF-Tu has approximately a 100-fold higher affinity for GDP than for GTP.	87
293899	cd03698	eRF3_II_like	Domain II of the eukaryotic class II release factor-like proteins. This model represents the domain similar to domain II of the eukaryotic class II release factor (eRF3). In eukaryotes, translation termination is mediated by two interacting release factors, eRF1 and eRF3, which act as class I and II factors, respectively. eRF1 functions as an omnipotent release factor, decoding all three stop codons and triggering the release of the nascent peptide catalyzed by the ribosome. eRF3 is a GTPase, which enhances termination efficiency by stimulating eRF1 activity in a GTP-dependent manner. Sequence comparison of class II release factors with elongation factors shows that eRF3 is more similar to eEF-1alpha whereas prokaryote RF3 is more similar to EF-G, implying that their precise function may differ. Only eukaryote RF3s are found in this group. Saccharomyces cerevisiae eRF3 (Sup35p) is a translation termination factor which is divided into three regions N, M and a C-terminal eEF1a-like region essential for translation termination. Sup35NM  is a non-pathogenic prion-like protein with the property of aggregating into polymer-like fibrils. This group also contains proteins similar to S. cerevisiae Hbs1, a G protein known to be important for efficient growth and protein synthesis under conditions of limiting translation initiation and to associate with Dom34.  It has been speculated that yeast Hbs1 and Dom34 proteins may function as part of a complex with a role in gene expression.	84
293900	cd03699	EF4_II	Domain II of Elongation Factor 4 (EF4). Elongation factor 4 (EF4 or LepA) is a highly conserved guanosine triphosphatase found in bacteria and eukaryotic mitochondria and chloroplasts. EF4 functions as a translation factor, which promotes back-translocation of tRNAs on posttranslocational ribosome complexes and competes with elongation factor G for interaction with pretranslocational ribosomes, inhibiting the elongation phase of protein synthesis.	86
293901	cd03700	EF2_snRNP_like_II	Domain II of elongation factor 2 and C-terminal domain of the spliceosomal human 116kD U5 small nuclear ribonucleoprotein (snRNP) protein. This subfamily represents domain II of elongation factor (EF) EF-2 found in eukaryotes and archaea, and the C-terminal portion of the spliceosomal human 116kD U5 small nuclear ribonucleoprotein (snRNP) protein (U5-116 kD) and its yeast counterpart Snu114p. During the process of peptide synthesis and tRNA site changes, the ribosome is moved along the mRNA a distance equal to one codon with the addition of each amino acid. This translocation step is catalyzed by EF-2_GTP, which is hydrolyzed to provide the required energy. Thus, this action releases the uncharged tRNA from the P site and transfers the newly formed peptidyl-tRNA from the A site to the P site. Yeast Snu114p is essential for cell viability and for splicing in vivo. U5-116 kD binds GTP.  Experiments suggest that GTP binding and probably GTP hydrolysis is important for the function of U5-116 kD/Snu114p.	95
293902	cd03701	IF2_IF5B_II	Domain II of prokaryotic Initiation Factor 2 and archaeal and eukaryotic Initiation Factor 5. This family represents domain II of prokaryotic Initiation Factor 2 (IF2) and its archaeal and eukaryotic homologue aeIF5B. IF2, the largest initiation factor, is an essential GTP binding protein. In E. coli, three natural forms of IF2 exist in the cell, IF2alpha, IF2beta1, and IF2beta2. Disruption of the eIF5B gene (FUN12) in yeast causes a severe slow-growth phenotype, associated with a defect in translation. eIF5B has a function analogous to prokaryotic IF2 in mediating the joining of the 60S ribosomal subunit. The eIF5B consists of three N-terminal domains (I, II, III) connected by a long helix to domain IV. Domain I is a G domain, domain II and IV are beta-barrels and domain III has a novel alpha-beta-alpha sandwich fold. The G domain and the beta-barrel domain II display a similar structure and arrangement to the homologous domains in EF1A, eEF1A and aeIF2gamma.	96
293903	cd03702	IF2_mtIF2_II	Domain II of bacterial and mitochondrial Initiation Factor 2. This family represents domain II of bacterial Initiation Factor 2 (IF2) and its eukaryotic mitochondrial homolog mtIF2. IF2, the largest initiation factor, is an essential GTP binding protein. In E. coli, three natural forms of IF2 exist in the cell, IF2alpha, IF2beta1, and IF2beta2.  Bacterial IF-2 is structurally and functionally related to eukaryotic mitochondrial mtIF-2.	96
293904	cd03703	aeIF5B_II	Domain II of archaeal and eukaryotic Initiation Factor 5. This family represents domain II of archaeal and eukaryotic IF5B. aIF5B and eIF5B are homologs of prokaryotic Initiation Factor 2 (IF2). Disruption of the eIF5B gene (FUN12) in yeast causes a severe slow-growth phenotype, associated with a defect in translation. eIF5B has a function analogous to prokaryotic IF2 in mediating the joining of 60S subunits. The eIF5B consists of three N-terminal domains (I, II, III) connected by a long helix to domain IV. Domain I is a G domain, domain II and IV are beta-barrels and domain III has a novel alpha-beta-alpha sandwich fold. The G domain and the beta-barrel domain II display a similar structure and arrangement to the homologous domains of EF1A, eEF1A and aeIF2gamma.	111
294003	cd03704	eRF3_C_III	C-terminal domain of eRF3. This model represents the eEF1alpha-like C-terminal region of eRF3, which is homologous to the domain III of EF-Tu. eRF3 is a GTPase which enhances termination efficiency by stimulating eRF1 activity in a GTP-dependent manner. The C-terminal region is responsible for translation termination activity and is essential for viability. Saccharomyces cerevisiae eRF3 (Sup35p) is a translation termination factor which is divided into three regions: N, M and a C-terminal eEF1a-like region essential for translation termination. Sup35NM is a non-pathogenic prion-like protein with the property of aggregating into polymer-like fibrils.	108
294004	cd03705	EF1_alpha_III	Domain III of Elongation Factor 1. Eukaryotic elongation factor 1 (EF-1) is responsible for the GTP-dependent binding of aminoacyl-tRNAs to ribosomes. EF-1 is composed of four subunits: the alpha chain, which binds GTP and aminoacyl-tRNAs; the gamma chain that probably plays a role in anchoring the complex to other cellular components; and the beta and delta (or beta') chains. This model represents the alpha subunit, which is the counterpart of bacterial EF-Tu for archaea (aEF-1 alpha) and eukaryotes (eEF-1 alpha).	104
294005	cd03706	mtEFTU_III	Domain III of mitochondrial EF-TU (mtEF-TU). mtEF-TU is highly conserved and is 55-60% identical to bacterial EF-TU. The overall structure is similar to that observed in the Escherichia coli and Thermus aquaticus EF-TU. However, compared with that observed in prokaryotic EF-TU, the nucleotide-binding domain (domain I) of mtEF-TU is in a different orientation relative to the rest of the structure. Furthermore, domain III is followed by a short 11-amino acid extension that forms one helical turn. This extension seems to be specific to the mitochondrial factors and has not been observed in any of the prokaryotic factors.	93
294006	cd03707	EFTU_III	Domain III of Elongation Factor (EF) Tu. EF-Tu consists of three structural domains, designated I, II, and III. Domain III adopts a beta barrel structure. Domain III is involved in binding to both charged tRNA and to elongation factor Ts (EF-Ts). EF-Ts is the guanine-nucleotide-exchange factor for EF-Tu. EF-Tu and EF-G participate in the elongation phase during protein biosynthesis on the ribosome. Their functional cycles depend on GTP binding and its hydrolysis. The EF-Tu complexed with GTP and aminoacyl-tRNA delivers tRNA to the ribosome, whereas EF-G stimulates translocation, a process in which tRNA and mRNA movements occur in the ribosome. Crystallographic studies revealed structural similarities ("molecular mimicry") between tertiary structures of EF-G and the EF-Tu-aminoacyl-tRNA ternary complex. Domains III, IV, and V of EF-G mimic the tRNA structure in the EF-Tu ternary complex; domains III, IV and V can be related to the acceptor stem, anticodon helix and T stem of tRNA respectively.	90
294007	cd03708	GTPBP_III	Domain III of the GP-1 family of GTPases. This family includes proteins similar to GTPBP1 and GTPBP2. GTPBP1 is structurally related to elongation factor 1 alpha, a key component of the protein biosynthesis machinery. Immunohistochemical analyses on mouse tissues revealed that GTPBP1 is expressed in some neurons and smooth muscle cells of various organs as well as macrophages. Immunofluorescence analyses revealed that GTPBP1 is localized exclusively in the cytoplasm and shows a diffuse granular network forming a gradient from the nucleus to the periphery of the cells in smooth muscle cell lines and macrophages. No significant difference was observed in the immune response to protein antigen between mutant mice and wild-type mice, suggesting normal function of antigen-presenting cells of the mutant mice. The absence of an eminent phenotype in GTPBP1-deficient mice may be due to functional compensation by GTPBP2, which is similar to GTPBP1 in structure and tissue distribution.	87
239680	cd03709	lepA_C	lepA_C: This family represents the C-terminal region of LepA, a GTP-binding protein localized in the cytoplasmic membrane.   LepA is ubiquitous in Bacteria and Eukaryota (e.g. Saccharomyces cerevisiae GUF1p), but is missing from Archaea. LepA exhibits significant homology to elongation factors (EFs) Tu and G. The function(s) of the proteins in this family are unknown. The N-terminal domain of LepA is homologous to a domain of similar size found in initiation factor 2 (IF2), and in EF-Tu and EF-G (factors required for translation in Escherichia coli). Two types of phylogenetic tree, rooted by other GTP-binding proteins, suggest that eukaryotic homologs (including S. cerevisiae GUF1) originated within the bacterial LepA family. LepA has never been observed in archaea, and eukaryl LepA is organellar. LepA is therefore a true bacterial GTPase, found only in the bacterial lineage.	80
239681	cd03710	BipA_TypA_C	BipA_TypA_C: a C-terminal portion of BipA or TypA having homology to the C terminal domains of the elongation factors EF-G and EF-2. A member of the ribosome binding GTPase superfamily, BipA is widely distributed in bacteria and plants. BipA is a highly conserved protein with global regulatory properties in Escherichia coli. BipA is phosphorylated on a tyrosine residue under some cellular conditions. Mutants show altered regulation of some pathways. BipA functions as a translation factor that is required specifically for the expression of the transcriptional modulator Fis. BipA binds to ribosomes at a site that coincides with that of EF-G and has a GTPase activity that is sensitive to high GDP:GTP ratios and is stimulated by 70S ribosomes programmed with mRNA and aminoacylated tRNAs. The growth rate-dependent induction of BipA allows the efficient expression of Fis, thereby modulating a range of downstream processes, including DNA metabolism and type III secretion.	79
239682	cd03711	Tet_C	Tet_C: C-terminus of ribosomal protection proteins Tet(M) and Tet(O). This domain has homology to the C terminal domains of the elongation factors EF-G and EF-2. Tet(M) and Tet(O) catalyze the release of tetracycline (Tc) from the ribosome in a GTP-dependent manner thereby mediating Tc resistance.  Tcs are broad-spectrum antibiotics.  Typical Tcs bind to the ribosome and inhibit the elongation phase of protein synthesis, by inhibiting the  occupation of site A by aminoacyl-tRNA.	78
239683	cd03713	EFG_mtEFG_C	EFG_mtEFG_C: domains similar to the C-terminal domain of the bacterial translational elongation factor (EF) EF-G.  Included in this group is the C-terminus of mitochondrial Elongation factor G1 (mtEFG1) and G2 (mtEFG2) proteins. Eukaryotic cells harbor 2 protein synthesis systems: one localized in the cytoplasm, the other in the mitochondria. Most factors regulating mitochondrial protein synthesis are encoded by nuclear genes, translated in the cytoplasm, and then transported to the mitochondria. The eukaryotic system of elongation factor (EF) components is more complex than that in prokaryotes, with both cytoplasmic and mitochondrial elongation factors and multiple isoforms being expressed in certain species. During the process of peptide synthesis and tRNA site changes, the ribosome is moved along the mRNA a distance equal to one codon with the addition of each amino acid. In bacteria this translocation step is catalyzed by EF-G_GTP, which is hydrolyzed to provide the required energy. Thus, this action releases the uncharged tRNA from the P site and transfers the newly formed peptidyl-tRNA from the A site to the P site. Eukaryotic mtEFG1 proteins show significant homology to bacterial EF-Gs.  Mutants in yeast mtEFG1 have impaired mitochondrial protein synthesis, respiratory defects and a tendency to lose mitochondrial DNA. No clear phenotype has been found for mutants in the yeast homologue of mtEFG2, MEF2.	78
239684	cd03714	RT_DIRS1	RT_DIRS1: Reverse transcriptases (RTs) occurring in the DIRS1 group of retrotransposons. Members of the subfamily include the Dictyostelium DIRS-1, Volvox carteri kangaroo, and Panagrellus redivivus PAT elements. These elements differ from LTR and conventional non-LTR retrotransposons. They contain split direct repeat (SDR) termini, and have been proposed to integrate via double-stranded closed-circle DNA intermediates assisted by an encoded recombinase which is similar to gamma-site-specific integrase.	119
239685	cd03715	RT_ZFREV_like	RT_ZFREV_like: A subfamily of reverse transcriptases (RTs) found in sequences similar to the intact endogenous retrovirus ZFERV from zebrafish and to Moloney murine leukemia virus RT.  An RT gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. RTs occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. These elements can be divided into two major groups. One group contains retroviruses and DNA viruses whose propagation involves an RNA intermediate. They are grouped together with transposable elements containing long terminal repeats (LTRs). The other group, also called poly(A)-type retrotransposons, contain fungal mitochondrial introns and transposable elements that lack LTRs. Phylogenetic analysis suggests that  ZFERV belongs to a distinct group of retroviruses.	210
239686	cd03716	SOCS_ASB_like	SOCS (suppressors of cytokine signaling) box of ASB (ankyrin repeat and SOCS box) and SSB (SPRY domain-containing SOCS box proteins) protein families. ASB family members have a C-terminal SOCS box and an N-terminal ankyrin-related sequence of a variable number of repeats. SSB proteins contain a central SPRY domain and a C-terminal SOCS. Recently, it has been shown that all four SSB proteins interact with the MET, the receptor protein-tyrosine kinase for hepatocyte growth factor (HGF), and that SSB-1, SSB-2, and SSB-4 interact with prostate apoptosis response protein-4. Both types of interactions are mediated through the SPRY domain.	42
239687	cd03717	SOCS_SOCS_like	SOCS (suppressors of cytokine signaling) box of SOCS-like proteins. The CIS/SOCS family of proteins is characterized by the presence of a C-terminal SOCS box and a central SH2 domain. These intracellular proteins regulate the responses of immune cells to cytokines. Identified as negative regulators of the cytokine-JAK-STAT pathway, they seem to play a role in many immunological and pathological processes. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. Related SOCS boxes are also present in Rab40-like proteins and insect proteins of unknown function that also contain a NEUZ (domain in neuralized proteins) domain.	39
239688	cd03718	SOCS_SSB1_4	SOCS (suppressors of cytokine signaling) box of SSB1 and SSB4 (SPRY domain-containing SOCS box proteins)-like proteins. SSB proteins contain a central SPRY domain and a C-terminal SOCS. SSB1 and SSB4 have been shown to bind to MET, the receptor protein-tyrosine kinase for hepatocyte growth factor (HGF), and also to interact with prostate apoptosis response protein-4. Both types of interactions are mediated through the SPRY domain. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions.	42
239689	cd03719	SOCS_SSB2	SOCS (suppressors of cytokine signaling) box of SSB2 (SPRY domain-containing SOCS box proteins)-like proteins. SSB proteins contain a central SPRY domain and a C-terminal SOCS. SSB2 has been shown to bind to MET, the receptor protein-tyrosine kinase for hepatocyte growth factor (HGF). SSB2, like SSB4 and SSB1, also interacts with prostate apoptosis response protein-4. Both types of interactions are mediated through the SPRY domain. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions.	42
239690	cd03720	SOCS_ASB1	SOCS (suppressors of cytokine signaling) box of ASB1-like proteins. ASB family members have a C-terminal SOCS box and an N-terminal ankyrin-related sequence. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions.	42
239691	cd03721	SOCS_ASB2	SOCS (suppressors of cytokine signaling) box of ASB2-like proteins. ASB family members have a C-terminal SOCS box and an N-terminal ankyrin-related sequence. ASB2 targets specific proteins to destruction by the proteasome in leukemia cells that have been induced to differentiate. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions.	45
239692	cd03722	SOCS_ASB3	SOCS (suppressors of cytokine signaling) box of ASB3-like proteins. ASB family members have a C-terminal SOCS box and an N-terminal ankyrin-related sequence. ASB3 has been shown to be a negative regulator of TNF-R2-mediated cellular responses to TNF-alpha by direct targeting of tumor necrosis factor receptor II (TNF-R2) for ubiquitination and proteasome-mediated degradation. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions.	51
239693	cd03723	SOCS_ASB4_ASB18	SOCS (suppressors of cytokine signaling) box of ASB4 and ASB18 proteins. ASB family members have a C-terminal SOCS box and an N-terminal ankyrin-related sequence. Asb4 was identified as an imprinted gene in mice. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions.	48
239694	cd03724	SOCS_ASB5	SOCS (suppressors of cytokine signaling) box of ASB5-like proteins. ASB family members have a C-terminal SOCS box and an N-terminal ankyrin-related sequence. ASB5 has been implicated in the initiation of arteriogenesis. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions.	42
239695	cd03725	SOCS_ASB6	SOCS (suppressors of cytokine signaling) box of ASB6-like proteins. ASB family members have a C-terminal SOCS box and an N-terminal ankyrin-related sequence. ASB6 interacts with the adaptor protein APS and recruits elongin B/C to the insulin receptor signaling complex. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions.	44
239696	cd03726	SOCS_ASB7	SOCS (suppressors of cytokine signaling) box of ASB7-like proteins. ASB family members have a C-terminal SOCS box and an N-terminal ankyrin-related sequence. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions.	45
239697	cd03727	SOCS_ASB8	SOCS (suppressors of cytokine signaling) box of ASB8-like proteins. ASB family members have a C-terminal SOCS box and an N-terminal ankyrin-related sequence. Human ASB8 is highly transcribed in skeletal muscle and in lung carcinoma cell lines. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions.	43
239698	cd03728	SOCS_ASB_9_11	SOCS (suppressors of cytokine signaling) box of ASB9 and 11 proteins. ASB family members have a C-terminal SOCS box and an N-terminal ankyrin-related sequence. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions.	42
239699	cd03729	SOCS_ASB13	SOCS (suppressors of cytokine signaling) box of ASB13-like proteins. ASB family members have a C-terminal SOCS box and an N-terminal ankyrin-related sequence. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions.	42
239700	cd03730	SOCS_ASB14	SOCS (suppressors of cytokine signaling) box of ASB14-like proteins. ASB family members have a C-terminal SOCS box and an N-terminal ankyrin-related sequence. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions.	57
239701	cd03731	SOCS_ASB15	SOCS (suppressors of cytokine signaling) box of ASB15-like proteins. ASB family members have a C-terminal SOCS box and an N-terminal ankyrin-related sequence. Human ASB15 is expressed predominantly in skeletal muscle and participates in the regulation of protein turnover and muscle cell development by stimulating protein synthesis and regulating differentiation of muscle cells. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions.	56
239702	cd03733	SOCS_WSB_SWIP	SOCS (suppressors of cytokine signaling) box of WSB/SWiP-like proteins. This subfamily contains WSB-1 (SOCS-box-containing WD-40 protein), part of an E3 ubiquitin ligase for the thyroid-hormone-activating type 2 iodothyronine deiodinase (D2), and SWiP-1 (SOCS box and WD-repeats in Protein), a WD40-containing protein that is expressed in embryonic structures of chickens and regulated by Sonic Hedgehog (Shh), as well as their isoforms WSB-2 and SWiP-2. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions.	39
239703	cd03734	SOCS_CIS1	SOCS (suppressors of cytokine signaling) box of CIS (cytokine-inducible SH2 protein) 1-like proteins. Together with the SOCS proteins, the CIS/SOCS family of proteins is characterized by the presence of a C-terminal SOCS box and a central SH2 domain. CIS1, like SOCS1 and SOCS3, is involved in the down-regulation of the JAK/STAT pathway. CIS1 binds to cytokine receptors at STAT5-docking sites, which prohibits recruitment of STAT5 to the receptor signaling complex and results in the down-regulation of activation by STAT5.	41
239704	cd03735	SOCS_SOCS1	SOCS (suppressors of cytokine signaling) box of SOCS1-like proteins. Together with CIS1, the CIS/SOCS family of proteins is characterized by the presence of a C-terminal SOCS box and a central SH2 domain. SOCS1, like CIS1 and SOCS3, is involved in the down-regulation of the JAK/STAT pathway. SOCS1 has a dual function as a direct potent JAK kinase inhibitor and as a component of an E3 ubiquitin-ligase complex recruiting substrates to the protein degradation machinery.	43
239705	cd03736	SOCS_SOCS2	SOCS (suppressors of cytokine signaling) box of SOCS2-like proteins. Together with CIS1, the CIS/SOCS family of proteins is characterized by the presence of a C-terminal SOCS box and a central SH2 domain. SOCS2 has recently been shown to regulate neuronal differentiation by controlling expression of a neurogenic transcription factor, Neurogenin-1. SOCS2 binds to GH receptors and inhibits the activation of STAT5b induced by GH. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions.	41
239706	cd03737	SOCS_SOCS3	SOCS (suppressors of cytokine signaling) box of SOCS3-like proteins. Together with CIS1, the CIS/SOCS family of proteins is characterized by the presence of a C-terminal SOCS box and a central SH2 domain. SOCS3, like CIS1 and SOCS1, is involved in the down-regulation of the JAK/STAT pathway.  SOCS3 inhibits JAK activity indirectly through recruitment to the cytokine receptors. SOCS3 has been shown to play an essential role in placental development and a non-essential role in embryo development. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions.	42
239707	cd03738	SOCS_SOCS4	SOCS (suppressors of cytokine signaling) box of SOCS4-like proteins. Together with CIS1, the CIS/SOCS family of proteins is characterized by the presence of a C-terminal SOCS box and a central SH2 domain. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions.	56
239708	cd03739	SOCS_SOCS5	SOCS (suppressors of cytokine signaling) box of SOCS5-like proteins. Together with CIS1, the CIS/SOCS family of proteins is characterized by the presence of a C-terminal SOCS box and a central SH2 domain. SOCS5 inhibits Th2 differentiation by inhibiting IL-4 signaling. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system.   The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions.	57
239709	cd03740	SOCS_SOCS6	SOCS (suppressors of cytokine signaling) box of SOCS6-like proteins. Together with CIS1, the CIS/SOCS family of proteins is characterized by the presence of a C-terminal SOCS box and a central SH2 domain. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions.	41
239710	cd03741	SOCS_SOCS7	SOCS (suppressors of cytokine signaling) box of SOCS7-like proteins. Together with CIS1, the CIS/SOCS family of proteins is characterized by the presence of a C-terminal SOCS box and a central SH2 domain. SOCS7 is important in the functioning of neuronal cells. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions.	49
239711	cd03742	SOCS_Rab40	SOCS (suppressors of cytokine signaling) box of Rab40-like proteins. Rab40 is part of the Rab family of small GTP-binding proteins that form the largest family within the Ras superfamily. Rab proteins regulate vesicular trafficking pathways, behaving as membrane-associated molecular switches. Rab40 is characterized by a SOCS box C-terminal to the GTPase domain. The SOCS boxes interact with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions.	43
239712	cd03743	SOCS_SSB4	SOCS (suppressors of cytokine signaling) box of  SSB4 (SPRY domain-containing SOCS box proteins)-like proteins. SSB proteins contain a central SPRY domain and a C-terminal SOCS. SSB4 has been shown to bind to MET, the receptor protein-tyrosine kinase for hepatocyte growth factor (HGF). SSB4, like SSB2 and SSB1, also interacts with prostate apoptosis response protein-4. Both types of interactions are mediated through the SPRY domain. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions.	42
239713	cd03744	SOCS_SSB1	SOCS (suppressors of cytokine signaling) box of SSB1 (SPRY domain-containing SOCS box proteins)-like proteins. SSB proteins contain a central SPRY domain and a C-terminal SOCS. SSB1 has been shown to bind to MET, the receptor protein-tyrosine kinase for hepatocyte growth factor (HGF), in both the absence and the presence of HGF, and enhances the HGF-MET-induced mitogen-activated protein kinases Erk-transcription factor Elk-1-serum response elements (SRE) pathway. SSB1, like SSB2 and SSB4, also interacts with prostate apoptosis response protein-4. Both types of interactions are mediated through the SPRY domain. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions.	42
239714	cd03745	SOCS_WSB2_SWIP2	SOCS (suppressors of cytokine signaling) box of WSB2/SWiP2-like proteins. This family consists of WSB-2 (SOCS-box-containing WD-40 protein) and SWiP-2 (SOCS box and WD-repeats in Protein). No functional information is available for WSB2 or SWiP-2, but limited information is available for the isoforms WSB-1 and SWiP-1.  The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions.	39
239715	cd03746	SOCS_WSB1_SWIP1	SOCS (suppressors of cytokine signaling) box of WSB1/SWiP1-like proteins. This subfamily contains WSB-1 (SOCS-box-containing WD-40 protein), part of an E3 ubiquitin ligase for the thyroid-hormone-activating type 2 iodothyronine deiodinase (D2) and SWiP-1 (SOCS box and WD-repeats in Protein), a WD40-containing protein that is expressed in embryonic structures of chickens and regulated by Sonic Hedgehog (Shh). The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions.	40
239716	cd03747	Ntn_PGA_like	Penicillin G acylase (PGA) belongs to a family of beta-lactam acylases that includes cephalosporin acylase (CA) and aculeacin A acylase. PGA and CA are crucial for the production of backbone chemicals like 6-aminopenicillanic acid and 7-aminocephalosporanic acid (7-ACA), which can be used to synthesize semi-synthetic penicillins and cephalosporins, respectively.  While both PGA and CA have a conserved Ntn (N-terminal nucleophile) hydrolase fold and the structural similarity at their active sites is very high, their sequence similarity is low.	312
239717	cd03748	Ntn_PGA	Penicillin G acylase (PGA) is the key enzyme in the industrial production of beta-lactam antibiotics. PGA hydrolyzes the side chain of penicillin G and related beta-lactam antibiotics releasing 6-amino penicillanic acid (6-APA), a building block in the production of semisynthetic penicillins.  PGA is widely distributed among microorganisms, including bacteria, yeast and filamentous fungi, but its in vivo role remains unclear.	488
239718	cd03749	proteasome_alpha_type_1	proteasome_alpha_type_1. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each.	211
239719	cd03750	proteasome_alpha_type_2	proteasome_alpha_type_2. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each.	227
239720	cd03751	proteasome_alpha_type_3	proteasome_alpha_type_3. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each.	212
239721	cd03752	proteasome_alpha_type_4	proteasome_alpha_type_4. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each.	213
239722	cd03753	proteasome_alpha_type_5	proteasome_alpha_type_5. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each.	213
239723	cd03754	proteasome_alpha_type_6	proteasome_alpha_type_6. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each.	215
239724	cd03755	proteasome_alpha_type_7	proteasome_alpha_type_7. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each.	207
239725	cd03756	proteasome_alpha_archeal	proteasome_alpha_archeal. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each.	211
239726	cd03757	proteasome_beta_type_1	proteasome beta type-1 subunit. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each.	212
239727	cd03758	proteasome_beta_type_2	proteasome beta type-2 subunit. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each.	193
239728	cd03759	proteasome_beta_type_3	proteasome beta type-3 subunit. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each.	195
239729	cd03760	proteasome_beta_type_4	proteasome beta type-4 subunit. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each.	197
239730	cd03761	proteasome_beta_type_5	proteasome beta type-5 subunit. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each.	188
239731	cd03762	proteasome_beta_type_6	proteasome beta type-6 subunit. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each.	188
239732	cd03763	proteasome_beta_type_7	proteasome beta type-7 subunit. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each.	189
239733	cd03764	proteasome_beta_archeal	Archaeal proteasome, beta subunit. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme for non-lysosomal protein degradation in both the cytosol and the nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are both members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each.	188
239734	cd03765	proteasome_beta_bacterial	Bacterial proteasome, beta subunit. The 20S proteasome, multisubunit proteolytic complex, is the central enzyme of nonlysosomal protein degradation in both the cytosol and nucleus. It is composed of 28 subunits arranged as four homoheptameric rings that stack on top of one another forming an elongated alpha-beta-beta-alpha cylinder with a central cavity. The proteasome alpha and beta subunits are members of the N-terminal nucleophile (Ntn)-hydrolase superfamily. Their N-terminal threonine residues are exposed as a nucleophile in peptide bond hydrolysis. Mammals have 7 alpha and 7 beta proteasome subunits while archaea have one of each.	236
239735	cd03766	Gn_AT_II_novel	Gn_AT_II_novel.  This asparagine synthase-related domain is present in eukaryotes but its function has not yet been determined.  The glutaminase domain catalyzes an amide nitrogen transfer from glutamine to the appropriate substrate. In this process, glutamine is hydrolyzed to glutamic acid and ammonia. This domain is related to members of the Ntn (N-terminal nucleophile) hydrolase superfamily and is found at the N-terminus of enzymes such as glucosamine-fructose 6-phosphate synthase (GLMS or GFAT), glutamine phosphoribosylpyrophosphate (Prpp) amidotransferase (GPATase), asparagine synthetase B (AsnB), beta lactam synthetase (beta-LS) and glutamate synthase (GltS). GLMS catalyzes the formation of glucosamine 6-phosphate from fructose 6-phosphate and glutamine in amino sugar synthesis. GPATase catalyzes the first step in purine biosynthesis, an amide transfer from glutamine to PRPP, resulting in phosphoribosylamine, pyrophosphate and glutamate.  Asparagine synthetase B  synthesizes asparagine from aspartate and glutamine. Beta-LS catalyzes the formation of the beta-lactam ring in the beta-lactamase inhibitor clavulanic acid. GltS synthesizes L-glutamate from 2-oxoglutarate and L-glutamine. These enzymes are generally dimers, but GPATase also exists as a homotetramer.	181
239736	cd03767	SR_Res_par	Serine recombinase (SR) family, Partitioning (par)-Resolvase subfamily, catalytic domain; Serine recombinases catalyze site-specific recombination of DNA molecules by a concerted, four-strand cleavage and rejoining mechanism which involves a transient phosphoserine linkage between DNA and the enzyme. They are functionally versatile and include resolvases, invertases, integrases, and transposases. This subgroup is composed of proteins similar to the E. coli resolvase found in the par region of the RP4 plasmid, which encodes a highly efficient partitioning system. This protein is part of a complex stabilization system involved in the resolution of plasmid dimers during cell division. Similar to Tn3 and other resolvases, members of this family may contain a C-terminal DNA binding domain.	146
239737	cd03768	SR_ResInv	Serine Recombinase (SR) family, Resolvase and Invertase subfamily, catalytic domain; members contain a C-terminal DNA binding domain. Serine recombinases catalyze site-specific recombination of DNA molecules by a concerted, four-strand cleavage and rejoining mechanism which involves a transient phosphoserine linkage between DNA and the enzyme. They are functionally versatile and include resolvases, invertases, integrases, and transposases. Resolvases and invertases affect resolution or inversion and comprise a major phylogenic group. Resolvases (e.g. Tn3, gamma-delta, and Tn5044) normally recombine two sites in direct repeat causing deletion of the DNA between the sites. Invertases (e.g. Gin and Hin) recombine sites in inverted repeat to invert the DNA between the sites. Cointegrate resolution with gamma-delta resolvase requires the formation of a synaptosome of three resolvase dimers bound to each of two res sites on the DNA. Also included in this subfamily are some putative integrases including a sequence from bacteriophage phi-FC1.	126
239738	cd03769	SR_IS607_transposase_like	Serine Recombinase (SR) family, IS607-like transposase subfamily, catalytic domain; members contain a DNA binding domain with homology to MerR/SoxR located N-terminal to the catalytic domain. Serine recombinases catalyze site-specific recombination of DNA molecules by a concerted, four-strand cleavage and rejoining mechanism which involves a transient phosphoserine linkage between DNA and the enzyme. They are functionally versatile and include resolvases, invertases, integrases, and transposases. This subfamily is composed of proteins that catalyze the transposition of insertion sequence (IS) elements such as IS607 from Helicobacter and IS1535 from Mycobacterium, and similar proteins from other bacteria and several archaeal species. IS elements are DNA segments that move to new sites in prokaryotic and eukaryotic genomes causing insertion mutations and gene rearrangements.	134
239739	cd03770	SR_TndX_transposase	Serine Recombinase (SR) family, TndX-like transposase subfamily, catalytic domain; composed of large serine recombinases similar to Clostridium TndX and TnpX transposases. Serine recombinases catalyze site-specific recombination of DNA molecules by a concerted, four-strand cleavage and rejoining mechanism which involves a transient phosphoserine linkage between DNA and the enzyme. They are functionally versatile and include resolvases, invertases, integrases, and transposases. TndX mediates the excision and circularization of the conjugative transposon Tn5397 from Clostridium difficile. TnpX is responsible for the movement of the nonconjugative chloramphenicol resistance elements of the Tn4451/3 family. Mobile genetic elements such as transposons are important vehicles for the transmission of virulence and antibiotic resistance in many microorganisms.	140
239740	cd03771	MATH_Meprin	Meprin family, MATH domain; Meprins are multidomain, highly glycosylated extracellular metalloproteases, which are either anchored to the membrane or secreted into extracellular spaces. They are expressed in renal and intestinal brush border membranes, leukocytes, and cancer cells, and are capable of cleaving growth factors, cytokines, extracellular matrix proteins, and biologically active peptides. Meprin proteases are composed of two related subunits, alpha and beta, which form homo- or hetero-complexes where the basic unit is a disulfide-linked dimer. Despite their similarity, the two subunits differ in their ability to self-associate, in proteolytic processing during biosynthesis and in substrate specificity. Both subunits are synthesized as membrane-spanning proteins; however, the alpha subunit is cleaved during biosynthesis and loses its transmembrane domain. Meprin beta forms homodimers or heterotetramers while meprin alpha oligomerizes into large complexes containing 10-100 subunits. Both alpha and beta subunits contain a catalytic astacin (M12 family) protease domain followed by the adhesion or interaction domains MAM, MATH and AM. The MATH and MAM domains provide symmetrical intersubunit disulfide bonds necessary for the dimerization of meprin subunits. The MATH domain may also be required for folding of an activable zymogen.	167
239741	cd03772	MATH_HAUSP	Herpesvirus-associated ubiquitin-specific protease (HAUSP, also known as USP7) family, N-terminal MATH (TRAF-like) domain; composed of proteins similar to human HAUSP, an enzyme that specifically catalyzes the deubiquitylation of p53 and MDM2, hence playing an important role in the p53-MDM2 pathway. It contains an N-terminal TRAF-like domain and a C-terminal catalytic protease (C19 family) domain. The tumor suppressor p53 protein is a transcription factor that responds to many cellular stress signals and is regulated primarily through ubiquitylation and subsequent degradation. MDM2 is a RING-finger E3 ubiquitin ligase that promotes p53 ubiquitinylation. p53 and MDM2 bind to the same site in the N-terminal TRAF-like domain of HAUSP in a mutually exclusive manner. HAUSP also interacts with the Epstein-Barr nuclear antigen 1 (EBNA1) protein of the Epstein-Barr virus (EBV), which efficiently immortalizes infected cells predisposing the host to a variety of cancers. EBNA1 plays several important roles in EBV latent infection and cellular transformation. It binds the same pocket as p53 in the HAUSP TRAF-like domain. Through interactions with p53, MDM2 and EBNA1, HAUSP plays a role in cell proliferation, apoptosis and EBV-mediated immortalization.	137
239742	cd03773	MATH_TRIM37	Tripartite motif containing protein 37 (TRIM37) family, MATH domain; TRIM37 is a peroxisomal protein and is a member of the tripartite motif (TRIM) protein subfamily, also known as the RING-B-box-coiled-coil (RBCC) subfamily of zinc-finger proteins. Mutations in the human TRIM37 gene (also known as MUL) cause Mulibrey (muscle-liver-brain-eye) nanism, a rare growth disorder of prenatal onset characterized by dysmorphic features, pericardial constriction and hepatomegaly. TRIM37, similar to other TRIMs, contains a cysteine-rich, zinc-binding RING-finger domain followed by another cysteine-rich zinc-binding domain, the B-box, and a coiled-coil domain. TRIM37 is autoubiquitinated in a RING domain-dependent manner, indicating that it functions as an ubiquitin E3 ligase. In addition to the tripartite motif, TRIM37 also contains a MATH domain C-terminal to the coiled-coil domain. The MATH domain of TRIM37 has been shown to interact with the TRAF domain of six known TRAFs in vitro; however, it is unclear whether this is physiologically relevant. Eleven TRIM37 mutations have been associated with Mulibrey nanism so far. One mutation, Gly322Val, is located in the MATH domain and is the only mutation that does not affect the length of the protein. It results in the incorrect subcellular localization of TRIM37.	132
239743	cd03774	MATH_SPOP	Speckle-type POZ protein (SPOP) family, MATH domain; composed of proteins with similarity to human SPOP. SPOP was isolated as a novel antigen recognized by serum from a scleroderma patient, whose overexpression in COS cells results in a discrete speckled pattern in the nuclei. It contains an N-terminal MATH domain and a C-terminal BTB (also called POZ) domain. Together with Cul3, SPOP constitutes an ubiquitin E3 ligase which is able to ubiquitinate the PcG protein BMI1, the variant histone macroH2A1 and the death domain-associated protein Daxx. Therefore, SPOP may be involved in the regulation of these proteins and may play a role in transcriptional regulation, apoptosis and X-chromosome inactivation. Cul3 binds to the BTB domain of SPOP whereas Daxx and the macroH2A1 nonhistone region have been shown to bind to the MATH domain. Both MATH and BTB domains are necessary for the nuclear speckled accumulation of SPOP. There are many proteins, mostly uncharacterized, containing both MATH and BTB domains from C. elegans and plants which are excluded from this family.	139
239744	cd03775	MATH_Ubp21p	Ubiquitin-specific protease 21 (Ubp21p) family, MATH domain; composed of fungal proteins with similarity to Ubp21p of fission yeast. Ubp21p is a deubiquitinating enzyme that may be involved in the regulation of the protein kinase Prp4p, which controls the formation of active spliceosomes. Members of this family are similar to human HAUSP (Herpesvirus-associated ubiquitin-specific protease) in that they contain an N-terminal MATH domain and a C-terminal catalytic protease (C19 family) domain. HAUSP is also an ubiquitin-specific protease that specifically catalyzes the deubiquitylation of p53 and MDM2. The MATH domain of HAUSP contains the binding site for p53 and MDM2. Similarly, the MATH domain of members in this family may be involved in substrate binding.	134
239745	cd03776	MATH_TRAF6	Tumor Necrosis Factor Receptor (TNFR)-Associated Factor (TRAF) family, TRAF6 subfamily, TRAF domain, C-terminal MATH subdomain; composed of proteins with similarity to human TRAF6, including the Drosophila protein DTRAF2. TRAF molecules serve as adapter proteins that link TNFRs and downstream kinase cascades resulting in the activation of transcription factors and the regulation of cell survival, proliferation and stress responses. TRAF6 is the most divergent in its TRAF domain among the mammalian TRAFs. In addition to mediating TNFR family signaling, it is also an essential signaling molecule of the interleukin-1/Toll-like receptor superfamily. Whereas other TRAF molecules display similar and overlapping TNFR-binding specificities, TRAF6 binds completely different sites on receptors such as CD40 and RANK. TRAF6 serves as a molecular bridge between innate and adaptive immunity and plays a central role in osteoimmunology. DTRAF2, as an activator of nuclear factor-kappaB, plays a pivotal role in Drosophila development and innate immunity. TRAF6 contains a RING finger domain, five zinc finger domains, and a TRAF domain. The TRAF domain can be divided into a more divergent N-terminal alpha helical region (TRAF-N), and a highly conserved C-terminal MATH subdomain (TRAF-C) with an eight-stranded beta-sandwich structure. TRAF-N mediates trimerization while TRAF-C interacts with receptors.	147
239746	cd03777	MATH_TRAF3	Tumor Necrosis Factor Receptor (TNFR)-Associated Factor (TRAF) family, TRAF3 subfamily, TRAF domain; TRAF molecules serve as adapter proteins that link TNFRs and downstream kinase cascades resulting in the activation of transcription factors and the regulation of cell survival, proliferation and stress responses. TRAF3 was first described as a molecule that binds the cytoplasmic tail of CD40. However, it is not required for CD40 signaling. More recently, TRAF3 has been identified as a key regulator of type I interferon (IFN) production and the mammalian innate antiviral immunity. It mediates IFN responses in Toll-like receptor (TLR)-dependent as well as TLR-independent viral recognition pathways. It is also a key element in immunological homeostasis through its regulation of the anti-inflammatory cytokine interleukin-10. TRAF3 contains a RING finger domain, five zinc finger domains, and a TRAF domain. The TRAF domain can be divided into a more divergent N-terminal alpha helical region (TRAF-N), and a highly conserved C-terminal MATH subdomain (TRAF-C) with an eight-stranded beta-sandwich structure. TRAF-N mediates trimerization while TRAF-C interacts with receptors.	186
239747	cd03778	MATH_TRAF2	Tumor Necrosis Factor Receptor (TNFR) Associated Factor (TRAF) family, TRAF2 subfamily, TRAF domain; TRAF molecules serve as adapter proteins that link TNFRs and downstream kinase cascades resulting in the activation of transcription factors and the regulation of cell survival, proliferation and stress responses. TRAF2 associates with the receptors TNFR-1, TNFR-2, RANK (which mediates differentiation and maturation of osteoclasts) and CD40 (which is important for the proliferation and activation of B cells), among others. It regulates distinct pathways that lead to the activation of nuclear factor-kappaB and Jun NH2-terminal kinases. TRAF2 also indirectly associates with death receptors through its interaction with TRADD (TNFR-associated death domain protein). It is involved in regulating oxidative stress or ROS-induced cell death and in the preconditioning of cells by sublethal stress for protection from subsequent injury. TRAF2 contains a RING finger domain, five zinc finger domains, and a TRAF domain. The TRAF domain can be divided into a more divergent N-terminal alpha helical region (TRAF-N), and a highly conserved C-terminal MATH subdomain (TRAF-C) with an eight-stranded beta-sandwich structure. TRAF-N mediates trimerization while TRAF-C interacts with receptors.	164
239748	cd03779	MATH_TRAF1	Tumor Necrosis Factor Receptor (TNFR) Associated Factor (TRAF) family, TRAF1 subfamily, TRAF domain, C-terminal MATH subdomain; TRAF molecules serve as adapter proteins that link TNFRs and downstream kinase cascades resulting in the activation of transcription factors and the regulation of cell survival, proliferation and stress responses. TRAF1 expression is the most restricted among the TRAFs. It is found exclusively in activated lymphocytes, dendritic cells and certain epithelia. TRAF1 associates, directly or indirectly through heterodimerization with TRAF2, with the TNFR family receptors TNFR-2, CD30, RANK, CD40 and LMP1, among others. It also binds the intracellular proteins TRADD, TANK, TRIP, RIP1, RIP2 and FLIP. TRAF1 is unique among the TRAFs in that it lacks a RING domain, which is critical for the activation of  nuclear factor-kappaB and Jun NH2-terminal kinase. Studies on TRAF1-deficient mice suggest that TRAF1 has a negative regulatory role in TNFR-mediated signaling events. TRAF1 contains one zinc finger and one TRAF domain. The TRAF domain can be divided into a more divergent N-terminal alpha helical region (TRAF-N), and a highly conserved C-terminal MATH subdomain (TRAF-C) with an eight-stranded beta-sandwich structure. TRAF-N mediates trimerization while TRAF-C interacts with receptors.	147
239749	cd03780	MATH_TRAF5	Tumor Necrosis Factor Receptor (TNFR)-Associated Factor (TRAF) family, TRAF5 subfamily, TRAF domain, C-terminal MATH subdomain; TRAF molecules serve as adapter proteins that link TNFRs and downstream kinase cascades resulting in the activation of transcription factors and the regulation of cell survival, proliferation and stress responses. TRAF5 was identified as an activator of nuclear factor-kappaB and a regulator of lymphotoxin-beta receptor and CD40 signaling. Its interaction with CD40 is indirect, involving hetero-oligomerization with TRAF3. In addition, TRAF5 has been shown to associate with other TNFRs including CD27, CD30, OX40 and GITR (glucocorticoid-induced TNFR). It plays a role in modulating Th2 immune responses (driven by OX40 costimulation) and T-cell activation (triggered by GITR). It is also involved in osteoclastogenesis. TRAF5 contains a RING finger domain, five zinc finger domains, and a TRAF domain. The TRAF domain can be divided into a more divergent N-terminal alpha helical region (TRAF-N), and a highly conserved C-terminal MATH subdomain (TRAF-C) with an eight-stranded beta-sandwich structure. TRAF-N mediates trimerization while TRAF-C interacts with receptors.	148
239750	cd03781	MATH_TRAF4	Tumor Necrosis Factor Receptor (TNFR)-Associated Factor (TRAF) family, TRAF4 subfamily, TRAF domain, C-terminal MATH subdomain; composed of proteins with similarity to human TRAF4, including the Drosophila protein DTRAF1. TRAF molecules serve as adapter proteins that link TNFRs and downstream kinase cascades resulting in the activation of transcription factors and the regulation of cell survival, proliferation and stress responses. TRAF4 is highly expressed during embryogenesis, especially in the central and peripheral nervous system. Studies using TRAF4-deficient mice show that TRAF4 is required for neurogenesis, as well as the development of the trachea and the axial skeleton. In addition, TRAF4 augments nuclear factor-kappaB activation triggered by GITR (glucocorticoid-induced TNFR), a receptor expressed in T-cells, B-cells and macrophages. It also participates in counteracting the signaling mediated by Toll-like receptors through its association with TRAF6 and TRIF. DTRAF1 plays a pivotal role in the development of eye imaginal discs and photosensory neuron arrays in Drosophila. TRAF4 contains a RING finger domain, seven zinc finger domains, and a TRAF domain. The TRAF domain can be divided into a more divergent N-terminal alpha helical region (TRAF-N), and a highly conserved C-terminal MATH subdomain (TRAF-C) with an eight-stranded beta-sandwich structure. TRAF-N mediates trimerization while TRAF-C interacts with receptors.	154
239751	cd03782	MATH_Meprin_Beta	Meprin family, Beta subunit, MATH domain; Meprins are multidomain extracellular metalloproteases capable of cleaving growth factors, cytokines, extracellular matrix proteins, and biologically active peptides. They are composed of two related subunits, alpha and beta, which form homo- or hetero-complexes where the basic unit is a disulfide-linked dimer. The beta subunit is a type I membrane protein, which forms homodimers or heterotetramers (alpha2beta2 or alpha3beta). Meprin beta shows preference for acidic residues at the P1 and P1' sites of its substrate. Among its best substrates are growth factors and chemokines such as gastrin and osteopontin. Both alpha and beta subunits contain a catalytic astacin (M12 family) protease domain followed by the adhesion or interaction domains MAM, MATH and AM. The MATH and MAM domains provide symmetrical intersubunit disulfide bonds necessary for the dimerization of meprin subunits. The MATH domain may also be required for folding of an activable zymogen.	167
239752	cd03783	MATH_Meprin_Alpha	Meprin family, Alpha subunit, MATH domain; Meprins are multidomain extracellular metalloproteases capable of cleaving growth factors, cytokines, extracellular matrix proteins, and biologically active peptides. They are composed of two related subunits, alpha and beta, which form homo- or hetero-complexes where the basic unit is a disulfide-linked dimer. The alpha subunit is synthesized as a membrane-spanning protein; however, it is cleaved during biosynthesis and loses its transmembrane domain. It oligomerizes into large complexes, containing 10-100 subunits (dimers that associate noncovalently), which are secreted as latent proteases and can move through extracellular spaces in a nondestructive manner. This allows delivery of the concentrated protease to sites containing activating enzymes, such as sites of inflammation, infection or cancerous growth. Meprin alpha shows preference for small or hydrophobic residues at the P1 and P1' sites of its substrate. Both alpha and beta subunits contain a catalytic astacin (M12 family) protease domain followed by the adhesion or interaction domains MAM, MATH and AM. The MATH and MAM domains provide symmetrical intersubunit disulfide bonds necessary for the dimerization of meprin subunits. The MATH domain may also be required for folding of an activable zymogen.	167
340817	cd03784	GT1_Gtf-like	UDP-glycosyltransferases and similar proteins. This family includes the Gtfs, a group of homologous glycosyltransferases involved in the final stages of the biosynthesis of antibiotics vancomycin and related chloroeremomycin. Gtfs transfer sugar moieties from an activated NDP-sugar donor to the oxidatively cross-linked heptapeptide core of vancomycin group antibiotics. The core structure is important for the bioactivity of the antibiotics.	404
340818	cd03785	GT28_MurG	undecaprenyldiphospho-muramoylpentapeptide beta-N-acetylglucosaminyltransferase. MurG (EC 2.4.1.227) is an N-acetylglucosaminyltransferase, the last enzyme involved in the intracellular phase of peptidoglycan biosynthesis. It transfers N-acetyl-D-glucosamine (GlcNAc) from UDP-GlcNAc to the C4 hydroxyl of a lipid-linked N-acetylmuramoyl pentapeptide (NAM). The resulting disaccharide is then transported across the cell membrane, where it is polymerized into NAG-NAM cell-wall repeat structure. MurG belongs to the GT-B structural superfamily of glycosyltransferases, which have characteristic N- and C-terminal domains, each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility.	350
340819	cd03786	GTB_UDP-GlcNAc_2-Epimerase	UDP-N-acetylglucosamine 2-epimerase and similar proteins. Bacterial members of the UDP-N-Acetylglucosamine (GlcNAc) 2-Epimerase family (EC 5.1.3.14) are known to catalyze the reversible interconversion of UDP-GlcNAc and UDP-N-acetylmannosamine (UDP-ManNAc). The enzyme serves to produce an activated form of ManNAc residues (UDP-ManNAc) for use in the biosynthesis of a variety of cell surface polysaccharides. The mammalian enzyme is bifunctional, catalyzing both the inversion of stereochemistry at C-2 and the hydrolysis of the UDP-sugar linkage to generate free ManNAc. It also catalyzes the phosphorylation of ManNAc to generate ManNAc 6-phosphate, a precursor to sialic acids. In mammals, sialic acids are found at the termini of oligosaccharides in a large variety of cell surface glycoconjugates and are key mediators of cell-cell recognition events. Mutations in human members of this family have been associated with Sialuria, a rare disease caused by disorders of sialic acid metabolism. This family belongs to the GT-B structural superfamily of glycosyltransferases, which have characteristic N- and C-terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility.	365
340820	cd03788	GT20_TPS	trehalose-6-phosphate synthase. Trehalose-6-Phosphate Synthase (TPS, EC 2.4.1.15) is a glycosyltransferase that catalyses the synthesis of alpha,alpha-1,1-trehalose-6-phosphate from glucose-6-phosphate using a UDP-glucose donor. It is a key enzyme in the trehalose synthesis pathway. Trehalose is a nonreducing disaccharide present in a wide variety of organisms and may serve as a source of energy and carbon. It is characterized most notably in insect, plant, and microbial cells. Its production is often associated with a variety of stress conditions, including desiccation, dehydration, heat, cold, and oxidation. This family represents the catalytic domain of the TPS. Some members of this domain family coexist with a C-terminal trehalose phosphatase domain.	463
340821	cd03789	GT9_LPS_heptosyltransferase	lipopolysaccharide heptosyltransferase and similar proteins. Lipopolysaccharide heptosyltransferase (2.4.99.B6) is involved in the biosynthesis of lipooligosaccharide (LOS). Lipopolysaccharide (LPS) is a major component of the outer membrane of gram-negative bacteria. LPS heptosyltransferase transfers heptose molecules from ADP-heptose to 3-deoxy-D-manno-octulosonic acid (KDO), a part of the inner core component of LPS. This family also contains lipopolysaccharide 1,2-N-acetylglucosaminetransferase EC 2.4.1.56 and belongs to the GT-B structural superfamily of glycosyltransferases, which have characteristic N- and C-terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility.	277
340822	cd03791	GT5_Glycogen_synthase_DULL1-like	Glycogen synthase GlgA and similar proteins. This family is most closely related to the GT5 family of glycosyltransferases. Glycogen synthase (EC:2.4.1.21) catalyzes the formation and elongation of the alpha-1,4-glucose backbone using ADP-glucose, the second and key step of glycogen biosynthesis. This family includes starch synthases of plants, such as DULL1 in Zea mays and glycogen synthases of various organisms.	474
340823	cd03792	GT4_trehalose_phosphorylase	trehalose phosphorylase and similar proteins. Trehalose phosphorylase (TP) reversibly catalyzes trehalose synthesis and degradation from alpha-glucose-1-phosphate (alpha-Glc-1-P) and glucose. The catalyzing activity includes the phosphorolysis of trehalose, which produces alpha-Glc-1-P and glucose, and the subsequent synthesis of trehalose. This family is most closely related to the GT4 family of glycosyltransferases.	378
340824	cd03793	GT3_GSY2-like	glycogen synthase GSY2 and similar proteins. Glycogen synthase, which is most closely related to the GT3 family of glycosyltransferases, catalyzes the transfer of a glucose molecule from UDP-glucose to a terminal branch of a glycogen molecule, a rate-limiting step of glycogen biosynthesis. GSY2, the member of this family in S. cerevisiae, has been shown to possess glycogen synthase activity.	590
340825	cd03794	GT4_WbuB-like	Escherichia coli WbuB and similar proteins. This family is most closely related to the GT1 family of glycosyltransferases. WbuB in E. coli is involved in the biosynthesis of the O26 O-antigen.  It has been proposed to function as an N-acetyl-L-fucosamine (L-FucNAc) transferase.	391
340826	cd03795	GT4_WfcD-like	Escherichia coli alpha-1,3-mannosyltransferase WfcD and similar proteins. This family is most closely related to the GT4 family of glycosyltransferases. Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. This group of glycosyltransferases is most closely related to the previously defined glycosyltransferase family 1 (GT1). The members of this family may transfer UDP, ADP, GDP, or CMP-linked sugars. The diverse enzymatic activities among members of this family reflect a wide range of biological functions. The protein structure available for this family has the GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility. The members of this family are found mainly in bacteria and eukaryotes.	355
340827	cd03796	GT4_PIG-A-like	phosphatidylinositol N-acetylglucosaminyltransferase subunit A and similar proteins. This family is most closely related to the GT4 family of glycosyltransferases. Phosphatidylinositol glycan-class A (PIG-A), an X-linked gene in humans, is necessary for the synthesis of N-acetylglucosaminyl-phosphatidylinositol, a very early intermediate in glycosyl phosphatidylinositol (GPI)-anchor biosynthesis. The GPI-anchor is an important cellular structure that facilitates the attachment of many proteins to cell surfaces. Somatic mutations in PIG-A have been associated with Paroxysmal Nocturnal Hemoglobinuria (PNH), an acquired hematological disorder.	398
340828	cd03798	GT4_WlbH-like	Bordetella parapertussis WlbH and similar proteins. This family is most closely related to the GT4 family of glycosyltransferases. Staphylococcus aureus CapJ may be involved in capsule polysaccharide biosynthesis. WlbH in Bordetella parapertussis has been shown to be required for the biosynthesis of a trisaccharide that, when attached to the B. pertussis lipopolysaccharide (LPS) core (band B), generates band A LPS.	376
340829	cd03799	GT4_AmsK-like	Erwinia amylovora AmsK and similar proteins. This is a family of GT4 glycosyltransferases found specifically in certain bacteria. AmsK in Erwinia amylovora has been reported to be involved in the biosynthesis of amylovoran, an exopolysaccharide acting as a virulence factor.	350
340830	cd03800	GT4_sucrose_synthase	sucrose-phosphate synthase and similar proteins. This family is most closely related to the GT4 family of glycosyltransferases. The sucrose-phosphate synthases in this family may be unique to plants and photosynthetic bacteria. This enzyme catalyzes the synthesis of sucrose 6-phosphate from fructose 6-phosphate and uridine 5'-diphosphate-glucose, a key regulatory step of sucrose metabolism. The activity of this enzyme is regulated by phosphorylation and moderated by the concentration of various metabolites and light.	398
340831	cd03801	GT4_PimA-like	phosphatidyl-myo-inositol mannosyltransferase. This family is most closely related to the GT4 family of glycosyltransferases and named after PimA in Propionibacterium freudenreichii, which is involved in the biosynthesis of phosphatidyl-myo-inositol mannosides (PIM) which are early precursors in the biosynthesis of lipomannans (LM) and lipoarabinomannans (LAM), and catalyzes the addition of a mannosyl residue from GDP-D-mannose (GDP-Man) to the position 2 of the carrier lipid phosphatidyl-myo-inositol (PI) to generate a phosphatidyl-myo-inositol bearing an alpha-1,2-linked mannose residue (PIM1). Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. This group of glycosyltransferases is most closely related to the previously defined glycosyltransferase family 1 (GT1). The members of this family may transfer UDP, ADP, GDP, or CMP linked sugars. The diverse enzymatic activities among members of this family reflect a wide range of biological functions. The protein structure available for this family has the GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility. The members of this family are found mainly in certain bacteria and archaea.	366
340832	cd03802	GT4_AviGT4-like	UDP-Glc:tetrahydrobiopterin alpha-glucosyltransferase and similar proteins. This family is most closely related to the GT4 family of glycosyltransferases. aviGT4 in Streptomyces viridochromogenes has been shown to be involved in biosynthesis of oligosaccharide antibiotic avilamycin A. Inactivation of aviGT4 resulted in a mutant that accumulated a novel avilamycin derivative lacking the terminal eurekanate residue.	333
340833	cd03804	GT4_WbaZ-like	mannosyltransferase WbaZ and similar proteins. This family is most closely related to the GT4 family of glycosyltransferases. WbaZ in Salmonella enterica has been shown to possess mannosyltransferase activity.	356
340834	cd03805	GT4_ALG2-like	alpha-1,3/1,6-mannosyltransferase ALG2 and similar proteins. This family is most closely related to the GT4 family of glycosyltransferases.  ALG2, a 1,3-mannosyltransferase, in yeast catalyzes the mannosylation of Man(2)GlcNAc(2)-dolichol diphosphate and Man(1)GlcNAc(2)-dolichol diphosphate to form Man(3)GlcNAc(2)-dolichol diphosphate. A deficiency of this enzyme causes an abnormal accumulation of Man1GlcNAc2-PP-dolichol and Man2GlcNAc2-PP-dolichol, which is associated with a type of congenital disorders of glycosylation (CDG), designated CDG-Ii, in humans.	392
340835	cd03806	GT4_ALG11-like	alpha-1,2-mannosyltransferase ALG11 and similar proteins. This family is most closely related to the GT4 family of glycosyltransferases. ALG11 in yeast is involved in adding the final 1,2-linked Man to the Man5GlcNAc2-PP-Dol synthesized on the cytosolic face of the ER. The deletion analysis of ALG11 was shown to block the early steps of core biosynthesis that takes place on the cytoplasmic face of the ER and lead to a defect in the assembly of lipid-linked oligosaccharides.	419
340836	cd03807	GT4_WbnK-like	Shigella dysenteriae WbnK and similar proteins. This family is most closely related to the GT4 family of glycosyltransferases. WbnK in Shigella dysenteriae has been shown to be involved in the type 7 O-antigen biosynthesis.	362
340837	cd03808	GT4_CapM-like	capsular polysaccharide biosynthesis glycosyltransferase CapM and similar proteins. This family is most closely related to the GT4 family of glycosyltransferases. CapM in Staphylococcus aureus is required for the synthesis of type 1 capsular polysaccharides.	358
340838	cd03809	GT4_MtfB-like	glycosyltransferases MtfB, WbpX, and similar proteins. This family is most closely related to the GT4 family of glycosyltransferases. MtfB (mannosyltransferase B) in E. coli has been shown to direct the growth of the O9-specific polysaccharide chain. It transfers two mannoses into the position 3 of the previously synthesized polysaccharide.	362
340839	cd03811	GT4_GT28_WabH-like	family 4 and family 28 glycosyltransferases similar to Klebsiella WabH. This family is most closely related to the GT1 family of glycosyltransferases. WabH in Klebsiella pneumoniae has been shown to transfer a GlcNAc residue from UDP-GlcNAc onto the acceptor GalUA residue in the cellular outer core.	351
340840	cd03812	GT4_CapH-like	capsular polysaccharide biosynthesis glycosyltransferase CapH and similar proteins. This family is most closely related to the GT4 family of glycosyltransferases. capH in Staphylococcus aureus has been shown to be required for the biosynthesis of the type 1 capsular polysaccharide (CP1).	357
340841	cd03813	GT4-like	glycosyltransferase family 4 proteins. This family is most closely related to the GT4 family of glycosyltransferases. Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. This group of glycosyltransferases is most closely related to the previously defined glycosyltransferase family 1 (GT1). The members of this family may transfer UDP, ADP, GDP, or CMP linked sugars. The diverse enzymatic activities among members of this family reflect a wide range of biological functions. The protein structure available for this family has the GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility. The members of this family are found mainly in bacteria, while some of them are also found in Archaea and eukaryotes.	474
340842	cd03814	GT4-like	glycosyltransferase family 4 proteins. This family is most closely related to the GT4 family of glycosyltransferases and includes a sequence annotated as alpha-D-mannose-alpha(1-6)phosphatidyl myo-inositol monomannoside transferase from Bacillus halodurans. Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. This group of glycosyltransferases is most closely related to the previously defined glycosyltransferase family 1 (GT1). The members of this family may transfer UDP, ADP, GDP, or CMP linked sugars. The diverse enzymatic activities among members of this family reflect a wide range of biological functions. The protein structure available for this family has the GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility. The members of this family are found mainly in bacteria and eukaryotes.	365
340843	cd03816	GT33_ALG1-like	chitobiosyldiphosphodolichol beta-mannosyltransferase and similar proteins. This family is most closely related to the GT33 family of glycosyltransferases. The yeast gene ALG1 has been shown to function as a mannosyltransferase that catalyzes the formation of dolichol pyrophosphate (Dol-PP)-GlcNAc2Man from GDP-Man and Dol-PP-Glc-NAc2, and participates in the formation of the lipid-linked precursor oligosaccharide for N-glycosylation. In humans ALG1 has been associated with the congenital disorders of glycosylation (CDG) designated as subtype CDG-Ik.	411
340844	cd03817	GT4_UGDG-like	UDP-Glc:1,2-diacylglycerol 3-a-glucosyltransferase and similar proteins. This family is most closely related to the GT1 family of glycosyltransferases. UDP-glucose-diacylglycerol glucosyltransferase (EC 2.4.1.337, UGDG; also known as 1,2-diacylglycerol 3-glucosyltransferase) catalyzes the transfer of glucose from UDP-glucose to 1,2-diacylglycerol forming 3-D-glucosyl-1,2-diacylglycerol.	372
340845	cd03818	GT4_ExpC-like	Rhizobium meliloti ExpC and similar proteins. This family is most closely related to the GT4 family of glycosyltransferases. ExpC in Rhizobium meliloti has been shown to be involved in the biosynthesis of galactoglucan (exopolysaccharide II).	396
340846	cd03819	GT4_WavL-like	Vibrio cholerae WavL and similar sequences. This family is most closely related to the GT4 family of glycosyltransferases. WavL in Vibrio cholerae has been shown to be involved in the biosynthesis of the lipopolysaccharide core.	345
340847	cd03820	GT4_AmsD-like	amylovoran biosynthesis glycosyltransferase AmsD and similar proteins. This family is most closely related to the GT4 family of glycosyltransferases. AmsD in Erwinia amylovora has been shown to be involved in the biosynthesis of amylovoran, the acidic exopolysaccharide acting as a virulence factor. This enzyme may be responsible for the formation of galactose alpha-1,6 linkages in amylovoran.	351
340848	cd03821	GT4_Bme6-like	Brucella melitensis Bme6 and similar proteins. This family is most closely related to the GT4 family of glycosyltransferases. Bme6 in Brucella melitensis has been shown to be involved in the biosynthesis of a polysaccharide.	377
340849	cd03822	GT4_mannosyltransferase-like	mannosyltransferases of glycosyltransferase family 4 and similar proteins. This family is most closely related to the GT1 family of glycosyltransferases. ORF704 in E. coli has been shown to be involved in the biosynthesis of O-specific mannose homopolysaccharides.	370
340850	cd03823	GT4_ExpE7-like	glycosyltransferase ExpE7 and similar proteins. This family is most closely related to the GT4 family of glycosyltransferases. ExpE7 in Sinorhizobium meliloti has been shown to be involved in the biosynthesis of galactoglucans (exopolysaccharide II).	357
340851	cd03825	GT4_WcaC-like	putative colanic acid biosynthesis glycosyl transferase WcaC and similar proteins. This family is most closely related to the GT4 family of glycosyltransferases. Escherichia coli WcaC has been predicted to function in colanic acid biosynthesis. WcfI in Bacteroides fragilis has been shown to be involved in the capsular polysaccharide biosynthesis.	364
239753	cd03829	Sina	Seven in absentia (Sina) protein family, C-terminal substrate binding domain; composed of the Drosophila Sina protein, the mammalian Sina homolog (Siah), the plant protein SINAT5, and similar proteins. Sina, Siah and SINAT5 are RING-containing proteins that function as E3 ubiquitin ligases, acting either as single proteins or as a part of multiprotein complexes. Sina is expressed in many cells in the developing eye but is essential specifically for R7 photoreceptor cell development. Sina cooperates with Phyllopod (Phyl), Ebi and the E2 ubiquitin-conjugating enzyme Ubcd1 to catalyze the ubiquitination and subsequent degradation of Tramtrack (Ttk88); Ttk88 is a transcriptional repressor that blocks photoreceptor differentiation. Similarly, the mammalian homologue Siah1 cooperates with SIP (Siah-interacting protein), Ebi and the adaptor protein Skp1, to target beta-catenin for ubiquitination and degradation via a p53-dependent mechanism. SINAT5 targets NAC1 for ubiquitin-mediated degradation resulting in the downregulation of auxin, a hormone that controls many aspects of plant development. Other targets of Sina family proteins include c-Myb, synaptophysin, group 1 glutamate receptors, promyelocytic leukemia protein, alpha-synuclein, synphilin-1 and alpha-ketoglutarate dehydrogenase, among others. Sina proteins also bind proteins that are not targets for ubiquitination such as Phyl, adenomatous polyposis coli, VAV, BAG-1 and Dab-1. Siah binds to a consensus motif, PXAXVXP, which is present in Siah-binding proteins. Siah is a dimeric protein consisting of an N-terminal RING domain, two zinc finger motifs and a C-terminal substrate-binding domain (SBD); this SBD contains an eight-stranded antiparallel beta-sandwich fold similar to the MATH (meprin and TRAF-C homology) domain.	127
349428	cd03855	M14_ASTE	Peptidase M14 Succinylglutamate desuccinylase (ASTE) subfamily. Peptidase M14 Succinylglutamate desuccinylase (ASTE, also known as N-succinyl-L-glutamate amidohydrolase, N2-succinylglutamate desuccinylase, and SGDS; EC 3.5.1.96) belongs to the Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA) subfamily of the M14 family of metallocarboxypeptidases. This group includes succinylglutamate desuccinylase that catalyzes the fifth and last step in arginine catabolism by the arginine succinyltransferase pathway. It hydrolyzes N-succinyl-L-glutamate to succinate and L-glutamate.	239
349429	cd03856	M14_Nna1-like	Peptidase M14-like domain of ATP/GTP binding proteins, cytosolic carboxypeptidases and related proteins. Peptidase M14-like domain of Nna-1 (Nervous system Nuclear protein induced by Axotomy), also known as ATP/GTP binding protein (AGTPBP-1) and cytosolic carboxypeptidase (CCP), and related proteins. The Peptidase M14 family of metallocarboxypeptidases are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. This subfamily includes the human AGTPBP-1 and AGBL -2, -3, -4, and -5, and the mouse Nna1/CCP-1 and CCP -2 through -6. Nna1-like proteins are active metallopeptidases that are thought to act on cytosolic proteins such as alpha-tubulin, to remove a C-terminal tyrosine. Nna1 is widely expressed in the developing and adult nervous systems, including cerebellar Purkinje and granule neurons, mitral cells of the olfactory bulb and retinal photoreceptors. Nna1 is also induced in axotomized motor neurons. Mutations in Nna1 cause Purkinje cell degeneration (pcd). The Nna1 CP domain is required to prevent the retinal photoreceptor loss and cerebellar ataxia phenotypes of pcd mice, and a functional zinc-binding domain is needed for Nna-1 to support neuron survival in these mice. Nna1-like proteins from the different phyla are highly diverse, but they all contain a characteristic N-terminal conserved domain right before the CP domain. It has been suggested that this N-terminal domain might act as a folding domain.	252
349430	cd03857	M14-like	Peptidase M14-like domain; uncharacterized subfamily. Peptidase M14-like domain of a functionally uncharacterized subgroup of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavage. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily is that of succinylglutamate desuccinylase/aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism.	203
349431	cd03858	M14_CP_N-E_like	Peptidase M14 carboxypeptidase subfamily N/E-like. Carboxypeptidase (CP) N/E-like subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. The N/E subfamily includes eight members, of which five (CPN, CPE, CPM, CPD, CPZ) are considered enzymatically active, while the other three are non-active (CPX1, CPX2, ACLP/AEBP1) and lack the critical active site and substrate-binding residues considered necessary for CP activity. These non-active members may function as binding proteins or display catalytic activity towards other substrates. Unlike the A/B CP subfamily, enzymes belonging to the N/E subfamily are not produced as inactive precursors that require proteolysis to produce the active form; rather, they rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavages that would otherwise damage the cell. In addition, all members of the N/E subfamily contain an extra C-terminal domain that is not present in the A/B subfamily. This domain has structural homology to transthyretin and other proteins and has been proposed to function as a folding domain. The active N/E enzymes fulfill a variety of cellular functions, including prohormone processing, regulation of peptide hormone activity, alteration of protein-protein or protein-cell interactions and transcriptional regulation.	292
349432	cd03859	M14_CPT	Peptidase M14 Carboxypeptidase T subfamily. Peptidase M14-like domain of carboxypeptidase (CP) T (CPT), CPT belongs to the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPT has moderate similarity to CPA and CPB, and exhibits dual-substrate specificity by cleaving C-terminal hydrophobic amino acid residues like CPA and C-terminal positively charged residues like CPB. CPA and CPB are M14 family peptidases but do not belong to this CPT group. The substrate specificity difference between CPT and CPA and CPB is ascribed to a few amino acid substitutions at the substrate-binding pocket while the spatial organization of the binding site remains the same as in all Zn-CPs. CPT has increased thermal stability in presence of Ca2+ ions, and two disulfide bridges which give an additional stabilization factor.	292
349433	cd03860	M14_CP_A-B_like	Peptidase M14 carboxypeptidase subfamily A/B-like. The Peptidase M14 Carboxypeptidase (CP) A/B subfamily is one of two main M14 CP subfamilies defined by sequence and structural homology, the other being the N/E subfamily. CPs hydrolyze single, C-terminal amino acids from polypeptide chains. They have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing a preceding signal peptide, followed by a globular N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity: carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. There are nine members in the A/B family: CPA1, CPA2, CPA3, CPA4, CPA5, CPA6, CPB, CPO and CPU. CPA1, CPA2 and CPB are produced by the pancreas. The A forms have slightly different specificities, with CPA1 preferring aliphatic and small aromatic residues, and CPA2 preferring the bulkier aromatic side chains. CPA3 is found in secretory granules of mast cells and functions in inflammatory processes. CPA4 is detected in hormone-regulated tissues, and is thought to play a role in prostate cancer. CPA5 is present in discrete regions of pituitary and other tissues, and cleaves aliphatic C-terminal residues. CPA6 is highly expressed in embryonic brain and optic muscle, suggesting that it may play a specific role in cell migration and axonal guidance. CPU (also called CPB2) is produced and secreted by the liver as the inactive precursor, PCPU, commonly referred to as thrombin-activatable fibrinolysis inhibitor (TAFI). Little is known about CPO but it has been suggested to have specificity for acidic residues.	300
349434	cd03862	M14-like	Peptidase M14-like domain; uncharacterized subfamily. A functionally uncharacterized subgroup of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing a preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity: carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavages. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily is that of succinylglutamate desuccinylase/aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism.	245
349435	cd03863	M14_CPD_II	Peptidase M14 carboxypeptidase subfamily N/E-like; Carboxypeptidase D, domain II subgroup. The second carboxypeptidase (CP)-like domain of Carboxypeptidase D (CPD; EC 3.4.17.22), domain II. CPD differs from all other metallocarboxypeptidases in that it contains multiple CP-like domains. CPD belongs to the N/E-like subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPD is a single-chain protein containing a signal peptide, three tandem repeats of CP-like domains separated by short bridge regions, followed by a transmembrane domain, and a C-terminal cytosolic tail. The first two CP-like domains of CPD contain all of the essential active site and substrate-binding residues, while the third CP-like domain lacks critical residues necessary for enzymatic activity and is inactive towards standard CP substrates. Domain I is optimally active at pH 6.3-7.5 and prefers substrates with C-terminal Arg, whereas domain II is active at pH 5.0-6.5 and prefers substrates with C-terminal Lys. CPD functions in the processing of proteins that transit the secretory pathway, and is present in all vertebrates as well as Drosophila. It is broadly distributed in all tissue types. Within cells, CPD is present in the trans-Golgi network and immature secretory vesicles, but is excluded from mature vesicles. It is thought to play a role in the processing of proteins that are initially processed by furin or related endopeptidases present in the trans-Golgi network, such as growth factors and receptors. CPD is implicated in the pathogenesis of lupus erythematosus (LE); it is regulated by TGF-beta in various cell types of murine and human origin and is significantly down-regulated in CD14 positive cells isolated from patients with LE. As down-regulation of CPD leads to down-modulation of TGF-beta, CPD may have a role in a positive feedback loop.	296
349436	cd03864	M14_CPN	Peptidase M14 carboxypeptidase subfamily N/E-like; Carboxypeptidase N subgroup. Peptidase M14 Carboxypeptidase N (CPN, also known as kininase I, creatine kinase conversion factor, plasma carboxypeptidase B, arginine carboxypeptidase, and protaminase; EC 3.4.17.3) is an extracellular glycoprotein synthesized in the liver and released into the blood, where it is present in high concentrations. CPN belongs to the N/E subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPN plays an important role in protecting the body from excessive buildup of potentially deleterious peptides that normally act as local autocrine or paracrine hormones. It specifically removes C-terminal basic residues. As CPN can cleave lysine more avidly than arginine residues it is also called lysine carboxypeptidase. CPN substrates include peptides found in the bloodstream, such as kinins (e.g. bradykinin, kallidin, met-lys-bradykinin), complement anaphylatoxins and creatine kinase MM (CK-MM). By removing just one amino acid, CPN can alter peptide activity and receptor binding. Examples include bradykinin, a nine-residue peptide released from kininogen in response to tissue injury, which is inactivated by CPN; anaphylatoxins, which are regulated by CPN through cleavage and removal of their C-terminal arginines, resulting in a 10-100-fold reduction in their biological activities; and creatine kinase MM, a cytosolic enzyme that catalyzes the reversible transfer of a phosphate group from ATP to creatine and is regulated by CPN through cleavage of C-terminal lysines. As in the other N/E subfamily members, two surface loops surrounding the active-site groove restrict access to the catalytic center, thus preventing larger protein carboxypeptidase inhibitors from inhibiting CPN.	313
349437	cd03865	M14_CPE	Peptidase M14 carboxypeptidase subfamily N/E-like; Carboxypeptidase E subgroup. Peptidase M14 Carboxypeptidase (CP) E (CPE, also known as carboxypeptidase H, and enkephalin convertase; EC 3.4.17.10) belongs to the N/E subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPE is an important enzyme responsible for the proteolytic processing of prohormone intermediates (such as pro-insulin, pro-opiomelanocortin, or pro-gonadotropin-releasing hormone) by specifically removing C-terminal basic residues. In addition, it has been proposed that the regulated secretory pathway (RSP) of the nervous and endocrine systems utilizes membrane-bound CPE as a sorting receptor. A naturally occurring point mutation in CPE reduces the stability of the enzyme and causes its degradation, leading to an accumulation of numerous neuroendocrine peptides that result in obesity and hyperglycemia. Reduced CPE enzyme and receptor activity could underlie abnormal placental phenotypes from the observation that CPE is down-regulated in enlarged placentas of interspecific hybrid (interspecies hybrid placental dysplasia, IHPD) and cloned mice.	319
349438	cd03866	M14_CPM	Peptidase M14 carboxypeptidase subfamily N/E-like; Carboxypeptidase M subgroup. Peptidase M14 Carboxypeptidase (CP) M (CPM) belongs to the N/E subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPM is an extracellular glycoprotein, bound to cell membranes via a glycosyl-phosphatidylinositol on the C-terminus of the protein. It specifically removes C-terminal basic residues such as lysine and arginine from peptides and proteins. The highest levels of CPM have been found in human lung and placenta, but significant amounts are present in kidney, blood vessels, intestine, brain, and peripheral nerves. CPM has also been found in soluble form in various body fluids, including amniotic fluid, seminal plasma and urine. Due to its wide distribution in a variety of tissues, it is believed that it plays an important role in the control of peptide hormones and growth factor activity on the cell surface and in the membrane-localized degradation of extracellular proteins; for example, it hydrolyzes the C-terminal arginine of epidermal growth factor (EGF), resulting in des-Arg-EGF, which binds to the EGF receptor (EGFR) with an equal or greater affinity than native EGF. CPM is a required processing enzyme that generates specific agonists for the B1 receptor.	289
349439	cd03867	M14_CPZ	Peptidase M14 carboxypeptidase subfamily N/E-like; Carboxypeptidase Z subgroup. Peptidase M14-like domain of carboxypeptidase (CP) Z (CPZ); CPZ belongs to the N/E subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPZ is a secreted Zn-dependent enzyme whose biological function is largely unknown. Unlike other members of the N/E subfamily, CPZ has a bipartite structure, which consists of an N-terminal cysteine-rich domain (CRD) whose sequence is similar to Wnt-binding proteins, and a C-terminal CP catalytic domain that removes C-terminal Arg residues from substrates. CPZ is enriched in the extracellular matrix and is widely distributed during early embryogenesis. That the CRD of CPZ can bind to Wnt4 suggests that CPZ plays a role in Wnt signaling.	315
349440	cd03868	M14_CPD_I	Peptidase M14 carboxypeptidase subfamily N/E-like; Carboxypeptidase D, domain I subgroup. The first carboxypeptidase (CP)-like domain of Carboxypeptidase D (CPD; EC 3.4.17.22), domain I. CPD differs from all other metallocarboxypeptidases in that it contains multiple CP-like domains. CPD belongs to the N/E-like subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPD is a single-chain protein containing a signal peptide, three tandem repeats of CP-like domains separated by short bridge regions, followed by a transmembrane domain, and a C-terminal cytosolic tail. The first two CP-like domains of CPD contain all of the essential active site and substrate-binding residues, while the third CP-like domain lacks critical residues necessary for enzymatic activity and is inactive towards standard CP substrates. Domain I is optimally active at pH 6.3-7.5 and prefers substrates with C-terminal Arg, whereas domain II is active at pH 5.0-6.5 and prefers substrates with C-terminal Lys. This Domain I family contains two contiguous surface cysteines that may become palmitoylated and target the enzyme to membranes, thus regulating intracellular trafficking. CPD functions in the processing of proteins that transit the secretory pathway, and is present in all vertebrates as well as Drosophila. It is broadly distributed in all tissue types. Within cells, CPD is present in the trans Golgi network and immature secretory vesicles, but is excluded from mature vesicles. It is thought to play a role in the processing of proteins that are initially processed by furin or related endopeptidases present in the trans Golgi network, such as growth factors and receptors. CPD is implicated in the pathogenesis of lupus erythematosus (LE); it is regulated by TGF-beta in various cell types of murine and human origin and is significantly down-regulated in CD14 positive cells isolated from patients with LE. As down-regulation of CPD leads to down-modulation of TGF-beta, CPD may have a role in a positive feedback loop. In D. melanogaster, the CPD variant 1B short (DmCPD1Bs) is necessary and sufficient for viability of the fruit fly.	294
349441	cd03869	M14_CPX_like	Peptidase M14 carboxypeptidase subfamily N/E-like; Carboxypeptidase X subgroup. Peptidase M14-like domain of carboxypeptidase (CP)-like protein X (CPX); CPX forms a distinct subgroup of the N/E subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Proteins belonging to this subgroup include CP-like protein X1 (CPX1), CP-like protein X2 (CPX2), and aortic CP-like protein (ACLP) and its isoform adipocyte enhancer binding protein-1 (AEBP1). AEBP1 is a truncated form of ACLP, which may arise from alternative splicing of the gene. These proteins are inactive towards standard CP substrates because they lack one or more critical active site and substrate-binding residues that are necessary for activity. They may function as binding proteins rather than as active CPs or display catalytic activity toward other substrates. Proteins in this subgroup also contain an N-terminal discoidin domain. The CP domain is important for the function of AEBP1 as a transcriptional repressor. AEBP1 is involved in several biological processes including adipogenesis, macrophage cholesterol homeostasis, and inflammation. In macrophages, AEBP1 promotes the expression of IL-6, TNF-alpha, MCP-1, and iNOS whose expression is tightly regulated by NF-kappaB activity. ACLP, a secreted protein that associates with the extracellular matrix, is essential for abdominal wall development and contributes to dermal wound healing.	322
349442	cd03870	M14_CPA	Peptidase M14 carboxypeptidase subfamily A/B-like; Carboxypeptidase A subgroup. Peptidase M14 Carboxypeptidase (CP) A (CPA) belongs to the A/B subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPA enzymes generally favor hydrophobic residues. A/B subfamily enzymes are normally synthesized as inactive precursors containing a preceding signal peptide, followed by a globular N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The procarboxypeptidase A (PCPA) is produced by the exocrine pancreas and stored as a stable zymogen in the pancreatic granules until secretion into the digestive tract occurs. This subfamily includes CPA1, CPA2 and CPA4 forms. Within these A forms, there are slightly different specificities, with CPA1 preferring aliphatic and small aromatic residues, and CPA2 preferring the bulkier aromatic side chains. CPA4, detected in hormone-regulated tissues, is thought to play a role in prostate cancer.	301
349443	cd03871	M14_CPB	Peptidase M14 carboxypeptidase subfamily A/B-like; Carboxypeptidase B subgroup. Peptidase M14 Carboxypeptidase B (CPB) belongs to the carboxypeptidase A/B subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Carboxypeptidase B (CPB) enzymes only cleave the basic residues lysine or arginine. A/B subfamily enzymes are normally synthesized as inactive precursors containing a preceding signal peptide, followed by a globular N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The procarboxypeptidase B (PCPB) is produced by the exocrine pancreas and stored as a stable zymogen in the pancreatic granules until secretion into the digestive tract occurs. PCPB has been reported to be a good serum marker for the diagnosis of acute pancreatitis and graft rejection in pancreas transplant recipients. This subfamily also includes thrombin activatable fibrinolysis inhibitor (TAFIa), a carboxypeptidase that stabilizes fibrin clots by removing C-terminal arginines and lysines from partially degraded fibrin. Inhibition of TAFIa stimulates the degradation of fibrin clots and may help in prevention of thrombosis.	300
349444	cd03872	M14_CPA6	Peptidase M14 carboxypeptidase subfamily A/B-like; Carboxypeptidase A6 subgroup. Carboxypeptidase (CP) A6 (CPA6, also known as CPAH; EC 3.4.17.1), belongs to the carboxypeptidase A/B subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPA6 prefers large hydrophobic C-terminal amino acids as well as histidine, while peptides with a penultimate glycine or proline are very poorly cleaved. Several neuropeptides are processed by CPA6, including Met- and Leu-enkephalin, angiotensin I, and neurotensin. CPA6 converts enkephalin and neurotensin into forms known to be inactive toward their receptors, but converts inactive angiotensin I into the biologically active angiotensin II. Thus, CPA6 plays a possible role in the regulation of neuropeptides in the extracellular environment within the olfactory bulb where it is highly expressed. It is also broadly expressed in embryonic tissue, being found in neuronal tissues, bone, skin as well as the lateral rectus eye muscle. A disruption in the CPA6 gene is linked to Duane syndrome, a defect in the abducens nerve/lateral rectus muscle connection.	300
349870	cd03873	Zinc_peptidase_like	Zinc peptidases M18, M20, M28, and M42. Zinc peptidases play vital roles in metabolic and signaling pathways throughout all kingdoms of life. This hierarchy contains zinc peptidases that correspond to the MH clan in the MEROPS database, which contains 4 families (M18, M20, M28, M42). The peptidase M20 family includes carboxypeptidases such as the glutamate carboxypeptidase from Pseudomonas, the thermostable carboxypeptidase Ss1 of broad specificity from archaea and yeast Gly-X carboxypeptidase. The dipeptidases include bacterial dipeptidase, peptidase V (PepV), a non-specific eukaryotic dipeptidase, and two Xaa-His dipeptidases (carnosinases). There is also the bacterial aminopeptidase, peptidase T (PepT) that acts only on tripeptide substrates and has therefore been termed a tripeptidase. Peptidase family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. However, several enzymes in this family utilize other first row transition metal ions such as cobalt and manganese. Each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. The aminopeptidases in this family are also called bacterial leucyl aminopeptidases, but are able to release a variety of N-terminal amino acids. IAP aminopeptidase and aminopeptidase Y preferentially release basic amino acids while glutamate carboxypeptidase II preferentially releases C-terminal glutamates. Glutamate carboxypeptidase II and plasma glutamate carboxypeptidase hydrolyze dipeptides. Peptidase families M18 and M42 contain metallo-aminopeptidases. M18 is widely distributed in bacteria and eukaryotes. However, only yeast aminopeptidase I and mammalian aspartyl aminopeptidase have been characterized in detail. Some M42 (also known as glutamyl aminopeptidase) enzymes exhibit aminopeptidase specificity while others also have acylaminoacyl-peptidase activity (i.e. hydrolysis of acylated N-terminal residues).	200
349871	cd03874	M28_PMSA_TfR_like	M28 Zn-peptidase Transferrin Receptor-like family. Peptidase M28 family; Transferrin Receptor (TfR) and prostate-specific membrane antigen (PSMA, also called glutamate carboxypeptidase or GCP-II) subfamily. TfR and PSMA are homodimeric type II transmembrane proteins containing three distinct domains: protease-like, apical or protease-associated (PA) and helical domains. The protease-like domain is a large extracellular portion (ectodomain). In TfR, it contains a binding site for the transferrin molecule and has 28% identity to membrane glutamate carboxypeptidase II (mGCP-II or PSMA). The PA domain is inserted between the first and second strands of the central beta sheet in the protease-like domain. TfR1 is widely expressed, and is a key player in the uptake of iron-loaded transferrin (Tf) into cells. The TfR1 homodimer binds two molecules of Tf and the complex is then internalized. TfR1 may also participate in cell growth and proliferation. TfR2 binds Tf but with a significantly lower affinity than TfR1. It is expressed chiefly in hepatocytes, hematopoietic cells, and duodenal crypt cells; its expression overlaps with that of hereditary hemochromatosis protein (HFE). TfR2 is involved in iron homeostasis; in humans, mutations in TfR2 are associated with a form of hemochromatosis (HFE3). PSMA is over-expressed predominantly in prostate cancer (PCa) as well as in the neovasculature of most solid tumors, but not in the vasculature of normal tissues. PSMA is considered a biomarker for PCa and possibly for use as an imaging and therapeutic target. The extracellular domain of PSMA possesses two unique enzymatic functions: N-acetylated, alpha-linked acidic dipeptidase (NAALADase) which cleaves terminal glutamate from the neurodipeptide N-acetyl-aspartyl-glutamate (NAAG), and folate hydrolase (FOLH) which cleaves the terminal glutamates from gamma-linked polyglutamates (carboxypeptidase). A mutation in this gene may be associated with impaired intestinal absorption of dietary folates, resulting in low blood folate levels and consequent hyperhomocysteinemia. Expression of this protein in the brain may be involved in a number of pathological conditions associated with glutamate excitotoxicity. This gene likely arose from a duplication event of a nearby chromosomal region. Alternative splicing gives rise to multiple transcript variants. While related in sequence to peptidase M28 GCP-II, TfR lacks the metal ion coordination centers and protease activity.	278
349872	cd03875	M28_Fxna_like	M28 Zn-peptidase Endoplasmic reticulum metallopeptidase 1. Peptidase family M28; Endoplasmic reticulum metallopeptidase 1 (ERMP1; Felix-ina, FXNA or Fxna peptidase; KIAA1815) subfamily. ERMP1 is a multi-pass membrane protein located in the endoplasmic reticulum membrane. In humans, Fxna may play a crucial role in processing proteins required for the organization of somatic cells and oocytes into discrete follicular structures, although which proteins are hydrolyzed has not yet been determined. Another member of this subfamily is the 24-kDa vacuolar protein (VP24) which is probably involved in the formation of intravacuolar pigmented globules (cyanoplasts) in highly anthocyanin-containing vacuoles; however, the biological function of the C-terminal region which includes the putative transmembrane metallopeptidase domain is unknown.	307
349873	cd03876	M28_SGAP_like	M28 Zn-peptidase Streptomyces griseus aminopeptidase and similar proteins. Peptidase family M28; Streptomyces griseus Aminopeptidase (SGAP, Leucine aminopeptidase (LAP), aminopeptidase S, Mername-AA022 peptidase) subfamily. SGAP is a di-zinc exopeptidase with high preference towards large hydrophobic amino-terminal residues, with Leu being the most efficiently cleaved. It can accommodate all except Pro and Glu residues in the P1' position. It is a monomeric (30 kDa), calcium-activated and calcium-stabilized enzyme; its activation by calcium correlates with substrate specificity and it has thermal stability only in the presence of calcium. Although SGAP contains a calcium binding site, it is not conserved in many members of this subfamily. SGAP is present in the extracellular fluid of S. griseus cultures.	289
349874	cd03877	M28_like	M28 Zn-peptidase, many containing a protease-associated (PA) domain insert. Peptidase family M28 (also called aminopeptidase Y family), uncharacterized subfamily. The M28 family contains aminopeptidases as well as carboxypeptidases. They have co-catalytic zinc ions; each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. This subfamily is composed of uncharacterized proteins, many of which contain a protease-associated (PA) domain insert which may participate in substrate binding and/or promote conformational changes, influencing the stability and accessibility of the site to substrate. Some proteins in this subfamily are also associated with the PDZ domain, a widespread protein module that has been recruited to serve multiple functions during the course of evolution.	206
349875	cd03879	M28_AAP	M28 Zn-peptidase Aeromonas (Vibrio) proteolytica aminopeptidase. Peptidase family M28; Aeromonas (Vibrio) proteolytica aminopeptidase (AAP; leucine aminopeptidase from Vibrio proteolyticus; Bacterial leucyl aminopeptidase; E.C. 3.4.11.10) subfamily. AAP is a small (32kDa), heat stable leucine aminopeptidase and is active as a monomer. Similar forms of the enzyme have been isolated from Escherichia coli and Staphylococcus thermophilus. Leucine aminopeptidases, in general, play important roles in many biological processes such as protein catabolism, hormone degradation, regulation of migration and cell proliferation, as well as HIV infection and proliferation. AAP is a broad-specificity enzyme, utilizing two zinc(II) ions in its active site to remove N-terminal amino acids, with preference for large hydrophobic amino acids in the P1 position of the substrate, Leu being the most efficiently cleaved. It can accommodate all residues, except Pro, Asp and Glu in the P1' position.	286
349876	cd03880	M28_QC_like	M28 Zn-peptidase glutaminyl cyclase. Peptidase M28 family, glutaminyl cyclase (QC; EC 2.3.2.5) subfamily. QC is involved in N-terminal glutamine cyclization of many endocrine peptides and is typically abundant in brain tissue. N-terminal glutamine residue cyclization is an important post-translational event in the processing of numerous bioactive proteins, including neuropeptides, hormones, and cytokines during their maturation in the secretory pathway. The N-terminal pGlu protects them from exopeptidase degradation and/or enables them to have proper conformation for binding to their receptors. QCs are highly conserved from yeast to human. In humans, several genetic diseases, such as osteoporosis, appear to result from mutations of the QC gene. N-terminal glutamate cyclization into pyroglutamate (pGlu) is a reaction that may be related to the formation of several plaque-forming peptides, such as amyloid-beta (Abeta) peptides and collagen-like Alzheimer amyloid plaque component, which play a pivotal role in Alzheimer's disease.	305
349877	cd03881	M28_Nicastrin	M28 Zn-peptidase nicastrin, a main component of gamma-secretase complex. Peptidase M28 family, nicastrin subfamily. Nicastrin is a main component of the gamma-secretase complex, which also contains presenilin, Pen-2 and Aph-1. Its extracellular domain sequence resembles aminopeptidases, but certain catalytic residues are not conserved. It is mainly localized to the endoplasmic reticulum and Golgi. It is highly glycosylated (Mr 120 kDa) and is essential for substrate recognition of the N-terminus of gamma-secretase substrates derived from APP and Notch. Nicastrin facilitates substrate cleavage by the catalytic presenilin subunit in the gamma-secretase complex. One conserved glutamate is especially important, probably because this residue forms an ion pair with the amino terminus of the substrate. This substrate-binding domain is often called the DAP domain (named after DYIGS, the amino acid stretch that modulates amyloid precursor protein (APP) processing, and Peptidase homologous region). The sequence of the substrate N-terminus is apparently not critical for the interaction, but a free amino group is. Thus, nicastrin can be considered a kind of gatekeeper for the gamma-secretase complex: type I membrane proteins that have not shed their ectodomains cannot interact properly with nicastrin and do not gain access to the active site. Dysfunction of gamma-secretase is thought to cause Alzheimer's disease, with most mutations derived from Alzheimer's disease mapping to the catalytic subunit presenilin 1 (PS1).	519
349878	cd03882	M28_nicalin_like	M28 Zn-Peptidase Nicalin, Nicastrin-like protein. Peptidase M28 family, Nicalin (nicastrin-like protein) subfamily. Nicalin is distantly related to Nicastrin, a component of the Alzheimer's disease-associated gamma-secretase, and forms a complex with Nomo (nodal modulator) pM5. Similar to Nicastrin, Nicalin lacks the amino-acid conservation required for catalytically active aminopeptidases. Functional studies in zebrafish embryos and cultured human cells reveal that nicalin and Nomo collaborate to antagonize the Nodal/TGFbeta signaling pathway. Thus, nicastrin and nicalin are both associated with protein complexes involved in cell fate decisions during early embryonic development.	296
349879	cd03883	M28_Pgcp_like	M28 Zn-Peptidase Plasma glutamate carboxypeptidase. Peptidase M28 family; Plasma glutamate carboxypeptidase (PGCP; blood plasma glutamate carboxypeptidase; EC 3.4.17.21) subfamily. PGCP is a 56kDa glutamate carboxypeptidase that is mainly produced in mammalian placenta and kidney, the majority of which is thought to be secreted into the bloodstream. Similar proteins are also found in other species, including bacteria. These proteins contain protease-associated (PA) domain inserts between the first and second strands of the central beta sheet in the protease-like domain. The PA domains may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. The exact physiological substrates of PGCP are unknown, although this enzyme may play an important role in the hydrolysis of circulating peptides. Its closest homolog encodes an important brain glutamate carboxypeptidase II (NAALADase) identical to the prostate-specific membrane antigen (PSMA), which serves as a marker for prostatic cancer metastasis. Hypermethylation of PGCP gene has been associated with human bronchial epithelial (HBE) cell immortalization and lung cancer. PGCP also provides an attractive target for serological analysis in hepatitis C virus (HCV)-induced hepatocellular carcinoma (HCC) patients.	425
349880	cd03884	M20_bAS	M20 Peptidase beta-alanine synthase, an amidohydrolase. Peptidase M20 family, beta-alanine synthase (bAS; N-carbamoyl-beta-alanine amidohydrolase and beta-ureidopropionase; EC 3.5.1.6) subfamily. bAS is an amidohydrolase and is the final enzyme in the pyrimidine catabolic pathway, which is involved in the regulation of the cellular pyrimidine pool. bAS catalyzes the irreversible hydrolysis of the N-carbamylated beta-amino acids to beta-alanine or aminoisobutyrate with the release of carbon dioxide and ammonia. Also included in this subfamily is allantoate amidohydrolase (allantoate deiminase), which catalyzes the conversion of allantoate to (S)-ureidoglycolate, one of the crucial alternate steps in purine metabolism. It is possible that these two enzymes arose from the same ancestral peptidase that evolved into two structurally related enzymes with distinct catalytic properties and biochemical roles within the cell. Downstream enzyme (S)-ureidoglycolate amidohydrolase (UAH) is homologous in structure and sequence with AAH and catalyzes the conversion of (S)-ureidoglycolate into glyoxylate, releasing two molecules of ammonia as by-products. Yeast requires beta-alanine as a precursor of pantothenate and coenzyme A biosynthesis, but generates it mostly via degradation of spermine. Disorders in pyrimidine degradation and beta-alanine metabolism caused by beta-ureidopropionase deficiency (UPB1 gene) in humans are normally associated with neurological disorders.	398
349881	cd03885	M20_CPDG2	M20 Peptidase Glutamate carboxypeptidase, a periplasmic enzyme. Peptidase M20 family, Glutamate carboxypeptidase (carboxypeptidase G; carboxypeptidase G1; carboxypeptidase G2; CPDG2; CPG2; Folate hydrolase G2; Pteroylmonoglutamic acid hydrolase G2; Glucarpidase; E.C. 3.4.17.11) subfamily. CPDG2 is a periplasmic enzyme that is synthesized with a signal peptide. It is a dimeric zinc-dependent exopeptidase, with two domains, a catalytic domain, which provides the ligands for the two zinc ions in the active site, and a dimerization domain. CPDG2 cleaves the C-terminal glutamate moiety from a wide range of N-acyl groups, including peptidyl, aminoacyl, benzoyl, benzyloxycarbonyl, folyl, and pteroyl groups to release benzoic acid, phenol, and aniline mustards. It is used clinically to treat methotrexate toxicity by hydrolyzing it to inactive and non-toxic metabolites. It is also proposed for use in antibody-directed enzyme prodrug therapy; for example, glutamate can be cleaved from glutamated benzoyl nitrogen mustards, producing nitrogen mustards with effective cytotoxicity against tumor cells.	362
349882	cd03886	M20_Acy1	M20 Peptidase Aminoacylase 1 family. Peptidase M20 family, Aminoacylase 1 (ACY1; hippuricase; acylase I; amido acid deacylase; IAA-amino acid hydrolase; dehydropeptidase II; N-acyl-L-amino-acid amidohydrolase; EC 3.5.1.14) subfamily. ACY1 is the most abundant of the aminoacylases, a class of zinc binding homodimeric enzymes involved in the hydrolysis of N-acetylated proteins. It is encoded by the aminoacylase 1 gene (Acy1) on chromosome 3p21 that comprises 15 exons. N-terminal acetylation of proteins is a widespread and highly conserved process that is involved in the protection and stability of proteins. Several types of aminoacylases can be distinguished on the basis of substrate specificity; substrates include indoleacetic acid (IAA) N-conjugates of amino acids, N-acetyl-L-amino acids and aminobenzoylglutamate. ACY1 breaks down cytosolic aliphatic N-acyl-alpha-amino acids (except L-aspartate), especially N-acetyl-methionine and acetyl-glutamate into L-amino acids and an acyl group. However, ACY1 can also catalyze the reverse reaction, the synthesis of acetylated amino acids. ACY1 may also play a role in xenobiotic bioactivation as well as the inter-organ processing of amino acid-conjugated xenobiotic derivatives (S-substituted-N-acetyl-L-cysteine). ACY1 appears to physically interact with Sphingosine kinase type 1 (SphK1) and may influence its physiological functions; SphK1 and its product sphingosine-1-phosphate have been shown to promote cell growth and inhibit apoptosis of tumor cells. Strong expression of the human gene and its mouse ortholog Acy1 in brain, liver, and kidney, suggest a role of the enzyme in amino acid metabolism of these organs. Defects in ACY1 are the cause of aminoacylase-1 deficiency (ACY1D), resulting in a metabolic disorder manifesting encephalopathy and psychomotor delay.	371
349883	cd03887	M20_Acy1L2	M20 Peptidase Aminoacylase 1-like protein 2, amidohydrolase family. Peptidase M20 family, Aminoacylase 1-like protein 2 (ACY1L2; amidohydrolase) subfamily. This group contains many uncharacterized proteins predicted as amidohydrolases, including gene products of abgA and abgB that catalyze the cleavage of p-aminobenzoyl-glutamate, a folate catabolite in Escherichia coli, to p-aminobenzoate and glutamate. p-Aminobenzoyl-glutamate utilization is catalyzed by the abg region gene product, AbgT. Aminoacylase 1 (ACY1) proteins are a class of zinc binding homodimeric enzymes involved in hydrolysis of N-acetylated proteins. N-terminal acetylation of proteins is a widespread and highly conserved process that is involved in the protection and stability of proteins. Several types of aminoacylases can be distinguished on the basis of substrate specificity. ACY1 breaks down cytosolic aliphatic N-acyl-alpha-amino acids (except L-aspartate), especially N-acetyl-methionine and acetyl-glutamate into L-amino acids and an acyl group. However, ACY1 can also catalyze the reverse reaction, the synthesis of acetylated amino acids. ACY1 may also play a role in xenobiotic bioactivation as well as the inter-organ processing of amino acid-conjugated xenobiotic derivatives (S-substituted-N-acetyl-L-cysteine).	360
349884	cd03888	M20_PepV	M20 Peptidase Xaa-His dipeptidase (PepV) degrades hydrophobic dipeptides. Peptidase M20 family, Peptidase V (Xaa-His dipeptidase; PepV g.p. (Lactobacillus lactis); X-His dipeptidase; beta-Ala-His dipeptidase; carnosinase) subfamily. The PepV group of proteins is widely distributed in lactic acid bacteria. PepV, along with PepT, functions at the end of the proteolytic processing system. PepV is a monomeric metalloenzyme that preferentially degrades hydrophobic dipeptides. The Streptococcus gordonii PepV gene is homologous to the PepV gene family from Lactobacillus and Lactococcus spp. PepV recognizes and fixes the dipeptide backbone, while the side chains are not specifically probed and can vary, rendering it a nonspecific dipeptidase. It has been shown that Lactococcus lactis subspecies lactis (L9) PepV does not hydrolyze dipeptides containing Pro or D-amino acids at the C-terminus, while PepV from Lactobaccilus has been shown to have L-carnosine hydrolyzing activity. The mammalian PepV also acts on anserine and homocarnosine (but not on homoanserine), and to a lesser extent on some other aminoacyl-L-histidine dipeptides. Also included is the Staphylococcus aureus metallopeptidase, Sapep, a Mn(2+)-dependent dipeptidase where large interdomain movements could potentially regulate the activity of this enzyme.	449
349885	cd03890	M20_pepD	M20 Peptidase D has specificity for beta-alanyl-L-histidine dipeptide. Peptidase M20 family, Peptidase D (PepD, Xaa-His dipeptidase; X-His dipeptidase; aminoacylhistidine dipeptidase; dipeptidase D; Beta-alanyl-histidine dipeptidase; pepD g.p. (Escherichia coli); EC 3.4.13.3) subfamily. PepD is a cytoplasmic enzyme family characterized by its unusual specificity for the dipeptides beta-alanyl-L-histidine (L-carnosine or beta-Ala-His) and gamma-aminobutyryl histidine (L-homocarnosine or gamma-amino-butyl-His). Homocarnosine has been suggested as a precursor for the neurotransmitter gamma-aminobutyric acid (GABA), acting as a GABA reservoir, and may mediate anti-seizure effects of GABAergic therapies. It has also been reported that glucose metabolism could be influenced by L-carnosine. PepD also includes a lid domain that forms a homodimer; however, the physiological function of this extra domain remains unclear.	474
349886	cd03891	M20_DapE_proteobac	M20 Peptidase proteobacterial DapE encoded N-succinyl-L,L-diaminopimelic acid desuccinylase. Peptidase M20 family, proteobacterial DapE encoded N-succinyl-L,L-diaminopimelic acid desuccinylase (DapE; aspartyl dipeptidase; succinyl-diaminopimelate desuccinylase) subfamily. DapE catalyzes the hydrolysis of N-succinyl-L,L-diaminopimelate (L,L-SDAP) to L,L-diaminopimelate and succinate. It has been shown that DapE is essential for cell growth and proliferation. DapEs have been purified from Escherichia coli and Haemophilus influenzae, while the genes that encode for DapEs have been sequenced from several bacterial sources such as Corynebacterium glutamicum, Helicobacter pylori, Neisseria meningitidis and Mycobacterium tuberculosis. DapE is a small, dimeric enzyme that requires two zinc atoms per molecule for full enzymatic activity. All of the amino acids that function as metal binding ligands are strictly conserved in DapE.	366
349887	cd03892	M20_peptT	M20 Peptidase T specifically cleaves tripeptides. Peptidase M20 family, Peptidase T (peptT; tripeptide aminopeptidase; tripeptidase) subfamily. PepT acts only on tripeptide substrates, and is thus called a tripeptidase. It catalyzes the release of N-terminal amino acids with hydrophobic side chains from tripeptides with high specificity; dipeptides, tetrapeptides or tripeptides with the N-terminus blocked are not cleaved. Tripeptidases are known to function at the final stage of proteolysis in lactococcal bacteria and release amino acids from tripeptides produced during the digestion of milk proteins such as casein.	400
349888	cd03893	M20_Dipept_like	M20 Dipeptidases. Peptidase M20 family, dipeptidase-like subfamily. This group contains a large variety of enzymes, including cytosolic nonspecific dipeptidase (CNDP), Xaa-methyl-His dipeptidase (anserinase), canosinase, DUG2 type proteins, as well as many proteins inferred by homology to be dipeptidases. These enzymes have been shown to act on a wide range of dipeptides, but not larger peptides. For example, anserinase mainly catalyzes the hydrolysis of N-alpha-acetylhistidine while carnosinase degrades beta-alanyl-L-histidine. Substrates of CNDP are varied and not limited to Xaa-His dipeptides. DUG2 proteins contain a metallopeptidase domain and a large N-terminal WD40 repeat region, and are involved in the alternative pathway of glutathione degradation.	426
349889	cd03894	M20_ArgE	M20 Peptidase acetylornithine deacetylase. Peptidase M20 family, acetylornithine deacetylase (ArgE, Acetylornithinase, AO, N2-acetyl-L-ornithine amidohydrolase, EC 3.5.1.16) subfamily. ArgE catalyzes the conversion of N-acetylornithine to ornithine, which can then be incorporated into the urea cycle for the final stage of arginine synthesis. The substrate specificity of ArgE is quite broad; several alpha-N-acyl-L-amino acids can be hydrolyzed, including alpha-N-acetylmethionine and alpha-N-formylmethionine. ArgE shares significant sequence homology and biochemical features, and possibly a common origin, with glutamate carboxypeptidase (CPG2) and succinyl-diaminopimelate desuccinylase (DapE), and aminoacylase I (ACY1), having all metal ligand binding residues conserved.	367
349890	cd03895	M20_ArgE_DapE-like	M20 Peptidases with similarity to acetylornithine deacetylases and succinyl-diaminopimelate desuccinylases. Peptidase M20 family, uncharacterized protein subfamily with similarity to acetylornithine deacetylase/succinyl-diaminopimelate desuccinylase (ArgE/DapE) subfamily. ArgE/DapE enzymes catalyze analogous reactions and share a common activator, the metal ion (usually Co2+ or Zn2+). ArgE catalyzes a broad range of substrates, including N-acetylornithine, alpha-N-acetylmethionine and alpha-N-formylmethionine, while DapE catalyzes the hydrolysis of N-succinyl-L,L-diaminopimelate (L,L-SDAP) to L,L-diaminopimelate and succinate. Proteins in this subfamily are mostly bacterial, and have been inferred by homology as being related to both ArgE and DapE.	400
349891	cd03896	M20_PAAh_like	M20 Peptidases, Poly(aspartic acid) hydrolase-like proteins. Peptidase M20 family, Poly(aspartic acid) hydrolase (PAA hydrolase)-like subfamily. PAA hydrolase enzymes are involved in alpha,beta-poly(D,L-aspartic acid) (tPAA) biodegradation. PAA is being extensively studied as a replacement for commercial polycarboxylate components since it can be degraded by enzymes from isolated tPAA degrading bacteria. Thus far, two types of PAA degrading bacteria (Sphingomonas sp. KT-1 and Pedobacter sp. KP-2) have been investigated in detail; the former can completely degrade tPAA of low-molecular weights below 5000, while the latter can degrade high molecular weight tPAA to release oligo(aspartic acid) (OAA) as a product, suggesting two kinds of PAA degrading enzymes. It has been shown that PAA hydrolase-1 from Sphingomonas sp. KT-1 hydrolyzes beta,beta-aspartic acid units in tPAA to produce OAA, and it is suggested that PAA hydrolase-2 hydrolyzes OAA to aspartic acid. Also included in this family is Bradyrhizobium 5-nitroanthranilic acid (5NAA)-aminohydrolase (5NAA-A), a biodegradation enzyme that converts 5NAA to 5-nitrosalicylic acid; 5NAA is a metabolite secreted by Streptomyces scabies, the bacterium responsible for potato scab, and metabolized by Bradyrhizobium species strain JS329.	357
175976	cd04009	C2B_Munc13-like	C2 domain second repeat in Munc13 (mammalian uncoordinated)-like proteins. C2-like domains are thought to be involved in phospholipid binding in a Ca2+ independent manner in both Unc13 and Munc13. Caenorabditis elegans Unc13 has a central domain with sequence similarity to PKC, which includes C1 and C2-related domains. Unc13 binds phorbol esters and DAG with high affinity in a phospholipid manner.  Mutations in Unc13 results in abnormal neuronal connections and impairment in cholinergic neurotransmission in the nematode.  Munc13 is the mammalian homolog which are expressed in the brain.  There are 3 isoforms (Munc13-1, -2, -3) and are thought to play a role in neurotransmitter release and are hypothesized to be high-affinity receptors for phorbol esters.  Unc13 and Munc13 contain both C1 and C2 domains.  There are two C2 related domains present, one central and one at the carboxyl end.  Munc13-1 contains a third C2-like domain.  Munc13 interacts with syntaxin, synaptobrevin, and synaptotagmin suggesting a role for these as scaffolding proteins. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the third C2 repeat, C2C, and has a type-II topology.	133
175977	cd04010	C2B_RasA3	C2 domain second repeat present in RAS p21 protein activator 3 (RasA3). RasA3 are members of GTPase activating protein 1 (GAP1), a Ras-specific GAP, which suppresses Ras function by enhancing the GTPase activity of Ras proteins resulting in the inactive GDP-bound form of Ras.  In this way it can control cellular proliferation and differentiation.  RasA3 contains an N-terminal C2 domain,  a Ras-GAP domain, a plextrin homology (PH)-like domain, and a Bruton's Tyrosine Kinase (BTK) zinc binding domain. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the second C2 repeat, C2B, and has a type-I topology.	148
175978	cd04011	C2B_Ferlin	C2 domain second repeat in Ferlin. Ferlins are involved in vesicle fusion events.  Ferlins and other proteins, such as Synaptotagmins, are implicated in facilitating the fusion process when cell membranes fuse together.  There are six known human Ferlins: Dysferlin (Fer1L1), Otoferlin (Fer1L2), Myoferlin (Fer1L3), Fer1L4, Fer1L5, and Fer1L6.  Defects in these genes can lead to a wide range of diseases including muscular dystrophy (dysferlin), deafness (otoferlin), and infertility (fer-1, fertilization factor-1).  Structurally they have 6 tandem C2 domains, designated as (C2A-C2F) and a single C-terminal transmembrane domain, though there is a new study that disputes this and claims that there are actually 7 tandem C2 domains with another C2 domain inserted between C2D and C2E.   In a subset of them (Dysferlin, Myoferlin, and Fer1) there is an additional conserved domain called DysF. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the second C2 repeat, C2B, and has a type-II topology.	111
175979	cd04012	C2A_PI3K_class_II	C2 domain first repeat present in class II phosphatidylinositol 3-kinases (PI3Ks). There are 3 classes of PI3Ks based on structure, regulation, and specificity. All classes contain a N-terminal C2 domain, a PIK domain, and a kinase catalytic domain. Unlike class I and class III, class II PI3Ks have additionally a PX domain and a C-terminal C2 domain containing a nuclear localization signal both of which bind phospholipids though in a slightly different fashion.  Class II PIK3s act downstream of receptors for growth factors, integrins, and chemokines. PI3Ks (AKA phosphatidylinositol (PtdIns) 3-kinases) regulate cell processes such as cell growth, differentiation, proliferation, and motility.  PI3Ks work on phosphorylation of phosphatidylinositol, phosphatidylinositide (4)P (PtdIns (4)P),2 or PtdIns(4,5)P2. Specifically they phosphorylate the D3 hydroxyl group of phosphoinositol lipids on the inositol ring.  C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the first C2 repeat, C2A, and has a type-I topology.	171
175980	cd04013	C2_SynGAP_like	C2 domain present in Ras GTPase activating protein (GAP) family. SynGAP, GAP1, RasGAP, and neurofibromin are all members of the Ras-specific GAP (GTPase-activating protein) family.  SynGAP regulates the MAP kinase signaling pathway and is critical for cognition and synapse function.  Mutations in this gene causes mental retardation in humans.   SynGAP contains a PH-like domain, a C2 domain, and a  Ras-GAP domain.  C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.	146
175981	cd04014	C2_PKC_epsilon	C2 domain in Protein Kinase C (PKC) epsilon. A single C2 domain is found in PKC epsilon. The PKC family of serine/threonine kinases regulates apoptosis, proliferation, migration, motility, chemo-resistance, and differentiation.  There are 3 groups: group 1 (alpha, betaI, beta II, gamma) which require phospholipids and calcium, group 2 (delta, epsilon, theta, eta) which do not require calcium for activation, and group 3 (xi, iota/lambda) which are atypical and can be activated in the absence of diacylglycerol and calcium. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.  Members here have a type-II topology.	132
175982	cd04015	C2_plant_PLD	C2 domain present in plant phospholipase D (PLD). PLD hydrolyzes terminal phosphodiester bonds in diester glycerophospholipids resulting in the degradation of phospholipids.  In vitro PLD transfers phosphatidic acid to primary alcohols.  In plants PLD plays a role in germination, seedling growth, phosphatidylinositol metabolism, and changes in phospholipid composition.  There is a single Ca(2+)/phospholipid-binding C2 domain in PLD. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.	158
175983	cd04016	C2_Tollip	C2 domain present in Toll-interacting protein (Tollip). Tollip is a part of the Interleukin-1 receptor (IL-1R) signaling pathway. Tollip is proposed to link serine/threonine kinase IRAK to IL-1Rs as well as inhibiting phosphorylation of IRAK. There is a single C2 domain present in Tollip. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.	121
175984	cd04017	C2D_Ferlin	C2 domain fourth repeat in Ferlin. Ferlins are involved in vesicle fusion events.  Ferlins and other proteins, such as Synaptotagmins, are implicated in facilitating the fusion process when cell membranes fuse together.  There are six known human Ferlins: Dysferlin (Fer1L1), Otoferlin (Fer1L2), Myoferlin (Fer1L3), Fer1L4, Fer1L5, and Fer1L6.  Defects in these genes can lead to a wide range of diseases including muscular dystrophy (dysferlin), deafness (otoferlin), and infertility (fer-1, fertilization factor-1).  Structurally they have 6 tandem C2 domains, designated as (C2A-C2F) and a single C-terminal transmembrane domain, though there is a new study that disputes this and claims that there are actually 7 tandem C2 domains with another C2 domain inserted between C2D and C2E.   In a subset of them (Dysferlin, Myoferlin, and Fer1) there is an additional conserved domain called DysF. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the fourth C2 repeat, C2D, and has a type-II topology.	135
175985	cd04018	C2C_Ferlin	C2 domain third repeat in Ferlin. Ferlins are involved in vesicle fusion events.  Ferlins and other proteins, such as Synaptotagmins, are implicated in facilitating the fusion process when cell membranes fuse together.  There are six known human Ferlins: Dysferlin (Fer1L1), Otoferlin (Fer1L2), Myoferlin (Fer1L3), Fer1L4, Fer1L5, and Fer1L6.  Defects in these genes can lead to a wide range of diseases including muscular dystrophy (dysferlin), deafness (otoferlin), and infertility (fer-1, fertilization factor-1).  Structurally they have 6 tandem C2 domains, designated as (C2A-C2F) and a single C-terminal transmembrane domain, though there is a new study that disputes this and claims that there are actually 7 tandem C2 domains with another C2 domain inserted between C2D and C2E.   In a subset of them (Dysferlin, Myoferlin, and Fer1) there is an additional conserved domain called DysF. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the third C2 repeat, C2C, and has a type-II topology.	151
175986	cd04019	C2C_MCTP_PRT_plant	C2 domain third repeat found in Multiple C2 domain and Transmembrane region Proteins (MCTP); plant subset. MCTPs are involved in Ca2+ signaling at the membrane.  Plant-MCTPs are composed of a variable N-terminal sequence, four C2 domains, two transmembrane regions (TMRs), and a short C-terminal sequence.  It is one of four protein classes that are anchored to membranes via a transmembrane region; the others being synaptotagmins, extended synaptotagmins, and ferlins. MCTPs are the only membrane-bound C2 domain proteins that contain two functional TMRs. MCTPs are unique in that they bind Ca2+ but not phospholipids. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the third C2 repeat, C2C, and has a type-II topology.	150
175987	cd04020	C2B_SLP_1-2-3-4	C2 domain second repeat present in Synaptotagmin-like proteins 1-4. All Slp members basically share an N-terminal Slp homology domain (SHD) and C-terminal tandem C2 domains (named the C2A domain and the C2B domain) with the SHD and C2 domains being separated by a linker sequence of various length.  Slp1/JFC1 and Slp2/exophilin 4 promote granule docking to the plasma membrane.  Additionally, their C2A domains are both Ca2+ independent, unlike the case in Slp3 and Slp4/granuphilin in which their C2A domains are Ca2+ dependent.  It is thought that SHD (except for the Slp4-SHD) functions as a specific Rab27A/B-binding domain. In addition to Slps, rabphilin, Noc2, and  Munc13-4 also function as Rab27-binding proteins. It has been demonstrated that Slp3 and Slp4/granuphilin promote dense-core vesicle exocytosis. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.   This cd contains the second C2 repeat, C2B, and has a type-I topology.	162
175988	cd04021	C2_E3_ubiquitin_ligase	C2 domain present in E3 ubiquitin ligase. E3 ubiquitin ligase is part of the ubiquitylation mechanism responsible for controlling surface expression of membrane proteins.  The sequential action of several enzymes are involved: ubiquitin-activating enzyme E1, ubiquitin-conjugating enzyme E2, and ubiquitin-protein ligase E3 which is responsible for substrate recognition and promoting the transfer of ubiquitin to the target protein.  E3 ubiquitin ligase is composed of an N-terminal C2 domain, 4 WW domains, and a HECTc domain.  C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.	125
175989	cd04022	C2A_MCTP_PRT_plant	C2 domain first repeat found in Multiple C2 domain and Transmembrane region Proteins (MCTP); plant subset. MCTPs are involved in Ca2+ signaling at the membrane.  Plant-MCTPs are composed of a variable N-terminal sequence, four C2 domains, two transmembrane regions (TMRs), and a short C-terminal sequence.  It is one of four protein classes that are anchored to membranes via a transmembrane region; the others being synaptotagmins, extended synaptotagmins, and ferlins. MCTPs are the only membrane-bound C2 domain proteins that contain two functional TMRs. MCTPs are unique in that they bind Ca2+ but not phospholipids. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the first C2 repeat, C2A, and has a type-II topology.	127
175990	cd04024	C2A_Synaptotagmin-like	C2 domain first repeat present in Synaptotagmin-like proteins. Synaptotagmin is a membrane-trafficking protein characterized by an N-terminal transmembrane region, a linker, and 2 C-terminal C2 domains. Previously all synaptotagmins were thought to be calcium sensors in the regulation of neurotransmitter release and hormone secretion, but it has been shown that not all of them bind calcium.  Of the 17 identified synaptotagmins only 8 bind calcium (1-3, 5-7, 9, 10).  The functions of the two C2 domains that bind calcium are: regulating the fusion step of synaptic vesicle exocytosis (C2A) and  binding to phosphatidyl-inositol-3,4,5-triphosphate (PIP3) in the absence of calcium ions and to phosphatidylinositol bisphosphate (PIP2) in their presence (C2B).  C2B also regulates the recycling step of synaptic vesicles. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the first C2 repeat, C2A, and has a type-I topology.	128
175991	cd04025	C2B_RasA1_RasA4	C2 domain second repeat present in RasA1 and RasA4. RasA1 and RasA4 are GAP1s (GTPase activating protein 1s), Ras-specific GAP members, which suppress Ras function by enhancing the GTPase activity of Ras proteins resulting in the inactive GDP-bound form of Ras.  In this way they can control cellular proliferation and differentiation.  Both proteins contain two C2 domains,  a Ras-GAP domain, a pleckstrin homology (PH)-like domain, and a Bruton's Tyrosine Kinase (BTK) zinc binding domain. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the second C2 repeat, C2B, and has a type-I topology.	123
175992	cd04026	C2_PKC_alpha_gamma	C2 domain in Protein Kinase C (PKC) alpha and gamma. A single C2 domain is found in PKC alpha and gamma. The PKC family of serine/threonine kinases regulates apoptosis, proliferation, migration, motility, chemo-resistance, and differentiation.  There are 3 groups: group 1(alpha, betaI, beta II, gamma) which require phospholipids and calcium, group 2 (delta, epsilon, theta, eta) which do not require calcium for activation, and group 3 (xi, iota/lambda) which are atypical and can be activated in the absence of diacylglycerol and calcium. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. Members here have a type-I topology.	131
175993	cd04027	C2B_Munc13	C2 domain second repeat in Munc13 (mammalian uncoordinated) proteins. C2-like domains are thought to be involved in phospholipid binding in a Ca2+ independent manner in both Unc13 and Munc13. Caenorhabditis elegans Unc13 has a central domain with sequence similarity to PKC, which includes C1 and C2-related domains. Unc13 binds phorbol esters and DAG with high affinity in a phospholipid manner.  Mutations in Unc13 result in abnormal neuronal connections and impairment in cholinergic neurotransmission in the nematode.  Munc13 is the mammalian homolog, which is expressed in the brain.  There are 3 isoforms (Munc13-1, -2, -3), which are thought to play a role in neurotransmitter release and are hypothesized to be high-affinity receptors for phorbol esters.  Unc13 and Munc13 contain both C1 and C2 domains.  There are two C2 related domains present, one central and one at the carboxyl end.  Munc13-1 contains a third C2-like domain.  Munc13 interacts with syntaxin, synaptobrevin, and synaptotagmin suggesting a role for these as scaffolding proteins. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the second C2 repeat, C2B, and has a type-II topology.	127
175994	cd04028	C2B_RIM1alpha	C2 domain second repeat contained in Rab3-interacting molecule (RIM) proteins. RIMs are believed to organize specialized sites of the plasma membrane called active zones.  They also play a role in controlling neurotransmitter release, plasticity processes, as well as memory and learning.  RIM contains an N-terminal zinc finger domain, a PDZ domain, and two C-terminal C2 domains (C2A, C2B).  C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. Members here have a type-I topology and do not bind Ca2+.	146
175995	cd04029	C2A_SLP-4_5	C2 domain first repeat present in Synaptotagmin-like proteins 4 and 5. All Slp members basically share an N-terminal Slp homology domain (SHD) and C-terminal tandem C2 domains (named the C2A domain and the C2B domain) with the SHD and C2 domains being separated by a linker sequence of various length. SHD of Slp (except for the Slp4-SHD) function as a specific Rab27A/B-binding domain.  In addition to Slp, rabphilin, Noc2, and  Munc13-4 also function as Rab27-binding proteins. It has been demonstrated that Slp4/granuphilin promotes dense-core vesicle exocytosis. The C2A domain of Slp4 is Ca2+ dependent. Slp5 mRNA has been shown to be restricted to human placenta and liver suggesting a role in Rab27A-dependent membrane trafficking in specific tissues. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.  This cd contains the first C2 repeat, C2A, and has a type-I topology.	125
175996	cd04030	C2C_KIAA1228	C2 domain third repeat present in uncharacterized human KIAA1228-like proteins. KIAA proteins are uncharacterized human proteins. They were compiled by the Kazusa mammalian cDNA project which identified more than 2000 human genes. They are identified by 4 digit codes that precede the KIAA designation.  Many KIAA genes are still functionally uncharacterized including KIAA1228. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the third C2 repeat, C2C, and has a type-II topology.	127
175997	cd04031	C2A_RIM1alpha	C2 domain first repeat contained in Rab3-interacting molecule (RIM) proteins. RIMs are believed to organize specialized sites of the plasma membrane called active zones.  They also play a role in controlling neurotransmitter release, plasticity processes, as well as memory and learning.  RIM contains an N-terminal zinc finger domain, a PDZ domain, and two C-terminal C2 domains (C2A, C2B).  C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. Members here have a type-I topology and do not bind Ca2+.	125
175998	cd04032	C2_Perforin	C2 domain of Perforin. Perforin contains a single copy of a C2 domain in its C-terminus and plays a role in lymphocyte-mediated cytotoxicity.  Mutations in perforin lead to familial hemophagocytic lymphohistiocytosis type 2.  The function of perforin is calcium dependent and the C2 domain is thought to confer this binding to target cell membranes.  C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.	127
175999	cd04033	C2_NEDD4_NEDD4L	C2 domain present in the Human neural precursor cell-expressed, developmentally down-regulated 4 (NEDD4) and NEDD4-like (NEDD4L/NEDD42). Nedd4 and Nedd4-2 are two of the nine members of the Human Nedd4 family.  All vertebrates appear to have both Nedd4 and Nedd4-2 genes. They are thought to participate in the regulation of epithelial Na+ channel (ENaC) activity. They also have identical specificity for ubiquitin conjugating enzymes (E2).  Nedd4 and Nedd4-2 are composed of a C2 domain, 2-4 WW domains, and a ubiquitin ligase Hect domain. Their WW domains can bind PPxY (PY) or LPSY motifs, and in vitro studies suggest that WW3 and WW4 of both proteins bind PY motifs in the key substrates, with WW3 generally exhibiting higher affinity. Most Nedd4 family members, especially Nedd4-2, also have multiple splice variants, which might play different roles in regulating their substrates. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.	133
176000	cd04035	C2A_Rabphilin_Doc2	C2 domain first repeat present in Rabphilin and Double C2 domain. Rabphilin is found in neurons and in neuroendocrine cells, while Doc2 is found not only in the brain but in tissues, including mast cells, chromaffin cells, and osteoblasts.  Rabphilin and Doc2s share highly homologous tandem C2 domains, although their N-terminal structures are completely different: rabphilin contains an N-terminal Rab-binding domain (RBD), whereas Doc2 contains an N-terminal Munc13-1-interacting domain (MID). C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the first C2 repeat, C2A, and has a type-I topology.	123
176001	cd04036	C2_cPLA2	C2 domain present in cytosolic PhosphoLipase A2 (cPLA2). A single copy of the C2 domain is present in cPLA2 which releases arachidonic acid from membranes initiating the biosynthesis of potent inflammatory mediators such as prostaglandins, leukotrienes, and platelet-activating factor.  C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. Members of this cd have a type-II topology.	119
176002	cd04037	C2E_Ferlin	C2 domain fifth repeat in Ferlin. Ferlins are involved in vesicle fusion events.  Ferlins and other proteins, such as Synaptotagmins, are implicated in facilitating the fusion process when cell membranes fuse together.  There are six known human Ferlins: Dysferlin (Fer1L1), Otoferlin (Fer1L2), Myoferlin (Fer1L3), Fer1L4, Fer1L5, and Fer1L6.  Defects in these genes can lead to a wide range of diseases including muscular dystrophy (dysferlin), deafness (otoferlin), and infertility (fer-1, fertilization factor-1).  Structurally they have 6 tandem C2 domains, designated as (C2A-C2F) and a single C-terminal transmembrane domain, though there is a new study that disputes this and claims that there are actually 7 tandem C2 domains with another C2 domain inserted between C2D and C2E.   In a subset of them (Dysferlin, Myoferlin, and Fer1) there is an additional conserved domain called DysF. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the fifth C2 repeat, C2E, and has a type-II topology.	124
176003	cd04038	C2_ArfGAP	C2 domain present in Arf GTPase Activating Proteins (GAP). ArfGAP is a GTPase activating protein which regulates the ADP ribosylation factor Arf, a member of the Ras superfamily of GTP-binding proteins.  The GTP-bound form of Arf is involved in Golgi morphology and is involved in recruiting coat proteins.  ArfGAP is responsible for the GDP-bound form of Arf which is necessary for uncoating the membrane and allowing the Golgi to fuse with an acceptor compartment.  These proteins contain an N-terminal ArfGAP domain containing the characteristic zinc finger motif (Cys-x2-Cys-x(16,17)-x2-Cys) and C-terminal C2 domain. C2 domains were first identified in Protein Kinase C (PKC). C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.	145
176004	cd04039	C2_PSD	C2 domain present in Phosphatidylserine decarboxylase (PSD). PSD is involved in the biosynthesis of aminophospholipid by converting phosphatidylserine (PtdSer) to phosphatidylethanolamine (PtdEtn). There is a single C2 domain present and it is thought to confer PtdSer binding motif that is common to PKC and synaptotagmin. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.	108
176005	cd04040	C2D_Tricalbin-like	C2 domain fourth repeat present in Tricalbin-like proteins. 5 to 6 copies of the C2 domain are present in Tricalbin, a yeast homolog of Synaptotagmin, which is involved in membrane trafficking and sorting.  C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the fourth C2 repeat, C2D, and has a type-II topology.	115
176006	cd04041	C2A_fungal	C2 domain first repeat; fungal group. C2 domains were first identified in Protein Kinase C (PKC). C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.	111
176007	cd04042	C2A_MCTP_PRT	C2 domain first repeat found in Multiple C2 domain and Transmembrane region Proteins (MCTP). MCTPs are involved in Ca2+ signaling at the membrane.  MCTP is composed of a variable N-terminal sequence, three C2 domains, two transmembrane regions (TMRs), and a short C-terminal sequence.  It is one of four protein classes that are anchored to membranes via a transmembrane region; the others being synaptotagmins, extended synaptotagmins, and ferlins. MCTPs are the only membrane-bound C2 domain proteins that contain two functional TMRs. MCTPs are unique in that they bind Ca2+ but not phospholipids. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the first C2 repeat, C2A, and has a type-II topology.	121
176008	cd04043	C2_Munc13_fungal	C2 domain in Munc13 (mammalian uncoordinated) proteins; fungal group. C2-like domains are thought to be involved in phospholipid binding in a Ca2+ independent manner in both Unc13 and Munc13. Caenorhabditis elegans Unc13 has a central domain with sequence similarity to PKC, which includes C1 and C2-related domains. Unc13 binds phorbol esters and DAG with high affinity in a phospholipid-dependent manner.  Mutations in Unc13 result in abnormal neuronal connections and impairment in cholinergic neurotransmission in the nematode.  Munc13 proteins are the mammalian homologs, which are expressed in the brain.  There are 3 isoforms (Munc13-1, -2, -3), which are thought to play a role in neurotransmitter release and are hypothesized to be high-affinity receptors for phorbol esters.  Unc13 and Munc13 contain both C1 and C2 domains.  There are two C2 related domains present, one central and one at the carboxyl end.  Munc13-1 contains a third C2-like domain.  Munc13 interacts with syntaxin, synaptobrevin, and synaptotagmin suggesting a role for these as scaffolding proteins. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the second C2 repeat, C2B, and has a type-II topology.	126
176009	cd04044	C2A_Tricalbin-like	C2 domain first repeat present in Tricalbin-like proteins. 5 to 6 copies of the C2 domain are present in Tricalbin, a yeast homolog of Synaptotagmin, which is involved in membrane trafficking and sorting.  C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.  This cd contains the first C2 repeat, C2A, and has a type-II topology.	124
176010	cd04045	C2C_Tricalbin-like	C2 domain third repeat present in Tricalbin-like proteins. 5 to 6 copies of the C2 domain are present in Tricalbin, a yeast homolog of Synaptotagmin, which is involved in membrane trafficking and sorting.  C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.  This cd contains the third C2 repeat, C2C, and has a type-II topology.	120
176011	cd04046	C2_Calpain	C2 domain present in Calpain proteins. A single C2 domain is found in calpains (EC 3.4.22.52, EC 3.4.22.53), calcium-dependent, non-lysosomal cysteine proteases.  Calpains are classified as belonging to Clan CA by MEROPS and include six families: C1, C2, C10, C12, C28, and C47.  C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.	126
176012	cd04047	C2B_Copine	C2 domain second repeat in Copine. There are 2 copies of the C2 domain present in copine, a protein involved in membrane trafficking, protein-protein interactions, and perhaps even cell division and growth.  C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the second C2 repeat, C2B, and has a type-I topology.	110
176013	cd04048	C2A_Copine	C2 domain first repeat in Copine. There are 2 copies of the C2 domain present in copine, a protein involved in membrane trafficking, protein-protein interactions, and perhaps even cell division and growth.  C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the first C2 repeat, C2A, and has a type-I topology.	120
176014	cd04049	C2_putative_Elicitor-responsive_gene	C2 domain present in the putative elicitor-responsive gene. In plants, elicitor-responsive proteins are triggered in response to specific elicitor molecules such as glycoproteins, peptides, carbohydrates and lipids. A host of defensive responses are also triggered resulting in localized cell death.  Antimicrobial secondary metabolites, such as phytoalexins, or defense-related proteins, including pathogenesis-related (PR) proteins, are also produced.  There is a single C2 domain present here.  C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. Members have a type-II topology.	124
176015	cd04050	C2B_Synaptotagmin-like	C2 domain second repeat present in Synaptotagmin-like proteins. Synaptotagmin is a membrane-trafficking protein characterized by an N-terminal transmembrane region, a linker, and 2 C-terminal C2 domains. Previously all synaptotagmins were thought to be calcium sensors in the regulation of neurotransmitter release and hormone secretion, but it has been shown that not all of them bind calcium.  Of the 17 identified synaptotagmins only 8 bind calcium (1-3, 5-7, 9, 10).  The functions of the two C2 domains that bind calcium are: regulating the fusion step of synaptic vesicle exocytosis (C2A) and binding to phosphatidylinositol-3,4,5-trisphosphate (PIP3) in the absence of calcium ions and to phosphatidylinositol bisphosphate (PIP2) in their presence (C2B).  C2B also regulates the recycling step of synaptic vesicles. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the second C2 repeat, C2B, and has a type-I topology.	105
176016	cd04051	C2_SRC2_like	C2 domain present in Soybean genes Regulated by Cold 2 (SRC2)-like proteins. SRC2 production is a response to pathogen infiltration.  The initial response of increased Ca2+ concentrations are coupled to downstream signal transduction pathways via calcium binding proteins.  SRC2 contains a single C2 domain which localizes to the plasma membrane and is involved in Ca2+ dependent protein binding. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.	125
176017	cd04052	C2B_Tricalbin-like	C2 domain second repeat present in Tricalbin-like proteins. 5 to 6 copies of the C2 domain are present in Tricalbin, a yeast homolog of Synaptotagmin, which is involved in membrane trafficking and sorting.  C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the second C2 repeat, C2B, and has a type-II topology.	111
176018	cd04054	C2A_Rasal1_RasA4	C2 domain first repeat present in Rasal1 and RasA4. Rasal1 and RasA4 are both members of GAP1 (GTPase activating protein 1).  Rasal1 responds to repetitive Ca2+ signals by associating with the plasma membrane and deactivating Ras. RasA4 suppresses Ras function by enhancing the GTPase activity of Ras proteins resulting in the inactive GDP-bound form of Ras. In this way it can control cellular proliferation and differentiation.  Both of these proteins contain two C2 domains, a Ras-GAP domain, a pleckstrin homology (PH)-like domain, and a Bruton's Tyrosine Kinase (BTK) zinc binding domain. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the first C2 repeat, C2A, and has a type-I topology.	121
173788	cd04056	Peptidases_S53	Peptidase domain in the S53 family. Members of the peptidases S53 (sedolisin) family include endopeptidases and exopeptidases sedolisin, kumamolysin, and (PSCP) Pepstatin-insensitive Carboxyl Proteinase.  The S53 family contains a catalytic triad Glu/Asp/Ser with an additional acidic residue Asp in the oxyanion hole, similar to that of Asn in subtilisin. The stability of these enzymes may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values. Characterized sedolisins include Kumamolisin, an extracellular calcium-dependent thermostable endopeptidase from Bacillus. The enzyme is synthesized with a 188 amino acid N-terminal preprotein region which is cleaved after the extraction into the extracellular space with low pH. One kumamolysin paralog, kumamolisin-As, is believed to be a collagenase. TPP1 is a serine protease that functions as a tripeptidyl exopeptidase as well as an endopeptidase. Less is known about PSCP from Pseudomonas which is thought to be an aspartic proteinase.	361
173789	cd04059	Peptidases_S8_Protein_convertases_Kexins_Furin-like	Peptidase S8 family domain in Protein convertases. Protein convertases, whose members include furins and kexins, are members of the peptidase S8 or Subtilase clan of proteases. They have an Asp/His/Ser catalytic triad that is not homologous to trypsin. Kexins are involved in the activation of peptide hormones, growth factors, and viral proteins.  Furin cleaves cell surface vasoactive peptides and proteins involved in cardiovascular tissue remodeling in the TGN, at the cell surface, or in endosomes but rarely in the ER.  Furin also plays a key role in blood pressure regulation through the activation of transforming growth factor (TGF)-beta. High specificity is seen for cleavage after dibasic (Lys-Arg or Arg-Arg) or multiple basic residues in protein convertases.  There is also strong sequence conservation.	297
173790	cd04077	Peptidases_S8_PCSK9_ProteinaseK_like	Peptidase S8 family domain in ProteinaseK-like proteins. The peptidase S8 or Subtilase clan of proteases have an Asp/His/Ser catalytic triad that is not homologous to trypsin. This CD contains several members of this clan including: PCSK9 (Proprotein convertase subtilisin/kexin type 9), Proteinase_K, Proteinase_T, and other subtilisin-like serine proteases.  PCSK9 posttranslationally regulates hepatic low-density lipoprotein receptors (LDLRs) by binding to LDLRs on the cell surface, leading to their degradation. The binding site of PCSK9 has been localized to the epidermal growth factor-like repeat A (EGF-A) domain of the LDLR. Characterized Proteinases K are secreted endopeptidases with a high degree of sequence conservation.  Proteinases K are not substrate-specific and function in a wide variety of species in different pathways. They can hydrolyze keratin and other proteins with subtilisin-like specificity. The number of calcium-binding motifs found in these enzymes differs. Proteinase T is a novel proteinase from the fungus Tritirachium album Limber. The amino acid sequence of proteinase T as deduced from the nucleotide sequence is about 56% identical to that of proteinase K.	255
271144	cd04078	CBM36_xylanase-like	Carbohydrate Binding Module family 36 (CBM36); appended mainly to glycoside hydrolase family 11 (GH11) domains; xylan binding. This family includes carbohydrate binding module family 36 (CBM36) most of which appear appended to glycoside hydrolase family 11 (GH11) domains. These CBMs are non-catalytic carbohydrate binding domains that facilitate the strong binding of the GH11 catalytic modules with their dedicated, insoluble substrates. GH11 domains have xylanase (endo-1,4-beta-xylanase) activity which catalyzes the hydrolysis of beta-1,4 bonds of xylan, the major component of hemicelluloses, to generate xylooligosaccharides and xylose. This family includes XynB from Dictyoglomus thermophilum Rt46B.1 and Xyn11A from Pseudobutyrivibrio xylanivorans Mz5T. Xyn11A is a multicatalytic enzyme with an N-terminal GH11 domain, a CBM36 domain, and a C-terminal putative NodB-like polysaccharide deacetylase which is predicted to be an acetyl esterase involved in debranching activity in the xylan backbone. CBM6 is an unusual CBM as it represents a chimera of two distinct binding sites with different modes of binding: binding site I within the loop regions and binding site II on the concave face of the beta-sandwich fold. Consistent with its structural and sequence similarity to CBM6, CBM36 binds xylan, but only at binding site I, and in a calcium-dependent manner; the latter suggests its potential application in affinity labeling.	119
271145	cd04079	CBM6_agarase-like	Carbohydrate Binding Module 6 (CBM6); appended mainly to glycoside hydrolase (GH) family 16 alpha- and beta agarases. This family includes carbohydrate binding module 6 (CBM6) domains that are appended mainly to glycoside hydrolase (GH) family 16 agarases. These CBM6s are non-catalytic carbohydrate binding domains that facilitate the activity of alpha- and beta-agarase catalytic modules which are involved in the hydrolysis of 1,4-beta-D-galactosidic linkages. These CBM6s bind specifically to the non-reducing end of agarose chains, recognizing only the first repeat of the disaccharide, and directing the appended catalytic modules to areas of the plant cell wall attacked by beta-agarases. CBM6 is an unusual CBM as it represents a chimera of two distinct binding sites with different modes of binding: binding site I within the loop regions and binding site II on the concave face of the beta-sandwich fold. This family includes three tandem CBM6s from the Saccharophagus degradans  agarase Aga86E, and three tandem CBM6s from Vibrio sp. strain PO-303 AgaA; in both these proteins these are appended to a GH16 domain. Vibrio AgaA also contains a Big-2-like protein-protein interaction domain. This family also includes two tandem CBM6s from an endo-type beta-agarase from a deep-sea Microbulbifer-like isolate, which are appended to a GH16 domain, and two of three CBM6s of Alteromonas agarilytica AgaA alpha-agarase, which are appended to a GH96 domain.	134
271146	cd04080	CBM6_cellulase-like	Carbohydrate Binding Module 6 (CBM6); appended to glycoside hydrolase (GH) domains, including GH5 (cellulase). This family includes carbohydrate binding module 6 (CBM6) domains that are appended to several glycoside hydrolase (GH) domains, including GH5 (cellulase) and GH16, as well as to coagulation factor 5/8 carbohydrate-binding domains. CBM6s are non-catalytic carbohydrate binding domains that facilitate the strong binding of the GH catalytic modules with their dedicated, insoluble substrates. The CBM6s are appended to GHs that display a diversity of substrate specificities. For some members of this family information is available about the specific substrates of the appended GH domains. It includes the CBM domains of various enzymes involved in cell wall degradation including an extracellular beta-1,3-glucanase from Lysobacter enzymogenes encoded by the gluC gene (its catalytic domain belongs to the GH16 family), the tandem CBM domains of Pseudomonas sp. PE2 beta-1,3(4)-glucanase A (its catalytic domain also belongs to GH16), and a family 6 CBM from Cellvibrio mixtus Endoglucanase 5A (CmCBM6) which binds to the beta1,4-beta1,3-mixed linked glucans lichenan and barley beta-glucan, cello-oligosaccharides, insoluble forms of cellulose, the beta1,3-glucan laminarin, and xylooligosaccharides, and the CBM6 of Fibrobacter succinogenes S85 XynD xylanase, appended to a GH10 domain, and Cellvibrio japonicus Cel5G appended to a GH5 (cellulase) domain. GH5 (cellulase) family includes enzymes with several known activities such as endoglucanase, beta-mannanase, and xylanase, which are involved in the degradation of cellulose and xylans. GH16 family includes enzymes with lichenase, xyloglucan endotransglycosylase (XET), and beta-agarase activities. CBM6 is an unusual CBM as it represents a chimera of two distinct binding sites with different modes of binding: binding site I within the loop regions and binding site II on the concave face of the beta-sandwich fold. For CmCBM6 it has been shown that these two binding sites have different ligand specificities.	144
271147	cd04081	CBM35_galactosidase-like	Carbohydrate Binding Module family 35 (CBM35); appended mainly to enzymes that bind alpha-D-galactose (CBM35-Gal), including glycoside hydrolase (GH) families GH27 and GH43. This family includes carbohydrate binding module family 35 (CBM35); these are non-catalytic carbohydrate binding domains that are appended mainly to enzymes that bind alpha-D-galactose (CBM35-Gal), including glycoside hydrolase (GH) families GH27 and GH43. Examples of proteins which contain CBM35s belonging to this family includes the CBM35 of an exo-beta-1,3-galactanase from Phanerochaete chrysosporium 9 (Pc1,3Gal43A)  which is appended to a GH43 domain, and the CBM35 domain of two bifunctional proteins with beta-L-arabinopyranosidase/alpha-D-galactopyranosidase activities from Fusarium oxysporum 12S, Foap1 and Foap2 (Fo/AP1 and Fo/AP2), that are appended to GH27 domains. CBM35s are unique in that they display conserved specificity through extensive sequence similarity but divergent function through their appended catalytic modules. They are known to bind alpha-D-galactose (Gal), mannan (Man), xylan, glucuronic acid (GlcA), a beta-polymer of mannose, and possibly glucans, forming four subfamilies based on general ligand specificities (galacto, urono, manno, and gluco configurations). Some CBM35s bind their ligands in a calcium-dependent manner. In contrast to most CBMs that are generally rigid proteins, CBM35 undergoes significant conformational change upon ligand binding. GH43 includes beta-xylosidases and beta-xylanases, using aryl-glycosides as substrates, while family GH27 includes alpha-galactosidases, alpha-N-acetylgalactosaminidases, and isomaltodextranases.	125
271148	cd04082	CBM35_pectate_lyase-like	Carbohydrate Binding Module family 35 (CBM35), pectate lyase-like; appended mainly to enzymes that bind mannan (Man), xylan, glucuronic acid (GlcA) and possibly glucans. This family includes carbohydrate binding module family 35 (CBM35) domains that are non-catalytic carbohydrate binding domains that are appended mainly to enzymes that bind mannan (Man), xylan, glucuronic acid (GlcA) and possibly glucans. Included in this family are CBM35s of pectate lyases, including pectate lyase 10A from Cellvibrio japonicus; these enzymes release delta-4,5-anhydrogalacturonic acid (delta4,5-GalA) from pectin, thus identifying a signature molecule for plant cell wall degradation. CBM35s are unique in that they display conserved specificity through extensive sequence similarity but divergent function through their appended catalytic modules. They are known to bind alpha-D-galactose (Gal), mannan (Man), xylan, glucuronic acid (GlcA), a beta-polymer of mannose, and possibly glucans, forming four subfamilies based on general ligand specificities (galacto, urono, manno, and gluco configurations). In contrast to most CBMs that are generally rigid proteins, CBM35 undergoes significant conformational change upon ligand binding. Some CBM35s bind their ligands in a calcium-dependent manner, especially those binding uronic acids.	124
271149	cd04083	CBM35_Lmo2446-like	Carbohydrate Binding Module 35 (CBM35) domains similar to Lmo2446. This family includes carbohydrate binding module 35 (CBM35) domains that are appended to several carbohydrate binding enzymes. Some CBM35 domains belonging to this family are appended to glycoside hydrolase (GH) family domains, including glycoside hydrolase family 31 (GH31), for example the CBM35 domain of Lmo2446, an uncharacterized protein from Listeria monocytogenes EGD-e. These CBM35s are non-catalytic carbohydrate binding domains that facilitate the strong binding of the GH catalytic modules with their dedicated, insoluble substrates. GH31 has a wide range of hydrolytic activities such as alpha-glucosidase, alpha-xylosidase, 6-alpha-glucosyltransferase, or alpha-1,4-glucan lyase, cleaving a terminal carbohydrate moiety from a substrate that may be a starch or a glycoprotein. Most characterized GH31 enzymes are alpha-glucosidases.	125
271150	cd04084	CBM6_xylanase-like	Carbohydrate Binding Module 6 (CBM6); many are appended to glycoside hydrolase (GH) family 11 and GH43 xylanase domains. This family includes carbohydrate binding module 6 (CBM6) domains that are appended mainly to glycoside hydrolase (GH) family domains, including GH3, GH11, and GH43 domains. These CBM6s are non-catalytic carbohydrate binding domains that facilitate the strong binding of the GH catalytic modules with their dedicated, insoluble substrates. Examples of proteins having CBM6s belonging to this family are Microbispora bispora GghA, a 1,4-beta-D-glucan glucohydrolase (GH3); Clostridium thermocellum xylanase U (GH11); and Penicillium purpurogenum ABF3, a bifunctional alpha-L-arabinofuranosidase/xylobiohydrolase (GH43). GH3 comprises enzymes with activities including beta-glucosidase (hydrolyzes beta-galactosidase) and beta-xylosidase (hydrolyzes 1,4-beta-D-xylosidase). GH11 family comprises enzymes with xylanase (endo-1,4-beta-xylanase) activity which catalyze the hydrolysis of beta-1,4 bonds of xylan, the major component of hemicelluloses, to generate xylooligosaccharides and xylose. GH43 includes beta-xylosidases and beta-xylanases, using aryl-glycosides as substrates. CBM6 is an unusual CBM as it represents a chimera of two distinct binding sites with different modes of binding: binding site I within the loop regions and binding site II on the concave face of the beta-sandwich fold.	123
271151	cd04085	delta_endotoxin_C	delta-endotoxin C-terminal domain may be associated with carbohydrate binding functionality. Delta-endotoxin C-terminal domain (delta endotoxin domain III) is part of the activated region of delta endotoxins, which are insecticidal toxins produced during sporulation by Bacillus species of bacteria. The activated endotoxin binds to the gut epithelium and causes cell lysis leading to death. This activated region of the delta endotoxin is composed of three structural domains. The N-terminal helical domain (I) is involved in membrane insertion and pore formation, while the second and third domains (II and III) are involved in receptor binding. Domain III structurally resembles the carbohydrate binding domain 6 (CBM6) and it is possible that insect specificity is determined by protein-protein or protein-carbohydrate interactions mediated by both domains II and III of the toxin. Delta-endotoxins are of great interest for development of new bioinsecticides and in the control of mosquitoes.	152
271152	cd04086	CBM35_mannanase-like	Carbohydrate Binding Module 35 (CBM35); appended to several carbohydrate binding enzymes, including several glycoside hydrolase (GH) family 26 mannanase domains. This family includes carbohydrate binding module 35 (CBM35) domains that are appended to several carbohydrate binding enzymes, including periplasmic component of ABC-type sugar transport system involved in carbohydrate transport and metabolism, and several glycoside hydrolase (GH) domains, including GH26. These CBM35s are non-catalytic carbohydrate binding domains that facilitate the strong binding of the GH catalytic modules with their dedicated, insoluble substrates. Examples of proteins having CBM35s belonging to this family are mannanase A from Clostridium thermocellum (GH26), Man26B from Paenibacillus sp. BME-14 (GH26), and the multifunctional Cel44C-Man26A from Paenibacillus polymyxa GS01 (which has two GH domains, GH44 and GH26). GH26 mainly includes mannan endo-1,4-beta-mannosidase which hydrolyzes 1,4-beta-D-linkages in mannans, galacto-mannans, glucomannans, and galactoglucomannans, but displays little activity towards other plant cell wall polysaccharides. A few proteins belonging to this family have additional CBM3 domains; these CBM3s are not found in the CBM6-CBM35-CBM36_like superfamily.	119
239754	cd04087	PTPA	Phosphotyrosyl phosphatase activator (PTPA) is also known as protein phosphatase 2A (PP2A) phosphatase activator. PTPA is an essential, well conserved protein that stimulates the tyrosyl phosphatase activity of PP2A. It also reactivates the serine/threonine phosphatase activity of an inactive form of PP2A. Together, PTPA and PP2A constitute an ATPase. It has been suggested that PTPA alters the relative specificity of PP2A from phosphoserine/phosphothreonine substrates to phosphotyrosine substrates in an ATP-hydrolysis-dependent manner. Basal expression of PTPA is controlled by the transcription factor Yin Yang1 (YY1). PTPA has been suggested to play a role in the insertion of metals to the PP2A catalytic subunit (PP2Ac) active site, to act as a chaperone, and more recently, to have peptidyl prolyl cis/trans isomerase activity that specifically targets human PP2Ac.	266
293905	cd04088	EFG_mtEFG_II	Domain II of bacterial elongation factor G and C-terminal domain of mitochondrial Elongation factors G1 and G2. This family represents the domain II of bacterial Elongation factor G (EF-G) and mitochondrial Elongation factors G1 (mtEFG1) and G2 (mtEFG2). During the process of peptide synthesis and tRNA site changes, the ribosome is moved along the mRNA a distance equal to one codon with the addition of each amino acid. In bacteria this translocation step is catalyzed by EF-G_GTP, which is hydrolyzed to provide the required energy. Thus, this action releases the uncharged tRNA from the P site and transfers the newly formed peptidyl-tRNA from the A site to the P site. Eukaryotic cells harbor 2 protein synthesis systems: one localized in the cytoplasm, the other in the mitochondria. Most factors regulating mitochondrial protein synthesis are encoded by nuclear genes, translated in the cytoplasm, and then transported to the mitochondria. The eukaryotic system of elongation factor (EF) components is more complex than that in prokaryotes, with both cytoplasmic and mitochondrial elongation factors and multiple isoforms being expressed in certain species. mtEFG1 and mtEFG2 show significant homology to bacterial EF-Gs. Mutants in yeast mtEFG1 have impaired mitochondrial protein synthesis, respiratory defects and a tendency to lose mitochondrial DNA. No clear phenotype has been found for mutants in the yeast homolog of mtEFG2, MEF2.	83
293906	cd04089	eRF3_II	Domain II of the eukaryotic class II release factor. In eukaryotes, translation termination is mediated by two interacting release factors, eRF1 and eRF3, which act as class I and II factors, respectively. eRF1 functions as an omnipotent release factor, decoding all three stop codons and triggering the release of the nascent peptide catalyzed by the ribosome. eRF3 is a GTPase, which enhances termination efficiency by stimulating eRF1 activity in a GTP-dependent manner. Sequence comparison of class II release factors with elongation factors shows that eRF3 is more similar to eEF-1alpha whereas prokaryote RF3 is more similar to EF-G, implying that their precise function may differ. Only eukaryote RF3s are found in this group. Saccharomyces cerevisiae eRF3 (Sup35p) is a translation termination factor which is divided into three regions N, M and a C-terminal eEF1a-like region essential for translation termination. Sup35NM  is a non-pathogenic prion-like protein with the property of aggregating into polymer-like fibrils.	82
293907	cd04090	EF2_II_snRNP	Domain II of the spliceosomal 116kD U5 small nuclear ribonucleoprotein (snRNP) component. This subfamily includes domain II of the spliceosomal human 116kD U5 small nuclear ribonucleoprotein (snRNP) protein (U5-116 kD) and its yeast counterpart Snu114p. This domain is homologous to domain II of the eukaryotic translational elongation factor EF-2.  U5-116 kD is a GTPase which is a component of the spliceosome complex which processes precursor mRNAs to produce mature mRNAs.	94
293908	cd04091	mtEFG1_II_like	Domain II of mitochondrial elongation factor G1-like proteins found in eukaryotes. Eukaryotic cells harbor 2 protein synthesis systems: one localized in the cytoplasm, the other in the mitochondria. Most factors regulating mitochondrial protein synthesis are encoded by nuclear genes, translated in the cytoplasm, and then transported to the mitochondria. The eukaryotic system of elongation factor (EF) components is more complex than that in prokaryotes, with both cytoplasmic and mitochondrial elongation factors and multiple isoforms being expressed in certain species. Eukaryotic EF-2 operates in the cytosolic protein synthesis machinery of eukaryotes, EF-Gs in protein synthesis in bacteria. Eukaryotic mtEFG1 proteins show significant homology to bacterial EF-Gs. Mutants in yeast mtEFG1 have impaired mitochondrial protein synthesis, respiratory defects and a tendency to lose mitochondrial DNA. There are two forms of mtEFG present in mammals (designated mtEFG1s and mtEFG2s); mtEFG2s are not present in this group.	81
293909	cd04092	mtEFG2_II_like	Domain II of mitochondrial elongation factor G2-like proteins found in eukaryotes. Eukaryotic cells harbor 2 protein synthesis systems: one localized in the cytoplasm, the other in the mitochondria. Most factors regulating mitochondrial protein synthesis are encoded by nuclear genes, translated in the cytoplasm, and then transported to the mitochondria. The eukaryotic system of elongation factor (EF) components is more complex than that in prokaryotes, with both cytoplasmic and mitochondrial elongation factors and multiple isoforms being expressed in certain species. Eukaryotic EF-2 operates in the cytosolic protein synthesis machinery of eukaryotes, EF-Gs in protein synthesis in bacteria.  Eukaryotic mtEFG1 proteins show significant homology to bacterial EF-Gs. No clear phenotype has been found for mutants in the yeast homolog of mtEFG2, MEF2. There are two forms of mtEFG present in mammals (designated mtEFG1s and mtEFG2s); mtEFG1s are not present in this group.	83
294008	cd04093	HBS1_C_III	C-terminal domain of Hsp70 subfamily B suppressor 1 (HBS1). This model represents the C-terminal domain of Hsp70 subfamily B suppressor 1 (HBS1), which is homologous to the domain III of EF-1alpha. This group contains proteins similar to yeast Hbs1, which together with Dom34, promotes the No-go decay (NGD) of mRNA. The NGD targets mRNAs whose elongation stalled for degradation initiated by endonucleolytic cleavage in the vicinity of the stalled ribosome.	109
294009	cd04094	eSelB_III	Domain III of eukaryotic and archaeal elongation factor SelB. This model represents the domain III of archaeal and eukaryotic selenocysteine (Sec)-specific eukaryotic elongation factor (eEFSec or eSelB), which is homologous to domain III of EF-Tu. SelB is a specialized translation elongation factor responsible for the co-translational incorporation of selenocysteine into proteins by recoding of a UGA stop codon in the presence of a downstream mRNA hairpin loop, called Sec insertion sequence (SECIS) element.	114
294010	cd04095	CysN_NoDQ_III	Domain III of the large subunit of ATP sulfurylase (ATPS). This model represents domain III of the large subunit of ATP sulfurylase (ATPS): CysN or the N-terminal portion of NodQ, found mainly in proteobacteria and is homologous to domain III of EF-Tu. Escherichia coli ATPS consists of CysN and a smaller subunit, CysD. ATPS produces adenosine-5'-phosphosulfate (APS) from ATP and sulfate, coupled with GTP hydrolysis. In the subsequent reaction APS is phosphorylated by an APS kinase (CysC), to produce 3'-phosphoadenosine-5'-phosphosulfate (PAPS) for use in amino acid (aa) biosynthesis. The Rhizobiaceae group (alpha-proteobacteria) appears to carry out the same chemistry for the sulfation of a nodulation factor. In Rhizobium meliloti, the heterodimeric complex comprised of NodP and NodQ appears to possess both ATPS and APS kinase activities. The N- and C-termini of NodQ correspond to CysN and CysC, respectively. Other eubacteria, archaea, and eukaryotes use a different ATP sulfurylase, which shows no amino acid sequence similarity to CysN or NodQ. CysN and the N-terminal portion of NodQ show similarity to GTPases involved in translation, in particular, EF-Tu and EF-1alpha.	103
239763	cd04096	eEF2_snRNP_like_C	eEF2_snRNP_like_C: this family represents a C-terminal domain of eukaryotic elongation factor 2 (eEF-2) and a homologous domain of the spliceosomal human 116kD U5 small nuclear ribonucleoprotein (snRNP) protein (U5-116 kD) and its yeast counterpart Snu114p.  Yeast Snu114p is essential for cell viability and for splicing in vivo. U5-116 kD binds GTP.  Experiments suggest that GTP binding and probably GTP hydrolysis are important for the function of the U5-116 kD/Snu114p.   In complex with GTP, EF-2 promotes the translocation step of translation. During translocation the peptidyl-tRNA is moved from the A site to the P site, the uncharged tRNA from the P site to the E-site, and the mRNA is shifted one codon relative to the ribosome.	80
239764	cd04097	mtEFG1_C	mtEFG1_C: C-terminus of mitochondrial Elongation factor G1 (mtEFG1)-like proteins found in eukaryotes.  Eukaryotic cells harbor 2 protein synthesis systems: one localized in the cytoplasm, the other in the mitochondria. Most factors regulating mitochondrial protein synthesis are encoded by nuclear genes, translated in the cytoplasm, and then transported to the mitochondria. The eukaryotic system of elongation factor (EF) components is more complex than that in prokaryotes, with both cytoplasmic and mitochondrial elongation factors and multiple isoforms being expressed in certain species.  Eukaryotic EF-2 operates in the cytosolic protein synthesis machinery of eukaryotes, EF-Gs in protein synthesis in bacteria.  Eukaryotic mtEFG1 proteins show significant homology to bacterial EF-Gs.  Mutants in yeast mtEFG1 have impaired mitochondrial protein synthesis, respiratory defects and a tendency to lose mitochondrial DNA. There are two forms of mtEFG present in mammals (designated mtEFG1s and mtEFG2s); mtEFG2s are not present in this group.	78
239765	cd04098	eEF2_C_snRNP	eEF2_C_snRNP: This family includes a C-terminal portion of the spliceosomal human 116kD U5 small nuclear ribonucleoprotein (snRNP) protein (U5-116 kD) and its yeast counterpart Snu114p.  This domain is homologous to the C-terminal domain of the eukaryotic translational elongation factor EF-2.  Yeast Snu114p is essential for cell viability and for splicing in vivo. U5-116 kD binds GTP.  Experiments suggest that GTP binding and probably GTP hydrolysis are important for the function of the U5-116 kD/Snu114p.   In complex with GTP, EF-2 promotes the translocation step of translation. During translocation the peptidyl-tRNA is moved from the A site to the P site, the uncharged tRNA from the P site to the E-site, and the mRNA is shifted one codon relative to the ribosome.	80
239766	cd04100	Asp_Lys_Asn_RS_N	Asp_Lys_Asn_RS_N: N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). This domain is a beta-barrel domain (OB fold) involved in binding the tRNA anticodon stem-loop.  Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases (AspRS, AsnRS, and LysRS).  aaRSs catalyze the specific attachment of amino acids (AAs) to their cognate tRNAs during protein biosynthesis. This 2-step reaction involves i) the activation of the AA by ATP in the presence of magnesium ions, followed by ii) the transfer of the activated AA to the terminal ribose of tRNA.  In the case of the class2b aaRSs, the activated AA is attached to the 3'OH of the terminal ribose. Eukaryotes contain 2 sets of aaRSs, both of which are encoded by the nuclear genome. One set is concerned with cytoplasmic protein synthesis, whereas the other is concerned exclusively with mitochondrial protein synthesis. Included in this group are archaeal and archaeal-like AspRSs which are non-discriminating and can charge both tRNAAsp and tRNAAsn. E. coli cells have two isoforms of LysRSs (LysS and LysU) encoded by two distinct genes, which are differentially regulated. The cytoplasmic and the mitochondrial isoforms of human LysRS are encoded by a single gene. Yeast cytoplasmic and mitochondrial LysRSs participate in mitochondrial import of cytoplasmic tRNAlysCUU.  In addition to their housekeeping role, human LysRS may function as a signaling molecule that activates immune cells. Tomato LysRS may participate in a process possibly connected to oxidative-stress conditions or heavy metal uptake. It is known that human tRNAlys and LysRS are specifically packaged into HIV-1 suggesting a role for LysRS in tRNA packaging.  AsnRS is an immunodominant antigen of the filarial nematode Brugia malayi and is of interest as a target for anti-parasitic drug design. Human AsnRS has been shown to be a pro-inflammatory chemokine which interacts with CCR3 chemokine receptors on T cells, immature dendritic cells and macrophages.	85
206688	cd04101	RabL4	Rab GTPase-like family 4 (Rab-like4). RabL4 (Rab-like4) subfamily. RabL4s are novel proteins that have high sequence similarity with Rab family members, but display features that are distinct from Rabs, and have been termed Rab-like. As in other Rab-like proteins, RabL4 lacks a prenylation site at the C-terminus. The specific function of RabL4 remains unknown.	167
206689	cd04102	RabL3	Rab GTPase-like family 3 (Rab-like3). RabL3 (Rab-like3) subfamily. RabL3s are novel proteins that have high sequence similarity with Rab family members, but display features that are distinct from Rabs, and have been termed Rab-like. As in other Rab-like proteins, RabL3 lacks a prenylation site at the C-terminus. The specific function of RabL3 remains unknown.	204
133303	cd04103	Centaurin_gamma	Centaurin gamma (CENTG) GTPase. The centaurins (alpha, beta, gamma, and delta) are large, multi-domain proteins that all contain an ArfGAP domain and ankyrin repeats, and in some cases, numerous additional domains. Centaurin gamma contains an additional GTPase domain near its N-terminus. The specific function of this GTPase domain has not been well characterized, but centaurin gamma 2 (CENTG2) may play a role in the development of autism. Centaurin gamma 1 is also called PIKE (phosphatidyl inositol (PI) 3-kinase enhancer) and centaurin gamma 2 is also known as AGAP (ArfGAP protein with a GTPase-like domain, ankyrin repeats and a Pleckstrin homology domain) or GGAP. Three isoforms of PIKE have been identified. PIKE-S (short) and PIKE-L (long) are brain-specific isoforms, with PIKE-S restricted to the nucleus and PIKE-L found in multiple cellular compartments. A third isoform, PIKE-A was identified in human glioblastoma brain cancers and has been found in various tissues. GGAP has been shown to have high GTPase activity due to a direct intramolecular interaction between the N-terminal GTPase domain and the C-terminal ArfGAP domain. In human tissue, AGAP mRNA was detected in skeletal muscle, kidney, placenta, brain, heart, colon, and lung. Reduced expression levels were also observed in the spleen, liver, and small intestine.	158
206690	cd04104	p47_IIGP_like	p47 GTPase family includes IGTP, TGTP/Mg21, IRG-47, GTPI, LRG-47, and IIGP1. The p47 GTPase family consists of several highly homologous proteins, including IGTP, TGTP/Mg21, IRG-47, GTPI, LRG-47, and IIGP1. They are found in higher eukaryotes where they play a role in immune resistance against intracellular pathogens. p47 proteins exist at low resting levels in mouse cells, but are strongly induced by Type II interferon (IFN-gamma). IGTP is critical for resistance to Toxoplasma gondii infection and is involved in inhibition of Coxsackievirus-B3-induced apoptosis. TGTP was shown to limit vesicular stomatitis virus (VSV) infection of fibroblasts in vitro. IRG-47 is involved in resistance to T. gondii infection. LRG-47 has been implicated in resistance to T. gondii, Listeria monocytogenes, Leishmania, and mycobacterial infections. IIGP1 has been shown to localize to the ER and to the Golgi membranes in IFN-induced cells and inflamed tissues. In macrophages, IIGP1 interacts with hook3, a microtubule binding protein that participates in the organization of the cis-Golgi compartment.	197
206691	cd04105	SR_beta	Signal recognition particle receptor, beta subunit (SR-beta), together with SR-alpha, forms the heterodimeric signal recognition particle (SRP) receptor. Signal recognition particle receptor, beta subunit (SR-beta). SR-beta and SR-alpha form the heterodimeric signal recognition particle (SRP or SR) receptor that binds SRP to regulate protein translocation across the ER membrane. Nascent polypeptide chains are synthesized with an N-terminal hydrophobic signal sequence that binds SRP54, a component of the SRP. SRP directs targeting of the ribosome-nascent chain complex (RNC) to the ER membrane via interaction with the SR, which is localized to the ER membrane. The RNC is then transferred to the protein-conducting channel, or translocon, which facilitates polypeptide translocation across the ER membrane or integration into the ER membrane. SR-beta is found only in eukaryotes; it is believed to control the release of the signal sequence from SRP54 upon binding of the ribosome to the translocon. High expression of SR-beta has been observed in human colon cancer, suggesting it may play a role in the development of this type of cancer.	202
133306	cd04106	Rab23_like	Rab GTPase family 23 (Rab23)-like. Rab23-like subfamily. Rab23 is a member of the Rab family of small GTPases. In mouse, Rab23 has been shown to function as a negative regulator in the sonic hedgehog (Shh) signaling pathway. Rab23 mediates the activity of Gli2 and Gli3, transcription factors that regulate Shh signaling in the spinal cord, primarily by preventing Gli2 activation in the absence of Shh ligand. Rab23 also regulates a step in the cytoplasmic signal transduction pathway that mediates the effect of Smoothened (one of two integral membrane proteins that are essential components of the Shh signaling pathway in vertebrates). In humans, Rab23 is expressed in the retina. Mice contain an isoform that shares 93% sequence identity with the human Rab23 and an alternative splicing isoform that is specific to the brain. This isoform causes the murine open brain phenotype, indicating it may have a role in the development of the central nervous system. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation.	162
206692	cd04107	Rab32_Rab38	Rab GTPase families 32 (Rab32) and 38 (Rab38). Rab38/Rab32 subfamily. Rab32 and Rab38 are members of the Rab family of small GTPases. Human Rab32 was first identified in platelets but it is expressed in a variety of cell types, where it functions as an A-kinase anchoring protein (AKAP). Rab38 has been shown to be melanocyte-specific. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins.	201
206693	cd04108	Rab36_Rab34	Rab GTPase families 34 (Rab34) and 36 (Rab36). Rab34/Rab36 subfamily. Rab34, found primarily in the Golgi, interacts with its effector, Rab-interacting lysosomal protein (RILP). This enables its participation in microtubular dynein-dynactin-mediated repositioning of lysosomes from the cell periphery to the Golgi. A Rab34 (Rah) isoform that lacks the consensus GTP-binding region has been identified in mice. This isoform is associated with membrane ruffles and promotes macropinosome formation. Rab36 has been mapped to human chromosome 22q11.2, a region that is homozygously deleted in malignant rhabdoid tumors (MRTs). However, experimental assessments do not implicate Rab36 as a tumor suppressor that would enable tumor formation through a loss-of-function mechanism. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins.	170
206694	cd04109	Rab28	Rab GTPase family 28 (Rab28). Rab28 subfamily. First identified in maize, Rab28 has been shown to be a late embryogenesis-abundant (Lea) protein that is regulated by the plant hormone abscisic acid (ABA). In Arabidopsis, Rab28 is expressed during embryo development and is generally restricted to provascular tissues in mature embryos. Unlike maize Rab28, it is not ABA-inducible. Characterization of the human Rab28 homolog revealed two isoforms, which differ by a 95-base pair insertion, producing an alternative sequence for the 30 amino acids at the C-terminus. The two human isoforms are presumably the result of alternative splicing. Since they differ at the C-terminus but not in the GTP-binding region, they are predicted to be targeted to different cellular locations. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins.	213
133310	cd04110	Rab35	Rab GTPase family 35 (Rab35). Rab35 is one of several Rab proteins to be found to participate in the regulation of osteoclast cells in rats. In addition, Rab35 has been identified as a protein that interacts with nucleophosmin-anaplastic lymphoma kinase (NPM-ALK) in human cells. Overexpression of NPM-ALK is a key oncogenic event in some anaplastic large-cell lymphomas; since Rab35 interacts with NPM-ALK, it may provide a target for cancer treatments. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins.	199
133311	cd04111	Rab39	Rab GTPase family 39 (Rab39). Found in eukaryotes, Rab39 is mainly found in epithelial cell lines, but is distributed widely in various human tissues and cell lines. It is believed to be a novel Rab protein involved in regulating Golgi-associated vesicular transport during cellular endocytosis. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins.	211
206695	cd04112	Rab26	Rab GTPase family 26 (Rab26). Rab26 subfamily. First identified in rat pancreatic acinar cells, Rab26 is believed to play a role in recruiting mature granules to the plasma membrane upon beta-adrenergic stimulation. Rab26 belongs to the Rab functional group III, which are considered key regulators of intracellular vesicle transport during exocytosis. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins.	191
206696	cd04113	Rab4	Rab GTPase family 4 (Rab4). Rab4 subfamily. Rab4 has been implicated in numerous functions within the cell. It helps regulate endocytosis through the sorting, recycling, and degradation of early endosomes. Mammalian Rab4 is involved in the regulation of many surface proteins including G-protein-coupled receptors, transferrin receptor, integrins, and surfactant protein A. Experimental data implicate Rab4 in regulation of the recycling of internalized receptors back to the plasma membrane. It is also believed to influence receptor-mediated antigen processing in B-lymphocytes, in calcium-dependent exocytosis in platelets, in alpha-amylase secretion in pancreatic cells, and in insulin-induced translocation of Glut4 from internal vesicles to the cell surface. Rab4 is known to share effector proteins with Rab5 and Rab11. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation.	161
133314	cd04114	Rab30	Rab GTPase family 30 (Rab30). Rab30 subfamily. Rab30 appears to be associated with the Golgi stack. It is expressed in a wide variety of tissue types and in humans maps to chromosome 11. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation.	169
133315	cd04115	Rab33B_Rab33A	Rab GTPase family 33 includes Rab33A and Rab33B. Rab33B/Rab33A subfamily. Rab33B is ubiquitously expressed in mouse tissues and cells, where it is localized to the medial Golgi cisternae. It colocalizes with alpha-mannosidase II. Together with the other cisternal Rabs, Rab6A and Rab6A', it is believed to regulate the Golgi response to stress and is likely a molecular target in stress-activated signaling pathways. Rab33A (previously known as S10) is expressed primarily in the brain and immune system cells. In humans, it is located on the X chromosome at Xq26 and its expression is down-regulated in tuberculosis patients. Experimental evidence suggests that Rab33A is a novel CD8+ T cell factor that likely plays a role in tuberculosis disease processes. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation.	170
206697	cd04116	Rab9	Rab GTPase family 9 (Rab9). Rab9 is found in late endosomes, together with mannose 6-phosphate receptors (MPRs) and the tail-interacting protein of 47 kD (TIP47). Rab9 is a key mediator of vesicular transport from late endosomes to the trans-Golgi network (TGN) by redirecting the MPRs. Rab9 has been identified as a key component for the replication of several viruses, including HIV1, Ebola, Marburg, and measles, making it a potential target for inhibiting a variety of viruses. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation.	170
206698	cd04117	Rab15	Rab GTPase family 15 (Rab15). Rab15 colocalizes with the transferrin receptor in early endosome compartments, but not with late endosomal markers. It codistributes with Rab4 and Rab5 on early/sorting endosomes, and with Rab11 on pericentriolar recycling endosomes. It is believed to function as an inhibitory GTPase that regulates distinct steps in early endocytic trafficking. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation.	164
133318	cd04118	Rab24	Rab GTPase family 24 (Rab24). Rab24 is distinct from other Rabs in several ways. It exists primarily in the GTP-bound state, having a low intrinsic GTPase activity; it is not efficiently geranyl-geranylated at the C-terminus; it does not form a detectable complex with Rab GDP-dissociation inhibitors (GDIs); and it has recently been shown to undergo tyrosine phosphorylation when overexpressed in vitro. The specific function of Rab24 still remains unknown. It is found in a transport route between ER-cis-Golgi and late endocytic compartments. It is putatively involved in an autophagic pathway, possibly directing misfolded proteins in the ER to degradative pathways. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins.	193
133319	cd04119	RJL	Rab GTPase family J-like (RabJ-like). RJLs are found in many protists and as chimeras with C-terminal DNAJ domains in deuterostome metazoa. They are not found in plants, fungi, and protostome metazoa, suggesting a horizontal gene transfer between protists and deuterostome metazoa. RJLs lack any known membrane targeting signal and contain a degenerate phosphate/magnesium-binding 3 (PM3) motif, suggesting an impaired ability to hydrolyze GTP. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization.	168
206699	cd04120	Rab12	Rab GTPase family 12 (Rab12). Rab12 was first identified in canine cells, where it was localized to the Golgi complex. The specific function of Rab12 remains unknown, and inconsistent results about its cellular localization have been reported. More recent studies have identified Rab12 associated with post-Golgi vesicles, or with other small vesicle-like structures but not with the Golgi complex. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins.	202
133321	cd04121	Rab40	Rab GTPase family 40 (Rab40) contains Rab40a, Rab40b and Rab40c. The Rab40 subfamily contains Rab40a, Rab40b, and Rab40c, which are all highly homologous. In rat, Rab40c is localized to the perinuclear recycling compartment (PRC), and is distributed in a tissue-specific manner, with high expression in brain, heart, kidney, and testis, low expression in lung and liver, and no expression in spleen and skeletal muscle. Rab40c is highly expressed in differentiated oligodendrocytes but minimally expressed in oligodendrocyte progenitors, suggesting a role in the vesicular transport of myelin components. Unlike most other Ras-superfamily proteins, Rab40c was shown to have a much lower affinity for GTP, and an affinity for GDP that is lower than for GTP. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins.	189
133322	cd04122	Rab14	Rab GTPase family 14 (Rab14). Rab14 GTPases are localized to biosynthetic compartments, including the rough ER, the Golgi complex, and the trans-Golgi network, and to endosomal compartments, including early endosomal vacuoles and associated vesicles. Rab14 is believed to function in both the biosynthetic and recycling pathways between the Golgi and endosomal compartments. Rab14 has also been identified on GLUT4 vesicles, and has been suggested to help regulate GLUT4 translocation. In addition, Rab14 is believed to play a role in the regulation of phagocytosis. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation.	166
133323	cd04123	Rab21	Rab GTPase family 21 (Rab21). The localization and function of Rab21 are not clearly defined, with conflicting data reported. Rab21 has been reported to localize in the ER in human intestinal epithelial cells, with partial colocalization with alpha-glucosidase, a late endosomal/lysosomal marker. More recently, Rab21 was shown to colocalize with and affect the morphology of early endosomes. In Dictyostelium, GTP-bound Rab21, together with two novel LIM domain proteins, LimF and ChLim, has been shown to regulate phagocytosis. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation.	162
133324	cd04124	RabL2	Rab GTPase-like family 2 (Rab-like2). RabL2 (Rab-like2) subfamily. RabL2s are novel Rab proteins identified recently which display features that are distinct from other Rabs, and have been termed Rab-like. RabL2 contains RabL2a and RabL2b, two very similar Rab proteins that share > 98% sequence identity in humans. RabL2b maps to the subtelomeric region of chromosome 22q13.3 and RabL2a maps to 2q13, a region that suggests it is also a subtelomeric gene. Both genes are believed to be expressed ubiquitously, suggesting that RabL2s are the first example of duplicated genes in human proximal subtelomeric regions that are both expressed actively. Like other Rab-like proteins, RabL2s lack a prenylation site at the C-terminus. The specific functions of RabL2a and RabL2b remain unknown. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization.	161
133326	cd04126	Rab20	Rab GTPase family 20 (Rab20). Rab20 is one of several Rab proteins that appear to be restricted in expression to the apical domain of murine polarized epithelial cells. It is expressed on the apical side of polarized kidney tubule and intestinal epithelial cells, and in non-polarized cells. It also localizes to vesico-tubular structures below the apical brush border of renal proximal tubule cells and in the apical region of duodenal epithelial cells. Rab20 has also been shown to colocalize with vacuolar H+-ATPases (V-ATPases) in mouse kidney cells, suggesting a role in the regulation of V-ATPase traffic in specific portions of the nephron. It was also shown to be one of several proteins whose expression is upregulated in human myelodysplastic syndrome (MDS) patients. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins.	220
206700	cd04127	Rab27A	Rab GTPase family 27a (Rab27a). The Rab27a subfamily consists of Rab27a and its highly homologous isoform, Rab27b. Unlike most Rab proteins whose functions remain poorly defined, Rab27a has many known functions. Rab27a has multiple effector proteins, and depending on which effector it binds, Rab27a has different functions as well as tissue distribution and/or cellular localization. Putative functions have been assigned to Rab27a when associated with the effector proteins Slp1, Slp2, Slp3, Slp4, Slp5, DmSlp, rabphilin, Dm/Ce-rabphilin, Slac2-a, Slac2-b, Slac2-c, Noc2, JFC1, and Munc13-4. Rab27a has been associated with several human diseases, including hemophagocytic syndrome (Griscelli syndrome or GS), Hermansky-Pudlak syndrome, and choroideremia. In the case of GS, a rare, autosomal recessive disease, a Rab27a mutation is directly responsible for the disorder. When Rab27a is localized to the secretory granules of pancreatic beta cells, it is believed to mediate glucose-stimulated insulin secretion, making it a potential target for diabetes therapy. When bound to JFC1 in prostate cells, Rab27a is believed to regulate the exocytosis of prostate-specific markers. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C-terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation.	180
206701	cd04128	Spg1	Septum-promoting GTPase (Spg1). Spg1p. Spg1p (septum-promoting GTPase) was first identified in the fission yeast S. pombe, where it regulates septum formation in the septation initiation network (SIN) through the cdc7 protein kinase. Spg1p is an essential gene that localizes to the spindle pole bodies. When GTP-bound, it binds cdc7 and causes it to translocate to spindle poles. Sid4p (septation initiation defective) is required for localization of Spg1p to the spindle pole body, and the ability of Spg1p to promote septum formation from any point in the cell cycle depends on Sid4p. Spg1p is negatively regulated by Byr4 and cdc16, which form a two-component GTPase activating protein (GAP) for Spg1p. The existence of a SIN-related pathway in plants has been proposed. GTPase activating proteins (GAPs) interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization.	182
206702	cd04129	Rho2	Ras homology family 2 (Rho2) of small guanosine triphosphatases (GTPases). Rho2 is a fungal GTPase that plays a role in cell morphogenesis, control of cell wall integrity, control of growth polarity, and maintenance of growth direction. Rho2 activates the protein kinase C homolog Pck2, and Pck2 controls Mok1, the major (1-3) alpha-D-glucan synthase. Together with Rho1 (RhoA), Rho2 regulates the construction of the cell wall. Unlike Rho1, Rho2 is not an essential protein, but its overexpression is lethal. Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for proper intracellular localization via membrane attachment. As with other Rho family GTPases, the GDP/GTP cycling is regulated by GEFs (guanine nucleotide exchange factors), GAPs (GTPase-activating proteins) and GDIs (guanine nucleotide dissociation inhibitors).	190
133330	cd04130	Wrch_1	Wnt-1 responsive Cdc42 homolog (Wrch-1) is a Rho family GTPase similar to Cdc42. Wrch-1 (Wnt-1 responsive Cdc42 homolog) is a Rho family GTPase that shares significant sequence and functional similarity with Cdc42. Wrch-1 was first identified in mouse mammary epithelial cells, where its transcription is upregulated in Wnt-1 transformation. Wrch-1 contains N- and C-terminal extensions relative to cdc42, suggesting potential differences in cellular localization and function. The Wrch-1 N-terminal extension contains putative SH3 domain-binding motifs and has been shown to bind the SH3 domain-containing protein Grb2, which increases the level of active Wrch-1 in cells. Unlike Cdc42, which localizes to the cytosol and perinuclear membranes, Wrch-1 localizes extensively with the plasma membrane and endosomes. The membrane association, localization, and biological activity of Wrch-1 indicate an atypical model of regulation distinct from other Rho family GTPases. Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Rho proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation.	173
206703	cd04131	Rnd	Rho family GTPase subfamily Rnd includes Rnd1/Rho6, Rnd2/Rho7, and Rnd3/RhoE/Rho8. The Rnd subfamily contains Rnd1/Rho6, Rnd2/Rho7, and Rnd3/RhoE/Rho8. These novel Rho family proteins have substantial structural differences compared to other Rho members, including N- and C-terminal extensions relative to other Rhos. Rnd3/RhoE is farnesylated at the C-terminal prenylation site, unlike most other Rho proteins that are geranylgeranylated. In addition, Rnd members are unable to hydrolyze GTP and are resistant to GAP activity. They are believed to exist only in the GTP-bound conformation, and are antagonists of RhoA activity. Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Rho proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation.	176
206704	cd04132	Rho4_like	Ras homology family 4 (Rho4) of small guanosine triphosphatases (GTPases)-like. Rho4 is a GTPase that controls septum degradation by regulating secretion of Eng1 or Agn1 during cytokinesis. Rho4 also plays a role in cell morphogenesis. Rho4 regulates septation and cell morphology by controlling the actin cytoskeleton and cytoplasmic microtubules. The localization of Rho4 is modulated by Rdi1, which may function as a GDI, and by Rga9, which is believed to function as a GAP. In S. pombe, both Rho4 deletion and Rho4 overexpression result in a defective cell wall, suggesting a role for Rho4 in maintaining cell wall integrity. Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Rho proteins.	197
206705	cd04133	Rop_like	Rho-related protein from plants (Rop)-like. The Rop (Rho-related protein from plants) subfamily plays a role in diverse cellular processes, including cytoskeletal organization, pollen and vegetative cell growth, hormone responses, stress responses, and pathogen resistance. Rops are able to regulate several downstream pathways to amplify a specific signal by acting as master switches early in the signaling cascade. They transmit a variety of extracellular and intracellular signals. Rops are involved in establishing cell polarity in root-hair development, root-hair elongation, pollen-tube growth, cell-shape formation, responses to hormones such as abscisic acid (ABA) and auxin, responses to abiotic stresses such as oxygen deprivation, and disease resistance and disease susceptibility. An individual Rop can have a unique function or an overlapping function shared with other Rop proteins; in addition, a given Rop-regulated function can be controlled by one or multiple Rop proteins. For example, Rop1, Rop3, and Rop5 are all involved in pollen-tube growth; Rop2 plays a role in response to low-oxygen environments, cell-morphology, and root-hair development; root-hair development is also regulated by Rop4 and Rop6; Rop6 is also responsible for ABA response, and ABA response is also regulated by Rop10. Plants retain some of the regulatory mechanisms that are shared by other members of the Rho family, but have also developed a number of unique modes for regulating Rops. Unique RhoGEFs have been identified that are exclusively active toward Rop proteins, such as those containing the domain PRONE (plant-specific Rop nucleotide exchanger). Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Rho proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation.	173
206706	cd04134	Rho3	Ras homology family 3 (Rho3) of small guanosine triphosphatases (GTPases). Rho3 is a member of the Rho family found only in fungi. Rho3 is believed to regulate cell polarity by interacting with the diaphanous/formin family protein For3 to control both the actin cytoskeleton and microtubules. Rho3 is also believed to have a direct role in exocytosis that is independent of its role in regulating actin polarity. The function in exocytosis may be two-pronged: first, in the transport of post-Golgi vesicles from the mother cell to the bud, mediated by myosin (Myo2); second, in the docking and fusion of vesicles to the plasma membrane, mediated by an exocyst (Exo70) protein. Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Rho proteins.	185
206707	cd04135	Tc10	Rho GTPase TC10 (Tc10). TC10 is a Rho family protein that has been shown to induce microspike formation and neurite outgrowth in vitro. Its expression changes dramatically after peripheral nerve injury, suggesting an important role in promoting axonal outgrowth and regeneration. TC10 regulates translocation of insulin-stimulated GLUT4 in adipocytes and has also been shown to bind directly to Golgi COPI coat proteins. GTP-bound TC10 in vitro can bind numerous potential effectors. Depending on its subcellular localization and distinct functional domains, TC10 can differentially regulate two types of filamentous actin in adipocytes. TC10 mRNAs are highly expressed in three types of mouse muscle tissues: leg skeletal muscle, cardiac muscle, and uterus; they were also present in brain, with higher levels in adults than in newborns. TC10 has also been shown to play a role in regulating the expression of cystic fibrosis transmembrane conductance regulator (CFTR) through interactions with CFTR-associated ligand (CAL). The GTP-bound form of TC10 directs the trafficking of CFTR from the juxtanuclear region to the secretory pathway toward the plasma membrane, away from CAL-mediated CFTR degradation in the lysosome. Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Rho proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation.	174
206708	cd04136	Rap_like	Rap-like family consists of Rap1, Rap2 and RSR1. The Rap subfamily consists of Rap1, Rap2, and RSR1. Rap subfamily proteins perform different cellular functions, depending on the isoform and its subcellular localization. For example, in rat salivary gland, neutrophils, and platelets, Rap1 localizes to secretory granules and is believed to regulate exocytosis or the formation of secretory granules. Rap1 has also been shown to localize in the Golgi of rat fibroblasts, zymogen granules, plasma membrane, and microsomal membrane of the pancreatic acini, as well as in the endocytic compartment of skeletal muscle cells and fibroblasts. Rap1 localizes in the nucleus of human oropharyngeal squamous cell carcinomas (SCCs) and cell lines. Rap1 plays a role in phagocytosis by controlling the binding of adhesion receptors (typically integrins) to their ligands. In yeast, Rap1 has been implicated in multiple functions, including activation and silencing of transcription and maintenance of telomeres. Rap2 is involved in multiple functions, including activation of c-Jun N-terminal kinase (JNK) to regulate the actin cytoskeleton and activation of the Wnt/beta-catenin signaling pathway in embryonic Xenopus. A number of effector proteins for Rap2 have been identified, including isoform 3 of the human mitogen-activated protein kinase kinase kinase kinase 4 (MAP4K4) and Traf2- and Nck-interacting kinase (TNIK), and the RalGEFs RalGDS, RGL, and Rlf, which also interact with Rap1 and Ras. RSR1 is the fungal homolog of Rap1 and Rap2. In budding yeasts, it is involved in selecting a site for bud growth, which directs the establishment of cell polarization. The Rho family GTPase Cdc42 and its GEF, Cdc24, then establish an axis of polarized growth. It is believed that Cdc42 interacts directly with RSR1 in vivo. In filamentous fungi such as Ashbya gossypii, RSR1 is a key regulator of polar growth in the hypha. Most Ras proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Ras proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation.	164
206709	cd04137	RheB	Ras Homolog Enriched in Brain (RheB) is a small GTPase. Rheb (Ras Homolog Enriched in Brain) subfamily. Rheb was initially identified in rat brain, where its expression is elevated by seizures or by long-term potentiation. It is expressed ubiquitously, with elevated levels in muscle and brain. Rheb functions as an important mediator between the tuberous sclerosis complex proteins, TSC1 and TSC2, and the mammalian target of rapamycin (TOR) kinase to stimulate cell growth. TOR kinase regulates cell growth by controlling nutrient availability, growth factors, and the energy status of the cell. TSC1 and TSC2 form a dimeric complex that has tumor suppressor activity, and TSC2 is a GTPase activating protein (GAP) for Rheb. The TSC1/TSC2 complex inhibits the activation of TOR kinase through Rheb. Rheb has also been shown to induce the formation of large cytoplasmic vacuoles in a process that is dependent on the GTPase cycle of Rheb, but independent of the TOR kinase, suggesting Rheb plays a role in endocytic trafficking that leads to cell growth and cell-cycle progression. Most Ras proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Ras proteins.	180
133338	cd04138	H_N_K_Ras_like	Ras GTPase family containing H-Ras, N-Ras and K-Ras4A/4B. H-Ras/N-Ras/K-Ras subfamily. H-Ras, N-Ras, and K-Ras4A/4B are the prototypical members of the Ras family. These isoforms generate distinct signal outputs despite interacting with a common set of activators and effectors, and are strongly associated with oncogenic progression in tumor initiation. Mutated versions of Ras that are insensitive to GAP stimulation (and are therefore constitutively active) are found in a significant fraction of human cancers. Many Ras guanine nucleotide exchange factors (GEFs) have been identified. They are sequestered in the cytosol until activation by growth factors triggers recruitment to the plasma membrane or Golgi, where the GEF colocalizes with Ras. Active (GTP-bound) Ras interacts with several effector proteins that stimulate a variety of diverse cytoplasmic signaling activities. Some are known to positively mediate the oncogenic properties of Ras, including Raf, phosphatidylinositol 3-kinase (PI3K), RalGEFs, and Tiam1. Others are proposed to play negative regulatory roles in oncogenesis, including RASSF and NORE/MST1. Most Ras proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Ras proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation.	162
206710	cd04139	RalA_RalB	Ral (Ras-like) family containing highly homologous RalA and RalB. The Ral (Ras-like) subfamily consists of the highly homologous RalA and RalB. Ral proteins are believed to play a crucial role in tumorigenesis, metastasis, endocytosis, and actin cytoskeleton dynamics. Despite their high sequence similarity (>80% sequence identity), nonoverlapping and opposing functions have been assigned to RalA and RalB in tumor migration. In human bladder and prostate cancer cells, RalB promotes migration while RalA inhibits it. A Ral-specific set of GEFs has been identified that are activated by Ras binding. This RalGEF activity is enhanced by Ras binding to another of its target proteins, phosphatidylinositol 3-kinase (PI3K). Ral effectors include RLIP76/RalBP1, a Rac/cdc42 GAP, and the exocyst (Sec6/8) complex, a hetero-octameric protein complex that is involved in tethering vesicles to specific sites on the plasma membrane prior to exocytosis. In rat kidney cells, RalB is required for functional assembly of the exocyst and for localizing the exocyst to the leading edge of migrating cells. In human cancer cells, RalA is required to support anchorage-independent proliferation and RalB is required to suppress apoptosis. RalA has been shown to localize to the plasma membrane while RalB is localized to the intracellular vesicles. Most Ras proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Ras proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation.	163
206711	cd04140	ARHI_like	A Ras homolog member I (ARHI). ARHI (A Ras homolog member I) is a member of the Ras family with several unique structural and functional properties. ARHI is expressed in normal human ovarian and breast tissue, but its expression is decreased or eliminated in breast and ovarian cancer. ARHI contains an N-terminal extension of 34 residues (human) that is required to retain its tumor suppressive activity. Unlike most other Ras family members, ARHI is maintained in the constitutively active (GTP-bound) state in resting cells and has modest GTPase activity. ARHI inhibits STAT3 (signal transducers and activators of transcription 3), a latent transcription factor whose abnormal activation plays a critical role in oncogenesis. Most Ras proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Ras proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation.	165
206712	cd04141	Rit_Rin_Ric	Ras-like protein in all tissues (Rit), Ras-like protein in neurons (Rin) and Ras-related protein which interacts with calmodulin (Ric). Rit (Ras-like protein in all tissues), Rin (Ras-like protein in neurons) and Ric (Ras-related protein which interacts with calmodulin) form a subfamily with several unique structural and functional characteristics. These proteins all lack the C-terminal CaaX lipid-binding motif typical of Ras family proteins, and Rin and Ric contain calmodulin-binding domains. Rin, which is expressed only in neurons, induces neurite outgrowth in rat pheochromocytoma cells through its association with calmodulin and its activation of endogenous Rac/cdc42. Rit, which is ubiquitously expressed in mammals, inhibits growth-factor withdrawal-mediated apoptosis and induces neurite extension in pheochromocytoma cells. Rit and Rin are both able to form a ternary complex with PAR6, a cell polarity-regulating protein, and Rac/cdc42. This ternary complex is proposed to have physiological function in processes such as tumorigenesis. Activated Ric is likely to signal in parallel with the Ras pathway or stimulate the Ras pathway at some upstream point, and binding of calmodulin to Ric may negatively regulate Ric activity.	172
133342	cd04142	RRP22	Ras-related protein on chromosome 22 (RRP22) family. RRP22 (Ras-related protein on chromosome 22) subfamily consists of proteins that inhibit cell growth and promote caspase-independent cell death. Unlike most Ras proteins, RRP22 is down-regulated in many human tumor cells due to promoter methylation. RRP22 localizes to the nucleolus in a GTP-dependent manner, suggesting a novel function in modulating transport of nucleolar components. Most Ras proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Ras proteins. Like most Ras family proteins, RRP22 is farnesylated.	198
133343	cd04143	Rhes_like	Ras homolog enriched in striatum (Rhes) and activator of G-protein signaling 1 (Dexras1/AGS1). This subfamily includes Rhes (Ras homolog enriched in striatum) and Dexras1/AGS1 (activator of G-protein signaling 1). These proteins are homologous, but exhibit significant differences in tissue distribution and subcellular localization. Rhes is found primarily in the striatum of the brain, but is also expressed in other areas of the brain, such as the cerebral cortex, hippocampus, inferior colliculus, and cerebellum. Rhes expression is controlled by thyroid hormones. In rat PC12 cells, Rhes is farnesylated and localizes to the plasma membrane. Rhes binds and activates PI3K, and plays a role in coupling serpentine membrane receptors with heterotrimeric G-protein signaling. Rhes has recently been shown to be reduced under conditions of dopamine supersensitivity and may play a role in determining dopamine receptor sensitivity. Dexras1/AGS1 is a dexamethasone-induced Ras protein that is expressed primarily in the brain, with low expression levels in other tissues. Dexras1 localizes primarily to the cytoplasm, and is a critical regulator of the circadian master clock to photic and nonphotic input. Most Ras proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Ras proteins.	247
133344	cd04144	Ras2	Rat sarcoma (Ras) family 2 of small guanosine triphosphatases (GTPases). The Ras2 subfamily, found exclusively in fungi, was first identified in Ustilago maydis. In U. maydis, Ras2 is regulated by Sql2, a protein that is homologous to GEFs (guanine nucleotide exchange factors) of the CDC25 family. Ras2 has been shown to induce filamentous growth, but the signaling cascade through which Ras2 and Sql2 regulate cell morphology is not known. Most Ras proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Ras proteins.	190
133345	cd04145	M_R_Ras_like	R-Ras2/TC21, M-Ras/R-Ras3. The M-Ras/R-Ras-like subfamily contains R-Ras2/TC21, M-Ras/R-Ras3, and related members of the Ras family. M-Ras is expressed in lympho-hematopoietic cells. It interacts with some of the known Ras effectors, but appears to also have its own effectors. Expression of mutated M-Ras leads to transformation of several types of cell lines, including hematopoietic cells, mammary epithelial cells, and fibroblasts. Overexpression of M-Ras is observed in carcinomas from breast, uterus, thyroid, stomach, colon, kidney, lung, and rectum. In addition, expression of a constitutively active M-Ras mutant in murine bone marrow induces a malignant mast cell leukemia that is distinct from the monocytic leukemia induced by H-Ras. TC21, along with H-Ras, has been shown to regulate the branching morphogenesis of ureteric bud cells in mice. Most Ras proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Ras proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation.	164
206713	cd04146	RERG_RasL11_like	Ras-related and Estrogen-Regulated Growth inhibitor (RERG) and Ras-like 11 (RasL11)-like families. RERG (Ras-related and Estrogen-Regulated Growth inhibitor) and Ras-like 11 are members of a novel subfamily of Ras that were identified based on their behavior in breast and prostate tumors, respectively. RERG expression was decreased or lost in a significant fraction of primary human breast tumors that lack estrogen receptor and are correlated with poor clinical prognosis. Elevated RERG expression correlated with favorable patient outcome in a breast tumor subtype that is positive for estrogen receptor expression. In contrast to most Ras proteins, RERG overexpression inhibited the growth of breast tumor cells in vitro and in vivo. RasL11 was found to be ubiquitously expressed in human tissue, but down-regulated in prostate tumors. Both RERG and RasL11 lack the C-terminal CaaX prenylation motif, where a = an aliphatic amino acid and X = any amino acid, and are localized primarily in the cytoplasm. Both are believed to have tumor suppressor activity.	166
206714	cd04147	Ras_dva	Ras - dorsal-ventral anterior localization (Ras-dva) family. Ras-dva subfamily. Ras-dva (Ras - dorsal-ventral anterior localization) subfamily consists of a set of proteins characterized only in Xenopus laevis, to date. In Xenopus, Ras-dva expression is activated by the transcription factor Otx2 and begins during gastrulation throughout the anterior ectoderm. Ras-dva expression is inhibited in the anterior neural plate by factor Xanf1. Downregulation of Ras-dva results in head development abnormalities through the inhibition of several regulators of the anterior neural plate and folds patterning, including Otx2, BF-1, Xag2, Pax6, Slug, and Sox9. Downregulation of Ras-dva also interferes with the FGF-8a signaling within the anterior ectoderm. Most Ras proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Ras proteins.	197
206715	cd04148	RGK	Rem, Rem2, Rad, Gem/Kir (RGK) subfamily of Ras GTPases. RGK subfamily. The RGK (Rem, Rem2, Rad, Gem/Kir) subfamily of Ras GTPases are expressed in a tissue-specific manner and are dynamically regulated by transcriptional and posttranscriptional mechanisms in response to environmental cues. RGK proteins bind to the beta subunit of L-type calcium channels, causing functional down-regulation of these voltage-dependent calcium channels, and either termination of calcium-dependent secretion or modulation of electrical conduction and contractile function. Inhibition of L-type calcium channels by Rem2 may provide a mechanism for modulating calcium-triggered exocytosis in hormone-secreting cells, and has been proposed to influence the secretion of insulin in pancreatic beta cells. RGK proteins also interact with and inhibit the Rho/Rho kinase pathway to modulate remodeling of the cytoskeleton. Two characteristics of RGK proteins cited in the literature are N-terminal and C-terminal extensions beyond the GTPase domain typical of Ras superfamily members. The N-terminal extension is not conserved among family members; the C-terminal extension is reported to be conserved among the family and lack the CaaX prenylation motif typical of membrane-associated Ras proteins. However, a putative CaaX motif has been identified in the alignment of the C-terminal residues of this CD.	219
206716	cd04149	Arf6	ADP ribosylation factor 6 (Arf6). Arf6 subfamily. Arf6 (ADP ribosylation factor 6) proteins localize to the plasma membrane, where they perform a wide variety of functions. In its active, GTP-bound form, Arf6 is involved in cell spreading, Rac-induced formation of plasma membrane ruffles, cell migration, wound healing, and Fc-mediated phagocytosis. Arf6 appears to change the actin structure at the plasma membrane by activating Rac, a Rho family protein involved in membrane ruffling. Arf6 is required for and enhances Rac formation of ruffles. Arf6 can regulate dendritic branching in hippocampal neurons, and in yeast it localizes to the growing bud, where it plays a role in polarized growth and bud site selection. In leukocytes, Arf6 is required for chemokine-stimulated migration across endothelial cells. Arf6 also plays a role in down-regulation of beta2-adrenergic receptors and luteinizing hormone receptors by facilitating the release of sequestered arrestin to allow endocytosis. Arf6 is believed to function at multiple sites on the plasma membrane through interaction with a specific set of GEFs, GAPs, and effectors. Arf6 has been implicated in breast cancer and melanoma cell invasion, and in actin remodelling at the invasion site of Chlamydia infection.	168
206717	cd04150	Arf1_5_like	ADP-ribosylation factor-1 (Arf1) and ADP-ribosylation factor-5 (Arf5). The Arf1-Arf5-like subfamily contains Arf1, Arf2, Arf3, Arf4, Arf5, and related proteins. Arfs1-5 are soluble proteins that are crucial for assembling coat proteins during vesicle formation. Each contains an N-terminal myristoylated amphipathic helix that is folded into the protein in the GDP-bound state. GDP/GTP exchange exposes the helix, which anchors to the membrane. Following GTP hydrolysis, the helix dissociates from the membrane and folds back into the protein. A general feature of Arf1-5 signaling may be the cooperation of two Arfs at the same site. Arfs1-5 are generally considered to be interchangeable in function and location, but some specific functions have been assigned. Arf1 localizes to the early/cis-Golgi, where it is activated by GBF1 and recruits the coat protein COPI. It also localizes to the trans-Golgi network (TGN), where it is activated by BIG1/BIG2 and recruits the AP1, AP3, AP4, and GGA proteins. Humans, but not rodents and other lower eukaryotes, lack Arf2. Human Arf3 shares 96% sequence identity with Arf1 and is believed to generally function interchangeably with Arf1. Human Arf4 in the activated (GTP-bound) state has been shown to interact with the cytoplasmic domain of epidermal growth factor receptor (EGFR) and mediate the EGF-dependent activation of phospholipase D2 (PLD2), leading to activation of the activator protein 1 (AP-1) transcription factor. Arf4 has also been shown to recognize the C-terminal sorting signal of rhodopsin and regulate its incorporation into specialized post-Golgi rhodopsin transport carriers (RTCs). There is some evidence that Arf5 functions at the early-Golgi and the trans-Golgi to affect Golgi-associated alpha-adaptin homology Arf-binding proteins (GGAs).	159
206718	cd04151	Arl1	ADP ribosylation factor-like 1 (Arl1). Arl1 subfamily. Arl1 (Arf-like 1) localizes to the Golgi complex, where it is believed to recruit effector proteins to the trans-Golgi network. Like most members of the Arf family, Arl1 is myristoylated at its N-terminal helix and mutation of the myristoylation site disrupts Golgi targeting. In humans, the Golgi-localized proteins golgin-97 and golgin-245 have been identified as Arl1 effectors. Golgins are large coiled-coil proteins found in the Golgi, and these golgins contain a C-terminal GRIP domain, which is the site of Arl1 binding. Additional Arl1 effectors include the GARP (Golgi-associated retrograde protein)/VFT (Vps53) vesicle-tethering complex and Arfaptin 2. Arl1 is not required for exocytosis, but appears necessary for trafficking from the endosomes to the Golgi. In Drosophila zygotes, mutation of Arl1 is lethal, and in the host-bloodstream form of Trypanosoma brucei, Arl1 is essential for viability.	158
206719	cd04152	Arl4_Arl7	Arf-like 4 (Arl4) and 7 (Arl7) GTPases. Arl4 (Arf-like 4) is highly expressed in testicular germ cells, and is found in the nucleus and nucleolus. In mice, Arl4 is developmentally expressed during embryogenesis, and a role in somite formation and central nervous system differentiation has been proposed. Arl7 has been identified as the only Arf/Arl protein to be induced by agonists of liver X-receptor and retinoid X-receptor and by cholesterol loading in human macrophages. Arl7 is proposed to play a role in transport between a perinuclear compartment and the plasma membrane, apparently linked to the ABCA1-mediated cholesterol secretion pathway. Older literature suggests that Arl6 is a part of the Arl4/Arl7 subfamily, but analyses based on more recent sequence data place Arl6 in its own subfamily.	183
133353	cd04153	Arl5_Arl8	Arf-like 5 (Arl5) and 8 (Arl8) GTPases. Arl5/Arl8 subfamily. Arl5 (Arf-like 5) and Arl8, like Arl4 and Arl7, are localized to the nucleus and nucleolus. Arl5 is developmentally regulated during embryogenesis in mice. Human Arl5 interacts with the heterochromatin protein 1-alpha (HP1alpha), a nonhistone chromosomal protein that is associated with heterochromatin and telomeres, and prevents telomere fusion. Arl5 may also play a role in embryonic nuclear dynamics and/or signaling cascades. Arl8 was identified from a fetal cartilage cDNA library. It is found in brain, heart, lung, cartilage, and kidney. No function has been assigned for Arl8 to date.	174
206720	cd04154	Arl2	Arf-like 2 (Arl2) GTPase. Arl2 (Arf-like 2) GTPases are members of the Arf family that bind GDP and GTP with very low affinity. Unlike most Arf family proteins, Arl2 is not myristoylated at its N-terminal helix. The protein PDE-delta, first identified in photoreceptor rod cells, binds specifically to Arl2 and is structurally very similar to RhoGDI. Despite the high structural similarity between Arl2 and Rho proteins and between PDE-delta and RhoGDI, the interactions between the GTPases and their effectors are very different. In its GTP-bound form, Arl2 interacts with the protein Binder of Arl2 (BART), and the complex is believed to play a role in mitochondrial adenine nucleotide transport. In its GDP-bound form, Arl2 interacts with tubulin-folding cofactor D; this interaction is believed to play a role in regulation of microtubule dynamics that impact the cytoskeleton, cell division, and cytokinesis.	173
206721	cd04155	Arl3	Arf-like 3 (Arl3) GTPase. Arl3 (Arf-like 3) is an Arf family protein that differs from most Arf family members in the N-terminal extension. In its inactive, GDP-bound form, the N-terminal extension forms an elongated loop that is hydrophobically anchored into the membrane surface; however, it has been proposed that this region might form a helix in the GTP-bound form. The delta subunit of the rod-specific cyclic GMP phosphodiesterase type 6 (PDEdelta) is an Arl3 effector. Arl3 binds microtubules in a regulated manner to alter specific aspects of cytokinesis via interactions with retinitis pigmentosa 2 (RP2). It has been proposed that RP2 functions in concert with Arl3 to link the cell membrane and the cytoskeleton in photoreceptors as part of the cell signaling or vesicular transport machinery. In mice, the absence of Arl3 is associated with abnormal epithelial cell proliferation and cyst formation.	174
133356	cd04156	ARLTS1	Arf-like tumor suppressor gene 1 (ARLTS1 or Arl11). ARLTS1 (Arf-like tumor suppressor gene 1), also known as Arl11, is a member of the Arf family of small GTPases that is believed to play a major role in apoptotic signaling. ARLTS1 is widely expressed and functions as a tumor suppressor gene in several human cancers. ARLTS1 is a low-penetrance suppressor that accounts for a small percentage of familial melanoma or familial chronic lymphocytic leukemia (CLL). ARLTS1 inactivation seems to occur most frequently through biallelic down-regulation by hypermethylation of the promoter. In breast cancer, ARLTS1 alterations were typically a combination of a hypomorphic polymorphism plus loss of heterozygosity. In a case of thyroid adenoma, ARLTS1 alterations were polymorphism plus promoter hypermethylation. The nonsense polymorphism Trp149Stop occurs with significantly greater frequency in familial cancer cases than in sporadic cancer cases, and the Cys148Arg polymorphism is associated with an increase in high-risk familial breast cancer.	160
206722	cd04157	Arl6	Arf-like 6 (Arl6) GTPase. Arl6 (Arf-like 6) forms a subfamily of the Arf family of small GTPases. Arl6 expression is limited to the brain and kidney in adult mice, but it is expressed in the neural plate and somites during embryogenesis, suggesting a possible role for Arl6 in early development. Arl6 is also believed to have a role in cilia or flagella function. Several proteins have been identified that bind Arl6, including Arl6 interacting protein (Arl6ip), and SEC61beta, a subunit of the heterotrimeric conducting channel SEC61p. Based on Arl6 binding to these effectors, Arl6 is also proposed to play a role in protein transport, membrane trafficking, or cell signaling during hematopoietic maturation. At least three specific homozygous Arl6 mutations in humans have been found to cause Bardet-Biedl syndrome, a disorder characterized by obesity, retinopathy, polydactyly, renal and cardiac malformations, learning disabilities, and hypogenitalism. Older literature suggests that Arl6 is a part of the Arl4/Arl7 subfamily, but analyses based on more recent sequence data place Arl6 in its own subfamily.	162
206723	cd04158	ARD1	ADP-ribosylation factor domain protein 1 (ARD1). ARD1 (ADP-ribosylation factor domain protein 1) is an unusual member of the Arf family. In addition to the C-terminal Arf domain, ARD1 has an additional 46-kDa N-terminal domain that contains a RING finger domain, two predicted B-Boxes, and a coiled-coil protein interaction motif. This domain belongs to the TRIM (tripartite motif) or RBCC (RING, B-Box, coiled-coil) family. Like most Arfs, the ARD1 Arf domain lacks detectable GTPase activity. However, unlike most Arfs, the full-length ARD1 protein has significant GTPase activity due to the GAP (GTPase-activating protein) activity exhibited by the 46-kDa N-terminal domain. The GAP domain of ARD1 is specific for its own Arf domain and does not bind other Arfs. The rate of GDP dissociation from the ARD1 Arf domain is slowed by the adjacent 15 amino acids, which act as a GDI (GDP-dissociation inhibitor) domain. ARD1 is ubiquitously expressed in cells and localizes to the Golgi and to the lysosomal membrane. Two Tyr-based motifs in the Arf domain are responsible for Golgi localization, while the GAP domain controls lysosomal localization.	169
206724	cd04159	Arl10_like	Arf-like 9 (Arl9) and 10 (Arl10) GTPases. Arl10-like subfamily. Arl9/Arl10 was identified from a human cancer-derived EST dataset. No functional information about the subfamily is available at the current time, but crystal structures of human Arl10b and Arl10c have been solved.	159
206725	cd04160	Arfrp1	Arf-related protein 1 (Arfrp1). Arfrp1 (Arf-related protein 1), formerly known as ARP, is a membrane-associated Arf family member that lacks the N-terminal myristoylation motif. Arfrp1 is mainly associated with the trans-Golgi compartment and the trans-Golgi network, where it regulates the targeting of Arl1 and the GRIP domain-containing proteins, golgin-97 and golgin-245, onto Golgi membranes. It is also involved in the anterograde transport of the vesicular stomatitis virus G protein from the Golgi to the plasma membrane, and in the retrograde transport of TGN38 and Shiga toxin from endosomes to the trans-Golgi network. Arfrp1 also inhibits Arf/Sec7-dependent activation of phospholipase D. Deletion of Arfrp1 in mice causes embryonic lethality at the gastrulation stage and apoptosis of mesodermal cells, indicating its importance in development.	168
133361	cd04161	Arl2l1_Arl13_like	Arl2-like protein 1 (Arl2l1) and Arl13. Arl2l1 (Arl2-like protein 1) and Arl13 form a subfamily of the Arf family of small GTPases. Arl2l1 was identified in human cells during a search for the gene(s) responsible for Bardet-Biedl syndrome (BBS). Like Arl6, the identified BBS gene, Arl2l1 is proposed to have cilia-specific functions. Arl13 is found on the X chromosome, but its expression has not been confirmed; it may be a pseudogene.	167
133362	cd04162	Arl9_Arfrp2_like	Arf-like 9 (Arl9)/Arfrp2-like GTPase. Arl9/Arfrp2-like subfamily. Arl9 (Arf-like 9) was first identified as part of the Human Cancer Genome Project. It maps to chromosome 4q12 and is sometimes referred to as Arfrp2 (Arf-related protein 2). This is a novel subfamily identified in human cancers that is uncharacterized to date.	164
206726	cd04163	Era	E. coli Ras-like protein (Era) is a multifunctional GTPase. Era (E. coli Ras-like protein) is a multifunctional GTPase found in all bacteria except some eubacteria. It binds to the 16S ribosomal RNA (rRNA) of the 30S subunit and appears to play a role in the assembly of the 30S subunit, possibly by chaperoning the 16S rRNA. It also contacts several assembly elements of the 30S subunit. Era couples cell growth with cytokinesis and plays a role in cell division and energy metabolism. Homologs have also been found in eukaryotes. Era contains two domains: the N-terminal GTPase domain and a C-terminal KH domain that is critical for RNA binding. Both domains are important for Era function. Era is functionally able to compensate for deletion of RbfA, a cold-shock adaptation protein that is required for efficient processing of the 16S rRNA.	168
206727	cd04164	trmE	trmE is a tRNA modification GTPase. TrmE (MnmE, ThdF, MSS1) is a 3-domain protein found in bacteria and eukaryotes. It controls modification of the uridine at the wobble position (U34) of tRNAs that read codons ending with A or G in the mixed codon family boxes. TrmE contains a GTPase domain that forms a canonical Ras-like fold. It functions as a molecular switch GTPase, and apparently uses a conformational change associated with GTP hydrolysis to promote the tRNA modification reaction, in which the conserved cysteine in the C-terminal domain is thought to function as a catalytic residue. In bacteria that are able to survive in extremely low pH conditions, TrmE regulates glutamate-dependent acid resistance.	159
206728	cd04165	GTPBP1_like	GTP binding protein 1 (GTPBP1)-like family includes GTPBP2. Mammalian GTP binding protein 1 (GTPBP1), GTPBP2, and nematode homologs AGP-1 and CGP-1 are GTPases whose specific functions remain unknown. In mouse, GTPBP1 is expressed in macrophages, in smooth muscle cells of various tissues and in some neurons of the cerebral cortex; GTPBP2 tissue distribution appears to overlap that of GTPBP1. In human leukemia and macrophage cell lines, expression of both GTPBP1 and GTPBP2 is enhanced by interferon-gamma (IFN-gamma). The chromosomal location of both genes has been identified in humans, with GTPBP1 located in chromosome 22q12-13.1 and GTPBP2 located in chromosome 6p21-12. Human glioblastoma multiforme (GBM), a highly-malignant astrocytic glioma and the most common cancer in the central nervous system, has been linked to chromosomal deletions and a translocation on chromosome 6. The GBM translocation results in a fusion of GTPBP2 and PTPRZ1, a protein involved in oligodendrocyte differentiation, recovery, and survival. This fusion product may contribute to the onset of GBM.	224
206729	cd04166	CysN_ATPS	CysN, together with protein CysD, forms the ATP sulfurylase (ATPS) complex. CysN_ATPS subfamily. CysN, together with protein CysD, forms the ATP sulfurylase (ATPS) complex in some bacteria and lower eukaryotes. ATPS catalyzes the production of adenosine 5'-phosphosulfate (APS) and pyrophosphate (PPi) from ATP and sulfate. CysD, which catalyzes ATP hydrolysis, is a member of the ATP pyrophosphatase (ATP PPase) family. CysN hydrolysis of GTP is required for CysD hydrolysis of ATP; however, CysN hydrolysis of GTP is not dependent on CysD hydrolysis of ATP. CysN is an example of lateral gene transfer followed by acquisition of new function. In many organisms, an ATPS exists which is not GTP-dependent and shares no sequence or structural similarity to CysN.	209
206730	cd04167	Snu114p	Snu114p, a spliceosome protein, is a GTPase. Snu114p subfamily. Snu114p is one of several proteins that make up the U5 small nuclear ribonucleoprotein (snRNP) particle. U5 is a component of the spliceosome, which catalyzes the splicing of pre-mRNA to remove introns. Snu114p is homologous to EF-2, but typically contains an additional N-terminal domain not found in EF-2. This protein is part of the GTP translation factor family and the Ras superfamily, characterized by five G-box motifs.	213
206731	cd04168	TetM_like	Tet(M)-like family includes Tet(M), Tet(O), Tet(W), and OtrA, containing tetracycline resistant proteins. Tet(M), Tet(O), Tet(W), and OtrA are tetracycline resistance genes found in Gram-positive and Gram-negative bacteria. Tetracyclines inhibit protein synthesis by preventing aminoacyl-tRNA from binding to the ribosomal acceptor site. This subfamily contains tetracycline resistance proteins that function through ribosomal protection and are typically found on mobile genetic elements, such as transposons or plasmids, and are often conjugative. Ribosomal protection proteins are homologous to the elongation factors EF-Tu and EF-G. EF-G and Tet(M) compete for binding on the ribosomes. Tet(M) has a higher affinity than EF-G, suggesting these two proteins may have overlapping binding sites and that Tet(M) must be released before EF-G can bind. Tet(M) and Tet(O) have been shown to have ribosome-dependent GTPase activity. These proteins are part of the GTP translation factor family, which includes EF-G, EF-Tu, EF2, LepA, and SelB.	237
206732	cd04169	RF3	Release Factor 3 (RF3) protein involved in the termination step of translation in bacteria. Peptide chain release factor 3 (RF3) is a protein involved in the termination step of translation in bacteria. Termination occurs when class I release factors (RF1 or RF2) recognize the stop codon at the A-site of the ribosome and activate the release of the nascent polypeptide. The class II release factor RF3 then initiates the release of the class I RF from the ribosome. RF3 binds to the RF/ribosome complex in the inactive (GDP-bound) state. GDP/GTP exchange occurs, followed by the release of the class I RF. Subsequent hydrolysis of GTP to GDP triggers the release of RF3 from the ribosome. RF3 also enhances the efficiency of class I RFs at less preferred stop codons and at stop codons in weak contexts.	268
206733	cd04170	EF-G_bact	Elongation factor G (EF-G) family. Translocation is mediated by EF-G (also called translocase). The structure of EF-G closely resembles that of the complex between EF-Tu and tRNA. This is an example of molecular mimicry; a protein domain evolved so that it mimics the shape of a tRNA molecule. EF-G in the GTP form binds to the ribosome, primarily through the interaction of its EF-Tu-like domain with the 50S subunit. The binding of EF-G to the ribosome in this manner stimulates the GTPase activity of EF-G. On GTP hydrolysis, EF-G undergoes a conformational change that forces its arm deeper into the A site on the 30S subunit. To accommodate this domain, the peptidyl-tRNA in the A site moves to the P site, carrying the mRNA and the deacylated tRNA with it. The ribosome may be prepared for these rearrangements by the initial binding of EF-G as well. The dissociation of EF-G leaves the ribosome ready to accept the next aminoacyl-tRNA into the A site. This group contains only bacterial members.	268
206734	cd04171	SelB	SelB, the dedicated elongation factor for delivery of selenocysteinyl-tRNA to the ribosome. SelB is an elongation factor needed for the co-translational incorporation of selenocysteine. Selenocysteine is coded by a UGA stop codon in combination with a specific downstream mRNA hairpin. In bacteria, the C-terminal part of SelB recognizes this hairpin, while the N-terminal part binds GTP and tRNA in analogy with elongation factor Tu (EF-Tu). It specifically recognizes the selenocysteine-charged tRNAsec, which has a UCA anticodon, in an EF-Tu-like manner. This allows insertion of selenocysteine at in-frame UGA stop codons. In E. coli, SelB binds GTP, selenocysteyl-tRNAsec, and a stem-loop structure immediately downstream of the UGA codon (the SECIS sequence). The absence of active SelB prevents the participation of selenocysteyl-tRNAsec in translation. Archaeal and animal mechanisms of selenocysteine incorporation are more complex. Although the SECIS elements have different secondary structures and conserved elements between archaea and eukaryotes, they do share a common feature. Unlike in E. coli, these SECIS elements are located in the 3' UTRs. This group contains bacterial SelBs, as well as one from archaea.	170
206735	cd04172	Rnd3_RhoE_Rho8	Rnd3/RhoE/Rho8 GTPases. Rnd3/RhoE/Rho8 subfamily. Rnd3/RhoE/Rho8 is a member of the novel Rho subfamily Rnd, together with Rnd1/Rho6 and Rnd2/Rho7. Rnd3/RhoE is known to bind the serine-threonine kinase ROCK I. Unphosphorylated Rnd3/RhoE associates primarily with membranes, but ROCK I-phosphorylated Rnd3/RhoE localizes in the cytosol. Phosphorylation of Rnd3/RhoE correlates with its activity in disrupting RhoA-induced stress fibers and inhibiting Ras-induced fibroblast transformation. In cells that lack stress fibers, such as macrophages and monocytes, Rnd3/RhoE induces a redistribution of actin, causing morphological changes in the cell. In addition, Rnd3/RhoE has been shown to inhibit cell cycle progression in G1 phase at a point upstream of the pRb family pocket protein checkpoint. Rnd3/RhoE has also been shown to inhibit Ras- and Raf-induced fibroblast transformation. In mammary epithelial tumor cells, Rnd3/RhoE regulates the assembly of the apical junction complex and tight junction formation. Rnd3/RhoE is underexpressed in prostate cancer cells both in vitro and in vivo; re-expression of Rnd3/RhoE suppresses cell cycle progression and increases apoptosis, suggesting it may play a role in tumor suppression. Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Rho proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation.	182
206736	cd04173	Rnd2_Rho7	Rnd2/Rho7 GTPases. Rnd2/Rho7 is a member of the novel Rho subfamily Rnd, together with Rnd1/Rho6 and Rnd3/RhoE/Rho8. Rnd2/Rho7 is transiently expressed in radially migrating cells in the brain while they are within the subventricular zone of the hippocampus and cerebral cortex. These migrating cells typically develop into pyramidal neurons. Cells that exogenously expressed Rnd2/Rho7 failed to migrate to upper layers of the brain, suggesting that Rnd2/Rho7 plays a role in the radial migration and morphological changes of developing pyramidal neurons, and that Rnd2/Rho7 degradation is necessary for proper cellular migration. The Rnd2/Rho7 GEF Rapostlin is found primarily in the brain and together with Rnd2/Rho7 induces dendrite branching. Unlike Rnd1/Rho6 and Rnd3/RhoE/Rho8, which are RhoA antagonists, Rnd2/Rho7 binds the GEF Pragmin and significantly stimulates RhoA activity and RhoA-mediated cell contraction. Rnd2/Rho7 is also found to be expressed in spermatocytes and early spermatids, with male-germ-cell Rac GTPase-activating protein (MgcRacGAP), where it localizes to the Golgi-derived pro-acrosomal vesicle. Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Rho proteins.	221
206737	cd04174	Rnd1_Rho6	Rnd1/Rho6 GTPases. Rnd1/Rho6 is a member of the novel Rho subfamily Rnd, together with Rnd2/Rho7 and Rnd3/RhoE/Rho8. Rnd1/Rho6 binds GTP but does not hydrolyze it to GDP, indicating that it is constitutively active. In rat, Rnd1/Rho6 is highly expressed in the cerebral cortex and hippocampus during synapse formation, and plays a role in spine formation. Rnd1/Rho6 is also expressed in the liver and in endothelial cells, and is upregulated in uterine myometrial cells during pregnancy. Like Rnd3/RhoE/Rho8, Rnd1/Rho6 is believed to function as an antagonist to RhoA. Most Rho proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Rho proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation.	232
133375	cd04175	Rap1	Rap1 family GTPase consists of Rap1a and Rap1b isoforms. The Rap1 subgroup is part of the Rap subfamily of the Ras family. It can be further divided into the Rap1a and Rap1b isoforms. In humans, Rap1a and Rap1b share 95% sequence homology, but are products of two different genes located on chromosomes 1 and 12, respectively. Rap1a is sometimes called smg p21 or Krev1 in the older literature. Rap1 proteins are believed to perform different cellular functions, depending on the isoform, its subcellular localization, and the effector proteins it binds. For example, in rat salivary gland, neutrophils, and platelets, Rap1 localizes to secretory granules and is believed to regulate exocytosis or the formation of secretory granules. Rap1 has also been shown to localize in the Golgi of rat fibroblasts, zymogen granules, plasma membrane, and the microsomal membrane of pancreatic acini, as well as in the endocytic compartment of skeletal muscle cells and fibroblasts. High expression of Rap1 has been observed in the nucleus of human oropharyngeal squamous cell carcinomas (SCCs) and cell lines; interestingly, in the SCCs, the active GTP-bound form localized to the nucleus, while the inactive GDP-bound form localized to the cytoplasm. Rap1 plays a role in phagocytosis by controlling the binding of adhesion receptors (typically integrins) to their ligands. In yeast, Rap1 has been implicated in multiple functions, including activation and silencing of transcription and maintenance of telomeres. Rap1a, which is stimulated by T-cell receptor (TCR) activation, is a positive regulator of T cells by directing integrin activation and augmenting lymphocyte responses. In murine hippocampal neurons, Rap1b determines which neurite will become the axon and directs the recruitment of Cdc42, which is required for formation of dendrites and axons. In murine platelets, Rap1b is required for normal homeostasis in vivo and is involved in integrin activation. Most Ras proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Ras proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation.	164
133376	cd04176	Rap2	Rap2 family GTPase consists of Rap2a, Rap2b, and Rap2c. The Rap2 subgroup is part of the Rap subfamily of the Ras family. It consists of Rap2a, Rap2b, and Rap2c. Both isoform 3 of the human mitogen-activated protein kinase kinase kinase kinase 4 (MAP4K4) and Traf2- and Nck-interacting kinase (TNIK) are putative effectors of Rap2 in mediating the activation of c-Jun N-terminal kinase (JNK) to regulate the actin cytoskeleton. In human platelets, Rap2 was shown to interact with the cytoskeleton by binding the actin filaments. In embryonic Xenopus development, Rap2 is necessary for the Wnt/beta-catenin signaling pathway. The Rap2 interacting protein 9 (RPIP9) is highly expressed in human breast carcinomas and correlates with a poor prognosis, suggesting a role for Rap2 in breast cancer oncogenesis. Rap2b, but not Rap2a, Rap2c, Rap1a, or Rap1b, is expressed in human red blood cells, where it is believed to be involved in vesiculation. A number of additional effector proteins for Rap2 have been identified, including the RalGEFs RalGDS, RGL, and Rlf, which also interact with Rap1 and Ras. Most Ras proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Ras proteins. Due to the presence of truncated sequences in this CD, the lipid modification site is not available for annotation.	163
133377	cd04177	RSR1	RSR1/Bud1p family GTPase. RSR1/Bud1p is a member of the Rap subfamily of the Ras family that is found in fungi. In budding yeasts, RSR1 is involved in selecting a site for bud growth on the cell cortex, which directs the establishment of cell polarization. The Rho family GTPase cdc42 and its GEF, cdc24, then establish an axis of polarized growth by organizing the actin cytoskeleton and secretory apparatus at the bud site. It is believed that cdc42 interacts directly with RSR1 in vivo. In filamentous fungi, polar growth occurs at the tips of hyphae and at novel growth sites along the extending hypha. In Ashbya gossypii, RSR1 is a key regulator of hyphal growth, localizing at the tip region and regulating apical polarization of the actin cytoskeleton. Most Ras proteins contain a lipid modification site at the C-terminus, with a typical sequence motif CaaX, where a = an aliphatic amino acid and X = any amino acid. Lipid binding is essential for membrane attachment, a key feature of most Ras proteins.	168
206753	cd04178	Nucleostemin_like	A circularly permuted subfamily of the Ras GTPases. Nucleostemin (NS) is a nucleolar protein that functions as a regulator of cell growth and proliferation in stem cells and in several types of cancer cells, but is not expressed in the differentiated cells of most mammalian adult tissues. NS shuttles between the nucleolus and nucleoplasm bidirectionally at a rate that is fast and independent of cell type. Lowering GTP levels decreases the nucleolar retention of NS, and expression of NS is abruptly down-regulated during differentiation prior to terminal cell division. Found only in eukaryotes, NS consists of an N-terminal basic domain, a coiled-coil domain, a GTP-binding domain, an intermediate domain, and a C-terminal acidic domain. Experimental evidence indicates that NS uses its GTP-binding property as a molecular switch to control the transition between the nucleolus and nucleoplasm, and this process involves interaction between the basic, GTP-binding, and intermediate domains of the protein.	171
133022	cd04179	DPM_DPG-synthase_like	DPM_DPG-synthase_like is a member of the Glycosyltransferase 2 superfamily. DPM1 is the catalytic subunit of eukaryotic dolichol-phosphate mannose (DPM) synthase. DPM synthase is required for synthesis of the glycosylphosphatidylinositol (GPI) anchor, N-glycan precursor, protein O-mannose, and C-mannose. In higher eukaryotes, the enzyme has three subunits, DPM1, DPM2 and DPM3. DPM is synthesized from dolichol phosphate and GDP-Man on the cytosolic surface of the ER membrane by DPM synthase and then is flipped onto the luminal side and used as a donor substrate. In lower eukaryotes, such as Saccharomyces cerevisiae and Trypanosoma brucei, DPM synthase consists of a single component (Dpm1p and TbDpm1, respectively) that possesses one predicted transmembrane region near the C terminus for anchoring to the ER membrane. In contrast, the Dpm1 homologues of higher eukaryotes, namely fission yeast, fungi, and animals, have no transmembrane region, suggesting the existence of adapter molecules for membrane anchoring. This family also includes bacterial and archaeal DPM1_like enzymes. However, the enzyme structure and mechanism of function are not well understood. The UDP-glucose:dolichyl-phosphate glucosyltransferase (DPG_synthase) is a transmembrane-bound enzyme of the endoplasmic reticulum involved in protein N-linked glycosylation. This enzyme catalyzes the transfer of glucose from UDP-glucose to dolichyl phosphate. This protein family belongs to the Glycosyltransferase 2 superfamily.	185
133023	cd04180	UGPase_euk_like	Eukaryotic UGPase-like includes UDPase and UDPGlcNAc pyrophosphorylase enzymes. This family includes UDP-Glucose Pyrophosphorylase (UDPase) and UDPGlcNAc pyrophosphorylase enzymes. The two enzymes share significant sequence and structure similarity. UDP-Glucose Pyrophosphorylase catalyzes the reversible production of UDP-Glucose and pyrophosphate (PPi) from Glucose-1-phosphate and UTP. UDP-glucose plays pivotal roles in galactose utilization, in glycogen synthesis, and in the synthesis of the carbohydrate moieties of glycolipids, glycoproteins, and proteoglycans. UDP-N-acetylglucosamine (UDPGlcNAc) pyrophosphorylase (UAP) (also named GlcNAc1P uridyltransferase) catalyzes the reversible conversion of UTP and GlcNAc1P to PPi and UDPGlcNAc, which is a key precursor of N- and O-linked glycosylations and is essential for the synthesis of chitin (a major component of the fungal cell wall) and of the glycosylphosphatidylinositol (GPI) linker anchoring a variety of cell surface proteins to the plasma membrane. In bacteria, UDPGlcNAc represents an essential precursor for both peptidoglycan and lipopolysaccharide biosynthesis.	266
133024	cd04181	NTP_transferase	NTP_transferases catalyze the transfer of nucleotides onto phosphosugars. Nucleotidyltransferases transfer nucleotides onto phosphosugars.  The enzyme family includes Alpha-D-Glucose-1-Phosphate Cytidylyltransferase, Mannose-1-phosphate guanyltransferase, and Glucose-1-phosphate thymidylyltransferase. The products are activated sugars that are precursors for synthesis of lipopolysaccharide, glycolipids and polysaccharides.	217
133025	cd04182	GT_2_like_f	GT_2_like_f is a subfamily of the glycosyltransferase family 2 (GT-2) with unknown function. GT-2 includes diverse families of glycosyltransferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. These are enzymes that catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. Glycosyltransferases have been classified into more than 90 distinct sequence based families.	186
133026	cd04183	GT2_BcE_like	GT2_BcbE_like is likely involved in the biosynthesis of the polysaccharide capsule. GT2_BcbE_like: The bcbE gene is one of the genes in the capsule biosynthetic locus of Pasteurella multocida. Its deduced product is likely involved in the biosynthesis of the polysaccharide capsule, which is found on the surface of a wide range of bacteria. It is a subfamily of Glycosyltransferase Family GT2, which includes diverse families of glycosyltransferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. These are enzymes that catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds.	231
133027	cd04184	GT2_RfbC_Mx_like	Myxococcus xanthus RfbC like proteins are required for O-antigen biosynthesis. The rfbC gene encodes a predicted protein of 1,276 amino acids, which is required for O-antigen biosynthesis in Myxococcus xanthus. It is a subfamily of Glycosyltransferase Family GT2, which includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. These are enzymes that catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds.	202
133028	cd04185	GT_2_like_b	Subfamily of Glycosyltransferase Family GT2 of unknown function. GT-2 includes diverse families of glycosyltransferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. These are enzymes that catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. Glycosyltransferases have been classified into more than 90 distinct sequence based families.	202
133029	cd04186	GT_2_like_c	Subfamily of Glycosyltransferase Family GT2 of unknown function. GT-2 includes diverse families of glycosyltransferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. These are enzymes that catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. Glycosyltransferases have been classified into more than 90 distinct sequence based families.	166
133030	cd04187	DPM1_like_bac	Bacterial DPM1_like enzymes are related to eukaryotic DPM1. A family of bacterial enzymes related to eukaryotic DPM1. Although the mechanism of the eukaryotic enzyme is well studied, the mechanism of the bacterial enzymes is not well understood. The eukaryotic DPM1 is the catalytic subunit of eukaryotic Dolichol-phosphate mannose (DPM) synthase. DPM synthase is required for synthesis of the glycosylphosphatidylinositol (GPI) anchor, N-glycan precursor, protein O-mannose, and C-mannose. The enzyme has three subunits, DPM1, DPM2 and DPM3. DPM is synthesized from dolichol phosphate and GDP-Man on the cytosolic surface of the ER membrane by DPM synthase and then is flipped onto the luminal side and used as a donor substrate. This protein family belongs to the Glycosyltransferase 2 superfamily.	181
133031	cd04188	DPG_synthase	DPG_synthase is involved in protein N-linked glycosylation. UDP-glucose:dolichyl-phosphate glucosyltransferase (DPG_synthase) is a transmembrane-bound enzyme of the endoplasmic reticulum involved in protein N-linked glycosylation. This enzyme catalyzes the transfer of glucose from UDP-glucose to dolichyl phosphate.	211
133032	cd04189	G1P_TT_long	G1P_TT_long represents the long form of glucose-1-phosphate thymidylyltransferase. This family is the long form of Glucose-1-phosphate thymidylyltransferase. Glucose-1-phosphate thymidylyltransferase catalyses the formation of dTDP-glucose from dTTP and glucose 1-phosphate. It is the first enzyme in the biosynthesis of dTDP-L-rhamnose, a cell wall constituent and a feedback inhibitor of the enzyme. There are two forms of Glucose-1-phosphate thymidylyltransferase in bacteria and archaea: a short form and a long form. The long form, which has an extra 50 amino acids at the C-terminus, is found in many species for which it serves as a sugar-activating enzyme for antibiotic biosynthesis and/or other, unknown pathways, and in which dTDP-L-rhamnose is not necessarily produced. The long form enzymes also have a left-handed parallel helix domain at the C-terminus, whereas the short form enzymes do not have this domain. The homotetrameric, feedback inhibited short form is found in numerous bacterial species that produce dTDP-L-rhamnose.	236
133033	cd04190	Chitin_synth_C	C-terminal domain of Chitin Synthase catalyzes the incorporation of GlcNAc from substrate UDP-GlcNAc into chitin. Chitin synthase, also called UDP-N-acetyl-D-glucosamine:chitin 4-beta-N-acetylglucosaminyltransferase, catalyzes the incorporation of GlcNAc from substrate UDP-GlcNAc into chitin, which is a linear homopolymer of GlcNAc residues formed by covalent beta-1,4 linkages. Chitin is an important component of the cell wall of fungi and bacteria and it is synthesized on the cytoplasmic surface of the cell membrane by  membrane bound chitin synthases. Studies with fungi have revealed that most of them contain more than one chitin synthase gene. At least five subclasses of chitin synthases have been identified.	244
133034	cd04191	Glucan_BSP_MdoH	Glucan_BSP_MdoH catalyzes the elongation of beta-1,2 polyglucose chains of glucan. Periplasmic Glucan Biosynthesis protein MdoH is a glucosyltransferase that catalyzes the elongation of beta-1,2 polyglucose chains of glucan, requiring a beta-glucoside as a primer and UDP-glucose as a substrate. Glucans are composed of 5 to 10 units of glucose forming a highly branched structure, where beta-1,2-linked glucose constitutes a linear backbone to which branches are attached by beta-1,6 linkages. In Escherichia coli, glucans are located in the periplasmic space, functioning as regulator of osmolarity. It is synthesized at a maximum when cells are grown in a medium with low osmolarity. It has been shown to span the cytoplasmic membrane.	254
133035	cd04192	GT_2_like_e	Subfamily of Glycosyltransferase Family GT2 of unknown function. GT-2 includes diverse families of glycosyltransferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. These are enzymes that catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. Glycosyltransferases have been classified into more than 90 distinct sequence based families.	229
133036	cd04193	UDPGlcNAc_PPase	UDPGlcNAc pyrophosphorylase catalyzes the synthesis of UDPGlcNAc. UDP-N-acetylglucosamine (UDPGlcNAc) pyrophosphorylase (UAP) (also named GlcNAc1P uridyltransferase) catalyzes the reversible conversion of UTP and GlcNAc1P to PPi and UDPGlcNAc. UDP-N-acetylglucosamine (UDPGlcNAc), the activated form of GlcNAc, is a key precursor of N- and O-linked glycosylations. It is essential for the synthesis of chitin (a major component of the fungal cell wall) and of the glycosylphosphatidylinositol (GPI) linker which anchors a variety of cell surface proteins to the plasma membrane. In bacteria, UDPGlcNAc represents an essential precursor for both peptidoglycan and lipopolysaccharide biosynthesis. Human UAP has two isoforms, resulting from alternative splicing of a single gene and differing by the presence or absence of 17 amino acids. UDPGlcNAc pyrophosphorylase shares significant sequence and structure conservation with UDPglucose pyrophosphorylase.	323
133037	cd04194	GT8_A4GalT_like	A4GalT_like proteins catalyze the addition of galactose or glucose residues to the lipooligosaccharide (LOS) or lipopolysaccharide (LPS) of the bacterial cell surface. The members of this family of glycosyltransferases catalyze the addition of galactose or glucose residues to the lipooligosaccharide (LOS) or lipopolysaccharide (LPS) of the bacterial cell surface. The enzymes exhibit broad substrate specificities. The known functions found in this family include: Alpha-1,4-galactosyltransferase, LOS-alpha-1,3-D-galactosyltransferase, UDP-glucose:(galactosyl) LPS alpha1,2-glucosyltransferase, UDP-galactose: (glucosyl) LPS alpha1,2-galactosyltransferase, and UDP-glucose:(glucosyl) LPS alpha1,2-glucosyltransferase. Alpha-1,4-galactosyltransferase from N. meningitidis adds an alpha-galactose from UDP-Gal (the donor) to a terminal lactose (the acceptor) of the LOS structure of the outer membrane. LOSs are virulence factors that enable the organism to evade the immune system of host cells. In E. coli, the three alpha-1,2-glycosyltransferases that are involved in the synthesis of the outer core region of the LPS are all members of this family. The three enzymes share 40% sequence identity, but have different sugar donor or acceptor specificities, representing the structural diversity of LPS.	248
133038	cd04195	GT2_AmsE_like	GT2_AmsE_like is involved in biosynthesis of the exopolysaccharide amylovoran. AmsE is a glycosyltransferase involved in biosynthesis of the exopolysaccharide amylovoran in Erwinia amylovora. Amylovoran is one of the three exopolysaccharides produced by E. amylovora. Amylovoran-deficient mutants are non-pathogenic. It is a subfamily of Glycosyltransferase Family GT2, which includes diverse families of glycosyltransferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. These are enzymes that catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds.	201
133039	cd04196	GT_2_like_d	Subfamily of Glycosyltransferase Family GT2 of unknown function. GT-2 includes diverse families of glycosyltransferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. These are enzymes that catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. Glycosyltransferases have been classified into more than 90 distinct sequence based families.	214
133040	cd04197	eIF-2B_epsilon_N	The N-terminal domain of epsilon subunit of the eIF-2B is a subfamily of glycosyltransferase 2. N-terminal domain of epsilon subunit of the eukaryotic translation initiation factor 2B (eIF-2B): eIF-2B is a guanine nucleotide-exchange factor which mediates the exchange of GDP (bound to initiation factor eIF2) for GTP, generating active eIF2.GTP complex. EIF2B is a complex multimeric protein consisting of five subunits named alpha, beta, gamma, delta and epsilon. Subunit epsilon shares sequence similarity with gamma subunit, and with a family of bifunctional nucleotide-binding enzymes such as ADP-glucose pyrophosphorylase, suggesting that epsilon subunit may play roles in nucleotide binding activity. In yeast, eIF2B gamma enhances the activity of eIF2B-epsilon leading to the idea that these subunits form the catalytic subcomplex.	217
133041	cd04198	eIF-2B_gamma_N	The N-terminal domain of gamma subunit of the eIF-2B is a subfamily of glycosyltransferase 2. N-terminal domain of gamma subunit of the eukaryotic translation initiation factor 2B (eIF-2B): eIF-2B is a guanine nucleotide-exchange factor which mediates the exchange of GDP (bound to initiation factor eIF2) for GTP, generating active eIF2.GTP complex. EIF2B is a complex multimeric protein consisting of five subunits named alpha, beta, gamma, delta and epsilon. Subunit gamma shares sequence similarity with epsilon subunit, and with a family of bifunctional nucleotide-binding enzymes such as ADP-glucose pyrophosphorylase, suggesting that epsilon subunit may play roles in nucleotide binding activity. In yeast, eIF2B gamma enhances the activity of eIF2B-epsilon leading to the idea that these subunits form the catalytic subcomplex.	214
259862	cd04199	CuRO_1_ceruloplasmin_like	Cupredoxin domains 1, 3, and 5 of ceruloplasmin and similar proteins. This family includes the first, third, and fifth cupredoxin domains of ceruloplasmin and similar proteins including the first, third and fifth cupredoxin domains of unprocessed coagulation factors V and VIII. Ceruloplasmin (ferroxidase) is a multicopper oxidase essential for normal iron homeostasis. It functions in copper transport, amine oxidation and as an antioxidant preventing free radicals in serum. The protein has 6 cupredoxin domains and exhibits internal sequence homology that appears to have evolved from the triplication of a sequence unit composed of two tandem cupredoxin domains. Human Factor VIII facilitates blood clotting by acting as a cofactor for factor IXa. Factor VIII and IXa form a complex in the presence of Ca2+ and phospholipids that converts factor X to the activated form Xa.	177
259863	cd04200	CuRO_2_ceruloplasmin_like	Cupredoxin domains 2, 4, and 6 of ceruloplasmin and similar proteins. This family includes the second, fourth and sixth cupredoxin domains of ceruloplasmin and similar proteins, including the second, fourth, and sixth cupredoxin domains of unprocessed coagulation factors V and VIII. Ceruloplasmin (ferroxidase) is a multicopper oxidase essential for normal iron homeostasis. Ceruloplasmin also functions in copper transport, amine oxidation and as an antioxidant preventing free radicals in serum. The protein has 6 cupredoxin domains and exhibits internal sequence homology that appears to have evolved from the triplication of a sequence unit composed of two tandem cupredoxin domains. Human Factor VIII facilitates blood clotting by acting as a cofactor for factor IXa. Factor VIII and IXa form a complex in the presence of Ca2+ and phospholipids that converts factor X to the activated form Xa.	141
259864	cd04201	CuRO_1_CuNIR_like	Cupredoxin domain 1 of  Copper-containing nitrite reductase and two-domain laccase. Copper-containing nitrite reductase (CuNIR), which catalyzes the reduction of NO2- to NO, is the key enzyme in the denitrification process in denitrifying bacteria. CuNIR contains at least one type 1 copper center and a type 2 copper center, which serves as the active site of the enzyme. A histidine, bound to the Type 2 Cu center, is responsible for binding and reducing nitrite. A Cys-His bridge plays an important role in facilitating rapid electron transfer from the type 1 center to the type 2 center. A reduced type I blue copper protein (pseudoazurin) was found to be a specific electron transfer donor for the copper-containing NIR in bacteria Alcaligenes faecalis. The two-domain laccase (small laccase) in this family differs significantly from all laccases. It resembles two domain nitrite reductase in both sequence homology and structure similarity. It consists of two domains and forms trimers and hence resembles the quaternary structure of nitrite reductases more than that of larger laccases.	120
259865	cd04202	CuRO_D2_2dMcoN_like	The second cupredoxin domain of bacterial two domain multicopper oxidase McoN and similar proteins. This family includes bacterial two domain multicopper oxidases (2dMCOs) represented by the McoN from Nitrosomonas europaea. McoN is a trimeric type C blue copper oxidase. Each subunit houses a type 1 copper site in domain 1 and a type 2/type 3 trinuclear copper cluster at the subunit-subunit interface. The 2dMCO is proposed to be a key intermediate in the evolution of three domain MCOs. The biological function of McoN has not been characterized. Multicopper oxidases couple oxidation of substrates with reduction of dioxygen to water. These MCOs are capable of oxidizing a vast range of substrates, varying from aromatic to inorganic compounds such as metals.	138
259866	cd04203	Cupredoxin_like_3	Uncharacterized subfamily of Cupredoxin. Cupredoxins contain type I copper centers and are involved in inter-molecular electron transfer reactions. Cupredoxins are blue copper proteins, having an intense blue color due to the presence of a mononuclear type 1 (T1) copper site. Structurally, the cupredoxin-like fold consists of a beta-sandwich with 7 strands in 2 beta-sheets, which is arranged in a Greek-key beta-barrel. Some of these proteins have lost the ability to bind copper. The majority of family members contain multiple cupredoxin domain repeats: ceruloplasmin and coagulation factors V/VIII have six repeats; laccase, ascorbate oxidase, spore coat protein A, and multicopper oxidase CueO contain three repeats; and nitrite reductase has two repeats. Others are mono-domain cupredoxins, such as plastocyanin, pseudoazurin, plantacyanin, azurin, rusticyanin, stellacyanin, quinol oxidase and the periplasmic domain of cytochrome c oxidase subunit II. Proteins of this uncharacterized subfamily contain a single cupredoxin domain.	84
259867	cd04204	Pseudoazurin_like	Small blue copper proteins including pseudocyanin, plastocyanin, halocyanin and amicyanin. The Pseudocyanin-like family of copper-binding proteins (or blue (type 1) copper domain) is a family of small proteins that bind a single copper atom and are characterized by an intense electronic absorption band near 600 nm. Pseudoazurin (PAz) has been identified as an electron donor in the denitrification pathway. For example, PAz acts as an electron donor to cytochrome c peroxidase and N2OR from Paracoccus pantotrophus (Pp), and to the copper containing nitrite reductase (NiR) that catalyzes the second step of denitrification. Plastocyanin is found in cyanobacteria, higher plants, and some algae where it plays a role in photosynthesis. Plastocyanin is responsible for transporting electrons from PSII to PSI. This family also includes halocyanins found in halophilic archaea such as Natronomonas pharaonis (Natronobacterium pharaonis) and amicyanin found in the bacterium Paracoccus denitrificans.	92
259868	cd04205	CuRO_2_LCC_like	Cupredoxin domain 2 of laccase-like multicopper oxidases; including laccase, CueO, spore coat protein A, ascorbate oxidase and similar proteins. Laccase-like multicopper oxidases (MCOs) are able to couple oxidation of substrates with reduction of dioxygen to water. MCOs are capable of oxidizing a vast range of substrates, varying from aromatic compounds to inorganic compounds such as metals. Although the members of this family have diverse functions, majority of them have three cupredoxin domain repeats. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper.	152
259869	cd04206	CuRO_1_LCC_like	Cupredoxin domain 1 of laccase-like multicopper oxidases; including laccase, CueO, spore coat protein A, ascorbate oxidase and similar proteins. Laccase-like multicopper oxidases (MCOs) in this family contain three cupredoxin domains. They are able to couple oxidation of substrates with reduction of dioxygen to water. MCOs are capable of oxidizing a vast range of substrates, varying from aromatic to inorganic compounds such as metals. Although the members of this family have diverse functions, the majority of them have three cupredoxin domain repeats. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 1 of 3-domain MCOs contains part of the trinuclear copper binding site, which is located at the interface of domains 1 and 3. Also included in this family are cupredoxin domains 1, 3, and 5 of the 6-domain MCO ceruloplasmin and similar proteins.	120
259870	cd04207	CuRO_3_LCC_like	Cupredoxin domain 3 of laccase-like multicopper oxidases; including laccase, CueO, spore coat protein A, ascorbate oxidase and similar proteins. Laccase-like multicopper oxidases (MCOs) in this family contain three cupredoxin domains. They are able to couple oxidation of substrates with reduction of dioxygen to water. MCOs are capable of oxidizing a vast range of substrates, varying from aromatic to inorganic compounds such as metals. Although the members of this family have diverse functions, the majority of them have three cupredoxin domain repeats. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part of the trinuclear copper binding site, which is located at the interface of domains 1 and 3. Also included in this family are cupredoxin domains 2, 4, and 6 of the 6-domain MCO ceruloplasmin and similar proteins.	132
259871	cd04208	CuRO_2_CuNIR	Cupredoxin domain 2 of Copper-containing nitrite reductase. Copper-containing nitrite reductase (CuNIR), which catalyzes the reduction of NO2- to NO, is the key enzyme in the denitrification process in denitrifying bacteria. CuNIR contains at least one type 1 copper center and a type 2 copper center in the protein. The type 2 copper center of a copper nitrite reductase is the active site of the enzyme. A histidine, bound to the Type 2 Cu center, is  responsible for binding and reducing nitrite. A Cys-His bridge plays an important role in facilitating rapid electron transfer from the type 1 center to the type 2 center. A reduced type I blue copper protein (pseudoazurin) was found to be a specific electron transfer donor for the copper-containing NIR in bacteria Alcaligenes faecalis.	143
259872	cd04210	Cupredoxin_like_1	Uncharacterized Cupredoxin-like subfamily. Cupredoxins contain type I copper centers and are involved in inter-molecular electron transfer reactions. Cupredoxins are blue copper proteins because they have an intense blue color due to the presence of a mononuclear type 1 (T1) copper site. Structurally, the cupredoxin-like fold consists of a beta-sandwich with 7 strands in 2 beta-sheets, which is arranged in a Greek-key beta-barrel. Some of these proteins have lost the ability to bind copper. Majority of family members contain multiple cupredoxin domain repeats; ceruloplasmin and coagulation factors V/VIII have six repeats; Laccase, ascorbate oxidase, and spore coat protein A, and multicopper oxidase CueO contain three repeats; and nitrite reductase has two repeats. Others are mono-domain cupredoxins, such as plastocyanin, pseudoazurin, plantacyanin, azurin, rusticyanin, stellacyanin, quinol oxidase and the periplasmic domain of cytochrome c oxidase subunit II.	111
259873	cd04211	Cupredoxin_like_2	Uncharacterized Cupredoxin-like subfamily. Cupredoxins contain type I copper centers and are involved in inter-molecular electron transfer reactions. Cupredoxins are blue copper proteins because they have an intense blue color due to the presence of a mononuclear type 1 (T1) copper site. Structurally, the cupredoxin-like fold consists of a beta-sandwich with 7 strands in 2 beta-sheets, which is arranged in a Greek-key beta-barrel. Some of these proteins have lost the ability to bind copper. Majority of family members contain multiple cupredoxin domain repeats; ceruloplasmin and coagulation factors V/VIII have six repeats; Laccase, ascorbate oxidase, and spore coat protein A, and multicopper oxidase CueO contain three repeats; and nitrite reductase has two repeats. Others are mono-domain cupredoxins, such as plastocyanin, pseudoazurin, plantacyanin, azurin, rusticyanin, stellacyanin, quinol oxidase and the periplasmic domain of cytochrome c oxidase subunit II.	110
259874	cd04212	CuRO_UO_II	The cupredoxin domain of Ubiquinol oxidase subunit II. Ubiquinol oxidase, the terminal oxidase in the respiratory chains of aerobic bacteria, is a multi-chain transmembrane protein located in the cell membrane.  It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits in ubiquinol oxidase varies from two to five.  Although subunit II of ubiquinol oxidase lacks the binuclear CuA site found in cytochrome c oxidases, the structure is conserved.	99
259875	cd04213	CuRO_CcO_Caa3_II	The cupredoxin domain of Caa3 type Cytochrome c oxidase subunit II. Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of most bacteria, is a multi-chain transmembrane protein located in the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. Caa3 type of CcO Subunit II contains a copper-copper binuclear site called CuA, which is believed to be involved in electron transfer from cytochrome c to the cytochromes a, a3 and CuB active site in subunit I.	103
259876	cd04214	PAD_N	N-terminal non-catalytic domain of protein-arginine deiminase. The N-terminal non-catalytic domain of protein-arginine deiminase has a cupredoxin-like fold, but lacks the Cu binding site. PAD (protein-arginine deiminase) and protein L-arginine iminohydrolase catalyze the conversion of protein arginine residues to citrulline residues post-translationally in a process called citrullination. The modification plays crucial regulatory roles in development and cell differentiation.	108
259877	cd04215	Nitrosocyanin	Nitrosocyanin (NC) is a mononuclear red copper protein. Nitrosocyanin (NC) is isolated from the ammonia oxidizing bacterium Nitrosomonas europaea. Nitrosocyanin exhibits remote sequence homology to classic blue copper proteins; its spectroscopic and electrochemical properties are different. The structure of NC is a trimer of single domain cupredoxins. Nitrosocyanin may mediate electron transfer. It could have a novel role as a nitric oxide dehydrogenase or a nitric oxide reductase in the oxidation of ammonia.	107
259878	cd04216	Phytocyanin	Phytocyanins are plant blue or type I copper proteins. Phytocyanins are plant blue or type I copper proteins. They are involved in electron transfer reactions with the Cu center transitioning between the oxidized Cu(II) form and the reduced Cu(I) form. Phytocyanins are classified into four groups: stellacyanin, plantacyanin, uclacyanin and early nodulin groups. Stellacyanin appears to be associated with the plant cell wall; it may be involved in oxidative reactions to build polymeric material making up the cell wall. Plantacyanin is shown to play a role in reproduction in Arabidopsis. Plantacyanins may also be stress-related proteins and may be involved in plant defense responses. The early nodulin-like protein (OsENODL1) from Oryza sativa is expressed specifically at the late developmental stage of the seeds.	98
259879	cd04217	Cupredoxin_Fibrocystin-L_like	Cupredoxin domain of PKHDL1, a homolog of the autosomal recessive polycystic kidney disease protein. One member of this family is Fibrocystin-L, a homolog of the autosomal recessive polycystic kidney disease protein PKHD1. Human fibrocystin-L is predicted to be a large receptor protein (466 kDa) with a signal peptide, a single transmembrane domain and a short cytoplasmic tail. Fibrocystin-L is widely expressed at a low level in most tissues but is up-regulated specifically in T lymphocytes following activation signals. It may play roles in immunity.	86
259880	cd04218	Pseudoazurin	Pseudoazurin (Paz) is a type I blue copper electron-transfer protein. Pseudoazurin (PAz) has been identified as an electron donor to the denitrification pathway. For example, PAz acts as an electron donor to cytochrome c peroxidase and N2OR from Paracoccus pantotrophus (Pp), and to the copper containing nitrite reductase (NiR) that catalyzes the second step of denitrification. It has been shown that pseudoazurin dramatically enhances the reaction profile of nitrite reduction by Paracoccus pantotrophus cytochrome cd1 and facilitates release of the product nitric oxide. The ability of this small redox protein to interact with a multitude of structurally different partners has been attributed to the hydrophobic character of the binding surface.	117
259881	cd04219	Plastocyanin	Plastocyanin is a type I copper protein and functions in the electron transfer from PSII to PSI. Plastocyanin is a small copper-containing protein found in cyanobacteria, higher plants, and some algae, where it plays a role in photosynthesis. The two photosystems that are primarily responsible for photosynthesis are photosystem I (PSI) and photosystem II (PSII). The flow of electrons begins in PSII, which acts as a proton pump. Plastocyanin is responsible for transporting electrons from PSII to PSI.	97
259882	cd04220	Halocyanin	Halocyanin is an archaeal blue (type I) copper redox protein. Halocyanins are blue (type I) copper redox proteins found in halophilic archaea such as Natronomonas pharaonis (Natronobacterium pharaonis). Halocyanin may serve as a mobile electron carrier at a peripheral membrane protein. The copper-binding domain is present only once in some halocyanins and is duplicated in others.	92
259883	cd04221	MauL	Methylamine utilization protein MauL. MauL is one of the products from the methylamine utilization gene cluster in Methylobacterium extorquens AM1. Mutants generated by insertions in mauL were not able to grow on methylamine or any other primary amine as carbon sources. MauL belongs to the blue or type I copper protein family. They are involved in electron transfer reactions with the Cu center transitioning between the oxidized Cu(II) form and the reduced Cu(I) form.	83
259884	cd04222	CuRO_1_ceruloplasmin	The first cupredoxin domain of Ceruloplasmin. Ceruloplasmin is a multicopper oxidase essential for normal iron homeostasis and copper transport in blood. It also functions in amine oxidation and as an antioxidant preventing free radicals in serum. The protein has 6 cupredoxin domains with six copper centers; three mononuclear sites in domain 2, 4 and 6 and three in the form of trinuclear clusters at the interface of domains 1 and 6. Ceruloplasmin exhibits internal sequence homology that appears to have evolved from the triplication of a sequence unit composed of two tandem cupredoxin domains. This model represents the first cupredoxin domain of ceruloplasmin.	183
259885	cd04223	N2OR_C	The C-terminal cupredoxin domain of Nitrous-oxide reductase. Nitrous-oxide reductase participates in nitrogen metabolism and catalyzes the last step in dissimilatory nitrate reduction, the two-electron reduction of N2O to N2. It contains copper ions as cofactors in the form of a binuclear CuA center at the site of electron entry and a tetranuclear CuZ centre at the active site. The C-terminus of Nitrous-oxide reductase is a cupredoxin domain.	95
259886	cd04224	CuRO_3_ceruloplasmin	The third cupredoxin domain of Ceruloplasmin. Ceruloplasmin is a multicopper oxidase essential for normal iron homeostasis and copper transport in blood. It also functions in amine oxidation and as an antioxidant preventing free radicals in serum. The protein has 6 cupredoxin domains with six copper centers; three mononuclear sites in domain 2, 4 and 6 and three in the form of trinuclear clusters at the interface of domains 1 and 6. Ceruloplasmin exhibits internal sequence homology that appears to have evolved from the triplication of a sequence unit composed of two tandem cupredoxin domains. This model represents the third cupredoxin domain of ceruloplasmin.	197
259887	cd04225	CuRO_5_ceruloplasmin	The fifth cupredoxin domain of Ceruloplasmin. Ceruloplasmin is a multicopper oxidase essential for normal iron homeostasis and copper transport in blood. It also functions in amine oxidation and as an antioxidant preventing free radicals in serum. The protein has 6 cupredoxin domains with six copper centers; three mononuclear sites in domain 2, 4 and 6 and three in the form of trinuclear clusters at the interface of domains 1 and 6. Ceruloplasmin exhibits internal sequence homology that appears to have evolved from the triplication of a sequence unit composed of two tandem cupredoxin domains. This model represents the fifth cupredoxin domain of ceruloplasmin.	171
259888	cd04226	CuRO_1_FV_like	The first cupredoxin domain of coagulation factor V and similar proteins. Factor V is an essential coagulation protein with both pro- and anti-coagulant functions. Aberrant expression of human factor V can lead to bleeding or thromboembolic disease, which may be life-threatening. Bovine factor Va serves as the cofactor in the prothrombinase complex that results in a 300,000-fold increase in the rate of thrombin generation. Factor V is synthesized as a single polypeptide with six cupredoxin domains and a domain structure of 1-2-3-4-B-5-6-C1-C2, where 1-6 are cupredoxin domains, B is a domain with no known structural homologs and is dispensable for coagulant activity, and C are domains distantly related to discoidin protein-fold family members. Factor V has little activity prior to proteolytic cleavage by thrombin or FXa upon secretion. The resulting Factor Va is a heterodimer consisting of a heavy chain (1-2-3-4) and a light chain (5-6-C1-C2). This model represents the cupredoxin domain 1 of unprocessed Factor V or the heavy chain of Factor Va, and similar proteins including pseutarin C non-catalytic subunit. Pseutarin C is a prothrombin activator from Pseudonaja textilis venom.	165
259889	cd04227	CuRO_3_FVIII_like	The third cupredoxin domain of coagulation factor VIII and similar proteins. Factor VIII functions in the factor X-activating complex of the intrinsic coagulation pathway. It facilitates blood clotting by acting as a cofactor for factor IXa. In the presence of Ca2+ and phospholipids, Factor VIII and IXa form a complex that converts factor X to the activated form Xa. A variety of mutations in the Factor VIII gene can cause hemophilia A, which typically requires replacement therapy with purified protein. Factor VIII is synthesized as a single polypeptide with six cupredoxin domains and a domain structure of 1-2-3-4-B-5-6-C1-C2, where 1-6 are cupredoxin domains, B is a domain with no known structural homologs and is dispensable for coagulant activity, and C are domains distantly related to discoidin protein-fold family members. Factor VIII is initially processed through proteolysis to generate a heterodimer consisting of a heavy chain (1-2-3-4) and a light chain (5-6-C1-C2), which circulates in a tight complex with von Willebrand factor (VWF). Further processing of the heavy chain produces activated factor VIIIa, a heterotrimer composed of polypeptides (1-2), (3-4), and the light chain. This model represents the cupredoxin domain 3 of unprocessed Factor VIII or the heavy chain of circulating Factor VIII, and similar proteins.	177
259890	cd04228	CuRO_5_FVIII_like	The fifth cupredoxin domain of coagulation factor VIII and similar proteins. Factor VIII functions in the factor X-activating complex of the intrinsic coagulation pathway. It facilitates blood clotting by acting as a cofactor for factor IXa. In the presence of Ca2+ and phospholipids, Factor VIII and IXa form a complex that converts factor X to the activated form Xa. A variety of mutations in the Factor VIII gene can cause hemophilia A, which typically requires replacement therapy with purified protein. Factor VIII is synthesized as a single polypeptide with six cupredoxin domains and a domain structure of 1-2-3-4-B-5-6-C1-C2, where 1-6 are cupredoxin domains, B is a domain with no known structural homologs and is dispensable for coagulant activity, and C are domains distantly related to discoidin protein-fold family members. Factor VIII is initially processed through proteolysis to generate a heterodimer consisting of a heavy chain (1-2-3-4) and a light chain (5-6-C1-C2), which circulates in a tight complex with von Willebrand factor (VWF). Further processing of the heavy chain produces activated factor VIIIa, a heterotrimer composed of polypeptides (1-2), (3-4), and the light chain. This model represents the cupredoxin domain 5 of unprocessed Factor VIII or the first cupredoxin domain of the light chain of circulating Factor VIII, and similar proteins.	169
259891	cd04229	CuRO_1_Ceruloplasmin_like_1	cupredoxin domain of ceruloplasmin homologs. Uncharacterized subfamily of ceruloplasmin homologous proteins. Ceruloplasmin  (ferroxidase) is a multicopper oxidase essential for normal iron homeostasis.  Ceruloplasmin also functions in copper transport, amine oxidase and as an antioxidant preventing free radicals in serum. The protein has 6 cupredoxin domains and exhibits internal sequence homology that appears to have evolved from the triplication of a sequence unit composed of two tandem cupredoxin domains. This model represents the first domain of the triplicated units.	175
259892	cd04230	Sulfocyanin	Sulfocyanin is a blue copper protein in archaebacterium Sulfolobus acidocaldarius. Sulfocyanin is a blue copper protein with a putative membrane anchoring hydrophobic motif at the N-terminus. It may substitute for cytochrome C in electron transfer reactions in archaea.	143
259893	cd04231	Rusticyanin	Rusticyanin is a cupredoxin in archaea and proteobacteria. Rusticyanin is a copper-containing protein which is involved in electron-transfer. The members of this family are found in archaea and proteobacteria. It is a cupredoxin, or blue-copper protein due to its color. Rusticyanin, extracted from the bacterium Thiobacillus ferrooxidans, is redox active down to pH 2.0 and the acid-stable cytochrome c is the primary acceptor of the electron. This organism can grow on Fe2+ as its sole energy source. Rusticyanin is thought to be a principal component in the iron respiratory electron transport chain of T. ferrooxidans.	127
259894	cd04232	CuRO_1_CueO_FtsP	The first Cupredoxin domain of the multicopper oxidase CueO, the cell division protein FtsP, and similar proteins. CueO is a multicopper oxidase (MCO) that is part of the copper-regulatory cue operon, which employs a cytosolic metalloregulatory protein CueR that induces expression of CopA and CueO under copper stress conditions. CueO is a periplasmic multicopper oxidase that is stimulated by exogenous copper(II). FtsP (also named SufI) is a component of the cell division apparatus. It is involved in protecting or stabilizing the assembly of divisomes under stress conditions. FtsP belongs to the multicopper oxidase superfamily but lacks metal cofactors. The protein is localized at septal rings and may serve a scaffolding function. Members of this subfamily contain three cupredoxin domains and this model represents the first domain. Although MCOs have diverse functions, the majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 1 of 3-domain MCOs contains part of the trinuclear copper binding site, which is located at the interface of domains 1 and 3. FtsP does not contain any copper binding sites.	120
259895	cd04233	Auracyanin	Auracyanins A and B and similar proteins. This subfamily includes both auracyanins A and B from the photosynthetic bacterium Chloroflexus aurantiacus and similar proteins. Auracyanins A and B are very  similar blue copper proteins with 38% sequence identity and are homologous to the bacterial redox protein Azurin. However, auracyanin A is expressed only when C. aurantiacus cells are grown in light, whereas auracyanin B is expressed in both dark and light conditions. Thus, auracyanin A may function as a redox partner in photosynthesis, while auracyanin B may function in aerobic respiration.	121
239767	cd04234	AAK_AK	AAK_AK: Amino Acid Kinase Superfamily (AAK), Aspartokinase (AK); this CD includes the N-terminal catalytic domain of aspartokinase (4-L-aspartate-4-phosphotransferase;). AK is the first enzyme in the biosynthetic pathway of the aspartate family of amino acids (lysine, threonine, methionine, and isoleucine) and the bacterial cell wall component, meso-diaminopimelate. It also catalyzes the conversion of aspartate and ATP to aspartylphosphate and ADP. One mechanism for the regulation of this pathway is by the production of several isoenzymes of aspartokinase with different repressors and allosteric inhibitors. Pairs of ACT domains are proposed to specifically bind amino acids leading to allosteric regulation of the enzyme. In Escherichia coli, three different aspartokinase isoenzymes are regulated specifically by lysine, methionine, and threonine. AK-HSDHI (ThrA) and AK-HSDHII (MetL) are bifunctional enzymes that consist of an N-terminal AK and a C-terminal homoserine dehydrogenase (HSDH). ThrA and MetL are involved in threonine and methionine biosynthesis, respectively. The third isoenzyme, AKIII (LysC), is monofunctional and is involved in lysine synthesis. The three Bacillus subtilis isoenzymes, AKI (DapG), AKII (LysC), and AKIII (YclM), are feedback-inhibited by meso-diaminopimelate, lysine, and lysine plus threonine, respectively. The E. coli lysine-sensitive AK is described as a homodimer, whereas, the B. subtilis lysine-sensitive AK is described as a heterodimeric complex of alpha- and beta- subunits that are formed from two in-frame overlapping genes. A single AK enzyme type has been described in Pseudomonas, Amycolatopsis, and Corynebacterium. The fungal aspartate pathway is regulated at the AK step, with L-Thr being an allosteric inhibitor of the Saccharomyces cerevisiae AK (Hom3). At least two distinct AK isoenzymes can occur in higher plants, one is a monofunctional lysine-sensitive isoenzyme, which is involved in the overall regulation of the pathway and can be synergistically inhibited by S-adenosylmethionine. The other isoenzyme is a bifunctional, threonine-sensitive AK-HSDH protein. Also included in this CD is the catalytic domain of the Methylomicrobium alcaliphilum ectoine AK, the first enzyme of the ectoine biosynthetic pathway, found in this bacterium, and several other halophilic/halotolerant bacteria.	227
239768	cd04235	AAK_CK	AAK_CK: Carbamate kinase (CK) catalyzes both the ATP-phosphorylation of carbamate and carbamoyl phosphate (CP) utilization with the production of ATP from ADP and CP. Both CK (this CD) and nonhomologous CP synthetase synthesize carbamoyl phosphate, an essential precursor of arginine and pyrimidine bases, in the presence of ATP, bicarbonate, and ammonia. CK is a homodimer of 33 kDa subunits and is a member of the Amino Acid Kinase Superfamily (AAK).	308
239769	cd04236	AAK_NAGS-Urea	AAK_NAGS-Urea: N-acetylglutamate (NAG) kinase-like domain of the NAG Synthase (NAGS) of the urea cycle found in animals. Ureogenic NAGS is a mitochondrial enzyme catalyzing the formation of NAG from acetylcoenzyme A and L-glutamate; NAG is an essential allosteric activator of carbamylphosphate synthase I, the first and rate limiting enzyme of the urea cycle. Ureogenic NAGS activity is dependent on the concentration of glutamate (substrate) and arginine (activator). Domain architecture of ureogenic NAGS consists of an N-terminal NAG kinase-like (ArgB) domain (this CD) and a C-terminal DUF619 domain. Members of this CD belong to the protein superfamily, the Amino Acid Kinase Family (AAKF).	271
239770	cd04237	AAK_NAGS-ABP	AAK_NAGS-ABP: N-acetylglutamate (NAG) kinase-like domain of the NAG Synthase (NAGS) of the arginine-biosynthesis pathway (ABP) found in gamma- and beta-proteobacteria and higher plant chloroplasts. Domain architecture of these NAGS consists of an N-terminal NAG kinase-like (ArgB) domain (this CD) and a C-terminal NAG synthase, acetyltransferase (ArgA) domain. Both bacterial and plant sequences in this CD have a conserved N-terminal extension; a similar sequence in the NAG kinases of the cyclic arginine-biosynthesis pathway has been implicated in feedback inhibition sensing. Plant sequences also have an N-terminal chloroplast transit peptide and an insert (approx. 70 residues) in the C-terminal region of ArgB. Members of this CD belong to the Amino Acid Kinase Superfamily (AAK).	280
239771	cd04238	AAK_NAGK-like	AAK_NAGK-like: N-Acetyl-L-glutamate kinase (NAGK)-like. Included in this CD are the Escherichia coli and Pseudomonas aeruginosa type NAGKs which catalyze the phosphorylation of N-acetyl-L-glutamate (NAG) by ATP in the second step of arginine biosynthesis found in bacteria and photosynthetic organisms using either the acetylated, noncyclic (NC), or non-acetylated, cyclic (C) route of ornithine biosynthesis. Also included in this CD is a distinct group of uncharacterized (UC) bacterial and archaeal NAGKs. Members of this CD belong to the Amino Acid Kinase Superfamily (AAK).	256
239772	cd04239	AAK_UMPK-like	AAK_UMPK-like: UMP kinase (UMPK)-like, the microbial/chloroplast uridine monophosphate kinase (uridylate kinase) enzyme that catalyzes UMP phosphorylation and plays a key role in pyrimidine nucleotide biosynthesis. Regulation of this process is via feed-back control and via gene repression of carbamoyl phosphate synthetase (the first enzyme of the pyrimidine biosynthesis pathway). The UMP kinases of E. coli (Ec) and Pyrococcus furiosus (Pf) are known to function as homohexamers, with GTP and UTP being allosteric effectors. Like other related enzymes (carbamate kinase, aspartokinase, and N-acetylglutamate kinase) the E. coli and most bacterial UMPKs have a conserved, N-terminal, lysine residue proposed to function in the catalysis of the phosphoryl group transfer, whereas most archaeal UMPKs appear to lack this residue and the Pyrococcus furiosus structure has an additional Mg ion bound to the ATP molecule which is proposed to function as the catalysis instead. Also included in this CD are the alpha and beta subunits of the Mo storage protein (MosA and MosB) characterized as an alpha4-beta4 octamer containing an ATP-dependent, polynuclear molybdenum-oxide cluster. These and related  sequences in this CD are members of the Amino Acid Kinase Superfamily (AAK).	229
239773	cd04240	AAK_UC	AAK_UC: Uncharacterized (UC) amino acid kinase-like proteins found mainly in archaea and a few bacteria. Sequences in this CD are members of the Amino Acid Kinase (AAK) superfamily.	203
239774	cd04241	AAK_FomA-like	AAK_FomA-like: This CD includes a fosfomycin biosynthetic gene product, FomA, and similar proteins found in a wide range of organisms. Together, the fomA and fomB genes in the fosfomycin biosynthetic gene cluster of Streptomyces wedmorensis confer high-level fosfomycin resistance. FomA and FomB proteins converted fosfomycin to fosfomycin monophosphate and fosfomycin diphosphate in the presence of ATP and a magnesium ion, indicating that FomA and FomB catalyzed phosphorylations of fosfomycin and fosfomycin monophosphate, respectively. FomA and related  sequences in this CD are members of the Amino Acid Kinase Superfamily (AAK).	252
239775	cd04242	AAK_G5K_ProB	AAK_G5K_ProB: Glutamate-5-kinase (G5K) catalyzes glutamate-dependent ATP cleavage; G5K transfers the terminal phosphoryl group of ATP to the gamma-carboxyl group of glutamate, in the first and controlling step of proline (and, in mammals, ornithine) biosynthesis. G5K is subject to feedback allosteric inhibition by proline or ornithine. In microorganisms and plants, proline plays an important role as an osmoprotectant and, in mammals, ornithine biosynthesis is crucial for proper ammonia detoxification, since a G5K mutation has been shown to cause human hyperammonaemia. Microbial G5K generally consists of two domains: a catalytic G5K domain and one PUA (pseudo uridine synthases and archaeosine-specific transglycosylases) domain, and some lack the PUA domain. G5K requires free Mg for activity, it is tetrameric, and it aggregates to higher forms in a proline-dependent way. G5K lacking the PUA domain remains tetrameric, active, and proline-inhibitable, but the Mg requirement and the proline-triggered aggregation are greatly diminished and abolished, respectively, and more proline is needed for inhibition. Although plant and animal G5Ks are part of a bifunctional polypeptide, delta 1-pyrroline-5-carboxylate synthetase (P5CS), composed of an N-terminal G5K (ProB) and a C-terminal glutamyl 5- phosphate reductase (G5PR; ProA); bacterial and yeast G5Ks are monofunctional single-polypeptide enzymes. In this CD, all three domain architectures are present: G5K, G5K+PUA, and G5K+G5PR.	251
239776	cd04243	AAK_AK-HSDH-like	AAK_AK-HSDH-like: Amino Acid Kinase Superfamily (AAK), AK-HSDH-like; this family includes the N-terminal catalytic domain of aspartokinase (AK) of the bifunctional enzyme AK- homoserine dehydrogenase (HSDH). These aspartokinases are found in such bacteria as E. coli (AKI-HSDHI, ThrA  and  AKII-HSDHII, MetL) and in higher plants (Z. mays AK-HSDH). AK and HSDH are the first and third enzymes in the biosynthetic pathway of the aspartate family of amino acids. AK catalyzes the phosphorylation of Asp to P-aspartyl phosphate. HSDH catalyzes the NADPH-dependent conversion of Asp 3-semialdehyde to homoserine. ThrA and MetL are involved in threonine and methionine biosynthesis, respectively. In E. coli, ThrA is subject to allosteric regulation by the end product L-threonine and the native enzyme is reported to be tetrameric. As with bacteria, plant AK and HSDH are feedback inhibited by pathway end products. Maize AK-HSDH is a Thr-sensitive 180-kD enzyme. Arabidopsis AK-HSDH is an alanine-activated, threonine-sensitive enzyme whose ACT domains, located C-terminal to the AK catalytic domain, were shown to be involved in allosteric activation. Also included in this CD is the catalytic domain of the aspartokinase (AK) of the lysine-sensitive aspartokinase isoenzyme AKIII, a monofunctional class enzyme (LysC) found in some bacteria such as E. coli. In E. coli, LysC is reported to be a homodimer of 50 kD subunits. Also included in this CD is  the catalytic domain of aspartokinase (AK) of the bifunctional enzyme AK - DAP decarboxylase (DapDC) found in some bacteria. DapDC, which is the lysA gene product, catalyzes the decarboxylation of DAP to lysine.	293
239777	cd04244	AAK_AK-LysC-like	AAK_AK-LysC-like: Amino Acid Kinase Superfamily (AAK), AK-LysC-like; this CD includes the N-terminal catalytic aspartokinase (AK) domain of the lysine-sensitive AK isoenzyme found in higher plants. The lysine-sensitive AK isoenzyme is a monofunctional protein. It is involved in the overall regulation of the aspartate pathway and can be synergistically inhibited by S-adenosylmethionine. Also included in this CD is an uncharacterized LysC-like AK found in Euryarchaeota and some bacteria. AK catalyzes the conversion of aspartate and ATP to aspartylphosphate and ADP.	298
239778	cd04245	AAK_AKiii-YclM-BS	AAK_AKiii-YclM-BS: Amino Acid Kinase Superfamily (AAK), AKiii-YclM-BS; this CD includes the N-terminal catalytic aspartokinase (AK) domain of the lysine plus threonine-sensitive aspartokinase isoenzyme AKIII, a monofunctional class enzyme found in Bacilli (Bacillus subtilis YclM) and Clostridia species. Aspartokinase is the first enzyme in the aspartate metabolic pathway and catalyzes the conversion of aspartate and ATP to aspartylphosphate and ADP. In Bacillus subtilis (BS), YclM is reported to be a single polypeptide of 50 kD. The Bacillus subtilis 168 AKIII is induced by lysine and repressed by threonine, and it is synergistically inhibited by lysine and threonine.	288
239779	cd04246	AAK_AK-DapG-like	AAK_AK-DapG-like: Amino Acid Kinase Superfamily (AAK), AK-DapG-like; this CD includes the N-terminal catalytic aspartokinase (AK) domain of the diaminopimelate-sensitive aspartokinase isoenzyme AKI (DapG), a monofunctional enzyme found in Bacilli (Bacillus subtilis 168), Clostridia, and Actinobacteria bacterial species, as well as the catalytic AK domain of the lysine-sensitive aspartokinase isoenzyme AKII of Bacillus subtilis 168, the lysine plus threonine-sensitive aspartokinase of Corynebacterium glutamicum, and related isoenzymes. In Bacillus subtilis, the regulation of the diaminopimelate-lysine biosynthetic pathway involves dual control by diaminopimelate and lysine, effected through separate diaminopimelate- and lysine-sensitive aspartokinase isoenzymes. The role of the AKI isoenzyme is most likely to provide a constant level of aspartyl-beta-phosphate for the biosynthesis of diaminopimelate for peptidoglycan synthesis and dipicolinate during sporulation. The B. subtilis 168 AKII is induced by methionine, and repressed and inhibited by lysine. In Corynebacterium glutamicum and other various Gram-positive bacteria, the DAP-lysine pathway is feedback regulated by the concerted action of lysine and threonine. Also included in this CD are the aspartokinases of the extreme thermophile, Thermus thermophilus HB27, the Gram-negative obligate methylotroph, Methylophilus methylotrophus AS1, and those single aspartokinase isoenzyme types found in Pseudomonas, C. glutamicum, and Amycolatopsis lactamdurans. The B. subtilis AKI is tetrameric consisting of two alpha and two beta subunits; the alpha (43 kD) and beta (17 kD) subunits are formed by two in-phase overlapping genes. The alpha subunit contains the AK catalytic domain and two ACT domains. The beta subunit contains two ACT domains. The B. subtilis 168 AKII aspartokinase is also described as tetrameric consisting of two alpha and two beta subunits. Some archaeal aspartokinases in this group lack recognizable ACT domains.	239
239780	cd04247	AAK_AK-Hom3	AAK_AK-Hom3: Amino Acid Kinase Superfamily (AAK), AK-Hom3; this CD includes the N-terminal catalytic domain of the aspartokinase HOM3, a monofunctional class enzyme found in Saccharomyces cerevisiae and other related AK domains. Aspartokinase, the first enzyme in the aspartate metabolic pathway, catalyzes the conversion of aspartate and ATP to aspartylphosphate and ADP, and in fungi, is responsible for the production of threonine, isoleucine and methionine. S. cerevisiae has a single aspartokinase isoenzyme type, which is regulated by feedback, allosteric inhibition by L-threonine. Recent studies show that the allosteric transition triggered by binding of threonine to AK involves a large change in the conformation of the native hexameric enzyme that is converted to an inactive one of different shape and substantially smaller hydrodynamic size.	306
239781	cd04248	AAK_AK-Ectoine	AAK_AK-Ectoine: Amino Acid Kinase Superfamily (AAK), AK-Ectoine; this CD includes the N-terminal catalytic domain of the aspartokinase of the ectoine (1,4,5,6-tetrahydro-2-methyl pyrimidine-4-carboxylate) biosynthetic pathway found in Methylomicrobium alcaliphilum, Vibrio cholerae, and other various halotolerant or halophilic bacteria. Bacteria exposed to hyperosmotic stress accumulate organic solutes called 'compatible solutes'  of which ectoine, a heterocyclic amino acid, is one. Apart from its osmotic function, ectoine also exhibits a protective effect on proteins, nucleic acids and membranes against a variety of stress factors. de novo synthesis of ectoine starts with the phosphorylation of L-aspartate and shares its first two enzymatic steps with the biosynthesis of amino acids of the aspartate family: aspartokinase and L-aspartate-semialdehyde dehydrogenase. The M. alcaliphilum and the V. cholerae aspartokinases are encoded on the ectABCask operon.	304
239782	cd04249	AAK_NAGK-NC	AAK_NAGK-NC: N-Acetyl-L-glutamate kinase - noncyclic (NAGK-NC) catalyzes the phosphorylation of the gamma-COOH group of N-acetyl-L-glutamate (NAG) by ATP in the second step of microbial arginine biosynthesis using the acetylated, noncyclic route of ornithine biosynthesis. There are two variants of this pathway. In one, typified by the pathway in Escherichia coli, glutamate is acetylated by acetyl-CoA and acetylornithine is deacylated hydrolytically. In this pathway, feedback inhibition by arginine occurs at the initial acetylation of glutamate and not at the phosphorylation of NAG by NAGK. Homodimeric NAGK-NC are members of the Amino Acid Kinase Superfamily (AAK).	252
239783	cd04250	AAK_NAGK-C	AAK_NAGK-C: N-Acetyl-L-glutamate kinase - cyclic (NAGK-C) catalyzes the phosphorylation of the gamma-COOH group of N-acetyl-L-glutamate (NAG) by ATP in the second step of arginine biosynthesis found in some bacteria and photosynthetic organisms using the non-acetylated, cyclic route of ornithine biosynthesis. In this pathway, glutamate is first N-acetylated and then phosphorylated by NAGK to give phosphoryl NAG, which is converted to NAG-ornithine. There are two variants of this pathway. In one, typified by the pathway in Thermotoga maritima and Pseudomonas aeruginosa, the acetyl group is recycled by reversible transacetylation from acetylornithine to glutamate. The phosphorylation of NAG by NAGK is feedback inhibited by arginine. In photosynthetic organisms, NAGK is the target of the nitrogen-signaling protein PII. Hexameric formation of NAGK domains appears to be essential to both arginine inhibition and NAGK-PII complex formation. NAGK-C are members of the Amino Acid Kinase Superfamily (AAK).	279
239784	cd04251	AAK_NAGK-UC	AAK_NAGK-UC: N-Acetyl-L-glutamate kinase - uncharacterized (NAGK-UC). This domain is similar to Escherichia coli and Pseudomonas aeruginosa NAGKs which catalyze the phosphorylation of the gamma-COOH group of N-acetyl-L-glutamate (NAG) by ATP in the second step of microbial arginine biosynthesis. These uncharacterized domain sequences are found in some bacteria (Deinococci and Chloroflexi) and archaea and belong to the Amino Acid Kinase Superfamily (AAK).	257
239785	cd04252	AAK_NAGK-fArgBP	AAK_NAGK-fArgBP: N-Acetyl-L-glutamate kinase (NAGK) of the fungal arginine-biosynthetic pathway (fArgBP). The nuclear-encoded, mitochondrial polyprotein precursor with an N-terminal NAGK (ArgB) domain (this CD), a central DUF619 domain, and a C-terminal reductase domain (ArgC, N-Acetylglutamate Phosphate Reductase, NAGPR). The precursor is cleaved in the mitochondria into two distinct enzymes (NAGK-DUF619 and NAGPR). Native molecular weights of these proteins indicate that the kinase is an octamer whereas the reductase is a dimer. This CD also includes some gamma-proteobacteria (Xanthomonas and Xylella) NAG kinases with an N-terminal NAGK (ArgB) domain (this CD) and a C-terminal DUF619 domain. The DUF619 domain is described as a putative distant homolog of the acetyltransferase, ArgA, predicted to function in NAG synthase association in fungi. Eukaryotic sequences have an N-terminal mitochondrial transit peptide. Members of this NAG kinase domain CD belong to the Amino Acid Kinase Superfamily (AAK).	248
239786	cd04253	AAK_UMPK-PyrH-Pf	AAK_UMPK-PyrH-Pf: UMP kinase (UMPK)-Pf, the mostly archaeal uridine monophosphate kinase (uridylate kinase) enzymes that catalyze UMP phosphorylation and play a key role in pyrimidine nucleotide biosynthesis; regulation of this process is via feed-back control and via gene repression of carbamoyl phosphate synthetase (the first enzyme of the pyrimidine biosynthesis pathway). The UMP kinase of Pyrococcus furiosus (Pf) is known to function as a homohexamer, with GTP and UTP being allosteric effectors. Like other related enzymes (carbamate kinase, aspartokinase, and N-acetylglutamate kinase) the E. coli and most bacterial UMPKs have a conserved, N-terminal, lysine residue proposed to function in the catalysis of the phosphoryl group transfer, whereas most archaeal UMPKs (this CD) appear to lack this residue and the Pyrococcus furiosus structure has an additional Mg ion bound to the ATP molecule which is proposed to function as the catalysis instead. Members of this CD belong to the Amino Acid Kinase Superfamily (AAK).	221
239787	cd04254	AAK_UMPK-PyrH-Ec	UMP kinase (UMPK)-Ec, the microbial/chloroplast uridine monophosphate kinase (uridylate kinase) enzyme that catalyzes UMP phosphorylation and plays a key role in pyrimidine nucleotide biosynthesis; regulation of this process is via feed-back control and via gene repression of carbamoyl phosphate synthetase (the first enzyme of the pyrimidine biosynthesis pathway). The UMP kinase of E. coli (Ec) is known to function as a homohexamer, with GTP and UTP being allosteric effectors. Like other related enzymes (carbamate kinase, aspartokinase, and N-acetylglutamate kinase) the E. coli and most bacterial and chloroplast UMPKs (this CD) have a conserved, N-terminal, lysine residue proposed to function in the catalysis of the phosphoryl group transfer, whereas most archaeal UMPKs appear to lack this residue and the Pyrococcus furiosus structure has an additional Mg ion bound to the ATP molecule which is proposed to function as the catalysis instead. Members of this CD belong to the Amino Acid Kinase Superfamily (AAK).	231
239788	cd04255	AAK_UMPK-MosAB	AAK_UMPK-MosAB: This CD includes the alpha and beta subunits of the Mo storage protein (MosA and MosB) which are related to uridine monophosphate kinase (UMPK) enzymes that catalyze the phosphorylation of UMP by ATP, yielding UDP, and playing a key role in pyrimidine nucleotide biosynthesis. The Mo storage protein from the nitrogen-fixing bacterium, Azotobacter vinelandii, is characterized as an alpha4-beta4 octamer containing a polynuclear molybdenum-oxide cluster which is ATP-dependent to bind Mo and pH-dependent to release Mo. These and related bacterial sequences in this CD are members of the Amino Acid Kinase Superfamily (AAK).	262
239789	cd04256	AAK_P5CS_ProBA	AAK_P5CS_ProBA: Glutamate-5-kinase (G5K) domain of the bifunctional delta 1-pyrroline-5-carboxylate synthetase (P5CS), composed of an N-terminal G5K (ProB) and a C-terminal glutamyl 5- phosphate reductase (G5PR, ProA), the first and second enzyme catalyzing proline (and, in mammals, ornithine) biosynthesis. G5K transfers the terminal phosphoryl group of ATP to the gamma-carboxyl group of glutamate, and is subject to feedback allosteric inhibition by proline or ornithine. In plants, proline plays an important role as an osmoprotectant and, in mammals, ornithine biosynthesis is crucial for proper ammonia detoxification, since a G5K mutation has been shown to cause human hyperammonaemia.	284
239790	cd04257	AAK_AK-HSDH	AAK_AK-HSDH: Amino Acid Kinase Superfamily (AAK), AK-HSDH; this CD includes the N-terminal catalytic domain of aspartokinase (AK) of the bifunctional enzyme AK - homoserine dehydrogenase (HSDH). These aspartokinases are found in bacteria (E. coli AKI-HSDHI, ThrA  and E. coli AKII-HSDHII, MetL) and higher plants (Z. mays AK-HSDH). AK and HSDH are the first and third enzymes in the biosynthetic pathway of the aspartate family of amino acids. AK catalyzes the phosphorylation of Asp to P-aspartyl phosphate. HSDH catalyzes the NADPH-dependent conversion of Asp 3-semialdehyde to homoserine. ThrA and MetL are involved in threonine and methionine biosynthesis, respectively. In E. coli, ThrA is subject to allosteric regulation by the end product L-threonine and the native enzyme is reported to be tetrameric. As with bacteria, plant AK and HSDH are feedback inhibited by pathway end products. Maize AK-HSDH is a Thr-sensitive 180-kD enzyme. Arabidopsis AK-HSDH is an alanine-activated, threonine-sensitive enzyme whose ACT domains, located C-terminal to the AK catalytic domain, were shown to be involved in allosteric activation.	294
239791	cd04258	AAK_AKiii-LysC-EC	AAK_AKiii-LysC-EC: Amino Acid Kinase Superfamily (AAK), AKiii-LysC-EC: this CD includes the N-terminal catalytic aspartokinase (AK) domain of the lysine-sensitive aspartokinase isoenzyme AKIII. AKIII is a monofunctional class enzyme (LysC) found in some bacteria such as E. coli. Aspartokinase is the first enzyme in the aspartate metabolic pathway and catalyzes the conversion of aspartate and ATP to aspartylphosphate and ADP. In E. coli, LysC is reported to be a homodimer of 50 kD subunits.	292
239792	cd04259	AAK_AK-DapDC	AAK_AK-DapDC: Amino Acid Kinase Superfamily (AAK), AK-DapDC; this CD includes the N-terminal catalytic aspartokinase (AK) domain of the bifunctional enzyme AK - DAP decarboxylase (DapDC) found in some bacteria. Aspartokinase is the first enzyme in the aspartate metabolic pathway, catalyzes the conversion of aspartate and ATP to aspartylphosphate and ADP. DapDC, which is the lysA gene product, catalyzes the decarboxylation of DAP to lysine.	295
239793	cd04260	AAK_AKi-DapG-BS	AAK_AKi-DapG-BS: Amino Acid Kinase Superfamily (AAK), AKi-DapG; this CD includes the N-terminal catalytic aspartokinase (AK) domain of  the diaminopimelate-sensitive aspartokinase isoenzyme AKI (DapG), a monofunctional class enzyme found in Bacilli (Bacillus subtilis 168), Clostridia, and Actinobacteria bacterial species.  In Bacillus subtilis, the regulation of the diaminopimelate-lysine biosynthetic pathway involves dual control by diaminopimelate and lysine, effected through separate diaminopimelate- and lysine-sensitive aspartokinase isoenzymes. AKI activity is invariant during the exponential and stationary phases of growth and is not altered by addition of amino acids to the growth medium. The role of this isoenzyme is most likely to provide a constant level of aspartyl-beta-phosphate for the biosynthesis of diaminopimelate for peptidoglycan synthesis and dipicolinate during sporulation. The B. subtilis AKI is tetrameric consisting of two alpha and two beta subunits; the alpha (43 kD) and beta (17 kD) subunit formed by two in-phase overlapping genes. The alpha subunit contains the AK catalytic domain and two ACT domains. The beta subunit contains two ACT domains.	244
239794	cd04261	AAK_AKii-LysC-BS	AAK_AKii-LysC-BS: Amino Acid Kinase Superfamily (AAK), AKii; this CD includes the N-terminal catalytic aspartokinase (AK) domain of the lysine-sensitive aspartokinase isoenzyme AKII of Bacillus subtilis 168, and the lysine plus threonine-sensitive aspartokinase of Corynebacterium glutamicum, and related sequences. In B. subtilis 168, the regulation of the diaminopimelate (Dap)-lysine biosynthetic pathway involves dual control by Dap and lysine, effected through separate Dap- and lysine-sensitive aspartokinase isoenzymes. The B. subtilis 168 AKII is induced by methionine, and repressed and inhibited by lysine. Although Corynebacterium glutamicum is known to contain a single aspartokinase isoenzyme type, both the succinylase and dehydrogenase variant pathways of DAP-lysine synthesis operate simultaneously in this organism. In this organism and other various Gram-positive bacteria, the DAP-lysine pathway is feedback regulated by the concerted action of lysine and threonine. Also included in this CD are the aspartokinases of the extreme thermophile, Thermus thermophilus HB27, the Gram-negative obligate methylotroph, Methylophilus methylotrophus AS1, and those single aspartokinases found in Pseudomonas, C. glutamicum, and Amycolatopsis lactamdurans. B. subtilis 168 AKII, and the C. glutamicum, Streptomyces clavuligerus and A. lactamdurans aspartokinases are described as tetramers consisting of two alpha and two beta subunits; the alpha (44 kD) and beta (18 kD) subunits formed by two in-phase overlapping polypeptides.	239
176265	cd04263	DUF619-NAGK-FABP	DUF619 domain of N-acetylglutamate kinase (NAGK) of the fungal arginine-biosynthetic pathway. DUF619-NAGK-FABP: DUF619 domain of N-acetylglutamate kinase (NAGK) of the fungal arginine-biosynthetic pathway (FABP). The nuclear-encoded, mitochondrial polyprotein precursor (ARG5,6) consists of an N-terminal NAGK (ArgB) domain, a central DUF619 domain, and a C-terminal reductase domain (ArgC, N-Acetylglutamate Phosphate Reductase, NAGPR). The precursor is cleaved into two distinct enzymes (NAGK-DUF619 and NAGPR) in the mitochondria. Native molecular weights of these proteins indicate that the kinase is an octamer whereas the reductase is a dimer. Arg5,6 catalyzes the second reaction of arginine biosynthesis; the phosphorylation of the gamma-carboxyl group of NAG to produce N-acetylglutamylphosphate (NAGP) which is subsequently converted to ornithine in two more steps. It also binds and regulates the promoters of nuclear and mitochondrial genes, and may possibly regulate precursor mRNA metabolism. The DUF619 domain function has yet to be characterized.	98
176266	cd04264	DUF619-NAGS	DUF619 domain of various N-acetylglutamate Synthases of the fungal arginine-biosynthetic pathway and urea cycle found in humans and fish. DUF619-NAGS: This family includes the DUF619 domain of various N-acetylglutamate synthases (NAGS) of the urea cycle found in humans and fish, the DUF619 domain of the NAGS of the fungal arginine-biosynthetic pathway (FABP), as well as the DUF619 domain present in C-terminal of a NAG kinase-like domain in a limited number of predicted NAGSs found in bacteria and Dictyostelium. Ureogenic NAGS is a mitochondrial enzyme catalyzing the formation of NAG from acetylcoenzyme A and L-glutamate. NAGS is an essential allosteric activator of carbamylphosphate synthase I, the first and rate limiting enzyme of the urea cycle. Domain architecture of ureogenic and fungal NAGS consists of an N-terminal NAG kinase-like domain and a C-terminal DUF619 domain. The DUF619 domain function has yet to be characterized.	99
176267	cd04265	DUF619-NAGS-U	DUF619 domain of various N-acetylglutamate Synthases (NAGS) of the urea (U) cycle of humans and fish. This family includes the DUF619 domain of various N-acetylglutamate synthases (NAGS) of the urea cycle found in humans and fish, the DUF619 domain of the NAGS of the fungal arginine-biosynthetic pathway (FABP), as well as the DUF619 domain present in C-terminal of a NAG kinase-like domain in a limited number of predicted NAGSs found in bacteria and Dictyostelium. Ureogenic NAGS is a mitochondrial enzyme catalyzing the formation of NAG from acetylcoenzyme A and L-glutamate. NAGS is an essential allosteric activator of carbamylphosphate synthase I, the first and rate limiting enzyme of the urea cycle. Domain architecture of ureogenic and fungal NAGS consists of an N-terminal NAG kinase-like domain and a C-terminal DUF619 domain. The DUF619 domain function has yet to be characterized.	99
176268	cd04266	DUF619-NAGS-FABP	DUF619 domain of N-acetylglutamate Synthase of the fungal arginine-biosynthetic pathway. DUF619-NAGS-FABP: This family includes the DUF619 domain of N-acetylglutamate synthase (NAGS) of the fungal arginine-biosynthetic pathway (FABP). This NAGS (also known as arginine-requiring protein 2 or ARG2) consists of an N-terminal NAG kinase-like domain and a C-terminal DUF619 domain. NAGS catalyzes the formation of NAG from acetylcoenzyme A and L-glutamate. The DUF619 domain, yet to be characterized, is predicted to function in NAGS association in fungi.	108
239795	cd04267	ZnMc_ADAM_like	Zinc-dependent metalloprotease, ADAM_like or reprolysin_like subgroup. The adamalysin_like or ADAM family of metalloproteases contains proteolytic domains from snake venoms, proteases from the mammalian reproductive tract, and the tumor necrosis factor alpha convertase, TACE. ADAMs (A Disintegrin And Metalloprotease) are glycoproteins, which play roles in cell signaling, cell fusion, and cell-cell interactions.	192
239796	cd04268	ZnMc_MMP_like	Zinc-dependent metalloprotease, MMP_like subfamily. This group contains matrix metalloproteinases (MMPs), serralysins, and the astacin_like family of proteases.	165
239797	cd04269	ZnMc_adamalysin_II_like	Zinc-dependent metalloprotease; adamalysin_II_like subfamily. Adamalysin II is a snake venom zinc endopeptidase. This subfamily contains other snake venom metalloproteinases, as well as membrane-anchored metalloproteases belonging to the ADAM family. ADAMs (A Disintegrin And Metalloprotease) are glycoproteins, which play roles in cell signaling, cell fusion, and cell-cell interactions.	194
239798	cd04270	ZnMc_TACE_like	Zinc-dependent metalloprotease; TACE_like subfamily. TACE, the tumor-necrosis factor-alpha converting enzyme, releases soluble TNF-alpha from transmembrane pro-TNF-alpha.	244
239799	cd04271	ZnMc_ADAM_fungal	Zinc-dependent metalloprotease, ADAM_fungal subgroup. The adamalysin_like or ADAM (A Disintegrin And Metalloprotease) family of metalloproteases are integral membrane proteases acting on a variety of extracellular targets. They are involved in shedding soluble peptides or proteins from the cell surface. This subfamily contains fungal ADAMs, whose precise function has yet to be determined.	228
239800	cd04272	ZnMc_salivary_gland_MPs	Zinc-dependent metalloprotease, salivary_gland_MPs. Metalloproteases secreted by the salivary glands of arthropods.	220
239801	cd04273	ZnMc_ADAMTS_like	Zinc-dependent metalloprotease, ADAMTS_like subgroup. ADAMs (A Disintegrin And Metalloprotease) are glycoproteins, which play roles in cell signaling, cell fusion, and cell-cell interactions. This particular subfamily represents domain architectures that combine ADAM-like metalloproteinases with thrombospondin type-1 repeats. ADAMTS (a disintegrin and metalloproteinase with thrombospondin motifs) proteinases are inhibited by TIMPs (tissue inhibitors of metalloproteinases), and they play roles in coagulation, angiogenesis, development and progression of arthritis. They hydrolyze the von Willebrand factor precursor and various components of the extracellular matrix.	207
239802	cd04275	ZnMc_pappalysin_like	Zinc-dependent metalloprotease, pappalysin_like subfamily. The pregnancy-associated plasma protein A (PAPP-A or pappalysin-1) cleaves insulin-like growth factor-binding proteins 4 and 5, thereby promoting cell growth by releasing bound growth factor. This model includes pappalysins and related metalloprotease domains from all three kingdoms of life. The three-dimensional structure of an archaeal representative, ulilysin, has been solved.	225
239803	cd04276	ZnMc_MMP_like_2	Zinc-dependent metalloprotease; MMP_like sub-family 2. A group of bacterial metalloproteinase domains similar to matrix metalloproteinases and astacin.	197
239804	cd04277	ZnMc_serralysin_like	Zinc-dependent metalloprotease, serralysin_like subfamily. Serralysins and related proteases are important virulence factors in pathogenic bacteria. They may be secreted into the medium via a mechanism found in gram-negative bacteria, that does not require n-terminal signal sequences which are cleaved after the transmembrane translocation. A calcium-binding domain c-terminal to the metalloprotease domain, which contains multiple tandem repeats of a nine-residue motif including the pattern GGxGxD, and which forms a parallel beta roll may be involved in the translocation mechanism and/or substrate binding. Serralysin family members may have a broad spectrum of substrates each, including host immunoglobulins, complement proteins, cell matrix and cytoskeletal proteins, as well as antimicrobial peptides.	186
239805	cd04278	ZnMc_MMP	Zinc-dependent metalloprotease, matrix metalloproteinase (MMP) sub-family. MMPs are responsible for a great deal of pericellular proteolysis of extracellular matrix and cell surface molecules, playing crucial roles in morphogenesis, cell fate specification, cell migration, tissue repair, tumorigenesis, gain or loss of tissue-specific functions, and apoptosis. In many instances, they are anchored to cell membranes via trans-membrane domains, and their activity is controlled via TIMPs (tissue inhibitors of metalloproteinases).	157
239806	cd04279	ZnMc_MMP_like_1	Zinc-dependent metalloprotease; MMP_like sub-family 1. A group of bacterial, archaeal, and fungal metalloproteinase domains similar to matrix metalloproteinases and astacin.	156
239807	cd04280	ZnMc_astacin_like	Zinc-dependent metalloprotease, astacin_like subfamily or peptidase family M12A, a group of zinc-dependent proteolytic enzymes with a HExxH zinc-binding site/active site. Members of this family may have an amino terminal propeptide, which is cleaved to yield the active protease domain, which is consequently always found at the N-terminus in multi-domain architectures. This family includes: astacin, a digestive enzyme from Crayfish; meprin, a multiple domain membrane component that is constructed from a homologous alpha and beta chain, proteins involved in (bone) morphogenesis, tolloid from drosophila, and the sea urchin SPAN protein, which may also play a role in development.	180
239808	cd04281	ZnMc_BMP1_TLD	Zinc-dependent metalloprotease; BMP1/TLD-like subfamily. BMP1 (Bone morphogenetic protein 1) and TLD (tolloid)-like metalloproteases play vital roles in extracellular matrix formation, by cleaving precursor proteins such as enzymes, structural proteins, and proteins involved in the mineralization of the extracellular matrix. The drosophila protein tolloid and its Xenopus homologue xolloid cleave and inactivate Sog and chordin, respectively, which are inhibitors of Dpp (the Drosophila decapentaplegic gene product) and its homologue BMP4, involved in dorso-ventral patterning.	200
239809	cd04282	ZnMc_meprin	Zinc-dependent metalloprotease, meprin_like subfamily. Meprins are membrane-bound or secreted extracellular proteases, which cleave a variety of targets, including peptides such as parathyroid hormone, gastrin, and cholecystokinin, cytokines such as osteopontin, and proteins such as collagen IV, fibronectin, casein and gelatin. Meprins may also be able to release proteins from the cell surface. Closely related meprin alpha- and beta-subunits form homo- and hetero-oligomers; these complexes are found on epithelial cells of the intestine, for example, and are also expressed in certain cancer cells.	230
239810	cd04283	ZnMc_hatching_enzyme	Zinc-dependent metalloprotease, hatching enzyme-like subfamily. Hatching enzymes are secreted by teleost embryos to digest the egg envelope or chorion. In some teleosts, the hatching enzyme may be a system consisting of two evolutionary related  metalloproteases, high choriolytic enzyme and low choriolytic enzyme (HCE and LCE), which may have different  substrate specificities and cooperatively digest the chorion.	182
340852	cd04299	GT35_Glycogen_Phosphorylase-like	proteins similar to glycogen phosphorylase. This family is most closely related to the oligosaccharide phosphorylase domain family and other unidentified sequences. Oligosaccharide phosphorylase catalyzes the breakdown of oligosaccharides into glucose-1-phosphate units. They are important allosteric enzymes in carbohydrate metabolism.	776
340853	cd04300	GT35_Glycogen_Phosphorylase	glycogen phosphorylase and similar proteins. This is a family of oligosaccharide phosphorylases. It includes yeast and mammalian glycogen phosphorylases, plant starch/glucan phosphorylase, as well as the maltodextrin phosphorylases of bacteria. The members of this family catalyze the breakdown of oligosaccharides into glucose-1-phosphate units. They are important allosteric enzymes in carbohydrate metabolism. The allosteric control mechanisms of yeast and mammalian members of this family are different from that of bacterial members. The members of this family belong to the GT-B structural superfamily of glycosyltransferases, which have characteristic N- and C-terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology.  The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility.	795
173926	cd04301	NAT_SF	N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate. NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included.	65
319798	cd04302	HAD_5NT	haloacid dehalogenase (HAD)-like 5'-nucleotidases similar to the Pseudomonas aeruginosa PA0065. 5'-nucleotidases dephosphorylate nucleoside 5'-monophosphates to nucleosides and inorganic phosphate. Purified Pseudomonas aeruginosa PA0065 displayed high activity toward 5'-UMP and 5'-IMP, significant activity against 5'-XMP and 5'-TMP, and low activity against 5'-CMP. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	209
319799	cd04303	HAD_PGPase	phosphoglycolate phosphatase, similar to Synechococcus elongatus phosphoglycolate phosphatase PGP/CbbZ. Phosphoglycolate phosphatase catalyzes the dephosphorylation of phosphoglycolate; its activity requires divalent cations, especially Mg++.  This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	201
319800	cd04305	HAD_Neu5Ac-Pase_like	human N-acetylneuraminate-9-phosphate phosphatase, Escherichia coli house-cleaning phosphatase YjjG, and related phosphatases. N-acetylneuraminate-9- phosphatase (Neu5Ac-9-Pase; E.C. 3.1.3.29) catalyzes the dephosphorylation of N-acylneuraminate 9-phosphate during the synthesis of N-acetylneuraminate; Escherichia coli nucleotide phosphatase YjjG has a broad pyrimidine nucleotide activity spectrum and functions as an in vivo house-cleaning phosphatase for noncanonical pyrimidine nucleotides. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	109
319801	cd04309	HAD_PSP_eu	phosphoserine phosphatase eukaryotic-like, similar to human phosphoserine phosphatase. Human PSP, EC 3.1.3.3, catalyzes the third and final step of the L-serine biosynthesis pathway, the Mg2+-dependent hydrolysis of phospho-L-serine to L-serine and inorganic phosphate. L-serine is a precursor for the biosynthesis of glycine. HPSP regulates the levels of glycine and D-serine (converted from L-serine), the putative co-agonists for the glycine site of the NMDA receptor in the brain. Plant 3-PSP catalyzes the conversion of 3-phosphoserine to serine in the last step of the plastidic pathway of serine biosynthesis. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	202
239811	cd04316	ND_PkAspRS_like_N	ND_PkAspRS_like_N: N-terminal, anticodon recognition domain of the type found in the homodimeric non-discriminating (ND) Pyrococcus kodakaraensis aspartyl-tRNA synthetase (AspRS).  This domain is a beta-barrel domain (OB fold) involved in binding the tRNA anticodon stem-loop.  P. kodakaraensis AspRS is a class 2b aaRS. aaRSs catalyze the specific attachment of amino acids (AAs) to their cognate tRNAs during protein biosynthesis. This 2-step reaction involves i) the activation of the AA by ATP in the presence of magnesium ions, followed by ii) the transfer of  the activated AA to the terminal ribose of tRNA.  In the case of the class2b aaRSs, the activated AA is attached to the 3'OH of the terminal ribose. P. kodakaraensis ND-AspRS can charge both tRNAAsp and tRNAAsn. Some of the enzymes in this group may be discriminating, based on the presence of homologs of asparaginyl-tRNA synthetase (AsnRS) in their completed genomes.	108
239812	cd04317	EcAspRS_like_N	EcAspRS_like_N: N-terminal, anticodon recognition domain of the type found in Escherichia coli aspartyl-tRNA synthetase (AspRS), the human mitochondrial (mt) AspRS-2, the discriminating (D) Thermus thermophilus AspRS-1, and the nondiscriminating (ND) Helicobacter pylori AspRS.  These homodimeric enzymes are class2b aminoacyl-tRNA synthetases (aaRSs). This domain is a beta-barrel domain (OB fold) involved in binding the tRNA anticodon stem-loop.  aaRSs catalyze the specific attachment of amino acids (AAs) to their cognate tRNAs during protein biosynthesis. This 2-step reaction involves i) the activation of the AA by ATP in the presence of magnesium ions, followed by ii) the transfer of  the activated AA to the terminal ribose of tRNA.  In the case of the class2b aaRSs, the activated AA is attached to the 3'OH of the terminal ribose.  Eukaryotes contain 2 sets of aaRSs, both of which are encoded by the nuclear genome. One set is concerned with cytoplasmic protein synthesis, whereas the other is concerned exclusively with mitochondrial protein synthesis. Human mtAspRS participates in mitochondrial biosynthesis; this enzyme has been shown to charge E. coli native tRNAAsp in addition to in vitro transcribed human mitochondrial tRNAAsp.  T. thermophilus is rare among bacteria in having both a D_AspRS and a ND_AspRS.  H. pylori ND-AspRS can charge both tRNAAsp and tRNAAsn; it is fractionally more efficient at aminoacylating tRNAAsp over tRNAAsn. The H. pylori genome does not contain AsnRS.	135
239813	cd04318	EcAsnRS_like_N	EcAsnRS_like_N: N-terminal, anticodon recognition domain of the type found in Escherichia coli asparaginyl-tRNA synthetase (AsnRS) and in Arabidopsis thaliana and Saccharomyces cerevisiae mitochondrial (mt) AsnRS. This domain is a beta-barrel domain (OB fold) involved in binding the tRNA anticodon stem-loop. The enzymes in this group are homodimeric class2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids (AAs) to their cognate tRNAs during protein biosynthesis. This 2-step reaction involves i) the activation of the AA by ATP in the presence of magnesium ions, followed by ii) the transfer of  the activated AA to the terminal ribose of tRNA.  In the case of the class2b aaRSs, the activated AA is attached to the 3'OH of the terminal ribose. Eukaryotes contain 2 sets of aaRSs, both of which are encoded by the nuclear genome. One set is concerned with cytoplasmic protein synthesis, whereas the other is concerned exclusively with mitochondrial protein synthesis. S. cerevisiae mtAsnRS can charge E. coli tRNA with asparagine. Mutations in the gene for S. cerevisiae mtAsnRS have been found to induce a "petite" phenotype typical for a mutation in a nuclear gene that results in a non-functioning mitochondrial protein synthesis system.	82
239814	cd04319	PhAsnRS_like_N	PhAsnRS_like_N: N-terminal, anticodon recognition domain of the type found in Pyrococcus horikoshii asparaginyl-tRNA synthetase (AsnRS).  This domain is a beta-barrel domain (OB fold) involved in binding the tRNA anticodon stem-loop. The archaeal enzymes in this group are homodimeric class2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids (AAs) to their cognate tRNAs during protein biosynthesis. This 2-step reaction involves i) the activation of the AA by ATP in the presence of magnesium ions, followed by ii) the transfer of  the activated AA to the terminal ribose of tRNA.  In the case of the class2b aaRSs, the activated AA is attached to the 3'OH of the terminal ribose.	103
239815	cd04320	AspRS_cyto_N	AspRS_cyto_N: N-terminal, anticodon recognition domain of the type found in Saccharomyces cerevisiae and human cytoplasmic aspartyl-tRNA synthetase (AspRS). This domain is a beta-barrel domain (OB fold) involved in binding the tRNA anticodon stem-loop. The enzymes in this group are homodimeric class2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids (AAs) to their cognate tRNAs during protein biosynthesis. This 2-step reaction involves i) the activation of the AA by ATP in the presence of magnesium ions, followed by ii) the transfer of  the activated AA to the terminal ribose of tRNA.  In the case of the class2b aaRSs, the activated AA is attached to the 3'OH of the terminal ribose. Eukaryotes contain 2 sets of aaRSs, both of which are encoded by the nuclear genome. One set is concerned with cytoplasmic protein synthesis, whereas the other is concerned exclusively with mitochondrial protein synthesis.	102
239816	cd04321	ScAspRS_mt_like_N	ScAspRS_mt_like_N: N-terminal, anticodon recognition domain of the type found in Saccharomyces cerevisiae mitochondrial (mt) aspartyl-tRNA synthetase (AspRS). This domain is a beta-barrel domain (OB fold) involved in binding the tRNA anticodon stem-loop. The enzymes in this fungal group are homodimeric class2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids (AAs) to their cognate tRNAs during protein biosynthesis. This 2-step reaction involves i) the activation of the AA by ATP in the presence of magnesium ions, followed by ii) the transfer of  the activated AA to the terminal ribose of tRNA.  In the case of the class2b aaRSs, the activated AA is attached to the 3'OH of the terminal ribose. Eukaryotes contain 2 sets of aaRSs, both of which are encoded by the nuclear genome. One set is concerned with cytoplasmic protein synthesis, whereas the other is concerned exclusively with mitochondrial protein synthesis. Mutations in the gene for S. cerevisiae mtAspRS result in a "petite" phenotype typical for a mutation in a nuclear gene that results in a non-functioning mitochondrial protein synthesis system.	86
239817	cd04322	LysRS_N	LysRS_N: N-terminal, anticodon recognition domain of lysyl-tRNA synthetases (LysRS). These enzymes are homodimeric class 2b aminoacyl-tRNA synthetases (aaRSs). This domain is a beta-barrel domain (OB fold) involved in binding the tRNA anticodon stem-loop.  aaRSs catalyze the specific attachment of amino acids (AAs) to their cognate tRNAs during protein biosynthesis. This 2-step reaction involves i) the activation of the AA by ATP in the presence of magnesium ions, followed by ii) the transfer of  the activated AA to the terminal ribose of tRNA.  In the case of the class2b aaRSs, the activated AA is attached to the 3'OH of the terminal ribose.  Included in this group are E. coli LysS and LysU. These two isoforms of LysRS are encoded by distinct genes which are differently regulated.  Eukaryotes contain 2 sets of aaRSs, both of which are encoded by the nuclear genome. One set is concerned with cytoplasmic protein synthesis, whereas the other is concerned exclusively with mitochondrial protein synthesis. Saccharomyces cerevisiae cytoplasmic and mitochondrial LysRSs have been shown to participate in the mitochondrial import of the only nuclear-encoded tRNA of S. cerevisiae (tRNAlysCUU). The gene for human LysRS encodes both the cytoplasmic and the mitochondrial isoforms of LysRS.  In addition to their housekeeping role, human LysRS may function as a signaling molecule that activates immune cells, and tomato LysRS may participate in a root-specific process possibly connected to oxidative-stress conditions or heavy metal uptake. It is known that human tRNAlys and LysRS are specifically packaged into HIV-1, suggesting a role for LysRS in tRNA packaging.	108
239818	cd04323	AsnRS_cyto_like_N	AsnRS_cyto_like_N: N-terminal, anticodon recognition domain of the type found in human and Saccharomyces cerevisiae cytoplasmic asparaginyl-tRNA synthetase (AsnRS), in Brugia malayi AsnRS, and in various putative bacterial AsnRSs.  This domain is a beta-barrel domain (OB fold) involved in binding the tRNA anticodon stem-loop. The enzymes in this group are homodimeric class2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids (AAs) to their cognate tRNAs during protein biosynthesis. This 2-step reaction involves i) the activation of the AA by ATP in the presence of magnesium ions, followed by ii) the transfer of  the activated AA to the terminal ribose of tRNA.  In the case of the class2b aaRSs, the activated AA is attached to the 3'OH of the terminal ribose. Eukaryotes contain 2 sets of aaRSs, both of which are encoded by the nuclear genome. One set is concerned with cytoplasmic protein synthesis, whereas the other is concerned exclusively with mitochondrial protein synthesis.  AsnRS is an immunodominant antigen of the filarial nematode B. malayi and is of interest as a target for anti-parasitic drug design. Human AsnRS has been shown to be a pro-inflammatory chemokine which interacts with CCR3 chemokine receptors on T cells, immature dendritic cells and macrophages.	84
239819	cd04327	ZnMc_MMP_like_3	Zinc-dependent metalloprotease; MMP_like sub-family 3. A group of bacterial and fungal metalloproteinase domains similar to matrix metalloproteinases and astacin.	198
239820	cd04328	RNAP_I_Rpa43_N	RNAP_I_Rpa43_N: Rpa43, N-terminal ribonucleoprotein (RNP) domain. Rpa43 is a subunit of eukaryotic RNA polymerase (RNAP) I that is homologous to Rpb7 of eukaryotic RNAP II, Rpc25 of eukaryotic RNAP III, and RpoE of archaeal RNAP. Rpa43 has two domains, an N-terminal RNP domain and a C-terminal oligonucleotide-binding (OB) domain. Rpa43 heterodimerizes with Rpa14 and this heterodimer has genetic and biochemical characteristics similar to those of the Rpb7/Rpb4 heterodimer of RNAP II. In addition, the Rpa43/Rpa14 heterodimer binds single-stranded RNA, as is the case for the Rpb7/Rpb4 and the archaeal E/F complexes. The position of Rpa43/Rpa14 in the three-dimensional structure of RNAP I is similar to that of Rpb4/Rpb7, which forms an upstream interface between the C-terminal domain of Rpb1 and the transcription factor IIB (TFIIB), recruiting pol II to the pol II promoter. Rpa43 binds Rrn3, an rDNA-specific transcription factor, functionally equivalent to TFIIB, involved in recruiting RNAP I to the pol I promoter.	89
239821	cd04329	RNAP_II_Rpb7_N	RNAP_II_Rpb7_N: Rpb7, N-terminal ribonucleoprotein (RNP) domain. Rpb7 is a subunit of eukaryotic RNA polymerase (RNAP) II that is homologous to Rpc25 of RNAP III, RpoE of archaeal RNAP, and Rpa43 of eukaryotic RNAP I. Rpb7 heterodimerizes with Rpb4 and this heterodimer binds the 10-subunit core of RNAP II, forming part of the floor of the DNA-binding cleft. Rpb7 has two domains, an N-terminal RNP domain and a C-terminal oligonucleotide-binding (OB) domain, both of which bind single-stranded RNA. Rpb7 is thought to interact with the nascent RNA strand as it exits the RNAP II complex during transcription elongation. The Rpb7/Rpb4 heterodimer is also thought to serve as an upstream interface between the C-terminal domain of Rpb1 and the transcription factor IIB (TFIIB), recruiting pol II to the pol II promoter.	80
239822	cd04330	RNAP_III_Rpc25_N	RNAP_III_Rpc25_N: Rpc25, N-terminal ribonucleoprotein (RNP) domain. Rpc25 is a subunit of eukaryotic RNA polymerase (RNAP) III and is homologous to Rpa43 of eukaryotic RNAP I, Rpb7 of eukaryotic RNAP II, and RpoE of archaeal RNAP. Rpc25 has two domains, an N-terminal RNP domain and a C-terminal oligonucleotide-binding (OB) domain, both of which are thought to bind single-stranded RNA. Rpc25 heterodimerizes with Rpc17 and plays an important role in transcription initiation. RNAP III transcribes diverse structural and catalytic RNAs including 5S ribosomal RNAs, tRNAs, and a small number of snRNAs involved in RNA and protein synthesis.	80
239823	cd04331	RNAP_E_N	RNAP_E_N: RpoE, N-terminal ribonucleoprotein (RNP) domain. RpoE (subunit E) is a subunit of the archaeal RNA polymerase (RNAP) that is homologous to Rpb7 of eukaryotic RNAP II, Rpc25 of eukaryotic RNAP III, and Rpa43 of eukaryotic RNAP I. RpoE heterodimerizes with RpoF, another RNA polymerase subunit. RpoE has an elongated two-domain structure that includes an N-terminal RNP domain and a C-terminal oligonucleotide-binding (OB) domain. Both domains of RpoE bind single-stranded RNA.	80
239824	cd04332	YbaK_like	YbaK-like.  The YbaK family of deacylase domains includes the INS amino acid-editing domain of  the bacterial class II prolyl tRNA synthetase (ProRS), and its trans-acting homologs, YbaK, ProX, and PrdX.  The primary function of INS is to hydrolyze mischarged cysteinyl-tRNA(Pro)'s, thus helping ensure the fidelity of translation.  Organisms whose ProRS lacks the INS domain express an INS homolog in trans (e.g. YbaK, ProX, or PrdX).	136
239825	cd04333	ProX_deacylase	This CD, composed mainly of bacterial single-domain proteins, includes the Thermus thermophilus (Tt) YbaK-like protein, a homolog of the trans-acting Escherichia coli YbaK Cys-tRNA(Pro) deacylase and the Agrobacterium tumefaciens  ProX Ala-tRNA(Pro) deacylase and also the cis-acting prolyl-tRNA synthetase-editing domain (ProRS-INS). While ProX and ProRS-INS hydrolyze misacylated Ala-tRNA(Pro), the E. coli YbaK hydrolyzes misacylated Cys-tRNA(Pro). A few CD members are N-terminal, YbaK-ProX-like domains of an uncharacterized protein with a C-terminal, predicted Fe-S protein domain.	148
239826	cd04334	ProRS-INS	INS is an amino acid-editing domain inserted (INS) into the bacterial class II prolyl-tRNA synthetase (ProRS); however, this CD is not exclusively bacterial. It is also found at the N-terminus of the eukaryotic/archaea-like ProRS's of yeasts and single-celled parasites.  ProRS catalyzes the attachment of proline to tRNA(Pro); proline is first activated by ATP, and then transferred to the acceptor end of tRNA(Pro). ProRS can inadvertently process noncognate amino acids such as alanine and cysteine, and to avoid such errors, in post-transfer editing, the INS domain deacylates mischarged Ala-tRNA(Pro), thus ensuring the fidelity of translation. Misacylated Cys-tRNA(Pro) is not edited by ProRS.  In addition to the INS editing domain, the prokaryote-like ProRS protein contains catalytic and anticodon-binding domains which form a dimeric interface.	160
239827	cd04335	PrdX_deacylase	This CD includes bacterial (Agrobacterium tumefaciens and Caulobacter crescentus ProX, and Clostridium sticklandii PrdX) and eukaryotic (Plasmodium falciparum N-terminal ProRS editing domain) sequences. The C. sticklandii PrdX protein, a homolog of the YbaK and ProX proteins, and the prolyl-tRNA synthetase-editing domain (ProRS-INS), specifically hydrolyzes Ala-tRNA(Pro). In this CD, many of the eukaryotic editing domains are N-terminal and cis-acting, expressed from a multidomain ProRS; however, similar to the bacterial PrdX, the mammalian, amphibian, and echinoderm PrdX-like proteins are trans-acting, single-domain proteins.	156
239828	cd04336	YeaK	YeaK is an uncharacterized Escherichia coli protein with a YbaK-like domain of unknown function.  The YbaK-like domain family includes the INS amino acid-editing domain of the bacterial class II prolyl tRNA synthetase (ProRS), and its trans-acting homologs, YbaK, and ProX.  The primary function of INS is to hydrolyze mischarged cysteinyl-tRNA(Pro)'s, thus helping ensure the fidelity of translation.  Organisms whose ProRS lacks the INS domain express a single-domain INS homolog such as YbaK, ProX, or PrdX which supplies the function of INS in trans.	153
239829	cd04337	Rieske_RO_Alpha_Cao	Cao (chlorophyll a oxygenase) is a Rieske non-heme iron-sulfur protein located within the plastid-envelope inner and thylakoid membranes, that catalyzes the conversion of chlorophyllide a to chlorophyllide b. CAO is found not only in plants but also in chlorophytes and  prochlorophytes. This domain represents the N-terminal Rieske domain of the oxygenase alpha subunit. ROs comprise a large class of aromatic ring-hydroxylating dioxygenases that enable microorganisms to tolerate and utilize aromatic compounds for growth. The oxygenase alpha subunit contains an N-terminal Rieske domain with an [2Fe-2S] cluster and a C-terminal catalytic domain with a mononuclear Fe(II) binding site. The Rieske [2Fe-2S] cluster accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron for catalysis. Cao is closely related to several other plant RO's including Tic 55, a 55 kDa protein associated with protein transport through the inner chloroplast membrane;  Ptc 52, a novel 52 kDa protein isolated from chloroplasts; and LLS1/Pao (Lethal-leaf spot 1/pheophorbide a oxygenase).	129
239830	cd04338	Rieske_RO_Alpha_Tic55	Tic55 is a 55kDa LLS1-related non-heme iron oxygenase associated with protein transport through the plant inner chloroplast membrane. This domain represents the N-terminal Rieske domain of the Tic55 oxygenase alpha subunit. Tic55 is closely related to the oxygenase alpha subunits of a small subfamily of enzymes found in plants as well as oxygenic cyanobacterial photosynthesizers including LLS1 (lethal leaf spot 1, also known as PaO), Ptc52, and ACD1 (accelerated cell death 1). ROs comprise a large class of aromatic ring-hydroxylating dioxygenases that enable microorganisms to tolerate and utilize aromatic compounds for growth. The oxygenase alpha subunit contains an N-terminal Rieske domain with an [2Fe-2S] cluster and a C-terminal catalytic domain with a mononuclear Fe(II) binding site. The Rieske [2Fe-2S] cluster accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron for catalysis.	134
239831	cd04365	IlGF_relaxin_like	IlGF_like family, relaxin_like subgroup, specific to vertebrates. Members include a number of active peptides including (pro)relaxin, mammalian Leydig cell-specific insulin-like peptide (gene INSL3), early placenta insulin-like peptide (ELIP; gene INSL4), and insulin-like peptides 5 (INSL5) and 6 (INSL6). Members of this subgroup are widely expressed in testes (INSL3, INSL6), decidua, placenta, prostate, corpus luteum, brain (various relaxins), GI tract, and kidney (INSL5) where they serve a variety of functions in parturition and development. Typically, the active forms of these peptide hormones are composed of two chains (A and B) linked by two disulfide bonds; the arrangement of four cysteines is conserved in the "A" chain:  Cys1 is linked by a disulfide bond to Cys3, Cys2 and Cys4 are linked by interchain disulfide bonds to cysteines in the "B" chain. This alignment contains both chains, plus the intervening linker region, arranged as found in the propeptide form. Propeptides are cleaved to yield two separate chains linked covalently by the two disulfide bonds.	59
239832	cd04366	IlGF_insulin_bombyxin_like	IlGF_like family, insulin_bombyxin_like subgroup. Members include a number of peptides including insulin, insulin-like growth factors I and II, insect prothoracicotropic hormone (bombyxin), locust insulin-related peptide (LIRP), molluscan insulin-related peptides 1 to 5 (MIP), and C. elegans insulin-like peptides. With the exception of insulin-like growth factors, the active forms of these peptide hormones are composed of two chains (A and B) linked by two disulfide bonds; the arrangement of four cysteines is conserved in the "A" chain:  Cys1 is linked by a disulfide bond to Cys3, Cys2 and Cys4 are linked by interchain disulfide bonds to cysteines in the "B" chain. This alignment contains both chains, plus the intervening linker region, arranged as found in the propeptide form. Propeptides are cleaved to yield two separate chains linked covalently by the two disulfide bonds.	42
239833	cd04367	IlGF_insulin_like	IlGF_like family, insulin_like subgroup, specific to vertebrates. Members include a number of peptides including insulin and insulin-like growth factors I and II, which play a variety of roles in controlling processes such as metabolism, growth and differentiation, and reproduction. On a cellular level they affect cell cycle, apoptosis, cell migration, and differentiation. With the exception of the insulin-like growth factors, the active forms of these peptide hormones are composed of two chains (A and B) linked by two disulfide bonds; the arrangement of four cysteines is conserved in the "A" chain:  Cys1 is linked by a disulfide bond to Cys3, Cys2 and Cys4 are linked by interchain disulfide bonds to cysteines in the "B" chain. This alignment contains both chains, plus the intervening linker region, arranged as found in the propeptide form. Propeptides are cleaved to yield two separate chains linked covalently by the two disulfide bonds.	79
239834	cd04368	IlGF	IlGF, insulin_like growth factors; specific to vertebrates. Members include a number of peptides including insulin-like growth factors I and II, which play a variety of roles in controlling processes such as growth, differentiation, and reproduction. On a cellular level they affect cell cycle, apoptosis, cell migration, proliferation, and differentiation. Typically, the active forms of these peptide hormones are single chains cross-linked by three disulfide bonds.	67
99922	cd04369	Bromodomain	Bromodomain. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine.	99
239835	cd04370	BAH	BAH, or Bromo Adjacent Homology domain (also called ELM1 and BAM for Bromo Adjacent Motif). BAH domains were first described as domains found in the polybromo protein and Yeast Rsc1/Rsc2 (Remodeling of the Structure of Chromatin). They also occur in mammalian DNA methyltransferases and the MTA1 subunits of histone deacetylase complexes. A BAH domain is also found in Yeast Sir3p and in the origin recognition complex protein 1 (Orc1p), where it was found to interact with the N-terminal lobe of the silent information regulator 1 protein (Sir1p), confirming the initial hypothesis that BAH plays a role in protein-protein interactions.	123
239836	cd04371	DEP	DEP domain, named after Dishevelled, Egl-10, and Pleckstrin, where this domain was first discovered. The function of this domain is still not clear, but it is believed to be important for the membrane association of the signaling proteins in which it is present. New studies show that the DEP domain of Sst2, a yeast RGS protein is necessary and sufficient for receptor interaction.	81
239837	cd04372	RhoGAP_chimaerin	RhoGAP_chimaerin: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of chimaerins. Chimaerins are a family of phorbolester- and diacylglycerol-responsive GAPs specific for the Rho-like GTPase Rac. Chimaerins exist in two alternative splice forms that each contain a C-terminal GAP domain, and a central C1 domain which binds phorbol esters, inducing a conformational change that activates the protein; one splice form is lacking the N-terminal Src homology-2 (SH2) domain. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude.	194
239838	cd04373	RhoGAP_p190	RhoGAP_p190: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of p190-like proteins. p190, also named RhoGAP5, plays a role in neuritogenesis and axon branch stability. p190 shows a preference for Rho, over Rac and Cdc42, and consists of an N-terminal GTPase domain and a C-terminal GAP domain. The central portion of p190 contains important regulatory phosphorylation sites. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude.	185
239839	cd04374	RhoGAP_Graf	RhoGAP_Graf: GTPase-activator protein (GAP) domain for Rho-like GTPases found in GRAF (GTPase regulator associated with focal adhesion kinase); Graf is a multi-domain protein, containing SH3 and PH domains, that binds focal adhesion kinase and influences cytoskeletal changes mediated by Rho proteins. Graf exhibits GAP activity toward RhoA and Cdc42, but only weakly activates Rac1. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude.	203
239840	cd04375	RhoGAP_DLC1	RhoGAP_DLC1: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of DLC1-like proteins. DLC1 shows in vitro GAP activity towards RhoA and CDC42. Besides its C-terminal GAP domain, DLC1 also contains a SAM (sterile alpha motif) and a START (StAR-related lipid transfer action) domain. DLC1 has tumor suppressor activity in cell culture. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude.	220
239841	cd04376	RhoGAP_ARHGAP6	RhoGAP_ARHGAP6: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of ArhGAP6-like proteins. ArhGAP6 shows GAP activity towards RhoA, but not towards Cdc42 and Rac1. ArhGAP6 is often deleted in microphthalmia with linear skin defects syndrome (MLS); MLS is a severe X-linked developmental disorder. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude.	206
239842	cd04377	RhoGAP_myosin_IX	RhoGAP_myosin_IX: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain present in class IX myosins. Class IX myosins contain a characteristic head domain, a neck domain, a tail domain which contains a C6H2-zinc binding motif and a RhoGAP domain. Class IX myosins are single-headed, processive myosins that are partly cytoplasmic, and partly associated with membranes and the actin cytoskeleton. Class IX myosins are implicated in the regulation of neuronal morphogenesis and function of sensory systems, like the inner ear. There are two major isoforms, myosin IXA and IXB with several splice variants, which are both expressed in developing neurons. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude.	186
239843	cd04378	RhoGAP_GMIP_PARG1	RhoGAP_GMIP_PARG1: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of GMIP (Gem interacting protein) and PARG1 (PTPL1-associated RhoGAP1). GMIP plays important roles in neurite growth and axonal guidance, and interacts with Gem, a member of the RGK subfamily of the Ras small GTPase superfamily, through the N-terminal half of the protein. GMIP contains a C-terminal RhoGAP domain. GMIP inhibits RhoA function, but is inactive towards Rac1 and Cdc42. PARG1 interacts with Rap2, also a member of the Ras small GTPase superfamily whose exact function is unknown, and shows strong preference for Rho. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude.	203
239844	cd04379	RhoGAP_SYD1	RhoGAP_SYD1: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain present in SYD-1_like proteins. Syd-1, first identified and best studied in C.elegans, has been shown to play an important role in neuronal development by specifying axonal properties. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude.	207
239845	cd04380	RhoGAP_OCRL1	RhoGAP_OCRL1: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain present in OCRL1-like proteins. OCRL1 (oculocerebrorenal syndrome of Lowe 1)-like proteins contain two conserved domains: a central inositol polyphosphate 5-phosphatase domain and a C-terminal Rho GAP domain; this GAP domain lacks the catalytic residue and therefore may be inactive. OCRL-like proteins are type II inositol polyphosphate 5-phosphatases that can hydrolyze lipid PI(4,5)P2 and PI(3,4,5)P3 and soluble Ins(1,4,5)P3 and Ins(1,3,4,5)P4, but their individual specificities vary. The functionality of the RhoGAP domain is still unclear. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude.	220
239846	cd04381	RhoGap_RalBP1	RhoGap_RalBP1: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain present in RalBP1 proteins, also known as RLIP, RLIP76 or cytocentrin. RalBP1 plays an important role in endocytosis during interphase. During mitosis, RalBP1 transiently associates with the centromere and has been shown to play an essential role in the proper assembly of the mitotic apparatus. RalBP1 is an effector of the Ral GTPase which itself is an effector of Ras. RalBP1 contains a RhoGAP domain, which shows weak activity towards Rac1 and Cdc42, but not towards Ral, and a Ral effector domain binding motif. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude.	182
239847	cd04382	RhoGAP_MgcRacGAP	RhoGAP_MgcRacGAP: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain present in MgcRacGAP proteins. MgcRacGAP plays an important dual role in cytokinesis: i) it is part of the centralspindlin complex, together with the mitotic kinesin MKLP1, which is critical for the structure of the central spindle by promoting microtubule bundling. ii) after phosphorylation by aurora B MgcRacGAP becomes an effective regulator of RhoA and plays an important role in the assembly of the contractile ring and the initiation of cytokinesis. MgcRacGAP-like proteins contain an N-terminal C1-like domain, and a C-terminal RhoGAP domain. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude.	193
239848	cd04383	RhoGAP_srGAP	RhoGAP_srGAP: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain present in srGAPs. srGAPs are components of the intracellular part of Slit-Robo signalling pathway that is important for axon guidance and cell migration. srGAPs contain an N-terminal FCH domain, a central RhoGAP domain and a C-terminal SH3 domain; this SH3 domain interacts with the intracellular proline-rich-tail of the Roundabout receptor (Robo). This interaction with Robo then activates the rhoGAP domain which in turn inhibits Cdc42 activity. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude.	188
239849	cd04384	RhoGAP_CdGAP	RhoGAP_CdGAP: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of CdGAP-like proteins; CdGAP contains an N-terminal RhoGAP domain and a C-terminal proline-rich region, and it is active on both Cdc42 and Rac1 but not RhoA. CdGAP is recruited to focal adhesions via the interaction with the scaffold protein actopaxin (alpha-parvin). Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude.	195
239850	cd04385	RhoGAP_ARAP	RhoGAP_ARAP: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain present in ARAPs. ARAPs (also known as centaurin deltas) contain, besides the RhoGAP domain, an ArfGAP domain, ankyrin repeats, and Ras-associating and PH domains. Since their ArfGAP activity is PIP3-dependent, ARAPs are considered integration points for phosphoinositide, Arf and Rho signaling. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude.	184
239851	cd04386	RhoGAP_nadrin	RhoGAP_nadrin: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of Nadrin-like proteins. Nadrin, also named Rich-1, has been shown to be involved in the regulation of Ca2+-dependent exocytosis in neurons and recently has been implicated in tight junction maintenance in mammalian epithelium. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude.	203
239852	cd04387	RhoGAP_Bcr	RhoGAP_Bcr: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of Bcr (breakpoint cluster region protein)-like proteins. Bcr is a multidomain protein with a variety of enzymatic functions. It contains a RhoGAP and a Rho GEF domain, a Ser/Thr kinase domain, an N-terminal oligomerization domain, and a C-terminal PDZ binding domain, in addition to PH and C2 domains. Bcr is a negative regulator of:  i) RacGTPase, via the Rho GAP domain, ii) the Ras-Raf-MEK-ERK pathway, via phosphorylation of the Ras binding protein AF-6, and iii) the Wnt signaling pathway through binding beta-catenin. Bcr can form a complex with  beta-catenin and Tcf1. The Wnt signaling pathway is involved in cell proliferation, differentiation, and cell renewal. Bcr was discovered as a fusion partner of Abl. The Bcr-Abl fusion is characteristic for a large majority of chronic myelogenous leukemias (CML). Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude.	196
239853	cd04388	RhoGAP_p85	RhoGAP_p85: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain present in the p85 isoforms of the regulatory subunit of the class IA PI3K (phosphatidylinositol 3'-kinase). This domain is also called Bcr (breakpoint cluster region protein) homology (BH) domain. Class IA PI3Ks are heterodimers, containing a regulatory subunit (p85) and a catalytic subunit (p110) and are activated by growth factor receptor tyrosine kinases (RTKs); this activation is mediated by the p85 subunit. p85 isoforms, alpha and beta, contain a C-terminal p110-binding domain flanked by two SH2 domains, an N-terminal SH3 domain, and a RhoGAP domain flanked by two proline-rich regions. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude.	200
239854	cd04389	RhoGAP_KIAA1688	RhoGAP_KIAA1688: GTPase-activator protein (GAP) domain for Rho-like GTPases found in KIAA1688-like proteins; KIAA1688 is a protein of unknown function that contains a RhoGAP domain and a myosin tail homology 4 (MyTH4) domain. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude.	187
239855	cd04390	RhoGAP_ARHGAP22_24_25	RhoGAP_ARHGAP22_24_25: GTPase-activator protein (GAP) domain for Rho-like GTPases found in ARHGAP22, 24 and 25-like proteins; longer isoforms of these proteins contain an additional N-terminal pleckstrin homology (PH) domain. ARHGAP25 (KIAA0053) has been identified as a GAP for Rac1 and Cdc42. Short isoforms (without the PH domain) of ARHGAP24, called RC-GAP72 and p73RhoGAP, and of ARHGAP22, called p68RacGAP, have been shown to be involved in angiogenesis and endothelial cell capillary formation. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude.	199
239856	cd04391	RhoGAP_ARHGAP18	RhoGAP_ARHGAP18: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of ArhGAP18-like proteins. The function of ArhGAP18 is unknown. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude.	216
239857	cd04392	RhoGAP_ARHGAP19	RhoGAP_ARHGAP19: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of ArhGAP19-like proteins. The function of ArhGAP19 is unknown. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude.	208
239858	cd04393	RhoGAP_FAM13A1a	RhoGAP_FAM13A1a: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of FAM13A1, isoform a-like proteins. The function of FAM13A1a is unknown. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude.	189
239859	cd04394	RhoGAP-ARHGAP11A	RhoGAP-ARHGAP11A: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of ArhGAP11A-like proteins. The mouse homolog of human ArhGAP11A has been detected as a gene exclusively expressed in immature ganglion cells, potentially playing a role in retinal development. The exact function of ArhGAP11A is unknown. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude.	202
239860	cd04395	RhoGAP_ARHGAP21	RhoGAP_ARHGAP21: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of ArhGAP21-like proteins. ArhGAP21 is a multi-domain protein, containing RhoGAP, PH and PDZ domains, and is believed to play a role in the organization of the cell-cell junction complex. It has been shown to function as a GAP of Cdc42 and RhoA, and to interact with alpha-catenin and Arf6. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude.	196
239861	cd04396	RhoGAP_fSAC7_BAG7	RhoGAP_fSAC7_BAG7: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of fungal SAC7 and BAG7-like proteins. Both proteins are GTPase activating proteins of Rho1, but differ functionally in vivo: SAC7, but not BAG7, is involved in the control of Rho1-mediated activation of the PKC-MPK1 pathway. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude.	225
239862	cd04397	RhoGAP_fLRG1	RhoGAP_fLRG1: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of fungal LRG1-like proteins. Yeast Lrg1p is required for efficient cell fusion, and mother-daughter cell separation, possibly through acting as a RhoGAP specifically regulating 1,3-beta-glucan synthesis. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude.	213
239863	cd04398	RhoGAP_fRGD1	RhoGAP_fRGD1: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of fungal RGD1-like proteins. Yeast Rgd1 is a GAP protein for Rho3 and Rho4 and plays a role in low-pH response. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude.	192
239864	cd04399	RhoGAP_fRGD2	RhoGAP_fRGD2: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of fungal RGD2-like proteins. Yeast Rgd2 is a GAP protein for Cdc42 and Rho5. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude.	212
239865	cd04400	RhoGAP_fBEM3	RhoGAP_fBEM3: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of fungal BEM3-like proteins. Bem3 is a GAP protein of Cdc42, and is specifically involved in the control of the initial assembly of the septin ring in yeast bud formation. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude.	190
239866	cd04401	RhoGAP_fMSB1	RhoGAP_fMSB1: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of fungal MSB1-like proteins. Msb1 was originally identified as a multicopy suppressor of a temperature-sensitive cdc42 mutation. Msb1 is a positive regulator of the Pkc1p-MAPK pathway and 1,3-beta-glucan synthesis; both pathways involve Rho1 regulation. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude.	198
239867	cd04402	RhoGAP_ARHGAP20	RhoGAP_ARHGAP20: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of ArhGAP20-like proteins. ArhGAP20, also known as KIAA1391 and RA-RhoGAP, contains a RhoGAP, a RA, and a PH domain, and ANXL repeats. ArhGAP20 is activated by Rap1 and induces inactivation of Rho, which in turn leads to neurite outgrowth. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude.	192
239868	cd04403	RhoGAP_ARHGAP27_15_12_9	RhoGAP_ARHGAP27_15_12_9: GTPase-activator protein (GAP) domain for Rho-like GTPases found in ARHGAP27 (also called CAMGAP1), ARHGAP15, 12 and 9-like proteins. The members of this subgroup of ARHGAPs are multidomain proteins that contain RhoGAP, PH, SH3 and WW domains. Most members that have been studied show GAP activity towards Rac1; some additionally show activity towards Cdc42. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude.	187
239869	cd04404	RhoGAP-p50rhoGAP	RhoGAP-p50rhoGAP: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of p50RhoGAP-like proteins; p50RhoGAP, also known as RhoGAP-1, contains a C-terminal RhoGAP domain and an N-terminal Sec14 domain which binds phosphatidylinositol 3,4,5-trisphosphate (PtdIns(3,4,5)P3). It is ubiquitously expressed and preferentially active on Cdc42. This subgroup also contains closely related ARHGAP8. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude.	195
239870	cd04405	RhoGAP_BRCC3-like	RhoGAP_BRCC3-like: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of BRCC3-like proteins. This subgroup also contains two groups of closely related proteins, BRCC3 and DEPDC7, which both contain a C-terminal RhoGAP-like domain and an N-terminal DEP (Disheveled, Egl-10, and Pleckstrin) domain. The function(s) of  BRCC3 and DEPDC7 are unknown. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude.	235
239871	cd04406	RhoGAP_myosin_IXA	RhoGAP_myosin_IXA: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain present in myosins IXA. Class IX myosins contain a characteristic head domain, a neck domain and a tail domain which contains a C6H2-zinc binding motif and a Rho-GAP domain. Class IX myosins are single-headed, processive myosins that are partly cytoplasmic, and partly associated with membranes and the actin cytoskeleton. Class IX myosins are implicated in the regulation of neuronal morphogenesis and function of sensory systems, like the inner ear. There are two major isoforms, myosin IXA and IXB with several splice variants, which are both expressed in developing neurons. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude.	186
239872	cd04407	RhoGAP_myosin_IXB	RhoGAP_myosin_IXB: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain present in myosins IXB. Class IX myosins contain a characteristic head domain, a neck domain and a tail domain which contains a C6H2-zinc binding motif and a Rho-GAP domain. Class IX myosins are single-headed, processive myosins that are partly cytoplasmic, and partly associated with membranes and the actin cytoskeleton. Class IX myosins are implicated in the regulation of neuronal morphogenesis and function of sensory systems, like the inner ear. There are two major isoforms, myosin IXA and IXB with several splice variants, which are both expressed in developing neurons. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude.	186
239873	cd04408	RhoGAP_GMIP	RhoGAP_GMIP: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of GMIP (Gem interacting protein). GMIP plays important roles in neurite growth and axonal guidance, and interacts with Gem, a member of the RGK subfamily of the Ras small GTPase superfamily, through the N-terminal half of the protein. GMIP contains a C-terminal RhoGAP domain. GMIP inhibits RhoA function, but is inactive towards Rac1 and Cdc42. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude.	200
239874	cd04409	RhoGAP_PARG1	RhoGAP_PARG1: RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain of PARG1 (PTPL1-associated RhoGAP1). PARG1 was originally cloned as an interaction partner of PTPL1, an intracellular protein-tyrosine phosphatase. PARG1 interacts with Rap2, also a member of the Ras small GTPase superfamily whose exact function is unknown, and shows strong preference for Rho. Small GTPases cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when GDP-bound. The Rho family of GTPases activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. GTPases generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude.	211
319870	cd04410	DMSOR_beta-like	Beta subunit of the DMSO Reductase (DMSOR) family. This family consists of the small beta iron-sulfur (FeS) subunit of the DMSO Reductase (DMSOR) family. Members of this family also contain a large, periplasmic molybdenum-containing alpha subunit and may have a small gamma subunit as well.  Examples of heterodimeric members with alpha and beta subunits include arsenite oxidase, and tungsten-containing formate dehydrogenase (FDH-T) while   heterotrimeric members containing alpha, beta, and gamma subunits include formate dehydrogenase-N (FDH-N), and nitrate reductase (NarGHI).  The beta subunit contains four Fe4/S4 and/or Fe3/S4 clusters which transfer the electrons from the alpha subunit to a hydrophobic integral membrane protein, presumably a cytochrome containing two b-type heme groups. The reducing equivalents are then transferred to menaquinone, which finally reduces the electron-accepting enzyme system.	136
100108	cd04411	Ribosomal_P1_P2_L12p	Ribosomal protein P1, P2, and L12p. Ribosomal proteins P1 and P2 are the eukaryotic proteins that are functionally equivalent to bacterial L7/L12. L12p is the archaeal homolog. Unlike other ribosomal proteins, the archaeal L12p and eukaryotic P1 and P2 do not share sequence similarity with their bacterial counterparts. They are part of the ribosomal stalk (called the L7/L12 stalk in bacteria), along with 28S rRNA and the proteins L11 and P0 in eukaryotes (23S rRNA, L11, and L10e in archaea). In bacterial ribosomes, L7/L12 homodimers bind the extended C-terminal helix of L10 to anchor the L7/L12 molecules to the ribosome. Eukaryotic P1/P2 heterodimers and archaeal L12p homodimers are believed to bind the L10 equivalent proteins, eukaryotic P0 and archaeal L10e, in a similar fashion. P1 and P2 (L12p, L7/L12) are the only proteins in the ribosome to occur as multimers, always appearing as sets of dimers. Recent data indicate that most archaeal species contain six copies of L12p (three homodimers), while eukaryotes have two copies each of P1 and P2 (two heterodimers). Bacteria may have four or six copies (two or three homodimers), depending on the species. As in bacteria, the stalk is crucial for binding of initiation, elongation, and release factors in eukaryotes and archaea.	105
239875	cd04412	NDPk7B	Nucleoside diphosphate kinase 7 domain B (NDPk7B): The nm23-H7 class of nucleoside diphosphate kinase (NDPk7) consists of an N-terminal DM10 domain and two functional catalytic NDPk modules, NDPk7A and NDPk7B. The function of the DM10 domain, which also occurs in multiple copies in other proteins, is unknown. NDPk7 is predominantly expressed in testes, although appreciable amounts are also found in liver, heart, brain, ovary, small intestine and spleen. The nm23-H7 gene is located in or near the hereditary prostate cancer susceptibility locus. Nm23-H7 may be involved in the development of colon and gastric carcinoma, the latter possibly in a type-specific manner.	134
239876	cd04413	NDPk_I	Nucleoside diphosphate kinase Group I (NDPk_I)-like: NDP kinase domains are present in a large family of structurally and functionally conserved proteins from bacteria to humans that generally catalyze the transfer of gamma-phosphates of a nucleoside triphosphate (NTP) donor onto a nucleoside diphosphate (NDP) acceptor through a phosphohistidine intermediate. The mammalian nm23/NDP kinase gene family can be divided into two distinct groups. The group I genes encode proteins that generally have highly homologous counterparts in other organisms and possess the classic enzymatic activity of a kinase. This group includes vertebrate NDP kinases A-D (Nm23-H1 to -H4), and their counterparts in bacteria, archaea and other eukaryotes. NDP kinases exist in two different quaternary structures; all known eukaryotic enzymes are hexamers, while some bacterial enzymes are tetramers, as in Myxococcus. They possess the NDP kinase active site motif (NXXH[G/A]SD) and the nine residues that are most essential for catalysis.	130
239877	cd04414	NDPk6	Nucleoside diphosphate kinase 6 (NDP kinase 6, NDPk6, NM23-H6; NME6; Inhibitor of p53-induced apoptosis-alpha, IPIA-alpha): The nm23-H6 gene encoding NDPk6 is expressed mainly in mitochondria, but also found at a lower level in most tissues. NDPk6 has all nine residues considered crucial for enzyme structure and activity, and has been found to have NDP kinase activity. It may play a role in cell growth and cell cycle progression. The nm23-H6 gene locus has been implicated in a variety of malignant tumors.	135
239878	cd04415	NDPk7A	Nucleoside diphosphate kinase 7 domain A (NDPk7A): The nm23-H7 class of nucleoside diphosphate kinase (NDPk7) consists of an N-terminal DM10 domain and two functional catalytic NDPk modules, NDPk7A and NDPk7B. The function of the DM10 domain, which also occurs in multiple copies in other proteins, is unknown. NDPk7 is predominantly expressed in testes, although appreciable amounts are also found in liver, heart, brain, ovary, small intestine and spleen. The nm23-H7 gene is located in or near the hereditary prostate cancer susceptibility locus. Nm23-H7 may be involved in the development of colon and gastric carcinoma, the latter possibly in a type-specific manner.	131
239879	cd04416	NDPk_TX	NDP kinase domain of thioredoxin domain-containing proteins (TXNDC3 and TXNDC6): Txl-2 (TXNDC6) and Sptrx-2 (TXNDC3) are fusion proteins of Group II N-terminal thioredoxin domains followed by one or three NDP kinase domains, respectively. Sptrx-2, which has a tissue-specific distribution in human testis, has been considered a member of the nm23 family (nm23-H8) and exhibits high homology with sea urchin IC1 (intermediate chain-1) protein, a component of the sperm axonemal outer dynein arm complex. Txl-2 is mainly found in close association with microtubules within tissues with cilia and flagella such as seminiferous epithelium (spermatids) and lung airway epithelium, suggesting a possible role in control of microtubule stability and maintenance.	132
239880	cd04418	NDPk5	Nucleoside diphosphate kinase homolog 5 (NDP kinase homolog 5, NDPk5, NM23-H5; Inhibitor of p53-induced apoptosis-beta, IPIA-beta): In human, mRNA for NDPk5 is almost exclusively found in testis, especially in the flagella of spermatids and spermatozoa, in association with axoneme microtubules, and may play a role in spermatogenesis by increasing the ability of late-stage spermatids to eliminate reactive oxygen species.  It belongs to the nm23 Group II genes and appears to differ from the other human NDPks in that it lacks two important catalytic site residues, and thus does not appear to possess NDP kinase activity. NDPk5 confers protection from cell death by Bax and alters the cellular levels of several antioxidant enzymes, including glutathione peroxidase 5 (Gpx5).	132
341228	cd04433	AFD_class_I	Adenylate forming domain, Class I, also known as the ANL superfamily. This family is known as the ANL (acyl-CoA synthetases, the NRPS adenylation domains, and the Luciferase enzymes) superfamily. It includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain.	336
271198	cd04434	LanC_like	Cyclases involved in the biosynthesis of lantibiotics, and similar proteins. LanC is the cyclase enzyme of the lanthionine synthetase. Lanthionine is a lantibiotic, a unique class of peptide antibiotics. They are ribosomally synthesized as a precursor peptide and then post-translationally modified to contain thioether cross-links called lanthionines (Lans) or methyllanthionines (MeLans), in addition to  2,3-didehydroalanine (Dha) and (Z)-2,3-didehydrobutyrine (Dhb). These unusual amino acids are introduced by the dehydration of serine and threonine residues, followed by thioether formation via addition of cysteine thiols, catalysed by LanB and LanC or LanM. LanC, the cyclase component, is a zinc metalloprotein, whose bound metal has been proposed to activate the thiol substrate for nucleophilic addition. A related domain is also present in LanM and other pro- and eukaryotic proteins with poorly characterized functions.	351
239882	cd04435	DEP_fRom2	DEP (Dishevelled, Egl-10, and Pleckstrin) domain found in fungal RhoGEF (GDP/GTP exchange factor) Rom2-like proteins. Rom2-like proteins share a common domain architecture, containing, beside the RhoGEF domain, a DEP, a PH (pleckstrin homology) and a CNH domain. Rom2, a yeast GEF for Rho1 and Rho2, is involved in mediating stress response via the Ras-cAMP pathway and also plays a role in mediating resistance to sphingolipid disturbances.	82
239883	cd04436	DEP_fRgd2	DEP (Dishevelled, Egl-10, and Pleckstrin) domain found in fungal RhoGAP (GTPase-activator protein) Rgd2-like proteins. Rgd2-like proteins share a common domain architecture, containing, beside the RhoGAP domain, a DEP and a FCH (Fes/CIP4 homology) domain. Yeast Rgd2 is a GAP protein for Cdc42 and Rho5.	84
239884	cd04437	DEP_Epac	DEP (Dishevelled, Egl-10, and Pleckstrin) domain found in Epac-like proteins. Epac (exchange proteins directly activated by cAMP) proteins are GEFs (guanine-nucleotide-exchange factors) for the small GTPases, Rap1 and Rap2. They are directly regulated by cyclic AMP, a second messenger that plays a role in the control of diverse cellular processes, such as cell adhesion and insulin secretion.  Epac-like proteins share a common domain architecture, containing RasGEF, DEP and CAP-effector (cAMP binding) domains. The DEP domain is involved in membrane localization.	125
239885	cd04438	DEP_dishevelled	DEP (Dishevelled, Egl-10, and Pleckstrin) domain found in dishevelled-like proteins.  Dishevelled-like proteins play a key role in the transduction of the Wnt signal from the cell surface to the nucleus, which in turn is an important regulatory pathway for cellular development and growth. They contain an N-terminal DIX domain, a central PDZ domain, and a C-terminal DEP domain.	84
239886	cd04439	DEP_1_P-Rex	DEP (Dishevelled, Egl-10, and Pleckstrin) domain 1 found in P-Rex-like proteins. The P-Rex family is the guanine-nucleotide exchange factor (GEF) for the small GTPase Rac that contains an N-terminal RhoGEF domain, two DEP and PDZ domains. Rac-GEF activity is stimulated by phosphatidylinositol (3,4,5)-trisphosphate (PtdIns(3,4,5)P3), a lipid second messenger, and by the G beta-gamma subunits of heterotrimeric G proteins. The DEP domains are not involved in mediating these stimuli, but may be of importance for basal and stimulated levels of Rac-GEF activity.	81
239887	cd04440	DEP_2_P-Rex	DEP (Dishevelled, Egl-10, and Pleckstrin) domain 2 found in P-Rex-like proteins. The P-Rex family is the guanine-nucleotide exchange factor (GEF) for the small GTPase Rac that contains an N-terminal RhoGEF domain, two DEP and PDZ domains. Rac-GEF activity is stimulated by phosphatidylinositol (3,4,5)-trisphosphate (PtdIns(3,4,5)P3), a lipid second messenger, and by the G beta-gamma subunits of heterotrimeric G proteins. The DEP domains are not involved in mediating these stimuli, but may be of importance for basal and stimulated levels of Rac-GEF activity.	93
239888	cd04441	DEP_2_DEP6	DEP (Dishevelled, Egl-10, and Pleckstrin) domain 2 found in DEP6-like proteins. DEP6 proteins contain two DEP and a PDZ domain. Their function is unknown.	85
239889	cd04442	DEP_1_DEP6	DEP (Dishevelled, Egl-10, and Pleckstrin) domain 1 found in DEP6-like proteins. DEP6 proteins contain two DEP and a PDZ domain. Their function is unknown.	82
239890	cd04443	DEP_GPR155	DEP (Dishevelled, Egl-10, and Pleckstrin) domain found in GPR155-like proteins. GPR155-like proteins, also known as PGR22, contain an N-terminal permease domain, a central transmembrane region and a C-terminal DEP domain. They are orphan receptors of the class B G protein-coupled receptors. Their function is unknown.	83
239891	cd04444	DEP_PLEK2	DEP (Dishevelled, Egl-10, and Pleckstrin) domain found in pleckstrin 2-like proteins.  Pleckstrin 2 is found in a wide variety of cell types, which suggests a more general role in signaling than pleckstrin 1.  Pleckstrin-like proteins contain a central DEP domain, flanked by 2 PH (pleckstrin homology) domains.	109
239892	cd04445	DEP_PLEK1	DEP (Dishevelled, Egl-10, and Pleckstrin) domain found in pleckstrin 1-like proteins.  Pleckstrin 1 plays a role in cell spreading and reorganization of actin cytoskeleton in platelets and leukocytes. Its activity is highly regulated by phosphorylation, mainly by protein kinase C. Pleckstrin-like proteins contain a central DEP domain, flanked by 2 PH (pleckstrin homology) domains.	99
239893	cd04446	DEP_DEPDC4	DEP (Dishevelled, Egl-10, and Pleckstrin) domain found in DEPDC4-like proteins. DEPDC4 is a DEP domain containing protein of unknown function.	95
239894	cd04447	DEP_BRCC3	DEP (Dishevelled, Egl-10, and Pleckstrin) domain found in BRCC3-like proteins. BRCC3, also known as DEPDC1B, is a DEP domain containing protein of unknown function.	92
239895	cd04448	DEP_PIKfyve	DEP (Dishevelled, Egl-10, and Pleckstrin) domain found in fungal RhoGEF (GDP/GTP exchange factor) PIKfyve-like proteins. PIKfyve contains N-terminal Fyve finger and DEP domains, a central chaperonin-like domain and a C-terminal PIPK (phosphatidylinositol phosphate kinase) domain. PIKfyve-like proteins are important phosphatidylinositol (3)-monophosphate (PtdIns(3)P)-5-kinases, producing PtdIns(3,5)P2, which plays a major role in multivesicular body (MVB) sorting and control of retrograde traffic from the vacuole back to the endosome and/or Golgi. PIKfyve itself has been shown to play a role in regulating early-endosome-to-trans-Golgi network (TGN) retrograde trafficking.	81
239896	cd04449	DEP_DEPDC5-like	DEP (Dishevelled, Egl-10, and Pleckstrin) domain found in DEPDC5-like proteins. DEPDC5, in human also known as KIAA0645, is a DEP domain containing protein of unknown function.	83
239897	cd04450	DEP_RGS7-like	DEP (Dishevelled, Egl-10, and Pleckstrin) domain found in RGS (regulator of G-protein signaling) proteins of the subfamily R7. This subgroup contains RGS7, RGS6, RGS9 and RGS11. They share a common domain architecture, containing, beside the RGS domain, a DEP domain and a GGL (G-protein gamma subunit-like ) domain. RGS proteins are GTPase-activating (GAP) proteins of heterotrimeric G proteins by increasing the rate of GTP hydrolysis of the alpha subunit. The fungal homologs, like yeast Sst2, share a related common domain architecture, containing RGS and DEP domains. Sst2 has been identified as the principal regulator of mating pheromone signaling and recently the DEP domain of Sst2 has been shown to be necessary and sufficient to mediate receptor interaction.	88
239898	cd04451	S1_IF1	S1_IF1: Translation Initiation Factor IF1, S1-like RNA-binding domain. IF1 contains an S1-like RNA-binding domain, which is found in a wide variety of RNA-associated proteins. Translation initiation includes a number of interrelated steps preceding the formation of the first peptide bond. In Escherichia coli, the initiation mechanism requires, in addition to mRNA, fMet-tRNA, and ribosomal subunits,  the presence of three additional proteins (initiation factors IF1, IF2, and IF3) and at least one GTP molecule. The three initiation factors influence both the kinetics and the stability of ternary complex formation. IF1 is the smallest of the three factors. IF1 enhances the rate of 70S ribosome subunit association and dissociation and the interaction of 30S ribosomal subunit with IF2 and IF3. It stimulates 30S complex formation. In addition, by binding to the A-site of the 30S ribosomal subunit, IF1 may contribute to the fidelity of the selection of the initiation site of the mRNA.	64
239899	cd04452	S1_IF2_alpha	S1_IF2_alpha: The alpha subunit of translation Initiation Factor 2, S1-like RNA-binding domain. S1-like RNA-binding domains are found in a wide variety of RNA-associated proteins. Eukaryotic and archaeal Initiation Factor 2 (e- and aIF2, respectively) are heterotrimeric proteins with three subunits (alpha, beta, and gamma). IF2 plays a crucial role in the process of translation initiation. The IF2 gamma subunit contains a GTP-binding site. The IF2 beta and gamma subunits together are thought to be responsible for binding methionyl-initiator tRNA. The ternary complex consisting of IF2, GTP, and the methionyl-initiator tRNA binds to the small subunit of the ribosome, as part of a pre-initiation complex that scans the mRNA to find the AUG start codon. The IF2-bound GTP is hydrolyzed to GDP when the methionyl-initiator tRNA binds the AUG start codon, at which time the IF2 is released with its bound GDP. The large ribosomal subunit then joins with the small subunit to complete the initiation complex, which is competent to begin translation. The IF2a subunit is a major site of control of the translation initiation process, via phosphorylation of a specific serine residue. This alpha subunit is well conserved in eukaryotes and archaea but is not present in bacteria. IF2 is a cold-shock-inducible protein.	76
239900	cd04453	S1_RNase_E	S1_RNase_E: RNase E and RNase G, S1-like RNA-binding domain. RNase E is an essential endoribonuclease in the processing and degradation of RNA. In addition to its role in mRNA degradation, RNase E has also been implicated in the processing of rRNA, and the maturation of tRNA, 10Sa RNA and the M1 precursor of RNase P. RNase E associates with PNPase (3' to 5' exonuclease), Rhl B (DEAD-box RNA helicase) and enolase (glycolytic enzyme)  to form the RNA degradosome. RNase E tends to cut mRNA within single-stranded regions that are rich in A/U nucleotides. The N-terminal region of RNase E contains the catalytic site. Within the conserved N-terminal domain of RNAse E and RNase G, there is an S1-like subdomain, which is an ancient single-stranded RNA-binding domain. S1 domain is an RNA-binding module originally identified in the ribosomal protein S1. The S1 domain is required for RNA cleavage by RNase E. RNase G is paralogous to RNase E with an N-terminal catalytic domain that is highly homologous to that of RNase E. RNase G not only shares sequence similarity with RNase E, but also functionally overlaps with RNase E. In Escherichia coli, RNase G is involved in the maturation of the 5' end of the 16S rRNA. RNase G plays a secondary role in mRNA decay.	88
239901	cd04454	S1_Rrp4_like	S1_Rrp4_like: Rrp4-like, S1-like RNA-binding domain. S1-like RNA-binding domains are found in a wide variety of RNA-associated proteins. Rrp4 protein, and Rrp40 and Csl4 proteins, also represented in this group, are subunits of the exosome complex. The exosome plays a central role in 3' to 5' RNA processing and degradation in eukaryotes and archaea. Its functions include the removal of incorrectly processed RNA and the maintenance of proper levels of mRNA, rRNA and a number of small RNA species. In Saccharomyces cerevisiae, the exosome includes nine core components, six of which are homologous to bacterial RNase PH. These form a hexameric ring structure. The other three subunits (Rrp4, Rrp40, and Csl4) contain an S1 RNA binding domain and are part of the "S1 pore structure".	82
239902	cd04455	S1_NusA	S1_NusA: N-utilizing substance A protein (NusA), S1-like RNA-binding domain. S1-like RNA-binding domains are found in a wide variety of RNA-associated proteins. NusA is a transcription elongation factor containing an N-terminal catalytic domain and three RNA binding domains (RBD's). The RBD's include one S1 domain and two KH domains that form an RNA binding surface. DNA transcription by RNA polymerase (RNAP) includes three phases - initiation, elongation, and termination. During initiation, sigma factors bind RNAP and target RNAP to specific promoters. During elongation, N-utilization substances (NusA, B, E, and G) replace sigma factors and regulate pausing, termination, and antitermination. NusA is cold-shock-inducible.	67
239903	cd04456	S1_IF1A_like	S1_IF1A_like: Translation initiation factor IF1A-like, S1-like RNA-binding domain. IF1A is also referred to as eIF1A in eukaryotes and aIF1A in archaea. S1-like RNA-binding domains are found in a wide variety of RNA-associated proteins. IF1A is essential for translation initiation. eIF1A acts synergistically with eIF1 to mediate assembly of ribosomal initiation complexes at the initiation codon and maintain the accuracy of this process by recognizing and destabilizing aberrant preinitiation complexes from the mRNA. Without eIF1A and eIF1, 43S ribosomal preinitiation complexes can bind to the cap-proximal region, but are unable to reach the initiation codon. eIF1a also enhances the formation of 5'-terminal complexes in the presence of other translation initiation factors. This protein family is only found in eukaryotes and archaea.	78
239904	cd04457	S1_S28E	S1_S28E: S28E, S1-like RNA-binding domain. S1-like RNA-binding domains are found in a wide variety of RNA-associated proteins. S28E protein is a component of the 30S ribosomal subunit. S28E is highly conserved among archaea and eukaryotes. S28E may control precursor RNA splicing and turnover in mRNA maturation process but its function in the ribosome is largely unknown. The structure contains an OB-fold found in many oligosaccharide and nucleic acid binding proteins. This implies that S28E might be involved in protein synthesis.	60
239905	cd04458	CSP_CDS	Cold-Shock Protein (CSP) contains an S1-like cold-shock domain (CSD) that is found in eukaryotes, prokaryotes, and archaea.  CSP's include the major cold-shock proteins CspA and CspB in bacteria and the eukaryotic gene regulatory factor Y-box protein. CSP expression is up-regulated by an abrupt drop in growth temperature. CSP's are also expressed under normal conditions at a lower level. The function of cold-shock proteins is not fully understood. They preferentially bind poly-pyrimidine regions of single-stranded RNA and DNA.  CSP's are thought to bind mRNA and regulate ribosomal translation, mRNA degradation, and  the rate of transcription termination. The human Y-box protein, which contains a CSD, regulates transcription and translation of genes that contain the Y-box sequence in their promoters. The specific ssDNA-binding properties of the CSD are required for the binding of Y-box protein to the promoter's Y-box sequence, thereby regulating transcription.	65
239906	cd04459	Rho_CSD	Rho_CSD: Rho protein cold-shock domain (CSD). Rho protein is a transcription termination factor in most bacteria. In bacteria, there are two distinct mechanisms for mRNA transcription termination. In intrinsic termination, RNA polymerase and nascent mRNA are released from DNA template by an mRNA stem loop structure, which resembles the transcription termination mechanism used by eukaryotic pol III. The second mechanism is mediated by Rho factor. Rho factor terminates transcription by using energy from ATP hydrolysis to forcibly dissociate the transcripts from RNA polymerase. Rho protein contains an N-terminal S1-like domain, which binds single-stranded RNA. Rho has a C-terminal ATPase domain which hydrolyzes ATP to provide energy to strip RNA polymerase and mRNA from the DNA template. Rho functions as a homohexamer.	68
239907	cd04460	S1_RpoE	S1_RpoE: RpoE, S1-like RNA-binding domain. S1-like RNA-binding domains are found in a wide variety of RNA-associated proteins. RpoE is subunit E of archaeal RNA polymerase. Archaeal cells contain a single RNA polymerase made up of 12 subunits, which are homologous to the 12 subunits (RPB1-12) of eukaryotic RNA polymerase II. RpoE is homologous to Rpa43 of eukaryotic RNA polymerase I, RPB7 of eukaryotic RNA polymerase II, and Rpc25 of eukaryotic RNA polymerase III. RpoE is composed of two domains, the N-terminal RNP (ribonucleoprotein) domain and the C-terminal S1 domain. This S1 domain binds ssRNA and ssDNA. This family is classified based on the C-terminal S1 domain. The function of RpoE is not fully understood. In eukaryotes, RPB7 and RPB4 form a heterodimer that reversibly associates with the RNA polymerase II core.	99
239908	cd04461	S1_Rrp5_repeat_hs8_sc7	S1_Rrp5_repeat_hs8_sc7: Rrp5 Homo sapiens S1 repeat 8 (hs8) and Saccharomyces cerevisiae S1 repeat 7 (sc7)-like domains. Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits.  Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in S. cerevisiae Rrp5 and 14 S1 repeats in H. sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions. Mutational studies have shown that each region represents a specific functional domain. Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes H. sapiens S1 repeat 8 and S. cerevisiae S1 repeat 7. Rrp5 is found in eukaryotes but not in prokaryotes or archaea.	83
239909	cd04462	S1_RNAPII_Rpb7	S1_RNAPII_Rpb7: Eukaryotic RNA polymerase II (RNAPII) Rpb7 subunit C-terminal S1 domain. RNAPII is composed of 12 subunits (Rpb1-12). Rpb4 and Rpb7 form a heterodimer that associate with the RNAPII core. Rpb7 is a homolog of the Rpc25 of RNA polymerase III, RpoE of the archaeal RNA polymerase, and Rpa43 of eukaryotic RNA polymerase I. Rpb7 has two domains, an N-terminal ribonucleoprotein (RNP) domain and a C-terminal S1 domain, both of which bind single-stranded RNA. It is possible that the S1 domain interacts with the nascent RNA transcript, assisted by the RNP domain. In yeast, Rpb4/Rpb7 is necessary for promoter-directed transcription initiation. They also play a role in regulating transcription-coupled repair in the Rad26-dependent pathway, in efficient mRNA export, and in transcription termination.	88
239910	cd04463	S1_EF_like	S1_EF_like: EF-like, S1-like RNA-binding domain. The EF-like superfamily contains the bacterial translation elongation factor P and its archaeal and eukaryotic homologs, aIF5A and eIF5A. All proteins in this superfamily contain an S1 domain, which binds RNA or single-stranded DNA and often interacts with the ribosome. Hex-1, the S1-like domain of which is also found in this group, is structurally homologous to eIF5A and might have evolved from an ancestral eIF5A through gene duplication.	55
239911	cd04465	S1_RPS1_repeat_ec2_hs2	S1_RPS1_repeat_ec2_hs2: Ribosomal protein S1 (RPS1) domain. RPS1 is a component of the small ribosomal subunit thought to be involved in the recognition and binding of mRNAs during translation initiation. The bacterial RPS1 domain architecture consists of 4-6 tandem S1 domains. In some bacteria, the tandem S1 array is located C-terminal to a 4-hydroxy-3-methylbut-2-enyl diphosphate reductase (HMBPP reductase) domain. While RPS1 is found primarily in bacteria, proteins with tandem RPS1-like domains have been identified in plants and humans; however, these lack the N-terminal HMBPP reductase domain. This CD includes S1 repeat 2 of the Escherichia coli and Homo sapiens RPS1 (ec2 and hs2, respectively). Autoantibodies to double-stranded DNA from patients with systemic lupus erythematosus cross-react with the human RPS1 homolog.	67
239912	cd04466	S1_YloQ_GTPase	S1_YloQ_GTPase: YloQ GTPase family (also known as YjeQ and CpgA), S1-like RNA-binding domain. Proteins in the YloQ GTPase family bind the ribosome and have GTPase activity. The precise role of this family is unknown. The protein structure is composed of three domains: an N-terminal S1 domain, a central GTPase domain, and a C-terminal zinc finger domain. This N-terminal S1 domain binds ssRNA. The central GTPase domain contains nucleotide-binding signature motifs: G1 (Walker A), G3 (Walker B) and G4 motifs. Experiments show that the bacterial YloQ and YjeQ proteins have low intrinsic GTPase activity. The C-terminal zinc-finger domain has structural similarity to a portion of the DNA-repair protein Rad51. This suggests a possible role for this GTPase as a regulator of translation, perhaps as a translation initiation factor. This family is classified based on the N-terminal S1 domain.	68
239913	cd04467	S1_aIF5A	S1_aIF5A: Archaeal translation Initiation Factor 5A (aIF5A), S1-like RNA-binding domain. aIF5A is a homolog of eukaryotic eIF5A. IF5A is the only protein known to have the unusual amino acid hypusine. Hypusine is a post-translationally modified lysine and is essential for IF5A function. In yeast, eIF5A interacts with components of the 80S ribosome and translation elongation factor 2 (eEF2) in a hypusine-dependent manner. This C-terminal S1 domain resembles the cold-shock domain which binds RNA. Moreover, IF5A prefers binding to the actively translating ribosome. This evidence suggests that IF5A plays a role in translation elongation instead of translation initiation as previously proposed.	57
239914	cd04468	S1_eIF5A	S1_eIF5A: Eukaryotic translation Initiation Factor 5A (eIF5A), S1-like RNA-binding domain. eIF5A is an evolutionarily conserved protein found in eukaryotes. eIF5A is the only protein known to have the unusual amino acid hypusine. Hypusine is essential for eIF5A function and is a post-translationally modified lysine. eIF5A interacts with components of the 80S ribosome and translation elongation factor 2 (eEF2) in a hypusine-dependent manner. This C-terminal S1 domain resembles the oligonucleotide-binding fold (OB fold) which binds RNA. Moreover, eIF5A prefers binding to the actively translating ribosome. This evidence suggests that eIF5A plays a role in translation elongation instead of translation initiation as previously proposed.	69
239915	cd04469	S1_Hex1	S1_Hex1: Hex1, S1-like RNA-binding domain. Hex1 protein is the major component of the Woronin body in filamentous fungi. The Woronin body is a dense vesicle and plays a vital role in filamentous fungi cell integrity. When cell damage occurs, Woronin bodies seal the septal pore to prevent further cytoplasmic bleeding. Hex1 protein self-assembles to form the solid core of the Woronin body vesicle. The Hex1 sequence and structure are similar to eukaryotic initiation factor 5A (eIF5A), suggesting they share a common ancestor during evolution. All members of the EF superfamily to which Hex1 belongs, contain an S1 domain, which has been shown to bind RNA or single-stranded DNA and often interacts with the ribosome.	75
239916	cd04470	S1_EF-P_repeat_1	S1_EF-P_repeat_1: Translation elongation factor P (EF-P), S1-like RNA-binding domain, repeat 1. EF-P stimulates the peptidyltransferase activity in the prokaryotic 70S ribosome. EF-P enhances the synthesis of certain dipeptides with N-formylmethionyl-tRNA and puromycin in vitro. EF-P binds to both the 30S and 50S ribosomal subunits. EF-P binds near the streptomycin binding site of the 16S rRNA in the 30S subunit. EF-P interacts with domains 2 and 5 of the 23S rRNA. The L16 ribosomal protein of the 50S subunit or its N-terminal fragment is required for EF-P mediated peptide bond synthesis, whereas L11, L15, and L7/L12 are not required in this reaction, suggesting that EF-P may function at a different ribosomal site than most other translation factors. EF-P is essential for cell viability and is required for protein synthesis. EF-P is mainly present in bacteria. The EF-P homologs in archaea and eukaryotes are the initiation factors aIF5A and eIF5A, respectively. EF-P has 3 domains (domains I, II, and III). Domains II and III are S1-like domains. This CD includes domain II (the first S1 domain of EF-P). Domains II and III have structural homology to the eIF5A domain C, suggesting that domains II and III evolved by duplication.	61
239917	cd04471	S1_RNase_R	S1_RNase_R: RNase R C-terminal S1 domain. RNase R is a processive 3' to 5' exoribonuclease, which is a homolog of RNase II. RNase R degrades RNA with secondary structure having a 3' overhang of at least 7 nucleotides. RNase R and PNPase play an important role in the degradation of RNA with extensive secondary structure, such as rRNA, tRNA, and certain mRNA which contains repetitive extragenic palindromic sequences. The C-terminal S1 domain binds ssRNA.	83
239918	cd04472	S1_PNPase	S1_PNPase: Polynucleotide phosphorylase (PNPase), S1-like RNA-binding domain. PNPase is a polyribonucleotide nucleotidyl transferase that degrades mRNA. It is a trimeric multidomain protein. The C-terminus contains the S1 domain which binds ssRNA. This family is classified based on the S1 domain. PNPase nonspecifically removes the 3' nucleotides from mRNA, but is stalled by double-stranded RNA structures such as a stem-loop. Evidence shows that a minimum of 7-10 unpaired nucleotides at the 3' end is required for PNPase degradation. It is suggested that PNPase also dephosphorylates the RNA 5' end. This additional activity may regulate the 5'-dependent activity of RNase E in vivo.	68
239919	cd04473	S1_RecJ_like	S1_RecJ_like: The S1 domain of the archaea-specific RecJ-like exonuclease. The function of this family is not fully understood. In Escherichia coli, RecJ degrades single-stranded DNA in the 5'-3' direction and participates in homologous recombination and mismatch repair.	77
239920	cd04474	RPA1_DBD_A	RPA1_DBD_A: A subfamily of OB folds corresponding to the second OB fold, the ssDNA-binding domain (DBD)-A, of human RPA1 (also called RPA70). RPA1 is the large subunit of Replication protein A (RPA). RPA is a nuclear ssDNA-binding protein (SSB) which appears to be involved in all aspects of DNA metabolism including replication, recombination, and repair. RPA also mediates specific interactions of various nuclear proteins. In animals, plants, and fungi, RPA is a heterotrimer with subunits of 70KDa (RPA1), 32kDa (RPA2), and 14 KDa (RPA3). In addition to DBD-A, RPA1 contains three other OB folds: DBD-B, DBD-C, and RPA1N. The major DNA binding activity of human RPA (hRPA) and Saccharomyces cerevisiae RPA (ScRPA) is associated with DBD-A and DBD-B of RPA1. RPA1 DBD-C is involved in trimerization. The ssDNA-binding mechanism is believed to be multistep and to involve conformational change. Although ScRPA and the hRPA have similar ssDNA-binding properties, they differ functionally. Antibodies to hRPA do not cross-react with ScRPA, and null mutations in the ScRPA subunits are not complemented by corresponding human genes. Also, ScRPA cannot support Simian virus 40 (SV40) DNA replication in vitro, whereas human RPA can.	104
239921	cd04475	RPA1_DBD_B	RPA1_DBD_B: A subfamily of OB folds corresponding to the third OB fold, the ssDNA-binding domain (DBD)-B, of human RPA1 (also called RPA70). RPA1 is the large subunit of Replication protein A (RPA). RPA is a nuclear ssDNA-binding protein (SSB) which appears to be involved in all aspects of DNA metabolism including replication, recombination, and repair. RPA also mediates specific interactions of various nuclear proteins. In animals, plants, and fungi, RPA is a heterotrimer with subunits of 70KDa (RPA1), 32kDa (RPA2), and 14 KDa (RPA3). In addition to DBD-B, RPA1 contains three other OB folds: DBD-A, DBD-C, and RPA1N. The major DNA binding activity of human RPA (hRPA) and Saccharomyces cerevisiae RPA (ScRPA) is associated with RPA1 DBD-A and DBD-B. RPA1 DBD-C is involved in trimerization. The ssDNA binding mechanism is believed to be multistep and to involve conformational change. Although ScRPA and the hRPA have similar ssDNA-binding properties, they differ functionally. Antibodies to hRPA do not cross-react with ScRPA, and null mutations in the ScRPA subunits are not complemented by corresponding human genes. Also, ScRPA cannot support Simian virus 40 (SV40) DNA replication in vitro, whereas human RPA can.	101
239922	cd04476	RPA1_DBD_C	RPA1_DBD_C: A subfamily of OB folds corresponding to the C-terminal OB fold, the ssDNA-binding domain (DBD)-C, of human RPA1 (also called RPA70). RPA1 is the large subunit of Replication protein A (RPA). RPA is a nuclear ssDNA-binding protein (SSB) which appears to be involved in all aspects of DNA metabolism including replication, recombination, and repair. RPA also mediates specific interactions of various nuclear proteins. In animals, plants, and fungi, RPA is a heterotrimer with subunits of 70KDa (RPA1), 32kDa (RPA2), and 14 KDa (RPA3). In addition to DBD-C, RPA1 contains three other OB folds: DBD-A, DBD-B, and RPA1N. The major DNA binding activity of RPA is associated with RPA1 DBD-A and DBD-B. RPA1 DBD-C is involved in DNA binding and trimerization. It contains two structural insertions not found to date in other OB-folds: a zinc ribbon and a three-helix bundle. RPA1 DBD-C also contains a Cys4-type zinc-binding motif, which plays a role in the ssDNA binding function of this domain. It appears that zinc itself may not be required for ssDNA binding.	166
239923	cd04477	RPA1N	RPA1N: A subfamily of OB folds corresponding to the N-terminal OB-fold domain of human RPA1 (also called RPA70). RPA1 is the large subunit of Replication protein A (RPA). RPA is a nuclear ssDNA-binding protein (SSB) which appears to be involved in all aspects of DNA metabolism including replication, recombination, and repair. RPA also mediates specific interactions of various nuclear proteins. In animals, plants, and fungi, RPA is a heterotrimer with subunits of 70KDa (RPA1), 32kDa (RPA2), and 14 KDa (RPA3). RPA1N is known to specifically interact with the p53 tumor suppressor, DNA polymerase alpha, and transcription factors. In addition to RPA1N, RPA1 contains three other OB folds: ssDNA-binding domain (DBD)-A, DBD-B, and DBD-C.	97
239924	cd04478	RPA2_DBD_D	RPA2_DBD_D: A subfamily of OB folds corresponding to the OB fold of the central ssDNA-binding domain (DBD)-D of human RPA2 (also called RPA32). RPA2 is a subunit of Replication protein A (RPA). RPA is a nuclear ssDNA-binding protein (SSB) which appears to be involved in all aspects of DNA metabolism including replication, recombination, and repair. RPA also mediates specific interactions of various nuclear proteins. In animals, plants, and fungi, RPA is a heterotrimer with subunits of 70KDa (RPA1), 32kDa (RPA2), and 14 KDa (RPA3). The major DNA binding activity of RPA is associated with RPA1 DBD-A and DBD-B; RPA2 DBD-D is a weak ssDNA-binding domain. RPA2 DBD-D is also involved in trimerization. The ssDNA binding mechanism is believed to be multistep and to involve conformational change. N-terminal to human RPA2 DBD-D is a domain containing all the known phosphorylation sites of RPA. Human RPA2 is phosphorylated in a cell cycle dependent manner in response to DNA damage. RPA2 interacts physically with menin; the gene encoding menin is a tumor suppressor gene disrupted in multiple endocrine neoplasia type I. This subfamily also includes RPA2 from Cryptosporidium parvum (CpRPA2). CpRPA2 is an SSB, which can be phosphorylated by DNA-PK in vitro.	95
239925	cd04479	RPA3	RPA3: A subfamily of OB folds similar to human RPA3 (also called RPA14). RPA3 is the smallest subunit of Replication protein A (RPA). RPA is a nuclear ssDNA binding protein (SSB) which appears to be involved in all aspects of DNA metabolism including replication, recombination, and repair. RPA also mediates specific interactions of various nuclear proteins. In animals, plants, and fungi, RPA is a heterotrimer with subunits of 70 kDa (RPA1), 32 kDa (RPA2), and 14 kDa (RPA3). RPA3 is believed to have a structural role in assembly of the RPA heterotrimer.	101
239926	cd04480	RPA1_DBD_A_like	RPA1_DBD_A_like: A subgroup of uncharacterized plant OB folds with similarity to the second OB fold, the ssDNA-binding domain (DBD)-A, of human RPA1 (also called RPA70). RPA1 is the large subunit of Replication protein A (RPA). RPA is a nuclear ssDNA-binding protein (SSB) which appears to be involved in all aspects of DNA metabolism including replication, recombination, and repair. RPA also mediates specific interactions of various nuclear proteins. In animals, plants, and fungi, RPA is a heterotrimer with subunits of 70 kDa (RPA1), 32 kDa (RPA2), and 14 kDa (RPA3). In addition to DBD-A, RPA1 contains three other OB folds: DBD-B, DBD-C, and RPA1N. The major DNA binding activity of RPA is associated with DBD-A and DBD-B of RPA1. RPA1 DBD-C is involved in trimerization. The ssDNA-binding mechanism is believed to be multistep and to involve conformational change.	86
239927	cd04481	RPA1_DBD_B_like	RPA1_DBD_B_like: A subgroup of uncharacterized plant OB folds with similarity to the third OB fold, the ssDNA-binding domain (DBD)-B, of human RPA1 (also called RPA70). RPA1 is the large subunit of Replication protein A (RPA). RPA is a nuclear ssDNA-binding protein (SSB) which appears to be involved in all aspects of DNA metabolism including replication, recombination, and repair. RPA also mediates specific interactions of various nuclear proteins. In animals, plants, and fungi, RPA is a heterotrimer with subunits of 70 kDa (RPA1), 32 kDa (RPA2), and 14 kDa (RPA3). In addition to DBD-B, RPA1 contains three other OB folds: DBD-A, DBD-C, and RPA1N. The major DNA binding activity of RPA is associated with RPA1 DBD-A and DBD-B. RPA1 DBD-C is involved in trimerization. The ssDNA binding mechanism is believed to be multistep and to involve conformational change.	106
239928	cd04482	RPA2_OBF_like	RPA2_OBF_like: A subgroup of uncharacterized archaeal OB folds with similarity to the OB fold of the central ssDNA-binding domain (DBD)-D of human RPA2 (also called RPA32). RPA2 is a subunit of Replication protein A (RPA). RPA is a nuclear ssDNA-binding protein (SSB) which appears to be involved in all aspects of DNA metabolism including replication, recombination, and repair. RPA also mediates specific interactions of various nuclear proteins. In animals, plants, and fungi, RPA is a heterotrimer with subunits of 70 kDa (RPA1), 32 kDa (RPA2), and 14 kDa (RPA3). The major DNA binding activity of RPA is associated with RPA1 DBD-A and DBD-B; RPA2 DBD-D is a weak ssDNA-binding domain. RPA2 DBD-D is also involved in trimerization. The ssDNA binding mechanism is believed to be multistep and to involve conformational change. N-terminal to human RPA2 DBD-D is a domain containing all the known phosphorylation sites of RPA. Human RPA2 is phosphorylated in a cell cycle dependent manner in response to DNA damage.	91
239929	cd04483	hOBFC1_like	hOBFC1_like: A subfamily of OB folds similar to that found in human OB fold containing protein 1 (hOBFC1). Members of this group belong to the Replication protein A subunit 2 (RPA2) family of OB folds. RPA is a nuclear ssDNA binding protein (SSB) which appears to be involved in all aspects of DNA metabolism including replication, recombination, and repair. RPA also mediates specific interactions of various nuclear proteins. In animals, plants, and fungi, RPA is a heterotrimer with subunits of 70 kDa (RPA1), 32 kDa (RPA2), and 14 kDa (RPA3). The OB fold domain of RPA2 has dual roles in ssDNA binding and trimerization.	92
239930	cd04484	polC_OBF	polC_OBF: A subfamily of OB folds corresponding to the N-terminal OB-fold nucleic acid binding domain of Bacillus subtilis type C replicative DNA polymerase III alpha subunit (polC). Replication in B. subtilis and Staphylococcus aureus requires two different polymerases, polC and DnaE. The holoenzyme is thought to include the two different polymerases. At the B. subtilis replication fork, polC appears to be involved in leading strand synthesis and DnaE in lagging strand synthesis.	82
239931	cd04485	DnaE_OBF	DnaE_OBF: A subfamily of OB folds corresponding to the C-terminal OB-fold nucleic acid binding domain of Thermus aquaticus and Escherichia coli type C replicative DNA polymerase III alpha subunit (DnaE). The DNA polymerase holoenzyme of E. coli contains two copies of this replicative polymerase, each of which copies a different DNA strand. This group also contains Bacillus subtilis DnaE. Replication in B. subtilis and Staphylococcus aureus requires two different type C polymerases, polC and DnaE, both of which are thought to be included in the DNA polymerase holoenzyme. At the B. subtilis replication fork, polC appears to be involved in leading strand synthesis and DnaE in lagging strand synthesis.	84
239932	cd04486	YhcR_OBF_like	YhcR_OBF_like: A subfamily of OB-fold domains similar to the OB folds of Bacillus subtilis YhcR. YhcR is a sugar-nonspecific nuclease, which is active in the presence of Ca2+ and Mn2+. It cleaves RNA endonucleolytically, producing 3'-monophosphate nucleosides. YhcR appears to be the major Ca2+ activated nuclease of B. subtilis. YhcR may be localized in the cell wall.	78
239933	cd04487	RecJ_OBF2_like	RecJ_OBF2_like: A subfamily of OB folds corresponding to the second OB fold (OBF2) of archaeal-specific proteins with similarity to eubacterial RecJ. RecJ is an ssDNA-specific exonuclease. Although the overall sequence similarity of these proteins to eubacterial RecJ proteins is marginal, they appear to carry motifs, which have been shown to be essential for nuclease function in Escherichia coli RecJ. In addition to this OB fold, most proteins in this subfamily contain: i) an N-terminal OB fold belonging to a different domain family (the ribosomal S1-like RNA-binding family); and ii) a domain, C-terminal to OBF2, characteristic of DHH family proteins. DHH family proteins include E. coli RecJ, and are predicted to have a phosphoesterase function.	73
239934	cd04488	RecG_wedge_OBF	RecG_wedge_OBF: A subfamily of OB folds corresponding to the OB fold found in the N-terminal (wedge) domain of Escherichia coli RecG. RecG is a branched-DNA-specific helicase, which catalyzes the interconversion of a DNA replication fork to a four-stranded (Holliday) junction in vivo and in vitro. This interconversion provides a route to repair stalled forks. The RecG monomer contains three domains. The N-terminal domain is named for its wedge structure, and may provide the specificity of RecG for binding branched-DNA structures. During the reversal of fork to Holliday junction, the wedge domain is fixed at the junction of the fork where the leading and lagging strand duplex arms meet, and is thought to promote the unwinding of the nascent leading and lagging strands. In order to form the Holliday junction, these nascent strands would be annealed, and the parental strands reannealed. The wedge domain may also be a processivity factor of RecG on these branched chain substrates.	75
239935	cd04489	ExoVII_LU_OBF	ExoVII_LU_OBF: A subfamily of OB folds corresponding to the N-terminal OB-fold domain of Escherichia coli exodeoxyribonuclease VII (ExoVII) large subunit. E. coli ExoVII is composed of two non-identical subunits. E. coli ExoVII is a single-strand-specific exonuclease which degrades ssDNA from both 3-prime and 5-prime ends. ExoVII plays a role in methyl-directed mismatch repair in vivo. ExoVII may also guard the genome from mutagenesis by removing excess ssDNA, since the build up of ssDNA would lead to SOS induction and PolIV-dependent mutagenesis.	78
239936	cd04490	PolII_SU_OBF	PolII_SU_OBF: A subfamily of OB folds corresponding to the OB fold found in Pyrococcus abyssi DNA polymerase II (PolII) small subunit. PolII is a family D DNA polymerase, having a 3-prime to 5-prime exonuclease activity. P. abyssi PolII is heterodimeric. The large subunit appears to be the polymerase, and the small subunit may be the exonuclease. The small subunit contains a calcineurin-like phosphatase superfamily domain C-terminal to this OB-fold domain.	79
239937	cd04491	SoSSB_OBF	SoSSB_OBF: A subfamily of OB folds similar to the OB fold of the crenarchaeote Sulfolobus solfataricus single-stranded (ss) DNA-binding protein (SSoSSB). SSoSSB has a single OB fold, and it physically and functionally interacts with RNA polymerase. In vitro, SSoSSB can substitute for the basal transcription factor TBP, stimulating transcription from promoters under conditions in which TBP is limiting, and supporting transcription when TBP is absent. SSoSSB selectively melts the duplex DNA of promoter sequences. It also relieves transcriptional repression by the chromatin Alba. In addition, SSoSSB activates reverse gyrase activity, which involves DNA binding, DNA cleavage, strand passage and ligation. SSoSSB stimulates all these steps in the presence of the chromatin protein, Sul7d. SSoSSB antagonizes the inhibitory effect of Sul7d on reverse gyrase supercoiling activity. It also physically and functionally interacts with Mini-chromosome Maintenance (MCM), stimulating the DNA helicase activity of MCM.	82
239938	cd04492	YhaM_OBF_like	YhaM_OBF_like: A subfamily of OB folds similar to that found in Bacillus subtilis YhaM and Staphylococcus aureus cmp-binding factor-1 (SaCBF1). Both these proteins are 3'-to-5' exoribonucleases. YhaM requires Mn2+ or Co2+ for activity and is inactive in the presence of Mg2+. YhaM also has a Mn2+-dependent 3'-to-5' single-stranded DNA exonuclease activity. SaCBF1 is also a double-stranded DNA binding protein, binding specifically to cmp, the replication enhancer found in S. aureus plasmid pT181. Proteins in this group combine an N-terminal OB fold with a C-terminal HD domain. The HD domain is found in metal-dependent phosphohydrolases.	83
239939	cd04493	BRCA2DBD_OB1	BRCA2DBD_OB1: A subfamily of OB folds corresponding to the first OB fold (OB1) of the 800-amino acid C-terminal ssDNA binding domain (DBD) of BRCA2 (breast cancer susceptibility gene 2) protein, called BRCA2DBD. BRCA2 participates in homologous recombination-mediated repair of double-strand DNA breaks. It stimulates the displacement of Replication protein A (RPA), the most abundant eukaryotic ssDNA binding protein. It also facilitates filament formation. Mutations that map throughout the BRCA2 protein are associated with breast cancer susceptibility. BRCA2 is a large nuclear protein and its most conserved region is the C-terminal BRCA2DBD. BRCA2DBD binds ssDNA in vitro, and is composed of five structural domains, three of which are OB folds (OB1, OB2, and OB3). BRCA2DBD OB2 and OB3 are arranged in tandem, and their mode of binding can be considered qualitatively similar to two OB folds of RPA1, DBD-A and DBD-B (the major DBDs of RPA). BRCA2DBD OB1 binds DNA weakly.	100
239940	cd04494	BRCA2DBD_OB2	BRCA2DBD_OB2: A subfamily of OB folds corresponding to the second OB fold (OB2) of the 800-amino acid C-terminal ssDNA binding domain (DBD) of BRCA2 (breast cancer susceptibility gene 2) protein, called BRCA2DBD. BRCA2 participates in homologous recombination-mediated repair of double-strand DNA breaks. It stimulates the displacement of Replication protein A (RPA), the most abundant eukaryotic ssDNA binding protein. It also facilitates filament formation. Mutations that map throughout the BRCA2 protein are associated with breast cancer susceptibility. BRCA2 is a large nuclear protein and its most conserved region is the C-terminal BRCA2DBD. BRCA2DBD binds ssDNA in vitro, and is composed of five structural domains, three of which are OB folds (OB1, OB2, and OB3). BRCA2DBD OB2 and OB3 are arranged in tandem, and their mode of binding can be considered qualitatively similar to two OB folds of RPA1, DBD-A and DBD-B (the major DBDs of RPA).	251
239941	cd04495	BRCA2DBD_OB3	BRCA2DBD_OB3: A subfamily of OB folds corresponding to the third OB fold (OB3) of the 800-amino acid C-terminal ssDNA binding domain (DBD) of BRCA2 (breast cancer susceptibility gene 2) protein, called BRCA2DBD. BRCA2 participates in homologous recombination-mediated repair of double-strand DNA breaks. It stimulates the displacement of Replication protein A (RPA), the most abundant eukaryotic ssDNA binding protein. It also facilitates filament formation. Mutations that map throughout the BRCA2 protein are associated with breast cancer susceptibility. BRCA2 is a large nuclear protein and its most conserved region is the C-terminal BRCA2DBD. BRCA2DBD binds ssDNA in vitro, and is composed of five structural domains, three of which are OB folds (OB1, OB2, and OB3). BRCA2DBD OB2 and OB3 are arranged in tandem, and their mode of binding can be considered qualitatively similar to two OB folds of RPA1, DBD-A and DBD-B (the major DBDs of RPA).	100
239942	cd04496	SSB_OBF	SSB_OBF: A subfamily of OB folds similar to the OB fold of ssDNA-binding protein (SSB). SSBs bind with high affinity to ssDNA. They bind to and protect ssDNA intermediates during DNA metabolic pathways. All bacterial and eukaryotic SSBs studied to date oligomerize to bring together four OB folds in their active state. The majority (e.g. Escherichia coli SSB) have a single OB fold per monomer, which oligomerize to form a homotetramer. However, Deinococcus and Thermus SSB proteins have two OB folds per monomer, which oligomerize to form a homodimer. Mycobacterium tuberculosis SSB varies in quaternary structure from E. coli SSB. It forms a dimer of dimers having a unique dimer interface, which lends the protein greater stability. Included in this group are OB folds similar to Escherichia coli PriB. E. coli PriB is homodimeric with each monomer having a single OB fold. It does not appear to form higher order oligomers. PriB is an essential protein for replication restart at forks that have stalled at sites of DNA damage. It also plays a role in the assembly of the primosome during replication initiation at the bacteriophage phiX174 origin. PriB physically interacts with SSB and binds ssDNA with high affinity.	100
239943	cd04497	hPOT1_OB1_like	hPOT1_OB1_like: A subfamily of OB folds similar to the first OB fold (OB1) of human protection of telomeres 1 protein (hPOT1), the single OB fold of the N-terminal domain of Schizosaccharomyces pombe POT1 (SpPOT1), and the first OB fold of the N-terminal domain of the alpha subunit (OB1Nalpha) of Oxytricha nova telomere end binding protein (OnTEBP). POT1 proteins recognize single-stranded (ss) 3-prime ends of the telomere. A 3-prime ss overhang is conserved in ciliated protozoa, yeast, and mammals. SpPOT1 is essential for telomere maintenance. It binds specifically to the ss G-rich telomeric sequence (GGTTAC) of S. pombe. hPOT1 binds specifically to ss telomeric DNA repeats ending with the sequence GGTTAG. Deletion of the S. pombe pot1+ gene results in a rapid loss of telomere sequences, chromosome mis-segregation and chromosome circularization. hPOT1 is implicated in telomere length regulation. The hPOT1 monomer consists of two closely connected OB folds (OB1-OB2) which cooperate to bind telomeric ssDNA. OB1 makes more extensive contact with the ssDNA than OB2. OB2 protects the 3' end of the ssDNA. A second OB fold has not been predicted in S. pombe POT1. OnTEBP binds the extreme 3-prime end of telomeric DNA. It is heterodimeric and contains four OB folds - three in the alpha subunit (two in the N-terminal domain and one in the C-terminal domain) and one in the beta subunit. OB1Nalpha, together with the second OB fold of the N-terminal domain of OnTEBP alpha subunit and the beta subunit OB fold, forms a deep cleft that binds ssDNA.	138
239944	cd04498	hPOT1_OB2	hPOT1_OB2: A subfamily of OB folds similar to the second OB fold (OB2) of human protection of telomeres 1 protein (hPOT1). POT1 proteins bind to the single-stranded (ss) 3-prime ends of the telomere. hPOT1 binds specifically to ss telomeric DNA repeats ending with the sequence GGTTAG. The hPOT1 monomer consists of two closely connected OB folds (OB1-OB2) which cooperate to bind telomeric ssDNA. OB1 makes more extensive contact with the ssDNA than OB2. OB2 protects the 3' end of the ssDNA. hPOT1 is implicated in telomere length regulation.	123
239945	cd04501	SGNH_hydrolase_like_4	Members of the SGNH-hydrolase superfamily, a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid.	183
239946	cd04502	SGNH_hydrolase_like_7	Members of the SGNH-hydrolase superfamily, a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid.	171
239947	cd04506	SGNH_hydrolase_YpmR_like	Members of the SGNH-hydrolase superfamily, a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid. This subfamily contains sequences similar to Bacillus YpmR.	204
410449	cd04508	Tudor_SF	Tudor domain superfamily. The Tudor domain is a conserved structural domain, originally identified in the Tudor protein of Drosophila, that adopts a beta-barrel-like core structure containing four short beta-strands followed by an alpha-helical region. It binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. Tudor domain-containing proteins may mediate protein-protein interactions required for various DNA-templated biological processes, such as RNA metabolism, as well as histone modification and the DNA damage response. Members of this superfamily contain one or more copies of the Tudor domain.	47
380490	cd04509	PBP1_ABC_transporter_GPCR_C-like	Family C of G-protein coupled receptors and their close homologs, the type 1 periplasmic-binding proteins of ATP-binding cassette transporter-like systems. This CD includes members of the family C of G-protein coupled receptors and their close homologs, the type 1 periplasmic-binding proteins of ATP-binding cassette transporter-like systems. The family C GPCR includes glutamate/glycine-gated ion channels such as the NMDA receptor, G-protein-coupled receptors, metabotropic glutamate, GABA-B, calcium sensing, pheromone receptors, and atrial natriuretic peptide-guanylate cyclase receptors. The glutamate receptors that form cation-selective ion channels, iGluR, can be classified into three different subgroups according to their binding-affinity for the agonists NMDA (N-methyl-D-aspartate), AMPA (alpha-amino-3-dihydro-5-methyl-3-oxo-4-isoxazolepropionic acid), and kainate. L-glutamate is a major neurotransmitter in the brain of vertebrates and acts through either mGluRs or iGluRs. mGluR subunits possess seven transmembrane segments and a large N-terminal extracellular domain. ABC-type leucine-isoleucine-valine binding protein (LIVBP) is a bacterial periplasmic binding protein that has homology with the amino-terminal domain of the glutamate-receptor ion channels (iGluRs). The extracellular regions of iGluRs are made of two PBP-like domains in tandem, a LIVBP-like domain that constitutes the N terminus (included in this model) followed by a domain related to lysine-arginine-ornithine-binding protein (LAOBP) that belongs to the type 2 periplasmic binding fold protein superfamily. The uncharacterized periplasmic components of various ABC-type transport systems are also included in this family.	306
239948	cd04511	Nudix_Hydrolase_4	Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, U=I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide coenzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance and "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required.	130
271334	cd04512	Ntn_Asparaginase_2_like	L-Asparaginase type 2-like enzymes of the NTN-hydrolase superfamily. This family includes Glycosylasparaginase, Taspase 1, and L-Asparaginase type 2 enzymes. Glycosylasparaginase catalyzes the hydrolysis of the glycosylamide bond of asparagine-linked glycoproteins. Taspase1 catalyzes the cleavage of the Mixed Lineage Leukemia (MLL) nuclear protein and transcription factor TFIIA. L-Asparaginase type 2 hydrolyzes L-asparagine to L-aspartate and ammonia. The proenzymes of this family undergo autoproteolytic cleavage before a threonine to generate alpha and beta subunits. The threonine becomes the N-terminal residue of the beta subunit and is the catalytic residue. The family is circularly permuted relative to other NTN-hydrolase families.	249
271335	cd04513	Glycosylasparaginase	Glycosylasparaginase and similar proteins. Glycosylasparaginase catalyzes the hydrolysis of the glycosylamide bond of asparagine-linked glycoproteins. This enzyme is an amidase located inside lysosomes. Mutation of this gene in humans causes a genetic disorder known as aspartylglycosaminuria (AGU). The glycosylasparaginase precursor undergoes autoproteolysis through an N-O or N-S acyl rearrangement of the peptide bond, which leads to the cleavage of a peptide bond between an Asp and a Thr. This proteolysis step generates an exposed N-terminal catalytic threonine and activates the enzyme.	294
271336	cd04514	Taspase1_like	Taspase 1 (threonine aspartase 1) and similar proteins. Taspase1 catalyzes the cleavage of the mixed lineage leukemia (MLL) nuclear protein and transcription factor TFIIA. Taspase1 is a threonine aspartase, a member of the Ntn hydrolase superfamily and the type 2 asparaginase family. A threonine residue acts as the active site nucleophile in both endopeptidase and protease activities to cleave polypeptide substrates after an aspartate residue. The Taspase1 proenzyme undergoes autoproteolysis into alpha and beta subunits. The N-terminal residue of the beta subunit is a threonine which is the active catalytic residue. The active enzyme is a heterotetramer.	313
341214	cd04515	Alpha_kinase	Alpha kinase family. The alpha kinase family is a novel family of eukaryotic protein kinase catalytic domains, which have no detectable similarity to conventional serine/threonine protein kinases. The family contains myosin heavy chain kinases, elongation factor-2 kinases, and bifunctional ion channel kinases. These kinases are implicated in a large variety of cellular processes such as protein translation, Mg2+/Ca2+ homeostasis, intracellular transport, cell migration, adhesion, and proliferation. The alpha-kinase family was named after the unique mode of substrate recognition by its initial members, the Dictyostelium heavy chain kinases, which targeted protein sequences that adopt an alpha-helical conformation. More recently, alpha-kinases were found to also target residues in non-helical regions.	213
239952	cd04516	TBP_eukaryotes	eukaryotic TATA box binding protein (TBP): Present in archaea and eukaryotes, TBPs are transcription factors that recognize promoters and initiate transcription. TBP has been shown to be an essential component of three different transcription initiation complexes: SL1, TFIID and TFIIIB, directing transcription by RNA polymerases I, II and III, respectively. TBP binds directly to the TATA box promoter element, where it nucleates polymerase assembly, thus defining the transcription start site. TBP's binding in the minor groove induces a dramatic DNA bending while its own structure barely changes. The conserved core domain of TBP, which binds to the TATA box, has a bipartite structure, with intramolecular symmetry generating a saddle-shaped structure that sits astride the DNA.	174
239953	cd04517	TLF	TBP-like factors (TLF; also called TLP, TRF, TRP), which are found in most metazoans. TLFs and TBPs have well-conserved core domains; however, they only share about 60% similarity. TLFs, like TBPs, interact with TFIIA and TFIIB, which are part of the basal transcription machinery. Yet, in contrast to TBPs, TLFs seem not to interact with the TATA-box and even have a negative effect on the transcription of TATA-containing promoters. Recent results indicate that TLFs are involved in the transcription via TATA-less promoters.	174
239954	cd04518	TBP_archaea	archaeal TATA box binding protein (TBP): TBPs are transcription factors present in archaea and eukaryotes, that recognize promoters and initiate transcription. TBP has been shown to be an essential component of three different transcription initiation complexes: SL1, TFIID and TFIIIB, directing transcription by RNA polymerases I, II and III, respectively. TBP binds directly to the TATA box promoter element, where it nucleates polymerase assembly, thus defining the transcription start site. TBP's binding in the minor groove induces a dramatic DNA bending while its own structure barely changes. The conserved core domain of TBP, which binds to the TATA box, has a bipartite structure, with intramolecular symmetry generating a saddle-shaped structure that sits astride the DNA.	174
213328	cd04519	RasGAP	Ras GTPase Activating Domain. RasGAP functions as an enhancer of the hydrolysis of GTP that is bound to Ras-GTPases. Proteins having a RasGAP domain include p120GAP, IQGAP, Rab5-activating protein 6, and Neurofibromin, among others. Although the Rho (Ras homolog) GTPases are most closely related to members of the Ras family, RhoGAP and RasGAP exhibit no similarity at their amino acid sequence level. RasGTPases function as molecular switches in a large number of signaling pathways. They are in the on state when bound to GTP, and in the off state when bound to GDP. The RasGAP domain speeds up the hydrolysis of GTP in Ras-like proteins acting as a negative regulator.	256
341359	cd04582	CBS_pair_ABC_OpuCA_assoc	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains found associated with the ABC transporter OpuCA. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains found in association with the ABC transporter OpuCA. OpuCA is the ATP binding component of a bacterial solute transporter that serves a protective role to cells growing in a hyperosmolar environment but the function of the CBS domains in OpuCA remains unknown. In the related ABC transporter, OpuA, the tandem CBS domains have been shown to function as sensors for ionic strength, whereby they control the transport activity through an electronic switching mechanism. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. They are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	111
341360	cd04583	CBS_pair_ABC_OpuCA_assoc	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains found associated with the ABC transporter OpuCA. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains found in association with the ABC transporter OpuCA. OpuCA is the ATP binding component of a bacterial solute transporter that serves a protective role to cells growing in a hyperosmolar environment but the function of the CBS domains in OpuCA remains unknown. In the related ABC transporter, OpuA, the tandem CBS domains have been shown to function as sensors for ionic strength, whereby they control the transport activity through an electronic switching mechanism. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. They are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	110
341361	cd04584	CBS_pair_AcuB_like	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the ACT domain. The putative Acetoin Utilization Protein (AcuB) from Vibrio cholerae contains a CBS pair domain.  The acetoin utilization protein plays a role in growth and sporulation on acetoin or butanediol for use as a carbon source. Acetoin is an important physiological metabolite excreted by many microorganisms.  It is used as an external energy store by a number of fermentative bacteria. Acetoin is produced by the decarboxylation of alpha-acetolactate. Once superior carbon sources are exhausted, and the culture enters stationary phase, acetoin can be utilised in order to maintain the culture density. The conversion of acetoin into acetyl-CoA or 2,3-butanediol is catalysed by the acetoin dehydrogenase complex and acetoin reductase/2,3-butanediol dehydrogenase, respectively. Acetoin utilization proteins, acetylpolyamine amidohydrolases, and histone deacetylases are members of an ancient protein superfamily. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in the acetoin utilization proteins in bacteria. Acetoin is a product of fermentative metabolism in many prokaryotic and eukaryotic microorganisms.  They produce acetoin as an external carbon storage compound and then later reuse it as a carbon and energy source during their stationary phase and sporulation. In addition, these CBS domains are associated with a downstream ACT (aspartate kinase/chorismate mutase/TyrA) domain, which is linked to a wide range of metabolic enzymes that are regulated by amino acid concentration. Pairs of ACT domains bind specifically to a particular amino acid leading to regulation of the linked enzyme. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. 
They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	130
341362	cd04586	CBS_pair_BON_assoc	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the BON (bacterial OsmY and nodulation domain) domain. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the BON (bacterial OsmY and nodulation domain) domain. BON is a putative phospholipid-binding domain found in a family of osmotic shock protection proteins. It is also found in some secretins and a group of potential haemolysins. Its likely function is attachment to phospholipid membranes. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	137
341363	cd04587	CBS_pair_CAP-ED_NT_Pol-beta-like_DUF294_assoc	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the bacterial CAP_ED (cAMP receptor protein effector domain) family of transcription factors, the NT (Nucleotidyltransferase) Pol-beta-like domain, and the DUF294 domain. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the bacterial CAP_ED (cAMP receptor protein effector domain) family of transcription factors, the NT_Pol-beta-like domain, and the DUF294 domain.  Members of CAP_ED include CAP, which binds cAMP; FNR (fumarate and nitrate reductase), which uses an iron-sulfur cluster to sense oxygen; and CooA, a heme-containing CO sensor. In all cases, binding of the effector leads to conformational changes and the ability to activate transcription. The NT_Pol-beta-like domain includes the Nucleotidyltransferase (NT) domains of DNA polymerase beta and other family X DNA polymerases, as well as the NT domains of class I and class II CCA-adding enzymes, RelA- and SpoT-like ppGpp synthetases and hydrolases, 2'5'-oligoadenylate (2-5A) synthetases, Escherichia coli adenylyltransferase (GlnE), Escherichia coli uridylyl transferase (GlnD), poly(A) polymerases, terminal uridylyl transferases, Staphylococcus aureus kanamycin nucleotidyltransferase, and similar proteins.  DUF294 is a putative nucleotidyltransferase with a conserved DxD motif. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  
The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	114
341364	cd04588	CBS_pair_archHTH_assoc	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains found in archaea and associated with helix turn helix domain. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in the inosine 5' monophosphate dehydrogenase (IMPDH) protein.  IMPDH is an essential enzyme that catalyzes the first step unique to GTP synthesis, playing a key role in the regulation of cell proliferation and differentiation. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	111
341365	cd04589	CBS_pair_CAP-ED_NT_Pol-beta-like_DUF294_assoc	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the bacterial CAP_ED (cAMP receptor protein effector domain) family of transcription factors, the NT (Nucleotidyltransferase) Pol-beta-like domain, and the DUF294 domain. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the bacterial CAP_ED (cAMP receptor protein effector domain) family of transcription factors, the NT_Pol-beta-like domain, and the DUF294 domain.  Members of CAP_ED include CAP, which binds cAMP; FNR (fumarate and nitrate reductase), which uses an iron-sulfur cluster to sense oxygen; and CooA, a heme-containing CO sensor. In all cases, binding of the effector leads to conformational changes and the ability to activate transcription. The NT_Pol-beta-like domain includes the Nucleotidyltransferase (NT) domains of DNA polymerase beta and other family X DNA polymerases, as well as the NT domains of class I and class II CCA-adding enzymes, RelA- and SpoT-like ppGpp synthetases and hydrolases, 2'5'-oligoadenylate (2-5A) synthetases, Escherichia coli adenylyltransferase (GlnE), Escherichia coli uridylyl transferase (GlnD), poly(A) polymerases, terminal uridylyl transferases, Staphylococcus aureus kanamycin nucleotidyltransferase, and similar proteins.  DUF294 is a putative nucleotidyltransferase with a conserved DxD motif. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  
The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	113
341366	cd04590	CBS_pair_CorC_HlyC_assoc	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains, the majority of which are associated with the CorC_HlyC domain. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains, the majority of which are associated with the CorC_HlyC domain. CorC_HlyC is a transporter-associated domain. This small domain is found in Na+/H+ antiporters, in proteins involved in magnesium and cobalt efflux, and in association with some proteins of unknown function.  The function of the CorC_HlyC domain is uncertain but it might be involved in modulating transport of ion substrates.  These CBS domains are found in highly conserved proteins that either have unknown function or are purported to be hemolysins, exotoxins involved in lysis of red blood cells in vitro.  The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. 
Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	119
341367	cd04591	CBS_pair_voltage-gated_CLC_euk_bac	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the voltage gated CLC (chloride channel) in eukaryotes and bacteria. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the voltage-gated CLC chloride channel.  The CBS pairs here are found in the EriC CLC-type chloride channels in eukaryotes and bacteria. These ion channels are proteins with a seemingly simple task of allowing the passive flow of chloride ions across biological membranes. CLC-type chloride channels come from all kingdoms of life, have several gene families, and can be gated by voltage. The members of the CLC-type chloride channel are double-barreled: two proteins forming homodimers at a broad interface formed by four helices from each protein. The two pores are not found at this interface, but are completely contained within each subunit, as deduced from the mutational analyses, unlike many other channels, in which four or five identical or structurally related subunits jointly form one pore. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. 
Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	114
341368	cd04592	CBS_pair_voltage-gated_CLC_euk_bac	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the voltage gated CLC (chloride channel) in eukaryotes and bacteria. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the voltage-gated CLC chloride channel.  The CBS pairs here are found in the EriC CLC-type chloride channels in eukaryotes and bacteria. These ion channels are proteins with a seemingly simple task of allowing the passive flow of chloride ions across biological membranes. CLC-type chloride channels come from all kingdoms of life, have several gene families, and can be gated by voltage. The members of the CLC-type chloride channel are double-barreled: two proteins forming homodimers at a broad interface formed by four helices from each protein. The two pores are not found at this interface, but are completely contained within each subunit, as deduced from the mutational analyses, unlike many other channels, in which four or five identical or structurally related subunits jointly form one pore. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. 
Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	128
341369	cd04594	CBS_pair_voltage-gated_CLC_archaea	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the voltage gated CLC (chloride channel) in archaea. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the voltage-gated CLC chloride channel.  The CBS pairs here are found in the EriC CLC-type chloride channels in archaea. These ion channels are proteins with a seemingly simple task of allowing the passive flow of chloride ions across biological membranes. CLC-type chloride channels come from all kingdoms of life, have several gene families, and can be gated by voltage. The members of the CLC-type chloride channel are double-barreled: two proteins forming homodimers at a broad interface formed by four helices from each protein. The two pores are not found at this interface, but are completely contained within each subunit, as deduced from the mutational analyses, unlike many other channels, in which four or five identical or structurally related subunits jointly form one pore. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. 
Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	107
341370	cd04595	CBS_pair_DHH_polyA_Pol_assoc	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the DHH and nucleotidyltransferase (NT) domains. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with an upstream DHH domain which performs a phosphoesterase function and a downstream nucleotidyltransferase (NT) domain of family X DNA polymerases. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	110
341371	cd04596	CBS_pair_DRTGG_assoc	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the DRTGG domain. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with a DRTGG domain upstream. The function of the DRTGG domain, named after its conserved residues, is unknown. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	108
341372	cd04597	CBS_pair_inorgPPase	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with  family II inorganic pyrophosphatase. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with a subgroup of family II inorganic pyrophosphatases (PPases) that also contain a DRTGG domain. The homolog from Clostridium has been shown to be inhibited by AMP and activated by a novel effector, diadenosine 5',5-P1,P4-tetraphosphate (AP(4)A), which has been shown to bind to the CBS domain. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.	106
341373	cd04598	CBS_pair_GGDEF_EAL	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the GGDEF (DiGuanylate-Cyclase (DGC)) domain. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in association with the GGDEF (DiGuanylate-Cyclase (DGC)) domain. The GGDEF domain has been suggested to be homologous to the adenylyl cyclase catalytic domain and is thought to be involved in regulating cell surface adhesiveness in bacteria. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	121
341374	cd04599	CBS_pair_GGDEF_assoc	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the GGDEF (DiGuanylate-Cyclase (DGC)) domain. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in association with the GGDEF (DiGuanylate-Cyclase (DGC)) domain. The GGDEF domain has been suggested to be homologous to the adenylyl cyclase catalytic domain and is thought to be involved in regulating cell surface adhesiveness in bacteria. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	107
341375	cd04600	CBS_pair_HPP_assoc	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the HPP motif domain. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the HPP motif domain. These proteins are integral membrane proteins with four transmembrane spanning helices. The function of these proteins is uncertain, but they are thought to be transporters. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	133
341376	cd04601	CBS_pair_IMPDH	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in the inosine 5' monophosphate dehydrogenase (IMPDH) protein. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in the inosine 5' monophosphate dehydrogenase (IMPDH) protein.  IMPDH is an essential enzyme that catalyzes the first step unique to GTP synthesis, playing a key role in the regulation of cell proliferation and differentiation. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	110
341377	cd04603	CBS_pair_KefB_assoc	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the KefB (Kef-type K+ transport systems) domain. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the KefB (Kef-type K+ transport systems) domain which is involved in inorganic ion transport and metabolism. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	112
341378	cd04604	CBS_pair_SIS_assoc	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the SIS (Sugar ISomerase) domain. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the SIS (Sugar ISomerase) domain in the API [A5P (D-arabinose 5-phosphate) isomerase] protein KpsF/GutQ.  These APIs catalyze the conversion of the pentose pathway intermediate D-ribulose 5-phosphate into A5P, a precursor of 3-deoxy-D-manno-octulosonate, which is an integral carbohydrate component of various glycolipids coating the surface of the outer membrane of Gram-negative bacteria, including lipopolysaccharide and many group 2 K-antigen capsules. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	124
341379	cd04605	CBS_pair_arch_MET2_assoc	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the MET2 domain. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the MET2 domain. Met2 is a key enzyme in the biosynthesis of methionine.  It encodes a homoserine transacetylase involved in converting homoserine to O-acetyl homoserine. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	116
341380	cd04606	CBS_pair_Mg_transporter	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in the magnesium transporter, MgtE. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domain in the magnesium transporter, MgtE.  MgtE and its homologs are found in eubacteria, archaebacteria, and eukaryota. Members of this family transport Mg2+ or other divalent cations into the cell via two highly conserved aspartates. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	121
341381	cd04607	CBS_pair_NTP_transferase_assoc	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domain associated with the NTP (Nucleotidyl transferase) domain. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domain associated with the NTP (Nucleotidyl transferase) domain downstream.  The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	112
341382	cd04608	CBS_pair_CBS	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the pyridoxal-phosphate (PALP) dependent enzyme domain. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the pyridoxal-phosphate (PALP) dependent enzyme domain upstream. Cystathionine beta-synthase (CBS) contains, besides the C-terminal regulatory CBS-pair, an N-terminal heme-binding module, followed by a pyridoxal phosphate (PLP) domain, which houses the active site. It is the first enzyme in the transsulfuration pathway, catalyzing the conversion of serine and homocysteine to cystathionine and water. In general, CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	120
341383	cd04610	CBS_pair_ParBc_assoc	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with a ParBc (ParB-like nuclease) domain. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with a ParBc (ParB-like nuclease) domain downstream. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	108
341384	cd04611	CBS_pair_GGDEF_PAS_repeat2	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains found in diguanylate cyclase/phosphodiesterase proteins with PAS sensors, repeat 2. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains found in diguanylate cyclase/phosphodiesterase proteins with PAS sensors.  PAS domains have been found to bind ligands, and to act as sensors for light and oxygen in signal transduction. The GGDEF domain has been suggested to be homologous to the adenylyl cyclase catalytic domain and is thought to be involved in regulating cell surface adhesiveness in bacteria. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	131
341385	cd04613	CBS_pair_voltage-gated_CLC_bac	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the voltage gated CLC (chloride channel) in bacteria. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the voltage gated CLC voltage-gated chloride channel.  The CBS pairs here are found in the EriC CIC-type chloride channels in bacteria. These ion channels are proteins with a seemingly simple task of allowing the passive flow of chloride ions across biological membranes. CIC-type chloride channels come from all kingdoms of life, have several gene families, and can be gated by voltage. The members of the CIC-type chloride channel are double-barreled: two proteins forming homodimers at a broad interface formed by four helices from each protein. The two pores are not found at this interface, but are completely contained within each subunit, as deduced from the mutational analyses, unlike many other channels, in which four or five identical or structurally related subunits jointly form one pore. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	119
341386	cd04614	CBS_pair_arch2_repeat2	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains present in archaea, repeat 2. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains found in Inosine monophosphate (IMP) dehydrogenases and related proteins including IMP dehydrogenase IX from Methanothermobacter.  IMP dehydrogenase is an essential enzyme in the de novo biosynthesis of Guanosine monophosphate (GMP), catalyzing the NAD-dependent oxidation of IMP to xanthosine monophosphate (XMP).  The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	150
341387	cd04617	CBS_pair_CcpN	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains of CcpN repressor. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	125
341388	cd04618	CBS_euAMPK_gamma-like_repeat1	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains found in AMP-activated protein kinase gamma-like proteins, repeat 1. AMP-activated protein kinase (AMPK) plays multiple roles in the body's overall metabolic balance and response to exercise, nutritional stress, hormonal stimulation, and the glucose-lowering drugs metformin and rosiglitazone. AMPK consists of a catalytic alpha subunit and two non-catalytic subunits, beta and gamma, each with multiple isoforms that form active 1:1:1 heterotrimers.  This cd contains 2 tandem repeats of the CBS domains found in the gamma subunits of AMPK. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	138
341389	cd04620	CBS_two-component_sensor_histidine_kinase_repeat1	2 tandem repeats of the CBS domain in the two-component sensor histidine kinase and related-proteins, repeat 1. This cd contains 2 tandem repeats of the CBS domain in the two-component sensor histidine kinase and related-proteins. Two-component regulation is the predominant form of signal recognition and response coupling mechanism used by bacteria to sense and respond to diverse environmental stresses and cues ranging from common environmental stimuli to host signals recognized by pathogens and bacterial cell-cell communication signals.  The structures of both sensors and regulators are modular, and numerous variations in domain architecture and composition have evolved to tailor to specific needs in signal perception and signal transduction. The simplest histidine kinase sensors consist of only sensing and kinase domains. The more complex hybrid sensors contain an additional REC domain typical of two-component regulators and in some cases a C-terminal histidine phosphotransferase (HPT) domain. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	136
341390	cd04622	CBS_pair_HRP1_like	CBS pair domain found in Hypoxic Response Protein 1 (HRP1)-like proteins. Mycobacterium tuberculosis adapts to cellular stresses by upregulation of the dormancy survival regulon.  Hypoxic response protein 1 (HRP1) is encoded by one of the most strongly upregulated genes in the dormancy survival regulon. HRP1 is a 'CBS-domain-only' protein; however, unlike other CBS-containing proteins, it does not appear to bind AMP. The biological function of the protein remains unclear, but it is thought to contribute to the modulation of the host immune response. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	115
341391	cd04623	CBS_pair_bac_euk	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains present in bacteria and eukaryotes. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	113
341392	cd04629	CBS_pair_bac	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains present in bacteria. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	116
341393	cd04630	CBS_pair_bac	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains present in bacteria. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	120
341394	cd04631	CBS_archAMPK_gamma-repeat2	CBS pair domains found in archaeal 5'-AMP-activated protein kinase gamma subunit-like proteins. The archaeal gamma subunit of 5'-AMP-activated protein kinase (AMPK) contains four CBS domains in tandem repeats, similar to eukaryotic homologs. AMPK is an important regulator of metabolism and of energy homeostasis. It is a heterotrimeric protein composed of a catalytic serine/threonine kinase subunit (alpha) and two regulatory subunits (beta and gamma). The gamma subunit senses the intracellular energy status by competitively binding AMP and ATP and is believed to be responsible for allosteric regulation of the whole complex. In humans, mutations in the gamma subunit of AMPK are associated with hypertrophic cardiomyopathy, Wolff-Parkinson-White syndrome and glycogen storage in the skeletal muscle. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.	130
341395	cd04632	CBS_pair_arch1_repeat2	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains present in archaea, repeat 2. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	127
341396	cd04638	CBS_pair_arch2_repeat1	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains present in archaea, repeat 1. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	109
341397	cd04639	CBS_pair_peptidase_M50	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains found in the metalloprotease peptidase M50. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in peptidase M50.  Members of the M50 metallopeptidase family include mammalian sterol-regulatory element binding protein (SREBP) site 2 proteases and various hypothetical bacterial homologues. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	120
341398	cd04640	CBS_pair_proteobact	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains present in proteobacteria. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	133
341399	cd04641	CBS_euAMPK_gamma-like_repeat2	CBS pair domain found in 5'-AMP (adenosine monophosphate)-activated protein kinase. The 5'-AMP (adenosine monophosphate)-activated protein kinase (AMPK) coordinates metabolic function with energy availability by responding to changes in intracellular ATP (adenosine triphosphate) and AMP concentrations. Most of the members of this cd contain two Bateman domains, each of which is composed of a tandem pair of cystathionine beta-synthase (CBS) motifs. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	124
341400	cd04643	CBS_pair_bac	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains present in bacteria. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	130
100051	cd04645	LbH_gamma_CA_like	Gamma carbonic anhydrase-like: This family is composed of gamma carbonic anhydrase (CA), Ferripyochelin Binding Protein (FBP), E. coli paaY protein, and similar proteins. CAs are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism, involving the nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Gamma CAs are trimeric enzymes with left-handed parallel beta helix (LbH) structural domain.	153
100052	cd04646	LbH_Dynactin_6	Dynactin 6 (or subunit p27): Dynactin is a major component of the activator complex that stimulates dynein-mediated vesicle transport. Dynactin is a heterocomplex of at least eight subunits, including a 150,000-MW protein called Glued, the actin-capping protein Arp1, and dynamitin. In vitro binding experiments show that dynactin enhances dynein-dependent motility, possibly through interaction with microtubules and vesicles. Subunit p27 is part of the pointed-end subcomplex in dynactin that also includes p25, p26, and Arp11. This subcomplex interacts with membranous cargoes. p25 and p27 contain the imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X), indicating a left-handed parallel beta helix (LbH) structural domain. Proteins containing hexapeptide repeats are often enzymes showing acyltransferase activity.	164
100053	cd04647	LbH_MAT_like	Maltose O-acyltransferase (MAT)-like: This family is composed of maltose O-acetyltransferase, galactoside O-acetyltransferase (GAT), xenobiotic acyltransferase (XAT) and similar proteins. MAT and GAT catalyze the CoA-dependent acetylation of the 6-hydroxyl group of their respective sugar substrates. MAT acetylates maltose and glucose exclusively while GAT specifically acetylates galactopyranosides. XAT catalyzes the CoA-dependent acetylation of a variety of hydroxyl-bearing acceptors such as chloramphenicol and streptogramin, among others. XATs are implicated in inactivating xenobiotics leading to xenobiotic resistance in patients. Members of this family contain a left-handed parallel beta-helix (LbH) domain with at least 5 turns, each containing three imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X). They are trimeric in their active form.	109
100054	cd04649	LbH_THP_succinylT_putative	Putative 2,3,4,5-tetrahydropyridine-2,6-dicarboxylate (THDP) N-succinyltransferase (THP succinyltransferase), C-terminal left-handed parallel beta-helix (LbH) domain: This group is composed of mostly uncharacterized proteins containing an N-terminal domain of unknown function and a C-terminal LbH domain with similarity to THP succinyltransferase LbH. THP succinyltransferase catalyzes the conversion of tetrahydrodipicolinate and succinyl-CoA to N-succinyltetrahydrodipicolinate and CoA. It is the committed step in the succinylase pathway by which bacteria synthesize L-lysine and meso-diaminopimelate, a component of peptidoglycan. The enzyme is trimeric and displays the left-handed parallel beta-helix (LbH) structural motif encoded by the hexapeptide repeat motif.	147
100055	cd04650	LbH_FBP	Ferripyochelin Binding Protein (FBP): FBP is an outer membrane protein which plays a role in iron acquisition. It binds iron when it is complexed with pyochelin. It adopts the left-handed parallel beta-helix (LbH) structure, and contains imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X). Proteins containing hexapeptide repeats are often enzymes showing acyltransferase activity. Acyltransferase activity has not been observed in this group.	154
100056	cd04651	LbH_G1P_AT_C	Glucose-1-phosphate adenylyltransferase, C-terminal Left-handed parallel beta helix (LbH) domain: Glucose-1-phosphate adenylyltransferase is also known as ADP-glucose synthase or ADP-glucose pyrophosphorylase. It catalyzes the first committed and rate-limiting step in starch biosynthesis in plants and glycogen biosynthesis in bacteria. It is the enzymatic site for regulation of storage polysaccharide accumulation in plants and bacteria. The enzyme is a homotetramer, with each subunit containing an N-terminal catalytic domain that resembles a dinucleotide-binding Rossmann fold and a C-terminal LbH fold domain with at least 5 turns, each containing three imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X). The LbH domain is involved in cooperative allosteric regulation and oligomerization.	104
100057	cd04652	LbH_eIF2B_gamma_C	eIF-2B gamma subunit, C-terminal Left-handed parallel beta-Helix (LbH) domain: eIF-2B is a eukaryotic translation initiator, a guanine nucleotide exchange factor (GEF) composed of five different subunits (alpha, beta, gamma, delta and epsilon). eIF2B is important for regenerating GTP-bound eIF2 during the initiation process. This event is obligatory for eIF2 to bind initiator methionyl-tRNA, forming the ternary initiation complex. The eIF-2B gamma subunit contains an N-terminal domain that resembles a dinucleotide-binding Rossmann fold and a C-terminal LbH domain with 4 turns, each containing three imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X). The epsilon and gamma subunits form the catalytic subcomplex of eIF-2B, which binds eIF2 and catalyzes guanine nucleotide exchange.	81
240015	cd04657	Piwi_ago-like	Piwi_ago-like: PIWI domain, Argonaute-like subfamily. Argonaute is the central component of the RNA-induced silencing complex (RISC) and related complexes. The PIWI domain is the C-terminal portion of Argonaute and consists of two subdomains, one of which provides the 5' anchoring of the guide RNA and the other, the catalytic site for slicing.	426
240016	cd04658	Piwi_piwi-like_Euk	Piwi_piwi-like_Euk: PIWI domain, Piwi-like subfamily found in eukaryotes. This domain is found in Piwi and closely related proteins, where it is believed to perform a crucial role in germline cells, via RNA silencing. RNA silencing refers to a group of related gene-silencing mechanisms mediated by short RNA molecules, including siRNAs, miRNAs, and heterochromatin-related guide RNAs. The mechanism in Piwi is believed to be similar to that in Argonaute, the central component of the RNA-induced silencing complex (RISC). The PIWI domain is the C-terminal portion of Argonaute and consists of two subdomains, one of which provides the 5' anchoring of the guide RNA and the other, the catalytic site for slicing.	448
240017	cd04659	Piwi_piwi-like_ProArk	Piwi_piwi-like_ProArk: PIWI domain, Piwi-like subfamily found in Archaea and Bacteria. RNA silencing refers to a group of related gene-silencing mechanisms mediated by short RNA molecules, including siRNAs, miRNAs, and heterochromatin-related guide RNAs. The central component of the RNA-induced silencing complex (RISC) and related complexes is Argonaute. The PIWI domain is the C-terminal portion of Argonaute and consists of two subdomains, one of which provides the 5' anchoring of the guide RNA and the other, the catalytic site for slicing. This domain is also found in closely related proteins, including the Piwi subfamily, where it is believed to perform a crucial role in germline cells, via a similar mechanism.	404
240018	cd04660	nsLTP_like	nsLTP_like: Non-specific lipid-transfer protein (nsLTP)-like subfamily; composed of predominantly uncharacterized proteins with similarity to nsLTPs, including Medicago truncatula MtN5, the root-specific Phaseolus vulgaris PVR3, Antirrhinum majus FIL1, and Lilium longiflorum LIM3. Plant nsLTPs are small, soluble proteins that facilitate the transfer of fatty acids, phospholipids, glycolipids, and steroids between membranes. The MtN5 gene is induced during root nodule development. FIL1 is thought to be important in petal and stamen formation. The LIM3 gene is induced during the early prophase stage of meiosis in lily microsporocytes.	73
240019	cd04661	MRP_L46	Mitochondrial ribosomal protein L46 (MRP L46) is a component of the large subunit (39S) of the mammalian mitochondrial ribosome and a member of the Nudix hydrolase superfamily. MRPs are thought to be involved in the maintenance of the mitochondrial DNA. In general, members of the Nudix superfamily require a divalent cation, such as Mg2+ or Mn2+, for activity and contain the Nudix motif, a highly conserved 23-residue block (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. MRP L46 appears to contain a modified nudix motif.	132
240020	cd04662	Nudix_Hydrolase_5	Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required.	126
240021	cd04663	Nudix_Hydrolase_6	Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required.	126
240022	cd04664	Nudix_Hydrolase_7	Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required.	129
240023	cd04665	Nudix_Hydrolase_8	Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required.	118
240024	cd04666	Nudix_Hydrolase_9	Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required.	122
240025	cd04667	Nudix_Hydrolase_10	Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required.	112
240026	cd04669	Nudix_Hydrolase_11	Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required.	121
240027	cd04670	Nudix_Hydrolase_12	Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required.	127
240028	cd04671	Nudix_Hydrolase_13	Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required.	123
240029	cd04672	Nudix_Hydrolase_14	Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required.	123
240030	cd04673	Nudix_Hydrolase_15	Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required.	122
240031	cd04674	Nudix_Hydrolase_16	Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required.	118
240032	cd04676	Nudix_Hydrolase_17	Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required.	129
240033	cd04677	Nudix_Hydrolase_18	Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required.	132
240034	cd04678	Nudix_Hydrolase_19	Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required.	129
240035	cd04679	Nudix_Hydrolase_20	Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required.	125
240036	cd04680	Nudix_Hydrolase_21	Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required.	120
240037	cd04681	Nudix_Hydrolase_22	Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required.	130
240038	cd04682	Nudix_Hydrolase_23	Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required.	122
240039	cd04683	Nudix_Hydrolase_24	Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required.	120
240040	cd04684	Nudix_Hydrolase_25	Contains a crystal structure of the Nudix hydrolase from Enterococcus faecalis, which has an unknown function. In general, members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity. They also contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which forms a structural motif that functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required.	128
240041	cd04685	Nudix_Hydrolase_26	Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required.	133
240042	cd04686	Nudix_Hydrolase_27	Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required.	131
240043	cd04687	Nudix_Hydrolase_28	Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required.	128
240044	cd04688	Nudix_Hydrolase_29	Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required.	126
240045	cd04689	Nudix_Hydrolase_30	Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U=I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required.	125
240046	cd04690	Nudix_Hydrolase_31	Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required.	118
240047	cd04691	Nudix_Hydrolase_32	Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required.	117
240048	cd04692	Nudix_Hydrolase_33	Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required.	144
240049	cd04693	Nudix_Hydrolase_34	Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required.	127
240050	cd04694	Nudix_Hydrolase_35	Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required.	143
240051	cd04695	Nudix_Hydrolase_36	Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required.	131
240052	cd04696	Nudix_Hydrolase_37	Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required.	125
240053	cd04697	Nudix_Hydrolase_38	Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required.	126
240054	cd04699	Nudix_Hydrolase_39	Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+, for their activity and contain a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. Substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required.	129
240055	cd04700	DR1025_like	DR1025 from Deinococcus radiodurans, a member of the Nudix hydrolase superfamily, shows nucleoside triphosphatase and dinucleoside polyphosphate pyrophosphatase activities. Like other enzymes belonging to this superfamily, it requires a divalent cation, in this case Mg2+, for its activity. It also contains a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which functions as a metal binding and catalytic site. In general, substrates of nudix hydrolases include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance & "house-cleaning" enzymes. Substrate specificity is used to define families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required.	142
271337	cd04701	Asparaginase_2	Bacterial/fungal L-Asparaginase type 2. L-Asparaginase hydrolyzes L-asparagine to L-aspartate and ammonia. The proenzyme undergoes an autoproteolytic cleavage into alpha and beta subunits to expose a threonine residue which becomes the N-terminal residue of the beta subunit. The threonine residue plays a central role in hydrolase activity. Some asparaginases can also hydrolyze L-glutamine and are termed glutaminase-asparaginase. This is a member of the Ntn-hydrolase superfamily, and this subfamily covers mostly bacterial and fungal enzymes.	264
271338	cd04702	ASRGL1_like	Metazoan L-Asparaginase type 2. ASRGL1 and similar proteins constitute a subfamily of the L-Asparaginase type 2-like enzymes. The wider family includes Glycosylasparaginase, Taspase 1, and L-Asparaginase type 2 enzymes. The proenzymes undergo autoproteolytic cleavage before a threonine to generate alpha and beta subunits. The threonine becomes the N-terminal residue of the beta subunit and is the catalytic residue. ASRGL1, or asparaginase-like 1, has been cloned from mammalian testis cDNA libraries. It has been identified as a sperm antigen that may induce the production of autoantibodies following obstruction of the male reproductive tract, e.g. vasectomy.	289
271339	cd04703	Asparaginase_2_like_1	Uncharacterized subfamily of the L-Asparaginase type 2-like enzymes, an Ntn-hydrolase family. The wider family of Asparaginase 2-like enzymes includes Glycosylasparaginase, Taspase 1, and L-Asparaginase type 2. Glycosylasparaginase catalyzes the hydrolysis of the glycosylamide bond of asparagine-linked glycoprotein. Taspase1 catalyzes the cleavage of the Mix Lineage Leukemia (MLL) nuclear protein and transcription factor TFIIA. L-Asparaginase type 2 hydrolyzes L-asparagine to L-aspartate and ammonia. The proenzymes of this family undergo autoproteolytic cleavage before a threonine to generate alpha and beta subunits. The threonine becomes the N-terminal residue of the beta subunit and is the catalytic residue.	243
153093	cd04704	PLA2_bee_venom_like	PLA2_bee_venom_like: A sub-family of Phospholipase A2, similar to bee venom PLA2. PLA2 is a super-family of secretory and cytosolic enzymes; the latter are either Ca dependent or Ca independent. Enzymatically active PLA2 cleaves the sn-2 position of the glycerol backbone of phospholipids; secreted PLA2s have also been found to specifically bind to a variety of soluble and membrane proteins in mammals, including receptors. As a toxin, PLA2 is a potent presynaptic neurotoxin which blocks nerve terminals by binding to the nerve membrane and hydrolyzing stable membrane lipids. The products of the hydrolysis cannot form bilayers leading to a change in membrane conformation and ultimately to a block in the release of neurotransmitters. PLA2 may form dimers or oligomers. Bee venom PLA2 has fewer conserved disulfide bridges than most canonical PLA2s.	97
153094	cd04705	PLA2_group_III_like	PLA2_group_III_like: A sub-family of Phospholipase A2, similar to human group III PLA2. PLA2 is a super-family of secretory and cytosolic enzymes; the latter are either Ca dependent or Ca independent. Enzymatically active PLA2 cleaves the sn-2 position of the glycerol backbone of phospholipids; secreted PLA2s have also been found to specifically bind to a variety of soluble and membrane proteins in mammals, including receptors. As a toxin, PLA2 is a potent presynaptic neurotoxin which blocks nerve terminals by binding to the nerve membrane and hydrolyzing stable membrane lipids. The products of the hydrolysis cannot form bilayers leading to a change in membrane conformation and ultimately to a block in the release of neurotransmitters. PLA2 may form dimers or oligomers.	100
153095	cd04706	PLA2_plant	PLA2_plant: Plant-specific sub-family of Phospholipase A2, a super-family of secretory and cytosolic enzymes; the latter are either Ca dependent or Ca independent. Enzymatically active PLA2 cleaves the sn-2 position of the glycerol backbone of phospholipids; secreted PLA2s have also been found to specifically bind to a variety of soluble and membrane proteins in mammals, including receptors. As a toxin, PLA2 is a potent presynaptic neurotoxin which blocks nerve terminals by binding to the nerve membrane and hydrolyzing stable membrane lipids. The products of the hydrolysis cannot form bilayers leading to a change in membrane conformation and ultimately to a block in the release of neurotransmitters. PLA2 may form dimers or oligomers. This sub-family does not appear to have a conserved active site and metal-binding loop.	117
153096	cd04707	otoconin_90	otoconin_90: Phospholipase A2-like domains present in otoconin-90 and otoconin-95, mammal proteins that are principal matrix proteins of calcitic otoconia. Interactions involving otoconin-90 may trigger or constitute key events in otoconia formation. The PLA2-like domains in otoconins may have lost their metal-binding sites.	117
240059	cd04708	BAH_plantDCM_II	BAH, or Bromo Adjacent Homology domain, second copy present in DNA (Cytosine-5)-methyltransferases (DCM) from plants. DNA methylation, or the covalent addition of a methyl group to cytosine within the context of the CpG dinucleotide, has profound effects on the genome. These effects include transcriptional repression via inhibition of transcription factor binding, the recruitment of methyl-binding proteins and their associated chromatin remodeling factors, X chromosome inactivation, imprinting, and the suppression of parasitic DNA sequences. DNA methylation is also essential for proper embryonic development and is an important player in both DNA repair and genome stability. BAH domains are found in a variety of proteins playing roles in transcriptional silencing and the remodeling of chromatin. It is assumed that in most or all of these instances the BAH domain mediates protein-protein interactions.	202
240060	cd04709	BAH_MTA	BAH, or Bromo Adjacent Homology domain, as present in MTA1 and similar proteins. The Metastasis-associated protein MTA1 is part of the NURD (nucleosome remodeling and deacetylating) complex and plays a role in cellular transformation and metastasis. BAH domains are found in a variety of proteins playing roles in transcriptional silencing and the remodeling of chromatin. It is assumed that in most or all of these instances the BAH domain mediates protein-protein interactions.	164
240061	cd04710	BAH_fungalPHD	BAH, or Bromo Adjacent Homology domain, as present in fungal proteins containing PHD domains. BAH domains are found in a variety of proteins playing roles in transcriptional silencing and the remodeling of chromatin. It is assumed that in most or all of these instances the BAH domain mediates protein-protein interactions.	135
240062	cd04711	BAH_Dnmt1_II	BAH, or Bromo Adjacent Homology domain, second copy present in DNA (Cytosine-5)-methyltransferases from Bilateria, Dnmt1 and similar proteins. DNA methylation, or the covalent addition of a methyl group to cytosine within the context of the CpG dinucleotide, has profound effects on the genome. These effects include transcriptional repression via inhibition of transcription factor binding, the recruitment of methyl-binding proteins and their associated chromatin remodeling factors, X chromosome inactivation, imprinting, and the suppression of parasitic DNA sequences. DNA methylation is also essential for proper embryonic development and is an important player in both DNA repair and genome stability. BAH domains are found in a variety of proteins playing roles in transcriptional silencing and the remodeling of chromatin. It is assumed that in most or all of these instances the BAH domain mediates protein-protein interactions.	137
240063	cd04712	BAH_DCM_I	BAH, or Bromo Adjacent Homology domain, as present in DNA (Cytosine-5)-methyltransferases (DCM) 1. DNA methylation, or the covalent addition of a methyl group to cytosine within the context of the CpG dinucleotide, has profound effects on the genome. These effects include transcriptional repression via inhibition of transcription factor binding, the recruitment of methyl-binding proteins and their associated chromatin remodeling factors, X chromosome inactivation, imprinting, and the suppression of parasitic DNA sequences. DNA methylation is also essential for proper embryonic development and is an important player in both DNA repair and genome stability. BAH domains are found in a variety of proteins playing roles in transcriptional silencing and the remodeling of chromatin. It is assumed that in most or all of these instances the BAH domain mediates protein-protein interactions.	130
240064	cd04713	BAH_plant_3	BAH, or Bromo Adjacent Homology domain, plant-specific sub-family with unknown function. BAH domains are found in a variety of proteins playing roles in transcriptional silencing and the remodeling of chromatin. It is assumed that in most or all of these instances the BAH domain mediates protein-protein interactions.	146
240065	cd04714	BAH_BAHCC1	BAH, or Bromo Adjacent Homology domain, as present in mammalian BAHCC1 and similar proteins. BAHCC1 stands for BAH domain and coiled-coil containing 1. BAH domains are found in a variety of proteins playing roles in transcriptional silencing and the remodeling of chromatin. It is assumed that in most or all of these instances the BAH domain mediates protein-protein interactions.	121
240066	cd04715	BAH_Orc1p_like	BAH, or Bromo Adjacent Homology domain, as present in the Schizosaccharomyces pombe homolog of Saccharomyces cerevisiae Orc1p and similar proteins. Orc1 is part of the yeast Sir1-origin recognition complex; the Orc1p BAH domain functions in epigenetic silencing. BAH domains are found in a variety of proteins playing roles in transcriptional silencing and the remodeling of chromatin. It is assumed that in most or all of these instances the BAH domain mediates protein-protein interactions.	159
240067	cd04716	BAH_plantDCM_I	BAH, or Bromo Adjacent Homology domain, first copy present in DNA (Cytosine-5)-methyltransferases (DCM) from plants. DNA methylation, or the covalent addition of a methyl group to cytosine within the context of the CpG dinucleotide, has profound effects on the genome. These effects include transcriptional repression via inhibition of transcription factor binding, the recruitment of methyl-binding proteins and their associated chromatin remodeling factors, X chromosome inactivation, imprinting, and the suppression of parasitic DNA sequences. DNA methylation is also essential for proper embryonic development and is an important player in both DNA repair and genome stability. BAH domains are found in a variety of proteins playing roles in transcriptional silencing and the remodeling of chromatin. It is assumed that in most or all of these instances the BAH domain mediates protein-protein interactions.	122
240068	cd04717	BAH_polybromo	BAH, or Bromo Adjacent Homology domain, as present in polybromo and yeast RSC1/2. The human polybromo protein (BAF180) is a component of the SWI/SNF chromatin-remodeling complex PBAF. It is thought that polybromo participates in transcriptional regulation. Saccharomyces cerevisiae RSC1 and RSC2 are part of the 15-subunit nucleosome remodeling RSC complex. BAH domains are found in a variety of proteins playing roles in transcriptional silencing and the remodeling of chromatin. It is assumed that in most or all of these instances the BAH domain mediates protein-protein interactions.	121
240069	cd04718	BAH_plant_2	BAH, or Bromo Adjacent Homology domain, plant-specific sub-family with unknown function. BAH domains are found in a variety of proteins playing roles in transcriptional silencing and the remodeling of chromatin. It is assumed that in most or all of these instances the BAH domain mediates protein-protein interactions.	148
240070	cd04719	BAH_Orc1p_animal	BAH, or Bromo Adjacent Homology domain, as present in animal homologs of Saccharomyces cerevisiae Orc1p. Orc1 is part of the yeast Sir1-origin recognition complex. The Orc1p BAH domain functions in epigenetic silencing. In vertebrates, a similar ORC protein complex exists, which has been shown to be essential for DNA replication in Xenopus laevis. BAH domains are found in a variety of proteins playing roles in transcriptional silencing and the remodeling of chromatin. It is assumed that in most or all of these instances the BAH domain mediates protein-protein interactions.	128
240071	cd04720	BAH_Orc1p_Yeast	BAH, or Bromo Adjacent Homology domain, as present in Orc1p, which is part of the Saccharomyces cerevisiae Sir1-origin recognition complex, and as present in Sir3p. The Orc1p BAH domain functions in epigenetic silencing. BAH domains are found in a variety of proteins playing roles in transcriptional silencing and the remodeling of chromatin. It is assumed that in most or all of these instances the BAH domain mediates protein-protein interactions.	179
240072	cd04721	BAH_plant_1	BAH, or Bromo Adjacent Homology domain, plant-specific sub-family with unknown function. BAH domains are found in a variety of proteins playing roles in transcriptional silencing and the remodeling of chromatin. It is assumed that in most or all of these instances the BAH domain mediates protein-protein interactions.	130
240073	cd04722	TIM_phosphate_binding	TIM barrel proteins share a structurally conserved phosphate binding motif and in general share an eight beta/alpha closed barrel structure. Specific for this family is the conserved phosphate binding site at the edges of strands 7 and 8. The phosphate comes either from the substrate, as in the case of inosine monophosphate dehydrogenase (IMPDH), or from ribulose-5-phosphate 3-epimerase (RPE) or from cofactors, like FMN.	200
240074	cd04723	HisA_HisF	Phosphoribosylformimino-5-aminoimidazole carboxamide ribonucleotide (ProFAR) isomerase (HisA) and the cyclase subunit of imidazoleglycerol phosphate synthase (HisF). The ProFAR isomerase catalyzes the fourth step in histidine biosynthesis, an isomerisation of the aminoaldose moiety of ProFAR to the aminoketose of PRFAR (N-(5'-phospho-D-1'-ribulosylformimino)-5-amino-1-(5''-phospho-ribosyl)-4-imidazolecarboxamide). In bacteria and archaea, ProFAR isomerase is encoded by the HisA gene. The Imidazole glycerol phosphate synthase (IGPS) catalyzes the fifth step of histidine biosynthesis, the formation of the imidazole ring. IGPS converts N1-(5'-phosphoribulosyl)-formimino-5-aminoimidazole-4-carboxamide ribonucleotide (PRFAR) to imidazole glycerol phosphate (ImGP) and 5'-(5-aminoimidazole-4-carboxamide) ribonucleotide (AICAR). This conversion involves two tightly coupled reactions in distinct active sites of IGPS. The two catalytic domains can be fused, as in fungi and plants, or the reactions can be performed by a heterodimer (HisH-glutaminase and HisF-cyclase), as in bacteria.	233
240075	cd04724	Tryptophan_synthase_alpha	Tryptophan synthase (TRPS) alpha subunit (TSA). TRPS is a bifunctional tetrameric enzyme (2 alpha and 2 beta subunits) that catalyzes the last two steps of L-tryptophan biosynthesis. The alpha and beta subunits catalyze two distinct reactions which are both strongly stimulated by the formation of the complex. The alpha subunit catalyzes the cleavage of indole 3-glycerol phosphate (IGP) to indole and D-glyceraldehyde 3-phosphate (G3P). Indole is then channeled to the active site of the beta subunit, a PLP-dependent enzyme that catalyzes a replacement reaction to convert L-serine into L-tryptophan.	242
240076	cd04725	OMP_decarboxylase_like	Orotidine 5'-phosphate decarboxylase (ODCase) is a dimeric enzyme that decarboxylates orotidine 5'-monophosphate (OMP) to form uridine 5'-phosphate (UMP), an essential step in the pyrimidine biosynthetic pathway. In mammals, UMP synthase contains two domains:  the orotate phosphoribosyltransferase (OPRTase) domain that catalyzes the transfer of phosphoribosyl 5'-pyrophosphate (PRPP) to orotate to form OMP, and the orotidine-5'-phosphate decarboxylase (ODCase) domain that decarboxylates OMP to form UMP.	216
240077	cd04726	KGPDC_HPS	3-Keto-L-gulonate 6-phosphate decarboxylase (KGPDC) and D-arabino-3-hexulose-6-phosphate synthase (HPS). KGPDC catalyzes the formation of L-xylulose 5-phosphate and carbon dioxide from 3-keto-L-gulonate 6-phosphate as part of the anaerobic pathway for L-ascorbate utilization in some eubacteria. HPS catalyzes the formation of D-arabino-3-hexulose-6-phosphate from D-ribulose 5-phosphate and formaldehyde in microorganisms that can use formaldehyde as a carbon source. Both catalyze reactions that involve the Mg2+-assisted formation and stabilization of 1,2-enediolate reaction intermediates.	202
240078	cd04727	pdxS	PdxS is a subunit of the pyridoxal 5'-phosphate (PLP) synthase, an important enzyme in deoxyxylulose 5-phosphate (DXP)-independent pathway for de novo biosynthesis of PLP,  present in some eubacteria, in archaea, fungi, plants, plasmodia, and some metazoa. Together with PdxT, PdxS forms the PLP synthase, a heteromeric glutamine amidotransferase (GATase), whereby PdxT produces ammonia from glutamine and PdxS combines ammonia with five- and three-carbon phosphosugars to form PLP. PLP is the biologically active form of vitamin B6, an essential cofactor in many biochemical processes. PdxS subunits form two hexameric rings.	283
240079	cd04728	ThiG	Thiazole synthase (ThiG) is the tetrameric enzyme that is involved in the formation of the thiazole moiety of thiamin pyrophosphate, an essential ubiquitous cofactor that plays an important role in carbohydrate and amino acid metabolism. ThiG catalyzes the formation of thiazole from 1-deoxy-D-xylulose 5-phosphate (DXP) and dehydroglycine, with the help of the sulfur carrier protein ThiS that carries the sulfur needed for thiazole assembly on its carboxy terminus (ThiS-COSH).	248
240080	cd04729	NanE	N-acetylmannosamine-6-phosphate epimerase (NanE) converts N-acetylmannosamine-6-phosphate to N-acetylglucosamine-6-phosphate. This reaction is part of the pathway that allows the usage of sialic acid as a carbohydrate source. Sialic acids are a family of related sugars that are found as a component of glycoproteins, gangliosides, and other sialoglycoconjugates.	219
240081	cd04730	NPD_like	2-Nitropropane dioxygenase (NPD), one of the nitroalkane oxidizing enzyme families, catalyzes oxidative denitrification of nitroalkanes to their corresponding carbonyl compounds and nitrites. NPD is a member of the NAD(P)H-dependent flavin oxidoreductase family that reduce a range of alternative electron acceptors. Most use FAD/FMN as a cofactor and NAD(P)H as electron donor. Some contain a 4Fe-4S cluster to transfer electrons from FAD to FMN.	236
240082	cd04731	HisF	The cyclase subunit of imidazoleglycerol phosphate synthase (HisF). Imidazole glycerol phosphate synthase (IGPS) catalyzes the fifth step of histidine biosynthesis, the formation of the imidazole ring. IGPS converts N1-(5'-phosphoribulosyl)-formimino-5-aminoimidazole-4-carboxamide ribonucleotide (PRFAR) to imidazole glycerol phosphate (ImGP) and 5'-(5-aminoimidazole-4-carboxamide) ribonucleotide (AICAR). This conversion involves two tightly coupled reactions in distinct active sites of IGPS. The two catalytic domains can be fused, as in fungi and plants, or the reactions can be performed by a heterodimer (HisH-glutaminase and HisF-cyclase), as in bacteria.	243
240083	cd04732	HisA	HisA.  Phosphoribosylformimino-5-aminoimidazole carboxamide ribonucleotide (ProFAR) isomerase catalyzes the fourth step in histidine biosynthesis, an isomerisation of the aminoaldose moiety of ProFAR to the aminoketose of PRFAR (N-(5'-phospho-D-1'-ribulosylformimino)-5-amino-1-(5''-phospho-ribosyl)-4-imidazolecarboxamide). In bacteria and archaea, ProFAR isomerase is encoded by the HisA gene.	234
240084	cd04733	OYE_like_2_FMN	Old yellow enzyme (OYE)-related FMN binding domain, group 2. Each monomer of OYE contains FMN as a non-covalently bound cofactor and uses NADPH as a reducing agent; oxygens, quinones, and alpha,beta-unsaturated aldehydes and ketones can act as electron acceptors in the catalytic reaction. Other members of OYE family include trimethylamine dehydrogenase, 2,4-dienoyl-CoA reductase, enoate reductase, pentaerythritol tetranitrate reductase, xenobiotic reductase, and morphinone reductase.	338
240085	cd04734	OYE_like_3_FMN	Old yellow enzyme (OYE)-related FMN binding domain, group 3. Each monomer of OYE contains FMN as a non-covalently bound cofactor and uses NADPH as a reducing agent; oxygens, quinones, and alpha,beta-unsaturated aldehydes and ketones can act as electron acceptors in the catalytic reaction. Other members of OYE family include trimethylamine dehydrogenase, 2,4-dienoyl-CoA reductase, enoate reductase, pentaerythritol tetranitrate reductase, xenobiotic reductase, and morphinone reductase. One member of this subgroup, the Sinorhizobium meliloti stachydrine utilization protein stcD, has been identified as a putative N-methylproline demethylase.	343
240086	cd04735	OYE_like_4_FMN	Old yellow enzyme (OYE)-related FMN binding domain, group 4. Each monomer of OYE contains FMN as a non-covalently bound cofactor and uses NADPH as a reducing agent; oxygens, quinones, and alpha,beta-unsaturated aldehydes and ketones can act as electron acceptors in the catalytic reaction. Other members of OYE family include trimethylamine dehydrogenase, 2,4-dienoyl-CoA reductase, enoate reductase, pentaerythritol tetranitrate reductase, xenobiotic reductase, and morphinone reductase.	353
240087	cd04736	MDH_FMN	Mandelate dehydrogenase (MDH)-like FMN-binding domain.  MDH is part of a widespread family of homologous FMN-dependent a-hydroxy acid oxidizing enzymes that oxidizes (S)-mandelate to phenylglyoxalate. MDH is an enzyme in the mandelate pathway that occurs in several strains of Pseudomonas which converts (R)-mandelate to benzoate. This family occurs in both prokaryotes and eukaryotes. Members of this family include flavocytochrome b2 (FCB2), glycolate oxidase (GOX), lactate monooxygenase (LMO), mandelate dehydrogenase (MDH), and long chain hydroxyacid oxidase (LCHAO).	361
240088	cd04737	LOX_like_FMN	L-Lactate oxidase (LOX) FMN-binding domain. LOX is a member of the family of FMN-containing alpha-hydroxyacid oxidases and catalyzes the oxidation of l-lactate using molecular oxygen to generate pyruvate and H2O2.  This family occurs in both prokaryotes and eukaryotes. Members of this family include flavocytochrome b2 (FCB2), glycolate oxidase (GOX), lactate monooxygenase (LMO), mandelate dehydrogenase (MDH), and long chain hydroxyacid oxidase (LCHAO).	351
240089	cd04738	DHOD_2_like	Dihydroorotate dehydrogenase (DHOD) class 2. DHOD catalyzes the oxidation of (S)-dihydroorotate to orotate. This is the fourth step and the only redox reaction in the de novo biosynthesis of UMP, the precursor of all pyrimidine nucleotides. DHOD requires FMN as co-factor. DHOD divides into class 1 and class 2 based on their amino acid sequences, their cellular location and their natural electron acceptor used to reoxidize the flavin group. Members of class 1 are cytosolic enzymes and multimers, while class 2 enzymes are membrane associated, monomeric and use respiratory quinones as their physiological electron acceptors.	327
240090	cd04739	DHOD_like	Dihydroorotate dehydrogenase (DHOD) like proteins.  DHOD catalyzes the oxidation of (S)-dihydroorotate to orotate. This is the fourth step and the only redox reaction in the de novo biosynthesis of UMP, the precursor of all pyrimidine nucleotides. DHOD requires FMN as co-factor. DHOD divides into class 1 and class 2 based on their amino acid sequences and cellular location. Members of class 1 are cytosolic enzymes and multimers while class 2 enzymes are membrane associated and monomeric. The class 1 enzymes can be further divided into subtypes 1A and 1B which are homodimers and heterotetrameric proteins, respectively.  This subgroup has the conserved FMN binding site, but lacks some catalytic residues and may therefore be inactive.	325
240091	cd04740	DHOD_1B_like	Dihydroorotate dehydrogenase (DHOD) class 1B FMN-binding domain. DHOD catalyzes the oxidation of (S)-dihydroorotate to orotate. This is the fourth step and the only redox reaction in the de novo biosynthesis of UMP, the precursor of all pyrimidine nucleotides. DHOD requires FMN as co-factor. DHOD divides into class 1 and class 2 based on their amino acid sequences and cellular location. Members of class 1 are cytosolic enzymes and multimers while class 2 enzymes are membrane associated and monomeric. The class 1 enzymes can be further divided into subtypes 1A and 1B which are homodimers and heterotetrameric proteins, respectively.	296
240092	cd04741	DHOD_1A_like	Dihydroorotate dehydrogenase (DHOD) class 1A FMN-binding domain. DHOD catalyzes the oxidation of (S)-dihydroorotate to orotate. This is the fourth step and the only redox reaction in the de novo biosynthesis of UMP, the precursor of all pyrimidine nucleotides. DHOD requires FMN as co-factor. DHOD divides into class 1 and class 2 based on their amino acid sequences and cellular location. Members of class 1 are cytosolic enzymes and multimers while class 2 enzymes are membrane associated and monomeric. The class 1 enzymes can be further divided into subtypes 1A and 1B which are homodimers and heterotetrameric proteins, respectively.	294
240093	cd04742	NPD_FabD	2-Nitropropane dioxygenase (NPD)-like domain, associated with the (acyl-carrier-protein) S-malonyltransferase FabD. NPD is part of the nitroalkane-oxidizing enzyme family, which catalyzes oxidative denitrification of nitroalkanes to their corresponding carbonyl compounds and nitrites. NPDs are members of the NAD(P)H-dependent flavin oxidoreductase family that reduce a range of alternative electron acceptors. Most use FAD/FMN as a cofactor and NAD(P)H as electron donor. Some contain a 4Fe-4S cluster to transfer electrons from FAD to FMN.	418
240094	cd04743	NPD_PKS	2-Nitropropane dioxygenase (NPD)-like domain, associated with polyketide synthases (PKS). NPD is part of the nitroalkane-oxidizing enzyme family, which catalyzes oxidative denitrification of nitroalkanes to their corresponding carbonyl compounds and nitrites. NPDs are members of the NAD(P)H-dependent flavin oxidoreductase family that reduce a range of alternative electron acceptors. Most use FAD/FMN as a cofactor and NAD(P)H as electron donor. Some contain a 4Fe-4S cluster to transfer electrons from FAD to FMN.	320
100058	cd04745	LbH_paaY_like	paaY-like: This group is composed of uncharacterized proteins with similarity to the protein product of the E. coli paaY gene, which is part of the paa gene cluster responsible for phenylacetic acid degradation. Proteins in this group are expected to adopt the left-handed parallel beta-helix (LbH) structure. They contain imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X). Similarity to gamma carbonic anhydrase and Ferripyochelin Binding Protein (FBP) may suggest metal binding capacity.	155
240095	cd04747	OYE_like_5_FMN	Old yellow enzyme (OYE)-related FMN binding domain, group 5. Each monomer of OYE contains FMN as a non-covalently bound cofactor and uses NADPH as a reducing agent; oxygens, quinones, and alpha,beta-unsaturated aldehydes and ketones can act as electron acceptors in the catalytic reaction. Other members of OYE family include trimethylamine dehydrogenase, 2,4-dienoyl-CoA reductase, enoate reductase, pentaerythritol tetranitrate reductase, xenobiotic reductase, and morphinone reductase.	361
240096	cd04748	Commd	COMM_Domain, a family of domains found at the C-terminus of HCarG, the copper metabolism gene MURR1 product, and related proteins. Presumably all COMM_Domain containing proteins are located in the nucleus and the COMM domain plays a role in protein-protein interactions. Several family members have been shown to bind and inhibit NF-kappaB. Murr1/Commd1 is a protein involved in copper homeostasis, which has also been identified as a regulator of the human delta epithelial sodium channel. HCaRG, a nuclear protein that might be involved in cell proliferation, is negatively regulated by extracellular calcium concentration, and its basal mRNA levels are higher in hypertensive animals.	87
240097	cd04749	Commd1_MURR1	COMM_Domain containing protein 1, also called Murr1. Murr1/Commd1 is a protein involved in copper homeostasis, which has also been identified as a regulator of the human delta epithelial sodium channel. The COMM Domain is found at the C-terminus of a variety of proteins; presumably all COMM_Domain containing proteins are located in the nucleus and the COMM domain plays a role in protein-protein interactions. Several family members have been shown to bind and inhibit NF-kappaB.	174
240098	cd04750	Commd2	COMM_Domain containing protein 2. The COMM Domain is found at the C-terminus of a variety of proteins; presumably all COMM_Domain containing proteins are located in the nucleus and the COMM domain plays a role in protein-protein interactions. Several family members have been shown to bind and inhibit NF-kappaB.	166
240099	cd04751	Commd3	COMM_Domain containing protein 3. The COMM Domain is found at the C-terminus of a variety of proteins; presumably all COMM_Domain containing proteins are located in the nucleus and the COMM domain plays a role in protein-protein interactions. Several family members have been shown to bind and inhibit NF-kappaB.	95
240100	cd04752	Commd4	COMM_Domain containing protein 4. The COMM Domain is found at the C-terminus of a variety of proteins; presumably all COMM_Domain containing proteins are located in the nucleus and the COMM domain plays a role in protein-protein interactions. Several family members have been shown to bind and inhibit NF-kappaB.	174
240101	cd04753	Commd5_HCaRG	COMM_Domain containing protein 5, also called HCaRG (hypertension-related, calcium-regulated gene). HCaRG is a nuclear protein that might be involved in cell proliferation; it is negatively regulated by extracellular calcium concentration, and its basal mRNA levels are higher in hypertensive animals. The COMM Domain is found at the C-terminus of a variety of proteins; presumably all COMM_Domain containing proteins are located in the nucleus and the COMM domain plays a role in protein-protein interactions. Several family members have been shown to bind and inhibit NF-kappaB.	110
240102	cd04754	Commd6	COMM_Domain containing protein 6. The COMM Domain is found at the C-terminus of a variety of proteins; presumably all COMM_Domain containing proteins are located in the nucleus and the COMM domain plays a role in protein-protein interactions. Several family members have been shown to bind and inhibit NF-kappaB.	86
240103	cd04755	Commd7	COMM_Domain containing protein 7. The COMM Domain is found at the C-terminus of a variety of proteins; presumably all COMM_Domain containing proteins are located in the nucleus and the COMM domain plays a role in protein-protein interactions. Several family members have been shown to bind and inhibit NF-kappaB.	180
240104	cd04756	Commd8	COMM_Domain containing protein 8. The COMM Domain is found at the C-terminus of a variety of proteins; presumably all COMM_Domain containing proteins are located in the nucleus and the COMM domain plays a role in protein-protein interactions. Several family members have been shown to bind and inhibit NF-kappaB.	176
240105	cd04757	Commd9	COMM_Domain containing protein 9. The COMM Domain is found at the C-terminus of a variety of proteins; presumably all COMM_Domain containing proteins are located in the nucleus and the COMM domain plays a role in protein-protein interactions. Several family members have been shown to bind and inhibit NF-kappaB.	108
240106	cd04758	Commd10	COMM_Domain containing protein 10. The COMM Domain is found at the C-terminus of a variety of proteins; presumably all COMM_Domain containing proteins are located in the nucleus and the COMM domain plays a role in protein-protein interactions. Several family members have been shown to bind and inhibit NF-kappaB.	186
212498	cd04759	Rib_hydrolase	ADP-ribosyl cyclase, also known as cyclic ADP-ribose hydrolase or CD38. ADP-ribosyl cyclase (EC:3.2.2.5) synthesizes the second messenger cyclic-ADP ribose (cADPR), which in turn releases calcium from internal stores. Mammals possess two membrane proteins, CD38 and BST-1/CD157, which exhibit ADP-ribosyl cyclase activity, as well as intracellular soluble ADP-ribose cyclases. CD38 is involved in differentiation, adhesion, and cell proliferation, and has been implicated in diseases such as AIDS, diabetes, and B-cell chronic lymphocytic leukemia. The extramembrane domain of CD38 acts as a multifunctional enzyme, and can synthesize cADPR from NAD+, hydrolyze NAD+ and cADPR to ADPR, as well as catalyze the exchange of the nicotinamide group of NADP+ with nicotinic acid under acidic conditions, to yield NAADP+ (nicotinic acid-adenine dinucleotide phosphate), a metabolite involved in Ca2+ mobilization from acidic stores.	244
240107	cd04760	BAH_Dnmt1_I	BAH, or Bromo Adjacent Homology domain, first copy present in DNA (Cytosine-5)-methyltransferases from Bilateria, Dnmt1 and similar proteins. DNA methylation, or the covalent addition of a methyl group to cytosine within the context of the CpG dinucleotide, has profound effects on the genome. These effects include transcriptional repression via inhibition of transcription factor binding, the recruitment of methyl-binding proteins and their associated chromatin remodeling factors, X chromosome inactivation, imprinting, and the suppression of parasitic DNA sequences. DNA methylation is also essential for proper embryonic development and is an important player in both DNA repair and genome stability. BAH domains are found in a variety of proteins playing roles in transcriptional silencing and the remodeling of chromatin. It is assumed that in most or all of these instances the BAH domain mediates protein-protein interactions.	124
133389	cd04761	HTH_MerR-SF	Helix-Turn-Helix DNA binding domain of transcription regulators from the MerR superfamily. Helix-turn-helix (HTH) transcription regulator MerR superfamily, N-terminal domain. The MerR family transcription regulators have been shown to mediate responses to stress including exposure to heavy metals, drugs, or oxygen radicals in eubacterial and some archaeal species. They regulate transcription of multidrug/metal ion transporter genes and oxidative stress regulons by reconfiguring the spacer between the -35 and -10 promoter elements. A typical MerR regulator is comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their N-terminal domains are homologous and contain a DNA-binding winged HTH motif, while the C-terminal domains are often dissimilar and bind specific coactivator molecules such as metal ions, drugs, and organic substrates.	49
133390	cd04762	HTH_MerR-trunc	Helix-Turn-Helix DNA binding domain of truncated MerR-like proteins. Proteins in this family mostly have a truncated helix-turn-helix (HTH) MerR-like domain. They lack a portion of the C-terminal region, called Wing 2 and the long dimerization helix that is typically present in MerR-like proteins. These truncated domains are found in response regulator receiver (REC) domain proteins (i.e., CheY), cytosine-C5 specific DNA methylases, IS607 transposase-like proteins, and RacA, a bacterial protein that anchors chromosomes to cell poles.	49
133391	cd04763	HTH_MlrA-like	Helix-Turn-Helix DNA binding domain of MlrA-like transcription regulators. Helix-turn-helix (HTH) transcription regulator MlrA (merR-like regulator A) and related proteins, N-terminal domain. The MlrA protein, also known as YehV, has been shown to control cell-cell aggregation by co-regulating the expression of curli and extracellular matrix production in Escherichia coli and Salmonella typhimurium. Its close homolog, CarA from Myxococcus xanthus, is involved in activation of the carotenoid biosynthesis genes by light. These proteins belong to the MerR superfamily of transcription regulators that promote expression of several stress regulon genes by reconfiguring the spacer between the -35 and -10 promoter elements. Their conserved N-terminal domains contain predicted HTH motifs that mediate DNA binding, while the dissimilar C-terminal domains bind specific coactivator molecules. Many MlrA-like proteins in this group appear to lack the long dimerization helix seen in the N-terminal domains of typical MerR-like proteins.	68
133392	cd04764	HTH_MlrA-like_sg1	Helix-Turn-Helix DNA binding domain of putative MlrA-like transcription regulators. Putative helix-turn-helix (HTH) MlrA-like transcription regulators (subgroup 1). The MlrA protein, also known as YehV, has been shown to control cell-cell aggregation by co-regulating the expression of curli and extracellular matrix production in Escherichia coli and Salmonella typhimurium. These proteins belong to the MerR superfamily of transcription regulators that promote expression of several stress regulon genes by reconfiguring the spacer between the -35 and -10 promoter elements. Their conserved N-terminal domains contain predicted HTH motifs that mediate DNA binding, while the dissimilar C-terminal domains bind specific coactivator molecules. Many MlrA-like proteins in this group appear to lack the long dimerization helix seen in the N-terminal domains of typical MerR-like proteins.	67
133393	cd04765	HTH_MlrA-like_sg2	Helix-Turn-Helix DNA binding domain of putative MlrA-like transcription regulators. Putative helix-turn-helix (HTH) MlrA-like transcription regulators (subgroup 2), N-terminal domain. The MlrA protein, also known as YehV, has been shown to control cell-cell aggregation by co-regulating the expression of curli and extracellular matrix production in Escherichia coli and Salmonella typhimurium. These proteins belong to the MerR superfamily of transcription regulators that promote expression of several stress regulon genes by reconfiguring the spacer between the -35 and -10 promoter elements. Their conserved N-terminal domains contain predicted HTH motifs that mediate DNA binding, while the dissimilar C-terminal domains bind specific coactivator molecules.	99
133394	cd04766	HTH_HspR	Helix-Turn-Helix DNA binding domain of the HspR transcription regulator. Helix-turn-helix (HTH) transcription regulator HspR, N-terminal domain. Heat shock protein regulators (HspR) have been shown to regulate expression of specific regulons in response to high temperature or high osmolarity in Streptomyces and Helicobacter, respectively. These proteins share the N-terminal DNA binding domain with other transcription regulators of the MerR superfamily that promote transcription by reconfiguring the spacer between the -35 and -10 promoter elements. A typical MerR regulator is comprised of distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their conserved N-terminal domains contain predicted winged HTH motifs that mediate DNA binding, while the dissimilar C-terminal domains bind specific coactivator molecules.	91
133395	cd04767	HTH_HspR-like_MBC	Helix-Turn-Helix DNA binding domain of putative HspR-like transcription regulators. Putative helix-turn-helix (HTH) transcription regulator HspR-like proteins. Unlike the characterized HspR, these proteins have a C-terminal domain with putative metal binding cysteines (MBC). Heat shock protein regulators (HspR) have been shown to regulate expression of specific regulons in response to high temperature or high osmolarity in Streptomyces and Helicobacter, respectively. These proteins share the N-terminal DNA binding domain with other transcription regulators of the MerR superfamily that promote transcription by reconfiguring the spacer between the -35 and -10 promoter elements. A typical MerR regulator is comprised of distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their conserved N-terminal domains contain predicted winged HTH motifs that mediate DNA binding, while the dissimilar C-terminal domains bind specific coactivator molecules.	120
133396	cd04768	HTH_BmrR-like	Helix-Turn-Helix DNA binding domain of BmrR-like transcription regulators. Helix-turn-helix (HTH) BmrR-like transcription regulators (TipAL, Mta, SkgA, BmrR, and BltR), N-terminal domain. These proteins have been shown to regulate expression of specific regulons in response to various toxic substances, antibiotics, or oxygen radicals in Bacillus subtilis, Streptomyces, and Caulobacter crescentus. They are comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their conserved N-terminal domains contain  HTH motifs that mediate DNA binding, while the C-terminal domains are often unrelated and bind specific coactivator molecules. These proteins share the N-terminal DNA binding domain with other transcription regulators of the MerR superfamily that promote transcription by reconfiguring the spacer between the -35 and -10 promoter elements.	96
133397	cd04769	HTH_MerR2	Helix-Turn-Helix DNA binding domain of MerR2-like transcription regulators. Helix-turn-helix (HTH) transcription regulator MerR2 and related proteins. MerR2 in Bacillus cereus RC607 regulates resistance to organomercurials. The MerR family transcription regulators have been shown to mediate responses to stress including exposure to heavy metals, drugs, or oxygen radicals in eubacterial and some archaeal species. They regulate transcription by reconfiguring the spacer between the -35 and -10 promoter elements. A typical MerR regulator is comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their N-terminal domains are homologous and contain a DNA-binding winged HTH motif, while the C-terminal domains are often dissimilar and bind specific coactivator molecules such as metal ions, drugs, and organic substrates.	116
133398	cd04770	HTH_HMRTR	Helix-Turn-Helix DNA binding domain of Heavy Metal Resistance transcription regulators. Helix-turn-helix (HTH) heavy metal resistance transcription regulators (HMRTR): MerR1 (mercury), CueR (copper),  CadR (cadmium),  PbrR (lead), ZntR (zinc), and other related proteins. These transcription regulators mediate responses to heavy metal stress in eubacteria. They belong to the MerR superfamily of transcription regulators that promote transcription of various stress regulons by reconfiguring the operator sequence located between the -35 and -10 promoter elements. A typical MerR regulator is comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their N-terminal domains are homologous and contain a DNA-binding winged HTH motif, while the C-terminal domains are often dissimilar and bind specific coactivator molecules such as metal ions, drugs, and organic substrates.	123
133399	cd04772	HTH_TioE_rpt1	First Helix-Turn-Helix DNA binding domain of the regulatory protein TioE. Putative helix-turn-helix (HTH) regulatory protein, TioE, and related proteins. TioE is part of the thiocoraline gene cluster, which is involved in the biosynthesis of the antitumor thiocoraline from the marine actinomycete, Micromonospora. These proteins share the N-terminal DNA binding domain with other transcription regulators of the MerR superfamily that promote transcription by reconfiguring the spacer between the -35 and -10 promoter elements. Proteins in this family are unique within the MerR superfamily in that they are composed of just two adjacent MerR-like N-terminal domains; this CD contains the N-terminal or first repeat (rpt1) of these tandem MerR-like domain proteins.	99
133400	cd04773	HTH_TioE_rpt2	Second Helix-Turn-Helix DNA binding domain of the regulatory protein TioE. Putative helix-turn-helix (HTH) regulatory protein, TioE, and related proteins. TioE is part of the thiocoraline gene cluster, which is involved in the biosynthesis of the antitumor thiocoraline from the marine actinomycete, Micromonospora. These proteins share the N-terminal DNA binding domain with other transcription regulators of the MerR superfamily that promote transcription by reconfiguring the spacer between the -35 and -10 promoter elements. Proteins in this family are unique within the MerR superfamily in that they are composed of just two adjacent MerR-like N-terminal domains; this CD mainly contains the C-terminal or second repeat (rpt2) of these tandem MerR-like domain proteins.	108
133401	cd04774	HTH_YfmP	Helix-Turn-Helix DNA binding domain of the YfmP transcription regulator. Helix-turn-helix (HTH) transcription regulator, YfmP, and related proteins; N-terminal domain. YfmP regulates the multidrug efflux protein, YfmO, and indirectly regulates the expression of the Bacillus subtilis copZA operon encoding a metallochaperone, CopZ, and a CPx-type ATPase efflux protein, CopA. These proteins belong to the MerR superfamily of transcription regulators that promote expression of several stress regulon genes by reconfiguring the spacer between the -35 and -10 promoter elements. Their conserved N-terminal domains contain predicted winged HTH motifs that mediate DNA binding, while the dissimilar C-terminal domains bind specific coactivator molecules.	96
133402	cd04775	HTH_Cfa-like	Helix-Turn-Helix DNA binding domain of Cfa-like transcription regulators. Putative helix-turn-helix (HTH) MerR-like transcription regulators; the HTH domain of Cfa, a cyclopropane fatty acid synthase, and other related methyltransferases, as well as the N-terminal domain of a conserved, uncharacterized ~172 a.a. protein. Based on sequence similarity of the N-terminal domain, these proteins are predicted to function as transcription regulators that mediate responses to stress in eubacteria. They belong to the MerR superfamily of transcription regulators that promote transcription of various stress regulons by reconfiguring the operator sequence located between the -35 and -10 promoter elements. A typical MerR regulator is comprised of distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their N-terminal domains are homologous and contain a DNA-binding winged HTH motif, while the C-terminal domains are often dissimilar and bind specific coactivator molecules such as metal ions, drugs, and organic substrates.	102
133403	cd04776	HTH_GnyR	Helix-Turn-Helix DNA binding domain of the regulatory protein GnyR. Putative helix-turn-helix (HTH) regulatory protein, GnyR, and other related proteins. GnyR belongs to the gnyRDBHAL cluster, which is involved in acyclic isoprenoid degradation in Pseudomonas aeruginosa. These proteins share the N-terminal DNA binding domain with other transcription regulators of the MerR superfamily that promote transcription by reconfiguring the spacer between the -35 and -10 promoter elements. A typical MerR regulator is comprised of distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their conserved N-terminal domains contain predicted winged HTH motifs that mediate DNA binding, while the dissimilar C-terminal domains bind specific coactivator molecules.	118
133404	cd04777	HTH_MerR-like_sg1	Helix-Turn-Helix DNA binding domain of putative transcription regulators from the MerR superfamily. Putative helix-turn-helix (HTH) MerR-like transcription regulators (subgroup 1), N-terminal domain. Based on sequence similarity, these proteins are predicted to function as transcription regulators that mediate responses to stress in eubacteria. They belong to the MerR superfamily of transcription regulators that promote transcription of various stress regulons by reconfiguring the operator sequence located between the -35 and -10 promoter elements. A typical MerR regulator is comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their N-terminal domains are homologous and contain a DNA-binding winged HTH motif, while the C-terminal domains are often dissimilar and bind specific coactivator molecules such as metal ions, drugs, and organic substrates.	107
133405	cd04778	HTH_MerR-like_sg2	Helix-Turn-Helix DNA binding domain of putative transcription regulators from the MerR superfamily. Putative helix-turn-helix (HTH) MerR-like transcription regulators (subgroup 2). Based on sequence similarity, these proteins are predicted to function as transcription regulators that mediate responses to stress in eubacteria. They belong to the MerR superfamily of transcription regulators that promote transcription of various stress regulons by reconfiguring the operator sequence located between the -35 and -10 promoter elements. A typical MerR regulator is comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their N-terminal domains are homologous and contain a DNA-binding winged HTH motif, while the C-terminal domains are often dissimilar and bind specific coactivator molecules such as metal ions, drugs, and organic substrates.	219
133406	cd04779	HTH_MerR-like_sg4	Helix-Turn-Helix DNA binding domain of putative transcription regulators from the MerR superfamily. Putative helix-turn-helix (HTH) MerR-like transcription regulators (subgroup 4). Based on sequence similarity, these proteins are predicted to function as transcription regulators that mediate responses to stress in eubacteria. They belong to the MerR superfamily of transcription regulators that promote transcription of various stress regulons by reconfiguring the operator sequence located between the -35 and -10 promoter elements. A typical MerR regulator is comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their N-terminal domains are homologous and contain a DNA-binding winged HTH motif, while the C-terminal domains are often dissimilar and bind specific coactivator molecules such as metal ions, drugs, and organic substrates.	134
133407	cd04780	HTH_MerR-like_sg5	Helix-Turn-Helix DNA binding domain of putative transcription regulators from the MerR superfamily. Putative helix-turn-helix (HTH) MerR-like transcription regulators (subgroup 5), N-terminal domain. Based on sequence similarity, these proteins are predicted to function as transcription regulators that mediate responses to stress in eubacteria. They belong to the MerR superfamily of transcription regulators that promote transcription of various stress regulons by reconfiguring the operator sequence located between the -35 and -10 promoter elements. A typical MerR regulator is comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their N-terminal domains are homologous and contain a DNA-binding winged HTH motif, while the C-terminal domains are often dissimilar and bind specific coactivator molecules such as metal ions, drugs, and organic substrates.	95
133408	cd04781	HTH_MerR-like_sg6	Helix-Turn-Helix DNA binding domain of putative transcription regulators from the MerR superfamily. Putative helix-turn-helix (HTH) MerR-like transcription regulators (subgroup 6) with at least two conserved cysteines present in the C-terminal portion of the protein. Based on sequence similarity, these proteins are predicted to function as transcription regulators that mediate responses to stress in eubacteria. They belong to the MerR superfamily of transcription regulators that promote transcription of various stress regulons by reconfiguring the operator sequence located between the -35 and -10 promoter elements. A typical MerR regulator is comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their N-terminal domains are homologous and contain a DNA-binding winged HTH motif, while the C-terminal domains are often dissimilar and bind specific coactivator molecules such as metal ions, drugs, and organic substrates.	120
133409	cd04782	HTH_BltR	Helix-Turn-Helix DNA binding domain of the BltR transcription regulator. Helix-turn-helix (HTH) multidrug-efflux transporter transcription regulator, BltR (BmrR-like transporter) of Bacillus subtilis, and related proteins; N-terminal domain. Blt, like Bmr, is a membrane protein which causes the efflux of a variety of toxic substances and antibiotics. These regulators are comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their conserved N-terminal domains contain predicted winged HTH motifs that mediate DNA binding, while the C-terminal domains are often unrelated and bind specific coactivator molecules. They share the N-terminal DNA binding domain with other transcription regulators of the MerR superfamily that promote transcription by reconfiguring the spacer between the -35 and -10 promoter elements.	97
133410	cd04783	HTH_MerR1	Helix-Turn-Helix DNA binding domain of the MerR1 transcription regulator. Helix-turn-helix (HTH) transcription regulator MerR1. MerR1 transcription regulators, such as Tn21 MerR and Tn501 MerR, mediate response to mercury exposure in eubacteria. These proteins are comprised of distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their conserved N-terminal domains contain winged HTH motifs that mediate DNA binding, while the C-terminal domains have three conserved cysteines that define a mercury binding site. These proteins share the N-terminal DNA binding domain with other transcription regulators of the MerR superfamily that promote transcription by reconfiguring the spacer between the -35 and -10 promoter elements.	126
133411	cd04784	HTH_CadR-PbrR	Helix-Turn-Helix DNA binding domain of the CadR and PbrR transcription regulators. Helix-turn-helix (HTH) CadR and PbrR transcription regulators including Pseudomonas aeruginosa CadR and Ralstonia metallidurans PbrR that regulate expression of the cadmium and lead resistance operons, respectively. These proteins are comprised of distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their conserved N-terminal domains contain predicted winged HTH motifs that mediate DNA binding, while the C-terminal domains have three conserved cysteines which form a putative metal binding site. Some members in this group have a histidine-rich C-terminal extension. These proteins share the N-terminal DNA binding domain with other transcription regulators of the MerR superfamily that promote transcription by reconfiguring the spacer between the -35 and -10 promoter elements.	127
133412	cd04785	HTH_CadR-PbrR-like	Helix-Turn-Helix DNA binding domain of the CadR- and PbrR-like transcription regulators. Helix-turn-helix (HTH) CadR- and PbrR-like transcription regulators. CadR and PbrR regulate expression of the cadmium and lead resistance operons, respectively. These proteins are comprised of distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their conserved N-terminal domains contain predicted winged HTH motifs that mediate DNA binding, while the C-terminal domains have three conserved cysteines which comprise a putative metal binding site. Some members in this group have a histidine-rich C-terminal extension. These proteins share the N-terminal DNA binding domain with other transcription regulators of the MerR superfamily that promote transcription by reconfiguring the spacer between the -35 and -10 promoter elements.	126
133413	cd04786	HTH_MerR-like_sg7	Helix-Turn-Helix DNA binding domain of putative transcription regulators from the MerR superfamily. Putative helix-turn-helix (HTH) MerR-like transcription regulators (subgroup 7) with a conserved cysteine present in the C-terminal portion of the protein. Based on sequence similarity, these proteins are predicted to function as transcription regulators that mediate responses to stress in eubacteria. They belong to the MerR superfamily of transcription regulators that promote transcription of various stress regulons by reconfiguring the operator sequence located between the -35 and -10 promoter elements. A typical MerR regulator is comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their N-terminal domains are homologous and contain a DNA-binding winged HTH motif, while the C-terminal domains are often dissimilar and bind specific coactivator molecules such as metal ions, drugs, and organic substrates.	131
133414	cd04787	HTH_HMRTR_unk	Helix-Turn-Helix DNA binding domain of putative Heavy Metal Resistance transcription regulators. Putative helix-turn-helix (HTH) heavy metal resistance transcription regulators (HMRTR), unknown subgroup. Based on sequence similarity, these proteins are predicted to function as transcription regulators that mediate responses to heavy metal stress in eubacteria. They belong to the MerR superfamily of transcription regulators that promote transcription of various stress regulons by reconfiguring the operator sequence located between the -35 and -10 promoter elements. A typical MerR regulator is comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their N-terminal domains are homologous and contain a DNA-binding winged HTH motif, while the C-terminal domains are often dissimilar and bind specific coactivator molecules such as metal ions, drugs, and organic substrates. This subgroup lacks one of the conserved, metal-binding cysteines seen in the MerR1 group.	133
133415	cd04788	HTH_NolA-AlbR	Helix-Turn-Helix DNA binding domain of the transcription regulators NolA and AlbR. Helix-turn-helix (HTH) transcription regulators NolA and AlbR, N-terminal domain. In Bradyrhizobium (Arachis) sp. NC92, NolA is required for efficient nodulation of host plants. In Xanthomonas albilineans, AlbR regulates the expression of the pathotoxin, albicidin. These proteins are putatively comprised of distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their conserved N-terminal domains contain predicted winged HTH motifs that mediate DNA binding, while the C-terminal domains are often unrelated and bind specific coactivator molecules. They share the N-terminal DNA binding domain with other transcription regulators of the MerR superfamily that promote transcription by reconfiguring the spacer between the -35 and -10 promoter elements.	96
133416	cd04789	HTH_Cfa	Helix-Turn-Helix DNA binding domain of the Cfa transcription regulator. Putative helix-turn-helix (HTH) MerR-like transcription regulator; the N-terminal domain of Cfa, a cyclopropane fatty acid synthase and other related methyltransferases. Based on sequence similarity, these proteins are predicted to function as transcription regulators that mediate responses to stress in eubacteria. They belong to the MerR superfamily of transcription regulators that promote transcription of various stress regulons by reconfiguring the operator sequence located between the -35 and -10 promoter elements. A typical MerR regulator is comprised of distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their N-terminal domains are homologous and contain a DNA-binding winged HTH motif, while the C-terminal domains are often dissimilar and bind specific coactivator molecules such as metal ions, drugs, and organic substrates.	102
133417	cd04790	HTH_Cfa-like_unk	Helix-Turn-Helix DNA binding domain of putative Cfa-like transcription regulators. Putative helix-turn-helix (HTH) MerR-like transcription regulator; conserved, Cfa-like, unknown proteins (~172 a.a.). The N-terminal domain of these proteins appears to be related to the HTH domain of Cfa, a cyclopropane fatty acid synthase. These Cfa-like proteins have a unique C-terminal domain with conserved histidines (motif HXXFX7HXXF). Based on sequence similarity of the N-terminal domains, these proteins are predicted to function as transcription regulators that mediate responses to stress in eubacteria. They belong to the MerR superfamily of transcription regulators that promote transcription of various stress regulons by reconfiguring the operator sequence located between the -35 and -10 promoter elements. A typical MerR regulator is comprised of distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their N-terminal domains are homologous and contain a DNA-binding winged HTH motif, while the C-terminal domains are often dissimilar and bind specific coactivator molecules such as metal ions, drugs, and organic substrates.	172
271199	cd04791	LanC_SerThrkinase	Lanthionine synthetase C-like domain associated with serine/threonine kinases. Some members of this subgroup lack the zinc binding site and the active site residues, and therefore are most likely inactive. The function of this domain is unknown.	327
271200	cd04792	LanM-like	Cyclases involved in the biosynthesis of class II lantibiotics, and similar proteins. LanM-like proteins. LanM is a bifunctional enzyme, involved in the synthesis of class II lantibiotics. It is responsible for both the dehydration and the cyclization of the precursor-peptide during lantibiotic synthesis. The C-terminal domain shows similarity to LanC, the cyclase component of the lan operon, but the N terminus seems to be unrelated to the dehydratase, LanB.	836
271201	cd04793	LanC	Cyclases involved in the biosynthesis of lantibiotics. LanC is the cyclase enzyme of the lanthionine synthetase. Lantibiotics are a unique class of peptide antibiotics. They are ribosomally synthesized as precursor peptides and then post-translationally modified to contain thioether cross-links called lanthionines (Lans) or methyllanthionines (MeLans) in addition to 2,3-didehydroalanine (Dha) and (Z)-2,3-didehydrobutyrine (Dhb). These unusual amino acids are introduced by the dehydration of serine and threonine residues, followed by thioether formation via addition of cysteine thiols, catalysed by LanB and LanC or LanM. LanC, the cyclase component, is a zinc metalloprotein, whose bound metal has been proposed to activate the thiol substrate for nucleophilic addition. Also contains SpaC (the cyclase involved in the biosynthesis of subtilin), NisC, and homologs.	377
271202	cd04794	euk_LANCL	Eukaryotic Lanthionine synthetase C-like protein. This family contains the lanthionine synthetase C-like proteins 1 and 2 which are related to the bacterial lanthionine synthetase components C (LanC). LANCL1 and LANCL2 (testes-specific adriamycin sensitivity protein) were thought to be peptide-modifying enzyme components in eukaryotic cells. Both proteins are produced in large quantities in the brain and testes and may have a role in the immune surveillance of these organs. More recently, they have been associated with signal transduction processes and insulin sensitization. In particular, LANCL2 has been shown to bind abscisic acid (ABA), and this interaction may play a role in signaling pathways triggered by ABA, such as in human granulocytes and rat insulinoma cells. This eukaryotic LANCL family also includes Arabidopsis GCR2.	349
240112	cd04795	SIS	SIS domain. SIS (Sugar ISomerase) domains are found in many phosphosugar isomerases and phosphosugar binding proteins. SIS domains are also found in proteins that regulate the expression of genes involved in synthesis of phosphosugars.	87
341401	cd04801	CBS_pair_peptidase_M50	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains found in the metalloprotease peptidase M50. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in peptidase M50.  Members of the M50 metallopeptidase family include mammalian sterol-regulatory element binding protein (SREBP) site 2 proteases and various hypothetical bacterial homologues. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	113
240117	cd04813	PA_1	PA_1: Protease-associated (PA) domain subgroup 1. A subgroup of PA-domain containing proteins. Proteins in this subgroup contain a RING-finger (Really Interesting New Gene) domain C-terminal to this PA domain. The PA domain is an insert domain in a diverse fraction of proteases. The significance of the PA domain to many of the proteins in which it is inserted is undetermined. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. Proteins in this group contain a C-terminal RING-finger domain. Proteins into which the PA domain is inserted include the following: i) various signal peptide peptidases, such as hSPPL2a and 2b, ii) various E3 ubiquitin ligases similar to human GRAIL (gene related to anergy in lymphocytes) protein, iii) various proteins containing a RING finger motif such as Arabidopsis ReMembR-H2 protein, iv) EDEM3 (ER-degradation-enhancing mannosidase-like 3 protein), v) various plant vacuolar sorting receptors such as Pisum sativum BP-80, vi) prostate-specific membrane antigen (PSMA), vii) yeast aminopeptidase Y, viii) Vibrio metschnikovii VapT, a sodium dodecyl sulfate (SDS) resistant extracellular alkaline serine protease, ix) various subtilisin-like proteases such as Cucumisin from the juice of melon fruits, and x) human TfR (transferrin receptor) 1 and human TfR2. The proteins listed above belong to other subgroups; relatively little is known about proteins in this subgroup.	117
240118	cd04814	PA_M28_1	PA_M28_1: Protease-associated (PA) domain, peptidase family M28, subfamily-1. A subfamily of PA-domain containing proteins belonging to the peptidase family M28. Family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. The PA domain is an insert domain in a diverse fraction of proteases. The significance of the PA domain to many of the proteins in which it is inserted is undetermined. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. Proteins into which the PA domain is inserted include the following members of the peptidase family M28: i) prostate-specific membrane antigen (PSMA), ii) yeast aminopeptidase Y, and iii) human TfR (transferrin receptor)1 and human TfR2. The proteins listed above belong to other subfamilies; relatively little is known about proteins in this subfamily.	142
240119	cd04815	PA_M28_2	PA_M28_2: Protease-associated (PA) domain, peptidase family M28, subfamily-2. A subfamily of PA-domain containing proteins belonging to the peptidase family M28. Family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. The PA domain is an insert domain in a diverse fraction of proteases. The significance of the PA domain to many of the proteins in which it is inserted is undetermined. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. Proteins into which the PA domain is inserted include the following members of the peptidase family M28: i) prostate-specific membrane antigen (PSMA), ii) yeast aminopeptidase Y, and iii) human TfR (transferrin receptor)1 and human TfR2. The proteins listed above belong to other subfamilies; relatively little is known about proteins in this subfamily.	134
240120	cd04816	PA_SaNapH_like	PA_SaNapH_like: Protease-associated domain containing proteins like Streptomyces anulatus N-acetylpuromycin N-acetylhydrolase (SaNapH). This group contains various PA domain-containing proteins similar to SaNapH.  Proteins in this group belong to the peptidase M28 family. NapH is a terminal enzyme in the puromycin biosynthetic pathway; NapH hydrolyzes N-acetylpuromycin to the active antibiotic. The significance of the PA domain to these proteins has not been ascertained. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate.	122
240121	cd04817	PA_VapT_like	PA_VapT_like: Protease-associated domain containing proteins like VapT from Vibrio metschnikovii strain RH530. This group contains various PA domain-containing proteins similar to V. metschnikovii VapT, including the serine alkaline protease SapSh from the psychotroph Shewanella strain Ac10 and the Apa1 protease from the psychrotroph Pseudoalteromonas Sp. As-11. VapT is a sodium dodecyl sulfate (SDS) resistant extracellular alkaline serine protease showing high activity over a broad pH range and temperature. SapSh has a high level of protease activity at low temperatures. Apa1 is also cold-adapted. The significance of the PA domain to these proteins has not been ascertained. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate.	139
240122	cd04818	PA_subtilisin_1	PA_subtilisin_1: Protease-associated domain containing subtilisin-like proteases, subgroup 1. A subgroup of PA domain-containing subtilisin-like proteases. The significance of the PA domain to many of the proteins in which it is inserted is undetermined. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. Proteins into which the PA domain is inserted include the following subtilisin-like proteases: i) melon cucumisin, ii) Arabidopsis thaliana Ara12, iii) Alnus glutinosa ag12, iv) members of the tomato P69 family, and v) tomato LeSBT2. However, these proteins belong to other subtilisin-like subgroups. Relatively little is known about proteins in this subgroup.	118
240123	cd04819	PA_2	PA_2: Protease-associated (PA) domain subgroup 2. A subgroup of PA-domain containing proteins. The PA domain is an insert domain in a diverse fraction of proteases. The significance of the PA domain to many of the proteins in which it is inserted is undetermined. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. Proteins in this group contain a C-terminal RING-finger domain. Proteins into which the PA domain is inserted include the following: i) various signal peptide peptidases: such as hSPPL2a and 2b, ii) various E3 ubiquitin ligases similar to human GRAIL (gene related to anergy in lymphocytes) protein, iii) various proteins containing a RING finger motif such as Arabidopsis ReMembR-H2 protein, iv) EDEM3 (ER-degradation-enhancing mannosidase-like 3 protein), v) various plant vacuolar sorting receptors such as Pisum sativum BP-80, vi) prostate-specific membrane antigen (PSMA), vii) yeast aminopeptidase Y viii) Vibrio metschnikovii VapT, a sodium dodecyl sulfate (SDS) resistant extracellular alkaline serine protease, ix) various subtilisin-like proteases such as Cucumisin from the juice of melon fruits, and x) human TfR (transferrin receptor) 1 and human TfR2.  The proteins listed above belong to other subgroups; relatively little is known about proteins in this subgroup.	127
240124	cd04820	PA_M28_1_1	PA_M28_1_1: Protease-associated (PA) domain, peptidase family M28, subfamily-1, subgroup 1. A subgroup of PA-domain containing proteins belonging to the peptidase family M28. Family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. The PA domain is an insert domain in a diverse fraction of proteases. The significance of the PA domain to many of the proteins in which it is inserted is undetermined. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. Proteins into which the PA domain is inserted include the following members of the peptidase family M28: i) prostate-specific membrane antigen (PSMA), ii) yeast aminopeptidase Y, and iii) human TfR (transferrin receptor)1 and human TfR2. The proteins listed above belong to other subgroups; relatively little is known about proteins in this subgroup.	137
240125	cd04821	PA_M28_1_2	PA_M28_1_2: Protease-associated (PA) domain, peptidase family M28, subfamily-1, subgroup 2. A subgroup of PA-domain containing proteins belonging to the peptidase family M28. Family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. The PA domain is an insert domain in a diverse fraction of proteases. The significance of the PA domain to many of the proteins in which it is inserted is undetermined. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. Proteins into which the PA domain is inserted include the following members of the peptidase family M28: i) prostate-specific membrane antigen (PSMA), ii) yeast aminopeptidase Y, and iii) human TfR (transferrin receptor)1 and human TfR2. The proteins listed above belong to other subgroups; relatively little is known about proteins in this subgroup.	157
240126	cd04822	PA_M28_1_3	PA_M28_1_3: Protease-associated (PA) domain, peptidase family M28, subfamily-1, subgroup 3. A subgroup of PA-domain containing proteins belonging to the peptidase family M28. Family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. The PA domain is an insert domain in a diverse fraction of proteases. The significance of the PA domain to many of the proteins in which it is inserted is undetermined. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. Proteins into which the PA domain is inserted include the following members of the peptidase family M28: i) prostate-specific membrane antigen (PSMA), ii) yeast aminopeptidase Y, and iii) human TfR (transferrin receptor)1 and human TfR2. The proteins listed above belong to other subgroups; relatively little is known about proteins in this subgroup.	151
240127	cd04823	ALAD_PBGS_aspartate_rich	Porphobilinogen synthase (PBGS), which is also called delta-aminolevulinic acid dehydratase (ALAD), catalyzes the condensation of two 5-aminolevulinic acid (ALA) molecules to form the pyrrole porphobilinogen (PBG), which is the second step in the biosynthesis of tetrapyrroles, such as heme, vitamin B12 and chlorophyll. This reaction involves the formation of a Schiff base link between the substrate and the enzyme. PBGSs are metalloenzymes, some of which have a second, allosteric metal binding site, beside the metal ion binding site in their active site. Although PBGS is a family of homologous enzymes, its metal ion utilization at catalytic site varies between zinc and magnesium and/or potassium. PBGS can be classified into two groups based on differences in their active site metal binding site. All of PBGS_aspartate_rich contain an aspartate rich metal binding site with the general sequence DXALDX(Y/F)X3G(H/Q)DG. They also contain an allosteric magnesium binding sequence RX~164DX~65EXXXD and are activated by magnesium and/or potassium, but not by zinc. PBGSs_aspartate_rich are found in some bacterial species and photosynthetic organisms such as vascular plants, mosses and algae, but not in archaea.	320
240128	cd04824	eu_ALAD_PBGS_cysteine_rich	Porphobilinogen synthase (PBGS), which is also called delta-aminolevulinic acid dehydratase (ALAD), catalyzes the condensation of two 5-aminolevulinic acid (ALA) molecules to form the pyrrole porphobilinogen (PBG), which is the second step in the biosynthesis of tetrapyrroles, such as heme, vitamin B12 and chlorophyll. This reaction involves the formation of a Schiff base link between the substrate and the enzyme. PBGSs are metalloenzymes, some of which have a second, allosteric metal binding site, beside the metal ion binding site in their active site. Although PBGS is a family of homologous enzymes, its metal ion utilization at catalytic site varies between zinc and magnesium and/or potassium. PBGS can be classified into two groups based on differences in their active site metal binding site. The eukaryotic PBGSs represented by this model, which contain a cysteine-rich zinc binding motif (DXCXCX(Y/F)X3G(H/Q)CG), require zinc for their activity, they do not contain an additional allosteric metal binding site and do not bind magnesium.	320
173791	cd04842	Peptidases_S8_Kp43_protease	Peptidase S8 family domain in Kp43 proteases. Kp43 proteases are members of the peptidase S8 or Subtilase clan of proteases. They have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure (an example of convergent evolution). Kp43 is topologically similar to kexin and furin, both of which are proprotein convertases, but differs in amino acid sequence and the position of its C-terminal barrel.  Kp43 has 3 Ca2+ binding sites that differ from the corresponding sites in the other known subtilisin-like proteases.  KP-43 protease is known to be an oxidation-resistant protease when compared with the other subtilisin-like proteases.	293
173792	cd04843	Peptidases_S8_11	Peptidase S8 family domain, uncharacterized subfamily 11. This family is a member of the Peptidases S8 or Subtilases serine endo- and exo-peptidase clan. They have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The stability of subtilases may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values.	277
173793	cd04847	Peptidases_S8_Subtilisin_like_2	Peptidase S8 family domain in Subtilisin-like proteins. This family is a member of the Peptidases S8 or Subtilases serine endo- and exo-peptidase clan. They have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The stability of subtilases may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values.	291
173794	cd04848	Peptidases_S8_Autotransporter_serine_protease_like	Peptidase S8 family domain in Autotransporter serine proteases. Autotransporter serine proteases belong to Peptidase S8 or Subtilase family. Subtilases, or subtilisin-like serine proteases, have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure (an example of convergent evolution).  Autotransporters are a superfamily of outer membrane/secreted proteins of gram-negative bacteria.  The presence of these subtilisin-like domains in these autotransporters may enable them to be auto-catalytic and may also serve to allow them to act as a maturation protease cleaving other outer membrane proteins at the cell surface.	267
173795	cd04852	Peptidases_S8_3	Peptidase S8 family domain, uncharacterized subfamily 3. This family is a member of the Peptidases S8 or Subtilases serine endo- and exo-peptidase clan. They have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The stability of subtilases may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values.	307
173796	cd04857	Peptidases_S8_Tripeptidyl_Aminopeptidase_II	Peptidase S8 family domain in Tripeptidyl aminopeptidases_II. Tripeptidyl aminopeptidases II are members of the peptidase S8 or Subtilase family. Subtilases, or subtilisin-like serine proteases, have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure (an example of convergent evolution).  Tripeptidyl aminopeptidase II removes tripeptides from the free N terminus of oligopeptides as well as having endoproteolytic activity.  Some tripeptidyl aminopeptidases have been shown to cleave tripeptides and small peptides, e.g. angiotensin II and glucagon, while others are believed to be involved in MHC I processing.	412
240129	cd04859	Prim_Pol	Prim_Pol: Primase-polymerase (primpol) domain of the type found in bifunctional replicases from archaeal plasmids, including ORF904 protein of the crenarchaeal plasmid pRN1 from Sulfolobus islandicus (pRN1 primpol). These primpol domains belong to the archaeal/eukaryal primase (AEP) superfamily. This group includes archaeal plasmids and bacteriophage AEPs. The ORF904 protein is a multifunctional protein having ATPase, primase and DNA polymerase activity, and may play a role in the replication of the archaeal plasmid. The pRN1 primpol domain exhibits DNA polymerase and primase activities; a cluster of active site residues (three acidic residues, and a histidine) is required for both these activities. For pRN1 primpol, the primase activity prefers dNTPs to rNTPs; incorporation of dNTPs requires rNTP as cofactor. The pRN1 primpol contains an unusual zinc-binding stem, which is not conserved in other members of this group.	152
240130	cd04860	AE_Prim_S	AE_Prim_S: primase domain similar to that found in the small subunit of archaeal and eukaryotic (A/E) DNA primases. Primases are DNA-dependent RNA polymerases which synthesize the short RNA primers required for DNA replication. In addition to its catalytic role in replication, DNA primase may play a role in coupling replication to DNA damage repair and in checkpoint control during S phase. In eukaryotes, this small catalytically active primase subunit (p50) and a larger primase subunit (p60), referred to jointly as the core primase, associate with the B subunit and the DNA polymerase alpha subunit in a complex, called Pol alpha-pri. The function of the larger primase subunit is unclear. Included in this group are Pfu41 and Pfu46; these two proteins comprise the primase complex of the archaeon Pyrococcus furiosus; Pfu41 and Pfu46 have sequence identity to the eukaryotic p50 and p60 primase proteins respectively. Pfu41 preferentially uses dNTPs as substrate. Pfu46 regulates the primase activity of Pfu41.	232
240131	cd04861	LigD_Pol_like	LigD_Pol_like: Polymerase (Pol) domain of bacterial LigD proteins similar to Pseudomonas aeruginosa (Pae) LigD. The LigD Pol domain belongs to the archaeal/eukaryal primase (AEP) superfamily. In prokaryotes, LigD along with Ku is required for non-homologous end joining (NHEJ)-mediated repair of DNA double-strand breaks (DSB). NHEJ-mediated DNA DSB repair is error-prone. PaeLigD is monomeric, containing an N-terminal phosphoesterase module, a central polymerase (Pol) domain, and a C-terminal ATP-dependent ligase domain. Mycobacterium tuberculosis (Mt)LigD, also found in this group, is monomeric and contains the same modules but these are arranged differently: an N-terminal Pol domain, a central phosphoesterase module, and a C-terminal ligase domain. It has been suggested that LigD Pol contributes to NHEJ-mediated DNA DSB repair in vivo, by filling in short 5'-overhangs with ribonucleotides; the filled in termini would then be sealed by the associated LigD ligase domain, resulting in short stretches of RNA incorporated into the genomic DNA. The PaeLigD Pol domain in vitro, in a manganese-dependent fashion, catalyzes templated extensions of 5'-overhang duplex DNA, and nontemplated single-nucleotide additions to blunt-end duplex DNA; it preferentially adds single ribonucleotides at blunt DNA ends. PaeLigD Pol adds a correctly paired rNTP to the DNA primer termini more rapidly than it does a correctly paired dNTP; it has higher infidelity as an RNA polymerase than it does as a DNA polymerase, which is in keeping with the mutagenic property of NHEJ-mediated DNA DSB repair. The MtLigD Pol domain similarly is stimulated by manganese, is error-prone, and prefers adding rNTPs to dNTPs in vitro. The MtLigD Pol domain has been shown to prefer DNA gapped substrates containing a 5'-phosphate group at the gap.	227
240132	cd04862	PaeLigD_Pol_like	PaeLigD_Pol_like: Polymerase (Pol) domain of bacterial LigD proteins similar to Pseudomonas aeruginosa (Pae) LigD. The LigD Pol domain belongs to the archaeal/eukaryal primase (AEP) superfamily. In prokaryotes, LigD along with Ku is required for non-homologous end joining (NHEJ)-mediated repair of DNA double-strand breaks (DSB). NHEJ-mediated DNA DSB repair is error-prone. PaeLigD is monomeric, containing an N-terminal phosphoesterase module, a central polymerase (Pol) domain, and a C-terminal ATP-dependent ligase domain. It has been suggested that LigD Pol contributes to NHEJ-mediated DNA DSB repair in vivo, by filling in short 5'-overhangs with ribonucleotides; the filled in termini would then be sealed by the associated LigD ligase domain, resulting in short stretches of RNA incorporated into the genomic DNA. The PaeLigD Pol domain in vitro, in a manganese-dependent fashion, catalyzes templated extensions of 5'-overhang duplex DNA, and nontemplated single-nucleotide additions to blunt-end duplex DNA; it preferentially adds single ribonucleotides at blunt DNA ends. PaeLigD Pol adds a correctly paired rNTP to the DNA primer termini more rapidly than it does a correctly paired dNTP; it has higher infidelity as an RNA polymerase than it does as a DNA polymerase, which is in keeping with the mutagenic property of NHEJ-mediated DNA DSB repair.	227
240133	cd04863	MtLigD_Pol_like	MtLigD_Pol_like: Polymerase (Pol) domain of bacterial LigD proteins similar to Mycobacterium tuberculosis (Mt)LigD. The LigD Pol domain belongs to the archaeal/eukaryal primase (AEP) superfamily. In prokaryotes, LigD along with Ku is required for non-homologous end joining (NHEJ)-mediated repair of DNA double-strand breaks (DSB). NHEJ-mediated DNA DSB repair is error-prone. MtLigD is monomeric and contains an N-terminal Pol domain, a central phosphoesterase module, and a C-terminal ligase domain. It has been suggested that LigD Pol contributes to NHEJ-mediated DNA DSB repair in vivo, by filling in short 5'-overhangs with ribonucleotides; the filled in termini would then be sealed by the associated LigD ligase domain, resulting in short stretches of RNA incorporated into the genomic DNA. The MtLigD Pol domain is stimulated by manganese, is error-prone, and prefers adding rNTPs to dNTPs in vitro. The MtLigD Pol domain has been shown to prefer DNA gapped substrates containing a 5'-phosphate group at the gap.	231
240134	cd04864	LigD_Pol_like_1	LigD_Pol_like_1: Polymerase (Pol) domain of mostly bacterial LigD proteins similar to Pseudomonas aeruginosa (Pae) LigD, subgroup 1. The LigD Pol domain belongs to the archaeal/eukaryal primase (AEP) superfamily. In prokaryotes, LigD along with Ku is required for non-homologous end joining (NHEJ)-mediated repair of DNA double-strand breaks (DSB). NHEJ-mediated DNA DSB repair is error-prone. It has been suggested that LigD Pol contributes to NHEJ-mediated DNA DSB repair in vivo, by filling in short 5'-overhangs with ribonucleotides; the filled in termini would then be sealed by the associated LigD ligase domain, resulting in short stretches of RNA incorporated into the genomic DNA. The Pol domains of PaeLigD and Mycobacterium tuberculosis (Mt)LigD are stimulated by manganese, are error-prone, and prefer adding rNTPs to dNTPs in vitro; however PaeLigD and MtLigD belong to other subgroups, proteins in this subgroup await functional characterization.	228
240135	cd04865	LigD_Pol_like_2	LigD_Pol_like_2: Polymerase (Pol) domain of bacterial LigD proteins similar to Pseudomonas aeruginosa (Pae) LigD, subgroup 2. The LigD Pol domain belongs to the archaeal/eukaryal primase (AEP) superfamily. In prokaryotes, LigD along with Ku is required for non-homologous end joining (NHEJ)-mediated repair of DNA double-strand breaks (DSB). NHEJ-mediated DNA DSB repair is error-prone. It has been suggested that LigD Pol contributes to NHEJ-mediated DNA DSB repair in vivo, by filling in short 5'-overhangs with ribonucleotides; the filled in termini would then be sealed by the associated LigD ligase domain, resulting in short stretches of RNA incorporated into the genomic DNA. The Pol domains of PaeLigD and Mycobacterium tuberculosis (Mt)LigD are stimulated by manganese, are error-prone, and prefer adding rNTPs to dNTPs in vitro; however PaeLigD and MtLigD belong to other subgroups, proteins in this subgroup await functional characterization.	228
240136	cd04866	LigD_Pol_like_3	LigD_Pol_like_3: Polymerase (Pol) domain of bacterial LigD proteins similar to Pseudomonas aeruginosa (Pae) LigD, subgroup 3. The LigD Pol domain belongs to the archaeal/eukaryal primase (AEP) superfamily. In prokaryotes, LigD along with Ku is required for non-homologous end joining (NHEJ)-mediated repair of DNA double-strand breaks (DSB). NHEJ-mediated DNA DSB repair is error-prone. It has been suggested that LigD Pol contributes to NHEJ-mediated DNA DSB repair in vivo, by filling in short 5'-overhangs with ribonucleotides; the filled in termini would then be sealed by the associated LigD ligase domain, resulting in short stretches of RNA incorporated into the genomic DNA. The Pol domains of PaeLigD and Mycobacterium tuberculosis (Mt)LigD are stimulated by manganese, are error-prone, and prefer adding rNTPs to dNTPs in vitro; however PaeLigD and MtLigD belong to other subgroups, proteins in this subgroup await functional characterization.	223
340516	cd04867	TGS_YchF_OLA1	TGS (ThrRS, GTPase and SpoT) domain found in the YchF/OLA1 family proteins. The YchF/Ola1 family includes bacterial ribosome-binding ATPase YchF as well as its human homolog Obg-like ATPase 1 (OLA1), both of which belong to the Obg family of GTPases, and are novel ATPases that bind and hydrolyze ATP more efficiently than GTP. They have been associated with various cellular processes and pathologies, including DNA repair, tumorigenesis, and apoptosis, in addition to the regulation of the oxidative stress response. OLA1 is also termed DNA damage-regulated overexpressed in cancer 45 (DOC45), or GTP-binding protein 9 (GTPBP9). It is over-expressed in several human malignancies, including cancers of the colon, rectum, ovary, lung, stomach, and uterus. It is linked to the cellular stress response and tumorigenesis, and may also serve as a valuable tumor marker. Members in this family contain a central Obg-type G (guanine nucleotide-binding) domain, flanked by a coiled-coil domain and this TGS (ThrRS, GTPase, SpoT) domain of unknown function.	85
153140	cd04868	ACT_AK-like	ACT domains C-terminal to the catalytic domain of aspartokinase (AK; 4-L-aspartate-4-phosphotransferase). This CD includes each of two ACT domains C-terminal to the catalytic domain of aspartokinase (AK; 4-L-aspartate-4-phosphotransferase). Typically, AK consists of two ACT domains in a tandem repeat, but the second ACT domain is inserted within the first, resulting in, what is normally the terminal beta strand of ACT2, formed from a region N-terminal of ACT1. AK catalyzes the conversion of aspartate and ATP to aspartylphosphate and ADP. Aspartokinase is the first enzyme in the pathway of the biosynthesis of the aspartate family of amino acids (lysine, threonine, methionine, and isoleucine) and the bacterial cell wall component, meso-diaminopimelate. One mechanism for the regulation of this pathway is by the production of several isoenzymes of aspartokinase with different repressors and allosteric inhibitors. Pairs of ACT domains are proposed to specifically bind amino acids leading to allosteric regulation of the enzyme. In Escherichia coli (EC), three different aspartokinase isoenzymes are regulated specifically by lysine, methionine, and threonine. AK-HSDHI (ThrA) and AK-HSDHII (MetL) are bifunctional enzymes that consist of an N-terminal AK and a C-terminal homoserine dehydrogenase (HSDH). ThrA and MetL are involved in threonine and methionine biosynthesis, respectively. The third isoenzyme, AKIII (LysC), is monofunctional and is involved in lysine synthesis. The three Bacillus subtilis (BS) isoenzymes, AKI (DapG), AKII (LysC), and AKIII (YclM), are feedback inhibited by meso-diaminopimelate, lysine, and lysine plus threonine, respectively. The E. coli lysine-sensitive AK is described as a homodimer, whereas, the B. subtilis lysine-sensitive AK is described as a heterodimeric complex of alpha- and beta- subunits that are formed from two in-frame overlapping genes. A single AK enzyme type has been described in Pseudomonas, Amycolatopsis, and Corynebacterium, and apparently, unique to cyanobacteria, are aspartokinases with two tandem pairs of ACT domains, C-terminal to the catalytic domain. The fungal aspartate pathway is regulated at the AK step, with L-Thr being an allosteric inhibitor of the Saccharomyces cerevisiae AK (Hom3). At least two distinct AK isoenzymes can occur in higher plants, a monofunctional lysine-sensitive isoenzyme, which is involved in the overall regulation of the pathway and can be synergistically inhibited by S-adenosylmethionine. The other isoenzyme is a bifunctional, threonine-sensitive AK-HSDH protein. Also included in this AK family CD are the ACT domains of the Methylomicrobium alcaliphilum AK; the first enzyme of the ectoine biosynthetic pathway found in this bacterium and several other halophilic/halotolerant bacteria. Members of this CD belong to the superfamily of ACT regulatory domains.	60
153141	cd04869	ACT_GcvR_2	ACT domains that comprise the Glycine Cleavage System Transcriptional Repressor (GcvR) protein, and other related domains. This CD includes the second of the two ACT domains that comprise the Glycine Cleavage System Transcriptional Repressor (GcvR) protein, and other related domains. The glycine cleavage enzyme system in Escherichia coli provides one-carbon units for cellular methylation reactions. This enzyme system, encoded by the gcvTHP operon and lpd gene, catalyzes the cleavage of glycine into CO2 + NH3 and transfers a one-carbon unit to tetrahydrofolate, producing 5,10-methylenetetrahydrofolate. The gcvTHP operon is activated by the GcvA protein in response to glycine and repressed by a GcvA/GcvR interaction in the absence of glycine. It has been proposed that the co-activator glycine acts through a mechanism of de-repression by binding to GcvR and preventing GcvR from interacting with GcvA to block GcvA's activator function. Evidence also suggests that GcvR interacts directly with GcvA rather than binding to DNA to cause repression. Members of this CD belong to the superfamily of ACT regulatory domains.	81
153142	cd04870	ACT_PSP_1	ACT domains found N-terminal of phosphoserine phosphatase (PSP, SerB). The ACT_PSP_1 CD includes the first of the two ACT domains found N-terminal of phosphoserine phosphatase (PSP, SerB). PSPs belong to the L-2-haloacid dehalogenase-like protein superfamily. PSP is involved in serine metabolism; serine is synthesized from phosphoglycerate through sequential reactions catalyzed by 3-phosphoglycerate dehydrogenase (SerA), 3-phosphoserine aminotransferase (SerC), and SerB. Members of this CD belong to the superfamily of ACT regulatory domains.	75
153143	cd04871	ACT_PSP_2	ACT domains found N-terminal of phosphoserine phosphatase (PSP, SerB). The ACT_PSP_2 CD includes the second of the two ACT domains found N-terminal of phosphoserine phosphatase (PSP, SerB). PSPs belong to the L-2-haloacid dehalogenase-like protein superfamily. PSP is involved in serine metabolism; serine is synthesized from phosphoglycerate through sequential reactions catalyzed by 3-phosphoglycerate dehydrogenase (SerA), 3-phosphoserine aminotransferase (SerC), and SerB. Members of this CD belong to the superfamily of ACT regulatory domains.	84
153144	cd04872	ACT_1ZPV	ACT domain proteins similar to the yet uncharacterized Streptococcus pneumoniae ACT domain protein. This CD, ACT_1ZPV, includes those single ACT domain proteins similar to the yet uncharacterized Streptococcus pneumoniae ACT domain protein (pdb structure 1ZPV). Members of this CD belong to the superfamily of ACT regulatory domains.	88
153145	cd04873	ACT_UUR-ACR-like	ACT domains of the bacterial signal-transducing uridylyltransferase /uridylyl-removing (UUR) enzyme, GlnD. This ACT domain family, ACT_UUR_ACR-like, includes the two C-terminal ACT domains of the bacterial signal-transducing uridylyltransferase /uridylyl-removing (UUR) enzyme, GlnD; including those enzymes similar to the GlnD found in enteric Escherichia coli and those found in photosynthetic, nitrogen-fixing bacterium Rhodospirillum rubrum. Also included in this CD are the four ACT domains of a novel protein composed almost entirely of ACT domain repeats (the ACR protein) and like proteins. These ACR proteins, found in Arabidopsis and Oryza, are proposed to function as novel regulatory or sensor proteins in plants. This CD also includes the first of the two ACT domains that comprise the Glycine Cleavage System Transcriptional Repressor (GcvR) protein and related domains, as well as, the N-terminal ACT domain of a yet uncharacterized Arabidopsis/Oryza predicted tyrosine kinase. Members of this CD belong to the superfamily of ACT regulatory domains.	70
153146	cd04874	ACT_Af1403	N-terminal ACT domain of the yet uncharacterized, small (~133 a.a.), putative amino acid binding protein, Af1403, and related domains. This CD includes the N-terminal ACT domain of the yet uncharacterized, small (~133 a.a.), putative amino acid binding protein, Af1403, from Archaeoglobus fulgidus and other related archaeal ACT domains. Members of this CD belong to the superfamily of ACT regulatory domains.	72
153147	cd04875	ACT_F4HF-DF	N-terminal ACT domain of formyltetrahydrofolate deformylase (F4HF-DF; formyltetrahydrofolate hydrolase). This CD includes the N-terminal ACT domain of formyltetrahydrofolate deformylase (F4HF-DF; formyltetrahydrofolate hydrolase) which catalyzes the hydrolysis of 10-formyltetrahydrofolate (formyl-FH4) to FH4 and formate. Formyl-FH4 hydrolase generates the formate that is used by purT-encoded 5'-phosphoribosylglycinamide transformylase for step three of de novo purine nucleotide synthesis. Formyl-FH4 hydrolase, a hexamer which is activated by methionine and inhibited by glycine, is proposed to regulate the balance of FH4 and C1-FH4 in response to changing growth conditions. Members of this CD belong to the superfamily of ACT regulatory domains.	74
153148	cd04876	ACT_RelA-SpoT	ACT  domain found C-terminal of the RelA/SpoT domains. ACT_RelA-SpoT: the ACT  domain found C-terminal of the RelA/SpoT domains. Enzymes of the Rel/Spo family enable bacteria to survive prolonged periods of nutrient limitation by controlling guanosine-3'-diphosphate-5'-(tri)diphosphate ((p)ppGpp) production and subsequent rRNA repression (stringent response). Both the synthesis of (p)ppGpp from ATP and GDP(GTP), and its hydrolysis to GDP(GTP) and pyrophosphate, are catalyzed by Rel/Spo proteins. In Escherichia coli and its close relatives, the metabolism of (p)ppGpp is governed by two homologous proteins, RelA and SpoT. The RelA protein catalyzes (p)ppGpp synthesis in a reaction requiring its binding to ribosomes bearing codon-specified uncharged tRNA. The major role of the SpoT protein is the breakdown of (p)ppGpp by a manganese-dependent (p)ppGpp pyrophosphohydrolase activity. Although the stringent response appears to be tightly regulated by these two enzymes in E. coli, a bifunctional Rel/Spo protein has been discovered in most gram-positive organisms studied so far. These bifunctional Rel/Spo homologs (rsh) appear to modulate (p)ppGpp levels through two distinct active sites that are controlled by a reciprocal regulatory mechanism ensuring inverse coupling of opposing activities. In studies with the Streptococcus equisimilis Rel/Spo homolog, the C-terminal domain appears to be involved in this reciprocal regulation of the two opposing catalytic activities present in the N-terminal domain, ensuring that both synthesis and degradation activities are not coinduced. Members of this CD belong to the superfamily of ACT regulatory domains.	71
153149	cd04877	ACT_TyrR	N-terminal ACT domain of the TyrR protein. ACT_TyrR: N-terminal ACT domain of the TyrR protein. The TyrR protein of Escherichia coli controls the expression of a group of transcription units (TyrR regulon) whose gene products are involved in the biosynthesis or transport of the aromatic amino acids. Binding to specific DNA sequences known as TyrR boxes, the TyrR protein can either activate or repress transcription at different sigma70 promoters. Its regulatory activity occurs in response to intracellular levels of tyrosine, phenylalanine and tryptophan. The TyrR protein consists of an N-terminal region important for transcription activation with an ATP-independent aromatic amino acid binding site (contained within the ACT domain) and is involved in dimerization; a central region with an ATP binding site, an ATP-dependent aromatic amino acid binding site and is involved in hexamerization; and a helix turn helix DNA binding C-terminal region. In solution, in the absence of cofactors or in the presence of phenylalanine alone, the TyrR protein exists as a dimer. However, in the presence of ATP and tyrosine the TyrR protein self-aggregates to form a hexamer. Members of this CD belong to the superfamily of ACT regulatory domains.	74
153150	cd04878	ACT_AHAS	N-terminal ACT domain of the Escherichia coli IlvH-like regulatory subunit of acetohydroxyacid synthase (AHAS). ACT_AHAS: N-terminal ACT domain of the Escherichia coli IlvH-like regulatory subunit of acetohydroxyacid synthase (AHAS). AHAS catalyses the first common step in the biosynthesis of the three branched-chain amino acids. The first step involves the condensation of either pyruvate or 2-ketobutyrate with the two-carbon hydroxyethyl fragment derived from another pyruvate molecule, covalently bound to the coenzyme thiamine diphosphate. Bacterial AHASs generally consist of regulatory and catalytic subunits. The effector (valine) binding sites are proposed to be located in two symmetrically related positions in the interface between a pair of N-terminal ACT domains with the C-terminal domain of IlvH contacting the catalytic dimer. Plants Arabidopsis and Oryza have tandem IlvH subunits; both the first and second ACT domain sequences are present in this CD. Members of this CD belong to the superfamily of ACT regulatory domains.	72
153151	cd04879	ACT_3PGDH-like	ACT_3PGDH-like CD includes the C-terminal ACT (regulatory) domain of D-3-phosphoglycerate dehydrogenase (3PGDH). ACT_3PGDH-like: The ACT_3PGDH-like CD includes the C-terminal ACT (regulatory) domain of D-3-phosphoglycerate dehydrogenase (3PGDH), with or without an extended C-terminal (xct) region found in various bacteria, archaea, fungi, and plants. 3PGDH is an enzyme that belongs to the D-isomer specific, 2-hydroxyacid dehydrogenase family and catalyzes the oxidation of D-3-phosphoglycerate to 3-phosphohydroxypyruvate, which is the first step in the biosynthesis of L-serine, using NAD+ as the oxidizing agent. In bacteria, 3PGDH is feedback controlled by the end product L-serine in an allosteric manner. In the Escherichia coli homotetrameric enzyme, the interface at adjacent ACT (regulatory) domains couples to create an extended beta-sheet. Each regulatory interface forms two serine-binding sites. The mechanism by which serine transmits inhibition to the active site is postulated to involve the tethering of the regulatory domains together to create a rigid quaternary structure with a solvent-exposed active site cleft. This CD also includes the C-terminal ACT domain of the L-serine dehydratase (LSD), iron-sulfur-dependent, beta subunit, found in various bacterial anaerobes such as Clostridium, Bacillus, and Treponema species. LSD enzymes catalyze the deamination of L-serine, producing pyruvate and ammonia. Unlike the eukaryotic L-serine dehydratase, which requires the pyridoxal-5'-phosphate (PLP) cofactor, the prokaryotic L-serine dehydratase contains an [4Fe-4S] cluster instead of a PLP active site. The LSD alpha and beta subunits of the 'clostridial' enzyme are encoded by the sdhA and sdhB genes. The single subunit bacterial homologs of L-serine dehydratase (LSD1, LSD2, TdcG) present in E. coli, and other Enterobacteriales, lack the ACT domain described here. Members of this CD belong to the superfamily of ACT regulatory domains.	71
153152	cd04880	ACT_AAAH-PDT-like	ACT domain of the nonheme iron-dependent, aromatic amino acid hydroxylases (AAAH). ACT domain of the nonheme iron-dependent, aromatic amino acid hydroxylases (AAAH): Phenylalanine hydroxylases (PAH), tyrosine hydroxylases (TH) and tryptophan hydroxylases (TPH), both peripheral (TPH1) and neuronal (TPH2) enzymes. This family of enzymes shares a common catalytic mechanism, in which dioxygen is used by an active site containing a single, reduced iron atom to hydroxylate an unactivated aromatic substrate, concomitant with a two-electron oxidation of tetrahydropterin (BH4) cofactor to its quinonoid dihydropterin form. Eukaryotic AAAHs have an N-terminal ACT (regulatory) domain, a middle catalytic domain and a C-terminal domain which is responsible for the oligomeric state of the enzyme forming a domain-swapped tetrameric coiled-coil. The PAH, TH, and TPH enzymes contain highly conserved catalytic domains but distinct N-terminal ACT domains and differ in their mechanisms of regulation. One commonality is that all three eukaryotic enzymes appear to be regulated, in part, by the phosphorylation of serine residues N-terminal of the ACT domain. Also included in this CD are the C-terminal ACT domains of the bifunctional chorismate mutase-prephenate dehydratase (CM-PDT) enzyme and the prephenate dehydratase (PDT) enzyme found in plants, fungi, bacteria, and archaea. The P-protein of Escherichia coli (CM-PDT) catalyzes the conversion of chorismate to prephenate and then the decarboxylation and dehydration to form phenylpyruvate. These are the first two steps in the biosynthesis of L-Phe and L-Tyr via the shikimate pathway in microorganisms and plants. The E. coli P-protein (CM-PDT) has three domains with an N-terminal domain with chorismate mutase activity, a middle domain with prephenate dehydratase activity, and an ACT regulatory C-terminal domain. The prephenate dehydratase enzyme has a PDT and ACT domain. The ACT domain is essential to bring about the negative allosteric regulation by L-Phe binding. L-Phe binds with positive cooperativity; with this binding, there is a shift in the protein to less active tetrameric and higher oligomeric forms from a more active dimeric form. Members of this CD belong to the superfamily of ACT regulatory domains.	75
153153	cd04881	ACT_HSDH-Hom	ACT_HSDH_Hom CD includes the C-terminal ACT domain of the NAD(P)H-dependent, homoserine dehydrogenase (HSDH) and related domains. The ACT_HSDH_Hom CD includes the C-terminal ACT domain of the NAD(P)H-dependent, homoserine dehydrogenase (HSDH) encoded by the hom gene of Bacillus subtilis and other related sequences. HSDH reduces aspartate semi-aldehyde to the amino acid homoserine, one that is required for the biosynthesis of Met, Thr, and Ile from Asp. Neither the enzyme nor the aspartate pathway is found in the animal kingdom. This mostly bacterial HSDH group has a C-terminal ACT domain and is believed to be involved in enzyme regulation. A C-terminal deletion in the Corynebacterium glutamicum HSDH abolished allosteric inhibition by L-threonine. Members of this CD belong to the superfamily of ACT regulatory domains.	79
153154	cd04882	ACT_Bt0572_2	C-terminal ACT domain of a novel protein composed of just two ACT domains. Included in this CD is the C-terminal ACT domain of a novel protein composed of just two ACT domains, as seen in the yet uncharacterized structure (pdb 2F06) of the Bt0572 protein from Bacteroides thetaiotaomicron and related proteins. Members of this CD belong to the superfamily of ACT regulatory domains.	65
153155	cd04883	ACT_AcuB	C-terminal ACT domain of the Bacillus subtilis acetoin utilization protein, AcuB. This CD includes the C-terminal ACT domain of the Bacillus subtilis acetoin utilization protein, AcuB. AcuB is putatively involved in the anaerobic catabolism of acetoin, and related proteins. Studies report the induction of AcuB by nitrate respiration and also by fermentation. Since acetoin can be secreted and later serve as a source of carbon, it has been proposed that, during anaerobic growth when other carbon sources are exhausted, the induction of the AcuB protein  results in acetoin catabolism. AcuB-like proteins have two N-terminal tandem CBS domains and a single C-terminal ACT domain. Members of this CD belong to the superfamily of ACT regulatory domains.	72
153156	cd04884	ACT_CBS	C-terminal ACT domain of the cystathionine beta-synthase (CBS) domain protein found in Thermotoga maritima, Tm0935, and delta proteobacteria. This CD includes the C-terminal ACT domain of the cystathionine beta-synthase (CBS) domain protein found in Thermotoga maritima, Tm0935, and delta proteobacteria. This protein has two N-terminal tandem CBS domains and a single C-terminal ACT domain. The CBS domain is found in a wide range of proteins, often in tandem arrangements and together with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Members of this CD belong to the superfamily of ACT regulatory domains.	72
153157	cd04885	ACT_ThrD-I	Tandem C-terminal ACT domains of threonine dehydratase I (ThrD-I; L-threonine hydrolyase). This CD includes each of two tandem C-terminal ACT domains of threonine dehydratase I (ThrD-I; L-threonine hydrolyase) which catalyzes the committed step in branched chain amino acid biosynthesis in plants and microorganisms, the pyridoxal 5'-phosphate (PLP)-dependent dehydration/deamination of L-threonine (or L-serine) to 2-ketobutyrate (or pyruvate). ThrD-I is a cooperative, feedback-regulated (isoleucine and valine) allosteric enzyme that forms a tetramer and contains four pyridoxal phosphate moieties. Members of this CD belong to the superfamily of ACT regulatory domains.	68
153158	cd04886	ACT_ThrD-II-like	C-terminal ACT domain of biodegradative (catabolic) threonine dehydratase II (ThrD-II) and other related ACT domains. This CD includes the C-terminal ACT domain of biodegradative (catabolic) threonine dehydratase II (ThrD-II) and other related ACT domains. The Escherichia coli tdcB gene product, ThrD-II, anaerobically catalyzes the pyridoxal phosphate-dependent dehydration of L-threonine and L-serine to ammonia and to alpha-ketobutyrate and pyruvate, respectively. Tetrameric ThrD-II is subject to allosteric activation by AMP, inhibition by alpha-keto acids, and catabolite inactivation by several metabolites of glycolysis and the citric acid cycle. Also included in  this CD are  N-terminal ACT domains present in smaller (~170 a.a.) archaeal proteins of unknown function. Members of this CD belong to the superfamily of ACT regulatory domains.	73
153159	cd04887	ACT_MalLac-Enz	ACT_MalLac-Enz CD includes the N-terminal ACT domain of putative NAD-dependent malic enzyme 1, Bacillus subtilis YqkI and related domains. The ACT_MalLac-Enz CD includes the N-terminal ACT domain of putative NAD-dependent malic enzyme 1, Bacillus subtilis YqkI, a malolactic enzyme  (MalLac-Enz) which converts malate to lactate, and other related ACT domains. The yqkJ product is predicted to convert malate directly to lactate, as opposed to related malic enzymes that convert malate to pyruvate. Members of this CD belong to the superfamily of ACT regulatory domains.	74
153160	cd04888	ACT_PheB-BS	C-terminal ACT domain of a small (~147 a.a.) putative phenylalanine biosynthetic pathway protein described in Bacillus subtilis (BS) PheB (PheB-BS) and related domains. This CD includes the C-terminal ACT domain of a small (~147 a.a.) putative phenylalanine biosynthetic pathway protein described in Bacillus subtilis (BS) PheB (PheB-BS) and other related ACT domains. In B. subtilis, the upstream gene of pheB, pheA encodes prephenate dehydratase (PDT). The presumed product of the pheB gene is chorismate mutase (CM). The deduced product of the B. subtilis pheB gene, however, has no significant homology to the CM portion of the bifunctional CM-PDT of Escherichia coli. The presence of an ACT domain lends support to the prediction that these proteins function as a phenylalanine-binding regulatory protein. Members of this CD belong to the superfamily of ACT regulatory domains.	76
153161	cd04889	ACT_PDH-BS-like	C-terminal ACT domain of the monofunctional, NAD dependent, prephenate dehydrogenase (PDH) enzyme that catalyzes the formation of 4-hydroxyphenylpyruvate from prephenate. Included in this CD is the C-terminal ACT domain of the monofunctional, NAD dependent, prephenate dehydrogenase (PDH) enzyme that catalyzes the formation of 4-hydroxyphenylpyruvate from prephenate, found in Bacillus subtilis (BS) and other Firmicutes, Deinococci, and Bacteroidetes. PDH is the first enzyme in the aromatic amino acid pathway specific for the biosynthesis of tyrosine. This enzyme is feedback inhibited by tyrosine in B. subtilis and other microorganisms. Both phenylalanine and tryptophan have been shown to be inhibitors of this activity in B. subtilis. Bifunctional  chorismate mutase-PDH (TyrA) enzymes such as those seen in Escherichia coli do not contain an ACT domain. Also included in this CD is the N-terminal ACT domain of a novel protein composed almost entirely of two tandem ACT domains as seen in the uncharacterized structure (pdb 2F06) of the Bt0572 protein from Bacteroides thetaiotaomicron and related ACT domains. Members of this CD belong to the superfamily of ACT regulatory domains.	56
153162	cd04890	ACT_AK-like_1	ACT domains found C-terminal to the catalytic domain of aspartokinase (AK; 4-L-aspartate-4-phosphotransferase). This CD includes the first of two ACT domains found C-terminal to the catalytic domain of aspartokinase (AK; 4-L-aspartate-4-phosphotransferase). AK catalyzes the conversion of aspartate and ATP to aspartylphosphate and ADP, and is the first enzyme in the pathway of the biosynthesis of the aspartate family of amino acids, lysine, threonine, methionine, and isoleucine. This CD includes the first ACT domain of the Escherichia coli (EC) isoenzyme, AKIII (LysC) and the Arabidopsis isoenzyme, aspartate kinase 1, both enzymes monofunctional and involved in lysine synthesis, as well as the first ACT domain of Bacillus subtilis (BS) isoenzyme, AKIII (YclM), and of the Saccharomyces cerevisiae AK (Hom3). Also included are the first ACT domains of the Methylomicrobium alcaliphilum AK, the first enzyme of the ectoine biosynthetic pathway. Members of this CD belong to the superfamily of ACT regulatory domains.	62
153163	cd04891	ACT_AK-LysC-DapG-like_1	ACT domains of the lysine-sensitive aspartokinase isoenzyme AKII and related proteins. This CD includes the N-terminal of the two ACT domains of the lysine-sensitive aspartokinase isoenzyme AKII of Bacillus subtilis (BS) strain 168, and the lysine plus threonine-sensitive aspartokinase of Corynebacterium glutamicum, as well as, the first and third, of four, ACT domains present in cyanobacteria AK. Also included are the N-terminal of the two ACT domains of the diaminopimelate-sensitive aspartokinase isoenzyme AKI found in Bacilli (Bacillus subtilis strain 168), Clostridia, and Actinobacteria bacterial species. Members of this CD belong to the superfamily of ACT regulatory domains.	61
153164	cd04892	ACT_AK-like_2	ACT domains C-terminal to the catalytic domain of aspartokinase (AK; 4-L-aspartate-4-phosphotransferase). This CD includes the second of two ACT domains C-terminal to the catalytic domain of aspartokinase (AK; 4-L-aspartate-4-phosphotransferase). The exception in this group is the inclusion of the first ACT domain of the bifunctional aspartokinase - homoserine dehydrogenase-like enzyme group (ACT_AKi-HSDH-ThrA-like_1) which includes the monofunctional, threonine-sensitive, aspartokinase found in Methanococcus jannaschii and other related archaeal species. AK catalyzes the conversion of aspartate and ATP to aspartylphosphate and ADP. AK is the first enzyme in the pathway of the biosynthesis of the aspartate family of amino acids (lysine, threonine, methionine, and isoleucine) and the bacterial cell wall component, meso-diaminopimelate. One mechanism for the regulation of this pathway is by the production of several isoenzymes of AK with different repressors and allosteric inhibitors. Pairs of ACT domains are proposed to specifically bind amino acids leading to allosteric regulation of the enzyme. In Escherichia coli (EC), three different AK isoenzymes are regulated specifically by lysine, methionine, and threonine. AK-HSDHI (ThrA) and AK-HSDHII (MetL) are bifunctional enzymes that consist of an N-terminal AK and a C-terminal homoserine dehydrogenase (HSDH). ThrA and MetL are involved in threonine and methionine biosynthesis, respectively. The third isoenzyme, AKIII (LysC), is monofunctional and is involved in lysine synthesis. The three Bacillus subtilis (BS) isoenzymes, AKI (DapG), AKII (LysC), and AKIII (YclM), are feedback inhibited by meso-diaminopimelate, lysine, and lysine plus threonine, respectively. The E. coli lysine-sensitive AK is described as a homodimer, whereas the B. subtilis lysine-sensitive AK is described as a heterodimeric complex of alpha- and beta- subunits that are formed from two in-frame overlapping genes. A single AK enzyme type has been described in Pseudomonas, Amycolatopsis, and Corynebacterium, and apparently, unique to cyanobacteria, are AKs with two tandem pairs of ACT domains, C-terminal to the catalytic domain. The fungal aspartate pathway is regulated at the AK step, with L-Thr being an allosteric inhibitor of the Saccharomyces cerevisiae AK (Hom3). At least two distinct AK isoenzymes can occur in higher plants, a monofunctional lysine-sensitive isoenzyme, which is involved in the overall regulation of the pathway and can be synergistically inhibited by S-adenosylmethionine. The other isoenzyme is a bifunctional, threonine-sensitive AK-HSDH protein. Also included in this CD are the ACT domains of the Methylomicrobium alcaliphilum AK; the first enzyme of the ectoine biosynthetic pathway found in this bacterium and several other halophilic/halotolerant bacteria. Members of this CD belong to the superfamily of ACT regulatory domains.	65
153165	cd04893	ACT_GcvR_1	ACT domains that comprise the Glycine Cleavage System Transcriptional Repressor (GcvR) protein, and other related domains. This CD includes the first of the two ACT domains that comprise the Glycine Cleavage System Transcriptional Repressor (GcvR) protein, and other related domains. The glycine cleavage enzyme system in Escherichia coli provides one-carbon units for cellular methylation reactions. This enzyme system, encoded by the gcvTHP operon and lpd gene, catalyzes the cleavage of glycine into CO2 + NH3 and transfers a one-carbon unit to tetrahydrofolate, producing 5,10-methylenetetrahydrofolate. The gcvTHP operon is activated by the GcvA protein in response to glycine and repressed by a GcvA/GcvR interaction in the absence of glycine. It has been proposed that the co-activator glycine acts through a mechanism of de-repression by binding to GcvR and preventing GcvR from interacting with GcvA to block GcvA's activator function. Evidence also suggests that GcvR interacts directly with GcvA rather than binding to DNA to cause repression. Members of this CD belong to the superfamily of ACT regulatory domains.	77
153166	cd04894	ACT_ACR-like_1	ACT domain-containing protein which is composed almost entirely of four ACT domain repeats (the "ACR" protein). This CD includes the N-terminal ACT domain of a novel type of ACT domain-containing protein which is composed almost entirely of four ACT domain repeats (the "ACR" protein). ACR proteins, found only in Arabidopsis and Oryza, as yet, are proposed to function as novel regulatory or sensor proteins in plants. Nine ACR gene products (ACR1-8 in Arabidopsis and OsARC1-9 in Oryza) have been described, however, the ACR-like sequences in this CD are distinct from those characterized. This CD includes the Oryza sativa ACR-like protein (Os05g0113000) encoded on chromosome 5 and the Arabidopsis thaliana predicted gene product, At2g39570. Members of this CD belong to the superfamily of ACT regulatory domains.	69
153167	cd04895	ACT_ACR_1	ACT domain-containing protein which is composed almost entirely of four ACT domain repeats (the "ACR" protein). This CD includes the N-terminal ACT domain, of a novel type of ACT domain-containing protein which is composed almost entirely of four ACT domain repeats (the "ACR" protein). ACR proteins, found only in Arabidopsis and Oryza, as yet, are proposed to function as novel regulatory or sensor proteins in plants. Nine ACR gene products have been described (ACR1-8 in Arabidopsis and OsARC1-9 in Oryza) and are represented in this CD. Members of this CD belong to the superfamily of ACT regulatory domains.	72
153168	cd04896	ACT_ACR-like_3	ACT domain-containing protein which is composed almost entirely of four ACT domain repeats (the "ACR" protein). This CD includes the third ACT domain, of a novel type of ACT domain-containing protein which is composed almost entirely of four ACT domain repeats (the "ACR" protein). ACR proteins, found only in Arabidopsis and Oryza, as yet, are proposed to function as novel regulatory or sensor proteins in plants. Nine ACR gene products (ACR1-8 in Arabidopsis and OsARC1-9 in Oryza) have been described, however, the ACR-like sequences in this CD are distinct from those characterized. This CD includes the Oryza sativa ACR-like protein (Os05g0113000) encoded on chromosome 5 and the Arabidopsis thaliana predicted gene product, At2g39570. Members of this CD belong to the superfamily of ACT regulatory domains.	75
153169	cd04897	ACT_ACR_3	ACT domain-containing protein which is composed almost entirely of four ACT domain repeats (the "ACR" protein). This CD includes the third ACT domain, of a novel type of ACT domain-containing protein which is composed almost entirely of four ACT domain repeats (the "ACR" protein). ACR proteins, found only in Arabidopsis and Oryza, as yet, are proposed to function as novel regulatory or sensor proteins in plants. Nine ACR gene products have been described (ACR1-8 in Arabidopsis and OsARC1-9 in Oryza) and are represented in this CD. Members of this CD belong to the superfamily of ACT regulatory domains.	75
153170	cd04898	ACT_ACR-like_4	ACT domain-containing protein which is composed almost entirely of four ACT domain repeats (the "ACR" protein). This CD includes the C-terminal ACT domain, of a novel type of ACT domain-containing protein which is composed almost entirely of four ACT domain repeats (the "ACR" protein). ACR proteins, found only in Arabidopsis and Oryza, as yet, are proposed to function as novel regulatory or sensor proteins in plants. Nine ACR gene products (ACR1-8 in Arabidopsis and OsARC1-9 in Oryza) have been described, however, the ACR-like sequences in this CD are distinct from those characterized. This CD includes the Oryza sativa ACR-like protein (Os05g0113000) encoded on chromosome 5 and the Arabidopsis thaliana  predicted gene product,  At2g39570. Members of this CD belong to the superfamily of ACT regulatory domains.	77
153171	cd04899	ACT_ACR-UUR-like_2	C-terminal ACT domains of the bacterial signal-transducing uridylyltransferase /uridylyl-removing (UUR) enzyme, GlnD and related domains. This ACT domain family, ACT_ACR-UUR-like_2, includes the second of two C-terminal ACT domains of the bacterial signal-transducing uridylyltransferase /uridylyl-removing (UUR) enzyme, GlnD; including those enzymes similar to the GlnD found in enteric Escherichia coli and those found in photosynthetic, nitrogen-fixing bacterium Rhodospirillum rubrum. Also included in this CD are the second and fourth ACT domains of a novel protein composed almost entirely of ACT domain repeats, the ACR protein. These ACR proteins, found in Arabidopsis and Oryza, are proposed to function as novel regulatory or sensor proteins in plants. Members of this CD belong to the superfamily of ACT regulatory domains.	70
153172	cd04900	ACT_UUR-like_1	ACT domain family, ACT_UUR-like_1, includes the first of two C-terminal ACT domains of the bacterial signal-transducing uridylyltransferase/uridylyl-removing (UUR) enzyme, GlnD and related domains. This ACT domain family, ACT_UUR-like_1, includes the first of two C-terminal ACT domains of the bacterial signal-transducing uridylyltransferase/uridylyl-removing (UUR) enzyme, GlnD; including those enzymes similar to the GlnD found in enteric Escherichia coli and those found in photosynthetic, nitrogen-fixing bacterium Rhodospirillum rubrum. Also included in this CD is the N-terminal ACT domain of a yet uncharacterized Arabidopsis/Oryza predicted tyrosine kinase. Members of this CD belong to the superfamily of ACT regulatory domains.	73
153173	cd04901	ACT_3PGDH	C-terminal ACT (regulatory) domain of D-3-Phosphoglycerate Dehydrogenase (3PGDH) found in fungi and bacteria. The C-terminal ACT (regulatory) domain of D-3-Phosphoglycerate Dehydrogenase (3PGDH) found in fungi and bacteria. 3PGDH is an enzyme that belongs to the D-isomer specific, 2-hydroxyacid dehydrogenase family and catalyzes the oxidation of D-3-phosphoglycerate to 3-phosphohydroxypyruvate, which is the first step in the biosynthesis of L-serine, using NAD+ as the oxidizing agent. In Escherichia coli, the SerA 3PGDH is feedback-controlled by the end product L-serine in an allosteric manner. In the homotetrameric enzyme, the interface at adjacent ACT (regulatory) domains couples to create an extended beta-sheet. Each regulatory interface forms two serine-binding sites. The mechanism by which serine transmits inhibition to the active site is postulated to involve the tethering of the regulatory domains together to create a rigid quaternary structure with a solvent-exposed active site cleft. Members of this CD belong to the superfamily of ACT regulatory domains.	69
153174	cd04902	ACT_3PGDH-xct	C-terminal ACT (regulatory) domain of D-3-phosphoglycerate dehydrogenase (3PGDH). The C-terminal ACT (regulatory) domain of D-3-phosphoglycerate dehydrogenase (3PGDH), with an extended C-terminal (xct) region from bacteria, archaea, fungi, and plants. 3PGDH is an enzyme that belongs to the D-isomer specific, 2-hydroxyacid dehydrogenase family and catalyzes the oxidation of D-3-phosphoglycerate to 3-phosphohydroxypyruvate, which is the first step in the biosynthesis of L-serine, using NAD+ as the oxidizing agent. In bacteria, 3PGDH is feedback-controlled by the end product L-serine in an allosteric manner. Some 3PGDH enzymes have an additional domain formed by an extended C-terminal region. This additional domain introduces significant asymmetry to the homotetramer. Adjacent ACT (regulatory) domains interact, creating two serine-binding sites; however, this asymmetric arrangement results in the formation of two different and distinct domain interfaces between identical domains in the asymmetric unit. How this asymmetry influences the mechanism of effector inhibition is still unknown. Members of this CD belong to the superfamily of ACT regulatory domains.	73
153175	cd04903	ACT_LSD	C-terminal ACT domain of the L-serine dehydratase (LSD), iron-sulfur-dependent, beta subunit. The C-terminal ACT domain of the L-serine dehydratase (LSD), iron-sulfur-dependent, beta subunit, found in various bacterial anaerobes such as Clostridium, Bacillus, and Treponema species. These enzymes catalyze the deamination of L-serine, producing pyruvate and ammonia. Unlike the eukaryotic L-serine dehydratase, which requires the pyridoxal-5'-phosphate (PLP) cofactor, the prokaryotic L-serine dehydratase contains an [4Fe-4S] cluster instead of a PLP active site. The LSD alpha and beta subunits of the 'clostridial' enzyme are encoded by the sdhA and sdhB genes. The single subunit bacterial homologs of L-serine dehydratase (LSD1, LSD2, TdcG) present in Escherichia coli, and other Enterobacteriales, lack the ACT domain described here. Members of this CD belong to the superfamily of ACT regulatory domains.	71
153176	cd04904	ACT_AAAH	ACT domain of the nonheme iron-dependent, aromatic amino acid hydroxylases (AAAH). ACT domain of the nonheme iron-dependent, aromatic amino acid hydroxylases (AAAH): Phenylalanine hydroxylases (PAH), tyrosine hydroxylases (TH) and tryptophan hydroxylases (TPH), both peripheral (TPH1) and neuronal (TPH2) enzymes. This family of enzymes shares a common catalytic mechanism, in which dioxygen is used by an active site containing a single, reduced iron atom to hydroxylate an unactivated aromatic substrate, concomitant with a two-electron oxidation of tetrahydropterin (BH4) cofactor to its quinonoid dihydropterin form. PAH catalyzes the hydroxylation of L-Phe to L-Tyr, the first step in the catabolic degradation of L-Phe; TH catalyses the hydroxylation of L-Tyr to 3,4-dihydroxyphenylalanine, the rate limiting step in the biosynthesis of catecholamines; and TPH catalyses the hydroxylation of L-Trp to 5-hydroxytryptophan, the rate limiting step in the biosynthesis of 5-hydroxytryptamine (serotonin) and the first reaction in the synthesis of melatonin. Eukaryotic AAAHs have an N-terminal  ACT (regulatory) domain, a middle catalytic domain and a C-terminal domain which is responsible for the oligomeric state of the enzyme forming a domain-swapped tetrameric coiled-coil. The PAH, TH, and TPH enzymes contain highly conserved catalytic domains but distinct N-terminal ACT domains (this CD) and differ in their mechanisms of regulation. One commonality is that all three eukaryotic enzymes are regulated in part by the phosphorylation of serine residues N-terminal of the ACT domain. Members of this CD belong to the superfamily of ACT regulatory domains.	74
153177	cd04905	ACT_CM-PDT	C-terminal ACT domain of the bifunctional chorismate mutase-prephenate dehydratase (CM-PDT) enzyme and the prephenate dehydratase (PDT) enzyme. The C-terminal ACT domain of the bifunctional chorismate mutase-prephenate dehydratase (CM-PDT) enzyme and the prephenate dehydratase (PDT) enzyme, found in plants, fungi, bacteria, and archaea. The P-protein of E. coli (CM-PDT, PheA) catalyzes the conversion of chorismate to prephenate and then the decarboxylation and dehydration to form phenylpyruvate. These are the first two steps in the biosynthesis of L-Phe and L-Tyr via the shikimate pathway in microorganisms and plants. The E. coli P-protein (CM-PDT) has three domains with an N-terminal domain with chorismate mutase activity, a middle domain with prephenate dehydratase activity, and an ACT regulatory C-terminal domain. The prephenate dehydratase enzyme has a PDT and ACT domain. The ACT domain is essential to bring about the negative allosteric regulation by L-Phe binding. L-Phe binds with positive cooperativity; with this binding, there is a shift in the protein to less active tetrameric and higher oligomeric forms from a more active dimeric form. Members of this CD belong to the superfamily of ACT regulatory domains.	80
153178	cd04906	ACT_ThrD-I_1	First of two tandem C-terminal ACT domains of threonine dehydratase I (ThrD-I; L-threonine hydrolyase). This CD includes the first of two tandem C-terminal ACT domains of threonine dehydratase I (ThrD-I; L-threonine hydrolyase) which catalyzes the committed step in branched chain amino acid biosynthesis in plants and microorganisms, the pyridoxal 5'-phosphate (PLP)-dependent dehydration/deamination of L-threonine (or L-serine) to 2-ketobutyrate (or pyruvate). ThrD-I is a cooperative, feedback-regulated (isoleucine and valine) allosteric enzyme that forms a tetramer and contains four pyridoxal phosphate moieties. Members of this CD belong to the superfamily of ACT regulatory domains.	85
153179	cd04907	ACT_ThrD-I_2	Second of two tandem C-terminal ACT domains of threonine dehydratase I (ThrD-I; L-threonine hydrolyase). This CD includes the second of two tandem C-terminal ACT domains of threonine dehydratase I (ThrD-I; L-threonine hydrolyase) which catalyzes the committed step in branched chain amino acid biosynthesis in plants and microorganisms, the pyridoxal 5'-phosphate (PLP)-dependent dehydration/deamination of L-threonine (or L-serine) to 2-ketobutyrate (or pyruvate). ThrD-I is a cooperative, feedback-regulated (isoleucine and valine) allosteric enzyme that forms a tetramer and contains four pyridoxal phosphate moieties. Members of this CD belong to the superfamily of ACT regulatory domains.	81
153180	cd04908	ACT_Bt0572_1	N-terminal ACT domain of a novel protein composed almost entirely of two tandem ACT domains. Included in this CD is the N-terminal ACT domain of a novel protein composed almost entirely of two tandem ACT domains as seen in the uncharacterized structure (pdb 2F06) of the Bt0572 protein from Bacteroides thetaiotaomicron and related ACT domains. These tandem ACT domain proteins belong to the superfamily of ACT regulatory domains.	66
153181	cd04909	ACT_PDH-BS	C-terminal ACT domain of the monofunctional, NAD dependent, prephenate dehydrogenase (PDH). The C-terminal ACT domain of the monofunctional, NAD dependent, prephenate dehydrogenase (PDH) enzyme that catalyzes the formation of 4-hydroxyphenylpyruvate from prephenate, found in Bacillus subtilis (BS) and other Firmicutes, Deinococci, and Bacteroidetes. PDH is the first enzyme in the aromatic amino acid pathway specific for the biosynthesis of tyrosine. This enzyme is feedback-inhibited by tyrosine in B. subtilis and other microorganisms. Both phenylalanine and tryptophan have been shown to be inhibitors of this activity in B. subtilis. Bifunctional  chorismate mutase-PDH (TyrA) enzymes such as those seen in Escherichia coli  do not contain an ACT domain. Members of this CD belong to the superfamily of ACT regulatory domains.	69
153182	cd04910	ACT_AK-Ectoine_1	ACT domains located C-terminal to the catalytic domain of the aspartokinase of the ectoine (1,4,5,6-tetrahydro-2-methyl pyrimidine-4-carboxylate) biosynthetic pathway. This CD includes the first of two ACT domains located C-terminal to the catalytic domain of the aspartokinase of the ectoine (1,4,5,6-tetrahydro-2-methyl pyrimidine-4-carboxylate) biosynthetic pathway found in Methylomicrobium alcaliphilum, Vibrio cholerae, and various other halotolerant or halophilic bacteria. Bacteria exposed to hyperosmotic stress accumulate organic solutes called 'compatible solutes' of which ectoine, a heterocyclic amino acid, is one. Apart from its osmotic function, ectoine also exhibits a protective effect on proteins, nucleic acids and membranes against a variety of stress factors. de novo synthesis of ectoine starts with the phosphorylation of L-aspartate and shares its first two enzymatic steps with the biosynthesis of amino acids of the aspartate family: aspartokinase and L-aspartate-semialdehyde dehydrogenase. The M. alcaliphilum and the V. cholerae aspartokinases are encoded on the ectABCask operon. Members of this CD belong to the superfamily of ACT regulatory domains.	71
153183	cd04911	ACT_AKiii-YclM-BS_1	ACT domains located C-terminal to the catalytic domain of the lysine plus threonine-sensitive aspartokinase isoenzyme AKIII. This CD includes the first of two ACT domains located C-terminal to the catalytic domain of the lysine plus threonine-sensitive aspartokinase isoenzyme AKIII, a monofunctional class enzyme found in Bacilli (Bacillus subtilis (BS) YclM) and Clostridia species. Aspartokinase is the first enzyme in the aspartate metabolic pathway and catalyzes the conversion of aspartate and ATP to aspartylphosphate and ADP. Bacillus subtilis YclM is reported to be a single polypeptide of 50 kD. AKIII from Bacillus subtilis strain 168 is induced by lysine and repressed by threonine and it is synergistically inhibited by lysine and threonine. Members of this CD belong to the superfamily of ACT regulatory domains.	76
153184	cd04912	ACT_AKiii-LysC-EC-like_1	ACT domains located C-terminal to the catalytic domain of  the lysine-sensitive aspartokinase isoenzyme AKIII. This CD includes the first of two ACT domains located C-terminal to the catalytic domain of  the lysine-sensitive aspartokinase isoenzyme AKIII, a monofunctional class enzyme found in bacteria (Escherichia coli (EC) LysC) and plants, (Zea mays Ask1, Ask2, and Arabidopsis thaliana AK1). Aspartokinase is the first enzyme in the aspartate metabolic pathway and catalyzes the conversion of aspartate and ATP to aspartylphosphate and ADP. Like the A. thaliana AK1 (AK1-AT), the E. coli AKIII (LysC) has two bound feedback allosteric inhibitor lysine molecules at the dimer interface located between the ACT1 domain of two subunits. The lysine-sensitive plant isoenzyme is synergistically inhibited by S-adenosylmethionine. A homolog of this group appears to be the Saccharomyces cerevisiae AK (Hom3) which clusters with this group as well. Members of this CD belong to the superfamily of ACT regulatory domains.	75
153185	cd04913	ACT_AKii-LysC-BS-like_1	ACT domains of the lysine-sensitive aspartokinase isoenzyme AKII of Bacillus subtilis (BS) strain 168 and related proteins. This CD includes the N-terminal of the two ACT domains of the lysine-sensitive aspartokinase isoenzyme AKII of Bacillus subtilis (BS) strain 168, and the lysine plus threonine-sensitive aspartokinase of Corynebacterium glutamicum, and related sequences. In B. subtilis 168, the regulation of the diaminopimelate (Dap)-lysine biosynthetic pathway involves dual control by Dap and lysine, effected through separate Dap- and lysine-sensitive aspartokinase isoenzymes. The B. subtilis 168 AKII is induced by methionine and repressed and inhibited by lysine. Although Corynebacterium glutamicum is known to contain a single aspartokinase, both the succinylase and dehydrogenase variant pathways of DAP-lysine synthesis operate simultaneously in this organism. In corynebacteria and other various Gram-positive bacteria, the DAP-lysine pathway is feedback regulated by the concerted action of lysine and threonine. Conserved residues in the ACT domains have been shown to be involved in this concerted feedback inhibition. Also included in this CD are the aspartokinases of the extreme thermophile, Thermus thermophilus HB27, the Gram-negative obligate methylotroph, Methylophilus methylotrophus AS1, and those single aspartokinases found in Pseudomonas aeruginosa, C. glutamicum, and Amycolatopsis lactamdurans. B. subtilis 168 AKII, and the C. glutamicum, Streptomyces clavuligerus and A. lactamdurans aspartokinases are described as tetramers consisting of two alpha and two beta subunits; the alpha (44 kD) and beta (18 kD) subunits formed by two in-phase overlapping polypeptides. This CD includes the first ACT domain C-terminal to the AK catalytic domain of the alpha subunit and the first ACT domain of the beta subunit that lacks the AK catalytic domain. Unlike the C. glutamicum AK beta subunit, which is involved in feedback regulation, the B. subtilis AKII beta subunit is not. Cyanobacteria aspartokinases are unique to this CD and they have a unique domain architecture with two tandem pairs of ACT domains, C-terminal to the catalytic AK domain. In this CD, the first and third cyanobacteria AK ACT domains are present. Members of this CD belong to the superfamily of ACT regulatory domains.	75
153186	cd04914	ACT_AKi-DapG-BS_1	ACT domains of the diaminopimelate-sensitive aspartokinase (AK) isoenzyme AKI. This CD includes the N-terminal of the two ACT domains of the diaminopimelate-sensitive aspartokinase (AK) isoenzyme AKI, a monofunctional class enzyme found in Bacilli (Bacillus subtilis (BS) strain 168), Clostridia, and Actinobacteria, bacterial species. In B. subtilis, the regulation of the diaminopimelate-lysine biosynthetic pathway involves dual control by diaminopimelate and lysine, effected through separate diaminopimelate- and lysine-sensitive aspartokinase isoenzymes. AKI activity is invariant during the exponential and stationary phases of growth and is not altered by addition of amino acids to the growth medium. The role of this isoenzyme is most likely to provide a constant level of aspartyl-beta-phosphate for the biosynthesis of diaminopimelate for peptidoglycan synthesis and dipicolinate during sporulation. The B. subtilis AKI is tetrameric consisting of two alpha and two beta subunits; the alpha (43 kD) and beta (17 kD) subunit formed by two in-phase overlapping genes. The alpha subunit contains the AK catalytic domain and two ACT domains. The beta subunit contains two ACT domains. Members of this CD belong to the superfamily of ACT regulatory domains.	67
153187	cd04915	ACT_AK-Ectoine_2	ACT domains located C-terminal to the catalytic domain of the aspartokinase of the ectoine (1,4,5,6-tetrahydro-2-methyl pyrimidine-4-carboxylate) biosynthetic pathway. This CD includes the second of two ACT domains located C-terminal to the catalytic domain of the aspartokinase of the ectoine (1,4,5,6-tetrahydro-2-methyl pyrimidine-4-carboxylate) biosynthetic pathway found in Methylomicrobium alcaliphilum, Vibrio cholerae, and various other halotolerant or halophilic bacteria. Bacteria exposed to hyperosmotic stress accumulate organic solutes called 'compatible solutes'  of which ectoine, a heterocyclic amino acid, is one. Apart from its osmotic function, ectoine also exhibits a protective effect on proteins, nucleic acids and membranes against a variety of stress factors. de novo synthesis of ectoine starts with the phosphorylation of L-aspartate and shares its first two enzymatic steps with the biosynthesis of amino acids of the aspartate family: aspartokinase and L-aspartate-semialdehyde dehydrogenase. The M. alcaliphilum and the V. cholerae aspartokinases are encoded on the ectABCask operon. Members of this CD belong to the superfamily of ACT regulatory domains.	66
153188	cd04916	ACT_AKiii-YclM-BS_2	ACT domains located C-terminal to the catalytic domain of the lysine plus threonine-sensitive aspartokinase isoenzyme AKIII. This CD includes the second of two ACT domains located C-terminal to the catalytic domain of the lysine plus threonine-sensitive aspartokinase isoenzyme AKIII, a monofunctional class enzyme found in Bacilli (Bacillus subtilis (BS) YclM) and Clostridia species. Aspartokinase is the first enzyme in the aspartate metabolic pathway and catalyzes the conversion of aspartate and ATP to aspartylphosphate and ADP. B. subtilis YclM is reported to be a single polypeptide of 50 kD. AKIII from B. subtilis strain 168 is induced by lysine and repressed by threonine and it is synergistically inhibited by lysine and threonine. Members of this CD belong to the superfamily of ACT regulatory domains.	66
153189	cd04917	ACT_AKiii-LysC-EC_2	ACT domains located C-terminal to the catalytic domain of the lysine-sensitive aspartokinase isoenzyme AKIII. This CD includes the second of two ACT domains located C-terminal to the catalytic domain of the lysine-sensitive aspartokinase isoenzyme AKIII, a monofunctional class enzyme found in bacteria (Escherichia coli (EC) LysC). Aspartokinase is the first enzyme in the aspartate metabolic pathway and catalyzes the conversion of aspartate and ATP to aspartylphosphate and ADP. The E. coli AKIII (LysC) binds two feedback allosteric inhibitor lysine molecules at the dimer interface located between the ACT1 domain of two subunits. The second ACT domain (ACT2), this CD, is not involved in the binding of heterotropic effectors. Members of this CD belong to the superfamily of ACT regulatory domains.	64
153190	cd04918	ACT_AK1-AT_2	ACT domains located C-terminal to the catalytic domain of a monofunctional, lysine-sensitive, plant aspartate kinase 1 (AK1). This CD includes the second of two ACT domains located C-terminal to the catalytic domain of a monofunctional, lysine-sensitive, plant aspartate kinase 1 (AK1), which can be synergistically inhibited by S-adenosylmethionine (SAM). This isoenzyme is found in higher plants, Arabidopsis thaliana (AT) and Zea mays, and also in Chlorophyta. In its inactive state, Arabidopsis AK1 binds the effectors lysine and SAM (two molecules each) at the interface of two ACT1 domain subunits. The second ACT domain (ACT2), this CD, does not interact with an effector. Members of this CD belong to the superfamily of ACT regulatory domains.	65
153191	cd04919	ACT_AK-Hom3_2	ACT domains located C-terminal to the catalytic domain of the aspartokinase (AK) HOM3. This CD includes the second of two ACT domains located C-terminal to the catalytic domain of the aspartokinase (AK) HOM3, a monofunctional class enzyme found in Saccharomyces cerevisiae, and other related ACT domains. AK is the first enzyme in the aspartate metabolic pathway, catalyzes the conversion of aspartate and ATP to aspartylphosphate and ADP, and in fungi, is responsible for the production of threonine, isoleucine and methionine. S. cerevisiae has a single AK, which is regulated by feedback, allosteric inhibition by L-threonine. Recent studies have shown that the allosteric transition triggered by binding of threonine to AK involves a large change in the conformation of the native hexameric enzyme that is converted to an inactive one of different shape and substantially smaller hydrodynamic size. Members of this CD belong to the superfamily of ACT regulatory domains.	66
153192	cd04920	ACT_AKiii-DAPDC_2	ACT domains of a bifunctional AKIII (LysC)-like aspartokinase/meso-diaminopimelate decarboxylase (DAPDC). This CD includes the second of two ACT domains of a bifunctional AKIII (LysC)-like aspartokinase/meso-diaminopimelate decarboxylase (DAPDC) bacterial protein. Aspartokinase (AK) is the first enzyme in the aspartate metabolic pathway and catalyzes the conversion of aspartate and ATP to aspartylphosphate and ADP. The lysA gene encodes the enzyme DAPDC, a pyridoxal-5'-phosphate (PLP)-dependent enzyme which catalyzes the final step in the lysine biosynthetic pathway converting meso-diaminopimelic acid (DAP) to l-lysine. Tandem ACT domains are positioned centrally with the AK catalytic domain N-terminal and the DAPDC domains C-terminal. Members of this CD belong to the superfamily of ACT regulatory domains.	63
153193	cd04921	ACT_AKi-HSDH-ThrA-like_1	ACT domains of the bifunctional enzyme aspartokinase (AK) - homoserine dehydrogenase (HSDH). This CD includes the first of two ACT domains of the bifunctional enzyme aspartokinase (AK) - homoserine dehydrogenase (HSDH). The ACT domains are positioned between the N-terminal catalytic domain of AK and the C-terminal HSDH domain found in bacteria (Escherichia coli (EC) ThrA) and higher plants (Zea mays AK-HSDH). AK and HSDH are the first and third enzymes in the biosynthetic pathway of the aspartate family of amino acids. AK catalyzes the phosphorylation of Asp to P-aspartyl phosphate. HSDH catalyzes the NADPH-dependent conversion of Asp 3-semialdehyde to homoserine. HSDH is the first committed reaction in the branch of the pathway that leads to Thr and Met. In E. coli, ThrA is subject to allosteric regulation by the end product L-threonine and the native enzyme is reported to be tetrameric. As with bacteria, plant AK and HSDH are feedback inhibited by pathway end products. Maize AK-HSDH is a Thr-sensitive 180-kD enzyme. Arabidopsis AK-HSDH is an alanine-activated, threonine-sensitive enzyme whose ACT domains were shown to be involved in allosteric activation. Also included in this CD is the first of two ACT domains of a tetrameric, monofunctional, threonine-sensitive, AK found in Methanococcus jannaschii and other related archaeal species. Members of this CD belong to the superfamily of ACT regulatory domains.	80
153194	cd04922	ACT_AKi-HSDH-ThrA_2	ACT domains of the bifunctional enzyme aspartokinase (AK) - homoserine dehydrogenase (HSDH). This CD includes the second  of two ACT domains of the bifunctional enzyme aspartokinase (AK) - homoserine dehydrogenase (HSDH). The ACT domains are positioned between the N-terminal catalytic domain of AK and the C-terminal HSDH domain found in bacteria (Escherichia coli (EC) ThrA) and higher plants (Zea mays AK-HSDH). AK and HSDH are the first and third enzymes in the biosynthetic pathway of the aspartate family of amino acids. AK catalyzes the phosphorylation of Asp to P-aspartyl phosphate. HSDH catalyzes the NADPH-dependent conversion of Asp 3-semialdehyde to homoserine. HSDH is the first committed reaction in the branch of the pathway that leads to Thr and Met. In E. coli, ThrA is subject to allosteric regulation by the end product L-threonine and the native enzyme is reported to be tetrameric. As with bacteria, plant AK and HSDH are feedback inhibited by pathway end products. Maize AK-HSDH is a Thr-sensitive 180-kD enzyme. Arabidopsis AK-HSDH is an alanine-activated, threonine-sensitive enzyme whose ACT domains were shown to be involved in allosteric activation. Members of this CD belong to the superfamily of ACT regulatory domains.	66
153195	cd04923	ACT_AK-LysC-DapG-like_2	ACT domains of the lysine-sensitive aspartokinase isoenzyme AKII of Bacillus subtilis (BS) strain 168 and related domains. This CD includes the C-terminal of the two ACT domains of the lysine-sensitive aspartokinase isoenzyme AKII of Bacillus subtilis (BS) strain 168, and the lysine plus threonine-sensitive aspartokinase of Corynebacterium glutamicum, as well as, the second and fourth, of four, ACT domains present in cyanobacteria AK. Also included are the C-terminal of the two ACT domains of the diaminopimelate-sensitive aspartokinase isoenzyme AKI found in Bacilli (B. subtilis strain 168), Clostridia, and Actinobacteria bacterial species. Members of this CD belong to the superfamily of ACT regulatory domains.	63
153196	cd04924	ACT_AK-Arch_2	ACT domains of a monofunctional aspartokinase found mostly in Archaea species (ACT_AK-Arch_2). Included in this CD is the second of two ACT domains of a monofunctional aspartokinase found mostly in Archaea species (ACT_AK-Arch_2). The first or N-terminal ACT domain of these proteins cluster with the ThrA-like ACT 1 domains (ACT_AKi-HSDH-ThrA-like_1) which includes the threonine-sensitive archaeal Methanococcus jannaschii aspartokinase ACT 1 domain. Members of this CD belong to the superfamily of ACT regulatory domains.	66
153197	cd04925	ACT_ACR_2	ACT domain-containing protein which is composed almost entirely of four ACT domain repeats (the "ACR" protein). This CD includes the second ACT domain, of a novel type of ACT domain-containing protein which is composed almost entirely of four ACT domain repeats (the "ACR" protein). ACR proteins, found only in Arabidopsis and Oryza, as yet, are proposed to function as novel regulatory or sensor proteins in plants. Nine ACR gene products have been described (ACR1-8 in Arabidopsis and OsARC1-9 in Oryza) and are represented in this CD. Members of this CD belong to the superfamily of ACT regulatory domains.	74
153198	cd04926	ACT_ACR_4	C-terminal  ACT domain, of a novel type of ACT domain-containing protein which is composed almost entirely of four ACT domain repeats (the "ACR" protein). This CD includes the C-terminal  ACT domain, of a novel type of ACT domain-containing protein which is composed almost entirely of four ACT domain repeats (the "ACR" protein). ACR proteins, found only in Arabidopsis and Oryza, as yet, are proposed to function as novel regulatory or sensor proteins in plants. Nine ACR gene products have been described (ACR1-8 in Arabidopsis and OsARC1-9 in Oryza) and are represented in this CD. Members of this CD belong to the superfamily of ACT regulatory domains.	72
153199	cd04927	ACT_ACR-like_2	Second  ACT domain, of a novel type of ACT domain-containing protein which is composed almost entirely of four ACT domain repeats (the "ACR" protein). This CD includes the second  ACT domain, of a novel type of ACT domain-containing protein which is composed almost entirely of four ACT domain repeats (the "ACR" protein). ACR proteins, found only in Arabidopsis and Oryza, as yet, are proposed to function as novel regulatory or sensor proteins in plants. Nine ACR gene products (ACR1-8 in Arabidopsis and OsARC1-9 in Oryza) have been described, however, the ACR-like sequences in this CD are distinct from those characterized. This CD includes the Oryza sativa ACR-like protein (Os05g0113000) encoded on chromosome 5 and the Arabidopsis thaliana  predicted gene product, At2g39570. Members of this CD belong to the superfamily of ACT regulatory domains.	76
153200	cd04928	ACT_TyrKc	Uncharacterized, N-terminal ACT domain of an Arabidopsis/Oryza predicted tyrosine kinase and other related ACT domains. This CD includes a novel, yet uncharacterized, N-terminal ACT domain of an Arabidopsis/Oryza predicted tyrosine kinase and other related ACT domains. Members of this CD belong to the superfamily of ACT regulatory domains.	68
153201	cd04929	ACT_TPH	ACT domain of the nonheme iron-dependent aromatic amino acid hydroxylase, tryptophan hydroxylases (TPH), both peripheral (TPH1) and neuronal (TPH2) enzymes. ACT domain of the nonheme iron-dependent aromatic amino acid hydroxylase, tryptophan hydroxylases (TPH), both peripheral (TPH1) and neuronal (TPH2) enzymes. TPH catalyses the hydroxylation of L-Trp to 5-hydroxytryptophan, the rate limiting step in the biosynthesis of 5-hydroxytryptamine (serotonin) and the first reaction in the synthesis of melatonin. Very little is known about the role of the ACT domain in TPH, which appears to be regulated by phosphorylation but not by its substrate or cofactor. Members of this CD belong to the superfamily of ACT regulatory domains.	74
153202	cd04930	ACT_TH	ACT domain of the nonheme iron-dependent aromatic amino acid hydroxylase, tyrosine hydroxylases (TH). ACT domain of the nonheme iron-dependent aromatic amino acid hydroxylase, tyrosine hydroxylases (TH). TH catalyses the hydroxylation of L-Tyr to 3,4-dihydroxyphenylalanine, the rate limiting step in the biosynthesis of catecholamines (dopamine, noradrenaline and adrenaline), functioning as hormones and neurotransmitters. The enzyme is not regulated by its amino acid substrate, but instead by phosphorylation at several serine residues located N-terminal of the ACT domain, and by feedback inhibition by catecholamines at the active site. Members of this CD belong to the superfamily of ACT regulatory domains.	115
153203	cd04931	ACT_PAH	ACT domain of the nonheme iron-dependent aromatic amino acid hydroxylase, phenylalanine hydroxylases (PAH). ACT domain of the nonheme iron-dependent aromatic amino acid hydroxylase, phenylalanine hydroxylases (PAH). PAH catalyzes the hydroxylation of L-Phe to L-Tyr, the first step in the catabolic degradation of L-Phe. In PAH, an autoregulatory sequence, N-terminal of the ACT domain, extends across the catalytic domain active site and regulates the enzyme by intrasteric regulation. It appears that the activation by L-Phe induces a conformational change that converts the enzyme to a high-affinity and high-activity state. Modulation of activity is achieved through inhibition by BH4 and activation by phosphorylation of serine residues of the autoregulatory region. The molecular basis for the cooperative activation process is not fully understood yet. Members of this CD belong to the superfamily of ACT regulatory domains.	90
153204	cd04932	ACT_AKiii-LysC-EC_1	ACT domains located C-terminal to the catalytic domain of the lysine-sensitive aspartokinase isoenzyme AKIII. This CD includes the first of two ACT domains located C-terminal to the catalytic domain of the lysine-sensitive aspartokinase isoenzyme AKIII, a monofunctional class enzyme found in bacteria (Escherichia coli (EC) LysC). Aspartokinase is the first enzyme in the aspartate metabolic pathway and catalyzes the conversion of aspartate and ATP to aspartylphosphate and ADP. The E. coli AKIII (LysC) binds two feedback allosteric inhibitor lysine molecules at the dimer interface located between the ACT1 domain of two subunits. Members of this CD belong to the superfamily of ACT regulatory domains.	75
153205	cd04933	ACT_AK1-AT_1	ACT domains located C-terminal to the catalytic domain of a monofunctional, lysine-sensitive, plant aspartate kinase 1 (AK1). This CD includes the first of two ACT domains located C-terminal to the catalytic domain of a monofunctional, lysine-sensitive, plant aspartate kinase 1 (AK1), which can be synergistically inhibited by S-adenosylmethionine. This isoenzyme is found in higher plants, Arabidopsis thaliana (AT) and Zea mays, and also in Chlorophyta. Like the Escherichia coli AKIII (LysC), Arabidopsis AK1 binds two feedback allosteric inhibitor lysine molecules at the dimer interface located between the ACT1 domain of two subunits. A loop in common is involved in the binding of both Lys and S-adenosylmethionine providing an explanation for the synergistic inhibition by these effectors. Members of this CD belong to the superfamily of ACT regulatory domains.	78
153206	cd04934	ACT_AK-Hom3_1	ACT domains located C-terminal to the catalytic domain of the aspartokinase (AK) HOM3, a monofunctional class enzyme found in Saccharomyces cerevisiae, and other related ACT domains. This CD includes the first of two ACT domains located C-terminal to the catalytic domain of the aspartokinase (AK) HOM3, a monofunctional class enzyme found in Saccharomyces cerevisiae, and other related ACT domains. AK is the first enzyme in the aspartate metabolic pathway, catalyzes the conversion of aspartate and ATP to aspartylphosphate and ADP, and in fungi, is responsible for the production of threonine, isoleucine and methionine. S. cerevisiae has a single AK, which is regulated by feedback, allosteric inhibition by L-threonine. Recent studies have shown that the allosteric transition triggered by binding of threonine to AK involves a large change in the conformation of the native hexameric enzyme that is converted to an inactive one of different shape and substantially smaller hydrodynamic size. Members of this CD belong to the superfamily of ACT regulatory domains.	73
153207	cd04935	ACT_AKiii-DAPDC_1	ACT domains of a bifunctional AKIII (LysC)-like aspartokinase/meso-diaminopimelate decarboxylase (DAPDC) bacterial protein. This CD includes the first of two ACT domains of a bifunctional AKIII (LysC)-like aspartokinase/meso-diaminopimelate decarboxylase (DAPDC) bacterial protein. Aspartokinase (AK) is the first enzyme in the aspartate metabolic pathway and catalyzes the conversion of aspartate and ATP to aspartylphosphate and ADP. The lysA gene encodes the enzyme DAPDC, a pyridoxal-5'-phosphate (PLP)-dependent enzyme which catalyzes the final step in the lysine biosynthetic pathway converting meso-diaminopimelic acid (DAP) to l-lysine. Tandem ACT domains are positioned centrally with the AK catalytic domain N-terminal and the DAPDC domains C-terminal. Members of this CD belong to the superfamily of ACT regulatory domains.	75
153208	cd04936	ACT_AKii-LysC-BS-like_2	ACT domains of the lysine-sensitive, aspartokinase (AK) isoenzyme AKII of Bacillus subtilis (BS) strain 168 and related domains. This CD includes the C-terminal of the two ACT domains of the lysine-sensitive, aspartokinase (AK) isoenzyme AKII of Bacillus subtilis (BS) strain 168, and the lysine plus threonine-sensitive aspartokinase of Corynebacterium glutamicum, and related sequences. In B. subtilis strain 168, the regulation of the diaminopimelate (Dap)-lysine biosynthetic pathway involves dual control by Dap and lysine, effected through separate Dap- and lysine-sensitive AK isoenzymes. The B. subtilis strain 168 AKII is induced by methionine and repressed and inhibited by lysine. Although C. glutamicum is known to contain a single AK, both the succinylase and dehydrogenase variant pathways of DAP-lysine synthesis operate simultaneously in this organism. In corynebacteria and other various Gram-positive bacteria, the DAP-lysine pathway is feedback regulated by the concerted action of lysine and threonine. Conserved residues in the ACT domains have been shown to be involved in this concerted feedback inhibition. Also included in this CD are the AKs of the extreme thermophile, Thermus thermophilus HB27, the Gram-negative obligate methylotroph, Methylophilus methylotrophus AS1, and those single AKs found in Pseudomonas, C. glutamicum, and Amycolatopsis lactamdurans. B. subtilis strain 168 AKII, and the C. glutamicum, Streptomyces clavuligerus and A. lactamdurans AKs are described as tetramers consisting of two alpha and two beta subunits; the alpha (44 kD) and beta (18 kD) subunits formed by two in-phase overlapping polypeptides. This CD includes the second ACT domain C-terminal to the AK catalytic domain of the alpha subunit and the second ACT domain of the beta subunit that lacks the AK catalytic domain. Unlike the C. glutamicum AK beta subunit, which is involved in feedback regulation, the B. subtilis AKII beta subunit is not. Cyanobacteria AKs are unique to this CD and they have a unique domain architecture with two tandem pairs of ACT domains, C-terminal to the catalytic AK domain. In this CD, the second and fourth cyanobacteria AK ACT domains are present. Members of this CD belong to the superfamily of ACT regulatory domains.	63
153209	cd04937	ACT_AKi-DapG-BS_2	ACT domains of the diaminopimelate-sensitive aspartokinase (AK) isoenzyme AKI. This CD includes the C-terminal of the two ACT domains of the diaminopimelate-sensitive aspartokinase (AK) isoenzyme AKI, a monofunctional class enzyme found in Bacilli (Bacillus subtilis (BS) strain 168), Clostridia, and Actinobacteria bacterial species. In B. subtilis, the regulation of the diaminopimelate-lysine biosynthetic pathway involves dual control by diaminopimelate and lysine, effected through separate diaminopimelate- and lysine-sensitive AK isoenzymes. AKI activity is invariant during the exponential and stationary phases of growth and is not altered by addition of amino acids to the growth medium. The role of this isoenzyme is most likely to provide a constant level of aspartyl-beta-phosphate for the biosynthesis of diaminopimelate for peptidoglycan synthesis and dipicolinate during sporulation. The BS AKI is tetrameric consisting of two alpha and two beta subunits; the alpha (43 kD) and beta (17 kD) subunit formed by two in-phase overlapping genes. The alpha subunit contains the AK catalytic domain and two ACT domains. The beta subunit contains two ACT domains. Members of this CD belong to the superfamily of ACT regulatory domains.	64
340517	cd04938	TGS_Obg	TGS (ThrRS, GTPase and SpoT) domain found in the Obg protein family. The Obg family of GTPases has been implicated in cellular processes as diverse as sporulation, stress response, control of DNA replication, and ribosome assembly. It consists of several subfamilies such as DRG and YchF with TGS domain. The TGS domain is named after the various RNA-binding multidomain ThrRS, GTPase, and SpoT/RelA proteins in which this domain occurs. The TGS domain of Obg-like GTPases such as those present in DRG (developmentally regulated GTP-binding protein), and GTP-binding proteins Ygr210 and YchF has a beta-grasp ubiquitin-like fold, a common structure involved in protein-protein interactions.	77
240137	cd04939	PA2301	PA2301 is an uncharacterized Pseudomonas aeruginosa protein with a YbaK-like domain of unknown function.  The YbaK-like domain family includes the INS amino acid-editing domain of the bacterial class II prolyl tRNA synthetase (ProRS), and its trans-acting homologs, YbaK, and ProX.  The primary function of INS is to hydrolyze mischarged cysteinyl-tRNA(Pro)'s, thus helping ensure the fidelity of translation.  Organisms whose ProRS lacks the INS domain express a single-domain INS homolog such as YbaK, ProX, or PrdX which supplies the function of INS in trans.	139
340854	cd04946	GT4_AmsK-like	amylovoran biosynthesis glycosyltransferase AmsK and similar proteins. This family is most closely related to the GT4 family of glycosyltransferases. AmsK is involved in the biosynthesis of amylovoran, which functions as a virulence factor. It functions as a glycosyl transferase which transfers galactose from UDP-galactose to a lipid-linked amylovoran-subunit precursor.  The members of this family are found mainly in bacteria and Archaea.	401
340855	cd04949	GT4_GtfA-like	accessory Sec system glycosyltransferase GtfA and similar proteins. This family is most closely related to the GT4 family of glycosyltransferases and is named after gtfA in Streptococcus gordonii, where it plays a role in the O-linked glycosylation of GspB, a cell surface glycoprotein involved in platelet binding.  In general glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. This group of glycosyltransferases is most closely related to the previously defined glycosyltransferase family 1 (GT1). The members of this family may transfer UDP, ADP, GDP, or CMP linked sugars. The diverse enzymatic activities among members of this family reflect a wide range of biological functions. The protein structure available for this family has the GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility. The members of this family are found in bacteria.	328
340856	cd04950	GT4_TuaH-like	teichuronic acid biosynthesis glycosyltransferase TuaH and similar proteins. Members of this family may function in teichuronic acid biosynthesis/cell wall biogenesis. Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. This group of glycosyltransferases is most closely related to the previously defined glycosyltransferase family 1 (GT1). The members of this family may transfer UDP, ADP, GDP, or CMP linked sugars. The diverse enzymatic activities among members of this family reflect a wide range of biological functions. The protein structure available for this family has the GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility.	373
340857	cd04951	GT4_WbdM_like	LPS/UnPP-GlcNAc-Gal a-1,4-glucosyltransferase WbdM and similar proteins. This family is most closely related to the GT4 family of glycosyltransferases and is named after WbdM in Escherichia coli. In general glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. This group of glycosyltransferases is most closely related to the previously defined glycosyltransferase family 1 (GT1). The members of this family may transfer UDP, ADP, GDP, or CMP linked sugars. The diverse enzymatic activities among members of this family reflect a wide range of biological functions. The protein structure available for this family has the GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility. The members of this family are found in bacteria.	360
340858	cd04955	GT4-like	glycosyltransferase family 4 proteins. This family is most closely related to the GT4 family of glycosyltransferases. Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. This group of glycosyltransferases is most closely related to the previously defined glycosyltransferase family 1 (GT1). The members of this family may transfer UDP, ADP, GDP, or CMP linked sugars. The diverse enzymatic activities among members of this family reflect a wide range of biological functions. The protein structure available for this family has the GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility. The members of this family are found in certain bacteria and Archaea.	379
340859	cd04962	GT4_BshA-like	N-acetyl-alpha-D-glucosaminyl L-malate synthase BshA and similar proteins. This family is most closely related to the GT1 family of glycosyltransferases. Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. This group of glycosyltransferases is most closely related to the previously defined glycosyltransferase family 1 (GT1). The members of this family may transfer UDP, ADP, GDP, or CMP linked sugars. The diverse enzymatic activities among members of this family reflect a wide range of biological functions. The protein structure available for this family has the GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility. The members of this family are found mainly in bacteria, while some of them are also found in Archaea and eukaryotes.	370
409356	cd04967	IgI_1_Contactin	First immunoglobulin (Ig) domain of contactin; member of the I-set of (Ig) superfamily domains. The members here are composed of the first immunoglobulin (Ig) domain of contactins. Contactins are neural cell adhesion molecules and are comprised of six Ig domains followed by four fibronectin type III (FnIII) domains anchored to the membrane by glycosylphosphatidylinositol. The first four Ig domains form the intermolecular binding fragment, which arranges as a compact U-shaped module via contacts between Ig domains 1 and 4, and between Ig domains 2 and 3. Contactin-2 (TAG-1, axonin-1) may play a part in the neuronal processes of neurite outgrowth, axon guidance and fasciculation, and neuronal migration. This group also includes contactin-1 and contactin-5. The different contactins show different expression patterns in the central nervous system. During development and in adulthood, contactin-2 is transiently expressed in subsets of central and peripheral neurons. Contactin-5 is expressed specifically in the rat postnatal nervous system, peaking at about 3 weeks postnatal, and a lack of contactin-5 (NB-2) results in an impairment of neuronal activity in the rat auditory system. Contactin-5 is highly expressed in the adult human brain in the occipital lobe and in the amygdala. Contactin-1 is differentially expressed in tumor tissues and may, through a RhoA mechanism, facilitate invasion and metastasis of human lung adenocarcinoma. This group belongs to the I-set of IgSF domains.	96
409357	cd04968	IgI_3_Contactin	Third immunoglobulin (Ig) domain of contactin; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the third immunoglobulin (Ig) domain of contactins. Contactins are neural cell adhesion molecules and are comprised of six Ig domains followed by four fibronectin type III (FnIII) domains anchored to the membrane by glycosylphosphatidylinositol. The first four Ig domains form the intermolecular binding fragment, which arranges as a compact U-shaped module via contacts between Ig domains 1 and 4, and between Ig domains 2 and 3. Contactin-2 (TAG-1, axonin-1) may play a part in the neuronal processes of neurite outgrowth, axon guidance and fasciculation, and neuronal migration. This group also includes contactin-1 and contactin-5. The different contactins show different expression patterns in the central nervous system. During development and in adulthood, contactin-2 is transiently expressed in subsets of central and peripheral neurons. Contactin-5 is expressed specifically in the rat postnatal nervous system, peaking at about 3 weeks postnatal, and a lack of contactin-5 (NB-2) results in an impairment of neuronal activity in the rat auditory system. Contactin-5 is highly expressed in the adult human brain in the occipital lobe and in the amygdala. Contactin-1 is differentially expressed in tumor tissues and may, through a RhoA mechanism, facilitate invasion and metastasis of human lung adenocarcinoma. This group belongs to the I-set of IgSF domains.	88
409358	cd04969	Ig5_Contactin	Fifth immunoglobulin (Ig) domain of contactin. The members here are composed of the fifth immunoglobulin (Ig) domain of contactins. Contactins are neural cell adhesion molecules and are comprised of six Ig domains followed by four fibronectin type III (FnIII) domains anchored to the membrane by glycosylphosphatidylinositol. The first four Ig domains form the intermolecular binding fragment, which arranges as a compact U-shaped module via contacts between Ig domains 1 and 4, and between Ig domains 2 and 3. Contactin-2 (TAG-1, axonin-1) may play a part in the neuronal processes of neurite outgrowth, axon guidance and fasciculation, and neuronal migration. This group also includes contactin-1 and contactin-5. The different contactins show different expression patterns in the central nervous system. During development and in adulthood, contactin-2 is transiently expressed in subsets of central and peripheral neurons. Contactin-5 is expressed specifically in the rat postnatal nervous system, peaking at about 3 weeks postnatal, and a lack of contactin-5 (NB-2) results in an impairment of neuronal activity in the rat auditory system. Contactin-5 is highly expressed in the adult human brain in the occipital lobe and in the amygdala. Contactin-1 is differentially expressed in tumor tissues and may, through a RhoA mechanism, facilitate invasion and metastasis of human lung adenocarcinoma.	89
409359	cd04970	Ig6_Contactin	Sixth immunoglobulin (Ig) domain of contactin. The members here are composed of the sixth immunoglobulin (Ig) domain of contactins. Contactins are neural cell adhesion molecules and are comprised of six Ig domains followed by four fibronectin type III (FnIII) domains anchored to the membrane by glycosylphosphatidylinositol. The first four Ig domains form the intermolecular binding fragment, which arranges as a compact U-shaped module via contacts between Ig domains 1 and 4, and between Ig domains 2 and 3. Contactin-2 (TAG-1, axonin-1) may play a part in the neuronal processes of neurite outgrowth, axon guidance and fasciculation, and neuronal migration. This group also includes contactin-1 and contactin-5. The different contactins show different expression patterns in the central nervous system. During development and in adulthood, contactin-2 is transiently expressed in subsets of central and peripheral neurons. Contactin-5 is expressed specifically in the rat postnatal nervous system, peaking at about 3 weeks postnatal, and a lack of contactin-5 (NB-2) results in an impairment of neuronal activity in the rat auditory system. Contactin-5 is highly expressed in the adult human brain in the occipital lobe and in the amygdala. Contactin-1 is differentially expressed in tumor tissues and may, through a RhoA mechanism, facilitate invasion and metastasis of human lung adenocarcinoma.	102
409360	cd04971	IgI_TrKABC_d5	Fifth domain (immunoglobulin-like) of Trk receptors TrkA, TrkB, and TrkC; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the fifth domain of Trk receptors TrkA, TrkB, and TrkC, an immunoglobulin (Ig)-like domain which binds to neurotrophin. The Trk family of receptors are tyrosine kinase receptors. They are activated by dimerization, leading to autophosphorylation of intracellular tyrosine residues, and triggering the signal transduction pathway. TrkA, TrkB, and TrkC share significant sequence homology and domain organization. The first three domains are leucine-rich domains while the fourth and fifth domains are Ig-like domains playing a part in ligand binding. TrkA, TrkB, and TrkC mediate the trophic effects of the neurotrophin Nerve Growth Factor (NGF) family. TrkA is recognized by NGF. TrkB is recognized by brain-derived neurotrophic factor (BDNF) and neurotrophin (NT)-4. TrkC is recognized by NT-3. NT-3 is promiscuous as in some cell systems it activates TrkA and TrkB receptors. TrkA is a receptor found in all major NGF targets, including the sympathetic, trigeminal, and dorsal root ganglia, cholinergic neurons of the basal forebrain, and the striatum. TrkB transcripts are found throughout multiple structures of the central and peripheral nervous systems. The TrkC gene is expressed throughout the mammalian nervous system. This group belongs to the I-set of IgSF domains.	96
409361	cd04972	Ig_TrkABC_d4	Fourth domain (immunoglobulin-like) of Trk receptors TrkA, TrkB, and TrkC. The members here are composed of the fourth domain of Trk receptors TrkA, TrkB, and TrkC, an immunoglobulin (Ig)-like domain which binds to neurotrophin. The Trk family of receptors are tyrosine kinase receptors. They are activated by dimerization, leading to autophosphorylation of intracellular tyrosine residues, and triggering the signal transduction pathway. TrkA, TrkB, and TrkC share significant sequence homology and domain organization. The first three domains are leucine-rich domains while the fourth and fifth domains are Ig-like domains playing a part in ligand binding. TrkA, TrkB, and TrkC mediate the trophic effects of the neurotrophin Nerve Growth Factor (NGF) family. TrkA is recognized by NGF. TrkB is recognized by brain-derived neurotrophic factor (BDNF) and neurotrophin (NT)-4. TrkC is recognized by NT-3. NT-3 is promiscuous as in some cell systems it activates TrkA and TrkB receptors. TrkA is a receptor found in all major NGF targets, including the sympathetic, trigeminal, and dorsal root ganglia, cholinergic neurons of the basal forebrain, and the striatum. TrkB transcripts are found throughout multiple structures of the central and peripheral nervous systems. The TrkC gene is expressed throughout the mammalian nervous system.	88
409362	cd04973	IgI_1_FGFR	First immunoglobulin (Ig)-like domain of fibroblast growth factor receptor (FGFR); member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the first immunoglobulin (Ig)-like domain of fibroblast growth factor receptor (FGFR). Fibroblast growth factors (FGFs) participate in morphogenesis, development, angiogenesis, and wound healing. These FGF-stimulated processes are mediated by four FGFR tyrosine kinases (FGFR1-4). FGFRs are comprised of an extracellular portion consisting of three Ig-like domains, a transmembrane helix, and a cytoplasmic portion having protein tyrosine kinase activity. The highly conserved Ig-like domains 2 and 3, and the linker region between D2 and D3 define a general binding site for all FGFs.	94
409363	cd04974	IgI_3_FGFR	Third immunoglobulin (Ig)-like domain of fibroblast growth factor receptor (FGFR); member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the third immunoglobulin (Ig)-like domain of fibroblast growth factor receptor (FGFR). Fibroblast growth factors (FGFs) participate in morphogenesis, development, angiogenesis, and wound healing. These FGF-stimulated processes are mediated by four FGFR tyrosine kinases (FGFR1-4). FGFRs are comprised of an extracellular portion consisting of three Ig-like domains, a transmembrane helix, and a cytoplasmic portion having protein tyrosine kinase activity. The highly conserved Ig-like domains 2 and 3, and the linker region between D2 and D3 define a general binding site for FGFs. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand.	102
409364	cd04975	IgI_4_SCFR_like	Fourth immunoglobulin (Ig)-like domain of stem cell factor receptor (SCFR), and similar domains; member of the I-set of IgSF domains. The members here are composed of the fourth immunoglobulin (Ig)-like domain of stem cell factor receptor (SCFR). In addition to SCFR, this group also includes the fourth Ig domain of macrophage colony stimulating factor receptor (M-CSF-R). SCFR, also called receptor tyrosine kinase KIT or proto-oncogene c-Kit, contains an extracellular component having five Ig-like domains, a transmembrane segment, and a cytoplasmic portion having protein tyrosine kinase activity. SCFR and its ligand SCF are critical for normal hematopoiesis, mast cell development, melanocytes, and gametogenesis. SCF binds to the second and third Ig-like domains of SCFR; this fourth Ig-like domain participates in SCFR dimerization, which follows ligand binding. Deletion of this fourth SCFR Ig-like domain abolishes the ligand-induced dimerization of SCFR and completely inhibits signal transduction. M-CSF-R, also called proto-oncogene c-Fms, acts as a cell-surface receptor for CSF1 and IL34 and plays an essential role in the regulation of survival, proliferation and differentiation of hematopoietic precursor cells, such as macrophages and monocytes. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand.	101
409365	cd04976	IgI_VEGFR	Immunoglobulin (Ig)-like domain of vascular endothelial growth factor receptor (VEGFR); member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin (Ig)-like domain of vascular endothelial growth factor receptor (VEGFR). The VEGFRs have an extracellular component with seven Ig-like domains, a transmembrane segment, and an intracellular tyrosine kinase domain interrupted by a kinase-insert domain. The VEGFR family consists of three members, VEGFR-1 (Flt-1), VEGFR-2 (KDR/Flk-1), and VEGFR-3 (Flt-4). VEGFRs bind VEGFs with high affinity at the Ig-like domains. VEGF-A is important to the growth and maintenance of vascular endothelial cells and to the development of new blood- and lymphatic-vessels in physiological and pathological states. VEGFR-2 is a major mediator of the mitogenic, angiogenic, and microvascular permeability-enhancing effects of VEGF-A. VEGFR-1 may play an inhibitory part in these processes by binding VEGF and interfering with its interaction with VEGFR-2. VEGFR-1 has a signaling role in mediating monocyte chemotaxis. VEGFR-1 and VEGFR-2 may mediate a chemotactic and a survival signal in hematopoietic stem cells or leukemia cells. VEGFR-3 has been shown to be involved in tumor angiogenesis and growth. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand.	90
409366	cd04977	IgI_1_NCAM-1_like	First immunoglobulin (Ig)-like domain of neural cell adhesion molecule NCAM-1, and similar domains; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the first immunoglobulin (Ig)-like domain of neural cell adhesion molecule NCAM-1. NCAM-1 plays important roles in the development and regeneration of the central nervous system, in synaptogenesis and neural migration. NCAM mediates cell-cell and cell-substratum recognition and adhesion via homophilic (NCAM-NCAM) and heterophilic (NCAM-nonNCAM) interactions. NCAM is expressed as three major isoforms having different intracellular extensions. The extracellular portion of NCAM has five N-terminal Ig-like domains and two fibronectin type III domains. The double zipper adhesion complex model for NCAM homophilic binding involves the Ig1, Ig2, and Ig3 domains. By this model, Ig1 and Ig2 mediate dimerization of NCAM molecules situated on the same cell surface (cis interactions), and Ig3 domains mediate interactions between NCAM molecules expressed on the surface of opposing cells (trans interactions), through binding to the Ig1 and Ig2 domains. The adhesive ability of NCAM is modulated by the addition of polysialic acid chains to the fifth Ig-like domain. Also included in this group is NCAM-2 (also known as OCAM/mamFas II and RNCAM). NCAM-2 is differentially expressed in the developing and mature olfactory epithelium (OE). This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand.	95
409367	cd04978	Ig4_L1-NrCAM_like	Fourth immunoglobulin (Ig)-like domain of L1, Ng-CAM (Neuron-glia CAM cell adhesion molecule), and NrCAM (Ng-CAM-related). The members here are composed of the fourth immunoglobulin (Ig)-like domain of L1, Ng-CAM (Neuron-glia CAM cell adhesion molecule), and NrCAM (Ng-CAM-related). These proteins belong to the L1 subfamily of cell adhesion molecules (CAMs) and are comprised of an extracellular region having six Ig-like domains and five fibronectin type III domains, a transmembrane region and an intracellular domain. These molecules are primarily expressed in the nervous system. L1 is associated with an X-linked recessive disorder, X-linked hydrocephalus, MASA syndrome, or spastic paraplegia type 1, that involves abnormalities of axonal growth.	89
409368	cd04979	Ig_Semaphorin_C	Immunoglobulin (Ig)-like domain at the C-terminus of semaphorins. The members here are composed of the immunoglobulin (Ig)-like domain in semaphorins. Semaphorins are transmembrane proteins that have important roles in a variety of tissues. Functionally, semaphorins were initially characterized for their importance in the development of the nervous system and in axonal guidance. Later they were found to be important for the formation and functioning of the cardiovascular, endocrine, gastrointestinal, hepatic, immune, musculoskeletal, renal, reproductive, and respiratory systems. Semaphorins function through binding to their receptors, and transmembrane semaphorins also serve as receptors themselves. Although the molecular mechanism of semaphorins is poorly understood, the Ig-like domains may be involved in ligand binding or dimerization.	88
409369	cd04980	IgV_L_kappa	Immunoglobulin (Ig) light chain, kappa type, variable (V) domain. The members here are composed of the immunoglobulin (Ig) light chain, kappa type, variable (V) domain. This group contains the standard Ig superfamily V-set AGFCC'C"/DEB domain topology. The basic structure of Ig molecules is a tetramer of two light chains and two heavy chains linked by disulfide bonds. There are two types of light chains: kappa and lambda, each composed of a constant domain (CL) and a variable domain (VL). There are five types of heavy chains (alpha, gamma, delta, epsilon, and mu), which determines the type of immunoglobulin formed:  IgA, IgG, IgD, IgE, and IgM, respectively. In higher vertebrates, there are two types of light chain, designated kappa and lambda, which seem to be functionally identical, and can associate with any of the heavy chains.	106
409370	cd04981	IgV_H	Immunoglobulin (Ig) heavy chain (H), variable (V) domain. The members here are composed of the immunoglobulin (Ig) heavy chain (H), variable (V) domain. This group contains the standard Ig superfamily V-set AGFCC'C"/DEB domain topology. The basic structure of Ig molecules is a tetramer of two light chains and two heavy chains linked by disulfide bonds. In Ig, each chain is composed of one variable domain (IgV) and one or more constant domains (IgC); these names reflect the fact that the variability in sequences is higher in the variable domain than in the constant domain. There are five types of heavy chains (alpha, gamma, delta, epsilon, and mu), which determines the type of immunoglobulin formed: IgA, IgG, IgD, IgE, and IgM, respectively. In higher vertebrates, there are two types of light chain, designated kappa and lambda, which can associate with any of the heavy chains. This family includes alpha, gamma, delta, epsilon, and mu heavy chains.	118
409371	cd04982	IgV_TCR_gamma	Immunoglobulin (Ig) variable (V) domain of T-cell receptor (TCR) gamma chain. The members here are composed of the immunoglobulin (Ig) variable (V) domain of the gamma chain of gamma/delta T-cell receptors (TCRs). TCRs mediate antigen recognition by T lymphocytes, and are heterodimers consisting of alpha and beta chains or gamma and delta chains.  Each chain contains a variable (V) and a constant (C) region. The majority of T cells contain alpha/beta TCRs, but a small subset contain gamma/delta TCRs. Alpha/beta TCRs recognize antigens as peptide fragments presented by major histocompatibility complex (MHC) molecules. Gamma/delta TCRs recognize intact protein antigens directly without antigen processing and recognize MHC independently of the bound peptide. Gamma/delta T cells can also be stimulated by non-peptide antigens such as small phosphate- or amine-containing compounds. The variable domain of gamma/delta TCRs is responsible for antigen recognition and is located at the N-terminus of the receptor. Members of this group contain the standard Ig superfamily V-set AGFCC'C"/DEB domain topology.	117
409372	cd04983	IgV_TCR_alpha	Immunoglobulin (Ig) variable (V) domain of T-cell receptor (TCR) alpha chain and similar proteins. The members here are composed of the immunoglobulin (Ig) variable domain of the alpha chain of alpha/beta T-cell antigen receptors (TCRs). TCRs mediate antigen recognition by T lymphocytes, and are composed of alpha and beta, or gamma and delta polypeptide chains with variable (V) and constant (C) regions. This group represents the variable domain of the alpha chain of TCRs and also includes the variable domain of delta chains of TCRs. Alpha/beta TCRs recognize antigen as peptide fragments presented by major histocompatibility complex (MHC) molecules. The variable domain of TCRs is responsible for antigen recognition, and is located at the N-terminus of the receptor.  Gamma/delta TCRs recognize intact protein antigens directly without antigen processing and recognize MHC independently of the bound peptide. Members of this group contain standard Ig superfamily V-set AGFCC'C"/DEB domain topology.	109
409373	cd04984	IgV_L_lambda	Immunoglobulin (Ig) lambda light chain variable (V) domain. The members here are composed of the immunoglobulin (Ig) light chain, lambda type, variable (V) domain. The basic structure of Ig molecules is a tetramer of two light chains and two heavy chains linked by disulfide bonds. There are two types of light chains: kappa and lambda, each composed of a constant domain (CL) and a variable domain (VL). There are five types of heavy chains (alpha, gamma, delta, epsilon, and mu), which determines the type of immunoglobulin formed:  IgA, IgG, IgD, IgE, and IgM, respectively. In higher vertebrates, there are two types of light chain, designated kappa and lambda, which seem to be functionally identical, and can associate with any of the heavy chains. Members of this group contain standard Ig superfamily V-set AGFCC'C"/DEB domain topology.	105
409374	cd04985	IgC1_CH1_IgADEGM	CH1 domain (first constant Ig domain of the heavy chain) in immunoglobulin heavy alpha, delta, epsilon, gamma, and mu chains; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the first immunoglobulin constant-1 set domain of alpha, delta, epsilon, gamma, and mu heavy chains. This domain is found on the Fab antigen-binding fragment. The basic structure of Ig molecules is a tetramer of two light chains and two heavy chains linked by disulfide bonds. There are two types of light chains: kappa and lambda; each is composed of a constant domain and a variable domain. There are five types of heavy chains: alpha, delta, epsilon, gamma, and mu, all consisting of a variable domain (VH) with three (alpha, delta and gamma) or four (epsilon and mu) constant domains (CH1 to CH4). Ig molecules are modular proteins, in which the variable and constant domains have clear, conserved sequence patterns. This group belongs to the C1-set of IgSF domains, which are classical Ig-like domains resembling the antibody constant domain. C1-set domains are found almost exclusively in molecules involved in the immune system, such as in immunoglobulin light and heavy chains, in the major histocompatibility complex (MHC) class I and II complex molecules, and in various T-cell receptors.	98
409375	cd04986	IgC1_CH2_IgA	CH2 domain (second constant Ig domain of the heavy chain) in immunoglobulin heavy alpha chain; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the second immunoglobulin constant-1 set domain (IgC) of alpha heavy chains. This domain is found on the Fc fragment. The basic structure of Ig molecules is a tetramer of two light chains and two heavy chains linked by disulfide bonds. There are two types of light chains: kappa and lambda; each is composed of a constant domain and a variable domain. There are five types of heavy chains: alpha, delta, epsilon, gamma, and mu, all consisting of a variable domain (VH) with three (alpha, delta and gamma) or four (epsilon and mu) constant domains (CH1 to CH4). Ig molecules are modular proteins, in which the variable and constant domains have clear, conserved sequence patterns. This group belongs to the C1-set of IgSF domains, which are classical Ig-like domains resembling the antibody constant domain. C1-set domains are found almost exclusively in molecules involved in the immune system, such as in immunoglobulin light and heavy chains, in the major histocompatibility complex (MHC) class I and II complex molecules, and in various T-cell receptors.	96
240138	cd05005	SIS_PHI	Hexulose-6-phosphate isomerase (PHI). PHI is a member of the SIS (Sugar ISomerase domain) superfamily. In the ribulose monophosphate pathway of formaldehyde fixation, hexulose-6-phosphate synthase catalyzes the condensation of ribulose-5-phosphate with formaldehyde to become hexulose-6-phosphate, which is then isomerized to fructose-6-phosphate by PHI.	179
240139	cd05006	SIS_GmhA	Phosphoheptose isomerase is a member of the SIS (Sugar ISomerase) superfamily. Phosphoheptose isomerase catalyzes the isomerization of sedoheptulose 7-phosphate into D-glycero-D-mannoheptose 7-phosphate. This is the first step of the biosynthesis of gram-negative bacteria inner core lipopolysaccharide precursor, L-glycero-D-mannoheptose (Gmh).	177
240140	cd05007	SIS_Etherase	N-acetylmuramic acid 6-phosphate etherase. Members of this family contain the SIS (Sugar ISomerase) domain. The SIS domain is found in many phosphosugar isomerases and phosphosugar binding proteins. The bacterial cell wall sugar N-acetylmuramic acid carries a unique D-lactyl ether substituent at the C3 position. The etherase catalyzes the cleavage of the lactyl ether bond of N-acetylmuramic acid 6-phosphate.	257
240141	cd05008	SIS_GlmS_GlmD_1	SIS (Sugar ISomerase) domain repeat 1 found in Glucosamine 6-phosphate synthase (GlmS) and Glucosamine-6-phosphate deaminase (GlmD). The SIS domain is found in many phosphosugar isomerases and phosphosugar binding proteins. GlmS contains an N-terminal glutaminase domain and two C-terminal SIS domains and catalyzes the first step in hexosamine metabolism, converting fructose 6-phosphate into glucosamine 6-phosphate using glutamine as nitrogen source. The glutaminase domain hydrolyzes glutamine to glutamate and ammonia. Ammonia is transferred through a channel to the isomerase domain for glucosamine 6-phosphate synthesis. The end product of the pathway is N-acetylglucosamine, which plays multiple roles in eukaryotic cells including being a building block of bacterial and fungal cell walls. In the absence of glutamine, GlmS catalyzes the isomerization of fructose 6-phosphate into glucose 6-phosphate (PGI-like activity). Glucosamine-6-phosphate deaminase (GlmD) contains two SIS domains and catalyzes the deamination and isomerization of glucosamine-6-phosphate into fructose-6-phosphate with the release of ammonia; in the presence of high ammonia concentration, GlmD can catalyze the reverse reaction.	126
240142	cd05009	SIS_GlmS_GlmD_2	SIS (Sugar ISomerase) domain repeat 2 found in Glucosamine 6-phosphate synthase (GlmS) and Glucosamine-6-phosphate deaminase (GlmD). The SIS domain is found in many phosphosugar isomerases and phosphosugar binding proteins. GlmS contains an N-terminal glutaminase domain and two C-terminal SIS domains and catalyzes the first step in hexosamine metabolism, converting fructose 6-phosphate into glucosamine 6-phosphate using glutamine as nitrogen source. The glutaminase domain hydrolyzes glutamine to glutamate and ammonia. Ammonia is transferred through a channel to the isomerase domain for glucosamine 6-phosphate synthesis. The end product of the pathway is N-acetylglucosamine, which plays multiple roles in eukaryotic cells including being a building block of bacterial and fungal cell walls. In the absence of glutamine, GlmS catalyzes the isomerization of fructose 6-phosphate into glucose 6-phosphate (PGI-like activity). Glucosamine-6-phosphate deaminase (GlmD) contains two SIS domains and catalyzes the deamination and isomerization of glucosamine-6-phosphate into fructose-6-phosphate with the release of ammonia; in the presence of high ammonia concentration, GlmD can catalyze the reverse reaction.	153
240143	cd05010	SIS_AgaS_like	AgaS-like protein. AgaS contains a SIS (Sugar ISomerase) domain which is found in many phosphosugar isomerases and phosphosugar binding proteins. AgaS is a putative isomerase in Escherichia coli. It is similar to the glucosamine-6-phosphate synthases (GlmS), which catalyze the first step in hexosamine metabolism, converting fructose 6-phosphate into glucosamine 6-phosphate using glutamine as nitrogen source.	151
240144	cd05013	SIS_RpiR	RpiR-like protein. RpiR contains a SIS (Sugar ISomerase) domain, which is found in many phosphosugar isomerases and phosphosugar binding proteins. In E. coli, rpiR negatively regulates the expression of rpiB gene. Both rpiB and rpiA are ribose phosphate isomerases that catalyze the reversible reactions of ribose 5-phosphate into ribulose 5-phosphate.	139
240145	cd05014	SIS_Kpsf	KpsF-like protein. KpsF is an arabinose-5-phosphate isomerase which contains SIS (Sugar ISomerase) domains. SIS domains are found in many phosphosugar isomerases and phosphosugar binding proteins. KpsF catalyzes the reversible reaction of ribulose 5-phosphate to arabinose 5-phosphate. This is the second step in the CMP-Kdo biosynthesis pathway.	128
240146	cd05015	SIS_PGI_1	Phosphoglucose isomerase (PGI) contains two SIS (Sugar ISomerase) domains. This classification is based on the alignment of the first SIS domain. PGI is a multifunctional enzyme which as an intracellular dimer catalyzes the reversible isomerization of glucose 6-phosphate to fructose 6-phosphate. As an extracellular protein, PGI also has functions equivalent to neuroleukin (NLK), autocrine motility factor (AMF), and maturation factor (MF). Evidence suggests that PGI, NLK, AMF, and MF are closely related or identical. NLK is a neurotrophic growth factor that promotes regeneration and survival of neurons. The dimeric form of NLK has isomerase function, whereas its monomeric form carries out neurotrophic activity. AMF is a cytokine that stimulates cell migration and metastasis. MF mediates the differentiation of human myeloid leukemic HL-60 cells to terminal monocytic cells.	158
240147	cd05016	SIS_PGI_2	Phosphoglucose isomerase (PGI) contains two SIS (Sugar ISomerase) domains. This classification is based on the alignment of the second SIS domain. PGI is a multifunctional enzyme which as an intracellular dimer catalyzes the reversible isomerization of glucose 6-phosphate to fructose 6-phosphate. As an extracellular protein, PGI also has functions equivalent to neuroleukin (NLK), autocrine motility factor (AMF), and maturation factor (MF). Evidence suggests that PGI, NLK, AMF, and MF are closely related or identical. NLK is a neurotrophic growth factor that promotes regeneration and survival of neurons. The dimeric form of NLK has isomerase function, whereas its monomeric form carries out neurotrophic activity. AMF is a cytokine that stimulates cell migration and metastasis. MF mediates the differentiation of human myeloid leukemic HL-60 cells to terminal monocytic cells.	164
240148	cd05017	SIS_PGI_PMI_1	The members of this protein family contain the SIS (Sugar ISomerase) domain and have both the phosphoglucose isomerase (PGI) and the phosphomannose isomerase (PMI) functions. These functions catalyze the reversible reactions of glucose 6-phosphate to fructose 6-phosphate, and mannose 6-phosphate to fructose 6-phosphate, respectively at an equal rate. This protein contains two SIS domains. This alignment is based on the first SIS domain.	119
176853	cd05018	CoxG	Carbon monoxide dehydrogenase subunit G (CoxG). CoxG has been shown, in Oligotropha carboxidovorans, to anchor the carbon monoxide (CO) dehydrogenase to the cytoplasmic membrane. The gene encoding CoxG is part of the Cox cluster (coxBCMSLDEFGHIK) located on a low-copy-number, circular, megaplasmid pHCG3. This cluster includes genes encoding subunits of CO dehydrogenase and several accessory components involved in the utilization of CO. This family belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands.	144
240149	cd05022	S-100A13	S-100A13: S-100A13 domain found in proteins similar to S100A13. S100A13 is a calcium-binding protein belonging to a large S100 vertebrate-specific protein family within the EF-hand superfamily of calcium-binding proteins. Note that the S-100 hierarchy, to which this S-100A13 group belongs, contains only S-100 EF-hand domains, other EF-hands have been modeled separately. S100A13 is involved in the cellular export of interleukin-1 (IL-1) and of fibroblast growth factor-1 (FGF-1), which plays an important role in angiogenesis and tissue regeneration. Export is based on the Cu(II)-dependent formation of multiprotein complexes containing the S100A13 protein. Assembly of these complexes occurs near the inner surface of the plasma membrane. Binding of two Ca(II) ions per monomer triggers key conformational changes leading to the creation of two identical and symmetrical Cu(II)-binding sites on the surface of the protein, close to the interface between the two monomers. These Cu(II)-binding sites are unique among the S100 proteins, which are reported to bind Cu(II) or Zn(II) ions in addition to Ca(II) ions. In addition, the three-dimensional structure of S100A13 differs significantly from those of other S100 proteins; the hydrophobic pocket that largely contributes to protein-protein interactions in other S100 proteins is absent in S100A13. The structure of S100A13 contains a large patch of negatively charged residues flanked by dense cationic clusters, formed mostly from positively charged residues from the C-terminal end, which plays a major role in binding FGF-1.	89
240150	cd05023	S-100A11	S-100A11: S-100A11 domain found in proteins similar to S100A11. S100A11 is a member of the S-100 domain family within EF-hand Ca2+-binding proteins superfamily. Note that the S-100 hierarchy, to which this S-100A11 group belongs, contains only S-100 EF-hand domains, other EF-hands have been modeled separately. S100 proteins exhibit unique patterns of tissue- and cell type-specific expression and have been implicated in the Ca2+-dependent regulation of diverse physiological processes, including cell cycle regulation, differentiation, growth, and metabolic control. S100 proteins have also been associated with a variety of pathological events, including neoplastic transformation and neurodegenerative diseases such as Alzheimer's, usually via overexpression of the protein. S100A11 is expressed in smooth muscle and other tissues and is involved in calcium-dependent membrane aggregation, which is important for cell vesiculation. As is the case for many other S100 proteins, S100A11 is a homodimer, which is able to form a heterodimer with S100B through subunit exchange. Ca2+ binding to S100A11 results in a conformational change in the protein, exposing a hydrophobic surface that interacts with target proteins. In addition to binding to annexin A1 and A6, S100A11 also interacts with actin and transglutaminase.	89
240151	cd05024	S-100A10	S-100A10: A subgroup of the S-100A10 domain found in proteins similar to S100A10. S100A10 is a member of the S100 family of EF-hand superfamily of calcium-binding proteins. Note that the S-100 hierarchy, to which this S-100A10 group belongs, contains only S-100 EF-hand domains, other EF-hands have been modeled separately. S100 proteins are expressed exclusively in vertebrates, and are implicated in intracellular and extracellular regulatory activities. A unique feature of S100A10 is that it contains mutations in both of the calcium binding sites, making it calcium insensitive. S100A10 has been detected in brain, heart, gastrointestinal tract, kidney, liver, lung, spleen, testes, epidermis, aorta, and thymus. Structural data support the homo- and hetero-dimeric as well as hetero-tetrameric nature of the protein. S100A10 has multiple binding partners in its calcium free state and is therefore involved in many diverse biological functions.	91
240152	cd05025	S-100A1	S-100A1: S-100A1 domain found in proteins similar to S100A1. S100A1 is a calcium-binding protein belonging to a large S100 vertebrate-specific protein family within the EF-hand superfamily of calcium-binding proteins. Note that the S-100 hierarchy, to which this S-100A1 group belongs, contains only S-100 EF-hand domains, other EF-hands have been modeled separately. As is the case with many other members of S100 protein family, S100A1 is implicated in intracellular and extracellular regulatory activities, including interaction with myosin-associated twitchin kinase, actin-capping protein CapZ, synapsin I, and tubulin. Structural data suggest that S100A1 proteins exist within cells as antiparallel homodimers, while heterodimers with S100A4 and S100B have also been reported. Upon binding calcium S100A1 changes conformation to expose a hydrophobic cleft which is the interaction site of S100A1 with its more than 20 known target proteins.	92
240153	cd05026	S-100Z	S-100Z: S-100Z domain found in proteins similar to S100Z. S100Z is a member of the S100 domain family within the EF-hand Ca2+-binding proteins superfamily. Note that the S-100 hierarchy, to which this S-100Z group belongs, contains only S-100 EF-hand domains, other EF-hands have been modeled separately. S100 proteins exhibit unique patterns of tissue- and cell type-specific expression and have been implicated in the Ca2+-dependent regulation of diverse physiological processes, including cell cycle regulation, differentiation, growth, and metabolic control. S100Z is normally expressed in various tissues, with its highest level of expression being in spleen and leukocytes. The function of S100Z remains unclear. Preliminary structural data suggest that S100Z is a homodimer; however, a heterodimer with S100P has been reported. S100Z is capable of binding calcium ions. When calcium binds to S100Z, the protein experiences a conformational change, which exposes hydrophobic surfaces on the protein. In comparison with their normal tissue counterparts, S100Z gene expression appears to be deregulated in some tumor tissues.	93
240154	cd05027	S-100B	S-100B: S-100B domain found in proteins similar to S100B. S100B is a calcium-binding protein belonging to a large S100 vertebrate-specific protein family within the EF-hand superfamily of calcium-binding proteins. Note that the S-100 hierarchy, to which this S-100B group belongs, contains only S-100 EF-hand domains, other EF-hands have been modeled separately. S100B is most abundant in glial cells of the central nervous system, predominantly in astrocytes. S100B is involved in signal transduction via the inhibition of protein phosphorylation, regulation of enzyme activity and by affecting the calcium homeostasis. Upon calcium binding the S100B homodimer changes conformation to expose a hydrophobic cleft, which represents the interaction site of S100B with its more than 20 known target proteins. These target proteins include several cellular architecture proteins such as tubulin and GFAP; S100B can inhibit polymerization of these oligomeric molecules. Furthermore, S100B inhibits the phosphorylation of multiple kinase substrates including the Alzheimer protein tau and neuromodulin (GAP-43) through a calcium-sensitive interaction with the protein substrates.	88
240155	cd05029	S-100A6	S-100A6: S-100A6 domain found in proteins similar to S100A6. S100A6 is a member of the S100 domain family within EF-hand Ca2+-binding proteins superfamily. Note that the S-100 hierarchy, to which this S-100A6 group belongs, contains only S-100 EF-hand domains, other EF-hands have been modeled separately. S100 proteins exhibit unique patterns of tissue- and cell type-specific expression and have been implicated in the Ca2+-dependent regulation of diverse physiological processes, including cell cycle regulation, differentiation, growth, and metabolic control. S100A6 is normally expressed in the G1 phase of the cell cycle in neuronal cells. The function of S100A6 remains unclear, but evidence suggests that it is involved in cell cycle regulation and exocytosis. S100A6 may also be involved in tumorigenesis; the protein is overexpressed in several tumors. Ca2+ binding to S100A6 leads to a conformational change in the protein, which exposes a hydrophobic surface for interaction with target proteins. Several such proteins have been identified: glyceraldehyde-3-phosphate dehydrogenase, annexins 2, 6 and 11 and Calcyclin-Binding Protein (CacyBP).	88
240156	cd05030	calgranulins	Calgranulins: S-100 domain found in proteins belonging to the Calgranulin subgroup of the S100 family of EF-hand calcium-modulated proteins, including S100A8, S100A9, and S100A12. Note that the S-100 hierarchy, to which this Calgranulin group belongs, contains only S-100 EF-hand domains, other EF-hands have been modeled separately. These proteins are expressed mainly in granulocytes, and are involved in inflammation, allergy, and neuritogenesis, as well as in host-parasite response. Calgranulins are modulated not only by calcium, but also by other metals such as zinc and copper. Structural data suggested that calgranulins may exist in multiple structural forms, homodimers, as well as hetero-oligomers. For example, the S100A8/S100A9 complex called calprotectin plays important roles in the regulation of inflammatory processes, wound repair, and regulating zinc-dependent enzymes as well as microbial growth.	88
240157	cd05031	S-100A10_like	S-100A10_like: S-100A10 domain found in proteins similar to S100A10. S100A10 is a member of the S100 family of EF-hand superfamily of calcium-binding proteins. Note that the S-100 hierarchy, to which this S-100A10_like group belongs, contains only S-100 EF-hand domains, other EF-hands have been modeled separately. S100 proteins are expressed exclusively in vertebrates, and are implicated in intracellular and extracellular regulatory activities. A unique feature of S100A10 is that it contains mutations in both of the calcium binding sites, making it calcium insensitive. S100A10 has been detected in brain, heart, gastrointestinal tract, kidney, liver, lung, spleen, testes, epidermis, aorta, and thymus. Structural data support the homo- and hetero-dimeric as well as hetero-tetrameric nature of the protein. S100A10 has multiple binding partners in its calcium free state and is therefore involved in many diverse biological functions.	94
173625	cd05032	PTKc_InsR_like	Catalytic domain of Insulin Receptor-like Protein Tyrosine Kinases. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. The InsR subfamily is composed of InsR, Insulin-like Growth Factor-1 Receptor (IGF-1R), and similar proteins. InsR and IGF-1R are receptor PTKs (RTKs) composed of two alphabeta heterodimers. Binding of the ligand (insulin, IGF-1, or IGF-2) to the extracellular alpha subunit activates the intracellular tyr kinase domain of the transmembrane beta subunit. Receptor activation leads to autophosphorylation, stimulating downstream kinase activities, which initiate signaling cascades and biological function. InsR and IGF-1R, which share 84% sequence identity in their kinase domains, display physiologically distinct yet overlapping functions in cell growth, differentiation, and metabolism. InsR activation leads primarily to metabolic effects while IGF-1R activation stimulates mitogenic pathways. In cells expressing both receptors, InsR/IGF-1R hybrids are found together with classical receptors. Both receptors can interact with common adaptor molecules such as IRS-1 and IRS-2. The InsR-like subfamily is part of a larger superfamily that includes the catalytic domains of serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	277
270629	cd05033	PTKc_EphR	Catalytic domain of Ephrin Receptor Protein Tyrosine Kinases. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. EphRs comprise the largest subfamily of receptor PTKs (RTKs). They can be classified into two classes (EphA and EphB), according to their extracellular sequences, which largely correspond to binding preferences for either GPI-anchored ephrin-A ligands or transmembrane ephrin-B ligands. Vertebrates have ten EphA and six EphB receptors, which display promiscuous ligand interactions within each class. EphRs contain an ephrin binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyr kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. This allows ephrin/EphR dimers to form, leading to the activation of the intracellular tyr kinase domain. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling). The main effect of ephrin/EphR interaction is cell-cell repulsion or adhesion. Ephrin/EphR signaling is important in neural development and plasticity, cell morphogenesis and proliferation, cell-fate determination, embryonic development, tissue patterning, and angiogenesis. The EphR subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	266
270630	cd05034	PTKc_Src_like	Catalytic domain of Src kinase-like Protein Tyrosine Kinases. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Src subfamily members include Src, Lck, Hck, Blk, Lyn, Fgr, Fyn, Yrk, and Yes. Src (or c-Src) proteins are cytoplasmic (or non-receptor) PTKs which are anchored to the plasma membrane. They contain an N-terminal SH4 domain with a myristoylation site, followed by SH3 and SH2 domains, a tyr kinase domain, and a regulatory C-terminal region containing a conserved tyr. They are activated by autophosphorylation at the tyr kinase domain, but are negatively regulated by phosphorylation at the C-terminal tyr by Csk (C-terminal Src Kinase). Src proteins are involved in signaling pathways that regulate cytokine and growth factor responses, cytoskeleton dynamics, cell proliferation, survival, and differentiation. They were identified as the first proto-oncogene products, and they regulate cell adhesion, invasion, and motility in cancer cells and tumor vasculature, contributing to cancer progression and metastasis. Src kinases are overexpressed in a variety of human cancers, making them attractive targets for therapy. They are also implicated in acute inflammatory responses and osteoclast function. Src, Fyn, Yes, and Yrk are widely expressed, while Blk, Lck, Hck, Fgr, and Lyn show a limited expression pattern. The Src-like subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	248
270631	cd05035	PTKc_TAM	Catalytic Domain of TAM (Tyro3, Axl, Mer) Protein Tyrosine Kinases. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. The TAM subfamily consists of Tyro3 (or Sky), Axl, Mer (or Mertk), and similar proteins. TAM subfamily members are receptor tyr kinases (RTKs) containing an extracellular ligand-binding region with two immunoglobulin-like domains followed by two fibronectin type III repeats, a transmembrane segment, and an intracellular catalytic domain. Binding to their ligands, Gas6 and protein S, leads to receptor dimerization, autophosphorylation, activation, and intracellular signaling. TAM proteins are implicated in a variety of cellular effects including survival, proliferation, migration, and phagocytosis. They are also associated with several types of cancer as well as inflammatory, autoimmune, vascular, and kidney diseases. The TAM subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	273
270632	cd05036	PTKc_ALK_LTK	Catalytic domain of the Protein Tyrosine Kinases, Anaplastic Lymphoma Kinase and Leukocyte Tyrosine Kinase. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyr residues in protein substrates. ALK and LTK are orphan receptor PTKs (RTKs) whose ligands are not yet well-defined. ALK appears to play an important role in mammalian neural development as well as visceral muscle differentiation in Drosophila. ALK is aberrantly expressed as fusion proteins, due to chromosomal translocations, in about 60% of anaplastic large cell lymphomas (ALCLs). ALK fusion proteins are also found in rare cases of diffuse large B cell lymphomas (DLBCLs). LTK is mainly expressed in B lymphocytes and neuronal tissues. It is important in cell proliferation and survival. Transgenic mice expressing LTK display retarded growth and high mortality rate. In addition, a polymorphism in mouse and human LTK is implicated in the pathogenesis of systemic lupus erythematosus. RTKs contain an extracellular ligand-binding domain, a transmembrane region, and an intracellular tyr kinase domain. They are usually activated through ligand binding, which causes dimerization and autophosphorylation of the intracellular tyr kinase catalytic domain. The ALK/LTK subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	277
270633	cd05037	PTK_Jak_rpt1	Pseudokinase (repeat 1) domain of the Protein Tyrosine Kinases, Janus kinases. The Jak subfamily is composed of Jak1, Jak2, Jak3, TYK2, and similar proteins. They are cytoplasmic (or nonreceptor) PTKs containing an N-terminal FERM domain, followed by a Src homology 2 (SH2) domain, a pseudokinase domain, and a C-terminal catalytic tyr kinase domain. The pseudokinase domain shows similarity to tyr kinases but lacks crucial residues for catalytic activity and ATP binding. It modulates the kinase activity of the C-terminal catalytic domain. In the case of Jak2, the presumed pseudokinase (repeat 1) domain exhibits dual-specificity kinase activity, phosphorylating two negative regulatory sites in Jak2: Ser523 and Tyr570. Most Jaks are expressed in a wide variety of tissues, except for Jak3, which is expressed only in hematopoietic cells. Jaks are crucial for cytokine receptor signaling. They are activated by autophosphorylation upon cytokine-induced receptor aggregation, and subsequently trigger downstream signaling events such as the phosphorylation of signal transducers and activators of transcription (STATs). Jaks are also involved in regulating the surface expression of some cytokine receptors. The Jak-STAT pathway is involved in many biological processes including hematopoiesis, immunoregulation, host defense, fertility, lactation, growth, and embryogenesis. The Jak subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	259
270634	cd05038	PTKc_Jak_rpt2	Catalytic (repeat 2) domain of the Protein Tyrosine Kinases, Janus kinases. The Jak subfamily is composed of Jak1, Jak2, Jak3, TYK2, and similar proteins. They are PTKs, catalyzing the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Jaks are cytoplasmic (or nonreceptor) PTKs containing an N-terminal FERM domain, followed by a Src homology 2 (SH2) domain, a pseudokinase domain, and a C-terminal tyr kinase catalytic domain. Most Jaks are expressed in a wide variety of tissues, except for Jak3, which is expressed only in hematopoietic cells. Jaks are crucial for cytokine receptor signaling. They are activated by autophosphorylation upon cytokine-induced receptor aggregation, and subsequently trigger downstream signaling events such as the phosphorylation of signal transducers and activators of transcription (STATs). Jaks are also involved in regulating the surface expression of some cytokine receptors. The Jak-STAT pathway is involved in many biological processes including hematopoiesis, immunoregulation, host defense, fertility, lactation, growth, and embryogenesis. The Jak subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	284
270635	cd05039	PTKc_Csk_like	Catalytic domain of C-terminal Src kinase-like Protein Tyrosine Kinases. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. This subfamily is composed of Csk, Chk, and similar proteins. They are cytoplasmic (or nonreceptor) PTKs containing the Src homology domains, SH3 and SH2, N-terminal to the catalytic tyr kinase domain. They negatively regulate the activity of Src kinases that are anchored to the plasma membrane. To inhibit Src kinases, Csk and Chk are translocated to the membrane via binding to specific transmembrane proteins, G-proteins, or adaptor proteins near the membrane. Csk catalyzes the tyr phosphorylation of the regulatory C-terminal tail of Src kinases, resulting in their inactivation. Chk inhibits Src kinases using a noncatalytic mechanism by simply binding to them. As negative regulators of Src kinases, Csk and Chk play important roles in cell proliferation, survival, and differentiation, and consequently, in cancer development and progression. The Csk-like subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	256
270636	cd05040	PTKc_Ack_like	Catalytic domain of the Protein Tyrosine Kinase, Activated Cdc42-associated kinase. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. This subfamily includes Ack1, thirty-eight-negative kinase 1 (Tnk1), and similar proteins. They are cytoplasmic (or nonreceptor) PTKs containing an N-terminal catalytic domain, an SH3 domain, a Cdc42-binding CRIB domain, and a proline-rich region. They are mainly expressed in brain and skeletal tissues and are involved in the regulation of cell adhesion and growth, receptor degradation, and axonal guidance. Ack1 is also associated with androgen-independent prostate cancer progression. Tnk1 regulates TNFalpha signaling and may play an important role in cell death. The Ack-like subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	258
270637	cd05041	PTKc_Fes_like	Catalytic domain of Fes-like Protein Tyrosine Kinases. Protein Tyrosine Kinase (PTK) family; Fes subfamily; catalytic (c) domain. Fes subfamily members include Fes (or Fps), Fer, and similar proteins. The PTKc family is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, and phosphoinositide 3-kinase (PI3K). PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Fes subfamily proteins are cytoplasmic (or nonreceptor) tyr kinases containing an N-terminal region with FCH (Fes/Fer/CIP4 homology) and coiled-coil domains, followed by a SH2 domain, and a C-terminal catalytic domain. The genes for Fes (feline sarcoma) and Fps (Fujinami poultry sarcoma) were first isolated from tumor-causing retroviruses. The viral oncogenes encode chimeric Fes proteins consisting of Gag sequences at the N-termini, resulting in unregulated tyr kinase activity. Fes and Fer kinases play roles in haematopoiesis, inflammation and immunity, growth factor signaling, cytoskeletal regulation, cell migration and adhesion, and the regulation of cell-cell interactions. Fes and Fer show redundancy in their biological functions.	251
270638	cd05042	PTKc_Aatyk	Catalytic domain of the Protein Tyrosine Kinases, Apoptosis-associated tyrosine kinases. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. The Aatyk subfamily is also referred to as the lemur tyrosine kinase (Lmtk) subfamily. It consists of Aatyk1 (Lmtk1), Aatyk2 (Lmtk2, Brek), Aatyk3 (Lmtk3), and similar proteins. Aatyk proteins are mostly receptor PTKs (RTKs) containing a transmembrane segment and a long C-terminal cytoplasmic tail with a catalytic domain. Aatyk1 does not contain a transmembrane segment and is a cytoplasmic (or nonreceptor) kinase. Aatyk proteins are classified as PTKs based on overall sequence similarity and the phylogenetic tree. However, analysis of catalytic residues suggests that Aatyk proteins may be multispecific kinases, functioning also as serine/threonine kinases. They are involved in neural differentiation, nerve growth factor (NGF) signaling, apoptosis, and spermatogenesis. The Aatyk subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	269
270639	cd05043	PTK_Ryk	Pseudokinase domain of Ryk (Receptor related to tyrosine kinase). Ryk is a receptor tyr kinase (RTK) containing an extracellular region with two leucine-rich motifs, a transmembrane segment, and an intracellular inactive pseudokinase domain, which shows similarity to tyr kinases but lacks crucial residues for catalytic activity and ATP binding. The extracellular region of Ryk shows homology to the N-terminal domain of Wnt inhibitory factor-1 (WIF) and serves as the ligand (Wnt) binding domain of Ryk. Ryk is expressed in many different tissues both during development and in adults, suggesting a widespread function. It acts as a chemorepulsive axon guidance receptor of Wnt glycoproteins and is responsible for the establishment of axon tracts during the development of the central nervous system. In addition, studies in mice reveal that Ryk is essential in skeletal, craniofacial, and cardiac development. Thus, it appears Ryk is involved in signal transduction despite its lack of kinase activity. Ryk may function as an accessory protein that modulates the signals coming from catalytically active partner RTKs such as the Eph receptors. The Ryk subfamily is part of a larger superfamily that includes other pseudokinases and the catalytic domains of active kinases including PTKs, protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	279
270640	cd05044	PTKc_c-ros	Catalytic domain of the Protein Tyrosine Kinase, C-ros. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. This subfamily contains c-ros, Sevenless, and similar proteins. The proto-oncogene c-ros encodes an orphan receptor PTK (RTK) with an unknown ligand. RTKs contain an extracellular ligand-binding domain, a transmembrane region, and an intracellular tyr kinase domain. RTKs are usually activated through ligand binding, which causes dimerization and autophosphorylation of the intracellular tyr kinase catalytic domain. C-ros is expressed in embryonic cells of the kidney, intestine and lung, but disappears soon after birth. It persists only in the adult epididymis. Male mice bearing inactive mutations of c-ros lack the initial segment of the epididymis and are infertile. The Drosophila protein, Sevenless, is required for the specification of the R7 photoreceptor cell during eye development. The c-ros subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	268
173631	cd05045	PTKc_RET	Catalytic domain of the Protein Tyrosine Kinase, REarranged during Transfection protein. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. RET is a receptor PTK (RTK) containing an extracellular region with four cadherin-like repeats, a calcium-binding site, and a cysteine-rich domain, a transmembrane segment, and an intracellular catalytic domain. It is part of a multisubunit complex that binds glial-derived neurotrophic factor (GDNF) family ligands (GFLs) including GDNF, neurturin, artemin, and persephin. GFLs bind RET along with four GPI-anchored coreceptors, bringing two RET molecules together, leading to autophosphorylation, activation, and intracellular signaling. RET is essential for the development of the sympathetic, parasympathetic and enteric nervous systems, and the kidney. RET disruption by germline mutations causes diseases in humans including congenital aganglionosis of the gastrointestinal tract (Hirschsprung's disease) and three related inherited cancers: multiple endocrine neoplasia type 2A (MEN2A), MEN2B, and familial medullary thyroid carcinoma. The RET subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	290
133178	cd05046	PTK_CCK4	Pseudokinase domain of the Protein Tyrosine Kinase, Colon Carcinoma Kinase 4. CCK4, also called protein tyrosine kinase 7 (PTK7), is an orphan receptor PTK (RTK) containing an extracellular region with seven immunoglobulin domains, a transmembrane segment, and an intracellular inactive pseudokinase domain, which shows similarity to tyr kinases but lacks crucial residues for catalytic activity and ATP binding. Studies in mice reveal that CCK4 is essential for neural development. Mouse embryos containing a truncated CCK4 die perinatally and display craniorachischisis, a severe form of neural tube defect. The mechanism of action of the CCK4 pseudokinase is still unknown. Other pseudokinases such as HER3 rely on the activity of partner RTKs. The CCK4 subfamily is part of a larger superfamily that includes other pseudokinases and the catalytic domains of active kinases including PTKs, protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	275
270641	cd05047	PTKc_Tie	Catalytic domain of Tie Protein Tyrosine Kinases. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Tie proteins, consisting of Tie1 and Tie2, are receptor PTKs (RTKs) containing an extracellular region, a transmembrane segment, and an intracellular catalytic domain. The extracellular region contains an immunoglobulin (Ig)-like domain, three epidermal growth factor (EGF)-like domains, a second Ig-like domain, and three fibronectin type III repeats. Tie receptors are specifically expressed in endothelial cells and hematopoietic stem cells. The angiopoietins (Ang-1 to Ang-4) serve as ligands for Tie2, while no specific ligand has been identified for Tie1. The binding of Ang-1 to Tie2 leads to receptor autophosphorylation and activation, promoting cell migration and survival. In contrast, Ang-2 binding to Tie2 does not result in the same response, suggesting that Ang-2 may function as an antagonist. In vivo studies of Tie1 show that it is critical in vascular development. The Tie subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	270
270642	cd05048	PTKc_Ror	Catalytic Domain of the Protein Tyrosine Kinases, Receptor tyrosine kinase-like Orphan Receptors. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. The Ror subfamily consists of Ror1, Ror2, and similar proteins. Ror proteins are orphan receptor PTKs (RTKs) containing an extracellular region with immunoglobulin-like, cysteine-rich, and kringle domains, a transmembrane segment, and an intracellular catalytic domain. Ror RTKs are unrelated to the nuclear receptor subfamily called retinoid-related orphan receptors (RORs). RTKs are usually activated through ligand binding, which causes dimerization and autophosphorylation of the intracellular tyr kinase catalytic domain. Ror kinases are expressed in many tissues during development. They play important roles in bone and heart formation. Mutations in human Ror2 result in two different bone development genetic disorders, recessive Robinow syndrome and brachydactyly type B. Drosophila Ror is expressed only in the developing nervous system during neurite outgrowth and neuronal differentiation, suggesting a role for Drosophila Ror in neural development. More recently, mouse Ror1 and Ror2 have also been found to play an important role in regulating neurite growth in central neurons. Ror1 and Ror2 are believed to have some overlapping and redundant functions. The Ror subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	283
270643	cd05049	PTKc_Trk	Catalytic domain of the Protein Tyrosine Kinases, Tropomyosin Related Kinases. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. The Trk subfamily consists of TrkA, TrkB, TrkC, and similar proteins. They are receptor PTKs (RTKs) containing an extracellular region with arrays of leucine-rich motifs flanked by two cysteine-rich clusters followed by two immunoglobulin-like domains, a transmembrane segment, and an intracellular catalytic domain. Binding to their ligands, the nerve growth factor (NGF) family of neurotrophins, leads to Trk receptor oligomerization and activation of the catalytic domain. Trk receptors are mainly expressed in the peripheral and central nervous systems. They play important roles in cell fate determination, neuronal survival and differentiation, as well as in the regulation of synaptic plasticity. Altered expression of Trk receptors is associated with many human diseases. The Trk subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	280
133181	cd05050	PTKc_Musk	Catalytic domain of the Protein Tyrosine Kinase, Muscle-specific kinase. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Musk is a receptor PTK (RTK) containing an extracellular region with four immunoglobulin-like domains and a cysteine-rich cluster, a transmembrane segment, and an intracellular catalytic domain. Musk is expressed and concentrated in the postsynaptic membrane in skeletal muscle. It is essential for the establishment of the neuromuscular junction (NMJ), a peripheral synapse that conveys signals from motor neurons to muscle cells. Agrin, a large proteoglycan released from motor neurons, stimulates Musk autophosphorylation and activation, leading to the clustering of acetylcholine receptors (AChRs). To date, there is no evidence to suggest that agrin binds directly to Musk. Mutations in AChR, Musk and other partners are responsible for diseases of the NMJ, such as the autoimmune syndrome myasthenia gravis. The Musk subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	288
270644	cd05051	PTKc_DDR	Catalytic domain of the Protein Tyrosine Kinases, Discoidin Domain Receptors. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. The DDR subfamily consists of homologs of mammalian DDR1, DDR2, and similar proteins. They are receptor PTKs (RTKs) containing an extracellular discoidin homology domain, a transmembrane segment, an extended juxtamembrane region, and an intracellular catalytic domain. The binding of the ligand, collagen, to DDRs results in a slow but sustained receptor activation. DDRs regulate cell adhesion, proliferation, and extracellular matrix remodeling. They have been linked to a variety of human cancers including breast, colon, ovarian, brain, and lung. There is no evidence showing that DDRs act as transforming oncogenes. They are more likely to play a role in the regulation of tumor growth and metastasis. The DDR subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	297
270645	cd05052	PTKc_Abl	Catalytic domain of the Protein Tyrosine Kinase, Abelson kinase. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Abl (or c-Abl) is a ubiquitously-expressed cytoplasmic (or nonreceptor) PTK that contains SH3, SH2, and tyr kinase domains in its N-terminal region, as well as nuclear localization motifs, a putative DNA-binding domain, and F- and G-actin binding domains in its C-terminal tail. It also contains a short autoinhibitory cap region in its N-terminus. Abl function depends on its subcellular localization. In the cytoplasm, Abl plays a role in cell proliferation and survival. In response to DNA damage or oxidative stress, Abl is transported to the nucleus where it induces apoptosis. In chronic myelogenous leukemia (CML) patients, an aberrant translocation results in the replacement of the first exon of Abl with the BCR (breakpoint cluster region) gene. The resulting BCR-Abl fusion protein is constitutively active and associates into tetramers, resulting in a hyperactive kinase sending a continuous signal. This leads to uncontrolled proliferation, morphological transformation and anti-apoptotic effects. BCR-Abl is the target of selective inhibitors, such as imatinib (Gleevec), used in the treatment of CML. Abl2, also known as ARG (Abelson-related gene), is thought to play a cooperative role with Abl in the proper development of the nervous system. The Tel-ARG fusion protein, resulting from reciprocal translocation between chromosomes 1 and 12, is associated with acute myeloid leukemia (AML). The TEL gene is a frequent fusion partner of other tyr kinase oncogenes, including Tel/Abl, Tel/PDGFRbeta, and Tel/Jak2, found in patients with leukemia and myeloproliferative disorders. The Abl subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	263
270646	cd05053	PTKc_FGFR	Catalytic domain of the Protein Tyrosine Kinases, Fibroblast Growth Factor Receptors. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. The FGFR subfamily consists of FGFR1, FGFR2, FGFR3, FGFR4, and similar proteins. They are receptor PTKs (RTKs) containing an extracellular ligand-binding region with three immunoglobulin-like domains, a transmembrane segment, and an intracellular catalytic domain. The binding of FGFRs to their ligands, the FGFs, and to heparin/heparan sulfate (HS) results in the formation of a ternary complex, which leads to receptor dimerization and activation, and intracellular signaling. There are at least 23 FGFs and four types of FGFRs. The binding of FGFs to FGFRs is promiscuous, in that a receptor may be activated by several ligands and a ligand may bind to more than one type of receptor. FGF/FGFR signaling is important in the regulation of embryonic development, homeostasis, and regenerative processes. Depending on the cell type and stage, FGFR signaling produces diverse cellular responses including proliferation, growth arrest, differentiation, and apoptosis. Aberrant signaling leads to many human diseases such as skeletal, olfactory, and metabolic disorders, as well as cancer. The FGFR subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	294
270647	cd05054	PTKc_VEGFR	Catalytic domain of the Protein Tyrosine Kinases, Vascular Endothelial Growth Factor Receptors. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. The VEGFR subfamily consists of VEGFR1 (Flt1), VEGFR2 (Flk1), VEGFR3 (Flt4), and similar proteins. VEGFR subfamily members are receptor PTKs (RTKs) containing an extracellular ligand-binding region with seven immunoglobulin (Ig)-like domains, a transmembrane segment, and an intracellular catalytic domain. In VEGFR3, the fifth Ig-like domain is replaced by a disulfide bridge. The binding of VEGFRs to their ligands, the VEGFs, leads to receptor dimerization, activation, and intracellular signaling. There are five VEGF ligands in mammals, which bind in an overlapping pattern to the three VEGFRs, which can form homo- or heterodimers. VEGFRs regulate the cardiovascular system. They are critical for vascular development during embryogenesis and blood vessel formation in adults. They induce cellular functions common to other growth factor receptors such as cell migration, survival, and proliferation. The VEGFR subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	298
133186	cd05055	PTKc_PDGFR	Catalytic domain of the Protein Tyrosine Kinases, Platelet Derived Growth Factor Receptors. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. The PDGFR subfamily consists of PDGFR alpha, PDGFR beta, KIT, CSF-1R, the mammalian FLT3, and similar proteins. They are receptor PTKs (RTKs) containing an extracellular ligand-binding region with five immunoglobulin-like domains, a transmembrane segment, and an intracellular catalytic domain. PDGFR kinase domains are autoinhibited by their juxtamembrane regions containing tyr residues. The binding to their ligands leads to receptor dimerization, trans phosphorylation and activation, and intracellular signaling. PDGFR subfamily receptors are important in the development of a variety of cells. PDGFRs are expressed in many cells including fibroblasts, neurons, endometrial cells, mammary epithelial cells, and vascular smooth muscle cells. PDGFR signaling is critical in normal embryonic development, angiogenesis, and wound healing. Kit is important in the development of melanocytes, germ cells, mast cells, hematopoietic stem cells, the interstitial cells of Cajal, and the pacemaker cells of the GI tract. CSF-1R signaling is critical in the regulation of macrophages and osteoclasts. Mammalian FLT3 plays an important role in the survival, proliferation, and differentiation of stem cells. The PDGFR subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	302
133187	cd05056	PTKc_FAK	Catalytic domain of the Protein Tyrosine Kinase, Focal Adhesion Kinase. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. FAK is a cytoplasmic (or nonreceptor) PTK that contains an autophosphorylation site and a FERM domain at the N-terminus, a central tyr kinase domain, proline-rich regions, and a C-terminal FAT (focal adhesion targeting) domain. FAK activity is dependent on integrin-mediated cell adhesion, which facilitates N-terminal autophosphorylation. Full activation is achieved by the phosphorylation of its two adjacent A-loop tyrosines. FAK is important in mediating signaling initiated at sites of cell adhesions and at growth factor receptors. Through diverse molecular interactions, FAK functions as a biosensor or integrator to control cell motility. It is a key regulator of cell survival, proliferation, migration and invasion, and thus plays an important role in the development and progression of cancer. Src binds to autophosphorylated FAK forming the FAK-Src dual kinase complex, which is activated in a wide variety of tumor cells and generates signals promoting growth and metastasis. FAK is being developed as a target for cancer therapy. The FAK subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	270
270648	cd05057	PTKc_EGFR_like	Catalytic domain of Epidermal Growth Factor Receptor-like Protein Tyrosine Kinases. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. EGFR (HER, ErbB) subfamily members include EGFR (HER1, ErbB1), HER2 (ErbB2), HER3 (ErbB3), HER4 (ErbB4), and similar proteins. They are receptor PTKs (RTKs) containing an extracellular EGF-related ligand-binding region, a transmembrane helix, and a cytoplasmic region with a tyr kinase domain and a regulatory C-terminal tail. Unlike other PTKs, phosphorylation of the activation loop of EGFR proteins is not critical to their activation. Instead, they are activated by ligand-induced dimerization, resulting in the phosphorylation of tyr residues in the C-terminal tail, which serve as binding sites for downstream signaling molecules. Collectively, they can recognize a variety of ligands including EGF, TGFalpha, and neuregulins, among others. All four subfamily members can form homo- or heterodimers. HER3 contains an impaired kinase domain and depends on its heterodimerization partner for activation. EGFR subfamily members are involved in signaling pathways leading to a broad range of cellular responses including cell proliferation, differentiation, migration, growth inhibition, and apoptosis. Gain of function alterations, through their overexpression, deletions, or point mutations in their kinase domains, have been implicated in various cancers. These receptors are targets of many small molecule inhibitors and monoclonal antibodies used in cancer therapy. The EGFR subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	279
270649	cd05058	PTKc_Met_Ron	Catalytic domain of the Protein Tyrosine Kinases, Met and Ron. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Met and Ron are receptor PTKs (RTKs) composed of an alpha-beta heterodimer. The extracellular alpha chain is disulfide linked to the beta chain, which contains an extracellular ligand-binding region with a sema domain, a PSI domain and four IPT repeats, a transmembrane segment, and an intracellular catalytic domain. Binding to their ligands leads to receptor dimerization, autophosphorylation, activation, and intracellular signaling. Met binds to the ligand, hepatocyte growth factor/scatter factor (HGF/SF), and is also called the HGF receptor. HGF/Met signaling plays a role in growth, transformation, cell motility, invasion, metastasis, angiogenesis, wound healing, and tissue regeneration. Aberrant expression of Met through mutations or gene amplification is associated with many human cancers including hereditary papillary renal and gastric carcinomas. The ligand for Ron is macrophage stimulating protein (MSP). Ron signaling is important in regulating cell motility, adhesion, proliferation, and apoptosis. Aberrant Ron expression is implicated in tumorigenesis and metastasis. The Met/Ron subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	262
173637	cd05059	PTKc_Tec_like	Catalytic domain of Tec-like Protein Tyrosine Kinases. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. The Tec-like subfamily is composed of Tec, Btk, Bmx (Etk), Itk (Tsk, Emt), Rlk (Txk), and similar proteins. They are cytoplasmic (or nonreceptor) PTKs with similarity to Src kinases in that they contain Src homology protein interaction domains (SH3, SH2) N-terminal to the catalytic tyr kinase domain. Unlike Src kinases, most Tec subfamily members except Rlk also contain an N-terminal pleckstrin homology (PH) domain, which binds the products of PI3K and allows membrane recruitment and activation. In addition, some members contain the Tec homology (TH) domain, which contains proline-rich and zinc-binding regions. Tec kinases form the second largest subfamily of nonreceptor PTKs and are expressed mainly by haematopoietic cells, although Tec and Bmx are also found in endothelial cells. B-cells express Btk and Tec, while T-cells express Itk, Txk, and Tec. Collectively, Tec kinases are expressed in a variety of myeloid cells such as mast cells, platelets, macrophages, and dendritic cells. Each Tec kinase shows a distinct cell-type pattern of expression. Tec kinases play important roles in the development, differentiation, maturation, regulation, survival, and function of B-cells and T-cells. Mutations in Btk cause the severe B-cell immunodeficiency, X-linked agammaglobulinaemia (XLA). The Tec-like subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	256
270650	cd05060	PTKc_Syk_like	Catalytic domain of Spleen Tyrosine Kinase-like Protein Tyrosine Kinases. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. The Syk-like subfamily is composed of Syk, ZAP-70, Shark, and similar proteins. They are cytoplasmic (or nonreceptor) PTKs containing two Src homology 2 (SH2) domains N-terminal to the catalytic tyr kinase domain. They are involved in the signaling downstream of activated receptors (including B-cell, T-cell, and Fc receptors) that contain ITAMs (immunoreceptor tyr activation motifs), leading to processes such as cell proliferation, differentiation, survival, adhesion, migration, and phagocytosis. Syk is important in B-cell receptor signaling, while ZAP-70 is primarily expressed in T-cells and NK cells, and is a crucial component in T-cell receptor signaling. Syk also plays a central role in Fc receptor-mediated phagocytosis in the adaptive immune system. Shark is exclusively expressed in ectodermally derived epithelia, and is localized preferentially to the apical surface of the epithelial cells; it may play a role in a signaling pathway for epithelial cell polarity. The Syk-like subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	257
133192	cd05061	PTKc_InsR	Catalytic domain of the Protein Tyrosine Kinase, Insulin Receptor. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. InsR is a receptor PTK (RTK) that is composed of two alphabeta heterodimers. Binding of the insulin ligand to the extracellular alpha subunit activates the intracellular tyr kinase domain of the transmembrane beta subunit. Receptor activation leads to autophosphorylation, stimulating downstream kinase activities, which initiate signaling cascades and biological function. InsR signaling plays an important role in many cellular processes including glucose homeostasis, glycogen synthesis, lipid and protein metabolism, ion and amino acid transport, cell cycle and proliferation, cell differentiation, gene transcription, and nitric oxide synthesis. Insulin resistance, caused by abnormalities in InsR signaling, has been described in diabetes, hypertension, cardiovascular disease, metabolic syndrome, heart failure, and female infertility. The InsR subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	288
133193	cd05062	PTKc_IGF-1R	Catalytic domain of the Protein Tyrosine Kinase, Insulin-like Growth Factor-1 Receptor. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. IGF-1R is a receptor PTK (RTK) that is composed of two alphabeta heterodimers. Binding of the ligand (IGF-1 or IGF-2) to the extracellular alpha subunit activates the intracellular tyr kinase domain of the transmembrane beta subunit. Receptor activation leads to autophosphorylation, which stimulates downstream kinase activities and biological function. IGF-1R signaling is important in the differentiation, growth, and survival of normal cells. In cancer cells, where it is frequently overexpressed, IGF-1R is implicated in proliferation, the suppression of apoptosis, invasion, and metastasis. IGF-1R is being developed as a therapeutic target in cancer treatment. The IGF-1R subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	277
133194	cd05063	PTKc_EphR_A2	Catalytic domain of the Protein Tyrosine Kinase, Ephrin Receptor A2. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. The EphA2 receptor is overexpressed in tumor cells and tumor blood vessels in a variety of cancers including breast, prostate, lung, and colon. As a result, it is an attractive target for drug design since its inhibition could affect several aspects of tumor progression. EphRs comprise the largest subfamily of receptor PTKs (RTKs). Class EphA receptors bind GPI-anchored ephrin-A ligands. There are ten vertebrate EphA receptors (EphA1-10), which display promiscuous interactions with six ephrin-A ligands. EphRs contain an ephrin binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyr kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling). Ephrin/EphR interaction mainly results in cell-cell repulsion or adhesion, making it important in neural development and plasticity, cell morphogenesis, cell-fate determination, embryonic development, tissue patterning, and angiogenesis. The EphA2 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, and phosphoinositide 3-kinase (PI3K).	268
133195	cd05064	PTKc_EphR_A10	Catalytic domain of the Protein Tyrosine Kinase, Ephrin Receptor A10. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. EphA10, which contains an inactive tyr kinase domain, may function to attenuate signals of co-clustered active receptors. EphA10 is mainly expressed in the testis. Ephrin/EphR interaction results in cell-cell repulsion or adhesion, making it important in neural development and plasticity, cell morphogenesis, cell-fate determination, embryonic development, tissue patterning, and angiogenesis. EphRs comprise the largest subfamily of receptor tyr kinases (RTKs). In general, class EphA receptors bind GPI-anchored ephrin-A ligands. There are ten vertebrate EphA receptors (EphA1-10), which display promiscuous interactions with six ephrin-A ligands. EphRs contain an ephrin binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyr kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling). The EphA10 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	266
173638	cd05065	PTKc_EphR_B	Catalytic domain of the Protein Tyrosine Kinases, Class EphB Ephrin Receptors. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Class EphB receptors bind to transmembrane ephrin-B ligands. There are six vertebrate EphB receptors (EphB1-6), which display promiscuous interactions with three ephrin-B ligands. One exception is EphB2, which also interacts with ephrin A5. EphB receptors play important roles in synapse formation and plasticity, spine morphogenesis, axon guidance, and angiogenesis. In the intestinal epithelium, EphBs are Wnt signaling target genes that control cell compartmentalization. They function as suppressors of colon cancer progression. EphRs comprise the largest subfamily of receptor PTKs (RTKs). They contain an ephrin-binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyr kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling). Ephrin/EphR interaction mainly results in cell-cell repulsion or adhesion. The EphB subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	269
270651	cd05066	PTKc_EphR_A	Catalytic domain of the Protein Tyrosine Kinases, Class EphA Ephrin Receptors. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. This subfamily is composed of most class EphA receptors including EphA3, EphA4, EphA5, and EphA7, but excluding EphA1, EphA2 and EphA10. Class EphA receptors bind GPI-anchored ephrin-A ligands. There are ten vertebrate EphA receptors (EphA1-10), which display promiscuous interactions with six ephrin-A ligands. One exception is EphA4, which also binds ephrins-B2/B3. EphA receptors and ephrin-A ligands are expressed in multiple areas of the developing brain, especially in the retina and tectum. They are part of a system controlling retinotectal mapping. EphRs comprise the largest subfamily of receptor PTKs (RTKs). EphRs contain an ephrin-binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyr kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling). Ephrin/EphR interaction mainly results in cell-cell repulsion or adhesion, making it important in neural development and plasticity, cell morphogenesis, cell-fate determination, embryonic development, tissue patterning, and angiogenesis. The EphA subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	267
270652	cd05067	PTKc_Lck_Blk	Catalytic domain of the Protein Tyrosine Kinases, Lymphocyte-specific kinase and Blk. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Lck and Blk are members of the Src subfamily of proteins, which are cytoplasmic (or non-receptor) PTKs. Lck is expressed in T-cells and natural killer cells. It plays a critical role in T-cell maturation, activation, and T-cell receptor (TCR) signaling. Lck phosphorylates ITAM (immunoreceptor tyr activation motif) sequences on several subunits of TCRs, leading to the activation of different second messenger cascades. Phosphorylated ITAMs serve as binding sites for other signaling factors such as Syk and ZAP-70, leading to their activation and propagation of downstream events. In addition, Lck regulates drug-induced apoptosis by interfering with the mitochondrial death pathway. The apoptotic role of Lck is independent of its primary function in T-cell signaling. Blk is expressed specifically in B-cells. It is involved in pre-BCR (B-cell receptor) signaling. Src kinases contain an N-terminal SH4 domain with a myristoylation site, followed by SH3 and SH2 domains, a tyr kinase domain, and a regulatory C-terminal region containing a conserved tyr. They are activated by autophosphorylation at the tyr kinase domain, but are negatively regulated by phosphorylation at the C-terminal tyr by Csk (C-terminal Src Kinase). The Lck/Blk subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	264
270653	cd05068	PTKc_Frk_like	Catalytic domain of Fyn-related kinase-like Protein Tyrosine Kinases. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Frk and Srk are members of the Src subfamily of proteins, which are cytoplasmic (or non-receptor) PTKs. Frk, also known as Rak, is specifically expressed in liver, lung, kidney, intestine, mammary glands, and the islets of Langerhans. Rodent homologs were previously referred to as GTK (gastrointestinal tyr kinase), BSK (beta-cell Src-like kinase), or IYK (intestinal tyr kinase). Studies in mice reveal that Frk is not essential for viability. It plays a role in the signaling that leads to cytokine-induced beta-cell death in Type I diabetes. It also regulates beta-cell number during embryogenesis and early in life. Src kinases contain an N-terminal SH4 domain with a myristoylation site, followed by SH3 and SH2 domains, a tyr kinase domain, and a regulatory C-terminal region containing a conserved tyr. They are activated by autophosphorylation at the tyr kinase domain, but are negatively regulated by phosphorylation at the C-terminal tyr by Csk (C-terminal Src Kinase). The Frk-like subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	267
270654	cd05069	PTKc_Yes	Catalytic domain of the Protein Tyrosine Kinase, Yes. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Yes (or c-Yes) is a member of the Src subfamily of proteins, which are cytoplasmic (or non-receptor) PTKs. c-Yes kinase is the cellular homolog of the oncogenic protein (v-Yes) encoded by the Yamaguchi 73 and Esh sarcoma viruses. It displays functional overlap with other Src subfamily members, particularly Src. It also shows some unique functions such as binding to occludins, transmembrane proteins that regulate extracellular interactions in tight junctions. Yes also associates with a number of proteins in different cell types that Src does not interact with, like JAK2 and gp130 in pre-adipocytes, and Pyk2 in treated pulmonary vein endothelial cells. Although the biological function of Yes remains unclear, it appears to have a role in regulating cell-cell interactions and vesicle trafficking in polarized cells. Src kinases contain an N-terminal SH4 domain with a myristoylation site, followed by SH3 and SH2 domains, a tyr kinase domain, and a regulatory C-terminal region containing a conserved tyr. They are activated by autophosphorylation at the tyr kinase domain, but are negatively regulated by phosphorylation at the C-terminal tyr by Csk (C-terminal Src Kinase). The Yes subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase (PI3K).	279
270655	cd05070	PTKc_Fyn	Catalytic domain of the Protein Tyrosine Kinase, Fyn. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Fyn and Yrk are members of the Src subfamily of proteins, which are cytoplasmic (or non-receptor) PTKs. Fyn, together with Lck, plays a critical role in T-cell signal transduction by phosphorylating ITAM (immunoreceptor tyr activation motif) sequences on T-cell receptors, ultimately leading to the proliferation and differentiation of T-cells. In addition, Fyn is involved in the myelination of neurons, and is implicated in Alzheimer's and Parkinson's diseases. Src kinases contain an N-terminal SH4 domain with a myristoylation site, followed by SH3 and SH2 domains, a tyr kinase domain, and a regulatory C-terminal region containing a conserved tyr. They are activated by autophosphorylation at the tyr kinase domain, but are negatively regulated by phosphorylation at the C-terminal tyr by Csk (C-terminal Src Kinase). The Fyn/Yrk subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, and phosphoinositide 3-kinase.	274
270656	cd05071	PTKc_Src	Catalytic domain of the Protein Tyrosine Kinase, Src. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Src (or c-Src) is a cytoplasmic (or non-receptor) PTK, containing an N-terminal SH4 domain with a myristoylation site, followed by SH3 and SH2 domains, a tyr kinase domain, and a regulatory C-terminal region with a conserved tyr. It is activated by autophosphorylation at the tyr kinase domain, and is negatively regulated by phosphorylation at the C-terminal tyr by Csk (C-terminal Src Kinase). c-Src is the vertebrate homolog of the oncogenic protein (v-Src) from Rous sarcoma virus. Together with other Src subfamily proteins, it is involved in signaling pathways that regulate cytokine and growth factor responses, cytoskeleton dynamics, cell proliferation, survival, and differentiation. Src also plays a role in regulating cell adhesion, invasion, and motility in cancer cells and tumor vasculature, contributing to cancer progression and metastasis. Elevated levels of Src kinase activity have been reported in a variety of human cancers. Several inhibitors of Src have been developed as anti-cancer drugs. Src is also implicated in acute inflammatory responses and osteoclast function. The Src subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	277
270657	cd05072	PTKc_Lyn	Catalytic domain of the Protein Tyrosine Kinase, Lyn. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Lyn is a member of the Src subfamily of proteins, which are cytoplasmic (or non-receptor) PTKs. Lyn is expressed in B lymphocytes and myeloid cells. It exhibits both positive and negative regulatory roles in B cell receptor (BCR) signaling. Lyn, as well as Fyn and Blk, promotes B cell activation by phosphorylating ITAMs (immunoreceptor tyr activation motifs) in CD19 and in Ig components of BCR. It negatively regulates signaling by its unique ability to phosphorylate ITIMs (immunoreceptor tyr inhibition motifs) in cell surface receptors like CD22 and CD5. Lyn also plays an important role in G-CSF receptor signaling by phosphorylating a variety of adaptor molecules. Src kinases contain an N-terminal SH4 domain with a myristoylation site, followed by SH3 and SH2 domains, a tyr kinase domain, and a regulatory C-terminal region containing a conserved tyr. They are activated by autophosphorylation at the tyr kinase domain, but are negatively regulated by phosphorylation at the C-terminal tyr by Csk (C-terminal Src Kinase). The Lyn subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	272
270658	cd05073	PTKc_Hck	Catalytic domain of the Protein Tyrosine Kinase, Hematopoietic cell kinase. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Hck is a member of the Src subfamily of proteins, which are cytoplasmic (or non-receptor) PTKs. Hck is present in myeloid and lymphoid cells that play a role in the development of cancer. It may be important in the oncogenic signaling of the protein Tel-Abl, which induces a chronic myelogenous leukemia (CML)-like disease. Hck also acts as a negative regulator of G-CSF-induced proliferation of granulocytic precursors, suggesting a possible role in the development of acute myeloid leukemia (AML). In addition, Hck is essential in regulating the degranulation of polymorphonuclear leukocytes. Genetic polymorphisms affect the expression level of Hck, which affects PMN mediator release and influences the development of chronic obstructive pulmonary disease (COPD). Src kinases contain an N-terminal SH4 domain with a myristoylation site, followed by SH3 and SH2 domains, a tyr kinase domain, and a regulatory C-terminal region containing a conserved tyr. They are activated by autophosphorylation at the tyr kinase domain, but are negatively regulated by phosphorylation at the C-terminal tyr by Csk (C-terminal Src Kinase). The Hck subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	265
270659	cd05074	PTKc_Tyro3	Catalytic domain of the Protein Tyrosine Kinase, Tyro3. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Tyro3 (or Sky) is predominantly expressed in the central nervous system and the brain, and functions as a neurotrophic factor. It is also expressed in osteoclasts and has a role in bone resorption. Tyro3 is a member of the TAM subfamily, composed of receptor PTKs (RTKs) containing an extracellular ligand-binding region with two immunoglobulin-like domains followed by two fibronectin type III repeats, a transmembrane segment, and an intracellular catalytic domain. Binding to their ligands, Gas6 and protein S, leads to receptor dimerization, autophosphorylation, activation, and intracellular signaling. The Tyro3 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	284
270660	cd05075	PTKc_Axl	Catalytic domain of the Protein Tyrosine Kinase, Axl. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Axl is widely expressed in a variety of organs and cells including epithelial, mesenchymal, hematopoietic, as well as non-transformed cells. It is important in many cellular functions such as survival, anti-apoptosis, proliferation, migration, and adhesion. Axl was originally isolated from patients with chronic myelogenous leukemia and a chronic myeloproliferative disorder. It is overexpressed in many human cancers including colon, squamous cell, thyroid, breast, and lung carcinomas. Axl is a member of the TAM subfamily, composed of receptor PTKs (RTKs) containing an extracellular ligand-binding region with two immunoglobulin-like domains followed by two fibronectin type III repeats, a transmembrane segment, and an intracellular catalytic domain. Binding to its ligands, Gas6 and protein S, leads to receptor dimerization, autophosphorylation, activation, and intracellular signaling. The Axl subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	277
270661	cd05076	PTK_Tyk2_rpt1	Pseudokinase (repeat 1) domain of the Protein Tyrosine Kinase, Tyrosine kinase 2. Tyk2 is widely expressed in many tissues. It is involved in signaling via the cytokine receptors IFN-alphabeta, IL-6, IL-10, IL-12, IL-13, and IL-23. It mediates cell surface urokinase receptor (uPAR) signaling and plays a role in modulating vascular smooth muscle cell (VSMC) functional behavior in response to injury. Tyk2 is also important in dendritic cell function and T helper (Th)1 cell differentiation. A homozygous mutation of Tyk2 was found in a patient with hyper-IgE syndrome (HIES), a primary immunodeficiency characterized by recurrent skin abscesses, pneumonia, and elevated serum IgE. This suggests that Tyk2 may play important roles in multiple cytokine signaling involved in innate and adaptive immunity. Tyk2 is a member of the Janus kinase (Jak) subfamily of proteins, which are cytoplasmic (or nonreceptor) PTKs containing an N-terminal FERM domain, followed by a Src homology 2 (SH2) domain, a pseudokinase domain, and a C-terminal tyr kinase domain. The pseudokinase domain shows similarity to tyr kinases but lacks crucial residues for catalytic activity and ATP binding. It modulates the kinase activity of the C-terminal catalytic domain. The Tyk2 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	273
270662	cd05077	PTK_Jak1_rpt1	Pseudokinase (repeat 1) domain of the Protein Tyrosine Kinase, Janus kinase 1. Jak1 is widely expressed in many tissues. Many cytokines are dependent on Jak1 for signaling, including those that use the shared receptor subunits, common gamma chain (IL-2, IL-4, IL-7, IL-9, IL-15, IL-21) and gp130 (IL-6, IL-11, oncostatin M, G-CSF, and IFNs, among others). The many varied interactions of Jak1 and its ubiquitous expression suggest many biological roles. Jak1 is important in neurological development, as well as in lymphoid development and function. It also plays a role in the pathophysiology of cardiac hypertrophy and heart failure. A mutation in the ATP-binding site of Jak1 was identified in a human uterine leiomyosarcoma cell line, resulting in defective cytokine induction and antigen presentation, thus allowing the tumor to evade the immune system. Jak1 is a cytoplasmic (or nonreceptor) PTK containing an N-terminal FERM domain, followed by a Src homology 2 (SH2) domain, a pseudokinase domain, and a C-terminal tyr kinase domain. The pseudokinase domain shows similarity to tyr kinases but lacks crucial residues for catalytic activity and ATP binding. It modulates the kinase activity of the C-terminal catalytic domain. The Jak1 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	266
270663	cd05078	PTK_Jak2_rpt1	Pseudokinase (repeat 1) domain of the Protein Tyrosine Kinase, Janus kinase 2. Jak2 is widely expressed in many tissues. It is essential for the signaling of hormone-like cytokines such as growth hormone, erythropoietin, thrombopoietin, and prolactin, as well as some IFNs and cytokines that signal through the IL-3 and gp130 receptors. Disruption of Jak2 in mice results in an embryonic lethal phenotype with multiple defects including erythropoietic and cardiac abnormalities. It is the only Jak gene that results in a lethal phenotype when disrupted in mice. A mutation in the pseudokinase domain of Jak2, V617F, is present in many myeloproliferative diseases, including almost all patients with polycythemia vera, and 50% of patients with essential thrombocytosis and myelofibrosis. Jak2 is a cytoplasmic (or nonreceptor) PTK containing an N-terminal FERM domain, followed by a Src homology 2 (SH2) domain, a pseudokinase domain, and a C-terminal tyr kinase domain. The pseudokinase domain shows similarity to tyr kinases but lacks crucial residues for catalytic activity and ATP binding. Despite this, the presumed pseudokinase (repeat 1) domain of Jak2 exhibits dual-specificity kinase activity, phosphorylating two negative regulatory sites in Jak2: Ser523 and Tyr570. Inactivation of the repeat 1 domain increased Jak2 basal activity, suggesting that it modulates the kinase activity of the C-terminal catalytic (repeat 2) domain. The Jak2 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	262
173644	cd05079	PTKc_Jak1_rpt2	Catalytic (repeat 2) domain of the Protein Tyrosine Kinase, Janus kinase 1. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Jak1 is widely expressed in many tissues. Many cytokines are dependent on Jak1 for signaling, including those that use the shared receptor subunits common gamma chain (IL-2, IL-4, IL-7, IL-9, IL-15, IL-21) and gp130 (IL-6, IL-11, oncostatin M, G-CSF, and IFNs, among others). The many varied interactions of Jak1 and its ubiquitous expression suggest many biological roles. Jak1 is important in neurological development, as well as in lymphoid development and function. It also plays a role in the pathophysiology of cardiac hypertrophy and heart failure. A mutation in the ATP-binding site of Jak1 was identified in a human uterine leiomyosarcoma cell line, resulting in defective cytokine induction and antigen presentation, thus allowing the tumor to evade the immune system. Jak1 is a member of the Janus kinase (Jak) subfamily of proteins, which are cytoplasmic (or nonreceptor) PTKs containing an N-terminal FERM domain, followed by a Src homology 2 (SH2) domain, a pseudokinase domain, and a C-terminal tyr kinase domain. Jaks are crucial for cytokine receptor signaling. They are activated by autophosphorylation upon cytokine-induced receptor aggregation, and subsequently trigger downstream signaling events such as the phosphorylation of signal transducers and activators of transcription (STATs). The Jak1 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	284
270664	cd05080	PTKc_Tyk2_rpt2	Catalytic (repeat 2) domain of the Protein Tyrosine Kinase, Tyrosine kinase 2. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Tyk2 is widely expressed in many tissues. It is involved in signaling via the cytokine receptors IFN-alphabeta, IL-6, IL-10, IL-12, IL-13, and IL-23. It mediates cell surface urokinase receptor (uPAR) signaling and plays a role in modulating vascular smooth muscle cell (VSMC) functional behavior in response to injury. Tyk2 is also important in dendritic cell function and T helper (Th)1 cell differentiation. A homozygous mutation of Tyk2 was found in a patient with hyper-IgE syndrome (HIES), a primary immunodeficiency characterized by recurrent skin abscesses, pneumonia, and elevated serum IgE. This suggests that Tyk2 may play important roles in multiple cytokine signaling involved in innate and adaptive immunity. Tyk2 is a member of the Janus kinase (Jak) subfamily of proteins, which are cytoplasmic (or nonreceptor) PTKs containing an N-terminal FERM domain, followed by a Src homology 2 (SH2) domain, a pseudokinase domain, and a C-terminal tyr kinase catalytic domain. Jaks are crucial for cytokine receptor signaling. They are activated by autophosphorylation upon cytokine-induced receptor aggregation, and subsequently trigger downstream signaling events such as the phosphorylation of signal transducers and activators of transcription (STATs). The Tyk2 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	283
270665	cd05081	PTKc_Jak3_rpt2	Catalytic (repeat 2) domain of the Protein Tyrosine Kinase, Janus kinase 3. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Jak3 is expressed only in hematopoietic cells. It binds the shared receptor subunit common gamma chain and thus, is essential in the signaling of cytokines that use it such as IL-2, IL-4, IL-7, IL-9, IL-15, and IL-21. Jak3 is important in lymphoid development and myeloid cell differentiation. Inactivating mutations in Jak3 have been reported in humans with severe combined immunodeficiency (SCID). Jak3 is a member of the Janus kinase (Jak) subfamily of proteins, which are cytoplasmic (or nonreceptor) PTKs containing an N-terminal FERM domain, followed by a Src homology 2 (SH2) domain, a pseudokinase domain, and a C-terminal catalytic tyr kinase domain. Jaks are crucial for cytokine receptor signaling. They are activated by autophosphorylation upon cytokine-induced receptor aggregation, and subsequently trigger downstream signaling events such as the phosphorylation of signal transducers and activators of transcription (STATs). The PTKc family is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	283
133213	cd05082	PTKc_Csk	Catalytic domain of the Protein Tyrosine Kinase, C-terminal Src kinase. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Csk catalyzes the tyr phosphorylation of the regulatory C-terminal tail of Src kinases, resulting in their inactivation. Csk is expressed in a wide variety of tissues. As a negative regulator of Src, Csk plays a role in cell proliferation, survival, and differentiation, and consequently, in cancer development and progression. Csk is a cytoplasmic (or nonreceptor) PTK containing the Src homology domains, SH3 and SH2, N-terminal to the catalytic tyr kinase domain. To inhibit Src kinases, Csk is translocated to the membrane via binding to specific transmembrane proteins, G-proteins, or adaptor proteins near the membrane. In addition, Csk also shows Src-independent functions. It is a critical component in G-protein signaling, and plays a role in cytoskeletal reorganization and cell migration. The Csk subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	256
270666	cd05083	PTKc_Chk	Catalytic domain of the Protein Tyrosine Kinase, Csk homologous kinase. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Chk is also referred to as megakaryocyte-associated tyrosine kinase (Matk). Chk inhibits Src kinases using a noncatalytic mechanism by simply binding to them. As a negative regulator of Src kinases, Chk may play important roles in cell proliferation, survival, and differentiation, and consequently, in cancer development and progression. Chk is expressed in brain and hematopoietic cells. Like Csk, it is a cytoplasmic (or nonreceptor) tyr kinase containing the Src homology domains, SH3 and SH2, N-terminal to the catalytic tyr kinase domain. To inhibit Src kinases that are anchored to the plasma membrane, Chk is translocated to the membrane via binding to specific transmembrane proteins, G-proteins, or adaptor proteins near the membrane. Studies in mice reveal that Chk is not functionally redundant with Csk and that it plays an important role as a regulator of immune responses. Chk also plays a role in neural differentiation in a manner independent of Src by enhancing Mapk activation via Ras-mediated signaling. The Chk subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	254
270667	cd05084	PTKc_Fes	Catalytic domain of the Protein Tyrosine Kinase, Fes. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Fes (or Fps) is a cytoplasmic (or nonreceptor) PTK containing an N-terminal region with FCH (Fes/Fer/CIP4 homology) and coiled-coil domains, followed by a SH2 domain, and a C-terminal catalytic domain. The genes for Fes (feline sarcoma) and Fps (Fujinami poultry sarcoma) were first isolated from tumor-causing retroviruses. The viral oncogenes encode chimeric Fes proteins consisting of Gag sequences at the N-termini, resulting in unregulated PTK activity. Fes kinase is expressed in myeloid, vascular endothelial, epithelial, and neuronal cells. It plays important roles in cell growth and differentiation, angiogenesis, inflammation and immunity, and cytoskeletal regulation. A recent study implicates Fes kinase as a tumor suppressor in colorectal cancer. The Fes subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	252
270668	cd05085	PTKc_Fer	Catalytic domain of the Protein Tyrosine Kinase, Fer. Protein Tyrosine Kinase (PTK) family; Fer kinase; catalytic (c) domain. The PTKc family is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, and phosphoinositide 3-kinase (PI3K). PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Fer kinase is a member of the Fes subfamily of proteins which are cytoplasmic (or nonreceptor) tyr kinases containing an N-terminal region with FCH (Fes/Fer/CIP4 homology) and coiled-coil domains, followed by a SH2 domain, and a C-terminal catalytic domain. Fer kinase is expressed in a wide variety of tissues, and is found to reside in both the cytoplasm and the nucleus. It plays important roles in neuronal polarization and neurite development, cytoskeletal reorganization, cell migration, growth factor signaling, and the regulation of cell-cell interactions mediated by adherens junctions and focal adhesions. Fer kinase also regulates cell cycle progression in malignant cells.	251
270669	cd05086	PTKc_Aatyk2	Catalytic domain of the Protein Tyrosine Kinase, Apoptosis-associated tyrosine kinase 2. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Aatyk2 is a member of the Aatyk subfamily of proteins, which are receptor kinases containing a transmembrane segment and a long C-terminal cytoplasmic tail with a catalytic domain. Aatyk2 is also called lemur tyrosine kinase 2 (Lmtk2) or brain-enriched kinase (Brek). It is expressed at high levels in early postnatal brain, and has been shown to play a role in nerve growth factor (NGF) signaling. Studies with knockout mice reveal that Aatyk2 is essential for late stage spermatogenesis. Although it is classified as a PTK based on sequence similarity and the phylogenetic tree, Aatyk2 has been functionally characterized as a serine/threonine kinase. The Aatyk2 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	271
270670	cd05087	PTKc_Aatyk1	Catalytic domain of the Protein Tyrosine Kinase, Apoptosis-associated tyrosine kinase 1. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Aatyk1 (or simply Aatyk) is also called lemur tyrosine kinase 1 (Lmtk1). It is a cytoplasmic (or nonreceptor) kinase containing a long C-terminal region. The expression of Aatyk1 is upregulated during growth arrest and apoptosis in myeloid cells. Aatyk1 has been implicated in neural differentiation, and is a regulator of the Na-K-2Cl cotransporter, a membrane protein involved in cell proliferation and survival, epithelial transport, and blood pressure control. The Aatyk1 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	271
133219	cd05088	PTKc_Tie2	Catalytic domain of the Protein Tyrosine Kinase, Tie2. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Tie2 is a receptor PTK (RTK) containing an extracellular region, a transmembrane segment, and an intracellular catalytic domain. The extracellular region contains an immunoglobulin (Ig)-like domain, three epidermal growth factor (EGF)-like domains, a second Ig-like domain, and three fibronectin type III repeats. Tie2 is expressed mainly in endothelial cells and hematopoietic stem cells. It is also found in a subset of tumor-associated monocytes and eosinophils. The angiopoietins (Ang-1 to Ang-4) serve as ligands for Tie2. The binding of Ang-1 to Tie2 leads to receptor autophosphorylation and activation, promoting cell migration and survival. In contrast, Ang-2 binding to Tie2 does not result in the same response, suggesting that Ang-2 may function as an antagonist. Tie2 signaling plays key regulatory roles in vascular integrity and quiescence, and in inflammation. The Tie2 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	303
270671	cd05089	PTKc_Tie1	Catalytic domain of the Protein Tyrosine Kinase, Tie1. Protein Tyrosine Kinase (PTK) family; Tie1; catalytic (c) domain. The PTKc family is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, and phosphoinositide 3-kinase (PI3K). PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Tie1 is a receptor tyr kinase (RTK) containing an extracellular region, a transmembrane segment, and an intracellular catalytic domain. The extracellular region contains an immunoglobulin (Ig)-like domain, three epidermal growth factor (EGF)-like domains, a second Ig-like domain, and three fibronectin type III repeats. Tie receptors are specifically expressed in endothelial cells and hematopoietic stem cells. No specific ligand has been identified for Tie1, although the angiopoietin, Ang-1, binds to Tie1 through integrins at high concentrations. In vivo studies of Tie1 show that it is critical in vascular development.	297
270672	cd05090	PTKc_Ror1	Catalytic domain of the Protein Tyrosine Kinase, Receptor tyrosine kinase-like Orphan Receptor 1. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Ror kinases are expressed in many tissues during development. Avian Ror1 was found to be involved in late limb development. Studies in mice reveal that Ror1 is important in the regulation of neurite growth in central neurons, as well as in respiratory development. Loss of Ror1 also enhances the heart and skeletal abnormalities found in Ror2-deficient mice. Ror proteins are orphan receptor PTKs (RTKs) containing an extracellular region with immunoglobulin-like, cysteine-rich, and kringle domains, a transmembrane segment, and an intracellular catalytic domain. Ror RTKs are unrelated to the nuclear receptor subfamily called retinoid-related orphan receptors (RORs). RTKs are usually activated through ligand binding, which causes dimerization and autophosphorylation of the intracellular tyr kinase catalytic domain. The Ror1 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	283
270673	cd05091	PTKc_Ror2	Catalytic domain of the Protein Tyrosine Kinase, Receptor tyrosine kinase-like Orphan Receptor 2. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Ror2 plays important roles in skeletal and heart formation. Ror2-deficient mice show widespread bone abnormalities, ventricular defects in the heart, and respiratory dysfunction. Mutations in human Ror2 result in two different bone development genetic disorders, recessive Robinow syndrome and brachydactyly type B. Ror2 is also implicated in neural development. Ror proteins are orphan receptor PTKs (RTKs) containing an extracellular region with immunoglobulin-like, cysteine-rich, and kringle domains, a transmembrane segment, and an intracellular catalytic domain. Ror RTKs are unrelated to the nuclear receptor subfamily called retinoid-related orphan receptors (RORs). RTKs are usually activated through ligand binding, which causes dimerization and autophosphorylation of the intracellular tyr kinase catalytic domain. The Ror2 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	284
270674	cd05092	PTKc_TrkA	Catalytic domain of the Protein Tyrosine Kinase, Tropomyosin Related Kinase A. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. TrkA is a receptor PTK (RTK) containing an extracellular region with arrays of leucine-rich motifs flanked by two cysteine-rich clusters followed by two immunoglobulin-like domains, a transmembrane segment, and an intracellular catalytic domain. Binding of TrkA to its ligand, nerve growth factor (NGF), results in receptor oligomerization and activation of the catalytic domain. TrkA is expressed mainly in neural-crest-derived sensory and sympathetic neurons of the peripheral nervous system, and in basal forebrain cholinergic neurons of the central nervous system. It is critical for neuronal growth, differentiation and survival. Alternative TrkA splicing has been implicated as a pivotal regulator of neuroblastoma (NB) behavior. Normal TrkA expression is associated with better NB prognosis, while the hypoxia-regulated TrkAIII splice variant promotes NB pathogenesis and progression. Aberrant TrkA expression has also been demonstrated in non-neural tumors including prostate, breast, lung, and pancreatic cancers. The TrkA subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	280
270675	cd05093	PTKc_TrkB	Catalytic domain of the Protein Tyrosine Kinase, Tropomyosin Related Kinase B. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. TrkB is a receptor PTK (RTK) containing an extracellular region with arrays of leucine-rich motifs flanked by two cysteine-rich clusters followed by two immunoglobulin-like domains, a transmembrane segment, and an intracellular catalytic domain. Binding of TrkB to its ligands, brain-derived neurotrophic factor (BDNF) or neurotrophin 4 (NT4), results in receptor oligomerization and activation of the catalytic domain. TrkB is broadly expressed in the nervous system and in some non-neural tissues. It plays important roles in cell proliferation, differentiation, and survival. BDNF/Trk signaling plays a key role in regulating activity-dependent synaptic plasticity. TrkB also contributes to protection against gp120-induced neuronal cell death. TrkB overexpression is associated with poor prognosis in neuroblastoma (NB) and other human cancers. It acts as a suppressor of anoikis (detachment-induced apoptosis) and contributes to tumor metastasis. The TrkB subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	288
270676	cd05094	PTKc_TrkC	Catalytic domain of the Protein Tyrosine Kinase, Tropomyosin Related Kinase C. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. TrkC is a receptor PTK (RTK) containing an extracellular region with arrays of leucine-rich motifs flanked by two cysteine-rich clusters followed by two immunoglobulin-like domains, a transmembrane segment, and an intracellular catalytic domain. Binding of TrkC to its ligand, neurotrophin 3 (NT3), results in receptor oligomerization and activation of the catalytic domain. TrkC is broadly expressed in the nervous system and in some non-neural tissues including the developing heart. NT3/TrkC signaling plays an important role in the innervation of the cardiac conducting system and the development of smooth muscle cells. Mice deficient in NT3 and TrkC have multiple heart defects. NT3/TrkC signaling is also critical for the development and maintenance of enteric neurons that are important for the control of gut peristalsis. The TrkC subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	287
270677	cd05095	PTKc_DDR2	Catalytic domain of the Protein Tyrosine Kinase, Discoidin Domain Receptor 2. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. DDR2 is a receptor PTK (RTK) containing an extracellular discoidin homology domain, a transmembrane segment, an extended juxtamembrane region, and an intracellular catalytic domain. The binding of the ligand, collagen, to DDR2 results in a slow but sustained receptor activation. DDR2 binds mostly to fibrillar collagens as well as collagen X. DDR2 is widely expressed in many tissues with the highest levels found in skeletal muscle, skin, kidney and lung. It is important in cell proliferation and development. Mice with a deletion of DDR2 suffer from dwarfism and delayed healing of epidermal wounds. DDR2 also contributes to collagen (type I) regulation by inhibiting fibrillogenesis and altering the morphology of collagen fibers. It is also expressed in immature dendritic cells (DCs), where it plays a role in DC activation and function. The DDR2 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase (PI3K).	297
133227	cd05096	PTKc_DDR1	Catalytic domain of the Protein Tyrosine Kinase, Discoidin Domain Receptor 1. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. DDR1 is a receptor PTK (RTK) containing an extracellular discoidin homology domain, a transmembrane segment, an extended juxtamembrane region, and an intracellular catalytic domain. The binding of the ligand, collagen, to DDR1 results in a slow but sustained receptor activation. DDR1 binds to all collagens tested to date (types I-IV). It is widely expressed in many tissues. It is abundant in the brain and is also found in keratinocytes, colonic mucosa epithelium, lung epithelium, thyroid follicles, and the islets of Langerhans. During embryonic development, it is found in the developing neuroectoderm. DDR1 is a key regulator of cell morphogenesis, differentiation and proliferation. It is important in the development of the mammary gland, the vasculature and the kidney. DDR1 is also found in human leukocytes, where it facilitates cell adhesion, migration, maturation, and cytokine production. The DDR1 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	304
133228	cd05097	PTKc_DDR_like	Catalytic domain of Discoidin Domain Receptor-like Protein Tyrosine Kinases. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. DDR-like proteins are members of the DDR subfamily, which are receptor PTKs (RTKs) containing an extracellular discoidin homology domain, a transmembrane segment, an extended juxtamembrane region, and an intracellular catalytic domain. The binding of the ligand, collagen, to DDRs results in a slow but sustained receptor activation. DDRs regulate cell adhesion, proliferation, and extracellular matrix remodeling. They have been linked to a variety of human cancers including breast, colon, ovarian, brain, and lung. There is no evidence showing that DDRs act as transforming oncogenes. They are more likely to play a role in the regulation of tumor growth and metastasis. The DDR-like subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	295
270678	cd05098	PTKc_FGFR1	Catalytic domain of the Protein Tyrosine Kinase, Fibroblast Growth Factor Receptor 1. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Alternative splicing of FGFR1 transcripts produces a variety of isoforms, which are differentially expressed in cells. FGFR1 binds the ligands, FGF1 and FGF2, with high affinity and has also been reported to bind FGF4, FGF6, and FGF9. FGFR1 signaling is critical in the control of cell migration during embryo development. It promotes cell proliferation in fibroblasts. Nuclear FGFR1 plays a role in the regulation of transcription. Mutations, insertions or deletions of FGFR1 have been identified in patients with Kallmann syndrome (KS), an inherited disorder characterized by hypogonadotropic hypogonadism and loss of olfaction. Aberrant FGFR1 expression has been found in some human cancers including 8p11 myeloproliferative syndrome (EMS), breast cancer, and pancreatic adenocarcinoma. FGFR1 is part of the FGFR subfamily, which are receptor PTKs (RTKs) containing an extracellular ligand-binding region with three immunoglobulin-like domains, a transmembrane segment, and an intracellular catalytic domain. The binding of FGFRs to their ligands, the FGFs, results in receptor dimerization and activation, and intracellular signaling. The binding of FGFs to FGFRs is promiscuous, in that a receptor may be activated by several ligands and a ligand may bind to more than one type of receptor. The FGFR1 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	302
133230	cd05099	PTKc_FGFR4	Catalytic domain of the Protein Tyrosine Kinase, Fibroblast Growth Factor Receptor 4. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Unlike other FGFRs, there is only one splice form of FGFR4. It binds FGF1, FGF2, FGF6, FGF19, and FGF23. FGF19 is a selective ligand for FGFR4. Although disruption of FGFR4 in mice causes no obvious phenotype, in vivo inhibition of FGFR4 in cultured skeletal muscle cells resulted in an arrest of muscle progenitor differentiation. FGF6 and FGFR4 are uniquely expressed in myofibers and satellite cells. FGF6/FGFR4 signaling appears to play a key role in the regulation of muscle regeneration. A polymorphism in FGFR4 is found in head and neck squamous cell carcinoma. FGFR4 is part of the FGFR subfamily, which are receptor PTKs (RTKs) containing an extracellular ligand-binding region with three immunoglobulin-like domains, a transmembrane segment, and an intracellular catalytic domain. The binding of FGFRs to their ligands, the FGFs, results in receptor dimerization and activation, and intracellular signaling. The binding of FGFs to FGFRs is promiscuous, in that a receptor may be activated by several ligands and a ligand may bind to more than one type of receptor. The FGFR4 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	314
173652	cd05100	PTKc_FGFR3	Catalytic domain of the Protein Tyrosine Kinase, Fibroblast Growth Factor Receptor 3. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Many FGFR3 splice variants have been reported with the IIIb and IIIc isoforms being the predominant forms. FGFR3 IIIc is the isoform expressed in chondrocytes, the cells affected in dwarfism, while IIIb is expressed in epithelial cells. FGFR3 ligands include FGF1, FGF2, FGF4, FGF8, FGF9, and FGF23. It is a negative regulator of long bone growth. In the cochlear duct and in the lens, FGFR3 is involved in differentiation while it appears to have a role in cell proliferation in epithelial cells. Germline mutations in FGFR3 are associated with skeletal disorders including several forms of dwarfism. Some missense mutations are associated with multiple myeloma and carcinomas of the bladder and cervix. Overexpression of FGFR3 is found in thyroid carcinoma. FGFR3 is part of the FGFR subfamily, which are receptor PTKs (RTKs) containing an extracellular ligand-binding region with three immunoglobulin-like domains, a transmembrane segment, and an intracellular catalytic domain. The binding of FGFRs to their ligands, the FGFs, results in receptor dimerization and activation, and intracellular signaling. The binding of FGFs to FGFRs is promiscuous, in that a receptor may be activated by several ligands and a ligand may bind to more than one type of receptor. The FGFR3 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	334
270679	cd05101	PTKc_FGFR2	Catalytic domain of the Protein Tyrosine Kinase, Fibroblast Growth Factor Receptor 2. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. There are many splice variants of FGFR2 which show differential expression and binding to FGF ligands. Disruption of either FGFR2 or FGFR2b is lethal in mice, due to defects in the placenta or severe impairment of tissue development including lung, limb, and thyroid, respectively. Disruption of FGFR2c in mice results in defective bone and skull development. Genetic alterations of FGFR2 are associated with many human skeletal disorders including Apert syndrome, Crouzon syndrome, Jackson-Weiss syndrome, and Pfeiffer syndrome. FGFR2 is part of the FGFR subfamily, which are receptor PTKs (RTKs) containing an extracellular ligand-binding region with three immunoglobulin-like domains, a transmembrane segment, and an intracellular catalytic domain. The binding of FGFRs to their ligands, the FGFs, results in receptor dimerization and activation, and intracellular signaling. The binding of FGFs to FGFRs is promiscuous, in that a receptor may be activated by several ligands and a ligand may bind to more than one type of receptor. The FGFR2 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	313
270680	cd05102	PTKc_VEGFR3	Catalytic domain of the Protein Tyrosine Kinase, Vascular Endothelial Growth Factor Receptor 3. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. VEGFR3 (or Flt4) preferentially binds the ligands VEGFC and VEGFD. VEGFR3 is essential for lymphatic endothelial cell (EC) development and function. It has been shown to regulate adaptive immunity during corneal transplantation. VEGFR3 is upregulated on blood vascular ECs in pathological conditions such as vascular tumors and the periphery of solid tumors. It plays a role in cancer progression and lymph node metastasis. Missense mutations in the VEGFR3 gene are associated with primary human lymphedema. VEGFR3 is a member of the VEGFR subfamily of proteins, which are receptor PTKs (RTKs) containing an extracellular ligand-binding region with seven immunoglobulin (Ig)-like domains, a transmembrane segment, and an intracellular catalytic domain. In VEGFR3, the fifth Ig-like domain is replaced by a disulfide bridge. The binding of VEGFRs to their ligands, the VEGFs, leads to receptor dimerization, activation, and intracellular signaling. The VEGFR3 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	336
270681	cd05103	PTKc_VEGFR2	Catalytic domain of the Protein Tyrosine Kinase, Vascular Endothelial Growth Factor Receptor 2. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. VEGFR2 (or Flk1) binds the ligands VEGFA, VEGFC, VEGFD and VEGFE. VEGFR2 signaling is implicated in all aspects of normal and pathological vascular endothelial cell biology. It induces a variety of cellular effects including migration, survival, and proliferation. It is critical in regulating embryonic vascular development and angiogenesis. VEGFR2 is the major signal transducer in pathological angiogenesis including cancer and diabetic retinopathy, and is a target for inhibition in cancer therapy. The carboxyl terminus of VEGFR2 plays an important role in its autophosphorylation and activation. VEGFR2 is a member of the VEGFR subfamily of proteins, which are receptor PTKs (RTKs) containing an extracellular ligand-binding region with seven immunoglobulin (Ig)-like domains, a transmembrane segment, and an intracellular catalytic domain. The binding of VEGFRs to their ligands, the VEGFs, leads to receptor dimerization, activation, and intracellular signaling. The VEGFR2 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	343
270682	cd05104	PTKc_Kit	Catalytic domain of the Protein Tyrosine Kinase, Kit. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Kit is important in the development of melanocytes, germ cells, mast cells, hematopoietic stem cells, the interstitial cells of Cajal, and the pacemaker cells of the GI tract. Kit signaling is involved in major cellular functions including cell survival, proliferation, differentiation, adhesion, and chemotaxis. Mutations in Kit, which result in constitutive ligand-independent activation, are found in human cancers such as gastrointestinal stromal tumor (GIST) and testicular germ cell tumor (TGCT). The aberrant expression of Kit and/or SCF is associated with other tumor types such as systemic mastocytosis and cancers of the breast, neurons, lung, prostate, colon, and rectum. Although the structure of the human Kit catalytic domain is known, it is excluded from this specific alignment model because it contains a deletion in its sequence. Kit is a member of the Platelet Derived Growth Factor Receptor (PDGFR) subfamily of proteins, which are receptor PTKs (RTKs) containing an extracellular ligand-binding region with five immunoglobulin-like domains, a transmembrane segment, and an intracellular catalytic domain. The binding of Kit to its ligand, the stem-cell factor (SCF), leads to receptor dimerization, trans phosphorylation and activation, and intracellular signaling. The Kit subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	375
173653	cd05105	PTKc_PDGFR_alpha	Catalytic domain of the Protein Tyrosine Kinase, Platelet Derived Growth Factor Receptor alpha. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. PDGFR alpha is a receptor PTK (RTK) containing an extracellular ligand-binding region with five immunoglobulin-like domains, a transmembrane segment, and an intracellular catalytic domain. The binding to its ligands, the PDGFs, leads to receptor dimerization, trans phosphorylation and activation, and intracellular signaling. PDGFR alpha forms homodimers or heterodimers with PDGFR beta, depending on the nature of the PDGF ligand. PDGF-AA, PDGF-AB, and PDGF-CC induce PDGFR alpha homodimerization. PDGFR signaling plays many roles in normal embryonic development and adult physiology. PDGFR alpha signaling is important in the formation of lung alveoli, intestinal villi, mesenchymal dermis, and hair follicles, as well as in the development of oligodendrocytes, retinal astrocytes, neural crest cells, and testicular cells. Aberrant PDGFR alpha expression is associated with some human cancers. Mutations in PDGFR alpha have been found within a subset of gastrointestinal stromal tumors (GISTs). An active fusion protein FIP1L1-PDGFR alpha, derived from interstitial deletion, is associated with idiopathic hypereosinophilic syndrome and chronic eosinophilic leukemia. The PDGFR alpha subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	400
133237	cd05106	PTKc_CSF-1R	Catalytic domain of the Protein Tyrosine Kinase, Colony-Stimulating Factor-1 Receptor. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. CSF-1R, also called c-Fms, is a member of the Platelet Derived Growth Factor Receptor (PDGFR) subfamily of proteins, which are receptor PTKs (RTKs) containing an extracellular ligand-binding region with five immunoglobulin-like domains, a transmembrane segment, and an intracellular catalytic domain. The binding of CSF-1R to its ligand, CSF-1, leads to receptor dimerization, trans phosphorylation and activation, and intracellular signaling. CSF-1R signaling is critical in the regulation of macrophages and osteoclasts. It leads to increases in gene transcription and protein translation, and induces cytoskeletal remodeling. CSF-1R signaling leads to a variety of cellular responses including survival, proliferation, and differentiation of target cells. It plays an important role in innate immunity, tissue development and function, and the pathogenesis of some diseases including atherosclerosis and cancer. CSF-1R signaling is also implicated in mammary gland development during pregnancy and lactation. Aberrant CSF-1/CSF-1R expression correlates with tumor cell invasiveness, poor clinical prognosis, and bone metastasis in breast cancer. Although the structure of the human CSF-1R catalytic domain is known, it is excluded from this specific alignment model because it contains a deletion in its sequence. The CSF-1R subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	374
133238	cd05107	PTKc_PDGFR_beta	Catalytic domain of the Protein Tyrosine Kinase, Platelet Derived Growth Factor Receptor beta. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. PDGFR beta is a receptor PTK (RTK) containing an extracellular ligand-binding region with five immunoglobulin-like domains, a transmembrane segment, and an intracellular catalytic domain. The binding to its ligands, the PDGFs, leads to receptor dimerization, trans phosphorylation and activation, and intracellular signaling. PDGFR beta forms homodimers or heterodimers with PDGFR alpha, depending on the nature of the PDGF ligand. PDGF-BB and PDGF-DD induce PDGFR beta homodimerization. PDGFR signaling plays many roles in normal embryonic development and adult physiology. PDGFR beta signaling leads to a variety of cellular effects including the stimulation of cell growth and chemotaxis, as well as the inhibition of apoptosis and gap junctional communication. It is critical in normal angiogenesis as it is involved in the recruitment of pericytes and smooth muscle cells essential for vessel stability. Aberrant PDGFR beta expression is associated with some human cancers. The continuously-active fusion proteins of PDGFR beta with COL1A1 and TEL are associated with dermatofibrosarcoma protuberans (DFSP) and a subset of chronic myelomonocytic leukemia (CMML), respectively. The PDGFR beta subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	401
270683	cd05108	PTKc_EGFR	Catalytic domain of the Protein Tyrosine Kinase, Epidermal Growth Factor Receptor. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. EGFR (HER1, ErbB1) is a receptor PTK (RTK) containing an extracellular EGF-related ligand-binding region, a transmembrane helix, and a cytoplasmic region with a tyr kinase domain and a regulatory C-terminal tail. Unlike other PTKs, phosphorylation of the activation loop of EGFR proteins is not critical to their activation. Instead, they are activated by ligand-induced dimerization, leading to the phosphorylation of tyr residues in the C-terminal tail, which serve as binding sites for downstream signaling molecules. Ligands for EGFR include EGF, heparin binding EGF-like growth factor (HBEGF), epiregulin, amphiregulin, TGFalpha, and betacellulin. Upon ligand binding, EGFR can form homo- or heterodimers with other EGFR subfamily members. The EGFR signaling pathway is one of the most important pathways regulating cell proliferation, differentiation, survival, and growth. Overexpression and mutation in the kinase domain of EGFR have been implicated in the development and progression of a variety of cancers. A number of monoclonal antibodies and small molecule inhibitors have been developed that target EGFR, including the antibodies Cetuximab and Panitumumab, which are used in combination with other therapies for the treatment of colorectal cancer and non-small cell lung carcinoma (NSCLC). The small molecule inhibitors Gefitinib (Iressa) and Erlotinib (Tarceva), already used for NSCLC, are undergoing clinical trials for other types of cancer including gastrointestinal, breast, head and neck, and bladder. The EGFR subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	313
270684	cd05109	PTKc_HER2	Catalytic domain of the Protein Tyrosine Kinase, HER2. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. HER2 (ErbB2, HER2/neu) is a member of the EGFR (HER, ErbB) subfamily of proteins, which are receptor PTKs (RTKs) containing an extracellular EGF-related ligand-binding region, a transmembrane helix, and a cytoplasmic region with a tyr kinase domain and a regulatory C-terminal tail. Unlike other PTKs, phosphorylation of the activation loop of EGFR proteins is not critical to their activation. Instead, they are activated by ligand-induced dimerization, leading to the phosphorylation of tyr residues in the C-terminal tail, which serve as binding sites for downstream signaling molecules. HER2 does not bind to any known EGFR subfamily ligands, but contributes to the kinase activity of all possible heterodimers. It acts as the preferred partner of other ligand-bound EGFR proteins and functions as a signal amplifier, with the HER2-HER3 heterodimer being the most potent pair in mitogenic signaling. HER2 plays an important role in cell development, proliferation, survival and motility. Overexpression of HER2 results in its activation and downstream signaling, even in the absence of ligand. HER2 overexpression, mainly due to gene amplification, has been shown in a variety of human cancers. Its role in breast cancer is especially well-documented. HER2 is up-regulated in about 25% of breast tumors and is associated with increases in tumor aggressiveness, recurrence and mortality. HER2 is a target for monoclonal antibodies and small molecule inhibitors, which are being developed as treatments for cancer. The first humanized antibody approved for clinical use is Trastuzumab (Herceptin), which is being used in combination with other therapies to improve the survival rates of patients with HER2-overexpressing breast cancer. The HER2 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	279
173655	cd05110	PTKc_HER4	Catalytic domain of the Protein Tyrosine Kinase, HER4. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. HER4 (ErbB4) is a member of the EGFR (HER, ErbB) subfamily of proteins, which are receptor PTKs (RTKs) containing an extracellular EGF-related ligand-binding region, a transmembrane helix, and a cytoplasmic region with a tyr kinase domain and a regulatory C-terminal tail. Unlike other PTKs, phosphorylation of the activation loop of EGFR proteins is not critical to their activation. Instead, they are activated by ligand-induced dimerization, leading to the phosphorylation of tyr residues in the C-terminal tail, which serve as binding sites for downstream signaling molecules. Ligands that bind HER4 fall into two groups, the neuregulins (or heregulins) and some EGFR (HER1) ligands including betacellulin, HBEGF, and epiregulin. All four neuregulins (NRG1-4) interact with HER4. Upon ligand binding, HER4 forms homo- or heterodimers with other HER proteins. HER4 is essential in embryonic development. It is implicated in mammary gland, cardiac, and neural development. As a postsynaptic receptor of NRG1, HER4 plays an important role in synaptic plasticity and maturation. The impairment of NRG1/HER4 signaling may contribute to schizophrenia. The HER4 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	303
173656	cd05111	PTK_HER3	Pseudokinase domain of the Protein Tyrosine Kinase, HER3. HER3 (ErbB3) is a member of the EGFR (HER, ErbB) subfamily of proteins, which are receptor PTKs (RTKs) containing an extracellular EGF-related ligand-binding region, a transmembrane helix, and a cytoplasmic region with a tyr kinase domain and a regulatory C-terminal tail. Unlike other PTKs, phosphorylation of the activation loop of EGFR proteins is not critical to their activation. Instead, they are activated by ligand-induced dimerization, leading to the phosphorylation of tyr residues in the C-terminal tail, which serve as binding sites for downstream signaling molecules. HER3 contains an impaired tyr kinase domain, which lacks crucial residues for catalytic activity against exogenous substrates but is still able to bind ATP and autophosphorylate. HER3 binds the neuregulin ligands, NRG1 and NRG2, and it relies on its heterodimerization partners for activity following ligand binding. The HER2-HER3 heterodimer constitutes a high affinity co-receptor capable of potent mitogenic signaling. HER3 participates in a signaling pathway involved in the proliferation, survival, adhesion, and motility of tumor cells. The HER3 subfamily is part of a larger superfamily that includes other pseudokinases and the catalytic domains of active kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	279
133243	cd05112	PTKc_Itk	Catalytic domain of the Protein Tyrosine Kinase, Interleukin-2-inducible T-cell Kinase. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Itk, also known as Tsk or Emt, is a member of the Tec-like subfamily of proteins, which are cytoplasmic (or nonreceptor) PTKs with similarity to Src kinases in that they contain Src homology protein interaction domains (SH3, SH2) N-terminal to the catalytic tyr kinase domain. Unlike Src kinases, most Tec subfamily members except Rlk also contain an N-terminal pleckstrin homology (PH) domain, which binds the products of PI3K and allows membrane recruitment and activation. In addition, Itk contains the Tec homology (TH) domain containing one proline-rich region and a zinc-binding region. Itk is expressed in T-cells and mast cells, and is important in their development and differentiation. Of the three Tec kinases expressed in T-cells, Itk plays the predominant role in T-cell receptor (TCR) signaling. It is activated by phosphorylation upon TCR crosslinking and is involved in the pathway resulting in phospholipase C-gamma1 activation and actin polymerization. It also plays a role in the downstream signaling of the T-cell costimulatory receptor CD28, the T-cell surface receptor CD2, and the chemokine receptor CXCR4. In addition, Itk is crucial for the development of T-helper(Th)2 effector responses. The Itk subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	256
173657	cd05113	PTKc_Btk_Bmx	Catalytic domain of the Protein Tyrosine Kinases, Bruton's tyrosine kinase and Bone marrow kinase on the X chromosome. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Btk and Bmx (also named Etk) are members of the Tec-like subfamily of proteins, which are cytoplasmic (or nonreceptor) PTKs with similarity to Src kinases in that they contain Src homology protein interaction domains (SH3, SH2) N-terminal to the catalytic tyr kinase domain. Unlike Src kinases, most Tec subfamily members except Rlk also contain an N-terminal pleckstrin homology (PH) domain, which binds the products of PI3K and allows membrane recruitment and activation. In addition, Btk contains the Tec homology (TH) domain with proline-rich and zinc-binding regions. Btk is expressed in B-cells, and a variety of myeloid cells including mast cells, platelets, neutrophils, and dendritic cells. It interacts with a variety of partners, from cytosolic proteins to nuclear transcription factors, suggesting a diversity of functions. Stimulation of a diverse array of cell surface receptors, including antigen engagement of the B-cell receptor, leads to PH-mediated membrane translocation of Btk and subsequent phosphorylation by Src kinase and activation. Btk plays an important role in the life cycle of B-cells including their development, differentiation, proliferation, survival, and apoptosis. Mutations in Btk cause the primary immunodeficiency disease, X-linked agammaglobulinaemia (XLA) in humans. Bmx is primarily expressed in bone marrow and the arterial endothelium, and plays an important role in ischemia-induced angiogenesis. It facilitates arterial growth, capillary formation, vessel maturation, and bone marrow-derived endothelial progenitor cell mobilization. The Btk/Bmx subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	256
270685	cd05114	PTKc_Tec_Rlk	Catalytic domain of the Protein Tyrosine Kinases, Tyrosine kinase expressed in hepatocellular carcinoma and Resting lymphocyte kinase. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Tec and Rlk (also named Txk) are members of the Tec-like subfamily of proteins, which are cytoplasmic (or nonreceptor) PTKs with similarity to Src kinases in that they contain Src homology protein interaction domains (SH3, SH2) N-terminal to the catalytic tyr kinase domain. Unlike Src kinases, most Tec subfamily members except Rlk also contain an N-terminal pleckstrin homology (PH) domain, which binds the products of PI3K and allows membrane recruitment and activation. Instead of PH, Rlk contains an N-terminal cysteine-rich region. In addition to PH, Tec also contains the Tec homology (TH) domain with proline-rich and zinc-binding regions. Tec kinases are expressed mainly by haematopoietic cells. Tec is more widely-expressed than other Tec-like subfamily kinases. It is found in endothelial cells, both B- and T-cells, and a variety of myeloid cells including mast cells, erythroid cells, platelets, macrophages and neutrophils. Rlk is expressed in T-cells and mast cell lines. Tec and Rlk are both key components of T-cell receptor (TCR) signaling. They are important in TCR-stimulated proliferation, IL-2 production and phospholipase C-gamma1 activation. The Tec/Rlk subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	260
270686	cd05115	PTKc_Zap-70	Catalytic domain of the Protein Tyrosine Kinase, Zeta-chain-associated protein of 70kDa. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Zap-70 is a cytoplasmic (or nonreceptor) PTK containing two Src homology 2 (SH2) domains N-terminal to the catalytic tyr kinase domain. Zap-70 is primarily expressed in T-cells and NK cells, and is a crucial component in T-cell receptor (TCR) signaling. Zap-70 binds the phosphorylated ITAM (immunoreceptor tyr activation motif) sequences of the activated TCR zeta-chain through its SH2 domains, leading to its phosphorylation and activation. It then phosphorylates target proteins, which propagate the signals to downstream pathways. Zap-70 is hardly detected in normal peripheral B-cells, but is present in some B-cell malignancies. It is used as a diagnostic marker for chronic lymphocytic leukemia (CLL) as it is associated with the more aggressive subtype of the disease. The Zap-70 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	269
133247	cd05116	PTKc_Syk	Catalytic domain of the Protein Tyrosine Kinase, Spleen tyrosine kinase. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Syk is a cytoplasmic (or nonreceptor) PTK containing two Src homology 2 (SH2) domains N-terminal to the catalytic tyr kinase domain. Syk was first cloned from the spleen, and its function in hematopoietic cells is well-established. It is involved in the signaling downstream of activated receptors (including B-cell and Fc receptors) that contain ITAMs (immunoreceptor tyr activation motifs), leading to processes such as cell proliferation, differentiation, survival, adhesion, migration, and phagocytosis. More recently, Syk expression has been detected in other cell types (including epithelial cells, vascular endothelial cells, neurons, hepatocytes, and melanocytes), suggesting a variety of biological functions in non-immune cells. Syk plays a critical role in maintaining vascular integrity and in wound healing during embryogenesis. It also regulates Vav3, which is important in osteoclast function including bone development. In breast epithelial cells, where Syk acts as a negative regulator for EGFR signaling, loss of Syk expression is associated with abnormal proliferation during cancer development suggesting a potential role as a tumor suppressor. In mice, Syk has been shown to inhibit malignant transformation of mammary epithelial cells induced with murine mammary tumor virus (MMTV). The Syk subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	257
270687	cd05117	STKc_CAMK	The catalytic domain of CAMK family Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CaMKs are multifunctional calcium and calmodulin (CaM) stimulated STKs involved in cell cycle regulation. There are several types of CaMKs including CaMKI, CaMKII, and CaMKIV. CaMKI proteins are monomeric and they play pivotal roles in the nervous system, including long-term potentiation, dendritic arborization, neurite outgrowth, and the formation of spines, synapses, and axons. CaMKII is a signaling molecule that translates upstream calcium and reactive oxygen species (ROS) signals into downstream responses that play important roles in synaptic function and cardiovascular physiology. CAMKIV is implicated in regulating several transcription factors like CREB, MEF2, and retinoid orphan receptors, as well as in T-cell development and signaling. The CAMK family also consists of other related kinases including the Phosphorylase kinase Gamma subunit (PhKG), the C-terminal kinase domains of Ribosomal S6 kinase (RSK) and Mitogen and stress-activated kinase (MSK), Doublecortin-like kinase (DCKL), and the MAPK-activated protein kinases MK2, MK3, and MK5, among others. The CAMK family is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	258
270688	cd05118	STKc_CMGC	Catalytic domain of CMGC family Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The CMGC family consists of Cyclin-Dependent protein Kinases (CDKs), Mitogen-activated protein kinases (MAPKs) such as Extracellular signal-regulated kinase (ERKs), c-Jun N-terminal kinases (JNKs), and p38, and other kinases. CDKs belong to a large subfamily of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. MAPKs serve as important mediators of cellular responses to extracellular signals. They control critical cellular functions including differentiation, proliferation, migration, and apoptosis. They are also implicated in the pathogenesis of many diseases including multiple types of cancer, stroke, diabetes, and chronic inflammation. Other members of the CMGC family include casein kinase 2 (CK2), Dual-specificity tYrosine-phosphorylated and -Regulated Kinase (DYRK), Glycogen Synthase Kinase 3 (GSK3), among many others. The CMGC family is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	249
270689	cd05119	RIO	Catalytic domain of the atypical protein serine kinases, RIO kinases. RIO kinases are atypical protein serine kinases present in archaea, bacteria and eukaryotes. Serine kinases catalyze the transfer of the gamma-phosphoryl group from ATP to serine residues in protein substrates. RIO kinases contain a kinase catalytic signature, but otherwise show very little sequence similarity to typical PKs. The RIO catalytic domain is truncated compared to the catalytic domains of typical PKs, with deletions of the loops responsible for substrate binding. Most organisms contain at least two RIO kinases, RIO1 and RIO2. A third protein, RIO3, is present in multicellular eukaryotes. In yeast, RIO1 and RIO2 are essential for survival. They function as non-ribosomal factors necessary for late 18S rRNA processing. RIO1 is also required for proper cell cycle progression and chromosome maintenance. The biological substrates for RIO kinases are still unknown. The RIO kinase catalytic domain family is part of a larger superfamily, that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase (PI3K).	192
270690	cd05120	APH_ChoK_like	Aminoglycoside 3'-phosphotransferase and Choline Kinase family. This family is composed of APH, ChoK, ethanolamine kinase (ETNK), macrolide 2'-phosphotransferase (MPH2'), an unusual homoserine kinase, and uncharacterized proteins with similarity to the N-terminal domain of acyl-CoA dehydrogenase 10 (ACAD10). The members of this family catalyze the transfer of the gamma-phosphoryl group from ATP (or CTP) to small molecule substrates such as aminoglycosides, macrolides, choline, ethanolamine, and homoserine. Phosphorylation of the antibiotics, aminoglycosides and macrolides, leads to their inactivation and to bacterial antibiotic resistance. Phosphorylation of choline, ethanolamine, and homoserine serves as precursors to the synthesis of important biological compounds, such as the major phospholipids, phosphatidylcholine and phosphatidylethanolamine and the amino acids, threonine, methionine, and isoleucine. The APH/ChoK family is part of a larger superfamily that includes the catalytic domains of other kinases, such as the typical serine/threonine/tyrosine protein kinases (PKs), RIO kinases, actin-fragmin kinase (AFK), and phosphoinositide 3-kinase (PI3K).	158
270691	cd05121	ABC1_ADCK3-like	Activator of bc1 complex (ABC1) kinases (also called aarF domain containing kinase 3) and similar proteins. This family is composed of the atypical yeast protein kinase Abc1p, its human homolog ADCK3 (also called CABC1), and similar proteins. Abc1p (also called Coq8p) is required for the biosynthesis of Coenzyme Q (ubiquinone or Q), which is an essential lipid component in respiratory electron and proton transport. It is necessary for the formation of a multi-subunit Q-biosynthetic complex and may also function in the regulation of Q synthesis. Human ADCK3 is able to rescue defects in Q synthesis and the phosphorylation state of Coq proteins in yeast Abc1 (or Coq8) mutants. Mutations in ADCK3 cause progressive cerebellar ataxia and atrophy due to Q10 deficiency. Eukaryotes contain at least two more ABC1/ADCK3-like proteins: in humans, these are the putative atypical protein kinases named ADCK1 and ADCK2. In algae and higher plants, ABC1 kinases have proliferated to more than 15 subfamilies, most of which are located in plastids or mitochondria. Eight of these plant ABC1 kinase subfamilies (ABC1K1-8) are specific for photosynthetic organisms. ABC1 kinases are not related to the ATP-binding cassette (ABC) membrane transporter family.	247
270692	cd05122	PKc_STE	Catalytic domain of STE family Protein Kinases. PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine (ST) or tyrosine residues on protein substrates. This family is composed of STKs, and some dual-specificity PKs that phosphorylate both threonine and tyrosine residues of target proteins. Most members are kinases involved in mitogen-activated protein kinase (MAPK) signaling cascades, acting as MAPK kinases (MAPKKs), MAPKK kinases (MAPKKKs), or MAPKKK kinases (MAP4Ks). The MAPK signaling pathways are important mediators of cellular responses to extracellular signals. The pathways involve a triple kinase core cascade comprising of the MAPK, which is phosphorylated and activated by a MAPKK, which itself is phosphorylated and activated by a MAPKKK. Each MAPK cascade is activated either by a small GTP-binding protein or by an adaptor protein, which transmits the signal either directly to a MAPKKK to start the triple kinase core cascade or indirectly through a mediator kinase, a MAP4K. Other STE family members include p21-activated kinases (PAKs) and class III myosins, among others. PAKs are Rho family GTPase-regulated kinases that serve as important mediators in the function of Cdc42 (cell division cycle 42) and Rac. Class III myosins are motor proteins containing an N-terminal kinase catalytic domain and a C-terminal actin-binding domain, which can phosphorylate several cytoskeletal proteins, conventional myosin regulatory light chains, as well as autophosphorylate the C-terminal motor domain. They play an important role in maintaining the structural integrity of photoreceptor cell microvilli. The STE family is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	254
270693	cd05123	STKc_AGC	Catalytic domain of AGC family Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. AGC kinases regulate many cellular processes including division, growth, survival, metabolism, motility, and differentiation. Many are implicated in the development of various human diseases. Members of this family include cAMP-dependent Protein Kinase (PKA), cGMP-dependent Protein Kinase (PKG), Protein Kinase C (PKC), Protein Kinase B (PKB), G protein-coupled Receptor Kinase (GRK), Serum- and Glucocorticoid-induced Kinase (SGK), and 70 kDa ribosomal Protein S6 Kinase (p70S6K or S6K), among others. AGC kinases share an activation mechanism based on the phosphorylation of up to three sites: the activation loop (A-loop), the hydrophobic motif (HM) and the turn motif. Phosphorylation at the A-loop is required of most AGC kinases, which results in a disorder-to-order transition of the A-loop. The ordered conformation results in the access of substrates and ATP to the active site. A subset of AGC kinases with C-terminal extensions containing the HM also requires phosphorylation at this site. Phosphorylation at the HM allows the C-terminal extension to form an ordered structure that packs into the hydrophobic pocket of the catalytic domain, which then reconfigures the kinase into an active bi-lobed state. In addition, growth factor-activated AGC kinases such as PKB, p70S6K, RSK, MSK, PKC, and SGK, require phosphorylation at the turn motif (also called tail or zipper site), located N-terminal to the HM at the C-terminal extension. The AGC family is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and Phosphoinositide 3-Kinase.	250
270694	cd05124	AFK	Catalytic domain of Actin-Fragmin Kinase. AFK is found in slime molds, ciliates, and flowering plants. It catalyzes the transfer of the gamma-phosphoryl group from ATP specifically to threonine residues in the actin-fragmin complex. The phosphorylation sites are located at a minor contact site for DNase I and at an actin-actin contact site. Fragmin is an actin-binding protein that functions as a regulator of the microfilament system. It interferes with the growth of F-actin by severing actin filaments and capping their ends. The phosphorylation of the actin-fragmin complex inhibits its nucleation activity and results in calcium-dependent capping activity. Thus, AFK plays a role in regulating actin polymerization. The AFK catalytic domain is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	240
240161	cd05125	Mth938_2P1-like	Mth938_2P1-like domain. This model contains sequences that are similar to 2P1, a partially characterized nuclear protein, which is homologous to E3-3 from rat and known to be alternatively spliced. Its function is unknown. This family is part of the Mth938 family, for which structures, but no functional data are available.	114
240162	cd05126	Mth938	Mth938 domain. Mth938 is a hypothetical protein encoded by the Methanobacterium thermoautotrophicum (Mth) genome. This protein crystallizes as a dimer, although it is monomeric in solution, with one disulfide bond in each monomer. The function of the protein has not been determined.	117
213329	cd05127	RasGAP_IQGAP_like	Ras-GTPase Activating Domain of IQ motif containing GTPase activating proteins. This family represents IQ motif containing GTPase activating protein (IQGAP) which is associated with the Ras GTP-binding protein. A primary function of IQGAP proteins is to modulate cytoskeletal architecture. There are three known IQGAP family members: IQGAP1, IQGAP2 and IQGAP3. Human IQGAP1 and IQGAP2 share 62% identity. IQGAPs are multi-domain molecules having a calponin-homology (CH) domain which binds F-actin, IQGAP-specific repeats, a single WW domain, four IQ motifs that mediate interactions with calmodulin, and a RasGAP related domain that binds active Rho family GTPases. IQGAP is an essential regulator of cytoskeletal function. IQGAP1 negatively regulates Ras family GTPases by stimulating their intrinsic GTPase activity, the protein actually lacks GAP activity. Both IQGAP1 and IQGAP2 specifically bind to Cdc42 and Rac1, but not to RhoA. Despite their similarity to part of the sequence of RasGAP, neither IQGAP1 nor IQGAP2 interacts with Ras. IQGAP3, only present in mammals, regulates the organization of the cytoskeleton under the regulation of Rac1 and Cdc42 in neuronal cells. The depletion of IQGAP3 is shown to impair neurite or axon outgrowth in neuronal cells with disorganized cytoskeleton.	331
213330	cd05128	RasGAP_GAP1_like	Ras-GTPase Activating Domain of GAP1 and similar proteins. The GAP1 family of Ras GTPase-activating proteins includes GAP1(m) (or RASA2), GAP1_IP4BP (or RASA3), Ca2+ -promoted Ras inactivator (CAPRI, or RASAL4), and Ras GTPase activating-like proteins (RASAL) or RASAL1. The members are characterized by a conserved domain structure comprising N-terminal tandem C2 domains, a highly conserved central RasGAP domain, and a C-terminal pleckstrin homology domain that is associated with a Bruton's tyrosine kinase motif. While this domain structure is conserved, a small change in the function of each individual domain and the interaction between domains has a marked effect on the regulation of each protein.	269
213331	cd05129	RasGAP_RAP6	Ras-GTPase Activating Domain of Rab5-activating protein 6. Rab5-activating protein 6 (RAP6) is an endosomal protein with a role in the regulation of receptor-mediated endocytosis. RAP6 contains a Vps9 domain, which is involved in the activation of Rab5, and a Ras GAP domain (RGD). Rab5 is a small GTPase required for the control of the endocytic route, and its activity is regulated by guanine nucleotide exchange factor, such as Rabex5, and GAPs, such as RN-tre. Human Rap6 protein is localized on the plasma membrane and on the endosome. RAP6 binds to Rab5 and Ras through the Vps9 and RGD domains, respectively.	365
213332	cd05130	RasGAP_Neurofibromin	Ras-GTPase Activating Domain of neurofibromin. Neurofibromin is the product of the neurofibromatosis type 1 gene (NF1) and shares a region of similarity with the catalytic domain of the mammalian p120RasGAP protein and an extended similarity with the Saccharomyces cerevisiae RasGAP proteins Ira1 and Ira2. Neurofibromin has been shown to function as a GAP (GTPase-activating protein) which inhibits low molecular weight G proteins such as Ras by stimulating their intrinsic GTPase activity. NF1 is a common genetic disorder characterized by various symptoms ranging from predisposition for the development of tumors to learning disability or mental retardation. Loss of neurofibromin activity can be correlated to the increase in Ras-GTP concentration in neurofibromas of NF1 patients, supporting the notion that unregulated Ras signaling may contribute to their development.	332
213333	cd05131	RasGAP_IQGAP2	Ras-GTPase Activating Domain of IQ motif containing GTPase activating protein 2. IQGAP2 is a member of the IQGAP family that contains a calponin-homology (CH) domain which binds F-actin, IQGAP-specific repeat, a single WW domain, four IQ motifs which mediate interactions with calmodulin, and a Ras-GTPase-activating protein (GAP)-related domain that binds Rho family GTPases. IQGAP2 and IQGAP3 play important roles in the regulation of the cytoskeleton for axon outgrowth in hippocampal neurons and are thought to act in a common regulatory pathway. The results of RNA interference studies indicated that IQGAP3 partially compensates for the functions of IQGAP2, but is less able than IQGAP2 to promote axon outgrowth in hippocampal neurons. Moreover, IQGAP2 is required for the cadherin-mediated cell-to-cell adhesion in Xenopus laevis embryos.	359
213334	cd05132	RasGAP_GAPA	Ras-GTPase Activating Domain of GAPA. GAPA is an IQGAP-related protein and is predicted to bind to small GTPases, which are yet to be identified. IQGAP proteins are integral components of cytoskeletal regulation. Results from truncated GAPAs indicated that almost the entire region of GAPA homologous to IQGAP is required for cytokinesis in Dictyostelium. More members of the IQGAP family are emerging, and evidence suggests that there are both similarities and differences in their function.	352
213335	cd05133	RasGAP_IQGAP1	Ras-GTPase Activating Domain of IQ motif containing GTPase activating protein 1. IQGAP1 is a homodimeric protein that is widely expressed among vertebrate cell types from early embryogenesis. Mammalian IQGAP1 protein is the best characterized member of the IQGAP family, and contains several protein-interacting domains. Human IQGAP1 is most similar to mouse Iqgap1 (94% identity) and has 62% identity to human IQGAP2. IQGAP1 binds and cross-links actin filaments in vitro and has been implicated in Ca2+/calmodulin signaling, E-cadherin-dependent cell adhesion, cell motility, and invasion. Yeast IQGAP homologs have a role in the recruitment of actin filaments, are components of the spindle pole body, and are required for actomyosin ring assembly and cytokinesis. Furthermore, IQGAP1 over-expression has also been detected in gastric and colorectal carcinomas and gastric cancer cell lines.	380
213336	cd05134	RasGAP_RASA3	Ras-GTPase Activating Domain of RASA3. RASA3 (or GAP1_IP4BP) is a member of the GAP1 family and has been shown to specifically bind 1,3,4,5-tetrakisphosphate (IP4). Thus, RASA3 may function as an IP4 receptor. The members of GAP1 family are characterized by a conserved domain structure comprising N-terminal tandem C2 domains, a highly conserved central RasGAP domain, and a C-terminal pleckstrin-homology domain that is associated with a Bruton's tyrosine kinase motif. Purified RASA3 stimulates GAP activity on Ras with about a five-fold lower potency than p120RasGAP, but shows no GAP-stimulating activity at all against Rac or Rab3A.	269
213337	cd05135	RasGAP_RASAL	Ras-GTPase Activating Domain of RASAL1 and similar proteins. Ras GTPase activating-like protein (RASAL) or RASAL1 is a member of the GAP1 family, and a Ca2+ sensor responding in-phase to repetitive Ca2+ signals by associating with the plasma membrane and deactivating Ras. It contains a conserved domain structure comprising N-terminal tandem C2 domains, a highly conserved central RasGAP domain, and a C-terminal pleckstrin-homology domain that is associated with a Bruton's tyrosine kinase motif. RASAL, like Ca2+ -promoted Ras inactivator (CAPRI, or RASAL4), is a cytosolic protein that undergoes a rapid translocation to the plasma membrane in response to receptor-mediated elevation in the concentration of intracellular free Ca2+, a translocation that activates its ability to function as a RasGAP. However, unlike RASAL4, RASAL undergoes an oscillatory translocation to the plasma membrane that occurs in synchrony with repetitive Ca2+ spikes.	287
213338	cd05136	RasGAP_DAB2IP	Ras-GTPase Activating Domain of DAB2IP and similar proteins. The DAB2IP family of Ras GTPase-activating proteins includes DAB2IP, nGAP, and Syn GAP. Disabled 2 interactive protein, (DAB2IP; also known as ASK-interacting protein 1 (AIP1)), is a member of the GTPase-activating proteins, down-regulates Ras-mediated signal pathways, and mediates TNF-induced activation of ASK1-JNK signaling pathways. The mechanism by which TNF signaling is coupled to DAB2IP is not known.	324
213339	cd05137	RasGAP_CLA2_BUD2	Ras-GTPase Activating Domain of CLA2/BUD2. CLA2/BUD2 functions as a GTPase-activating protein (GAP) for BUD1/RSR1 and is necessary for proper bud-site selection in yeast. BUD2 has sequence similarity to the catalytic domain of RasGAPs, and stimulates the hydrolysis of BUD1-GTP to BUD1-GDP. Elimination of Bud2p activity by mutation causes a random budding pattern with no growth defect. Overproduction of Bud2p also alters the budding pattern.	356
240163	cd05140	Barstar_AU1054-like	Barstar_AU1054-like contains uncharacterized sequences similar to the uncharacterized, predicted RNAase inhibitor AU1054 found in Burkholderia cenocepacia. This is a subfamily of the Barstar family of RNAase inhibitors. Barstar is an intracellular inhibitor of barnase, an extracellular ribonuclease of Bacillus amyloliquefaciens. Barstar binds tightly to the barnase active site and sterically blocks it thus inhibiting its potentially lethal RNase activity inside the cell.  Barstar also binds and inhibits a ribonuclease called RNase Sa (produced by Streptomyces aureofaciens) which belongs to the same enzyme family as does barnase.	86
240164	cd05141	Barstar_evA4336-like	Barstar_evA4336-like contains uncharacterized sequences similar to the uncharacterized, predicted RNAase inhibitor evA4336 found in Azoarcus sp. EvN1. This is a subfamily of the Barstar family of RNAase inhibitors. Barstar is an intracellular inhibitor of barnase, an extracellular ribonuclease of Bacillus amyloliquefaciens. Barstar binds tightly to the barnase active site and sterically blocks it thus inhibiting its potentially lethal RNase activity inside the cell.  Barstar also binds and inhibits a ribonuclease called RNase Sa (produced by Streptomyces aureofaciens) which belongs to the same enzyme family as does barnase.	81
240165	cd05142	Barstar	Barstar is an intracellular inhibitor of barnase, an extracellular ribonuclease of Bacillus amyloliquefaciens. Barstar binds tightly to the barnase active site and sterically blocks it thus inhibiting its potentially lethal RNase activity inside the cell.  Barstar also binds and inhibits a ribonuclease called RNase Sa (produced by Streptomyces aureofaciens) which belongs to the same enzyme family as does barnase.	87
240166	cd05143	Barstar_SaI14_like	Barstar_SaI14_like contains sequences that are similar to SaI14, an RNAase inhibitor, which are members of the Barstar family. Barstar is an intracellular inhibitor of barnase, an extracellular ribonuclease of Bacillus amyloliquefaciens. Barstar binds tightly to the barnase active site and sterically blocks it thus inhibiting its potentially lethal RNase activity inside the cell. The sequences in this subfamily are mostly uncharacterized, but believed to have a similar function and role.	88
270695	cd05144	RIO2_C	C-terminal catalytic domain of the atypical protein serine kinase, RIO2 kinase. RIO2 is present in archaea and eukaryotes. It contains an N-terminal winged helix (wHTH) domain and a C-terminal RIO kinase catalytic domain. The wHTH domain is primarily seen in DNA-binding proteins, although some wHTH domains may be involved in RNA recognition. RIO2 is essential for survival and is necessary for rRNA cleavage during 40S ribosomal subunit maturation. RIO kinases are atypical protein serine kinases containing a kinase catalytic signature, but otherwise show very little sequence similarity to typical PKs. Serine kinases catalyze the transfer of the gamma-phosphoryl group from ATP to serine residues in protein substrates. The RIO catalytic domain is truncated compared to the catalytic domains of typical PKs, with deletions of the loops responsible for substrate binding. The RIO2 kinase catalytic domain family is part of a larger superfamily, that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase (PI3K).	183
270696	cd05145	RIO1_like	Catalytic domain of the atypical protein serine kinases, RIO1 and RIO3 kinases and similar proteins. RIO1 is present in archaea, bacteria and eukaryotes. In addition, RIO3 is present in multicellular eukaryotes. Both RIO1 and RIO3 are associated with precursors of 40S ribosomal subunits, just like RIO2. RIO1 is essential for survival and is required for 18S rRNA processing, proper cell cycle progression and chromosome maintenance. Although depletion of either RIO1 or RIO2 results in similar effects, the two kinases are not fully interchangeable. The specific function of RIO3 is unknown. RIO kinases are atypical protein serine kinases containing a kinase catalytic signature, but otherwise show very little sequence similarity to typical PKs. Serine kinases catalyze the transfer of the gamma-phosphoryl group from ATP to serine residues in protein substrates. The RIO catalytic domain is truncated compared to the catalytic domains of typical PKs, with deletions of the loops responsible for substrate binding. The RIO kinase catalytic domain family is part of a larger superfamily, that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase (PI3K).	189
270697	cd05146	RIO3_euk	Catalytic domain of the atypical protein serine kinase, RIO3 kinase. RIO3 is present only in multicellular eukaryotes. It is associated with precursors of 40S ribosomal subunits, just like RIO1 and RIO2. Its specific function is still unknown. Like RIO1 and RIO2, it may be involved in ribosomal subunit processing and maturation. RIO kinases are atypical protein serine kinases containing a kinase catalytic signature, but otherwise show very little sequence similarity to typical PKs. Serine kinases catalyze the transfer of the gamma-phosphoryl group from ATP to serine residues in protein substrates. The RIO catalytic domain is truncated compared to the catalytic domains of typical PKs, with deletions of the loops responsible for substrate binding. The RIO3 kinase catalytic domain family is part of a larger superfamily, that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase (PI3K).	196
270698	cd05147	RIO1_euk	Catalytic domain of the atypical protein serine kinase, Eukaryotic RIO1 kinase. RIO1 is present in archaea, bacteria and eukaryotes. This subfamily is composed of RIO1 proteins from eukaryotes. RIO1 is essential for survival and is required for 18S rRNA processing, proper cell cycle progression and chromosome maintenance. It is associated with precursors of 40S ribosomal subunits, just like RIO2. Although depletion of either RIO1 or RIO2 results in similar effects, the two kinases are not fully interchangeable. RIO kinases are atypical protein serine kinases containing a kinase catalytic signature, but otherwise show very little sequence similarity to typical PKs. Serine kinases catalyze the transfer of the gamma-phosphoryl group from ATP to serine residues in protein substrates. The RIO catalytic domain is truncated compared to the catalytic domains of typical PKs, with deletions of the loops responsible for substrate binding. The RIO kinase catalytic domain family is part of a larger superfamily, that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase (PI3K).	190
133248	cd05148	PTKc_Srm_Brk	Catalytic domain of the Protein Tyrosine Kinases, Src-related kinase lacking C-terminal regulatory tyrosine and N-terminal myristylation sites (Srm) and Breast tumor kinase (Brk). PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Srm and Brk (also called protein tyrosine kinase 6) are members of the Src subfamily of proteins, which are cytoplasmic (or non-receptor) PTKs. Brk has been found to be overexpressed in a majority of breast tumors. Src kinases in general contain an N-terminal SH4 domain with a myristoylation site, followed by SH3 and SH2 domains, a tyr kinase domain, and a regulatory C-terminal region containing a conserved tyr; they are activated by autophosphorylation at the tyr kinase domain, but are negatively regulated by phosphorylation at the C-terminal tyr by Csk (C-terminal Src Kinase). Srm and Brk however, lack the N-terminal myristylation sites. Src proteins are involved in signaling pathways that regulate cytokine and growth factor responses, cytoskeleton dynamics, cell proliferation, survival, and differentiation.  The Srm/Brk subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	261
270699	cd05150	APH	Aminoglycoside 3'-phosphotransferase. APH catalyzes the transfer of the gamma-phosphoryl group from ATP to aminoglycoside antibiotics such as kanamycin, streptomycin, neomycin, and gentamicin, among others. The aminoglycoside antibiotics target the 30S ribosome and promote miscoding, leading to the production of defective proteins which insert into the bacterial membrane, resulting in membrane damage and the ultimate demise of the bacterium. Phosphorylation of the aminoglycoside antibiotics results in their inactivation, leading to bacterial antibiotic resistance. The APH gene is found on transposons and plasmids and is thought to have originated as a self-defense mechanism used by microorganisms that produce the antibiotics. The APH subfamily is part of a larger superfamily that includes the catalytic domains of other kinases, such as the typical serine/threonine/tyrosine protein kinases (PKs), RIO kinases, actin-fragmin kinase (AFK), and phosphoinositide 3-kinase (PI3K).	244
270700	cd05151	ChoK-like	Choline Kinase and similar proteins. This subfamily is composed of bacterial and eukaryotic choline kinases, as well as eukaryotic ethanolamine kinase. ChoK catalyzes the transfer of the gamma-phosphoryl group from ATP (or CTP) to its substrate, choline, producing phosphorylcholine (PCho), a precursor to the biosynthesis of two major membrane phospholipids, phosphatidylcholine (PC), and sphingomyelin (SM). Although choline is the preferred substrate, ChoK also shows substantial activity towards ethanolamine and its N-methylated derivatives. Bacterial ChoK is also referred to as licA protein. ETNK catalyzes the transfer of the gamma-phosphoryl group from CTP to ethanolamine (Etn), the first step in the CDP-Etn pathway for the formation of the major phospholipid, phosphatidylethanolamine (PtdEtn). Unlike ChoK, ETNK shows specific activity for its substrate and displays negligible activity towards N-methylated derivatives of Etn. ChoK plays an important role in cell signaling pathways and the regulation of cell growth. The ChoK subfamily is part of a larger superfamily that includes the catalytic domains of other kinases, such as the typical serine/threonine/tyrosine protein kinases (PKs), RIO kinases, actin-fragmin kinase (AFK), and phosphoinositide 3-kinase (PI3K).	152
270701	cd05152	MPH2'	Macrolide 2'-Phosphotransferase. MPH2' catalyzes the transfer of the gamma-phosphoryl group from ATP to the 2'-hydroxyl of macrolide antibiotics such as erythromycin, clarithromycin, and azithromycin, among others. Macrolides penetrate the bacterial cell and bind to ribosomes, where they interrupt protein elongation, leading ultimately to the demise of the bacterium. Phosphorylation of macrolides leads to their inactivation. Based on substrate specificity and amino acid sequence, MPH2' is divided into types I and II, encoded by mphA and mphB genes, respectively. MPH2'I inactivates 14-membered ring macrolides while MPH2'II inactivates both 14- and 16-membered ring macrolides. Enzymatic inactivation of macrolides has been reported as a mechanism for bacterial resistance in clinical samples. MPH2' is part of a larger superfamily that includes the catalytic domains of other kinases, such as the typical serine/threonine/tyrosine protein kinases (PKs), RIO kinases, actin-fragmin kinase (AFK), and phosphoinositide 3-kinase (PI3K).	276
270702	cd05153	HomoserineK_II	Type II Homoserine Kinase. This subfamily is composed of unusual homoserine kinases, from a subset of bacteria, which have a Protein Kinase fold. These proteins do not bear any similarity to the GHMP family homoserine kinases present in most bacteria and eukaryotes. Homoserine kinase catalyzes the transfer of the gamma-phosphoryl group from ATP to L-homoserine producing L-homoserine phosphate, an intermediate in the production of the amino acids threonine, methionine, and isoleucine. The Type II homoserine kinase subfamily is part of a larger superfamily that includes the catalytic domains of other kinases, such as the typical serine/threonine/tyrosine protein kinases (PKs), RIO kinases, actin-fragmin kinase (AFK), and phosphoinositide 3-kinase (PI3K).	300
270703	cd05154	ACAD10_11_N-like	N-terminal domain of Acyl-CoA dehydrogenase (ACAD) 10 and 11, and similar proteins. This subfamily is composed of the N-terminal domains of vertebrate ACAD10 and ACAD11, and similar uncharacterized bacterial and eukaryotic proteins. ACADs are a family of flavoproteins that are involved in the beta-oxidation of fatty acyl-CoA derivatives. ACAD deficiency can cause metabolic disorders including muscle fatigue, hypoglycemia, and hepatic lipidosis. There are at least 11 distinct ACADs, some of which show distinct substrate specificities to either straight-chain or branched-chain fatty acids. ACAD10 is widely expressed in human tissues and highly expressed in liver, kidney, pancreas, and spleen. ACAD10 and ACAD11 are both significantly expressed in human brain tissues. They contain a long N-terminal domain with similarity to phosphotransferases with a Protein Kinase fold, which is absent in other ACADs. They may exhibit multiple functions in acyl-CoA oxidation pathways. ACAD11 utilizes substrates with carbon chain lengths of 20 to 26, with optimal activity towards C22CoA. ACAD10 may be associated with an increased risk in type II diabetes. The ACAD10/11-like subfamily is part of a larger superfamily that includes the catalytic domains of other kinases, such as the typical serine/threonine/tyrosine protein kinases (PKs), RIO kinases, actin-fragmin kinase (AFK), and phosphoinositide 3-kinase (PI3K).	254
270704	cd05155	APH_ChoK_like_1	Uncharacterized bacterial proteins with similarity to Aminoglycoside 3'-phosphotransferase and Choline kinase. This subfamily is composed of uncharacterized bacterial proteins with similarity to APH and ChoK. Other APH/ChoK-like proteins include ethanolamine kinase (ETNK), macrolide 2'-phosphotransferase (MPH2'), an unusual homoserine kinase, and uncharacterized proteins with similarity to the N-terminal domain of acyl-CoA dehydrogenase 10 (ACAD10). These proteins catalyze the transfer of the gamma-phosphoryl group from ATP (or CTP) to small molecule substrates, such as aminoglycosides, macrolides, choline, ethanolamine, and homoserine. Phosphorylation of the antibiotics, aminoglycosides, and macrolides leads to their inactivation and to bacterial antibiotic resistance. Phosphorylation of choline, ethanolamine, and homoserine serves as precursors to the synthesis of important biological compounds, such as the major phospholipids, phosphatidylcholine and phosphatidylethanolamine and the amino acids, threonine, methionine, and isoleucine. The APH/ChoK-like subfamily is part of a larger superfamily that includes the catalytic domains of other kinases, such as the typical serine/threonine/tyrosine protein kinases (PKs), RIO kinases, actin-fragmin kinase (AFK), and phosphoinositide 3-kinase (PI3K).	234
270705	cd05156	ChoK_euk	Eukaryotic Choline Kinase. ChoK catalyzes the transfer of the gamma-phosphoryl group from ATP (or CTP) to its substrate, choline, producing phosphorylcholine (PCho), a precursor to the biosynthesis of two major membrane phospholipids, phosphatidylcholine (PC) and sphingomyelin (SM). Although choline is the preferred substrate, ChoK also shows substantial activity towards ethanolamine and its N-methylated derivatives. ChoK plays an important role in cell signaling pathways and the regulation of cell growth. Along with PCho, it is involved in malignant transformation through Ras oncogenes in various human cancers such as breast, lung, colon, prostate, neuroblastoma, and hepatic lymphoma. In mammalian cells, there are three ChoK isoforms (A-1, A-2, and B) which are active in homo- or heterodimeric forms. The ChoK subfamily is part of a larger superfamily that includes the catalytic domains of other kinases, such as the typical serine/threonine/tyrosine protein kinases (PKs), RIO kinases, actin-fragmin kinase (AFK), and phosphoinositide 3-kinase (PI3K).	326
270706	cd05157	ETNK_euk	Eukaryotic Ethanolamine kinase. ETNK catalyzes the transfer of the gamma-phosphoryl group from CTP to ethanolamine (Etn), the first step in the CDP-Etn pathway for the formation of the major phospholipid, phosphatidylethanolamine (PtdEtn). Unlike ChoK, ETNK shows specific activity for its substrate, and displays negligible activity towards N-methylated derivatives of Etn. The Drosophila ETNK is implicated in development and neuronal function. Mammals contain two ETNK proteins, ETNK1 and ETNK2. ETNK1 selectively increases Etn uptake and phosphorylation, as well as PtdEtn synthesis. ETNK2 is found primarily in the liver and reproductive tissues. It plays a critical role in regulating placental hemostasis to support late embryonic development. It may also have a role in testicular maturation. ETNK is part of a larger superfamily that includes the catalytic domains of other kinases, such as the typical serine/threonine/tyrosine protein kinases (PKs), RIO kinases, actin-fragmin kinase (AFK), and phosphoinositide 3-kinase (PI3K).	307
176646	cd05160	DEDDy_DNA_polB_exo	DEDDy 3'-5' exonuclease domain of family-B DNA polymerases. The 3'-5' exonuclease domain of family-B DNA polymerases. This domain has a fundamental role in reducing polymerase errors and is involved in proofreading activity. Family-B DNA polymerases contain an N-terminal DEDDy DnaQ-like exonuclease domain in the same polypeptide chain as the polymerase domain, similar to family-A DNA polymerases. This domain contains three sequence motifs termed ExoI, ExoII and ExoIII, with a specific YX(3)D pattern at ExoIII. These motifs are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The exonuclease domain of family B polymerase also contains a beta hairpin structure that plays an important role in active site switching in the event of nucleotide misincorporation. Members include Escherichia coli DNA polymerase II, some eubacterial phage DNA polymerases, nuclear replicative DNA polymerases (alpha, delta, epsilon and zeta), and eukaryotic viral and plasmid-borne enzymes. Nuclear DNA polymerases alpha and zeta lack the four conserved acidic metal-binding residues. Family-B DNA polymerases are predominantly involved in DNA replication and DNA repair.	199
99894	cd05162	PWWP	The PWWP domain, named for a conserved Pro-Trp-Trp-Pro motif, is a small domain consisting of 100-150 amino acids.  The PWWP domain is found in numerous proteins that are involved in cell division, growth and differentiation.  Most PWWP-domain proteins seem to be nuclear, often DNA-binding, proteins that function as transcription factors regulating a variety of developmental processes.  The function of the PWWP domain is still not known precisely; however, based on the fact that other regions of PWWP-domain proteins are responsible for nuclear localization and DNA-binding, it is likely that the PWWP domain acts as a site for protein-protein binding interactions, influencing chromatin remodeling and thereby regulating transcriptional processes.  Some PWWP-domain proteins have been linked to cancer or other diseases; some are known to function as growth factors.	87
270707	cd05163	PIKK_TRRAP	Pseudokinase domain of TRansformation/tRanscription domain-Associated Protein. TRRAP belongs to the phosphoinositide 3-kinase-related protein kinase (PIKK) subfamily. It contains a FATC (FRAP, ATM and TRRAP, C-terminal) domain and has a large molecular weight. Unlike most PIKK proteins, however, it contains an inactive PI3K-like pseudokinase domain, which lacks the conserved residues necessary for ATP binding and catalytic activity. TRRAP also contains many motifs that may be critical for protein-protein interactions. TRRAP is a common component of many histone acetyltransferase (HAT) complexes, and is responsible for the recruitment of these complexes to chromatin during transcription, replication, and DNA repair. TRRAP also exists in non-HAT complexes such as the p400 and MRN complexes, which are implicated in ATP-dependent remodeling and DNA repair, respectively. The TRRAP pseudokinase domain subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases.	252
270708	cd05164	PIKKc	Catalytic domain of Phosphoinositide 3-kinase-related protein kinases. PIKK subfamily members include ATM (Ataxia telangiectasia mutated), ATR (Ataxia telangiectasia and Rad3-related), TOR (Target of rapamycin), SMG-1 (Suppressor of morphogenetic effect on genitalia-1), and DNA-PK (DNA-dependent protein kinase). PIKKs have intrinsic serine/threonine kinase activity and are distinguished from other PKs by their unique catalytic domain, similar to that of lipid PI3K, and their large molecular weight (240-470 kDa). They show strong preference for phosphorylating serine/threonine residues followed by a glutamine and are also referred to as (S/T)-Q-directed kinases. They all contain a FATC (FRAP, ATM and TRRAP, C-terminal) domain. PIKKs have diverse functions including cell-cycle checkpoints, genome surveillance, mRNA surveillance, and translation control. The PIKK catalytic domain subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases.	222
270709	cd05165	PI3Kc_I	Catalytic domain of Class I Phosphoinositide 3-kinase. PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives. Class I PI3Ks are the only enzymes capable of converting PtdIns(4,5)P2 to the critical second messenger PtdIns(3,4,5)P3. In vitro, they can also phosphorylate the substrates PtdIns and PtdIns(4)P. Class I enzymes are heterodimers and exist in multiple isoforms consisting of one catalytic subunit (out of four isoforms) and one of several regulatory subunits. They are further classified into class IA (alpha, beta and delta) and IB (gamma). PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. They can be divided into three main classes (I, II, and III), defined by their substrate specificity, regulation, and domain structure. The PI3K catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases.	363
270710	cd05166	PI3Kc_II	Catalytic domain of Class II Phosphoinositide 3-kinase. PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives. PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. They can be divided into three main classes (I, II, and III), defined by their substrate specificity, regulation, and domain structure. Class II PI3Ks preferentially use PtdIns as a substrate to produce PtdIns(3)P, but can also phosphorylate PtdIns(4)P. They function as monomers and do not associate with any regulatory subunits. Class II enzymes contain an N-terminal Ras binding domain, a lipid binding C2 domain, a PI3K homology domain of unknown function, an ATP-binding catalytic domain, a Phox homology (PX) domain, and a second C2 domain at the C-terminus. They are activated by a variety of stimuli including chemokines, cytokines, lysophosphatidic acid (LPA), insulin, and tyrosine kinase receptors. The PI3K catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases.	352
270711	cd05167	PI4Kc_III_alpha	Catalytic domain of Type III Phosphoinositide 4-kinase alpha. PI4Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 4-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) to generate PtdIns(4)P, the major precursor in the synthesis of other phosphoinositides including PtdIns(4,5)P2, PtdIns(3,4)P2, and PtdIns(3,4,5)P3. Two isoforms of type III PI4K, alpha and beta, exist in most eukaryotes. PI4KIIIalpha is a 220 kDa protein found in the plasma membrane and the endoplasmic reticulum (ER). The role of PI4KIIIalpha in the ER remains unclear. In the plasma membrane, it provides PtdIns(4)P, which is then converted by PI5Ks to PtdIns(4,5)P2, an important signaling molecule. Vertebrate PI4KIIIalpha is also part of a signaling complex associated with P2X7 ion channels. The yeast homolog, Stt4p, is also important in regulating the conversion of phosphatidylserine to phosphatidylethanolamine at the ER and Golgi interface. Mammalian PI4KIIIalpha is highly expressed in the nervous system. The PI4K catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases.	307
270712	cd05168	PI4Kc_III_beta	Catalytic domain of Type III Phosphoinositide 4-kinase beta. PI4Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 4-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) to generate PtdIns(4)P, the major precursor in the synthesis of other phosphoinositides including PtdIns(4,5)P2, PtdIns(3,4)P2, and PtdIns(3,4,5)P3. Two isoforms of type III PI4K, alpha and beta, exist in most eukaryotes. PI4KIIIbeta (also called Pik1p in yeast) is a 110 kDa protein that is localized to the Golgi and the nucleus. It is required for maintaining the structural integrity of the Golgi complex (GC), and is a key regulator of protein transport from the GC to the plasma membrane. PI4KIIIbeta also functions in the genesis, transport, and exocytosis of synaptic vesicles. The Drosophila PI4KIIIbeta is essential for cytokinesis during spermatogenesis. The PI4K catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases.	292
270713	cd05169	PIKKc_TOR	Catalytic domain of Target of Rapamycin. TOR contains a rapamycin binding domain, a catalytic domain, and a FATC (FRAP, ATM and TRRAP, C-terminal) domain at the C-terminus. It is also called FRAP (FK506 binding protein 12-rapamycin associated protein). TOR is a central component of the eukaryotic growth regulatory network. It controls the expression of many genes transcribed by all three RNA polymerases. It associates with other proteins to form two distinct complexes, TORC1 and TORC2. TORC1 is involved in diverse growth-related functions including protein synthesis, nutrient use and transport, autophagy and stress responses. TORC2 is involved in organizing cytoskeletal structures. TOR is a member of the phosphoinositide 3-kinase-related protein kinase (PIKK) subfamily. PIKKs have intrinsic serine/threonine kinase activity and are distinguished from other PKs by their unique catalytic domain, similar to that of lipid PI3K, and their large molecular weight (240-470 kDa). The TOR catalytic domain subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases.	279
270714	cd05170	PIKKc_SMG1	Catalytic domain of Suppressor of Morphogenetic effect on Genitalia-1. SMG-1 plays a critical role in the mRNA surveillance mechanism known as non-sense mediated mRNA decay (NMD). NMD protects the cells from the accumulation of aberrant mRNAs with premature termination codons (PTCs) generated by genome mutations and by errors during transcription and splicing. SMG-1 phosphorylates Upf1, another central component of NMD, at the C-terminus upon recognition of PTCs. The phosphorylation/dephosphorylation cycle of Upf1 is essential for promoting NMD. In addition to its catalytic domain, SMG-1 contains a FATC (FRAP, ATM and TRRAP, C-terminal) domain at the C-terminus. SMG-1 is a member of the phosphoinositide 3-kinase-related protein kinase (PIKK) subfamily. PIKKs have intrinsic serine/threonine kinase activity and are distinguished from other PKs by their unique catalytic domain, similar to that of lipid PI3K, and their large molecular weight (240-470 kDa). The SMG-1 catalytic domain subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases.	304
270715	cd05171	PIKKc_ATM	Catalytic domain of Ataxia Telangiectasia Mutated. ATM is critical in the response to DNA double strand breaks (DSBs) caused by radiation. It is activated at the site of a DSB and phosphorylates key substrates that trigger pathways that regulate DNA repair and cell cycle checkpoints at the G1/S, S phase, and G2/M transition. Patients with the human genetic disorder Ataxia telangiectasia (A-T), caused by truncating mutations in ATM, show genome instability, increased cancer risk, immunodeficiency, compromised mobility, and neurodegeneration. A-T displays clinical heterogeneity, which is correlated to the degree of retained ATM activity. ATM contains a FAT (FRAP, ATM and TRRAP) domain, a catalytic domain, and a FATC domain at the C-terminus. It is a member of the phosphoinositide 3-kinase-related protein kinase (PIKK) subfamily. PIKKs have intrinsic serine/threonine kinase activity and are distinguished from other PKs by their unique catalytic domain, similar to that of lipid PI3K, and their large molecular weight (240-470 kDa). The ATM catalytic domain subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases.	282
270716	cd05172	PIKKc_DNA-PK	Catalytic domain of DNA-dependent protein kinase. DNA-PK is comprised of a regulatory subunit, containing the Ku70/80 subunit, and a catalytic subunit, which contains a NUC194 domain of unknown function, a FAT (FRAP, ATM and TRRAP) domain, a catalytic domain, and a FATC domain at the C-terminus. It is part of a multi-component system involved in non-homologous end joining (NHEJ), a process of repairing double strand breaks (DSBs) by joining together two free DNA ends of little homology. DNA-PK functions as a molecular sensor for DNA damage that enhances the signal via phosphorylation of downstream targets. It may also act as a protein scaffold that aids the localization of DNA repair proteins to the site of DNA damage. DNA-PK also plays a role in the maintenance of telomeric stability and the prevention of chromosomal end fusion. DNA-PK is a member of the phosphoinositide 3-kinase-related protein kinase (PIKK) subfamily. PIKKs have intrinsic serine/threonine kinase activity and are distinguished from other PKs by their unique catalytic domain, similar to that of lipid PI3K, and their large molecular weight (240-470 kDa). The DNA-PK catalytic domain subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases.	235
270717	cd05173	PI3Kc_IA_beta	Catalytic domain of Class IA Phosphoinositide 3-kinase beta. PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives. PI3Kbeta can be activated by G-protein-coupled receptors. Deletion of PI3Kbeta in mice results in early lethality at around day three of development. PI3Kbeta plays an important role in regulating sustained integrin activation and stable platelet aggregation, especially under conditions of high shear stress. PI3Ks can be divided into three main classes (I, II, and III), defined by their substrate specificity, regulation, and domain structure. Class I PI3Ks are the only enzymes capable of converting PtdIns(4,5)P2 to the critical second messenger PtdIns(3,4,5)P3. Class I enzymes are heterodimers and exist in multiple isoforms consisting of one catalytic subunit (out of four isoforms) and one of several regulatory subunits. They are further classified into class IA (alpha, beta and delta) and IB (gamma). Class IA enzymes contain an N-terminal p85 binding domain, a Ras binding domain, a lipid binding C2 domain, a PI3K homology domain of unknown function, and a C-terminal ATP-binding catalytic domain. They associate with a regulatory subunit of the p85 family and are activated by tyrosine kinase receptors. The PI3K catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases.	362
270718	cd05174	PI3Kc_IA_delta	Catalytic domain of Class IA Phosphoinositide 3-kinase delta. PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives. PI3Kdelta is mainly expressed in immune cells and plays an important role in cellular and humoral immunity. It plays a major role in antigen receptor signaling in B-cells, T-cells, and mast cells. It regulates the differentiation of peripheral helper T-cells and controls the development and function of regulatory T-cells. PI3Ks can be divided into three main classes (I, II, and III), defined by their substrate specificity, regulation, and domain structure. Class I PI3Ks are the only enzymes capable of converting PtdIns(4,5)P2 to the critical second messenger PtdIns(3,4,5)P3. Class I enzymes are heterodimers and exist in multiple isoforms consisting of one catalytic subunit (out of four isoforms) and one of several regulatory subunits. They are further classified into class IA (alpha, beta and delta) and IB (gamma). Class IA enzymes contain an N-terminal p85 binding domain, a Ras binding domain, a lipid binding C2 domain, a PI3K homology domain of unknown function, and a C-terminal ATP-binding catalytic domain. They associate with a regulatory subunit of the p85 family and are activated by tyrosine kinase receptors. The PI3K catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases.	366
270719	cd05175	PI3Kc_IA_alpha	Catalytic domain of Class IA Phosphoinositide 3-kinase alpha. PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives. PI3Kalpha plays an important role in insulin signaling. It also mediates physiologic heart growth and provides protection from stress. Activating mutations of PI3Kalpha are associated with diverse forms of cancer at high frequency. PI3Ks can be divided into three main classes (I, II, and III), defined by their substrate specificity, regulation, and domain structure. Class I PI3Ks are the only enzymes capable of converting PtdIns(4,5)P2 to the critical second messenger PtdIns(3,4,5)P3. Class I enzymes are heterodimers and exist in multiple isoforms consisting of one catalytic subunit (out of four isoforms) and one of several regulatory subunits. They are further classified into class IA (alpha, beta and delta) and IB (gamma). Class IA enzymes contain an N-terminal p85 binding domain, a Ras binding domain, a lipid binding C2 domain, a PI3K homology domain of unknown function, and a C-terminal ATP-binding catalytic domain. They associate with a regulatory subunit of the p85 family and are activated by tyrosine kinase receptors. The PI3K catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases.	370
270720	cd05176	PI3Kc_C2_alpha	Catalytic domain of Class II Phosphoinositide 3-kinase alpha. PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives. The class II alpha isoform, PI3K-C2alpha, plays key roles in clathrin assembly and clathrin-mediated membrane trafficking, insulin signaling, vascular smooth muscle contraction, and the priming of neurosecretory granule exocytosis. PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. They can be divided into three main classes (I, II, and III), defined by their substrate specificity, regulation, and domain structure. Class II PI3Ks preferentially use PtdIns as a substrate to produce PtdIns(3)P, but can also phosphorylate PtdIns(4)P. They function as monomers and do not associate with any regulatory subunits. Class II enzymes contain an N-terminal Ras binding domain, a lipid binding C2 domain, a PI3K homology domain of unknown function, an ATP-binding catalytic domain, a Phox homology (PX) domain, and a second C2 domain at the C-terminus. The PI3K catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases.	353
270721	cd05177	PI3Kc_C2_gamma	Catalytic domain of Class II Phosphoinositide 3-kinase gamma. PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives. The class II gamma isoform, PI3K-C2gamma, is expressed in the liver, breast, and prostate. PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. They can be divided into three main classes (I, II, and III), defined by their substrate specificity, regulation, and domain structure. Class II PI3Ks preferentially use PtdIns as a substrate to produce PtdIns(3)P, but can also phosphorylate PtdIns(4)P. They function as monomers and do not associate with any regulatory subunits. Class II enzymes contain an N-terminal Ras binding domain, a lipid binding C2 domain, a PI3K homology domain of unknown function, an ATP-binding catalytic domain, a Phox homology (PX) domain, and a second C2 domain at the C-terminus. The biological function of PI3K-C2gamma remains unknown. The PI3K catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases.	354
176178	cd05188	MDR	Medium chain reductase/dehydrogenase (MDR)/zinc-dependent alcohol dehydrogenase-like family. The medium chain reductase/dehydrogenases (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH.  MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases  (~ 250 amino acids vs. the ~ 350 amino acids of the MDR).  The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES.  The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH) , quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. Other MDR members have only a catalytic zinc, and some contain no coordinated zinc.	271
133449	cd05191	NAD_bind_amino_acid_DH	NAD(P) binding domain of amino acid dehydrogenase-like proteins. Amino acid dehydrogenase (DH)-like NAD(P)-binding domains are members of the Rossmann fold superfamily and are found in glutamate, leucine, and phenylalanine DHs, methylene tetrahydrofolate DH, methylene-tetrahydromethanopterin DH, methylene-tetrahydrofolate DH/cyclohydrolase, Shikimate DH-like proteins, malate oxidoreductases, and glutamyl tRNA reductase. Amino acid DHs catalyze the deamination of amino acids to keto acids with NAD(P)+ as a cofactor. The NAD(P)-binding Rossmann fold superfamily includes a wide variety of protein families including NAD(P)-binding domains of alcohol DHs, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate DH, lactate/malate DHs, formate/glycerate DHs, siroheme synthases, 6-phosphogluconate DH, amino acid DHs, repressor rex, NAD-binding potassium channel domain, CoA-binding, and ornithine cyclodeaminase-like domains. These domains have an alpha-beta-alpha configuration. NAD binding involves numerous hydrogen and van der Waals contacts.	86
187536	cd05193	AR_like_SDR_e	aldehyde reductase, flavonoid reductase, and related proteins, extended (e) SDRs. This subgroup contains aldehyde reductase and flavonoid reductase of the extended SDR-type and related proteins. Proteins in this subgroup have a complete SDR-type active site tetrad and a close match to the canonical extended SDR NADP-binding motif. Aldehyde reductase I (aka carbonyl reductase) is an NADP-binding SDR; it catalyzes  the NADP-dependent  reduction of ethyl 4-chloro-3-oxobutanoate to ethyl (R)-4-chloro-3-hydroxybutanoate. The related flavonoid reductases act in the NADP-dependent reduction of  flavonoids, ketone-containing plant secondary metabolites. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). 
In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	295
176179	cd05195	enoyl_red	enoyl reductase of polyketide synthase. Putative enoyl reductase of polyketide synthase. Polyketide synthases produce polyketides in a step-by-step mechanism that is similar to fatty acid synthesis. Enoyl reductase reduces a double bond to a single bond. Erythromycin is one example of a polyketide generated by 3 complex enzymes (megasynthases). 2-enoyl thioester reductase (ETR) catalyzes the NADPH-dependent conversion of trans-2-enoyl acyl carrier protein/coenzyme A (ACP/CoA) to acyl-(ACP/CoA) in fatty acid synthesis. In Candida tropicalis, 2-enoyl thioester reductase activity has been linked to the maintenance of mitochondrial respiratory function. This ETR family is a part of the medium chain dehydrogenase/reductase family, but lacks the zinc coordination sites characteristic of the alcohol dehydrogenases in this family. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes or ketones. Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site, and a structural zinc in a lobe of the catalytic domain. NAD(H) binding occurs in the cleft between the catalytic and coenzyme-binding domains, at the active site, and coenzyme binding induces a conformational closing of this cleft. 
Coenzyme binding typically precedes and contributes to substrate binding.	293
133425	cd05197	GH4_glycoside_hydrolases	Glycoside Hydrolases Family 4. Glycoside hydrolases cleave glycosidic bonds to release smaller sugars from oligo- or polysaccharides. Some bacteria simultaneously translocate and phosphorylate disaccharides via the phosphoenolpyruvate-dependent phosphotransferase system (PEP-PTS). After translocation, these phospho-disaccharides may be hydrolyzed by GH4 glycoside hydrolases. Other organisms (such as archaea and Thermotoga maritima) lack the PEP-PTS system, but have several enzymes normally associated with the PEP-PTS operon. GH4 family members include 6-phospho-beta-glucosidases, 6-phospho-alpha-glucosidases, alpha-glucosidases/alpha-glucuronidases (only from Thermotoga), and alpha-galactosidases. They require two cofactors, NAD+ and a divalent metal (Mn2+, Ni2+, Mg2+), for activity. Some also require reducing conditions. GH4 glycoside hydrolases are part of the NAD(P)-binding Rossmann fold superfamily, which includes a wide variety of protein families including the NAD(P)-binding domains of alcohol dehydrogenases, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate dehydrogenases, formate/glycerate dehydrogenases, siroheme synthases, 6-phosphogluconate dehydrogenases, amino acid dehydrogenases, repressor rex, and NAD-binding potassium channel domains, among others.	425
240622	cd05198	formate_dh_like	Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxy acid dehydrogenase family. Formate dehydrogenase, D-specific 2-hydroxy acid dehydrogenase, Phosphoglycerate Dehydrogenase, Lactate dehydrogenase, Thermostable Phosphite Dehydrogenase, and Hydroxy(phenyl)pyruvate reductase, among others, share a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. 2-hydroxyacid dehydrogenases are enzymes that catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomains but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. Formate dehydrogenase (FDH) catalyzes the NAD+-dependent oxidation of formate ion to carbon dioxide with the concomitant reduction of NAD+ to NADH. FDHs of this family contain no metal ions or prosthetic groups. Catalysis occurs through direct transfer of hydride ion to NAD+ without the stages of acid-base catalysis typically found in related dehydrogenases. FDHs are found in all methylotrophic microorganisms in energy production and in the stress responses of plants. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-Adenosylhomocysteine Hydrolase, among others. 
While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric.	302
240623	cd05199	SDH_like	Saccharopine Dehydrogenase like proteins. Saccharopine Dehydrogenase (SDH) and related proteins, including bifunctional lysine ketoglutarate reductase/SDH enzymes and N(5)-(carboxyethyl)ornithine synthases. SDH catalyzes the final step in the reversible NAD-dependent oxidative deamination of saccharopine to alpha-ketoglutarate and lysine, in the alpha-aminoadipate pathway of L-lysine biosynthesis. SDH is structurally related to formate dehydrogenase and similar enzymes, having a 2-domain structure in which a Rossmann-fold NAD(P)-binding domain is inserted within the linear sequence of a catalytic domain of related structure. Bifunctional lysine ketoglutarate reductase/SDH protein is a pair of enzymes linked on a single polypeptide chain that catalyze the initial, consecutive steps of lysine degradation. These proteins are related to the 2-domain saccharopine dehydrogenases.	319
133450	cd05211	NAD_bind_Glu_Leu_Phe_Val	NAD(P) binding domain of glutamate dehydrogenase, leucine dehydrogenase, phenylalanine dehydrogenase, and valine dehydrogenase. Amino acid dehydrogenase (DH) is a widely distributed family of enzymes that catalyzes the oxidative deamination of an amino acid to its keto acid and ammonia with concomitant reduction of NAD(P)+. This subfamily includes glutamate, leucine, phenylalanine, and valine DHs. Glutamate DH is a multi-domain enzyme that catalyzes the reaction from glutamate to 2-oxoglutarate and ammonia in the presence of NAD or NADP. It is present in all organisms. Enzymes involved in ammonia assimilation are typically NADP+-dependent, while those involved in glutamate catabolism are generally NAD+-dependent. As in other NAD+-dependent DHs, monomers in this family have 2 domains separated by a deep cleft. Here the C-terminal domain contains a modified NAD-binding Rossmann fold with 7 rather than the usual 6 beta strands and one strand antiparallel to the others. Amino acid DH-like NAD(P)-binding domains are members of the Rossmann fold superfamily and include glutamate, leucine, and phenylalanine DHs, methylene tetrahydrofolate DH, methylene-tetrahydromethanopterin DH, methylene-tetrahydrofolate DH/cyclohydrolase, Shikimate DH-like proteins, malate oxidoreductases, and glutamyl tRNA reductase. Amino acid DHs catalyze the deamination of amino acids to keto acids with NAD(P)+ as a cofactor. The NAD(P)-binding Rossmann fold superfamily includes a wide variety of protein families including NAD(P)-binding domains of alcohol DHs, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate DH, lactate/malate DHs, formate/glycerate DHs, siroheme synthases, 6-phosphogluconate DH, amino acid DHs, repressor rex, NAD-binding potassium channel domain, CoA-binding, and ornithine cyclodeaminase-like domains. These domains have an alpha-beta-alpha configuration. 
NAD binding involves numerous hydrogen and van der Waals contacts.	217
133451	cd05212	NAD_bind_m-THF_DH_Cyclohyd_like	NAD(P) binding domain of methylene-tetrahydrofolate dehydrogenase and methylene-tetrahydrofolate dehydrogenase/cyclohydrolase. NAD(P) binding domains of methylene-tetrahydrofolate dehydrogenase (m-THF DH) and m-THF DH/cyclohydrolase bifunctional enzymes (m-THF DH/cyclohydrolase). M-THF is a versatile carrier of activated one-carbon units. The major one-carbon folate donors are N-5 methyltetrahydrofolate, N5,N10-m-THF, and N10-formyltetrahydrofolate. The oxidation of the metabolic intermediate methylene-THF to methenyl-THF requires the enzyme m-THF DH. In addition, most DHs also have an associated cyclohydrolase activity which catalyzes its hydrolysis to N10-formyltetrahydrofolate. m-THF DH is typically found as part of a multifunctional protein in eukaryotes. NADP-dependent m-THF DH in mammals, birds and yeast are components of a trifunctional enzyme with DH, cyclohydrolase, and synthetase activities. Certain eukaryotic cells also contain a homodimeric bifunctional DH/cyclohydrolase form. In bacteria, mono-functional DH, as well as bifunctional DH/cyclohydrolase are found. In addition, yeast (S. cerevisiae) also expresses a monofunctional DH. M-THF DH, like other amino acid DH-like NAD(P)-binding domains, is a member of the Rossmann fold superfamily which includes glutamate, leucine, and phenylalanine DHs, m-THF DH, methylene-tetrahydromethanopterin DH, m-THF DH/cyclohydrolase, Shikimate DH-like proteins, malate oxidoreductases, and glutamyl tRNA reductase. Amino acid DHs catalyze the deamination of amino acids to keto acids with NAD(P)+ as a cofactor. 
The NAD(P)-binding Rossmann fold superfamily includes a wide variety of protein families including NAD(P)-binding domains of alcohol DHs, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate DH, lactate/malate DHs, formate/glycerate DHs, siroheme synthases, 6-phosphogluconate DH, amino acid DHs, repressor rex, NAD-binding potassium channel domain, CoA-binding, and ornithine cyclodeaminase-like domains. These domains have an alpha-beta-alpha configuration. NAD binding involves numerous hydrogen and van der Waals contacts.	140
133452	cd05213	NAD_bind_Glutamyl_tRNA_reduct	NADP-binding domain of glutamyl-tRNA reductase. Glutamyl-tRNA reductase catalyzes the conversion of glutamyl-tRNA to glutamate-1-semialdehyde, initiating the synthesis of tetrapyrrole. Whereas tRNAs are generally associated with peptide bond formation in protein translation, here the tRNA activates glutamate in the initiation of tetrapyrrole biosynthesis in archaea, plants and many bacteria. In the first step, activated glutamate is reduced to glutamate-1-semialdehyde via the NADPH dependent glutamyl-tRNA reductase. Glutamyl-tRNA reductase forms a V-shaped dimer. Each monomer has 3 domains: an N-terminal catalytic domain, a classic nucleotide binding domain, and a C-terminal dimerization domain. Although the representative structure 1GPJ lacks a bound NADPH, a theoretical binding pocket has been described (PMID 11172694). Amino acid dehydrogenase (DH)-like NAD(P)-binding domains are members of the Rossmann fold superfamily and include glutamate, leucine, and phenylalanine DHs, methylene tetrahydrofolate DH, methylene-tetrahydromethanopterin DH, methylene-tetrahydrofolate DH/cyclohydrolase, Shikimate DH-like proteins, malate oxidoreductases, and glutamyl tRNA reductase. Amino acid DHs catalyze the deamination of amino acids to keto acids with NAD(P)+ as a cofactor. The NAD(P)-binding Rossmann fold superfamily includes a wide variety of protein families including NAD(P)-binding domains of alcohol DHs, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate DH, lactate/malate DHs, formate/glycerate DHs, siroheme synthases, 6-phosphogluconate DH, amino acid DHs, repressor rex, NAD-binding potassium channel domain, CoA-binding, and ornithine cyclodeaminase-like domains. These domains have an alpha-beta-alpha configuration. NAD binding involves numerous hydrogen and van der Waals contacts.	311
187537	cd05226	SDR_e_a	Extended (e) and atypical (a) SDRs. Extended or atypical short-chain dehydrogenases/reductases (SDRs, aka tyrosine-dependent oxidoreductases) are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Atypical SDRs include biliverdin IX beta reductase (BVR-B,aka flavin reductase), NMRa (a negative transcriptional regulator of various fungi), progesterone 5-beta-reductase like proteins, phenylcoumaran benzylic ether and pinoresinol-lariciresinol reductases, phenylpropene synthases, eugenol synthase, triphenylmethane reductase, isoflavone reductases, and others. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). 
In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	176
187538	cd05227	AR_SDR_e	aldehyde reductase, extended (e) SDRs. This subgroup contains aldehyde reductase of the extended SDR-type and related proteins. Aldehyde reductase I (aka carbonyl reductase) is an NADP-binding SDR; it has an NADP-binding motif consensus that is slightly different from the canonical SDR form and lacks the Asn of the extended SDR active site tetrad. Aldehyde reductase I catalyzes the NADP-dependent  reduction of ethyl 4-chloro-3-oxobutanoate to ethyl (R)-4-chloro-3-hydroxybutanoate. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. 
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	301
187539	cd05228	AR_FR_like_1_SDR_e	uncharacterized subgroup of aldehyde reductase and flavonoid reductase related proteins, extended (e) SDRs. This subgroup contains proteins of unknown function related to aldehyde reductase and flavonoid reductase of the extended SDR-type. Aldehyde reductase I (aka carbonyl reductase) is an NADP-binding SDR; it has an NADP-binding motif consensus that is slightly different from the canonical SDR form and lacks the Asn of the extended SDR active site tetrad. Aldehyde reductase I catalyzes the NADP-dependent  reduction of ethyl 4-chloro-3-oxobutanoate to ethyl (R)-4-chloro-3-hydroxybutanoate. The related flavonoid reductases act in the NADP-dependent reduction of  flavonoids, ketone-containing plant secondary metabolites. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). 
In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	318
187540	cd05229	SDR_a3	atypical (a) SDRs, subgroup 3. These atypical SDR family members of unknown function have a glycine-rich NAD(P)-binding motif consensus that is very similar to the extended SDRs, GXXGXXG. Generally, this group has poor conservation of the active site tetrad. However, individual sequences do contain matches to the YXXXK active site motif, and generally have Tyr or Asn in place of the upstream Ser found in most SDRs. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Atypical SDRs include biliverdin IX beta reductase (BVR-B, aka flavin reductase), NMRa (a negative transcriptional regulator of various fungi), progesterone 5-beta-reductase like proteins, phenylcoumaran benzylic ether and pinoresinol-lariciresinol reductases, phenylpropene synthases, eugenol synthase, triphenylmethane reductase, isoflavone reductases, and others. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. 
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. In addition to the Rossmann fold core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	302
187541	cd05230	UGD_SDR_e	UDP-glucuronate decarboxylase (UGD) and related proteins, extended (e) SDRs. UGD catalyzes the formation of UDP-xylose from UDP-glucuronate; it is an extended-SDR, and has the characteristic glycine-rich NAD-binding pattern, TGXXGXXG, and active site tetrad.  Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. 
Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	305
187542	cd05231	NmrA_TMR_like_1_SDR_a	NmrA (a transcriptional regulator) and triphenylmethane reductase (TMR) like proteins, subgroup 1, atypical (a) SDRs. Atypical SDRs related to NMRa, TMR, and HSCARG (an NADPH sensor). This subgroup resembles the SDRs and has a partially conserved characteristic [ST]GXXGXXG NAD-binding motif, but lacks the conserved active site residues. NmrA is a negative transcriptional regulator of various fungi, involved in the post-translational modulation of the GATA-type transcription factor AreA. NmrA lacks the canonical GXXGXXG NAD-binding motif and has altered residues at the catalytic triad, including a Met instead of the critical Tyr residue. NmrA may bind nucleotides but appears to lack any dehydrogenase activity. HSCARG has been identified as a putative NADP-sensing molecule, and redistributes and restructures in response to NADPH/NADP ratios. Like NmrA, it lacks most of the active site residues of the SDR family, but has an NAD(P)-binding motif similar to the extended SDR family, GXXGXXG. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Atypical SDRs are distinct from classical SDRs. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. 
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. In addition to the Rossmann fold core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	259
187543	cd05232	UDP_G4E_4_SDR_e	UDP-glucose 4 epimerase, subgroup 4, extended (e) SDRs. UDP-glucose 4 epimerase (aka UDP-galactose-4-epimerase), is a homodimeric extended SDR. It catalyzes the NAD-dependent conversion of UDP-galactose to UDP-glucose, the final step in Leloir galactose synthesis. This subgroup is comprised of bacterial proteins, and includes the Staphylococcus aureus capsular polysaccharide Cap5N, which may have a role in the synthesis of UDP-N-acetyl-d-fucosamine. This subgroup has the characteristic active site tetrad and NAD-binding motif of the extended SDRs. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. 
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	303
212491	cd05233	SDR_c	classical (c) SDRs. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human prostaglandin dehydrogenase (PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, PGDH numbering) and/or an Asn (Asn-107, PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. 
Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	234
187545	cd05234	UDP_G4E_2_SDR_e	UDP-glucose 4 epimerase, subgroup 2, extended (e) SDRs. UDP-glucose 4 epimerase (aka UDP-galactose-4-epimerase), is a homodimeric extended SDR. It catalyzes the NAD-dependent conversion of UDP-galactose to UDP-glucose, the final step in Leloir galactose synthesis. This subgroup is comprised of archaeal and bacterial proteins, and has the characteristic active site tetrad and NAD-binding motif of the extended SDRs. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	305
187546	cd05235	SDR_e1	extended (e) SDRs, subgroup 1. This family consists of an SDR module of multidomain proteins identified as putative polyketide synthases, fatty acid synthases (FAS), and nonribosomal peptide synthases, among others. However, unlike the usual ketoreductase modules of FAS and polyketide synthase, these domains are related to the extended SDRs, and have canonical NAD(P)-binding motifs and an active site tetrad. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	290
187547	cd05236	FAR-N_SDR_e	fatty acyl CoA reductases (FARs), extended (e) SDRs. SDRs are Rossmann-fold NAD(P)H-binding proteins, many of which may function as fatty acyl CoA reductases (FAR), acting on medium and long chain fatty acids, and have been reported to be involved in diverse processes such as biosynthesis of insect pheromones, plant cuticular wax production, and mammalian wax biosynthesis. In Arabidopsis thaliana, proteins with this particular architecture have also been identified as the MALE STERILITY 2 (MS2) gene product, which is implicated in male gametogenesis. Mutations in MS2 inhibit the synthesis of exine (sporopollenin), rendering plants unable to reduce pollen wall fatty acids to corresponding alcohols. This N-terminal domain shares the catalytic triad (but not the upstream Asn) and characteristic NADP-binding motif of the extended SDR family. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. 
Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	320
187548	cd05237	UDP_invert_4-6DH_SDR_e	UDP-GlcNAc (UDP-linked N-acetylglucosamine) inverting 4,6-dehydratase, extended (e) SDRs. UDP-GlcNAc inverting 4,6-dehydratase was identified in Helicobacter pylori as the flaA1 gene product (FlaA1). FlaA1 is hexameric, possesses UDP-GlcNAc-inverting 4,6-dehydratase activity, and catalyzes the first step in the creation of a pseudaminic acid derivative in protein glycosylation. Although this subgroup has the NADP-binding motif characteristic of extended SDRs, its members tend to have a Met substituted for the active site Tyr found in most SDR families. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. 
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	287
187549	cd05238	Gne_like_SDR_e	Escherichia coli Gne (a nucleoside-diphosphate-sugar 4-epimerase)-like, extended (e) SDRs. Nucleoside-diphosphate-sugar 4-epimerase has the characteristic active site tetrad and NAD-binding motif of the extended SDR, and is related to more specifically defined epimerases such as UDP-glucose 4 epimerase (aka UDP-galactose-4-epimerase), which catalyzes the NAD-dependent conversion of UDP-galactose to UDP-glucose, the final step in Leloir galactose synthesis. This subgroup includes Escherichia coli 055:H7 Gne, a UDP-GlcNAc 4-epimerase, essential for O55 antigen synthesis. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. 
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	305
187550	cd05239	GDP_FS_SDR_e	GDP-fucose synthetase, extended (e) SDRs. GDP-fucose synthetase (aka 3, 5-epimerase-4-reductase) acts in the NADP-dependent synthesis of GDP-fucose from GDP-mannose. Two activities have been proposed for the same active site: epimerization and reduction. Proteins in this subgroup are extended SDRs, which have a characteristic active site tetrad and an NADP-binding motif, [AT]GXXGXXG, that is a close match to the archetypical form. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	300
187551	cd05240	UDP_G4E_3_SDR_e	UDP-glucose 4 epimerase (G4E), subgroup 3, extended (e) SDRs. Members of this bacterial subgroup are identified as possible sugar epimerases, such as UDP-glucose 4 epimerase. However, while the NAD(P)-binding motif is fairly well conserved, not all members retain the canonical active site tetrad of the extended SDRs. UDP-glucose 4 epimerase (aka UDP-galactose-4-epimerase), is a homodimeric extended SDR. It catalyzes the NAD-dependent conversion of UDP-galactose to UDP-glucose, the final step in Leloir galactose synthesis. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. 
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	306
187552	cd05241	3b-HSD-like_SDR_e	3beta-hydroxysteroid dehydrogenases (3b-HSD)-like, extended (e) SDRs. Extended SDR family domains belonging to this subgroup have the characteristic active site tetrad and a fairly well-conserved NAD(P)-binding motif. 3b-HSD catalyzes the NAD-dependent conversion of various steroids, such as pregnenolone to progesterone, or androstenediol to testosterone. This subgroup includes an unusual bifunctional 3b-HSD/C-4 decarboxylase from Arabidopsis thaliana, and Saccharomyces cerevisiae ERG26, a 3b-HSD/C-4 decarboxylase, involved in the synthesis of ergosterol, the major sterol of yeast. It also includes human 3 beta-HSD/HSD3B1 and C(27) 3beta-HSD/ [3beta-hydroxy-delta(5)-C(27)-steroid oxidoreductase; HSD3B7].  C(27) 3beta-HSD/HSD3B7 is a membrane-bound enzyme of the endoplasmic reticulum, that catalyzes the isomerization and oxidation of 7alpha-hydroxylated sterol intermediates, an early step in bile acid biosynthesis. Mutations in the human NSDHL (NAD(P)H steroid dehydrogenase-like protein) cause CHILD syndrome (congenital hemidysplasia with ichthyosiform nevus and limb defects), an X-linked dominant, male-lethal trait. Mutations in the human gene encoding C(27) 3beta-HSD underlie a rare autosomal recessive form of neonatal cholestasis. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. 
Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	331
187553	cd05242	SDR_a8	atypical (a) SDRs, subgroup 8. This subgroup contains atypical SDRs of unknown function. Proteins in this subgroup have a glycine-rich NAD(P)-binding motif consensus that resembles that of the extended SDRs, (GXXGXXG or GGXGXXG), but lacks the characteristic active site residues of the SDRs. A Cys often replaces the usual Lys of the YXXXK active site motif, while the upstream Ser is generally present and Arg replaces the usual Asn. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Atypical SDRs include biliverdin IX beta reductase (BVR-B,aka flavin reductase), NMRa (a negative transcriptional regulator of various fungi), progesterone 5-beta-reductase like proteins, phenylcoumaran benzylic ether and pinoresinol-lariciresinol reductases, phenylpropene synthases, eugenol synthase, triphenylmethane reductase, isoflavone reductases, and others. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. 
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. In addition to the Rossmann fold core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	296
187554	cd05243	SDR_a5	atypical (a) SDRs, subgroup 5. This subgroup contains atypical SDRs, some of which are identified as putative NAD(P)-dependent epimerases, one as a putative NAD-dependent epimerase/dehydratase. Atypical SDRs are distinct from classical SDRs. Members of this subgroup have a glycine-rich NAD(P)-binding motif that is very similar to the extended SDRs, GXXGXXG, and binds NADP. Generally, this subgroup has poor conservation of the active site tetrad; however, individual sequences do contain matches to the YXXXK active site motif, the upstream Ser, and there is a highly conserved Asp in place of the usual active site Asn throughout the subgroup. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Atypical SDRs include biliverdin IX beta reductase (BVR-B,aka flavin reductase), NMRa (a negative transcriptional regulator of various fungi), progesterone 5-beta-reductase like proteins, phenylcoumaran benzylic ether and pinoresinol-lariciresinol reductases, phenylpropene synthases, eugenol synthase, triphenylmethane reductase, isoflavone reductases, and others. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). 
In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. In addition to the Rossmann fold core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	203
187555	cd05244	BVR-B_like_SDR_a	biliverdin IX beta reductase (BVR-B, aka flavin reductase)-like proteins; atypical (a) SDRs. Human BVR-B catalyzes pyridine nucleotide-dependent production of bilirubin-IX beta during fetal development; in the adult BVR-B has flavin and ferric reductase activities. Human BVR-B catalyzes the reduction of FMN, FAD, and riboflavin. Recognition of flavin occurs mostly by hydrophobic interactions, accounting for the broad substrate specificity. Atypical SDRs are distinct from classical SDRs. BVR-B does not share the key catalytic triad, or conserved tyrosine typical of SDRs. The glycine-rich NADP-binding motif of BVR-B is GXXGXXG, which is similar but not identical to the pattern seen in extended SDRs. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Atypical SDRs include biliverdin IX beta reductase (BVR-B,aka flavin reductase), NMRa (a negative transcriptional regulator of various fungi), progesterone 5-beta-reductase like proteins, phenylcoumaran benzylic ether and pinoresinol-lariciresinol reductases, phenylpropene synthases, eugenol synthase, triphenylmethane reductase, isoflavone reductases, and others. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. 
Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. In addition to the Rossmann fold core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	207
187556	cd05245	SDR_a2	atypical (a) SDRs, subgroup 2. This subgroup contains atypical SDRs, one member is identified as Escherichia coli protein ybjT, function unknown. Atypical SDRs are distinct from classical SDRs. Members of this subgroup have a glycine-rich NAD(P)-binding motif consensus that generally matches the extended SDRs, TGXXGXXG, but lacks the characteristic active site residues of the SDRs. This subgroup has basic residues (HXXXR) in place of the active site motif YXXXK, these may have a catalytic role. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Atypical SDRs include biliverdin IX beta reductase (BVR-B,aka flavin reductase), NMRa (a negative transcriptional regulator of various fungi), progesterone 5-beta-reductase like proteins, phenylcoumaran benzylic ether and pinoresinol-lariciresinol reductases, phenylpropene synthases, eugenol synthase, triphenylmethane reductase, isoflavone reductases, and others. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. 
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. In addition to the Rossmann fold core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	293
187557	cd05246	dTDP_GD_SDR_e	dTDP-D-glucose 4,6-dehydratase, extended (e) SDRs. This subgroup contains dTDP-D-glucose 4,6-dehydratase and related proteins, members of the extended-SDR family, with the characteristic Rossmann fold core region, active site tetrad and NAD(P)-binding motif. dTDP-D-glucose 4,6-dehydratase is closely related to other sugar epimerases of the SDR family. dTDP-D-glucose 4,6-dehydratase catalyzes the second of four steps in the dTDP-L-rhamnose pathway (the dehydration of dTDP-D-glucose to dTDP-4-keto-6-deoxy-D-glucose) in the synthesis of L-rhamnose, a cell wall component of some pathogenic bacteria. In many Gram-negative bacteria, L-rhamnose is an important constituent of lipopolysaccharide O-antigen. The larger N-terminal portion of dTDP-D-glucose 4,6-dehydratase forms a Rossmann fold NAD-binding domain, while the C-terminus binds the sugar substrate. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. 
Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	315
187558	cd05247	UDP_G4E_1_SDR_e	UDP-glucose 4 epimerase, subgroup 1, extended (e) SDRs. UDP-glucose 4 epimerase (aka UDP-galactose-4-epimerase), is a homodimeric extended SDR. It catalyzes the NAD-dependent conversion of UDP-galactose to UDP-glucose, the final step in Leloir galactose synthesis. This subgroup has the characteristic active site tetrad and NAD-binding motif of the extended SDRs. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	323
187559	cd05248	ADP_GME_SDR_e	ADP-L-glycero-D-mannoheptose 6-epimerase (GME), extended (e) SDRs. This subgroup contains ADP-L-glycero-D-mannoheptose 6-epimerase, an extended SDR, which catalyzes the NAD-dependent interconversion of ADP-D-glycero-D-mannoheptose and ADP-L-glycero-D-mannoheptose.  This subgroup has the canonical active site tetrad and NAD(P)-binding motif. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	317
187560	cd05250	CC3_like_SDR_a	CC3(TIP30)-like, atypical (a) SDRs. Atypical SDRs in this subgroup include CC3 (also known as TIP30) which is implicated in tumor suppression. Atypical SDRs are distinct from classical SDRs. Members of this subgroup have a glycine rich NAD(P)-binding motif that resembles the extended SDRs, and have an active site triad of the SDRs (YXXXK and upstream Ser), although the upstream Asn of the usual SDR active site is substituted with Asp. For CC3, the Tyr of the triad is displaced compared to the usual SDRs and the protein is monomeric, both these observations suggest that the usual SDR catalytic activity is not present. NADP appears to serve an important role as a ligand, and may be important in the interaction with other macromolecules. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Atypical SDRs include biliverdin IX beta reductase (BVR-B,aka flavin reductase), NMRa (a negative transcriptional regulator of various fungi), progesterone 5-beta-reductase like proteins, phenylcoumaran benzylic ether and pinoresinol-lariciresinol reductases, phenylpropene synthases, eugenol synthase, triphenylmethane reductase, isoflavone reductases, and others. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. 
Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. In addition to the Rossmann fold core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	214
187561	cd05251	NmrA_like_SDR_a	NmrA (a transcriptional regulator) and HSCARG (an NADPH sensor) like proteins, atypical (a) SDRs. NmrA and HSCARG like proteins. NmrA is a negative transcriptional regulator of various fungi, involved in the post-translational modulation of the GATA-type transcription factor AreA. NmrA lacks the canonical GXXGXXG NAD-binding motif and has altered residues at the catalytic triad, including a Met instead of the critical Tyr residue. NmrA may bind nucleotides but appears to lack any dehydrogenase activity. HSCARG has been identified as a putative NADP-sensing molecule, and redistributes and restructures in response to NADPH/NADP ratios. Like NmrA, it lacks most of the active site residues of the SDR family, but has an NAD(P)-binding motif similar to the extended SDR family, GXXGXXG. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Atypical SDRs are distinct from classical SDRs. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
In addition to the Rossmann fold core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	242
187562	cd05252	CDP_GD_SDR_e	CDP-D-glucose 4,6-dehydratase, extended (e) SDRs. This subgroup contains CDP-D-glucose 4,6-dehydratase, an extended SDR, which catalyzes the conversion of CDP-D-glucose to CDP-4-keto-6-deoxy-D-glucose. This subgroup has the characteristic active site tetrad and NAD-binding motif of the extended SDRs. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	336
187563	cd05253	UDP_GE_SDE_e	UDP glucuronic acid epimerase, extended (e) SDRs. This subgroup contains UDP-D-glucuronic acid 4-epimerase, an extended SDR, which catalyzes the conversion of UDP-alpha-D-glucuronic acid to UDP-alpha-D-galacturonic acid. This group has the SDR's canonical catalytic tetrad and the TGxxGxxG NAD-binding motif of the extended SDRs. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	332
187564	cd05254	dTDP_HR_like_SDR_e	dTDP-6-deoxy-L-lyxo-4-hexulose reductase and related proteins, extended (e) SDRs. dTDP-6-deoxy-L-lyxo-4-hexulose reductase, an extended SDR, synthesizes dTDP-L-rhamnose from alpha-D-glucose-1-phosphate,  providing the precursor of L-rhamnose, an essential cell wall component of many pathogenic bacteria. This subgroup has the characteristic active site tetrad and NADP-binding motif. This subgroup also contains human MAT2B, the regulatory subunit of methionine adenosyltransferase (MAT); MAT catalyzes S-adenosylmethionine synthesis. The human gene encoding MAT2B encodes two major splicing variants which are induced in human cell liver cancer and regulate HuR, an mRNA-binding protein which stabilizes the mRNA of several cyclins, to affect cell proliferation. Both MAT2B variants include this extended SDR domain. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). 
In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	280
187565	cd05255	SQD1_like_SDR_e	UDP_sulfoquinovose_synthase (Arabidopsis thaliana SQD1 and related proteins), extended (e) SDRs. Arabidopsis thaliana UDP-sulfoquinovose-synthase (SQD1), an extended SDR, catalyzes the transfer of SO(3)(-) to UDP-glucose in the biosynthesis of plant sulfolipids. Members of this subgroup share the conserved SDR catalytic residues, and a partial match to the characteristic extended-SDR NAD-binding motif. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	382
187566	cd05256	UDP_AE_SDR_e	UDP-N-acetylglucosamine 4-epimerase, extended (e) SDRs. This subgroup contains UDP-N-acetylglucosamine 4-epimerase of Pseudomonas aeruginosa, WbpP, an extended SDR, that catalyzes the NAD+ dependent conversion of UDP-GlcNAc and UDP-GalNAc to UDP-Glc and UDP-Gal. This subgroup has the characteristic active site tetrad and NAD-binding motif of the extended SDRs. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	304
187567	cd05257	Arna_like_SDR_e	Arna decarboxylase_like, extended (e) SDRs. Decarboxylase domain of ArnA. ArnA is an enzyme involved in the modification of outer membrane protein lipid A of gram-negative bacteria. It is a bifunctional enzyme that catalyzes the NAD-dependent decarboxylation of UDP-glucuronic acid and N-10-formyltetrahydrofolate-dependent formylation of UDP-4-amino-4-deoxy-l-arabinose; its NAD-dependent decarboxylating activity is in the C-terminal 360 residues. This subgroup belongs to the extended SDR family; however, the NAD binding motif is not a perfect match and the upstream Asn of the canonical active site tetrad is not conserved. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. 
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	316
187568	cd05258	CDP_TE_SDR_e	CDP-tyvelose 2-epimerase, extended (e) SDRs. CDP-tyvelose 2-epimerase is a tetrameric SDR that catalyzes the conversion of CDP-D-paratose to CDP-D-tyvelose, the last step in tyvelose biosynthesis. This subgroup is a member of the extended SDR subfamily, with a characteristic active site tetrad and NAD-binding motif. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	337
187569	cd05259	PCBER_SDR_a	phenylcoumaran benzylic ether reductase (PCBER) like, atypical (a) SDRs. PCBER and pinoresinol-lariciresinol reductases are NADPH-dependent aromatic alcohol reductases, and are atypical members of the SDR family. Other proteins in this subgroup are identified as eugenol synthase. These proteins contain an N-terminus characteristic of NAD(P)-binding proteins and a small C-terminal domain presumed to be involved in substrate binding, but they do not have the conserved active site Tyr residue typically found in SDRs. Numerous other members have unknown functions. The glycine rich NADP-binding motif in this subgroup is of 2 forms: GXGXXG and G[GA]XGXXG; it tends to be atypical compared with the forms generally seen in classical or extended SDRs. The usual SDR active site tetrad is not present, but a critical active site Lys at the usual SDR position has been identified in various members, though other charged and polar residues are found at this position in this subgroup. Atypical SDR-related proteins retain the Rossmann fold of the SDRs, but have limited sequence identity and generally lack the catalytic properties of the archetypical members. Atypical SDRs include biliverdin IX beta reductase (BVR-B,aka flavin reductase), NMRa (a negative transcriptional regulator of various fungi), progesterone 5-beta-reductase like proteins, phenylcoumaran benzylic ether and pinoresinol-lariciresinol reductases, phenylpropene synthases, eugenol synthase, triphenylmethane reductase, isoflavone reductases, and others. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. 
Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. In addition to the Rossmann fold core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	282
187570	cd05260	GDP_MD_SDR_e	GDP-mannose 4,6 dehydratase, extended (e) SDRs. GDP-mannose 4,6 dehydratase, a homodimeric SDR, catalyzes the NADP(H)-dependent conversion of GDP-(D)-mannose to GDP-4-keto, 6-deoxy-(D)-mannose in the fucose biosynthesis pathway. These proteins have the canonical active site triad and NAD-binding pattern, however the active site Asn is often missing and may be substituted with Asp. A Glu residue has been identified as an important active site base. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	316
187571	cd05261	CAPF_like_SDR_e	capsular polysaccharide assembling protein (CAPF) like, extended (e) SDRs. This subgroup of extended SDRs, includes some members which have been identified as capsular polysaccharide assembling proteins, such as Staphylococcus aureus Cap5F which is involved in the biosynthesis of N-acetyl-l-fucosamine, a constituent of surface polysaccharide structures of S. aureus. This subgroup has the characteristic active site tetrad and NAD-binding motif of extended SDRs. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	248
187572	cd05262	SDR_a7	atypical (a) SDRs, subgroup 7. This subgroup contains atypical SDRs of unknown function. Members of this subgroup have a glycine-rich NAD(P)-binding motif consensus that matches the extended SDRs, TGXXGXXG, but lacks the characteristic active site residues of the SDRs. This subgroup has basic residues (HXXXR) in place of the active site motif YXXXK, these may have a catalytic role. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Atypical SDRs include biliverdin IX beta reductase (BVR-B,aka flavin reductase), NMRa (a negative transcriptional regulator of various fungi), progesterone 5-beta-reductase like proteins, phenylcoumaran benzylic ether and pinoresinol-lariciresinol reductases, phenylpropene synthases, eugenol synthase, triphenylmethane reductase, isoflavone reductases, and others. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. 
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. In addition to the Rossmann fold core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	291
187573	cd05263	MupV_like_SDR_e	Pseudomonas fluorescens MupV-like, extended (e) SDRs. This subgroup of extended SDR family domains have the characteristic active site tetrad and a well-conserved NAD(P)-binding motif. This subgroup is not well characterized, its members are annotated as having a variety of putative functions. One characterized member is Pseudomonas fluorescens MupV a protein  involved in the biosynthesis of Mupirocin, a polyketide-derived antibiotic. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	293
187574	cd05264	UDP_G4E_5_SDR_e	UDP-glucose 4-epimerase (G4E), subgroup 5, extended (e) SDRs. This subgroup partially conserves the characteristic active site tetrad and NAD-binding motif of the extended SDRs, and has been identified as possible UDP-glucose 4-epimerase (aka UDP-galactose 4-epimerase), a homodimeric member of the extended SDR family. UDP-glucose 4-epimerase catalyzes the NAD-dependent conversion of UDP-galactose to UDP-glucose, the final step in Leloir galactose synthesis. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	300
187575	cd05265	SDR_a1	atypical (a) SDRs, subgroup 1. Atypical SDRs in this subgroup are poorly defined and have been identified putatively as isoflavones reductase, sugar dehydratase, mRNA binding protein etc. Atypical SDRs are distinct from classical SDRs. Members of this subgroup retain the canonical active site triad (though not the upstream Asn found in most SDRs) but have an unusual putative glycine-rich NAD(P)-binding motif, GGXXXXG, in the usual location. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Atypical SDRs include biliverdin IX beta reductase (BVR-B,aka flavin reductase), NMRa (a negative transcriptional regulator of various fungi), progesterone 5-beta-reductase like proteins, phenylcoumaran benzylic ether and pinoresinol-lariciresinol reductases, phenylpropene synthases, eugenol synthase, triphenylmethane reductase, isoflavone reductases, and others. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. 
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. In addition to the Rossmann fold core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	250
187576	cd05266	SDR_a4	atypical (a) SDRs, subgroup 4. Atypical SDRs in this subgroup are poorly defined, one member is identified as a putative NAD-dependent epimerase/dehydratase. Atypical SDRs are distinct from classical SDRs. Members of this subgroup have a glycine-rich NAD(P)-binding motif that is related to, but is different from, the archetypical SDRs, GXGXXG. This subgroup also lacks most of the characteristic active site residues of the SDRs; however, the upstream Ser is present at the usual place, and some potential catalytic residues are present in place of the usual YXXXK active site motif. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Atypical SDRs include biliverdin IX beta reductase (BVR-B,aka flavin reductase), NMRa (a negative transcriptional regulator of various fungi), progesterone 5-beta-reductase like proteins, phenylcoumaran benzylic ether and pinoresinol-lariciresinol reductases, phenylpropene synthases, eugenol synthase, triphenylmethane reductase, isoflavone reductases, and others. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). 
In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. In addition to the Rossmann fold core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	251
187577	cd05267	SDR_a6	atypical (a) SDRs, subgroup 6. These atypical SDR family members of unknown function have only a partial match to a prototypical glycine-rich NAD(P)-binding motif consensus, GXXG, which conserves part of the motif of extended SDR. Furthermore, they lack the characteristic active site residues of the SDRs. This subgroup is related to phenylcoumaran benzylic ether reductase, an NADPH-dependent aromatic alcohol reductase. One member is identified as a putative NAD-dependent epimerase/dehydratase. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Atypical SDRs include biliverdin IX beta reductase (BVR-B,aka flavin reductase), NMRa (a negative transcriptional regulator of various fungi), progesterone 5-beta-reductase like proteins, phenylcoumaran benzylic ether and pinoresinol-lariciresinol reductases, phenylpropene synthases, eugenol synthase, triphenylmethane reductase, isoflavone reductases, and others. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. 
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. In addition to the Rossmann fold core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	203
187578	cd05269	TMR_SDR_a	triphenylmethane reductase (TMR)-like proteins, NMRa-like, atypical (a) SDRs. TMR is an atypical NADP-binding protein of the SDR family. It lacks the active site residues of the SDRs but has a glycine-rich NAD(P)-binding motif that matches the extended SDRs. Proteins in this subgroup, however, are more similar in length to the classical SDRs. TMR was identified as a reducer of triphenylmethane dyes, important environmental pollutants. This subgroup also includes Escherichia coli NADPH-dependent quinone oxidoreductase (QOR2), which catalyzes the two-electron reduction of quinone but is unlikely to play a major role in protecting against quinone cytotoxicity. Atypical SDRs are distinct from classical SDRs. Atypical SDRs include biliverdin IX beta reductase (BVR-B, aka flavin reductase), NMRa (a negative transcriptional regulator of various fungi), progesterone 5-beta-reductase like proteins, phenylcoumaran benzylic ether and pinoresinol-lariciresinol reductases, phenylpropene synthases, eugenol synthase, triphenylmethane reductase, isoflavone reductases, and others. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. 
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. In addition to the Rossmann fold core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	272
187579	cd05271	NDUFA9_like_SDR_a	NADH dehydrogenase (ubiquinone) 1 alpha subcomplex, subunit 9, 39 kDa, (NDUFA9)-like, atypical (a) SDRs. This subgroup of extended SDR-like proteins consists of atypical SDRs. They have a glycine-rich NAD(P)-binding motif similar to the typical SDRs, GXXGXXG, and have the YXXXK active site motif (though not the other residues of the SDR tetrad). Members identified include NDUFA9 (mitochondrial) and putative nucleoside-diphosphate-sugar epimerase. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Atypical SDRs include biliverdin IX beta reductase (BVR-B, aka flavin reductase), NMRa (a negative transcriptional regulator of various fungi), progesterone 5-beta-reductase like proteins, phenylcoumaran benzylic ether and pinoresinol-lariciresinol reductases, phenylpropene synthases, eugenol synthase, triphenylmethane reductase, isoflavone reductases, and others. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. 
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. In addition to the Rossmann fold core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	273
187580	cd05272	TDH_SDR_e	L-threonine dehydrogenase, extended (e) SDRs. This subgroup contains members identified as L-threonine dehydrogenase (TDH). TDH catalyzes the zinc-dependent formation of 2-amino-3-ketobutyrate from L-threonine via NAD(H)-dependent oxidation. This group is distinct from TDHs that are members of the medium chain dehydrogenase/reductase family. This group has the NAD-binding motif and active site tetrad of the extended SDRs. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	308
187581	cd05273	GME-like_SDR_e	Arabidopsis thaliana GDP-mannose-3',5'-epimerase (GME)-like, extended (e) SDRs. This subgroup of NDP-sugar epimerase/dehydratases are extended SDRs; they have the characteristic active site tetrad, and an NAD-binding motif: TGXXGXX[AG], which is a close match to the canonical NAD-binding motif. Members include Arabidopsis thaliana GDP-mannose-3',5'-epimerase (GME), which catalyzes the epimerization of two positions of GDP-alpha-D-mannose to form GDP-beta-L-galactose. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	328
187582	cd05274	KR_FAS_SDR_x	ketoreductase (KR) and fatty acid synthase (FAS), complex (x) SDRs. Ketoreductase, a module of the multidomain polyketide synthase (PKS), has 2 subdomains, each corresponding to an SDR family monomer. The C-terminal subdomain catalyzes the NADPH-dependent reduction of the beta-carbonyl of a polyketide to a hydroxyl group, a step in the biosynthesis of polyketides, such as erythromycin. The N-terminal subdomain, an interdomain linker, is a truncated Rossmann fold which acts to stabilize the catalytic subdomain. Unlike typical SDRs, the isolated domain does not oligomerize but is composed of 2 subdomains, each resembling an SDR monomer. The active site resembles that of typical SDRs, except that the usual positions of the catalytic Asn and Tyr are swapped, so that the canonical YXXXK motif changes to YXXXN. Modular PKSs are multifunctional structures in which the makeup recapitulates that found in (and may have evolved from) FAS. In some instances, such as porcine FAS, an enoyl reductase (ER) module is inserted between the sub-domains. Fatty acid synthesis occurs via the stepwise elongation of a chain (which is attached to acyl carrier protein, ACP) with 2-carbon units. Eukaryotic systems consist of large, multifunctional synthases (type I) while bacterial type II systems use single function proteins. Fungal fatty acid synthase uses a dodecamer of 6 alpha and 6 beta subunits. In mammalian type FAS cycles, ketoacyl synthase forms acetoacetyl-ACP which is reduced by the NADP-dependent beta-KR, forming beta-hydroxyacyl-ACP, which is in turn dehydrated by dehydratase to a beta-enoyl intermediate, which is reduced by NADP-dependent beta-ER. Polyketide synthesis also proceeds via the addition of 2-carbon units as in fatty acid synthesis. The complex SDR NADP-binding motif, GGXGXXG, is often present, but is not strictly conserved in each instance of the module. 
SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human prostaglandin dehydrogenase (PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, PGDH numbering) and/or an Asn (Asn-107, PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type KRs have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. 
Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	375
176180	cd05276	p53_inducible_oxidoreductase	PIG3 p53-inducible quinone oxidoreductase. PIG3 p53-inducible quinone oxidoreductase, a medium chain dehydrogenase/reductase family member, acts in the apoptotic pathway. PIG3 reduces ortho-quinones, but its apoptotic activity has been attributed to oxidative stress generation, since overexpression of PIG3 accumulates reactive oxygen species. PIG3 resembles the MDR family member quinone reductases, which catalyze the reduction of quinone to hydroxyquinone. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes or ketones. Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone. The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site, and a structural zinc in a lobe of the catalytic domain. NAD(H) binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. 
In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of  a histidine, the ribose of NAD, a serine, then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction.	323
176181	cd05278	FDH_like	Formaldehyde dehydrogenases. Formaldehyde dehydrogenase (FDH) is a member of the zinc-dependent/medium chain alcohol dehydrogenase family. Formaldehyde dehydrogenase (aka ADH3) may be the ancestral form of alcohol dehydrogenase, which evolved to detoxify formaldehyde. This CD contains glutathione-dependent FDH, glutathione-independent FDH, and related alcohol dehydrogenases. FDH converts formaldehyde and NAD(P) to formate and NAD(P)H. The initial step in this process is the spontaneous formation of an S-(hydroxymethyl)glutathione adduct from formaldehyde and glutathione, followed by FDH-mediated oxidation (and detoxification) of the adduct to S-formylglutathione. Unlike typical FDH, Pseudomonas putida aldehyde-dismutating FDH (PFDH) is glutathione-independent. The medium chain alcohol dehydrogenase family (MDR) has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The N-terminal region typically has an all-beta catalytic domain. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit.	347
176182	cd05279	Zn_ADH1	Liver alcohol dehydrogenase and related zinc-dependent alcohol dehydrogenases. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. There are 7 vertebrate ADH classes, 6 of which have been identified in humans. Class III, glutathione-dependent formaldehyde dehydrogenase, has been identified as the primordial form and exists in diverse species, including plants, micro-organisms, vertebrates, and invertebrates. Class I, typified by liver dehydrogenase, is an evolving form. Gene duplication and functional specialization of ADH into ADH classes and subclasses created numerous forms in vertebrates. For example, the A, B and C (formerly alpha, beta, gamma) human class I subunits have high overall structural similarity, but differ in the substrate binding pocket and therefore in substrate specificity. In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of a histidine (His-51), the ribose of NAD, a serine (Ser-48), then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone. The N-terminal catalytic domain has a distant homology to GroES. 
These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain.  NAD(H) binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding.	365
176183	cd05280	MDR_yhdh_yhfp	Yhdh and yhfp-like putative quinone oxidoreductases. Yhdh and yhfp-like putative quinone oxidoreductases (QOR). QOR catalyzes the conversion of a quinone + NAD(P)H to a hydroquinone + NAD(P)+. Quinones are cyclic diones derived from aromatic compounds. Membrane-bound QOR acts in the respiratory chains of bacteria and mitochondria, while soluble QOR acts to protect from toxic quinones (e.g. DT-diaphorase) or as a soluble eye-lens protein in some vertebrates (e.g. zeta-crystallin). QOR reduces quinones through a semi-quinone intermediate via a NAD(P)H-dependent single electron transfer. QOR is a member of the medium chain dehydrogenase/reductase family, but lacks the zinc-binding sites of the prototypical alcohol dehydrogenases of this group. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone. The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. NAD(H) binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. 
Coenzyme binding typically precedes and contributes to substrate binding. In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of  a histidine, the ribose of NAD, a serine, then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction.	325
176184	cd05281	TDH	Threonine dehydrogenase. L-threonine dehydrogenase (TDH) catalyzes the zinc-dependent formation of 2-amino-3-ketobutyrate from L-threonine via NAD(H)-dependent oxidation. TDH is a member of the zinc-requiring, medium chain NAD(H)-dependent alcohol dehydrogenase family (MDR). MDRs have a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. The N-terminal region typically has an all-beta catalytic domain. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria) and have 2 tightly bound zinc atoms per subunit. Sorbitol dehydrogenase and aldose reductase are NAD(+) binding proteins of the polyol pathway, which interconverts glucose and fructose.	341
176645	cd05282	ETR_like	2-enoyl thioester reductase-like. 2-enoyl thioester reductase (ETR) catalyzes the NADPH-dependent conversion of trans-2-enoyl acyl carrier protein/coenzyme A (ACP/CoA) to acyl-(ACP/CoA) in fatty acid synthesis. 2-enoyl thioester reductase activity has been linked in Candida tropicalis as essential in maintaining mitochondrial respiratory function. This ETR family is a part of the medium chain dehydrogenase/reductase family, but lacks the zinc coordination sites characteristic of the alcohol dehydrogenases in this family. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. NAD(H) binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. Candida tropicalis enoyl thioester reductase (Etr1p) catalyzes the NADPH-dependent reduction of trans-2-enoyl thioesters in mitochondrial fatty acid synthesis. Etr1p forms homodimers with each subunit containing a nucleotide-binding Rossmann fold domain and a catalytic domain.	323
176186	cd05283	CAD1	Cinnamyl alcohol dehydrogenases (CAD). Cinnamyl alcohol dehydrogenases (CAD), members of the medium chain dehydrogenase/reductase family, reduce cinnamaldehydes to cinnamyl alcohols in the last step of monolignol metabolism in plant cell walls. CAD binds 2 zinc ions and is NADPH-dependent. CAD family members are also found in non-plant species, e.g. in yeast where they have an aldehyde reductase activity. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~250 amino acids vs. the ~350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. Active site zinc has a catalytic role, while structural zinc aids in stability. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines.	337
176187	cd05284	arabinose_DH_like	D-arabinose dehydrogenase. This group contains arabinose dehydrogenase (AraDH) and related alcohol dehydrogenases. AraDH is a member of the medium chain dehydrogenase/reductase family and catalyzes the NAD(P)-dependent oxidation of D-arabinose and other pentoses, the initial step in the metabolism of D-arabinose into 2-oxoglutarate. Like the alcohol dehydrogenases, AraDH binds a zinc in the catalytic cleft as well as a distal structural zinc. AraDH forms homotetramers as a dimer of dimers. Compared to the canonical ADH site, AraDH replaces a conserved catalytic His with Arg. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone. The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. NAD(H) binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. 
In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of a histidine, the ribose of NAD, a serine, then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction.	340
176188	cd05285	sorbitol_DH	Sorbitol dehydrogenase. Sorbitol dehydrogenase (SDH) and aldose reductase are NAD(+) binding proteins of the polyol pathway, which interconverts glucose and fructose. Sorbitol dehydrogenase is tetrameric and has a single catalytic zinc per subunit. Aldose reductase catalyzes the NADP(H)-dependent conversion of glucose to sorbitol, and SDH uses NAD(H) in the conversion of sorbitol to fructose. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. The medium chain alcohol dehydrogenase family (MDR) has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The N-terminal region typically has an all-beta catalytic domain. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit.	343
176189	cd05286	QOR2	Quinone oxidoreductase (QOR). Quinone oxidoreductase (QOR) and 2-haloacrylate reductase. QOR catalyzes the conversion of a quinone + NAD(P)H to a hydroquinone + NAD(P)+. Quinones are cyclic diones derived from aromatic compounds. Membrane-bound QOR acts in the respiratory chains of bacteria and mitochondria, while soluble QOR acts to protect from toxic quinones (e.g. DT-diaphorase) or as a soluble eye-lens protein in some vertebrates (e.g. zeta-crystallin). QOR reduces quinones through a semi-quinone intermediate via a NAD(P)H-dependent single electron transfer. QOR is a member of the medium chain dehydrogenase/reductase family, but lacks the zinc-binding sites of the prototypical alcohol dehydrogenases of this group. 2-haloacrylate reductase, a member of this subgroup, catalyzes the NADPH-dependent reduction of a carbon-carbon double bond in organohalogen compounds. Although similar to QOR, Burkholderia 2-haloacrylate reductase does not act on the quinones 1,4-benzoquinone and 1,4-naphthoquinone. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone. The N-terminal catalytic domain has a distant homology to GroES. 
These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain.  NAD(H)  binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of  a histidine, the ribose of NAD, a serine, then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction.	320
176190	cd05288	PGDH	Prostaglandin dehydrogenases. Prostaglandins and related eicosanoids are metabolized by the oxidation of the 15(S)-hydroxyl group of the NAD+-dependent (type I 15-PGDH) 15-prostaglandin dehydrogenase (15-PGDH) followed by reduction by NADPH/NADH-dependent (type II 15-PGDH) delta-13 15-prostaglandin reductase (13-PGR) to 15-keto-13,14,-dihydroprostaglandins. 13-PGR is a bifunctional enzyme, since it also has leukotriene B(4) 12-hydroxydehydrogenase activity. These 15-PGDH and related enzymes are members of the medium chain dehydrogenase/reductase family. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH.  MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases  (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES.	329
176191	cd05289	MDR_like_2	alcohol dehydrogenase and quinone reductase-like medium chain dehydrogenases/reductases. Members identified as zinc-dependent alcohol dehydrogenases and quinone oxidoreductase. QOR catalyzes the conversion of a quinone + NAD(P)H to a hydroquinone + NAD(P)+. Quinones are cyclic diones derived from aromatic compounds. Membrane bound QOR acts in the respiratory chains of bacteria and mitochondria, while soluble QOR acts to protect from toxic quinones (e.g. DT-diaphorase) or as a soluble eye-lens protein in some vertebrates (e.g. zeta-crystallin). QOR reduces quinones through a semi-quinone intermediate via a NAD(P)H-dependent single electron transfer. QOR is a member of the medium chain dehydrogenase/reductase family, but lacks the zinc-binding sites of the prototypical alcohol dehydrogenases of this group. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone. The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. NAD(H) binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of a histidine, the ribose of NAD, a serine, then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction.	309
133426	cd05290	LDH_3	A subgroup of L-lactate dehydrogenases. L-lactate dehydrogenases (LDH) are tetrameric enzymes catalyzing the last step of glycolysis in which pyruvate is converted to L-lactate. This subgroup is composed of some bacterial LDHs from firmicutes, gammaproteobacteria, and actinobacteria. Vertebrate LDHs are non-allosteric, but some bacterial LDHs are activated by an allosteric effector such as fructose-1,6-bisphosphate. LDHs are part of the NAD(P)-binding Rossmann fold superfamily, which includes a wide variety of protein families including the NAD(P)-binding domains of alcohol dehydrogenases, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate dehydrogenases, formate/glycerate dehydrogenases, siroheme synthases, 6-phosphogluconate dehydrogenase, aminoacid dehydrogenases, repressor rex, and NAD-binding potassium channel domains, among others.	307
133427	cd05291	HicDH_like	L-2-hydroxyisocapronate dehydrogenases and some bacterial L-lactate dehydrogenases. L-2-hydroxyisocapronate dehydrogenase (HicDH) catalyzes the conversion of a variety of 2-oxo carboxylic acids with medium-sized aliphatic or aromatic side chains. This subfamily is composed of HicDHs and some bacterial L-lactate dehydrogenases (LDH). LDHs catalyze the last step of glycolysis in which pyruvate is converted to L-lactate. Bacterial LDHs can be non-allosteric or may be activated by an allosteric effector such as fructose-1,6-bisphosphate. Members of this subfamily with known structures such as the HicDH of Lactobacillus confusus, the non-allosteric LDH of Lactobacillus pentosus, and the allosteric LDH of Bacillus stearothermophilus, show that they exist as homotetramers. The HicDH-like subfamily is part of the NAD(P)-binding Rossmann fold superfamily, which includes a wide variety of protein families including the NAD(P)-binding domains of alcohol dehydrogenases, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate dehydrogenases, formate/glycerate dehydrogenases, siroheme synthases, 6-phosphogluconate dehydrogenases, aminoacid dehydrogenases, repressor rex, and NAD-binding potassium channel domains, among others.	306
133428	cd05292	LDH_2	A subgroup of L-lactate dehydrogenases. L-lactate dehydrogenases (LDH) are tetrameric enzymes catalyzing the last step of glycolysis in which pyruvate is converted to L-lactate. This subgroup is composed predominantly of bacterial LDHs and a few fungal LDHs. Bacterial LDHs may be non-allosteric or may be activated by an allosteric effector such as fructose-1,6-bisphosphate. LDHs are part of the NAD(P)-binding Rossmann fold superfamily, which includes a wide variety of protein families including the NAD(P)-binding domains of alcohol dehydrogenases, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate dehydrogenases, formate/glycerate dehydrogenases, siroheme synthases, 6-phosphogluconate dehydrogenases, aminoacid dehydrogenases, repressor rex, and NAD-binding potassium channel domains, among others.	308
133429	cd05293	LDH_1	A subgroup of L-lactate dehydrogenases. L-lactate dehydrogenases (LDH) are tetrameric enzymes catalyzing the last step of glycolysis in which pyruvate is converted to L-lactate. This subgroup is composed of eukaryotic LDHs. Vertebrate LDHs are non-allosteric. This is in contrast to some bacterial LDHs that are activated by an allosteric effector such as fructose-1,6-bisphosphate. LDHs are part of the NAD(P)-binding Rossmann fold superfamily, which includes a wide variety of protein families including the NAD(P)-binding domains of alcohol dehydrogenases, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate dehydrogenases, formate/glycerate dehydrogenases, siroheme synthases, 6-phosphogluconate dehydrogenases, aminoacid dehydrogenases, repressor rex, and NAD-binding potassium channel domains, among others.	312
133430	cd05294	LDH-like_MDH_nadp	A lactate dehydrogenases-like structure with malate dehydrogenase enzymatic activity. The LDH-like MDH proteins have a lactate dehydrogenase-like (LDH-like) structure and malate dehydrogenase (MDH) enzymatic activity. This subgroup is composed of some archaeal LDH-like MDHs that prefer NADP(H) rather than NAD(H) as a cofactor. One member, MJ0490 from Methanococcus jannaschii, has been observed to form dimers and tetramers during crystallization, although it is believed to exist primarily as a tetramer in solution. In addition to its MDH activity, MJ0490 also possesses fructose-1,6-bisphosphate-activated LDH activity. Members of this subgroup have a higher sequence similarity to LDHs than to other MDHs. LDH catalyzes the last step of glycolysis in which pyruvate is converted to L-lactate. MDH is one of the key enzymes in the citric acid cycle, facilitating both the conversion of malate to oxaloacetate and replenishing levels of oxalacetate by reductive carboxylation of pyruvate. The LDH-like MDHs are part of the NAD(P)-binding Rossmann fold superfamily, which includes a wide variety of protein families including the NAD(P)-binding domains of alcohol dehydrogenases, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate dehydrogenases, formate/glycerate dehydrogenases, siroheme synthases, 6-phosphogluconate dehydrogenase, aminoacid dehydrogenases, repressor rex, and NAD-binding potassium channel domains, among others.	309
133431	cd05295	MDH_like	Malate dehydrogenase-like. These MDH-like proteins are related to other groups in the MDH family but do not have conserved substrate and cofactor binding residues. MDH is one of the key enzymes in the citric acid cycle, facilitating both the conversion of malate to oxaloacetate and replenishing levels of oxalacetate by reductive carboxylation of pyruvate. Members of this subgroup are uncharacterized MDH-like proteins from animals. They are part of the NAD(P)-binding Rossmann fold superfamily, which includes a wide variety of protein families including the NAD(P)-binding domains of alcohol dehydrogenases, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate dehydrogenases, formate/glycerate dehydrogenases, siroheme synthases, 6-phosphogluconate dehydrogenases, aminoacid dehydrogenases, repressor rex, and NAD-binding potassium channel domains, among others.	452
133432	cd05296	GH4_P_beta_glucosidase	Glycoside Hydrolases Family 4; Phospho-beta-glucosidase. Some bacteria simultaneously translocate and phosphorylate disaccharides via the phosphoenolpyruvate-dependent phosphotransferase system (PEP-PTS). After translocation, these phospho-disaccharides may be hydrolyzed by the GH4 glycoside hydrolases such as the phospho-beta-glucosidases. Other organisms (such as archaea and Thermotoga maritima) lack the PEP-PTS system, but have several enzymes normally associated with the PEP-PTS operon. The 6-phospho-beta-glucosidase from Thermotoga maritima hydrolyzes cellobiose 6-phosphate (6P) into glucose-6P and glucose, in an NAD+ and Mn2+ dependent fashion. The Escherichia coli 6-phospho-beta-glucosidase (also called celF) hydrolyzes a variety of phospho-beta-glucosides including cellobiose-6P, salicin-6P, arbutin-6P, and gentiobiose-6P. Phospho-beta-glucosidases are part of the NAD(P)-binding Rossmann fold superfamily, which includes a wide variety of protein families including the NAD(P)-binding domains of alcohol dehydrogenases, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate dehydrogenases, formate/glycerate dehydrogenases, siroheme synthases, 6-phosphogluconate dehydrogenases, aminoacid dehydrogenases, repressor rex, and NAD-binding potassium channel domains, among others.	419
133433	cd05297	GH4_alpha_glucosidase_galactosidase	Glycoside Hydrolases Family 4; Alpha-glucosidases and alpha-galactosidases. linked to 3D structure	423
133434	cd05298	GH4_GlvA_pagL_like	Glycoside Hydrolases Family 4; GlvA- and pagL-like glycosidases. Bacillus subtilis GlvA and Clostridium acetobutylicum pagL are 6-phospho-alpha-glucosidase, catalyzing the hydrolysis of alpha-glucopyranoside bonds to release glucose from oligosaccharides. The substrate specificities of other members of this subgroup are unknown. Some bacteria simultaneously translocate and phosphorylate disaccharides via the phosphoenolpyruvate-dependent phosphotransferase system (PEP_PTS).  After translocation, these phospho-disaccharides may be hydrolyzed by the GH4 glycoside hydrolases, which include 6-phospho-beta-glucosidases, 6-phospho-alpha-glucosidases, alpha-glucosidases/alpha-glucuronidases (only from Thermotoga), and alpha-galactosidases. Members of this subfamily are part of the NAD(P)-binding Rossmann fold superfamily, which includes a wide variety of protein families including the NAD(P)-binding domains of alcohol dehydrogenases, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate dehydrogenases, formate/glycerate dehydrogenases, siroheme synthases, 6-phosphogluconate dehydrogenases, aminoacid dehydrogenases, repressor rex, and NAD-binding potassium channel domains, among others.	437
240624	cd05299	CtBP_dh	C-terminal binding protein (CtBP), D-isomer-specific 2-hydroxyacid dehydrogenases related repressor. The transcriptional corepressor CtBP is a dehydrogenase with sequence and structural similarity to the d2-hydroxyacid dehydrogenase family. CtBP was initially identified as a protein that bound the PXDLS sequence at the adenovirus E1A C terminus, causing the loss of CR-1-mediated transactivation. CtBP binds NAD(H) within a deep cleft, undergoes a conformational change upon NAD binding, and has NAD-dependent dehydrogenase activity.	312
240625	cd05300	2-Hacid_dh_1	Putative D-isomer specific 2-hydroxyacid dehydrogenase. 2-Hydroxyacid dehydrogenases catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomains but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. Formate dehydrogenase (FDH) catalyzes the NAD+-dependent oxidation of formate ion to carbon dioxide with the concomitant reduction of NAD+ to NADH. FDHs of this family contain no metal ions or prosthetic groups. Catalysis occurs through direct transfer of the hydride ion to NAD+ without the stages of acid-base catalysis typically found in related dehydrogenases. FDHs are found in all methylotrophic microorganisms in energy production and in the stress responses of plants.	313
240626	cd05301	GDH	D-glycerate dehydrogenase/hydroxypyruvate reductase (GDH). D-glycerate dehydrogenase (GDH, also known as hydroxypyruvate reductase, HPR) catalyzes the reversible reaction of (R)-glycerate + NAD+ to hydroxypyruvate + NADH + H+. In humans, HPR deficiency causes primary hyperoxaluria type 2, characterized by over-excretion of L-glycerate and oxalate in the urine, possibly due to an imbalance in competition with L-lactate dehydrogenase, another formate dehydrogenase (FDH)-like enzyme. GDH, like FDH and other members of the D-specific hydroxyacid dehydrogenase family that also includes L-alanine dehydrogenase and S-adenosylhomocysteine hydrolase, typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann-fold NAD+ binding form, despite often low sequence identity. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric.	309
240627	cd05302	FDH	NAD-dependent Formate Dehydrogenase (FDH). NAD-dependent formate dehydrogenase (FDH) catalyzes the NAD+-dependent oxidation of a formate anion to carbon dioxide coupled with the reduction of NAD+ to NADH. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxy acid dehydrogenase family have 2 highly similar subdomains of the alpha/beta form, with NAD binding occurring in the cleft between subdomains. NAD contacts are primarily to the Rossmann-fold NAD-binding domain which is inserted within the linear sequence of the more diverse flavodoxin-like catalytic subdomain. Some related proteins have similar structural subdomains but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. FDHs of this family contain no metal ions or prosthetic groups. Catalysis occurs through direct transfer of the hydride ion to NAD+ without the stages of acid-base catalysis typically found in related dehydrogenases. FDHs are found in all methylotrophic microorganisms in energy production from C1 compounds such as methanol, and in the stress responses of plants. NAD-dependent FDH is useful in cofactor regeneration in asymmetrical biocatalytic reduction processes, where FDH irreversibly oxidizes formate to carbon dioxide, while reducing the oxidized form of the cofactor to the reduced form.	348
240628	cd05303	PGDH_2	Phosphoglycerate dehydrogenase (PGDH) NAD-binding and catalytic domains. Phosphoglycerate dehydrogenase (PGDH) catalyzes the initial step in the biosynthesis of L-serine from D-3-phosphoglycerate. PGDH comes in 3 distinct structural forms, with this first group being related to 2-hydroxy acid dehydrogenases, sharing structural similarity to formate and glycerate dehydrogenases. PGDH in E. coli and Mycobacterium tuberculosis form tetramers, with subunits containing a Rossmann-fold NAD binding domain. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-Adenosylhomocysteine Hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence.	301
240629	cd05304	Rubrum_tdh	Rubrum transdehydrogenase NAD-binding and catalytic domains. Transhydrogenases found in bacterial and inner mitochondrial membranes link NAD(P)(H)-dependent redox reactions to proton translocation. The energy of the proton electrochemical gradient (delta-p), generated by the respiratory electron transport chain, is consumed by transhydrogenase in NAD(P)+ reduction. Transhydrogenase is likely involved in the regulation of the citric acid cycle. Rubrum transhydrogenase has 3 components, dI, dII, and dIII. dII spans the membrane while dI and dIII protrude on the cytoplasmic/matrix side. DI contains 2 domains in Rossmann-like folds, linked by a long alpha helix, and contains a NAD binding site. Two dI polypeptides (represented in this sub-family) spontaneously form a heterotrimer with dIII in the absence of dII. In the heterotrimer, both dI chains may bind NAD, but only one is well-ordered. dIII also binds a well-ordered NADP, but in a different orientation than a classical Rossmann domain.	363
240630	cd05305	L-AlaDH	Alanine dehydrogenase NAD-binding and catalytic domains. Alanine dehydrogenase (L-AlaDH) catalyzes the NAD-dependent conversion of pyruvate to L-alanine via reductive amination. Like formate dehydrogenase and related enzymes, L-AlaDH is comprised of 2 domains connected by a long alpha helical stretch, each resembling a Rossmann fold NAD-binding domain. The NAD-binding domain is inserted within the linear sequence of the more divergent catalytic domain. Ligand binding and active site residues are found in the cleft between the subdomains. L-AlaDH is typically hexameric and is critical in carbon and nitrogen metabolism in micro-organisms.	359
133453	cd05311	NAD_bind_2_malic_enz	NAD(P) binding domain of malic enzyme (ME), subgroup 2. Malic enzyme (ME), a member of the amino acid dehydrogenase (DH)-like domain family, catalyzes the oxidative decarboxylation of L-malate to pyruvate in the presence of cations (typically  Mg++ or Mn++) with the concomitant reduction of cofactor NAD+ or NADP+.  ME has been found in all organisms, and plays important roles in diverse metabolic pathways such as photosynthesis and lipogenesis. This enzyme generally forms homotetramers. The conversion of malate to pyruvate by ME typically involves oxidation of malate to produce oxaloacetate, followed by decarboxylation of oxaloacetate to produce pyruvate and CO2.  This subfamily consists primarily of archaeal and bacterial ME.  Amino acid DH-like NAD(P)-binding domains are members of the Rossmann fold superfamily and include glutamate, leucine, and phenylalanine DHs, methylene tetrahydrofolate DH, methylene-tetrahydromethanopterin DH, methylene-tetrahydropholate DH/cyclohydrolase, Shikimate DH-like proteins, malate oxidoreductases, and glutamyl tRNA reductase. Amino acid DHs catalyze the deamination of amino acids to keto acids with NAD(P)+ as a cofactor. The NAD(P)-binding Rossmann fold superfamily includes a wide variety of protein families including NAD(P)- binding domains of alcohol DHs, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate DH, lactate/malate DHs, formate/glycerate DHs, siroheme synthases, 6-phosphogluconate DH, amino acid DHs, repressor rex, NAD-binding potassium channel  domain, CoA-binding, and ornithine cyclodeaminase-like domains. These domains have an alpha-beta-alpha configuration. NAD binding involves numerous hydrogen and van der Waals contacts.	226
133454	cd05312	NAD_bind_1_malic_enz	NAD(P) binding domain of malic enzyme (ME), subgroup 1. Malic enzyme (ME), a member of the amino acid dehydrogenase (DH)-like domain family, catalyzes the oxidative decarboxylation of L-malate to pyruvate in the presence of cations (typically  Mg++ or Mn++) with the concomitant reduction of cofactor NAD+ or NADP+.  ME has been found in all organisms, and plays important roles in diverse metabolic pathways such as photosynthesis and lipogenesis. This enzyme generally forms homotetramers. The conversion of malate to pyruvate by ME typically involves oxidation of malate to produce oxaloacetate, followed by decarboxylation of oxaloacetate to produce pyruvate and CO2.  This subfamily consists of eukaryotic and bacterial ME.  Amino acid DH-like NAD(P)-binding domains are members of the Rossmann fold superfamily and include glutamate, leucine, and phenylalanine DHs, methylene tetrahydrofolate DH, methylene-tetrahydromethanopterin DH, methylene-tetrahydropholate DH/cyclohydrolase, Shikimate DH-like proteins, malate oxidoreductases, and glutamyl tRNA reductase. Amino acid DHs catalyze the deamination of amino acids to keto acids with NAD(P)+ as a cofactor. The NAD(P)-binding Rossmann fold superfamily includes a wide variety of protein families including NAD(P)- binding domains of alcohol DHs, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate DH, lactate/malate DHs, formate/glycerate DHs, siroheme synthases, 6-phosphogluconate DH, amino acid DHs, repressor rex, NAD-binding potassium channel  domain, CoA-binding, and ornithine cyclodeaminase-like domains. These domains have an alpha-beta-alpha configuration. NAD binding involves numerous hydrogen and van der Waals contacts.	279
133455	cd05313	NAD_bind_2_Glu_DH	NAD(P) binding domain of glutamate dehydrogenase, subgroup 2. Amino acid dehydrogenase (DH) is a widely distributed family of enzymes that catalyzes the oxidative deamination of an amino acid to its keto acid and ammonia with concomitant reduction of NADP+. Glutamate DH is a multidomain enzyme that catalyzes the reaction from glutamate to 2-oxoglutarate and ammonia in the presence of NAD or NADP. It is present in all organisms. Enzymes involved in ammonia assimilation are typically NADP+-dependent, while those involved in glutamate catabolism are generally NAD+-dependent. Amino acid DH-like NAD(P)-binding domains are members of the Rossmann fold superfamily and include glutamate, leucine, and phenylalanine DHs, methylene tetrahydrofolate DH, methylene-tetrahydromethanopterin DH, methylene-tetrahydropholate DH/cyclohydrolase, Shikimate DH-like proteins, malate oxidoreductases, and glutamyl tRNA reductase. Amino acid DHs catalyze the deamination of amino acids to keto acids with NAD(P)+ as a cofactor. The NAD(P)-binding Rossmann fold superfamily includes a wide variety of protein families including NAD(P)- binding domains of alcohol DHs, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate DH, lactate/malate DHs, formate/glycerate DHs, siroheme synthases, 6-phosphogluconate DH, amino acid DHs, repressor rex, NAD-binding potassium channel domain, CoA-binding, and ornithine cyclodeaminase-like domains. These domains have an alpha-beta-alpha configuration. NAD binding involves numerous hydrogen and van der Waals contacts.	254
187583	cd05322	SDH_SDR_c_like	Sorbitol 6-phosphate dehydrogenase (SDH), classical (c) SDRs. Sorbitol 6-phosphate dehydrogenase (SDH, aka glucitol 6-phosphate dehydrogenase) catalyzes the NAD-dependent interconversion of D-fructose 6-phosphate to D-sorbitol 6-phosphate. SDH is a member of the classical SDRs, with the characteristic catalytic tetrad, but without a complete match to the typical NAD-binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	257
187584	cd05323	ADH_SDR_c_like	insect type alcohol dehydrogenase (ADH)-like, classical (c) SDRs. This subgroup contains insect type ADH, and 15-hydroxyprostaglandin dehydrogenase (15-PGDH) type I; these proteins are classical SDRs. ADH catalyzes the NAD+-dependent oxidation of alcohols to aldehydes/ketones. This subgroup is distinct from the zinc-dependent alcohol dehydrogenases of the medium chain dehydrogenase/reductase family, and evolved in fruit flies to allow the digestion of fermenting fruit. 15-PGDH catalyzes the NAD-dependent interconversion of (5Z,13E)-(15S)-11alpha,15-dihydroxy-9-oxoprost-13-enoate and (5Z,13E)-11alpha-hydroxy-9,15-dioxoprost-13-enoate, and has a typical SDR glycine-rich NAD-binding motif, which is not fully present in ADH. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	244
187585	cd05324	carb_red_PTCR-like_SDR_c	Porcine testicular carbonyl reductase (PTCR)-like, classical (c) SDRs. PTCR is a classical SDR which catalyzes the NADPH-dependent reduction of ketones on steroids and prostaglandins. Unlike most SDRs, PTCR functions as a monomer. This subgroup also includes human carbonyl reductase 1 (CBR1) and CBR3. CBR1 is an NADPH-dependent SDR with broad substrate specificity and may be responsible for the in vivo reduction of quinones, prostaglandins, and other carbonyl-containing compounds. In addition it includes poppy NADPH-dependent salutaridine reductase which catalyzes the stereospecific reduction of salutaridine to 7(S)-salutaridinol in the biosynthesis of morphine, and Arabidopsis SDR1,a menthone reductase, which catalyzes the reduction of menthone to neomenthol, a compound with antimicrobial activity; SDR1  can also carry out neomenthol oxidation. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). 
In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	225
187586	cd05325	carb_red_sniffer_like_SDR_c	carbonyl reductase sniffer-like, classical (c) SDRs. Sniffer is an NADPH-dependent carbonyl reductase of the classical SDR family. Studies in Drosophila melanogaster implicate Sniffer in the prevention of neurodegeneration due to aging and oxidative-stress. This subgroup also includes Rhodococcus sp. AD45 IsoH, which is an NAD-dependent 1-hydroxy-2-glutathionyl-2-methyl-3-butene dehydrogenase involved in isoprene metabolism, Aspergillus nidulans StcE encoded by a gene which is part of a proposed sterigmatocystin biosynthesis gene cluster, Bacillus circulans SANK 72073 BtrF encoded by a gene found in the butirosin biosynthesis gene cluster, and Aspergillus parasiticus nor-1 involved in the biosynthesis of aflatoxins. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). 
In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	233
187587	cd05326	secoisolariciresinol-DH_like_SDR_c	secoisolariciresinol dehydrogenase (secoisolariciresinol-DH)-like, classical (c) SDRs. Podophyllum secoisolariciresinol-DH is a homo tetrameric, classical SDR that catalyzes the NAD-dependent conversion of (-)-secoisolariciresinol to (-)-matairesinol via a (-)-lactol intermediate. (-)-Matairesinol is an intermediate to various 8'-lignans, including the cancer-preventive mammalian lignan, and those involved in vascular plant defense. This subgroup also includes rice momilactone A synthase which catalyzes the conversion of 3beta-hydroxy-9betaH-pimara-7,15-dien-19,6beta-olide into momilactone A, Arabidopsis ABA2 which during abscisic acid (ABA) biosynthesis, catalyzes the conversion of xanthoxin to abscisic aldehyde and, maize Tasselseed2 which participate in the maize sex determination pathway. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). 
In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	249
212492	cd05327	retinol-DH_like_SDR_c_like	retinol dehydrogenase (retinol-DH), Light dependent Protochlorophyllide (Pchlide) OxidoReductase (LPOR) and related proteins, classical (c) SDRs. Classical SDR subgroup containing retinol-DHs, LPORs, and related proteins. Retinol is processed by a medium chain alcohol dehydrogenase followed by retinol-DHs. Pchlide reductases act in chlorophyll biosynthesis. There are distinct enzymes that catalyze Pchlide reduction in light or dark conditions. Light-dependent reduction is via an NADP-dependent SDR, LPOR. Proteins in this subfamily share the glycine-rich NAD-binding motif of the classical SDRs, have a partial match to the canonical active site tetrad, but lack the typical active site Ser. This subgroup includes the human proteins: retinol dehydrogenase -12, -13 ,and -14, dehydrogenase/reductase SDR family member (DHRS)-12 , -13 and -X (a DHRS on chromosome X), and WWOX (WW domain-containing oxidoreductase), as well as a Neurospora crassa SDR encoded by the blue light inducible bli-4 gene. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. 
Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	269
187589	cd05328	3alpha_HSD_SDR_c	alpha hydroxysteroid dehydrogenase (3alpha_HSD), classical (c) SDRs. Bacterial 3-alpha_HSD, which catalyzes the NAD-dependent oxidoreduction of hydroxysteroids, is a dimeric member of the classical SDR family. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). 
Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	250
187590	cd05329	TR_SDR_c	tropinone reductase-I and II (TR-1, and TR-II)-like, classical (c) SDRs. This subgroup includes TR-I and TR-II; these proteins are members of the SDR family. TRs catalyze the NADPH-dependent reductions of the 3-carbonyl group of tropinone, to a beta-hydroxyl group. TR-I and TR-II produce different stereoisomers from tropinone, TR-I produces tropine (3alpha-hydroxytropane), and TR-II, produces pseudotropine (sigma-tropine, 3beta-hydroxytropane).  SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	251
187591	cd05330	cyclohexanol_reductase_SDR_c	cyclohexanol reductases, including levodione reductase, classical (c) SDRs. Cyclohexanol reductases, including (6R)-2,2,6-trimethyl-1,4-cyclohexanedione (levodione) reductase of Corynebacterium aquaticum, catalyze the reversible oxidoreduction of hydroxycyclohexanone derivatives. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. 
Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	257
187592	cd05331	DH-DHB-DH_SDR_c	2,3 dihydro-2,3 dihydroxybenzoate dehydrogenases, classical (c) SDRs. 2,3 dihydro-2,3 dihydroxybenzoate dehydrogenase shares the characteristics of the classical SDRs. This subgroup includes Escherichia coli EntA which catalyzes the NAD+-dependent oxidation of 2,3-dihydro-2,3-dihydroxybenzoate to 2,3-dihydroxybenzoate during biosynthesis of the siderophore Enterobactin. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. 
Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	244
187593	cd05332	11beta-HSD1_like_SDR_c	11beta-hydroxysteroid dehydrogenase type 1 (11beta-HSD1)-like, classical (c) SDRs. Human 11beta_HSD1 catalyzes the NADP(H)-dependent interconversion of cortisone and cortisol. This subgroup also includes human dehydrogenase/reductase SDR family member 7C (DHRS7C) and DHRS7B. These proteins have the GxxxGxG nucleotide binding motif and S-Y-K catalytic triad characteristic of the SDRs, but have an atypical C-terminal domain that contributes to homodimerization contacts. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	257
187594	cd05333	BKR_SDR_c	beta-Keto acyl carrier protein reductase (BKR), involved in Type II FAS, classical (c) SDRs. This subgroup includes the Escherichia coli K12 BKR, FabG. BKR catalyzes the NADPH-dependent reduction of ACP in the first reductive step of de novo fatty acid synthesis (FAS). FAS consists of four elongation steps, which are repeated to extend the fatty acid chain through the addition of two-carbon units from malonyl acyl-carrier protein (ACP): condensation, reduction, dehydration, and a final reduction. Type II FAS, typical of plants and many bacteria, maintains these activities on discrete polypeptides, while type I FAS utilizes one or two multifunctional polypeptides. BKR resembles enoyl reductase, which catalyzes the second reduction step in FAS. SDRs are a functionally diverse family of oxidoreductases that have a single domain with structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet) NAD(P)(H) binding region and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues.   Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD binding motif and characteristic NAD-binding and catalytic sequence patterns.  These enzymes have a 3-glycine N-terminal NAD(P)(H) binding pattern: TGxxxGxG in classical SDRs.  Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif.  Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P) binding motif and  an altered active site motif (YXXXN).  Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.  Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P) binding motif and missing or unusual active site residues.  
Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering), is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site.  Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr-151 and Lys-155, as well as Asn-111 (or Ser). Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase contain an additional helix-turn-helix motif that is not generally found among SDRs.	240
187595	cd05334	DHPR_SDR_c_like	dihydropteridine reductase (DHPR), classical (c) SDRs. Dihydropteridine reductase is an NAD-binding protein related to the SDRs. It converts dihydrobiopterin into tetrahydrobiopterin, a cofactor necessary in catecholamines synthesis. Dihydropteridine reductase has the YXXXK of these tyrosine-dependent oxidoreductases, but lacks the typical upstream Asn and Ser catalytic residues. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRS are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering), is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase contain an additional helix-turn-helix motif that is not generally found among SDRs.	221
187596	cd05337	BKR_1_SDR_c	putative beta-ketoacyl acyl carrier protein [ACP] reductase (BKR), subgroup 1, classical (c) SDR. This subgroup includes Escherichia coli CFT073 FabG. The Escherichia coli K12 BKR, FabG, belongs to a different subgroup. BKR catalyzes the NADPH-dependent reduction of ACP in the first reductive step of de novo fatty acid synthesis (FAS). FAS consists of four elongation steps, which are repeated to extend the fatty acid chain through the addition of two-carbon units from malonyl acyl-carrier protein (ACP): condensation, reduction, dehydration, and a final reduction. Type II FAS, typical of plants and many bacteria, maintains these activities on discrete polypeptides, while type I FAS utilizes one or two multifunctional polypeptides. BKR resembles enoyl reductase, which catalyzes the second reduction step in FAS. SDRs are a functionally diverse family of oxidoreductases that have a single domain with structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet) NAD(P)(H) binding region and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues.   Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD binding motif and characteristic NAD-binding and catalytic sequence patterns.  These enzymes have a 3-glycine N-terminal NAD(P)(H) binding pattern: TGxxxGxG in classical SDRs. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif.  Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P) binding motif and  an altered active site motif (YXXXN).  Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.  
Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P) binding motif and missing or unusual active site residues.  Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering), is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site.  Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr-151 and Lys-155, as well as Asn-111 (or Ser). Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase contain an additional helix-turn-helix motif that is not generally found among SDRs.	255
187597	cd05338	DHRS1_HSDL2-like_SDR_c	human dehydrogenase/reductase (SDR family) member 1 (DHRS1) and human hydroxysteroid dehydrogenase-like protein 2 (HSDL2), classical (c) SDRs. This subgroup includes human DHRS1 and human HSDL2 and related proteins. These are members of the classical SDR family, with a canonical Gly-rich NAD-binding motif and the typical YXXXK active site motif. However, the rest of the catalytic tetrad is not strongly conserved. DHRS1 mRNA has been detected in many tissues, including liver, heart, skeletal muscle, kidney, and pancreas; a longer transcript is predominantly expressed in the liver, a shorter one in the heart. HSDL2 may play a part in fatty acid metabolism, as it is found in peroxisomes. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering), is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. 
The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase contain an additional helix-turn-helix motif that is not generally found among SDRs.	246
187598	cd05339	17beta-HSDXI-like_SDR_c	human 17-beta-hydroxysteroid dehydrogenase XI-like, classical (c) SDRs. 17-beta-hydroxysteroid dehydrogenases (17betaHSD) are a group of isozymes that catalyze activation and inactivation of estrogen and androgens. 17betaHSD type XI, a classical SDR, preferentially converts 3alpha-Adiol to androsterone but not numerous other tested steroids. This subgroup of classical SDRs also includes members identified as retinol dehydrogenases, which convert retinol to retinal, a property that overlaps with 17betaHSD activity. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRS are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering), is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). 
Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase contain an additional helix-turn-helix motif that is not generally found among SDRs.	243
187599	cd05340	Ycik_SDR_c	Escherichia coli K-12 YCIK-like, classical (c) SDRs. Escherichia coli K-12 YCIK and related proteins have a canonical classical SDR nucleotide-binding motif and active site tetrad. They are predicted oxoacyl-(acyl carrier protein/ACP) reductases. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRS are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering), is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase contain an additional helix-turn-helix motif that is not generally found among SDRs.	236
187600	cd05341	3beta-17beta-HSD_like_SDR_c	3beta17beta hydroxysteroid dehydrogenase-like, classical (c) SDRs. This subgroup includes members identified as 3beta17beta hydroxysteroid dehydrogenase, 20beta hydroxysteroid dehydrogenase, and R-alcohol dehydrogenase. These proteins exhibit the canonical active site tetrad and glycine rich NAD(P)-binding motif of the classical SDRs. 17beta-dehydrogenases are a group of isozymes that catalyze activation and inactivation of estrogen and androgens, and include members of the SDR family. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRS are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering), is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase contain an additional helix-turn-helix motif that is not generally found among SDRs.	247
187601	cd05343	Mgc4172-like_SDR_c	human Mgc4172-like, classical (c) SDRs. Human Mgc4172-like proteins, putative SDRs. These proteins are members of the SDR family, with a canonical active site tetrad and a typical Gly-rich NAD-binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). 
Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	250
187602	cd05344	BKR_like_SDR_like	putative beta-ketoacyl acyl carrier protein [ACP] reductase (BKR)-like, SDR. This subgroup resembles the SDR family, but does not have a perfect match to the NAD-binding motif or the catalytic tetrad characteristic of the SDRs. It includes the SDRs, Q9HYA2 from Pseudomonas aeruginosa PAO1 and APE0912 from Aeropyrum pernix K1. BKR catalyzes the NADPH-dependent reduction of ACP in the first reductive step of de novo fatty acid synthesis (FAS). FAS consists of four elongation steps, which are repeated to extend the fatty acid chain through the addition of two-carbon units from malonyl acyl-carrier protein (ACP): condensation, reduction, dehydration, and a final reduction. Type II FAS, typical of plants and many bacteria, maintains these activities on discrete polypeptides, while type I FAS utilizes one or two multifunctional polypeptides. BKR resembles enoyl reductase, which catalyzes the second reduction step in FAS. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering), is often found in a conserved YXXXK pattern. 
In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase contain an additional helix-turn-helix motif that is not generally found among SDRs.	253
187603	cd05345	BKR_3_SDR_c	putative beta-ketoacyl acyl carrier protein [ACP] reductase (BKR), subgroup 3, classical (c) SDR. This subgroup includes the putative Brucella melitensis biovar Abortus 2308 BKR, FabG, Mesorhizobium loti MAFF303099 FabG, and other classical SDRs. BKR, a member of the SDR family, catalyzes the NADPH-dependent reduction of acyl carrier protein in the first reductive step of de novo fatty acid synthesis (FAS). FAS consists of four elongation steps, which are repeated to extend the fatty acid chain through the addition of two-carbon units from malonyl acyl-carrier protein (ACP): condensation, reduction, dehydration, and a final reduction. Type II FAS, typical of plants and many bacteria, maintains these activities on discrete polypeptides, while type I FAS utilizes one or two multifunctional polypeptides. BKR resembles enoyl reductase, which catalyzes the second reduction step in FAS. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering), is often found in a conserved YXXXK pattern. 
In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase contain an additional helix-turn-helix motif that is not generally found among SDRs.	248
187604	cd05346	SDR_c5	classical (c) SDR, subgroup 5. These proteins are members of the classical SDR family, with a canonical active site tetrad and a typical Gly-rich NAD-binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 
Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	249
187605	cd05347	Ga5DH-like_SDR_c	gluconate 5-dehydrogenase (Ga5DH)-like, classical (c) SDRs. Ga5DH catalyzes the NADP-dependent conversion of carbon source D-gluconate and 5-keto-D-gluconate. This SDR subgroup has a classical Gly-rich NAD(P)-binding motif and a conserved active site tetrad pattern. However, it has been proposed that Arg104 (Streptococcus suis Ga5DH numbering), as well as an active site Ca2+, play a critical role in catalysis. In addition to Ga5DHs this subgroup contains Erwinia chrysanthemi KduD which is involved in pectin degradation, and is a putative 2,5-diketo-3-deoxygluconate dehydrogenase. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107,15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. 
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	248
187606	cd05348	BphB-like_SDR_c	cis-biphenyl-2,3-dihydrodiol-2,3-dehydrogenase (BphB)-like, classical (c) SDRs. cis-biphenyl-2,3-dihydrodiol-2,3-dehydrogenase (BphB) is a classical SDR, it is of particular importance for its role in the degradation of biphenyl/polychlorinated biphenyls(PCBs); PCBs are a significant source of environmental contamination. This subgroup also includes Pseudomonas putida F1 cis-biphenyl-1,2-dihydrodiol-1,2-dehydrogenase (aka cis-benzene glycol dehydrogenase, encoded by the bnzE gene), which participates in benzene metabolism. In addition it includes Pseudomonas sp. C18 putative 1,2-dihydroxy-1,2-dihydronaphthalene dehydrogenase (aka dibenzothiophene dihydrodiol dehydrogenase, encoded by the doxE gene) which participates in an upper naphthalene catabolic pathway. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). 
In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	257
187607	cd05349	BKR_2_SDR_c	putative beta-ketoacyl acyl carrier protein [ACP] reductase (BKR), subgroup 2, classical (c) SDR. This subgroup includes Rhizobium sp. NGR234 FabG1. The Escherichia coli K12 BKR, FabG, belongs to a different subgroup. BKR catalyzes the NADPH-dependent reduction of ACP in the first reductive step of de novo fatty acid synthesis (FAS). FAS consists of four elongation steps, which are repeated to extend the fatty acid chain through the addition of two-carbon units from malonyl acyl-carrier protein (ACP): condensation, reduction, dehydration, and a final reduction. Type II FAS, typical of plants and many bacteria, maintains these activities on discrete polypeptides, while type I FAS utilizes one or two multifunctional polypeptides. BKR resembles enoyl reductase, which catalyzes the second reduction step in FAS. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering), is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. 
Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase contain an additional helix-turn-helix motif that is not generally found among SDRs.	246
187608	cd05350	SDR_c6	classical (c) SDR, subgroup 6. These proteins are members of the classical SDR family, with a canonical active site tetrad  and a fairly well conserved typical Gly-rich  NAD-binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRS are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering), is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase contain an additional helix-turn-helix motif that is not generally found among SDRs.	239
187609	cd05351	XR_like_SDR_c	xylulose reductase-like, classical (c) SDRs. Members of this subgroup include proteins identified as L-xylulose reductase (XR) and carbonyl reductase; they are members of the SDR family. XR catalyzes the NADP-dependent reduction of L-xylulose and other sugars. Tetrameric mouse carbonyl reductase is involved in the metabolism of biogenic and xenobiotic carbonyl compounds. This subgroup also includes tetrameric chicken liver D-erythrulose reductase, which catalyzes the reduction of D-erythrulose to D-threitol. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering), is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser).	244
187610	cd05352	MDH-like_SDR_c	mannitol dehydrogenase (MDH)-like, classical (c) SDRs. NADP-mannitol dehydrogenase catalyzes the conversion of fructose to mannitol, an acyclic 6-carbon sugar. MDH is a tetrameric member of the SDR family. This subgroup also includes various other tetrameric SDRs, including Pichia stipitis D-arabinitol dehydrogenase (aka polyol dehydrogenase), Candida albicans Sou1p, a sorbose reductase, and Candida parapsilosis (S)-specific carbonyl reductase (SCR, aka S-specific alcohol dehydrogenase) which catalyzes the enantioselective reduction of 2-hydroxyacetophenone into (S)-1-phenyl-1,2-ethanediol. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRS are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering), is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser).	252
187611	cd05353	hydroxyacyl-CoA-like_DH_SDR_c-like	(3R)-hydroxyacyl-CoA dehydrogenase-like, classical(c)-like SDRs. Beta oxidation of fatty acids in eukaryotes occurs by a four-reaction cycle, that may take place in mitochondria or in peroxisomes. (3R)-hydroxyacyl-CoA dehydrogenase is part of rat peroxisomal multifunctional MFE-2, it is a member of the NAD-dependent SDRs, but contains an additional small C-terminal domain that completes the active site pocket and participates in dimerization. The atypical, additional C-terminal extension allows for more extensive dimerization contact than other SDRs. MFE-2 catalyzes the second and third reactions of the peroxisomal beta oxidation cycle. Proteins in this subgroup have a typical catalytic triad, but have a His in place of the usual upstream Asn. This subgroup also contains members identified as 17-beta-hydroxysteroid dehydrogenases, including human peroxisomal 17-beta-hydroxysteroid dehydrogenase type 4 (17beta-HSD type 4, aka MFE-2, encoded by HSD17B4 gene) which is involved in fatty acid beta-oxidation and steroid metabolism. This subgroup also includes two SDR domains of the Neurospora crassa and Saccharomyces cerevisiae multifunctional beta-oxidation protein (MFP, aka Fox2).  SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRS are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. 
These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering), is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase contain an additional helix-turn-helix motif that is not generally found among SDRs.	250
187612	cd05354	SDR_c7	classical (c) SDR, subgroup 7. These proteins are members of the classical SDR family, with a canonical active site triad (and also an active site Asn) and a typical Gly-rich NAD-binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRS are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering), is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase contain an additional helix-turn-helix motif that is not generally found among SDRs.	235
187613	cd05355	SDR_c1	classical (c) SDR, subgroup 1. These proteins are members of the classical SDR family, with a canonical active site tetrad and a typical Gly-rich NAD-binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 
Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	270
187614	cd05356	17beta-HSD1_like_SDR_c	17-beta-hydroxysteroid dehydrogenases (17beta-HSDs) types -1, -3, and -12, -like, classical (c) SDRs. This subgroup includes various 17-beta-hydroxysteroid dehydrogenases and 3-ketoacyl-CoA reductase; these are members of the SDR family and contain the canonical active site tetrad and glycine-rich NAD-binding motif of the classical SDRs. 3-ketoacyl-CoA reductase (KAR, aka 17beta-HSD type 12, encoded by HSD17B12) acts in fatty acid elongation; 17beta-hydroxysteroid dehydrogenases are isozymes that catalyze activation and inactivation of estrogen and androgens, and include members of the SDR family. 17beta-estradiol dehydrogenase (aka 17beta-HSD type 1, encoded by HSD17B1) converts estrone to estradiol. Estradiol is the predominant female sex hormone. 17beta-HSD type 3 (aka testosterone 17-beta-dehydrogenase 3, encoded by HSD17B3) catalyses the reduction of androstenedione to testosterone; it also accepts estrogens as substrates. This subgroup also contains a putative steroid dehydrogenase let-767 from Caenorhabditis elegans, mutation in which results in hypersensitivity to cholesterol limitation. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. 
A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering), is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase contain an additional helix-turn-helix motif that is not generally found among SDRs.	239
187615	cd05357	PR_SDR_c	pteridine reductase (PR), classical (c) SDRs. Pteridine reductases (PRs), members of the SDR family, catalyze the NAD-dependent reduction of folic acid, dihydrofolate and related compounds. In Leishmania, pteridine reductase (PTR1) acts to circumvent the anti-protozoan drugs that attack dihydrofolate reductase activity. Proteins in this subgroup have an N-terminal NAD-binding motif and a YxxxK active site motif, but have an Asp instead of the usual upstream catalytic Ser. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering) is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase, contain an additional helix-turn-helix motif that is not generally found among SDRs.	234
187616	cd05358	GlcDH_SDR_c	glucose 1 dehydrogenase (GlcDH), classical (c) SDRs. GlcDH is a tetrameric member of the SDR family; it catalyzes the NAD(P)-dependent oxidation of beta-D-glucose to D-glucono-delta-lactone. GlcDH has a typical NAD-binding site glycine-rich pattern as well as the canonical active site tetrad (YXXXK motif plus upstream Ser and Asn). SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering) is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase, contain an additional helix-turn-helix motif that is not generally found among SDRs.	253
187617	cd05359	ChcA_like_SDR_c	1-cyclohexenylcarbonyl_coenzyme A_reductase (ChcA)_like, classical (c) SDRs. This subgroup contains classical SDR proteins, including members identified as 1-cyclohexenylcarbonyl coenzyme A reductase. ChcA of Streptomyces collinus is implicated in the final reduction step of shikimic acid to ansatrienin. ChcA shows sequence similarity to the SDR family of NAD-binding proteins, but it lacks the conserved Tyr of the characteristic catalytic site. This subgroup also contains the NADH-dependent enoyl-[acyl-carrier-protein(ACP)] reductase FabL from Bacillus subtilis. This enzyme participates in bacterial fatty acid synthesis, in type II fatty-acid synthases and catalyzes the last step in each elongation cycle. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRS are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering), is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. 
The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase contain an additional helix-turn-helix motif that is not generally found among SDRs.	242
187618	cd05360	SDR_c3	classical (c) SDR, subgroup 3. These proteins are members of the classical SDR family, with a canonical active site triad (and also active site Asn) and a typical Gly-rich NAD-binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRS are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering), is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase contain an additional helix-turn-helix motif that is not generally found among SDRs.	233
187619	cd05361	haloalcohol_DH_SDR_c-like	haloalcohol dehalogenase, classical (c) SDRs. Dehalogenases cleave carbon-halogen bonds. Haloalcohol dehalogenases show low sequence similarity to short-chain dehydrogenases/reductases (SDRs). Like the SDRs, haloalcohol dehalogenases have a conserved catalytic triad (Ser-Tyr-Lys/Arg), and form a Rossmann fold. However, the normal classical SDR NAD(P)-binding motif (TGXXGXG) and NAD-binding function are replaced with a halide binding site, allowing the enzyme to catalyze a dehalogenation reaction. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site, while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	242
187620	cd05362	THN_reductase-like_SDR_c	tetrahydroxynaphthalene/trihydroxynaphthalene reductase-like, classical (c) SDRs. 1,3,6,8-tetrahydroxynaphthalene reductase (4HNR) of Magnaporthe grisea and the related 1,3,8-trihydroxynaphthalene reductase (3HNR) are typical members of the SDR family containing the canonical glycine rich NAD(P)-binding site and active site tetrad, and function in fungal melanin biosynthesis. This subgroup also includes an SDR from Norway spruce that may function to protect against both biotic and abiotic stress. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site, while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	243
187621	cd05363	SDH_SDR_c	Sorbitol dehydrogenase (SDH), classical (c) SDR. This bacterial subgroup includes Rhodobacter sphaeroides SDH, and other SDHs. SDH  preferentially interconverts D-sorbitol (D-glucitol) and D-fructose, but also interconverts L-iditol/L-sorbose and galactitol/D-tagatose. SDH is NAD-dependent and is a dimeric member of the SDR family. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRS are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering), is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase contain an additional helix-turn-helix motif that is not generally found among SDRs.	254
187622	cd05364	SDR_c11	classical (c) SDR, subgroup 11. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. 
Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	253
187623	cd05365	7_alpha_HSDH_SDR_c	7 alpha-hydroxysteroid dehydrogenase (7 alpha-HSDH), classical (c) SDRs. This bacterial subgroup contains 7 alpha-HSDHs,  including Escherichia coli 7 alpha-HSDH. 7 alpha-HSDH, a member of the SDR family, catalyzes the NAD+ -dependent dehydrogenation of a hydroxyl group at position 7 of  the steroid skeleton of bile acids. In humans the two primary bile acids are cholic and chenodeoxycholic acids, these are formed from cholesterol in the liver. Escherichia coli 7 alpha-HSDH dehydroxylates these bile acids in the human intestine. Mammalian 7 alpha-HSDH activity has been found in livers. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRS are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering), is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). 
Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase contain an additional helix-turn-helix motif that is not generally found among SDRs.	242
187624	cd05366	meso-BDH-like_SDR_c	meso-2,3-butanediol dehydrogenase-like, classical (c) SDRs. 2,3-butanediol dehydrogenases (BDHs) catalyze the NAD+-dependent conversion of 2,3-butanediol to acetoin; BDHs are classified into types according to their stereospecificity as to substrates and products. Included in this subgroup are Klebsiella pneumoniae meso-BDH, which catalyzes meso-2,3-butanediol to D(-)-acetoin, and Corynebacterium glutamicum L-BDH, which catalyzes L(+)-2,3-butanediol to L(+)-acetoin. This subgroup is comprised of classical SDRs with the characteristic catalytic triad and NAD-binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site, while substrate binding is in the C-terminal region, which determines specificity. 
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	257
187625	cd05367	SPR-like_SDR_c	sepiapterin reductase (SPR)-like, classical (c) SDRs. Human SPR, a member of the SDR family, catalyzes the NADP-dependent reduction of sepiapterin to 7,8-dihydrobiopterin (BH2). In addition to SPRs, this subgroup also contains Bacillus cereus yueD, a benzil reductase, which catalyzes the stereospecific reduction of benzil to (S)-benzoin. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site, while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. 
Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	241
187626	cd05368	DHRS6_like_SDR_c	human DHRS6-like, classical (c) SDRs. Human DHRS6, and similar proteins. These proteins are classical SDRs, with a canonical active site tetrad and a close match to the typical Gly-rich NAD-binding motif. Human DHRS6 is a cytosolic type 2 (R)-hydroxybutyrate dehydrogenase, which catalyses the conversion of (R)-hydroxybutyrate to acetoacetate. Also included in this subgroup is Escherichia coli UcpA (upstream cys P). Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 
Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction. Note: the following text was removed from the description to make this child smaller when the final trees were drawn: Other proteins in this subgroup include Thermoplasma acidophilum aldohexose dehydrogenase, which has high dehydrogenase activity against D-mannose, Bacillus subtilis BacC, involved in the biosynthesis of the dipeptide bacilysin and its antibiotic moiety anticapsin, Sphingomonas paucimobilis strain B90 LinC, involved in the degradation of hexachlorocyclohexane isomers...... P).	241
187627	cd05369	TER_DECR_SDR_a	Trans-2-enoyl-CoA reductase (TER) and 2,4-dienoyl-CoA reductase (DECR), atypical (a) SDR. TER is a peroxisomal protein with a proposed role in fatty acid elongation. Fatty acid synthesis is known to occur in both the endoplasmic reticulum and mitochondria; peroxisomal TER has been proposed as an additional fatty acid elongation system, in which it reduces the double bond at C-2 as the last step of elongation. This system resembles the mitochondrial system in that acetyl-CoA is used as a carbon donor. TER may also function in phytol metabolism, reducing phytenoyl-CoA to phytanoyl-CoA in peroxisomes. DECR processes double bonds in fatty acids to increase their utility in fatty acid metabolism; it reduces 2,4-dienoyl-CoA to an enoyl-CoA. DECR is active in mitochondria and peroxisomes. This subgroup has the Gly-rich NAD-binding motif of the classical SDR family, but does not display strong identity to the canonical active site tetrad, and lacks the characteristic Tyr at the usual position. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering), is often found in a conserved YXXXK pattern. 
In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase contain an additional helix-turn-helix motif that is not generally found among SDRs.	249
187628	cd05370	SDR_c2	classical (c) SDR, subgroup 2. Short-chain dehydrogenases/reductases (SDRs, aka Tyrosine-dependent oxidoreductases) are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 
Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	228
187629	cd05371	HSD10-like_SDR_c	17beta-hydroxysteroid dehydrogenase type 10 (HSD10)-like, classical (c) SDRs. HSD10, also known as amyloid-peptide-binding alcohol dehydrogenase (ABAD), was previously identified as an L-3-hydroxyacyl-CoA dehydrogenase, HADH2. In fatty acid metabolism, HADH2 catalyzes the third step of beta-oxidation, the conversion of a hydroxyl to a keto group in the NAD-dependent oxidation of L-3-hydroxyacyl CoA. In addition to alcohol dehydrogenase and HADH2 activities, HSD10 has steroid dehydrogenase activity. Although the mechanism is unclear, HSD10 is implicated in the formation of amyloid beta-peptide in the brain (which is linked to the development of Alzheimer's disease). Although HSD10 is normally concentrated in the mitochondria, in the presence of amyloid beta-peptide it translocates into the plasma membrane, where its action may generate cytotoxic aldehydes and may lower estrogen levels through its use of 17-beta-estradiol as a substrate. HSD10 is a member of the SDR family, but differs from other SDRs by the presence of two insertions of unknown function. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. 
Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	252
187630	cd05372	ENR_SDR	Enoyl acyl carrier protein (ACP) reductase (ENR), divergent SDR. This bacterial subgroup of ENRs includes Escherichia coli ENR. ENR catalyzes the NAD(P)H-dependent reduction of enoyl-ACP in the last step of fatty acid biosynthesis. De novo fatty acid biosynthesis is catalyzed by the fatty acid synthetase complex, through the serial addition of 2-carbon subunits. In bacteria and plants, ENR catalyzes one of six synthetic steps in this process. Oilseed rape ENR, and also apparently the NADH-specific form of Escherichia coli ENR, is tetrameric. Although similar to the classical SDRs, this group does not have the canonical catalytic tetrad, nor does it have the typical Gly-rich NAD-binding pattern. Such so-called divergent SDRs have a GXXXXXSXA NAD-binding motif and a YXXMXXXK (or YXXXMXXXK) active site motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). 
In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	250
187631	cd05373	SDR_c10	classical (c) SDR, subgroup  10. This subgroup resembles the classical SDRs, but has an incomplete match to the canonical glycine rich NAD-binding motif and lacks the typical active site tetrad (instead of the critical active site Tyr, it has Phe, but contains the nearby Lys). SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. 
Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	238
187632	cd05374	17beta-HSD-like_SDR_c	17beta hydroxysteroid dehydrogenase-like, classical (c) SDRs. 17beta-hydroxysteroid dehydrogenases are a group of isozymes that catalyze activation and inactivation of estrogen and androgens. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). 
Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	248
349398	cd05379	CAP_bacterial	Bacterial CAP (cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins) domain proteins. Little is known about bacterial and archaeal members of the CAP (cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins) domain family. The wider family of CAP domain containing proteins includes plant pathogenesis-related protein 1 (PR-1), cysteine-rich secretory proteins (CRISPs), and allergen 5 from vespid venom, among others. Studies of eukaryotic proteins show that CAP domains have several functions, including the binding of cholesterol, lipids and heparan sulfate. This group includes Borrelia burgdorferi outer surface protein BB0689, which does not bind to cholesterol, lipids, or heparan sulfate, and whose function is unknown.	120
349399	cd05380	CAP_euk	Eukaryotic CAP (cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins) domain proteins. The CAP (cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins) domain is found mainly in eukaryotes. This family includes plant pathogenesis-related protein 1 (PR-1), cysteine-rich secretory proteins (CRISPs), glioma pathogenesis-related 1 (GLIPR1), Golgi associated pathogenesis related-1 (GAPR1) proteins, peptidase inhibitor 15 (PI15), peptidase inhibitor 16 (PI16), CRISP LCCL domain containing 1 (CRISPLD1), CRISP LCCL domain containing 2 (CRISPLD2), and allergen 5 from vespid venom.	144
349400	cd05381	CAP_PR-1	CAP (cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins) domain of pathogenesis-related protein 1 (PR-1) family proteins. Members of pathogenesis-related protein 1 (PR-1) family  are among the most abundantly produced proteins in plants on pathogen attack. They are considered hallmarks of hypersensitive response/defense pathways and may act as anti-fungal agents or be involved in cell wall loosening.	136
349401	cd05382	CAP_GAPR1-like	CAP (cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins) domain of Golgi-associated plant pathogenesis-related protein 1 and similar proteins. Golgi-associated plant pathogenesis related protein 1 (GAPR1), also called Golgi-associated PR-1 protein or glioma pathogenesis-related protein 2 (GLIPR-2), forms amyloid-like fibrils in the presence of liposomes containing acidic phospholipids. It has been identified in mice as an up-regulated protein in kidney fibrosis, and is involved in epithelial to mesenchymal transition and in generating a pool of myofibroblasts contributing to fibrosis. The wider family of CAP domain containing proteins includes plant pathogenesis-related protein 1 (PR-1), cysteine-rich secretory proteins (CRISPs), and allergen 5 from vespid venom, among others.	132
349402	cd05383	CAP_CRISP	CAP (cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins) domain of cysteine-rich secretory proteins. Cysteine-rich secretory proteins (CRISPs) are two-domain proteins with an evolutionarily diverse and structurally conserved N-terminal CAP domain and a C-terminal cysteine-rich domain, which is comprised of a hinge and an ICR (ion channel regulator) region. CRISPs are involved in response to pathogens, fertilization, and sperm maturation. One member, Tex31 from the venom duct of Conus textile, has been shown to possess proteolytic activity sensitive to serine protease inhibitors. CRISP-1 has been shown to mediate gamete fusion by binding to the egg surface. Other members of the CRISP family secreted in the testis (CRISP2), epididymis (CRISP3-4), or during ejaculation (CRISP3), are also involved in sperm-egg interaction, supporting the existence of a functional redundancy and cooperation between homologous proteins ensuring the success of fertilization. The wider family of CAP domain containing proteins includes plant pathogenesis-related protein 1 (PR-1) and allergen 5 from vespid venom, among others.	139
349403	cd05384	CAP_PRY1-like	CAP (cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins) domain of pathogen-related yeast 1 (PRY1) protein and similar fungal proteins. PRY1, also called pathogenesis-related protein 1, is a yeast protein that is up-regulated in core ESCRT mutants. It is a secreted protein required for efficient export of lipids such as acetylated sterols, and acts in detoxification of hydrophobic compounds. This PRY1-like group also contains fruiting body proteins SC7/14 from Schizophyllum commune. The wider family of CAP domain containing proteins includes plant pathogenesis-related protein 1 (PR-1), cysteine-rich secretory proteins (CRISPs), and allergen 5 from vespid venom, among others.	129
349404	cd05385	CAP_GLIPR1-like	CAP (cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins) domain of glioma pathogenesis-related protein 1 and similar proteins. Glioma pathogenesis-related protein 1 (GLIPR1) is also called related to testes-specific, vespid, and pathogenesis protein 1 (RTVP-1). The GLIPR1 gene has been identified as a p53 target gene and was shown to be methylated and down-regulated in prostate cancer. It is a novel broad-spectrum tumor suppressor whose proapoptotic properties are exerted in part through ROS-JNK signaling. GLIPR1 is composed of a signal peptide that directs its secretion, a CAP domain, and a transmembrane domain. The wider family of CAP domain containing proteins includes plant pathogenesis-related protein 1 (PR-1), cysteine-rich secretory proteins (CRISPs), and allergen 5 from vespid venom, among others.	148
349771	cd05386	TraL	transfer origin protein TraL. The transfer origin protein TraL is a member of the SIMIBI superfamily, which contains an ATP-binding domain. Proteins in this superfamily use the energy from hydrolysis of NTP to transfer electrons or ions. The specific function of TraL protein is unknown.	155
349772	cd05387	BY-kinase	bacterial tyrosine-kinase. Bacterial tyrosine (BY)-kinases catalyze the autophosphorylation on a C-terminal tyrosine cluster and also phosphorylate endogenous protein substrates by using ATP as phosphoryl donor. Besides their capacity to function as tyrosine kinase, most of these proteins are also involved in the production and transport of exopolysaccharides. BY-kinases are involved in a number of physiological processes ranging from stress resistance to pathogenicity.	190
349773	cd05388	CobB_N	N-terminal domain of cobyrinic acid a,c-diamide synthase. Cobyrinic acid a,c-diamide synthase (CobB, CbiA). Biosynthesis of cobalamin (vitamin B12) requires more than two dozen different enzymes. CobB catalyzes the ATP-dependent amidation of the two carboxylate groups at positions a and c of cobyrinic acid, via the formation of a phosphorylated intermediate, using glutamine or ammonia as the nitrogen source. CobB is comprised of two protein domains: the C-terminal glutaminase domain and the N-terminal ATP-binding domain. The glutaminase domain catalyzes the hydrolysis of glutamine to glutamate and ammonia. It belongs to the triad class of glutamine amidotransferases. This classification is based on the N-terminal domain which catalyzes the ultimate synthesis of the diamide product by using energy from the hydrolysis of ATP and ammonia transferred from the C-terminal domain.	193
349774	cd05389	CobQ_N	N-terminal domain of cobyric acid synthase. Cobyric acid synthase (CobQ, CbiP) N-terminal domain. CobQ plays a role in the cobalamin (vitamin B12) biosynthesis pathway. CobQ catalyzes the ATP-dependent amidation of adenosyl-cobyrinic acid a,c-diamide at carboxylates positions b, d, e, and g to produce cobyric acid using glutamine or ammonia as the nitrogen source. The C-terminal glutaminase domain catalyzes the hydrolysis of glutamine to glutamate and ammonia. Ammonia is translocated via an intramolecular tunnel to the N-terminal domain for the synthesis of cobyric acid.	223
349775	cd05390	HypB	nickel incorporation protein HypB. HypB is one of numerous accessory proteins required for the maturation of nickel-dependent hydrogenases, like carbon monoxide dehydrogenase or urease. HypB is a GTP-binding protein and has GTP hydrolase activity. It forms a homodimer and is capable of binding two nickel ions and two zinc ions. The active site is located on the dimer interface. Energy from hydrolysis of GTP is used to insert nickel into hydrogenases.	203
213340	cd05391	RasGAP_p120GAP	Ras-GTPase Activating Domain of p120. p120GAP is a negative regulator of Ras that stimulates hydrolysis of bound GTP to GDP. Once the Ras regulator p120GAP, a member of the GAP protein family, is recruited to the membrane, it is transiently immobilized to interact with Ras-GTP. The down-regulation of Ras by p120GAP is a critical step in the regulation of many cellular processes, which is disrupted in approximately 30% of human cancers. p120GAP contains SH2, SH3, PH, calcium- and lipid-binding domains, suggesting its involvement in a complex network of cellular interactions in vivo.	328
213341	cd05392	RasGAP_Neurofibromin_like	Ras-GTPase Activating Domain of proteins similar to neurofibromin. Neurofibromin-like proteins include the Saccharomyces cerevisiae RasGAP proteins Ira1 and Ira2, the closest homologs of neurofibromin, which is responsible for the human autosomal dominant disease neurofibromatosis type I (NF1). The RasGAP Ira1/2 proteins are negative regulators of the Ras-cAMP signaling pathway and are conserved from yeast to human. In yeast, Ras proteins are activated by GEFs, and inhibited by two GAPs, Ira1 and Ira2. Ras proteins activate the cAMP/protein kinase A (PKA) pathway, which controls metabolism, stress resistance, growth, and meiosis. Recent studies showed that the kelch proteins Gpb1 and Gpb2 inhibit Ras activity via association with Ira1 and Ira2. Gpb1/2 bind to a conserved C-terminal domain of Ira1/2, and loss of Gpb1/2 results in a destabilization of Ira1 and Ira2, leading to elevated levels of Ras2-GTP and uninhibited cAMP-PKA signaling. Since the Gpb1/2 binding domain on Ira1/2 is conserved in the human neurofibromin protein, the studies suggest that an analogous signaling mechanism may contribute to the neoplastic development of NF1.	317
213342	cd05394	RasGAP_RASA2	Ras-GTPase Activating Domain of RASA2. RASA2 (or GAP1(m)) is a member of the GAP1 family of Ras GTPase-activating proteins that includes GAP1_IP4BP (or RASA3), CAPRI, and RASAL. In vitro, RASA2 has been shown to bind inositol 1,3,4,5-tetrakisphosphate (IP4), the water soluble inositol head group of the lipid second messenger phosphatidylinositol 3,4,5-trisphosphate (PIP3). In vivo studies also demonstrated that RASA2 binds PIP3, and it is recruited to the plasma membrane following agonist stimulation of PI 3-kinase. Furthermore, the membrane translocation is a consequence of the ability of its pleckstrin homology (PH) domain to bind PIP3.	272
213343	cd05395	RasGAP_RASA4	Ras-GTPase Activating Domain of RASA4. Ras GTPase activating-like 4 protein (RASAL4), also known as Ca2+ -promoted Ras inactivator (CAPRI), is a member of the GAP1 family. Members of the GAP1 family are characterized by a conserved domain structure comprising N-terminal tandem C2 domains, a highly conserved central RasGAP domain, and a C-terminal pleckstrin-homology domain that is associated with a Bruton's tyrosine kinase motif. RASAL4, like RASAL, is a cytosolic protein that undergoes a rapid translocation to the plasma membrane in response to a receptor-mediated elevation in the concentration of intracellular free Ca2+ ([Ca2+]i). However, unlike RASAL, RASAL4 does not sense oscillations in [Ca2+]i.	287
188647	cd05396	An_peroxidase_like	Animal heme peroxidases and related proteins. A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well.	370
143387	cd05397	NT_Pol-beta-like	Nucleotidyltransferase (NT) domain of DNA polymerase beta and similar proteins. This superfamily includes the NT domains of DNA polymerase beta and other family X DNA polymerases, as well as the NT domains of Class I and Class II CCA-adding enzymes, RelA- and SpoT-like ppGpp synthetases and hydrolases, 2'5'-oligoadenylate (2-5A)synthetases, Escherichia coli adenylyltransferase (GlnE), Escherichia coli uridylyl transferase (GlnD), poly (A) polymerases, terminal uridylyl transferases, and Staphylococcus aureus kanamycin nucleotidyltransferase, and similar proteins. The Escherichia coli CCA-adding enzyme belongs to this superfamily but is not included as this enzyme lacks the N-terminal helix conserved in the remainder of the superfamily. In the majority of the Pol beta-like superfamily NTs, two carboxylates, Dx[D/E], together with a third more distal carboxylate coordinate two divalent metal cations that are essential for catalysis. These divalent metal ions are involved in a two-metal ion mechanism of nucleotide addition. Two of the three catalytic carboxylates are found in Rel-Spo enzymes, with the second carboxylate of the DXD motif missing. Evidence supports a single-cation synthetase mechanism for Rel-Spo enzymes.	49
143388	cd05398	NT_ClassII-CCAase	Nucleotidyltransferase (NT) domain of ClassII CCA-adding enzymes. CCA-adding enzymes add the sequence [cytidine(C)-cytidine-adenosine (A)], one nucleotide at a time, onto the 3' end of tRNA, in a template-independent reaction. This Class II group is comprised mainly of eubacterial and eukaryotic enzymes and includes Bacillus stearothermophilus CCAase, Escherichia coli poly(A) polymerase I, human mitochondrial CCAase, and Saccharomyces cerevisiae CCAase (CCA1). CCA-adding enzymes have a single catalytic pocket, which recognizes both ATP and CTP substrates. Included in this subgroup are CC- and A-adding enzymes from various ancient species of bacteria such as Aquifex aeolicus; these enzymes collaborate to add CCA to tRNAs. This family belongs to the Pol beta-like NT superfamily. In the majority of enzymes in this superfamily, two carboxylates, Dx[D/E], together with a third more distal carboxylate, coordinate two divalent metal cations involved in a two-metal ion mechanism of nucleotide addition. These carboxylate residues are fairly well conserved in this family. Escherichia coli CCAase is related to this group but has not been included in this alignment as this enzyme lacks the N-terminal helix conserved in the remainder of the NT superfamily.	139
143389	cd05399	NT_Rel-Spo_like	Nucleotidyltransferase (NT) domain of RelA- and SpoT-like ppGpp synthetases and hydrolases. This family includes the catalytic domains of Escherichia coli ppGpp synthetase (RelA), ppGpp synthetase/hydrolase (SpoT), and related proteins. RelA synthesizes (p)ppGpp in response to amino-acid starvation and in association with ribosomes. (p)ppGpp triggers the bacterial stringent response. SpoT catalyzes (p)ppGpp synthesis under carbon limitation in a ribosome-independent manner. It also catalyzes (p)ppGpp degradation. Gram-negative bacteria have two enzymes involved in (p)ppGpp metabolism while most Gram-positive organisms have a single Rel-Spo enzyme (Rel), which both synthesizes and degrades (p)ppGpp. The Arabidopsis thaliana Rel-Spo proteins, At-RSH1, -2, and -3 appear to regulate a rapid (p)ppGpp-mediated response to pathogens and other stresses. This catalytic domain is found in association with an N-terminal HD domain and a C-terminal metal dependent phosphohydrolase domain (TGS). Some Rel-Spo proteins also have a C-terminal regulatory ACT domain. This subgroup belongs to the Pol beta-like NT superfamily. In the majority of enzymes in this superfamily, two carboxylates, Dx[D/E], together with a third more distal carboxylate, coordinate two divalent metal cations involved in a two-metal ion mechanism of nucleotide addition. Two of the three catalytic carboxylates are found in Rel-Spo enzymes, with the second carboxylate of the DXD motif missing. Evidence supports a single-cation synthetase mechanism.	129
143390	cd05400	NT_2-5OAS_ClassI-CCAase	Nucleotidyltransferase (NT) domain of 2'5'-oligoadenylate (2-5A) synthetase (2-5OAS) and class I CCA-adding enzyme. In vertebrates, 2-5OASs are induced by interferon during the innate immune response to protect against RNA virus infections. In the presence of an RNA activator, 2-5OASs catalyze the oligomerization of ATP into 2-5A. 2-5A activates endoribonuclease L, which leads to degradation of the viral RNA. 2-5OASs are also implicated in cell growth control, differentiation, and apoptosis. This family includes human OAS1, -2, -3, and OASL. CCA-adding enzymes add the sequence [cytidine(C)-cytidine-adenosine (A)], one nucleotide at a time, onto the 3' end of tRNA, in a template-independent reaction. This class I group includes the archaeal Sulfolobus shibatae and Archaeoglobus fulgidus CCA-adding enzymes. It belongs to the Pol beta-like NT superfamily. In the majority of enzymes in this superfamily, two carboxylates, Dx[D/E], together with a third more distal carboxylate, coordinate two divalent metal cations involved in a two-metal ion mechanism of nucleotide addition. These carboxylate residues are conserved in this family.	143
143391	cd05401	NT_GlnE_GlnD_like	Nucleotidyltransferase (NT) domain of Escherichia coli adenylyltransferase (GlnE), Escherichia coli uridylyl transferase (GlnD), and similar proteins. Escherichia coli GlnD and -E participate in the Glutamine synthetase (GS)/Glutamate synthase (GOGAT) pathway for the assimilation of ammonium nitrogen. In nitrogen sufficiency, GlnE adenylates GS, reducing GS activity; when nitrogen is limiting, GlnE deadenylates GS-AMP, restoring GS activity. When nitrogen is limiting, GlnD uridylylates the nitrogen regulatory protein PII to PII-UMP, and in nitrogen sufficiency, it removes the modifying groups. The activity of Escherichia coli GlnE is modulated by PII-proteins. PII-UMP promotes GlnE deadenylation activity, and PII promotes GlnE adenylation activity. Escherichia coli GlnE has two separate NT domains. The N-terminal NT domain catalyzes the deadenylylation of GS, and the C-terminal NT domain the adenylylation reaction. The majority of proteins in this family contain a C-terminal NT domain which is associated with a cystathionine beta-synthase (CBS) domain pair and a CAP_ED (cAMP receptor protein effector) domain. This family belongs to the Pol beta-like NT superfamily. In the majority of enzymes in this superfamily, two carboxylates, Dx[D/E], together with a third more distal carboxylate, coordinate two divalent metal cations involved in a two-metal ion mechanism of nucleotide addition. For the majority of proteins in this family, these carboxylate residues are conserved.	172
143392	cd05402	NT_PAP_TUTase	Nucleotidyltransferase (NT) domain of poly(A) polymerases and terminal uridylyl transferases. Poly(A) polymerases (PAPs) catalyze mRNA poly(A) tail synthesis, and terminal uridylyl transferases (TUTases) uridylate RNA. PAPs in this subgroup include human PAP alpha, mouse testis-specific cytoplasmic PAP beta, human nuclear PAP gamma, Saccharomyces cerevisiae PAP1, TRF4 and -5, Schizosaccharomyces pombe caffeine-induced death proteins -1 and -14, Caenorhabditis elegans Germ Line Development-2, and Chlamydomonas reinhardtii MUT68. This family also includes human U6 snRNA-specific TUTase1, and Trypanosoma brucei 3'-TUTase-1, -2, and -4. This family belongs to the Pol beta-like NT superfamily. In the majority of enzymes in this superfamily, two carboxylates, Dx[D/E], together with a third more distal carboxylate, coordinate two divalent metal cations involved in a two-metal ion mechanism of nucleotide addition. For the majority of proteins in this family, these carboxylate residues are conserved.	114
143393	cd05403	NT_KNTase_like	Nucleotidyltransferase (NT) domain of Staphylococcus aureus kanamycin nucleotidyltransferase, and similar proteins. S. aureus KNTase is a plasmid encoded enzyme which confers resistance to a wide range of aminoglycoside antibiotics which have a 4'- or 4''-hydroxyl group in the equatorial position, such as kanamycin A. This enzyme transfers a nucleoside monophosphate group from a nucleotide (ATP, GTP, or UTP) to the 4'-hydroxyl group of kanamycin A. This enzyme is a homodimer, having two NT active sites. The nucleotide and antibiotic binding sites of each active site include residues from each monomer. Included in this subgroup is Escherichia coli AadA5 which confers resistance to the antibiotic spectinomycin and is a putative aminoglycoside-3'-adenylyltransferase. It is part of the aadA5 cassette of a class 1 integron. This subgroup also includes Haemophilus influenzae HI0073 which forms a 2:2 heterotetramer with an unrelated protein HI0074. Structurally HI0074 is related to the substrate-binding domain of S. aureus KNTase. The genes encoding HI0073 and HI0074 form an operon. Little is known about the substrate specificity or function of two-component NTs. The characterized members of this subgroup may not be representative of the function of this subgroup. This subgroup belongs to the Pol beta-like NT superfamily. In the majority of enzymes in this superfamily, two carboxylates, Dx[D/E], together with a third more distal carboxylate, coordinate two divalent metal cations involved in a two-metal ion mechanism of nucleotide addition. These carboxylate residues are conserved in this subgroup.	93
176102	cd05466	PBP2_LTTR_substrate	The substrate binding domain of LysR-type transcriptional regulators (LTTRs), a member of the type 2 periplasmic binding fold protein superfamily. This model and hierarchy represent the substrate-binding domain of the LysR-type transcriptional regulators that form the largest family of prokaryotic transcription factors. Homologs of some LTTRs with similar domain organizations are also found in the archaea and eukaryotic organisms. The LTTRs are composed of two functional domains joined by a linker helix involved in oligomerization: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal substrate-binding domain, which is structurally homologous to the type 2 periplasmic binding proteins. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcriptional repressor undergoes a conformational change upon substrate binding which in turn changes the DNA binding affinity of the repressor. The genes controlled by the LTTRs have diverse functional roles including amino acid biosynthesis, CO2 fixation, antibiotic resistance, degradation of aromatic compounds, oxidative stress responses, nodule formation of nitrogen-fixing bacteria, synthesis of virulence factors, toxin production, attachment and secretion, to name a few. The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the substrate-binding domains from ionotropic glutamate receptors, LysR-like transcriptional regulators, and unorthodox sensor proteins involved in signal transduction.	197
119437	cd05467	CBM20	The family 20 carbohydrate-binding module (CBM20), also known as the starch-binding domain, is found in a large number of starch degrading enzymes including alpha-amylase, beta-amylase, glucoamylase, and CGTase (cyclodextrin glucanotransferase). CBM20 is also present in proteins that have a regulatory role in starch metabolism in plants (e.g. alpha-amylase) or glycogen metabolism in mammals (e.g. laforin). CBM20 folds as an antiparallel beta-barrel structure with two starch binding sites. These two sites are thought to differ functionally with site 1 acting as the initial starch recognition site and site 2 involved in the specific recognition of appropriate regions of starch.	96
176472	cd05468	pVHL	von Hippel-Lindau (pVHL) tumor suppressor protein. von Hippel-Lindau (pVHL) protein, the gene product of VHL, is a critical regulator of the ubiquitous oxygen-sensing pathway. It is conserved throughout evolution, as its homologs are found in organisms ranging from mammals to the insects Drosophila melanogaster and Anopheles gambiae and the nematode Caenorhabditis elegans. pVHL acts as the substrate recognition component of an E3 ubiquitin ligase complex. Several proteins have been identified as pVHL-binding proteins that are subject to ubiquitin-mediated proteolysis; the best characterized putative substrates are the alpha subunits of the hypoxia-inducible factor (HIF1alpha, HIF2alpha, and HIF3alpha). In addition to HIF degradation, pVHL has been implicated in HIF-independent cellular processes. Germline VHL mutations cause renal cell carcinomas, hemangioblastomas and pheochromocytomas in humans. pVHL can bind to and direct the proper deposition of fibronectin and collagen IV within the extracellular matrix. It works to stabilize microtubules and foster the maintenance of the primary cilium. It also has been reported to promote the stabilization and activation of p53 in a HIF-independent manner and, in neuronal cells, promote apoptosis by down-regulation of Jun-B.	141
100112	cd05469	Transthyretin_like	Transthyretin_like.  This domain is present in the transthyretin-like protein (TLP) family which includes transthyretin (TTR) and a transthyretin-related protein called 5-hydroxyisourate hydrolase (HIUase).  TTR and HIUase are homotetrameric proteins with each subunit consisting of eight beta-strands arranged in two sheets and a short alpha-helix. The central channel of the tetramer contains two independent binding sites, each located between a pair of subunits. TTR transports thyroid hormones and retinol in the blood serum of vertebrates while HIUase catalyzes the second step in a three-step ureide pathway. TTRs are highly conserved and found only in vertebrates while the HIUases are found in a wide range of bacterial, plant, fungal, slime mold and vertebrate organisms.	113
133137	cd05470	pepsin_retropepsin_like	Cellular and retroviral pepsin-like aspartate proteases. This family includes both cellular and retroviral pepsin-like aspartate proteases. The cellular pepsin and pepsin-like enzymes are twice as long as their retroviral counterparts. The cellular pepsin-like aspartic proteases are found in mammals, plants, fungi and bacteria. These well known and extensively characterized enzymes include pepsins, chymosin, renin, cathepsins, and fungal aspartic proteases. Several have long been known to be medically (renin, cathepsin D and E, pepsin) or commercially (chymosin) important. The eukaryotic pepsin-like proteases contain two domains possessing similar topological features. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except in the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event. The eukaryotic pepsin-like proteases have two active site Asp residues with each N- and C-terminal lobe contributing one residue. While the fungal and mammalian pepsins are bilobal proteins, retropepsins function as dimers and the monomer resembles the structure of the N- or C-terminal domain of the eukaryotic enzyme. The active site motif (Asp-Thr/Ser-Gly-Ser) is conserved between the retroviral and eukaryotic proteases and between the N- and C-terminal lobes of eukaryotic pepsin-like proteases. The retropepsin-like family includes pepsin-like aspartate proteases from retroviruses, retrotransposons and retroelements, as well as eukaryotic DNA-damage-inducible proteins (DDIs) and bacterial aspartate peptidases. Retropepsin is synthesized as part of the POL polyprotein that contains an aspartyl-protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A) and A2 (retropepsin family).	109
133138	cd05471	pepsin_like	Pepsin-like aspartic proteases, bilobal enzymes that cleave bonds in peptides at acidic pH. Pepsin-like aspartic proteases are found in mammals, plants, fungi and bacteria. These well known and extensively characterized enzymes include pepsins, chymosin, renin, cathepsins, and fungal aspartic proteases. Several have long been known to be medically (renin, cathepsin D and E, pepsin) or commercially (chymosin) important. Structurally, aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Aspartate residue, with an extended active site cleft localized between the two lobes of the molecule. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except in the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event. Most members of the pepsin family specifically cleave bonds in peptides that are at least six residues in length, with hydrophobic residues in both the P1 and P1' positions. The active site is located at the groove formed by the two lobes, with an extended loop projecting over the cleft to form an 11-residue flap, which encloses substrates and inhibitors in the active site. Specificity is determined by nearest-neighbor hydrophobic residues surrounding the catalytic aspartates, and by three residues in the flap. The enzymes are mostly secreted from cells as inactive proenzymes that activate autocatalytically at acidic pH. This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A, clan AA).	283
133139	cd05472	cnd41_like	Chloroplast Nucleoids DNA-binding Protease, catalyzes the degradation of ribulose-1,5-bisphosphate carboxylase/oxygenase. Chloroplast Nucleoids DNA-binding Protease catalyzes the degradation of ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) in senescent leaves of tobacco. Antisense tobacco with a reduced amount of CND41 maintained green leaves and constant protein levels, especially Rubisco. CND41 has DNA-binding as well as aspartic protease activities. The pepsin-like aspartic protease domain is located at the C-terminus of the protein. The enzyme is characterized by having two aspartic protease catalytic site motifs, the Asp-Thr-Gly-Ser in the N-terminal and Asp-Ser-Gly-Ser in the C-terminal region. Aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. One lobe may have evolved from the other through an ancient gene-duplication event. This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A, clan AA).	299
133140	cd05473	beta_secretase_like	Beta-secretase, aspartic-acid protease important in the pathogenesis of Alzheimer's disease. Beta-secretase is also called BACE (beta-site of APP cleaving enzyme) or memapsin-2. Beta-secretase is an aspartic-acid protease important in the pathogenesis of Alzheimer's disease, and in the formation of myelin sheaths in peripheral nerve cells. It cleaves amyloid precursor protein (APP) to reveal the N-terminus of the beta-amyloid peptides. The beta-amyloid peptides are the major components of the amyloid plaques formed in the brain of patients with Alzheimer's disease (AD). Since BACE mediates one of the cleavages responsible for generation of AD, it is regarded as a potential target for pharmacological intervention in AD. Beta-secretase is a member of the pepsin family of aspartic proteases. Like other aspartic proteases, beta-secretase is a bilobal enzyme, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except in the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event. The enzymes specifically cleave bonds in peptides that are at least six residues in length, with hydrophobic residues in both the P1 and P1' positions. The active site is located at the groove formed by the two lobes, with an extended loop projecting over the cleft to form an 11-residue flap, which encloses substrates and inhibitors in the active site. Specificity is determined by nearest-neighbor hydrophobic residues surrounding the catalytic aspartates, and by three residues in the flap. The enzymes are mostly secreted from cells as inactive proenzymes that activate autocatalytically at acidic pH. This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A, clan AA).	364
133141	cd05474	SAP_like	SAPs, pepsin-like proteinases secreted from pathogens to degrade host proteins. SAPs (Secreted aspartic proteinases) are secreted from a group of pathogenic fungi, predominantly Candida species. They are secreted from the pathogen to degrade host proteins. SAP is one of the most significant extracellular hydrolytic enzymes produced by C. albicans. SAP proteins are encoded by a family of 10 SAP genes. All 10 SAP genes of C. albicans encode preproenzymes, approximately 60 amino acids longer than the mature enzyme, which are processed when transported via the secretory pathway. The mature enzymes contain sequence motifs typical for all aspartyl proteinases, including the two conserved aspartate residues of the active site and conserved cysteine residues implicated in the maintenance of the three-dimensional structure. Most Sap proteins contain putative N-glycosylation sites, but it remains to be determined which Sap proteins are glycosylated. The overall structure of Sap protein conforms to the classical aspartic proteinase fold typified by pepsin. SAP is a bilobal enzyme, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. One lobe may have evolved from the other through an ancient gene-duplication event. More recently evolved enzymes have similar three-dimensional structures; however, their amino acid sequences are more divergent except for the conserved catalytic site motif. This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A, clan AA).	295
133142	cd05475	nucellin_like	Nucellins, plant aspartic proteases specifically expressed in nucellar cells during degradation. Nucellins are important regulators of the progressive degradation of nucellar cells after ovule fertilization. This degradation is a characteristic of programmed cell death. Nucellins are plant aspartic proteases specifically expressed in nucellar cells during degradation. The enzyme is characterized by having two aspartic protease catalytic site motifs, the Asp-Thr-Gly-Ser in the N-terminal and Asp-Ser-Gly-Ser in the C-terminal region, and two other regions nearly identical to two regions of plant aspartic proteases. Aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. One lobe may have evolved from the other through an ancient gene-duplication event. Although the three-dimensional structures of the two lobes are very similar, the amino acid sequences are more divergent, except for the conserved catalytic site motif.	273
133143	cd05476	pepsin_A_like_plant	Chloroplast Nucleoids DNA-binding Protease and Nucellin, pepsin-like aspartic proteases from plants. This family contains pepsin-like aspartic proteases from plants including Chloroplast Nucleoids DNA-binding Protease and Nucellin. Chloroplast Nucleoids DNA-binding Protease catalyzes the degradation of ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) in senescent leaves of tobacco, and Nucellins are important regulators of the progressive degradation of nucellar cells after ovule fertilization. Structurally, aspartic proteases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except in the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event. The enzymes specifically cleave bonds in peptides that are at least six residues in length, with hydrophobic residues in both the P1 and P1' positions. The active site is located at the groove formed by the two lobes, with an extended loop projecting over the cleft to form an 11-residue flap, which encloses substrates and inhibitors in the active site. Specificity is determined by nearest-neighbor hydrophobic residues surrounding the catalytic aspartates, and by three residues in the flap. The enzymes are mostly secreted from cells as inactive proenzymes that activate autocatalytically at acidic pH.	265
133144	cd05477	gastricsin	Gastricsins, aspartate proteases produced in gastric mucosa. Gastricsin is also called pepsinogen C. Gastricsins are produced in gastric mucosa of mammals. It is synthesized by the chief cells in the stomach as an inactive zymogen. It is self-converted to a mature enzyme under acidic conditions. Human gastricsin is distributed throughout all parts of the stomach. Gastricsin is synthesized as an inactive progastricsin that has an approximately 40 residue prosequence. Its self-conversion to a mature enzyme is triggered by a drop in pH from neutral to acidic conditions. Like other aspartic proteases, gastricsins are characterized by two catalytic aspartic residues at the active site, and display optimal activity at acidic pH. The mature enzyme has a pseudo-2-fold symmetry that passes through the active site between the catalytic aspartate residues. Structurally, aspartic proteases are bilobal enzymes, each lobe contributing a catalytic aspartate residue, with an extended active site cleft localized between the two lobes of the molecule. One lobe may have evolved from the other through an ancient gene-duplication event. Although the three-dimensional structures of the two lobes are very similar, the amino acid sequences are more divergent, except for the conserved catalytic site motif. This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A, clan AA).	318
133145	cd05478	pepsin_A	Pepsin A, aspartic protease produced in gastric mucosa of mammals. Pepsin, a well-known aspartic protease, is produced by the human gastric mucosa in seven different zymogen isoforms, subdivided into two types: pepsinogen A and pepsinogen C. The prosequences of the zymogens are self-cleaved at acidic pH. The mature enzymes are called pepsin A and pepsin C, correspondingly. The well researched porcine pepsin is also in this pepsin A family. Pepsins play an integral role in the digestion process of vertebrates. Pepsins are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. One lobe may have evolved from the other through an ancient gene-duplication event. More recently evolved enzymes have similar three-dimensional structures; however, their amino acid sequences are more divergent except for the conserved catalytic site motif. Pepsins specifically cleave bonds in peptides that are at least six residues in length, with hydrophobic residues in both the P1 and P1' positions. The active site is located at the groove formed by the two lobes, with an extended loop projecting over the cleft to form an 11-residue flap, which encloses substrates and inhibitors in the active site. Specificity is determined by nearest-neighbor hydrophobic residues surrounding the catalytic aspartates, and by three residues in the flap. This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A, clan AA).	317
133146	cd05479	RP_DDI	RP_DDI; retropepsin-like domain of DNA damage inducible protein. The family represents the retropepsin-like domain of DNA damage inducible protein. DNA damage inducible protein has a retropepsin-like domain and an amino-terminal ubiquitin-like domain and/or a UBA (ubiquitin-associated) domain. This CD represents the retropepsin-like domain of DDI.	124
133147	cd05480	NRIP_C	NRIP_C; putative nuclear receptor interacting protein. Proteins in this family have been described as probable nuclear receptor interacting proteins. The  C-terminal domain of this family is homologous to the retroviral aspartyl protease domain. The domain is structurally related to one lobe of the pepsin molecule. The conserved active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate peptidases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A.	103
133148	cd05481	retropepsin_like_LTR_1	Retropepsins_like_LTR; pepsin-like aspartate protease from retrotransposons with long terminal repeats. Retropepsins of retrotransposons with long terminal repeats are pepsin-like aspartate proteases. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate peptidases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A.	93
133149	cd05482	HIV_retropepsin_like	Retropepsins, pepsin-like aspartate proteases. This is a subfamily of retropepsins. The family includes pepsin-like aspartate proteases from retroviruses, retrotransposons and retroelements. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate peptidases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A.	87
133150	cd05483	retropepsin_like_bacteria	Bacterial aspartate proteases, retropepsin-like protease family. This family of bacterial aspartate proteases is a subfamily of the retropepsin-like protease family, which includes enzymes from retroviruses and retrotransposons. While fungal and mammalian pepsin-like aspartate proteases are bilobal proteins with structurally related N- and C-termini, the bacterial aspartate proteases in this family are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate proteases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A.	96
133151	cd05484	retropepsin_like_LTR_2	Retropepsins_like_LTR, pepsin-like aspartate proteases. Retropepsins of retrotransposons with long terminal repeats are pepsin-like aspartate proteases. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate peptidases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A.	91
133152	cd05485	Cathepsin_D_like	Cathepsin_D_like, pepsin family of proteinases. Cathepsin D is the major aspartic proteinase of the lysosomal compartment where it functions in protein catabolism. It is a member of the pepsin family of proteinases. This enzyme is distinguished from other members of the pepsin family by two features that are characteristic of lysosomal hydrolases. First, mature Cathepsin D is found predominantly in a two-chain form due to a posttranslational cleavage event. Second, it contains phosphorylated, N-linked oligosaccharides that target the enzyme to lysosomes via mannose-6-phosphate receptors. Cathepsin D preferentially attacks peptide bonds flanked by bulky hydrophobic amino acids and its pH optimum is between pH 2.8 and 4.0. Two active site aspartic acid residues are essential for the catalytic activity of aspartic proteinases. Like other aspartic proteinases, Cathepsin D is a bilobed molecule; the two evolutionarily related lobes are mostly made up of beta-sheets and flank a deep active site cleft. Each of the two related lobes contributes one active site aspartic acid residue and contains a single carbohydrate group. Cathepsin D is an essential enzyme. Mice deficient in cathepsin D, generated by gene targeting, develop normally during the first 2 weeks, stop thriving in the third week and die in a state of anorexia in the fourth week. The mice develop atrophy of ileal mucosa followed by other degradation of intestinal organs. In these knockout mice, lysosomal proteolysis was normal. These results suggest that vital functions of cathepsin D are exerted by limited proteolysis of proteins regulating cell growth and/or tissue homeostasis, while its contribution to bulk proteolysis in lysosomes appears to be non-critical. This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A, clan AA).	329
133153	cd05486	Cathespin_E	Cathepsin E, non-lysosomal aspartic protease. Cathepsin E is an intracellular, non-lysosomal aspartic protease expressed in a variety of cells and tissues. The protease has proposed physiological roles in antigen presentation by the MHC class II system, in the biogenesis of the vasoconstrictor peptide endothelin, and in neurodegeneration associated with brain ischemia and aging. Cathepsin E is the only A1 aspartic protease that exists as a homodimer with a disulfide bridge linking the two monomers. Like many other aspartic proteases, it is synthesized as a zymogen which is catalytically inactive towards its natural substrates at neutral pH and which auto-activates in an acidic environment. The overall structure follows the general fold of aspartic proteases of the A1 family: it is composed of two structurally similar beta barrel lobes, each lobe contributing an aspartic acid residue to form a catalytic dyad that acts to cleave the substrate peptide bond. The catalytic Asp residues are contained in an Asp-Thr-Gly-Ser/Thr motif in both N- and C-terminal lobes of the enzyme. The aspartic acid residues act together to allow a water molecule to attack the peptide bond. One aspartic acid residue (in its deprotonated form) activates the attacking water molecule, whereas the other aspartic acid residue (in its protonated form) polarizes the peptide carbonyl, increasing its susceptibility to attack. This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A, clan AA).	316
133154	cd05487	renin_like	Renin stimulates production of angiotensin and thus affects blood pressure. Renin, also known as angiotensinogenase, is a circulating enzyme that participates in the renin-angiotensin system that mediates extracellular volume, arterial vasoconstriction, and consequently mean arterial blood pressure. The enzyme is secreted by the kidneys from specialized juxtaglomerular cells in response to decreases in glomerular filtration rate (a consequence of low blood volume), diminished filtered sodium chloride and sympathetic nervous system innervation. The enzyme circulates in the blood stream and hydrolyzes angiotensinogen secreted from the liver into the peptide angiotensin I. Angiotensin I is further cleaved in the lungs by endothelial bound angiotensin converting enzyme (ACE) into angiotensin II, the final active peptide. Renin is a member of the aspartic protease family. Structurally, aspartic proteases are bilobal enzymes, each lobe contributing a catalytic aspartate residue, with an extended active site cleft localized between the two lobes of the molecule. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except in the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event. The active site is located at the groove formed by the two lobes, with an extended loop projecting over the cleft to form an 11-residue flap, which encloses substrates and inhibitors in the active site. Specificity is determined by nearest-neighbor hydrophobic residues surrounding the catalytic aspartates, and by three residues in the flap. The enzymes are mostly secreted from cells as inactive proenzymes that activate autocatalytically at acidic pH. This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A, clan AA).	326
133155	cd05488	Proteinase_A_fungi	Fungal Proteinase A, aspartic proteinase superfamily. Fungal Proteinase A, a proteolytic enzyme distributed among a variety of organisms, is a member of the aspartic proteinase superfamily. In Saccharomyces cerevisiae, targeted to the vacuole as a zymogen, activation of proteinase A at acidic pH can occur by two different pathways: a one-step process to release mature proteinase A, involving the intervention of proteinase B, or a step-wise pathway via the auto-activation product known as pseudo-proteinase A. Once active, S. cerevisiae proteinase A is essential to the activities of other yeast vacuolar hydrolases, including proteinase B and carboxypeptidase Y. The mature enzyme is bilobal, with each lobe providing one of the two catalytically essential aspartic acid residues in the active site. The crystal structure of free proteinase A shows that the flap loop is atypically pointing directly into the S(1) pocket of the enzyme. Proteinase A preferentially hydrolyzes hydrophobic residues such as Phe, Leu or Glu at the P1 position and Phe, Ile, Leu or Ala at P1'. Moreover, the enzyme is inhibited by IA3, a natural and highly specific inhibitor produced by S. cerevisiae. This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A, clan AA).	320
133156	cd05489	xylanase_inhibitor_I_like	TAXI-I inhibits degradation of xylan in the cell wall. Xylanase inhibitor-I (TAXI-I) is a member of potent TAXI-type inhibitors of fungal and bacterial family 11 xylanases. Plants developed a diverse battery of defense mechanisms in response to continual challenges by a broad spectrum of pathogenic microorganisms. Their defense arsenal includes inhibitors of cell wall-degrading enzymes, which hinder a possible invasion and colonization by antagonists. Xylanases of fungal and bacterial pathogens are the key enzymes in the degradation of xylan in the cell wall. Plants secrete proteins that inhibit these degradation glycosidases, including xylanase. Surprisingly, TAXI-I displays structural homology with the pepsin-like family of aspartic proteases but is proteolytically nonfunctional, because one or more residues of the essential catalytic triad are absent. The structure of the TAXI-inhibitor, Aspergillus niger xylanase I complex, illustrates the ability of tight binding and inhibition with subnanomolar affinity and indicates the importance of the C-terminal end for the differences in xylanase specificity among different TAXI-type inhibitors. This family also contains pepsin-like aspartic proteinases homologous to TAXI-I. Unlike TAXI-I, they have active site aspartates and are functionally active. This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A, clan AA).	362
133157	cd05490	Cathepsin_D2	Cathepsin_D2, pepsin family of proteinases. Cathepsin D is the major aspartic proteinase of the lysosomal compartment where it functions in protein catabolism. It is a member of the pepsin family of proteinases. This enzyme is distinguished from other members of the pepsin family by two features that are characteristic of lysosomal hydrolases. First, mature Cathepsin D is found predominantly in a two-chain form due to a posttranslational cleavage event. Second, it contains phosphorylated, N-linked oligosaccharides that target the enzyme to lysosomes via mannose-6-phosphate receptors. Cathepsin D preferentially attacks peptide bonds flanked by bulky hydrophobic amino acids and its pH optimum is between pH 2.8 and 4.0. Two active site aspartic acid residues are essential for the catalytic activity of aspartic proteinases. Like other aspartic proteinases, Cathepsin D is a bilobed molecule; the two evolutionary related lobes are mostly made up of beta-sheets and flank a deep active site cleft. Each of the two related lobes contributes one active site aspartic acid residue and contains a single carbohydrate group. Cathepsin D is an essential enzyme. Mice deficient for proteinase cathepsin D, generated by gene targeting, develop normally during the first 2 weeks, stop thriving in the third week and die in a state of anorexia in the fourth week. The mice develop atrophy of ileal mucosa followed by other degradation of intestinal organs. In these knockout mice, lysosomal proteolysis was normal. These results suggest that vital functions of cathepsin D are exerted by limited proteolysis of proteins regulating cell growth and/or tissue homeostasis, while its contribution to bulk proteolysis in lysosomes appears to be non-critical. This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A, clan AA).	325
99923	cd05491	Bromo_TBP7_like	Bromodomain; TBP7_like subfamily, limited to fungi. TBP7, or TAT-binding protein homolog 7, is a yeast protein of unknown function that contains AAA-superfamily ATP-ase domains and a bromodomain. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine.	119
99924	cd05492	Bromo_ZMYND11	Bromodomain; ZMYND11_like sub-family. ZMYND11 or BS69 is a ubiquitously expressed nuclear protein that has been shown to associate with chromatin. It interacts with chromatin remodeling factors and might play a role in chromatin remodeling and gene expression. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine.	109
99925	cd05493	Bromo_ALL-1	Bromodomain, ALL-1 like proteins. ALL-1 is a vertebrate homologue of Drosophila trithorax and is often affected in chromosomal rearrangements that are linked to acute leukemias, such as acute lymphocytic leukemia (ALL). Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine.	131
99926	cd05494	Bromodomain_1	Bromodomain; uncharacterized subfamily. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine.	114
99927	cd05495	Bromo_cbp_like	Bromodomain, cbp_like subfamily. Cbp (CREB binding protein or CREBBP) is an acetyltransferase acting on histone, which gives a specific tag for transcriptional activation and also acetylates non-histone proteins. CREBBP binds specifically to phosphorylated CREB protein and augments the activity of phosphorylated CREB to activate transcription of cAMP-responsive genes. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine.	108
99928	cd05496	Bromo_WDR9_II	Bromodomain; WDR9 repeat II_like subfamily. WDR9 is a human gene located in the Down Syndrome critical region-2 of chromosome 21. It encodes for a nuclear protein containing WD40 repeats and two bromodomains, which may function as a transcriptional regulator involved in chromatin remodeling and play a role in embryonic development. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine.	119
99929	cd05497	Bromo_Brdt_I_like	Bromodomain, Brdt_like subfamily, repeat I. Human Brdt is a testis-specific member of the BET subfamily of bromodomain proteins; the first bromodomain in Brdt has been shown to be essential for male germ cell differentiation. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine.	107
99930	cd05498	Bromo_Brdt_II_like	Bromodomain, Brdt_like subfamily, repeat II. Human Brdt is a testis-specific member of the BET subfamily of bromodomain proteins; the first bromodomain in Brdt has been shown to be essential for male germ cell differentiation. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine.	102
99931	cd05499	Bromo_BDF1_2_II	Bromodomain. BDF1/BDF2 like subfamily, restricted to fungi, repeat II. BDF1 and BDF2 are yeast transcription factors involved in the expression of a wide range of genes, including snRNAs; they are required for sporulation and DNA repair and protect histone H4 from deacetylation. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine.	102
99932	cd05500	Bromo_BDF1_2_I	Bromodomain. BDF1/BDF2 like subfamily, restricted to fungi, repeat I. BDF1 and BDF2 are yeast transcription factors involved in the expression of a wide range of genes, including snRNAs; they are required for sporulation and DNA repair and protect histone H4 from deacetylation. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine.	103
99933	cd05501	Bromo_SP100C_like	Bromodomain, SP100C_like subfamily. The SP100C protein is a splice variant of SP100, a major component of PML-SP100 nuclear bodies (NBs), which are poorly understood. It is covalently modified by SUMO-1 and may play a role in processes at the chromatin level. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine.	102
99934	cd05502	Bromo_tif1_like	Bromodomain; tif1_like subfamily. Tif1 (transcription intermediary factor 1) is a member of the tripartite motif (TRIM) protein family, which is characterized by a particular domain architecture. It functions by recruiting coactivators and/or corepressors to modulate transcription. Vertebrate Tif1-gamma, also labeled E3 ubiquitin-protein ligase TRIM33, plays a role in the control of hematopoiesis. Its homologue in Xenopus laevis, Ectodermin, has been shown to function in germ-layer specification and control of cell growth during embryogenesis. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine.	109
99935	cd05503	Bromo_BAZ2A_B_like	Bromodomain, BAZ2A/BAZ2B_like subfamily. Bromo adjacent to zinc finger 2A (BAZ2A) and 2B (BAZ2B) were identified as a novel human bromodomain gene by cDNA library screening. BAZ2A is also known as Tip5 (Transcription termination factor I-interacting protein 5) and hWALp3. The proteins may play roles in transcriptional regulation. Human Tip5 is part of a complex termed NoRC (nucleolar remodeling complex), which induces nucleosome sliding and may play a role in the regulation of the rDNA locus. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine.	97
99936	cd05504	Bromo_Acf1_like	Bromodomain; Acf1_like or BAZ1A_like subfamily. Bromo adjacent to zinc finger 1A (BAZ1A) was identified as a novel human bromodomain gene by cDNA library screening. The Drosophila homologue, Acf1, is part of the CHRAC (chromatin accessibility complex) and regulates ISWI-induced nucleosome remodeling. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine.	115
99937	cd05505	Bromo_WSTF_like	Bromodomain; Williams syndrome transcription factor-like subfamily (WSTF-like). The Williams-Beuren syndrome deletion transcript 9 is a putative transcriptional regulator. WSTF was found to play a role in vitamin D-mediated transcription as part of two chromatin remodeling complexes, WINAC and WICH. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine.	97
99938	cd05506	Bromo_plant1	Bromodomain, uncharacterized subfamily specific to plants. Might function as a global transcription factor. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine.	99
99939	cd05507	Bromo_brd8_like	Bromodomain, brd8_like subgroup. In mammals, brd8 (bromodomain containing 8) interacts with the thyroid hormone receptor in a ligand-dependent fashion and enhances thyroid hormone-dependent activation from thyroid response elements. Brd8 is thought to be a nuclear receptor coactivator. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine.	104
99940	cd05508	Bromo_RACK7	Bromodomain, RACK7_like subfamily. RACK7 (also called human protein kinase C-binding protein) was identified as a potential tumor suppressor gene; it shares domain architecture with BS69/ZMYND11, and both have been implicated in the regulation of cellular proliferation. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine.	99
99941	cd05509	Bromo_gcn5_like	Bromodomain; Gcn5_like subfamily. Gcn5p is a histone acetyltransferase (HAT) which mediates acetylation of histones at lysine residues; such acetylation is generally correlated with the activation of transcription. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine.	101
99942	cd05510	Bromo_SPT7_like	Bromodomain; SPT7_like subfamily. SPT7 is a yeast protein that functions as a component of the transcription regulatory histone acetylation (HAT) complexes SAGA, SALSA, and SLIK. SAGA is involved in the RNA polymerase II-dependent transcriptional regulation of about 10% of all yeast genes. The SPT7 bromodomain has been shown to weakly interact with acetylated histone H3, but not H4. The human representative of this subfamily is cat eye syndrome critical region protein 2 (CECR2). Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine.	112
99943	cd05511	Bromo_TFIID	Bromodomain, TFIID-like subfamily. Human TAFII250 (or TAF250) is the largest subunit of TFIID, a large multi-domain complex, which initiates the assembly of the transcription machinery. TAFII250 contains two bromodomains that specifically bind to acetylated histone H4. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine.	112
99944	cd05512	Bromo_brd1_like	Bromodomain; brd1_like subfamily. BRD1 is a mammalian gene which encodes for a nuclear protein assumed to be a transcriptional regulator. BRD1 has been implicated with brain development and susceptibility to schizophrenia and bipolar affective disorder. Bromodomains are 110 amino acid long domains that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine.	98
99945	cd05513	Bromo_brd7_like	Bromodomain, brd7_like subgroup. The BRD7 gene encodes a nuclear protein that has been shown to inhibit cell growth and the progression of the cell cycle by regulating cell-cycle genes at the transcriptional level. BRD7 has been identified as a gene involved in nasopharyngeal carcinoma. The protein interacts with acetylated histone H3 via its bromodomain. Bromodomains are 110 amino acid long domains that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine.	98
99946	cd05515	Bromo_polybromo_V	Bromodomain, polybromo repeat V. Polybromo is a nuclear protein of unknown function, which contains 6 bromodomains. The human ortholog BAF180 is part of a SWI/SNF chromatin-remodeling complex, and it may carry out the functions of Yeast Rsc-1 and Rsc-2. It was shown that polybromo bromodomains bind to histone H3 at specific acetyl-lysine positions. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine, but not all the bromodomains in polybromo may bind to acetyl-lysine.	105
99947	cd05516	Bromo_SNF2L2	Bromodomain, SNF2L2-like subfamily, specific to animals. SNF2L2 (SNF2-alpha) or SWI/SNF-related matrix-associated actin-dependent regulator of chromatin subfamily A member 2 is a global transcriptional activator, which cooperates with nuclear hormone receptors to boost transcriptional activation. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine.	107
99948	cd05517	Bromo_polybromo_II	Bromodomain, polybromo repeat II. Polybromo is a nuclear protein of unknown function, which contains 6 bromodomains. The human ortholog BAF180 is part of a SWI/SNF chromatin-remodeling complex, and it may carry out the functions of Yeast Rsc-1 and Rsc-2. It was shown that polybromo bromodomains bind to histone H3 at specific acetyl-lysine positions. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine, but not all the bromodomains in polybromo may bind to acetyl-lysine.	103
99949	cd05518	Bromo_polybromo_IV	Bromodomain, polybromo repeat IV. Polybromo is a nuclear protein of unknown function, which contains 6 bromodomains. The human ortholog BAF180 is part of a SWI/SNF chromatin-remodeling complex, and it may carry out the functions of Yeast Rsc-1 and Rsc-2. It was shown that polybromo bromodomains bind to histone H3 at specific acetyl-lysine positions. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine, but not all the bromodomains in polybromo may bind to acetyl-lysine.	103
99950	cd05519	Bromo_SNF2	Bromodomain, SNF2-like subfamily, specific to fungi. SNF2 is a yeast protein involved in transcriptional activation, it is the catalytic component of the SWI/SNF ATP-dependent chromatin remodeling complex. The protein is essential for the regulation of gene expression (both positive and negative) of a large number of genes. The SWI/SNF complex changes chromatin structure by altering DNA-histone contacts within the nucleosome, which results in a re-positioning of the nucleosome and facilitates or represses the binding of gene-specific transcription factors. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine.	103
99951	cd05520	Bromo_polybromo_III	Bromodomain, polybromo repeat III. Polybromo is a nuclear protein of unknown function, which contains 6 bromodomains. The human ortholog BAF180 is part of a SWI/SNF chromatin-remodeling complex, and it may carry out the functions of Yeast Rsc-1 and Rsc-2. It was shown that polybromo bromodomains bind to histone H3 at specific acetyl-lysine positions. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine, but not all the bromodomains in polybromo may bind to acetyl-lysine.	103
99952	cd05521	Bromo_Rsc1_2_I	Bromodomain, repeat I in Rsc1/2_like subfamily, specific to fungi. Rsc1 and Rsc2 are components of the RSC complex (remodeling the structure of chromatin), are essential for transcriptional control, and have a specific domain architecture including two bromodomains. The RSC complex has also been linked to homologous recombination and nonhomologous end-joining repair of DNA double strand breaks. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine.	106
99953	cd05522	Bromo_Rsc1_2_II	Bromodomain, repeat II in Rsc1/2_like subfamily, specific to fungi. Rsc1 and Rsc2 are components of the RSC complex (remodeling the structure of chromatin), are essential for transcriptional control, and have a specific domain architecture including two bromodomains. The RSC complex has also been linked to homologous recombination and nonhomologous end-joining repair of DNA double strand breaks. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine.	104
99954	cd05524	Bromo_polybromo_I	Bromodomain, polybromo repeat I. Polybromo is a nuclear protein of unknown function, which contains 6 bromodomains. The human ortholog BAF180 is part of a SWI/SNF chromatin-remodeling complex, and it may carry out the functions of Yeast Rsc-1 and Rsc-2. It was shown that polybromo bromodomains bind to histone H3 at specific acetyl-lysine positions. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine, but not all the bromodomains in polybromo may bind to acetyl-lysine.	113
99955	cd05525	Bromo_ASH1	Bromodomain; ASH1_like sub-family. ASH1 (absent, small, or homeotic 1) is a member of the trithorax-group in Drosophila melanogaster, an epigenetic transcriptional regulator of HOX genes. Drosophila ASH1 has been shown to methylate specific lysines in histones H3 and H4. Mammalian ASH1 has been shown to methylate histone H3. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine.	106
99956	cd05526	Bromo_polybromo_VI	Bromodomain, polybromo repeat VI. Polybromo is a nuclear protein of unknown function, which contains 6 bromodomains. The human ortholog BAF180 is part of a SWI/SNF chromatin-remodeling complex, and it may carry out the functions of Yeast Rsc-1 and Rsc-2. It was shown that polybromo bromodomains bind to histone H3 at specific acetyl-lysine positions. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine, but not all the bromodomains in polybromo may bind to acetyl-lysine.	110
99957	cd05528	Bromo_AAA	Bromodomain; sub-family co-occurring with AAA domains. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine. The structure (2DKW) in this alignment is an uncharacterized protein predicted from analysis of cDNA clones from human fetal liver.	112
99958	cd05529	Bromo_WDR9_I_like	Bromodomain; WDR9 repeat I_like subfamily. WDR9 is a human gene located in the Down Syndrome critical region-2 of chromosome 21. It encodes for a nuclear protein containing WD40 repeats and two bromodomains, which may function as a transcriptional regulator involved in chromatin remodeling and play a role in embryonic development. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine.	128
99913	cd05530	POLBc_B1	DNA polymerase type-B B1 subfamily catalytic domain. Archaeal proteins that are involved in DNA replication are similar to those from eukaryotes. Some archaeal members also possess multiple family B DNA polymerases (B1, B2 and B3). So far, no specific function has been assigned to the different members of the archaeal type B DNA polymerases. Phylogenetic analyses of eubacterial, archaeal, and eukaryotic family B DNA polymerases support independent gene duplications during the evolution of archaeal and eukaryotic family B DNA polymerases.	372
99914	cd05531	POLBc_B2	DNA polymerase type-B B2 subfamily catalytic domain. Archaeal proteins that are involved in DNA replication are similar to those from eukaryotes. Some archaeal members also possess multiple family B DNA polymerases (B1, B2 and B3). So far, no specific function has been assigned to the different members of the archaeal type B DNA polymerases. Phylogenetic analyses of eubacterial, archaeal, and eukaryotic family B DNA polymerases support independent gene duplications during the evolution of archaeal and eukaryotic family B DNA polymerases.	352
99915	cd05532	POLBc_alpha	DNA polymerase type-B alpha subfamily catalytic domain. Three DNA-dependent DNA polymerases type B (alpha, delta, and epsilon) have been identified as essential for nuclear DNA replication in eukaryotes. DNA polymerase (Pol) alpha is almost exclusively required for the initiation of DNA replication and the priming of Okazaki fragments during elongation. In most organisms no specific repair role, other than check point control, has been assigned to this enzyme. Pol alpha contains both polymerase and exonuclease domains, but lacks exonuclease activity suggesting that the exonuclease domain may be for structural purposes only.	400
99916	cd05533	POLBc_delta	DNA polymerase type-B delta subfamily catalytic domain. Three DNA-dependent DNA polymerases type B (alpha, delta, and epsilon) have been identified as essential for nuclear DNA replication in eukaryotes. Presently, no direct data is available regarding the strand specificity of DNA polymerase during DNA replication in vivo. However, mutation analysis supports the hypothesis that DNA polymerase delta is the enzyme responsible for both elongation and maturation of Okazaki fragments on the lagging strand.	393
99917	cd05534	POLBc_zeta	DNA polymerase type-B zeta subfamily catalytic domain. DNA polymerase (Pol) zeta is a member of the eukaryotic B-family of DNA polymerases and distantly related to DNA Pol delta. Pol zeta plays a major role in translesion replication and the production of either spontaneous or induced mutations. Apart from its role in translesion replication, Pol zeta also appears to be involved in somatic hypermutability in B lymphocytes, an important element for the production of high affinity antibodies in response to an antigen.	451
99918	cd05535	POLBc_epsilon	DNA polymerase type-B epsilon subfamily catalytic domain. Three DNA-dependent DNA polymerases type B (alpha, delta, and epsilon) have been identified as essential for nuclear DNA replication in eukaryotes. DNA polymerase (Pol) epsilon has been proposed to play a role in elongation of the leading strand during DNA replication. Pol epsilon might also have a role in DNA repair. The structure of Pol epsilon is characteristic of this family with the exception that it contains a large C-terminal domain with an unclear function. Phylogenetic analyses indicate that Pol epsilon is the ortholog to the archaeal Pol B3 rather than to Pol alpha, delta, or zeta. This might be because Pol epsilon is ancestral to both archaeal and eukaryotic type B DNA polymerases.	621
99919	cd05536	POLBc_B3	DNA polymerase type-B B3 subfamily catalytic domain. Archaeal proteins that are involved in DNA replication are similar to those from eukaryotes. Some members of the archaea also possess multiple family B DNA polymerases (B1, B2 and B3). So far, no specific function has been assigned to the different members of the archaeal type B DNA polymerases. Phylogenetic analyses of eubacterial, archaeal, and eukaryotic family B DNA polymerases support independent gene duplications during the evolution of archaeal and eukaryotic family B DNA polymerases. Structural comparison of the thermostable DNA polymerase type B to its mesostable homolog suggests several adaptations to high temperature such as shorter loops, disulfide bridges, and increasing electrostatic interaction at subdomain interfaces.	371
99920	cd05537	POLBc_Pol_II	DNA polymerase type-II subfamily catalytic domain. Bacteria contain five DNA polymerases (I, II, III, IV and V). DNA polymerase II (Pol II) is a prototype for the B-family of polymerases. The role of Pol II in a variety of cellular activities, such as repair of DNA damaged by UV irradiation or oxidation, has been proven by genetic studies. DNA polymerase III is the main enzyme responsible for replication of the bacterial chromosome; however, in vivo studies have also shown that Pol II is able to participate in chromosomal DNA replication, with a larger role in lagging-strand replication.	371
99921	cd05538	POLBc_Pol_II_B	DNA polymerase type-II B subfamily catalytic domain. Bacteria contain five DNA polymerases (I, II, III, IV and V). DNA polymerase II (Pol II) is a prototype for the B-family of polymerases. The role of Pol II in a variety of cellular activities, such as repair of DNA damaged by UV irradiation or oxidation, has been proven by genetic studies. DNA polymerase III is the main enzyme responsible for replication of the bacterial chromosome; however, in vivo studies have also shown that Pol II is able to participate in chromosomal DNA replication, with a larger role in lagging-strand replication.	347
349776	cd05540	UreG	urease accessory protein UreG. UreG is one of the four accessory proteins of urease. Urease is an enzyme which catalyzes the decomposition of urea to form ammonia and carbon dioxide. Bacterial urease is a trimer of three subunits which are encoded by genes ureA, ureB, and ureC. Up to four accessory proteins (ureD, ureE, ureF, and ureG) are required for urease catalytic function. UreG may play an important role in nickel incorporation into the urease metallocenter. UreG is a member of the Fer4_NifH superfamily which contains an ATP-binding domain. Proteins in this superfamily use the energy from hydrolysis of NTP to transfer electrons or ions.	191
349405	cd05559	CAP_PI16_HrTT-1	CAP (cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins) domain of peptidase inhibitor 16 and HrTT-1 protein. Human peptidase inhibitor 16 (PI16) is also called cysteine-rich secretory protein 9 (CRISP-9) or PSP94-binding protein. Mouse PI16 is also called cysteine-rich protease inhibitor. PI16 is predominantly expressed by cardiac fibroblasts and is exposed to the interstitium via a glycophosphatidylinositol (-GPI) membrane anchor. It suppresses the activation of the chemokine chemerin in the myocardium, which may be a part of the cardiac stress response. At high endothelial shear stress, PI16 is an inflammation-regulated inhibitor of matrix metalloproteinase 2 (MMP2). Also included in this subfamily is the HrTT-1 protein, a tail-tip epidermis marker in ascidians. The wider family of CAP domain containing proteins includes plant pathogenesis-related protein 1 (PR-1), cysteine-rich secretory proteins (CRISPs), and allergen 5 from vespid venom, among others.	134
240187	cd05560	Xcc1710_like	Xcc1710_like family, specific to proteobacteria. Xcc1710 is a hypothetical protein from Xanthomonas campestris pv. campestris str. ATCC 33913, similar to Mth938, a hypothetical protein encoded by the Methanobacterium thermoautotrophicum (Mth) genome. Their three-dimensional structures have been determined, but their functions are unknown.	109
173797	cd05561	Peptidases_S8_4	Peptidase S8 family domain, uncharacterized subfamily 4. This family is a member of the Peptidases S8 or Subtilases serine endo- and exo-peptidase clan. They have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The stability of subtilases may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values.	239
173798	cd05562	Peptidases_S53_like	Peptidase domain in the S53 family. Members of the peptidase S53 (sedolisin) family include endopeptidases and exopeptidases. The S53 family contains a catalytic triad Glu/Asp/Ser with an additional acidic residue Asp in the oxyanion hole, similar to that of Asn in subtilisin. The stability of these enzymes may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values. Characterized sedolisins include Kumamolisin, an extracellular calcium-dependent thermostable endopeptidase from Bacillus. The enzyme is synthesized with a 188 amino acid N-terminal preprotein region which is cleaved after the extraction into the extracellular space with low pH. One kumamolisin paralog, kumamolisin-As, is believed to be a collagenase. TPP1 is a serine protease that functions as a tripeptidyl exopeptidase as well as an endopeptidase. Less is known about PSCP from Pseudomonas which is thought to be an aspartic proteinase.	275
99905	cd05563	PTS_IIB_ascorbate	PTS_IIB_ascorbate: subunit IIB of enzyme II (EII) of the L-ascorbate-specific phosphoenolpyruvate:carbohydrate phosphotransferase system (PTS). In this system, EII is an L-ascorbate-specific permease with two cytoplasmic subunits (IIA and IIB) and a transmembrane channel IIC subunit. Subunits IIA, IIB, and IIC are encoded by the sgaA, sgaB, and sgaT genes of the E. coli sgaTBA operon. In some bacteria, the IIB (SgaB) domain is fused C-terminal to the IIA (SgaA) domain. The IIB domain fold includes a central four-stranded parallel open twisted beta-sheet flanked by alpha-helices on both sides. The seven major PTS systems with this IIB fold include ascorbate, chitobiose/lichenan, lactose, galactitol, mannitol, fructose, and a sensory system with similarity to the bacterial bgl system.	86
99906	cd05564	PTS_IIB_chitobiose_lichenan	PTS_IIB_chitobiose_lichenan: subunit IIB of enzyme II (EII) of the N,N-diacetylchitobiose-specific and lichenan-specific phosphoenolpyruvate:carbohydrate phosphotransferase system (PTS). In these systems, EII is either a lichenan- or an N,N-diacetylchitobiose-specific permease with two cytoplasmic domains (IIA and IIB) and a transmembrane channel IIC domain. In the chitobiose system, these subunits are expressed as separate proteins from chbA, chbB, and chbC of the chb operon (formerly the cel (cellulose) operon). In the lichenan system, these subunits are expressed from licA, licB, and licC of the lic operon. The lic operon of Bacillus subtilis is required for the transport and degradation of oligomeric beta-glucosides, which are produced by extracellular enzymes on substrates such as lichenan or barley glucan. The lic operon is transcribed from a sigmaA-dependent promoter and is inducible by lichenan, lichenan hydrolysate, and cellobiose. The IIB domain fold includes a central four-stranded parallel open twisted beta-sheet flanked by alpha-helices on both sides. The seven major PTS systems with this IIB fold include chitobiose/lichenan, ascorbate, lactose, galactitol, mannitol, fructose, and a sensory system with similarity to the bacterial bgl system.	96
99907	cd05565	PTS_IIB_lactose	PTS_IIB_lactose: subunit IIB of enzyme II (EII) of the lactose-specific phosphoenolpyruvate:carbohydrate phosphotransferase system (PTS) found in Firmicutes as well as Actinobacteria. In this system, EII is a lactose-specific permease with two cytoplasmic domains (IIA and IIB) and a transmembrane channel IIC domain. The IIC and IIB domains are expressed as a single protein from the lac operon. The IIB domain fold includes a central four-stranded parallel open twisted beta-sheet flanked by alpha-helices on both sides. The seven major PTS systems with this IIB fold include lactose, chitobiose/lichenan, ascorbate, galactitol, mannitol, fructose, and a sensory system with similarity to the bacterial bgl system.	99
99908	cd05566	PTS_IIB_galactitol	PTS_IIB_galactitol: subunit IIB of enzyme II (EII) of the galactitol-specific phosphoenolpyruvate:carbohydrate phosphotransferase system (PTS). In this system, EII is a galactitol-specific permease with two cytoplasmic domains (IIA and IIB) and a transmembrane channel IIC domain that are expressed on three distinct polypeptide chains, in contrast to other PTS sugar transporters. The three genes encoding these subunits (gatA, gatB, and gatC) comprise the gatCBA operon. Galactitol PTS permease takes up exogenous galactitol, releasing the phosphate ester into the cytoplasm in preparation for oxidation and further metabolism via a modified glycolytic pathway called the tagatose-6-phosphate glycolytic pathway. The IIB domain fold includes a central four-stranded parallel open twisted beta-sheet flanked by alpha-helices on both sides. The seven major PTS systems with this IIB fold include galactitol, chitobiose/lichenan, ascorbate, lactose, mannitol, fructose, and a sensory system with similarity to the bacterial bgl system.	89
99909	cd05567	PTS_IIB_mannitol	PTS_IIB_mannitol: subunit IIB of enzyme II (EII) of the mannitol-specific phosphoenolpyruvate:carbohydrate phosphotransferase system (PTS). In this system, EII is a mannitol-specific permease with two cytoplasmic domains (IIA and IIB) and a transmembrane channel IIC domain. The IIA, IIB, and IIC domains are expressed from the mtlA gene as a single protein, also known as the mannitol PTS permease, the mtl transporter, or MtlA. MtlA is only functional as a dimer with the dimer contacts occurring between the IIC domains. MtlA takes up exogenous mannitol, releasing the phosphate ester into the cytoplasm in preparation for oxidation to fructose-6-phosphate by the NAD-dependent mannitol-P dehydrogenase (MtlD). The IIB domain fold includes a central four-stranded parallel open twisted beta-sheet flanked by alpha-helices on both sides. The seven major PTS systems with this IIB fold include mannitol, chitobiose/lichenan, ascorbate, lactose, galactitol, fructose, and a sensory system with similarity to the bacterial bgl system.	87
99910	cd05568	PTS_IIB_bgl_like	PTS_IIB_bgl_like: the PTS (phosphotransferase system) IIB domain of a family of sensory systems composed of a membrane-bound sugar-sensor (similar to BglF) and a transcription antiterminator (similar to BglG) which regulate expression of genes involved in sugar utilization. The domain architecture of the IIB-containing protein includes a region N-terminal to the IIB domain which is homologous to the BglG transcription antiterminator with an RNA-binding domain followed by two homologous domains, PRD1 and PRD2 (PTS Regulation Domains). C-terminal to the IIB domain is a domain similar to the PTS IIA domain. In this system, the BglG-like region and the IIB and IIA-like domains are all expressed together as a single multidomain protein. The IIB domain fold includes a central four-stranded parallel open twisted beta-sheet flanked by alpha-helices on both sides. The seven major PTS systems with this IIB fold include this sensory system with similarity to the bacterial bgl system, chitobiose/lichenan, ascorbate, lactose, galactitol, mannitol, and fructose systems.	85
99911	cd05569	PTS_IIB_fructose	PTS_IIB_fructose: subunit IIB of enzyme II (EII) of the fructose-specific phosphoenolpyruvate:carbohydrate phosphotransferase system (PTS). In this system, EII (also referred to as FruAB) is a fructose-specific permease made up of two proteins (FruA and FruB) each containing 3 domains. The FruA protein contains two tandem nonidentical IIB domains and a C-terminal IIC transmembrane domain. Both IIB domains of FruA are included in this alignment. The FruB protein (also referred to as diphosphoryl transfer protein) contains a IIA domain, a domain of unknown function, and an Hpr-like domain called FPr (fructose-inducible HPr). This family also includes the IIB domains of several fructose-like PTS permeases including the Frv permease encoded by the frvABXR operon, the Frw permease encoded by the frwACBD operon, the Frx permease encoded by the hrsA gene, and the Fry permease encoded by the fryABC (ypdDGH) operon. FruAB takes up exogenous fructose, releasing the 1-phosphate ester into the cytoplasm in preparation for metabolism primarily via glycolysis. The IIB domain fold includes a central four-stranded parallel open twisted beta-sheet flanked by alpha-helices on both sides. The seven major PTS systems with this IIB fold include fructose, chitobiose/lichenan, ascorbate, lactose, galactitol, mannitol, and a sensory system with similarity to the bacterial bgl system.	96
270722	cd05570	STKc_PKC	Catalytic domain of the Serine/Threonine Kinase, Protein Kinase C. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domain. PKCs undergo three phosphorylations in order to take mature forms. In addition, classical PKCs depend on calcium, DAG (1,2-diacylglycerol), and in most cases, phosphatidylserine (PS) for activation. Novel PKCs are calcium-independent, but require DAG and PS for activity, while atypical PKCs only require PS. PKCs phosphorylate and modify the activities of a wide variety of cellular proteins including receptors, enzymes, cytoskeletal proteins, transcription factors, and other kinases. They play a central role in signal transduction pathways that regulate cell migration and polarity, proliferation, differentiation, and apoptosis. Also included in this subfamily are the PKC-like proteins, called PKNs. The PKC subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	318
270723	cd05571	STKc_PKB	Catalytic domain of the Serine/Threonine Kinase, Protein Kinase B. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. There are three PKB isoforms from different genes, PKB-alpha (or Akt1), PKB-beta (or Akt2), and PKB-gamma (or Akt3). PKB contains an N-terminal pleckstrin homology (PH) domain and a C-terminal catalytic domain. It is activated downstream of phosphoinositide 3-kinase (PI3K) and plays important roles in diverse cellular functions including cell survival, growth, proliferation, angiogenesis, motility, and migration. PKB also has a central role in a variety of human cancers, having been implicated in tumor initiation, progression, and metastasis. The PKB subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and PI3K.	322
270724	cd05572	STKc_cGK	Catalytic domain of the Serine/Threonine Kinase, cGMP-dependent protein kinase (cGK or PKG). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Mammals have two cGK isoforms from different genes, cGKI and cGKII. cGKI exists as two splice variants, cGKI-alpha and cGKI-beta. cGK consists of an N-terminal regulatory domain containing a dimerization and an autoinhibitory pseudosubstrate region, two cGMP-binding domains, and a C-terminal catalytic domain. Binding of cGMP to both binding sites releases the inhibition of the catalytic center by the pseudosubstrate region, allowing autophosphorylation and activation of the kinase. cGKI is a soluble protein expressed in all smooth muscles, platelets, cerebellum, and kidney. It is also expressed at lower concentrations in other tissues. cGKII is a membrane-bound protein that is most abundantly expressed in the intestine. It is also present in the brain nuclei, adrenal cortex, kidney, lung, and prostate. cGKI is involved in the regulation of smooth muscle tone, smooth cell proliferation, and platelet activation. cGKII plays a role in the regulation of secretion, such as renin secretion by the kidney and aldosterone secretion by the adrenal. It also regulates bone growth and the circadian rhythm. The cGK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	262
270725	cd05573	STKc_ROCK_NDR_like	Catalytic domain of Rho-associated coiled-coil containing protein kinase (ROCK)- and Nuclear Dbf2-Related (NDR)-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Members of this subfamily include ROCK and ROCK-like proteins such as DMPK, MRCK, and CRIK, as well as NDR and NDR-like proteins such as LATS, CBK1 and Sid2p. ROCK and CRIK are effectors of the small GTPase Rho, while MRCK is an effector of the small GTPase Cdc42. NDR and NDR-like kinases contain an N-terminal regulatory (NTR) domain and an insert within the catalytic domain that contains an auto-inhibitory sequence. Proteins in this subfamily are involved in regulating many cellular functions including contraction, motility, division, proliferation, apoptosis, morphogenesis, and cytokinesis. The ROCK/NDR-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	350
270726	cd05574	STKc_phototropin_like	Catalytic domain of Phototropin-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Phototropins are blue-light receptors that control responses such as phototropism, stomatal opening, and chloroplast movement in order to optimize the photosynthetic efficiency of plants. They are light-activated STKs that contain an N-terminal photosensory domain and a C-terminal catalytic domain. The N-terminal domain contains two LOV (Light, Oxygen or Voltage) domains that bind FMN. Photoexcitation of the LOV domains results in autophosphorylation at multiple sites and activation of the catalytic domain. In addition to plant phototropins, included in this subfamily are predominantly uncharacterized fungal STKs whose catalytic domains resemble the phototropin kinase domain. One protein from Neurospora crassa is called nrc-2, which plays a role in growth and development by controlling entry into the conidiation program. The phototropin-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	316
270727	cd05575	STKc_SGK	Catalytic domain of the Serine/Threonine Kinase, Serum- and Glucocorticoid-induced Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. SGKs are activated by insulin and growth factors via phosphoinositide 3-kinase and PDK1. They activate ion channels, ion carriers, and the Na-K-ATPase, as well as regulate the activity of enzymes and transcription factors. SGKs play important roles in transport, hormone release, neuroexcitability, cell proliferation, and apoptosis. There are three isoforms of SGK, named SGK1, SGK2, and SGK3 (also called cytokine-independent survival kinase CISK). The SGK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	323
270728	cd05576	STKc_RPK118_like	Catalytic domain of the Serine/Threonine Kinase, RPK118, and similar proteins. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. RPK118 contains an N-terminal Phox homology (PX) domain, a Microtubule Interacting and Trafficking (MIT) domain, and a kinase domain containing a long uncharacterized insert. Also included in the family is human RPK60 (or ribosomal protein S6 kinase-like 1), which also contains MIT and kinase domains but lacks a PX domain. RPK118 binds sphingosine kinase, a key enzyme in the synthesis of sphingosine 1-phosphate (SPP), a lipid messenger involved in many cellular events. RPK118 may be involved in transmitting SPP-mediated signaling. RPK118 also binds the antioxidant peroxiredoxin-3. RPK118 may be involved in the transport of PRDX3 from the cytoplasm to its site of function in the mitochondria. The RPK118-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	265
270729	cd05577	STKc_GRK	Catalytic domain of the Serine/Threonine Kinase, G protein-coupled Receptor Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. GRKs phosphorylate and regulate G protein-coupled receptors (GPCRs), the largest superfamily of cell surface receptors, which regulate some part of nearly all physiological functions. Phosphorylated GPCRs bind to arrestins, which prevents further G protein signaling despite the presence of activating ligand. GRKs play important roles in the cardiovascular, immune, respiratory, skeletal, and nervous systems. They contain a central catalytic domain, flanked by N- and C-terminal extensions. The N-terminus contains an RGS (regulator of G protein signaling) homology (RH) domain and several motifs. The C-terminus diverges among different groups of GRKs. There are seven types of GRKs, named GRK1 to GRK7, which are subdivided into three main groups: visual (GRK1/7); beta-adrenergic receptor kinases (GRK2/3); and GRK4-like (GRK4/5/6). Expression of GRK2/3/5/6 is widespread while GRK1/4/7 show a limited tissue distribution. The substrate spectrum of the widely expressed GRKs partially overlaps. The GRK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	278
270730	cd05578	STKc_Yank1	Catalytic domain of the Serine/Threonine Kinase, Yank1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily contains uncharacterized STKs with similarity to the human protein designated as Yank1 or STK32A. The Yank1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	257
270731	cd05579	STKc_MAST_like	Catalytic domain of Microtubule-associated serine/threonine (MAST) kinase-like proteins. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily includes MAST kinases, MAST-like (MASTL) kinases (also called greatwall kinase or Gwl), and fungal kinases with similarity to Saccharomyces cerevisiae Rim15 and Schizosaccharomyces pombe cek1. MAST kinases contain an N-terminal domain of unknown function, a central catalytic domain, and a C-terminal PDZ domain that mediates protein-protein interactions. MASTL kinases carry only a catalytic domain which contains a long insert relative to other kinases. The fungal kinases in this subfamily harbor other domains in addition to a central catalytic domain, which, as in MASTL, also contains an insert relative to MAST kinases. Rim15 contains a C-terminal signal receiver (REC) domain while cek1 contains an N-terminal PAS domain. MAST kinases are cytoskeleton-associated kinases of unknown function that are also expressed at neuromuscular junctions and postsynaptic densities. MASTL/Gwl is involved in the regulation of mitotic entry, mRNA stabilization, and DNA checkpoint recovery. The fungal proteins Rim15 and cek1 are involved in the regulation of meiosis and mitosis, respectively. The MAST-like kinase subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	272
270732	cd05580	STKc_PKA_like	Catalytic subunit of the Serine/Threonine Kinases, cAMP-dependent protein kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of the cAMP-dependent protein kinases, PKA and PRKX, and similar proteins. The inactive PKA holoenzyme is a heterotetramer composed of two phosphorylated and active catalytic subunits with a dimer of regulatory (R) subunits. Activation is achieved through the binding of the important second messenger cAMP to the R subunits, which leads to the dissociation of PKA into the R dimer and two active subunits. PKA is present ubiquitously in cells and interacts with many different downstream targets. It plays a role in the regulation of diverse processes such as growth, development, memory, metabolism, gene expression, immunity, and lipolysis. PRKX is also regulated by the R subunit and is present in many tissues including fetal and adult brain, kidney, and lung. It is implicated in granulocyte/macrophage lineage differentiation, renal cell epithelial migration, and tubular morphogenesis in the developing kidney. The PKA-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	290
270733	cd05581	STKc_PDK1	Catalytic domain of the Serine/Threonine Kinase, Phosphoinositide-dependent kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PDK1 carries an N-terminal catalytic domain and a C-terminal pleckstrin homology (PH) domain that binds phosphoinositides. It phosphorylates the activation loop of AGC kinases that are regulated by PI3K such as PKB, SGK, and PKC, among others, and is crucial for their activation. Thus, it contributes to regulating many processes including metabolism, growth, proliferation, and survival. PDK1 also has the ability to autophosphorylate and is constitutively active in mammalian cells. It is essential for normal embryo development and is important in regulating cell volume. The PDK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	278
270734	cd05582	STKc_RSK_N	N-terminal catalytic domain of the Serine/Threonine Kinase, 90 kDa ribosomal protein S6 kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. RSKs contain an N-terminal kinase domain (NTD) from the AGC family and a C-terminal kinase domain (CTD) from the CAMK family. They are activated by signaling inputs from extracellular regulated kinase (ERK) and phosphoinositide dependent kinase 1 (PDK1). ERK phosphorylates and activates the CTD of RSK, serving as a docking site for PDK1, which phosphorylates and activates the NTD, which in turn phosphorylates all known RSK substrates. RSKs act as downstream effectors of mitogen-activated protein kinase (MAPK) and play key roles in mitogen-activated cell growth, differentiation, and survival. Mammals possess four RSK isoforms (RSK1-4) from distinct genes. RSK proteins are also referred to as MAP kinase-activated protein kinases (MAPKAPKs), p90-RSKs, or p90S6Ks. The RSK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	317
270735	cd05583	STKc_MSK_N	N-terminal catalytic domain of the Serine/Threonine Kinase, Mitogen and stress-activated kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MSKs contain an N-terminal kinase domain (NTD) from the AGC family and a C-terminal kinase domain (CTD) from the CAMK family. MSKs are activated by two major signaling cascades, the Ras-MAPK and p38 stress kinase pathways, in response to various stimuli such as growth factors, hormones, neurotransmitters, cellular stress, and pro-inflammatory cytokines. This triggers phosphorylation in the activation loop (A-loop) of the CTD of MSK. The active CTD phosphorylates the hydrophobic motif (HM) in the C-terminal extension of NTD, which facilitates the phosphorylation of the A-loop and activates the NTD, which in turn phosphorylates downstream targets. MSKs are predominantly nuclear proteins. They are widely expressed in many tissues including heart, brain, lung, liver, kidney, and pancreas. There are two isoforms of MSK, called MSK1 and MSK2. The MSK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	268
270736	cd05584	STKc_p70S6K	Catalytic domain of the Serine/Threonine Kinase, 70 kDa ribosomal protein S6 kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. p70S6K (or S6K) contains only one catalytic kinase domain, unlike p90 ribosomal S6 kinases (RSKs). It acts as a downstream effector of the STK mTOR (mammalian Target of Rapamycin) and plays a role in the regulation of the translation machinery during protein synthesis. p70S6K also plays a pivotal role in regulating cell size and glucose homeostasis. Its targets include S6, the translation initiation factor eIF3, and the insulin receptor substrate IRS-1, among others. Mammals contain two isoforms of p70S6K, named S6K1 and S6K2 (or S6K-beta). The p70S6K subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	323
270737	cd05585	STKc_YPK1_like	Catalytic domain of Yeast Protein Kinase 1-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of fungal proteins with similarity to the AGC STKs, Saccharomyces cerevisiae YPK1 and Schizosaccharomyces pombe Gad8p. YPK1 is required for cell growth and acts as a downstream kinase in the sphingolipid-mediated signaling pathway of yeast. It also plays a role in efficient endocytosis and in the maintenance of cell wall integrity. Gad8p is a downstream target of Tor1p, the fission yeast homolog of mTOR. It plays a role in cell growth and sexual development. The YPK1-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	313
270738	cd05586	STKc_Sck1_like	Catalytic domain of Suppressor of loss of cAMP-dependent protein kinase-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of Schizosaccharomyces pombe Sck1 and similar fungal proteins. Sck1 plays a role in trehalase activation triggered by glucose and a nitrogen source. Trehalase catalyzes the cleavage of the disaccharide trehalose to glucose. Trehalose, as a carbohydrate reserve and stress metabolite, plays an important role in the response of yeast to environmental changes. The Sck1-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	330
270739	cd05587	STKc_cPKC	Catalytic domain of the Serine/Threonine Kinase, Classical (or Conventional) Protein Kinase C. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. cPKCs are potent kinases for histones, myelin basic protein, and protamine. They depend on calcium, DAG (1,2-diacylglycerol), and in most cases, phosphatidylserine (PS) for activation. cPKCs contain a calcium-binding C2 region in their regulatory domain. There are four cPKC isoforms, named alpha, betaI, betaII, and gamma. PKC-gamma is mainly expressed in neuronal tissues. It plays a role in protection from ischemia. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domain. The cPKC subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	320
270740	cd05588	STKc_aPKC	Catalytic domain of the Serine/Threonine Kinase, Atypical Protein Kinase C. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. aPKCs only require phosphatidylserine (PS) for activation. They contain a C2-like region, instead of a calcium-binding (C2) region found in classical PKCs, in their regulatory domain. There are two aPKC isoforms, zeta and iota. aPKCs are involved in many cellular functions including proliferation, migration, apoptosis, polarity maintenance and cytoskeletal regulation. They also play a critical role in the regulation of glucose metabolism and in the pathogenesis of type 2 diabetes. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domain. The aPKC subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	328
270741	cd05589	STKc_PKN	Catalytic domain of the Serine/Threonine Kinase, Protein Kinase N. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PKN has a C-terminal catalytic domain that is highly homologous to PKCs. Its unique N-terminal regulatory region contains antiparallel coiled-coil (ACC) domains. In mammals, there are three PKN isoforms from different genes (designated PKN-alpha, beta, and gamma), which show different enzymatic properties, tissue distribution, and varied functions. PKN can be activated by the small GTPase Rho, and by fatty acids such as arachidonic and linoleic acids. It is involved in many biological processes including cytoskeletal regulation, cell adhesion, vesicle transport, glucose transport, regulation of meiotic maturation and embryonic cell cycles, signaling to the nucleus, and tumorigenesis. The PKN subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	326
270742	cd05590	STKc_nPKC_eta	Catalytic domain of the Serine/Threonine Kinase, Novel Protein Kinase C eta. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PKC-eta is predominantly expressed in squamous epithelia, where it plays a crucial role in the signaling of cell-type specific differentiation. It is also expressed in pro-B cells and early-stage thymocytes, and acts as a key regulator in early B-cell development. PKC-eta increases glioblastoma multiforme (GBM) proliferation and resistance to radiation, and is being developed as a therapeutic target for the management of GBM. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domain. nPKCs are calcium-independent, but require DAG (1,2-diacylglycerol) and phosphatidylserine (PS) for activity. The nPKC-eta subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	323
270743	cd05591	STKc_nPKC_epsilon	Catalytic domain of the Serine/Threonine Kinase, Novel Protein Kinase C epsilon. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PKC-epsilon has been shown to behave as an oncoprotein. Its overexpression contributes to neoplastic transformation depending on the cell type. It contributes to oncogenesis by inducing disordered cell growth and inhibiting cell death. It also plays a role in tumor invasion and metastasis. PKC-epsilon has also been found to confer cardioprotection against ischemia and reperfusion-mediated damage. Other cellular functions include the regulation of gene expression, cell adhesion, and cell motility. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domain. nPKCs are calcium-independent, but require DAG (1,2-diacylglycerol) and phosphatidylserine (PS) for activity. The nPKC-epsilon subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	321
270744	cd05592	STKc_nPKC_theta_like	Catalytic domain of the Serine/Threonine Kinases, Novel Protein Kinase C theta, delta, and similar proteins. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PKC-theta is selectively expressed in T-cells and plays an important and non-redundant role in several aspects of T-cell biology. PKC-delta plays a role in cell cycle regulation and programmed cell death in many cell types. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domain. nPKCs are calcium-independent, but require DAG (1,2-diacylglycerol) and phosphatidylserine (PS) for activity. There are four nPKC isoforms, delta, epsilon, eta, and theta. The nPKC-theta-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	320
270745	cd05593	STKc_PKB_gamma	Catalytic domain of the Serine/Threonine Kinase, Protein Kinase B gamma (also called Akt3). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PKB-gamma is predominantly expressed in neuronal tissues. Mice deficient in PKB-gamma show a reduction in brain weight due to the decreases in cell size and cell number. PKB-gamma has also been shown to be upregulated in estrogen-deficient breast cancer cells, androgen-independent prostate cancer cells, and primary ovarian tumors. It acts as a key mediator in the genesis of ovarian cancer. PKB contains an N-terminal pleckstrin homology (PH) domain and a C-terminal catalytic domain. The PKB-gamma subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	348
270746	cd05594	STKc_PKB_alpha	Catalytic domain of the Serine/Threonine Kinase, Protein Kinase B alpha (also called Akt1). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PKB-alpha is predominantly expressed in endothelial cells. It is critical for the regulation of angiogenesis and the maintenance of vascular integrity. It also plays a role in adipocyte differentiation. Mice deficient in PKB-alpha exhibit perinatal morbidity, growth retardation, reduction in body weight accompanied by reduced sizes of multiple organs, and enhanced apoptosis in some cell types. PKB-alpha activity has been reported to be frequently elevated in breast and prostate cancers. In some cancer cells, PKB-alpha may act as a suppressor of metastasis. PKB contains an N-terminal pleckstrin homology (PH) domain and a C-terminal catalytic domain. The PKB-alpha subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	356
173686	cd05595	STKc_PKB_beta	Catalytic domain of the Serine/Threonine Kinase, Protein Kinase B beta (also called Akt2). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PKB-beta is the predominant PKB isoform expressed in insulin-responsive tissues. It plays a critical role in the regulation of glucose homeostasis. It is also implicated in muscle cell differentiation. Mice deficient in PKB-beta display normal growth weights but exhibit severe insulin resistance and diabetes, accompanied by lipoatrophy and B-cell failure. PKB contains an N-terminal pleckstrin homology (PH) domain and a C-terminal catalytic domain. The PKB-beta subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	323
270747	cd05596	STKc_ROCK	Catalytic domain of the Serine/Threonine Kinase, Rho-associated coiled-coil containing protein kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. ROCK is also referred to as Rho-associated kinase or simply as Rho kinase. It contains an N-terminal extension, a catalytic kinase domain, and a long C-terminal extension, which contains a coiled-coil region encompassing a Rho-binding domain (RBD) and a pleckstrin homology (PH) domain. ROCK is auto-inhibited by the RBD and PH domain interacting with the catalytic domain. It is activated via interaction with Rho GTPases and is involved in many cellular functions including contraction, adhesion, migration, motility, proliferation, and apoptosis. The ROCK subfamily consists of two isoforms, ROCK1 and ROCK2, which may be functionally redundant in some systems, but exhibit different tissue distributions. Both isoforms are ubiquitously expressed in most tissues, but ROCK2 is more prominent in brain and skeletal muscle while ROCK1 is more pronounced in the liver, testes, and kidney. Studies in knockout mice result in different phenotypes, suggesting that the two isoforms do not compensate for each other during embryonic development. The ROCK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	352
270748	cd05597	STKc_DMPK_like	Catalytic domain of Myotonic Dystrophy protein kinase (DMPK)-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The DMPK-like subfamily is composed of DMPK and DMPK-related cell division control protein 42 (Cdc42) binding kinase (MRCK). DMPK is expressed in skeletal and cardiac muscles, and in central nervous tissues. The functional role of DMPK is not fully understood. It may play a role in the signal transduction and homeostasis of calcium. The DMPK gene is implicated in myotonic dystrophy 1 (DM1), an inherited multisystemic disorder with symptoms that include muscle hyperexcitability, progressive muscle weakness and wasting, cataract development, testicular atrophy, and cardiac conduction defects. The genetic basis for DM1 is the mutational expansion of a CTG repeat in the 3'-UTR of DMPK. MRCK is activated via interaction with the small GTPase Cdc42. MRCK/Cdc42 signaling mediates myosin-dependent cell motility. Three isoforms of MRCK are known, named alpha, beta and gamma. MRCKgamma is expressed in heart and skeletal muscles, unlike MRCKalpha and MRCKbeta, which are expressed ubiquitously. The DMPK-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	331
270749	cd05598	STKc_LATS	Catalytic domain of the Serine/Threonine Kinase, Large Tumor Suppressor. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. LATS was originally identified in Drosophila using a screen for genes whose inactivation led to overproliferation of cells. In tetrapods, there are two LATS isoforms, LATS1 and LATS2. Inactivation of LATS1 in mice results in the development of various tumors, including sarcomas and ovarian cancer. LATS functions as a tumor suppressor and is implicated in cell cycle regulation. The LATS subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	333
270750	cd05599	STKc_NDR_like	Catalytic domain of Nuclear Dbf2-Related kinase-like Protein Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. NDR kinases regulate mitosis, cell growth, embryonic development, and neurological processes. They are also required for proper centrosome duplication. Higher eukaryotes contain two NDR isoforms, NDR1 and NDR2. This subfamily also contains fungal NDR-like kinases. NDR kinase contains an N-terminal regulatory (NTR) domain and an insert within the catalytic domain that contains an auto-inhibitory sequence. Like many other AGC kinases, NDR kinase requires phosphorylation at two sites, the activation loop (A-loop) and the hydrophobic motif (HM), for activity. The NDR kinase subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	324
270751	cd05600	STKc_Sid2p_like	Catalytic domain of Fungal Sid2p-like Protein Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This group contains fungal kinases including Schizosaccharomyces pombe Sid2p and Saccharomyces cerevisiae Dbf2p. Group members show similarity to NDR kinases in that they contain an N-terminal regulatory (NTR) domain and an insert within the catalytic domain that contains an auto-inhibitory sequence. Sid2p plays a crucial role in the septum initiation network (SIN) and in the initiation of cytokinesis. Dbf2p is important in regulating the mitotic exit network (MEN) and in cytokinesis. The Sid2p-like group is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	386
270752	cd05601	STKc_CRIK	Catalytic domain of the Serine/Threonine Kinase, Citron Rho-interacting kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CRIK (also called citron kinase) is an effector of the small GTPase Rho. It plays an important function during cytokinesis and affects its contractile process. CRIK-deficient mice show severe ataxia and epilepsy as a result of abnormal cytokinesis and massive apoptosis in neuronal precursors. A Down syndrome critical region protein TTC3 interacts with CRIK and inhibits CRIK-dependent neuronal differentiation and neurite extension. CRIK contains a catalytic domain, a central coiled-coil domain, and a C-terminal region containing a Rho-binding domain (RBD), a zinc finger, and a pleckstrin homology (PH) domain, in addition to other motifs. The CRIK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	328
270753	cd05602	STKc_SGK1	Catalytic domain of the Protein Serine/Threonine Kinase, Serum- and Glucocorticoid-induced Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. SGK1 is ubiquitously expressed and is under transcriptional control of numerous stimuli including cell stress (cell shrinkage), serum, hormones (gluco- and mineralocorticoids), gonadotropins, growth factors, interleukin-6, and other cytokines. It plays roles in sodium retention and potassium elimination in the kidney, nutrient transport, salt sensitivity, memory consolidation, and cardiac repolarization. A common SGK1 variant is associated with increased blood pressure and body weight. SGK1 may also contribute to tumor growth, neurodegeneration, fibrosing disease, and ischemia. The SGK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	339
270754	cd05603	STKc_SGK2	Catalytic domain of the Serine/Threonine Kinase, Serum- and Glucocorticoid-induced Kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. SGK2 shows a more restricted distribution than SGK1 and is most abundantly expressed in epithelial tissues including kidney, liver, pancreas, and the choroid plexus of the brain. In vitro cellular assays show that SGK2 can stimulate the activity of ion channels, the glutamate transporter EAAT4, and the glutamate receptors GluR6 and GluR1. The SGK2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	321
270755	cd05604	STKc_SGK3	Catalytic domain of the Protein Serine/Threonine Kinase, Serum- and Glucocorticoid-induced Kinase 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. SGK3 (also called cytokine-independent survival kinase or CISK) is expressed in most tissues and is most abundant in the embryo and adult heart and spleen. It was originally discovered in a screen for antiapoptotic genes. It phosphorylates and inhibits the proapoptotic proteins, Bad and FKHRL1. SGK3 also regulates many transporters, ion channels, and receptors. It plays a critical role in hair follicle morphogenesis and hair cycling. The SGK3 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	326
270756	cd05605	STKc_GRK4_like	Catalytic domain of G protein-coupled Receptor Kinase 4-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Members of the GRK4-like group include GRK4, GRK5, GRK6, and similar GRKs. They contain an N-terminal RGS homology (RH) domain and a catalytic domain, but lack a G protein betagamma-subunit binding domain. They are localized to the plasma membrane through post-translational lipid modification or direct binding to PIP2. GRKs phosphorylate and regulate G protein-coupled receptors (GPCRs), the largest superfamily of cell surface receptors which regulate some part of nearly all physiological functions. Phosphorylated GPCRs bind to arrestins, which prevents further G protein signaling despite the presence of activating ligand. The GRK4-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	285
270757	cd05606	STKc_beta_ARK	Catalytic domain of the Serine/Threonine Kinase, beta-adrenergic receptor kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The beta-ARK group is composed of GRK2, GRK3, and similar proteins. GRK2 and GRK3 are both widely expressed in many tissues, although GRK2 is present at higher levels. They contain an N-terminal RGS homology (RH) domain, a central catalytic domain, and a C-terminal pleckstrin homology (PH) domain that mediates PIP2 and G protein betagamma-subunit translocation to the membrane. GRK2 (also called beta-ARK or beta-ARK1) is important in regulating several cardiac receptor responses. It plays a role in cardiac development and in hypertension. Deletion of GRK2 in mice results in embryonic lethality, caused by hypoplasia of the ventricular myocardium. GRK2 also plays important roles in the liver (as a regulator of portal blood pressure), in immune cells, and in the nervous system. Altered GRK2 expression has been reported in several disorders including major depression, schizophrenia, bipolar disorder, and Parkinsonism. GRKs phosphorylate and regulate G protein-coupled receptors (GPCRs), the largest superfamily of cell surface receptors, which regulate some part of nearly all physiological functions. Phosphorylated GPCRs bind to arrestins, which prevents further G protein signaling despite the presence of activating ligand. The beta-ARK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	279
270758	cd05607	STKc_GRK7	Catalytic domain of the Protein Serine/Threonine Kinase, G protein-coupled Receptor Kinase 7. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. GRK7 (also called iodopsin kinase) belongs to the visual group of GRKs. It is primarily found in the retina and plays a role in the regulation of opsin light receptors. GRK7 is located in retinal cone outer segments and plays an important role in regulating photoresponse of the cones. GRKs phosphorylate and regulate G protein-coupled receptors (GPCRs), the largest superfamily of cell surface receptors, which regulate some part of nearly all physiological functions. Phosphorylated GPCRs bind to arrestins, which prevents further G protein signaling despite the presence of activating ligand. The GRK7 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	286
270759	cd05608	STKc_GRK1	Catalytic domain of the Serine/Threonine Kinase, G protein-coupled Receptor Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. GRK1 (also called rhodopsin kinase) belongs to the visual group of GRKs and is expressed in retinal cells. It phosphorylates rhodopsin in rod cells, which leads to termination of the phototransduction cascade. Mutations in GRK1 are associated with a recessively inherited form of stationary night blindness called Oguchi disease. GRKs phosphorylate and regulate G protein-coupled receptors (GPCRs), the largest superfamily of cell surface receptors, which regulate some part of nearly all physiological functions. Phosphorylated GPCRs bind to arrestins, which prevents further G protein signaling despite the presence of activating ligand. The GRK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	288
270760	cd05609	STKc_MAST	Catalytic domain of the Protein Serine/Threonine Kinase, Microtubule-associated serine/threonine kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MAST kinases contain an N-terminal domain of unknown function, a central catalytic domain, and a C-terminal PDZ domain that mediates protein-protein interactions. There are four mammalian MAST kinases, named MAST1-MAST4. MAST1 is also called syntrophin-associated STK (SAST) while MAST2 is also called MAST205. MAST kinases are cytoskeletal associated kinases of unknown function that are also expressed at neuromuscular junctions and postsynaptic densities. MAST1, MAST2, and MAST3 bind and phosphorylate the tumor suppressor PTEN, and may contribute to the regulation and stabilization of PTEN. MAST2 is involved in the regulation of the Fc-gamma receptor of the innate immune response in macrophages, and may also be involved in the regulation of the Na+/H+ exchanger NHE3. The MAST kinase subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	280
270761	cd05610	STKc_MASTL	Catalytic domain of the Serine/Threonine Kinase, Microtubule-associated serine/threonine-like kinase (also called greatwall kinase). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The MASTL kinases in this group carry only a catalytic domain, which contains a long insertion relative to MAST kinases. MASTL, also called greatwall kinase (Gwl), is involved in the regulation of mitotic entry, which is controlled by the coordinated activities of protein kinases and opposing protein phosphatases (PPs). The cyclin B/CDK1 complex induces entry into M-phase while PP2A-B55 shows anti-mitotic activity. MASTL/Gwl is activated downstream of cyclin B/CDK1 and indirectly inhibits PP2A-B55 by phosphorylating the small protein alpha-endosulfine (Ensa) or the cAMP-regulated phosphoprotein 19 (Arpp19), resulting in M-phase progression. Gwl kinase may also play roles in mRNA stabilization and DNA checkpoint recovery. The human MASTL gene has also been named FLJ14813; a missense mutation in FLJ14813 is associated with autosomal dominant thrombocytopenia. The MASTL kinase subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	349
270762	cd05611	STKc_Rim15_like	Catalytic domain of fungal Rim15-like Protein Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Members of this group include Saccharomyces cerevisiae Rim15, Schizosaccharomyces pombe cek1, and similar fungal proteins. They contain a central catalytic domain, which contains an insert relative to MAST kinases. In addition, Rim15 contains a C-terminal signal receiver (REC) domain while cek1 contains an N-terminal PAS domain. Rim15 (or Rim15p) functions as a regulator of meiosis. It acts as a downstream effector of PKA and regulates entry into stationary phase (G0). Thus, it plays a crucial role in regulating yeast proliferation, differentiation, and aging. Cek1 may facilitate progression of mitotic anaphase. The Rim15-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	263
270763	cd05612	STKc_PRKX_like	Catalytic domain of PRKX-like Protein Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Members of this group include human PRKX (X chromosome-encoded protein kinase), Drosophila DC2, and similar proteins. PRKX is present in many tissues including fetal and adult brain, kidney, and lung. The PRKX gene is located in the Xp22.3 subregion and has a homolog called PRKY on the Y chromosome. An abnormal interchange between PRKX and PRKY leads to the sex reversal disorder of XX males and XY females. PRKX is implicated in granulocyte/macrophage lineage differentiation, renal cell epithelial migration, and tubular morphogenesis in the developing kidney. The PRKX-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	292
270764	cd05613	STKc_MSK1_N	N-terminal catalytic domain of the Serine/Threonine Kinase, Mitogen and stress-activated kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MSK1 plays a role in the regulation of translational control and transcriptional activation. It phosphorylates the transcription factors, CREB and NFkB. It also phosphorylates the nucleosomal proteins H3 and HMG-14. Increased phosphorylation of MSK1 is associated with the development of cerebral ischemic/hypoxic preconditioning. MSKs contain an N-terminal kinase domain (NTD) from the AGC family and a C-terminal kinase domain (CTD) from the CAMK family. MSKs are activated by two major signaling cascades, the Ras-MAPK and p38 stress kinase pathways, which trigger phosphorylation in the activation loop (A-loop) of the CTD of MSK. The active CTD phosphorylates the hydrophobic motif (HM) of NTD, which facilitates the phosphorylation of the A-loop and activates the NTD, which in turn phosphorylates downstream targets. The MSK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	290
270765	cd05614	STKc_MSK2_N	N-terminal catalytic domain of the Serine/Threonine Kinase, Mitogen and stress-activated kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MSK2 and MSK1 play nonredundant roles in activating histone H3 kinases, which play pivotal roles in compaction of the chromatin fiber. MSK2 is the required H3 kinase in response to stress stimuli and activation of the p38 MAPK pathway. MSK2 also plays a role in the pathogenesis of psoriasis. MSKs contain an N-terminal kinase domain (NTD) from the AGC family and a C-terminal kinase domain (CTD) from the CAMK family, similar to 90 kDa ribosomal protein S6 kinases (RSKs). MSKs are activated by two major signaling cascades, the Ras-MAPK and p38 stress kinase pathways, which trigger phosphorylation in the activation loop (A-loop) of the CTD of MSK. The active CTD phosphorylates the hydrophobic motif (HM) of NTD, which facilitates the phosphorylation of the A-loop and activates the NTD, which in turn phosphorylates downstream targets. The MSK2 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	332
270766	cd05615	STKc_cPKC_alpha	Catalytic domain of the Serine/Threonine Kinase, Classical Protein Kinase C alpha. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PKC-alpha is expressed in many tissues and is associated with cell proliferation, apoptosis, and cell motility. It plays a role in the signaling of the growth factors PDGF, VEGF, EGF, and FGF. Abnormal levels of PKC-alpha have been detected in many transformed cell lines and several human tumors. In addition, PKC-alpha is required for HER2 dependent breast cancer invasion. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domain. PKCs undergo three phosphorylations in order to take mature forms. In addition, cPKCs depend on calcium, DAG (1,2-diacylglycerol), and in most cases, phosphatidylserine (PS) for activation. The cPKC-alpha subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	341
270767	cd05616	STKc_cPKC_beta	Catalytic domain of the Serine/Threonine Kinase, Classical Protein Kinase C beta. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The PKC beta isoforms (I and II), generated by alternative splicing of a single gene, are preferentially activated by hyperglycemia-induced DAG (1,2-diacylglycerol) in retinal tissues. This is implicated in diabetic microangiopathy such as ischemia, neovascularization, and abnormal vasodilator function. PKC-beta also plays an important role in VEGF signaling. In addition, glucose regulates proliferation in retinal endothelial cells via PKC-betaI. PKC-beta is also being explored as a therapeutic target in cancer. It contributes to tumor formation and is involved in the tumor host mechanisms of inflammation and angiogenesis. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domain. PKCs undergo three phosphorylations in order to take mature forms. In addition, cPKCs depend on calcium, DAG, and in most cases, phosphatidylserine (PS) for activation. The cPKC-beta subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	323
270768	cd05617	STKc_aPKC_zeta	Catalytic domain of the Serine/Threonine Kinase, Atypical Protein Kinase C zeta. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PKC-zeta plays a critical role in activating the glucose transport response. It is activated by glucose, insulin, and exercise through diverse pathways. PKC-zeta also plays a central role in maintaining cell polarity in yeast and mammalian cells. In addition, it affects actin remodeling in muscle cells. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domain. aPKCs only require phosphatidylserine (PS) for activation. The aPKC-zeta subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	357
270769	cd05618	STKc_aPKC_iota	Catalytic domain of the Serine/Threonine Kinase, Atypical Protein Kinase C iota. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PKC-iota is directly implicated in carcinogenesis. It is critical to oncogenic signaling mediated by Ras and Bcr-Abl. The PKC-iota gene is the target of tumor-specific gene amplification in many human cancers, and has been identified as a human oncogene. In addition to its role in transformed growth, PKC-iota also promotes invasion, chemoresistance, and tumor cell survival. Expression profiling of PKC-iota is a prognostic marker of poor clinical outcome in several human cancers. PKC-iota also plays a role in establishing cell polarity, and has critical embryonic functions. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domain. aPKCs only require phosphatidylserine (PS) for activation. The aPKC subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	364
270770	cd05619	STKc_nPKC_theta	Catalytic domain of the Serine/Threonine Kinase, Novel Protein Kinase C theta. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PKC-theta is selectively expressed in T-cells and plays an important and non-redundant role in several aspects of T-cell biology. Although T-cells also express other PKC isoforms, PKC-theta is unique in that upon antigen stimulation, it is translocated to the plasma membrane at the immunological synapse, where it mediates signals essential for T-cell activation. It is essential for TCR-induced proliferation, cytokine production, T-cell survival, and the differentiation and effector function of T-helper (Th) cells, particularly Th2 and Th17. PKC-theta is being developed as a therapeutic target for Th2-mediated allergic inflammation and Th17-mediated autoimmune diseases. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domain. nPKCs are calcium-independent, but require DAG (1,2-diacylglycerol) and phosphatidylserine (PS) for activity. The nPKC subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	331
173710	cd05620	STKc_nPKC_delta	Catalytic domain of the Serine/Threonine Kinase, Novel Protein Kinase C delta. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PKC-delta plays a role in cell cycle regulation and programmed cell death in many cell types. It slows down cell proliferation, inducing cell cycle arrest and enhancing cell differentiation. PKC-delta is also involved in the regulation of transcription as well as immune and inflammatory responses. It plays a central role in the genotoxic stress response that leads to DNA damage-induced apoptosis. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domain. nPKCs are calcium-independent, but require DAG (1,2-diacylglycerol) and phosphatidylserine (PS) for activity. The nPKC-delta subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	316
270771	cd05621	STKc_ROCK2	Catalytic domain of the Serine/Threonine Kinase, Rho-associated coiled-coil containing protein kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. ROCK2 was the first identified target of activated RhoA, and was found to play a role in stress fiber and focal adhesion formation. It is prominently expressed in the brain, heart, and skeletal muscles. It is implicated in vascular and neurological disorders, such as hypertension and vasospasm of the coronary and cerebral arteries. ROCK2 is also activated by caspase-2 cleavage, resulting in thrombin-induced microparticle generation in response to cell activation. Mice deficient in ROCK2 show intrauterine growth retardation and embryonic lethality because of placental dysfunction. ROCK contains an N-terminal extension, a catalytic kinase domain, and a C-terminal extension, which contains a coiled-coil region encompassing a Rho-binding domain (RBD) and a pleckstrin homology (PH) domain. ROCK is auto-inhibited by the RBD and PH domain interacting with the catalytic domain, and is activated via interaction with Rho GTPases. The ROCK2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	379
270772	cd05622	STKc_ROCK1	Catalytic domain of the Serine/Threonine Kinase, Rho-associated coiled-coil containing protein kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. ROCK1 is preferentially expressed in the liver, lung, spleen, testes, and kidney. It mediates signaling from Rho to the actin cytoskeleton. It is implicated in the development of cardiac fibrosis, cardiomyocyte apoptosis, and hyperglycemia. Mice deficient in ROCK1 display eyelids open at birth (EOB) and omphalocele phenotypes due to the disorganization of actin filaments in the eyelids and the umbilical ring. ROCK contains an N-terminal extension, a catalytic kinase domain, and a C-terminal extension, which contains a coiled-coil region encompassing a Rho-binding domain (RBD) and a pleckstrin homology (PH) domain. ROCK is auto-inhibited by the RBD and PH domain interacting with the catalytic domain, and is activated via interaction with Rho GTPases. The ROCK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	405
270773	cd05623	STKc_MRCK_alpha	Catalytic domain of the Serine/Threonine Kinase, DMPK-related cell division control protein 42 binding kinase (MRCK) alpha. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MRCK-alpha is expressed ubiquitously in many tissues. It plays a role in the regulation of peripheral actin reorganization and neurite outgrowth. It may also play a role in the transferrin iron uptake pathway. MRCK is activated via interaction with the small GTPase Cdc42. MRCK/Cdc42 signaling mediates myosin-dependent cell motility. The MRCK-alpha subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. This alignment model includes the dimerization domain.	409
270774	cd05624	STKc_MRCK_beta	Catalytic domain of the Protein Serine/Threonine Kinase, DMPK-related cell division control protein 42 binding kinase (MRCK) beta. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MRCK-beta is expressed ubiquitously in many tissues. MRCK is activated via interaction with the small GTPase Cdc42. MRCK/Cdc42 signaling mediates myosin-dependent cell motility. The MRCK-beta subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. This alignment model includes the dimerization domain.	409
270775	cd05625	STKc_LATS1	Catalytic domain of the Serine/Threonine Kinase, Large Tumor Suppressor 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. LATS1 functions as a tumor suppressor and is implicated in cell cycle regulation. Inactivation of LATS1 in mice results in the development of various tumors, including sarcomas and ovarian cancer. Promoter methylation, loss of heterozygosity, and missense mutations targeting the LATS1 gene have also been found in human sarcomas and ovarian cancers. In addition, decreased expression of LATS1 is associated with an aggressive phenotype and poor prognosis. LATS1 induces G2 arrest and promotes cytokinesis. It may be a component of the mitotic exit network in higher eukaryotes. The LATS1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	382
173715	cd05626	STKc_LATS2	Catalytic domain of the Protein Serine/Threonine Kinase, Large Tumor Suppressor 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. LATS2 is an essential mitotic regulator responsible for coordinating accurate cytokinesis completion and governing the stabilization of other mitotic regulators. It is also critical in the maintenance of proper chromosome number, genomic stability, mitotic fidelity, and the integrity of centrosome duplication. Downregulation of LATS2 is associated with poor prognosis in acute lymphoblastic leukemia and breast cancer. The LATS2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	381
270776	cd05627	STKc_NDR2	Catalytic domain of the Serine/Threonine Kinase, Nuclear Dbf2-Related kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. NDR2 (also called STK38-like) plays a role in proper centrosome duplication. In addition, it is involved in regulating neuronal growth and differentiation, as well as in facilitating neurite outgrowth. NDR2 is also implicated in fear conditioning as it contributes to the coupling of neuronal morphological changes with fear-memory consolidation. NDR kinase contains an N-terminal regulatory (NTR) domain and an insert within the catalytic domain that contains an auto-inhibitory sequence. Like many other AGC kinases, NDR kinase requires phosphorylation at two sites, the activation loop (A-loop) and the hydrophobic motif (HM), for activity. The NDR2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	366
270777	cd05628	STKc_NDR1	Catalytic domain of the Serine/Threonine Kinase, Nuclear Dbf2-Related kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. NDR1 (also called STK38) plays a role in proper centrosome duplication. It is highly expressed in thymus, muscle, lung and spleen. It is not an essential protein because mice deficient in NDR1 remain viable and fertile. However, these mice develop T-cell lymphomas and appear to be hypersensitive to carcinogenic treatment. NDR1 appears to also act as a tumor suppressor. NDR kinase contains an N-terminal regulatory (NTR) domain and an insert within the catalytic domain that contains an auto-inhibitory sequence. Like many other AGC kinases, NDR kinase requires phosphorylation at two sites, the activation loop (A-loop) and the hydrophobic motif (HM), for activity. The NDR1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	376
270778	cd05629	STKc_NDR_like_fungal	Catalytic domain of Fungal Nuclear Dbf2-Related kinase-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This group is composed of fungal NDR-like proteins including Saccharomyces cerevisiae CBK1 (or CBK1p), Schizosaccharomyces pombe Orb6 (or Orb6p), Ustilago maydis Ukc1 (or Ukc1p), and Neurospora crassa Cot1. Like NDR kinase, group members contain an N-terminal regulatory (NTR) domain and an insert within the catalytic domain that contains an auto-inhibitory sequence. CBK1 is an essential component in the RAM (regulation of Ace2p activity and cellular morphogenesis) network. CBK1 and Orb6 play similar roles in coordinating cell morphology with cell cycle progression. Ukc1 is involved in morphogenesis, pathogenicity, and pigment formation. Cot1 plays a role in polar tip extension. The fungal NDR subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	377
270779	cd05630	STKc_GRK6	Catalytic domain of the Serine/Threonine Kinase, G protein-coupled Receptor Kinase 6. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. GRK6 is widely expressed in many tissues and is expressed as multiple splice variants with different domain architectures. It is post-translationally palmitoylated and localized in the membrane. GRK6 plays important roles in the regulation of dopamine, M3 muscarinic, opioid, and chemokine receptor signaling. It also plays maladaptive roles in addiction and Parkinson's disease. GRK6-deficient mice exhibit altered dopamine receptor regulation, decreased lymphocyte chemotaxis, and increased acute inflammation and neutrophil chemotaxis. GRKs phosphorylate and regulate G protein-coupled receptors (GPCRs), the largest superfamily of cell surface receptors which regulate some part of nearly all physiological functions. Phosphorylated GPCRs bind to arrestins, which prevents further G protein signaling despite the presence of activating ligand. The GRK6 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	285
173720	cd05631	STKc_GRK4	Catalytic domain of the Serine/Threonine Kinase, G protein-coupled Receptor Kinase 4. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. GRK4 has a limited tissue distribution. It is mainly found in the testis, but is also present in the cerebellum and kidney. It is expressed as multiple splice variants with different domain architectures and is post-translationally palmitoylated and localized in the membrane. GRK4 polymorphisms are associated with hypertension and salt sensitivity, as they cause hyperphosphorylation, desensitization, and internalization of the dopamine 1 (D1) receptor while increasing the expression of the angiotensin II type 1 receptor. GRK4 plays a crucial role in the D1 receptor regulation of sodium excretion and blood pressure. GRKs phosphorylate and regulate G protein-coupled receptors (GPCRs), the largest superfamily of cell surface receptors which regulate some part of nearly all physiological functions. Phosphorylated GPCRs bind to arrestins, which prevents further G protein signaling despite the presence of activating ligand. The GRK4 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	285
270780	cd05632	STKc_GRK5	Catalytic domain of the Serine/Threonine Kinase, G protein-coupled Receptor Kinase 5. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. GRK5 is widely expressed in many tissues. It associates with the membrane through an N-terminal PIP2 binding domain and also binds phospholipids via its C-terminus. GRK5 deficiency is associated with early Alzheimer's disease in humans and mouse models. GRK5 also plays a crucial role in the pathogenesis of sporadic Parkinson's disease. It participates in the regulation and desensitization of PDGFRbeta, a receptor tyrosine kinase involved in a variety of downstream cellular effects including cell growth, chemotaxis, apoptosis, and angiogenesis. GRK5 also regulates Toll-like receptor 4, which is involved in innate and adaptive immunity. GRKs phosphorylate and regulate G protein-coupled receptors (GPCRs), the largest superfamily of cell surface receptors which regulate some part of nearly all physiological functions. Phosphorylated GPCRs bind to arrestins, which prevents further G protein signaling despite the presence of activating ligand. The GRK5 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	313
270781	cd05633	STKc_GRK3	Catalytic domain of the Serine/Threonine Kinase, G protein-coupled Receptor Kinase 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. GRK3, also called beta-adrenergic receptor kinase 2 (beta-ARK2), is widely expressed in many tissues. It is involved in modulating the cholinergic response of airway smooth muscles, and also plays a role in dopamine receptor regulation. GRK3-deficient mice show a lack of olfactory receptor desensitization and altered regulation of the M2 muscarinic airway. GRK3 promoter polymorphisms may also be associated with bipolar disorder. GRK3 contains an N-terminal RGS homology (RH) domain, a central catalytic domain, and a C-terminal pleckstrin homology (PH) domain that mediates PIP2 and G protein betagamma-subunit translocation to the membrane. GRKs phosphorylate and regulate G protein-coupled receptors (GPCRs), the largest superfamily of cell surface receptors which regulate some part of nearly all physiological functions. Phosphorylated GPCRs bind to arrestins, which prevents further G protein signaling despite the presence of activating ligand. The GRK3 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	346
100059	cd05635	LbH_unknown	Uncharacterized proteins, Left-handed parallel beta-Helix (LbH) domain: Members in this group are uncharacterized bacterial proteins containing a LbH domain with multiple turns, each containing three imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X). Proteins containing hexapeptide repeats are often enzymes showing acyltransferase activity.	101
100060	cd05636	LbH_G1P_TT_C_like	Putative glucose-1-phosphate thymidylyltransferase, C-terminal Left-handed parallel beta-Helix (LbH) domain: Proteins in this family show similarity to glucose-1-phosphate adenylyltransferases in that they contain N-terminal catalytic domains that resemble a dinucleotide-binding Rossmann fold and C-terminal LbH fold domains. Members in this family are predicted to be glucose-1-phosphate thymidylyltransferases, which are involved in the dTDP-L-rhamnose biosynthetic pathway. Glucose-1-phosphate thymidylyltransferase catalyzes the synthesis of deoxy-thymidine di-phosphate (dTDP)-L-rhamnose, an important component of the cell wall of many microorganisms. The C-terminal LbH domain contains multiple turns, each containing three imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X). Proteins containing hexapeptide repeats are often enzymes showing acyltransferase activity.	163
240188	cd05637	SIS_PGI_PMI_2	The members of this protein family contain the SIS (Sugar ISomerase) domain and have both the phosphoglucose isomerase (PGI) and the phosphomannose isomerase (PMI) functions. These functions catalyze the reversible reactions of glucose 6-phosphate to fructose 6-phosphate, and mannose 6-phosphate to fructose 6-phosphate, respectively at an equal rate. This protein contains two SIS domains. This alignment is based on the second SIS domain.	132
193517	cd05638	M42	M42 Peptidases, also known as glutamyl aminopeptidase family. Peptidase M42 family proteins, also known as glutamyl aminopeptidases (GAP), are co-catalytic metallopeptidases, found in archaea and bacteria. They typically bind two zinc or cobalt atoms and include cellulase and endo-1,4-beta-glucanase (endoglucanase). Some of the enzymes exhibit typical aminopeptidase specificity, whereas others are also capable of N-terminal deblocking activity, i.e. hydrolyzing acylated N-terminal residues. GAP removes glutamyl residues from the N-terminus of peptide substrates, but is also effective against aspartyl and, to a lesser extent, seryl residues. Lactococcus lactis glutamyl aminopeptidase (PepA; aminopeptidase A) has high thermal stability and aids growth of the organism in milk. Pyrococcus horikoshii contains a thermostable de-blocking aminopeptidase member of this family, used commercially for N-terminal protein sequencing.	332
349892	cd05639	M18	M18 peptidase aminopeptidase family. Peptidase M18 aminopeptidase family is widely distributed in bacteria and eukaryotes, but only the yeast aminopeptidase I and mammalian aspartyl aminopeptidase have been characterized to date. Yeast aminopeptidase I is active only in its dodecameric form with broad substrate specificity, acting on N-terminal leucine and most other amino acids. In contrast, the mammalian aspartyl aminopeptidase is highly selective for hydrolysis of N-terminal Asp or Glu residues from peptides. These enzymes have two catalytic zinc ions at the active site.	430
349893	cd05640	M28_like	M28 Zn-peptidase; uncharacterized subfamily. Peptidase family M28 (also called aminopeptidase Y family), uncharacterized subfamily. The M28 family contains aminopeptidases as well as carboxypeptidases. They have co-catalytic zinc ions; each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions.	281
349894	cd05642	M28_like	M28 Zn-peptidase-like; uncharacterized subfamily. Peptidase family M28 (also called aminopeptidase Y family), uncharacterized subfamily. The M28 family contains aminopeptidases as well as carboxypeptidases. They have co-catalytic zinc ions; each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions.	347
349895	cd05643	M28_like	M28 Zn-peptidase-like. Peptidase family M28 (also called aminopeptidase Y family), uncharacterized subfamily. The M28 family contains aminopeptidases as well as carboxypeptidases. They typically have co-catalytic zinc ions; each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. This protein subfamily conserves some of the metal-coordinating residues of the typically co-catalytic M28 family which might suggest binding of a single metal ion.	290
381731	cd05644	M28_like	M28 Zn-peptidase-like, uncharacterized subfamily. Peptidase family M28 (also called aminopeptidase Y family), uncharacterized subfamily. The M28 family contains aminopeptidases as well as carboxypeptidases. They typically have co-catalytic zinc ions; each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. Proteins in this subfamily conserve some of the metal-coordinating residues of the typically co-catalytic M28 family, and appear to bind a single metal (Zn) ion.	340
349897	cd05645	M20_peptidase_T	M20 Peptidase T specifically cleaves tripeptides. Peptidase M20 family, Peptidase T (PepT; tripeptide aminopeptidase; tripeptidase) subfamily and similar proteins. PepT acts only on tripeptide substrates, and is thus termed a tripeptidase. It catalyzes the release of N-terminal amino acids with hydrophobic side chains from tripeptides with high specificity; dipeptides, tetrapeptides or tripeptides with the N-terminus blocked are not cleaved. Tripeptidases are known to function at the final stage of proteolysis in lactococcal bacteria and release amino acids from tripeptides produced during the digestion of milk proteins such as casein.	400
349898	cd05646	M20_AcylaseI_like	M20 Aminoacylase-I like subfamily. Peptidase M20 family, aminoacylase-I like (AcyI-like; acylase I; N-acyl-L-amino-acid amidohydrolase; EC 3.5.1.14) subfamily. Acylase I is involved in the hydrolysis of N-acylated or N-acetylated amino acids (except L-aspartate) and is considered as a potential target of antimicrobial agents. Porcine AcyI is also shown to deacetylate certain quorum-sensing N-acylhomoserine lactones, while the rat enzyme has been implicated in degradation of chemotactic peptides of commensal bacteria. Prokaryotic arginine synthesis usually involves the transfer of an acetyl group to glutamate by ornithine acetyltransferase in order to form ornithine. However, Escherichia coli acetylornithine deacetylase (acetylornithinase, ArgE) (EC 3.5.1.16) catalyzes the deacylation of N2-acetyl-L-ornithine to yield ornithine and acetate. Phylogenetic evidence suggests that the clustering of the arg genes in one continuous sequence pattern arose in an ancestor common to Enterobacteriaceae and Vibrionaceae, where ornithine acetyltransferase was lost and replaced by a deacylase. Elevated levels of serum aminoacylase-1 autoantibody have been seen in the disease progression of chronic hepatitis B (CHB), making ACY1 autoantibody a valuable serum biomarker for discriminating hepatitis B virus (HBV) related liver cirrhosis from CHB.	391
349899	cd05647	M20_DapE_actinobac	M20 Peptidase actinobacterial DapE encoded N-succinyl-L,L-diaminopimelic acid desuccinylase. Peptidase M20 family, actinobacterial dapE encoded N-succinyl-L,L-diaminopimelic acid desuccinylase (DapE) subfamily. This group is composed of predominantly actinobacterial DapE proteins. DapE catalyzes the hydrolysis of N-succinyl-L,L-diaminopimelate (L,L-SDAP) to L,L-diaminopimelate and succinate. It has been shown that DapE is essential for cell growth and proliferation. DapEs have been purified from proteobacteria such as Escherichia coli and Haemophilus influenzae, while genes that encode for DapEs have been sequenced from several bacterial sources such as the actinobacteria Corynebacterium glutamicum and Mycobacterium tuberculosis. DapE is a small, dimeric enzyme (41.6 kDa per subunit) that requires 2 atoms of zinc per molecule of polypeptide for full enzymatic activity. All of the amino acids that function as metal binding ligands are strictly conserved in DapE.	347
349900	cd05649	M20_ArgE_DapE-like	M20 Peptidases with similarity to acetylornithine deacetylases and succinyl-diaminopimelate desuccinylases. Peptidase M20 family, uncharacterized protein subfamily with similarity to acetylornithine deacetylase/succinyl-diaminopimelate desuccinylase (ArgE/DapE) subfamily. This group includes the hypothetical protein ygeY from Escherichia coli, a putative deacetylase, but many in this subfamily are classified as unassigned peptidases. ArgE/DapE enzymes catalyze analogous reactions and share a common activator, the metal ion (usually Co2+ or Zn2+). ArgE catalyzes a broad range of substrates, including N-acetylornithine, alpha-N-acetylmethionine and alpha-N-formylmethionine, while DapE catalyzes the hydrolysis of N-succinyl-L,L-diaminopimelate (L,L-SDAP) to L,L-diaminopimelate and succinate. Proteins in this subfamily are mostly bacterial and archaeal, and have been inferred by homology as being related to both ArgE and DapE.	381
349901	cd05650	M20_ArgE_DapE-like	M20 Peptidases with similarity to acetylornithine deacetylases and succinyl-diaminopimelate desuccinylases. Peptidase M20 family, uncharacterized protein subfamily with similarity to acetylornithine deacetylase/succinyl-diaminopimelate desuccinylase (ArgE/DapE) subfamily. ArgE/DapE enzymes catalyze analogous reactions and share a common activator, the metal ion (usually Co2+ or Zn2+). ArgE catalyzes a broad range of substrates, including N-acetylornithine, alpha-N-acetylmethionine and alpha-N-formylmethionine, while DapE catalyzes the hydrolysis of N-succinyl-L,L-diaminopimelate (L,L-SDAP) to L,L-diaminopimelate and succinate. Proteins in this subfamily are mostly bacterial and archaeal, and have been inferred by homology as being related to both ArgE and DapE.	389
349902	cd05651	M20_ArgE_DapE-like	M20 peptidases with similarity to acetylornithine deacetylases and succinyl-diaminopimelate desuccinylases. Peptidase M20 family, uncharacterized protein subfamily with similarity to acetylornithine deacetylase/succinyl-diaminopimelate desuccinylase (ArgE/DapE) subfamily. ArgE/DapE enzymes catalyze analogous reactions and share a common activator, the metal ion (usually Co2+ or Zn2+). ArgE catalyzes a broad range of substrates, including N-acetylornithine, alpha-N-acetylmethionine and alpha-N-formylmethionine, while DapE catalyzes the hydrolysis of N-succinyl-L,L-diaminopimelate (L,L-SDAP) to L,L-diaminopimelate and succinate. Proteins in this subfamily are bacterial, and have been inferred by homology as being related to both ArgE and DapE.	341
349903	cd05652	M20_ArgE_DapE-like_fungal	M20 Peptidases with similarity to acetylornithine deacetylases and succinyl-diaminopimelate desuccinylases. Peptidase M20 family, uncharacterized protein subfamily with similarity to acetylornithine deacetylase/succinyl-diaminopimelate desuccinylase (ArgE/DapE) subfamily. ArgE/DapE enzymes catalyze analogous reactions and share a common activator, the metal ion (usually Co2+ or Zn2+). ArgE catalyzes a broad range of substrates, including N-acetylornithine, alpha-N-acetylmethionine and alpha-N-formylmethionine, while DapE catalyzes the hydrolysis of N-succinyl-L,L-diaminopimelate (L,L-SDAP) to L,L-diaminopimelate and succinate. Proteins in this subfamily are mostly fungal, and have been inferred by similarity as being related to both ArgE and DapE.	340
349904	cd05653	M20_ArgE_LysK	M20 Peptidase acetylornithine deacetylase/acetyl-lysine deacetylase. Peptidase M20 family, acetylornithine deacetylase (ArgE)/acetyl-lysine deacetylase (LysK) subfamily. Proteins in this subfamily are mainly archaeal with related bacterial species and are deacetylases with specificity for both N-acetyl-ornithine and N-acetyl-lysine found within a lysine biosynthesis operon. ArgE catalyzes the conversion of N-acetylornithine to ornithine, while LysK, a homolog of ArgE, has deacetylating activities for both N-acetyllysine and N-acetylornithine at almost equal efficiency. These results suggest that LysK which may share an ancestor with ArgE functions not only for lysine biosynthesis, but also for arginine biosynthesis in species such as Thermus thermophilus. The substrate specificity of ArgE is quite broad in that several alpha-N-acyl-L-amino acids can be hydrolyzed, including alpha-N-acetylmethionine and alpha-N-formylmethionine. ArgE shares significant sequence homology and biochemical features, and possibly a common origin, with glutamate carboxypeptidase (CPG2) and succinyl-diaminopimelate desuccinylase (DapE), and aminoacylase I (ACY1), having all metal ligand binding residues conserved.	343
349905	cd05654	M20_ArgE_RocB	M20 Peptidase arginine utilization protein, RocB. Peptidase M20 family, ArgE RocB (arginine utilization protein, RocB; arginine degradation protein, RocB) subfamily. This group of proteins is possibly related to acetylornithine deacetylase (ArgE) and may be involved in the arginine and/or ornithine degradation pathway. In Bacillus subtilis, RocB is one of the three genes found in the rocABC operon, which is sigma L dependent and induced by arginine. The function of members of this family is as yet unknown, although they are predicted as deacetylases.	534
349906	cd05656	M42_Frv	M42 Peptidase, endoglucanases. Peptidase M42 family, Frv (Frv Operon Protein; Endo-1 4-Beta-Glucanase; Cellulase Protein; Endoglucanase; Endo-1 4-Beta-Glucanase Homolog; Glucanase; EC 3.2.1.4) subfamily. Frv is a co-catalytic metallopeptidase, found in archaea and bacteria, including Pyrococcus horikoshii tetrahedral shaped phTET1 (DAPPh1; FrvX; PhDAP aminopeptidase; PhTET aminopeptidase; deblocking aminopeptidase), phTET2 (DAPPh2) and phTET3 (DAPPh3), Haloarcula marismortui TET (HmTET) as well as Bacillus subtilis YsdC. All of these exhibit aminopeptidase and deblocking activities. The HmTET is a broad substrate aminopeptidase capable of degrading large peptides. PhTET2, which shares 24% identity with HmTET, is a cobalt-activated peptidase and possibly a deblocking aminopeptidase, assembled as a 12-subunit tetrahedral dodecamer, while PhTET1 can be alternatively assembled as a tetrahedral dodecamer or as an octahedral tetracosameric structure. The active site in such a self-compartmentalized complex is located on the inside such that substrate sizes are limited, indicating function as possible peptide scavengers. PhTET2 cleaves polypeptides by a nonprocessive mechanism, preferring N-terminal hydrophobic or uncharged polar amino acids. Streptococcus pneumoniae PepA (SpPepA) also forms a dodecamer with tetrahedral architecture, and exhibits selective substrate specificity for acidic amino acids with a preference for glutamic acid; its substrate-binding S1 pocket contains an Arg that allows electrostatic interactions with the N-terminal acidic residue in the substrate. The YsdC gene is conserved in a number of thermophiles, archaea and pathogenic bacterial species; the closest structural homolog is Thermotoga maritima FrwX (34% identity), which is annotated as either a cellulase or an endoglucanase, and is possibly involved in polysaccharide biosynthesis or degradation.	337
349907	cd05657	M42_glucanase_like	M42 Peptidase, endoglucanase-like subfamily. Peptidase M42 family, glucanase (endo-1,4-beta-glucanase or endoglucanase)-like subfamily. Proteins in this subfamily are co-catalytic metallopeptidases, found in archaea and bacteria. They show similarity to cellulase and endo-1,4-beta-glucanase (endoglucanase) which typically bind two zinc or cobalt atoms. Some of the enzymes exhibit typical aminopeptidase specificity, whereas others are also capable of N-terminal deblocking activity, i.e. hydrolyzing acylated N-terminal residues. Many of these enzymes are assembled either as tetrahedral dodecamers or as octahedral tetracosameric structures, with the active site located on the inside such that substrate sizes are limited, indicating function as possible peptide scavengers.	337
349908	cd05658	M18_DAP	M18 peptidase aspartyl aminopeptidase. Peptidase M18 family, aspartyl aminopeptidase (DAP; EC 3.4.11.21) subfamily, is widely distributed in bacteria and eukaryotes. DAP cleaves only unblocked N-terminal acidic amino-acid residues. It is a cytosolic enzyme and is highly conserved; for example, the human enzyme has 51% identity to an aspartyl aminopeptidase-like protein in Arabidopsis thaliana. The mammalian DAP is highly selective for hydrolysis of N-terminal aspartate or glutamate residues from peptides. Unlike glutamyl aminopeptidase (M42), DAP does not cleave simple aminoaryl-arylamide substrates. Although there is lack of understanding of the function of this enzyme, it is thought to act in concert with other aminopeptidases to facilitate protein turnover because of their restricted specificities for the N-terminal aspartic and glutamic acid, which cannot be cleaved by any other aminopeptidases. The mammalian aspartyl aminopeptidase is possibly contributing to the catabolism of peptides, including those produced by the proteasome. It may also trim the N-terminus of peptides that are intended for the MHC class I system. In humans, DAP has been implicated in the specific function of converting angiotensin II to the vasoactive angiotensin III within the brain. Saccharomyces cerevisiae aminopeptidase I (Ape1) is involved in protein degradation in vacuoles (the yeast lysosomes) where it is transported by the unique cytoplasm-to-vacuole targeting (Cvt) pathway under vegetative growth conditions and by the autophagy pathway during starvation. Its N-terminal propeptide region, which mediates higher-order complex formation, serves as a scaffolding cargo critical for the assembly of the Cvt vesicle for vacuolar delivery. Pseudomonas aeruginosa aminopeptidase (PaAP) shows that its activity is dependent on Co2+ rather than Zn2+, and is thus a cocatalytic cobalt peptidase rather than a zinc-dependent peptidase.	439
349909	cd05659	M18_API	M18 peptidase aminopeptidase I. Peptidase M18 family, aminopeptidase I (vacuolar aminopeptidase I; polypeptidase; Leucine aminopeptidase IV; LAPIV; aminopeptidase III; aminopeptidase yscI; EC 3.4.11.22) subfamily. Aminopeptidase I is widely distributed in bacteria and eukaryotes, but only the yeast enzyme has been characterized to date. It is a vacuolar enzyme, synthesized as a cytosolic proform, and proteolytically matured upon arrival in the vacuole. The pro-aminopeptidase I (proAPI) does not enter the vacuole via the secretory pathway. In non-starved cells, it uses the cytoplasm to vacuole targeting (cvt) pathway and in cells starved for nitrogen, it is targeted to the vacuole via autophagy. Yeast aminopeptidase I is active only in its dodecameric form with broad substrate specificity, acting on all aminoacyl and peptidyl derivatives that contain a free alpha-amino group; this is in contrast to the highly selective M18 mammalian aspartyl aminopeptidase. N-terminal leucine and most other hydrophobic amino acid residues are the best substrates while glycine and charged amino acid residues in P1 position are cleaved much more slowly. This enzyme is strongly and specifically activated by zinc (Zn2+) and chloride (Cl-) ions.	446
349910	cd05660	M28_like_PA	M28 Zn-peptidase containing a protease-associated (PA) domain insert. Peptidase family M28 (also called aminopeptidase Y family), uncharacterized subfamily. The M28 family contains aminopeptidases as well as carboxypeptidases. They have co-catalytic zinc ions; each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. This subfamily is composed of uncharacterized proteins containing a protease-associated (PA) domain insert which may participate in substrate binding and/or promote conformational changes, influencing the stability and accessibility of the site to substrate.	290
349911	cd05661	M28_like_PA	M28 Zn-peptidase containing a PA domain insert. Peptidase family M28 (also called aminopeptidase Y family), uncharacterized subfamily. The M28 family contains aminopeptidases as well as carboxypeptidases. They have co-catalytic zinc ions; each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. This subfamily is composed of uncharacterized proteins containing a protease-associated (PA) domain insert which may participate in substrate binding and/or promote conformational changes, influencing the stability and accessibility of the site to substrate.	262
349912	cd05662	M28_like	M28 Zn-Peptidases. Peptidase family M28 (also called aminopeptidase Y family), uncharacterized subfamily. The M28 family contains aminopeptidases as well as carboxypeptidases. They have co-catalytic zinc ions; each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. This subfamily is composed of uncharacterized proteins that do not contain a protease-associated (PA) domain.	268
349913	cd05663	M28_like_PA_PDZ_associated	M28 Zn-peptidase containing a protease-associated (PA) domain insert and associated with a PDZ domain. Peptidase family M28 (also called aminopeptidase Y family), uncharacterized subfamily. The M28 family contains aminopeptidases as well as carboxypeptidases. They have co-catalytic zinc ions; each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. This subfamily is composed of uncharacterized proteins, many of which contain a protease-associated (PA) domain insert which may participate in substrate binding and/or promote conformational changes, influencing the stability and accessibility of the site to substrate. Proteins in this subfamily are also associated with the PDZ domain, a widespread protein module that has been recruited to serve multiple functions during the course of evolution.	266
349914	cd05664	M20_Acy1-like	M20 Peptidase aminoacylase 1 subfamily. Peptidase M20 family, Uncharacterized subfamily of proteins predicted as putative amidohydrolases or hippurate hydrolases. These are a class of zinc binding homodimeric enzymes involved in the hydrolysis of N-acetylated proteins. N-terminal acetylation of proteins is a widespread and highly conserved process that is involved in the protection and stability of proteins. Several types of aminoacylases can be distinguished on the basis of substrate specificity. Aminoacylase 1 (ACY1) breaks down cytosolic aliphatic N-acyl-alpha-amino acids (except L-aspartate), especially N-acetyl-methionine and acetyl-glutamate into L-amino acids and an acyl group. However, ACY1 can also catalyze the reverse reaction, the synthesis of acetylated amino acids. ACY1 may also play a role in xenobiotic bioactivation as well as in the inter-organ processing of amino acid-conjugated xenobiotic derivatives (S-substituted-N-acetyl-L-cysteine).	399
349915	cd05665	M20_Acy1_IAAspH	M20 Peptidases aminoacylase-1 indole-3-acetic-L-aspartic acid hydrolase. Peptidase M20 family, bacterial and archaeal aminoacylase-1 indole-3-acetic-L-aspartic acid hydrolase (IAA-Asp hydrolase; IAAspH; IAAH; IAA amidohydrolase; EC 3.5.1.-) subfamily. IAAspH hydrolyzes indole-3-acetyl-N-aspartic acid (IAA-Asp) to indole-3-acetic acid (IAA or auxin). Genes encoding IAA-amidohydrolases were first cloned from Arabidopsis; ILR1, IAR3, ILL1 and ILL2 encode active IAA-amino acid hydrolases, and three additional amidohydrolase-like genes (ILL3, ILL5, ILL6) have been isolated. In higher plants, the growth regulator indole-3-acetic acid (IAA or auxin) is found both free and conjugated via amide bonding to a variety of amino acids and peptides, and via an ester linkage to carbohydrates. IAA-Asp conjugates are involved in homeostatic control, protection, storing and subsequent use of free IAA. IAA-Asp is also found in some plants as a unique intermediate for entering into the IAA non-decarboxylative oxidative pathway. IAA amidohydrolase cleaves the amide bond between the auxin and the conjugated amino acid. Enterobacter agglomerans IAAspH has very strong enzyme activity and substrate specificity towards IAA-Asp, although its substrate affinity is weaker compared to Arabidopsis enzymes of the ILR1 gene family. Enhanced IAA-hydrolase activity has been observed during clubroot disease in Chinese cabbage.	415
349916	cd05666	M20_Acy1-like	M20 Peptidase aminoacylase 1 subfamily. Peptidase M20 family, uncharacterized subfamily of bacterial proteins predicted as putative amidohydrolases or hippurate hydrolases. These are a class of zinc binding homodimeric enzymes involved in hydrolysis of N-acetylated proteins. N-terminal acetylation of proteins is a widespread and highly conserved process that is involved in protection and stability of proteins. Several types of aminoacylases can be distinguished on the basis of substrate specificity. Aminoacylase 1 (ACY1) breaks down cytosolic aliphatic N-acyl-alpha-amino acids (except L-aspartate), especially N-acetyl-methionine and acetyl-glutamate into L-amino acids and an acyl group. However, ACY1 can also catalyze the reverse reaction, the synthesis of acetylated amino acids. ACY1 may also play a role in xenobiotic bioactivation as well as the inter-organ processing of amino acid-conjugated xenobiotic derivatives (S-substituted-N-acetyl-L-cysteine).	373
349917	cd05667	M20_Acy1-like	M20 Peptidase aminoacylase 1 subfamily. Peptidase M20 family, uncharacterized subfamily of bacterial proteins that have been predicted as N-acyl-L-amino acid amidohydrolase (amaA), thermostable carboxypeptidase (cpsA-1, cpsA-2 in Sulfolobus solfataricus) and abgB (aminobenzoyl-glutamate utilization protein B), and generally are involved in the urea cycle and metabolism of amino groups. Aminoacylases 1 (ACY1s) comprise a class of zinc binding homodimeric enzymes involved in the hydrolysis of N-acetylated proteins. N-terminal acetylation of proteins is a widespread and highly conserved process that is involved in the protection and stability of proteins. Several types of aminoacylases can be distinguished on the basis of substrate specificity. ACY1 breaks down cytosolic aliphatic N-acyl-alpha-amino acids (except L-aspartate), especially N-acetyl-methionine and acetyl-glutamate into L-amino acids and an acyl group. However, ACY1 can also catalyze the reverse reaction, the synthesis of acetylated amino acids. ACY1 may also play a role in xenobiotic bioactivation as well as the inter-organ processing of amino acid-conjugated xenobiotic derivatives (S-substituted-N-acetyl-L-cysteine).	403
349918	cd05668	M20_Acy1-like	M20 Peptidase aminoacylase 1 subfamily. Peptidase M20 family, uncharacterized subfamily of bacterial uncharacterized proteins predicted as putative amidohydrolases. These are a class of zinc binding homodimeric enzymes involved in hydrolysis of N-acetylated proteins. N-terminal acetylation of proteins is a widespread and highly conserved process that is involved in protection and stability of proteins. Several types of aminoacylases can be distinguished on the basis of substrate specificity. Aminoacylase 1 (ACY1) breaks down cytosolic aliphatic N-acyl-alpha-amino acids (except L-aspartate), especially N-acetyl-methionine and acetyl-glutamate into L-amino acids and an acyl group. However, ACY1 can also catalyze the reverse reaction, the synthesis of acetylated amino acids. ACY1 may also play a role in xenobiotic bioactivation as well as the inter-organ processing of amino acid-conjugated xenobiotic derivatives (S-substituted-N-acetyl-L-cysteine).	371
349919	cd05669	M20_Acy1_YxeP-like	M20 Peptidase aminoacylase-1 YxeP-like proteins, including YxeP, YtnL, YjiB and HipO2. Peptidase M20 family, aminoacylase-1 YxeP-like subfamily including YxeP, YtnL, YjiB and HipO2, most of which have not been well characterized to date. N-terminal acetylation of proteins is a widespread and highly conserved process that is involved in the protection and stability of proteins. Several types of aminoacylases can be distinguished on the basis of substrate specificity; substrates include indoleacetic acid (IAA) N-conjugates of amino acids, N-acetyl-L-amino acids and aminobenzoylglutamate. ACY1 breaks down cytosolic aliphatic N-acyl-alpha-amino acids (except L-aspartate), especially N-acetyl-methionine and acetyl-glutamate into L-amino acids and an acyl group. However, ACY1 can also catalyze the reverse reaction, the synthesis of acetylated amino acids. ACY1 may also play a role in xenobiotic bioactivation as well as in the inter-organ processing of amino acid-conjugated xenobiotic derivatives (S-substituted-N-acetyl-L-cysteine). ACY1 appears to physically interact with Sphingosine kinase type 1 (SphK1) and may influence its physiological functions; SphK1 and its product sphingosine-1-phosphate have been shown to promote cell growth and inhibit apoptosis of tumor cells. Strong expression of the human gene and its mouse ortholog Acy1 in brain, liver, and kidney suggests a role of the enzyme in amino acid metabolism of these organs.	371
349920	cd05670	M20_Acy1_YkuR-like	M20 Peptidase aminoacylase-1 YkuR-like proteins, including YkuR and Ama/HipO/HyuC proteins. Peptidase M20 family, aminoacylase-1 YkuR-like subfamily including YkuR and Ama/HipO/HyuC proteins, most of which have not been well characterized to date. N-terminal acetylation of proteins is a widespread and highly conserved process that is involved in the protection and stability of proteins. Several types of aminoacylases can be distinguished on the basis of substrate specificity; substrates include indoleacetic acid (IAA) N-conjugates of amino acids, N-acetyl-L-amino acids and aminobenzoylglutamate. ACY1 breaks down cytosolic aliphatic N-acyl-alpha-amino acids (except L-aspartate), especially N-acetyl-methionine and acetyl-glutamate into L-amino acids and an acyl group. However, ACY1 can also catalyze the reverse reaction, the synthesis of acetylated amino acids. ACY1 may also play a role in xenobiotic bioactivation as well as in the inter-organ processing of amino acid-conjugated xenobiotic derivatives (S-substituted-N-acetyl-L-cysteine). ACY1 appears to physically interact with Sphingosine kinase type 1 (SphK1) and may influence its physiological functions; SphK1 and its product sphingosine-1-phosphate have been shown to promote cell growth and inhibit apoptosis of tumor cells. Strong expression of the human gene and its mouse ortholog Acy1 in brain, liver, and kidney suggests a role of the enzyme in amino acid metabolism of these organs.	367
349921	cd05672	M20_ACY1L2-like	M20 Peptidase aminoacylase 1-like protein 2-like, amidohydrolase subfamily. Peptidase M20 family, aminoacylase 1-like protein 2 (ACY1L2; amidohydrolase)-like subfamily. This group contains many uncharacterized proteins predicted as amidohydrolases, including gene products of abgA and abgB that catalyze the cleavage of p-aminobenzoyl-glutamate, a folate catabolite in Escherichia coli, to p-aminobenzoate and glutamate. p-Aminobenzoyl-glutamate utilization is catalyzed by the abg region gene product, AbgT. This subfamily includes Staphylococcus aureus antibiotic resistance factor HmrA that has been shown to participate in methicillin resistance mechanisms in vivo in the presence of beta-lactams. Aminoacylase 1 (ACY1) proteins are a class of zinc binding homodimeric enzymes involved in hydrolysis of N-acetylated proteins. N-terminal acetylation of proteins is a widespread and highly conserved process that is involved in the protection and stability of proteins. Several types of aminoacylases can be distinguished on the basis of substrate specificity. ACY1 breaks down cytosolic aliphatic N-acyl-alpha-amino acids (except L-aspartate), especially N-acetyl-methionine and acetyl-glutamate into L-amino acids and an acyl group. However, ACY1 can also catalyze the reverse reaction, the synthesis of acetylated amino acids. ACY1 may also play a role in xenobiotic bioactivation as well as the inter-organ processing of amino acid-conjugated xenobiotic derivatives (S-substituted-N-acetyl-L-cysteine).	360
349922	cd05673	M20_Acy1L2_AbgB	M20 Peptidase Aminoacylase 1-like protein 2 aminobenzoyl-glutamate utilization protein B subfamily. Peptidase M20 family, ACY1L2 aminobenzoyl-glutamate utilization protein B (AbgB) subfamily. This group contains mostly bacterial amidohydrolases, including gene products of abgB that catalyze the cleavage of p-aminobenzoyl-glutamate, a folate catabolite in Escherichia coli, to p-aminobenzoate and glutamate. p-Aminobenzoyl-glutamate is a natural end product of folate catabolism, and its utilization is initiated by the abg region gene product, AbgT, which enables its uptake into the cell in a concentration-dependent, saturable manner. It is subsequently cleaved by AbgA and AbgB (sometimes referred to as AbgAB).	437
349923	cd05674	M20_yscS	M20 Peptidase, carboxypeptidase yscS. Peptidase M20 family, yscS (GlyX-carboxypeptidase, CPS1, carboxypeptidase S, carboxypeptidase a, carboxypeptidase yscS, glycine carboxypeptidase)-like subfamily. This group mostly contains proteins that have been uncharacterized to date, but also includes vacuolar proteins involved in nitrogen metabolism which are essential for use of certain peptides that are sole nitrogen sources. YscS releases a C-terminal amino acid from a peptide that has glycine as the penultimate residue. It is synthesized as one polypeptide chain precursor which yields two active precursor molecules after carbohydrate modification in the secretory pathway. The proteolytically unprocessed forms are associated with the membrane, whereas the mature forms of the enzyme are soluble. Enzymes in this subfamily may also cleave intracellularly generated peptides in order to recycle amino acids for protein synthesis. Also included in this subfamily is peptidase M20 domain containing 1 (PM20D1), which is enriched in uncoupling protein 1-positive, UCP1(+), versus UCP1(-) adipocytes and is a bidirectional enzyme in vitro, catalyzing both the condensation of fatty acids and amino acids to generate N-acyl amino acids and also the reverse hydrolytic reaction; N-acyl amino acids directly bind mitochondria and function as endogenous uncouplers of UCP1-independent respiration. Mouse studies show that increased circulating PM20D1 augments respiration and increases N-acyl amino acids in blood, and that administration of N-acyl amino acids improves glucose homeostasis and increases energy expenditure.	471
349924	cd05675	M20_yscS_like	M20 Peptidase, carboxypeptidase yscS-like. Peptidase M20 family, yscS (GlyX-carboxypeptidase, CPS1, carboxypeptidase S, carboxypeptidase a, carboxypeptidase yscS, glycine carboxypeptidase)-like subfamily. This group contains proteins that have been uncharacterized to date with similarity to vacuolar proteins involved in nitrogen metabolism which are essential for use of certain peptides that are sole nitrogen sources. YscS releases a C-terminal amino acid from a peptide that has glycine as the penultimate residue. It is synthesized as one polypeptide chain precursor which yields two active precursor molecules after carbohydrate modification in the secretory pathway. The proteolytically unprocessed forms are associated with the membrane, whereas the mature forms of the enzyme are soluble. Enzymes in this subfamily may also cleave intracellularly generated peptides in order to recycle amino acids for protein synthesis.	431
349925	cd05676	M20_dipept_like_CNDP	M20 cytosolic nonspecific dipeptidases including anserinase and serum carnosinase. Peptidase M20 family, CNDP (cytosolic nonspecific dipeptidase) subfamily including anserinase (Xaa-methyl-His dipeptidase, EC 3.4.13.5), 'serum' carnosinase (beta-alanyl-L-histidine dipeptidase; EC 3.4.13.20), and some uncharacterized proteins. Two genes, CN1 and CN2, coding for proteins that degrade carnosine (beta-alanyl-L-histidine) and homocarnosine (gamma-aminobutyric acid-L-histidine), two naturally occurring dipeptides with potential neuroprotective and neurotransmitter functions, have been identified. CN1 encodes for serum carnosinase and has narrow substrate specificity for Xaa-His dipeptides, where Xaa can be beta-alanine (carnosine), N-methyl beta-alanine, alanine, glycine and gamma-aminobutyric acid (homocarnosine). CN2 corresponds to the cytosolic nonspecific dipeptidase (CNDP; EC 3.4.13.18) and is not limited to Xaa-His dipeptides. CNDP requires Mn(2+) for full activity and does not hydrolyze homocarnosine. Anserinase is a dipeptidase that mainly catalyzes the hydrolysis of N-alpha-acetylhistidine.	467
349926	cd05677	M20_dipept_like_DUG2_type	M20 Defective in Utilization of Glutathione-type peptidases containing WD repeats. Peptidase M20 family, Defective in Utilization of Glutathione (DUG2) subfamily. DUG2-type proteins are metallopeptidases containing WD repeats at the N-terminus. DUG2 proteins are involved in the alternative pathway of glutathione (GSH) degradation. GSH, the major low-molecular-weight thiol compound in most eukaryotic cells, is normally degraded through the gamma-glutamyl cycle initiated by gamma-glutamyl transpeptidase. However, a novel pathway for the degradation of GSH has been characterized; it requires the participation of three genes identified in Saccharomyces cerevisiae as "defective in utilization of glutathione" genes including DUG1, DUG2, and DUG3. DUG1 encodes a probable di- or tri-peptidase identified as M20 metallopeptidase, DUG2 gene encodes a protein with a metallopeptidase domain and a large N-terminal WD40 repeat region, while DUG3 encodes a protein with a glutamine amidotransferase domain. Although dipeptides and tripeptides with a normal peptide bond, such as cys-gly or glu-cys-gly, can be hydrolyzed by the DUG1 protein, the presence of an unusual peptide bond, like in GSH, requires the participation of the DUG2 and DUG3 proteins as well. These three proteins form a GSH degradosomal complex.	436
349927	cd05678	M20_dipept_like	uncharacterized M20 dipeptidase. Peptidase M20 family, unknown dipeptidase-like subfamily (inferred by homology to be dipeptidases). M20 dipeptidases include a large variety of bacterial enzymes including cytosolic nonspecific dipeptidase (CNDP), Xaa-methyl-His dipeptidase (anserinase), and carnosinase. These dipeptidases have been shown to act on a wide range of dipeptides, but not larger peptides. For example, anserinase mainly catalyzes the hydrolysis of N-alpha-acetylhistidine while carnosinase degrades beta-alanyl-L-histidine.	466
349928	cd05679	M20_dipept_like	uncharacterized M20 dipeptidase. Peptidase M20 family, unknown dipeptidase-like subfamily (inferred by homology to be dipeptidases). M20 dipeptidases include a large variety of bacterial enzymes including cytosolic nonspecific dipeptidase (CNDP), Xaa-methyl-His dipeptidase (anserinase), and carnosinase. These dipeptidases have been shown to act on a wide range of dipeptides, but not larger peptides. For example, anserinase mainly catalyzes the hydrolysis of N-alpha-acetylhistidine while carnosinase degrades beta-alanyl-L-histidine.	448
349929	cd05680	M20_dipept_like	uncharacterized M20 dipeptidase. Peptidase M20 family, unknown dipeptidase-like subfamily (inferred by homology to be dipeptidases). M20 dipeptidases include a large variety of bacterial enzymes including cytosolic nonspecific dipeptidase (CNDP), Xaa-methyl-His dipeptidase (anserinase), and carnosinase. These dipeptidases have been shown to act on a wide range of dipeptides, but not larger peptides. For example, anserinase mainly catalyzes the hydrolysis of N-alpha-acetylhistidine while carnosinase degrades beta-alanyl-L-histidine.	437
349930	cd05681	M20_dipept_Sso-CP2	uncharacterized M20 dipeptidase. Peptidase M20 family, unknown dipeptidase-like subfamily (inferred by homology to be dipeptidases). M20 dipeptidases include a large variety of bacterial enzymes including cytosolic nonspecific dipeptidase (CNDP), Xaa-methyl-His dipeptidase (anserinase), and carnosinase. These dipeptidases have been shown to act on a wide range of dipeptides, but not larger peptides. For example, anserinase mainly catalyzes the hydrolysis of N-alpha-acetylhistidine while carnosinase degrades beta-alanyl-L-histidine. This family includes Sso-CP2 from Sulfolobus solfataricus.	429
349931	cd05682	M20_dipept_dapE	uncharacterized M20 dipeptidase. Peptidase M20 family, unknown dipeptidase-like subfamily (inferred by homology to be dipeptidases). M20 dipeptidases include a large variety of bacterial enzymes including cytosolic nonspecific dipeptidase (CNDP), Xaa-methyl-His dipeptidase (anserinase), and carnosinase. These dipeptidases have been shown to act on a wide range of dipeptides, but not larger peptides. For example, anserinase mainly catalyzes the hydrolysis of N-alpha-acetylhistidine while carnosinase degrades beta-alanyl-L-histidine. This family includes dapE (Lpg0809) from Legionella pneumophila.	451
349932	cd05683	M20_peptT_like	M20 Peptidase T like enzymes specifically cleave tripeptides. Peptidase M20 family, PeptT (tripeptide aminopeptidase; tripeptidase)-like subfamily. This group includes bacterial tripeptidases as well as predicted tripeptidases. Peptidase T acts only on tripeptide substrates, and is thus called a tripeptidase. It catalyzes the release of N-terminal amino acids with hydrophobic side chains from tripeptides with high specificity; dipeptides, tetrapeptides or tripeptides with the N-terminus blocked are not cleaved. Tripeptidases are known to function at the final stage of proteolysis in lactococcal bacteria and release amino acids from tripeptides produced during the digestion of milk proteins such as casein.	368
240189	cd05684	S1_DHX8_helicase	S1_DHX8_helicase: The N-terminal S1 domain of human ATP-dependent RNA helicase DHX8, a DEAH (Asp-Glu-Ala-His) box polypeptide. The DEAH-box RNA helicases are thought to play key roles in pre-mRNA splicing and DHX8 facilitates nuclear export of spliced mRNA by releasing the RNA from the spliceosome. DHX8 is also known as HRH1 (human RNA helicase 1) in Homo sapiens and PRP22 in Saccharomyces cerevisiae.	79
240190	cd05685	S1_Tex	S1_Tex: The C-terminal S1 domain of a transcription accessory factor called Tex, which has been characterized in Bordetella pertussis and Pseudomonas aeruginosa. The tex gene is essential in Bordetella pertussis and is named for its role in toxin expression. Tex has two functional domains, an N-terminal domain homologous to the Escherichia coli maltose repression protein, which is a poorly defined transcriptional factor, and a C-terminal S1 RNA-binding domain. Tex is found in prokaryotes, eukaryotes, and archaea.	68
240191	cd05686	S1_pNO40	S1_pNO40: pNO40, S1-like RNA-binding domain. pNO40 is a nucleolar protein of unknown function with an N-terminal S1 RNA binding domain, a CCHC type zinc finger, and clusters of basic amino acids representing a potential nucleolar targeting signal. pNO40 was identified through a yeast two-hybrid interaction screen of a human kidney cDNA library using the pinin (pnn) protein as bait. pNO40 is thought to play a role in ribosome maturation and/or biogenesis.	73
240192	cd05687	S1_RPS1_repeat_ec1_hs1	S1_RPS1_repeat_ec1_hs1: Ribosomal protein S1 (RPS1) domain. RPS1 is a component of the small ribosomal subunit thought to be involved in the recognition and binding of mRNAs during translation initiation. The bacterial RPS1 domain architecture consists of 4-6 tandem S1 domains. In some bacteria, the tandem S1 array is located C-terminal to a 4-hydroxy-3-methylbut-2-enyl diphosphate reductase (HMBPP reductase) domain. While RPS1 is found primarily in bacteria, proteins with tandem RPS1-like domains have been identified in plants and humans; however, these lack the N-terminal HMBPP reductase domain. This CD includes S1 repeat 1 of the Escherichia coli and Homo sapiens RPS1 (ec1 and hs1, respectively). Autoantibodies to double-stranded DNA from patients with systemic lupus erythematosus cross-react with the human RPS1 homolog.	70
240193	cd05688	S1_RPS1_repeat_ec3	S1_RPS1_repeat_ec3: Ribosomal protein S1 (RPS1) domain. RPS1 is a component of the small ribosomal subunit thought to be involved in the recognition and binding of mRNAs during translation initiation. The bacterial RPS1 domain architecture consists of 4-6 tandem S1 domains. In some bacteria, the tandem S1 array is located C-terminal to a 4-hydroxy-3-methylbut-2-enyl diphosphate reductase (HMBPP reductase) domain. While RPS1 is found primarily in bacteria, proteins with tandem RPS1-like domains have been identified in plants and humans; however, these lack the N-terminal HMBPP reductase domain. This CD includes S1 repeat 3 (ec3) of the Escherichia coli RPS1. Autoantibodies to double-stranded DNA from patients with systemic lupus erythematosus cross-react with the human RPS1 homolog.	68
240194	cd05689	S1_RPS1_repeat_ec4	S1_RPS1_repeat_ec4: Ribosomal protein S1 (RPS1) domain. RPS1 is a component of the small ribosomal subunit thought to be involved in the recognition and binding of mRNAs during translation initiation. The bacterial RPS1 domain architecture consists of 4-6 tandem S1 domains. In some bacteria, the tandem S1 array is located C-terminal to a 4-hydroxy-3-methylbut-2-enyl diphosphate reductase (HMBPP reductase) domain. While RPS1 is found primarily in bacteria, proteins with tandem RPS1-like domains have been identified in plants and humans; however, these lack the N-terminal HMBPP reductase domain. This CD includes S1 repeat 4 (ec4) of the Escherichia coli RPS1. Autoantibodies to double-stranded DNA from patients with systemic lupus erythematosus cross-react with the human RPS1 homolog.	72
240195	cd05690	S1_RPS1_repeat_ec5	S1_RPS1_repeat_ec5: Ribosomal protein S1 (RPS1) domain. RPS1 is a component of the small ribosomal subunit thought to be involved in the recognition and binding of mRNAs during translation initiation. The bacterial RPS1 domain architecture consists of 4-6 tandem S1 domains. In some bacteria, the tandem S1 array is located C-terminal to a 4-hydroxy-3-methylbut-2-enyl diphosphate reductase (HMBPP reductase) domain. While RPS1 is found primarily in bacteria, proteins with tandem RPS1-like domains have been identified in plants and humans; however, these lack the N-terminal HMBPP reductase domain. This CD includes S1 repeat 5 (ec5) of the Escherichia coli RPS1. Autoantibodies to double-stranded DNA from patients with systemic lupus erythematosus cross-react with the human RPS1 homolog.	69
240196	cd05691	S1_RPS1_repeat_ec6	S1_RPS1_repeat_ec6: Ribosomal protein S1 (RPS1) domain. RPS1 is a component of the small ribosomal subunit thought to be involved in the recognition and binding of mRNAs during translation initiation. The bacterial RPS1 domain architecture consists of 4-6 tandem S1 domains. In some bacteria, the tandem S1 array is located C-terminal to a 4-hydroxy-3-methylbut-2-enyl diphosphate reductase (HMBPP reductase) domain. While RPS1 is found primarily in bacteria, proteins with tandem RPS1-like domains have been identified in plants and humans; however, these lack the N-terminal HMBPP reductase domain. This CD includes S1 repeat 6 (ec6) of the Escherichia coli RPS1. Autoantibodies to double-stranded DNA from patients with systemic lupus erythematosus cross-react with the human RPS1 homolog.	73
240197	cd05692	S1_RPS1_repeat_hs4	S1_RPS1_repeat_hs4: Ribosomal protein S1 (RPS1) domain. RPS1 is a component of the small ribosomal subunit thought to be involved in the recognition and binding of mRNAs during translation initiation. The bacterial RPS1 domain architecture consists of 4-6 tandem S1 domains. In some bacteria, the tandem S1 array is located C-terminal to a 4-hydroxy-3-methylbut-2-enyl diphosphate reductase (HMBPP reductase) domain. While RPS1 is found primarily in bacteria, proteins with tandem RPS1-like domains have been identified in plants and humans; however, these lack the N-terminal HMBPP reductase domain. This CD includes S1 repeat 4 (hs4) of the H. sapiens RPS1 homolog. Autoantibodies to double-stranded DNA from patients with systemic lupus erythematosus cross-react with the human RPS1 homolog.	69
240198	cd05693	S1_Rrp5_repeat_hs1_sc1	S1_Rrp5_repeat_hs1_sc1: Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits. Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in Saccharomyces cerevisiae Rrp5 and 14 S1 repeats in Homo sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions. Mutational studies have shown that each region represents a specific functional domain. Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes H. sapiens S1 repeat 1 (hs1) and S. cerevisiae S1 repeat 1 (sc1). Rrp5 is found in eukaryotes but not in prokaryotes or archaea.	100
240199	cd05694	S1_Rrp5_repeat_hs2_sc2	S1_Rrp5_repeat_hs2_sc2: Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits. Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in Saccharomyces cerevisiae Rrp5 and 14 S1 repeats in Homo sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions. Mutational studies have shown that each region represents a specific functional domain. Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes H. sapiens S1 repeat 2 (hs2) and S. cerevisiae S1 repeat 2 (sc2). Rrp5 is found in eukaryotes but not in prokaryotes or archaea.	74
240200	cd05695	S1_Rrp5_repeat_hs3	S1_Rrp5_repeat_hs3: Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits. Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in Saccharomyces cerevisiae Rrp5 and 14 S1 repeats in Homo sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions. Mutational studies have shown that each region represents a specific functional domain. Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes H. sapiens S1 repeat 3 (hs3). Rrp5 is found in eukaryotes but not in prokaryotes or archaea.	66
240201	cd05696	S1_Rrp5_repeat_hs4	S1_Rrp5_repeat_hs4: Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits. Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in Saccharomyces cerevisiae Rrp5 and 14 S1 repeats in Homo sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions. Mutational studies have shown that each region represents a specific functional domain. Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes H. sapiens S1 repeat 4 (hs4). Rrp5 is found in eukaryotes but not in prokaryotes or archaea.	71
240202	cd05697	S1_Rrp5_repeat_hs5	S1_Rrp5_repeat_hs5: Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits. Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in Saccharomyces cerevisiae Rrp5 and 14 S1 repeats in Homo sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions. Mutational studies have shown that each region represents a specific functional domain. Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes H. sapiens S1 repeat 5 (hs5) and S. cerevisiae S1 repeat 5 (sc5). Rrp5 is found in eukaryotes but not in prokaryotes or archaea.	69
240203	cd05698	S1_Rrp5_repeat_hs6_sc5	S1_Rrp5_repeat_hs6_sc5: Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits. Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in Saccharomyces cerevisiae Rrp5 and 14 S1 repeats in Homo sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions. Mutational studies have shown that each region represents a specific functional domain. Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes H. sapiens S1 repeat 6 (hs6) and S. cerevisiae S1 repeat 5 (sc5). Rrp5 is found in eukaryotes but not in prokaryotes or archaea.	70
240204	cd05699	S1_Rrp5_repeat_hs7	S1_Rrp5_repeat_hs7: Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits. Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in Saccharomyces cerevisiae Rrp5 and 14 S1 repeats in Homo sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions. Mutational studies have shown that each region represents a specific functional domain. Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes H. sapiens S1 repeat 7 (hs7). Rrp5 is found in eukaryotes but not in prokaryotes or archaea.	72
240205	cd05700	S1_Rrp5_repeat_hs9	S1_Rrp5_repeat_hs9: Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits. Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in Saccharomyces cerevisiae Rrp5 and 14 S1 repeats in Homo sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions. Mutational studies have shown that each region represents a specific functional domain. Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes Homo sapiens S1 repeat 9 (hs9). Rrp5 is found in eukaryotes but not in prokaryotes or archaea.	65
240206	cd05701	S1_Rrp5_repeat_hs10	S1_Rrp5_repeat_hs10: Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits. Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in Saccharomyces cerevisiae Rrp5 and 14 S1 repeats in Homo sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions. Mutational studies have shown that each region represents a specific functional domain. Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes H. sapiens S1 repeat 10 (hs10). Rrp5 is found in eukaryotes but not in prokaryotes or archaea.	69
240207	cd05702	S1_Rrp5_repeat_hs11_sc8	S1_Rrp5_repeat_hs11_sc8: Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits. Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in Saccharomyces cerevisiae Rrp5 and 14 S1 repeats in Homo sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions. Mutational studies have shown that each region represents a specific functional domain. Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes H. sapiens S1 repeat 11 (hs11) and S. cerevisiae S1 repeat 8 (sc8). Rrp5 is found in eukaryotes but not in prokaryotes or archaea.	70
240208	cd05703	S1_Rrp5_repeat_hs12_sc9	S1_Rrp5_repeat_hs12_sc9: Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits. Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in Saccharomyces cerevisiae Rrp5 and 14 S1 repeats in Homo sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions.  Mutational studies have shown that each region represents a specific functional domain. Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes H. sapiens S1 repeat 12 (hs12) and S. cerevisiae S1 repeat 9 (sc9). Rrp5 is found in eukaryotes but not in prokaryotes or archaea.	73
240209	cd05704	S1_Rrp5_repeat_hs13	S1_Rrp5_repeat_hs13: Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits.  Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in Saccharomyces cerevisiae Rrp5 and 14 S1 repeats in Homo sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions.  Mutational studies have shown that each region represents a specific functional domain. Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes H. sapiens S1 repeat 13 (hs13). Rrp5 is found in eukaryotes but not in prokaryotes or archaea.	72
240210	cd05705	S1_Rrp5_repeat_hs14	S1_Rrp5_repeat_hs14: Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits. Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in Saccharomyces cerevisiae Rrp5 and 14 S1 repeats in Homo sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions. Mutational studies have shown that each region represents a specific functional domain. Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes H. sapiens S1 repeat 14 (hs14). Rrp5 is found in eukaryotes but not in prokaryotes or archaea.	74
240211	cd05706	S1_Rrp5_repeat_sc10	S1_Rrp5_repeat_sc10: Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits. Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in Saccharomyces cerevisiae Rrp5 and 14 S1 repeats in Homo sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions. Mutational studies have shown that each region represents a specific functional domain. Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes S. cerevisiae S1 repeat 10 (sc10). Rrp5 is found in eukaryotes but not in prokaryotes or archaea.	73
240212	cd05707	S1_Rrp5_repeat_sc11	S1_Rrp5_repeat_sc11: Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits. Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in Saccharomyces cerevisiae Rrp5 and 14 S1 repeats in Homo sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions. Mutational studies have shown that each region represents a specific functional domain. Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes S. cerevisiae S1 repeat 11 (sc11). Rrp5 is found in eukaryotes but not in prokaryotes or archaea.	68
240213	cd05708	S1_Rrp5_repeat_sc12	S1_Rrp5_repeat_sc12: Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits. Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in Saccharomyces cerevisiae Rrp5 and 14 S1 repeats in Homo sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions.  Mutational studies have shown that each region represents a specific functional domain. Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes S. cerevisiae S1 repeat 12 (sc12). Rrp5 is found in eukaryotes but not in prokaryotes or archaea.	77
100078	cd05709	S2P-M50	Site-2 protease (S2P) class of zinc metalloproteases (MEROPS family M50) cleaves transmembrane domains of substrate proteins, regulating intramembrane proteolysis (RIP) of diverse signal transduction mechanisms. Members of this family use proteolytic activity within the membrane to transfer information across membranes to integrate gene expression with physiologic stresses occurring in another cellular compartment. The domain core structure appears to contain at least three transmembrane helices with a catalytic zinc atom coordinated by three conserved residues contained within the consensus sequence HExxH, together with a conserved aspartate residue. The S2P/M50 family of RIP proteases is widely distributed; in eukaryotic cells, they regulate such processes as sterol and lipid metabolism, and endoplasmic reticulum (ER) stress responses. In sterol-depleted mammalian cells, a two-step proteolytic process releases the N-terminal domains of sterol regulatory element-binding proteins (SREBPs) from membranes of the ER. These domains translocate into the nucleus, where they activate genes of cholesterol and fatty acid biosynthesis. It is the second proteolytic step that is carried out by the SREBP Site-2 protease (S2P) which is present in this CD superfamily. Prokaryotic S2P/M50 homologs have been shown to regulate stress responses, sporulation, cell division, and cell differentiation. In Escherichia coli, the S2P homolog RseP is involved in the sigmaE pathway of extracytoplasmic stress responses, and in Bacillus subtilis, the S2P homolog SpoIVFB is involved in the pro-sigmaK pathway of spore formation. Some of the subfamilies within this hierarchy contain one or two PDZ domain insertions, with putative regulatory roles, such as the inhibition of substrate cleavage as seen by the RseP PDZ domain.	180
240214	cd05710	SIS_1	A subgroup of the SIS domain. SIS (Sugar ISomerase) domains are found in many phosphosugar isomerases and phosphosugar binding proteins. SIS domains are also found in proteins that regulate the expression of genes involved in synthesis of phosphosugars.	120
409376	cd05711	IgC2_D2_LILR_KIR_like	Second immunoglobulin (Ig)-like domain found in Leukocyte Ig-like receptors, Natural killer inhibitory receptors (KIRs) and similar domains; member of Immunoglobulin Constant-2 set of IgSF domains. The members here are composed of the second immunoglobulin (Ig)-like domain found in Leukocyte Ig-like receptors (LILRs), Natural killer inhibitory receptors (KIRs, also known as cluster of differentiation (CD) 158), and similar proteins. This group includes LILRB1 (also known as LIR-1), LILRA5 (also known as LIR9), an activating natural cytotoxicity receptor NKp46, the immune-type receptor glycoprotein VI (GPVI), and the IgA-specific receptor Fc-alphaRI (also known as cluster of differentiation (CD) 89). LILRs are a family of immunoreceptors expressed on T and B cells, on monocytes, dendritic cells, and subgroups of natural killer (NK) cells. The human LILR family contains nine proteins (LILRA1-3, and 5, and LILRB1-5). From functional assays, and as the cytoplasmic domains of various LILRs, for example LILRB1, LILRB2 (also known as LIR-2), and LILRB3 (also known as LIR-3) contain immunoreceptor tyrosine-based inhibitory motifs (ITIMs), it is thought that LIR proteins are inhibitory receptors. Of the eight LIR family proteins, only LILRB1, and LILRB2, show detectable binding to class I MHC molecules; ligands for the other members have yet to be determined. The extracellular portions of the different LIR proteins contain different numbers of Ig-like domains for example, four in the case of LILRB1, and LILRB2, and two in the case of LILRB4 (also known as LIR-5). The activating natural cytotoxicity receptor NKp46 is expressed in natural killer cells, and is organized as an extracellular portion having two Ig-like extracellular domains, a transmembrane domain, and a small cytoplasmic portion. GPVI, which also contains two Ig-like domains, participates in the processes of collagen-mediated platelet activation and arterial thrombus formation. Fc-alphaRI is expressed on monocytes, eosinophils, neutrophils, and macrophages; it mediates IgA-induced immune effector responses such as phagocytosis, antibody-dependent cell-mediated cytotoxicity and respiratory burst. Killer cell immunoglobulin-like receptors (KIRs; also known as CD158 for human KIR) are transmembrane glycoproteins expressed by natural killer cells and subsets of T cells. KIRs are a family of highly polymorphic activating and inhibitory receptors that serve as key regulators of human NK cell function. The KIR proteins are classified by the number of extracellular immunoglobulin domains (2D or 3D) and by whether they have a long (L) or short (S) cytoplasmic domain. KIR proteins with the long cytoplasmic domain transduce inhibitory signals upon ligand binding via an immune tyrosine-based inhibitory motif (ITIM), while KIR proteins with the short cytoplasmic domain lack the ITIM motif and instead associate with the TYRO protein tyrosine kinase binding protein to transduce activating signals. The major ligands for KIR are MHC class I (HLA-A, -B or -C) molecules.	90
409377	cd05712	IgV_CD33	Immunoglobulin Variable (IgV) domain at the N-terminus of CD33 and related Siglecs (sialic acid-binding Ig-like lectins). The members here are composed of the immunoglobulin (Ig) domain at the N-terminus of Cluster of Differentiation (CD) 33 and related Siglecs (sialic acid-binding Ig-like lectins). Siglec refers to a structurally related protein family that specifically recognizes sialic acid in oligosaccharide chains of glycoproteins and glycolipids. Siglecs are type I transmembrane proteins, organized as an extracellular module composed of Ig-like domains, an N-terminal variable set of Ig-like carbohydrate recognition domains, and 1 to 16 constant Ig-like domains, followed by transmembrane and short cytoplasmic domains. Human Siglecs are classified into two subgroups, one subgroup is comprised of sialoadhesin (Siglec-1), CD22 (Siglec-2), and MAG, the other subgroup is comprised of CD33-related Siglecs which include CD33 (Siglec-3) and human Siglecs 5-11.	119
409378	cd05713	IgV_MOG_like	Immunoglobulin (Ig)-like domain of myelin oligodendrocyte glycoprotein (MOG). The members here are composed of the immunoglobulin (Ig)-like domain of myelin oligodendrocyte glycoprotein (MOG). MOG, a minor component of the myelin sheath, is an important CNS-specific autoantigen, linked to the pathogenesis of multiple sclerosis (MS) and experimental autoimmune encephalomyelitis (EAE). It is a transmembrane protein having an extracellular Ig domain. MOG is expressed in the CNS on the outermost lamellae of the myelin sheath, and on the surface of oligodendrocytes, and may participate in the completion, compaction, and/or maintenance of myelin. This group also includes butyrophilin (BTN). BTN is the most abundant protein in bovine milk-fat globule membrane (MFGM).	114
409379	cd05714	Ig_CSPGs_LP_like	Immunoglobulin (Ig)-like domain of chondroitin sulfate proteoglycans (CSPGs), human cartilage link protein (LP), and similar domains. The members here are composed of the immunoglobulin (Ig)-like domain similar to that found in chondroitin sulfate proteoglycans (CSPGs) and human cartilage link protein (LP). Included in this group are the CSPGs aggrecan, versican, and neurocan. In CSPGs, this Ig-like domain is followed by hyaluronan (HA)-binding tandem repeats, and a C-terminal region with epidermal growth factor-like, lectin-like, and complement regulatory protein-like domains. Separating these N- and C-terminal regions is a nonhomologous glycosaminoglycan attachment region. In cartilage, aggrecan forms cartilage link protein stabilized aggregates with hyaluronan (HA). These aggregates contribute to the tissue's load bearing properties. Aggrecan and versican have a wide distribution in connective tissue and extracellular matrices. Neurocan is localized almost exclusively in nervous tissue. Aggregates having other CSPGs substituting for aggrecan may contribute to the structural integrity of many different tissues. There is considerable evidence that HA-binding CSPGs are involved in developmental processes in the central nervous system. Members of the vertebrate HPLN (hyaluronan/HA and proteoglycan binding link) protein family are physically linked adjacent to CSPG genes.	123
409380	cd05715	IgV_P0-like	Immunoglobulin (Ig)-like domain of protein zero (P0) and similar proteins. The members here are composed of the immunoglobulin (Ig) domain of protein zero (P0), a myelin membrane adhesion molecule. P0 accounts for over 50% of the total protein in peripheral nervous system (PNS) myelin. P0 is a single-pass transmembrane glycoprotein with a highly basic intracellular domain and an extracellular Ig domain. The extracellular domain of P0 (P0-ED) is similar to the Ig variable domain, carrying one acceptor sequence for N-linked glycosylation. P0 plays a role in membrane adhesion in the spiral wraps of the myelin sheath. The intracellular domain is thought to mediate membrane apposition of the cytoplasmic faces and may, through electrostatic interactions, interact directly with lipid headgroups. It is thought that homophilic interactions of the P0 extracellular domain mediate membrane juxtaposition in the extracellular space of PNS myelin. This group also contains the Ig domain of sodium channel subunit beta-2 (SCN2B), and of epithelial V-like antigen 1 (EVA). EVA, also known as myelin protein zero-like 2, is an adhesion molecule, which may play a role in structural organization of the thymus and early lymphocyte development. SCN2B subunits play a role in determining sodium channel density and function in neurons, and in control of electrical excitability in the brain.	117
409381	cd05716	IgV_pIgR_like	Immunoglobulin (Ig)-like domain in the polymeric Ig receptor (pIgR) and similar proteins. The members here are composed of the immunoglobulin (Ig)-like domain in the polymeric Ig receptor (pIgR) and similar proteins. pIgR delivers dimeric IgA and pentameric IgM to mucosal secretions. Polymeric immunoglobulin (pIgs) are the first defense against pathogens and toxins. IgA and IgM can form polymers via an 18-residue extension at their C-termini referred to as the tailpiece. pIgR transports pIgs across mucosal epithelia into mucosal secretions. Human pIgR is a glycosylated type I transmembrane protein, comprised of a 620-residue extracellular region, a 23-residue transmembrane region, and a 103-residue cytoplasmic tail. The extracellular region contains five domains that share sequence similarity with Ig variable (v) regions. This group also contains the Ig-like extracellular domains of other receptors such as NK cell receptor Nkp44 and myeloid receptors, among others.	100
409382	cd05717	IgV_1_Necl_like	First (N-terminal) immunoglobulin (Ig)-like domain of the nectin-like molecules; member of the V-set of Ig superfamily (IgSF) domains. The members here are composed of the N-terminal immunoglobulin (Ig)-like domain of the nectin-like molecules Necl-1 (also known as cell adhesion molecule 3 (CADM3)), Necl-2 (CADM1), Necl-3 (CADM2), and similar proteins. At least five nectin-like molecules have been identified (Necl-1 to Necl-5). They all have an extracellular region containing three Ig-like domains, a transmembrane region, and a cytoplasmic region. The N-terminal Ig-like domain of the extracellular region belongs to the V-type subfamily of Ig domains, is essential to cell-cell adhesion, and plays a part in the interaction with the envelope glycoprotein D of various viruses. Necl-1, Necl-2, and Necl-3 have Ca(2+)-independent homophilic and heterophilic cell-cell adhesion activity. Necl-1 is specifically expressed in neural tissue, and is important to the formation of synapses, axon bundles, and myelinated axons. Necl-2 is expressed in a wide variety of tissues and is a putative tumour suppressor gene which is downregulated in aggressive neuroblastoma. Necl-3 accumulates in central and peripheral nervous system tissue and has been shown to selectively interact with oligodendrocytes. This group also contains Class-I MHC-restricted T-cell-associated molecule (CRTAM), whose expression pattern is consistent with its expression in Class-I MHC-restricted T-cells.	94
409383	cd05718	IgV_1_PVR_like	First immunoglobulin variable (IgV) domain of poliovirus receptor (PVR, also known as CD155 and necl-5), and similar domains. The members here are composed of the first immunoglobulin (Ig) domain of poliovirus receptor (PVR, also known as CD155 and nectin-like protein 5 (necl-5)). Poliovirus (PV) binds to its cellular receptor (PVR/CD155) to initiate infection. CD155 is a membrane-anchored, single-span glycoprotein; its extracellular region has three Ig-like domains. There are four different isotypes of CD155 (referred to as alpha, beta, gamma, and delta), that result from alternate splicing of the CD155 mRNA, and have identical extracellular domains. CD155-beta and CD155-gamma are secreted; CD155-alpha and CD155-delta are membrane-bound and function as PV receptors. The virus recognition site is contained in the amino-terminal domain, D1. Having the virus attachment site on the receptor distal from the plasma membrane may be important for successful initiation of infection of cells by the virus. CD155 binds in the poliovirus "canyon" with a footprint similar to that of the intercellular adhesion molecule-1 receptor on human rhinoviruses. This group also includes the first Ig-like domain of nectin-1 (also known as poliovirus receptor related protein 1 (PVRL1); CD111), nectin-3 (also known as PVRL3), nectin-4 (also known as PVRL4; LNIR receptor), and DNAX accessory molecule 1 (DNAM-1; CD226).	113
409384	cd05719	IgC1_2_PVR_like	Second immunoglobulin (Ig) domain of poliovirus receptor (PVR, also known as CD155 and Necl-5), and similar domains; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the second immunoglobulin (Ig) domain of poliovirus receptor (PVR, also known as CD155 and nectin-like protein 5 (Necl-5)) and similar proteins. Poliovirus (PV) binds to its cellular receptor (PVR/CD155) to initiate infection. CD155 is a membrane-anchored, single-span glycoprotein; its extracellular region has three Ig-like domains. There are four different isotypes of CD155 (referred to as alpha, beta, gamma, and delta) that result from alternate splicing of the CD155 mRNA and have identical extracellular domains. CD155-beta and CD155-gamma are secreted, while CD155-alpha and CD155-delta are membrane-bound and function as PV receptors. The virus recognition site is contained in the amino-terminal domain, D1. Having the virus attachment site on the receptor distal from the plasma membrane may be important for successful initiation of infection of cells by the virus. CD155 binds in the poliovirus "canyon" and has a footprint similar to that of the intercellular adhesion molecule-1 receptor on human rhinoviruses. This group also includes the second Ig-like domain of nectin-1, also known as poliovirus receptor related protein 1 (PVRL1) or CD111.	96
409385	cd05720	IgV_CD8_alpha	Immunoglobulin (Ig)-like variable (V) domain of Cluster of Differentiation (CD) 8 alpha chain. The members here are composed of the immunoglobulin (Ig)-like variable domain of the Cluster of Differentiation (CD) 8 alpha. The CD8 glycoprotein plays an essential role in the control of T-cell selection, maturation, and the T-cell receptor (TCR)-mediated response to peptide antigen. CD8 is comprised of alpha and beta subunits and is expressed as either an alpha/alpha or alpha/beta dimer. Both dimeric isoforms can serve as a coreceptor for T cell activation and differentiation, however they have distinct physiological roles, different cellular distributions, unique binding partners, etc. Each CD8 subunit is comprised of an extracellular domain containing a V-type Ig-like domain, a single pass transmembrane portion, and a short intracellular domain. The Ig domain of CD8 alpha binds to antibodies. Members of this group contain standard Ig superfamily V-set AGFCC'C"/DEB domain topology.	110
409386	cd05721	IgV_CTLA-4	Immunoglobulin Variable (IgV) domain of cytotoxic T lymphocyte-associated antigen 4 (CTLA-4). The members here are composed of the variable(v)-type immunoglobulin (Ig) domain found in cytotoxic T lymphocyte-associated antigen 4 (CTLA-4).  CTLA-4 is involved in the regulation of T cell response, acting as an inhibitor of intracellular signaling.  CTLA-4 is similar to CD28, a T cell co-receptor protein that recognizes the B7 proteins (CD80 and CD86). CD28 binding of the B7 proteins occurs after the presentation of antigen to the T cell receptor (TCR) via the peptide-MHC complex on the surface of an antigen presenting cell (APC).  CTLA-4 also binds the B7 molecules with a higher affinity than does CD28.  The B7/CTLA-4 interaction generates inhibitory signals down-regulating the response, and may prevent T cell activation by weak TCR signals. CD28 and CTLA-4 then elicit opposing signals in the regulation of T cell responsiveness and homeostasis. T cell activation leads to increased CTLA-4 gene expression and trafficking of CTLA-4 protein to the cell surface. CTLA-4 is not detected on the T-cell surface until 24 hours after activation. Covalent dimerization of CTLA-4 has been shown to be required for its high binding avidity, although each CTLA-4 monomer contains a binding site for CD80 and CD86.	115
409387	cd05722	IgI_1_Neogenin_like	First immunoglobulin (Ig)-like domain in neogenin, and similar domains; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the first immunoglobulin (Ig)-like domain in neogenin and related proteins. Neogenin  is a cell surface protein which is expressed in the developing nervous system of vertebrate embryos in the growing nerve cells. It is also expressed in other embryonic tissues and may play a general role in developmental processes such as cell migration, cell-cell recognition, and tissue growth regulation. Included in this group is the tumor suppressor protein DCC which is deleted in colorectal carcinoma. DCC and neogenin each have four Ig-like domains followed by six fibronectin type III domains, a transmembrane domain, and an intracellular domain. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand.	97
409388	cd05723	IgI_4_Neogenin_like	Fourth immunoglobulin (Ig)-like domain in neogenin, and similar domains; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the fourth immunoglobulin (Ig)-like domain in neogenin and related proteins. Neogenin  is a cell surface protein which is expressed in the developing nervous system of vertebrate embryos in the growing nerve cells. It is also expressed in other embryonic tissues, and may play a general role in developmental processes such as cell migration, cell-cell recognition, and tissue growth regulation. Included in this group is the tumor suppressor protein DCC which is deleted in colorectal carcinoma. DCC and neogenin each have four Ig-like domains followed by six fibronectin type III domains, a transmembrane domain, and an intracellular domain. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand.	84
409389	cd05724	IgI_2_Robo	Second immunoglobulin (Ig)-like domain in Robo (roundabout) receptors; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the second immunoglobulin (Ig)-like domain in Robo (roundabout) receptors. Robo receptors play a role in the development of the central nervous system (CNS), and are receptors of the Slit protein. Slit is a repellant secreted by the neural cells in the midline. Slit acts through Robo to prevent most neurons from crossing the midline from either side. Three mammalian Robo homologs (Robo1, Robo2, and Robo3), and three mammalian Slit homologs (Slit-1, Slit-2, Slit-3), have been identified. Commissural axons, which cross the midline, express low levels of Robo; longitudinal axons, which avoid the midline, express high levels of Robo. Robo1, Robo2, and Robo3 are expressed by commissural neurons in the vertebrate spinal cord and Slit-1, Slit-2, and Slit-3 are expressed at the ventral midline. Robo-3 is a divergent member of the Robo family which instead of being a positive regulator of Slit responsiveness, antagonizes Slit responsiveness in precrossing axons.  The Slit-Robo interaction is mediated by the second leucine-rich repeat (LRR) domain of Slit and the two N-terminal Ig domains of Robo, Ig1 and Ig2. The primary Robo binding site for Slit-2 has been shown by surface plasmon resonance experiments and mutational analysis to be the Ig1 domain, while the Ig2 domain has been proposed to harbor a weak secondary binding site. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand.	87
409390	cd05725	IgI_3_Robo	Third immunoglobulin (Ig)-like domain in Robo (roundabout) receptors; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the third immunoglobulin (Ig)-like domain in Robo (roundabout) receptors. Robo receptors play a role in the development of the central nervous system (CNS), and are receptors of Slit protein. Slit is a repellant secreted by the neural cells in the midline. Slit acts through Robo to prevent most neurons from crossing the midline from either side. Three mammalian Robo homologs (Robo1, Robo2, Robo3), and three mammalian Slit homologs (Slit-1, Slit-2, Slit-3), have been identified. Commissural axons, which cross the midline, express low levels of Robo; longitudinal axons, which avoid the midline, express high levels of Robo. Robo1, Robo2, and Robo3 are expressed by commissural neurons in the vertebrate spinal cord and Slit-1, Slit-2, and Slit-3 are expressed at the ventral midline. Robo-3 is a divergent member of the Robo family which instead of being a positive regulator of Slit responsiveness, antagonizes Slit responsiveness in precrossing axons.  The Slit-Robo interaction is mediated by the second leucine-rich repeat (LRR) domain of Slit and the two N-terminal Ig domains of Robo, Ig1 and Ig2. The primary Robo binding site for Slit2 has been shown by surface plasmon resonance experiments and mutational analysis to be the Ig1 domain, while the Ig2 domain has been proposed to harbor a weak secondary binding site. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand.	83
409391	cd05726	IgI_4_Robo	Fourth immunoglobulin (Ig)-like domain in Robo (roundabout) receptors; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the fourth immunoglobulin (Ig)-like domain in Robo (roundabout) receptors. Robo receptors play a role in the development of the central nervous system (CNS), and are receptors of Slit protein. Slit is a repellant secreted by the neural cells in the midline. Slit acts through Robo to prevent most neurons from crossing the midline from either side. Three mammalian Robo homologs (Robo1, Robo2, Robo3), and three mammalian Slit homologs (Slit-1, Slit-2, Slit-3), have been identified. Commissural axons, which cross the midline, express low levels of Robo; longitudinal axons, which avoid the midline, express high levels of Robo. Robo1, Robo2, and Robo3 are expressed by commissural neurons in the vertebrate spinal cord and Slit-1, Slit-2, and Slit-3 are expressed at the ventral midline. Robo-3 is a divergent member of the Robo family which instead of being a positive regulator of Slit responsiveness, antagonizes Slit responsiveness in precrossing axons.  The Slit-Robo interaction is mediated by the second leucine-rich repeat (LRR) domain of Slit and the two N-terminal Ig domains of Robo, Ig1 and Ig2. The primary Robo binding site for Slit2 has been shown by surface plasmon resonance experiments and mutational analysis to be the Ig1 domain, while the Ig2 domain has been proposed to harbor a weak secondary binding site. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand.	98
409392	cd05727	Ig2_Contactin-2-like	Second Ig domain of the neural cell adhesion molecule contactin-2, and similar domains. The members here are composed of the second Ig domain of the neural cell adhesion molecule contactin-2. Contactins are comprised of six Ig domains followed by four fibronectin type III (FnIII) domains anchored to the membrane by glycosylphosphatidylinositol. Contactin-2 (also called TAG-1, axonin-1) facilitates cell adhesion by homophilic binding between molecules in apposed membranes. The first four Ig domains form the intermolecular binding fragment which arranges as a compact U-shaped module by contacts between Ig domains 1 and 4, and domains 2 and 3. It has been proposed that a linear zipper-like array forms, from contactin-2 molecules alternatively provided by the two apposed membranes.	88
143205	cd05728	Ig4_Contactin-2-like	Fourth Ig domain of the neural cell adhesion molecule contactin-2, and similar domains. The members here are composed of the fourth Ig domain of the neural cell adhesion molecule contactin-2. Contactins are comprised of six Ig domains followed by four fibronectin type III (FnIII) domains anchored to the membrane by glycosylphosphatidylinositol. Contactin-2 (also called TAG-1, axonin-1) facilitates cell adhesion by homophilic binding between molecules in apposed membranes. The first four Ig domains form the intermolecular binding fragment which arranges as a compact U-shaped module by contacts between Ig domains 1 and 4, and domains 2 and 3. It has been proposed that a linear zipper-like array forms, from contactin-2 molecules alternatively provided by the two apposed membranes.	85
409393	cd05729	IgI_2_FGFR_like	Second immunoglobulin (Ig)-like domain of fibroblast growth factor (FGF) receptor, and similar domains; member of the I-set of IgSF domains. The members here are composed of the second immunoglobulin (Ig)-like domain of fibroblast growth factor (FGF) receptor. FGF receptors bind FGF signaling polypeptides. FGFs participate in multiple processes such as morphogenesis, development, and angiogenesis. FGFs bind to four FGF receptor tyrosine kinases (FGFR1, FGFR2, FGFR3, FGFR4). Receptor diversity is controlled by alternative splicing producing splice variants with different ligand binding characteristics and different expression patterns. FGFRs have an extracellular region comprised of three Ig-like domains, a single transmembrane helix, and an intracellular tyrosine kinase domain. Ligand binding and specificity reside in the Ig-like domains 2 and 3, and the linker region that connects these two. FGFR activation and signaling depend on FGF-induced dimerization, a process involving cell surface heparin or heparan sulfate proteoglycans. This group also contains fibroblast growth factor (FGF) receptor like-1 (FGFRL1). FGFRL1 does not have a protein tyrosine kinase domain at its C-terminus; neither does its cytoplasmic domain appear to interact with a signaling partner. It has been suggested that FGFRL1 may not have any direct signaling function, but instead acts as a decoy receptor trapping FGFs and preventing them from binding other receptors.	95
143207	cd05730	IgI_3_NCAM-1	Third immunoglobulin (Ig)-like domain of Neural Cell Adhesion Molecule 1 (NCAM-1); member of the I-set of IgSF domains. The members here are composed of the third immunoglobulin (Ig)-like domain of Neural Cell Adhesion Molecule (NCAM-1). NCAM plays important roles in the development and regeneration of the central nervous system, in synaptogenesis and neural migration. NCAM mediates cell-cell and cell-substratum recognition and adhesion via homophilic (NCAM-NCAM), and heterophilic (NCAM-non-NCAM), interactions. NCAM is expressed as three major isoforms having different intracellular extensions. The extracellular portion of NCAM has five N-terminal Ig-like domains and two fibronectin type III domains. The double zipper adhesion complex model for NCAM homophilic binding involves Ig1, Ig2, and Ig3. By this model, Ig1 and Ig2 mediate dimerization of NCAM molecules situated on the same cell surface (cis interactions), and Ig3 domains mediate interactions between NCAM molecules expressed on the surface of opposing cells (trans interactions) through binding to the Ig1 and Ig2 domains. The adhesive ability of NCAM is modulated by the addition of polysialic acid chains to the fifth Ig-like domain.	95
409394	cd05731	Ig3_L1-CAM_like	Third immunoglobulin (Ig)-like domain of the L1 cell adhesion molecule (CAM), and similar domains. The members here are composed of the third immunoglobulin (Ig)-like domain of the L1 cell adhesion molecule (CAM). L1 belongs to the L1 subfamily of cell adhesion molecules (CAMs) and is comprised of an extracellular region having six Ig-like domains and five fibronectin type III domains, a transmembrane region and an intracellular domain. L1 is primarily expressed in the nervous system and is involved in its development and function. L1 is associated with an X-linked recessive disorder, X-linked hydrocephalus, MASA syndrome, and spastic paraplegia type 1, that involves abnormalities of axonal growth. This group also contains the chicken neuron-glia cell adhesion molecule, Ng-CAM and human neurofascin.	83
409395	cd05732	IgI_NCAM-1_like	Immunoglobulin (Ig)-like I-set domain of Neural Cell Adhesion Molecule 1 (NCAM-1) and similar proteins. The members here are composed of the fourth immunoglobulin (Ig)-like domain of Neural Cell Adhesion Molecule (NCAM-1). NCAM plays important roles in the development and regeneration of the central nervous system, in synaptogenesis and neural migration. NCAM mediates cell-cell and cell-substratum recognition and adhesion via homophilic (NCAM-NCAM), and heterophilic (NCAM-non-NCAM), interactions. NCAM is expressed as three major isoforms having different intracellular extensions. The extracellular portion of NCAM has five N-terminal Ig-like domains and two fibronectin type III domains. The double zipper adhesion complex model for NCAM homophilic binding involves Ig1, Ig2, and Ig3. By this model, Ig1 and Ig2 mediate dimerization of NCAM molecules situated on the same cell surface (cis interactions), and Ig3 domains mediate interactions between NCAM molecules expressed on the surface of opposing cells (trans interactions), through binding to the Ig1 and Ig2 domains. The adhesive ability of NCAM is modulated by the addition of polysialic acid chains to the fifth Ig-like domain. Also included in this group is NCAM-2 (also known as OCAM/mamFas II and RNCAM). NCAM-2 is differentially expressed in the developing and mature olfactory epithelium (OE). One of the unique features of I-set domains is the lack of a C" strand. The structures of this group show that the Ig domain lacks this strand and thus is a member of the I-set of Ig domains.	96
409396	cd05733	IgI_L1-CAM_like	Immunoglobulin (Ig)-like domain of the L1 cell adhesion molecule (CAM) and similar proteins; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the first immunoglobulin (Ig)-like domain of the L1 cell adhesion molecule (CAM). L1 belongs to the L1 subfamily of cell adhesion molecules (CAMs) and is comprised of an extracellular region having six Ig-like domains and five fibronectin type III domains, a transmembrane region and an intracellular domain. L1 is primarily expressed in the nervous system and is involved in its development and function. L1 is associated with an X-linked recessive disorder, X-linked hydrocephalus, MASA syndrome, or spastic paraplegia type 1, that involves abnormalities of axonal growth. This group also contains NrCAM [Ng(neuronglia)CAM-related cell adhesion molecule], which is primarily expressed in the nervous system, and human neurofascin. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand.	94
409397	cd05734	Ig_DSCAM	Immunoglobulin (Ig)-like domain of Down Syndrome Cell Adhesion molecule (DSCAM). The members here are composed of the immunoglobulin (Ig)-like domain of Down Syndrome Cell Adhesion molecule (DSCAM). DSCAM is a cell adhesion molecule expressed largely in the developing nervous system. The gene encoding DSCAM is located at human chromosome 21q22, the locus associated with the intellectual disability phenotype of Down Syndrome. DSCAM is predicted to be the largest member of the IG superfamily. It has been demonstrated that DSCAM can mediate cation-independent homophilic intercellular adhesion.	97
409398	cd05735	Ig_DSCAM	Immunoglobulin (Ig) domain of Down Syndrome Cell Adhesion molecule (DSCAM). The members here are composed of the immunoglobulin (Ig) domain of Down Syndrome Cell Adhesion molecule (DSCAM). DSCAM is a cell adhesion molecule expressed largely in the developing nervous system. The gene encoding DSCAM is located at human chromosome 21q22, the locus associated with the intellectual disability phenotype of Down Syndrome. DSCAM is predicted to be the largest member of the IG superfamily. It has been demonstrated that DSCAM can mediate cation-independent homophilic intercellular adhesion.	101
409399	cd05736	IgI_2_Follistatin_like	Second immunoglobulin (Ig)-like domain of a Follistatin-related protein 5, and similar domains; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the second immunoglobulin (Ig)-like domain found in human Follistatin-related protein 5 (FSTL5) and a follistatin-like molecule encoded by the CNS-related Mahya gene. Mahya genes have been retained in certain Bilaterian branches during evolution. They are conserved in Hymenoptera and Deuterostomes, but are absent from other metazoan species such as fruit fly and nematode. Mahya proteins are secretory, with a follistatin-like domain (Kazal-type serine/threonine protease inhibitor domain and EF-hand calcium-binding domain), two Ig-like domains, and a novel C-terminal domain. Mahya may be involved in learning and memory and in processing of sensory information in Hymenoptera and vertebrates. Follistatin is a secreted, multidomain protein that binds activins with high affinity and antagonizes their signaling.	93
319300	cd05737	IgI_Myomesin_like_C	C-terminal immunoglobulin (Ig)-like domain of myomesin and M-protein; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the C-terminal immunoglobulin (Ig)-like domain of myomesin and M-protein (also known as myomesin-2). Myomesin and M-protein are both structural proteins localized to the M-band, a transverse structure in the center of the sarcomere, and are candidates for M-band bridges. Both proteins are modular, consisting mainly of repetitive Ig-like and fibronectin type III (FnIII) domains. Myomesin is expressed in all types of vertebrate striated muscle; M-protein has a muscle-type specific expression pattern. Myomesin is present in both slow and fast fibers; M-protein is present only in fast fibers. It has been suggested that myomesin acts as a molecular spring with alternative splicing as a means of modifying its elasticity.	92
409400	cd05738	IgI_2_RPTP_IIa_LAR_like	Second immunoglobulin (Ig)-like domain of  the receptor protein tyrosine phosphatase (RPTP)-F; member of the I-set of IgSF domains. The members here are composed of the second immunoglobulin (Ig)-like domain found in the receptor protein tyrosine phosphatase (RPTP)-F, also known as LAR. LAR belongs to the RPTP type IIa subfamily. Members of this subfamily are cell adhesion molecule-like proteins involved in central nervous system (CNS) development. They have large extracellular portions comprised of multiple Ig-like domains and two to nine fibronectin type III (FNIII) domains and a cytoplasmic portion having two tandem phosphatase domains.	91
409401	cd05739	IgI_3_RPTP_IIa_LAR_like	Third immunoglobulin (Ig)-like domain of the receptor protein tyrosine phosphatase (RPTP)-F (also known as LAR), type IIa; member of the I-set of IgSF domains. The members here are composed of the third immunoglobulin (Ig)-like domain found in the receptor protein tyrosine phosphatase (RPTP)-F, also known as LAR. LAR belongs to the RPTP type IIa subfamily. Members of this subfamily are cell adhesion molecule-like proteins involved in central nervous system (CNS) development. They have large extracellular portions comprised of multiple Ig-like domains and two to nine fibronectin type III (FNIII) domains and a cytoplasmic portion having two tandem phosphatase domains. Included in this group is Drosophila LAR (DLAR).	82
409402	cd05740	IgI_hCEACAM_2_4_6_like	Immunoglobulin (Ig)-like domain of human carcinoembryonic antigen (CEA) related cell adhesion molecule (CEACAM) domains 2, 4, and 6, and similar domains. The members here are composed of the second, fourth, and sixth immunoglobulin (Ig)-like domains in human carcinoembryonic antigen (CEA) related cell adhesion molecule (CEACAM) protein subfamily. The CEA family is a group of anchored or secreted glycoproteins expressed by epithelial cells, leukocytes, endothelial cells, and placenta. The CEA family is divided into the CEACAM and pregnancy-specific glycoprotein (PSG) subfamilies. This group represents the CEACAM subfamily. CEACAM1 has many important cellular functions; it is a cell adhesion molecule and a signaling molecule that regulates the growth of tumor cells, an angiogenic factor, and a receptor for bacterial and viral pathogens, including mouse hepatitis virus (MHV). In mice, four isoforms of CEACAM1 generated by alternative splicing have either two [D1, D4] or four [D1-D4] Ig-like domains on the cell surface.	89
409403	cd05741	IgV_CEACAM_like	Immunoglobulin (Ig)-like domain of carcinoembryonic antigen (CEA) related cell adhesion molecule (CEACAM) and similar proteins. The members here are composed of the immunoglobulin (Ig)-like domain in carcinoembryonic antigen (CEA) related cell adhesion molecule (CEACAM) and related domains. The CEA family is a group of anchored or secreted glycoproteins, expressed by epithelial cells, leukocytes, endothelial cells and placenta. The CEA family is divided into the CEACAM and pregnancy-specific glycoprotein (PSG) subfamilies. This group represents the CEACAM subfamily. CEACAM1 has many important cellular functions: it is a cell adhesion molecule and a signaling molecule that regulates the growth of tumor cells, an angiogenic factor, and a receptor for bacterial and viral pathogens, including mouse hepatitis virus (MHV). In mice, four isoforms of CEACAM1 generated by alternative splicing have either two (D1, D4) or four (D1-D4) Ig-like domains on the cell surface. This family corresponds to the D1 Ig-like domain. Also belonging to this group is the N-terminal immunoglobulin (Ig)-like domain of the signaling lymphocyte activation molecule (SLAM) family, CD84-like family. The SLAM family is a group of immune-cell specific receptors that can regulate both adaptive and innate immune responses. SLAM family proteins are organized as an extracellular domain having two or four Ig-like domains, a single transmembrane segment, and a cytoplasmic region having Tyr-based motifs. The extracellular domain is organized as a membrane-distal Ig variable (IgV) domain that is responsible for ligand recognition and a membrane-proximal truncated Ig constant-2 (IgC2) domain.	102
409404	cd05742	IgI_VEGFR_like	Immunoglobulin (Ig)-like domain of vascular endothelial growth factor (VEGF) receptor (R) and similar proteins; member of the I-set of IgSF domains. The members here are composed of the immunoglobulin (Ig)-like domain of vascular endothelial growth factor (VEGF) receptor (R) and related proteins. The VEGFRs have an extracellular component with seven Ig-like domains, a transmembrane segment, and an intracellular tyrosine kinase domain interrupted by a kinase-insert domain. The VEGFR family consists of three members: VEGFR-1 (Flt-1), VEGFR-2 (KDR/Flk-1) and VEGFR-3 (Flt-4). VEGF-A interacts with both VEGFR-1 and VEGFR-2. VEGFR-1 binds most strongly to VEGF; VEGFR-2 binds more weakly. VEGFR-3 appears not to bind VEGF, but binds other members of the VEGF family (VEGF-C and -D). VEGFRs bind VEGFs with high affinity with the IG-like domains. VEGF-A is important to the growth and maintenance of vascular endothelial cells and to the development of new blood- and lymphatic-vessels in physiological and pathological states. VEGFR-2 is a major mediator of the mitogenic, angiogenic, and microvascular permeability-enhancing effects of VEGF-A. VEGFR-1 may play an inhibitory part in these processes by binding VEGF and interfering with its interaction with VEGFR-2. VEGFR-1 has a signaling role in mediating monocyte chemotaxis. VEGFR-1 and VEGFR-2 may mediate a chemotactic and a survival signal in hematopoietic stem cells or leukemia cells. VEGFR-3 has been shown to be involved in tumor angiogenesis and growth. This group also contains alpha-type platelet-derived growth factor receptor precursor (PDGFR)-alpha (CD140a), and PDGFR-beta (CD140b). PDGFRs alpha and beta have an extracellular component with five Ig-like domains, a transmembrane segment, and a cytoplasmic portion that has protein tyrosine kinase activity.	102
143220	cd05743	Ig_Perlecan_like	Immunoglobulin (Ig)-like domain of the human basement membrane heparan sulfate proteoglycan perlecan and similar proteins. The members here are composed of the immunoglobulin (Ig)-like domain of the human basement membrane heparan sulfate proteoglycan perlecan, also known as HSPG2, and similar proteins. Perlecan consists of five domains: domain I has three putative heparan sulfate attachment sites, domain II has four LDL receptor-like repeats, and one Ig-like repeat, domain III resembles the short arm of laminin chains, domain IV has multiple Ig-like repeats (21 repeats in human perlecan), and domain V resembles the globular G domain of the laminin A chain and internal repeats of EGF. Perlecan may participate in a variety of biological functions including cell binding, LDL-metabolism, basement membrane assembly and selective permeability, calcium binding, and growth- and neurite-promoting activities.	78
409405	cd05744	IgI_Myotilin_C_like	Immunoglobulin (Ig)-like domain of myotilin, palladin, and myopalladin; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin (Ig)-like domain in myotilin, palladin, and myopalladin.  Myotilin, palladin, and myopalladin function as scaffolds that regulate actin organization. Myotilin and myopalladin are most abundant in skeletal and cardiac muscle; palladin is ubiquitously expressed in the organs of developing vertebrates and plays a key role in cellular morphogenesis. The three family members each interact with specific molecular partners, with all three binding to alpha-actinin. In addition, palladin binds to vasodilator-stimulated phosphoprotein (VASP) and ezrin, myotilin binds to filamin and actin, and myopalladin binds to nebulin and cardiac ankyrin repeat protein (CARP). This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand.	91
143222	cd05745	Ig3_Peroxidasin	Third immunoglobulin (Ig)-like domain of peroxidasin. The members here are composed of the third immunoglobulin (Ig)-like domain in peroxidasin. Peroxidasin has a peroxidase domain and interacting extracellular motifs containing four Ig-like domains. It has been suggested that peroxidasin is secreted and has functions related to the stabilization of the extracellular matrix. It may play a part in various other important processes such as removal and destruction of cells which have undergone programmed cell death and protection of the organism against non-self.	74
143223	cd05746	Ig4_Peroxidasin	Fourth immunoglobulin (Ig)-like domain of peroxidasin. The members here are composed of the fourth immunoglobulin (Ig)-like domain in peroxidasin. Peroxidasin has a peroxidase domain and interacting extracellular motifs containing four Ig-like domains. It has been suggested that peroxidasin is secreted, and has functions related to the stabilization of the extracellular matrix. It may play a part in various other important processes such as removal and destruction of cells which have undergone programmed cell death and protection of the organism against non-self.	69
143224	cd05747	IgI_Titin_like	Immunoglobulin (Ig)-like domain of human titin C terminus and similar proteins; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the fifth immunoglobulin (Ig)-like domain from the C-terminus of human titin x and similar proteins. Titin (also called connectin) is a fibrous sarcomeric protein specifically found in vertebrate striated muscle. Titin is gigantic; depending on isoform composition it ranges from 2970 to 3700 kDa, and is of a length that spans half a sarcomere. Titin largely consists of multiple repeats of Ig-like and fibronectin type 3 (FN-III)-like domains. Titin connects the ends of myosin thick filaments to Z disks and extends along the thick filament to the H zone. It appears to function similarly to an elastic band, keeping the myosin filaments centered in the sarcomere during muscle contraction or stretching. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand.	92
409406	cd05748	Ig_Titin_like	Immunoglobulin (Ig)-like domain of titin and similar proteins. The members here are composed of the immunoglobulin (Ig)-like domain found in titin-like proteins and similar proteins. Titin (also called connectin) is a fibrous sarcomeric protein specifically found in vertebrate striated muscle. Titin is a giant protein; depending on isoform composition, it ranges from 2970 to 3700 kDa, and is of a length that spans half a sarcomere. Titin largely consists of multiple repeats of Ig-like and fibronectin type 3 (FN-III)-like domains. Titin connects the ends of myosin thick filaments to Z disks and extends along the thick filament to the H zone.  It appears to function similarly to an elastic band, keeping the myosin filaments centered in the sarcomere during muscle contraction or stretching. Within the sarcomere, titin is also attached to or is associated with myosin binding protein C (MyBP-C). MyBP-C appears to contribute to the generation of passive tension by titin and like titin has repeated Ig-like and FN-III domains. Also included in this group are worm twitchin and insect projectin, thick filament proteins of invertebrate muscle which also have repeated Ig-like and FN-III domains.	82
409407	cd05749	IgI_2_Axl_Tyro3_like	Second immunoglobulin (Ig)-like domain of Axl/Tyro3 family receptor tyrosine kinases (RTKs); member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the second immunoglobulin (Ig)-like domain in the Axl/Tyro3 family of receptor tyrosine kinases (RTKs). This family includes Axl (also known as Ark, Ufo, and Tyro7), Tyro3 (also known as Sky, Rse, Brt, Dtk, and Tif), and Mer (also known as Nyk, c-Eyk, and Tyro12). Axl/Tyro3 family receptors have an extracellular portion with two Ig-like domains followed by two fibronectin type III (FNIII) domains, a membrane-spanning single helix, and a cytoplasmic tyrosine kinase domain. Axl, Tyro3, and Mer are widely expressed in adult tissues, though they show higher expression in the brain, lymphatic and vascular systems, and testis. Axl, Tyro3, and Mer bind the vitamin K dependent protein Gas6 with high affinity, and in doing so activate their tyrosine kinase activity. Axl/Gas6 signaling may play a part in cell adhesion processes, prevention of apoptosis, and cell proliferation.	82
409408	cd05750	Ig_Pro_neuregulin	Immunoglobulin (Ig)-like domain in neuregulins. The members here are composed of the immunoglobulin (Ig)-like domain in neuregulins (NRGs). NRGs are signaling molecules which participate in cell-cell interactions in the nervous system, breast, heart, and other organ systems, and are implicated in the pathology of diseases including schizophrenia, multiple sclerosis, and breast cancer. There are four members of the neuregulin gene family (NRG-1, NRG-2, NRG-3, and NRG-4). The NRG-1 protein binds to and activates the tyrosine kinase receptors ErbB3 and ErbB4, initiating signaling cascades. The other NRG proteins bind one or the other or both of these ErbBs. NRG-1 has multiple functions: in the brain it regulates various processes such as radial glia formation and neuronal migration, dendritic development, and expression of neurotransmitter receptors, while in the peripheral nervous system NRG-1 regulates processes such as target cell differentiation and Schwann cell survival. There are many NRG-1 isoforms which arise from the alternative splicing of mRNA. Less is known of the functions of the other NRGs. NRG-2 and NRG-3 are expressed predominantly in the nervous system. NRG-2 is expressed by motor neurons and terminal Schwann cells, and is concentrated near synaptic sites and may be a signal that regulates synaptic differentiation. NRG-4 has been shown to direct pancreatic islet cell development towards the delta-cell lineage.	92
409409	cd05751	IgC2_D1_LILR_KIR_like	First immunoglobulin (Ig)-like domain found in Leukocyte Ig-like receptors (LILRs), Natural killer inhibitory receptors (KIRs) and similar domains; member of Immunoglobulin Constant-2 set of IgSF domains. The members here are composed of the first immunoglobulin (Ig)-like domain found in Leukocyte Ig-like receptors (LILRs) and Natural killer inhibitory receptors (KIRs, also known as cluster of differentiation (CD) 158), and similar proteins. This group includes LILRB1 (also known as LIR-1), LILRA5 (also known as LIR9), an activating natural cytotoxicity receptor NKp46, the immune-type receptor glycoprotein VI (GPVI), and the IgA-specific receptor Fc-alphaRI (also known as cluster of differentiation (CD) 89). LILRs are a family of immunoreceptors expressed on T and B cells, on monocytes, dendritic cells, and subgroups of natural killer (NK) cells. The human LILR family contains nine proteins (LILRA1-3, and 5, and LILRB1-5). From functional assays, and as the cytoplasmic domains of various LILRs, for example LILRB1, LILRB2 (also known as LIR-2), and LILRB3 (also known as LIR-3) contain immunoreceptor tyrosine-based inhibitory motifs (ITIMs), it is thought that LIR proteins are inhibitory receptors. Of the eight LIR family proteins, only LILRB1, and LILRB2, show detectable binding to class I MHC molecules; ligands for the other members have yet to be determined. The extracellular portions of the different LIR proteins contain different numbers of Ig-like domains for example, four in the case of LILRB1, and LILRB2, and two in the case of LILRB4 (also known as LIR-5). The activating natural cytotoxicity receptor NKp46 is expressed in natural killer cells, and is organized as an extracellular portion having two Ig-like extracellular domains, a transmembrane domain, and a small cytoplasmic portion. GPVI, which also contains two Ig-like domains, participates in the processes of collagen-mediated platelet activation and arterial thrombus formation. Fc-alphaRI is expressed on monocytes, eosinophils, neutrophils, and macrophages; it mediates IgA-induced immune effector responses such as phagocytosis, antibody-dependent cell-mediated cytotoxicity and respiratory burst. Killer cell immunoglobulin-like receptors (KIRs; also known as CD158 for human KIR) are transmembrane glycoproteins expressed by natural killer cells and subsets of T cells. KIRs are a family of highly polymorphic activating and inhibitory receptors that serve as key regulators of human NK cell function. The KIR proteins are classified by the number of extracellular immunoglobulin domains (2D or 3D) and by whether they have a long (L) or short (S) cytoplasmic domain. KIR proteins with the long cytoplasmic domain transduce inhibitory signals upon ligand binding via an immune tyrosine-based inhibitory motif (ITIM), while KIR proteins with the short cytoplasmic domain lack the ITIM motif and instead associate with the TYRO protein tyrosine kinase binding protein to transduce activating signals. The major ligands for KIR are MHC class I (HLA-A, -B or -C) molecules.	88
409410	cd05752	Ig1_FcgammaR_like	First immunoglobulin (Ig)-like domain of  Fcgamma-receptors (FcgammaRs), and similar domains. The members here are composed of the first immunoglobulin (Ig)-like domain of Fcgamma-receptors (FcgammaRs). Interactions between IgG and FcgammaR are important to the initiation of cellular and humoral response. IgG binding to FcgammaR leads to a cascade of signals and ultimately to functions such as antibody-dependent-cellular-cytotoxicity (ADCC), endocytosis, phagocytosis, release of inflammatory mediators, etc. FcgammaR has two Ig-like domains. This group also contains FcepsilonRI which binds IgE with high affinity.	79
409411	cd05753	Ig2_FcgammaR_like	Second immunoglobulin (Ig)-like domain of  Fcgamma-receptors (FcgammaRs), and similar domains. The members here are composed of the second immunoglobulin (Ig)-like domain of  Fcgamma-receptors (FcgammaRs). Interactions between IgG and FcgammaR are important to the initiation of cellular and humoral response. IgG binding to FcgammaR leads to a cascade of signals and ultimately to functions such as antibody-dependent-cellular-cytotoxicity (ADCC), endocytosis, phagocytosis, release of inflammatory mediators, etc. FcgammaR has two Ig-like domains. This group also contains FcepsilonRI which binds IgE with high affinity.	83
409412	cd05754	IgI_Perlecan_like	Immunoglobulin (Ig)-like domain found in Perlecan and similar proteins; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the third immunoglobulin (Ig)-like domain found in Perlecan. Perlecan is a large multi-domain heparan sulfate proteoglycan, important in tissue development and organogenesis.  Perlecan can be represented as 5 major portions; its fourth major portion (domain IV) is a tandem repeat of immunoglobulin-like domains (Ig2-Ig15) which can vary in size due to alternative splicing. Perlecan binds many cellular and extracellular ligands. Its domain IV region has many binding sites.  Some of these have been mapped at the level of individual Ig-like domains, including a site restricted to the Ig5 domain for heparin/sulfatide, a site restricted to the Ig3 domain for nidogen-1 and nidogen-2, a site restricted to Ig4-5 for fibronectin, and sites restricted to Ig2 and to Ig13-15 for fibulin-2. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand.	85
409413	cd05755	IgC2_2_ICAM-1_like	Second immunoglobulin (Ig)-like C2-set domain of intercellular cell adhesion molecule 1 (ICAM-1), and similar domains. The members here are composed of the second immunoglobulin (Ig)-like domain of intercellular cell adhesion molecule 1 (ICAM-1; also known as domain of cluster of differentiation (CD) 54) and similar proteins. During the inflammation process, these molecules recruit leukocytes onto the vascular endothelium before extravasation to the injured tissues. ICAM-1 may be involved in organ targeted tumor metastasis. The interaction of ICAM-1 with leukocyte function-associated antigen-1 (LFA-1) plays a part in leukocyte-endothelial cell recognition. This group also contains ICAM-2 which also interacts with LFA-1. Transmigration of immature dendritic cells across resting endothelium is dependent on the interaction of ICAM-2 with, yet unidentified, ligand(s) on the dendritic cells. ICAM-1 has five Ig-like domains and ICAM-2 has two. ICAM-1 may also act as host receptor for viruses and parasites. The structures of this group show that the second Ig domain lacks a D strand and thus belongs to the C2-set of the IgSF.	101
409414	cd05756	Ig1_IL1R_like	First immunoglobulin (Ig)-like domain of interleukin-1 receptor (IL1R), and similar domains. The members here are composed of the first immunoglobulin (Ig)-like domain of interleukin-1 receptor (IL1R; also known as cluster of differentiation (CD) 121). IL-1 alpha and IL-1 beta are cytokines which participate in the regulation of inflammation, immune responses, and hematopoiesis. These cytokines bind to the IL-1 receptor type 1 (IL1R1), which is activated on additional association with interleukin-1 receptor accessory protein (IL1RAP). IL-1 also binds a second receptor designated type II (IL1R2). Mature IL1R1 consists of three Ig-like domains, a transmembrane domain, and a large cytoplasmic domain. Mature IL1R2 is organized similarly except that it has a short cytoplasmic domain. The latter does not initiate signal transduction. A naturally occurring cytokine IL-1RA (IL-1 receptor antagonist) is widely expressed and binds to IL-1 receptors, inhibiting the binding of IL-1 alpha and IL-1 beta.	96
409415	cd05757	Ig2_IL1R-like	Second immunoglobulin (Ig)-like domain of interleukin-1 receptor (IL1R), and similar domains. The members here are composed of the second immunoglobulin (Ig)-like domain of interleukin-1 receptor (IL1R; also known as cluster of differentiation (CD) 121). IL-1 alpha and IL-1 beta are cytokines which participate in the regulation of inflammation, immune responses, and hematopoiesis. These cytokines bind to the IL-1 receptor type 1 (IL1R1), which is activated on additional association with interleukin-1 receptor accessory protein (IL1RAP).  IL-1 also binds a second receptor designated type II (IL1R2). Mature IL1R1 consists of three Ig-like domains, a transmembrane domain, and a large cytoplasmic domain. Mature IL1R2 is organized similarly except that it has a short cytoplasmic domain. The latter does not initiate signal transduction. A naturally occurring cytokine IL-1RA (IL-1 receptor antagonist) is widely expressed and binds to IL-1 receptors, inhibiting the binding of IL-1 alpha and IL-1 beta. This group also contains IL1R-like 1 (IL1R1L) which maps to the same chromosomal location as IL1R1 and IL1R2.	92
319310	cd05758	IgI_5_KIRREL3-like	Fifth immunoglobulin (Ig)-like domain of Kirrel (kin of irregular chiasm-like) 3, and similar domains; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the fifth immunoglobulin (Ig)-like domain of Kirrel (kin of irregular chiasm-like) 3 (also known as Neph2). This protein has five Ig-like domains, one transmembrane domain, and a cytoplasmic tail. Included in this group is mammalian Kirrel (also known as Neph1), Kirrel2 (also known as Neph3), and Drosophila RST (also known as irregular chiasm C-roughest) protein. These proteins contain multiple Ig domains, have properties of cell adhesion molecules, and are important in organ development.	98
409416	cd05759	IgI_2_KIRREL3-like	Second immunoglobulin (Ig)-like domain of Kirrel (kin of irregular chiasm-like) 3, and similar domains; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the second immunoglobulin (Ig)-like domain of Kirrel (kin of irregular chiasm-like) 3 (also known as Neph2). This protein has five Ig-like domains, one transmembrane domain, and a cytoplasmic tail. Included in this group is mammalian Kirrel (Neph1), Kirrel2 (Neph3), and Drosophila RST (irregular chiasm C-roughest) protein. These proteins contain multiple Ig domains, have properties of cell adhesion molecules, and are important in organ development.	98
409417	cd05760	Ig2_PTK7	Second immunoglobulin (Ig)-like domain of protein tyrosine kinase (PTK) 7. The members here are composed of the second immunoglobulin (Ig)-like domain in protein tyrosine kinase (PTK) 7, also known as CCK4. PTK7 is a subfamily of the receptor protein tyrosine kinase family, and is referred to as an RPTK-like molecule. RPTKs transduce extracellular signals across the cell membrane and play important roles in regulating cell proliferation, migration, and differentiation. PTK7 is organized as an extracellular portion having seven Ig-like domains, a single transmembrane region, and a cytoplasmic tyrosine kinase-like domain. PTK7 is considered a pseudokinase as it has several unusual residues in some of the highly conserved tyrosine kinase (TK) motifs; it is predicted to lack TK activity. PTK7 may function as a cell-adhesion molecule. PTK7 mRNA is expressed at high levels in placenta, melanocytes, liver, lung, pancreas, and kidney. PTK7 is overexpressed in several cancers, including melanoma and colon cancer lines.	95
409418	cd05761	IgI_2_Necl-1-4	Second immunoglobulin (Ig)-like domain of the nectin-like molecules Necl-1 - Necl-4; member of the I-set of Ig superfamily domains. The members here are composed of the second immunoglobulin (Ig)-like domain of the nectin-like molecules Necl-1 (also known as cell adhesion molecule 3 or CADM3), Necl-2 (also known as CADM1), Necl-3 (also known as CADM2) and Necl-4 (also known as CADM4). These nectin-like molecules have similar domain structures to those of nectins. At least five nectin-like molecules have been identified (Necl-1 through Necl-5). These have an extracellular region containing three Ig-like domains, one transmembrane region, and one cytoplasmic region. The N-terminal Ig-like domain of the extracellular region belongs to the V-type subfamily of Ig domains, is essential to cell-cell adhesion, and plays a part in the interaction with the envelope glycoprotein D of various viruses. Necl-1 and Necl-2 have Ca(2+)-independent homophilic and heterophilic cell-cell adhesion activity. Necl-1 is specifically expressed in neural tissue and is important to the formation of synapses, axon bundles, and myelinated axons. Necl-2 is expressed in a wide variety of tissues, and is a putative tumour suppressor gene, which is downregulated in aggressive neuroblastoma. Necl-3 has been shown to accumulate in tissues of the central and peripheral nervous system, where it is expressed in ependymal cells and myelinated axons.  It is observed at the interface between the axon shaft and the myelin sheath. Necl-4 is expressed on Schwann cells, and plays a key part in initiating peripheral nervous system (PNS) myelination. Necl-4 participates in cell-cell adhesion and is proposed to play a role in tumor suppression.	102
409419	cd05762	IgI_8_hMLCK_like	Eighth immunoglobulin (Ig)-like domain of human myosin light-chain kinase (MLCK) and similar protein; member of the I-set of IgSF domains. The members here are composed of the eighth immunoglobulin (Ig)-like domain of human myosin light-chain kinase (MLCK) and similar proteins. Myosin light-chain kinase (MLCK) is a key regulator of different forms of cell motility involving actin and myosin II.  Agonist stimulation of smooth muscle cells increases cytosolic Ca2+ which binds calmodulin.  This Ca2+-calmodulin complex in turn binds to and activates MLCK. Activated MLCK leads to the phosphorylation of the 20 kDa myosin regulatory light chain (RLC) of myosin II and the stimulation of actin-activated myosin MgATPase activity. MLCK is widely present in vertebrate tissues; it phosphorylates the 20 kDa RLC of both smooth and nonmuscle myosin II. Phosphorylation leads to the activation of the myosin motor domain and altered structural properties of myosin II. In smooth muscle, MLCK is involved in initiating contraction. In nonmuscle cells, MLCK may participate in cell division and cell motility; it has been suggested MLCK plays a role in cardiomyocyte differentiation and contraction through regulation of nonmuscle myosin II.	99
409420	cd05763	IgI_LRIG1-like	Immunoglobulin (Ig)-like ectodomain of the LRIG1 (Leucine-rich Repeats And Immunoglobulin-like Domains Protein 1) and similar proteins; member of the I-set of IgSF domains. The members here are composed of subgroup of the immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond. The ectodomain of LRIG1 has two distinct regions: the proposed 15 LRRs and three Ig-like domains closer to the membrane. LRIG1 has been reported to interact with many receptor tyrosine kinases, GDNF/c-Ret, E-cadherin, JAK/STAT, c-Met, and the EGFR family signaling systems. Immunoglobulin Superfamily (IgSF) domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. The structure of the LRIG1 extracellular Ig domain lacks a C" strand and thus is better described as a member of the I-set of IgSF domains.	91
409421	cd05764	IgI_SALM5_like	Immunoglobulin domain of human Synaptic Adhesion-Like Molecule 5 (SALM5) and similar proteins; member of the I-set of IgSF domains. This group contains the immunoglobulin domain of human Synaptic Adhesion-Like Molecule 5 (SALM5) and similar proteins. The SALM (for synaptic adhesion-like molecules; also known as Lrfn for leucine-rich repeat and fibronectin type III domain containing) family of adhesion molecules consists of five known members: SALM1/Lrfn2, SALM2/Lrfn1, SALM3/Lrfn4, SALM4/Lrfn3, and SALM5/Lrfn5. SALMs share a similar domain structure, containing leucine-rich repeats (LRRs), an immunoglobulin (Ig) domain, and a fibronectin III (FNIII) domain, followed by a transmembrane domain and a C-terminal PDZ-binding motif. SALM5 is implicated in autism spectrum disorders (ASDs) and schizophrenia, and induces presynaptic differentiation in contacting axons. SALM5 interacts with the Ig domains of LAR (Leukocyte common Antigen-Related) family receptor protein tyrosine phosphatases (LAR-RPTPs; LAR, PTPdelta, and PTPsigma). In addition, PTPdelta is implicated in ASDs, ADHD, bipolar disorder, and restless leg syndrome. Studies have shown that LAR-RPTPs are novel and splicing-dependent presynaptic ligands for SALM5, and that they mediate SALM5-dependent presynaptic differentiation. Furthermore, SALM5 maintains AMPA receptor (AMPAR)-mediated excitatory synaptic transmission through mechanisms involving the interaction of SALM5 with LAR-RPTPs. This group belongs to the I-set of immunoglobulin superfamily (IgSF) domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand but lack a C" strand.	88
409422	cd05765	IgI_3_WFIKKN-like	Third immunoglobulin-like domain of the human WFIKKN (WAP, follistatin, immunoglobulin, Kunitz and NTR domain-containing protein), and similar domains; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the third immunoglobulin-like domain of the human WFIKKN (WAP, follistatin, immunoglobulin, Kunitz and NTR domain-containing protein) and similar proteins. WFIKKN is a secreted protein that consists of multiple types of protease inhibitory modules, including two tandem Kunitz-type protease inhibitor-domains. The Ig superfamily is a heterogenous group of proteins built on a common fold comprised of a sandwich of two beta sheets. Members of the Ig superfamily are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand.	95
409423	cd05766	IgC1_MHC_II_beta	Class II major histocompatibility complex (MHC) beta chain immunoglobulin domain; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin (Ig) domain of major histocompatibility complex (MHC) class II beta chain.  MHC class II molecules play a key role in the initiation of the antigen-specific immune response. These molecules have been shown to be expressed constitutively on the cell surface of professional antigen-presenting cells (APCs), including B-lymphocytes, monocytes, and macrophages in both humans and mice. The expression of these molecules has been shown to be induced in nonprofessional APCs such as keratinocytes and they are also expressed on the surface of activated human T cells and on T cells from other species. The MHC II molecules present antigenic peptides to CD4(+) T-lymphocytes. These peptides derive mostly from proteolytic processing via the endocytic pathway of antigens internalized by the APC. These peptides bind to the MHC class II molecules in the endosome before they are transported to the cell surface. MHC class II molecules are heterodimers, comprised of two similarly-sized membrane-spanning chains, alpha and beta. Each chain has two globular domains (N- and C-terminal) and a membrane-anchoring transmembrane segment. The two chains form a compact four-domain structure. The peptide-binding site is a cleft in the structure.	96
409424	cd05767	IgC1_MHC_II_alpha	Class II major histocompatibility complex (MHC) alpha chain immunoglobulin domain; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin (Ig) domain of the major histocompatibility complex (MHC) class II alpha chain.  MHC class II molecules play a key role in the initiation of the antigen-specific immune response. These molecules have been shown to be expressed constitutively on the cell surface of professional antigen-presenting cells (APCs), including B-lymphocytes, monocytes, and macrophages in both humans and mice. The expression of these molecules has been shown to be induced in nonprofessional APCs such as keratinocytes, and they are also expressed on the surface of activated human T cells and on T cells from other species. The MHC II molecules present antigenic peptides to CD4(+) T-lymphocytes. These peptides derive mostly from proteolytic processing via the endocytic pathway of antigens internalized by the APC. These peptides bind to the MHC class II molecules in the endosome before they are transported to the cell surface. MHC class II molecules are heterodimers, comprised of two similarly-sized membrane-spanning chains, alpha and beta. Each chain has two globular domains (N- and C-terminal), and a membrane-anchoring transmembrane segment. The two chains form a compact four-domain structure. The peptide-binding site is a cleft in the structure.	95
409425	cd05768	IgC1_CH3_IgAGD_CH4_IgAEM	CH3 domain (third constant Ig domain of the heavy chain) in immunoglobulin heavy alpha, gamma, and delta chains, and CH4 domain (fourth constant Ig domain of the heavy chain) in immunoglobulin heavy alpha, epsilon, and mu chains; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the third and fourth immunoglobulin constant domain (IgC) of alpha, delta, gamma and alpha, epsilon, and mu heavy chains, respectively. This domain is found on the Fc fragment. The basic structure of Ig molecules is a tetramer of two light chains and two heavy chains linked by disulfide bonds. There are two types of light chains: kappa and lambda; each is composed of a constant domain and a variable domain. There are five types of heavy chains: alpha, delta, epsilon, gamma, and mu, all consisting of a variable domain (VH) with three (alpha, delta and gamma) or four (epsilon and mu) constant domains (CH1 to CH4). Ig molecules are modular proteins, in which the variable and constant domains have clear, conserved sequence patterns.	105
409426	cd05769	IgC1_TCR_beta	T cell receptor (TCR) beta chain constant immunoglobulin domain; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the T cell receptor (TCR) beta chain constant immunoglobulin domain. TCRs mediate antigen recognition by T lymphocytes, and are composed of alpha and beta, or gamma and delta, polypeptide chains with variable (V) and constant (C) regions. This group includes the variable domain of the beta chain. Alpha/beta TCRs recognize antigen as peptide fragments presented by major histocompatibility complex (MHC) molecules. The antigen binding site is formed by the variable domains of the alpha and beta chains, located at the N-terminus of each chain. Alpha/beta TCRs recognize antigens differently from gamma/delta TCRs.	116
409427	cd05770	IgC1_beta2m	Class I major histocompatibility complex (MHC) beta-2-microglobulin; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin-like domain in beta-2-microglobulin (beta2m). Beta2m is the non-covalently bound light chain of the human class I major histocompatibility complex (MHC-I). Beta2m is structured as a beta-sandwich domain composed of two facing beta-sheets (four stranded and three stranded), that is typical of the C-type immunoglobulin superfamily. This structure is stabilized by an intramolecular disulfide bridge connecting two Cys residues in the facing beta-sheets. In vivo, MHC-I continuously exposes beta2m on the cell surface, where it may be released to plasmatic fluids, transported to the kidneys, degraded, and finally excreted.	94
409428	cd05771	IgC1_Tapasin_R	Tapasin-R immunoglobulin-like domain; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin-like domain on Tapasin-R. Tapasin is a V-C1 (variable-constant) immunoglobulin superfamily molecule present in the endoplasmic reticulum (ER), where it links MHC class I molecules to the transporter associated with antigen processing (TAP). Tapasin-R is a tapasin-related protein that contains similar structural motifs to Tapasin, with some marked differences, especially in the V domain, transmembrane and cytoplasmic regions. The majority of Tapasin-R is located within the ER; however, there may be some expression of Tapasin-R at the cell surface. Tapasin-R lacks an obvious ER retention signal.	100
409429	cd05772	IgC1_SIRP_domain_2	Signal-regulatory protein (SIRP) immunoglobulin-like domain 2; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin (Ig)-like domain in Signal-Regulatory Protein (SIRP), domain 2 (C1 repeat 1). The SIRPs belong to the "paired receptors" class of membrane proteins that comprise several genes coding for proteins with similar extracellular regions, but very different transmembrane/cytoplasmic regions with different (activating or inhibitory) signaling potentials. They are commonly on NK cells, but are also on many myeloid cells. Their extracellular region contains three Immunoglobulin superfamily domains, a single V-set and two C1-set IgSF domains. Their cytoplasmic tails contain either ITIMs or transmembrane regions that have positively charged residues that allow an association with adaptor proteins, such as DAP12/KARAP, containing ITAMs. There are 3 distinct SIRP members: alpha, beta, and gamma.  SIRP alpha (also known as CD172a or SRC homology 2 domain-containing protein tyrosine phosphatase substrate 1/Shps-1) is a membrane receptor that interacts with a ligand CD47 expressed on many cells and gives an inhibitory signal through immunoreceptor tyrosine-based inhibition motifs in the cytoplasmic region that interact with phosphatases SHP-1 and SHP-2. SIRP beta has a short cytoplasmic region and associates with a transmembrane adapter protein DAP12 containing immunoreceptor tyrosine-based activation motifs to give an activating signal. SIRP gamma contains a very short cytoplasmic region lacking obvious signaling motifs, but also binds CD47, but with much less affinity.	102
143250	cd05773	IgC1_hNephrin_like	Immunoglobulin-like domain of human nephrin and similar proteins; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin-like domain in human nephrin and similar proteins. Nephrin is an integral component of the slit diaphragm and is a central component of the glomerular ultrafilter. Nephrin plays a structural role and has a role in signaling. Nephrin is a transmembrane protein having a short intracellular portion, an extracellular portion comprised of eight Ig-like domains, and one fibronectin type III-like domain. The extracellular portions of nephrin from neighboring foot processes of separate podocyte cells may interact with each other, and in association with other components of the slit diaphragm form a porous molecular sieve within the slit pore.  The intracellular portion of nephrin is associated with linker proteins, which connect nephrin to the actin cytoskeleton. The intracellular portion is tyrosine phosphorylated, and mediates signaling from the slit diaphragm into the podocytes.	109
409430	cd05774	IgV_CEACAM_D1	First immunoglobulin (Ig)-like domain of carcinoembryonic antigen (CEA) related cell adhesion molecule (CEACAM). The members here are composed of the immunoglobulin (Ig)-like domain 1 in carcinoembryonic antigen (CEA) related cell adhesion molecule (CEACAM) proteins. The CEA family is a group of anchored or secreted glycoproteins, expressed by epithelial cells, leukocytes, endothelial cells, and placenta. The CEA family is divided into the CEACAM and pregnancy-specific glycoprotein (PSG) subfamilies. This group represents the CEACAM subfamily. CEACAM1 has many important cellular functions: it is a cell adhesion molecule and a signaling molecule that regulates the growth of tumor cells, an angiogenic factor, and a receptor for bacterial and viral pathogens, including mouse hepatitis virus (MHV). In mice, four isoforms of CEACAM1 generated by alternative splicing have either two (D1, D4) or four (D1-D4) Ig-like domains on the cell surface. 	105
409431	cd05775	IgV_CD2_like_N	N-terminal immunoglobulin (Ig)-like domain of T-cell surface antigen CD2, and similar domains. The members here are composed of the N-terminal immunoglobulin (Ig)-like domain (or domain 1) of T-cell surface antigen Clusters of Differentiation (CD) 2 and similar proteins. CD2 is a T-cell specific surface glycoprotein and is critically important for mediating adhesion between T cells and antigen-presenting cells or between cytolytic T cells and target cells. CD2 is located on chromosome 1 at 1p13 in humans and on chromosome 3 in mice. CD2 contains an extracellular domain with two or Ig-like domains, a single transmembrane segment, and a cytoplasmic region rich in proline and basic residues.	98
99819	cd05776	DNA_polB_alpha_exo	inactive DEDDy 3'-5' exonuclease domain of eukaryotic DNA polymerase alpha, a family-B DNA polymerase. The 3'-5' exonuclease domain of eukaryotic DNA polymerase alpha. DNA polymerase alpha is a family-B DNA polymerase with a catalytic subunit that contains a DnaQ-like 3'-5' exonuclease domain. It is one of the three DNA-dependent type B DNA polymerases (delta and epsilon are the other two) that have been identified as essential for nuclear DNA replication in eukaryotes. DNA polymerase alpha is almost exclusively required for the initiation of DNA replication and the priming of Okazaki fragments during elongation. It associates with DNA primase and is the only enzyme able to start DNA synthesis de novo. The catalytic subunit contains both polymerase and 3'-5' exonuclease domains, but only exhibits polymerase activity. The 3'-5' exonuclease domain contains three sequence motifs termed ExoI, ExoII and ExoIII, without the four conserved acidic residues that are crucial for metal binding and catalysis. This explains why, in most organisms, no specific repair role, other than checkpoint control, has been assigned to this enzyme. The exonuclease domain may have a structural role.	234
99820	cd05777	DNA_polB_delta_exo	DEDDy 3'-5' exonuclease domain of eukaryotic DNA polymerase delta, a family-B DNA polymerase. The 3'-5' exonuclease domain of eukaryotic DNA polymerase delta. DNA polymerase delta is a family-B DNA polymerase with a catalytic subunit that contains a DEDDy-type DnaQ-like 3'-5' exonuclease domain. It is one of the three DNA-dependent type B DNA polymerases (alpha and epsilon are the other two) that have been identified as essential for nuclear DNA replication in eukaryotes. DNA polymerase delta is the enzyme responsible for both elongation and maturation of Okazaki fragments on the lagging strand. It is also implicated in mismatch repair (MMR) and base excision repair (BER). The catalytic subunit displays both polymerase and 3'-5' exonuclease activities. The exonuclease domain contains three sequence motifs termed ExoI, ExoII and ExoIII, with a specific YX(3)D pattern at ExoIII. These motifs are clustered around the active site and contain four conserved acidic residues necessary for metal binding and catalysis. The exonuclease domain of family B polymerase also contains a beta hairpin structure that plays an important role in active site switching in the event of nucleotide misincorporation.	230
99821	cd05778	DNA_polB_zeta_exo	inactive DEDDy 3'-5' exonuclease domain of eukaryotic DNA polymerase zeta, a family-B DNA polymerase. The 3'-5' exonuclease domain of eukaryotic DNA polymerase zeta. DNA polymerase zeta is a family-B DNA polymerase which is distantly related to DNA polymerase delta. It plays a major role in translesion replication and the production of either spontaneous or induced mutations. In addition, DNA polymerase zeta also appears to be involved in somatic hypermutability in B lymphocytes, an important element for the production of high affinity antibodies in response to an antigen. The catalytic subunit contains both polymerase and 3'-5' exonuclease domains, but only exhibits polymerase activity. The DnaQ-like 3'-5' exonuclease domain contains three sequence motifs termed ExoI, ExoII and ExoIII, without the four conserved acidic residues that are crucial for metal binding and catalysis.	231
99822	cd05779	DNA_polB_epsilon_exo	DEDDy 3'-5' exonuclease domain of eukaryotic DNA polymerase epsilon, a family-B DNA polymerase. The 3'-5' exonuclease domain of eukaryotic DNA polymerase epsilon. DNA polymerase epsilon is a family-B DNA polymerase with a catalytic subunit that contains a DEDDy-type DnaQ-like 3'-5' exonuclease domain. It is one of the three DNA-dependent type B DNA polymerases (alpha and delta are the other two) that have been identified as essential for nuclear DNA replication in eukaryotes. DNA polymerase epsilon plays a role in elongating the leading strand during DNA replication. It is also involved in DNA repair. The catalytic subunit contains both polymerase and 3'-5' exonuclease activities. The N-terminal exonuclease domain contains three sequence motifs termed ExoI, ExoII and ExoIII, with a specific YX(3)D pattern at ExoIII. These motifs are clustered around the active site and are involved in metal binding and catalysis. DNA polymerase epsilon also carries a unique large C-terminal domain with an unknown function. Phylogenetic analyses indicate that it is orthologous to the archaeal DNA polymerase B3 rather than to the eukaryotic alpha, delta, or zeta polymerases. The exonuclease domain of family-B polymerases contains a beta hairpin structure that plays an important role in active site switching in the event of nucleotide misincorporation.	204
99823	cd05780	DNA_polB_Kod1_like_exo	DEDDy 3'-5' exonuclease domain of Pyrococcus kodakaraensis Kod1 and similar archaeal family-B DNA polymerases. The 3'-5' exonuclease domain of archaeal family-B DNA polymerases with similarity to Pyrococcus kodakaraensis Kod1, including polymerases from Desulfurococcus (D. Tok Pol) and Thermococcus gorgonarius (Tgo Pol). Kod1, D. Tok Pol, and Tgo Pol are thermostable enzymes that exhibit both polymerase and 3'-5' exonuclease activities. They are family-B DNA polymerases. Their amino termini harbor a DEDDy-type DnaQ-like 3'-5' exonuclease domain that contains three sequence motifs termed ExoI, ExoII and ExoIII, with a specific YX(3)D pattern at ExoIII. These motifs are clustered around the active site and are involved in metal binding and catalysis. The exonuclease domain of family B polymerases contains a beta hairpin structure that plays an important role in active site switching in the event of nucleotide misincorporation. Members of this subfamily show similarity to eukaryotic DNA polymerases involved in DNA replication. Some archaea possess multiple family-B DNA polymerases. Phylogenetic analyses of eubacterial, archaeal, and eukaryotic family-B DNA polymerases support independent gene duplications during the evolution of archaeal and eukaryotic family-B DNA polymerases.	195
99824	cd05781	DNA_polB_B3_exo	DEDDy 3'-5' exonuclease domain of Sulfurisphaera ohwakuensis DNA polymerase B3 and similar archaeal family-B DNA polymerases. The 3'-5' exonuclease domain of archaeal proteins with similarity to Sulfurisphaera ohwakuensis DNA polymerase B3. B3 is a family-B DNA polymerase. Family-B DNA polymerases contain an N-terminal DEDDy DnaQ-like exonuclease domain in the same polypeptide chain as the polymerase domain, similar to family-A DNA polymerases. B3 exhibits both polymerase and 3'-5' exonuclease activities. This exonuclease domain contains three sequence motifs termed ExoI, ExoII and ExoIII, with a specific YX(3)D pattern at ExoIII. These motifs are clustered around the active site and are involved in metal binding and catalysis. The exonuclease domain of family B polymerases also contains a beta hairpin structure that plays an important role in active site switching in the event of nucleotide misincorporation. Archaeal proteins that are involved in DNA replication are similar to those from eukaryotes. Some archaea possess multiple family-B DNA polymerases. B3 is mainly found in crenarchaea. Phylogenetic analyses of eubacterial, archaeal, and eukaryotic family B-DNA polymerases support independent gene duplications during the evolution of archaeal and eukaryotic family-B DNA polymerases.	188
99825	cd05782	DNA_polB_like1_exo	Uncharacterized bacterial subgroup of the DEDDy 3'-5' exonuclease domain of family-B DNA polymerases. A subfamily of the 3'-5' exonuclease domain of family-B DNA polymerases. This subfamily is composed of uncharacterized bacterial family-B DNA polymerases. Family-B DNA polymerases contain an N-terminal DEDDy DnaQ-like exonuclease domain in the same polypeptide chain as the polymerase domain, similar to family-A DNA polymerases. This exonuclease domain contains three sequence motifs termed ExoI, ExoII and ExoIII, with a specific YX(3)D pattern at ExoIII. These motifs are involved in metal binding and catalysis. The exonuclease domain of family-B DNA polymerases has a fundamental role in proofreading activity. It contains a beta hairpin structure that plays an important role in active site switching in the event of a nucleotide misincorporation. Family-B DNA polymerases are predominantly involved in DNA replication and DNA repair.	208
99826	cd05783	DNA_polB_B1_exo	DEDDy 3'-5' exonuclease domain of Sulfolobus solfataricus DNA polymerase B1 and similar archaeal family-B DNA polymerases. The 3'-5' exonuclease domain of Sulfolobus solfataricus DNA polymerase B1 and similar archaeal proteins. B1 is a family-B DNA polymerase. Family-B DNA polymerases contain an N-terminal DEDDy DnaQ-like exonuclease domain in the same polypeptide chain as the polymerase domain, similar to family-A DNA polymerases. B1 displays thermostable polymerase and 3'-5' exonuclease activities. This exonuclease domain contains three sequence motifs termed ExoI, ExoII and ExoIII, with a specific YX(3)D pattern at ExoIII. These motifs are clustered around the active site and are involved in metal binding and catalysis. The exonuclease domain of family-B polymerases also contains a beta hairpin structure that plays an important role in active site switching in the event of nucleotide misincorporation. Family-B DNA polymerases from thermophilic archaea are unique in that they are able to recognize the presence of uracil in the template strand, leading to the stalling of DNA synthesis. This is an additional safeguard mechanism against increased levels of deaminated bases during genome duplication at high temperatures. S. solfataricus B1 also interacts with DNA polymerase Y and may contribute to genome stability mechanisms.	204
99827	cd05784	DNA_polB_II_exo	DEDDy 3'-5' exonuclease domain of Escherichia coli DNA polymerase II and similar bacterial family-B DNA polymerases. The 3'-5' exonuclease domain of Escherichia coli DNA polymerase II (Pol II) and similar bacterial proteins. Pol II is a family-B DNA polymerase. Family-B DNA polymerases contain an N-terminal DEDDy DnaQ-like exonuclease domain in the same polypeptide chain as the polymerase domain, similar to family-A DNA polymerases. This exonuclease domain contains three sequence motifs termed ExoI, ExoII and ExoIII, with a specific YX(3)D pattern at ExoIII. These motifs are clustered around the active site and are involved in metal binding and catalysis. The exonuclease domain has a fundamental role in the proofreading activity of polII. It contains a beta hairpin structure that plays an important role in active site switching in the event of a nucleotide misincorporation. Pol II is involved in a variety of cellular activities, such as the repair of DNA damaged by UV irradiation or oxidation. It plays a pivotal role in replication-restart, a process that bypasses DNA damage in an error-free manner. Pol II is also involved in lagging strand synthesis.	193
99828	cd05785	DNA_polB_like2_exo	Uncharacterized bacterial subgroup of the DEDDy 3'-5' exonuclease domain of family-B DNA polymerases. A subfamily of the 3'-5' exonuclease domain of family-B DNA polymerases. This subfamily is composed of uncharacterized bacterial family-B DNA polymerases. Family-B DNA polymerases contain an N-terminal DEDDy DnaQ-like exonuclease domain in the same polypeptide chain as the polymerase domain, similar to family-A DNA polymerases. This exonuclease domain contains three sequence motifs termed ExoI, ExoII and ExoIII, with a specific YX(3)D pattern at ExoIII. These motifs are involved in metal binding and catalysis. The exonuclease domain of family-B DNA polymerases has a fundamental role in proofreading activity. It contains a beta hairpin structure that plays an important role in active site switching in the event of a nucleotide misincorporation. Family-B DNA polymerases are predominantly involved in DNA replication and DNA repair.	207
100061	cd05787	LbH_eIF2B_epsilon	eIF-2B epsilon subunit, central Left-handed parallel beta-Helix (LbH) domain: eIF-2B is a eukaryotic translation initiator, a guanine nucleotide exchange factor (GEF) composed of five different subunits (alpha, beta, gamma, delta and epsilon). eIF2B is important for regenerating GTP-bound eIF2 during the initiation process. This event is obligatory for eIF2 to bind initiator methionyl-tRNA, forming the ternary initiation complex. The eIF-2B epsilon subunit contains an N-terminal domain that resembles a dinucleotide-binding Rossmann fold, a central LbH domain containing 4 turns, each containing three imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X), and a C-terminal domain of unknown function that is present in eIF-4 gamma, eIF-5, and eIF-2B epsilon. The epsilon and gamma subunits form the catalytic subcomplex of eIF-2B, which binds eIF2 and catalyzes guanine nucleotide exchange.	79
240215	cd05789	S1_Rrp4	S1_Rrp4: Rrp4 S1-like RNA-binding domain. S1-like RNA-binding domains are found in a wide variety of RNA-associated proteins. Rrp4 protein is a subunit of the exosome complex. The exosome plays a central role in 3' to 5' RNA processing and degradation in eukaryotes and archaea. Its functions include the removal of incorrectly processed RNA and the maintenance of proper levels of mRNA, rRNA and a number of small RNA species. In Saccharomyces cerevisiae, the exosome includes nine core components, six of which are homologous to bacterial RNase PH. These form a hexameric ring structure. The other three subunits (Rrp4, Rrp40, and Csl4) contain an S1 RNA binding domain and are part of the "S1 pore structure".	86
240216	cd05790	S1_Rrp40	S1_Rrp40: Rrp40 S1-like RNA-binding domain. S1-like RNA-binding domains are found in a wide variety of RNA-associated proteins. Rrp40 protein is a subunit of the exosome complex. The exosome plays a central role in 3' to 5' RNA processing and degradation in eukaryotes and archaea. Its functions include the removal of incorrectly processed RNA and the maintenance of proper levels of mRNA, rRNA and a number of small RNA species. In Saccharomyces cerevisiae, the exosome includes nine core components, six of which are homologous to bacterial RNase PH. These form a hexameric ring structure. The other three subunits (Rrp4, Rrp40, and Csl4) contain an S1 RNA binding domain and are part of the "S1 pore structure".	86
240217	cd05791	S1_CSL4	S1_CSL4: CSL4, S1-like RNA-binding domain. S1-like RNA-binding domains are found in a wide variety of RNA-associated proteins. ScCSL4 protein is a subunit of the exosome complex. The exosome plays a central role in 3' to 5' RNA processing and degradation in eukaryotes and archaea. Its functions include the removal of incorrectly processed RNA and the maintenance of proper levels of mRNA, rRNA and a number of small RNA species. In S. cerevisiae, the exosome includes nine core components, six of which are homologous to bacterial RNase PH. These form a hexameric ring structure. The other three subunits (Rrp4, Rrp40, and Csl4) contain an S1 RNA binding domain and are part of the "S1 pore structure".	92
240218	cd05792	S1_eIF1AD_like	S1_eIF1AD_like: eukaryotic translation initiation factor 1A domain containing protein (eIF1AD)-like, S1-like RNA-binding domain. eIF1AD is also known as MGC11102 protein. Little is known about the function of eIF1AD. S1-like RNA-binding domains are found in a wide variety of RNA-associated proteins, including translation initiation factor IF1A (also referred to as eIF1A in eukaryotes). eIF1A is essential for translation initiation. eIF1A acts synergistically with eIF1 to mediate assembly of ribosomal initiation complexes at the initiation codon and maintain the accuracy of this process by recognizing and destabilizing aberrant preinitiation complexes from the mRNA. Without eIF1A and eIF1, 43S ribosomal preinitiation complexes can bind to the cap-proximal region, but are unable to reach the initiation codon. eIF1a also enhances the formation of 5'-terminal complexes in the presence of other translation initiation factors.	78
240219	cd05793	S1_IF1A	S1_IF1A: Translation initiation factor IF1A, also referred to as eIF1A in eukaryotes and aIF1A in archaea, S1-like RNA-binding domain. S1-like RNA-binding domains are found in a wide variety of RNA-associated proteins. IF1A is essential for translation initiation. eIF1A acts synergistically with eIF1 to mediate assembly of ribosomal initiation complexes at the initiation codon and maintain the accuracy of this process by recognizing and destabilizing aberrant preinitiation complexes from the mRNA. Without eIF1A and eIF1, 43S ribosomal preinitiation complexes can bind to the cap-proximal region, but are unable to reach the initiation codon. eIF1a also enhances the formation of 5'-terminal complexes in the presence of other translation initiation factors. This protein family is only found in eukaryotes and archaea.	77
240220	cd05794	S1_EF-P_repeat_2	S1_EF-P_repeat_2: Translation elongation factor P (EF-P), S1-like RNA-binding domain, repeat 1. EF-P stimulates the peptidyltransferase activity in the prokaryotic 70S ribosome. EF-P enhances the synthesis of certain dipeptides with N-formylmethionyl-tRNA and puromycine in vitro. EF-P binds to both the 30S and 50S ribosomal subunits. EF-P binds near the streptomycine binding site of the 16S rRNA in the 30S subunit. EF-P interacts with domains 2 and 5 of the 23S rRNA. The L16 ribosomal protein of the 50S or its N-terminal fragment are required for EF-P mediated peptide bond synthesis, whereas L11, L15, and L7/L12 are not required in this reaction, suggesting that EF-P may function at a different ribosomal site than most other translation factors. EF-P is essential for cell viability and is required for protein synthesis. EF-P is mainly present in bacteria. The EF-P homologs in archaea and eukaryotes are the initiation factors aIF5A and eIF5A, respectively. EF-P has 3 domains (domains I, II, and III). Domains II and III are S1-like domains. This CD includes domain III (the second S1 domain of EF_P). Domains II and III of have structural homology to the eIF5A domain C, suggesting that domains II and III evolved by duplication.	56
240221	cd05795	Ribosomal_P0_L10e	Ribosomal protein L10 family, P0 and L10e subfamily; composed of eukaryotic 60S ribosomal protein P0 and the archaeal P0 homolog, L10e. P0 or L10e forms a tight complex with multiple copies of the small acidic protein L12(e). This complex forms a stalk structure on the large subunit of the ribosome. The stalk is known to contain the binding site for elongation factors G and Tu (EF-G and EF-Tu, respectively); however, there is disagreement as to whether or not L10 is involved in forming the binding site. The stalk is believed to be associated with GTPase activities in protein synthesis. In a neuroblastoma cell line, L10 has been shown to interact with the SH3 domain of Src and to activate the binding of the Nck1 adaptor protein with skeletal proteins such as the Wiskott-Aldrich Syndrome Protein (WASP) and the WASP-interacting protein (WIP). These eukaryotic and archaeal P0 sequences have an additional C-terminal domain homologous with acidic proteins P1 and P2.	175
240222	cd05796	Ribosomal_P0_like	Ribosomal protein L10 family, P0-like protein subfamily; composed of uncharacterized eukaryotic proteins with similarity to the 60S ribosomal protein P0, including the Saccharomyces cerevisiae protein called mRNA turnover protein 4 (MRT4). MRT4 may be involved in mRNA decay. P0 forms a tight complex with multiple copies of the small acidic protein L12(e). This complex forms a stalk structure on the large subunit of the ribosome. It occupies the L7/L12 stalk of the ribosome. The stalk is known to contain the binding site for elongation factors EF-G and EF-Tu; however, there is disagreement as to whether or not P0 is involved in forming the binding site. The stalk is believed to be associated with GTPase activities in protein synthesis. In a neuroblastoma cell line, P0 has been shown to interact with the SH3 domain of Src and to activate the binding of the Nck1 adaptor protein with skeletal proteins such as the Wiskott-Aldrich Syndrome Protein (WASP) and the WASP-interacting protein (WIP). Some eukaryotic P0 sequences have an additional C-terminal domain homologous with acidic proteins P1 and P2.	163
240223	cd05797	Ribosomal_L10	Ribosomal protein L10 family, L10 subfamily; composed of bacterial 50S ribosomal protein and eukaryotic mitochondrial 39S ribosomal protein, L10. L10 occupies the L7/L12 stalk of the ribosome. The N-terminal domain (NTD) of L10 interacts with L11 protein and forms the base of the L7/L12 stalk, while the extended C-terminal helix binds to two or three dimers of the NTD of L7/L12 (L7 and L12 are identical except for an acetylated N-terminus). The L7/L12 stalk is known to contain the binding site for elongation factors G and Tu (EF-G and EF-Tu, respectively); however, there is disagreement as to whether or not L10 is involved in forming the binding site. The stalk is believed to be associated with GTPase activities in protein synthesis. In a neuroblastoma cell line, L10 has been shown to interact with the SH3 domain of Src and to activate the binding of the Nck1 adaptor protein with skeletal proteins such as the Wiskott-Aldrich Syndrome Protein (WASP) and the WASP-interacting protein (WIP). These bacteria and eukaryotic sequences have no additional C-terminal domain, present in other eukaryotic and archaeal orthologs.	157
240224	cd05798	SIS_TAL_PGI	SIS_TAL_PGI: Transaldolase (TAL)/ Phosphoglucose isomerase (PGI). This group represents the SIS (Sugar ISomerase) PGI domain, of a multifunctional protein (TAL-PGI ) having both TAL and PGI activities. TAL_PGI contains an N-terminal TAL domain and a C-terminal PGI domain. TAL catalyzes the reversible conversion of sedoheptulose-7-phosphate (S7P) and glyceraldehyde-3-phosphate (G3P), to fructose-6-phosphate (F6P) and erythrose-4-phosphate (E4P). PGI catalyzes the reversible isomerization of F6P to glucose-6-phosphate (G6P). It has been suggested for Gluconobacter oxydans TAL_PGI that this enzyme generates E4P and G6P directly from S7P and G3P. G. oxydans TAL_PGI contributes to increased xylitol production from D-arabitol. As xylitol is an alternative natural sweetner to sucrose, the microbial conversion of D-arabitol to xylitol is of interest to food and pharmaceutical industries.	129
100092	cd05799	PGM2	This CD includes PGM2 (phosphoglucomutase 2) and PGM2L1 (phosphoglucomutase 2-like 1). The mammalian PGM2 is thought to be a phosphopentomutase that catalyzes the conversion of the nucleoside breakdown products, ribose-1-phosphate and deoxyribose-1-phosphate to the corresponding 5-phosphopentoses. PGM2L1 is thought to catalyze the 1,3-bisphosphoglycerate-dependent synthesis of glucose 1,6-bisphosphate and other aldose-bisphosphates that serve as cofactors for several sugar phosphomutases and possibly also as regulators of glycolytic enzymes. PGM2 and PGM2L1 belong to the alpha-D-phosphohexomutase superfamily which includes several related enzymes that catalyze a reversible intramolecular phosphoryl transfer on their sugar substrates. Other members of this superfamily include phosphoglucosamine mutase (PNGM), phosphoacetylglucosamine mutase (PAGM), the bacterial phosphomannomutase ManB, the bacterial phosphoglucosamine mutase GlmM, and the bifunctional phosphomannomutase/phosphoglucomutase (PMM/PGM). Each of these enzymes has four domains with a centrally located active site formed by four loops, one from each domain. All four domains are included in this alignment model.	487
100093	cd05800	PGM_like2	This PGM-like (phosphoglucomutase-like) protein of unknown function belongs to the alpha-D-phosphohexomutase superfamily and is found in both archaea and bacteria. The alpha-D-phosphohexomutases include several related enzymes that catalyze a reversible intramolecular phosphoryl transfer on their sugar substrates. Other members of this superfamily include phosphoglucosamine mutase (PNGM), phosphoacetylglucosamine mutase (PAGM), the bacterial phosphomannomutase ManB, the bacterial phosphoglucosamine mutase GlmM, and the bifunctional phosphomannomutase/phosphoglucomutase (PMM/PGM). Each of these enzymes has four structural domains (subdomains) with a centrally located active site formed by four loops, one from each subdomain. All four subdomains are included in this alignment model.	461
100094	cd05801	PGM_like3	This bacterial PGM-like (phosphoglucomutase-like) protein of unknown function belongs to the alpha-D-phosphohexomutase superfamily. The alpha-D-phosphohexomutases include several related enzymes that catalyze a reversible intramolecular phosphoryl transfer on their sugar substrates. Other members of this superfamily include phosphoglucosamine mutase (PNGM), phosphoacetylglucosamine mutase (PAGM), the bacterial phosphomannomutase ManB, the bacterial phosphoglucosamine mutase GlmM, and the bifunctional phosphomannomutase/phosphoglucomutase (PMM/PGM). Each of these enzymes has four domains with a centrally located active site formed by four loops, one from each domain. All four domains are included in this alignment model.	522
100095	cd05802	GlmM	GlmM is a bacterial phosphoglucosamine mutase (PNGM) that belongs to the alpha-D-phosphohexomutase superfamily. It is required for the interconversion of glucosamine-6-phosphate and glucosamine-1-phosphate in the biosynthetic pathway of UDP-N-acetylglucosamine, an essential precursor to components of the cell envelope.  In order to be active, GlmM must be phosphorylated, which can occur via autophosphorylation or by the Ser/Thr kinase StkP. GlmM functions in a classical ping-pong bi-bi mechanism with glucosamine-1,6-diphosphate as an intermediate.  Other members of the alpha-D-phosphohexomutase superfamily include phosphoglucosamine mutase (PNGM), phosphoacetylglucosamine mutase (PAGM), the bacterial phosphomannomutase ManB, and the bifunctional phosphomannomutase/phosphoglucomutase (PMM/PGM). Each of these enzymes has four domains with a centrally located active site formed by four loops, one from each domain. All four domains are included in this alignment model.	434
100096	cd05803	PGM_like4	This PGM-like (phosphoglucomutase-like) domain is located C-terminal to a mannose-1-phosphate guanyltransferase domain in a protein of unknown function that is found in both prokaryotes and eukaryotes. This domain belongs to the alpha-D-phosphohexomutase superfamily which includes several related enzymes that catalyze a reversible intramolecular phosphoryl transfer on their sugar substrates. Members of this superfamily include the phosphoglucomutases (PGM1 and PGM2), phosphoglucosamine mutase (PNGM), phosphoacetylglucosamine mutase (PAGM), the bacterial phosphomannomutase ManB, the bacterial phosphoglucosamine mutase GlmM, and the bifunctional phosphomannomutase/phosphoglucomutase (PMM/PGM). Each of these enzymes has four domains with a centrally located active site formed by four loops, one from each domain. All four domains are included in this alignment model.	445
100115	cd05804	StaR_like	StaR_like; a well-conserved protein found in bacteria, plants, and animals. A family member from Streptomyces toyocaensis, StaR is part of a gene cluster involved in the biosynthesis of glycopeptide antibiotics (GPAs), specifically A47934. It has been speculated that StaR could be a flavoprotein hydroxylating a tyrosine sidechain. Some family members have been annotated as proteins containing tetratricopeptide (TPR) repeats, which may at least indicate mostly alpha-helical secondary structure.	355
100097	cd05805	MPG1_transferase	GTP-mannose-1-phosphate guanyltransferase (MPG1 transferase), also known as GDP-mannose pyrophosphorylase, is a bifunctional enzyme with both phosphomannose isomerase (PMI) activity and GDP-mannose phosphorylase (GMP) activity.  The protein contains an N-terminal NTP transferase domain, an L-beta-H domain, and a C-terminal PGM-like domain that belongs to the alpha-D-phosphohexomutase superfamily.  This subfamily is limited to bacteria and archaea. The alpha-D-phosphohexomutases include several related enzymes that catalyze a reversible intramolecular phosphoryl transfer on their sugar substrates. Members of this group appear to lack conserved residues necessary for metal binding and catalytic activity. Other members of this superfamily include the phosphoglucomutases (PGM1 and PGM2), phosphoglucosamine mutase (PNGM), phosphoacetylglucosamine mutase (PAGM), the bacterial phosphomannomutase ManB, the bacterial phosphoglucosamine mutase GlmM, and the bifunctional phosphomannomutase/phosphoglucomutase (PMM/PGM). Each of these enzymes has four domains with a centrally located active site formed by four loops, one from each domain. All four domains are included in this alignment model.	441
99881	cd05806	CBM20_laforin	Laforin protein tyrosine phosphatase, N-terminal CBM20 (carbohydrate-binding module, family 20) domain. Laforin, encoded by the EPM2A gene, is a dual-specificity phosphatase that dephosphorylates complex carbohydrates. Mutations in the gene encoding laforin result in Lafora disease, a fatal autosomal recessive neurodegenerative disorder characterized by the presence of intracellular deposits of insoluble, abnormally branched, glycogen-like polymers, known as Lafora bodies, in neurons, muscle, liver, and other tissues. The molecular basis for the formation of these Lafora bodies is unknown. Laforin is one of the only phosphatases that contains a carbohydrate-binding module. The CBM20 domain is found in a large number of starch degrading enzymes including alpha-amylase, beta-amylase, glucoamylase, and CGTase (cyclodextrin glucanotransferase). CBM20 is also present in proteins that have a regulatory role in starch metabolism in plants (e.g. alpha-amylase) or glycogen metabolism in mammals (e.g. laforin). CBM20 folds as an antiparallel beta-barrel structure with two starch binding sites. These two sites are thought to differ functionally with site 1 acting as the initial starch recognition site and site 2 involved in the specific recognition of appropriate regions of starch.	112
99882	cd05807	CBM20_CGTase	CGTase, C-terminal CBM20 (carbohydrate-binding module, family 20) domain. CGTase, also known as cyclodextrin glycosyltransferase and cyclodextrin glucanotransferase, catalyzes the formation of various cyclodextrins (alpha-1,4-glucans) from starch. CGTase has, in addition to its C-terminal CBM20 domain, an N-terminal catalytic domain belonging to glycosyl hydrolase family 13 and an IPT domain of unknown function. The CBM20 domain is found in a large number of starch degrading enzymes including alpha-amylase, beta-amylase, glucoamylase, and CGTase (cyclodextrin glucanotransferase). CBM20 is also present in proteins that have a regulatory role in starch metabolism in plants (e.g. alpha-amylase) or glycogen metabolism in mammals (e.g. laforin). CBM20 folds as an antiparallel beta-barrel structure with two starch binding sites. These two sites are thought to differ functionally with site 1 acting as the initial starch recognition site and site 2 involved in the specific recognition of appropriate regions of starch.	101
99883	cd05808	CBM20_alpha_amylase	Alpha-amylase, C-terminal CBM20 (carbohydrate-binding module, family 20) domain. This domain is found in several bacterial and fungal alpha-amylases including the maltopentaose-forming amylases (G5-amylases). Most alpha-amylases have, in addition to the C-terminal CBM20 domain, an N-terminal catalytic domain belonging to glycosyl hydrolase family 13, which hydrolyzes internal alpha-1,4-glucosidic bonds in starch and related saccharides, yielding maltotriose and maltose. Two types of soluble substrates are used by alpha-amylases including long substrates (e.g. amylose) and short substrates (e.g. maltodextrins or maltooligosaccharides). The CBM20 domain is found in a large number of starch degrading enzymes including alpha-amylase, beta-amylase, glucoamylase, and CGTase (cyclodextrin glucanotransferase). CBM20 is also present in proteins that have a regulatory role in starch metabolism in plants (e.g. alpha-amylase) or glycogen metabolism in mammals (e.g. laforin). CBM20 folds as an antiparallel beta-barrel structure with two starch binding sites. These two sites are thought to differ functionally with site 1 acting as the initial starch recognition site and site 2 involved in the specific recognition of appropriate regions of starch.	95
99884	cd05809	CBM20_beta_amylase	Beta-amylase, C-terminal CBM20 (carbohydrate-binding module, family 20) domain.  Beta-amylase has, in addition to its C-terminal CBM20 domain, an N-terminal catalytic domain belonging to glycosyl hydrolase family 14, which hydrolyzes the alpha-1,4-glucosidic bonds of starch, yielding beta-maltose from the nonreducing end of the substrate. Beta-amylase is found in both plants and microorganisms; however, the plant members lack a C-terminal CBM20 domain and are not included in this group. The CBM20 domain is found in a large number of starch degrading enzymes including alpha-amylase, beta-amylase, glucoamylase, and CGTase (cyclodextrin glucanotransferase). CBM20 is also present in proteins that have a regulatory role in starch metabolism in plants (e.g. alpha-amylase) or glycogen metabolism in mammals (e.g. laforin). CBM20 folds as an antiparallel beta-barrel structure with two starch binding sites. These two sites are thought to differ functionally with site 1 acting as the initial starch recognition site and site 2 involved in the specific recognition of appropriate regions of starch.	99
99885	cd05810	CBM20_alpha_MTH	Glucan 1,4-alpha-maltotetraohydrolase (alpha-MTH), C-terminal CBM20 (carbohydrate-binding module, family 20) domain. Alpha-MTH, also known as maltotetraose-forming exo-amylase or G4-amylase, is an exo-amylase found in bacteria that degrades starch from its non-reducing end. Most alpha-MTHs have, in addition to the C-terminal CBM20 domain, an N-terminal glycosyl hydrolase family 13 catalytic domain. The CBM20 domain is found in a large number of starch degrading enzymes including alpha-amylase, beta-amylase, glucoamylase, and CGTase (cyclodextrin glucanotransferase). CBM20 is also present in proteins that have a regulatory role in starch metabolism in plants (e.g. alpha-amylase) or glycogen metabolism in mammals (e.g. laforin). CBM20 folds as an antiparallel beta-barrel structure with two starch binding sites. These two sites are thought to differ functionally with site 1 acting as the initial starch recognition site and site 2 involved in the specific recognition of appropriate regions of starch.	97
99886	cd05811	CBM20_glucoamylase	Glucoamylase (glucan 1,4-alpha-glucosidase), C-terminal CBM20 (carbohydrate-binding module, family 20) domain. Glucoamylases are inverting, exo-acting starch hydrolases that hydrolyze starch and related polysaccharides by releasing the nonreducing end glucose. They are mainly active on alpha-1,4-glycosidic bonds but also have some activity towards 1,6-glycosidic bonds occurring in natural oligosaccharides. The ability of glucoamylases to cleave 1,6-glycosidic bonds is called "debranching activity" and is of importance in industrial applications, where complete degradation of starch to glucose is needed. Most glucoamylases are multidomain proteins containing an N-terminal catalytic domain, a C-terminal CBM20 domain, and a highly O-glycosylated linker region that connects the two. The CBM20 domain is found in a large number of starch degrading enzymes including alpha-amylase, beta-amylase, glucoamylase, and CGTase (cyclodextrin glucanotransferase). CBM20 is also present in proteins that have a regulatory role in starch metabolism in plants (e.g. alpha-amylase) or glycogen metabolism in mammals (e.g. laforin). CBM20 folds as an antiparallel beta-barrel structure with two starch binding sites. These two sites are thought to differ functionally with site 1 acting as the initial starch recognition site and site 2 involved in the specific recognition of appropriate regions of starch.	106
99887	cd05813	CBM20_genethonin_1	Genethonin-1, C-terminal CBM20 (carbohydrate-binding module, family 20) domain.  Genethonin-1 is a human skeletal muscle protein with no known function. It contains a C-terminal CBM20 domain. The CBM20 domain is found in a large number of starch degrading enzymes including alpha-amylase, beta-amylase, glucoamylase, and CGTase (cyclodextrin glucanotransferase). CBM20 is also present in proteins that have a regulatory role in starch metabolism in plants (e.g. alpha-amylase) or glycogen metabolism in mammals (e.g. laforin). CBM20 folds as an antiparallel beta-barrel structure with two starch binding sites. These two sites are thought to differ functionally with site 1 acting as the initial starch recognition site and site 2 involved in the specific recognition of appropriate regions of starch.	95
99888	cd05814	CBM20_Prei4	Prei4, N-terminal CBM20 (carbohydrate-binding module, family 20) domain. Preimplantation protein 4 (Prei4) is a protein of unknown function that is expressed during mouse preimplantation embryogenesis. In addition to the N-terminal CBM20 domain, Prei4 contains a C-terminal glycerophosphoryl diester phosphodiesterase (GDPD) domain. The CBM20 domain is found in a large number of starch degrading enzymes including alpha-amylase, beta-amylase, glucoamylase, and CGTase (cyclodextrin glucanotransferase). CBM20 is also present in proteins that have a regulatory role in starch metabolism in plants (e.g. alpha-amylase) or glycogen metabolism in mammals (e.g. laforin). CBM20 folds as an antiparallel beta-barrel structure with two starch binding sites. These two sites are thought to differ functionally with site 1 acting as the initial starch recognition site and site 2 involved in the specific recognition of appropriate regions of starch.	120
99889	cd05815	CBM20_DPE2_repeat1	Disproportionating enzyme 2 (DPE2), N-terminal CBM20 (carbohydrate-binding module, family 20) domain, repeat 1. DPE2 is a transglucosidase that is essential for the cytosolic metabolism of maltose in plant leaves at night. Maltose is an intermediate on the pathway from starch to sucrose and DPE2 is thought to metabolize the maltose that is exported from the chloroplast. DPE2 has two N-terminal CBM20 starch binding domains as well as a C-terminal amylomaltase (4-alpha-glucanotransferase) catalytic domain. DPE1, the plastid version of this enzyme, has a transglucosidase domain that is similar to that of DPE2 but lacks the N-terminal carbohydrate-binding domains. The CBM20 domain is found in a large number of starch degrading enzymes including alpha-amylase, beta-amylase, glucoamylase, and CGTase (cyclodextrin glucanotransferase). CBM20 is also present in proteins that have a regulatory role in starch metabolism in plants (e.g. alpha-amylase) or glycogen metabolism in mammals (e.g. laforin). CBM20 folds as an antiparallel beta-barrel structure with two starch binding sites. These two sites are thought to differ functionally with site 1 acting as the initial starch recognition site and site 2 involved in the specific recognition of appropriate regions of starch.	101
99890	cd05816	CBM20_DPE2_repeat2	Disproportionating enzyme 2 (DPE2), N-terminal CBM20 (carbohydrate-binding module, family 20) domain, repeat 2. DPE2 is a transglucosidase that is essential for the cytosolic metabolism of maltose in plant leaves at night. Maltose is an intermediate on the pathway from starch to sucrose and DPE2 is thought to metabolize the maltose that is exported from the chloroplast. DPE2 has two N-terminal CBM20 domains as well as a C-terminal amylomaltase (4-alpha-glucanotransferase) catalytic domain. DPE1, the plastid version of this enzyme, has a transglucosidase domain that is similar to that of DPE2 but lacks the N-terminal CBM20 domains. Included in this group are DPE2-like proteins from Dictyostelium, Entamoeba, and Bacteroides. The CBM20 domain is found in a large number of starch degrading enzymes including alpha-amylase, beta-amylase, glucoamylase, and CGTase (cyclodextrin glucanotransferase). CBM20 is also present in proteins that have a regulatory role in starch metabolism in plants (e.g. alpha-amylase) or glycogen metabolism in mammals (e.g. laforin). CBM20 folds as an antiparallel beta-barrel structure with two starch binding sites. These two sites are thought to differ functionally with site 1 acting as the initial starch recognition site and site 2 involved in the specific recognition of appropriate regions of starch.	99
99891	cd05817	CBM20_DSP	Dual-specificity phosphatase (DSP), N-terminal CBM20 (carbohydrate-binding module, family 20) domain. This CBM20 domain is located at the N-terminus of a protein tyrosine phosphatase of unknown function found in slime molds and ciliated protozoans. The CBM20 domain is found in a large number of starch degrading enzymes including alpha-amylase, beta-amylase, glucoamylase, and CGTase (cyclodextrin glucanotransferase). CBM20 is also present in proteins that have a regulatory role in starch metabolism in plants (e.g. alpha-amylase) or glycogen metabolism in mammals (e.g. laforin). CBM20 folds as an antiparallel beta-barrel structure with two starch binding sites. These two sites are thought to differ functionally with site 1 acting as the initial starch recognition site and site 2 involved in the specific recognition of appropriate regions of starch.	100
99892	cd05818	CBM20_water_dikinase	Phosphoglucan water dikinase (also known as alpha-glucan water dikinase), N-terminal CBM20 (carbohydrate-binding module, family 20) domain. This domain is found in the chloroplast-encoded phosphoglucan water dikinase, one of two enzymes involved in the phosphorylation of plant starches. In addition to the CBM20 domain, phosphoglucan water dikinase contains a C-terminal pyruvate binding domain. The CBM20 domain is found in a large number of starch degrading enzymes including alpha-amylase, beta-amylase, glucoamylase, and CGTase (cyclodextrin glucanotransferase). CBM20 is also present in proteins that have a regulatory role in starch metabolism in plants (e.g. alpha-amylase) or glycogen metabolism in mammals (e.g. laforin). CBM20 folds as an antiparallel beta-barrel structure with two starch binding sites. These two sites are thought to differ functionally with site 1 acting as the initial starch recognition site and site 2 involved in the specific recognition of appropriate regions of starch.	92
271320	cd05819	NHL	NHL repeat unit of beta-propeller proteins. The NHL(NCL-1, HT2A and LIN-41)-repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures. The repeats have a catalytic activity in Peptidyl-glycine alpha-amidating monooxygenase; proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localized to the repeats. Tripartite motif-containing protein 32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats.	269
99893	cd05820	CBM20_novamyl	Novamyl (also known as acarviose transferase, ATase, maltogenic alpha-amylase, glucan 1,4-alpha-maltohydrolase, and AcbD), C-terminal CBM20 (carbohydrate-binding module, family 20) domain. Novamyl has a five-domain structure similar to that of cyclodextrin glucanotransferase (CGTase). Novamyl has a substrate-binding surface with an open groove which can accommodate both cyclodextrins and linear substrates. The CBM20 domain is found in a large number of starch degrading enzymes including alpha-amylase, beta-amylase, glucoamylase, and CGTase (cyclodextrin glucanotransferase). CBM20 is also present in proteins that have a regulatory role in starch metabolism in plants (e.g. alpha-amylase) or glycogen metabolism in mammals (e.g. laforin). CBM20 folds as an antiparallel beta-barrel structure with two starch binding sites. These two sites are thought to differ functionally with site 1 acting as the initial starch recognition site and site 2 involved in the specific recognition of appropriate regions of starch.	103
100113	cd05821	TLP_Transthyretin	Transthyretin (TTR) is a 55 kDa protein responsible for the transport of thyroid hormones and retinol in vertebrates.  TTR distributes the two thyroid hormones T3 (3,5,3'-triiodo-L-thyronine) and T4 (Thyroxin, or 3,5,3',5'-tetraiodo-L-thyronine), as well as retinol (vitamin A) through the formation of a macromolecular complex that includes each of these as well as retinol-binding protein.  Misfolded forms of TTR are implicated in the amyloid diseases familial amyloidotic polyneuropathy and senile systemic amyloidosis. TTR forms a homotetramer with each subunit consisting of eight beta-strands arranged in two sheets and a short alpha-helix. The central channel of the tetramer contains two independent binding sites, each located between a pair of subunits, which differ in their ligand binding affinity.  A negative cooperativity has been observed for the binding of T4 and other TTR ligands. A fraction of plasma TTR is carried in high density lipoproteins by binding to apolipoprotein AI (apoA-I).  TTR is able to proteolytically process apoA-I by cleaving its C-terminus; therefore TTR has protease activity in addition to its function in protein transport.	121
100114	cd05822	TLP_HIUase	HIUase (5-hydroxyisourate hydrolase) catalyzes the second step in a three-step ureide pathway in which 5-hydroxyisourate (HIU), a product of the uricase (urate oxidase) reaction, is hydrolyzed to 2-oxo-4-hydroxy-4-carboxy-5-ureidoimidazoline (OHCU). HIUase has high sequence similarity with transthyretins and is a member of the transthyretin-like protein (TLP) family.   HIUase is distinguished from transthyretins by a conserved signature motif at its C-terminus that forms part of the active site.  In HIUase, this motif is YRGS, while transthyretins have a conserved TAVV sequence in the same location.  Most HIUases are cytosolic but in plants and slime molds, they are peroxisomal based on the presence of N-terminal periplasmic localization sequences.  HIUase forms a homotetramer with each subunit consisting of eight beta-strands arranged in two sheets and a short alpha-helix.  The central channel of the tetramer contains two independent binding sites, each located between a pair of subunits.	112
100062	cd05824	LbH_M1P_guanylylT_C	Mannose-1-phosphate guanylyltransferase, C-terminal Left-handed parallel beta helix (LbH) domain: Mannose-1-phosphate guanylyltransferase is also known as GDP-mannose pyrophosphorylase. It catalyzes the synthesis of GDP-mannose from GTP and mannose-1-phosphate, and is involved in the maintenance of cell wall integrity and glycosylation. Similar to ADP-glucose pyrophosphorylase, it contains an N-terminal catalytic domain that resembles a dinucleotide-binding Rossmann fold and a C-terminal LbH fold domain, presumably with 4 turns, each containing three imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X). Proteins containing hexapeptide repeats are often enzymes showing acyltransferase activity.	80
100063	cd05825	LbH_wcaF_like	wcaF-like: This group is composed of the protein product of the E. coli wcaF gene and similar proteins. WcaF is part of the gene cluster responsible for the biosynthesis of the extracellular polysaccharide colanic acid. The wcaF protein is predicted to contain a left-handed parallel beta-helix (LbH) domain encoded by imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X). Proteins containing hexapeptide repeats are often enzymes showing acyltransferase activity. Many are trimeric in their active forms.	107
320675	cd05826	Sortase_B	Sortase domain found in class B sortases. Class B sortases are membrane-bound cysteine transpeptidases broadly distributed in Gram-positive bacteria (mainly present in Firmicutes and Actinobacteria). They can have radically distinct functions. Some members of this group attach haemoproteins to the peptidoglycan of the cell wall, while others assemble pili, which are multi-subunit hair-like fibres that extend from the cell surface to promote microbial adhesion and biofilm formation. In the transpeptidation reaction, the surface protein substrate is cleaved at a conserved cell wall-sorting signal (Class B sortases normally recognize the consensus NP[Q/K][T/S][N/G/S][D/A] motif), and covalently linked to peptidoglycan for display on the bacterial surface. The prototypical sortase B protein from Staphylococcus aureus (named Sa-SrtB) cleaves surface protein precursors between threonine and asparagine at a conserved NPQTN motif with subsequent covalent linkage to pentaglycine cross-bridges. It is required for anchoring the heme-iron binding surface protein IsdC to the cell wall envelope. SrtB contains an N-terminal hydrophobic region that functions as a signal peptide/transmembrane domain. At the C terminus, it contains an essential cysteine residue within the catalytic TLXTC signature sequence, where X is usually a serine. Genes encoding SrtB and its targets are generally clustered in the same locus. The prototypical class B sortase involved in pilus biogenesis is pilus-specific sortase C2 from Streptococcus pyogenes (named Sp-SrtC2) that anchors a surface protein containing a QVPTGV motif to the cell wall, as well as polymerizes the major pilin subunit Tee3/FctA and attaches the minor tip pilin Cpa. The linkage of Cpa to Tee3 by SrtC2 requires the VPPTG motif in the cell wall-sorting signal of Cpa. The family also includes SrtB enzymes from Bacillus anthracis (named Ba-SrtB) and Clostridium difficile (named Cd-SrtB). Ba-SrtB is thought to recognize the NPKTG motif, and attaches surface proteins to meso-diaminopimelic acid (mDAP) cross-bridges. Cd-SrtB does not play an essential role in pathogenesis. It cleaves short [SP]PXTG motif-containing peptides between the threonine and glycine residues and then covalently anchors the threonine residue to a nucleophile such as glycine or mDAP, but not to the peptidoglycan of C. difficile, suggesting a novel association of sortase activity with cyclic diGMP (c-diGMP)-mediated regulation to control levels of cell wall anchoring and secretion of putative adhesion molecules.	170
320676	cd05827	Sortase_C	Sortase domain found in class C sortases. Class C sortases are membrane-bound cysteine transpeptidases broadly distributed in Gram-positive bacteria (mainly present in Firmicutes and Actinobacteria). They function as pilin polymerases responsible for the assembly of pili, which are multi-subunit hair-like fibres that extend from the cell surface to promote microbial adhesion and biofilm formation. First, one or more class C sortases form the long thin shaft of the pilus through linking together pilin subunits via isopeptide bonds. The base of the pilus is then anchored to the cell wall by a housekeeping sortase or, in some cases, the class C sortase itself. Depending upon the organism, both the number and type of sortase enzymes involved vary, and in some cases, accessory factors appear to be needed. In the three-component SpaA pilus from Corynebacterium diphtheriae, the prototypical class C sortase (named Cd-SrtA) catalyzes polymerization of the SpaA-type pilus, consisting of the shaft pilin SpaA, tip pilin SpaC and minor pilin SpaB. The pilus shaft is then attached to the cell wall by a housekeeping class E sortase, Cd-SrtF. In the absence of Cd-SrtF, Cd-SrtA attaches the pilus to the cell wall, albeit at a reduced rate. Cd-SrtA can recognize two distinct sorting signals (LPLTG in SpaA and SpaC, and LAFTG in SpaB) and it can employ lysine residues that originate from different proteins (either Lys190 within the pilin motif of SpaA or Lys139 in SpaB). However, Cd-SrtA is unable to polymerize the major pilin subunit SpaH, even though it contains an LPLTG motif. In two-component pili of prototypical Bacillus cereus, the class C sortase (named Bc-SrtD) cleaves related sorting signals within a major pilin protein BcpA (LPVTG) and a minor tip pilin BcpB (IPNTG), and catalyzes a transpeptidation that joins the threonine residues in each signal to the side-chain of Lys162 in BcpA (located within a pilin motif). Unlike the SpaA pilus in C. diphtheriae, in B. cereus Bc-SrtD is unable to covalently attach the pilus to the cell wall without the help of the housekeeping sortase.	131
320677	cd05828	Sortase_D_1	Sortase domain found in subfamily 1 of the class D family of sortases. Class D sortases are cysteine transpeptidases distributed in Gram-positive bacteria (mainly present in Firmicutes). The prototypical subfamily 1 of class D sortase from Bacillus anthracis (named Ba-SrtC) covalently attaches proteins bearing a noncanonical LPNTA sorting signal, such as the BasH and BasI proteins, to the peptidoglycan of the cell wall that facilitate sporulation. BasH is exclusively anchored to the forespore cell wall envelope, while BasI is attached to the diaminopimelic acid moiety of the peptidoglycan of predivisional cells. Ba-SrtC lacks the N-terminal signal peptide and membrane anchor. The family also includes many class D sortase homologs from Gram-negative bacteria, but the functions of these enzymes are unknown.	127
320678	cd05829	Sortase_F	Sortase domain found in the class F family of sortases. Class F sortases are mainly present in Actinobacteria, Chlorobacteria and Firmicutes. Their functions are largely unknown.	144
320679	cd05830	Sortase_E	Sortase domain found in the class E family of sortases. Class E sortases are membrane-bound cysteine transpeptidases distributed in Gram-positive bacteria (mainly present in Actinobacteria). Genes encoding class A and E sortases are never found in the same organism, and similar to class A sortases, the genes encoding class E sortases are not positioned adjacent to genes encoding potential protein substrates, suggesting a housekeeping sortase function of class E sortases in some high G + C Gram-positive bacteria. Similar to class A sortase, class E sortases are capable of anchoring a large number of functionally distinct surface proteins containing a cell wall sorting signal to an amino group located on the bacterial cell wall. They recognize an LAXTG sorting signal, instead of the canonical LPXTG motif processed by class A sortases. The prototypical class E sortase from Corynebacterium diphtheriae (named Cd-SrtF) is a non-polymerization sortase that is not required for pilus polymerization, and proceeds to complete the assembly process by anchoring the polymer to the cell wall peptidoglycan. Moreover, in Streptomyces coelicolor, one or both of Staphylococcus aureus SrtA homologs may function as class E sortase responsible for the cell wall anchoring of the long chaplin proteins (ChpA-C) containing an LAXTG sorting signal, which presumably mediate aerial hyphae formation. The family also includes some class E sortase homologs from Gram-negative and Archaebacterial species, but the functions of these enzymes are unknown.	135
100109	cd05831	Ribosomal_P1	Ribosomal protein P1. This subfamily represents the eukaryotic large ribosomal protein P1. Eukaryotic P1 and P2 are functionally equivalent to the bacterial protein L7/L12, but are not homologous to L7/L12. P1 is located in the L12 stalk, with proteins P2, P0, L11, and 28S rRNA. P1 and P2 are the only proteins in the ribosome to occur as multimers, always appearing as sets of heterodimers. Recent data indicate that eukaryotes have four copies (two heterodimers), while most archaeal species contain six copies of L12p (three homodimers) and bacteria may have four or six copies (two or three homodimers), depending on the species. Experiments using S. cerevisiae P1 and P2 indicate that P1 proteins are positioned more internally with limited reactivity in the C-terminal domains, while P2 proteins seem to be more externally located and are more likely to interact with other cellular components. In lower eukaryotes, P1 and P2 are further subdivided into P1A, P1B, P2A, and P2B, which form P1A/P2B and P1B/P2A heterodimers. Some plant species have a third P-protein, called P3, which is not homologous to P1 and P2. In humans, P1 and P2 are strongly autoimmunogenic. They play a significant role in the etiology and pathogenesis of systemic lupus erythematosus (SLE). In addition, the ribosome-inactivating protein trichosanthin (TCS) interacts with human P0, P1, and P2, with its primary binding site located in the C-terminal region of P2. TCS inactivates the ribosome by depurinating a specific adenine in the sarcin-ricin loop of 28S rRNA.	103
100110	cd05832	Ribosomal_L12p	Ribosomal protein L12p. This subfamily includes archaeal L12p, the protein that is functionally equivalent to L7/L12 in bacteria and the P1 and P2 proteins in eukaryotes. L12p is homologous to P1 and P2 but is not homologous to bacterial L7/L12. It is located in the L12 stalk, with proteins L10, L11, and 23S rRNA. L12p is the only protein in the ribosome to occur as multimers, always appearing as sets of dimers. Recent data indicate that most archaeal species contain six copies of L12p (three homodimers), while eukaryotes have four copies (two heterodimers), and bacteria may have four or six copies (two or three homodimers), depending on the species. The organization of proteins within the stalk has been characterized primarily in bacteria, where L7/L12 forms either two or three homodimers and each homodimer binds to the extended C-terminal helix of L10. L7/L12 is attached to the ribosome through L10 and is the only ribosomal protein that does not directly interact with rRNA. Archaeal L12p is believed to function in a similar fashion. However, hybrid ribosomes containing the large subunit from E. coli with an archaeal stalk are able to bind archaeal and eukaryotic elongation factors but not bacterial elongation factors. In several mesophilic and thermophilic archaeal species, the binding of 23S rRNA to protein L11 and to the L10/L12p pentameric complex was found to be temperature-dependent and cooperative.	106
100111	cd05833	Ribosomal_P2	Ribosomal protein P2. This subfamily represents the eukaryotic large ribosomal protein P2. Eukaryotic P1 and P2 are functionally equivalent to the bacterial protein L7/L12, but are not homologous to L7/L12. P2 is located in the L12 stalk, with proteins P1, P0, L11, and 28S rRNA. P1 and P2 are the only proteins in the ribosome to occur as multimers, always appearing as sets of heterodimers. Recent data indicate that eukaryotes have four copies (two heterodimers), while most archaeal species contain six copies of L12p (three homodimers). Bacteria may have four or six copies of L7/L12 (two or three homodimers) depending on the species. Experiments using S. cerevisiae P1 and P2 indicate that P1 proteins are positioned more internally with limited reactivity in the C-terminal domains, while P2 proteins seem to be more externally located and are more likely to interact with other cellular components. In lower eukaryotes, P1 and P2 are further subdivided into P1A, P1B, P2A, and P2B, which form P1A/P2B and P1B/P2A heterodimers. Some plants have a third P-protein, called P3, which is not homologous to P1 and P2. In humans, P1 and P2 are strongly autoimmunogenic. They play a significant role in the etiology and pathogenesis of systemic lupus erythematosus (SLE). In addition, the ribosome-inactivating protein trichosanthin (TCS) interacts with human P0, P1, and P2, with its primary binding site in the C-terminal region of P2. TCS inactivates the ribosome by depurinating a specific adenine in the sarcin-ricin loop of 28S rRNA.	109
99895	cd05834	HDGF_related	The PWWP domain is an essential part of the Hepatoma Derived Growth Factor (HDGF) family of proteins, and is necessary for DNA binding by HDGF. This family of endogenous nuclear-targeted mitogens includes HRP (HDGF-related proteins 1, 2, 3, 4, or HRP1, HRP2, HRP3, HRP4, respectively) and lens epithelium-derived growth factor, LEDGF. Members of the HDGF family have been linked to human diseases, and HDGF is a prognostic factor in several types of cancer. The PWWP domain, named for a conserved Pro-Trp-Trp-Pro motif, is a small domain consisting of 100-150 amino acids. The PWWP domain is found in numerous proteins that are involved in cell division, growth and differentiation. Most PWWP-domain proteins seem to be nuclear, often DNA-binding, proteins that function as transcription factors regulating a variety of developmental processes.	83
99896	cd05835	Dnmt3b_related	The PWWP domain is an essential component of DNA methyltransferase 3 B (Dnmt3b) which is responsible for establishing DNA methylation patterns during embryogenesis and gametogenesis.  In tumorigenesis, DNA methylation by Dnmt3b is known to play a role in the inactivation of tumor suppressor genes.  In addition, a point mutation in the PWWP domain of Dnmt3b has been identified in patients with ICF syndrome (immunodeficiency, centromeric instability, and facial anomalies), a rare autosomal recessive disorder characterized by hypomethylation of classical satellite DNA. The PWWP domain, named for a conserved Pro-Trp-Trp-Pro motif, is a small domain consisting of 100-150 amino acids. The PWWP domain is found in numerous proteins that are involved in cell division, growth and differentiation. Most PWWP-domain proteins seem to be nuclear, often DNA-binding, proteins that function as transcription factors regulating a variety of developmental processes.	87
99897	cd05836	N_Pac_NP60	The PWWP domain is an essential part of the cytokine-like nuclear factor n-pac protein, or NP60, which enhances the activity of MAP2K4 and MAP2K6 kinases to phosphorylate p38-alpha.  In a variety of cell lines, NP60 has been shown to localize to the nucleus. In addition to the PWWP domain, NP60 also contains an AT-hook and a C-terminal NAD-binding domain. The PWWP domain, named for a conserved Pro-Trp-Trp-Pro motif, is a small domain consisting of 100-150 amino acids. The PWWP domain is found in numerous proteins that are involved in cell division, growth and differentiation. Most PWWP-domain proteins seem to be nuclear, often DNA-binding proteins, that function as transcription factors regulating a variety of developmental processes.	86
99898	cd05837	MSH6_like	The PWWP domain is present in MSH6, a mismatch repair protein homologous to bacterial MutS.   The PWWP domain of histone-lysine N-methyltransferase, also known as Nuclear SET domain-containing protein 3, is also included. Mutations in MSH6 have been linked to increased cancer susceptibility, particularly in hereditary nonpolyposis colorectal cancer in humans.  The role of the PWWP domain in MSH6 is not clear; MSH6 orthologs found in S. cerevisiae, Caenorhabditis elegans and Arabidopsis thaliana lack the PWWP domain.   Histone methyltransferases (HMTases) induce the posttranslational methylation of lysine residues in histones and play a role in apoptosis.  In the HMTase Whistle, the PWWP domain is necessary for HMTase activity. The PWWP domain, named for a conserved Pro-Trp-Trp-Pro motif, is a small domain consisting of 100-150 amino acids. The PWWP domain is found in numerous proteins that are involved in cell division, growth and differentiation. Most PWWP-domain proteins seem to be nuclear, often DNA-binding, proteins that function as transcription factors regulating a variety of developmental processes.	110
99899	cd05838	WHSC1_related	The PWWP domain was first identified in the WHSC1 (Wolf-Hirschhorn syndrome candidate 1) protein, a protein implicated in Wolf-Hirschhorn syndrome (WHS).  When translocated, WHSC1 plays a role in lymphoid multiple myeloma (MM) disease, also known as plasmacytoma. WHSC1 proteins typically contain two copies of the PWWP domain.  The PWWP domain, named for a conserved Pro-Trp-Trp-Pro motif, is a small domain consisting of 100-150 amino acids. The PWWP domain is found in numerous proteins that are involved in cell division, growth and differentiation. Most PWWP-domain proteins seem to be nuclear, often DNA-binding, proteins that function as transcription factors regulating a variety of developmental processes.	95
99900	cd05839	BR140_related	The PWWP domain is found in the BR140 family, which includes peregrin and BR140-like proteins 1 and 2.   BR140 is the only family to contain the PWWP domain at the C terminus, with PHD and bromo domains in the N-terminal region.  In myeloid leukemias, BR140 is disrupted by chromosomal translocations, similar to translocations of WHSC1 in lymphoid multiple myeloma.  The PWWP domain, named for a conserved Pro-Trp-Trp-Pro motif, is a small domain consisting of 100-150 amino acids. The PWWP domain is found in numerous proteins that are involved in cell division, growth and differentiation. Most PWWP-domain proteins seem to be nuclear, often DNA-binding proteins, that function as transcription factors regulating a variety of developmental processes.	111
99901	cd05840	SPBC215_ISWI_like	The PWWP domain is a component of the S. pombe hypothetical protein SPBC215, as well as ISWI complex protein 4.  The ISWI (imitation switch) proteins are ATPases responsible for chromatin remodeling in eukaryotes, and SPBC215 is proposed to also bind chromatin.   The PWWP domain, named for a conserved Pro-Trp-Trp-Pro motif, is a small domain consisting of 100-150 amino acids. The PWWP domain is found in numerous proteins that are involved in cell division, growth and differentiation. Most PWWP-domain proteins seem to be nuclear, often DNA-binding,  proteins that function as transcription factors regulating a variety of developmental processes.	93
99902	cd05841	BS69_related	The PWWP domain is part of BS69 protein, a nuclear protein that specifically binds adenoviral E1A and Epstein-Barr viral EBNA2 proteins, suppressing their transactivation functions.  BS69 is a multi-domain protein, containing bromo, PHD, PWWP, and MYND domains.  The specific role of the PWWP domain within BS69 is not clearly identified, but BS69 functions in chromatin remodeling, consistent with other PWWP-containing proteins. The PWWP domain, named for a conserved Pro-Trp-Trp-Pro motif, is a small domain consisting of 100-150 amino acids. The PWWP domain is found in numerous proteins that are involved in cell division, growth and differentiation. Most PWWP-domain proteins seem to be nuclear, often DNA-binding, proteins that function as transcription factors regulating a variety of developmental processes.	83
320682	cd05843	Peptidase_M48_M56	Peptidases M48 (Ste24 endopeptidase or htpX homolog) and M56 (in MecR1 and BlaR1), integral membrane metallopeptidases. This family contains peptidase M48 (also known as Ste24 peptidase, Ste24p, Ste24 endopeptidase, a-factor converting enzyme, AFC1), M56 (also known as BlaR1 peptidase) as well as a novel family called minigluzincins. Peptidase M48 belongs to Ste24 endopeptidase family. Members of this family include Ste24 protease (peptidase M48A), protease htpX homolog (peptidase M48B), or CAAX prenyl protease 1, and mitochondrial metalloendopeptidase OMA1 (peptidase M48C). They proteolytically remove the C-terminal three residues of farnesylated proteins. They are integral membrane proteins associated with the endoplasmic reticulum and golgi, binding one zinc ion per subunit. In eukaryotes, Ste24p is required for the first NH2-terminal proteolytic processing event within the a-factor precursor, which takes place after COOH-terminal CAAX modification (C is cysteine; A is usually aliphatic; X is one of several amino acids) is complete. The Ste24p contains multiple membrane spans, a zinc metalloprotease motif (HEXXH), and a COOH-terminal ER retrieval signal (KKXX). Mutation studies have shown that the HEXXH protease motif, which is extracellular but adjacent to a transmembrane domain and therefore close to the membrane surface, is critical for Ste24p activity. Ste24p has limited homology to HtpX family of prokaryotic proteins; HtpX proteins, also part of the M48 peptidase family, are smaller and homology is restricted to the C-terminal half of Ste24p. HtpX expression is controlled by the Cpx stress response system, which senses abnormal membrane proteins; HtpX then undergoes self-degradation and collaborates with FtsH to eliminate these misfolded proteins. Peptidase M56 includes zinc metalloprotease domain in MecR1 and BlaR1. MecR1 is a transmembrane beta-lactam sensor/signal transducer protein that regulates the expression of an altered penicillin-binding protein PBP2a, which resists inactivation by beta-lactam antibiotics, in methicillin-resistant Staphylococcus aureus (MRSA). BlaR1 regulates the inducible expression of a class A beta-lactamase that hydrolytically destroys certain beta-lactam antibiotics in MRSA. Also included are a novel family of related proteins that consist of the soluble minimal scaffold similar to the catalytic domains of the integral-membrane metallopeptidase M48 and M56, thus called minigluzincins.	94
340860	cd05844	GT4-like	glycosyltransferase family 4 proteins. Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. This group of glycosyltransferases is most closely related to glycosyltransferase family 4 (GT4). The members of this family may transfer UDP, ADP, GDP, or CMP linked sugars. The diverse enzymatic activities among members of this family reflect a wide range of biological functions. The protein structure available for this family has the GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility.	365
409432	cd05845	IgI_2_L1-CAM_like	Second immunoglobulin (Ig)-like domain of the L1 cell adhesion molecule (CAM), and similar domains; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the second immunoglobulin (Ig)-like domain of the L1 cell adhesion molecule (CAM) and similar proteins. L1 belongs to the L1 subfamily of cell adhesion molecules (CAMs) and is comprised of an extracellular region having six Ig-like domains, five fibronectin type III domains, a transmembrane region, and an intracellular domain. L1 is primarily expressed in the nervous system and is involved in its development and function. L1 is associated with an X-linked recessive disorder, X-linked hydrocephalus, MASA syndrome, or spastic paraplegia type 1 that involves abnormalities of axonal growth. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand.	91
409433	cd05846	IgV_1_MRC-OX-2_like	First immunoglobulin (Ig) variable (V) domain of rat MRC OX-2 antigen, and similar domains. The members here are composed of the first immunoglobulin (Ig) domain of rat MRC OX-2 antigen (also known as CD200) and similar proteins. MRC OX-2 is a membrane glycoprotein expressed in a variety of lymphoid and non-lymphoid cells in rats. It has a similar broad distribution pattern in humans. MRC OX-2 may regulate myeloid cell activity. The protein has an extracellular portion containing two Ig-like domains, a transmembrane portion, and a cytoplasmic portion.	108
409434	cd05847	IgC1_CH2_IgE	CH2 domain (second constant Ig domain of the heavy chain) in immunoglobulin E (IgE); member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the second constant domain of the heavy chain of immunoglobulin E (IgE). The basic structure of immunoglobulin (Ig) molecules is a tetramer of two light chains and two heavy chains linked by disulfide bonds. There are two types of light chains: kappa and lambda; each is composed of a constant domain and a variable domain. There are five types of heavy chains: alpha, delta, epsilon, gamma, and mu, all consisting of a variable domain (VH) with three (alpha, delta, and gamma) or four (epsilon and mu) constant domains (CH1 to CH4). The different classes of antibodies vary in their heavy chains; the IgE class has the epsilon type. This domain (Cepsilon2) of IgE is in place of the flexible hinge region found in IgG.	97
409435	cd05848	IgI_1_Contactin-5	First immunoglobulin (Ig) domain of contactin-5; member of the I-set of Ig superfamily domains. The members here are composed of the first immunoglobulin (Ig) domain of the neural cell adhesion molecule contactin-5. Contactins are comprised of six Ig domains followed by four fibronectin type III (FnIII) domains, anchored to the membrane by glycosylphosphatidylinositol. The different contactins show different expression patterns in the central nervous system. In rats, a lack of contactin-5 (NB-2) results in an impairment of the neuronal activity in the auditory system. Contactin-5 is expressed specifically in the postnatal nervous system, peaking at about 3 weeks postnatal. Contactin-5 is highly expressed in the adult human brain in the occipital lobe and in the amygdala; lower levels of expression have been detected in the corpus callosum, caudate nucleus, and spinal cord. This group belongs to the I-set of IgSF domains.	96
409436	cd05849	IgI_1_Contactin-1	First immunoglobulin (Ig) domain of contactin-1; member of the I-set of Ig superfamily domains. The members here are composed of the first immunoglobulin (Ig) domain of the neural cell adhesion molecule contactin-1. Contactins are comprised of six Ig domains followed by four fibronectin type III (FnIII) domains anchored to the membrane by glycosylphosphatidylinositol. Contactin-1 is differentially expressed in tumor tissues and may, through a RhoA mechanism, facilitate invasion and metastasis of human lung adenocarcinoma. This group belongs to the I-set of IgSF domains.	95
409437	cd05850	IgI_1_Contactin-2	First immunoglobulin (Ig) domain of contactin-2; member of the I-set of Ig superfamily domains. The members here are composed of the first immunoglobulin (Ig) domain of the neural cell adhesion molecule contactin-2-like. Contactins are comprised of six Ig domains followed by four fibronectin type III (FnIII) domains anchored to the membrane by glycosylphosphatidylinositol. Contactin-2 (TAG-1, axonin-1) facilitates cell adhesion by homophilic binding between molecules in apposed membranes. It may play a part in the neuronal processes of neurite outgrowth, axon guidance and fasciculation, and neuronal migration. The first four Ig domains form the intermolecular binding fragment, which arranges as a compact U-shaped module by contacts between IG domains 1 and 4, and domains 2 and 3. The different contactins show different expression patterns in the central nervous system. During development and in adulthood, contactin-2 is transiently expressed in subsets of central and peripheral neurons. Contactin-2 is also expressed in retinal amacrine cells in the developing chick retina, corresponding to the period of formation and maturation of AC processes. This group belongs to the I-set of IgSF domains.	97
143259	cd05851	IgI_3_Contactin-1	Third immunoglobulin (Ig) domain of contactin-1; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the third immunoglobulin (Ig) domain of the neural cell adhesion molecule contactin-1. Contactins are comprised of six Ig domains followed by four fibronectin type III (FnIII) domains anchored to the membrane by glycosylphosphatidylinositol. Contactin-1 is differentially expressed in tumor tissues and may, through a RhoA mechanism, facilitate invasion and metastasis of human lung adenocarcinoma. This group belongs to the I-set of IgSF domains.	88
409438	cd05852	Ig5_Contactin-1	Fifth immunoglobulin (Ig) domain of contactin-1. The members here are composed of the fifth immunoglobulin (Ig) domain of the neural cell adhesion molecule contactin-1. Contactins are comprised of six Ig domains followed by four fibronectin type III (FnIII) domains anchored to the membrane by glycosylphosphatidylinositol. Contactin-1 is differentially expressed in tumor tissues and may, through a RhoA mechanism, facilitate invasion and metastasis of human lung adenocarcinoma.	89
409439	cd05853	Ig6_Contactin-4	Sixth immunoglobulin (Ig) domain of contactin-4. The members here are composed of the sixth immunoglobulin (Ig) domain of the neural cell adhesion molecule contactin-4. Contactins are neural cell adhesion molecules, and are comprised of six Ig domains followed by four fibronectin type III (FnIII) domains anchored to the membrane by glycosylphosphatidylinositol. The different contactins show different expression patterns in the central nervous system. Highest expression of contactin-4 is in testes, thyroid, small intestine, uterus, and brain. Contactin-4 plays a role in the response of neuroblastoma cells to differentiating agents, such as retinoids. The contactin 4 gene is associated with cerebellar degeneration in spinocerebellar ataxia type 16.	102
409440	cd05854	Ig6_Contactin-2	Sixth immunoglobulin (Ig) domain of contactin-2. The members here are composed of the sixth immunoglobulin (Ig) domain of the neural cell adhesion molecule contactin-2-like. Contactins are comprised of six Ig domains followed by four fibronectin type III (FnIII) domains anchored to the membrane by glycosylphosphatidylinositol. Contactin-2 (TAG-1, axonin-1) facilitates cell adhesion by homophilic binding between molecules in apposed membranes. It may play a part in the neuronal processes of neurite outgrowth, axon guidance and fasciculation, and neuronal migration. The first four Ig domains form the intermolecular binding fragment, which arranges as a compact U-shaped module by contacts between IG domains 1 and 4, and domains 2 and 3. The different contactins show different expression patterns in the central nervous system. During development and in adulthood, contactin-2 is transiently expressed in subsets of central and peripheral neurons. Contactin-2 is also expressed in retinal amacrine cells (AC) in the developing chick retina, corresponding to the period of formation and maturation of AC processes.	102
409441	cd05855	IgI_TrkB_d5	Fifth domain (immunoglobulin-like) of Trk receptor TrkB; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the fifth domain of Trk receptor, TrkB, an immunoglobulin (Ig)-like domain which binds to neurotrophin. The Trk family of receptors are tyrosine kinase receptors, which mediate the trophic effects of the neurotrophin Nerve Growth Factor (NGF) family. Trks are activated by dimerization, leading to autophosphorylation of intracellular tyrosine residues, and triggering the signal transduction pathway. TrkB shares significant sequence homology and domain organization with TrkA and TrkC. The first three domains are leucine-rich domains while the fourth and fifth domains are Ig-like domains playing a part in ligand binding. TrkB is recognized by brain-derived neurotrophic factor (BDNF) and neurotrophin (NT)-4. In some cell systems NT-3 can activate TrkA and TrkB receptors. TrkB transcripts are found throughout multiple structures of the central and peripheral nervous systems. This group belongs to the I-set of IgSF domains.	94
409442	cd05856	IgI_2_FGFRL1-like	Second immunoglobulin (Ig)-like domain of fibroblast growth factor (FGF) receptor-like 1 (FGFRL1); member of the I-set of IgSF domains. The members here are composed of the second immunoglobulin (Ig)-like domain of fibroblast growth factor (FGF) receptor-like 1 (FGFRL1). FGFRL1 is comprised of a signal peptide, three extracellular Ig-like modules, a transmembrane segment, and a short intracellular domain. FGFRL1 is expressed preferentially in skeletal tissues. Similar to FGF receptors, the expressed protein interacts specifically with heparin and with FGF2.  FGFRL1 does not have a protein tyrosine kinase domain at its C-terminus; neither does its cytoplasmic domain appear to interact with a signaling partner. It has been suggested that FGFRL1 may not have any direct signaling function, but instead acts as a decoy receptor trapping FGFs and preventing them from binding other receptors.	92
409443	cd05857	IgI_2_FGFR	Second immunoglobulin (Ig)-like domain of fibroblast growth factor (FGF) receptor; member of the I-set of IgSF domains. The members here are composed of the second immunoglobulin (Ig)-like domain of fibroblast growth factor (FGF) receptor. FGF receptors bind FGF signaling polypeptides. FGFs participate in multiple processes such as morphogenesis, development, and angiogenesis. FGFs bind to four FGF receptor tyrosine kinases (FGFR1, FGFR2, FGFR3, FGFR4). Receptor diversity is controlled by alternative splicing producing splice variants with different ligand binding characteristics and different expression patterns. FGFRs have an extracellular region comprised of three Ig-like domains, a single transmembrane helix, and an intracellular tyrosine kinase domain. Ligand binding and specificity reside in the Ig-like domains 2 and 3, and the linker region that connects these two. FGFR activation and signaling depend on FGF-induced dimerization, a process involving cell surface heparin or heparan sulfate proteoglycans.	95
409444	cd05858	IgI_3_FGFR2	Third immunoglobulin (Ig)-like domain of fibroblast growth factor receptor 2 (FGFR2); member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the third immunoglobulin (Ig)-like domain of human fibroblast growth factor receptor 2 (FGFR2). Fibroblast growth factors (FGFs) participate in morphogenesis, development, angiogenesis, and wound healing. These FGF-stimulated processes are mediated by four FGFR tyrosine kinases (FGFR1-4). FGFRs are comprised of an extracellular portion consisting of three Ig-like domains, a transmembrane helix, and a cytoplasmic portion having protein tyrosine kinase activity. The highly conserved Ig-like domains 2 and 3, and the linker region between D2 and D3 define a general binding site for FGFs. FGFR2 is required for male sex determination. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand.	105
409445	cd05859	Ig4_PDGFR	Fourth immunoglobulin (Ig)-like domain of platelet-derived growth factor receptor (PDGFR). The members here are composed of the fourth immunoglobulin (Ig)-like domain of platelet-derived growth factor receptor (PDGFR; also known as cluster of differentiation (CD) 140a) alpha and beta. PDGF is a potent mitogen for connective tissue cells. PDGF-stimulated processes are mediated by three different PDGFs (PDGF-A, PDGF-B, and PDGF-C). PDGFR alpha binds to all three PDGFs, whereas the PDGFR beta binds only to PDGF-B. PDGFR alpha is organized as an extracellular component having five Ig-like domains, a transmembrane segment, and a cytoplasmic portion having protein tyrosine kinase activity. In mice, PDGFR alpha and PDGFR beta are essential for normal development.	101
409446	cd05860	IgI_4_SCFR	Fourth immunoglobulin (Ig)-like domain of stem cell factor receptor (SCFR); member of the I-set of IgSF domains. The members here are composed of the fourth Immunoglobulin (Ig)-like domain in stem cell factor receptor (SCFR). SCFR is organized as an extracellular component having five IG-like domains, a transmembrane segment, and a cytoplasmic portion having protein tyrosine kinase activity. SCFR and its ligand SCF are critical for normal hematopoiesis, mast cell development, melanocytes, and gametogenesis. SCF binds to the second and third Ig-like domains of SCFR. This fourth Ig-like domain participates in SCFR dimerization, which follows ligand binding. Deletion of this fourth domain abolishes the ligand-induced dimerization of SCFR and completely inhibits signal transduction. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand.	101
409447	cd05861	IgI_PDGFR-alphabeta	Immunoglobulin (Ig)-like domain of platelet-derived growth factor (PDGF) receptors (R), alpha and beta; member of the I-set of IgSF domains. The members here are composed of the immunoglobulin (Ig)-like domain of platelet-derived growth factor (PDGF) receptors (R), alpha (also known as cluster of differentiation (CD) 140a), and beta (also known as CD140b). PDGF is a potent mitogen for connective tissue cells. PDGF-stimulated processes are mediated by three different PDGFs (PDGF-A, PDGF-B, and PDGF-C). PDGFRalpha binds to all three PDGFs, whereas the PDGFRbeta binds only to PDGF-B. PDGFRs alpha and beta have similar organization: an extracellular component with five Ig-like domains, a transmembrane segment, and a cytoplasmic portion having protein tyrosine kinase activity. In mice, PDGFRalpha and PDGFRbeta are essential for normal development.	99
409448	cd05862	IgI_VEGFR	Immunoglobulin (Ig)-like domain of vascular endothelial growth factor (VEGF) receptor(R); member of the I-set of IgSF domains. The members here are composed of the immunoglobulin (Ig)-like domain of vascular endothelial growth factor (VEGF) receptor(R). The VEGFRs have an extracellular component with seven Ig-like domains, a transmembrane segment, and an intracellular tyrosine kinase domain interrupted by a kinase-insert domain. The VEGFR family consists of three members, VEGFR-1 (also known as Flt-1), VEGFR-2 (also known as KDR or Flk-1) and VEGFR-3 (also known as Flt-4). VEGF-A interacts with both VEGFR-1 and VEGFR-2. VEGFR-1 binds VEGF most strongly; VEGFR-2 binds more weakly. VEGFR-3 appears not to bind VEGF, but binds other members of the VEGF family (VEGF-C and -D). VEGFRs bind VEGFs with high affinity at the Ig-like domains. VEGF-A is important to the growth and maintenance of vascular endothelial cells and to the development of new blood- and lymphatic-vessels in physiological and pathological states. VEGFR-2 is a major mediator of the mitogenic, angiogenic and microvascular permeability-enhancing effects of VEGF-A. VEGFR-1 may play an inhibitory part in these processes by binding VEGF and interfering with its interaction with VEGFR-2. VEGFR-1 has a signaling role in mediating monocyte chemotaxis. VEGFR-2 and -1 may mediate a chemotactic and a survival signal in hematopoietic stem cells or leukemia cells. VEGFR-3 has been shown to be involved in tumor angiogenesis and growth.	102
409449	cd05863	IgI_VEGFR-3	Immunoglobulin (Ig)-like domain of vascular endothelial growth factor receptor 3 (VEGFR-3); member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin (Ig)-like domain of vascular endothelial growth factor receptor 3 (VEGFR-3). The VEGFRs have an extracellular component with seven Ig-like domains, a transmembrane segment, and an intracellular tyrosine kinase domain interrupted by a kinase-insert domain. VEGFRs bind VEGFs with high affinity at the Ig-like domains. VEGFR-3 (Flt-4) binds two members of the VEGF family (VEGF-C and VEGF-D) and is involved in tumor angiogenesis and growth. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand.	88
409450	cd05864	IgI_VEGFR-2	Immunoglobulin (Ig)-like domain of vascular endothelial growth factor receptor 2 (VEGFR-2); member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin (Ig)-like domain of vascular endothelial growth factor receptor 2 (VEGFR-2). The VEGFRs have an extracellular component with seven Ig-like domains, a transmembrane segment, and an intracellular tyrosine kinase domain interrupted by a kinase-insert domain. VEGFRs bind VEGFs with high affinity at the Ig-like domains. VEGFR-2 (KDR/Flk-1) is a major mediator of the mitogenic, angiogenic and microvascular permeability-enhancing effects of VEGF-A; VEGF-A is important to the growth and maintenance of vascular endothelial cells and to the development of new blood- and lymphatic-vessels in physiological and pathological states. VEGF-A also interacts with VEGFR-1, which it binds more strongly than VEGFR-2. VEGFR-1 and VEGFR-2 may mediate a chemotactic and a survival signal in hematopoietic stem cells or leukemia cells. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand.	89
409451	cd05865	IgI_1_NCAM-1	First immunoglobulin (Ig)-like domain of neural cell adhesion molecule (NCAM-1); member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the first immunoglobulin (Ig)-like domain of neural cell adhesion molecule (NCAM-1). NCAM-1 plays important roles in the development and regeneration of the central nervous system, in synaptogenesis and neural migration. NCAM mediates cell-cell and cell-substratum recognition and adhesion via homophilic (NCAM-NCAM), and heterophilic (NCAM-nonNCAM), interactions. NCAM is expressed as three major isoforms having different intracellular extensions. The extracellular portion of NCAM has five N-terminal Ig-like domains and two fibronectin type III domains. The double zipper adhesion complex model for NCAM homophilic binding involves the Ig1, Ig2, and Ig3 domains. By this model, Ig1 and Ig2 mediate dimerization of NCAM molecules situated on the same cell surface (cis interactions), and Ig3 domains mediate interactions between NCAM molecules expressed on the surface of opposing cells (trans interactions), through binding to the Ig1 and Ig2 domains. The adhesive ability of NCAM is modulated by the addition of polysialic acid chains to the fifth Ig-like domain. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand.	97
409452	cd05866	IgI_1_NCAM-2	First immunoglobulin (Ig)-like domain of neural cell adhesion molecule NCAM-2; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the first immunoglobulin (Ig)-like domain of neural cell adhesion molecule NCAM-2 (OCAM/mamFas II, RNCAM). NCAM-2 is organized similarly to NCAM-1, including five N-terminal Ig-like domains and two fibronectin type III domains. NCAM-2 is differentially expressed in the developing and mature olfactory epithelium (OE), and may function like NCAM, as an adhesion molecule. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand.	93
409453	cd05867	Ig4_L1-CAM_like	Fourth immunoglobulin (Ig)-like domain of the L1 cell adhesion molecule (CAM). The members here are composed of the fourth immunoglobulin (Ig)-like domain of the L1 cell adhesion molecule (CAM). L1 is comprised of an extracellular region having six Ig-like domains and five fibronectin type III domains, a transmembrane region, and an intracellular domain. L1 is primarily expressed in the nervous system and is involved in its development and function. L1 is associated with an X-linked recessive disorder, X-linked hydrocephalus, MASA syndrome, and spastic paraplegia type 1, that involves abnormalities of axonal growth. This group also contains the chicken neuron-glia cell adhesion molecule, Ng-CAM.	89
409454	cd05868	Ig4_NrCAM	Fourth immunoglobulin (Ig)-like domain of NrCAM (NgCAM-related cell adhesion molecule). The members here are composed of the fourth immunoglobulin (Ig)-like domain of NrCAM (NgCAM-related cell adhesion molecule). NrCAM belongs to the L1 subfamily of cell adhesion molecules (CAMs) and is comprised of an extracellular region having six IG-like domains and five fibronectin type III domains, a transmembrane region, and an intracellular domain. NrCAM is primarily expressed in the nervous system.	89
143277	cd05869	IgI_NCAM-1	Immunoglobulin (Ig)-like I-set domain of Neural Cell Adhesion Molecule 1 (NCAM-1). The members here are composed of the fourth Ig domain of Neural Cell Adhesion Molecule 1(NCAM-1). NCAM plays important roles in the development and regeneration of the central nervous system, in synaptogenesis and neural migration. NCAM mediates cell-cell and cell-substratum recognition and adhesion via homophilic (NCAM-NCAM) and heterophilic (NCAM-non-NCAM) interactions. NCAM is expressed as three major isoforms having different intracellular extensions. The extracellular portion of NCAM has five N-terminal Ig-like domains and two fibronectin type III domains. The double zipper adhesion complex model for NCAM homophilic binding involves Ig1, Ig2, and Ig3. By this model, Ig1 and Ig2 mediate dimerization of NCAM molecules situated on the same cell surface (cis interactions), and Ig3 domains mediate interactions between NCAM molecules expressed on the surface of opposing cells (trans interactions), through binding to the Ig1 and Ig2 domains. The adhesive ability of NCAM is modulated by the addition of polysialic acid chains to the fifth Ig-like domain. One of the unique features of I-set domains is the lack of a C" strand. The structures of this group show that the Ig domain lacks this strand and thus is a member of the I-set of Ig domains.	97
143278	cd05870	IgI_NCAM-2	Immunoglobulin (Ig)-like I-set domain of Neural Cell Adhesion Molecule 2 (NCAM-2). The members here are composed of the fourth Ig domain of Neural Cell Adhesion Molecule NCAM-2 (also known as OCAM/mamFas II and RNCAM). NCAM-2 is organized similarly to NCAM, including five N-terminal Ig-like domains and two fibronectin type III domains. NCAM-2 is differentially expressed in the developing and mature olfactory epithelium (OE), and may function like NCAM, as an adhesion molecule. One of the unique features of I-set domains is the lack of a C" strand. The structures of this group show that the Ig domain lacks this strand and thus is a member of the I-set of Ig domains.	98
409455	cd05871	Ig_Sema3	Immunoglobulin (Ig)-like domain of class III semaphorin Sema3. The members here are composed of the immunoglobulin (Ig)-like domain of Sema3 and similar proteins.  Semaphorins are classified based on structural features additional to the Sema domain. Sema3 is a Class III semaphorin that is secreted.  It is a vertebrate class having a Sema domain, an Ig domain, and a short basic domain. They have been shown to be axonal guidance cues and have a part in the regulation of the cardiovascular, immune, and respiratory systems. Sema3A, the prototype member of this class III subfamily, induces growth cone collapse and is an inhibitor of axonal sprouting. In perinatal rat cortex, it acts as a chemoattractant and functions to direct the orientated extension of apical dendrites. It may play a role, prior to the development of apical dendrites, in signaling the radial migration of newborn cortical neurons towards the upper layers. Sema3A selectively inhibits vascular endothelial growth factor (VEGF)-induced angiogenesis and induces microvascular permeability. This group also includes Sema3B, -C, -D, -E, -G.	92
409456	cd05872	Ig_Sema4B_like	Immunoglobulin (Ig)-like domain of the class IV semaphorin Sema4B. The members here are composed of the immunoglobulin (Ig)-like domain of Sema4B and similar proteins. Sema4B is a Class IV semaphorin. Semaphorins are classified based on structural features additional to the Sema domain. Sema4B has extracellular Sema and Ig domains, a transmembrane domain, and a short cytoplasmic domain. Sema4B has been shown to preferentially regulate the development of the postsynaptic specialization at the glutamatergic synapses. This cytoplasmic domain includes a PDZ-binding motif upon which the synaptic localization of Sema4B is dependent. Sema4B is a ligand of CLCP1. CLCP1 was identified in an expression profiling analysis, which compared a highly metastatic lung cancer subline with its low metastatic parental line. Sema4B was shown to promote CLCP1 endocytosis and their interaction is a potential target for therapeutic intervention of metastasis.	86
409457	cd05873	Ig_Sema4D_like	Immunoglobulin (Ig)-like domain of semaphorin 4D (Sema4D) and similar proteins. The members here are composed of the immunoglobulin (Ig)-like domain of semaphorin 4D (Sema4D) and similar proteins. Sema4D is a Class IV semaphorin. Semaphorins are classified based on structural features additional to the Sema domain. Sema4D has extracellular Sema and Ig domains, a transmembrane domain, and a short cytoplasmic domain. Sema4D plays a part in the development of GABAergic synapses. Sema4D in addition is an immune semaphorin. It is abundant on resting T cells; its expression is weak on resting B cells and antigen presenting cells (APCs), but is upregulated by various stimuli. The receptor used by Sema4D in the immune system is CD72. Sema4D enhances the activation of B cells and DCs through binding CD72, perhaps by reducing CD72's inhibitory signals. The receptor used by Sema4D in the non-lymphatic tissues is plexin-B1. Sema4D is anchored to the cell surface but its extracellular domain can be released from the cell surface by a metalloprotease-dependent process. Sema4D may mediate its effects in its membrane-bound form and/or its cleaved form.	87
409458	cd05874	IgI_NrCAM	Immunoglobulin (Ig)-like domain of NrCAM (Ng (neuronglia) CAM-related cell adhesion molecule); member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the first immunoglobulin (Ig)-like domain of NrCAM (Ng (neuronglia) CAM-related cell adhesion molecule). NrCAM belongs to the L1 subfamily of cell adhesion molecules (CAMs) and is comprised of an extracellular region having six Ig-like domains and five fibronectin type III domains, a transmembrane region, and an intracellular domain. NrCAM is primarily expressed in the nervous system.	95
409459	cd05875	IgI_hNeurofascin_like	Immunoglobulin (Ig)-like domain of human neurofascin (NF); member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the first immunoglobulin (Ig)-like domain of human neurofascin (NF). NF belongs to the L1 subfamily of cell adhesion molecules (CAMs) and is comprised of an extracellular region having six Ig-like domains and five fibronectin type III domains, a transmembrane region, and a cytoplasmic domain. NF has many alternatively spliced isoforms having different temporal expression patterns during development. NF participates in axon subcellular targeting and synapse formation; however, little is known of the functions of the different isoforms. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand.	95
409460	cd05876	Ig3_L1-CAM	Third immunoglobulin (Ig)-like domain of the L1 cell adhesion molecule (CAM). The members here are composed of the third immunoglobulin (Ig)-like domain of the L1 cell adhesion molecule (CAM). L1 belongs to the L1 subfamily of cell adhesion molecules (CAMs) and is comprised of an extracellular region having six Ig-like domains, five fibronectin type III domains, a transmembrane region and an intracellular domain. L1 is primarily expressed in the nervous system and is involved in its development and function. L1 is associated with an X-linked recessive disorder, X-linked hydrocephalus, MASA syndrome, or spastic paraplegia type 1, that involves abnormalities of axonal growth. This group also contains the chicken neuron-glia cell adhesion molecule, Ng-CAM.	83
409461	cd05877	Ig_LP_like	Immunoglobulin (Ig)-like domain of human cartilage link protein (LP), and similar domains. The members here are composed of the immunoglobulin (Ig)-like domain similar to that found in human cartilage link protein (LP; also called hyaluronan and proteoglycan link protein). In cartilage, chondroitin-keratan sulfate proteoglycan (CSPG), aggrecan, forms cartilage link protein stabilized aggregates with hyaluronan (HA). These aggregates contribute to the tissue's load bearing properties. Aggregates having other CSPGs substituting for aggrecan may contribute to the structural integrity of many different tissues. Members of the vertebrate HPLN (hyaluronan/HA and proteoglycan binding link) protein family are physically linked adjacent to CSPG genes.	117
409462	cd05878	Ig_Aggrecan_like	Immunoglobulin (Ig)-like domain of the aggrecan-like chondroitin sulfate proteoglycan core protein (CSPG). The members here are composed of the immunoglobulin (Ig)-like domain of the aggrecan-like chondroitin sulfate proteoglycan core proteins (CSPGs). Included in this group are the Ig domains of other CSPGs: versican, and neurocan. In CSPGs, this Ig-like domain is followed by hyaluronan (HA)-binding tandem repeats, and a C-terminal region with epidermal growth factor-like, lectin-like, and complement regulatory protein-like domains. Separating these N- and C-terminal regions is a nonhomologous glycosaminoglycan attachment region. In cartilage, aggrecan forms cartilage link protein stabilized aggregates with hyaluronan (HA). These aggregates contribute to the tissue's load bearing properties. Aggrecan and versican have a wide distribution in connective tissue and extracellular matrices. Neurocan is localized almost exclusively in nervous tissue. Aggregates having other CSPGs substituting for aggrecan may contribute to the structural integrity of many different tissues. Members of the vertebrate HPLN (hyaluronan/HA and proteoglycan binding link) protein family are physically linked adjacent to CSPG genes.	125
409463	cd05879	IgV_P0	Immunoglobulin (Ig)-like domain of protein zero (P0). The members here are composed of the immunoglobulin (Ig) domain of protein zero (P0), a myelin membrane adhesion molecule. P0 accounts for over 50% of the total protein in peripheral nervous system (PNS) myelin. P0 is a single-pass transmembrane glycoprotein with a highly basic intracellular domain and an Ig domain.  The extracellular domain of P0 (P0-ED) is similar to the Ig variable domain, carrying one acceptor sequence for N-linked glycosylation. P0 plays a role in membrane adhesion in the spiral wraps of the myelin sheath. The intracellular domain is thought to mediate membrane apposition of the cytoplasmic faces and may, through electrostatic interactions, interact directly with lipid headgroups. It is thought that homophilic interactions of the P0 extracellular domain mediate membrane juxtaposition in the extracellular space of PNS myelin.	117
409464	cd05880	IgV_EVA1	Immunoglobulin (Ig)-like domain of epithelial V-like antigen (EVA) 1. The members here are composed of the immunoglobulin (Ig) domain of epithelial V-like antigen 1 (EVA 1). EVA is also known as myelin protein zero-like 2. EVA is an adhesion molecule and may play a role in the structural organization of the thymus and early lymphocyte development.	116
409465	cd05881	IgV_1_Necl-2	First (N-terminal) immunoglobulin (Ig)-like domain of nectin-like molecule 2; member of the V-set of Ig superfamily (IgSF) domains. The members here are composed of the N-terminal immunoglobulin (Ig)-like domain of nectin-like molecule-2, Necl-2 (also known as cell adhesion molecule 1 (CADM1), SynCAM1, IGSF4A, Tslc1, sgIGSF, and RA175).  Nectin-like molecules have similar domain structures to those of nectins. At least five nectin-like molecules have been identified (Necl-1 - Necl-5). They all have an extracellular region containing three Ig-like domains, a transmembrane region, and a cytoplasmic region. The N-terminal Ig-like domain of the extracellular region, belongs to the V-type subfamily of Ig domains, is essential to cell-cell adhesion, and plays a part in the interaction with the envelope glycoprotein D of various viruses. Necl-2 has Ca(2+)-independent homophilic and heterophilic cell-cell adhesion activity. Necl-2 is expressed in a wide variety of tissues and is a putative tumour suppressor gene, which is downregulated in aggressive neuroblastoma.	94
143290	cd05882	IgV_1_Necl-1	First (N-terminal) immunoglobulin (Ig)-like domain of nectin-like molecule-1 (Necl-1); member of the V-set of Ig superfamily (IgSF) domains. The members here are composed of the N-terminal immunoglobulin (Ig)-like domain of nectin-like molecule-1, Necl-1 (also known as cell adhesion molecule 3 (CADM3), SynCAM2, or IGSF4). Nectin-like molecules have similar domain structures to those of nectins. At least five nectin-like molecules have been identified (Necl-1 - Necl-5). They all have an extracellular region containing three Ig-like domains, a transmembrane region, and a cytoplasmic region. The N-terminal Ig-like domain of the extracellular region belongs to the V-type subfamily of Ig domains, is essential to cell-cell adhesion, and plays a part in the interaction with the envelope glycoprotein D of various viruses. Necl-1 has Ca(2+)-independent homophilic and heterophilic cell-cell adhesion activity. Necl-1 is specifically expressed in neural tissue and is important to the formation of synapses, axon bundles, and myelinated axons.	95
409466	cd05883	IgI_2_Necl-2	Second immunoglobulin (Ig)-like domain of nectin-like molecule 2 (Necl-2); member of the I-set of Ig superfamily domains. The members here are composed of the second immunoglobulin (Ig)-like domain of nectin-like molecule 2 (Necl-2; also known as cell adhesion molecule 1 (CADM1)). Nectin-like molecules (Necls) have similar domain structures to those of nectins. At least five nectin-like molecules have been identified (Necl-1 through Necl-5). These have an extracellular region containing three Ig-like domains, one transmembrane region, and one cytoplasmic region. Necl-2 has Ca(2+)-independent homophilic and heterophilic cell-cell adhesion activity. Necl-2 is expressed in a wide variety of tissues and is a putative tumour suppressor gene which is downregulated in aggressive neuroblastoma. Ig domains are likely to participate in ligand binding and recognition.	99
409467	cd05884	IgI_2_Necl-3	Second immunoglobulin (Ig)-like domain of nectin-like molecule-3 (Necl-3); member of the I-set of Ig superfamily domains. The members here are composed of the second immunoglobulin (Ig)-like domain of nectin-like molecule-3 (Necl-3; also known as cell adhesion molecule 2 (CADM2)). Nectin-like molecules have similar domain structures to those of nectins. At least five nectin-like molecules have been identified (Necl-1 through Necl-5). These have an extracellular region containing three Ig-like domains, one transmembrane region, and one cytoplasmic region. Necl-3 has been shown to accumulate in tissues of the central and peripheral nervous system where it is expressed in ependymal cells and myelinated axons.  It is observed at the interface between the axon shaft and the myelin sheath. Ig domains are likely to participate in ligand binding and recognition.	104
409468	cd05885	IgI_2_Necl-4	Second immunoglobulin (Ig)-like domain of nectin-like molecule-4  (Necl-4); member of the I-set of Ig superfamily domains. The members here are composed of the second immunoglobulin (Ig)-like domain of nectin-like molecule-4  (Necl-4; also known as cell adhesion molecule 4 (CADM4)). Nectin-like molecules have similar domain structures to those of nectins. At least five nectin-like molecules have been identified (Necl-1-Necl-5). These have an extracellular region containing three Ig-like domains, one transmembrane region, and one cytoplasmic region. Ig domains are likely to participate in ligand binding and recognition. Necl-4 is expressed on Schwann cells, and plays a key part in initiating peripheral nervous system (PNS) myelination.  In injured peripheral nerve cells, the mRNA signal for both Necl-4 and Necl-5 was observed to be elevated.  Necl-4 participates in cell-cell adhesion and is proposed to play a role in tumor suppression.	100
409469	cd05886	IgV_1_Nectin-1_like	First immunoglobulin variable (IgV) domain of nectin-1, and similar domains. The members here are composed of the first immunoglobulin (Ig) domain of nectin-1 (also known as poliovirus receptor related protein 1 (PVRL1) or cluster of differentiation (CD) 111). Nectin-1 belongs to the nectin family comprised of four transmembrane glycoproteins (nectins-1 through -4). Nectins are synaptic cell adhesion molecules (CAMs) which facilitate adhesion and signaling at various intracellular junctions. Nectins form homophilic cis-dimers, followed by homophilic and heterophilic trans-dimers involved in cell-cell adhesion. In addition, nectins heterophilically trans-interact with other CAMs such as nectin-like molecules (Necls); nectin-1, for example, has been shown to trans-interact with Necl-1. Nectins also interact with various other proteins, including the actin filament (F-actin)-binding protein, afadin. Mutation in the human nectin-1 gene is associated with cleft lip/palate ectodermal dysplasia syndrome (CLPED1). Nectin-1 is a major receptor for herpes simplex virus through interaction with the viral envelope glycoprotein D.	113
409470	cd05887	IgV_1_Nectin-3_like	First immunoglobulin variable (IgV) domain of nectin-3 (also known as poliovirus receptor related protein 3), and similar domains. The members here are composed of the first immunoglobulin (Ig) domain of nectin-3 (also known as poliovirus receptor related protein 3 (PVRL3) or cluster of differentiation (CD) 113). Nectin-3 belongs to the nectin family comprised of four transmembrane glycoproteins (nectins-1 through -4). Nectins are synaptic cell adhesion molecules (CAMs) which participate in adhesion and signaling at various intracellular junctions. Nectins form homophilic cis-dimers, followed by homophilic and heterophilic trans-dimers involved in cell-cell adhesion. For example, during spermatid development, the nectin-3,-2 trans-interaction is required for the formation of Sertoli cell-spermatid junctions in testis, and during morphogenesis of the ciliary body, the nectin-3,-1 trans-interaction is important for apex-apex adhesion between the pigment and non-pigment layers of the ciliary epithelia. Nectins also heterophilically trans-interact with other CAMs such as nectin-like molecules (Necls); nectin-3 for example, trans-interacts with Necl-5, regulating cell movement and proliferation. Other proteins with which nectin-3 interacts include the actin filament-binding protein, afadin, integrin alpha-beta3, Par-3, and PDGF receptor; its interaction with PDGF receptor regulates the latter's signaling for anti-apoptosis.	110
409471	cd05888	IgV_1_Nectin-4_like	First immunoglobulin (Ig) domain of nectin-4, and similar domains. The members here are composed of the first immunoglobulin (Ig) domain of nectin-4 (also known as poliovirus receptor related protein 4 or LNIR receptor). Nectin-4 belongs to the nectin family, which is comprised of four transmembrane glycoproteins (nectins-1 through -4). Nectins are synaptic cell adhesion molecules (CAMs) which participate in adhesion and signaling at various intracellular junctions. Nectins form homophilic cis-dimers, followed by homophilic and heterophilic trans-dimers involved in cell-cell adhesion. For example nectin-4 trans-interacts with nectin-1. Nectin-4 has also been shown to interact with the actin filament-binding protein, afadin. Unlike the other nectins, which are widely expressed in adult tissues, nectin-4 is mainly expressed during embryogenesis, and is not detected in normal adult tissue or in serum. Nectin-4 is re-expressed in breast carcinoma, and patients having metastatic breast cancer have a circulating form of nectin-4 formed from the ectodomain	108
409472	cd05889	IgV_1_DNAM-1_like	First immunoglobulin variable (IgV) domain of DNAX accessory molecule 1, and similar domains. The members here are composed of the first immunoglobulin (Ig) domain of DNAX accessory molecule 1 (DNAM-1, also known as CD226). DNAM-1 is a transmembrane protein having two Ig-like domains. It is an adhesion molecule which plays a part in tumor-directed cytotoxicity and adhesion in natural killer (NK) cells and T lymphocytes. It has been shown to regulate the NK cell killing of several tumor types, including myeloma cells and ovarian carcinoma cells. DNAM-1 interacts specifically with poliovirus receptor (PVR; CD155) and nectin -2 (CD211), other members of the Ig superfamily. DNAM-1 is expressed in most peripheral T cells, NK cells, monocytes and a subset of B lymphocytes.	111
143298	cd05890	IgC1_2_Nectin-1_like	Second immunoglobulin (Ig) domain of nectin-1, and similar domains; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the second immunoglobulin (Ig) domain of nectin-1 (also known as poliovirus receptor related protein 1, or cluster of differentiation (CD) 111). Nectin-1 belongs to the nectin family comprised of four transmembrane glycoproteins (nectin-1 through -4). Nectins are synaptic cell adhesion molecules (CAMs) which facilitate adhesion and signaling at various intracellular junctions. Nectins form homophilic cis-dimers, followed by homophilic and heterophilic trans-dimers involved in cell-cell adhesion. Nectins also heterophilically trans-interact with other CAMs such as nectin-like molecules (Necls); nectin-1 for example, has been shown to trans-interact with Necl-1. Nectins also interact with various other proteins, including the actin filament (F-actin)-binding protein, afadin. Mutation in the human nectin-1 gene is associated with cleft lip/palate ectodermal dysplasia syndrome (CLPED1). Nectin-1 is a major receptor for herpes simplex virus through interaction with the viral envelope glycoprotein D.	98
143299	cd05891	IgI_M-protein_C	C-terminal immunoglobulin (Ig)-like domain of M-protein; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the C-terminal immunoglobulin (Ig)-like domain of M-protein (also known as myomesin-2). M-protein is a structural protein localized to the M-band, a transverse structure in the center of the sarcomere, and is a candidate for M-band bridges. M-protein is modular consisting mainly of repetitive IG-like and fibronectin type III (FnIII) domains and has a muscle-type specific expression pattern. M-protein is present in fast fibers.	92
409473	cd05892	IgI_Myotilin_C	C-terminal immunoglobulin (Ig)-like domain of myotilin; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the C-terminal immunoglobulin (Ig)-like domain of myotilin. Myotilin belongs to the palladin-myotilin-myopalladin family. Proteins belonging to the latter family contain multiple Ig-like domains and function as scaffolds, modulating the actin cytoskeleton. Myotilin is most abundant in skeletal and cardiac muscle and is involved in maintaining sarcomere integrity. It binds to alpha-actinin, filamin, and actin. Mutations in myotilin lead to muscle disorders. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand.	92
409474	cd05893	IgI_1_Palladin_C	First C-terminal immunoglobulin (Ig)-like domain of palladin; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the C-terminal immunoglobulin (Ig)-like domain of palladin. Palladin belongs to the palladin-myotilin-myopalladin family. Proteins belonging to this family contain multiple Ig-like domains and function as scaffolds, modulating actin cytoskeleton. Palladin binds to alpha-actinin, ezrin, vasodilator-stimulated phosphoprotein VASP, SPIN90 (also known as DIP or mDia interacting protein), and Src. Palladin also binds F-actin directly, via its Ig3 domain. Palladin is expressed as several alternatively spliced isoforms, having various combinations of Ig-like domains, in a cell-type-specific manner. It has been suggested that palladin's different Ig-like domains may be specialized for distinct functions. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand.	92
409475	cd05894	Ig_C5_MyBP-C	C5 immunoglobulin (Ig) domain of cardiac myosin binding protein C (MyBP-C). The members here are composed of the C5 immunoglobulin (Ig) domain of cardiac myosin binding protein C (MyBP-C). MyBP-C consists of repeated domains, Ig and fibronectin type 3, and various linkers. Three isoforms of MyBP-C exist: slow-skeletal (ssMyBP-C), fast-skeletal (fsMyBP-C), and cardiac (cMyBP-C). cMyBP-C has insertions between and inside domains and an additional cardiac-specific Ig domain at the N-terminus. For cMyBP-C, an interaction has been demonstrated between this C5 domain and the Ig C8 domain.	86
409476	cd05895	Ig_Pro_neuregulin-1	Immunoglobulin (Ig)-like domain found in neuregulin (NRG)-1. The members here are composed of the immunoglobulin (Ig)-like domain found in neuregulin (NRG)-1. There are many NRG-1 isoforms which arise from the alternative splicing of mRNA. NRG-1 belongs to the neuregulin gene family which is comprised of four genes. This group represents NRG-1. NRGs are signaling molecules which participate in cell-cell interactions in the nervous system, breast, heart, and other organ systems, and are implicated in the pathology of diseases including schizophrenia, multiple sclerosis, and breast cancer. The NRG-1 protein binds to and activates the tyrosine kinase receptors ErbB3 and ErbB4, initiating signaling cascades. NRG-1 has multiple functions; for example, in the brain it regulates various processes such as radial glia formation and neuronal migration, dendritic development, and expression of neurotransmitter receptors; in the peripheral nervous system, NRG-1 regulates processes such as target cell differentiation and Schwann cell survival.	93
409477	cd05896	Ig1_IL1RAPL-1_like	First immunoglobulin (Ig)-like domain of X-linked interleukin-1 receptor accessory protein-like 1 (IL1RAPL-1), and similar domains. The members here are composed of the first immunoglobulin (Ig)-like domain of X-linked interleukin-1 receptor accessory protein-like 1 (IL1RAPL-1). IL-1 alpha and IL-1 beta are cytokines which participate in the regulation of inflammation, immune responses, and hematopoiesis. These cytokines bind to the IL-1 receptor type 1 (IL1R1), which is activated on additional association with interleukin-1 receptor accessory protein (IL1RAP).  IL-1 also binds a second receptor designated type II (IL1R2). Mature IL1R1 consists of three Ig-like domains, a transmembrane domain, and a large cytoplasmic domain. Mature IL1R2 is organized similarly except that it has a short cytoplasmic domain. The latter does not initiate signal transduction. A naturally occurring cytokine IL-1RA (IL-1 receptor antagonist) is widely expressed and binds to IL-1 receptors, inhibiting the binding of IL-1 alpha and IL-1 beta. IL1RAPL is encoded by a gene on the X-chromosome; this gene is wholly or partially deleted in multiple cases of non-syndromic intellectual disability. This group also contains IL1RAPL-2, which is also encoded by a gene on the X-chromosome and is a candidate for another non-syndromic intellectual disability locus.	105
409478	cd05897	Ig2_IL1R2_like	Second immunoglobulin (Ig)-like domain of interleukin-1 receptor-2 (IL1R2), and similar domains. The members here are composed of the second immunoglobulin (Ig)-like domain of interleukin-1 receptor-2 (IL1R2). IL-1 alpha and IL-1 beta are cytokines which participate in the regulation of inflammation, immune responses, and hematopoiesis. These cytokines bind to the IL-1 receptor type 1 (IL1R1), which is activated on additional association with interleukin-1 receptor accessory protein (IL1RAP).  IL-1 also binds the IL-1 receptor, type II (IL1R2) represented in this group. Mature IL1R2 consists of three IG-like domains, a transmembrane domain, and a short cytoplasmic domain. It lacks the large cytoplasmic domain of mature IL1R1 and does not initiate signal transduction. A naturally occurring cytokine IL-1RA (IL-1 receptor antagonist) is widely expressed and binds to IL-1 receptors, inhibiting the binding of IL-1 alpha and IL-1 beta.	95
409479	cd05898	IgI_5_KIRREL3	Fifth immunoglobulin (Ig)-like domain of Kirrel (kin of irregular chiasm-like) 3 protein; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the fifth immunoglobulin (Ig)-like domain of Kirrel (kin of irregular chiasm-like) 3 protein (also known as Neph2). This protein has five Ig-like domains, one transmembrane domain, and a cytoplasmic tail. Included in this group is mammalian Kirrel (Neph1). These proteins contain multiple Ig domains, have properties of cell adhesion molecules, and are important in organ development. Neph1 and 2 may mediate axonal guidance and synapse formation in certain areas of the CNS. In the kidney they participate in the formation of the slit diaphragm.	98
409480	cd05899	IgV_TCR_beta	Immunoglobulin (Ig) variable (V) domain of T-cell receptor (TCR) beta chain. The members here are composed of the immunoglobulin (Ig) variable domain of the beta chain of alpha/beta T-cell antigen receptors (TCRs). TCRs mediate antigen recognition by T lymphocytes, and are composed of alpha and beta, or gamma and delta, polypeptide chains with variable (V) and constant (C) regions. This group includes the variable domain of the alpha chain of alpha/beta TCRs. Alpha/beta TCRs recognize antigen as peptide fragments presented by major histocompatibility complex (MHC) molecules. The variable domain of TCRs is responsible for antigen recognition, and is located at the N-terminus of the receptor.  Gamma/delta TCRs recognize intact protein antigens directly without antigen processing and recognize MHC independently of the bound peptide. Members of this group contain standard Ig superfamily V-set AGFCC'C"/DEB domain topology.	110
409481	cd05900	Ig_Aggrecan	Immunoglobulin (Ig)-like domain of the chondroitin sulfate proteoglycan core protein (CSPG), aggrecan. The members here are composed of the immunoglobulin (Ig)-like domain of the chondroitin sulfate proteoglycan core protein (CSPG), aggrecan. In CSPGs, the Ig-like domain is followed by hyaluronan (HA)-binding tandem repeats, and a C-terminal region with epidermal growth factor-like, lectin-like, and complement regulatory protein-like domains. Separating these N- and C-terminal regions is a nonhomologous glycosaminoglycan attachment region. In cartilage, aggrecan forms cartilage link protein stabilized aggregates with HA. These aggregates contribute to the tissue's load bearing properties. Aggrecan has a wide distribution in connective tissue and extracellular matrices. Members of the vertebrate HPLN (hyaluronan/HA and proteoglycan binding link) protein family are physically linked adjacent to CSPG genes.	123
409482	cd05901	Ig_Versican	Immunoglobulin (Ig)-like domain of the chondroitin sulfate proteoglycan core protein (CSPG), versican. The members here are composed of the immunoglobulin (Ig)-like domain of the chondroitin sulfate proteoglycan core protein (CSPG), versican. In CSPGs, the Ig-like domain is followed by hyaluronan (HA)-binding tandem repeats, and a C-terminal region with epidermal growth factor-like, lectin-like, and complement regulatory protein-like domains. Separating these N- and C-terminal regions is a nonhomologous glycosaminoglycan attachment region. In cartilage, the CSPG aggrecan (not included in this group) forms cartilage link protein stabilized aggregates with HA. These aggregates contribute to the tissue's load bearing properties. Like aggrecan, versican has a wide distribution in connective tissue and extracellular matrices. Aggregates having other CSPGs substituting for aggrecan may contribute to the structural integrity of many different tissues. Members of the vertebrate HPLN (hyaluronan/HA and proteoglycan binding link) protein family are physically linked adjacent to CSPG genes.	128
409483	cd05902	Ig_Neurocan	Immunoglobulin (Ig)-like domain of the chondroitin sulfate proteoglycan core protein (CSPG), neurocan. The members here are composed of the immunoglobulin (Ig)-like domain of the chondroitin sulfate proteoglycan core protein (CSPG), neurocan. In CSPGs, the Ig-like domain is followed by hyaluronan (HA)-binding tandem repeats, and a C-terminal region with epidermal growth factor-like, lectin-like, and complement regulatory protein-like domains. Separating these N- and C-terminal regions is a nonhomologous glycosaminoglycan attachment region. In cartilage, the CSPG aggrecan (not included in this group) forms cartilage link protein stabilized aggregates with HA. These aggregates contribute to the tissue's load bearing properties. Unlike aggrecan which is widely distributed in connective tissue and extracellular matrices, neurocan is localized almost exclusively in nervous tissue. Aggregates having other CSPGs substituting for aggrecan may contribute to the structural integrity of many different tissues. Members of the vertebrate HPLN (hyaluronan/HA and proteoglycan binding link) protein family are physically linked adjacent to CSPG genes.	121
341229	cd05903	CHC_CoA_lg	Cyclohexanecarboxylate-CoA ligase (also called cyclohex-1-ene-1-carboxylate:CoA ligase). Cyclohexanecarboxylate-CoA ligase activates the aliphatic ring compound, cyclohexanecarboxylate, for degradation. It catalyzes the synthesis of cyclohexanecarboxylate-CoA thioesters in a two-step reaction involving the formation of cyclohexanecarboxylate-AMP anhydride, followed by the nucleophilic substitution of AMP by CoA.	437
341230	cd05904	4CL	4-Coumarate-CoA Ligase (4CL). 4-Coumarate:coenzyme A ligase is a key enzyme in the phenylpropanoid metabolic pathway for monolignol and flavonoid biosynthesis. It catalyzes the synthesis of hydroxycinnamate-CoA thioesters in a two-step reaction, involving the formation of hydroxycinnamate-AMP anhydride and the nucleophilic substitution of AMP by CoA. The phenylpropanoid pathway is one of the most important secondary metabolism pathways in plants and hydroxycinnamate-CoA thioesters are the precursors of lignin and other important phenylpropanoids.	505
341231	cd05905	Dip2	Disco-interacting protein 2 (Dip2). Dip2 proteins show sequence similarity to other members of the adenylate forming enzyme family, including insect luciferase, acetyl CoA ligases and the adenylation domain of nonribosomal peptide synthetases (NRPS). However, its function may have diverged from other members of the superfamily. In mouse embryo, Dip2 homolog A plays an important role in the development of both vertebrate and invertebrate nervous systems. Dip2A appears to regulate cell growth and the arrangement of cells in organs. Biochemically, Dip2A functions as a receptor of FSTL1, an extracellular glycoprotein, and may play a role as a cardiovascular protective agent.	571
341232	cd05906	A_NRPS_TubE_like	The adenylation domain (A domain) of a family of nonribosomal peptide synthetases (NRPSs) synthesizing toxins and antitumor agents. The adenylation (A) domain of NRPS recognizes a specific amino acid or hydroxy acid and activates it as an (amino)-acyl adenylate by hydrolysis of ATP. The activated acyl moiety then forms a thioester to the enzyme-bound cofactor phosphopantetheine of a peptidyl carrier protein domain. This family includes NRPSs that synthesize toxins and antitumor agents; for example, TubE for Tubulysine, CrpA for cryptophycin, TdiA for terrequinone A, KtzG for kutzneride, and Vlm1/Vlm2 for Valinomycin. Nonribosomal peptide synthetases are large multifunctional enzymes which synthesize many therapeutically useful peptides. NRPS has a distinct modular structure in which each module is responsible for the recognition, activation, and, in some cases, modification of a single amino acid residue of the final peptide product. The modules can be subdivided into domains that catalyze specific biochemical reactions.	540
341233	cd05907	VL_LC_FACS_like	Long-chain fatty acid CoA synthetases and Bubblegum-like very long-chain fatty acid CoA synthetases. This family includes long-chain fatty acid (C12-C20) CoA synthetases and Bubblegum-like very long-chain (>C20) fatty acid CoA synthetases. FACS catalyzes the formation of fatty acyl-CoA in a two-step reaction: the formation of a fatty acyl-AMP molecule as an intermediate, and the formation of a fatty acyl-CoA. Eukaryotes generally have multiple isoforms of LC-FACS genes with multiple splice variants. For example, nine genes are found in Arabidopsis and six genes are expressed in mammalian cells. Drosophila melanogaster mutant bubblegum (BGM) have elevated levels of very-long-chain fatty acids (VLCFA) caused by a defective gene later named bubblegum. The human homolog (hsBG) of bubblegum has been characterized as a very long chain fatty acid CoA synthetase that functions specifically in the brain; hsBG may play a central role in brain VLCFA metabolism and myelinogenesis. Free fatty acids must be "activated" to their CoA thioesters before participating in most catabolic and anabolic reactions.	452
341234	cd05908	A_NRPS_MycA_like	The adenylation domain of nonribosomal peptide synthetases (NRPS) similar to mycosubtilin synthase subunit A (MycA). The adenylation (A) domain of NRPS recognizes a specific amino acid or hydroxy acid and activates it as an (amino)-acyl adenylate by hydrolysis of ATP. The activated acyl moiety then forms a thioester to the enzyme-bound cofactor phosphopantetheine of a peptidyl carrier protein domain. This family includes NRPS similar to mycosubtilin synthase subunit A (MycA). Mycosubtilin, which is characterized by a beta-amino fatty acid moiety linked to the circular heptapeptide Asn-Tyr-Asn-Gln-Pro-Ser-Asn, belongs to the iturin family of lipopeptide antibiotics. The mycosubtilin synthase subunit A (MycA) combines functional domains derived from peptide synthetases, amino transferases, and fatty acid synthases. Nonribosomal peptide synthetases are large multifunctional enzymes that synthesize many therapeutically useful peptides. NRPS has a distinct modular structure in which each module is responsible for the recognition, activation, and, in some cases, modification of a single amino acid residue of the final peptide product. The modules can be subdivided into domains that catalyze specific biochemical reactions.	499
341235	cd05909	AAS_C	C-terminal domain of the acyl-acyl carrier protein synthetase (also called 2-acylglycerophosphoethanolamine acyltransferase, Aas). Acyl-acyl carrier protein synthase (Aas) is a membrane protein responsible for a minor pathway of incorporating exogenous fatty acids into membrane phospholipids. Its in vitro activity is characterized by the ligation of free fatty acids between 8 and 18 carbons in length to the acyl carrier protein sulfydryl group (ACP-SH) in the presence of ATP and Mg2+. However, its in vivo function is as a 2-acylglycerophosphoethanolamine (2-acyl-GPE) acyltransferase. The reaction occurs in two steps: the acyl chain is first esterified to acyl carrier protein (ACP) via a thioester bond, followed by a second step where the acyl chain is transferred to a 2-acyllysophospholipid, thus completing the transacylation reaction. This model represents the C-terminal domain of the enzyme, which belongs to the class I adenylate-forming enzyme family, including acyl-CoA synthetases.	490
341236	cd05910	FACL_like_1	Uncharacterized subfamily of fatty acid CoA ligase (FACL). Fatty acyl-CoA ligases catalyze the ATP-dependent activation of fatty acids in a two-step reaction. The carboxylate substrate first reacts with ATP to form an acyl-adenylate intermediate, which then reacts with CoA to produce an acyl-CoA ester. This is a required step before free fatty acids can participate in most catabolic and anabolic reactions.	457
341237	cd05911	Firefly_Luc_like	Firefly luciferase of light emitting insects and 4-Coumarate-CoA Ligase (4CL). This family contains insect firefly luciferases that share significant sequence similarity to plant 4-coumarate:coenzyme A ligases, despite their functional diversity. Luciferase catalyzes the production of light in the presence of MgATP, molecular oxygen, and luciferin. In the first step, luciferin is activated by acylation of its carboxylate group with ATP, resulting in an enzyme-bound luciferyl adenylate. In the second step, luciferyl adenylate reacts with molecular oxygen, producing an enzyme-bound excited state product (Luc=O*) and releasing AMP. This excited-state product then decays to the ground state (Luc=O), emitting a quantum of visible light.	486
341238	cd05912	OSB_CoA_lg	O-succinylbenzoate-CoA ligase (also known as O-succinylbenzoate-CoA synthase, OSB-CoA synthetase, or MenE). O-succinylbenzoic acid-CoA synthase catalyzes the coenzyme A (CoA)- and ATP-dependent conversion of o-succinylbenzoic acid to o-succinylbenzoyl-CoA. The reaction is the fourth step of the biosynthesis pathway of menaquinone (vitamin K2). In certain bacteria, menaquinone is used during fumarate reduction in anaerobic respiration. In cyanobacteria, the product of the menaquinone pathway is phylloquinone (2-methyl-3-phytyl-1,4-naphthoquinone), a molecule used exclusively as an electron transfer cofactor in Photosystem 1. In green sulfur bacteria and heliobacteria, menaquinones are used as loosely bound secondary electron acceptors in the photosynthetic reaction center.	411
341239	cd05913	PaaK	Phenylacetate-CoA ligase (also known as PaaK). PaaK catalyzes the first step in the aromatic degradation pathway, by converting phenylacetic acid (PA) into phenylacetyl-CoA (PA-CoA). Phenylacetate-CoA ligase has been found in proteobacteria as well as gram positive prokaryotes. The enzyme is specifically induced after aerobic growth in a chemically defined medium containing PA or phenylalanine (Phe) as the sole carbon source. PaaKs are members of the adenylate-forming enzyme (AFE) family. However, sequence comparison reveals divergent features of PaaK with respect to the superfamily, including a novel N-terminal sequence.	425
341240	cd05914	LC_FACL_like	Uncharacterized subfamily of fatty acid CoA ligase (FACL). The members of this family are bacterial long-chain fatty acid CoA synthetase, most of which are as yet uncharacterized. LC-FACS catalyzes the formation of fatty acyl-CoA in a two-step reaction: the formation of a fatty acyl-AMP molecule as an intermediate, and the formation of a fatty acyl-CoA. Free fatty acids must be "activated" to their CoA thioesters before participating in most catabolic and anabolic reactions.	463
213283	cd05915	ttLC_FACS_like	Fatty acyl-CoA synthetases similar to LC-FACS from Thermus thermophilus. This family includes fatty acyl-CoA synthetases that can activate medium-chain to long-chain fatty acids. They catalyze the ATP-dependent acylation of fatty acids in a two-step reaction. The carboxylate substrate first reacts with ATP to form an acyl-adenylate intermediate, which then reacts with CoA to produce an acyl-CoA ester. Fatty acyl-CoA synthetases are responsible for fatty acid degradation as well as physiological regulation of cellular functions via the production of fatty acyl-CoA esters. The fatty acyl-CoA synthetase from Thermus thermophilus in this family has been shown to activate the long-chain fatty acid myristic acid, while another member in this family, the AlkK protein identified in Pseudomonas oleovorans, targets medium-chain fatty acids. This family also includes an uncharacterized subgroup of FACS.	509
341241	cd05917	FACL_like_2	Uncharacterized subfamily of fatty acid CoA ligase (FACL). Fatty acyl-CoA ligases catalyze the ATP-dependent activation of fatty acids in a two-step reaction. The carboxylate substrate first reacts with ATP to form an acyl-adenylate intermediate, which then reacts with CoA to produce an acyl-CoA ester. This is a required step before free fatty acids can participate in most catabolic and anabolic reactions.	349
341242	cd05918	A_NRPS_SidN3_like	The adenylation (A) domain of siderophore-synthesizing nonribosomal peptide synthetases (NRPS). The adenylation (A) domain of NRPS recognizes a specific amino acid or hydroxy acid and activates it as an (amino) acyl adenylate by hydrolysis of ATP. The activated acyl moiety then forms a thioester to the enzyme-bound cofactor phosphopantetheine of a peptidyl carrier protein domain. This family of siderophore-synthesizing NRPS includes the third adenylation domain of SidN from the endophytic fungus Neotyphodium lolii, ferrichrome siderophore synthetase, HC-toxin synthetase, and enniatin synthase. NRPSs are large multifunctional enzymes which synthesize many therapeutically useful peptides. These natural products include antibiotics, immunosuppressants, plant and animal toxins, and enzyme inhibitors. NRPS has a distinct modular structure in which each module is responsible for the recognition, activation, and in some cases, modification of a single amino acid residue of the final peptide product. The modules can be subdivided into domains that catalyze specific biochemical reactions.	481
341243	cd05919	BCL_like	Benzoate CoA ligase (BCL) and similar adenylate forming enzymes. This family contains benzoate CoA ligase (BCL) and related ligases that catalyze the acylation of benzoate derivatives, 2-aminobenzoate and 4-hydroxybenzoate. Aromatic compounds represent the second most abundant class of organic carbon compounds after carbohydrates. Xenobiotic aromatic compounds are also a major class of man-made pollutants. Some bacteria use benzoate as the sole source of carbon and energy through benzoate degradation. Benzoate degradation starts with its activation to benzoyl-CoA by benzoate CoA ligase. The reaction catalyzed by benzoate CoA ligase proceeds via a two-step process; the first ATP-dependent step forms an acyl-AMP intermediate, and the second step forms the acyl-CoA ester with release of the AMP.	436
341244	cd05920	23DHB-AMP_lg	2,3-dihydroxybenzoate-AMP ligase. 2,3-dihydroxybenzoate-AMP ligase activates 2,3-dihydroxybenzoate (DHB) by ligation of AMP from ATP with the release of pyrophosphate. However, it can also catalyze the ATP-PPi exchange for 2,3-DHB analogs, such as salicylic acid (o-hydroxybenzoate), as well as 2,4-DHB and 2,5-DHB, but with less efficiency. Proteins in this family are the stand-alone adenylation components of non-ribosomal peptide synthases (NRPSs) involved in the biosynthesis of siderophores, which are low molecular weight iron-chelating compounds synthesized by many bacteria to aid in the acquisition of this vital trace element. In Escherichia coli, the 2,3-dihydroxybenzoate-AMP ligase is called EntE, the adenylation component of the enterobactin NRPS system.	482
341245	cd05921	FCS	Feruloyl-CoA synthetase (FCS). Feruloyl-CoA synthetase is an essential enzyme in the ferulic acid degradation pathway and enables some proteobacteria to grow on media containing ferulic acid as the sole carbon source. It catalyzes the transfer of CoA to the carboxyl group of ferulic acid, which then forms feruloyl-CoA in the presence of ATP and Mg2+. The resulting feruloyl-CoA is further degraded to vanillin and acetyl-CoA. Feruloyl-CoA synthetase (FCS) is a subfamily of the adenylate-forming enzymes superfamily.	561
341246	cd05922	FACL_like_6	Uncharacterized subfamily of fatty acid CoA ligase (FACL). Fatty acyl-CoA ligases catalyze the ATP-dependent activation of fatty acids in a two-step reaction. The carboxylate substrate first reacts with ATP to form an acyl-adenylate intermediate, which then reacts with CoA to produce an acyl-CoA ester. This is a required step before free fatty acids can participate in most catabolic and anabolic reactions.	457
341247	cd05923	CBAL	4-Chlorobenzoate-CoA ligase (CBAL). CBAL catalyzes the conversion of 4-chlorobenzoate (4-CB) to 4-chlorobenzoyl-coenzyme A (4-CB-CoA) by the two-step adenylation and thioester-forming reactions. 4-Chlorobenzoate (4-CBA) is an environmental pollutant derived from microbial breakdown of aromatic pollutants, such as polychlorinated biphenyls (PCBs), DDT, and certain herbicides. The 4-CBA degrading pathway converts 4-CBA to the metabolite 4-hydroxybenzoate (4-HBA), allowing some soil-dwelling microbes to utilize 4-CBA as an alternate carbon source. This pathway consists of three chemical steps catalyzed by 4-CBA-CoA ligase, 4-CBA-CoA dehalogenase, and 4-HBA-CoA thioesterase in sequential reactions.	493
341248	cd05924	FACL_like_5	Uncharacterized subfamily of fatty acid CoA ligase (FACL). Fatty acyl-CoA ligases catalyze the ATP-dependent activation of fatty acids in a two-step reaction. The carboxylate substrate first reacts with ATP to form an acyl-adenylate intermediate, which then reacts with CoA to produce an acyl-CoA ester. This is a required step before free fatty acids can participate in most catabolic and anabolic reactions.	364
341249	cd05926	FACL_fum10p_like	Subfamily of fatty acid CoA ligase (FACL) similar to Fum10p of Gibberella moniliformis. FACL catalyzes the formation of fatty acyl-CoA in a two-step reaction: the formation of a fatty acyl-AMP molecule as an intermediate, followed by the formation of a fatty acyl-CoA. This is a required step before free fatty acids can participate in most catabolic and anabolic reactions. Fum10p is a fatty acid CoA ligase involved in the synthesis of fumonisin, a polyketide mycotoxin, in Gibberella moniliformis.	493
341250	cd05927	LC-FACS_euk	Eukaryotic long-chain fatty acid CoA synthetase (LC-FACS). The members of this family are eukaryotic fatty acid CoA synthetases that activate fatty acids with chain lengths of 12 to 20. LC-FACS catalyzes the formation of fatty acyl-CoA in a two-step reaction: the formation of a fatty acyl-AMP molecule as an intermediate, and the formation of a fatty acyl-CoA. This is a required step before free fatty acids can participate in most catabolic and anabolic reactions. Organisms tend to have multiple isoforms of LC-FACS genes with multiple splice variants. For example, nine genes are found in Arabidopsis and six genes are expressed in mammalian cells.	545
341251	cd05928	MACS_euk	Eukaryotic Medium-chain acyl-CoA synthetase (MACS or ACSM). MACS catalyzes the two-step activation of medium chain fatty acids (containing 4-12 carbons). The carboxylate substrate first reacts with ATP to form an acyl-adenylate intermediate, which then reacts with CoA to produce an acyl-CoA ester. The acyl-CoA is a key intermediate in many important biosynthetic and catabolic processes. MACS enzymes are localized to mitochondria. Two murine MACS family proteins are found in liver and kidney. In rodents, a MACS member is detected particularly in the olfactory epithelium and is called O-MACS. O-MACS demonstrates substrate preference for the fatty acid lengths of C6-C12.	530
341252	cd05929	BACL_like	Bacterial Bile acid CoA ligases and similar proteins. Bile acid-Coenzyme A ligase catalyzes the formation of bile acid-CoA conjugates in a two-step reaction: the formation of a bile acid-AMP molecule as an intermediate, followed by the formation of a bile acid-CoA. This ligase requires a bile acid with a free carboxyl group, ATP, Mg2+, and CoA for synthesis of the final bile acid-CoA conjugate. The bile acid-CoA ligation is believed to be the initial step in the bile acid 7alpha-dehydroxylation pathway in the intestinal bacterium Eubacterium sp.	473
341253	cd05930	A_NRPS	The adenylation domain of nonribosomal peptide synthetases (NRPS). The adenylation (A) domain of NRPS recognizes a specific amino acid or hydroxy acid and activates it as an (amino) acyl adenylate by hydrolysis of ATP. The activated acyl moiety then forms a thioester bond to the enzyme-bound cofactor phosphopantetheine of a peptidyl carrier protein domain. NRPSs are large multifunctional enzymes which synthesize many therapeutically useful peptides in bacteria and fungi via a template-directed, nucleic acid independent nonribosomal mechanism. These natural products include antibiotics, immunosuppressants, plant and animal toxins, and enzyme inhibitors. NRPS has a distinct modular structure in which each module is responsible for the recognition, activation, and in some cases, modification of a single amino acid residue of the final peptide product. The modules can be subdivided into domains that catalyze specific biochemical reactions.	444
341254	cd05931	FAAL	Fatty acyl-AMP ligase (FAAL). FAAL belongs to the class I adenylate forming enzyme family and is homologous to fatty acyl-coenzyme A (CoA) ligases (FACLs). However, FAALs produce only the acyl adenylate and are unable to perform the thioester-forming reaction, while FACLs perform a two-step catalytic reaction; AMP ligation followed by CoA ligation using ATP and CoA as cofactors. FAALs have insertion motifs between the N-terminal and C-terminal subdomains that distinguish them from the FACLs. This insertion motif precludes the binding of CoA, thus preventing CoA ligation. It has been suggested that the acyl adenylates serve as substrates for multifunctional polyketide synthases to permit synthesis of complex lipids such as phthiocerol dimycocerosate, sulfolipids, mycolic acids, and mycobactin.	547
341255	cd05932	LC_FACS_bac	Bacterial long-chain fatty acid CoA synthetase (LC-FACS), including Marinobacter hydrocarbonoclasticus isoprenoid Coenzyme A synthetase. The members of this family are bacterial long-chain fatty acid CoA synthetase. Marinobacter hydrocarbonoclasticus isoprenoid Coenzyme A synthetase in this family is involved in the synthesis of isoprenoid wax ester storage compounds when grown on phytol as the sole carbon source. LC-FACS catalyzes the formation of fatty acyl-CoA in a two-step reaction: the formation of a fatty acyl-AMP molecule as an intermediate, and the formation of a fatty acyl-CoA. Free fatty acids must be "activated" to their CoA thioesters before participating in most catabolic and anabolic reactions.	508
341256	cd05933	ACSBG_like	Bubblegum-like very long-chain fatty acid CoA synthetase (VL-FACS). This family of very long-chain fatty acid CoA synthetase is named bubblegum because Drosophila melanogaster mutant bubblegum (BGM) has elevated levels of very-long-chain fatty acids (VLCFA) caused by a defective gene of this family. The human homolog (hsBG) has been characterized as a very long chain fatty acid CoA synthetase that functions specifically in the brain; hsBG may play a central role in brain VLCFA metabolism and myelinogenesis. VL-FACS is involved in the first reaction step of very long chain fatty acid degradation. It catalyzes the formation of fatty acyl-CoA in a two-step reaction: the formation of a fatty acyl-AMP molecule as an intermediate, and the formation of a fatty acyl-CoA. Free fatty acids must be "activated" to their CoA thioesters before participating in most catabolic and anabolic reactions.	596
341257	cd05934	FACL_DitJ_like	Uncharacterized subfamily of fatty acid CoA ligase (FACL). Fatty acyl-CoA ligases catalyze the ATP-dependent activation of fatty acids in a two-step reaction. The carboxylate substrate first reacts with ATP to form an acyl-adenylate intermediate, which then reacts with CoA to produce an acyl-CoA ester. This is a required step before free fatty acids can participate in most catabolic and anabolic reactions. Members of this family include DitJ from Pseudomonas and similar proteins.	422
341258	cd05935	LC_FACS_like	Putative long-chain fatty acid CoA ligase. The members of this family are putative long-chain fatty acyl-CoA synthetases, which catalyze the ATP-dependent activation of fatty acids in a two-step reaction. The carboxylate substrate first reacts with ATP to form an acyl-adenylate intermediate, which then reacts with CoA to produce an acyl-CoA ester. Fatty acyl-CoA synthetases are responsible for fatty acid degradation as well as physiological regulation of cellular functions via the production of fatty acyl-CoA esters.	430
341259	cd05936	FC-FACS_FadD_like	Prokaryotic long-chain fatty acid CoA synthetases similar to Escherichia coli FadD. This subfamily of the AMP-forming adenylation family contains Escherichia coli FadD and similar prokaryotic fatty acid CoA synthetases. FadD was characterized as a long-chain fatty acid CoA synthetase. The gene fadD is regulated by the fatty acid regulatory protein FadR. Fatty acid CoA synthetase catalyzes the formation of fatty acyl-CoA in a two-step reaction: the formation of a fatty acyl-AMP molecule as an intermediate, followed by the formation of a fatty acyl-CoA. This is a required step before free fatty acids can participate in most catabolic and anabolic reactions.	468
341260	cd05937	FATP_chFAT1_like	Uncharacterized subfamily of bifunctional fatty acid transporter/very-long-chain acyl-CoA synthetase in fungi. Fatty acid transport protein (FATP) transports long-chain or very-long-chain fatty acids across the plasma membrane. FATPs also have fatty acid CoA synthetase activity, thus playing dual roles as fatty acid transporters and fatty acid activation enzymes. FATPs are the key players in the trafficking of exogenous fatty acids into the cell and in intracellular fatty acid homeostasis. Members of this family are fungal FATPs, including FAT1 from Cochliobolus heterostrophus.	468
341261	cd05938	hsFATP2a_ACSVL_like	Fatty acid transport proteins (FATP) including hsFATP2, hsFATP5, and hsFATP6, and similar proteins. Fatty acid transport proteins (FATP) of this family transport long-chain or very-long-chain fatty acids across the plasma membrane. At least five copies of FATPs are identified in mammalian cells. This family includes hsFATP2, hsFATP5, and hsFATP6, and similar proteins. Each FATP has unique patterns of tissue distribution. These FATPs also have fatty acid CoA synthetase activity, thus playing dual roles as fatty acid transporters and fatty acid activation enzymes. The hsFATP proteins exist in two splice variants; the b variant, lacking exon 3, has no acyl-CoA synthetase activity. FATPs are key players in the trafficking of exogenous fatty acids into the cell and in intracellular fatty acid homeostasis.	537
341262	cd05939	hsFATP4_like	Fatty acid transport proteins (FATP), including FATP4 and FATP1, and similar proteins. Fatty acid transport protein (FATP) transports long-chain or very-long-chain fatty acids across the plasma membrane. At least five copies of FATPs are identified in mammalian cells. This family includes FATP4, FATP1, and homologous proteins. Each FATP has unique patterns of tissue distribution. FATP4 is mainly expressed in the brain, testis, colon and kidney. FATPs also have fatty acid CoA synthetase activity, thus playing dual roles as fatty acid transporters and fatty acid activation enzymes. FATPs are the key players in the trafficking of exogenous fatty acids into the cell and in intracellular fatty acid homeostasis.	474
341263	cd05940	FATP_FACS	Fatty acid transport proteins (FATP) play dual roles as fatty acid transporters and fatty acid activation enzymes. Fatty acid transport protein (FATP) transports long-chain or very-long-chain fatty acids across the plasma membrane. FATPs also have fatty acid CoA synthetase activity, thus playing dual roles as fatty acid transporters and fatty acid activation enzymes. At least five copies of FATPs are identified in mammalian cells. This family also includes prokaryotic FATPs. FATPs are the key players in the trafficking of exogenous fatty acids into the cell and in intracellular fatty acid homeostasis.	449
341264	cd05941	MCS	Malonyl-CoA synthetase (MCS). MCS catalyzes the formation of malonyl-CoA in a two-step reaction consisting of the adenylation of malonate with ATP, followed by malonyl transfer from malonyl-AMP to CoA. Malonic acid and its derivatives are the building blocks of polyketides and malonyl-CoA serves as the substrate of polyketide synthases. Malonyl-CoA synthetase has broad substrate tolerance and can activate a variety of malonyl acid derivatives. MCS may play an important role in biosynthesis of polyketides, the important secondary metabolites with therapeutic and agrochemical utility.	442
341265	cd05943	AACS	Acetoacetyl-CoA synthetase (acetoacetate-CoA ligase, AACS). AACS is a cytosolic ligase that specifically activates acetoacetate to its coenzyme A ester by a two-step reaction. Acetoacetate first reacts with ATP to form an acyl-adenylate intermediate, which then reacts with CoA to produce an acyl-CoA ester. This is the first step of the mevalonate pathway of isoprenoid biosynthesis via isopentenyl diphosphate. Isoprenoids are a large class of compounds found in all living organisms. AACS is widely distributed in bacteria, archaea and eukaryotes. In bacteria, AACS is known to exhibit an important role in the metabolism of poly-b-hydroxybutyrate, an intracellular reserve of organic carbon and chemical energy by some microorganisms. In mammals, AACS influences the rate of ketone body utilization for the formation of physiologically important fatty acids and cholesterol.	629
341266	cd05944	FACL_like_4	Uncharacterized subfamily of fatty acid CoA ligase (FACL). Fatty acyl-CoA ligases catalyze the ATP-dependent activation of fatty acids in a two-step reaction. The carboxylate substrate first reacts with ATP to form an acyl-adenylate intermediate, which then reacts with CoA to produce an acyl-CoA ester. This is a required step before free fatty acids can participate in most catabolic and anabolic reactions.	359
341267	cd05945	DltA	D-alanine:D-alanyl carrier protein ligase (DltA) and similar proteins. This family includes D-alanyl carrier protein ligase DltA and aliphatic beta-amino acid adenylation enzymes IdnL1 and CmiS6. DltA incorporates D-alanine into teichoic acids in gram-positive bacteria via a two-step process, starting with adenylation of D-alanine that transfers D-alanine to the D-alanyl carrier protein. IdnL1, a short-chain aliphatic beta-amino acid adenylation enzyme, recognizes 3-aminobutanoic acid, and is involved in the synthesis of the macrolactam antibiotic incednine. CmiS6 is a medium-chain beta-amino acid adenylation enzyme that recognizes 3-aminononanoic acid, and is involved in the synthesis of cremimycin, also a macrolactam antibiotic. The adenylation (A) domain of NRPS recognizes a specific amino acid or hydroxy acid and activates it as an (amino) acyl adenylate by hydrolysis of ATP. The activated acyl moiety then forms a thioester bond to the enzyme-bound cofactor phosphopantetheine of a peptidyl carrier protein domain. NRPSs are large multifunctional enzymes which synthesize many therapeutically useful peptides in bacteria and fungi via a template-directed, nucleic acid independent nonribosomal mechanism. These natural products include antibiotics, immunosuppressants, plant and animal toxins, and enzyme inhibitors. NRPS has a distinct modular structure in which each module is responsible for the recognition, activation, and in some cases, modification of a single amino acid residue of the final peptide product. The modules can be subdivided into domains that catalyze specific biochemical reactions.	449
341268	cd05958	ABCL	2-aminobenzoate-CoA ligase (ABCL). ABCL catalyzes the initial step in the 2-aminobenzoate aerobic degradation pathway by activating 2-aminobenzoate to 2-aminobenzoyl-CoA. The reaction is carried out via a two-step process; the first step is ATP-dependent and forms a 2-aminobenzoyl-AMP intermediate, and the second step forms the 2-aminobenzoyl-CoA ester and releases the AMP. 2-Aminobenzoyl-CoA is further converted to 2-amino-5-oxo-cyclohex-1-ene-1-carbonyl-CoA catalyzed by 2-aminobenzoyl-CoA monooxygenase/reductase. ABCL has been purified from cells aerobically grown with 2-aminobenzoate as sole carbon, energy, and nitrogen source, and has been characterized as a monomer.	439
341269	cd05959	BCL_4HBCL	Benzoate CoA ligase (BCL) and 4-Hydroxybenzoate-Coenzyme A Ligase (4-HBA-CoA ligase). Benzoate CoA ligase and 4-hydroxybenzoate-coenzyme A ligase catalyze the first activating step for benzoate and 4-hydroxybenzoate catabolic pathways, respectively. Although these two enzymes share very high sequence homology, they have their own substrate preference. The reaction proceeds via a two-step process; the first ATP-dependent step forms the substrate-AMP intermediate, while the second step forms the acyl-CoA ester, releasing the AMP. Aromatic compounds represent the second most abundant class of organic carbon compounds after carbohydrates. Some bacteria can use benzoic acid or benzenoid compounds as the sole source of carbon and energy through degradation. Benzoate CoA ligase and 4-hydroxybenzoate-Coenzyme A ligase are key enzymes of this process.	508
341270	cd05966	ACS	Acetyl-CoA synthetase (also known as acetate-CoA ligase and acetyl-activating enzyme). Acetyl-CoA synthetase (ACS, EC 6.2.1.1, acetate-CoA ligase or acetate:CoA ligase (AMP-forming)) catalyzes the formation of acetyl-CoA from acetate, CoA, and ATP. Synthesis of acetyl-CoA is carried out in a two-step reaction. In the first step, the enzyme catalyzes the synthesis of acetyl-AMP intermediate from acetate and ATP. In the second step, acetyl-AMP reacts with CoA to produce acetyl-CoA. This enzyme is widely present in all living organisms. The activity of this enzyme is crucial for maintaining the required levels of acetyl-CoA, a key intermediate in many important biosynthetic and catabolic processes. Acetyl-CoA is used in the biosynthesis of glucose, fatty acids, and cholesterol. It can also be used in the production of energy in the citric acid cycle. Eukaryotes typically have two isoforms of acetyl-CoA synthetase, a cytosolic form involved in biosynthetic processes and a mitochondrial form primarily involved in energy generation.	608
341271	cd05967	PrpE	Propionyl-CoA synthetase (PrpE). EC 6.2.1.17: propanoate:CoA ligase (AMP-forming) or propionate-CoA ligase (PrpE) catalyzes the first step of the 2-methylcitric acid cycle for propionate catabolism. It activates propionate to propionyl-CoA in a two-step reaction, which proceeds through a propionyl-AMP intermediate and requires ATP and Mg2+. In Salmonella enterica, the PrpE protein is required for growth on propionate and can substitute for the acetyl-CoA synthetase (Acs) enzyme during growth on acetate. PrpE can also activate acetate, 3HP, and butyrate to their corresponding CoA-thioesters, although with less efficiency.	617
341272	cd05968	AACS_like	Uncharacterized acyl-CoA synthetase subfamily similar to Acetoacetyl-CoA synthetase. This uncharacterized acyl-CoA synthetase family (EC 6.2.1.16, or acetoacetate-CoA ligase or acetoacetate:CoA ligase (AMP-forming)) is highly homologous to acetoacetyl-CoA synthetase. However, the proteins in this family exist only in bacteria and archaea. AACS is a cytosolic ligase that specifically activates acetoacetate to its coenzyme A ester by a two-step reaction. Acetoacetate first reacts with ATP to form an acyl-adenylate intermediate, which then reacts with CoA to produce an acyl-CoA ester. This is the first step of the mevalonate pathway of isoprenoid biosynthesis via isopentenyl diphosphate. Isoprenoids are a large class of compounds found in all living organisms.	610
341273	cd05969	MACS_like_4	Uncharacterized subfamily of Acetyl-CoA synthetase like family (ACS). This family is most similar to acetyl-CoA synthetase. Acetyl-CoA synthetase (ACS) catalyzes the formation of acetyl-CoA from acetate, CoA, and ATP. Synthesis of acetyl-CoA is carried out in a two-step reaction. In the first step, the enzyme catalyzes the synthesis of acetyl-AMP intermediate from acetate and ATP. In the second step, acetyl-AMP reacts with CoA to produce acetyl-CoA. This enzyme is only present in bacteria.	442
341274	cd05970	MACS_AAE_MA_like	Medium-chain acyl-CoA synthetase (MACS) of AAE_MA like. MACS catalyzes the two-step activation of medium chain fatty acids (containing 4-12 carbons). The carboxylate substrate first reacts with ATP to form an acyl-adenylate intermediate, which then reacts with CoA to produce an acyl-CoA ester. This family of MACS enzymes is found in archaea and bacteria. It is represented by the acyl-adenylating enzyme from Methanosarcina acetivorans (AAE_MA). AAE_MA is most active with propionate, butyrate, and the branched analogs: 2-methyl-propionate, butyrate, and pentanoate. The specific activity is weaker for smaller or larger acids.	537
341275	cd05971	MACS_like_3	Uncharacterized subfamily of medium-chain acyl-CoA synthetase (MACS). MACS catalyzes the two-step activation of medium chain fatty acids (containing 4-12 carbons). The carboxylate substrate first reacts with ATP to form an acyl-adenylate intermediate, which then reacts with CoA to produce an acyl-CoA ester. MACS enzymes are localized to mitochondria.	439
341276	cd05972	MACS_like	Medium-chain acyl-CoA synthetase (MACS or ACSM). MACS catalyzes the two-step activation of medium chain fatty acids (containing 4-12 carbons). The carboxylate substrate first reacts with ATP to form an acyl-adenylate intermediate, which then reacts with CoA to produce an acyl-CoA ester. The acyl-CoA is a key intermediate in many important biosynthetic and catabolic processes.	428
341277	cd05973	MACS_like_2	Uncharacterized subfamily of medium-chain acyl-CoA synthetase (MACS). MACS catalyzes the two-step activation of medium chain fatty acids (containing 4-12 carbons). The carboxylate substrate first reacts with ATP to form an acyl-adenylate intermediate, which then reacts with CoA to produce an acyl-CoA ester. MACS enzymes are localized to mitochondria.	437
341278	cd05974	MACS_like_1	Uncharacterized subfamily of medium-chain acyl-CoA synthetase (MACS). MACS catalyzes the two-step activation of medium chain fatty acids (containing 4-12 carbons). The carboxylate substrate first reacts with ATP to form an acyl-adenylate intermediate, which then reacts with CoA to produce an acyl-CoA ester. MACS enzymes are localized to mitochondria.	432
99716	cd05992	PB1	The PB1 domain is a modular domain mediating specific protein-protein interactions which play a role in many critical cell processes, such as osteoclastogenesis, angiogenesis, early cardiovascular development, and cell polarity. A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domains, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domains depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif (an acidic amino acid cluster), type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster. Interactions of PB1 domains with other protein domains have been described as noncanonical PB1 interactions. The PB1 domain module is conserved in amoebas, fungi, animals, and plants.	81
100076	cd06006	R3H_unknown_2	R3H domain of a group of fungal proteins with unknown function. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the domain is predicted to bind ssDNA  or ssRNA in a sequence-specific manner.	59
100077	cd06007	R3H_DEXH_helicase	R3H domain of a group of proteins which also contain a DEXH-box helicase domain, and may function as ATP-dependent DNA or RNA helicases. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the domain is predicted to bind ssDNA or ssRNA in a sequence-specific manner.	59
100116	cd06008	NF-X1-zinc-finger	Presumably a zinc binding domain, which has been shown to bind to DNA in the human nuclear transcriptional repressor NF-X1. The zinc finger can be characterized by the pattern C-X(1-6)-H-X-C-X3-C(H/C)-X(3-4)-(H/C)-X(1-10)-C. The NF-X1 zinc finger co-occurs with atypical RING-finger and R3H domains. Human NF-X1 is involved in the transcriptional repression of major histocompatibility complex class II genes. The drosophila homolog encoded by stc (shuttle craft) plays a role in embryonic development, and the Arabidopsis homologue AtNFXL1 has been shown to function in the response to trichothecene and other defense mechanisms.	49
276963	cd06059	Tubulin	The tubulin superfamily and related homologs. The tubulin superfamily includes five distinct families, the alpha-, beta-, gamma-, delta-, and epsilon-tubulins and a sixth family (zeta-tubulin) which is present only in kinetoplastid protozoa. The alpha- and beta-tubulins are the major components of microtubules, while gamma-tubulin plays a major role in the nucleation of microtubule assembly.  The delta- and epsilon-tubulins are widespread but unlike the alpha, beta, and gamma-tubulins they are not ubiquitous among eukaryotes. The alpha/beta-tubulin heterodimer is the structural subunit of microtubules.  The alpha- and beta-tubulins share 40% amino-acid sequence identity, exist in several isotype forms, and undergo a variety of posttranslational modifications.  The structures of alpha- and beta-tubulin are basically identical: each monomer is formed by a core of two beta-sheets surrounded by alpha-helices. The monomer structure is very compact, but can be divided into three regions based on function: the amino-terminal nucleotide-binding region, an intermediate taxol-binding region and the carboxy-terminal region which probably constitutes the binding surface for motor proteins. Also included in this group is the mitochondrial Misato/DML1 protein family, involved in mitochondrial fusion and in mitochondrial distribution and morphology.	387
276964	cd06060	misato	Misato segment II tubulin-like domain. Human Misato shows similarity with the Tubulin/FtsZ family of GTPases and is localized to the outer membrane of mitochondria. It has a role in mitochondrial fusion and in mitochondrial distribution and morphology. Mutations in its Drosophila homolog (misato) lead to irregular chromosome segregation during mitosis. Deletion of the budding yeast homolog DML1 is lethal and unregulated expression of DML1 leads to mitochondrial dispersion and abnormalities in cell morphology. The Misato/DML1 protein family is conserved from yeast to human, but its exact function is still unknown.	539
100037	cd06061	PurM-like1	AIR synthase (PurM) related protein, subgroup 1 of unknown function. The family of PurM related proteins includes Hydrogen expression/formation protein HypE, AIR synthases, FGAM synthase and Selenophosphate synthetase (SelD). They all contain two conserved domains and seem to dimerize. The N-terminal domain forms the dimer interface and is a putative ATP binding domain.	298
99873	cd06062	H2MP_MemB-H2up	Endopeptidases belonging to membrane-bound hydrogenases group. These hydrogenases transfer electrons from H2 to a cytochrome that is bound to a membrane-located complex coupling electron transfer to transmembrane proton translocation. Endopeptidase HybD from E. coli is well studied in this group. Maturation of [NiFe] hydrogenases includes proteolytic processing of the large subunit, assembly with other subunits, and formation of the nickel metallocenter. Hydrogenase maturation endopeptidase (HybD) cleaves a short C-terminal peptide after a His or an Arg residue in the large subunit (pre-HybC) of hydrogenase 2 (hyb operon) in E. coli. This cleavage is nickel dependent. A variety of endopeptidases belong to this group that are similar in function and sequence homology. They include such proteins as HynC, HoxM, and HupD.	146
99874	cd06063	H2MP_Cyano-H2up	This group of endopeptidases includes HupW enzymes that are specific to the cyanobacterial hydrogenase and are involved in the C-terminal cleavage of the hydrogenase large subunit precursor protein. Cyanobacterial nickel-iron (NiFe)-hydrogenases are found exclusively in the N2-fixing strains and are encoded by hup (hydrogen uptake) genes. These uptake hydrogenases are heterodimers with a large (hupL) and small subunit (hupS) and catalyze the consumption of the H2 produced during N2 fixation. Sequence similarity shows that the putative metal-binding residues are well conserved in this group of hydrogen maturation proteases. This group also includes such proteins as the hydrogenase III from Aquifex aeolicus.	146
99875	cd06064	H2MP_F420-Reduc	Endopeptidases belonging to F420-reducing hydrogenases group. These hydrogenases from methanogens are encoded by the fru, frc, or frh genes. Sequence comparison indicates that fruD and frcD gene products from Methanococcus voltae are similar to HycI protease of Escherichia coli and are putatively involved in the C-terminal processing of large subunits (FruA and FrcA respectively). FrhD (F420 reducing hydrogenase delta subunit) enzyme belongs to the gene cluster of 8-hydroxy-5-deazaflavin (F420) reducing hydrogenase (FRH) from the thermophilic methanogen Methanobacterium thermoautotrophicum delta H. The FrhD subunit is putatively involved in the processing of the coenzyme F420-reducing hydrogenase. It is similar to those frhD genes found in Methanomicrobia and Methanobacteria. It is different from the FrhD conserved domain found in methyl viologen-reducing hydrogenase and F420-non-reducing hydrogenase iron-sulfur subunit D.	150
99876	cd06066	H2MP_NAD-link-bidir	Endopeptidases that belong to the bidirectional NAD-linked hydrogenase group. The endopeptidases in this group are highly specific carboxyl-terminal proteases (HoxW protease) that release a 24-amino-acid peptide from HoxH prior to progression of subunit assembly. These bidirectional hydrogenases are heteropentamers encoded by the hox (hydrogen oxidation) genes, in which the HoxEFU complex shows the diaphorase activity, and HoxYH constitutes the NiFe-hydrogenase.	139
99877	cd06067	H2MP_MemB-H2evol	Endopeptidases belonging to membrane-bound hydrogen evolving hydrogenase group. In hydrogenase 3 from E. coli, the maturation of the large subunit (HycE) requires the cleavage of a C-terminal peptide by the endopeptidase HycI, before the final formation of the [NiFe] metallocenter. HycI protease is a monomer and lacks characteristic signature motifs of serine, zinc, cysteine, or acid proteases and thus its cleavage reaction is not inhibited by conventional inhibitors of serine and metalloproteases. Such hydrogenases as those from Methanosarcina barkeri (EchCE) and Rhodospirillum rubrum (CooLH) also belong to this group of membrane-bound hydrogen evolving hydrogenases. Sequence comparison of the large subunits from related hydrogenases indicates that in contrast to EchE (358 amino acids) and CooH (361 amino acids), the large subunit HycE (569 amino acids) contains an extra carboxy-terminal stretch of 32 amino acids that is cleaved during the maturation process. In the absence of this C-terminal stretch, there is no homolog of endopeptidase HycI found in these two related hydrogenases.	136
99878	cd06068	H2MP_like-1	Putative [NiFe] hydrogenase-specific C-terminal protease. Sequence comparison shows similarity to hydrogenase specific C-terminal endopeptidases, also called Hydrogen Maturation Proteases (H2MP). Maturation of [NiFe] hydrogenases includes formation of the nickel metallocenter, proteolytic processing and assembly with other subunits. Hydrogenase maturation endopeptidases are responsible for the proteolytic processing, liberating a short C-terminal peptide by cleaving after a His or an Arg residue, e.g., HycI (E. coli) is involved in processing of HycE (the large subunit of hydrogenase 3). This cleavage is nickel dependent.	144
99879	cd06070	H2MP_like-2	Putative [NiFe] hydrogenase-specific C-terminal protease. Sequence comparison shows similarity to hydrogenase specific C-terminal endopeptidases, also called Hydrogen Maturation Proteases (H2MP). Maturation of [NiFe] hydrogenases includes formation of the nickel metallocenter, proteolytic processing and assembly with other subunits. Hydrogenase maturation endopeptidases are responsible for the proteolytic processing, liberating a short C-terminal peptide by cleaving after a His or an Arg residue, e.g., HycI (E. coli) is involved in processing of HycE (the large subunit of hydrogenase 3). This cleavage is nickel dependent.	140
100117	cd06071	Beach	BEACH (Beige and Chediak-Higashi) domains, implicated in membrane trafficking, are present in a family of proteins conserved throughout eukaryotes. This group contains human lysosomal trafficking regulator (LYST), LPS-responsive and beige-like anchor (LRBA) and neurobeachin. Disruption of LYST leads to Chediak-Higashi syndrome, characterized by severe immunodeficiency, albinism, poor blood coagulation and neurologic problems. Neurobeachin is a candidate gene linked to autism. LRBA seems to be upregulated in several cancer types. It has been shown that the BEACH domain itself is important for the function of these proteins.	275
99903	cd06080	MUM1_like	Mutated melanoma-associated antigen 1 (MUM-1) is a melanoma-associated antigen (MAA).  MUM-1 belongs to the mutated or aberrantly expressed type of MAAs, along with antigens such as CDK4, beta-catenin, gp100-in4, p15, and N-acetylglucosaminyltransferase V.  It is highly expressed in several types of human cancers.  The PWWP domain, named for a conserved Pro-Trp-Trp-Pro motif, is a small domain consisting of 100-150 amino acids. The PWWP domain is found in numerous proteins that are involved in cell division, growth and differentiation. Most PWWP-domain proteins seem to be nuclear, often DNA-binding, proteins that function as transcription factors regulating a variety of developmental processes.	80
240505	cd06081	KOW_Spt5_1	KOW domain of Spt5, repeat 1. Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases.	38
240506	cd06082	KOW_Spt5_2	KOW domain of Spt5, repeat 2. Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases.	51
240507	cd06083	KOW_Spt5_3	KOW domain of Spt5, repeat 3. Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases.	51
240508	cd06084	KOW_Spt5_4	KOW domain of Spt5, repeat 4. Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases.	43
240509	cd06085	KOW_Spt5_5	KOW domain of Spt5, repeat 5. Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases.	52
240510	cd06086	KOW_Spt5_6	KOW domain of Spt5, repeat 6. Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases.	58
240511	cd06087	KOW_RPS4	KOW motif of Ribosomal Protein S4 (RPS4). RPS4 plays a critical role in the core assembly of the small ribosomal subunit with a KOW motif at its C-terminus. RPS4 also acts as a general transcription antiterminator factor and regulates ribosomal RNA expression level. KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. RPS4 deficiency in humans has been associated with Turner syndrome. Archaeal RPS4 (RPS4e) showed substantial identity to the eukaryotic equivalents RPS4, but the archaeal proteins formed a different complex from the eukaryotic proteins.	55
240512	cd06088	KOW_RPL14	KOW motif of Ribosomal Protein L14. RPL14 is a component of the large ribosomal subunit in both archaea and eukaryotes with a KOW motif at its N-terminus. KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. Auto-antibodies to RPL14 in humans have been associated with systemic lupus erythematosus. Although RPL14 is well conserved, it is not found in all archaea, and therefore it is presumably not essential.	76
240513	cd06089	KOW_RPL26	KOW motif of Ribosomal Protein L26. RPL26 and its bacterial paralogs RPL24 have a KOW motif at their N-terminus. KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. RPL26 makes very minor contributions to the biogenesis, structure, and function of 60S ribosomal subunits. However, RPL24 is essential to generate the first intermediate during 50S ribosomal subunit assembly. RPL26 has an extra-ribosomal function: it enhances p53 translation after DNA damage.	65
240514	cd06090	KOW_RPL27	KOW motif of eukaryotic Ribosomal Protein L27. RPL27e has a KOW motif at its N terminal. KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. 	83
240515	cd06091	KOW_NusG	NusG contains an NGN domain at its N-terminus and KOW motif at its C-terminus. KOW_NusG motif is one of the two domains of N-Utilization Substance G (NusG) a transcription elongation and Rho-termination factor in bacteria and archaea. KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. The eukaryotic ortholog of NusG is Spt5 with multiple KOW motifs at its C-terminus.	56
132768	cd06093	PX_domain	The Phox Homology domain, a phosphoinositide binding module. The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to membranes. Proteins containing PX domains interact with PIs and have been implicated in highly diverse functions such as cell signaling, vesicular trafficking, protein sorting, lipid modification, cell polarity and division, activation of T and B cells, and cell survival. Many members of this superfamily bind phosphatidylinositol-3-phosphate (PI3P) but in some cases, other PIs such as PI4P or PI(3,4)P2, among others, are the preferred substrates. In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction, as in the cases of p40phox, p47phox, and some sorting nexins (SNXs). The PX domain is conserved from yeast to humans and is found in more than 100 proteins. The majority of PX domain-containing proteins are SNXs, which play important roles in endosomal sorting.	106
133158	cd06094	RP_Saci_like	RP_Saci_like, retropepsin family. Retropepsin on retrotransposons with long terminal repeats (LTR) including Saci-1, -2 and -3 of Schistosoma mansoni. Retropepsins are related to fungal and mammalian pepsins. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate peptidases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A.	89
133159	cd06095	RP_RTVL_H_like	Retropepsin of the RTVL_H family of human endogenous retrovirus-like elements. This family includes aspartate proteases from retroelements with LTR (long terminal repeats) including the RTVL_H family of human endogenous retrovirus-like elements. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate peptidases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A.	86
133160	cd06096	Plasmepsin_5	Plasmepsins are a class of aspartic proteinases produced by the Plasmodium parasite. The family contains a group of aspartic proteinases homologous to plasmepsin 5. Plasmepsins are a class of at least 10 enzymes produced by the Plasmodium parasite. Through their haemoglobin-degrading activity, they are an important cause of symptoms in malaria sufferers. This family of enzymes is a potential target for anti-malarial drugs. Plasmepsins are aspartic acid proteases, which means their active site contains two aspartic acid residues. These two aspartic acid residues act respectively as proton donor and proton acceptor, catalyzing the hydrolysis of peptide bonds in proteins. Aspartic proteinases are composed of two structurally similar beta barrel lobes, each lobe contributing an aspartic acid residue to form a catalytic dyad that acts to cleave the substrate peptide bond. The catalytic Asp residues are contained in an Asp-Thr-Gly-Ser/Thr motif in both N- and C-terminal lobes of the enzyme. There are four types of plasmepsins, closely related but varying in the specificity of cleavage site. The name plasmepsin may come from Plasmodium (the organism) and pepsin (a common aspartic acid protease with similar molecular structure). This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A, clan AA).	326
133161	cd06097	Aspergillopepsin_like	Aspergillopepsin_like, aspartic proteases of fungal origin. The members of this family are aspartic proteases of fungal origin, including aspergillopepsin, rhizopuspepsin, endothiapepsin, and rodosporapepsin. The various fungal species in this family may be the most economically important genus of fungi. They may serve as virulence factors or as industrial aids. For example, Aspergillopepsin from A. fumigatus is involved in invasive aspergillosis owing to its elastolytic activity and Aspergillopepsins from the mold A. saitoi are used in the fermentation industry. Aspartic proteinases are a group of proteolytic enzymes in which the scissile peptide bond is attacked by a nucleophilic water molecule activated by two aspartic residues in a DT(S)G motif at the active site. They have a similar fold composed of two beta-barrel domains. Between the N-terminal and C-terminal domains, each of which contributes one catalytic aspartic residue, there is an extended active-site cleft capable of interacting with multiple residues of a substrate. Although members of the aspartic protease family of enzymes have very similar three-dimensional structures and catalytic mechanisms, each has unique substrate specificity. The members of this family have an optimal acidic pH (5.5) and cleave protein substrates with similar specificity to that of porcine pepsin A, preferring hydrophobic residues at P1 and P1' in the cleavage site. This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A, clan AA).	278
133162	cd06098	phytepsin	Phytepsin, a plant homolog of mammalian lysosomal pepsins. Phytepsin, a plant homolog of mammalian lysosomal pepsins, resides in grains, roots, stems, leaves and flowers. Phytepsin may participate in metabolic turnover and in protein processing events. In addition, it is highly expressed in several plant tissues undergoing apoptosis. Phytepsin contains an internal region consisting of about 100 residues not present in animal or microbial pepsins. This region is thus called a plant specific insert. The insert is highly similar to saposins, which are lysosomal sphingolipid-activating proteins in mammalian cells. The saposin-like domain may have a role in the vacuolar targeting of phytepsin. Phytepsin, like its animal counterparts, possesses a topology typical of all aspartic proteases. They are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localized between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A, clan AA).	317
99853	cd06099	CS_ACL-C_CCL	Citrate synthase (CS), citryl-CoA lyase (CCL), the C-terminal portion of the single-subunit type ATP-citrate lyase (ACL) and the C-terminal portion of the large subunit of the two-subunit type ACL. CS catalyzes the condensation of acetyl coenzyme A (AcCoA) and oxalacetate (OAA) to form citrate and coenzyme A (CoA), the first step in the oxidative citric acid cycle (TCA or Krebs cycle). Peroxisomal CS is involved in the glyoxylate cycle. Some CS proteins function as a 2-methylcitrate synthase (2MCS). 2MCS catalyzes the condensation of propionyl-CoA (PrCoA) and OAA to form 2-methylcitrate and CoA during propionate metabolism. CCL cleaves citryl-CoA (CiCoA) to AcCoA and OAA. ACLs catalyze an ATP- and a CoA-dependent cleavage of citrate to form AcCoA and OAA; they do this in a multistep reaction, the final step of which is likely to involve the cleavage of CiCoA to generate AcCoA and OAA. The overall CS reaction is thought to proceed through three partial reactions and involves both closed and open conformational forms of the enzyme: a) the carbanion or equivalent is generated from AcCoA by base abstraction of a proton, b) the nucleophilic attack of this carbanion on OAA to generate CiCoA, and c) the hydrolysis of CiCoA to produce citrate and CoA. This group contains proteins which function exclusively as either a CS or a 2MCS, as well as those with relaxed specificity which have dual functions as both a CS and a 2MCS. There are two types of CSs: type I CS and type II CSs. Type I CSs are found in eukarya, gram-positive bacteria, archaea, and in some gram-negative bacteria and are homodimers with both subunits participating in the active site. Type II CSs are unique to gram-negative bacteria and are homohexamers of identical subunits (approximated as a trimer of dimers). Some type II CSs are strongly and specifically inhibited by NADH through an allosteric mechanism. In fungi, yeast, plants, and animals ACL is cytosolic and generates AcCoA for lipogenesis. In several groups of autotrophic prokaryotes and archaea, ACL carries out the citrate-cleavage reaction of the reductive tricarboxylic acid (rTCA) cycle. In the family Aquificaceae this latter reaction in the rTCA cycle is carried out via a two enzyme system the second enzyme of which is CCL.	213
99854	cd06100	CCL_ACL-C	Citryl-CoA lyase (CCL), the C-terminal portion of the single-subunit type ATP-citrate lyase (ACL) and the C-terminal portion of the large subunit of the two-subunit type ACL. CCL cleaves citryl-CoA (CiCoA) to acetyl-CoA (AcCoA) and oxaloacetate (OAA). ACL catalyzes an ATP- and a CoA-dependent cleavage of citrate to form AcCoA and OAA in a multistep reaction, the final step of which is likely to involve the cleavage of CiCoA to generate AcCoA and OAA. In fungi, yeast, plants, and animals ACL is cytosolic and generates AcCoA for lipogenesis. ACL may be required for fruiting body maturation in the filamentous fungus Sordaria macrospora. In several groups of autotrophic prokaryotes and archaea, ACL carries out the citrate-cleavage reaction of the reductive tricarboxylic acid (rTCA) cycle. In the family Aquificaceae this latter reaction in the rTCA cycle is carried out via a two enzyme system the second enzyme of which is CCL; the first enzyme is citryl-CoA synthetase (CCS) which is not included in this group. Chlorobium limicola ACL is an example of a two-subunit type ACL. It is comprised of a large and a small subunit; it has been speculated that the large subunit arose from a fusion of the small subunit of the two subunit CCS with CCL. The small ACL subunit is a homolog of the larger CCS subunit. Mammalian ACL is of the single-subunit type and may have arisen from the two-subunit ACL by another gene fusion. Mammalian ACLs are homotetramers; the ACLs of C. limicola and Arabidopsis are heterooctamers (alpha4beta4). In cancer cells there is a shift in energy metabolism to aerobic glycolysis; the glycolytic end product pyruvate enters a truncated TCA cycle generating citrate which is cleaved in the cytosol by ACL. Inhibiting ACL limits the in vitro proliferation and survival of these cancer cells, reduces in vivo tumor growth, and induces differentiation.	227
99855	cd06101	citrate_synt	Citrate synthase (CS) catalyzes the condensation of acetyl coenzyme A (AcCoA) and oxalacetate (OAA) to form citrate and coenzyme A (CoA), the first step in the oxidative citric acid cycle (TCA or Krebs cycle). Peroxisomal CS is involved in the glyoxylate cycle. This group also includes CS proteins which functions as a 2-methylcitrate synthase (2MCS). 2MCS catalyzes the condensation of propionyl-CoA (PrCoA) and OAA to form 2-methylcitrate and CoA during propionate metabolism. This group contains proteins which functions exclusively as either a CS or a 2MCS, as well as those with relaxed specificity which have dual functions as both a CS and a 2MCS. The overall CS reaction is thought to proceed through three partial reactions and involves both closed and open conformational forms of the enzyme: a) the carbanion or equivalent is generated from AcCoA by base abstraction of a proton, b) the nucleophilic attack of this carbanion on OAA to generate citryl-CoA, and c) the hydrolysis of citryl-CoA to produce citrate and CoA. There are two types of CSs: type I CS and type II CSs. Type I CSs are found in eukarya, gram-positive bacteria, archaea, and in some gram-negative bacteria and form homodimers with both subunits participating in the active site. Type II CSs are unique to gram-negative bacteria and are homohexamers of identical subunits (approximated as a trimer of dimers).  Some type II CSs are strongly and specifically inhibited by NADH through an allosteric mechanism. This subgroup includes both gram-positive and gram-negative bacteria.	265
99856	cd06102	citrate_synt_like_2	Citrate synthase (CS) catalyzes the condensation of acetyl coenzyme A (AcCoA) and oxalacetate (OAA) to form citrate and coenzyme A (CoA), the first step in the oxidative citric acid cycle (TCA or Krebs cycle). Peroxisomal CS is involved in the glyoxylate cycle. This group also includes CS proteins which function as a 2-methylcitrate synthase (2MCS). 2MCS catalyzes the condensation of propionyl-CoA (PrCoA) and OAA to form 2-methylcitrate and CoA during propionate metabolism. This group contains proteins which function exclusively as either a CS or a 2MCS, as well as those with relaxed specificity which have dual functions as both a CS and a 2MCS. The overall CS reaction is thought to proceed through three partial reactions and involves both closed and open conformational forms of the enzyme: a) the carbanion or equivalent is generated from AcCoA by base abstraction of a proton, b) the nucleophilic attack of this carbanion on OAA to generate citryl-CoA, and c) the hydrolysis of citryl-CoA to produce citrate and CoA. There are two types of CSs: type I CSs and type II CSs.  Type I CSs are found in eukarya, gram-positive bacteria, archaea, and in some gram-negative bacteria and are homodimers with both subunits participating in the active site.  Type II CSs are unique to gram-negative bacteria and are homohexamers of identical subunits (approximated as a trimer of dimers).  Some type II CSs are strongly and specifically inhibited by NADH through an allosteric mechanism. This subgroup includes both gram-positive and gram-negative bacteria.	282
99857	cd06103	ScCS-like	Saccharomyces cerevisiae (Sc) citrate synthase (CS)-like. CS catalyzes the condensation of acetyl coenzyme A (AcCoA) with oxaloacetate (OAA) to form citrate and coenzyme A (CoA), the first step in the citric acid cycle (TCA or Krebs cycle). Some CS proteins function as 2-methylcitrate synthase (2MCS). 2MCS catalyzes the condensation of propionyl-coenzyme A (PrCoA) and OAA to form 2-methylcitrate and CoA during propionate metabolism. The overall CS reaction is thought to proceed through three partial reactions and involves both closed and open conformational forms of the enzyme: a) the carbanion or equivalent is generated from AcCoA by base abstraction of a proton, b) the nucleophilic attack of this carbanion on OAA to generate citryl-CoA, and c) the hydrolysis of citryl-CoA to produce citrate and CoA. There are two types of CSs: type I CSs and type II CSs.  Type I CSs are found in eukarya, gram-positive bacteria, archaea, and in some gram-negative bacteria and are homodimers with both subunits participating in the active site.  Type II CSs are unique to gram-negative bacteria and are homohexamers of identical subunits (approximated as a trimer of dimers).  This group includes three S. cerevisiae CS proteins, ScCit1, -2, and -3. ScCit1 is a nuclear-encoded mitochondrial CS with high specificity for AcCoA; in addition to having activity with AcCoA, it plays a part in the construction of the TCA cycle metabolon. Yeast cells deleted for Cit1 are hyper-susceptible to apoptosis induced by heat and aging stress. ScCit2 is a peroxisomal CS involved in the glyoxylate cycle; in addition to having activity with AcCoA, it may have activity with PrCoA. ScCit3 is a mitochondrial CS and functions in the metabolism of PrCoA; it is a dual specificity CS and 2MCS, having similar catalytic efficiency with both AcCoA and PrCoA. The pattern of expression of the ScCIT3 gene follows that of the ScCIT1 gene and its expression is increased in the presence of a ScCIT1 deletion. Included in this group are the Tetrahymena 14 nm filament protein, which functions as a CS in mitochondria and as a cytoskeletal component in cytoplasm, and Geobacter sulfurreducens (GSu) CS. GSuCS is dimeric and eukaryotic-like; it lacks 2MCS activity and is inhibited by ATP. In contrast to eukaryotic and other prokaryotic CSs, GSuCS is not stimulated by K+ ions.  This group contains proteins which function exclusively as either a CS or a 2MCS, as well as those with relaxed specificity which have dual functions as both a CS and a 2MCS.	426
99858	cd06105	ScCit1-2_like	Saccharomyces cerevisiae (Sc) citrate synthases Cit1-2_like. Citrate synthase (CS) catalyzes the condensation of acetyl coenzyme A (AcCoA) with oxaloacetate (OAA) to form citrate and coenzyme A (CoA), the first step in the citric acid cycle (TCA or Krebs cycle). Some CS proteins function as 2-methylcitrate synthase (2MCS). 2MCS catalyzes the condensation of propionyl-coenzyme A (PrCoA) and OAA to form 2-methylcitrate and CoA during propionate metabolism. The overall CS reaction is thought to proceed through three partial reactions and involves both closed and open conformational forms of the enzyme: a) the carbanion or equivalent is generated from AcCoA by base abstraction of a proton, b) the nucleophilic attack of this carbanion on OAA to generate citryl-CoA, and c) the hydrolysis of citryl-CoA to produce citrate and CoA. There are two types of CSs: type I CSs and type II CSs.  Type I CSs are found in eukarya, gram-positive bacteria, archaea, and in some gram-negative bacteria and are homodimers with both subunits participating in the active site.  Type II CSs are unique to gram-negative bacteria and are homohexamers of identical subunits (approximated as a trimer of dimers).  ScCit1 is a nuclear-encoded mitochondrial CS with high specificity for AcCoA. In addition to its CS function, ScCit1 plays a part in the construction of the TCA cycle metabolon. Yeast cells deleted for Cit1 are hyper-susceptible to apoptosis induced by heat and aging stress. ScCit2 is a peroxisomal CS involved in the glyoxylate cycle; in addition to having activity with AcCoA, it may have activity with PrCoA. Chicken and pig heart CS, two Arabidopsis thaliana (Ath) CSs, CSY4 and -5, and Aspergillus niger (An) CS also belong to this group. AthCSY4 has a mitochondrial targeting sequence; AthCSY5 has no identifiable targeting sequence. AnCS encoded by the citA gene has both an N-terminal mitochondrial import signal and a C-terminal peroxisomal target sequence; it is not known if both these signals are functional in vivo. This group contains proteins which function exclusively as either a CS or a 2MCS, as well as those with relaxed specificity which have dual functions as both a CS and a 2MCS.	427
99859	cd06106	ScCit3_like	Saccharomyces cerevisiae (Sc) 2-methylcitrate synthase Cit3-like. 2-methylcitrate synthase (2MCS) catalyzes the condensation of propionyl-coenzyme A (PrCoA) and oxaloacetate (OAA) to form 2-methylcitrate and CoA. Citrate synthase (CS) catalyzes the condensation of acetyl coenzyme A (AcCoA) with OAA to form citrate and CoA, the first step in the citric acid cycle (TCA or Krebs cycle). The overall CS reaction is thought to proceed through three partial reactions and involves both closed and open conformational forms of the enzyme: a) the carbanion or equivalent is generated from AcCoA by base abstraction of a proton, b) the nucleophilic attack of this carbanion on OAA to generate citryl-CoA, and c) the hydrolysis of citryl-CoA to produce citrate and CoA. There are two types of CSs: type I CSs and type II CSs.  Type I CSs are found in eukarya, gram-positive bacteria, archaea, and in some gram-negative bacteria and are homodimers with both subunits participating in the active site.  Type II CSs are unique to gram-negative bacteria and are homohexamers of identical subunits (approximated as a trimer of dimers). ScCit3 is mitochondrial and functions in the metabolism of PrCoA; it is a dual specificity CS and 2MCS, having similar catalytic efficiency with both AcCoA and PrCoA. The pattern of expression of the ScCIT3 gene follows that of the major mitochondrial CS gene (CIT1, not included in this group) and its expression is increased in the presence of a CIT1 deletion. This group also contains Aspergillus nidulans 2MCS; a deletion of the gene encoding this protein results in a strain unable to grow on propionate. This group contains proteins which function exclusively as either a CS or a 2MCS, as well as those with relaxed specificity which have dual functions as both a CS and a 2MCS.	428
99860	cd06107	EcCS_AthCS-per_like	Escherichia coli (Ec) citrate synthase (CS) gltA and Arabidopsis thaliana (Ath) peroxisomal (Per) CS_like. CS catalyzes the condensation of acetyl coenzyme A (AcCoA) and oxalacetate (OAA) to form citrate and coenzyme A (CoA), the first step in the citric acid cycle (TCA or Krebs cycle). The overall CS reaction is thought to proceed through three partial reactions and involves both closed and open conformational forms of the enzyme: a) the carbanion or equivalent is generated from AcCoA by base abstraction of a proton, b) the nucleophilic attack of this carbanion on OAA to generate citryl-CoA, and c) the hydrolysis of citryl-CoA to produce citrate and CoA.   There are two types of CSs: type I CSs and type II CSs.  Type I CSs are found in eukarya, gram-positive bacteria, archaea, and in some gram-negative bacteria and are homodimers with both subunits participating in the active site.  Type II CSs are unique to gram-negative bacteria and are homohexamers of identical subunits (approximated as a trimer of dimers).  Some type II CSs, including EcCS, are strongly and specifically inhibited by NADH through an allosteric mechanism. Included in this group is an NADH-insensitive type II Acetobacter aceti CS which has retained many of the residues used by EcCS for NADH binding. C. aurantiacus is a gram-negative thermophilic green gliding bacterium; its CS belonging to this group may be a type I CS.  It is not inhibited by NADH or 2-oxoglutarate and is inhibited by ATP. Both gram-positive and gram-negative bacteria are found in this group. This group also contains three Arabidopsis peroxisomal CS proteins, CSY1, -2, and -3, which participate in the glyoxylate cycle. AthCSY1, in addition to a peroxisomal targeting sequence, has a predicted secretory signal peptide; it may be targeted to both the secretory pathway and the peroxisomes and perhaps is located in the extracellular matrix. AthCSY1 is expressed only in siliques and specifically in developing seeds. AthCSY2 and 3 are active during seed germination and seedling development and are thought to participate in the beta-oxidation of fatty acids.	382
99861	cd06108	Ec2MCS_like	Escherichia coli (Ec) 2-methylcitrate synthase (2MCS)_like. 2MCS catalyzes the condensation of propionyl-coenzyme A (PrCoA) and oxalacetate (OAA) to form 2-methylcitrate and coenzyme A (CoA) during propionate metabolism. Citrate synthase (CS) catalyzes the condensation of acetyl coenzyme A (AcCoA) and OAA to form citrate and coenzyme A (CoA), the first step in the citric acid cycle (TCA or Krebs cycle). This group contains proteins similar to the E. coli 2MCS, EcPrpC.  EcPrpC is one of two CS isozymes in the gram-negative E. coli. EcPrpC is a dimeric (type I) CS; it is induced during growth on propionate and prefers PrCoA as a substrate though it has partial CS activity with AcCoA. This group also includes Salmonella typhimurium PrpC and Ralstonia eutropha (Re) 2-MCS1 which are also induced during growth on propionate and prefer PrCoA as substrate, but can also use AcCoA. Re 2-MCS1 can use butyryl-CoA and valeryl-CoA at a lower rate. A second Ralstonia eutropha 2MCS, Re 2-MCS2, which is induced on propionate is also found in this group. This group may include proteins which may function exclusively as a CS, those which may function exclusively as a 2MCS, or those with dual specificity which function as both a CS and a 2MCS.	363
99862	cd06109	BsCS-I_like	Bacillus subtilis (Bs) citrate synthase CS-I_like. CS catalyzes the condensation of acetyl coenzyme A (AcCoA) and oxalacetate (OAA) to form citrate and coenzyme A (CoA), the first step in the citric acid cycle (TCA or Krebs cycle). 2MCS catalyzes the condensation of propionyl-coenzyme A (PrCoA) and OAA to form 2-methylcitrate and coenzyme A (CoA) during propionate metabolism. The overall CS reaction is thought to proceed through three partial reactions and involves both closed and open conformational forms of the enzyme: a) the carbanion or equivalent is generated from AcCoA by base abstraction of a proton, b) the nucleophilic attack of this carbanion on OAA to generate citryl-CoA, and c) the hydrolysis of citryl-CoA to produce citrate and CoA. This group contains proteins similar to BsCS-I, one of two CS isozymes in the gram-positive B. subtilis. The majority of CS activity in B. subtilis is provided by the other isozyme, BsCS-II (not included in this group). BsCS-I has a lower catalytic activity than BsCS-II, and has a Glu in place of a key catalytic Asp residue. This change is conserved in other members of this group. For E. coli CS (not included in this group), site directed mutagenesis of the key Asp residue to a Glu converts the enzyme into citryl-CoA lyase which cleaves citryl-CoA to AcCoA and OAA.  A null mutation in the gene encoding BsCS-I (citA) had little effect on B. subtilis CS activity or on sporulation. However, disruption of the citA gene in a strain null for the gene encoding BsCS-II resulted in a sporulation deficiency, a characteristic of strains defective in the Krebs cycle. This group contains proteins which function exclusively as either a CS or a 2MCS, as well as those with relaxed specificity which have dual functions as both a CS and a 2MCS. Many of the gram-negative species represented in this group have a second CS isozyme which is in another group.	349
99863	cd06110	BSuCS-II_like	Bacillus subtilis (Bs) citrate synthase (CS)-II_like. CS catalyzes the condensation of acetyl coenzyme A (AcCoA) and oxalacetate (OAA) to form citrate and coenzyme A (CoA), the first step in the citric acid cycle (TCA or Krebs cycle). 2MCS catalyzes the condensation of propionyl-coenzyme A (PrCoA) and OAA to form 2-methylcitrate and CoA during propionate metabolism. The overall CS reaction is thought to proceed through three partial reactions: a) the carbanion or equivalent is generated from AcCoA by base abstraction of a proton, b) the nucleophilic attack of this carbanion on OAA to generate citryl-CoA, and c) the hydrolysis of citryl-CoA to produce citrate and CoA. This group contains proteins similar to BsCS-II, the major CS of the gram-positive bacterium Bacillus subtilis. A mutation in the gene which encodes BsCS-II (citZ gene) has been described which resulted in a significant loss of CS activity, partial glutamate auxotrophy, and a sporulation deficiency, all of which are characteristic of strains defective in the Krebs cycle. Streptococcus mutans CS, found in this group, may participate in a pathway for the anaerobic biosynthesis of glutamate. This group also contains functionally uncharacterized CSs of various gram-negative bacteria. Some of the gram-negative species represented in this group have a second CS isozyme found in another group. This group contains proteins which function exclusively as either a CS or a 2MCS, as well as those with relaxed specificity which have dual functions as both a CS and a 2MCS.	356
99864	cd06111	DsCS_like	Cold-active citrate synthase (CS) from an Antarctic bacterial strain DS2-3R (Ds)-like. CS catalyzes the condensation of acetyl coenzyme A (AcCoA) and oxalacetate (OAA) to form citrate and coenzyme A (CoA), the first step in the citric acid cycle (TCA or Krebs cycle). 2-methylcitrate synthase (2MCS) catalyzes the condensation of propionyl-coenzyme A (PrCoA) and OAA to form 2-methylcitrate and coenzyme A (CoA) during propionate metabolism. The overall CS reaction is thought to proceed through three partial reactions: a) the carbanion or equivalent is generated from AcCoA by base abstraction of a proton, b) the nucleophilic attack of this carbanion on OAA to generate citryl-CoA, and c) the hydrolysis of citryl-CoA to produce citrate and CoA. DsCS, compared with CS from the hyperthermophile Pyrococcus furiosus (not included in this group), has an increase in the size of surface loops, a higher proline content in the loop regions, a more accessible active site, and a higher number of intramolecular ion pairs. This group contains proteins which function exclusively as either a CS or a 2MCS, as well as those with relaxed specificity which have dual functions as both a CS and a 2MCS. For example, included in this group are Corynebacterium glutamicum (Cg) PrpC1 and -2, which are only synthesized during growth on propionate-containing medium, can use PrCoA, AcCoA and butyryl-CoA as substrates, and have comparable catalytic activity with AcCoA as the major CgCS (GltA, not included in this group).	362
99865	cd06112	citrate_synt_like_1_1	Citrate synthase (CS) catalyzes the condensation of acetyl coenzyme A (AcCoA) and oxalacetate (OAA) to form citrate and coenzyme A (CoA), the first step in the oxidative citric acid cycle (TCA or Krebs cycle). Peroxisomal CS is involved in the glyoxylate cycle. This group also includes CS proteins which function as a 2-methylcitrate synthase (2MCS). 2MCS catalyzes the condensation of propionyl-CoA (PrCoA) and OAA to form 2-methylcitrate and CoA during propionate metabolism. This group contains proteins which function exclusively as either a CS or a 2MCS, as well as those with relaxed specificity which have dual functions as both a CS and a 2MCS. The overall CS reaction is thought to proceed through three partial reactions and involves both closed and open conformational forms of the enzyme: a) the carbanion or equivalent is generated from AcCoA by base abstraction of a proton, b) the nucleophilic attack of this carbanion on OAA to generate citryl-CoA, and c) the hydrolysis of citryl-CoA to produce citrate and CoA. There are two types of CSs: type I CSs and type II CSs.  Type I CSs are found in eukarya, gram-positive bacteria, archaea, and in some gram-negative bacteria and are homodimers with both subunits participating in the active site.  Type II CSs are unique to gram-negative bacteria and are homohexamers of identical subunits (approximated as a trimer of dimers).  Some type II CSs are strongly and specifically inhibited by NADH through an allosteric mechanism.	373
99866	cd06113	citrate_synt_like_1_2	Citrate synthase (CS) catalyzes the condensation of acetyl coenzyme A (AcCoA) and oxalacetate (OAA) to form citrate and coenzyme A (CoA), the first step in the oxidative citric acid cycle (TCA or Krebs cycle). Peroxisomal CS is involved in the glyoxylate cycle. This group also includes CS proteins which function as a 2-methylcitrate synthase (2MCS). 2MCS catalyzes the condensation of propionyl-CoA (PrCoA) and OAA to form 2-methylcitrate and CoA during propionate metabolism. This group contains proteins which function exclusively as either a CS or a 2MCS, as well as those with relaxed specificity which have dual functions as both a CS and a 2MCS. The overall CS reaction is thought to proceed through three partial reactions and involves both closed and open conformational forms of the enzyme: a) a carbanion or equivalent is generated from AcCoA by base abstraction of a proton, b) nucleophilic attack of this carbanion on OAA to generate citryl-CoA, and c) hydrolysis of citryl-CoA to produce citrate and CoA. CSs are found in two structural types: type I (homodimeric) and type II CSs (homohexameric). Type II CSs are unique to gram-negative bacteria. Type I CSs are found in eukarya, gram-positive bacteria, archaea, and in some gram-negative bacteria. Type I CS is active as a homodimer, both subunits participating in the active site. Type II CS is a hexamer of identical subunits (approximated as a trimer of dimers). Some type II CSs are strongly and specifically inhibited by NADH through an allosteric mechanism. This subgroup includes both gram-positive and gram-negative bacteria.	406
99867	cd06114	EcCS_like	Escherichia coli (Ec) citrate synthase (CS) GltA_like. CS catalyzes the condensation of acetyl coenzyme A (AcCoA) and oxalacetate (OAA) to form citrate and coenzyme A (CoA), the first step in the citric acid cycle (TCA or Krebs cycle). The overall CS reaction is thought to proceed through three partial reactions and involves both closed and open conformational forms of the enzyme: a) the carbanion or equivalent is generated from AcCoA by base abstraction of a proton, b) the nucleophilic attack of this carbanion on OAA to generate citryl-CoA, and c) the hydrolysis of citryl-CoA to produce citrate and CoA.  There are two types of CSs: type I CSs and type II CSs.  Type I CSs are found in eukarya, gram-positive bacteria, archaea, and in some gram-negative bacteria and are homodimers with both subunits participating in the active site.  Type II CSs are unique to gram-negative bacteria and are homohexamers of identical subunits (approximated as a trimer of dimers).  Some type II CSs, including EcCS, are strongly and specifically inhibited by NADH through an allosteric mechanism. Included in this group is an NADH-insensitive type II Acetobacter aceti CS which has retained many of the residues used by EcCS for NADH binding.	400
99868	cd06115	AthCS_per_like	Arabidopsis thaliana (Ath) peroxisomal (Per) CS_like. CS catalyzes the condensation of acetyl coenzyme A (AcCoA) and oxalacetate (OAA) to form citrate and coenzyme A (CoA), the first step in the citric acid cycle (TCA or Krebs cycle). The overall CS reaction is thought to proceed through three partial reactions and involves both closed and open conformational forms of the enzyme: a) the carbanion or equivalent is generated from AcCoA by base abstraction of a proton, b) the nucleophilic attack of this carbanion on OAA to generate citryl-CoA, and c) the hydrolysis of citryl-CoA to produce citrate and CoA. This group contains three Arabidopsis peroxisomal CS proteins, CSY1, -2, and -3, which are involved in the glyoxylate cycle. AthCSY1, in addition to a peroxisomal targeting sequence, has a predicted secretory signal peptide; it may be targeted to both the secretory pathway and the peroxisomes and is thought to be located in the extracellular matrix. AthCSY1 is expressed only in siliques and specifically in developing seeds. AthCSY2 and 3 are active during seed germination and seedling development and are thought to participate in the beta-oxidation of fatty acids.	410
99869	cd06116	CaCS_like	Chloroflexus aurantiacus (Ca) citrate synthase (CS)_like. CS catalyzes the condensation of acetyl coenzyme A (AcCoA) and oxalacetate (OAA) to form citrate and coenzyme A (CoA), the first step in the citric acid cycle (TCA or Krebs cycle). This group is similar to gram-negative Escherichia coli (Ec) CS (type II, gltA) and Arabidopsis thaliana (Ath) peroxisomal (Per) CS. However, EcCS and AthPerCS are not found in this group. The overall CS reaction is thought to proceed through three partial reactions and involves both closed and open conformational forms of the enzyme: a) the carbanion or equivalent is generated from AcCoA by base abstraction of a proton, b) the nucleophilic attack of this carbanion on OAA to generate citryl-CoA, and c) the hydrolysis of citryl-CoA to produce citrate and CoA.   There are two types of CSs: type I CSs and type II CSs.  Type I CSs are found in eukarya, gram-positive bacteria, archaea, and in some gram-negative bacteria and are homodimers with both subunits participating in the active site.  Type II CSs are unique to gram-negative bacteria and are homohexamers of identical subunits (approximated as a trimer of dimers).  Some type II CSs are strongly and specifically inhibited by NADH through an allosteric mechanism. C. aurantiacus is a gram-negative thermophilic green gliding bacterium; its CS belonging to this group may be a type I CS. It is not inhibited by NADH or 2-oxoglutarate and is inhibited by ATP. Both gram-positive and gram-negative bacteria are found in this group.	384
99870	cd06117	Ec2MCS_like_1	Subgroup of Escherichia coli (Ec) 2-methylcitrate synthase (2MCS)_like. 2MCS catalyzes the condensation of propionyl-coenzyme A (PrCoA) and oxalacetate (OAA) to form 2-methylcitrate and coenzyme A (CoA) during propionate metabolism. Citrate synthase (CS) catalyzes the condensation of acetyl coenzyme A (AcCoA) and OAA to form citrate and coenzyme A (CoA), the first step in the citric acid cycle (TCA or Krebs cycle). This group contains proteins similar to the E. coli 2MCS, EcPrpC.  EcPrpC is one of two CS isozymes in the gram-negative E. coli. EcPrpC is a dimeric (type I) CS; it is induced during growth on propionate and prefers PrCoA as a substrate, but has partial CS activity with AcCoA. This group also includes Salmonella typhimurium PrpC and Ralstonia eutropha (Re) 2-MCS1, which are also induced during growth on propionate and prefer PrCoA as substrate, but can also use AcCoA. Re 2-MCS1 can use butyryl-CoA and valeryl-CoA at a low rate. A second Ralstonia eutropha 2MCS, Re 2-MCS2, which is induced on propionate, is also found in this group. This group contains proteins which function exclusively as either a CS or a 2MCS, as well as those with relaxed specificity which have dual functions as both a CS and a 2MCS.	366
99871	cd06118	citrate_synt_like_1	Citrate synthase (CS) catalyzes the condensation of acetyl coenzyme A (AcCoA) and oxalacetate (OAA) to form citrate and coenzyme A (CoA), the first step in the oxidative citric acid cycle (TCA or Krebs cycle). Peroxisomal CS is involved in the glyoxylate cycle. This group also includes CS proteins which function as a 2-methylcitrate synthase (2MCS). 2MCS catalyzes the condensation of propionyl-CoA (PrCoA) and OAA to form 2-methylcitrate and CoA during propionate metabolism. This group contains proteins which function exclusively as either a CS or a 2MCS, as well as those with relaxed specificity which have dual functions as both a CS and a 2MCS. The overall CS reaction is thought to proceed through three partial reactions and involves both closed and open conformational forms of the enzyme: a) the carbanion or equivalent is generated from AcCoA by base abstraction of a proton, b) the nucleophilic attack of this carbanion on OAA to generate citryl-CoA, and c) the hydrolysis of citryl-CoA to produce citrate and CoA. There are two types of CSs: type I CSs and type II CSs.  Type I CSs are found in eukarya, gram-positive bacteria, archaea, and in some gram-negative bacteria and are homodimers with both subunits participating in the active site.  Type II CSs are unique to gram-negative bacteria and are homohexamers of identical subunits (approximated as a trimer of dimers).  Some type II CSs are strongly and specifically inhibited by NADH through an allosteric mechanism.	358
380376	cd06121	cupin_YML079wp	Saccharomyces cerevisiae YML079wp and related proteins, cupin domain. This family includes eukaryotic, bacterial, and archaeal proteins homologous to YML079wp, a Saccharomyces cerevisiae cupin-like protein of unknown function that structurally resembles plant seed storage and ligand-binding proteins (canavalin, glycinin, auxin binding protein) as well as the bacterial RmlC epimerase. YML079wp is non-essential in yeast and localizes to the nucleus and cytoplasm. The presence of a hydrophobic ligand within a well-conserved binding pocket inside the cupin beta-barrel and sequence similarity with bacterial epimerases suggest a possible biochemical function for YML079wp and its homologs. Also included in this family are Shewanella oneidensis So0799, Agrobacterium fabrum Atu3615 and Branchiostoma belcheri Bbduf985. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold and form homodimers.	153
380377	cd06122	cupin_TTHA0104	Thermus thermophilus TTHA0104 and related proteins, cupin domain. This family contains bacterial proteins including TTHA0104 (also called TT1209), a putative antibiotic synthesis protein from Thermus thermophilus. TTHA0104 is a cupin-like protein. The cupins are a functionally diverse superfamily originally discovered based on the highly conserved motif found in germin and germin-like proteins.  This conserved motif forms a beta-barrel fold found in all of the cupins, giving rise to the name cupin (cupa is the Latin term for small barrel).	102
380378	cd06123	cupin_HAO	3-Hydroxyanthranilate-3,4-dioxygenase, cupin domain. 3-Hydroxyanthranilate-3,4-dioxygenase (HAO or 3HAO) is a non-heme iron-dependent extradiol dioxygenase that catalyzes the oxidative ring opening of 3-hydroxyanthranilate (3-HAA) in the final enzymatic step of the kynurenine biosynthetic pathway in which tryptophan is converted to quinolinate, an endogenous neurotoxin, making HAO a target for pharmacological downregulation. Quinolinate is also the universal de novo precursor to the pyridine ring of nicotinamide adenine dinucleotide. The enzyme forms homodimers, with two metal binding sites per molecule. One of the bound metal ions occupies the proposed ferrous-coordinated active site, which is located in a conserved double-strand beta-helix domain. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold.	153
380379	cd06124	cupin_NimR-like_N	AraC/XylS family transcriptional regulators similar to NimR, N-terminal cupin domain. This family contains mostly bacterial proteins that have an AraC/XylS family helix-turn-helix (HTH) DNA-binding domain C-terminal to a cupin domain and may be transcriptional regulators. Included in this family is Escherichia coli HTH-type transcriptional regulator NimR (also called YeaM) that negatively regulates expression of the nimT operon and its own expression. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	95
176647	cd06125	DnaQ_like_exo	DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily. The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases is accompanied by horizontal gene transfer.	96
176648	cd06127	DEDDh	DEDDh 3'-5' exonuclease domain family. DEDDh exonucleases, part of the DnaQ-like (or DEDD) exonuclease superfamily, catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. These proteins contain four invariant acidic residues in three conserved sequence motifs termed ExoI, ExoII and ExoIII. DEDDh exonucleases are classified as such because of the presence of specific Hx(4)D conserved pattern at the ExoIII motif. The four conserved acidic residues are clustered around the active site and serve as ligands for the two metal ions required for catalysis. Most DEDDh exonucleases are the proofreading subunits (epsilon) or domains of bacterial DNA polymerase III, the main replicating enzyme in bacteria, which functions as the chromosomal replicase. Other members include other DNA and RNA exonucleases such as RNase T, Oligoribonuclease, and RNA exonuclease (REX), among others.	159
176649	cd06128	DNA_polA_exo	DEDDy 3'-5' exonuclease domain of family-A DNA polymerases. The 3'-5' exonuclease domain of family-A DNA polymerases has a fundamental role in reducing polymerase errors and is involved in proofreading activity. Family-A DNA polymerases contain a DnaQ-like exonuclease domain in the same polypeptide chain as the polymerase domain, similar to family-B DNA polymerases. The exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four invariant acidic residues that serve as ligands for the two metal ions required for catalysis. The Klenow fragment (KF) of Escherichia coli Pol I, the Thermus aquaticus (Taq) Pol I, and Bacillus stearothermophilus (BF) Pol I are examples of family-A DNA polymerases. They are involved in nucleotide excision repair and in the processing of Okazaki fragments that are generated during lagging strand synthesis. The N-terminal domains of BF Pol I and Taq Pol I resemble the fold of the 3'-5' exonuclease domain of KF without the proofreading activity of KF. The four critical metal-binding residues are not conserved in BF Pol I and Taq Pol I, and they are unable to bind metals necessary for exonuclease activity.	151
176650	cd06129	RNaseD_like	DEDDy 3'-5' exonuclease domain of RNase D, WRN, and similar proteins. The RNase D-like group is composed of RNase D, WRN, and similar proteins. They contain a DEDDy-type, DnaQ-like, 3'-5' exonuclease domain that contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, with a specific YX(3)D pattern at ExoIII. These motifs are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. RNase D is involved in the 3'-end processing of tRNA precursors. RNase D-like proteins in eukaryotes include yeast Rrp6p, human PM/Scl-100 and Drosophila melanogaster egalitarian (Egl) protein. WRN is a unique DNA helicase possessing exonuclease activity. Mutation in the WRN gene is implicated in Werner syndrome, a disease associated with premature aging and increased predisposition to cancer. Yeast Rrp6p and the human Polymyositis/scleroderma autoantigen 100kDa (PM/Scl-100) are exosome-associated proteins involved in the degradation and processing of precursors to stable RNAs. Egl is a component of an mRNA-binding complex which is required for oocyte specification. The Egl subfamily does not possess a completely conserved YX(3)D pattern at the ExoIII motif.	161
99834	cd06130	DNA_pol_III_epsilon_like	an uncharacterized bacterial subgroup of the DEDDh 3'-5' exonuclease domain family with similarity to the epsilon subunit of DNA polymerase III. This subfamily is composed of uncharacterized bacterial proteins with similarity to the epsilon subunit of DNA polymerase III (Pol III), a multisubunit polymerase which is the main DNA replicating enzyme in bacteria, functioning as the chromosomal replicase. The Pol III holoenzyme is a complex of ten different subunits, three of which (alpha, epsilon, and theta) compose the catalytic core. The Pol III epsilon subunit, encoded by the dnaQ gene, is a DEDDh-type 3'-5' exonuclease which is responsible for the proofreading activity of the polymerase, increasing the fidelity of DNA synthesis. It contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, with a specific Hx(4)D conserved pattern at ExoIII. These motifs are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The epsilon subunit of Pol III also functions as a stabilizer of the holoenzyme complex.	156
99835	cd06131	DNA_pol_III_epsilon_Ecoli_like	DEDDh 3'-5' exonuclease domain of the epsilon subunit of Escherichia coli DNA polymerase III and similar proteins. This subfamily is composed of the epsilon subunit of Escherichia coli DNA polymerase III (Pol III) and similar proteins. Pol III is the main DNA replicating enzyme in bacteria, functioning as the chromosomal replicase. It is a holoenzyme complex of ten different subunits, three of which (alpha, epsilon, and theta) compose the catalytic core. The Pol III epsilon subunit, encoded by the dnaQ gene, is a DEDDh-type 3'-5' exonuclease which is responsible for the proofreading activity of the polymerase, increasing the fidelity of DNA synthesis. It contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, with a specific Hx(4)D conserved pattern at ExoIII. These motifs are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The epsilon subunit of Pol III also functions as a stabilizer of the holoenzyme complex.	167
99836	cd06133	ERI-1_3'hExo_like	DEDDh 3'-5' exonuclease domain of Caenorhabditis elegans ERI-1, human 3' exonuclease, and similar proteins. This subfamily is composed of Caenorhabditis elegans ERI-1, human 3' exonuclease (3'hExo), Drosophila exonuclease snipper (snp), and similar proteins from eukaryotes and bacteria. These are DEDDh-type DnaQ-like 3'-5' exonucleases containing three conserved sequence motifs termed ExoI, ExoII and ExoIII, with a specific Hx(4)D conserved pattern at ExoIII. These motifs are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. ERI-1 has been implicated in the degradation of small interfering RNAs (RNAi). 3'hExo participates in the degradation of histone mRNAs. Snp is a non-essential exonuclease that efficiently degrades structured RNA and DNA substrates as long as there is a minimum of 2 nucleotides in the 3' overhang to initiate degradation. Snp is not a functional homolog of either ERI-1 or 3'hExo.	176
99837	cd06134	RNaseT	DEDDh 3'-5' exonuclease domain of RNase T. RNase T is a DEDDh-type DnaQ-like 3'-5' exoribonuclease implicated in the 3' maturation of small stable RNAs and 23S rRNA, and in the end turnover of tRNA. It contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, with a specific Hx(4)D conserved pattern at ExoIII. These motifs are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. RNase T is related to the proofreading domain of DNA polymerase III. Despite its important role, RNase T is found mainly in gammaproteobacteria. It is speculated that it might have originated from DNA polymerase III at the time the gamma division of proteobacteria diverged from other bacteria. RNase T is a homodimer with the catalytic residues of one monomer contacting a large basic patch on the other monomer to form a functional active site.	189
99838	cd06135	Orn	DEDDh 3'-5' exonuclease domain of oligoribonuclease and similar proteins. Oligoribonuclease (Orn) is a DEDDh-type DnaQ-like 3'-5' exoribonuclease that is responsible for degrading small oligoribonucleotides to mononucleotides. It contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, with a specific Hx(4)D conserved pattern at ExoIII. These motifs are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. Orn is essential for Escherichia coli survival. The human homolog, also called Sfn (small fragment nuclease), is able to hydrolyze short single-stranded RNA and DNA oligomers. It plays a role in cellular nucleotide recycling.	173
99839	cd06136	TREX1_2	DEDDh 3'-5' exonuclease domain of three prime repair exonuclease (TREX)1, TREX2, and similar proteins. Three prime repair exonuclease (TREX)1 and TREX2 are closely related DEDDh-type DnaQ-like 3'-5' exonucleases. They contain three conserved sequence motifs known as ExoI, II, and III, with a specific Hx(4)D conserved pattern at ExoIII. These motifs contain four conserved acidic residues that participate in coordination of divalent metal ions required for catalysis. Both proteins play a role in the metabolism and clearance of DNA. TREX1 is the major 3'-5' exonuclease activity detected in mammalian cells. Mutations in the human TREX1 gene can cause Aicardi-Goutieres syndrome (AGS), which is characterized by perturbed innate immunity and presents itself as a severe neurological disease. TREX1 degrades ssDNA generated by aberrant replication intermediates to prevent checkpoint activation and autoimmune disease. There are distinct structural differences between TREX1 and TREX2 that point to different biological roles for these proteins. The main difference is the presence of about 70 amino acids at the C-terminus of TREX1. In addition, TREX1 has a nonrepetitive proline-rich region that is not present in the TREX2 protein. Furthermore, TREX2 contains a conserved DNA binding loop positioned adjacent to the active site that has a sequence distinct from the corresponding loop in TREX1. Truncations in the C-terminus of human TREX1 cause autosomal dominant retinal vasculopathy with cerebral leukodystrophy (RVCL), a neurovascular syndrome featuring a progressive loss of visual acuity combined with a variable neurological picture.	177
99840	cd06137	DEDDh_RNase	DEDDh 3'-5' exonuclease domain of the eukaryotic exoribonucleases PAN2, RNA exonuclease (REX)-1,-3, and -4, ISG20, and similar proteins. This group is composed of eukaryotic exoribonucleases that include PAN2, RNA exonuclease 1 (REX1 or Rex1p), REX3 (Rex3p), REX4 (or Rex4p), ISG20, and similar proteins. They are DEDDh-type DnaQ-like 3'-5' exonucleases containing three conserved sequence motifs termed ExoI, ExoII and ExoIII, with a specific Hx(4)D conserved pattern at ExoIII. These motifs are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. PAN2 is the catalytic subunit of poly(A) nuclease (PAN), a Pab1p-dependent 3'-5' exoribonuclease which plays an important role in the posttranscriptional maturation of pre-mRNAs. REX proteins are required for the processing and maturation of many RNA species, and ISG20 is an interferon-induced antiviral exonuclease with a strong preference for single-stranded RNA.	161
99841	cd06138	ExoI_N	N-terminal DEDDh 3'-5' exonuclease domain of Escherichia coli exonuclease I and similar proteins. This subfamily is composed of the N-terminal domain of Escherichia coli exonuclease I (ExoI) and similar proteins. ExoI is a monomeric enzyme that hydrolyzes single-stranded DNA in the 3' to 5' direction. It plays a role in DNA recombination and repair. It primarily functions in repairing frameshift mutations. The N-terminal domain of ExoI is a DEDDh-type DnaQ-like 3'-5' exonuclease containing three conserved sequence motifs termed ExoI, ExoII and ExoIII, with a specific Hx(4)D conserved pattern at ExoIII. These motifs are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The ExoI structure is unique among DnaQ family enzymes in that there is a large distance between the two metal ions required for catalysis and the catalytic histidine is oriented away from the active site.	183
176651	cd06139	DNA_polA_I_Ecoli_like_exo	DEDDy 3'-5' exonuclease domain of Escherichia coli DNA polymerase I and similar bacterial family-A DNA polymerases. Escherichia coli-like Polymerase I (Pol I), a subgroup of family-A DNA polymerases, contains a DEDDy-type DnaQ-like 3'-5' exonuclease domain in the same polypeptide chain as the polymerase domain. The exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, with a specific YX(3)D pattern at ExoIII. These motifs are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The 3'-5' exonuclease domain of DNA polymerases has a fundamental role in reducing polymerase errors and is involved in proofreading activity. E. coli DNA Pol I is involved in genome replication but is not the main replicating enzyme. It is also implicated in DNA repair.	193
176652	cd06140	DNA_polA_I_Bacillus_like_exo	inactive DEDDy 3'-5' exonuclease domain of Bacillus stearothermophilus DNA polymerase I and similar family-A DNA polymerases. Bacillus stearothermophilus-like Polymerase I (Pol I), a subgroup of the family-A DNA polymerases, contains an inactive DnaQ-like 3'-5' exonuclease domain in the same polypeptide chain as the polymerase region. The exonuclease-like domain of these proteins possesses the same fold as the Klenow fragment (KF) of Escherichia coli Pol I, but does not contain the four critical metal-binding residues necessary for activity. The function of this domain is unknown. It might act as a spacer between the polymerase and the 5'-3' exonuclease domains. Some members of this subgroup, such as those from Bacillus sphaericus and Thermus aquaticus, are thermostable DNA polymerases.	178
176653	cd06141	WRN_exo	DEDDy 3'-5' exonuclease domain of WRN and similar proteins. WRN is a unique RecQ DNA helicase exhibiting an exonuclease activity. It contains a DEDDy-type DnaQ-like 3'-5' exonuclease domain possessing three conserved sequence motifs termed ExoI, ExoII and ExoIII, with a specific YX(3)D pattern at ExoIII. These motifs are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. Mutations in the WRN gene cause Werner syndrome, an autosomal recessive disorder associated with premature aging and increased susceptibility to cancer and type II diabetes. WRN interacts with key proteins involved in DNA replication, recombination, and repair. It is believed to maintain genomic stability and life span by participating in DNA processes. WRN is stimulated by Ku70/80, an important regulator of genomic stability.	170
176654	cd06142	RNaseD_exo	DEDDy 3'-5' exonuclease domain of Ribonuclease D and similar proteins. Ribonuclease (RNase) D is a bacterial enzyme involved in the maturation of small stable RNAs and the 3' maturation of tRNA. It contains a DEDDy-type DnaQ-like 3'-5' exonuclease domain possessing three conserved sequence motifs termed ExoI, ExoII and ExoIII, with a specific YX(3)D pattern at ExoIII. These motifs are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. In vivo, RNase D only becomes essential upon removal of other ribonucleases. Eukaryotic RNase D homologs include yeast Rrp6p, human PM/Scl-100, and the Drosophila melanogaster egalitarian protein.	178
99846	cd06143	PAN2_exo	DEDDh 3'-5' exonuclease domain of the eukaryotic exoribonuclease PAN2. PAN2 is the catalytic subunit of poly(A) nuclease (PAN), a Pab1p-dependent 3'-5' exoribonuclease which plays an important role in the posttranscriptional maturation of pre-mRNAs. PAN catalyzes the deadenylation of poly(A) tails, which are initially synthesized to default lengths of 70 to 90, to mRNA-specific lengths of 55 to 71. Pab1p and PAN also play a role in the export and decay of mRNA. PAN2 contains a DEDDh-type DnaQ-like 3'-5' exonuclease domain with three conserved sequence motifs termed ExoI, ExoII and ExoIII, with a specific Hx(4)D conserved pattern at ExoIII. These motifs are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis.	174
99847	cd06144	REX4_like	DEDDh 3'-5' exonuclease domain of RNA exonuclease 4, XPMC2, Interferon Stimulated Gene product of 20 kDa, and similar proteins. This subfamily is composed of RNA exonuclease 4 (REX4 or Rex4p), XPMC2, Interferon (IFN) Stimulated Gene product of 20 kDa (ISG20), and similar proteins. REX4 is involved in pre-rRNA processing. It controls the ratio between the two forms of 5.8S rRNA in yeast. XPMC2 is a Xenopus gene which was identified through its ability to correct a mitotic defect in fission yeast. The human homolog of XPMC2 (hPMC2) may be involved in angiotensin II-induced adrenal cell cycle progression and cell proliferation. ISG20 is an IFN-induced antiviral exonuclease with a strong preference for single-stranded RNA and minor activity towards single-stranded DNA. These proteins are DEDDh-type DnaQ-like 3'-5' exonucleases containing three conserved sequence motifs termed ExoI, ExoII and ExoIII, with a specific Hx(4)D conserved pattern at ExoIII. These motifs are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. REX proteins function in the processing and maturation of many RNA species, similar to the function of Escherichia coli RNase T.	152
99848	cd06145	REX1_like	DEDDh 3'-5' exonuclease domain of RNA exonuclease 1, -3 and similar eukaryotic proteins. This subfamily is composed of RNA exonuclease 1 (REX1 or Rex1p), REX3 (or Rex3p), and similar eukaryotic proteins. In yeast, REX1 and REX3 are required for 5S rRNA and MRP (mitochondrial RNA processing) RNA maturation, respectively. They are DEDDh-type DnaQ-like 3'-5' exonucleases containing three conserved sequence motifs termed ExoI, ExoII and ExoIII, with a specific Hx(4)D conserved pattern at ExoIII. These motifs are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. REX1 is the major exonuclease responsible for pre-tRNA trailer trimming and may also be involved in nuclear CCA turnover. REX proteins function in the processing and maturation of many RNA species, similar to the function of Escherichia coli RNase T.	150
176655	cd06146	mut-7_like_exo	DEDDy 3'-5' exonuclease domain of Caenorhabditis elegans mut-7 and similar proteins. The mut-7 subfamily is composed of Caenorhabditis elegans mut-7 and similar proteins found in plants and metazoans. Mut-7 is implicated in posttranscriptional gene silencing. It contains a DEDDy-type DnaQ-like 3'-5' exonuclease domain possessing three conserved sequence motifs, termed ExoI, ExoII and ExoIII, with a specific YX(3)D pattern at ExoIII. These motifs are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis.	193
99850	cd06147	Rrp6p_like_exo	DEDDy 3'-5' exonuclease domain of yeast Rrp6p, human polymyositis/scleroderma autoantigen 100kDa, and similar proteins. Yeast Rrp6p and its human homolog, the polymyositis/scleroderma autoantigen 100kDa (PM/Scl-100), are exosome-associated proteins involved in the degradation and processing of precursors to stable RNAs. Both proteins contain a DEDDy-type DnaQ-like 3'-5' exonuclease domain possessing three conserved sequence motifs termed ExoI, ExoII and ExoIII, with a specific YX(3)D pattern at ExoIII. The motifs are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. PM/Scl-100, an autoantigen present in the nucleolar compartment of the cell, reacts with autoantibodies produced by about 50% of patients with polymyositis-scleroderma overlap syndrome.	192
99851	cd06148	Egl_like_exo	DEDDy 3'-5' exonuclease domain of Drosophila Egalitarian (Egl) and similar proteins. The Egalitarian (Egl) protein subfamily is composed of Drosophila Egl and similar proteins. Egl is a component of an mRNA-binding complex which is required for oocyte specification. Egl contains a DEDDy-type DnaQ-like 3'-5' exonuclease domain possessing three conserved sequence motifs termed ExoI, ExoII and ExoIII, with a specific YX(3)D pattern at ExoIII. The motifs are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation of this subfamily throughout eukaryotes suggests that its members may be part of ancient RNA processing complexes that are likely to participate in the regulated processing of specific mRNAs. Some members of this subfamily do not have a completely conserved YX(3)D pattern at the ExoIII motif.	197
99852	cd06149	ISG20	DEDDh 3'-5' exonuclease domain of Interferon Stimulated Gene product of 20 kDa, and similar proteins. Interferon (IFN) Stimulated Gene product of 20 kDa (ISG20) is an IFN-induced antiviral exonuclease with a strong preference for single-stranded RNA and minor activity towards single-stranded DNA. It was also independently identified by its response to estrogen and was called HEM45 (human estrogen regulated transcript). ISG20 is a DEDDh-type DnaQ-like 3'-5' exonuclease containing three conserved sequence motifs termed ExoI, ExoII and ExoIII with a specific Hx(4)D conserved pattern at ExoIII. These motifs are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. ISG20 may be a major effector of innate immunity against pathogens including viruses, bacteria, and parasites. It is located in promyelocytic leukemia (PML) nuclear bodies, sites for oncogenic DNA viral transcription and replication. It may carry out its function by degrading viral RNAs as part of the IFN-regulated antiviral response.	157
100007	cd06150	YjgF_YER057c_UK114_like_2	This group of proteins belongs to a large family of YjgF/YER057c/UK114-like proteins present in bacteria, archaea, and eukaryotes with no definitive function. The conserved domain is similar in structure to chorismate mutase but there is no sequence similarity and no functional connection. Members of this family have been implicated in isoleucine (Yeo7, Ibm1, aldR) and purine (YjgF) biosynthesis, as well as threonine anaerobic degradation (tdcF) and mitochondrial DNA maintenance (Ibm1). This domain homotrimerizes forming a distinct intersubunit cavity that may serve as a small molecule binding site.	105
100008	cd06151	YjgF_YER057c_UK114_like_3	This group of proteins belongs to a large family of YjgF/YER057c/UK114-like proteins present in bacteria, archaea, and eukaryotes with no definitive function. The conserved domain is similar in structure to chorismate mutase but there is no sequence similarity and no functional connection. Members of this family have been implicated in isoleucine (Yeo7, Ibm1, aldR) and purine (YjgF) biosynthesis, as well as threonine anaerobic degradation (tdcF) and mitochondrial DNA maintenance (Ibm1). This domain homotrimerizes forming a distinct intersubunit cavity that may serve as a small molecule binding site.	126
100009	cd06152	YjgF_YER057c_UK114_like_4	YjgF, YER057c, and UK114 belong to a large family of proteins present in bacteria, archaea, and eukaryotes with no definitive function.  The conserved domain is similar in structure to chorismate mutase but there is no sequence similarity and no functional connection. Members of this family have been implicated in isoleucine (Yeo7, Ibm1, aldR) and purine (YjgF) biosynthesis, as well as threonine anaerobic degradation (tdcF) and mitochondrial DNA maintenance (Ibm1). This domain homotrimerizes forming a distinct intersubunit cavity that may serve as a small molecule binding site.	114
100010	cd06153	YjgF_YER057c_UK114_like_5	This group of proteins belongs to a large family of YjgF/YER057c/UK114-like proteins present in bacteria, archaea, and eukaryotes with no definitive function. The conserved domain is similar in structure to chorismate mutase but there is no sequence similarity and no functional connection. Members of this family have been implicated in isoleucine (Yeo7, Ibm1, aldR) and purine (YjgF) biosynthesis, as well as threonine anaerobic degradation (tdcF) and mitochondrial DNA maintenance (Ibm1). This domain homotrimerizes forming a distinct intersubunit cavity that may serve as a small molecule binding site.	114
100011	cd06154	YjgF_YER057c_UK114_like_6	This group of proteins belongs to a large family of YjgF/YER057c/UK114-like proteins present in bacteria, archaea, and eukaryotes with no definitive function. The conserved domain is similar in structure to chorismate mutase but there is no sequence similarity and no functional connection. Members of this family have been implicated in isoleucine (Yeo7, Ibm1, aldR) and purine (YjgF) biosynthesis, as well as threonine anaerobic degradation (tdcF) and mitochondrial DNA maintenance (Ibm1). This domain homotrimerizes forming a distinct intersubunit cavity that may serve as a small molecule binding site.	119
100012	cd06155	eu_AANH_C_1	A group of hypothetical eukaryotic proteins, characterized by the presence of an adenine nucleotide alpha hydrolase (AANH)-like domain located N-terminal to two distinctly different YjgF-YER057c-UK114-like domains. This CD contains the first of these domains. The YjgF-YER057c-UK114 protein family is a large family of proteins present in bacteria, archaea, and eukaryotes with no definitive function.  The conserved domain is similar in structure to chorismate mutase but there is no sequence similarity and no functional connection. Members of this family have been implicated in isoleucine (Yeo7, Ibm1, aldR) and purine (YjgF) biosynthesis, as well as threonine anaerobic degradation (tdcF) and mitochondrial DNA maintenance (Ibm1). This domain homotrimerizes forming a distinct intersubunit cavity that may serve as a small molecule binding site.	101
100013	cd06156	eu_AANH_C_2	A group of hypothetical eukaryotic proteins, characterized by the presence of an adenine nucleotide alpha hydrolase (AANH)-like domain located N-terminal to two distinctly different YjgF-YER057c-UK114-like domains. This CD contains the second of these domains. The YjgF-YER057c-UK114 protein family is a large family of proteins present in bacteria, archaea, and eukaryotes with no definitive function.  The conserved domain is similar in structure to chorismate mutase but there is no sequence similarity and no functional connection. Members of this family have been implicated in isoleucine (Yeo7, Ibm1, aldR) and purine (YjgF) biosynthesis, as well as threonine anaerobic degradation (tdcF) and mitochondrial DNA maintenance (Ibm1). This domain homotrimerizes forming a distinct intersubunit cavity that may serve as a small molecule binding site.	118
132726	cd06157	NR_LBD	The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators. Ligand-binding domain (LBD) of nuclear receptor (NR):  Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD).	168
100079	cd06158	S2P-M50_like_1	Uncharacterized homologs of Site-2 protease (S2P), zinc metalloproteases (MEROPS family M50) which cleave transmembrane domains of substrate proteins, regulating intramembrane proteolysis (RIP) of diverse signal transduction mechanisms. Members of the S2P/M50 family of RIP proteases use proteolytic activity within the membrane to transfer information across membranes to integrate gene expression with physiologic stresses occurring in another cellular compartment. In eukaryotic cells they regulate such processes as sterol and lipid metabolism, and endoplasmic reticulum stress responses. In prokaryotes they regulate such processes as sporulation, cell division, stress response, and cell differentiation. This group includes bacterial, eukaryotic, and Archaeal S2P/M50s homologs with a minimal core protein and no PDZ domains.	181
100080	cd06159	S2P-M50_PDZ_Arch	Uncharacterized Archaeal homologs of Site-2 protease (S2P), zinc metalloproteases (MEROPS family M50) which cleave transmembrane domains of substrate proteins, regulating intramembrane proteolysis (RIP) of diverse signal transduction mechanisms. Members of the S2P/M50 family of RIP proteases use proteolytic activity within the membrane to transfer information across membranes to integrate gene expression with physiologic stresses occurring in another cellular compartment. In eukaryotic cells they regulate such processes as sterol and lipid metabolism, and endoplasmic reticulum stress responses. In prokaryotes they regulate such processes as sporulation, cell division, stress response, and cell differentiation. This group appears to be limited to Archaeal S2P/M50s homologs with additional putative N-terminal transmembrane spanning regions, relative to the core protein, and either one or two PDZ domains present.	263
100081	cd06160	S2P-M50_like_2	Uncharacterized homologs of Site-2 protease (S2P), zinc metalloproteases (MEROPS family M50) which cleave transmembrane domains of substrate proteins, regulating intramembrane proteolysis (RIP) of diverse signal transduction mechanisms. Members of the S2P/M50 family of RIP proteases use proteolytic activity within the membrane to transfer information across membranes to integrate gene expression with physiologic stresses occurring in another cellular compartment. In eukaryotic cells they regulate such processes as sterol and lipid metabolism, and endoplasmic reticulum stress responses. In prokaryotes they regulate such processes as sporulation, cell division, stress response, and cell differentiation. This group includes bacterial, eukaryotic, and Archaeal S2P/M50s homologs with additional putative N- and C-terminal transmembrane spanning regions, relative to the core protein, and no PDZ domains.	183
100082	cd06161	S2P-M50_SpoIVFB	SpoIVFB Site-2 protease (S2P), a zinc metalloprotease (MEROPS family M50B), regulates intramembrane proteolysis (RIP), and is involved in the pro-sigmaK pathway of bacterial spore formation. SpoIVFB (sporulation protein, stage IV cell wall formation, F locus, promoter-distal B) is one of 4 proteins involved in endospore formation; the others are SpoIVFA (sporulation protein, stage IV cell wall formation, F locus, promoter-proximal A), BofA (bypass-of-forespore A), and SpoIVB (sporulation protein, stage IV cell wall formation, B locus). SpoIVFB is negatively regulated by SpoIVFA and BofA and activated by SpoIVB. It is thought that SpoIVFB, SpoIVFA, and BofA are located in the mother-cell membrane that surrounds the forespore and that SpoIVB is secreted from the forespore into the space between the two where it activates SpoIVFB.	208
100083	cd06162	S2P-M50_PDZ_SREBP	Sterol regulatory element-binding protein (SREBP) Site-2 protease (S2P), a zinc metalloprotease (MEROPS family M50A), regulates intramembrane proteolysis (RIP) of SREBP and is part of a signal transduction mechanism involved in sterol and lipid metabolism. In sterol-depleted mammalian cells, a two-step proteolytic process releases the N-terminal domains of SREBPs from membranes of the endoplasmic reticulum (ER). These domains translocate into the nucleus, where they activate genes of cholesterol and fatty acid biosynthesis. The first cleavage occurs at Site-1 within the ER lumen to generate an intermediate that is subsequently released from the membrane by cleavage at Site-2, which lies within the first transmembrane domain. It is the second proteolytic step that is carried out by the SREBP Site-2 protease (S2P) which is present in this CD family.  This group appears to be limited to eumetazoan proteins and contains one PDZ domain.	277
100084	cd06163	S2P-M50_PDZ_RseP-like	RseP-like Site-2 proteases (S2P), zinc metalloproteases (MEROPS family M50A), cleave transmembrane domains of substrate proteins, regulating intramembrane proteolysis (RIP) of diverse signal transduction mechanisms. In Escherichia coli, the S2P homolog RseP is involved in the sigmaE pathway of extracytoplasmic stress responses. Also included in this group are such homologs as Bacillus subtilis YluC, Mycobacterium tuberculosis Rv2869c S2P, and Bordetella bronchiseptica HurP.  Rv2869c S2P appears to have a role in the regulation of prokaryotic lipid biosynthesis and membrane composition and YluC of Bacillus has a role in transducing membrane stress. This group includes bacterial and eukaryotic S2P/M50s homologs with either one or two PDZ domains present. PDZ domains are believed to have a regulatory role. The RseP PDZ domain is required for the inhibitory reaction that prevents cleavage of its substrate, RseA.	182
100085	cd06164	S2P-M50_SpoIVFB_CBS	SpoIVFB Site-2 protease (S2P), a zinc metalloprotease (MEROPS family M50B), regulates intramembrane proteolysis (RIP), and is involved in the pro-sigmaK pathway of bacterial spore formation. In this subgroup, SpoIVFB (sporulation protein, stage IV cell wall formation, F locus, promoter-distal B) contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domain. SpoIVFB is one of 4 proteins involved in endospore formation; the others are SpoIVFA (sporulation protein, stage IV cell wall formation, F locus, promoter-proximal A), BofA (bypass-of-forespore A), and SpoIVB (sporulation protein, stage IV cell wall formation, B locus). SpoIVFB is negatively regulated by SpoIVFA and BofA and activated by SpoIVB. It is thought that SpoIVFB, SpoIVFA, and BofA are located in the mother-cell membrane that surrounds the forespore and that SpoIVB is secreted from the forespore into the space between the two where it activates SpoIVFB. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.	227
320680	cd06165	Sortase_A	Sortase domain found in class A sortases. Class A sortases are membrane-bound cysteine transpeptidases distributed in Gram-positive bacteria (mainly present in Firmicutes). They perform a housekeeping role in the cell as members of this group are capable of anchoring a large number of functionally distinct surface proteins containing a cell wall sorting signal to an amino group located on the bacterial cell wall. They do so by catalyzing a transpeptidation reaction in which the surface protein substrate is cleaved at a conserved cell wall-sorting signal (Class A sortases recognize a canonical LPXTG motif, X can be any amino acid), and covalently linked to peptidoglycan for display on the bacterial surface. The prototypical sortase A protein from Staphylococcus aureus (named Sa-SrtA) cleaves the amide bond between threonine and glycine residues of the canonical LPXTG motif in a wide range of protein substrates with diverse functions that can promote bacterial adhesion, nutrient acquisition, host cell invasion, and immune evasion. Next, it catalyzes a transpeptidation reaction by which the proteins are covalently linked to the peptidoglycan precursor lipid II. SrtA therefore affects the ability of a pathogen to establish successful infection. SrtA contains an N-terminal hydrophobic segment, a linker region and an extra-cellular C-terminal catalytic domain. The hydrophobic segment functions as both a signal peptide for secretion and a stop-transfer signal for membrane anchoring. The catalytic domain contains the catalytic TLXTC signature sequence where X is usually a valine, isoleucine or a threonine. The gene encoding SrtA is generally not located in the same gene cluster as its substrates while the gene encoding SrtB is usually clustered in the same locus as its substrate.	127
320681	cd06166	Sortase_D_2	Sortase domain found in subfamily 2 of the class D family of sortases. Class D sortases are cysteine transpeptidases distributed in Gram-positive bacteria (mainly present in Firmicutes). They anchor surface proteins bearing a cell wall sorting signal to peptidoglycans of the bacterial cell wall envelope, which is responsible for spore formation under anaerobic conditions. This involves a transpeptidation reaction in which the surface protein substrate is cleaved at the cell wall sorting signal and covalently linked to peptidoglycan for display on the bacterial surface. The prototypical subfamily 2 of class D sortase from Clostridium perfringens (named Cp-SrtD) recognizes the LPQTGS signal motif for transpeptidation. Its catalytic activity is metal cation- and temperature-dependent. The presence of magnesium appears to enhance Cp-SrtD catalysis towards the LPQTGS signal motif.	127
350201	cd06167	PIN_LabA-like	PIN domain of Synechococcus elongatus LabA (low-amplitude and bright) and related proteins. The LabA-like PIN domain family includes Synechococcus elongatus PCC 7942 LabA which participates in cyanobacterial circadian timing. It is required for negative feedback regulation of the autokinase/autophosphatase KaiC, a central component of the circadian clock system. In particular, LabA seems necessary for KaiC-dependent repression of gene expression. This family also includes the N-terminal domain of limkain b1, a human autoantigen associated with cytoplasmic vesicles. Other members are the LabA-like PIN domains of human ZNF451, uncharacterized Bacillus subtilis YqxD and Escherichia coli YaiI, and the N-terminal domain of a well-conserved group of mainly bacterial proteins with no defined function, which contain a C-terminal LabA_like_C domain. Curiously, a gene labeled NicB from Pseudomonas putida S16, which is described as a putative NADH-dependent hydroxylase involved in the microbial degradation of nicotine also falls into this family.	113
212486	cd06168	LSMD1	LSM domain containing 1. The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSMD1 proteins have a single Sm-like domain structure. Sm-like proteins exist in archaea as well as prokaryotes, forming heptameric and hexameric ring structures similar to those found in eukaryotes.	73
132884	cd06169	BMC	Bacterial Micro-Compartment (BMC) domain. Bacterial micro-compartments are primitive protein-based organelles that sequester specific metabolic pathways in bacterial cells. The prototypical bacterial microcompartment is the carboxysome shell, a bacterial polyhedral organelle which increases the efficiency of CO2 fixation by encapsulating RuBisCO and carbonic anhydrase. They can be divided into two types: alpha-type carboxysomes (alpha-cyanobacteria and proteobacteria) and beta-type carboxysomes (beta-cyanobacteria). In addition to these proteins there are several homologous shell proteins including those found in pdu organelles involved in coenzyme B12-dependent degradation of 1,2-propanediol and eut organelles involved in the cobalamin-dependent degradation of ethanolamine. Structural evidence shows that several carboxysome shell proteins and their homologs (Csos1A, CcmK1,2,4, and PduU) exist as hexamers which might further assemble into extended, tightly packed layers hypothesized to represent the flat facets of the polyhedral organelle's outer shell. Although it has been suggested that other homologous proteins in this family might also form hexamers and play similar functional roles in the construction of their corresponding organelle outer shell, at present no experimental evidence directly supports this view.	62
99777	cd06170	LuxR_C_like	C-terminal DNA-binding domain of LuxR-like proteins. This domain contains a helix-turn-helix motif and binds DNA. Proteins belonging to this group are response regulators; some act as transcriptional activators, others as transcriptional repressors. Many are active as homodimers. Many are two domain proteins in which the DNA binding property of the C-terminal DNA binding domain is modulated by modifications of the N-terminal domain. For example, in the case of LuxR, which participates in the regulation of gene expression in response to fluctuations in cell-population density (quorum-sensing), a signaling molecule, the pheromone Acyl HSL (N-acyl derivatives of homoserine lactone), binds to the N-terminal domain and leads to LuxR dimerization. For others, phosphorylation of the N-terminal domain leads to multimerization, for example Escherichia coli NarL and Sinorhizobium meliloti FixJ. NarL controls gene expression of many respiratory-related operons when environmental nitrate or nitrite is present under anaerobic conditions. FixJ is involved in the transcriptional activation of nitrogen fixation genes. The group also includes small proteins which lack an N-terminal signaling domain, such as Bacillus subtilis GerE. GerE is dimeric and acts in conjunction with sigmaK as an activator or a repressor modulating the expression of various genes, in particular those encoding the spore-coat. These LuxR family regulators may share a similar organization of their target binding sites. For example, the LuxR dimer binds the lux box, a 20bp inverted repeat, GerE dimers bind two 12bp consensus sequences in inverted orientation having the central four bases overlap, and the NarL dimer binds two 7bp inverted repeats separated by 2 bp.	57
100119	cd06171	Sigma70_r4	Sigma70, region (SR) 4 refers to the most C-terminal of four conserved domains found in Escherichia coli (Ec) sigma70, the main housekeeping sigma, and related sigma-factors (SFs). An SF is a dissociable subunit of RNA polymerase that directs bacterial or plastid core RNA polymerase to specific promoter elements located upstream of transcription initiation points. The SR4 of Ec sigma70 and other essential primary SFs contact promoter sequences located 35 base-pairs upstream of the initiation point, recognizing a 6-base-pair -35 consensus TTGACA. Sigma70 related SFs also include SFs which are dispensable for bacterial cell growth, for example Ec sigmaS, SFs which activate regulons in response to a specific signal, for example heat-shock Ec sigmaH, and a group of SFs which includes the extracytoplasmic function (ECF) SFs and is typified by Ec sigmaE which contains SR2 and -4 only. ECF SFs direct the transcription of genes that regulate various responses including periplasmic stress and pathogenesis. Ec sigmaE SR4 also contacts the -35 element, but recognizes a different consensus (a 7-base-pair GGAACTT). Plant SFs recognize sigma70 type promoters and direct transcription of the major plastid RNA polymerase, plastid-encoded RNA polymerase (PEP).	55
340862	cd06172	MFS_LacY	LacY proton/sugar symporter family of the Major Facilitator Superfamily of transporters. LacY proton/sugar family (also called LacY/RafB family) symporters are integral membrane proteins responsible for the transport of specific beta-glucosides into the cell accompanied by the import of a proton. Members include lactose permease (LacY), raffinose permease (RafB), and sucrose permease, which facilitate the transport of beta-galactosides, raffinose, and sucrose, respectively. The prototypical member, LacY, contains 12 transmembrane helices connected by hydrophilic loops with both N and C termini on the cytoplasmic face. The LacY/RafB permease family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	389
340863	cd06173	MFS_MefA_like	Macrolide efflux protein A and similar proteins of the Major Facilitator Superfamily of transporters. This family is composed of Streptococcus pyogenes macrolide efflux protein A (MefA) and similar transporters, many of which remain uncharacterized. Some members may be multidrug resistance (MDR) transporters, which are drug/H+ antiporters (DHAs) that mediate the efflux of a variety of drugs and toxic compounds, conferring resistance to these compounds. MefA confers resistance to 14-membered macrolides including erythromycin and to 15-membered macrolides. It functions as an efflux pump to regulate intracellular macrolide levels. The MefA-like family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	383
349949	cd06174	MFS	Major Facilitator Superfamily. The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated.	378
340865	cd06175	MFS_POT	Proton-coupled oligopeptide transporter (POT) family of the Major Facilitator Superfamily of transporters. The Proton-coupled oligopeptide transporter (POT) family is present across all major kingdoms of life and is known by a variety of names. It is referred to as the Nitrate transporter/Peptide transporter (NRT1/PTR) family (NPF) in plants, and in addition to POT, it is also known as the Peptide transporter (PepT/PTR) or Solute Carrier 15 (SLC15) family in animals. Members of this family are proton-driven symporters involved in nitrogen acquisition in the form of di- and tripeptides. Plant members transport other nitrogenous ligands including nitrate, the plant hormone auxin, and glucosinolate compounds that are important for seed defense. POT proteins exhibit substrate multispecificity, with one transporter able to recognize as many as 8,400 types of di/tripeptides and certain peptide-like drugs. The POT family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to  function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	422
349950	cd06176	MFS_BCD_PucC-like	Bacteriochlorophyll delivery (BCD) family, also called PucC family, of the Major Facilitator Superfamily. The bacteriochlorophyll delivery (BCD) family, also called PucC family, is composed of the PucC protein and related proteins including LhaA (also called ORF477 and F1696) and bacteriochlorophyll synthase 44.5 kDa chain (also called ORF428). These proteins are found in photosynthetic organisms. Rhodobacter capsulatus LhaA and PucC are implicated in light-harvesting complex 1 and 2 (LH1 and LH2) assembly. PucC may function to shepherd or sequester LH2 alpha and beta proteins to facilitate proper assembly, as well as deliver bacteriochlorophyll a to nascent LH2 complexes. The BCD family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	409
340866	cd06177	MFS_NHS	Nucleoside:H(+) symporter family of the Major Facilitator Superfamily of transporters. The prototypical members of the Nucleoside:H(+) symporter (NHS) family are Escherichia coli nucleoside permease NupG and xanthosine permease. Nucleoside:H(+) symporters are proton-driven transporters that facilitate the import of nucleosides across the cytoplasmic membrane. NupG is a broad-specificity transporter of purine and pyrimidine nucleosides. Xanthosine permease is involved in the uptake of xanthosine and other nucleosides such as inosine, adenosine, cytidine, uridine and thymidine. The NHS family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	386
340867	cd06178	MFS_unc93-like	Uncharacterized Unc-93-like proteins of the Major Facilitator Superfamily of transporters. This subfamily consists of uncharacterized proteins, mainly from fungi and plants, with similarity to Caenorhabditis elegans uncoordinated protein 93 (also called putative potassium channel regulatory protein unc-93). Unc-93 acts as a regulatory subunit of a multi-subunit  potassium channel complex that may function in coordinating muscle contraction in C. elegans. The unc93-like subfamily belongs to the Unc-93 family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	415
340868	cd06179	MFS_TRI12_like	Fungal trichothecene efflux pump (TRI12) of the Major Facilitator Superfamily of transporters. This family includes Fusarium sporotrichioides trichothecene efflux pump (TRI12), which may play a role in F. sporotrichioides self-protection against trichothecenes. TRI12 belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	518
340869	cd06180	MFS_YjiJ	Uncharacterized protein YjiJ and similar proteins of the Major Facilitator Superfamily of transporters. This family is composed of Escherichia coli YjiJ and other uncharacterized proteins. They belong to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	371
198409	cd06181	BI-1-like	BAX inhibitor (BI)-1/YccA-like protein family. Mammalian members of the BAX inhibitor (BI)-1 like family of small transmembrane proteins have been shown to have an antiapoptotic effect either by stimulating the antiapoptotic function of Bcl-2, a well-characterized oncogene, or by inhibiting the proapoptotic effect of Bax, another member of the Bcl-2 family. Their broad tissue distribution and high degree of conservation suggests an important regulatory role. This superfamily also contains the lifeguard(LFG)-like proteins and other subfamilies which appear to be related by common descent and also function as inhibitors of apoptosis. In plants, BI-1 like proteins play a role in pathogen resistance. A prokaryotic member, Escherichia coli YccA, has been shown to interact with ATP-dependent protease FtsH, which degrades abnormal membrane proteins as part of a quality control mechanism to keep the integrity of biological membranes.	202
99779	cd06182	CYPOR_like	NADPH cytochrome p450 reductase (CYPOR) serves as an electron donor in several oxygenase systems and is a component of nitric oxide synthases and methionine synthase reductases. CYPOR transfers two electrons from NADPH to the heme of cytochrome p450 via FAD and FMN. CYPOR has a C-terminal ferredoxin reductase (FNR)-like FAD and NAD binding module, an FMN-binding domain, and an additional connecting domain (inserted within the FAD binding region) that orients the FNR and FMN binding domains. Ferredoxin-NADP+ (oxido)reductase is an FAD-containing enzyme that catalyzes the reversible electron transfer between NADP(H) and electron carrier proteins such as ferredoxin and flavodoxin. Isoforms of these flavoproteins (i.e. having a non-covalently bound FAD as a prosthetic group) are present in chloroplasts, mitochondria, and bacteria and participate in a wide variety of redox metabolic pathways. The C-terminal domain contains most of the NADP(H) binding residues and the N-terminal domain interacts non-covalently with the isoalloxazine rings of the flavin molecule which lies largely in a large gap between the two domains. Ferredoxin-NADP+ reductase first accepts one electron from reduced ferredoxin to form a flavin semiquinone intermediate. The enzyme then accepts a second electron to form FADH2, which then transfers two electrons and a proton to NADP+ to form NADPH.	267
99780	cd06183	cyt_b5_reduct_like	Cytochrome b5 reductase catalyzes the reduction of 2 molecules of cytochrome b5 using NADH as an electron donor. Like ferredoxin reductases, these proteins have an N-terminal FAD binding subdomain and a C-terminal NADH binding subdomain, separated by a cleft, which accepts FAD. The NADH-binding moiety interacts with part of the FAD and resembles a Rossmann fold. However, NAD is bound differently than in canonical Rossmann fold proteins. Nitrate reductases, flavoproteins similar to pyridine nucleotide cytochrome reductases, catalyze the reduction of nitrate to nitrite. The enzyme can be divided into three functional fragments that bind the cofactors molybdopterin, heme-iron, and FAD/NADH.	234
99781	cd06184	flavohem_like_fad_nad_binding	FAD_NAD(P)H binding domain of flavohemoglobin. Flavohemoglobins have a globin domain containing a B-type heme fused with a ferredoxin reductase-like FAD/NAD-binding domain. Flavohemoglobins detoxify nitric oxide (NO) via an NO dioxygenase reaction. The hemoglobin domain adopts a globin fold with an embedded heme molecule. Flavohemoglobins also have a C-terminal reductase domain with binding sites for FAD and NAD(P)H. This domain catalyzes the conversion of NO + O2 + NAD(P)H to NO3- + NAD(P)+. Instead of the oxygen transport function of hemoglobins, flavohemoglobins seem to act in NO dioxygenation and NO signalling.	247
99782	cd06185	PDR_like	Phthalate dioxygenase reductase (PDR) is an FMN-dependent reductase that mediates electron transfer from NADH to FMN to an iron sulfur cluster. PDR has an N-terminal ferredoxin reductase (FNR)-like NAD(H) binding domain and a C-terminal iron-sulfur [2Fe-2S] cluster domain. Although structurally homologous to FNR, PDR binds FMN rather than FAD in its FNR-like domain. Electron transfer between pyrimidines and iron-sulfur clusters (Rieske center [2Fe-2S]) or heme groups is mediated by flavins in respiration, photosynthesis, and oxygenase systems. Type I dioxygenase systems, including the hydroxylate phthalate system, have 2 components, a monomeric reductase consisting of a flavin and a 2Fe-2S center and a multimeric oxygenase. In contrast to other Rieske dioxygenases the ferredoxin like domain is C-, not N-terminal.	211
99783	cd06186	NOX_Duox_like_FAD_NADP	NADPH oxidase (NOX) catalyzes the generation of reactive oxygen species (ROS) such as superoxide and hydrogen peroxide. ROS were originally identified as bactericidal agents in phagocytes, but are now also implicated in cell signaling and metabolism. NOX has a 6-alpha helix heme-binding transmembrane domain fused to a flavoprotein with the nucleotide binding domain located in the cytoplasm. Duox enzymes link a peroxidase domain to the NOX domain via a single  transmembrane and EF-hand Ca2+ binding sites. The flavoprotein module has a ferredoxin like FAD/NADPH binding domain. In classical phagocytic NOX2, electron transfer occurs from NADPH to FAD to the heme of cytb to oxygen leading to superoxide formation.	210
99784	cd06187	O2ase_reductase_like	The oxygenase reductase FAD/NADH binding domain acts as part of the multi-component bacterial oxygenases which oxidize hydrocarbons using oxygen as the oxidant. Electron transfer is from NADH via FAD (in the oxygenase reductase) and a [2Fe-2S] ferredoxin center (fused to the FAD/NADH domain and/or discrete) to the oxygenase. Dioxygenases add both atoms of oxygen to the substrate, while mono-oxygenases (aka mixed oxygenases) add one atom to the substrate and one atom to water. In dioxygenases, Class I enzymes are 2 component, containing a reductase with Rieske type [2Fe-2S] redox centers and an oxygenase. Class II are 3 component, having discrete flavin and ferredoxin proteins and an oxygenase. Class III have 2 [2Fe-2S] centers, one fused to the flavin domain and the other separate.	224
99785	cd06188	NADH_quinone_reductase	Na+-translocating NADH:quinone oxidoreductase (Na+-NQR) FAD/NADH binding domain. (Na+-NQR) provides a means of storing redox reaction energy via the transmembrane translocation of Na+ ions. The C-terminal domain resembles ferredoxin:NADP+ oxidoreductase, and has NADH and FAD binding sites. (Na+-NQR) is distinct from H+-translocating NADH:quinone oxidoreductases and noncoupled NADH:quinone oxidoreductases. The NAD(P) binding domain of ferredoxin reductase-like proteins catalyze electron transfer between an NAD(P)-binding domain of the alpha/beta class and a discrete (usually N-terminal) domain which varies in orientation with respect to the NAD(P) binding domain. The N-terminal domain of this group typically contains an iron-sulfur cluster binding domain.	283
99786	cd06189	flavin_oxioreductase	NAD(P)H dependent flavin oxidoreductases use flavin as a substrate in mediating electron transfer from iron complexes or iron proteins. Structurally similar to ferredoxin reductases, but with only 15% sequence identity, flavin reductases reduce FAD, FMN, or riboflavin via NAD(P)H. Flavin is used as a substrate, rather than a tightly bound prosthetic group as in flavoenzymes; weaker binding is due to the absence of a binding site for the AMP moiety of FAD.	224
99787	cd06190	T4MO_e_transfer_like	Toluene-4-monooxygenase electron transfer component of Pseudomonas mendocina hydroxylates toluene and forms p-cresol as part of a three component toluene-4-monooxygenase system. Electron transfer is from NADH to an NADH:ferredoxin oxidoreductase (TmoF in P. mendocina) to ferredoxin to an iron-containing oxygenase. TmoF is homologous to other mono- and dioxygenase systems within the ferredoxin reductase family.	232
99788	cd06191	FNR_iron_sulfur_binding	Iron-sulfur binding Ferredoxin Reductase (FNR) proteins combine the FAD and NAD(P) binding regions of FNR with a C-terminal iron-sulfur binding cluster domain. FNR was initially identified as a chloroplast reductase activity catalyzing the electron transfer from reduced iron-sulfur protein ferredoxin to NADP+ as the final step in the electron transport mechanism of photosystem I. FNR transfers electrons from reduced ferredoxin to FAD (forming FADH2 via a semiquinone intermediate) and then transfers a hydride ion to convert NADP+ to NADPH. FNR has since been shown to utilize a variety of electron acceptors and donors and has a variety of physiological functions including nitrogen assimilation, dinitrogen fixation, steroid hydroxylation, fatty acid metabolism, oxygenase activity, and methane assimilation in a variety of organisms. FNR has an NAD(P)-binding sub-domain of the alpha/beta class and a discrete (usually N-terminal) flavin sub-domain which varies in orientation with respect to the NAD(P) binding domain. The N-terminal moiety may contain a flavin prosthetic group (as in flavoenzymes) or use flavin as a substrate. Because flavins such as FAD can exist in oxidized, semiquinone (one-electron reduced), or fully reduced hydroquinone forms, FNR can interact with one- and two-electron carriers. FNR has a strong preference for NADP(H) vs NAD(H).	231
99789	cd06192	DHOD_e_trans_like	FAD/NAD binding domain (electron transfer subunit) of dihydroorotate dehydrogenase-like proteins. Dihydroorotate dehydrogenases (DHODs) catalyze the only redox reaction in pyrimidine de novo biosynthesis. They catalyze the oxidation of (S)-dihydroorotate to orotate coupled with the reduction of NAD+. In L. lactis, DHOD B (encoded by pyrDa) is co-expressed with pyrK and both gene products are required for full activity, as well as NAD binding. NAD(P) binding domain of ferredoxin reductase-like proteins catalyze electron transfer between an NAD(P)-binding domain of the alpha/beta class and a discrete (usually N-terminal) domain which varies in orientation with respect to the NAD(P) binding domain. The N-terminal domain may contain a flavin prosthetic group (as in flavoenzymes) or use flavin as a substrate. Ferredoxin is reduced in the final stage of photosystem I. The flavoprotein Ferredoxin-NADP+ reductase transfers electrons from reduced ferredoxin to FAD (forming FADH2 via a semiquinone intermediate) which then transfers a hydride ion to convert NADP+ to NADPH.	243
99790	cd06193	siderophore_interacting	Siderophore interacting proteins share the domain structure of the ferredoxin reductase like family. Siderophores are produced in various bacteria (and some plants) to extract iron from hosts. Binding constants are high, so iron can be pilfered from transferrin and lactoferrin for bacterial uptake, contributing to pathogen virulence. Ferredoxin reductase (FNR), an FAD and NAD(P) binding protein, was initially identified as a chloroplast reductase activity, catalyzing the electron transfer from reduced iron-sulfur protein ferredoxin to NADP+ as the final step in the electron transport mechanism of photosystem I. FNR transfers electrons from reduced ferredoxin to FAD (forming FADH2 via a semiquinone intermediate) and then transfers a hydride ion to convert NADP+ to NADPH. FNR has since been shown to utilize a variety of electron acceptors and donors and has a variety of physiological functions including nitrogen assimilation, dinitrogen fixation, steroid hydroxylation, fatty acid metabolism, oxygenase activity, and methane assimilation in a variety of organisms. FNR has an NAD(P)-binding sub-domain of the alpha/beta class and a discrete (usually N-terminal) flavin sub-domain which varies in orientation with respect to the NAD(P) binding domain. The N-terminal moiety may contain a flavin prosthetic group (as in flavoenzymes) or use flavin as a substrate. Because flavins such as FAD can exist in oxidized, semiquinone (one-electron reduced), or fully reduced hydroquinone forms, FNR can interact with one- and two-electron carriers. FNR has a strong preference for NADP(H) vs NAD(H).	235
99791	cd06194	FNR_N-term_Iron_sulfur_binding	Iron-sulfur binding ferredoxin reductase (FNR) proteins combine the FAD and NAD(P) binding regions of FNR with an N-terminal Iron-Sulfur binding cluster domain. Ferredoxin-NADP+ (oxido)reductase is an FAD-containing enzyme that catalyzes the reversible electron transfer between NADP(H) and electron carrier proteins such as ferredoxin and flavodoxin. Isoforms of these flavoproteins (i.e. having a non-covalently bound FAD as a prosthetic group) are present in chloroplasts, mitochondria, and bacteria in which they participate in a wide variety of redox metabolic pathways. The C-terminal domain contains most of the NADP(H) binding residues and the N-terminal domain interacts non-covalently with the isoalloxazine rings of the flavin molecule which lies largely in a large gap between the two domains. Ferredoxin-NADP+ reductase first accepts one electron from reduced ferredoxin to form a flavin semiquinone intermediate. The enzyme then accepts a second electron to form FADH2 which then transfers two electrons and a proton to NADP+ to form NADPH.	222
99792	cd06195	FNR1	Ferredoxin-NADP+ (oxido)reductase is an FAD-containing enzyme that catalyzes the reversible electron transfer between NADP(H) and electron carrier proteins such as ferredoxin and flavodoxin. Isoforms of these flavoproteins (i.e. having a non-covalently bound FAD as a prosthetic group) are present in chloroplasts, mitochondria, and bacteria in which they participate in a wide variety of redox metabolic pathways. The C-terminal domain contains most of the NADP(H) binding residues and the N-terminal domain interacts non-covalently with the isoalloxazine rings of the flavin molecule which lies largely in a large gap between the two domains. Ferredoxin-NADP+ reductase first accepts one electron from reduced ferredoxin to form a flavin semiquinone intermediate. The enzyme then accepts a second electron to form FADH2 which then transfers two electrons and a proton to NADP+ to form NADPH.	241
99793	cd06196	FNR_like_1	Ferredoxin reductase-like proteins catalyze electron transfer between an NAD(P)-binding domain of the alpha/beta class and a discrete (usually N-terminal) domain which varies in orientation with respect to the NAD(P) binding domain. The N-terminal region may contain a flavin prosthetic group (as in flavoenzymes) or use flavin as a substrate. Ferredoxin is reduced in the final stage of photosystem I. The flavoprotein Ferredoxin-NADP+ reductase transfers electrons from reduced ferredoxin to FAD (forming FADH2 via a semiquinone intermediate) which then transfers a hydride ion to convert NADP+ to NADPH.	218
99794	cd06197	FNR_like_2	FAD/NAD(P) binding domain of ferredoxin reductase-like proteins. Ferredoxin reductase (FNR) was initially identified as a chloroplast reductase activity, catalyzing the electron transfer from reduced iron-sulfur protein ferredoxin to NADP+ as the final step in the electron transport mechanism of photosystem I. FNR transfers electrons from reduced ferredoxin to FAD (forming FADH2 via a semiquinone intermediate) and then transfers a hydride ion to convert NADP+ to NADPH. FNR has since been shown to utilize a variety of electron acceptors and donors and have a variety of physiological functions in a variety of organisms including nitrogen assimilation, dinitrogen fixation, steroid hydroxylation, fatty acid metabolism, oxygenase activity, and methane assimilation. FNR has an NAD(P)-binding sub-domain of the alpha/beta class and a discrete (usually N-terminal) flavin sub-domain which varies in orientation with respect to the NAD(P) binding domain. The N-terminal moiety may contain a flavin prosthetic group (as in flavoenzymes) or use flavin as a substrate. Because flavins such as FAD can exist in oxidized, semiquinone (one-electron reduced), or fully reduced hydroquinone forms, FNR can interact with one and two electron carriers. FNR has a strong preference for NADP(H) vs NAD(H).	220
99795	cd06198	FNR_like_3	NAD(P) binding domain of  ferredoxin reductase-like proteins catalyze electron transfer between an NAD(P)-binding sub-domain of the alpha/beta class and a discrete (usually N-terminal) domain, which varies in orientation with respect to the NAD(P) binding domain. The N-terminal domain may contain a flavin prosthetic group (as in flavoenzymes) or use flavin as a substrate. Ferredoxin is reduced in the final stage of photosystem I. The flavoprotein Ferredoxin-NADP+ reductase transfers electrons from reduced ferredoxin to FAD (forming FADH2 via a semiquinone intermediate) which then transfers a hydride ion to convert NADP+ to NADPH.	216
99796	cd06199	SiR	Cytochrome p450-like alpha subunits of E. coli sulfite reductase (SiR) multimerize with beta subunits to catalyze the NADPH dependent reduction of sulfite to sulfide. Beta subunits have an Fe4S4 cluster and a siroheme, while the alpha subunits (cysJ gene) are of the cytochrome p450 (CyPor) family having FAD and FMN as prosthetic groups and utilizing NADPH. Cypor (including cyt p450 reductase, nitric oxide synthase, and methionine synthase reductase) are ferredoxin reductase (FNR)-like proteins with an additional N-terminal FMN domain and a connecting sub-domain inserted within the flavin binding portion of the FNR-like domain. The connecting domain orients the N-terminal FMN domain with the C-terminal FNR domain.	360
99797	cd06200	SiR_like1	Cytochrome p450-like alpha subunits of E. coli sulfite reductase (SiR) multimerize with beta subunits to catalyze the NADPH dependent reduction of sulfite to sulfide. Beta subunits have an Fe4S4 cluster and a siroheme, while the alpha subunits (cysJ gene) are of the cytochrome p450 (CyPor) family having FAD and FMN as prosthetic groups and utilizing NADPH. Cypor (including cyt p450 reductase, nitric oxide synthase, and methionine synthase reductase) are ferredoxin reductase (FNR)-like proteins with an additional N-terminal FMN domain and a connecting sub-domain inserted within the flavin binding portion of the FNR-like domain. The connecting domain orients the N-terminal FMN domain with the C-terminal FNR domain. NADPH cytochrome p450 reductase (CYPOR) serves as an electron donor in several oxygenase systems and is a component of nitric oxide synthases and methionine synthase reductases. CYPOR transfers two electrons from NADPH to the heme of cytochrome p450 via FAD and FMN. Ferredoxin-NADP+ (oxido)reductase is an FAD-containing enzyme that catalyzes the reversible electron transfer between NADP(H) and electron carrier proteins such as ferredoxin and flavodoxin. Isoforms of these flavoproteins (i.e. having a non-covalently bound FAD as a prosthetic group) are present in chloroplasts, mitochondria, and bacteria in which they participate in a wide variety of redox metabolic pathways. The C-terminal domain contains most of the NADP(H) binding residues, and the N-terminal domain interacts non-covalently with the isoalloxazine rings of the flavin molecule, which lies largely in a large gap between the two domains. Ferredoxin-NADP+ reductase first accepts one electron from reduced ferredoxin to form a flavin semiquinone intermediate. The enzyme then accepts a second electron to form FADH2 which then transfers two electrons and a proton to NADP+ to form NADPH.	245
99798	cd06201	SiR_like2	Cytochrome p450-like alpha subunits of E. coli sulfite reductase (SiR) multimerize with beta subunits to catalyze the NADPH dependent reduction of sulfite to sulfide. Beta subunits have an Fe4S4 cluster and a siroheme, while the alpha subunits (cysJ gene) are of the cytochrome p450 (CyPor) family having FAD and FMN as prosthetic groups and utilizing NADPH. Cypor (including cyt p450 reductase, nitric oxide synthase, and methionine synthase reductase) are ferredoxin reductase (FNR)-like proteins with an additional N-terminal FMN domain and a connecting sub-domain inserted within the flavin binding portion of the FNR-like domain. The connecting domain orients the N-terminal FMN domain with the C-terminal FNR domain. NADPH cytochrome p450 reductase (CYPOR) serves as an electron donor in several oxygenase systems and is a component of nitric oxide synthases and methionine synthase reductases. CYPOR transfers two electrons from NADPH to the heme of cytochrome p450 via FAD and FMN. Ferredoxin-NADP+ (oxido)reductase is an FAD-containing enzyme that catalyzes the reversible electron transfer between NADP(H) and electron carrier proteins such as ferredoxin and flavodoxin. Isoforms of these flavoproteins (i.e. having a non-covalently bound FAD as a prosthetic group) are present in chloroplasts, mitochondria, and bacteria in which they participate in a wide variety of redox metabolic pathways. The C-terminal domain contains most of the NADP(H) binding residues and the N-terminal domain interacts non-covalently with the isoalloxazine rings of the flavin molecule which lies largely in a large gap between the two domains. Ferredoxin-NADP+ reductase first accepts one electron from reduced ferredoxin to form a flavin semiquinone intermediate. The enzyme then accepts a second electron to form FADH2 which then transfers two electrons and a proton to NADP+ to form NADPH.	289
99799	cd06202	Nitric_oxide_synthase	The ferredoxin-reductase (FNR) like C-terminal domain of the nitric oxide synthase (NOS) fuses with a heme-containing N-terminal oxidase domain. The reductase portion is similar in structure to NADPH dependent cytochrome-450 reductase (CYPOR), having an inserted connecting sub-domain within the FAD binding portion of FNR. NOS differs from CYPOR in a requirement for the cofactor tetrahydrobiopterin and, unlike most CYPOR, is dimeric. Nitric oxide synthase produces nitric oxide in the conversion of L-arginine to L-citrulline. NOS has been implicated in a variety of processes including cytotoxicity, anti-inflammation, neurotransmission, and vascular smooth muscle relaxation.	406
99800	cd06203	methionine_synthase_red	Human methionine synthase reductase (MSR) restores methionine synthase which is responsible for the regeneration of methionine from homocysteine, as well as the conversion of methyltetrahydrofolate to tetrahydrofolate. In MSR, electrons are transferred from NADPH to FAD to FMN to cob(II)alamin. MSR resembles proteins of the cytochrome p450 family including nitric oxide synthase and the alpha subunit of sulfite reductase, but contains an extended hinge region. NADPH cytochrome p450 reductase (CYPOR) serves as an electron donor in several oxygenase systems and is a component of nitric oxide synthases and methionine synthase reductases. CYPOR transfers two electrons from NADPH to the heme of cytochrome p450 via FAD and FMN. CYPORs resemble ferredoxin reductase (FNR) but have a connecting subdomain inserted within the flavin binding region, which helps orient the FMN binding domain with the FNR module. Ferredoxin-NADP+ (oxido)reductase is an FAD-containing enzyme that catalyzes the reversible electron transfer between NADP(H) and electron carrier proteins such as ferredoxin and flavodoxin. Isoforms of these flavoproteins (i.e. having a non-covalently bound FAD as a prosthetic group) are present in chloroplasts, mitochondria, and bacteria in which they participate in a wide variety of redox metabolic pathways. The C-terminal domain contains most of the NADP(H) binding residues and the N-terminal domain interacts non-covalently with the isoalloxazine rings of the flavin molecule which lies largely in a large gap between the two domains. Ferredoxin-NADP+ reductase first accepts one electron from reduced ferredoxin to form a flavin semiquinone intermediate. The enzyme then accepts a second electron to form FADH2 which then transfers two electrons and a proton to NADP+ to form NADPH.	398
99801	cd06204	CYPOR	NADPH cytochrome p450 reductase (CYPOR) serves as an electron donor in several oxygenase systems and is a component of nitric oxide synthases and methionine synthase reductases. CYPOR transfers two electrons from NADPH to the heme of cytochrome p450 via FAD and FMN. Ferredoxin-NADP+ (oxido)reductase is an FAD-containing enzyme that catalyzes the reversible electron transfer between NADP(H) and electron carrier proteins such as ferredoxin and flavodoxin. Isoforms of these flavoproteins (i.e. having a non-covalently bound FAD as a prosthetic group) are present in chloroplasts, mitochondria, and bacteria in which they participate in a wide variety of redox metabolic pathways. The C-terminal domain contains most of the NADP(H) binding residues and the N-terminal domain interacts non-covalently with the isoalloxazine rings of the flavin molecule which lies largely in a large gap between the two domains. Ferredoxin-NADP+ reductase first accepts one electron from reduced ferredoxin to form a flavin semiquinone intermediate. The enzyme then accepts a second electron to form FADH2 which then transfers two electrons and a proton to NADP+ to form NADPH.	416
99802	cd06206	bifunctional_CYPOR	These bifunctional proteins fuse N-terminal cytochrome p450 with a cytochrome p450 reductase (CYPOR). NADPH cytochrome p450 reductase serves as an electron donor in several oxygenase systems and is a component of nitric oxide synthases and methionine synthase reductases. CYPOR transfers two electrons from NADPH to the heme of cytochrome p450 via FAD and FMN. Ferredoxin-NADP+ (oxido)reductase is an FAD-containing enzyme that catalyzes the reversible electron transfer between NADP(H) and electron carrier proteins such as ferredoxin and flavodoxin. Isoforms of these flavoproteins (i.e. having a non-covalently bound FAD as a prosthetic group) are present in chloroplasts, mitochondria, and bacteria in which they participate in a wide variety of redox metabolic pathways. The C-terminal domain contains most of the NADP(H) binding residues and the N-terminal domain interacts non-covalently with the isoalloxazine rings of the flavin molecule which lies largely in a large gap between the two domains. Ferredoxin-NADP+ reductase first accepts one electron from reduced ferredoxin to form a flavin semiquinone intermediate. The enzyme then accepts a second electron to form FADH2 which then transfers two electrons and a proton to NADP+ to form NADPH.	384
99803	cd06207	CyPoR_like	NADPH cytochrome p450 reductase (CYPOR) serves as an electron donor in several oxygenase systems and is a component of nitric oxide synthases and methionine synthase reductases. CYPOR transfers two electrons from NADPH to the heme of cytochrome p450 via FAD and FMN. Ferredoxin-NADP+ (oxido)reductase is an FAD-containing enzyme that catalyzes the reversible electron transfer between NADP(H) and electron carrier proteins such as ferredoxin and flavodoxin. Isoforms of these flavoproteins (i.e. having a non-covalently bound FAD as a prosthetic group) are present in chloroplasts, mitochondria, and bacteria in which they participate in a wide variety of redox metabolic pathways. The C-terminal domain contains most of the NADP(H) binding residues and the N-terminal domain interacts non-covalently with the isoalloxazine rings of the flavin molecule which lies largely in a large gap between the two domains. Ferredoxin-NADP+ reductase first accepts one electron from reduced ferredoxin to form a flavin semiquinone intermediate. The enzyme then accepts a second electron to form FADH2 which then transfers two electrons and a proton to NADP+ to form NADPH.	382
99804	cd06208	CYPOR_like_FNR	These ferredoxin reductases are related to the NADPH cytochrome p450 reductases (CYPOR), but lack the FAD-binding region connecting sub-domain. Ferredoxin-NADP+ reductase (FNR) is an FAD-containing enzyme that catalyzes the reversible electron transfer between NADP(H) and electron carrier proteins, such as ferredoxin and flavodoxin. Isoforms of these flavoproteins (i.e. having a non-covalently bound FAD as a prosthetic group) are present in chloroplasts, mitochondria, and bacteria in which they participate in a wide variety of redox metabolic pathways. The C-terminal domain contains most of the NADP(H) binding residues and the N-terminal domain interacts non-covalently with the isoalloxazine rings of the flavin molecule which lies largely in a large gap between the two domains. Ferredoxin-NADP+ reductase first accepts one electron from reduced ferredoxin to form a flavin semiquinone intermediate. The enzyme then accepts a second electron to form FADH2, which then transfers two electrons and a proton to NADP+ to form NADPH. CYPOR serves as an electron donor in several oxygenase systems and is a component of nitric oxide synthases, sulfite reductase, and methionine synthase reductases. CYPOR transfers two electrons from NADPH to the heme of cytochrome p450 via FAD and FMN. CYPOR has a C-terminal FNR-like FAD and NAD binding module, an FMN-binding domain, and an additional connecting domain (inserted within the FAD binding region) that orients the FNR and FMN-binding domains. The C-terminal domain contains most of the NADP(H) binding residues, and the N-terminal domain interacts non-covalently with the isoalloxazine rings of the flavin molecule, which lies largely in a large gap between the two domains. Ferredoxin-NADP+ reductase first accepts one electron from reduced ferredoxin to form a flavin semiquinone intermediate. The enzyme then accepts a second electron to form FADH2 which then transfers two electrons and a proton to NADP+ to form NADPH.	286
99805	cd06209	BenDO_FAD_NAD	Benzoate dioxygenase reductase (BenDO) FAD/NAD binding domain. Oxygenases oxidize hydrocarbons using dioxygen as the oxidant. As Class I bacterial dioxygenases, benzoate dioxygenase-like proteins combine a [2Fe-2S] cluster-containing N-terminal ferredoxin fused to an FAD/NAD(P) domain. In the dioxygenase FAD/NAD(P) binding domain, the reductase transfers 2 electrons from NAD(P)H to the oxygenase, which inserts oxygen into an aromatic substrate, an initial step in microbial aerobic degradation of aromatic rings. Flavin oxidoreductases use flavins as substrates, unlike flavoenzymes which have a flavin prosthetic group.	228
99806	cd06210	MMO_FAD_NAD_binding	Methane monooxygenase (MMO) reductase of methanotrophs catalyzes the NADH-dependent hydroxylation of methane to methanol. This multicomponent enzyme mediates electron transfer via a hydroxylase (MMOH), a coupling protein, and a reductase which is comprised of an N-terminal [2Fe-2S] ferredoxin domain, an FAD binding subdomain, and an NADH binding subdomain. Oxygenases oxidize hydrocarbons using dioxygen as the oxidant. Dioxygenases add both atoms of oxygen to the substrate, while mono-oxygenases add one atom to the substrate and one atom to water.	236
99807	cd06211	phenol_2-monooxygenase_like	Phenol 2-monooxygenase (phenol hydroxylase) is a flavoprotein monooxygenase, able to use molecular oxygen as a substrate in the microbial degradation of phenol. This protein is encoded by a single gene and uses a tightly bound FAD cofactor in the NAD(P)H dependent conversion of phenol and O2 to catechol and H2O. This group is related to the NAD binding ferredoxin reductases.	238
99808	cd06212	monooxygenase_like	The oxygenase reductase FAD/NADH binding domain acts as part of the multi-component bacterial oxygenases which oxidize hydrocarbons. These flavoprotein monooxygenases use molecular oxygen as a substrate and require reduced FAD. One atom of oxygen is incorporated into the aromatic compound, while the other is used to form a molecule of water. In contrast, dioxygenases add both atoms of oxygen to the substrate.	232
99809	cd06213	oxygenase_e_transfer_subunit	The oxygenase reductase FAD/NADH binding domain acts as part of the multi-component bacterial oxygenases which oxidize hydrocarbons. Electron transfer is from NADH via FAD (in the oxygenase reductase) and a [2Fe-2S] ferredoxin center (fused to the FAD/NADH domain and/or discrete) to the oxygenase. Dioxygenases add both atoms of oxygen to the substrate while mono-oxygenases add one atom to the substrate and one atom to water. In dioxygenases, Class I enzymes are 2 component, containing a reductase with Rieske type [2Fe-2S] redox centers and an oxygenase. Class II are 3 component, having discrete flavin and ferredoxin proteins and an oxygenase. Class III have 2 [2Fe-2S] centers, one fused to the flavin domain and the other separate.	227
99810	cd06214	PA_degradation_oxidoreductase_like	NAD(P) binding domain of ferredoxin reductase like phenylacetic acid (PA) degradation oxidoreductase. PA oxidoreductases of E. coli hydroxylate PA-CoA in the second step of PA degradation. Members of this group typically fuse a ferredoxin reductase-like domain with an iron-sulfur binding cluster domain. Ferredoxins catalyze electron transfer between an NAD(P)-binding domain of the alpha/beta class and a discrete (usually N-terminal) domain which varies in orientation with respect to the NAD(P) binding domain. The N-terminal portion may contain a flavin prosthetic group, as in flavoenzymes, or use flavin as a substrate. Ferredoxin-NADP+ (oxido)reductase is an FAD-containing enzyme that catalyzes the reversible electron transfer between NADP(H) and electron carrier proteins such as ferredoxin and flavodoxin. Isoforms of these flavoproteins (i.e. having a non-covalently bound FAD as a prosthetic group) are present in chloroplasts, mitochondria, and bacteria and participate in a wide variety of redox metabolic pathways. The C-terminal domain contains most of the NADP(H) binding residues and the N-terminal domain interacts non-covalently with the isoalloxazine rings of the flavin molecule which lies largely in a large gap between the two domains. Ferredoxin-NADP+ reductase first accepts one electron from reduced ferredoxin to form a flavin semiquinone intermediate. The enzyme then accepts a second electron to form FADH2 which then transfers two electrons and a proton to NADP+ to form NADPH.	241
99811	cd06215	FNR_iron_sulfur_binding_1	Iron-sulfur binding ferredoxin reductase (FNR) proteins combine the FAD and NAD(P) binding regions of FNR with an iron-sulfur binding cluster domain. Ferredoxin-NADP+ (oxido)reductase is an FAD-containing enzyme that catalyzes the reversible electron transfer between NADP(H) and electron carrier proteins such as ferredoxin and flavodoxin. Isoforms of these flavoproteins (i.e. having a non-covalently bound FAD as a prosthetic group) are present in chloroplasts, mitochondria, and bacteria in which they participate in a wide variety of redox metabolic pathways. The C-terminal portion of the FAD/NAD binding domain contains most of the NADP(H) binding residues and the N-terminal sub-domain interacts non-covalently with the isoalloxazine rings of the flavin molecule which lies largely in a large gap between the two domains. In this ferredoxin like sub-group, the FAD/NAD sub-domain is typically fused to a C-terminal iron-sulfur binding domain. Iron-sulfur proteins play an important role in electron transfer processes and in various enzymatic reactions. The family includes plant and algal ferredoxins which act as electron carriers in photosynthesis and ferredoxins which participate in redox chains from bacteria to mammals. Ferredoxin reductase first accepts one electron from reduced ferredoxin to form a flavin semiquinone intermediate. The enzyme then accepts a second electron to form FADH2 which then transfers two electrons and a proton to NADP+ to form NADPH.	231
99812	cd06216	FNR_iron_sulfur_binding_2	Iron-sulfur binding ferredoxin reductase (FNR) proteins combine the FAD and NAD(P) binding regions of FNR with an iron-sulfur binding cluster domain. Ferredoxin-NADP+ (oxido)reductase is an FAD-containing enzyme that catalyzes the reversible electron transfer between NADP(H) and electron carrier proteins such as ferredoxin and flavodoxin. Isoforms of these flavoproteins (i.e. having a non-covalently bound FAD as a prosthetic group) are present in chloroplasts, mitochondria, and bacteria in which they participate in a wide variety of redox metabolic pathways. The C-terminal domain contains most of the NADP(H) binding residues and the N-terminal domain interacts non-covalently with the isoalloxazine rings of the flavin molecule which lies largely in a large gap between the two domains. Ferredoxin-NADP+ reductase first accepts one electron from reduced ferredoxin to form a flavin semiquinone intermediate. The enzyme then accepts a second electron to form FADH2 which then transfers two electrons and a proton to NADP+ to form NADPH.	243
99813	cd06217	FNR_iron_sulfur_binding_3	Iron-sulfur binding ferredoxin reductase (FNR) proteins combine the FAD and NAD(P) binding regions of FNR with an iron-sulfur binding cluster domain. Ferredoxin-NADP+ (oxido)reductase is an FAD-containing enzyme that catalyzes the reversible electron transfer between NADP(H) and electron carrier proteins such as ferredoxin and flavodoxin. Isoforms of these flavoproteins (i.e. having a non-covalently bound FAD as a prosthetic group) are present in chloroplasts, mitochondria, and bacteria in which they participate in a wide variety of redox metabolic pathways. The C-terminal domain contains most of the NADP(H) binding residues and the N-terminal domain interacts non-covalently with the isoalloxazine rings of the flavin molecule which lies largely in a large gap between the two domains. Ferredoxin-NADP+ reductase first accepts one electron from reduced ferredoxin to form a flavin semiquinone intermediate. The enzyme then accepts a second electron to form FADH2 which then transfers two electrons and a proton to NADP+ to form NADPH.	235
99814	cd06218	DHOD_e_trans	FAD/NAD binding domain in the electron transfer subunit of dihydroorotate dehydrogenase. Dihydroorotate dehydrogenases (DHODs) catalyze the only redox reaction in pyrimidine de novo biosynthesis. They catalyze the oxidation of (S)-dihydroorotate to orotate coupled with the reduction of NAD+. In L. lactis, DHOD B (encoded by pyrDa) is co-expressed with pyrK and both gene products are required for full activity, as well as 3 cofactors: FMN, FAD, and an [2Fe-2S] cluster.	246
99815	cd06219	DHOD_e_trans_like1	FAD/NAD binding domain in the electron transfer subunit of dihydroorotate dehydrogenase-like proteins. Dihydroorotate dehydrogenases (DHODs) catalyze the only redox reaction in pyrimidine de novo biosynthesis. They catalyze the oxidation of (S)-dihydroorotate to orotate coupled with the reduction of NAD+. In L. lactis, DHOD B (encoded by pyrDa) is co-expressed with pyrK and both gene products are required for full activity, as well as NAD binding. NAD(P) binding domain of ferredoxin reductase-like proteins catalyze electron transfer between an NAD(P)-binding domain of the alpha/beta class and a discrete (usually N-terminal) domain which vary in orientation with respect to the NAD(P) binding domain. The N-terminal domain may contain a flavin prosthetic group, as in flavoenzymes, or use flavin as a substrate. Ferredoxin is reduced in the final stage of photosystem I. The flavoprotein Ferredoxin-NADP+ reductase transfers electrons from reduced ferredoxin to FAD, forming FADH2 via a semiquinone intermediate, and then transfers a hydride ion to convert NADP+ to NADPH.	248
99816	cd06220	DHOD_e_trans_like2	FAD/NAD binding domain in the electron transfer subunit of dihydroorotate dehydrogenase-like proteins. Dihydroorotate dehydrogenases (DHODs) catalyze the only redox reaction in pyrimidine de novo biosynthesis. They catalyze the oxidation of (S)-dihydroorotate to orotate coupled with the reduction of NAD+. In L. lactis, DHOD B (encoded by pyrDa) is co-expressed with pyrK and both gene products are required for full activity, as well as 3 cofactors: FMN, FAD, and an [2Fe-2S] cluster.	233
99817	cd06221	sulfite_reductase_like	Anaerobic sulfite reductase contains an FAD and NADPH binding module with structural similarity to ferredoxin reductase and sequence similarity to dihydroorotate dehydrogenases. Clostridium pasteurianum inducible dissimilatory type sulfite reductase is linked to ferredoxin and reduces NH2OH and SeO3 at a lesser rate than its normal substrate SO3(2-). Dihydroorotate dehydrogenases (DHODs) catalyze the only redox reaction in pyrimidine de novo biosynthesis. They catalyze the oxidation of (S)-dihydroorotate to orotate coupled with the reduction of NAD+.	253
259998	cd06222	RNase_H_like	Ribonuclease H-like superfamily, including RNase H, HI, HII, HIII, and RNase-like domain IV of spliceosomal protein Prp8. Ribonuclease H (RNase H) enzymes are divided into two major families, Type 1 and Type 2, based on amino acid sequence similarities and biochemical properties. RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner in the presence of divalent cations. It is widely present in various organisms, including bacteria, archaea, and eukaryotes. Most prokaryotic and eukaryotic genomes contain multiple RNase H genes. Despite the lack of amino acid sequence homology, type 1 and type 2 RNase H share a main-chain fold and steric configurations of the four acidic active-site residues and have the same catalytic mechanism and functions in cells. RNase H is involved in DNA replication, repair and transcription. An important RNase H function is to remove Okazaki fragments during DNA replication. RNase H inhibitors have been explored as anti-HIV drug targets since RNase H inactivation inhibits reverse transcription. This model also includes the Prp8 domain IV, which adopts the RNase fold but shows low sequence homology; domain IV is implicated in key spliceosomal interactions.	121
206754	cd06223	PRTases_typeI	Phosphoribosyl transferase (PRT)-type I domain. Phosphoribosyl transferase (PRT) domain. The type I PRTases are identified by a conserved PRPP binding motif which features two adjacent acidic residues surrounded by one or more hydrophobic residues. PRTases catalyze the displacement of the alpha-1'-pyrophosphate of 5-phosphoribosyl-alpha1-pyrophosphate (PRPP) by a nitrogen-containing nucleophile. The reaction products are an alpha-1 substituted ribose-5'-phosphate and a free pyrophosphate (PP). PRPP, an activated form of ribose-5-phosphate, is a key metabolite connecting nucleotide synthesis and salvage pathways. The type I PRTase family includes a range of diverse phosphoribosyl transferase enzymes and regulatory proteins of the nucleotide synthesis and salvage pathways, including adenine phosphoribosyltransferase EC:2.4.2.7., hypoxanthine-guanine-xanthine phosphoribosyltransferase, hypoxanthine phosphoribosyltransferase EC:2.4.2.8., ribose-phosphate pyrophosphokinase EC:2.7.6.1., amidophosphoribosyltransferase EC:2.4.2.14., orotate phosphoribosyltransferase EC:2.4.2.10., uracil phosphoribosyltransferase EC:2.4.2.9., and xanthine-guanine phosphoribosyltransferase EC:2.4.2.22.	130
100121	cd06224	REM	Guanine nucleotide exchange factor for Ras-like GTPases; N-terminal domain (RasGef_N), also called REM domain (Ras exchanger motif). This domain is common in nucleotide exchange factors for Ras-like small GTPases and is typically found immediately N-terminal to the RasGef (Cdc25-like) domain. REM contacts the GTPase and is assumed to participate in the catalytic activity of the exchange factor. Proteins with the REM domain include Sos1 and Sos2, which relay signals from tyrosine-kinase mediated signalling to Ras, RasGRP1-4, RasGRF1,2, CNrasGEF, and RAP-specific nucleotide exchange factors, to name a few.	122
381743	cd06225	HAMP	Histidine kinase, Adenylyl cyclase, Methyl-accepting protein, and Phosphatase (HAMP) domain. HAMP is a signaling domain which occurs in a wide variety of signaling proteins, many of which are bacterial. The HAMP domain consists of two alpha helices connected by an extended linker. The structure of the Af1503 HAMP dimer from Archaeoglobus fulgidus has been solved using nuclear magnetic resonance, revealing a parallel four-helix bundle; this structure has been confirmed by cross-linking analysis of HAMP domains from the Escherichia coli aerotaxis receptor Aer. It has been suggested that the four-helix arrangement can rotate between the unusually packed conformation observed in the NMR structure and a canonical coiled-coil arrangement. Such rotation may coincide with signal transduction, but a common mechanism by which HAMP domains relay a variety of input signals has yet to be established.	45
349445	cd06226	M14_CPT_like	Peptidase M14-like domain of an uncharacterized group of Peptidase M14 Carboxypeptidase T (CPT)-like proteins. Peptidase M14-like domain of an uncharacterized group of Peptidase M14 Carboxypeptidase T (CPT)-like proteins. This group belongs to the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPT exhibits dual-substrate specificity by cleaving C-terminal hydrophobic amino acid residues and C-terminal positively charged residues. However, CPT does not belong to this CPT-like group.	267
349446	cd06227	M14-CPA-like	Peptidase M14 carboxypeptidase A-like domain; uncharacterized subfamily. A functionally uncharacterized subgroup of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavages. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily is that of succinylglutamate desuccinylase/aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism.	224
349447	cd06228	M14-like	Peptidase M14-like domain; uncharacterized subfamily. A functionally uncharacterized subgroup of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily enzymes are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavages. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers.  
MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others.   Another MCP subfamily, is that of succinylglutamate desuccinylase /aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism.	294
349448	cd06229	M14_Endopeptidase_I	Peptidase M14 carboxypeptidase family-like domain of Endopeptidase I. Peptidase M14-like domain of Gamma-D-glutamyl-L-diamino acid endopeptidase 1 (also known as Gamma-D-glutamyl-meso-diaminopimelate peptidase I, and Endopeptidase I (ENP1); EC 3.4.19.11). ENP1 is a member of the M14 family of metallocarboxypeptidases (MCPs), and is classified as belonging to subfamily C. However it has an exceptional type of activity of hydrolyzing the gamma-D-Glu-(L)meso-diaminopimelic acid (gamma-D-Glu-Dap) bond of L-Ala-gamma-D-Glu-(L)meso-diaminopimelic acid and L-Ala-gamma-D-Glu-(L)meso-diaminopimelic acid(L)-D-Ala peptides. ENP1 has a different substrate specificity and cellular role than MpaA (MpaA does not belong to this group). ENP1 hydrolyzes the gamma-D-Glu-Dap bond of MurNAc-tripeptide and MurNAc-tetrapeptide, as well as the amide bond of free tripeptide and tetrapeptide. ENP1 is active on spore cortex peptidoglycan, and is produced at stage IV of sporulation in forespore and spore integuments.	238
349449	cd06230	M14_ASTE_ASPA_like	Peptidase M14 Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA) subfamily. The Peptidase M14 Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA) subfamily belongs to the M14 family of metallocarboxypeptidases (MCPs), and includes ASTE, which catalyzes the fifth and last step in arginine catabolism by the arginine succinyltransferase pathway, and aspartoacylase (ASPA, also known as aminoacylase 2, and ACY-2; EC:3.5.1.15) which cleaves N-acetyl L-aspartic acid (NAA) into aspartate and acetate. NAA is abundant in the brain, and hydrolysis of NAA by ASPA may help maintain white matter. ASPA is an NAA scavenger in other tissues. Mutations in the gene encoding ASPA cause Canavan disease (CD), a fatal progressive neurodegenerative disorder involving dysmyelination and spongiform degeneration of white matter in children. This enzyme binds zinc which is necessary for activity. Measurement of elevated NAA levels in urine is used in the diagnosis of CD.	177
349450	cd06231	M14_REP34-like	Peptidase M14-like domain similar to rapid encystment phenotype 34 (REP34). This family includes Francisella tularensis protein rapid encystment phenotype 34 (REP34) which is a zinc-containing monomeric protein demonstrating carboxypeptidase B-like activity. REP34 possesses a novel topology with its substrate binding pocket deviating from the canonical M14 peptidases with a possible catalytic role for a conserved tyrosine and distinct S1' recognition site. Thus, REP34, identified as an active carboxypeptidase and a potential key F. tularensis effector protein, may help elucidate a mechanistic understanding of F. tularensis infection of phagocytic cells. A functionally uncharacterized subgroup of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. 
Enzymes belonging to the N/E subfamily enzymes are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavages. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers.  MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others.   Another MCP subfamily, is that of succinylglutamate desuccinylase /aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism.	239
349451	cd06232	M14-like	Peptidase M14-like domain; uncharacterized subfamily. Peptidase M14-like domain of a functionally uncharacterized subgroup of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily enzymes are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavages. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers.  
MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others.   Another MCP subfamily, is that of succinylglutamate desuccinylase /aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism.	276
349452	cd06233	M14-like	Peptidase M14-like domain; uncharacterized subfamily. Peptidase M14-like domain of a functionally uncharacterized subgroup of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily enzymes are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavages. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers.  
MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others.   Another MCP subfamily, is that of succinylglutamate desuccinylase /aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism.	249
349453	cd06234	M14_PaCCP-like	Peptidase M14-like domain of ATP/GTP binding proteins and cytosolic carboxypeptidases similar to Pseudomonas aeruginosa CCP (PaCCP). A bacterial subgroup of the Peptidase M14-like domain of Nna-1 (Nervous system Nuclear protein induced by Axotomy), also known as ATP/GTP binding protein (AGTPBP-1) and cytosolic carboxypeptidase (CCP)-like proteins. This subgroup includes PaCCP from Pseudomonas aeruginosa, a carboxypeptidase homologous to the M14D subfamily of human CCPs. Structural complexes with well-known inhibitors of metallocarboxypeptidases indicate that PaCCP might only possess C-terminal hydrolase activity against cellular substrates of particular specificity. The Peptidase M14 family of metallocarboxypeptidases are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Nna1-like proteins are active metallopeptidases that are thought to act on cytosolic proteins (such as alpha-tubulin in eukaryotes) to remove a C-terminal tyrosine. Nna1-like proteins from the different phyla are highly diverse, but they all contain a unique N-terminal conserved domain right before the CP domain. It has been suggested that this N-terminal domain might act as a folding domain.	256
349454	cd06235	M14_AGTPBP-like	Peptidase M14-like domain of human Nna1/AGTPBP-1, AGBL2 -5, and related proteins. Subgroup of the Peptidase M14-like domain of Nna-1 (Nervous system Nuclear protein induced by Axotomy), also known as ATP/GTP binding protein (AGTPBP-1) and cytosolic carboxypeptidase (CCP), and related proteins. The Peptidase M14 family of metallocarboxypeptidases are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. This eukaryotic subgroup includes the human Nna1/AGTPBP-1 and AGBL -2, -3, -4, and -5, and the mouse Nna1/CCP-1 and CCP -2 through -6. Nna1-like proteins are active metallopeptidases that are thought to act on cytosolic proteins such as alpha-tubulin, to remove a C-terminal tyrosine. Nna1 is widely expressed in the developing and adult nervous systems, including cerebellar Purkinje and granule neurons, mitral cells of the olfactory bulb and retinal photoreceptors. Nna1 is also induced in axotomized motor neurons. Mutations in Nna1 cause Purkinje cell degeneration (pcd). The Nna1 CP domain is required to prevent the retinal photoreceptor loss and cerebellar ataxia phenotypes of pcd mice, and a functional zinc-binding domain is needed for Nna-1 to support neuron survival in these mice. Nna1-like proteins from the different phyla are highly diverse, but they all contain a unique N-terminal conserved domain right before the CP domain. It has been suggested that this N-terminal domain might act as a folding domain.	256
349455	cd06236	M14_AGBL5_like	Peptidase M14-like domain of ATP/GTP binding protein (AGBL)-5 and related proteins. Peptidase M14-like domain of ATP/GTP binding protein-like (AGBL)-5, and related proteins. The Peptidase M14 family of metallocarboxypeptidases are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. This eukaryotic subgroup includes the human AGBL5 and the mouse cytosolic carboxypeptidase (CCP)-5. ATP/GTP binding protein (AGTPBP-1/Nna1)-like proteins are active metallopeptidases that are thought to act on cytosolic proteins such as alpha-tubulin, to remove a C-terminal tyrosine. Mutations in AGTPBP-1/Nna1 cause Purkinje cell degeneration (pcd). AGTPBP-1/Nna1 however does not belong to this subgroup. AGTPBP-1/Nna1-like proteins from the different phyla are highly diverse, but they all contain a unique N-terminal conserved domain right before the CP domain. It has been suggested that this N-terminal domain might act as a folding domain.	263
349456	cd06237	M14_Nna1-like	Peptidase M14-like domain of ATP/GTP binding proteins and cytosolic carboxypeptidases; uncharacterized bacterial subgroup. A bacterial subgroup of the Peptidase M14-like domain of Nna-1 (Nervous system Nuclear protein induced by Axotomy), also known as ATP/GTP binding protein (AGTPBP-1) and cytosolic carboxypeptidase (CCP)-like proteins. The Peptidase M14 family of metallocarboxypeptidases are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Nna1-like proteins are active metallopeptidases that are thought to act on cytosolic proteins (such as alpha-tubulin in eukaryotes) to remove a C-terminal tyrosine. Nna1-like proteins from the different phyla are highly diverse, but they all contain a unique N-terminal conserved domain right before the CP domain. It has been suggested that this N-terminal domain might act as a folding domain.	239
349457	cd06238	M14-like	Peptidase M14-like domain; uncharacterized subgroup. Peptidase M14-like domain of a functionally uncharacterized subgroup of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies.  Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily enzymes are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavage. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers.  
MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others.   Another MCP subfamily, is that of succinylglutamate desuccinylase /aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism.	217
349458	cd06239	M14-like	Peptidase M14-like domain; uncharacterized subgroup. Peptidase M14-like domain of a functionally uncharacterized subgroup of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily enzymes are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavage. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers.  
MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others.   Another MCP subfamily, is that of succinylglutamate desuccinylase /aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism.	194
349459	cd06240	M14-like	Peptidase M14-like domain; uncharacterized subgroup. Peptidase M14-like domain of a functionally uncharacterized subgroup of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies.  Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily enzymes are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavages. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers.  
MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others.   Another MCP subfamily, is that of succinylglutamate desuccinylase /aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism.	212
349460	cd06241	M14-like	Peptidase M14-like domain; uncharacterized subgroup. Peptidase M14-like domain of a functionally uncharacterized subgroup of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily enzymes are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavage. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers.  
MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others.   Another MCP subfamily, is that of succinylglutamate desuccinylase /aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism.	215
349461	cd06242	M14-like	Peptidase M14-like domain; uncharacterized subgroup. Peptidase M14-like domain of a functionally uncharacterized subgroup of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily enzymes are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavages. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers.  
MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others.   Another MCP subfamily, is that of succinylglutamate desuccinylase /aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism.	220
349462	cd06243	M14_CP_Csd4-like	Peptidase M14 carboxypeptidase Csd4 and similar proteins. This family includes peptidase M14 carboxypeptidase Csd4 from H. pylori which has been shown to be DL-carboxypeptidase with a modified zinc binding site containing a glutamine residue in place of a conserved histidine. It is an archetype of a new carboxypeptidase subfamily with a domain arrangement that differs from this family of peptide-cleaving enzymes. Csd4 plays a role in trimming uncrosslinked peptidoglycan peptide chains by cleaving the amide bond between meso-diaminopimelate and iso-D-glutamic acid in truncated peptidoglycan side chains. It acts as a cell shape determinant, similar to Campylobacter jejuni Pgp1. The M14 family of metallocarboxypeptidases (MCPs), also known as funnelins, are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. 
Enzymes belonging to the N/E subfamily are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavage. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily is that of succinylglutamate desuccinylase/aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I, which is involved in bacterial cell wall metabolism.	227
349463	cd06244	M14-like	Peptidase M14-like domain; uncharacterized subgroup. Peptidase M14-like domain of a functionally uncharacterized subgroup of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily enzymes are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavages. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers.  
MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily is that of succinylglutamate desuccinylase/aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I, which is involved in bacterial cell wall metabolism.	223
349464	cd06245	M14_CPD_III	Peptidase M14 carboxypeptidase subfamily N/E-like; Carboxypeptidase D, domain III subgroup. The third carboxypeptidase (CP)-like domain of Carboxypeptidase D (CPD; EC 3.4.17.22), domain III. CPD differs from all other metallocarboxypeptidases in that it contains multiple CP-like domains. CPD belongs to the N/E-like subfamily of the M14 family of metallocarboxypeptidases (MCPs).The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPD is a single-chain protein containing a signal peptide, three tandem repeats of CP-like domains separated by short bridge regions, followed by a transmembrane domain, and a C-terminal cytosolic tail. The first two CP-like domains of CPD contain all of the essential active site and substrate-binding residues, the third CP-like domain lacks critical residues necessary for enzymatic activity and is inactive towards standard CP substrates. Domain I is optimally active at pH 6.3-7.5 and prefers substrates with C-terminal Arg, whereas domain II is active at pH 5.0-6.5 and prefers substrates with C-terminal Lys. CPD functions in the processing of proteins that transit the secretory pathway, and is present in all vertebrates as well as Drosophila. It is broadly distributed in all tissue types. Within cells, CPD is present in the trans-Golgi network and immature secretory vesicles, but is excluded from mature vesicles. It is thought to play a role in the processing of proteins that are initially processed by furin or related endopeptidases present in the trans-Golgi network, such as growth factors and receptors. CPD is implicated in the pathogenesis of lupus erythematosus (LE), it is regulated by TGF-beta in various cell types of murine and human origin and is significantly down-regulated in CD14 positive cells isolated from patients with LE. 
As down-regulation of CPD leads to down-modulation of TGF-beta, CPD may have a role in a positive feedback loop.	283
349465	cd06246	M14_CPB2	Peptidase M14 carboxypeptidase subfamily A/B-like; Carboxypeptidase B2 subgroup. Peptidase M14 Carboxypeptidase (CP) B2 (CPB2, also known as plasma carboxypeptidase B, carboxypeptidase U, and CPU), belongs to the carboxypeptidase A/B subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPB2 enzyme displays B-like activity; it only cleaves the basic residues lysine or arginine. It is produced and secreted by the liver as the inactive precursor, procarboxypeptidase U or PCPB2, commonly referred to as thrombin-activatable fibrinolysis inhibitor (TAFI). It circulates in plasma as a zymogen bound to plasminogen, and the active enzyme, TAFIa, inhibits fibrinolysis. It is highly regulated; increased TAFI concentrations are thought to increase the risk of thrombosis and coronary artery disease by reducing fibrinolytic activity, while low TAFI levels have been correlated with chronic liver disease.	300
349466	cd06247	M14_CPO	Peptidase M14 carboxypeptidase subfamily A/B-like; Carboxypeptidase O subgroup. Peptidase M14 carboxypeptidase (CP) O (CPO, also known as metallocarboxypeptidase C; EC 3.4.17.) belongs to the carboxypeptidase A/B subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPO has not been well characterized as yet, and little is known about it. Based on modeling studies, CPO has been suggested to have specificity for acidic residues rather than aliphatic/aromatic residues as in A-like enzymes or basic residues as in B-like enzymes. It remains to be demonstrated that CPO is functional as an MCP.	298
349467	cd06248	M14_CP_insect	Peptidase M14 carboxypeptidase subfamily A/B-like. This family includes peptidase M14 carboxypeptidases found specifically in insects, including B-type carboxypeptidase of H. zea (CPBHz, insect gut carboxypeptidase-3) that is insensitive to potato carboxypeptidase inhibitor (PCI) in corn earworm, and midgut procarboxypeptidase A (PCPAHa, insect gut carboxypeptidase-1) from Helicoverpa armigera larva, a devastating pest of crops.  PCPAHa preferentially cleaves aliphatic and aromatic residues. The peptidase M14 Carboxypeptidase (CP) A/B subfamily is one of two main M14 CP subfamilies defined by sequence and structural homology, the other being the N/E subfamily. CPs hydrolyze single, C-terminal amino acids from polypeptide chains. They have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by a globular N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. There are nine members in the A/B family: CPA1, CPA2, CPA3, CPA4, CPA5, CPA6, CPB, CPO and CPU. CPA1, CPA2 and CPB are produced by the pancreas. The A forms have slightly different specificities, with CPA1 preferring aliphatic and small aromatic residues, and CPA2 preferring the bulkier aromatic side chains. CPA3 is found in secretory granules of mast cells and functions in inflammatory processes. CPA4 is detected in hormone-regulated tissues, and is thought to play a role in prostate cancer. CPA5 is present in discrete regions of pituitary and other tissues, and cleaves aliphatic C-terminal residues. 
CPA6 is highly expressed in embryonic brain and optic muscle, suggesting that it may play a specific role in cell migration and axonal guidance. CPU (also called CPB2) is produced and secreted by the liver as the inactive precursor, PCPU, commonly referred to as thrombin-activatable fibrinolysis inhibitor (TAFI). Little is known about CPO but it has been suggested to have specificity for acidic residues.	297
349468	cd06250	M14_PaAOTO_like	Peptidase M14 Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA)-like subfamily; subgroup includes Pseudomonas aeruginosa AotO. An uncharacterized subgroup of the Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA) subfamily which is part of the M14 family of metallocarboxypeptidases. This subgroup includes Pseudomonas aeruginosa AotO and related proteins. ASTE catalyzes the fifth and last step in arginine catabolism by the arginine succinyltransferase pathway, and aspartoacylase (ASPA, also known as aminoacylase 2, and ACY-2; EC:3.5.1.15) cleaves N-acetyl L-aspartic acid (NAA) into aspartate and acetate. NAA is abundant in the brain, and hydrolysis of NAA by ASPA may help maintain white matter. ASPA is an NAA scavenger in other tissues. Mutations in the gene encoding ASPA cause Canavan disease (CD), a fatal progressive neurodegenerative disorder involving dysmyelination and spongiform degeneration of white matter in children. This enzyme binds zinc which is necessary for activity. Measurement of elevated NAA levels in urine is used in the diagnosis of CD. The gene encoding P. aeruginosa AotO was characterized as part of an operon encoding an arginine and ornithine transport system; however, it is not essential for arginine and ornithine uptake.	267
349469	cd06251	M14_ASTE_ASPA-like	Peptidase M14 Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA)-like; uncharacterized subgroup. A functionally uncharacterized subgroup of the Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA) subfamily which is part of the M14 family of metallocarboxypeptidases. ASTE catalyzes the fifth and last step in arginine catabolism by the arginine succinyltransferase pathway, and aspartoacylase (ASPA, also known as aminoacylase 2, and ACY-2; EC:3.5.1.15) cleaves N-acetyl L-aspartic acid (NAA) into aspartate and acetate. NAA is abundant in the brain, and hydrolysis of NAA by ASPA may help maintain white matter. ASPA is an NAA scavenger in other tissues. Mutations in the gene encoding ASPA cause Canavan disease (CD), a fatal progressive neurodegenerative disorder involving dysmyelination and spongiform degeneration of white matter in children. This enzyme binds zinc which is necessary for activity. Measurement of elevated NAA levels in urine is used in the diagnosis of CD.	195
349470	cd06252	M14_ASTE_ASPA-like	Peptidase M14 Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA)-like; uncharacterized subgroup. A functionally uncharacterized subgroup of the Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA) subfamily which is part of the M14 family of metallocarboxypeptidases. ASTE catalyzes the fifth and last step in arginine catabolism by the arginine succinyltransferase pathway, and aspartoacylase (ASPA, also known as aminoacylase 2, and ACY-2; EC:3.5.1.15) cleaves N-acetyl L-aspartic acid (NAA) into aspartate and acetate. NAA is abundant in the brain, and hydrolysis of NAA by ASPA may help maintain white matter. ASPA is an NAA scavenger in other tissues. Mutations in the gene encoding ASPA cause Canavan disease (CD), a fatal progressive neurodegenerative disorder involving dysmyelination and spongiform degeneration of white matter in children. This enzyme binds zinc which is necessary for activity. Measurement of elevated NAA levels in urine is used in the diagnosis of CD.	224
349471	cd06253	M14_ASTE_ASPA-like	Peptidase M14 Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA)-like; uncharacterized subgroup. A functionally uncharacterized subgroup of the Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA) subfamily which is part of the M14 family of metallocarboxypeptidases. ASTE catalyzes the fifth and last step in arginine catabolism by the arginine succinyltransferase pathway, and aspartoacylase (ASPA, also known as aminoacylase 2, and ACY-2; EC:3.5.1.15) cleaves N-acetyl L-aspartic acid (NAA) into aspartate and acetate. NAA is abundant in the brain, and hydrolysis of NAA by ASPA may help maintain white matter. ASPA is an NAA scavenger in other tissues. Mutations in the gene encoding ASPA cause Canavan disease (CD), a fatal progressive neurodegenerative disorder involving dysmyelination and spongiform degeneration of white matter in children. This enzyme binds zinc which is necessary for activity. Measurement of elevated NAA levels in urine is used in the diagnosis of CD.	211
349472	cd06254	M14_ASTE_ASPA-like	Peptidase M14 Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA)-like; uncharacterized subgroup. A functionally uncharacterized subgroup of the Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA) subfamily which is part of the M14 family of metallocarboxypeptidases. ASTE catalyzes the fifth and last step in arginine catabolism by the arginine succinyltransferase pathway, and aspartoacylase (ASPA, also known as aminoacylase 2, and ACY-2; EC:3.5.1.15) cleaves N-acetyl L-aspartic acid (NAA) into aspartate and acetate. NAA is abundant in the brain, and hydrolysis of NAA by ASPA may help maintain white matter. ASPA is an NAA scavenger in other tissues. Mutations in the gene encoding ASPA cause Canavan disease (CD), a fatal progressive neurodegenerative disorder involving dysmyelination and spongiform degeneration of white matter in children. This enzyme binds zinc which is necessary for activity. Measurement of elevated NAA levels in urine is used in the diagnosis of CD.	198
349473	cd06255	M14_ASTE_ASPA-like	Peptidase M14 Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA)-like; uncharacterized subgroup. A functionally uncharacterized subgroup of the Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA) subfamily which is part of the M14 family of metallocarboxypeptidases. ASTE catalyzes the fifth and last step in arginine catabolism by the arginine succinyltransferase pathway, and aspartoacylase (ASPA, also known as aminoacylase 2, and ACY-2; EC:3.5.1.15) cleaves N-acetyl L-aspartic acid (NAA) into aspartate and acetate. NAA is abundant in the brain, and hydrolysis of NAA by ASPA may help maintain white matter. ASPA is an NAA scavenger in other tissues. Mutations in the gene encoding ASPA cause Canavan disease (CD), a fatal progressive neurodegenerative disorder involving dysmyelination and spongiform degeneration of white matter in children. This enzyme binds zinc which is necessary for activity. Measurement of elevated NAA levels in urine is used in the diagnosis of CD.	223
349474	cd06256	M14_ASTE_ASPA-like	Peptidase M14 Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA)-like; uncharacterized subgroup. A functionally uncharacterized subgroup of the Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA) subfamily which is part of the M14 family of metallocarboxypeptidases. ASTE catalyzes the fifth and last step in arginine catabolism by the arginine succinyltransferase pathway, and aspartoacylase (ASPA, also known as aminoacylase 2, and ACY-2; EC:3.5.1.15) cleaves N-acetyl L-aspartic acid (NAA) into aspartate and acetate. NAA is abundant in the brain, and hydrolysis of NAA by ASPA may help maintain white matter. ASPA is an NAA scavenger in other tissues. Mutations in the gene encoding ASPA cause Canavan disease (CD), a fatal progressive neurodegenerative disorder involving dysmyelination and spongiform degeneration of white matter in children. This enzyme binds zinc which is necessary for activity. Measurement of elevated NAA levels in urine is used in the diagnosis of CD.	204
99751	cd06257	DnaJ	DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperone family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification.	55
341049	cd06258	M3_like	M3-like Peptidases, zincin metallopeptidases, include M2_ACE,  M3A, M3B_PepF, and M32 families. The peptidase M3-like family, also called neurolysin-like family, is part of the "zincin" metallopeptidases, and includes the M2, M3 and M32 families of metallopeptidases. The M2 angiotensin converting enzyme (ACE, EC 3.4.15.1) is a membrane-bound, zinc-dependent dipeptidase that catalyzes the conversion of the decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. The M3 family is subdivided into two subfamilies: the widespread M3A, which comprises a number of high-molecular mass endo- and exopeptidases from bacteria, archaea, protozoa, fungi, plants and animals, and the small M3B, whose members are enzymes primarily from bacteria. Well-known mammalian/eukaryotic M3A endopeptidases are the thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (alias endopeptidase 3.4.24.16), and the mitochondrial intermediate peptidase. The first two are intracellular oligopeptidases, which act only on relatively short substrates of less than 20 amino acid residues, while the latter cleaves N-terminal octapeptides from proteins during their import into the mitochondria. The M3A subfamily also contains several bacterial endopeptidases, called oligopeptidases A, as well as a large number of bacterial carboxypeptidases, called dipeptidyl peptidases (Dcp; Dcp II; peptidyl dipeptidase; EC 3.4.15.5). M3B subfamily consists of oligopeptidase F (PepF) which hydrolyzes peptides containing 7-17 amino acid residues with fairly broad specificity. Peptidases in the M3 family contain the HEXXH motif that forms part of the active site in conjunction with a C-terminally-located Glutamic acid (Glu) residue. A single zinc ion is ligated by the side-chains of the two Histidine (His) residues, and the more C-terminal Glu. Most of the peptidases are synthesized without signal peptides or propeptides, and function intracellularly. 
There are similarities to the thermostable carboxypeptidases from Pyrococcus furiosus carboxypeptidase (PfuCP), and Thermus aquaticus (TaqCP), belonging to peptidase family M32. Little is known about function of this family, including carboxypeptidases Taq and Pfu.	473
99750	cd06259	YdcF-like	YdcF-like. YdcF-like is a large family of mainly bacterial proteins, with a few members found in fungi, plants, and archaea. Escherichia coli YdcF has been shown to bind S-adenosyl-L-methionine (AdoMet), but a biochemical function has not been identified. The family also includes Escherichia coli sanA and Salmonella typhimurium sfiX, which are involved in vancomycin resistance; sfiX may also be involved in murein synthesis.	150
411709	cd06260	DUF820-like	Uncharacterized PDDEXK family nuclease. The Domain of unknown function 820 (DUF820) family is composed of hypothetical proteins that are greatly expanded in cyanobacteria. The proteins are found sporadically in other bacteria. They have been predicted to belong to the PD-(D/E)xK superfamily of nucleases, which includes very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	155
119394	cd06261	TM_PBP2	Transmembrane subunit (TM) found in Periplasmic Binding Protein (PBP)-dependent ATP-Binding Cassette (ABC) transporters which generally bind type 2 PBPs. These types of transporters consist of a PBP, two TMs, and two cytoplasmic ABC ATPase subunits, and are mainly involved in importing solutes from the environment. The solute is captured by the PBP which delivers it to a gated translocation pathway formed by the two TMs. The two ABCs bind and hydrolyze ATP and drive the transport reaction. For these transporters the ABCs and TMs are on independent polypeptide chains. These systems transport a diverse range of substrates. Most are specific for a single substrate or a group of related substrates; however some transporters are more promiscuous, transporting structurally diverse substrates such as the histidine/lysine and arginine transporter in Enterobacteriaceae. In the latter case, this is achieved through binding different PBPs with different specificities to the TMs. For other promiscuous transporters such as the multiple-sugar transporter Msm of Streptococcus mutans, the PBP has a wide substrate specificity. These transporters include the maltose-maltodextrin, phosphate and sulfate transporters, among others.	190
293792	cd06262	metallo-hydrolase-like_MBL-fold	mainly hydrolytic enzymes and related proteins which carry out various biological functions; MBL-fold metallohydrolase domain. Members of the MBL-fold metallohydrolase superfamily are mainly hydrolytic enzymes which carry out a variety of biological functions. The class B metal beta-lactamases (MBLs) for which this fold was named perform only a small fraction of the activities included in this superfamily. Activities carried out by superfamily members include class B beta-lactamases which can catalyze the hydrolysis of a wide range of beta-lactam antibiotics, hydroxyacylglutathione hydrolases (also called glyoxalase II) which hydrolyze S-d-lactoylglutathione to d-lactate in the second step of the glyoxalase system, AHL lactonases which catalyze the hydrolysis and opening of the homoserine lactone rings of acyl homoserine lactones (AHLs), persulfide dioxygenases which catalyze the oxidation of glutathione persulfide to glutathione and persulfite in the mitochondria, flavodiiron proteins which catalyze the reduction of oxygen and/or nitric oxide to water or nitrous oxide respectively, cleavage and polyadenylation specificity factors such as the Int9 and Int11 subunits of Integrator, Sdsa1-like and AtsA-like arylsulfatases, 5'-exonucleases human SNM1A and yeast Pso2p, ribonuclease J which has both 5'-3' exoribonucleolytic and endonucleolytic activity and ribonuclease Z which catalyzes the endonucleolytic removal of the 3' extension of the majority of tRNA precursors, cyclic nucleotide phosphodiesterases which decompose cyclic adenosine and guanosine 3', 5'-monophosphate (cAMP and cGMP) respectively, insecticide hydrolases, and proteins required for natural transformation competence. 
The diversity of biological roles is reflected in variations in the active site metallo-chemistry; for example, classical members of the superfamily are di-, or less commonly mono-, zinc-ion-dependent hydrolases; human persulfide dioxygenase ETHE1 is a mono-iron binding member of the superfamily; Arabidopsis thaliana hydroxyacylglutathione hydrolase incorporates iron, manganese, and zinc in its dinuclear metal binding site; and flavodiiron proteins contain a diiron site.	188
99706	cd06263	MAM	Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In MDGA1, a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, the MAM domain has been linked to heterophilic interactions with axon-rich regions.	157
119387	cd06265	RNase_A_canonical	Canonical RNase A family includes all vertebrate homologues to the bovine pancreatic ribonuclease A (RNase A) that contain the catalytic site, necessary for RNase activity. In the human genome 8 RNases, referred to as "canonical" RNases, have been identified: pancreatic RNase (RNase 1), Eosinophil Derived Neurotoxin (SEDN/RNASE 2), Eosinophil Cationic Protein (ECP/RNase 3), RNase 4, Angiogenin (RNase 5), RNase 6 or k6, the skin derived RNase (RNase 7) and RNase 8. The eight human genes are all located in a cluster on chromosome 14. Canonical RNase A enzymes have special biological activities; for example, some stimulate the development of vascular endothelial cells, dendritic cells, and neurons, are cytotoxic/anti-tumoral and/or anti-pathogenic. RNase A is involved in endonucleolytic cleavage of 3'-phosphomononucleotides and 3'-phosphooligonucleotides ending in C-P or U-P with 2',3'-cyclic phosphate intermediates. The catalytic mechanism is a transphosphorylation of P-O 5' bonds on the 3' side of pyrimidines and subsequent hydrolysis to generate 3' phosphate groups. The canonical RNase A family proteins have a conserved catalytic triad (two histidines and one lysine). They also share 6 to 8 cysteines that form three to four disulfide bonds. Two disulfide bonds that are close to the N and C termini contribute most significantly to conformational stability. Angiogenin or RNase 5 has been implicated in tumor-associated angiogenesis. Comparative analysis in mammals and birds indicates that the whole family may have originated from an RNase 5-like gene. This hypothesis is supported by the fact that only RNase 5-like RNases have been reported outside the mammalian class. The RNase 5 group would therefore be the most ancient form of this family, and all other members would have arisen during mammalian evolution.	115
259999	cd06266	RNase_HII	Ribonuclease H (RNase H) type II family (prokaryotic RNase HII and HIII, and eukaryotic RNase H2/HII). This family contains ribonucleases HII (RNases H2) which include bacterial RNase HII and HIII, and eukaryotic and archaeal RNase H2/HII. RNase H2 cleaves RNA sequences that are part of RNA/DNA hybrids or that are incorporated into DNA, thereby preventing genomic instability and the accumulation of aberrant nucleic acid which can induce Aicardi-Goutieres syndrome, a severe autoimmune disorder in humans. Ribonuclease H (RNase H) is classified into two families, type I (prokaryotic RNase HI, eukaryotic RNase H1 and viral RNase H) and type II (prokaryotic RNase HII and HIII, and eukaryotic RNase H2/HII). RNase H endonucleolytically hydrolyzes an RNA strand when it is annealed to a complementary DNA strand in the presence of divalent cations. The enzyme can be found in bacteria, archaea, and eukaryotes. Most prokaryotic and eukaryotic genomes contain multiple RNase H genes, but no prokaryotic genome contains the combination of only RNase HI and HIII. Despite a lack of evidence for homology from sequence comparisons, type I and type II RNase H share a common fold and similar steric configurations of the four acidic active-site residues, suggesting identical or very similar catalytic mechanisms. It appears that type I and type II RNases H also have overlapping functions in cells, as over-expression of Escherichia coli RNase HII can complement an RNase HI deletion phenotype in E. coli.	193
380491	cd06267	PBP1_LacI_sugar_binding-like	ligand binding domain of the LacI transcriptional regulator family belonging to the type 1 periplasmic-binding fold protein superfamily. Ligand binding domain of the LacI transcriptional regulator family belonging to the type 1 periplasmic-binding fold protein superfamily. In most cases, ligands are monosaccharide including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the domain sugar binding changes the DNA binding activity of the repressor domain.	264
380492	cd06268	PBP1_ABC_transporter_LIVBP-like	periplasmic binding domain of ATP-binding cassette transporter-like systems that belong to the type 1 periplasmic binding fold protein superfamily. Periplasmic binding domain of ATP-binding cassette transporter-like systems that belong to the type 1 periplasmic binding fold protein superfamily. They are mostly present in archaea and eubacteria, and are primarily involved in scavenging solutes from the environment. ABC-type transporters couple ATP hydrolysis with the uptake and efflux of a wide range of substrates across bacterial membranes, including amino acids, peptides, lipids and sterols, and various drugs. These systems are comprised of transmembrane domains, nucleotide binding domains, and in most bacterial uptake systems, periplasmic binding proteins (PBPs) which transfer the ligand to the extracellular gate of the transmembrane domains. These PBPs bind their substrates selectively and with high affinity. Members of this group include ABC-type Leucine-Isoleucine-Valine-Binding Proteins (LIVBP), which are homologous to the aliphatic amidase transcriptional repressor, AmiC, of Pseudomonas aeruginosa. The uncharacterized periplasmic components of various ABC-type transport systems are included in this group.	298
380493	cd06269	PBP1_glutamate_receptors-like	ligand-binding domain of family C G-protein coupled receptors (GPCRs), membrane bound guanylyl cyclases such as natriuretic peptide receptors (NPRs), and N-terminal leucine/isoleucine/valine-binding protein (LIVBP)-like domain of ionotropic glutamate receptors. This CD represents the ligand-binding domain of the family C G-protein coupled receptors (GPCRs), membrane bound guanylyl cyclases such as the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the ionotropic glutamate receptors, all of which are structurally similar and related to the periplasmic-binding fold type 1 family. The family C GPCRs consist of metabotropic glutamate receptor (mGluR), a calcium-sensing receptor (CaSR), gamma-aminobutyric acid receptor (GABAbR), the promiscuous L-alpha-amino acid receptor GPR6A, families of taste and pheromone receptors, and orphan receptors. Truncated splicing variants of the orphan receptors are not included in this CD. The family C GPCRs are activated by endogenous agonists such as amino acids, ions, and sugar based molecules. Their amino terminal ligand-binding region is homologous to the bacterial leucine-isoleucine-valine binding protein (LIVBP) and a leucine binding protein (LBP). The ionotropic glutamate receptors (iGluRs) have an integral ion channel and are subdivided into three major groups based on their pharmacology and structural similarities: NMDA receptors, AMPA receptors, and kainate receptors. The family of membrane bound guanylyl cyclases is further divided into three subfamilies: the ANP receptor (GC-A)/C-type natriuretic peptide receptor (GC-B), the heat-stable enterotoxin receptor (GC-C)/sensory organ specific membrane GCs such as retinal receptors (GC-E, GC-F), and olfactory receptors (GC-D and GC-G).	332
380494	cd06270	PBP1_GalS-like	ligand binding domain of DNA transcription iso-repressor GalS, which is one of two regulatory proteins involved in galactose transport and metabolism. Ligand binding domain of DNA transcription iso-repressor GalS, which is one of two regulatory proteins involved in galactose transport and metabolism. Transcription of the galactose regulon genes is regulated by Gal iso-repressor (GalS) and Gal repressor (GalR) in different ways, but both repressors recognize the same DNA binding site in the absence of D-galactose. GalS is a dimeric protein like GalR, and its major role is in regulating expression of the high-affinity galactose transporter encoded by the mgl operon, whereas GalR is the exclusive regulator of galactose permease, the low-affinity galactose transporter. GalS and GalR are members of the LacI-GalR family of transcription regulators and both contain the type 1 periplasmic binding protein-like fold. Hence, they are homologous to the periplasmic sugar-binding proteins of ABC-type transport systems.	266
380495	cd06271	PBP1_AglR_RafR-like	ligand-binding domain of DNA transcription repressors specific for raffinose (RafR) and alpha-glucosides (AglR) which are members of the LacI-GalR family of bacterial transcription regulators. Ligand-binding domain of DNA transcription repressors specific for raffinose (RafR) and alpha-glucosides (AglR) which are members of the LacI-GalR family of bacterial transcription regulators. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor.	264
380496	cd06272	PBP1_hexuronate_repressor-like	ligand-binding domain of DNA transcription repressor for the hexuronate utilization operon from Bacillus species and close homologs, all members of the LacI-GalR family of bacterial transcription regulators. Ligand-binding domain of DNA transcription repressor for the hexuronate utilization operon from Bacillus species and its close homologs from other bacteria, all of which are members of the LacI-GalR family of bacterial transcription regulators. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor.	266
380497	cd06273	PBP1_LacI-like	ligand-binding domain of uncharacterized DNA-binding regulatory proteins that are members of the LacI-GalR family of bacterial transcription repressors. This group includes the ligand-binding domain of uncharacterized DNA-binding regulatory proteins that are members of the LacI-GalR family of bacterial transcription repressors. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor.	268
380498	cd06274	PBP1_FruR	ligand binding domain of DNA transcription repressor specific for fructose (FruR) and its close homologs. Ligand binding domain of DNA transcription repressor specific for fructose (FruR) and its close homologs, all of which are members of the LacI-GalR family of bacterial transcription regulators. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to members of the type 1 periplasmic binding protein superfamily. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor.	264
380499	cd06275	PBP1_PurR	ligand-binding domain of purine repressor, PurR, which functions as the master regulatory protein of de novo purine nucleotide biosynthesis in Escherichia coli. Ligand-binding domain of purine repressor, PurR, which functions as the master regulatory protein of de novo purine nucleotide biosynthesis in Escherichia coli. This dimeric PurR belongs to the LacI-GalR family of transcription regulators and is activated to bind to DNA operator sites by initially binding either of high affinity corepressors, hypoxanthine or guanine. PurR is composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the purine transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor.	269
380500	cd06277	PBP1_LacI-like	ligand-binding domain of uncharacterized DNA-binding regulatory proteins that are members of the LacI-GalR family of bacterial transcription repressors. This group includes the ligand-binding domain of uncharacterized DNA-binding regulatory proteins that are members of the LacI-GalR family of bacterial transcription repressors. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor.	275
380501	cd06278	PBP1_LacI-like	ligand-binding domain of uncharacterized DNA-binding regulatory proteins that are members of the LacI-GalR family of bacterial transcription repressors. This group includes the ligand-binding domain of uncharacterized DNA-binding regulatory proteins that are members of the LacI-GalR family of bacterial transcription repressors. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor.	266
380502	cd06279	PBP1_LacI-like	ligand-binding domain of an uncharacterized transcription regulator from Corynebacterium glutamicum and its close homologs from other bacteria. This group includes the ligand-binding domain of an uncharacterized transcription regulator from Corynebacterium glutamicum and its close homologs from other bacteria. Members of this group belong to the LacI-GalR family of repressors and are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor.	284
380503	cd06280	PBP1_LacI-like	ligand-binding domain of an uncharacterized transcription regulator from Staphylococcus saprophyticus and its close homologs from other bacteria. This group includes the ligand-binding domain of an uncharacterized transcription regulator from Staphylococcus saprophyticus and its close homologs from other bacteria. Members of this group belong to the LacI-GalR family of repressors and are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor.	266
380504	cd06281	PBP1_LacI-like	ligand-binding domain of uncharacterized DNA-binding regulatory proteins that are members of the LacI-GalR family of bacterial transcription repressors. This group includes the ligand-binding domain of uncharacterized DNA-binding regulatory proteins that are members of the LacI-GalR family of bacterial transcription repressors. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor.	270
380505	cd06282	PBP1_LacI-like	ligand-binding domain of uncharacterized DNA-binding regulatory proteins that are members of the LacI-GalR family of bacterial transcription repressors. This group includes the ligand-binding domain of uncharacterized DNA-binding regulatory proteins that are members of the LacI-GalR family of bacterial transcription repressors. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor.	267
380506	cd06283	PBP1_RegR_EndR_KdgR-like	ligand-binding domain of DNA transcription repressor RegR and other putative regulators such as KdgR and EndR. Ligand-binding domain of DNA transcription repressor RegR and other putative regulators such as KdgR and EndR, all of which are members of the LacI-GalR family of bacterial transcription regulators. RegR regulates bacterial competence and the expression of virulence factors, including hyaluronidase. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor.	266
380507	cd06284	PBP1_LacI-like	ligand-binding domain of an uncharacterized transcription regulator from Actinobacillus succinogenes and its close homologs from other bacteria. This group includes the ligand-binding domain of an uncharacterized transcription regulator from Actinobacillus succinogenes and its close homologs from other bacteria. Members of this group belong to the LacI-GalR family of repressors and are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor.	267
380508	cd06285	PBP1_LacI-like	ligand-binding domain of uncharacterized DNA-binding regulatory proteins that are members of the LacI-GalR family of bacterial transcription repressors. This group includes the ligand-binding domain of uncharacterized DNA-binding regulatory proteins that are members of the LacI-GalR family of bacterial transcription repressors. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor.	269
380509	cd06286	PBP1_CcpB-like	ligand-binding domain of a novel transcription factor implicated in catabolite repression in Bacillus and Clostridium species. This group includes the ligand-binding domain of a novel transcription factor implicated in catabolite repression in Bacillus and Clostridium species. Catabolite control protein B (CcpB) is 30% identical in sequence to CcpA which functions as the major transcriptional regulator of carbon catabolite repression/regulation (CCR), a process in which enzymes necessary for the metabolism of alternative sugars are inhibited in the presence of glucose. Like CcpA, the DNA-binding protein CcpB exerts its catabolite-repressing effect by a mechanism dependent on the presence of HPr(Ser-P), the small phosphocarrier proteins of the phosphoenolpyruvate-sugar phosphotransferase system, but with a less significant degree.	262
380510	cd06287	PBP1_LacI-like	ligand-binding domain of uncharacterized DNA-binding regulatory proteins that are members of the LacI-GalR family of bacterial transcription repressors. This group includes the ligand-binding domain of uncharacterized DNA-binding regulatory proteins that are members of the LacI-GalR family of bacterial transcription repressors. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor.	268
380511	cd06288	PBP1_sucrose_transcription_regulator	ligand-binding domain of DNA-binding regulatory proteins specific to sucrose that are members of the LacI-GalR family of bacterial transcription repressors. This group includes the ligand-binding domain of DNA-binding regulatory proteins specific to sucrose that are members of the LacI-GalR family of bacterial transcription repressors. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor.	268
380512	cd06289	PBP1_MalI-like	ligand-binding domain of MalI, a transcription regulator of the maltose system of Escherichia coli and its close homologs from other bacteria. This group includes the ligand-binding domain of MalI, a transcription regulator of the maltose system of Escherichia coli and its close homologs from other bacteria. They are members of the LacI-GalR family of repressor proteins which are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor.	268
380513	cd06290	PBP1_LacI-like	ligand-binding domain of uncharacterized DNA-binding regulatory proteins that are members of the LacI-GalR family of bacterial transcription repressors. This group includes the ligand-binding domain of uncharacterized DNA-binding regulatory proteins that are members of the LacI-GalR family of bacterial transcription repressors. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor.	267
380514	cd06291	PBP1_Qymf-like	ligand binding domain of the lacI-like transcription regulator from a novel metal-reducing bacterium Alkaliphilus metalliredigens (strain Qymf) and its close homologs. This group includes the ligand binding domain of the lacI-like transcription regulator from a novel metal-reducing bacterium Alkaliphilus metalliredigens (strain Qymf) and its close homologs. Qymf is a strict anaerobe that could be grown in the presence of borax and its cells are straight rods that produce endospores. This group is a member of the LacI-GalR family repressors that are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor.	264
380515	cd06292	PBP1_AglR_RafR-like	Ligand-binding domain of uncharacterized DNA transcription repressors highly similar to that of the repressors specific for raffinose (RafR) and alpha-glucosides (AglR) which are members of the LacI-GalR family of bacterial transcription regulators. This group includes the ligand-binding domain of uncharacterized DNA-binding regulatory proteins highly similar to DNA transcription repressors specific for raffinose (RafR) and alpha-glucosides (AglR). Members of this group belong to the LacI-GalR family of bacterial transcription repressors. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor.	273
380516	cd06293	PBP1_LacI-like	ligand-binding domain of uncharacterized DNA-binding regulatory proteins that are members of the LacI-GalR family of bacterial transcription repressors. This group includes the ligand-binding domain of uncharacterized DNA-binding regulatory proteins that are members of the LacI-GalR family of bacterial transcription repressors. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor.	270
380517	cd06294	PBP1_MalR-like	ligand-binding domain of maltose transcription regulator MalR which is a member of the LacI-GalR family repressors. This group includes the ligand-binding domain of maltose transcription regulator MalR which is a member of the LacI-GalR family repressors that are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor.	269
380518	cd06295	PBP1_CelR	ligand binding domain of a transcription regulator of cellulose genes, CelR, which is highly homologous to the LacI-GalR family of bacterial transcription regulators. This group includes the ligand binding domain of a transcription regulator of cellulose genes, CelR, which is highly homologous to the LacI-GalR family of bacterial transcription regulators. The binding of CelR to the celE promoter is inhibited specifically by cellobiose. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor.	273
380519	cd06296	PBP1_CatR-like	ligand-binding domain of a LacI-like transcriptional regulator, CatR which is involved in catechol degradation. This group includes the ligand-binding domain of a LacI-like transcriptional regulator, CatR which is involved in catechol degradation. This group belongs to the LacI-GalR family repressors that are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor.	270
380520	cd06297	PBP1_CcpA_TTHA0807	ligand-binding domain of TTHA0807, a CcpA regulator, from Thermus thermophilus HB8 and its close homologs. Ligand-binding domain of the uncharacterized transcription regulator TTHA0807 from the extremely thermophilic organism Thermus thermophilus HB8 and close homologs from other bacteria. Although its exact biological function is not known, TTHA0807 belongs to the catabolite control protein A (CcpA) family of regulatory proteins. CcpA functions as the major transcriptional regulator of carbon catabolite repression/regulation (CCR), a process in which enzymes necessary for the metabolism of alternative sugars are inhibited in the presence of glucose. In gram-positive bacteria, CCR is controlled by HPr, a phosphoenolpyruvate:sugar phosphotransferase system (PTS) and a transcriptional regulator CcpA. Moreover, CcpA can regulate sporulation and antibiotic resistance as well as play a role in virulence development of certain pathogens such as the group A streptococcus. The ligand binding domain of CcpA is a member of the LacI-GalR family of bacterial transcription regulators.	268
380521	cd06298	PBP1_CcpA	ligand-binding domain of the catabolite control protein A (CcpA), which functions as the major transcriptional regulator of carbon catabolite repression/regulation. Ligand-binding domain of the catabolite control protein A (CcpA), which functions as the major transcriptional regulator of carbon catabolite repression/regulation (CCR), a process in which enzymes necessary for the metabolism of alternative sugars are inhibited in the presence of glucose. In gram-positive bacteria, CCR is controlled by HPr, a phosphoenolpyruvate:sugar phosphotransferase system (PTS) and a transcriptional regulator CcpA. Moreover, CcpA can regulate sporulation and antibiotic resistance as well as play a role in virulence development of certain pathogens such as the group A streptococcus. The ligand binding domain of CcpA is a member of the LacI-GalR family of bacterial transcription regulators.	268
380522	cd06299	PBP1_LacI-like	ligand-binding domain of DNA-binding regulatory protein from Corynebacterium glutamicum, which produces significant amounts of L-glutamate directly from cheap sugar and ammonia. This group includes the ligand-binding domain of DNA-binding regulatory protein from Corynebacterium glutamicum which has a unique ability to produce significant amounts of L-glutamate directly from cheap sugar and ammonia. This regulatory protein is a member of the LacI-GalR family of bacterial transcription repressors. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor.	268
380523	cd06300	PBP1_ABC_sugar_binding-like	periplasmic sugar-binding component of uncharacterized ABC-type transport systems that are members of the pentose/hexose sugar-binding protein family of the type 1 periplasmic binding protein superfamily. Periplasmic sugar-binding component of uncharacterized ABC-type transport systems that are members of the pentose/hexose sugar-binding protein family of the type 1 periplasmic binding protein superfamily, which consists of two alpha/beta globular domains connected by a three-stranded hinge. This Venus flytrap-like domain undergoes transition from an open to a closed conformational state upon ligand binding. Members of this group are predicted to be involved in the transport of sugar-containing molecules across cellular and organellar membranes; however their substrate specificity is not known in detail.	302
380524	cd06301	PBP1_rhizopine_binding-like	periplasmic binding proteins specific to rhizopines. Periplasmic binding proteins specific to rhizopines, which are simple sugar-like compounds produced in the nodules induced by the symbiotic root nodule bacteria, such as Rhizobium and Sinorhizobium. Rhizopine-binding-like proteins from other bacteria are also included. Two inositol based rhizopine compounds are known to date: L-3-O-methyl-scyllo-inosamine (3-O-MSI) and scyllo-inosamine. Bacterial strains that can metabolize rhizopine have a greater competitive advantage in nodulation and rhizopine synthesis is regulated by NifA/NtrA regulatory transcription activators which are maximally expressed at the onset of nitrogen fixation in bacteroids. The members of this group belong to the pentose/hexose sugar-binding protein family of the type 1 periplasmic binding protein superfamily.	272
380525	cd06302	PBP1_LsrB_Quorum_Sensing-like	periplasmic binding domain of autoinducer-2 (AI-2) receptor LsrB from Salmonella typhimurium and its close homologs. Periplasmic binding domain of autoinducer-2 (AI-2) receptor LsrB from Salmonella typhimurium and its close homologs from other bacteria. The members of this group are homologous to a family of periplasmic pentose/hexose sugar-binding proteins that function as the primary receptors for chemotaxis and transporters of many sugar based solutes in bacteria and archaea and that are a member of the type 1 periplasmic binding protein superfamily. LsrB, which is part of the ABC transporter complex LsrABCD, binds a chemically distinct form of the AI-2 signal that lacks boron, in contrast to the Vibrio harveyi AI-2 signaling molecule that has an unusual furanosyl borate diester. Hence, many bacteria coordinate their gene expression according to the local density of their population by producing species specific AI-2. This process of quorum sensing allows LsrB to function as a periplasmic AI-2 binding protein in interspecies signaling.	296
380526	cd06303	PBP1_LuxPQ_Quorum_Sensing	periplasmic binding protein (LuxP) of autoinducer-2 (AI-2) receptor LuxPQ from Vibrio harveyi and its close homologs. Periplasmic binding protein (LuxP) of autoinducer-2 (AI-2) receptor LuxPQ from Vibrio harveyi and its close homologs from other bacteria. The members of this group are highly homologous to a family of periplasmic pentose/hexose sugar-binding proteins that function as the primary receptors for chemotaxis and transport of many sugar based solutes in bacteria and archaea, and that are members of the type 1 periplasmic binding protein superfamily. The Vibrio harveyi AI-2 receptor consists of two polypeptides, LuxP and LuxQ: LuxP is a periplasmic binding protein that binds AI-2 by clamping it between two domains, LuxQ is an integral membrane protein belonging to the two-component sensor kinase family. Unlike AI-2 bound to the LsrB receptor in Salmonella typhimurium, the Vibrio harveyi AI-2 signaling molecule has an unusual furanosyl borate diester. Hence, many bacteria coordinate their gene expression according to the local density of their population by producing species specific AI-2. This process of quorum sensing allows LuxPQ to control light production as well as its motility behavior.	320
380527	cd06304	PBP1_BmpA_Med_PnrA-like	periplasmic binding component of a family of basic membrane lipoproteins from Borrelia and various putative lipoproteins from other bacteria. Periplasmic binding component of a family of basic membrane lipoproteins from Borrelia and various putative lipoproteins from other bacteria. These outer membrane proteins include Med, a cell-surface localized protein regulating the competence transcription factor gene comK in Bacillus subtilis, and PnrA, a periplasmic purine nucleoside binding protein of an ATP-binding cassette (ABC) transport system in Treponema pallidum. All contain the type 1 periplasmic sugar-binding protein-like fold.	262
380528	cd06305	PBP1_methylthioribose_binding-like	similar to methylthioribose-binding protein of ABC-type transport systems that belong to a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein (PBP1) superfamily. Proteins similar to methylthioribose-binding protein of ABC-type transport systems that belong to a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein (PBP1) superfamily, which consists of two alpha/beta globular domains connected by a three-stranded hinge. This Venus flytrap-like domain undergoes transition from an open to a closed conformational state upon ligand binding. The sugar-binding domain of the periplasmic proteins in this group is also homologous to the ligand-binding domain of eukaryotic receptors such as metabotropic glutamate receptor (mGluR), DNA-binding transcriptional repressors such as LacI and GalR.	273
380529	cd06306	PBP1_TorT-like	TorT-like proteins, a periplasmic binding protein family that activates induction of the Tor respiratory system upon trimethylamine N-oxide (TMAO) electron-acceptor binding in bacteria. TorT-like proteins, a periplasmic binding protein family that activates induction of the Tor respiratory system upon trimethylamine N-oxide (TMAO) electron-acceptor binding in bacteria. The Tor respiratory system consists of three proteins (TorC, TorA, and TorD) and is induced in the presence of TMAO. The TMAO control is tightly regulated by three proteins: TorS, TorT, and TorR. Thus, the disruption of any of these proteins can abolish the Tor respiratory induction. TorT shares homology with the sugar-binding domain of the type 1 periplasmic binding proteins. The members of TorT-like family bind TMAO or related compounds and are predicted to be involved in signal transduction and/or substrate transport.	269
380530	cd06307	PBP1_sugar_binding	periplasmic sugar-binding domain of uncharacterized transport systems. Periplasmic sugar-binding domain of uncharacterized transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein (PBP1) superfamily. The members of this group are predicted to be involved in the transport of sugar-containing molecules across cellular and organellar membranes.	275
380531	cd06308	PBP1_sensor_kinase-like	periplasmic binding domain of two-component sensor kinase signaling systems. Periplasmic binding domain of two-component sensor kinase signaling systems, some of which are fused with a C-terminal histidine kinase A domain (HisK) and/or a signal receiver domain (REC). Members of this group share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily and are predicted to be involved in sensing of environmental stimuli; their substrate specificities, however, are not known in detail.	268
380532	cd06309	PBP1_galactofuranose_YtfQ-like	periplasmic binding domain of ABC-type galactofuranose YtfQ-like transport systems. Periplasmic binding domain of ABC-type YtfQ-like transport systems. The YtfQ protein from Escherichia coli is up-regulated under glucose-limited conditions and shares homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily. Members of this group are predicted to be involved in the transport of sugar-containing molecules across cellular and organellar membranes; however their ligand specificity is not determined experimentally.	285
380533	cd06310	PBP1_ABC_sugar_binding-like	monosaccharide ABC transporter substrate-binding protein such as CUT2. Periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis.	272
380534	cd06311	PBP1_ABC_sugar_binding-like	periplasmic sugar-binding domain of uncharacterized ABC-type transport systems. Periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis.	270
380535	cd06312	PBP1_ABC_sugar_binding-like	periplasmic sugar-binding domain of uncharacterized ABC-type transport systems. Periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis.	272
380536	cd06313	PBP1_ABC_ThpA_XypA	periplasmic sugar-binding proteins (ThpA and XypA) of ABC-type transport systems. This group includes periplasmic D-threitol-binding protein ThpA and xylitol/L-sorbitol-binding protein XypA, which are part of sugar ABC-type transport systems. Both ThpA and XypA share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge.	277
380537	cd06314	PBP1_tmGBP	periplasmic sugar-binding domain of Thermotoga maritima glucose-binding protein (tmGBP) and its close homologs. Periplasmic sugar-binding domain of Thermotoga maritima glucose-binding protein (tmGBP) and its close homologs from other bacteria. They are members of the type 1 periplasmic binding protein superfamily which consists of two domains connected by a three-stranded hinge. TmGBP is specific for glucose and its binding pocket is buried at the interface of the two domains. TmGBP also exhibits high thermostability and the highest structural similarity to E. coli glucose binding protein (ecGBP).	271
380538	cd06315	PBP1_ABC_sugar_binding-like	periplasmic sugar-binding domain of uncharacterized ABC-type transport systems. Periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis.	278
380539	cd06316	PBP1_ABC_sugar_binding-like	periplasmic sugar-binding domain of uncharacterized ABC-type transport systems. Periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis.	294
380540	cd06317	PBP1_ABC_sugar_binding-like	periplasmic sugar-binding domain of uncharacterized ABC-type transport systems. Periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis.	281
380541	cd06318	PBP1_ABC_D-talitol-like	periplasmic D-talitol-binding protein of an ABC transport system and similar proteins. Periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis.	282
380542	cd06319	PBP1_ABC_sugar_binding-like	periplasmic sugar-binding domain of uncharacterized ABC-type transport systems. Periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis.	278
380543	cd06320	PBP1_allose_binding	periplasmic allose-binding domain of bacterial transport systems that function as a primary receptor of active transport and chemotaxis. Periplasmic allose-binding domain of bacterial transport systems that function as a primary receptor of active transport and chemotaxis. The members of this group belong to a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily. Like other periplasmic receptors of the ABC-type transport systems, the allose-binding protein consists of two alpha/beta domains connected by a three-stranded hinge. This Venus flytrap-like domain undergoes transition from an open to a closed conformational state upon ligand binding.	283
380544	cd06321	PBP1_ABC_sugar_binding-like	periplasmic sugar-binding domain of uncharacterized ABC-type transport systems. This group includes the periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consist of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis.	270
380545	cd06322	PBP1_ABC_sugar_binding-like	periplasmic sugar-binding domain of uncharacterized ABC-type transport systems. This group includes the periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consist of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis.	270
380546	cd06323	PBP1_ribose_binding	periplasmic sugar-binding domain of the thermophilic Thermoanaerobacter tengcongensis ribose binding protein (ttRBP) and its mesophilic homologs. Periplasmic sugar-binding domain of the thermophilic Thermoanaerobacter tengcongensis D-ribose binding protein (ttRBP) and its mesophilic homologs. Members of this group belong to the type 1 periplasmic binding protein superfamily, whose members are involved in chemotaxis, ATP-binding cassette transport, and intercellular communication in central nervous system. The thermophilic and mesophilic ribose-binding proteins are structurally very similar, but differ substantially in thermal stability.	268
380547	cd06324	PBP1_ABC_sugar_binding-like	periplasmic sugar-binding domain of uncharacterized ABC-type transport systems. This group includes the periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis.	317
380548	cd06325	PBP1_ABC_unchar_transporter	type 1 periplasmic ligand-binding domain of uncharacterized ABC-type transport systems predicted to be involved in uptake of amino acids, peptides, or inorganic ions. This group includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type transport systems that are predicted to be involved in the uptake of amino acids, peptides, or inorganic ions. This subgroup has high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP); its ligand specificity has not been determined experimentally.	282
380549	cd06326	PBP1_ABC_ligand_binding-like	periplasmic ligand-binding domain of uncharacterized ABC-type transport systems predicted to be involved in uptake of amino acids, peptides, or inorganic ions. This group includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type transport systems that are predicted to be involved in the uptake of amino acids, peptides, or inorganic ions. This subgroup has high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP); its ligand specificity has not been determined experimentally, however.	339
380550	cd06327	PBP1_SBP-like	periplasmic substrate-binding domain of active transport proteins (substrate binding proteins or SBPs). Periplasmic substrate-binding domain of active transport proteins found in gram-negative, gram-positive bacteria, and archaea. Members of this group are initial receptors in the process of active transport across cellular membrane, but their substrate specificities are not known in detail. However, they closely resemble the group of AmiC and active transport systems for short-chain amides and urea (FmdDEF), and thus are likely to exhibit a ligand-binding mode similar to that of the amide sensor protein AmiC from Pseudomonas aeruginosa. Moreover, this binding domain has high sequence identity to the family of hydrophobic amino acid transporters (HAAT), and thus it may also be involved in transport of amino acids.	336
380551	cd06328	PBP1_SBP-like	periplasmic substrate-binding domain of active transport proteins (substrate binding proteins or SBPs). Periplasmic substrate-binding domain of active transport proteins found in gram-negative and gram-positive bacteria. Members of this group are initial receptors in the process of active transport across cellular membrane, but their substrate specificities are not known in detail. However, they closely resemble the group of AmiC and active transport systems for short-chain amides and urea (FmdDEF), and thus are likely to exhibit a ligand-binding mode similar to that of the amide sensor protein AmiC from Pseudomonas aeruginosa. Moreover, this binding domain has high sequence identity to the family of hydrophobic amino acid transporters (HAAT), and thus it may also be involved in transport of amino acids.	336
380552	cd06329	PBP1_SBP-like	periplasmic substrate-binding domain of active transport proteins (substrate binding proteins or SBPs). Periplasmic substrate-binding domain of active transport proteins found in bacteria and Archaea. Members of this group are initial receptors in the process of active transport across cellular membrane, but their substrate specificities are not known in detail. However, they closely resemble the group of AmiC and active transport systems for short-chain amides and urea (FmdDEF), and thus are likely to exhibit a ligand-binding mode similar to that of the amide sensor protein AmiC from Pseudomonas aeruginosa. Moreover, this binding domain has high sequence identity to the family of hydrophobic amino acid transporters (HAAT), and thus it may also be involved in transport of amino acids.	343
380553	cd06330	PBP1_As_SBP-like	periplasmic substrate-binding domain of active transport proteins. Periplasmic substrate-binding domain of active transport proteins found in bacteria and Archaea that is predicted to be involved in the efflux of toxic compounds. Members of this subgroup include proteins from Herminiimonas arsenicoxydans, which is resistant to arsenic (As) and various heavy metals such as cadmium and zinc. Moreover, they show significant sequence similarity to the cluster of AmiC and active transport systems for short-chain amides and urea (FmdDEF), and thus are likely to exhibit a ligand-binding mode similar to that of the amide sensor protein AmiC from Pseudomonas aeruginosa.	342
380554	cd06331	PBP1_AmiC-like	type 1 periplasmic components of amide-binding protein (AmiC) and the active transport system for short-chain amides and urea (FmdDEF). This group includes the type 1 periplasmic components of amide-binding protein (AmiC) and the active transport system for short-chain amides and urea (FmdDEF), found in bacteria and Archaea. AmiC controls expression of the amidase operon by a ligand-triggered conformational switch. In the absence of ligand or presence of butyramide (repressor), AmiC (the ligand sensor and negative regulator) adopts an open conformation and inhibits the transcription antitermination function of AmiR by direct protein-protein interaction. In the presence of inducing ligands such as acetamide, AmiC adopts a closed conformation which disrupts a silencing AmiC-AmiR complex and the expression of amidase and other genes of the operon is induced. FmdDEF is predicted to be an ATP-dependent transporter and closely resembles the periplasmic binding protein and the two transmembrane proteins present in various hydrophobic amino acid-binding transport systems.	333
380555	cd06332	PBP1_aromatic_compounds-like	type 1 periplasmic binding proteins of active transport systems predicted to be involved in transport of aromatic compounds such as 2-nitrobenzoic acid and alkylbenzenes. This group includes the type 1 periplasmic binding proteins of active transport systems that are predicted to be involved in transport of aromatic compounds such as 2-nitrobenzoic acid and alkylbenzenes; their substrate specificities are not well characterized, however. Members also exhibit close similarity to active transport systems for short chain amides and/or urea found in bacteria and archaea.	336
380556	cd06333	PBP1_ABC_RPA1789-like	type 1 periplasmic binding-protein component (CouP) of an ABC system (CouPSTU; RPA1789, RPA1791-1793), involved in active transport of lignin-derived aromatic substrates, and its close homologs. This group includes RPA1789 (CouP) from Rhodopseudomonas palustris and its close homologs in other bacteria. RPA1789 (CouP) is the periplasmic binding-protein component of an ABC system (CouPSTU; RPA1789, RPA1791-1793) that is involved in the active transport of lignin-derived aromatic substrates. Members of this group have high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP).	342
380557	cd06334	PBP1_ABC_ligand_binding-like	type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems predicted to be involved in transport of amino acids, peptides, or inorganic ions. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems that are predicted to be involved in transport of amino acids, peptides, or inorganic ions. Members of this group are sequence-similar to members of the family of ABC-type hydrophobic amino acid transporters, such as leucine-isoleucine-valine binding protein (LIVBP); however their ligand specificity has not been determined experimentally.	360
380558	cd06335	PBP1_ABC_ligand_binding-like	type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems predicted to be involved in transport of amino acids, peptides, or inorganic ions. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems that are predicted to be involved in transport of amino acids, peptides, or inorganic ions. Members of this group are sequence-similar to members of the family of ABC-type hydrophobic amino acid transporters, such as leucine-isoleucine-valine binding protein (LIVBP); however their ligand specificity has not been determined experimentally.	348
380559	cd06336	PBP1_ABC_ligand_binding-like	type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems predicted to be involved in transport of amino acids, peptides, or inorganic ions. This group includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems that are predicted to be involved in transport of amino acids, peptides, or inorganic ions. Members of this group are sequence-similar to members of the family of ABC-type hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP); however, their ligand specificity has not been determined experimentally.	345
380560	cd06337	PBP1_ABC_ligand_binding-like	type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems predicted to be involved in transport of amino acids, peptides, or inorganic ions. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems that are predicted to be involved in transport of amino acids, peptides, or inorganic ions. Members of this group are sequence-similar to members of the family of ABC-type hydrophobic amino acid transporters, such as leucine-isoleucine-valine binding protein (LIVBP); however their ligand specificity has not been determined experimentally.	354
380561	cd06338	PBP1_ABC_ligand_binding-like	type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems predicted to be involved in transport of amino acids, peptides, or inorganic ions. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems that are predicted to be involved in transport of amino acids, peptides, or inorganic ions. This subgroup has high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT); however, their ligand specificity has not been determined experimentally.	347
380562	cd06339	PBP1_YraM_LppC_lipoprotein-like	periplasmic binding component of lipoprotein LppC, an immunodominant antigen. This subgroup includes periplasmic binding component of lipoprotein LppC, an immunodominant antigen, whose molecular function is not characterized. Members of this subgroup are predicted to be involved in transport of lipid compounds, and they are sequence similar to the family of ABC-type hydrophobic amino acid transporters (HAAT).	331
380563	cd06340	PBP1_ABC_ligand_binding-like	type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems predicted to be involved in transport of amino acids, peptides, or inorganic ions. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems that are predicted to be involved in transport of amino acids, peptides, or inorganic ions. This subgroup has high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP); however, their ligand specificity has not been determined experimentally.	352
380564	cd06341	PBP1_ABC_ligand_binding-like	type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems predicted to be involved in transport of amino acids, peptides, or inorganic ions. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems that are predicted to be involved in transport of amino acids, peptides, or inorganic ions. Members of this group are sequence-similar to members of the family of ABC-type hydrophobic amino acid transporters such as leucine-isoleucine-valine binding protein (LIVBP); however their ligand specificity has not been determined experimentally.	340
380565	cd06342	PBP1_ABC_LIVBP-like	type 1 periplasmic ligand-binding domain of ABC (Atpase Binding Cassette)-type active transport systems involved in the transport of all three branched chain aliphatic amino acids (leucine, isoleucine and valine). This subgroup includes the type 1 periplasmic ligand-binding domain of ABC (Atpase Binding Cassette)-type active transport systems that are involved in the transport of all three branched chain aliphatic amino acids (leucine, isoleucine and valine). This subgroup also includes a leucine-specific binding protein (or LivK), which is very similar in sequence and structure to leucine-isoleucine-valine binding protein (LIVBP). ABC-type active transport systems are transmembrane proteins that function in the transport of diverse sets of substrates across extra- and intracellular membranes, including carbohydrates, amino acids, inorganic ions, dipeptides and oligopeptides, metabolic products, lipids and sterols, and heme, to name a few.	334
380566	cd06343	PBP1_ABC_ligand_binding-like	type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems predicted to be involved in uptake of amino acids, peptides, or inorganic ions. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems that are predicted to be involved in uptake of amino acids, peptides, or inorganic ions. This subgroup has high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP); however its ligand specificity has not been determined experimentally.	355
380567	cd06344	PBP1_ABC_HAAT-like	type 1 periplasmic ligand-binding domain of uncharacterized ABC (Atpase Binding Cassette)-type active transport systems predicted to be involved in uptake of hydrophobic amino acids or peptides. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (Atpase Binding Cassette)-type active transport systems that are predicted to be involved in the uptake of hydrophobic amino acids or peptides. This subgroup has high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP); however, its ligand specificity has not been determined experimentally.	332
380568	cd06345	PBP1_ABC_ligand_binding-like	type 1 periplasmic ligand-binding domain of uncharacterized ABC (Atpase Binding Cassette)-type active transport systems predicted to be involved in uptake of amino acids, peptides, or inorganic ions. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (Atpase Binding Cassette)-type active transport systems that are predicted to be involved in uptake of amino acids, peptides, or inorganic ions. This subgroup has high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP); however, its ligand specificity has not been determined experimentally.	356
380569	cd06346	PBP1_ABC_ligand_binding-like	type 1 periplasmic ligand-binding domain of uncharacterized ABC (Atpase Binding Cassette)-type active transport systems predicted to be involved in uptake of amino acids, peptides, or inorganic ions. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (Atpase Binding Cassette)-type active transport systems that are predicted to be involved in uptake of amino acids, peptides, or inorganic ions. This subgroup has high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP); however, its ligand specificity has not been determined experimentally.	314
380570	cd06347	PBP1_ABC_LivK_ligand_binding-like	type 1 periplasmic ligand-binding domain of uncharacterized ABC (Atpase Binding Cassette)-type active transport systems predicted to be involved in uptake of amino acids, peptides, or inorganic ions. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (Atpase Binding Cassette)-type active transport systems that are predicted to be involved in uptake of amino acids, peptides, or inorganic ions. This subgroup has high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP); however, its ligand specificity has not been determined experimentally.	334
380571	cd06348	PBP1_ABC_HAAT-like	type 1 periplasmic ligand-binding domain of uncharacterized ABC (Atpase Binding Cassette)-type active transport systems predicted to be involved in uptake of amino acids or peptides. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (Atpase Binding Cassette)-type active transport systems that are predicted to be involved in the uptake of amino acids or peptides. This subgroup has high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP); however, its ligand specificity has not been determined experimentally.	342
380572	cd06349	PBP1_ABC_HAAT-like	type 1 periplasmic ligand-binding domain of uncharacterized ABC (Atpase Binding Cassette)-type active transport systems predicted to be involved in uptake of amino acids or peptides. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (Atpase Binding Cassette)-type active transport systems that are predicted to be involved in the uptake of amino acids or peptides. This subgroup has high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP); however, its ligand specificity has not been determined experimentally.	338
380573	cd06350	PBP1_GPCR_family_C-like	ligand-binding domain of membrane-bound glutamate receptors that mediate excitatory transmission on the cellular surface through initial binding of glutamate; categorized into ionotropic glutamate receptors (iGluRs) and metabotropic glutamate receptors (mGluRs). Ligand-binding domain of membrane-bound glutamate receptors that mediate excitatory transmission on the cellular surface through initial binding of glutamate and are categorized into ionotropic glutamate receptors (iGluRs) and metabotropic glutamate receptors (mGluRs). The metabotropic glutamate receptors (mGluR) are key receptors in the modulation of excitatory synaptic transmission in the central nervous system. The mGluRs are coupled to G proteins and are thus distinct from the iGluRs which internally contain ligand-gated ion channels. The mGluR structure is divided into three regions: the extracellular region, the seven-spanning transmembrane region and the cytoplasmic region. The extracellular region is further divided into the ligand-binding domain (LBD) and the cysteine-rich domain. The LBD has sequence similarity to the LIVBP, which is a bacterial periplasmic protein (PBP), as well as to the extracellular region of both iGluR and the gamma-aminobutyric acid (GABA)b receptor. iGluRs are divided into three main subtypes based on pharmacological profile: NMDA, AMPA, and kainate receptors. All family C GPCRs have a large extracellular N terminus that contains a domain with homology to bacterial periplasmic amino acid-binding proteins.	350
380574	cd06351	PBP1_iGluR_N_LIVBP-like	N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the NMDA, AMPA, and kainate receptor subtypes of ionotropic glutamate receptors (iGluRs). N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the NMDA, AMPA, and kainate receptor subtypes of ionotropic glutamate receptors (iGluRs). While this N-terminal domain belongs to the periplasmic-binding fold type 1 superfamily, the glutamate-binding domain of the iGluR is structurally homologous to the periplasmic-binding fold type 2. The LIVBP-like domain of iGluRs is thought to play a role in the initial assembly of iGluR subunits, but it is not well understood how this domain is arranged and functions in intact iGluR. Glutamate mediates the majority of excitatory synaptic transmission in the central nervous system via two broad classes of ionotropic receptors characterized by their response to glutamate agonists: N-methyl-aspartate (NMDA) and non-NMDA receptors. NMDA receptors have intrinsically slow kinetics, are highly permeable to Ca2+, and are blocked by extracellular Mg2+ in a voltage-dependent manner. On the other hand, non-NMDA receptors have faster kinetics, are weakly permeable to Ca2+, and are not blocked by extracellular Mg2+. While non-NMDA receptors typically mediate excitatory synaptic responses at resting membrane potentials, NMDA receptors contribute to several forms of synaptic plasticity and are suggested to play an important role in the development of synaptic pathways.	348
380575	cd06352	PBP1_NPR_GC-like	ligand-binding domain of membrane guanylyl-cyclase receptors. Ligand-binding domain of membrane guanylyl-cyclase receptors. Membrane guanylyl cyclases (GC) have a single membrane-spanning region and are activated by endogenous and exogenous peptides. This family can be divided into three major subfamilies: the natriuretic peptide receptors (NPRs), sensory organ-specific membrane GCs, and the enterotoxin/guanylin receptors. The binding of peptide ligands to the receptor results in the activation of the cytosolic catalytic domain. Three types of NPRs have been cloned from mammalian tissues: NPR-A/GC-A, NPR-B/GC-B, and NPR-C. In addition, two of the GCs, GC-D and GC-G, appear to be pseudogenes in humans. Atrial natriuretic peptide (ANP) and brain natriuretic peptide (BNP) are produced in the heart, and both bind to the NPR-A. NPR-C, also termed the clearance receptor, binds each of the natriuretic peptides and can alter circulating levels of these peptides. The ligand binding domain of the NPRs exhibits strong structural similarity to the type 1 periplasmic binding fold protein family.	391
380576	cd06353	PBP1_Med-like	periplasmic binding domain of the basic membrane lipoprotein Med in Bacillus and its close homologs from other bacteria and Archaea. Periplasmic binding domain of the basic membrane lipoprotein Med in Bacillus and its close homologs from other bacteria and Archaea. Med, a cell-surface localized protein, which regulates the competence transcription factor gene comK in Bacillus subtilis, lacks the DNA binding domain when compared with structures of transcription regulators from the LacI family. Nevertheless, Med has significant overall sequence homology to various periplasmic substrate-binding proteins. Moreover, the structure of Med shows a striking similarity to PnrA, a periplasmic nucleoside binding protein of an ATP-binding cassette transport system. Members of this group contain the type 1 periplasmic sugar-binding protein-like fold.	260
380577	cd06354	PBP1_PrnA-like	periplasmic binding domain of basic membrane lipoprotein, PnrA, in Treponema pallidum and its homologs from other bacteria and Archaea. Periplasmic binding domain of basic membrane lipoprotein, PnrA, in Treponema pallidum and its homologs from other bacteria and Archaea. The PnrA lipoprotein, also known as Tp0319 or TmpC, represents a novel family of bacterial purine nucleoside receptor encoded within an ATP-binding cassette (ABC) transport system (pnrABCDE). It shows a striking structural similarity to another basic membrane lipoprotein Med which regulates the competence transcription factor gene, comK, in Bacillus subtilis. The members of PnrA-like subgroup are likely to have similar nucleoside-binding functions and a similar type 1 periplasmic sugar-binding protein-like fold.	268
380578	cd06355	PBP1_FmdD-like	periplasmic component (FmdD) of an active transport system for short-chain amides and urea (FmdDEF). This group includes the periplasmic component (FmdD) of an active transport system for short-chain amides and urea (FmdDEF), found in Methylophilus methylotrophus, and its homologs from other bacteria. FmdD, a type 1 periplasmic binding protein, is induced by short-chain amides and urea and repressed by excess ammonia, while FmdE and FmdF are hydrophobic transmembrane proteins. FmdDEF is predicted to be an ATP-dependent transporter and closely resembles the periplasmic binding protein and the two transmembrane proteins present in various hydrophobic amino acid-binding transport systems.	347
380579	cd06356	PBP1_amide_urea_BP-like	periplasmic component (FmdD) of an active transport system for short-chain amides and urea (FmdDEF). This group includes the type 1 periplasmic-binding proteins that are predicted to have a function similar to that of an active transport system for short chain amides and/or urea in bacteria and Archaea, by sequence comparison and phylogenetic analysis.	334
380580	cd06357	PBP1_AmiC	periplasmic binding domain of amidase (AmiC) that belongs to the type 1 periplasmic binding fold protein family. This group includes the periplasmic binding domain of amidase (AmiC) that belongs to the type 1 periplasmic binding fold protein family. AmiC controls expression of the amidase operon by the ligand-triggered conformational switch. In the absence of ligand or presence of butyramide (repressor), AmiC (the ligand sensor and negative regulator) adopts an open conformation and inhibits the transcription antitermination function of AmiR by direct protein-protein interaction. In the presence of inducing ligands such as acetamide, AmiC adopts a closed conformation which disrupts a silencing AmiC-AmiR complex and the expression of amidase and other genes of the operon are induced.	357
380581	cd06358	PBP1_NHase	type 1 periplasmic-binding protein of the nitrile hydratase (NHase) system that selectively converts nitriles to corresponding amides. This group includes the type 1 periplasmic-binding protein of the nitrile hydratase (NHase) system that selectively converts nitriles to corresponding amides, which are subsequently converted by amidases to yield free carboxylic acids and ammonia. NHases from bacteria and fungi have been purified and characterized. In Rhodococcus sp., the nitrile hydratase operon consists of six genes encoding NHase regulator 2, NHase regulator 1, amidase, NHase alpha subunit, NHase beta subunit, and NHase activator. The operon produces a constitutive hydratase that has a broad substrate spectrum: aliphatic and aromatic nitriles, mononitriles and dinitriles, hydroxynitriles and amino-nitriles, and a constitutive amidase of equally low substrate specificity. NHases are metalloenzymes containing either cobalt or iron, and therefore can be classified into two subgroups: ferric NHases and cobalt NHases.	333
380582	cd06359	PBP1_Nba-like	type 1 periplasmic binding component of active transport systems predicted to be involved in 2-nitrobenzoic acid degradation pathway. This group includes the type 1 periplasmic binding component of active transport systems that are predicted to be involved in 2-nitrobenzoic acid degradation pathway; their substrate specificities are not well characterized.	333
380583	cd06360	PBP1_alkylbenzenes-like	type 1 periplasmic binding component of active transport systems predicted to be involved in anaerobic biodegradation of alkylbenzenes such as toluene and ethylbenzene. This group includes the type 1 periplasmic binding component of active transport systems that are predicted to be involved in anaerobic biodegradation of alkylbenzenes such as toluene and ethylbenzene; however, their substrate specificity is not well characterized.	357
380584	cd06361	PBP1_GPC6A-like	ligand-binding domain of the promiscuous L-alpha-amino acid receptor GPRC6A which is a broad-spectrum amino acid-sensing receptor. This family includes the ligand-binding domain of the promiscuous L-alpha-amino acid receptor GPRC6A which is a broad-spectrum amino acid-sensing receptor, and its fish homolog, the 5.24 chemoreceptor. GPRC6A is a member of the family C of G-protein-coupled receptors that transduce extracellular signals into G-protein activation and ultimately into cellular responses.	401
380585	cd06362	PBP1_mGluR	ligand binding domain of metabotropic glutamate receptors (mGluR). Ligand binding domain of the metabotropic glutamate receptors (mGluR), which are members of the family C of G-protein-coupled receptors that transduce extracellular signals into G-protein activation and ultimately into cellular responses. mGluRs bind glutamate, an excitatory neurotransmitter; they are involved in learning, memory, anxiety, and the perception of pain. Eight subtypes of mGluRs have been cloned so far, and are classified into three groups according to their sequence similarities, transduction mechanisms, and pharmacological profiles. Group I is composed of mGlu1R and mGlu5R that both stimulate PLC hydrolysis. Group II includes mGlu2R and mGlu3R, which inhibit adenylyl cyclase, as do mGlu4R, mGlu6R, mGlu7R, and mGlu8R, which form group III.	460
380586	cd06363	PBP1_taste_receptor	ligand-binding domain of the T1R taste receptor. Ligand-binding domain of the T1R taste receptor. The T1R is a member of the family C receptors within the G-protein coupled receptor superfamily, which also includes the metabotropic glutamate receptors, GABAb receptors, the calcium-sensing receptor (CaSR), the V2R pheromone receptors, and a small group of uncharacterized orphan receptors.	418
380587	cd06364	PBP1_CaSR	ligand-binding domain of the CaSR calcium-sensing receptor, a member of the family C receptors within the G-protein coupled receptor superfamily. Ligand-binding domain of the CaSR calcium-sensing receptor, which is a member of the family C receptors within the G-protein coupled receptor superfamily. CaSR provides feedback control of extracellular calcium homeostasis by responding sensitively to acute fluctuations in extracellular ionized Ca2+ concentration. This ligand-binding domain has homology to the bacterial leucine-isoleucine-valine binding protein (LIVBP) and a leucine binding protein (LBP). CaSR is widely expressed in mammalian tissues and is active in tissues that are not directly involved in extracellular calcium homeostasis. Moreover, CaSR responds to aromatic, aliphatic, and polar amino acids, but not to positively charged or branched chain amino acids, which suggests that changes in plasma amino acid levels are likely to modulate whole body calcium metabolism. Additionally, the family C GPCRs includes at least two receptors with broad-spectrum amino acid-sensing properties: GPRC6A which recognizes basic and various aliphatic amino acids, its gold-fish homolog the 5.24 chemoreceptor, and a specific taste receptor (T1R) which responds to aliphatic, polar, charged, and branched amino acids, but not to aromatic amino acids.	473
380588	cd06365	PBP1_pheromone_receptor	Ligand-binding domain of the V2R pheromone receptor, a member of the family C receptors within the G-protein coupled receptor superfamily. Ligand-binding domain of the V2R pheromone receptor, a member of the family C receptors within the G-protein coupled receptor superfamily, which also includes the metabotropic glutamate receptor, the GABAb receptor, the calcium-sensing receptor (CaSR), the T1R taste receptor, and a small group of uncharacterized orphan receptors.	464
380589	cd06366	PBP1_GABAb_receptor	ligand-binding domain of GABAb receptors, which are metabotropic transmembrane receptors for gamma-aminobutyric acid (GABA). Ligand-binding domain of GABAb receptors, which are metabotropic transmembrane receptors for gamma-aminobutyric acid (GABA). GABA is the major inhibitory neurotransmitter in the mammalian CNS and, like glutamate and other transmitters, acts via both ligand gated ion channels (GABAa receptors) and G-protein coupled receptors (GABAb receptor or GABAbR). GABAa receptors are members of the ionotropic receptor superfamily which includes alpha-adrenergic and glycine receptors. The GABAb receptor is a member of a receptor superfamily which includes the mGlu receptors. The GABAb receptor is coupled to G alpha-i proteins, and activation causes a decrease in calcium, an increase in potassium membrane conductance, and inhibition of cAMP formation. The response is thus inhibitory and leads to hyperpolarization and decreased neurotransmitter release, for example.	404
380590	cd06367	PBP1_iGluR_NMDA	N-terminal leucine-isoleucine-valine-binding protein (LIVBP)-like domain of the ionotropic N-methyl-D-aspartate (NMDA) subtype of glutamate receptors. N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the ionotropic N-methyl-D-aspartate (NMDA) subtype of glutamate receptors. While this N-terminal domain belongs to the periplasmic-binding fold type 1 superfamily, the glutamate-binding domain of the iGluR is structurally homologous to the periplasmic-binding fold type 2. The LIVBP-like domain of iGluRs is thought to play a role in the initial assembly of iGluR subunits, but it is not well understood how this domain is arranged and functions in intact iGluR. The NMDA subtype receptor serves critical functions in neuronal development, functioning, and degeneration in the mammalian central nervous system. The functional NMDA receptor is a heterotetramer comprising two NR1 and two NR2 (A, B, C, and D) or NR3 (A and B) subunits. The receptor controls a cation channel that is highly permeable to monovalent ions and calcium and exhibits voltage-dependent inhibition by magnesium. Dual agonists, glutamate and glycine, are required for efficient activation of the NMDA receptor. Among NMDA receptor subtypes, the NR2B subunit containing receptors appear particularly important for pain perception; thus NR2B-selective antagonists may be useful in the treatment of chronic pain.	357
380591	cd06368	PBP1_iGluR_non_NMDA-like	N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the non-NMDA (N-methyl-D-aspartate) subtypes of ionotropic glutamate receptors. N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the non-NMDA (N-methyl-D-aspartate) subtypes of ionotropic glutamate receptors. While this N-terminal domain belongs to the periplasmic-binding fold type 1 superfamily, the glutamate-binding domain of the iGluR is structurally homologous to the periplasmic-binding fold type 2. The LIVBP-like domain of iGluRs is thought to play a role in the initial assembly of iGluR subunits, but it is not well understood how this domain is arranged and functions in intact iGluR. Glutamate mediates the majority of excitatory synaptic transmission in the central nervous system via two broad classes of ionotropic receptors, characterized by their response to glutamate agonists: N-methyl-D-aspartate (NMDA) and non-NMDA receptors. NMDA receptors have intrinsically slow kinetics, are highly permeable to Ca2+, and are blocked by extracellular Mg2+ in a voltage-dependent manner. Non-NMDA receptors have faster kinetics, are most often only weakly permeable to Ca2+, and are not blocked by extracellular Mg2+. While non-NMDA receptors typically mediate excitatory synaptic responses at resting membrane potentials, NMDA receptors contribute to several forms of synaptic plasticity and are thought to play an important role in the development of synaptic pathways. Non-NMDA receptors include alpha-amino-3-hydroxy-5-methyl-4-isoxazole propionate (AMPA) and kainate receptors.	339
380592	cd06369	PBP1_GC_C_enterotoxin_receptor	ligand-binding domain of the membrane guanylyl cyclase C. Ligand-binding domain of the membrane guanylyl cyclase C (GC-C or StaR). StaR is a key receptor for the STa (Escherichia coli Heat Stable enterotoxin), a potent stimulant of intestinal chloride and bicarbonate secretion that causes acute secretory diarrhea. The catalytic domain of the STa/guanylin receptor type membrane GC is highly similar to those of the natriuretic peptide receptor (NPR) type and sensory organ-specific type membrane GCs (GC-D, GC-E and GC-F). The GC-C receptor is mainly expressed in the intestine of most vertebrates, but is also found in the kidney and other organs. Moreover, GC-C is activated by guanylin and uroguanylin, endogenous peptide ligands synthesized in the intestine and kidney. Consequently, the receptor activation results in increased cGMP levels and phosphorylation of the CFTR chloride channel and secretion.	381
380593	cd06370	PBP1_SAP_GC-like	Ligand-binding domain of membrane bound guanylyl cyclases. Ligand-binding domain of membrane bound guanylyl cyclases (GCs), which are known to be activated by sperm-activating peptides (SAPs), such as speract or resact. These ligand peptides are released by a range of invertebrates to stimulate the metabolism and motility of spermatozoa and are also potent chemoattractants. These GCs contain a single transmembrane segment, an extracellular ligand binding domain, and intracellular protein kinase-like and cyclase catalytic domains. GCs of insect and nematodes, which exhibit high sequence similarity to the speract receptor are also included in this model.	400
380594	cd06371	PBP1_sensory_GC_DEF-like	ligand-binding domain of membrane guanylyl cyclases (GC-D, GC-E, and GC-F) that are specifically expressed in sensory tissues. This group includes the ligand-binding domain of membrane guanylyl cyclases (GC-D, GC-E, and GC-F) that are specifically expressed in sensory tissues. They share a similar topology with an N-terminal extracellular ligand-binding domain, a single transmembrane domain, and a C-terminal cytosolic region that contains kinase-like and catalytic domains. GC-D is specifically expressed in a subpopulation of olfactory sensory neurons. GC-E and GC-F are colocalized within the same photoreceptor cells of the retina and have important roles in phototransduction. Unlike the other family members, GC-E and GC-F have no known extracellular ligands. Instead, they are activated under low calcium conditions by guanylyl cyclase activating proteins called GCAPs. GC-D expressing neurons have been implicated in pheromone detection and GC-D is phylogenetically more similar to the Ca2+-regulated GC-E and GC-F than to receptor GC-A, -B and -C which are activated by peptide ligands. Moreover, these olfactory GCs and retinal GCs share characteristic sequence similarity in a regulatory domain that is involved in the binding of GCAPs, suggesting GC-D activity may be regulated by an unknown extracellular ligand and intracellular Ca2+. Rodent GC-D-expressing neurons have been implicated in pheromone detection and were recently shown to respond to atmospheric CO2 which is an olfactory stimulus for many invertebrates and regulates some insect innate behavior, such as the location of food and hosts.	379
380595	cd06372	PBP1_GC_G-like	Ligand-binding domain of membrane guanylyl cyclase G. This group includes the ligand-binding domain of membrane guanylyl cyclase G (GC-G) which is a sperm surface receptor and might function, similar to its sea urchin counterpart, in the early signaling event that regulates the Ca2+ influx/efflux and subsequent motility response in sperm. GC-G appears to be a pseudogene in human. Furthermore, in contrast to the other orphan receptor GCs, GC-G has a broad tissue distribution in rat, including lung, intestine, kidney, and skeletal muscle.	390
380596	cd06373	PBP1_NPR-like	Ligand binding domain of natriuretic peptide receptor (NPR) family. Ligand binding domain of natriuretic peptide receptor (NPR) family which consists of three different subtypes: type A natriuretic peptide receptor (NPR-A, or GC-A), type B natriuretic peptide receptors (NPR-B, or GC-B), and type C natriuretic peptide receptor (NPR-C). There are three types of natriuretic peptide (NP) ligands specific to the receptors: atrial NP (ANP), brain or B-type NP (BNP), and C-type NP (CNP). The NP family is thought to have arisen through gene duplication during evolution and plays an essential role in cardiovascular and body fluid homeostasis. ANP and BNP bind mainly to NPR-A, while CNP binds specifically to NPR-B. Both NPR-A and NPR-B have guanylyl cyclase catalytic activity and produce the intracellular secondary messenger cGMP in response to peptide-ligand binding. Consequently, the NPR-A activation results in vasodilation and inhibition of vascular smooth muscle cell proliferation. NPR-C acts as the receptor for all three members of the NP family, and functions as a clearance receptor. Unlike NPR-A and -B, NPR-C lacks an intracellular guanylyl cyclase domain and is thought to exert biological actions by sequestration of released natriuretic peptides and/or inhibition of adenylyl cyclase.	394
380597	cd06374	PBP1_mGluR_groupI	ligand binding domain of the group I metabotropic glutamate receptor. Ligand binding domain of the group I metabotropic glutamate receptor, a family containing mGlu1R and mGlu5R, all of which stimulate phospholipase C (PLC) hydrolysis. The metabotropic glutamate receptor is a member of the family C of G-protein-coupled receptors that transduce extracellular signals into G-protein activation and ultimately into intracellular responses. The mGluRs are classified into three groups which comprise eight subtypes.	474
380598	cd06375	PBP1_mGluR_groupII	ligand binding domain of the group II metabotropic glutamate receptor. Ligand binding domain of the group II metabotropic glutamate receptor, a family that contains mGlu2R and mGlu3R, all of which inhibit adenylyl cyclase. The metabotropic glutamate receptor is a member of the family C of G-protein-coupled receptors that transduce extracellular signals into G-protein activation and ultimately into intracellular responses. The mGluRs are classified into three groups which comprise eight subtypes.	462
380599	cd06376	PBP1_mGluR_groupIII	ligand-binding domain of the group III metabotropic glutamate receptor. Ligand-binding domain of the group III metabotropic glutamate receptor, a family which contains mGlu4R, mGlu6R, mGlu7R, and mGlu8R, all of which inhibit adenylyl cyclase. The metabotropic glutamate receptor is a member of the family C of G-protein-coupled receptors that transduce extracellular signals into G-protein activation and ultimately into intracellular responses. The mGluRs are classified into three groups which comprise eight subtypes.	467
380600	cd06377	PBP1_iGluR_NMDA_NR3	N-terminal leucine-isoleucine-valine-binding protein (LIVBP)-like domain of the NR3 subunit of NMDA receptor family. N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the NR3 subunit of NMDA receptor family. The ionotropic N-methyl-D-aspartate (NMDA) subtype of glutamate receptor serves critical functions in neuronal development, functioning, and degeneration in the mammalian central nervous system. The functional NMDA receptor is a heterotetramer composed of two NR1 and two NR2 (A, B, C, and D) or of NR3 (A and B) subunits. The receptor controls a cation channel that is highly permeable to monovalent ions and calcium and exhibits voltage-dependent inhibition by magnesium. Dual agonists, glutamate and glycine, are required for efficient activation of the NMDA receptor. Among NMDA receptor subtypes, the NR2B subunit containing receptors appear particularly important for pain perception; thus NR2B-selective antagonists may be useful in the treatment of chronic pain.	373
380601	cd06378	PBP1_iGluR_NMDA_NR2	N-terminal leucine-isoleucine-valine-binding protein (LIVBP)-like domain of the NR2 subunit of NMDA receptor family. N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the NR2 subunit of NMDA receptor family. The ionotropic N-methyl-D-aspartate (NMDA) subtype of glutamate receptor serves critical functions in neuronal development, functioning, and degeneration in the mammalian central nervous system. The functional NMDA receptor is a heterotetramer composed of two NR1 and two NR2 (A, B, C, and D) or of NR3 (A and B) subunits. The receptor controls a cation channel that is highly permeable to monovalent ions and calcium and exhibits voltage-dependent inhibition by magnesium. Dual agonists, glutamate and glycine, are required for efficient activation of the NMDA receptor. Among NMDA receptor subtypes, the NR2B subunit containing receptors appear particularly important for pain perception; thus NR2B-selective antagonists may be useful in the treatment of chronic pain.	356
380602	cd06379	PBP1_iGluR_NMDA_NR1	N-terminal leucine-isoleucine-valine-binding protein (LIVBP)-like domain of the NR1, an essential channel-forming subunit of the NMDA receptor. N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the NR1, an essential channel-forming subunit of the NMDA receptor. The ionotropic N-methyl-D-aspartate (NMDA) subtype of glutamate receptor serves critical functions in neuronal development, functioning, and degeneration in the mammalian central nervous system. The functional NMDA receptor is a heterotetramer composed of two NR1 and two NR2 (A, B, C, and D) or of NR3 (A and B) subunits. The receptor controls a cation channel that is highly permeable to monovalent ions and calcium and exhibits voltage-dependent inhibition by magnesium. Dual agonists, glutamate and glycine, are required for efficient activation of the NMDA receptor. When co-expressed with NR1, the NR3 subunits form receptors that are activated by glycine alone and therefore can be classified as excitatory glycine receptors. NR1/NR3 receptors are calcium-impermeable and unaffected by ligands acting at the NR2 glutamate-binding site.	364
380603	cd06380	PBP1_iGluR_AMPA	N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the AMPA receptor. N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the AMPA (alpha-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid) receptor, a member of the glutamate-receptor ion channels (iGluRs). AMPA receptors are the major mediators of excitatory synaptic transmission in the central nervous system. While this N-terminal domain belongs to the periplasmic-binding fold type 1 superfamily, the glutamate-binding domain of the iGluR is structurally homologous to the periplasmic-binding fold type 2. The LIVBP-like domain of iGluRs is thought to play a role in the initial assembly of iGluR subunits, but it is not well understood how this domain is arranged and functions in intact iGluR. AMPA receptors consist of four types of subunits (GluR1, GluR2, GluR3, and GluR4) which combine to form a tetramer and play an important role in mediating the rapid excitatory synaptic current.	390
380604	cd06381	PBP1_iGluR_delta-like	N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of an orphan family of delta receptors, GluRdelta1 and GluRdelta2. This CD represents the N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of an orphan family of delta receptors, GluRdelta1 and GluRdelta2. While this N-terminal domain belongs to the periplasmic-binding fold type 1 superfamily, the glutamate-binding domain of the iGluR is structurally homologous to the periplasmic-binding fold type 2. The LIVBP-like domain of iGluRs is thought to play a role in the initial assembly of iGluR subunits, but it is not well understood how this domain is arranged and functions in intact iGluR. Although the delta receptors are members of the ionotropic glutamate receptor family, they cannot be activated by AMPA, kainate, NMDA, glutamate, or any other ligands. Phylogenetic analysis shows that both GluRdelta1 and GluRdelta2 are more homologous to non-NMDA receptors. GluRdelta2 was shown to function as an AMPA-like receptor by mutation analysis. Moreover, targeted disruption of GluRdelta2 gene caused motor coordination impairment, Purkinje cell maturation, and long-term depression of synaptic transmission. It has been suggested that GluRdelta2 is the receptor for cerebellin 1, a glycoprotein of the C1q and tumor necrosis factor family which is secreted from cerebellar granule cells. Furthermore, recent studies have shown that the orphan GluRdelta1 plays an essential role in high-frequency hearing and ionic homeostasis in the basal cochlea and that the locus encoding GluRdelta1 may be involved in congenital or acquired high-frequency hearing loss in humans.	401
380605	cd06382	PBP1_iGluR_Kainate	N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the kainate receptors. N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the kainate receptors, non-NMDA ionotropic receptors which respond to the neurotransmitter glutamate. While this N-terminal domain belongs to the periplasmic-binding fold type 1 superfamily, the glutamate-binding domain of the iGluR is structurally homologous to the periplasmic-binding fold type 2. The LIVBP-like domain of iGluRs is thought to play a role in the initial assembly of iGluR subunits, but it is not well understood how this domain is arranged and functions in intact iGluR. Kainate receptors have five subunits, GluR5, GluR6, GluR7, KA1 and KA2, which are structurally similar to AMPA and NMDA subunits of ionotropic glutamate receptors. KA1 and KA2 subunits can only form functional receptors with one of the GluR5-7 subunits. Moreover, GluR5-7 can also form functional homomeric receptor channels activated by kainate and glutamate when expressed in heterologous systems. Kainate receptors are involved in excitatory neurotransmission by activating postsynaptic receptors and in inhibitory neurotransmission by modulating release of the inhibitory neurotransmitter GABA through a presynaptic mechanism. Kainate receptors are closely related to AMPA receptors. In contrast to AMPA receptors, kainate receptors play only a minor role in signaling at synapses and their function is not well defined.	335
380606	cd06383	PBP1_iGluR_AMPA_Like	N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of uncharacterized AMPA-like receptors. N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of uncharacterized AMPA-like receptors. While this N-terminal domain belongs to the periplasmic-binding fold type 1 superfamily, the glutamate-binding domain of the iGluR is structurally homologous to the periplasmic-binding fold type 2. The LIVBP-like domain of iGluRs is thought to play a role in the initial assembly of iGluR subunits, but it is not well understood how this domain is arranged and functions in intact iGluR. AMPA receptors consist of four types of subunits (GluR1, GluR2, GluR3, and GluR4) which combine to form a tetramer and play an important role in mediating the rapid excitatory synaptic current.	379
380607	cd06384	PBP1_NPR_B	ligand-binding domain of type B natriuretic peptide receptor. Ligand-binding domain of type B natriuretic peptide receptor (NPR-B). NPR-B is one of three known single membrane-spanning natriuretic peptide receptors that have been identified. Natriuretic peptides are a family of structurally related but genetically distinct hormones/paracrine factors that regulate blood volume, blood pressure, ventricular hypertrophy, pulmonary hypertension, fat metabolism, and long bone growth. In mammals there are three natriuretic peptides: ANP, BNP, and CNP. Like NPR-A (or GC-A), NPR-B (or GC-B) is a transmembrane guanylyl cyclase, an enzyme that catalyzes the synthesis of cGMP. NPR-B is the predominant natriuretic peptide receptor in the brain. The rank order of NPR-B activation by natriuretic peptides is CNP>>ANP>BNP. Homozygous inactivating mutations in human NPR-B cause a form of short-limbed dwarfism known as acromesomelic dysplasia type Maroteaux.	399
380608	cd06385	PBP1_NPR_A	Ligand-binding domain of type A natriuretic peptide receptor. Ligand-binding domain of type A natriuretic peptide receptor (NPR-A). NPR-A is one of three known single membrane-spanning natriuretic peptide receptors that regulate blood volume, blood pressure, ventricular hypertrophy, pulmonary hypertension, fat metabolism, and long bone growth. In mammals there are three natriuretic peptides: ANP, BNP, and CNP. NPR-A is highly expressed in kidney, adrenal, terminal ileum, adipose, aortic, and lung tissues. The rank order of NPR-A activation by natriuretic peptides is ANP>BNP>>CNP. Single allele-inactivating mutations in the promoter of human NPR-A are associated with hypertension and heart failure.	408
380609	cd06386	PBP1_NPR_C	ligand-binding domain of type C natriuretic peptide receptor. Ligand-binding domain of type C natriuretic peptide receptor (NPR-C). NPR-C is found in atrial, mesentery, placenta, lung, kidney, venous tissue, aortic smooth muscle, and aortic endothelial cells. The affinity of NPR-C for natriuretic peptides is ANP>CNP>BNP. The extracellular domain of NPR-C is about 30% identical to NPR-A and NPR-B. However, unlike the cyclase-linked receptors, it contains only 37 intracellular amino acids and no guanylyl cyclase activity. Major function of NPR-C is to clear natriuretic peptides from the circulation or extracellular surroundings through constitutive receptor-mediated internalization and degradation.	391
380610	cd06387	PBP1_iGluR_AMPA_GluR3	N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the GluR3 subunit of the AMPA receptor. N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the GluR3 subunit of the AMPA (alpha-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid) receptor. The AMPA receptor is a member of the glutamate-receptor ion channels (iGluRs) which are the major mediators of excitatory synaptic transmission in the central nervous system. AMPA receptors are composed of four types of subunits (GluR1, GluR2, GluR3, and GluR4) which combine to form a tetramer and play an important role in mediating the rapid excitatory synaptic current. Furthermore, this N-terminal domain of the iGluRs has homology with LIVBP, a bacterial periplasmic binding protein, as well as with the structurally related glutamate-binding domain of the G-protein-coupled metabotropic receptors (mGluRs).	375
380611	cd06388	PBP1_iGluR_AMPA_GluR4	N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the GluR4 subunit of the AMPA receptor. N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the GluR4 subunit of the AMPA (alpha-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid) receptor. The AMPA receptor is a member of the glutamate-receptor ion channels (iGluRs) which are the major mediators of excitatory synaptic transmission in the central nervous system. AMPA receptors are composed of four types of subunits (GluR1, GluR2, GluR3, and GluR4) which combine to form a tetramer and play an important role in mediating the rapid excitatory synaptic current. Furthermore, this N-terminal domain of the iGluRs has homology with LIVBP, a bacterial periplasmic binding protein, as well as with the structurally related glutamate-binding domain of the G-protein-coupled metabotropic receptors (mGluRs).	373
380612	cd06389	PBP1_iGluR_AMPA_GluR2	N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the GluR2 subunit of the AMPA receptor. N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the GluR2 subunit of the AMPA (alpha-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid) receptor. The AMPA receptor is a member of the glutamate-receptor ion channels (iGluRs) which are the major mediators of excitatory synaptic transmission in the central nervous system. AMPA receptors are composed of four types of subunits (GluR1, GluR2, GluR3, and GluR4) which combine to form a tetramer and play an important role in mediating the rapid excitatory synaptic current. Furthermore, this N-terminal domain of the iGluRs has homology with LIVBP, a bacterial periplasmic binding protein, as well as with the structurally related glutamate-binding domain of the G-protein-coupled metabotropic receptors (mGluRs).	372
380613	cd06390	PBP1_iGluR_AMPA_GluR1	N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the GluR1 subunit of the AMPA receptor. N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the GluR1 subunit of the AMPA (alpha-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid) receptor. The AMPA receptor is a member of the glutamate-receptor ion channels (iGluRs) which are the major mediators of excitatory synaptic transmission in the central nervous system. AMPA receptors are composed of four types of subunits (GluR1, GluR2, GluR3, and GluR4) which combine to form a tetramer and play an important role in mediating the rapid excitatory synaptic current. Furthermore, this N-terminal domain of the iGluRs has homology with LIVBP, a bacterial periplasmic binding protein, as well as with the structurally related glutamate-binding domain of the G-protein-coupled metabotropic receptors (mGluRs).	367
380614	cd06391	PBP1_iGluR_delta_2	N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the delta2 receptor of an orphan glutamate receptor family. N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the delta2 receptor of an orphan glutamate receptor family. While this N-terminal domain belongs to the periplasmic-binding fold type 1 superfamily, the glutamate-binding domain of the iGluR is structurally homologous to the periplasmic-binding fold type 2. The LIVBP-like domain of iGluRs is thought to play a role in the initial assembly of iGluR subunits, but it is not well understood how this domain is arranged and functions in intact iGluR. Although the delta receptors are a member of the ionotropic glutamate receptor family, they cannot be activated by AMPA, kainate, NMDA, glutamate, or any other ligands. Phylogenetic analysis shows that both GluRdelta1 and GluRdelta2 are more closely related to non-NMDA receptors. GluRdelta2 was shown to function as an AMPA-like receptor by mutation analysis. Moreover, targeted disruption of GluRdelta2 gene caused motor coordination impairment, Purkinje cell maturation, and long-term depression of synaptic transmission. It has been suggested that GluRdelta2 is the receptor for cerebellin 1, a glycoprotein of the C1q and tumor necrosis factor family that is secreted from cerebellar granule cells.	402
380615	cd06392	PBP1_iGluR_delta_1	N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the delta1 receptor of an orphan glutamate receptor family. N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the delta1 receptor of an orphan glutamate receptor family. While this N-terminal domain belongs to the periplasmic-binding fold type 1 superfamily, the glutamate-binding domain of the iGluR is structurally homologous to the periplasmic-binding fold type 2. The LIVBP-like domain of iGluRs is thought to play a role in the initial assembly of iGluR subunits, but it is not well understood how this domain is arranged and functions in intact iGluR. Although the delta receptors are a member of the ionotropic glutamate receptor family, they cannot be activated by AMPA, kainate, NMDA, glutamate, or any other ligands. Phylogenetic analysis shows that both GluRdelta1 and GluRdelta2 may be more closely related to non-NMDA receptors. In contrast to GluRdelta2, GluRdelta1 is expressed in many areas in the developing CNS, including the hippocampus and the caudate putamen. Furthermore, recent studies have shown that the orphan GluRdelta1 plays an essential role in high-frequency hearing and ionic homeostasis in the basal cochlea and that the locus encoding GluRdelta1 may be involved in congenital or acquired high-frequency hearing loss in humans.	402
380616	cd06394	PBP1_iGluR_Kainate_KA1_2	N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the KA1 and KA2 subunits of Kainate receptor. N-terminal leucine-isoleucine-valine binding protein (LIVBP)-like domain of the KA1 and KA2 subunits of Kainate receptor. While this N-terminal domain belongs to the periplasmic-binding fold type 1 superfamily, the glutamate-binding domain of the iGluR is structurally homologous to the periplasmic-binding fold type 2. The LIVBP-like domain of iGluRs is thought to play a role in the initial assembly of iGluR subunits, but it is not well understood how this domain is arranged and functions in intact iGluR. There are five types of kainate receptors, GluR5, GluR6, GluR7, KA1, and KA2, which are structurally similar to AMPA and NMDA subunits of ionotropic glutamate receptors. KA1 and KA2 subunits can only form functional receptors with one of the GluR5-7 subunits. Moreover, GluR5-7 can also form functional homomeric receptor channels activated by kainate and glutamate when expressed in heterologous systems. Kainate receptors are involved in excitatory neurotransmission by activating postsynaptic receptors and in inhibitory neurotransmission by modulating release of the inhibitory neurotransmitter GABA through a presynaptic mechanism. Kainate receptors are closely related to AMPA receptors. In contrast to AMPA receptors, kainate receptors play only a minor role in signaling at synapses and their function is not well defined.	379
99717	cd06395	PB1_Map2k5	The PB1 domain is an essential part of the mitogen-activated protein kinase kinase 5 (Map2k5, alias MEK5), one of the key members of the signaling kinase cascade involved in angiogenesis and early cardiovascular development. The PB1 domain of Map2k5 interacts with the PB1 domain of another member of the kinase cascade, MEKK2 (or MEKK3).  A canonical PB1-PB1 interaction, involving heterodimerization of two PB1 domains, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domains depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif, acidic amino acid cluster, type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster.  The Map2k5 protein contains a type I PB1 domain.	91
99718	cd06396	PB1_NBR1	The PB1 domain is an essential part of NBR1 protein, next to BRCA1, a scaffold protein mediating specific protein-protein interaction with both titin protein kinase and with another scaffold protein p62. A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domain, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domain depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif, acidic aminoacid cluster, type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster. The NBR1 protein contains a type I PB1 domain.	81
99719	cd06397	PB1_UP1	Uncharacterized protein 1. The PB1 domain is a modular domain mediating specific protein-protein interaction which play a role in many critical cell processes, such as osteoclastogenesis, angiogenesis, early cardiovascular development, and cell polarity. A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domain, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domain depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif, acidic aminoacid cluster, type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster. Interactions of PB1 domains with other protein domains have been described as noncanonical PB1-interactions.	82
99720	cd06398	PB1_Joka2	The PB1 domain is present in the Nicotiana plumbaginifolia Joka2 protein which interacts with sulfur stress inducible UP9 protein. The PB1 domain is a modular domain mediating specific protein-protein interactions which play a role in many critical cell processes, such as osteoclastogenesis, angiogenesis, early cardiovascular development and cell polarity. A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domain, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domain depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif, acidic aminoacid cluster, type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster.  Interactions of PB1 domains with other protein domains have been described as noncanonical PB1-interactions. The PB1 domain module is conserved in amoebas, fungi, animals, and plants.	91
99721	cd06399	PB1_P40	The PB1 domain is an essential part of the p40 adaptor protein which plays an important role in activating phagocyte NADPH oxidase during phagocytosis. The PB1 domain is a modular domain mediating specific protein-protein interactions which play a role in many critical cell processes, such as osteoclastogenesis, angiogenesis, early cardiovascular development and cell polarity. A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domains, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domains depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif, acidic amino acid cluster, type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster. The PB1 domain of p40 represents a type I PB1 domain which interacts with the PB1 domain of oxidase activator p67, which is a type II PB1 domain. Interactions of PB1 domains with other protein domains have been described as noncanonical PB1-interactions. The PB1 domain module is conserved in amoebas, fungi, animals, and plants.	92
99722	cd06401	PB1_TFG	The PB1 domain found in TFG protein, an oncogenic gene product and fusion partner to nerve growth factor tyrosine kinase receptor TrkA and to the tyrosine kinase ALK. The PB1 domain is a modular domain mediating specific protein-protein interaction in many critical cell processes, such as osteoclastogenesis, angiogenesis, early cardiovascular development and cell polarity. A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domains, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domain depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif, acidic aminoacid cluster, type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster.  The PB1 domains of TFG represent a type I/II PB1 domain. The physiological function of TFG remains unknown.	81
99723	cd06402	PB1_p62	The PB1 domain is an essential part of p62 scaffold protein (alias sequestosome 1,SQSTM) involved in cell signaling, receptor internalization, and protein turnover. The PB1 domain is a modular domain mediating specific protein-protein interaction which play roles in many critical cell processes. A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domains, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domain depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif, acidic aminoacid cluster, type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster.  Interactions of PB1 domains with other protein domains have been described as noncanonical PB1-interactions. The PB1 domain module is conserved in amoebas, fungi, animals, and plants.	87
99724	cd06403	PB1_Par6	The PB1 domain is an essential part of Par6 protein which in complex with Par3 and aPKC proteins is crucial for establishment of apical-basal polarity of animal cells. The PB1 domain is a modular domain mediating specific protein-protein interactions which play a role in many critical cell processes. A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domains, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domain depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif, acidic aminoacid cluster, type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster.  Interactions of PB1 domains with other protein domains have been described as noncanonical PB1-interactions. The PB1 domain module is conserved in amoebas, fungi, animals, and plants. The Par6 protein contains a type II PB1 domain.	80
99725	cd06404	PB1_aPKC	PB1 domain is an essential modular domain of the atypical protein kinase C (aPKC) which in complex with Par6 and Par3  proteins is crucial for establishment of apical-basal polarity of animal cells. PB1 domain is a modular domain mediating specific protein-protein interaction which play roles in many critical cell processes. A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domains, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domain depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif, acidic aminoacid cluster, type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster.  Interactions of PB1 domains with other protein domains have been described as noncanonical PB1-interactions. The PB1 domain module is conserved in amoebas, fungi, animals, and plants. The aPKC protein contains a type I/II PB1 domain.	83
99726	cd06405	PB1_Mekk2_3	The PB1 domain is present in the two mitogen-activated protein kinase kinases MEKK2 and MEKK3 which are two members of the signaling kinase cascade involved in angiogenesis and early cardiovascular development. The PB1 domain of MEKK2 (and/or MEKK3) interacts with the PB1 domain of another member of the kinase cascade Map2k5.  A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domains, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domain depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif, acidic aminoacid cluster, type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster.  Interactions of PB1 domains with other protein domains have been described as noncanonical PB1-interactions. The PB1 domain module is conserved in amoebas, fungi, animals, and plants. The MEKK2 and MEKK3 proteins contain a type II PB1 domain.	79
99727	cd06406	PB1_P67	A PB1 domain is present in p67 proteins which form a signaling complex with p40, a crucial step for activation of NADPH oxidase during phagocytosis. The PB1 domain is a modular domain mediating specific protein-protein interactions which play a role in many critical cell processes. A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domains, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domains depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif, acidic amino acid cluster, type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster.  Interactions of PB1 domains with other protein domains have been described as noncanonical PB1-interactions. The PB1 domain module is conserved in amoebas, fungi, animals, and plants. The p67 proteins contain a type II PB1 domain.	80
99728	cd06407	PB1_NLP	A PB1 domain is present in NIN like proteins (NLP), a key enzyme in a process of establishment of symbiosis between legumes and nitrogen fixing bacteria (Rhizobium). The PB1 domain is a modular domain mediating specific protein-protein interactions which play a role in many critical cell processes like osteoclastogenesis, angiogenesis, early cardiovascular development, and cell polarity. A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domains, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domains depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif, acidic amino acid cluster, type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster.  Interactions of PB1 domains with other protein domains have been described as noncanonical PB1-interactions. The PB1 domain module is conserved in amoebas, fungi, animals, and plants.	82
99729	cd06408	PB1_NoxR	The PB1 domain is present in the Epichloe festucae NoxR protein (NADPH oxidase regulator), a key regulator of NADPH oxidase isoform, NoxA.  NoxA is essential for growth control of the fungal endophyte in plant tissue in the process of symbiotic interaction between a fungus and its plant host.   The Epichloe festucae p67(phox)-like regulator, NoxR, is dispensable in culture but essential in plants for the symbiotic interaction. Plants infected with a noxR deletion mutant show severe stunting and premature senescence, whereas hyphae in the meristematic tissues show increased branching leading to increased fungal colonization of pseudostem and leaf blade tissue.  The PB1 domain is a modular domain mediating specific protein-protein interactions which play a role in many critical cell processes such as osteoclastogenesis, angiogenesis, early cardiovascular development, and cell polarity. A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domains, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domains depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif, acidic amino acid cluster, type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster.  Interactions of PB1 domains with other protein domains have been described as noncanonical PB1-interactions. The PB1 domain module is conserved in amoebas, fungi, animals, and plants.	86
99730	cd06409	PB1_MUG70	The MUG70 protein is a product of the meiotically up-regulated gene 70 which has a role in meiosis and harbors a PB1 domain. The PB1 domain is a modular domain mediating specific protein-protein interactions which play a role in many critical cell processes such as osteoclastogenesis, angiogenesis, early cardiovascular development, and cell polarity. A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domains, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domains depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif, acidic amino acid cluster, type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster.  Interactions of PB1 domains with other protein domains have been described as noncanonical PB1-interactions. The PB1 domain module is conserved in amoebas, fungi, animals, and plants.	86
99731	cd06410	PB1_UP2	Uncharacterized protein 2. The PB1 domain is a modular domain mediating specific protein-protein interaction which play a role in many critical cell processes such as osteoclastogenesis, angiogenesis, early cardiovascular development, and cell polarity. A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domains, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domain depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif, acidic aminoacid cluster, type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster.  Interactions of PB1 domains with other protein domains have been described as noncanonical PB1-interactions.	97
99732	cd06411	PB1_p51	The PB1 domain is present in the p51 protein, a homolog of the p67 protein.  p51 plays an important role in NADPH oxidase activation during phagocytosis. The PB1 domain is a modular domain mediating specific protein-protein interaction in many critical cell processes such as osteoclastogenesis, angiogenesis, early cardiovascular development, and cell polarity. A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domains, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domains depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif, acidic amino acid cluster, type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster.  Interactions of PB1 domains with other protein domains have been described as noncanonical PB1-interactions. The PB1 domain module is conserved in amoebas, fungi, animals, and plants.	78
119374	cd06412	GH25_CH-type	CH-type (Chalaropsis-type) lysozymes represent one of four functionally-defined classes of peptidoglycan hydrolases (also referred to as endo-N-acetylmuramidases) that cleave bacterial cell wall peptidoglycans.  CH-type lysozymes exhibit both lysozyme (acetylmuramidase) and diacetylmuramidase activity. The first member of this family to be described was a muramidase from the fungus Chalaropsis.  However, a majority of the CH-type lysozymes are found in bacteriophages and Gram-positive bacteria such as Streptomyces and Clostridium.  CH-type lysozymes have a single glycosyl hydrolase family 25 (GH25) domain with an unusual beta/alpha-barrel fold in which the last strand of the barrel is antiparallel to strands beta7 and beta1.  Most CH-type lysozymes appear to lack the cell wall-binding domain found in other GH25 muramidases.	199
119375	cd06413	GH25_muramidase_1	Uncharacterized bacterial muramidase containing a glycosyl hydrolase family 25 (GH25) catalytic domain.  Endo-N-acetylmuramidases are lysozymes (also referred to as peptidoglycan hydrolases) that degrade bacterial cell walls by catalyzing the hydrolysis of 1,4-beta-linkages between N-acetylmuramic acid and N-acetyl-D-glucosamine residues.	191
119376	cd06414	GH25_LytC-like	The LytC lysozyme of Streptococcus pneumoniae is a bacterial cell wall hydrolase that cleaves the beta1-4-glycosidic bond located between the N-acetylmuramoyl-N-glucosaminyl residues of the cell wall polysaccharide chains.   LytC is composed of a C-terminal glycosyl hydrolase family 25 (GH25) domain and an N-terminal choline-binding module (CBM) consisting of eleven homologous repeats that specifically recognizes the choline residues of pneumococcal lipoteichoic and teichoic acids. This domain arrangement is the reverse of the major pneumococcal autolysin, LytA, and the CPL-1-like lytic enzymes of the pneumococcal bacteriophages, in which the CBM (consisting of six repeats) is at the C-terminus. This model represents the C-terminal catalytic domain of the LytC-like enzymes.	191
119377	cd06415	GH25_Cpl1-like	Cpl-1 lysin (also known as Cpl-9 lysozyme / muramidase) is a bacterial cell wall endolysin encoded by the pneumococcal bacteriophage Cp-1, which cleaves the glycosidic N-acetylmuramoyl-(beta1,4)-N-acetylglucosamine bonds of the pneumococcal glycan chain, thus acting as an enzymatic antimicrobial agent (an enzybiotic) against streptococcal infections. Cpl-1 belongs to the CP family of lysozymes (CPL lysozymes) which includes the Cpl-7 lysin.  Cpl-1 has a glycosyl hydrolase family 25 (GH25) catalytic domain with an irregular (beta/alpha)5-beta3 barrel and a C-terminal cell wall-anchoring module formed by six similar choline-binding repeats (ChBr's). The ChBr's facilitate the anchoring of Cpl-1 to the choline-containing teichoic acid of the pneumococcal cell wall. Other members of this domain family have an N-terminal CHAP (cysteine, histidine-dependent amidohydrolases/peptidases) domain similar to that of the firmicute CHAP lysins and associated with endopeptidase activity.  The Cpl-7 lysin is also included here as is LysB of Lactococcus phage, and the Mur lysin of Lactobacillus phage.	196
119378	cd06416	GH25_Lys1-like	Lys-1 is a lysozyme encoded by the Caenorhabditis elegans lys-1 gene. This gene is one of several lysozyme genes upregulated upon infection by the Gram-negative bacterial pathogen Serratia marcescens.  Lys-1 contains a glycosyl hydrolase family 25 (GH25) catalytic domain.  This family also includes Lys-5 from Caenorhabditis elegans.	196
119379	cd06417	GH25_LysA-like	LysA is a cell wall endolysin produced by Lactobacillus fermentum, which degrades bacterial cell walls by catalyzing the hydrolysis of 1,4-beta-linkages between N-acetylmuramic acid and N-acetyl-D-glucosamine residues.  The N-terminal glycosyl hydrolase family 25 (GH25) domain of LysA has sequence similarity with other murein hydrolase catalytic domains while the C-terminal domain has sequence similarity with putative bacterial cell wall-binding SH3b domains.  This domain family also includes LysL of Lactococcus lactis.	195
119380	cd06418	GH25_BacA-like	BacA is a bacterial lysin from Enterococcus faecalis that degrades bacterial cell walls by catalyzing the hydrolysis of 1,4-beta-linkages between N-acetylmuramic acid and N-acetyl-D-glucosamine residues.  BacA is homologous to the YbfG and YkuG lysins of Bacillus subtilis. BacA has a C-terminal catalytic glycosyl hydrolase family 25 (GH25) domain and an N-terminal peptidoglycan-binding domain comprised of three alpha helices which is similar to a domain found in matrixins.	212
119381	cd06419	GH25_muramidase_2	Uncharacterized bacterial muramidase containing a glycosyl hydrolase family 25 (GH25) catalytic domain.  Endo-N-acetylmuramidases are lysozymes (also referred to as peptidoglycan hydrolases) that degrade bacterial cell walls by catalyzing the hydrolysis of 1,4-beta-linkages between N-acetylmuramic acid and N-acetyl-D-glucosamine residues.	190
133042	cd06420	GT2_Chondriotin_Pol_N	N-terminal domain of Chondroitin polymerase functions as a GalNAc transferase. Chondroitin polymerase is a two domain, bi-functional protein. The N-terminal domain functions as a GalNAc transferase. The bacterial chondroitin polymerase catalyzes elongation of the chondroitin chain by alternatively transferring the GlcUA and GalNAc moiety from UDP-GlcUA and UDP-GalNAc to the non-reducing ends of the chondroitin chain. The enzyme consists of N-terminal and C-terminal domains in which the two active sites catalyze the addition of GalNAc and GlcUA, respectively. Chondroitin chains range from 40 to over 100 repeating units of the disaccharide. Sulfated chondroitins are involved in the regulation of various biological functions such as central nervous system development, wound repair, infection, growth factor signaling, and morphogenesis, in addition to its conventional structural roles. In Caenorhabditis elegans, chondroitin is an essential factor for the worm to undergo cytokinesis and cell division. Chondroitin is synthesized as proteoglycans, sulfated and secreted to the cell surface or extracellular matrix.	182
133043	cd06421	CESA_CelA_like	CESA_CelA_like are involved in the elongation of the glucan chain of cellulose. Family of proteins related to  Agrobacterium tumefaciens CelA and  Gluconacetobacter xylinus BscA. These proteins are involved in the elongation of the glucan chain of cellulose, an aggregate of unbranched polymers of beta-1,4-linked glucose residues. They are putative catalytic subunit of cellulose synthase, which is a glycosyltransferase using UDP-glucose as the substrate. The catalytic subunit is an integral membrane protein with 6 transmembrane segments and it is postulated that the protein is anchored in the membrane at the N-terminal end.	234
133044	cd06422	NTP_transferase_like_1	NTP_transferase_like_1 is a member of the nucleotidyl transferase family. This is a subfamily of nucleotidyl transferases. Nucleotidyl transferases transfer nucleotides onto phosphosugars. The activated sugars are precursors for synthesis of lipopolysaccharide, glycolipids and polysaccharides. Other subfamilies of nucleotidyl transferases include Alpha-D-Glucose-1-Phosphate Cytidylyltransferase, Mannose-1-phosphate guanyltransferase, and Glucose-1-phosphate thymidylyltransferase.	221
133045	cd06423	CESA_like	CESA_like is  the cellulose synthase superfamily. The cellulose synthase (CESA) superfamily includes a wide variety of glycosyltransferase family 2 enzymes that share the common characteristic of catalyzing the elongation of polysaccharide chains. The members include cellulose synthase catalytic subunit, chitin synthase, glucan biosynthesis protein and other families of CESA-like proteins. Cellulose synthase catalyzes the polymerization reaction of cellulose, an aggregate of unbranched polymers of beta-1,4-linked glucose residues in  plants, most algae, some bacteria and fungi, and even some animals. In bacteria, algae and lower eukaryotes, there is a second unrelated type of cellulose synthase (Type II), which produces acylated cellulose, a derivative of cellulose. Chitin synthase catalyzes the incorporation of GlcNAc from substrate UDP-GlcNAc into chitin, which is a linear homopolymer of beta-(1,4)-linked GlcNAc residues and Glucan Biosynthesis protein catalyzes the elongation of beta-1,2 polyglucose chains of Glucan.	180
133046	cd06424	UGGPase	UGGPase catalyzes the synthesis of UDP-Glucose/UDP-Galactose. UGGPase: UDP-Galactose/Glucose Pyrophosphorylase catalyzes the reversible production of UDP-Glucose/UDP-Galactose and pyrophosphate (PPi) from Glucose-1-phosphate/Galactose-1-phosphate and UTP. Its dual substrate specificity distinguishes it from the single substrate enzyme UDP-glucose pyrophosphorylase. It may play a key role in the galactose metabolism in raffinose oligosaccharide (RFO) metabolizing plants. RFO raffinose is a major photoassimilate and is a galactosylderivative of sucrose (Suc) containing a galactose (Gal) moiety. Upon arriving at the sink tissue, the Gal moieties of the RFOs are initially removed by alpha-galactosidase and then are phosphorylated to Gal-1-P. Gal-1-P is converted to UDP-Gal. The UDP-Gal is further metabolized to UDP-Glc via an epimerase reaction. The UDP-Glc can be directly utilized in cell wall metabolism or in Suc synthesis. However, for the Suc synthesis UDP-Glc must be further metabolized to Glc-1-P. This can be carried out either by the UGPase in the reverse direction or by the dual substrate PPase itself operating in the reverse direction. According to the latter possibility, the three-step pathway of Gal-1-P to Glc-1-P could be carried out by a single PPase, functioning sequentially in reverse directions separated by the epimerase reaction.	315
133047	cd06425	M1P_guanylylT_B_like_N	N-terminal domain of the M1P-guanylyltransferase B-isoform like proteins. GDP-mannose pyrophosphorylase  (GTP: alpha-d-mannose-1-phosphate guanyltransferase) catalyzes the formation of GDP-d-mannose from GTP and alpha-d-mannose-1-Phosphate. It contains an N-terminal catalytic domain and a C-terminal Lefthanded-beta-Helix fold domain. GDP-d-mannose is the activated form of mannose for formation of cell wall lipoarabinomannan and various mannose-containing glycolipids and polysaccharides. The function of GDP-mannose pyrophosphorylase is essential for cell wall integrity, morphogenesis and viability. Repression of GDP-mannose pyrophosphorylase in yeast leads to phenotypes, such as cell lysis, defective cell wall, and failure of polarized growth and cell separation.	233
133048	cd06426	NTP_transferase_like_2	NTP_transferase_like_2 is a member of the nucleotidyl transferase family. This is a subfamily of nucleotidyl transferases. Nucleotidyl transferases transfer nucleotides onto phosphosugars. The activated sugars are precursors for synthesis of lipopolysaccharide, glycolipids and polysaccharides. Other subfamilies of nucleotidyl transferases include Alpha-D-Glucose-1-Phosphate Cytidylyltransferase, Mannose-1-phosphate guanyltransferase, and Glucose-1-phosphate thymidylyltransferase.	220
133049	cd06427	CESA_like_2	CESA_like_2 is a member of the cellulose synthase superfamily. The cellulose synthase (CESA) superfamily includes a wide variety of glycosyltransferase family 2 enzymes that share the common characteristic of catalyzing the elongation of polysaccharide chains.  The members include cellulose synthase catalytic subunit, chitin synthase, Glucan Biosynthesis protein and other families of CESA-like proteins. Cellulose synthase catalyzes the polymerization reaction of cellulose, an aggregate of unbranched polymers of beta-1,4-linked glucose residues in  plants, most algae, some bacteria and fungi, and even some animals. In bacteria, algae and lower eukaryotes, there is a second unrelated type of cellulose synthase (Type II), which produces acylated cellulose, a derivative of cellulose.  Chitin synthase catalyzes the incorporation of GlcNAc from substrate UDP-GlcNAc into chitin, which is a linear homopolymer of beta-(1,4)-linked GlcNAc residues and Glucan Biosynthesis protein catalyzes the elongation of beta-1,2 polyglucose chains of glucan.	241
133050	cd06428	M1P_guanylylT_A_like_N	N-terminal domain of M1P_guanylyl_A_ like proteins are likely to be an isoform of GDP-mannose pyrophosphorylase. N-terminal domain of the M1P-guanylyltransferase A-isoform like proteins:  The proteins of this family are likely to be an isoform of GDP-mannose pyrophosphorylase. Their sequences are highly conserved with mannose-1-phosphate guanyltransferase, but  generally about 40-60 bases longer.  GDP-mannose pyrophosphorylase (GTP: alpha-d-mannose-1-phosphate guanyltransferase) catalyzes the formation of GDP-d-mannose from GTP and alpha-d-mannose-1-Phosphate. It contains an N-terminal catalytic domain that resembles a dinucleotide-binding Rossmann fold and a C-terminal LbH fold domain. GDP-d-mannose is the activated form of mannose for formation of cell wall lipoarabinomannan and various mannose-containing glycolipids and polysaccharides. The function of GDP-mannose pyrophosphorylase is essential for cell wall integrity, morphogenesis and viability.  Repression of GDP-mannose pyrophosphorylase in yeast leads to phenotypes including cell lysis, defective cell wall, and failure of polarized growth and cell separation.	257
133051	cd06429	GT8_like_1	GT8_like_1 represents a subfamily of GT8 with unknown function. A subfamily of glycosyltransferase family 8 with unknown function: Glycosyltransferase family 8 comprises enzymes with a number of known activities: lipopolysaccharide galactosyltransferase, lipopolysaccharide glucosyltransferase 1, glycogenin glucosyltransferase and inositol 1-alpha-galactosyltransferase. It is classified as a retaining glycosyltransferase, based on the relative anomeric stereochemistry of the substrate and product in the reaction catalyzed.	257
133052	cd06430	GT8_like_2	GT8_like_2 represents a subfamily of GT8 with unknown function. A subfamily of glycosyltransferase family 8 with unknown function: Glycosyltransferase family 8 comprises enzymes with a number of known activities: lipopolysaccharide galactosyltransferase, lipopolysaccharide glucosyltransferase 1, glycogenin glucosyltransferase and inositol 1-alpha-galactosyltransferase. It is classified as a retaining glycosyltransferase, based on the relative anomeric stereochemistry of the substrate and product in the reaction catalyzed.	304
133053	cd06431	GT8_LARGE_C	LARGE catalytic domain has closest homology to GT8 glycosyltransferase involved in lipooligosaccharide synthesis. The catalytic domain of LARGE is a putative glycosyltransferase. Mutations of LARGE in mouse and human cause dystroglycanopathies, diseases associated with hypoglycosylation of the membrane protein alpha-dystroglycan (alpha-DG) and consequent loss of extracellular ligand binding. LARGE needs to both physically interact with alpha-dystroglycan and function as a glycosyltransferase in order to stimulate alpha-dystroglycan hyperglycosylation. LARGE localizes to the Golgi apparatus and contains three conserved DxD motifs. While two of the motifs are indispensable for glycosylation function, one is important for localization of the enzyme. LARGE was so named because it spans a large stretch of genomic DNA, more than 600 kb long. The predicted protein structure contains an N-terminal cytoplasmic domain, a transmembrane region, a coiled-coil motif, and two putative catalytic domains. This catalytic domain has closest homology to  GT8 glycosyltransferase involved in lipooligosaccharide synthesis.	280
133054	cd06432	GT8_HUGT1_C_like	The C-terminal domain of HUGT1-like is highly homologous to the GT 8 family. C-terminal domain of glycoprotein glucosyltransferase (UGT).  UGT is a large glycoprotein whose C-terminus contains the catalytic activity. This catalytic C-terminal domain is highly homologous to Glycosyltransferase Family 8 (GT 8) and contains the DXD motif that coordinates donor sugar binding, characteristic for Family 8 glycosyltransferases.  GT 8 proteins are retaining enzymes based on the relative anomeric stereochemistry of the substrate and product in the reaction catalyzed. The non-catalytic N-terminal portion of the human UGT1 (HUGT1) has been shown to monitor the protein folding status and activate its glucosyltransferase activity.	248
133055	cd06433	GT_2_WfgS_like	WfgS and WfeV are involved in O-antigen biosynthesis. Escherichia coli WfgS and Shigella dysenteriae WfeV are glycosyltransferase 2 family enzymes involved in O-antigen biosynthesis. GT-2 enzymes have GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. These are enzymes that catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. Glycosyltransferases have been classified into more than 90 distinct sequence based families.	202
133056	cd06434	GT2_HAS	Hyaluronan synthases catalyze polymerization of hyaluronan. Hyaluronan synthases (HASs) are bi-functional glycosyltransferases that catalyze polymerization of hyaluronan. HASs transfer both GlcUA and GlcNAc in beta-(1,3) and beta-(1,4) linkages, respectively, to the hyaluronan chain using UDP-GlcNAc and UDP-GlcUA as substrates. HA is made as a free glycan, not attached to a protein or lipid. HASs do not need a primer for HA synthesis; they initiate HA biosynthesis de novo with only UDP-GlcNAc, UDP-GlcUA, and Mg2+. Hyaluronan (HA) is a linear heteropolysaccharide composed of (1-3)-linked beta-D-GlcUA-beta-D-GlcNAc disaccharide repeats. It can be found in vertebrates and a few microbes and is typically on the cell surface or in the extracellular space, but is also found inside mammalian cells. Hyaluronan has several physicochemical and biological functions such as space filling, lubrication, and providing a hydrated matrix through which cells can migrate.	235
133057	cd06435	CESA_NdvC_like	NdvC_like  proteins in this family are putative bacterial beta-(1,6)-glucosyltransferase. NdvC_like  proteins in this family are putative bacterial beta-(1,6)-glucosyltransferase. Bradyrhizobium japonicum synthesizes periplasmic cyclic beta-(1,3),beta-(1,6)-D-glucans during growth under hypoosmotic conditions. Two genes (ndvB, ndvC) are involved in the beta-(1, 3), beta-(1,6)-glucan synthesis. The ndvC mutant strain resulted in synthesis of altered cyclic beta-glucans composed almost entirely of beta-(1, 3)-glycosyl linkages. The periplasmic cyclic beta-(1,3),beta-(1,6)-D-glucans function for osmoregulation. The ndvC mutation also affects the ability of the bacteria to establish a successful symbiotic interaction with host plant. Thus, the beta-glucans may function as suppressors of a host defense response.	236
133058	cd06436	GlcNAc-1-P_transferase	N-acetyl-glucosamine transferase is involved in the synthesis of Poly-beta-1,6-N-acetyl-D-glucosamine. N-acetyl-glucosamine transferase is responsible for the synthesis of bacterial Poly-beta-1,6-N-acetyl-D-glucosamine (PGA). Poly-beta-1,6-N-acetyl-D-glucosamine is a homopolymer that serves as an adhesin for the maintenance of biofilm structural stability in diverse eubacteria. N-acetyl-glucosamine transferase is the product of gene pgaC. Genetic analysis indicated that all four genes of the pgaABCD locus were required for PGA production, pgaC being a glycosyltransferase.	191
133059	cd06437	CESA_CaSu_A2	Cellulose synthase catalytic subunit A2 (CESA2) is a catalytic subunit or a catalytic subunit substitute of the cellulose synthase complex. Cellulose synthase (CESA) catalyzes the polymerization reaction of cellulose using UDP-glucose as the substrate. Cellulose is an aggregate of unbranched polymers of beta-1,4-linked glucose residues, which is an abundant polysaccharide produced by plants and in varying degrees by several other organisms including algae, bacteria, fungi, and even some animals. Genomes from higher plants harbor multiple CESA genes. There are ten in Arabidopsis. At least three different CESA proteins are required to form a functional complex. In Arabidopsis, CESA1, 3 and 6 and CESA4, 7 and 8, are required for cellulose biosynthesis during primary and secondary cell wall formation. CESA2 is very closely related to CESA6 and is viewed as a prime substitute for CESA6. They functionally compensate each other. The cesa2 and cesa6 double mutant plants were significantly smaller, while the single mutant plants were almost normal.	232
133060	cd06438	EpsO_like	EpsO protein participates in methanolan synthesis. The Methylobacillus sp EpsO protein is predicted to participate in methanolan synthesis. Methanolan is an exopolysaccharide (EPS), composed of glucose, mannose and galactose.  A 21-gene cluster was predicted to participate in methanolan synthesis. Gene disruption analysis revealed that EpsO is one of the glycosyltransferase enzymes involved in the synthesis of repeating sugar units onto the lipid carrier.	183
133061	cd06439	CESA_like_1	CESA_like_1 is a member of the cellulose synthase (CESA) superfamily. This is a subfamily of cellulose synthase (CESA) superfamily.  CESA superfamily includes a wide variety of glycosyltransferase family 2 enzymes that share the common characteristic of catalyzing the elongation of polysaccharide chains.  The members of the superfamily include cellulose synthase catalytic subunit, chitin synthase, glucan biosynthesis protein and other families of CESA-like proteins.	251
133062	cd06442	DPM1_like	DPM1_like represents putative enzymes similar to eukaryotic DPM1. Proteins similar to eukaryotic DPM1, including enzymes from bacteria and archaea; DPM1 is the catalytic subunit of eukaryotic dolichol-phosphate mannose (DPM) synthase. DPM synthase is required for synthesis of the glycosylphosphatidylinositol (GPI) anchor, N-glycan precursor, protein O-mannose, and C-mannose. In higher eukaryotes, the enzyme has three subunits, DPM1, DPM2 and DPM3. DPM is synthesized from dolichol phosphate and GDP-Man on the cytosolic surface of the ER membrane by DPM synthase and then is flipped onto the luminal side and used as a donor substrate. In lower eukaryotes, such as Saccharomyces cerevisiae and Trypanosoma brucei, DPM synthase consists of a single component (Dpm1p and TbDpm1, respectively) that possesses one predicted transmembrane region near the C terminus for anchoring to the ER membrane. In contrast, the Dpm1 homologues of higher eukaryotes, namely fission yeast, fungi, and animals, have no transmembrane region, suggesting the existence of adapter molecules for membrane anchoring. This family also includes bacteria and archaea DPM1_like enzymes. However, the enzyme structure and mechanism of function are not well understood. This protein family belongs to Glycosyltransferase 2 superfamily.	224
176473	cd06444	DNA_pol_A	Family A polymerase primarily fills DNA gaps that arise during DNA repair, recombination and replication. DNA polymerase family A, 5'-3' polymerase domain. Family A polymerase functions primarily to fill DNA gaps that arise during DNA repair, recombination and replication. DNA-dependent DNA polymerases can be classified into six main groups based upon phylogenetic relationships with E. coli polymerase I (class A), E. coli polymerase II (class B), E. coli polymerase III (class C), euryarchaeota polymerase II (class D), human polymerase  beta (class X), E. coli UmuC/DinB and eukaryotic RAP 30/Xeroderma pigmentosum variant (class Y). Family A polymerases are found primarily in organisms related to prokaryotes and include prokaryotic DNA polymerase I, mitochondrial polymerase gamma, and several bacteriophage polymerases including those from odd-numbered phage (T3, T5, and T7). Prokaryotic polymerase I (pol I) has two functional domains located on the same polypeptide; a 5'-3' polymerase and a 5'-3' exonuclease. Pol I uses its 5' nuclease activity to remove the ribonucleotide portion of newly synthesized Okazaki fragments and the DNA polymerase activity to fill in the resulting gap. The structure of these polymerases resembles in overall morphology a cupped human right hand, with fingers (which bind an incoming nucleotide and interact with the single-stranded template), palm (which harbors the catalytic amino acid residues and also binds an incoming dNTP) and thumb (which binds double-stranded DNA) subdomains.	347
119438	cd06445	ATase	The DNA repair protein O6-alkylguanine-DNA alkyltransferase (ATase; also known as AGT, AGAT and MGMT) reverses O6-alkylation DNA damage by transferring O6-alkyl adducts to an active site cysteine irreversibly, without inducing DNA strand breaks. ATases are specific for repair of guanines with O6-alkyl adducts, however human ATase is not limited to O6-methylguanine, repairing many other adducts at the O6-position of guanine as well. ATase is widely distributed among species. Most ATases have N- and C-terminal domains. The C-terminal domain contains the conserved active-site cysteine motif (PCHR), the O6-alkylguanine binding channel, and the helix-turn-helix (HTH) DNA-binding motif. The active site is located near the recognition helix of the HTH motif. While the C-terminal domain of ATase contains residues that are necessary for DNA binding and alkyl transfer, the function of the N-terminal domain is still unknown. Removal of the N-terminal domain abolishes the activity of the C-terminal domain, suggesting an important structural role for the N-terminal domain in orienting the C-terminal domain for proper catalysis. Some ATase C-terminal domain homologs are either single-domain proteins that lack an N-terminal domain, or have a tryptophan substituted in place of the acceptor cysteine (i.e. the motif PCHR is replaced by PWHR). ATase null mutant mice are viable, fertile, and have a normal lifespan.	79
107207	cd06446	Trp-synth_B	Tryptophan synthase-beta:  Tryptophan synthase is a bifunctional enzyme that catalyses the last two steps in the biosynthesis of L-tryptophan via its alpha and beta reactions. In the alpha reaction, indole 3-glycerol phosphate is cleaved reversibly to glyceraldehyde 3-phosphate and indole at the active site of the alpha subunit. In the beta reaction, indole undergoes a PLP-dependent reaction with L-serine to form L-tryptophan at the active site of the beta subunit. Members of this CD, Trp-synth_B, are found in all three major phylogenetic divisions.	365
107208	cd06447	D-Ser-dehyd	D-Serine dehydratase is a pyridoxal phosphate (PLP)-dependent enzyme which catalyzes the conversion of L- or D-serine  to pyruvate and ammonia.  D-serine dehydratase serves as a detoxifying enzyme in most E. coli strains where D-serine is a competitive antagonist of beta-alanine in the biosynthetic pathway to pantothenate and coenzyme A.  D-serine dehydratase is different from other pyridoxal-5'-phosphate-dependent enzymes in that it catalyzes alpha, beta-elimination reactions on amino acids.	404
107209	cd06448	L-Ser-dehyd	Serine dehydratase is a pyridoxal phosphate (PLP)-dependent enzyme which catalyzes the conversion of L- , D-serine, or L-threonine to pyruvate/ketobutyrate and ammonia.	316
107210	cd06449	ACCD	Aminocyclopropane-1-carboxylate deaminase (ACCD): Pyridoxal phosphate (PLP)-dependent enzyme which catalyzes the conversion of 1-aminocyclopropane-L-carboxylate (ACC), a precursor of the plant hormone ethylene, to alpha-ketobutyrate and ammonia.	307
99743	cd06450	DOPA_deC_like	DOPA decarboxylase family. This family belongs to pyridoxal phosphate (PLP)-dependent aspartate aminotransferase superfamily (fold I). The major groups in this CD correspond to DOPA/tyrosine decarboxylase (DDC), histidine decarboxylase (HDC), and glutamate decarboxylase (GDC). DDC is active as a dimer and catalyzes the decarboxylation of tyrosine. GDC catalyzes the decarboxylation of glutamate and HDC catalyzes the decarboxylation of histidine.	345
99744	cd06451	AGAT_like	Alanine-glyoxylate aminotransferase (AGAT) family. This family belongs to pyridoxal phosphate (PLP)-dependent aspartate aminotransferase superfamily (fold I). The major groups in this CD correspond to alanine-glyoxylate aminotransferase (AGAT), serine-glyoxylate aminotransferase (SGAT), and 3-hydroxykynurenine transaminase (HKT). AGAT is a homodimeric protein, which catalyses the transamination of glyoxylate to glycine, and SGAT converts serine and glyoxylate to hydroxypyruvate and glycine. HKT catalyzes the PLP-dependent transamination of 3-hydroxykynurenine, a potentially toxic metabolite of the kynurenine pathway.	356
99745	cd06452	SepCysS	Sep-tRNA:Cys-tRNA synthase. This family belongs to the pyridoxal phosphate (PLP)-dependent aspartate aminotransferase superfamily (fold I). Cys-tRNA(Cys) is produced by O-phosphoseryl-tRNA synthetase which ligates O-phosphoserine (Sep) to tRNA(Cys), and Sep-tRNA:Cys-tRNA synthase (SepCysS) converts Sep-tRNA(Cys) to Cys-tRNA(Cys), in methanogenic archaea. SepCysS forms a dimer, each monomer is composed of a large and small domain; the larger, a typical pyridoxal 5'-phosphate (PLP)-dependent-like enzyme fold.  In the active site of each monomer, PLP is covalently bound to a conserved Lys residue near the dimer interface.	361
99746	cd06453	SufS_like	Cysteine desulfurase (SufS)-like. This family belongs to the pyridoxal phosphate (PLP)-dependent aspartate aminotransferase superfamily (fold I). The major groups in this CD correspond to cysteine desulfurase (SufS) and selenocysteine lyase. SufS catalyzes the removal of elemental sulfur and selenium atoms from L-cysteine, L-cystine, L-selenocysteine, and L-selenocystine to produce L-alanine; and selenocysteine lyase catalyzes the decomposition of L-selenocysteine.	373
99747	cd06454	KBL_like	KBL_like; this family belongs to the pyridoxal phosphate (PLP)-dependent aspartate aminotransferase superfamily (fold I). The major groups in this CD correspond to serine palmitoyltransferase (SPT), 5-aminolevulinate synthase (ALAS), 8-amino-7-oxononanoate synthase (AONS), and 2-amino-3-ketobutyrate CoA ligase (KBL). SPT is responsible for the condensation of L-serine with palmitoyl-CoA to produce 3-ketodihydrosphingosine, the reaction of the first step in sphingolipid biosynthesis. ALAS is involved in heme biosynthesis; it catalyzes the synthesis of 5-aminolevulinic acid from glycine and succinyl-coenzyme A. AONS catalyses the decarboxylative condensation of l-alanine and pimeloyl-CoA in the first committed step of biotin biosynthesis. KBL catalyzes the second reaction step of the metabolic degradation pathway for threonine, converting 2-amino-3-ketobutyrate to glycine and acetyl-CoA. The members of this CD are widely found in all three forms of life.	349
341050	cd06455	M3A_TOP	Peptidase M3 thimet oligopeptidase (TOP), also includes neurolysin. Peptidase M3 Thimet oligopeptidase (TOP; PZ-peptidase; endo-oligopeptidase A; endopeptidase 24.15; soluble metallo-endopeptidase; EC 3.4.24.15) family also includes neurolysin (endopeptidase 24.16, microsomal endopeptidase, mitochondrial oligopeptidase M, neurotensin endopeptidase, soluble angiotensin II-binding protein, thimet oligopeptidase II) which hydrolyzes oligopeptides such as neurotensin, bradykinin and dynorphin A. TOP and neurolysin are neuropeptidases expressed abundantly in the testis, but are also found in the liver, lung and kidney. They are involved in the metabolism of neuropeptides under 20 amino acid residues long and cleave most bioactive peptides at the same sites, but recognize different positions on some naturally occurring and synthetic peptides; they cleave at distinct sites on the 13-residue bioactive peptide neurotensin, which modulates central dopaminergic and cholinergic circuits. TOP has been shown to degrade peptides released by the proteasome, limiting the extent of antigen presentation by major histocompatibility complex class I molecules, and has been associated with amyloid protein precursor processing.	642
341051	cd06456	M3A_DCP	Peptidase family M3, dipeptidyl carboxypeptidase (DCP). Peptidase family M3 dipeptidyl carboxypeptidase (DCP; Dcp II; peptidyl dipeptidase; EC 3.4.15.5). This metal-binding M3A family also includes oligopeptidase A (OpdA; EC 3.4.24.70). DCP cleaves dipeptides off the C-termini of various peptides and proteins, the smallest substrate being N-blocked tripeptides and unblocked tetrapeptides. DCP from Escherichia coli is inhibited by the anti-hypertensive drug captopril, an inhibitor of the mammalian angiotensin converting enzyme (ACE, also called peptidyl dipeptidase A). OpdA may play a specific role in the degradation of signal peptides after they are released from precursor forms of secreted proteins. It can also cleave N-acetyl-L-Ala. This family also includes Arabidopsis thaliana organellar oligopeptidase OOP (At5g65620), which plays a role in targeting peptide degradation in mitochondria and chloroplasts; it degrades peptide substrates that are between 8 to 23 amino acid residues, and shows a weak preference for hydrophobic residues (F/L) at the P1 position.	653
341052	cd06457	M3A_MIP	Peptidase M3 mitochondrial intermediate peptidase (MIP). Peptidase M3 mitochondrial intermediate peptidase (MIP; EC 3.4.24.59) belongs to the widespread subfamily M3A, that shows similarity to Thimet oligopeptidase (TOP). It is one of three peptidases responsible for the proteolytic processing of both nuclear and mitochondrial encoded precursor polypeptides targeted to various subcompartments of the mitochondria. It cleaves intermediate-size proteins initially processed by mitochondrial processing peptidase (MPP) to yield a processing intermediate with a typical N-terminal octapeptide that is sequentially cleaved by MIP to mature-size protein. MIP cleaves precursor proteins of respiratory components, including subunits of the electron transport chain and tri-carboxylic acid cycle enzymes, and components of the mitochondrial genetic machinery, including ribosomal proteins, translation factors, and proteins required for mitochondrial DNA metabolism. It has been suggested that the human MIP (HMIP polypeptide; gene symbol MIPEP) may be one of the loci predicted to influence the clinical manifestations of Friedreich's ataxia (FRDA), an autosomal recessive neurodegenerative disease caused by the lack of human frataxin. These proteins are enriched in cysteine residues, two of which are highly conserved, suggesting their importance to stability as well as in formation of metal binding sites, thus playing a role in MIP activity.	613
341053	cd06459	M3B_PepF	Peptidase family M3B, oligopeptidase F (PepF). Peptidase family M3B oligopeptidase F (PepF; Pz-peptidase B; EC 3.4.24.-) is mostly bacterial and includes oligoendopeptidase F from Lactococcus lactis. This enzyme hydrolyzes peptides containing between 7 and 17 amino acids with fairly broad specificity. The PepF gene is duplicated in L. lactis on the plasmid that bears it, while a shortened second copy is found in Bacillus subtilis. Most bacterial PepFs are cytoplasmic endopeptidases; however, the Bacillus amyloliquefaciens PepF oligopeptidase is a secreted protein and may facilitate the process of sporulation. Specifically, the yjbG gene encoding the homolog of the PepF1 and PepF2 oligoendopeptidases of Lactococcus lactis has been identified in Bacillus subtilis as an inhibitor of sporulation initiation when over-expressed from a multicopy plasmid.	539
341054	cd06460	M32_Taq	Peptidase family M32, which includes thermostable carboxypeptidases TaqCP, PfuCP and FisCP. Peptidase family M32 is a subclass of metallocarboxypeptidases, distributed mainly in bacteria and archaea, whose members contain a HEXXH motif that participates in coordinating a divalent cation such as Zn2+ or Co2+. It includes the thermostable carboxypeptidases (E.C. 3.4.17.19) from Thermus aquaticus (TaqCP) and Pyrococcus furiosus (PfuCP), which have broad specificities toward a wide range of C-terminal substrates that include basic, aromatic, neutral and polar amino acids. These enzymes have a similar fold to the M3 peptidases such as neurolysin and the M2 angiotensin converting enzyme (ACE). The keratin-degrading extremophilic eubacterium Fervidobacterium islandicum M32 carboxypeptidase (FisCP) plays an important role in cellular metabolism, and significantly enhances the degradation of native chicken feathers. It has been shown to mainly cleave the C-termini of peptides with a basic amino acid sequence. Novel M32 peptidases from some eukaryotes: protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, have been identified, thus making these enzymes an attractive potential target for drug development against these organisms.	484
341055	cd06461	M2_ACE	Peptidase family M2, angiotensin converting enzyme (ACE). Peptidase family M2 angiotensin converting enzyme (ACE, EC 3.4.15.1) is a membrane-bound, zinc-dependent dipeptidase that catalyzes the conversion of the decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II, by removing two C-terminal amino acids. There are two forms of the enzyme in humans, the ubiquitous somatic ACE and the sperm-specific germinal ACE, both encoded by the same gene through transcription from alternative promoters. Somatic ACE has two tandem active sites with distinct catalytic properties, whereas germinal ACE, the function of which is largely unknown, has just a single active site. Recently, an ACE homolog, ACE2, has been identified in humans that differs from ACE; it preferentially removes carboxy-terminal hydrophobic or basic amino acids and appears to be important in cardiac function. ACE homologs (also known as members of the M2 gluzincin family) have been found in a wide variety of species, including those that neither have a cardiovascular system nor synthesize angiotensin. ACE is well-known as a key part of the renin-angiotensin system that regulates blood pressure and ACE inhibitors are important for the treatment of hypertension.	563
119396	cd06462	Peptidase_S24_S26	The S24, S26 LexA/signal peptidase superfamily contains LexA-related and type I signal peptidase families. The S24 LexA protein domains include: the lambda repressor CI/C2 family and related bacterial prophage repressor proteins; LexA (EC 3.4.21.88), the repressor of genes in the cellular SOS response to DNA damage; MucA and the related UmuD proteins, which are lesion-bypass DNA polymerases, induced in response to mitogenic DNA damage; RulA, a component of the rulAB locus that confers resistance to UV, and RuvA, which is a component of the RuvABC resolvasome that catalyzes the resolution of Holliday junctions that arise during genetic recombination and DNA repair. The S26 type I signal peptidase (SPase) family also includes mitochondrial inner membrane protease (IMP)-like members. SPases are essential membrane-bound proteases which function to cleave away the amino-terminal signal peptide from the translocated pre-protein, thus playing a crucial role in the transport of proteins across membranes in all living organisms. All members in this superfamily are unique serine proteases that carry out catalysis using a serine/lysine dyad instead of the prototypical serine/histidine/aspartic acid triad found in most serine proteases.	84
107220	cd06463	p23_like	Proteins containing this p23_like domain include p23 and its Saccharomyces cerevisiae (Sc) homolog Sba1. Both are co-chaperones for the heat shock protein (Hsp) 90.  p23 binds Hsp90 and participates in the folding of a number of Hsp90 clients, including the progesterone receptor. p23 also has a passive chaperoning activity and in addition may participate in prostaglandin synthesis.  Both p23 and Sba1p can regulate telomerase activity. This group includes domains similar to the C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1). Sgt1 interacts with multiple protein complexes and has the features of a co-chaperone. Human (h) Sgt1 interacts with both Hsp70 and Hsp90, and has been shown to bind Hsp90 through its CS domain.  Saccharomyces cerevisiae (Sc) Sgt1 is a subunit of both core kinetochore and SCF (Skp1-Cul1-F-box) ubiquitin ligase complexes. Sgt1 is required for pathogen resistance in plants.  This group also includes the p23_like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR). hB-ind1 plays a role in the signaling pathway mediated by the small GTPase Rac1, NUDC is needed for nuclear movement, Melusin interacts with two splice variants of beta1 integrin, and NCB5OR plays a part in maintaining viable pancreatic beta cells.	84
107221	cd06464	ACD_sHsps-like	Alpha-crystallin domain (ACD) of alpha-crystallin-type small(s) heat shock proteins (Hsps). sHsps are small stress-induced proteins with monomeric masses between 12 and 43 kDa, whose common feature is the Alpha-crystallin domain (ACD). sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps.	88
107222	cd06465	p23_hB-ind1_like	p23_like domain found in human (h) butyrate-induced transcript 1 (B-ind1) and similar proteins. hB-ind1 participates in signaling by the small GTPase Rac1. It binds to Rac1 and enhances different Rac1 effects including activation of nuclear factor (NF) kappaB and activation of c-Jun N-terminal kinase (JNK). hB-ind1 also plays a part in the RNA replication and particle production of Hepatitis C virus (HCV) through its interaction with heat shock protein Hsp90, HCV nonstructural protein 5A (NS5A), and the immunophilin FKBP8.  hB-ind1 is upregulated in the outer layer of Chinese hamster V79 cells grown as multicell spheroids, versus in the same cells grown as monolayers. This group includes the Saccharomyces cerevisiae Sba1, a co-chaperone of Hsp90. Sba1 has been shown to be required for telomere length maintenance, and may modulate telomerase DNA-binding activity.	108
107223	cd06466	p23_CS_SGT1_like	p23_like domain similar to the C-terminal CHORD-SGT1 (CS) domain of Sgt1 (suppressor of G2 allele of Skp1). Sgt1 interacts with multiple protein complexes and has the features of a cochaperone. Human (h) Sgt1 interacts with both Hsp70 and Hsp90, and has been shown to bind Hsp90 through its CS domain.  Saccharomyces cerevisiae (Sc) Sgt1 is a subunit of both core kinetochore and SCF (Skp1-Cul1-F-box) ubiquitin ligase complexes. Sgt1 is required for pathogen resistance in plants. ScSgt1 is needed for the G1/S and G2/M cell-cycle transitions, and for assembly of the core kinetochore complex (CBF3) via activation of Ctf13, the F-box protein. Binding of Hsp82 (a yeast Hsp90 homologue) to ScSgt1, promotes the binding of Sgt1 to Skp1 and of Skp1 to Ctf13.  Some proteins in this group have an SGT1-specific (SGS) domain at the extreme C-terminus. The ScSgt1-SGS domain binds adenylate cyclase.  The hSgt1-SGS domain interacts with some S100 family proteins, and studies suggest that the interaction of hSgt1 with Hsp90 and Hsp70 may be regulated by S100A6 in a Ca2+ dependent fashion. This group also includes the p23_like domains of Melusin and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR). Melusin is a vertebrate protein which interacts with two splice variants of beta1 integrin, and NCB5OR plays a part in maintaining viable pancreatic beta cells.	84
107224	cd06467	p23_NUDC_like	p23_like domain of NUD (nuclear distribution) C and similar proteins. Aspergillus nidulans (An) NUDC is needed for nuclear movement. AnNUDC is localized at the hyphal cortex, and binds NUDF at spindle pole bodies (SPBs) and in the cytoplasm at different stages in the cell cycle. At the SPBs it is part of the dynein molecular motor/NUDF complex that regulates microtubule dynamics.  Mammalian (m) NUDC associates both with the dynein complex and also with an anti-inflammatory enzyme, platelet activating factor acetylhydrolase I, PAF-AH(I) complex, through binding mNUDF, the regulatory beta subunit of PAF-AH(I).  mNUDC is important for cell proliferation both in normal and tumor tissues.  Its expression is elevated in various cell types undergoing mitosis or stimulated to proliferate, with high expression levels observed in leukemic cells and tumors.  For a leukemic cell line, human NUDC was shown to activate the thrombopoietin (TPO) receptor (Mpl) by binding to its extracellular domain, and promoting cell proliferation and differentiation.  This group also includes the human broadly immunogenic tumor associated antigen, CML66, which is highly expressed in a variety of solid tumors and in leukemias. In normal tissues high expression of CML66 is limited to testis and heart.	85
107225	cd06468	p23_CacyBP	p23_like domain found in proteins similar to Calcyclin-Binding Protein(CacyBP)/Siah-1-interacting protein (SIP). CacyBP/SIP interacts with S100A6 (calcyclin), with some other members of the S100 family, with tubulin, and with Siah-1 and Skp-1. The latter two are components of the ubiquitin ligase that regulates beta-catenin degradation. The beta-catenin gene is an oncogene participating in tumorigenesis in many different cancers. Overexpression of CacyBP/SIP, in part through its effect on the expression of beta-catenin, inhibits the proliferation, tumorigenicity, and invasion of gastric cancer cells. CacyBP/SIP is abundant in neurons and neuroblastoma NB2a cells. An extensive re-organization of microtubules accompanies the differentiation of NB2a cells. CacyBP/SIP may contribute to NB2a cell differentiation through binding to and increasing the oligomerization of tubulin. CacyBP/SIP is also implicated in differentiation of erythroid cells, rat neonatal cardiomyocytes, in mouse endometrial events, and in thymocyte development.	92
107226	cd06469	p23_DYX1C1_like	p23_like domain found in proteins similar to dyslexia susceptibility 1 (DYX1) candidate 1 (C1) protein, DYX1C1. The human gene encoding this protein is a positional candidate gene for developmental dyslexia (DD), it is located on 15q21.3 by the DYX1 DD susceptibility locus (15q15-21). Independent association studies have reported conflicting results. However, association of short-term memory, which plays a role in DD, with a variant within the DYX1C1 gene has been reported. Most proteins belonging to this group contain a C-terminal tetratricopeptide repeat (TPR) protein binding region.	78
107227	cd06470	ACD_IbpA-B_like	Alpha-crystallin domain (ACD) found in Escherichia coli inclusion body-associated proteins IbpA and IbpB, and similar proteins.  IbpA and IbpB are 16 kDa small heat shock proteins (sHsps). sHsps are molecular chaperones that suppress protein aggregation and protect against cell stress, and are generally active as large oligomers consisting of multiple subunits. IbpA and IbpB are produced during high-level production of various heterologous proteins, specifically human prorenin, renin and bovine insulin-like growth factor 2 (bIGF-2), and are strongly associated with inclusion bodies containing these heterologous proteins. IbpA and IbpB work as an integrated system to stabilize thermally aggregated proteins in a disaggregation competent state.  The chaperone activity of IbpB is also significantly elevated as the temperature increases from normal to heat shock. The high temperature results in the disassociation of 2-3-MDa IbpB oligomers into smaller approximately 600-kDa structures. This elevated activity seen under heat shock conditions is retained for an extended period of time after the temperature is returned to normal. IbpA also forms multimers.	90
107228	cd06471	ACD_LpsHSP_like	Group of bacterial proteins containing an alpha crystallin domain (ACD) similar to Lactobacillus plantarum (Lp) small heat shock proteins (sHsp) HSP 18.5, HSP 18.55 and HSP 19.3. sHsps are molecular chaperones that suppress protein aggregation and protect against cell stress, and are generally active as large oligomers consisting of multiple subunits. Transcription of the genes encoding Lp HSP 18.5, 18.55 and 19.3 is regulated by a variety of stresses including heat, cold and ethanol. Early growing L. plantarum cells contain elevated levels of these mRNAs which rapidly fall off as the cells enter stationary phase. Also belonging to this group are Bifidobacterium breve (Bb) HSP20 and Oenococcus oeni (syn. Leuconostoc oenos) (Oo) HSP18.  Transcription of the gene encoding BbHSP20 is strongly induced following heat or osmotic shock, and that of the gene encoding OoHSP18 following heat, ethanol or acid shock. OoHSP18 is peripherally associated with the cytoplasmic membrane.	93
107229	cd06472	ACD_ScHsp26_like	Alpha crystallin domain (ACD) found in Saccharomyces cerevisiae (Sc) small heat shock protein (Hsp)26 and similar proteins. sHsps are molecular chaperones that suppress protein aggregation and protect against cell stress, and are generally active as large oligomers consisting of multiple subunits. ScHsp26 is temperature-regulated: it switches from an inactive to a chaperone-active form upon elevation in temperature. It associates into large 24-mer storage forms which upon heat shock disassociate into dimers. These dimers initiate the interaction with non-native substrate proteins and re-assemble into large globular assemblies having one monomer of substrate bound per dimer. This group also contains Arabidopsis thaliana (Ath) Hsp15.7, a peroxisomal matrix protein which can complement the morphological phenotype of S. cerevisiae mutants deficient in Hsp26. AthHsp15.7 is minimally expressed under normal conditions and is strongly induced by heat and oxidative stress. Also belonging to this group is wheat HSP16.9, which differs in quaternary structure from the shell-type particles of ScHsp26: it assembles as a dodecameric double disc, with each disc organized as a trimer of dimers.	92
107230	cd06475	ACD_HspB1_like	Alpha crystallin domain (ACD) found in mammalian small (s)heat shock protein (Hsp)-27 (also denoted HspB1 in human) and similar proteins. sHsps are molecular chaperones that suppress protein aggregation and protect against cell stress, and are generally active as large oligomers consisting of multiple subunits. Hsp27 shows enhanced synthesis in response to stress. It is a molecular chaperone which interacts with a large number of different proteins. It is found in many types of human cells including breast, uterus, cervix, platelets and cancer cells. Hsp27 has diverse cellular functions including chaperoning, regulation of actin polymerization, keratinocyte differentiation, regulation of inflammatory pathways in keratinocytes, and protection from oxidative stress through modulating glutathione levels. It is also a subunit of AUF1-containing protein complexes. It has been linked to several transduction pathways regulating cellular functions including differentiation, cell growth, development, and apoptosis. Its activity can be regulated by phosphorylation. Its unphosphorylated state is a high molecular weight aggregated form (100-800kDa) composed of up to 24 subunits, which forms as a result of multiple interactions within the ACD, and is required for chaperone function and resistance to oxidative stress. Upon phosphorylation these large aggregates rapidly disassociate to smaller oligomers and chaperone activity is modified.  High constitutive levels of Hsp27 have been detected in various cancer cells, in particular those of carcinoma origin. Over-expression of Hsp27 has a protective effect against various disease processes, including Huntington's disease. Mutations in Hsp27 have been associated with a form of distal hereditary motor neuropathy type II and Charcot-Marie-Tooth disease type 2.	86
107231	cd06476	ACD_HspB2_like	Alpha crystallin domain (ACD) found in mammalian small heat shock protein (sHsp) HspB2/heat shock 27kDa protein 2 and similar proteins. sHsps are molecular chaperones that suppress protein aggregation and protect against cell stress, and are generally active as large oligomers consisting of multiple subunits.  HspB2 is preferentially and constitutively expressed in skeletal muscle and heart. HspB2 shows homooligomeric activity and forms aggregates in muscle cytosol. Although its expression is not induced by heat shock, it redistributes to the insoluble fraction in response to heat shock. In the mouse heart, HspB2 plays a role in maintaining energetic balance, by protecting cardiac energetics during ischemia/reperfusion, and allowing  for increased work during acute inotropic challenge. hHspB2 [previously also known as myotonic dystrophy protein kinase (DMPK) binding protein (MKBP)]  is selectively up-regulated in skeletal muscles from myotonic dystrophy patients. The ACD of hHspB2 binds the DMPK kinase domain. In vitro, hHspB2 enhances the kinase activity of DMPK and confers thermoresistance. The hHspB2 gene lies less than 1kb from the 5 prime end of the related alphaB (HspB4)-crystallin gene, with the opposite transcription direction. These two genes may share regulatory elements for their expression.	83
107232	cd06477	ACD_HspB3_Like	Alpha crystallin domain (ACD) found in mammalian HspB3, also known as heat-shock protein 27-like protein (HSPL27, 17-kDa) and similar proteins. sHsps are molecular chaperones that suppress protein aggregation and protect against cell stress, and are generally active as large oligomers consisting of multiple subunits. HspB3 is expressed in adult skeletal muscle, smooth muscle, and heart, and in several other fetal tissues.  In muscle cells HspB3 forms an oligomeric 150 kDa complex with myotonic dystrophy protein kinase-binding protein (MKBP/ HspB2), this complex may comprise one of two independent muscle-cell specific chaperone systems. The expression of HspB3 is induced during muscle differentiation controlled by the myogenic factor MyoD. HspB3 may also interact with Hsp22 (HspB8).	83
107233	cd06478	ACD_HspB4-5-6	Alpha-crystallin domain found in alphaA-crystallin (HspB4), alphaB-crystallin (HspB5), and the small heat shock protein (sHsp) HspB6, also known as Hsp20. sHsps are molecular chaperones that suppress protein aggregation and protect against cell stress, and are generally active as large oligomers consisting of multiple subunits. Alpha crystallin, an abundant protein in the mammalian lens, is a large (700 kDa) heteropolymer composed of HspB4 and HspB5, generally in a molar ratio of HspB4:HspB5 of 3:1.  Only trace amounts of HspB4 are found in tissues other than the lens. HspB5 on the other hand is also expressed constitutively in other tissues including brain, heart, and type I and type IIa skeletal muscle fibers, and in several cancers including gliomas, renal cell carcinomas, basal-like and metaplastic breast carcinomas, and head and neck cancer.  HspB5's functions include effects on the apoptotic pathway and on metastasis.  Phosphorylation of HspB5 reduces its oligomerization and anti-apoptotic activities.  HspB5 is protective in demyelinating disease such as multiple sclerosis (MS), being a negative regulator of inflammation. In early active MS lesions it is the most abundant gene transcript and an autoantigen, the immune response against it would disrupt its function and worsen inflammation and demyelination. Given as therapy for ongoing demyelinating disease it may counteract this effect.  It is an autoantigen in the pathogenesis of various other inflammatory disorders including Lens-associated uveitis (LAU), and Behcet's disease. Mutations in HspB5 have been associated with diseases including dominant cataract and desmin-related myopathy. Mutations in HspB4 have been associated with Autosomal Dominant Congenital Cataract (ADCC). HspB6 (Hsp20) is ubiquitous and is involved in diverse functions including regulation of glucose transport and contraction of smooth muscle, in platelet aggregation, in cardioprotection, and in the prevention of apoptosis. It interacts with the universal scaffolding and adaptor protein 14-3-3, and also with the proapoptotic protein Bax.	83
107234	cd06479	ACD_HspB7_like	Alpha crystallin domain (ACD) found in mammalian small heat shock protein (sHsp) HspB7, also known as cardiovascular small heat shock protein (cvHsp), and similar proteins. sHsps are molecular chaperones that suppress protein aggregation and protect against cell stress, and are generally active as large oligomers consisting of multiple subunits. HspB7 is a 25-kDa protein, preferentially expressed in heart and skeletal muscle. It binds the cytoskeleton protein alpha-filamin (also known as actin-binding protein 280). The expression of HspB7 is increased during rat muscle aging.  Its expression is also modulated in obesity implicating this protein in this and related metabolic disorders. As the human gene encoding HspB7 is mapped to chromosome 1p36.23-p34.3 it is a positional candidate for several dystrophies and myopathies.	81
107235	cd06480	ACD_HspB8_like	Alpha-crystallin domain (ACD) found in mammalian 21.6 KDa small heat shock protein (sHsp) HspB8, also denoted as Hsp22 in humans, and similar proteins. sHsps are molecular chaperones that suppress protein aggregation and protect against cell stress, and are generally active as large oligomers consisting of multiple subunits. A chaperone complex formed of HspB8 and Bag3 stimulates degradation of protein complexes by macroautophagy. HspB8 also forms complexes with Hsp27 (HspB1), MKBP (HspB2), HspB3, alphaB-crystallin (HspB5), Hsp20 (HspB6), and cvHsp (HspB7). These latter interactions may depend on phosphorylation of the respective partner sHsp. HspB8 may participate in the regulation of cell proliferation, cardiac hypertrophy, apoptosis, and carcinogenesis. Point mutations in HspB8 have been correlated with the development of several congenital neurological diseases, including Charcot Marie tooth disease and distal motor neuropathy type II.	91
107236	cd06481	ACD_HspB9_like	Alpha crystallin domain (ACD) found in mammalian small heat shock protein (sHsp) HspB9 and similar proteins. sHsps are molecular chaperones that suppress protein aggregation and protect against cell stress, and are generally active as large oligomers consisting of multiple subunits. Human (h) HspB9 is expressed exclusively in the normal testis and in various tumor samples and is a cancer/testis antigen. hHspB9  interacts with TCTEL1 (T-complex testis expressed protein -1), a subunit of dynein. hHspB9 and TCTEL1 are co-expressed in similar cells within the testis and in tumor cells. Included in this group is Xenopus Hsp30, a developmentally-regulated heat-inducible molecular chaperone.	87
107237	cd06482	ACD_HspB10	Alpha crystallin domain (ACD) found in mammalian small heat shock protein (sHsp) HspB10, also known as sperm outer dense fiber protein (ODFP), and similar proteins. sHsps are molecular chaperones that suppress protein aggregation and protect against cell stress, and are generally active as large oligomers consisting of multiple subunits. Human (h) HspB10 occurs exclusively in the axoneme of sperm cells and may have a cytoskeletal role.	87
107238	cd06488	p23_melusin_like	p23_like domain similar to the C-terminal (tail) domain of vertebrate Melusin and related proteins. Melusin's tail domain interacts with the cytoplasmic domain of beta1-A and beta1-D isoforms of beta1 integrin, it does not bind other integrin beta subunits. Melusin is a muscle-specific protein expressed in skeletal and cardiac muscles but not in smooth muscle or other tissues. It is needed for heart hypertrophy following mechanical overload. The integrin-binding portion of this domain appears to be sequestered in the full length melusin protein, Ca2+ may modulate the protein's conformation exposing this binding site. This group includes Chordc1, also known as Chp-1, which is conserved from vertebrates to humans.  Mammalian Chordc1 interacts with the heat shock protein (HSP) Hsp90 and is implicated in circadian and/or homeostatic mechanisms in the brain. The N-terminal portions of proteins belonging to this group contain two cysteine and histidine rich domain (CHORD) domains.	87
107239	cd06489	p23_CS_hSgt1_like	p23_like domain similar to the C-terminal CS (CHORD-SGT1) domain of human (h) Sgt1 and related proteins. hSgt1 is a co-chaperone which has been shown to be elevated in HEp-2 cells as a result of stress conditions such as heat shock. It interacts with the heat shock proteins (HSPs) Hsp70 and Hsp90, and its expression pattern is synchronized with these two Hsps. The interaction with HSP90 has been shown to involve the hSgt1_CS domain, and appears to be required for correct kinetochore assembly and efficient cell division.  Some proteins in this subgroup contain a tetratricopeptide repeat (TPR) HSP-binding domain N-terminal to this CS domain, and most proteins in this subgroup contain a Sgt1-specific (SGS) domain C-terminal to the CS domain. The SGS domain interacts with some S100 family proteins. Studies suggest that S100A6 modulates in a Ca2+ dependent manner the interactions of hSgt1 with Hsp90 and Hsp70. The yeast Sgt1 CS domain is not found in this subgroup.	84
107240	cd06490	p23_NCB5OR	p23_like domain found in NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR) and similar proteins.  NCB5OR is widely expressed in human organs and tissues and is localized in the ER (endoplasmic reticulum). It appears to play a critical role in maintaining viable pancreatic beta cells. Mice homozygous for a targeted knockout (KO) of the gene encoding NCB5OR develop an early-onset nonautoimmune diabetes phenotype with a non-inflammatory beta-cell deficiency.  The role of NCB5OR in beta cells may be in maintaining or regulating their redox status. Proteins in this group in addition contain an N-terminal cytochrome b5 domain and a C-terminal cytochrome b5 oxidoreductase domain.  The gene encoding NCB5OR has been considered as a positional candidate for type II diabetes and other diabetes subtypes related to B-cell dysfunction; however, variation in its coding region does not appear to be a major contributor to the pathogenesis of these diseases.	87
107241	cd06492	p23_mNUDC_like	p23-like NUD (nuclear distribution) C-like domain of mammalian(m) NUDC and similar proteins. Mammalian(m) NUDC associates both with the dynein complex and also with an anti-inflammatory enzyme, platelet activating factor acetylhydrolase I, PAF-AH(I) complex, through binding mNUDF, the regulatory beta subunit of PAF-AH(I).  mNUDC is important for cell proliferation both in normal and tumor tissues.  Its expression is elevated in various cell types undergoing mitosis or stimulated to proliferate, with high expression levels observed in leukemic cells and tumors. For a leukemic cell line, human NUDC was shown to activate the thrombopoietin (TPO) receptor (Mpl) by binding to its extracellular domain, and promoting cell proliferation and differentiation.	87
107242	cd06493	p23_NUDCD1_like	p23_NUDCD1: p23-like NUD (nuclear distribution) C-like domain found in human NUD (nuclear distribution) C domain-containing protein 1, NUDCD1 (also known as CML66), and similar proteins. NUDCD1/CML66 is a broadly immunogenic tumor associated antigen, which is highly expressed in a variety of solid tumors and in leukemias. In normal tissues high expression of NUDCD1/CML66 is limited to testis and heart.	85
107243	cd06494	p23_NUDCD2_like	p23-like NUD (nuclear distribution) C-like domain found in human NUDC domain-containing protein 2 (NUDCD2) and similar proteins.  Little is known about the function of the proteins in this subgroup.	93
107244	cd06495	p23_NUDCD3_like	p23-like NUD (nuclear distribution) C-like domain found in human NUDC domain-containing protein 3 (NUDCD3) and similar proteins.   Little is known about the function of the proteins in this subgroup.	102
107245	cd06497	ACD_alphaA-crystallin_HspB4	Alpha-crystallin domain found in the small heat shock protein (sHsp) alphaA-crystallin (HspB4, 20kDa). sHsps are molecular chaperones that suppress protein aggregation and protect against cell stress, and are generally active as large oligomers consisting of multiple subunits. Alpha crystallin, an abundant protein in the mammalian lens, is a large (700 kDa) heteropolymer composed of HspB4 and HspB5, generally in a molar ratio of HspB4:HspB5 of 3:1.  Only trace amounts of HspB4 are found in tissues other than the lens. HspB5 does not belong to this group. Mutations in HspB4 have been associated with Autosomal Dominant Congenital Cataract (ADCC). The chaperone-like functions of HspB4 are considered important for maintaining lens transparency and preventing cataract.	86
107246	cd06498	ACD_alphaB-crystallin_HspB5	Alpha-crystallin domain found in the small heat shock protein (sHsp) alphaB-crystallin (HspB5, 20kDa). sHsps are molecular chaperones that suppress protein aggregation and protect against cell stress, and are generally active as large oligomers consisting of multiple subunits. Alpha crystallin, an abundant protein in the mammalian lens, is a large (700 kDa) heteropolymer composed of HspB4 and HspB5, generally in a molar ratio of HspB4:HspB5 of 3:1.  HspB4 does not belong to this group. HspB5 shows increased synthesis in response to stress. HspB5 is also expressed constitutively in other tissues including brain, heart, and type I and type IIa skeletal muscle fibers, and in several cancers including gliomas, renal cell carcinomas, basal-like and metaplastic breast carcinomas, and head and neck cancer.  Its functions include effects on the apoptotic pathway and on metastasis.  Phosphorylation of HspB5 reduces its oligomerization and anti-apoptotic activities.  HspB5 is protective in demyelinating disease such as multiple sclerosis (MS), being a negative regulator of inflammation. In early active MS lesions it is the most abundant gene transcript and an autoantigen, the immune response against it would disrupt its function and worsen inflammation and demyelination. Given as therapy for ongoing demyelinating disease it may counteract this effect.  It is an autoantigen in the pathogenesis of various other inflammatory disorders including Lens-associated uveitis (LAU), and Behcet's disease. Mutations in HspB5 have been associated with diseases including dominant cataract and desmin-related myopathy.	84
133460	cd06499	GT_MraY-like	Glycosyltransferase 4 (GT4) includes both eukaryotic and prokaryotic UDP-D-N-acetylhexosamine:polyprenol phosphate D-N-acetylhexosamine-1-phosphate transferases. They catalyze the transfer of a D-N-acetylhexosamine 1-phosphate to a membrane-bound polyprenol phosphate, which is the initiation step of protein N-glycosylation in eukaryotes and peptidoglycan biosynthesis in bacteria. One member, D-N-acetylhexosamine 1-phosphate transferase (GPT) is a eukaryotic enzyme, which is specific for UDP-GlcNAc as donor substrate and dolichol-phosphate as the membrane bound acceptor. The bacterial members MraY, WecA, and WbpL/WbcO utilize undecaprenol phosphate as the acceptor substrate, but use different UDP-sugar donor substrates. MraY-type transferases are highly specific for UDP-N-acetylmuramate-pentapeptide, whereas WecA proteins are selective for UDP-N-acetylglucosamine (UDP-GlcNAc). The WbcO/WbpL substrate specificity has not yet been determined, but the structure of their biosynthetic endproducts implies that UDP-N-acetyl-D-fucosamine (UDP-FucNAc) and/or UDP-N-acetyl-D-quinosamine (UDP-QuiNAc) are used. The eukaryotic reaction is the first step in the assembly of dolichol-linked oligosaccharide intermediates and is essential for N-glycosylation. The prokaryotic reactions lead to the formation of polyprenol-linked oligosaccharides involved in bacterial cell wall and peptidoglycan assembly. Archaeal and eukaryotic enzymes may use the same substrates and are evolutionarily closer than the bacterial enzyme. Archaea possess the same N-glycosylation pathway as eukaryotes. A glycosyl transferase gene Mv1751 in M. voltae encodes the enzyme that carries out the first step in the pathway, the attachment of GlcNAc to a dolichol lipid carrier in the membrane. A lethal mutation in the alg7 (GPT) gene in Saccharomyces cerevisiae was successfully complemented with Mv1751, the archaeal gene.	185
99748	cd06502	TA_like	Low-specificity threonine aldolase (TA). This family belongs to pyridoxal phosphate (PLP)-dependent aspartate aminotransferase superfamily (fold I).  TA catalyzes the conversion of L-threonine or L-allo-threonine to glycine and acetaldehyde in a secondary glycine biosynthetic pathway.	338
349951	cd06503	ATP-synt_Fo_b	F-type ATP synthase, membrane subunit b. Membrane subunit b is a component of the Fo complex of FoF1-ATP synthase. The F-type ATP synthases (FoF1-ATPase) consist of two structural domains: the F1 (assembly factor one) complex containing the soluble catalytic core, and the Fo (oligomycin sensitive factor) complex containing the membrane proton channel, linked together by a central stalk and a peripheral stalk. F1 is composed of alpha (or A), beta (B), gamma (C), delta (D) and epsilon (E) subunits with a stoichiometry of 3:3:1:1:1, while Fo consists of the three subunits a, b, and c (1:2:10-14). An oligomeric ring of 10-14 c subunits (c-ring) make up the Fo rotor. The flux of protons through the ATPase channel (Fo) drives the rotation of the c-ring, which in turn is coupled to the rotation of the F1 complex gamma subunit rotor due to the permanent binding between the gamma and epsilon subunits of F1 and the c-ring of Fo. The F-ATP synthases are primarily found in the inner membranes of eukaryotic mitochondria, in the thylakoid membranes of chloroplasts or in the plasma membranes of bacteria. The F-ATP synthases are the primary producers of ATP, using the proton gradient generated by oxidative phosphorylation (mitochondria) or photosynthesis (chloroplasts). Alternatively, under conditions of low driving force, ATP synthases function as ATPases, thus generating a transmembrane proton or Na(+) gradient at the expense of energy derived from ATP hydrolysis. This group also includes F-ATP synthase that has also been found in the archaea Candidatus Methanoperedens.	132
119382	cd06522	GH25_AtlA-like	AtlA is an autolysin found in Gram-positive lactic acid bacteria that degrades bacterial cell walls by catalyzing the hydrolysis of 1,4-beta-linkages between N-acetylmuramic acid and N-acetyl-D-glucosamine residues.  This family includes the AtlA and Aml autolysins from Streptococcus mutans which have a C-terminal glycosyl hydrolase family 25 (GH25) catalytic domain as well as six tandem N-terminal repeats of the GBS (group B Streptococcus) Bsp-like peptidoglycan-binding domain.  Other members of this family have one or more C-terminal peptidoglycan-binding domain(s) (SH3 or LysM) in addition to the GH25 domain.	192
119383	cd06523	GH25_PlyB-like	PlyB is a bacteriophage endolysin that displays potent lytic activity toward Bacillus anthracis.  PlyB has an N-terminal glycosyl hydrolase family 25 (GH25) catalytic domain and a C-terminal bacterial SH3-like domain, SH3b.  Both domains are required for effective catalytic activity.  Endolysins are produced by bacteriophages at the end of their life cycle and participate in lysing the bacterial cell in order to release the newly formed progeny.  Endolysins (also referred to as endo-N-acetylmuramidases or peptidoglycan hydrolases) degrade bacterial cell walls by catalyzing the hydrolysis of 1,4-beta-linkages between N-acetylmuramic acid and N-acetyl-D-glucosamine residues.	177
119384	cd06524	GH25_YegX-like	YegX is an uncharacterized bacterial protein with a glycosyl hydrolase family 25 (GH25) catalytic domain that is similar in sequence to the CH-type (Chalaropsis-type) lysozymes of the GH25 family of endolysins.	194
119385	cd06525	GH25_Lyc-like	Lyc muramidase is an autolytic lysozyme (autolysin) from Clostridium acetobutylicum encoded by the lyc gene.  Lyc has a glycosyl hydrolase family 25 (GH25) catalytic domain.  Endo-N-acetylmuramidases are lysozymes (also referred to as peptidoglycan hydrolases) that degrade bacterial cell walls by catalyzing the hydrolysis of 1,4-beta-linkages between N-acetylmuramic acid and N-acetyl-D-glucosamine residues.	184
107247	cd06526	metazoan_ACD	Alpha-crystallin domain (ACD) of metazoan alpha-crystallin-type small(s) heat shock proteins (Hsps). sHsps are small stress-induced proteins with monomeric masses between 12 and 43 kDa, whose common feature is the alpha-crystallin domain (ACD). sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps.	83
132725	cd06528	RNAP_A''	A'' subunit of Archaeal RNA Polymerase (RNAP). Archaeal RNA polymerase (RNAP), like bacterial RNAP, is a large multi-subunit complex responsible for the synthesis of all RNAs in the cell. The relative positioning of the RNAP core is highly conserved between archaeal RNAP and the three classes of eukaryotic RNAPs. In archaea, the largest subunit is split into two polypeptides, A' and A'', which are encoded by separate genes in an operon. Sequence alignments reveal that the archaeal A'' subunit corresponds to the C-terminal one-third of the RNAPII largest subunit (Rpb1). In subunit A'', several loops in the jaw domain are shorter. The RNAPII Rpb1 interacts with the second-largest subunit (Rpb2) to form the DNA entry and RNA exit channels in addition to the catalytic center of RNA synthesis.	363
119397	cd06529	S24_LexA-like	Peptidase S24 LexA-like proteins are involved in the SOS response leading to the repair of single-stranded DNA within the bacterial cell. This family includes: the lambda repressor CI/C2 family and related bacterial prophage repressor proteins; LexA (EC 3.4.21.88), the repressor of genes in the cellular SOS response to DNA damage; MucA and the related UmuD proteins, which are lesion-bypass DNA polymerases, induced in response to mitogenic DNA damage; RulA, a component of the rulAB locus that confers resistance to UV, and RuvA, which is a component of the RuvABC resolvasome that catalyzes the resolution of Holliday junctions that arise during genetic recombination and DNA repair. The LexA-like proteins contain two domains: an N-terminal DNA binding domain and a C-terminal domain (CTD) that provides LexA dimerization as well as cleavage activity. They undergo autolysis, cleaving at an Ala-Gly or a Cys-Gly bond, separating the DNA-binding domain from the rest of the protein. In the presence of single-stranded DNA, the LexA, UmuD and MucA proteins interact with RecA, activating self cleavage, thus either derepressing transcription in the case of LexA or activating the lesion-bypass polymerase in the case of UmuD and MucA. The LexA proteins are serine proteases that carry out catalysis using a serine/lysine dyad instead of the prototypical serine/histidine/aspartic acid triad found in most serine proteases. LexA sequence homologs are found in almost all of the bacterial genomes sequenced to date, covering a large number of phyla, suggesting both an ancient origin and a widespread distribution of lexA and the SOS response.	81
119398	cd06530	S26_SPase_I	The S26 Type I signal peptidase (SPase; LepB; leader peptidase B; leader peptidase I; EC 3.4.21.89) family members are essential membrane-bound serine proteases that function to cleave the amino-terminal signal peptide extension from proteins that are translocated across biological membranes. The bacterial signal peptidase I, which is the most intensively studied, has two N-terminal transmembrane segments inserted in the plasma membrane and a hydrophilic, C-terminal catalytic region that is located in the periplasmic space. Although the bacterial signal peptidase I is monomeric, signal peptidases of eukaryotic cells commonly function as oligomeric complexes containing two divergent copies of the catalytic monomer. These are the IMP1 and IMP2 signal peptidases of the mitochondrial inner membrane that remove leader peptides from nuclear- and mitochondrial-encoded proteins. Also, two components of the endoplasmic reticulum signal peptidase in mammals (18-kDa and 21-kDa) belong to this family and they process many proteins that enter the ER for retention or for export to the Golgi apparatus, secretory vesicles, plasma membranes or vacuole. An atypical member of the S26 SPase type I family is the TraF peptidase which has the remarkable activity of producing a cyclic protein of the Pseudomonas pilin system. The type I signal peptidases are unique serine proteases that utilize a serine/lysine catalytic dyad mechanism in place of the classical serine/histidine/aspartic acid catalytic triad mechanism.	85
133474	cd06532	Glyco_transf_25	Glycosyltransferase family 25 [lipooligosaccharide (LOS) biosynthesis protein] is a family of glycosyltransferases involved in LOS biosynthesis. The members include the beta(1,4) galactosyltransferases: Lgt2 of Moraxella catarrhalis, LgtB and LgtE of Neisseria gonorrhoeae and Lic2A of Haemophilus influenzae. M. catarrhalis Lgt2 catalyzes the addition of galactose (Gal) to the growing chain of LOS on the cell surface. N. gonorrhoeae LgtB and LgtE link Gal-beta(1,4)  to GlcNAc (N-acetylglucosamine) and Glc (glucose), respectively. The genes encoding LgtB and LgtE are two genes of a five gene locus involved in the synthesis of gonococcal LOS. LgtE is believed to perform the first step in LOS biosynthesis.	128
119439	cd06533	Glyco_transf_WecG_TagA	The glycosyltransferase WecG/TagA superfamily contains Escherichia coli WecG, Bacillus subtilis TagA and related proteins. E. coli WecG is believed to be a UDP-N-acetyl-D-mannosaminuronic acid transferase, and is involved in enterobacterial common antigen (eca) synthesis. B. subtilis TagA plays a key role in the Wall Teichoic Acid (WTA) biosynthetic pathway, catalyzing the transfer of N-acetylmannosamine to the C4 hydroxyl of a membrane-anchored N-acetylglucosaminyl diphospholipid to make ManNAc-beta-(1,4)-GlcNAc-pp-undecaprenyl. This is the first committed step in this pathway. Also included in this group is Xanthomonas campestris pv. campestris GumM, a glycosyltransferase participating in the biosynthesis of the exopolysaccharide xanthan.	171
143395	cd06534	ALDH-SF	NAD(P)+-dependent aldehyde dehydrogenase superfamily. The aldehyde dehydrogenase superfamily (ALDH-SF) of NAD(P)+-dependent enzymes, in general, oxidize a wide range of endogenous and exogenous aliphatic and aromatic aldehydes to their corresponding carboxylic acids and play an important role in detoxification. Besides aldehyde detoxification, many ALDH isozymes possess multiple additional catalytic and non-catalytic functions such as participating in metabolic pathways, or as binding proteins, or osmoregulants, to mention a few. The enzyme has three domains, a NAD(P)+ cofactor-binding domain, a catalytic domain, and a bridging domain; and the active enzyme is generally either homodimeric or homotetrameric. The catalytic mechanism is proposed to involve cofactor binding, resulting in a conformational change and activation of an invariant catalytic cysteine nucleophile. The cysteine and aldehyde substrate form an oxyanion thiohemiacetal intermediate resulting in hydride transfer to the cofactor and formation of a thioacylenzyme intermediate. Hydrolysis of the thioacylenzyme and release of the carboxylic acid product occurs, and in most cases, the reduced cofactor dissociates from the enzyme. The evolutionary phylogenetic tree of ALDHs appears to have an initial bifurcation between what has been characterized as the classical aldehyde dehydrogenases, the ALDH family (ALDH) and extended family members or aldehyde dehydrogenase-like (ALDH-L) proteins. The ALDH proteins are represented by enzymes which share a number of highly conserved residues necessary for catalysis and cofactor binding and they include such proteins as retinal dehydrogenase, 10-formyltetrahydrofolate dehydrogenase, non-phosphorylating glyceraldehyde 3-phosphate dehydrogenase, delta(1)-pyrroline-5-carboxylate dehydrogenases, alpha-ketoglutaric semialdehyde dehydrogenase, alpha-aminoadipic semialdehyde dehydrogenase, coniferyl aldehyde dehydrogenase and succinate-semialdehyde dehydrogenase. Included in this larger group are all human, Arabidopsis, Tortula, fungal, protozoan, and Drosophila ALDHs identified in families ALDH1 through ALDH22 with the exception of families ALDH18, ALDH19, and ALDH20 which are present in the ALDH-like group. The ALDH-like group is represented by such proteins as gamma-glutamyl phosphate reductase, LuxC-like acyl-CoA reductase, and coenzyme A acylating aldehyde dehydrogenase. All of these proteins have a conserved cysteine that aligns with the catalytic cysteine of the ALDH group.	367
119368	cd06535	CIDE_N_CAD	CIDE_N domain of CAD nuclease. The CIDE_N (cell death-inducing DFF45-like effector, N-terminal) domain is found at the N-terminus of CAD nuclease (caspase-activated DNase/DNA fragmentation factor, DFF40) and its inhibitor, ICAD(DFF45). These proteins are associated with the chromatin condensation and DNA fragmentation events of apoptosis; the CIDE_N domain is thought to regulate the activity of CAD/DFF40 and ICAD/DFF45 during apoptosis. In normal cells, DFF exists in the nucleus as a heterodimer composed of CAD/DFF40 as a latent nuclease and its chaperone and inhibitor subunit ICAD/DFF45. Apoptotic activation of caspase-3 results in the cleavage of DFF45/ICAD and the release of active DFF40/CAD nuclease.	77
119369	cd06536	CIDE_N_ICAD	CIDE_N domain of ICAD. The CIDE_N  (cell death-inducing DFF45-like effector, N-terminal) domain is found at the N-terminus of the CAD nuclease (caspase-activated DNase/DNA fragmentation factor, DFF40) and its inhibitor, ICAD (DFF45). These proteins are associated with the chromatin condensation and DNA fragmentation events of apoptosis; the CIDE_N domain is thought to regulate the activity of the CAD/DFF40 and ICAD/DFF45 during apoptosis. In normal cells, DFF exists in the nucleus as a heterodimer composed of CAD/DFF40 as a latent nuclease and its chaperone and inhibitor subunit ICAD/DFF45. Apoptotic activation of caspase-3 results in the cleavage of DFF45/ICAD and release of active DFF40/CAD nuclease.	80
119370	cd06537	CIDE_N_B	CIDE_N domain of CIDE-B proteins. The CIDE_N (cell death-inducing DFF45-like effector, N-terminal) domain is found at the N-terminus of the CIDE (cell death-inducing DFF45-like effector) proteins. These proteins are associated with the chromatin condensation and DNA fragmentation events of apoptosis; the CIDE_N domain is thought to regulate the activity of the CAD/DFF40,  ICAD/DFF45 and CIDE nucleases during apoptosis. The CIDE protein family includes 3 members: CIDE-A, CIDE-B, and FSP27(CIDE-C).  Based on sequence similarity with DFF40 and DFF45, CIDE proteins were initially characterized as mitochondrial activators of apoptosis. However, strong metabolic phenotypes of mice lacking CIDE-A and CIDE-B indicated that this family may play critical roles in energy balance.	81
119371	cd06538	CIDE_N_FSP27	CIDE_N domain of FSP27 proteins. The CIDE-N (cell death-inducing DFF45-like effector, N-terminal) domain is found in the FSP27/CIDE-C protein, which has been identified as an adipocyte lipid droplet protein that negatively regulates lipolysis and promotes triglyceride accumulation. The CIDE protein family includes 3 members: CIDE-A, CIDE-B, and FSP27(CIDE-C). Based on sequence similarity with DFF40 and DFF45, CIDE proteins were initially characterized as mitochondrial activators of apoptosis. The CIDE-N domain of FSP27 is sufficient to increase apoptosis in vitro when overexpressed.	79
119372	cd06539	CIDE_N_A	CIDE_N domain of CIDE-A proteins. The CIDE_N (cell death-inducing DFF45-like effector, N-terminal) domain is found at the N-terminus of the CIDE (cell death-inducing DFF45-like effector) proteins. These proteins are associated with the chromatin condensation and DNA fragmentation events of apoptosis; the CIDE_N domain is thought to regulate the activity of the CAD/DFF40, ICAD/DFF45, and CIDE nucleases during apoptosis. The CIDE protein family includes 3 members: CIDE-A, CIDE-B, and FSP27(CIDE-C).  Based on sequence similarity with DFF40 and DFF45, the CIDE proteins were initially characterized as mitochondrial activators of apoptosis. However, strong metabolic phenotypes of mice lacking CIDE-A and CIDE-B indicated that this family may play critical roles in energy balance.	78
119343	cd06541	ASCH	ASC-1 homology or ASCH domain, a small beta-barrel domain found in all three kingdoms of life. ASCH resembles the RNA-binding PUA domain and may also interact with RNA. ASCH has been proposed to function as an RNA-binding domain during coactivation, RNA-processing and the regulation of prokaryotic translation. The domain has been named after the ASC-1 protein, the activating signal cointegrator 1 or thyroid hormone receptor interactor protein 4 (TRIP4). ASC-1 is conserved in many eukaryotes and has been suggested to participate in a protein complex that interacts with RNA. It has been shown that ASC-1 mediates the interaction between various transcription factors and the basal transcriptional machinery.	105
119359	cd06542	GH18_EndoS-like	Endo-beta-N-acetylglucosaminidases are bacterial chitinases that hydrolyze the chitin core of various asparagine (N)-linked glycans and glycoproteins. The endo-beta-N-acetylglucosaminidases have a glycosyl hydrolase family 18 (GH18) catalytic domain.  Some members also have an additional C-terminal glycosyl hydrolase family 20 (GH20) domain while others have an N-terminal domain of unknown function (pfam08522).  Members of this family include endo-beta-N-acetylglucosaminidase S (EndoS) from Streptococcus pyogenes, EndoF1, EndoF2, EndoF3, and  EndoH from Flavobacterium meningosepticum, and  EndoE from Enterococcus faecalis.  EndoS is a secreted endoglycosidase from Streptococcus pyogenes that specifically hydrolyzes the glycan on human IgG between two core N-acetylglucosamine residues.  EndoE is a secreted endoglycosidase, encoded by the ndoE gene in Enterococcus faecalis, that hydrolyzes the glycan on human RNase B.	255
119360	cd06543	GH18_PF-ChiA-like	PF-ChiA is an uncharacterized chitinase found in the hyperthermophilic archaeon Pyrococcus furiosus with a glycosyl hydrolase family 18 (GH18) catalytic domain as well as a cellulose-binding domain.  Members of this domain family are found not only in archaea but also in eukaryotes and prokaryotes. PF-ChiA exhibits hydrolytic activity toward both colloidal and crystalline (beta/alpha) chitins at high temperature.	294
119361	cd06544	GH18_narbonin	Narbonin is a plant 2S protein from the globulin fraction of narbon bean (Vicia narbonensis L.) cotyledons with unknown function.  Narbonin has a glycosyl hydrolase family 18 (GH18) domain without the conserved catalytic residues and with no known enzymatic activity.  Narbonin amounts to up to 3% of the total seed globulins of mature seeds and was thought to be a storage protein but was found to degrade too slowly during germination.  This family also includes the VfNOD32 nodulin from Vicia faba.	253
119362	cd06545	GH18_3CO4_chitinase	The Bacteroides thetaiotaomicron protein represented by pdb structure 3CO4 is an uncharacterized bacterial member of the family 18 glycosyl hydrolases with homologs found in Flavobacterium, Stigmatella, and Pseudomonas.	253
119363	cd06546	GH18_CTS3_chitinase	GH18 domain of CTS3 (chitinase 3), an uncharacterized protein from the human fungal pathogen Coccidioides posadasii.  CTS3 has a chitinase-like glycosyl hydrolase family 18 (GH18) domain; and has homologs in bacteria as well as fungi.	256
119364	cd06547	GH85_ENGase	Endo-beta-N-acetylglucosaminidase (ENGase) hydrolyzes the N-N'-diacetylchitobiosyl core of N-glycosylproteins.  The beta-1,4-glycosyl bond located between two N-acetylglucosamine residues is hydrolyzed such that N-acetylglucosamine 1 remains with the protein and N-acetylglucosamine 2 forms the reducing end of the released glycan.  ENGase is a key enzyme in the processing of free oligosaccharides in the cytosol of eukaryotes. Oligosaccharides formed in the lumen of the endoplasmic reticulum are transported into the cytosol where they are catabolized by cytosolic ENGases and other enzymes, possibly to maximize the reutilization of the component sugars. ENGases have an eight-stranded alpha/beta barrel topology and are classified as a family 85 glycosyl hydrolase (GH85) domain.  The GH85 ENGases are sequence-similar to the family 18 glycosyl hydrolases, also known as GH18 chitinases.  An ENGase-like protein is also found in bacteria and is included in this alignment model.	339
119365	cd06548	GH18_chitinase	The GH18 (glycosyl hydrolases, family 18) type II chitinases hydrolyze chitin, an abundant polymer of N-acetylglucosamine and have been identified in bacteria, fungi, insects, plants, viruses, and protozoan parasites.  The structure of this domain is an eight-stranded alpha/beta barrel with a pronounced active-site cleft at the C-terminal end of the beta-barrel.	322
119366	cd06549	GH18_trifunctional	GH18 domain of an uncharacterized family of bacterial proteins, which share a common three-domain architecture: an N-terminal glycosyl hydrolase family 18 (GH18) domain, a glycosyl transferase family 2 domain, and a C-terminal polysaccharide deacetylase domain.	298
119348	cd06550	TM_ABC_iron-siderophores_like	Transmembrane subunit (TM), of Periplasmic Binding Protein (PBP)-dependent ATP-Binding Cassette (ABC) transporters involved in the uptake of siderophores, heme, vitamin B12, or the divalent cations Mg2+ and Zn2+. PBP-dependent ABC transporters consist of a PBP, two TMs, and two cytoplasmic ABCs, and are mainly involved in importing solutes from the environment. The solute is captured by the PBP which delivers it to a gated translocation pathway formed by the two TMs. The TMs are bundles of alpha helices that transverse the cytoplasmic membrane multiple times. The two ABCs bind and hydrolyze ATP and drive the transport reaction. Each TM has a prominent cytoplasmic loop which contacts an ABC and represents a conserved motif. The two TMs form either a homodimer (e.g. in the case of the BtuC subunits of the Escherichia coli BtuCD vitamin B12 transporter), a heterodimer (e.g. the TroC and TroD subunits of the Treponema pallidum general transition metal transporter, TroBCD), or a pseudo-heterodimer (e.g. the FhuB protein of the E. coli ferrichrome transporter, FhuBC). FhuB contains two tandem TMs which associate to form the pseudo-heterodimer. Both FhuB TMs are found in this hierarchy.	261
153244	cd06551	LPLAT	Lysophospholipid acyltransferases (LPLATs) of glycerophospholipid biosynthesis. Lysophospholipid acyltransferase (LPLAT) superfamily members are acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis. These proteins catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. Included in this superfamily are LPLATs such as glycerol-3-phosphate 1-acyltransferase (GPAT, PlsB), 1-acyl-sn-glycerol-3-phosphate acyltransferase (AGPAT, PlsC), lysophosphatidylcholine acyltransferase 1 (LPCAT-1), lysophosphatidylethanolamine acyltransferase (LPEAT, also known as, MBOAT2, membrane-bound O-acyltransferase domain-containing protein 2), lipid A biosynthesis lauroyl/myristoyl acyltransferase, 2-acylglycerol O-acyltransferase (MGAT), dihydroxyacetone phosphate acyltransferase (DHAPAT, also known as 1 glycerol-3-phosphate O-acyltransferase 1) and Tafazzin (the protein product of the Barth syndrome (TAZ) gene).	187
119344	cd06552	ASCH_yqfb_like	ASC-1 homology domain, subfamily similar to Escherichia coli Yqfb. The ASCH domain, a small beta-barrel domain found in all three kingdoms of life, resembles the RNA-binding PUA domain and may also interact with RNA. ASCH has been proposed to function as an RNA-binding domain during coactivation, RNA-processing and the regulation of prokaryotic translation.	100
119345	cd06553	ASCH_Ef3133_like	ASC-1 homology domain, subfamily similar to Enterococcus faecalis Ef3133. The ASCH domain, a small beta-barrel domain found in all three kingdoms of life, resembles the RNA-binding PUA domain and may also interact with RNA. ASCH has been proposed to function as an RNA-binding domain during coactivation, RNA-processing and the regulation of prokaryotic translation.	127
119346	cd06554	ASCH_ASC-1_like	ASC-1 homology domain, ASC-1-like subfamily. The ASCH domain, a small beta-barrel domain found in all three kingdoms of life, resembles the RNA-binding PUA domain and may also interact with RNA. ASCH has been proposed to function as an RNA-binding domain during coactivation, RNA-processing and the regulation of prokaryotic translation. The domain has been named after the ASC-1 protein, the activating signal cointegrator 1 or thyroid hormone receptor interactor protein 4 (TRIP4). ASC-1 is conserved in many eukaryotes and has been suggested to participate in a protein complex that interacts with RNA. It has been shown that ASC-1 mediates the interaction between various transcription factors and the basal transcriptional machinery.	113
119347	cd06555	ASCH_PF0470_like	ASC-1 homology domain, subfamily similar to Pyrococcus furiosus Pf0470. The ASCH domain, a small beta-barrel domain found in all three kingdoms of life, resembles the RNA-binding PUA domain and may also interact with RNA. ASCH has been proposed to function as an RNA-binding domain during coactivation, RNA-processing and the regulation of prokaryotic translation.	109
119341	cd06556	ICL_KPHMT	Members of the ICL/PEPM_KPHMT enzyme superfamily catalyze the formation and cleavage of either P-C or C-C bonds. Typical members are phosphoenolpyruvate mutase (PEPM), phosphonopyruvate hydrolase (PPH), carboxyPEP mutase (CPEP mutase), oxaloacetate hydrolase (OAH), isocitrate lyase (ICL), 2-methylisocitrate lyase (MICL), and ketopantoate hydroxymethyltransferase (KPHMT).	240
119342	cd06557	KPHMT-like	Ketopantoate hydroxymethyltransferase (KPHMT) is the first enzyme in the pantothenate biosynthesis pathway. Ketopantoate hydroxymethyltransferase (KPHMT) catalyzes the first committed step in the biosynthesis of pantothenate (vitamin B5), which is a precursor to coenzyme A and is required for penicillin biosynthesis.	254
119339	cd06558	crotonase-like	Crotonase/Enoyl-Coenzyme A (CoA) hydratase superfamily. This superfamily contains a diverse set of enzymes including enoyl-CoA hydratase, naphthoate synthase, methylmalonyl-CoA decarboxylase, 3-hydroxybutyryl-CoA dehydratase, and dienoyl-CoA isomerase. Many of these play important roles in fatty acid metabolism. In addition to a conserved structural core and the formation of trimers (or dimers of trimers), a common feature in this superfamily is the stabilization of an enolate anion intermediate derived from an acyl-CoA substrate. This is accomplished by two conserved backbone NH groups in active sites that form an oxyanion hole.	195
143472	cd06559	Endonuclease_V	Endonuclease_V, a DNA repair enzyme that initiates repair of nitrosative deaminated purine bases. Endonuclease_V (EndoV) is an enzyme that can initiate repair of all possible deaminated DNA bases.  EndoV cleaves the DNA strand containing lesions at the second phosphodiester bond 3' to the lesion using Mg2+ as a cofactor.  EndoV homologs are conserved throughout all domains of life from bacteria to humans. EndoV is encoded by the nfi gene and nfi null mutant mice have a phenotype prone to cancer. The ability of endonuclease V to recognize mismatches and abnormal replicative DNA structures suggests that the enzyme plays an important role in DNA metabolism. The details of downstream processing for the EndoV pathway remain unknown.	208
143473	cd06560	PriL	Archaeal/eukaryotic core primase: Large subunit, PriL. Primases synthesize the RNA primers required for DNA replication. Primases are grouped into two classes, bacteria/bacteriophage and archaeal/eukaryotic. The proteins in the two classes differ in structure and the replication apparatus components. The DNA replication machinery of archaeal organisms contains only the core primase, a simpler arrangement compared to eukaryotes. Archaeal/eukaryotic core primase is a heterodimeric enzyme consisting of a small catalytic subunit (PriS) and a large subunit (PriL). Although the catalytic activity resides within PriS, the PriL subunit is essential for primase function as disruption of the PriL gene in yeast is lethal. PriL is composed of two structural domains. Several functions have been proposed for PriL, such as the stabilization of PriS, involvement in the initiation of synthesis, the improvement of primase processivity, and the determination of product size.	166
132880	cd06561	AlkD_like	A new structural DNA glycosylase. This domain represents a new and uncharacterized structural superfamily of DNA glycosylases that form an alpha-alpha superhelix fold and do not belong to the identified five structural DNA glycosylase superfamilies (UDG, AAG/MNPG, MutM/Fpg and helix-hairpin-helix). DNA glycosylases removing alkylated base residues have been identified in all organisms investigated and may be universally present in nature. DNA glycosylases catalyze the first step in the Base Excision Repair (BER) pathway by cleaving damaged DNA bases within double-stranded DNA to produce an abasic site. The resulting abasic site is further processed by AP endonuclease, phosphodiesterase, DNA polymerases, and DNA ligase functions to restore the DNA to an undamaged state. All glycosylases examined to date utilize a similar strategy for binding DNA and base flipping despite their structural diversity.	197
119332	cd06562	GH20_HexA_HexB-like	Beta-N-acetylhexosaminidases catalyze the removal of beta-1,4-linked N-acetyl-D-hexosamine residues from the non-reducing ends of N-acetyl-beta-D-hexosaminides including N-acetylglucosides and N-acetylgalactosides. The hexA and hexB genes encode the alpha- and beta-subunits of the two major beta-N-acetylhexosaminidase isoenzymes, N-acetyl-beta-D-hexosaminidase A (HexA) and beta-N-acetylhexosaminidase B (HexB). Both the alpha and the beta catalytic subunits have a TIM-barrel fold and belong to the glycosyl hydrolase family 20 (GH20). The HexA enzyme is a heterodimer containing one alpha and one beta subunit while the HexB enzyme is a homodimer containing two beta-subunits. Hexosaminidase mutations cause an inability to properly hydrolyze certain sphingolipids which accumulate in lysosomes within the brain, resulting in the lipid storage disorders Tay-Sachs and Sandhoff. Mutations in the alpha subunit cause a deficiency in the HexA enzyme and result in Tay-Sachs disease; mutations in the beta-subunit cause a deficiency in both HexA and HexB enzymes and result in Sandhoff disease. In both disorders GM(2) gangliosides accumulate in lysosomes. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by solvent or the enzyme, but by the substrate itself.	348
119333	cd06563	GH20_chitobiase-like	The chitobiase of Serratia marcescens is a beta-N-1,4-acetylhexosaminidase with a glycosyl hydrolase family 20 (GH20) domain that hydrolyzes the beta-1,4-glycosidic linkages in oligomers derived from chitin.  Chitin is degraded by a two step process: i) a chitinase hydrolyzes the chitin to oligosaccharides and disaccharides such as di-N-acetyl-D-glucosamine and chitobiose, ii) chitobiase then further degrades these oligomers into monomers. This GH20 domain family includes an N-acetylglucosamidase (GlcNAcase A) from Pseudoalteromonas piscicida and an N-acetylhexosaminidase (SpHex) from Streptomyces plicatus. SpHex lacks the C-terminal PKD (polycystic kidney disease I)-like domain found in the chitobiases. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by solvent or the enzyme, but by the substrate itself.	357
119334	cd06564	GH20_DspB_LnbB-like	Glycosyl hydrolase family 20 (GH20) catalytic domain of dispersin B (DspB), lacto-N-biosidase (LnbB) and related proteins. Dispersin B is a soluble beta-N-acetylglucosamidase found in bacteria that hydrolyzes the beta-1,6-linkages of PGA (poly-beta-(1,6)-N-acetylglucosamine), a major component of the extracellular polysaccharide matrix. Lacto-N-biosidase hydrolyzes lacto-N-biose (LNB) type I oligosaccharides at the nonreducing terminus to produce lacto-N-biose as part of the GNB/LNB (galacto-N-biose/lacto-N-biose I) degradation pathway.  The lacto-N-biosidase from Bifidobacterium bifidum has this GH20 domain, a carbohydrate binding module 32, and a bacterial immunoglobulin-like domain 2, as well as a YSIRK signal peptide and a G5 membrane anchor at the N and C termini, respectively. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by solvent or the enzyme, but by the substrate itself.	326
119335	cd06565	GH20_GcnA-like	Glycosyl hydrolase family 20 (GH20) catalytic domain of N-acetyl-beta-D-glucosaminidase (GcnA, also known as BhsA) and related proteins. GcnA is an exoglucosidase which cleaves N-acetyl-beta-D-glucosamine (NAG) and N-acetyl-beta-D-galactosamine residues from 4-methylumbelliferylated (4MU) substrates, as well as cleaving NAG from chito-oligosaccharides (i.e. NAG polymers). In contrast, sulfated forms of the substrate cannot be cleaved and act instead as mild competitive inhibitors. Additionally, the enzyme is known to be poisoned by several first-row transition metals as well as by mercury. GcnA forms a homodimer with subunits comprised of three domains, an N-terminal zincin-like domain, this central catalytic GH20 domain, and a C-terminal alpha helical domain. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by solvent or the enzyme, but by the substrate itself.	301
143475	cd06567	Peptidase_S41	C-terminal processing peptidase family S41. Peptidase family S41 (C-terminal processing peptidase or CTPase family) contains very different subfamilies; it includes photosystem II D1 C-terminal processing protease (CTPase), interphotoreceptor retinoid-binding protein IRBP and tricorn protease (TRI). CTPase and TRI both contain the PDZ domain while IRBP, although being very similar to the tail-specific protease domain, lacks the PDZ insertion domain and hydrolytic activity. These serine proteases have distinctly different active sites: in CTPase, the active site consists of a serine/lysine catalytic dyad while in tricorn core protease, it is a tetrad (serine, histidine, serine, glutamate). CTPases with different substrate specificities in different species include processing of D1 protein of the photosystem II reaction center in higher plants and cleavage of a peptide of 11 residues from the precursor form of penicillin-binding protein; and others such as tricorn protease (TRI) act as a carboxypeptidase, involved in the degradation of proteasomal products. CTPase homolog IRBP, secreted by photoreceptors into the interphotoreceptor matrix, having arisen in the early evolution of the vertebrate eye, promotes the release of all-trans retinol from photoreceptors and facilitates its delivery to the retinal pigment epithelium.	224
119336	cd06568	GH20_SpHex_like	A subgroup of  the Glycosyl hydrolase family 20 (GH20) catalytic domain found in proteins similar to the N-acetylhexosaminidase from Streptomyces plicatus (SpHex).  SpHex catalyzes the hydrolysis of N-acetyl-beta-hexosaminides. An Asp residue within the active site plays a critical role in substrate-assisted catalysis by orienting the 2-acetamido group and stabilizing the transition state. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by solvent or the enzyme, but by the substrate itself. Proteins belonging to this subgroup lack the C-terminal PKD (polycystic kidney disease I)-like domain found in the chitobiases.	329
119337	cd06569	GH20_Sm-chitobiase-like	The chitobiase of Serratia marcescens is a beta-N-1,4-acetylhexosaminidase with a glycosyl hydrolase family 20 (GH20) domain that hydrolyzes the beta-1,4-glycosidic linkages in oligomers derived from chitin. Chitin is degraded by a two step process: i) a chitinase hydrolyzes the chitin to oligosaccharides and disaccharides such as di-N-acetyl-D-glucosamine and chitobiose, ii) chitobiase then further degrades these oligomers into monomers. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by solvent or the enzyme, but by the substrate itself.	445
119338	cd06570	GH20_chitobiase-like_1	A functionally uncharacterized subgroup of  the Glycosyl hydrolase family 20 (GH20) catalytic domain found in proteins similar to the chitobiase of Serratia marcescens, a beta-N-1,4-acetylhexosaminidase that hydrolyzes the beta-1,4-glycosidic linkages in oligomers derived from chitin.  Chitin is degraded by a two step process: i) a chitinase hydrolyzes the chitin to oligosaccharides and disaccharides such as di-N-acetyl-D-glucosamine and chitobiose, ii) chitobiase then further degrades these oligomers into monomers. This subgroup lacks the C-terminal PKD (polycystic kidney disease I)-like domain found in the chitobiases. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by solvent or the enzyme, but by the substrate itself.	311
119330	cd06571	Bac_DnaA_C	C-terminal domain of bacterial DnaA proteins. The DNA-binding C-terminal domain of DnaA contains a helix-turn-helix motif that specifically interacts with the DnaA box, a 9-mer motif that occurs repetitively in the replication origin oriC. Multiple copies of DnaA, which is an ATPase, bind to 9-mers at the origin and form an initial complex in which the DNA strands are being separated in an ATP-dependent step.	90
119329	cd06572	Histidinol_dh	Histidinol dehydrogenase, HisD, E.C 1.1.1.23. Histidinol dehydrogenase catalyzes the last two steps in the L-histidine biosynthesis pathway, which is conserved in bacteria, archaea, fungi, and plants. These last two steps are (i) the NAD-dependent oxidation of L-histidinol to L-histidinaldehyde, and (ii) the NAD-dependent oxidation of L-histidinaldehyde to L-histidine. In most fungi and in the unicellular choanoflagellate Monosiga brevicollis, the HisD domain is fused with units that catalyze the second and third biosynthesis steps in this same pathway.	390
119325	cd06573	PASTA	PASTA domain. This domain is found at the C-termini of several Penicillin-binding proteins (PBPs) and bacterial serine/threonine kinases. It is a small globular fold consisting of 3 beta-sheets and an alpha-helix. The name PASTA is derived from PBP and Serine/Threonine kinase Associated domain.	53
119320	cd06574	TM_PBP1_branched-chain-AA_like	Transmembrane subunit (TM) of Periplasmic Binding Protein (PBP)-dependent ATP-Binding Cassette (ABC) transporters which are involved in the uptake of branched-chain amino acids (AAs), as well as TMs of transporters involved in the uptake of monosaccharides including ribose, galactose, and arabinose. These transporters generally bind type 1 PBPs. PBP-dependent ABC transporters consist of a PBP, two TMs, and two cytoplasmic ABCs, and are mainly involved in importing solutes from the environment. The solute is captured by the PBP which delivers it to a gated translocation pathway formed by the two TMs. The two ABCs bind and hydrolyze ATP and drive the transport reaction. This group includes Escherichia coli LivM and LivH, two TMs which heterodimerize to form the translocation pathway of the E. coli branched-chain AA LIV-1/LS transporter. This transporter is comprised of two TMs (LivM and LivH), two ABCs (LivG and LivF), and one of two alternative PBPs, LivJ (LIV-BP) and LivK (LS-BP). In addition to transporting branched-chain AAs including leucine, isoleucine and valine, the E. coli LIV-1/LS transporter is involved in the uptake of the aromatic AA, phenylalanine. Included in this group are proteins from transport systems that contain a single TM which homodimerizes to generate the transmembrane pore; for example E. coli RbsC, AlsC, and MglC, the TMs of the high affinity ribose transporter, the D-allose transporter and the galactose transporter, respectively. The D-allose transporter may also be involved in low affinity ribose transport.	266
119326	cd06575	PASTA_Pbp2x-like_2	PASTA domain of PBP2x-like proteins, second repeat. Penicillin-binding proteins (PBPs) are the major targets for beta-lactam antibiotics, like penicillins and cephalosporins. Beta-lactam antibiotics specifically inhibit transpeptidase activity by acylating the active site serine. PBPs catalyze key steps in the synthesis of the peptidoglycan, such as the interconnecting of glycan chains (polymers of N-glucosamine and N-acetylmuramic acid residues) and the cross-linking (transpeptidation) of short stem peptides, which are attached to glycan chains. Peptidoglycan is essential in cell division and protects bacteria from osmotic shock and lysis. PBP2x is one of the two monofunctional high molecular mass PBPs in Streptococcus pneumoniae and has been seen as the primary PBP target in beta-lactam-resistant strains. The PASTA domain is found at the C-termini of several PBPs and bacterial serine/threonine kinases. The name PASTA is derived from PBP and Serine/Threonine kinase Associated domain.	54
119327	cd06576	PASTA_Pbp2x-like_1	PASTA domain of PBP2x-like proteins, first repeat. Penicillin-binding proteins (PBPs) are the major targets for beta-lactam antibiotics, like penicillins and cephalosporins. Beta-lactam antibiotics specifically inhibit transpeptidase activity by acylating the active site serine. PBPs catalyze key steps in the synthesis of the peptidoglycan, such as the interconnecting of glycan chains (polymers of N-glucosamine and N-acetylmuramic acid residues) and the cross-linking (transpeptidation) of short stem peptides, which are connected to glycan chains. Peptidoglycan is essential in cell division and protects bacteria from osmotic shock and lysis. PBP2x is one of the two monofunctional high molecular mass PBPs in Streptococcus pneumoniae and has been seen as the primary PBP target in beta-lactam-resistant strains. The PASTA domain is found at the C-termini of several PBPs and bacterial serine/threonine kinases. The name PASTA is derived from PBP and Serine/Threonine kinase Associated domain.	55
119328	cd06577	PASTA_pknB	PASTA domain of bacterial serine/threonine kinase pknB-like proteins. PknB is a member of a group of related transmembrane sensor kinases present in many gram positive bacteria, which has been shown to regulate cell shape in Mycobacterium tuberculosis. PknB is a receptor-like transmembrane protein with an extracellular signal sensor domain (containing multiple PASTA domains) and an intracellular, eukaryotic serine/threonine kinase-like domain. The PASTA domain is found at the C-termini of several Penicillin-binding proteins (PBPs) and bacterial serine/threonine kinases.  The name PASTA is derived from PBP and Serine/Threonine kinase Associated domain.	62
119440	cd06578	HemD	Uroporphyrinogen-III synthase (HemD) catalyzes the asymmetrical cyclization of tetrapyrrole (linear) to uroporphyrinogen-III, the fourth step in the biosynthesis of heme. This ubiquitous enzyme is present in eukaryotes, bacteria and archaea. Mutations in the human uroporphyrinogen-III synthase gene cause congenital erythropoietic porphyria, a recessive inborn error of metabolism also known as Gunther disease.	239
119321	cd06579	TM_PBP1_transp_AraH_like	Transmembrane subunit (TM) of Escherichia coli AraH and related proteins. E. coli AraH is the TM of a Periplasmic Binding Protein (PBP)-dependent ATP-Binding Cassette (ABC) transporter involved in the uptake of the monosaccharide arabinose. This group also contains E. coli RbsC, AlsC, and MglC, which are TMs of other monosaccharide transporters, the ribose transporter, the D-allose transporter and the galactose transporter, respectively. The D-allose transporter may also be involved in low affinity ribose transport. These transporters generally bind type 1 PBPs. PBP-dependent ABC transporters consist of a PBP, two TMs, and two cytoplasmic ABCs, and are mainly involved in importing solutes from the environment. The solute is captured by the PBP, which delivers it to a gated translocation pathway formed by the two TMs. The two ABCs bind and hydrolyze ATP and drive the transport reaction. Proteins in this subgroup have a single TM which homodimerizes to generate the transmembrane pore.	263
119322	cd06580	TM_PBP1_transp_TpRbsC_like	Transmembrane subunit (TM) of Treponema pallidum (Tp) RbsC-1, RbsC-2 and related proteins. This is a functionally uncharacterized subgroup of TMs which belong to a larger group of TMs of Periplasmic Binding Protein (PBP)-dependent ATP-Binding Cassette (ABC) transporters, which are mainly involved in the uptake of branched-chain amino acids (AAs) or in the uptake of monosaccharides including ribose, galactose, and arabinose, and which generally bind type 1 PBPs. PBP-dependent ABC transporters consist of a PBP, two TMs, and two cytoplasmic ABCs, and are mainly involved in importing solutes from the environment. The solute is captured by the PBP, which delivers it to a gated translocation pathway formed by the two TMs. The two ABCs bind and hydrolyze ATP and drive the transport reaction.	234
119323	cd06581	TM_PBP1_LivM_like	Transmembrane subunit (TM) of Escherichia coli LivM and related proteins. LivM is one of two TMs of the E. coli LIV-1/LS transporter, a Periplasmic Binding Protein (PBP)-dependent ATP-Binding Cassette (ABC) transporter involved in the uptake of branched-chain amino acids (AAs). These types of transporters generally bind type 1 PBPs. PBP-dependent ABC transporters consist of a PBP, two TMs, and two cytoplasmic ABCs, and are mainly involved in importing solutes from the environment. The solute is captured by the PBP, which delivers it to a gated translocation pathway formed by the two TMs. The two ABCs bind and hydrolyze ATP and drive the transport reaction. E. coli LivM forms a heterodimer with another TM, LivH, to generate the transmembrane pore. LivH is not included in this subgroup. The LIV-1/LS transporter is comprised of two TMs (LivM and LivH), two ABCs (LivG and LivF), and one of two alternative PBPs, LivJ (LIV-BP) or LivK (LS-BP). In addition to transporting branched-chain AAs including leucine, isoleucine and valine, the E. coli LIV-1/LS transporter is involved in the uptake of the aromatic AA, phenylalanine.	268
119324	cd06582	TM_PBP1_LivH_like	Transmembrane subunit (TM) of Escherichia coli LivH and related proteins. LivH is one of two TMs of the E. coli LIV-1/LS transporter, a Periplasmic Binding Protein (PBP)-dependent ATP-Binding Cassette (ABC) transporter involved in the uptake of branched-chain amino acids (AAs). These types of transporters generally bind type 1 PBPs. PBP-dependent ABC transporters consist of a PBP, two TMs, and two cytoplasmic ABCs, and are mainly involved in importing solutes from the environment. The solute is captured by the PBP, which delivers it to a gated translocation pathway formed by the two TMs. The two ABCs bind and hydrolyze ATP and drive the transport reaction. E. coli LivH forms a heterodimer with another TM, LivM, to generate the transmembrane pore. LivM is not included in this subgroup. The LIV-1/LS transporter is comprised of two TMs (LivM and LivH), two ABCs (LivG and LivF), and one of two alternative PBPs, LivJ (LIV-BP) or LivK (LS-BP). In addition to transporting branched-chain AAs including leucine, isoleucine and valine, the E. coli LIV-1/LS transporter is involved in the uptake of the aromatic AA, phenylalanine.	272
133475	cd06583	PGRP	Peptidoglycan recognition proteins (PGRPs) are pattern recognition receptors that bind, and in certain cases, hydrolyze peptidoglycans (PGNs) of bacterial cell walls. PGRPs have been divided into three classes: short PGRPs (PGRP-S), that are small (20 kDa) extracellular proteins; intermediate PGRPs (PGRP-I) that are 40-45 kDa and are predicted to be transmembrane proteins; and long PGRPs (PGRP-L), up to 90 kDa, which may be either intracellular or transmembrane. Several structures of PGRPs are known in insects and mammals, some bound with substrates like Muramyl Tripeptide (MTP) or Tracheal Cytotoxin (TCT). The substrate binding site is conserved in PGRP-LCx, PGRP-LE, and PGRP-Ialpha proteins. This family includes Zn-dependent N-Acetylmuramoyl-L-alanine Amidase, EC:3.5.1.28. This enzyme cleaves the amide bond between N-acetylmuramoyl and L-amino acids, preferentially D-lactyl-L-Ala, in bacterial cell walls. The structure for the bacteriophage T7 lysozyme shows that two of the conserved histidines and a cysteine are zinc binding residues. Site-directed mutagenesis of T7 lysozyme indicates that two conserved residues, a Tyr and a Lys, are important for amidase activity.	126
132915	cd06586	TPP_enzyme_PYR	Pyrimidine (PYR) binding domain of thiamine pyrophosphate (TPP)-dependent enzymes. Thiamine pyrophosphate (TPP) family, pyrimidine (PYR) binding domain; found in many key metabolic enzymes which use TPP (also known as thiamine diphosphate) as a cofactor. TPP binds in the cleft formed by a PYR domain and a PP domain. The PYR domain binds the aminopyrimidine ring of TPP; the PP domain binds the diphosphate residue. A polar interaction between the conserved glutamate of the PYR domain and the N1' of the TPP aminopyrimidine ring is shared by most TPP-dependent enzymes, and participates in the activation of TPP. The PYR and PP domains have a common fold, but do not share strong sequence conservation. The PP domain is not included in this group. Most TPP-dependent enzymes have the PYR and PP domains on the same subunit although these domains can be alternatively arranged in the primary structure. In the case of 2-oxoisovalerate dehydrogenase (2OXO), sulfopyruvate decarboxylase (ComDE), and the E1 component of human pyruvate dehydrogenase complex (E1-PDHc) the PYR and PP domains appear on different subunits. TPP-dependent enzymes are multisubunit proteins, the smallest catalytic unit being a dimer-of-active sites. For many of these enzymes the active sites lie between PP and PYR domains on different subunits. However, for the homodimeric enzymes 1-deoxy-D-xylulose 5-phosphate synthase (DXS) and Desulfovibrio africanus pyruvate:ferredoxin oxidoreductase (PFOR), each active site lies at the interface of the PYR and PP domains from the same subunit.	154
319898	cd06587	VOC	vicinal oxygen chelate (VOC) family. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC is found in a variety of structurally related metalloproteins, including the type I extradiol dioxygenases, glyoxalase I and a group of antibiotic resistance proteins. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). Type I extradiol dioxygenases catalyze the incorporation of both atoms of molecular oxygen into aromatic substrates, which results in the cleavage of aromatic rings. They are key enzymes in the degradation of aromatic compounds. Type I extradiol dioxygenases include class I and class II enzymes. Class I and II enzymes show sequence similarity; the two-domain class II enzymes evolved from a class I enzyme through gene duplication. Glyoxalase I catalyzes the glutathione-dependent inactivation of toxic methylglyoxal, requiring zinc or nickel ions for activity. The antibiotic resistance proteins in this family use a variety of mechanisms to block the function of antibiotics. Bleomycin resistance protein (BLMA) sequesters bleomycin's activity by directly binding to it, whereas three types of fosfomycin resistance proteins employ different mechanisms to render fosfomycin inactive by modifying the fosfomycin molecule. Although the proteins in this superfamily are functionally distinct, their structures are similar. The difference among the three dimensional structures of the three types of proteins in this superfamily is interesting from an evolutionary perspective. Both glyoxalase I and BLMA show domain swapping between subunits. However, there is no domain swapping for type 1 extradiol dioxygenases.	112
319899	cd06588	PhnB_like	Escherichia coli PhnB and similar proteins. The Escherichia coli phnB gene is found next to an operon of fourteen genes (phnC-to-phnP) related to the cleavage of carbon-phosphorus (C-P) bonds in unactivated alkylphosphonates, supporting bacterial growth on alkylphosphonates as the sole phosphorus source. It was originally considered part of that operon. PhnB appears to play no direct catalytic role in the usage of alkylphosphonate. Although many of the proteins in this family have been annotated as 3-demethylubiquinone-9 3-methyltransferase enzymes by automatic annotation programs, the experimental evidence for this assignment is lacking. In Escherichia coli, the gene coding 3-demethylubiquinone-9 3-methyltransferase enzyme is ubiG, which belongs to the AdoMet-MTase protein family. PhnB-like proteins adopt a structural fold similar to bleomycin resistance proteins, glyoxalase I, and type I extradiol dioxygenases.	129
269876	cd06589	GH31	glycosyl hydrolase family 31 (GH31). GH31 enzymes occur in prokaryotes, eukaryotes, and archaea with a wide range of hydrolytic activities, including alpha-glucosidase (glucoamylase and sucrase-isomaltase), alpha-xylosidase, 6-alpha-glucosyltransferase, 3-alpha-isomaltosyltransferase and alpha-1,4-glucan lyase. All GH31 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein. In most cases, the pyranose moiety recognized in subsite -1 of the substrate binding site is an alpha-D-glucose, though some GH31 family members show a preference for alpha-D-xylose. Several GH31 enzymes can accommodate both glucose and xylose and different levels of discrimination between the two have been observed. Most characterized GH31 enzymes are alpha-glucosidases. In mammals, GH31 members with alpha-glucosidase activity are implicated in at least three distinct biological processes. The lysosomal acid alpha-glucosidase (GAA) is essential for glycogen degradation and a deficiency or malfunction of this enzyme causes glycogen storage disease II, also known as Pompe disease. In the endoplasmic reticulum, alpha-glucosidase II catalyzes the second step in the N-linked oligosaccharide processing pathway that constitutes part of the quality control system for glycoprotein folding and maturation. The intestinal enzymes sucrase-isomaltase (SI) and maltase-glucoamylase (MGAM) play key roles in the final stage of carbohydrate digestion, making alpha-glucosidase inhibitors useful in the treatment of type 2 diabetes. GH31 alpha-glycosidases are retaining enzymes that cleave their substrates via an acid/base-catalyzed, double-displacement mechanism involving a covalent glycosyl-enzyme intermediate. Two aspartic acid residues have been identified as the catalytic nucleophile and the acid/base, respectively.	265
260000	cd06590	RNase_HII_bacteria_HIII_like	Bacterial type 2 ribonuclease, HII and HIII-like. This family includes type 2 RNases H from several bacteria, such as Bacillus subtilis, which have two different RNases, HII and HIII. RNases HIII are distinguished by having a large (70-90 residues) N-terminal extension of unknown function. In addition, the active site of RNase HIII differs from that of other RNases H; replacing the fourth residue (aspartate) of the acidic "DEDD" motif with a glutamate. Most prokaryotic and eukaryotic genomes contain multiple RNase H genes; however, no prokaryotic genomes contain the combination of both RNase HI and HIII. This mutual exclusive gene inheritance might be the result of functional redundancy of RNase HI and HIII in prokaryotes. Ribonuclease (RNase) H is classified into two families, type I (prokaryotic RNase HI, eukaryotic RNase H1 and viral RNase H) and type II (prokaryotic RNase HII and HIII, archaeal RNase HII and eukaryotic RNase H2/HII). RNase H endonucleolytically hydrolyzes an RNA strand when it is annealed to a complementary DNA strand in the presence of divalent cations, in DNA replication or repair.	207
269877	cd06591	GH31_xylosidase_XylS	xylosidase XylS-like. XylS is a glycosyl hydrolase family 31 (GH31) alpha-xylosidase found in prokaryotes, eukaryotes, and archaea, that catalyzes the release of alpha-xylose from the non-reducing terminal side of the alpha-xyloside substrate. XylS has been characterized in Sulfolobus solfataricus where it hydrolyzes isoprimeverose, the p-nitrophenyl-beta derivative of isoprimeverose, and xyloglucan oligosaccharides, and has transxylosidic activity. All GH31 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein. The XylS family corresponds to subgroup 3 in the Ernst et al classification of GH31 enzymes.	322
269878	cd06592	GH31_NET37	glucosidase NET37. NET37 (also known as KIAA1161) is a human lamina-associated nuclear envelope transmembrane protein. A member of the glycosyl hydrolase family 31 (GH31), it has been shown to be required for myogenic differentiation of C2C12 cells. Related proteins are found in eukaryotes and prokaryotes. Enzymes of the GH31 family possess a wide range of different hydrolytic activities including alpha-glucosidase (glucoamylase and sucrase-isomaltase), alpha-xylosidase, 6-alpha-glucosyltransferase, 3-alpha-isomaltosyltransferase and alpha-1,4-glucan lyase. All GH31 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein.	364
269879	cd06593	GH31_xylosidase_YicI	alpha-xylosidase YicI-like. YicI alpha-xylosidase is a glycosyl hydrolase family 31 (GH31) enzyme that catalyzes the release of an alpha-xylosyl residue from the non-reducing end of alpha-xyloside substrates such as alpha-xylosyl fluoride and isoprimeverose. YicI forms a homohexamer (a trimer of dimers). All GH31 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein. The YicI family corresponds to subgroup 4 in the Ernst et al classification of GH31 enzymes.	308
269880	cd06594	GH31_glucosidase_YihQ	alpha-glucosidase YihQ-like. YihQ is a bacterial alpha-glucosidase with a conserved glycosyl hydrolase family 31 (GH31) domain that catalyzes the release of an alpha-glucosyl residue from the non-reducing end of alpha-glucoside substrates such as alpha-glucosyl fluoride. Orthologs of YihQ that have not yet been functionally characterized are present in plants and fungi. YihQ has sequence similarity to other GH31 enzymes such as CtsZ, a 6-alpha-glucosyltransferase from Bacillus globisporus, and YicI, an alpha-xylosidase from Escherichia coli. These latter two belong to different GH31 subfamilies than YihQ. In bacteria, YihQ (along with YihO) is important for bacterial O-antigen capsule assembly and translocation.	325
269881	cd06595	GH31_u1	glycosyl hydrolase family 31 (GH31); uncharacterized subgroup. This family represents an uncharacterized GH31 enzyme subgroup found in bacteria and eukaryotes. Enzymes of the GH31 family possess a wide range of different hydrolytic activities including alpha-glucosidase (glucoamylase and sucrase-isomaltase), alpha-xylosidase, 6-alpha-glucosyltransferase, 3-alpha-isomaltosyltransferase and alpha-1,4-glucan lyase. All GH31 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein.	304
269882	cd06596	GH31_CPE1046	Clostridium CPE1046-like. CPE1046 is an uncharacterized Clostridium perfringens protein with a glycosyl hydrolase family 31 (GH31) domain. The domain architecture of CPE1046 and its orthologs includes a C-terminal fibronectin type 3 (FN3) domain and a coagulation factor 5/8 type C domain in addition to the GH31 domain. Enzymes of the GH31 family possess a wide range of different hydrolytic activities including alpha-glucosidase (glucoamylase and sucrase-isomaltase), alpha-xylosidase, 6-alpha-glucosyltransferase, 3-alpha-isomaltosyltransferase and alpha-1,4-glucan lyase. All GH31 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein.	334
269883	cd06597	GH31_transferase_CtsY	CtsY (cyclic tetrasaccharide-synthesizing enzyme Y)-like. CtsY is a bacterial 3-alpha-isomaltosyltransferase, first identified in Arthrobacter globiformis, that produces cyclic tetrasaccharides together with a closely related enzyme CtsZ. CtsY and CtsZ both have a glycosyl hydrolase family 31 (GH31) catalytic domain; CtsZ belongs to a different subfamily. All GH31 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein.	326
269884	cd06598	GH31_transferase_CtsZ	CtsZ (cyclic tetrasaccharide-synthesizing enzyme Z)-like. CtsZ is a bacterial 6-alpha-glucosyltransferase, first identified in Arthrobacter globiformis, that produces cyclic tetrasaccharides together with a closely related enzyme CtsY. CtsZ and CtsY both have a glycosyl hydrolase family 31 (GH31) catalytic domain; CtsY belongs to a different subfamily. All GH31 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein.	332
269885	cd06599	GH31_glycosidase_Aec37	E.coli Aec37-like. Glycosyl hydrolase family 31 (GH31) domain of a bacterial protein family represented by Escherichia coli protein Aec37. The gene encoding Aec37 (aec-37) is located within a genomic island (AGI-3) isolated from the extraintestinal avian pathogenic Escherichia coli strain BEN2908. The function of Aec37 and its orthologs is unknown; however, deletion of a region of the genome that includes aec-37 affects the assimilation of seven carbohydrates, decreases growth rate of the strain in minimal medium containing galacturonate or trehalose, and attenuates the virulence of E. coli BEN2908 in chickens. All GH31 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein.	319
269886	cd06600	GH31_MGAM-like	maltase-glucoamylase (MGAM)-like. This family includes the following closely related glycosyl hydrolase family 31 (GH31) enzymes: maltase-glucoamylase (MGAM), sucrase-isomaltase (SI), lysosomal acid alpha-glucosidase (GAA), neutral alpha-glucosidase C (GANC), the alpha subunit of neutral alpha-glucosidase AB (GANAB), and alpha-glucosidase II. MGAM is one of the two enzymes responsible for catalyzing the last glucose-releasing step in starch digestion. SI is implicated in the digestion of dietary starch and major disaccharides such as sucrose and isomaltose, while GAA degrades glycogen in the lysosome, cleaving both alpha-1,4 and alpha-1,6 glucosidic linkages. MGAM and SI are anchored to small-intestinal brush-border epithelial cells. The absence of SI from the brush border membrane or its malfunction is associated with malabsorption disorders such as congenital sucrase-isomaltase deficiency (CSID). The domain architectures of MGAM and SI include two tandem GH31 catalytic domains, an N-terminal domain found near the membrane-bound end and a C-terminal luminal domain. Both of the tandem GH31 domains of MGAM and SI are included in this family. The domain architecture of GAA includes an N-terminal TFF (trefoil factor family) domain in addition to the GH31 catalytic domain. Deficient GAA expression causes Pompe disease, an autosomal recessive genetic disorder also known as glycogen storage disease type II (GSDII). GANC and GANAB are key enzymes in glycogen metabolism that hydrolyze terminal, non-reducing 1,4-linked alpha-D-glucose residues from glycogen in the endoplasmic reticulum. Alpha-glucosidase II is a GH31 enzyme, found in bacteria and plants, which has exo-alpha-1,4-glucosidase and oligo-1,6-glucosidase activities. Alpha-glucosidase II has been characterized in Bacillus thermoamyloliquefaciens where it forms a homohexamer. This family also includes the MalA alpha-glucosidase from Sulfolobus solfataricus and the AglA alpha-glucosidase from Picrophilus torridus. MalA is part of the carbohydrate-metabolizing machinery that allows this organism to utilize carbohydrates, such as maltose, as the sole carbon and energy source. The MGAM-like family corresponds to subgroup 1 in the Ernst et al classification of GH31 enzymes.	256
269887	cd06601	GH31_lyase_GLase	alpha-1,4-glucan lyase. GLases (alpha-1,4-glucan lyases) are glycosyl hydrolase family 31 (GH31) enzymes that degrade alpha-1,4-glucans and maltooligosaccharides via a nonhydrolytic pathway to yield 1,5-D-anhydrofructose from the nonreducing end. GLases cleave the bond between C1 and O1 of the nonreducing sugar residue of alpha-glucans to generate a monosaccharide product with a double bond between C1 and C2. This family corresponds to subgroup 2 in the Ernst et al classification of GH31 enzymes.	347
269888	cd06602	GH31_MGAM_SI_GAA	maltase-glucoamylase, sucrase-isomaltase, lysosomal acid alpha-glucosidase. This subgroup includes the following three closely related glycosyl hydrolase family 31 (GH31) enzymes: maltase-glucoamylase (MGAM), sucrase-isomaltase (SI), and lysosomal acid alpha-glucosidase (GAA), also known as acid-maltase. MGAM is one of the two enzymes responsible for catalyzing the last glucose-releasing step in starch digestion. SI is implicated in the digestion of dietary starch and major disaccharides such as sucrose and isomaltose, while GAA degrades glycogen in the lysosome, cleaving both alpha-1,4 and alpha-1,6 glucosidic linkages. MGAM and SI are anchored to small-intestinal brush-border epithelial cells. The absence of SI from the brush border membrane or its malfunction is associated with malabsorption disorders such as congenital sucrase-isomaltase deficiency (CSID). The domain architectures of MGAM and SI include two tandem GH31 catalytic domains, an N-terminal domain found near the membrane-bound end, and a C-terminal luminal domain. Both of the tandem GH31 domains of MGAM and SI are included in this family. The domain architecture of GAA includes an N-terminal TFF (trefoil factor family) domain in addition to the GH31 catalytic domain. Deficient GAA expression causes Pompe disease, an autosomal recessive genetic disorder also known as glycogen storage disease type II (GSDII).	367
269889	cd06603	GH31_GANC_GANAB_alpha	neutral alpha-glucosidase C, neutral alpha-glucosidase AB. This subgroup includes the closely related glycosyl hydrolase family 31 (GH31) isozymes, neutral alpha-glucosidase C (GANC) and the alpha subunit of heterodimeric neutral alpha-glucosidase AB (GANAB). Initially distinguished on the basis of differences in electrophoretic mobility in starch gel, GANC and GANAB have been shown to have other differences, including those of substrate specificity. GANC and GANAB are key enzymes in glycogen metabolism that hydrolyze terminal, non-reducing 1,4-linked alpha-D-glucose residues from glycogen in the endoplasmic reticulum. The GANC/GANAB family includes the alpha-glucosidase II (ModA) from Dictyostelium discoideum as well as the alpha-glucosidase II (GLS2, or ROT2 - Reversal of TOR2 lethality protein 2) from Saccharomyces cerevisiae.	467
269890	cd06604	GH31_glucosidase_II_MalA	Alpha-glucosidase II-like. Alpha-glucosidase II (alpha-D-glucoside glucohydrolase) is a glycosyl hydrolase family 31 (GH31) enzyme, found in bacteria and plants, which has exo-alpha-1,4-glucosidase and oligo-1,6-glucosidase activities. Alpha-glucosidase II has been characterized in Bacillus thermoamyloliquefaciens where it forms a homohexamer. This subgroup also includes the MalA alpha-glucosidase from Sulfolobus solfataricus and the AglA alpha-glucosidase from Picrophilus torridus. MalA is part of the carbohydrate-metabolizing machinery that allows this organism to utilize carbohydrates, such as maltose, as the sole carbon and energy source.	339
270782	cd06605	PKc_MAPKK	Catalytic domain of the dual-specificity Protein Kinase, Mitogen-Activated Protein Kinase Kinase. PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine (ST) or tyrosine residues on protein substrates. MAPKKs are dual-specificity PKs that phosphorylate their downstream targets, MAPKs, at specific threonine and tyrosine residues. The MAPK signaling pathways are important mediators of cellular responses to extracellular signals. The pathways involve a triple kinase core cascade comprising the MAPK, which is phosphorylated and activated by a MAPK kinase (MAPKK or MKK or MAP2K), which itself is phosphorylated and activated by a MAPKK kinase (MAPKKK or MKKK or MAP3K). There are three MAPK subfamilies: extracellular signal-regulated kinase (ERK), c-Jun N-terminal kinase (JNK), and p38. In mammalian cells, there are seven MAPKKs (named MKK1-7) and 20 MAPKKKs. Each MAPK subfamily can be activated by at least two cognate MAPKKs and by multiple MAPKKKs. The MAPKK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	265
270783	cd06606	STKc_MAPKKK	Catalytic domain of the Serine/Threonine Kinase, Mitogen-Activated Protein Kinase Kinase Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MAPKKKs (MKKKs or MAP3Ks) are also called MAP/ERK kinase kinases (MEKKs) in some cases. They phosphorylate and activate MAPK kinases (MAPKKs or MKKs or MAP2Ks), which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. This subfamily is composed of the Apoptosis Signal-regulating Kinases ASK1 (or MAPKKK5) and ASK2 (or MAPKKK6), MEKK1, MEKK2, MEKK3, MEKK4, as well as plant and fungal MAPKKKs. Also included in this subfamily are the cell division control proteins Schizosaccharomyces pombe Cdc7 and Saccharomyces cerevisiae Cdc15. The MAPKKK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	258
270784	cd06607	STKc_TAO	Catalytic domain of the Serine/Threonine Kinases, Thousand-and-One Amino acids proteins. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. TAO proteins possess mitogen-activated protein kinase (MAPK) kinase kinase activity. They activate the MAPKs, p38 and c-Jun N-terminal kinase (JNK), by phosphorylating and activating the respective MAP/ERK kinases (MEKs, also known as MKKs or MAPKKs), MEK3/MEK6 and MKK4/MKK7. MAPK signaling cascades are important in mediating cellular responses to extracellular signals. Vertebrates contain three TAO subfamily members, named TAO1, TAO2, and TAO3. The TAO subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	258
270785	cd06608	STKc_myosinIII_N_like	N-terminal Catalytic domain of Class III myosin-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Class III myosins are motor proteins with an N-terminal kinase catalytic domain and a C-terminal actin-binding motor domain. Class III myosins are present in the photoreceptors of invertebrates and vertebrates and in the auditory hair cells of mammals. The kinase domain of myosin III can phosphorylate several cytoskeletal proteins, conventional myosin regulatory light chains, and can autophosphorylate the C-terminal motor domain. Myosin III may play an important role in maintaining the structural integrity of photoreceptor cell microvilli. It may also function as a cargo carrier during light-dependent translocation, in photoreceptor cells, of proteins such as transducin and arrestin. The Drosophila class III myosin, called NinaC (Neither inactivation nor afterpotential protein C), is critical in normal adaptation and termination of photoresponse. Vertebrates contain two isoforms of class III myosin, IIIA and IIIB. This subfamily also includes mammalian NIK-like embryo-specific kinase (NESK), Traf2- and Nck-interacting kinase (TNIK), and mitogen-activated protein kinase (MAPK) kinase kinase kinase 4/6. MAP4Ks are involved in some MAPK signaling pathways by activating a MAPK kinase kinase. MAPK signaling cascades are important in mediating cellular responses to extracellular signals. The class III myosin-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	275
270786	cd06609	STKc_MST3_like	Catalytic domain of Mammalian Ste20-like protein kinase 3-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of MST3, MST4, STK25, Schizosaccharomyces pombe Nak1 and Sid1, Saccharomyces cerevisiae sporulation-specific protein 1 (SPS1), and related proteins. Nak1 is required by fission yeast for polarizing the tips of actin cytoskeleton and is involved in cell growth, cell separation, cell morphology and cell-cycle progression. Sid1 is a component in the septation initiation network (SIN) signaling pathway, and plays a role in cytokinesis. SPS1 plays a role in regulating proteins required for spore wall formation. MST4 plays a role in mitogen-activated protein kinase (MAPK) signaling during cytoskeletal rearrangement, morphogenesis, and apoptosis. MST3 phosphorylates the STK NDR and may play a role in cell cycle progression and cell morphology. STK25 may play a role in the regulation of cell migration and polarization. The MST3-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	274
270787	cd06610	STKc_OSR1_SPAK	Catalytic domain of the Serine/Threonine Kinases, Oxidative stress response kinase and Ste20-related proline alanine-rich kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. SPAK is also referred to as STK39 or PASK (proline-alanine-rich STE20-related kinase). OSR1 and SPAK regulate the activity of cation-chloride cotransporters through direct interaction and phosphorylation. They are also implicated in cytoskeletal rearrangement, cell differentiation, transformation and proliferation. OSR1 and SPAK contain a conserved C-terminal (CCT) domain, which recognizes a unique motif ([RK]FX[VI]) present in their activating kinases (WNK1/WNK4) and their substrates. The OSR1 and SPAK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	267
132942	cd06611	STKc_SLK_like	Catalytic domain of Ste20-Like Kinase-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Members of the subfamily include SLK, STK10 (also called LOK for Lymphocyte-Oriented Kinase), SmSLK (Schistosoma mansoni SLK), and related proteins. SLK promotes apoptosis through apoptosis signal-regulating kinase 1 (ASK1) and the mitogen-activated protein kinase (MAPK) p38. It also plays a role in mediating actin reorganization. STK10 is responsible for regulating the CD28 responsive element in T cells, as well as leukocyte function associated antigen (LFA-1)-mediated lymphocyte adhesion. SmSLK is capable of activating the MAPK Jun N-terminal kinase (JNK) pathway in human embryonic kidney cells as well as in Xenopus oocytes. It may participate in regulating MAPK cascades during host-parasite interactions. The SLK-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	280
132943	cd06612	STKc_MST1_2	Catalytic domain of the Serine/Threonine Kinases, Mammalian STe20-like protein kinase 1 and 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of MST1, MST2, and related proteins including Drosophila Hippo and Dictyostelium discoideum Krs1 (kinase responsive to stress 1). MST1/2 and Hippo are involved in a conserved pathway that governs cell contact inhibition, organ size control, and tumor development. MST1 activates the mitogen-activated protein kinases (MAPKs) p38 and c-Jun N-terminal kinase (JNK) through MKK7 and MEKK1 by acting as a MAPK kinase kinase kinase. Activation of JNK by MST1 leads to caspase activation and apoptosis. MST1 has also been implicated in cell proliferation and differentiation. Krs1 may regulate cell growth arrest and apoptosis in response to cellular stress. The MST1/2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	256
270788	cd06613	STKc_MAP4K3_like	Catalytic domain of Mitogen-activated protein kinase kinase kinase kinase (MAP4K) 3-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily includes MAP4K3, MAP4K1, MAP4K2, MAP4K5, and related proteins. Vertebrate members contain an N-terminal catalytic domain and a C-terminal citron homology (CNH) regulatory domain. MAP4K1, also called haematopoietic progenitor kinase 1 (HPK1), is a hematopoietic-specific STK involved in many cellular signaling cascades including MAPK, antigen receptor, apoptosis, growth factor, and cytokine signaling. It participates in the regulation of T cell receptor signaling and T cell-mediated immune responses. MAP4K2 was referred to as germinal center (GC) kinase because of its preferred location in GC B cells. MAP4K3 plays a role in the nutrient-responsive pathway of mTOR (mammalian target of rapamycin) signaling. It is required in the activation of S6 kinase by amino acids and for the phosphorylation of the mTOR-regulated inhibitor of eukaryotic initiation factor 4E. MAP4K5, also called germinal center kinase-related enzyme (GCKR), has been shown to activate the MAPK c-Jun N-terminal kinase (JNK). The MAP4K3-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	259
270789	cd06614	STKc_PAK	Catalytic domain of the Serine/Threonine Kinase, p21-activated kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PAKs are Rho family GTPase-regulated kinases that serve as important mediators in the function of Cdc42 (cell division cycle 42) and Rac. PAKs are implicated in the regulation of many cellular processes including growth factor receptor-mediated proliferation, cell polarity, cell motility, cell death and survival, and actin cytoskeleton organization. PAK deregulation is associated with tumor development. PAKs from higher eukaryotes are classified into two groups (I and II), according to their biochemical and structural features. Group I PAKs contain a PBD (p21-binding domain) overlapping with an AID (autoinhibitory domain), a C-terminal catalytic domain, SH3 binding sites and a non-classical SH3 binding site for PIX (PAK-interacting exchange factor). Group II PAKs contain a PBD and a catalytic domain, but lack other motifs found in group I PAKs. Since group II PAKs do not contain an obvious AID, they may be regulated differently from group I PAKs. Group I PAKs interact with the SH3 containing proteins Nck, Grb2 and PIX; no such binding has been demonstrated for group II PAKs. The PAK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	255
132946	cd06615	PKc_MEK	Catalytic domain of the dual-specificity Protein Kinase, Mitogen-Activated Protein (MAP)/Extracellular signal-Regulated Kinase (ERK) Kinase. PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine (ST) or tyrosine residues on protein substrates. MEK1 and MEK2 are MAPK kinases (MAPKKs or MKKs), and are dual-specificity PKs that phosphorylate and activate the downstream targets, ERK1 and ERK2, on specific threonine and tyrosine residues. The ERK cascade starts with extracellular signals including growth factors, hormones, and neurotransmitters, which act through receptors and ion channels to initiate intracellular signaling that leads to the activation at the MAPKKK (Raf-1 or MOS) level, which leads to the transmission of signals to MEK1/2, and finally to ERK1/2. The ERK cascade plays an important role in cell proliferation, differentiation, oncogenic transformation, and cell cycle control, as well as in apoptosis and cell survival under certain conditions. This cascade has also been implicated in synaptic plasticity, migration, morphological determination, and stress response immunological reactions. Gain-of-function mutations in genes encoding ERK cascade proteins, including MEK1/2, cause cardiofaciocutaneous (CFC) syndrome, a condition leading to multiple congenital anomalies and mental retardation in patients. The MEK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	308
270790	cd06616	PKc_MKK4	Catalytic domain of the dual-specificity Protein Kinase, Mitogen-activated protein Kinase Kinase 4. PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine (ST) or tyrosine residues on protein substrates. MKK4 is a dual-specificity PK that phosphorylates and activates the downstream targets, c-Jun N-terminal kinase (JNK) and p38 MAPK, on specific threonine and tyrosine residues. JNK and p38 are collectively known as stress-activated MAPKs, as they are activated in response to a variety of environmental stresses and pro-inflammatory cytokines. Their activation is associated with the induction of cell death. Mice deficient in MKK4 die during embryogenesis and display anemia, severe liver hemorrhage, and abnormal hepatogenesis. MKK4 may also play roles in the immune system and in cardiac hypertrophy. It plays a major role in cancer as a tumor and metastasis suppressor. Under certain conditions, MKK4 is pro-oncogenic. The MKK4 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	291
173729	cd06617	PKc_MKK3_6	Catalytic domain of the dual-specificity Protein Kinases, Mitogen-activated protein Kinase Kinases 3 and 6. PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine (ST) or tyrosine residues on protein substrates. MKK3 and MKK6 are dual-specificity PKs that phosphorylate and activate their downstream target, p38 MAPK, on specific threonine and tyrosine residues. MKK3/6 play roles in the regulation of cell cycle progression, cytokine- and stress-induced apoptosis, oncogenic transformation, and adult tissue regeneration. In addition, MKK6 plays a critical role in osteoclast survival in inflammatory disease while MKK3 is associated with tumor invasion, progression, and poor patient survival in glioma. The MKK3/6 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	283
270791	cd06618	PKc_MKK7	Catalytic domain of the dual-specificity Protein Kinase, Mitogen-activated protein Kinase Kinase 7. PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine (ST) or tyrosine residues on protein substrates. MKK7 is a dual-specificity PK that phosphorylates and activates its downstream target, c-Jun N-terminal kinase (JNK), on specific threonine and tyrosine residues. Although MKK7 is capable of dual phosphorylation, it prefers to phosphorylate the threonine residue of JNK. Thus, optimal activation of JNK requires both MKK4 and MKK7. MKK7 is primarily activated by cytokines. MKK7 is essential for liver formation during embryogenesis. It plays roles in G2/M cell cycle arrest and cell growth. In addition, it is involved in the control of programmed cell death, which is crucial in oncogenesis, cancer chemoresistance, and antagonism to TNFalpha-induced killing, through its inhibition by Gadd45beta and the subsequent suppression of the JNK cascade. The MKK7 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	295
132950	cd06619	PKc_MKK5	Catalytic domain of the dual-specificity Protein Kinase, Mitogen-activated protein Kinase Kinase 5. PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine (ST) or tyrosine residues on protein substrates. MKK5 (also called MEK5) is a dual-specificity PK that phosphorylates its downstream target, extracellular signal-regulated kinase 5 (ERK5), on specific threonine and tyrosine residues. MKK5 is activated by MEKK2 and MEKK3 in response to mitogenic and stress stimuli. The ERK5 cascade promotes cell proliferation, differentiation, neuronal survival, and neuroprotection. This cascade plays an essential role in heart development. Mice deficient in either ERK5 or MKK5 die around embryonic day 10 due to cardiovascular defects including underdevelopment of the myocardium. In addition, MKK5 is associated with metastasis and unfavorable prognosis in prostate cancer. The MKK5 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	279
270792	cd06620	PKc_Byr1_like	Catalytic domain of fungal Byr1-like dual-specificity Mitogen-activated protein Kinase Kinases. PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine (ST) or tyrosine residues on protein substrates. Members of this group include the MAPKKs Byr1 from Schizosaccharomyces pombe, FUZ7 from Ustilago maydis, and related proteins. Byr1 phosphorylates its downstream target, the MAPK Spk1, and is regulated by the MAPKK kinase Byr2. The Spk1 cascade is pheromone-responsive and is essential for sporulation and sexual differentiation in fission yeast. FUZ7 phosphorylates and activates its target, the MAPK Crk1, which is required in mating and virulence in U. maydis. MAPK signaling pathways are important mediators of cellular responses to extracellular signals. The Byr1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	286
270793	cd06621	PKc_Pek1_like	Catalytic domain of fungal Pek1-like dual-specificity Mitogen-Activated Protein Kinase Kinases. PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine (ST) or tyrosine residues on protein substrates. Members of this group include the MAPKKs Pek1/Skh1 from Schizosaccharomyces pombe and MKK2 from Saccharomyces cerevisiae, and related proteins. Both fission yeast Pek1 and baker's yeast MKK2 are components of the cell integrity MAPK pathway. In fission yeast, Pek1 phosphorylates and activates Pmk1/Spm1 and is regulated by the MAPKK kinase Mkh1. In baker's yeast, the pathway involves the MAPK Slt2, the MAPKKs MKK1 and MKK2, and the MAPKK kinase Bck1. The cell integrity MAPK cascade is activated by multiple stress conditions, and is essential for cell wall construction, morphogenesis, cytokinesis, and ion homeostasis. MAPK signaling pathways are important mediators of cellular responses to extracellular signals. The MAPKK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	287
132953	cd06622	PKc_PBS2_like	Catalytic domain of fungal PBS2-like dual-specificity Mitogen-Activated Protein Kinase Kinases. PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine (ST) or tyrosine residues on protein substrates. Members of this group include the MAPKKs Polymyxin B resistance protein 2 (PBS2) from Saccharomyces cerevisiae, Wis1 from Schizosaccharomyces pombe, and related proteins. PBS2 and Wis1 are components of stress-activated MAPK cascades in budding and fission yeast, respectively. PBS2 is the specific activator of the MAPK Hog1, which plays a central role in the response of budding yeast to stress including exposure to arsenite and hyperosmotic environments. Wis1 phosphorylates and activates the MAPK Sty1 (also called Spc1 or Phh1), which stimulates a transcriptional response to a wide range of cellular insults through the bZip transcription factors Atf1, Pcr1, and Pap1. The PBS2 subfamily is part of a larger superfamily that includes the catalytic domains of STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	286
132954	cd06623	PKc_MAPKK_plant_like	Catalytic domain of Plant dual-specificity Mitogen-Activated Protein Kinase Kinases and similar proteins. PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine (ST) or tyrosine residues on protein substrates. Members of this group include MAPKKs from plants, kinetoplastids, alveolates, and mycetozoa. The MAPKK, LmxPK4, from Leishmania mexicana, is important in differentiation and virulence. Dictyostelium discoideum MEK1 is required for proper chemotaxis; MEK1 null mutants display severe defects in cell polarization and directional movement. Plants contain multiple MAPKKs like other eukaryotes. The Arabidopsis genome encodes 10 MAPKKs while poplar and rice contain 13 MAPKKs each. The functions of these proteins have not been fully elucidated. There is evidence to suggest that MAPK cascades are involved in plant stress responses. In Arabidopsis, MKK3 plays a role in pathogen signaling; MKK2 is involved in cold and salt stress signaling; MKK4/MKK5 participate in innate immunity; and MKK7 regulates basal and systemic acquired resistance. The MAPKK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	264
270794	cd06624	STKc_ASK	Catalytic domain of the Serine/Threonine Kinase, Apoptosis signal-regulating kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Members of this subfamily are mitogen-activated protein kinase (MAPK) kinase kinases (MAPKKKs or MKKKs) and include ASK1, ASK2, and MAPKKK15. ASK1 (also called MAPKKK5) functions in the c-Jun N-terminal kinase (JNK) and p38 MAPK signaling pathways by directly activating their respective MAPKKs, MKK4/MKK7 and MKK3/MKK6. It plays important roles in cytokine and stress responses, as well as in reactive oxygen species-mediated cellular responses. ASK1 is implicated in various diseases mediated by oxidative stress including ischemic heart disease, hypertension, vessel injury, brain ischemia, Fanconi anemia, asthma, and pulmonary edema, among others. ASK2 (also called MAPKKK6) functions only in a heteromeric complex with ASK1, and can activate ASK1 by direct phosphorylation. The function of MAPKKK15 is still unknown. The ASK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	268
270795	cd06625	STKc_MEKK3_like	Catalytic domain of Mitogen-Activated Protein (MAP)/Extracellular signal-Regulated Kinase (ERK) Kinase Kinase 3-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of MEKK3, MEKK2, and related proteins; all contain an N-terminal PB1 domain, which mediates oligomerization, and a C-terminal catalytic domain. MEKK2 and MEKK3 are MAPK kinase kinases (MAPKKKs or MKKK) that activate MEK5 (also called MKK5), which activates ERK5. The ERK5 cascade plays roles in promoting cell proliferation, differentiation, neuronal survival, and neuroprotection. MEKK3 plays an essential role in embryonic angiogenesis and early heart development. MEKK2 and MEKK3 can also activate the MAPKs, c-Jun N-terminal kinase (JNK) and p38, through their respective MAPKKs. The MEKK3-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	260
270796	cd06626	STKc_MEKK4	Catalytic domain of the Protein Serine/Threonine Kinase, Mitogen-Activated Protein (MAP)/Extracellular signal-Regulated Kinase (ERK) Kinase Kinase 4. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MEKK4 is a MAPK kinase kinase that phosphorylates and activates the c-Jun N-terminal kinase (JNK) and p38 MAPK signaling pathways by directly activating their respective MAPKKs, MKK4/MKK7 and MKK3/MKK6. JNK and p38 are collectively known as stress-activated MAPKs, as they are activated in response to a variety of environmental stresses and pro-inflammatory cytokines. MEKK4 also plays roles in the re-polarization of the actin cytoskeleton in response to osmotic stress, in the proper closure of the neural tube, in cardiovascular development, and in immune responses. The MEKK4 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	265
270797	cd06627	STKc_Cdc7_like	Catalytic domain of Cell division control protein 7-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Members of this subfamily include Schizosaccharomyces pombe Cdc7, Saccharomyces cerevisiae Cdc15, Arabidopsis thaliana mitogen-activated protein kinase kinase kinase (MAPKKK) epsilon, and related proteins. MAPKKKs phosphorylate and activate MAPK kinases, which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. Fission yeast Cdc7 is essential for cell division by playing a key role in the initiation of septum formation and cytokinesis. Budding yeast Cdc15 functions to coordinate mitotic exit with cytokinesis. Arabidopsis MAPKKK epsilon is required for pollen development in the plasma membrane. The Cdc7-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	254
270798	cd06628	STKc_Byr2_like	Catalytic domain of the Serine/Threonine Kinases, fungal Byr2-like Mitogen-Activated Protein Kinase Kinase Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Members of this group include the MAPKKKs Schizosaccharomyces pombe Byr2, Saccharomyces cerevisiae and Cryptococcus neoformans Ste11, and related proteins. They contain an N-terminal SAM (sterile alpha-motif) domain, which mediates protein-protein interaction, and a C-terminal catalytic domain. MAPKKKs phosphorylate and activate MAPK kinases, which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. Fission yeast Byr2 is regulated by Ras1. It responds to pheromone signaling and controls mating through the MAPK pathway. Budding yeast Ste11 functions in MAPK cascades that regulate mating, high osmolarity glycerol, and filamentous growth responses. The Byr2 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	267
270799	cd06629	STKc_Bck1_like	Catalytic domain of the Serine/Threonine Kinases, fungal Bck1-like Mitogen-Activated Protein Kinase Kinase Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Members of this group include the MAPKKKs Saccharomyces cerevisiae Bck1 and Schizosaccharomyces pombe Mkh1, and related proteins. Budding yeast Bck1 is part of the cell integrity MAPK pathway, which is activated by stresses and aggressions to the cell wall. The MAPKKK Bck1, MAPKKs Mkk1 and Mkk2, and the MAPK Slt2 make up the cascade that is important in the maintenance of cell wall homeostasis. Fission yeast Mkh1 is involved in MAPK cascades regulating cell morphology, cell wall integrity, salt resistance, and filamentous growth in response to stress. MAPKKKs phosphorylate and activate MAPK kinases, which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. The Bck1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	270
270800	cd06630	STKc_MEKK1	Catalytic domain of the Protein Serine/Threonine Kinase, Mitogen-Activated Protein (MAP)/Extracellular signal-Regulated Kinase (ERK) Kinase Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MEKK1 is a MAPK kinase kinase (MAPKKK or MKKK) that phosphorylates and activates the ERK1/2 and c-Jun N-terminal kinase (JNK) pathways by activating their respective MAPKKs, MEK1/2 and MKK4/MKK7, respectively. MEKK1 is important in regulating cell survival and apoptosis. MEKK1 also plays a role in cell migration, tissue maintenance and homeostasis, and wound healing. The MEKK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	268
270801	cd06631	STKc_YSK4	Catalytic domain of the Serine/Threonine Kinase, Yeast Sps1/Ste20-related Kinase 4. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. YSK4 is a putative MAPKKK, whose mammalian gene has been isolated. MAPKKKs phosphorylate and activate MAPK kinases, which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. The YSK4 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	266
270802	cd06632	STKc_MEKK1_plant	Catalytic domain of the Serine/Threonine Kinase, Plant Mitogen-Activated Protein (MAP)/Extracellular signal-Regulated Kinase (ERK) Kinase Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of plant MAPK kinase kinases (MAPKKKs) including Arabidopsis thaliana MEKK1 and MAPKKK3. Arabidopsis thaliana MEKK1 activates MPK4, a MAPK that regulates systemic acquired resistance. MEKK1 also participates in the regulation of temperature-sensitive and tissue-specific cell death. MAPKKKs phosphorylate and activate MAPK kinases, which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. The plant MEKK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	259
270803	cd06633	STKc_TAO3	Catalytic domain of the Serine/Threonine Kinase, Thousand-and-One Amino acids 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. TAO3 is also known as JIK (c-Jun N-terminal kinase inhibitory kinase) or KFC (kinase from chicken). It specifically activates JNK, presumably by phosphorylating and activating MKK4/MKK7. In Saccharomyces cerevisiae, TAO3 is a component of the RAM (regulation of Ace2p activity and cellular morphogenesis) signaling pathway. TAO3 is upregulated in retinal ganglion cells after axotomy, and may play a role in apoptosis. TAO proteins possess mitogen-activated protein kinase (MAPK) kinase kinase activity. MAPK signaling cascades are important in mediating cellular responses to extracellular signals. The TAO3 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	313
270804	cd06634	STKc_TAO2	Catalytic domain of the Serine/Threonine Kinase, Thousand-and-One Amino acids 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Human TAO2 is also known as prostate-derived Ste20-like kinase (PSK) and was identified in a screen for overexpressed RNAs in prostate cancer. TAO2 possesses mitogen-activated protein kinase (MAPK) kinase kinase activity and activates both p38 and c-Jun N-terminal kinase (JNK), by phosphorylating and activating their respective MAP/ERK kinases, MEK3/MEK6 and MKK4/MKK7. It contains a long C-terminal extension with autoinhibitory segments, and is activated by the release of this inhibition and the phosphorylation of its activation loop serine. TAO2 functions as a regulator of actin cytoskeletal and microtubule organization. In addition, it regulates the transforming growth factor-activated kinase 1 (TAK1), which is a MAPKKK that plays an essential role in the signaling pathways of tumor necrosis factor, interleukin 1, and Toll-like receptor. The TAO2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	308
270805	cd06635	STKc_TAO1	Catalytic domain of the Serine/Threonine Kinase, Thousand-and-One Amino acids 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. TAO1 is sometimes referred to as prostate-derived sterile 20-like kinase 2 (PSK2). TAO1 activates the p38 MAPK through direct interaction with and activation of MEK3. TAO1 is highly expressed in the brain and may play a role in neuronal apoptosis. TAO1 interacts with the checkpoint proteins BubR1 and Mad2, and plays an important role in regulating mitotic progression, which is required for both chromosome congression and checkpoint-induced anaphase delay. TAO1 may play a role in protecting genomic stability. TAO proteins possess MAPK kinase kinase activity. MAPK signaling cascades are important in mediating cellular responses to extracellular signals. The TAO1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	317
270806	cd06636	STKc_MAP4K4_6_N	N-terminal Catalytic domain of the Serine/Threonine Kinases, Mitogen-Activated Protein Kinase Kinase Kinase Kinase 4 and 6. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Members of this subfamily contain an N-terminal catalytic domain and a C-terminal citron homology (CNH) regulatory domain. MAP4K4 is also called Nck Interacting kinase (NIK). It facilitates the activation of the MAPKs, extracellular signal-regulated kinase (ERK) 1, ERK2, and c-Jun N-terminal kinase (JNK), by phosphorylating and activating MEKK1. MAP4K4 plays a role in tumor necrosis factor (TNF) alpha-induced insulin resistance. MAP4K4 silencing in skeletal muscle cells from type II diabetic patients restores insulin-mediated glucose uptake. MAP4K4, through JNK, also plays a broad role in cell motility, which impacts inflammation, homeostasis, as well as the invasion and spread of cancer. MAP4K4 is found to be highly expressed in most tumor cell lines relative to normal tissue. MAP4K6 (also called MINK for Misshapen/NIKs-related kinase) is activated after Ras induction and mediates activation of p38 MAPK. MAP4K6 plays a role in cell cycle arrest, cytoskeleton organization, cell adhesion, and cell motility. The MAP4K4/6 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	282
270807	cd06637	STKc_TNIK	Catalytic domain of the Serine/Threonine Kinase, Traf2- and Nck-Interacting Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. TNIK is an effector of Rap2, a small GTP-binding protein from the Ras family. TNIK specifically activates the c-Jun N-terminal kinase (JNK) pathway and plays a role in regulating the actin cytoskeleton. The TNIK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	296
132969	cd06638	STKc_myosinIIIA_N	N-terminal Catalytic domain of the Serine/Threonine Kinase, Class IIIA myosin. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Class IIIA myosin is highly expressed in retina and in inner ear hair cells. It is localized to the distal ends of actin-bundled structures. Mutations in human myosin IIIA are responsible for progressive nonsyndromic hearing loss. Human myosin IIIA possesses ATPase and kinase activities, and the ability to move actin filaments in a motility assay. It may function as a cellular transporter capable of moving along actin bundles in sensory cells. Class III myosins are motor proteins containing an N-terminal kinase catalytic domain and a C-terminal actin-binding domain. Class III myosins may play an important role in maintaining the structural integrity of photoreceptor cell microvilli. In photoreceptor cells, they may also function as cargo carriers during light-dependent translocation of proteins such as transducin and arrestin. The class III myosin subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	286
270808	cd06639	STKc_myosinIIIB_N	N-terminal Catalytic domain of the Serine/Threonine Kinase, Class IIIB myosin. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Class IIIB myosin is expressed highly in retina. It is also present in the brain and testis. The human class IIIB myosin gene maps to a region that overlaps the locus for Bardet-Biedl syndrome, which is characterized by dysmorphic extremities, retinal dystrophy, obesity, male hypogenitalism, and renal abnormalities. Class III myosins are motor proteins containing an N-terminal kinase catalytic domain and a C-terminal actin-binding domain. They may play an important role in maintaining the structural integrity of photoreceptor cell microvilli. They may also function as cargo carriers during light-dependent translocation, in photoreceptor cells, of proteins such as transducin and arrestin. The class III myosin subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	291
132971	cd06640	STKc_MST4	Catalytic domain of the Serine/Threonine Kinase, Mammalian Ste20-like protein kinase 4. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MST4 is sometimes referred to as MASK (MST3 and SOK1-related kinase). It plays a role in mitogen-activated protein kinase (MAPK) signaling during cytoskeletal rearrangement, morphogenesis, and apoptosis. It influences cell growth and transformation by modulating the extracellular signal-regulated kinase (ERK) pathway. MST4 may also play a role in tumor formation and progression. It localizes in the Golgi apparatus by interacting with the Golgi matrix protein GM130 and may play a role in cell migration. The MST4 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	277
270809	cd06641	STKc_MST3	Catalytic domain of the Serine/Threonine Kinase, Mammalian Ste20-like protein kinase 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MST3 phosphorylates the STK NDR and may play a role in cell cycle progression and cell morphology. It may also regulate paxillin and consequently, cell migration. MST3 is present in human placenta, where it plays an essential role in the oxidative stress-induced apoptosis of trophoblasts in normal spontaneous delivery. Dysregulation of trophoblast apoptosis may result in pregnancy complications such as preeclampsia and intrauterine growth retardation. The MST3 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	277
270810	cd06642	STKc_STK25	Catalytic domain of Serine/Threonine Kinase 25 (also called Yeast Sps1/Ste20-related kinase 1). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. STK25 is also called Ste20/oxidant stress response kinase 1 (SOK1) or yeast Sps1/Ste20-related kinase 1 (YSK1). It is localized in the Golgi apparatus through its interaction with the Golgi matrix protein GM130. It may be involved in the regulation of cell migration and polarization. STK25 binds and phosphorylates CCM3 (cerebral cavernous malformation 3), also called PCD10 (programmed cell death 10), and may play a role in apoptosis. Human STK25 is a candidate gene responsible for pseudopseudohypoparathyroidism (PPHP), a disease that shares features with the Albright hereditary osteodystrophy (AHO) phenotype. The STK25 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	277
270811	cd06643	STKc_SLK	Catalytic domain of the Serine/Threonine Kinase, Ste20-Like Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. SLK promotes apoptosis through apoptosis signal-regulating kinase 1 (ASK1) and the mitogen-activated protein kinase (MAPK) p38. It acts as a MAPK kinase kinase by phosphorylating ASK1, resulting in the phosphorylation of p38. SLK also plays a role in mediating actin reorganization. It is part of a microtubule-associated complex that is targeted at adhesion sites, and is required in focal adhesion turnover and in regulating cell migration. The SLK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	283
132975	cd06644	STKc_STK10	Catalytic domain of the Serine/Threonine Kinase, STK10 (also called Lymphocyte-Oriented Kinase or LOK). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. STK10/LOK is also called polo-like kinase kinase 1 in Xenopus (xPlkk1). It is highly expressed in lymphocytes and is responsible for regulating leukocyte function associated antigen (LFA-1)-mediated lymphocyte adhesion. It plays a role in regulating the CD28 responsive element in T cells, and may also function as a regulator of polo-like kinase 1 (Plk1), a protein which is overexpressed in multiple tumor types. The STK10 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	292
270812	cd06645	STKc_MAP4K3	Catalytic domain of the Serine/Threonine Kinase, Mitogen-activated protein kinase kinase kinase kinase 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MAP4K3 plays a role in the nutrient-responsive pathway of mTOR (mammalian target of rapamycin) signaling. MAP4K3 is required in the activation of S6 kinase by amino acids and for the phosphorylation of the mTOR-regulated inhibitor of eukaryotic initiation factor 4E. mTOR regulates ribosome biogenesis and protein translation, and is frequently deregulated in cancer. MAP4Ks are involved in MAPK signaling pathways by activating a MAPK kinase kinase. Each MAPK cascade is activated either by a small GTP-binding protein or by an adaptor protein, which transmits the signal either directly to a MAP3K to start the triple kinase core cascade or indirectly through a mediator kinase, a MAP4K. Members of this subfamily contain an N-terminal catalytic domain and a C-terminal citron homology (CNH) regulatory domain. The MAP4K3 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	272
270813	cd06646	STKc_MAP4K5	Catalytic domain of the Serine/Threonine Kinase, Mitogen-activated protein kinase kinase kinase kinase 5. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MAP4K5, also called germinal center kinase-related enzyme (GCKR), has been shown to activate the MAPK c-Jun N-terminal kinase (JNK). MAP4K5 also facilitates Wnt signaling in B cells, and may therefore be implicated in the control of cell fate, proliferation, and polarity. MAP4Ks are involved in some MAPK signaling pathways by activating a MAPK kinase kinase. Each MAPK cascade is activated either by a small GTP-binding protein or by an adaptor protein, which transmits the signal either directly to a MAP3K to start the triple kinase core cascade or indirectly through a mediator kinase, a MAP4K. Members of this subfamily contain an N-terminal catalytic domain and a C-terminal citron homology (CNH) regulatory domain. The MAP4K5 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	268
270814	cd06647	STKc_PAK_I	Catalytic domain of the Serine/Threonine Kinase, Group I p21-activated kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Group I PAKs, also called conventional PAKs, include PAK1, PAK2, and PAK3. Group I PAKs contain a PBD (p21-binding domain) overlapping with an AID (autoinhibitory domain), a C-terminal catalytic domain, SH3 binding sites and a non-classical SH3 binding site for PIX (PAK-interacting exchange factor). They interact with the SH3 domain containing proteins Nck, Grb2 and PIX. Binding of group I PAKs to activated GTPases leads to conformational changes that destabilize the AID, allowing autophosphorylation and full activation of the kinase domain. Known group I PAK substrates include MLCK, Bad, Raf, MEK1, LIMK, Merlin, Vimentin, Myc, Stat5a, and Aurora A, among others. PAKs are Rho family GTPase-regulated kinases that serve as important mediators in the function of Cdc42 (cell division cycle 42) and Rac. PAKs are implicated in the regulation of many cellular processes including growth factor receptor-mediated proliferation, cell polarity, cell motility, cell death and survival, and actin cytoskeleton organization. The PAK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	261
270815	cd06648	STKc_PAK_II	Catalytic domain of the Serine/Threonine Kinase, Group II p21-activated kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Group II PAKs, also called non-conventional PAKs, include PAK4, PAK5, and PAK6. Group II PAKs contain PBD (p21-binding domain) and catalytic domains, but lack other motifs found in group I PAKs, such as an AID (autoinhibitory domain) and SH3 binding sites. Since group II PAKs do not contain an obvious AID, they may be regulated differently from group I PAKs. While group I PAKs interact with the SH3 containing proteins Nck, Grb2 and PIX, no such binding has been demonstrated for group II PAKs. Some known substrates of group II PAKs are also substrates of group I PAKs such as Raf, BAD, LIMK and GEFH1. Unique group II substrates include MARK/Par-1 and PDZ-RhoGEF. Group II PAKs play important roles in filopodia formation, neuron extension, cytoskeletal organization, and cell survival. PAKs are Rho family GTPase-regulated kinases that serve as important mediators in the function of Cdc42 (cell division cycle 42) and Rac. The PAK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	261
132980	cd06649	PKc_MEK2	Catalytic domain of the dual-specificity Protein Kinase, Mitogen-Activated Protein (MAP)/Extracellular signal-Regulated Kinase (ERK) Kinase 2. PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine (ST) or tyrosine residues on protein substrates. MEK2 is a dual-specificity PK and a MAPK kinase (MAPKK or MKK) that phosphorylates and activates the downstream targets, ERK1 and ERK2, on specific threonine and tyrosine residues. The ERK cascade starts with extracellular signals including growth factors, hormones, and neurotransmitters, which act through receptors and ion channels to initiate intracellular signaling that leads to the activation at the MAPKKK (Raf-1 or MOS) level, which leads to the transmission of signals to MEK2, and finally to ERK1/2. The ERK cascade plays an important role in cell proliferation, differentiation, oncogenic transformation, and cell cycle control, as well as in apoptosis and cell survival under certain conditions. Gain-of-function mutations in genes encoding ERK cascade proteins, including MEK2, cause cardiofaciocutaneous (CFC) syndrome, a condition leading to multiple congenital anomalies and mental retardation in patients. The MEK subfamily is part of a larger superfamily that includes the catalytic domains of other protein serine/threonine kinases, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	331
270816	cd06650	PKc_MEK1	Catalytic domain of the dual-specificity Protein Kinase, Mitogen-Activated Protein (MAP)/Extracellular signal-Regulated Kinase (ERK) Kinase 1. PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine (ST) or tyrosine residues on protein substrates. MEK1 is a dual-specificity PK and a MAPK kinase (MAPKK or MKK) that phosphorylates and activates the downstream targets, ERK1 and ERK2, on specific threonine and tyrosine residues. The ERK cascade starts with extracellular signals including growth factors, hormones, and neurotransmitters, which act through receptors and ion channels to initiate intracellular signaling that leads to the activation at the MAPKKK (Raf-1 or MOS) level, which leads to the transmission of signals to MEK1, and finally to ERK1/2. The ERK cascade plays an important role in cell proliferation, differentiation, oncogenic transformation, and cell cycle control, as well as in apoptosis and cell survival under certain conditions. Gain-of-function mutations in genes encoding ERK cascade proteins, including MEK1, cause cardiofaciocutaneous (CFC) syndrome, a condition leading to multiple congenital anomalies and mental retardation in patients. MEK1 also plays a role in cell cycle control. The MEK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	319
270817	cd06651	STKc_MEKK3	Catalytic domain of the Serine/Threonine Kinase, Mitogen-Activated Protein (MAP)/Extracellular signal-Regulated Kinase (ERK) Kinase Kinase 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MEKK3 is a MAPK kinase kinase (MAPKKK or MKKK), that phosphorylates and activates the MAPK kinase MEK5 (or MKK5), which in turn phosphorylates and activates ERK5. The ERK5 cascade plays roles in promoting cell proliferation, differentiation, neuronal survival, and neuroprotection. MEKK3 plays an essential role in embryonic angiogenesis and early heart development. In addition, MEKK3 is involved in interleukin-1 receptor and Toll-like receptor 4 signaling. It is also a specific regulator of the proinflammatory cytokines IL-6 and GM-CSF in some immune cells. MEKK3 also regulates calcineurin, which plays a critical role in T cell activation, apoptosis, skeletal myocyte differentiation, and cardiac hypertrophy. The MEKK3 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	271
270818	cd06652	STKc_MEKK2	Catalytic domain of the Serine/Threonine Kinase, Mitogen-Activated Protein (MAP)/Extracellular signal-Regulated Kinase (ERK) Kinase Kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MEKK2 is a MAPK kinase kinase (MAPKKK or MKKK), that phosphorylates and activates the MAPK kinase MEK5 (or MKK5), which in turn phosphorylates and activates ERK5. The ERK5 cascade plays roles in promoting cell proliferation, differentiation, neuronal survival, and neuroprotection. MEKK2 also activates ERK1/2, c-Jun N-terminal kinase (JNK) and p38 through their respective MAPKKs MEK1/2, JNK-activating kinase 2 (JNKK2), and MKK3/6. MEKK2 plays roles in T cell receptor signaling, immune synapse formation, cytokine gene expression, as well as in EGF and FGF receptor signaling. The MEKK2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	264
270819	cd06653	STKc_MEKK3_like_u1	Catalytic domain of an Uncharacterized subfamily of Mitogen-Activated Protein (MAP)/Extracellular signal-Regulated Kinase (ERK) Kinase Kinase 3-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of uncharacterized proteins with similarity to MEKK3, MEKK2, and related proteins; they contain an N-terminal PB1 domain, which mediates oligomerization, and a C-terminal catalytic domain. MEKK2 and MEKK3 are MAPK kinase kinases (MAPKKKs or MKKKs), proteins that phosphorylate and activate MAPK kinases (MAPKKs or MKKs), which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. MEKK2 and MEKK3 activate MEK5 (also called MKK5), which activates ERK5. The ERK5 cascade plays roles in promoting cell proliferation, differentiation, neuronal survival, and neuroprotection. MEKK3 plays an essential role in embryonic angiogenesis and early heart development. MEKK2 and MEKK3 can also activate the MAPKs, c-Jun N-terminal kinase (JNK) and p38, through their respective MAPKKs. The MEKK3-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	264
270820	cd06654	STKc_PAK1	Catalytic domain of the Serine/Threonine Kinase, p21-activated kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PAK1 is important in the regulation of many cellular processes including cytoskeletal dynamics, cell motility, growth, and proliferation. Although PAK1 has been regarded mainly as a cytosolic protein, recent reports indicate that PAK1 also exists in significant amounts in the nucleus, where it is involved in transcription modulation and in cell cycle regulatory events. PAK1 is also involved in transformation and tumorigenesis. Its overexpression, hyperactivation and increased nuclear accumulation is correlated to breast cancer invasiveness and progression. Nuclear accumulation is also linked to tamoxifen resistance in breast cancer cells. PAK1 belongs to the group I PAKs, which contain a PBD (p21-binding domain) overlapping with an AID (autoinhibitory domain), a C-terminal catalytic domain, SH3 binding sites and a non-classical SH3 binding site for PIX (PAK-interacting exchange factor). PAKs are Rho family GTPase-regulated kinases that serve as important mediators in the function of Cdc42 (cell division cycle 42) and Rac. The PAK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	296
132986	cd06655	STKc_PAK2	Catalytic domain of the Serine/Threonine Kinase, p21-activated kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PAK2 plays a role in pro-apoptotic signaling. It is cleaved and activated by caspases leading to morphological changes during apoptosis. PAK2 is also activated in response to a variety of stresses including DNA damage, hyperosmolarity, serum starvation, and contact inhibition, and may play a role in coordinating the stress response. PAK2 also contributes to cancer cell invasion through a mechanism distinct from that of PAK1. It belongs to the group I PAKs, which contain a PBD (p21-binding domain) overlapping with an AID (autoinhibitory domain), a C-terminal catalytic domain, SH3 binding sites and a non-classical SH3 binding site for PIX (PAK-interacting exchange factor). PAKs are Rho family GTPase-regulated kinases that serve as important mediators in the function of Cdc42 (cell division cycle 42) and Rac. The PAK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	296
132987	cd06656	STKc_PAK3	Catalytic domain of the Protein Serine/Threonine Kinase, p21-activated kinase 3. Serine/threonine kinases (STKs), p21-activated kinase (PAK) 3, catalytic (c) domain. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The PAK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase. PAKs are Rho family GTPase-regulated kinases that serve as important mediators in the function of Cdc42 (cell division cycle 42) and Rac. PAKs from higher eukaryotes are classified into two groups (I and II), according to their biochemical and structural features. PAK3 belongs to group I. Group I PAKs contain a PBD (p21-binding domain) overlapping with an AID (autoinhibitory domain), a C-terminal catalytic domain, SH3 binding sites and a non-classical SH3 binding site for PIX (PAK-interacting exchange factor). PAK3 is highly expressed in the brain. It is implicated in neuronal plasticity, synapse formation, dendritic spine morphogenesis, cell cycle progression, neuronal migration, and apoptosis. Inactivating mutations in the PAK3 gene cause X-linked non-syndromic mental retardation, the severity of which depends on the site of the mutation.	297
132988	cd06657	STKc_PAK4	Catalytic domain of the Serine/Threonine Kinase, p21-activated kinase 4. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PAK4 regulates cell morphology and cytoskeletal organization. It is essential for embryonic viability and proper neural development. Mice lacking PAK4 die due to defects in the fetal heart. In addition, their spinal cord motor neurons showed failure to differentiate and migrate. PAK4 also plays a role in cell survival and tumorigenesis. It is overexpressed in many primary tumors including colon, esophageal, and mammary tumors. PAK4 has also been implicated in viral and bacterial infection pathways. PAK4 belongs to the group II PAKs, which contain a PBD (p21-binding domain) and a C-terminal catalytic domain, but do not harbor an AID (autoinhibitory domain) or SH3 binding sites. PAKs are Rho family GTPase-regulated kinases that serve as important mediators in the function of Cdc42 (cell division cycle 42) and Rac. The PAK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	292
132989	cd06658	STKc_PAK5	Catalytic domain of the Serine/Threonine Kinase, p21-activated kinase 5. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PAK5 is mainly expressed in the brain. It is not required for viability, but together with PAK6, it is required for normal levels of locomotion and activity, and for learning and memory. PAK5 cooperates with Inca (induced in neural crest by AP2) in the regulation of cell adhesion and cytoskeletal organization in the embryo and in neural crest cells during craniofacial development. PAK5 may also play a role in controlling the signaling of Raf-1, an effector of Ras, at the mitochondria. PAK5 belongs to the group II PAKs, which contain a PBD (p21-binding domain) and a C-terminal catalytic domain, but do not harbor an AID (autoinhibitory domain) or SH3 binding sites. PAKs are Rho family GTPase-regulated kinases that serve as important mediators in the function of Cdc42 (cell division cycle 42) and Rac. The PAK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	292
270821	cd06659	STKc_PAK6	Catalytic domain of the Serine/Threonine Kinase, p21-activated kinase 6. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PAK6 may play a role in stress responses through its activation by the mitogen-activated protein kinase (MAPK) p38 and MAPK kinase 6 (MKK6) pathway. PAK6 is highly expressed in the brain. It is not required for viability, but together with PAK5, it is required for normal levels of locomotion and activity, and for learning and memory. Increased expression of PAK6 is found in primary and metastatic prostate cancer. PAK6 may play a role in the regulation of motility. PAK6 belongs to the group II PAKs, which contain a PBD (p21-binding domain) and a C-terminal catalytic domain, but do not harbor an AID (autoinhibitory domain) or SH3 binding sites. PAKs are Rho family GTPase-regulated kinases that serve as important mediators in the function of Cdc42 (cell division cycle 42) and Rac. The PAK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	297
381296	cd06660	AKR_SF	Aldo-keto reductase (AKR) superfamily. Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. AKRs are present in all phyla and are of importance in both health and industrial applications. Members have very distinct functions and include the prokaryotic 2,5-diketo-D-gluconic acid reductases and beta-keto ester reductases, the eukaryotic aldose reductases, aldehyde reductases, hydroxysteroid dehydrogenases, steroid 5beta-reductases, potassium channel beta-subunits, and aflatoxin aldehyde reductases, among others.	232
119400	cd06661	GGCT_like	GGCT-like domains, also called AIG2-like family. Gamma-glutamyl cyclotransferase (GGCT) catalyzes the formation of pyroglutamic acid (5-oxoproline) from dipeptides containing gamma-glutamyl, and is a dimeric protein. In Homo sapiens, the protein is encoded by the gene C7orf24, and the enzyme participates in the gamma-glutamyl cycle. Hereditary defects in the gamma-glutamyl cycle have been described for some of the genes involved, but not for C7orf24. The synthesis and metabolism of glutathione (L-gamma-glutamyl-L-cysteinylglycine) tie the gamma-glutamyl cycle to numerous cellular processes; glutathione acts as a ubiquitous reducing agent in reductive mechanisms involved in protein and DNA synthesis, transport processes, enzyme activity, and metabolism. AIG2 (avrRpt2-induced gene) is an Arabidopsis protein that exhibits RPS2- and avrRpt2-dependent induction early after infection with Pseudomonas syringae pv maculicola strain ES4326 carrying avrRpt2. avrRpt2 is an avirulence gene that can convert virulent strains of P. syringae to avirulence on Arabidopsis thaliana, soybean, and bean. The family also includes bacterial tellurite-resistance proteins (trgB); tellurium (Te) compounds are used in industrial processes and had been used as antimicrobial agents in the past. Some members have been described as proteins involved in cation transport (chaC).	99
119401	cd06662	SURF1	SURF1 superfamily. Surf1/Shy1 has been implicated in the posttranslational steps of the biogenesis of the mitochondrially-encoded Cox1 subunit of cytochrome c oxidase (complex IV). Cytochrome c oxidase (complex IV), the terminal electron-transferring complex of the respiratory chain, is an assemblage of nuclear and mitochondrially-encoded subunits. Its assembly is mediated by nuclear encoded assembly factors, one of which is Surf1/Shy1. Mutations in human Surf1 are a major cause of Leigh syndrome, a severe neurodegenerative disorder.	202
133456	cd06663	Biotinyl_lipoyl_domains	Biotinyl_lipoyl_domains are present in biotin-dependent carboxylases/decarboxylases, the dihydrolipoyl acyltransferase component (E2) of 2-oxo acid dehydrogenases, and the H-protein of the glycine cleavage system (GCS). These domains transport CO2, acyl, or methylamine, respectively, between components of the complex/protein via a biotinyl or lipoyl group, which is covalently attached to a highly conserved lysine residue.	73
143480	cd06664	IscU_like	Iron-sulfur cluster scaffold-like proteins. IscU_like and NifU_like proteins. IscU and NifU function as a scaffold for the assembly of [2Fe-2S] clusters before they are transferred to apo target proteins. They are highly conserved and play vital roles in the ISC and NIF systems of Fe-S protein maturation. NIF genes participate in nitrogen fixation in several isolated bacterial species. The NifU domain, however, is also found in bacteria that do not fix nitrogen, so it may have wider significance in the cell. Human IscU interacts with frataxin, the Friedreich ataxia gene product, and incorrectly spliced IscU has been shown to disrupt iron homeostasis in skeletal muscle and cause myopathy.	123
143484	cd06808	PLPDE_III	Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzymes. The fold type III PLP-dependent enzyme family is predominantly composed of two-domain proteins with similarity to bacterial alanine racemases (AR) including eukaryotic ornithine decarboxylases (ODC), prokaryotic diaminopimelate decarboxylases (DapDC), biosynthetic arginine decarboxylases (ADC), carboxynorspermidine decarboxylases (CANSDC), and similar proteins. AR-like proteins contain an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain. They exist as homodimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. These proteins play important roles in the biosynthesis of amino acids and polyamine. The family also includes the single-domain YBL036c-like proteins, which contain a single PLP-binding TIM-barrel domain without any N- or C-terminal extensions. Due to the lack of a second domain, these proteins may possess only limited D- to L-alanine racemase activity or non-specific racemase activity.	211
143485	cd06810	PLPDE_III_ODC_DapDC_like	Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzymes, Ornithine and Diaminopimelate Decarboxylases, and Related Enzymes. This family includes eukaryotic ornithine decarboxylase (ODC, EC 4.1.1.17), diaminopimelate decarboxylase (DapDC, EC 4.1.1.20), plant and prokaryotic biosynthetic arginine decarboxylase (ADC, EC 4.1.1.19), carboxynorspermidine decarboxylase (CANSDC), and ODC-like enzymes from diverse bacterial species. These proteins are fold type III PLP-dependent enzymes that catalyze essential steps in the biosynthesis of polyamine and lysine. ODC and ADC participate in alternative pathways of the biosynthesis of putrescine, which is the precursor of aliphatic polyamines in many organisms. ODC catalyzes the direct synthesis of putrescine from L-ornithine, while ADC converts L-arginine to agmatine, which is hydrolysed to putrescine by agmatinase in a pathway that exists only in plants and bacteria. DapDC converts meso-2,6-diaminoheptanedioate to L-lysine, which is the final step of lysine biosynthesis. CANSDC catalyzes the decarboxylation of carboxynorspermidine, which is the last step in the synthesis of norspermidine. The PLP-dependent decarboxylases in this family contain an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain, similar to bacterial alanine racemases. They exist as homodimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. Prokaryotic ornithine, lysine and biodegradative arginine decarboxylases are fold type I PLP-dependent enzymes and are not included in this family.	368
143486	cd06811	PLPDE_III_yhfX_like	Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzyme yhfX. This subfamily is composed of the uncharacterized protein yhfX from Escherichia coli K-12 and similar bacterial proteins. These proteins are homologous to bacterial alanine racemases (AR), which are fold type III PLP-dependent enzymes containing an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain. AR exists as homodimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. It catalyzes the interconversion between L- and D-alanine, which is an essential component of the peptidoglycan layer of bacterial cell walls. Members of this subfamily may act as PLP-dependent enzymes.	382
143487	cd06812	PLPDE_III_DSD_D-TA_like_1	Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzymes Similar to D-Serine Dehydratase and D-Threonine Aldolase, Unknown Group 1. This subfamily is composed of uncharacterized bacterial proteins with similarity to eukaryotic D-serine dehydratases (DSD) and D-threonine aldolases (D-TA). DSD catalyzes the dehydration of D-serine to aminoacrylate, which is rapidly hydrolyzed to pyruvate and ammonia. D-TA reversibly catalyzes the aldol cleavage of D-threonine into glycine and acetaldehyde, and the synthesis of D-threonine from glycine and acetaldehyde. DSD and D-TA are fold type III PLP-dependent enzymes, similar to bacterial alanine racemase (AR), which contains an N-terminal PLP-binding TIM barrel domain and a C-terminal beta-sandwich domain. AR exists as homodimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. Based on their similarity to AR, it is possible members of this family also form dimers in solution.	374
143488	cd06813	PLPDE_III_DSD_D-TA_like_2	Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzymes Similar to D-Serine Dehydratase and D-Threonine Aldolase, Unknown Group 2. This subfamily is composed of uncharacterized bacterial proteins with similarity to eukaryotic D-serine dehydratases (DSD) and D-threonine aldolases (D-TA). DSD catalyzes the dehydration of D-serine to aminoacrylate, which is rapidly hydrolyzed to pyruvate and ammonia. D-TA reversibly catalyzes the aldol cleavage of D-threonine into glycine and acetaldehyde, and the synthesis of D-threonine from glycine and acetaldehyde. DSD and D-TA are fold type III PLP-dependent enzymes, similar to bacterial alanine racemase (AR), which contains an N-terminal PLP-binding TIM barrel domain and a C-terminal beta-sandwich domain. AR exists as homodimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. Based on their similarity to AR, it is possible members of this family also form dimers in solution.	388
143489	cd06814	PLPDE_III_DSD_D-TA_like_3	Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzymes Similar to D-Serine Dehydratase and D-Threonine Aldolase, Unknown Group 3. This subfamily is composed of uncharacterized bacterial proteins with similarity to eukaryotic D-serine dehydratases (DSD) and D-threonine aldolases (D-TA). DSD catalyzes the dehydration of D-serine to aminoacrylate, which is rapidly hydrolyzed to pyruvate and ammonia. D-TA reversibly catalyzes the aldol cleavage of D-threonine into glycine and acetaldehyde, and the synthesis of D-threonine from glycine and acetaldehyde. DSD and D-TA are fold type III PLP-dependent enzymes, similar to bacterial alanine racemase (AR), which contains an N-terminal PLP-binding TIM barrel domain and a C-terminal beta-sandwich domain. AR exists as homodimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. Based on their similarity to AR, it is possible members of this family also form dimers in solution.	379
143490	cd06815	PLPDE_III_AR_like_1	Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzyme Alanine Racemase-like 1. This subfamily is composed of uncharacterized bacterial proteins with similarity to bacterial alanine racemases (AR), which are fold type III PLP-dependent enzymes containing an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain. AR exists as homodimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. It catalyzes the interconversion between L- and D-alanine, which is an essential component of the peptidoglycan layer of bacterial cell walls. Members of this subfamily may act as PLP-dependent enzymes.	353
143491	cd06817	PLPDE_III_DSD	Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzyme Eukaryotic D-Serine Dehydratase. This subfamily is composed of chicken D-serine dehydratase (DSD, EC 4.3.1.18) and similar eukaryotic proteins. Chicken DSD catalyzes the dehydration of D-serine to aminoacrylate, which is rapidly hydrolyzed to pyruvate and ammonia. It is a fold type III PLP-dependent enzyme with similarity to bacterial alanine racemase (AR), which contains an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain. AR exists as dimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. Experimental data suggest that chicken DSD also exists as dimers. Sequence comparison and biochemical experiments show that chicken DSD is distinct from the ubiquitous bacterial DSDs coded by dsdA gene, mammalian L-serine dehydratases (LSD) and mammalian serine racemase (SerRac), which are fold type II PLP-dependent enzymes.	389
143492	cd06818	PLPDE_III_cryptic_DSD	Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzyme Bacterial Cryptic D-Serine Dehydratase. This subfamily is composed of Burkholderia cepacia cryptic D-serine dehydratase (cryptic DSD), which is also called D-serine deaminase, and similar bacterial proteins. Members of this subfamily are fold type III PLP-dependent enzymes with similarity to bacterial alanine racemase (AR), which contains an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain. AR exists as dimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. Based on similarity, it is possible cryptic DSDs may also form dimers. Cryptic DSDs are distinct from the ubiquitous bacterial DSDs coded by the dsdA gene, mammalian L-serine dehydratases (LSD) and mammalian serine racemase (SerRac), which are fold type II PLP-dependent enzymes. At present, the enzymatic and biochemical properties of cryptic DSDs are still poorly understood. Typically, DSDs catalyze the dehydration of D-serine to aminoacrylate, which is rapidly hydrolyzed to pyruvate and ammonia.	382
143493	cd06819	PLPDE_III_LS_D-TA	Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzyme Low Specificity D-Threonine Aldolase. Low specificity D-threonine aldolase (Low specificity D-TA, EC 4.3.1.18), encoded by the dtaAS gene from Arthrobacter sp. strain DK-38, is the prototype of this subfamily. Low specificity D-TAs are fold type III PLP-dependent enzymes that catalyze the interconversion between D-threonine/D-allo-threonine and glycine plus acetaldehyde. Both PLP and divalent cations (e.g. Mn2+) are required for catalytic activity. Members of this subfamily show similarity to bacterial alanine racemase (AR), which contains an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain. AR exists as homodimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. Based on its similarity to AR, it is possible that low specificity D-TAs also form dimers in solution. Experimental data show that the monomeric form of low specificity D-TAs exhibits full catalytic activity.	358
143494	cd06820	PLPDE_III_LS_D-TA_like	Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzymes, Low Specificity D-Threonine Aldolase-like. This subfamily is composed of uncharacterized bacterial proteins with similarity to low specificity D-threonine aldolase (D-TA), which is a fold type III PLP-dependent enzyme that catalyzes the interconversion between D-threonine/D-allo-threonine and glycine plus acetaldehyde. Both PLP and divalent cations (e.g. Mn2+) are required for catalytic activity. Low specificity D-TAs show similarity to bacterial alanine racemase (AR), which contains an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain. AR exists as homodimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. Based on its similarity to AR, it is possible that low specificity D-TAs also form dimers in solution. Experimental data show that the monomeric form of low specificity D-TAs exhibits full catalytic activity.	353
143495	cd06821	PLPDE_III_D-TA	Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzyme D-Threonine Aldolase. D-threonine aldolase (D-TA, EC 4.3.1.18) reversibly catalyzes the aldol cleavage of D-threonine into glycine and acetaldehyde, and the synthesis of D-threonine from glycine and acetaldehyde. Its activity is present in several genera of bacteria but not in fungi. It requires PLP and a divalent cation such as Co2+, Ni2+, Mn2+, or Mg2+ as cofactors for catalytic activity and thermal stability. Members of this subfamily show similarity to bacterial alanine racemase (AR), a fold type III PLP-dependent enzyme which contains an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain. AR exists as homodimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. Based on its similarity to AR, it is possible that low specificity D-TAs also form dimers in solution. Experimental data show that the monomeric form of low specificity D-TAs exhibits full catalytic activity.	361
143496	cd06822	PLPDE_III_YBL036c_euk	Pyridoxal 5-phosphate (PLP)-binding TIM barrel domain of Type III PLP-Dependent Enzymes, Eukaryotic YBL036c-like proteins. This subfamily contains mostly uncharacterized eukaryotic proteins with similarity to the yeast hypothetical protein YBL036c, which is homologous to a Pseudomonas aeruginosa gene that is co-transcribed with a known proline biosynthetic gene. YBL036c is a single domain monomeric protein with a typical TIM barrel fold. It binds the PLP cofactor and has been shown to exhibit amino acid racemase activity. The YBL036c structure is similar to the N-terminal domain of the fold type III PLP-dependent enzymes, bacterial alanine racemase and eukaryotic ornithine decarboxylase, which are two-domain dimeric proteins. The lack of a second domain in YBL036c may explain limited D- to L-alanine racemase or non-specific racemase activity. Some members of this subfamily are also referred to as PROSC (Proline synthetase co-transcribed bacterial homolog).	227
143497	cd06824	PLPDE_III_Yggs_like	Pyridoxal 5-phosphate (PLP)-binding TIM barrel domain of Type III PLP-Dependent Enzymes, YggS-like proteins. This subfamily contains mainly uncharacterized proteobacterial proteins with similarity to the hypothetical Escherichia coli protein YggS, a homolog of yeast YBL036c, which is homologous to a Pseudomonas aeruginosa gene that is co-transcribed with a known proline biosynthetic gene. Like yeast YBL036c, YggS is a single domain monomeric protein with a typical TIM-barrel fold. Its structure, which shows a covalently-bound PLP cofactor, is similar to the N-terminal domain of the fold type III PLP-dependent enzymes, bacterial alanine racemase and eukaryotic ornithine decarboxylase, which are two-domain dimeric proteins. YggS has not been characterized extensively and its biological function is still unknown.	224
143498	cd06825	PLPDE_III_VanT	Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzymes, VanT and similar proteins. This subfamily is composed of Enterococcus gallinarum VanT and similar proteins. VanT is a membrane-bound serine racemase (EC 5.1.1.18) that is essential for vancomycin resistance in Enterococcus gallinarum. It converts L-serine into its D-enantiomer (D-serine) for peptidoglycan synthesis. The C-terminal region of this protein contains a PLP-binding TIM-barrel domain followed by a beta-sandwich domain, which is homologous to the fold type III PLP-dependent enzyme, bacterial alanine racemase (AR). AR exists as homodimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. On the basis of this similarity, it has been suggested that dimer formation of VanT is required for its catalytic activity, and that it catalyzes the racemization of serine in a mechanistically similar manner to that of alanine by bacterial AR. Some biochemical evidence indicates that VanT also exhibits alanine racemase activity and plays a role in the racemization of L-alanine. VanT contains a unique N-terminal transmembrane domain, which may function as an L-serine transporter. VanT serine racemases are not related to eukaryotic serine racemases, which are fold type II PLP-dependent enzymes.	368
143499	cd06826	PLPDE_III_AR2	Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzyme, Alanine Racemase 2. This subfamily is composed of bacterial alanine racemases (EC 5.1.1.1) with similarity to Yersinia pestis and Vibrio cholerae alanine racemase (AR) 2. ARs catalyze the interconversion between L- and D-alanine, an essential component of the peptidoglycan layer of bacterial cell walls. These proteins are similar to other bacterial ARs and are fold type III PLP-dependent enzymes containing an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain. They exist as homodimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. Homodimer formation and the presence of the PLP cofactor are required for catalytic activity.	365
143500	cd06827	PLPDE_III_AR_proteobact	Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzymes, Proteobacterial Alanine Racemases. This subfamily is composed mainly of proteobacterial alanine racemases (EC 5.1.1.1), fold type III PLP-dependent enzymes that catalyze the interconversion between L- and D-alanine, which is an essential component of the peptidoglycan layer of bacterial cell walls. These proteins are similar to other bacterial ARs and are fold type III PLP-dependent enzymes containing an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain. They exist as homodimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. Homodimer formation and the presence of the PLP cofactor are required for catalytic activity.	354
143501	cd06828	PLPDE_III_DapDC	Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzyme Diaminopimelate Decarboxylase. Diaminopimelate decarboxylase (DapDC, EC 4.1.1.20) participates in the last step of lysine biosynthesis. It converts meso-2,6-diaminoheptanedioate to L-lysine. It is a fold type III PLP-dependent enzyme that contains an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain, similar to bacterial alanine racemases. DapDC exists as homodimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. Homodimer formation and the presence of the PLP cofactor are required for catalytic activity.	373
143502	cd06829	PLPDE_III_CANSDC	Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzyme Carboxynorspermidine Decarboxylase. Carboxynorspermidine decarboxylase (CANSDC) catalyzes the decarboxylation of carboxynorspermidine, the last step in the biosynthesis of norspermidine. It is homologous to eukaryotic ornithine decarboxylase (ODC) and diaminopimelate decarboxylase (DapDC), which are fold type III PLP-dependent enzymes that contain an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain, similar to bacterial alanine racemases. Based on this similarity, CANSDC may require homodimer formation and the presence of the PLP cofactor for its catalytic activity.	346
143503	cd06830	PLPDE_III_ADC	Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzyme Arginine Decarboxylase. This subfamily includes plant and biosynthetic prokaryotic arginine decarboxylases (ADC, EC 4.1.1.19). ADC is involved in the biosynthesis of putrescine, which is the precursor of aliphatic polyamines in many organisms. It catalyzes the decarboxylation of L-arginine to agmatine, which is then hydrolyzed to putrescine by agmatinase. ADC is homologous to eukaryotic ornithine decarboxylase (ODC) and diaminopimelate decarboxylase (DapDC), which are fold type III PLP-dependent enzymes that contain an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain, similar to bacterial alanine racemases. Homodimer formation and the presence of both PLP and Mg2+ cofactors may be required for catalytic activity. Prokaryotic ADCs (biodegradative), which are fold type I PLP-dependent enzymes, are not included in this family.	409
143504	cd06831	PLPDE_III_ODC_like_AZI	Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzyme Ornithine Decarboxylase-like Antizyme Inhibitor. Antizyme inhibitor (AZI) is homologous to the fold type III PLP-dependent enzyme ODC but does not retain any decarboxylase activity. Like ODC, AZI is presumed to exist as a homodimer. Antizyme is a regulatory protein that binds directly to the ODC monomer to block its active site, leading to its degradation by the 26S proteasome. AZI binds to Antizyme with a higher affinity than ODC, preventing the formation of the Antizyme-ODC complex. Thus, AZI blocks the ability of Antizyme to promote ODC degradation, which leads to increased ODC enzymatic activity and polyamine levels. AZI also prevents the degradation of other proteins regulated by Antizyme, such as cyclin D1.	394
143505	cd06836	PLPDE_III_ODC_DapDC_like_1	Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzymes, Uncharacterized Proteins with similarity to Ornithine and Diaminopimelate Decarboxylases. This subfamily contains uncharacterized proteins with similarity to ornithine decarboxylase (ODC) and diaminopimelate decarboxylase (DapDC). ODC and DapDC are fold type III PLP-dependent enzymes that contain an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain, similar to bacterial alanine racemases. They exist as homodimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. ODC participates in the formation of putrescine by catalyzing the decarboxylation of ornithine, the first step in polyamine biosynthesis. DapDC participates in the last step of lysine biosynthesis, the conversion of meso-2,6-diaminoheptanedioate to L-lysine. Proteins in this subfamily may function as PLP-dependent decarboxylases. Homodimer formation and the presence of the PLP cofactor may be required for catalytic activity.	379
143506	cd06839	PLPDE_III_Btrk_like	Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzyme Btrk Decarboxylase. This subfamily is composed of Bacillus circulans BtrK decarboxylase and similar proteins. These proteins are fold type III PLP-dependent enzymes that contain an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain, similar to bacterial alanine racemases, eukaryotic ornithine decarboxylases and diaminopimelate decarboxylases. BtrK is presumed to function as a PLP-dependent decarboxylase involved in the biosynthesis of the aminoglycoside antibiotic butirosin. Homodimer formation and the presence of the PLP cofactor may be required for catalytic activity.	382
143507	cd06840	PLPDE_III_Bif_AspK_DapDC	Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzyme Bifunctional Aspartate Kinase/Diaminopimelate Decarboxylase. Bifunctional aspartate kinase/diaminopimelate decarboxylase (AspK/DapDC, EC 4.1.1.20/EC 2.7.2.4) typically exists in bacteria. These proteins contain an N-terminal AspK region and a C-terminal DapDC region, which contains a PLP-binding TIM-barrel domain followed by a beta-sandwich domain, characteristic of fold type III PLP-dependent enzymes. Members of this subfamily have not been fully characterized. Based on their sequence, these proteins may catalyze both reactions catalyzed by AspK and DapDC. AspK catalyzes the phosphorylation of L-aspartate to produce 4-phospho-L-aspartate while DapDC participates in the last step of lysine biosynthesis, the conversion of meso-2,6-diaminoheptanedioate to L-lysine.	368
143508	cd06841	PLPDE_III_MccE_like	Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzyme MccE. This subfamily is composed of uncharacterized proteins with similarity to Escherichia coli MccE, a hypothetical protein that is homologous to eukaryotic ornithine decarboxylase (ODC) and diaminopimelate decarboxylase (DapDC). ODC and DapDC are fold type III PLP-dependent enzymes that contain an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain, similar to bacterial alanine racemases. ODC participates in the formation of putrescine by catalyzing the decarboxylation of ornithine, the first step in polyamine biosynthesis. DapDC participates in the last step of lysine biosynthesis, the conversion of meso-2,6-diaminoheptanedioate to L-lysine. Most members of this subfamily share the same domain architecture as ODC and DapDC. A few members, including Escherichia coli MccE, contain an additional acetyltransferase domain at the C-terminus.	379
143509	cd06842	PLPDE_III_Y4yA_like	Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzyme Y4yA. This subfamily is composed of the hypothetical Rhizobium sp. protein Y4yA and similar uncharacterized bacterial proteins. These proteins are homologous to eukaryotic ornithine decarboxylase (ODC) and diaminopimelate decarboxylase (DapDC). ODC and DapDC are fold type III PLP-dependent enzymes that contain an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain, similar to bacterial alanine racemases. ODC participates in the formation of putrescine by catalyzing the decarboxylation of ornithine, the first step in polyamine biosynthesis. DapDC participates in the last step of lysine biosynthesis, the conversion of meso-2,6-diaminoheptanedioate to L-lysine. Proteins in this subfamily may function as PLP-dependent decarboxylases.	423
143510	cd06843	PLPDE_III_PvsE_like	Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzyme PvsE. This subfamily is composed of PvsE from Vibrio parahaemolyticus and similar proteins. PvsE is a vibrioferrin biosynthesis protein which is homologous to eukaryotic ornithine decarboxylase (ODC) and diaminopimelate decarboxylase (DapDC). ODC and DapDC are fold type III PLP-dependent enzymes that contain an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain, similar to bacterial alanine racemases. It has been suggested that PvsE may be involved in the biosynthesis of the polycarboxylate siderophore vibrioferrin. It may catalyze the decarboxylation of serine to yield ethanolamine. PvsE may require homodimer formation and the presence of the PLP cofactor for activity.	377
132911	cd06844	STAS	Sulphate Transporter and Anti-Sigma factor antagonist domain found in the C-terminal region of sulphate transporters as well as in bacterial and archaeal proteins involved in the regulation of sigma factors. The STAS (Sulphate Transporter and Anti-Sigma factor antagonist) domain is found in the C-terminal region of sulphate transporters as well as in bacterial and archaeal proteins involved in the regulation of sigma factors, like anti-anti-sigma factors and "stressosome" components. The sigma factor regulators are involved in protein-protein interaction which is regulated by phosphorylation.	100
132900	cd06845	Bcl-2_like	Apoptosis regulator proteins of the Bcl-2 family, named after B-cell lymphoma 2. This alignment model spans what have been described as Bcl-2 homology regions BH1, BH2, BH3, and BH4. Many members of this family have an additional C-terminal transmembrane segment. Some homologous proteins, which are not included in this model, may miss either the BH4 (Bax, Bak) or the BH2 (Bcl-X(S)) region, and some appear to only share the BH3 region (Bik, Bim, Bad, Bid, Egl-1). This family is involved in the regulation of the outer mitochondrial membrane's permeability and in promoting or preventing the release of apoptogenic factors, which in turn may trigger apoptosis by activating caspases. Bcl-2 and the closely related Bcl-X(L) are anti-apoptotic key regulators of programmed cell death. They are assumed to function via heterodimeric protein-protein interactions, binding pro-apoptotic proteins such as Bad (BCL2-antagonist of cell death), Bid, and Bim, by specifically interacting with their BH3 regions. Interfering with this heterodimeric interaction via small-molecule inhibitors may prove effective in targeting various cancers. This family also includes the Caenorhabditis elegans Bcl-2 homolog CED-9, which binds to CED-4, the C. elegans homolog of mammalian Apaf-1. Apaf-1, however, does not seem to be inhibited by Bcl-2 directly.	144
185704	cd06846	Adenylation_DNA_ligase_like	Adenylation domain of proteins similar to ATP-dependent polynucleotide ligases. ATP-dependent polynucleotide ligases catalyze the phosphodiester bond formation of nicked nucleic acid substrates using ATP as a cofactor in a three-step reaction mechanism. This family includes ATP-dependent DNA and RNA ligases. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. ATP-dependent DNA ligases have a highly modular architecture, consisting of a unique arrangement of two or more discrete domains, including a DNA-binding domain, an adenylation or nucleotidyltransferase (NTase) domain, and an oligonucleotide/oligosaccharide binding (OB)-fold domain. The adenylation domain binds ATP and contains many active site residues. Together with the C-terminal OB-fold domain, it comprises a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family. The catalytic core contains six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases including eukaryotic GTP-dependent mRNA-capping enzymes. The catalytic core contains both the active site as well as many DNA-binding residues. The RNA circularization protein from archaea and bacteria contains the minimal catalytic unit, the adenylation domain, but does not contain an OB-fold domain. This family also includes the m3G-cap binding domain of snurportin, a nuclear import adaptor that binds m3G-capped spliceosomal U small nucleoproteins (snRNPs), but doesn't have enzymatic activity.	182
133457	cd06848	GCS_H	Glycine cleavage H-protein. Glycine cleavage H-proteins are part of the glycine cleavage system (GCS) found in bacteria, archaea and the mitochondria of eukaryotes. GCS is a multienzyme complex consisting of 4 different components (P-, H-, T- and L-proteins) which catalyzes the oxidative cleavage of glycine. The H-protein shuttles the methylamine group of glycine from the P-protein (glycine dehydrogenase) to the T-protein (aminomethyltransferase) via a lipoyl group, attached to a completely conserved lysine residue.	96
133458	cd06849	lipoyl_domain	Lipoyl domain of the dihydrolipoyl acyltransferase component (E2) of 2-oxo acid dehydrogenases. 2-oxo acid dehydrogenase multienzyme complexes, like pyruvate dehydrogenase (PDH), 2-oxoglutarate dehydrogenase (OGDH) and branched-chain 2-oxo acid dehydrogenase (BCDH), contain at least three different enzymes, 2-oxo acid dehydrogenase (E1), dihydrolipoyl acyltransferase (E2) and dihydrolipoamide dehydrogenase (E3) and play a key role in redox regulation. E2, the central component of the complex, catalyzes the transfer of the acyl group of CoA from E1 to E3 via reductive acetylation of a lipoyl group covalently attached to a lysine residue.	74
133459	cd06850	biotinyl_domain	The biotinyl-domain or biotin carboxyl carrier protein (BCCP) domain is present in all biotin-dependent enzymes, such as acetyl-CoA carboxylase, pyruvate carboxylase, propionyl-CoA carboxylase, methylcrotonyl-CoA carboxylase, geranyl-CoA carboxylase, oxaloacetate decarboxylase, methylmalonyl-CoA decarboxylase, transcarboxylase and urea amidolyase. This domain functions in transferring CO2 from one subsite to another, allowing carboxylation, decarboxylation, or transcarboxylation. During this process, biotin is covalently attached to a specific lysine.	67
133461	cd06851	GT_GPT_like	This family includes eukaryotic UDP-GlcNAc:dolichol-P GlcNAc-1-P transferase (GPT) and archaeal GPT-like glycosyltransferases. Eukaryotic GPT catalyzes the transfer of GlcNAc-1-P from UDP-GlcNAc to dolichol-P to form GlcNAc-P-P-dolichol. The reaction is the first step in the assembly of dolichol-linked oligosaccharide intermediates and is essential for eukaryotic N-linked glycosylation. GPT activity has been identified in all eukaryotic cells examined to date. Evidence for the existence of the N-glycosylation pathway in archaea has emerged and genes responsible for the pathway have been identified. A glycosyl transferase gene Mv1751 in M. voltae encodes for the enzyme that carries out the first step in the pathway, the attachment of GlcNAc to a dolichol lipid carrier in the membrane. A lethal mutation in the alg7 (GPT) gene in Saccharomyces cerevisiae was successfully complemented with Mv1751, the archaeal gene, indicating eukaryotic and archaeal enzymes may use the same substrates and are evolutionarily closer than the bacterial enzyme, which uses a different substrate.	223
133462	cd06852	GT_MraY	Phospho-N-acetylmuramoyl-pentapeptide-transferase (mraY) is an enzyme responsible for the formation of the first lipid intermediate in the synthesis of bacterial cell wall peptidoglycan. It catalyzes the formation of undecaprenyl-pyrophosphoryl-N-acetylmuramoyl-pentapeptide from UDP-MurNAc-pentapeptide and undecaprenyl-phosphate. It is an integral membrane protein with possibly ten transmembrane domains.	280
133463	cd06853	GT_WecA_like	This subfamily contains Escherichia coli WecA, Bacillus subtilis TagO and related proteins. WecA is an UDP-N-acetylglucosamine (GlcNAc):undecaprenyl-phosphate (Und-P) GlcNAc-1-phosphate transferase that catalyzes the formation of a phosphodiester bond between a membrane-associated undecaprenyl-phosphate molecule and N-acetylglucosamine 1-phosphate, which is usually donated by a soluble UDP-N-acetylglucosamine precursor. WecA participates in the biosynthesis of O antigen LPS in many enteric bacteria and is also involved in the biosynthesis of enterobacterial common antigen. A conserved short sequence motif and a conserved arginine at a cytosolic loop of this integral membrane protein were shown to be critical in recognition of substrate UDP-N-acetylglucosamine.	249
133464	cd06854	GT_WbpL_WbcO_like	The members of this subfamily catalyze the formation of a phosphodiester bond between a membrane-associated undecaprenyl-phosphate (Und-P) molecule and N-acetylhexosamine 1-phosphate, which is usually donated by a soluble UDP-N-acetylhexosamine precursor. The WbcO/WbpL substrate specificity has not yet been determined, but the structure of their biosynthetic end products implies that UDP-N-acetyl-D-fucosamine (UDP-FucNAc) and/or UDP-N-acetyl-D-quinosamine (UDP-QuiNAc) are used. The subgroup of bacterial UDP-HexNAc:polyprenol-P HexNAc-1-P transferases includes the WbcO protein from Yersinia enterocolitica and the WbpL protein from Pseudomonas aeruginosa. These transferases initiate LPS O-antigen biosynthesis. Similar to other GlcNAc/MurNAc-1-P transferase family members, WbpL is a highly hydrophobic protein possessing 11 predicted transmembrane segments.	253
133465	cd06855	GT_GPT_euk	UDP-GlcNAc:dolichol-P GlcNAc-1-P transferase (GPT) catalyzes the transfer of GlcNAc-1-P from UDP-GlcNAc to dolichol-P to form GlcNAc-P-P-dolichol. The reaction is the first step in the assembly of dolichol-linked oligosaccharide intermediates and is essential for eukaryotic N-glycosylation. GPT activity has been identified in all eukaryotic cells examined to date. A series of six conserved motifs designated A through F, ranging in length from 5 to 13 amino acid residues, has been identified in this family. They have been determined to be important for stable expression, substrate binding, or catalytic activities.	283
133466	cd06856	GT_GPT_archaea	UDP-GlcNAc:dolichol-P GlcNAc-1-P transferase (GPT)-like proteins in archaea. Eukaryotic GPT catalyzes the transfer of GlcNAc-1-P from UDP-GlcNAc to dolichol-P to form GlcNAc-P-P-dolichol. The reaction is the first step in the assembly of dolichol-linked oligosaccharide intermediates and is essential for eukaryotic N-linked glycosylation. Evidence for the existence of the N-glycosylation pathway in archaea has emerged and genes responsible for the pathway have been identified. A glycosyl transferase gene Mv1751 in M. voltae encodes for the enzyme that carries out the first step in the pathway, the attachment of GlcNAc to a dolichol lipid carrier in the membrane. A lethal mutation in the alg7 (GPT) gene in Saccharomyces cerevisiae was successfully complemented with Mv1751, the archaeal gene, indicating that eukaryotic and archaeal enzymes may use the same substrates and are evolutionarily closer than the bacterial enzyme, which uses a different substrate.	280
271356	cd06857	SLC5-6-like_sbd	Solute carrier families 5 and 6-like; solute binding domain. This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT.	407
132769	cd06859	PX_SNX1_2_like	The phosphoinositide binding Phox Homology domain of Sorting Nexins 1 and 2. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. This subfamily consists of SNX1, SNX2, and similar proteins. They harbor a Bin/Amphiphysin/Rvs (BAR) domain, which detects membrane curvature, C-terminal to the PX domain. Both domains have been shown to determine the specific membrane-targeting of SNX1. SNX1 and SNX2 are components of the retromer complex, a membrane coat multimeric complex required for endosomal retrieval of lysosomal hydrolase receptors to the Golgi. The retromer consists of a cargo-recognition subcomplex and a subcomplex formed by a dimer of sorting nexins (SNX1 and/or SNX2), which ensures efficient cargo sorting by facilitating proper membrane localization of the cargo-recognition subcomplex.	114
132770	cd06860	PX_SNX7_30_like	The phosphoinositide binding Phox Homology domain of Sorting Nexins 7 and 30. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. Some SNXs are localized in early endosome structures such as clathrin-coated pits, while others are located in late structures of the endocytic pathway. This subfamily consists of SNX7, SNX30, and similar proteins. They harbor a Bin/Amphiphysin/Rvs (BAR) domain, which detects membrane curvature, C-terminal to the PX domain, similar to the sorting nexins SNX1-2, SNX4-6, SNX8, and SNX32. Both domains have been shown to determine the specific membrane-targeting of SNX1. The specific function of the sorting nexins in this subfamily has yet to be elucidated.	116
132771	cd06861	PX_Vps5p	The phosphoinositide binding Phox Homology domain of yeast sorting nexin Vps5p. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. Vps5p is the yeast counterpart of human SNX1 and is part of the retromer complex, which functions in the endosome-to-Golgi retrieval of vacuolar protein sorting receptor Vps10p, the Golgi-resident membrane protein A-ALP, and endopeptidase Kex2. The PX domain of Vps5p binds phosphatidylinositol-3-phosphate (PI3P). Similar to SNX1, Vps5p contains a Bin/Amphiphysin/Rvs (BAR) domain, which detects membrane curvature, C-terminal to the PX domain. Both domains have been shown to determine the specific membrane-targeting of SNX1.	112
132772	cd06862	PX_SNX9_18_like	The phosphoinositide binding Phox Homology domain of Sorting Nexins 9 and 18. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. This subfamily consists of SNX9, SNX18, and similar proteins. They contain an N-terminal Src Homology 3 (SH3) domain, a PX domain, and a C-terminal Bin/Amphiphysin/Rvs (BAR) domain. SNX9 is localized to plasma membrane endocytic sites and acts primarily in clathrin-mediated endocytosis, while SNX18 is localized to peripheral endosomal structures, and acts in a trafficking pathway that is clathrin-independent but relies on AP-1 and PACS1.	125
132773	cd06863	PX_Atg24p	The phosphoinositide binding Phox Homology domain of yeast Atg24p, an autophagic degradation protein. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. The yeast Atg24p is a sorting nexin (SNX) which is involved in membrane fusion events at the vacuolar surface during pexophagy. This is facilitated via binding of Atg24p to phosphatidylinositol 3-phosphate (PI3P) through its PX domain. SNXs make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway.	118
132774	cd06864	PX_SNX4	The phosphoinositide binding Phox Homology domain of Sorting Nexin 4. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. SNX4 is involved in recycling traffic from the sorting endosome (post-Golgi endosome) back to the late Golgi. It shows a similar domain architecture as SNX1-2, among others, containing a Bin/Amphiphysin/Rvs (BAR) domain, which detects membrane curvature, C-terminal to the PX domain. SNX4 is implicated in the regulation of plasma membrane receptor trafficking and interacts with receptors for EGF, insulin, platelet-derived growth factor and the long form of the leptin receptor.	129
132775	cd06865	PX_SNX_like	The phosphoinositide binding Phox Homology domain of SNX-like proteins. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. Some SNXs are localized in early endosome structures such as clathrin-coated pits, while others are located in late structures of the endocytic pathway. This subfamily is composed of uncharacterized proteins, predominantly from plants, with similarity to sorting nexins. A few members show a similar domain architecture as a subfamily of sorting nexins, containing a Bin/Amphiphysin/Rvs (BAR) domain, which detects membrane curvature, C-terminal to the PX domain. The PX-BAR structural unit is known to determine specific membrane localization.	120
132776	cd06866	PX_SNX8_Mvp1p_like	The phosphoinositide binding Phox Homology domain of Sorting Nexin 8 and yeast Mvp1p. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. Some SNXs are localized in early endosome structures such as clathrin-coated pits, while others are located in late structures of the endocytic pathway. SNX8 and the yeast counterpart Mvp1p are involved in sorting and delivery of late-Golgi proteins, such as carboxypeptidase Y, to vacuoles.	105
132777	cd06867	PX_SNX41_42	The phosphoinositide binding Phox Homology domain of fungal Sorting Nexins 41 and 42. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. Some SNXs are localized in early endosome structures such as clathrin-coated pits, while others are located in late structures of the endocytic pathway. SNX41 and SNX42 (also called Atg20p) form dimers with SNX4, and are required in protein recycling from the sorting endosome (post-Golgi endosome) back to the late Golgi in yeast.	112
132778	cd06868	PX_HS1BP3	The phosphoinositide binding Phox Homology domain of HS1BP3. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions such as cell signaling, vesicular trafficking, protein sorting, and lipid modification, among others. Hematopoietic lineage cell-specific protein-1 (HS1) binding protein 3 (HS1BP3) associates with HS1 proteins through their SH3 domains, suggesting a role in mediating signaling. It has been reported that HS1BP3 might affect the IL-2 signaling pathway in hematopoietic lineage cells. Mutations in HS1BP3 may also be associated with familial Parkinson disease and essential tremor. HS1BP3 contains a PX domain, a leucine zipper, motifs similar to immunoreceptor tyrosine-based inhibitory motif and proline-rich regions. The PX domain interacts with PIs and plays a role in targeting proteins to PI-enriched membranes.	120
132779	cd06869	PX_UP2_fungi	The phosphoinositide binding Phox Homology domain of uncharacterized fungal proteins. The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to PI-enriched membranes. Members in this subfamily are uncharacterized fungal proteins containing a PX domain. PX domain harboring proteins have been implicated in highly diverse functions such as cell signaling, vesicular trafficking, protein sorting, lipid modification, cell polarity and division, activation of T and B cells, and cell survival. In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction.	119
132780	cd06870	PX_CISK	The phosphoinositide binding Phox Homology Domain of Cytokine-Independent Survival Kinase. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Cytokine-independent survival kinase (CISK), also called Serum- and Glucocorticoid-induced Kinase 3 (SGK3), plays a role in cell growth and survival. It is expressed in most tissues and is most abundant in the embryo and adult heart and spleen. It was originally discovered in a screen for antiapoptotic genes. It phosphorylates and inhibits the proapoptotic proteins, Bad and FKHRL1. CISK/SGK3 also regulates many transporters, ion channels, and receptors. It plays a critical role in hair follicle morphogenesis and hair cycling. N-terminal to a catalytic kinase domain, CISK contains a PX domain which binds highly phosphorylated PIs, directs membrane localization, and regulates the enzyme's activity.	109
132781	cd06871	PX_MONaKA	The phosphoinositide binding Phox Homology domain of Modulator of Na,K-ATPase. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions such as cell signaling, vesicular trafficking, protein sorting, and lipid modification, among others. MONaKA (Modulator of Na,K-ATPase) binds the plasma membrane ion transporter, Na,K-ATPase, and modulates its enzymatic and ion pump activities. It modulates brain Na,K-ATPase and may be involved in regulating electrical excitability and synaptic transmission. MONaKA contains an N-terminal PX domain and a C-terminal catalytic kinase domain. The PX domain interacts with PIs and plays a role in targeting proteins to PI-enriched membranes.	120
132782	cd06872	PX_SNX19_like_plant	The phosphoinositide binding Phox Homology domain of uncharacterized SNX19-like plant proteins. The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to PI-enriched membranes. Members in this subfamily are uncharacterized plant proteins containing an N-terminal PXA domain, a central PX domain, and a C-terminal domain that is conserved in some sorting nexins (SNXs). This is the same domain architecture found in SNX19. SNX13 and SNX14 also contain these three domains but also contain a regulator of G protein signaling (RGS) domain in between the PXA and PX domains. SNXs make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction.	107
132783	cd06873	PX_SNX13	The phosphoinositide binding Phox Homology domain of Sorting Nexin 13. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. SNX13, also called RGS-PX1, contains an N-terminal PXA domain, a regulator of G protein signaling (RGS) domain, a PX domain, and a C-terminal domain that is conserved in some SNXs. It specifically binds to the stimulatory subunit of the heterotrimeric G protein G(alpha)s, serving as its GTPase activating protein, through the RGS domain. It preferentially binds phosphatidylinositol-3-phosphate (PI3P) through the PX domain and is localized in early endosomes. SNX13 is involved in endosomal sorting of EGFR into multivesicular bodies (MVB) for delivery to the lysosome.	120
132784	cd06874	PX_KIF16B_SNX23	The phosphoinositide binding Phox Homology domain of KIF16B kinesin or Sorting Nexin 23. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions such as cell signaling, vesicular trafficking, protein sorting, and lipid modification, among others. KIF16B, also called sorting nexin 23 (SNX23), is a family-3 kinesin which harbors an N-terminal kinesin motor domain containing ATP and microtubule binding sites, a ForkHead Associated (FHA) domain, and a C-terminal PX domain. The PX domain of KIF16B binds to phosphatidylinositol-3-phosphate (PI3P) in early endosomes and plays a role in the transport of early endosomes to the plus end of microtubules. By regulating early endosome plus end motility, KIF16B modulates the balance between recycling and degradation of receptors. SNXs make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway.	127
132785	cd06875	PX_IRAS	The phosphoinositide binding Phox Homology domain of the Imidazoline Receptor Antisera-Selected. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions such as cell signaling, vesicular trafficking, protein sorting, and lipid modification, among others. Imidazoline Receptor Antisera-Selected (IRAS), also called nischarin, contains an N-terminal PX domain, leucine rich repeats, and a predicted coiled coil domain. The PX domain of IRAS binds to phosphatidylinositol-3-phosphate in membranes. Together with the coiled coil domain, it is essential for the localization of IRAS to endosomes. IRAS has been shown to interact with integrin and inhibit cell migration. Its interaction with alpha5 integrin causes a redistribution of the receptor from the cell surface to endosomal structures, suggesting that IRAS may function as a sorting nexin (SNX) which regulates the endosomal trafficking of integrin. SNXs make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway.	116
132786	cd06876	PX_MDM1p	The phosphoinositide binding Phox Homology domain of yeast MDM1p. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions such as cell signaling, vesicular trafficking, protein sorting, and lipid modification, among others. Yeast MDM1p is a filament-like protein localized in punctate structures distributed throughout the cytoplasm. It plays an important role in nuclear and mitochondrial transmission to daughter buds. Members of this subfamily show similar domain architectures as some sorting nexins (SNXs). Some members are similar to SNX19 in that they contain an N-terminal PXA domain, a central PX domain, and a C-terminal domain that is conserved in some SNXs. Others are similar to SNX13 and SNX14, which also harbor these three domains as well as a regulator of G protein signaling (RGS) domain in between the PXA and PX domains. SNXs make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway.	133
132787	cd06877	PX_SNX14	The phosphoinositide binding Phox Homology domain of Sorting Nexin 14. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. SNX14 may be involved in recruiting other proteins to the membrane via protein-protein and protein-ligand interaction. It is expressed in the embryonic nervous system of mice, and is co-expressed in the motoneurons and the anterior pituitary with Islet-1. SNX14 shows a similar domain architecture as SNX13, containing an N-terminal PXA domain, a regulator of G protein signaling (RGS) domain, a PX domain, and a C-terminal domain that is conserved in some SNXs.	119
132788	cd06878	PX_SNX25	The phosphoinositide binding Phox Homology domain of Sorting Nexin 25. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. The function of SNX25 is not yet known. It has been found in exosomes from human malignant pleural effusions. SNX25 shows the same domain architecture as SNX13 and SNX14, containing an N-terminal PXA domain, a regulator of G protein signaling (RGS) domain, a PX domain, and a C-terminal domain that is conserved in some SNXs.	127
132789	cd06879	PX_UP1_plant	The phosphoinositide binding Phox Homology domain of uncharacterized plant proteins. The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to PI-enriched membranes. Members in this subfamily are uncharacterized plant proteins containing a PX domain. PX domain harboring proteins have been implicated in highly diverse functions such as cell signaling, vesicular trafficking, protein sorting, lipid modification, cell polarity and division, activation of T and B cells, and cell survival. In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction.	138
132790	cd06880	PX_SNX22	The phosphoinositide binding Phox Homology domain of Sorting Nexin 22. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. SNX22 may be involved in recruiting other proteins to the membrane via protein-protein and protein-ligand interaction. The biological function of SNX22 is not yet known.	110
132791	cd06881	PX_SNX15_like	The phosphoinositide binding Phox Homology domain of Sorting Nexin 15-like proteins. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions such as cell signaling, vesicular trafficking, protein sorting, and lipid modification, among others. Members of this subfamily have similarity to sorting nexin 15 (SNX15), which contains an N-terminal PX domain and a C-terminal Microtubule Interacting and Trafficking (MIT) domain. SNXs make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNX15 plays a role in protein trafficking processes in the endocytic pathway and the trans-Golgi network. The PX domain of SNX15 interacts with the PDGF receptor and is responsible for the membrane association of the protein. Other members of this subfamily contain an additional C-terminal kinase domain, similar to human RPK118, which binds sphingosine kinase and the antioxidant peroxiredoxin-3 (PRDX3). RPK118 may be involved in the transport of proteins such as PRDX3 from the cytoplasm to its site of function in the mitochondria.	117
132792	cd06882	PX_p40phox	The phosphoinositide binding Phox Homology domain of the p40phox subunit of NADPH oxidase. The PX domain is a phosphoinositide binding module present in many proteins with diverse functions such as cell signaling, vesicular trafficking, protein sorting, and lipid modification, among others. p40phox contains an N-terminal PX domain, a central SH3 domain that binds p47phox, and a C-terminal PB1 domain that interacts with p67phox. It is a cytosolic subunit of the phagocytic NADPH oxidase complex (also called Nox2 or gp91phox) which plays a crucial role in the cellular response to bacterial infection. NADPH oxidase catalyzes the transfer of electrons from NADPH to oxygen during phagocytosis forming superoxide and reactive oxygen species. p40phox positively regulates NADPH oxidase in both phosphatidylinositol-3-phosphate (PI3P)-dependent and PI3P-independent manner. The PX domain is a phospholipid-binding module involved in the membrane targeting of proteins. The p40phox PX domain binds to PI3P, an abundant lipid in phagosomal membranes, playing an important role in the localization of NADPH oxidase. The PX domain of p40phox is also involved in protein-protein interaction.	123
132793	cd06883	PX_PI3K_C2	The phosphoinositide binding Phox Homology Domain of Class II Phosphoinositide 3-Kinases. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. The Phosphoinositide 3-Kinase (PI3K) family of enzymes catalyzes the phosphorylation of the 3-hydroxyl group of the inositol ring of phosphatidylinositol. PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. They are also involved in the regulation of clathrin-mediated membrane trafficking as well as ATP-dependent priming of neurosecretory granule exocytosis. PI3Ks are divided into three main classes (I, II, and III) based on their substrate specificity, regulation, and domain structure. Class II PI3Ks preferentially use PI as a substrate to produce PI3P, but can also phosphorylate PI4P to produce PI(3,4)P2. They function as monomers and do not associate with any regulatory subunits. Class II enzymes contain an N-terminal Ras binding domain, a lipid binding C2 domain, a PI3K homology domain of unknown function, an ATP-binding catalytic domain, a PX domain, and a second C2 domain at the C-terminus. Class II PI3Ks include three vertebrate isoforms (alpha, beta, and gamma), the Drosophila PI3K_68D, and similar proteins.	109
132794	cd06884	PX_PI3K_C2_68D	The phosphoinositide binding Phox Homology Domain of Class II Phosphoinositide 3-Kinases similar to the Drosophila PI3K_68D protein. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. The Phosphoinositide 3-Kinase (PI3K) family of enzymes catalyzes the phosphorylation of the 3-hydroxyl group of the inositol ring of phosphatidylinositol. PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. PI3Ks are divided into three main classes (I, II, and III) based on their substrate specificity, regulation, and domain structure. Class II PI3Ks preferentially use PI as a substrate to produce PI3P, but can also phosphorylate PI4P to produce PI(3,4)P2. They function as monomers and do not associate with any regulatory subunits. Class II enzymes contain an N-terminal Ras binding domain, a lipid binding C2 domain, a PI3K homology domain of unknown function, an ATP-binding catalytic domain, a PX domain, and a second C2 domain at the C-terminus. PI3K_68D is a novel PI3K which is widely expressed throughout the Drosophila life cycle. In vitro, it has been shown to phosphorylate PI and PI4P. It is involved in signaling pathways that affect pattern formation of Drosophila wings.	111
132795	cd06885	PX_SNX17_31	The phosphoinositide binding Phox Homology domain of Sorting Nexins 17 and 31. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Members of this subfamily include sorting nexin 17 (SNX17), SNX31, and similar proteins. They contain an N-terminal PX domain followed by a truncated FERM (4.1, ezrin, radixin, and moesin) domain and a unique C-terminal region. SNXs make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. SNX17 is known to regulate the trafficking and processing of a number of proteins. It binds some members of the low-density lipoprotein receptor (LDLR) family such as LDLR, VLDLR, ApoER2, and others, regulating their endocytosis. It also binds P-selectin and may regulate its lysosomal degradation. SNX17 is highly expressed in neurons. It binds amyloid precursor protein (APP) and may be involved in its intracellular trafficking and processing to amyloid beta peptide, which plays a central role in the pathogenesis of Alzheimer's disease. The biological function of SNX31 is unknown.	104
132796	cd06886	PX_SNX27	The phosphoinositide binding Phox Homology domain of Sorting Nexin 27. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. SNX27 contains an N-terminal PDZ domain followed by a PX domain and a Ras-Associated (RA) domain. It binds G protein-gated potassium (Kir3) channels, which play a role in neuronal excitability control, through its PDZ domain. SNX27 downregulates Kir3 channels by promoting their movement in the endosome, reducing surface expression and increasing degradation. SNX27 also associates with 5-hydroxytryptamine type 4 receptor (5-HT4R), cytohesin associated scaffolding protein (CASP), and diacylglycerol kinase zeta, and may play a role in their intracellular trafficking and endocytic recycling. The SNX27 PX domain preferentially binds to phosphatidylinositol-3-phosphate (PI3P) and is important for targeting to the early endosome.	106
132797	cd06887	PX_p47phox	The phosphoinositide binding Phox Homology domain of the p47phox subunit of NADPH oxidase. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions such as cell signaling, vesicular trafficking, protein sorting, and lipid modification, among others. p47phox is a cytosolic subunit of the phagocytic NADPH oxidase complex (also called Nox2 or gp91phox), which plays a key role in the ability of phagocytes to defend against bacterial infections. NADPH oxidase catalyzes the transfer of electrons from NADPH to oxygen during phagocytosis forming superoxide and reactive oxygen species. p47phox is required for activation of NADPH oxidase and plays a role in translocation. It contains an N-terminal PX domain, two Src Homology 3 (SH3) domains, and a C-terminal domain that contains PxxP motifs for binding SH3 domains. The PX domain of p47phox is unique in that it contains two distinct basic pockets on the membrane-binding surface: one preferentially binds phosphatidylinositol-3,4-bisphosphate [PI(3,4)P2] and is analogous to the PI3P-binding pocket of p40phox, while the other binds anionic phospholipids such as phosphatidic acid or phosphatidylserine. Simultaneous binding in the two pockets results in increased membrane affinity. The PX domain of p47phox is also involved in protein-protein interaction.	118
132798	cd06888	PX_FISH	The phosphoinositide binding Phox Homology domain of Five SH protein. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions such as cell signaling, vesicular trafficking, protein sorting, and lipid modification, among others. Five SH (FISH), also called Tks5, is a scaffolding protein and Src substrate that is localized in podosomes, which are electron-dense structures found in Src-transformed fibroblasts, osteoclasts, macrophages, and some invasive cancer cells. FISH contains an N-terminal PX domain and five Src homology 3 (SH3) domains. FISH binds and regulates some members of the ADAMs family of transmembrane metalloproteases, which function as sheddases and mediators of cell and matrix interactions. It is required for podosome formation, degradation of the extracellular matrix, and cancer cell invasion. This subfamily also includes proteins with a different number of SH3 domains than FISH, such as Tks4, which contains four SH3 domains instead of five. The Tks4 adaptor protein is required for the formation of functional podosomes. It has overlapping, but not identical, functions as FISH. The PX domain is involved in targeting of proteins to PI-enriched membranes, and may also be involved in protein-protein interaction.	119
132799	cd06889	PX_NoxO1	The phosphoinositide binding Phox Homology domain of Nox Organizing protein 1. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions such as cell signaling, vesicular trafficking, protein sorting, and lipid modification, among others. Nox Organizing protein 1 (NoxO1) is a critical regulator of enzyme kinetics of the nonphagocytic NADPH oxidase Nox1, which catalyzes the transfer of electrons from NADPH to molecular oxygen to form superoxide. Nox1 is expressed in colon, stomach, uterus, prostate, and vascular smooth muscle cells. NoxO1, a homolog of the p47phox subunit of phagocytic NADPH oxidase, is involved in targeting activator subunits (such as NoxA1) to Nox1. It is co-localized with Nox1 in the membranes of resting cells and directs the subcellular localization of Nox1. The PX domain is involved in targeting of proteins to PI-enriched membranes, and may also be involved in protein-protein interaction. The PX domain of NoxO1 preferentially binds phosphatidylinositol-3,5-bisphosphate [PI(3,5)P2], PI5P, and PI4P.	121
132800	cd06890	PX_Bem1p	The phosphoinositide binding Phox Homology domain of Bem1p. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions such as cell signaling, vesicular trafficking, protein sorting, and lipid modification, among others. Members of this subfamily bear similarity to Saccharomyces cerevisiae Bem1p, containing two Src Homology 3 (SH3) domains at the N-terminus, a central PX domain, and a C-terminal PB1 domain. Bem1p is a scaffolding protein that is critical for proper Cdc42p activation during bud formation in yeast. During budding and mating, Bem1p migrates to the plasma membrane where it can serve as an adaptor for Cdc42p and some other proteins. Bem1p also functions as an effector of the G1 cyclin Cln3p and the cyclin-dependent kinase Cdc28p in promoting vacuolar fusion. The PX domain is involved in targeting of proteins to PI-enriched membranes, and may also be involved in protein-protein interaction. The PX domain of Bem1p specifically binds phosphatidylinositol-4-phosphate (PI4P).	112
132801	cd06891	PX_Vps17p	The phosphoinositide binding Phox Homology domain of yeast sorting nexin Vps17p. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. Vps17p forms a dimer with Vps5p, the yeast counterpart of human SNX1, and is part of the retromer complex that mediates the transport of the carboxypeptidase Y receptor Vps10p from endosomes to Golgi. Similar to Vps5p and SNX1, Vps17p harbors a Bin/Amphiphysin/Rvs (BAR) domain, which detects membrane curvature, C-terminal to the PX domain. The PX-BAR structural unit helps determine specific membrane localization.	140
132802	cd06892	PX_SNX5_like	The phosphoinositide binding Phox Homology domain of Sorting Nexins 5 and 6. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. Members of this subfamily include SNX5, SNX6, and similar proteins. They contain a Bin/Amphiphysin/Rvs (BAR) domain, which detects membrane curvature, C-terminal to the PX domain, similar to other sorting nexins including SNX1-2. The PX-BAR structural unit helps determine the specific membrane-targeting of some SNXs. SNX5 and SNX6 may be components of the retromer complex, a membrane coat multimeric complex required for endosomal retrieval of lysosomal hydrolase receptors to the Golgi, acting as a mammalian equivalent of yeast Vps17p.	141
132803	cd06893	PX_SNX19	The phosphoinositide binding Phox Homology domain of Sorting Nexin 19. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. SNX19 contains an N-terminal PXA domain, a central PX domain, and a C-terminal domain that is conserved in some SNXs. These domains are also found in SNX13 and SNX14, which also contain a regulator of G protein signaling (RGS) domain in between the PXA and PX domains. SNX19 interacts with IA-2, a major autoantigen found in type-1 diabetes. It inhibits the conversion of phosphatidylinositol-4,5-bisphosphate [PI(4,5)P2] to PI(3,4,5)P3, which leads to a decrease in protein phosphorylation in the Akt signaling pathway, resulting in apoptosis. SNX19 may also be implicated in coronary heart disease and thyroid oncocytic tumors.	132
132804	cd06894	PX_SNX3_like	The phosphoinositide binding Phox Homology domain of Sorting Nexin 3 and related proteins. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. This subfamily is composed of SNX3, SNX12, and fungal Grd19. Grd19 is involved in the localization of late Golgi membrane proteins in yeast. SNX3/Grd19 associates with the retromer complex, a membrane coat multimeric complex required for endosomal retrieval of lysosomal hydrolase receptors to the Golgi, and functions as a cargo-specific adaptor for the retromer.	123
132805	cd06895	PX_PLD	The phosphoinositide binding Phox Homology domain of Phospholipase D. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions such as cell signaling, vesicular trafficking, protein sorting, and lipid modification, among others. Phospholipase D (PLD) catalyzes the hydrolysis of the phosphodiester bond of phosphatidylcholine to generate membrane-bound phosphatidic acid and choline. Members of this subfamily contain PX and Pleckstrin Homology (PH) domains in addition to the catalytic domain. PLD activity has been detected in viruses, bacteria, yeast, plants, and mammals, but the PX domain is not present in PLDs from viruses and bacteria. PLDs are implicated in many cellular functions like signaling, cytoskeletal reorganization, vesicular transport, stress responses, and the control of differentiation, proliferation, and survival. Vertebrates contain two PLD isozymes, PLD1 and PLD2. PLD1 is located mainly in intracellular membranes while PLD2 is associated with plasma membranes. The PX domain is involved in targeting of proteins to PI-enriched membranes, and may also be involved in protein-protein interaction.	140
132806	cd06896	PX_PI3K_C2_gamma	The phosphoinositide binding Phox Homology Domain of the Gamma Isoform of Class II Phosphoinositide 3-Kinases. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. The Phosphoinositide 3-Kinase (PI3K) family of enzymes catalyzes the phosphorylation of the 3-hydroxyl group of the inositol ring of phosphatidylinositol. PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. PI3Ks are divided into three main classes (I, II, and III) based on their substrate specificity, regulation, and domain structure. Class II PI3Ks preferentially use PI as a substrate to produce PI3P, but can also phosphorylate PI4P to produce PI(3,4)P2. They function as monomers and do not associate with any regulatory subunits. Class II enzymes contain an N-terminal Ras binding domain, a lipid binding C2 domain, a PI3K homology domain of unknown function, an ATP-binding catalytic domain, a PX domain, and a second C2 domain at the C-terminus. The class II gamma isoform, PI3K-C2gamma, is expressed in the liver, breast, and prostate. Its biological function remains unknown. The PX domain is involved in targeting of proteins to PI-enriched membranes, and may also be involved in protein-protein interaction.	101
132807	cd06897	PX_SNARE	The phosphoinositide binding Phox Homology domain of SNARE proteins from fungi. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions such as cell signaling, vesicular trafficking, protein sorting, and lipid modification, among others. This subfamily is composed of fungal proteins similar to Saccharomyces cerevisiae Vam7p. They contain an N-terminal PX domain and a C-terminal SNARE domain. The SNARE (Soluble NSF attachment protein receptor) family of proteins are integral membrane proteins that serve as key factors for vesicular trafficking. Vam7p is anchored at the vacuolar membrane through the specific interaction of its PX domain with phosphatidylinositol-3-phosphate (PI3P) present in bilayers. It plays an essential role in vacuole fusion. The PX domain is involved in targeting of proteins to PI-enriched membranes, and may also be involved in protein-protein interaction.	108
132808	cd06898	PX_SNX10	The phosphoinositide binding Phox Homology domain of Sorting Nexin 10. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. Some SNXs are localized in early endosome structures such as clathrin-coated pits, while others are located in late structures of the endocytic pathway. SNX10 may be involved in the regulation of endosome homeostasis. Its expression induces the formation of giant vacuoles in mammalian cells.	113
173887	cd06899	lectin_legume_LecRK_Arcelin_ConA	legume lectins, lectin-like receptor kinases, arcelin, concanavalinA, and alpha-amylase inhibitor. This alignment model includes the legume lectins (also known as agglutinins), the arcelin (also known as phytohemagglutinin-L) family of lectin-like defense proteins, the LecRK family of lectin-like receptor kinases, concanavalinA (ConA), and an alpha-amylase inhibitor.  Arcelin is a major seed glycoprotein discovered in kidney beans (Phaseolus vulgaris) that has insecticidal properties and protects the seeds from predation by larvae of various bruchids.  Arcelin is devoid of monosaccharide binding properties and lacks a key metal-binding loop that is present in other members of this family.  Phytohaemagglutinin (PHA) is a lectin found in plants, especially beans, that affects cell metabolism by inducing mitosis and by altering the permeability of the cell membrane to various proteins.  PHA agglutinates most mammalian red blood cell types by binding glycans on the cell surface.  Medically, PHA is used as a mitogen to trigger cell division in T-lymphocytes and to activate latent HIV-1 from human peripheral lymphocytes.  Plant L-type lectins are primarily found in the seeds of leguminous plants where they constitute about 10% of the total soluble protein of the seed extracts. They are synthesized during seed development several weeks after flowering and transported to the vacuole where they become condensed into specialized vesicles called protein bodies. L-type lectins have a dome-shaped beta-barrel carbohydrate recognition domain with a curved seven-stranded beta-sheet referred to as the "front face" and a flat six-stranded beta-sheet referred to as the "back face".  This domain homodimerizes so that adjacent back sheets form a contiguous 12-stranded sheet and homotetramers occur by a back-to-back association of these homodimers.  Though L-type lectins exhibit both sequence and structural similarity to one another, their carbohydrate binding specificities differ widely.	236
173888	cd06900	lectin_VcfQ	VcfQ bacterial pilus biogenesis protein, lectin domain. This family includes bacterial proteins homologous to the VcfQ (also known as MshQ) bacterial pilus biogenesis protein.  VcfQ is encoded by the vcfQ gene of the type IV pilus gene cluster of Vibrio cholerae and is essential for type IV pilus assembly.  VcfQ has a Laminin G-like domain as well as an L-type lectin domain.	255
173889	cd06901	lectin_VIP36_VIPL	VIP36 and VIPL type 1 transmembrane proteins, lectin domain. The vesicular integral protein of 36 kDa (VIP36) is a type 1 transmembrane protein of the mammalian early secretory pathway that acts as a cargo receptor transporting high mannose type glycoproteins between the Golgi and the endoplasmic reticulum (ER).  Lectins of the early secretory pathway are involved in the selective transport of newly synthesized glycoproteins from the ER to the ER-Golgi intermediate compartment (ERGIC). The most prominent cycling lectin is the mannose-binding type1 membrane protein ERGIC-53, which functions as a cargo receptor to facilitate export of glycoproteins from the ER. L-type lectins have a dome-shaped beta-barrel carbohydrate recognition domain with a curved seven-stranded beta-sheet referred to as the "front face" and a flat six-stranded beta-sheet referred to as the "back face".  This domain homodimerizes so that adjacent back sheets form a contiguous 12-stranded sheet and homotetramers occur by a back-to-back association of these homodimers.  Though L-type lectins exhibit both sequence and structural similarity to one another, their carbohydrate binding specificities differ widely.	248
173890	cd06902	lectin_ERGIC-53_ERGL	ERGIC-53 and ERGL type 1 transmembrane proteins, N-terminal lectin domain. ERGIC-53 and ERGL, N-terminal carbohydrate recognition domain. ERGIC-53 and ERGL are eukaryotic mannose-binding type 1 transmembrane proteins of the early secretory pathway that transport newly synthesized glycoproteins from the endoplasmic reticulum (ER) to the ER-Golgi intermediate compartment (ERGIC).  ERGIC-53 and ERGL have an N-terminal lectin-like carbohydrate recognition domain (represented by this alignment model) as well as a C-terminal transmembrane domain.  ERGIC-53 functions as a 'cargo receptor' to facilitate the export of glycoproteins with different characteristics from the ER, while the ERGIC-53-like protein (ERGL) may act as a regulator of ERGIC-53.  In mammals, ERGIC-53 forms a complex with MCFD2 (multi-coagulation factor deficiency 2) which then recruits blood coagulation factors V and VIII.  Mutations in either MCFD2 or ERGIC-53 cause a mild form of inherited hemophilia known as combined deficiency of factors V and VIII (F5F8D). In addition to the lectin and transmembrane domains, ERGIC-53 and ERGL have a short N-terminal cytoplasmic region of about 12 amino acids. ERGIC-53 forms disulphide-linked homodimers and homohexamers. ERGIC-53 and ERGL are sequence-similar to the lectins of leguminous plants.  L-type lectins have a dome-shaped beta-barrel carbohydrate recognition domain with a curved seven-stranded beta-sheet referred to as the "front face" and a flat six-stranded beta-sheet referred to as the "back face".  This domain homodimerizes so that adjacent back sheets form a contiguous 12-stranded sheet and homotetramers occur by a back-to-back association of these homodimers.  Though L-type lectins exhibit both sequence and structural similarity to one another, their carbohydrate binding specificities differ widely.	225
173891	cd06903	lectin_EMP46_EMP47	EMP46 and EMP47 type 1 transmembrane proteins, N-terminal lectin domain. EMP46 and EMP47, N-terminal carbohydrate recognition domain. EMP46 and EMP47 are fungal type-I transmembrane proteins that cycle between the endoplasmic reticulum and the golgi apparatus and are thought to function as cargo receptors that transport newly synthesized glycoproteins.  EMP47 is a receptor for EMP46 responsible for the selective transport of EMP46 by forming hetero-oligomerization between the two proteins. EMP46 and EMP47 have an N-terminal lectin-like carbohydrate recognition domain (represented by this alignment model) as well as a C-terminal transmembrane domain. EMP46 and EMP47 are 45% sequence-identical to one another and have sequence homology to a class of intracellular lectins defined by ERGIC-53 and VIP36.  L-type lectins have a dome-shaped beta-barrel carbohydrate recognition domain with a curved seven-stranded beta-sheet referred to as the "front face" and a flat six-stranded beta-sheet referred to as the "back face".  This domain homodimerizes so that adjacent back sheets form a contiguous 12-stranded sheet and homotetramers occur by a back-to-back association of these homodimers.  Though L-type lectins exhibit both sequence and structural similarity to one another, their carbohydrate binding specificities differ widely.	215
349475	cd06904	M14_MpaA-like	Peptidase M14-like domain of Escherichia coli Murein Peptide Amidase A and related proteins. Peptidase M14-like domain of Escherichia coli Murein Peptide Amidase A (MpaA) and related proteins. MpaA is a member of the M14 family of metallocarboxypeptidases (MCPs), however it has an exceptional type of activity, it hydrolyzes the gamma-D-glutamyl-meso-diaminopimelic acid (gamma-D-Glu-Dap) bond in murein peptides. MpaA is specific for cleavage of the gamma-D-Glu-Dap bond of free murein tripeptide; it may also cleave murein tetrapeptide. MpaA has a different substrate specificity and cellular role than endopeptidase I, ENP1 (ENP1 does not belong to this group). MpaA works on free murein peptide in the recycling pathway.	214
349476	cd06905	M14-like	Peptidase M14-like domain; uncharacterized subfamily. A functionally uncharacterized subgroup of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing a preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavages. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily is that of succinylglutamate desuccinylase/aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism.	359
349477	cd06906	M14_Nna1	Peptidase M14-like domain of ATP/GTP binding proteins and cytosolic carboxypeptidases. Peptidase M14-like domain of Nna-1 (Nervous system Nuclear protein induced by Axotomy), also known as ATP/GTP binding protein (AGTPBP-1) and cytosolic carboxypeptidase (CCP), and related proteins. The Peptidase M14 family of metallocarboxypeptidases are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. This eukaryotic subgroup includes the mouse Nna1/CCP-1, and -4 proteins, and the human Nna1/AGTPBP-1 protein. Nna1-like proteins are active metallopeptidases that are thought to act on cytosolic proteins such as alpha-tubulin, to remove a C-terminal tyrosine. Nna1 is widely expressed in the developing and adult nervous systems, including cerebellar Purkinje and granule neurons, mitral cells of the olfactory bulb and retinal photoreceptors. Nna1 is also induced in axotomized motor neurons. Mutations in Nna1 cause Purkinje cell degeneration (pcd). The Nna1 CP domain is required to prevent the retinal photoreceptor loss and cerebellar ataxia phenotypes of pcd mice, and a functional zinc-binding domain is needed for Nna-1 to support neuron survival in these mice. Nna1-like proteins from the different phyla are highly diverse, but they all contain a unique N-terminal conserved domain right before the CP domain. It has been suggested that this N-terminal domain might act as a folding domain.	271
349478	cd06907	M14_AGBL2-3_like	Peptidase M14-like domain of ATP/GTP binding protein AGBL-2 and AGBL-3, and related proteins. Peptidase M14-like domain of ATP/GTP binding protein_like (AGBL)-2, and related proteins. The Peptidase M14 family of metallocarboxypeptidases are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. This subgroup includes the human AGBL-2, and -3, and the mouse cytosolic carboxypeptidase (CCPs)-2, and -3. ATP/GTP binding protein (AGTPBP-1/Nna1)-like proteins are active metallopeptidases that are thought to act on cytosolic proteins such as alpha-tubulin, to remove a C-terminal tyrosine. Mutations in AGTPBP-1/Nna1 cause Purkinje cell degeneration (pcd). AGTPBP-1/Nna1 however does not belong to this subgroup. AGTPBP-1/Nna1-like proteins from the different phyla are highly diverse, but they all contain a unique N-terminal conserved domain right before the CP domain. It has been suggested that this N-terminal domain might act as a folding domain.	252
349479	cd06908	M14_AGBL4_like	Peptidase M14-like domain of ATP/GTP binding protein AGBL-4 and related proteins. Peptidase M14-like domain of ATP/GTP binding protein_like (AGBL)-4, and related proteins. The Peptidase M14 family of metallocarboxypeptidases are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. This eukaryotic subgroup includes the human AGBL4 and the mouse cytosolic carboxypeptidase (CCP)-6. ATP/GTP binding protein (AGTPBP-1/Nna1)-like proteins are active metallopeptidases that are thought to act on cytosolic proteins such as alpha-tubulin, to remove a C-terminal tyrosine. Mutations in AGTPBP-1/Nna1 cause Purkinje cell degeneration (pcd). AGTPBP-1/Nna1 however does not belong to this subgroup. AGTPBP-1/Nna1-like proteins from the different phyla are highly diverse, but they all contain a unique N-terminal conserved domain right before the CP domain. It has been suggested that this N-terminal domain might act as a folding domain.	254
349480	cd06909	M14_ASPA	Peptidase M14 Aspartoacylase (ASPA) subfamily. Aspartoacylase (ASPA) belongs to the Succinylglutamate desuccinylase/aspartoacylase subfamily of the M14 family of metallocarboxypeptidases. ASPA (also known as aminoacylase 2; EC:3.5.1.15) cleaves N-acetyl L-aspartic acid (NAA) into aspartate and acetate. NAA is abundant in the brain, and hydrolysis of NAA by ASPA may help maintain white matter. ASPA is an NAA scavenger in other tissues. Mutations in the gene encoding ASPA cause Canavan disease (CD), a fatal progressive neurodegenerative disorder involving dysmyelination and spongiform degeneration of white matter in children. This enzyme binds zinc which is necessary for activity. Measurement of elevated NAA levels in urine is used in the diagnosis of CD.	190
349481	cd06910	M14_ASTE_ASPA-like	Peptidase M14 Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA)-like; uncharacterized subgroup. A functionally uncharacterized subgroup of the Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA) subfamily which is part of the M14 family of metallocarboxypeptidases. ASTE catalyzes the fifth and last step in arginine catabolism by the arginine succinyltransferase pathway, and aspartoacylase (ASPA, also known as aminoacylase 2, and ACY-2; EC:3.5.1.15) cleaves N-acetyl L-aspartic acid (NAA) into aspartate and acetate. NAA is abundant in the brain, and hydrolysis of NAA by ASPA may help maintain white matter. ASPA is an NAA scavenger in other tissues. Mutations in the gene encoding ASPA cause Canavan disease (CD), a fatal progressive neurodegenerative disorder involving dysmyelination and spongiform degeneration of white matter in children. This enzyme binds zinc which is necessary for activity. Measurement of elevated NAA levels in urine is used in the diagnosis of CD.	208
132874	cd06911	VirB9_CagX_TrbG	VirB9/CagX/TrbG, a component of the type IV secretion system. VirB9 is a component of the type IV secretion system, which is employed by pathogenic bacteria to export virulence proteins directly from the bacterial cytoplasm into the host cell. Unlike the more common type III secretion system, type IV systems evolved from the conjugative apparatus, which is used to transfer DNA between cells. VirB9 was initially identified as an essential virulence gene on the Agrobacterium tumefaciens Ti plasmid. In the pilin-like conjugative structure, VirB9 appears to form a stabilizing complex in the outer membrane, by interacting with the lipoprotein VirB7. The heterodimer has been shown to stabilize other components of the type IV system. This alignment model spans the C-terminal domain of VirB9. CagX is a component of the Helicobacter pylori cag PAI-encoded type IV secretion system. Some other members of this family are involved in conjugal transfer of T-DNA to plant cells.	86
133467	cd06912	GT_MraY_like	This subfamily is composed of uncharacterized bacterial glycosyltransferases in the MraY-like family. This family contains both eukaryotic and prokaryotic UDP-D-N-acetylhexosamine:polyprenol phosphate D-N-acetylhexosamine-1-phosphate transferases, which catalyze the transfer of a D-N-acetylhexosamine 1-phosphate to a membrane-bound polyprenol phosphate. This is the initiation step of protein N-glycosylation in eukaryotes and peptidoglycan biosynthesis in bacteria. The three bacterial members MraY, WecA, and WbpL/WbcO, utilize undecaprenol phosphate as the acceptor substrate, but use different UDP-sugar donor substrates. MraY-type transferases are highly specific for UDP-N-acetylmuramate-pentapeptide, whereas WecA proteins are selective for UDP-N-acetylglucosamine (UDP-GlcNAc). The WbcO/WbpL substrate specificity has not yet been determined, but the structure of their biosynthetic end products implies that UDP-N-acetyl-D-fucosamine (UDP-FucNAc) and/or UDP-N-acetyl-D-quinosamine (UDP-QuiNAc) are used. The prokaryotic enzyme-catalyzed reactions lead to the formation of polyprenol-linked oligosaccharides involved in bacterial cell wall and peptidoglycan assembly.	193
133063	cd06913	beta3GnTL1_like	Beta 1, 3-N-acetylglucosaminyltransferase is essential for the formation of poly-N-acetyllactosamine. This family includes human Beta3GnTL1 and related eukaryotic proteins. Human Beta3GnTL1 is a putative beta-1,3-N-acetylglucosaminyltransferase. Beta3GnTL1 is expressed at various levels in most of the tissues examined. Beta 1, 3-N-acetylglucosaminyltransferase has been found to be essential for the formation of poly-N-acetyllactosamine. Poly-N-acetyllactosamine is a unique carbohydrate composed of N-acetyllactosamine repeats. It is often an important part of cell-type-specific oligosaccharide structures and some functional oligosaccharides. It has been shown that the structure and biosynthesis of poly-N-acetyllactosamine display a dramatic change during development and oncogenesis. Several members of beta-1, 3-N-acetylglucosaminyltransferase have been identified.	219
133064	cd06914	GT8_GNT1	GNT1 is a fungal enzyme that belongs to the GT 8 family. N-acetylglucosaminyltransferase is a fungal enzyme that catalyzes the addition of N-acetyl-D-glucosamine to mannotetraose side chains by an alpha 1-2 linkage during the synthesis of mannan. The N-acetyl-D-glucosamine moiety in mannan plays a role in the attachment of mannan to asparagine residues in proteins. The mannotetraose and its N-acetyl-D-glucosamine derivative side chains of mannan are the principal immunochemical determinants on the cell surface. N-acetylglucosaminyltransferase is a member of glycosyltransferase family 8, which are, based on the relative anomeric stereochemistry of the substrate and product in the reaction catalyzed, retaining glycosyltransferases.	278
133065	cd06915	NTP_transferase_WcbM_like	WcbM_like is a subfamily of nucleotidyl transferases. WcbM protein of Burkholderia mallei is involved in the biosynthesis, export or translocation of capsule. It is a subfamily of nucleotidyl transferases that transfer nucleotides onto phosphosugars.	223
143512	cd06916	NR_DBD_like	DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD).  Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers.	72
270822	cd06917	STKc_NAK1_like	Catalytic domain of Fungal Nak1-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of Schizosaccharomyces pombe Nak1, Saccharomyces cerevisiae Kic1p (kinase that interacts with Cdc31p) and related proteins. Nak1 (also called N-rich kinase 1), is required by fission yeast for polarizing the tips of actin cytoskeleton and is involved in cell growth, cell separation, cell morphology and cell-cycle progression. Kic1p is required by budding yeast for cell integrity and morphogenesis. Kic1p interacts with Cdc31p, the yeast homologue of centrin, and phosphorylates substrates in a Cdc31p-dependent manner. The Nak1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	277
132994	cd06919	Asp_decarbox	Aspartate alpha-decarboxylase or L-aspartate 1-decarboxylase, a pyruvoyl group-dependent  decarboxylase in beta-alanine production. Decarboxylation of aspartate is  the major route of beta-alanine production in bacteria, and is catalyzed  by the enzyme L-aspartate decarboxylase (ADC), EC:4.1.1.11 which  requires a pyruvoyl group for its activity. The pyruvoyl cofactor is  covalently bound to the enzyme. The protein is synthesized as a  proenzyme and cleaved via self-processing at Gly23-Ser24 to yield an  alpha chain (C-terminal fragment) and beta chain (N-terminal fragment),  and the pyruvoyl group. Beta-alanine is required for the biosynthesis of  pantothenate, in which the enzyme plays a critical regulatory role. The  active site of the tetrameric enzyme is located at the interface of two  subunits, with a Lysine and a Histidine from the beta chain of one  subunit forming the active site with residues from the alpha chain of  the adjacent subunit. This alignment model spans the precursor (or both  beta and alpha chains) of aspartate decarboxylase.	111
132993	cd06920	NEAT	NEAr Transport domain, a component of cell surface proteins. NEAr Transporter (NEAT) domain; used by pathogenic bacteria to scavenge heme-iron from host hemoproteins. The NEAT domain is a component of cell surface proteins (iron regulated surface determinants, or Isd, such as IsdA and IsdC) in various gram-positive bacteria, and may be arranged in tandem repeats.	117
211312	cd06921	ChtBD1_GH19_hevein	Hevein or Type 1 chitin binding domain subfamily co-occurring with family 19 glycosyl hydrolases or with barwin domains. This subfamily includes Hevein, a major IgE-binding allergen in natural rubber latex. ChtBD1 is a lectin domain found in proteins from plants and fungi that bind N-acetylglucosamine, plant endochitinases, wound-induced proteins, and the alpha subunit of Kluyveromyces lactis killer toxin. This domain is involved in the recognition and/or binding of chitin subunits; it typically occurs N-terminal to glycosyl hydrolase domains in chitinases, together with other carbohydrate-binding domains, or by itself in tandem-repeat arrangements.	40
211313	cd06922	ChtBD1_GH18_1	Hevein or Type 1 chitin binding domain subfamily that co-occurs with family 18 glycosyl hydrolases. ChtBD1 is a lectin domain found in proteins from plants and fungi that bind N-acetylglucosamine, plant endochitinases, wound-induced proteins, and the alpha subunit of Kluyveromyces lactis killer toxin. This domain is involved in the recognition and/or binding of chitin subunits; it typically occurs N-terminal to glycosyl hydrolase domains in chitinases, together with other carbohydrate-binding domains, or by itself in tandem-repeat arrangements.	38
211314	cd06923	ChtBD1_GH16	Hevein or Type 1 chitin binding domain subfamily that co-occurs with family 16 glycosyl hydrolases. This subfamily includes Saccharomyces cerevisiae Utr2p, also known as Crh2p, which participates in the cross-linking of chitin to beta(1-3)- and beta(1-6) glucan in the cell wall,  and S. cerevisiae Crr1p, a putative transglycosidase which is needed for proper spore wall assembly. ChtBD1 is a lectin domain found in proteins from plants and fungi that bind N-acetylglucosamine, plant endochitinases, wound-induced proteins, and the alpha subunit of Kluyveromyces lactis killer toxin. This domain is involved in the recognition and/or binding of chitin subunits; it typically occurs N-terminal to glycosyl hydrolase domains in chitinases, together with other carbohydrate-binding domains, or by itself in tandem-repeat arrangements.	47
132902	cd06926	RNAP_II_RPB11	RPB11 subunit of Eukaryotic RNA polymerase II. The eukaryotic RPB11 subunit of RNA polymerase (RNAP) II is involved in the assembly of RNAP subunits. RNAP is a large multi-subunit complex responsible for the synthesis of RNA. It is the principal enzyme of the transcription process, and is a final target in many regulatory pathways that control gene expression in all living cells. At least three distinct RNAP complexes are found in eukaryotic nuclei: RNAP I, RNAP II, and RNAP III. RNAP II is responsible for the synthesis of mRNA precursor. The RPB11 subunit heterodimerizes with the RPB3 subunit, and together with RPB10 and RPB12, anchors the two largest subunits, RPB1 and RPB2, and stabilizes their association.	93
132903	cd06927	RNAP_L	L subunit of Archaeal RNA polymerase. The archaeal L subunit of RNA polymerase (RNAP) is involved in the assembly of RNAP subunits. RNAP is a large multi-subunit complex responsible for the synthesis of RNA. It is the principal enzyme of the transcription process, and is a final target in many regulatory pathways that control gene expression in all living cells. A single distinct RNAP complex is found in archaea, which may be responsible for the synthesis of all RNAs. The archaeal RNAP harbors homologues of all eukaryotic RNAP II subunits with two exceptions (RPB8 and RPB9). The 12 archaeal subunits are designated by letters and can be divided into three functional groups that are engaged in: (I) catalysis (A'/A", B'/B" or B); (II) assembly (L, N, D and P); and (III) auxiliary functions (F, E, H and K). The assembly of the two largest archaeal RNAP subunits that provide most of the enzyme's catalytic functions depends on the presence of the archaeal D/L heterodimer.	83
132904	cd06928	RNAP_alpha_NTD	N-terminal domain of the Alpha subunit of Bacterial RNA polymerase. The bacterial alpha subunit of RNA polymerase (RNAP) consists of two independently folded domains: an amino-terminal domain (alphaNTD) and a carboxy-terminal domain (alphaCTD). AlphaCTD is not required for RNAP assembly but interacts with transcription activators. AlphaNTD is essential in vivo and in vitro for RNAP assembly and basal transcription. It is similar to the eukaryotic RPB3/AC40/archaeal D subunit, and contains two subdomains: one subdomain is similar to the eukaryotic Rpb11/AC19/archaeal L subunit which is involved in dimerization; and the other is an inserted beta sheet subdomain. The alphaNTDs of plant plastid RNAP (PEP) are also included in this subfamily. PEP is largely responsible for the transcription of photosynthetic genes and is closely related to the multi-subunit bacterial RNAP, which is a large multi-subunit complex responsible for the synthesis of all bacterial RNAs. The bacterial RNAP core enzyme consists of four subunits (beta', beta, alpha and omega). All residues in the alpha subunit that are involved in dimerization or in the interaction with other subunits are located within alphaNTD.	215
132727	cd06929	NR_LBD_F1	Ligand-binding domain of nuclear receptor family 1. Ligand-binding domain (LBD) of nuclear receptor (NR) family 1:  This is one of the major subfamilies of nuclear receptors, including thyroid receptor, retinoic acid receptor, ecdysone receptor, farnesoid X receptor, vitamin D receptor, and other related receptors. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD).	174
132728	cd06930	NR_LBD_F2	Ligand-binding domain of nuclear receptor family 2. Ligand-binding domain (LBD) of nuclear receptor (NR) family 2:  This is one of the major subfamilies of nuclear receptors, including some well known nuclear receptors such as glucocorticoid receptor (GR), mineralocorticoid receptor (MR), estrogen receptor (ER), progesterone receptor (PR), androgen receptor (AR), and other related receptors. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD).	165
132729	cd06931	NR_LBD_HNF4_like	The ligand binding domain of hepatocyte nuclear factor 4, which is explosively expanded in nematodes. The ligand binding domain of hepatocyte nuclear factor 4 (HNF4) like proteins: HNF4 is a member of the nuclear receptor superfamily. HNF4 plays a key role in establishing and maintaining hepatocyte differentiation in the liver. It is also expressed in gut, kidney, and pancreatic beta cells. HNF4 was originally classified as an orphan receptor, but it was later found that HNF4 binds with very high affinity to a variety of fatty acids. However, unlike other nuclear receptors, the ligands do not act as a molecular switch for HNF4. They seem to constantly bind to the receptor, which is constitutively active as a transcription activator. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, HNF4 has a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). The LBD domain is also responsible for recruiting co-activator proteins. More than 280 nuclear receptors are found in C. elegans, most of which originated from an explosive burst of duplications of HNF4.	222
132730	cd06932	NR_LBD_PPAR	The ligand binding domain of peroxisome proliferator-activated receptors. The ligand binding domain (LBD) of peroxisome proliferator-activated receptors (PPAR):  Peroxisome proliferator-activated receptors (PPARs) are members of the nuclear receptor superfamily of ligand-activated transcription factors. PPARs play important roles in regulating cellular differentiation, development and lipid metabolism. Activated PPAR forms a heterodimer with the retinoid X receptor (RXR) that binds to the hormone response element located upstream of the peroxisome proliferator responsive genes and interacts with co-activators. There are three subtypes of peroxisome proliferator activated receptors, alpha, beta (or delta), and gamma, each with a distinct tissue distribution. Several essential fatty acids, oxidized lipids and prostaglandin J derivatives can bind and activate PPAR.  Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, PPAR has a central well conserved DNA binding domain (DBD), a variable N-terminal regulatory domain, a flexible hinge and a C-terminal ligand binding domain (LBD).	259
132731	cd06933	NR_LBD_VDR	The ligand binding domain of vitamin D receptors, a member of the nuclear receptor superfamily. The ligand binding domain of vitamin D receptors (VDR): VDR is a member of the nuclear receptor (NR) superfamily that functions as classical endocrine receptors. VDR controls a wide range of biological activities including calcium metabolism, cell proliferation and differentiation, and immunomodulation. VDR is a high affinity receptor for the biologically most active Vitamin D metabolite, 1alpha,25-dihydroxyvitamin D3 (1alpha,25(OH)2D3). The binding of the ligand to the receptor induces a conformational change of the ligand binding domain (LBD) with consequent dissociation of corepressors. Upon ligand binding, VDR forms a heterodimer with the retinoid X receptor (RXR) that binds to vitamin D response elements (VDREs) and recruits coactivators. This leads to the expression of a large number of genes.  Approximately 200 human genes are considered to be primary targets of VDR and even more genes are regulated indirectly. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, VDR has a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD).	238
132732	cd06934	NR_LBD_PXR_like	The ligand binding domain of xenobiotic receptors: pregnane X receptor and constitutive androstane receptor. The ligand binding domain of xenobiotic receptors: This xenobiotic receptor family includes pregnane X receptor (PXR), constitutive androstane receptor (CAR) and other related nuclear receptors.  They function as sensors of toxic byproducts of cell metabolism and of exogenous chemicals, to facilitate their elimination. The nuclear receptor pregnane X receptor (PXR) is a ligand-regulated transcription factor that responds to a diverse array of chemically distinct ligands, including many endogenous compounds and clinical drugs. The ligand binding domain of PXR shows remarkable flexibility to accommodate both large and small molecules. PXR functions as a heterodimer with retinoic X receptor-alpha (RXRa) and binds to a variety of response elements in the promoter regions of a diverse set of target genes involved in the metabolism, transport, and elimination of these molecules from the cell. Constitutive androstane receptor (CAR) is the closest mammalian relative of PXR, and has also been proposed to function as a xenosensor. CAR is activated by some of the same ligands as PXR and regulates a subset of common genes. The sequence homology and functional similarity suggest that the CAR gene arose from a duplication of an ancestral PXR gene. Like other nuclear receptors, xenobiotic receptors have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD).	226
132733	cd06935	NR_LBD_TR	The ligand binding domain of thyroid hormone receptor, a member of a superfamily of nuclear receptors. The ligand binding domain (LBD) of thyroid hormone receptors: Thyroid hormone receptors are members of a superfamily of nuclear receptors. Thyroid hormone receptors (TR) mediate the actions of thyroid hormones, which play critical roles in growth, development, and homeostasis in mammals. They regulate overall metabolic rate, cholesterol and triglyceride levels, and heart rate, and affect mood. TRs are expressed from two separate genes (alpha and beta) in human and each gene generates two isoforms of the receptor through differential promoter usage or splicing. TRalpha functions in the heart to regulate heart rate and rhythm and TRbeta is active in the liver and other tissues. The unliganded TRs function as transcription repressors, by binding to thyroid hormone response elements (TRE) predominantly as homodimers, or as heterodimers with retinoid X-receptors (RXR), and being associated with a complex of proteins containing corepressor proteins. Ligand binding promotes corepressor dissociation and binding of a coactivator to activate transcription. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, TR has a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD).	243
132734	cd06936	NR_LBD_Fxr	The ligand binding domain of Farnesoid X receptor: a member of the nuclear receptor superfamily of ligand-activated transcription factors. The ligand binding domain (LBD) of Farnesoid X receptor: Farnesoid X receptor (FXR) is a member of the nuclear receptor superfamily of ligand-activated transcription factors. FXR is highly expressed in the liver, the intestine, the kidney, and the adrenals.  FXR plays key roles in the regulation of bile acid, cholesterol, triglyceride, and glucose metabolism. Evidence shows that it also regulates liver regeneration. Upon binding of ligands, such as bile acid, an endogenous ligand, FXRs bind to FXR response elements (FXREs) either as a monomer or as a heterodimer with retinoid X receptor (RXR), and regulate the expression of various genes involved in bile acid, lipid, and glucose metabolism. There are two FXR genes (FXRalpha and FXRbeta) in mammals. A single FXRalpha gene encodes four isoforms resulting from differential use of promoters and alternative splicing. FXRbeta is a functional receptor in mice, rats, rabbits and dogs, but is a pseudogene in humans and primates. Like other members of the nuclear receptor (NR) superfamily, farnesoid X receptors have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD).	221
132735	cd06937	NR_LBD_RAR	The ligand binding domain (LBD) of retinoic acid receptor (RAR), a member of the nuclear receptor superfamily. The ligand binding domain (LBD) of retinoic acid receptor (RAR): Retinoic acid receptors are members of the nuclear receptor (NR) superfamily of ligand-regulated transcription factors. RARs mediate the biological effect of retinoids, including both natural dietary vitamin A (retinol) metabolites and active synthetic analogs. Retinoids play key roles in a wide variety of essential biological processes, such as vertebrate embryonic morphogenesis and organogenesis, differentiation and apoptosis, and homeostasis. RARs function as heterodimers with retinoid X receptors by binding to specific RAR response elements (RAREs) found in the promoter regions of retinoid target genes. In the absence of ligand, the RAR-RXR heterodimer recruits the corepressor proteins NCoR or SMRT, and associated factors such as histone deacetylases or DNA-methyltransferases, leading to an inactive condensed chromatin structure, preventing transcription. Upon ligand binding, the corepressors are released, and coactivator complexes such as histone acetyltransferase or histone arginine methyltransferases are recruited to activate transcription. There are three RAR subtypes (alpha, beta, gamma), originating from three distinct genes. For each subtype, several isoforms exist that differ in their N-terminal region, allowing retinoids to exert their pleiotropic effects. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, retinoic acid receptors have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD).	231
132736	cd06938	NR_LBD_EcR	The ligand binding domain (LBD) of the Ecdysone receptor, a member of the nuclear receptor superfamily. The ligand binding domain (LBD) of the ecdysone receptor: The ecdysone receptor (EcR) belongs to the superfamily of nuclear receptors (NRs) of ligand-dependent transcription factors. Ecdysone receptor is present only in invertebrates and regulates the expression of a large number of genes during development and reproduction. EcR functions as a heterodimer by partnering with ultraspiracle protein (USP), the ortholog of the vertebrate retinoid X receptor (RXR). The natural ligands of ecdysone receptor are ecdysteroids, the endogenous steroidal hormones found in invertebrates. In addition, the insecticide bisacylhydrazine used against pests has been shown to act on EcR. EcR must be dimerised with a USP for high-affinity ligand binding to occur. The ligand binding triggers a conformational change in the C-terminal part of the EcR ligand-binding domain that leads to transcriptional activation of genes controlled by EcR. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, ecdysone receptors have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD).	231
132737	cd06939	NR_LBD_ROR_like	The ligand binding domain of Retinoid-related orphan receptors, of the nuclear receptor superfamily. The ligand binding domain (LBD) of Retinoid-related orphan receptors (RORs): Retinoid-related orphan receptors (RORs) are transcription factors belonging to the nuclear receptor superfamily. RORs are key regulators of many physiological processes during embryonic development. RORs bind as monomers to specific ROR response elements (ROREs) consisting of the consensus core motif AGGTCA preceded by a 5-bp A/T-rich sequence. Transcription regulation by RORs is mediated through certain corepressors, as well as coactivators. There are three subtypes of retinoid-related orphan receptors (RORs), alpha, beta, and gamma that differ only in N-terminal sequence and are distributed in distinct tissues. RORalpha plays a key role in the development of the cerebellum, particularly in the regulation of the maturation and survival of Purkinje cells. RORbeta expression is largely restricted to several regions of the brain, the retina, and pineal gland. RORgamma is essential for lymph node organogenesis. Recently, it has been suggested that cholesterol or a cholesterol derivative is the natural ligand of RORalpha. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, retinoid-related orphan receptors have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD).	241
132738	cd06940	NR_LBD_REV_ERB	The ligand binding domain of REV-ERB receptors, members of the nuclear receptor superfamily. The ligand binding domain (LBD) of REV-ERB receptors:  REV-ERBs are transcriptional regulators belonging to the nuclear receptor superfamily. They regulate a number of physiological functions including the circadian rhythm, lipid metabolism, and cellular differentiation. The LBD domain of REV-ERB is unusual   in the nuclear receptor family by lacking the AF-2 region that is responsible for coactivator interaction.  REV-ERBs act as constitutive repressors because of their inability to bind coactivators.  REV-ERB receptors can bind to two classes of DNA response elements as either a monomer or heterodimer, indicating functional diversity. When bound to the DNA, they recruit corepressors (NcoR/histone deacetylase 3) to the promoter, resulting in repression of the target gene. The porphyrin heme has been demonstrated to function as a ligand for REV-ERB. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, REV-ERB receptors have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD).	189
132739	cd06941	NR_LBD_DmE78_like	The ligand binding domain of Drosophila ecdysone-induced protein 78, a member of the nuclear receptor superfamily. The ligand binding domain (LBD) of Drosophila ecdysone-induced protein 78 (E78) like: Drosophila ecdysone-induced protein 78 (E78) is a transcription factor belonging to the nuclear receptor superfamily.  E78 is a product of the ecdysone-inducible gene found in an early late puff locus at position 78C during the onset of Drosophila metamorphosis. Two isoforms of E78, E78A and E78B, are expressed from two nested transcription units. An E78 orthologue from the Platyhelminth Schistosoma mansoni (SmE78) has also been identified. It is the first E78 orthologue known outside of the molting animals--the Ecdysozoa. SmE78 may be involved in transduction of an ecdysone signal in S. mansoni, consistent with its function in Drosophila.  Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, E78-like receptors have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD).	195
132740	cd06942	NR_LBD_Sex_1_like	The ligand binding domain of Caenorhabditis elegans nuclear hormone receptor Sex-1 protein. The ligand binding domain (LBD) of Caenorhabditis elegans nuclear hormone receptor Sex-1 protein like: Sex-1 protein of C. elegans is a transcription factor belonging to the nuclear receptor superfamily. Sex-1 plays pivotal role in sex fate of C. elegans by regulating the transcription of the sex-determination gene xol-1, which specifies male (XO) fate when active and hermaphrodite (XX) fate when inactive. The Sex-1 protein directly represses xol-1 transcription by binding to its promoter. However, the active ligand for Sex-1 protein has not yet been identified. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, Sex-1 like receptors have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD).	191
132741	cd06943	NR_LBD_RXR_like	The ligand binding domain of the retinoid X receptor and Ultraspiracle, members of nuclear receptor superfamily. The ligand binding domain of the retinoid X receptor (RXR) and Ultraspiracle (USP): This family includes two evolutionary related nuclear receptors: retinoid X receptor (RXR) and Ultraspiracle (USP). RXR is a nuclear receptor in mammals and USP is its counterpart in invertebrates.  The native ligand of retinoid X receptor is 9-cis retinoic acid (RA). RXR functions as a DNA binding partner by forming heterodimers with other nuclear receptors including CAR, FXR, LXR, PPAR, PXR, RAR, TR, and VDR. RXRs can play different roles in these heterodimers. RXR acts either as a structural component of the heterodimer complex, required for DNA binding but not acting as a receptor, or as both a structural and a functional component of the heterodimer, allowing 9-cis RA to signal through the corresponding heterodimer. In addition, RXR can also form homodimers, functioning as a receptor for 9-cis RA, independently of other nuclear receptors. Ultraspiracle (USP) plays similar roles as DNA binding partner of other nuclear receptors in invertebrates. USP has no known high-affinity ligand and is thought to be a silent component in the heterodimeric complex with partner receptors. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, RXR and USP have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD).	207
132742	cd06944	NR_LBD_Ftz-F1_like	The ligand binding domain of FTZ-F1 like nuclear receptors. The ligand binding domain of FTZ-F1 like nuclear receptors: This nuclear receptor family includes at least three subgroups of receptors that function in embryo development and differentiation, and other processes. FTZ-F1 interacts with the cis-acting DNA motif of ftz gene, which required at several stages of development. Particularly, FTZ-F1 genes are strongly linked to steroid biosynthesis and sex-determination; LRH-1 is a regulator of bile-acid homeostasis, steroidogenesis, reverse cholesterol transport and the initial stages of embryonic development. SF-1 is an essential regulator of endocrine development and function and is considered a master regulator of reproduction; SF-1 functions cooperatively with other transcription factors to modulate gene expression. Phospholipids have been identified as potential ligand for LRH-1 and steroidogenic factor-1 (SF-1). However, the ligand for FTZ-F1 has not yet been identified. Most nuclear receptors function as homodimer or heterodimers. However, LRH-1 and SF-1 bind to DNA as a monomer. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, receptors in this family  have  a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD).	237
132743	cd06945	NR_LBD_Nurr1_like	The ligand binding domain of Nurr1 and related nuclear receptor proteins, members of nuclear receptor superfamily. The ligand binding domain of nuclear receptor Nurr1_like: This family of nuclear receptors, including Nurr1, Nerve growth factor-induced-B (NGFI-B) and DHR38 are involved in the embryo development. Nurr1 is a transcription factor that is expressed in the embryonic ventral midbrain and is critical for the development of dopamine (DA) neurons. Structural studies have shown that the ligand binding pocket of Nurr1 is filled by bulky hydrophobic residues, making it unable to bind to ligands. Therefore, it belongs to the class of orphan receptors. However, Nurr1 forms heterodimers with RXR and can promote signaling via its partner, RXR. NGFI-B is an early immediate gene product of embryo development that is rapidly produced in response to a variety of cellular signals including nerve growth factor. It is involved in T-cell-mediated apoptosis, as well as neuronal differentiation and function. NGFI-B regulates transcription by binding to a specific DNA target upstream of its target genes and regulating the rate of transcriptional initiation. Another group of receptor in this family is DHR38.  DHR38 is the Drosophila homolog to the vertebrate NGFI-B-type orphan receptor. It interacts with the USP component of the ecdysone receptor complex, suggesting that DHR38 might modulate ecdysone-triggered signals in the fly, in addition to the ECR/USP pathway. Nurr1_like proteins exhibit a modular structure that is characteristic for nuclear receptors; they have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD).	239
132744	cd06946	NR_LBD_ERR	The ligand binding domain of estrogen receptor-related nuclear receptors. The ligand binding domain of estrogen receptor-related receptors (ERRs): The family of estrogen receptor-related receptors (ERRs), a subfamily of nuclear receptors, is closely related to the estrogen receptor (ER) family, but it lacks the ability to bind estrogen.  ERRs can interfere with the classic ER-mediated estrogen signaling pathway, positively or negatively. ERRs  share target genes, co-regulators and promoters with the estrogen receptor (ER) family. There are three subtypes of ERRs: alpha, beta and gamma. ERRs bind at least two types of DNA sequence, the estrogen response element and another site, originally characterized as SF-1 (steroidogenic factor 1) response element. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, ERR has  a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD).	221
132745	cd06947	NR_LBD_GR_Like	Ligand binding domain of  nuclear hormone receptors:glucocorticoid receptor, mineralocorticoid receptor , progesterone receptor, and androgen receptor. The ligand binding domain of GR_like nuclear receptors: This family of NRs includes four distinct, but closely related nuclear hormone receptors: glucocorticoid receptor (GR), mineralocorticoid receptor (MR), progesterone receptor (PR), and androgen receptor (AR). These four receptors play key roles in some of the most fundamental physiological functions such as the stress response, metabolism, electrolyte homeostasis, immune function, growth, development, and reproduction. The NRs in this family use multiple signaling pathways and share similar functional mechanisms.  The dominant signaling pathway is via direct DNA binding and transcriptional regulation of target genes. Another mechanism is via protein-protein interactions, mainly with other transcription factors such as nuclear factor-kappaB and activator protein-1, to regulate gene expression patterns. Both pathways can up-regulate or down-regulate gene expression and require ligand activation of the receptor and recruitment of other cofactors such as chaperone proteins and coregulator proteins. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, GR, MR, PR, and AR share the same modular structure with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD).	246
132746	cd06948	NR_LBD_COUP-TF	Ligand binding domain of chicken ovalbumin upstream promoter transcription factors, a member of the nuclear receptor family. The ligand binding domain of chicken ovalbumin upstream promoter transcription factors (COUP-TFs): COUP-TFs are orphan members of the steroid/thyroid hormone receptor superfamily. They are expressed in many tissues and are involved in the regulation of several important biological processes, such as neurogenesis, organogenesis, cell fate determination, and metabolic homeostasis. In mammals two isoforms named COUP-TFI and COUP-TFII have been identified. Both genes show an exceptional homology and overlapping expression patterns, suggesting that they may serve redundant functions. Although COUP-TF was originally characterized as a transcriptional activator of the chicken ovalbumin gene, COUP-TFs are generally considered to be repressors of transcription for other nuclear hormone receptors, such as retinoic acid receptor (RAR), thyroid hormone receptor (TR), vitamin D receptor (VDR), peroxisome proliferator activated receptor (PPAR), and hepatocyte nuclear factor 4 (HNF4). Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, COUP-TFs have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD).	236
132747	cd06949	NR_LBD_ER	Ligand binding domain of Estrogen receptor, which are activated by the hormone 17beta-estradiol (estrogen). The ligand binding domain (LBD) of Estrogen receptor (ER): Estrogen receptor, a member of nuclear receptor superfamily,  is activated by the hormone estrogen. Estrogen regulates many physiological processes including reproduction, bone integrity, cardiovascular health, and behavior. The main mechanism of action of the estrogen receptor is as a transcription factor by binding to the estrogen response element of target genes upon activation by estrogen and then recruiting coactivator proteins which are responsible for the transcription of target genes. Additionally some ERs may associate with other membrane proteins and can be rapidly activated by exposure of cells to estrogen.  Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, ER has  a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). The C-terminal LBD also contains AF-2 activation motif, the dimerization motif, and part of the nuclear localization region. Estrogen receptor has been linked to aging, cancer, obesity and other diseases.	235
132748	cd06950	NR_LBD_Tlx_PNR_like	The ligand binding domain of Tailless-like proteins,  orphan nuclear receptors. The ligand binding domain of the photoreceptor cell-specific nuclear receptor (PNR)  like family: This family includes photoreceptor cell-specific nuclear receptor (PNR), Tailless (TLX), and related receptors. TLX is an orphan receptor that is expressed by neural stem/progenitor cells in the adult brain of the subventricular zone (SVZ) and the dentate gyrus (DG). It plays a key role in neural development by promoting cell cycle progression and preventing apoptosis in the developing brain. PNR is expressed only in the outer layer of retinal photoreceptor cells. It may be involved in the signaling pathway regulating photoreceptor differentiation and/or maintenance. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, TLX and PNR  have  a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD).	206
132749	cd06951	NR_LBD_Dax1_like	The ligand binding domain of DAX1 protein, a nuclear receptor lacking DNA binding domain. The ligand binding domain of DAX1-like proteins: This orphan nuclear receptor family includes  DAX1 (dosage-sensitive sex reversal adrenal hypoplasia congenita critical region on chromosome X gene 1) and the Small Heterodimer Partner (SHP). Both receptors have a typical ligand binding domain, but lack the DNA binding domain, typical to almost all of the nuclear receptors. They function as a transcriptional coregulator by directly interacting with other nuclear receptors. DAX1 and SHP can form heterodimers with each other, as well as with many other nuclear receptors. In addition, DAX1 can also form homodimers. DAX1 plays an important role in the normal development of several hormone-producing tissues.  SHP has shown to regulate a variety of target genes.	222
132750	cd06952	NR_LBD_TR2_like	The ligand binding domain of the orphan nuclear receptors TR4 and TR2. The ligand binding domain of the TR4 and TR2 (human testicular receptor 4 and 2):  TR4 and TR2 are orphan nuclear receptors. Several isoforms of TR4 and TR2 have been isolated in various tissues. TR2 is abundantly expressed in the androgen-sensitive prostate. TR4 transcripts are expressed in many tissues, including central nervous system, adrenal gland, spleen, thyroid gland, and prostate. The expression of TR2 is negatively regulated by androgen, retinoids, and radiation. The expression of both mouse TR2 and TR4 is up-regulated by neurocytokine ciliary neurotrophic factor (CNTF) in mouse. It has been shown that human TR2 binds to a wide spectrum of natural hormone response elements (HREs) with distinct affinities, suggesting that TR2 may cross-talk with other gene expression regulation systems. The genes responding to TR2 or TR4 include genes that are regulated by retinoic acid receptor, vitamin D receptor, and peroxisome proliferator-activated receptor. TR4/2 binds to HREs as a dimer. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, TR2-like receptors have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD).	222
132751	cd06953	NR_LBD_DHR4_like	The ligand binding domain of orphan nuclear receptor Ecdysone-induced receptor DHR4. The ligand binding domain of Ecdysone-induced receptor DHR4: Ecdysone-induced orphan receptor DHR4 is a member of the nuclear receptor family. DHR4 is expressed during the early Drosophila larval development and is induced by ecdysone. DHR4 coordinates growth and maturation in Drosophila by mediating endocrine response to the attainment of proper body size during larval development. Mutations in DHR4 result in shorter larval development which translates into smaller and lighter flies. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, DHR4  has  a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 	213
132752	cd06954	NR_LBD_LXR	The ligand binding domain of Liver X receptors, a family of nuclear receptors of ligand-activated transcription factors. The ligand binding domain of Liver X receptors: Liver X receptors (LXRs) belong to a family of nuclear receptors of ligand-activated transcription factors. LXRs operate as cholesterol sensors which protect from cholesterol overload by stimulating reverse cholesterol transport from peripheral tissues to the liver and its excretion in the bile. Oxidized cholesterol derivatives or oxysterols were identified as specific ligands for LXRs. Upon ligand binding a conformational change leads to recruitment of co-factors, which stimulates expression of target genes. Among the LXR target genes are several genes involved in cholesterol efflux from peripheral tissues such as the ATP-binding-cassette transporters ABCA1, ABCG1 and ApoE. There are two LXR isoforms in mammals, LXRalpha and LXRbeta. LXRalpha is expressed mainly in the liver, intestine, kidney, spleen, and adipose tissue, whereas LXRbeta is ubiquitously expressed at lower level. Both LXRalpha and LXRbeta function as heterodimers with the retinoid X receptor (RXR) which may be activated by either LXR ligands or 9-cis retinoic acid, a specific RXR ligand. The LXR/RXR complex binds to a liver X receptor response element (LXRE) in the promoter region of target genes. LXR has typical NR modular structure with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and the ligand binding domain (LBD) at the C-terminal.	236
143513	cd06955	NR_DBD_VDR	DNA-binding domain of vitamin D receptors (VDR) is composed of two C4-type zinc fingers. DNA-binding domain of vitamin D receptors (VDR) is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which coordinates a single zinc atom. VDR interacts with a VDR response element, a direct repeat of GGTTCA DNA site with 3 bp spacer upstream of the target gene, and modulates the rate of transcriptional initiation.  VDR is a member of the nuclear receptor (NR) superfamily that functions as classical endocrine receptors. VDR controls a wide range of biological activities including calcium metabolism, cell proliferation and differentiation, and immunomodulation. VDR is a high-affinity receptor for the biologically most active Vitamin D metabolite, 1alpha,25-dihydroxyvitamin D3 (1alpha,25(OH)2D3). The binding of the ligand to the receptor induces a conformational change of the ligand binding domain (LBD) with consequent dissociation of corepressors. Upon ligand binding, VDR forms a heterodimer with the retinoid X receptor (RXR) that binds to vitamin D response elements (VDREs), recruits coactivators. This leads to the expression of a large number of genes.  Approximately 200 human genes are considered to be primary targets of VDR and even more genes are regulated indirectly. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, VDR has a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD).	107
143514	cd06956	NR_DBD_RXR	DNA-binding domain of retinoid X receptor (RXR) is composed of two C4-type zinc fingers. DNA-binding domain of retinoid X receptor (RXR) is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. RXR functions as a DNA binding partner by forming heterodimers with other nuclear receptors including CAR, FXR, LXR, PPAR, PXR, RAR, TR, and VDR. All RXR heterodimers preferentially bind response elements composed of direct repeats of two AGGTCA sites with a 1-5 bp spacer.  RXRs can play different roles in these heterodimers. RXR  acts either as a structural component of the heterodimer complex, required for DNA binding but not acting as a receptor, or as both a structural and a functional component of the heterodimer, allowing 9-cis RA to signal through the corresponding heterodimer. In addition, RXR can also form homodimers, functioning as a receptor for 9-cis RA, independently of other nuclear receptors. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, RXR has a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD).	77
143515	cd06957	NR_DBD_PNR_like_2	DNA-binding domain of the photoreceptor cell-specific nuclear receptor (PNR) like is composed of two C4-type zinc fingers. The DNA-binding domain of the photoreceptor cell-specific nuclear receptor (PNR) nuclear receptor-like family is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which coordinates a single zinc atom. PNR interacts with specific DNA sites upstream of the target gene and modulates the rate of transcriptional initiation. This family includes nuclear receptor Tailless (TLX), photoreceptor cell-specific nuclear receptor (PNR) and related receptors. TLX is an orphan receptor that plays a key role in neural development by regulating cell cycle progression and exit of neural stem cells in the developing brain. PNR is expressed only in the outer layer of retinal photoreceptor cells. It may be involved in the signaling pathway regulating photoreceptor differentiation and/or maintenance. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, PNR-like receptors have a central well-conserved DNA-binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD).	82
143516	cd06958	NR_DBD_COUP_TF	DNA-binding domain of chicken ovalbumin upstream promoter transcription factors (COUP-TFs) is composed of two C4-type zinc fingers. DNA-binding domain of chicken ovalbumin upstream promoter transcription factors (COUP-TFs) is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. COUP-TFs are orphan members of the steroid/thyroid hormone receptor superfamily. They are expressed in many tissues and are involved in the regulation of several important biological processes, such as neurogenesis, organogenesis, cell fate determination, and metabolic homeostasis. COUP-TFs homodimerize or heterodimerize with retinoid X receptor (RXR) and a few other nuclear receptors and bind to a variety of response elements that are composed of imperfect AGGTCA direct or inverted repeats with various spacings. COUP-TFs are generally considered to be repressors of transcription for other nuclear hormone receptors such as retinoic acid receptor (RAR), thyroid hormone receptor (TR), vitamin D receptor (VDR), peroxisome proliferator activated receptor (PPAR), and hepatocyte nuclear factor 4 (HNF4). Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, COUP-TFs have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD).	73
143517	cd06959	NR_DBD_EcR_like	The DNA-binding domain of Ecdysone receptor (EcR) like nuclear receptor family is composed of two C4-type zinc fingers. The DNA-binding domain of Ecdysone receptor (EcR) like nuclear receptor family is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. EcR interacts with specific DNA sites upstream of the target gene and modulates the rate of transcriptional initiation. This family includes three types of nuclear receptors: Ecdysone receptor (EcR), Liver X receptor (LXR) and Farnesoid X receptor (FXR). The DNA binding activity is regulated by their corresponding ligands. The ligands for EcR are ecdysteroids; LXR is regulated by oxidized cholesterol derivatives or oxysterols; and bile acids control FXR's activities. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, EcR-like receptors have  a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD).	73
143518	cd06960	NR_DBD_HNF4A	DNA-binding domain of hepatocyte nuclear factor 4 (HNF4) is composed of two C4-type zinc fingers. DNA-binding domain of hepatocyte nuclear factor 4 (HNF4) is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. HNF4 interacts with a DNA site, composed of two direct repeats of AGTTCA with 1 bp spacer, which is upstream of target genes and modulates the rate of transcriptional initiation. HNF4 is a member of the nuclear receptor superfamily. HNF4 plays a key role in establishing and maintaining hepatocyte differentiation in the liver. It is also expressed in gut, kidney, and pancreatic beta cells. HNF4 was originally classified as an orphan receptor, but it was later found that HNF4 binds with very high affinity to a variety of fatty acids. However, unlike other nuclear receptors, the ligands do not act as a molecular switch for HNF4. They seem to constantly bind to the receptor, which is constitutively active as a transcription activator. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, HNF4 has a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD).	76
143519	cd06961	NR_DBD_TR	DNA-binding domain of thyroid hormone receptors (TRs) is composed of two C4-type zinc fingers. DNA-binding domain of thyroid hormone receptors (TRs) is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. TR interacts with the thyroid response element, which is a DNA site with direct repeats of the consensus sequence 5'-AGGTCA-3' separated by one to five base pairs, upstream of target genes and modulates the rate of transcriptional initiation. Thyroid hormone receptor (TR) mediates the actions of thyroid hormones, which play critical roles in growth, development, and homeostasis in mammals. They regulate overall metabolic rate, cholesterol and triglyceride levels, and heart rate, and affect mood. TRs are expressed from two separate genes (alpha and beta) in human and each gene generates two isoforms of the receptor through differential promoter usage or splicing. TRalpha functions in the heart to regulate heart rate and rhythm and TRbeta is active in the liver and other tissues. The unliganded TRs function as transcription repressors, by binding to thyroid hormone response elements (TRE) predominantly as homodimers, or as heterodimers with retinoid X-receptors (RXR), and being associated with a complex of proteins containing corepressor proteins. Ligand binding promotes corepressor dissociation and binding of a coactivator to activate transcription. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, TR has a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD).	85
143520	cd06962	NR_DBD_FXR	DNA-binding domain of Farnesoid X receptor (FXR) family is composed of two C4-type zinc fingers. DNA-binding domain of Farnesoid X receptor (FXR) family is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. FXR interacts with specific DNA sites upstream of the target gene and modulates the rate of transcriptional initiation.  FXR is a member of the nuclear receptor family of ligand activated transcription factors. Bile acids are endogenous ligands for FXRs. Upon binding of a ligand, FXR binds to FXR response element (FXRE), which is an inverted repeat of TGACCT spaced by one nucleotide, either as a monomer or as a heterodimer with retinoid X receptor (RXR), to regulate the expression of various genes involved in bile acid, lipid, and glucose metabolism. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, FXR has a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD).	84
143521	cd06963	NR_DBD_GR_like	The DNA binding domain of GR_like nuclear receptors is composed of two C4-type zinc fingers. The DNA binding domain of GR_like nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with specific DNA sites upstream of the target gene and modulates the rate of transcriptional initiation. This family of NRs includes four types of nuclear hormone receptors: glucocorticoid receptor (GR), mineralocorticoid receptor (MR), progesterone receptor (PR), and androgen receptor (AR). The receptors bind to common DNA elements containing a partial palindrome of the core sequence 5'-TGTTCT-3' with a 3 bp spacer. These four receptors regulate some of the most fundamental physiological functions such as the stress response, metabolism, electrolyte homeostasis, immune function, growth, development, and reproduction. The NRs in this family have high sequence homology and share similar functional mechanisms. The dominant mechanism of function is by direct DNA binding and transcriptional regulation of target genes. The GR, MR, PR, and AR exhibit the same modular structure. They have a central highly conserved DNA binding domain (DBD), a non-conserved N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD).	73
143522	cd06964	NR_DBD_RAR	DNA-binding domain of retinoic acid receptor (RAR) is composed of two C4-type zinc fingers. DNA-binding domain of retinoic acid receptor (RAR) is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. RAR interacts with specific DNA sites upstream of the target gene and modulates the rate of transcriptional initiation. RARs mediate the biological effect of retinoids, including both natural dietary vitamin A (retinol) metabolites and active synthetic analogs. Retinoids play key roles in a wide variety of essential biological processes, such as vertebrate embryonic morphogenesis and organogenesis, differentiation and apoptosis, and homeostasis. RAR functions as a heterodimer with retinoid X receptor by binding to specific RAR response elements (RAREs), which are composed of two direct repeats of the consensus sequence 5'-AGGTCA-3' separated by one to five base pairs and found in the promoter regions of retinoid target genes. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, retinoic acid receptors have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD).	85
143523	cd06965	NR_DBD_Ppar	DNA-binding domain of peroxisome proliferator-activated receptors (PPAR) is composed of two C4-type zinc fingers. DNA-binding domain of peroxisome proliferator-activated receptors (PPAR) is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. PPAR interacts with specific DNA sites upstream of the target gene and modulates the rate of transcriptional initiation. Peroxisome proliferator-activated receptors (PPARs) are members of the nuclear receptor superfamily of ligand-activated transcription factors. PPARs play important roles in regulating cellular differentiation, development and lipid metabolism. Activated PPAR forms a heterodimer with the retinoid X receptor (RXR) that binds to the hormone response elements, which are composed of two direct repeats of the consensus sequence 5'-AGGTCA-3' separated by one to five base pairs located upstream of the peroxisome proliferator responsive genes, and interacts with co-activators. Several essential fatty acids, oxidized lipids and prostaglandin J derivatives can bind and activate PPAR. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, PPAR has a central well conserved DNA binding domain (DBD), a variable N-terminal regulatory domain, a flexible hinge and a C-terminal ligand binding domain (LBD).	84
143524	cd06966	NR_DBD_CAR	DNA-binding domain of constitutive androstane receptor (CAR) is composed of two C4-type zinc fingers. DNA-binding domain (DBD) of constitutive androstane receptor (CAR) is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. CAR DBD interacts with CAR response element, a perfect repeat of two AGTTCA motifs with a 4 bp spacer upstream of the target gene, and modulates the rate of transcriptional initiation. The constitutive androstane receptor (CAR) is a ligand-regulated transcription factor that responds to a diverse array of chemically distinct ligands, including many endogenous compounds and clinical drugs. It functions as a heterodimer with RXR. The CAR/RXR heterodimer binds many common response elements in the promoter regions of a diverse set of target genes involved in the metabolism, transport, and ultimately, elimination of these molecules from the body. CAR is the closest mammalian relative of PXR; it is activated by some of the same ligands as PXR and regulates a subset of common genes. The sequence homology and functional similarity suggest that the CAR gene arose from a duplication of an ancestral PXR gene. Like other nuclear receptors, CAR has a central well conserved DNA binding domain, a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain.	94
143525	cd06967	NR_DBD_TR2_like	DNA-binding domain of the TR2 and TR4 (human testicular receptor 2 and 4) is composed of two C4-type zinc fingers. DNA-binding domain of the TR2 and TR4 (human testicular receptor 2 and 4) is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which coordinates a single zinc atom. TR2 and TR4 interact with specific DNA sites upstream of the target gene and modulate the rate of transcriptional initiation. TR4 and TR2 are orphan nuclear receptors; the physiological ligand is as yet unidentified. TR2 is abundantly expressed in the androgen-sensitive prostate. TR4 transcripts are expressed in many tissues, including central nervous system, adrenal gland, spleen, thyroid gland, and prostate. It has been shown that human TR2 binds to a wide spectrum of natural hormone response elements (HREs) with distinct affinities suggesting that TR2 may cross-talk with other gene expression regulation systems. The genes responding to TR2 or TR4 include genes that are regulated by retinoic acid receptor, vitamin D receptor, and peroxisome proliferator-activated receptor. TR4/2 binds to HREs as dimers. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, TR2-like receptors  have  a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD).	87
143526	cd06968	NR_DBD_ROR	DNA-binding domain of Retinoid-related orphan receptors (RORs) is composed of two C4-type zinc fingers. DNA-binding domain of Retinoid-related orphan receptors (RORs) is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which coordinates a single zinc atom. ROR interacts with specific DNA sites upstream of the target gene and modulates the rate of transcriptional initiation. RORs are key regulators of many physiological processes during embryonic development. RORs bind as monomers to specific ROR response elements (ROREs) consisting of the consensus core motif AGGTCA preceded by a 5-bp A/T-rich sequence. There are three subtypes of retinoid-related orphan receptors (RORs), alpha, beta, and gamma, which differ only in N-terminal sequence and are distributed in distinct tissues. RORalpha plays a key role in the development of the cerebellum particularly in the regulation of the maturation and survival of Purkinje cells. RORbeta expression is largely restricted to several regions of the brain, the retina, and pineal gland. RORgamma is essential for lymph node organogenesis. Recently, it has been suggested that cholesterol or a cholesterol derivative is the natural ligand of RORalpha. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, retinoid-related orphan receptors have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD).	95
143527	cd06969	NR_DBD_NGFI-B	DNA-binding domain of the orphan nuclear receptor, nerve growth factor-induced-B. DNA-binding domain (DBD) of the orphan nuclear receptor, nerve growth factor-induced-B (NGFI-B) is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. NGFI-B interacts with specific DNA sites upstream of the target gene and modulates the rate of transcriptional initiation. NGFI-B is a member of the nuclear-steroid receptor superfamily. NGFI-B is classified as an orphan receptor because no ligand has yet been identified. NGFI-B is an early immediate gene product of embryo development that is rapidly produced in response to a variety of cellular signals including nerve growth factor. It is involved in T-cell-mediated apoptosis, as well as neuronal differentiation and function. NGFI-B regulates transcription by binding to a specific DNA target upstream of its target genes and regulating the rate of transcriptional initiation. NGFI-B binds to the NGFI-B response element (NBRE) 5'-(A/T)AAAGGTCA as a monomer. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, NGFI-B has  a central well-conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD).	75
143528	cd06970	NR_DBD_PNR	DNA-binding domain of the photoreceptor cell-specific nuclear receptor (PNR) is composed of two C4-type zinc fingers. DNA-binding domain of the photoreceptor cell-specific nuclear receptor (PNR) is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. PNR interacts with specific DNA sites upstream of the target gene and modulates the rate of transcriptional initiation.  PNR is a member of the nuclear receptor superfamily of the ligand-activated transcription factors. PNR is expressed only in the outer layer of retinal photoreceptor cells. It may be involved in the signaling pathway regulating photoreceptor differentiation and/or maintenance. It most likely binds to DNA as a homodimer. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, PNR  has  a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD).	92
133477	cd06971	PgpA	Phosphatidylglycerophosphatase A; a bacterial membrane-associated enzyme involved in lipid metabolism. Phosphatidylglycerophosphatase A domain represents a family of bacterial membrane-associated enzymes involved in lipid metabolism. The prototype of this CD is a putative Phosphatidylglycerophosphatase A (PGPase A) from Listeria monocytogenes. PGPase A (EC: 3.1.3.27), encoded by the gene pgpA, specifically catalyzes the formation of phosphatidylglycerol from phosphatidyl glycerophosphate (PGP). It requires Mg2+ for activity and is inhibited by sulfhydryl agents and freezing/thawing. PGPase B, encoded by pgpB, which also acts on phosphatidic acid (PA) and lysophosphatidic acid (LPA), is not included in this family. Aside from PGPase A and B, evidence shows that there is another PGPase existing in E. coli. Thus, PGPase A is not essential for PGPase activity in E. coli.	143
132992	cd06974	TerD_like	Uncharacterized proteins involved in stress response, similar to tellurium resistance terD. Tellurium resistance terD like proteins. This family is composed of uncharacterized proteins involved in stress response, such as the tellurium resistance proteins, chemical-damaging agent resistance proteins, and general stress proteins from a variety of organisms. The tellurium resistance proteins are homologous terA,-D,-E,-F,-Z,-X gene products, which confer tellurium resistance mediated by plasmids. Currently, the biochemical mechanism of tellurium resistance remains unknown. The family also contains several ter gene homologues, YceC, YceD, YceE, for which there is no clear evidence for any involvement in the tellurium resistance. A putative cAMP-binding protein CABP1 shows significant similarity to the terD protein and is also included in this family.	162
380380	cd06975	cupin_BacB	Bacillus subtilis bacilysin and related proteins, cupin domain. Bacilysin (BacB, also known as AerE in Microcystis aeruginosa) is a non-ribosomally synthesized dipeptide antibiotic that is produced and excreted by certain strains of Bacillus subtilis. It is an oxidase that catalyzes the synthesis of 2-oxo-3-(4-oxocyclohexa-2,5-dienyl)propanoic acid, a precursor to L-anticapsin. Each bacilysin monomer has two tandem cupin domains. It is active against a wide range of bacteria and some fungi. The antimicrobial activity of bacilysin is antagonized by glucosamine and N-acetyl glucosamine, indicating that bacilysin interferes with glucosamine synthesis, and thus, with the synthesis of microbial cell walls. AerE is thought to be involved in the formation of the 2-carboxy-6-hydroxyoctahydroindole (Choi) moiety found on all aeruginosin tetrapeptides, based on gene knock-out experiments. It is encoded by the aerE gene of the aerABCDEF aeruginosin biosynthesis gene cluster in Microcystis aeruginosa.	93
380381	cd06976	cupin_MtlR-like_N	AraC/XylS family transcriptional regulators similar to MtlR, N-terminal cupin domain. MtlR is a Pseudomonas fluorescens protein that acts as a transcriptional regulator of the mannitol utilization genes.  It has an N-terminal cupin domain (represented by this alignment) and a C-terminal AraC/XylS family helix-turn-helix (HTH) DNA-binding domain. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold.	83
380382	cd06977	cupin_RhaR_RhaS-like_N	HTH-type transcriptional activator RhaR and RhaS and related proteins, N-terminal cupin domain. Members of this family contain an N-terminal cupin domain and a C-terminal AraC/XylS family helix-turn-helix (HTH) DNA-binding domain, including the HTH-type transcription activators RhaS and RhaR. RhaS and RhaR respond to the availability of L-rhamnose and activate transcription of the operons in the Escherichia coli L-rhamnose catabolic regulon. The E. coli RhaR protein activates expression of the rhaSR operon in the presence of its effector, L-rhamnose. The resulting RhaS protein (plus L-rhamnose) activates expression of the L-rhamnose catabolic operon rhaBAD as well as the transport operon rhaT. These proteins bind DNA as dimers, via their HTH motifs. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	147
380383	cd06978	cupin_EctC	L-ectoine synthase, cupin domain. Ectoine synthase (EctC; also known as L-ectoine synthase or N-acetyldiaminobutyrate dehydratase; EC 4.2.1.108) is a cupin-like bacterial protein that converts N'-acetyldiaminobutyric acid to ectoine, in the last step of the L-ectoine biosynthetic pathway, via a cyclo-condensation reaction and using iron as the cofactor. Ectoines are potent microbial stress protectants, primarily synthesized by bacteria but also found in a few obligate halophilic protists and archaea, based on the ectoine biosynthetic ectABC gene. In halophilic eubacteria, the osmolytic ectoines enable the organisms to adapt to a wide range of salt concentrations by adjusting the cytoplasmic solute pool to the osmolarity of the surrounding environment. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold.	111
380384	cd06979	cupin_RemF-like	Streptomyces resistomycificus RemF cyclase and related proteins, cupin domain. RemF cyclase is a manganese-containing polyketide cyclase present in bacteria that is involved in the biosynthesis of resistomycin, the aromatic pentacyclic metabolite in Streptomyces resistomycificus. Structure of this enzyme shows a cupin fold with a conserved "jelly roll-like" beta-barrel fold that forms a homodimer. It contains an unusual octahedral zinc-binding site in a large hydrophobic pocket that may represent the active site. The zinc ion, coordinated to four histidine side chains and two water molecules, could act as a Lewis acid in the aldol condensation reaction catalyzed by RemF, reminiscent of class II aldolases.	93
380385	cd06980	cupin_bxe_c0505	uncharacterized protein bxe_c0505, cupin domain. This family includes mostly bacterial proteins homologous to bxe_c0505, a Burkholderia xenovorans protein of unknown function. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	105
380386	cd06981	cupin_reut_a1446	Cupriavidus pinatubonensis reut_a1446 and related proteins, cupin domain. This family includes bacterial and some eukaryotic proteins homologous to reut_a1446, a Cupriavidus pinatubonensis protein of unknown function that may be related to mannose-6-phosphate isomerase. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	103
380387	cd06982	cupin_BauB-like	Pseudomonas aeruginosa BauB and related proteins, cupin domain. This family includes bacterial proteins homologous to beta-alanine degradation protein BauB from Pseudomonas aeruginosa, which is involved in the degradation of beta-alanine. Also included are Rhodopseudomonas palustris Rpa4178 and Bordetella pertussis Bp2299, which are both proteins with unknown function. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	91
380388	cd06983	cupin_dsy2733	Desulfitobacterium hafniense dsy2733 and related proteins, cupin domain. This family includes bacterial proteins homologous to dsy2733, a Desulfitobacterium hafniense protein of unknown function. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	81
380389	cd06984	cupin_Moth_1897	uncharacterized Methanocaldococcus jannaschii Moth_1897 and related proteins, cupin domain. This family includes archaeal and bacterial proteins homologous to Moth_1897, a Methanocaldococcus jannaschii protein of unknown function. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	83
380390	cd06985	cupin_BF4112	Bacteroides fragilis BF4112 and related proteins, cupin domain. This family includes archaeal and bacterial proteins homologous to BF4112, a Bacteroides fragilis protein of unknown function. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	101
380391	cd06986	cupin_MmsR-like_N	AraC/XylS family transcriptional regulators similar to MmsR, N-terminal cupin domain. This family contains bacterial proteins containing an AraC/XylS family helix-turn-helix (HTH) DNA-binding domain C-terminal to a cupin domain, and may be possible transcriptional regulators. Included is MmsR, a bacterial transcriptional regulator thought to positively regulate the expression of the mmsAB operon. The mmsAB operon contains two structural genes involved in valine metabolism: mmsA which encodes methylmalonate-semialdehyde dehydrogenase, and mmsB which encodes 3-hydroxyisobutyrate dehydrogenase. The cupin domain of members of this subfamily does not contain a metal binding site.	84
380392	cd06987	cupin_MAE_RS03005	Microcystis aeruginosa MAE_RS03005 and related proteins, cupin domain. This family includes bacterial and some eukaryotic proteins homologous to MAE_RS03005, a Microcystis aeruginosa protein of unknown function. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	122
380393	cd06988	cupin_DddK	Dimethylsulfoniopropionate lyase DddK and related proteins, cupin domain. This family includes mostly bacterial proteins homologous to dimethylsulfoniopropionate lyase DddK from marine bacterium Pelagibacter. DddK cleaves dimethylsulfoniopropionate (DMSP), the organic osmolyte and antioxidant produced in marine environments, and yields acrylate and the climate-active gas dimethyl sulfide (DMS). DddK contains a double-stranded beta-helical motif which utilizes various divalent metal ions as cofactors for catalytic activity; however, nickel, an abundant metal ion in marine environments, confers the highest DMSP lyase activity. Also included in this family is Plu4264, a Photorhabdus luminescens manganese-containing cupin shown to have similar metal binding site to TM1287 decarboxylase, but two very different substrate binding pockets. The Plu4264 binding pocket shows a cavity and substrate entry point more than twice as large as and more hydrophobic than TM1287, suggesting that Plu4264 accepts a substrate that is significantly larger than that of TM1287, a putative oxalate decarboxylase. Thus, the function of Plu4264 could be similar to that of TM1287 but with a larger, less charged substrate. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold.	76
380394	cd06989	cupin_DRT102	Arabidopsis thaliana DRT102 and related proteins, cupin domain. This family includes bacterial and eukaryotic proteins homologous to DNA-damage-repair/toleration protein DRT102 found in Arabidopsis thaliana. DRT102 may be involved in DNA repair from UV damage. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	97
380395	cd06990	cupin_DUF861	domain of unknown function DUF 861, cupin domain. This family contains proteins which seem to be specific to bacteria and some fungi. The function of this family is unknown but contains a cupin domain without a metal binding site. Cupins are a functionally diverse superfamily originally discovered based on the highly conserved motif found in germin and germin-like proteins. This conserved motif forms a beta-barrel fold found in all of the cupins, giving rise to the name cupin (cupa is the Latin term for small barrel).	101
380396	cd06991	cupin_TcmJ-like	TcmJ monooxygenase and related proteins, cupin domain. This family includes TcmJ, a subunit of the tetracenomycin (TCM) polyketide synthase (PKS) type II complex in Streptomyces glaucescens. TcmJ is a quinone-forming monooxygenase involved in the modification of aromatic polyketides synthesized by polyketide synthases of types II and III. Orthologs of TcmJ include the Streptomyces BenD (benastatin biosynthetic pathway), the Streptomyces olivaceus ElmJ (polyketide antibiotic elloramycin biosynthetic pathway), the Actinomadura hibisca PdmL (pradimicin biosynthetic pathway), the Streptomyces cyaneus CurC (curamycin biosynthetic pathway), the Streptomyces rishiriensis Lct30 (lactonamycin biosynthetic pathway), and the Streptomyces WhiE II (spore pigment polyketide biosynthetic pathway). Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	105
380397	cd06992	cupin_GDO-like_C	gentisate 1,2-dioxygenase, 1-hydroxy-2-naphthoate dioxygenase, and salicylate 1,2-dioxygenase bicupin aromatic ring-cleaving dioxygenases, C-terminal cupin domain. This model represents the C-terminal cupin domains of three closely related bicupin aromatic ring-cleaving dioxygenases: gentisate 1,2-dioxygenase (GDO), salicylate 1,2-dioxygenase (SDO), and 1-hydroxy-2-naphthoate dioxygenase (NDO). GDO catalyzes the cleavage of the gentisate (2,5-dihydroxybenzoate) aromatic ring, a key step in the gentisate degradation pathway allowing soil bacteria to utilize 2,5-xylenol, 3,5-xylenol, and m-cresol as sole carbon and energy sources. NDO catalyzes the cleavage of 1-hydroxy-2-naphthoate as part of the bacterial phenanthrene degradation pathway. SDO is a ring cleavage dioxygenase from Pseudaminobacter salicylatoxidans that oxidizes salicylate to 2-oxohepta-3,5-dienedioic acid via a novel ring fission mechanism. SDO differs from other known GDO's and NDO's in its unique ability to oxidatively cleave many different salicylate, gentisate, and 1-hydroxy-2- naphthoate substrates with high catalytic efficiency. The active site of this enzyme is located in the N-terminal domain but could be influenced by changes in the C-terminal domain, which lacks the strictly conserved metal-binding residues found in other cupin domains and is thought to be an inactive vestigial remnant.	99
380398	cd06993	cupin_CENP-C_C	centromere-binding protein CENP-C, C-terminal cupin domain. This family includes centromeric protein C (CENP-C; known as Mif2 in budding yeast and centromere protein 3 or cnp3 in fission yeast), which is an inner kinetochore centromere (CEN)-binding protein found in fungi and metazoans. CENP-C is a component of the CENP-A nucleosome-associated complex (NAC) that plays a central role in assembly of kinetochore proteins, mitotic progression and chromosome segregation. CENP-C localizes to the inner kinetochore plates adjacent to the centromeric DNA and is known to have DNA-binding ability. CENP-C, along with CENP-H, provides a platform onto which the mitotic kinetochore is assembled and thus plays a critical role in the structuring of kinetochore chromatin. The cupin domain at the C-terminus forms a homodimer which is part of an enhanceosome-like structure that nucleates kinetochore assembly in budding yeast. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	77
380399	cd06995	cupin_YkgD-like_N	AraC/XylS family transcriptional regulators similar to Escherichia coli YkgD, N-terminal cupin domain. This family contains mostly bacterial proteins containing an AraC/XylS family helix-turn-helix (HTH) DNA-binding domain C-terminal to a cupin domain, and may be possible transcriptional regulators. Included in this family is YkgD, an uncharacterized Escherichia coli protein thought to be a transcriptional regulator. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	114
380400	cd06996	cupin_Lmo2851-like_N	AraC/XylS family transcriptional regulators similar to Listeria monocytogenes Lmo2851 protein, N-terminal cupin domain. This family contains bacterial proteins containing an AraC/XylS family helix-turn-helix (HTH) DNA-binding domain C-terminal to a cupin domain, and may be possible transcriptional regulators. Included is Listeria monocytogenes Lmo2851 protein, whose function is unknown. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	87
380401	cd06997	cupin_MelR-like_N	AraC/XylS family transcriptional regulators similar to Escherichia coli MelR, N-terminal cupin domain. This family contains bacterial proteins containing an AraC/XylS family helix-turn-helix (HTH) DNA-binding domain C-terminal to a cupin domain, and may be possible transcriptional regulators, including Escherichia coli MelR, a transcription factor that controls melibiose utilization. MelR is encoded by the melR gene and is essential for melibiose-dependent triggering of the melAB operon that encodes products needed for melibiose catabolism and transport. Expression of melR is autoregulated by MelR, which represses the melR promoter by binding to a target that overlaps the transcript start. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	78
380402	cd06998	cupin_D-LI-like	sugar isomerase such as lyxose isomerase, cupin domain. This family includes D-lyxose isomerase (D-LI; EC 5.3.1.15) homologous to YdaE from the sigma B regulon of Bacillus subtilis and to pathogenic Escherichia coli O157 z5688 D-lyxose isomerase (EcSI or Z5688), both having highly similar active sites. YdaE may have a synergistic role with ydaD, an NAD(P)-dependent alcohol dehydrogenase, in the adaptation to environment stresses, while EcSI has D-lyxose/D-mannose activity. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold.	100
380403	cd06999	cupin_HpaA-like_N	AraC/XylS family transcriptional regulators similar to HpaA, N-terminal cupin domain. Members of this family contain an N-terminal cupin domain and a C-terminal AraC/XylS family helix-turn-helix (HTH) DNA-binding domain, similar to Escherichia coli 4-hydroxyphenylacetate catabolism regulatory protein HpaA (also known as 4HPA). HpaA is encoded by the hpaA gene which is located upstream of hpaBC. It is activated by 4-HPA, 3-HPA and phenylacetate, and represents a member of the AraC/XylS family of regulators that recognizes aromatic effectors. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	98
380404	cd07000	cupin_HGO_N	homogentisate 1,2-dioxygenase and related proteins, N-terminal cupin domain. This family includes homogentisate 1,2-dioxygenase (also known as homogentisate oxygenase, homogentisic acid oxidase, homogentisicase, HGO, HGD, HGDO, or HmgA; EC 1.13.11.5), which is involved in the metabolic degradation of phenylalanine and tyrosine. It catalyzes the crucial aromatic ring opening reaction, utilizing nonheme Fe2+ to incorporate both atoms of molecular oxygen into homogentisate (2,5-dihydroxyphenylacetate) to yield 4-maleylacetoacetate as part of the homogentisate pathway. HGO deficiency caused by critical mutations and polymorphic sites, causes the metabolic disease alkaptonuria (AKU), a rare disorder of autosomal recessive inheritance. Homogentisate accumulation causes insoluble ochronotic pigments to deposit in connective tissues, resulting in degenerative arthritis. These enzymes are found in prokaryotes, eukaryotes, and archaea. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	109
380405	cd07001	cupin_YbfI-like_N	AraC/XylS family transcriptional regulators similar to Bacillus subtilis YbfI, N-terminal cupin domain. This family contains bacterial proteins containing an AraC/XylS family helix-turn-helix (HTH) DNA-binding domain C-terminal to a cupin domain, and may be possible transcriptional regulators, including YbfI, an uncharacterized Bacillus subtilis protein. In Pseudomonas putida, this protein is thought to regulate the expression of phenylserine aldolase. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	76
380406	cd07002	cupin_SznF-like_C	Streptomyces achromogenes SznF and related proteins, C-terminal cupin domain. This family includes bacterial proteins similar to Streptomyces achromogenes SznF, containing an N-terminal helical region that mediates dimerization, a central heme oxygenase domain, and a C-terminal cupin domain. SznF is a metalloenzyme that catalyzes an oxidative rearrangement of the guanidine group of N(omega)-methyl-L-arginine to generate an N-nitrosourea product, during the biosynthesis of streptozotocin, an N-nitrosourea natural product and an approved cancer chemotherapeutic. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold generally capable of homodimerization. However, in SznF, the cupin domain is not involved in dimerization.	96
380407	cd07003	cupin_YobQ-like_N	Bacillus subtilis YobQ and related proteins, N-terminal cupin domain. This family includes bacterial proteins homologous to Bacillus subtilis YobQ and Photobacterium leiognathi LumQ, both uncharacterized proteins thought to be DNA-binding proteins that may function as AraC/XylS family transcriptional regulators. YobQ has an N-terminal cupin beta barrel domain (represented by this alignment model) and a C-terminal AraC/XylS family helix-turn-helix (HTH) DNA-binding domain. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	66
380408	cd07005	cupin_WbuC-like	Escherichia coli WbuC and related proteins, cupin domain. This family includes bacterial proteins homologous to WbuC, an Escherichia coli protein of unknown function with a cupin beta barrel fold.  Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	114
380409	cd07006	cupin_XcTcmJ-like	Xanthomonas campestris XcTcmJ and related proteins, cupin domain. This family includes bacterial and archaeal proteins homologous to plant pathogen Xanthomonas campestris tetracenomycin polyketide synthesis protein XcTcmJ, a protein encoded by the tcmJ gene. XcTcmJ is annotated as being involved in tetracenomycin polyketide biosynthesis. Also included is Xc5357 from a different strain of X. campestris. Structure studies show that binding of zinc induces conformational changes and serves a functional role in this cupin protein. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	89
380410	cd07007	cupin_CapF-like_C	Staphylococcus aureus CapF and related proteins, C-terminal cupin domain. This family contains cupin domains of proteins homologous to Staphylococcus aureus CapF (also known as WbjC in Pseudomonas aeruginosa and FnlB in Escherichia coli). CapF is a bifunctional metalloenzyme produced by certain pathogenic bacteria and is essential in the biosynthetic path of capsular polysaccharide (CP), a mucous layer on the surface of the bacterium that facilitates immune evasion and infection. Thus, CapF is an antibacterial/therapeutic target. In S. aureus, enzymes CapE, CapF and CapG catalyze the sequential transformation of UDP-D-GlcNAc into the CP precursor UDP-L-FucNAc via the intermediate compound UDP-N-acetyl-L-talosamine (UDP-L-TalNAc). CapF consists of two domains; the C-terminal cupin domain catalyzes the epimerization of the compound produced by the upstream enzyme CapE, and the N-terminal short-chain dehydrogenase/reductase (SDR) domain catalyzes the reduction of the compound afforded by the cupin domain, requiring one equivalent of NADPH. The cupin domain is crucial for catalyzing the first chemical reaction, and also important for the stability of the enzyme. Similarly, in P. aeruginosa, WbjC, WbjB and WbjD enzymes synthesize UDP-N-acetyl-L-fucosamine, a precursor of the lipopolysaccharide component L-fucosamine. The cupin domains contain a conserved "jelly roll-like" beta-barrel fold.	109
380411	cd07008	cupin_yp_001338853-like	Klebsiella pneumoniae yp_001338853.1 and related proteins, cupin domain. This family includes bacterial proteins homologous to Klebsiella pneumoniae yp_001338853.1, an uncharacterized conserved protein with double-stranded beta-helix domain. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	101
380412	cd07009	cupin_BLL0285-like	Bradyrhizobium japonicum BLL0285 and related proteins, cupin domain. This family includes bacterial proteins homologous to BLL0285, a Bradyrhizobium japonicum protein of unknown function. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	81
380413	cd07010	cupin_PMI_type_I_N_bac	Phosphomannose isomerase in bacteria and archaea, N-terminal cupin domain. This subfamily contains type I phosphomannose isomerase (PMI; E.C. 5.3.1.8; also known as mannose-6-phosphate isomerase) found in many bacteria (e.g. Bacillus subtilis) and archaea. PMI catalyzes the reversible isomerization of fructose-6-phosphate (F6P) and mannose-6-phosphate (M6P), the first committed step in the synthesis of mannosylated glycoproteins. The active site, located within the N-terminal jelly roll-like beta-barrel cupin fold, contains a single essential zinc atom and forms a deep, open cavity large enough to contain M6P or F6P. PMI type I also has a C-terminal beta-barrel fold which has diverged considerably from the N-terminal domain and is not included here. This subfamily does not contain an alpha helical domain that exists in eukaryotic and some prokaryotic PMIs. F6P is a substrate for glycolysis and gluconeogenesis, while M6P is a substrate for production of activated mannose donor guanosine 5'-diphosphate D-mannose, an important precursor of mannosylated biomolecules such as glycoproteins, bacterial exopolysaccharides and fungal cell wall components. PMI is also essential for survival, virulence and possibly pathogenicity of some bacteria and protozoan parasites, as well as for cell wall integrity of certain yeasts. Thus, PMI is a potential target against fungal infections causing serious illness or death.	173
380414	cd07011	cupin_PMI_type_I_N	type I phosphomannose isomerase in eukaryotes and bacteria, N-terminal cupin domain. This subfamily contains type I phosphomannose isomerase (PMI; E.C. 5.3.1.8; also known as mannose-6-phosphate isomerase) found in eukaryotes and some bacteria such as Salmonella enterica. PMI catalyzes the reversible isomerization of fructose-6-phosphate (F6P) and mannose-6-phosphate (M6P), the first committed step in the synthesis of mannosylated glycoproteins. The active site, located within the N-terminal jelly roll-like beta-barrel cupin fold, contains a single essential zinc atom and forms a deep, open cavity large enough to contain M6P or F6P. PMI type I also has a C-terminal beta-barrel fold which has diverged considerably from the N-terminal domain and is not included here. This subfamily contains an alpha helical domain that is found in eukaryotic and some prokaryotic PMIs but is not present in their archaeal counterparts. F6P is a substrate for glycolysis and gluconeogenesis, while M6P is a substrate for production of activated mannose donor guanosine 5'-diphosphate D-mannose, an important precursor of mannosylated biomolecules such as glycoproteins, bacterial exopolysaccharides and fungal cell wall components. PMI is also essential for survival, virulence and possibly pathogenicity of some bacteria and protozoan parasites, as well as for cell wall integrity of certain yeasts. Thus, PMI is a potential target against fungal infections causing serious illness or death.	247
270234	cd07012	PBP2_Bug_TTT	Bug (Bordetella uptake gene) protein family of periplasmic solute-binding receptors; contains the type 2 periplasmic binding fold. The Bug (Bordetella uptake gene) protein family is a large family of periplasmic solute-binding (PBP) proteins present in a number of bacterial species, but mainly in proteobacteria. In eubacteria, at least three families of periplasmic binding-protein dependent transporters are known: the ATP-binding cassette (ABC) transporters, the tripartite ATP-independent periplasmic transporters, and the tripartite tricarboxylate transporters (TTT). Bug proteins are the PBP components of the TTT. Their extensive expansion in proteobacteria indicates a large functional diversity. The best studied examples are Bordetella pertussis BugD, which is an aspartic acid transporter, and BugE, which is a glutamate transporter.	291
132924	cd07013	S14_ClpP	Caseinolytic protease (ClpP) is an ATP-dependent, highly conserved serine protease. Clp protease (caseinolytic protease; ClpP; Peptidase S14) is a highly conserved serine protease present throughout bacteria and eukaryota, but seems to be absent in archaea, mollicutes and some fungi. Clp proteases are involved in a number of cellular processes such as degradation of misfolded proteins, regulation of short-lived proteins and housekeeping removal of dysfunctional proteins. Additionally, they are implicated in the control of cell growth, targeting DNA-binding protein from starved cells. ClpP has also been linked to the tight regulation of virulence genes in the pathogens Listeria monocytogenes and Salmonella typhimurium. This enzyme belongs to the family of ATP-dependent proteases; the functional Clp protease is comprised of two components: a proteolytic component and one of several regulatory ATPase components, both of which are required for effective levels of protease activity in the presence of ATP, although the proteolytic subunit alone does possess some catalytic activity. The active site consists of the triad Ser, His and Asp; some members have lost all of these active site residues and are therefore inactive, while others may have one or two large insertions. ClpP seems to prefer hydrophobic or non-polar residues at P1 or P1' positions in its substrate. The protease exists as a tetradecamer made up of two heptameric rings stacked back-to-back such that the catalytic triad of each subunit is located at the interface between three monomers, thus making oligomerization essential for function.	162
132925	cd07014	S49_SppA	Signal peptide peptidase A. Signal peptide peptidase A (SppA; Peptidase S49; Protease IV): SppA is an intramembrane enzyme found in all three domains of life and is involved in the cleavage of signal peptides after their removal from the precursor proteins by signal peptidases. Unlike the eukaryotic functional homologs that are proposed to be aspartic proteases, site-directed mutagenesis and sequence analysis have shown these bacterial, archaeal and thylakoid SppAs to be ClpP-like serine proteases. The predicted active site serine for members in this family occurs in a transmembrane domain, cleaving peptide bonds in the plane of the lipid bilayer. Mutagenesis studies also suggest that the catalytic center comprises a Ser-Lys dyad (both residues absolutely conserved within bacteria, chloroplast and mitochondrial signal peptidase family members) and not the usual Ser-His-Asp catalytic triad found in the majority of serine proteases. In addition to the carboxyl-terminal protease domain that is conserved in all the S49 family members, the E. coli SppA contains an amino-terminal domain (sometimes referred to as 67K type). Others, including sohB peptidase, protein C, protein 1510-N and archaeal signal peptide peptidase, do not contain the amino-terminal domain (sometimes referred to as 36K type). Interestingly, the single membrane spanning E. coli SppA carries out catalysis using a Ser-Lys dyad with the serine located in the conserved carboxy-terminal protease domain and the lysine in the non-conserved amino-terminal domain. This family also contains homologs that either have been found experimentally to be without peptidase activity, or lack amino acid residues that are believed to be essential for the catalytic activity of peptidases.	177
132926	cd07015	Clp_protease_NfeD	Nodulation formation efficiency D (NfeD) is a membrane-bound ClpP-class protease. Nodulation formation efficiency D (NfeD; stomatin operon partner protein, STOPP; DUF107) is a member of membrane-anchored ClpP-class proteases. Currently, more than 300 NfeD homologs have been identified - all of which are bacterial or archaeal in origin. The majority of these genomes have been shown to possess operons containing a homologous NfeD/stomatin gene pair, causing NfeD to be previously named STOPP (stomatin operon partner protein). NfeD homologs can be divided into two groups: long and short forms. Long-form homologs have a putative ClpP-class serine protease domain while the short-form homologs do not. Downstream from the ClpP-class domain is the so-called NfeD or DUF107 domain. The N-terminal region of the NfeD homolog PH1510 (1510-N or PH1510-N) from Pyrococcus horikoshii has been shown to possess serine protease activity and has a Ser-Lys catalytic dyad, preferentially cleaving hydrophobic substrates. The difference in oligomeric form and catalytic residues between 1510-N (forming a dimer) and ClpP (forming a tetradecamer) shows a possible functional difference: 1510-N is likely to have a regulatory function while ClpP is involved in protein quality control.	172
132927	cd07016	S14_ClpP_1	Caseinolytic protease (ClpP) is an ATP-dependent, highly conserved serine protease. Clp protease (caseinolytic protease; ClpP; Peptidase S14) is a highly conserved serine protease present throughout bacteria and eukaryota, but seems to be absent in archaea, mollicutes and some fungi. This subfamily only contains bacterial sequences. Clp proteases are involved in a number of cellular processes such as degradation of misfolded proteins, regulation of short-lived proteins and housekeeping removal of dysfunctional proteins. They are also implicated in the control of cell growth, targeting DNA-binding protein from starved cells. ClpP has also been linked to the tight regulation of virulence genes in the pathogens Listeria monocytogenes and Salmonella typhimurium. This enzyme belongs to the family of ATP-dependent proteases; the functional Clp protease is comprised of two components: a proteolytic component and one of several regulatory ATPase components, both of which are required for effective levels of protease activity in the presence of ATP, although the proteolytic subunit alone does possess some catalytic activity. The active site consists of the triad Ser, His and Asp; some members have lost all of these active site residues and are therefore inactive, while others may have one or two large insertions. ClpP seems to prefer hydrophobic or non-polar residues at P1 or P1' positions in its substrate. The protease exists as a tetradecamer made up of two heptameric rings stacked back-to-back such that the catalytic triad of each subunit is located at the interface between three monomers, thus making oligomerization essential for function.	160
132928	cd07017	S14_ClpP_2	Caseinolytic protease (ClpP) is an ATP-dependent, highly conserved serine protease. Clp protease (caseinolytic protease; ClpP; Peptidase S14) is a highly conserved serine protease present throughout bacteria and eukaryota, but seems to be absent in archaea, mollicutes and some fungi. Clp proteases are involved in a number of cellular processes such as degradation of misfolded proteins, regulation of short-lived proteins and housekeeping removal of dysfunctional proteins. They are also implicated in the control of cell growth, targeting DNA-binding protein from starved cells. ClpP has also been linked to the tight regulation of virulence genes in the pathogens Listeria monocytogenes and Salmonella typhimurium. This enzyme belongs to the family of ATP-dependent proteases; the functional Clp protease is comprised of two components: a proteolytic component and one of several regulatory ATPase components, both of which are required for effective levels of protease activity in the presence of ATP, although the proteolytic subunit alone does possess some catalytic activity. The active site consists of the triad Ser, His and Asp; some members have lost all of these active site residues and are therefore inactive, while others may have one or two large insertions. ClpP seems to prefer hydrophobic or non-polar residues at P1 or P1' positions in its substrate. The protease exists as a tetradecamer made up of two heptameric rings stacked back-to-back such that the catalytic triad of each subunit is located at the interface between three monomers, thus making oligomerization essential for function.	171
132929	cd07018	S49_SppA_67K_type	Signal peptide peptidase A (SppA) 67K type, a serine protease, has catalytic Ser-Lys dyad. Signal peptide peptidase A (SppA; Peptidase S49; Protease IV) 67K type: SppA is found in all three domains of life and is involved in the cleavage of signal peptides after their removal from the precursor proteins by signal peptidases. Members in this subfamily contain an amino-terminal domain in addition to the carboxyl-terminal protease domain that is conserved in all the S49 family members (sometimes referred to as 67K type), similar to E. coli and Arabidopsis thaliana SppA peptidases. Unlike the eukaryotic functional homologs that are proposed to be aspartic proteases, site-directed mutagenesis and sequence analysis have shown that members in this subfamily, mostly bacterial, are serine proteases. The predicted active site serine for members in this family occurs in a transmembrane domain. Mutagenesis studies also suggest that the catalytic center comprises a Ser-Lys dyad (both residues absolutely conserved within bacteria, chloroplast and mitochondrial signal peptidase family members) and not the usual Ser-His-Asp catalytic triad found in the majority of serine proteases. Interestingly, the single membrane spanning E. coli SppA carries out catalysis using a Ser-Lys dyad with the serine located in the conserved carboxy-terminal protease domain and the lysine in the non-conserved amino-terminal domain.	222
132930	cd07019	S49_SppA_1	Signal peptide peptidase A (SppA), a serine protease, has catalytic Ser-Lys dyad. Signal peptide peptidase A (SppA; Peptidase S49; Protease IV): SppAs in this subfamily are found in all three domains of life and are involved in the cleavage of signal peptides after their removal from the precursor proteins by signal peptidases. Site-directed mutagenesis and sequence analysis have shown these bacterial, archaeal and thylakoid SppAs to be serine proteases. The predicted active site serine for members in this family occurs in a transmembrane domain. Mutagenesis studies also suggest that the catalytic center comprises a Ser-Lys dyad (both residues absolutely conserved within bacteria, chloroplast and mitochondrial signal peptidase family members) and not the usual Ser-His-Asp catalytic triad found in the majority of serine proteases. In addition to the carboxyl-terminal protease domain that is conserved in all the S49 family members, the E. coli SppA contains an amino-terminal domain, similar to Arabidopsis thaliana SppA1 peptidase. Others, including sohB peptidase, protein C and archaeal signal peptide peptidase, do not contain the amino-terminal domain. Interestingly, the single membrane spanning E. coli SppA carries out catalysis using a Ser-Lys dyad with the serine located in the conserved carboxy-terminal protease domain and the lysine in the non-conserved amino-terminal domain.	211
132931	cd07020	Clp_protease_NfeD_1	Nodulation formation efficiency D (NfeD) is a membrane-bound ClpP-class protease. Nodulation formation efficiency D (NfeD; stomatin operon partner protein, STOPP; DUF107) is a member of membrane-anchored ClpP-class proteases. Currently, more than 300 NfeD homologs have been identified - all of which are bacterial or archaeal in origin. The majority of these genomes have been shown to possess operons containing a homologous NfeD/stomatin gene pair, causing NfeD to be previously named STOPP (stomatin operon partner protein). NfeD homologs can be divided into two groups: long and short forms. Long-form homologs have a putative ClpP-class serine protease domain while the short-form homologs do not. Downstream from the ClpP-class domain is the so-called NfeD or DUF107 domain. The N-terminal region of the NfeD homolog PH1510 (1510-N or PH1510-N) from Pyrococcus horikoshii has been shown to possess serine protease activity and has a Ser-Lys catalytic dyad, preferentially cleaving hydrophobic substrates. The difference in oligomeric form and catalytic residues between 1510-N (forming a dimer) and ClpP (forming a tetradecamer) shows a possible functional difference: 1510-N is likely to have a regulatory function while ClpP is involved in protein quality control.	187
132932	cd07021	Clp_protease_NfeD_like	Nodulation formation efficiency D (NfeD) is a membrane-bound ClpP-class protease. Nodulation formation efficiency D (NfeD; stomatin operon partner protein, STOPP; DUF107) is a member of membrane-anchored ClpP-class proteases. Currently, more than 300 NfeD homologs have been identified - all of which are bacterial or archaeal in origin. The majority of these genomes have been shown to possess operons containing a homologous NfeD/stomatin gene pair, causing NfeD to be previously named STOPP (stomatin operon partner protein). NfeD homologs can be divided into two groups: long and short forms. Long-form homologs have a putative ClpP-class serine protease domain while the short-form homologs do not. Downstream from the ClpP-class domain is the so-called NfeD or DUF107 domain. The N-terminal region of the NfeD homolog PH1510 (1510-N or PH1510-N) from Pyrococcus horikoshii has been shown to possess serine protease activity and has a Ser-Lys catalytic dyad, preferentially cleaving hydrophobic substrates. The difference in oligomeric form and catalytic residues between 1510-N (forming a dimer) and ClpP (forming a tetradecamer) shows a possible functional difference: 1510-N is likely to have a regulatory function while ClpP is involved in protein quality control.	178
132933	cd07022	S49_Sppa_36K_type	Signal peptide peptidase A (SppA) 36K type, a serine protease, has catalytic Ser-Lys dyad. Signal peptide peptidase A (SppA; Peptidase S49; Protease IV) 36K type: SppA is found in all three domains of life and is involved in the cleavage of signal peptides after their removal from the precursor proteins by signal peptidases. Members in this subfamily are all bacterial and include sohB peptidase and protein C. These are sometimes referred to as 36K type since they contain only one domain, unlike E. coli SppA that also contains an amino-terminal domain. Site-directed mutagenesis and sequence analysis have shown these SppAs to be serine proteases. The predicted active site serine for members in this family occurs in a transmembrane domain. Mutagenesis studies also suggest that the catalytic center comprises a Ser-Lys dyad and not the usual Ser-His-Asp catalytic triad found in the majority of serine proteases.	214
132934	cd07023	S49_Sppa_N_C	Signal peptide peptidase A (SppA), a serine protease, has catalytic Ser-Lys dyad. Signal peptide peptidase A (SppA; Peptidase S49; Protease IV): SppA is found in all three domains of life and is involved in the cleavage of signal peptides after their removal from the precursor proteins by signal peptidases. This subfamily contains members with either a single domain (sometimes referred to as 36K type), such as sohB peptidase, protein C and archaeal signal peptide peptidase, or an amino-terminal domain in addition to the carboxyl-terminal protease domain that is conserved in all the S49 family members (sometimes referred to as 67K type), similar to E. coli and Arabidopsis thaliana SppA peptidases. Site-directed mutagenesis and sequence analysis have shown these SppAs to be serine proteases. The predicted active site serine for members in this family occurs in a transmembrane domain. Mutagenesis studies also suggest that the catalytic center comprises a Ser-Lys dyad and not the usual Ser-His-Asp catalytic triad found in the majority of serine proteases. Interestingly, the single membrane spanning E. coli SppA carries out catalysis using a Ser-Lys dyad with the serine located in the conserved carboxy-terminal protease domain and the lysine in the non-conserved amino-terminal domain.	208
132882	cd07025	Peptidase_S66	LD-Carboxypeptidase, a serine protease, includes microcin C7 self immunity protein. LD-carboxypeptidase (Muramoyltetrapeptide carboxypeptidase; EC 3.4.17.13; Merops family S66; initially described as Carboxypeptidase II) family also includes the microcin C7 self-immunity protein (MccF) as well as uncharacterized proteins including hypothetical proteins. LD-carboxypeptidase hydrolyzes the amide bond that links the dibasic amino acids to C-terminal D-amino acids. The physiological substrates of LD-carboxypeptidase are tetrapeptide fragments (such as UDP-MurNAc-tetrapeptides) that are produced when bacterial cell walls are degraded; they contain an L-configured residue (L-lysine or meso-diaminopimelic acid residue) as the penultimate residue and D-alanine as the ultimate residue. A possible role of LD-carboxypeptidase is in peptidoglycan recycling whereby the resulting tripeptide (precursor for murein synthesis) can be reconverted into peptidoglycan by attachment of preformed D-Ala-D-Ala dipeptides. Some enzymes possessing LD-carboxypeptidase activity also act as LD-transpeptidase by replacing the terminal D-Ala with another D-amino acid. MccF contributes to self-immunity towards microcin C7 (MccC7), a ribosomally encoded peptide antibiotic that contains a phosphoramidate linkage to adenosine monophosphate at its C-terminus. Its possible biological role is to protect producer cells from exogenous microcin re-entering after it has been exported. It is suggested that MccF is involved in microcin degradation or sequestration in the periplasm.	282
197305	cd07026	Ribosomal_L20	Ribosomal protein L20. The ribosomal protein family L20 contains members from eubacteria, as well as their mitochondrial and plastid homologs. L20 is an assembly protein, required for the first in-vitro reconstitution step of the 50S ribosomal subunit, but does not seem to be essential for ribosome activity. L20 has been shown to partially unfold in the absence of RNA, in regions corresponding to the RNA-binding sites. L20 represses the translation of its own mRNA via specific binding to two distinct mRNA sites, in a manner similar to the L20 interaction with 23S ribosomal RNA. 	106
132905	cd07027	RNAP_RPB11_like	RPB11 subunit of RNA polymerase. The eukaryotic RPB11 subunit of RNA polymerase (RNAP), as well as its archaeal (L subunit) and bacterial (alpha subunit) counterparts, is involved in the assembly of RNAP subunits. RNAP is a large multi-subunit complex responsible for the synthesis of RNA. It is the principal enzyme of the transcription process, and is a final target in many regulatory pathways that control gene expression in all living cells. At least three distinct RNAP complexes are found in eukaryotic nuclei:  RNAP I, RNAP II, and RNAP III, for the synthesis of ribosomal RNA precursor, mRNA precursor, and 5S and tRNA, respectively. A single distinct RNAP complex is found in prokaryotes and archaea, which may be responsible for the synthesis of all RNAs. The assembly of the two largest eukaryotic RNAP subunits that provide most of the enzyme's catalytic functions depends on the presence of RPB3/RPB11 heterodimer subunits. This is also true for the archaeal (D/L subunits) and bacterial (alpha subunit) counterparts.	83
132906	cd07028	RNAP_RPB3_like	RPB3 subunit of RNA polymerase. The eukaryotic RPB3 subunit of RNA polymerase (RNAP), as well as its archaeal (D subunit) and bacterial (alpha subunit) counterparts, is involved in the assembly of RNAP subunits. RNAP is a large multi-subunit complex responsible for the synthesis of RNA. It is the principal enzyme of the transcription process, and is a final target in many regulatory pathways that control gene expression in all living cells. At least three distinct RNAP complexes are found in eukaryotic nuclei:  RNAP I, RNAP II, and RNAP III, for the synthesis of ribosomal RNA precursor, mRNA precursor, and 5S and tRNA, respectively. A single distinct RNAP complex is found in prokaryotes and archaea, which may be responsible for the synthesis of all RNAs. The RPB3 subunit is similar to the bacterial RNAP alpha subunit in that it contains two subdomains: one subdomain is similar to the eukaryotic Rpb11/AC19/archaeal L subunit which is involved in dimerization; and the other is an inserted beta sheet subdomain. The assembly of the two largest eukaryotic RNAP subunits that provide most of the enzyme's catalytic functions depends on the presence of RPB3/RPB11 heterodimer subunits. This is also true for the archaeal (D/L subunits) and bacterial (alpha subunit) counterparts.	212
132907	cd07029	RNAP_I_III_AC19	AC19 subunit of Eukaryotic RNA polymerase (RNAP) I and RNAP III. The eukaryotic AC19 subunit of RNA polymerase (RNAP) I and RNAP III is involved in the assembly of RNAP subunits. RNAP is a large multi-subunit complex responsible for the synthesis of RNA. It is the principal enzyme of the transcription process, and is a final target in many regulatory pathways that control gene expression in all living cells. At least three distinct RNAP complexes are found in eukaryotic nuclei:  RNAP I, RNAP II, and RNAP III. RNAP I is responsible for the synthesis of ribosomal RNA precursor, while RNAP III functions in the synthesis of 5S and tRNA. The AC19 subunit is the equivalent of the RPB11 subunit of RNAP II. The RPB11 subunit heterodimerizes with the RPB3 subunit, and together with RPB10 and RPB12, anchors the two largest subunits, RPB1 and RPB2, and stabilizes their association. The homology of AC19 to RPB11 suggests a similar function. The AC19 subunit is likely to associate with the RPB3 counterpart, AC40, to form a heterodimer, which stabilizes the association of the two largest subunits of RNAP I and RNAP III.	85
132908	cd07030	RNAP_D	D subunit of Archaeal RNA polymerase. The D subunit of archaeal RNA polymerase (RNAP) is involved in the assembly of RNAP subunits. RNAP is a large multi-subunit complex responsible for the synthesis of RNA. It is the principal enzyme of the transcription process, and is a final target in many regulatory pathways that control gene expression in all living cells. A single distinct RNAP complex is found in archaea, which may be responsible for the synthesis of all RNAs. The archaeal RNAP harbors homologues of all eukaryotic RNAP II subunits with two exceptions (RPB8 and RPB9). The 12 archaeal subunits are designated by letters and can be divided into three functional groups that are engaged in: (I) catalysis (A'/A", B'/B" or B); (II) assembly (L, N, D and P); and (III) auxiliary functions (F, E, H and K). The D subunit is equivalent to the RPB3 subunit of eukaryotic RNAP II. It contains two subdomains: one subdomain is similar to the eukaryotic Rpb11/AC19/archaeal L subunit which is involved in dimerization, and the other is an inserted beta sheet subdomain. The assembly of the two largest archaeal RNAP subunits that provide most of the enzyme's catalytic functions depends on the presence of the archaeal D/L heterodimer.	259
132909	cd07031	RNAP_II_RPB3	RPB3 subunit of Eukaryotic RNA polymerase II. The eukaryotic RPB3 subunit of RNA polymerase (RNAP) II is involved in the assembly of RNAP subunits. RNAP is a large multi-subunit complex responsible for the synthesis of RNA. It is the principal enzyme of the transcription process, and is a final target in many regulatory pathways that control gene expression in all living cells. At least three distinct RNAP complexes are found in eukaryotic nuclei: RNAP I, RNAP II, and RNAP III. RNAP II is responsible for the synthesis of mRNA precursor. The RPB3 subunit is similar to the bacterial RNAP alpha subunit in that it contains two subdomains: one subdomain is similar to the eukaryotic Rpb11/AC19/archaeal L subunit which is involved in dimerization, and the other is an inserted beta sheet subdomain. The RPB3 subunit heterodimerizes with the RPB11 subunit, and together with RPB10 and RPB12, anchors the two largest subunits, RPB1 and RPB2, and stabilizes their association.	265
132910	cd07032	RNAP_I_II_AC40	AC40 subunit of Eukaryotic RNA polymerase (RNAP) I and RNAP III. The eukaryotic AC40 subunit of RNA polymerase (RNAP) I and RNAP III is involved in the assembly of RNAP subunits. RNAP is a large multi-subunit complex responsible for the synthesis of RNA. It is the principal enzyme of the transcription process, and is a final target in many regulatory pathways that control gene expression in all living cells. At least three distinct RNAP complexes are found in eukaryotic nuclei: RNAP I, RNAP II, and RNAP III. RNAP I is responsible for the synthesis of ribosomal RNA precursor, while RNAP III functions in the synthesis of 5S and tRNA. The AC40 subunit is the equivalent of the RPB3 subunit of RNAP II. The RPB3 subunit is similar to the bacterial RNAP alpha subunit in that it contains two subdomains: one subdomain is similar to the eukaryotic Rpb11/AC19/archaeal L subunit which is involved in dimerization; and the other is an inserted beta sheet subdomain. The RPB3 subunit heterodimerizes with the RPB11 subunit, and together with RPB10 and RPB12, anchors the two largest subunits, RPB1 and RPB2, and stabilizes their association. The homology of AC40 to RPB3 suggests a similar function. The AC40 subunit is likely to associate with the RPB11 counterpart, AC19, to form a heterodimer, which stabilizes the association of the two largest subunits of RNAP I and RNAP III.	291
132916	cd07033	TPP_PYR_DXS_TK_like	Pyrimidine (PYR) binding domain of 1-deoxy-D-xylulose-5-phosphate synthase (DXS), transketolase (TK), and related proteins. Thiamine pyrophosphate (TPP) family, pyrimidine (PYR) binding domain of 1-deoxy-D-xylulose-5-phosphate synthase (DXS), transketolase (TK), and the beta subunits of the E1 component of the human pyruvate dehydrogenase complex (E1-PDHc), subfamily. The PYR domain is found in many key metabolic enzymes which use TPP (also known as thiamine diphosphate) as a cofactor. TPP binds in the cleft formed by a PYR domain and a PP domain. The PYR domain, binds the aminopyrimidine ring of TPP, the PP domain binds the diphosphate residue. A polar interaction between the conserved glutamate of the PYR domain and the N1' of the TPP aminopyrimidine ring is shared by most TPP-dependent enzymes, and participates in the activation of TPP. The PYR and PP domains have a common fold, but do not share strong sequence conservation. The PP domain is not included in this sub-family. Like many TPP-dependent enzymes DXS and TK are homodimers having a PYR and a PP domain on the same subunit. TK has two active sites per dimer which lie between PYR and PP domains of different subunits. For DXS each active site is located at the interface of a PYR and a PP domain from the same subunit. E1-PDHc is an alpha2beta2 dimer-of-heterodimers having two active sites but having the PYR and PP domains arranged on separate subunits, the PYR domains on the beta subunits, the PP domains on the alpha subunits. DXS is a regulatory enzyme of the mevalonate-independent pathway involved in terpenoid biosynthesis, it catalyzes a transketolase-type condensation of pyruvate with D-glyceraldehyde-3-phosphate to form 1-deoxy-D-xylulose-5-phosphate (DXP) and carbon dioxide. TK catalyzes the transfer of a two-carbon unit from ketose phosphates to aldose phosphates. In heterotrophic organisms, TK provides a link between glycolysis and the pentose phosphate pathway and provides precursors for nucleotide, aromatic amino acid and vitamin biosynthesis. TK also plays a central role in the Calvin cycle in plants. PDHc catalyzes the irreversible oxidative decarboxylation of pyruvate to produce acetyl-CoA in the bridging step between glycolysis and the citric acid cycle. This subfamily includes the beta subunits of the E1 component of the acetoin dehydrogenase complex (ADC) and the branched chain alpha-keto acid dehydrogenase/2-oxoisovalerate dehydrogenase complex (BCADC). ADC participates in the breakdown of acetoin. BCADC catalyzes the oxidative decarboxylation of 4-methyl-2-oxopentanoate, 3-methyl-2-oxopentanoate and 3-methyl-2-oxobutanoate during the breakdown of branched chain amino acids.	156
132917	cd07034	TPP_PYR_PFOR_IOR-alpha_like	Pyrimidine (PYR) binding domain of pyruvate ferredoxin oxidoreductase (PFOR), indolepyruvate ferredoxin oxidoreductase alpha subunit (IOR-alpha), and related proteins. Thiamine pyrophosphate (TPP family), pyrimidine (PYR) binding domain, of pyruvate ferredoxin oxidoreductase (PFOR), indolepyruvate ferredoxin oxidoreductase (IOR) alpha subunit (IOR-alpha), and related proteins, subfamily. The PYR domain is found in many key metabolic enzymes which use TPP (also known as thiamine diphosphate) as a cofactor. TPP binds in the cleft formed by a PYR domain and a PP domain. The PYR domain, binds the aminopyrimidine ring of TPP, the PP domain binds the diphosphate residue. A polar interaction between the conserved glutamate of the PYR domain and the N1' of the TPP aminopyrimidine ring is shared by most TPP-dependent enzymes, and participates in the activation of TPP. The PYR and PP domains have a common fold, but do not share strong sequence conservation. The PP domain is not included in this sub-family. Most TPP-dependent enzymes have the PYR and PP domains on the same subunit although these domains can be alternatively arranged in the primary structure. TPP-dependent enzymes are multisubunit proteins, the smallest catalytic unit being a dimer-of-active sites. For many of these enzymes the active sites lie between PP and PYR domains on different subunits. However, for the homodimeric enzyme Desulfovibrio africanus pyruvate:ferredoxin oxidoreductase (PFOR), each active site lies at the interface of the PYR and PP domains from the same subunit. This subfamily includes proteins characterized as pyruvate NADP+ oxidoreductase (PNO). PFOR and PNO catalyze the oxidative decarboxylation of pyruvate to form acetyl-CoA, a crucial step in many metabolic pathways. The facultative anaerobic mitochondrion of the photosynthetic protist Euglena gracilis oxidizes pyruvate with PNO. IOR catalyzes the oxidative decarboxylation of arylpyruvates, such as indolepyruvate or phenylpyruvate.	160
132918	cd07035	TPP_PYR_POX_like	Pyrimidine (PYR) binding domain of POX and related proteins. Thiamine pyrophosphate (TPP family), pyrimidine (PYR) binding domain of pyruvate oxidase (POX) and related proteins subfamily. The PYR domain is found in many key metabolic enzymes which use TPP (also known as thiamine diphosphate) as a cofactor. TPP binds in the cleft formed by a PYR domain and a PP domain. The PYR domain, binds the aminopyrimidine ring of TPP, the PP domain binds the diphosphate residue. A polar interaction between the conserved glutamate of the PYR domain and the N1' of the TPP aminopyrimidine ring is shared by most TPP-dependent enzymes, and participates in the activation of TPP. For glyoxylate carboligase, which belongs to this subfamily, but lacks this conserved glutamate, the rate of the initial TPP activation step is reduced but the ensuing steps of the enzymic reaction proceed efficiently. The PYR and PP domains have a common fold, but do not share strong sequence conservation. The PP domain is not included in this sub-family. Most TPP-dependent enzymes have the PYR and PP domains on the same subunit although these domains can be alternatively arranged in the primary structure. TPP-dependent enzymes are multisubunit proteins, the smallest catalytic unit being a dimer-of-active sites, for many the active sites lie between PP and PYR domains on different subunits. POX decarboxylates pyruvate, producing hydrogen peroxide and the energy-storage metabolite acetylphosphate. This subfamily includes pyruvate decarboxylase (PDC) and indolepyruvate decarboxylase (IPDC). PDC catalyzes the conversion of pyruvate to acetaldehyde and CO2 in alcoholic fermentation. IPDC plays a role in the indole-3-pyruvic acid (IPA) pathway in plants and various plant-associated bacteria, it catalyzes the decarboxylation of IPA to IAA. This subfamily also includes the large catalytic subunit of acetohydroxyacid synthase (AHAS). AHAS catalyzes the condensation of two molecules of pyruvate to give the acetohydroxyacid, 2-acetolactate, a precursor of the branched chain amino acids, valine and leucine. AHAS also catalyzes the condensation of pyruvate and 2-ketobutyrate to form 2-aceto-2-hydroxybutyrate in isoleucine biosynthesis. Methanococcus jannaschii sulfopyruvate decarboxylase (MjComDE) and phosphonopyruvate decarboxylase (PpyrDc) also belong to this subfamily. PpyrDc is a homotrimeric enzyme having the PP and PYR domains tandemly arranged on the same subunit. It functions in the biosynthesis of C-P compounds such as bialaphos tripeptide in Streptomyces hygroscopicus. MjComDE is a dodecamer having the PYR and PP domains on different subunits, it has six alpha (PYR/ComD) subunits and six beta (PP/ComE) subunits. MjComDE catalyzes the decarboxylation of sulfopyruvic acid to sulfoacetaldehyde in the coenzyme M pathway.	155
132919	cd07036	TPP_PYR_E1-PDHc-beta_like	Pyrimidine (PYR) binding domain of the beta subunits of the E1 components of human pyruvate dehydrogenase complex (E1-PDHc) and related proteins. Thiamine pyrophosphate (TPP) family, pyrimidine (PYR) binding domain of the beta subunits of the E1 components of: human pyruvate dehydrogenase complex (E1-PDHc), the acetoin dehydrogenase complex (ADC), and the branched chain alpha-keto acid dehydrogenase/2-oxoisovalerate dehydrogenase complex (BCADC), subfamily. The PYR domain is found in many key metabolic enzymes which use TPP (also known as thiamine diphosphate) as a cofactor. TPP binds in the cleft formed by a PYR domain and a PP domain. The PYR domain, binds the aminopyrimidine ring of TPP, the PP domain binds the diphosphate residue. A polar interaction between the conserved glutamate of the PYR domain and the N1' of the TPP aminopyrimidine ring is shared by most TPP-dependent enzymes, and participates in the activation of TPP. The PYR and PP domains have a common fold, but do not share strong sequence conservation. The PP domain is not included in this sub-family. E1-PDHc is an alpha2beta2 dimer-of-heterodimers having two active sites lying between PYR and PP domains of separate subunits, the PYR domains are arranged on the beta subunit, the PP domains on the alpha subunits. PDHc catalyzes the irreversible oxidative decarboxylation of pyruvate to produce acetyl-CoA in the bridging step between glycolysis and the citric acid cycle. ADC participates in the breakdown of acetoin. BCADC catalyzes the oxidative decarboxylation of 4-methyl-2-oxopentanoate, 3-methyl-2-oxopentanoate and 3-methyl-2-oxobutanoate during the breakdown of branched chain amino acids.	167
132920	cd07037	TPP_PYR_MenD	Pyrimidine (PYR) binding domain of 2-succinyl-5-enolpyruvyl-6-hydroxy-3-cyclohexadiene-1-carboxylate synthase (MenD) and related proteins. Thiamine pyrophosphate (TPP family), pyrimidine (PYR) binding domain of 2-succinyl-5-enolpyruvyl-6-hydroxy-3-cyclohexadiene-1-carboxylate (SEPHCHC) synthase (MenD) subfamily. The PYR domain is found in many key metabolic enzymes which use TPP (also known as thiamine diphosphate) as a cofactor. TPP binds in the cleft formed by a PYR domain and a PP domain. The PYR domain, binds the aminopyrimidine ring of TPP, the PP domain binds the diphosphate residue. The PYR and PP domains have a common fold, but do not share strong sequence conservation. The PP domain is not included in this sub-family. Most TPP-dependent enzymes have the PYR and PP domains on the same subunit although these domains can be alternatively arranged in the primary structure. TPP-dependent enzymes are multisubunit proteins, the smallest catalytic unit being a dimer-of-active sites. Escherichia coli MenD (EcMenD) is a homotetramer (dimer-of-homodimers), having two active sites per homodimer lying between PYR and PP domains of different subunits. EcMenD catalyzes a Stetter-like conjugate addition of alpha-ketoglutarate to isochorismate, leading to the formation of SEPHCHC and carbon dioxide, this addition is the first committed step in the biosynthesis of vitamin K2 (menaquinone).	162
132921	cd07038	TPP_PYR_PDC_IPDC_like	Pyrimidine (PYR) binding domain of pyruvate decarboxylase (PDC), indolepyruvate decarboxylase (IPDC) and related proteins. Thiamine pyrophosphate (TPP family), pyrimidine (PYR) binding domain of  pyruvate decarboxylase (PDC) and indolepyruvate decarboxylase (IPDC) subfamily. The PYR domain is found in many key metabolic enzymes which use TPP (also known as thiamine diphosphate) as a cofactor. TPP binds in the cleft formed by a PYR domain and a PP domain. The PYR domain, binds the aminopyrimidine ring of TPP, the PP domain binds the diphosphate residue. The PYR and PP domains have a common fold, but do not share strong sequence conservation. The PP domain is not included in this sub-family. Most TPP-dependent enzymes have the PYR and PP domains on the same subunit although these domains can be alternatively arranged in the primary structure. TPP-dependent enzymes are multisubunit proteins, the smallest catalytic unit being a dimer-of-active sites, for many the active sites lie between PP and PYR domains on different subunits. PDC catalyzes the conversion of pyruvate to acetaldehyde and CO2 in alcoholic fermentation. IPDC plays a role in the indole-3-pyruvic acid (IPA) pathway in plants and various plant-associated bacteria, it catalyzes the decarboxylation of IPA to IAA. Also belonging to this group is Mycobacterium tuberculosis alpha-keto acid decarboxylase (MtKDC) which participates in amino acid degradation via the Ehrlich pathway, and Lactococcus lactis branched-chain keto acid decarboxylase (KdcA) an enzyme identified as being involved in cheese ripening, which exhibits a very broad substrate range in the decarboxylation and carboligation reactions.	162
132922	cd07039	TPP_PYR_POX	Pyrimidine (PYR) binding domain of POX. Thiamine pyrophosphate (TPP family), pyrimidine (PYR) binding domain of pyruvate oxidase (POX) subfamily. The PYR domain is found in many key metabolic enzymes which use TPP (also known as thiamine diphosphate) as a cofactor. TPP binds in the cleft formed by a PYR domain and a PP domain. The PYR domain, binds the aminopyrimidine ring of TPP, the PP domain binds the diphosphate residue. The PYR and PP domains have a common fold, but do not share strong sequence conservation. The PP domain is not included in this sub-family. Most TPP-dependent enzymes have the PYR and PP domains on the same subunit although these domains can be alternatively arranged in the primary structure. TPP-dependent enzymes are multisubunit proteins, the smallest catalytic unit being a dimer-of-active sites. Lactobacillus plantarum POX is a homotetramer (dimer-of-homodimers), having two active sites per homodimer lying between PYR and PP domains of different subunits. POX decarboxylates pyruvate, producing hydrogen peroxide and the energy-storage metabolite acetylphosphate.	164
132716	cd07040	HP	Histidine phosphatase domain found in a functionally diverse set of proteins, mostly phosphatases; contains a His residue which is phosphorylated during the reaction. Catalytic domain of a functionally diverse set of proteins, most of which are phosphatases. The conserved catalytic core of this domain contains a His residue which is phosphorylated in the reaction. This set of proteins includes cofactor-dependent and cofactor-independent phosphoglycerate mutases (dPGM, and BPGM respectively), fructose-2,6-bisphosphatase (F26BPase), Sts-1, SixA, histidine acid phosphatases, phytases, and related proteins. Functions include roles in metabolism, signaling, or regulation, for example F26BPase affects glycolysis and gluconeogenesis through controlling the concentration of F26BP; BPGM controls the concentration of 2,3-BPG (the main allosteric effector of hemoglobin in human blood cells); human Sts-1 is a T-cell regulator; Escherichia coli SixA participates in the ArcB-dependent His-to-Asp phosphorelay signaling system; phytases scavenge phosphate from extracellular sources. Deficiency and mutation in many of the human members result in disease, for example erythrocyte BPGM deficiency is a disease associated with a decrease in the concentration of 2,3-BPG. Clinical applications include the use of prostatic acid phosphatase (PAP) as a serum marker for prostate cancer. Agricultural applications include the addition of phytases to animal feed.	153
132912	cd07041	STAS_RsbR_RsbS_like	Sulphate Transporter and Anti-Sigma factor antagonist domain of the "stressosome" complex proteins RsbS and RsbR, regulators of the bacterial stress activated alternative sigma factor sigma-B by phosphorylation. The STAS (Sulphate Transporter and Anti-Sigma factor antagonist) domain of proteins related to RsbS and RsbR which are part of the "stressosome" complex that plays an important role in the regulation of the bacterial stress activated alternative sigma factor sigma-B. During stress conditions RsbS and RsbR are phosphorylated which leads to the release of RsbT, an activator of the RsbU phosphatase, which in turn activates RsbV which leads to the release and activation of sigma factor B. RsbS is a single domain protein (STAS domain), while RsbR-like proteins have a well-conserved C-terminal STAS domain and a variable N-terminal domain. The STAS domain is also found in the C-terminal region of sulphate transporters and anti-anti-sigma factors.	109
132913	cd07042	STAS_SulP_like_sulfate_transporter	Sulphate Transporter and Anti-Sigma factor antagonist domain of SulP-like sulfate transporters, plays a role in the function and regulation of the transport activity, proposed general NTP binding function. The SulP family is a large and diverse family of anion transporters, with members from eubacteria, plants, fungi, and mammals. They contain 10 to 14 transmembrane helices which form the catalytic core of the protein and a C-terminal extension, the STAS (Sulphate Transporter and AntiSigma factor antagonist) domain which plays a role in the function and regulation of the transport activity. The STAS domain is found in the C-terminal region of sulphate transporters and bacterial anti-sigma factor antagonists. It has been suggested that this domain may have a general NTP binding function.	107
132914	cd07043	STAS_anti-anti-sigma_factors	Sulphate Transporter and Anti-Sigma factor antagonist domain of anti-anti-sigma factors, key regulators of anti-sigma factors by phosphorylation. Anti-anti-sigma factors play an important role in the regulation of several sigma factors and their corresponding anti-sigma factors. Upon dephosphorylation they bind the anti-sigma factor and induce the release of the sigma factor from the anti-sigma factor. In a feedback mechanism the anti-anti-sigma factor can be inactivated via phosphorylation by the anti-sigma factor. Well studied examples from Bacillus subtilis are SpoIIAA (regulating sigmaF and sigmaC which play an important role in sporulation) and RsbV (regulating sigmaB involved in the general stress response). The STAS domain is also found in the C-terminal region of sulphate transporters and stressosomes.	99
132871	cd07044	CofD_YvcK	Family of CofD-like proteins and proteins related to YvcK. CofD is a 2-phospho-L-lactate transferase that catalyzes the last step in the biosynthesis of coenzyme F(420)-0 (F(420) without polyglutamate) by transferring the lactyl phosphate moiety of lactyl(2)diphospho-(5')guanosine (LPPG) to 7,8-didemethyl-8-hydroxy-5-deazariboflavin ribitol (F0). F420 is a hydride carrier, important for energy metabolism of methanogenic archaea, as well as for the biosynthesis of other natural products, like tetracycline in Streptomyces. F420 and some of its precursors are also utilized as cofactors for enzymes, like DNA photolyase in Mycobacterium tuberculosis. YvcK from Bacillus subtilis is a member of a family of mostly uncharacterized proteins and has been proposed to play a role in carbon metabolism, since its function is essential for growth on intermediates of the Krebs cycle and pentose phosphate pathway.  Both families appear to have a conserved phosphate binding site, but have different substrate binding residues conserved within each family.	309
132885	cd07045	BMC_CcmK_like	Carbon dioxide concentrating mechanism K (CcmK)-like proteins, Bacterial Micro-Compartment (BMC) domain. Bacterial micro-compartments are primitive protein-based organelles that sequester specific metabolic pathways in bacterial cells. The prototypical bacterial microcompartment is the carboxysome shell, a bacterial polyhedral organelle which increases the efficiency of CO2 fixation by encapsulating RuBisCO and carbonic anhydrase. They can be divided into two types: alpha-type carboxysomes (alpha-cyanobacteria and proteobacteria) and beta-type carboxysomes (beta-cyanobacteria).  Potential functional differences between the two types are not yet fully understood. In addition to these proteins there are several homologous shell proteins including those found in pdu organelles involved in coenzyme B12-dependent degradation of 1,2-propanediol and eut organelles involved in the cobalamin-dependent degradation of ethanolamine. Structural evidence shows that several carboxysome shell proteins and their homologs (Csos1A, CcmK1,2,4, and PduU) exist as hexamers which might further assemble into extended, tightly packed layers hypothesized to represent the flat facets of the polyhedral organelles outer shell. Although it has been suggested that other homologous proteins in this family might also form hexamers and play similar functional roles in the construction of their corresponding organelle outer shells at present no experimental evidence directly supports this view.	84
132886	cd07046	BMC_PduU-EutS	1,2-propanediol utilization protein U (PduU)/ethanolamine utilization protein S (EutS), Bacterial Micro-Compartment (BMC) domain. PduU encapsulates several related enzymes within a shell composed of a few thousand protein subunits.  PduU exists as a hexamer which might further assemble into the flat facets of the polyhedral outer shell of the pdu organelle. This proteinaceous noncarboxysome microcompartment is involved in coenzyme B12-dependent degradation of 1,2-propanediol. The core of PduU is related to the typical BMC domain and its natural oligomeric state is a cyclic hexamer. Unlike other typical BMC domain proteins, the 3D topology of PduU reveals a circular permuted variation on the typical BMC fold which leads to several unique features. The exact functions related to those unique features are still not clear. Another difference is the presence of a deep cavity on one side of the hexamer as well as an intermolecular six-stranded beta barrel that seems to block the central pore that is present in other BMC domain proteins.  EutS proteins included in this CD are sequence homologs of PduU. They are encoded within eut operon and may be required for the formation of the outer shell of bacterial eut polyhedral organelles which are involved in the cobalamin-dependent degradation of ethanolamine. Although it has been suggested that EutS might also form hexamers and play similar functional roles in the construction of the eut organelle outer shell at present no experimental evidence directly supports this view.	110
132887	cd07047	BMC_PduB_repeat1	1,2-propanediol utilization protein B (PduB), Bacterial Micro-Compartment (BMC) domain repeat 1. PduB proteins are homologs of the carboxysome shell protein. They are encoded within the pdu operon and might be required for the formation of the outer shell of the bacterial pdu polyhedral organelles involved in coenzyme B12-dependent degradation of 1,2-propanediol. Although it has been suggested that PduB might form hexamers and further assemble into the flat facets of the polyhedral outer shell of pdu organelles at present no experimental evidence directly supports this view. PduB proteins contain two tandem BMC domain repeats. This CD contains repeat 1 (the first BMC domain of PduB).	134
132888	cd07048	BMC_PduB_repeat2	1,2-propanediol utilization protein B (PduB), Bacterial Micro-Compartment (BMC) domain repeat 2. PduB proteins are homologs of the carboxysome shell protein. They are encoded within the pdu operon and might be required for the formation of the outer shell of the bacterial pdu polyhedral organelles involved in coenzyme B12-dependent degradation of 1,2-propanediol. Although it has been suggested that PduB might form hexamers and further assemble into the flat facets of the polyhedral outer shell of the pdu organelles at present no experimental evidence directly supports this view. PduB proteins contain two tandem BMC domain repeats. This CD contains repeat 2 (the second BMC domain of PduB).	70
132889	cd07049	BMC_EutL_repeat1	ethanolamine utilization protein L (EutL), Bacterial Micro-Compartment (BMC) domain repeat 1. EutL proteins are homologs of the carboxysome shell protein. They are encoded within the eut operon and might be required for the formation of the outer shell of the bacterial eut polyhedral organelles which are involved in the cobalamin-dependent degradation of ethanolamine. Although it has been suggested that EutL might form hexamers and further assemble into the flat facets of the polyhedral outer shell of the eut organelles at present no experimental evidence directly supports this view. EutL proteins contain two tandem BMC domains. This CD includes domain 1 (the first BMC domain of EutL).	103
132890	cd07050	BMC_EutL_repeat2	ethanolamine utilization protein L (EutL), Bacterial Micro-Compartment (BMC) domain repeat 2. EutL proteins are homologs of the carboxysome shell protein. They are encoded within the eut operon and might be required for the formation of the outer shell of the bacterial eut polyhedral organelles which are involved in the cobalamin-dependent degradation of ethanolamine. Although it has been suggested that EutL might form hexamers and further assemble into the flat facets of the polyhedral outer shell of eut organelles at present no experimental evidence directly supports this view. EutL proteins contain two tandem BMC domains. This CD includes domain 2 (the second BMC domain of EutL).	87
132891	cd07051	BMC_like_1_repeat1	Bacterial Micro-Compartment (BMC)-like domain 1 repeat 1. BMC-like domains exist in cyanobacteria, proteobacteria, and actinobacteria and are homologs of the carboxysome shell proteins. They might be encoded from putative organelles involved in unknown metabolic process. Although it has been suggested that these carboxysome shell protein homologs form hexamers and further assemble into the flat facets of the polyhedral bacterial organelles shell at present no experimental evidence exists to directly support this view. Proteins in this CD contain two tandem BMC domains. This CD includes repeat 1 (the first BMC domain of BMC like 1 proteins).	111
132892	cd07052	BMC_like_1_repeat2	Bacterial Micro-Compartment (BMC)-like domain 1 repeat 2. BMC-like domains exist in cyanobacteria, proteobacteria, and actinobacteria and are homologs of the carboxysome shell proteins. They might be encoded from putative organelles involved in unknown metabolic process. Although it has been suggested that these carboxysome shell protein homologs form hexamers and further assemble into the flat facets of the polyhedral bacterial organelles shell at present no experimental evidence exists to directly support this view. Proteins in this CD contain two tandem BMC domains. This CD includes repeat 2 (the second BMC domain of BMC like 1 proteins).	79
132893	cd07053	BMC_PduT_repeat1	1,2-propanediol utilization protein T (PduT), Bacterial Micro-Compartment (BMC) domain repeat 1. PduT proteins are homologs of the carboxysome shell protein. They are encoded within the pdu operon and might be required for the formation of the outer shell of the bacterial pdu polyhedral organelles which are involved in coenzyme B12-dependent degradation of 1,2-propanediol. Although it has been suggested that PduT might form hexamers and further assemble into the flat facets of the polyhedral outer shell of pdu organelles at present no experimental evidence directly supports this view. PduT proteins contain two tandem BMC domain repeats. This CD contains repeat 1 (the first BMC domain of PduT). Carboxysome shell protein sequence homologs, EutM proteins, are also included in this CD. They too might exist as hexamers and might play similar functional roles in the construction of the eut organelle outer shell which still remains poorly understood.	76
132894	cd07054	BMC_PduT_repeat2	1,2-propanediol utilization protein T (PduT), Bacterial Micro-Compartment (BMC) domain repeat 2. PduT proteins are homologs of the carboxysome shell protein. They are encoded within the pdu operon and might be required for the formation of the outer shell of the bacterial pdu polyhedral organelles which are involved in coenzyme B12-dependent degradation of 1,2-propanediol. Although it has been suggested that PduT might form hexamers and further assemble into the flat facets of the polyhedral outer shell of pdu organelles, at present no experimental evidence directly supports this view. PduT proteins contain two tandem BMC domain repeats. This CD contains repeat 2 (the second BMC domain of PduT). Carboxysome shell protein sequence homologs, EutM proteins, are also included in this CD. They too might exist as hexamers and might play similar functional roles in the construction of the eut organelle outer shell which still remains poorly understood.	78
132895	cd07055	BMC_like_2	Bacterial Micro-Compartment (BMC)-like domain 2. BMC like 2 domains exist in cyanobacteria, proteobacteria, and actinobacteria and are homologs of carboxysome shell proteins. They might be encoded from putative organelles involved in unknown metabolic process. Although it has been suggested that these carboxysome shell protein homologs form hexamers and further assemble into the flat facets of the polyhedral bacterial organelles shell at present no experimental evidence exists to directly support this view.	61
132896	cd07056	BMC_PduK	1,2-propanediol utilization protein K (PduK), Bacterial Micro-Compartment (BMC) domain repeat 1l. PduK proteins are homologs of the carboxysome shell protein. They are encoded within the pdu operon and might be required for the formation of the outer shell of the bacterial pdu polyhedral organelles which are involved in coenzyme B12-dependent degradation of 1,2-propanediol. Although it has been suggested that PduK might form hexamers and further assemble into the flat facets of the polyhedral outer shell of pdu organelles at present no experimental evidence directly supports this view.	77
132897	cd07057	BMC_CcmK	Carbon dioxide concentrating mechanism (CcmK); Bacterial Micro-Compartment (BMC) domain. CcmK1-4 and CcmL proteins found in Synechocystis sp. strain PCC 6803 make up the beta carboxysome shell.  These CcmK proteins have been shown to form hexameric units, while the CcmL proteins have been shown to form pentameric units.  Together these proteins further assemble into the flat facets of the polyhedral carboxysome shell.  The structures suggest that the central pores and the gaps between hexamers limit the transport of metabolites into and out of the carboxysome.	88
132898	cd07058	BMC_CsoS1	Carboxysome Shell 1 (CsoS1); Bacterial Micro-Compartment (BMC) domain. The cso operon in Halothiobacillus neapolitanus contains the genes involved in alpha carboxysome function including those for the carboxysome shell proteins: CsoS1A, CsoS1B, and CsoS1C. CsoS1A has been shown to form hexameric units which further assemble into the flat facets of the polyhedral carboxysome shell. The structures suggest that the central pores and the gaps between hexamers limit the transport of metabolites into and out of the carboxysome. Although it has been suggested that other homologous proteins, CsoS1B and CsoS1C, in this family might also form hexamers and play similar functional roles in the construction of carboxysome outer shell at present no experimental evidence directly supports this view.	88
132899	cd07059	BMC_PduA	1,2-propanediol utilization protein A (PduA), Bacterial Micro-Compartment (BMC) domain. PduA is encoded within the 1,2-propanediol utilization (pdu) operon along with other homologous carboxysome shell proteins PduB, B', J, K, T, and U. PduA is thought to be required for the formation of the outer shell of bacterial pdu polyhedral organelles which are involved in coenzyme B12-dependent degradation of 1,2-propanediol. Although it has been suggested that PduA might form hexamers and further assemble into the flat facets of the polyhedral outer shell of pdu organelles, like PduU does, at present no experimental evidence directly supports this view.	85
349952	cd07060	SPOUT_MTase	SPOUT superfamily of SAM-dependent RNA methyltransferases. The SPOUT (SpoU-TrmD) methyltransferase (MTase) superfamily, also known as class IV methyltransferase family, is a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot. Members of the SPOUT superfamily that have been characterized functionally are involved in post-transcriptional RNA modification by catalyzing methylation of the 2-OH group of ribose, the N-1 atom of guanosine 37 in tRNA, or the N-3 atom of uridine 1498 in 16S rRNA.	99
132717	cd07061	HP_HAP_like	Histidine phosphatase domain found in histidine acid phosphatases and phytases; contains a His residue which is phosphorylated during the reaction. Catalytic domain of HAP (histidine acid phosphatases) and phytases (myo-inositol hexakisphosphate phosphohydrolases). The conserved catalytic core of this domain contains a His residue which is phosphorylated in the reaction. Functions in this subgroup include roles in metabolism, signaling, or regulation, for example Escherichia coli glucose-1-phosphatase functions to scavenge glucose from glucose-1-phosphate and the signaling molecules inositol 1,3,4,5,6-pentakisphosphate (InsP5) and inositol hexakisphosphate (InsP6) are in vivo substrates for eukaryotic multiple inositol polyphosphate phosphatase 1 (Minpp1). Phytases scavenge phosphate from extracellular sources and are added to animal feed while prostatic acid phosphatase (PAP) has been used for many years as a serum marker for prostate cancer. Recently PAP has been shown in mouse models to suppress pain by functioning as an ecto-5prime-nucleotidase. In vivo it dephosphorylates extracellular adenosine monophosphate (AMP) generating adenosine, and leading to the activation of A1-adenosine receptors in dorsal spinal cord.	242
132883	cd07062	Peptidase_S66_mccF_like	Microcin C7 self-immunity protein determines resistance to exogenous microcin C7. Microcin C7 self-immunity protein (mccF): MccF, a homolog of the LD-carboxypeptidase family, mediates resistance against exogenously added microcin C7 (MccC7), a ribosomally-encoded peptide antibiotic that contains a phosphoramidate linkage to adenosine monophosphate at its C-terminus. The plasmid-encoded mccF gene is transcribed in the opposite direction to the other five genes (mccA-E) and is required for the full expression of immunity but not for production. The catalytic triad residues (Ser, His, Glu) of LD-carboxypeptidase are also conserved in MccF, strongly suggesting that MccF shares the hydrolytic activity with LD-carboxypeptidases. Substrates of MccF have not been deduced, but could likely be microcin C7 precursors. The possible role of MccF is to defend producer cells against exogenous microcin from re-entering after having been exported.  It is suggested that MccF is involved in microcin degradation or sequestration in the periplasm.	308
132881	cd07064	AlkD_like_1	A new structural DNA glycosylase containing HEAT-like repeats. This domain represents a new and uncharacterized structural superfamily of DNA glycosylases that form an alpha-alpha superhelix fold and do not belong to the five identified structural DNA glycosylase superfamilies (UDG, AAG/MNPG, MutM/Fpg and helix-hairpin-helix).  DNA glycosylases removing alkylated base residues have been identified in all organisms investigated and may be universally present in nature. DNA glycosylases catalyze the first step in Base Excision Repair (BER) pathway by cleaving damaged DNA bases within double strand DNA to produce an abasic site. The resulting abasic site is further processed by AP endonuclease, phosphodiesterase, DNA polymerases, and DNA ligase functions to restore the DNA to an undamaged state. All glycosylases examined to date utilize a similar strategy for binding DNA and base flipping despite their structural diversity. The known structures for members of this family, AlkC and AlkD from Bacillus cereus, are distant homologues and are composed of six variant HEAT (Huntington/Elongation/ A subunit/Target of rapamycin) repeats. HEAT motifs are ~45-amino acid sequences that form antiparallel alpha-helices, which are packed by a conserved hydrophobic interface and are tandemly repeated to form superhelical alpha-structures. AlkD and AlkC are specific for removal of 3-methyladenine (3mA) and 7-methylguanine (7mG) from the DNA by base excision repair. Homologues of AlkC and AlkD were also identified in other organisms.	208
143549	cd07066	CRD_FZ	CRD_domain cysteine-rich domain, also known as Fz (frizzled) domain. CRD_FZ is an essential component of a number of cell surface receptors, which are involved in multiple signal transduction pathways, particularly in modulating the activity of the Wnt proteins, which play a fundamental role in the early development of metazoans. CRD is also found in secreted frizzled related proteins (SFRPs), which lack the transmembrane segment found in the frizzled protein. The CRD domain is also present in the alpha-1 chain of mouse type XVIII collagen, in carboxypeptidase Z, several receptor tyrosine kinases, and the mosaic transmembrane serine protease corin. The CRD domain is well conserved in metazoans - 10 frizzled proteins have been identified in mammals, 4 in Drosophila and 3 in Caenorhabditis elegans. CRD domains have also been identified in multiple tandem copies in a Dictyostelium discoideum protein. Very little is known about the mechanism by which CRD domains interact with their ligands. The domain contains 10 conserved cysteines.	119
132718	cd07067	HP_PGM_like	Histidine phosphatase domain found in phosphoglycerate mutases and related proteins, mostly phosphatases; contains a His residue which is phosphorylated during the reaction. Subgroup of the catalytic domain of a functionally diverse set of proteins, most of which are phosphatases. The conserved catalytic core of this domain contains a His residue which is phosphorylated in the reaction. This subgroup contains cofactor-dependent and cofactor-independent phosphoglycerate mutases (dPGM, and BPGM respectively), fructose-2,6-bisphosphatase (F26BPase), Sts-1, SixA, and related proteins. Functions include roles in metabolism, signaling, or regulation, for example, F26BPase affects glycolysis and gluconeogenesis through controlling the concentration of F26BP; BPGM controls the concentration of 2,3-BPG (the main allosteric effector of hemoglobin in human blood cells); human Sts-1 is a T-cell regulator; Escherichia coli SixA participates in the ArcB-dependent His-to-Asp phosphorelay signaling system. Deficiency and mutation in many of the human members result in disease, for example erythrocyte BPGM deficiency is a disease associated with a decrease in the concentration of 2,3-BPG.	153
132753	cd07068	NR_LBD_ER_like	The ligand binding domain of estrogen receptor and estrogen receptor-related receptors. The ligand binding domain of estrogen receptor (ER) and estrogen receptor-related receptors (ERRs): Estrogen receptors are a group of receptors which are activated by the hormone estrogen. Estrogen regulates many physiological processes including reproduction, bone integrity, cardiovascular health, and behavior. The main mechanism of action of the estrogen receptor is as a transcription factor by binding to the estrogen response element of target genes upon activation by estrogen and then recruiting coactivator proteins which are responsible for the transcription of target genes. Additionally some ERs may associate with other membrane proteins and can be rapidly activated by exposure of cells to estrogen.  ERRs are closely related to the estrogen receptor (ER) family, but they lack the ability to bind estrogen.  ERRs can interfere with the classic ER-mediated estrogen signaling pathway, positively or negatively. ERRs share target genes, co-regulators and promoters with the estrogen receptor (ER) family. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, ER and ERRs have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD).	221
132754	cd07069	NR_LBD_Lrh-1	The ligand binding domain of the liver receptor homolog-1, a member of the nuclear receptor superfamily. The ligand binding domain (LBD) of the liver receptor homolog-1 (LRH-1): LRH-1 belongs to the nuclear hormone receptor superfamily, and is expressed mainly in the liver, intestine, exocrine pancreas, and ovary. Most nuclear receptors function as homodimers or heterodimers. However, LRH-1 binds DNA as a monomer, and is a regulator of bile-acid homeostasis, steroidogenesis, reverse cholesterol transport and the initial stages of embryonic development. Recently, phospholipids have been identified as potential ligands for LRH-1 and steroidogenic factor-1 (SF-1).  Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, LRH-1 has a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD).	241
132755	cd07070	NR_LBD_SF-1	The ligand binding domain of nuclear receptor steroidogenic factor 1, a member of the nuclear receptor superfamily. The ligand binding domain of nuclear receptor steroidogenic factor 1 (SF-1): SF-1, a member of the nuclear hormone receptor superfamily, is an essential regulator of endocrine development and function and is considered a master regulator of reproduction. Most nuclear receptors function as homodimers or heterodimers; however, SF-1 binds to its target genes as a monomer, recognizing the variations of the DNA sequence motif, T/CCA AGGTCA. SF-1 functions cooperatively with other transcription factors to modulate gene expression. Phospholipids have been determined as potential ligands of SF-1. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, SF-1 has a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD).	237
132756	cd07071	NR_LBD_Nurr1	The ligand binding domain of  Nurr1, a member of  conserved family of nuclear receptors. The ligand binding domain of nuclear receptor Nurr1: Nurr1 belongs to the conserved family of nuclear receptors. It is a transcription factor that is expressed in the embryonic ventral midbrain and is critical for the development of dopamine (DA) neurons. Structural studies have shown that the ligand binding pocket of Nurr1 is filled by bulky hydrophobic residues, making it unable to bind to ligands. Therefore, it belongs to the class of orphan receptors. However, Nurr1 forms heterodimers with RXR and can promote signaling via its partner, RXR. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, Nurr1 has  a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD).	238
132757	cd07072	NR_LBD_DHR38_like	Ligand binding domain of  DHR38_like proteins, members of the nuclear receptor superfamily. The ligand binding domain of nuclear receptor DHR38_like proteins:  DHR38 is a member of the steroid receptor superfamily in Drosophila. DHR38 interacts with the USP component of the ecdysone receptor complex, suggesting that DHR38 might modulate ecdysone-triggered signals in the fly, in addition to the ECR/USP pathway. At least four differentially expressed mRNA isoforms have been detected during development. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, DHR38 has  a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD).	239
132758	cd07073	NR_LBD_AR	Ligand binding domain of the nuclear receptor androgen receptor, ligand activated transcription regulator. The ligand binding domain of the androgen receptor (AR): AR is a member of the nuclear receptor family. It is activated by binding either of the androgenic hormones, testosterone or dihydrotestosterone, which are responsible for male primary sexual characteristics and for secondary male characteristics, respectively. The primary mechanism of action of ARs is by direct regulation of gene transcription. The binding of an androgen results in a conformational change in the androgen receptor which causes its transport from the cytosol into the cell nucleus, and dimerization. The receptor dimer binds to a hormone response element of AR-regulated genes and modulates their expression. Another mode of action is independent of their interactions with DNA. The receptors interact directly with signal transduction proteins in the cytoplasm, causing rapid changes in cell function, such as ion transport. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, AR has  a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD).  The LBD is not only involved in binding to androgen, but also involved in binding of coactivator proteins and dimerization. A ligand dependent nuclear export signal is also present at the ligand binding domain.	246
132759	cd07074	NR_LBD_PR	Ligand binding domain of the progesterone receptor, a member of the nuclear hormone receptor. The ligand binding domain of the progesterone receptor (PR): PR is a member of the nuclear receptor superfamily of ligand dependent transcription factors, mediating the biological actions of progesterone. PR functions in a variety of biological processes including development of the mammary gland, regulating cell cycle progression, protein processing, and metabolism. When no binding hormone is present the carboxyl terminal inhibits transcription. Binding to a hormone induces a structural change that removes the inhibitory action. After progesterone binds to the receptor, PR forms a dimer and the complex enters the nucleus where it interacts with the hormone response element (HRE) in the promoters of progesterone responsive genes and alters their transcription. In addition, rapid actions of PR that occur independent of transcription, have also been observed in several tissues like brain, liver, mammary gland and spermatozoa. There are two natural PR isoforms called PR-A and PR-B. PR-B has an additional stretch of 164 amino acids at the N terminus. The extra domain in PR-B performs activation functions by recruiting coactivators that could not be recruited by PR-A. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, PR has a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD).  The LBD is not only involved in binding to progesterone, but also involved in coactivator binding and dimerization.	248
132760	cd07075	NR_LBD_MR	Ligand binding domain of the mineralocorticoid receptor, a member of the nuclear receptor superfamily. The ligand binding domain of the mineralocorticoid receptor (MR): MR, also called aldosterone receptor, is a member of nuclear receptor superfamily involved in the regulation of electrolyte and fluid balance. The receptor is activated by mineralocorticoids such as aldosterone and deoxycorticosterone as well as glucocorticoids, like cortisol and cortisone. Binding of its ligand results in its translocation to the cell nucleus, homodimerization and binding to hormone response elements (HREs) present in the promoter of MR controlled genes. This results in the recruitment of the coactivators and the transcription of the activated genes. MR is expressed in many tissues and its activation results in the expression of proteins regulating electrolyte and fluid balance. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, MR has a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). The LBD, in addition to binding ligand, contains a ligand-dependent activation function-2 (AF-2).	248
132761	cd07076	NR_LBD_GR	Ligand binding domain of the glucocorticoid receptor, a member of the nuclear receptor superfamily. The ligand binding domain of the glucocorticoid receptor (GR): GR is a ligand-activated transcription factor belonging to the nuclear receptor superfamily. It binds with high affinity to cortisol and other glucocorticoids. GR is expressed in almost every cell in the body and regulates genes controlling a wide variety of processes including the development, metabolism, and immune response of the organism. In the absence of hormone, the glucocorticoid receptor (GR) is complexed with a variety of heat shock proteins in the cytosol. The binding of the glucocorticoids results in release of the heat shock proteins and transforms it to its active state. One mechanism of action of GR is by direct activation of gene transcription. The activated form of GR forms dimers, translocates into the nucleus, and binds to specific hormone responsive elements, activating gene transcription. GR can also function as a repressor of other gene transcription activators, such as NF-kappaB and AF-1 by directly binding to them, and blocking the expression of their activated genes. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, GR has a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). The LBD also functions for dimerization and chaperone protein association.	247
143396	cd07077	ALDH-like	NAD(P)+-dependent aldehyde dehydrogenase-like (ALDH-like) family. The aldehyde dehydrogenase-like (ALDH-like) group of the ALDH superfamily of NAD(P)+-dependent enzymes which, in general, oxidize a wide range of  endogenous and exogenous aliphatic and aromatic aldehydes to their corresponding carboxylic acids and play an  important role in detoxification. This group includes families ALDH18, ALDH19, and ALDH20 and represents such proteins as gamma-glutamyl phosphate reductase, LuxC-like acyl-CoA reductase, and coenzyme A acylating aldehyde dehydrogenase.  All of these proteins have a conserved cysteine that aligns with the catalytic cysteine of the ALDH group.	397
143397	cd07078	ALDH	NAD(P)+ dependent aldehyde dehydrogenase family. The aldehyde dehydrogenase family (ALDH) of NAD(P)+ dependent enzymes, in general, oxidize a wide range of  endogenous and exogenous aliphatic and aromatic aldehydes to their corresponding carboxylic acids and play an  important role in detoxification. Besides aldehyde detoxification, many ALDH isozymes possess multiple additional catalytic and non-catalytic functions such as participating in  metabolic pathways, or as  binding proteins, or as osmoregulants, to mention a few. The enzyme has three domains, a NAD(P)+ cofactor-binding domain, a catalytic domain, and a bridging domain; and the active enzyme  is generally either homodimeric or homotetrameric. The catalytic mechanism is proposed to involve cofactor binding, resulting in a conformational change and activation of an invariant catalytic cysteine nucleophile. The cysteine and aldehyde substrate form an oxyanion thiohemiacetal intermediate resulting in hydride transfer to the cofactor and formation of a thioacylenzyme intermediate. Hydrolysis of the thioacylenzyme and release of the carboxylic acid product occurs, and in most cases, the reduced cofactor dissociates from the enzyme. The evolutionary phylogenetic tree of ALDHs appears to have an initial bifurcation between what has been characterized as the classical aldehyde dehydrogenases, the ALDH family (ALDH) and extended family members or aldehyde dehydrogenase-like (ALDH-like) proteins. The ALDH proteins are represented by enzymes which share a number of highly conserved residues necessary for catalysis and cofactor binding and they include such proteins as retinal dehydrogenase, 10-formyltetrahydrofolate dehydrogenase, non-phosphorylating glyceraldehyde 3-phosphate dehydrogenase, delta(1)-pyrroline-5-carboxylate dehydrogenases, alpha-ketoglutaric semialdehyde dehydrogenase, alpha-aminoadipic semialdehyde dehydrogenase, coniferyl aldehyde dehydrogenase and succinate-semialdehyde dehydrogenase.  Included in this larger group are all human, Arabidopsis, Tortula, fungal, protozoan, and Drosophila ALDHs identified in families ALDH1 through ALDH22 with the exception of families ALDH18, ALDH19, and ALDH20 which are present in the ALDH-like group.	432
143398	cd07079	ALDH_F18-19_ProA-GPR	Gamma-glutamyl phosphate reductase (GPR), aldehyde dehydrogenase families 18 and 19. Gamma-glutamyl phosphate reductase (GPR), a L-proline biosynthetic pathway (PBP) enzyme that catalyzes the NADPH dependent reduction of L-gamma-glutamyl  5-phosphate into L-glutamate 5-semialdehyde and phosphate. The glutamate route of the PBP involves two enzymatic steps catalyzed by gamma-glutamyl kinase (GK, EC 2.7.2.11) and GPR (EC 1.2.1.41). These enzymes are fused into the bifunctional enzyme, ProA or delta(1)-pyrroline-5-carboxylate synthetase (P5CS) in plants and animals, whereas they are separate enzymes in bacteria and yeast. In humans, the P5CS (ALDH18A1), an inner mitochondrial membrane enzyme, is essential to the de novo synthesis of the amino acids proline and arginine. Tomato (Lycopersicon esculentum) has both the prokaryotic-like polycistronic operons encoding GK and GPR (PRO1, ALDH19) and the full-length, bifunctional P5CS (PRO2, ALDH18B1).	406
143399	cd07080	ALDH_Acyl-CoA-Red_LuxC	Acyl-CoA reductase LuxC. Acyl-CoA reductase, LuxC, (EC=1.2.1.50) is the fatty acid reductase enzyme responsible for synthesis of the aldehyde  substrate for the luminescent reaction catalyzed by luciferase. The fatty acid reductase, a luminescence-specific, multienzyme complex (LuxCDE), reduces myristic acid to generate the long chain fatty aldehyde required for the luciferase-catalyzed reaction resulting in the emission of blue-green light. Mutational studies of conserved cysteines of LuxC revealed that the cysteine which aligns with the catalytic cysteine conserved throughout the ALDH superfamily is the LuxC acylation site. This CD is composed of mainly bacterial sequences but also includes a few archaeal sequences similar to the Methanospirillum hungatei acyl-CoA reductase RfbN.	422
143400	cd07081	ALDH_F20_ACDH_EutE-like	Coenzyme A acylating aldehyde dehydrogenase (ACDH), Ethanolamine utilization protein EutE, and related proteins. Coenzyme A acylating aldehyde dehydrogenase (ACDH), an NAD+ and CoA-dependent acetaldehyde dehydrogenase, acetylating (EC=1.2.1.10), functions as a single enzyme (such as the Ethanolamine utilization protein, EutE, in Salmonella typhimurium) or as part of a multifunctional enzyme to convert acetaldehyde into acetyl-CoA. The E. coli aldehyde-alcohol dehydrogenase includes the functional domains, alcohol dehydrogenase (ADH), ACDH, and pyruvate-formate-lyase deactivase; and the Entamoeba histolytica aldehyde-alcohol dehydrogenase 2 (ALDH20A1) includes the functional domains ADH and ACDH, and may be critical enzymes in the fermentative pathway.	439
143401	cd07082	ALDH_F11_NP-GAPDH	NADP+-dependent non-phosphorylating glyceraldehyde 3-phosphate dehydrogenase and ALDH family 11. NADP+-dependent non-phosphorylating glyceraldehyde 3-phosphate dehydrogenase (NP-GAPDH, EC=1.2.1.9) catalyzes the irreversible oxidation of glyceraldehyde 3-phosphate to 3-phosphoglycerate generating NADPH for biosynthetic reactions.  This CD also includes the Arabidopsis thaliana osmotic-stress-inducible ALDH family 11, ALDH11A3  and similar sequences. In autotrophic eukaryotes, NP-GAPDH generates NADPH for biosynthetic processes from photosynthetic glyceraldehyde-3-phosphate exported from the chloroplast and catalyzes one of the classic glycolytic bypass reactions unique to plants.	473
143402	cd07083	ALDH_P5CDH	ALDH subfamily NAD+-dependent delta(1)-pyrroline-5-carboxylate dehydrogenase-like. ALDH subfamily of the NAD+-dependent, delta(1)-pyrroline-5-carboxylate dehydrogenases (P5CDH, EC=1.5.1.12). The proline catabolic enzymes, proline dehydrogenase and P5CDH catalyze the two-step oxidation of proline to glutamate.  P5CDH catalyzes the oxidation of glutamate semialdehyde, utilizing NAD+ as the electron acceptor. In some bacteria, the two enzymes are fused into the bifunctional flavoenzyme, proline utilization A (PutA). These enzymes play important roles in cellular redox control, superoxide generation, and apoptosis. In certain prokaryotes such as Escherichia coli, PutA is also a transcriptional repressor of the proline utilization genes. Monofunctional enzyme sequences such as those seen in the Bacillus RocA P5CDH are also present in this subfamily as well as the human ALDH4A1 P5CDH and the Drosophila Aldh17 P5CDH.	500
143403	cd07084	ALDH_KGSADH-like	ALDH subfamily: NAD(P)+-dependent alpha-ketoglutaric semialdehyde dehydrogenases and plant delta(1)-pyrroline-5-carboxylate dehydrogenase, ALDH family 12-like. ALDH subfamily which includes the NAD(P)+-dependent, alpha-ketoglutaric semialdehyde dehydrogenases (KGSADH, EC 1.2.1.26); plant delta(1)-pyrroline-5-carboxylate dehydrogenase (P5CDH, EC=1.5.1.12), ALDH family 12; the N-terminal domain of the MaoC (monoamine oxidase C) dehydratase regulatory protein; and orthologs of MaoC, PaaZ and PaaN, which are putative ring-opening enzymes of the aerobic phenylacetic acid catabolic pathway.	442
143404	cd07085	ALDH_F6_MMSDH	Methylmalonate semialdehyde dehydrogenase and ALDH family members 6A1 and 6B2. Methylmalonate semialdehyde dehydrogenase (MMSDH, EC=1.2.1.27) [acylating] from Bacillus subtilis is involved in valine metabolism and catalyses the NAD+- and CoA-dependent oxidation of methylmalonate semialdehyde into propionyl-CoA. Mitochondrial human MMSDH ALDH6A1 and Arabidopsis MMSDH ALDH6B2 are also present in this CD.	478
143405	cd07086	ALDH_F7_AASADH-like	NAD+-dependent alpha-aminoadipic semialdehyde dehydrogenase and related proteins. ALDH subfamily which includes the NAD+-dependent, alpha-aminoadipic semialdehyde dehydrogenase (AASADH, EC=1.2.1.31), also known as Antiquitin-1, ALDH7A1, ALDH7B or delta-1-piperideine-6-carboxylate dehydrogenase (P6CDH), and other similar sequences, such as the uncharacterized aldehyde dehydrogenase of Candidatus kuenenia AldH (locus CAJ73105).	478
143406	cd07087	ALDH_F3-13-14_CALDH-like	ALDH subfamily: Coniferyl aldehyde dehydrogenase, ALDH families 3, 13, and 14, and other related proteins. ALDH subfamily which includes NAD(P)+-dependent, aldehyde dehydrogenase, family 3 member A1 and B1  (ALDH3A1, ALDH3B1,  EC=1.2.1.5) and fatty aldehyde dehydrogenase, family 3 member A2 (ALDH3A2, EC=1.2.1.3), and also plant ALDH family members ALDH3F1, ALDH3H1, and ALDH3I1, fungal ALDH14 (YMR110C) and the protozoan family 13 member (ALDH13), as well as coniferyl aldehyde dehydrogenases (CALDH, EC=1.2.1.68), and other similar  sequences, such as the Pseudomonas putida benzaldehyde dehydrogenase I that is involved in the metabolism of mandelate.	426
143407	cd07088	ALDH_LactADH-AldA	Escherichia coli lactaldehyde dehydrogenase AldA-like. Lactaldehyde dehydrogenase from Escherichia coli (AldA, LactADH, EC=1.2.1.22), an NAD(+)-dependent enzyme involved in the metabolism of L-fucose and L-rhamnose, and other similar sequences are present in this CD.	468
143408	cd07089	ALDH_CddD-AldA-like	Rhodococcus ruber 6-oxolauric acid dehydrogenase-like and related proteins. The 6-oxolauric acid dehydrogenase (CddD) from Rhodococcus ruber SC1 which converts 6-oxolauric acid to dodecanedioic acid; and the aldehyde dehydrogenase (locus SSP0762) from Staphylococcus saprophyticus subsp. saprophyticus ATCC 15305 and also, the Mycobacterium tuberculosis H37Rv ALDH AldA (locus Rv0768) sequence; and other similar sequences, are included in this CD.	459
143409	cd07090	ALDH_F9_TMBADH	NAD+-dependent 4-trimethylaminobutyraldehyde dehydrogenase, ALDH family 9A1. NAD+-dependent, 4-trimethylaminobutyraldehyde dehydrogenase (TMABADH, EC=1.2.1.47), also known as aldehyde dehydrogenase family 9 member A1 (ALDH9A1) in humans, is a cytosolic tetramer which catalyzes the oxidation of gamma-aminobutyraldehyde involved in 4-aminobutyric acid (GABA) biosynthesis  and also oxidizes betaine aldehyde (gamma-trimethylaminobutyraldehyde) which is involved in carnitine biosynthesis.	457
143410	cd07091	ALDH_F1-2_Ald2-like	ALDH subfamily: ALDH families 1 and 2, including 10-formyltetrahydrofolate dehydrogenase, NAD+-dependent retinal dehydrogenase 1 and related proteins. ALDH subfamily which includes the NAD+-dependent retinal dehydrogenase 1 (RALDH 1, ALDH1, EC=1.2.1.36), also known as aldehyde dehydrogenase family 1 member A1 (ALDH1A1), in humans, a homotetrameric, cytosolic enzyme that catalyzes the oxidation of retinaldehyde to retinoic acid. Human ALDH1B1 and ALDH2 are also in this cluster; both are mitochondrial homotetramers which play important roles in acetaldehyde oxidation; ALDH1B1 in response to UV light exposure and ALDH2 during ethanol metabolism. 10-formyltetrahydrofolate dehydrogenase (FTHFDH, EC=1.5.1.6), also known as aldehyde dehydrogenase family 1 member L1 (ALDH1L1), in humans, a multi-domain homotetramer with an N-terminal formyl transferase domain and a C-terminal ALDH domain. FTHFDH catalyzes an NADP+-dependent dehydrogenase reaction resulting in the conversion of 10-formyltetrahydrofolate to tetrahydrofolate and CO2. Also included in this subfamily are the Arabidopsis aldehyde dehydrogenase family 2 members B4 and B7 (EC=1.2.1.3), which are mitochondrial homotetramers that oxidize acetaldehyde and glycolaldehyde, as well as the Arabidopsis cytosolic homotetramer ALDH2C4 (EC=1.2.1.3), an enzyme involved in the oxidation of sinapaldehyde and coniferaldehyde. Also included is the AldA aldehyde dehydrogenase  of Aspergillus nidulans (locus AN0554),  the aldehyde dehydrogenase 2 (YMR170c, ALD5, EC=1.2.1.5) of Saccharomyces cerevisiae, and other similar sequences.	476
143411	cd07092	ALDH_ABALDH-YdcW	Escherichia coli NAD+-dependent gamma-aminobutyraldehyde dehydrogenase YdcW-like. NAD+-dependent, tetrameric, gamma-aminobutyraldehyde dehydrogenase (ABALDH), YdcW of Escherichia coli K12, catalyzes the oxidation of gamma-aminobutyraldehyde to gamma-aminobutyric acid. ABALDH can also oxidize n-alkyl medium-chain aldehydes, but with a lower catalytic efficiency.	450
143412	cd07093	ALDH_F8_HMSADH	Human aldehyde dehydrogenase family 8 member A1-like. In humans, the  aldehyde dehydrogenase family 8 member A1 (ALDH8A1) protein functions to convert 9-cis-retinal to 9-cis-retinoic acid and has a preference for NAD+. Also included in this CD is the 2-hydroxymuconic semialdehyde dehydrogenase (HMSADH) which catalyzes the conversion of 2-hydroxymuconic semialdehyde to 4-oxalocrotonate, a step in the meta cleavage pathway of aromatic hydrocarbons in bacteria. Such HMSADHs seen here are: XylG of the TOL plasmid pWW0 of Pseudomonas putida, TomC  of Burkholderia cepacia G4, and AphC of Comamonas testosteroni.	455
143413	cd07094	ALDH_F21_LactADH-like	ALDH subfamily: NAD+-dependent, lactaldehyde dehydrogenase, ALDH family 21 A1, and related proteins. ALDH subfamily which includes Tortula ruralis aldehyde dehydrogenase ALDH21A1 (RNP123), and NAD+-dependent, lactaldehyde dehydrogenase (EC=1.2.1.22) and like sequences.	453
143414	cd07095	ALDH_SGSD_AstD	N-succinylglutamate 5-semialdehyde dehydrogenase, AstD-like. N-succinylglutamate 5-semialdehyde dehydrogenase or succinylglutamic semialdehyde dehydrogenase (SGSD, E. coli AstD, EC=1.2.1.71) is involved in L-arginine degradation via the arginine succinyltransferase (AST) pathway and catalyzes the NAD+-dependent oxidation of succinylglutamate semialdehyde into succinylglutamate.	431
143415	cd07097	ALDH_KGSADH-YcbD	Bacillus subtilis NADP+-dependent alpha-ketoglutaric semialdehyde dehydrogenase ycbD-like. Kinetic studies of the Bacillus subtilis ALDH-like ycbD protein, which is involved in d-glucarate/d-galactarate utilization, reveal that it is a NADP+-dependent, alpha-ketoglutaric semialdehyde dehydrogenase (KGSADH). KGSADHs (EC 1.2.1.26) catalyze the NAD(P)+-dependent conversion of KGSA to alpha-ketoglutarate. Interestingly, the NADP+-dependent, tetrameric, 2,5-dioxopentanoate dehydrogenase (EC=1.2.1.26), an enzyme involved in the catabolic pathway for D-arabinose in Sulfolobus solfataricus, also clusters in this group. This CD shows a distant phylogenetic relationship to the Azospirillum brasilense KGSADH-II (-III) group.	473
143416	cd07098	ALDH_F15-22	Aldehyde dehydrogenase family 15A1 and 22A1-like. Aldehyde dehydrogenase family members ALDH15A1 (Saccharomyces cerevisiae YHR039C) and ALDH22A1 (Arabidopsis thaliana, EC=1.2.1.3), and similar sequences, are in this CD. Significant improvement of stress tolerance in tobacco plants was observed by overexpressing the ALDH22A1 gene from maize (Zea mays) and was accompanied by a reduction of malondialdehyde  derived from cellular lipid peroxidation.	465
143417	cd07099	ALDH_DDALDH	Methylomonas sp. 4,4'-diapolycopene-dialdehyde dehydrogenase-like. The 4,4'-diapolycopene-dialdehyde dehydrogenase (DDALDH) involved in C30 carotenoid synthesis in Methylomonas sp. strain 16a and other similar sequences are present in this CD. DDALDH converts 4,4'-diapolycopene-dialdehyde into 4,4'-diapolycopene-diacid.	453
143418	cd07100	ALDH_SSADH1_GabD1	Mycobacterium tuberculosis succinate-semialdehyde dehydrogenase 1-like. Succinate-semialdehyde dehydrogenase 1 (SSADH1, GabD1, EC=1.2.1.16) catalyzes the NADP(+)-dependent oxidation of succinate semialdehyde (SSA)  to succinate.  SSADH activity in Mycobacterium tuberculosis (Mtb) is encoded by both gabD1 (Rv0234c) and gabD2 (Rv1731).  The Mtb GabD1 SSADH1 reportedly is an enzyme of the gamma-aminobutyrate shunt, which forms a functional link between two TCA half-cycles by converting alpha-ketoglutarate to succinate.	429
143419	cd07101	ALDH_SSADH2_GabD2	Mycobacterium tuberculosis succinate-semialdehyde dehydrogenase 2-like. Succinate-semialdehyde dehydrogenase 2 (SSADH2) and similar proteins are in this CD. SSADH1 (GabD1, EC=1.2.1.16) catalyzes the NADP(+)-dependent oxidation of succinate semialdehyde to succinate.  SSADH activity in Mycobacterium tuberculosis is encoded by both gabD1 (Rv0234c) and gabD2 (Rv1731); however, the Vmax of GabD1 was shown to be much higher than that of GabD2, and GabD2 (SSADH2) is likely to serve physiologically as a dehydrogenase for a different aldehyde(s).	454
143420	cd07102	ALDH_EDX86601	Uncharacterized aldehyde dehydrogenase of Synechococcus sp. PCC 7335 (EDX86601). Uncharacterized aldehyde dehydrogenase of Synechococcus sp. PCC 7335 (locus EDX86601) and other similar sequences, are present in this CD.	452
143421	cd07103	ALDH_F5_SSADH_GabD	Mitochondrial succinate-semialdehyde dehydrogenase and ALDH family members 5A1 and 5F1-like. Succinate-semialdehyde dehydrogenase, mitochondrial (SSADH, GabD, EC=1.2.1.24) catalyzes the NAD+-dependent oxidation of succinate semialdehyde (SSA) to succinate. This group includes the human aldehyde dehydrogenase family 5 member A1 (ALDH5A1) which is a mitochondrial homotetramer that converts SSA to succinate in the last step of 4-aminobutyric acid (GABA) catabolism. This CD also includes the Arabidopsis SSADH gene product ALDH5F1. Mutations in this gene result in the accumulation of H2O2, suggesting a role in plant defense against the environmental stress of elevated reactive oxygen species.	451
143422	cd07104	ALDH_BenzADH-like	ALDH subfamily: NAD(P)+-dependent benzaldehyde dehydrogenase II, vanillin dehydrogenase, p-hydroxybenzaldehyde dehydrogenase and related proteins. ALDH subfamily which includes the NAD(P)+-dependent, benzaldehyde dehydrogenase II (XylC, BenzADH, EC=1.2.1.28)  involved in the oxidation of benzyl alcohol to benzoate; p-hydroxybenzaldehyde dehydrogenase (PchA, HBenzADH) which catalyzes the oxidation of p-hydroxybenzaldehyde to p-hydroxybenzoic acid; vanillin dehydrogenase (Vdh, VaniDH) involved in the metabolism of ferulic acid as seen in Pseudomonas putida KT2440; and other related sequences.	431
143423	cd07105	ALDH_SaliADH	Salicylaldehyde dehydrogenase, DoxF-like. Salicylaldehyde dehydrogenase (DoxF, SaliADH, EC=1.2.1.65) involved in the upper naphthalene catabolic pathway of Pseudomonas strain C18 and other similar sequences are present in this CD.	432
143424	cd07106	ALDH_AldA-AAD23400	Streptomyces aureofaciens putative aldehyde dehydrogenase AldA (AAD23400)-like. Putative aldehyde dehydrogenase, AldA, from Streptomyces aureofaciens (locus AAD23400) and other similar sequences are present in this CD.	446
143425	cd07107	ALDH_PhdK-like	Nocardioides 2-carboxybenzaldehyde dehydrogenase, PhdK-like. Nocardioides sp. strain KP7 2-carboxybenzaldehyde dehydrogenase (PhdK), an enzyme involved in phenanthrene degradation, and other similar sequences, are present in this CD.	456
143426	cd07108	ALDH_MGR_2402	Magnetospirillum NAD(P)+-dependent aldehyde dehydrogenase MSR-1-like. NAD(P)+-dependent aldehyde dehydrogenase of Magnetospirillum gryphiswaldense MSR-1 (MGR_2402), and other similar sequences, are present in this CD.	457
143427	cd07109	ALDH_AAS00426	Uncharacterized Saccharopolyspora spinosa aldehyde dehydrogenase (AAS00426)-like. Uncharacterized aldehyde dehydrogenase of Saccharopolyspora spinosa (AAS00426) and other similar sequences, are present in this CD.	454
143428	cd07110	ALDH_F10_BADH	Arabidopsis betaine aldehyde dehydrogenase 1 and 2, ALDH family 10A8 and 10A9-like. Present in this CD are the Arabidopsis betaine aldehyde dehydrogenase (BADH) 1 (chloroplast) and 2 (mitochondria), also known as, aldehyde dehydrogenase family 10 member A8 and aldehyde dehydrogenase family 10 member A9, respectively, and are putative dehydration- and salt-inducible BADHs (EC 1.2.1.8) that catalyze the oxidation of betaine aldehyde to the compatible solute glycine betaine.	456
143429	cd07111	ALDH_F16	Aldehyde dehydrogenase family 16A1-like. Uncharacterized aldehyde dehydrogenase family 16 member A1 (ALDH16A1) and other related sequences are present in this CD. The active site cysteine and glutamate residues are not conserved in the human ALDH16A1 protein sequence.	480
143430	cd07112	ALDH_GABALDH-PuuC	Escherichia coli NADP+-dependent gamma-glutamyl-gamma-aminobutyraldehyde dehydrogenase PuuC-like. NADP+-dependent, gamma-glutamyl-gamma-aminobutyraldehyde dehydrogenase (GABALDH) PuuC of  Escherichia coli which catalyzes the conversion of putrescine to 4-aminobutanoate and other similar sequences are present in this CD.	462
143431	cd07113	ALDH_PADH_NahF	Escherichia coli NAD+-dependent phenylacetaldehyde dehydrogenase PadA-like. NAD+-dependent, homodimeric, phenylacetaldehyde dehydrogenase (PADH, EC=1.2.1.39) PadA of Escherichia coli involved in the catabolism of 2-phenylethylamine, and other related sequences, are present in this CD. Also included is the Pseudomonas fluorescens ST StyD PADH involved in styrene catabolism, the Sphingomonas sp. LB126 FldD protein involved in fluorene degradation, and the Novosphingobium aromaticivorans NahF salicylaldehyde dehydrogenase involved in the NAD+-dependent conversion of salicylaldehyde to salicylate.	477
143432	cd07114	ALDH_DhaS	Uncharacterized Candidatus pelagibacter aldehyde dehydrogenase, DhaS-like. Uncharacterized aldehyde dehydrogenase from Candidatus pelagibacter (DhaS) and other related sequences are present in this CD.	457
143433	cd07115	ALDH_HMSADH_HapE	Pseudomonas fluorescens 4-hydroxymuconic semialdehyde dehydrogenase-like. 4-hydroxymuconic semialdehyde dehydrogenase (HapE, EC=1.2.1.61) of Pseudomonas fluorescens ACB involved in 4-hydroxyacetophenone degradation, and putative hydroxycaproate semialdehyde dehydrogenase (ChnE) of Brachymonas petroleovorans involved in cyclohexane metabolism, and other similar sequences, are present in this CD.	453
143434	cd07116	ALDH_ACDHII-AcoD	Ralstonia eutrophus NAD+-dependent acetaldehyde dehydrogenase II-like. Included in this CD is the NAD+-dependent, acetaldehyde dehydrogenase II (AcDHII, AcoD, EC=1.2.1.3) from Ralstonia (Alcaligenes) eutrophus H16 involved in the catabolism of acetoin and ethanol, and similar proteins, such as the dimeric dihydrolipoamide dehydrogenase of the acetoin dehydrogenase enzyme system of Klebsiella pneumoniae. Also included are sequences similar to the NAD+-dependent chloroacetaldehyde dehydrogenases (AldA and AldB) of Xanthobacter autotrophicus GJ10 which are involved in the degradation of 1,2-dichloroethane. These proteins apparently require RpoN factors for expression.	479
143435	cd07117	ALDH_StaphAldA1	Uncharacterized Staphylococcus aureus AldA1 (SACOL0154) aldehyde dehydrogenase-like. Uncharacterized aldehyde dehydrogenase from Staphylococcus aureus (AldA1, locus SACOL0154) and other similar sequences are present in this CD.	475
143436	cd07118	ALDH_SNDH	Gluconobacter oxydans L-sorbosone dehydrogenase-like. Included in this CD is the L-sorbosone dehydrogenase (SNDH) from Gluconobacter oxydans UV10. In G. oxydans,  D-sorbitol is converted to 2-keto-L-gulonate (a precursor of L-ascorbic acid) in sequential oxidation steps catalyzed by a FAD-dependent, L-sorbose dehydrogenase and an NAD(P)+-dependent,  L-sorbosone dehydrogenase.	454
143437	cd07119	ALDH_BADH-GbsA	Bacillus subtilis NAD+-dependent betaine aldehyde dehydrogenase-like. Included in this CD is the NAD+-dependent, betaine aldehyde dehydrogenase (BADH, GbsA, EC=1.2.1.8) of Bacillus subtilis involved in the synthesis of the osmoprotectant glycine betaine from choline or glycine betaine aldehyde.	482
143438	cd07120	ALDH_PsfA-ACA09737	Pseudomonas putida aldehyde dehydrogenase PsfA (ACA09737)-like. Included in this CD is the aldehyde dehydrogenase (PsfA, locus ACA09737) of Pseudomonas putida involved in furoic acid metabolism. Transcription of psfA was induced in response to 2-furoic acid, furfuryl alcohol, and furfural.	455
143439	cd07121	ALDH_EutE	Ethanolamine utilization protein EutE-like. Coenzyme A acylating aldehyde dehydrogenase (ACDH), an NAD+ and CoA-dependent acetaldehyde dehydrogenase, acetylating (EC=1.2.1.10), converts acetaldehyde into acetyl-CoA.  This CD is limited to such monofunctional enzymes as the Ethanolamine utilization protein, EutE, in Salmonella typhimurium.  Mutations in eutE abolish the ability to utilize ethanolamine as a carbon source.	429
143440	cd07122	ALDH_F20_ACDH	Coenzyme A acylating aldehyde dehydrogenase (ACDH), ALDH family 20-like. Coenzyme A acylating aldehyde dehydrogenase (ACDH, EC=1.2.1.10), an NAD+ and CoA-dependent acetaldehyde dehydrogenase, functions as a single enzyme (such as the Ethanolamine utilization protein, EutE, in Salmonella typhimurium) or as part of a multifunctional enzyme to convert acetaldehyde into acetyl-CoA . The E. coli aldehyde-alcohol dehydrogenase includes the functional domains, alcohol dehydrogenase (ADH), ACDH, and pyruvate-formate-lyase deactivase; and the Entamoeba histolytica aldehyde-alcohol dehydrogenase 2 (ALDH20A1) includes the functional domains ADH and ACDH and may be critical enzymes in the fermentative pathway.	436
143441	cd07123	ALDH_F4-17_P5CDH	Delta(1)-pyrroline-5-carboxylate dehydrogenase, ALDH families 4 and 17. Delta(1)-pyrroline-5-carboxylate dehydrogenase (EC=1.5.1.12 ), families 4 and 17: a proline catabolic enzyme of the aldehyde dehydrogenase (ALDH) protein superfamily.  Delta(1)-pyrroline-5-carboxylate dehydrogenase (P5CDH), also known as ALDH4A1 in humans,  is a mitochondrial  homodimer involved in proline degradation and catalyzes the NAD + -dependent conversion of P5C to glutamate. This is a necessary step in the pathway interconnecting the urea and tricarboxylic acid cycles. The preferred substrate is glutamic gamma-semialdehyde, other substrates include succinic, glutaric and adipic semialdehydes. Also included in this CD is the Aldh17 Drosophila melanogaster (Q9VUC0) P5CDH and similar sequences.	522
143442	cd07124	ALDH_PutA-P5CDH-RocA	Delta(1)-pyrroline-5-carboxylate dehydrogenase, RocA. Delta(1)-pyrroline-5-carboxylate dehydrogenase (EC=1.5.1.12 ), RocA: a proline catabolic enzyme of the aldehyde dehydrogenase (ALDH) protein superfamily. The proline catabolic enzymes, proline dehydrogenase and Delta(1)-pyrroline-5-carboxylate dehydrogenase (P5CDH), catalyze the two-step oxidation of proline to glutamate; P5CDH catalyzes the oxidation of glutamate semialdehyde, utilizing NAD+ as the electron acceptor. In some bacteria, the two enzymes are fused into the bifunctional flavoenzyme, proline utilization A (PutA). In this CD, monofunctional enzyme sequences such as seen in the Bacillus subtilis RocA P5CDH are also present. These enzymes play important roles in cellular redox control, superoxide generation, and apoptosis.	512
143443	cd07125	ALDH_PutA-P5CDH	Delta(1)-pyrroline-5-carboxylate dehydrogenase, PutA. The proline catabolic enzymes of the aldehyde dehydrogenase (ALDH) protein superfamily, proline dehydrogenase and Delta(1)-pyrroline-5-carboxylate dehydrogenase (P5CDH, EC=1.5.1.12), catalyze the two-step oxidation of proline to glutamate; P5CDH catalyzes the oxidation of glutamate semialdehyde, utilizing NAD+ as the electron acceptor. In some bacteria, the two enzymes are fused into the bifunctional flavoenzyme, proline utilization A (PutA). These enzymes play important roles in cellular redox control, superoxide generation, and apoptosis. In certain prokaryotes such as Escherichia coli, PutA is also a transcriptional repressor of the proline utilization genes.	518
143444	cd07126	ALDH_F12_P5CDH	Delta(1)-pyrroline-5-carboxylate dehydrogenase, ALDH family 12. Delta(1)-pyrroline-5-carboxylate dehydrogenase (P5CDH, EC=1.5.1.12), family 12: a proline catabolic enzyme of the aldehyde dehydrogenase (ALDH) protein superfamily. P5CDH is a mitochondrial enzyme involved in proline degradation and catalyzes the NAD + -dependent conversion of P5C to glutamate.  The P5CDH, ALDH12A1 gene, in Arabidopsis, has been identified as an osmotic-stress-inducible ALDH gene. This CD contains both Viridiplantae and Alveolata P5CDH sequences.	489
143445	cd07127	ALDH_PAD-PaaZ	Phenylacetic acid degradation proteins PaaZ (Escherichia coli) and PaaN (Pseudomonas putida)-like. Phenylacetic acid degradation (PAD) proteins PaaZ  (Escherichia coli) and PaaN (Pseudomonas putida) are putative aromatic ring cleavage enzymes of the aerobic PA catabolic pathway. PaaZ mutants were defective for growth with PA as a sole carbon source due to interruption of the putative ring opening system.  This CD is limited to bacterial monofunctional enzymes.	549
143446	cd07128	ALDH_MaoC-N	N-terminal domain of the monoamine oxidase C dehydratase. The N-terminal domain of the MaoC dehydratase, a monoamine oxidase regulatory protein. Orthologs of MaoC include PaaZ (Escherichia coli) and PaaN (Pseudomonas putida), which are putative ring-opening enzymes of the aerobic phenylacetic acid (PA) catabolic pathway. The C-terminal domain of MaoC has sequence similarity to enoyl-CoA hydratase. Also included in this CD is a novel Burkholderia xenovorans LB400 ALDH of the aerobic benzoate oxidation (box) pathway. This pathway involves first the synthesis of a CoA thio-esterified aromatic acid, with subsequent dihydroxylation and cleavage steps, yielding the CoA thio-esterified aliphatic aldehyde, 3,4-dehydroadipyl-CoA semialdehyde, which is further converted into its corresponding CoA acid by the Burkholderia LB400 ALDH.	513
143447	cd07129	ALDH_KGSADH	Alpha-Ketoglutaric Semialdehyde Dehydrogenase. Alpha-Ketoglutaric Semialdehyde (KGSA) Dehydrogenase (KGSADH, EC 1.2.1.26) catalyzes the NAD(P)+-dependent conversion of KGSA to alpha-ketoglutarate. This CD contains such sequences as those seen in Azospirillum brasilense, KGSADH-II (D-glucarate/D-galactarate-inducible) and KGSADH-III (hydroxy-L-proline-inducible). Both show similar high substrate specificity for KGSA and different coenzyme specificity; KGSADH-II is NAD+-dependent and KGSADH-III is NADP+-dependent. Also included in this CD is the NADP(+)-dependent aldehyde dehydrogenase from Vibrio harveyi which catalyzes the oxidation of long-chain aliphatic aldehydes to acids.	454
143448	cd07130	ALDH_F7_AASADH	NAD+-dependent alpha-aminoadipic semialdehyde dehydrogenase, ALDH family members 7A1 and 7B. Alpha-aminoadipic semialdehyde dehydrogenase (AASADH, EC=1.2.1.31), also known as ALDH7A1, Antiquitin-1, ALDH7B, or delta-1-piperideine-6-carboxylate dehydrogenase (P6CDH), is a NAD+-dependent ALDH. Human ALDH7A1 is involved in the pipecolic acid pathway of lysine catabolism, catalyzing the oxidation of alpha-aminoadipic semialdehyde to alpha-aminoadipate.  Arabidopsis thaliana ALDH7B4 appears to be an osmotic-stress-inducible ALDH gene encoding a turgor-responsive or stress-inducible ALDH. The Streptomyces clavuligerus P6CDH appears to be involved in cephamycin biosynthesis, catalyzing the second stage of the two-step conversion of lysine to alpha-aminoadipic acid.  The ALDH7A1 enzyme and others in this group have been observed as tetramers, yet the bacterial P6CDH enzyme has been reported as a monomer.	474
143449	cd07131	ALDH_AldH-CAJ73105	Uncharacterized Candidatus kuenenia aldehyde dehydrogenase AldH (CAJ73105)-like. Uncharacterized aldehyde dehydrogenase of Candidatus kuenenia AldH (locus CAJ73105) and similar sequences with similarity to alpha-aminoadipic semialdehyde dehydrogenase (AASADH, human ALDH7A1, EC=1.2.1.31), Arabidopsis ALDH7B4, and Streptomyces clavuligerus delta-1-piperideine-6-carboxylate dehydrogenase (P6CDH) are included in this CD.	478
143450	cd07132	ALDH_F3AB	Aldehyde dehydrogenase family 3 members A1, A2, and B1 and related proteins. NAD(P)+-dependent, aldehyde dehydrogenase, family 3 members A1 and B1  (ALDH3A1, ALDH3B1,  EC=1.2.1.5) and fatty aldehyde dehydrogenase, family 3 member A2 (ALDH3A2, EC=1.2.1.3), and similar sequences are included in this CD. Human ALDH3A1 is a homodimer with a critical role in cellular defense against oxidative stress; it catalyzes the oxidation of various cellular membrane lipid-derived aldehydes. Corneal crystalline ALDH3A1 protects the cornea and underlying lens against UV-induced oxidative stress. Human ALDH3A2, a microsomal homodimer, catalyzes the oxidation of long-chain aliphatic aldehydes to fatty acids. Human ALDH3B1 is highly expressed in the kidney and liver and catalyzes the oxidation of various medium- and long-chain saturated and unsaturated aliphatic aldehydes.	443
143451	cd07133	ALDH_CALDH_CalB	Coniferyl aldehyde dehydrogenase-like. Coniferyl aldehyde dehydrogenase (CALDH, EC=1.2.1.68) of Pseudomonas sp. strain HR199 (CalB) which catalyzes the NAD+-dependent oxidation of coniferyl aldehyde to ferulic acid, and similar sequences, are present in this CD.	434
143452	cd07134	ALDH_AlkH-like	Pseudomonas putida Aldehyde dehydrogenase AlkH-like. Aldehyde dehydrogenase AlkH (locus name P12693, EC=1.2.1.3) of the alkBFGHJKL operon that allows Pseudomonas putida to metabolize alkanes and the aldehyde dehydrogenase AldX of Bacillus subtilis (locus P46329, EC=1.2.1.3), and similar sequences, are present in this CD.	433
143453	cd07135	ALDH_F14-YMR110C	Saccharomyces cerevisiae aldehyde dehydrogenase family 14 and related proteins. Aldehyde dehydrogenase family 14 (ALDH14), isolated mainly from the mitochondrial outer membrane of Saccharomyces cerevisiae (YMR110C) and most closely related to the plant and animal ALDHs and fatty ALDHs family 3 members, and similar fungal sequences, are present in this CD.	436
143454	cd07136	ALDH_YwdH-P39616	Bacillus subtilis aldehyde dehydrogenase ywdH-like. Uncharacterized Bacillus subtilis ywdH aldehyde dehydrogenase (locus P39616)  most closely related to the ALDHs and fatty ALDHs of families 3 and 14, and similar sequences, are included in this CD.	449
143455	cd07137	ALDH_F3FHI	Plant aldehyde dehydrogenase family 3 members F1, H1, and I1 and related proteins. Aldehyde dehydrogenase family members 3F1, 3H1, and 3I1 (ALDH3F1, ALDH3H1, and ALDH3I1), and similar plant sequences, are in this CD.  In Arabidopsis thaliana, stress-regulated expression of ALDH3I1 was observed in leaves and osmotic stress expression of ALDH3H1 was observed in root tissue, whereas ALDH3F1 expression was not stress responsive. Functional analysis of ALDH3I1 suggests it may be involved in a detoxification pathway in plants that limits aldehyde accumulation and oxidative stress.	432
143456	cd07138	ALDH_CddD_SSP0762	Rhodococcus ruber 6-oxolauric acid dehydrogenase-like. The 6-oxolauric acid dehydrogenase (CddD) from Rhodococcus ruber SC1 which converts 6-oxolauric acid to dodecanedioic acid, and the aldehyde dehydrogenase (locus SSP0762) from Staphylococcus saprophyticus subsp. saprophyticus ATCC 15305 and other similar sequences, are included in this CD.	466
143457	cd07139	ALDH_AldA-Rv0768	Mycobacterium tuberculosis aldehyde dehydrogenase  AldA-like. The Mycobacterium tuberculosis NAD+-dependent, aldehyde dehydrogenase  PDB structure,  3B4W, and the Mycobacterium tuberculosis H37Rv aldehyde dehydrogenase  AldA (locus Rv0768) sequence, as well as the Rhodococcus rhodochrous ALDH involved in haloalkane catabolism, and other similar sequences, are included in this CD.	471
143458	cd07140	ALDH_F1L_FTFDH	10-formyltetrahydrofolate dehydrogenase, ALDH family 1L. 10-formyltetrahydrofolate dehydrogenase (FTHFDH, EC=1.5.1.6), also known as aldehyde dehydrogenase family 1 member L1 (ALDH1L1) in humans, is a multi-domain homotetramer with an N-terminal formyl transferase domain and a C-terminal ALDH domain. FTHFDH catalyzes an NADP+-dependent dehydrogenase reaction resulting in the conversion of 10-formyltetrahydrofolate to tetrahydrofolate and CO2. The ALDH domain is also capable of the oxidation of short chain aldehydes to their corresponding acids.	486
143459	cd07141	ALDH_F1AB_F2_RALDH1	NAD+-dependent retinal dehydrogenase 1, ALDH families 1A, 1B, and 2-like. NAD+-dependent retinal dehydrogenase 1 (RALDH 1, ALDH1, EC=1.2.1.36) also known as aldehyde dehydrogenase family 1 member A1 (ALDH1A1) in humans, is a homotetrameric, cytosolic enzyme that catalyzes the oxidation of retinaldehyde to retinoic acid. Human ALDH1B1 and ALDH2 are also in this cluster; both are mitochondrial homotetramers which play important roles in acetaldehyde oxidation; ALDH1B1 in response to UV light exposure and ALDH2 during ethanol metabolism.	481
143460	cd07142	ALDH_F2BC	Arabidopsis aldehyde dehydrogenase family 2 B4, B7, C4-like. Included in this CD are the Arabidopsis aldehyde dehydrogenase family 2 members B4 and B7 (EC=1.2.1.3), which are mitochondrial homotetramers that oxidize acetaldehyde and glycolaldehyde, but not L-lactaldehyde. Also in this group is the Arabidopsis cytosolic homotetramer ALDH2C4 (EC=1.2.1.3), an enzyme involved in the oxidation of sinapaldehyde and coniferaldehyde.	476
143461	cd07143	ALDH_AldA_AN0554	Aspergillus nidulans aldehyde dehydrogenase, AldA (AN0554)-like. NAD(P)+-dependent aldehyde dehydrogenase (AldA) of Aspergillus nidulans (locus AN0554), and other similar sequences, are present in this CD.	481
143462	cd07144	ALDH_ALD2-YMR170C	Saccharomyces cerevisiae aldehyde dehydrogenase 2 (YMR170c)-like. NAD(P)+-dependent Saccharomyces cerevisiae aldehyde dehydrogenase 2 (YMR170c, ALD5, EC=1.2.1.5) and other similar sequences, are present in this CD.	484
143463	cd07145	ALDH_LactADH_F420-Bios	Methanocaldococcus jannaschii NAD+-dependent lactaldehyde dehydrogenase-like. NAD+-dependent, lactaldehyde dehydrogenase (EC=1.2.1.22) involved in the biosynthesis of coenzyme F(420) in Methanocaldococcus jannaschii through the oxidation of lactaldehyde to lactate and generation of NADH, and similar sequences are included in this CD.	456
143464	cd07146	ALDH_PhpJ	Streptomyces putative phosphonoformaldehyde dehydrogenase PhpJ-like. Putative phosphonoformaldehyde dehydrogenase (PhpJ), an aldehyde dehydrogenase homolog reportedly involved in the biosynthesis of phosphinothricin tripeptides in Streptomyces viridochromogenes DSM 40736, and similar sequences are included in this CD.	451
143465	cd07147	ALDH_F21_RNP123	Aldehyde dehydrogenase family 21A1-like. Aldehyde dehydrogenase ALDH21A1 (gene name RNP123) was first described in the moss Tortula ruralis and is believed to play an important role in the detoxification of aldehydes generated in response to desiccation- and salinity-stress, and ALDH21A1 expression represents a unique stress tolerance mechanism. So far, of plants, only the bryophyte sequence has been observed, but similar protein sequences from bacteria and archaea are also present in this CD.	452
143466	cd07148	ALDH_RL0313	Uncharacterized ALDH (RL0313) with similarity to Tortula ruralis aldehyde dehydrogenase ALDH21A1. Uncharacterized aldehyde dehydrogenase (locus RL0313) with sequence similarity to the moss Tortula ruralis aldehyde dehydrogenase ALDH21A1 (RNP123) believed to play an important role in the detoxification of aldehydes generated in response to desiccation- and salinity-stress, and similar sequences are included in this CD.	455
143467	cd07149	ALDH_y4uC	Uncharacterized ALDH (y4uC) with similarity to Tortula ruralis aldehyde dehydrogenase ALDH21A1. Uncharacterized aldehyde dehydrogenase (ORF name y4uC) with sequence similarity to the moss Tortula ruralis aldehyde dehydrogenase ALDH21A1 (RNP123) believed to play an important role in the detoxification of aldehydes generated in response to desiccation- and salinity-stress, and similar sequences are included in this CD.	453
143468	cd07150	ALDH_VaniDH_like	Pseudomonas putida vanillin dehydrogenase-like. Vanillin dehydrogenase (Vdh, VaniDH) involved in the metabolism of ferulic acid and other related  sequences are included in this CD.  The E. coli vanillin dehydrogenase (LigV) preferred NAD+ to NADP+  and exhibited a broad substrate preference, including vanillin,  benzaldehyde, protocatechualdehyde, m-anisaldehyde, and p-hydroxybenzaldehyde.	451
143469	cd07151	ALDH_HBenzADH	NADP+-dependent p-hydroxybenzaldehyde dehydrogenase-like. NADP+-dependent, p-hydroxybenzaldehyde dehydrogenase (PchA, HBenzADH) which catalyzes oxidation of p-hydroxybenzaldehyde to p-hydroxybenzoic acid and other related sequences are included in this CD.	465
143470	cd07152	ALDH_BenzADH	NAD-dependent benzaldehyde dehydrogenase II-like. NAD-dependent, benzaldehyde dehydrogenase II (XylC, BenzADH, EC=1.2.1.28) is involved in the oxidation of benzyl alcohol to benzoate. In Acinetobacter calcoaceticus, this process is carried out by the chromosomally encoded, benzyl alcohol dehydrogenase (xylB) and benzaldehyde dehydrogenase II (xylC) enzymes; whereas in Pseudomonas putida they are encoded by TOL plasmids.	443
133478	cd07153	Fur_like	Ferric uptake regulator (Fur) and related metalloregulatory proteins; typically iron-dependent, DNA-binding repressors and activators. Ferric uptake regulator (Fur) and related metalloregulatory proteins are iron-dependent, DNA-binding repressors and activators mainly involved in iron metabolism.  A general model for Fur repression under iron-rich conditions is that activated Fur (a dimer having one Fe2+ coordinated per monomer) binds to specific DNA sequences (Fur boxes) in the promoter region of iron-responsive genes, hindering access of RNA polymerase, and repressing transcription. Positive regulation by Fur can be direct or indirect, as in the Fur repression of an anti-sense regulatory small RNA. Some members sense metal ions other than Fe2+.  For example, the zinc uptake regulator (Zur) responds to Zn2+, the manganese uptake regulator (Mur) responds to Mn2+, and the nickel uptake regulator (Nur) responds to Ni2+. Other members sense signals other than metal ions.  For example, PerR is a metal-dependent sensor of hydrogen peroxide. PerR regulates DNA-binding activity through metal-based protein oxidation, and co-ordinates Mn2+ or Fe2+ at its regulatory site. Fur family proteins contain an N-terminal winged-helix DNA-binding domain followed by a dimerization domain; this CD spans both those domains.	116
143529	cd07154	NR_DBD_PNR_like	The DNA-binding domain of the photoreceptor cell-specific nuclear receptor (PNR) nuclear receptor-like family. The DNA-binding domain of the photoreceptor cell-specific nuclear receptor (PNR) nuclear receptor-like family is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which coordinates a single zinc atom. PNR interacts with specific DNA sites upstream of the target gene and modulates the rate of transcriptional initiation. This family includes nuclear receptor Tailless (TLX), photoreceptor cell-specific nuclear receptor (PNR) and related receptors. TLX is an orphan receptor that plays a key role in neural development by regulating cell cycle progression and exit of neural stem cells in the developing brain. PNR is expressed only in the outer layer of retinal photoreceptor cells. It may be involved in the signaling pathway regulating photoreceptor differentiation and/or maintenance. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, PNR-like receptors have a central well-conserved DNA-binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD).	73
143530	cd07155	NR_DBD_ER_like	DNA-binding domain of estrogen receptor (ER) and estrogen related receptors (ERR) is composed of two C4-type zinc fingers. DNA-binding domains of estrogen receptor (ER) and estrogen related receptors (ERR) are composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. ER and ERR interact with the palindromic inverted repeat, 5'GGTCAnnnTGACC-3', upstream of the target gene and modulate the rate of transcriptional initiation. ERR and ER are closely related and share sequence similarity, target genes, co-regulators and promoters. While ER is activated by endogenous estrogen, ERR lacks the ability to bind to estrogen. Estrogen receptor mediates the biological effects of hormone estrogen by the binding of the receptor dimer to estrogen response element of target genes.  However, ERRs seem to interfere with the classic ER-mediated estrogen responsive signaling by targeting the same set of genes. ERRs and ERs exhibit the common modular structure with other nuclear receptors. They have a central highly conserved DNA binding domain (DBD), a non-conserved N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD).	75
143531	cd07156	NR_DBD_VDR_like	The DNA-binding domain of vitamin D receptors (VDR) like nuclear receptor family is composed of two C4-type zinc fingers. The DNA-binding domain of vitamin D receptors (VDR) like nuclear receptor family is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. This domain interacts with specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. This family includes three types of nuclear receptors: vitamin D receptors (VDR), constitutive androstane receptor (CAR) and pregnane X receptor (PXR). VDR regulates calcium metabolism, cellular proliferation and differentiation.  PXR and CAR function as sensors of toxic byproducts of cell metabolism and of exogenous chemicals, to facilitate their elimination. The DNA binding activity is regulated by their corresponding ligands. VDR is activated by Vitamin D; CAR and PXR respond to a diverse array of chemically distinct ligands, including many endogenous compounds and clinical drugs. Like other nuclear receptors, xenobiotic receptors have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD).	72
143532	cd07157	2DBD_NR_DBD1	The first DNA-binding domain (DBD) of the 2DBD nuclear receptors is composed of two C4-type zinc fingers. The first DNA-binding domain (DBD) of the 2DBD nuclear receptors (NRs) is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. NRs interact with specific DNA sites upstream of the target gene and modulate the rate of transcriptional initiation. These proteins contain two DBDs in tandem, probably resulting from an ancient recombination event. The 2DBD-NRs are found only in flatworm species, mollusks and arthropods.  Their biological function is unknown.	86
143533	cd07158	NR_DBD_Ppar_like	The DNA-binding domain of peroxisome proliferator-activated receptors (PPAR) like nuclear receptor family. The DNA-binding domain of peroxisome proliferator-activated receptors (PPAR) like nuclear receptor family is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. These domains interact with specific DNA sites upstream of the target gene and modulate the rate of transcriptional initiation. This family includes three known types of nuclear receptors: peroxisome proliferator-activated receptors (PPAR), REV-ERB receptors and Drosophila ecdysone-induced protein 78 (E78). Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, PPAR-like receptors have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD).	73
143534	cd07160	NR_DBD_LXR	DNA-binding domain of Liver X receptors (LXRs) family is composed of two C4-type zinc fingers. DNA-binding domain of Liver X receptors (LXRs) family is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. LXR interacts with specific DNA sites upstream of the target gene and modulates the rate of transcriptional initiation.  LXR operates as cholesterol sensor which protects cells from cholesterol overload by stimulating reverse cholesterol transport from peripheral tissues to the liver and its excretion in the bile. Oxidized cholesterol derivatives or oxysterols were identified as specific ligands for LXRs. LXR functions as a heterodimer with the retinoid X receptor (RXR) which may be activated by either LXR agonist or 9-cis retinoic acid, a specific RXR ligand. The LXR/RXR complex binds to a liver X receptor response element (LXRE) in the promoter region of target genes. The ideal LXRE sequence is a direct repeat-4 (DR-4) DNA fragment consisting of two AGGTCA hexameric half-sites separated by a 4-nucleotide spacer. LXR has typical NR modular structure with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and the ligand binding domain (LBD) at the C-terminal.	101
143535	cd07161	NR_DBD_EcR	DNA-binding domain of Ecdysone receptor (ECR) family is composed of two C4-type zinc fingers. DNA-binding domain of Ecdysone receptor (EcR) family is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. EcR interacts with highly degenerate pseudo-palindromic response elements, resembling inverted repeats of 5'-AGGTCA-3' separated by 1 bp, upstream of the target gene and modulates the rate of transcriptional initiation. EcR is present only in invertebrates and regulates the expression of a large number of genes during development and reproduction. EcR functions as a heterodimer by partnering with ultraspiracle protein (USP), the ortholog of the vertebrate retinoid X receptor (RXR). The natural ligands of EcR are ecdysteroids, the endogenous steroidal hormones found in invertebrates. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, EcRs have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD).	91
143536	cd07162	NR_DBD_PXR	DNA-binding domain of pregnane X receptor (PXR) is composed of two C4-type zinc fingers. DNA-binding domain (DBD) of pregnane X receptor (PXR) is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. PXR DBD interacts with the PXR response element, a perfect repeat of two AGTTCA motifs with a 4 bp spacer upstream of the target gene, and modulates the rate of transcriptional initiation. The pregnane X receptor (PXR) is a ligand-regulated transcription factor that responds to a diverse array of chemically distinct ligands, including many endogenous compounds and clinical drugs. PXR functions as a heterodimer with retinoic X receptor-alpha (RXRa) and binds to a variety of promoter regions of a diverse set of target genes involved in the metabolism, transport, and ultimately, elimination of these molecules from the body. Like other nuclear receptors, PXR has a central well conserved DNA-binding domain, a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain.	87
143537	cd07163	NR_DBD_TLX	DNA-binding domain of Tailless (TLX) is composed of two C4-type zinc fingers. DNA-binding domain of Tailless (TLX) is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. TLX interacts with specific DNA sites upstream of the target gene and modulates the rate of transcriptional initiation.  TLX is an orphan receptor that is expressed by neural stem/progenitor cells in the adult brain of the subventricular zone (SVZ) and the dentate gyrus (DG). It plays a key role in neural development by promoting cell cycle progression and preventing apoptosis in the developing brain. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, TLX has a central well conserved DNA-binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD).	92
143538	cd07164	NR_DBD_PNR_like_1	DNA-binding domain of the photoreceptor cell-specific nuclear receptor (PNR) like proteins is composed of two C4-type zinc fingers. DNA-binding domain of the photoreceptor cell-specific nuclear receptor (PNR) like proteins is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. PNR interacts with specific DNA sites upstream of the target gene and modulates the rate of transcriptional initiation.  PNR is a member of nuclear receptor superfamily of the ligand-activated transcription factors. PNR is expressed only in the outer layer of retinal photoreceptor cells. It may be involved in the signaling pathway regulating photoreceptor differentiation and/or maintenance. It most likely binds to DNA as a homodimer. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, PNR  has  a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD).	78
143539	cd07165	NR_DBD_DmE78_like	DNA-binding domain of Drosophila ecdysone-induced protein 78 (E78) like is composed of two C4-type zinc fingers. DNA-binding domain of proteins similar to Drosophila ecdysone-induced protein 78 (E78) is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which coordinates a single zinc atom. E78 interacts with specific DNA sites upstream of the target gene and modulates the rate of transcriptional initiation. Drosophila ecdysone-induced protein 78 (E78) is a transcription factor belonging to the nuclear receptor superfamily.  E78 is a product of the ecdysone-inducible gene found in an early late puff locus at position 78C during the onset of Drosophila metamorphosis. An E78 orthologue from the Platyhelminth Schistosoma mansoni (SmE78) has also been identified. It is the first E78 orthologue known outside of the molting animals--the Ecdysozoa. The SmE78 may be involved in transduction of an ecdysone signal in S. mansoni, consistent with its function in Drosophila.  Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, E78-like receptors have a central well conserved DNA-binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD).	81
143540	cd07166	NR_DBD_REV_ERB	DNA-binding domain of REV-ERB receptor-like is composed of two C4-type zinc fingers. DNA-binding domain of REV-ERB receptor-like is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which coordinates a single zinc atom. This domain interacts with specific DNA sites upstream of the target gene and modulates the rate of transcriptional initiation. REV-ERB receptors are transcriptional regulators belonging to the nuclear receptor superfamily. They regulate a number of physiological functions including the circadian rhythm, lipid metabolism, and cellular differentiation. REV-ERB receptors bind as a monomer to a (A/G)GGTCA half-site with a 5' AT-rich extension or as a homodimer to a direct repeat 2 element (AGGTCA sequence with a 2-bp spacer), indicating functional diversity. When bound to the DNA, they recruit corepressors (NcoR/histone deacetylase 3) to the promoter, resulting in repression of the target genes. The porphyrin heme has been demonstrated to function as a ligand for REV-ERB receptor. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, REV-ERB receptors have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD).	89
143541	cd07167	NR_DBD_Lrh-1_like	The DNA-binding domain of Lrh-1 like nuclear receptor family is composed of two C4-type zinc fingers. The DNA-binding domain of Lrh-1 like nuclear receptor family is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. This domain interacts with specific DNA sites upstream of the target gene and modulates the rate of transcriptional initiation. This nuclear receptor family includes at least three subgroups of receptors that function in embryo development and differentiation, and other processes. FTZ-F1 interacts with the cis-acting DNA motif of ftz gene, which is required at several stages of development. Particularly, FTZ-F1 regulated genes are strongly linked to steroid biosynthesis and sex-determination; LRH-1 is a regulator of bile-acid homeostasis, steroidogenesis, reverse cholesterol transport and the initial stages of embryonic development; SF-1 is an essential regulator of endocrine development and function and is considered a master regulator of reproduction; SF-1 functions cooperatively with other transcription factors to modulate gene expression. Phospholipids have been identified as potential ligands for LRH-1 and steroidogenic factor-1 (SF-1). However, the ligand for FTZ-F1 has not yet been identified. Most nuclear receptors function as homodimers or heterodimers. However, LRH-1 and SF-1 bind to DNA as monomers. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, receptors in this family have a central well conserved DNA-binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD).	93
143542	cd07168	NR_DBD_DHR4_like	DNA-binding domain of ecdysone-induced DHR4 orphan nuclear receptor is composed of two C4-type zinc fingers. DNA-binding domain of ecdysone-induced DHR4 orphan nuclear receptor is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which coordinates a single zinc atom. This domain interacts with specific DNA sites upstream of the target gene and modulates the rate of transcriptional initiation. Ecdysone-induced orphan receptor DHR4 is a member of the nuclear receptor family. DHR4 is expressed during the early Drosophila larval development and is induced by ecdysone. DHR4 coordinates growth and maturation in Drosophila by mediating endocrine response to the attainment of proper body size during larval development. Mutations in DHR4 result in shorter larval development which translates into smaller and lighter flies. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, DHR4  has a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD).	90
143543	cd07169	NR_DBD_GCNF_like	DNA-binding domain of Germ cell nuclear factor (GCNF) F1 is composed of two C4-type zinc fingers. DNA-binding domain of Germ cell nuclear factor (GCNF) F1 is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which coordinates a single zinc atom. This domain interacts with specific DNA sites upstream of the target gene and modulates the rate of transcriptional initiation. GCNF is a transcription factor expressed in post-meiotic stages of developing male germ cells. In vitro, GCNF has the ability to bind to direct repeat elements of  5'-AGGTCA.AGGTCA-3', as well as to an extended half-site sequence 5'-TCA.AGGTCA-3'. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, GCNF has  a central well conserved DNA-binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD).	90
143544	cd07170	NR_DBD_ERR	DNA-binding domain of estrogen related receptors (ERR) is composed of two C4-type zinc fingers. DNA-binding domain of estrogen related receptors (ERRs) is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which coordinates a single zinc atom. ERR interacts with the palindromic inverted repeat, 5'GGTCAnnnTGACC-3', upstream of the target gene and modulates the rate of transcriptional initiation. The estrogen receptor-related receptors (ERRs) are transcriptional regulators, which are closely related to the estrogen receptor (ER) family.  Although ERRs lack the ability to bind to estrogen and are so-called orphan receptors, they share target genes, co-regulators and promoters with the estrogen receptor (ER) family. By targeting the same set of genes, ERRs seem to interfere with the classic ER-mediated estrogen response in various ways. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, ERR has a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD).	97
143545	cd07171	NR_DBD_ER	DNA-binding domain of estrogen receptors (ER) is composed of two C4-type zinc fingers. DNA-binding domain of estrogen receptors (ER) is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which coordinates a single zinc atom. ER interacts with specific DNA sites upstream of the target gene and modulates the rate of transcriptional initiation. Estrogen receptor is a transcription regulator that mediates the biological effects of hormone estrogen. The binding of estrogen to the receptor triggers the dimerization and the binding of the receptor dimer to estrogen response element, which is a palindromic inverted repeat: 5'GGTCAnnnTGACC-3', of target genes. Through ER, estrogen regulates development, reproduction and homeostasis. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, ER  has  a central well-conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD).	82
143546	cd07172	NR_DBD_GR_PR	DNA-binding domain of glucocorticoid receptor (GR) is composed of two C4-type zinc fingers. DNA-binding domains of glucocorticoid receptor (GR) and progesterone receptor (PR) are composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. The DBDs from both receptors interact with the same hormone response element (HRE), which is an imperfect palindrome GGTACAnnnTGTTCT, upstream of target genes and modulate the rate of transcriptional initiation. GR is a transcriptional regulator that mediates the biological effects of glucocorticoids and PR regulates genes controlled by progesterone. GR is expressed in almost every cell in the body and regulates genes controlling a wide variety of processes including the development, metabolism, and immune response of the organism. PR functions in a variety of biological processes including development of the mammary gland, regulating cell cycle progression, protein processing, and metabolism. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, GR and PR have a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD).	78
143547	cd07173	NR_DBD_AR	DNA-binding domain of androgen receptor (AR) is composed of two C4-type zinc fingers. DNA-binding domain of androgen receptor (AR) is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. To regulate gene expression, AR interacts with a palindrome of the core sequence 5'-TGTTCT-3' with a 3-bp spacer. It also binds to the direct repeat  5'-TGTTCT-3' hexamer in some androgen controlled genes. AR is activated by the androgenic hormones, testosterone or dihydrotestosterone, which are responsible for primary and for secondary male characteristics, respectively. The primary mechanism of action of ARs is by direct regulation of gene transcription. The binding of androgen results in a conformational change in the androgen receptor which causes its transport from the cytosol into the cell nucleus, and dimerization. The receptor dimer binds to a hormone response element of AR regulated genes and modulates their expression. Another mode of action of androgen receptor is independent of their interactions with DNA. The receptor interacts directly with signal transduction proteins in the cytoplasm, causing rapid changes in cell function, such as ion transport. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, AR has  a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD).	82
143580	cd07176	terB	tellurite resistance protein terB. This family contains uncharacterized bacterial proteins involved in tellurium resistance. The prototype of this CD is the Kp-terB protein from Klebsiella pneumoniae, whose 3D structure was recently determined. The biological function of terB and the mechanism responsible for tellurium resistance are unknown.	111
143581	cd07177	terB_like	tellurium resistance terB-like protein. This family consists of tellurium resistance terB proteins, N-terminal domain of heat shock DnaJ-like proteins, N-terminal domain of Mo-dependent nitrogenase-like proteins, C-terminal domain of ABC transporter ATP-binding proteins, C-terminal domain of serine/threonine protein kinase, and many hypothetical bacterial proteins. The function of this family is unknown.	104
143582	cd07178	terB_like_YebE	tellurium resistance terB-like protein, subgroup 3. This family includes several uncharacterized bacterial proteins including an Escherichia coli protein called YebE. Protein sequence homology analysis shows they are similar to tellurium resistance protein terB, but the function of this family is unknown.	95
143548	cd07179	2DBD_NR_DBD2	The second DNA-binding domain (DBD) of the 2DBD nuclear receptor is composed of two C4-type zinc fingers. The second DNA-binding domain (DBD) of the 2DBD nuclear receptor (NR) is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. NRs interact with specific DNA sites upstream of the target gene and modulate the rate of transcriptional initiation. The proteins contain two DBDs in tandem, probably resulting from an ancient recombination event.  The 2DBD-NRs are found only in flatworm species, mollusks and arthropods.  Their biological function is unknown.	74
260001	cd07180	RNase_HII_archaea_like	Archaeal Ribonuclease HII. This family includes type 2 RNases H from archaea, some of which show broad divalent cation specificity. It is proposed that three of the four acidic residues at the active site are involved in metal binding and the fourth one is involved in the catalytic process in archaea. Most archaeal genomes contain multiple RNase H genes. Despite a lack of evidence for homology from sequence comparisons, type I and type II RNase H share a common fold and similar steric configurations of the four acidic active-site residues, suggesting identical or very similar catalytic mechanisms. It appears that type I and type II RNases H also have overlapping functions in cells, as over-expression of Escherichia coli RNase HII can complement an RNase HI deletion phenotype in E. coli. RNase H is classified into two families, type I (prokaryotic RNase HI, eukaryotic RNase H1 and viral RNase H) and type II (prokaryotic RNase HII and HIII, archaeal RNase HII and eukaryotic RNase H2/HII). RNase H endonucleolytically hydrolyzes an RNA strand when it is annealed to a complementary DNA strand in the presence of divalent cations, in DNA replication or repair.	204
260002	cd07181	RNase_HII_eukaryota_like	Eukaryotic RNase HII. This family includes eukaryotic type 2 RNase H (RNase HII or H2) which is active during replication and is believed to play a role in the removal of Okazaki fragment primers and single ribonucleotides in DNA-DNA duplexes. Eukaryotic RNase HII (RNASEH2A) is functional when it forms a heterotrimeric complex with two other accessory proteins (RNASEH2B and RNASEH2C). It is speculated that these accessory subunits are required for correct folding of the catalytic subunit of RNase HII. Mutations in the three subunits of human RNase HII cause the severe genetic neurological disorder Aicardi-Goutieres syndrome. Ribonuclease H (RNase H) is classified into two families, type I (prokaryotic RNase HI, eukaryotic RNase H1 and viral RNase H) and type II (prokaryotic RNase HII and HIII, and eukaryotic RNase H2/HII). RNase H endonucleolytically hydrolyzes an RNA strand when it is annealed to a complementary DNA strand in the presence of divalent cations, in DNA replication and repair. The enzyme can be found in bacteria, archaea, and eukaryotes. Most prokaryotic and eukaryotic genomes contain multiple RNase H genes. Despite a lack of evidence for homology from sequence comparisons, type I and type II RNase H share a common fold and similar steric configurations of the four acidic active-site residues, suggesting identical or very similar catalytic mechanisms.	221
260003	cd07182	RNase_HII_bacteria_HII_like	Bacterial Ribonuclease HII-like. This family includes mostly bacterial type 2 RNases H, with some eukaryotic members. Bacterial RNase HII has a role in primer removal based on its involvement in ribonucleotide-specific catalytic activity in the presence of RNA/DNA hybrid substrates. Several bacteria, such as Bacillus subtilis, have two different type II RNases H, RNases HII and HIII; double deletion of these leads to cellular lethality. It appears that type I and type II RNases H also have overlapping functions in cells, as over-expression of Escherichia coli RNase HII can complement an RNase HI deletion phenotype. In Leishmania mitochondria, of the four distinct RNase H genes (H1, HIIA, HIIB, HIIC), HIIC is essential for the survival of the parasite and thus can be a potential target for anti-leishmanial chemotherapy. Ribonuclease H (RNase H) is classified into two families, type I (prokaryotic RNase HI, eukaryotic RNase H1 and viral RNase H) and type II (prokaryotic RNase HII and HIII, and eukaryotic RNase H2). RNase H endonucleolytically hydrolyzes an RNA strand when it is annealed to a complementary DNA strand in the presence of divalent cations, in DNA replication and repair.	177
199892	cd07184	E_set_Isoamylase_like_N	N-terminal Early set domain associated with the catalytic domain of isoamylase-like (also called glycogen 6-glucanohydrolase) proteins. E or "early" set domains are associated with the catalytic domain of isoamylase-like proteins at the N-terminal end. Isoamylase is one of the starch-debranching enzymes that catalyze the hydrolysis of alpha-1,6-glucosidic linkages specific in alpha-glucans such as amylopectin or glycogen. Isoamylase contains a bound calcium ion, but this is not in the same position as the conserved calcium ion that has been reported in other alpha-amylase family enzymes. The N-terminal domain of isoamylase may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase. This domain is also a member of the CBM48 (Carbohydrate Binding Module 48) family whose members include pullulanase, maltooligosyl trehalose synthase, starch branching enzyme, glycogen branching enzyme, glycogen debranching enzyme, and the beta subunit of AMP-activated protein kinase.	86
143586	cd07185	OmpA_C-like	Peptidoglycan binding domains similar to the C-terminal domain of outer-membrane protein OmpA. OmpA-like domains (named after the C-terminal domain of Escherichia coli OmpA protein) have been shown to non-covalently associate with peptidoglycan, a network of glycan chains composed of disaccharides, which are crosslinked via short peptide bridges. Well-studied members of this family include the Escherichia coli outer membrane protein OmpA, the Escherichia coli lipoprotein PAL, Neisseria meningitidis RmpM, which interact with the outer membrane, as well as the Escherichia coli motor protein MotB, and the Vibrio flagellar motor proteins PomB and MotY, which interact with the inner membrane.	106
132872	cd07186	CofD_like	LPPG:FO 2-phospho-L-lactate transferase; important in F420 biosynthesis. CofD is a 2-phospho-L-lactate transferase that catalyzes the last step in the biosynthesis of coenzyme F(420)-0 (F(420) without polyglutamate) by transferring the lactyl phosphate moiety of lactyl(2)diphospho-(5')guanosine (LPPG) to 7,8-didemethyl-8-hydroxy-5-deazariboflavin ribitol (F0). F420 is a hydride carrier, important for energy metabolism of methanogenic archaea, as well as for the biosynthesis of other natural products, like tetracycline in Streptomyces. F420 and some of its precursors are also utilized as cofactors for enzymes, like DNA photolyase in Mycobacterium tuberculosis.	303
132873	cd07187	YvcK_like	family of mostly uncharacterized proteins similar to B.subtilis YvcK. One member of this protein family, YvcK from Bacillus subtilis, has been proposed to play a role in carbon metabolism, since its function is essential for growth on intermediates of the Krebs cycle and the pentose phosphate pathway. In general, this family of mostly uncharacterized proteins is related to the CofD-like protein family. CofD has been characterized as a 2-phospho-L-lactate transferase involved in F420 biosynthesis. This family appears to have the same conserved phosphate binding site as the other family in this hierarchy, but a different substrate binding site.	308
143587	cd07197	nitrilase	Nitrilase superfamily, including nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes. This superfamily (also known as the C-N hydrolase superfamily) contains hydrolases that break carbon-nitrogen bonds; it includes nitrilases, cyanide dihydratases, aliphatic amidases, N-terminal amidases, beta-ureidopropionases, biotinidases, pantotheinase, N-carbamyl-D-amino acid amidohydrolases, the glutaminase domain of glutamine-dependent NAD+ synthetase, apolipoprotein N-acyltransferases, and N-carbamoylputrescine amidohydrolases, among others. These enzymes depend on a Glu-Lys-Cys catalytic triad, and work through a thiol acylenzyme intermediate. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer. These oligomers include dimers, tetramers, hexamers, octamers, tetradecamers, octadecamers, as well as variable length helical arrangements and homo-oligomeric spirals. These proteins have roles in vitamin and co-enzyme metabolism, in detoxifying small molecules, in the synthesis of signaling molecules, and in the post-translational modification of proteins. They are used industrially, as biocatalysts in the fine chemical and pharmaceutical industry, in cyanide remediation, and in the treatment of toxic effluent. This superfamily has been classified previously in the literature, based on global and structure-based sequence analysis, into thirteen different enzyme classes (referred to as 1-13). This hierarchy includes those thirteen classes and a few additional subfamilies. A putative distant relative, the plasmid-borne TraB family, has not been included in the hierarchy.	253
132837	cd07198	Patatin	Patatin-like phospholipase. Patatin is a storage protein of the potato tuber that shows Phospholipase A2 activity (PLA2; EC 3.1.1.4). Patatin catalyzes the nonspecific hydrolysis of phospholipids, glycolipids, sulfolipids, and mono- and diacylglycerols, thereby showing lipid acyl hydrolase activity. The active site includes an oxyanion hole with a conserved GGxR motif; it is found in almost all the members of this family. The catalytic dyad is formed by a serine and an aspartate. Patatin belongs to the alpha-beta hydrolase family which is identified by a characteristic nucleophile elbow with a consensus sequence of Sm-X-Nu-Sm (Sm = small residue, X = any residue and Nu = nucleophile). Members of this family have been found also in vertebrates. This family includes PNPLA (1-9), TGL (3-5), ExoU-like, and SDP1-like subfamilies. There are some additional hypothetical proteins included in this family.	172
132838	cd07199	Pat17_PNPLA8_PNPLA9_like	Patatin-like phospholipase; includes PNPLA8, PNPLA9, and Pat17. Patatin is a storage protein of the potato tuber that shows Phospholipase A2 activity (PLA2; EC 3.1.1.4). Patatin catalyzes the nonspecific hydrolysis of phospholipids, glycolipids, sulfolipids, and mono- and diacylglycerols, thereby showing lipid acyl hydrolase activity. The active site includes an oxyanion hole with a conserved GGxR motif; it is found in almost all the members of this family. The catalytic dyad is formed by a serine and an aspartate. Patatin belongs to the alpha-beta hydrolase family which is identified by a characteristic nucleophile elbow with a consensus sequence of Sm-X-Nu-Sm (Sm = small residue, X = any residue and Nu = nucleophile). Members of this family have been found also in vertebrates. This family includes subfamily of PNPLA8 (iPLA2-gamma) and PNPLA9 (iPLA2-beta) like phospholipases from human as well as the Pat17 isozyme from Solanum cardiophyllum.	258
132839	cd07200	cPLA2_Grp-IVA	Group IVA cytosolic phospholipase A2; catalytic domain; Ca-dependent. Group IVA cPLA2, an 85 kDa protein, consists of two domains: the regulatory C2 domain and the alpha/beta hydrolase PLA2 domain. Group IVA cPLA2 is also referred to as cPLA2-alpha. The catalytic domain of cytosolic phospholipase A2 (cPLA2; EC 3.1.1.4) hydrolyzes the sn-2-acyl ester bond of phospholipids to release arachidonic acid. At the active site, cPLA2 contains a serine nucleophile through which the catalytic mechanism is initiated. The active site is partially covered by a solvent-accessible flexible lid. cPLA2 displays interfacial activation as it exists in both "closed lid" and "open lid" forms. Movement of the cPLA2 lid possibly exposes a greater hydrophobic surface and the active site. cPLA2 belongs to the alpha-beta hydrolase family which is identified by a characteristic nucleophile elbow with a consensus sequence of Sm-X-Nu-Sm (Sm = small residue, X = any residue and Nu = nucleophile). Calcium is required for cPLA2 to bind with membranes or phospholipids. A calcium-dependent phospholipid binding domain resides in the N-terminal region of cPLA2; it is homologous to the C2 domain superfamily which is not included in this hierarchy. Includes PLA2G4A from chicken, human, and frog.	505
132840	cd07201	cPLA2_Grp-IVB-IVD-IVE-IVF	Group IVB, IVD, IVE, and IVF cytosolic phospholipase A2; catalytic domain; Ca-dependent. Group IVB, IVD, IVE, and IVF cPLA2 consists of two domains: the regulatory C2 domain and alpha/beta hydrolase PLA2 domain. Group IVB, IVD, IVE, and IVF cPLA2 are also referred to as cPLA2-beta, -delta, -epsilon, and -zeta respectively. cPLA2-beta is approximately 30% identical to cPLA2-alpha and it shows low enzymatic activity compared to cPLA2alpha. cPLA2-beta hydrolyzes palmitic acid from 1-[14C]palmitoyl-2-arachidonoyl-PC and arachidonic acid from 1-palmitoyl-2[14C]arachidonoyl-PC, but not from 1-O-alkyl-2[3H]arachidonoyl-PC. cPLA2-delta, -epsilon, and -zeta are approximately 45-50% identical to cPLA2-beta and 31-37% identical to cPLA2-alpha. It's possible that cPLA2-beta, -delta, -epsilon, and -zeta may have arisen by gene duplication from an ancestral gene. The catalytic domain of cytosolic phospholipase A2 (PLA2; EC 3.1.1.4) hydrolyzes the sn-2-acyl ester bond of phospholipids to release arachidonic acid. At the active site, cPLA2 contains a serine nucleophile through which the catalytic mechanism is initiated. The active site is partially covered by a solvent-accessible flexible lid. cPLA2 displays interfacial activation as it exists in both "closed lid" and "open lid" forms. Movement of the cPLA2 lid possibly exposes a greater hydrophobic surface and the active site. cPLA2 belongs to the alpha-beta hydrolase family which is identified by a characteristic nucleophile elbow with a consensus sequence of Sm-X-Nu-Sm (Sm = small residue, X = any residue and Nu = nucleophile). Calcium is required for cPLA2 to bind with membranes or phospholipids. The calcium-dependent phospholipid binding domain resides in the N-terminal region of cPLA2; it is homologous to the C2 domain superfamily which is not included in this hierarchy. It includes PLA2G4B, PLA2G4D, PLA2G4E, and PLA2G4F from humans.	541
132841	cd07202	cPLA2_Grp-IVC	Group IVC cytoplasmic phospholipase A2; catalytic domain; Ca-independent. Group IVC cPLA2, a small 61 kDa protein, is a single domain alpha/beta hydrolase. It lacks a C2 domain; therefore, it has no Ca-dependence. Group IVC cPLA2 is also referred to as cPLA2-gamma. The cPLA2-gamma enzyme is predominantly found in cardiac and skeletal muscles, and to a lesser extent in the brain. Human cPLA2-gamma is approximately 30% identical to cPLA2-alpha. The catalytic domain of cytosolic phospholipase A2 (PLA2; EC 3.1.1.4) hydrolyzes the sn-2-acyl ester bond of phospholipids to release arachidonic acid. At the active site, cPLA2 contains a serine nucleophile through which the catalytic mechanism is initiated. The active site is partially covered by a solvent-accessible flexible lid. cPLA2 displays interfacial activation as it exists in both "closed lid" and "open lid" forms. Movement of the cPLA2 lid possibly exposes a greater hydrophobic surface and the active site. cPLA2 belongs to the alpha-beta hydrolase family which is identified by a characteristic nucleophile elbow with a consensus sequence of Sm-X-Nu-Sm (Sm = small residue, X = any residue and Nu = nucleophile). Includes PLA2G4C protein from human and Pla2g4c protein from mouse.	430
132842	cd07203	cPLA2_Fungal_PLB	Fungal Phospholipase B-like; cPLA2 GrpIVA homologs; catalytic domain. Fungal phospholipase B are Group IV cPLA2 homologs. Aspergillus PLA2 is Ca-dependent, yet it does not contain a C2 domain. PLB deacylates both sn-1 and sn-2 chains of phospholipids and are abundantly expressed in fungi. It shows lysophospholipase (lysoPL) and transacylase activities. The active site residues from cPLA2 are also conserved in PLB. Like cPLA2, PLB also has a consensus sequence of Sm-X-Nu-Sm (Sm = small residue, X = any residue and Nu = nucleophile). It includes PLB1 from Schizosaccharomyces pombe, PLB2 from Candida glabrata, and PLB3 from Saccharomyces cerevisiae. PLB1, PLB2, and PLB3 show PLB and lysoPL activities; PLB3 is specific for phosphoinositides.	552
132843	cd07204	Pat_PNPLA_like	Patatin-like phospholipase domain containing protein family. Members of this family share a patatin domain, initially discovered in potato tubers. PNPLA protein members show non-specific hydrolase activity with a variety of substrates such as triacylglycerol, phospholipids, and retinyl esters. It contains the lipase consensus sequence (Gly-X-Ser-X-Gly). Nomenclature of PNPLA family could be misleading as some of the mammalian members of this family show hydrolase, but no phospholipase activity.	243
132844	cd07205	Pat_PNPLA6_PNPLA7_NTE1_like	Patatin-like phospholipase domain containing protein 6, protein 7, and fungal NTE1. Patatin-like phospholipase domain containing protein 6 (PNPLA6) and protein 7 (PNPLA7) are included in this family. PNPLA6 is commonly known as Neuropathy Target Esterase (NTE). NTE has at least two functional domains: the N-terminal domain putatively regulatory domain and the C-terminal catalytic domain which shows esterase activity. NTE shows phospholipase activity for lysophosphatidylcholine (LPC) and phosphatidylcholine (PC). Exposure of NTE to organophosphates leads to organophosphate-induced delayed neurotoxicity (OPIDN). OPIDN is a progressive neurological condition that is characterized by weakness, paralysis, pain, and paresthesia. PNPLA7 is an insulin-regulated phospholipase that is homologous to Neuropathy Target Esterase (NTE or PNPLA6) and is also known as NTE-related esterase (NRE). Human NRE is predominantly expressed in prostate, white adipose, and pancreatic tissue. NRE hydrolyzes sn-1 esters in lysophosphatidylcholine and lysophosphatidic acid, but shows no lipase activity with substrates like triacylglycerols (TG), cholesteryl esters, retinyl esters (RE), phosphatidylcholine (PC), or monoacylglycerol (MG). This family includes subfamily of PNPLA6 (NTE) and PNPLA7 (NRE)-like phospholipases.	175
132845	cd07206	Pat_TGL3-4-5_SDP1	Triacylglycerol lipase 3, 4, and 5 and Sugar-Dependent 1 lipase. Triacylglycerol lipases are involved in triacylglycerol mobilization and degradation; they are found in lipid particles. TGL4 is 30% homologous to TGL3, whereas TGL5 is 26% homologous to TGL3. Sugar-Dependent 1 (SDP1) lipase has a patatin-like acyl-hydrolase domain that initiates the breakdown of storage oil in germinating Arabidopsis seeds. This family includes subfamilies of proteins: TGL3, TGL4, TGL5, and SDP1.	298
132846	cd07207	Pat_ExoU_VipD_like	ExoU and VipD-like proteins; homologous to patatin, cPLA2, and iPLA2. ExoU, a 74-kDa enzyme, is a potent virulence factor of Pseudomonas aeruginosa. One of the pathogenic mechanisms of P. aeruginosa is to induce cytotoxicity by the injection of effector proteins (e.g. ExoU) using the type III secretion (T3S) system. ExoU is homologous to patatin and also has the conserved catalytic residues of mammalian calcium-independent (iPLA2) and cytosolic (cPLA2) PLA2. In vitro, ExoU cytotoxicity is blocked by the inhibitor of cytosolic and Ca2+-independent phospholipase A2 (cPLA2 and iPLA2) enzymes, suggesting that phospholipase A2 inhibitors may represent a novel mode of treatment for acute P. aeruginosa infections. ExoU requires eukaryotic superoxide dismutase as a cofactor and cleaves phosphatidylcholine and phosphatidylethanolamine in vitro. VipD, a 69-kDa cytosolic protein, belongs to the members of Legionella pneumophila family and is homologous to ExoU from Pseudomonas. Even though VipD shows high sequence similarity with several functional regions of ExoU (e.g. oxyanion hole, active site serine, active site aspartate), it has been shown to have no phospholipase activity. This family includes ExoU from Pseudomonas aeruginosa and VipD of Legionella pneumophila.	194
132847	cd07208	Pat_hypo_Ecoli_yjju_like	Hypothetical patatin similar to yjju protein of Escherichia coli. Patatin-like phospholipase similar to yjju protein of Escherichia coli. This family predominantly consists of bacterial patatin glycoproteins, and some representatives from eukaryotes and archaea.  The patatin protein accounts for up to 40% of the total soluble protein in potato tubers. Patatin is a storage protein, but it also has the enzymatic activity of a lipid acyl hydrolase, catalyzing the cleavage of fatty acids from membrane lipids. Members of this family have also been found in vertebrates.	266
132848	cd07209	Pat_hypo_Ecoli_Z1214_like	Hypothetical patatin similar to Z1214 protein of Escherichia coli. Patatin-like phospholipase similar to Z1214 protein of Escherichia coli. This family predominantly consists of bacterial patatin glycoproteins and some representatives from eukaryotes and archaea. The patatin protein accounts for up to 40% of the total soluble protein in potato tubers. Patatin is a storage protein, but it also has the enzymatic activity of a lipid acyl hydrolase, catalyzing the cleavage of fatty acids from membrane lipids. Members of this family have also been found in vertebrates.	215
132849	cd07210	Pat_hypo_W_succinogenes_WS1459_like	Hypothetical patatin similar to WS1459 of Wolinella succinogenes. Patatin-like phospholipase. This family predominantly consists of bacterial patatin glycoproteins. The patatin protein accounts for up to 40% of the total soluble protein in potato tubers. Patatin is a storage protein, but it also has the enzymatic activity of a lipid acyl hydrolase, catalyzing the cleavage of fatty acids from membrane lipids. Members of this family have also been found in vertebrates.	221
132850	cd07211	Pat_PNPLA8	Patatin-like phospholipase domain containing protein 8. PNPLA8 is a Ca-independent myocardial phospholipase which maintains mitochondrial integrity. PNPLA8 is also known as iPLA2-gamma. In humans, it is predominantly expressed in heart tissue. iPLA2-gamma can catalyze both phospholipase A1 and A2 reactions (PLA1 and PLA2 respectively). This family includes PNPLA8 (iPLA2-gamma) from Homo sapiens and iPLA2-2 from Mus musculus.	308
132851	cd07212	Pat_PNPLA9	Patatin-like phospholipase domain containing protein 9. PNPLA9 is a Ca-independent phospholipase that catalyzes the hydrolysis of glycerophospholipids at the sn-2 position. PNPLA9 is also known as PLA2G6 (phospholipase A2 group VI) or iPLA2beta. PLA2G6 is stimulated by ATP and inhibited by bromoenol lactone (BEL). In humans, PNPLA9 is expressed ubiquitously and is involved in signal transduction, cell proliferation, and apoptotic cell death. Mutations in human PLA2G6 lead to infantile neuroaxonal dystrophy (INAD) and idiopathic neurodegeneration with brain iron accumulation (NBIA). This family includes PLA2G6 from Homo sapiens and Rattus norvegicus.	312
132852	cd07213	Pat17_PNPLA8_PNPLA9_like1	Patatin-like phospholipase. Patatin is a storage protein of the potato tuber that shows Phospholipase A2 activity (PLA2; EC 3.1.1.4). Patatin catalyzes the nonspecific hydrolysis of phospholipids, glycolipids, sulfolipids, and mono- and diacylglycerols, thereby showing lipid acyl hydrolase activity. The active site includes an oxyanion hole with a conserved GGxR motif; it is found in almost all the members of this family. The catalytic dyad is formed by a serine and an aspartate. Patatin belongs to the alpha-beta hydrolase family which is identified by a characteristic nucleophile elbow with a consensus sequence of Sm-X-Nu-Sm (Sm = small residue, X = any residue and Nu = nucleophile). Members of this family have been found also in vertebrates. This family includes subfamily of PNPLA8 (iPLA2-gamma) and PNPLA9 (iPLA2-beta) like phospholipases from human as well as the Pat17 isozyme from Solanum cardiophyllum.	288
132853	cd07214	Pat17_isozyme_like	Patatin-like phospholipase of plants. Pat17 is an isozyme of patatin cloned from Solanum cardiophyllum. Patatin is a storage protein of the potato tuber that shows Phospholipase A2 activity (PLA2; EC 3.1.1.4). Patatin catalyzes the nonspecific hydrolysis of phospholipids, glycolipids, sulfolipids, and mono- and diacylglycerols, thereby showing lipid acyl hydrolase activity. The active site includes an oxyanion hole with a conserved GGxR motif; it is found in almost all the members of this family. The catalytic dyad is formed by a serine and an aspartate. Patatin belongs to the alpha-beta hydrolase family which is identified by a characteristic nucleophile elbow with a consensus sequence of Sm-X-Nu-Sm (Sm = small residue, X = any residue, and Nu = nucleophile). Patatin-like phospholipase are included in this group. Members of this family have also been found in vertebrates.	349
132854	cd07215	Pat17_PNPLA8_PNPLA9_like2	Patatin-like phospholipase of bacteria. Patatin is a storage protein of the potato tuber that shows Phospholipase A2 activity (PLA2; EC 3.1.1.4). Patatin catalyzes the nonspecific hydrolysis of phospholipids, glycolipids, sulfolipids, and mono- and diacylglycerols, thereby showing lipid acyl hydrolase activity. The active site includes an oxyanion hole with a conserved GGxR motif; it is found in almost all the members of this family. The catalytic dyad is formed by a serine and an aspartate. Patatin belongs to the alpha-beta hydrolase family which is identified by a characteristic nucleophile elbow with a consensus sequence of Sm-X-Nu-Sm (Sm = small residue, X = any residue and Nu = nucleophile). Members of this family have been found also in vertebrates. This family includes subfamily of PNPLA8 (iPLA2-gamma) and PNPLA9 (iPLA2-beta) like phospholipases from human as well as the Pat17 isozyme from Solanum cardiophyllum.	329
132855	cd07216	Pat17_PNPLA8_PNPLA9_like3	Patatin-like phospholipase. Patatin is a storage protein of the potato tuber that shows Phospholipase A2 activity (PLA2; EC 3.1.1.4). Patatin catalyzes the nonspecific hydrolysis of phospholipids, glycolipids, sulfolipids, and mono- and diacylglycerols, thereby showing lipid acyl hydrolase activity. The active site includes an oxyanion hole with a conserved GGxR motif; it is found in almost all the members of this family. The catalytic dyad is formed by a serine and an aspartate. Patatin belongs to the alpha-beta hydrolase family which is identified by a characteristic nucleophile elbow with a consensus sequence of Sm-X-Nu-Sm (Sm = small residue, X = any residue and Nu = nucleophile). Members of this family have been found also in vertebrates. This family includes subfamily of PNPLA8 (iPLA2-gamma) and PNPLA9 (iPLA2-beta) like phospholipases from human as well as the Pat17 isozyme from Solanum cardiophyllum.	309
132856	cd07217	Pat17_PNPLA8_PNPLA9_like4	Patatin-like phospholipase. Patatin is a storage protein of the potato tuber that shows Phospholipase A2 activity (PLA2; EC 3.1.1.4). Patatin catalyzes the nonspecific hydrolysis of phospholipids, glycolipids, sulfolipids, and mono- and diacylglycerols, thereby showing lipid acyl hydrolase activity. The active site includes an oxyanion hole with a conserved GGxR motif; it is found in almost all the members of this family. The catalytic dyad is formed by a serine and an aspartate. Patatin belongs to the alpha-beta hydrolase family which is identified by a characteristic nucleophile elbow with a consensus sequence of Sm-X-Nu-Sm (Sm = small residue, X = any residue and Nu = nucleophile). Members of this family have been found also in vertebrates. This family includes subfamily of PNPLA8 (iPLA2-gamma) and PNPLA9 (iPLA2-beta) like phospholipases from human as well as the Pat17 isozyme from Solanum cardiophyllum.	344
132857	cd07218	Pat_iPLA2	Calcium-independent phospholipase A2; Classified as Group IVA-1 PLA2. Calcium-independent phospholipase A2; otherwise known as Group IVA-1 PLA2. It contains the lipase consensus sequence (Gly-X-Ser-X-Gly); mutagenesis experiments confirm the role of this serine as a nucleophile. Some members of this group show triacylglycerol lipase activity (EC 3.1.1.3). Members include iPLA-1, iPLA-2, and iPLA-3 from Aedes aegypti and show acylglycerol transacylase/lipase activity. Also includes putative iPLA2-eta from Pediculus humanus corporis which shows patatin-like phospholipase activity.	245
132858	cd07219	Pat_PNPLA1	Patatin-like phospholipase domain containing protein 1. Members of this family share a patatin domain, initially discovered in potato tubers. Some members of PNPLA1 subfamily do not have the lipase consensus sequence Gly-X-Ser-X-Gly which is essential for hydrolase activity.  This family includes PNPLA1 from Homo sapiens and Gallus gallus. Currently, there is no literature available on the physiological role, structure, or enzymatic activity of PNPLA1. It is expressed in various human tissues in low mRNA levels.	382
132859	cd07220	Pat_PNPLA2	Patatin-like phospholipase domain containing protein 2. PNPLA2 plays a key role in hydrolysis of stored triacylglycerols and is also known as adipose triglyceride lipase (ATGL). Members of this family share a patatin domain, initially discovered in potato tubers. ATGL is expressed in white and brown adipose tissue in high mRNA levels. Mutations in PNPLA2 encoding adipose triglyceride lipase (ATGL) lead to neutral lipid storage disease (NLSD) which is characterized by the accumulation of triglycerides in multiple tissues. ATGL mutations are also commonly associated with severe forms of skeletal- and cardio-myopathy. This family includes patatin-like proteins: TTS-2.2 (transport-secretion protein 2.2), PNPLA2 (Patatin-like phospholipase domain-containing protein 2), and iPLA2-zeta (Calcium-independent phospholipase A2) from Homo sapiens.	249
132860	cd07221	Pat_PNPLA3	Patatin-like phospholipase domain containing protein 3. PNPLA3 is a triacylglycerol lipase that mediates triacylglycerol hydrolysis in adipocytes and is an indicator of the nutritional state. PNPLA3 is also known as adiponutrin (ADPN) or iPLA2-epsilon. Human adiponutrins are bound to the cell membrane of adipocytes and show transacylase, TG hydrolase, and PLA2 activity. This family includes patatin-like proteins: ADPN (adiponutrin) from mammals, PNPLA3 (Patatin-like phospholipase domain-containing protein 3), and iPLA2-epsilon (Calcium-independent phospholipase A2) from Homo sapiens.	252
132861	cd07222	Pat_PNPLA4	Patatin-like phospholipase domain containing protein 4. PNPLA4, also known as GS2 (gene sequence-2), shows both lipase and transacylation activities. GS2 lipase is expressed in various tissues, predominantly in muscle and adipocytes tissue. It is also expressed in keratinocytes and shows retinyl ester hydrolase, acylglycerol, TG hydrolase, and PLA2 activity. This family includes patatin-like proteins: GS2 from mammals, PNPLA4 (Patatin-like phospholipase domain-containing protein 4), and iPLA2-eta (Calcium-independent phospholipase A2) from Homo sapiens.	246
132862	cd07223	Pat_PNPLA5-mammals	Patatin-like phospholipase domain containing protein 5. PNPLA5, also known as GS2L (GS2-like), plays a role in regulation of adipocyte differentiation. PNPLA5 is expressed in brain tissue in high mRNA levels and low levels in liver tissue. There is no concrete evidence in support of the enzymatic activity of GS2L. This family includes patatin-like proteins: GS2L (GS2-like) and PNPLA5 (Patatin-like phospholipase domain-containing protein 5) reported exclusively in mammals.	405
132863	cd07224	Pat_like	Patatin-like phospholipase. Patatin-like phospholipase. This family consists of various patatin glycoproteins from plants. The patatin protein accounts for up to 40% of the total soluble protein in potato tubers. Patatin is a storage protein, but it also has the enzymatic activity of lipid acyl hydrolase, catalysing the cleavage of fatty acids from membrane lipids. Members of this family have been found also in vertebrates.	233
132864	cd07225	Pat_PNPLA6_PNPLA7	Patatin-like phospholipase domain containing protein 6 and protein 7. Patatin-like phospholipase domain containing protein 6 (PNPLA6) and protein 7 (PNPLA7) are 60% identical to each other. PNPLA6 is commonly known as Neuropathy Target Esterase (NTE). NTE has at least two functional domains: the N-terminal domain putatively regulatory domain and the C-terminal catalytic domain which shows esterase activity. NTE shows phospholipase activity for lysophosphatidylcholine (LPC) and phosphatidylcholine (PC). Exposure of NTE to organophosphates leads to organophosphate-induced delayed neurotoxicity (OPIDN). OPIDN is a progressive neurological condition that is characterized by weakness, paralysis, pain, and paresthesia. PNPLA7 is an insulin-regulated phospholipase that is homologous to Neuropathy Target Esterase (NTE or PNPLA6) and is also known as NTE-related esterase (NRE). Human NRE is predominantly expressed in prostate, white adipose, and pancreatic tissue. NRE hydrolyzes sn-1 esters in lysophosphatidylcholine and lysophosphatidic acid, but shows no lipase activity with substrates like triacylglycerols (TG), cholesteryl esters, retinyl esters (RE), phosphatidylcholine (PC), or monoacylglycerol (MG). This family includes PNPLA6 and PNPLA7 from Homo sapiens, YMF9 from Yeast, and Swiss Cheese protein (sws) from Drosophila melanogaster.	306
132865	cd07227	Pat_Fungal_NTE1	Fungal patatin-like phospholipase domain containing protein 6. These are fungal Neuropathy Target Esterase (NTE), commonly referred to as NTE1. Patatin-like phospholipase. NTE has at least two functional domains: the N-terminal domain putatively regulatory domain and the C-terminal catalytic domain which shows esterase activity. NTE shows phospholipase activity for lysophosphatidylcholine (LPC) and phosphatidylcholine (PC). Exposure of NTE to organophosphates leads to organophosphate-induced delayed neurotoxicity (OPIDN). OPIDN is a progressive neurological condition that is characterized by weakness, paralysis, pain, and paresthesia. This family includes NTE1 from fungi.	269
132866	cd07228	Pat_NTE_like_bacteria	Bacterial patatin-like phospholipase domain containing protein 6. Bacterial patatin-like phospholipase domain containing protein 6. PNPLA6 is commonly known as Neuropathy Target Esterase (NTE). NTE has at least two functional domains: the N-terminal domain putatively regulatory domain and the C-terminal catalytic domain which shows esterase activity. NTE shows phospholipase activity for lysophosphatidylcholine (LPC) and phosphatidylcholine (PC). Exposure of NTE to organophosphates leads to organophosphate-induced delayed neurotoxicity (OPIDN). OPIDN is a progressive neurological condition that is characterized by weakness, paralysis, pain, and paresthesia. This group includes YCHK and rssA from Escherichia coli as well as Ylbk from Bacillus amyloliquefaciens.	175
132867	cd07229	Pat_TGL3_like	Triacylglycerol lipase 3. Triacylglycerol lipase 3 (TGL3) are responsible for all the TAG lipase activity of the lipid particle. Triacylglycerol (TAG) lipases are also necessary for the mobilization of TAG stored in lipid particles. TGL3 contains the consensus sequence motif GXSXG, which is found in lipolytic enzymes. This family includes Tgl3p from Saccharomyces cerevisiae.	391
132868	cd07230	Pat_TGL4-5_like	Triacylglycerol lipase 4 and 5. TGL4 and TGL5 are triacylglycerol lipases that are involved in triacylglycerol mobilization and degradation; they are found in lipid particles. Tgl4 is a functional ortholog of mammalian adipose TG lipase (ATGL) and is phosphorylated and activated by cyclin-dependent kinase 1 (Cdk1/Cdc28). TGL4 is 30% homologous to TGL3, whereas TGL5 is 26% homologous to TGL3. This family includes TGL4 (STC1) and TGL5 (STC2) from Saccharomyces cerevisiae.	421
132869	cd07231	Pat_SDP1-like	Sugar-Dependent 1 like lipase. Sugar-Dependent 1 (SDP1) lipase has a patatin-like acyl-hydrolase domain that initiates the breakdown of storage oil in germinating Arabidopsis seeds. This acyl-hydrolase domain is homologous to yeast triacylglycerol lipase 3 and human adipose triglyceride lipase. This family includes SDP1 from Arabidopsis thaliana.	323
132870	cd07232	Pat_PLPL	Patatin-like phospholipase. Patatin-like phospholipase. This family consists of various patatin glycoproteins from plants and fungi. The patatin protein accounts for up to 40% of the total soluble protein in potato tubers. Patatin is a storage protein, but it also has the enzymatic activity of a lipid acyl hydrolase, catalyzing the cleavage of fatty acids from membrane lipids. Members of this family have been found also in vertebrates.	407
319900	cd07233	GlxI_Zn	Glyoxalase I that uses Zn(++) as cofactor. This family includes eukaryotic glyoxalase I that prefers the divalent cation zinc as cofactor. Glyoxalase I (also known as lactoylglutathione lyase; EC 4.4.1.5) is part of the glyoxalase system, a two-step system for detoxifying methylglyoxal, a side product of glycolysis. This system is responsible for the conversion of reactive, acyclic alpha-oxoaldehydes into the corresponding alpha-hydroxyacids and involves 2 enzymes, glyoxalase I and II. Glyoxalase I catalyses an intramolecular redox reaction of the hemithioacetal (formed from methylglyoxal and glutathione) to form the thioester, S-D-lactoylglutathione. This reaction involves the transfer of two hydrogen atoms from C1 to C2 of the methylglyoxal, and proceeds via an ene-diol intermediate. Glyoxalase I has a requirement for bound metal ions for catalysis. Eukaryotic glyoxalase I prefers the divalent cation zinc as cofactor, whereas Escherichia coli and other prokaryotic glyoxalase I uses nickel. However, eukaryotic Trypanosomatid parasites also use nickel as a cofactor, which could possibly be explained by acquiring their GLOI gene by horizontal gene transfer. Human glyoxalase I is a two-domain enzyme and it has the structure of a domain-swapped dimer with two active sites located at the dimer interface. In yeast, in various plants, insects and Plasmodia, glyoxalase I is four-domain, possibly the result of a further gene duplication and an additional gene fusing event.	142
319901	cd07235	MRD	Mitomycin C resistance protein (MRD). Mitomycin C (MC) is a naturally occurring antibiotic, and antitumor agent used in the treatment of cancer. Its antitumor activity is exerted primarily through monofunctional and bifunctional alkylation of DNA. MRD binds to MC and functions as a component of the MC exporting system. MC is bound to MRD by a stacking interaction between a His and a Trp. MRD adopts a structural fold similar to bleomycin resistance protein, glyoxalase I, and extradiol dioxygenases; and it has binding sites at an identical location to binding sites in these evolutionarily related enzymes.	123
319902	cd07237	BphC1-RGP6_C_like	C-terminal domain of 2,3-dihydroxybiphenyl 1,2-dioxygenase. This subfamily contains the C-terminal, catalytic, domain of BphC1-RGP6 and similar proteins. BphC catalyzes the extradiol ring cleavage reaction of 2,3-dihydroxybiphenyl, the third step in the polychlorinated biphenyls (PCBs) degradation pathway (bph pathway). This subfamily of BphCs belongs to the type I extradiol dioxygenase family, which require a metal in the active site in its catalytic mechanism. Polychlorinated biphenyl degrading bacteria demonstrate a multiplicity of BphCs. For example, three types of BphC enzymes have been found in Rhodococcus globerulus (BphC1-RGP6 - BphC3-RGP6), all three enzymes are type I extradiol dioxygenases. BphC1-RGP6 has an internal duplication, it is a two-domain dioxygenase which forms octamers, and has Fe(II) at the catalytic site. Its C-terminal repeat is represented in this subfamily. BphC2-RGP6 and BphC3-RGP6 are one-domain dioxygenases, they belong to a different subfamily of the ED_TypeI_classII_C  (C-terminal domain of type I, class II extradiol dioxygenases) family.	153
319903	cd07238	VOC_like	uncharacterized subfamily of vicinal oxygen chelate (VOC) family. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping.	112
319904	cd07239	BphC5-RK37_C_like	C-terminal, catalytic domain of BphC5 (2,3-dihydroxybiphenyl 1,2-dioxygenase). 2,3-dihydroxybiphenyl 1,2-dioxygenase (BphC) catalyzes the extradiol ring cleavage reaction of 2,3-dihydroxybiphenyl, the third step in the polychlorinated biphenyls (PCBs) degradation pathway (bph pathway). The enzyme contains an N-terminal and a C-terminal domain of similar structure fold, resulting from an ancient gene duplication. BphC belongs to the type I extradiol dioxygenase family, which requires a metal in the active site for its catalytic activity. Polychlorinated biphenyl degrading bacteria demonstrate multiplicity of BphCs. Bacterium Rhodococcus rhodochrous K37 has eight genes encoding BphC enzymes. This family includes the C-terminal domain of BphC5-RrK37. The crystal structure of the protein from Novosphingobium aromaticivorans has a Mn(II) in the active site, although most proteins of type I extradiol dioxygenases are activated by Fe(II).	143
319905	cd07241	VOC_BsYyaH	vicinal oxygen chelate (VOC) family protein similar to Bacillus subtilis YyaH. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping.	125
319906	cd07242	VOC_BsYqjT	vicinal oxygen chelate (VOC) family protein similar to Bacillus subtilis YqjT. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping.	126
319907	cd07243	2_3_CTD_C	C-terminal domain of catechol 2,3-dioxygenase. This subfamily contains the C-terminal, catalytic, domain of catechol 2,3-dioxygenase. Catechol 2,3-dioxygenase (2,3-CTD, catechol:oxygen 2,3-oxidoreductase) catalyzes an extradiol cleavage of catechol to form 2-hydroxymuconate semialdehyde with the insertion of two atoms of oxygen. The enzyme is a homotetramer and contains catalytically essential Fe(II). The reaction proceeds by an ordered bi-unit mechanism. First, catechol binds to the enzyme; this is then followed by the binding of dioxygen to form a ternary complex, and then the aromatic ring is cleaved to produce 2-hydroxymuconate semialdehyde. Catechol 2,3-dioxygenase belongs to the type I extradiol dioxygenase family. The subunit comprises the N- and C-terminal domains of similar structure fold, resulting from an ancient gene duplication. The active site is located in a funnel-shaped space of the C-terminal domain. This subfamily represents the C-terminal domain.	144
319908	cd07244	FosA	fosfomycin resistant protein subfamily FosA. This subfamily family contains FosA, a fosfomycin resistant protein. FosA is a Mn(II) and K(+)-dependent glutathione transferase. Fosfomycin inhibits the enzyme UDP-N-acetylglucosamine-3-enolpyruvyltransferase (MurA), which catalyzes the first committed step in bacterial cell wall biosynthesis. FosA, catalyzes the addition of glutathione to the antibiotic fosfomycin, (1R,2S)-epoxypropylphosphonic acid, making it inactive. FosA is a Mn(II) dependent enzyme. It is evolutionarily related to glyoxalase I and type I extradiol dioxygenases.	121
319909	cd07245	VOC_like	uncharacterized subfamily of vicinal oxygen chelate (VOC) family. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping.	117
319910	cd07246	VOC_like	uncharacterized subfamily of vicinal oxygen chelate (VOC) family. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping.	124
319911	cd07247	SgaA_N_like	N-terminal domain of Streptomyces griseus SgaA and similar domains. SgaA suppresses the growth disturbances caused by high osmolarity and a high concentration of A-factor, a microbial hormone, during the early growth phase in Streptomyces griseus. A-factor (2-isocapryloyl-3R-hydroxymethyl-gamma-butyrolactone) controls morphological differentiation and secondary metabolism in Streptomyces griseus. It is a chemical signaling molecule that at a very low concentration acts as a switch for yellow pigment production, aerial mycelium formation, streptomycin production, and streptomycin resistance. The structure and amino acid sequence of SgaA are closely related to a group of antibiotics resistance proteins, including bleomycin resistance protein, mitomycin resistance protein, and fosfomycin resistance proteins. SgaA might also function as a streptomycin resistance protein.	114
319912	cd07249	MMCE	Methylmalonyl-CoA epimerase (MMCE). MMCE, also called methylmalonyl-CoA racemase (EC 5.1.99.1) interconverts (2R)-methylmalonyl-CoA and (2S)-methylmalonyl-CoA. MMCE has been found in bacteria, archaea, and in animals. In eukaryotes, MMCE is an essential enzyme in a pathway that converts propionyl-CoA to succinyl-CoA, and is important in the breakdown of odd-chain length fatty acids, branched-chain amino acids, and other metabolites. In bacteria, MMCE participates in the reverse pathway for propionate fermentation, glyoxylate regeneration, and the biosynthesis of polyketide antibiotics. MMCE is closely related to glyoxalase I and type I extradiol dioxygenases.	127
319913	cd07250	HPPD_C_like	C-terminal domain of 4-hydroxyphenylpyruvate dioxygenase (HppD) and hydroxymandelate synthase (HmaS). HppD and HmaS are non-heme iron-dependent dioxygenases, which modify a common substrate, 4-hydroxyphenylpyruvate (HPP), but yield different products. HPPD catalyzes the second reaction in tyrosine catabolism, the conversion of 4-hydroxyphenylpyruvate to homogentisate (2,5-dihydroxyphenylacetic acid, HG). HmaS converts HPP to 4-hydroxymandelate, a committed step in the formation of hydroxyphenylglycine, a structural component of nonproteinogenic macrocyclic peptide antibiotics, such as vancomycin. If the emphasis is on catalytic chemistry, HPPD and HmaS are classified as members of a large family of alpha-keto acid dependent mononuclear non-heme iron oxygenases most of which require Fe(II), molecular oxygen, and an alpha-keto acid (typically alpha-ketoglutarate) to either oxygenate or oxidize a third substrate. Both enzymes are exceptions in that they require two, instead of three, substrates, do not use alpha-ketoglutarate, and incorporate both atoms of dioxygen into the aromatic product. Both HPPD and HmaS exhibit duplicate beta barrel topology in their N- and C-terminal domains which share sequence similarity, suggestive of a gene duplication. Each protein has only one catalytic site located in the C-terminal domain. This HPPD_C_like domain represents the C-terminal domain.	194
319914	cd07251	VOC_like	uncharacterized subfamily of vicinal oxygen chelate (VOC) family. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping.	120
319915	cd07252	BphC1-RGP6_N_like	N-terminal domain of 2,3-dihydroxybiphenyl 1,2-dioxygenase. This subfamily contains the N-terminal, non-catalytic, domain of BphC1-RGP6 and similar proteins. BphC catalyzes the extradiol ring cleavage reaction of 2,3-dihydroxybiphenyl, the third step in the polychlorinated biphenyls (PCBs) degradation pathway (bph pathway). This subfamily of BphCs belongs to the type I extradiol dioxygenase family, which requires a metal in the active site in its catalytic mechanism. Polychlorinated biphenyl degrading bacteria demonstrate a multiplicity of 2,3-dihydroxybiphenyl 1,2-dioxygenases. For example, three types of BphC enzymes have been found in Rhodococcus globerulus (BphC1-RGP6 - BphC3-RGP6); all three enzymes are type I extradiol dioxygenases. BphC1-RGP6 has an internal duplication; it is a two-domain dioxygenase which forms octamers, and has Fe(II) at the catalytic site. Its N-terminal repeat is represented in this subfamily. BphC2-RGP6 and BphC3-RGP6 are one-domain dioxygenases; they belong to a different family, the ED_TypeI_classII_C (C-terminal domain of type I, class II extradiol dioxygenases) family.	120
319916	cd07253	GLOD5	Human glyoxalase domain-containing protein 5 and similar proteins. Uncharacterized subfamily of VOC family contains human glyoxalase domain-containing protein 5 and similar proteins. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping.	123
319917	cd07254	VOC_like	uncharacterized subfamily of vicinal oxygen chelate (VOC) family. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping.	120
319918	cd07255	VOC_BsCatE_like_N	N-terminal of Bacillus subtilis CatE like protein. Uncharacterized subfamily of VOC superfamily contains Bacillus subtilis CatE and similar proteins. CatE is proposed to function as Catechol-2,3-dioxygenase. VOC is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping.	124
319919	cd07256	HPCD_C_class_II	C-terminal domain of 3,4-dihydroxyphenylacetate 2,3-dioxygenase (HPCD). This subfamily contains the C-terminal, catalytic, domain of HPCD. HPCD catalyses the second step in the degradation of 4-hydroxyphenylacetate to succinate and pyruvate. The aromatic ring of 4-hydroxyphenylacetate is opened by this dioxygenase to yield the 3,4-diol product, 2-hydroxy-5-carboxymethylmuconate semialdehyde. HPCD is a homotetramer and each monomer contains two structurally homologous barrel-shaped domains at the N- and C-terminus. The active-site metal is located in the C-terminal barrel and plays an essential role in the catalytic mechanism. Most extradiol dioxygenases contain Fe(II) in their active site, but HPCD can be activated by either Mn(II) or Fe(II). These enzymes belong to the type I class II family of extradiol dioxygenases. The class III 3,4-dihydroxyphenylacetate 2,3-dioxygenases belong to a different superfamily.	160
319920	cd07257	THT_oxygenase_C	The C-terminal domain of 2,4,5-trihydroxytoluene (THT) oxygenase. This subfamily contains the C-terminal, catalytic, domain of THT oxygenase. THT oxygenase is an extradiol dioxygenase in the 2,4-dinitrotoluene (DNT) degradation pathway. It catalyzes the conversion of 2,4,5-trihydroxytoluene to an unstable ring fission product, 2,4-dihydroxy-5-methyl-6-oxo-2,4-hexadienoic acid. The native protein was determined to be a dimer by gel filtration. The enzyme belongs to the type I family of extradiol dioxygenases which contains two structurally homologous barrel-shaped domains at the N- and C-terminus of each monomer. The active-site metal is located in the C-terminal barrel. Fe(II) is required for its catalytic activity.	152
319921	cd07258	PpCmtC_C	C-terminal domain of 2,3-dihydroxy-p-cumate-3,4-dioxygenase (PpCmtC). This subfamily contains the C-terminal, catalytic, domain of PpCmtC. 2,3-dihydroxy-p-cumate-3,4-dioxygenase (CmtC of Pseudomonas putida F1) is a dioxygenase involved in the eight-step catabolism pathway of p-cymene. CmtC acts upon the reaction intermediate 2,3-dihydroxy-p-cumate, yielding 2-hydroxy-3-carboxy-6-oxo-7-methylocta-2,4-dienoate. The CmtC belongs to the type I family of extradiol dioxygenases. Fe2+ was suggested as a cofactor, same as for other enzymes in the family. The type I family of extradiol dioxygenases contains two structurally homologous barrel-shaped domains at the N- and C-terminal. The active-site metal is located in the C-terminal barrel and plays an essential role in the catalytic mechanism.	138
319922	cd07261	EhpR_like	phenazine resistance protein, EhpR. Phenazine resistance protein (EhpR) in Enterobacter agglomerans confers resistance by binding D-alanyl-griseoluteic acid and acting as a chaperone involved in exporting the antibiotic rather than by altering it chemically. EhpR is evolutionarily related to glyoxalase I and type I extradiol dioxygenases.	114
319923	cd07262	VOC_like	uncharacterized subfamily of vicinal oxygen chelate (VOC) family. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping.	121
319924	cd07263	VOC_like	uncharacterized subfamily of vicinal oxygen chelate (VOC) family. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping.	120
319925	cd07264	VOC_like	uncharacterized subfamily of vicinal oxygen chelate (VOC) family. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping.	118
319926	cd07265	2_3_CTD_N	N-terminal domain of catechol 2,3-dioxygenase. This subfamily contains the N-terminal, non-catalytic, domain of catechol 2,3-dioxygenase. Catechol 2,3-dioxygenase (2,3-CTD, catechol:oxygen 2,3-oxidoreductase) catalyzes an extradiol cleavage of catechol to form 2-hydroxymuconate semialdehyde with the insertion of two atoms of oxygen. The enzyme is a homotetramer and contains catalytically essential Fe(II). The reaction proceeds by an ordered bi-uni mechanism. First, catechol binds to the enzyme; this is followed by the binding of dioxygen to form a ternary complex, and then the aromatic ring is cleaved to produce 2-hydroxymuconate semialdehyde. Catechol 2,3-dioxygenase belongs to the type I extradiol dioxygenase family. The subunit comprises the N- and C-terminal domains of similar structural fold, resulting from an ancient gene duplication. The active site is located in a funnel-shaped space of the C-terminal domain. This subfamily represents the N-terminal domain.	122
319927	cd07266	HPCD_N_class_II	N-terminal domain of 3,4-dihydroxyphenylacetate 2,3-dioxygenase (HPCD). This subfamily contains the N-terminal, non-catalytic, domain of HPCD. HPCD catalyses the second step in the degradation of 4-hydroxyphenylacetate to succinate and pyruvate. The aromatic ring of 4-hydroxyphenylacetate is opened by this dioxygenase to yield the 3,4-diol product, 2-hydroxy-5-carboxymethylmuconate semialdehyde. HPCD is a homotetramer and each monomer contains two structurally homologous barrel-shaped domains at the N- and C-terminus. The active-site metal is located in the C-terminal barrel and plays an essential role in the catalytic mechanism. Most extradiol dioxygenases contain Fe(II) in their active site, but HPCD can be activated by either Mn(II) or Fe(II). These enzymes belong to the type I class II family of extradiol dioxygenases. The class III 3,4-dihydroxyphenylacetate 2,3-dioxygenases belong to a different superfamily.	118
319928	cd07267	THT_Oxygenase_N	N-terminal domain of 2,4,5-trihydroxytoluene (THT) oxygenase. This subfamily contains the N-terminal, non-catalytic, domain of THT oxygenase. THT oxygenase is an extradiol dioxygenase in the 2,4-dinitrotoluene (DNT) degradation pathway. It catalyzes the conversion of 2,4,5-trihydroxytoluene to an unstable ring fission product, 2,4-dihydroxy-5-methyl-6-oxo-2,4-hexadienoic acid. The native protein was determined to be a dimer by gel filtration. The enzyme belongs to the type I family of extradiol dioxygenases which contains two structurally homologous barrel-shaped domains at the N- and C-terminus of each monomer. The active-site metal is located in the C-terminal barrel. Fe(II) is required for its catalytic activity.	113
319929	cd07268	VOC_EcYecM_like	Escherichia coli YecM and similar proteins, a vicinal oxygen chelate subfamily. Uncharacterized subfamily of vicinal oxygen chelate (VOC) superfamily contains Escherichia coli YecM and similar proteins. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping.	171
132809	cd07276	PX_SNX16	The phosphoinositide binding Phox Homology domain of Sorting Nexin 16. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. SNX16 contains a central PX domain followed by a coiled-coil region. SNX16 is localized in early and recycling endosomes through the binding of its PX domain to phosphatidylinositol-3-phosphate (PI3P). It plays a role in epidermal growth factor (EGF) signaling by regulating EGF receptor membrane trafficking.	110
132810	cd07277	PX_RUN	The phosphoinositide binding Phox Homology domain of uncharacterized proteins containing PX and RUN domains. The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to PI-enriched membranes. Members in this subfamily are uncharacterized proteins containing an N-terminal RUN domain and a C-terminal PX domain. PX domain harboring proteins have been implicated in highly diverse functions such as cell signaling, vesicular trafficking, protein sorting, lipid modification, cell polarity and division, activation of T and B cells, and cell survival. In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction. The RUN domain is found in GTPases in the Rap and Rab families and may play a role in Ras-like signaling pathways.	118
132811	cd07278	PX_RICS_like	The phosphoinositide binding Phox Homology domain of PX-RICS-like proteins. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions such as cell signaling, vesicular trafficking, protein sorting, and lipid modification, among others. Members of this family include PX-RICS, TCGAP (Tc10/Cdc42 GTPase-activating protein), and similar proteins. They contain N-terminal PX and Src Homology 3 (SH3) domains, a central Rho GAP domain, and C-terminal extensions. They act as Rho GTPase-activating proteins. PX-RICS is the main isoform expressed during neural development. It is involved in neural functions including axon and dendrite extension, postnatal remodeling, and fine-tuning of neural circuits during early brain development. The PX domain of PX-RICS specifically binds phosphatidylinositol 3-phosphate (PI3P), PI4P, and PI5P. TCGAP is widely expressed in the brain where it is involved in regulating the outgrowth of axons and dendrites and is regulated by the protein tyrosine kinase Fyn. The PX domain is involved in targeting of proteins to PI-enriched membranes, and may also be involved in protein-protein interaction.	114
132812	cd07279	PX_SNX20_21_like	The phosphoinositide binding Phox Homology domain of Sorting Nexins 20 and 21. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. This subfamily consists of SNX20, SNX21, and similar proteins. SNX20 interacts with P-Selectin glycoprotein ligand-1 (PSGL-1), a surface-expressed mucin that acts as a ligand for the selectin family of adhesion proteins. It may function in the sorting and cycling of PSGL-1 into endosomes. SNX21, also called SNX-L, is distinctly and highly-expressed in fetal liver and may be involved in protein sorting and degradation during embryonic liver development.	112
132813	cd07280	PX_YPT35	The phosphoinositide binding Phox Homology domain of the fungal protein YPT35. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions such as cell signaling, vesicular trafficking, protein sorting, and lipid modification, among others. This subfamily is composed of YPT35 proteins from the fungal subkingdom Dikarya. The PX domain is involved in targeting of proteins to PI-enriched membranes, and may also be involved in protein-protein interaction. The PX domain of YPT35 binds to phosphatidylinositol 3-phosphate (PI3P). It also serves as a protein interaction domain, binding to members of the Yip1p protein family, which localize to the ER and Golgi. YPT35 is mainly associated with endosomes and together with Yip1p proteins, may be involved in a specific function in the endocytic pathway.	120
132814	cd07281	PX_SNX1	The phosphoinositide binding Phox Homology domain of Sorting Nexin 1. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. SNX1 is both membrane associated and a cytosolic protein that exists as a tetramer in protein complexes. It can associate reversibly with membranes of the endosomal compartment, thereby coating these vesicles. SNX1 is a component of the retromer complex, a membrane coat multimeric complex required for endosomal retrieval of lysosomal hydrolase receptors to the Golgi. The retromer consists of a cargo-recognition subcomplex and a subcomplex formed by a dimer of sorting nexins (SNX1 and/or SNX2), which ensures efficient cargo sorting by facilitating proper membrane localization of the cargo-recognition subcomplex. SNX1 contains a Bin/Amphiphysin/Rvs (BAR) domain C-terminal to the PX domain. The PX domain of SNX1 specifically binds phosphatidylinositol-3-phosphate (PI3P) and PI(3,5)P2, while the BAR domain detects membrane curvature. Both domains help determine the specific membrane-targeting of SNX1, which is localized to a microdomain in early endosomes where it regulates cation-independent mannose-6-phosphate receptor retrieval to the trans Golgi network.	124
132815	cd07282	PX_SNX2	The phosphoinositide binding Phox Homology domain of Sorting Nexin 2. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. SNX2 is a component of the retromer complex, a membrane coat multimeric complex required for endosomal retrieval of lysosomal hydrolase receptors to the Golgi. The retromer consists of a cargo-recognition subcomplex and a subcomplex formed by a dimer of sorting nexins (SNX1 and/or SNX2), which ensures efficient cargo sorting by facilitating proper membrane localization of the cargo-recognition subcomplex. Similar to SNX1, SNX2 contains a Bin/Amphiphysin/Rvs (BAR) domain, which detects membrane curvature, C-terminal to the PX domain. The PX domain of SNX2 preferentially binds phosphatidylinositol-3-phosphate (PI3P), but not PI(3,4,5)P3. Studies on mice deficient in SNX1 and/or SNX2 suggest that they provide an essential function in embryogenesis and are functionally redundant.	124
132816	cd07283	PX_SNX30	The phosphoinositide binding Phox Homology domain of Sorting Nexin 30. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. Some SNXs are localized in early endosome structures such as clathrin-coated pits, while others are located in late structures of the endocytic pathway. SNX30 harbors a Bin/Amphiphysin/Rvs (BAR) domain, which detects membrane curvature, C-terminal to the PX domain, similar to the sorting nexins SNX1-2, SNX4-8, and SNX32. Both domains have been shown to determine the specific membrane-targeting of SNX1. The specific function of SNX30 has yet to be elucidated.	116
132817	cd07284	PX_SNX7	The phosphoinositide binding Phox Homology domain of Sorting Nexin 7. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. Some SNXs are localized in early endosome structures such as clathrin-coated pits, while others are located in late structures of the endocytic pathway. SNX7 harbors a Bin/Amphiphysin/Rvs (BAR) domain, which detects membrane curvature, C-terminal to the PX domain, similar to the sorting nexins SNX1-2, SNX4-6, SNX8, SNX30, and SNX32. Both domains have been shown to determine the specific membrane-targeting of SNX1. The specific function of SNX7 has yet to be elucidated.	116
132818	cd07285	PX_SNX9	The phosphoinositide binding Phox Homology domain of Sorting Nexin 9. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. SNX9, also known as SH3PX1, is a cytosolic protein that interacts with proteins associated with clathrin-coated pits such as Cdc-42-associated tyrosine kinase 2 (ACK2). It contains an N-terminal Src Homology 3 (SH3) domain, a PX domain, and a C-terminal Bin/Amphiphysin/Rvs (BAR) domain, which detects membrane curvature. The PX-BAR structural unit helps determine specific membrane localization. Through its SH3 domain, SNX9 binds class I polyproline sequences found in dynamin 1/2 and the WASP/N-WASP actin regulators. SNX9 is localized to plasma membrane endocytic sites and acts primarily in clathrin-mediated endocytosis. Its array of interacting partners suggests that SNX9 functions at the interface between endocytosis and actin cytoskeletal organization.	126
132819	cd07286	PX_SNX18	The phosphoinositide binding Phox Homology domain of Sorting Nexin 18. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. SNX18, like SNX9, contains an N-terminal Src Homology 3 (SH3) domain, a PX domain, and a C-terminal Bin/Amphiphysin/Rvs (BAR) domain, which detects membrane curvature. The PX-BAR structural unit helps determine specific membrane localization. SNX18 is localized to peripheral endosomal structures, and acts in a trafficking pathway that is clathrin-independent but relies on AP-1 and PACS1.	127
132820	cd07287	PX_RPK118_like	The phosphoinositide binding Phox Homology domain of RPK118-like proteins. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions such as cell signaling, vesicular trafficking, protein sorting, and lipid modification, among others. Members of this subfamily bear similarity to human RPK118, which contains an N-terminal PX domain, a Microtubule Interacting and Trafficking (MIT) domain, and a kinase domain. RPK118 binds sphingosine kinase, a key enzyme in the synthesis of sphingosine 1-phosphate (SPP), a lipid messenger involved in many cellular events. RPK118 may be involved in transmitting SPP-mediated signaling. It also binds the antioxidant peroxiredoxin-3 (PRDX3) and may be involved in the transport of PRDX3 from the cytoplasm to its site of function in the mitochondria. Members of this subfamily also show similarity to sorting nexin 15 (SNX15), which contains PX and MIT domains but does not contain a kinase domain. SNXs make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNX15 plays a role in protein trafficking processes in the endocytic pathway and the trans-Golgi network. The PX domain of SNX15 interacts with the PDGF receptor and is responsible for the membrane association of the protein.	118
132821	cd07288	PX_SNX15	The phosphoinositide binding Phox Homology domain of Sorting Nexin 15. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. SNX15 contains an N-terminal PX domain and a C-terminal Microtubule Interacting and Trafficking (MIT) domain. It plays a role in protein trafficking processes in the endocytic pathway and the trans-Golgi network. The PX domain of SNX15 interacts with the PDGF receptor and is responsible for the membrane association of the protein.	118
132822	cd07289	PX_PI3K_C2_alpha	The phosphoinositide binding Phox Homology Domain of the Alpha Isoform of Class II Phosphoinositide 3-Kinases. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. The Phosphoinositide 3-Kinase (PI3K) family of enzymes catalyzes the phosphorylation of the 3-hydroxyl group of the inositol ring of phosphatidylinositol. PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. PI3Ks are divided into three main classes (I, II, and III) based on their substrate specificity, regulation, and domain structure. Class II PI3Ks preferentially use PI as a substrate to produce PI3P, but can also phosphorylate PI4P to produce PI(3,4)P2. They function as monomers and do not associate with any regulatory subunits. Class II enzymes contain an N-terminal Ras binding domain, a lipid binding C2 domain, a PI3K homology domain of unknown function, an ATP-binding catalytic domain, a PX domain, and a second C2 domain at the C-terminus. The class II alpha isoform, PI3K-C2alpha, plays key roles in clathrin assembly and clathrin-mediated membrane trafficking, insulin signaling, vascular smooth muscle contraction, and the priming of neurosecretory granule exocytosis. The PX domain is involved in targeting of proteins to PI-enriched membranes, and may also be involved in protein-protein interaction.	109
132823	cd07290	PX_PI3K_C2_beta	The phosphoinositide binding Phox Homology Domain of the Beta Isoform of Class II Phosphoinositide 3-Kinases. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. The Phosphoinositide 3-Kinase (PI3K) family of enzymes catalyzes the phosphorylation of the 3-hydroxyl group of the inositol ring of phosphatidylinositol. PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. PI3Ks are divided into three main classes (I, II, and III) based on their substrate specificity, regulation, and domain structure. Class II PI3Ks preferentially use PI as a substrate to produce PI3P, but can also phosphorylate PI4P to produce PI(3,4)P2. They function as monomers and do not associate with any regulatory subunits. Class II enzymes contain an N-terminal Ras binding domain, a lipid binding C2 domain, a PI3K homology domain of unknown function, an ATP-binding catalytic domain, a PX domain, and a second C2 domain at the C-terminus. The class II beta isoform, PI3K-C2beta, contributes to the migration and survival of cancer cells. It regulates Rac activity and impacts membrane ruffling, cell motility, and cadherin-mediated cell-cell adhesion. The PX domain is involved in targeting of proteins to PI-enriched membranes, and may also be involved in protein-protein interaction.	109
132824	cd07291	PX_SNX5	The phosphoinositide binding Phox Homology domain of Sorting Nexin 5. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. SNX5, abundantly expressed in macrophages, regulates macropinocytosis, a process that enables cells to internalize large amounts of external solutes. It may also be a component of the retromer complex, a membrane coat multimeric complex required for endosomal retrieval of lysosomal hydrolase receptors to the Golgi, acting as a mammalian equivalent of yeast Vps17p. It also binds the Fanconi anaemia complementation group A protein (FANCA). SNX5 harbors a Bin/Amphiphysin/Rvs (BAR) domain, which detects membrane curvature, C-terminal to the PX domain, similar to other sorting nexins including SNX1-2. The PX-BAR structural unit helps determine the specific membrane-targeting of some SNXs. The PX domain of SNX5 binds phosphatidylinositol-3-phosphate (PI3P) and PI(3,4)P2. SNX5 is localized to a subdomain of early endosome and is recruited to the plasma membrane following EGF stimulation and elevation of PI(3,4)P2 levels.	141
132825	cd07292	PX_SNX6	The phosphoinositide binding Phox Homology domain of Sorting Nexin 6. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. SNX6 forms a stable complex with SNX1 and may be a component of the retromer complex, a membrane coat multimeric complex required for endosomal retrieval of lysosomal hydrolase receptors to the Golgi, acting as a mammalian equivalent of yeast Vps17p. It interacts with the receptor serine/threonine kinases from the transforming growth factor-beta family. It also plays roles in enhancing the degradation of EGFR and in regulating the activity of Na,K-ATPase through its interaction with Translationally Controlled Tumor Protein (TCTP). SNX6 harbors a Bin/Amphiphysin/Rvs (BAR) domain, which detects membrane curvature, C-terminal to the PX domain, similar to other sorting nexins including SNX1-2. The PX-BAR structural unit helps determine the specific membrane-targeting of some SNXs.	141
132826	cd07293	PX_SNX3	The phosphoinositide binding Phox Homology domain of Sorting Nexin 3. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. SNX3 associates with early endosomes through a PX domain-mediated interaction with phosphatidylinositol-3-phosphate (PI3P). It associates with the retromer complex, a membrane coat multimeric complex required for endosomal retrieval of lysosomal hydrolase receptors to the Golgi, and functions as a cargo-specific adaptor for the retromer. SNX3 is required for the formation of multivesicular bodies, which function as transport intermediates to late endosomes. It also promotes cell surface expression of the amiloride-sensitive epithelial Na+ channel (ENaC), which is critical in sodium homeostasis and maintenance of extracellular fluid volume.	123
132827	cd07294	PX_SNX12	The phosphoinositide binding Phox Homology domain of Sorting Nexin 12. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. Some SNXs are localized in early endosome structures such as clathrin-coated pits, while others are located in late structures of the endocytic pathway. The specific function of SNX12 has yet to be elucidated.	132
132828	cd07295	PX_Grd19	The phosphoinositide binding Phox Homology domain of fungal Grd19. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. Grd19 is involved in the localization of late Golgi membrane proteins in yeast. Grd19 associates with the retromer complex, a membrane coat multimeric complex required for endosomal retrieval of lysosomal hydrolase receptors to the Golgi, and functions as a cargo-specific adaptor for the retromer.	116
132829	cd07296	PX_PLD1	The phosphoinositide binding Phox Homology domain of Phospholipase D1. The PX domain is a phosphoinositide binding module present in many proteins with diverse functions such as cell signaling, vesicular trafficking, protein sorting, and lipid modification, among others. Phospholipase D (PLD) catalyzes the hydrolysis of the phosphodiester bond of phosphatidylcholine to generate membrane-bound phosphatidic acid and choline. PLDs are implicated in many cellular functions like signaling, cytoskeletal reorganization, vesicular transport, stress responses, and the control of differentiation, proliferation, and survival. PLD1 contains PX and Pleckstrin Homology (PH) domains in addition to the catalytic domain. It acts as an effector of Rheb in the signaling of the mammalian target of rapamycin (mTOR), a serine/threonine protein kinase that transduces nutrients and other stimuli to regulate many cellular processes. PLD1 also regulates the secretion of the procoagulant von Willebrand factor (VWF) in endothelial cells. The PX domain is involved in targeting of proteins to PI-enriched membranes, and may also be involved in protein-protein interaction. The PX domain of PLD1 specifically binds to phosphatidylinositol-3,4,5-trisphosphate [PI(3,4,5)P3], which enables PLD1 to mediate signals via the ERK1/2 pathway.	135
132830	cd07297	PX_PLD2	The phosphoinositide binding Phox Homology domain of Phospholipase D2. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions such as cell signaling, vesicular trafficking, protein sorting, and lipid modification, among others. Phospholipase D (PLD) catalyzes the hydrolysis of the phosphodiester bond of phosphatidylcholine to generate membrane-bound phosphatidic acid and choline. PLD activity has been detected in viruses, bacteria, yeast, plants, and mammals, but the PX domain is not present in PLDs from viruses and bacteria. PLDs are implicated in many cellular functions like signaling, cytoskeletal reorganization, vesicular transport, stress responses, and the control of differentiation, proliferation, and survival. PLD2 contains PX and Pleckstrin Homology (PH) domains in addition to the catalytic domain. It mediates EGF-dependent insulin secretion and EGF-induced Ras activation by the guanine nucleotide-exchange factor Son of sevenless (Sos). It regulates mast cell activation by associating and promoting the activation of the protein tyrosine kinase Syk. PLD2 also participates in the sphingosine 1-phosphate-mediated pathway that stimulates the migration of endothelial cells, an important factor in angiogenesis. The PX domain is involved in targeting of proteins to PI-enriched membranes, and may also be involved in protein-protein interaction.	130
132831	cd07298	PX_RICS	The phosphoinositide binding Phox Homology domain of PX-RICS. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions such as cell signaling, vesicular trafficking, protein sorting, and lipid modification, among others. RICS is a Rho GTPase-activating protein for cdc42 and Rac1. It is implicated in the regulation of postsynaptic signaling and neurite outgrowth. An N-terminal splicing variant of RICS containing additional PX and Src Homology 3 (SH3) domains, also called PX-RICS, is the main isoform expressed during neural development. PX-RICS is involved in neural functions including axon and dendrite extension, postnatal remodeling, and fine-tuning of neural circuits during early brain development. The PX domain is involved in targeting of proteins to PI-enriched membranes, and may also be involved in protein-protein interaction. The PX domain of PX-RICS specifically binds phosphatidylinositol 3-phosphate (PI3P), PI4P, and PI5P.	115
132832	cd07299	PX_TCGAP	The phosphoinositide binding Phox Homology domain of Tc10/Cdc42 GTPase-activating protein. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions such as cell signaling, vesicular trafficking, protein sorting, and lipid modification, among others. TCGAP (Tc10/Cdc42 GTPase-activating protein) contains N-terminal PX and Src Homology 3 (SH3) domains, a central Rho GAP domain, and C-terminal proline-rich regions. It is widely expressed in the brain where it is involved in regulating the outgrowth of axons and dendrites and is regulated by the protein tyrosine kinase Fyn. It interacts with cdc42 and TC10beta through its GAP domain and with phosphatidylinositol-(4,5)-bisphosphate [PI(4,5)P2] through its PX domain. It is translocated to the plasma membrane in adipocytes in response to insulin and may be involved in the regulation of insulin-stimulated glucose transport. TCGAP has also been named sorting nexin 26 (SNX26). SNXs make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. It is unknown whether TCGAP also functions as a SNX.	113
132833	cd07300	PX_SNX20	The phosphoinositide binding Phox Homology domain of Sorting Nexin 20. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. Some SNXs are localized in early endosome structures such as clathrin-coated pits, while others are located in late structures of the endocytic pathway. SNX20 interacts with P-Selectin glycoprotein ligand-1 (PSGL-1), a surface-expressed mucin that acts as a ligand for the selectin family of adhesion proteins. The PX domain of SNX20 binds PIs and targets the SNX20/PSGL-1 complex to endosomes. SNX20 may function in the sorting and cycling of PSGL-1 into endosomes.	114
132834	cd07301	PX_SNX21	The phosphoinositide binding Phox Homology domain of Sorting Nexin 21. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. Some SNXs are localized in early endosome structures such as clathrin-coated pits, while others are located in late structures of the endocytic pathway. SNX21, also called SNX-L, is distinctly and highly-expressed in fetal liver and may be involved in protein sorting and degradation during embryonic liver development.	112
143636	cd07302	CHD	cyclase homology domain. Catalytic domains of the mononucleotidyl cyclases (MNC's), also called cyclase homology domains (CHDs), are part of the class III nucleotidyl cyclases. This class includes eukaryotic and prokaryotic adenylate cyclases (AC's) and guanylate cyclases (GC's). They seem to share a common catalytic mechanism in their requirement for two magnesium ions to bind the polyphosphate moiety of the nucleotide.	177
132765	cd07303	Porin3	Eukaryotic porin family that forms channels in the mitochondrial outer membrane. The porin family 3 contains two sub-families that play vital roles in the mitochondrial outer membrane, a translocase for unfolded pre-proteins (Tom40) and the voltage-dependent anion channel (VDAC) that regulates the flux of mostly anionic metabolites through the outer mitochondrial membrane.	274
143612	cd07304	Chorismate_synthase	Chorismate synthase, the enzyme catalyzing the final step of the shikimate pathway. Chorismate synthase (CS; 5-enolpyruvylshikimate-3-phosphate phospholyase; 1-carboxyvinyl-3-phosphoshikimate phosphate-lyase; E.C. 4.2.3.5) catalyzes the seventh and final step in the shikimate pathway: the conversion of 5-enolpyruvylshikimate-3-phosphate (EPSP) to chorismate, a precursor for the biosynthesis of aromatic compounds. This process has an absolute requirement for reduced FMN as a co-factor which is thought to facilitate cleavage of C-O bonds by transiently donating an electron to the substrate, with no overall change in its redox state. Depending on the capacity of these enzymes to regenerate the reduced form of FMN, chorismate synthases are divided into two classes: Enzymes, mostly from plants and eubacteria, that sequester CS from the cellular environment, are monofunctional, while those that can generate reduced FMN at the expense of NADPH, such as found in fungi and the ciliated protozoan Euglena gracilis, are bifunctional, having an additional NADPH:FMN oxidoreductase activity. Recently, bifunctionality of the Mycobacterium tuberculosis enzyme (MtCS) was determined by measurements of both chorismate synthase and NADH:FMN oxidoreductase activities. Since shikimate pathway enzymes are present in bacteria, fungi and apicomplexan parasites (such as Toxoplasma gondii, Plasmodium falciparum, and Cryptosporidium parvum) but absent in mammals, they are potentially attractive targets for the development of new therapy against infectious diseases such as tuberculosis (TB).	344
132766	cd07305	Porin3_Tom40	Translocase of outer mitochondrial membrane 40 (Tom40). Tom40 forms a channel in the mitochondrial outer membrane with a pore about 1.5 to 2.5 nanometers wide. It functions as a transport channel for unfolded protein chains and forms a complex with Tom5, Tom6, Tom7, and Tom22. The primary receptors Tom20 and Tom70 recruit the unfolded precursor protein from the mitochondrial-import stimulating factor (MSF) or cytosolic Hsc70. The precursor passes through the Tom40 channel and through another channel in the inner membrane, formed by Tim23, to be finally translocated into the mitochondrial matrix. The process depends on a proton motive force across the inner membrane and requires a contact site where the outer and inner membranes come close. Tom40 is also involved in inserting outer membrane proteins into the membrane, most likely not via a lateral opening in the pore, but by transferring precursor proteins to an outer membrane sorting and assembly machinery.	279
132767	cd07306	Porin3_VDAC	Voltage-dependent anion channel of the outer mitochondrial membrane. The voltage-dependent anion channel (VDAC) regulates the flux of mostly anionic metabolites through the outer mitochondrial membrane, which is highly permeable to small molecules. VDAC is the most abundant protein in the outer membrane, and membrane potentials can toggle VDAC between open or high-conducting and closed or low-conducting forms. VDAC binds to and is regulated in part by hexokinase, an interaction that renders mitochondria less susceptible to pro-apoptotic signals, most likely by interfering with VDAC's capability to respond to Bcl-2 family proteins. While VDAC appears to play a key role in mitochondrially induced cell death, a proposed involvement in forming the mitochondrial permeability transition pore, which is characteristic for damaged mitochondria and apoptosis, has been challenged by more recent studies.	276
153271	cd07307	BAR	The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fer tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively.	194
173892	cd07308	lectin_leg-like	legume-like lectins: ERGIC-53, ERGL, VIP36, VIPL, EMP46, and EMP47. The legume-like (leg-like) lectins are eukaryotic intracellular sugar transport proteins with a carbohydrate recognition domain similar to that of the legume lectins.  This domain binds high-mannose-type oligosaccharides for transport from the endoplasmic reticulum to the Golgi complex.  These leg-like lectins include ERGIC-53, ERGL, VIP36, VIPL, EMP46, EMP47, and the UIP5 (ULP1-interacting protein 5) precursor protein.  Leg-like lectins have different intracellular distributions and dynamics in the endoplasmic reticulum-Golgi system of the secretory pathway and interact with N-glycans of glycoproteins in a calcium-dependent manner, suggesting a role in glycoprotein sorting and trafficking.  L-type lectins have a dome-shaped beta-barrel carbohydrate recognition domain with a curved seven-stranded beta-sheet referred to as the "front face" and a flat six-stranded beta-sheet referred to as the "back face".  This domain homodimerizes so that adjacent back sheets form a contiguous 12-stranded sheet and homotetramers occur by a back-to-back association of these homodimers.  Though L-type lectins exhibit both sequence and structural similarity to one another, their carbohydrate binding specificities differ widely.	218
213985	cd07309	PHP	Polymerase and Histidinol Phosphatase domain. The PHP (also called histidinol phosphatase-2/HIS2) domain is associated with several types of DNA polymerases, such as PolIIIA and family X DNA polymerases, stand alone histidinol phosphate phosphatases (HisPPases), and a number of uncharacterized protein families. The PHP domain has four conserved sequence motifs and contains an invariant histidine that is involved in metal ion coordination. PHP in polymerases has trinuclear zinc/magnesium dependent proofreading activity. It has also been shown that the PHP domain functions in DNA repair. The PHP structures have a distorted (beta/alpha)7 barrel fold with a trinuclear metal site on the C-terminal side of the barrel.	88
143583	cd07311	terB_like_1	tellurium resistance terB-like protein, subgroup 1. This family includes several uncharacterized bacterial proteins. The prototype of this CD is tellurite resistance protein from Nostoc punctiforme that belongs to COG3793. Its precise biological function and the mechanism responsible for tellurium resistance remain poorly understood.	150
143584	cd07313	terB_like_2	tellurium resistance terB-like protein, subgroup 2. This family includes several uncharacterized bacterial proteins. Protein sequence homology analysis shows they are similar to tellurium resistance protein terB, but the function of this family is unknown.	104
143585	cd07316	terB_like_DjlA	N-terminal tellurium resistance protein terB-like domain of heat shock DnaJ-like proteins. Tellurium resistance terB-like domain of the DnaJ-like DjlA proteins. This family represents the terB-like domain of DjlA-like proteins, a subgroup of heat shock DnaJ-like proteins.  Escherichia coli DjlA is a type III membrane protein with a small N-terminal transmembrane region and DnaJ-like domain on the extreme C-terminus.  Overproduction has been shown to activate the RcsC pathway, which regulates the production of the capsular exopolysaccharide colanic acid.  The specific function of this domain is unknown.	106
153371	cd07320	Extradiol_Dioxygenase_3B_like	Subunit B of Class III Extradiol ring-cleavage dioxygenases. Dioxygenases catalyze the incorporation of both atoms of molecular oxygen into substrates using a variety of reaction mechanisms, resulting in the cleavage of aromatic rings. Two major groups of dioxygenases have been identified according to the cleavage site of the aromatic ring. Intradiol enzymes cleave the aromatic ring between two hydroxyl groups, whereas extradiol enzymes cleave the aromatic ring between a hydroxylated carbon and an adjacent non-hydroxylated carbon. Extradiol dioxygenases can be further divided into three classes. Class I and II enzymes are evolutionarily related and show sequence similarity, with the two-domain class II enzymes evolving from the class I enzyme through gene duplication. Class III enzymes are different in sequence and structure and usually have two subunits, designated A and B. This model represents the catalytic subunit B of extradiol dioxygenase class III enzymes. Enzymes belonging to this family include Protocatechuate 4,5-dioxygenase (LigAB), 2'-aminobiphenyl-2,3-diol 1,2-dioxygenase (CarB), 4,5-DOPA Dioxygenase, 2,3-dihydroxyphenylpropionate 1,2-dioxygenase, and 3,4-dihydroxyphenylacetate (homoprotocatechuate) 2,3-dioxygenase (HPCD). There are also some family members that do not show the typical dioxygenase activity.	260
153390	cd07321	Extradiol_Dioxygenase_3A_like	Subunit A of Class III extradiol dioxygenases. Extradiol dioxygenases catalyze the incorporation of both atoms of molecular oxygen into substrates using a variety of reaction mechanisms, resulting in the cleavage of aromatic rings.  There are two major groups of dioxygenases according to the cleavage site of the aromatic ring. Intradiol enzymes cleave the aromatic ring between two hydroxyl groups, whereas extradiol enzymes cleave the aromatic ring between a hydroxylated carbon and an adjacent non-hydroxylated carbon. Extradiol dioxygenases can be divided into three classes. Class I and II enzymes are evolutionarily related and show sequence similarity, with the two-domain class II enzymes evolving from the class I enzyme through gene duplication. Class III enzymes are different in sequence and structure and usually have two subunits, designated A and B, which form a tetramer composed of two copies of each subunit. This model represents subunit A of class III extradiol dioxygenase enzymes. The A subunit is the smaller, non-catalytic subunit. Enzymes that belong to this family include Protocatechuate 4,5-dioxygenase (LigAB) A subunit, 2'-aminobiphenyl-2,3-diol 1,2-dioxygenase (CarB) A subunit, Gallate Dioxygenase and proteins of unknown function.	77
143474	cd07322	PriL_PriS_Eukaryotic	Eukaryotic core primase: Large subunit, PriL. Primases synthesize the RNA primers required for DNA replication. Primases are grouped into two classes, bacteria/bacteriophage and archaeal/eukaryotic. The proteins in the two classes differ in structure and the replication apparatus components. Archaeal/eukaryotic core primase is a heterodimeric enzyme consisting of a small catalytic subunit (PriS) and a large subunit (PriL). In eukaryotic organisms, a heterotetrameric enzyme formed by DNA polymerase alpha, the B subunit and two primase subunits has primase activity. Although the catalytic activity resides within PriS, the PriL subunit is essential for primase function as disruption of the PriL gene in yeast is lethal. PriL is composed of two structural domains. Several functions have been proposed for PriL such as stabilization of the PriS, involvement in synthesis initiation, improvement of primase processivity, determination of product size and transfer of the products to DNA polymerase alpha.	390
153396	cd07323	LAM	LA motif RNA-binding domain. This domain is found at the N-terminus of La RNA-binding proteins as well as in other related proteins. Typically, the domain co-occurs with an RNA-recognition motif (RRM), and together these domains function to bind primary transcripts of RNA polymerase III in the La autoantigen (Lupus La protein, LARP3, or Sjoegren syndrome type B antigen, SS-B). A variety of La-related proteins (LARPs or La ribonucleoproteins), with differing domain architecture, appear to function as RNA-binding proteins in eukaryotic cellular processes.	75
320683	cd07324	M48C_Oma1-like	Oma1 peptidase-like, integral membrane metallopeptidase. This family contains peptidase M48 subfamily C (also known as Oma1 peptidase or mitochondrial metalloendopeptidase OMA1), including similar peptidases containing tetratricopeptide (TPR) repeats, as well as uncharacterized proteins such as E. coli bepA (formerly yfgC), ycaL and loiP (formerly yggG), considered to be putative metallopeptidases. Oma1 peptidase is part of the quality control system in the inner membrane of mitochondria, with its catalytic site facing the matrix space. It cleaves and thereby promotes the turnover of mistranslated or misfolded membrane proteins. Oma1 can cleave the misfolded multi-pass membrane protein Oxa1, thus exerting a function similar to the ATP-dependent m-AAA protease for quality control of inner membrane proteins. It has been proposed that in the absence of m-AAA protease, proteolysis of Oxa1 is mediated by Oma1 in an ATP-independent manner. Homologs of Oma1 are present in higher eukaryotes, eubacteria and archaebacteria, suggesting that Oma1 is the founding member of a conserved family of membrane-embedded metallopeptidases, all containing the zinc metalloprotease motif (HEXXH). M48 peptidases proteolytically remove the C-terminal three residues of farnesylated proteins.	142
320684	cd07325	M48_Ste24p_like	M48 Ste24 endopeptidase-like, integral membrane metallopeptidase. This family contains peptidase M48 family Ste24p-like proteins that are as yet uncharacterized, but probably function as intracellular, membrane-associated zinc metalloproteases; they all contain the HEXXH Zn-binding motif, which is critical for Ste24p activity. They likely remove the C-terminal three residues of farnesylated proteins proteolytically and are possibly associated with the endoplasmic reticulum and golgi. Some members also contain ankyrin domains which occur in very diverse families of proteins and mediate protein-protein interactions.	199
320685	cd07326	M56_BlaR1_MecR1_like	Peptidase M56-like including those in BlaR1 and MecR1, integral membrane metallopeptidase. This family contains peptidase M56, which includes zinc metalloprotease domain in MecR1 as well as BlaR1. MecR1 is a transmembrane beta-lactam sensor/signal transducer protein that regulates the expression of an altered penicillin-binding protein PBP2a, which resists inactivation by beta-lactam antibiotics, in methicillin-resistant Staphylococcus aureus (MRSA). BlaR1 regulates the inducible expression of a class A beta-lactamase that hydrolytically destroys certain beta-lactam antibiotics in MRSA. Both MecR1 and BlaR1 are transmembrane proteins that consist of four transmembrane helices, a cytoplasmic zinc protease domain, and the soluble C-terminal extracellular sensor domain, and are highly similar in sequence and function. The signal for protein expression is transmitted by site-specific proteolytic cleavage of both the transducer, which auto-activates, and the repressor, which is inactivated, unblocking gene transcription. All members contain the zinc metalloprotease motif (HEXXH). Homologs of this peptidase domain are also found in a number of other bacterial genome sequences, most of which are as yet uncharacterized.	165
320686	cd07327	M48B_HtpX_like	HtpX-like membrane-bound metallopeptidase. This family contains peptidase M48 subfamily B, also known as HtpX, which consists of proteins smaller than Ste24p, with homology restricted to the C-terminal half of Ste24p. HtpX, an integral membrane (IM) metallopeptidase, is widespread in bacteria and archaea, and plays a central role in protein quality control by preventing the accumulation of misfolded proteins in the membrane. Its expression is controlled by the Cpx stress response system, which senses abnormal membrane proteins. HtpX participates in the proteolytic quality control of these misfolded proteins by undergoing self-degradation and eliminating them by collaborating with FtsH, a membrane-bound and ATP-dependent protease. HtpX contains the zinc binding motif (HEXXH), has an FtsH-like topology, and is capable of introducing endoproteolytic cleavages into SecY (also an FtsH substrate). However, HtpX does not have an ATPase activity and will only act against cytoplasmic regions of a target membrane protein. Thus, HtpX and FtsH have overlapping and/or complementary functions, which are especially important at high temperature; in E. coli and Xylella fastidiosa, HtpX is heat-inducible, while in Streptococcus gordonii it is not. Mutation of the HtpX-like M48 metalloprotease from Leptospira interrogans (LA4131) has been shown to result in altered expression of a subset of metal toxicity and stress response genes.	183
320687	cd07328	M48_Ste24p_like	M48 Ste24 endopeptidase-like, integral membrane metallopeptidase. This family contains peptidase M48-like proteins that are as yet uncharacterized, but probably function as intracellular, membrane-associated zinc metalloproteases; they all contain the HEXXH Zn-binding motif, which is critical for Ste24p activity. They likely remove the C-terminal three residues of farnesylated proteins proteolytically and are possibly associated with the endoplasmic reticulum and golgi.	160
320688	cd07329	M56_like	Peptidase M56-like, integral membrane metallopeptidase in bacteria. This family contains peptidase M56, which includes zinc metalloprotease domain in MecR1 as well as BlaR1. MecR1 is a transmembrane beta-lactam sensor/signal transducer protein that regulates the expression of an altered penicillin-binding protein PBP2a, which resists inactivation by beta-lactam antibiotics, in methicillin-resistant Staphylococcus aureus (MRSA). BlaR1 regulates the inducible expression of a class A beta-lactamase that hydrolytically destroys certain beta-lactam antibiotics in MRSA. Both MecR1 and BlaR1 are transmembrane proteins that consist of four transmembrane helices, a cytoplasmic zinc protease domain, and the soluble C-terminal extracellular sensor domain, and are highly similar in sequence and function. The signal for protein expression is transmitted by site-specific proteolytic cleavage of both the transducer, which auto-activates, and the repressor, which is inactivated, unblocking gene transcription. All members contain the zinc metalloprotease motif (HEXXH). Homologs of this peptidase domain are also found in a number of other bacterial genome sequences, most of which are as yet uncharacterized.	188
320689	cd07330	M48A_Ste24p	Peptidase M48 CaaX prenyl protease type 1, an integral membrane, Zn-dependent protein. This M48 CaaX prenyl protease 1-like family includes a number of well-characterized genes such as those found in Taenia solium metacestode (TsSte24p), Arabidopsis (AtSte24), yeast Ste24p and human (Hs Ste24p) as well as several uncharacterized genes such as YhfN, some of which also contain tetratricopeptide (TPR) repeats. All members of this family contain the zinc metalloprotease motif (HEXXH), likely exposed on the cytoplasmic side. They are thought to be intimately associated with the endoplasmic reticulum (ER), regardless of whether their genes possess the conventional signal motif (KKXX) at the C-terminus. Proteins in this family proteolytically remove the C-terminal three residues of farnesylated proteins. The gene ZmpSte24, also known as FACE-1 in humans, a member of this family, is involved in the post-translational processing of prelamin A to mature lamin A, a major component of the nuclear envelope. ZmpSte24 deficiency causes an accumulation of prelamin A leading to lipodystrophy and other disease phenotypes while mutations in the protein lead to diseases of lamin processing (laminopathies), such as the premature aging disease progeria and metabolic disorders. Some of these mutations map to the peptide-binding site.	285
320690	cd07331	M48C_Oma1_like	Peptidase M48C, integral membrane endopeptidase. This subfamily contains peptidase M48C Oma1 (also called mitochondrial metalloendopeptidase OMA1) protease homologs that are mostly uncharacterized. Oma1 is part of the quality control system in the inner membrane of mitochondria, with its catalytic site facing the matrix space. It cleaves and thereby promotes the turnover of mistranslated or misfolded membrane proteins. Oma1 can cleave the misfolded multi-pass membrane protein Oxa1, thus exerting a function similar to the ATP-dependent m-AAA protease for quality control of inner membrane proteins; it cleaves a misfolded polytopic membrane protein at multiple sites. It has been proposed that in the absence of m-AAA protease, proteolysis of Oxa1 is mediated by Oma1 in an ATP-independent manner. Oma1 is part of highly conserved mitochondrial metallopeptidases, with homologs present in higher eukaryotes, eubacteria and archaebacteria, all containing the zinc binding motif (HEXXH). It forms a high molecular mass complex in the inner membrane, possibly a homo-hexamer.	187
320691	cd07332	M48C_Oma1_like	Peptidase M48C Ste24p, integral membrane endopeptidase. This subfamily contains peptidase M48C Oma1 (also called mitochondrial metalloendopeptidase OMA1) protease homologs that are mostly uncharacterized. Oma1 is part of the quality control system in the inner membrane of mitochondria, with its catalytic site facing the matrix space. It cleaves and thereby promotes the turnover of mistranslated or misfolded membrane proteins. Oma1 can cleave the misfolded multi-pass membrane protein Oxa1, thus exerting a function similar to the ATP-dependent m-AAA protease for quality control of inner membrane proteins; it cleaves a misfolded polytopic membrane protein at multiple sites. It has been proposed that in the absence of m-AAA protease, proteolysis of Oxa1 is mediated by Oma1 in an ATP-independent manner. Oma1 is part of highly conserved mitochondrial metallopeptidases, with homologs present in higher eukaryotes, eubacteria and archaebacteria, all containing the zinc binding motif (HEXXH). It forms a high molecular mass complex in the inner membrane, possibly a homo-hexamer.	222
320692	cd07333	M48C_bepA_like	Peptidase M48C Ste24p bepA-like, integral membrane protein. This family contains peptidase M48C Ste24p protease bepA (formerly yfgC)-like proteins considered to be putative metallopeptidases, containing a zinc-binding motif, HEXXH, and a COOH-terminal ER retrieval signal (KKXX). They proteolytically remove the C-terminal three residues of farnesylated proteins. They are integral membrane proteins associated with the endoplasmic reticulum and golgi, binding one zinc ion per subunit.  In eukaryotes, Ste24p is required for the first NH2-terminal proteolytic processing event within the a-factor precursor, which takes place after COOH-terminal CAAX modification (C is cysteine; A is usually aliphatic; X is one of several amino acids) is complete. Mutation studies have shown that the HEXXH protease motif, which is extracellular but adjacent to a transmembrane domain and therefore close to the membrane surface, is critical for Ste24p activity.  Several members of this family also contain tetratricopeptide (TPR) repeat motifs, which are involved in a variety of functions including protein-protein interactions. BepA has been shown to possess protease activity and is responsible for the degradation of incorrectly folded LptD, an essential outer-membrane protein (OMP) involved in OM transport and assembly of lipopolysaccharide. Overexpression of the bepA protease causes abnormal biofilm architecture.	174
320693	cd07334	M48C_loiP_like	Peptidase M48C Ste24p loiP-like, integral membrane protein. This subfamily contains peptidase M48 Ste24p protease loiP (formerly yggG)-like proteins, mostly uncharacterized, that include E. coli loiP and ycaLG and are considered to be putative metallopeptidases, containing a zinc-binding motif, HEXXH, and a COOH-terminal ER retrieval signal (KKXX). They proteolytically remove the C-terminal three residues of farnesylated proteins. They are integral membrane proteins associated with the endoplasmic reticulum and golgi, binding one zinc ion per subunit. In eukaryotes, Ste24p is required for the first NH2-terminal proteolytic processing event within the a-factor precursor, which takes place after COOH-terminal CAAX modification (C is cysteine; A is usually aliphatic; X is one of several amino acids) is complete. Mutation studies have shown that the HEXXH protease motif, which is extracellular but adjacent to a transmembrane domain and therefore close to the membrane surface, is critical for Ste24p activity. LoiP has been shown to be a metallopeptidase that cleaves its targets preferentially between Phe-Phe residues. It is upregulated when bacteria are subjected to media of low osmolarity, thus yggG was named LoiP (low osmolarity induced protease). Proper membrane localization of LoiP may depend on YfgC, another putative metalloprotease in this subfamily.	215
320694	cd07335	M48B_HtpX_like	Peptidase M48 subfamily B HtpX-like membrane-bound metallopeptidase. This family contains peptidase M48 subfamily B, also known as HtpX, which consists of proteins smaller than Ste24p, with homology restricted to the C-terminal half of Ste24p. HtpX, an integral membrane (IM) metallopeptidase, is widespread in bacteria and archaea, and plays a central role in protein quality control by preventing the accumulation of misfolded proteins in the membrane. Its expression is controlled by the Cpx stress response system, which senses abnormal membrane proteins. HtpX participates in the proteolytic quality control of these misfolded proteins by undergoing self-degradation and eliminating them by collaborating with FtsH, a membrane-bound and ATP-dependent protease. HtpX contains the zinc binding motif (HEXXH), has an FtsH-like topology, and is capable of introducing endoproteolytic cleavages into SecY (also an FtsH substrate). However, HtpX does not have an ATPase activity and will only act against cytoplasmic regions of a target membrane protein. Thus, HtpX and FtsH have overlapping and/or complementary functions, which are especially important at high temperature; in E. coli and Xylella fastidiosa, HtpX is heat-inducible, while in Streptococcus gordonii it is not. Mutation of the HtpX-like M48 metalloprotease from Leptospira interrogans (LA4131) has been shown to result in altered expression of a subset of metal toxicity and stress response genes.	240
320695	cd07336	M48B_HtpX_like	Peptidase M48 subfamily B HtpX-like membrane-bound metallopeptidase. This HtpX family of peptidase M48 subfamily B includes uncharacterized HtpX homologs and consists of proteins smaller than Ste24p, with homology restricted to the C-terminal half of Ste24p. HtpX expression is controlled by the Cpx stress response system, which senses abnormal membrane proteins. HtpX participates in the proteolytic quality control of these misfolded proteins by undergoing self-degradation and collaborating with FtsH, a membrane-bound and ATP-dependent protease, to eliminate them. HtpX, a zinc metalloprotease with an active site motif HEXXH, has an FtsH-like topology, and is capable of introducing endoproteolytic cleavages into SecY (also an FtsH substrate). However, HtpX does not have an ATPase activity and will only act against cytoplasmic regions of a target membrane protein. Thus, HtpX and FtsH have overlapping and/or complementary functions, which are especially important at high temperature; in E. coli and Xylella fastidiosa, HtpX is heat-inducible, while in Streptococcus gordonii it is not.	266
320696	cd07337	M48B_HtpX_like	Peptidase M48 subfamily B HtpX-like membrane-bound metallopeptidase. This HtpX family of peptidase M48 subfamily B includes uncharacterized HtpX homologs and consists of proteins smaller than Ste24p, with homology restricted to the C-terminal half of Ste24p. HtpX expression is controlled by the Cpx stress response system, which senses abnormal membrane proteins. HtpX participates in the proteolytic quality control of these misfolded proteins by undergoing self-degradation and collaborating with FtsH, a membrane-bound and ATP-dependent protease, to eliminate them. HtpX, a zinc metalloprotease with an active site motif HEXXH, has an FtsH-like topology, and is capable of introducing endoproteolytic cleavages into SecY (also an FtsH substrate). However, HtpX does not have an ATPase activity and will only act against cytoplasmic regions of a target membrane protein. Thus, HtpX and FtsH have overlapping and/or complementary functions, which are especially important at high temperature; in E. coli and Xylella fastidiosa, HtpX is heat-inducible, while in Streptococcus gordonii it is not.	203
320697	cd07338	M48B_HtpX_like	Peptidase M48 subfamily B HtpX-like membrane-bound metallopeptidase. This HtpX family of peptidase M48 subfamily B includes uncharacterized HtpX homologs and consists of proteins smaller than Ste24p, with homology restricted to the C-terminal half of Ste24p. HtpX expression is controlled by the Cpx stress response system, which senses abnormal membrane proteins. HtpX participates in the proteolytic quality control of these misfolded proteins by undergoing self-degradation and collaborating with FtsH, a membrane-bound and ATP-dependent protease, to eliminate them. HtpX, a zinc metalloprotease with an active site motif HEXXH, has an FtsH-like topology, and is capable of introducing endoproteolytic cleavages into SecY (also an FtsH substrate). However, HtpX does not have an ATPase activity and will only act against cytoplasmic regions of a target membrane protein. Thus, HtpX and FtsH have overlapping and/or complementary functions, which are especially important at high temperature; in E. coli and Xylella fastidiosa, HtpX is heat-inducible, while in Streptococcus gordonii it is not.	216
320698	cd07339	M48B_HtpX_like	Peptidase M48 subfamily B HtpX-like membrane-bound metallopeptidase. This HtpX family of peptidase M48 subfamily B includes uncharacterized HtpX homologs and consists of proteins smaller than Ste24p, with homology restricted to the C-terminal half of Ste24p. HtpX expression is controlled by the Cpx stress response system, which senses abnormal membrane proteins. HtpX participates in the proteolytic quality control of these misfolded proteins by undergoing self-degradation and collaborating with FtsH, a membrane-bound and ATP-dependent protease, to eliminate them. HtpX, a zinc metalloprotease with an active site motif HEXXH, has an FtsH-like topology, and is capable of introducing endoproteolytic cleavages into SecY (also an FtsH substrate). However, HtpX does not have an ATPase activity and will only act against cytoplasmic regions of a target membrane protein. Thus, HtpX and FtsH have overlapping and/or complementary functions, which are especially important at high temperature; in E. coli and Xylella fastidiosa, HtpX is heat-inducible, while in Streptococcus gordonii it is not.	229
320699	cd07340	M48B_Htpx_like	Peptidase M48 subfamily B HtpX-like membrane-bound metallopeptidase. This HtpX family of peptidase M48 subfamily B includes uncharacterized HtpX homologs and consists of proteins smaller than Ste24p, with homology restricted to the C-terminal half of Ste24p. HtpX expression is controlled by the Cpx stress response system, which senses abnormal membrane proteins. HtpX participates in the proteolytic quality control of these misfolded proteins by undergoing self-degradation and collaborating with FtsH, a membrane-bound and ATP-dependent protease, to eliminate them. HtpX, a zinc metalloprotease with an active site motif HEXXH, has an FtsH-like topology, and is capable of introducing endoproteolytic cleavages into SecY (also an FtsH substrate). However, HtpX does not have an ATPase activity and will only act against cytoplasmic regions of a target membrane protein. Thus, HtpX and FtsH have overlapping and/or complementary functions, which are especially important at high temperature; in E. coli and Xylella fastidiosa, HtpX is heat-inducible, while in Streptococcus gordonii it is not.	246
320700	cd07341	M56_BlaR1_MecR1_like	Peptidase M56-like including those in BlaR1 and MecR1, integral membrane metallopeptidase. This family contains peptidase M56, which includes zinc metalloprotease domain in MecR1 as well as BlaR1. MecR1 is a transmembrane beta-lactam sensor/signal transducer protein that regulates the expression of an altered penicillin-binding protein PBP2a, which resists inactivation by beta-lactam antibiotics, in methicillin-resistant Staphylococcus aureus (MRSA). BlaR1 regulates the inducible expression of a class A beta-lactamase that hydrolytically destroys certain beta-lactam antibiotics in MRSA. Both MecR1 and BlaR1 are transmembrane proteins that consist of four transmembrane helices, a cytoplasmic zinc protease domain, and the soluble C-terminal extracellular sensor domain, and are highly similar in sequence and function. The signal for protein expression is transmitted by site-specific proteolytic cleavage of both the transducer, which auto-activates, and the repressor, which is inactivated, unblocking gene transcription. All members contain the zinc metalloprotease motif (HEXXH). Homologs of this peptidase domain are also found in a number of other bacterial genome sequences, most of which are as yet uncharacterized.	187
320701	cd07342	M48C_Oma1_like	M48C peptidase, integral membrane endopeptidase. This subfamily contains peptidase M48C Oma1 (also called mitochondrial metalloendopeptidase OMA1) protease homologs that are mostly uncharacterized. Oma1 is part of the quality control system in the inner membrane of mitochondria, with its catalytic site facing the matrix space. It cleaves and thereby promotes the turnover of mistranslated or misfolded membrane proteins. Oma1 can cleave the misfolded multi-pass membrane protein Oxa1, thus exerting a function similar to the ATP-dependent m-AAA protease for quality control of inner membrane proteins; it cleaves a misfolded polytopic membrane protein at multiple sites. It has been proposed that in the absence of m-AAA protease, proteolysis of Oxa1 is mediated by Oma1 in an ATP-independent manner. Oma1 is part of highly conserved mitochondrial metallopeptidases, with homologs present in higher eukaryotes, eubacteria and archaebacteria, all containing the zinc binding motif (HEXXH). It forms a high molecular mass complex in the inner membrane, possibly a homo-hexamer.	158
320702	cd07343	M48A_Zmpste24p_like	Peptidase M48 subfamily A, a type 1 CaaX endopeptidase. This family contains peptidase family M48 subfamily A which includes a number of well-characterized genes such as those found in humans (ZMPSTE24, also known as farnesylated protein-converting enzyme 1 or FACE-1 or Hs Ste24), Taenia solium metacestode (TsSte24p), Arabidopsis (AtSte24) and yeast (Ste24p). Ste24p contains the zinc metalloprotease motif (HEXXH), likely exposed on the cytoplasmic side. It is thought to be intimately associated with the endoplasmic reticulum (ER), regardless of whether its genes possess the conventional signal motif (KKXX) in the C-terminal. Proteins in this family proteolytically remove the C-terminal three residues of farnesylated proteins. Ste24p is involved in the post-translational processing of prelamin A to mature lamin A, a major component of the nuclear envelope. ZmpSte24 deficiency causes an accumulation of prelamin A leading to lipodystrophy and other disease phenotypes, while mutations in this gene or in that encoding its substrate, prelamin A, result in a series of human inherited diseases known as laminopathies, the most severe of which are Hutchinson Gilford progeria syndrome (HGPS) and restrictive dermopathy (RD) which arise due to unsuccessful maturation of prelamin A. Two forms of mandibuloacral dysplasia, a condition that causes a variety of abnormalities involving bone development, skin pigmentation, and fat distribution, are caused by mutations in two different genes; mutations in the LMNA gene, which normally provides instructions for making lamin A and lamin C, cause mandibuloacral dysplasia with A-type lipodystrophy (MAD-A), and mutations in the ZMPSTE24 gene cause mandibuloacral dysplasia with B-type lipodystrophy (MAD-B). Within cells, these genes are involved in maintaining the structure of the nucleus and may play a role in many cellular processes. Certain HIV protease inhibitors have been shown to inhibit the enzymatic activity of ZMPSTE24, but not enzymes involved in prelamin A processing.	405
320703	cd07344	M48_yhfN_like	Peptidase M48 YhfN-like, a novel minigluzincin. M48 YhfN-like protease is considered as a CaaX prenyl protease 1 homolog, with most of the sequences in this family as yet uncharacterized. It contains the zinc metalloprotease motif (HEXXH), likely exposed on the cytoplasmic side. It is probably associated with the endoplasmic reticulum (ER), regardless of whether its genes possess the conventional signal motif (KKXX) in the C-terminal. Proteins in this family proteolytically remove the C-terminal three residues of farnesylated proteins. This novel family of related proteins consist of the soluble minimal scaffold similar to the catalytic domains of the integral-membrane metallopeptidase M48 and M56, thus called minigluzincins.	96
320704	cd07345	M48A_Ste24p-like	Peptidase M48 subfamily A-like, putative CaaX prenyl protease. This family contains peptidase family M48 subfamily A-like CaaX prenyl protease 1, most of which are uncharacterized. Some of these contain tetratricopeptide (TPR) repeats at the C-terminus. Proteins in this family contain the zinc metalloprotease motif (HEXXH), likely exposed on the cytoplasmic side. They are thought to be possibly associated with the endoplasmic reticulum (ER), regardless of whether their genes possess the conventional signal motif (KKXX) in the C-terminal. These proteins putatively remove the C-terminal three residues of farnesylated proteins proteolytically.	346
349983	cd07346	ABC_6TM_exporters	Six-transmembrane helical domain of the ATP-binding cassette transporters. This family represents a subunit of six transmembrane (TM) helices typically found in the ATP-binding cassette (ABC) transporters that function as exporters, which contain 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds and various types of lipids. In addition to ABC exporters, ABC transporters include two classes of ABC importers, classified depending on details of their architecture and mechanism. Only the ABC exporters are included in this family. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied among the different types of transporters, suggesting chemical diversity of the translocated substrates, whereas NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane. However, some ABC genes are organized as half-transporters, which must form either homodimers or heterodimers to form a functional unit. The ABC exporters play a role in multidrug resistance to antibiotics and anticancer agents, and mutations in these proteins have been shown to cause severe human diseases such as cystic fibrosis.	292
259818	cd07347	harmonin_N_like	N-terminal protein-binding module of harmonin and similar domains, also known as HHD (harmonin homology domain). This domain is found in harmonin, and similar proteins such as delphilin, and whirlin. These are postsynaptic density-95/discs-large/ZO-1 (PDZ) domain-containing scaffold proteins. Harmonin and whirlin are organizers of the Usher protein network of the inner ear and the retina, delphilin is found at the cerebellar parallel fiber-Purkinje cell synapses. This domain is also found in CCM2 (also called malcavernin; C7orf22/chromosome 7 open reading frame 22; OSM). CCM2 along with CCM1 and CCM3 constitutes a set of proteins which when mutated are responsible for cerebral cavernous malformations, an autosomal dominant neurovascular disease characterized by cerebral hemorrhages and vascular malformations in the central nervous system. CCM2 plays many functional roles. CCM2 functions as a scaffold involved in small GTPase Rac-dependent p38 mitogen-activated protein kinase (MAPK) activation when the cell is under hyperosmotic stress. It associates with CCM1 in the signaling cascades that regulate vascular integrity and participates in HEG1 (the transmembrane receptor heart of glass 1) mediated endothelial cell junctions. CCM proteins also inhibit the activation of small GTPase RhoA and its downstream effector Rho kinase (ROCK) to limit vascular permeability. CCM2 mediates TrkA-dependent cell death via its N-terminal PTB domain in pediatric neuroblastic tumours; the C-terminal domain of malcavernin represented here has also been referred to as the Karet domain. Harmonin contains a single copy of this domain at its N-terminus which binds specifically to a short internal peptide fragment of the cadherin 23 cytoplasmic domain (a component of the Usher protein network). Whirlin contains two copies of this domain; the first of these has been assayed for interaction with the cytoplasmic domain of cadherin 23 and no interaction could be detected.	78
132762	cd07348	NR_LBD_NGFI-B	The ligand binding domain of Nurr1, a member of conserved family of nuclear receptors. The ligand binding domain of Nerve growth factor-induced-B (NGFI-B): NGFI-B is a member of the nuclear-steroid receptor superfamily. NGFI-B is classified as an orphan receptor because no ligand has yet been identified. NGFI-B is an early immediate gene product of the embryo development that is rapidly produced in response to a variety of cellular signals including nerve growth factor. It is involved in T-cell-mediated apoptosis, as well as neuronal differentiation and function. NGFI-B regulates transcription by binding to a specific DNA target upstream of its target genes and regulating the rate of transcriptional initiation. Like other members of the nuclear receptor (NR) superfamily of ligand-activated transcription factors, NGFI-B has a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD).	238
132763	cd07349	NR_LBD_SHP	The ligand binding domain of DAX1 protein, a nuclear receptor lacking DNA binding domain. The ligand binding domain of the Small Heterodimer Partner (SHP): SHP is a member of the nuclear receptor superfamily. SHP has a ligand binding domain, but lacks the DNA binding domain typical of almost all of the nuclear receptors. It functions as a transcriptional coregulator by directly interacting with other nuclear receptors through its AF-2 motif. The closest relative of SHP is DAX1, and the two can form a heterodimer. SHP is an orphan receptor, lacking an identified ligand.	222
132764	cd07350	NR_LBD_Dax1	The ligand binding domain of DAX1 protein, a nuclear receptor lacking DNA binding domain. The ligand binding domain of the DAX1 protein: DAX1 (dosage-sensitive sex reversal adrenal hypoplasia congenita critical region on chromosome X gene 1) is a nuclear receptor with a typical ligand binding domain, but lacks the DNA binding domain. DAX1 plays an important role in the normal development of several hormone-producing tissues. Duplications of the region of the X chromosome containing DAX1 cause dosage sensitive sex reversal. DAX1 acts as a global repressor of many nuclear receptors, including SF-1, LRH-1, ERR, ER, AR and PR. DAX1 can form homodimers and can heterodimerize with its alternatively spliced isoform DAX1A and other nuclear receptors such as SHP, ERalpha and SF-1.	232
259819	cd07353	harmonin_N	N-terminal protein-binding module of harmonin. Harmonin is a postsynaptic density-95/discs-large/ZO-1 (PDZ) domain-containing scaffold protein, which organizes the Usher protein network of the inner ear and the retina. Harmonin contains a single copy of this domain, which is found at the N-terminus of all three harmonin isoform classes (a, b and c), and which precedes the first PDZ protein-binding domain, PDZ1. This harmonin_N domain binds specifically to a short internal peptide fragment of the cadherin 23 cytoplasmic domain; cadherin 23 is a component of the Usher protein network.	79
259820	cd07354	HN_L-delphilin-R1_like	first harmonin_N_like domain (repeat 1) of L-delphilin, and related domains. This subgroup contains the first of two harmonin_N_like domains of an alternatively spliced longer variant of mouse delphilin (L-delphilin, isoform 1), and related domains. Delphilin is a scaffold protein which binds the glutamate receptor delta-2 (GRID2) subunit and the monocarboxylate transporter 2 at the cerebellar parallel fiber-Purkinje cell synapses. The N-terminus of L-delphilin contains this harmonin_N_like domain preceded by a postsynaptic density-95/discs-large/ZO-1 (PDZ) protein-binding domain, PDZ1. L-delphilin, in common with the shorter C-terminal isoforms (S-delphilin/delphilin alpha and delphilin beta) has a second harmonin_N_like domain (not belonging to this subgroup) and a second PDZ domain, PDZ2. This first harmonin_N_like domain is a putative protein-binding module based on its sequence similarity to the N-terminal domain of harmonin.	80
259821	cd07355	HN_L-delphilin-R2_like	second harmonin_N_like domain (repeat 2) of L-delphilin, and related domains. This subgroup contains the second of two harmonin_N_like domains of an alternatively spliced longer variant of mouse delphilin (L-delphilin), and related domains. Delphilin is a postsynaptic density-95/discs-large/ZO-1 (PDZ) domain-containing scaffold protein which binds the glutamate receptor delta-2 (GRID2) subunit and the monocarboxylate transporter 2 at the cerebellar parallel fiber-Purkinje cell synapses. This harmonin_N_like domain in L-delphilin follows the second PDZ protein-binding domain, PDZ2; it is also found in the shorter C-terminal isoforms (S-delphilin/delphilin alpha and delphilin beta). It is a putative protein-binding module based on its sequence similarity to the N-terminal domain of harmonin. The first harmonin_N_like domain of L-delphilin belongs to a different subgroup and is missing from S-delphilin.	80
259822	cd07356	HN_L-whirlin_R1_like	first harmonin_N_like domain (repeat 1) of the long isoform of whirlin, and related domains. This subgroup contains the first of two harmonin_N_like domains of the long isoform of whirlin, and related domains. Whirlin is a postsynaptic density-95/discs-large/ZO-1 (PDZ) domain-containing scaffold protein which binds various components of the Usher protein network of the inner ear and the retina: erythrocyte protein p55, usherin, VlGR1, and myosin XVa. The long isoform of whirlin contains two harmonin_N_like domains, and three PDZ protein-binding domains, PDZ1-3. This first harmonin_N_like domain precedes PDZ1, and is a putative protein-binding module based on its sequence similarity to the N-terminal domain of harmonin. This first harmonin_N_like domain has been assayed for interaction with the cytoplasmic domain of cadherin 23 (a component of the Usher network and an interacting partner of the harmonin N-domain), however no interaction could be detected. The short whirlin isoform, derived from an alternative start ATG, lacks this first harmonin_N_like domain. The short isoform has in common with the long isoform, the second harmonin_N_like domain (designated repeat 2, not present in this subgroup), and PDZ3.	78
259823	cd07357	HN_L-whirlin_R2_like	second harmonin_N_like domain (repeat 2) of the long isoform of whirlin, and related domains. This subgroup contains the second of two harmonin_N_like domains found in the long isoform of whirlin, and related domains. Whirlin is a postsynaptic density-95/discs-large/ZO-1 (PDZ) domain-containing scaffold protein which binds various components of the Usher protein network of the inner ear and the retina: erythrocyte protein p55, usherin, VlGR1, and myosin XVa. The long isoform of whirlin contains two harmonin_N_like domains, and three PDZ protein-binding domains, PDZ1-3. The short whirlin isoform, derived from an alternative start ATG, lacks the first harmonin_N_like domain but has in common with the long isoform, this second harmonin_N_like domain (designated repeat 2, included in this subgroup) and PDZ3. This second harmonin_N_like domain is a putative protein-binding module based on its sequence similarity to the N-terminal domain of harmonin.	81
259824	cd07358	HN_PDZD7_like	harmonin_N_like domain, a protein-binding module of PDZ domain-containing protein 7 and related proteins. Human PDZD7 is a scaffolding protein which associates with the Usher Syndrome protein network, and localizes to the stereocilia Ankle-link. Usher syndrome is the leading cause of genetic deaf-blindness. PDZD7 has a role in Usher syndrome type 2 (and not in USH1) in humans. Whirlin, Usherin and GPR98 are other USH2 proteins. The latter two form the ankle links and whirlin is thought to be a scaffold for protein interactions at these links. PDZD7, whirlin, and harmonin (an USH1 protein) have a similar domain composition. The domain represented here is a putative protein-binding module based on its sequence similarity to the N-terminal domain of harmonin. Cooperative effects of mutations in PDZD7 and Usherin, and in PDZD7 and GPR98, result in a digenic USH2 phenotype.	78
153372	cd07359	PCA_45_Doxase_B_like	Subunit B of the Class III Extradiol dioxygenase, Protocatechuate 4,5-dioxygenase, and similar enzymes. This subfamily of class III extradiol dioxygenases consists of a number of proteins with known enzymatic activities: Protocatechuate (PCA) 4,5-dioxygenase (LigAB), 2,3-dihydroxyphenylpropionate 1,2-dioxygenase (MhpB), 3-O-Methylgallate Dioxygenase, 2-aminophenol 1,6-dioxygenase, as well as proteins without any known enzymatic activity. These proteins play essential roles in the degradation of aromatic compounds by catalyzing the incorporation of both atoms of molecular oxygen into their preferred substrates. As members of the Class III extradiol dioxygenase family, the enzymes use a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon. LigAB-like class III enzymes are usually composed of two subunits, designated A and B, which form a tetramer composed of two copies of each subunit. This model represents the catalytic subunit, B.	271
153373	cd07361	MEMO_like	Memo (mediator of ErbB2-driven cell motility) is co-precipitated with the C terminus of ErbB2, a protein involved in cell motility. This subfamily is composed of Memo (mediator of ErbB2-driven cell motility) and similar proteins. Memo is a protein that is co-precipitated with the C terminus of ErbB2, a protein involved in cell motility. It is required for the ErbB2-driven cell mobility and is found in protein complexes with cofilin, ErbB2 and PLCgamma1. However, Memo is not homologous to any known signaling proteins, and its function in ErbB2 signaling is not known. Structural studies show that Memo binds directly to a specific ErbB2-derived phosphopeptide. Memo is homologous to class III nonheme iron-dependent extradiol dioxygenases, however, no metal binding or enzymatic activity can be detected for Memo. This subfamily also contains a few members containing a C-terminal AMMECR1-like domain. The AMMECR1 protein was proposed to be a regulatory factor that is potentially involved in the development of AMME contiguous gene deletion syndrome.	266
153374	cd07362	HPCD_like	Class III extradiol dioxygenases with similarity to homoprotocatechuate 2,3-dioxygenase, which catalyzes the key ring cleavage step in the metabolism of homoprotocatechuate. This subfamily of class III extradiol dioxygenases consists of two types of proteins with known enzymatic activities; 3,4-dihydroxyphenylacetate (homoprotocatechuate) 2,3-dioxygenase (HPCD) and 2-amino-5-chlorophenol 1,6-dioxygenase. HPCD catalyzes the key ring cleavage step in the metabolism of homoprotocatechuate (hpca), a central intermediate in the bacterial degradation of aromatic compounds. The enzyme incorporates both atoms of molecular oxygen into hpca, resulting in aromatic ring-opening to yield the product alpha-hydroxy-delta-carboxymethyl cis-muconic semialdehyde. 2-amino-5-chlorophenol 1,6-dioxygenase catalyzes the oxidization and subsequent ring-opening of 2-amino-5-chlorophenol, which is an intermediate during p-chloronitrobenzene degradation. The enzyme is probably a heterotetramer composed of two alpha and two beta subunits. Alpha and beta subunits share significant sequence similarity and both belong to this family. Like all Class III extradiol dioxygenases, these enzymes use a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon.	272
153375	cd07363	45_DOPA_Dioxygenase	The Class III extradiol dioxygenase, 4,5-DOPA Dioxygenase, catalyzes the incorporation of both atoms of molecular oxygen into 4,5-dihydroxy-phenylalanine. This subfamily is composed of plant 4,5-DOPA Dioxygenase, the uncharacterized Escherichia coli protein Jw3007, and similar proteins. 4,5-DOPA Dioxygenase catalyzes the incorporation of both atoms of molecular oxygen into 4,5-dihydroxy-phenylalanine (4,5-DOPA). The reaction opens the cyclic ring between carbons 4 and 5 and produces an unstable seco-DOPA that rearranges to betalamic acid. 4,5-DOPA Dioxygenase is a key enzyme in the biosynthetic pathway of the plant pigment betalain. Homologs of DODA are present not only in betalain-producing plants but also in bacteria and archaea. This enzyme is a member of the class III extradiol dioxygenase family, a group of enzymes which use a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon.	253
153376	cd07364	PCA_45_Dioxygenase_B	Subunit B of the Class III extradiol dioxygenase, Protocatechuate 4,5-dioxygenase, which catalyzes the oxidization and subsequent ring-opening of protocatechuate. Protocatechuate 4,5-dioxygenase (LigAB) catalyzes the oxidization and subsequent ring-opening of protocatechuate (or 3,4-dihydroxybenzoic acid, PCA), an intermediate in the breakdown of lignin and other compounds. Protocatechuate 4,5-dioxygenase is an aromatic ring opening dioxygenase belonging to the class III extradiol enzyme family, a group of enzymes that cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon using a non-heme Fe(II). LigAB is composed of two subunits, designated A and B, which form a tetramer composed of two copies of each subunit. The B subunit (LigB) is the catalytic subunit of LigAB.	277
153377	cd07365	MhpB_like	Subunit B of the Class III Extradiol ring-cleavage dioxygenase, 2,3-dihydroxyphenylpropionate 1,2-dioxygenase (MhpB), which catalyzes the oxidization and subsequent ring-opening of 2,3-dihydroxyphenylpropionate. 2,3-dihydroxyphenylpropionate 1,2-dioxygenase (MhpB) catalyzes the oxidization and subsequent ring-opening of 2,3-dihydroxyphenylpropionate, yielding the product 2-hydroxy-6-oxo-nona-2,4-diene 1,9-dicarboxylate.  It is an essential enzyme in the beta-phenylpropionic degradation pathway, in which beta-phenylpropionic is first hydrolyzed to produce 2,3-dihydroxyphenylpropionate. The enzyme is a member of the class III extradiol dioxygenase family, a group of enzymes which use a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon. LigAB-like class III enzymes are usually composed of two subunits, designated A and B, which form a tetramer composed of two copies of each subunit. This model represents the catalytic subunit, B. MhpB is likely to be a tetramer.	310
153378	cd07366	3MGA_Dioxygenase	Subunit B of the Class III Extradiol ring-cleavage dioxygenase, 3-O-Methylgallate Dioxygenase, which catalyzes the oxidization and subsequent ring-opening of 3-O-Methylgallate. 3-O-Methylgallate Dioxygenase catalyzes the oxidization and subsequent ring-opening of 3-O-Methylgallate (3MGA) between carbons 2 and 3. 3-O-Methylgallate Dioxygenase is a key enzyme in the syringate degradation pathway, in which the syringate is first converted to 3-O-Methylgallate by O-demethylase. This enzyme is a member of the class III extradiol dioxygenase family, a group of enzymes which uses a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon. LigAB-like enzymes are usually composed of two subunits, designated A and B, which form a tetramer composed of two copies of each subunit. This model represents the catalytic subunit, B.	328
153379	cd07367	CarBb	CarBb is the B subunit of the Class III Extradiol ring-cleavage dioxygenase, 2-aminophenol 1,6-dioxygenase, which catalyzes the oxidization and subsequent ring-opening of 2-aminophenyl-2,3-diol. CarBb is the B subunit of 2-aminophenol 1,6-dioxygenase (CarB), which catalyzes the oxidization and subsequent ring-opening of 2-aminophenyl-2,3-diol. It is a key enzyme in the carbazole degradation pathway isolated from bacterial strains with carbazole degradation ability. The enzyme is a heterotetramer composed of two A and two B subunits. CarB belongs to the class III extradiol dioxygenase family, a group of enzymes which use a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon. Although the enzyme was originally isolated as a meta-cleavage enzyme for 2'-aminobiphenyl-2,3-diol involved in carbazole degradation, it has also shown high specificity for 2,3-dihydroxybiphenyl.	268
153380	cd07368	PhnC_Bs_like	PhnC is a Class III Extradiol ring-cleavage dioxygenase involved in the polycyclic aromatic hydrocarbon (PAH) catabolic pathway. This subfamily is composed of Burkholderia sp. PhnC and similar proteins. PhnC is one of nine protein products encoded by the phn locus. These proteins are involved in the polycyclic aromatic hydrocarbon (PAH) catabolic pathway. PhnC is a member of the class III extradiol dioxygenase family, a group of enzymes which use a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon. LigAB-like enzymes are usually composed of two subunits, designated A and B, which form a tetramer composed of two copies of each subunit. This model represents the catalytic subunit, B.	277
153381	cd07369	PydA_Rs_like	PydA is a Class III Extradiol ring-cleavage dioxygenase required for the degradation of 3-hydroxy-4-pyridone (HP). This subfamily is composed of Rhizobium sp. PydA and similar proteins. PydA is required for the degradation of 3-hydroxy-4-pyridone (HP), an intermediate in the Leucaena toxin mimosine degradation pathway. It is a member of the class III extradiol dioxygenase family, a group of enzymes that use a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon. LigAB-like enzymes are usually composed of two subunits, designated A and B, which form a tetramer composed of two copies of each subunit. This model represents the catalytic subunit, B.	329
153382	cd07370	HPCD	The Class III extradiol dioxygenase, homoprotocatechuate 2,3-dioxygenase, catalyzes the key ring cleavage step in the metabolism of homoprotocatechuate. 3,4-dihydroxyphenylacetate (homoprotocatechuate) 2,3-dioxygenase (HPCD) catalyzes the key ring cleavage step in the metabolism of homoprotocatechuate (hpca), a central intermediate in the bacterial degradation of aromatic compounds. The enzyme incorporates both atoms of molecular oxygen into hpca, resulting in aromatic ring-opening to yield alpha-hydroxy-delta-carboxymethyl cis-muconic semialdehyde. HPCD is a member of the class III extradiol dioxygenase family, a group of enzymes which use a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon.	280
153383	cd07371	2A5CPDO_AB	The alpha and beta subunits of the Class III extradiol dioxygenase, 2-amino-5-chlorophenol 1,6-dioxygenase, which catalyzes the oxidization and subsequent ring-opening of 2-amino-5-chlorophenol. This subfamily contains both alpha and beta subunits of 2-amino-5-chlorophenol 1,6-dioxygenase (2A5CPDO), which catalyzes the oxidization and subsequent ring-opening of 2-amino-5-chlorophenol, an intermediate during p-chloronitrobenzene degradation. 2A5CPDO is a member of the class III extradiol dioxygenase family, a group of enzymes which use a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon. The active enzyme is probably a heterotetramer, composed of two alpha and two beta subunits. Alpha and beta subunits share significant sequence similarity and may have evolved by gene duplication.	268
153384	cd07372	2A5CPDO_B	The beta subunit of the Class III extradiol dioxygenase, 2-amino-5-chlorophenol 1,6-dioxygenase, which catalyzes the oxidization and subsequent ring-opening of 2-amino-5-chlorophenol. 2-amino-5-chlorophenol 1,6-dioxygenase (2A5CPDO), catalyzes the oxidization and subsequent ring-opening of 2-amino-5-chlorophenol, which is an intermediate during p-chloronitrobenzene degradation. This enzyme is a member of the class III extradiol dioxygenase family, a group of enzymes which use a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon. The active 2A5CPDO enzyme is probably a heterotetramer, composed of two alpha and two beta subunits. The alpha and beta subunits share significant sequence similarity and may have evolved by gene duplication. This model describes the beta subunit, which contains a putative metal binding site with two conserved histidines; these residues are equivalent to two out of three Fe(II) binding residues present in the catalytic subunit dioxygenase LigB. The alpha subunit does not contain these potential metal binding residues. The 2A5CPDO beta subunit may be the catalytic subunit of the enzyme.	294
153385	cd07373	2A5CPDO_A	The alpha subunit of the Class III extradiol dioxygenase, 2-amino-5-chlorophenol 1,6-dioxygenase, which catalyzes the oxidization and subsequent ring-opening of 2-amino-5-chlorophenol. 2-amino-5-chlorophenol 1,6-dioxygenase (2A5CPDO) catalyzes the oxidization and subsequent ring-opening of 2-amino-5-chlorophenol, which is an intermediate during p-chloronitrobenzene degradation. This enzyme is a member of the class III extradiol dioxygenase family, a group of enzymes which use a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon. The active enzyme is probably a heterotetramer, composed of two alpha and two beta subunits. The alpha and beta subunits share significant sequence similarity and may have evolved by gene duplication. This model describes the alpha subunit, which does not contain a potential metal binding site and may not possess catalytic activity.	271
143620	cd07374	CYTH-like_Pase	CYTH-like (also known as triphosphate tunnel metalloenzyme (TTM)-like) Phosphatases. CYTH-like superfamily enzymes hydrolyze triphosphate-containing substrates and require metal cations as cofactors. They have a unique active site located at the center of an eight-stranded antiparallel beta barrel tunnel (the triphosphate tunnel). The name CYTH originated from the gene designation for bacterial class IV adenylyl cyclases (CyaB), and from thiamine triphosphatase. Class IV adenylate cyclases catalyze the conversion of ATP to 3',5'-cyclic AMP (cAMP) and PPi. Thiamine triphosphatase is a soluble cytosolic enzyme which converts thiamine triphosphate to thiamine diphosphate. This domain superfamily also contains RNA triphosphatases, membrane-associated polyphosphate polymerases, tripolyphosphatases, nucleoside triphosphatases, nucleoside tetraphosphatases and other proteins with unknown functions.	174
153408	cd07375	Anticodon_Ia_like	Anticodon-binding domain of class Ia aminoacyl tRNA synthetases and similar domains. This domain is found in a variety of class Ia aminoacyl tRNA synthetases, C-terminal to the catalytic core domain. It recognizes and specifically binds to the anticodon of the tRNA. Aminoacyl tRNA synthetases catalyze the transfer of cognate amino acids to the 3'-end of their tRNAs by specifically recognizing cognate from non-cognate amino acids. Members include valyl-, leucyl-, isoleucyl-, cysteinyl-, arginyl-, and methionyl-tRNA synthetases. This superfamily also includes a domain from MshC, an enzyme in the mycothiol biosynthetic pathway.	117
143511	cd07376	PLPDE_III_DSD_D-TA_like	Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzymes Similar to D-Serine Dehydratase and D-Threonine Aldolase. This family includes eukaryotic D-serine dehydratases (DSD), cryptic DSDs from bacteria, D-threonine aldolases (D-TA), low specificity D-TAs, and similar uncharacterized proteins. DSD catalyzes the dehydration of D-serine to aminoacrylate, which is rapidly hydrolyzed to pyruvate and ammonia. D-TA reversibly catalyzes the aldol cleavage of D-threonine into glycine and acetaldehyde, and the synthesis of D-threonine from glycine and acetaldehyde. Members of this family are fold type III PLP-dependent enzymes, similar to bacterial alanine racemase (AR), which contains an N-terminal PLP-binding TIM barrel domain and a C-terminal beta-sandwich domain. AR exists as homodimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. Based on similarity to AR, it is possible members of this family also form dimers in solution.	345
153418	cd07377	WHTH_GntR	Winged helix-turn-helix (WHTH) DNA-binding domain of the GntR family of transcriptional regulators. This CD represents the winged HTH DNA-binding domain of the GntR (named after the gluconate operon repressor in Bacillus subtilis) family of bacterial transcriptional regulators and their putative homologs found in eukaryota and archaea. The GntR family has over 6000 members distributed among almost all bacterial species, and comprises the FadR, HutC, MocR, YtrA, AraR, PlmA, and other subfamilies, which regulate a wide variety of biological processes. The monomeric proteins of the GntR family are characterized by two functional domains: a small highly conserved winged helix-turn-helix prokaryotic DNA binding domain in the N-terminus, and a very diverse regulatory ligand-binding domain in the C-terminus for effector-binding/oligomerization, which provides the basis for the subfamily classifications. Binding of the effector to GntR-like transcriptional regulators is presumed to result in a conformational change that regulates the DNA-binding affinity of the repressor. The GntR-like proteins bind as dimers, where each monomer recognizes a half-site of 2-fold symmetric DNA sequences.	66
277324	cd07378	MPP_ACP5	Homo sapiens acid phosphatase 5 and related proteins, metallophosphatase domain. Acid phosphatase 5 (ACP5) removes the mannose 6-phosphate recognition marker from lysosomal proteins. The exact site of dephosphorylation is not clear. Evidence suggests dephosphorylation may take place in a prelysosomal compartment as well as in the lysosome. ACP5 belongs to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	286
277325	cd07379	MPP_239FB	Homo sapiens 239FB and related proteins, metallophosphatase domain. 239FB (Fetal brain protein 239) is thought to play a role in central nervous system development, but its specific role is unknown. 239FB is expressed predominantly in human fetal brain from a gene located in the chromosome 11p13 region associated with the mental retardation component of the WAGR (Wilms tumor, Aniridia, Genitourinary anomalies, Mental retardation) syndrome. Orthologous brp-like (brain protein 239-like) proteins have been identified in the invertebrate amphioxus group and in vertebrates. 239FB belongs to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	135
277326	cd07380	MPP_CWF19_N	Schizosaccharomyces pombe CWF19 and related proteins, N-terminal metallophosphatase domain. CWF19 cell cycle control protein (also known as CWF19-like 1 (CWF19L1) in Homo sapiens), N-terminal metallophosphatase domain.   CWF19 contains C-terminal domains similar to that found in the CwfJ cell cycle control protein.   The metallophosphatase domain belongs to the metallophosphatase (MPP) superfamily.  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	149
277327	cd07381	MPP_CapA	CapA and related proteins, metallophosphatase domain. CapA is one of three membrane-associated enzymes in Bacillus anthracis that is required for synthesis of gamma-polyglutamic acid (PGA), a major component of the bacterial capsule.  The YwtB and PgsA proteins of Bacillus subtilis are closely related to CapA and are also included in this alignment model.  CapA belongs to the metallophosphatase (MPP) superfamily.  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	239
277328	cd07382	MPP_DR1281	Deinococcus radiodurans DR1281 and related proteins, metallophosphatase domain. DR1281 is an uncharacterized Deinococcus radiodurans protein with a domain that belongs to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	255
277329	cd07383	MPP_Dcr2	Saccharomyces cerevisiae DCR2 phosphatase and related proteins, metallophosphatase domain. DCR2 phosphatase (Dosage-dependent Cell Cycle Regulator 2) functions together with DCR1 (Gid8) in a common pathway to accelerate initiation of DNA replication in Saccharomyces cerevisiae. Genetic analysis suggests that DCR1 functions upstream of DCR2.  DCR2 interacts with and dephosphorylates Sic1, an inhibitor of mitotic cyclin/cyclin-dependent kinase complexes, which may serve to trigger the initiation of cell division.  DCR2 belongs to the metallophosphatase (MPP) superfamily.  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	202
277330	cd07384	MPP_Cdc1_like	Saccharomyces cerevisiae CDC1 and related proteins, metallophosphatase domain. Cdc1 (also known as XlCdc1 in Xenopus laevis) is an endoplasmic reticulum-localized transmembrane lipid phosphatase with a metallophosphatase domain facing the ER lumen.  In budding yeast, the gene encoding CDC1 is essential while nonlethal mutations cause defects in Golgi inheritance and actin polarization.  Cdc1 mutant cells accumulate an unidentified phospholipid, suggesting that Cdc1 is a lipid phosphatase.  Cdc1 mutant cells also have highly elevated intracellular calcium levels suggesting a possible role for Cdc1 in calcium regulation.  The 5' flanking region of Cdc1 is a regulatory region with conserved binding site motifs for AP1, AP2, Sp1, NF-1 and CREB.  DNA polymerase delta consists of at least four subunits - Pol3, Cdc1, Cdc27, and Cdm1.  This group also contains Saccharomyces cerevisiae TED1 (Trafficking of Emp24p/Erv25p-dependent cargo disrupted 1), which acts together with Emp24p and Erv25p in cargo exit from the ER, and human MPPE1. The human MPPE1 gene is a candidate susceptibility gene for bipolar disorder. These proteins belong to the metallophosphatase (MPP) superfamily.  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	172
277331	cd07385	MPP_YkuE_C	Bacillus subtilis YkuE and related proteins, C-terminal metallophosphatase domain. YkuE is an uncharacterized Bacillus subtilis protein with a C-terminal metallophosphatase domain and an N-terminal twin-arginine (RR) motif. An RR-signal peptide derived from the Bacillus subtilis YkuE protein can direct Tat-dependent secretion of agarase in Streptomyces lividans. This is an indication that YkuE is transported by the Bacillus subtilis Tat (Twin-arginine translocation) pathway machinery.  YkuE belongs to the metallophosphatase (MPP) superfamily.  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	224
277332	cd07386	MPP_DNA_pol_II_small_archeal_C	archeal DNA polymerase II, small subunit, C-terminal metallophosphatase domain. The small subunit of the archeal DNA polymerase II contains a C-terminal metallophosphatase domain.  This domain is thought to be functionally active because the active site residues required for phosphoesterase activity in other members of this superfamily are intact.  The archeal replicative DNA polymerases are thought to possess intrinsic phosphatase activity that hydrolyzes the pyrophosphate released during nucleotide polymerization.  This domain belongs to the metallophosphatase (MPP) superfamily.  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	243
277333	cd07387	MPP_PolD2_C	PolD2 (DNA polymerase delta, subunit 2), C-terminal domain. PolD2 (DNA polymerase delta, subunit 2) is an auxiliary subunit of the eukaryotic DNA polymerase delta (PolD) complex thought to play a regulatory role and to serve as a scaffold for PolD assembly by interacting simultaneously with all three of the other subunits.  PolD2 is catalytically inactive and lacks the active site residues required for phosphoesterase activity in other members of this superfamily.  PolD2 is also involved in the recruitment of several proteins regulating DNA metabolism, including p21, PDIP1, PDIP38, PDIP46, and WRN. Human PolD consists of four subunits: p125 (PolD1), p50 (PolD2), p66 (PolD3), and p12 (PolD4).  PolD is one of three major replicases in eukaryotes. PolD also plays an essential role in translesion DNA synthesis, homologous recombination, and DNA repair.  Within the PolD complex, PolD2 tightly associates with PolD3.  PolD2 belongs to the metallophosphatase (MPP) superfamily.  MPPs are functionally diverse, but share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	257
277334	cd07388	MPP_Tt1561	Thermus thermophilus Tt1561 and related proteins, metallophosphatase domain. This family includes bacterial proteins related to Tt1561 (also known as Aq1956 in Aquifex aeolicus), an uncharacterized Thermus thermophilus protein.  The conserved domain present in members of this family belongs to the metallophosphatase (MPP) superfamily.  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets, and is thought to allow for productive metal coordination. However, the active site residues required for phosphoesterase activity in other members of this superfamily are poorly conserved in this functionally uncharacterized family.	224
277335	cd07389	MPP_PhoD	Bacillus subtilis PhoD and related proteins, metallophosphatase domain. PhoD (also known as alkaline phosphatase D/APaseD in Bacillus subtilis) is a secreted phosphodiesterase encoded by phoD of the Pho regulon in Bacillus subtilis. PhoD homologs are found in prokaryotes, eukaryotes, and archaea. PhoD contains a twin arginine (RR) motif and is transported by the Tat (Twin-arginine translocation) pathway machinery (TatAyCy). This family also includes the Fusarium oxysporum Fso1 protein. PhoD belongs to the metallophosphatase (MPP) superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	242
277336	cd07390	MPP_AQ1575	Aquifex aeolicus AQ1575 and related proteins, metallophosphatase domain. This family includes bacterial and archeal proteins homologous to AQ1575, an uncharacterized Aquifex aeolicus protein.  AQ1575 may play an accessory role in DNA repair, based on the close proximity of its gene to Holliday junction resolvasome genes.  The domain present in members of this family belongs to the metallophosphatase (MPP) superfamily.  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	170
277337	cd07391	MPP_PF1019	Pyrococcus furiosus PF1019 and related proteins, metallophosphatase domain. This family includes bacterial and archeal proteins homologous to PF1019, an uncharacterized Pyrococcus furiosus protein.  The domain present in members of this family belongs to the metallophosphatase (MPP) superfamily.  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	175
277338	cd07392	MPP_PAE1087	Pyrobaculum aerophilum PAE1087 and related proteins, metallophosphatase domain. PAE1087 is an uncharacterized Pyrobaculum aerophilum protein with a metallophosphatase domain.  The domain present in members of this family belongs to the metallophosphatase (MPP) superfamily.  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	190
277339	cd07393	MPP_DR1119	Deinococcus radiodurans DR1119 and related proteins, metallophosphatase domain. DR1119 is an uncharacterized Deinococcus radiodurans protein with a metallophosphatase domain.  The domain present in members of this family belongs to the metallophosphatase (MPP) superfamily.  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	238
163637	cd07394	MPP_Vps29	Homo sapiens Vps29 and related proteins, metallophosphatase domain. Vps29 (vacuolar sorting protein 29), also known as vacuolar membrane protein Pep11, is a subunit of the retromer complex which is responsible for the retrieval of mannose-6-phosphate receptors (MPRs) from the endosomes for retrograde transport back to the Golgi. Vps29 has a phosphoesterase fold that acts as a protein interaction scaffold for retromer complex assembly as well as a phosphatase with specificity for the cytoplasmic tail of the MPR.  The retromer includes the following 5 subunits: Vps35, Vps26, Vps29, and a dimer of the sorting nexins Vps5 (Snx1) and Vps17 (Snx2).  Vps29 belongs to the metallophosphatase (MPP) superfamily.  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	178
277340	cd07395	MPP_CSTP1	Homo sapiens CSTP1 and related proteins, metallophosphatase domain. CSTP1 (complete S-transactivated protein 1) is an uncharacterized Homo sapiens protein with a metallophosphatase domain that is transactivated by the complete S protein of hepatitis B virus.  CSTP1 belongs to the metallophosphatase (MPP) superfamily.  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	263
277341	cd07396	MPP_Nbla03831	Homo sapiens Nbla03831 and related proteins, metallophosphatase domain. Nbla03831 (also known as LOC56985) is an uncharacterized Homo sapiens protein with a domain that belongs to the metallophosphatase (MPP) superfamily.  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	245
277342	cd07397	MPP_NostocDevT-like	Nostoc DevT and similar proteins, metallophosphatase domain. DevT (Alr4674) is a putative protein phosphatase from Nostoc PCC 7120 (Anabaena PCC 7120). DevT mutants form mature heterocysts, but they are unable to fix N(2) and must be supplied with a source of combined nitrogen in order to survive. Anabaena DevT shows homology to phosphatases of the PPP family and displays a Mn(2+)-dependent phosphatase activity. DevT is constitutively expressed in both vegetative cells and heterocysts, and is not regulated by NtcA. The heterocyst regulator HetR may exert a certain inhibition on the expression of devT. Under diazotrophic growth conditions, DevT protein accumulates specifically in mature heterocysts. The role that DevT plays in a late essential step of heterocyst differentiation is still unknown.  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	245
277343	cd07398	MPP_YbbF-LpxH	Escherichia coli YbbF/LpxH and related proteins, metallophosphatase domain. YbbF/LpxH is an Escherichia coli UDP-2,3-diacylglucosamine hydrolase thought to catalyze the fourth step of lipid A biosynthesis, in which a precursor UDP-2,3-diacylglucosamine is hydrolyzed to yield 2,3-diacylglucosamine 1-phosphate and UMP.  YbbF belongs to the metallophosphatase (MPP) superfamily.  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	217
277344	cd07399	MPP_YvnB	Bacillus subtilis YvnB and related proteins, metallophosphatase domain. YvnB (BSU35040) is an uncharacterized Bacillus subtilis protein with a metallophosphatase domain.  This family includes bacterial and eukaryotic proteins similar to YvnB.  YvnB belongs to the metallophosphatase (MPP) superfamily.  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	207
277345	cd07400	MPP_1	Uncharacterized subfamily, metallophosphatase domain. Uncharacterized subfamily of the MPP superfamily. MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	138
277346	cd07401	MPP_TMEM62_N	Homo sapiens TMEM62, N-terminal metallophosphatase domain. TMEM62 (transmembrane protein 62) is an uncharacterized Homo sapiens transmembrane protein with an N-terminal metallophosphatase domain.  TMEM62 belongs to the metallophosphatase (MPP) superfamily.  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	254
277347	cd07402	MPP_GpdQ	Enterobacter aerogenes GpdQ and related proteins, metallophosphatase domain. GpdQ (glycerophosphodiesterase Q, also known as Rv0805 in Mycobacterium tuberculosis) is a binuclear metallophosphoesterase from Enterobacter aerogenes that catalyzes the hydrolysis of mono-, di-, and triester substrates, including some organophosphate pesticides and products of the degradation of nerve agents.  The GpdQ homolog, Rv0805, has 2',3'-cyclic nucleotide phosphodiesterase activity. GpdQ and Rv0805 belong to the metallophosphatase (MPP) superfamily.  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	240
277348	cd07403	MPP_TTHA0053	Thermus thermophilus TTHA0053 and related proteins, metallophosphatase domain. TTHA0053 is an uncharacterized Thermus thermophilus protein with a domain that belongs to the metallophosphatase (MPP) superfamily.  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	130
277349	cd07404	MPP_MS158	Microscilla MS158 and related proteins, metallophosphatase domain. MS158 is an uncharacterized Microscilla protein with a metallophosphatase domain.  Microscilla proteins MS152 and MS153 are also included in this family.  The domain present in members of this family belongs to the metallophosphatase (MPP) superfamily.  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	201
277350	cd07405	MPP_UshA_N	Escherichia coli UshA and related proteins, N-terminal metallophosphatase domain. UshA is a bacterial periplasmic enzyme with UDP-sugar hydrolase and dinucleoside-polyphosphate hydrolase activities associated with its N-terminal metallophosphatase domain, and 5'-nucleotidase activity associated with its C-terminal domain. UshA has been studied in Escherichia coli where it is expressed from the ushA gene as an immature precursor and proteolytically cleaved to form a mature product upon export to the periplasm. UshA hydrolyzes many different nucleotides and nucleotide derivatives and has been shown to degrade external UDP-glucose to uridine, glucose 1-phosphate and phosphate for utilization by the cell. The N-terminal metallophosphatase domain belongs to a large superfamily of distantly related metallophosphatases (MPPs) that includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	287
277351	cd07406	MPP_CG11883_N	Drosophila melanogaster CG11883 and related proteins, N-terminal metallophosphatase domain. CG11883 is an uncharacterized Drosophila melanogaster UshA-like protein with two domains, an N-terminal metallophosphatase domain and  a C-terminal nucleotidase domain.  The N-terminal metallophosphatase domain belongs to a large superfamily of distantly related metallophosphatases (MPPs) that includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	257
277352	cd07407	MPP_YHR202W_N	Saccharomyces cerevisiae YHR202W and related proteins, N-terminal metallophosphatase domain. YHR202W is an uncharacterized Saccharomyces cerevisiae UshA-like protein with two domains, an N-terminal metallophosphatase domain and  a C-terminal nucleotidase domain.  The N-terminal metallophosphatase domain belongs to a large superfamily of distantly related metallophosphatases (MPPs) that includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	286
277353	cd07408	MPP_SA0022_N	Staphylococcus aureus SA0022 and related proteins, N-terminal metallophosphatase domain. SA0022 is an uncharacterized Staphylococcus aureus UshA-like protein with two putative domains, an N-terminal metallophosphatase domain and  a C-terminal nucleotidase domain.  SA0022 also contains a putative C-terminal cell wall anchor domain.  The N-terminal metallophosphatase domain belongs to a large superfamily of distantly related metallophosphatases (MPPs) that includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	255
277354	cd07409	MPP_CD73_N	CD73 ecto-5'-nucleotidase and related proteins, N-terminal metallophosphatase domain. CD73 is a mammalian ecto-5'-nucleotidase expressed in endothelial cells and lymphocytes that catalyzes the conversion of 5'-AMP to adenosine in the final step of a pathway that generates adenosine from ATP.  This pathway also includes a CD39 nucleoside triphosphate dephosphorylase that mediates the dephosphorylation of ATP to ADP and then to 5'-AMP.  These enzymes all have an N-terminal metallophosphatase domain and a C-terminal 5'-nucleotidase domain.  The N-terminal metallophosphatase domain belongs to a large superfamily of distantly related metallophosphatases (MPPs) that includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	279
277355	cd07410	MPP_CpdB_N	Escherichia coli CpdB and related proteins, N-terminal metallophosphatase domain. CpdB is a bacterial periplasmic protein with an N-terminal metallophosphatase domain and a C-terminal 3'-nucleotidase domain.  This alignment model represents the N-terminal metallophosphatase domain, which has 2',3'-cyclic phosphodiesterase activity, hydrolyzing the 2',3'-cyclic phosphates of adenosine, guanosine, cytosine and uridine to yield nucleoside and phosphate.  CpdB also hydrolyzes the chromogenic substrates p-nitrophenyl phosphate (PNPP), bis(PNPP) and p-nitrophenyl phosphorylcholine (NPPC).  CpdB is thought to play a scavenging role during RNA hydrolysis by converting the non-transportable nucleotides produced by RNaseI to nucleosides which can easily enter a cell for use as a carbon source.  This family also includes YfkN, a Bacillus subtilis nucleotide phosphoesterase with two copies of each of the metallophosphatase and 3'-nucleotidase domains.  The N-terminal metallophosphatase domain belongs to a large superfamily of distantly related metallophosphatases (MPPs) that includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	280
277356	cd07411	MPP_SoxB_N	Thermus thermophilus SoxB and related proteins, N-terminal metallophosphatase domain. SoxB (sulfur oxidation protein B) is a periplasmic thiosulfohydrolase and an essential component of the sulfur oxidation pathway in archaea and bacteria.  SoxB has a dinuclear manganese cluster and is thought to catalyze the release of sulfate from a protein-bound cysteine S-thiosulfonate.  SoxB is expressed from the sox (sulfur oxidation) gene cluster, which encodes 15 other sox genes, and has two domains, an N-terminal metallophosphatase domain and a C-terminal 5'-nucleotidase domain.  SoxB binds the SoxYZ complex and is thought to function as a sulfate-thiohydrolase.  SoxB is closely related to the UshA, YchR, and CpdB proteins, all of which have the same two-domain architecture.  The N-terminal metallophosphatase domain belongs to a large superfamily of distantly related metallophosphatases (MPPs) that includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	273
277357	cd07412	MPP_YhcR_N	Bacillus subtilis YhcR endonuclease and related proteins, N-terminal metallophosphatase domain. YhcR is a Bacillus subtilis sugar-nonspecific endonuclease. It cleaves endonucleolytically to yield nucleotide 3'-monophosphate products, similar to Staphylococcus aureus micrococcal nuclease. YhcR appears to be located in the cell wall, and is thought to be a substrate for a Bacillus subtilis sortase. YhcR is the major calcium-activated nuclease of B. subtilis.  The N-terminal metallophosphatase domain belongs to a large superfamily of distantly related metallophosphatases (MPPs) that includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	295
277358	cd07413	MPP_PA3087	Pseudomonas aeruginosa PA3087 and related proteins, metallophosphatase domain. PA3087 is an uncharacterized protein from Pseudomonas aeruginosa with a metallophosphatase domain that belongs to the phosphoprotein phosphatase (PPP) family.  The PPP family also includes: PP1, PP2A, PP2B (calcineurin), PP4, PP5, PP6, PP7, Bsu1, RdgC, PrpE, PrpA/PrpB, and ApA4 hydrolase. The PPP catalytic domain is defined by three conserved motifs (-GDXHG-, -GDXVDRG- and -GNHE-).  The PPP enzyme family is ancient with members found in all eukaryotes, and in most bacterial and archeal genomes.  Dephosphorylation of phosphoserines and phosphothreonines on target proteins plays a central role in the regulation of many cellular processes.  PPPs belong to the metallophosphatase (MPP) superfamily.  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets.  This domain is thought to allow for productive metal coordination.	222
277359	cd07414	MPP_PP1_PPKL	PP1, PPKL (PP1 and kelch-like) enzymes, and related proteins, metallophosphatase domain. PP1 (protein phosphatase type 1) is a serine/threonine phosphatase that regulates many cellular processes including: cell-cycle progression, protein synthesis, muscle contraction, carbohydrate metabolism, transcription and neuronal signaling, through its interaction with at least 180 known targeting proteins.  PP1 occurs in all tissues and regulates many pathways, ranging from cell-cycle progression to carbohydrate metabolism.  Also included here are the PPKL (PP1 and kelch-like) enzymes including the PPQ, PPZ1, and PPZ2 fungal phosphatases.  These PPKLs have a large N-terminal kelch repeat in addition to a C-terminal phosphoesterase domain.  The PPP (phosphoprotein phosphatase) family, to which PP1 belongs, is one of two known protein phosphatase families specific for serine and threonine.  The PPP family also includes: PP2A, PP2B (calcineurin), PP4, PP5, PP6,  PP7, Bsu1, RdgC, PrpE, PrpA/PrpB, and ApA4 hydrolase. The PPP catalytic domain is defined by three conserved motifs (-GDXHG-, -GDXVDRG- and -GNHE-).  The PPP enzyme family is ancient with members found in all eukaryotes, and in most bacterial and archeal genomes.  Dephosphorylation of phosphoserines and phosphothreonines on target proteins plays a central role in the regulation of many cellular processes.  PPPs belong to the metallophosphatase (MPP) superfamily.  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets.  This domain is thought to allow for productive metal coordination.	291
277360	cd07415	MPP_PP2A_PP4_PP6	PP2A, PP4, and PP6 phosphoprotein phosphatases, metallophosphatase domain. PP2A-like family of phosphoprotein phosphatases (PPP's) including PP4 and PP6.  PP2A (Protein phosphatase 2A) is a critical regulator of many cellular activities.  PP2A comprises about 1% of total cellular proteins.  PP2A, together with protein phosphatase 1 (PP1), accounts for more than 90% of all serine/threonine phosphatase activities in most cells and tissues. The PP2A subunit, in addition to having a catalytic domain homologous to PP1, has a unique C-terminal tail, containing a motif that is conserved in the catalytic subunits of all PP2A-like phosphatases including PP4 and PP6, and has an important role in PP2A regulation.  The PP2A-like family of phosphatases all share a similar heterotrimeric architecture, that includes: a 65kDa scaffolding subunit (A), a 36kDa catalytic subunit (C), and one of 18 regulatory subunits (B).  The PPP (phosphoprotein phosphatase) family, to which PP2A belongs, is one of two known protein phosphatase families specific for serine and threonine.  The PPP family also includes: PP1, PP2B (calcineurin), PP4, PP5, PP6, PP7, Bsu1, RdgC, PrpE, PrpA/PrpB, and ApA4 hydrolase. The PPP catalytic domain is defined by three conserved motifs (-GDXHG-, -GDXVDRG- and -GNHE-).  The PPP enzyme family is ancient with members found in all eukaryotes, and in most bacterial and archeal genomes.  Dephosphorylation of phosphoserines and phosphothreonines on target proteins plays a central role in the regulation of many cellular processes.  PPPs belong to the metallophosphatase (MPP) superfamily.  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets.  This domain is thought to allow for productive metal coordination.	285
277361	cd07416	MPP_PP2B	PP2B, metallophosphatase domain. PP2B (calcineurin) is a unique serine/threonine protein phosphatase in its regulation by a second messenger (calcium and calmodulin).  PP2B is involved in many biological processes including immune responses, the second messenger cAMP pathway, sodium/potassium ion transport in the nephron, cell cycle progression in lower eukaryotes, cardiac hypertrophy, and memory formation.  PP2B is highly conserved from yeast to humans, but is absent from plants.  PP2B is a heterodimer consisting of a catalytic subunit (CnA) and a regulatory subunit (CnB); CnB contains four Ca2+ binding motifs referred to as EF hands.  The PPP (phosphoprotein phosphatase) family, to which PP2B belongs, is one of two known protein phosphatase families specific for serine and threonine.  The PPP family also includes: PP1, PP2A, PP4, PP5, PP6, PP7, Bsu1, RdgC, PrpE, PrpA/PrpB, and ApA4 hydrolase. The PPP catalytic domain is defined by three conserved motifs (-GDXHG-, -GDXVDRG- and -GNHE-).  The PPP enzyme family is ancient with members found in all eukaryotes, and in most bacterial and archeal genomes.  Dephosphorylation of phosphoserines and phosphothreonines on target proteins plays a central role in the regulation of many cellular processes.  PPPs belong to the metallophosphatase (MPP) superfamily.  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets.  This domain is thought to allow for productive metal coordination.	305
277362	cd07417	MPP_PP5_C	PP5, C-terminal metallophosphatase domain. Serine/threonine protein phosphatase-5 (PP5) is a member of the PPP gene family of protein phosphatases that is highly conserved among eukaryotes and widely expressed in mammalian tissues. PP5 has a C-terminal phosphatase domain and an extended N-terminal TPR (tetratricopeptide repeat) domain containing three TPR motifs.  The PPP (phosphoprotein phosphatase) family, to which PP5 belongs, is one of two known protein phosphatase families specific for serine and threonine.  The PPP family also includes: PP1, PP2A, PP2B (calcineurin), PP4, PP6, PP7, Bsu1, RdgC, PrpE, PrpA/PrpB, and ApA4 hydrolase. The PPP catalytic domain is defined by three conserved motifs (-GDXHG-, -GDXVDRG- and -GNHE-).  The PPP enzyme family is ancient with members found in all eukaryotes, and in most bacterial and archeal genomes.  Dephosphorylation of phosphoserines and phosphothreonines on target proteins plays a central role in the regulation of many cellular processes.  PPPs belong to the metallophosphatase (MPP) superfamily.  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets.  This domain is thought to allow for productive metal coordination.	316
163661	cd07418	MPP_PP7	PP7, metallophosphatase domain. PP7 is a plant phosphoprotein phosphatase that is highly expressed in a subset of stomata and thought to play an important role in sensory signaling.  PP7 acts as a positive regulator of signaling downstream of cryptochrome blue light photoreceptors.  PP7 also controls amplification of phytochrome signaling, and interacts with nucleoside diphosphate kinase 2 (NDPK2), a positive regulator of phytochrome signaling.  In addition, PP7 interacts with heat shock transcription factor HSF and up-regulates protective heat shock proteins.  PP7 may also play a role in salicylic acid-dependent defense signaling.  The PPP (phosphoprotein phosphatase) family, to which PP7 belongs, is one of two known protein phosphatase families specific for serine and threonine.  The PPP family also includes: PP2A, PP2B (calcineurin), PP4, PP5, PP6, Bsu1, RdgC, PrpE, PrpA/PrpB, and ApA4 hydrolase. The PPP catalytic domain is defined by three conserved motifs (-GDXHG-, -GDXVDRG- and -GNHE-).  The PPP enzyme family is ancient with members found in all eukaryotes, and in most bacterial and archeal genomes.  Dephosphorylation of phosphoserines and phosphothreonines on target proteins plays a central role in the regulation of many cellular processes.  PPPs belong to the metallophosphatase (MPP) superfamily.  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets.  This domain is thought to allow for productive metal coordination.	377
277363	cd07419	MPP_Bsu1_C	Arabidopsis thaliana Bsu1 phosphatase and related proteins, C-terminal metallophosphatase domain. Bsu1 encodes a nuclear serine-threonine protein phosphatase found in plants and protozoans.  Bsu1 has a C-terminal phosphatase domain and an N-terminal Kelch-repeat domain.  Bsu1 is preferentially expressed in elongating plant cells. It modulates the phosphorylation state of Bes1, a transcriptional regulator phosphorylated by the glycogen synthase kinase Bin2, as part of a steroid hormone signal transduction pathway.  The PPP (phosphoprotein phosphatase) family, to which Bsu1 belongs, is one of two known protein phosphatase families specific for serine and threonine.  The PPP family also includes: PP1, PP2A, PP2B (calcineurin), PP4, PP5, PP6, PP7, Bsu1, RdgC, PrpE, PrpA/PrpB, and ApA4 hydrolase. The PPP catalytic domain is defined by three conserved motifs (-GDXHG-, -GDXVDRG- and -GNHE-).  The PPP enzyme family is ancient with members found in all eukaryotes, and in most bacterial and archeal genomes.  Dephosphorylation of phosphoserines and phosphothreonines on target proteins plays a central role in the regulation of many cellular processes.  PPPs belong to the metallophosphatase (MPP) superfamily.  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets.  This domain is thought to allow for productive metal coordination.	311
277364	cd07420	MPP_RdgC	Drosophila melanogaster RdgC and related proteins, metallophosphatase domain. RdgC (retinal degeneration C) is a serine-threonine protein phosphatase that is required to prevent light-induced retinal degeneration.  In addition to its catalytic domain, RdgC has two C-terminal EF hands.  Homologs of RdgC include the human phosphatases protein phosphatase with EF hands 1 and -2 (PPEF-1 and -2).  PPEF-1 transcripts are present at low levels in the retina, PPEF-2 transcripts and PPEF-2 protein are present at high levels in photoreceptors.  The PPP (phosphoprotein phosphatase) family, to which RdgC belongs, is one of two known protein phosphatase families specific for serine and threonine.  The PPP family also includes: PP1, PP2A, PP2B (calcineurin), PP4, PP5, PP6, PP7, Bsu1, PrpE, PrpA/PrpB, and ApA4 hydrolase. The PPP catalytic domain is defined by three conserved motifs (-GDXHG-, -GDXVDRG- and -GNHE-).  The PPP enzyme family is ancient with members found in all eukaryotes, and in most bacterial and archeal genomes.  Dephosphorylation of phosphoserines and phosphothreonines on target proteins plays a central role in the regulation of many cellular processes.  PPPs belong to the metallophosphatase (MPP) superfamily.  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets.  This domain is thought to allow for productive metal coordination.	297
163664	cd07421	MPP_Rhilphs	Rhilph phosphatases, metallophosphatase domain. Rhilphs (Rhizobiales/ Rhodobacterales/ Rhodospirillaceae-like phosphatases) are a phylogenetically distinct group of PPP (phosphoprotein phosphatases), found only in land plants. They are named for their close relationship to PPP phosphatases from alpha-Proteobacteria, including Rhizobiales, Rhodobacterales and Rhodospirillaceae.  The PPP (phosphoprotein phosphatase) family, to which the Rhilphs belong, is one of two known protein phosphatase families specific for serine and threonine.  The PPP family also includes: PP1, PP2A, PP2B (calcineurin), PP4, PP5, PP6, PP7, Bsu1, RdgC, PrpE, PrpA/PrpB, and ApA4 hydrolase. The PPP catalytic domain is defined by three conserved motifs (-GDXHG-, -GDXVDRG- and -GNHE-).  The PPP enzyme family is ancient with members found in all eukaryotes, and in most bacterial and archeal genomes.  Dephosphorylation of phosphoserines and phosphothreonines on target proteins plays a central role in the regulation of many cellular processes.  PPPs belong to the metallophosphatase (MPP) superfamily.  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets.  This domain is thought to allow for productive metal coordination.	304
277365	cd07422	MPP_ApaH	Escherichia coli ApaH and related proteins, metallophosphatase domain. ApaH (also known as symmetrically cleaving Ap4A hydrolase and bis(5'nucleosyl)-tetraphosphatase) is a bacterial member of the PPP (phosphoprotein phosphatase) family of serine/threonine phosphatases that hydrolyzes the nucleotide-signaling molecule diadenosine tetraphosphate (Ap(4)A) into two ADP and also hydrolyzes Ap(5)A, Gp(4)G, and other extending compounds.  Null mutations in apaH result in high intracellular levels of Ap(4)A which correlate with multiple phenotypes, including a decreased expression of catabolite-repressible genes, a reduction in the expression of flagellar operons, and an increased sensitivity to UV and heat.  Ap4A hydrolase is important in responding to heat shock and oxidative stress via regulating the concentration of Ap4A in bacteria.  Ap4A hydrolase is also thought to play a role in siderophore production, but the mechanism by which ApaH interacts with siderophore pathways is unknown.  The PPP (phosphoprotein phosphatase) family, to which ApaH belongs, is one of two known protein phosphatase families specific for serine and threonine.  The PPP family also includes: PP1, PP2A, PP2B (calcineurin), PP4, PP5, PP6, PP7, Bsu1, RdgC, PrpE, and PrpA/PrpB. The PPP catalytic domain is defined by three conserved motifs (-GDXHG-, -GDXVDRG- and -GNHE-).  The PPP enzyme family is ancient with members found in all eukaryotes, and in most bacterial and archeal genomes.  Dephosphorylation of phosphoserines and phosphothreonines on target proteins plays a central role in the regulation of many cellular processes.  PPPs belong to the metallophosphatase (MPP) superfamily.  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets.  This domain is thought to allow for productive metal coordination.	257
277366	cd07423	MPP_Prp_like	Bacillus subtilis PrpE and related proteins, metallophosphatase domain. PrpE (protein phosphatase E) is a bacterial member of the PPP (phosphoprotein phosphatase) family of serine/threonine phosphatases and a key signal transduction pathway component controlling the expression of spore germination receptors GerA and GerK in Bacillus subtilis. PrpE is closely related to ApaH (also known as symmetrical Ap(4)A hydrolase and bis(5'nucleosyl)-tetraphosphatase).  PrpE has specificity for phosphotyrosine only, unlike the serine/threonine phosphatases to which it is related. The Bacilli members of this family are single domain proteins while the other members have N- and C-terminal domains in addition to this phosphatase domain. Pnkp is the end-healing and end-sealing component of an RNA repair system present in bacteria. It is composed of three catalytic modules: an N-terminal polynucleotide 5' kinase, a central 2',3' phosphatase, and a C-terminal ligase. Pnkp is a Mn(2+)-dependent phosphodiesterase-monoesterase that dephosphorylates 2',3'-cyclic phosphate RNA ends. An RNA binding site is suggested by a continuous tract of positive surface potential flanking the active site. The PPP (phosphoprotein phosphatase) family, to which PrpE belongs, is one of two known protein phosphatase families specific for serine and threonine.  The PPP family also includes: PP1, PP2A, PP2B (calcineurin), PP4, PP5, PP6, PP7, Bsu1, RdgC, PrpA/PrpB, and ApA4 hydrolase. The PPP catalytic domain is defined by three conserved motifs (-GDXHG-, -GDXVDRG- and -GNHE-).  The PPP enzyme family is ancient with members found in all eukaryotes, and in most bacterial and archeal genomes.  Dephosphorylation of phosphoserines and phosphothreonines on target proteins plays a central role in the regulation of many cellular processes.  PPPs belong to the metallophosphatase (MPP) superfamily.  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets.  This domain is thought to allow for productive metal coordination.	235
277367	cd07424	MPP_PrpA_PrpB	PrpA and PrpB, metallophosphatase domain. PrpA and PrpB are bacterial type I serine/threonine and tyrosine phosphatases thought to modulate the expression of proteins that protect the cell upon accumulation of misfolded proteins in the periplasm.  The PPP (phosphoprotein phosphatase) family, to which PrpA and PrpB belong, is one of two known protein phosphatase families specific for serine and threonine.  This family also includes: PP1, PP2A, PP2B (calcineurin), PP4, PP5, PP6, PP7, Bsu1, RdgC, PrpE, and ApA4 hydrolase. The PPP catalytic domain is defined by three conserved motifs (-GDXHG-, -GDXVDRG- and -GNHE-).  The PPP enzyme family is ancient with members found in all eukaryotes, and in most bacterial and archeal genomes.  Dephosphorylation of phosphoserines and phosphothreonines on target proteins plays a central role in the regulation of many cellular processes.  PPPs belong to the metallophosphatase (MPP) superfamily.  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets.  This domain is thought to allow for productive metal coordination.	201
277368	cd07425	MPP_Shelphs	Shewanella-like phosphatases, metallophosphatase domain. This family includes bacterial, eukaryotic, and archeal proteins orthologous to the Shewanella cold-active protein-tyrosine phosphatase, CAPTPase.  CAPTPase is an uncharacterized protein that belongs to the Shelph (Shewanella-like phosphatase) family of PPP (phosphoprotein phosphatases).  The PPP family is one of two known protein phosphatase families specific for serine and threonine.  In addition to Shelphs, the PPP family also includes: PP1, PP2A, PP2B (calcineurin), PP4, PP5, PP6, PP7, Bsu1, RdgC, PrpE, PrpA/PrpB, and ApA4 hydrolase. The PPP catalytic domain is defined by three conserved motifs (-GDXHG-, -GDXVDRG- and -GNHE-).  The PPP enzyme family is ancient with members found in all eukaryotes, and in most bacterial and archeal genomes.  Dephosphorylation of phosphoserines and phosphothreonines on target proteins plays a central role in the regulation of many cellular processes.  PPPs belong to the metallophosphatase (MPP) superfamily.  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets.  This domain is thought to allow for productive metal coordination.	209
143631	cd07429	Cby_like	Chibby, a nuclear inhibitor of Wnt/beta-catenin mediated transcription, and similar proteins. Chibby(Cby) is a well-conserved nuclear protein that functions as part of the Wnt/beta-catenin signaling pathway. Specifically, Cby binds directly to beta-catenin by interacting with its central region, which harbors armadillo repeats. Cby-beta-catenin interactions may also involve 14-3-3 proteins. By competing with other binding partners of beta-catenin, the Tcf/Lef transcription factors, Cby inhibits transcriptional activation. Cby has been shown to play a role in adipocyte differentiation. The C-terminal region of Cby appears to contain an alpha-helical coiled-coil motif.	108
143632	cd07430	GH15_N	Glycoside hydrolase family 15, N-terminal domain. Members of this family are N-terminal domains uniquely found in bacterial and archaeal glucoamylases and glucodextranases. Glucoamylase (glucan 1,4-alpha-glucosidase; 4-alpha-D-glucan glucohydrolase; amyloglucosidase; exo-1,4-alpha-glucosidase; gamma-amylase; lysosomal alpha-glucosidase; EC 3.2.1.3) hydrolyzes alpha-1,4-glucosidic linkages of starch, glycogen and malto-oligosaccharides, releasing beta-D-glucose from the non-reducing end. Glucodextranase (glucan 1,6-alpha-glucosidase; exo-1,6-alpha-glucosidase; EC 3.2.1.70) uses an inverting reaction mechanism to hydrolyze alpha-1,6-glucosidic linkages of dextran and related oligosaccharides, releasing beta-D-glucose from the non-reducing end. These N-terminal domains adopt a structure consisting of antiparallel beta-strands, divided into two beta-sheets, with one sheet wrapped by an extended polypeptide, which appears to stabilize the domain. The function of these domains in the enzymes is as yet unknown. However, it is suggested that domain N of bacterial GA is involved in folding and/or the thermostability of the A domain that forms an (alpha/alpha)6-barrel structure.	260
213986	cd07431	PHP_PolIIIA	Polymerase and Histidinol Phosphatase domain of alpha-subunit of bacterial polymerase III. PolIIIAs that contain an N-terminal PHP domain have been classified into four basic groups based on genome composition, phylogenetic, and domain structural analysis: polC, dnaE1, dnaE2, and dnaE3. The PHP (also called histidinol phosphatase-2/HIS2) domain is associated with several types of DNA polymerases, such as PolIIIA and family X DNA polymerases, stand alone histidinol phosphate phosphatases (HisPPases), and a number of uncharacterized protein families. DNA polymerase III holoenzyme is one of the five eubacterial DNA polymerases that is responsible for the replication of the DNA duplex. The alpha subunit of DNA polymerase III core enzyme catalyzes the reaction for polymerizing both DNA strands. The PolIIIA PHP domain has four conserved sequence motifs and contains an invariant histidine that is involved in metal ion coordination, and like other PHP structures, exhibits a distorted (beta/alpha) 7 barrel and coordinates up to 3 metals. Initially, it was proposed that PHP region might be involved in pyrophosphate hydrolysis, but such activity has not been found. It has been shown that the PHP domain of PolIIIA has a trinuclear metal complex and is capable of proofreading activity.	179
213987	cd07432	PHP_HisPPase	Polymerase and Histidinol Phosphatase domain of Histidinol phosphate phosphatase. HisPPase catalyzes the eighth step of histidine biosynthesis, in which L-histidinol phosphate undergoes dephosphorylation to produce histidinol. HisPPase can be classified into two types: the bifunctional HisPPase found in proteobacteria that belongs to the DDDD superfamily and the monofunctional Bacillus subtilis type that is a member of the PHP family. The PHP (also called histidinol phosphatase-2/HIS2) domain is associated with several types of DNA polymerases, such as PolIIIA and family X DNA polymerases, stand alone histidinol phosphate phosphatases (HisPPases), and a number of uncharacterized protein families. The PHP domain has four conserved sequence motifs and contains an invariant histidine that is involved in metal ion coordination. The PHP domain of HisPPase is structurally homologous to other members of the PHP family that have a distorted (beta/alpha)7 barrel fold with a trinuclear metal site on the C-terminal side of the barrel.	129
213988	cd07433	PHP_PolIIIA_DnaE1	Polymerase and Histidinol Phosphatase domain of alpha-subunit of bacterial polymerase III DnaE1. PolIIIAs that contain an N-terminal PHP domain have been classified into four basic groups based on genome composition, phylogenetic, and domain structural analysis: polC, dnaE1, dnaE2, and dnaE3. The PHP (also called histidinol phosphatase-2/HIS2) domain is associated with several types of DNA polymerases, such as PolIIIA and family X DNA polymerases, stand alone histidinol phosphate phosphatases (HisPPases), and a number of uncharacterized protein families. DNA polymerase III holoenzyme is one of the five eubacterial DNA polymerases that are responsible for the replication of the DNA duplex. PolIIIA core enzyme catalyzes the reaction for polymerizing both DNA strands. dnaE1 is the longest compared to dnaE2 and dnaE3. A unique motif was also identified in dnaE1 and dnaE3 genes.	277
213989	cd07434	PHP_PolIIIA_DnaE2	Polymerase and Histidinol Phosphatase domain of alpha-subunit of bacterial polymerase III at DnaE2 gene. PolIIIA DnaE2 plays a role in SOS mutagenesis/translesion synthesis and has dominant effects in determining GC variability in the bacterial genome. PolIIIAs that contain an N-terminal PHP domain have been classified into four basic groups based on genome composition, phylogenetic, and domain structural analysis: polC, dnaE1, dnaE2, and dnaE3. The PHP (also called histidinol phosphatase-2/HIS2) domain is associated with several types of DNA polymerases, such as PolIIIA and family X DNA polymerases, stand alone histidinol phosphate phosphatases (HisPPases), and a number of uncharacterized protein families. DNA polymerase III holoenzyme is one of the five eubacterial DNA polymerases that are responsible for the replication of the DNA duplex. PolIIIA core enzyme catalyzes the reaction for polymerizing both DNA strands. PolC PHP is located in a different location compared to dnaE1, 2, and 3. dnaE1 is the longest compared to dnaE2 and dnaE3. A unique motif was also identified in dnaE1 and dnaE3 genes. The PHP domain has four conserved sequence motifs and contains an invariant histidine that is involved in metal ion coordination. PHP domains found in DnaEs of thermophilic origin exhibit 3'-5' exonuclease activity.	260
213990	cd07435	PHP_PolIIIA_POLC	Polymerase and Histidinol Phosphatase domain of alpha-subunit of bacterial polymerase III at PolC gene. DNA polymerase III alphas (PolIIIAs) that contain a PHP domain have been classified into four basic groups based on phylogenetic and domain structural analyses: polC, dnaE1, dnaE2, and dnaE3. The PolC group is distinct from the other three and is clustered together. The PHP (also called histidinol phosphatase-2/HIS2) domain is associated with several types of DNA polymerases, such as PolIIIA and family X DNA polymerases, stand alone histidinol phosphate phosphatases (HisPPases), and a number of uncharacterized protein families. DNA polymerase III holoenzyme is one of the five eubacterial DNA polymerases that are responsible for the replication of the DNA duplex. The alpha subunit of DNA polymerase III core enzyme catalyzes the reaction for polymerizing both DNA strands. PolC PHP is located in a different location compared to dnaE1, 2, and 3. The PHP domain has four conserved sequence motifs and contains an invariant histidine that is involved in metal ion coordination. The PHP domain of PolC is structurally homologous to other members of the PHP family that have a distorted (beta/alpha)7 barrel fold with a trinuclear metal site on the C-terminal side of the barrel. PHP domains found in dnaEs of thermophilic origin exhibit 3'-5' exonuclease activity. In contrast, PolC PHP lacks detectable nuclease activity.	268
213991	cd07436	PHP_PolX	Polymerase and Histidinol Phosphatase domain of bacterial polymerase X. The bacterial/archaeal X-family DNA polymerases (PolXs) have a PHP domain at their C-terminus. The bacterial/archaeal PolX core domain and PHP domain interact with each other and together are involved in metal dependent 3'-5' exonuclease activity. The PHP (also called histidinol phosphatase-2/HIS2) domain is associated with several types of DNA polymerases, such as PolIIIA and family X DNA polymerases, stand alone histidinol phosphate phosphatases (HisPPases), and a number of uncharacterized protein families. The PHP domain has four conserved sequence motifs and contains an invariant histidine that is involved in metal ion coordination. PolX is found in all kingdoms; however, bacterial PolXs have a completely different domain structure from eukaryotic PolXs. Bacterial PolX has an extended conformation in contrast to the common closed 'right hand' conformation for DNA polymerases. This extended conformation is stabilized by the PHP domain. The PHP domain of PolX is structurally homologous to other members of the PHP family that have a distorted (beta/alpha)7 barrel fold with a trinuclear metal site on the C-terminal side of the barrel.	237
213992	cd07437	PHP_HisPPase_Ycdx_like	Polymerase and Histidinol Phosphatase domain of Ycdx like. PHP Ycdx-like is a stand alone PHP domain similar to Ycdx E. coli protein with an unknown physiological role. The PHP (also called histidinol phosphatase-2/HIS2) domain is associated with several types of DNA polymerases, such as PolIIIA and family X DNA polymerases, stand alone histidinol phosphate phosphatases (HisPPases), and a number of uncharacterized protein families. The PHP domain has four conserved sequence motifs and contains an invariant histidine that is involved in metal ion coordination. It has also been shown that the PHP domain functions in DNA repair. The PHP structures have a distorted (beta/alpha)7 barrel fold with a trinuclear metal site on the C-terminal side of the barrel. YcdX may be involved in swarming.	233
213993	cd07438	PHP_HisPPase_AMP	Polymerase and Histidinol Phosphatase domain of Histidinol phosphate phosphatase (HisPPase) AMP bound. The PHP domain of this HisPPase family has an unknown function. It has a second domain inserted in the middle that binds adenosine 5-monophosphate (AMP). The PHP (also called histidinol phosphatase-2/HIS2) domain is associated with several types of DNA polymerases, such as PolIIIA and family X DNA polymerases, stand alone histidinol phosphate phosphatases (HisPPases), and a number of uncharacterized protein families. HisPPase catalyzes the eighth step of histidine biosynthesis, in which L-histidinol phosphate undergoes dephosphorylation to give histidinol. The PHP domain has four conserved sequence motifs and contains an invariant histidine that is involved in metal ion coordination. The PHP domain of HisPPase is structurally homologous to the other members of the PHP family that have a distorted (beta/alpha)7 barrel fold with a trinuclear metal site on the C-terminal side of the barrel.	155
143633	cd07439	FANCE_c-term	Fanconi anemia complementation group E protein, C-terminal domain. Fanconi Anemia (FA) is an autosomal recessive disorder associated with increased susceptibility to various cancers, bone marrow failure, cardiac, renal, and limb malformations, and other characteristics. Cells are highly sensitive to DNA damaging agents. A multi-subunit protein complex, the FA core complex, is responsible for ubiquitination of the protein FANCD2 in response to DNA damage. This monoubiquitination results in a downstream effect on homology-directed DNA repair. FANCE is part of the FA core complex and its C-terminal domain, which is modeled here, has been shown to directly interact with FANCD2. The domain contains a five-fold repeat of a structural unit similar to ARM and HEAT repeats. FANCE appears conserved in metazoa and in plants.	254
188659	cd07440	RGS	Regulator of G protein signaling (RGS) domain superfamily. The RGS domain is an essential part of the Regulator of G-protein Signaling (RGS) protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. While inactive, G-alpha-subunits bind GDP, which is released and replaced by GTP upon agonist activation. GTP binding leads to dissociation of the alpha-subunit and the beta-gamma-dimer, allowing them to interact with effectors molecules and propagate signaling cascades associated with cellular growth, survival, migration, and invasion. Deactivation of the G-protein signaling controlled by the RGS domain accelerates GTPase activity of the alpha subunit by hydrolysis of GTP to GDP, which results in the reassociation of the alpha-subunit with the beta-gamma-dimer and thereby inhibition of downstream activity. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS proteins are also involved in apoptosis and cell proliferation, as well as modulation of cardiac development. Several RGS proteins can fine-tune immune responses, while others play important roles in neuronal signals modulation. Some RGS proteins are principal elements needed for proper vision.	113
143550	cd07441	CRD_SFRP3	Cysteine-rich domain of the secreted frizzled-related protein 3 (SFRP3, alias FRZB), a Wnt antagonist. The cysteine-rich domain (CRD) is an essential part of the secreted frizzled-related protein 3 (SFRP3, alias FRZB), which plays important roles in embryogenesis and postnatal development as an antagonist of Wnt proteins, key players in a number of fundamental cellular processes. SFRPs antagonize the activation of Wnt signaling by binding to the CRD domains of frizzled proteins (Fz), thereby preventing Wnt proteins from binding to these receptors. SFRPs are also known to have functions unrelated to Wnt, as enhancers of procollagen cleavage by the TLD proteinases. SFRPs and Fz proteins both contain CRD domains, but SFRPs lack the seven-pass transmembrane domain which is an integral part of Fzs. SFRP3 regulates Wnt signaling activity in bone development and homeostasis. It is also involved in the control of planar cell polarity.	126
143551	cd07442	CRD_SFRP4	Cysteine-rich domain of the secreted frizzled-related protein 4 (SFRP4), a Wnt antagonist. The cysteine-rich domain (CRD) is an essential part of the secreted frizzled-related protein 4 (SFRP4), which regulates the activity of Wnt proteins, key players in a number of fundamental cellular processes such as embryogenesis and postnatal development. SFRPs antagonize the activation of Wnt signaling by binding to the CRD domains of frizzled (Fz) proteins, thereby preventing Wnt proteins from binding to these receptors. SFRPs are also known to have functions unrelated to Wnt, as enhancers of procollagen cleavage by the TLD proteinases. SFRPs and Fz proteins both contain CRD domains, but SFRPs lack the seven-pass transmembrane domain which is an integral part of Fzs.	127
143552	cd07443	CRD_SFRP1	Cysteine-rich domain of the secreted frizzled-related protein 1 (SFRP1), a regulator of Wnt activity. The cysteine-rich domain (CRD) is an essential part of the secreted frizzled-related protein 1 (SFRP1), which regulates the activity of Wnt proteins, key players in a number of fundamental cellular processes such as embryogenesis and postnatal development. SFRPs antagonize the activation of Wnt signaling by binding to the CRD domains of frizzled (Fz) proteins, thereby preventing Wnt proteins from binding to these receptors. SFRPs are also known to have functions unrelated to Wnt, as enhancers of procollagen cleavage by the TLD proteinases. SFRPs and Fz proteins both contain CRD domains, but SFRPs lack the seven-pass transmembrane domain which is an integral part of Fzs. SFRP1 is expressed in many tissues and is involved in the regulation of Wnt signaling in osteoblasts, leading to enhanced trabecular bone formation in adults; it has also been shown to control the growth of retinal ganglion cell axons and the elongation of the antero-posterior axis.	124
143553	cd07444	CRD_SFRP5	Cysteine-rich domain of the secreted frizzled-related protein 5 (SFRP5), a regulator of Wnt activity. The cysteine-rich domain (CRD) is an essential part of the secreted frizzled-related Protein 5 (SFRP5), which regulates the activity of Wnt proteins, key players in a number of fundamental cellular processes such as embryogenesis and postnatal development. SFRPs antagonize the activation of Wnt signaling by binding to the CRD domains of frizzled (Fz) proteins, thereby preventing Wnt proteins from binding to these receptors. SFRPs are also known to have functions unrelated to Wnt, as enhancers of procollagen cleavage by the TLD proteinases. SFRPs and Fz proteins both contain CRD domains, but SFRPs lack the seven-pass transmembrane domain which is an integral part of Fzs.	127
143554	cd07445	CRD_corin_1	One of two cysteine-rich domains of the corin protein, a type II transmembrane serine protease. The cysteine-rich domain (CRD) is an essential component of corin, a type II transmembrane serine protease which functions as the convertase of the pro-atrial natriuretic peptide (pro-ANP) in the heart. Corin contains two CRDs in its extracellular region, which play an important role in recognition of the physiological substrate, pro-ANP. This model characterizes the first (N-terminal) CRD.	130
143555	cd07446	CRD_SFRP2	Cysteine-rich domain of the secreted frizzled-related protein 2 (SFRP2), a regulator of Wnt activity. The cysteine-rich domain (CRD) is an essential part of the secreted frizzled-related protein 2 (SFRP2), which regulates the activity of Wnt proteins, key players in a number of fundamental cellular processes such as embryogenesis and postnatal development. SFRPs antagonize the activation of Wnt signaling by binding to CRD domains of frizzled (Fz) proteins, thereby preventing Wnt proteins from binding to these receptors. SFRPs and Fz proteins both contain CRD domains, but SFRPs lack the seven-pass transmembrane domain which is an integral part of Fzs. As a Wnt antagonist, SFRP2 regulates Nkx2.2 expression in the ventral spinal cord and anteroposterior axis elongation. SFRP2 also has a Wnt-independent function as an enhancer of procollagen cleavage by the TLD proteinases. SFRP2 binds both procollagen and TLD, thus facilitating the enzymatic reaction by bringing together the proteinase and its substrate.	128
143556	cd07447	CRD_Carboxypeptidase_Z	Cysteine-rich domain of carboxypeptidase Z, a member of the carboxypeptidase E family. The cysteine-rich domain (CRD) is an essential part of carboxypeptidase Z (CPZ), a member of the carboxypeptidase E family of metallocarboxypeptidases. This is a group of Zn-dependent enzymes implicated in the intra- and extracellular processing of proteins. Carboxypeptidase Z removes C-terminal basic amino acid residues from its substrates, particularly arginine. The CRD acts as a ligand-binding domain for Wnts involved in developmental processes. CPZ binds and may process Wnt-4; CPZ has also been found to enhance the induction of the homeobox gene Cdx1. During vertebrate embryogenesis, the CRD of CPZ upregulates Pax3, a Wnt reporter gene essential for patterning of somites and limb development.	128
143557	cd07448	CRD_FZ4	Cysteine-rich Wnt-binding domain of the frizzled 4 (Fz4) receptor. The cysteine-rich domain (CRD) is an essential extracellular portion of the frizzled 4 (Fz4) receptor, and is required for binding Wnt proteins, which play fundamental roles in many aspects of early development, such as cell and tissue polarity, neural synapse formation, and the regulation of proliferation. Fz proteins serve as Wnt receptors for multiple signal transduction pathways, including both beta-catenin dependent and -independent cellular signaling, as well as the planar cell polarity pathway and the Ca(2+) modulating signaling pathway. CRD containing Fzs have been found in diverse species from amoebas to mammals. 10 different frizzled proteins are found in vertebrata. Frizzled 4 (Fz4) activates the Ca(2+)/calmodulin-dependent protein kinase II and protein kinase C of the Wnt/Ca(2+) signaling pathway during retinal angiogenesis. Mutations in Fz4 lead to familial exudative vitreoretinopathy (FEVR), a hereditary ocular disorder characterized by failure of the peripheral retinal vascularization. In addition, the interplay between Fz4 and norrin as a receptor-ligand pair plays an important role in vascular development in the retina and inner ear in a Wnt-independent manner.	126
143558	cd07449	CRD_FZ3	Cysteine-rich Wnt-binding domain (CRD) of the frizzled 3 (Fz3) receptor. The cysteine-rich domain (CRD) is an essential extracellular portion of the frizzled 3 (Fz3) receptor, and is required for binding Wnt proteins, which play fundamental roles in many aspects of early development, such as cell and tissue polarity, neural synapse formation, and the regulation of proliferation. Fz proteins serve as Wnt receptors for multiple signal transduction pathways, including both beta-catenin dependent and -independent cellular signaling, as well as the planar cell polarity pathway and Ca(2+) modulating signaling pathway. CRD containing Fzs have been found in diverse species from amoebas to mammals. 10 different frizzled proteins are found in vertebrata. Fz3 plays a vital role in the anterior-posterior guidance of commissural axons. Knockout mice without Fz3 show defects in fiber tracts in the rostral CNS.	127
143559	cd07450	CRD_FZ6	Cysteine-rich Wnt-binding domain (CRD) of the frizzled 6 (Fz6) receptor. The cysteine-rich domain (CRD) is an essential extracellular portion of the frizzled 6 (Fz6) receptor, and is required for binding Wnt proteins, which play fundamental roles in many aspects of early development, such as cell and tissue polarity, neural synapse formation, and the regulation of proliferation. Fz proteins serve as Wnt receptors for multiple signal transduction pathways, including both beta-catenin dependent and -independent cellular signaling, as well as the planar cell polarity pathway and Ca(2+) modulating signaling pathway. CRD containing Fzs have been found in diverse species from amoebas to mammals. 10 different frizzled proteins are found in vertebrata. Frizzled 6 (Fz6) is expressed in the skin and hair follicles and controls hair patterning in mammals using a Fz-dependent tissue polarity system, which is similar to the one that patterns the Drosophila cuticle.	127
143560	cd07451	CRD_SMO	Cysteine-rich domain of the smoothened receptor (Smo) integral membrane protein. The cysteine-rich domain (CRD) is part of the smoothened receptor (Smo), an integral membrane protein and one of the key players in the Hedgehog (Hh) signaling pathway, critical for development, cell growth and migration, as well as stem cell maintenance. The CRD of Smo is conserved in vertebrates and can also be identified in invertebrates. The precise function of the CRD in Smo is unknown. Mutations in the Drosophila CRD disrupt Smo activity in vivo, while deletion of the CRD in mammalian cells does not seem to affect the activity of overexpressed Smo.	132
143561	cd07452	CRD_sizzled	Cysteine-rich domain of the sizzled protein. The cysteine-rich domain (CRD) is an essential part of the sizzled protein, which regulates bone morphogenetic protein (Bmp) signaling by stabilizing chordin, and plays a critical role in the patterning of vertebrate and invertebrate embryos. Sizzled also functions in the ventral region as a Wnt inhibitor and modulates canonical Wnt signaling. Sizzled proteins belong to the secreted frizzled-related protein family (SFRP), and have been identified in the genomes of birds, fishes and frogs, but not mammals.	141
143562	cd07453	CRD_crescent	Cysteine-rich domain of the crescent protein. The cysteine-rich domain (CRD) is an essential part of the crescent protein, a member of the secreted frizzled-related protein (SFRP) family, which regulates convergent extension movements (CEMs) during gastrulation and neurulation. Xenopus laevis crescent efficiently forms inhibitory complexes with Wnt5a and Wnt11, but this effect is cancelled in the presence of another member of the SFRP family, Frzb1. A potential role for Crescent in head formation is to regulate a non-canonical Wnt pathway positively in the adjacent posterior mesoderm, and negatively in the overlying anterior neuroectoderm.	135
143563	cd07454	CRD_LIN_17	Cysteine-rich domain (CRD) of LIN_17. A cysteine-rich domain (CRD) is an essential component of a number of cell surface receptors, which are involved in multiple signal transduction pathways, particularly in modulating the activity of the Wnt proteins, which play a fundamental role in the early development of metazoans. CRD is also found in secreted frizzled related proteins (SFRPs), which lack the transmembrane segment found in the frizzled protein. The CRD domain is also present in the alpha-1 chain of mouse type XVIII collagen, in carboxypeptidase Z, several receptor tyrosine kinases, and the mosaic transmembrane serine protease corin. The CRD domain is well conserved in metazoans - 10 frizzled proteins have been identified in mammals, 4 in Drosophila and 3 in Caenorhabditis elegans. CRD domains have also been identified in multiple tandem copies in a Dictyostelium discoideum protein. Very little is known about the mechanism by which CRD domains interact with their ligands. The domain contains 10 conserved cysteines. The protein lin-17 is involved in cell type specification during Caenorhabditis elegans vulval development.	124
143564	cd07455	CRD_Collagen_XVIII	Cysteine-rich domain of the variant 3 of collagen XVIII (V3C18). The cysteine-rich domain (CRD) is an essential part of the variant 3 of collagen XVIII (V3C18), which regulates major cellular functions such as the differential epithelial morphogenesis of early lung and kidney development. V3C18 is a 170 kD protein, which is proteolytically processed into the CRD-containing 50 kD glycoprotein precursor that binds Wnt3a through its CRD domain and suppresses the Wnt3a-induced stabilization of beta catenin. Full-length V3C18 is unable to inhibit Wnt signaling.	123
143565	cd07456	CRD_FZ5_like	Cysteine-rich Wnt-binding domain (CRD) of receptors similar to frizzled 5. The cysteine-rich domain (CRD) is an essential extracellular portion of the frizzled 5 (Fz5) and frizzled 8 (Fz8) receptors, and similar proteins. This domain is required for binding Wnt proteins, which play fundamental roles in many aspects of early development, such as cell and tissue polarity, neural synapse formation, and the regulation of proliferation. Fz proteins serve as Wnt receptors for multiple signal transduction pathways, including both beta-catenin dependent and -independent cellular signaling, as well as the planar cell polarity pathway and Ca(2+) modulating signaling pathway. The CRD domain is well conserved in metazoans - 10 frizzled proteins have been identified in mammals, 4 in Drosophila and 3 in Caenorhabditis elegans. Very little is known about the mechanism by which CRD domains interact with their ligands. The domain contains 10 conserved cysteines.	120
143566	cd07457	CRD_FZ9_like	Cysteine-rich Wnt-binding domain (CRD) of receptors similar to frizzled 9. The cysteine-rich domain (CRD) is an essential extracellular portion of the frizzled 9 (Fz9) and frizzled 10 (Fz10) receptors, and similar proteins. This domain is required for binding Wnt proteins, which play fundamental roles in many aspects of early development, such as cell and tissue polarity, neural synapse formation, and the regulation of proliferation. Fz proteins serve as Wnt receptors for multiple signal transduction pathways, including both beta-catenin dependent and -independent cellular signaling, as well as the planar cell polarity pathway and Ca(2+) modulating signaling pathway. The CRD domain is well conserved in metazoans - 10 frizzled proteins have been identified in mammals, 4 in Drosophila and 3 in Caenorhabditis elegans. Very little is known about the mechanism by which CRD domains interact with their ligands. The domain contains 10 conserved cysteines.	121
143567	cd07458	CRD_FZ1_like	Cysteine-rich Wnt-binding domain (CRD) of receptors similar to frizzled 1. The cysteine-rich domain (CRD) is an essential extracellular portion of the frizzled 1 (Fz1), frizzled 2 (Fz2), and frizzled 7 (Fz7) receptors, and similar proteins. This domain is required for binding Wnt proteins, which play fundamental roles in many aspects of early development, such as cell and tissue polarity, neural synapse formation, and the regulation of proliferation. Fz proteins serve as Wnt receptors for multiple signal transduction pathways, including both beta-catenin dependent and -independent cellular signaling, as well as the planar cell polarity pathway and Ca(2+) modulating signaling pathway. The CRD domain is well conserved in metazoans - 10 frizzled proteins have been identified in mammals, 4 in Drosophila and 3 in Caenorhabditis elegans. Very little is known about the mechanism by which CRD domains interact with their ligands. The domain contains 10 conserved cysteines.	119
143568	cd07459	CRD_TK_ROR_like	Cysteine-rich domain of tyrosine kinase-like orphan receptors. The cysteine-rich domain (CRD) is an essential part of the tyrosine kinase-like orphan receptor (Ror) proteins, a conserved family of tyrosine kinases that function in various processes, including neuronal and skeletal development, cell polarity, and cell movement. Ror proteins are receptors of Wnt proteins, which are key players in a number of fundamental cellular processes in embryogenesis and postnatal development. In different cellular contexts, Ror proteins can either activate or repress transcription of Wnt target genes, and can modulate Wnt signaling by sequestering Wnt ligands. In addition, a number of Wnt-independent functions have been proposed for both Ror1 and Ror2.	135
143569	cd07460	CRD_FZ5	Cysteine-rich Wnt-binding domain (CRD) of the frizzled 5 (Fz5) receptor. The cysteine-rich domain (CRD) is an essential extracellular portion of the frizzled 5 (Fz5) receptor, and is required for binding Wnt proteins, which play fundamental roles in many aspects of early development, such as cell and tissue polarity, neural synapse formation, and the regulation of proliferation. Fz proteins serve as Wnt receptors for multiple signal transduction pathways, including both beta-catenin dependent and -independent cellular signaling, as well as the planar cell polarity pathway and Ca(2+) modulating signaling pathway. CRD containing Fzs have been found in diverse species from amoebas to mammals. 10 different frizzled proteins are found in vertebrata. Fz5 plays critical regulating roles in the yolk sac and placental angiogenesis, in the maturation of the Paneth cell phenotype, in governing the neural potential of progenitors in the developing retina, and in neuronal survival in the parafascicular nucleus.	127
143570	cd07461	CRD_FZ8	Cysteine-rich Wnt-binding domain (CRD) of the frizzled 8 (Fz8) receptor. The cysteine-rich domain (CRD) is an essential extracellular portion of the frizzled 8 (Fz8) receptor, and is required for binding Wnt proteins, which play fundamental roles in many aspects of early development, such as cell and tissue polarity, neural synapse formation, and the regulation of proliferation. Fz proteins serve as Wnt receptors for multiple signal transduction pathways, including both beta-catenin dependent and -independent cellular signaling, as well as the planar cell polarity pathway and Ca(2+) modulating signaling pathway. CRD containing Fzs have been found in diverse species from amoebas to mammals. 10 different frizzled proteins are found in vertebrata. Xenopus Fz8 is important in Wnt/beta-catenin signaling pathways controlling the transcriptional activation of target genes Siamois and Xnr3 in the animal caps of late blastula.	125
143571	cd07462	CRD_FZ10	Cysteine-rich Wnt-binding domain (CRD) of the frizzled 10 (Fz10) receptor. The cysteine-rich domain (CRD) is an essential extracellular portion of the frizzled 10 (Fz10) receptor, and is required for binding Wnt proteins, which play fundamental roles in many aspects of early development, such as cell and tissue polarity, neural synapse formation, and the regulation of proliferation. Fz proteins serve as Wnt receptors for multiple signal transduction pathways, including both beta-catenin dependent and -independent cellular signaling, as well as the planar cell polarity pathway and Ca(2+) modulating signaling pathway. CRD containing Fzs have been found in diverse species from amoebas to mammals. 10 different frizzled proteins are found in vertebrata. The cellular function of Fz10 is unknown.	127
143572	cd07463	CRD_FZ9	Cysteine-rich Wnt-binding domain (CRD) of the frizzled 9 (Fz9) receptor. The cysteine-rich domain (CRD) is an essential extracellular portion of the frizzled 9 (Fz9) receptor, and is required for binding Wnt proteins, which play fundamental roles in many aspects of early development, such as cell and tissue polarity, neural synapse formation, and the regulation of proliferation. Fz proteins serve as Wnt receptors for multiple signal transduction pathways, including both beta-catenin dependent and -independent cellular signaling, as well as the planar cell polarity pathway and Ca(2+) modulating signaling pathway. CRD containing Fzs have been found in diverse species from amoebas to mammals. 10 different frizzled proteins are found in vertebrata. Fz9 may play a signaling role in lymphoid development and maturation, particularly at points where B cells undergo self-renewal prior to further differentiation.	127
143573	cd07464	CRD_FZ2	Cysteine-rich Wnt-binding domain (CRD) of the frizzled 2 (Fz2) receptor. The cysteine-rich domain (CRD) is an essential extracellular portion of the frizzled 2 (Fz2) receptor, and is required for binding Wnt proteins, which play fundamental roles in many aspects of early development, such as cell and tissue polarity, neural synapse formation, and the regulation of proliferation. Fz proteins serve as Wnt receptors for multiple signal transduction pathways, including both beta-catenin dependent and -independent cellular signaling, as well as the planar cell polarity pathway and Ca(2+) modulating signaling pathway. CRD containing Fzs have been found in diverse species from amoebas to mammals. 10 different frizzled proteins are found in vertebrata. Fz2 is involved in the Wnt/beta-catenin signaling pathway and in the activation of protein kinase C and calcium/calmodulin-dependent protein kinase (CaM kinase).	127
143574	cd07465	CRD_FZ1	Cysteine-rich Wnt-binding domain (CRD) of the frizzled 1 (Fz1) receptor. The cysteine-rich domain (CRD) is an essential extracellular portion of the frizzled 1 (Fz1) receptor, and is required for binding Wnt proteins, which play fundamental roles in many aspects of early development, such as cell and tissue polarity, neural synapse formation, and the regulation of proliferation. Fz proteins serve as Wnt receptors for multiple signal transduction pathways, including both beta-catenin dependent and -independent cellular signaling, as well as the planar cell polarity pathway and Ca(2+) modulating signaling pathway. CRD containing Fzs have been found in diverse species from amoebas to mammals. 10 different frizzled proteins are found in vertebrata.	127
143575	cd07466	CRD_FZ7	Cysteine-rich Wnt-binding domain (CRD) of the frizzled 7 (Fz7) receptor. The cysteine-rich domain (CRD) is an essential extracellular portion of the frizzled 7 (Fz7) receptor, and is required for binding Wnt proteins, which play fundamental roles in many aspects of early development, such as cell and tissue polarity, neural synapse formation, and the regulation of proliferation. Fz proteins serve as Wnt receptors for multiple signal transduction pathways, including both beta-catenin dependent and -independent cellular signaling, as well as the planar cell polarity pathway and Ca(2+) modulating signaling pathway. CRD containing Fzs have been found in diverse species from amoebas to mammals. 10 different frizzled proteins are found in vertebrata. Xenopus Fz7 is important in Wnt/beta-catenin signaling pathways controlling the transcriptional activation of target genes Siamois and Xnr3 in the animal caps of late blastula.	125
143576	cd07467	CRD_TK_ROR1	Cysteine-rich domain of tyrosine kinase-like orphan receptor 1. The cysteine-rich domain (CRD) is an essential part of the tyrosine kinase-like orphan receptor 1 (Ror1), a conserved family of tyrosine kinases that function in various processes, including neuronal and skeletal development, cell polarity, and cell movement. Ror proteins are receptors of Wnt proteins, which are key players in a number of fundamental cellular processes in embryogenesis and postnatal development. In different cellular contexts, Ror proteins can either activate or repress transcription of Wnt target genes, and can modulate Wnt signaling by sequestering Wnt ligands. In addition, a number of Wnt-independent functions have been proposed for both Ror1 and Ror2.	142
143577	cd07468	CRD_TK_ROR2	Cysteine-rich domain of tyrosine kinase-like orphan receptor 2. The cysteine-rich domain (CRD) is an essential part of the tyrosine kinase-like orphan receptor 2 (Ror2), a conserved family of tyrosine kinases that function in various processes, including neuronal and skeletal development, cell polarity, and cell movement. Ror proteins are receptors of Wnt proteins, which are key players in a number of fundamental cellular processes in embryogenesis and postnatal development. In different cellular contexts, Ror proteins can either activate or repress transcription of Wnt target genes, and can modulate Wnt signaling by sequestering Wnt ligands. In addition, a number of Wnt-independent functions have been proposed for both Ror1 and Ror2.	140
143578	cd07469	CRD_TK_ROR_related	Cysteine-rich domain of proteins similar to tyrosine kinase-like orphan receptors. The cysteine-rich domain (CRD) is an essential part of the tyrosine kinase-like orphan receptor (Ror) proteins, a conserved family of tyrosine kinases that function in various processes, including neuronal and skeletal development, cell polarity, and cell movement. Ror proteins are receptors of Wnt proteins, which are key players in a number of fundamental cellular processes in embryogenesis and postnatal development. In different cellular contexts, Ror proteins can either activate or repress transcription of Wnt target genes, and can modulate Wnt signaling by sequestering Wnt ligands.	147
143621	cd07470	CYTH-like_mRNA_RTPase	CYTH-like mRNA triphosphatase (RTPase) component of the mRNA capping apparatus. This subgroup includes fungal and protozoal RTPases. RTPase catalyzes the first step in the mRNA cap formation process, the removal of the gamma-phosphate of triphosphate-terminated pre-mRNA. This activity is metal-dependent. The 5'-end of the resulting mRNA diphosphate is subsequently capped with GMP by RNA guanylyltransferase, and then further modified by one or more methyltransferases. The mRNA cap-forming activity is an essential step in mRNA processing. The RTPases are not conserved among eukarya. The structure and mechanism of this fungal RTPase domain group is different from that of higher eukaryotes. This subgroup belongs to the CYTH/triphosphate tunnel metalloenzyme (TTM)-like superfamily, whose enzymes have a unique active site located within an eight-stranded beta barrel. The RTPase domain of the mimivirus RTPase-GTase fusion mRNA capping enzyme also belongs to this subgroup.	243
213030	cd07472	HmuY_like	Bacterial proteins similar to Porphyromonas gingivalis HmuY and the C-terminal domain of PARMER_03218. HmuY is a hemophore that scavenges heme from infected hosts and delivers it to the outer membrane receptor HmuR. Related but uncharacterized proteins do not appear to share the specific heme-binding site. The C-terminal domain of PARMER_03128, an uncharacterized protein from Parabacteroides merdae, plus related proteins from Bacteroidetes, appear to be a distantly related family and have been included in this model.	157
173799	cd07473	Peptidases_S8_Subtilisin_like	Peptidase S8 family domain in Subtilisin-like proteins. This family is a member of the Peptidases S8 or Subtilases serine endo- and exo-peptidase clan. They have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The stability of subtilases may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values.	259
173800	cd07474	Peptidases_S8_subtilisin_Vpr-like	Peptidase S8 family domain in Vpr-like proteins. The maturation of the peptide antibiotic (lantibiotic) subtilin in Bacillus subtilis ATCC 6633 includes posttranslational modifications of the propeptide and proteolytic cleavage of the leader peptide. Vpr was identified as one of the proteases, along with WprA, that are capable of processing subtilin. Peptidases S8 or Subtilases are a serine endo- and exo-peptidase clan. They have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The stability of subtilases may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values.	295
173801	cd07475	Peptidases_S8_C5a_Peptidase	Peptidase S8 family domain in Streptococcal C5a peptidases. Streptococcal C5a peptidase (SCP), is a highly specific protease and adhesin/invasin.  The subtilisin-like protease domain is located at the N-terminus and contains a protease-associated domain inserted into a loop.  There are three fibronectin type III (Fn) domains at the C-terminus. SCP binds to integrins with the help of Arg-Gly-Asp motifs which are thought to stabilize conformational changes required for substrate binding.  Peptidases S8 or Subtilases are a serine endo- and exo-peptidase clan. They have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The stability of subtilases may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values.	346
173802	cd07476	Peptidases_S8_thiazoline_oxidase_subtilisin-like_protease	Peptidase S8 family domain in Thiazoline oxidase/subtilisin-like proteases. Thiazoline oxidase/subtilisin-like protease is produced by the symbiotic bacteria Prochloron spp. that inhabit didemnid family ascidians.  The cyclic peptides of the patellamide class found in didemnid extracts are now known to be synthesized by the Prochloron spp.  The prepatellamide is heterocyclized to form thiazole and oxazoline rings and the peptide is cleaved to form the two cyclic patellamides A and C.  Subtilases, or subtilisin-like serine proteases, have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure (an example of convergent evolution).	267
173803	cd07477	Peptidases_S8_Subtilisin_subset	Peptidase S8 family domain in Subtilisin proteins. This group is composed of many different subtilisins: Pro-TK-subtilisin, subtilisin Carlsberg, serine protease Pb92 subtilisin, and BPN subtilisins just to name a few. Pro-TK-subtilisin is a serine protease from the hyperthermophilic archaeon Thermococcus kodakaraensis and consists of a signal peptide, a propeptide, and a mature domain. TK-subtilisin is matured from pro-TK-subtilisin upon autoprocessing and degradation of the propeptide. Unlike other subtilisins though, the folding of the unprocessed form of pro-TK-subtilisin is induced by Ca2+ binding which is almost completed prior to autoprocessing. Ca2+ is required for activity unlike the bacterial subtilisins. The propeptide is not required for folding of the mature domain unlike the bacterial subtilases because of the stability produced from Ca2+ binding. Subtilisin Carlsberg is extremely similar in structure to subtilisin BPN'/Novo though it has a 30% difference in amino acid sequence. The substrate binding regions are also similar and 2 possible Ca2+ binding sites have been identified recently. Subtilisin Carlsberg possesses the highest commercial importance as a proteolytic additive for detergents. Serine protease Pb92, the serine protease from the alkalophilic Bacillus strain PB92, also contains two calcium ions and the overall folding of the polypeptide chain closely resembles that of the subtilisins. Members of the peptidases S8 and S53 clan include endopeptidases, exopeptidases and also a tripeptidyl-peptidase. The S8 family has an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The S53 family contains a catalytic triad Glu/Asp/Ser. The stability of these enzymes may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values.	229
173804	cd07478	Peptidases_S8_CspA-like	Peptidase S8 family domain in CspA-like proteins. GSP (germination-specific protease) converts the spore peptidoglycan hydrolase (SleC) precursor to an active enzyme during germination of Clostridium perfringens S40 spores. Analysis of an enzyme fraction of GSP showed that it was composed of a gene cluster containing the processed forms of products of cspA, cspB, and cspC which are positioned in a tandem array just upstream of the 5' end of sleC. The amino acid sequences deduced from the nucleotide sequences of the csp genes showed significant similarity and showed a high degree of homology with those of the catalytic domain and the oxyanion binding region of subtilisin-like serine proteases. Members of the peptidases S8 and S53 clan include endopeptidases, exopeptidases and also a tripeptidyl-peptidase. The S8 family has an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The S53 family contains a catalytic triad Glu/Asp/Ser. The stability of these enzymes may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values.	455
173805	cd07479	Peptidases_S8_SKI-1_like	Peptidase S8 family domain in SKI-1-like proteins. SKI-1 (type I membrane-bound subtilisin-kexin-isoenzyme) proteins are secretory Ca2+-dependent serine proteinases that cleave at nonbasic residues: Thr, Leu, and Lys. SKI-1s play a critical role in the regulation of cholesterol synthesis and metabolism and of fatty acid metabolism. Members of the peptidases S8 and S53 clan include endopeptidases, exopeptidases and also a tripeptidyl-peptidase. The S8 family has an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The S53 family contains a catalytic triad Glu/Asp/Ser. The stability of these enzymes may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values.	255
173806	cd07480	Peptidases_S8_12	Peptidase S8 family domain, uncharacterized subfamily 12. This family is a member of the Peptidases S8 or Subtilases serine endo- and exo-peptidase clan. They have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The stability of subtilases may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values.	297
173807	cd07481	Peptidases_S8_BacillopeptidaseF-like	Peptidase S8 family domain in BacillopeptidaseF-like proteins. Bacillus subtilis produces and secretes proteases and other types of exoenzymes at the end of the exponential phase of growth. The ones that make up this group are known as bacillopeptidase F, encoded by bpr, a serine protease with high esterolytic activity which is inhibited by PMSF. Like other members of the peptidases S8 family, these have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The stability of these enzymes may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity.	264
173808	cd07482	Peptidases_S8_Lantibiotic_specific_protease	Peptidase S8 family domain in Lantibiotic (lanthionine-containing antibiotics) specific proteases. Lantibiotic (lanthionine-containing antibiotics) specific proteases are very similar in structure to serine proteases. Lantibiotics are ribosomally synthesised antimicrobial agents derived from ribosomally synthesised peptides with antimicrobial activities against Gram-positive bacteria. The proteases that cleave the N-terminal leader peptides from lantibiotics include: epiP, nsuP, mutP, and nisP. EpiP, from Staphylococcus, is thought to cleave matured epidermin. NsuP, a dehydratase from Streptococcus and NisP, a membrane-anchored subtilisin-like serine protease from Lactococcus cleave nisin. MutP is highly similar to epiP and nisP and is thought to process the prepeptide mutacin III of S. mutans. Members of the peptidases S8 (subtilisin and kexin) and S53 (sedolisin) clan include endopeptidases and exopeptidases. The S8 family has an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. Serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The S53 family contains a catalytic triad Glu/Asp/Ser with an additional acidic residue Asp in the oxyanion hole, similar to that of subtilisin. The serine residue here is the nucleophilic equivalent of the serine residue in the S8 family, while glutamic acid has the same role here as the histidine base. However, the aspartic acid residue that acts as an electrophile is quite different. In S53 it follows glutamic acid, while in S8 it precedes histidine. The stability of these enzymes may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. There is a great diversity in the characteristics of their members: some contain disulfide bonds, some are intracellular while others are extracellular, some function at extreme temperatures, and others at high or low pH values.	294
173809	cd07483	Peptidases_S8_Subtilisin_Novo-like	Peptidase S8 family domain in Subtilisin_Novo-like proteins. Subtilisins are a group of alkaline proteinases originating from different strains of Bacillus subtilis. Novo is one of the strains that produced enzymes belonging to this group. The enzymes obtained from the Novo and BPN' strains are identical. The Carlsberg and Novo subtilisins are thought to have arisen from a common ancestral protein. They have similar peptidase and esterase activities, pH profiles, catalyze transesterification reactions, and are both inhibited by diisopropyl fluorophosphate, though they differ in 85 positions in the amino acid sequence. Members of the peptidases S8 and S53 clan include endopeptidases, exopeptidases and also a tripeptidyl-peptidase. The S8 family has an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The S53 family contains a catalytic triad Glu/Asp/Ser with an additional acidic residue Asp in the oxyanion hole, similar to that of subtilisin. The stability of these enzymes may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values.	291
173810	cd07484	Peptidases_S8_Thermitase_like	Peptidase S8 family domain in Thermitase-like proteins. Thermitase is a non-specific, trypsin-related serine protease with a very high specific activity. It contains a subtilisin-like domain. The tertiary structure of thermitase is similar to that of subtilisin BPN'. It contains an Asp/His/Ser catalytic triad. Members of the peptidases S8 (subtilisin and kexin) and S53 (sedolisin) clan include endopeptidases and exopeptidases. The S8 family has an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. Serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The S53 family contains a catalytic triad Glu/Asp/Ser with an additional acidic residue Asp in the oxyanion hole, similar to that of subtilisin. The serine residue here is the nucleophilic equivalent of the serine residue in the S8 family, while glutamic acid has the same role here as the histidine base. However, the aspartic acid residue that acts as an electrophile is quite different. In S53 it follows glutamic acid, while in S8 it precedes histidine. The stability of these enzymes may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. There is a great diversity in the characteristics of their members: some contain disulfide bonds, some are intracellular while others are extracellular, some function at extreme temperatures, and others at high or low pH values.	260
173811	cd07485	Peptidases_S8_Fervidolysin_like	Peptidase S8 family domain in Fervidolysin. Fervidolysin found in Fervidobacterium pennivorans is an extracellular subtilisin-like keratinase. It contains a signal peptide, a propeptide, and a catalytic region. The tertiary structure of fervidolysin is similar to that of subtilisin. It contains an Asp/His/Ser catalytic triad and is a member of the peptidase S8 (subtilisin and kexin) family. The catalytic triad is similar to that found in trypsin-like proteases, but it does not share their three-dimensional structure and is not homologous to trypsin. Serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The S53 family contains a catalytic triad Glu/Asp/Ser with an additional acidic residue Asp in the oxyanion hole, similar to that of subtilisin. The serine residue here is the nucleophilic equivalent of the serine residue in the S8 family, while glutamic acid has the same role here as the histidine base. However, the aspartic acid residue that acts as an electrophile is quite different. In S53, it follows glutamic acid, while in S8 it precedes histidine. The stability of these enzymes may be enhanced by calcium; some members have been shown to bind up to 4 ions via binding sites with different affinity. There is a great diversity in the characteristics of their members: some contain disulfide bonds, some are intracellular while others are extracellular, some function at extreme temperatures, and others at high or low pH values.	273
173812	cd07487	Peptidases_S8_1	Peptidase S8 family domain, uncharacterized subfamily 1. This family is a member of the Peptidases S8 or Subtilases serine endo- and exo-peptidase clan. They have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The stability of subtilases may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values.	264
173813	cd07488	Peptidases_S8_2	Peptidase S8 family domain, uncharacterized subfamily 2. This family is a member of the Peptidases S8 or Subtilases serine endo- and exo-peptidase clan. They have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The stability of subtilases may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values.	247
173814	cd07489	Peptidases_S8_5	Peptidase S8 family domain, uncharacterized subfamily 5. This family is a member of the Peptidases S8 or Subtilases serine endo- and exo-peptidase clan. They have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The stability of subtilases may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values.	312
173815	cd07490	Peptidases_S8_6	Peptidase S8 family domain, uncharacterized subfamily 6. This family is a member of the Peptidases S8 or Subtilases serine endo- and exo-peptidase clan. They have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The stability of subtilases may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values.	254
173816	cd07491	Peptidases_S8_7	Peptidase S8 family domain, uncharacterized subfamily 7. This family is a member of the Peptidases S8 or Subtilases serine endo- and exo-peptidase clan. They have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The stability of subtilases may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values.	247
173817	cd07492	Peptidases_S8_8	Peptidase S8 family domain, uncharacterized subfamily 8. This family is a member of the Peptidases S8 or Subtilases serine endo- and exo-peptidase clan. They have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The stability of subtilases may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values.	222
173818	cd07493	Peptidases_S8_9	Peptidase S8 family domain, uncharacterized subfamily 9. This family is a member of the Peptidases S8 or Subtilases serine endo- and exo-peptidase clan. They have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The stability of subtilases may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values.	261
173819	cd07494	Peptidases_S8_10	Peptidase S8 family domain, uncharacterized subfamily 10. This family is a member of the Peptidases S8 or Subtilases serine endo- and exo-peptidase clan. They have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The stability of subtilases may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values.	298
173820	cd07496	Peptidases_S8_13	Peptidase S8 family domain, uncharacterized subfamily 13. This family is a member of the Peptidases S8 or Subtilases serine endo- and exo-peptidase clan. They have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The stability of subtilases may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values.	285
173821	cd07497	Peptidases_S8_14	Peptidase S8 family domain, uncharacterized subfamily 14. This family is a member of the Peptidases S8 or Subtilases serine endo- and exo-peptidase clan. They have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The stability of subtilases may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values.	311
173822	cd07498	Peptidases_S8_15	Peptidase S8 family domain, uncharacterized subfamily 15. This family is a member of the Peptidases S8 or Subtilases serine endo- and exo-peptidase clan. They have an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. The stability of subtilases may be enhanced by calcium, some members have been shown to bind up to 4 ions via binding sites with different affinity. Some members of this clan contain disulfide bonds. These enzymes can be intra- and extracellular, some function at extreme temperatures and pH values.	242
319802	cd07499	HAD_CBAP	molecular class B acid phosphatases, similar to Escherichia coli AphA. class B acid phosphatases (CBAPs) have been detected in a minority of bacterial species which include a number of major pathogens such as Escherichia coli, Haemophilus influenzae, and Streptococcus pyogenes. This family includes the CBAP Escherichia coli AphA. The purified enzyme is a broad-spectrum nucleotidase highly active against both 3'- and 5'-mononucleotides and monodeoxynucleotides, which can also act as a phosphotransferase. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	185
319803	cd07500	HAD_PSP	phosphoserine phosphatase (PSP), similar to Methanococcus jannaschii PSP and Saccharomyces cerevisiae SER2p. This family includes Methanococcus jannaschii PSP, and Saccharomyces cerevisiae phosphoserine phosphatase SER2p, EC 3.1.3.3, which participates in a pathway whereby serine and glycine are synthesized from the glycolytic intermediate 3-phosphoglycerate; phosphoserine phosphatase catalyzes the hydrolysis of phospho-L-serine to L-serine and inorganic phosphate, the third reaction in this pathway. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	180
319804	cd07501	HAD_MDP-1_like	eukaryotic hypothetical phosphotyrosine phosphatase MDP-1 and related phosphatases, similar to Bacillus cereus phosphonoacetaldehyde hydrolase and Streptomyces FkbH. This family includes eukaryotic magnesium-dependent phosphatase-1 (MDP-1) which is most likely a phosphotyrosine phosphatase catalyzing the dephosphorylation of tyrosine-phosphorylated proteins, Bacillus cereus phosphonoacetaldehyde hydrolase (phosphonatase) which catalyzes the hydrolysis of phosphonoacetaldehyde to acetaldehyde and phosphate using Mg(II) as cofactor, and sequences annotated as FkbH including BafAIV, an FkbH-like protein from Streptomyces griseus encoded in ORF12 of the bafilomycin synthesis gene cluster. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	129
319805	cd07502	HAD_PNKP-C	C-terminal phosphatase domain of T4 polynucleotide kinase/phosphatase (PNKP) and related phosphatases. This family includes the C-terminal domain of the bifunctional enzyme T4 polynucleotide kinase/phosphatase, PNKP. The PNKP phosphatase domain can catalyze the hydrolytic removal of the 3'-phosphoryl of DNA, RNA and deoxynucleoside 3'-monophosphates. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	145
319806	cd07503	HAD_HisB-N	histidinol phosphate phosphatase and related phosphatases. This family includes the N-terminal domain of the Escherichia coli bifunctional enzyme histidinol-phosphate phosphatase/imidazole-glycerol-phosphate dehydratase, HisB. The N-terminal histidinol-phosphate phosphatase domain catalyzes the dephosphorylation of histidinol phosphate, the eighth step of L-histidine biosynthesis. This family also includes Escherichia coli GmhB phosphatase which is highly specific for D-glycero-D-manno-heptose-1,7-bisphosphate; it removes the C(7) phosphate and not the C(1) phosphate, and this is the third essential step of lipopolysaccharide heptose biosynthesis. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	142
319807	cd07504	HAD_5NT	haloacid dehalogenase (HAD)-like 5'-nucleotidases similar to human cytosolic IIIA and IIIB. 5'-nucleotidases dephosphorylate nucleoside 5'-monophosphates. This family includes human 5'-nucleotidase, cytosolic IIIA (cN-IIIA, previously called cN-III; NT5C3A), the main pyrimidine 5'-nucleotidase in erythrocytes which dephosphorylates the pyrimidine nucleotides CMP, UMP, TMP, and the purine 7-methylguanosine monophosphate (m7GMP), and possesses phosphotransferase activity. It also includes human 5'-nucleotidase, cytosolic IIIB (cN-IIIB; NT5C3B) which has a strong preference for m7GMP, dephosphorylates CMP and UMP and, with significantly lower efficiency, GMP and AMP, and can also act as a phosphotransferase. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	273
319808	cd07505	HAD_BPGM-like	beta-phosphoglucomutase-like family of the haloacid dehalogenase-like (HAD) hydrolase superfamily. This family represents the beta-phosphoglucomutase-like family of the haloacid dehalogenase-like (HAD) hydrolase superfamily. Family members include Lactococcus lactis beta-PGM, a mutase which catalyzes the interconversion of beta-D-glucose 1-phosphate (G1P) and D-glucose 6-phosphate (G6P), Saccharomyces cerevisiae phosphatases GPP1 and GPP2 that dephosphorylate DL-glycerol-3-phosphate and DOG1 and DOG2 that dephosphorylate 2-deoxyglucose-6-phosphate, and Escherichia coli 6-phosphogluconate phosphatase YieH. It belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	143
319809	cd07506	HAD_like	uncharacterized family of the haloacid dehalogenase-like (HAD) hydrolase superfamily. The haloacid dehalogenase-like (HAD) hydrolases are a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members include 2-L-haloalkanoic acid dehalogenase (C-Cl bond hydrolysis), azetidine hydrolase (C-N bond hydrolysis); phosphonoacetaldehyde hydrolase (C-P bond hydrolysis), phosphoserine phosphatase and phosphomannomutase (CO-P bond hydrolysis), P-type ATPases (PO-P bond hydrolysis) and many others. Members are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	115
319810	cd07507	HAD_Pase	haloacid dehalogenase-like superfamily phosphatase similar to Pyrococcus horikoshii mannosyl-3-phosphoglycerate phosphatase and Persephonella marina glucosyl-3-phosphoglycerate phosphatase. This family includes Pyrococcus horikoshii and Thermus thermophilus HB27 mannosyl-3-phosphoglycerate phosphatases (MpgPs) which catalyze the dephosphorylation of alpha-mannosyl-3-phosphoglycerate (MPG) to produce alpha-mannosylglycerate (MG), and Persephonella marina glucosyl-3-phosphoglycerate phosphatase (GpgP) which catalyzes the dephosphorylation of glucosyl-3-phosphoglycerate (GPG) to produce glucosylglycerate (GG). It also includes Methanococcoides burtonii MpgP protein which is able to dephosphorylate GPG to GG, and MPG to MG. Similar flexibilities in substrate specificity have been confirmed in vitro for the MpgPs from Thermus thermophilus and Pyrococcus horikoshii. Screens with natural substrates have not yet detected activity for another member, Escherichia coli YedP. Members of this family belong to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	255
319811	cd07508	HAD_Pase_UmpH-like	haloacid dehalogenase-like superfamily phosphatases, UmpH/NagD family. Phosphatases in this UmpH/NagD family include Escherichia coli UmpH UMP phosphatase/NagD nucleotide phosphatase, Mycobacterium tuberculosis Rv1692 glycerol 3-phosphate phosphatase, human PGP phosphoglycolate phosphatase, Schizosaccharomyces pombe PHO2 p-nitrophenylphosphatase, Bacillus AraL a putative sugar phosphatase, and Plasmodium falciparum para nitrophenyl phosphate phosphatase PNPase. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	270
319812	cd07509	HAD_PPase	inorganic pyrophosphatase similar to a human phospholysine phosphohistidine inorganic pyrophosphate phosphatase (LHPP). LHPP hydrolyzes nitrogen-phosphorus bonds in phospholysine, phosphohistidine and imidodiphosphate as well as oxygen-phosphorus bonds in inorganic pyrophosphate in vitro. This family also includes human haloacid dehalogenase like hydrolase domain containing 2 protein (HDHD2), a phosphatase which may be involved in polygenic hypertension. Members of this family belong to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	248
319813	cd07510	HAD_Pase_UmpH-like	UmpH/NagD family phosphatase, similar to human PGP phosphoglycolate phosphatase and Schizosaccharomyces pombe PHO2 p-nitrophenylphosphatase. This subfamily includes the phosphoglycolate phosphatases (human PGP and Arabidopsis thaliana PGLP2) and p-nitrophenylphosphatases (Schizosaccharomyces pombe PHO2 and Saccharomyces PHO13p). It belongs to the UmpH/NagD phosphatase family, and to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	282
319814	cd07511	HAD_like	uncharacterized family of the haloacid dehalogenase-like (HAD) hydrolase superfamily, similar to the uncharacterized human CECR5 (cat eye syndrome critical region protein 5). This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	136
319815	cd07512	HAD_PGPase	haloacid dehalogenase-like superfamily phosphoglycolate phosphatase, similar to Rhodobacter sphaeroides CbbZ. Phosphoglycolate phosphatase catalyzes the dephosphorylation of phosphoglycolate; its activity requires divalent cations, especially Mg++. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	214
319816	cd07514	HAD_Pase	phosphatase, similar to Thermoplasma acidophilum TA0175 phosphoglycolate phosphatase (PGPase), and Pyrococcus horikoshii PH1421, a magnesium-dependent phosphatase; belongs to the haloacid dehalogenase-like superfamily. Thermoplasma acidophilum TA0175 phosphoglycolate phosphatase (PGPase) catalyzes the magnesium-dependent dephosphorylation of phosphoglycolate. This family also includes Pyrococcus horikoshii OT3 PH1421, a magnesium-dependent phosphatase. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	139
319817	cd07515	HAD-like	uncharacterized family of the haloacid dehalogenase-like (HAD) hydrolase superfamily. The haloacid dehalogenase-like (HAD) hydrolases are a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members include 2-L-haloalkanoic acid dehalogenase (C-Cl bond hydrolysis), azetidine hydrolase (C-N bond hydrolysis); phosphonoacetaldehyde hydrolase (C-P bond hydrolysis), phosphoserine phosphatase and phosphomannomutase (CO-P bond hydrolysis), P-type ATPases (PO-P bond hydrolysis) and many others. Members are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	131
319818	cd07516	HAD_Pase	phosphatase, similar to Escherichia coli Cof and Thermotoga maritima TM0651; belongs to the haloacid dehalogenase-like superfamily. Escherichia coli Cof is involved in the hydrolysis of HMP-PP (4-amino-2-methyl-5-hydroxymethylpyrimidine pyrophosphate, an intermediate in thiamin biosynthesis); Cof also has phosphatase activity against the coenzymes pyridoxal phosphate (PLP) and FMN. Thermotoga maritima TM0651 acts as a phosphatase with a phosphorylated carbohydrate molecule as a possible substrate. Escherichia coli YbhA is also a member of this family and catalyzes the dephosphorylation of PLP; YbhA can also hydrolyze erythrose-4-phosphate and fructose-1,6-bis-phosphate. Members of this family belong to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	253
319819	cd07517	HAD_HPP	phosphatase, similar to Bacteroides thetaiotaomicron VPI-5482 BT4131 hexose phosphate phosphatase; belongs to the haloacid dehalogenase-like superfamily. Bacteroides thetaiotaomicron VPI-5482 BT4131 is a phosphatase with preference for hexose phosphates. In addition this family includes uncharacterized Bacillus subtilis YkrA, a putative phosphatase and uncharacterized Streptococcus pyogenes MGAS10394 a putative bifunctional phosphatase/peptidyl-prolyl cis-trans isomerase. Members of this family belong to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	213
319820	cd07518	HAD_YbiV-Like	Escherichia coli YbiV sugar phosphatase/phosphotransferase and related proteins; belongs to the haloacid dehalogenase-like superfamily. Escherichia coli YbiV can act as both a sugar phosphatase and as a phosphotransferase. Members of this family belong to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	184
319821	cd07519	HAD_PTase	hydrolase domain of the bifunctional HAD hydrolase/UbiA family prenyltransferase proteins and related domains; belongs to the haloacid dehalogenase-like superfamily. This family includes bifunctional enzymes that have both an N-terminal HAD hydrolase domain and a C-terminal UbiA family prenyltransferase domain. The haloacid dehalogenase-like (HAD) hydrolases are a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members include 2-L-haloalkanoic acid dehalogenase (C-Cl bond hydrolysis), azetidine hydrolase (C-N bond hydrolysis); phosphonoacetaldehyde hydrolase (C-P bond hydrolysis), phosphoserine phosphatase and phosphomannomutase (CO-P bond hydrolysis), P-type ATPases (PO-P bond hydrolysis) and many others. Members are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases. Many characterized members of the UbiA prenyltransferase family are aromatic prenyltransferases (PTases) and play an important role in the biosynthesis of heme, chlorophyll, vitamin E, and vitamin K. PTases catalyze the regioselective transfer of prenyl moieties onto a wide variety of substrates and play an important role in many biosynthetic pathways.	105
319822	cd07520	HAD_like	uncharacterized family of the haloacid dehalogenase-like (HAD) hydrolase superfamily. The haloacid dehalogenase-like (HAD) hydrolases are a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members include 2-L-haloalkanoic acid dehalogenase (C-Cl bond hydrolysis), azetidine hydrolase (C-N bond hydrolysis); phosphonoacetaldehyde hydrolase (C-P bond hydrolysis), phosphoserine phosphatase and phosphomannomutase (CO-P bond hydrolysis), P-type ATPases (PO-P bond hydrolysis) and many others. Members are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	144
319823	cd07521	HAD_FCP1-like	human CTD phosphatase subunit 1 (CTDP1/FCP1) and related proteins; belongs to the haloacid dehalogenase-like superfamily. Human CTDP1/FCP1 is a protein phosphatase which dephosphorylates the phosphorylated C terminus (CTD) of RNA polymerase II. CTD phosphorylation is a key mechanism of regulation of gene expression in eukaryotes. CTDP1/FCP1 may have other roles in transcription regulation independent of its phosphatase activity. This family also includes human translocase of inner mitochondrial membrane 50 (TIMM50), CTD small phosphatase like (CTDSPL) and CTD small phosphatase like 2 (CTDSPL2), Saccharomyces cerevisiae (nuclear envelope morphology protein 1) Nem1p, and Xenopus Dullard. Yeast Nem1p in complex with Spo7p dephosphorylates the nuclear membrane-associated phosphatidic acid phosphatase, Smp2p, which may be part of a signaling cascade playing a role in nuclear membrane biogenesis. Xenopus Dullard is a potential regulator of neural tube development. Members of this family belong to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	134
319824	cd07522	HAD_cN-II	cytosolic 5'-nucleotidase II (cN-II) similar to human NT5DC1 (5'-nucleotidase domain-containing protein 1) and NT5DC2. Cytosolic 5'-nucleotidase II (cN-II), also known as purine 5'-nucleotidase, IMP-GMP specific nucleotidase, or high Km 5'-nucleotidase, catalyzes the dephosphorylation of 6-hydroxypurine nucleoside monophosphates. It is ubiquitously expressed and likely to play an important role in the regulation of purine nucleotide interconversions and in the regulation of IMP and GMP pools within the cell. It also acts as a phosphotransferase, catalyzing the reverse reaction, the transfer of a phosphate from a monophosphate substrate to a nucleoside acceptor, to form a nucleoside monophosphate. The nucleoside acceptor is preferentially inosine and deoxyinosine; phosphate donors include any 6-hydroxypurine monophosphate substrate of the nucleotidase reaction. Both the dephosphorylation and phosphotransferase reactions are allosterically activated by adenine-based nucleotides and 2,3-bisphosphoglycerate. Members of this family belong to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	352
319825	cd07523	HAD_YsbA-like	uncharacterized family of the haloacid dehalogenase-like superfamily, similar to the uncharacterized Lactococcus lactis YsbA. The specific function of Lactococcus lactis YsbA is unknown. Members of this family belong to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	173
319826	cd07524	HAD_Pase	phosphatase, similar to Bacillus subtilis MtnX; belongs to the haloacid dehalogenase-like superfamily. Bacillus subtilis recycles two toxic byproducts of polyamine metabolism, methylthioadenosine and methylthioribose, into methionine by a salvage pathway. The sixth reaction in this pathway is catalyzed by B. subtilis MtnX: the dephosphorylation of 2-hydroxy-3-keto-5-methylthiopentenyl-1-phosphate (HK-MTP-1-P) into 1,2-dihydroxy-3-keto-5-methylthiopentene. The hydrolysis of HK-MTP-1-P is a two-step mechanism involving the formation of a transiently phosphorylated aspartyl intermediate. Members of this family belong to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	211
319827	cd07525	HAD_like	uncharacterized family of the haloacid dehalogenase-like (HAD) hydrolase superfamily. The haloacid dehalogenase-like (HAD) hydrolases are a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members include 2-L-haloalkanoic acid dehalogenase (C-Cl bond hydrolysis), azetidine hydrolase (C-N bond hydrolysis); phosphonoacetaldehyde hydrolase (C-P bond hydrolysis), phosphoserine phosphatase and phosphomannomutase (CO-P bond hydrolysis), P-type ATPases (PO-P bond hydrolysis) and many others. Members are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	253
319828	cd07526	HAD_BPGM_like	subfamily of beta-phosphoglucomutase-like family, similar to Escherichia coli 6-phosphogluconate phosphatase YieH. This subfamily includes Escherichia coli YieH/HAD3, a 6-phosphogluconate phosphatase, which can hydrolyze purines and pyrimidines as secondary substrates. It belongs to the beta-phosphoglucomutase-like family whose other members include Lactococcus lactis beta-PGM, a mutase which catalyzes the interconversion of beta-D-glucose 1-phosphate (G1P) and D-glucose 6-phosphate (G6P), Saccharomyces cerevisiae phosphatases GPP1 and GPP2 that dephosphorylate DL-glycerol-3-phosphate, and DOG1 and DOG2 that dephosphorylate 2-deoxyglucose-6-phosphate. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	141
319829	cd07527	HAD_ScGPP-like	subfamily of beta-phosphoglucomutase-like family, similar to Saccharomyces cerevisiae DL-glycerol-3-phosphate phosphatase (GPP1p/Rhr2p and GPP2p/HOR2p) and 2-deoxyglucose-6-phosphate phosphatase (DOG1p and DOG2p). This subfamily includes Saccharomyces cerevisiae DL-glycerol-3-phosphate phosphatase (GPP1p/Rhr2p and GPP2p/HOR2p) and 2-deoxyglucose-6-phosphate phosphatase (DOG1p and DOG2p). GPP1p and GPP2p are involved in glycerol biosynthesis; GPP1 is induced in response to both anaerobic and hyperosmotic stress, GPP2 is induced in response to hyperosmotic or oxidative stress, and during diauxic shift; overexpression of DOG1 or DOG2 confers 2-deoxyglucose resistance. These belong to the beta-phosphoglucomutase-like family whose other members include Lactococcus lactis beta-PGM, a mutase which catalyzes the interconversion of beta-D-glucose 1-phosphate (G1P) and D-glucose 6-phosphate (G6P), and Escherichia coli 6-phosphogluconate phosphatase YieH. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	205
319830	cd07528	HAD_CbbY-like	subfamily of beta-phosphoglucomutase-like family, similar to Rhodobacter sphaeroides xylulose-1,5-bisphosphate phosphatase CbbY. This family includes Rhodobacter sphaeroides and Arabidopsis thaliana xylulose-1,5-bisphosphate phosphatase CbbY which convert xylulose-1,5-bisphosphate (a potent inhibitor of Ribulose-1,5-bisphosphate carboxylase/oxygenase, Rubisco), to the non-inhibitory compound xylulose-5-phosphate. It belongs to the beta-phosphoglucomutase-like family whose other members include Lactococcus lactis beta-PGM, a mutase which catalyzes the interconversion of beta-D-glucose 1-phosphate (G1P) and D-glucose 6-phosphate (G6P), Saccharomyces cerevisiae phosphatases GPP1 and GPP2 that dephosphorylate DL-glycerol-3-phosphate and DOG1 and DOG2 that dephosphorylate 2-deoxyglucose-6-phosphate, and Escherichia coli 6-phosphogluconate phosphatase YieH. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	199
319831	cd07529	HAD_AtGPP-like	subfamily of beta-phosphoglucomutase-like family, similar to Arabidopsis thaliana Gpp1 and Gpp2. This subfamily includes Arabidopsis thaliana AtGpp1 and AtGpp2, and Drosophila GS1-like protein (Dmel\Gs1l) of unknown function. AtGpp1 and AtGpp2 are constitutively expressed in all the Arabidopsis tissues and unaffected under abiotic stress. Overexpression of AtGpp2 in transgenic Arabidopsis plants increases the specific DL-glycerol-3-phosphatase activity and improves the plants tolerance to salt, osmotic and oxidative stress. It belongs to the beta-phosphoglucomutase-like family whose other members include Lactococcus lactis beta-PGM, a mutase which catalyzes the interconversion of beta-D-glucose 1-phosphate (G1P) and D-glucose 6-phosphate (G6P), Saccharomyces cerevisiae phosphatases GPP1 and GPP2 that dephosphorylate DL-glycerol-3-phosphate and DOG1 and DOG2 that dephosphorylate 2-deoxyglucose-6-phosphate, and Escherichia coli 6-phosphogluconate phosphatase YieH. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	192
319832	cd07530	HAD_Pase_UmpH-like	UmpH/NagD family phosphatase, similar to Escherichia coli UmpH UMP phosphatase/NagD nucleotide phosphatase and Mycobacterium tuberculosis Rv1692 glycerol 3-phosphate phosphatase. Escherichia coli UmpH/NagD is a ribonucleoside tri-, di-, and monophosphatase with a preference for purines, it shows peak activity with UMP and functions in UMP-degradation. It is also an effective phosphatase with AMP, GMP and CMP. Mycobacterium tuberculosis phosphatase, Rv1692 is a glycerol 3-phosphate phosphatase. Rv1692 is the final enzyme involved in glycerophospholipid recycling/catabolism.  This subfamily belongs to the UmpH/NagD phosphatase family, and to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	247
319833	cd07531	HAD_Pase_UmpH-like	UmpH/NagD family phosphatase, similar to Bacillus AraL phosphatase, a putative sugar phosphatase. Bacillus subtilis AraL is a phosphatase displaying activity towards different sugar phosphate substrates; it is encoded by the arabinose metabolic operon araABDLMNPQ-abfA and may play a role in the dephosphorylation of substrates related to l-arabinose metabolism. This subfamily belongs to the UmpH/NagD phosphatase family, and to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	252
319834	cd07532	HAD_PNPase_UmpH-like	UmpH/NagD family phosphatase para nitrophenyl phosphate phosphatase, similar to Plasmodium falciparum PNPase. Plasmodium falciparum para nitrophenyl phosphate phosphatase (PNPase) catalyzes the dephosphorylation of thiamine monophosphate to thiamine; other substrates on which it is active are nucleotides, phosphorylated sugars, pyridoxal-5-phosphate, and paranitrophenyl phosphate. This subfamily belongs to the UmpH/NagD phosphatase family, and to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	286
319835	cd07533	HAD_like	uncharacterized family of the haloacid dehalogenase-like (HAD) hydrolase superfamily, similar to Parvibaculum lavamentivorans  HAD-superfamily hydrolase, subfamily IA, variant 1. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	207
319836	cd07534	HAD_CAP	molecular class C acid phosphatases, similar to Haemophilus influenzae e (P4) acid phosphatase; belongs to the haloacid dehalogenase-like hydrolase superfamily. Molecular class C acid phosphatases (CAPs) are nonspecific acid phosphatases with generally broad substrate specificity and optimum activity at neutral to acidic pH. Members include Haemophilus influenzae lipoprotein e (P4), Elizabethkingia meningosepticum OlpA, Helicobacter pylori HppA, Enterobacter sp. 4 acid phosphatase PhoI, and Streptococcus pyogenes M1 GAS LppC. Lipoprotein e (P4) exhibits phosphomonoesterase activity with aryl phosphate substrates including nicotinamide mononucleotide (NMN), tyrosine phosphate, phenyl phosphate, p-nitrophenyl phosphate, and 4-methylumbelliferyl phosphate. The role of P4 in NAD+ uptake appears to be the dephosphorylation of NMN to nicotinamide riboside, which is then taken up by the organism. Elizabethkingia meningosepticum OlpA is a broad-spectrum nucleotidase with preference for 5'-nucleotides, it efficiently hydrolyzes nucleotide monophosphates, with a strong preference for 5'-nucleotides and for 3'-AMP; OlpA can also hydrolyze sugar phosphates and beta-glycerol phosphate, although with a lower efficiency. Helicobacter pylori HppA is also a 5' nucleotidase. Members of this family belong to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	196
319837	cd07535	HAD_VSP	vegetative storage proteins similar to soybean VSPalpha and VSPbeta proteins; belongs to the haloacid dehalogenase-like superfamily. Soybean [Glycine max (L.) Merr.] vegetative storage protein VSPalpha and VSPbeta levels were identified as storage proteins due to their abundance and pattern of expression in plant tissues, they accumulate to almost one-half the amount of soluble leaf protein when soybean plants are continually depodded. They possess acid phosphatase activity which appears to be low compared to several other plant acid phosphatases, it increases in the leaves of depodded soybean plants, but to no more than 0.1% of the total acid phosphatase activity in these leaves. This acid phosphatase activity has maximal activity at pH 5.0 - 5.5, and can liberate Pi from different substrates such as napthyl acid phosphate, carboxyphenyl phosphate, sugar-phosphates, glyceraldehyde 3-phosphate, dihydroxyacetone phosphate, phosphoenolpyruvate, ATP, ADP, PPi, and short chain polyphosphates; they cleave phosphoenolpyruvate, ATP, ADP, PPI, and polyphosphates most efficiently. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Soybean VSPalpha and VSPbeta lack this active site aspartate, other members of this family have this aspartate and may be more active.  Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	186
319838	cd07536	P-type_ATPase_APLT	Aminophospholipid translocases (APLTs), similar to Saccharomyces cerevisiae Dnf1-3p, Drs2p, Neo1p, and human ATP8A2, -9B, -10D, -11B, and -11C. Aminophospholipid translocases (APLTs), also known as type 4 P-type ATPases, act as flippases, and translocate specific phospholipids from the exoplasmic leaflet to the cytoplasmic leaflet of biological membranes. Yeast Dnf1 and Dnf2 mediate the transport of phosphatidylethanolamine, phosphatidylserine, and phosphatidylcholine from the outer to the inner leaflet of the plasma membrane. Mammalian ATP11C may selectively transport PS and PE from the outer leaflet of the plasma membrane to the inner leaflet. The yeast Neo1p localizes to the endoplasmic reticulum and the Golgi complex and plays a role in membrane trafficking within the endomembrane system. Human putative ATPase phospholipid transporting 9B, ATP9B, localizes to the trans-golgi network in a CDC50 protein-independent manner. It also includes Arabidopsis phospholipid flippases including ALA1, and Caenorhabditis elegans flippases, including TAT-1, the latter has been shown to facilitate the inward transport of phosphatidylserine. This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle.	805
319839	cd07538	P-type_ATPase	uncharacterized subfamily of P-type ATPase transporters. This subfamily contains P-type ATPase transporters of unknown function. The P-type ATPases, are a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids. They are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle.  A general characteristic of P-type ATPases is a bundle of transmembrane helices which make up the transport path, and three domains on the cytoplasmic side of the membrane. Members include pumps that transport various light metal ions, such as H(+), Na(+), K(+), Ca(2+), and Mg(2+), pumps that transport indispensable trace elements, such as Zn(2+) and Cu(2+), pumps that remove toxic heavy metal ions, such as Cd2+, and pumps such as aminophospholipid translocases which transport phosphatidylserine and phosphatidylethanolamine.	653
319840	cd07539	P-type_ATPase	uncharacterized subfamily of P-type ATPase transporters. This subfamily contains P-type ATPase transporters of unknown function. The P-type ATPases, are a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids. They are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle.  A general characteristic of P-type ATPases is a bundle of transmembrane helices which make up the transport path, and three domains on the cytoplasmic side of the membrane. Members include pumps that transport various light metal ions, such as H(+), Na(+), K(+), Ca(2+), and Mg(2+), pumps that transport indispensable trace elements, such as Zn(2+) and Cu(2+), pumps that remove toxic heavy metal ions, such as Cd2+, and pumps such as aminophospholipid translocases which transport phosphatidylserine and phosphatidylethanolamine.	634
319841	cd07541	P-type_ATPase_APLT_Neo1-like	Aminophospholipid translocases (APLTs), similar to Saccharomyces cerevisiae Neo1p and human putative APLT, ATP9B. Aminophospholipid translocases (APLTs), also known as type 4 P-type ATPases, act as flippases, and translocate specific phospholipids from the exoplasmic leaflet to the cytoplasmic leaflet of biological membranes. The yeast Neo1 gene is an essential gene; Neo1p localizes to the endoplasmic reticulum and the Golgi complex and plays a role in membrane trafficking within the endomembrane system. Also included in this subfamily is human putative ATPase phospholipid transporting 9B, ATP9B, which localizes to the trans-golgi network in a CDC50 protein-independent manner. Levels of ATP9B, along with levels of other ATPase genes, may contribute to expressivity of and atypical presentations of Hailey-Hailey disease (HHD), and the ATP9B gene has recently been identified as a putative Alzheimer's disease locus. This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle.	792
319842	cd07542	P-type_ATPase_cation	P-type cation-transporting ATPases, similar to human ATPase type 13A2 (ATP13A2) protein and Saccharomyces cerevisiae Ypk9p. Saccharomyces cerevisiae Ypk9p localizes to the yeast vacuole and may play a role in sequestering heavy metal ions; its deletion confers sensitivity for growth for cadmium, manganese, nickel or selenium. Human ATP13A2 (PARK9/CLN12) is a lysosomal transporter with zinc as the possible substrate. Mutation in the ATP13A2 gene has been linked to Parkinson's disease and Kufor-Rakeb syndrome, and to neuronal ceroid lipofuscinoses. ATP13A3/AFURS1 is a candidate gene for oculo auriculo vertebral spectrum (OAVS), being one of nine genes included in a 3q29 microduplication in a patient with OAVS. Mutation in the human ATP13A4 may be involved in a speech-language disorder. This subfamily also includes zebrafish ATP13A2, a lysosome-specific transmembrane ATPase protein of unknown function which plays a crucial role during embryonic development; its deletion is lethal.  This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle.	760
319843	cd07543	P-type_ATPase_cation	P-type cation-transporting ATPases, similar to human cation-transporting ATPase type 13A1 (ATP13A1) and Saccharomyces manganese-transporting ATPase 1 Spf1p. Saccharomyces Spf1p may mediate manganese transport into the endoplasmic reticulum (ER); one consequence of deletion of SPF1 is severe ER stress. This subfamily also includes Arabidopsis thaliana MIA (Male Gametogenesis Impaired Anthers) protein which is highly abundant in the endoplasmic reticulum and small vesicles of developing pollen grains and tapetum cells. The MIA gene functionally complements a mutant in the SPF1 from Saccharomyces cerevisiae. The expression of ATP13A1 has been followed during mouse development, ATP13A1 transcript expression showed an increase as development progressed, with the highest expression at the peak of neurogenesis. This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle.	804
319844	cd07544	P-type_ATPase_HM	P-type heavy metal-transporting ATPase; uncharacterized subfamily. Uncharacterized subfamily of the heavy metal-transporting ATPases (Type IB ATPases) which transport heavy metal ions (Cu(+), Cu(2+), Zn(2+), Cd(2+), Co(2+), etc.) across biological membranes. The characteristic N-terminal heavy metal associated (HMA) domain of this group is essential for the binding of metal ions. This subclass of P-type ATPase is also referred to as CPx-type ATPases because their amino acid sequences contain a characteristic CPC or CPH motif associated with a stretch of hydrophobic amino acids and N-terminal ion-binding sequences. This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle.	596
319845	cd07545	P-type_ATPase_Cd-like	P-type heavy metal-transporting ATPase, similar to Staphylococcus aureus plasmid pI258 CadA, a cadmium-efflux ATPase. CadA from gram-positive Staphylococcus aureus plasmid pI258 is required for full Cd(2+) and Zn(2+) resistance. This subfamily also includes CadA, from the gram-negative bacilli, Stenotrophomonas maltophilia D457R, which is a cadmium efflux pump acquired as part of a cluster of antibiotic and heavy metal resistance genes from gram-positive bacteria. This subclass of P-type ATPase is also referred to as CPx-type ATPases because their amino acid sequences contain a characteristic CPC or CPH motif associated with a stretch of hydrophobic amino acids and N-terminal ion-binding sequences. This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle.	599
319846	cd07546	P-type_ATPase_Pb_Zn_Cd2-like	P-type heavy metal-transporting ATPase, similar to Escherichia coli ZntA which is selective for Pb(2+), Zn(2+), and Cd(2+). Escherichia coli ZntA mediates resistance to toxic levels of selected divalent metal ions. ZntA has the highest selectivity for Pb(2+), followed by Zn(2+) and Cd(2+); it also shows low levels of activity with Cu(2+), Ni(2+), and Co(2+). It is upregulated by the transcription factor ZntR at high zinc concentrations. This subclass of P-type ATPase is also referred to as CPx-type ATPases because their amino acid sequences contain a characteristic CPC or CPH motif associated with a stretch of hydrophobic amino acids and N-terminal ion-binding sequences. This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle.	597
319847	cd07548	P-type_ATPase-Cd_Zn_Co_like	P-type heavy metal-transporting ATPase, similar to Bacillus subtilis CadA which appears to transport cadmium, zinc and cobalt but not copper out of the cell. Bacillus subtilis CadA/YvgW appears to transport cadmium, zinc and cobalt but not copper, out of the cell. Functions in metal ion resistance and cellular metal ion homeostasis. CadA/YvgW is also important for sporulation in B. subtilis, the significant specific expression of the cadA/yvgW gene during the late stage of sporulation, is controlled by forespore-specific sigma factor, sigma G, and mother cell-specific sigma factor, sigma E. This subfamily also includes Helicobacter pylori CadA an essential resistance pump with ion specificity towards Cd(2+), Zn(2+) and Co(2+), and Zn-transporting ATPase, ZiaA(N) in Synechocystis PCC 6803. Transcription of ziaA is induced by Zn under the control of the Zn responsive repressor ZiaR. This subclass of P-type ATPase is also referred to as CPx-type ATPases because their amino acid sequences contain a characteristic CPC or CPH motif associated with a stretch of hydrophobic amino acids and N-terminal ion-binding sequences. This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle.	604
319848	cd07550	P-type_ATPase_HM	P-type heavy metal-transporting ATPase; uncharacterized subfamily. Uncharacterized subfamily of the heavy metal-transporting ATPases (Type IB ATPases) which transport heavy metal ions (Cu(+), Cu(2+), Zn(2+), Cd(2+), Co(2+), etc.) across biological membranes. The characteristic N-terminal heavy metal associated (HMA) domain of this group is essential for the binding of metal ions. This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle.	592
319849	cd07551	P-type_ATPase_HM_ZosA_PfeT-like	P-type heavy metal-transporting ATPase, similar to Bacillus subtilis ZosA/PfeT which transports copper, and perhaps zinc under oxidative stress, and perhaps ferrous iron. Bacillus subtilis ZosA/PfeT (previously known as YkvW) transports copper, it may also transport zinc under oxidative stress and may also be involved in ferrous iron efflux. ZosA/PfeT is expressed under the regulation of the peroxide-sensing repressor PerR. It is involved in competence development. Disruption of the zosA/pfeT gene results in low transformability. This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle.	611
319850	cd07552	P-type_ATPase_Cu-like	P-type heavy metal-transporting ATPase, similar to Archaeoglobus fulgidus CopB, a Cu(2+)-ATPase. Archaeoglobus fulgidus CopB transports Cu(2+) from the cytoplasm to the exterior of the cell using ATP as energy source, it transports preferentially Cu(2+) over Cu(+), it is activated by Cu(2+) with high affinity and partially by Cu(+) and Ag(+). This subclass of P-type ATPase is also referred to as CPx-type ATPases because their amino acid sequences contain a characteristic CPC or CPH motif associated with a stretch of hydrophobic amino acids and N-terminal ion-binding sequences. This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle.	632
319851	cd07553	P-type_ATPase_HM	P-type heavy metal-transporting ATPase; uncharacterized subfamily. Uncharacterized subfamily of the heavy metal-transporting ATPases (Type IB ATPases) which transport heavy metal ions (Cu(+), Cu(2+), Zn(2+), Cd(2+), Co(2+), etc.) across biological membranes. The characteristic N-terminal heavy metal associated (HMA) domain of this group is essential for the binding of metal ions. This subclass of P-type ATPase is also referred to as CPx-type ATPases because their amino acid sequences contain a characteristic CPC or CPH motif associated with a stretch of hydrophobic amino acids and N-terminal ion-binding sequences. This subfamily belongs to the P-type ATPases, a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids, and are distinguished from other main classes of transport ATPases (F- , V- , and ABC- type) by the formation of a phosphorylated (P-) intermediate state in the catalytic cycle.	610
143637	cd07556	Nucleotidyl_cyc_III	Class III nucleotidyl cyclases. Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's).  The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways.	133
143638	cd07557	trimeric_dUTPase	Trimeric dUTP diphosphatases. Trimeric dUTP diphosphatases, or dUTPases, are the most common family of dUTPase, found in bacteria, eukaryotes, and archaea. They catalyze the hydrolysis of the dUTP-Mg complex (dUTP-Mg) into dUMP and pyrophosphate. This reaction is crucial for the preservation of chromosomal integrity as it removes dUTP and therefore reduces the cellular dUTP/dTTP ratio, and prevents dUTP from being incorporated into DNA.  It also provides dUMP as the precursor for dTTP synthesis via the thymidylate synthase pathway. dUTPases are homotrimeric, except some monomeric viral dUTPases, which have been shown to mimic a trimer. Active sites are located at the subunit interface.	92
143471	cd07559	ALDH_ACDHII_AcoD-like	Ralstonia eutrophus NAD+-dependent acetaldehyde dehydrogenase II and Staphylococcus aureus AldA1 (SACOL0154)-like. Included in this CD is the NAD+-dependent, acetaldehyde dehydrogenase II (AcDHII, AcoD, EC=1.2.1.3) from Ralstonia (Alcaligenes) eutrophus H16 involved in the catabolism of acetoin and ethanol, and similar proteins, such as the dimeric dihydrolipoamide dehydrogenase of the acetoin dehydrogenase enzyme system of Klebsiella pneumoniae. Also included are sequences similar to the NAD+-dependent chloroacetaldehyde dehydrogenases (AldA and AldB) of Xanthobacter autotrophicus GJ10 which are involved in the degradation of 1,2-dichloroethane, as well as the uncharacterized aldehyde dehydrogenase from Staphylococcus aureus (AldA1, locus SACOL0154) and other similar sequences.	480
143476	cd07560	Peptidase_S41_CPP	C-terminal processing peptidase; serine protease family S41. The C-terminal processing peptidase (CPP, EC 3.4.21.102) also known as tail-specific protease (tsp), the photosystem II D1 C-terminal processing protease (D1P), and other related S41 protease family members are present in this CD. CPP is synthesized as a precursor form with a carboxyl-terminal extension. It specifically recognizes a C-terminal tripeptide, Xaa-Yaa-Zaa, in which Xaa is preferably Ala or Leu, Yaa is preferably Ala or Tyr and Zaa is preferably Ala, but then cleaves at a variable distance from the C-terminus. The C-terminal carboxylate group is essential, and proteins where this group is amidated are not substrates. This family of proteases contains the PDZ domain that promotes protein-protein interactions and is important for substrate recognition. The active site consists of a serine/lysine catalytic dyad. The bacterial CPP-1 is believed to be important for the degradation of incorrectly synthesized proteins as well as protection from thermal and osmotic stresses. In E. coli, it is involved in the cleavage of a C-terminal peptide of 11 residues from the precursor form of penicillin-binding protein 3 (PBP3). In the plant chloroplast, the enzyme removes the C-terminal extension of the D1 polypeptide of photosystem II, allowing the light-driven assembly of the tetranuclear manganese cluster, which is responsible for photosynthetic water oxidation.	211
143477	cd07561	Peptidase_S41_CPP_like	C-terminal processing peptidase-like; serine protease family S41. Bacterial protease homologs of the S41 family related to C-terminal processing peptidase (CPP). CPP-1 is believed to be important for the degradation of incorrectly synthesized proteins as well as protection from thermal and osmotic stresses. CPP is synthesized with an extension on its carboxyl-terminus and specifically recognizes a C-terminal tripeptide, but cleaves at a variable distance from the C-terminus. The CPP active site consists of a serine/lysine catalytic dyad. Conservation of these residues is seen in the CPP-like proteins of this group. CPP proteins contain a PDZ domain that promotes protein-protein interactions and is important for substrate recognition; however, most CPP-like proteins only have an internal fragment of the PDZ domain or lack it.	256
143478	cd07562	Peptidase_S41_TRI	Tricorn protease; serine protease family S41. The tricorn protease (TRI), a member of the S41 peptidase family and named for its tricorn-like shape, exists only in some archaea and eubacteria. It has been shown to act as a carboxypeptidase, involved in the degradation of proteasomal products to preferentially yield di- and tripeptides, with subsequent and final degradations to free amino acid residues by tricorn interacting factors, F1, F2 and F3. Tricorn is a hexameric D3-symmetric protease of 720kD, and can self-associate further into a giant icosahedral capsid structure containing twenty copies of the complex. Each tricorn peptidase monomer consists of five structural domains: a six-bladed beta-propeller and a seven-bladed beta-propeller that limit access to the active site, the two domains (C1 and C2) that carry the active site residues, and a PDZ-like domain (proposed to be important for substrate recognition) between the C1 and C2 domains. The active site tetrad residues are distributed between the C1 and C2 domains, with serine and histidine on C1 and serine and glutamate on C2.	266
143479	cd07563	Peptidase_S41_IRBP	Interphotoreceptor retinoid-binding protein; serine protease family S41. Interphotoreceptor retinoid-binding protein (IRBP) is a homolog of the S41 protease, C-terminal processing peptidase (CTPase) family. It is thought to facilitate the compartmentalization of the visual cycle that requires poorly soluble and potentially toxic retinoids to cross the aqueous subretinal space between the photoreceptors and the retinal pigment epithelium (RPE). IRBP is secreted by photoreceptors into the interphotoreceptor matrix (IPM) where it is rapidly turned over by a combination of RPE and photoreceptor endocytosis. It is the most abundant soluble protein component of the IPM, consisting of homologous modules, each repeat structure arising through the duplication (as in teleost IRBP) or quadruplication (in tetrapods) of an ancient gene, arisen in the early evolution of the vertebrate eye. IRBP has been shown to promote the release of all-trans retinol from photoreceptors and facilitates its delivery to the RPE. Conversely, IRBP can promote the release of 11-cis-retinal from the RPE, prevent its isomerization in the subretinal space, and transfer it to photoreceptors. In vivo evidence implicates IRBP as a retinoid transporter in the visual cycle, suggesting a critical role for IRBP in cone function essential for human vision. IRBP is a dominant autoimmune antigen in the eye; IRBP proteolysis analysis has proven a useful biomarker for autoimmune uveitis (AU) disorders, a major cause of blindness. This family also includes a chlamydia-secreted protein, designated chlamydia protease-like activity factor (CPAF), known to degrade host proteins, enabling Chlamydia to evade host defenses and replicate.	250
143588	cd07564	nitrilases_CHs	Nitrilases, cyanide hydratase (CH)s, and similar proteins (class 1 nitrilases). Nitrilases (nitrile aminohydrolases, EC:3.5.5.1) hydrolyze nitriles (RCN) to ammonia and the corresponding carboxylic acid. Most nitrilases prefer aromatic nitriles, some prefer arylacetonitriles and others aliphatic nitriles. This group includes the nitrilase cyanide dihydratase (CDH), which hydrolyzes inorganic cyanide (HCN) to produce formate. It also includes cyanide hydratase (CH), which hydrolyzes HCN to formamide. This group includes four Arabidopsis thaliana nitrilases (Ath)NIT1-4. AthNIT1-3 have a strong substrate preference for phenylpropionitrile (PPN) and other nitriles which may originate from the breakdown of glucosinolates. The product of PPN hydrolysis, phenylacetic acid, has auxin activity. AthNIT1-3 can also convert indoleacetonitrile to indole-3-acetic acid (IAA, auxin), but with a lower affinity and velocity. From their expression patterns, it has been speculated that NIT3 may produce IAA during the early stages of germination, and that NIT3 may produce IAA during embryo development and maturation. AthNIT4 has a strong substrate specificity for the nitrile, beta-cyano-L-alanine (Ala(CN)), an intermediate of cyanide detoxification. AthNIT4 has both a nitrilase activity and a nitrile hydratase (NHase) activity, which generate aspartic acid and asparagine respectively from Ala(CN). NHase catalyzes the hydration of nitriles to their corresponding amides. This subgroup belongs to a larger nitrilase superfamily comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13), this subgroup corresponds to class 1.	297
143589	cd07565	aliphatic_amidase	aliphatic amidases (class 2 nitrilases). Aliphatic amidases catalyze the hydrolysis of short-chain aliphatic amides to form ammonia and the corresponding organic acid. This group includes Pseudomonas aeruginosa (Pa) AmiE, the amidase from Geobacillus pallidus RAPc8 (RAPc8 amidase), and Helicobacter pylori (Hp) AmiE and AmiF. PaAmiE and HpAmiE hydrolyze various very short aliphatic amides, including propionamide, acetamide and acrylamide. HpAmiF is a formamidase which specifically hydrolyzes formamide. These proteins belong to a larger nitrilase superfamily comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13), this subgroup corresponds to class 2. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer. HpAmiE, HpAmiF, RAPc8 amidase, and PaAmiE appear to be homohexameric enzymes (trimers of dimers).	291
143590	cd07566	ScNTA1_like	Saccharomyces cerevisiae N-terminal amidase NTA1, and related proteins (class 3 nitrilases). Saccharomyces cerevisiae NTA1 functions in the N-end rule protein degradation pathway. It specifically deaminates the N-terminal asparagine and glutamine residues of substrates of this pathway, to aspartate and glutamate respectively, these latter are the destabilizing residues. This subgroup belongs to a larger nitrilase superfamily comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13), this subgroup corresponds to class 3.	295
143591	cd07567	biotinidase_like	biotinidase and vanins (class 4 nitrilases). These secondary amidases participate in vitamin recycling. Biotinidase (EC 3.5.1.12) has both a hydrolase and a transferase activity. It hydrolyzes free biocytin or small biotinyl-peptides produced during the proteolytic degradation of biotin-dependent carboxylases, to release free biotin (vitamin H), and it can transfer biotin to acceptor molecules such as histones. Biotinidase deficiency in humans is an autosomal recessive disorder characterized by neurological and cutaneous symptoms. This subgroup includes the three human vanins, vanin1-3. Vanins are ectoenzymes; vanin-1 and -2 are membrane associated, and vanin-3 is secreted. They are pantetheinases (EC 3.5.1.92, pantetheine hydrolase), which convert pantetheine to pantothenic acid (vitamin B5) and cysteamine (2-aminoethanethiol, a potent anti-oxidant). They are potential targets for therapeutic intervention in inflammatory disorders. Vanin-1 deficient mice lacking free cysteamine are less susceptible to intestinal inflammation, and expression of vanin-1 and -3 is induced as part of the inflammatory-regenerative differentiation program of human epidermis. This subgroup belongs to a larger nitrilase superfamily comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13), this subgroup corresponds to class 4. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer.	299
143592	cd07568	ML_beta-AS_like	mammalian-like beta-alanine synthase (beta-AS) and similar proteins (class 5 nitrilases). This family includes mammalian-like beta-AS (EC 3.5.1.6, also known as beta-ureidopropionase or N-carbamoyl-beta-alanine amidohydrolase). This enzyme catalyzes the third and final step in the pyrimidine catabolic pathway responsible for the degradation of uracil and thymine, the hydrolysis of N-carbamyl-beta-alanine and N-carbamyl-beta-aminoisobutyrate to the beta-amino acids, beta-alanine and beta-aminoisobutyrate respectively. This family belongs to a larger nitrilase superfamily comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13), this subgroup corresponds to class 5. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer. Beta-ASs from this subgroup are found in various oligomeric states, dimer (human), hexamer (calf liver), decamer (Arabidopsis and Zea mays), and in the case of Drosophila melanogaster beta-AS, as a homooctamer assembled as a left-handed helical turn, with the possibility of higher order oligomers formed by adding dimers at either end. Rat beta-AS changes its oligomeric state (hexamer, trimer, dodecamer) in response to allosteric effectors. Eukaryotic Saccharomyces kluyveri beta-AS belongs to a different superfamily.	287
143593	cd07569	DCase	N-carbamyl-D-amino acid amidohydrolase (DCase, class 6 nitrilases). DCase hydrolyses N-carbamyl-D-amino acids to produce D-amino acids. It is an important biocatalyst in the pharmaceutical industry, producing useful D-amino acids for example in the preparation of beta-lactam antibiotics. This subgroup belongs to a larger nitrilase superfamily comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13), this subgroup corresponds to class 6. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer. Agrobacterium radiobacter DCase forms a tetramer (dimer of dimers). Some DCases may form trimers.	302
143594	cd07570	GAT_Gln-NAD-synth	Glutamine aminotransferase (GAT, glutaminase) domain of glutamine-dependent NAD synthetases (class 7 and 8 nitrilases). Glutamine-dependent NAD synthetases are bifunctional enzymes, which have an N-terminal GAT domain and a C-terminal NAD+ synthetase domain. The GAT domain is a glutaminase (EC 3.5.1.2) which hydrolyses L-glutamine to L-glutamate and ammonia. The ammonia is used by the NAD+ synthetase domain in the ATP-dependent amidation of nicotinic acid adenine dinucleotide. Glutamine aminotransferases are categorized depending on their active site residues into different unrelated classes. This class of GAT domain belongs to a larger nitrilase superfamily comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13), this subgroup corresponds to classes 7 and 8. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer. Mycobacterium tuberculosis glutamine-dependent NAD+ synthetase forms a homooctamer.	261
143595	cd07571	ALP_N-acyl_transferase	Apolipoprotein N-acyl transferase (class 9 nitrilases). ALP N-acyl transferase (Lnt), is an essential membrane-bound enzyme in gram-negative bacteria, which catalyzes the N-acylation of apolipoproteins, the final step in lipoprotein maturation. This is a reverse amidase (i.e. condensation) reaction. This subgroup belongs to a larger nitrilase superfamily comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13), this subgroup corresponds to class 9.	270
143596	cd07572	nit	Nit1, Nit 2, and related proteins, and the Nit1-like domain of NitFhit (class 10 nitrilases). This subgroup includes mammalian Nit1 and Nit2, the Nit1-like domain of the invertebrate NitFhit, and various uncharacterized bacterial and archaeal Nit-like proteins. Nit1 and Nit2 are candidate tumor suppressor proteins. In NitFhit, the Nit1-like domain is encoded as a fusion protein with the non-homologous tumor suppressor, fragile histidine triad (Fhit). Mammalian Nit1 and Fhit may affect distinct signal pathways, and both may participate in DNA damage-induced apoptosis. Nit1 is a negative regulator in T cells. Overexpression of Nit2 in HeLa cells leads to a suppression of cell growth through cell cycle arrest in G2. These Nit proteins and the Nit1-like domain of NitFhit belong to a larger nitrilase superfamily comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13), this subgroup corresponds to class 10.	265
143597	cd07573	CPA	N-carbamoylputrescine amidohydrolase (CPA) (class 11 nitrilases). CPA (EC 3.5.1.53, also known as N-carbamoylputrescine amidase and carbamoylputrescine hydrolase) converts N-carbamoylputrescine to putrescine, a step in polyamine biosynthesis in plants and bacteria. This subgroup includes Arabidopsis thaliana CPA, also known as nitrilase-like 1 (NLP1), and Pseudomonas aeruginosa AguB. This subgroup belongs to a larger nitrilase superfamily comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13), this subgroup corresponds to class 11. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer; P. aeruginosa AguB is a homohexamer, Arabidopsis thaliana NLP1 is a homooctamer.	284
143598	cd07574	nitrilase_Rim1_like	Uncharacterized subgroup of the nitrilase superfamily; some members of this subgroup have an N-terminal RimI domain (class 12 nitrilases). Some members of this subgroup are implicated in post-translational modification, as they contain an N-terminal GCN5-related N-acetyltransferase (GNAT) protein RimI family domain. The nitrilase superfamily is comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13), this subgroup corresponds to class 12. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer.	280
143599	cd07575	Xc-1258_like	Xanthomonas campestris XC1258 and related proteins, members of the nitrilase superfamily (putative class 13 nitrilases). Uncharacterized subgroup belonging to a larger nitrilase superfamily comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13), class 13 represents proteins that at the time were difficult to place in a distinct similarity group; this subgroup either represents a new class or one that was included previously in class 13. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer. XC1258 is a homotetramer.	252
143600	cd07576	R-amidase_like	Pseudomonas sp. MCI3434 R-amidase and related proteins (putative class 13 nitrilases). Pseudomonas sp. MCI3434 R-amidase hydrolyzes (R,S)-piperazine-2-tert-butylcarboxamide to form (R)-piperazine-2-carboxylic acid. It does so with strict R-stereoselectivity. Its preferred substrates are carboxamide compounds which have the amino or imino group connected to their beta- or gamma-carbon. This subgroup belongs to a larger nitrilase superfamily comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13), class 13 represents proteins that at the time were difficult to place in a distinct similarity group. It has been suggested that this subgroup represents a new class. Members of the nitrilase superfamily generally form homomeric complexes, the basic building block of which is a homodimer. Native R-amidase however appears to be a monomer.	254
143601	cd07577	Ph0642_like	Pyrococcus horikoshii Ph0642 and related proteins, members of the nitrilase superfamily (putative class 13 nitrilases). Uncharacterized subgroup of the nitrilase superfamily. This superfamily is comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. Pyrococcus horikoshii Ph0642 is a hypothetical protein belonging to this subgroup. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13). This subgroup was classified as belonging to class 13, which represents proteins that at the time were difficult to place in a distinct similarity group. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer.	259
143602	cd07578	nitrilase_1_R1	First nitrilase domain of an uncharacterized subgroup of the nitrilase superfamily (putative class 13 nitrilases). Members of this subgroup have two nitrilase domains. This is the first of those two domains. The nitrilase superfamily is comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13). Class 13 represents proteins that at the time were difficult to place in a distinct similarity group; this subgroup represents either a new class or one that was included previously in class 13. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer.	258
143603	cd07579	nitrilase_1_R2	Second nitrilase domain of an uncharacterized subgroup of the nitrilase superfamily (putative class 13 nitrilases). Members of this subgroup have two nitrilase domains. This is the second of those two domains. The nitrilase superfamily is comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13). Class 13 represents proteins that at the time were difficult to place in a distinct similarity group; this subgroup represents either a new class or one that was included previously in class 13. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer.	279
143604	cd07580	nitrilase_2	Uncharacterized subgroup of the nitrilase superfamily (putative class 13 nitrilases). The nitrilase superfamily is comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13). Class 13 represents proteins that at the time were difficult to place in a distinct similarity group; this subgroup represents either a new class or one that was included previously in class 13. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer.	268
143605	cd07581	nitrilase_3	Uncharacterized subgroup of the nitrilase superfamily (putative class 13 nitrilases). The nitrilase superfamily is comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13). Class 13 represents proteins that at the time were difficult to place in a distinct similarity group; this subgroup represents either a new class or one that was included previously in class 13. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer.	255
143606	cd07582	nitrilase_4	Uncharacterized subgroup of the nitrilase superfamily (putative class 13 nitrilases). The nitrilase superfamily is comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13). Class 13 represents proteins that at the time were difficult to place in a distinct similarity group; this subgroup represents either a new class or one that was included previously in class 13. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer.	294
143607	cd07583	nitrilase_5	Uncharacterized subgroup of the nitrilase superfamily (putative class 13 nitrilases). The nitrilase superfamily is comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13). Class 13 represents proteins that at the time were difficult to place in a distinct similarity group; this subgroup represents either a new class or one that was included previously in class 13. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer.	253
143608	cd07584	nitrilase_6	Uncharacterized subgroup of the nitrilase superfamily (putative class 13 nitrilases). The nitrilase superfamily is comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13). Class 13 represents proteins that at the time were difficult to place in a distinct similarity group; this subgroup represents either a new class or one that was included previously in class 13. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer.	258
143609	cd07585	nitrilase_7	Uncharacterized subgroup of the nitrilase superfamily (putative class 13 nitrilases). The nitrilase superfamily is comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13). Class 13 represents proteins that at the time were difficult to place in a distinct similarity group; this subgroup represents either a new class or one that was included previously in class 13. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer.	261
143610	cd07586	nitrilase_8	Uncharacterized subgroup of the nitrilase superfamily (putative class 13 nitrilases). The nitrilase superfamily is comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13). Class 13 represents proteins that at the time were difficult to place in a distinct similarity group; this subgroup represents either a new class or one that was included previously in class 13. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer.	269
143611	cd07587	ML_beta-AS	mammalian-like beta-alanine synthase (beta-AS) and similar proteins (class 5 nitrilases). This subgroup includes mammalian-like beta-AS (EC 3.5.1.6, also known as beta-ureidopropionase or N-carbamoyl-beta-alanine amidohydrolase). This enzyme catalyzes the third and final step in the pyrimidine catabolic pathway responsible for the degradation of uracil and thymine, the hydrolysis of N-carbamyl-beta-alanine and N-carbamyl-beta-aminoisobutyrate to the beta-amino acids, beta-alanine and beta-aminoisobutyrate respectively. This subgroup belongs to a larger nitrilase superfamily comprised of nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes, which depend on a Glu-Lys-Cys catalytic triad. This superfamily has been classified in the literature based on global and structure based sequence analysis into thirteen different enzyme classes (referred to as 1-13), this subgroup corresponds to class 5. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer. Beta-ASs from this subgroup are found in various oligomeric states, dimer (human), hexamer (calf liver), decamer (Arabidopsis and Zea mays), and in the case of Drosophila melanogaster beta-AS, as a homooctamer assembled as a left-handed helical turn, with the possibility of higher order oligomers formed by adding dimers at either end. Rat beta-AS changes its oligomeric state (hexamer, trimer, dodecamer) in response to allosteric effectors. Eukaryotic Saccharomyces kluyveri beta-AS belongs to a different superfamily.	363
153272	cd07588	BAR_Amphiphysin	The Bin/Amphiphysin/Rvs (BAR) domain of Amphiphysins. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Amphiphysins function primarily in endocytosis and other membrane remodeling events. They contain an N-terminal BAR domain with an additional N-terminal amphipathic helix (an N-BAR), a variable central domain, and a C-terminal SH3 domain. This subfamily is composed of different isoforms of amphiphysin and Bridging integrator 2 (Bin2). Amphiphysin I proteins, enriched in the brain and nervous system, contain domains that bind clathrin, Adaptor Protein complex 2 (AP2), dynamin and synaptojanin. They function in synaptic vesicle endocytosis. Some amphiphysin II isoforms, also called Bridging integrator 1 (Bin1), are localized in many different tissues and may function in intracellular vesicle trafficking. In skeletal muscle, Bin1 plays a role in the organization and maintenance of the T-tubule network. Bin2 is mainly expressed in hematopoietic cells and is upregulated during granulocyte differentiation. The N-BAR domains of amphiphysins form a curved dimer with a positively-charged concave face that can drive membrane bending and curvature.	211
153273	cd07589	BAR_DNMBP	The Bin/Amphiphysin/Rvs (BAR) domain of Dynamin Binding Protein. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. DyNamin Binding Protein (DNMBP), also called Tuba, is a Cdc42-specific Guanine nucleotide Exchange Factor (GEF) that binds dynamin and various actin regulatory proteins. It serves as a link between dynamin function, Rho GTPase signaling, and actin dynamics. It plays an important role in regulating cell junction configuration. DNMBP contains BAR and SH3 domains as well as a Dbl Homology domain (DH domain), which harbors GEF activity. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The BAR domain of DNMBP may be involved in binding to membranes. The gene encoding DNMBP is a candidate gene for late onset Alzheimer's disease.	195
153274	cd07590	BAR_Bin3	The Bin/Amphiphysin/Rvs (BAR) domain of Bridging integrator 3. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Bridging integrator 3 (Bin3) is widely expressed in many tissues except in the brain. It plays roles in regulating filamentous actin localization and in cell division. In humans, the Bin3 gene is located in chromosome 8p21.3, a region that is implicated in cancer suppression. Homozygous inactivation of the Bin3 gene in mice led to the development of cataracts and an increased likelihood of lymphomas during aging, suggesting a role for Bin3 in lens development and cancer suppression. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.	225
153275	cd07591	BAR_Rvs161p	The Bin/Amphiphysin/Rvs (BAR) domain of Saccharomyces cerevisiae Reduced viability upon starvation protein 161 and similar proteins. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. This subfamily is composed of fungal proteins with similarity to Saccharomyces cerevisiae Reduced viability upon starvation protein 161 (Rvs161p) and Schizosaccharomyces pombe Hob3 (homolog of Bin3). S. cerevisiae Rvs161p plays a role in regulating cell polarity, actin cytoskeleton polarization, vesicle trafficking, endocytosis, bud formation, and the mating response. It forms a heterodimer with another BAR domain protein Rvs167p. Rvs161p and Rvs167p share common functions but are not interchangeable. Their BAR domains cannot be replaced with each other and the overexpression of one cannot suppress the mutant phenotypes of the other. S. pombe Hob3 is important in regulating filamentous actin localization and may be required in activating Cdc42 and recruiting it to cell division sites. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.	224
153276	cd07592	BAR_Endophilin_A	The Bin/Amphiphysin/Rvs (BAR) domain of Endophilin-A. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Endophilins are accessory proteins, localized at synapses, which interact with the endocytic proteins, dynamin and synaptojanin. They are essential for synaptic vesicle formation from the plasma membrane. They interact with voltage-gated calcium channels, thus linking vesicle endocytosis to calcium regulation. They also play roles in virus budding, mitochondrial morphology maintenance, receptor-mediated endocytosis inhibition, and endosomal sorting. Endophilins contain an N-terminal N-BAR domain (BAR domain with an additional N-terminal amphipathic helix), followed by a variable region containing proline clusters, and a C-terminal SH3 domain. They are classified into two types, A and B. Vertebrates contain three endophilin-A isoforms. Endophilin-A proteins are enriched in the brain and play multiple roles in receptor-mediated endocytosis. They tubulate membranes and regulate calcium influx into neurons to trigger the activation of the endocytic machinery. They are also involved in the sorting of plasma membrane proteins, actin filament assembly, and the uncoating of clathrin-coated vesicles for fusion with endosomes. The BAR domains of endophilin-A1 and A3 form crescent-shaped dimers that can detect membrane curvature and drive membrane bending.	223
153277	cd07593	BAR_MUG137_fungi	The Bin/Amphiphysin/Rvs (BAR) domain of Schizosaccharomyces pombe Meiotically Up-regulated Gene 137 protein and similar proteins. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. This subfamily is composed predominantly of uncharacterized fungal proteins with similarity to Schizosaccharomyces pombe Meiotically Up-regulated Gene 137 protein (MUG137), which may play a role in meiosis and sporulation in fission yeast. MUG137 contains an N-terminal BAR domain and a C-terminal SH3 domain, similar to endophilins. Endophilins play roles in synaptic vesicle formation, virus budding, mitochondrial morphology maintenance, receptor-mediated endocytosis inhibition, and endosomal sorting. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.	215
153278	cd07594	BAR_Endophilin_B	The Bin/Amphiphysin/Rvs (BAR) domain of Endophilin-B. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Endophilins play roles in synaptic vesicle formation, virus budding, mitochondrial morphology maintenance, receptor-mediated endocytosis inhibition, and endosomal sorting. Endophilins contain an N-terminal N-BAR domain (BAR domain with an additional N-terminal amphipathic helix), followed by a variable region containing proline clusters, and a C-terminal SH3 domain. They are classified into two types, A and B. Vertebrates contain two endophilin-B isoforms. Endophilin-B proteins are cytoplasmic proteins expressed mainly in the heart, placenta, and skeletal muscle.	229
153279	cd07595	BAR_RhoGAP_Rich-like	The Bin/Amphiphysin/Rvs (BAR) domain of Rich-like Rho GTPase Activating Proteins. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. This subfamily is composed of Rho and Rac GTPase activating proteins (GAPs) with similarity to GAP interacting with CIP4 homologs proteins (Rich). Members contain an N-terminal BAR domain, followed by a Rho GAP domain, and a C-terminal proline-rich region. Vertebrates harbor at least three Rho GAPs in this subfamily including Rich1, Rich2, and SH3-domain binding protein 1 (SH3BP1). Rich1 and Rich2 play complementary roles in the establishment and maintenance of cell polarity. Rich1 is a Cdc42- and Rac-specific GAP that binds to polarity proteins through the scaffold protein angiomotin and plays a role in maintaining the integrity of tight junctions. Rich2 is a Rac GAP that interacts with CD317 and plays a role in actin cytoskeleton organization and the maintenance of microvilli in polarized epithelial cells. SH3BP1 is a Rac GAP that inhibits Rac-mediated platelet-derived growth factor (PDGF)-induced membrane ruffling. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The BAR domain of Rich1 has been shown to form oligomers, bind membranes and induce membrane tubulation.	244
153280	cd07596	BAR_SNX	The Bin/Amphiphysin/Rvs (BAR) domain of Sorting Nexins. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Sorting nexins (SNXs) are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. A subset of SNXs also contain BAR domains. The PX-BAR structural unit determines the specific membrane targeting of SNXs. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.	218
153281	cd07597	BAR_SNX8	The Bin/Amphiphysin/Rvs (BAR) domain of Sorting Nexin 8. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Sorting nexins (SNXs) are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. A subset of SNXs also contain BAR domains. The PX-BAR structural unit determines the specific membrane targeting of SNXs. SNX8 and the yeast counterpart Mvp1p are involved in sorting and delivery of late-Golgi proteins, such as carboxypeptidase Y, to vacuoles. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.	246
153282	cd07598	BAR_FAM92	The Bin/Amphiphysin/Rvs (BAR) domain of Family with sequence similarity 92 (FAM92). BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. This group is composed of proteins from the family with sequence similarity 92 (FAM92), which were originally identified by the presence of the unknown domain DUF1208. This domain shows similarity to the BAR domains of sorting nexins. Mammals contain at least two member types, FAM92A and FAM92B, which may exist in many variants. The Xenopus homolog of FAM92A1, xVAP019, is essential for embryo survival and cell differentiation. FAM92A1 may be involved in regulating cell proliferation and apoptosis. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.	211
153283	cd07599	BAR_Rvs167p	The Bin/Amphiphysin/Rvs (BAR) domain of Saccharomyces cerevisiae Reduced viability upon starvation protein 167 and similar proteins. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. This subfamily is composed of fungal proteins with similarity to Saccharomyces cerevisiae Reduced viability upon starvation protein 167 (Rvs167p) and Schizosaccharomyces pombe Hob1 (homolog of Bin1). S. cerevisiae Rvs167p plays a role in regulation of the actin cytoskeleton, endocytosis, and sporulation. It forms a heterodimer with another BAR domain protein Rvs161p. Rvs161p and Rvs167p share common functions but are not interchangeable. Their BAR domains cannot be replaced with each other and the overexpression of one cannot suppress the mutant phenotypes of the other. Rvs167p also interacts with the GTPase activating protein (GAP) Gyp5p, which is involved in ER to Golgi vesicle trafficking. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.	216
153284	cd07600	BAR_Gvp36	The Bin/Amphiphysin/Rvs (BAR) domain of Saccharomyces cerevisiae Golgi vesicle protein of 36 kDa and similar proteins. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Proteomic analysis shows that Golgi vesicle protein of 36 kDa (Gvp36) may be involved in vesicular trafficking and nutritional adaptation. A Saccharomyces cerevisiae strain deficient in Gvp36 shows defects in growth, in actin cytoskeleton polarization, in endocytosis, in vacuolar biogenesis, and in the cell cycle. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.	242
153285	cd07601	BAR_APPL	The Bin/Amphiphysin/Rvs (BAR) domain of Adaptor protein, Phosphotyrosine interaction, PH domain and Leucine zipper containing proteins. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Adaptor protein, Phosphotyrosine interaction, PH domain and Leucine zipper containing (APPL) proteins are effectors of the small GTPase Rab5 that function in endosome-mediated signaling. They contain BAR, pleckstrin homology (PH) and phosphotyrosine binding (PTB) domains. They form homo- and hetero-oligomers that are mediated by their BAR domains, and are localized to cytoplasmic membranes. Vertebrates contain two APPL proteins, APPL1 and APPL2. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.	215
153286	cd07602	BAR_RhoGAP_OPHN1-like	The Bin/Amphiphysin/Rvs (BAR) domain of Oligophrenin1-like Rho GTPase Activating Proteins. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. This subfamily is composed of Rho and Rac GTPase activating proteins (GAPs) with similarity to oligophrenin1 (OPHN1). Members contain an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, and a Rho GAP domain. Some members contain a C-terminal SH3 domain. Vertebrates harbor at least three Rho GAPs in this subfamily including OPHN1, GTPase Regulator Associated with Focal adhesion kinase (GRAF), GRAF2, and an uncharacterized protein called GAP10-like. OPHN1, GRAF and GRAF2 show GAP activity towards RhoA and Cdc42. In addition, OPHN1 is active towards Rac. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The BAR domains of OPHN1 and GRAF directly interact with their Rho GAP domains and inhibit their activity. The autoinhibited proteins are able to bind membranes and tubulate liposomes, showing that the membrane-tubulation and GAP-inhibitory functions of the BAR domains can occur simultaneously.	207
153287	cd07603	BAR_ACAPs	The Bin/Amphiphysin/Rvs (BAR) domain of ArfGAP with Coiled-coil, ANK repeat and PH domain containing proteins. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. This subfamily is composed of ACAPs (ArfGAP with Coiled-coil, ANK repeat and PH domain containing proteins), which are Arf GTPase activating proteins (GAPs) containing an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, an Arf GAP domain, and C-terminal ankyrin (ANK) repeats. Vertebrates contain at least three members, ACAP1, ACAP2, and ACAP3. ACAP1 and ACAP2 are Arf6-specific GAPs, involved in the regulation of endocytosis, phagocytosis, cell adhesion and migration, by mediating Arf6 signaling. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.	200
153288	cd07604	BAR_ASAPs	The Bin/Amphiphysin/Rvs (BAR) domain of ArfGAP with SH3 domain, ANK repeat and PH domain containing proteins. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. This subfamily is composed of ASAPs (ArfGAP with SH3 domain, ANK repeat and PH domain containing proteins), which are Arf GTPase activating proteins (GAPs) with similarity to ACAPs (ArfGAP with Coiled-coil, ANK repeat and PH domain containing proteins) in that they contain an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, an Arf GAP domain, and ankyrin (ANK) repeats. However, ASAPs contain an additional C-terminal SH3 domain. ASAPs function in regulating cell growth, migration, and invasion. Vertebrates contain at least three members, ASAP1, ASAP2, and ASAP3. ASAP1 and ASAP2 show GTPase activating protein (GAP) activity towards Arf1 and Arf5. They do not show GAP activity towards Arf6, but are able to mediate Arf6 signaling by binding stably to GTP-Arf6. ASAP3 is an Arf6-specific GAP. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The BAR domain of ASAP1 mediates membrane bending, is essential for function, and autoinhibits GAP activity by interacting with the PH and/or Arf GAP domains.	215
153289	cd07605	I-BAR_IMD	Inverse (I)-BAR, also known as the IRSp53/MIM homology Domain (IMD), a dimerization module that binds and bends membranes. Inverse (I)-BAR (or IMD) is a member of the Bin/Amphiphysin/Rvs (BAR) domain family. It is a dimerization and lipid-binding module that bends membranes and induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. IMD domains are found in Insulin Receptor tyrosine kinase Substrate p53 (IRSp53), Missing in Metastasis (MIM), and Brain-specific Angiogenesis Inhibitor 1-Associated Protein 2-like (BAIAP2L) proteins. These are multi-domain proteins that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. Most members contain an N-terminal IMD, an SH3 domain, and a WASP homology 2 (WH2) actin-binding motif at the C-terminus, except for MIM which does not carry an SH3 domain. Some members contain additional domains and motifs. The IMD domain binds and bundles actin filaments, binds membranes and produces membrane protrusions, and interacts with the small GTPase Rac.	223
153290	cd07606	BAR_SFC_plant	The Bin/Amphiphysin/Rvs (BAR) domain of the plant protein SCARFACE (SFC). BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. The plant protein SCARFACE (SFC), also called VAscular Network 3 (VAN3), is a plant ACAP (ArfGAP with Coiled-coil, ANK repeat and PH domain containing protein), an Arf GTPase Activating Protein (GAP) that plays a role in the trafficking of auxin efflux regulators from the plasma membrane to the endosome. It is required for the normal vein patterning in leaves. SFC contains an N-terminal BAR domain, followed by a Pleckstrin Homology (PH) domain, an Arf GAP domain, and C-terminal ankyrin (ANK) repeats. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.	202
153291	cd07607	BAR_SH3P_plant	The Bin/Amphiphysin/Rvs (BAR) domain of the plant SH3 domain-containing proteins. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. This group is composed of proteins with similarity to Arabidopsis thaliana SH3 domain-containing proteins 1 (SH3P1) and 2 (SH3P2). SH3P1 is involved in the trafficking of clathrin-coated vesicles. It is localized at the plasma membrane and is associated with vesicles of the trans-Golgi network. Yeast complementation studies reveal that SH3P1 has similar functions to the Saccharomyces cerevisiae Rvs167p, which is involved in endocytosis and actin cytoskeletal arrangement. Members of this group contain an N-terminal BAR domain and a C-terminal SH3 domain. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.	209
153292	cd07608	BAR_ArfGAP_fungi	The Bin/Amphiphysin/Rvs (BAR) domain of uncharacterized fungal Arf GAP proteins. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. This group is composed of uncharacterized fungal proteins containing an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, and an Arf GTPase Activating Protein (GAP) domain. These proteins may play roles in Arf-mediated functions involving membrane dynamics. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.	192
153293	cd07609	BAR_SIP3_fungi	The Bin/Amphiphysin/Rvs (BAR) domain of fungal Snf1p-interacting protein 3. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. This group is composed of mostly uncharacterized fungal proteins with similarity to Saccharomyces cerevisiae Snf1p-interacting protein 3 (SIP3). These proteins contain an N-terminal BAR domain followed by a Pleckstrin Homology (PH) domain. SIP3 interacts with SNF1 protein kinase and activates transcription when anchored to DNA. It may function in the SNF1 pathway. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.	214
153294	cd07610	FCH_F-BAR	The Extended FES-CIP4 Homology (FCH) or F-BAR (FCH and Bin/Amphiphysin/Rvs) domain, a dimerization module that binds and bends membranes. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. F-BAR domain containing proteins, also known as Pombe Cdc15 homology (PCH) family proteins, include Fes and Fer tyrosine kinases, PACSINs/Syndapins, FCHO, PSTPIP, CIP4-like proteins and srGAPs. Many members also contain an SH3 domain and play roles in endocytosis. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules. These tubules have diameters larger than those observed with N-BARs. The F-BAR domains of some members such as NOSTRIN and Rgd1 are important for the subcellular localization of the protein.	191
153295	cd07611	BAR_Amphiphysin_I_II	The Bin/Amphiphysin/Rvs (BAR) domain of Amphiphysin I and II. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Amphiphysins function primarily in endocytosis and other membrane remodeling events. They contain an N-terminal BAR domain with an additional N-terminal amphipathic helix (an N-BAR), a variable central domain, and a C-terminal SH3 domain. Amphiphysin I proteins, enriched in the brain and nervous system, contain domains that bind clathrin, Adaptor Protein complex 2 (AP2), dynamin and synaptojanin. They function in synaptic vesicle endocytosis. Some amphiphysin II isoforms, also called Bridging integrator 1 (Bin1), are localized in many different tissues and may function in intracellular vesicle trafficking. In skeletal muscle, Bin1 plays a role in the organization and maintenance of the T-tubule network. The N-BAR domain of amphiphysin forms a curved dimer with a positively-charged concave face that can drive membrane bending and curvature. Human autoantibodies to amphiphysin-1 hinder GABAergic signaling and contribute to the pathogenesis of paraneoplastic stiff-person syndrome. Mutations in amphiphysin-2 (BIN1) are associated with autosomal recessive centronuclear myopathy.	211
153296	cd07612	BAR_Bin2	The Bin/Amphiphysin/Rvs (BAR) domain of Bridging integrator 2. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Bridging integrator 2 (Bin2) is a BAR domain containing protein that is mainly expressed in hematopoietic cells. It is upregulated during granulocyte differentiation and is thought to function primarily in this lineage. The BAR domain of Bin2 is closely related to the BAR domains of amphiphysins, which function primarily in endocytosis and other membrane remodeling events. Amphiphysins contain an N-terminal BAR domain with an additional N-terminal amphipathic helix (an N-BAR), a variable central domain, and a C-terminal SH3 domain. Unlike amphiphysins, Bin2 does not appear to contain a C-terminal SH3 domain. Amphiphysin I proteins, enriched in the brain and nervous system, function in synaptic vesicle endocytosis. Some amphiphysin II isoforms, also called Bridging integrator 1 (Bin1), function in intracellular vesicle trafficking. Bin2 can form a stable complex with Bin1 in cells but cannot replace the function of Bin1, and thus, appears to harbor a nonredundant function. The N-BAR domain of amphiphysin forms a curved dimer with a positively-charged concave face that can drive membrane bending and curvature.	211
153297	cd07613	BAR_Endophilin_A1	The Bin/Amphiphysin/Rvs (BAR) domain of Endophilin-A1. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Endophilins play roles in synaptic vesicle formation, virus budding, mitochondrial morphology maintenance, receptor-mediated endocytosis inhibition, and endosomal sorting. Endophilins contain an N-terminal N-BAR domain (BAR domain with an additional N-terminal amphipathic helix), followed by a variable region containing proline clusters, and a C-terminal SH3 domain. They are classified into two types, A and B. Vertebrates contain three endophilin-A isoforms. Endophilin-A proteins are enriched in the brain and play multiple roles in receptor-mediated endocytosis. Endophilin-A1 (or endophilin-1) is also referred to as SH3P4 (SH3 domain containing protein 4) or SH3GL2 (SH3 domain containing Grb2-like protein 2). It is localized in presynaptic nerve terminals. It plays many roles in clathrin-dependent endocytosis of synaptic vesicles including early vesicle formation, ubiquitin-dependent sorting of plasma membrane proteins, and regulation of calcium influx into neurons. The BAR domain of endophilin-A1 forms crescent-shaped dimers that can detect membrane curvature and drive membrane bending, while its SH3 domain binds the endocytic proteins, dynamin 1, synaptojanin 1, and amphiphysins.	223
153298	cd07614	BAR_Endophilin_A2	The Bin/Amphiphysin/Rvs (BAR) domain of Endophilin-A2. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Endophilins are accessory proteins, localized at synapses, which interact with the endocytic proteins, dynamin and synaptojanin. They are essential for synaptic vesicle formation from the plasma membrane. They interact with voltage-gated calcium channels, thus linking vesicle endocytosis to calcium regulation. They also play roles in virus budding, mitochondrial morphology maintenance, receptor-mediated endocytosis inhibition, and endosomal sorting. Endophilins contain an N-terminal N-BAR domain (BAR domain with an additional N-terminal amphipathic helix), followed by a variable region containing proline clusters, and a C-terminal SH3 domain. They are classified into two types, A and B. Endophilin-A proteins are enriched in the brain and play multiple roles in receptor-mediated endocytosis. Endophilin-A2 (or endophilin-2) is also referred to as SH3P8 (SH3 domain containing protein 8) or SH3GL1 (SH3 domain containing Grb2-like protein 1). It localizes to presynaptic nerve terminals and forms heterodimers with endophilin-A1 through their BAR domains. Endophilin-A2 binds dynamin 1, synaptojanin 1, and the beta1-adrenergic receptor cytoplasmic tail through its SH3 domain.	223
153299	cd07615	BAR_Endophilin_A3	The Bin/Amphiphysin/Rvs (BAR) domain of Endophilin-A3. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Endophilins are accessory proteins localized at synapses that interact with the endocytic proteins, dynamin and synaptojanin. They are essential for synaptic vesicle formation from the plasma membrane. They interact with voltage-gated calcium channels, thus linking vesicle endocytosis to calcium regulation. They also play roles in virus budding, mitochondrial morphology maintenance, receptor-mediated endocytosis inhibition, and endosomal sorting. Endophilins contain an N-terminal N-BAR domain (BAR domain with an additional N-terminal amphipathic helix), followed by a variable region containing proline clusters, and a C-terminal SH3 domain. They are classified into two types, A and B. Endophilin-A proteins are enriched in the brain and play multiple roles in receptor-mediated endocytosis. Endophilin-A3 (or endophilin-3) is also referred to as SH3P13 (SH3 domain containing protein 13) or SH3GL3 (SH3 domain containing Grb2-like protein 3). It regulates Arp2/3-dependent actin filament assembly during endocytosis. It binds N-WASP through its SH3 domain and enhances the ability of N-WASP to activate the Arp2/3 complex. Endophilin-A3 co-localizes with the vesicular glutamate transporter 1 (VGLUT1), and may play an important role in the synaptic release of glutamate.	223
153300	cd07616	BAR_Endophilin_B1	The Bin/Amphiphysin/Rvs (BAR) domain of Endophilin-B1. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Endophilins play roles in synaptic vesicle formation, virus budding, mitochondrial morphology maintenance, receptor-mediated endocytosis inhibition, and endosomal sorting. Endophilins contain an N-terminal N-BAR domain (BAR domain with an additional N-terminal amphipathic helix), followed by a variable region containing proline clusters, and a C-terminal SH3 domain. They are classified into two types, A and B. Endophilin-B proteins are cytoplasmic proteins expressed mainly in the heart, placenta, and skeletal muscle. Endophilin-B1, also called Bax-interacting factor 1 (Bif-1) or SH3GLB1 (SH3-domain GRB2-like endophilin B1), is localized mainly to the Golgi apparatus. It is involved in the regulation of many biological events including autophagy, tumorigenesis, nerve growth factor (NGF) trafficking, neurite outgrowth, mitochondrial outer membrane dynamics, and cell death. Endophilin-B1 forms homo- and heterodimers (with endophilin-B2) through its BAR domain, which can bind and bend membranes. It interacts with amphiphysin 1 and dynamin 1 through its SH3 domain.	229
153301	cd07617	BAR_Endophilin_B2	The Bin/Amphiphysin/Rvs (BAR) domain of Endophilin-B2. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Endophilins play roles in synaptic vesicle formation, virus budding, mitochondrial morphology maintenance, receptor-mediated endocytosis inhibition, and endosomal sorting. Endophilins contain an N-terminal N-BAR domain (BAR domain with an additional N-terminal amphipathic helix), followed by a variable region containing proline clusters, and a C-terminal SH3 domain. They are classified into two types, A and B. Vertebrates contain two endophilin-B isoforms. Endophilin-B proteins are cytoplasmic proteins expressed mainly in the heart, placenta, and skeletal muscle. Endophilin-B2, also called SH3GLB2 (SH3-domain GRB2-like endophilin B2), is a cytoplasmic protein that interacts with the apoptosis inducer Bax. It is overexpressed in prostate cancer metastasis and has been identified as a cancer antigen with potential utility in immunotherapy. Endophilin-B2 forms homo- and heterodimers (with endophilin-B1) through its BAR domain, which can bind and bend membranes.	220
153302	cd07618	BAR_Rich1	The Bin/Amphiphysin/Rvs (BAR) domain of RhoGAP interacting with CIP4 homologs protein 1. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. RhoGAP interacting with CIP4 homologs protein 1 (Rich1) is also called Neuron-associated developmentally-regulated protein (Nadrin) or Rho GTPase activating protein 17 (ARHGAP17). It is a Cdc42- and Rac-specific GAP that binds to polarity proteins through the scaffold protein angiomotin and plays a role in maintaining the integrity of tight junctions. It may be a component of a sorting mechanism in the recycling of tight junction transmembrane proteins. Rich1 contains an N-terminal BAR domain followed by a Rho GAP domain and a C-terminal proline-rich domain. It interacts with the BAR domain proteins endophilin and amphiphysin through its proline-rich region. The BAR domain of Rich1 forms oligomers and can bind membranes and induce membrane tubulation.	246
153303	cd07619	BAR_Rich2	The Bin/Amphiphysin/Rvs (BAR) domain of RhoGAP interacting with CIP4 homologs protein 2. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. RhoGAP interacting with CIP4 homologs protein 2 (Rich2) is a Rho GTPase activating protein that interacts with CD317, a lipid raft-associated integral membrane protein. It plays a role in actin cytoskeleton organization and the maintenance of microvilli in polarized epithelial cells. Rich2 contains an N-terminal BAR domain followed by a GAP domain for Rho and Rac GTPases and a C-terminal proline-rich domain. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.	248
153304	cd07620	BAR_SH3BP1	The Bin/Amphiphysin/Rvs (BAR) domain of SH3-domain Binding Protein 1. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. SH3-domain binding protein 1 (SH3BP1 or 3BP-1) is a Rac GTPase activating protein that inhibits Rac-mediated platelet-derived growth factor (PDGF)-induced membrane ruffling. SH3BP1 contains an N-terminal BAR domain followed by a GAP domain for Rho and Rac GTPases and a C-terminal proline-rich domain. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.	257
153305	cd07621	BAR_SNX5_6	The Bin/Amphiphysin/Rvs (BAR) domain of Sorting Nexins 5 and 6. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Sorting nexins (SNXs) are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. A subset of SNXs also contain BAR domains. The PX-BAR structural unit determines the specific membrane targeting of SNXs. Members of this subfamily include SNX5, SNX6, the mammalian SNX32, and similar proteins. SNX5 and SNX6 may be components of the retromer complex, a membrane coat multimeric complex required for endosomal retrieval of lysosomal hydrolase receptors to the Golgi, acting as a mammalian equivalent of yeast Vps17p. The function of SNX32 is still unknown. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.	219
153306	cd07622	BAR_SNX4	The Bin/Amphiphysin/Rvs (BAR) domain of Sorting Nexin 4. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Sorting nexins (SNXs) are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. A subset of SNXs also contain BAR domains. The PX-BAR structural unit determines the specific membrane targeting of SNXs. SNX4 is involved in recycling traffic from the sorting endosome (post-Golgi endosome) back to the late Golgi. It is also implicated in the regulation of plasma membrane receptor trafficking and interacts with receptors for EGF, insulin, platelet-derived growth factor and leptin. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.	201
153307	cd07623	BAR_SNX1_2	The Bin/Amphiphysin/Rvs (BAR) domain of Sorting Nexins 1 and 2. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Sorting nexins (SNXs) are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. A subset of SNXs also contain BAR domains. The PX-BAR structural unit determines the specific membrane targeting of SNXs. This subfamily consists of SNX1, SNX2, and similar proteins. SNX1 and SNX2 are components of the retromer complex, a membrane coat multimeric complex required for endosomal retrieval of lysosomal hydrolase receptors to the Golgi. The retromer consists of a cargo-recognition subcomplex and a subcomplex formed by a dimer of sorting nexins (SNX1 and/or SNX2), which ensures efficient cargo sorting by facilitating proper membrane localization of the cargo-recognition subcomplex. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.	224
153308	cd07624	BAR_SNX7_30	The Bin/Amphiphysin/Rvs (BAR) domain of Sorting Nexins 7 and 30. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Sorting nexins (SNXs) are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. A subset of SNXs also contain BAR domains. The PX-BAR structural unit determines the specific membrane targeting of SNXs. This subfamily consists of SNX7, SNX30, and similar proteins. The specific functions of SNX7 and SNX30 have not been elucidated. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.	200
153309	cd07625	BAR_Vps17p	The Bin/Amphiphysin/Rvs (BAR) domain of yeast Sorting Nexin Vps17p. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Sorting nexins (SNXs) are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. A subset of SNXs also contain BAR domains. The PX-BAR structural unit determines the specific membrane targeting of SNXs. Vps17p forms a dimer with Vps5p, the yeast counterpart of human SNX1, and is part of the retromer complex that mediates the transport of the carboxypeptidase Y receptor Vps10p from endosomes to Golgi. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.	230
153310	cd07626	BAR_SNX9_like	The Bin/Amphiphysin/Rvs (BAR) domain of Sorting Nexin 9 and Similar Proteins. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Sorting nexins (SNXs) are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. A subset of SNXs also contain BAR domains. The PX-BAR structural unit determines the specific membrane targeting of SNXs. This subfamily consists of SNX9, SNX18, SNX33, and similar proteins. SNX9 is localized to plasma membrane endocytic sites and acts primarily in clathrin-mediated endocytosis, while SNX18 is localized to peripheral endosomal structures, and acts in a trafficking pathway that is clathrin-independent but relies on AP-1 and PACS1. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.	199
153311	cd07627	BAR_Vps5p	The Bin/Amphiphysin/Rvs (BAR) domain of yeast Sorting Nexin Vps5p. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Sorting nexins (SNXs) are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. A subset of SNXs also contain BAR domains. The PX-BAR structural unit determines the specific membrane targeting of SNXs. Vps5p is the yeast counterpart of human SNX1 and is part of the retromer complex, which functions in the endosome-to-Golgi retrieval of vacuolar protein sorting receptor Vps10p, the Golgi-resident membrane protein A-ALP, and endopeptidase Kex2. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.	216
153312	cd07628	BAR_Atg24p	The Bin/Amphiphysin/Rvs (BAR) domain of yeast Sorting Nexin Atg24p. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Sorting nexins (SNXs) are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. A subset of SNXs also contain BAR domains. The PX-BAR structural unit determines the specific membrane targeting of SNXs. Atg24p is involved in membrane fusion events at the vacuolar surface during pexophagy. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.	185
153313	cd07629	BAR_Atg20p	The Bin/Amphiphysin/Rvs (BAR) domain of yeast Sorting Nexin Atg20p. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Sorting nexins (SNXs) are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. A subset of SNXs also contain BAR domains. The PX-BAR structural unit determines the specific membrane targeting of SNXs. The function of Atg20p is unknown but it has been shown to interact with Atg11p, which plays a role in linking cargo molecules with vesicle-forming components. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.	187
153314	cd07630	BAR_SNX_like	The Bin/Amphiphysin/Rvs (BAR) domain of uncharacterized Sorting Nexins. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. This subfamily is composed of uncharacterized proteins with similarity to sorting nexins (SNXs), which are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. A subset of SNXs also contain BAR domains. The PX-BAR structural unit determines the specific membrane targeting of SNXs. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.	198
153315	cd07631	BAR_APPL1	The Bin/Amphiphysin/Rvs (BAR) domain of Adaptor protein, Phosphotyrosine interaction, PH domain and Leucine zipper containing 1. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Adaptor protein, Phosphotyrosine interaction, PH domain and Leucine zipper containing (APPL) proteins are effectors of the small GTPase Rab5 that function in endosome-mediated signaling. They contain BAR, pleckstrin homology (PH) and phosphotyrosine binding (PTB) domains. They form homo- and hetero-oligomers that are mediated by their BAR domains. Vertebrates contain two APPL proteins, APPL1 and APPL2. APPL1 interacts with diverse receptors (e.g. NGF receptor TrkA, FSHR, adiponectin receptors) and signaling proteins (e.g. Akt, PI3K), and may function as an adaptor linked to many distinct signaling pathways. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.	215
153316	cd07632	BAR_APPL2	The Bin/Amphiphysin/Rvs (BAR) domain of Adaptor protein, Phosphotyrosine interaction, PH domain and Leucine zipper containing 2. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Adaptor protein, Phosphotyrosine interaction, PH domain and Leucine zipper containing (APPL) proteins are effectors of the small GTPase Rab5 that function in endosome-mediated signaling. They contain BAR, pleckstrin homology (PH) and phosphotyrosine binding (PTB) domains. They form homo- and hetero-oligomers that are mediated by their BAR domains. Vertebrates contain two APPL proteins, APPL1 and APPL2. Both APPL proteins interact with the transcriptional repressor Reptin, acting as activators of beta-catenin/TCF-mediated transcription. APPL2 is essential for cell proliferation. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.	215
153317	cd07633	BAR_OPHN1	The Bin/Amphiphysin/Rvs (BAR) domain of Oligophrenin-1. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Oligophrenin-1 (OPHN1) is a GTPase activating protein (GAP) with activity towards RhoA, Rac, and Cdc42, that is expressed in developing spinal cord and in adult brain areas with high plasticity. It plays a role in regulating the actin cytoskeleton as well as morphology changes in axons and dendrites, and may also function in modulating neuronal connectivity. Mutations in the OPHN1 gene cause X-linked mental retardation associated with cerebellar hypoplasia, lateral ventricle enlargement and epilepsy. OPHN1 contains an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, and a Rho GAP domain. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.	207
153318	cd07634	BAR_GAP10-like	The Bin/Amphiphysin/Rvs (BAR) domain of Rho GTPase activating protein 10-like. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. This group is composed of uncharacterized proteins called Rho GTPase activating protein (GAP) 10-like. GAP10-like may be a GAP with activity towards RhoA and Cdc42. Similar to GRAF and GRAF2, it contains an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, a Rho GAP domain, and a C-terminal SH3 domain. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The BAR domains of the related proteins GRAF and OPHN1 directly interact with their Rho GAP domains and inhibit their activity. The autoinhibited proteins are capable of binding membranes and tubulating liposomes, showing that the membrane-tubulation and GAP-inhibitory functions of the BAR domain can occur simultaneously.	207
153319	cd07635	BAR_GRAF2	The Bin/Amphiphysin/Rvs (BAR) domain of GTPase Regulator Associated with Focal adhesion kinase 2. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. GTPase Regulator Associated with Focal adhesion kinase 2 (GRAF2), also called Rho GTPase activating protein 10 (ARHGAP10) or PS-GAP, is a GAP with activity towards Cdc42 and RhoA which regulates caspase-activated p21-activated protein kinase-2 (PAK-2p34). GRAF2 interacts with PAK-2p34, leading to its stabilization and decrease of cell death. It is highly expressed in skeletal muscle and also interacts with PKNbeta, which is a target of Rho. GRAF2 contains an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, a Rho GAP domain, and a C-terminal SH3 domain. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The BAR domain of the related protein GRAF directly interacts with its Rho GAP domain and inhibits its activity. Autoinhibited GRAF is capable of binding membranes and tubulating liposomes, showing that the membrane-tubulation and GAP-inhibitory functions of the BAR domain can occur simultaneously.	207
153320	cd07636	BAR_GRAF	The Bin/Amphiphysin/Rvs (BAR) domain of GTPase Regulator Associated with Focal adhesion kinase. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. GTPase Regulator Associated with Focal adhesion kinase (GRAF), also called Rho GTPase activating protein 26 (ARHGAP26), is a GAP with activity towards RhoA and Cdc42 and is only weakly active towards Rac1. It influences Rho-mediated cytoskeletal rearrangements and binds focal adhesion kinase (FAK), which is a critical component of integrin signaling. GRAF contains an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, a Rho GAP domain, and a C-terminal SH3 domain. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The BAR domain of GRAF directly interacts with its Rho GAP domain and inhibits its activity. Autoinhibited GRAF is capable of binding membranes and tubulating liposomes, showing that the membrane-tubulation and GAP-inhibitory functions of the BAR domain can occur simultaneously.	207
153321	cd07637	BAR_ACAP3	The Bin/Amphiphysin/Rvs (BAR) domain of ArfGAP with Coiled-coil, ANK repeat and PH domain containing protein 3. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. ACAP3 (ArfGAP with Coiled-coil, ANK repeat and PH domain containing protein 3), also called centaurin beta-5, is presumed to be an Arf GTPase activating protein (GAP) based on its similarity to the Arf6-specific GAPs ACAP1 and ACAP2. The specific function of ACAP3 is still unknown. ACAP3 contains an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, an Arf GAP domain, and C-terminal ankyrin (ANK) repeats. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.	200
153322	cd07638	BAR_ACAP2	The Bin/Amphiphysin/Rvs (BAR) domain of ArfGAP with Coiled-coil, ANK repeat and PH domain containing protein 2. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. ACAP2 (ArfGAP with Coiled-coil, ANK repeat and PH domain containing protein 2), also called centaurin beta-2, is an Arf6-specific GTPase activating protein (GAP) which mediates Arf6 signaling. Arf6 is involved in the regulation of endocytosis, phagocytosis, cell adhesion and migration. ACAP2 contains an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, an Arf GAP domain, and C-terminal ankyrin (ANK) repeats. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.	200
153323	cd07639	BAR_ACAP1	The Bin/Amphiphysin/Rvs (BAR) domain of ArfGAP with Coiled-coil, ANK repeat and PH domain containing protein 1. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. ACAP1 (ArfGAP with Coiled-coil, ANK repeat and PH domain containing protein 1), also called centaurin beta-1, is an Arf6-specific GTPase activating protein (GAP) which mediates Arf6 signaling. Arf6 is involved in the regulation of endocytosis, phagocytosis, cell adhesion and migration. ACAP1 also participates in the cargo sorting and recycling of the transferrin receptor and integrin beta1. It may also play a role in innate immune responses. ACAP1 contains an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, an Arf GAP domain, and C-terminal ankyrin (ANK) repeats. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.	200
153324	cd07640	BAR_ASAP3	The Bin/Amphiphysin/Rvs (BAR) domain of ArfGAP with SH3 domain, ANK repeat and PH domain containing protein 3. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. ASAP3 (ArfGAP with SH3 domain, ANK repeat and PH domain containing protein 3) is also known as ACAP4 (ArfGAP with Coiled-coil, ANK repeat and PH domain containing protein 4), DDEFL1 (Development and Differentiation Enhancing Factor-Like 1), or centaurin beta-6. It is an Arf6-specific GTPase activating protein (GAP) and is co-localized with Arf6 in ruffling membranes upon EGF stimulation. ASAP3 is implicated in the pathogenesis of hepatocellular carcinoma and plays a role in regulating cell migration and invasion. ASAP3 contains an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, an Arf GAP domain, ankyrin (ANK) repeats, and a C-terminal SH3 domain. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The BAR domain of the related protein ASAP1 mediates membrane bending, is essential for function, and autoinhibits GAP activity by interacting with the PH and/or Arf GAP domains.	213
153325	cd07641	BAR_ASAP1	The Bin/Amphiphysin/Rvs (BAR) domain of ArfGAP with SH3 domain, ANK repeat and PH domain containing protein 1. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. ASAP1 (ArfGAP with SH3 domain, ANK repeat and PH domain containing protein 1) is also known as DDEF1 (Development and Differentiation Enhancing Factor 1), AMAP1, centaurin beta-4, or PAG2. ASAP1 is an Arf GTPase activating protein (GAP) with activity towards Arf1 and Arf5 but not Arf6. However, it has been shown to bind GTP-Arf6 stably without GAP activity. It has been implicated in cell growth, migration, and survival, as well as in tumor invasion and malignancy. It binds paxillin and cortactin, two components of invadopodia which are essential for tumor invasiveness. It also binds focal adhesion kinase (FAK) and the SH2/SH3 adaptor CrkL. ASAP1 contains an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, an Arf GAP domain, ankyrin (ANK) repeats, and a C-terminal SH3 domain. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The BAR domain of ASAP1 mediates membrane bending, is essential for function, and autoinhibits GAP activity by interacting with the PH and/or Arf GAP domains.	215
153326	cd07642	BAR_ASAP2	The Bin/Amphiphysin/Rvs (BAR) domain of ArfGAP with SH3 domain, ANK repeat and PH domain containing protein 2. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. ASAP2 (ArfGAP with SH3 domain, ANK repeat and PH domain containing protein 2) is also known as DDEF2 (Development and Differentiation Enhancing Factor 2), AMAP2, centaurin beta-3, or PAG3. ASAP2 mediates the functions of Arf GTPases via dual mechanisms: it exhibits GTPase activating protein (GAP) activity towards class I (Arf1) and II (Arf5) Arfs; and binds class III Arfs (GTP-Arf6) stably without GAP activity. It binds paxillin and is implicated in Fcgamma receptor-mediated phagocytosis in macrophages and in cell migration. ASAP2 contains an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, an Arf GAP domain, ankyrin (ANK) repeats, and a C-terminal SH3 domain. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The BAR domain of the related protein ASAP1 mediates membrane bending, is essential for function, and autoinhibits GAP activity by interacting with the PH and/or Arf GAP domains.	215
153327	cd07643	I-BAR_IMD_MIM	Inverse (I)-BAR, also known as the IRSp53/MIM homology Domain (IMD), of Missing In Metastasis. The IMD domain, also called Inverse-Bin/Amphiphysin/Rvs (I-BAR) domain, is a dimerization and lipid-binding module that bends membranes and induces membrane protrusions. Members of this subfamily include missing in metastasis (MIM) or metastasis suppressor 1 (MTSS1), metastasis suppressor 1-like (MTSSL) or ABBA (Actin-Bundling protein with BAIAP2 homology), and similar proteins. They contain an N-terminal IMD and a WASP homology 2 (WH2) actin-binding motif at the C-terminus. MIM was originally identified as a missing transcript from metastatic bladder and prostate cancer cells. It is a scaffold protein that functions in a signaling pathway between the PDGF receptor, Src kinases, and actin assembly. It may also function as a cofactor of the Sonic hedgehog (Shh) transcriptional pathway and may participate in tumor development and progression via this pathway. ABBA regulates actin and plasma membrane dynamics to promote the extension of radial glia, which is important in neuronal migration, axon guidance and neurogenesis. The IMD domain of MIM binds and bundles actin filaments, binds membranes, and interacts with the small GTPase Rac.	231
153328	cd07644	I-BAR_IMD_BAIAP2L2	Inverse (I)-BAR, also known as the IRSp53/MIM homology Domain (IMD), of Brain-specific Angiogenesis Inhibitor 1-Associated Protein 2-Like 2. The IMD domain, also called Inverse-Bin/Amphiphysin/Rvs (I-BAR) domain, is a dimerization and lipid-binding module that bends membranes and induces membrane protrusions. This group is composed of uncharacterized proteins known as BAIAP2L2 (Brain-specific Angiogenesis Inhibitor 1-Associated Protein 2-Like 2). They contain an N-terminal IMD, an SH3 domain, and a WASP homology 2 (WH2) actin-binding motif at the C-terminus. The related proteins, BAIAP2L1 and IRSp53, function as regulators of membrane dynamics and the actin cytoskeleton. The IMD domain binds and bundles actin filaments, binds membranes and produces membrane protrusions, and interacts with the small GTPase Rac.	215
153329	cd07645	I-BAR_IMD_BAIAP2L1	Inverse (I)-BAR, also known as the IRSp53/MIM homology Domain (IMD), of Brain-specific Angiogenesis Inhibitor 1-Associated Protein 2-Like 1. The IMD domain, also called Inverse-Bin/Amphiphysin/Rvs (I-BAR) domain, is a dimerization and lipid-binding module that bends membranes and induces membrane protrusions. BAIAP2L1 (Brain-specific Angiogenesis Inhibitor 1-Associated Protein 2-Like 1) is also known as IRTKS (Insulin Receptor Tyrosine Kinase Substrate). It is widely expressed, serves as a substrate for the insulin receptor, and binds the small GTPase Rac. It plays a role in regulating the actin cytoskeleton and colocalizes with F-actin, cortactin, VASP, and vinculin. BAIAP2L1 expression leads to the formation of short actin bundles, distinct from filopodia-like protrusions induced by the expression of the related protein IRSp53. It contains an N-terminal IMD, an SH3 domain, and a WASP homology 2 (WH2) actin-binding motif at the C-terminus. The IMD domain of BAIAP2L1 binds and bundles actin filaments, and binds the small GTPase Rac.	226
153330	cd07646	I-BAR_IMD_IRSp53	Inverse (I)-BAR, also known as the IRSp53/MIM homology Domain (IMD), of Insulin Receptor tyrosine kinase Substrate p53. The IMD domain, also called Inverse-Bin/Amphiphysin/Rvs (I-BAR) domain, is a dimerization and lipid-binding module that bends membranes and induces membrane protrusions. IRSp53 (Insulin Receptor tyrosine kinase Substrate p53) is also known as BAIAP2 (Brain-specific Angiogenesis Inhibitor 1-Associated Protein 2). It is a scaffolding protein that takes part in many signaling pathways including Cdc42-induced filopodia formation, Rac-mediated lamellipodia extension, and spine morphogenesis. IRSp53 exists as multiple splicing variants that differ mainly at the C-termini. One variant (T-form) is expressed exclusively in human breast cancer cells. The gene encoding IRSp53 is a putative susceptibility gene for Gilles de la Tourette syndrome. IRSp53 contains an N-terminal IMD, a CRIB (Cdc42 and Rac interactive binding motif), an SH3 domain, and a WASP homology 2 (WH2) actin-binding motif at the C-terminus. Its IMD domain binds and bundles actin filaments, binds membranes, and interacts with the small GTPase Rac.	232
153331	cd07647	F-BAR_PSTPIP	The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of Proline-Serine-Threonine Phosphatase-Interacting Proteins. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. Vertebrates contain two Proline-Serine-Threonine Phosphatase-Interacting Proteins (PSTPIPs), PSTPIP1 and PSTPIP2. PSTPIPs are mainly expressed in hematopoietic cells and are involved in the regulation of cell adhesion and motility. Mutations in PSTPIPs have been shown to cause autoinflammatory disorders. PSTPIP1 contains an N-terminal F-BAR domain, PEST motifs, and a C-terminal SH3 domain, while PSTPIP2 contains only the N-terminal F-BAR domain. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules.	239
153332	cd07648	F-BAR_FCHO	The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of FCH domain Only proteins. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. Proteins in this group have been named FCH domain Only (FCHO) proteins. Vertebrates have two members, FCHO1 and FCHO2. These proteins contain an F-BAR domain and a C-terminal domain of unknown function named SAFF which is also present in endophilin interacting protein 1. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules.	261
153333	cd07649	F-BAR_GAS7	The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of Growth Arrest Specific protein 7. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. Growth Arrest Specific protein 7 (GAS7) is mainly expressed in the brain and is required for neurite outgrowth. It may also play a role in the protection and migration of embryonic stem cells. Treatment-related acute myeloid leukemia (AML) has been reported resulting from mixed-lineage leukemia (MLL)-GAS7 translocations as a complication of primary cancer treatment. GAS7 contains an N-terminal SH3 domain, followed by a WW domain, and a central F-BAR domain. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules.	233
153334	cd07650	F-BAR_Syp1p_like	The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of yeast Syp1 protein. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. Syp1p is associated with septins, a family of GTP-binding proteins that serve as elements of septin filaments, which are required for cell morphogenesis and division. Syp1p regulates cell-cycle dependent septin cytoskeletal dynamics in yeast. It contains an N-terminal F-BAR domain and a C-terminal domain of unknown function named SAFF which is also present in FCH domain Only (FCHO) proteins and endophilin interacting protein 1. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules.	228
153335	cd07651	F-BAR_PombeCdc15_like	The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of Schizosaccharomyces pombe Cdc15, and similar proteins. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. This subfamily is composed of Schizosaccharomyces pombe Cdc15 and Imp2, and similar proteins. These proteins contain an N-terminal F-BAR domain and a C-terminal SH3 domain. S. pombe Cdc15 and Imp2 play both distinct and overlapping roles in the maintenance and strengthening of the contractile ring at the division site, which is required in cell division. Cdc15 is a component of the actomyosin ring and is required in normal cytokinesis. Imp2 colocalizes with the medial ring during septation and is required for normal septation. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules.	236
153336	cd07652	F-BAR_Rgd1	The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of Saccharomyces cerevisiae Rho GTPase activating protein Rgd1 and similar proteins. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. Saccharomyces cerevisiae Rgd1 is a GTPase activating protein (GAP) with activity towards Rho3p and Rho4p, which are involved in bud growth and cytokinesis, respectively. At low pH, S. cerevisiae Rgd1 is required for cell survival and the activation of the protein kinase C pathway, which is important in cell integrity and the maintenance of cell shape. It contains an N-terminal F-BAR domain and a C-terminal Rho GAP domain. The F-BAR domain of S. cerevisiae Rgd1 binds to phosphoinositides and plays an important role in the localization of the protein to the bud tip/neck during the cell cycle. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules.	234
153337	cd07653	F-BAR_CIP4-like	The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of Cdc42-Interacting Protein 4 and similar proteins. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. This subfamily is composed of Cdc42-Interacting Protein 4 (CIP4), Formin Binding Protein 17 (FBP17), FormiN Binding Protein 1-Like (FNBP1L), and similar proteins. CIP4 and FNBP1L are Cdc42 effectors that bind Wiskott-Aldrich syndrome protein (WASP) and function in endocytosis. CIP4 and FBP17 bind to the Fas ligand and may be implicated in the inflammatory response. CIP4 may also play a role in phagocytosis. Members of this subfamily typically contain an N-terminal F-BAR domain and a C-terminal SH3 domain. In addition, some members such as FNBP1L contain a central Cdc42-binding HR1 domain. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules.	251
153338	cd07654	F-BAR_FCHSD	The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of FCH and double SH3 domains proteins (FCHSD). F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. This subfamily is composed of FCH and double SH3 domain (FCHSD) proteins, so named as they contain an N-terminal F-BAR domain and two SH3 domains at the C-terminus. Vertebrates harbor two subfamily members, FCHSD1 and FCHSD2, which have been characterized only in silico. Their biological function is still unknown. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules.	264
153339	cd07655	F-BAR_PACSIN	The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of Protein kinase C and Casein kinase Substrate in Neurons (PACSIN) proteins. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. Protein kinase C and Casein kinase Substrate in Neurons (PACSIN) proteins, also called Synaptic dynamin-associated proteins (Syndapins), act as regulators of cytoskeletal and membrane dynamics. They bind both dynamin and Wiskott-Aldrich syndrome protein (WASP), and may provide direct links between the actin cytoskeletal machinery through WASP and dynamin-dependent endocytosis. Vertebrates harbor three isoforms with distinct expression patterns and specific functions. PACSINs contain an N-terminal F-BAR domain and a C-terminal SH3 domain. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules.	258
153340	cd07656	F-BAR_srGAP	The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of Slit-Robo GTPase Activating Proteins. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. Slit-Robo GTPase Activating Proteins (srGAPs) are Rho GAPs that interact with Robo1, the transmembrane receptor of Slit proteins. Slit proteins are secreted proteins that control axon guidance and the migration of neurons and leukocytes. Vertebrates contain three isoforms of srGAPs, all of which are expressed during embryonic and early development in the nervous system but with different localization and timing. srGAPs contain an N-terminal F-BAR domain, a Rho GAP domain, and a C-terminal SH3 domain. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules.	241
153341	cd07657	F-BAR_Fes_Fer	The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of Fes (feline sarcoma) and Fer (Fes related) tyrosine kinases. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. Fes (feline sarcoma), also called Fps (Fujinami poultry sarcoma), and Fer (Fes related) are cytoplasmic (or nonreceptor) tyrosine kinases that play roles in haematopoiesis, inflammation and immunity, growth factor signaling, cytoskeletal regulation, cell migration and adhesion, and the regulation of cell-cell interactions. Although Fes and Fer show redundancy in their biological functions, they show differences in their expression patterns. Fer is ubiquitously expressed while Fes is expressed predominantly in myeloid and endothelial cells. Fes and Fer contain an N-terminal F-BAR domain, an SH2 domain, and a C-terminal catalytic kinase domain. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules. The F-BAR domain of Fes is critical in its role in microtubule nucleation and bundling.	237
153342	cd07658	F-BAR_NOSTRIN	The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of Nitric Oxide Synthase TRaffic INducer (NOSTRIN). F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. Nitric Oxide Synthase TRaffic INducer (NOSTRIN) is expressed in endothelial and epithelial cells and is involved in the regulation, trafficking and targeting of endothelial NOS (eNOS). NOSTRIN facilitates the endocytosis of eNOS by coordinating the functions of dynamin and the Wiskott-Aldrich syndrome protein (WASP). Increased expression of NOSTRIN may be correlated to preeclampsia. NOSTRIN contains an N-terminal F-BAR domain and a C-terminal SH3 domain. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules. The F-BAR domain of NOSTRIN is necessary and sufficient for its membrane association and is responsible for its subcellular localization.	239
153343	cd07659	BAR_PICK1	The Bin/Amphiphysin/Rvs (BAR) domain of Protein Interacting with C Kinase 1. The BAR domain of Arfaptin-like proteins, also called the Arfaptin domain, is a dimerization and lipid binding module that can detect and drive membrane curvature. Protein Interacting with C Kinase 1 (PICK1), also called Protein kinase C-alpha-binding protein, is highly expressed in brain and testes. PICK1 plays a key role in the trafficking of AMPA receptors, which are critical for regulating synaptic strength and may be important in cellular processes involved in learning and memory. PICK1 is also critical in the early stages of spermiogenesis. Mice deficient in PICK1 are infertile and show characteristics of the human disease globozoospermia such as round-headed sperm, reduced sperm count, and severely impaired sperm motility. PICK1 may also be involved in the neuropathogenesis of schizophrenia. PICK1 contains an N-terminal PDZ domain and a C-terminal BAR domain. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The BAR domain of PICK1 is necessary for its membrane localization and activation.	215
153344	cd07660	BAR_Arfaptin	The Bin/Amphiphysin/Rvs (BAR) domain of Arfaptin. The BAR domain of Arfaptin-like proteins, also called the Arfaptin domain, is a dimerization and lipid binding module that can detect and drive membrane curvature. Arfaptins are ubiquitously expressed proteins implicated in mediating cross-talk between Rac, a member of the Rho family GTPases, and Arf (ADP-ribosylation factor) small GTPases. Arfaptins bind to GTP-bound Arf1, Arf5, and Arf6, with strongest binding to GTP-Arf1. Arfaptins also bind to Rac-GTP and Rac-GDP with similar affinities. The Arfs are thought to bind to the same surface as Rac, and their binding is mutually exclusive. Mammals contain at least two isoforms of Arfaptin. Arfaptin 1 has been shown to inhibit the activation of Arf-dependent phospholipase D (PLD) and the secretion of matrix metalloproteinase-9 (MMP-9), an enzyme implicated in cancer invasiveness and metastasis. Arfaptin 2 regulates the aggregation of the protein huntingtin, which is implicated in Huntington disease. Arfaptins are single-domain proteins with a BAR-like structure. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.	201
153345	cd07661	BAR_ICA69	The Bin/Amphiphysin/Rvs (BAR) domain of Islet Cell Autoantigen 69-kDa. The BAR domain of Arfaptin-like proteins, also called the Arfaptin domain, is a dimerization and lipid binding module that can detect and drive membrane curvature. Islet cell autoantigen 69-kDa (ICA69) is a diabetes-associated autoantigen that is highly expressed in brain and beta cells. It is involved in membrane trafficking at the Golgi complex in neurosecretory cells. It is coexpressed with Protein Interacting with C Kinase 1 (PICK1), also a BAR domain containing protein, in many tissues at different developmental stages. In neurons, ICA69 colocalizes with PICK1 in cell bodies and dendrites but is absent in synapses where PICK1 is enriched. ICA69 contains an N-terminal BAR domain and a conserved C-terminal domain of unknown function. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. ICA69 associates with PICK1 through their BAR domains to form a heterodimer which is involved in regulating the synaptic targeting and surface expression of AMPA receptors. Autoantibodies against ICA69 have been identified in patients with insulin-dependent diabetes mellitus, rheumatoid arthritis, and primary Sjogren's syndrome. ICA69 has also been shown to be released by pancreatic cancer cells.	204
153346	cd07662	BAR_SNX6	The Bin/Amphiphysin/Rvs (BAR) domain of Sorting Nexin 6. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Sorting nexins (SNXs) are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. A subset of SNXs also contain BAR domains. The PX-BAR structural unit determines the specific membrane targeting of SNXs. SNX6 forms a stable complex with SNX1 and may be a component of the retromer complex, a membrane coat multimeric complex required for endosomal retrieval of lysosomal hydrolase receptors to the Golgi, acting as a mammalian equivalent of yeast Vps17p. It interacts with the receptor serine/threonine kinases from the transforming growth factor-beta family. It also plays roles in enhancing the degradation of EGFR and in regulating the activity of Na,K-ATPase through its interaction with Translationally Controlled Tumor Protein (TCTP). BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.	218
153347	cd07663	BAR_SNX5	The Bin/Amphiphysin/Rvs (BAR) domain of Sorting Nexin 5. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Sorting nexins (SNXs) are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. A subset of SNXs also contain BAR domains. The PX-BAR structural unit determines the specific membrane targeting of SNXs. SNX5, abundantly expressed in macrophages, regulates macropinocytosis, a process that enables cells to internalize large amounts of external solutes. It may also be a component of the retromer complex, a membrane coat multimeric complex required for endosomal retrieval of lysosomal hydrolase receptors to the Golgi, acting as a mammalian equivalent of yeast Vps17p. It also binds the Fanconi anaemia complementation group A protein (FANCA). SNX5 is localized to a subdomain of early endosome and is recruited to the plasma membrane following EGF stimulation and elevation of PI(3,4)P2 levels. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.	218
153348	cd07664	BAR_SNX2	The Bin/Amphiphysin/Rvs (BAR) domain of Sorting Nexin 2. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Sorting nexins (SNXs) are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. A subset of SNXs also contain BAR domains. The PX-BAR structural unit determines the specific membrane targeting of SNXs. SNX2 is a component of the retromer complex, a membrane coat multimeric complex required for endosomal retrieval of lysosomal hydrolase receptors to the Golgi. The retromer consists of a cargo-recognition subcomplex and a subcomplex formed by a dimer of sorting nexins (SNX1 and/or SNX2), which ensures efficient cargo sorting by facilitating proper membrane localization of the cargo-recognition subcomplex. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.	234
153349	cd07665	BAR_SNX1	The Bin/Amphiphysin/Rvs (BAR) domain of Sorting Nexin 1. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Sorting nexins (SNXs) are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. A subset of SNXs also contain BAR domains. The PX-BAR structural unit determines the specific membrane targeting of SNXs. SNX1 is a component of the retromer complex, a membrane coat multimeric complex required for endosomal retrieval of lysosomal hydrolase receptors to the Golgi. The retromer consists of a cargo-recognition subcomplex and a subcomplex formed by a dimer of sorting nexins (SNX1 and/or SNX2), which ensures efficient cargo sorting by facilitating proper membrane localization of the cargo-recognition subcomplex. SNX1 is localized to a microdomain in early endosomes where it regulates cation-independent mannose-6-phosphate receptor retrieval to the trans Golgi network. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.	234
153350	cd07666	BAR_SNX7	The Bin/Amphiphysin/Rvs (BAR) domain of Sorting Nexin 7. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Sorting nexins (SNXs) are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. A subset of SNXs also contain BAR domains. The PX-BAR structural unit determines the specific membrane targeting of SNXs. The specific function of SNX7 is still unknown. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.	243
153351	cd07667	BAR_SNX30	The Bin/Amphiphysin/Rvs (BAR) domain of Sorting Nexin 30. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Sorting nexins (SNXs) are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. A subset of SNXs also contain BAR domains. The PX-BAR structural unit determines the specific membrane targeting of SNXs. The specific function of SNX30 is still unknown. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.	240
153352	cd07668	BAR_SNX9	The Bin/Amphiphysin/Rvs (BAR) domain of Sorting Nexin 9. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Sorting nexins (SNXs) are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. A subset of SNXs also contain BAR domains. The PX-BAR structural unit determines the specific membrane targeting of SNXs. SNX9, also known as SH3PX1, is a cytosolic protein that interacts with proteins associated with clathrin-coated pits such as Cdc-42-associated tyrosine kinase 2 (ACK2). It binds class I polyproline sequences found in dynamin 1/2 and the WASP/N-WASP actin regulators. SNX9 is localized to plasma membrane endocytic sites and acts primarily in clathrin-mediated endocytosis. Its array of interacting partners suggests that SNX9 functions at the interface between endocytosis and actin cytoskeletal organization. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.	210
153353	cd07669	BAR_SNX33	The Bin/Amphiphysin/Rvs (BAR) domain of Sorting Nexin 33. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Sorting nexins (SNXs) are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. A subset of SNXs also contain BAR domains. The PX-BAR structural unit determines the specific membrane targeting of SNXs. SNX33 interacts with Wiskott-Aldrich syndrome protein (WASP) and plays a role in the maintenance of cell shape and cell cycle progression. It modulates the shedding and endocytosis of cellular prion protein (PrP(c)) and amyloid precursor protein (APP). BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.	207
153354	cd07670	BAR_SNX18	The Bin/Amphiphysin/Rvs (BAR) domain of Sorting Nexin 18. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions. Sorting nexins (SNXs) are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. A subset of SNXs also contain BAR domains. The PX-BAR structural unit determines the specific membrane targeting of SNXs. SNX18 is localized to peripheral endosomal structures, and acts in a trafficking pathway that is clathrin-independent but relies on AP-1 and PACS1. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.	207
153355	cd07671	F-BAR_PSTPIP1	The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of Proline-Serine-Threonine Phosphatase-Interacting Protein 1. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. Proline-Serine-Threonine Phosphatase-Interacting Protein 1 (PSTPIP1), also known as CD2 Binding Protein 1 (CD2BP1), is mainly expressed in hematopoietic cells. It is a binding partner of the cell surface receptor CD2 and PTP-PEST, a tyrosine phosphatase which functions in cell motility and Rac1 regulation. It also plays a role in the activation of the Wiskott-Aldrich syndrome protein (WASP), which couples actin rearrangement and T cell activation. Mutations in the gene encoding PSTPIP1 cause the autoinflammatory disorder known as PAPA (pyogenic sterile arthritis, pyoderma gangrenosum, and acne) syndrome. PSTPIP1 contains an N-terminal F-BAR domain, PEST motifs, and a C-terminal SH3 domain. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules.	242
153356	cd07672	F-BAR_PSTPIP2	The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of Proline-Serine-Threonine Phosphatase-Interacting Protein 2. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. Proline-Serine-Threonine Phosphatase-Interacting Protein 2 (PSTPIP2), also known as Macrophage Actin-associated tYrosine Phosphorylated protein (MAYP), is mostly expressed in hematopoietic cells but is also expressed in the brain. It is involved in regulating cell adhesion and motility. Mutations in the gene encoding murine PSTPIP2 can cause autoinflammatory disorders such as chronic multifocal osteomyelitis and macrophage autoinflammatory disease. PSTPIP2 contains an N-terminal F-BAR domain and lacks the PEST motifs and SH3 domain that are found in PSTPIP1. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules.	240
153357	cd07673	F-BAR_FCHO2	The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of FCH domain Only 2 protein. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. The specific function of FCH domain Only 2 (FCHO2) is still unknown. It contains an N-terminal F-BAR domain and a C-terminal domain of unknown function named SAFF which is also present in FCHO1 and endophilin interacting protein 1. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules.	269
153358	cd07674	F-BAR_FCHO1	The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of FCH domain Only 1 protein. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. FCH domain Only 1 (FCHO1) may be involved in clathrin-coated vesicle formation. It contains an N-terminal F-BAR domain and a C-terminal domain of unknown function named SAFF which is also present in FCHO2 and endophilin interacting protein 1. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules.	261
153359	cd07675	F-BAR_FNBP1L	The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of Formin Binding Protein 1-Like. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. FormiN Binding Protein 1-Like (FNBP1L), also known as Toca-1 (Transducer of Cdc42-dependent actin assembly), forms a complex with neural Wiskott-Aldrich syndrome protein (N-WASP). The FNBP1L/N-WASP complex induces the formation of filopodia and endocytic vesicles. FNBP1L is required for Cdc42-induced actin assembly and is essential for autophagy of intracellular pathogens. It contains an N-terminal F-BAR domain, a central Cdc42-binding HR1 domain, and a C-terminal SH3 domain. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules.	252
153360	cd07676	F-BAR_FBP17	The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of Formin Binding Protein 17. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. Formin Binding Protein 17 (FBP17), also called FormiN Binding Protein 1 (FNBP1), is involved in dynamin-mediated endocytosis. It is recruited to clathrin-coated pits late in the endocytosis process and may play a role in the invagination and scission steps. FBP17 binds in vivo to tankyrase, a protein involved in telomere maintenance and mitogen activated protein kinase (MAPK) signaling. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules.	253
153361	cd07677	F-BAR_FCHSD2	The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of FCH and double SH3 domains 2 (FCHSD2). F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. FCH and double SH3 domains 2 (FCHSD2) contains an N-terminal F-BAR domain and two SH3 domains at the C-terminus. It has been characterized only in silico, and its biological function is still unknown. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules.	260
153362	cd07678	F-BAR_FCHSD1	The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of FCH and double SH3 domains 1 (FCHSD1). F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. FCH and double SH3 domains 1 (FCHSD1) contains an N-terminal F-BAR domain and two SH3 domains at the C-terminus. It has been characterized only in silico, and its biological function is still unknown. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules.	263
153363	cd07679	F-BAR_PACSIN2	The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of Protein kinase C and Casein kinase Substrate in Neurons 2 (PACSIN2). F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. Protein kinase C and Casein kinase Substrate in Neurons (PACSIN) proteins, also called Synaptic dynamin-associated proteins (Syndapins), act as regulators of cytoskeletal and membrane dynamics. Vertebrates harbor three isoforms with distinct expression patterns and specific functions. PACSIN 2 or Syndapin II is expressed ubiquitously and is involved in the regulation of tubulin polymerization. It associates with Golgi membranes and forms a complex with dynamin II which is crucial in promoting vesicle formation from the trans-Golgi network. PACSIN 2 contains an N-terminal F-BAR domain and a C-terminal SH3 domain. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules.	258
153364	cd07680	F-BAR_PACSIN1	The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of Protein kinase C and Casein kinase Substrate in Neurons 1 (PACSIN1). F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. Protein kinase C and Casein kinase Substrate in Neurons (PACSIN) proteins, also called Synaptic dynamin-associated proteins (Syndapins), act as regulators of cytoskeletal and membrane dynamics. Vertebrates harbor three isoforms with distinct expression patterns and specific functions. PACSIN 1 or Syndapin I is expressed specifically in the brain and is localized in neurites and synaptic boutons. It binds the brain-specific proteins dynamin I, synaptojanin, synapsin I, and neural Wiskott-Aldrich syndrome protein (nWASP), and functions as a link between the cytoskeletal machinery and synaptic vesicle endocytosis. PACSIN 1 interacts with huntingtin and may be implicated in the neuropathology of Huntington's disease. It contains an N-terminal F-BAR domain and a C-terminal SH3 domain. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules.	258
153365	cd07681	F-BAR_PACSIN3	The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of Protein kinase C and Casein kinase Substrate in Neurons 3 (PACSIN3). F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. Protein kinase C and Casein kinase Substrate in Neurons (PACSIN) proteins, also called Synaptic dynamin-associated proteins (Syndapins), act as regulators of cytoskeletal and membrane dynamics. Vetebrates harbor three isoforms with distinct expression patterns and specific functions. PACSIN 3 or Syndapin III is expressed ubiquitously and regulates glucose uptake in adipocytes through its role in GLUT1 trafficking. It also modulates the subcellular localization and stimulus-specific function of the cation channel TRPV4. PACSIN 3 contains an N-terminal F-BAR domain and a C-terminal SH3 domain. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules.	258
153366	cd07682	F-BAR_srGAP2	The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of Slit-Robo GTPase Activating Protein 2. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. Slit-Robo GTPase Activating Proteins (srGAPs) are Rho GAPs that interact with Robo1, the transmembrane receptor of Slit proteins. Slit proteins are secreted proteins that control axon guidance and the migration of neurons and leukocytes. Vertebrates contain three isoforms of srGAPs. srGAP2 is expressed in zones of neuronal differentiation. It plays a role in the regeneration of neurons and axons. srGAP2 contains an N-terminal F-BAR domain, a Rho GAP domain, and a C-terminal SH3 domain. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules.	263
153367	cd07683	F-BAR_srGAP1	The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of Slit-Robo GTPase Activating Protein 1. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. Slit-Robo GTPase Activating Proteins (srGAPs) are Rho GAPs that interact with Robo1, the transmembrane receptor of Slit proteins. Slit proteins are secreted proteins that control axon guidance and the migration of neurons and leukocytes. Vertebrates contain three isoforms of srGAPs. srGAP1, also called Rho GTPase-Activating Protein 13 (ARHGAP13), is a Cdc42- and RhoA-specific GAP and is expressed later in the development of CNS (central nervous system) tissues. It is an important downstream signaling molecule of Robo1. srGAP1 contains an N-terminal F-BAR domain, a Rho GAP domain, and a C-terminal SH3 domain. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules.	253
153368	cd07684	F-BAR_srGAP3	The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of Slit-Robo GTPase Activating Protein 3. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. Slit-Robo GTPase Activating Proteins (srGAPs) are Rho GAPs that interact with Robo1, the transmembrane receptor of Slit proteins. Slit proteins are secreted proteins that control axon guidance and the migration of neurons and leukocytes. Vertebrates contain three isoforms of srGAPs. srGAP3, also called MEGAP (MEntal disorder associated GTPase-Activating Protein), is a Rho GAP with activity towards Rac1 and Cdc42. It impacts cell migration by regulating actin and microtubule cytoskeletal dynamics. The association between srGAP3 haploinsufficiency and mental retardation is under debate. srGAP3 contains an N-terminal F-BAR domain, a Rho GAP domain, and a C-terminal SH3 domain. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules.	253
153369	cd07685	F-BAR_Fes	The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of Fes (feline sarcoma) tyrosine kinase. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. Fes (feline sarcoma), also called Fps (Fujinami poultry sarcoma), is a cytoplasmic (or nonreceptor) tyrosine kinase whose gene was first isolated from tumor-causing retroviruses. It is expressed in myeloid, vascular endothelial, epithelial, and neuronal cells, and plays important roles in cell growth and differentiation, angiogenesis, inflammation and immunity, and cytoskeletal regulation. Fes kinase has also been implicated as a tumor suppressor in colorectal cancer. It contains an N-terminal F-BAR domain, an SH2 domain, and a C-terminal catalytic kinase domain. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules. The F-BAR domain of Fes is critical in its role in microtubule nucleation and bundling.	237
153370	cd07686	F-BAR_Fer	The F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain of Fer (Fes related) tyrosine kinase. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization. Fer (Fes related) is a cytoplasmic (or nonreceptor) tyrosine kinase expressed in a wide variety of tissues, and is found to reside in both the cytoplasm and the nucleus. It plays important roles in neuronal polarization and neurite development, cytoskeletal reorganization, cell migration, growth factor signaling, and the regulation of cell-cell interactions mediated by adherens junctions and focal adhesions. Fer kinase also regulates cell cycle progression in malignant cells. It contains an N-terminal F-BAR domain, an SH2 domain, and a C-terminal catalytic kinase domain. F-BAR domains form banana-shaped dimers with a positively-charged concave surface that binds to negatively-charged lipid membranes. They can induce membrane deformation in the form of long tubules.	234
409484	cd07687	IgC_TCR_delta	Immunoglobulin (Ig) constant domain of the delta chain of delta/gamma T-cell antigen receptors (TCRs). The members here are composed of the constant domain of the delta chain of delta/gamma T-cell antigen receptors (TCRs). TCRs mediate antigen recognition by T lymphocytes, and are composed of alpha and beta, or gamma and delta, polypeptide chains with variable (V) and constant (C) regions. The majority of T cells contain alpha-beta TCRs, but a small subset contain gamma-delta TCRs. Alpha-beta TCRs recognize antigen as peptide fragments presented by major histocompatibility complex (MHC) molecules. Gamma-delta TCRs recognize intact protein antigens; they recognize protein antigens directly and without antigen processing and MHC independently of the bound peptide. Gamma-delta T cells can also be stimulated by non-peptide antigens such as small phosphate- or amine-containing compounds.	80
409485	cd07688	IgC_TCR_alpha	Immunoglobulin (Ig) constant domain of the alpha chain of alpha/beta T-cell antigen receptors (TCRs). The members here are composed of the constant domain of the alpha chain of alpha/beta T-cell antigen receptors (TCRs). TCRs mediate antigen recognition by T lymphocytes and are composed of alpha and beta, or gamma and delta polypeptide chains with variable (V) and constant (C) regions. This group includes the variable domain of the alpha chain. Alpha/beta TCRs recognize antigen as peptide fragments presented by major histocompatibility complex (MHC) molecules. The antigen binding site is formed by the variable domains of the alpha and beta chains, located at the N-terminus of each chain. Alpha/beta TCRs recognize antigens differently from gamma/delta TCRs.	83
409486	cd07689	IgC2_VCAM-1	Immunoglobulin (Ig)-like domain of vascular endothelial cell adhesion molecule-1 (VCAM-1) and similar proteins; member of the C2-set of IgSF domains. The members here are composed of the immunoglobulin (Ig)-like domain of vascular endothelial cell adhesion molecule-1 (VCAM-1; also known as Cluster of Differentiation (CD) 106) and similar proteins. During the inflammation process, these molecules recruit leukocytes onto the vascular endothelium before extravasation to the injured tissues. The interaction of VCAM-1 binding to the beta1 integrin very late antigen 4 (VLA-4) expressed by lymphocytes and monocytes mediates the adhesion of leucocytes to blood vessel walls, and regulates migration across the endothelium. During metastasis, some circulating cancer cells extravasate to a secondary site by a similar process. VCAM-1 may be involved in organ targeted tumor metastasis and may also act as host receptors for viruses and parasites. VCAM-1 contains seven Ig domains.	101
409487	cd07690	IgV_1_CD4	First immunoglobulin (Ig) domain of Cluster of Differentiation (CD) 4; member of the V-set of IgSF domains. The members here are composed of the first immunoglobulin (Ig) domain of Cluster of Differentiation (CD) 4.  CD4 and CD8 are the two primary co-receptor proteins found on the surface of T cells, and the presence of either CD4 or CD8 determines the function of the T cell. CD4 is found on helper T cells, where it is required for the binding of MHC (major histocompatibility complex) class II molecules, while CD8 is found on cytotoxic T cells, where it is required for the binding of MHC class I molecules. CD4 contains four immunoglobulin domains, with the first three included in this hierarchy. The fourth domain has a general Ig architecture, but has slight topological changes in the arrangement of beta strands relative to the other structures in this family and is not specifically included in the hierarchy.	97
409488	cd07691	IgC1_CD3_gamma_delta	Immunoglobulin (Ig)-like domain of Cluster of Differentiation (CD) 3 gamma and delta chains; member of the C1-set of IgSF domains. The members here are composed of immunoglobulin (Ig)-like domain of Cluster of Differentiation (CD) 3 gamma and delta chains. CD3 is a T cell surface receptor that is associated with alpha/beta T cell receptors (TCRs). The CD3 complex consists of one gamma, one delta, two epsilon, and two zeta chains. The CD3 subunits form heterodimers as gamma/epsilon, delta/epsilon, and zeta/zeta. The gamma, delta, and epsilon chains each contain an extracellular Ig domain, whereas the extracellular domains of the zeta chains are very small and have unknown structure. The CD3 domain participates in intracellular signaling once the TCR has bound an MHC/antigen complex.	69
409489	cd07692	IgC1_CD3_epsilon	Immunoglobulin (Ig)-like domain of Cluster of Differentiation (CD) 3 epsilon chain; member of the C1-set of IgSF domains. The members here are composed of the immunoglobulin (Ig)-like domain of Cluster of Differentiation (CD) 3 epsilon chain. CD3 is a T cell surface receptor that is associated with alpha/beta T cell receptors (TCRs). The CD3 complex consists of one gamma, one delta, two epsilon, and two zeta chains. The CD3 subunits form heterodimers as gamma/epsilon, delta/epsilon, and zeta/zeta. The gamma, delta, and epsilon chains each contain an extracellular Ig domain, whereas the extracellular domains of the zeta chains are very small and have unknown structure. The CD3 domain participates in intracellular signaling once the TCR has bound an MHC/antigen complex.	75
409490	cd07693	IgC_1_Robo	First immunoglobulin (Ig)-like constant domain in Robo (roundabout) receptors, and similar domains. The members here are composed of the first immunoglobulin (Ig)-like domain in Roundabout (Robo) receptors. Robo receptors play a role in the development of the central nervous system (CNS), and are receptors of Slit protein. Slit is a repellant secreted by the neural cells in the midline. Slit acts through Robo to prevent most neurons from crossing the midline from either side. Three mammalian Robo homologs (Robo1, Robo2, and Robo3), and three mammalian Slit homologs (Slit1, Slit2, Slit3), have been identified. Commissural axons, which cross the midline, express low levels of Robo; longitudinal axons, which avoid the midline, express high levels of Robo. Robo1, Robo2, and Robo3 are expressed by commissural neurons in the vertebrate spinal cord and Slit1, Slit2, and Slit3 are expressed at the ventral midline. Robo3 is a divergent member of the Robo family which, instead of being a positive regulator of Slit responsiveness, antagonizes Slit responsiveness in precrossing axons. The Slit-Robo interaction is mediated by the second leucine-rich repeat (LRR) domain of Slit and the two N-terminal Ig domains of Robo, Ig1 and Ig2. The primary Robo binding site for Slit2 has been shown by surface plasmon resonance experiments and mutational analysis to be the Ig1 domain, while the Ig2 domain has been proposed to harbor a weak secondary binding site.	99
409491	cd07694	IgC2_2_CD4	Second immunoglobulin (Ig) domain of Cluster of Differentiation (CD) 4; member of the C2-set of IgSF domains. The members here are composed of the second immunoglobulin (Ig) Constant 2 (C2)-set domain of Cluster of Differentiation (CD) 4. CD4 and CD8 are the two primary co-receptor proteins found on the surface of T cells, and the presence of either CD4 or CD8 determines the function of the T cell. CD4 is found on helper T cells, where it is required for the binding of MHC (major histocompatibility complex) class II molecules, while CD8 is found on cytotoxic T cells, where it is required for the binding of MHC class I molecules. CD4 contains four immunoglobulin domains, with the first three included in this hierarchy.  The fourth domain has a general Ig architecture, but has slight topological changes in the arrangement of beta strands relative to the other structures in this family and is not specifically included in the hierarchy. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. This group belongs to the C2-set of IgSF domains, having A, B, and E strands in one beta-sheet and A', G, F, C' in the other. Unlike other Ig domain sets, the C2-set lacks the D strand.	88
409492	cd07695	IgV_3_CD4	Third immunoglobulin (Ig) Variable (V) domain of Cluster of Differentiation (CD) 4; member of the V-set of IgSF domains. The members here are composed of the third immunoglobulin variable (IgV) domain of Cluster of Differentiation (CD) 4. CD4 and CD8 are the two primary co-receptor proteins found on the surface of T cells, and the presence of either CD4 or CD8 determines the function of the T cell. CD4 is found on helper T cells, where it is required for the binding of MHC (major histocompatibility complex) class II molecules, while CD8 is found on cytotoxic T cells, where it is required for the binding of MHC class I molecules. CD4 contains four immunoglobulin domains, with the first three included in this hierarchy. The fourth domain has a general Ig architecture, but has slight topological changes in the arrangement of beta strands relative to the other structures in this family and is not specifically included in the hierarchy.	107
409493	cd07696	IgC1_CH3_IgAEM_CH2_IgG	CH3 domain (third constant Ig domain of heavy chains) in immunoglobulin heavy alpha, epsilon, and mu chains, and CH2 domain (second constant Ig domain of the heavy chain) in immunoglobulin heavy gamma chain; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the third immunoglobulin constant domain (IgC) of the gamma heavy chains and the second immunoglobulin constant domain (IgC) of alpha, epsilon, and mu heavy chains. This domain is found on the Fc fragment. The basic structure of Ig molecules is a tetramer of two light chains and two heavy chains linked by disulfide bonds. There are two types of light chains: kappa and lambda; each is composed of a constant domain and a variable domain. There are five types of heavy chains: alpha, delta, epsilon, gamma, and mu, all consisting of a variable domain (VH) with three (alpha, delta and gamma) or four (epsilon and mu) constant domains (CH1 to CH4). Ig molecules are modular proteins, in which the variable and constant domains have clear, conserved sequence patterns.	98
409494	cd07697	IgC1_TCR_gamma	T cell receptor (TCR) gamma chain constant immunoglobulin domain; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin (Ig) constant (C) domain of the gamma chain of gamma-delta T-cell receptors (TCRs). TCRs mediate antigen recognition by T lymphocytes and are heterodimers consisting of alpha and beta chains or gamma and delta chains.  Each chain contains a variable (V) and a constant (C) region. The majority of T cells contain alpha-beta TCRs, but a small subset contain gamma-delta TCRs. Alpha-beta TCRs recognize antigen as peptide fragments presented by major histocompatibility complex (MHC) molecules. Gamma-delta TCRs recognize intact protein antigens; they recognize protein antigens directly and without antigen processing and MHC independently of the bound peptide. Gamma-delta T cells can also be stimulated by non-peptide antigens such as small phosphate- or amine-containing compounds.	98
409495	cd07698	IgC1_MHC_I_alpha3	Class I major histocompatibility complex (MHC) alpha chain, alpha3 immunoglobulin domain; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin (Ig) domain of major histocompatibility complex (MHC) class I alpha chain. Class I MHC proteins bind antigenic peptide fragments and present them to CD8+ T lymphocytes.  Class I molecules consist of a transmembrane alpha chain and a small chain called the beta-2-microglobulin. The alpha chain contains three extracellular domains, two of which fold together to form the peptide-binding cleft (alpha1 and alpha2), and one which has an Ig fold (alpha3).  Peptide binding to class I molecules occurs in the endoplasmic reticulum (ER) and involves both chaperones and dedicated factors to assist in peptide loading.  Class I MHC molecules are expressed on most nucleated cells.	92
409496	cd07699	IgC1_L	Immunoglobulin light chain Constant domain; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin (Ig) light chain constant (C) domain. The basic structure of Ig molecules is a tetramer of two light chains and two heavy chains linked by disulfide bonds. In Ig, each chain is composed of one variable domain (IgV) and one or more constant domains (IgC); these names reflect the fact that the variability in sequences is higher in the variable domain than in the constant domain. There are five types of heavy chains (alpha, gamma, delta, epsilon, and mu), which determine the type of immunoglobulin:  IgA, IgG, IgD, IgE, and IgM, respectively. In higher vertebrates, there are two types of light chain, designated kappa and lambda, which seem to be functionally identical, and can associate with any of the heavy chains.	99
409497	cd07700	IgV_CD8_beta	Immunoglobulin (Ig) variable (V) domain of Cluster of Differentiation (CD) 8 beta chain. The members here are composed of the immunoglobulin (Ig)-like domain in Cluster of Differentiation (CD) 8 beta. The CD8 glycoprotein plays an essential role in the control of T-cell selection, maturation, and the T-cell receptor (TCR)-mediated response to peptide antigen. CD8 is comprised of alpha and beta subunits and is expressed as either an alpha/alpha or alpha/beta dimer. Both dimeric isoforms can serve as a coreceptor for T cell activation and differentiation; however, they have distinct physiological roles, different cellular distributions, and unique binding partners. Each CD8 subunit is comprised of an extracellular domain containing a V-type Ig-like domain, a single pass transmembrane portion, and a short intracellular domain. Members of this group contain standard Ig superfamily V-set AGFCC'C"/DEB domain topology.	116
409498	cd07701	IgV_1_Necl-3	First (N-terminal) immunoglobulin (Ig)-like domain of nectin-like molecule-3; member of the V-set of Ig superfamily (IgSF) domains. The members here are composed of the N-terminal immunoglobulin (Ig)-like domain of nectin-like molecule-3, Necl-3 (also known as cell adhesion molecule 2 (CADM2), SynCAM2, IGSF4D). Nectin-like molecules have similar domain structures to those of nectins. At least five nectin-like molecules have been identified (Necl-1 - Necl-5). They all have an extracellular region containing three Ig-like domains, a transmembrane region, and a cytoplasmic region. The N-terminal Ig-like domain of the extracellular region belongs to the V-type subfamily of Ig domains, is essential to cell-cell adhesion, and plays a part in the interaction with the envelope glycoprotein D of various viruses. Necl-3 accumulates in central and peripheral nervous system tissue, and has been shown to selectively interact with oligodendrocytes.	96
409499	cd07702	IgI_VEGFR-1	Immunoglobulin (Ig)-like domain of vascular endothelial growth factor receptor 1 (VEGFR-1); member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin (Ig)-like domain of vascular endothelial growth factor receptor 1 (VEGFR-1). VEGFRs have an extracellular component with seven Ig-like domains, a transmembrane segment, and an intracellular tyrosine kinase domain interrupted by a kinase-insert domain. VEGFRs bind VEGFs with high affinity at the Ig-like domains. VEGFR-1 binds VEGF-A strongly; VEGF-A is important to the growth and maintenance of vascular endothelial cells and to the development of new blood- and lymphatic-vessels in physiological and pathological states. VEGFR-1 may play an inhibitory role in the function of VEGFR-2 by binding VEGF-A and interfering with its interaction with VEGFR-2. VEGFR-1 has a signaling role in mediating monocyte chemotaxis and may mediate a chemotactic and a survival signal in hematopoietic stem cells or leukemia cells. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand.	92
409500	cd07703	IgC1_2_Nectin-2_Necl-5_like	Second immunoglobulin (Ig) domain of Nectin-2 and Nectin-like protein 5, and similar domains; member of the C1-set of the Ig superfamily (IgSF) domains. The members here are composed of the second immunoglobulin (Ig) domain of nectin-2 (also known as poliovirus receptor related protein 2 or Cluster of Differentiation 112 (CD112)), nectin-like protein 5 (CD155), and similar proteins. Nectins and Nectin-like molecules are a family of Ca(2+)-independent immunoglobulin-like transmembrane glycoproteins belonging to the class of adhesion receptors, consisting of nine members (nectins 1 through 4 and nectin-like proteins 1 through 5). Nectins are synaptic cell adhesion molecules (CAMs) which facilitate adhesion and signaling at various intracellular junctions. Nectins form homophilic cis-dimers, followed by homophilic and heterophilic trans-dimers involved in cell-cell adhesion. Nectin-2 and nectin-3 localize at Sertoli-spermatid junctions where they form heterophilic trans-interactions between the cells that are essential for the formation and maintenance of the junctions and for spermatid development. CD155 is the fifth member in the nectin-like molecule family, and functions as the receptor of poliovirus; therefore, CD155 is also referred to as Necl-5, or PVR. In contrast to all other family members, CD155 lacks self-adhesion capacity, yet it shares with nectins the feature to interact with other nectins. For instance, CD155 heterophilically trans-interacts with nectin-3, thereby contributing significantly to the establishment of adherens junctions between epithelial cells. This group belongs to the Constant 1 (C1)-set of IgSF domains, which has one beta-sheet that is formed by strands A-B-E-D and the other strands by G-F-C-C'.	97
409501	cd07704	IgC1_2_Nectin-3-4_like	Second immunoglobulin (Ig) domain of nectin-3 and nectin-4 (poliovirus receptor related protein 4), and similar domains; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the second immunoglobulin (Ig) domain of nectin-3 (also known as poliovirus receptor related protein 3 or cluster of differentiation (CD) 113) and nectin-4 (poliovirus receptor related protein 4). Nectin-3 and nectin-4 belong to the nectin family comprised of four transmembrane glycoproteins (nectin-1 through -4). Nectins are synaptic cell adhesion molecules (CAMs) which facilitate adhesion and signaling at various intracellular junctions. Nectins form homophilic cis-dimers, followed by homophilic and heterophilic trans-dimers involved in cell-cell adhesion. Nectin-2 and nectin-3 localize at Sertoli-spermatid junctions where they form heterophilic trans-interactions between the cells that are essential for the formation and maintenance of the junctions and for spermatid development. Nectin-3 has also been shown to form a heterophilic trans-interaction with nectin-1 in ciliary epithelia, establishing the apex-apex adhesion between the pigment and non-pigment cell layers. Nectin-4 has recently been identified in several types of breast carcinoma and can be used as a histological and serological marker for breast cancer.	96
409502	cd07705	IgI_2_Necl-1	Second immunoglobulin (Ig)-like domain of nectin-like molecule-1 (Necl-1); member of the I-set of Ig superfamily domains. The members here are composed of the second immunoglobulin (Ig)-like domain of nectin-like molecule-1 (Necl-1; also known as cell adhesion molecule 3 (CADM3)). These nectin-like molecules have similar domain structures to those of nectins. At least five nectin-like molecules have been identified (Necl-1 through Necl-5). These have an extracellular region containing three Ig-like domains, one transmembrane region, and one cytoplasmic region. The N-terminal Ig-like domain of the extracellular region belongs to the V-type subfamily of Ig domains, is essential to cell-cell adhesion, and plays a part in the interaction with the envelope glycoprotein D of various viruses. Necl-1 and Necl-2 have Ca(2+)-independent homophilic and heterophilic cell-cell adhesion activity. Necl-1 is specifically expressed in neural tissue and is important to the formation of synapses, axon bundles, and myelinated axons. Necl-2 is expressed in a wide variety of tissues and is a putative tumour suppressor gene which is downregulated in aggressive neuroblastoma. Ig domains are likely to participate in ligand binding and recognition.	103
409503	cd07706	IgV_TCR_delta	Immunoglobulin (Ig) variable (V) domain of T-cell receptor (TCR) delta chain. The members here are composed of the immunoglobulin (Ig) variable (V) domain of the delta chain of gamma/delta T-cell receptors (TCRs). TCRs mediate antigen recognition by T lymphocytes, and are heterodimers consisting of alpha and beta chains or gamma and delta chains.  Each chain contains a variable (V) and a constant (C) region. The majority of T cells contain alpha/beta TCRs, but a small subset contain gamma/delta TCRs. Alpha/beta TCRs recognize antigen as peptide fragments presented by major histocompatibility complex (MHC) molecules. Gamma/delta TCRs recognize intact protein antigens; they recognize protein antigens directly and without antigen processing, and MHC independently of the bound peptide. Gamma/delta T cells can also be stimulated by non-peptide antigens such as small phosphate- or amine-containing compounds. The variable domain of gamma/delta TCRs is responsible for antigen recognition and is located at the N-terminus of the receptor. Members of this group contain standard Ig superfamily V-set AGFCC'C"/DEB domain topology.	112
293793	cd07707	MBL-B1-B2-like	metallo-beta-lactamases; subclasses B1 and B2 and related proteins; MBL-fold metallo-hydrolase domain. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. Subclass B1 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile. B1 MBLs include chromosomally-encoded MBLs such as Bacillus cereus BcII, Bacteroides fragilis CcrA, and Elizabethkingia meningoseptica (Chryseobacterium meningosepticum) BlaB and acquired MBLs including IMP-1, VIM-1, VIM-2, GIM-1, NDM-1 and FIM-1. B2 MBLs have a narrow substrate profile that includes carbapenems, and they are active with one zinc ion bound in the Asp-Cys-His site; binding of a second zinc ion in the modified 3H site (Asn-His-His) inhibits catalysis. B2 MBLs include Aeromonas hydrophila CphA, Aeromonas veronii ImiS, and Serratia fonticola Sfh-I.	219
293794	cd07708	MBL-B3-like	metallo-beta-lactamases, subclass B3; MBL-fold metallo-hydrolase domain. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. Subclass B3 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile. B3 MBLs include Fluoribacter gormanii FEZ-1, Elizabethkingia meningoseptica (Chryseobacterium meningosepticum) GOB-1, Stenotrophomonas maltophilia L1, Bradyrhizobium diazoefficiens BJP-1, Serratia marcescens SMB-1, and Pseudomonas aeruginosa AIM-1. These B3 enzymes have a modified Zn2/DCH site (Asp-His-His).	248
293795	cd07709	flavodiiron_proteins_MBL-fold	catalytic domain of flavodiiron proteins (FDPs) and related proteins; MBL-fold metallo-hydrolase domain. FDPs catalyze the reduction of oxygen and/or nitric oxide to water or nitrous oxide, respectively. In addition to this N-terminal catalytic domain, they contain a C-terminal flavin mononucleotide-binding flavodoxin-like domain. Although some FDPs are able to reduce NO or O2 with similar catalytic efficiencies, others are selective for either NO or O2, such as Escherichia coli flavorubredoxin, which is selective toward NO, and G. intestinalis FDP, which is selective toward O2. These enzymes belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. Some members of this subgroup are single domain.	238
293796	cd07710	arylsulfatase_Sdsa1-like_MBL-fold	Pseudomonas aeruginosa arylsulfatase SdsA1, Pseudomonas sp. DSM6611 arylsulfatase  Pisa1, and related proteins; MBL-fold metallo-hydrolase domain. Arylsulfatase (also known as aryl-sulfate sulfohydrolase, EC 3.1.6.1). Pseudomonas aeruginosa SdsA1 is a secreted SDS hydrolase that allows the bacterium to use primary sulfates such as the detergent SDS common in commercial personal hygiene products as a sole carbon or sulfur source. Pseudomonas inverting secondary alkylsulfatase 1 (Pisa1) is specific for secondary alkyl sulfates. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions.	239
293797	cd07711	MBLAC1-like_MBL-fold	uncharacterized human metallo-beta-lactamase domain-containing protein 1 and related proteins; MBL-fold metallo hydrolase domain. Includes the MBL-fold metallo hydrolase domain of uncharacterized human MBLAC1 and related proteins. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions.	190
293798	cd07712	MBLAC2-like_MBL-fold	uncharacterized human metallo-beta-lactamase domain-containing protein 2 and related proteins; MBL-fold metallo hydrolase domain. Includes the MBL-fold metallo hydrolase domain of uncharacterized human MBLAC2 and related proteins. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions.	182
293799	cd07713	DHPS-like_MBL-fold	Methanocaldococcus jannaschii dihydropteroate synthase, Thermoanaerobacter tengcongensis Tflp, and related proteins; MBL-fold metallo hydrolase domain. This subgroup includes Methanocaldococcus jannaschii 7,8-dihydropterin-6-methyl-4-(beta-D-ribofuranosyl)-aminobenzene-5'-phosphate synthase (EC 2.5.1.15), a folate biosynthetic enzyme also known as dihydropteroate synthase and 7,8 dihydropteroate synthase. Thermoanaerobacter tengcongensis Tflp is a ferredoxin-like member. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions.	269
293800	cd07714	RNaseJ_MBL-fold	RNase J, MBL-fold metallo-hydrolase domain. RNase J, also called Ribonuclease J, is a prokaryotic ribonuclease which plays a key part in RNA processing and in RNA degradation. It can act as an endonuclease which is specific for single-stranded regions of RNA irrespective of their sequence or location, and as a processive 5' exonuclease which only acts on substrates having a single phosphate or a hydroxyl at the 5' end. Many bacterial species have only one RNase J, but some, such as Bacillus subtilis, have two. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions.	248
293801	cd07715	TaR3-like_MBL-fold	MBL-fold metallo-hydrolase domain of Myxococcus xanthus TaR3 and related proteins; MBL-fold metallo-hydrolase domain. Myxococcus xanthus TaR3 may function as an ammonium regulator/effector protein involved in biosynthesis of the antibiotic TA. Some members of this subgroup are annotated as ribonucleases. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions.	212
293802	cd07716	RNaseZ_short-form-like_MBL-fold	uncharacterized bacterial subgroup of Ribonuclease Z, short form; MBL-fold metallo-hydrolase domain. The tRNA maturase RNase Z (also known as tRNase Z or 3' tRNase) catalyzes the endonucleolytic removal of the 3' extension of the majority of tRNA precursors. Two forms of RNase Z exist in eukaryotes, one long (ELAC2) and one short form (ELAC1), the former may have resulted from a duplication of the shorter enzyme. Only the short form exists in bacteria. Members of this bacterial subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions.	175
293803	cd07717	RNaseZ_ZiPD-like_MBL-fold	Ribonuclease Z, E. coli 3' tRNA-processing endonuclease ZiPD and related proteins; MBL-fold metallo-hydrolase domain. The tRNA maturase RNase Z (also known as tRNase Z or 3' tRNase) catalyzes the endonucleolytic removal of the 3' extension of the majority of tRNA precursors. Escherichia coli zinc phosphodiesterase (ZiPD, also known as ecoZ, tRNase Z, or RNase BN) is a 3' tRNA-processing endonuclease, encoded by the elaC gene. Two forms of RNase Z exist in eukaryotes, one long (ELAC2) and one short form (ELAC1), the former may have resulted from a duplication of the shorter enzyme; this subgroup includes the short form (ELAC1). Only the short form exists in bacteria. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions.	247
293804	cd07718	RNaseZ_ELAC1_ELAC2-C-term-like_MBL-fold	Ribonuclease Z ELAC1, C-terminus of ELAC2, and related proteins; MBL-fold metallo-hydrolase domain. The tRNA maturase RNase Z (also known as tRNase Z or 3' tRNase) catalyzes the endonucleolytic removal of the 3' extension of the majority of tRNA precursors. Two forms of RNase Z exist in eukaryotes, one long (ELAC2) and one short form (ELAC1), the former may have resulted from a duplication of the shorter enzyme; this eukaryotic subgroup includes short forms (ELAC1) and the C-terminus of long forms including human ELAC2. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions.	204
293805	cd07719	arylsulfatase_AtsA-like_MBL-fold	Pseudoalteromonas carrageenovora arylsulfatase AtsA and related proteins; MBL-fold metallo-hydrolase domain. Arylsulfatase (also known as aryl-sulfate sulfohydrolase, EC 3.1.6.1). Pseudoalteromonas carrageenovora arylsulfatase AtsA may function as a glycosulfohydrolase involved with desulfation of sulfated polysaccharides, which catalyzes hydrolysis of the arylsulfate ester bond, producing the aryl compounds and inorganic sulfate. CD also includes some sequences annotated as ribonucleases. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily.	193
293806	cd07720	OPHC2-like_MBL-fold	Pseudomonas pseudoalcaligenes organophosphorus hydrolase C2, and related proteins; MBL-fold metallo hydrolase domain. Pseudomonas pseudoalcaligenes OPHC2 is a thermostable organophosphorus hydrolase which has a broad substrate activity spectrum: it hydrolyzes various phosphotriesters, esters, and a lactone. This subgroup also includes Pseudomonas oleovorans PoOPH which exhibits high lactonase and esterase activities, and latent PTE activity. However, double mutations His250Ile/Ile263Trp switch PoOPH into an efficient and thermostable PTE. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions.	251
293807	cd07721	yflN-like_MBL-fold	uncharacterized subgroup which includes Bacillus subtilis yflN; MBL-fold metallo hydrolase domain. This subgroup includes the uncharacterized Bacillus subtilis yflN protein. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. The class B metal beta-lactamases (MBLs) from which this fold was named are only a small fraction of the activities which are included in this superfamily. Activities carried out by superfamily members include class B beta-lactamases, hydroxyacylglutathione hydrolases, AHL (acyl homoserine lactone) lactonases, persulfide dioxygenases, flavodiiron proteins, cleavage and polyadenylation specificity factors such as the Int9 and Int11 subunits of Integrator, Sdsa1-like and AtsA-like arylsulfatases, 5'-exonucleases human SNM1A and yeast Pso2p, ribonuclease J and ribonuclease Z, cyclic nucleotide phosphodiesterases, insecticide hydrolases, and proteins required for natural transformation competence. Classical members of the superfamily are di-, or less commonly mono-, zinc-ion-dependent hydrolases, however the diversity of biological roles is reflected in variations in the active site metallo-chemistry.	202
293808	cd07722	LACTB2-like_MBL-fold	uncharacterized subgroup which includes human lactamase beta 2 and related proteins; MBL-fold metallo hydrolase domain. Includes functionally uncharacterized human lactamase beta 2. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. The class B metal beta-lactamases (MBLs) from which this fold was named are only a small fraction of the activities which are included in this superfamily. Activities carried out by superfamily members include class B beta-lactamases, hydroxyacylglutathione hydrolases, AHL (acyl homoserine lactone) lactonases, persulfide dioxygenases, flavodiiron proteins, cleavage and polyadenylation specificity factors such as the Int9 and Int11 subunits of Integrator, Sdsa1-like and AtsA-like arylsulfatases, 5'-exonucleases human SNM1A and yeast Pso2p, ribonuclease J and ribonuclease Z, cyclic nucleotide phosphodiesterases, insecticide hydrolases, and proteins required for natural transformation competence. Classical members of the superfamily are di-, or less commonly mono-, zinc-ion-dependent hydrolases, however the diversity of biological roles is reflected in variations in the active site metallo-chemistry.	188
293809	cd07723	hydroxyacylglutathione_hydrolase_MBL-fold	hydroxyacylglutathione hydrolase, MBL-fold metallo-hydrolase domain. Hydroxyacylglutathione hydrolase (EC 3.1.2.6, also known as glyoxalase II; S-2-hydroxyacylglutathione hydrolase; hydroxyacylglutathione hydrolase; acetoacetylglutathione hydrolase). In the second step of the glyoxalase system this enzyme hydrolyzes S-d-lactoylglutathione to d-lactate and regenerates glutathione in the process. It has broad substrate specificity for glutathione thiol esters, hydrolyzing a number of these species to their corresponding carboxylic acids and reduced glutathione. It appears to hydrolyze 2-hydroxy thiol esters with greatest efficiency. It belongs to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions.	165
293810	cd07724	POD-like_MBL-fold	ETHE1 (PDO type I), persulfide dioxygenase A (PDOA, PDO type II) and related proteins; MBL-fold metallo-hydrolase domain. Persulfide dioxygenase (PDO, also known as sulfur dioxygenase, SDO, EC 1.13.11.18) is a non-heme iron-dependent oxygenase which catalyzes the oxidation of glutathione persulfide to glutathione and sulfite in the mitochondria. Mutations in ethe1 (the human PDO gene) are responsible for a rare autosomal recessive metabolic disorder called ethylmalonic encephalopathy. Arabidopsis thaliana ETHE1 is essential for embryo and endosperm development. Bacterial ETHE1-type PDOs are also called Type I PDOs. Type II PDOs (also called PDOAs) are mainly proteobacterial. These enzymes belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions.	177
293811	cd07725	TTHA1429-like_MBL-fold	uncharacterized Thermus thermophilus TTHA1429 and related proteins; MBL-fold metallo hydrolase domain. Includes the MBL-fold metallo hydrolase domain of uncharacterized Thermus thermophilus TTHA1429 and related proteins. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions.	184
293812	cd07726	ST1585-like_MBL-fold	uncharacterized subgroup which includes Sulfolobus tokodaii ST1585 protein; MBL-fold metallo hydrolase domain. This subgroup includes the uncharacterized Sulfolobus tokodaii ST1585 protein. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. The class B metal beta-lactamases (MBLs) from which this fold was named are only a small fraction of the activities which are included in this superfamily. Activities carried out by superfamily members include class B beta-lactamases, hydroxyacylglutathione hydrolases, AHL (acyl homoserine lactone) lactonases, persulfide dioxygenases, flavodiiron proteins, cleavage and polyadenylation specificity factors such as the Int9 and Int11 subunits of Integrator, Sdsa1-like and AtsA-like arylsulfatases, 5'-exonucleases human SNM1A and yeast Pso2p, ribonuclease J and ribonuclease Z, cyclic nucleotide phosphodiesterases, insecticide hydrolases, and proteins required for natural transformation competence. Classical members of the superfamily are di-, or less commonly mono-, zinc-ion-dependent hydrolases, however the diversity of biological roles is reflected in variations in the active site metallo-chemistry.	215
293813	cd07727	YmaE-like_MBL-fold	uncharacterized subgroup which includes Bacillus subtilis YmaE and related proteins; MBL-fold metallo hydrolase domain. Includes the uncharacterized Bacillus subtilis YmaE and Nostoc all1228 proteins. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. The class B metal beta-lactamases (MBLs) from which this fold was named are only a small fraction of the activities which are included in this superfamily. Activities carried out by superfamily members include class B beta-lactamases, hydroxyacylglutathione hydrolases, AHL (acyl homoserine lactone) lactonases, persulfide dioxygenases, flavodiiron proteins, cleavage and polyadenylation specificity factors such as the Int9 and Int11 subunits of Integrator, Sdsa1-like and AtsA-like arylsulfatases, 5'-exonucleases human SNM1A and yeast Pso2p, ribonuclease J and ribonuclease Z, cyclic nucleotide phosphodiesterases, insecticide hydrolases, and proteins required for natural transformation competence. Classical members of the superfamily are di-, or less commonly mono-, zinc-ion-dependent hydrolases, however the diversity of biological roles is reflected in variations in the active site metallo-chemistry.	181
293814	cd07728	YtnP-like_MBL-fold	Bacillus subtilis YtnP and related proteins; MBL-fold metallo hydrolase domain. Bacillus subtilis YtnP inhibits the signaling pathway required for the streptomycin production and development of aerial mycelium in Streptomyces griseus. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions.	249
293815	cd07729	AHL_lactonase_MBL-fold	quorum-quenching N-acyl-homoserine lactonase, MBL-fold metallo-hydrolase domain. Acyl Homoserine Lactones (also known as AHLs) are signal molecules which coordinate gene expression in quorum sensing, in many Gram-negative bacteria. Quorum-quenching N-acyl-homoserine lactonase (also known as AHL lactonase, N-acyl-L-homoserine lactone hydrolase, EC 3.1.1.81) catalyzes the hydrolysis and opening of the homoserine lactone rings of AHLs, a reaction that can block quorum sensing. These enzymes belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions.	238
293816	cd07730	metallo-hydrolase-like_MBL-fold	uncharacterized subgroup of the MBL-fold_metallo-hydrolase superfamily; MBL-fold metallo hydrolase domain. Members of the MBL-fold metallohydrolase superfamily are mainly hydrolytic enzymes which carry out a variety of biological functions. The class B metal beta-lactamases (MBLs) for which this fold was named perform only a small fraction of the activities included in this superfamily. Activities carried out by superfamily members include class B beta-lactamases, hydroxyacylglutathione hydrolases, AHL (acyl homoserine lactone) lactonases, persulfide dioxygenases, flavodiiron proteins, cleavage and polyadenylation specificity factors such as the Int9 and Int11 subunits of Integrator, Sdsa1-like and AtsA-like arylsulfatases, 5'-exonucleases human SNM1A and yeast Pso2p, ribonuclease J and ribonuclease Z, cyclic nucleotide phosphodiesterases, insecticide hydrolases, and proteins required for natural transformation competence. Classical members of the superfamily are di-, or less commonly mono-, zinc-ion-dependent hydrolases, however the diversity of biological roles is reflected in variations in the active site metallo-chemistry. Some members of this subgroup are annotated as GumP protein.	250
293817	cd07731	ComA-like_MBL-fold	Competence protein ComA, ComEC and related proteins; MBL-fold metallo hydrolase domain. This subgroup includes proteins required for natural transformation competence including Neisseria gonorrhoeae ComA, Pseudomonas stutzeri ComA, Bacillus subtilis ComEC (also known as ComE operon protein 3) and Haemophilus influenzae ORF2 encoded by the rec-2 gene, as well as Escherichia coli YcaI which does not mediate spontaneous plasmid transformation on nutrient-containing agar plates. It also includes the phosphorylcholine esterase (Pce) domain of choline-binding protein E from Streptococcus pneumoniae. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions.	179
293818	cd07732	metallo-hydrolase-like_MBL-fold	uncharacterized subgroup of the MBL-fold_metallo-hydrolase superfamily; MBL-fold metallo hydrolase domain. Includes functionally uncharacterized Enterococcus faecalis EF2904. Members of the MBL-fold metallohydrolase superfamily are mainly hydrolytic enzymes which carry out a variety of biological functions. The class B metal beta-lactamases (MBLs) for which this fold was named perform only a small fraction of the activities included in this superfamily. Activities carried out by superfamily members include class B beta-lactamases, hydroxyacylglutathione hydrolases, AHL (acyl homoserine lactone) lactonases, persulfide dioxygenases, flavodiiron proteins, cleavage and polyadenylation specificity factors such as the Int9 and Int11 subunits of Integrator, Sdsa1-like and AtsA-like arylsulfatases, 5'-exonucleases human SNM1A and yeast Pso2p, ribonuclease J and ribonuclease Z, cyclic nucleotide phosphodiesterases, insecticide hydrolases, and proteins required for natural transformation competence. Classical members of the superfamily are di-, or less commonly mono-, zinc-ion-dependent hydrolases, however the diversity of biological roles is reflected in variations in the active site metallo-chemistry.	202
293819	cd07733	YycJ-like_MBL-fold	uncharacterized subgroup which includes Bacillus subtilis YycJ and related proteins; MBL-fold metallo hydrolase domain. Includes the uncharacterized Bacillus subtilis YycJ protein. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. The class B metal beta-lactamases (MBLs) from which this fold was named are only a small fraction of the activities which are included in this superfamily. Activities carried out by superfamily members include class B beta-lactamases, hydroxyacylglutathione hydrolases, AHL (acyl homoserine lactone) lactonases, persulfide dioxygenases, flavodiiron proteins, cleavage and polyadenylation specificity factors such as the Int9 and Int11 subunits of Integrator, Sdsa1-like and AtsA-like arylsulfatases, 5'-exonucleases human SNM1A and yeast Pso2p, ribonuclease J and ribonuclease Z, cyclic nucleotide phosphodiesterases, insecticide hydrolases, and proteins required for natural transformation competence. Classical members of the superfamily are di-, or less commonly mono-, zinc-ion-dependent hydrolases, however the diversity of biological roles is reflected in variations in the active site metallo-chemistry.	151
293820	cd07734	Int9-11_CPSF2-3-like_MBL-fold	Int9, Int11, CPSF2, CPSF3 and related cleavage and polyadenylation specificity factors; MBL-fold metallo-hydrolase domain. CPSF3 (cleavage and polyadenylation specificity factor subunit 3; also known as cleavage and polyadenylation specificity factor 73 kDa subunit, CPSF-73) and CPSF2 (also known as cleavage and polyadenylation specificity factor 100 kDa subunit /CPSF-100) are components of the CPSF complex, which plays a role in 3' end processing of pre-mRNAs during cleavage/polyadenylation, and during processing of metazoan histone pre-mRNAs. CPSF3 functions as a 3' endonuclease. Int11 (also known as cleavage and polyadenylation-specific factor (CPSF) 3-like protein, and protein related to CPSF subunits of 68 kDa (RC-68)), and Int9, also known as protein related to CPSF subunits of 74 kDa (RC-74) are subunits of Integrator, a metazoan-specific multifunctional protein complex composed of 14 subunits. Integrator has been implicated in a variety of Pol II transcription events including 3' end processing of snRNA, transcription initiation, promoter-proximal pausing, termination of protein-coding transcripts, and in HVS pre-miRNA 3' end processing. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions.	193
293821	cd07735	class_II_PDE_MBL-fold	class II cyclic nucleotide phosphodiesterases Saccharomyces cerevisiae PDE1, Dictyostelium discoideum PDE1 and PDE7, and related proteins; MBL-fold metallo-hydrolase domain. Cyclic nucleotide phosphodiesterases (PDEs) decompose the second messengers cyclic adenosine and guanosine 3',5'-monophosphate (cAMP and cGMP, respectively). Saccharomyces cerevisiae PDE1 and Dictyostelium discoideum PDE1 and PDE7, have dual cAMP/cGMP specificity. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions.	259
293822	cd07736	PhnP-like_MBL-fold	phosphodiesterase Escherichia coli PhnP and related proteins; MBL-fold metallo hydrolase domain. Escherichia coli PhnP catalyzes the hydrolysis of 5-phospho-D-ribose-1,2-cyclic phosphate to D-ribose-1,5-bisphosphate, a step in the C-P lyase pathway. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions.	186
293823	cd07737	YcbL-like_MBL-fold	Salmonella enterica serovar typhimurium YcbL and related proteins; MBL-fold metallo hydrolase domain. This subgroup includes Salmonella enterica serovar typhimurium YcbL which has type II hydroxyacylglutathione hydrolase (EC 3.1.2.6, also known as glyoxalase II) activity, and has a single metal ion binding site, and Thermus thermophilus TTHA1623 which does not have GLX2 activity and has two metal ion binding sites with a glyoxalase II-type metal coordination. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions.	190
293824	cd07738	DdPDE5-like_MBL-fold	Dictyostelium discoideum phosphodiesterase 5 and related proteins; MBL-fold metallo hydrolase domain. Includes Dictyostelium discoideum cAMP/cGMP-dependent 3',5'-cAMP/cGMP phosphodiesterase A (also known as cyclic GMP-binding protein A, phosphodiesterase 5, phosphodiesterase D, and PDE5) and cAMP/cGMP-dependent 3',5'-cAMP/cGMP phosphodiesterase B (also known as cyclic GMP-binding protein B, phosphodiesterase 6, phosphodiesterase E, and PDE6). Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions.	189
293825	cd07739	metallo-hydrolase-like_MBL-fold	uncharacterized subgroup of the MBL-fold_metallo-hydrolase superfamily; MBL-fold metallo hydrolase domain. Members of the MBL-fold metallohydrolase superfamily are mainly hydrolytic enzymes which carry out a variety of biological functions. The class B metal beta-lactamases (MBLs) from which this fold was named are only a small fraction of the activities which are included in this superfamily. Activities carried out by superfamily members include class B beta-lactamases, hydroxyacylglutathione hydrolases, AHL (acyl homoserine lactone) lactonases, persulfide dioxygenases, flavodiiron proteins, cleavage and polyadenylation specificity factors such as the Int9 and Int11 subunits of Integrator, Sdsa1-like and AtsA-like arylsulfatases, 5'-exonucleases human SNM1A and yeast Pso2p, ribonuclease J and ribonuclease Z, cyclic nucleotide phosphodiesterases, insecticide hydrolases, and proteins required for natural transformation competence. Classical members of the superfamily are di-, or less commonly mono-, zinc-ion-dependent hydrolases, however the diversity of biological roles is reflected in variations in the active site metallo-chemistry.	201
293826	cd07740	metallo-hydrolase-like_MBL-fold	uncharacterized subgroup of the MBL-fold_metallo-hydrolase superfamily; MBL-fold metallo hydrolase domain. Members of the MBL-fold metallohydrolase superfamily are mainly hydrolytic enzymes which carry out a variety of biological functions. The class B metal beta-lactamases (MBLs) from which this fold was named are only a small fraction of the activities which are included in this superfamily. Activities carried out by superfamily members include class B beta-lactamases, hydroxyacylglutathione hydrolases, AHL (acyl homoserine lactone) lactonases, persulfide dioxygenases, flavodiiron proteins, cleavage and polyadenylation specificity factors such as the Int9 and Int11 subunits of Integrator, Sdsa1-like and AtsA-like arylsulfatases, 5'-exonucleases human SNM1A and yeast Pso2p, ribonuclease J and ribonuclease Z, cyclic nucleotide phosphodiesterases, insecticide hydrolases, and proteins required for natural transformation competence. Classical members of the superfamily are di-, or less commonly mono-, zinc-ion-dependent hydrolases, however the diversity of biological roles is reflected in variations in the active site metallo-chemistry.	194
293827	cd07741	metallo-hydrolase-like_MBL-fold	uncharacterized subgroup of the MBL-fold_metallo-hydrolase superfamily; MBL-fold metallo hydrolase domain. Members of the MBL-fold metallohydrolase superfamily are mainly hydrolytic enzymes which carry out a variety of biological functions. The class B metal beta-lactamases (MBLs) from which this fold was named are only a small fraction of the activities which are included in this superfamily. Activities carried out by superfamily members include class B beta-lactamases, hydroxyacylglutathione hydrolases, AHL (acyl homoserine lactone) lactonases, persulfide dioxygenases, flavodiiron proteins, cleavage and polyadenylation specificity factors such as the Int9 and Int11 subunits of Integrator, Sdsa1-like and AtsA-like arylsulfatases, 5'-exonucleases human SNM1A and yeast Pso2p, ribonuclease J and ribonuclease Z, cyclic nucleotide phosphodiesterases, insecticide hydrolases, and proteins required for natural transformation competence. Classical members of the superfamily are di-, or less commonly mono-, zinc-ion-dependent hydrolases, however the diversity of biological roles is reflected in variations in the active site metallo-chemistry.	212
293828	cd07742	metallo-hydrolase-like_MBL-fold	uncharacterized subgroup of the MBL-fold_metallo-hydrolase superfamily; MBL-fold metallo hydrolase domain. Members of the MBL-fold metallohydrolase superfamily are mainly hydrolytic enzymes which carry out a variety of biological functions. The class B metal beta-lactamases (MBLs) from which this fold was named are only a small fraction of the activities which are included in this superfamily. Activities carried out by superfamily members include class B beta-lactamases, hydroxyacylglutathione hydrolases, AHL (acyl homoserine lactone) lactonases, persulfide dioxygenases, flavodiiron proteins, cleavage and polyadenylation specificity factors such as the Int9 and Int11 subunits of Integrator, Sdsa1-like and AtsA-like arylsulfatases, 5'-exonucleases human SNM1A and yeast Pso2p, ribonuclease J and ribonuclease Z, cyclic nucleotide phosphodiesterases, insecticide hydrolases, and proteins required for natural transformation competence. Classical members of the superfamily are di-, or less commonly mono-, zinc-ion-dependent hydrolases, however the diversity of biological roles is reflected in variations in the active site metallo-chemistry.	249
293829	cd07743	metallo-hydrolase-like_MBL-fold	uncharacterized subgroup of the MBL-fold_metallo-hydrolase superfamily; MBL-fold metallo hydrolase domain. Members of the MBL-fold metallohydrolase superfamily are mainly hydrolytic enzymes which carry out a variety of biological functions. The class B metal beta-lactamases (MBLs) from which this fold was named are only a small fraction of the activities which are included in this superfamily. Activities carried out by superfamily members include class B beta-lactamases, hydroxyacylglutathione hydrolases, AHL (acyl homoserine lactone) lactonases, persulfide dioxygenases, flavodiiron proteins, cleavage and polyadenylation specificity factors such as the Int9 and Int11 subunits of Integrator, Sdsa1-like and AtsA-like arylsulfatases, 5'-exonucleases human SNM1A and yeast Pso2p, ribonuclease J and ribonuclease Z, cyclic nucleotide phosphodiesterases, insecticide hydrolases, and proteins required for natural transformation competence. Classical members of the superfamily are di-, or less commonly mono-, zinc-ion-dependent hydrolases, however the diversity of biological roles is reflected in variations in the active site metallo-chemistry.	197
143394	cd07749	NT_Pol-beta-like_1	Nucleotidyltransferase (NT) domain of an uncharacterized subgroup of the Pol beta-like NT superfamily. The Pol beta-like NT superfamily includes DNA polymerase beta and other family X DNA Polymerases, as well as Class I and Class II CCA-adding enzymes, RelA- and SpoT-like ppGpp synthetases and hydrolases, 2'5'-oligoadenylate (2-5A) synthetases, Escherichia coli adenylyltransferase (GlnE), Escherichia coli uridylyl transferase (GlnD), poly(A) polymerases, terminal uridylyl transferases, Staphylococcus aureus kanamycin nucleotidyltransferase, and similar proteins. Proteins belonging to this subgroup are uncharacterized. In the majority of the Pol beta-like superfamily NTs, two carboxylates, Dx[D/E], together with a third more distal carboxylate, coordinate two divalent metal cations essential for catalysis. These divalent metal ions are involved in a two-metal ion mechanism of nucleotide addition. These carboxylate residues are conserved in this subgroup.	156
143622	cd07750	PolyPPase_VTC_like	Polyphosphate (polyP) polymerase domain of yeast vacuolar transport chaperone (VTC) proteins VTC-2, -3, and -4, and similar proteins. Saccharomyces cerevisiae VTC-1, -2, -3, and -4 comprise the membrane-integral VTC complex. VTC-2, -3, and -4 contain polyP polymerase domains. For S. cerevisiae VTC4 it has been shown that this domain generates polyP from ATP by a phosphotransfer reaction releasing ADP. This activity is metal ion-dependent. The ATP gamma phosphate may be cleaved and then transferred to an acceptor phosphate to form polyP. PolyP is ubiquitous. In prokaryotes, it is a store of phosphate and energy. In eukaryotes, polyPs have roles in bone calcification, and osmoregulation, and in phosphate transport in the symbiosis of mycorrhizal fungi and plants. This subgroup belongs to the CYTH/triphosphate tunnel metalloenzyme (TTM)-like superfamily, whose enzymes have a unique active site located within an eight-stranded beta barrel.	214
143623	cd07751	PolyPPase_VTC4_like	Polyphosphate (polyP) polymerase domain of yeast vacuolar transport chaperone (VTC) protein VTC4, and similar proteins. Saccharomyces cerevisiae VTC-1, -2, -3, and -4 comprise the membrane-integral VTC complex. VTC-2, -3, and -4 contain polyP polymerase domains. S. cerevisiae VTC4 belongs to this subgroup. For VTC4 it has been shown that this domain generates polyP from ATP by a phosphotransfer reaction releasing ADP. This activity is metal ion-dependent. The ATP gamma phosphate may be cleaved and then transferred to an acceptor phosphate to form polyP. PolyP is ubiquitous. In prokaryotes, it is a store of phosphate and energy. In eukaryotes, polyPs have roles in bone calcification, and osmoregulation, and in phosphate transport in the symbiosis of mycorrhizal fungi and plants. This subgroup belongs to the CYTH/triphosphate tunnel metalloenzyme (TTM)-like superfamily, whose enzymes have a unique active site located within an eight-stranded beta barrel.	290
143624	cd07756	CYTH-like_Pase_CHAD	Uncharacterized subgroup of the CYTH-like superfamily having an associated CHAD domain. This subgroup belongs to the CYTH-like (also known as triphosphate tunnel metalloenzyme (TTM)-like) superfamily. Members of this superfamily hydrolyze triphosphate-containing substrates, require metal cations as cofactors, and have a unique active site located at the center of an eight-stranded antiparallel beta barrel tunnel (the triphosphate tunnel). A number of proteins in this subgroup also contain a C-terminal CHAD (Conserved Histidine Alpha-helical Domain), which may participate in metal chelation or act as a phosphoacceptor. The name CYTH originated from the gene designation for bacterial class IV adenylyl cyclases (CyaB) and from thiamine triphosphatase. Class IV adenylate cyclases catalyze the conversion of ATP to 3',5'-cyclic AMP (cAMP) and PPi. Thiamine triphosphatase is a soluble cytosolic enzyme which converts thiamine triphosphate to thiamine diphosphate. This domain superfamily also contains RNA triphosphatases, membrane-associated polyphosphate polymerases, tripolyphosphatases, nucleoside triphosphatases, nucleoside tetraphosphatases and other proteins with unknown functions. Proteins of this subgroup have not been characterized.	197
143625	cd07758	ThTPase	Thiamine Triphosphatase. ThTPase is a soluble cytosolic enzyme which converts thiamine triphosphate (ThTP) to thiamine diphosphate. This catalytic activity depends on a divalent metal cofactor, for example Mg++. ThTPase regulates the intracellular concentration of ThTP, maintaining it at a low concentration in vivo. ThTP acts as a messenger in cell signaling in response to cellular stress, and in addition, can phosphorylate proteins in certain tissues. There is another class of membrane-associated enzymes in animal tissues which also convert ThTP to thiamine diphosphate; however, they do not belong to this subgroup. This subgroup belongs to the CYTH/triphosphate tunnel metalloenzyme (TTM)-like superfamily, whose enzymes have a unique active site located within an eight-stranded beta barrel.	196
143626	cd07761	CYTH-like_CthTTM-like	Clostridium thermocellum (Cth)TTM and similar proteins, a subgroup of the CYTH-like superfamily. CthTTM is a metal dependent tripolyphosphatase, nucleoside triphosphatase, and nucleoside tetraphosphatase. It hydrolyzes the beta-gamma phosphoanhydride linkage of triphosphate-containing substrates including tripolyphosphate, nucleoside triphosphates and nucleoside tetraphosphates. These substrates are hydrolyzed, releasing Pi. Mg++ or Mn++ are required for the enzyme's activity. CthTTM appears to have no adenylate cyclase activity. This subgroup consists chiefly of bacterial sequences. Members of the CYTH-like (also known as triphosphate tunnel metalloenzyme (TTM)-like) superfamily have a unique active site located within an eight-stranded beta barrel.	146
143627	cd07762	CYTH-like_Pase_1	Uncharacterized subgroup 1 of the CYTH-like superfamily. Enzymes belonging to the CYTH-like (also known as triphosphate tunnel metalloenzyme (TTM)-like) superfamily hydrolyze triphosphate-containing substrates, require metal cations as cofactors, and have a unique active site located at the center of an eight-stranded antiparallel beta barrel tunnel (the triphosphate tunnel). The name CYTH originated from the gene designation for bacterial class IV adenylyl cyclases (CyaB) and from thiamine triphosphatase. Class IV adenylate cyclases catalyze the conversion of ATP to 3',5'-cyclic AMP (cAMP) and PPi. Thiamine triphosphatase is a soluble cytosolic enzyme which converts thiamine triphosphate to thiamine diphosphate. This domain superfamily also contains RNA triphosphatases, membrane-associated polyphosphate polymerases, tripolyphosphatases, nucleoside triphosphatases, nucleoside tetraphosphatases and other proteins with unknown functions. Proteins of this subgroup are of bacterial origin and have not been characterized.	180
143639	cd07765	KRAB_A-box	KRAB (Kruppel-associated box) domain A-box. The KRAB domain is a transcription repression module, found in a subgroup of the zinc finger proteins (ZFPs) of the C2H2 family, KRAB-ZFPs. KRAB-ZFPs comprise the largest group of transcriptional regulators in mammals, and are only found in tetrapods. These proteins have been shown to play important roles in cell differentiation and organ development, and in regulating viral replication and transcription. A KRAB domain may consist of an A-box, or of an A-box plus either a B-box, a divergent B-box (b), or a C-box. Only the A-box is included in this model. The A-box is needed for repression; the B- and C-boxes are not. KRAB-ZFPs have one or two KRAB domains at their amino-terminal end, and multiple C2H2 zinc finger motifs at their C-termini. Some KRAB-ZFPs also contain a SCAN domain which mediates homo- and hetero-oligomerization. The KRAB domain is a protein-protein interaction module which represses transcription through recruiting corepressors. A key mechanism appears to be the following: KRAB-ZFPs tethered to DNA recruit, via their KRAB domain, the repressor KAP1 (KRAB-associated protein-1, also known as transcription intermediary factor 1 beta, KRAB-A interacting protein, and tripartite motif protein 28). The KAP1/KRAB-ZFP complex in turn recruits the heterochromatin protein 1 (HP1) family, and other chromatin modulating proteins, leading to transcriptional repression through heterochromatin formation.	40
341447	cd07766	DHQ_Fe-ADH	Dehydroquinate synthase-like (DHQ-like) and iron-containing alcohol dehydrogenases (Fe-ADH). This superfamily consists of two subgroups: the dehydroquinate synthase (DHQS)-like group, and a large group of metal-containing alcohol dehydrogenases (ADHs), known as iron-containing alcohol dehydrogenases. Dehydroquinate synthase (DHQS) catalyzes the conversion of 3-deoxy-D-arabino-heptulosonate-7-phosphate (DAHP) to dehydroquinate (DHQ) in the second step of the shikimate pathway. This pathway involves seven sequential enzymatic steps in the conversion of erythrose 4-phosphate and phosphoenolpyruvate into chorismate for subsequent synthesis of aromatic compounds. The dehydroquinate synthase-like group includes dehydroquinate synthase, 2-deoxy-scyllo-inosose synthase, and 2-epi-5-epi-valiolone synthase. The alcohol dehydrogenases (ADHs) in this superfamily contain a dehydroquinate synthase-like protein structural fold and mostly contain iron. They are distinct from other alcohol dehydrogenases, which contain different protein domains. There are several distinct families of alcohol dehydrogenases: zinc-containing long-chain alcohol dehydrogenases; insect-type, or short-chain alcohol dehydrogenases; iron-containing alcohol dehydrogenases; and others. The iron-containing family has a Rossmann fold-like topology that resembles the fold of the zinc-dependent alcohol dehydrogenases, but lacks sequence homology, and differs in strand arrangement. ADH catalyzes the reversible oxidation of alcohol to acetaldehyde with the simultaneous reduction of NAD(P)+ to NAD(P)H.	271
163686	cd07767	MPN	Mpr1p, Pad1p N-terminal (MPN) domains. MPN (also known as Mov34, PAD-1, JAMM, JAB, MPN+) domains are found in the N-termini of proteins with a variety of functions; they are components of the proteasome regulatory subunits, the signalosome (CSN), eukaryotic translation initiation factor 3 (eIF3) complexes, and regulators of transcription factors. These domains are isopeptidases that release ubiquitin from ubiquitinated proteins (thus having deubiquitinating (DUB) activity) that are tagged for degradation. Catalytically active MPN domains contain a metalloprotease signature known as the JAB1/MPN/Mov34 metalloenzyme (JAMM) motif. For example, Rpn11 (also known as POH1 or PSMD14), a subunit of the 19S proteasome lid involved in the ATP-dependent degradation of ubiquitinated proteins, contains the conserved JAMM motif involved in zinc ion coordination. Poh1 is a regulator of c-Jun, an important regulator of cell proliferation, differentiation, survival and death. JAB1 is a component of the COP9 signalosome (CSN), a regulatory particle of the ubiquitin (Ub)/26S proteasome system occurring in all eukaryotic cells; it cleaves the ubiquitin-like protein NEDD8 from the cullin subunit of the SCF (Skp1, Cullins, F-box proteins) family of E3 ubiquitin ligases. AMSH (associated molecule with the SH3 domain of STAM, also known as STAMBP), a member of JAMM/MPN+ deubiquitinases (DUBs), specifically cleaves Lys 63-linked polyubiquitin (poly-Ub) chains, thus facilitating the recycling and subsequent trafficking of receptors to the cell surface. Similarly, BRCC36, part of the nuclear complex that includes BRCA1 protein and is targeted to DNA damage foci after irradiation, specifically disassembles K63-linked polyUb. BRCC36 is aberrantly expressed in sporadic breast tumors, indicative of a potential role in the pathogenesis of the disease. Some variants of the JAB1/MPN domains lack key residues in their JAMM motif and are unable to coordinate a metal ion. Comparisons of key catalytic and metal binding residues explain why the MPN-containing proteins Mov34/PSMD7, Rpn8, CSN6, Prp8p, and the translation initiation factor 3 subunits f (p47) and h (p40) do not show catalytic isopeptidase activity. It has been proposed that the MPN domain in these proteins has a primarily structural function.	116
198346	cd07768	FGGY_RBK_like	Ribulokinase-like carbohydrate kinases; a subfamily of the FGGY family of carbohydrate kinases. This subfamily is composed of ribulokinases (RBKs) and similar proteins from bacteria and eukaryota. RBKs catalyze the MgATP-dependent phosphorylation of a variety of sugar substrates including L- and/or D-ribulose. Members of this subfamily contain two large domains separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. Members of this subfamily belong to the FGGY family of carbohydrate kinases.	465
198347	cd07769	FGGY_GK	Glycerol kinases; a subfamily of the FGGY family of carbohydrate kinases. This subfamily includes glycerol kinases (GK; EC 2.7.1.30) and glycerol kinase-like proteins from all three kingdoms of living organisms. Glycerol is an important intermediate of energy metabolism and it plays fundamental roles in several vital physiological processes. GKs are involved in the entry of external glycerol into cellular metabolism. They catalyze the rate-limiting step in glycerol metabolism by transferring a phosphate from ATP to glycerol thus producing glycerol 3-phosphate (G3P) in the cytoplasm. Human GK deficiency, called hyperglycerolemia, is an X-linked recessive trait associated with psychomotor retardation, osteoporosis, spasticity, esotropia, and bone fractures. Under different conditions, GKs from different species may exist in different oligomeric states. The monomer of GKs is composed of two large domains separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. The high affinity ATP binding site of GKs is created only by a substrate-induced conformational change. Based on sequence similarity, some GK-like proteins from metazoa, which have lost their GK enzymatic activity, are also included in this CD. Members in this subfamily belong to the FGGY family of carbohydrate kinases.	484
212659	cd07770	FGGY_GntK	Gluconate kinases; a subfamily of the FGGY family of carbohydrate kinases. This subfamily is composed of a group of gluconate kinases (GntK, also known as gluconokinase; EC 2.7.1.12) encoded by the gntK gene, which catalyze the ATP-dependent phosphorylation of D-gluconate to produce 6-phospho-D-gluconate and ADP. The presence of Mg2+ might be required for catalytic activity. The prototypical member of this subfamily is GntK from Lactobacillus acidophilus. Unlike Escherichia coli GntK, which belongs to the superfamily of P-loop containing nucleoside triphosphate hydrolases, members in this subfamily are homologous to glycerol kinase, xylulose kinase, and rhamnulokinase from Escherichia coli. They have been classified as members of the FGGY family of carbohydrate kinases, which contain two large domains separated by a deep cleft that forms the active site. This model spans both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. Some uncharacterized homologous sequences are also included in this subfamily. The Lactobacillus gnt operon contains a single gntK gene. The gnt operons of some bacteria, such as Corynebacterium glutamicum, have two gntK genes. For example, the C. glutamicum gnt operon has both a gluconate kinase gntV gene (also known as gntK) and a second hypothetical gntK gene (also known as gntK2). Both gluconate kinases encoded by these genes belong to this family; however, the protein encoded by C. glutamicum gntV is not included in this model as it is truncated in the C-terminal domain.	440
198349	cd07771	FGGY_RhuK	L-rhamnulose kinases; a subfamily of the FGGY family of carbohydrate kinases. This subfamily is predominantly composed of bacterial L-rhamnulose kinases (RhuK, also known as rhamnulokinase; EC 2.7.1.5), which are encoded by the rhaB gene and catalyze the ATP-dependent phosphorylation of L-rhamnulose to produce L-rhamnulose-1-phosphate and ADP. Some uncharacterized homologous sequences are also included in this subfamily. The prototypical member of this subfamily is Escherichia coli RhuK, which exists as a monomer composed of two large domains. The ATP binding site is located in the cleft between the two domains. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. The presence of divalent Mg2+ or Mn2+ is required for catalysis. Although an intramolecular disulfide bridge is present in RhuK, disulfide formation is not important to the regulation of RhuK enzymatic activity. Members of this subfamily belong to the FGGY family of carbohydrate kinases.	440
198350	cd07772	FGGY_NaCK_like	Novosphingobium aromaticivorans carbohydrate kinase-like proteins; belongs to the FGGY family of carbohydrate kinases. This subfamily is predominantly composed of uncharacterized bacterial proteins with similarity to carbohydrate kinase from Novosphingobium aromaticivorans (NaCK). These proteins may catalyze the transfer of a phosphate group from ATP to their carbohydrate substrates. They belong to the FGGY family of carbohydrate kinases, the monomers of which contain two large domains, which are separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain.	419
198351	cd07773	FGGY_FK	L-fuculose kinases; a subfamily of the FGGY family of carbohydrate kinases. This subfamily is composed of bacterial L-fuculose kinases (FK, also known as fuculokinase, EC 2.7.1.51), which catalyze the ATP-dependent phosphorylation of L-fuculose to produce L-fuculose-1-phosphate and ADP. The presence of Mg2+ or Mn2+ is required for enzymatic activity. FKs belong to the FGGY family of carbohydrate kinases, the monomers of which contain two large domains, which are separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain.	448
198352	cd07774	FGGY_1	uncharacterized subgroup; belongs to the FGGY family of carbohydrate kinases. This subfamily is composed of uncharacterized carbohydrate kinases. They are sequence homologous to bacterial glycerol kinase and have been classified as members of the FGGY family of carbohydrate kinases. The monomers of FGGY proteins contain two large domains, which are separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain.	430
198353	cd07775	FGGY_AI-2K	Autoinducer-2 kinases; a subfamily of the FGGY family of carbohydrate kinases. This subfamily is composed of bacterial autoinducer-2 (AI-2) kinases and similar proteins. AI-2 is a small chemical quorum-sensing signal involved in interspecies communication in bacteria. Cytoplasmic autoinducer-2 kinase, encoded by the lsrK gene from Salmonella enterica serovar Typhimurium lsr (luxS regulated) operon, is the prototypical member of this subfamily. AI-2 kinase catalyzes the phosphorylation of intracellular AI-2 to phospho-AI-2, which leads to the inactivation of lsrR, the repressor of the lsr operon. Members of this family are homologs of glycerol kinase-like proteins and belong to the FGGY family of carbohydrate kinases, the monomers of which contain two large domains, which are separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain.	452
212660	cd07776	FGGY_D-XK_euk	eukaryotic D-xylulose kinases; a subfamily of the FGGY family of carbohydrate kinases. This subfamily is composed of eukaryotic D-xylulose kinases (XK, also known as xylulokinase; EC 2.7.1.17), which catalyze the rate-limiting step in the ATP-dependent phosphorylation of D-xylulose to produce D-xylulose 5-phosphate (X5P) and ADP. They belong to the FGGY family of carbohydrate kinases, the monomers of which contain two large domains, which are separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. Members of this subfamily are similar to bacterial D-XKs, which exist as dimers with active sites that lie at the interface between two large domains. The presence of Mg2+ or Mn2+ is required for catalytic activity.	480
212661	cd07777	FGGY_SHK_like	sedoheptulokinase-like proteins; a subfamily of the FGGY family of carbohydrate kinases. This subfamily is predominantly composed of uncharacterized bacterial and eukaryotic proteins with similarity to human sedoheptulokinase (SHK, also known as D-altro-heptulose or heptulokinase, EC 2.7.1.14) encoded by the carbohydrate kinase-like (CARKL/SHPK) gene. SHK catalyzes the ATP-dependent phosphorylation of sedoheptulose to produce sedoheptulose 7-phosphate and ADP. The presence of Mg2+ or Mn2+ might be required for catalytic activity. Members of this subfamily belong to the FGGY family of carbohydrate kinases, the monomers of which contain two large domains, which are separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain.	448
198356	cd07778	FGGY_L-RBK_like	L-ribulokinase-like proteins; a subfamily of the FGGY family of carbohydrate kinases. This subfamily is composed of a group of putative bacterial L-ribulokinases (RBK; EC 2.7.1.16) and similar proteins. L-RBK catalyzes the MgATP-dependent phosphorylation of a variety of sugar substrates. Members of this subfamily belong to the FGGY family of carbohydrate kinases, the monomers of which contain two large domains, which are separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain.	466
212662	cd07779	FGGY_ygcE_like	uncharacterized ygcE-like proteins. This subfamily consists of uncharacterized hypothetical bacterial proteins with similarity to Escherichia coli sugar kinase ygcE, whose functional roles are not yet clear. Escherichia coli ygcE is recognized by this model, but is not present in the alignment as it contains a deletion relative to other members of the group. These proteins belong to the FGGY family of carbohydrate kinases, the monomers of which contain two large domains, which are separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain.	488
198358	cd07781	FGGY_RBK	Ribulokinases; belongs to the FGGY family of carbohydrate kinases. This subgroup is predominantly composed of bacterial ribulokinases (RBK) which catalyze the MgATP-dependent phosphorylation of L(or D)-ribulose to produce L(or D)-ribulose 5-phosphate and ADP. RBK also phosphorylates a variety of other sugar substrates including ribitol and arabitol. The reason why L-RBK can phosphorylate so many different substrates is not yet clear. The presence of Mg2+ is required for catalytic activity. This group belongs to the FGGY family of carbohydrate kinases, the monomers of which contain two large domains, which are separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain.	498
212663	cd07782	FGGY_YpCarbK_like	Yersinia pseudotuberculosis carbohydrate kinase-like subgroup; belongs to the FGGY family of carbohydrate kinases. This subgroup is composed of the uncharacterized Yersinia pseudotuberculosis carbohydrate kinase that has been named glycerol/xylulose kinase and similar uncharacterized proteins from bacteria and eukaryota. Carbohydrate kinases catalyze the ATP-dependent phosphorylation of their carbohydrate substrate to produce phosphorylated sugar and ADP. The presence of Mg2+ is required for catalytic activity. This subgroup shows high homology to characterized ribulokinases and belongs to the FGGY family of carbohydrate kinases, the monomers of which contain two large domains, which are separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain.	536
198360	cd07783	FGGY_CarbK-RPE_like	Carbohydrate kinase and ribulose-phosphate 3-epimerase fusion proteins-like; belongs to the FGGY family of carbohydrate kinases. This subgroup is composed of uncharacterized proteins with similarity to carbohydrate kinases. Some members are carbohydrate kinase and ribulose-phosphate 3-epimerase fusion proteins. Carbohydrate kinases catalyze the ATP-dependent phosphorylation of their carbohydrate substrate to produce phosphorylated sugar and ADP. The presence of Mg2+ is required for catalytic activity. This subgroup shows high homology to characterized ribulokinases and belongs to the FGGY family of carbohydrate kinases, the monomers of which contain two large domains, which are separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain.	484
198361	cd07786	FGGY_EcGK_like	Escherichia coli glycerol kinase-like proteins; belongs to the FGGY family of carbohydrate kinases. This subgroup is composed of mostly bacterial and archaeal glycerol kinases (GK), including the well characterized proteins from Escherichia coli (EcGK), Thermococcus kodakaraensis (TkGK), and Enterococcus casseliflavus (EnGK). GKs contain two large domains separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. The high affinity ATP binding site of EcGK is created only by a substrate-induced conformational change, which is initiated by protein-protein interactions through complex formation with enzyme IIAGlc (also known as IIIGlc), the glucose-specific phosphocarrier protein of the phosphotransferase system (PTS). EcGK exists in a dimer-tetramer equilibrium. IIAGlc binds to both EcGK dimer and tetramer, and inhibits the uptake and subsequent metabolism of glycerol and maltose. Another well-known allosteric regulator of EcGK is fructose 1,6-bisphosphate (FBP), which binds to the EcGK tetramer and plays an essential role in the stabilization of the inactive tetrameric form. EcGK requires Mg2+ for its enzymatic activity. Members in this subgroup belong to the FGGY family of carbohydrate kinases.	486
198362	cd07789	FGGY_CsGK_like	Cellulomonas sp. glycerol kinase-like proteins; belongs to the FGGY family of carbohydrate kinases. This subgroup corresponds to a small group of bacterial glycerol kinases (GK) with similarity to Cellulomonas sp. glycerol kinase (CsGK). CsGK might exist as a dimer. Its monomer is composed of two large domains separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. The regulation of the catalytic activity of this group has not yet been examined. Members in this subgroup belong to the FGGY family of carbohydrate kinases.	495
198363	cd07791	FGGY_GK2_bacteria	bacterial glycerol kinase 2-like proteins; belongs to the FGGY family of carbohydrate kinases. This subgroup corresponds to a group of putative bacterial glycerol kinases (GK), which may be coded by the GK-like gene, GK2. Sequence comparison shows members in this CD are homologs of Escherichia coli GK. They retain all functionally important residues, and may catalyze the Mg-ATP dependent phosphorylation of glycerol to yield glycerol 3-phosphate (G3P). GKs belong to the FGGY family of carbohydrate kinases, the monomers of which contain two large domains, which are separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain.	484
212664	cd07792	FGGY_GK1-3_metazoa	Metazoan glycerol kinase 1 and 3-like proteins; belongs to the FGGY family of carbohydrate kinases. This subgroup corresponds to a group of metazoan glycerol kinases (GKs), coded by X chromosome-linked GK genes, and glycerol kinase (GK)-like proteins, coded by autosomal testis-specific GK-like genes (GK-like genes, GK1 and GK3). Sequence comparison shows that metazoan GKs and GK-like proteins in this family are closely related to the bacterial GKs, which catalyze the Mg-ATP dependent phosphorylation of glycerol to yield glycerol 3-phosphate (G3P). The metazoan GKs do have GK enzymatic activity. However, the GK-like metazoan proteins do not exhibit GK activity and their biological functions are not yet clear. Some of them lack important functional residues involved in the binding of ADP and Mg2+, which may result in the loss of GK catalytic function. Others that have conserved catalytic residues have lost their GK activity as well; the reason remains unclear. It has been suggested the conserved catalytic residues might facilitate them performing a distinct function. GKs belong to the FGGY family of carbohydrate kinases, the monomers of which contain two large domains, which are separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain.	504
212665	cd07793	FGGY_GK5_metazoa	metazoan glycerol kinase 5-like proteins; belongs to the FGGY family of carbohydrate kinases. This subgroup corresponds to a group of metazoan putative glycerol kinases (GK), which may be coded by the GK-like gene, GK5. Sequence comparison shows members of this group are homologs of bacterial GKs, and they retain all functionally important residues. However, GK-like proteins in this family do not have detectable GK activity. The reason remains unclear. It has been suggested that the conserved catalytic residues might facilitate them performing a distinct function. GKs belong to the FGGY family of carbohydrate kinases, the monomers of which contain two large domains, which are separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain.	504
198366	cd07794	FGGY_GK_like_proteobact	Proteobacterial glycerol kinase-like proteins; belongs to the FGGY family of carbohydrate kinases. This subgroup corresponds to a small group of proteobacterial glycerol kinase (GK)-like proteins, including the glycerol kinase from Pseudomonas aeruginosa. Most bacteria, such as Escherichia coli, take up glycerol passively by facilitated diffusion. In contrast, P. aeruginosa may also utilize a binding protein-dependent active transport system to mediate glycerol transportation. The glycerol kinase subsequently phosphorylates the intracellular glycerol to glycerol 3-phosphate (G3P). GKs belong to the FGGY family of carbohydrate kinases, the monomers of which contain two large domains, which are separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain.	470
198367	cd07795	FGGY_ScGut1p_like	Saccharomyces cerevisiae Gut1p and related proteins; belongs to the FGGY family of carbohydrate kinases. This subgroup corresponds to a small group of fungal glycerol kinases (GK), including Saccharomyces cerevisiae Gut1p/YHL032Cp, which phosphorylates glycerol to glycerol-3-phosphate in the cytosol. Glycerol can serve as the sole source of carbon and energy in S. cerevisiae; its utilization is mediated by glycerol kinase and by glycerol 3-phosphate dehydrogenase, which is encoded by the GUT2 gene. Members in this family show high similarity to their prokaryotic and eukaryotic homologs. GKs belong to the FGGY family of carbohydrate kinases, the monomers of which contain two large domains, which are separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain.	496
198368	cd07796	FGGY_NHO1_plant	Arabidopsis NHO1 and related proteins; belongs to the FGGY family of carbohydrate kinases. This subgroup includes Arabidopsis NHO1 (also known as NONHOST1, or non-host resistance 1) and other putative plant glycerol kinases, which share strong homology with glycerol kinases from bacteria, fungi, and animals. Nonhost resistance of plants refers to the phenomenon observed when all members of a plant species are typically resistant to a specific parasite. NHO1 is required for nonspecific resistance to nonhost Pseudomonas bacteria; it is also required for resistance to the fungal pathogen Botrytis cinerea. This subgroup belongs to the FGGY family of carbohydrate kinases, the monomers of which contain two large domains, which are separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain.	503
198369	cd07798	FGGY_AI-2K_like	Autoinducer-2 kinase-like proteins; belongs to the FGGY family of carbohydrate kinases. This subgroup consists of uncharacterized hypothetical bacterial proteins with similarity to bacterial autoinducer-2 (AI-2) kinases, which catalyze the phosphorylation of intracellular AI-2 to phospho-AI-2, leading to the inactivation of lsrR, the repressor of the lsr operon. Members of this subgroup belong to the FGGY family of carbohydrate kinases, the monomers of which contain two large domains, which are separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain.	437
212666	cd07802	FGGY_L-XK	L-xylulose kinases; a subfamily of the FGGY family of carbohydrate kinases. This subfamily is composed of bacterial L-xylulose kinases (L-XK, also known as L-xylulokinase; EC 2.7.1.53), which catalyze the ATP-dependent phosphorylation of L-xylulose to produce L-xylulose 5-phosphate and ADP. The presence of Mg2+ might be required for catalytic activity. Some uncharacterized sequences are also included in this subfamily. L-XKs belong to the FGGY family of carbohydrate kinases, the monomers of which contain two large domains, which are separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain.	447
198371	cd07803	FGGY_D-XK	D-xylulose kinases; a subgroup of the FGGY family of carbohydrate kinases. This subfamily is predominantly composed of bacterial D-xylulose kinases (XK, also known as xylulokinase; EC 2.7.1.17), which catalyze the rate-limiting step in the ATP-dependent phosphorylation of D-xylulose to produce D-xylulose 5-phosphate (X5P) and ADP. Some uncharacterized sequences are also included in this subfamily. The prototypical member of this subfamily is Escherichia coli xylulokinase (EcXK), which exists as a dimer. Each monomer consists of two large domains separated by an open cleft that forms an active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. XKs do not have any known allosteric regulators, and they may have weak but significant activity in the absence of substrate. The presence of Mg2+ or Mn2+ is required for catalytic activity. Members of this subfamily belong to the FGGY family of carbohydrate kinases.	482
198372	cd07804	FGGY_XK_like_1	uncharacterized xylulose kinase-like proteins; a subgroup of the FGGY family of carbohydrate kinases. This subgroup is composed of uncharacterized bacterial and archaeal xylulose kinase-like proteins with similarity to bacterial D-xylulose kinases (XK, also known as xylulokinase; EC 2.7.1.17), which catalyze the rate-limiting step in the ATP-dependent phosphorylation of D-xylulose to produce D-xylulose 5-phosphate (X5P) and ADP. The presence of Mg2+ or Mn2+ is required for catalytic activity. D-XK exists as a dimer with an active site that lies at the interface between the N- and C-terminal domains. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. Members of this subgroup belong to the FGGY family of carbohydrate kinases.	492
198373	cd07805	FGGY_XK_like_2	uncharacterized xylulose kinase-like proteins; a subgroup of the FGGY family of carbohydrate kinases. This subgroup is composed of uncharacterized proteins with similarity to bacterial D-Xylulose kinases (XK, also known as xylulokinase; EC 2.7.1.17), which catalyze the rate-limiting step in the ATP-dependent phosphorylation of D-xylulose to produce D-xylulose 5-phosphate (X5P) and ADP. The presence of Mg2+ or Mn2+ is required for catalytic activity. D-XK exists as a dimer with an active site that lies at the interface between the N- and C-terminal domains. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. Members of this subgroup belong to the FGGY family of carbohydrate kinases.	514
198374	cd07808	FGGY_D-XK_EcXK-like	Escherichia coli xylulokinase-like D-xylulose kinases; a subgroup of the FGGY family of carbohydrate kinases. This subgroup is predominantly composed of bacterial D-xylulose kinases (XK, also known as xylulokinase; EC 2.7.1.17), which catalyze the rate-limiting step in the ATP-dependent phosphorylation of D-xylulose to produce D-xylulose 5-phosphate (X5P) and ADP. D-xylulose has been used as a source of carbon and energy by a variety of microorganisms. Some uncharacterized sequences are also included in this subgroup. The prototypical member of this CD is Escherichia coli xylulokinase (EcXK), which exists as a dimer. Each monomer consists of two large domains separated by an open cleft that forms an active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. The presence of Mg2+ or Mn2+ is required for catalytic activity.  Members of this subgroup belong to the FGGY family of carbohydrate kinases.	482
198375	cd07809	FGGY_D-XK_1	D-xylulose kinases, subgroup 1; members of the FGGY family of carbohydrate kinases. This subgroup is composed of D-xylulose kinases (XK, also known as xylulokinase; EC 2.7.1.17) from bacteria and eukaryota. They share high sequence similarity with Escherichia coli xylulokinase (EcXK), which catalyzes the rate-limiting step in the ATP-dependent phosphorylation of D-xylulose to produce D-xylulose 5-phosphate (X5P) and ADP. Some uncharacterized sequences are also included in this subfamily. EcXK exists as a dimer. Each monomer consists of two large domains separated by an open cleft that forms an active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. The presence of Mg2+ or Mn2+ might be required for catalytic activity.  Members of this subgroup belong to the FGGY family of carbohydrate kinases.	487
198376	cd07810	FGGY_D-XK_2	D-xylulose kinases, subgroup 2; members of the FGGY family of carbohydrate kinases. This subgroup is predominantly composed of bacterial D-xylulose kinases (XK, also known as xylulokinase; EC 2.7.1.17). They share high sequence similarity with Escherichia coli xylulokinase (EcXK), which catalyzes the rate-limiting step in the ATP-dependent phosphorylation of D-xylulose to produce D-xylulose 5-phosphate (X5P) and ADP. EcXK exists as a dimer. Each monomer consists of two large domains separated by an open cleft that forms an active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. The presence of Mg2+ or Mn2+ might be required for catalytic activity. Members of this subgroup belong to the FGGY family of carbohydrate kinases.	490
198377	cd07811	FGGY_D-XK_3	D-xylulose kinases, subgroup 3; members of the FGGY family of carbohydrate kinases. This subgroup is composed of proteobacterial D-xylulose kinases (XK, also known as xylulokinase; EC 2.7.1.17). They share high sequence similarity with Escherichia coli xylulokinase (EcXK), which catalyzes the rate-limiting step in the ATP-dependent phosphorylation of D-xylulose to produce D-xylulose 5-phosphate (X5P) and ADP. Some uncharacterized sequences are also included in this subfamily. EcXK exists as a dimer. Each monomer consists of two large domains separated by an open cleft that forms an active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain. The presence of Mg2+ or Mn2+ might be required for catalytic activity. Members of this subgroup belong to the FGGY family of carbohydrate kinases.	493
176854	cd07812	SRPBCC	START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC (SRPBCC) ligand-binding domain superfamily. SRPBCC domains have a deep hydrophobic ligand-binding pocket; they bind diverse ligands. Included in this superfamily are the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD1-STARD15, and the C-terminal catalytic domains of the alpha oxygenase subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs_alpha_C), as well as the SRPBCC domains of phosphatidylinositol transfer proteins (PITPs), Bet v 1 (the major pollen allergen of white birch, Betula verrucosa), CoxG, CalC, and related proteins. Other members of this superfamily include PYR/PYL/RCAR plant proteins, the aromatase/cyclase (ARO/CYC) domains of proteins such as Streptomyces glaucescens tetracenomycin, and the SRPBCC domains of Streptococcus mutans Smu.440 and related proteins.	141
176855	cd07813	COQ10p_like	Coenzyme Q-binding protein COQ10p and similar proteins. Coenzyme Q-binding protein COQ10p and similar proteins. COQ10p is a hydrophobic protein located in the inner membrane of mitochondria that binds coenzyme Q (CoQ), also called ubiquinone, which is an essential electron carrier of the respiratory chain. Deletion of the gene encoding COQ10p (COQ10 or YOL008W) in Saccharomyces cerevisiae results in respiratory defect because of the inability to oxidize NADH and succinate. COQ10p may function in the delivery of CoQ (Q6 in budding yeast) to its proper location for electron transport. The human homolog, called Q-binding protein COQ10 homolog A (COQ10A), is able to fully complement for the absence of COQ10p in fission yeast. Human COQ10A also has a splice variant COQ10B. COQ10p belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands.	138
176856	cd07814	SRPBCC_CalC_Aha1-like	Putative hydrophobic ligand-binding SRPBCC domain of Micromonospora echinospora CalC, human Aha1, and related proteins. This family includes the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain of Micromonospora echinospora CalC, human Aha1, and related proteins. Proteins in this group belong to the SRPBCC domain superfamily of proteins, which bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. MeCalC confers resistance to the enediyne, calicheamicin gamma 1 (CLM), by a self sacrificing mechanism which results in inactivation of both CalC and the highly reactive diradical enediyne species. MeCalC can also inactivate two other enediynes, shishijimicin and namenamicin. A crucial Gly of the MeCalC CLM resistance mechanism is not conserved in this subgroup. This family also includes the C-terminal, Bet v1-like domain of Aha1, one of several co-chaperones, which regulate the dimeric chaperone Hsp90. Aha1 promotes dimerization of the N-terminal domains of Hsp90, and stimulates its low intrinsic ATPase activity, and may regulate the dwell time of Hsp90 with client proteins. Aha1 can act as either a positive or negative regulator of chaperone-dependent activation, depending on the client protein, but the mechanisms by which these opposing functions are achieved are unclear.  Aha1 is upregulated in a number of tumor lines co-incident with the activation of several signaling kinases.	139
176857	cd07815	SRPBCC_PITP	Lipid-binding SRPBCC domain of Class I and Class II Phosphatidylinositol Transfer Proteins. This family includes the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain of the phosphatidylinositol transfer protein (PITP) family of lipid transfer proteins. This family of proteins includes Class 1 PITPs (PITPNA/PITPalpha and PITPNB/PITPbeta, Drosophila vibrator and related proteins), Class IIA  PITPs (PITPNM1/PITPalphaI/Nir2,  PITPNM2/PITPalphaII/Nir3, Drosophila RdgB, and related proteins), and Class IIB  PITPs (PITPNC1/RdgBbeta and related proteins). The PITP family belongs to the SRPBCC domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. In vitro, PITPs bind phosphatidylinositol (PtdIns), as well as phosphatidylcholine (PtdCho) but with a lower affinity. They transfer these lipids from one membrane compartment to another. The cellular roles of PITPs include inositol lipid signaling, PtdIns metabolism, and membrane trafficking. Class III PITPs, exemplified by the Sec14p family, are found in yeast and plants but are unrelated in sequence and structure to Class I and II PITPs and belong to a different superfamily.	251
176858	cd07816	Bet_v1-like	Ligand-binding bet_v_1 domain of major pollen allergen of white birch (Betula verrucosa), Bet v 1, and related proteins. This family includes the ligand binding domain of Bet v 1 (the major pollen allergen of white birch, Betula verrucosa) and related proteins. In addition to birch Bet v 1, this family includes other plant intracellular pathogenesis-related class 10 (PR-10) proteins, norcoclaurine synthases (NCSs), cytokinin binding proteins (CSBPs), major latex proteins (MLPs), and ripening-related proteins. It belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. Members of this family bind a diverse range of ligands. Bet v 1 can bind brassinosteroids, cytokinins, flavonoids and fatty acids. Hyp-1, a PR-10 from Hypericum perforatum/St. John's wort, catalyzes the condensation of two molecules of emodin to the bioactive naphthodianthrone hypericin. NCSs catalyze the condensation of dopamine and 4-hydroxyphenylacetaldehyde to (S)-norcoclaurine, the first committed step in the biosynthesis of benzylisoquinoline alkaloids such as morphine. The role of MLPs is unclear; however, they are associated with fruit and flower development and with pathogen defense responses. A number of PR-10 proteins in this subgroup, including Bet v 1, have in vitro RNase activity, the biological significance of which is unclear. Bet v 1 family proteins have a conserved glycine-rich P (phosphate-binding)-loop proximal to the entrance of the ligand-binding pocket. However, its conformation differs from that of the canonical P-loop structure found in nucleotide-binding proteins. Several PR-10 members including Bet v 1 are allergenic. Cross-reactivity of Bet v 1 with homologs from plant foods results in birch-fruit syndrome.	148
176859	cd07817	SRPBCC_8	Ligand-binding SRPBCC domain of an uncharacterized subfamily of proteins. Uncharacterized group of the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands. SRPBCC domains include the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD1-STARD15, the C-terminal catalytic domains of the alpha oxygenase subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs_alpha_C), Class I and II phosphatidylinositol transfer proteins (PITPs), Bet v 1 (the major pollen allergen of white birch, Betula verrucosa), CoxG, CalC, and related proteins. Other members of the superfamily include PYR/PYL/RCAR plant proteins, the aromatase/cyclase (ARO/CYC) domains of proteins such as Streptomyces glaucescens tetracenomycin, and the SRPBCC domains of Streptococcus mutans Smu.440 and related proteins.	139
176860	cd07818	SRPBCC_1	Ligand-binding SRPBCC domain of an uncharacterized subfamily of proteins. Uncharacterized group of the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands. SRPBCC domains include the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD1-STARD15, the C-terminal catalytic domains of the alpha oxygenase subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs_alpha_C), Class I and II phosphatidylinositol transfer proteins (PITPs), Bet v 1 (the major pollen allergen of white birch, Betula verrucosa), CoxG, CalC, and related proteins. Other members of the superfamily include PYR/PYL/RCAR plant proteins, the aromatase/cyclase (ARO/CYC) domains of proteins such as Streptomyces glaucescens tetracenomycin, and the SRPBCC domains of Streptococcus mutans Smu.440 and related proteins.	150
176861	cd07819	SRPBCC_2	Ligand-binding SRPBCC domain of an uncharacterized subfamily of proteins. Uncharacterized group of the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands. SRPBCC domains include the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD1-STARD15, the C-terminal catalytic domains of the alpha oxygenase subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs_alpha_C), Class I and II phosphatidylinositol transfer proteins (PITPs), Bet v 1 (the major pollen allergen of white birch, Betula verrucosa), CoxG, CalC, and related proteins. Other members of the superfamily include PYR/PYL/RCAR plant proteins, the aromatase/cyclase (ARO/CYC) domains of proteins such as Streptomyces glaucescens tetracenomycin, and the SRPBCC domains of Streptococcus mutans Smu.440 and related proteins.	140
176862	cd07820	SRPBCC_3	Ligand-binding SRPBCC domain of an uncharacterized subfamily of proteins. Uncharacterized group of the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands. SRPBCC domains include the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD1-STARD15, the C-terminal catalytic domains of the alpha oxygenase subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs_alpha_C), Class I and II phosphatidylinositol transfer proteins (PITPs), Bet v 1 (the major pollen allergen of white birch, Betula verrucosa), CoxG, CalC, and related proteins. Other members of the superfamily include PYR/PYL/RCAR plant proteins, the aromatase/cyclase (ARO/CYC) domains of proteins such as Streptomyces glaucescens tetracenomycin, and the SRPBCC domains of Streptococcus mutans Smu.440 and related proteins.	137
176863	cd07821	PYR_PYL_RCAR_like	Pyrabactin resistance 1 (PYR1), PYR1-like (PYL), regulatory component of abscisic acid receptors (RCARs), and related proteins. The PYR/PYL/RCAR-like family belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. PYR/PYL/RCAR plant proteins are receptors involved in signal transduction. They bind abscisic acid (ABA) and mediate its signaling. ABA is a vital plant hormone, which regulates plant growth, development, and response to environmental stresses. Upon binding ABA, these plant proteins interact with a type 2C protein phosphatase (PP2C), such as ABI1 and ABI2, and inhibit their activity. When ABA is bound, a loop (designated the gate/CL2 loop) closes over the ligand binding pocket, resulting in the weakening of the inactive PYL dimer and facilitating type 2C protein phosphatase binding. In the ABA:PYL1:ABI1 complex, the gate blocks substrate access to the phosphatase active site. A conserved Trp from PP2C inserts into PYL to lock the receptor in a closed formation. This group also contains Methylobacterium extorquens AM1 MxaD. The mxaD gene is located within the mxaFJGIR(S)ACKLDEHB cluster which encodes proteins involved in methanol oxidation. MxaD may participate in the periplasmic electron transport chain for oxidation of methanol. Mutants lacking MxaD exhibit a reduced growth on methanol, and a lower rate of respiration with methanol.	140
176864	cd07822	SRPBCC_4	Ligand-binding SRPBCC domain of an uncharacterized subfamily of proteins. Uncharacterized group of the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands. SRPBCC domains include the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD1-STARD15, the C-terminal catalytic domains of the alpha oxygenase subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs_alpha_C), Class I and II phosphatidylinositol transfer proteins (PITPs), Bet v 1 (the major pollen allergen of white birch, Betula verrucosa), CoxG, CalC, and related proteins. Other members of the superfamily include PYR/PYL/RCAR plant proteins, the aromatase/cyclase (ARO/CYC) domains of proteins such as Streptomyces glaucescens tetracenomycin, and the SRPBCC domains of Streptococcus mutans Smu.440 and related proteins.	141
176865	cd07823	SRPBCC_5	Ligand-binding SRPBCC domain of an uncharacterized subfamily of proteins. Uncharacterized group of the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands. SRPBCC domains include the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD1-STARD15, the C-terminal catalytic domains of the alpha oxygenase subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs_alpha_C), Class I and II phosphatidylinositol transfer proteins (PITPs), Bet v 1 (the major pollen allergen of white birch, Betula verrucosa), CoxG, CalC, and related proteins. Other members of the superfamily include PYR/PYL/RCAR plant proteins, the aromatase/cyclase (ARO/CYC) domains of proteins such as Streptomyces glaucescens tetracenomycin, and the SRPBCC domains of Streptococcus mutans Smu.440 and related proteins.	146
176866	cd07824	SRPBCC_6	Ligand-binding SRPBCC domain of an uncharacterized subfamily of proteins. Uncharacterized group of the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands. SRPBCC domains include the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD1-STARD15, the C-terminal catalytic domains of the alpha oxygenase subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs_alpha_C), Class I and II phosphatidylinositol transfer proteins (PITPs), Bet v 1 (the major pollen allergen of white birch, Betula verrucosa), CoxG, CalC, and related proteins. Other members of the superfamily include PYR/PYL/RCAR plant proteins, the aromatase/cyclase (ARO/CYC) domains of proteins such as Streptomyces glaucescens tetracenomycin, and the SRPBCC domains of Streptococcus mutans Smu.440 and related proteins.	146
176867	cd07825	SRPBCC_7	Ligand-binding SRPBCC domain of an uncharacterized subfamily of proteins. Uncharacterized group of the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands. SRPBCC domains include the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD1-STARD15, the C-terminal catalytic domains of the alpha oxygenase subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs_alpha_C), Class I and II phosphatidylinositol transfer proteins (PITPs), Bet v 1 (the major pollen allergen of white birch, Betula verrucosa), CoxG, CalC, and related proteins. Other members of the superfamily include PYR/PYL/RCAR plant proteins, the aromatase/cyclase (ARO/CYC) domains of proteins such as Streptomyces glaucescens tetracenomycin, and the SRPBCC domains of Streptococcus mutans Smu.440 and related proteins.	144
176868	cd07826	SRPBCC_CalC_Aha1-like_9	Putative hydrophobic ligand-binding SRPBCC domain of an uncharacterized subgroup of CalC- and Aha1-like proteins. SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain of a functionally uncharacterized subgroup of CalC- and Aha1-like proteins. This group shows similarity to the SRPBCC domains of Micromonospora echinospora CalC (a protein which confers resistance to enediynes) and human Aha1 (one of several co-chaperones which regulate the dimeric chaperone Hsp90), and belongs to the SRPBCC domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands.	142
143640	cd07827	RHD-n	N-terminal sub-domain of the Rel homology domain (RHD). Proteins containing the Rel homology domain (RHD) are metazoan transcription factors. The RHD is composed of two structural sub-domains; this model characterizes the N-terminal sub-domain, which may be distantly related to the DNA-binding domain found in P53. The C-terminal sub-domain has an immunoglobulin-like fold and serves as a dimerization module that also binds DNA (see cd00102). The RHD is found in NF-kappa B, nuclear factor of activated T-cells (NFAT), the tonicity-responsive enhancer binding protein (TonEBP), and the arthropod proteins Dorsal and Relish (Rel).	174
381185	cd07828	lipocalin_heme-bd-THAP4-like	heme-binding beta-barrel domain of human THAP4, Arabidopsis thaliana nitrobindin, and similar proteins. Proteins in this subfamily use a beta-barrel domain to bind ferric heme. This group also includes the beta-barrel domain of human THAP domain containing 4 (THAP4). The THAP domain is found in proteins involved in transcriptional regulation, cell-cycle control, apoptosis and chromatin modification. Arabidopsis thaliana nitrobindin may reversibly bind nitric oxide (NO) and be involved in NO transport. It also includes the beta-barrel domain of Caenorhabditis elegans protein male abnormal 7 (Mab-7) which plays an important role in determining body shape and sensory ray morphology. This group belongs to the lipocalin/cytosolic fatty-acid binding protein family, whose members have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane-bound receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	150
270823	cd07829	STKc_CDK_like	Catalytic domain of Cyclin-Dependent protein Kinase-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. CDKs are partly regulated by their subcellular localization, which defines substrate phosphorylation and the resulting specific function. CDK1, CDK2, CDK4, and CDK6 have well-defined functions in the cell cycle, such as the regulation of the early G1 phase by CDK4 or CDK6, the G1/S phase transition by CDK2, or the entry into mitosis by CDK1. They also exhibit overlapping cyclin specificity and functions in certain conditions. Knockout mice with a single CDK deleted remain viable with specific phenotypes, showing that some CDKs can compensate for each other. For example, CDK4 can compensate for the loss of CDK6; however, double knockout mice with both CDK4 and CDK6 deleted die in utero. CDK8 and CDK9 are mainly involved in transcription while CDK5 is implicated in neuronal function. CDK7 plays essential roles in both the cell cycle as a CDK-Activating Kinase (CAK) and in transcription as a component of the general transcription factor TFIIH. The CDK-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	282
270824	cd07830	STKc_MAK_like	Catalytic domain of Male germ cell-Associated Kinase-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of human MAK and MAK-related kinase (MRK), Saccharomyces cerevisiae Ime2p, Schizosaccharomyces pombe Mei4-dependent protein 3 (Mde3) and Pit1, Caenorhabditis elegans dyf-5, Arabidopsis thaliana MHK, and similar proteins. These proteins play important roles during meiosis. MAK is highly expressed in testicular cells specifically in the meiotic phase, but is not essential for spermatogenesis and fertility. It functions as a coactivator of the androgen receptor in prostate cells. MRK, also called Intestinal Cell Kinase (ICK), is expressed ubiquitously, with highest expression in the ovary and uterus. A missense mutation in MRK causes endocrine-cerebro-osteodysplasia, suggesting that this protein plays an important role in the development of many organs. MAK and MRK may be involved in regulating cell cycle and cell fate. Ime2p is a meiosis-specific kinase that is important during meiotic initiation and during the later stages of meiosis. Mde3 functions downstream of the transcription factor Mei-4 which is essential for meiotic prophase I. The MAK-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	283
270825	cd07831	STKc_MOK	Catalytic domain of the Serine/Threonine Kinase, MAPK/MAK/MRK Overlapping Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MOK, also called Renal tumor antigen 1 (RAGE-1), is widely expressed and is enriched in testis, kidney, lung, and brain. It is expressed in approximately 50% of renal cell carcinomas (RCC) and is a potential target for immunotherapy. MOK is stabilized by its association with the HSP90 molecular chaperone. It is induced by the transcription factor Cdx2 and may be involved in regulating intestinal epithelial development and differentiation. The MOK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	282
270826	cd07832	STKc_CCRK	Catalytic domain of the Serine/Threonine Kinase, Cell Cycle-Related Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CCRK was previously called p42. It is a Cyclin-Dependent Kinase (CDK)-Activating Kinase (CAK) which is essential for the activation of CDK2. It is indispensable for cell growth and has been implicated in the progression of glioblastoma multiforme. In the heart, a splice variant of CCRK with a different C-terminal half is expressed; this variant promotes cardiac cell growth and survival and is significantly down-regulated during the development of heart failure. The CCRK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	287
270827	cd07833	STKc_CDKL	Catalytic domain of Cyclin-Dependent protein Kinase Like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of CDKL1-5 and similar proteins. Some CDKLs, like CDKL1 and CDKL3, may be implicated in transformation and others, like CDKL3 and CDKL5, are associated with mental retardation when impaired. CDKL2 plays a role in learning and memory. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The CDKL subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	288
270828	cd07834	STKc_MAPK	Catalytic domain of the Serine/Threonine Kinase, Mitogen-Activated Protein Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MAPKs serve as important mediators of cellular responses to extracellular signals. They control critical cellular functions including differentiation, proliferation, migration, and apoptosis. They are also implicated in the pathogenesis of many diseases including multiple types of cancer, stroke, diabetes, and chronic inflammation. Typical MAPK pathways involve a triple kinase core cascade comprising the MAPK, which is phosphorylated and activated by a MAPK kinase (MAP2K or MKK), which itself is phosphorylated and activated by a MAPK kinase kinase (MAP3K or MKKK). Each cascade is activated either by a small GTP-binding protein or by an adaptor protein, which transmits the signal either directly to a MAP3K to start the triple kinase core cascade or indirectly through a mediator kinase, a MAP4K. There are three typical MAPK subfamilies: Extracellular signal-Regulated Kinase (ERK), c-Jun N-terminal Kinase (JNK), and p38. Some MAPKs are atypical in that they are not regulated by MAP2Ks. These include MAPK4, MAPK6, NLK, and ERK7. The MAPK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	329
270829	cd07835	STKc_CDK1_CdkB_like	Catalytic domain of Cyclin-Dependent protein Kinase 1-like Serine/Threonine Kinases and of Plant B-type Cyclin-Dependent protein Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of CDK1, CDK2, and CDK3. CDK1 is also called Cell division control protein 2 (Cdc2) or p34 protein kinase, and is regulated by cyclins A, B, and E. The CDK1/cyclin A complex controls G2 phase entry and progression while the CDK1/cyclin B complex is critical for G2 to M phase transition. CDK2 is regulated by cyclin E or cyclin A. Upon activation by cyclin E, it phosphorylates the retinoblastoma (pRb) protein which activates E2F mediated transcription and allows cells to move into S phase. The CDK2/cyclin A complex plays a role in regulating DNA replication. Studies in knockout mice revealed that CDK1 can compensate for the loss of the cdk2 gene as it can also bind cyclin E and drive G1 to S phase transition. CDK3 is regulated by cyclin C and it phosphorylates pRB specifically during the G0/G1 transition. This phosphorylation is required for cells to exit G0 efficiently and enter the G1 phase. The plant-specific B-type CDKs are expressed from the late S to the M phase of the cell cycle. They are characterized by the cyclin binding motif PPT[A/T]LRE. They play a role in controlling mitosis and integrating developmental pathways, such as stomata and leaf development. CdkB has been shown to associate with both cyclin B, which controls G2/M transition, and cyclin D, which acts as a mediator in linking extracellular signals to the cell cycle. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The CDK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	283
143341	cd07836	STKc_Pho85	Catalytic domain of the Serine/Threonine Kinase, Fungal Cyclin-Dependent protein Kinase Pho85. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Pho85 is a multifunctional CDK in yeast. It is regulated by 10 different cyclins (Pcls) and plays a role in G1 progression, cell polarity, phosphate and glycogen metabolism, gene expression, and in signaling changes in the environment. It is not essential for yeast viability and is the functional homolog of mammalian CDK5, which plays a role in central nervous system development. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The Pho85 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	284
270830	cd07837	STKc_CdkB_plant	Catalytic domain of the Serine/Threonine Kinase, Plant B-type Cyclin-Dependent protein Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The plant-specific B-type CDKs are expressed from the late S to the M phase of the cell cycle. They are characterized by the cyclin binding motif PPT[A/T]LRE. They play a role in controlling mitosis and integrating developmental pathways, such as stomata and leaf development. CdkB has been shown to associate with both cyclin B, which controls G2/M transition, and cyclin D, which acts as a mediator in linking extracellular signals to the cell cycle. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The CdkB subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	294
270831	cd07838	STKc_CDK4_6_like	Catalytic domain of Cyclin-Dependent protein Kinase 4 and 6-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CDK4 and CDK6 partner with D-type cyclins to regulate the early G1 phase of the cell cycle. They are the first kinases activated by mitogenic signals to release cells from the G0 arrested state. CDK4 and CDK6 are both expressed ubiquitously, associate with all three D cyclins (D1, D2 and D3), and phosphorylate the retinoblastoma (pRb) protein. They are also regulated by the INK4 family of inhibitors which associate with either the CDK alone or the CDK/cyclin complex. CDK4 and CDK6 show differences in subcellular localization, sensitivity to some inhibitors, timing in activation, tumor selectivity, and possibly substrate profiles. Although CDK4 and CDK6 seem to show some redundancy, they also have discrete, nonoverlapping functions. CDK6 plays an important role in cell differentiation. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The CDK4/6-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	287
143344	cd07839	STKc_CDK5	Catalytic domain of the Serine/Threonine Kinase, Cyclin-Dependent protein Kinase 5. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CDK5 is unusual in that it is regulated by non-cyclin proteins, p35 and p39. It is highly expressed in the nervous system and is critical in normal neural development and function. It plays a role in neuronal migration and differentiation, and is also important in synaptic plasticity and learning. CDK5 also participates in protecting against cell death and promoting angiogenesis. Impaired CDK5 activity is implicated in Alzheimer's disease, amyotrophic lateral sclerosis, Parkinson's disease, Huntington's disease and acute neuronal injury. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The CDK5 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	284
270832	cd07840	STKc_CDK9_like	Catalytic domain of Cyclin-Dependent protein Kinase 9-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of CDK9 and CDK12 from higher eukaryotes, yeast BUR1, C-type plant CDKs (CdkC), and similar proteins. CDK9, BUR1, and CdkC are functionally equivalent. They act as a kinase for the C-terminal domain of RNA polymerase II and participate in regulating multiple steps of gene expression including transcription elongation and RNA processing. CDK9 and CdkC associate with T-type cyclins while BUR1 associates with the cyclin BUR2. CDK12 is a unique CDK that contains an arginine/serine-rich (RS) domain, which is predominantly found in splicing factors. CDK12 interacts with cyclins L1 and L2, and participates in regulating transcription and alternative splicing. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The CDK9-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	291
270833	cd07841	STKc_CDK7	Catalytic domain of the Serine/Threonine Kinase, Cyclin-Dependent protein Kinase 7. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CDK7 plays essential roles in the cell cycle and in transcription. It associates with cyclin H and MAT1 and acts as a CDK-Activating Kinase (CAK) by phosphorylating and activating cell cycle CDKs (CDK1/2/4/6). In the brain, it activates CDK5. CDK7 is also a component of the general transcription factor TFIIH, which phosphorylates the C-terminal domain (CTD) of RNA polymerase II when it is bound with unphosphorylated DNA, as present in the pre-initiation complex. Following phosphorylation, the CTD dissociates from the DNA which allows transcription initiation. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The CDK7 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	298
270834	cd07842	STKc_CDK8_like	Catalytic domain of Cyclin-Dependent protein Kinase 8-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of CDK8, CDC2L6, and similar proteins. CDK8 functions as a negative or positive regulator of transcription, depending on the scenario. Together with its regulator, cyclin C, it reversibly associates with the multi-subunit core Mediator complex, a cofactor that is involved in regulating RNA polymerase II-dependent transcription. CDC2L6 also associates with Mediator in complexes lacking CDK8. In VP16-dependent transcriptional activation, CDK8 and CDC2L6 exert opposing effects by positive and negative regulation, respectively, in similar conditions. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The CDK8-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	316
173741	cd07843	STKc_CDC2L1	Catalytic domain of the Serine/Threonine Kinase, Cell Division Cycle 2-like 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CDC2L1, also called PITSLRE, exists in different isoforms which are named using the alias CDK11(p). The CDC2L1 gene produces two protein products, CDK11(p110) and CDK11(p58). CDC2L1 is also represented by the caspase-processed CDK11(p46). CDK11(p110), the major isoform, associates with cyclin L and is expressed throughout the cell cycle. It is involved in RNA processing and the regulation of transcription. CDK11(p58) associates with cyclin D3 and is expressed during the G2/M phase of the cell cycle. It plays roles in spindle morphogenesis, centrosome maturation, sister chromatid cohesion, and the completion of mitosis. CDK11(p46) is formed from the larger isoforms by caspases during TNFalpha- and Fas-induced apoptosis. It functions as a downstream effector kinase in apoptotic signaling pathways and interacts with eukaryotic initiation factor 3f (eIF3f), p21-activated kinase (PAK1), and Ran-binding protein (RanBPM). CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The CDC2L1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	293
270835	cd07844	STKc_PCTAIRE_like	Catalytic domain of PCTAIRE-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PCTAIRE-like proteins show unusual expression patterns with high levels in post-mitotic tissues, suggesting that they may be involved in regulating post-mitotic cellular events. They share sequence similarity with Cyclin-Dependent Kinases (CDKs), which belong to a large family of STKs that are regulated by their cognate cyclins. Together, CDKs and cyclins are involved in the control of cell-cycle progression, transcription, and neuronal function. The association of PCTAIRE-like proteins with cyclins has not been widely studied, although PFTAIRE-1 has been shown to function as a CDK which is regulated by cyclin D3 as well as the membrane-associated cyclin Y. The PCTAIRE-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	286
173742	cd07845	STKc_CDK10	Catalytic domain of the Serine/Threonine Kinase, Cyclin-Dependent protein Kinase 10. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CDK10, also called PISSLRE, is essential for cell growth and proliferation, and acts through the G2/M phase of the cell cycle. CDK10 has also been identified as an important factor in endocrine therapy resistance in breast cancer. CDK10 silencing increases the transcription of c-RAF and the activation of the p42/p44 MAPK pathway, which leads to antiestrogen resistance. Patients who express low levels of CDK10 relapse early on tamoxifen. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The CDK10 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	309
270836	cd07846	STKc_CDKL2_3	Catalytic domain of the Serine/Threonine Kinases, Cyclin-Dependent protein Kinase Like 2 and 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CDKL2, also called p56 KKIAMRE, is expressed in testis, kidney, lung, and brain. It functions mainly in mature neurons and plays an important role in learning and memory. Inactivation of CDKL3, also called NKIAMRE (NKIATRE in rat), by translocation is associated with mild mental retardation. It has been reported that CDKL3 is lost in leukemic cells having a chromosome arm 5q deletion, and may contribute to the transformed phenotype. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The CDKL2/3 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	286
270837	cd07847	STKc_CDKL1_4	Catalytic domain of the Serine/Threonine Kinases, Cyclin-Dependent protein Kinase Like 1 and 4. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CDKL1, also called p42 KKIALRE, is a glial protein that is upregulated in gliosis. It is present in neuroblastoma and A431 human carcinoma cells, and may be implicated in neoplastic transformation. The function of CDKL4 is unknown. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The CDKL1/4 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	286
270838	cd07848	STKc_CDKL5	Catalytic domain of the Serine/Threonine Kinase, Cyclin-Dependent protein Kinase Like 5. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Mutations in the gene encoding CDKL5, previously called STK9, are associated with early onset epilepsy and severe mental retardation [X-linked infantile spasm syndrome (ISSX) or West syndrome]. In addition, CDKL5 mutations also sometimes cause a phenotype similar to Rett syndrome (RTT), a progressive neurodevelopmental disorder. These pathogenic mutations are located in the N-terminal portion of the protein within the kinase domain. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The CDKL5 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	287
270839	cd07849	STKc_ERK1_2_like	Catalytic domain of Extracellular signal-Regulated Kinase 1 and 2-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of the mitogen-activated protein kinases (MAPKs) ERK1, ERK2, baker's yeast Fus3, and similar proteins. MAPK pathways are important mediators of cellular responses to extracellular signals. ERK1/2 activation is preferentially by mitogenic factors, differentiation stimuli, and cytokines, through a kinase cascade involving the MAPK kinases MEK1/2 and a MAPK kinase kinase from the Raf family. ERK1/2 have numerous substrates, many of which are nuclear and participate in transcriptional regulation of many cellular processes. They regulate cell growth, cell proliferation, and cell cycle progression from G1 to S phase. Although the distinct roles of ERK1 and ERK2 have not been fully determined, it is known that ERK2 can maintain most functions in the absence of ERK1, and that the deletion of ERK2 is embryonically lethal. The MAPK, Fus3, regulates yeast mating processes including mating-specific gene expression, G1 arrest, mating projection, and cell fusion. This ERK1/2-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	336
270840	cd07850	STKc_JNK	Catalytic domain of the Serine/Threonine Kinase, c-Jun N-terminal Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. JNKs are mitogen-activated protein kinases (MAPKs) that are involved in many stress-activated responses including those during inflammation, neurodegeneration, apoptosis, and persistent pain sensitization, among others. They are also essential regulators of physiological and pathological processes and are involved in the pathogenesis of several diseases such as diabetes, atherosclerosis, stroke, Parkinson's and Alzheimer's. Vertebrates harbor three different JNK genes (Jnk1, Jnk2, and Jnk3) that are alternatively spliced to produce at least 10 isoforms. JNKs are specifically activated by the MAPK kinases MKK4 and MKK7, which are in turn activated by upstream MAPK kinase kinases as a result of different stimuli including stresses such as ultraviolet (UV) irradiation, hyperosmolarity, heat shock, or cytokines. JNKs activate a large number of different substrates based on specific stimulus, cell type, and cellular condition, and may be implicated in seemingly contradictory functions. The JNK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	337
143356	cd07851	STKc_p38	Catalytic domain of the Serine/Threonine Kinase, p38 Mitogen-Activated Protein Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. p38 kinases are mitogen-activated protein kinases (MAPKs), serving as important mediators of cellular responses to extracellular signals. They function in the regulation of the cell cycle, cell development, cell differentiation, senescence, tumorigenesis, apoptosis, pain development and pain progression, and immune responses. p38 kinases are activated by the MAPK kinases MKK3 and MKK6, which in turn are activated by upstream MAPK kinase kinases including TAK1, ASK1, and MLK3, in response to cellular stresses or inflammatory cytokines. p38 substrates include other protein kinases and factors that regulate transcription, nuclear export, mRNA stability and translation. p38 kinases are drug targets for the inflammatory diseases psoriasis, rheumatoid arthritis, and chronic pulmonary disease. Vertebrates contain four isoforms of p38, named alpha, beta, gamma, and delta, which show varying substrate specificity and expression patterns. p38alpha and p38beta are ubiquitously expressed, p38gamma is predominantly found in skeletal muscle, and p38delta is found in the heart, lung, testis, pancreas, and small intestine. The p38 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	343
270841	cd07852	STKc_MAPK15-like	Catalytic domain of the Serine/Threonine Kinase, Mitogen-Activated Protein Kinase 15 and similar MAPKs. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Human MAPK15 is also called Extracellular signal Regulated Kinase 8 (ERK8) while the rat protein is called ERK7. ERK7 and ERK8 display both similar and different biochemical properties. They autophosphorylate and activate themselves and do not require upstream activating kinases. ERK7 is constitutively active and is not affected by extracellular stimuli whereas ERK8 shows low basal activity and is activated by DNA-damaging agents. ERK7 and ERK8 also have different substrate profiles. Genome analysis shows that they are orthologs with similar gene structures. ERK7 and ERK8 may be involved in the signaling of some nuclear receptor transcription factors. ERK7 regulates hormone-dependent degradation of estrogen receptor alpha while ERK8 down-regulates the transcriptional co-activation of androgen and glucocorticoid receptors. MAPKs are important mediators of cellular responses to extracellular signals. The MAPK15 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	337
173748	cd07853	STKc_NLK	Catalytic domain of the Serine/Threonine Kinase, Nemo-Like Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. NLK is an atypical mitogen-activated protein kinase (MAPK) that is not regulated by a MAPK kinase. It functions downstream of the MAPK kinase kinase Tak1, which also plays a role in activating the JNK and p38 MAPKs. The Tak1/NLK pathways are regulated by Wnts, a family of secreted proteins that is critical in the control of asymmetric division and cell polarity. NLK can phosphorylate transcription factors from the TCF/LEF family, inhibiting their ability to activate the transcription of target genes. In prostate cancer cells, NLK is involved in regulating androgen receptor-mediated transcription and its expression is altered during cancer progression. MAPKs are important mediators of cellular responses to extracellular signals. The NLK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	372
143359	cd07854	STKc_MAPK4_6	Catalytic domain of the Serine/Threonine Kinases, Mitogen-Activated Protein Kinases 4 (also called ERK4) and 6 (also called ERK3). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MAPK4 (also called ERK4 or p63MAPK) and MAPK6 (also called ERK3 or p97MAPK) are atypical MAPKs that are not regulated by MAPK kinases. MAPK6 is expressed ubiquitously with highest amounts in brain and skeletal muscle. It may be involved in the control of cell differentiation by negatively regulating cell cycle progression in certain conditions. It may also play a role in glucose-induced insulin secretion. MAPK6 and MAPK4 cooperate to regulate the activity of MAPK-activated protein kinase 5 (MK5), leading to its relocation to the cytoplasm and exclusion from the nucleus. The MAPK6/MK5 and MAPK4/MK5 pathways may play critical roles in embryonic and post-natal development. MAPKs are important mediators of cellular responses to extracellular signals. The MAPK4/6 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	342
270842	cd07855	STKc_ERK5	Catalytic domain of the Serine/Threonine Kinase, Extracellular signal-Regulated Kinase 5. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. ERK5 (also called Big MAPK1 (BMK1) or MAPK7) has a unique C-terminal extension, making it approximately twice as big as other MAPKs. This extension contains transcriptional activation capability which is inhibited by the N-terminal half. ERK5 is activated in response to growth factors and stress by a cascade that leads to its phosphorylation by the MAP2K MEK5, which in turn is regulated by the MAP3Ks MEKK2 and MEKK3. Activated ERK5 phosphorylates its targets including myocyte enhancer factor 2 (MEF2), Sap1a, c-Myc, and RSK. It plays a role in EGF-induced cell proliferation during the G1/S phase transition. Studies on knockout mice revealed that ERK5 is essential for cardiovascular development and plays an important role in angiogenesis. It is also critical for neural differentiation and survival. The ERK5 pathway has been implicated in the pathogenesis of many diseases including cancer, cardiac hypertrophy, and atherosclerosis. MAPKs are important mediators of cellular responses to extracellular signals. The ERK5 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	336
270843	cd07856	STKc_Sty1_Hog1	Catalytic domain of the Serine/Threonine Kinases, Fungal Mitogen-Activated Protein Kinases Sty1 and Hog1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of the MAPKs Sty1 from Schizosaccharomyces pombe, Hog1 from Saccharomyces cerevisiae, and similar proteins. Sty1 and Hog1 are stress-activated MAPKs that participate in transcriptional regulation in response to stress. Sty1 is activated in response to oxidative stress, osmotic stress, and UV radiation. It is regulated by the MAP2K Wis1, which is activated by the MAP3Ks Wis4 and Win1, which receive signals of the stress condition from membrane-spanning histidine kinases Mak1-3. Activated Sty1 stabilizes the Atf1 transcription factor and induces transcription of Atf1-dependent genes of the core environmental stress response. Hog1 is the key element in the high osmolarity glycerol (HOG) pathway and is activated upon hyperosmotic stress. Activated Hog1 accumulates in the nucleus and regulates stress-induced transcription. The HOG pathway is mediated by two transmembrane osmosensors, Sln1 and Sho1. MAPKs are important mediators of cellular responses to extracellular signals. The Sty1/Hog1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	328
173750	cd07857	STKc_MPK1	Catalytic domain of the Serine/Threonine Kinase, Fungal Mitogen-Activated Protein Kinase MPK1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of the MAPKs MPK1 from Saccharomyces cerevisiae, Pmk1 from Schizosaccharomyces pombe, and similar proteins. MPK1 (also called Slt2) and Pmk1 (also called Spm1) are stress-activated MAPKs that regulate the cell wall integrity pathway, and are therefore important in the maintenance of cell shape, cell wall construction, morphogenesis, and ion homeostasis. MPK1 is activated in response to cell wall stress including heat stimulation, osmotic shock, UV irradiation, and any agents that interfere with cell wall biogenesis such as chitin antagonists, caffeine, or zymolyase. MPK1 is regulated by the MAP2Ks Mkk1/2, which are regulated by the MAP3K Bck1. Pmk1 is also activated by multiple stresses including elevated temperatures, hyper- or hypotonic stress, glucose deprivation, exposure to cell-wall damaging compounds, and oxidative stress. It is regulated by the MAP2K Pek1, which is regulated by the MAP3K Mkh1. MAPKs are important mediators of cellular responses to extracellular signals. The MPK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	332
143363	cd07858	STKc_TEY_MAPK	Catalytic domain of the Serine/Threonine Kinases, Plant TEY Mitogen-Activated Protein Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Plant MAPKs are typed based on the conserved phosphorylation motif present in the activation loop, TEY and TDY. This subfamily represents the TEY subtype of plant MAPKs and is further subdivided into three groups (A, B, and C). Group A is represented by AtMPK3, AtMPK6, Nicotiana tabacum NTF4 (NtNTF4), among others. They are mostly involved in environmental and hormonal responses. AtMPK3 and AtMPK6 are also key regulators for stomatal development and patterning. Group B is represented by AtMPK4, AtMPK13, and NtNTF6, among others. They may be involved in both cell division and environmental stress response. AtMPK4 also participates in regulating innate immunity. Group C is represented by AtMPK1, AtMPK2, NtNTF3, Oryza sativa MAPK4 (OsMAPK4), among others. They may also be involved in stress responses. AtMPK1 and AtMPK2 are activated following mechanical injury and in the presence of stress chemicals such as jasmonic acid, hydrogen peroxide and abscisic acid. OsMAPK4 is also called OsMSRMK3 for Multiple Stress-Responsive MAPK3. In plants, MAPKs are associated with physiological, developmental, hormonal, and stress responses. Some plants show numerous gene duplications of MAPKs; Arabidopsis thaliana harbors at least 20 MAPKs, named AtMPK1-20. The TEY MAPK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	337
143364	cd07859	STKc_TDY_MAPK	Catalytic domain of the Serine/Threonine Kinases, Plant TDY Mitogen-Activated Protein Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Plant MAPKs are typed based on the conserved phosphorylation motif present in the activation loop, TEY and TDY. This subfamily represents the TDY subtype and is composed of Group D plant MAPKs including Arabidopsis thaliana MPK18 (AtMPK18), Oryza sativa Blast- and Wound-induced MAPK1 (OsBWMK1), OsWJUMK1 (Wound- and JA-Uninducible MAPK1), Zea mays MPK6, and the Medicago sativa TDY1 gene product. OsBWMK1 enhances resistance to pathogenic infections. It mediates stress-activated defense responses by activating a transcription factor that affects the expression of stress-related genes. AtMPK18 is involved in microtubule-related functions. In plants, MAPKs are associated with physiological, developmental, hormonal, and stress responses. Some plants show numerous gene duplications of MAPKs; Arabidopsis thaliana harbors at least 20 MAPKs, named AtMPK1-20 while Oryza sativa contains at least 17 MAPKs. Arabidopsis thaliana contains more TEY-type MAPKs than TDY-type, whereas the reverse is true for Oryza sativa. The TDY MAPK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	338
270844	cd07860	STKc_CDK2_3	Catalytic domain of the Serine/Threonine Kinases, Cyclin-Dependent protein Kinase 2 and 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CDK2 is regulated by cyclin E or cyclin A. Upon activation by cyclin E, it phosphorylates the retinoblastoma (pRb) protein which activates E2F mediated transcription and allows cells to move into S phase. The CDK2/cyclin A complex plays a role in regulating DNA replication. CDK2, together with CDK4, also regulates embryonic cell proliferation. Despite these important roles, mice deleted for the cdk2 gene are viable and normal except for being sterile. This may be due to compensation provided by CDK1 (also called Cdc2), which can also bind cyclin E and drive the G1 to S phase transition. CDK3 is regulated by cyclin C and it phosphorylates pRB specifically during the G0/G1 transition. This phosphorylation is required for cells to exit G0 efficiently and enter the G1 phase. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The CDK2/3 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	284
270845	cd07861	STKc_CDK1_euk	Catalytic domain of the Serine/Threonine Kinase, Cyclin-Dependent protein Kinase 1 from higher eukaryotes. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CDK1 is also called Cell division control protein 2 (Cdc2) or p34 protein kinase, and is regulated by cyclins A, B, and E. The CDK1/cyclin A complex controls G2 phase entry and progression. CDK1/cyclin A2 has also been implicated as an important regulator of S phase events. The CDK1/cyclin B complex is critical for G2 to M phase transition. It induces mitosis by activating nuclear enzymes that regulate chromatin condensation, nuclear membrane degradation, mitosis-specific microtubule and cytoskeletal reorganization. CDK1 also associates with cyclin E and plays a role in the entry into S phase. CDK1 transcription is stable throughout the cell cycle but is modulated in some pathological conditions. It may play a role in regulating apoptosis under these conditions. In breast cancer cells, HER2 can mediate apoptosis by inactivating CDK1. Activation of CDK1 may contribute to HIV-1 induced apoptosis as well as neuronal apoptosis in neurodegenerative diseases. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The CDK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	285
270846	cd07862	STKc_CDK6	Catalytic domain of the Serine/Threonine Kinase, Cyclin-Dependent protein Kinase 6. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CDK6 is regulated by D-type cyclins and INK4 inhibitors. It is active towards the retinoblastoma (pRb) protein, implicating it to function in regulating the early G1 phase of the cell cycle. It is expressed ubiquitously and is localized in the cytoplasm. It is also present in the ruffling edge of spreading fibroblasts and may play a role in cell spreading. It binds to the p21 inhibitor without any effect on its own activity and it is overexpressed in squamous cell carcinomas and neuroblastomas. CDK6 has also been shown to inhibit cell differentiation in many cell types. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The CDK6 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	290
143368	cd07863	STKc_CDK4	Catalytic domain of the Serine/Threonine Kinase, Cyclin-Dependent protein Kinase 4. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CDK4 partners with all three D-type cyclins (D1, D2, and D3) and is also regulated by INK4 inhibitors. It is active towards the retinoblastoma (pRb) protein and plays a role in regulating the early G1 phase of the cell cycle. It is expressed ubiquitously and is localized in the nucleus. CDK4 also shows kinase activity towards Smad3, a signal transducer of TGF-beta signaling which modulates transcription and plays a role in cell proliferation and apoptosis. CDK4 is inhibited by the p21 inhibitor and is specifically mutated in human melanoma. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The CDK4 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	288
270847	cd07864	STKc_CDK12	Catalytic domain of the Serine/Threonine Kinase, Cyclin-Dependent protein Kinase 12. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CDK12 is also called Cdc2-related protein kinase 7 (CRK7) or Cdc2-related kinase arginine/serine-rich (CrkRS). It is a unique CDK that contains an RS domain, which is predominantly found in splicing factors. CDK12 is widely expressed in tissues. It interacts with cyclins L1 and L2, and plays roles in regulating transcription and alternative splicing. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The CDK12 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	302
270848	cd07865	STKc_CDK9	Catalytic domain of the Serine/Threonine Kinase, Cyclin-Dependent protein Kinase 9. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CDK9, together with a cyclin partner (cyclin T1, T2a, T2b, or K), is the main component of distinct positive transcription elongation factors (P-TEFb), which function as Ser2 C-terminal domain kinases of RNA polymerase II. P-TEFb participates in multiple steps of gene expression including transcription elongation, mRNA synthesis, processing, export, and translation. It also plays a role in mediating cytokine induced transcription networks such as IL6-induced STAT3 signaling. In addition, the CDK9/cyclin T2a complex promotes muscle differentiation and enhances the function of some myogenic regulatory factors. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The CDK9 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	310
270849	cd07866	STKc_BUR1	Catalytic domain of the Serine/Threonine Kinase, Fungal Cyclin-Dependent protein Kinase (CDK), Bypass UAS Requirement 1, and similar proteins. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. BUR1, also called SGV1, is a yeast CDK that is functionally equivalent to mammalian CDK9. It associates with the cyclin BUR2. BUR genes were originally identified in a genetic screen as factors involved in general transcription. The BUR1/BUR2 complex phosphorylates the C-terminal domain of RNA polymerase II. In addition, this complex regulates histone modification by phosphorylating Rad6 and mediating the association of the Paf1 complex with chromatin. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The BUR1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	311
270850	cd07867	STKc_CDC2L6	Catalytic domain of the Serine/Threonine Kinase, Cell Division Cycle 2-like 6. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CDC2L6 is also called CDK8-like and was previously referred to as CDK11. However, this is a confusing nomenclature as CDC2L6 is distinct from CDC2L1, which is represented by the two protein products from its gene, called CDK11(p110) and CDK11(p58), as well as the caspase-processed CDK11(p46). CDK11(p110), CDK11(p58), and CDK11(p46) do not belong to this subfamily. CDC2L6 is an associated protein of Mediator, a multiprotein complex that provides a platform to connect transcriptional and chromatin regulators and cofactors, in order to activate and mediate RNA polymerase II transcription. CDC2L6 is localized mainly in the nucleus and exerts an opposing effect to CDK8 in VP16-dependent transcriptional activation by being a negative regulator. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The CDC2L6 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	318
270851	cd07868	STKc_CDK8	Catalytic domain of the Serine/Threonine Kinase, Cyclin-Dependent protein Kinase 8. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CDK8 can act as a negative or positive regulator of transcription, depending on the scenario. Together with its regulator, cyclin C, it reversibly associates with the multi-subunit core Mediator complex, a cofactor that is involved in regulating RNA polymerase II (RNAP II)-dependent transcription. CDK8 phosphorylates cyclin H, a subunit of the general transcription factor TFIIH, which results in the inhibition of TFIIH-dependent phosphorylation of the C-terminal domain of RNAP II, facilitating the inhibition of transcription. It has also been shown to promote transcription by a mechanism that is likely to involve RNAP II phosphorylation. CDK8 also functions as a stimulus-specific positive coregulator of p53 transcriptional responses. CDKs belong to a large family of STKs that are regulated by their cognate cyclins. Together, they are involved in the control of cell-cycle progression, transcription, and neuronal function. The CDK8 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	333
143374	cd07869	STKc_PFTAIRE1	Catalytic domain of the Serine/Threonine Kinase, PFTAIRE-1 kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PFTAIRE-1 is widely expressed except in the spleen and thymus. It is highly expressed in the brain, heart, pancreas, testis, and ovary, and is localized in the cytoplasm. It is regulated by cyclin D3 and is inhibited by the p21 cell cycle inhibitor. It has also been shown to interact with the membrane-associated cyclin Y, which recruits the protein to the plasma membrane. PFTAIRE-1 shares sequence similarity with Cyclin-Dependent Kinases (CDKs), which belong to a large family of STKs that are regulated by their cognate cyclins. Together, CDKs and cyclins are involved in the control of cell-cycle progression, transcription, and neuronal function. The PFTAIRE-1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	303
270852	cd07870	STKc_PFTAIRE2	Catalytic domain of the Serine/Threonine Kinase, PFTAIRE-2 kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PFTAIRE-2 is also referred to as ALS2CR7 (amyotrophic lateral sclerosis 2 (juvenile) chromosome region candidate 7). It may be associated with amyotrophic lateral sclerosis 2 (ALS2), an autosomal recessive form of juvenile ALS. The function of PFTAIRE-2 is not yet known. It shares sequence similarity with Cyclin-Dependent Kinases (CDKs), which belong to a large family of STKs that are regulated by their cognate cyclins. Together, CDKs and cyclins are involved in the control of cell-cycle progression, transcription, and neuronal function. The PFTAIRE-2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	286
270853	cd07871	STKc_PCTAIRE3	Catalytic domain of the Serine/Threonine Kinase, PCTAIRE-3 kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PCTAIRE-3 shows a restricted pattern of expression and is present in brain, kidney, and intestine. It is elevated in Alzheimer's disease (AD) and has been shown to associate with paired helical filaments (PHFs) and stimulate Tau phosphorylation. As AD progresses, phosphorylated Tau aggregates and forms PHFs, which leads to the formation of neurofibrillary tangles. In human glioma cells, PCTAIRE-3 induces cell cycle arrest and cell death. PCTAIRE-3 shares sequence similarity with Cyclin-Dependent Kinases (CDKs), which belong to a large family of STKs that are regulated by their cognate cyclins. Together, CDKs and cyclins are involved in the control of cell-cycle progression, transcription, and neuronal function. The PCTAIRE-3 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	288
143377	cd07872	STKc_PCTAIRE2	Catalytic domain of the Serine/Threonine Kinase, PCTAIRE-2 kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PCTAIRE-2 is specifically expressed in neurons in the central nervous system, mainly in terminally differentiated neurons. It associates with Trap (Tudor repeat associator with PCTAIRE-2) and could play a role in regulating mitochondrial function in neurons. PCTAIRE-2 shares sequence similarity with Cyclin-Dependent Kinases (CDKs), which belong to a large family of STKs that are regulated by their cognate cyclins. Together, CDKs and cyclins are involved in the control of cell-cycle progression, transcription, and neuronal function. The PCTAIRE-2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	309
270854	cd07873	STKc_PCTAIRE1	Catalytic domain of the Serine/Threonine Kinase, PCTAIRE-1 kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PCTAIRE-1 is expressed ubiquitously and is localized in the cytoplasm. Its kinase activity is cell cycle dependent and peaks at the S and G2 phases. PCTAIRE-1 is highly expressed in the brain and may play a role in regulating neurite outgrowth. It can also associate with Trap (Tudor repeat associator with PCTAIRE-2), a physiological partner of PCTAIRE-2; with p11, a small dimeric protein with similarity to S100; and with 14-3-3 proteins, mediators of phosphorylation-dependent interactions in many different proteins. PCTAIRE-1 shares sequence similarity with Cyclin-Dependent Kinases (CDKs), which belong to a large family of STKs that are regulated by their cognate cyclins. Together, CDKs and cyclins are involved in the control of cell-cycle progression, transcription, and neuronal function. The PCTAIRE-1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	297
143379	cd07874	STKc_JNK3	Catalytic domain of the Serine/Threonine Kinase, c-Jun N-terminal Kinase 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. JNK3 is expressed primarily in the brain, and to a lesser extent in the heart and testis. Mice deficient in JNK3 are protected against kainic acid-induced seizures, stroke, sciatic axotomy neural death, and neuronal death due to NGF deprivation, oxidative stress, or exposure to beta-amyloid peptide. This suggests that JNK3 may play roles in the pathogenesis of these diseases. JNKs are mitogen-activated protein kinases (MAPKs) that are involved in many stress-activated responses including those during inflammation, neurodegeneration, apoptosis, and persistent pain sensitization, among others. The JNK3 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	355
143380	cd07875	STKc_JNK1	Catalytic domain of the Serine/Threonine Kinase, c-Jun N-terminal Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. JNK1 is expressed in every cell and tissue type. It specifically binds with JAMP (JNK1-associated membrane protein), which regulates the duration of JNK1 activity in response to stimuli. Specific JNK1 substrates include Itch and SG10, which are implicated in Th2 responses and airway inflammation, and microtubule dynamics and axodendritic length, respectively. Mice deficient in JNK1 are protected against arthritis, obesity, type 2 diabetes, cardiac cell death, and non-alcoholic liver disease, suggesting that JNK1 may play roles in the pathogenesis of these diseases. Initially, it was thought that JNK1 and JNK2 were functionally redundant as mice deficient in either gene could survive but disruption of both genes resulted in lethality. However, recent studies have shown that JNK1 and JNK2 perform distinct functions through specific binding partners and substrates. JNKs are mitogen-activated protein kinases that are involved in many stress-activated responses including those during inflammation, neurodegeneration, apoptosis, and persistent pain sensitization, among others. The JNK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	364
143381	cd07876	STKc_JNK2	Catalytic domain of the Serine/Threonine Kinase, c-Jun N-terminal Kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. JNK2 is expressed in every cell and tissue type. It is specifically translocated to the mitochondria during dopaminergic cell death. Specific substrates include the microtubule-associated proteins DCX and Tau, as well as TIF-IA which is involved in ribosomal RNA synthesis regulation. Mice deficient in Jnk2 show protection against arthritis, type 1 diabetes, atherosclerosis, abdominal aortic aneurysm, cardiac cell death, TNF-induced liver damage, and tumor growth, indicating that JNK2 may play roles in the pathogenesis of these diseases. Initially, it was thought that JNK1 and JNK2 were functionally redundant as mice deficient in either gene could survive but disruption of both genes resulted in lethality. However, recent studies have shown that JNK1 and JNK2 perform distinct functions through specific binding partners and substrates. JNKs are mitogen-activated protein kinases (MAPKs) that are involved in many stress-activated responses including those during inflammation, neurodegeneration, apoptosis, and persistent pain sensitization, among others. The JNK2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	359
143382	cd07877	STKc_p38alpha	Catalytic domain of the Serine/Threonine Kinase, p38alpha Mitogen-Activated Protein Kinase (also called MAPK14). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. p38alpha/MAPK14 is expressed in most tissues and is the major isoform involved in the immune and inflammatory response. It is the central p38 MAPK involved in myogenesis. It plays a role in regulating cell cycle check-point transition and promoting cell differentiation. p38alpha also regulates cell proliferation and death through crosstalk with the JNK pathway. Its substrates include MAPK activated protein kinase 2 (MK2), MK5, and the transcription factors ATF2 and Mitf. p38 kinases are MAPKs, serving as important mediators of cellular responses to extracellular signals. They are activated by the MAPK kinases MKK3 and MKK6, which in turn are activated by upstream MAPK kinase kinases including TAK1, ASK1, and MLK3, in response to cellular stresses or inflammatory cytokines. The p38alpha subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	345
143383	cd07878	STKc_p38beta	Catalytic domain of the Serine/Threonine Kinase, p38beta Mitogen-Activated Protein Kinase (also called MAPK11). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. p38beta/MAPK11 is widely expressed in tissues and shows more similarity with p38alpha than with the other isoforms. Both are sensitive to pyridinylimidazoles and share some common substrates such as MAPK activated protein kinase 2 (MK2) and the transcription factors ATF2, c-Fos, and ELK-1. p38beta is involved in regulating the activation of the cyclooxygenase-2 promoter and the expression of TGFbeta-induced alpha-smooth muscle cell actin. p38 kinases are mitogen-activated protein kinases (MAPKs), serving as important mediators of cellular responses to extracellular signals. They are activated by the MAPK kinases MKK3 and MKK6, which in turn are activated by upstream MAPK kinase kinases including TAK1, ASK1, and MLK3, in response to cellular stresses or inflammatory cytokines. The p38beta subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	343
143384	cd07879	STKc_p38delta	Catalytic domain of the Serine/Threonine Kinase, p38delta Mitogen-Activated Protein Kinase (also called MAPK13). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. p38delta/MAPK13 is found in skeletal muscle, heart, lung, testis, pancreas, and small intestine. It regulates microtubule function by phosphorylating Tau. It activates the c-jun promoter and plays a role in G2 cell cycle arrest. It also controls the degradation of c-Myb, which is associated with myeloid leukemia and poor prognosis in colorectal cancer. p38delta is the main isoform involved in regulating the differentiation and apoptosis of keratinocytes. p38 kinases are MAPKs, serving as important mediators of cellular responses to extracellular signals. They are activated by the MAPK kinases MKK3 and MKK6, which in turn are activated by upstream MAPK kinase kinases including TAK1, ASK1, and MLK3, in response to cellular stresses or inflammatory cytokines. The p38delta subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	342
143385	cd07880	STKc_p38gamma	Catalytic domain of the Serine/Threonine Kinase, p38gamma Mitogen-Activated Protein Kinase (also called MAPK12). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. p38gamma/MAPK12 is predominantly expressed in skeletal muscle. Unlike p38alpha and p38beta, p38gamma is insensitive to pyridinylimidazoles. It displays an antagonizing function compared to p38alpha. p38gamma inhibits, while p38alpha stimulates, c-Jun phosphorylation and AP-1 mediated transcription. p38gamma also plays a role in the signaling between Ras and the estrogen receptor and has been implicated to increase cell invasion and breast cancer progression. In Xenopus, p38gamma is critical in the meiotic maturation of oocytes. p38 kinases are MAPKs, serving as important mediators of cellular responses to extracellular signals. They are activated by the MAPK kinases MKK3 and MKK6, which in turn are activated by upstream MAPK kinase kinases including TAK1, ASK1, and MLK3, in response to cellular stresses or inflammatory cytokines. The p38gamma subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	343
143641	cd07881	RHD-n_NFAT	N-terminal sub-domain of the Rel homology domain (RHD) of nuclear factor of activated T-cells (NFAT) proteins. Proteins containing the Rel homology domain (RHD) are metazoan transcription factors. The RHD is composed of two structural sub-domains; this model characterizes the N-terminal RHD sub-domain of the NFAT family of transcription factors. NFAT transcription complexes are a target of calcineurin, a calcium dependent phosphatase, and activate genes that are mainly involved in cell-cell interaction. Upon de-phosphorylation of the nuclear localization signal, NFAT enters the nucleus and acts as a transcription factor; its export from the nucleus is triggered by phosphorylation via export kinases. NFATs play important roles in mediating the immune response, and are found in T cells, B Cells, NK cells, mast cells, and monocytes. NFATs are also found in various non-hematopoietic cell types, where they play roles in development.	175
143642	cd07882	RHD-n_TonEBP	N-terminal sub-domain of the Rel homology domain (RHD) of tonicity-responsive enhancer binding protein (TonEBP). Proteins containing the Rel homology domain (RHD) are metazoan transcription factors. The RHD is composed of two structural sub-domains; this model characterizes the N-terminal RHD sub-domain of the tonicity-responsive enhancer binding protein (TonEBP), also called NFAT5. Mammalian TonEBP regulates the expression of genes in response to tonicity. It plays a pivotal role in urinary concentrating mechanisms in kidney medulla, by triggering the accumulation of osmolytes that enable renal medullary cells to tolerate high levels of urea and salt.	161
143643	cd07883	RHD-n_NFkB	N-terminal sub-domain of the Rel homology domain (RHD) of nuclear factor of kappa light polypeptide gene enhancer in B-cells (NF-kappa B). Proteins containing the Rel homology domain (RHD) are metazoan transcription factors. The RHD is composed of two structural sub-domains; this model characterizes the N-terminal RHD sub-domain of the NF-kappa B1 and B2 families of transcription factors, also referred to as class I members of the NF-kappa B family. In class I NF-kappa Bs, the RHD domain co-occurs with C-terminal ankyrin repeats. Family members include NF-kappa B1 and NF-kappa B2. NF-kappa B1 is commonly referred to as p105 or p50 (proteolytically processed form), while NF-kappa B2 is called p100 or p52 (proteolytically processed form). NF-kappa B proteins are part of a protein complex that acts as a transcription factor, which is responsible for regulating a host of cellular responses to a variety of stimuli. This complex tightly regulates the expression of a large number of genes, and is involved in processes such as adaptive and innate immunity, stress response, inflammation, cell adhesion, proliferation and apoptosis. The cytosolic NF-kappa B complex is activated via phosphorylation of the ankyrin-repeat containing inhibitory protein I-kappa B, which dissociates from the complex and exposes the nuclear localization signal of the heterodimer (NF-kappa B and REL). p105 and p100 may also act as I-kappa Bs due to their C-terminal ankyrin repeats.	197
143644	cd07884	RHD-n_Relish	N-terminal sub-domain of the Rel homology domain (RHD) of the arthropod protein Relish. Proteins containing the Rel homology domain (RHD) are metazoan transcription factors. The RHD is composed of two structural sub-domains; this model characterizes the N-terminal RHD sub-domain of the arthropod Relish protein, in which the RHD domain co-occurs with C-terminal ankyrin repeats. Family members are sometimes referred to as p110 or p68 (proteolytically processed form). Relish is an NF-kappa B-like transcription factor, which plays a role in mediating innate immunity in Drosophila. It is activated via the Imd (immune deficiency) pathway, which triggers phosphorylation of Relish. IKK-dependent proteolytic cleavage of Relish (which involves Dredd) results in a smaller active form (without the C-terminal ankyrin repeats), which is transported into the nucleus and functions as a transactivator.	159
143645	cd07885	RHD-n_RelA	N-terminal sub-domain of the Rel homology domain (RHD) of RelA. Proteins containing the Rel homology domain (RHD) are metazoan transcription factors. The RHD is composed of two structural sub-domains; this model characterizes the N-terminal RHD domain of the RelA family of transcription factors, categorized as a class II member of the NF-kappa B family. In class II NF-kappa Bs, the RHD domain co-occurs with a C-terminal transactivation domain (TAD). NF-kappa B proteins are part of a protein complex that acts as a transcription factor, which is responsible for regulating a host of cellular responses to a variety of stimuli. This complex tightly regulates the expression of a large number of genes, and is involved in processes such as adaptive and innate immunity, stress response, inflammation, cell adhesion, proliferation and apoptosis. The cytosolic NF-kappa B complex is activated via phosphorylation of the ankyrin-repeat containing inhibitory protein I-kappa B, which dissociates from the complex and exposes the nuclear localization signal of the heterodimer (NF-kappa B and Rel). RelA (also called p65) forms heterodimers with NF-kappa B1 (p50) and B2 (p52). RelA also forms homodimers.	169
143646	cd07886	RHD-n_RelB	N-terminal sub-domain of the Rel homology domain (RHD) of the reticuloendotheliosis viral oncogene homolog B (RelB) protein. Proteins containing the Rel homology domain (RHD) are metazoan transcription factors. The RHD is composed of two structural sub-domains; this model characterizes the N-terminal RHD sub-domain of the RelB family of transcription factors, categorized as class II NF-kappa B family members. In class II NF-kappa Bs, the RHD domain co-occurs with a C-terminal transactivation domain (TAD). NF-kappa B proteins are part of a protein complex that acts as a transcription factor, which is responsible for regulating a host of cellular responses to a variety of stimuli. This complex tightly regulates the expression of a large number of genes, and is involved in processes such as adaptive and innate immunity, stress response, inflammation, cell adhesion, proliferation and apoptosis. The cytosolic NF-kappa B complex is activated via phosphorylation of the ankyrin-repeat containing inhibitory protein I-kappa B, which dissociates from the complex and exposes the nuclear localization signal of the heterodimer (NF-kappa B and Rel). RelB is unable to homodimerize but is a potent transactivator in a heterodimer with NF-kappa B1 (p50) or B2 (p52). It is involved in the regulation of genes that play roles in inflammatory processes and the immune response.	172
143647	cd07887	RHD-n_Dorsal_Dif	N-terminal sub-domain of the Rel homology domain (RHD) of the arthropod protein Dorsal. Proteins containing the Rel homology domain (RHD) are metazoan transcription factors. The RHD is composed of two structural sub-domains; this model characterizes the N-terminal RHD sub-domain of the arthropod Dorsal and Dif (Dorsal-related immunity factor), and similar proteins. Dorsal and Dif are Rel-like transcription factors, which play roles in mediating innate immunity in Drosophila. They are activated via the Toll pathway. Cytoplasmic Dorsal/Dif are inactivated via forming a complex with Cactus, the Drosophila homologue of mammalian I-kappa B proteins. In response to signals, Cactus is degraded and Dorsal/Dif can be transported into the nucleus, where they act as transcription factors. Dorsal is also an essential gene in establishing the proper dorsal/ventral polarity in the developing embryo.	173
143579	cd07888	CRD_corin_2	One of two cysteine-rich domains of the corin protein, a type II transmembrane serine protease. The cysteine-rich domain (CRD) is an essential component of corin, a type II transmembrane serine protease which functions as the convertase of the pro-atrial natriuretic peptide (pro-ANP) in the heart. Corin contains two CRDs in its extracellular region, which play an important role in recognition of the physiological substrate, pro-ANP. This model characterizes the second (C-terminal) CRD.	122
143628	cd07890	CYTH-like_AC_IV-like	Adenylyl cyclase (AC) class IV-like, a subgroup of the CYTH-like superfamily. This subgroup contains class IV ACs and similar proteins. AC catalyzes the conversion of ATP to 3',5'-cyclic AMP (cAMP) and PPi. cAMP is a key signaling molecule which conveys a variety of signals in different cell types. In prokaryotes, cAMP is a catabolite derepression signal which triggers the expression of metabolic pathways including the lactose operon. Six non-homologous classes of ACs have been identified (I-VI). Class IV ACs are found in this group. In bacteria, the gene encoding Class IV AC has been designated cyaB and the protein as AC2. AC-IV occurs in addition to AC-I in bacterial pathogens such as Yersinia pestis (plague disease). The role of AC-IV is unknown but it has been speculated that it may be a factor in pathogenesis, perhaps providing cAMP for a secondary internal signaling function, or for secretion and uptake into host cells, where it may disrupt normal cellular processes. This subgroup belongs to the CYTH/triphosphate tunnel metalloenzyme (TTM)-like superfamily, whose enzymes have a unique active site located within an eight-stranded beta barrel.	169
143629	cd07891	CYTH-like_CthTTM-like_1	CYTH-like Clostridium thermocellum TTM-like subgroup 1. This subgroup contains the triphosphate tunnel metalloenzyme (TTM) from Clostridium thermocellum (CthTTM) and similar proteins. These are found primarily in bacteria. CthTTM is a metal dependent tripolyphosphatase, nucleoside triphosphatase, and nucleoside tetraphosphatase. It hydrolyzes the beta-gamma phosphoanhydride linkage of triphosphate-containing substrates including tripolyphosphate, nucleoside triphosphates and nucleoside tetraphosphates. These substrates are hydrolyzed, releasing Pi. Mg++ or Mn++ are required for the enzyme's activity. CthTTM appears to have no adenylate cyclase activity. This subgroup consists chiefly of bacterial sequences. These enzymes are members of the CYTH-like (also known as triphosphate tunnel metalloenzyme (TTM)-like) superfamily, which have a unique active site located within an eight-stranded beta barrel.	148
143630	cd07892	PolyPPase_VTC2-3_like	Polyphosphate (polyP) polymerase domain of yeast vacuolar transport chaperone (VTC) proteins VTC-2 and -3, and similar proteins. Saccharomyces cerevisiae VTC-1, -2, -3, and -4 comprise the membrane-integral VTC complex. VTC-2, -3, and -4 contain polyP polymerase domains. S. cerevisiae VTC-2 and -3 belong to this subgroup. For VTC4 it has been shown that this domain generates polyP from ATP by a phosphotransfer reaction releasing ADP. This activity is metal ion-dependent. The ATP gamma phosphate may be cleaved and then transferred to an acceptor phosphate to form polyP. PolyP is ubiquitous. In prokaryotes, it is a store of phosphate and energy. In eukaryotes, polyPs have roles in bone calcification and osmoregulation, and in phosphate transport in the symbiosis of mycorrhizal fungi and plants. This subgroup belongs to the CYTH/triphosphate tunnel metalloenzyme (TTM)-like superfamily, whose enzymes have a unique active site located within an eight-stranded beta barrel.	303
153435	cd07893	OBF_DNA_ligase	The Oligonucleotide/oligosaccharide binding (OB)-fold domain is a DNA-binding module that is part of the catalytic core unit of ATP dependent DNA ligases. ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation using nicked nucleic acid substrates with the high energy nucleotide of ATP as a cofactor in a three step reaction mechanism. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. ATP dependent DNA ligases have a highly modular architecture consisting of a unique arrangement of two or more discrete domains including a DNA-binding domain, an adenylation (nucleotidyltransferase (NTase)) domain, and an oligonucleotide/oligosaccharide binding (OB)-fold domain. The adenylation and C-terminal OB-fold domains comprise a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family. The catalytic core unit contains six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases. The OB-fold domain contacts the nicked DNA substrate and is required for the ATP-dependent DNA ligase nucleotidylation step. The RxDK motif (motif VI), which is essential for ATP hydrolysis, is located in the OB-fold domain.	129
185705	cd07894	Adenylation_RNA_ligase	Adenylation domain of RNA circularization proteins. RNA circularization proteins are capable of circularizing RNA molecules in an ATP-dependent reaction. RNA circularization may protect RNA from exonuclease activity. This model comprises the adenylation domain, the minimal catalytic unit that is common to all members of the ATP-dependent DNA ligase family, and the carboxy-terminal extension of RNA circularization protein that serves as a dimerization module. ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation of nicked nucleic acid substrates using the high energy nucleotide of ATP as a cofactor in a three step reaction mechanism. The adenylation domain binds ATP and contains many active site residues.	342
185706	cd07895	Adenylation_mRNA_capping	Adenylation domain of GTP-dependent mRNA capping enzymes. RNA capping enzymes transfer GMP from GTP to the 5'-diphosphate end of nascent mRNAs to form a G(5')ppp(5')RNA cap structure. The RNA cap is found only in eukarya. RNA capping is chemically analogous to the first two steps of polynucleotide ligation. ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation of nicked nucleic acid substrates using the high energy nucleotide of ATP as a cofactor in a three step reaction mechanism. Structural studies reveal a shared structure for DNA ligases and capping enzymes, with a common catalytic core composed of an adenylation or nucleotidyltransferase domain and a C-terminal OB-fold domain containing conserved sequence motifs. The adenylation domain binds ATP and contains many active site residues.	215
185707	cd07896	Adenylation_kDNA_ligase_like	Adenylation domain of kDNA ligases and similar proteins. The mitochondrial DNA of parasitic protozoans is highly unusual. It is termed the kinetoplast DNA (kDNA) and consists of circular DNA molecules (maxicircles) and several thousand smaller circular molecules (minicircles). This group is composed of kDNA ligase, Chlorella virus DNA ligase, and similar proteins. kDNA ligase and Chlorella virus DNA ligase are the smallest known ATP-dependent ligases. They are involved in DNA replication or repair. ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation using nicked nucleic acid substrates with the high energy nucleotide of ATP as a cofactor in a three step reaction mechanism. They have a highly modular architecture consisting of a unique arrangement of two or more discrete domains. The adenylation and the C-terminal oligonucleotide/oligosaccharide binding (OB)-fold domains comprise a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family, including this group. The adenylation domain binds ATP and contains many of the active-site residues.	174
185708	cd07897	Adenylation_DNA_ligase_Bac1	Adenylation domain of putative bacterial ATP-dependent DNA ligases. Bacterial DNA ligases are divided into two broad classes: NAD-dependent and ATP-dependent. All bacterial species have a NAD-dependent DNA ligase (LigA). Some bacterial genomes contain multiple genes for DNA ligases that are predicted to use ATP as their cofactor, including Mycobacterium tuberculosis LigB, LigC, and LigD. This group is composed of predicted bacterial ATP-dependent DNA ligases. ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation using nicked nucleic acid substrates with the high energy nucleotide of ATP as a cofactor in a three-step reaction mechanism. The adenylation and C-terminal oligonucleotide/oligosaccharide binding (OB)-fold domains comprise a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family, including this group. The adenylation domain binds ATP and contains many of the active site residues.	207
185709	cd07898	Adenylation_DNA_ligase	Adenylation domain of ATP-dependent DNA Ligases. ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation using nicked nucleic acid substrates with the high energy nucleotide of ATP as a cofactor in a three step reaction mechanism. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. ATP-dependent ligases are present in many organisms such as viruses, bacteriophages, eukarya, archaea and bacteria. Some organisms express a variety of different ligases which appear to be targeted to specific functions. ATP-dependent DNA ligases have a highly modular architecture consisting of a unique arrangement of two or more discrete domains including a DNA-binding domain, an adenylation (nucleotidyltransferase (NTase)) domain, and an oligonucleotide/oligosaccharide binding (OB)-fold domain. The adenylation domain binds ATP and contains many of the active-site residues. The adenylation and C-terminal OB-fold domains comprise a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family. The catalytic core unit contains six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases.	201
185710	cd07900	Adenylation_DNA_ligase_I_Euk	Adenylation domain of eukaryotic DNA Ligase I. ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation using nicked nucleic acid substrates with the high energy nucleotide of ATP as a cofactor in a three step reaction mechanism. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. ATP-dependent ligases are present in many organisms such as viruses, bacteriophages, eukarya, archaea and bacteria. Some organisms express a variety of different ligases which appear to be targeted to specific functions. There are three classes of ATP-dependent DNA ligases in eukaryotic cells (I, III and IV). DNA ligase I is required for the ligation of Okazaki fragments during lagging-strand DNA synthesis and for base excision repair (BER). DNA ligases have a highly modular architecture consisting of a unique arrangement of two or more discrete domains. The adenylation and C-terminal oligonucleotide/oligosaccharide binding (OB)-fold domains comprise a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family. The adenylation domain binds ATP and contains many of the active-site residues. DNA ligase I is the main replicative ligase in eukaryotes. The common catalytic core unit comprises six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases.	219
185711	cd07901	Adenylation_DNA_ligase_Arch_LigB	Adenylation domain of archaeal and bacterial LigB-like DNA ligases. ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation using nicked nucleic acid substrates with the high energy nucleotide of ATP as a cofactor in a three step reaction mechanism. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. ATP-dependent ligases are present in many organisms such as viruses, bacteriophages, eukarya, archaea and bacteria. Bacterial DNA ligases are divided into two broad classes: NAD-dependent and ATP-dependent. All bacterial species have a NAD-dependent DNA ligase (LigA). Some bacterial genomes contain multiple genes for DNA ligases that are predicted to use ATP as their cofactor, including Mycobacterium tuberculosis LigB, LigC, and LigD. This group is composed of archaeal DNA ligases and bacterial proteins similar to Mycobacterium tuberculosis LigB. Members of this group contain adenylation and C-terminal oligonucleotide/oligosaccharide binding (OB)-fold domains, comprising a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family. The adenylation domain binds ATP and contains many of the active-site residues. The common catalytic core unit comprises six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases.	207
185712	cd07902	Adenylation_DNA_ligase_III	Adenylation domain of DNA Ligase III. ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation using nicked nucleic acid substrates with the high energy nucleotide of ATP as a cofactor in a three-step reaction mechanism. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. ATP-dependent ligases are present in many organisms such as viruses, bacteriophages, eukarya, archaea and bacteria. There are three classes of ATP-dependent DNA ligases in eukaryotic cells (I, III and IV). DNA ligase III is not found in lower eukaryotes and is present both in the nucleus and mitochondria. It has several isoforms; two splice forms, III-alpha and III-beta, differ in their carboxy-terminal sequences. DNA ligase III-beta is believed to play a role in homologous recombination during meiotic prophase. DNA ligase III-alpha interacts with X-ray Cross Complementing factor 1 (XRCC1) and functions in single nucleotide Base Excision Repair (BER). The mitochondrial form of DNA ligase III originates from the nucleolus and is involved in the mitochondrial DNA repair pathway. This isoform is expressed by a second start site on the DNA ligase III gene. DNA ligases have a highly modular architecture consisting of a unique arrangement of two or more discrete domains. The adenylation and C-terminal oligonucleotide/oligosaccharide binding (OB)-fold domains comprise a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family. The adenylation domain binds ATP and contains many active site residues. The common catalytic core unit comprises six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases.	213
185713	cd07903	Adenylation_DNA_ligase_IV	Adenylation domain of DNA Ligase IV. ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation using nicked nucleic acid substrates with the high energy nucleotide of ATP as a cofactor in a three step reaction mechanism. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. ATP-dependent ligases are present in many organisms such as viruses, bacteriophages, eukarya, archaea and bacteria. There are three classes of ATP-dependent DNA ligase in eukaryotic cells (I, III and IV). DNA ligase IV is required for DNA non-homologous end joining pathways, including recombination of the V(D)J immunoglobulin gene segments in cells of the mammalian immune system. DNA ligase IV is stabilized by forming a complex with XRCC4, a nuclear phosphoprotein, which is phosphorylated by DNA-dependent protein kinase. DNA ligases have a highly modular architecture consisting of a unique arrangement of two or more discrete domains. The adenylation and C-terminal oligonucleotide/oligosaccharide binding (OB)-fold domains comprise a catalytic core unit that is common to all members of the ATP-dependent DNA ligase family. The adenylation domain binds ATP and contains many of the active-site residues. The common catalytic unit comprises six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases.	225
185714	cd07905	Adenylation_DNA_ligase_LigC	Adenylation domain of Mycobacterium tuberculosis LigC-like ATP-dependent DNA ligases. Bacterial DNA ligases are divided into two broad classes: NAD-dependent and ATP-dependent. All bacterial species have a NAD-dependent DNA ligase (LigA). Some bacterial genomes contain multiple genes for DNA ligases that are predicted to use ATP as their cofactor, including Mycobacterium tuberculosis LigB, LigC, and LigD. This group is composed of ATP-dependent DNA ligases similar to Mycobacterium tuberculosis LigC. ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation using nicked nucleic acid substrates with the high energy nucleotide of ATP as a cofactor in a three step reaction mechanism. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. Members of this group contain adenylation and C-terminal oligonucleotide/oligosaccharide binding (OB)-fold domains, comprising a catalytic core unit that is common to all members of the ATP-dependent DNA ligase family. The adenylation domain binds ATP and contains many of the active-site residues. The common catalytic core unit comprises six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases.	194
185715	cd07906	Adenylation_DNA_ligase_LigD_LigC	Adenylation domain of Mycobacterium tuberculosis LigD and LigC-like ATP-dependent DNA ligases. Bacterial DNA ligases are divided into two broad classes: NAD-dependent and ATP-dependent. All bacterial species have a NAD-dependent DNA ligase (LigA). Some bacterial genomes contain multiple genes for DNA ligases that are predicted to use ATP as their cofactor, including Mycobacterium tuberculosis LigB, LigC, and LigD. This group is composed of ATP-dependent DNA ligases similar to Mycobacterium tuberculosis LigC. ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation using nicked nucleic acid substrates with the high energy nucleotide of ATP as a cofactor in a three step reaction mechanism. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. Members of this group contain adenylation and C-terminal oligonucleotide/oligosaccharide binding (OB)-fold domains, comprising a catalytic core unit that is common to all members of the ATP-dependent DNA ligase family. The adenylation domain binds ATP and contains many of the active-site residues. The common catalytic core unit comprises six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases. LigD consists of a central ATP-dependent DNA ligase catalytic core unit fused to a C-terminal polymerase domain and an N-terminal 3'-phosphoesterase (PE) module. LigD catalyzes the end-healing and end-sealing steps during non-homologous end joining.	190
153117	cd07908	Mn_catalase_like	Manganese catalase-like protein, ferritin-like diiron-binding domain. This uncharacterized bacterial protein family has a ferritin-like domain similar to that of the manganese catalase protein of Lactobacillus plantarum and the bll3758 protein of Bradyrhizobium japonicum.  Ferritin-like, diiron-carboxylate proteins participate in a range of functions including iron regulation, mono-oxygenation, and reactive radical production. These proteins are characterized by the fact that they catalyze dioxygen-dependent oxidation-hydroxylation reactions within diiron centers; one exception is manganese catalase, which catalyzes peroxide-dependent oxidation-reduction within a dimanganese center. Diiron-carboxylate proteins are further characterized by the presence of duplicate metal ligands, glutamates and histidines (ExxH) and two additional glutamates within a four-helix bundle. Outside of these conserved residues there is little obvious homology. Members include bacterioferritin, ferritin, rubrerythrin, aromatic and alkene monooxygenase hydroxylases (AAMH), ribonucleotide reductase R2 (RNRR2), acyl-ACP-desaturases (Acyl_ACP_Desat), manganese (Mn) catalases, demethoxyubiquinone hydroxylases (DMQH), DNA protecting proteins (DPS), and ubiquinol oxidases (AOX), and the aerobic cyclase system, Fe-containing subunit (ACSF).	154
153118	cd07909	YciF	YciF bacterial stress response protein, ferritin-like iron-binding domain. YciF is a bacterial protein of unknown function that is up-regulated when bacteria experience stress conditions, and is highly conserved in a broad range of bacterial species.  YciF has a ferritin-like domain.  Ferritin-like, diiron-carboxylate proteins participate in a range of functions including iron regulation, mono-oxygenation, and reactive radical production. These proteins are characterized by the fact that they catalyze dioxygen-dependent oxidation-hydroxylation reactions within diiron centers; one exception is manganese catalase, which catalyzes peroxide-dependent oxidation-reduction within a dimanganese center. Diiron-carboxylate proteins are further characterized by the presence of duplicate metal ligands, glutamates and histidines (ExxH) and two additional glutamates within a four-helix bundle. Outside of these conserved residues there is little obvious homology. Members include bacterioferritin, ferritin, rubrerythrin, aromatic and alkene monooxygenase hydroxylases (AAMH), ribonucleotide reductase R2 (RNRR2), acyl-ACP-desaturases (Acyl_ACP_Desat), manganese (Mn) catalases, demethoxyubiquinone hydroxylases (DMQH), DNA protecting proteins (DPS), and ubiquinol oxidases (AOX), and the aerobic cyclase system, Fe-containing subunit (ACSF).	147
153119	cd07910	MiaE	MiaE tRNA-modifying nonheme diiron monooxygenase, ferritin-like diiron-binding domain. MiaE is a nonheme diiron monooxygenase that catalyzes the posttranscriptional allylic hydroxylation of a modified nucleoside in tRNA called 2-methylthio-N-6-isopentenyl adenosine (ms2i6A).  ms2i6A is found at position 37, next to the anticodon at the 3' position in almost all eukaryotic and bacterial tRNA's that read codons beginning with uridine. The miaE gene is absent in Escherichia coli, a finding consistent with the absence of the hydroxylated derivative of ms2i6A in this species.	180
153120	cd07911	RNRR2_Rv0233_like	Ribonucleotide Reductase R2-like protein, Mn/Fe-binding domain. Rv0233 is a Mycobacterium tuberculosis ribonucleotide reductase R2 protein with a  heterodinuclear manganese/iron-carboxylate cofactor located in its metal center. The Rv0233-like family may represent a structural/functional counterpart of the evolutionary ancestor of the RNRR2's (Ribonucleotide Reductase, R2/beta subunit) and the bacterial multicomponent monooxygenases.  RNRR2s belong to a broad superfamily of ferritin-like diiron-carboxylate proteins. The RNR protein catalyzes the conversion of ribonucleotides to deoxyribonucleotides and is found in prokaryotes and archaea. The catalytically active form of RNR is a proposed alpha2-beta2 tetramer. The homodimeric alpha subunit (R1) contains the active site and redox active cysteines as well as the allosteric binding sites.	280
143653	cd07912	Tweety_N	N-terminal domain of the protein encoded by the Drosophila tweety gene and related proteins, a family of chloride ion channels. The protein product of the Drosophila tweety (tty) gene is thought to form a trans-membrane protein with five membrane-spanning regions and a cytoplasmic C-terminus. This N-terminal domain contains the putative transmembrane spanning regions. Tweety has been suggested as a candidate for a large conductance chloride channel, both in vertebrate and insect cells. Three human homologs have been identified and designated TTYH1-3. TTYH2 has been associated with the progression of cancer, and Drosophila melanogaster tweety has been assumed to play a role in development. TTYH2 and TTYH3 bind to and are ubiquitinated by Nedd4-2, a HECT type E3 ubiquitin ligase, which most likely plays a role in controlling the cellular levels of tweety family proteins.	418
153419	cd07914	IGPD	Imidazoleglycerol-phosphate dehydratase. Imidazoleglycerol-phosphate dehydratase (IGPD; EC 4.2.1.19) catalyzes the dehydration of imidazole glycerol phosphate to imidazole acetol phosphate, the sixth step of histidine biosynthesis in plants and microorganisms where the histidine is synthesized de novo. There is an internal repeat in the protein domain that is related by pseudo-dyad symmetry, perhaps as a result of an ancient gene duplication. The apo-form of IGPD exists as a catalytically inactive trimer which, in the presence of specific divalent metal cations such as manganese (Mn2+), cobalt (Co2+), cadmium (Cd2+), nickel (Ni2+), iron (Fe2+) and zinc (Zn2+), assembles to form a biologically active high molecular weight metalloenzyme; a 24-mer with 4-3-2 symmetry. Each 24-mer has 24 active sites, and contains around 1.5 metal ions per monomer, each monomer contributing residues to three separate active sites. IGPD enzymes are monofunctional in fungi, plants, archaea and some eubacteria while they are encoded as bifunctional enzymes in other eubacteria, such that the enzyme is fused to histidinol-phosphate phosphatase, the penultimate enzyme of the histidine biosynthesis pathway. The histidine biosynthesis pathway is a potential target for development of herbicides, and IGPD is a target for the triazole phosphonate herbicides.	190
153420	cd07920	Pumilio	Pumilio-family RNA binding domain. Puf repeats (also labelled PUM-HD or Pumilio homology domain) mediate sequence specific RNA binding in fly Pumilio, worm FBF-1 and FBF-2, and many other proteins such as vertebrate Pumilio. These proteins function as translational repressors in early embryonic development by binding to sequences in the 3' UTR of target mRNAs, such as the nanos response element (NRE) in fly Hunchback mRNA, or the point mutation element (PME) in worm fem-3 mRNA. Other proteins that contain Puf domains are also plausible RNA binding proteins. Yeast PUF1 (JSN1), for instance, appears to contain a single RNA-recognition motif (RRM) domain. Puf repeat proteins have been observed to function asymmetrically and may be responsible for creating protein gradients involved in the specification of cell fate and differentiation. Puf domains usually occur as a tandem repeat of 8 domains. This model encompasses all 8 tandem repeats. Some proteins may have fewer (canonical) repeats.	322
153391	cd07921	PCA_45_Doxase_A_like	Subunit A of the Class III Extradiol dioxygenase, Protocatechuate 4,5-dioxygenase, and similar enzymes. This subfamily includes the A subunit of protocatechuate (PCA) 4,5-dioxygenase (LigAB) and two subfamilies of unknown function. The A subunit is the smaller, non-catalytic subunit of LigAB. PCA 4,5-dioxygenase catalyzes the oxidization and subsequent ring-opening of PCA (or 3,4-dihydroxybenzoic acid), which is an intermediate in the breakdown of lignin and other compounds. PCA 4,5-dioxygenase is one of the aromatic ring opening dioxygenases which play key roles in the degradation of aromatic compounds. As members of the Class III extradiol dioxygenase family, the enzymes use a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon. LigAB-like class III enzymes are usually composed of two subunits, designated A and B, which form a tetramer composed of two copies of each subunit.	106
153392	cd07922	CarBa	CarBa is the A subunit of 2-aminophenol 1,6-dioxygenase, which catalyzes the oxidization and subsequent ring-opening of 2-aminophenyl-2,3-diol. CarBa is the A subunit of 2-aminophenol 1,6-dioxygenase, which catalyzes the oxidization and subsequent ring-opening of 2-aminophenyl-2,3-diol. 2-aminophenol 1,6-dioxygenase is a key enzyme in the carbazole degradation pathway isolated from bacterial strains with carbazole degradation ability. The enzyme is a heterotetramer composed of two A and two B subunits. CarB belongs to the class III extradiol dioxygenase family, composed of enzymes which use a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon. Although the enzyme was originally isolated as a meta-cleavage enzyme for 2'-aminobiphenyl-2,3-diol involved in carbazole degradation, the enzyme has also shown high specificity for 2,3-dihydroxybiphenyl.	81
153393	cd07923	Gallate_dioxygenase_C	The C-terminal domain of Gallate Dioxygenase, which catalyzes the oxidization and subsequent ring-opening of gallate. Gallate Dioxygenase catalyzes the oxidization and subsequent ring-opening of gallate, an intermediate in the degradation of the aromatic compound, syringate. The reaction product of gallate dioxygenase is 4-oxalomesaconate. The amino acid sequence of the N-terminal and C-terminal regions of gallate dioxygenase exhibits homology with the sequence of the PCA 4,5-dioxygenase B (catalytic) and A subunits, respectively. This model represents the C-terminal domain, which is similar to the A subunit of PCA 4,5-dioxygenase (or LigAB). The enzyme is estimated to be a homodimer according to the Escherichia coli enzyme. Since enzymes in this subfamily have fused A and B subunits, the dimer interface may resemble the tetramer interface of classical LigAB enzymes. This enzyme belongs to the class III extradiol dioxygenase family, composed of enzymes which use a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon.	94
153394	cd07924	PCA_45_Doxase_A	The A subunit of Protocatechuate 4,5-dioxygenase (LigAB) is the smaller, non-catalytic subunit. The A subunit is the non-catalytic subunit of Protocatechuate (PCA) 4,5-dioxygenase (LigAB), which is composed of A and B subunits that form a tetramer. PCA 4,5-dioxygenase catalyzes the oxidization and subsequent ring-opening of PCA (or 3,4-dihydroxybenzoic acid), which is an intermediate in the breakdown of lignin and other compounds. PCA 4,5-dioxygenase is one of the aromatic ring opening dioxygenases which play key roles in the degradation of aromatic compounds. As a member of the Class III extradiol dioxygenase family, LigAB uses a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon.	121
153395	cd07925	LigA_like_1	The A subunit of Uncharacterized proteins with similarity to Protocatechuate 4,5-dioxygenase (LigAB). The proteins of unknown function in this subfamily are similar to the A subunit of the Protocatechuate (PCA) 4,5-dioxygenase (LigAB). LigAB belongs to the class III extradiol dioxygenase family, composed of enzymes which use a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon. Dioxygenases play key roles in the degradation of aromatic compounds. PCA 4,5-dioxygenase catalyzes the oxidization and subsequent ring-opening of PCA (or 3,4-dihydroxybenzoic acid), which is an intermediate in the breakdown of lignin and other compounds.	106
143648	cd07927	RHD-n_NFAT_like	N-terminal sub-domain of the Rel homology domain (RHD) of nuclear factor of activated T-cells (NFAT) proteins and similar proteins. Proteins containing the Rel homology domain (RHD) are metazoan transcription factors. The RHD is composed of two structural sub-domains; this model characterizes the N-terminal RHD sub-domain of the NFAT family of transcription factors. NFAT transcription complexes are a target of calcineurin, a calcium dependent phosphatase, and activate genes that are mainly involved in cell-cell interaction. Upon de-phosphorylation of the nuclear localization signal, NFAT enters the nucleus and acts as a transcription factor; its export from the nucleus is triggered by phosphorylation via export kinases. NFATs play important roles in mediating the immune response, and are found in T cells, B Cells, NK cells, mast cells, and monocytes. NFATs are also found in various non-hematopoietic cell types, where they play roles in development. This group also contains the N-terminal RHD sub-domain of the non-calcium regulated tonicity-responsive enhancer binding protein (TonEBP), also called NFAT5. Mammalian TonEBP regulates the expression of genes in response to tonicity. It plays a pivotal role in urinary concentrating mechanisms in kidney medulla, by triggering the accumulation of osmolytes that enable renal medullary cells to tolerate high levels of urea and salt.	161
153077	cd07930	bacterial_phosphagen_kinase	Phosphagen (guanidino) kinases found in bacteria. Phosphagen (guanidino) kinases are enzymes that transphosphorylate a high energy phosphoguanidino compound, such as phosphocreatine (PCr) or phosphoarginine, which is used as an energy-storage and -transport metabolite, to ADP, thereby creating ATP. This subfamily is specific to bacteria and lacks an N-terminal domain, which otherwise forms part of the substrate binding site. Most of the catalytic residues are found in the larger C-terminal domain, however, which appears conserved in these bacterial proteins. Their functions have not been characterized.	232
153078	cd07931	eukaryotic_phosphagen_kinases	Phosphagen (guanidino) kinases mostly found in eukaryotes. Phosphagen (guanidino) kinases are enzymes that transphosphorylate a high energy phosphoguanidino compound, like phosphocreatine (PCr) in the case of creatine kinase (CK) or phosphoarginine in the case of arginine kinase (AK), which is used as an energy-storage and -transport metabolite, to ADP, thereby creating ATP. The substrate binding site is located in the cleft between the N and C-terminal domains, but most of the catalytic residues are found in the larger C-terminal domain. In higher eukaryotes, CK exists in tissue-specific (muscle, brain), as well as compartment-specific (mitochondrial and cytosolic) isoforms. They are either coupled to glycolysis (cytosolic form) or oxidative phosphorylation (mitochondrial form). Besides CK and AK, the most studied members of this family, there are other phosphagen kinases with different substrate specificities, like glycocyamine kinase (GK), lombricine kinase (LK), taurocyamine kinase (TK) and hypotaurocyamine kinase (HTK).	338
153079	cd07932	arginine_kinase_like	Phosphagen (guanidino) kinases such as arginine kinase and similar enzymes. Eukaryotic arginine kinase-like phosphagen (guanidino) kinases are enzymes that transphosphorylate a high energy phosphoguanidino compound, like phosphoarginine in the case of arginine kinase (AK), which is used as an energy-storage and -transport metabolite, to ADP, thereby creating ATP. The substrate binding site is located in the cleft between the N and C-terminal domains, but most of the catalytic residues are found in the larger C-terminal domain. Besides AK, one of the most studied members of this family, this model also represents a phosphagen kinase with different substrate specificity, hypotaurocyamine kinase (HTK).	350
143649	cd07933	RHD-n_c-Rel	N-terminal sub-domain of the Rel homology domain (RHD) of c-Rel. Proteins containing the Rel homology domain (RHD) are metazoan transcription factors. The RHD is composed of two structural sub-domains; this model characterizes the N-terminal RHD sub-domain of the c-Rel family of transcription factors, categorized as a class II member of the NF-kappa B family. In class II NF-kappa Bs, the RHD domain co-occurs with a C-terminal transactivation domain (TAD). NF-kappa B proteins are part of a protein complex that acts as a transcription factor, which is responsible for regulating a host of cellular responses to a variety of stimuli. This complex tightly regulates the expression of a large number of genes, and is involved in processes such as adaptive and innate immunity, stress response, inflammation, cell adhesion, proliferation and apoptosis. The cytosolic NF-kappa B complex is activated via phosphorylation of the ankyrin-repeat containing inhibitory protein I-kappa B, which dissociates from the complex and exposes the nuclear localization signal of the heterodimer (NF-kappa B and Rel). c-Rel plays an important role in B cell proliferation and survival.	172
143650	cd07934	RHD-n_NFkB2	N-terminal sub-domain of the Rel homology domain (RHD) of nuclear factor kappa B2 (NF-kappa B2). Proteins containing the Rel homology domain (RHD) are metazoan transcription factors. The RHD is composed of two structural sub-domains; this model characterizes the N-terminal RHD sub-domain of the NF-kappa B2 family of transcription factors, a class I member of the NF-kappa B family. In class I NF-kappa Bs, the RHD domain co-occurs with C-terminal ankyrin repeats. NF-kappa B2 is commonly referred to as p100 or p52 (proteolytically processed form). NF-kappa B proteins are part of a protein complex that acts as a transcription factor, which is responsible for regulating a host of cellular responses to a variety of stimuli. This complex tightly regulates the expression of a large number of genes, and is involved in processes such as adaptive and innate immunity, stress response, inflammation, cell adhesion, proliferation and apoptosis. The cytosolic NF-kappa B complex is activated via phosphorylation of the ankyrin-repeat containing inhibitory protein I-kappa B, which dissociates from the complex and exposes the nuclear localization signal of the heterodimer (NF-kappa B and REL). NF-kappa B2 is involved in the alternative NF-kappa B signaling pathway which is activated by few agonists and plays an important role in secondary lymphoid organogenesis, maturation of B-cells, and adaptive humoral immunity. p100 may also act as an I-kappa B due to its C-terminal ankyrin repeats.	185
143651	cd07935	RHD-n_NFkB1	N-terminal sub-domain of the Rel homology domain (RHD) of nuclear factor of kappa B1 (NF-kappa B1). Proteins containing the Rel homology domain (RHD) are metazoan transcription factors. The RHD is composed of two structural sub-domains; this model characterizes the N-terminal RHD sub-domain of the NF-kappa B1 family of transcription factors, a class I member of the NF-kappa B family. In class I NF-kappa Bs, the RHD domain co-occurs with C-terminal ankyrin repeats. NF-kappa B1 is commonly referred to as p105 or p50 (proteolytically processed form). NF-kappa B proteins are part of a protein complex that acts as a transcription factor, which is responsible for regulating a host of cellular responses to a variety of stimuli. This complex tightly regulates the expression of a large number of genes, and is involved in processes such as adaptive and innate immunity, stress response, inflammation, cell adhesion, proliferation and apoptosis. The cytosolic NF-kappa B complex is activated via phosphorylation of the ankyrin-repeat containing inhibitory protein I-kappa B, which dissociates from the complex and exposes the nuclear localization signal of the heterodimer (NF-kappa B and REL). NF-kappa B1 is involved in the canonical NF-kappa B signaling pathway which is activated by many agonists and is essential in immune and inflammatory responses, as well as cell survival. p105 is involved in its own specific NF-kappa B signaling pathway which is also implicated in immune and inflammatory responses. p105 may also act as an I-kappa B due to its C-terminal ankyrin repeats. It is also involved in mitogen-activated protein kinase (MAPK) signaling as its degradation leads to the activation of TPL-2, a MAPK kinase kinase which activates ERK pathways.	202
153421	cd07936	SCAN	SCAN oligomerization domain. The SCAN domain (named after SRE-ZBP, CTfin51, AW-1 and Number 18 cDNA) is found in several vertebrate proteins that contain C2H2 zinc finger motifs, many of which may be transcription factors playing roles in cell survival and differentiation. This protein-interaction domain is able to mediate homo- and hetero-oligomerization of SCAN-containing proteins. Some SCAN-containing proteins, including those of lower vertebrates, do not contain zinc finger motifs. It has been noted that the SCAN domain resembles a domain-swapped version of the C-terminal domain of the HIV capsid protein. This domain model features elements common to the three general groups of SCAN domains (SCAN-A1, SCAN-A2, and SCAN-B). The SCAND1 protein is truncated at the C-terminus with respect to this model; the SCAND2 protein appears to have a truncated central helix.	85
163675	cd07937	DRE_TIM_PC_TC_5S	Pyruvate carboxylase and Transcarboxylase 5S, carboxyltransferase domain. This family includes the carboxyltransferase domains of pyruvate carboxylase (PC) and the transcarboxylase (TC) 5S subunit.  Transcarboxylase 5S is a cobalt-dependent metalloenzyme subunit of the biotin-dependent transcarboxylase multienzyme complex. Transcarboxylase 5S transfers carbon dioxide from the 1.3S biotin to pyruvate in the second of two carboxylation reactions catalyzed by TC. The first reaction involves the transfer of carbon dioxide from methylmalonyl-CoA to the 1.3S biotin, and is catalyzed by the 12S subunit.  These two steps allow a carboxylate group to be transferred from oxaloacetate to propionyl-CoA to yield pyruvate and methylmalonyl-CoA.  The catalytic domain of transcarboxylase 5S has a canonical TIM-barrel fold with a large C-terminal extension that forms a funnel leading to the active site.  Transcarboxylase 5S forms a homodimer and there are six dimers per complex.  In addition to the catalytic domain, transcarboxylase 5S has several other domains including a carbamoyl-phosphate synthase domain, a biotin carboxylase domain, a carboxyltransferase domain, and an ATP-grasp domain.  Pyruvate carboxylase, like TC, is a biotin-dependent enzyme that catalyzes the carboxylation of pyruvate to produce oxaloacetate.  In mammals, PC has critical roles in gluconeogenesis, lipogenesis, glyceroneogenesis, and insulin secretion.  Inherited PC deficiencies are linked to serious diseases in humans such as lactic acidemia, hypoglycemia, psychomotor retardation, and death.  PC is a single-chain enzyme and is active only in its homotetrameric form.  PC has three domains, an N-terminal biotin carboxylase domain, a carboxyltransferase domain (this alignment model), and a C-terminal biotin-carboxyl carrier protein domain.  This family belongs to the DRE-TIM metallolyase superfamily.  DRE-TIM metallolyases include 2-isopropylmalate synthase (IPMS), alpha-isopropylmalate synthase (LeuA), 3-hydroxy-3-methylglutaryl-CoA lyase, homocitrate synthase, citramalate synthase, 4-hydroxy-2-oxovalerate aldolase, re-citrate synthase, transcarboxylase 5S, pyruvate carboxylase, AksA, and FrbC.  These members all share a conserved  triose-phosphate isomerase (TIM) barrel domain consisting of a core beta(8)-alpha(8) motif with the eight parallel beta strands forming an enclosed barrel surrounded by eight alpha helices.  The domain has a catalytic center containing a divalent cation-binding site formed by a cluster of invariant residues that cap the core of the barrel.  In addition, the catalytic site includes three invariant residues - an aspartate (D), an arginine (R), and a glutamate (E) - which is the basis for the domain name "DRE-TIM".	275
163676	cd07938	DRE_TIM_HMGL	3-hydroxy-3-methylglutaryl-CoA lyase, catalytic TIM barrel domain. 3-hydroxy-3-methylglutaryl-CoA lyase (HMGL) catalyzes the cleavage of HMG-CoA to acetyl-CoA and acetoacetate, one of the terminal steps in ketone body generation and leucine degradation, and is a key enzyme in the pathway that supplies metabolic fuel to extrahepatic tissues.  Mutations in HMGL cause a human autosomal recessive disorder called primary metabolic aciduria that affects ketogenesis and leucine catabolism and can be fatal due to an inability to tolerate hypoglycemia.  HMGL has a TIM barrel domain with a catalytic center containing a divalent cation-binding site formed by a cluster of invariant residues that cap the core of the barrel.  The cleavage of HMG-CoA requires the presence of a divalent cation like Mg2+ or Mn2+, and the reaction is thought to involve general acid/base catalysis.  This family belongs to the DRE-TIM metallolyase superfamily.  DRE-TIM metallolyases include 2-isopropylmalate synthase (IPMS), alpha-isopropylmalate synthase (LeuA), 3-hydroxy-3-methylglutaryl-CoA lyase, homocitrate synthase, citramalate synthase, 4-hydroxy-2-oxovalerate aldolase, re-citrate synthase, transcarboxylase 5S, pyruvate carboxylase, AksA, and FrbC.  These members all share a conserved  triose-phosphate isomerase (TIM) barrel domain consisting of a core beta(8)-alpha(8) motif with the eight parallel beta strands forming an enclosed barrel surrounded by eight alpha helices.  The domain has a catalytic center containing a divalent cation-binding site formed by a cluster of invariant residues that cap the core of the barrel.  In addition, the catalytic site includes three invariant residues - an aspartate (D), an arginine (R), and a glutamate (E) - which is the basis for the domain name "DRE-TIM".	274
163677	cd07939	DRE_TIM_NifV	Streptomyces rubellomurinus FrbC and related proteins, catalytic TIM barrel domain. FrbC (NifV) of Streptomyces rubellomurinus catalyzes the condensation of acetyl-CoA and alpha-ketoglutarate to form homocitrate and CoA, a reaction similar to one catalyzed by homocitrate synthase.  The gene encoding FrbC is one of several genes required for the biosynthesis of FR900098, a potent antimalarial antibiotic.  This protein is also required for assembly of the nitrogenase MoFe complex but its exact role is unknown.   This family also includes the NifV proteins of Heliobacterium chlorum and Gluconacetobacter diazotrophicus, which appear to be orthologous to FrbC.  This family belongs to the DRE-TIM metallolyase superfamily.  DRE-TIM metallolyases include 2-isopropylmalate synthase (IPMS), alpha-isopropylmalate synthase (LeuA), 3-hydroxy-3-methylglutaryl-CoA lyase, homocitrate synthase, citramalate synthase, 4-hydroxy-2-oxovalerate aldolase, re-citrate synthase, transcarboxylase 5S, pyruvate carboxylase, AksA, and FrbC.  These members all share a conserved  triose-phosphate isomerase (TIM) barrel domain consisting of a core beta(8)-alpha(8) motif with the eight parallel beta strands forming an enclosed barrel surrounded by eight alpha helices.  The domain has a catalytic center containing a divalent cation-binding site formed by a cluster of invariant residues that cap the core of the barrel.  In addition, the catalytic site includes three invariant residues - an aspartate (D), an arginine (R), and a glutamate (E) - which is the basis for the domain name "DRE-TIM".	259
163678	cd07940	DRE_TIM_IPMS	2-isopropylmalate synthase (IPMS), N-terminal catalytic TIM barrel domain. 2-isopropylmalate synthase (IPMS) catalyzes an aldol-type condensation of acetyl-CoA and 2-oxoisovalerate yielding 2-isopropylmalate and CoA, the first committed step in leucine biosynthesis.  This family includes the Arabidopsis thaliana IPMS1 and IPMS2 proteins, the Glycine max GmN56 protein, and the Brassica insularis BatIMS protein.  This family also includes a group of archaeal IPMS-like proteins represented by the Methanocaldococcus jannaschii AksA protein.  AksA catalyzes the condensation of alpha-ketoglutarate and acetyl-CoA to form trans-homoaconitate, one of 13 steps in the conversion of alpha-ketoglutarate and acetyl-CoA to alpha-ketosuberate, a precursor to coenzyme B and biotin.  AksA also catalyzes the condensation of alpha-ketoadipate or alpha-ketopimelate with acetyl-CoA to form, respectively, the (R)-homocitrate homologs (R)-2-hydroxy-1,2,5-pentanetricarboxylic acid and (R)-2-hydroxy-1,2,6-hexanetricarboxylic acid.  This family belongs to the DRE-TIM metallolyase superfamily.  DRE-TIM metallolyases include 2-isopropylmalate synthase (IPMS), alpha-isopropylmalate synthase (LeuA), 3-hydroxy-3-methylglutaryl-CoA lyase, homocitrate synthase, citramalate synthase, 4-hydroxy-2-oxovalerate aldolase, re-citrate synthase, transcarboxylase 5S, pyruvate carboxylase, AksA, and FrbC.  These members all share a conserved  triose-phosphate isomerase (TIM) barrel domain consisting of a core beta(8)-alpha(8) motif with the eight parallel beta strands forming an enclosed barrel surrounded by eight alpha helices.  The domain has a catalytic center containing a divalent cation-binding site formed by a cluster of invariant residues that cap the core of the barrel.  In addition, the catalytic site includes three invariant residues - an aspartate (D), an arginine (R), and a glutamate (E) - which is the basis for the domain name "DRE-TIM".	268
163679	cd07941	DRE_TIM_LeuA3	Desulfobacterium autotrophicum LeuA3 and related proteins, N-terminal catalytic TIM barrel domain. Desulfobacterium autotrophicum LeuA3 is sequence-similar to alpha-isopropylmalate synthase (LeuA) but its exact function is unknown.  Members of this family have an N-terminal TIM barrel domain that belongs to the DRE-TIM metallolyase superfamily.  DRE-TIM metallolyases include 2-isopropylmalate synthase (IPMS), alpha-isopropylmalate synthase (LeuA), 3-hydroxy-3-methylglutaryl-CoA lyase, homocitrate synthase, citramalate synthase, 4-hydroxy-2-oxovalerate aldolase, re-citrate synthase, transcarboxylase 5S, pyruvate carboxylase, AksA, and FrbC.  These members all share a conserved  triose-phosphate isomerase (TIM) barrel domain consisting of a core beta(8)-alpha(8) motif with the eight parallel beta strands forming an enclosed barrel surrounded by eight alpha helices.  The domain has a catalytic center containing a divalent cation-binding site formed by a cluster of invariant residues that cap the core of the barrel.  In addition, the catalytic site includes three invariant residues - an aspartate (D), an arginine (R), and a glutamate (E) - which is the basis for the domain name "DRE-TIM".	273
163680	cd07942	DRE_TIM_LeuA	Mycobacterium tuberculosis LeuA3 and related proteins, N-terminal catalytic TIM barrel domain. Alpha-isopropylmalate synthase (LeuA), a key enzyme in leucine biosynthesis, catalyzes the first committed step in the pathway, converting acetyl-CoA and alpha-ketoisovalerate to alpha-isopropyl malate and CoA.  Although the reaction catalyzed by LeuA is similar to that of the Arabidopsis thaliana IPMS1 protein, the two fall into phylogenetically distinct families within the same superfamily.  LeuA has an N-terminal TIM barrel catalytic domain, a helical linker domain, and a C-terminal regulatory domain.  LeuA forms a homodimer in which the linker domain of one monomer sits over the catalytic domain of the other, inserting residues into the active site that may be important for catalysis.  Homologs of LeuA are found in bacteria as well as fungi.  This family includes alpha-isopropylmalate synthases I (LEU4) and II (LEU9) from Saccharomyces cerevisiae.  This family belongs to the DRE-TIM metallolyase superfamily.  DRE-TIM metallolyases include 2-isopropylmalate synthase (IPMS), alpha-isopropylmalate synthase (LeuA), 3-hydroxy-3-methylglutaryl-CoA lyase, homocitrate synthase, citramalate synthase, 4-hydroxy-2-oxovalerate aldolase, re-citrate synthase, transcarboxylase 5S, pyruvate carboxylase, AksA, and FrbC.  These members all share a conserved  triose-phosphate isomerase (TIM) barrel domain consisting of a core beta(8)-alpha(8) motif with the eight parallel beta strands forming an enclosed barrel surrounded by eight alpha helices.  The domain has a catalytic center containing a divalent cation-binding site formed by a cluster of invariant residues that cap the core of the barrel.  In addition, the catalytic site includes three invariant residues - an aspartate (D), an arginine (R), and a glutamate (E) - which is the basis for the domain name "DRE-TIM".	284
163681	cd07943	DRE_TIM_HOA	4-hydroxy-2-oxovalerate aldolase, N-terminal catalytic TIM barrel domain. 4-hydroxy-2-oxovalerate aldolase (also known as 4-hydroxy-2-ketovalerate aldolase and 4-hydroxy-2-oxopentanoate aldolase (HOA)) converts 4-hydroxy-2-oxopentanoate to acetaldehyde and pyruvate, the penultimate step in the meta-cleavage pathway for the degradation of phenols, cresols and catechol.  This family includes the Escherichia coli MhpE aldolase, the Pseudomonas DmpG aldolase, and the Burkholderia xenovorans BphI pyruvate aldolase.  In Pseudomonas, the DmpG aldolase tightly associates with a dehydrogenase (DmpF) and is inactive without it.  HOA has a canonical TIM-barrel fold with a C-terminal extension that forms a funnel leading to the active site.  This family belongs to the DRE-TIM metallolyase superfamily.  DRE-TIM metallolyases include 2-isopropylmalate synthase (IPMS), alpha-isopropylmalate synthase (LeuA), 3-hydroxy-3-methylglutaryl-CoA lyase, homocitrate synthase, citramalate synthase, 4-hydroxy-2-oxovalerate aldolase, re-citrate synthase, transcarboxylase 5S, pyruvate carboxylase, AksA, and FrbC.  These members all share a conserved  triose-phosphate isomerase (TIM) barrel domain consisting of a core beta(8)-alpha(8) motif with the eight parallel beta strands forming an enclosed barrel surrounded by eight alpha helices.  The domain has a catalytic center containing a divalent cation-binding site formed by a cluster of invariant residues that cap the core of the barrel.  In addition, the catalytic site includes three invariant residues - an aspartate (D), an arginine (R), and a glutamate (E) - which is the basis for the domain name "DRE-TIM".	263
163682	cd07944	DRE_TIM_HOA_like	4-hydroxy-2-oxovalerate aldolase-like, N-terminal catalytic TIM barrel domain. This family of bacterial enzymes is sequence-similar to 4-hydroxy-2-oxovalerate aldolase (HOA) but its exact function is unknown.  This family includes the Bacteroides vulgatus Bvu_2661 protein and belongs to the DRE-TIM metallolyase superfamily.  DRE-TIM metallolyases include 2-isopropylmalate synthase (IPMS), alpha-isopropylmalate synthase (LeuA), 3-hydroxy-3-methylglutaryl-CoA lyase, homocitrate synthase, citramalate synthase, 4-hydroxy-2-oxovalerate aldolase, re-citrate synthase, transcarboxylase 5S, pyruvate carboxylase, AksA, and FrbC.  These members all share a conserved  triose-phosphate isomerase (TIM) barrel domain consisting of a core beta(8)-alpha(8) motif with the eight parallel beta strands forming an enclosed barrel surrounded by eight alpha helices.  The domain has a catalytic center containing a divalent cation-binding site formed by a cluster of invariant residues that cap the core of the barrel.  In addition, the catalytic site includes three invariant residues - an aspartate (D), an arginine (R), and a glutamate (E) - which is the basis for the domain name "DRE-TIM".	266
163683	cd07945	DRE_TIM_CMS	Leptospira interrogans citramalate synthase (CMS) and related proteins, N-terminal catalytic TIM barrel domain. Citramalate synthase (CMS) catalyzes the conversion of pyruvate and acetyl-CoA to (R)-citramalate in the first dedicated step of the citramalate pathway.  Citramalate is only found in Leptospira interrogans and a few other microorganisms.  This family belongs to the DRE-TIM metallolyase superfamily.  DRE-TIM metallolyases include 2-isopropylmalate synthase (IPMS), alpha-isopropylmalate synthase (LeuA), 3-hydroxy-3-methylglutaryl-CoA lyase, homocitrate synthase, citramalate synthase, 4-hydroxy-2-oxovalerate aldolase, re-citrate synthase, transcarboxylase 5S, pyruvate carboxylase, AksA, and FrbC.  These members all share a conserved  triose-phosphate isomerase (TIM) barrel domain consisting of a core beta(8)-alpha(8) motif with the eight parallel beta strands forming an enclosed barrel surrounded by eight alpha helices.  The domain has a catalytic center containing a divalent cation-binding site formed by a cluster of invariant residues that cap the core of the barrel.  In addition, the catalytic site includes three invariant residues - an aspartate (D), an arginine (R), and a glutamate (E) - which is the basis for the domain name "DRE-TIM".	280
163684	cd07947	DRE_TIM_Re_CS	Clostridium kluyveri Re-citrate synthase and related proteins, catalytic TIM barrel domain. Re-citrate synthase (Re-CS) is a Clostridium kluyveri enzyme that converts acetyl-CoA and oxaloacetate to citrate.  In most organisms, this reaction is catalyzed by Si-citrate synthase which is Si-face stereospecific with respect to C-2 of oxaloacetate, and phylogenetically unrelated to Re-citrate synthase.  Re-citrate synthase is also found in a few other strictly anaerobic organisms.  This family belongs to the DRE-TIM metallolyase superfamily.  DRE-TIM metallolyases include 2-isopropylmalate synthase (IPMS), alpha-isopropylmalate synthase (LeuA), 3-hydroxy-3-methylglutaryl-CoA lyase, homocitrate synthase, citramalate synthase, 4-hydroxy-2-oxovalerate aldolase, re-citrate synthase, transcarboxylase 5S, pyruvate carboxylase, AksA, and FrbC.  These members all share a conserved  triose-phosphate isomerase (TIM) barrel domain consisting of a core beta(8)-alpha(8) motif with the eight parallel beta strands forming an enclosed barrel surrounded by eight alpha helices.  The domain has a catalytic center containing a divalent cation-binding site formed by a cluster of invariant residues that cap the core of the barrel.  In addition, the catalytic site includes three invariant residues - an aspartate (D), an arginine (R), and a glutamate (E) - which is the basis for the domain name "DRE-TIM".	279
163685	cd07948	DRE_TIM_HCS	Saccharomyces cerevisiae homocitrate synthase and related proteins, catalytic TIM barrel domain. Homocitrate synthase (HCS) catalyzes the condensation of acetyl-CoA and alpha-ketoglutarate to form homocitrate, the first step in the lysine biosynthesis pathway.  This family includes the Yarrowia lipolytica LYS1 protein as well as the Saccharomyces cerevisiae LYS20 and LYS21 proteins.  This family belongs to the DRE-TIM metallolyase superfamily.  DRE-TIM metallolyases include 2-isopropylmalate synthase (IPMS), alpha-isopropylmalate synthase (LeuA), 3-hydroxy-3-methylglutaryl-CoA lyase, homocitrate synthase, citramalate synthase, 4-hydroxy-2-oxovalerate aldolase, re-citrate synthase, transcarboxylase 5S, pyruvate carboxylase, AksA, and FrbC.  These members all share a conserved  triose-phosphate isomerase (TIM) barrel domain consisting of a core beta(8)-alpha(8) motif with the eight parallel beta strands forming an enclosed barrel surrounded by eight alpha helices.  The domain has a catalytic center containing a divalent cation-binding site formed by a cluster of invariant residues that cap the core of the barrel.  In addition, the catalytic site includes three invariant residues - an aspartate (D), an arginine (R), and a glutamate (E) - which is the basis for the domain name "DRE-TIM".	262
153386	cd07949	PCA_45_Doxase_B_like_1	The B subunit of unknown Class III extradiol dioxygenases with similarity to Protocatechuate 4,5-dioxygenase. This subfamily is composed of proteins of unknown function with similarity to the B subunit of Protocatechuate 4,5-dioxygenase (LigAB). LigAB belongs to the class III extradiol dioxygenase family, a group of enzymes which use a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon. Dioxygenases play key roles in the degradation of aromatic compounds. LigAB-like enzymes are usually composed of two subunits, designated A and B, which form a tetramer composed of two copies of each subunit. This model represents the catalytic subunit, B.	276
153387	cd07950	Gallate_Doxase_N	The N-terminal domain of the Class III extradiol dioxygenase, Gallate Dioxygenase, which catalyzes the oxidization and subsequent ring-opening of gallate. Gallate Dioxygenase catalyzes the oxidization and subsequent ring-opening of gallate, an intermediate in the degradation of the aromatic compound, syringate. The reaction product of gallate dioxygenase is 4-oxalomesaconate. The amino acid sequence of the N-terminal and C-terminal regions of gallate dioxygenase exhibits homology with the sequence of PCA 4,5-dioxygenase B (catalytic) and A subunits, respectively. The enzyme is estimated to be a homodimer according to the Escherichia coli enzyme. LigAB-like enzymes are usually composed of two subunits, designated A and B, which form a tetramer composed of two copies of each subunit. In this subfamily, the subunits A and B are fused to make a single polypeptide chain. The dimer interface for this subfamily may resemble the tetramer interface of classical LigAB enzymes. Gallate Dioxygenase belongs to the class III extradiol dioxygenase family, a group of enzymes which use a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon.	277
153388	cd07951	ED_3B_N_AMMECR1	The N-terminal domain, an extradiol dioxygenase class III subunit B-like domain, of unknown proteins containing a C-terminal AMMECR1 domain. This subfamily is composed of uncharacterized proteins containing an N-terminal domain with similarity to the catalytic B subunit of class III extradiol dioxygenases and a C-terminal AMMECR1-like domain. This model represents the N-terminal domain. Class III extradiol dioxygenases use a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon; however, proteins in this subfamily do not contain a potential metal binding site and may not exhibit class III extradiol dioxygenase-like activity. The AMMECR1 protein was proposed to be a regulatory factor that is potentially involved in the development of AMME contiguous gene deletion syndrome.	256
153389	cd07952	ED_3B_like	Uncharacterized class III extradiol dioxygenases. This subfamily is composed of proteins of unknown function with similarity to the catalytic B subunit of class III extradiol dioxygenases. Class III extradiol dioxygenases use a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon. They play key roles in the degradation of aromatic compounds.	256
409289	cd07953	PUA	PUA RNA binding domain. The PUA (PseudoUridine synthase and Archaeosine transglycosylase) domain was detected in archaeal and eukaryotic pseudouridine synthases, archaeal archaeosine synthases, a family of predicted ATPases that may be involved in RNA modification, and a family of predicted archaeal and bacterial rRNA methylases. Additionally, the PUA domain was detected in a family of eukaryotic proteins that also contain a domain homologous to the translation initiation factor eIF1/SUI1; these proteins may comprise a novel type of translation factors. Unexpectedly, the PUA domain was also found in bacterial and yeast glutamate kinases; this is compatible with the demonstrated role of these enzymes in regulating the expression of other genes. It has been shown that the PUA domain acts as an RNA binding domain in at least some of the proteins involved in RNA metabolism.	73
271157	cd07954	AP_MHD_Cterm	C-terminal domain of adaptor protein (AP) complexes medium mu subunits and its homologs (MHD). This family corresponds to the C-terminal domain of heterotetrameric AP complexes medium mu subunits and its homologs existing in monomeric stonins, delta-subunit of the heteroheptameric coat protein I (delta-COPI), a protein encoded by a pro-death gene referred to as MuD (also known as MUDENG, mu-2 related death-inducing gene), an endocytic adaptor syp1, the mammalian FCH domain only proteins (FCHo1/2), SH3-containing GRB2-like protein 3-interacting protein 1 (SGIP1), and related proteins. AP complexes participate in the formation of intracellular coated transport vesicles and select cargo molecules for incorporation into the coated vesicles in the late secretory and endocytic pathways. Stonins have been characterized as clathrin-dependent AP-2 mu chain related factors and may act as cargo-specific sorting adaptors in endocytosis. Coat protein complex I (COPI)-coated vesicles function in the early secretory pathway. They mediate the retrograde transport from the Golgi to the ER, and intra-Golgi transport. MuD is distantly related to the C-terminal domain of mu2 subunit of AP-2. It is able to induce cell death by itself and plays an important role in cell death in various tissues. Syp1 represents a novel type of endocytic adaptor protein that participates in endocytosis, promotes vesicle tubulation, and contributes to cell polarity and stress responses. It shares the same domain architecture with its two ubiquitously expressed mammalian counterparts, FCHo1/2, which represent key initial proteins ultimately controlling cellular nutrient uptake, receptor regulation, and synaptic vesicle retrieval. They bind specifically to the plasma membrane and recruit the scaffold proteins eps15 and intersectin, which subsequently engage the adaptor complex AP2 and clathrin, leading to coated vesicle formation. Another mammalian neuronal-specific protein SGIP1 does have a C-terminal MHD and has been classified into this family as well. It is an endophilin-interacting protein that plays an obligatory role in the regulation of energy homeostasis. It is also involved in clathrin-mediated endocytosis by interacting with phospholipids and eps15.	245
153409	cd07955	Anticodon_Ia_Cys_like	Anticodon-binding domain of cysteinyl tRNA synthetases and domain found in MshC. This domain is found in cysteinyl tRNA synthetases (CysRS), which belong to the class Ia aminoacyl tRNA synthetases. It lies C-terminal to the catalytic core domain, and recognizes and specifically binds to the tRNA anticodon. CysRS catalyzes the transfer of cysteine to the 3'-end of its tRNA. The family also includes a domain of MshC, the rate-determining enzyme in the mycothiol biosynthetic pathway, which is specific to actinomycetes. The anticodon-binding site of CysRS lies C-terminal to this model's footprint and is not shared by MshC.	81
153410	cd07956	Anticodon_Ia_Arg	Anticodon-binding domain of arginyl tRNA synthetases. This domain is found in arginyl tRNA synthetases (ArgRS), which belong to the class Ia aminoacyl tRNA synthetases. It lies C-terminal to the catalytic core domain, and recognizes and specifically binds to the tRNA anticodon. ArgRS catalyzes the transfer of arginine to the 3'-end of its tRNA.	156
153411	cd07957	Anticodon_Ia_Met	Anticodon-binding domain of methionyl tRNA synthetases. This domain is found in methionyl tRNA synthetases (MetRS), which belong to the class Ia aminoacyl tRNA synthetases. It lies C-terminal to the catalytic core domain, and recognizes and specifically binds to the tRNA anticodon (CAU). MetRS catalyzes the transfer of methionine to the 3'-end of its tRNA.	129
153412	cd07958	Anticodon_Ia_Leu_BEm	Anticodon-binding domain of bacterial and eukaryotic mitochondrial leucyl tRNA synthetases. This domain is found in leucyl tRNA synthetases (LeuRS), which belong to the class Ia aminoacyl tRNA synthetases. It lies C-terminal to the catalytic core domain. In contrast to other class Ia enzymes, the anticodon is not used as an identity element in LeuRS (with exceptions such as Saccharomyces cerevisiae and some other eukaryotes). No anticodon-binding site can be defined for this family, which includes bacterial and eukaryotic mitochondrial members, as well as LeuRS from the archaeal Halobacteria. LeuRS catalyzes the transfer of leucine to the 3'-end of its tRNA.	117
153413	cd07959	Anticodon_Ia_Leu_AEc	Anticodon-binding domain of archaeal and eukaryotic cytoplasmic leucyl tRNA synthetases. This domain is found in leucyl tRNA synthetases (LeuRS), which belong to the class Ia aminoacyl tRNA synthetases. It lies C-terminal to the catalytic core domain. In contrast to other class Ia enzymes, the anticodon is not used as an identity element in LeuRS (with exceptions such as Saccharomyces cerevisiae and some other eukaryotes). No anticodon-binding site can be defined for this family, which includes archaeal and eukaryotic cytoplasmic members. LeuRS catalyzes the transfer of leucine to the 3'-end of its tRNA.	117
153414	cd07960	Anticodon_Ia_Ile_BEm	Anticodon-binding domain of bacterial and eukaryotic mitochondrial isoleucyl tRNA synthetases. This domain is found in isoleucyl tRNA synthetases (IleRS), which belong to the class Ia aminoacyl tRNA synthetases. It lies C-terminal to the catalytic core domain, and recognizes and specifically binds to the tRNA anticodon. This family includes bacterial and eukaryotic mitochondrial members. IleRS catalyzes the transfer of isoleucine to the 3'-end of its tRNA.	180
153415	cd07961	Anticodon_Ia_Ile_ABEc	Anticodon-binding domain of archaeal, bacterial, and eukaryotic cytoplasmic isoleucyl tRNA synthetases. This domain is found in isoleucyl tRNA synthetases (IleRS), which belong to the class Ia aminoacyl tRNA synthetases. It lies C-terminal to the catalytic core domain, and recognizes and specifically binds to the tRNA anticodon. This family includes bacterial, archaeal, and eukaryotic cytoplasmic members. IleRS catalyzes the transfer of isoleucine to the 3'-end of its tRNA.	183
153416	cd07962	Anticodon_Ia_Val	Anticodon-binding domain of valyl tRNA synthetases. This domain is found in valyl tRNA synthetases (ValRS), which belong to the class Ia aminoacyl tRNA synthetases. It lies C-terminal to the catalytic core domain, and recognizes and specifically binds to the tRNA anticodon. ValRS catalyzes the transfer of valine to the 3'-end of its tRNA.	135
153417	cd07963	Anticodon_Ia_Cys	Anticodon-binding domain of cysteinyl tRNA synthetases. This domain is found in cysteinyl tRNA synthetases (CysRS), which belong to the class Ia aminoacyl tRNA synthetases. It lies C-terminal to the catalytic core domain, and recognizes and specifically binds to the tRNA anticodon. CysRS catalyzes the transfer of cysteine to the 3'-end of its tRNA.	156
176481	cd07964	RBP-H	Head domain of virus receptor-binding proteins (RBP). Virus receptor-binding proteins (RBPs) are found in lactococcal bacteriophages, as well as in adenoviruses and reoviruses, which invade mammalian cells. Lactococcus lactis is widely used in dairy fermentations and infection of L. lactis by phages greatly impairs the fermentation process. Adenovirus typically infects respiratory tracts with symptoms ranging from the common cold to pneumonia. Onset of viral infection begins with the recognition of host cells through the receptor-binding protein complex located at the distal part of the virion. The RBP has three domains: the N-terminal shoulders domain, the interlaced neck domain, and the C-terminal head domain. Phages recognize their host through an interaction between the RBP head (RBP-H) domain and saccharidic receptors at the host cell surface. Adenovirus recognizes the membrane cofactor protein, CD46, as a cellular receptor.	103
153436	cd07967	OBF_DNA_ligase_III	The Oligonucleotide/oligosaccharide binding (OB)-fold domain of ATP-dependent DNA ligase III is a DNA-binding module that is part of the catalytic core unit. ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation using nicked nucleic acid substrates with the high energy nucleotide of ATP as a cofactor in a three step reaction mechanism. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. ATP-dependent ligases are present in many organisms such as viruses, bacteriophages, eukarya, archaea and bacteria. There are three classes of ATP-dependent DNA ligases in eukaryotic cells (I, III and IV). DNA ligase III is not found in lower eukaryotes and is present both in the nucleus and mitochondria. It has several isoforms; two splice forms, III-alpha and III-beta, differ in their carboxy-terminal sequences. DNA ligase III-beta is believed to play a role in homologous recombination during meiotic prophase. DNA ligase III-alpha interacts with X-ray Cross Complementing factor 1 (XRCC1) and functions in single nucleotide Base Excision Repair (BER). The mitochondrial form of DNA ligase III originates from the nucleolus and is involved in the mitochondrial DNA repair pathway. This isoform is expressed by a second start site on the DNA ligase III gene. DNA ligases have a highly modular architecture consisting of a unique arrangement of two or more discrete domains. The adenylation and C-terminal oligonucleotide/oligosaccharide binding (OB)-fold domains comprise a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family. The catalytic core unit contains six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases. The OB-fold domain contacts the nicked DNA substrate and is required for the ATP-dependent DNA ligase nucleotidylation step. The RxDK motif (motif VI), which is essential for ATP hydrolysis, is located in the OB-fold domain.	139
153437	cd07968	OBF_DNA_ligase_IV	The Oligonucleotide/oligosaccharide binding (OB)-fold domain of ATP-dependent DNA ligase IV is a DNA-binding module that is part of the catalytic core unit. ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation using nicked nucleic acid substrates with the high energy nucleotide of ATP as a cofactor in a three step reaction mechanism. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. ATP-dependent ligases are present in many organisms such as viruses, bacteriophages, eukarya, archaea and bacteria. There are three classes of ATP-dependent DNA ligases in eukaryotic cells (I, III and IV). DNA ligase IV is required for DNA non-homologous end joining pathways, including recombination of the V(D)J immunoglobulin gene segments in cells of the mammalian immune system. DNA ligase IV is stabilized by forming a complex with XRCC4, a nuclear phosphoprotein, which is phosphorylated by DNA-dependent protein kinase. DNA ligases have a highly modular architecture consisting of a unique arrangement of two or more discrete domains. The adenylation and C-terminal oligonucleotide/oligosaccharide binding (OB)-fold domains comprise a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family. The catalytic core unit contains six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases. The OB-fold domain contacts the nicked DNA substrate and is required for the ATP-dependent DNA ligase nucleotidylation step. The RxDK motif (motif VI), which is essential for ATP hydrolysis, is located in the OB-fold domain.	140
153438	cd07969	OBF_DNA_ligase_I	The Oligonucleotide/oligosaccharide binding (OB)-fold domain of ATP-dependent DNA ligase I is a DNA-binding module that is part of the catalytic core unit. ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation using nicked nucleic acid substrates with the high energy nucleotide of ATP as a cofactor in a three step reaction mechanism. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. ATP-dependent ligases are present in many organisms such as viruses, bacteriophages, eukarya, archaea and bacteria. There are three classes of ATP-dependent DNA ligases in eukaryotic cells (I, III and IV). This group is composed of eukaryotic DNA ligase I, Sulfolobus solfataricus DNA ligase and similar proteins. DNA ligase I is required for the ligation of Okazaki fragments during lagging-strand DNA synthesis and for base excision repair (BER). ATP dependent DNA ligases have a highly modular architecture consisting of a unique arrangement of two or more discrete domains including a DNA-binding domain, an adenylation (nucleotidyltransferase (NTase)) domain, and an oligonucleotide/oligosaccharide binding (OB)-fold domain. The adenylation and C-terminal OB-fold domains comprise a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family. The catalytic core unit contains six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases. The OB-fold domain contacts the nicked DNA substrate and is required for the ATP-dependent DNA ligase nucleotidylation step. The RxDK motif (motif VI), which is essential for ATP hydrolysis, is located in the OB-fold domain.	144
153439	cd07970	OBF_DNA_ligase_LigC	The Oligonucleotide/oligosaccharide binding (OB)-fold domain of ATP-dependent DNA ligase LigC is a DNA-binding module that is part of the catalytic core unit. ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation using nicked nucleic acid substrates with the high energy nucleotide of ATP as a cofactor in a three step reaction mechanism. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. ATP-dependent ligases are present in many organisms such as viruses, bacteriophages, eukarya, archaea and bacteria. Bacterial DNA ligases are divided into two broad classes: NAD-dependent and ATP-dependent. All bacterial species have an NAD-dependent DNA ligase (LigA). Some bacterial genomes contain multiple genes for DNA ligases that are predicted to use ATP as their cofactor, including Mycobacterium tuberculosis LigB, LigC, and LigD. This group is composed of Mycobacterium tuberculosis LigC and similar bacterial proteins. ATP dependent DNA ligases have a highly modular architecture consisting of a unique arrangement of two or more discrete domains including a DNA-binding domain, an adenylation (nucleotidyltransferase (NTase)) domain, and an oligonucleotide/oligosaccharide binding (OB)-fold domain. The adenylation and C-terminal OB-fold domains comprise a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family. The catalytic core unit contains six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases. The OB-fold domain contacts the nicked DNA substrate and is required for the ATP-dependent DNA ligase nucleotidylation step. The RxDK motif (motif VI), which is essential for ATP hydrolysis, is located in the OB-fold domain.	122
153440	cd07971	OBF_DNA_ligase_LigD	The Oligonucleotide/oligosaccharide binding (OB)-fold domain of ATP-dependent DNA ligase LigD is a DNA-binding module that is part of the catalytic core unit. ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation using nicked nucleic acid substrates with the high energy nucleotide of ATP as a cofactor in a three step reaction mechanism. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. ATP-dependent ligases are present in many organisms such as viruses, bacteriophages, eukarya, archaea and bacteria. Bacterial DNA ligases are divided into two broad classes: NAD-dependent and ATP-dependent. All bacterial species have an NAD-dependent DNA ligase (LigA). Some bacterial genomes contain multiple genes for DNA ligases that are predicted to use ATP as their cofactor, including Mycobacterium tuberculosis LigB, LigC, and LigD. This group is composed of Mycobacterium tuberculosis LigD and similar bacterial proteins. LigD, or DNA ligase D, catalyzes the end-healing and end-sealing steps during nonhomologous end joining. ATP dependent DNA ligases have a highly modular architecture consisting of a unique arrangement of two or more discrete domains including a DNA-binding domain, an adenylation (nucleotidyltransferase (NTase)) domain, and an oligonucleotide/oligosaccharide binding (OB)-fold domain. The adenylation and C-terminal OB-fold domains comprise a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family. The catalytic core unit contains six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases. The OB-fold domain contacts the nicked DNA substrate and is required for the ATP-dependent DNA ligase nucleotidylation step. The RxDK motif (motif VI), which is essential for ATP hydrolysis, is located in the OB-fold domain.	115
153441	cd07972	OBF_DNA_ligase_Arch_LigB	The Oligonucleotide/oligosaccharide binding (OB)-fold domain of archaeal and bacterial ATP-dependent DNA ligases is a DNA-binding module that is part of the catalytic core unit. ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation using nicked nucleic acid substrates with the high energy nucleotide of ATP as a cofactor in a three step reaction mechanism. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. ATP-dependent ligases are present in many organisms such as viruses, bacteriophages, eukarya, archaea and bacteria. Bacterial DNA ligases are divided into two broad classes: NAD-dependent and ATP-dependent. All bacterial species have an NAD-dependent DNA ligase (LigA). Some bacterial genomes contain multiple genes for DNA ligases that are predicted to use ATP as their cofactor, including Mycobacterium tuberculosis LigB, LigC, and LigD. This group is composed of Pyrococcus furiosus DNA ligase, Mycobacterium tuberculosis LigB, and similar archaeal and bacterial proteins. ATP dependent DNA ligases have a highly modular architecture consisting of a unique arrangement of two or more discrete domains including a DNA-binding domain, an adenylation (nucleotidyltransferase (NTase)) domain, and an oligonucleotide/oligosaccharide binding (OB)-fold domain. The adenylation and C-terminal OB-fold domains comprise a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family. The catalytic core unit contains six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases. The OB-fold domain contacts the nicked DNA substrate and is required for the ATP-dependent DNA ligase nucleotidylation step. The RxDK motif (motif VI), which is essential for ATP hydrolysis, is located in the OB-fold domain.	122
153422	cd07973	Spt4	Transcription elongation factor Spt4. Spt4 is a transcription elongation factor. Three transcription-elongation factors, Spt4, Spt5, and Spt6, are conserved among eukaryotes and are essential for transcription via the modulation of chromatin structure. It is known that Spt4, Spt5, and Spt6 are general transcription-elongation factors, controlling transcription both positively and negatively in important regulatory and developmental roles. Spt4 functions entirely in the context of the Spt4-Spt5 heterodimer and it has been found only in complex with Spt5 in yeast and human. Spt4 is a small protein that has a zinc finger at the N-terminus. Spt5 is a large protein that has several interesting structural features: an acidic N-terminus, a single NGN domain, five or six KOW domains, and a set of simple C-terminal repeats. Spt4 binds to the Spt5 NGN domain. Unlike Spt5, Spt4 is not essential for viability in yeast; however, Spt4 is critical for normal function of the Spt4-Spt5 complex. Spt4 homologs are not found in bacteria.	98
199899	cd07976	TFIIA_alpha_beta_like	Precursor of TFIIA alpha and beta subunits and similar proteins. Transcription factor II A (TFIIA) is one of the general transcription factors for RNA polymerase II. TFIIA increases the affinity of TATA-binding protein (TBP) for DNA in order to assemble the initiation complex. TFIIA also functions as an activator during development and differentiation, and is involved in transcription from TATA-less promoters. TFIIA is composed of more than one subunit in various organisms. Mammalian TFIIA large subunits (TFIIA alpha and beta) and the smaller subunit (TFIIA gamma) form a heterotrimer. TFIIA alpha and beta are encoded by a single gene (TFIIA_alpha_beta); its protein product is post-translationally processed and cleaved. TOA1 and TOA2 are the two subunits of yeast TFIIA, which correspond to mammalian TFIIA_alpha_beta and TFIIA gamma, respectively. TOA1 and TOA2 form a heterodimeric protein complex. TFIIA_alpha_beta alone is sufficient for transcription in early embryogenesis, but the cleaved forms, TFIIA alpha and TFIIA beta, represent the vast majority of TFIIA in most differentiated cells. The exact functional differences between cleaved and uncleaved forms are not yet clear. This model also contains paralogs of the canonical TFIIA_alpha_beta, such as the human ALF, which may be involved in gametogenesis and early embryogenesis (and is also subject to proteolytic cleavage).	102
153423	cd07977	TFIIE_beta_winged_helix	TFIIE_beta_winged_helix domain, located at the central core region of TFIIE beta, with double-stranded DNA binding activity. Transcription Factor IIE (TFIIE) beta winged-helix (or forkhead) domain is located at the central core region of TFIIE beta. The winged-helix is a form of helix-turn-helix (HTH) domain which typically binds DNA with the 3rd helix. The winged-helix domain is distinguished by the presence of a C-terminal beta-strand hairpin unit (the wing) that packs against the cleft of the tri-helical core. Although most winged-helix domains are multi-member families, TFIIE beta winged-helix domain is typically found as a single orthologous group. TFIIE is one of the six eukaryotic general transcription factors (TFIIA, TFIIB, TFIID, TFIIE, TFIIF and TFIIH) that are required for transcription initiation of protein-coding genes. TFIIE is a heterotetramer consisting of two copies each of alpha and beta subunits. TFIIE beta contains several functional domains, an N-terminal serine-rich region, a central core domain exhibiting a winged-helix structure capable of binding double-stranded DNA, a leucine repeat, a sigma3 region, and a C-terminal domain containing two basic regions. The assembly of transcription preinitiation complex (PIC) includes the general transcription factors and RNA polymerase II (pol II) initiated by the binding of the TBP subunit of TFIID to the TATA box, followed by either the sequential assembly of other general transcription factors and pol II or a preassembled pol II holoenzyme pathway. TFIIE interacts directly with TFIIF, TFIIB, pol II, and promoter DNA. TFIIE recruits TFIIH and regulates its activities. TFIIE and TFIIH are also important for the transition from initiation to elongation.	75
173962	cd07978	TAF13	The TATA Binding Protein (TBP) Associated Factor 13 (TAF13) is one of several TAFs that bind TBP and is involved in forming Transcription Factor IID (TFIID) complex. The TATA Binding Protein (TBP) Associated Factor 13 (TAF13) is one of several TAFs that bind TBP and is involved in forming the Transcription Factor IID (TFIID) complex. TFIID is one of six General Transcription Factors (GTF) (TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH) that are involved in accurate initiation of transcription by RNA polymerase II in eukaryotes. TFIID plays an important role in the recognition of promoter DNA and assembly of the pre-initiation complex. The TFIID complex is composed of the TBP and at least 13 TAFs. TAFs from various species were originally named by their predicted molecular weight or their electrophoretic mobility in polyacrylamide gels. A new, unified nomenclature for the pol II TAFs has been suggested to show the relationship between TAF orthologs and paralogs. Several hypotheses are proposed for TAF functions such as serving as activator-binding sites, core-promoter recognition or a role in essential catalytic activity. Each TAF, with the help of a specific activator, is required only for expression of a subset of genes and is not universally involved in transcription as are GTFs. In yeast and human cells, TAFs have been found as components of other complexes besides TFIID. Several TAFs interact via histone-fold (HFD) motifs; the HFD is the interaction motif involved in heterodimerization of the core histones and their assembly into nucleosome octamers. The minimal HFD contains three alpha-helices linked by two loops and is found in core histones, TAFs and many other transcription factors. TFIID has a histone octamer-like substructure. TAF13 interacts with TAF11 and makes a histone-like heterodimer similar to H3/H4-like proteins. The dimer may be structurally and functionally similar to the spt3 protein within the SAGA histone acetyltransferase complex.	92
173963	cd07979	TAF9	TATA Binding Protein (TBP) Associated Factor 9 (TAF9) is one of several TAFs that bind TBP and is involved in forming Transcription Factor IID (TFIID) complex. The TATA Binding Protein (TBP) Associated Factor 9 (TAF9) is one of several TAFs that bind TBP and are involved in forming the TFIID complex. TFIID is one of six General Transcription Factors (GTF) (TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH) that are involved in accurate initiation of transcription by RNA polymerase II in eukaryotes. TFIID plays an important role in the recognition of promoter DNA and assembly of the pre-initiation complex. The TFIID complex is composed of the TBP and at least 13 TAFs. TAFs from various species were originally named by their predicted molecular weight or their electrophoretic mobility in polyacrylamide gels. A new, unified nomenclature for the pol II TAFs has been suggested to show the relationship between TAF orthologs and paralogs. Human TAF9 has a paralogue gene (TAF9L) which has a redundant function. Several hypotheses are proposed for TAF function such as serving as activator-binding sites, in core-promoter recognition or a role in essential catalytic activity. It has been shown that TAF9 interacts directly with different transcription factors such as p53, herpes simplex virus activator vp16 and the basal transcription factor TFIIB. Each TAF, with the help of a specific activator, is required only for expression of a subset of genes and is not universally involved in transcription as are GTFs. In yeast and human cells, TAFs have been found as components of other complexes besides TFIID. TAF9 is a component of TFIID in multiple organisms as well as different TBP-free TAF complexes containing the GCN5-type histone acetyltransferase. Several TAFs interact via histone-fold (HFD) motifs; HFD is the interaction motif involved in heterodimerization of the core histones and their assembly into nucleosome octamers. The minimal HFD contains three alpha-helices linked by two loops and is found in core histones, TAFs and many other transcription factors. TFIID has a histone octamer-like substructure. TAF9 is a shared subunit of both the SAGA histone acetyltransferase complex and the TFIID complex. The TAF9 domain interacts with TAF6 to form a novel histone-like heterodimer that is structurally related to the histone H3 and H4 oligomer.	117
259828	cd07980	TFIIF_beta	Transcription initiation factor IIF, beta subunit. The TFIIF-beta subunit, also called RNA Polymerase II-associating Protein 30 (RAP30), forms a heteromeric complex of RAP30/74 (TFIIF, beta/gamma) that is involved in both initiation and elongation of RNA chains by RNA polymerase II. Accurate transcription in vivo requires at least six general transcription initiation factors, in addition to RNA polymerase II. TFIIF-beta binds directly to RNA polymerase II and helps bring polymerase into a pre-initiation complex.	123
381751	cd07981	TAF12	TATA Binding Protein (TBP) Associated Factor 12. The TATA Binding Protein (TBP) Associated Factor 12 (TAF12; also known as TAF2J or TAFII20) is one of several TAFs that bind TBP and is involved in forming the Transcription Factor IID (TFIID) complex. TFIID is one of several General Transcription Factors (GTFs), which also include TFIIA, TFIIB, TFIIE, TFIIF and TFIIH, that are involved in the accurate initiation of transcription by RNA polymerase II in eukaryotes. TFIID plays an important role in the recognition of promoter DNA and in the assembly of the pre-initiation complex (PIC). The TFIID complex is composed of the TBP and at least 13 TAFs which specifically interact with a variety of core promoter DNA sequences. TAFs are named after their electrophoretic mobility in polyacrylamide gels in different species. A unified and systematic nomenclature has been adopted for the pol II TAFs to show the relationship between TAF orthologs and paralogs. Several hypotheses are proposed for TAF function such as serving as activator-binding sites, core-promoter recognition, or a role in essential catalytic activity. These TAFs, with the help of specific activators, are required only for expression of a subset of genes and are not universally involved in transcription as are GTFs. In yeast and human cells, TAFs have been found as components of other complexes besides TFIID. Several TAFs interact via histone-fold (HFD) motifs; the HFD is the interaction motif involved in heterodimerization of the core histones and their assembly into nucleosome octamers. The minimal HFD contains three alpha-helices linked by two loops and is found in core histones, TAFs and many other transcription factors. TFIID has a histone octamer-like substructure. TAF12 interacts with TAF4 and makes a novel histone-like heterodimer that binds DNA and has a core promoter function of a subset of genes. It is important for RAS-induced transformation properties of human colorectal cancer cells; its levels are increased in the cells harboring the RAS mutation. Also, TAF12 interacts with activating transcription factor 7 (ATF7) and contributes to the hypersensitivity of osteoclast (OCL) precursors to 1,25-dihydroxyvitamin D3 (1,25-(OH)2D3; also known as calcitriol) in Paget's disease (PD), a disorder of the bone remodeling process, in which the body absorbs old bone and forms abnormal new bone.	69
187739	cd07982	TAF10	The TATA Binding Protein (TBP) Associated Factor 10. The TATA Binding Protein (TBP) Associated Factor 10 (TAF10) is one of several TAFs that bind TBP and are involved in forming the Transcription Factor IID (TFIID) complex. TFIID is one of several General Transcription Factors (GTFs) (TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH) that are involved in accurate initiation of transcription by RNA polymerase II in eukaryotes. TFIID plays an important role in the recognition of promoter DNA and the assembly of the preinitiation complex. The TFIID complex is composed of the TBP and at least 13 TAFs. TAFs are named after their electrophoretic mobility in polyacrylamide gels in different species. Several hypotheses are proposed for TAF functions, such as serving as activator-binding sites, being involved in core-promoter recognition, or performing an essential catalytic activity. Each TAF - with the help of a specific activator - is required only for the expression of a subset of genes, and TAFs are not universally involved in transcription as the GTFs are. TAF10 regulates genes that are important for cell cycle progression and cell morphology. A lack of TAF10 leads to cell cycle arrest and cell death by apoptosis in mouse. In both yeast and human cells, TAFs have been found as components of other complexes besides TFIID. TAF10 is part of other transcription regulatory multiprotein complexes (e.g., SAGA, TBP-free TAF-containing complex [TFTC], STAGA, and PCAF/GCN5). Several TAFs interact via histone-fold motifs. The histone fold (HFD) is the interaction motif involved in heterodimerization of the core histones and their assembly into nucleosome octamers. The minimal HFD contains three alpha-helices linked by two loops. The HFD is found in core histones, TAFs and many other transcription factors. Five HF-containing TAF pairs have been described in TFIID: TAF6-TAF9, TAF4-TAF12, TAF11-TAF13, TAF8-TAF10 and TAF3-TAF10.	108
153245	cd07983	LPLAT_DUF374-like	Lysophospholipid Acyltransferases (LPLATs) of Glycerophospholipid Biosynthesis: DUF374. Lysophospholipid acyltransferase (LPLAT) superfamily member: acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis which catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. Included in this subgroup are the uncharacterized DUF374 phospholipid/glycerol acyltransferases and similar proteins.	189
153246	cd07984	LPLAT_LABLAT-like	Lysophospholipid Acyltransferases (LPLATs) of Glycerophospholipid Biosynthesis: LABLAT-like. Lysophospholipid acyltransferase (LPLAT) superfamily member: acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis which catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. Included in this subgroup are such LPLATs as lipid A biosynthesis lauroyl/myristoyl (LABLAT, HtrB) acyltransferases and similar proteins.	192
153247	cd07985	LPLAT_GPAT	Lysophospholipid Acyltransferases (LPLATs) of Glycerophospholipid Biosynthesis: GPAT. Lysophospholipid acyltransferase (LPLAT) superfamily member: glycerol-3-phosphate 1-acyltransferase (GPAT, PlsB). LPLATs are acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis which catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. This subgroup includes glycerol-3-phosphate 1-acyltransferase (GPAT, PlsB).	235
153248	cd07986	LPLAT_ACT14924-like	Lysophospholipid Acyltransferases (LPLATs) of Glycerophospholipid Biosynthesis: Unknown ACT14924. Lysophospholipid acyltransferase (LPLAT) superfamily member: acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis which catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. Included in this subgroup are uncharacterized phospholipid/glycerol acyltransferases such as the Pectobacterium carotovorum subsp. carotovorum PC1 locus ACT14924 putative acyltransferase, and similar proteins.	210
153249	cd07987	LPLAT_MGAT-like	Lysophospholipid Acyltransferases (LPLATs) of Glycerophospholipid Biosynthesis: MGAT-like. Lysophospholipid acyltransferase (LPLAT) superfamily member: acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis which catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. Included in this subgroup are such LPLATs as 2-acylglycerol O-acyltransferase (MGAT), and similar proteins.	212
153250	cd07988	LPLAT_ABO13168-like	Lysophospholipid Acyltransferases (LPLATs) of Glycerophospholipid Biosynthesis: Unknown ABO13168. Lysophospholipid acyltransferase (LPLAT) superfamily member: acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis which catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. Included in this subgroup are uncharacterized phospholipid/glycerol acyltransferases such as the Acinetobacter baumannii ATCC 17978 locus ABO13168 putative acyltransferase, and similar proteins.	163
153251	cd07989	LPLAT_AGPAT-like	Lysophospholipid Acyltransferases (LPLATs) of Glycerophospholipid Biosynthesis: AGPAT-like. Lysophospholipid acyltransferase (LPLAT) superfamily member: acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis which catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. Included in this subgroup are such LPLATs as 1-acyl-sn-glycerol-3-phosphate acyltransferase (AGPAT, PlsC), Tafazzin (product of Barth syndrome gene), and similar proteins.	184
153252	cd07990	LPLAT_LCLAT1-like	Lysophospholipid Acyltransferases (LPLATs) of Glycerophospholipid Biosynthesis: LCLAT1-like. Lysophospholipid acyltransferase (LPLAT) superfamily member: acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis which catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. Included in this subgroup are such LPLATs as Lysocardiolipin acyltransferase 1 (LCLAT1) or 1-acyl-sn-glycerol-3-phosphate acyltransferase and similar proteins.	193
153253	cd07991	LPLAT_LPCAT1-like	Lysophospholipid Acyltransferases (LPLATs) of Glycerophospholipid Biosynthesis: LPCAT1-like. Lysophospholipid acyltransferase (LPLAT) superfamily member: acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis which catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. Included in this subgroup are such LPLATs as lysophosphatidylcholine acyltransferase 1 (LPCAT-1), glycerol-3-phosphate acyltransferase 3 (GPAT3), and similar sequences.	211
153254	cd07992	LPLAT_AAK14816-like	Lysophospholipid Acyltransferases (LPLATs) of Glycerophospholipid Biosynthesis: Unknown AAK14816-like. Lysophospholipid acyltransferase (LPLAT) superfamily member: acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis which catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. Included in this subgroup are uncharacterized glycerol-3-phosphate acyltransferases such as the Plasmodium falciparum locus AAK14816 putative acyltransferase, and similar proteins.	203
153255	cd07993	LPLAT_DHAPAT-like	Lysophospholipid Acyltransferases (LPLATs) of Glycerophospholipid Biosynthesis: GPAT-like. Lysophospholipid acyltransferase (LPLAT) superfamily member: acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis which catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. Included in this subgroup are such LPLATs as dihydroxyacetone phosphate acyltransferase (DHAPAT, also known as 1 glycerol-3-phosphate O-acyltransferase 1) and similar proteins.	205
153424	cd07994	WGR	WGR domain. The WGR domain is found in a variety of eukaryotic poly(ADP-ribose) polymerases (PARPs) as well as the putative Escherichia coli molybdate metabolism regulator and related bacterial proteins, a small family of bacterial DNA ligases, and various other bacterial proteins of unknown function. It has been called WGR after the most conserved central motif of the domain. The domain occurs in single-domain proteins and in a variety of domain architectures, and is between 70 and 80 residues in length. It has been proposed to function as a nucleic acid binding domain.	73
153431	cd07995	TPK	Thiamine pyrophosphokinase. Thiamine pyrophosphokinase (TPK, EC:2.7.6.2, also spelled thiamin pyrophosphokinase) catalyzes the transfer of a pyrophosphate group from ATP to vitamin B1 (thiamine) to form the coenzyme thiamine pyrophosphate (TPP). TPP is required for central metabolic functions, and thiamine deficiency is associated with potentially fatal human diseases. The structure of thiamine pyrophosphokinase suggests that the enzyme may operate by a mechanism of pyrophosphoryl transfer similar to those described for pyrophosphokinases functioning in nucleotide biosynthesis.	208
153425	cd07996	WGR_MMR_like	WGR domain of molybdate metabolism regulator and related proteins. The WGR domain is found in the putative Escherichia coli molybdate metabolism regulator and related bacterial proteins, as well as in various other bacterial proteins of unknown function. It has been called WGR after the most conserved central motif of the domain. The domain appears to occur in single-domain proteins and in a variety of domain architectures, together with ATP-dependent DNA ligase domains, WD40 repeats, leucine-rich repeats, and other domains. It has been proposed to function as a nucleic acid binding domain.	74
153426	cd07997	WGR_PARP	WGR domain of poly(ADP-ribose) polymerases. The WGR domain is found in a variety of eukaryotic poly(ADP-ribose) polymerases (PARPs). It has been called WGR after the most conserved central motif of the domain. The domain typically occurs together with a catalytic PARP domain, and is between 70 and 80 residues in length. It has been proposed to function as a nucleic acid binding domain. PARPs catalyze the NAD(+)-dependent synthesis of ADP-ribose polymers and their addition to various nuclear proteins and histones. Higher eukaryotes contain several PARPs and there may be up to 17 human PARP-like proteins, with three of them (PARP-1, PARP-2, and PARP-3) containing a WGR domain. The synthesis of poly-ADP-ribose requires multiple enzymatic activities for initiation, trans-ADP-ribosylation, elongation, branching, and release of the polymer from the enzyme. Poly-ADP-ribosylation was thought to be a reversible post-translational covalent modification that serves as a regulatory mechanism for protein substrates. However, it is now known that it plays important roles in many cellular processes including maintenance of genomic stability, transcriptional regulation, energy metabolism, cell death and survival, among others.	102
153427	cd07998	WGR_DNA_ligase	WGR domain of bacterial DNA ligases. The WGR domain is found in a small family of predicted bacterial DNA ligases. It has been called WGR after the most conserved central motif of the domain. The domain typically occurs together with an ATP-dependent DNA ligase domain, and is between 70 and 80 residues in length. It has been proposed to function as a nucleic acid binding domain.	77
153432	cd07999	GH7_CBH_EG	Glycosyl hydrolase family 7. Glycosyl hydrolase family 7 contains eukaryotic endoglucanases (EGs) and cellobiohydrolases (CBHs) that hydrolyze glycosidic bonds using a double-displacement mechanism. This leads to a net retention of the conformation at the anomeric carbon. Both enzymes work synergistically in the degradation of cellulose, which is the main component of plant cell wall, and is composed of beta-1,4 linked glycosyl units. EG cleaves the beta-1,4 linkages of cellulose and CBH cleaves off cellobiose disaccharide units from the reducing end of the chain. In general, the O-glycosyl hydrolases are a widespread group of enzymes that hydrolyze the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A glycosyl hydrolase classification system based on sequence similarity has led to the definition of more than 95 different families including glycoside hydrolase family 7.	386
193574	cd08000	NGN	N-Utilization Substance G (NusG) N-terminal (NGN) domain Superfamily. The N-Utilization Substance G (NusG) and its eukaryotic homolog Spt5 are involved in transcription elongation and termination. NusG contains an NGN domain at its N-terminus and Kyrpides Ouzounis and Woese (KOW) repeats at its C-terminus in bacteria and archaea. The eukaryotic ortholog, Spt5, is a large protein composed of an acidic N-terminus, an NGN domain, and multiple KOW motifs at its C-terminus. Spt5 forms a Spt4-Spt5 complex that is an essential RNA Polymerase II elongation factor. NusG was originally discovered as an N-dependent antitermination enhancing activity in Escherichia coli and has a variety of functions, such as being involved in RNA polymerase elongation and Rho-termination in bacteria. Orthologs of the NusG gene exist in all bacteria, but its functions and requirements are different. The diverse activities suggest that, after diverging from a common ancestor, NusG proteins became specialized in different bacteria.	99
153428	cd08001	WGR_PARP1_like	WGR domain of poly(ADP-ribose) polymerase 1 and similar proteins. The WGR domain is found in a variety of eukaryotic poly(ADP-ribose) polymerases (PARPs). It has been called WGR after the most conserved central motif of the domain. The domain typically occurs together with a catalytic PARP domain, and is between 70 and 80 residues in length. It has been proposed to function as a nucleic acid binding domain. PARPs catalyze the NAD(+)-dependent synthesis of ADP-ribose polymers and their addition to various nuclear proteins. Higher eukaryotes contain several PARPs and there may be up to 17 human PARP-like proteins, with three of them (PARP-1, PARP-2, and PARP-3) containing a WGR domain. The synthesis of poly-ADP-ribose requires multiple enzymatic activities for initiation, trans-ADP-ribosylation, elongation, branching, and release of the polymer from the enzyme. This subfamily is composed of vertebrate PARP-1 and similar proteins, including Arabidopsis thaliana PARP-1 and PARP-3. PARP-1 is the best-studied among the PARPs. It is a widely expressed nuclear chromatin-associated enzyme that possesses auto-mono-ADP-ribosylation (initiation), elongation, and branching activities. PARP-1 is implicated in DNA damage and cell death pathways and is important in maintaining genomic stability and regulating cell proliferation, differentiation, neuronal function, inflammation, and aging.	104
153429	cd08002	WGR_PARP3_like	WGR domain of poly(ADP-ribose) polymerase 3 and similar proteins. The WGR domain is found in a variety of eukaryotic poly(ADP-ribose) polymerases (PARPs). It has been called WGR after the most conserved central motif of the domain. The domain typically occurs together with a catalytic PARP domain, and is between 70 and 80 residues in length. It has been proposed to function as a nucleic acid binding domain. PARPs catalyze the NAD(+)-dependent synthesis of ADP-ribose polymers and their addition to various nuclear proteins. Higher eukaryotes contain several PARPs and there may be up to 17 human PARP-like proteins, with three of them (PARP-1, PARP-2, and PARP-3) containing a WGR domain. The synthesis of poly-ADP-ribose requires multiple enzymatic activities for initiation, trans-ADP-ribosylation, elongation, branching, and release of the polymer from the enzyme. This subfamily is composed of human PARP-3 and similar proteins, including Arabidopsis thaliana PARP-2. PARP-3 displays a tissue-specific expression, with highest amounts found in the nuclei of epithelial cells of prostate ducts, salivary glands, liver, pancreas, and in the neurons of terminal ganglia. Unlike PARP-1 and PARP-2, PARP-3 activity is not induced by DNA strand breaks. However, it co-localizes with Polycomb group bodies and is part of complexes making up DNA-PKcs, DNA ligases III and IV, Ku70, and Ku80. PARP-3 is a nuclear protein that may be involved in transcriptional control and responses to DNA damage.	100
153430	cd08003	WGR_PARP2_like	WGR domain of poly(ADP-ribose) polymerases. The WGR domain is found in a variety of eukaryotic poly(ADP-ribose) polymerases (PARPs). It has been called WGR after the most conserved central motif of the domain. The domain typically occurs together with a catalytic PARP domain, and is between 70 and 80 residues in length. It has been proposed to function as a nucleic acid binding domain. PARPs catalyze the NAD(+)-dependent synthesis of ADP-ribose polymers and their addition to various nuclear proteins. Higher eukaryotes contain several PARPs and there may be up to 17 human PARP-like proteins, with three of them (PARP-1, PARP-2, and PARP-3) containing a WGR domain. The synthesis of poly-ADP-ribose requires multiple enzymatic activities for initiation, trans-ADP-ribosylation, elongation, branching, and release of the polymer from the enzyme. This subfamily is composed of human PARP-2 and similar proteins. Similar to PARP-1, PARP-2 is ubiquitously expressed and its activity is induced by DNA strand breaks. It also plays a role in cell differentiation, cell death, and maintaining genomic stability. Studies on PARP-2-deficient mice show that it is important in fat storage, T cell maturation, and spermatogenesis.	103
381750	cd08010	MltG_like	proteins similar to Escherichia coli YceG/mltG may function as endolytic murein transglycosylases. The gene product of Escherichia coli yceG/mltG has been erroneously annotated as an aminodeoxychorismate lyase. Its overexpression has been reported to cause abnormal biofilm architecture, and it has been reported to be part of a putative five-gene operon. More recently it has been proposed to function as a terminase for peptidoglycan polymerization. The family also includes Streptomyces caeruleus NovB, an uncharacterized member of the novobiocin biosynthetic gene cluster.	246
349933	cd08011	M20_ArgE_DapE-like	M20 Peptidases with similarity to acetylornithine deacetylases and succinyl-diaminopimelate desuccinylases. Peptidase M20 family, uncharacterized protein subfamily with similarity to acetylornithine deacetylase/succinyl-diaminopimelate desuccinylase (ArgE/DapE) subfamily. This group includes the hypothetical protein ygeY from Escherichia coli, a putative deacetylase, but many in this subfamily are classified as unassigned peptidases. ArgE/DapE enzymes catalyze analogous reactions and share a common activator, the metal ion (usually Co2+ or Zn2+). ArgE catalyzes a broad range of substrates, including N-acetylornithine, alpha-N-acetylmethionine and alpha-N-formylmethionine, while DapE catalyzes the hydrolysis of N-succinyl-L,L-diaminopimelate (L,L-SDAP) to L,L-diaminopimelate and succinate. Proteins in this subfamily are mostly archaeal, and have been inferred by homology as being related to both ArgE and DapE.	355
349934	cd08012	M20_ArgE-related	M20 Peptidases with similarity to acetylornithine deacetylases. Peptidase M20 family, acetylornithine deacetylase (ArgE, Acetylornithinase, AO, N2-acetyl-L-ornithine amidohydrolase, EC 3.5.1.16)-related subfamily. Proteins in this subfamily have not yet been characterized, but have been predicted to have deacetylase activity. ArgE catalyzes the conversion of N-acetylornithine to ornithine, which can then be incorporated into the urea cycle for the final stage of arginine synthesis. The substrate specificity of ArgE is quite broad; several alpha-N-acyl-L-amino acids can be hydrolyzed, including alpha-N-acetylmethionine and alpha-N-formylmethionine. ArgE shares significant sequence homology and biochemical features, and possibly a common origin, with glutamate carboxypeptidase (CPG2) and succinyl-diaminopimelate desuccinylase (DapE), and aminoacylase I (ACY1), having all metal ligand binding residues conserved.	423
349935	cd08013	M20_ArgE_DapE-like	M20 peptidases with similarity to acetylornithine deacetylases and succinyl-diaminopimelate desuccinylases. Peptidase M20 family, uncharacterized protein subfamily with similarity to acetylornithine deacetylase/succinyl-diaminopimelate desuccinylase (ArgE/DapE) subfamily. This group includes the hypothetical protein ygeY from Escherichia coli, a putative deacetylase, but many in this subfamily are classified as unassigned peptidases. ArgE/DapE enzymes catalyze analogous reactions and share a common activator, the metal ion (usually Co2+ or Zn2+). ArgE catalyzes a broad range of substrates, including N-acetylornithine, alpha-N-acetylmethionine and alpha-N-formylmethionine, while DapE catalyzes the hydrolysis of N-succinyl-L,L-diaminopimelate (L,L-SDAP) to L,L-diaminopimelate and succinate. Proteins in this subfamily are mostly fungal and bacterial, and have been inferred by homology as being related to both ArgE and DapE.	379
349936	cd08014	M20_Acy1-like	M20 Peptidase aminoacylase 1 subfamily. Peptidase M20 family, uncharacterized subfamily of bacterial proteins predicted as putative amidohydrolases. These are a class of zinc binding homodimeric enzymes involved in hydrolysis of N-acetylated proteins. N-terminal acetylation of proteins is a widespread and highly conserved process that is involved in protection and stability of proteins. Several types of aminoacylases can be distinguished on the basis of substrate specificity. Aminoacylase 1 (ACY1) breaks down cytosolic aliphatic N-acyl-alpha-amino acids (except L-aspartate), especially N-acetyl-methionine and acetyl-glutamate into L-amino acids and an acyl group. However, ACY1 can also catalyze the reverse reaction, the synthesis of acetylated amino acids. ACY1 may also play a role in xenobiotic bioactivation as well as the inter-organ processing of amino acid-conjugated xenobiotic derivatives (S-substituted-N-acetyl-L-cysteine).	371
349937	cd08015	M28_like	M28 Zn-peptidase-like; uncharacterized subfamily. Peptidase family M28 (also called aminopeptidase Y family), uncharacterized subfamily. The M28 family contains aminopeptidases as well as carboxypeptidases. They have co-catalytic zinc ions; each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions.	218
349938	cd08017	M20_IAA_Hyd	M20 Peptidase Indole-3-acetic acid amino acid hydrolase. Peptidase M20 family, plant aminoacylase-1 indole-3-acetic-L-aspartic acid hydrolase (IAA-Asp hydrolase; IAAspH; IAAH; IAA amidohydrolase; EC 3.5.1.-) subfamily. IAAspH hydrolyzes indole-3-acetyl-N-aspartic acid (IAA-Asp) to indole-3-acetic acid (IAA or auxin). Genes encoding IAA-amidohydrolases were first cloned from Arabidopsis; ILR1, IAR3, ILL1 and ILL2 encode active IAA-amino acid hydrolases, and three additional amidohydrolase-like genes (ILL3, ILL5, ILL6) have been isolated. In higher plants, the growth regulator indole-3-acetic acid (IAA or auxin) is found both free and conjugated via amide bonding to a variety of amino acids and peptides, and via an ester linkage to carbohydrates. IAA-Asp conjugates are involved in homeostatic control, protection, storing and subsequent use of free IAA. IAA-Asp is also found in some plants as a unique intermediate for entering into IAA non-decarboxylative oxidative pathway. IAA amidohydrolase cleaves the amide bond between the auxin and the conjugated amino acid. Enterobacter agglomerans IAAspH has very strong enzyme activity and substrate specificity towards IAA-Asp, although its substrate affinity is weaker compared to Arabidopsis enzymes of the ILR1 gene family. Enhanced IAA-hydrolase activity has been observed during clubroot disease in Chinese cabbage.	376
349939	cd08018	M20_Acy1_amhX-like	M20 Peptidase aminoacylase 1 amhX-like subfamily. Peptidase M20 family, uncharacterized subfamily of proteins predicted as putative amidohydrolases, including the amhX gene product from Bacillus subtilis. These are a class of zinc binding homodimeric enzymes involved in hydrolysis of N-acetylated proteins. N-terminal acetylation of proteins is a widespread and highly conserved process that is involved in protection and stability of proteins. Several types of aminoacylases can be distinguished on the basis of substrate specificity. Aminoacylase 1 (ACY1) breaks down cytosolic aliphatic N-acyl-alpha-amino acids (except L-aspartate), especially N-acetyl-methionine and acetyl-glutamate into L-amino acids and an acyl group. However, ACY1 can also catalyze the reverse reaction, the synthesis of acetylated amino acids. ACY1 may also play a role in xenobiotic bioactivation as well as the inter-organ processing of amino acid-conjugated xenobiotic derivatives (S-substituted-N-acetyl-L-cysteine).	365
349940	cd08019	M20_Acy1-like	M20 Peptidase aminoacylase 1 subfamily. Peptidase M20 family, uncharacterized subfamily of bacterial proteins predicted as putative amidohydrolases. These are a class of zinc binding homodimeric enzymes involved in hydrolysis of N-acetylated proteins. N-terminal acetylation of proteins is a widespread and highly conserved process that is involved in protection and stability of proteins. Several types of aminoacylases can be distinguished on the basis of substrate specificity. Aminoacylase 1 (ACY1) breaks down cytosolic aliphatic N-acyl-alpha-amino acids (except L-aspartate), especially N-acetyl-methionine and acetyl-glutamate into L-amino acids and an acyl group. However, ACY1 can also catalyze the reverse reaction, the synthesis of acetylated amino acids. ACY1 may also play a role in xenobiotic bioactivation as well as the inter-organ processing of amino acid-conjugated xenobiotic derivatives (S-substituted-N-acetyl-L-cysteine).	372
349941	cd08021	M20_Acy1_YhaA-like	M20 Peptidase aminoacylase 1 subfamily, includes Bacillus subtilis YhaA and Staphylococcus aureus amidohydrolase, SACOL0085. Peptidase M20 family, uncharacterized subfamily of bacterial proteins predicted as putative amidohydrolases or hippurate hydrolases. These are a class of zinc binding homodimeric enzymes involved in hydrolysis of N-acetylated proteins. N-terminal acetylation of proteins is a widespread and highly conserved process that is involved in protection and stability of proteins. Several types of aminoacylases can be distinguished on the basis of substrate specificity. Aminoacylase 1 (ACY1) breaks down cytosolic aliphatic N-acyl-alpha-amino acids (except L-aspartate), especially N-acetyl-methionine and acetyl-glutamate into L-amino acids and an acyl group. However, ACY1 can also catalyze the reverse reaction, the synthesis of acetylated amino acids. ACY1 may also play a role in xenobiotic bioactivation as well as the inter-organ processing of amino acid-conjugated xenobiotic derivatives (S-substituted-N-acetyl-L-cysteine). This family includes Staphylococcus aureus amidohydrolase, SACOL0085, which contains two manganese ions in the active site, and forms a homotetramer with variations in interdomain orientation which possibly plays a role in the regulation of catalytic activity.	384
349942	cd08022	M28_PSMA_like	M28 Zn-peptidase prostate-specific membrane antigen. Peptidase M28 family; prostate-specific membrane antigen (PSMA, also called glutamate carboxypeptidase II or GCP-II)-like subfamily. PSMA is a homodimeric type II transmembrane protein containing three distinct domains: protease-like, apical or protease-associated (PA) and helical domains. The protease-like domain is a large extracellular portion (ectodomain). PSMA is over-expressed predominantly in prostate cancer (PCa) as well as in the neovasculature of most solid tumors, but not in the vasculature of the normal tissues. PSMA is considered a biomarker for PCa and possibly for use as an imaging and therapeutic target. The extracellular domain of PSMA possesses two unique enzymatic functions: N-acetylated, alpha-linked acidic dipeptidase (NAALADase) which cleaves terminal glutamate from the neurodipeptide N-acetyl-aspartyl-glutamate (NAAG), and folate hydrolase (FOLH) which cleaves the terminal glutamates from gamma-linked polyglutamates (carboxypeptidase). A mutation in this gene may be associated with impaired intestinal absorption of dietary folates, resulting in low blood folate levels and consequent hyperhomocysteinemia. Expression of this protein in the brain may be involved in a number of pathological conditions associated with glutamate excitotoxicity. Inhibition of GCP-II has been shown to be effective in preclinical models of neurological disorders associated with excessive activation of glutamatergic systems. This gene likely arose from a duplication event of a nearby chromosomal region. Alternative splicing gives rise to multiple transcript variants.	287
185693	cd08023	GH16_laminarinase_like	Laminarinase, member of the glycosyl hydrolase family 16. Laminarinase, also known as glucan endo-1,3-beta-D-glucosidase, is a glycosyl hydrolase family 16 member that hydrolyzes 1,3-beta-D-glucosidic linkages in 1,3-beta-D-glucans such as laminarins, curdlans, paramylons, and pachymans, with very limited action on mixed-link (1,3-1,4-)-beta-D-glucans.	235
185694	cd08024	GH16_CCF	Coelomic cytolytic factor, member of glycosyl hydrolase family 16. Subgroup of glucanases of unknown function that are related to beta-GRP (beta-1,3-glucan recognition protein), but contain active site residues. Beta-GRPs are one group of pattern recognition receptors (PRRs), also referred to as biosensor proteins, that complex with pathogen-associated beta-1,3-glucans and then transduce signals necessary for activation of an appropriate innate immune response. Beta-GRPs are present in insects and lack all catalytic residues. This subgroup contains related proteins that still contain the active site and are widely distributed in eukaryotes. Their structures adopt a jelly roll fold with a deep active site channel harboring the catalytic residues, like those of other glycosyl hydrolase family 16 members.	330
153090	cd08025	RNR_PFL_like_DUF711	Uncharacterized proteins with similarity to Ribonucleotide reductase and Pyruvate formate lyase. This subfamily contains Streptococcus pneumoniae Sp0239 and similar uncharacterized proteins. Sp0239 is structurally similar to ribonucleotide reductase (RNR) and pyruvate formate lyase (PFL), which are believed to have diverged from a common ancestor. RNR and PFL possess a ten-stranded alpha-beta barrel domain that hosts the active site, and are radical enzymes. RNRs are found in all organisms and provide the only mechanism by which nucleotides are converted to deoxynucleotides. PFL is an essential enzyme in anaerobic bacteria that catalyzes the conversion of pyruvate and CoA to acetyl-CoA and formate.	400
153434	cd08026	DUF326	Cysteine-rich 4 helical bundle widely conserved in bacteria. This functionally uncharacterized protein forms a 4-helical bundle with a bromodomain-like topology. It is present in major bacterial lineages and contains highly conserved cysteines in a repeated pattern, whose sidechains appear buried. Some family members have been (mis)annotated as putative ferredoxins.	102
153397	cd08028	LARP_3	La RNA-binding domain of La-related protein 3. This domain is found at the N-terminus of the La autoantigen and similar proteins, and co-occurs with an RNA-recognition motif (RRM). Together these domains function to bind primary transcripts of RNA polymerase III at their 3' terminus and protect them from exonucleolytic degradation. Binding is specific for the 3'-terminal UUU-OH motif. The La autoantigen is also called Lupus La protein, LARP3, or Sjoegren syndrome type B antigen (SS-B).	82
153398	cd08029	LA_like_fungal	La-motif domain of fungal proteins similar to the La autoantigen. This domain is found in fungal proteins related to the La autoantigen. A variety of La-related proteins (LARPs or La ribonucleoproteins), with differing domain architecture, appear to function as RNA-binding proteins in eukaryotic cellular processes.	76
153399	cd08030	LA_like_plant	La-motif domain of plant proteins similar to the La autoantigen. This domain is found in plant proteins related to the La autoantigen. A variety of La-related proteins (LARPs or La ribonucleoproteins), with differing domain architecture, appear to function as RNA-binding proteins in eukaryotic cellular processes.	90
153400	cd08031	LARP_4_5_like	La RNA-binding domain of proteins similar to La-related proteins 4 and 5. This domain is found in proteins similar to La-related proteins 4 and 5 (LARP4, LARP5). A variety of La-related proteins (LARPs or La ribonucleoproteins), with differing domain architecture, appear to function as RNA-binding proteins in eukaryotic cellular processes.	75
153401	cd08032	LARP_7	La RNA-binding domain of La-related protein 7. LARP7 is a component of the 7SK snRNP, a key factor in the regulation of RNA polymerase II transcription. 7SK functionality is dependent on the presence of LARP7, which is thought to stabilize the 7SK RNA by interacting with its 3' end. The release of 7SK RNA from P-TEFb/HEXIM/7SK complexes activates the cyclin-dependent kinase P-TEFb, which in turn phosphorylates the C-terminal domain of RNA pol II and mediates a transition into productive transcription elongation.	82
153402	cd08033	LARP_6	La RNA-binding domain of La-related protein 6. This domain is found in animal and plant proteins related to the La autoantigen. A variety of La-related proteins (LARPs or La ribonucleoproteins), with differing domain architecture, appear to function as RNA-binding proteins in eukaryotic cellular processes.	77
153403	cd08034	LARP_1_2	La RNA-binding domain of proteins similar to La-related proteins 1 and 2. This domain is found in proteins similar to vertebrate La-related proteins 1 and 2 (LARP1, LARP2). A variety of La-related proteins (LARPs or La ribonucleoproteins), with differing domain architecture, appear to function as RNA-binding proteins in eukaryotic cellular processes.	73
153404	cd08035	LARP_4	La RNA-binding domain of La-related protein 4. This domain is found in vertebrate La-related protein 4 (LARP4), also known as c-MPL binding protein. La-type domains often co-occur with RNA-recognition motifs (RRMs). A variety of La-related proteins (LARPs or La ribonucleoproteins), with differing domain architecture, appear to function as RNA-binding proteins in eukaryotic cellular processes.	75
153405	cd08036	LARP_5	La RNA-binding domain of La-related protein 5. This domain is found in vertebrate La-related protein 5 (LARP5). A variety of La-related proteins (LARPs or La ribonucleoproteins), with differing domain architecture, appear to function as RNA-binding proteins in eukaryotic cellular processes.	75
153406	cd08037	LARP_1	La RNA-binding domain of La-related protein 1. This domain is found in vertebrate La-related protein 1 (LARP1). A variety of La-related proteins (LARPs or La ribonucleoproteins), with differing domain architecture, appear to function as RNA-binding proteins in eukaryotic cellular processes.	73
153407	cd08038	LARP_2	La RNA-binding domain of La-related protein 2. This domain is found in vertebrate La-related protein 2 (LARP2). A variety of La-related proteins (LARPs or La ribonucleoproteins), with differing domain architecture, appear to function as RNA-binding proteins in eukaryotic cellular processes.	73
185716	cd08039	Adenylation_DNA_ligase_Fungal	Adenylation domain of uncharacterized fungal ATP-dependent DNA ligase-like proteins. ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation using nicked nucleic acid substrates with the high energy nucleotide of ATP as a cofactor in a three step reaction mechanism. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. ATP-dependent ligases are present in many organisms such as viruses, bacteriophages, eukarya, archaea and bacteria. This group is composed of uncharacterized fungal proteins with similarity to ATP-dependent DNA ligases. ATP dependent DNA ligases have a highly modular architecture consisting of a unique arrangement of two or more discrete domains including a DNA-binding domain, an adenylation (nucleotidyltransferase (NTase)) domain, and an oligonucleotide/oligosaccharide binding (OB)-fold domain. The adenylation domain binds ATP and contains many of the active-site residues. The adenylation and C-terminal OB-fold domains comprise a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family. The catalytic core unit contains six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases. This model characterizes the adenylation domain of this group of uncharacterized fungal proteins. It is not known whether these proteins also contain an OB-fold domain.	235
153442	cd08040	OBF_DNA_ligase_family	The Oligonucleotide/oligosaccharide binding (OB)-fold domain is a DNA-binding module that is part of the catalytic core unit of ATP dependent DNA ligases. ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation using nicked nucleic acid substrates with the high energy nucleotide of ATP as a cofactor in a three step reaction mechanism. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. ATP dependent DNA ligases have a highly modular architecture consisting of a unique arrangement of two or more discrete domains including a DNA-binding domain, an adenylation (nucleotidyltransferase (NTase)) domain, and an oligonucleotide/oligosaccharide binding (OB)-fold domain. The adenylation and C-terminal OB-fold domains comprise a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family. The catalytic core unit contains six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases. The OB-fold domain contacts the nicked DNA substrate and is required for the ATP-dependent DNA ligase nucleotidylation step. The RxDK motif (motif VI), which is essential for ATP hydrolysis, is located in the OB-fold domain.	108
153443	cd08041	OBF_kDNA_ligase_like	The Oligonucleotide/oligosaccharide binding (OB)-fold domain of kDNA ligase-like ATP-dependent DNA ligases is a DNA-binding module that is part of the catalytic core unit. ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation using nicked nucleic acid substrates with the high energy nucleotide of ATP as a cofactor in a three step reaction mechanism. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. ATP-dependent ligases are present in many organisms such as viruses, bacteriophages, eukarya, archaea and bacteria. The mitochondrial DNA of parasitic protozoa is highly unusual. It is termed the kinetoplast DNA (kDNA) and consists of circular DNA molecules (maxicircles) and several thousand smaller circular molecules (minicircles). This group is composed of kDNA ligase, Chlorella virus DNA ligase, and similar proteins. kDNA ligase and Chlorella virus DNA ligase are the smallest known ATP-dependent ligases. They are involved in DNA replication or repair. ATP dependent DNA ligases have a highly modular architecture consisting of a unique arrangement of two or more discrete domains. The adenylation and oligonucleotide/oligosaccharide binding (OB)-fold domains comprise a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family. The catalytic core unit contains six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases. The OB-fold domain contacts the nicked DNA substrate and is required for the ATP-dependent DNA ligase nucleotidylation step. The RxDK motif (motif VI), which is essential for ATP hydrolysis, is located in the OB-fold domain.	77
176269	cd08044	TAF5_NTD2	TAF5_NTD2 is the second conserved N-terminal region of TATA Binding Protein (TBP) Associated Factor 5 (TAF5), involved in forming Transcription Factor IID (TFIID). The TATA Binding Protein (TBP) Associated Factor 5 (TAF5) is one of several TAFs that bind TBP and are involved in forming Transcription Factor IID (TFIID) complex. TAF5 contains three domains, two conserved sequence motifs at the N-terminal and one at the C-terminal region. TFIID is one of several General Transcription Factors (GTF) (TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH) involved in accurate initiation of transcription by RNA polymerase II in eukaryotes. TFIID plays an important role in the recognition of promoter DNA and assembly of the preinitiation complex. TFIID complex is composed of the TBP and at least 13 TAFs. In yeast and human cells, TAFs have been found as components of other complexes besides TFIID. TAF5 may play a major role in forming TFIID and its related complexes. TAFs from various species were originally named by their predicted molecular weight or their electrophoretic mobility in polyacrylamide gels. A new, unified nomenclature for the pol II TAFs has been suggested to show the relationship between TAF orthologs and paralogs. TAF5 has a paralog gene (TAF5L) which has a redundant function. Several hypotheses are proposed for TAFs' functions, such as serving as activator-binding sites, core-promoter recognition or a role in essential catalytic activity. The C-terminus of TAF5 contains six WD40 repeats that likely form a closed beta propeller structure and may be involved in protein-protein interaction. The first part of the TAF5 N-terminal (TAF5_NTD1) homodimerizes in the absence of other TAFs. The second conserved N-terminal part of TAF5 (TAF5_NTD2) has an alpha-helical domain. One study has shown that TAF5_NTD2 homodimerizes only at high concentration of calcium but not any other metals. No dimerization was observed in other structural studies of TAF5_NTD2. Several TAFs interact via histone-fold (HFD) motifs; HFD is the interaction motif involved in heterodimerization of the core histones and their assembly into nucleosome octamer. However, TAF5 does not have a HFD motif.	133
173965	cd08045	TAF4	TATA Binding Protein (TBP) Associated Factor 4 (TAF4) is one of several TAFs that bind TBP and is involved in forming Transcription Factor IID (TFIID) complex. The TATA Binding Protein (TBP) Associated Factor 4 (TAF4) is one of several TAFs that bind TBP and are involved in forming the Transcription Factor IID (TFIID) complex. TFIID is one of several General Transcription Factors (GTF) (TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH) that are involved in accurate initiation of transcription by RNA polymerase II in eukaryotes. TFIID plays an important role in the recognition of promoter DNA and assembly of the pre-initiation complex. TFIID complex is composed of the TBP and at least 13 TAFs. TAFs from various species were originally named by their predicted molecular weight or their electrophoretic mobility in polyacrylamide gels. A new, unified nomenclature for the pol II TAFs has been suggested to show the relationship between TAF orthologs and paralogs. Several hypotheses are proposed for TAFs' functions, such as serving as activator-binding sites, core-promoter recognition or a role in essential catalytic activity. Each TAF, with the help of a specific activator, is required only for the expression of a subset of genes and is not universally involved for transcription as are GTFs. In yeast and human cells, TAFs have been found as components of other complexes besides TFIID. Several TAFs interact via histone-fold (HFD) motifs; HFD is the interaction motif involved in heterodimerization of the core histones and their assembly into nucleosome octamers. The minimal HFD contains three alpha-helices linked by two loops and is found in core histones, TAFs and many other transcription factors. TFIID has a histone octamer-like substructure. TAF4 domain interacts with TAF12 and makes a novel histone-like heterodimer that binds DNA and has a core promoter function of a subset of genes.	212
173966	cd08047	TAF7	TATA Binding Protein (TBP) Associated Factor 7 (TAF7) is one of several TAFs that bind TBP and is involved in forming Transcription Factor IID (TFIID) complex. The TATA Binding Protein (TBP) Associated Factor 7 (TAF7) is one of several TAFs that bind TBP and are involved in forming the Transcription Factor IID (TFIID) complex. TFIID is one of several General Transcription Factors (GTF) (TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH) that are involved in accurate initiation of transcription by RNA polymerase II in eukaryotes. TFIID plays an important role in the recognition of promoter DNA and assembly of the preinitiation complex. TFIID complex is composed of the TBP and at least 13 TAFs. TAFs are named after their electrophoretic mobility in polyacrylamide gels in different species. A new, unified nomenclature has been suggested for the pol II TAFs to show the relationship between TAF orthologs and paralogs. Several hypotheses are proposed for TAFs' functions, such as serving as activator-binding sites, core-promoter recognition or a role in essential catalytic activity. Each TAF, with the help of a specific activator, is required only for expression of a subset of genes and is not universally involved for transcription as are GTFs. TAF7 is involved in the regulation of the transition from PIC assembly to initiation and elongation. In yeast and human cells, TAFs have been found as components of other complexes besides TFIID. Several TAFs interact via histone-fold (HFD) motifs; the HFD is the interaction motif involved in heterodimerization of the core histones and their assembly into nucleosome octamers.	162
173967	cd08048	TAF11	TATA Binding Protein (TBP) Associated Factor 11 (TAF11) is one of several TAFs that bind TBP and is involved in forming Transcription Factor IID (TFIID) complex. The TATA Binding Protein (TBP) Associated Factor 11 (TAF11) is one of several TAFs that bind TBP and are involved in forming the Transcription Factor IID (TFIID) complex. TFIID is one of several General Transcription Factors (GTF) (TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH) that are involved in accurate initiation of transcription by RNA polymerase II in eukaryotes. TFIID plays an important role in the recognition of promoter DNA and assembly of the pre-initiation complex. TFIID complex is composed of the TBP and at least 13 TAFs. TAFs from various species were originally named by their predicted molecular weight or their electrophoretic mobility in polyacrylamide gels. A new, unified nomenclature for the pol II TAFs has been suggested to show the relationship between TAF orthologs and paralogs. Several hypotheses are proposed for TAFs' functions, such as serving as activator-binding sites, core-promoter recognition or a role in essential catalytic activity. TAF11 interacts with the ligand binding domains of the nuclear receptors for vitamin D3 and thyroid hormone. TAF11 also directly interacts with TFIIA, acting as a bridging factor that stabilizes the TFIIA-TBP-DNA complex. Each TAF, with the help of a specific activator, is required only for the expression of a subset of genes and is not universally involved for transcription as are GTFs. In yeast and human cells, TAFs have been found as components of other complexes besides TFIID. Several TAFs interact via histone-fold (HFD) motifs; HFD is the interaction motif involved in heterodimerization of the core histones and their assembly into nucleosome octamers. The minimal HFD contains three alpha-helices linked by two loops and is found in core histones, TAFs and many other transcription factors. TFIID has a histone octamer-like substructure. The TAF11 domain is structurally analogous to histone H3 and interacts with TAF13, making a novel histone-like heterodimer. The dimer may be structurally and functionally similar to the spt3 protein within the SAGA histone acetyltransferase complex.	85
176263	cd08049	TAF8	TATA Binding Protein (TBP) Associated Factor 8. The TATA Binding Protein (TBP) Associated Factor 8 (TAF8) is one of several TAFs that bind TBP, and is involved in forming the Transcription Factor IID (TFIID) complex. TFIID is one of several General Transcription Factors (GTF) (TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH) that are involved in accurate initiation of transcription by RNA polymerase II in eukaryotes. TFIID plays an important role in the recognition of promoter DNA and the assembly of the preinitiation complex. The TFIID complex is composed of the TBP and at least 13 TAFs. TAFs from various species were originally named by their predicted molecular weight or their electrophoretic mobility in polyacrylamide gels. A new, unified nomenclature for the pol II TAFs has been suggested to show the relationship between TAF orthologs and paralogs. Several hypotheses are proposed for TAFs' functions, such as serving as activator-binding sites, involvement in the core-promoter recognition, or a role in the essential catalytic activity of the complex. The mouse ortholog of TAF8 is called taube nuss protein (TBN), and is required for early embryonic development. TBN mutant mice exhibit disturbances in the balance between cell death and cell survival in the early embryo. TAF8 plays a role in the differentiation of preadipocyte fibroblasts to adipocytes; it is also required for the integration of TAF10 into the TAF complex. In yeast and human cells, TAFs have been found as components of other complexes besides TFIID. TAF8 is also a component of a small TAF complex (SMAT), which contains TAF8, TAF10 and SUPT7L. Several TAFs interact via histone-fold motifs. The histone fold (HFD) is the interaction motif involved in heterodimerization of the core histones and their assembly into nucleosome octamer. TAF8 contains an H4 related histone fold motif, and interacts with several subunits of TFIID, including TBP and the histone-fold protein TAF10. Currently, five HF-containing TAF pairs have been described or suggested to exist in TFIID: TAF6-TAF9, TAF4-TAF12, TAF11-TAF13, TAF8-TAF10 and TAF3-TAF10.	54
381749	cd08050	TAF6C	C-terminal domain of TATA Binding Protein (TBP) Associated Factor 6 (TAF6). This model characterizes the carboxy (C)-terminal domain of TATA Binding Protein (TBP) Associated Factor 6 (TAF6), which is one of several TAFs that bind TBP and are involved in forming the Transcription Factor IID (TFIID) complex. This C-terminal HEAT repeat domain of TAF6 (TAF6C) is proposed to form a homodimer that effectively bridges the downstream promoter-interacting TAFs (TAF1, -2, and -7) with lobe B of TFIID. This domain influences the TAF6-TAF9 complex, is thus important for TFIID assembly, and may trigger signals from transcriptional effectors. The HEAT domain motif is generally involved in protein/protein interactions, and in A. locustae, the conserved TAF6C domain is formed by five HEAT repeats, tightly packed against each other, defining a single structural domain. TFIID is one of several General Transcription Factors (GTFs), which also include TFIIA, TFIIB, TFIIE, TFIIF and TFIIH, that are involved in the accurate initiation of transcription by RNA polymerase II in eukaryotes. TFIID plays a key role in the recognition of promoter DNA and assembly of the pre-initiation complex. The TFIID complex is composed of the TBP and at least 13 TAFs. TAFs are named after their electrophoretic mobility in polyacrylamide gels in different species. A new, unified nomenclature has been suggested for the pol II TAFs to show the relationship between TAF orthologs and paralogs. Several hypotheses are proposed for TAFs' functions such as serving as activator-binding sites, core-promoter recognition, or a role in essential catalytic activity. These TAFs, with the help of specific activators, are required only for expression of a subset of genes and are not universally involved for transcription, as are GTFs. In yeast and human cells, TAFs have been found as components of other complexes besides TFIID. Several TAFs interact via histone-fold domain (HFD) motifs; the HFD is the interaction motif involved in heterodimerization of the core histones and their assembly into nucleosome octamers. The minimal HFD contains three alpha-helices linked by two loops and is found in core histones, TAFs and many other transcription factors. TFIID has a histone octamer-like substructure. TAF6 is a shared subunit of histone acetyltransferase complex SAGA and TFIID complexes. The N-terminal HFD of TAF6 interacts with the HFD of TAF9 and makes a novel histone-like heterodimer that is structurally related to histones H4 and H3. TAF6 may also interact with the downstream core promoter element (DPE).	216
153444	cd08051	gp6_gp15_like	Head-Tail Connector Proteins gp6 and gp15, and similar proteins. Members of this family include the proteins gp6 and gp15 from bacteriophage HK97 and SPP1, respectively. They are critical in the assembly of the connector, a specialized structure that serves as an interface for head and tail attachment, as well as a point at which DNA exits the head during infection by the bacteriophage. They form dodecameric ring structures that comprise the middle ring of the connector, located between the portal protein (attached to the head) and the gp7/gp16 ring (attached to the tail). They are components of the mature phage, and the absence or mutation of HK97 gp6 or SPP1 gp15, respectively, results in defective head-tail joining and the absence of mature phage particles. The genome maps of HK97 and SPP1 show that genes encoding gp6 and gp15 are in the same relative position on the genome, located adjacent to the major capsid protein (MCP) gene and in between head and tail genes. Also included in this family is the uncharacterized Bacillus subtilis Yqbg protein, whose gene is part of the unusual genetic element called skin. The Yqbg gene is surrounded with genes similar to genes in the Bacillus subtilis prophage-like element PBSX, which encode for proteins comprising contractile-tailed phage-like particles that are produced upon mitomycin C treatment. Yqbg likely acts as a head-tail connector protein, similar to gp6 and gp15, of the PBSX-like prophage encoded in the skin element.	94
153445	cd08053	Yqbg	Putative Head-Tail Connector Protein Yqbg from Bacillus subtilis and similar proteins. The uncharacterized Bacillus subtilis Yqbg protein, whose gene is part of the unusual genetic element called skin, shows a similar structure to the connector proteins gp6 and gp15 from bacteriophage HK97 and SPP1, respectively. gp6 and gp15 are critical in the assembly of the connector, a specialized structure that serves as an interface for head and tail attachment, as well as a point at which DNA exits the head during infection by the bacteriophage. They form dodecameric ring structures that comprise the middle ring of the connector, located between the portal protein (attached to the head) and the gp7/gp16 ring (attached to the tail). The Yqbg gene is surrounded with genes similar to genes in the Bacillus subtilis prophage-like element PBSX, which encode for proteins comprising contractile-tailed phage-like particles that are produced upon mitomycin C treatment. Yqbg likely acts as a head-tail connector protein, similar to gp6 and gp15, of the PBSX-like prophage encoded in the skin element.	121
153446	cd08054	gp6	Head-Tail Connector Protein gp6 of Bacteriophage HK97 and similar proteins. The bacteriophage HK97 gp6 protein is critical in the assembly of the connector, a specialized structure that serves as an interface for head and tail attachment, as well as a point at which DNA exits the head during infection by the bacteriophage. It forms a dodecameric ring structure that comprises the middle ring of the connector, located between the portal protein (attached to the head) and the gp7 ring (attached to the tail). It is a component of the mature phage and the absence of HK97 gp6 results in defective head-tail joining and the absence of mature phage particles. Although the crystal structure of HK97 gp6 shows an unexpected 13-mer ring, the biological form present in the mature phage is believed to be a dodecamer.	91
153447	cd08055	gp15	Head-Tail Connector Protein gp15 of Bacteriophage SPP1 and similar proteins. The bacteriophage SPP1 gp15 protein is critical in the assembly of the connector, a specialized structure that serves as an interface for head and tail attachment, as well as a point at which DNA exits the head during infection by the bacteriophage. It forms a dodecameric ring structure that comprises the middle ring of the connector, located between the portal protein (attached to the head) and the gp16 ring (attached to the tail). Binding of the gp15 and gp16 rings to the portal protein is essential to prevent leakage of packaged DNA. gp15 is a component of the mature phage and its mutation results in defective head-tail joining.	95
163687	cd08056	MPN_PRP8	Mpr1p, Pad1p N-terminal (MPN) domains without isopeptidase activity found in splicing factor Prp8. Members of this family are found in pre-mRNA-processing factor 8 (Prp8) which is a critical splicing factor, interacting with several other spliceosomal proteins, snRNAs, and the pre-mRNA, thus organizing and stabilizing the spliceosome catalytic core. Prp8 is one of the largest and most highly conserved of nuclear proteins, occupying a central  position in the catalytic core of the spliceosome. Its C-terminal domain exhibits a JAB1/MPN-like core similar to deubiquitinating enzymes, but does not show catalytic isopeptidase activity, possibly because the putative isopeptidase center is covered by insertions and terminal appendices that are grafted onto this core, thus impairing the metal binding site. It is proposed that this domain is a protein interaction domain instead of a Zn(2+)-dependent metalloenzyme as proposed for some MPN proteins. The DEAD-box protein Brr2 and the GTPase Snu114 bind to the Prp8 C-terminus, a region where mutations in human Prp8 (hPrp8) cause a severe form of the genetic disorder retinitis pigmentosa, RP13, which leads to progressive photoreceptor degeneration in the retina and eventual blindness. At the N-terminus of Prp8, there are several domains, including a highly variable nuclear localization signal (NLS) motif rich in prolines, a conserved RNA recognition motif (RRM), and U5 and U6 snRNA binding sites.	252
163688	cd08057	MPN_euk_non_mb	Mpr1p, Pad1p N-terminal (MPN) domains without catalytic isopeptidase activity (non metal-binding); eukaryotic. This family contains MPN (also known as Mov34, PAD-1, JAMM, JAB, MPN+) domain variants that lack key residues in the JAB1/MPN/Mov34 metalloenzyme (JAMM) motif and are unable to coordinate a metal ion. Comparisons of key catalytic and metal binding residues explain why the MPN-containing proteins Rpn7/PSMD7, Rpn8/PSMD8, CSN6, Prp8p, and the translation initiation factor 3 subunits f and h do not show catalytic isopeptidase activity. It has been proposed that the MPN domain in these proteins has a primarily structural function. Rpn7 is known to be critical for the integrity of the 26S proteasome complex by establishing a correct lid structure. It is necessary for the incorporation/anchoring of Rpn3 and Rpn12 to the lid and essential for viability and normal mitosis. CSN6 is a highly conserved protein complex with diverse functions, including several important intracellular pathways such as the ubiquitin/proteasome system, DNA repair, cell cycle, developmental changes, and some aspects of immune responses. It cleaves ubiquitin-like protein Nedd8 (neural precursor cell expressed, developmentally downregulated 8) in the cullin 1 in cells. EIF3f is a potent inhibitor of HIV-1 replication as well as an important negative regulator of cell growth and proliferation. EIF3h regulates cell growth and viability, and over-expression of the gene may provide growth advantage to prostate, breast, and liver cancer cells.	157
163689	cd08058	MPN_euk_mb	Mpr1p, Pad1p N-terminal (MPN) domains with catalytic isopeptidase activity (metal-binding); eukaryotic. This family contains eukaryotic MPN (also known as Mov34, PAD-1, JAMM, JAB, MPN+) domains found in proteins with a variety of functions, including AMSH (associated molecule with the Src homology 3 domain (SH3) of STAM), H2A-DUB (histone H2A deubiquitinase), BRCC36 (BRCA1/BRCA2-containing complex subunit 36), as well as Rpn11 (regulatory particle number 11) and CSN5 (COP9 signalosome complex subunit 5). These domains contain the signature JAB1/MPN/Mov34 metalloenzyme (JAMM) motif, EXnHS/THX7SXXD, which is involved in zinc ion coordination and provides the active site for isopeptidase activity. Rpn11 is responsible for substrate deubiquitination during proteasomal degradation. It is essential for maintaining a correct cell cycle and normal mitochondrial morphology and physiology. CSN5 is critical for nuclear export and the degradation of several tumor suppressor proteins, including p53, p27, and Smad4. Over-expression of CSN5 has been implicated in cancer initiation and progression. AMSH specifically cleaves Lys63 and not Lys48-linked polyubiquitin (poly-Ub) chains, thus facilitating the recycling and subsequent trafficking of receptors to the cell surface. It is involved in the degradation of EGF receptor (EGFR) and possibly other ubiquitinated endocytosed proteins. BRCC36 is part of the BRCA1/BRCA2/BARD1-containing nuclear complex that displays an E3 ubiquitin ligase activity; it is targeted to DNA damage foci after irradiation. H2A-DUB is specific for monoubiquitinated H2A (uH2A), regulating transcription by coordinating histone acetylation and deubiquitination, and destabilizing the association of linker histone H1 with nucleosomes. It is a positive regulator of androgen receptor (AR) transactivation activity on a reporter gene and serves as a marker in prostate tumors.	119
163690	cd08059	MPN_prok_mb	Mpr1p, Pad1p N-terminal (MPN) domains with catalytic isopeptidase activity (metal-binding); prokaryotic. This family contains bacterial and archaeal MPN (also known as Mov34, PAD-1, JAMM, JAB, MPN+)-like domains. These catalytically active domains contain the signature JAB1/MPN/Mov34 metalloenzyme (JAMM) motif, EXnHS/THX7SXXD, which is involved in zinc ion coordination and provides the active site for isopeptidase activity for the release of ubiquitin from ubiquitinated proteins (thus having deubiquitinating (DUB) activity) that are tagged for degradation.  The JAMM proteins likely hydrolyze ubiquitin conjugates in a manner similar to thermolysin, in which the zinc-polarized aqua ligand serves as the nucleophile, compared with the classical DUBs that do so with a cysteine residue in the active site.	101
163691	cd08060	MPN_UPF0172	Mov34/MPN/PAD-1 family: UPF0172 family of unknown function includes neighbor of COX4 (Noc4p). This family includes Noc4p (neighbor of COX4; neighbor of Cytochrome c Oxidase 4; nucleolar complex associated 4 homolog) which belongs to the family of unknown function, UPF0172, with MPN/JAMM-like domains. Proteins in this family are homologs of the NOC4 gene which is conserved in eukaryotic members including human, dog, mouse, rat, chicken, zebrafish, fruit fly, mosquito, S.pombe, K.lactis, E.gossypii, M.grisea, N.crassa, A.thaliana, and rice. NOC4 is highly expressed in the pancreas and moderately in liver, heart, lung, kidney, brain, skeletal muscle, and placenta. This nucleolar protein forms a complex with Nop14p that mediates maturation and nuclear export of 40S ribosomal subunits. This family of eukaryotic MPN-like domains lacks the key residues that coordinate a metal ion and therefore does not show catalytic isopeptidase activity.	182
163692	cd08061	MPN_NPL4	Mov34/MPN/PAD-1 family: nuclear protein localization-4 (Npl4) domain. Npl4p (nuclear protein localization-4) is identical to Hmg-CoA reductase degradation 4 (HRD4) protein and contains a domain that is part of the pfam clan MPN/Mov34-like. Npl4 plays an intermediate role between endoplasmic reticulum-associated degradation (ERAD) substrate ubiquitylation and proteasomal degradation. Npl4p associates with Cdc48p (Cdc48 in yeast and p97 or valosin-containing protein (VCP) in higher eukaryotes), the highly conserved ATPase of the AAA family, via ubiquitin fusion degradation-1 protein (Ufd1p) to form a Cdc48p-Ufd1p-Npl4p complex which then functions in the recognition of several polyubiquitin-tagged proteins and facilitates their presentation to the 26S proteasome for processive degradation. This family of eukaryotic MPN-like domains lacks the key residues that coordinate a metal ion and therefore does not show catalytic isopeptidase activity.	274
163693	cd08062	MPN_RPN7_8	Mpr1p, Pad1p N-terminal (MPN) domains without catalytic isopeptidase activity, found in 19S proteasomal subunits Rpn7 and Rpn8. This family includes lid subunits of the 26 S proteasome regulatory particles, Rpn7 (PSMD7; proteasome 26S non-ATPase subunit 7; p44), and Rpn8 (PSMD8; proteasome 26S non-ATPase subunit 8; p40; Mov34). Rpn7 is known to be critical for the integrity of the 26 S proteasome complex by establishing a correct lid structure. It is necessary for the incorporation/anchoring of Rpn3 and Rpn12 to the lid and essential for viability and normal mitosis. Rpn7 and Rpn8 are ATP-independent components of the 19S regulator subunit, and contain the MPN structural motif in their N-terminal regions. However, while they show a typical MPN metalloprotease fold, they lack the canonical JAMM motif, and therefore do not show catalytic isopeptidase activity. It is suggested that Rpn7 function is primarily structural.	280
163694	cd08063	MPN_CSN6	Mpr1p, Pad1p N-terminal (MPN) domains without catalytic isopeptidase activity, found in COP9 signalosome complex subunit 6. CSN6 (COP9 signalosome subunit 6; COP9 subunit 6; MOV34 homolog, 34 kD) is one of the eight subunits of COP9 signalosome, a highly conserved protein complex with diverse functions, including several important intracellular pathways such as the ubiquitin/proteasome system, DNA repair, cell cycle, developmental changes, and some aspects of immune responses. CSN6 is an MPN-domain protein that directly interacts with the MPN+-domain subunit CSN5. It is cleaved during apoptosis by activated caspases. CSN6 processing occurs in CSN/CRL (cullin-RING Ub ligase) complexes and is followed by the cleavage of Rbx1, the direct interaction partner of CSN6. CSN6 cleavage enhances CSN-mediated deneddylating activity (i.e. cleavage of ubiquitin-like protein Nedd8 (neural precursor cell expressed, developmentally downregulated 8)) in the cullin 1 in cells. The cleavage of Rbx1 and increased deneddylation of cullins inactivate CRLs and presumably stabilize pro-apoptotic factors for final apoptotic steps. While CSN6 shows a typical MPN metalloprotease fold, it lacks the canonical JAMM motif, and therefore does not show catalytic isopeptidase activity.	288
163695	cd08064	MPN_eIF3f	Mpr1p, Pad1p N-terminal (MPN) domains without catalytic isopeptidase activity, found in eIF3f. Eukaryotic translation initiation factor 3 (eIF3) subunit F (eIF3F; EIF3S5; eIF3-p47; eukaryotic translation initiation factor 3, subunit 5 epsilon, 47kDa; Mov34/MPN/PAD-1 family protein) is an evolutionarily non-conserved subunit of the functional core that comprises eIF3a, eIF3b, eIF3c, eIF3e, eIF3f, and eIF3h, and contains the MPN domain. However, it lacks the canonical JAMM motif, and therefore does not show catalytic isopeptidase activity. It has been shown that eIF3f mRNA expression is significantly decreased in many human tumors including pancreatic cancer and melanoma. EIF3f is a potent inhibitor of HIV-1 replication; it mediates restriction of HIV-1 expression through several factors including the serine/arginine-rich (SR) protein 9G8, and cyclin-dependent kinase 11 (CDK11). EIF3f phosphorylation by CDK11 is important in regulating its function in translation and apoptosis. It enhances its association with the core eIF3 subunits during apoptosis, suggesting that eIF3f may inhibit translation by increasing the binding to the eIF3 complex during apoptosis. Thus, eIF3f may be an important negative regulator of cell growth and proliferation.	265
163696	cd08065	MPN_eIF3h	Mpr1p, Pad1p N-terminal (MPN) domains without catalytic isopeptidase activity, found in eIF3h. Eukaryotic translation initiation factor 3 (eIF3) subunit h (eIF3h; eIF3 subunit 3; eIF3S3; eIF3-gamma; eIF3-p40) is an evolutionarily non-conserved subunit of the functional core that comprises eIF3a, eIF3b, eIF3c, eIF3e, eIF3f, and eIF3h, and contains the MPN domain. However, it lacks the canonical JAMM motif, and therefore does not show catalytic isopeptidase activity. Together with eIF3e and eIF3f, eIF3h stabilizes the eIF3 complex. Results suggest that eIF3h regulates cell growth and viability, and that over-expression of the gene may provide growth advantage to prostate, breast, and liver cancer cells. For example, EIF3h gene amplification is common in late-stage prostate cancer suggesting that it may be functionally involved in the progression of the disease. It has been shown that coamplification of MYC, a well characterized oncogene involved in cell growth, differentiation, and apoptosis, and EIF3h in patients with non-small cell lung cancer (NSCLC) improves survival if treated with the Epidermal Growth Factor Receptor Tyrosine Kinase Inhibitor (EGFR-TKI), Gefitinib. Plant eIF3h is implicated in translation of specific mRNAs.	266
163697	cd08066	MPN_AMSH_like	Mov34/MPN/PAD-1 family. AMSH (associated molecule with the Src homology 3 domain (SH3) of STAM (signal-transducing adapter molecule, also known as STAMBP)) and AMSH-like proteins (AMSH-LP) are members of JAMM/MPN+ deubiquitinases (DUBs), with Zn2+-dependent ubiquitin isopeptidase activity. AMSH specifically cleaves Lys 63 and not Lys48-linked polyubiquitin (poly-Ub) chains, thus facilitating the recycling and subsequent trafficking of receptors to the cell surface. AMSH and AMSH-LP are anchored on the early endosomal membrane via interaction with the clathrin coat. AMSH shares a common SH3-binding site with another endosomal DUB, UBPY (ubiquitin-specific protease Y; also known as USP8), the latter being a cysteine protease that does not discriminate between Lys48 and Lys63-linked ubiquitin.  AMSH is involved in the degradation of EGF receptor (EGFR) and possibly other ubiquitinated endocytosed proteins. AMSH also interacts with CHMP1, CHMP2, and CHMP3 proteins, all of which are components of ESCRT-III, suggested to be required for EGFR down-regulation.  The function of AMSH-LP has not been elucidated; however, it exhibits two fundamentally distinct features from AMSH: first, there is a substitution in the critical amino acid residue in the SH3-binding motif (SBM) in the human AMSH-LP, but not in its mouse ortholog, and lacks STAM-binding ability; second, AMSH-LP lacks the ability to interact with CHMP proteins. It is therefore likely that AMSH and AMSH-LP play different roles on early endosomes.	173
163698	cd08067	MPN_2A_DUB	Mov34/MPN/PAD-1 family: Histone H2A deubiquitinase. This family includes histone H2A deubiquitinase (Histone H2A DUB; MYSM1; myb-like, SWIRM and MPN domains 1; 2ADUB; 2A-DUB; KIAA19152ADUB, or KIAA1915/MYSM1), a member of JAMM/MPN+ deubiquitinases (DUBs), with possible Zn2+-dependent ubiquitin isopeptidase activity. It contains the SWIRM (Swi3p, Rsc8p and Moira), and SANT (SWI-SNF, ADA N-CoR, TFIIIB)/Myb domains; the SANT, but not the SWIRM, domain can bind directly to DNA. 2A-DUB is specific for monoubiquitinated H2A (uH2A), regulating transcription by coordinating histone acetylation and deubiquitination, and destabilizing the association of linker histone H1 with nucleosomes. 2A-DUB interacts with p/CAF (p300/CBP-associated factor) in a co-regulatory protein complex, where the status of acetylation of nucleosomal histones modulates its deubiquitinase activity. 2A-DUB is a positive regulator of androgen receptor (AR) transactivation activity on a reporter gene; it participates in transcriptional regulation events in androgen receptor-dependent gene activation. In prostate tumors, the levels of uH2A are dramatically decreased, thus 2A-DUB serves as a cancer-related marker.	187
163699	cd08068	MPN_BRCC36	Mov34/MPN/PAD-1 family: BRCC36, a subunit of BRCA1-A complex. BRCC36 (BRCA1-A complex subunit BRCC36; BRCA1/BRCA2-containing complex subunit 36; BRCA1/BRCA2-containing complex subunit 3; BRCC3; BRISC complex subunit BRCC36; BRCC36 isopeptidase complex; Lys-63-specific deubiquitinase BRCC36) and BRCC36-like domains are members of JAMM/MPN+ deubiquitinases (DUBs),  possibly with Zn2+-dependent ubiquitin isopeptidase activity. BRCC36 is part of the BRCA1/BRCA2/BARD1-containing nuclear complex that displays an E3 ubiquitin ligase activity. It is targeted to DNA damage foci after irradiation; RAP80 recruits the Abraxas-BRCC36-BRCA1-BARD1 complex to DNA double strand breaks (DSBs) for DNA repair through specific recognition of Lys 63-linked polyubiquitinated proteins by its tandem ubiquitin-interacting motifs. A new protein, MERIT40 (mediator of RAP80 interactions and targeting 40 kDa), also named NBA1 (new component of the BRCA1 A complex), exists in the same BRCA1-containing complex and is essential for the integrity of the complex.  There are studies suggesting that MERIT40/NBA1 ties BRCA1 complex integrity, DSB recognition, and ubiquitin chain activities to the DNA damage response. It has also been shown that BRCA1-containing complex resembles the lid complex of the 26S proteasome.	244
163700	cd08069	MPN_RPN11_CSN5	Mov34/MPN/PAD-1 family: proteasomal regulatory protein Rpn11 and signalosome complex subunit CSN5. This family contains proteasomal regulatory protein Rpn11 (26S proteasome regulatory subunit rpn11; PAD1; POH1; RPN11; PSMD14; Rpn11 subunit of the 19S-proteasome; regulatory particle number 11) and signalosomal CSN5 (COP9 signalosome complex subunit 5; COP9 complex homolog subunit 5; c-Jun activation domain-binding protein-1; CSN5/JAB1; JAB1). COP9 signalosome (CSN) and the proteasome lid are paralogous complexes and their respective subunits CSN5 and Rpn11 are most closely related between the two complexes, both containing the conserved JAMM (JAB1/MPN/Mov34 metalloenzyme) motif involved in zinc ion coordination and providing the active site for isopeptidase activity. Rpn11 is responsible for substrate deubiquitination during proteasomal degradation. It is essential for maintaining a correct cell cycle and normal mitochondrial morphology and physiology; mutations in Rpn11 cause cell cycle and mitochondrial defects, temperature sensitivity and sensitivity to DNA damaging reagents such as UV. It has been shown that the C-terminal region of Rpn11 is involved in the regulation of the mitochondrial fission and tubulation processes. CSN5, one of the eight subunits of CSN, is critical for nuclear export and the degradation of several tumor suppressor proteins, including p53, p27, and Smad4. Its MPN+ domain is critical for the physical interaction of RUNX3 and Jab1. It has been suggested that the direct interaction of CSN5/JAB1 with p27 provides p27 with a leucine-rich nuclear export signal (NES), which is required for binding to chromosomal region maintenance 1 (CRM1), and facilitates nuclear export. The over-expression of CSN5/JAB1 also has been implicated in cancer initiation and progression, including cancer of the lung, pancreas, mouth, thyroid, and breast, suggesting that the oncogenic activity of CSN5 is associated with the down-regulation of RUNX3.	268
163701	cd08070	MPN_like	Mpr1p, Pad1p N-terminal (MPN) domains with catalytic isopeptidase activity (metal-binding). This family contains archaeal and bacterial MPN (also known as Mov34, PAD-1, JAMM, JAB, MPN+)-like domains. These domains contain the signature JAB1/MPN/Mov34 metalloenzyme (JAMM) motif, EXnHS/THX7SXXD, which is involved in zinc ion coordination and provides the active site for isopeptidase activity for the release of ubiquitin from ubiquitinated proteins (thus having deubiquitinating (DUB) activity) that are tagged for degradation.  The JAMM proteins likely hydrolyze ubiquitin conjugates in a manner similar to thermolysin, in which the zinc-polarized aqua ligand serves as the nucleophile, compared with the classical DUBs that do so with a cysteine residue in the active site.	128
163702	cd08071	MPN_DUF2466	Mov34/MPN/PAD-1 family. Mov34 DUF2466 (also known as DNA repair protein RadC) domain of unknown function contains the signature JAB1/MPN/Mov34 metalloenzyme (JAMM) motif, EXnHS/THX7SXXD, which is involved in zinc ion coordination and provides the active site for isopeptidase activity. However, to date, the name RadC has been misleading and no function has been determined.	113
163703	cd08072	MPN_archaeal	Mov34/MPN/PAD-1 family: archaeal JAB1/MPN/Mov34 metalloenzyme. This family contains only archaeal MPN (also known as Mov34, PAD-1, JAMM, JAB, MPN+)-like domains. These domains contain the signature JAB1/MPN/Mov34 metalloenzyme (JAMM) motif, EXnHS/THX7SXXD, which is involved in zinc ion coordination and provides the active site for isopeptidase activity for the release of ubiquitin from ubiquitinated proteins (thus having deubiquitinating (DUB) activity) that are tagged for degradation.  The JAMM proteins likely hydrolyze ubiquitin conjugates in a manner similar to thermolysin, in which the zinc-polarized aqua ligand serves as the nucleophile, compared with the classical DUBs that do so with a cysteine residue in the active site.	117
163704	cd08073	MPN_NLPC_P60	Mpr1p, Pad1p N-terminal (MPN) domains with catalytic isopeptidase activity (metal-binding) found in proteins also containing NlpC/P60 domains. This family contains bacterial MPN (also known as Mov34, PAD-1, JAMM, JAB, MPN+)-like domains at the N-terminus of NlpC/P60 phage tail protein domains. These domains contain the signature JAB1/MPN/Mov34 metalloenzyme (JAMM) motif, EXnHS/THX7SXXD, which is involved in zinc ion coordination and provides the active site for isopeptidase activity for the release of ubiquitin from ubiquitinated proteins (thus having deubiquitinating (DUB) activity) that are tagged for degradation.  The JAMM proteins likely hydrolyze ubiquitin conjugates in a manner similar to thermolysin, in which the zinc-polarized aqua ligand serves as the nucleophile, compared with the classical DUBs that do so with a cysteine residue in the active site.	108
173969	cd08148	RuBisCO_large	Ribulose bisphosphate carboxylase large chain. Ribulose bisphosphate carboxylase (Rubisco) plays an important role in the Calvin reductive pentose phosphate pathway. It catalyzes the primary CO2 fixation step. Rubisco is activated by carbamylation of an active site lysine, stabilized by a divalent cation, which then catalyzes the proton abstraction from the substrate ribulose 1,5 bisphosphate (RuBP) and leads to the formation of two molecules of 3-phosphoglycerate. Members of the Rubisco family can be divided into 4 subgroups, Form I-IV, which differ in their taxonomic distribution and subunit composition. Form I-III have Rubisco activity, while Form IV, also called Rubisco-like proteins (RLP), are missing critical active site residues and therefore do not catalyze CO2 fixation. They are believed to utilize a related enzymatic mechanism, but have divergent functions.	366
163706	cd08150	catalase_like	Catalase-like heme-binding proteins and protein domains. Catalase is a ubiquitous enzyme found in both prokaryotes and eukaryotes involved in the protection of cells from the toxic effects of peroxides. It catalyses the conversion of hydrogen peroxide to water and molecular oxygen. Several other related protein families share the catalase fold and bind to heme, but do not necessarily have catalase activity.	283
163707	cd08151	AOS	Allene oxide synthase. Allene oxide synthase converts a fatty acid hydroperoxide to an allene oxide, which is an unstable epoxide. In corals, the enzyme is part of an eicosanoid synthesis pathway that is initiated by a lipoxygenase, which generates the fatty acid hydroperoxides in the first step. The structure of allene oxide synthase closely resembles that of catalase, but allene oxide synthase does not have catalase activity.	328
163708	cd08152	y4iL_like	Catalase-like heme-binding proteins similar to the uncharacterized y4iL. Catalase is a ubiquitous enzyme found in both prokaryotes and eukaryotes involved in the protection of cells from the toxic effects of peroxides. It catalyses the conversion of hydrogen peroxide to water and molecular oxygen. Several other related protein families share the catalase fold and bind to heme, but do not necessarily have catalase activity.  This family contains uncharacterized proteins similar to Rhizobium sp. NGR234 y4iL, of mostly bacterial origin.	305
163709	cd08153	srpA_like	Catalase-like heme-binding proteins similar to the uncharacterized srpA. Catalase is a ubiquitous enzyme found in both prokaryotes and eukaryotes involved in the protection of cells from the toxic effects of peroxides. It catalyses the conversion of hydrogen peroxide to water and molecular oxygen. Several other related protein families share the catalase fold and bind to heme, but do not necessarily have catalase activity.  This family contains uncharacterized proteins similar to the Synechococcus elongatus PCC 7942 periplasmic protein srpA, of mostly bacterial origin. The plasmid-encoded srpA is regulated by sulfate, but does not seem to function in its uptake or metabolism.	295
163710	cd08154	catalase_clade_1	Clade 1 of the heme-binding enzyme catalase. Catalase is a ubiquitous enzyme found in both prokaryotes and eukaryotes, which is involved in the protection of cells from the toxic effects of peroxides. It catalyzes the conversion of hydrogen peroxide to water and molecular oxygen. Catalases also utilize hydrogen peroxide to oxidize various substrates such as alcohol or phenols. Clade 1 catalases are found in bacteria, algae, and plants; they have a relatively small subunit size of 55 to 69 kDa, and bind a protoheme IX (heme b) group buried deep inside the structure. They appear to form tetramers. In eukaryotic cells, catalases are located in peroxisomes.	469
163711	cd08155	catalase_clade_2	Clade 2 of the heme-binding enzyme catalase. Catalase is a ubiquitous enzyme found in both prokaryotes and eukaryotes, which is involved in the protection of cells from the toxic effects of peroxides. It catalyzes the conversion of hydrogen peroxide to water and molecular oxygen. Catalases also utilize hydrogen peroxide to oxidize various substrates such as alcohol or phenols. Clade 2 catalases are mostly found in bacteria and fungi; they have a large subunit size of 75 to 84 kDa, and bind a heme d group buried deep inside the structure. They appear to form tetramers. In eukaryotic cells, catalases are located in peroxisomes.	443
163712	cd08156	catalase_clade_3	Clade 3 of the heme-binding enzyme catalase. Catalase is a ubiquitous enzyme found in both prokaryotes and eukaryotes, which is involved in the protection of cells from the toxic effects of peroxides. It catalyzes the conversion of hydrogen peroxide to water and molecular oxygen. Catalases also utilize hydrogen peroxide to oxidize various substrates such as alcohol or phenols. Clade 3 catalases are the most abundant subfamily and are found in all three kingdoms of life; they have a relatively small subunit size of 43 to 75 kDa, and bind a protoheme IX (heme b) group buried deep inside the structure. Clade 3 catalases also bind NADPH as a second redox-active cofactor. They form tetramers, and in eukaryotic cells, catalases are located in peroxisomes.	429
163713	cd08157	catalase_fungal	Fungal catalases similar to yeast catalases A and T. Catalase is a ubiquitous enzyme found in both prokaryotes and eukaryotes, which is involved in the protection of cells from the toxic effects of peroxides. It catalyzes the conversion of hydrogen peroxide to water and molecular oxygen. Catalases also utilize hydrogen peroxide to oxidize various substrates such as alcohol or phenols. This family of fungal catalases has a relatively small subunit size, and binds a protoheme IX (heme b) group buried deep inside the structure. Fungal catalases also bind NADPH as a second redox-active cofactor. They form tetramers; in eukaryotic cells, catalases are typically located in peroxisomes. Saccharomyces cerevisiae catalase T is found in the cytoplasm, though.	451
176482	cd08159	APC10-like	APC10-like DOC1 domains in E3 ubiquitin ligases that mediate substrate ubiquitination. This family contains the single domain protein, APC10, a subunit of the anaphase-promoting complex (APC), as well as the DOC1 domain of multi-domain proteins present in E3 ubiquitin ligases. E3 ubiquitin ligases mediate substrate ubiquitination (or ubiquitylation), a component of the ubiquitin-26S proteasome pathway for selective proteolytic degradation. The APC, a multi-protein complex (or cyclosome), is a cell cycle-regulated, E3 ubiquitin ligase that controls important transitions in mitosis and the G1 phase by ubiquitinating regulatory proteins, thereby targeting them for degradation. APC10-like DOC1 domains such as those present in HECT (Homologous to the E6-AP Carboxyl Terminus) and Cullin-RING (Really Interesting New Gene) E3 ubiquitin ligase proteins, HECTD3, and CUL7, respectively, are also included in this hierarchy. CUL7 is a member of the Cullin-RING ligase family and functions as a molecular scaffold assembling a SCF-ROC1-like E3 ubiquitin ligase complex consisting of Skp1, CUL7, Fbx29 F-box protein, and ROC1 (RING-box protein 1) and promotes ubiquitination. CUL7 is a multi-domain protein with a C-terminal cullin domain that binds ROC1 and a centrally positioned APC10/DOC1 domain. HECTD3 contains a C-terminal HECT domain which contains the active site for ubiquitin transfer onto substrates, and an N-terminal APC10 domain which is responsible for substrate recognition and binding. An  APC10/DOC1 domain homolog is also present in HERC2 (HECT domain and RLD2), a large multi-domain protein with three RCC1-like domains (RLDs), additional internal domains including zinc finger ZZ-type and Cyt-b5 (Cytochrome b5-like Heme/Steroid binding) domains, and a C-terminal HECT domain. Recent studies have shown that the protein complex HERC2-RNF8 coordinates ubiquitin-dependent assembly of DNA repair factors on damaged chromosomes. Also included in this hierarchy is an uncharacterized APC10/DOC1-like domain found in a multi-domain protein, which also contains CUB, zinc finger ZZ-type, and EF-hand domains. The APC10/DOC1 domain forms a beta-sandwich structure that is related in architecture to the galactose-binding domain-like fold; their sequences are quite dissimilar, however, and are not included here.	129
380914	cd08161	SET	SET (Su(var)3-9, Enhancer-of-zeste, Trithorax) domain superfamily. The Su(var)3-9, Enhancer-of-zeste, Trithorax (SET) domain superfamily corresponds to SET domain-containing lysine methyltransferases, which catalyze site and state-specific methylation of lysine residues in histones that are fundamental in epigenetic regulation of gene activation and silencing in eukaryotic organisms. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains has been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as N-SET and C-SET. C-SET forms an unusual and conserved knot-like structure of probable functional importance. In addition to N-SET and C-SET, an insert region (I-SET) and flanking regions of high structural variability form part of the overall structure. Some family members contain a pre-SET domain, which is found in a number of histone methyltransferases (HMTase), and a post-SET domain, which harbors a zinc-binding site.	72
277369	cd08162	MPP_PhoA_N	Synechococcus sp. strain PCC 7942  PhoA and related proteins, N-terminal metallophosphatase domain. Synechococcus sp. strain PCC 7942 PhoA is a large atypical alkaline phosphatase.  It is known to be transported across the inner cytoplasmic membrane and into the periplasmic space.  In vivo inactivation of the gene encoding PhoA leads to a loss of extracellular, phosphate-regulated phosphatase activity, but does not appear to affect the cell's capacity for phosphate uptake.  PhoA may play a role in scavenging phosphate during growth of Synechococcus sp. strain PCC 7942 in its natural environment.  PhoA  belongs to a domain family which includes the bacterial enzyme UshA and several other related enzymes including SoxB, CpdB, YhcR, and CD73.  All members have a similar domain architecture which includes an N-terminal metallophosphatase domain and a C-terminal nucleotidase domain.  The N-terminal metallophosphatase domain belongs to a large superfamily of distantly related metallophosphatases (MPPs) that includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	325
277370	cd08163	MPP_Cdc1	Saccharomyces cerevisiae CDC1 and related proteins, metallophosphatase domain. Cdc1 (also known as XlCdc1 in Xenopus laevis) is an endoplasmic reticulum-localized transmembrane lipid phosphatase with a metallophosphatase domain facing the ER lumen.  In budding yeast, the gene encoding CDC1 is essential while nonlethal mutations cause defects in Golgi inheritance and actin polarization.  Cdc1 mutant cells accumulate an unidentified phospholipid, suggesting that Cdc1 is a lipid phosphatase.  Cdc1 mutant cells also have highly elevated intracellular calcium levels suggesting a possible role for Cdc1 in calcium regulation.  The 5' flanking region of Cdc1 is a regulatory region with conserved binding site motifs for AP1, AP2, Sp1, NF-1 and CREB.  DNA polymerase delta consists of at least four subunits - Pol3, Cdc1, Cdc27, and Cdm1.  Cdc1 belongs to the metallophosphatase (MPP) superfamily.  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	257
277371	cd08164	MPP_Ted1	Saccharomyces cerevisiae Ted1 and related proteins, metallophosphatase domain. Saccharomyces cerevisiae Ted1 (trafficking of Emp24p/Erv25p-dependent cargo disrupted 1) is a metallophosphatase domain-containing protein which acts together with Emp24p and Erv25p in cargo exit from the ER.  Ted1 belongs to the metallophosphatase (MPP) superfamily.  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	193
277372	cd08165	MPP_MPPE1	human MPPE1 and related proteins, metallophosphatase domain. MPPE1 is a functionally uncharacterized metallophosphatase domain-containing protein. The MPPE1 gene is located on chromosome 18 and is a candidate susceptibility gene for Bipolar disorder.  MPPE1 belongs to the metallophosphatase (MPP) superfamily.  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	156
277373	cd08166	MPP_Cdc1_like_1	uncharacterized subgroup related to Saccharomyces cerevisiae CDC1, metallophosphatase domain. A functionally uncharacterized subgroup related to the metallophosphatase domain of Saccharomyces cerevisiae Cdc1, S. cerevisiae Ted1 and human MPPE1. Cdc1 is an endoplasmic reticulum-localized transmembrane lipid phosphatase and is a subunit of DNA polymerase delta. TED1 (trafficking of Emp24p/Erv25p-dependent cargo disrupted 1), acts together with Emp24p and Erv25p in cargo exit from the ER.  The MPPE1 gene is a candidate susceptibility gene for Bipolar disorder.  Proteins in this uncharacterized subgroup belong to the metallophosphatase (MPP) superfamily.  MPPs are functionally diverse, but all share a conserved domain with an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. The MPP superfamily includes: Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).  The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination.	195
173979	cd08168	Cytochrom_C3	Heme-binding domain of the class III cytochrome C family and related proteins. This alignment models heme binding core motifs as encountered in the cytochrome C3 family and related proteins. Cytochrome C3 is a tetraheme protein found in sulfate-reducing bacteria which use either thiosulfate or sulfate as the ultimate electron acceptors. C3 is an integral part of a complex electron transfer chain. The model also contains triheme cytochromes C7 which function in electron transfer during Fe(III) respiration by Geobacter sulfurreducens (PpcA, PpcB, PpcC, PpcD, and PpcE) and four repeated core motifs as found in the 16-heme cytochrome C HmcA of Desulfovibrio vulgaris Hildenborough which plays a role in electron transfer through the membrane following periplasmic oxidation of hydrogen (resulting in sulfate reduction in the cytoplasm).	85
341448	cd08169	DHQ-like	Dehydroquinate synthase-like which includes dehydroquinate synthase, 2-deoxy-scyllo-inosose synthase, and 2-epi-5-epi-valiolone synthase. This group contains dehydroquinate synthase, 2-deoxy-scyllo-inosose synthase, and 2-epi-5-epi-valiolone synthase. These proteins exhibit the dehydroquinate synthase structural fold. Dehydroquinate synthase (DHQS) catalyzes the conversion of 3-deoxy-D-arabino-heptulosonate-7-phosphate (DAHP) to dehydroquinate (DHQ) in the second step of the shikimate pathway. This pathway involves seven sequential enzymatic steps in the conversion of erythrose 4-phosphate and phosphoenolpyruvate into chorismate for subsequent synthesis of aromatic compounds. 2-deoxy-scyllo-inosose synthase (DOIS) catalyzes carbocycle formation from D-glucose-6-phosphate to 2-deoxy-scyllo-inosose through a multi-step reaction in the biosynthesis of aminoglycoside antibiotics. 2-deoxystreptamine (DOS)-containing aminoglycoside antibiotics includes neomycin, kanamycin, gentamicin, and ribostamycin. 2-epi-5-epi-valiolone synthases catalyze the cyclization of sedoheptulose 7-phosphate to 2-epi-5-epi-valiolone in the biosynthesis of C(7)N-aminocyclitol-containing products. The cyclization product, 2-epi-5-epi-valiolone ((2S,3S,4S,5R)-5-(hydroxymethyl)cyclohexanon-2,3,4,5-tetrol), is a precursor of the valienamine moiety. The valienamine unit is responsible for their biological activities as various glycosidic hydrolases inhibitors. Two important microbial secondary metabolites, validamycin and acarbose, are used in agricultural and biomedical applications.	328
341449	cd08170	GlyDH	Glycerol dehydrogenase (GlyDH) catalyzes oxidation of glycerol to dihydroxyacetone in glycerol dissimilation. Glycerol dehydrogenase (GlyDH) is a key enzyme in the glycerol dissimilation pathway. In anaerobic conditions, many microorganisms utilize glycerol as a source of carbon through coupled oxidative and reductive pathways. One of the pathways involves the oxidation of glycerol to dihydroxyacetone with the reduction of NAD+ to NADH catalyzed by glycerol dehydrogenases. Dihydroxyacetone is then phosphorylated by dihydroxyacetone kinase and enters the glycolytic pathway for further degradation. The activity of GlyDH is zinc-dependent; the zinc ion plays a role in stabilizing an alkoxide intermediate at the active site.	351
341450	cd08171	GlyDH-like	Glycerol dehydrogenase-like. This family contains glycerol dehydrogenase (GlyDH)-like proteins that have yet to be characterized, but show sequence homology with glycerol dehydrogenase. Glycerol dehydrogenases (GlyDH) is a key enzyme in the glycerol dissimilation pathway. In anaerobic conditions, many microorganisms utilize glycerol as a source of carbon through coupled oxidative and reductive pathways. One of the pathways involves the oxidation of glycerol to dihydroxyacetone with the reduction of NAD+ to NADH catalyzed by glycerol dehydrogenases. Dihydroxyacetone is then phosphorylated by dihydroxyacetone kinase and enters the glycolytic pathway for further degradation. The activity of GlyDH is zinc-dependent; the zinc ion plays a role in stabilizing an alkoxide intermediate at the active site.	345
341451	cd08172	GlyDH-like	Glycerol_dehydrogenase-like. This family contains glycerol dehydrogenase (GlyDH)-like proteins that have yet to be characterized, but show sequence homology with glycerol dehydrogenase. Glycerol dehydrogenases (GlyDH) is a key enzyme in the glycerol dissimilation pathway. In anaerobic conditions, many microorganisms utilize glycerol as a source of carbon through coupled oxidative and reductive pathways. One of the pathways involves the oxidation of glycerol to dihydroxyacetone with the reduction of NAD+ to NADH catalyzed by glycerol dehydrogenases. Dihydroxyacetone is then phosphorylated by dihydroxyacetone kinase and enters the glycolytic pathway for further degradation. The activity of GlyDH is zinc-dependent; the zinc ion plays a role in stabilizing an alkoxide intermediate at the active site.	346
341452	cd08173	Gro1PDH	Sn-glycerol-1-phosphate dehydrogenase (Gro1PDH) catalyzes the reversible conversion between dihydroxyacetone phosphate and glycerol-1-phosphate using either NADH or NADPH as a coenzyme. Sn-glycerol-1-phosphate dehydrogenase (Gro1PDH, EC 1.1.1.261) plays an important role in the formation of the enantiomeric configuration of the glycerophosphate backbone (sn-glycerol-1-phosphate) of archaeal ether lipids. It catalyzes the reversible conversion between dihydroxyacetone phosphate and glycerol-1-phosphate using either NADH or NADPH as a coenzyme. The activity is zinc-dependent. One characteristic feature of archaea is that their cellular membrane has an ether linkage between the glycerol backbone and the hydrocarbon residues. The polar lipids of the members of Archaea consist of di- and tetra-ethers of glycerol with isoprenoid alcohols bound at the sn-2 and sn-3 positions of the glycerol moiety. The archaeal polar lipids have the enantiomeric configuration of a glycerophosphate backbone [sn-glycerol-1-phosphate (G-1-P)] that is the mirror image structure of the bacterial or eukaryal counterpart [sn-glycerol- 3-phosphate (G-3-P)]. The absolute stereochemistry of the glycerol moiety in all archaeal polar lipids is opposite to that of glycerol ester lipids in bacteria and eukarya.	343
341453	cd08174	G1PDH-like	Glycerol-1-phosphate dehydrogenase-like. These glycerol-1-phosphate dehydrogenase-like proteins have not been characterized. The protein sequences have high similarity with that of glycerol-1-phosphate dehydrogenase (G1PDH) which plays a role in the synthesis of phosphoglycerolipids in Gram-positive bacterial species. It catalyzes the reversible reduction of dihydroxyacetone phosphate (DHAP) to glycerol-1-phosphate (G1P) in an NADH-dependent manner. Its activity requires a Ni++ ion.	332
341454	cd08175	G1PDH	Glycerol-1-phosphate dehydrogenase (G1PDH) catalyzes the reversible reduction of dihydroxyacetone phosphate (DHAP) to glycerol-1-phosphate (G1P) in an NADH-dependent manner. Glycerol-1-phosphate dehydrogenase (G1PDH) plays a role in the synthesis of phosphoglycerolipids in Gram-positive bacterial species. It catalyzes the reversible reduction of dihydroxyacetone phosphate (DHAP) to glycerol-1-phosphate (G1P) in an NADH-dependent manner. Its activity requires a Ni++ ion. In Bacillus subtilis, it has been described as the AraM gene in the L-arabinose (ara) operon. The AraM protein forms a homodimer.	340
341455	cd08176	LPO	Lactaldehyde:propanediol oxidoreductase (LPO) catalyzes the interconversion between L-lactaldehyde and L-1,2-propanediol in Escherichia coli and other enterobacteria. Lactaldehyde:propanediol oxidoreductase (LPO) is a member of the group III iron-activated dehydrogenases which catalyze the interconversion between L-lactaldehyde and L-1,2-propanediol in Escherichia coli and other enterobacteria. L-fucose and L-rhamnose are used by Escherichia coli through an inducible pathway mediated by the fucose regulon comprising four linked operons fucO, fucA, fucPIK, and fucR. The fucA-encoded aldolase catalyzes the formation of dihydroxyacetone phosphate and L-lactaldehyde. Under anaerobic conditions, with NADH as a cofactor, lactaldehyde is converted by a fucO-encoded lactaldehyde:propanediol oxidoreductase (LPO) to L-1,2-propanediol, which is excreted as a fermentation product. In mutant strains, E. coli adapted to grow on L-1,2-propanediol, FucO catalyzes the oxidation of the polyol to L-lactaldehyde. FucO is induced regardless of the respiratory conditions of the culture, and remains fully active in the absence of oxygen. In the presence of oxygen, this enzyme becomes oxidatively inactivated by a metal-catalyzed oxidation mechanism. FucO is an iron-dependent metalloenzyme that is inactivated by other metals, such as zinc, copper, or cadmium. This enzyme can also reduce glycol aldehyde with similar efficiency.  Besides L-1,2-propanediol, the enzyme is also able to oxidize methanol as an alternative substrate.	378
341456	cd08177	MAR	Maleylacetate reductase is involved in many aromatic compounds degradation pathways of aerobic microbes. Maleylacetate reductase (MAR) plays an important role in the degradation of aromatic compounds  in aerobic microbes. In fungi and yeasts, the enzyme is involved in the catabolism of compounds such as phenol, tyrosine, benzoate, 4-hydroxybenzoate and resorcinol. In bacteria, the enzyme contributes to the degradation of resorcinol, 2,4-dihydroxybenzoate ([beta]-resorcylate) and 2,6-dihydroxybenzoate ([gamma]-resorcylate) via hydroxyquinol and maleylacetate. Maleylacetate reductase catalyzes NADH- or NADPH-dependent reduction, at the carbon-carbon double bond, of maleylacetate or 2-chloromaleylacetate to 3-oxoadipate. In the case of 2-chloromaleylacetate, MAR initially catalyzes the NAD(P)H-dependent dechlorination to maleylacetate, which is then reduced to 3-oxoadipate. This enzyme is a homodimer and is inhibited by thiol-blocking reagents such as p-chloromercuribenzoate and Hg++, indicating that the cysteine residue is probably necessary for the catalytic activity of maleylacetate reductase.	337
341457	cd08178	AAD_C	C-terminal alcohol dehydrogenase domain of the acetaldehyde dehydrogenase-alcohol dehydrogenase bifunctional two-domain protein (AAD). This alcohol dehydrogenase domain is located on the C-terminal of a bifunctional two-domain protein. The N-terminal of the protein contains an acetaldehyde-CoA dehydrogenase domain. This protein is involved in pyruvate metabolism whereby pyruvate is converted to acetyl-CoA and formate by pyruvate formate-lyase (PFL). Under anaerobic conditions, acetyl-CoA is reduced to acetaldehyde and ethanol by this two-domain protein. Acetyl-CoA is first converted into an enzyme-bound thiohemiacetal by the N-terminal acetaldehyde dehydrogenase domain. The enzyme-bound thiohemiacetal is subsequently reduced by the C-terminal NAD+-dependent alcohol dehydrogenase domain. In E. coli, this protein is called AdhE and has been shown to have pyruvate formate-lyase (PFL) deactivase activity, which leads to the inactivation of PFL, a key enzyme in anaerobic metabolism. In Escherichia coli and Entamoeba histolytica, this enzyme forms homopolymeric peptides composed of more than 20 protomers associated in a helical rod-like structure.	400
341458	cd08179	NADPH_BDH	NADPH-dependent butanol dehydrogenase involved in the butanol and ethanol formation pathway in bacteria. NADPH-dependent butanol dehydrogenase (BDH) is involved in the butanol and ethanol formation pathway of some bacteria. The fermentation process is characterized by an acid producing growth phase, followed by a solvent producing phase. The latter phase is associated with the induction of solventogenic enzymes such as butanol dehydrogenase. The activity of the enzyme requires NADPH as cofactor, as well as divalent ions zinc or iron. This family is a member of the iron-containing alcohol dehydrogenase superfamily. Protein structure has a dehydroquinate synthase-like fold.	379
341459	cd08180	PDD	1,3-propanediol dehydrogenase (PPD) catalyzes the reduction of 3-hydroxypropionaldehyde (3-HPA) to 1,3-propanediol in glycerol metabolism. 1,3-propanediol dehydrogenase (PPD) plays a role in glycerol metabolism of some bacteria in anaerobic conditions. In this degradation pathway, glycerol is converted in a two-step process to 1,3-propanediol (1,3-PD) which is then excreted into the extracellular medium. The first reaction involves the transformation of glycerol into 3-hydroxypropionaldehyde (3-HPA) by a coenzyme B-12-dependent dehydratase. The second reaction involves the dismutation of the 3-hydroxypropionaldehyde (3-HPA) to 1,3-propanediol by the NADH-linked 1,3-propanediol dehydrogenase (PPD). The enzyme requires iron ion for its function. Because many genes in this pathway are present in the propanediol utilization (pdu) operon, they are also named pdu genes. PPD is a member of the iron-containing alcohol dehydrogenase superfamily. The PPD structure has a dehydroquinate synthase-like fold.	333
341460	cd08181	PPD-like	1,3-propanediol dehydrogenase-like (PPD). This family contains proteins similar to 1,3-propanediol dehydrogenase (PPD) which is a member of the iron-containing alcohol dehydrogenase superfamily, and exhibits a dehydroquinate synthase-like fold.  Protein sequence similarity search and other biochemical evidences suggest that they are close to the iron-containing 1,3-propanediol dehydrogenase (EC 1.1.1.202). 1,3-propanediol dehydrogenase catalyzes the oxidation of propane-1,3-diol to 3-hydroxypropanal with the simultaneous reduction of NADP+ to NADPH. The protein structure of Thermotoga maritima TM0920 gene contains one NADP+ and one iron ion.	358
341461	cd08182	HEPD	Hydroxyethylphosphonate dehydrogenase (HEPD) catalyzes the reduction of phosphonoacetaldehyde (PnAA) to hydroxyethylphosphonate (HEP). Hydroxyethylphosphonate dehydrogenase (HEPD) catalyzes the reduction of phosphonoacetaldehyde (PnAA) to hydroxyethylphosphonate (HEP) with either NADH or NADPH as a cofactor, although NADH is the preferred cofactor. PnAA is a biosynthetic intermediate for several phosphonates such as the antibiotic fosfomycin, phosphinothricin tripeptide (PTT), and 2-aminoethylphosphonate (AEP). This enzyme is named PhpC in the PTT biosynthesis pathway in Streptomyces hygroscopicus and S. viridochromogenes.	370
341462	cd08183	Fe-ADH-like	Iron-containing alcohol dehydrogenases-like. This family contains iron-containing alcohol dehydrogenase (Fe-ADH) which catalyzes the reduction of acetaldehyde to alcohol with NADP as cofactor. Its activity requires iron ions. The protein structure represents a dehydroquinate synthase-like fold and is a member of the iron-activated alcohol dehydrogenase-like family. It is distinct from other alcohol dehydrogenases which contain different protein domains. Proteins of this family have not been characterized.	377
341463	cd08184	Fe-ADH_KdnB-like	Iron-containing alcohol dehydrogenase similar to Shewanella oneidensis KdnB required for Kdo8N biosynthesis. This family contains iron-containing alcohol dehydrogenase-like proteins, many of which have not been characterized. Their specific function is unknown. The protein structure represents a dehydroquinate synthase-like fold and belongs to the iron-containing alcohol dehydrogenase-like superfamily. It is distinct from other alcohol dehydrogenases which contain different protein domains. Alcohol dehydrogenase catalyzes the reduction of acetaldehyde to alcohol with NADP as cofactor. Its activity requires iron or zinc ions. This family also includes Shewanella oneidensis KdnB which is required for biosynthesis of 8-Amino-3,8-dideoxy-D-manno-octulosonic acid (Kdo8N), a unique amino sugar that has thus far only been observed on the lipopolysaccharides of marine bacteria belonging to the genus Shewanella, and thought to be important for the integrity of the bacterial cell outer membrane. KdnB requires NAD(P) and zinc ion for activity.	348
341464	cd08185	Fe-ADH-like	Iron-containing alcohol dehydrogenases-like. This family contains iron-containing alcohol dehydrogenase-like (ADH) proteins. Alcohol dehydrogenase catalyzes the reduction of acetaldehyde to alcohol with NADP as cofactor. Its activity requires iron ions. The protein structure represents a dehydroquinate synthase fold and is a member of the iron-containing alcohol dehydrogenase-like family. They are distinct from other alcohol dehydrogenases which contain different protein domains. Proteins of this family have not been characterized.	379
341465	cd08186	Fe-ADH-like	Iron-containing alcohol dehydrogenase. This family contains iron-containing alcohol dehydrogenase (ADH) which catalyzes the reduction of acetaldehyde to alcohol with NADP as cofactor. The ADH of hyperthermophilic archaeon Thermococcus hydrothermalis oxidizes a series of primary aliphatic and aromatic alcohols, preferentially from C2 to C8, but is also active towards methanol and glycerol, and is stereospecific for monoterpenes. It has been suggested that the type III ADHs in microorganisms are involved in acetaldehyde detoxication rather than in alcohol turnover.	380
341466	cd08187	BDH	Butanol dehydrogenase catalyzes the conversion of butyraldehyde to butanol with the cofactor NAD(P)H being oxidized in the process. The butanol dehydrogenase (BDH) is involved in the final step of the butanol formation pathway in anaerobic micro-organism. Butanol dehydrogenase catalyzes the conversion of butyraldehyde to butanol with the cofactor NAD(P)H being oxidized in the process. Activity in the reverse direction is 50-fold lower than that in the forward direction. The NADH-BDH has higher activity with longer chained aldehydes and is inhibited by metabolites containing an adenine moiety. This protein family belongs to the so-called iron-containing alcohol dehydrogenase superfamily. Since members of this superfamily use different divalent ions, preferentially iron or zinc, it has been suggested to be renamed to family III metal-dependent polyol dehydrogenases. This family also includes E. coli YqhD enzyme, an NADP-dependent dehydrogenase whose activity measurements with several alcohols demonstrate preference for alcohols longer than C3. The active site of YqhD contains a Zn metal, and a modified NADPH cofactor bearing OH groups on the saturated C5 and C6 atoms, possibly due to oxygen stress on the enzyme, which would functionally work under anaerobic conditions.	382
341467	cd08188	PDDH	1,3-Propanediol (1,3-PD) dehydrogenase. This family includes 1,3-propanediol (1,3-PD) dehydrogenase, a key enzyme in the microbial production of 1,3-PD that has been previously characterized as the product of dhaT gene in Klebsiella pneumoniae. 1,3-PD dehydrogenase is a member of the family III metal-dependent polyol dehydrogenases, which are shown to require a divalent metal ion for catalysis. However, some members of this family showed a dependence on Fe(2+) or Zn(2+) for activity.	377
341468	cd08189	Fe-ADH-like	Iron-containing alcohol dehydrogenases-like. This family contains iron-containing alcohol dehydrogenase (Fe-ADH) which catalyzes the reduction of acetaldehyde to alcohol with NADP as cofactor. Its activity requires iron ions. The protein structure represents a dehydroquinate synthase-like fold and belongs to the alcohol dehydrogenase-like superfamily. It is distinct from other alcohol dehydrogenases which contain different protein domains. Proteins of this family have not been characterized.	378
341469	cd08190	HOT	Hydroxyacid-oxoacid transhydrogenase (HOT) involved in gamma-hydroxybutyrate metabolism. This family contains hydroxyacid-oxoacid transhydrogenase (HOT), also known as D-2-hydroxyglutarate transhydrogenase. It catalyzes the conversion of gamma-hydroxybutyrate (GHB) to succinic semialdehyde (SSA), coupled to the stoichiometric conversion of alpha-ketoglutarate to D-2-hydroxyglutarate in gamma-Hydroxybutyrate catabolism. Unlike many other alcohols, which are oxidized by NAD-linked dehydrogenases, gamma-hydroxybutyrate is metabolized to succinate semialdehyde by hydroxyacid-oxoacid transhydrogenase which does not require free NAD or NADP; instead, it uses alpha-ketoglutarate as an acceptor, converting it to d-2-hydroxyglutarate. Alpha-ketoglutarate serves as an intermediate acceptor to regenerate NAD(P) required for the oxidation of GHB. HOT also catalyzes the reversible oxidation of a hydroxyacid obligatorily coupled to the reduction of an oxoacid, and requires no cofactor. In mammals, the HOT enzyme is located in mitochondria, and is expressed with an N-terminal mitochondrial targeting sequence. HOT enzyme is member of the metal-containing alcohol dehydrogenase family. It typically contains an iron although other metal ions may be used.	412
341470	cd08191	Fe-ADH-like	Iron-containing alcohol dehydrogenases-like. This family contains iron-containing alcohol dehydrogenase (Fe-ADH) which catalyzes the reduction of acetaldehyde to alcohol with NADP as cofactor. Its activity requires iron ions. The protein structure represents a dehydroquinate synthase-like fold and is a member of the iron-activated alcohol dehydrogenase-like family. It is distinct from other alcohol dehydrogenases which contain different protein domains. Proteins of this family have not been characterized.	392
341471	cd08192	MAR-like	Maleylacetate reductase is involved in many aromatic compounds degradation pathways of aerobic microbes. Maleylacetate reductase (MAR) plays an important role in the degradation of aromatic compounds  in aerobic microbes. In fungi and yeasts, the enzyme is involved in the catabolism of compounds such as phenol, tyrosine, benzoate, 4-hydroxybenzoate and resorcinol. In bacteria, the enzyme contributes to the degradation of resorcinol, 2,4-dihydroxybenzoate ([beta]-resorcylate) and 2,6-dihydroxybenzoate ([gamma]-resorcylate) via hydroxyquinol and maleylacetate. Maleylacetate reductase (MAR) catalyzes NADH- or NADPH-dependent reduction, at the carbon-carbon double bond, of maleylacetate or 2-chloromaleylacetate to 3-oxoadipate. In the case of 2-chloromaleylacetate, MAR initially catalyzes the NAD(P)H-dependent dechlorination to maleylacetate, which is then reduced to 3-oxoadipate. This enzyme is a homodimer. It is inhibited by thiol-blocking reagents such as p-chloromercuribenzoate and Hg++, indicating that the cysteine residue is probably necessary for the catalytic activity of maleylacetate reductase.	380
341472	cd08193	HVD	5-hydroxyvalerate dehydrogenase (HVD) catalyzes the oxidation of 5-hydroxyvalerate to 5-oxovalerate with NAD+ as cofactor. 5-hydroxyvalerate dehydrogenase (HVD) is an iron-containing (type III) NAD-dependent alcohol dehydrogenase. It plays a role in the cyclopentanol metabolism biochemical pathway. It catalyzes the oxidation of 5-hydroxyvalerate to 5-oxovalerate with NAD+ as cofactor. This cyclopentanol (cpn) degradation pathway is present in some bacteria which can use cyclopentanol as sole carbon source. In Comamonas sp. strain NCIMB 9872, this enzyme is encoded by the CpnD gene.	379
341473	cd08194	Fe-ADH-like	Iron-containing alcohol dehydrogenases-like. This family contains iron-containing alcohol dehydrogenase (Fe-ADH) most of which have not been characterized. Their specific function is unknown. The protein structure represents a dehydroquinate synthase-like fold and belongs to the alcohol dehydrogenase-like superfamily. It is distinct from other alcohol dehydrogenases which contain different protein domains. Alcohol dehydrogenase catalyzes the reduction of acetaldehyde to alcohol with NADP as cofactor. Its activity requires iron ions.	378
341474	cd08195	DHQS	Dehydroquinate synthase (DHQS) catalyzes the conversion of DAHP to DHQ in shikimate pathway for aromatic compounds synthesis. Dehydroquinate synthase (DHQS) catalyzes the conversion of 3-deoxy-D-arabino-heptulosonate-7-phosphate (DAHP) to dehydroquinate (DHQ) in the second step of the shikimate pathway. This pathway, which involves seven sequential enzymatic steps in the conversion of erythrose 4-phosphate and phosphoenolpyruvate into chorismate for subsequent synthesis of aromatic compounds, is found in bacteria, microbial eukaryotes, and plants, but not in mammals. Therefore, enzymes of this pathway are attractive targets for the development of non-toxic antimicrobial compounds, herbicides and anti-parasitic agents. The activity of DHQS requires nicotinamide adenine dinucleotide (NAD) as cofactor. A single active site in DHQS catalyzes five sequential reactions involving alcohol oxidation, phosphate elimination, carbonyl reduction, ring opening, and intramolecular aldol condensation. The binding of substrates and ligands induces domain conformational changes. In some fungi and protozoa, this domain is fused with the other four domains in shikimate pathway and forms a penta-domain AROM protein, which catalyzes steps 2-6 in the shikimate pathway.	343
341475	cd08196	Fe-ADH-like	iron-containing alcohol dehydrogenases (Fe-ADH)-like. This family contains iron-containing alcohol dehydrogenase (Fe-ADH) which catalyzes the reduction of acetaldehyde to alcohol with NADP as cofactor. Its activity requires iron ions. The protein structure represents a dehydroquinate synthase-like fold and is a member of the iron-activated alcohol dehydrogenase-like family. It is distinct from other alcohol dehydrogenases which contain different protein domains. Proteins of this family have not been characterized.	367
341476	cd08197	DOIS	2-deoxy-scyllo-inosose synthase (DOIS) catalyzes carbocycle formation from D-glucose-6-phosphate to 2-deoxy-scyllo-inosose. 2-deoxy-scyllo-inosose synthase (DOIS) catalyzes carbocycle formation from D-glucose-6-phosphate to 2-deoxy-scyllo-inosose through a multistep reaction in the biosynthesis of aminoglycoside antibiotics. 2-deoxystreptamine (DOS)-containing aminoglycoside antibiotics include neomycin, kanamycin, gentamicin, and ribostamycin. They are important antibacterial agents. DOIS is a homolog of the dehydroquinate synthase which catalyzes the cyclization of 3-deoxy-D-arabino-heptulosonate-7-phosphate to dehydroquinate (DHQ) in the shikimate pathway.	355
341477	cd08198	DHQS-like	Dehydroquinate synthase (DHQS) catalyzes the conversion of DAHP to DHQ in shikimate pathway for aromatic compounds synthesis. This family contains dehydroquinate synthase-like proteins. Dehydroquinate synthase (DHQS) catalyzes the conversion of 3-deoxy-D-arabino-heptulosonate-7-phosphate (DAHP) to dehydroquinate (DHQ) in the second step of the shikimate pathway. This pathway involves seven sequential enzymatic steps in the conversion of erythrose 4-phosphate and phosphoenolpyruvate into chorismate for subsequent synthesis of aromatic compounds. The activity of DHQS requires NAD as cofactor. Proteins of this family share sequence similarity and functional motifs with that of dehydroquinate synthase, but the specific function has not been characterized.	366
341478	cd08199	EEVS	2-epi-5-epi-valiolone synthase (EEVS). 2-epi-5-epi-valiolone synthase catalyzes the cyclization of sedoheptulose 7-phosphate to 2-epi-5-epi-valiolone in the biosynthesis of C(7)N-aminocyclitol-containing products. The cyclization product, 2-epi-5-epi-valiolone ((2S,3S,4S,5R)-5-(hydroxymethyl)cyclohexanon-2,3,4,5-tetrol), is a precursor of the valienamine moiety. The valienamine unit is responsible for their biological activities as various glycosidic hydrolases inhibitors. Two important microbial secondary metabolites, validamycin and acarbose, are used in agricultural and biomedical applications. Validamycin A is an antifungal antibiotic which has a strong trehalase inhibitory activity and has been used to control sheath blight disease in rice caused by Rhizoctonia solani. Acarbose is an alpha-glucosidase inhibitor used for the treatment of type II insulin-independent diabetes.  Salbostatin produced by Streptomyces albus also belongs to this family. It exhibits strong trehalase inhibitory activity.	349
173828	cd08200	catalase_peroxidase_2	C-terminal non-catalytic domain of catalase-peroxidases. This is a subgroup of heme-dependent peroxidases of the plant superfamily that share a heme prosthetic group and catalyze a multistep oxidative reaction involving hydrogen peroxide as the electron acceptor. Catalase-peroxidases can exhibit both catalase and broad-spectrum peroxidase activities depending on the steady-state concentration of hydrogen peroxide. These enzymes are found in many archaeal and bacterial organisms where they neutralize potentially lethal hydrogen peroxide molecules generated during photosynthesis or stationary phase. Along with related intracellular fungal and plant peroxidases, catalase-peroxidases belong to plant peroxidase superfamily. Unlike the eukaryotic enzymes, they are typically comprised of two homologous domains that probably arose via a single gene duplication event. The heme binding motif is present only in the N-terminal domain; the function of the C-terminal domain is not clear.	297
173829	cd08201	plant_peroxidase_like_1	Uncharacterized family of plant peroxidase-like proteins. This is a subgroup of heme-dependent peroxidases similar to plant peroxidases.  Along with animal peroxidases, these enzymes belong to a group of peroxidases containing a heme prosthetic group (ferriprotoporphyrin IX) which catalyzes a multistep oxidative reaction involving hydrogen peroxide as the electron acceptor. The plant peroxidase-like superfamily is found in all three kingdoms of life and carries out a variety of biosynthetic and degradative functions.	264
188876	cd08203	SAM_PNT	Sterile alpha motif (SAM)/Pointed domain. Sterile alpha motif (SAM)/Pointed domain is found in about 40% of transcriptional regulators of ETS family (initially named for Erythroblastosis virus, E26-E Twenty Six).  SAM Pointed domain containing proteins of this family additionally have a C-terminal ETS DNA-binding domain. In a few cases, SAM Pointed domain appears as a single domain protein.  Members of this group are mostly involved in regulation of embryonic development and growth control in eukaryotes. SAM Pointed domains mediate protein-protein interactions. Depending on the subgroup, they can interact with other SAM Pointed domains forming homo or hetero dimers/oligomers and/or they can recruit a protein kinase to its target which can be a SAM Pointed domain containing protein itself or another protein that has no kinase docking site. Thus, SAM Pointed domains participate in transcriptional regulation and signal transduction. Some genes coding ETS family transcriptional regulators are proto-oncogenes. They are prone to chromosomal translocations resulting in gene fusions. Chimeric proteins with SAM Pointed domains were found in a number of different human tumors including myeloid leukemia, lymphoblastic leukemia, Ewing's sarcoma and primitive neuroectodermal tumor. Members of this family are potential targets for cancer therapy.	67
350058	cd08204	ArfGap	GTPase-activating protein (GAP) for the ADP ribosylation factors (ARFs). ArfGAPs are a family of proteins containing an ArfGAP catalytic domain that induces the hydrolysis of GTP bound to the small guanine nucleotide-binding protein Arf, a member of the Ras superfamily of GTPases. Like all GTP-binding proteins, Arf proteins function as molecular switches, cycling between GTP (active-membrane bound) and GDP (inactive-cytosolic) form. Conversion to the GTP-bound form requires a guanine nucleotide exchange factor (GEF), whereas conversion to the GDP-bound form is catalyzed by a GTPase activating protein (GAP). In that sense, ArfGAPs were originally proposed to function as terminators of Arf signaling, which is mediated by regulating Arf family GTP-binding proteins. However, recent studies suggest that ArfGAPs can also function as Arf effectors, independently of their GAP enzymatic activity to transduce signals in cells. The ArfGAP domain contains a C4-type zinc finger motif and a conserved arginine that is required for activity, within a specific spacing (CX2CX16CX2CX4R). ArfGAPs, which have multiple functional domains, regulate the membrane trafficking and actin cytoskeleton remodeling via specific interactions with signaling lipids such as phosphoinositides and trafficking proteins, which consequently affect cellular events such as cell growth, migration, and cancer invasion. The ArfGAP family, which includes 31 human ArfGAP-domain containing proteins, is divided into 10 subfamilies based on domain structure and sequence similarity. The ArfGAP nomenclature is mainly based on the protein domain structure. For example, ASAP1 contains ArfGAP, SH3, ANK repeat and PH domains; ARAPs contain ArfGAP, Rho GAP, ANK repeat and PH domains; ACAPs contain ArfGAP, BAR (coiled coil), ANK repeat and PH domains; and AGAPs contain Arf GAP, GTP-binding protein-like, ANK repeat and PH domains. Furthermore, the ArfGAPs can be classified into two major types of subfamilies, according to the overall domain structure: the ArfGAP1 type includes 6 subfamilies (ArfGAP1, ArfGAP2/3, ADAP, SMAP, AGFG, and GIT), which contain the ArfGAP domain at the N-terminus of the protein; and the AZAP type includes 4 subfamilies (ASAP, ACAP, AGAP, and ARAP), which contain an ArfGAP domain between the PH and ANK repeat domains.	106
173970	cd08205	RuBisCO_IV_RLP	Ribulose bisphosphate carboxylase like proteins, Rubisco-Form IV. Ribulose bisphosphate carboxylase (Rubisco) plays an important role in the Calvin reductive pentose phosphate pathway. It catalyzes the primary CO2 fixation step. Rubisco is activated by carbamylation of an active site lysine, stabilized by a divalent cation, which then catalyzes the proton abstraction from the substrate ribulose 1,5 bisphosphate (RuBP) and leads to the formation of two molecules of 3-phosphoglycerate. Members of the Rubisco family can be divided into 4 subgroups, Form I-IV, which differ in their taxonomic distribution and subunit composition. Form I-III have Rubisco activity, while Form IV, also called Rubisco-like proteins (RLP), are missing critical active site residues and therefore do not catalyze CO2 fixation. They are believed to utilize a related enzymatic mechanism, but have divergent functions, like for example 2,3-diketo-5-methylthiopentyl-1-phosphate enolase or 5-methylthio-d-ribulose 1-phosphate isomerase.	367
173971	cd08206	RuBisCO_large_I_II_III	Ribulose bisphosphate carboxylase large chain, Form I,II,III. Ribulose bisphosphate carboxylase (Rubisco) plays an important role in the Calvin reductive pentose phosphate pathway. It catalyzes the primary CO2 fixation step. Rubisco is activated by carbamylation of an active site lysine, stabilized by a divalent cation, which then catalyzes the proton abstraction from the substrate ribulose 1,5 bisphosphate (RuBP) and leads to the formation of two molecules of 3-phosphoglycerate. Members of the Rubisco family can be divided into 4 subgroups, Form I-IV, which differ in their taxonomic distribution and subunit composition. Form I-III have Rubisco activity, while Form IV, also called Rubisco-like proteins (RLP), are missing critical active site residues.	414
173972	cd08207	RLP_NonPhot	Ribulose bisphosphate carboxylase like proteins from nonphototrophic bacteria. Ribulose bisphosphate carboxylase (Rubisco) plays an important role in the Calvin reductive pentose phosphate pathway. It catalyzes the primary CO2 fixation step. Rubisco is activated by carbamylation of an active site lysine, stabilized by a divalent cation, which then catalyzes the proton abstraction from the substrate ribulose 1,5 bisphosphate (RuBP) and leads to the formation of two molecules of 3-phosphoglycerate. Members of the Rubisco family can be divided into 4 subgroups, Form I-IV, which differ in their taxonomic distribution and subunit composition. Form I-III have Rubisco activity, while Form IV, also called Rubisco-like proteins (RLP), are missing critical active site residues and therefore do not catalyze CO2 fixation. They are believed to utilize a related enzymatic mechanism, but have divergent functions. The specific function of this subgroup is unknown.	406
173973	cd08208	RLP_Photo	Ribulose bisphosphate carboxylase like proteins from phototrophic bacteria. Ribulose bisphosphate carboxylase (Rubisco) plays an important role in the Calvin reductive pentose phosphate pathway. It catalyzes the primary CO2 fixation step. Rubisco is activated by carbamylation of an active site lysine, stabilized by a divalent cation, which then catalyzes the proton abstraction from the substrate ribulose 1,5 bisphosphate (RuBP) and leads to the formation of two molecules of 3-phosphoglycerate. Members of the Rubisco family can be divided into 4 subgroups, Form I-IV, which differ in their taxonomic distribution and subunit composition. Form I-III have Rubisco activity, while Form IV, also called Rubisco-like proteins (RLP), are missing critical active site residues and therefore do not catalyze CO2 fixation. They are believed to utilize a related enzymatic mechanism, but have divergent functions. The specific function of this subgroup is unknown.	424
173974	cd08209	RLP_DK-MTP-1-P-enolase	2,3-diketo-5-methylthiopentyl-1-phosphate enolase. Ribulose bisphosphate carboxylase like proteins (RLPs) similar to B. subtilis YkrW protein, have been identified as 2,3-diketo-5-methylthiopentyl-1-phosphate enolases. They catalyze the tautomerization of 2,3-diketo-5-methylthiopentane 1-phosphate (DK-MTP 1-P). This is an important step in the methionine salvage pathway in which 5-methylthio-D-ribose (MTR) derived from 5'-methylthioadenosine is converted to methionine.	391
173975	cd08210	RLP_RrRLP	Ribulose bisphosphate carboxylase like proteins (RLPs) similar to R.rubrum RLP. RLP from Rhodospirillum rubrum plays a role in an uncharacterized sulfur salvage pathway and has been shown to catalyze a novel isomerization reaction that converts 5-methylthio-d-ribulose 1-phosphate to a 3:1 mixture of 1-methylthioxylulose 5-phosphate and 1-methylthioribulose 5-phosphate.	364
173976	cd08211	RuBisCO_large_II	Ribulose bisphosphate carboxylase large chain, Form II. Ribulose bisphosphate carboxylase (Rubisco) plays an important role in the Calvin reductive pentose phosphate pathway. It catalyzes the primary CO2 fixation step. Rubisco is activated by carbamylation of an active site lysine, stabilized by a divalent cation, which then catalyzes the proton abstraction from the substrate ribulose 1,5 bisphosphate (RuBP) and leads to the formation of two molecules of 3-phosphoglycerate. Members of the Rubisco family can be divided into 4 subgroups, Form I-IV, which differ in their taxonomic distribution and subunit composition. Form II is mainly found in bacteria, and forms large subunit oligomers (dimers, tetramers, etc.) that do not include small subunits.	439
173977	cd08212	RuBisCO_large_I	Ribulose bisphosphate carboxylase large chain, Form I. Ribulose bisphosphate carboxylase (Rubisco) plays an important role in the Calvin reductive pentose phosphate pathway. It catalyzes the primary CO2 fixation step. Rubisco is activated by carbamylation of an active site lysine, stabilized by a divalent cation, which then catalyzes the proton abstraction from the substrate ribulose 1,5 bisphosphate (RuBP) and leads to the formation of two molecules of 3-phosphoglycerate. Members of the Rubisco family can be divided into 4 subgroups, Form I-IV, which differ in their taxonomic distribution and subunit composition. Form I is the most abundant class, present in plants, algae, and bacteria, and forms large complexes composed of 8 large and 8 small subunits.	450
173978	cd08213	RuBisCO_large_III	Ribulose bisphosphate carboxylase large chain, Form III. Ribulose bisphosphate carboxylase (Rubisco) plays an important role in the Calvin reductive pentose phosphate pathway. It catalyzes the primary CO2 fixation step. Rubisco is activated by carbamylation of an active site lysine, stabilized by a divalent cation, which then catalyzes the proton abstraction from the substrate ribulose 1,5 bisphosphate (RuBP) and leads to the formation of two molecules of 3-phosphoglycerate. Members of the Rubisco family can be divided into 4 subgroups, Form I-IV, which differ in their taxonomic distribution and subunit composition. Form III is only found in archaea and forms large subunit oligomers (dimers or decamers) that do not include small subunits.	412
270855	cd08215	STKc_Nek	Catalytic domain of the Serine/Threonine Kinase, Never In Mitosis gene A (NIMA)-related kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The Nek family is composed of 11 different mammalian members (Nek1-11) with similarity to the catalytic domain of Aspergillus nidulans NIMA kinase, the founding member of the Nek family, which was identified in a screen for cell cycle mutants that were prevented from entering mitosis. Neks contain a conserved N-terminal catalytic domain and a more divergent C-terminal regulatory region of various sizes and structures. They are involved in the regulation of downstream processes following the activation of Cdc2, and many of their functions are cell cycle-related. They play critical roles in microtubule dynamics during ciliogenesis and mitosis. The Nek family is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	258
270856	cd08216	PK_STRAD	Pseudokinase domain of STE20-related kinase adapter protein. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity. STRAD forms a complex with the scaffolding protein MO25, and the serine/threonine kinase (STK), LKB1, resulting in the activation of the kinase. In the complex, LKB1 phosphorylates and activates adenosine monophosphate-activated protein kinases (AMPKs), which regulate cell energy metabolism and cell polarity. LKB1 is a tumor suppressor linked to the rare inherited disease, Peutz-Jeghers syndrome, which is characterized by a predisposition to benign polyps and hyperpigmentation of the buccal mucosa. There are two forms of STRAD, alpha and beta, that complex with LKB1 and MO25. The structure of STRAD-alpha is available and shows that this protein binds ATP, has an ordered activation loop, and adopts a closed conformation typical of fully active protein kinases. It does not possess activity due to nonconservative substitutions of essential catalytic residues. ATP binding enhances the affinity of STRAD for MO25. The conformation of STRAD-alpha stabilized through ATP and MO25 may be needed to activate LKB1. The STRAD subfamily is part of a larger superfamily that includes the catalytic domains of STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	315
270857	cd08217	STKc_Nek2	Catalytic domain of the Serine/Threonine Kinase, Never In Mitosis gene A (NIMA)-related kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The Nek2 subfamily includes Aspergillus nidulans NIMA kinase, the founding member of the Nek family, which was identified in a screen for cell cycle mutants prevented from entering mitosis. NIMA is essential for mitotic entry and progression through mitosis, and its degradation is essential for mitotic exit. NIMA is involved in nuclear membrane fission. Vertebrate Nek2 is a cell cycle-regulated STK, localized in centrosomes and kinetochores, that regulates centrosome splitting at the G2/M phase. It also interacts with other mitotic kinases such as Polo-like kinase 1 and may play a role in spindle checkpoint. An increase in the expression of the human NEK2 gene is strongly associated with the progression of non-Hodgkin lymphoma. Nek2 is one in a family of 11 different Neks (Nek1-11) that are involved in cell cycle control. The Nek family is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	265
270858	cd08218	STKc_Nek1	Catalytic domain of the Protein Serine/Threonine Kinase, Never In Mitosis gene A (NIMA)-related kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Nek1 is associated with centrosomes throughout the cell cycle. It is involved in the formation of primary cilium and in the maintenance of centrosomes. It cycles through the nucleus and may be capable of relaying signals between the cilium and the nucleus. Nek1 is implicated in the development of polycystic kidney disease, which is characterized by benign polycystic tumors formed by abnormal overgrowth of renal epithelial cells. It appears also to be involved in DNA damage response, and may be important for both correct DNA damage checkpoint activation and DNA repair. Nek1 is one in a family of 11 different Neks (Nek1-11) that are involved in cell cycle control. The Nek family is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	256
173759	cd08219	STKc_Nek3	Catalytic domain of the Protein Serine/Threonine Kinase, Never In Mitosis gene A (NIMA)-related kinase 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Nek3 is primarily localized in the cytoplasm and shows no cell cycle-dependent changes in its activity. It is present in the axons of neurons and affects morphogenesis and polarity through its regulation of microtubule acetylation. Nek3 modulates the signaling of the prolactin receptor through its activation of Vav2 and contributes to prolactin-mediated motility of breast cancer cells. It is one in a family of 11 different Neks (Nek1-11) that are involved in cell cycle control. The Nek family is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	255
270859	cd08220	STKc_Nek8	Catalytic domain of the Protein Serine/Threonine Kinase, Never In Mitosis gene A (NIMA)-related kinase 8. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Nek8 contains an N-terminal kinase catalytic domain and a C-terminal RCC1 (regulator of chromosome condensation) domain. A double point mutation in Nek8 causes cystic kidney disease in mice that genetically resembles human autosomal recessive polycystic kidney disease (ARPKD). Nek8 is also associated with a rare form of juvenile renal cystic disease, nephronophthisis type 9. It has been suggested that a defect in the ciliary localization of Nek8 contributes to the development of cysts manifested by these diseases. Nek8 is one in a family of 11 different Neks (Nek1-11) that are involved in cell cycle control. The Nek family is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	256
270860	cd08221	STKc_Nek9	Catalytic domain of the Protein Serine/Threonine Kinase, Never In Mitosis gene A (NIMA)-related kinase 9. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Nek9, also called Nercc1, is primarily a cytoplasmic protein but can also localize in the nucleus. It is involved in modulating chromosome alignment and splitting during mitosis. It interacts with the gamma-tubulin ring complex and the Ran GTPase, and is implicated in microtubule organization. Nek9 associates with FACT (FAcilitates Chromatin Transcription) and modulates interphase progression. It also interacts with Nek6, and Nek7, during mitosis, resulting in their activation. Nek9 is one in a family of 11 different Neks (Nek1-11) that are involved in cell cycle control. The Nek family is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	256
270861	cd08222	STKc_Nek11	Catalytic domain of the Protein Serine/Threonine Kinase, Never In Mitosis gene A (NIMA)-related kinase 11. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Nek11 is involved, through direct phosphorylation, in regulating the degradation of Cdc25A (Cell Division Cycle 25 homolog A), which plays a role in cell cycle progression and in activating cyclin dependent kinases. Nek11 is activated by CHK1 (CHeckpoint Kinase 1) and may be involved in the G2/M checkpoint. Nek11 may also play a role in the S-phase checkpoint as well as in DNA replication and genotoxic stress responses. It is one in a family of 11 different Neks (Nek1-11) that are involved in cell cycle control. The Nek family is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	260
270862	cd08223	STKc_Nek4	Catalytic domain of the Serine/Threonine Kinase, Never In Mitosis gene A (NIMA)-related kinase 4. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Nek4 is highly abundant in the testis. Its specific function is unknown. Neks are involved in the regulation of downstream processes following the activation of Cdc2, and many of their functions are cell cycle-related. They play critical roles in microtubule dynamics during ciliogenesis and mitosis. Nek4 is one in a family of 11 different Neks (Nek1-11). The Nek family is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	257
270863	cd08224	STKc_Nek6_7	Catalytic domain of the Serine/Threonine Kinases, Never In Mitosis gene A (NIMA)-related kinase 6 and 7. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Nek6 and Nek7 are the shortest Neks, consisting only of the catalytic domain and a very short N-terminal extension. They show distinct expression patterns and both appear to be downstream substrates of Nek9. They are required for mitotic spindle formation and cytokinesis. They may also be regulators of the p70 ribosomal S6 kinase. Nek6/7 is part of a family of 11 different Neks (Nek1-11) that are involved in cell cycle control. The Nek family is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	262
173765	cd08225	STKc_Nek5	Catalytic domain of the Serine/Threonine Kinase, Never In Mitosis gene A (NIMA)-related kinase 5. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Neks are involved in the regulation of downstream processes following the activation of Cdc2, and many of their functions are cell cycle-related. They play critical roles in microtubule dynamics during ciliogenesis and mitosis. The specific function of Nek5 is unknown. Nek5 is one in a family of 11 different Neks (Nek1-11). The Nek family is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	257
270864	cd08226	PK_STRAD_beta	Pseudokinase domain of STE20-related kinase adapter protein beta. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity. STRAD-beta is also referred to as ALS2CR2 (Amyotrophic lateral sclerosis 2 chromosomal region candidate gene 2 protein), since the human gene encoding it is located within the juvenile ALS2 critical region on chromosome 2q33-q34. It is not linked to the development of ALS2. STRAD forms a complex with the scaffolding protein MO25, and the serine/threonine kinase (STK), LKB1, resulting in the activation of the kinase. In the complex, LKB1 phosphorylates and activates adenosine monophosphate-activated protein kinases (AMPKs), which regulate cell energy metabolism and cell polarity. LKB1 is a tumor suppressor linked to the rare inherited disease, Peutz-Jeghers syndrome, which is characterized by a predisposition to benign polyps and hyperpigmentation of the buccal mucosa. The STRAD-beta subfamily is part of a larger superfamily that includes the catalytic domains of STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	328
173767	cd08227	PK_STRAD_alpha	Pseudokinase domain of STE20-related kinase adapter protein alpha. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity. The structure of STRAD-alpha is available and shows that this protein binds ATP, has an ordered activation loop, and adopts a closed conformation typical of fully active protein kinases. It does not possess activity due to nonconservative substitutions of essential catalytic residues. ATP binding enhances the affinity of STRAD for MO25. The conformation of STRAD-alpha, stabilized through ATP and MO25, may be needed to activate LKB1. A mutation which results in a truncation of a C-terminal part of the human STRAD-alpha pseudokinase domain and disrupts its association with LKB1, leads to PMSE (polyhydramnios, megalencephaly, symptomatic epilepsy) syndrome. Several splice variants of STRAD-alpha exist which exhibit different effects on the localization and activation of LKB1. STRAD forms a complex with the scaffolding protein MO25, and the serine/threonine kinase (STK), LKB1, resulting in the activation of the kinase. In the complex, LKB1 phosphorylates and activates adenosine monophosphate-activated protein kinases (AMPKs), which regulate cell energy metabolism and cell polarity. The STRAD alpha subfamily is part of a larger superfamily that includes the catalytic domains of STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	327
270865	cd08228	STKc_Nek6	Catalytic domain of the Serine/Threonine Kinase, Never In Mitosis gene A (NIMA)-related kinase 6. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Nek6 is required for the transition from metaphase to anaphase. It also plays important roles in mitotic spindle formation and cytokinesis. Activated by Nek9 during mitosis, Nek6 phosphorylates Eg5, a kinesin that is important for spindle bipolarity. Nek6 localizes to spindle microtubules during metaphase and anaphase, and to the midbody during cytokinesis. It is one in a family of 11 different Neks (Nek1-11) that are involved in cell cycle control. The Nek family is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	268
270866	cd08229	STKc_Nek7	Catalytic domain of the Serine/Threonine Kinase, Never In Mitosis gene A (NIMA)-related kinase 7. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Nek7 is required for mitotic spindle formation and cytokinesis. It is enriched in the centrosome and is critical for microtubule nucleation. Nek7 is activated by Nek9 during mitosis, and may regulate the p70 ribosomal S6 kinase. It is one in a family of 11 different Neks (Nek1-11) that are involved in cell cycle control. The Nek family is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	292
176192	cd08230	glucose_DH	Glucose dehydrogenase. Glucose dehydrogenase (GlcDH), a member of the medium chain dehydrogenase/zinc-dependent alcohol dehydrogenase-like family, catalyzes the NADP(+)-dependent oxidation of glucose to gluconate, the first step in the Entner-Doudoroff pathway, an alternative to or substitute for glycolysis or the pentose phosphate pathway. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH.  MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR).  The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES.  The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. Active site zinc has a catalytic role, while structural zinc aids in stability.	355
176193	cd08231	MDR_TM0436_like	Hypothetical enzyme TM0436 resembles the zinc-dependent alcohol dehydrogenases (ADH). This group contains the hypothetical TM0436 alcohol dehydrogenase from Thermotoga maritima,  proteins annotated as 5-exo-alcohol dehydrogenase, and other members of the medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family.  MDR, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH.  MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR).  The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES.  The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. Active site zinc has a catalytic role, while structural zinc aids in stability.	361
176194	cd08232	idonate-5-DH	L-idonate 5-dehydrogenase. L-idonate 5-dehydrogenase (L-ido 5-DH) catalyzes the conversion of L-idonate to 5-ketogluconate in the metabolism of L-idonate to 6-P-gluconate. In E. coli, this GntII pathway is a subsidiary pathway to the canonical GntI system, which also phosphorylates and transports gluconate.  L-ido 5-DH is found in an operon with a regulator indR, transporter idnT, 5-keto-D-gluconate 5-reductase, and Gnt kinase. L-ido 5-DH is a zinc-dependent alcohol dehydrogenase-like protein. The alcohol dehydrogenase ADH-like family of proteins is a diverse group of proteins related to the first identified member, class I mammalian ADH.  This group is also called the medium chain dehydrogenases/reductase family (MDR), which displays a broad range of activities and is distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR).  The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal GroES-like catalytic domain.  The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines.	339
176195	cd08233	butanediol_DH_like	(2R,3R)-2,3-butanediol dehydrogenase. (2R,3R)-2,3-butanediol dehydrogenase, a zinc-dependent medium chain alcohol dehydrogenase, catalyzes the NAD(+)-dependent oxidation of (2R,3R)-2,3-butanediol and meso-butanediol to acetoin. BDH functions as a homodimer.  NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones.  The medium chain alcohol dehydrogenase family (MDR) has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The N-terminal region typically has an all-beta catalytic domain. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit. Sorbitol and aldose reductase are NAD(+) binding proteins of the polyol pathway, which interconverts glucose and fructose. Sorbitol dehydrogenase is tetrameric and has a single catalytic zinc per subunit.	351
176196	cd08234	threonine_DH_like	L-threonine dehydrogenase. L-threonine dehydrogenase (TDH) catalyzes the zinc-dependent formation of 2-amino-3-ketobutyrate from L-threonine, via NAD(H)-dependent oxidation.  TDH is a member of the zinc-requiring, medium chain NAD(H)-dependent alcohol dehydrogenase family (MDR). MDRs have a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. The N-terminal region typically has an all-beta catalytic domain. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit. Sorbitol and aldose reductase are NAD(+) binding proteins of the polyol pathway, which interconverts glucose and fructose.	334
176197	cd08235	iditol_2_DH_like	L-iditol 2-dehydrogenase. Putative L-iditol 2-dehydrogenase based on annotation of some members in this subgroup.  L-iditol 2-dehydrogenase catalyzes the NAD+-dependent conversion of L-iditol to L-sorbose in fructose and mannose metabolism. This enzyme is related to sorbitol dehydrogenase, alcohol dehydrogenase, and other medium chain dehydrogenase/reductases. The zinc-dependent alcohol dehydrogenase (ADH-Zn)-like family of proteins is a diverse group of proteins related to the first identified member, class I mammalian ADH.  This group is also called the medium chain dehydrogenases/reductase family (MDR) to highlight its broad range of activities and to distinguish from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR).  The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal GroES-like catalytic domain.  The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol  dehydrogenases (ADHs) catalyze the  NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones.  Active site zinc has a catalytic role, while structural zinc aids in stability.  ADH-like proteins  typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines.	343
176198	cd08236	sugar_DH	NAD(P)-dependent sugar dehydrogenases. This group contains proteins identified as sorbitol dehydrogenases and other sugar dehydrogenases of the medium-chain dehydrogenase/reductase family (MDR), which includes zinc-dependent alcohol dehydrogenase and related proteins. Sorbitol and aldose reductase are NAD(+) binding proteins of the polyol pathway, which interconverts glucose and fructose. Sorbitol dehydrogenase is tetrameric and has a single catalytic zinc per subunit. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. Related proteins include threonine dehydrogenase, formaldehyde dehydrogenase, and butanediol dehydrogenase. The medium chain alcohol dehydrogenase family (MDR) has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The N-terminal region typically has an all-beta catalytic domain. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit. Horse liver alcohol dehydrogenase is a dimeric enzyme and each subunit has two domains. The NAD binding domain is in a Rossmann fold and the catalytic domain contains a zinc ion to which substrates bind. There is a cleft between the domains that closes upon formation of the ternary complex.	343
176199	cd08237	ribitol-5-phosphate_DH	ribitol-5-phosphate dehydrogenase. NAD-linked ribitol-5-phosphate dehydrogenase, a member of the MDR/zinc-dependent alcohol dehydrogenase-like family, oxidizes the phosphate ester of ribitol-5-phosphate to xylulose-5-phosphate of the pentose phosphate pathway. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH.  MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR).  The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES.  The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the  NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. Active site zinc has a catalytic role, while structural zinc aids in stability.	341
176200	cd08238	sorbose_phosphate_red	L-sorbose-1-phosphate reductase. L-sorbose-1-phosphate reductase, a member of the MDR family, catalyzes the NADPH-dependent conversion of l-sorbose 1-phosphate to d-glucitol 6-phosphate in the metabolism of L-sorbose to  (also converts d-fructose 1-phosphate to d-mannitol 6-phosphate).  The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH.  MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR).  The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES.  The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. Active site zinc has a catalytic role, while structural zinc aids in stability.	410
176201	cd08239	THR_DH_like	L-threonine dehydrogenase (TDH)-like. MDR/ADH-like proteins, including a protein annotated as a threonine dehydrogenase. L-threonine dehydrogenase (TDH) catalyzes the zinc-dependent formation of 2-amino-3-ketobutyrate from L-threonine via NAD(H)-dependent oxidation. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones.  Zinc-dependent ADHs are medium chain dehydrogenase/reductase type proteins (MDRs) and have a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The N-terminal region typically has an all-beta catalytic domain. In addition to alcohol dehydrogenases, this group includes quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others.  These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines.	339
176202	cd08240	6_hydroxyhexanoate_dh_like	6-hydroxyhexanoate dehydrogenase. 6-hydroxyhexanoate dehydrogenase, an enzyme of the zinc-dependent alcohol dehydrogenase-like family of medium chain dehydrogenases/reductases, catalyzes the conversion of 6-hydroxyhexanoate and NAD(+) to 6-oxohexanoate + NADH and H+.  NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones.  Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation.  ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form.  The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide.  A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone.  The N-terminal catalytic domain has a distant homology to GroES.  These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain.  NAD(H)-binding occurs in the cleft between the catalytic and coenzyme-binding domains, at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of a histidine, the ribose of NAD, a serine, then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction.	350
176203	cd08241	QOR1	Quinone oxidoreductase (QOR). QOR catalyzes the conversion of a quinone + NAD(P)H to a hydroquinone + NAD(P)+. Quinones are cyclic diones derived from aromatic compounds. Membrane bound QOR acts in the respiratory chains of bacteria and mitochondria, while soluble QOR acts to protect from toxic quinones (e.g. DT-diaphorase) or as a soluble eye-lens protein in some vertebrates (e.g. zeta-crystallin). QOR reduces quinones through a semi-quinone intermediate via a NAD(P)H-dependent single electron transfer. QOR is a member of the medium chain dehydrogenase/reductase family, but lacks the zinc-binding sites of the prototypical alcohol dehydrogenases of this group.  NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones.  Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation.  ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form.  The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide.  A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone.  The N-terminal catalytic domain has a distant homology to GroES.  These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site, and a structural zinc in a lobe of the catalytic domain.  NAD(H)-binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of a histidine, the ribose of NAD, a serine, then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction.	323
176204	cd08242	MDR_like	Medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family. This group contains members identified as related to zinc-dependent alcohol dehydrogenase and other members of the MDR family, including threonine dehydrogenase. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH.  MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR).  The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES.  The MDR group includes various activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the  NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. Active site zinc has a catalytic role, while structural zinc aids in stability.  ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines.	319
176205	cd08243	quinone_oxidoreductase_like_1	Quinone oxidoreductase (QOR). NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones.  The medium chain alcohol dehydrogenase family (MDR) has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The N-terminal region typically has an all-beta catalytic domain. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit.	320
176206	cd08244	MDR_enoyl_red	Possible enoyl reductase. Member identified as possible enoyl reductase of the MDR family. 2-enoyl thioester reductase (ETR) catalyzes the NADPH-dependent conversion of trans-2-enoyl acyl carrier protein/coenzyme A (ACP/CoA) to acyl-(ACP/CoA) in fatty acid synthesis. 2-enoyl thioester reductase activity has been linked in Candida tropicalis as essential in maintaining mitochondrial respiratory function. This ETR family is a part of the medium chain dehydrogenase/reductase family, but lacks the zinc coordination sites characteristic of the alcohol dehydrogenases in this family. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones.  Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation.  ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form.  The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide.  The N-terminal catalytic domain has a distant homology to GroES.  These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site, and a structural zinc in a lobe of the catalytic domain.  NAD(H)-binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding.  Candida tropicalis enoyl thioester reductase (Etr1p) catalyzes the NADPH-dependent reduction of trans-2-enoyl thioesters in mitochondrial fatty acid synthesis. Etr1p forms homodimers, with each subunit containing a nucleotide-binding Rossmann fold domain and a catalytic domain.	324
176207	cd08245	CAD	Cinnamyl alcohol dehydrogenases (CAD) and related proteins. Cinnamyl alcohol dehydrogenases (CAD), members of the medium chain dehydrogenase/reductase family, reduce cinnamaldehydes to cinnamyl alcohols in the last step of monolignol metabolism in plant cell walls. CAD binds 2 zinc ions and is NADPH-dependent. CAD family members are also found in non-plant species, e.g. in yeast where they have an aldehyde reductase activity. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH.  MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR).  The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES.  The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones.  Active site zinc has a catalytic role, while structural zinc aids in stability.  ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines.	330
176208	cd08246	crotonyl_coA_red	crotonyl-CoA reductase. Crotonyl-CoA reductase, a member of the medium chain dehydrogenase/reductase family, catalyzes the NADPH-dependent conversion of crotonyl-CoA to butyryl-CoA, a step in (2S)-methylmalonyl-CoA  production for straight-chain fatty acid biosynthesis.  Like enoyl reductase, another enzyme in fatty acid synthesis, crotonyl-CoA reductase is a member of the zinc-dependent alcohol dehydrogenase-like medium chain dehydrogenase/reductase family. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH.  MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES.	393
176209	cd08247	AST1_like	AST1 is a cytoplasmic protein associated with the periplasmic membrane in yeast. This group contains members identified in targeting of yeast membrane proteins ATPase. AST1 is a cytoplasmic protein associated with the periplasmic membrane in yeast, identified as a multicopy suppressor of pma1 mutants which cause temperature sensitive growth arrest due to the inability of ATPase to target to the cell surface. This family is homologous to the medium chain family of dehydrogenases and reductases. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR).  The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES.	352
176210	cd08248	RTN4I1	Human Reticulon 4 Interacting Protein 1. Human Reticulon 4 Interacting Protein 1 is a member of the medium chain dehydrogenase/reductase (MDR) family. Reticulons are endoplasmic reticulum associated proteins involved in membrane trafficking and neuroendocrine secretion. The MDR/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH.  MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR).  The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES.	350
176211	cd08249	enoyl_reductase_like	enoyl_reductase_like. Member identified as possible enoyl reductase of the MDR family. 2-enoyl thioester reductase (ETR) catalyzes the NADPH-dependent conversion of trans-2-enoyl acyl carrier protein/coenzyme A (ACP/CoA) to acyl-(ACP/CoA) in fatty acid synthesis. 2-enoyl thioester reductase activity has been linked in Candida tropicalis as essential in maintaining mitochondrial respiratory function. This ETR family is a part of the medium chain dehydrogenase/reductase family, but lacks the zinc coordination sites characteristic of the alcohol dehydrogenases in this family. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones.  Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation.  ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide.  The N-terminal catalytic domain has a distant homology to GroES.  These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site, and a structural zinc in a lobe of the catalytic domain.  NAD(H)-binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding.  Candida tropicalis enoyl thioester reductase (Etr1p) catalyzes the NADPH-dependent reduction of trans-2-enoyl thioesters in mitochondrial fatty acid synthesis. Etr1p forms homodimers, with each subunit containing a nucleotide-binding Rossmann fold domain and a catalytic domain.	339
176212	cd08250	Mgc45594_like	Mgc45594 gene product and other MDR family members. Includes Human Mgc45594 gene product of undetermined function. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH.  MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES.	329
176213	cd08251	polyketide_synthase	polyketide synthase. Polyketide synthases produce polyketides in a step-by-step mechanism that is similar to fatty acid synthesis. Enoyl reductase reduces a double bond to a single bond. Erythromycin is one example of a polyketide generated by 3 complex enzymes (megasynthases). 2-enoyl thioester reductase (ETR) catalyzes the NADPH-dependent conversion of trans-2-enoyl acyl carrier protein/coenzyme A (ACP/CoA) to acyl-(ACP/CoA) in fatty acid synthesis. 2-enoyl thioester reductase activity has been linked in Candida tropicalis as essential in maintaining mitochondrial respiratory function. This ETR family is a part of the medium chain dehydrogenase/reductase family, but lacks the zinc coordination sites characteristic of the alcohol dehydrogenases in this family. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation.  ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form.  The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site, and a structural zinc in a lobe of the catalytic domain. NAD(H)-binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding.	303
176214	cd08252	AL_MDR	Arginate lyase and other MDR family members. This group contains a structure identified as an arginate lyase. Other members are identified as quinone reductases, alginate lyases, and other proteins related to the zinc-dependent dehydrogenases/reductases. QOR catalyzes the conversion of a quinone and NAD(P)H to a hydroquinone and NAD(P)+. Quinones are cyclic diones derived from aromatic compounds. Membrane bound QOR acts in the respiratory chains of bacteria and mitochondria, while soluble QOR acts to protect from toxic quinones (e.g. DT-diaphorase) or as a soluble eye-lens protein in some vertebrates (e.g. zeta-crystallin). QOR reduces quinones through a semi-quinone intermediate via a NAD(P)H-dependent single electron transfer. QOR is a member of the medium chain dehydrogenase/reductase family, but lacks the zinc-binding sites of the prototypical alcohol dehydrogenases of this group.  Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form.  The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide.  The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. NAD(H)-binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of a histidine, the ribose of NAD, a serine, then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction.	336
176215	cd08253	zeta_crystallin	Zeta-crystallin with NADP-dependent quinone reductase activity (QOR). Zeta-crystallin is an eye lens protein with NADP-dependent quinone reductase activity (QOR). It has been cited as a structural component in mammalian eyes, but also has homology to quinone reductases in unrelated species. QOR catalyzes the conversion of a quinone and NAD(P)H to a hydroquinone and NAD(P)+. Quinones are cyclic diones derived from aromatic compounds. Membrane bound QOR acts in the respiratory chains of bacteria and mitochondria, while soluble QOR acts to protect from toxic quinones (e.g. DT-diaphorase) or as a soluble eye-lens protein in some vertebrates (e.g. zeta-crystallin). QOR reduces quinones through a semi-quinone intermediate via a NAD(P)H-dependent single electron transfer. QOR is a member of the medium chain dehydrogenase/reductase family, but lacks the zinc-binding sites of the prototypical alcohol dehydrogenases of this group.  Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation.  ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form.  The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide.  The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site, and a structural zinc in a lobe of the catalytic domain.  NAD(H)-binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of a histidine, the ribose of NAD, a serine, then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction.	325
176216	cd08254	hydroxyacyl_CoA_DH	6-hydroxycyclohex-1-ene-1-carboxyl-CoA dehydrogenase, N-benzyl-3-pyrrolidinol dehydrogenase, and other MDR family members. This group contains enzymes of the zinc-dependent alcohol dehydrogenase family, including members (aka MDR) identified as 6-hydroxycyclohex-1-ene-1-carboxyl-CoA dehydrogenase and N-benzyl-3-pyrrolidinol dehydrogenase. 6-hydroxycyclohex-1-ene-1-carboxyl-CoA dehydrogenase catalyzes the conversion of 6-Hydroxycyclohex-1-enecarbonyl-CoA and NAD+ to 6-Ketoxycyclohex-1-ene-1-carboxyl-CoA, NADH, and H+. This group displays the characteristic catalytic and structural zinc sites of the zinc-dependent alcohol dehydrogenases. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones.  Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form.  The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone. The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. NAD(H)-binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of a histidine, the ribose of NAD, a serine, then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction.	338
176217	cd08255	2-desacetyl-2-hydroxyethyl_bacteriochlorophyllide_like	2-desacetyl-2-hydroxyethyl bacteriochlorophyllide and other MDR family members. This subgroup of the medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family has members identified as 2-desacetyl-2-hydroxyethyl bacteriochlorophyllide A dehydrogenase and alcohol dehydrogenases. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH.  MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR).  The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES.  The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the  NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones.  Active site zinc has a catalytic role, while structural zinc aids in stability.	277
176218	cd08256	Zn_ADH2	Alcohol dehydrogenases of the MDR family. This group has the characteristic catalytic and structural zinc-binding sites of the zinc-dependent alcohol dehydrogenases of the MDR family. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH.  MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR).  The MDR proteins have 2 domains: a C-terminal NAD(P)-binding Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES.  The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones.  Active site zinc has a catalytic role, while structural zinc aids in stability.	350
176219	cd08258	Zn_ADH4	Alcohol dehydrogenases of the MDR family. This group shares the zinc coordination sites of the zinc-dependent alcohol dehydrogenases. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH.  MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P)-binding Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES.  The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones.  Active site zinc has a catalytic role, while structural zinc aids in stability.  ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines.	306
176220	cd08259	Zn_ADH5	Alcohol dehydrogenases of the MDR family. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. This group contains proteins that share the characteristic catalytic and structural zinc-binding sites of the zinc-dependent alcohol dehydrogenase family.  Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form.  The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone. The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. NAD(H)-binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of a histidine (His-51), the ribose of NAD, a serine (Ser-48), then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction.	332
176221	cd08260	Zn_ADH6	Alcohol dehydrogenases of the MDR family. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. This group has the characteristic catalytic and structural zinc sites of the zinc-dependent alcohol dehydrogenases.  Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form.  The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone. The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. NAD(H)-binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of a histidine, the ribose of NAD, a serine, then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction.	345
176222	cd08261	Zn_ADH7	Alcohol dehydrogenases of the MDR family. This group contains members identified as related to zinc-dependent alcohol dehydrogenase and other members of the MDR family. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH.  MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR).  The MDR proteins have 2 domains: a C-terminal NAD(P)-binding Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES.  The MDR group includes various activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. Active site zinc has a catalytic role, while structural zinc aids in stability.  ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines.	337
176223	cd08262	Zn_ADH8	Alcohol dehydrogenases of the MDR family. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH.  MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR).  The MDR proteins have 2 domains: a C-terminal NAD(P)-binding Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES.  The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones.  Active site zinc has a catalytic role, while structural zinc aids in stability.  ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines.	341
176224	cd08263	Zn_ADH10	Alcohol dehydrogenases of the MDR family. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones.  Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation.  ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form.  The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide.  A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone.  The N-terminal catalytic domain has a distant homology to GroES.  These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain.  NAD(H)-binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of a histidine, the ribose of NAD, a serine, then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction.	367
176225	cd08264	Zn_ADH_like2	Alcohol dehydrogenases of the MDR family. This group resembles the zinc-dependent alcohol dehydrogenases of the medium chain dehydrogenase family. However, this subgroup does not contain the characteristic catalytic zinc site. Also, it contains an atypical structural zinc-binding pattern: DxxCxxCxxxxxxxC. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones.  Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form.  The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone. The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. NAD(H)-binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of a histidine, the ribose of NAD, a serine, then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction.	325
176226	cd08265	Zn_ADH3	Alcohol dehydrogenases of the MDR family. This group resembles the zinc-dependent alcohol dehydrogenase and has the catalytic and structural zinc-binding sites characteristic of this group. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH.  MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR).  The MDR proteins have 2 domains: a C-terminal NAD(P)-binding Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES.  The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones.  Active site zinc has a catalytic role, while structural zinc aids in stability.  ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. Other MDR members have only a catalytic zinc, and some contain no coordinated zinc.	384
176227	cd08266	Zn_ADH_like1	Alcohol dehydrogenases of the MDR family. This group contains proteins related to the zinc-dependent alcohol dehydrogenases. However, while the group has the structural zinc site characteristic of these enzymes, it lacks the consensus site for a catalytic zinc. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones.  Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form.  The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide. A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone. The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site, and a structural zinc in a lobe of the catalytic domain. NAD(H)-binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of a histidine, the ribose of NAD, a serine, then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction.	342
176228	cd08267	MDR1	Medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family. This group is a member of the medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, but lacks the zinc-binding sites of the zinc-dependent alcohol dehydrogenases. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH.  MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR).  The MDR proteins have 2 domains: a C-terminal NAD(P)-binding Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES.  The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the  NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones.  Active site zinc has a catalytic role, while structural zinc aids in stability.  ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines.	319
176229	cd08268	MDR2	Medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family. This group is a member of the medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, but lacks the zinc-binding sites of the zinc-dependent alcohol dehydrogenases. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH.  MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR).  The MDR proteins have 2 domains: a C-terminal NAD(P)-binding Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES.  The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the  NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones.  Active site zinc has a catalytic role, while structural zinc aids in stability.  ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines.	328
176230	cd08269	Zn_ADH9	Alcohol dehydrogenases of the MDR family. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH.  MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR).  The MDR proteins have 2 domains: a C-terminal NAD(P)-binding Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES.  The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones.  Active site zinc has a catalytic role, while structural zinc aids in stability.	312
176231	cd08270	MDR4	Medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family. This group is a member of the medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, but lacks the zinc-binding sites of the zinc-dependent alcohol dehydrogenases. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH.  MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR).  The MDR proteins have 2 domains: a C-terminal NAD(P)-binding Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES.  The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the  NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones.  Active site zinc has a catalytic role, while structural zinc aids in stability.  ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines.	305
176232	cd08271	MDR5	Medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family. This group is a member of the medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, but lacks the zinc-binding sites of the zinc-dependent alcohol dehydrogenases. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH.  MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR).  The MDR proteins have 2 domains: a C-terminal NAD(P)-binding Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES.  The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the  NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones.  Active site zinc has a catalytic role, while structural zinc aids in stability.  ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines.	325
176233	cd08272	MDR6	Medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family. This group is a member of the medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, but lacks the zinc-binding sites of the zinc-dependent alcohol dehydrogenases. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH.  MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR).  The MDR proteins have 2 domains: a C-terminal NAD(P)-binding Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES.  The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the  NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones.  Active site zinc has a catalytic role, while structural zinc aids in stability.  ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines.	326
176234	cd08273	MDR8	Medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family. This group is a member of the medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, but lacks the zinc-binding sites of the zinc-dependent alcohol dehydrogenases. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH.  MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR).  The MDR proteins have 2 domains: a C-terminal NAD(P)-binding Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES.  The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the  NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones.  Active site zinc has a catalytic role, while structural zinc aids in stability.  ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines.	331
176235	cd08274	MDR9	Medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family. This group is a member of the medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, but lacks the zinc-binding sites of the zinc-dependent alcohol dehydrogenases. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH.  MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR).  The MDR proteins have 2 domains: a C-terminal NAD(P)-binding Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES.  The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the  NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones.  Active site zinc has a catalytic role, while structural zinc aids in stability.  ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines.	350
176236	cd08275	MDR3	Medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family. This group is a member of the medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, but lacks the zinc-binding sites of the zinc-dependent alcohol dehydrogenases. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH.  MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR).  The MDR proteins have 2 domains: a C-terminal NAD(P)-binding Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES.  The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the  NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones.  Active site zinc has a catalytic role, while structural zinc aids in stability.  ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines.	337
176237	cd08276	MDR7	Medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family. This group is a member of the medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, but lacks the zinc-binding sites of the zinc-dependent alcohol dehydrogenases. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH.  MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR).  The MDR proteins have 2 domains: a C-terminal NAD(P)-binding Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES.  The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the  NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones.  Active site zinc has a catalytic role, while structural zinc aids in stability.  ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines.	336
176238	cd08277	liver_alcohol_DH_like	Liver alcohol dehydrogenase. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones.  Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation.  There are 7 vertebrate ADH classes, 6 of which have been identified in humans. Class III, glutathione-dependent formaldehyde dehydrogenase, has been identified as the primordial form and exists in diverse species, including plants, micro-organisms, vertebrates, and invertebrates. Class I, typified by liver dehydrogenase, is an evolving form. Gene duplication and functional specialization of ADH into ADH classes and subclasses created numerous forms in vertebrates.  For example, the A, B and C (formerly alpha, beta, gamma) human class I subunits have high overall structural similarity, but differ in the substrate binding pocket and therefore in substrate specificity. In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of a histidine (His-51), the ribose of NAD, a serine (Ser-48), then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide.  A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone.  The N-terminal catalytic domain has a distant homology to GroES.  These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain.  NAD(H) binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding.	365
176239	cd08278	benzyl_alcohol_DH	Benzyl alcohol dehydrogenase. Benzyl alcohol dehydrogenase is similar to liver alcohol dehydrogenase, but has some amino acid substitutions near the active site, which may determine the enzyme's specificity for oxidizing aromatic substrates.  Also known as aryl-alcohol dehydrogenases, they catalyze the conversion of an aromatic alcohol + NAD+ to an aromatic aldehyde + NADH + H+.  NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones.  Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation.  ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form.  The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide.  A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone.  The N-terminal catalytic domain has a distant homology to GroES.  These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain.  NAD(H) binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding.  In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of a histidine, the ribose of NAD, a serine, then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction.	365
176240	cd08279	Zn_ADH_class_III	Class III alcohol dehydrogenase. Glutathione-dependent formaldehyde dehydrogenases (FDHs, Class III ADH) are members of the zinc-dependent/medium chain alcohol dehydrogenase family.  FDH converts formaldehyde and NAD(P) to formate and NAD(P)H. The initial step in this process is the spontaneous formation of a S-(hydroxymethyl)glutathione adduct from formaldehyde and glutathione, followed by FDH-mediated oxidation (and detoxification) of the adduct to S-formylglutathione. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes or ketones.  Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation. Class III ADHs are also known as glutathione-dependent formaldehyde dehydrogenases (FDH), which convert aldehydes to the corresponding carboxylic acid and alcohol.  ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide.  A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone.  The N-terminal catalytic domain has a distant homology to GroES.  These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain.  NAD(H) binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding.	363
176241	cd08281	liver_ADH_like1	Zinc-dependent alcohol dehydrogenases (ADH) and class III ADH (AKA formaldehyde dehydrogenase). NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes or ketones. This group contains members identified as zinc-dependent alcohol dehydrogenases (ADH), and class III ADH (aka formaldehyde dehydrogenase, FDH). Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation.  Class III ADHs are also known as glutathione-dependent formaldehyde dehydrogenases (FDH), which convert aldehydes to the corresponding carboxylic acid and alcohol.  ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide.  A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone. The N-terminal catalytic domain has a distant homology to GroES.  These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain.  NAD(H) binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of a histidine, the ribose of NAD, a serine, then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction.	371
176242	cd08282	PFDH_like	Pseudomonas putida aldehyde-dismutating formaldehyde dehydrogenase (PFDH). Formaldehyde dehydrogenase (FDH) is a member of the zinc-dependent/medium chain alcohol dehydrogenase family.  Unlike typical FDH, Pseudomonas putida aldehyde-dismutating FDH (PFDH) is glutathione-independent.  PFDH converts 2 molecules of aldehyde to the corresponding carboxylic acid and alcohol.  The MDR family uses NAD(H) as a cofactor in the interconversion of alcohols and aldehydes, or ketones. Like the zinc-dependent alcohol dehydrogenases (ADH) of the medium chain alcohol dehydrogenase/reductase family (MDR), these tetrameric FDHs have a catalytic zinc that resides between the catalytic and NAD(H)-binding domains and a structural zinc in a lobe of the catalytic domain. Unlike ADH, where NAD(P)(H) acts as a cofactor, NADH in FDH is a tightly bound redox cofactor (similar to nicotinamide proteins).  The medium chain alcohol dehydrogenase family (MDR) has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The N-terminal region typically has an all-beta catalytic domain. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit.	375
176243	cd08283	FDH_like_1	Glutathione-dependent formaldehyde dehydrogenase related proteins, child 1. Members identified as glutathione-dependent formaldehyde dehydrogenase (FDH), a member of the zinc-dependent/medium chain alcohol dehydrogenase family.  FDH converts formaldehyde and NAD(P) to formate and NAD(P)H. The initial step in this process is the spontaneous formation of a S-(hydroxymethyl)glutathione adduct from formaldehyde and glutathione, followed by FDH-mediated oxidation (and detoxification) of the adduct to S-formylglutathione.  The MDR family uses NAD(H) as a cofactor in the interconversion of alcohols and aldehydes, or ketones. Like many zinc-dependent alcohol dehydrogenases (ADH) of the medium chain alcohol dehydrogenase/reductase family (MDR), these FDHs form dimers, with 4 zinc ions per dimer. The medium chain alcohol dehydrogenase family (MDR) has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The N-terminal region typically has an all-beta catalytic domain. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit.	386
176244	cd08284	FDH_like_2	Glutathione-dependent formaldehyde dehydrogenase related proteins, child 2. Glutathione-dependent formaldehyde dehydrogenases (FDHs) are members of the zinc-dependent/medium chain alcohol dehydrogenase family.  FDH converts formaldehyde and NAD to formate and NADH. The initial step in this process is the spontaneous formation of a S-(hydroxymethyl)glutathione adduct from formaldehyde and glutathione, followed by FDH-mediated oxidation (and detoxification) of the adduct to S-formylglutathione.  These tetrameric FDHs have a catalytic zinc that resides between the catalytic and NAD(H)-binding domains and a structural zinc in a lobe of the catalytic domain. The medium chain alcohol dehydrogenase family (MDR) has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The N-terminal region typically has an all-beta catalytic domain. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit.	344
176245	cd08285	NADP_ADH	NADP(H)-dependent alcohol dehydrogenases. This group is dominated by atypical alcohol dehydrogenases; they exist as tetramers and exhibit specificity for NADP(H) as a cofactor in the interconversion of alcohols and aldehydes, or ketones.  Like other zinc-dependent alcohol dehydrogenases (ADH) of the medium chain alcohol dehydrogenase/reductase family (MDR), tetrameric ADHs have a catalytic zinc that resides between the catalytic and NAD(H)-binding domains; however, they do not have a structural zinc in a lobe of the catalytic domain.  The medium chain alcohol dehydrogenase family (MDR) has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The N-terminal region typically has an all-beta catalytic domain. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit.	351
176246	cd08286	FDH_like_ADH2	Formaldehyde dehydrogenase (FDH)-like. This group is related to formaldehyde dehydrogenase (FDH), which is a member of the zinc-dependent/medium chain alcohol dehydrogenase family.  This family uses NAD(H) as a cofactor in the interconversion of alcohols and aldehydes, or ketones. Another member is identified as a dihydroxyacetone reductase. Like the zinc-dependent alcohol dehydrogenases (ADH) of the medium chain alcohol dehydrogenase/reductase family (MDR), tetrameric FDHs have a catalytic zinc that resides between the catalytic and NAD(H)-binding domains and a structural zinc in a lobe of the catalytic domain. Unlike ADH, where NAD(P)(H) acts as a cofactor, NADH in FDH is a tightly bound redox cofactor (similar to nicotinamide proteins). The medium chain alcohol dehydrogenase family (MDR) has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The N-terminal region typically has an all-beta catalytic domain. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit.	345
176247	cd08287	FDH_like_ADH3	Formaldehyde dehydrogenase (FDH)-like. This group contains proteins identified as alcohol dehydrogenases and glutathione-dependent formaldehyde dehydrogenases (FDH) of the zinc-dependent/medium chain alcohol dehydrogenase family.  The MDR family uses NAD(H) as a cofactor in the interconversion of alcohols and aldehydes, or ketones.  FDH converts formaldehyde and NAD to formate and NADH. The initial step in this process is the spontaneous formation of a S-(hydroxymethyl)glutathione adduct from formaldehyde and glutathione, followed by FDH-mediated oxidation (and detoxification) of the adduct to S-formylglutathione. The medium chain alcohol dehydrogenase family (MDR) has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The N-terminal region typically has an all-beta catalytic domain. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit.	345
176248	cd08288	MDR_yhdh	Yhdh putative quinone oxidoreductases. Yhdh putative quinone oxidoreductases (QOR). QOR catalyzes the conversion of a quinone + NAD(P)H to a hydroquinone + NAD(P)+. Quinones are cyclic diones derived from aromatic compounds. Membrane-bound QOR acts in the respiratory chains of bacteria and mitochondria, while soluble QOR acts to protect from toxic quinones (e.g. DT-diaphorase) or as a soluble eye-lens protein in some vertebrates (e.g. zeta-crystallin). QOR reduces quinones through a semi-quinone intermediate via a NAD(P)H-dependent single electron transfer. QOR is a member of the medium chain dehydrogenase/reductase family, but lacks the zinc-binding sites of the prototypical alcohol dehydrogenases of this group.  NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones.  Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation.  ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form.  The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide.  A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone. The N-terminal catalytic domain has a distant homology to GroES.  These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain.  NAD(H) binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding.  In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of a histidine, the ribose of NAD, a serine, then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction.	324
176249	cd08289	MDR_yhfp_like	Yhfp putative quinone oxidoreductases. Yhfp putative quinone oxidoreductases (QOR). QOR catalyzes the conversion of a quinone + NAD(P)H to a hydroquinone + NAD(P)+. Quinones are cyclic diones derived from aromatic compounds. Membrane-bound QOR acts in the respiratory chains of bacteria and mitochondria, while soluble QOR acts to protect from toxic quinones (e.g. DT-diaphorase) or as a soluble eye-lens protein in some vertebrates (e.g. zeta-crystallin). QOR reduces quinones through a semi-quinone intermediate via a NAD(P)H-dependent single electron transfer. QOR is a member of the medium chain dehydrogenase/reductase family, but lacks the zinc-binding sites of the prototypical alcohol dehydrogenases of this group.  NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones.  Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation.  ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form.  The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide.  A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone.  The N-terminal catalytic domain has a distant homology to GroES.  These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site, and a structural zinc in a lobe of the catalytic domain.  NAD(H) binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of a histidine, the ribose of NAD, a serine, then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction.	326
176250	cd08290	ETR	2-enoyl thioester reductase (ETR). 2-enoyl thioester reductase (ETR) catalyzes the NADPH-dependent conversion of trans-2-enoyl acyl carrier protein/coenzyme A (ACP/CoA) to acyl-(ACP/CoA) in fatty acid synthesis. In Candida tropicalis, 2-enoyl thioester reductase activity has been identified as essential for maintaining mitochondrial respiratory function. This ETR family is a part of the medium chain dehydrogenase/reductase family, but lacks the zinc coordination sites characteristic of the alcohol dehydrogenases in this family. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones.  Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation.  ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide.  The N-terminal catalytic domain has a distant homology to GroES. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site, and a structural zinc in a lobe of the catalytic domain.  NAD(H) binding occurs in the cleft between the catalytic and coenzyme-binding domains, at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. Candida tropicalis enoyl thioester reductase (Etr1p) catalyzes the NADPH-dependent reduction of trans-2-enoyl thioesters in mitochondrial fatty acid synthesis. Etr1p forms homodimers, with each subunit containing a nucleotide-binding Rossmann fold domain and a catalytic domain.	341
176251	cd08291	ETR_like_1	2-enoyl thioester reductase (ETR) like proteins, child 1. 2-enoyl thioester reductase (ETR) like proteins. ETR catalyzes the NADPH-dependent conversion of trans-2-enoyl acyl carrier protein/coenzyme A (ACP/CoA) to acyl-(ACP/CoA) in fatty acid synthesis. In Candida tropicalis, 2-enoyl thioester reductase activity has been identified as essential for maintaining mitochondrial respiratory function. This ETR family is a part of the medium chain dehydrogenase/reductase family, but lacks the zinc coordination sites characteristic of the alcohol dehydrogenases in this family. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones.  Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation.  ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form.  The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide.  The N-terminal catalytic domain has a distant homology to GroES.  These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain.  NAD(H) binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. Candida tropicalis enoyl thioester reductase (Etr1p) catalyzes the NADPH-dependent reduction of trans-2-enoyl thioesters in mitochondrial fatty acid synthesis. Etr1p forms homodimers, with each subunit containing a nucleotide-binding Rossmann fold domain and a catalytic domain.	324
176252	cd08292	ETR_like_2	2-enoyl thioester reductase (ETR) like proteins, child 2. 2-enoyl thioester reductase (ETR) like proteins. ETR catalyzes the NADPH-dependent conversion of trans-2-enoyl acyl carrier protein/coenzyme A (ACP/CoA) to acyl-(ACP/CoA) in fatty acid synthesis. In Candida tropicalis, 2-enoyl thioester reductase activity has been identified as essential for maintaining mitochondrial respiratory function. This ETR family is a part of the medium chain dehydrogenase/reductase family, but lacks the zinc coordination sites characteristic of the alcohol dehydrogenases in this family. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones.  Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation.  ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form.  The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide.  The N-terminal catalytic domain has a distant homology to GroES.  These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site, and a structural zinc in a lobe of the catalytic domain.  NAD(H) binding occurs in the cleft between the catalytic and coenzyme-binding domains, at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding.  Candida tropicalis enoyl thioester reductase (Etr1p) catalyzes the NADPH-dependent reduction of trans-2-enoyl thioesters in mitochondrial fatty acid synthesis. Etr1p forms homodimers, with each subunit containing a nucleotide-binding Rossmann fold domain and a catalytic domain.	324
176253	cd08293	PTGR2	Prostaglandin reductase. Prostaglandins and related eicosanoids are metabolized by the oxidation of the 15(S)-hydroxyl group by the NAD+-dependent (type I 15-PGDH) 15-prostaglandin dehydrogenase (15-PGDH) followed by reduction by NADPH/NADH-dependent (type II 15-PGDH) delta-13 15-prostaglandin reductase (13-PGR) to 15-keto-13,14-dihydroprostaglandins. 13-PGR is a bifunctional enzyme, since it also has leukotriene B(4) 12-hydroxydehydrogenase activity. These 15-PGDH and related enzymes are members of the medium chain dehydrogenase/reductase family. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH.  MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P)-binding Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES.	345
176254	cd08294	leukotriene_B4_DH_like	13-PGR is a bifunctional enzyme with delta-13 15-prostaglandin reductase and leukotriene B4 12-hydroxydehydrogenase activity. Prostaglandins and related eicosanoids are metabolized by the oxidation of the 15(S)-hydroxyl group by the NAD+-dependent (type I 15-PGDH) 15-prostaglandin dehydrogenase (15-PGDH) followed by reduction by NADPH/NADH-dependent (type II 15-PGDH) delta-13 15-prostaglandin reductase (13-PGR) to 15-keto-13,14-dihydroprostaglandins. 13-PGR is a bifunctional enzyme, since it also has leukotriene B(4) 12-hydroxydehydrogenase activity. These 15-PGDH and related enzymes are members of the medium chain dehydrogenase/reductase family. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH.  MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P)-binding Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES.	329
176255	cd08295	double_bond_reductase_like	Arabidopsis alkenal double bond reductase and leukotriene B4 12-hydroxydehydrogenase. This group includes proteins identified as the Arabidopsis alkenal double bond reductase and leukotriene B4 12-hydroxydehydrogenase.  The Arabidopsis enzyme, a member of the medium chain dehydrogenase/reductase family, catalyzes the reduction of the 7-8 double bond of phenylpropanal substrates as a plant defense mechanism.  Prostaglandins and related eicosanoids (lipid mediators involved in host defense and inflammation) are metabolized by the oxidation of the 15(S)-hydroxyl group by the NAD+-dependent (type I 15-PGDH) 15-prostaglandin dehydrogenase (15-PGDH) followed by reduction by NADPH/NADH-dependent (type II 15-PGDH) delta-13 15-prostaglandin reductase (13-PGR) to 15-keto-13,14-dihydroprostaglandins. 13-PGR is a bifunctional enzyme, since it also has leukotriene B(4) 12-hydroxydehydrogenase activity. Leukotriene B4 (LTB4) can be metabolized by LTB4 20-hydroxylase in inflammatory cells, and in other cells by bifunctional LTB4 12-HD/PGR. These 15-PGDH and related enzymes are members of the medium chain dehydrogenase/reductase family. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH.  MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P)-binding Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES.	338
176256	cd08296	CAD_like	Cinnamyl alcohol dehydrogenases (CAD). Cinnamyl alcohol dehydrogenases (CAD), members of the medium chain dehydrogenase/reductase family, reduce cinnamaldehydes to cinnamyl alcohols in the last step of monolignol metabolism in plant cell walls. CAD binds 2 zinc ions and is NADPH-dependent. CAD family members are also found in non-plant species, e.g. in yeast where they have an aldehyde reductase activity. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH.  MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR).  The MDR proteins have 2 domains: a C-terminal NAD(P)-binding Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES.  The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones.  Active site zinc has a catalytic role, while structural zinc aids in stability.  ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines.	333
176257	cd08297	CAD3	Cinnamyl alcohol dehydrogenases (CAD). These alcohol dehydrogenases are related to the cinnamyl alcohol dehydrogenases (CAD), members of the medium chain dehydrogenase/reductase family.  NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. Cinnamyl alcohol dehydrogenases (CAD) reduce cinnamaldehydes to cinnamyl alcohols in the last step of monolignol metabolism in plant cell walls. CAD binds 2 zinc ions and is NADPH-dependent. CAD family members are also found in non-plant species, e.g. in yeast where they have an aldehyde reductase activity. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH.  MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR).  The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES.  The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones.  Active site zinc has a catalytic role, while structural zinc aids in stability.  ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines.	341
176258	cd08298	CAD2	Cinnamyl alcohol dehydrogenases (CAD). These alcohol dehydrogenases are related to the cinnamyl alcohol dehydrogenases (CAD), members of the medium chain dehydrogenase/reductase family.  NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes, or ketones. Cinnamyl alcohol dehydrogenases (CAD) reduce cinnamaldehydes to cinnamyl alcohols in the last step of monolignol metabolism in plant cell walls. CAD binds 2 zinc ions and is NADPH-dependent. CAD family members are also found in non-plant species, e.g. in yeast where they have an aldehyde reductase activity. The medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH.  MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR).  The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES.  The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones.  Active site zinc has a catalytic role, while structural zinc aids in stability.  ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines.	329
176259	cd08299	alcohol_DH_class_I_II_IV	class I, II, IV alcohol dehydrogenases. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes or ketones.  This group includes alcohol dehydrogenases corresponding to mammalian classes I, II, IV. Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation.  ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form.  The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide.  A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone.  The N-terminal catalytic domain has a distant homology to GroES.  These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain.  NAD(H) binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding. In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of a histidine (His-51), the ribose of NAD, a serine (Ser-48), then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction.	373
176260	cd08300	alcohol_DH_class_III	class III alcohol dehydrogenases. Members identified as glutathione-dependent formaldehyde dehydrogenase (FDH), a member of the zinc dependent/medium chain alcohol dehydrogenase family.  FDH converts formaldehyde and NAD(P) to formate and NAD(P)H. The initial step in this process is the spontaneous formation of an S-(hydroxymethyl)glutathione adduct from formaldehyde and glutathione, followed by FDH-mediated oxidation (and detoxification) of the adduct to S-formylglutathione.  MDH family uses NAD(H) as a cofactor in the interconversion of alcohols and aldehydes or ketones. Like many zinc-dependent alcohol dehydrogenases (ADH) of the medium chain alcohol dehydrogenase/reductase family (MDR), these FDHs form dimers, with 4 zinc ions per dimer. The medium chain alcohol dehydrogenase family (MDR) has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The N-terminal region typically has an all-beta catalytic domain. These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit.  Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation.  ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide.  A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone.  The N-terminal catalytic domain has a distant homology to GroES.  These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain.  NAD(H) binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding.	368
176261	cd08301	alcohol_DH_plants	Plant alcohol dehydrogenase. NAD(P)(H)-dependent oxidoreductases are the major enzymes in the interconversion of alcohols and aldehydes or ketones.  Alcohol dehydrogenase in the liver converts ethanol and NAD+ to acetaldehyde and NADH, while in yeast and some other microorganisms ADH catalyzes the conversion of acetaldehyde to ethanol in alcoholic fermentation.  There are 7 vertebrate ADH classes, 6 of which have been identified in humans. Class III, glutathione-dependent formaldehyde dehydrogenase, has been identified as the primordial form and exists in diverse species, including plants, micro-organisms, vertebrates, and invertebrates. Class I, typified by liver dehydrogenase, is an evolving form. Gene duplication and functional specialization of ADH into ADH classes and subclasses created numerous forms in vertebrates.  For example, the A, B and C (formerly alpha, beta, gamma) human class I subunits have high overall structural similarity, but differ in the substrate binding pocket and therefore in substrate specificity.  In human ADH catalysis, the zinc ion helps coordinate the alcohol, followed by deprotonation of a histidine (His-51), the ribose of NAD, a serine (Ser-48), then the alcohol, which allows the transfer of a hydride to NAD+, creating NADH and a zinc-bound aldehyde or ketone. In yeast and some bacteria, the active site zinc binds an aldehyde, polarizing it, and leading to the reverse reaction. ADH is a member of the medium chain alcohol dehydrogenase family (MDR), which has a NAD(P)(H)-binding domain in a Rossmann fold of a beta-alpha form. The NAD(H)-binding region is comprised of 2 structurally similar halves, each of which contacts a mononucleotide.  A GxGxxG motif after the first mononucleotide contact half allows the close contact of the coenzyme with the ADH backbone.  The N-terminal catalytic domain has a distant homology to GroES.  These proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain.  NAD(H) binding occurs in the cleft between the catalytic and coenzyme-binding domains at the active site, and coenzyme binding induces a conformational closing of this cleft. Coenzyme binding typically precedes and contributes to substrate binding.	369
176720	cd08304	DD	Death Domain Superfamily of protein-protein interaction domains. The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer.	69
260019	cd08305	Pyrin	Pyrin: a protein-protein interaction domain. The Pyrin domain (or PYD), also called DAPIN or PAAD, is a subfamily of the Death Domain (DD) superfamily and it functions in several signaling pathways. The Pyrin domain is found at the N-terminus of a variety of proteins and serves as a linker that recruits other domains into signaling complexes. Pyrin-containing proteins include NALPs, ASC (Apoptosis-associated speck-like protein containing a CARD), and the interferon-inducible p200 (IFI-200) family of proteins which includes the human IFI-16, myeloid cell nuclear differentiation antigen (MNDA) and absent in melanoma (AIM) 2. NALPs are members of the NBS-LRR family of proteins possessing a tripartite domain structure including a C-terminal LRR (leucine-rich repeats), a central nucleotide-binding site (NBS) domain or NACHT (for neuronal apoptosis inhibitor protein, CIITA, HET-E and TP1), and an N-terminal protein-protein interaction domain, which is a Pyrin domain in the case of NALPs. ASC and NALPs are involved in the regulation of inflammation. ASC, NALP1 and NALP3 are involved in the assembly of the 'inflammasome', a multiprotein platform which is formed in response to infection or injury and is responsible for caspase-1 activation and regulation of IL-1beta maturation. NALP12 functions as a negative regulator of inflammation. The p200 proteins are involved in the regulation of cell cycle and differentiation. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including Caspase activation and recruitment domain (CARD) and Death Effector Domain (DED). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	73
260020	cd08306	Death_FADD	Fas-associated Death Domain protein-protein interaction domain. Death domain (DD) found in FAS-associated via death domain (FADD). FADD is a component of the death-inducing signaling complex (DISC) and serves as an adaptor in the signaling pathway of death receptor proteins. It modulates apoptosis as well as non-apoptotic processes such as cell cycle progression, survival, innate immune signaling, and hematopoiesis. FADD contains an N-terminal DED and a C-terminal DD. Its DD interacts with the DD of the activated death receptor, FAS, and its DED recruits the initiator caspases, caspase-8 and -10, to the DISC complex via a homotypic interaction with the N-terminal DED of the caspase. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and they can recruit other proteins into signaling complexes.	85
260021	cd08307	Death_Pelle	Death domain of the protein kinase Pelle. Death domain (DD) of the protein kinase Pelle from Drosophila melanogaster and similar proteins. In Drosophila, interaction between the DDs of Tube and Pelle is an important component of the Toll pathway, which functions in establishing dorsoventral polarity in embryos and in mediating innate immune responses to pathogens. Tube and Pelle transmit the signal from the Toll receptor to the Dorsal/Cactus complex. Pelle also functions in photoreceptor axon targeting. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	97
260022	cd08308	Death_Tube	Death domain of Tube. Death domains (DDs) similar to the DD in the protein Tube from Drosophila melanogaster. In Drosophila, interaction between the DDs of Tube and Pelle is an important component of the Toll pathway, which functions in establishing dorsoventral polarity in embryos and also in mediating innate immune response to pathogens. Tube and Pelle transmit the signal from the Toll receptor to the Dorsal/Cactus complex. Some members of this subfamily contain a C-terminal kinase domain, like Pelle, in addition to the DD. Tube has no counterpart in vertebrates. It contains an N-terminal DD and a C-terminal region with five copies of the Tube repeat, an 8-amino acid motif. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	128
260023	cd08309	Death_IRAK	Death domain of Interleukin-1 Receptor-Associated Kinases. Death Domains (DDs) found in Interleukin-1 (IL-1) Receptor-Associated Kinases (IRAK1-4) and similar proteins. IRAKs are essential components of innate immunity and inflammation in mammals and other vertebrates. All four types are involved in signal transduction involving IL-1 and IL-18 receptors, Toll-like receptors, nuclear factor-kappaB, and mitogen-activated protein kinase pathways. IRAK1 and IRAK4 are active kinases while IRAK2 and IRAK-M (also called IRAK3) are inactive. In general, IRAKs are expressed ubiquitously, except for IRAK-M which is detected only in macrophages. The insect homologs, Pelle and Tube, are important components of the Toll pathway, which functions in establishing dorsoventral polarity in embryos and also in the innate immune response. Most members have an N-terminal DD followed by a kinase domain. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	88
260024	cd08310	Death_NFkB-like	Death domain of Nuclear Factor-KappaB precursor proteins. Death Domain (DD) of Nuclear Factor-KappaB (NF-kB) precursor proteins. The NF-kB family of transcription factors play a central role in cardiovascular growth, stress response, and inflammation by controlling the expression of a network of different genes. There are five NF-kB proteins, all containing an N-terminal REL Homology Domain (RHD). Two of these, NF-kB1 and NF-kB2 are produced from the processing of the precursor proteins p105 and p100, respectively. In addition to RHD, p105 and p100 contain ANK repeats and a C-terminal DD. NF-kBs are regulated by the Inhibitor of NF-kB (IkB) Kinase (IKK) complex through classical and non-canonical pathways, which differ in the IKK subunits involved and downstream targets. IKKs facilitate the release of NF-kB dimers from an inactive state, allowing them to migrate to the nucleus where they regulate gene transcription. The precursor proteins p105 and p100 function as IkBs and as NF-kB proteins after being processed by the proteasome. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	72
260025	cd08311	Death_p75NR	Death domain of p75 Neurotrophin Receptor. Death Domain (DD) found in p75 neurotrophin receptor (p75NTR, NGFR, TNFRSF16). p75NTR binds members of the neurotrophin (NT) family including nerve growth factor (NGF), brain-derived neurotrophic factor (BDNF), and NT3, among others. It contains an NT-binding extracellular region that bears four cysteine-rich repeats, a transmembrane domain, and an intracellular DD. p75NTR plays roles in the immune, vascular, and nervous systems, and has been shown to promote cell death or survival, and to induce neurite outgrowth or collapse depending on its ligands and co-receptors. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	80
260026	cd08312	Death_MyD88	Death domain of Myeloid Differentiation primary response protein MyD88. Death Domain (DD) of Myeloid Differentiation primary response protein 88 (MyD88). MyD88 is an adaptor protein involved in interleukin-1 receptor (IL-1R)- and Toll-like receptor (TLR)-induced activation of nuclear factor-kappaB (NF-kB) and mitogen activated protein kinase pathways that lead to the induction of proinflammatory cytokines. It is a key component in the signaling pathway of pathogen recognition in the innate immune system. MyD88 contains an N-terminal DD and a C-terminal Toll/IL-1 Receptor (TIR) homology domain that mediates interaction with TLRs and IL-1R. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	79
176729	cd08313	Death_TNFR1	Death domain of Tumor Necrosis Factor Receptor 1. Death Domain (DD) found in tumor necrosis factor receptor-1 (TNFR-1). TNFR-1 has many names including TNFRSF1A, CD120a, p55, p60, and TNFR60. It activates two major intracellular signaling pathways that lead to the activation of the transcription factor NF-kB and the induction of cell death. Upon binding of its ligand TNF, TNFR-1 trimerizes which leads to the recruitment of an adaptor protein named TNFR-associated death domain protein (TRADD) through a DD/DD interaction. Mutations in the TNFRSF1A gene cause TNFR-associated periodic syndrome (TRAPS), a rare disorder characterized by recurrent fever, myalgia, abdominal pain, conjunctivitis and skin eruptions. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	80
260027	cd08315	Death_TRAILR_DR4_DR5	Death domain of Tumor necrosis factor-Related Apoptosis-Inducing Ligand Receptors. Death Domain (DD) found in Tumor necrosis factor-Related Apoptosis-Inducing Ligand (TRAIL) Receptors. In mammals, this family includes TRAILR1 (also called DR4 or TNFRSF10A) and TRAILR2 (also called DR5, TNFRSF10B, or KILLER). They function as receptors for the cytokine TRAIL and are involved in apoptosis signaling pathways. TRAIL preferentially induces apoptosis in cancer cells while exhibiting little toxicity in normal cells. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	88
260028	cd08316	Death_FAS_TNFRSF6	Death domain of FAS or TNF receptor superfamily member 6. Death Domain (DD) found in the FS7-associated cell surface antigen (FAS). FAS, also known as TNFRSF6 (TNF receptor superfamily member 6), APT1, CD95, FAS1, or APO-1, together with FADD (Fas-associating via Death Domain) and caspase 8, is an integral part of the death inducing signalling complex (DISC), which plays an important role in the induction of apoptosis and is activated by binding of the ligand FasL to FAS. FAS also plays a critical role in self-tolerance by eliminating cell types (autoreactive T and B cells) that contribute to autoimmunity. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	94
260029	cd08317	Death_ank	Death domain associated with Ankyrins. Death Domain (DD) associated with Ankyrins. Ankyrins are modular proteins comprising three conserved domains, an N-terminal membrane-binding domain containing ANK repeats, a spectrin-binding domain and a C-terminal DD. Ankyrins function as adaptor proteins and they interact, through ANK repeats, with structurally diverse membrane proteins, including ion channels/pumps, calcium release channels, and cell adhesion molecules. They play critical roles in the proper expression and membrane localization of these proteins. In mammals, this family includes ankyrin-R for restricted (or ANK1), ankyrin-B for broadly expressed (or ANK2) and ankyrin-G for general or giant (or ANK3). They are expressed in different combinations in many tissues and play non-overlapping functions. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	84
260030	cd08318	Death_NMPP84	Death domain of Nuclear Matrix Protein P84. Death domain (DD) found in the Nuclear Matrix Protein P84 (also known as HPR1 or THOC1). HPR1/p84 resides in the nuclear matrix and is part of the THO complex, also called TREX (transcription/export) complex, which functions in mRNP biogenesis at the interface between transcription and export of mRNA from the nucleus. Mice lacking THOC1 have abnormal testis development and are sterile. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	86
260031	cd08319	Death_RAIDD	Death domain of RIP-associated ICH-1 homologous protein with a death domain. Death domain (DD) of RAIDD (RIP-associated ICH-1 homologous protein with a death domain), also known as CRADD (Caspase and RIP adaptor). RAIDD is an adaptor protein that together with the p53-inducible protein PIDD and caspase-2, forms the PIDDosome complex, which is required for caspase-2 activation and plays a role in mediating stress-induced apoptosis. RAIDD contains an N-terminal Caspase Activation and Recruitment Domain (CARD), which interacts with the caspase-2 CARD, and a C-terminal DD, which interacts with the DD of PIDD. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD, DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	83
260032	cd08320	Pyrin_NALPs	Pyrin death domain found in NALP proteins. Pyrin Death Domain found in NALP (NACHT, LRR and PYD domains) proteins including NALP1 (CARD7, NLRP1), NALP3 (NLRP3, Cryopyrin, CIAS1), and NALP12 (NLRP12, Monarch-1), among others. Mammals contain at least 14 NALP proteins, named NALP1-14 (or NLRP1-14). NALPs are members of the NBS-LRR family of proteins possessing a tripartite domain structure including a C-terminal LRR (leucine-rich repeats), a central nucleotide-binding site (NBS) domain or NACHT (for neuronal apoptosis inhibitor protein, CIITA, HET-E and TP1), and an N-terminal protein-protein interaction domain, which is a Pyrin domain in the case of NALPs. The NBS-LRR family is also referred to as the NLR (Nod-like Receptor) or CATERPILLAR (for CARD, transcription enhancer, R-(purine)-binding, pyrin, lots of LRRs) family. NALP1 contains an additional Caspase activation and recruitment domain (CARD) at the C-terminus. NALP1 and NALP3 are both involved in the assembly of the 'inflammasome', a multiprotein platform which is formed in response to infection or injury and is responsible for caspase-1 activation and regulation of IL-1beta maturation. NALP1-inflammasomes recognize specific substances while NALP3-inflammasomes respond to many diverse triggers. Mutations in the NALP3 gene are associated with a broad spectrum of autoinflammatory disorders including Muckle-Wells Syndrome (MWS), familial cold autoinflammatory syndrome (FCAS), and chronic neurologic cutaneous and articular syndrome (CINCA). NALP12 functions as a negative regulator of inflammation. In general, Pyrin is a subfamily of the Death Domain (DD) superfamily and functions in several signaling pathways. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	84
260033	cd08321	Pyrin_ASC-like	Pyrin Death Domain found in ASC. Pyrin Death Domain found in ASC (Apoptosis-associated speck-like protein containing a CARD) and similar proteins. ASC is an adaptor molecule that functions in the assembly of the 'inflammasome', a multiprotein platform, which is responsible for caspase-1 activation and regulation of IL-1beta maturation. ASC contains two domains from the Death Domain (DD) superfamily, an N-terminal pyrin-like domain and a C-terminal Caspase activation and recruitment domain (CARD). Through these 2 domains, ASC serves as an adaptor for inflammasome integrity and oligomerizes to form supramolecular assemblies. Included in this family is human PYNOD (also known as NLRP10 or NOD8) which via its Pyrin domain suppresses oligomerization of ASC, and ASC-mediated NF-kappaB activation. Other members of this subfamily are associated with ATPase domains and their function remains unknown. In general, Pyrin is a subfamily of the DD superfamily and functions in several signaling pathways. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD and Death Effector Domain (DED). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	82
260034	cd08323	CARD_APAF1	Caspase activation and recruitment domain similar to that found in Apoptotic Protease-Activating Factor 1. Caspase activation and recruitment domain (CARD) similar to that found in apoptotic protease-activating factor 1 (APAF-1), which is an activator of caspase-9. APAF-1 contains WD-40 repeats, a CARD, and an ATPase domain. Upon stimulation, APAF-1, together with caspase-9, forms the heptameric 'apoptosome', which leads to the processing and activation of caspase-9, starting a caspase cascade which leads to apoptosis. In general, CARDs are death domains (DDs) found associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation, and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	86
260035	cd08324	CARD_NOD1_CARD4	Caspase activation and recruitment domain similar to that found in NOD1. Caspase activation and recruitment domain (CARD) found in human NOD1 (CARD4) and similar proteins. NOD1 is a member of the Nod-like receptor (NLR) family, which plays a central role in the innate immune response. NLRs typically contain an N-terminal effector domain, a central nucleotide-binding domain and a C-terminal ligand-binding region of several leucine-rich repeats (LRRs). In NOD1, as well as NOD2, the N-terminal effector domain is a CARD. Nod1-CARD has been shown to interact with the CARD domain of the downstream effector RICK (RIP2, CARDIAK), a serine/threonine kinase. In general, CARDs are death domains (DDs) found associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation, and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	85
260036	cd08325	CARD_CASP1-like	Caspase activation and recruitment domain found in Caspase-1 and related proteins. Caspase activation and recruitment domain (CARD) similar to those found in Caspase-1 (CASP1, ICE) and related proteins, including CARD-only proteins such as ICEBERG or CARD18, INCA (CARD17), CARD16 (COP1, PSEUDO-ICE), CARD8 (DACAR, NDPP1, TUCAN), and CARD12 (NLRC4), as well as ICE-like caspases such as CASP12, CASP5 (ICH-3) and CASP4 (TX, ICH-2). Caspases are aspartate-specific cysteine proteases with functions in apoptosis and immune signaling. CASP1 plays a central role in the cellular response to a wide variety of microbial and non-microbial stimuli, being activated by the inflammasome or the pyroptosome. CARD8 binds itself and the initiator caspase-9, interfering with the binding of APAF-1 and suppressing caspase-9 activation. CARD12 is a Nod-like receptor (NLR) that plays an important role in the innate immune response to Gram-negative bacteria. Caspase-4 (CASP4), -5 (CASP5), and -12 (CASP12) are inflammatory caspases implicated in inflammation and endoplasmic reticulum stress-induced apoptosis. In general, CARDs are death domains (DDs) found associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	83
176740	cd08326	CARD_CASP9	Caspase activation and recruitment domain of Caspase-9. Caspase activation and recruitment domain (CARD) similar to that found in caspase-9 (CASP9, MCH6, APAF3), which interacts with the CARD of apoptotic protease-activating factor 1 (APAF-1). Caspases are aspartate-specific cysteine proteases with functions in apoptosis and immune signaling. Initiator caspases are the first to be activated following death- or inflammation-inducing signals. Caspase-9 is the initiator caspase associated with the intrinsic or mitochondrial pathway of apoptosis, induced by many pro-apoptotic signals. Together with APAF-1, it forms the heptameric 'apoptosome' in response to the release of cytochrome c from mitochondria. Activated caspase-9 cleaves and activates downstream effector caspases, like caspase-3, caspase-6, and caspase-7, resulting in apoptosis. In general, CARDs are death domains (DDs) associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	84
260037	cd08327	CARD_RAIDD	Caspase activation and recruitment domain of RIP-associated ICH-1 homologous protein with a death domain. Caspase activation and recruitment domain (CARD) of RAIDD (RIP-associated ICH-1 homologous protein with a death domain), also known as CRADD (Caspase and RIP adaptor). RAIDD is an adaptor protein that together with the p53-inducible protein PIDD and caspase-2, forms the PIDDosome complex, which is required for caspase-2 activation and plays a role in mediating stress-induced apoptosis. RAIDD contains an N-terminal CARD, which interacts with the caspase-2 CARD, and a C-terminal Death domain (DD), which interacts with the DD of PIDD. In general, CARDs are DDs associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	94
260038	cd08329	CARD_BIRC2_BIRC3	Caspase activation and recruitment domain found in Baculoviral IAP repeat-containing proteins, BIRC2 (c-IAP1) and BIRC3 (c-IAP2). Caspase activation and recruitment domain (CARD) similar to those found in Baculoviral IAP repeat (BIR)-containing protein 2 (BIRC2) or cellular Inhibitor of Apoptosis Protein 1 (c-IAP1), and BIRC3 (or c-IAP2). IAPs are anti-apoptotic proteins that contain at least one BIR domain. Most IAPs also contain a C-terminal RING domain. In addition, both BIRC2 and BIRC3 contain a CARD. BIRC2 and BIRC3, through their binding with TRAF (TNF receptor-associated factor) 2, are recruited to TNFR-1/2 signaling complexes, where they regulate caspase-8 activity. They also play important roles in pro-survival NF-kB signaling pathways. In general, CARDs are death domains (DDs) found associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	94
260039	cd08330	CARD_ASC_NALP1	Caspase activation and recruitment domain found in Human ASC, NALP1, and similar proteins. Caspase activation and recruitment domain (CARD) similar to those found in human ASC (Apoptosis-associated speck-like protein containing a CARD) and NALP1 (CARD7, NLRP1). ASC, an adaptor molecule, and NALP1, a member of the Nod-like receptor (NLR) family, are involved in the assembly of the 'inflammasome', a multiprotein platform, which is responsible for caspase-1 activation and regulation of IL-1beta maturation. In general, CARDs are death domains (DDs) associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	81
260040	cd08332	CARD_CASP2	Caspase activation and recruitment domain of Caspase-2. Caspase activation and recruitment domain (CARD) similar to that found in caspase-2. Caspases are aspartate-specific cysteine proteases with functions in apoptosis and immune signaling. Caspase-2 (also known as ICH1, NEDD2, or CASP2) is one of the most evolutionarily conserved caspases, and plays a role in apoptosis, DNA damage response, cell cycle regulation, and tumor suppression. It is localized in the nucleus and exhibits properties of both an initiator and an effector caspase. In general, CARDs are death domains (DDs) found associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation, and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	87
260041	cd08333	DED_Caspase_8_r1	Death effector domain, repeat 1, of Caspase-8. Death effector domain (DED) found in caspase-8 (CASP8, FLICE), repeat 1. Caspases are aspartate-specific cysteine proteases with functions in apoptosis and immune signaling. Initiator caspases are the first to be activated following death- or inflammation-inducing signals. Caspase-8 is an initiator of death receptor mediated apoptosis. Together with FADD, caspase-10, and the pseudo-caspase c-FLIP, it forms the death-inducing signaling complex (DISC), whose formation is triggered by the activation of type 1 tumor necrosis factor (TNF) receptors such as Fas, TNF receptor 1, and TRAIL receptor. Caspase-8 also plays many important non-apoptotic functions including roles in embryonic development, cell adhesion and motility, immune cell proliferation and differentiation, T-cell activation, and NFkappaB signaling. It contains two N-terminal DED domains and a C-terminal caspase domain. DEDs comprise a subfamily of the Death Domain (DD) superfamily. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and CARD (Caspase activation and recruitment domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	82
260042	cd08334	DED_Caspase_8_10_r2	Death effector domain, repeat 2, of initiator caspases 8 and 10. Death Effector Domain (DED) found in caspase-8 and caspase-10, repeat 2. Caspases are aspartate-specific cysteine proteases with functions in apoptosis and immune signaling. Initiator caspases are the first to be activated following death- or inflammation-inducing signals. Caspase-8 and -10 are the initiators of death receptor mediated apoptosis, and they play partially redundant roles. Together with FADD and the pseudo-caspase c-FLIP, they form the death-inducing signaling complex (DISC), whose formation is triggered by the activation of type 1 tumor necrosis factor (TNF) receptors such as Fas, TNF receptor 1, and TRAIL receptor. Caspase-8 and -10 also play important functions in cell adhesion and motility. They contain two N-terminal DED domains and a C-terminal caspase domain. DEDs comprise a subfamily of the Death Domain (DD) superfamily. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and CARD (Caspase activation and recruitment domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	83
260043	cd08336	DED_FADD	Death Effector Domain found in Fas-Associated via Death Domain. Death Effector Domain (DED) found in Fas-Associated via Death Domain (FADD). DEDs comprise a subfamily of the Death Domain (DD) superfamily. FADD is a component of the death-inducing signaling complex (DISC) and serves as an adaptor in the signaling pathway of death receptor proteins. It modulates apoptosis as well as non-apoptotic processes such as cell cycle progression, survival, innate immune signaling, and hematopoiesis. FADD contains an N-terminal DED and a C-terminal DD. Its DD interacts with the DD of the activated death receptor and its DED recruits the initiator caspases 8 and 10 to the DISC complex via a homotypic interaction with the N-terminal DED of the caspase. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and CARD (Caspase activation and recruitment domain). They serve as adaptors in signaling pathways and they can recruit other proteins into signaling complexes.	82
260044	cd08337	DED_c-FLIP_r1	Death Effector Domain, repeat 1, of cellular FLICE-Inhibitory Protein. Death Effector Domain (DED), repeat 1, similar to that found in FLICE-inhibitory protein (c-FLIP/CASH, also known as Casper/iFLICE/FLAME-1/CLARP/MRIT/usurpin). c-FLIP is a catalytically inactive homolog of the initiator procaspases-8 and -10. It negatively influences apoptotic signaling by interfering with the efficient formation of the Death Inducing Signalling Complex (DISC). At low levels, c-FLIP has been shown to enhance apoptotic signaling by allosterically activating caspase-8. As a modulator of the initiator caspases, c-FLIP regulates life and death in various types of cells and tissues. All members contain two N-terminal DEDs and a C-terminal pseudo-caspase domain. DEDs comprise a subfamily of the Death Domain (DD) superfamily. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and CARD (Caspase activation and recruitment domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	80
260045	cd08338	DED_PEA15	Death Effector Domain of Astrocyte phosphoprotein PEA-15. Death Effector Domain (DED) similar to that found in PEA-15 (Astrocyte phosphoprotein PEA-15). PEA-15 is a multifunctional phosphoprotein that modulates signaling pathways, like the ERK MAP kinase cascade by binding to ERK and changing its subcellular localization. It has been implicated in apoptosis, cell proliferation, and glucose metabolism. It does not possess enzymatic activity and mainly acts as an adaptor protein. PEA-15 contains an N-terminal DED domain and a C-terminal disordered region. DEDs comprise a subfamily of the Death Domain (DD) superfamily. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and CARD (Caspase activation and recruitment domain). They serve as adaptors in signaling pathways and they can recruit other proteins into signaling complexes.	84
176750	cd08339	DED_DEDD-like	Death Effector Domain of DEDD and DEDD2. Death Effector Domain (DED) found in DEDD and DEDD2. Both proteins have a single N-terminal DED and a long C-terminal portion with no known domains. DEDD has been shown to block mitotic progression by inhibiting Cdk1 and to be involved in regulating the insulin signaling cascade. DEDD and DEDD2 can bind to themselves, to each other, and to the two tandem DED-containing caspases, caspase-8 and -10. In general, DEDs comprise a subfamily of the Death Domain (DD) superfamily. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and CARD (Caspase activation and recruitment domain). They serve as adaptors in signaling pathways and they can recruit other proteins into signaling complexes.	97
260046	cd08340	DED_c-FLIP_r2	Death Effector Domain, repeat 2, of cellular FLICE-Inhibitory Protein. Death Effector Domain (DED), repeat 2, similar to that found in cellular FLICE-inhibitory protein (c-FLIP/CASH, also known as Casper/iFLICE/FLAME-1/CLARP/MRIT/usurpin). c-FLIP is a catalytically inactive homolog of the initiator procaspases-8 and -10. It negatively influences apoptotic signaling by interfering with the efficient formation of the Death Inducing Signalling Complex (DISC). At low levels, c-FLIP has been shown to enhance apoptotic signaling by allosterically activating caspase-8. As a modulator of the initiator caspases, c-FLIP regulates life and death in various types of cells and tissues. All members contain two N-terminal DEDs and a C-terminal pseudo-caspase domain. DEDs comprise a subfamily of the Death Domain (DD) superfamily. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and CARD (Caspase activation and recruitment domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	81
260047	cd08341	DED_Caspase_10_r1	Death effector domain, repeat 1, of Caspase-10. Death effector domain (DED) found in caspase-10, repeat 1. Caspases are aspartate-specific cysteine proteases with functions in apoptosis and immune signaling. Initiator caspases are the first to be activated following death- or inflammation-inducing signals. Caspase-10 is an initiator of death receptor mediated apoptosis. Together with FADD, caspase-8 and the pseudo-caspase c-FLIP, it forms the death-inducing signaling complex (DISC), whose formation is triggered by the activation of type 1 tumor necrosis factor (TNF) receptors such as Fas, TNF receptor 1, and TRAIL receptor. It contains two N-terminal DED domains and a C-terminal caspase domain. DEDs comprise a subfamily of the Death Domain (DD) superfamily. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and CARD (Caspase activation and recruitment domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	82
319930	cd08342	HPPD_N_like	N-terminal domain of 4-hydroxyphenylpyruvate dioxygenase (HPPD) and hydroxymandelate Synthase (HmaS). HPPD and HmaS are non-heme iron-dependent dioxygenases, which modify a common substrate, 4-hydroxyphenylpyruvate (HPP), but yield different products. HPPD catalyzes the second reaction in tyrosine catabolism, the conversion of HPP to homogentisate (2,5-dihydroxyphenylacetic acid, HG). HmaS converts HPP to 4-hydroxymandelate, a committed step in the formation of hydroxyphenylglycine, a structural component of nonproteinogenic macrocyclic peptide antibiotics, such as vancomycin. If the emphasis is on catalytic chemistry, HPPD and HmaS are classified as members of a large family of alpha-keto acid dependent mononuclear non-heme iron oxygenases most of which require Fe(II), molecular oxygen, and an alpha-keto acid (typically alpha-ketoglutarate) to either oxygenate or oxidize a third substrate. Both enzymes are exceptions in that they require two, instead of three, substrates, do not use alpha-ketoglutarate, and incorporate both atoms of dioxygen into the aromatic product. Both HPPD and HmaS exhibit duplicate beta barrel topology in their N- and C-terminal domains which share sequence similarity, suggestive of a gene duplication. Each protein has only one catalytic site located in the C-terminal domain. This HPPD_N_like domain represents the N-terminal domain.	141
319931	cd08343	ED_TypeI_classII_C	C-terminal domain of type I, class II extradiol dioxygenases, catalytic domain. This family contains the C-terminal, catalytic domain of type I, class II extradiol dioxygenases. Dioxygenases catalyze the incorporation of both atoms of molecular oxygen into substrates using a variety of reaction mechanisms, resulting in the cleavage of aromatic rings. Two major groups of dioxygenases have been identified according to the cleavage site; extradiol enzymes cleave the aromatic ring between a hydroxylated carbon and an adjacent non-hydroxylated carbon, whereas intradiol enzymes cleave the aromatic ring between two hydroxyl groups. Extradiol dioxygenases are classified into type I and type II enzymes. Type I extradiol dioxygenases include class I and class II enzymes. These two classes of enzymes show sequence similarity; the two-domain class II enzymes evolved from a class I enzyme through gene duplication. The extradiol dioxygenases represented in this family are type I, class II enzymes, and are composed of N- and C-terminal domains of similar structural fold, resulting from an ancient gene duplication. The active site is located in a funnel-shaped space of the C-terminal domain. A catalytically essential metal, Fe(II) or Mn(II), is present in all the enzymes in this family.	132
319932	cd08344	MhqB_like_N	N-terminal domain of MhqB, a type I extradiol dioxygenase, and similar proteins. This subfamily contains the N-terminal, non-catalytic, domain of Burkholderia sp. NF100 MhqB and similar proteins. MhqB is a type I extradiol dioxygenase involved in the catabolism of methylhydroquinone, an intermediate in the degradation of fenitrothion. The purified enzyme has shown extradiol ring cleavage activity toward 3-methylcatechol. Fe2+ was suggested as a cofactor, the same as most other enzymes in the family. Burkholderia sp. NF100 MhqB is encoded on the plasmid pNF1. The type I family of extradiol dioxygenases contains two structurally homologous barrel-shaped domains at the N- and C-terminal. The active-site metal is located in the C-terminal barrel and plays an essential role in the catalytic mechanism.	112
319933	cd08345	Fosfomycin_RP	Fosfomycin resistance protein. This family contains three types of fosfomycin resistance protein. Fosfomycin inhibits the enzyme UDP-N-acetylglucosamine-3-enolpyruvyltransferase (MurA), which catalyzes the first committed step in bacterial cell wall biosynthesis. The three types of fosfomycin resistance proteins employ different mechanisms to render fosfomycin [(1R,2S)-epoxypropylphosphonic acid] inactive. FosB catalyzes the addition of L-cysteine to the epoxide ring of fosfomycin. FosX catalyzes the addition of a water molecule to the C1 position of the antibiotic with inversion of configuration at C1. FosA catalyzes the addition of glutathione to the antibiotic fosfomycin, making it inactive. Catalytic activities of both FosX and FosA are Mn(II)-dependent, but FosB is activated by Mg(II). Fosfomycin resistance proteins are evolutionarily related to glyoxalase I and type I extradiol dioxygenases.	118
319934	cd08346	PcpA_N_like	N-terminal domain of Sphingobium chlorophenolicum 2,6-dichloro-p-hydroquinone 1,2-dioxygenase (PcpA), and similar proteins. The N-terminal domain of Sphingobium chlorophenolicum (formerly Sphingomonas chlorophenolica) 2,6-dichloro-p-hydroquinone 1,2-dioxygenase (PcpA), and similar proteins. PcpA is a key enzyme in the pentachlorophenol (PCP) degradation pathway, catalyzing the conversion of 2,6-dichloro-p-hydroquinone to 2-chloromaleylacetate. This domain belongs to a conserved domain superfamily that is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases.	124
319935	cd08347	PcpA_C_like	C-terminal domain of Sphingobium chlorophenolicum 2,6-dichloro-p-hydroquinone 1,2-dioxygenase (PcpA), and similar proteins. The C-terminal domain of Sphingobium chlorophenolicum (formerly Sphingomonas chlorophenolica) 2,6-dichloro-p-hydroquinone 1,2-dioxygenase (PcpA), and similar proteins. PcpA is a key enzyme in the pentachlorophenol (PCP) degradation pathway, catalyzing the conversion of 2,6-dichloro-p-hydroquinone to 2-chloromaleylacetate. This domain belongs to a conserved domain superfamily that is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases.	157
319936	cd08348	BphC2-C3-RGP6_C_like	The single-domain 2,3-dihydroxybiphenyl 1,2-dioxygenases. This subfamily contains Rhodococcus globerulus P6 BphC2-RGP6 and BphC3-RGP6, and similar proteins. BphC catalyzes the extradiol ring cleavage reaction of 2,3-dihydroxybiphenyl, yielding 2-hydroxy-6-oxo-6-phenylhexa-2,4-dienoic acid. This is the third step in the polychlorinated biphenyls (PCBs) degradation pathway (bph pathway). This subfamily of BphCs belongs to the type I extradiol dioxygenase family, which requires a metal in the active site for its catalytic mechanism. Most type I extradiol dioxygenases are activated by Fe(II). Polychlorinated biphenyl degrading bacteria demonstrate a multiplicity of BphCs. For example, three types of BphC enzymes have been found in Rhodococcus globerulus (BphC1-RGP6 - BphC3-RGP6); all three enzymes are type I extradiol dioxygenases. BphC2-RGP6 and BphC3-RGP6 are one-domain dioxygenases, which form hexamers. BphC1-RGP6 has an internal duplication; it is a two-domain dioxygenase which forms octamers, and its two domains do not belong to this subfamily.	137
319937	cd08349	BLMA_like	Bleomycin binding protein (BLMA) and similar proteins. BLMA, also called bleomycin resistance protein, confers bleomycin (Bm) resistance by directly binding to Bm. Bm is a glycopeptide antibiotic produced naturally by actinomycetes. It is a potent anti-cancer drug, which acts as a strong DNA-cutting agent, thereby causing cell death. BLMA is produced by actinomycetes to protect themselves against their own lethal compound. BLMA has two identically-folded subdomains, with the same alpha/beta fold; these two halves have no sequence similarity. BLMAs are dimers and each dimer binds two Bm molecules at the Bm-binding pockets formed at the dimer interface. BLMA belongs to a conserved domain superfamily that is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. As for the larger superfamily, this family contains members with or without domain swapping.	114
319938	cd08350	BLMT_like	BLMT, a bleomycin resistance protein encoded on the transposon Tn5, and similar proteins. BLMT is a bleomycin (Bm) resistance protein, encoded by the ble gene on the transposon Tn5. This protein confers a survival advantage to Escherichia coli host cells. Bm is a glycopeptide antibiotic produced naturally by actinomycetes. It is a potent anti-cancer drug, which acts as a strong DNA-cutting agent, thereby causing cell death. BLMT has strong binding affinity to Bm and it protects against this lethal compound through drug sequestering. BLMT has two identically-folded subdomains, with the same alpha/beta fold; these two halves have no sequence similarity. BLMT is a dimer with two Bm-binding pockets formed at the dimer interface.	118
319939	cd08351	ChaP_like	ChaP, an enzyme involved in the biosynthesis of the antitumor agent chartreusin (cha), and similar proteins. ChaP is an enzyme involved in the biosynthesis of the potent antitumor agent chartreusin (cha). Cha is an aromatic polyketide glycoside produced by Streptomyces chartreusis. ChaP may play a role as a meta-cleavage dioxygenase in the oxidative rearrangement of the anthracyclic polyketide. ChaP belongs to a conserved domain superfamily that is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases.	118
319940	cd08352	VOC_Bs_YwkD_like	vicinal oxygen chelate (VOC) family protein Bacillus subtilis YwkD and similar proteins. uncharacterized subfamily of vicinal oxygen chelate (VOC) family contains Bacillus subtilis YwkD and similar proteins. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which show no domain swapping.	123
319941	cd08353	VOC_like	uncharacterized subfamily of vicinal oxygen chelate (VOC) family. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which show no domain swapping.	142
319942	cd08354	VOC_like	uncharacterized subfamily of vicinal oxygen chelate (VOC) family. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which show no domain swapping.	122
319943	cd08355	TioX_like	Micromonospora sp. TioX and similar proteins. Micromonospora sp. TioX is encoded by a gene of the thiocoraline biosynthetic gene cluster. Thiocoraline is a thiodepsipeptide with potent antitumor activity. TioX may be involved in thiocoraline resistance or secretion. TioX belongs to the vicinal oxygen chelate (VOC) superfamily that is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which show no domain swapping.	123
319944	cd08356	VOC_CChe_VCA0619_like	uncharacterized subfamily of vicinal oxygen chelate (VOC) family. uncharacterized subfamily of vicinal oxygen chelate (VOC) family contains Vibrio cholerae VCA0619 and similar proteins. The VOC superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping.	113
319945	cd08357	VOC_like	uncharacterized subfamily of vicinal oxygen chelate (VOC) family. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which show no domain swapping.	124
319946	cd08358	GLOD4_N	N-terminal domain of human glyoxalase domain-containing protein 4 and similar proteins. Uncharacterized subfamily of the vicinal oxygen chelate (VOC) superfamily contains human glyoxalase domain-containing protein 4 and similar proteins. VOC  is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping.	127
319947	cd08359	VOC_like	uncharacterized subfamily of vicinal oxygen chelate (VOC) family. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping.	119
319948	cd08360	MhqB_like_C	C-terminal domain of Burkholderia sp. NF100 MhqB and similar proteins. This subfamily contains the C-terminal, catalytic, domain of Burkholderia sp. NF100 MhqB and similar proteins. MhqB is a type I extradiol dioxygenase involved in the catabolism of methylhydroquinone, an intermediate in the degradation of fenitrothion. The purified enzyme has shown extradiol ring cleavage activity toward 3-methylcatechol. Fe2+ was suggested as a cofactor, the same as most other enzymes in the family. Burkholderia sp. NF100 MhqB is encoded on the plasmid pNF1. The type I family of extradiol dioxygenases contains two structurally homologous barrel-shaped domains at the N- and C-terminal. The active-site metal is located in the C-terminal barrel and plays an essential role in the catalytic mechanism.	134
319949	cd08361	PpCmtC_N	N-terminal domain of 2,3-dihydroxy-p-cumate-3,4-dioxygenase (PpCmtC). This subfamily contains the N-terminal, non-catalytic, domain of PpCmtC. 2,3-dihydroxy-p-cumate-3,4-dioxygenase (CmtC of Pseudomonas putida F1) is a dioxygenase involved in the eight-step catabolism pathway of p-cymene. CmtC acts upon the reaction intermediate 2,3-dihydroxy-p-cumate, yielding 2-hydroxy-3-carboxy-6-oxo-7-methylocta-2,4-dienoate. The CmtC belongs to the type I family of extradiol dioxygenases. Fe2+ was suggested as a cofactor, same as other enzymes in the family. The type I family of extradiol dioxygenases contains two structurally homologous barrel-shaped domains at the N- and C-terminal. The active-site metal is located in the C-terminal barrel and plays an essential role in the catalytic mechanism.	124
319950	cd08362	BphC5-RrK37_N_like	N-terminal, non-catalytic, domain of BphC5 (2,3-dihydroxybiphenyl 1,2-dioxygenase) from Rhodococcus rhodochrous K37, and similar proteins. 2,3-dihydroxybiphenyl 1,2-dioxygenase (BphC) catalyzes the extradiol ring cleavage reaction of 2,3-dihydroxybiphenyl, the third step in the polychlorinated biphenyls (PCBs) degradation pathway (bph pathway). The enzyme contains an N-terminal and a C-terminal domain of similar structural fold, resulting from an ancient gene duplication. BphC belongs to the type I extradiol dioxygenase family, which requires a metal in the active site for its catalytic activity. Polychlorinated biphenyl degrading bacteria demonstrate multiplicity of BphCs. Bacterium Rhodococcus rhodochrous K37 has eight genes encoding BphC enzymes. This family includes the N-terminal domain of BphC5-RrK37. The crystal structure of the protein from Novosphingobium aromaticivorans has a Mn(II) in the active site, although most proteins of type I extradiol dioxygenases are activated by Fe(II).	120
319951	cd08363	FosB	fosfomycin resistant protein subfamily FosB. This subfamily contains FosB, a fosfomycin resistant protein. FosB is a Mg(2+)-dependent L-cysteine thiol transferase. Fosfomycin inhibits the enzyme UDP-N-acetylglucosamine-3-enolpyruvyltransferase (MurA), which catalyzes the first committed step in bacterial cell wall biosynthesis. FosB catalyzes the Mg(II) dependent addition of L-cysteine to the epoxide ring of fosfomycin, (1R,2S)-epoxypropylphosphonic acid, rendering it inactive. FosB is evolutionarily related to glyoxalase I and type I extradiol dioxygenases.	131
319952	cd08364	FosX	fosfomycin resistant protein subfamily FosX. This subfamily contains FosX, a fosfomycin resistant protein. FosX is a Mn(II)-dependent fosfomycin-specific epoxide hydrolase. Fosfomycin inhibits the enzyme UDP-N-acetylglucosamine-3-enolpyruvyltransferase (MurA), which catalyzes the first committed step in bacterial cell wall biosynthesis. FosX catalyzes the addition of a water molecule to the C1 position of the antibiotic with inversion of the configuration at C1 in the presence of Mn(II). The hydrated fosfomycin loses its inhibitory activity. FosX is evolutionarily related to glyoxalase I and type I extradiol dioxygenases.	130
176483	cd08365	APC10-like1	APC10-like DOC1 domains of E3 ubiquitin ligases that mediate substrate ubiquitination. This model represents the APC10-like DOC1 domain of multi-domain proteins present in E3 ubiquitin ligases. E3 ubiquitin ligases mediate substrate ubiquitination (or ubiquitylation), a component of the ubiquitin-26S proteasome pathway for selective proteolytic degradation. APC10/DOC1 domains such as those present in HECT (Homologous to the E6-AP Carboxyl Terminus) and Cullin-RING (Really Interesting New Gene) E3 ubiquitin ligase proteins, HECTD3, and CUL7, respectively, are also included here. CUL7 is a member of the Cullin-RING ligase family and functions as a molecular scaffold assembling a SCF-ROC1-like E3 ubiquitin ligase complex consisting of Skp1, CUL7, Fbx29 F-box protein, and ROC1 (RING-box protein 1) and promotes ubiquitination. CUL7 is a multi-domain protein with a C-terminal cullin domain that binds ROC1 and a centrally positioned APC10/DOC1 domain. HECTD3 contains a C-terminal HECT domain which contains the active site for ubiquitin transfer onto substrates, and an N-terminal APC10/DOC1 domain which is responsible for substrate recognition and binding. An APC10/DOC1 domain homolog is also present in HERC2 (HECT domain and RLD2), a large multi-domain protein with three RCC1-like domains (RLDs), additional internal domains including zinc finger ZZ-type and Cyt-b5 (Cytochrome b5-like Heme/Steroid binding) domains, and a C-terminal HECT domain. Recent studies have shown that the protein complex HERC2-RNF8 coordinates ubiquitin-dependent assembly of DNA repair factors on damaged chromosomes. Also included in this hierarchy is an uncharacterized APC10/DOC1-like domain found in a multi-domain protein, which also contains CUB, zinc finger ZZ-type, and EF-hand domains. The APC10/DOC1 domain forms a beta-sandwich structure that is related in architecture to the galactose-binding domain-like fold; their sequences are quite dissimilar, however, and are not included here.	131
176484	cd08366	APC10	APC10 subunit of the anaphase-promoting complex (APC) that mediates substrate ubiquitination. This model represents the single domain protein APC10, a subunit of the anaphase-promoting complex (APC), which is a multi-subunit E3 ubiquitin ligase. E3 ubiquitin ligases mediate substrate ubiquitination (or ubiquitylation), a vital component of the ubiquitin-26S proteasome pathway for selective proteolytic degradation. The APC (also known as the cyclosome), is a cell cycle-regulated E3 ubiquitin ligase that controls important transitions in mitosis and the G1 phase by ubiquitinating regulatory proteins, thereby targeting them for degradation. In mitosis, the APC initiates sister chromatid separation by ubiquitinating the anaphase inhibitor securin and triggers exit from mitosis by ubiquitinating cyclin B. The C-terminus of APC10 binds to CDC27/APC3, an APC subunit that contains multiple tetratrico peptide repeats. APC10 domains are homologous to the DOC1 domains present in the HECT (Homologous to the E6-AP Carboxyl Terminus) E3 ubiquitin ligase protein, and the Cullin-RING (Really Interesting New Gene) E3 ubiquitin ligase complex. The APC10/DOC1 domain forms a beta-sandwich structure that is related in architecture to the galactose-binding domain-like fold; their sequences are quite dissimilar, however, and are not included here.	139
176262	cd08367	P53	P53 DNA-binding domain. P53 is a tumor suppressor gene product; mutations in p53 or lack of expression are found associated with a large fraction of all human cancers. P53 is activated by DNA damage and acts as a regulator of gene expression that ultimately blocks progression through the cell cycle. P53 binds to DNA as a tetrameric transcription factor. In its inactive form, p53 is bound to the ring finger protein Mdm2, which promotes its ubiquitinylation and subsequent proteasomal degradation. Phosphorylation of p53 disrupts the Mdm2-p53 complex, while the stable and active p53 binds to regulatory regions of its target genes, such as the cyclin-kinase inhibitor p21, which complexes and inactivates cdk2 and other cyclin complexes.	179
259829	cd08368	LIM	LIM is a small protein-protein interaction domain, containing two zinc fingers. LIM domains are identified in a diverse group of proteins with a wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of the LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where x denotes any amino acid).	53
187712	cd08369	FMT_core	Formyltransferase, catalytic core domain. Formyltransferase, catalytic core domain. The proteins of this superfamily contain a formyltransferase domain that catalyzes the removal of a formyl group from its substrate as part of a multistep transfer mechanism, and this alignment model represents the catalytic core of the formyltransferase domain.  This family includes the following known members: Glycinamide Ribonucleotide Transformylase (GART), Formyl-FH4 Hydrolase, Methionyl-tRNA Formyltransferase, ArnA, and 10-Formyltetrahydrofolate Dehydrogenase (FDH).  Glycinamide Ribonucleotide Transformylase  (GART) catalyzes the third step in de novo purine biosynthesis, the transfer of a formyl group to 5'-phosphoribosylglycinamide. Formyl-FH4 Hydrolase catalyzes the hydrolysis of 10-formyltetrahydrofolate (formyl-FH4) to FH4 and formate. Methionyl-tRNA Formyltransferase transfers a formyl group onto the amino terminus of the acyl moiety of the methionyl aminoacyl-tRNA, which plays an important role in translation initiation. ArnA is required for the modification of lipid A with 4-amino-4-deoxy-l-arabinose (Ara4N) that leads to resistance to cationic antimicrobial peptides (CAMPs) and clinical antimicrobials such as polymyxin. 10-formyltetrahydrofolate dehydrogenase (FDH) catalyzes the conversion of 10-formyltetrahydrofolate, a precursor for nucleotide biosynthesis, to tetrahydrofolate. Members of this family are multidomain proteins. The formyltransferase domain is located at the N-terminus of FDH, Methionyl-tRNA Formyltransferase and ArnA, and at the C-terminus of Formyl-FH4 Hydrolase.  Prokaryotic Glycinamide Ribonucleotide Transformylase (GART) is a single domain protein while eukaryotic GART is a trifunctional protein that catalyzes the second, third and fifth steps in de novo purine biosynthesis.	173
187727	cd08370	FMT_C_like	Carboxy-terminal domain of Formyltransferase and similar domains. This family represents the C-terminal domain of formyltransferase and similar proteins. This domain is found in a variety of enzymes with formyl transferase and alkyladenine DNA glycosylase activities. The proteins with formyltransferase function include methionyl-tRNA formyltransferase, ArnA, 10-formyltetrahydrofolate dehydrogenase and HypX proteins. Although most proteins with formyl transferase activity contain this C-terminal domain, prokaryotic glycinamide ribonucleotide transformylase (GART), a single domain protein, only contains the core catalytic domain. Thus, the C-terminal domain is not required for formyl transferase catalytic activity and may be involved in substrate binding. Some members of this family have shown nucleic acid binding capacity. The C-terminal domain of methionyl-tRNA formyltransferase is involved in tRNA binding. Alkyladenine DNA glycosylase is a distant member of this family with very low sequence similarity to other members. It catalyzes the first step in base excision repair (BER) by cleaving damaged DNA bases within double-stranded DNA to produce an abasic site and shows ability to bind to DNA.	73
187740	cd08371	Lumazine_synthase-like	lumazine synthase and riboflavin synthase; involved in the riboflavin (vitamin B2) biosynthetic pathway. This superfamily contains lumazine synthase (6,7-dimethyl-8-ribityllumazine synthase, LS) and riboflavin synthase (RS). Both enzymes play important roles in the riboflavin biosynthetic pathway. Riboflavin is the precursor of flavin mononucleotide (FMN) and flavin adenine dinucleotide (FAD) which are essential cofactors for the catalysis of a wide range of redox reactions. These cofactors are also involved in many other processes involving DNA repair, circadian time-keeping, light sensing, and bioluminescence. Riboflavin is biosynthesized in plants, fungi and certain microorganisms; as animals lack the necessary enzymes to produce this vitamin, they acquire it from dietary sources. In the final steps of the riboflavin biosynthetic pathway, LS catalyzes the condensation of the 5-amino-6-ribitylamino-2,4(1H,3H)-pyrimidinedione with 3,4-dihydroxy- 2-butanone-4-phosphate to release water, inorganic phosphate and 6,7-dimethyl-8-ribityllumazine (DMRL), and RS catalyzes a dismutation of DMRL which yields riboflavin and 5-amino-6-ribitylamino-2,4(1H,3H)-pyrimidinedione. In the latter reaction, a four-carbon moiety is transferred between two DMRL molecules serving as donor and acceptor, respectively. Both the LS and RS catalyzed reactions are thermodynamically irreversible and can proceed in the absence of a catalyst. In bacteria and eukaryotes, there are two types of LS: type-I LS forms homo-pentamers or icosahedrally arranged dodecamers of pentamers, type-II LS forms decamers (dimers of pentamers). In archaea LSs and RSs appear to have diverged early in the evolution of archaea from a common ancestor.	129
197306	cd08372	EEP	Exonuclease-Endonuclease-Phosphatase (EEP) domain superfamily. This large superfamily includes the catalytic domain (exonuclease/endonuclease/phosphatase or EEP domain) of a diverse set of proteins including the ExoIII family of apurinic/apyrimidinic (AP) endonucleases, inositol polyphosphate 5-phosphatases (INPP5), neutral sphingomyelinases (nSMases), deadenylases (such as the vertebrate circadian-clock regulated nocturnin), bacterial cytolethal distending toxin B (CdtB), deoxyribonuclease 1 (DNase1), the endonuclease domain of the non-LTR retrotransposon LINE-1, and related domains. These diverse enzymes share a common catalytic mechanism of cleaving phosphodiester bonds; their substrates range from nucleic acids to phospholipids and perhaps proteins.	241
176019	cd08373	C2A_Ferlin	C2 domain first repeat in Ferlin. Ferlins are involved in vesicle fusion events.  Ferlins and other proteins, such as Synaptotagmins, are implicated in facilitating the fusion process when cell membranes fuse together.  There are six known human Ferlins: Dysferlin (Fer1L1), Otoferlin (Fer1L2), Myoferlin (Fer1L3), Fer1L4, Fer1L5, and Fer1L6.  Defects in these genes can lead to a wide range of diseases including muscular dystrophy (dysferlin), deafness (otoferlin), and infertility (fer-1, fertilization factor-1).  Structurally they have 6 tandem C2 domains, designated as (C2A-C2F) and a single C-terminal transmembrane domain, though there is a new study that disputes this and claims that there are actually 7 tandem C2 domains with another C2 domain inserted between C2D and C2E.   In a subset of them (Dysferlin, Myoferlin, and Fer1) there is an additional conserved domain called DysF. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the first C2 repeat, C2A, and has a type-II topology.	127
176020	cd08374	C2F_Ferlin	C2 domain sixth repeat in Ferlin. Ferlins are involved in vesicle fusion events.  Ferlins and other proteins, such as Synaptotagmins, are implicated in facilitating the fusion process when cell membranes fuse together.  There are six known human Ferlins: Dysferlin (Fer1L1), Otoferlin (Fer1L2), Myoferlin (Fer1L3), Fer1L4, Fer1L5, and Fer1L6.  Defects in these genes can lead to a wide range of diseases including muscular dystrophy (dysferlin), deafness (otoferlin), and infertility (fer-1, fertilization factor-1).  Structurally they have 6 tandem C2 domains, designated as (C2A-C2F) and a single C-terminal transmembrane domain, though there is a new study that disputes this and claims that there are actually 7 tandem C2 domains with another C2 domain inserted between C2D and C2E.   In a subset of them (Dysferlin, Myoferlin, and Fer1) there is an additional conserved domain called DysF. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the sixth C2 repeat, C2F, and has a type-II topology.	133
176021	cd08375	C2_Intersectin	C2 domain present in Intersectin. A single instance of the C2 domain is located C terminally in the intersectin protein.  Intersectin functions as a scaffolding protein, providing a link between the actin cytoskeleton and the components of endocytosis and plays a role in signal transduction.   In addition to C2, intersectin contains several additional domains including: Eps15 homology domains, SH3 domains, a RhoGEF domain, and a PH domain.  C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. The members here have topology I.	136
176022	cd08376	C2B_MCTP_PRT	C2 domain second repeat found in Multiple C2 domain and Transmembrane region Proteins (MCTP). MCTPs are involved in Ca2+ signaling at the membrane.  MCTP is composed of a variable N-terminal sequence, three C2 domains, two transmembrane regions (TMRs), and a short C-terminal sequence.  It is one of four protein classes that are anchored to membranes via a transmembrane region; the others being synaptotagmins, extended synaptotagmins, and ferlins. MCTPs are the only membrane-bound C2 domain proteins that contain two functional TMRs. MCTPs are unique in that they bind Ca2+ but not phospholipids. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the second C2 repeat, C2B, and has a type-II topology.	116
176023	cd08377	C2C_MCTP_PRT	C2 domain third repeat found in Multiple C2 domain and Transmembrane region Proteins (MCTP). MCTPs are involved in Ca2+ signaling at the membrane.  The cds in this family contain multiple C2 domains as well as a C-terminal PRT domain.  It is one of four protein classes that are anchored to membranes via a transmembrane region; the others being synaptotagmins, extended synaptotagmins, and ferlins. MCTPs are the only membrane-bound C2 domain proteins that contain two functional TMRs. MCTPs are unique in that they bind Ca2+ but not phospholipids. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the third C2 repeat, C2C, and has a type-II topology.	119
176024	cd08378	C2B_MCTP_PRT_plant	C2 domain second repeat found in Multiple C2 domain and Transmembrane region Proteins (MCTP); plant subset. MCTPs are involved in Ca2+ signaling at the membrane.  Plant-MCTPs are composed of a variable N-terminal sequence, four C2 domains, two transmembrane regions (TMRs), and a short C-terminal sequence.  It is one of four protein classes that are anchored to membranes via a transmembrane region; the others being synaptotagmins, extended synaptotagmins, and ferlins. MCTPs are the only membrane-bound C2 domain proteins that contain two functional TMRs. MCTPs are unique in that they bind Ca2+ but not phospholipids. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the second C2 repeat, C2B, and has a type-II topology.	121
176025	cd08379	C2D_MCTP_PRT_plant	C2 domain fourth repeat found in Multiple C2 domain and Transmembrane region Proteins (MCTP); plant subset. MCTPs are involved in Ca2+ signaling at the membrane.  Plant-MCTPs are composed of a variable N-terminal sequence, four C2 domains, two transmembrane regions (TMRs), and a short C-terminal sequence.  It is one of four protein classes that are anchored to membranes via a transmembrane region; the others being synaptotagmins, extended synaptotagmins, and ferlins. MCTPs are the only membrane-bound C2 domain proteins that contain two functional TMRs. MCTPs are unique in that they bind Ca2+ but not phospholipids. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the fourth C2 repeat, C2D, and has a type-II topology.	126
176026	cd08380	C2_PI3K_like	C2 domain present in phosphatidylinositol 3-kinases (PI3Ks). C2 domain present in all classes of PI3Ks.  PI3Ks (AKA phosphatidylinositol (PtdIns) 3-kinases) regulate cell processes such as cell growth, differentiation, proliferation, and motility.  PI3Ks work on phosphorylation of phosphatidylinositol, phosphatidylinositide (4)P (PtdIns (4)P),2 or PtdIns(4,5)P2. Specifically they phosphorylate the D3 hydroxyl group of phosphoinositol lipids on the inositol ring. There are 3 classes of PI3Ks based on structure, regulation, and specificity. All classes contain a C2 domain, a PIK domain, and a kinase catalytic domain.  In addition some PI3Ks contain a Ras-binding domain and/or a p85-binding domain.  Class II PI3Ks contain both of these as well as a PX domain, and a C-terminal C2 domain containing a nuclear localization signal.  C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.  This cd contains members with the first C2 repeat, C2A, and a type-I topology, as well as some with a single C2 repeat.	156
176027	cd08381	C2B_PI3K_class_II	C2 domain second repeat present in class II phosphatidylinositol 3-kinases (PI3Ks). There are 3 classes of PI3Ks based on structure, regulation, and specificity.  All classes contain an N-terminal C2 domain, a PIK domain, and a kinase catalytic domain. Unlike class I and class III, class II PI3Ks additionally have a PX domain and a C-terminal C2 domain containing a nuclear localization signal, both of which bind phospholipids, though in slightly different fashions.  PI3Ks (AKA phosphatidylinositol (PtdIns) 3-kinases) regulate cell processes such as cell growth, differentiation, proliferation, and motility. PI3Ks catalyze the phosphorylation of phosphatidylinositol, phosphatidylinositide (4)P (PtdIns(4)P), or PtdIns(4,5)P2. Specifically they phosphorylate the D3 hydroxyl group of phosphoinositol lipids on the inositol ring. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the second C2 repeat, C2B, and has a type-I topology.	122
176028	cd08382	C2_Smurf-like	C2 domain present in Smad ubiquitination-related factor (Smurf)-like proteins. A single C2 domain is found in Smurf proteins, C2-WW-HECT-domain E3s, which play an important role in the downregulation of the TGF-beta signaling pathway.  Smurf proteins also regulate cell shape, motility, and polarity by degrading small guanosine triphosphatases (GTPases). C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.  Members here have type-II topology.	123
176029	cd08383	C2A_RasGAP	C2 domain (first repeat) of Ras GTPase activating proteins (GAPs). RasGAPs suppress Ras function by enhancing the GTPase activity of Ras proteins resulting in the inactive GDP-bound form of Ras.  In this way they can control cellular proliferation and differentiation.  The proteins here all contain either a single C2 domain or two tandem C2 domains, a Ras-GAP domain, and a pleckstrin homology (PH)-like domain. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. Members here have a type-I topology.	117
176030	cd08384	C2B_Rabphilin_Doc2	C2 domain second repeat present in Rabphilin and Double C2 domain. Rabphilin is found in neurons and in neuroendocrine cells, while Doc2 is found not only in the brain but also in other tissues, including mast cells, chromaffin cells, and osteoblasts.  Rabphilin and Doc2s share highly homologous tandem C2 domains, although their N-terminal structures are completely different: rabphilin contains an N-terminal Rab-binding domain (RBD), whereas Doc2 contains an N-terminal Munc13-1-interacting domain (MID). C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the second C2 repeat, C2B, and has a type-I topology.	133
176031	cd08385	C2A_Synaptotagmin-1-5-6-9-10	C2A domain first repeat present in Synaptotagmins 1, 5, 6, 9, and 10. Synaptotagmin is a membrane-trafficking protein characterized by an N-terminal transmembrane region, a linker, and 2 C-terminal C2 domains. Synaptotagmin 1, a member of class 1 synaptotagmins, is located in the brain and endocranium and localized to the synaptic vesicles and secretory granules.  It functions as a Ca2+ sensor for fast exocytosis as do synaptotagmins 5, 6, and 10. It is distinguished from the other synaptotagmins by having an N-glycosylated N-terminus. Synaptotagmins 5, 6, and 10, members of class 3 synaptotagmins, are located primarily in the brain and localized to the active zone and plasma membrane.  They are distinguished from the other synaptotagmins by having disulfide bonds at their N-termini.  Synaptotagmin 6 also regulates the acrosome reaction, a unique Ca2+-regulated exocytosis, in sperm. Synaptotagmin 9, a class 5 synaptotagmin, is located in the brain and localized to the synaptic vesicles.  It is thought to be a Ca2+-sensor for dense-core vesicle exocytosis. Previously all synaptotagmins were thought to be calcium sensors in the regulation of neurotransmitter release and hormone secretion, but it has been shown that not all of them bind calcium.  Of the 17 identified synaptotagmins only 8 bind calcium (1-3, 5-7, 9, 10).  The functions of the two C2 domains that bind calcium are: regulating the fusion step of synaptic vesicle exocytosis (C2A) and binding to phosphatidyl-inositol-3,4,5-trisphosphate (PIP3) in the absence of calcium ions and to phosphatidylinositol bisphosphate (PIP2) in their presence (C2B).  C2B also regulates the recycling step of synaptic vesicles.  C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the first C2 repeat, C2A, and has a type-I topology.	124
176032	cd08386	C2A_Synaptotagmin-7	C2A domain first repeat present in Synaptotagmin 7. Synaptotagmin is a membrane-trafficking protein characterized by an N-terminal transmembrane region, a linker, and 2 C-terminal C2 domains. Synaptotagmin 7, a member of class 2 synaptotagmins, is located in presynaptic plasma membranes in neurons, dense-core vesicles in endocrine cells, and lysosomes in fibroblasts.  It has been shown to play a role in regulation of Ca2+-dependent lysosomal exocytosis in fibroblasts and may also function as a vesicular Ca2+-sensor.  It is distinguished from the other synaptotagmins by having over 12 splice forms. Previously all synaptotagmins were thought to be calcium sensors in the regulation of neurotransmitter release and hormone secretion, but it has been shown that not all of them bind calcium.  Of the 17 identified synaptotagmins only 8 bind calcium (1-3, 5-7, 9, 10).  The functions of the two C2 domains that bind calcium are: regulating the fusion step of synaptic vesicle exocytosis (C2A) and binding to phosphatidyl-inositol-3,4,5-trisphosphate (PIP3) in the absence of calcium ions and to phosphatidylinositol bisphosphate (PIP2) in their presence (C2B).  C2B also regulates the recycling step of synaptic vesicles. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the first C2 repeat, C2A, and has a type-I topology.	125
176033	cd08387	C2A_Synaptotagmin-8	C2A domain first repeat present in Synaptotagmin 8. Synaptotagmin is a membrane-trafficking protein characterized by an N-terminal transmembrane region, a linker, and 2 C-terminal C2 domains. Previously all synaptotagmins were thought to be calcium sensors in the regulation of neurotransmitter release and hormone secretion, but it has been shown that not all of them bind calcium.  Of the 17 identified synaptotagmins only 8 bind calcium (1-3, 5-7, 9, 10).  The functions of the two C2 domains that bind calcium are: regulating the fusion step of synaptic vesicle exocytosis (C2A) and binding to phosphatidyl-inositol-3,4,5-trisphosphate (PIP3) in the absence of calcium ions and to phosphatidylinositol bisphosphate (PIP2) in their presence (C2B).  C2B also regulates the recycling step of synaptic vesicles.  C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the first C2 repeat, C2A, and has a type-I topology.	124
176034	cd08388	C2A_Synaptotagmin-4-11	C2A domain first repeat present in Synaptotagmins 4 and 11. Synaptotagmin is a membrane-trafficking protein characterized by an N-terminal transmembrane region, a linker, and 2 C-terminal C2 domains.  Synaptotagmins 4 and 11, class 4 synaptotagmins, are located in the brain.  Their functions are unknown. They are distinguished from the other synaptotagmins by having an Asp to Ser substitution in their C2A domains. Previously all synaptotagmins were thought to be calcium sensors in the regulation of neurotransmitter release and hormone secretion, but it has been shown that not all of them bind calcium.  Of the 17 identified synaptotagmins only 8 bind calcium (1-3, 5-7, 9, 10).  The functions of the two C2 domains that bind calcium are: regulating the fusion step of synaptic vesicle exocytosis (C2A) and binding to phosphatidyl-inositol-3,4,5-trisphosphate (PIP3) in the absence of calcium ions and to phosphatidylinositol bisphosphate (PIP2) in their presence (C2B).  C2B also regulates the recycling step of synaptic vesicles.  C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the first C2 repeat, C2A, and has a type-I topology.	128
176035	cd08389	C2A_Synaptotagmin-14_16	C2A domain first repeat present in Synaptotagmins 14 and 16. Synaptotagmins 14 and 16 are membrane-trafficking proteins in specific tissues outside the brain.   Both of these contain C-terminal tandem C2 repeats, but only Synaptotagmin 14 has an N-terminal transmembrane domain and a putative fatty-acylation site. Previously all synaptotagmins were thought to be calcium sensors in the regulation of neurotransmitter release and hormone secretion, but it has been shown that not all of them bind calcium and this is indeed the case here.  Of the 17 identified synaptotagmins only 8 bind calcium (1-3, 5-7, 9, 10).  The functions of the two C2 domains that bind calcium are: regulating the fusion step of synaptic vesicle exocytosis (C2A) and binding to phosphatidyl-inositol-3,4,5-trisphosphate (PIP3) in the absence of calcium ions and to phosphatidylinositol bisphosphate (PIP2) in their presence (C2B).  C2B also regulates the recycling step of synaptic vesicles. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the first C2 repeat, C2A, and has a type-I topology.	124
176036	cd08390	C2A_Synaptotagmin-15-17	C2A domain first repeat present in Synaptotagmins 15 and 17. Synaptotagmin is a membrane-trafficking protein characterized by an N-terminal transmembrane region, a linker, and 2 C-terminal C2 domains. It is thought to be involved in the trafficking and exocytosis of secretory vesicles in non-neuronal tissues and is Ca2+ independent. Human synaptotagmin 15 has 2 alternatively spliced forms that encode proteins with different C-termini.  The larger, SYT15a, contains an N-terminal TM region, a putative fatty-acylation site, and 2 tandem C-terminal C2 domains.  The smaller, SYT15b, lacks the C-terminal portion of the second C2 domain.  Unlike most other synaptotagmins it is nearly absent in the brain and rather is found in the heart, lungs, skeletal muscle, and testis. Synaptotagmin 17 is located in the brain, kidney, and prostate and is thought to be a peripheral membrane protein. Previously all synaptotagmins were thought to be calcium sensors in the regulation of neurotransmitter release and hormone secretion, but it has been shown that not all of them bind calcium.  Of the 17 identified synaptotagmins only 8 bind calcium (1-3, 5-7, 9, 10).  The functions of the two C2 domains that bind calcium are: regulating the fusion step of synaptic vesicle exocytosis (C2A) and binding to phosphatidyl-inositol-3,4,5-trisphosphate (PIP3) in the absence of calcium ions and to phosphatidylinositol bisphosphate (PIP2) in their presence (C2B).  C2B also regulates the recycling step of synaptic vesicles.  C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the first C2 repeat, C2A, and has a type-I topology.	123
176037	cd08391	C2A_C2C_Synaptotagmin_like	C2 domain first and third repeat in Synaptotagmin-like proteins. Synaptotagmin is a membrane-trafficking protein characterized by an N-terminal transmembrane region, a linker, and 2 C-terminal C2 domains. Previously all synaptotagmins were thought to be calcium sensors in the regulation of neurotransmitter release and hormone secretion, but it has been shown that not all of them bind calcium.  Of the 17 identified synaptotagmins only 8 bind calcium (1-3, 5-7, 9, 10).  The functions of the two C2 domains that bind calcium are: regulating the fusion step of synaptic vesicle exocytosis (C2A) and binding to phosphatidyl-inositol-3,4,5-trisphosphate (PIP3) in the absence of calcium ions and to phosphatidylinositol bisphosphate (PIP2) in their presence (C2B).  C2B also regulates the recycling step of synaptic vesicles. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains either the first or third repeat in Synaptotagmin-like proteins with a type-I topology.	121
176038	cd08392	C2A_SLP-3	C2 domain first repeat present in Synaptotagmin-like protein 3. All Slp members basically share an N-terminal Slp homology domain (SHD) and C-terminal tandem C2 domains (named the C2A domain and the C2B domain) with the SHD and C2 domains being separated by a linker sequence of variable length. The SHD of Slps (except for the Slp4-SHD) functions as a specific Rab27A/B-binding domain.  In addition to Slps, rabphilin, Noc2, and Munc13-4 also function as Rab27-binding proteins. Little is known about the expression or localization of Slp3.  The C2A domain of Slp3 is Ca2+ dependent.  It has been demonstrated that Slp3 promotes dense-core vesicle exocytosis.  C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.  This cd contains the first C2 repeat, C2A, and has a type-I topology.	128
176039	cd08393	C2A_SLP-1_2	C2 domain first repeat present in Synaptotagmin-like proteins 1 and 2. All Slp members basically share an N-terminal Slp homology domain (SHD) and C-terminal tandem C2 domains (named the C2A domain and the C2B domain) with the SHD and C2 domains being separated by a linker sequence of variable length.  Slp1/JFC1 and Slp2/exophilin 4 promote granule docking to the plasma membrane.  Additionally, their C2A domains are both Ca2+ independent, unlike Slp3 and Slp4/granuphilin which are Ca2+ dependent.  It is thought that SHD (except for the Slp4-SHD) functions as a specific Rab27A/B-binding domain.  In addition to Slps, rabphilin, Noc2, and Munc13-4 also function as Rab27-binding proteins.  C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.  This cd contains the first C2 repeat, C2A, and has a type-I topology.	125
176040	cd08394	C2A_Munc13	C2 domain first repeat in Munc13 (mammalian uncoordinated) proteins. C2-like domains are thought to be involved in phospholipid binding in a Ca2+ independent manner in both Unc13 and Munc13. Caenorhabditis elegans Unc13 has a central domain with sequence similarity to PKC, which includes C1 and C2-related domains. Unc13 binds phorbol esters and DAG with high affinity in a phospholipid-dependent manner.  Mutations in Unc13 result in abnormal neuronal connections and impairment in cholinergic neurotransmission in the nematode.  Munc13 proteins are the mammalian homologs, which are expressed in the brain.  There are 3 isoforms (Munc13-1, -2, -3), which are thought to play a role in neurotransmitter release and are hypothesized to be high-affinity receptors for phorbol esters.  Unc13 and Munc13 contain both C1 and C2 domains.  There are two C2 related domains present, one central and one at the carboxyl end.  Munc13-1 contains a third C2-like domain.  Munc13 interacts with syntaxin, synaptobrevin, and synaptotagmin, suggesting a role for these as scaffolding proteins. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.  This cd contains the first C2 repeat, C2A, and has a type-II topology.	127
176041	cd08395	C2C_Munc13	C2 domain third repeat in Munc13 (mammalian uncoordinated) proteins. C2-like domains are thought to be involved in phospholipid binding in a Ca2+ independent manner in both Unc13 and Munc13. Caenorhabditis elegans Unc13 has a central domain with sequence similarity to PKC, which includes C1 and C2-related domains. Unc13 binds phorbol esters and DAG with high affinity in a phospholipid-dependent manner.  Mutations in Unc13 result in abnormal neuronal connections and impairment in cholinergic neurotransmission in the nematode.  Munc13 proteins are the mammalian homologs, which are expressed in the brain.  There are 3 isoforms (Munc13-1, -2, -3), which are thought to play a role in neurotransmitter release and are hypothesized to be high-affinity receptors for phorbol esters.  Unc13 and Munc13 contain both C1 and C2 domains.  There are two C2 related domains present, one central and one at the carboxyl end.  Munc13-1 contains a third C2-like domain.  Munc13 interacts with syntaxin, synaptobrevin, and synaptotagmin, suggesting a role for these as scaffolding proteins. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.  This cd contains the third C2 repeat, C2C, and has a type-II topology.	120
176042	cd08397	C2_PI3K_class_III	C2 domain present in class III phosphatidylinositol 3-kinases (PI3Ks). PI3Ks (AKA phosphatidylinositol (PtdIns) 3-kinases) regulate cell processes such as cell growth, differentiation, proliferation, and motility.  PI3Ks catalyze the phosphorylation of phosphatidylinositol, phosphatidylinositide (4)P (PtdIns(4)P), or PtdIns(4,5)P2. Specifically they phosphorylate the D3 hydroxyl group of phosphoinositol lipids on the inositol ring. There are 3 classes of PI3Ks based on structure, regulation, and specificity. All classes contain a C2 domain, a PIK domain, and a kinase catalytic domain.  These are the only domains identified in the class III PI3Ks present in this cd. In addition some PI3Ks contain a Ras-binding domain and/or a p85-binding domain. Class II PI3Ks contain both of these as well as a PX domain, and a C-terminal C2 domain containing a nuclear localization signal. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the first C2 repeat, C2A, and has a type-I topology.	159
176043	cd08398	C2_PI3K_class_I_alpha	C2 domain present in class I alpha phosphatidylinositol 3-kinases (PI3Ks). PI3Ks (AKA phosphatidylinositol (PtdIns) 3-kinases) regulate cell processes such as cell growth, differentiation, proliferation, and motility.  PI3Ks catalyze the phosphorylation of phosphatidylinositol, phosphatidylinositide (4)P (PtdIns(4)P), or PtdIns(4,5)P2. Specifically they phosphorylate the D3 hydroxyl group of phosphoinositol lipids on the inositol ring. There are 3 classes of PI3Ks based on structure, regulation, and specificity. All classes contain a C2 domain, a PIK domain, and a kinase catalytic domain.  The members here are class I, alpha isoform PI3Ks and contain both a Ras-binding domain and a p85-binding domain.  Class II PI3Ks contain both of these as well as a PX domain, and a C-terminal C2 domain containing a nuclear localization signal.  C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.  Members have a type-I topology.	158
176044	cd08399	C2_PI3K_class_I_gamma	C2 domain present in class I gamma phosphatidylinositol 3-kinases (PI3Ks). PI3Ks (AKA phosphatidylinositol (PtdIns) 3-kinases) regulate cell processes such as cell growth, differentiation, proliferation, and motility.  PI3Ks work on phosphorylation of phosphatidylinositol, phosphatidylinositide (4)P (PtdIns (4)P),2 or PtdIns(4,5)P2. Specifically they phosphorylate the D3 hydroxyl group of phosphoinositol lipids on the inositol ring. There are 3 classes of PI3Ks based on structure, regulation, and specificity. All classes contain a C2 domain, a PIK domain, and a kinase catalytic domain. The members here are class I, gamma isoform PI3Ks and contain both a Ras-binding domain and a p85-binding domain. Class II PI3Ks contain both of these as well as a PX domain, and a C-terminal C2 domain containing a nuclear localization signal.  C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. Members have a type-I topology.	178
176045	cd08400	C2_Ras_p21A1	C2 domain present in RAS p21 protein activator 1 (RasA1). RasA1 is a GAP1 (GTPase activating protein 1), a Ras-specific GAP member, which suppresses Ras function by enhancing the GTPase activity of Ras proteins resulting in the inactive GDP-bound form of Ras.  In this way it can control cellular proliferation and differentiation.  RasA1 contains a C2 domain,  a Ras-GAP domain, a pleckstrin homology (PH)-like domain, a SH3 domain, and 2 SH2 domains. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. Members here have a type-I topology.	126
176046	cd08401	C2A_RasA2_RasA3	C2 domain first repeat present in RasA2 and RasA3. RasA2 and RasA3 are GAP1s (GTPase activating protein 1s), Ras-specific GAP members, which suppress Ras function by enhancing the GTPase activity of Ras proteins resulting in the inactive GDP-bound form of Ras. In this way they can control cellular proliferation and differentiation.  RasA2 and RasA3 are both inositol 1,3,4,5-tetrakisphosphate-binding proteins and contain an N-terminal C2 domain, a Ras-GAP domain, a pleckstrin-homology (PH) domain which localizes them to the plasma membrane, and a Bruton's Tyrosine Kinase (BTK) zinc-binding domain. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the first C2 repeat, C2A, and has a type-I topology.	121
176047	cd08402	C2B_Synaptotagmin-1	C2 domain second repeat present in Synaptotagmin 1. Synaptotagmin is a membrane-trafficking protein characterized by an N-terminal transmembrane region, a linker, and 2 C-terminal C2 domains.  Synaptotagmin 1, a member of the class 1 synaptotagmins, is located in the brain and endocranium and localized to the synaptic vesicles and secretory granules.  It functions as a Ca2+ sensor for fast exocytosis. It, like synaptotagmin-2, has an N-glycosylated N-terminus. Synaptotagmin 4, a member of class 4 synaptotagmins, is located in the brain.  Its functions are unknown. It, like synaptotagmin-11, has an Asp to Ser substitution in its C2A domain. Previously all synaptotagmins were thought to be calcium sensors in the regulation of neurotransmitter release and hormone secretion, but it has been shown that not all of them bind calcium.  Of the 17 identified synaptotagmins only 8 bind calcium (1-3, 5-7, 9, 10).  The functions of the two C2 domains that bind calcium are: regulating the fusion step of synaptic vesicle exocytosis (C2A) and  binding to phosphatidyl-inositol-3,4,5-triphosphate (PIP3) in the absence of calcium ions and to phosphatidylinositol bisphosphate (PIP2) in their presence (C2B).  C2B also regulates the recycling step of synaptic vesicles. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  
However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the second C2 repeat, C2B, and has a type-I topology.	136
176048	cd08403	C2B_Synaptotagmin-3-5-6-9-10	C2 domain second repeat present in Synaptotagmins 3, 5, 6, 9, and 10. Synaptotagmin is a membrane-trafficking protein characterized by a N-terminal transmembrane region, a linker, and 2 C-terminal C2 domains. Synaptotagmin 3, a member of class 3 synaptotagmins, is located in the brain and localized to the active zone and plasma membrane.  It functions as a Ca2+ sensor for fast exocytosis. It, along with synaptotagmins 5,6, and 10, has disulfide bonds at its N-terminus. Synaptotagmin 9, a class 5 synaptotagmins, is located in the brain and localized to the synaptic vesicles.  It is thought to be a Ca2+-sensor for dense-core vesicle exocytosis. Previously all synaptotagmins were thought to be calcium sensors in the regulation of neurotransmitter release and hormone secretion, but it has been shown that not all of them bind calcium.  Of the 17 identified synaptotagmins only 8 bind calcium (1-3, 5-7, 9, 10).  The function of the two C2 domains that bind calcium are: regulating the fusion step of synaptic vesicle exocytosis (C2A) and  binding to phosphatidyl-inositol-3,4,5-triphosphate (PIP3) in the absence of calcium ions and to phosphatidylinositol bisphosphate (PIP2) in their presence (C2B).  C2B also regulates also the recycling step of synaptic vesicles. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  
However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the second C2 repeat, C2B, and has a type-I topology.	134
176049	cd08404	C2B_Synaptotagmin-4	C2 domain second repeat present in Synaptotagmin 4. Synaptotagmin is a membrane-trafficking protein characterized by an N-terminal transmembrane region, a linker, and 2 C-terminal C2 domains.  Synaptotagmin 4, a member of class 4 synaptotagmins, is located in the brain.  Its functions are unknown. It, like synaptotagmin-11, has an Asp to Ser substitution in its C2A domain. Previously all synaptotagmins were thought to be calcium sensors in the regulation of neurotransmitter release and hormone secretion, but it has been shown that not all of them bind calcium.  Of the 17 identified synaptotagmins only 8 bind calcium (1-3, 5-7, 9, 10).  The functions of the two C2 domains that bind calcium are: regulating the fusion step of synaptic vesicle exocytosis (C2A) and  binding to phosphatidyl-inositol-3,4,5-triphosphate (PIP3) in the absence of calcium ions and to phosphatidylinositol bisphosphate (PIP2) in their presence (C2B).  C2B also regulates the recycling step of synaptic vesicles.  C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. 
This cd contains the second C2 repeat, C2B, and has a type-I topology.	136
176050	cd08405	C2B_Synaptotagmin-7	C2 domain second repeat present in Synaptotagmin 7. Synaptotagmin is a membrane-trafficking protein characterized by a N-terminal transmembrane region, a linker, and 2 C-terminal C2 domains. Synaptotagmin 7, a member of class 2 synaptotagmins, is located in presynaptic plasma membranes in neurons, dense-core vesicles in endocrine cells, and lysosomes in fibroblasts.  It has been shown to play a role in regulation of Ca2+-dependent lysosomal exocytosis in fibroblasts and may also function as a vesicular Ca2+-sensor.  It is distinguished from the other synaptotagmins by having over 12 splice forms. Previously all synaptotagmins were thought to be calcium sensors in the regulation of neurotransmitter release and hormone secretion, but it has been shown that not all of them bind calcium.  Of the 17 identified synaptotagmins only 8 bind calcium (1-3, 5-7, 9, 10).  The function of the two C2 domains that bind calcium are: regulating the fusion step of synaptic vesicle exocytosis (C2A) and  binding to phosphatidyl-inositol-3,4,5-triphosphate (PIP3) in the absence of calcium ions and to phosphatidylinositol bisphosphate (PIP2) in their presence (C2B).  C2B also regulates also the recycling step of synaptic vesicles. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  
However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the second C2 repeat, C2B, and has a type-I topology.	136
176051	cd08406	C2B_Synaptotagmin-12	C2 domain second repeat present in Synaptotagmin 12. Synaptotagmin is a membrane-trafficking protein characterized by an N-terminal transmembrane region, a linker, and 2 C-terminal C2 domains. Synaptotagmin 12, a member of class 6 synaptotagmins, is located in the brain.  Its functions are unknown. It, like synaptotagmins 8 and 13, does not have any consensus Ca2+ binding sites. Previously all synaptotagmins were thought to be calcium sensors in the regulation of neurotransmitter release and hormone secretion, but it has been shown that not all of them bind calcium.  Of the 17 identified synaptotagmins only 8 bind calcium (1-3, 5-7, 9, 10).  The functions of the two C2 domains that bind calcium are: regulating the fusion step of synaptic vesicle exocytosis (C2A) and  binding to phosphatidyl-inositol-3,4,5-triphosphate (PIP3) in the absence of calcium ions and to phosphatidylinositol bisphosphate (PIP2) in their presence (C2B).  C2B also regulates the recycling step of synaptic vesicles. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. 
This cd contains the second C2 repeat, C2B, and has a type-I topology.	136
176052	cd08407	C2B_Synaptotagmin-13	C2 domain second repeat present in Synaptotagmin 13. Synaptotagmin is a membrane-trafficking protein characterized by an N-terminal transmembrane region, a linker, and 2 C-terminal C2 domains. Synaptotagmin 13, a member of class 6 synaptotagmins, is located in the brain.  Its functions are unknown. It, like synaptotagmins 8 and 12, does not have any consensus Ca2+ binding sites. Previously all synaptotagmins were thought to be calcium sensors in the regulation of neurotransmitter release and hormone secretion, but it has been shown that not all of them bind calcium.  Of the 17 identified synaptotagmins only 8 bind calcium (1-3, 5-7, 9, 10).  The functions of the two C2 domains that bind calcium are: regulating the fusion step of synaptic vesicle exocytosis (C2A) and  binding to phosphatidyl-inositol-3,4,5-triphosphate (PIP3) in the absence of calcium ions and to phosphatidylinositol bisphosphate (PIP2) in their presence (C2B).  C2B also regulates the recycling step of synaptic vesicles. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. 
This cd contains the second C2 repeat, C2B, and has a type-I topology.	138
176053	cd08408	C2B_Synaptotagmin-14_16	C2 domain second repeat present in Synaptotagmins 14 and 16. Synaptotagmin 14 and 16 are membrane-trafficking proteins in specific tissues outside the brain.   Both of these contain C-terminal tandem C2 repeats, but only Synaptotagmin 14 has an N-terminal transmembrane domain and a putative fatty-acylation site. Previously all synaptotagmins were thought to be calcium sensors in the regulation of neurotransmitter release and hormone secretion, but it has been shown that not all of them bind calcium and this is indeed the case here.  Of the 17 identified synaptotagmins only 8 bind calcium (1-3, 5-7, 9, 10).  The function of the two C2 domains that bind calcium are: regulating the fusion step of synaptic vesicle exocytosis (C2A) and  binding to phosphatidyl-inositol-3,4,5-triphosphate (PIP3) in the absence of calcium ions and to phosphatidylinositol bisphosphate (PIP2) in their presence (C2B).  C2B also regulates also the recycling step of synaptic vesicles. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the second C2 repeat, C2B, and has a type-I topology.	138
176054	cd08409	C2B_Synaptotagmin-15	C2 domain second repeat present in Synaptotagmin 15. Synaptotagmin is a membrane-trafficking protein characterized by a N-terminal transmembrane region, a linker, and 2 C-terminal C2 domains. It is thought to be involved in the trafficking and exocytosis of secretory vesicles in non-neuronal tissues and is Ca2+ independent. Human synaptotagmin 15 has 2 alternatively spliced forms that encode proteins with different C-termini.  The larger, SYT15a, contains a N-terminal TM region, a putative fatty-acylation site, and 2 tandem C terminal C2 domains.  The smaller, SYT15b, lacks the C-terminal portion of the second C2 domain.  Unlike most other synaptotagmins it is nearly absent in the brain and rather is found in the heart, lungs, skeletal muscle, and testis.  Previously all synaptotagmins were thought to be calcium sensors in the regulation of neurotransmitter release and hormone secretion, but it has been shown that not all of them bind calcium.  Of the 17 identified synaptotagmins only 8 bind calcium (1-3, 5-7, 9, 10).  The function of the two C2 domains that bind calcium are: regulating the fusion step of synaptic vesicle exocytosis (C2A) and  binding to phosphatidyl-inositol-3,4,5-triphosphate (PIP3) in the absence of calcium ions and to phosphatidylinositol bisphosphate (PIP2) in their presence (C2B).  C2B also regulates also the recycling step of synaptic vesicles. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  
Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the second C2 repeat, C2B, and has a type-I topology.	137
176055	cd08410	C2B_Synaptotagmin-17	C2 domain second repeat present in Synaptotagmin 17. Synaptotagmin is a membrane-trafficking protein characterized by a N-terminal transmembrane region, a linker, and 2 C-terminal C2 domains. Synaptotagmin 17 is located in the brain, kidney, and prostate and is thought to be a peripheral membrane protein. Previously all synaptotagmins were thought to be calcium sensors in the regulation of neurotransmitter release and hormone secretion, but it has been shown that not all of them bind calcium.  Of the 17 identified synaptotagmins only 8 bind calcium (1-3, 5-7, 9, 10).  The function of the two C2 domains that bind calcium are: regulating the fusion step of synaptic vesicle exocytosis (C2A) and  binding to phosphatidyl-inositol-3,4,5-triphosphate (PIP3) in the absence of calcium ions and to phosphatidylinositol bisphosphate (PIP2) in their presence (C2B).  C2B also regulates also the recycling step of synaptic vesicles.  C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the second C2 repeat, C2B, and has a type-I topology.	135
176103	cd08411	PBP2_OxyR	The C-terminal substrate-binding domain of the LysR-type transcriptional regulator OxyR, a member of the type 2 periplasmic binding fold protein superfamily. OxyR senses hydrogen peroxide and is activated through the formation of an intramolecular disulfide bond. The OxyR activation induces the transcription of genes necessary for the bacterial defense against oxidative stress. The OxyR of LysR-type transcriptional regulator family is composed of two functional domains joined by a linker helix involved in oligomerization: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal substrate-binding domain, which is structurally homologous to the type 2 periplasmic binding proteins. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcriptional repressor undergoes a conformational change upon substrate binding which in turn changes the DNA binding affinity of the repressor.  The C-terminal domain also contains the redox-active cysteines that mediate the redox-dependent conformational switch. Thus, the interaction between the OxyR-tetramer and DNA is notably different between the oxidized and reduced forms. The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	200
176104	cd08412	PBP2_PAO1_like	The C-terminal substrate-binding domain of putative LysR-type transcriptional regulator PAO1-like, a member of the type 2 periplasmic binding fold protein superfamily. This family includes the C-terminal substrate domain of a putative LysR-type transcriptional regulator from the plant pathogen Pseudomonas aeruginosa PAO1 and its closely related homologs. The LysR-type transcriptional regulators (LTTRs) are composed of two functional domains joined by a linker helix involved in oligomerization: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal substrate-binding domain, which is structurally homologous to the type 2 periplasmic binding proteins. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcriptional repressor undergoes a conformational change upon substrate binding which in turn changes the DNA binding affinity of the repressor.  The genes controlled by the LTTRs have diverse functional roles including amino acid biosynthesis, CO2 fixation, antibiotic resistance, degradation of aromatic compounds, nodule formation of N2 fixing bacteria, and synthesis of virulence factors, to name a few. The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.  
Besides transport proteins, the PBP2 superfamily includes the substrate-binding domains from ionotropic glutamate receptors, LysR-like transcriptional regulators, and unorthodox sensor proteins involved in signal transduction.	198
176105	cd08413	PBP2_CysB_like	The C-terminal substrate domain of LysR-type transcriptional regulators CysB-like contains type 2 periplasmic binding fold. CysB is a transcriptional activator of genes involved in sulfate and thiosulfate transport, sulfate reduction, and cysteine synthesis. In Escherichia coli, the regulation of transcription in response to sulfur source is attributed to two transcriptional regulators, CysB and Cbl. CysB, in association with Cbl, downregulates the expression of ssuEADCB operon which is required for the utilization of sulfur from aliphatic sulfonates, in the presence of cysteine. Also, Cbl and CysB together directly function as transcriptional activators of tauABCD genes, which are required for utilization of taurine as sulfur source for growth. Like many other members of the LTTR family, CysB is composed of two functional domains joined by a linker helix involved in oligomerization: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal substrate-binding domain, which is structurally homologous to the type 2 periplasmic binding proteins. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcriptional repressor undergoes a conformational change upon substrate binding which in turn changes the DNA binding affinity of the repressor.  The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. 
This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.  Besides transport proteins, the PBP2 superfamily includes the substrate-binding domains from ionotropic glutamate receptors, LysR-like transcriptional regulators, and unorthodox sensor proteins involved in signal transduction.	198
176106	cd08414	PBP2_LTTR_aromatics_like	The C-terminal substrate binding domain of LysR-type transcriptional regulators involved in the catabolism of aromatic compounds and that of other related regulators, contains type 2 periplasmic binding fold. This CD includes the C-terminal substrate binding domain of LTTRs involved in degradation of aromatic compounds, such as CbnR, BenM, CatM, ClcR and TfdR, as well as that of other transcriptional regulators clustered together in phylogenetic trees, including XapR, HcaR, MprR, IlvR, BudR, AlsR, LysR, and OccR. The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.  Besides transport proteins, the PBP2 superfamily includes the substrate-binding domains from ionotropic glutamate receptors, LysR-like transcriptional regulators, and unorthodox sensor proteins involved in signal transduction.	197
176107	cd08415	PBP2_LysR_opines_like	The C-terminal substrate-domain of LysR-type transcriptional regulators involved in the catabolism of opines and that of related regulators, contains the type 2 periplasmic binding fold. This CD includes the C-terminal substrate-domain of LysR-type transcriptional regulators, OccR and NocR, involved in the catabolism of opines and that of LysR for lysine biosynthesis which clustered together in phylogenetic trees. Opines, such as octopine and nopaline, are low molecular weight compounds found in plant crown gall tumors that are produced by the parasitic bacterium Agrobacterium. There are at least 30 different opines identified so far. Opines are utilized by tumor-colonizing bacteria as a source of carbon, nitrogen, and energy. NocR and OccR belong to the family of LysR-type transcriptional regulators that positively regulates the catabolism of nopaline and octopine, respectively. Both nopaline and octopine are arginine derivatives. In Agrobacterium tumefaciens, NocR regulates expression of the divergently transcribed nocB and nocR genes of the nopaline catabolism (noc) region.  OccR protein activates the occQ operon of the Ti plasmid in response to octopine. This operon encodes proteins required for the uptake and catabolism of octopine. The occ operon also encodes the TraR protein, which is a quorum-sensing transcriptional regulator of the Ti plasmid tra regulon.  LysR is the transcriptional activator of lysA gene encoding diaminopimelate decarboxylase, an enzyme that catalyses the decarboxylation of diaminopimelate to produce lysine. This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	196
176108	cd08416	PBP2_MdcR	The C-terminal substrate-binding domain of LysR-type transcriptional regulator MdcR, which is involved in malonate catabolism, contains the type 2 periplasmic binding fold. This family includes the C-terminal substrate binding domain of LysR-type transcriptional regulator (LTTR) MdcR that controls the expression of the malonate decarboxylase (mdc) genes. Like other members of the LTTRs, MdcR is a positive regulatory protein for its target promoter and is composed of two functional domains joined by a linker helix involved in oligomerization: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal substrate-binding domain, which is structurally homologous to the type 2 periplasmic binding proteins (PBP2). The PBP2 are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.  Besides transport proteins, the PBP2 superfamily includes the substrate-binding domains from ionotropic glutamate receptors, LysR-like transcriptional regulators, and unorthodox sensor proteins involved in signal transduction.	199
176109	cd08417	PBP2_Nitroaromatics_like	The C-terminal substrate binding domain of LysR-type transcriptional regulators that are involved in the catabolism of nitroaromatic/naphthalene compounds and that of related regulators; contains the type 2 periplasmic binding fold. This CD includes the C-terminal substrate binding domain of LysR-type transcriptional regulators involved in the catabolism of dinitrotoluene and similar compounds, such as DntR, NahR, and LinR. The transcription of the genes encoding enzymes involved in such degradation is regulated and expression of these enzymes is enhanced by inducers, which are either an intermediate in the metabolic pathway or compounds to be degraded. Also included are related LysR-type regulators clustered together in phylogenetic trees, including NodD, ToxR, LeuO, SyrM, TdcA, and PnbR. This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	200
176110	cd08418	PBP2_TdcA	The C-terminal substrate binding domain of LysR-type transcriptional regulator TdcA, which is involved in the degradation of L-serine and L-threonine, contains the type 2 periplasmic binding fold. TdcA, a member of the LysR family, activates the expression of the anaerobically-regulated tdcABCDEFG operon which is involved in the degradation of L-serine and L-threonine to acetate and propionate, respectively. The tdc operon is comprised of one regulatory gene tdcA and six structural genes, tdcB to tdcG. The expression of the tdc operon is affected by several transcription factors including the cAMP receptor protein (CRP), integration host factor (IHF), histone-like protein (HU), and the operon specific regulators TdcA and TdcR. TdcR is divergently transcribed from the operon and encodes a small protein that is required for efficient expression of the Escherichia coli tdc operon.  This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	201
176111	cd08419	PBP2_CbbR_RubisCO_like	The C-terminal substrate binding domain of LysR-type transcriptional regulator (CbbR) of RubisCO operon, which is involved in the carbon dioxide fixation, contains the type 2 periplasmic binding fold. CbbR, a LysR-type transcriptional regulator, is required to activate expression of RubisCO, one of two unique enzymes in the Calvin-Benson-Bassham (CBB) cycle pathway. All plants, cyanobacteria, and many autotrophic bacteria use the CBB cycle to fix carbon dioxide. Thus, this cycle plays an essential role in assimilating CO2 into organic carbon on earth. The key CBB cycle enzyme is ribulose 1,5-bisphosphate carboxylase/oxygenase (RubisCO), which catalyzes the actual CO2 fixation reaction. The CO2 concentration affects the expression of RubisCO genes.  It has also been shown that NADPH enhances the DNA-binding ability of the CbbR. RubisCO is composed of eight large (CbbL) and eight small subunits (CbbS).  The topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	197
176112	cd08420	PBP2_CysL_like	C-terminal substrate binding domain of LysR-type transcriptional regulator CysL, which activates the transcription of the cysJI operon encoding sulfite reductase, contains the type 2 periplasmic binding fold. CysL, also known as YwfK, is a regulator of sulfur metabolism in Bacillus subtilis. Sulfur is required for the synthesis of proteins and essential cofactors in all living organisms. Sulfur can be assimilated either from inorganic sources (sulfate and thiosulfate), or from organic sources (sulfate esters, sulfamates, and sulfonates). CysL activates the transcription of the cysJI operon encoding sulfite reductase, which reduces sulfite to sulfide. Both cysL mutant and cysJI mutant are unable to grow using sulfate or sulfite as the sulfur source. Like other LysR-type regulators, CysL also negatively regulates its own transcription. In Escherichia coli, three LysR-type activators are involved in the regulation of sulfur metabolism: CysB, Cbl and MetR.  The topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	201
176113	cd08421	PBP2_LTTR_like_1	The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator, contains the type 2 periplasmic binding fold. LysR-transcriptional regulators comprise the largest family of prokaryotic transcription factors. Homologs of some of LTTRs with similar domain organizations are also found in the archaea and eukaryotic organisms. The LTTRs are composed of two functional domains joined by a linker helix involved in oligomerization: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal substrate-binding domain, which is structurally homologous to the type 2 periplasmic binding proteins. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcriptional repressor undergoes a conformational change upon substrate binding which in turn changes the DNA binding affinity of the repressor.  The genes controlled by the LTTRs have diverse functional roles including amino acid biosynthesis, CO2 fixation, antibiotic resistance, degradation of aromatic compounds, nodule formation of nitrogen-fixing bacteria, and synthesis of virulence factors, to name a few.  This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	198
176114	cd08422	PBP2_CrgA_like	The C-terminal substrate binding domain of LysR-type transcriptional regulator CrgA and its related homologs, contains the type 2 periplasmic binding domain. This CD includes the substrate binding domain of LysR-type transcriptional regulator (LTTR) CrgA and its related homologs. The LTTRs act as both auto-repressors and activators of target promoters, controlling operons involved in a wide variety of cellular processes such as amino acid biosynthesis, CO2 fixation, antibiotic resistance, degradation of aromatic compounds, nodule formation of nitrogen-fixing bacteria, and synthesis of virulence factors, to name a few. In contrast to the tetrameric form of other LTTRs, CrgA from Neisseria meningitidis assembles into an octameric ring, which can bind up to four 63-bp DNA oligonucleotides. Phylogenetic cluster analysis further showed that the CrgA-like regulators form a subclass of the LTTRs that function as octamers. The CrgA is an auto-repressor of its own gene and activates the expression of the mdaB gene, which codes for an NADPH-quinone reductase; its action is increased by MBL (alpha-methylene-gamma-butyrolactone), an inducer of NADPH-quinone oxidoreductase.  The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	197
176115	cd08423	PBP2_LTTR_like_6	The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator, contains the type 2 periplasmic binding fold. LysR-transcriptional regulators comprise the largest family of prokaryotic transcription factors. Homologs of some of LTTRs with similar domain organizations are also found in the archaea and eukaryotic organisms. The LTTRs are composed of two functional domains joined by a linker helix involved in oligomerization: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal substrate-binding domain, which is structurally homologous to the type 2 periplasmic binding proteins. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcriptional repressor undergoes a conformational change upon substrate binding which in turn changes the DNA binding affinity of the repressor.  The genes controlled by the LTTRs have diverse functional roles including amino acid biosynthesis, CO2 fixation, antibiotic resistance, degradation of aromatic compounds, nodule formation of nitrogen-fixing bacteria, and synthesis of virulence factors, to name a few.  This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	200
176116	cd08425	PBP2_CynR	The C-terminal substrate-binding domain of the LysR-type transcriptional regulator CynR, contains the type 2 periplasmic binding fold. CynR is a LysR-like transcriptional regulator of the cyn operon, which encodes genes that allow cyanate to be used as a sole source of nitrogen. The operon includes three genes in the following order: cynT (cyanate permease), cynS (cyanase), and cynX (a protein of unknown function).  CynR negatively regulates its own expression independently of cyanate. CynR binds to DNA and induces bending of DNA in the presence or absence of cyanate, but the amount of bending is decreased by cyanate. The CynR of LysR-type transcriptional regulator family is composed of two functional domains joined by a linker helix involved in oligomerization: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal substrate-binding domain, which is structurally homologous to the type 2 periplasmic binding proteins (PBP2). The PBP2 are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	197
176117	cd08426	PBP2_LTTR_like_5	The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator, contains the type 2 periplasmic binding fold. LysR-transcriptional regulators comprise the largest family of prokaryotic transcription factors. Homologs of some of LTTRs with similar domain organizations are also found in the archaea and eukaryotic organisms. The LTTRs are composed of two functional domains joined by a linker helix involved in oligomerization: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal substrate-binding domain, which is structurally homologous to the type 2 periplasmic binding proteins. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcriptional repressor undergoes a conformational change upon substrate binding which in turn changes the DNA binding affinity of the repressor.  The genes controlled by the LTTRs have diverse functional roles including amino acid biosynthesis, CO2 fixation, antibiotic resistance, degradation of aromatic compounds, nodule formation of nitrogen-fixing bacteria, and synthesis of virulence factors, to name a few.  This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	199
176118	cd08427	PBP2_LTTR_like_2	The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator, contains the type 2 periplasmic binding fold. LysR-transcriptional regulators comprise the largest family of prokaryotic transcription factors. Homologs of some of LTTRs with similar domain organizations are also found in the archaea and eukaryotic organisms. The LTTRs are composed of two functional domains joined by a linker helix involved in oligomerization: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal substrate-binding domain, which is structurally homologous to the type 2 periplasmic binding proteins. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcriptional repressor undergoes a conformational change upon substrate binding which in turn changes the DNA binding affinity of the repressor.  The genes controlled by the LTTRs have diverse functional roles including amino acid biosynthesis, CO2 fixation, antibiotic resistance, degradation of aromatic compounds, nodule formation of nitrogen-fixing bacteria, and synthesis of virulence factors, to name a few.  This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	195
176119	cd08428	PBP2_IciA_ArgP	The C-terminal substrate binding domain of LysR-type transcriptional regulator, ArgP (IciA), for arginine exporter (ArgO); contains the type 2 periplasmic binding fold. The inhibitor of chromosomal replication (iciA) protein encoded by Mycobacterium tuberculosis, which is implicated in chromosome replication initiation in vitro, has been identified as arginine permease (ArgP), a LysR-type transcriptional regulator for arginine outward transport, based on the same amino acid sequence and similar DNA binding targets. ArgP has been shown to regulate various targets including DnaA (replication), ArgO (arginine export), dapB (lysine biosynthesis), and gdhA (glutamate biosynthesis). With abundant nutrition, ArgP activates the DnaA gene (to increase replication) and the ArgO (to export redundant molecules). However, when nutrition supply is limited, it is suggested that ArgP might function as an inhibitor of chromosome replication in order to slow replication. This substrate-binding domain has significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	195
176120	cd08429	PBP2_NhaR	The C-terminal substrate binding domain of LysR-type transcriptional activator of the nhaA gene, encoding Na+/H+ antiporter, contains the type 2 periplasmic binding fold. NhaR is a positive regulator of the LysR family and is known to be an activator of the nhaA gene encoding a Na(+)/H(+) antiporter. In Escherichia coli, NhaA is the vital antiporter that protects against high sodium stress, and it is essential for growth in high sodium levels, while NhaB becomes essential only if NhaA is not available. The nhaA gene of nhaAR operon is induced by monovalent cations. The nhaR of the operon activates nhaAR, as well as the osmC transcription which is induced at elevated osmolarity. The osmC gene is transcribed from two overlapping promoters (osmCp1 and osmCp2), and NhaR is shown to activate only the expression of osmCp1. NhaR also activates the transcription of the pgaABCD operon which is required for production of the biofilm adhesin, poly-beta-1,6-N-acetyl-d-glucosamine (PGA). Thus, it is suggested that NhaR has an extended role in promoting bacterial survival. This substrate-binding domain has significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	204
176121	cd08430	PBP2_IlvY	The C-terminal substrate binding domain of LysR-type transcriptional regulator IlvY, which activates the expression of the ilvC gene encoding acetohydroxy acid isomeroreductase for the biosynthesis of branched-chain amino acids; contains the type 2 periplasmic binding fold. In Escherichia coli, IlvY is required for the regulation of ilvC gene expression that encodes acetohydroxy acid isomeroreductase (AHIR), a key enzyme in the biosynthesis of branched-chain amino acids (isoleucine, valine, and leucine). The ilvGMEDA operon genes encode remaining enzyme activities required for the biosynthesis of these amino acids. Activation of ilvC transcription by IlvY requires the additional binding of a co-inducer molecule (either alpha-acetolactate or alpha-acetohydroxybutyrate, the substrates for AHIR) to a preformed complex of IlvY protein-DNA.  Like many other LysR-family members, IlvY negatively auto-regulates the transcription of its own divergently transcribed ilvY gene in an inducer-independent manner. This substrate-binding domain has significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	199
176122	cd08431	PBP2_HupR	The C-terminal substrate binding domain of LysR-type transcriptional regulator, HupR, which regulates expression of the heme uptake receptor HupA; contains the type 2 periplasmic binding fold. HupR, a member of the LysR family, activates hupA transcription under low-iron conditions in the presence of hemin. The expression of many iron-uptake genes, such as hupA,  is regulated at the transcriptional level by iron and an iron-binding repressor protein called Fur (ferric uptake regulation). Under iron-abundant conditions with heme, the active Fur repressor protein represses transcription of the iron-uptake gene hupA, and prevents transcriptional activation via HupR. Under low-iron conditions with heme, the Fur repressor is inactive and transcription of the hupA is allowed. This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	195
176123	cd08432	PBP2_GcdR_TrpI_HvrB_AmpR_like	The C-terminal substrate domain of LysR-type GcdR, TrpI, HvrB and beta-lactamase regulators, and that of other closely related homologs; contains the type 2 periplasmic binding fold. This CD includes the C-terminal substrate domain of LysR-type transcriptional regulators involved in controlling the expression of glutaryl-CoA dehydrogenase (GcdH), S-adenosyl-L-homocysteine hydrolase, cell division protein FtsW, tryptophan synthase, and beta-lactamase. The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	194
176124	cd08433	PBP2_Nac	The C-terminal substrate binding domain of LysR-like nitrogen assimilation control (NAC) protein, contains the type 2 periplasmic binding fold. The NAC is a LysR-type transcription regulator that activates expression of operons such as hut (histidine utilization) and ure (urea utilization), allowing use of non-preferred (poor) nitrogen sources, and represses expression of operons, such as glutamate dehydrogenase (gdh), allowing assimilation of the preferred nitrogen source.  The expression of the nac gene is fully dependent on the nitrogen regulatory system (NTR) and the sigma54-containing RNA polymerase (sigma54-RNAP). In response to nitrogen starvation, NTR system activates the expression of nac, and NAC activates the expression of hut, ure, and put (proline utilization). NAC is not involved in the transcription of sigma54-RNAP operons such as glnA, which respond directly to the NTR system, but activates the transcription of sigma70-RNAP dependent operons such as hut. Hence, NAC allows the coupling of sigma70-RNAP dependent operons to the sigma54-RNAP dependent NTR system.  This substrate-binding domain has significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	198
176125	cd08434	PBP2_GltC_like	The substrate binding domain of LysR-type transcriptional regulator GltC, which activates gltA expression of glutamate synthase operon, contains type 2 periplasmic binding fold. GltC, a member of the LysR family of bacterial transcriptional factors, activates the expression of gltA gene of glutamate synthase operon and is essential for cell growth in the absence of glutamate. Glutamate synthase is a heterodimeric protein that is encoded by gltA and gltB, whose expression is subject to nutritional regulation. GltC also negatively auto-regulates its own expression. This substrate-binding domain has strong homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	195
176126	cd08435	PBP2_GbpR	The C-terminal substrate binding domain of galactose-binding protein regulator contains the type 2 periplasmic binding fold. Galactose-binding protein regulator (GbpR), a member of the LysR family of bacterial transcriptional regulators, regulates the expression of chromosomal virulence gene chvE.   The chvE gene is involved in the uptake of specific sugars, in chemotaxis to these sugars, and in the VirA-VirG two-component signal transduction system. In the presence of an inducing sugar such as L-arabinose, D-fucose, or D-galactose, GbpR activates chvE expression, while in the absence of an inducing sugar, GbpR represses expression. The topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	201
176127	cd08436	PBP2_LTTR_like_3	The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator, contains the type 2 periplasmic binding fold. LysR-transcriptional regulators comprise the largest family of prokaryotic transcription factors. Homologs of some of LTTRs with similar domain organizations are also found in the archaea and eukaryotic organisms. The LTTRs are composed of two functional domains joined by a linker helix involved in oligomerization: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal substrate-binding domain, which is structurally homologous to the type 2 periplasmic binding proteins. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcriptional repressor undergoes a conformational change upon substrate binding which in turn changes the DNA binding affinity of the repressor. The genes controlled by the LTTRs have diverse functional roles including amino acid biosynthesis, CO2 fixation, antibiotic resistance, degradation of aromatic compounds, nodule formation of nitrogen-fixing bacteria, and synthesis of virulence factors, to name a few. This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	194
176128	cd08437	PBP2_MleR	The substrate binding domain of LysR-type transcriptional regulator MleR, which is required for malolactic fermentation, contains type 2 periplasmic binding fold. MleR, a transcription activator of malolactic fermentation system, is found in gram-positive bacteria and belongs to the LysR family of bacterial transcriptional regulators. The mleR gene is required for the expression and induction of malolactic fermentation. This substrate binding domain has significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	198
176129	cd08438	PBP2_CidR	The C-terminal substrate binding domain of LysR-like transcriptional regulator CidR, contains the type 2 periplasmic binding fold. This CD includes the substrate binding domain of CidR which positively up-regulates the expression of cidABC operon in the presence of acetic acid produced by the metabolism of excess glucose. The CidR affects the control of murein hydrolase activity by enhancing cidABC expression in the presence of acetic acid. Thus, up-regulation of cidABC expression results in increased murein hydrolase activity. This substrate binding domain has significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	197
176130	cd08439	PBP2_LrhA_like	The C-terminal substrate domain of LysR-like regulator LrhA (LysR homologue A) and that of closely related homologs, contains the type 2 periplasmic binding fold. This CD represents the LrhA subfamily of LysR-like bacterial transcriptional regulators, including LrhA, HexA, PecT, and DgdR.  LrhA is involved in control of the transcription of flagellar, motility, and chemotaxis genes by regulating the synthesis and concentration of FlhD(2)C(2), the master regulator for the expression of flagellar and chemotaxis genes. The LrhA protein has strong homology to HexA and PecT from plant pathogenic bacteria, in which HexA and PecT act as repressors of motility and of virulence factors, such as exoenzymes required for lytic reactions. DgdR also shares similar characteristics to those of LrhA, HexA and PecT. The topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	185
176131	cd08440	PBP2_LTTR_like_4	The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator, contains the type 2 periplasmic binding fold. LysR-transcriptional regulators comprise the largest family of prokaryotic transcription factors. Homologs of some of LTTRs with similar domain organizations are also found in the archaea and eukaryotic organisms. The LTTRs are composed of two functional domains joined by a linker helix involved in oligomerization: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal substrate-binding domain, which is structurally homologous to the type 2 periplasmic binding proteins. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcriptional repressor undergoes a conformational change upon substrate binding which in turn changes the DNA binding affinity of the repressor. The genes controlled by the LTTRs have diverse functional roles including amino acid biosynthesis, CO2 fixation, antibiotic resistance, degradation of aromatic compounds, nodule formation of nitrogen-fixing bacteria, and synthesis of virulence factors, to name a few. This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	197
176132	cd08441	PBP2_MetR	The C-terminal substrate binding domain of LysR-type transcriptional regulator metR, which regulates the expression of methionine biosynthetic genes, contains type 2 periplasmic binding fold. MetR, a member of the LysR family, is a positive regulator for the metA, metE, metF, and metH genes. The sulfur-containing amino acid methionine is the universal initiator of protein synthesis in all known organisms and its derivative S-adenosylmethionine (SAM) and autoinducer-2 (AI-2) are involved in various cellular processes. SAM plays a central role as methyl donor in methylation reactions, which are essential for the biosynthesis of phospholipids, proteins, DNA and RNA. The interspecies signaling molecule AI-2 is involved in cell-cell communication process (quorum sensing) and gene regulation in bacteria. Although methionine biosynthetic enzymes and metabolic pathways are well conserved in bacteria, the regulation of methionine biosynthesis involves various regulatory mechanisms. In Escherichia coli and Salmonella enterica serovar Typhimurium, MetJ and MetR regulate the expression of methionine biosynthetic genes. The MetJ repressor negatively regulates the E. coli met genes, except for metH. Several of these genes are also under the positive control of MetR with homocysteine as a co-inducer. In Bacillus subtilis, the met genes are controlled by S-box termination-antitermination system. This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	198
176133	cd08442	PBP2_YofA_SoxR_like	The C-terminal substrate binding domain of LysR-type transcriptional regulators, YofA and SoxR, contains the type 2 periplasmic binding fold. YofA is a LysR-like transcriptional regulator of cell growth in Bacillus subtilis. YofA controls cell viability and the formation of constrictions during cell division. YofA positively regulates expression of the cell division gene ftsW, and thus is essential for cell viability during stationary-phase growth of Bacillus subtilis. YofA shows significant homology to SoxR from Arthrobacter sp. TE1826. SoxR is a negative regulator for the sarcosine oxidase gene soxA. Sarcosine oxidase catalyzes the oxidative demethylation of sarcosine, which is involved in the metabolism of creatine and choline. The topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	193
176134	cd08443	PBP2_CysB	The C-terminal substrate domain of LysR-type transcriptional regulator CysB contains type 2 periplasmic binding fold. CysB is a transcriptional activator of genes involved in sulfate and thiosulfate transport, sulfate reduction, and cysteine synthesis. In Escherichia coli, the regulation of transcription in response to sulfur source is attributed to two transcriptional regulators, CysB and Cbl. CysB, in association with Cbl, downregulates the expression of ssuEADCB operon which is required for the utilization of sulfur from aliphatic sulfonates, in the presence of cysteine. Also, Cbl and CysB together directly function as transcriptional activators of tauABCD genes, which are required for utilization of taurine as sulfur source for growth. Like many other members of the LTTR family, CysB is composed of two functional domains joined by a linker helix involved in oligomerization: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal substrate-binding domain, which is structurally homologous to the type 2 periplasmic binding proteins. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcriptional repressor undergoes a conformational change upon substrate binding which in turn changes the DNA binding affinity of the repressor. The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	198
176135	cd08444	PBP2_Cbl	The C-terminal substrate binding domain of LysR-type transcriptional regulator Cbl, which is required for expression of sulfate starvation-inducible (ssi) genes, contains the type 2 periplasmic binding fold. Cbl is a member of the LysR transcriptional regulators that comprise the largest family of prokaryotic transcription factors. Cbl shows high sequence similarity to CysB, the LysR-type transcriptional activator of genes involved in sulfate and thiosulfate transport, sulfate reduction, and cysteine synthesis. In Escherichia coli, the function of Cbl is required for expression of sulfate starvation-inducible (ssi) genes, coupled with the biosynthesis of cysteine from the organic sulfur sources (sulfonates). The ssi genes include the ssuEADCB and tauABCD operons encoding uptake systems for organosulfur compounds, aliphatic sulfonates, and taurine. The genes in these operons encode an ABC-type transport system required for uptake of aliphatic sulfonates and a desulfonation enzyme. Both Cbl and CysB are required for expression of the tau and ssu genes. Like many other members of the LTTR family, the Cbl is composed of two functional domains joined by a linker helix involved in oligomerization: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal substrate-binding domain, which is structurally homologous to the type 2 periplasmic binding proteins. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcriptional repressor undergoes a conformational change upon substrate binding which in turn changes the DNA binding affinity of the repressor. The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	198
176136	cd08445	PBP2_BenM_CatM_CatR	The C-terminal substrate binding domain of LysR-type transcriptional regulators involved in benzoate catabolism; contains the type 2 periplasmic binding fold. This CD includes the C-terminal of LysR-type transcription regulators, BenM, CatM, and CatR, which are involved in the benzoate catabolism. The BenM and CatM are paralogs with overlapping functions. BenM responds synergistically to two effectors, benzoate and cis,cis-muconate, to activate expression of the benABCDE operon which is involved in benzoate catabolism, while CatM responds only to muconate. BenM and CatM share high protein sequence identity and bind to the operator-promoter regions that have similar DNA sequences. In Pseudomonas species, phenolic compounds are converted by different enzymes to central intermediates, such as protocatechuate and catechols. Generally, unsubstituted compounds, such as benzoate, are metabolized by an ortho-cleavage pathway. The catBCA operon encodes three enzymes of the ortho-pathway required for benzoate catabolism: muconate lactonizing enzyme I, muconolactone isomerase, and catechol 1,2-dioxygenase. CatR normally responds to benzoate and cis,cis-muconate, an inducer molecule, to activate transcription of the catBCA operon, whose gene products convert benzoate to catechol. The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the substrate-binding domains from ionotropic glutamate receptors, LysR-like transcriptional regulators, and unorthodox sensor proteins involved in signal transduction.	203
176137	cd08446	PBP2_Chlorocatechol	The C-terminal substrate binding domain of LysR-type transcriptional regulators involved in the chlorocatechol catabolism, contains the type 2 periplasmic binding fold. This CD includes the substrate binding domain of LysR-type regulators CbnR, ClcR and TfdR, which are involved in the regulation of chlorocatechol breakdown. The chlorocatechol-degradative pathway is often found in bacteria that can use chlorinated aromatic compounds as carbon and energy sources. CbnR is found in the 3-chlorobenzoate degradative bacterium Ralstonia eutropha NH9 and forms a tetramer. CbnR activates the expression of the cbnABCD genes, which are responsible for the degradation of chlorocatechol converted from 3-chlorobenzoate and are transcribed divergently from cbnR.   In soil bacterium Pseudomonas putida, the 3-chlorocatechol-degradative pathway is encoded by clcABD operon, which requires the divergently transcribed clcR for activation. TfdR is involved in the activation of tfdA and tfdB gene expression. These genes encode enzymes for the conversion of 2,4-dichlorophenoxyacetic acid and 2,4-dichlorophenol. The topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	198
176138	cd08447	PBP2_LTTR_aromatics_like_1	The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator similar to regulators involved in the catabolism of aromatic compounds, contains type 2 periplasmic binding fold. This CD represents the substrate binding domain of an uncharacterized LysR-type regulator similar to CbnR which is involved in the regulation of chlorocatechol breakdown. The transcription of the genes encoding enzymes involved in such degradation is regulated and expression of these enzymes is enhanced by inducers, which are either an intermediate in the metabolic pathway or compounds to be degraded. This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	198
176139	cd08448	PBP2_LTTR_aromatics_like_2	The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator similar to regulators involved in the catabolism of aromatic compounds, contains type 2 periplasmic binding fold. This CD represents the substrate binding domain of an uncharacterized LysR-type regulator similar to CbnR which is involved in the regulation of chlorocatechol breakdown. The transcription of the genes encoding enzymes involved in such degradation is regulated and expression of these enzymes is enhanced by inducers, which are either an intermediate in the metabolic pathway or compounds to be degraded. This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	197
176140	cd08449	PBP2_XapR	The C-terminal substrate binding domain of LysR-type transcriptional regulator XapR involved in xanthosine catabolism, contains the type 2 periplasmic binding fold. In Escherichia coli, XapR is a positive regulator for the expression of xapA gene, encoding xanthosine phosphorylase, and xapB gene, encoding a polypeptide similar to the nucleotide transport protein NupG. As an operon, the expression of both xapA and xapB is fully dependent on the presence of both XapR and the inducer xanthosine. Expression of the xapR is constitutive but not auto-regulated, unlike many other LysR family proteins. This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	197
176141	cd08450	PBP2_HcaR	The C-terminal substrate binding domain of LysR-type transcriptional regulator HcaR involved in 3-phenylpropionic acid catabolism, contains the type 2 periplasmic binding fold. HcaR, a member of the LysR family of transcriptional regulators, controls the expression of the hcA1, A2, B, C, and D operon, encoding for the 3-phenylpropionate dioxygenase complex and 3-phenylpropionate-2',3'-dihydrodiol dehydrogenase, that oxidizes 3-phenylpropionate to 3-(2,3-dihydroxyphenyl) propionate. Dioxygenases play an important role in protecting the cell against the toxic effects of dioxygen. The expression of hcaR is negatively auto-regulated, as for other members of the LysR family, and is strongly repressed in the presence of glucose. This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	196
176142	cd08451	PBP2_BudR	The C-terminal substrate binding domain of LysR-type transcriptional regulator BudR, which is responsible for activation of the expression of the butanediol operon genes; contains the type 2 periplasmic binding fold. This CD represents the substrate binding domain of BudR regulator, which is responsible for induction of the butanediol formation pathway under fermentative growth conditions. Three enzymes are involved in the production of 1 mol of 2,3 butanediol from the condensation of 2 mol of pyruvate with acetolactate and acetoin as intermediates: acetolactate synthetase, acetolactate decarboxylase, and acetoin reductase. In Klebsiella terrigena, BudR regulates the expression of the budABC operon genes, encoding these three enzymes of the butanediol pathway. In many bacterial species, the use of this pathway can prevent intracellular acidification by diverting metabolism from acid production to the formation of neutral compounds (acetoin and butanediol). This substrate-binding domain has significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	199
176143	cd08452	PBP2_AlsR	The C-terminal substrate binding domain of LysR-type transcriptional regulator AlsR, which regulates acetoin formation under stationary phase growth conditions; contains the type 2 periplasmic binding fold. AlsR is responsible for activating the expression of the acetoin operon (alsSD) in response to inducing signals such as glucose and acetate. Like many other LysR family proteins, AlsR is transcribed divergently from the alsSD operon. The alsS gene encodes acetolactate synthase, an enzyme involved in the production of acetoin in cells of stationary-phase. AlsS catalyzes the conversion of two pyruvate molecules to acetolactate and carbon dioxide. Acetolactate is then converted to acetoin at low pH by acetolactate decarboxylase, which is encoded by the alsD gene. Acetoin is an important physiological metabolite excreted by many microorganisms grown on glucose or other fermentable carbon sources. This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	197
176144	cd08453	PBP2_IlvR	The C-terminal substrate binding domain of LysR-type transcriptional regulator, IlvR, involved in the biosynthesis of isoleucine, leucine and valine; contains type 2 periplasmic binding fold. The IlvR is an activator of the upstream and divergently transcribed ilvD gene, which encodes dihydroxy acid dehydratase that participates in isoleucine, leucine, and valine biosynthesis. As in the case of other members of the LysR family, the expression of ilvR gene is repressed in the presence of its own gene product. This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	200
176145	cd08456	PBP2_LysR	The C-terminal substrate binding domain of LysR, transcriptional regulator for lysine biosynthesis, contains the type 2 periplasmic binding fold. LysR is the transcriptional activator of lysA, which encodes diaminopimelate decarboxylase, the enzyme that catalyses the decarboxylation of diaminopimelate to produce lysine. The LysR-transcriptional regulators comprise the largest family of prokaryotic transcription factors. Homologs of some of LTTRs with similar domain organizations are also found in the archaea and eukaryotic organisms. The LTTRs are composed of two functional domains joined by a linker helix involved in oligomerization: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal substrate-binding domain, which is structurally homologous to the type 2 periplasmic binding proteins. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcriptional repressor undergoes a conformational change upon substrate binding which in turn changes the DNA binding affinity of the repressor. The genes controlled by the LTTRs have diverse functional roles including amino acid biosynthesis, CO2 fixation, antibiotic resistance, degradation of aromatic compounds, nodule formation of nitrogen-fixing bacteria, and synthesis of virulence factors, to name a few. This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	196
176146	cd08457	PBP2_OccR	The C-terminal substrate-domain of LysR-type transcriptional regulator, OccR, involved in the catabolism of octopine, contains the type 2 periplasmic binding fold. This CD includes the C-terminal substrate-domain of LysR-type transcriptional regulator OccR, which is involved in the catabolism of octopine. Opines are low molecular weight compounds found in plant crown gall tumors produced by the parasitic bacterium Agrobacterium. There are at least 30 different opines identified so far. Opines are utilized by tumor-colonizing bacteria as a source of carbon, nitrogen, and energy. In Agrobacterium tumefaciens,  OccR protein activates the occQ operon of the Ti plasmid in response to octopine. This operon encodes proteins required for the uptake and catabolism of octopine, an arginine derivative. The occ operon also encodes the TraR protein, which is a quorum-sensing transcriptional regulator of the Ti plasmid tra regulon.  This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	196
176147	cd08458	PBP2_NocR	The C-terminal substrate-domain of LysR-type transcriptional regulator, NocR, involved in the catabolism of nopaline, contains the type 2 periplasmic binding fold. This CD includes the C-terminal substrate-domain of LysR-type transcriptional regulator NocR, which is involved in the catabolism of nopaline. Opines are low molecular weight compounds found in plant crown gall tumors produced by the parasitic bacterium Agrobacterium. There are at least 30 different opines identified so far. Opines are utilized by tumor-colonizing bacteria as a source of carbon, nitrogen, and energy. In Agrobacterium tumefaciens,  NocR regulates expression of the divergently transcribed nocB and nocR genes of the nopaline catabolism (noc) region.   This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	196
176148	cd08459	PBP2_DntR_NahR_LinR_like	The C-terminal substrate binding domain of LysR-type transcriptional regulators that are involved in the catabolism of dinitrotoluene, naphthalene and gamma-hexachlorocyclohexane; contains the type 2 periplasmic binding fold. This CD includes LysR-like bacterial transcriptional regulators, DntR, NahR, and LinR, which are involved in the degradation of aromatic compounds. The transcription of the genes encoding enzymes involved in such degradation is regulated and expression of these enzymes is enhanced by inducers, which are either an intermediate in the metabolic pathway or compounds to be degraded. DntR from Burkholderia species controls genes encoding enzymes for oxidative degradation of the nitro-aromatic compound 2,4-dinitrotoluene. The active form of DntR is homotetrameric, consisting of a dimer of dimers. NahR is a salicylate-dependent transcription activator of the nah and sal operons for naphthalene degradation. Salicylic acid is an intermediate of the oxidative degradation of the aromatic ring in soil bacteria. LinR positively regulates expression of the genes (linD and linE) encoding enzymes for gamma-hexachlorocyclohexane (a haloorganic insecticide) degradation. Expression of linD and linE is induced by their substrates, 2,5-dichlorohydroquinone (2,5-DCHQ) and chlorohydroquinone (CHQ). The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	201
176149	cd08460	PBP2_DntR_like_1	The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator similar to DntR, which is involved in the catabolism of dinitrotoluene; contains the type 2 periplasmic binding fold. This CD includes an uncharacterized LysR-type transcriptional regulator similar to DntR, NahR, and LinR, which are involved in the degradation of aromatic compounds. The transcription of the genes encoding enzymes involved in such degradation is regulated and expression of these enzymes is enhanced by inducers, which are either an intermediate in the metabolic pathway or compounds to be degraded.  This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	200
176150	cd08461	PBP2_DntR_like_3	The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator similar to DntR, which is involved in the catabolism of dinitrotoluene; contains the type 2 periplasmic binding fold. This CD includes an uncharacterized LysR-type transcriptional regulator similar to DntR, NahR, and LinR, which are involved in the degradation of aromatic compounds. The transcription of the genes encoding enzymes involved in such degradation is regulated and expression of these enzymes is enhanced by inducers, which are either an intermediate in the metabolic pathway or compounds to be degraded.  This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	198
176151	cd08462	PBP2_NodD	The C-terminal substrate binding domain of NodD family of LysR-type transcriptional regulators that regulates the expression of nodulation (nod) genes; contains the type 2 periplasmic binding fold. The nodulation (nod) genes in soil bacteria play important roles in the development of nodules. nod genes are involved in synthesis of Nod factors that are required for bacterial entry into root hairs. Thirteen nod genes have been identified and are classified into five transcription units: nodD, nodABCIJ, nodFEL, nodMNT, and nodO. NodD negatively autoregulates expression of its own gene, nodD, while other nod genes are inducible and positively regulated by NodD in the presence of flavonoids released by plant roots. This substrate-binding domain has significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	200
176152	cd08463	PBP2_DntR_like_4	The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator similar to DntR, which is involved in the catabolism of dinitrotoluene; contains the type 2 periplasmic binding fold. This CD includes an uncharacterized LysR-type transcriptional regulator similar to DntR, NahR, and LinR, which are involved in the degradation of aromatic compounds. The transcription of the genes encoding enzymes involved in such degradation is regulated and expression of these enzymes is enhanced by inducers, which are either an intermediate in the metabolic pathway or compounds to be degraded.  This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	203
176153	cd08464	PBP2_DntR_like_2	The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator similar to DntR, which is involved in the catabolism of dinitrotoluene; contains the type 2 periplasmic binding fold. This CD includes an uncharacterized LysR-type transcriptional regulator similar to DntR, NahR, and LinR, which are involved in the degradation of aromatic compounds. The transcription of the genes encoding enzymes involved in such degradation is regulated and expression of these enzymes is enhanced by inducers, which are either an intermediate in the metabolic pathway or compounds to be degraded.  This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	200
176154	cd08465	PBP2_ToxR	The C-terminal substrate binding domain of LysR-type transcriptional regulator ToxR, which regulates the expression of the toxoflavin biosynthesis genes; contains the type 2 periplasmic binding fold. In soil bacterium Burkholderia glumae, ToxR regulates the toxABCDE and toxFGHI operons in the presence of toxoflavin as a coinducer. Additionally, the expression of both operons requires a transcriptional activator, ToxJ, whose expression is regulated by the TofI or TofR quorum-sensing system. Toxoflavin is suggested to be synthesized in a pathway common to the synthesis of riboflavin. The topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	200
176155	cd08466	PBP2_LeuO	The C-terminal substrate binding domain of LysR-type transcriptional regulator LeuO, an activator of leucine synthesis operon, contains the type 2 periplasmic binding fold. LeuO, a LysR-type transcriptional regulator, was originally identified as an activator of the leucine synthesis operon (leuABCD). Subsequently, LeuO was found to be not a specific regulator of the leu gene but a global regulator of various unrelated genes. LeuO activates bglGFB (utilization of beta-D-glucoside) and represses cadCBA (lysine decarboxylation) and dsrA (encoding a regulatory small RNA for translational control of rpoS and hns). LeuO also regulates the yjjQ-bglJ operon, which codes for a LuxR-type transcription factor. In Salmonella enterica serovar Typhi, LeuO is a positive regulator of ompS1 (encoding an outer membrane protein), ompS2 (encoding a pathogenicity determinant), and assT, while LeuO represses the expression of OmpX and Tpx. Both ompS1 and ompS2 influence virulence in the mouse model of Salmonella. In Vibrio cholerae, LeuO is involved in control of biofilm formation and in the stringent response. The topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	200
176156	cd08467	PBP2_SyrM	The C-terminal substrate binding domain of LysR-type symbiotic regulator SyrM, which activates expression of nodulation gene NodD3, contains the type 2 periplasmic binding fold. Rhizobium is a nitrogen-fixing bacterium present in the roots of leguminous plants, which fixes atmospheric nitrogen to the soil. Most Rhizobium species possess multiple nodulation (nod) genes for the development of nodules. For example, Rhizobium meliloti possesses three copies of nodD genes. NodD1 and NodD2 activate nod operons when Rhizobium is exposed to inducers synthesized by the host plant, while NodD3 acts independently of plant inducers and requires the symbiotic regulator SyrM for nod gene expression. SyrM activates the expression of the regulatory nodulation gene nodD3. In turn, NodD3 activates expression of syrM. In addition, SyrM is involved in exopolysaccharide synthesis. This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	200
176157	cd08468	PBP2_Pa0477	The C-terminal substrate binding domain of an uncharacterized LysR-like transcriptional regulator Pa0477 related to DntR, contains the type 2 periplasmic binding fold. LysR-type transcriptional regulator Pa0477 is related to DntR, which controls genes encoding enzymes for oxidative degradation of the nitro-aromatic compound 2,4-dinitrotoluene. The transcription of the genes encoding enzymes involved in such degradation is regulated and expression of these enzymes is enhanced by inducers, which are either an intermediate in the metabolic pathway or compounds to be degraded. The topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	202
176158	cd08469	PBP2_PnbR	The C-terminal substrate binding domain of LysR-type transcriptional regulator PnbR, which is involved in regulating the pnb genes encoding enzymes for 4-nitrobenzoate catabolism, contains the type 2 periplasmic binding fold. PnbR is the regulator of one or both of the two pnb genes that encode enzymes for 4-nitrobenzoate catabolism. In Pseudomonas putida, pnbA encodes a 4-nitrobenzoate reductase, which is responsible for catalyzing the direct reduction of 4-nitrobenzoate to 4-hydroxylaminobenzoate, and pnbB encodes a 4-hydroxylaminobenzoate lyase, which catalyzes the conversion of 4-hydroxylaminobenzoate to 3,4-dihydroxybenzoic acid and ammonium. The topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	221
176159	cd08470	PBP2_CrgA_like_1	The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator CrgA-like, contains the type 2 periplasmic binding domain. This CD represents the substrate binding domain of an uncharacterized LysR-type transcriptional regulator (LTTR) CrgA-like 1. The LTTRs act as both auto-repressors and activators of target promoters, controlling operons involved in a wide variety of cellular processes such as amino acid biosynthesis, CO2 fixation, antibiotic resistance, degradation of aromatic compounds, nodule formation of nitrogen-fixing bacteria, and synthesis of virulence factors, to name a few. In contrast to the tetrameric form of other LTTRs, CrgA from Neisseria meningitidis assembles into an octameric ring, which can bind up to four 63-bp DNA oligonucleotides. Phylogenetic cluster analysis showed that the CrgA-like regulators form a subclass of the LTTRs that function as octamers. CrgA is an auto-repressor of its own gene and activates the expression of the mdaB gene, which codes for an NADPH-quinone reductase; its action is increased by MBL (alpha-methylene-gamma-butyrolactone), an inducer of NADPH-quinone oxidoreductase. The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	197
176160	cd08471	PBP2_CrgA_like_2	The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator CrgA-like, contains the type 2 periplasmic binding fold. This CD represents the substrate binding domain of an uncharacterized LysR-type transcriptional regulator (LTTR) CrgA-like 2. The LTTRs act as both auto-repressors and activators of target promoters, controlling operons involved in a wide variety of cellular processes such as amino acid biosynthesis, CO2 fixation, antibiotic resistance, degradation of aromatic compounds, nodule formation of nitrogen-fixing bacteria, and synthesis of virulence factors, to name a few. In contrast to the tetrameric form of other LTTRs, CrgA from Neisseria meningitidis assembles into an octameric ring, which can bind up to four 63-bp DNA oligonucleotides. Phylogenetic cluster analysis showed that the CrgA-like regulators form a subclass of the LTTRs that function as octamers. CrgA is an auto-repressor of its own gene and activates the expression of the mdaB gene, which codes for an NADPH-quinone reductase; its action is increased by MBL (alpha-methylene-gamma-butyrolactone), an inducer of NADPH-quinone oxidoreductase. The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	201
176161	cd08472	PBP2_CrgA_like_3	The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator CrgA-like, contains the type 2 periplasmic binding fold. This CD represents the substrate binding domain of an uncharacterized LysR-type transcriptional regulator (LTTR) CrgA-like 3. The LTTRs act as both auto-repressors and activators of target promoters, controlling operons involved in a wide variety of cellular processes such as amino acid biosynthesis, CO2 fixation, antibiotic resistance, degradation of aromatic compounds, nodule formation of nitrogen-fixing bacteria, and synthesis of virulence factors, to name a few. In contrast to the tetrameric form of other LTTRs, CrgA from Neisseria meningitidis assembles into an octameric ring, which can bind up to four 63-bp DNA oligonucleotides. Phylogenetic cluster analysis showed that the CrgA-like regulators form a subclass of the LTTRs that function as octamers. CrgA is an auto-repressor of its own gene and activates the expression of the mdaB gene, which codes for an NADPH-quinone reductase; its action is increased by MBL (alpha-methylene-gamma-butyrolactone), an inducer of NADPH-quinone oxidoreductase. The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	202
176162	cd08473	PBP2_CrgA_like_4	The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator CrgA-like, contains the type 2 periplasmic binding fold. This CD represents the substrate binding domain of an uncharacterized LysR-type transcriptional regulator (LTTR) CrgA-like 4. The LTTRs act as both auto-repressors and activators of target promoters, controlling operons involved in a wide variety of cellular processes such as amino acid biosynthesis, CO2 fixation, antibiotic resistance, degradation of aromatic compounds, nodule formation of nitrogen-fixing bacteria, and synthesis of virulence factors, to name a few. In contrast to the tetrameric form of other LTTRs, CrgA from Neisseria meningitidis assembles into an octameric ring, which can bind up to four 63-bp DNA oligonucleotides. Phylogenetic cluster analysis showed that the CrgA-like regulators form a subclass of the LTTRs that function as octamers. CrgA is an auto-repressor of its own gene and activates the expression of the mdaB gene, which codes for an NADPH-quinone reductase; its action is increased by MBL (alpha-methylene-gamma-butyrolactone), an inducer of NADPH-quinone oxidoreductase. The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	202
176163	cd08474	PBP2_CrgA_like_5	The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator CrgA-like, contains the type 2 periplasmic binding fold. This CD represents the substrate binding domain of an uncharacterized LysR-type transcriptional regulator (LTTR) CrgA-like 5. The LTTRs act as both auto-repressors and activators of target promoters, controlling operons involved in a wide variety of cellular processes such as amino acid biosynthesis, CO2 fixation, antibiotic resistance, degradation of aromatic compounds, nodule formation of nitrogen-fixing bacteria, and synthesis of virulence factors, to name a few. In contrast to the tetrameric form of other LTTRs, CrgA from Neisseria meningitidis assembles into an octameric ring, which can bind up to four 63-bp DNA oligonucleotides. Phylogenetic cluster analysis showed that the CrgA-like regulators form a subclass of the LTTRs that function as octamers. CrgA is an auto-repressor of its own gene and activates the expression of the mdaB gene, which codes for an NADPH-quinone reductase; its action is increased by MBL (alpha-methylene-gamma-butyrolactone), an inducer of NADPH-quinone oxidoreductase. The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	202
176164	cd08475	PBP2_CrgA_like_6	The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator CrgA-like, contains the type 2 periplasmic binding fold. This CD represents the substrate binding domain of an uncharacterized LysR-type transcriptional regulator (LTTR) CrgA-like 6. The LTTRs act as both auto-repressors and activators of target promoters, controlling operons involved in a wide variety of cellular processes such as amino acid biosynthesis, CO2 fixation, antibiotic resistance, degradation of aromatic compounds, nodule formation of nitrogen-fixing bacteria, and synthesis of virulence factors, to name a few. In contrast to the tetrameric form of other LTTRs, CrgA from Neisseria meningitidis assembles into an octameric ring, which can bind up to four 63-bp DNA oligonucleotides. Phylogenetic cluster analysis showed that the CrgA-like regulators form a subclass of the LTTRs that function as octamers. CrgA is an auto-repressor of its own gene and activates the expression of the mdaB gene, which codes for an NADPH-quinone reductase; its action is increased by MBL (alpha-methylene-gamma-butyrolactone), an inducer of NADPH-quinone oxidoreductase. The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	199
176165	cd08476	PBP2_CrgA_like_7	The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator CrgA-like, contains the type 2 periplasmic binding fold. This CD represents the substrate binding domain of an uncharacterized LysR-type transcriptional regulator (LTTR) CrgA-like 7. The LTTRs act as both auto-repressors and activators of target promoters, controlling operons involved in a wide variety of cellular processes such as amino acid biosynthesis, CO2 fixation, antibiotic resistance, degradation of aromatic compounds, nodule formation of nitrogen-fixing bacteria, and synthesis of virulence factors, to name a few. In contrast to the tetrameric form of other LTTRs, CrgA from Neisseria meningitidis assembles into an octameric ring, which can bind up to four 63-bp DNA oligonucleotides. Phylogenetic cluster analysis showed that the CrgA-like regulators form a subclass of the LTTRs that function as octamers. CrgA is an auto-repressor of its own gene and activates the expression of the mdaB gene, which codes for an NADPH-quinone reductase; its action is increased by MBL (alpha-methylene-gamma-butyrolactone), an inducer of NADPH-quinone oxidoreductase. The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	197
176166	cd08477	PBP2_CrgA_like_8	The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator CrgA-like, contains the type 2 periplasmic binding fold. This CD represents the substrate binding domain of an uncharacterized LysR-type transcriptional regulator (LTTR) CrgA-like 8. The LTTRs act as both auto-repressors and activators of target promoters, controlling operons involved in a wide variety of cellular processes such as amino acid biosynthesis, CO2 fixation, antibiotic resistance, degradation of aromatic compounds, nodule formation of nitrogen-fixing bacteria, and synthesis of virulence factors, to name a few. In contrast to the tetrameric form of other LTTRs, CrgA from Neisseria meningitidis assembles into an octameric ring, which can bind up to four 63-bp DNA oligonucleotides. Phylogenetic cluster analysis showed that the CrgA-like regulators form a subclass of the LTTRs that function as octamers. CrgA is an auto-repressor of its own gene and activates the expression of the mdaB gene, which codes for an NADPH-quinone reductase; its action is increased by MBL (alpha-methylene-gamma-butyrolactone), an inducer of NADPH-quinone oxidoreductase.  The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	197
176167	cd08478	PBP2_CrgA	The C-terminal substrate binding domain of LysR-type transcriptional regulator CrgA, contains the type 2 periplasmic binding domain. This CD represents the substrate binding domain of LysR-type transcriptional regulator (LTTR) CrgA. The LTTRs act as both auto-repressors and activators of target promoters, controlling operons involved in a wide variety of cellular processes such as amino acid biosynthesis, CO2 fixation, antibiotic resistance, degradation of aromatic compounds, nodule formation of nitrogen-fixing bacteria, and synthesis of virulence factors, to name a few. In contrast to the tetrameric form of other LTTRs, CrgA from Neisseria meningitidis assembles into an octameric ring, which can bind up to four 63-bp DNA oligonucleotides. Phylogenetic cluster analysis further showed that the CrgA-like regulators form a subclass of the LTTRs that function as octamers. CrgA is an auto-repressor of its own gene and activates the expression of the mdaB gene, which codes for an NADPH-quinone reductase; its action is increased by MBL (alpha-methylene-gamma-butyrolactone), an inducer of NADPH-quinone oxidoreductase.  The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	199
176168	cd08479	PBP2_CrgA_like_9	The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator CrgA-like, contains the type 2 periplasmic binding fold. This CD represents the substrate binding domain of an uncharacterized LysR-type transcriptional regulator (LTTR) CrgA-like 9. The LTTRs act as both auto-repressors and activators of target promoters, controlling operons involved in a wide variety of cellular processes such as amino acid biosynthesis, CO2 fixation, antibiotic resistance, degradation of aromatic compounds, nodule formation of nitrogen-fixing bacteria, and synthesis of virulence factors, to name a few. In contrast to the tetrameric form of other LTTRs, CrgA from Neisseria meningitidis assembles into an octameric ring, which can bind up to four 63-bp DNA oligonucleotides. Phylogenetic cluster analysis showed that the CrgA-like regulators form a subclass of the LTTRs that function as octamers. CrgA is an auto-repressor of its own gene and activates the expression of the mdaB gene, which codes for an NADPH-quinone reductase; its action is increased by MBL (alpha-methylene-gamma-butyrolactone), an inducer of NADPH-quinone oxidoreductase.  The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	198
176169	cd08480	PBP2_CrgA_like_10	The C-terminal substrate binding domain of an uncharacterized LysR-type transcriptional regulator CrgA-like, contains the type 2 periplasmic binding fold. This CD represents the substrate binding domain of an uncharacterized LysR-type transcriptional regulator (LTTR) CrgA-like 10. The LTTRs act as both auto-repressors and activators of target promoters, controlling operons involved in a wide variety of cellular processes such as amino acid biosynthesis, CO2 fixation, antibiotic resistance, degradation of aromatic compounds, nodule formation of nitrogen-fixing bacteria, and synthesis of virulence factors, to name a few. In contrast to the tetrameric form of other LTTRs, CrgA from Neisseria meningitidis assembles into an octameric ring, which can bind up to four 63-bp DNA oligonucleotides. Phylogenetic cluster analysis showed that the CrgA-like regulators form a subclass of the LTTRs that function as octamers. CrgA is an auto-repressor of its own gene and activates the expression of the mdaB gene, which codes for an NADPH-quinone reductase; its action is increased by MBL (alpha-methylene-gamma-butyrolactone), an inducer of NADPH-quinone oxidoreductase.  The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	198
176170	cd08481	PBP2_GcdR_like	The C-terminal substrate binding domain of LysR-type transcriptional regulators GcdR-like, contains the type 2 periplasmic binding fold. GcdR is involved in the glutaconate/glutarate-specific activation of the Pg promoter driving expression of a glutaryl-CoA dehydrogenase-encoding gene (gcdH). The GcdH protein is essential for the anaerobic catabolism of many aromatic compounds and some alicyclic and dicarboxylic acids.  The structural topology of this substrate-binding domain is most similar to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	194
176171	cd08482	PBP2_TrpI	The C-terminal substrate binding domain of LysR-type transcriptional regulator TrpI, which is involved in control of tryptophan synthesis, contains type 2 periplasmic binding fold. TrpI and indoleglycerol phosphate (InGP), are required to activate transcription of the trpBA, the genes for tryptophan synthase. The trpBA is induced by the InGp substrate, rather than by tryptophan, but the exact mechanism of the activation event is not known. This substrate-binding domain of TrpI shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	195
176172	cd08483	PBP2_HvrB	The C-terminal substrate-binding domain of LysR-type transcriptional regulator HvrB, an activator of S-adenosyl-L-homocysteine hydrolase expression, contains the type 2 periplasmic binding fold. The transcriptional regulator HvrB of the LysR family is required for the light-dependent activation of both ahcY, which encodes the enzyme S-adenosyl-L-homocysteine hydrolase (AdoHcyase) that is responsible for the reversible hydrolysis of AdoHcy to adenosine and homocysteine, and orf5, a gene of unknown function.  The topology of this C-terminal domain of HvrB is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	190
176173	cd08484	PBP2_LTTR_beta_lactamase	The C-terminal substrate-domain of LysR-type transcriptional regulators for beta-lactamase genes, contains the type 2 periplasmic binding fold. This CD includes the C-terminal substrate binding domain of LysR-type transcriptional regulators, BlaA and AmpR, that are involved in control of the expression of beta-lactamase genes.  Beta-lactamases are responsible for bacterial resistance to beta-lactam antibiotics such as penicillins. BlaA (a constitutive class A penicillinase) belongs to the LysR family of transcriptional regulators, while BlaB (an inducible class C cephalosporinase or AmpC) can be referred to as a penicillin-binding protein, but it does not act as a beta-lactamase. AmpR regulates the expression of beta-lactamases in many enterobacterial strains and many other gram-negative bacilli. In contrast to BlaA, AmpR acts as an activator only in the presence of the beta-lactam inducer. In the absence of the inducer, AmpR acts as a repressor. The topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	189
176174	cd08485	PBP2_ClcR	The C-terminal substrate binding domain of LysR-type transcriptional regulator ClcR involved in the chlorocatechol catabolism, contains type 2 periplasmic binding fold. In soil bacterium Pseudomonas putida, the ortho-pathways of catechol and 3-chlorocatechol are central catabolic pathways that convert aromatic and chloroaromatic compounds to tricarboxylic acid (TCA) cycle intermediates. The 3-chlorocatechol-degradative pathway is encoded by clcABD operon, which requires the divergently transcribed clcR and an intermediate of the pathway, 2-chloromuconate, as an inducer for activation. The topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	198
176175	cd08486	PBP2_CbnR	The C-terminal substrate binding domain of LysR-type transcriptional regulator, CbnR, involved in the chlorocatechol catabolism, contains the type 2 periplasmic binding fold. This CD represents the substrate binding domain of LysR-type regulator CbnR which is involved in the regulation of chlorocatechol breakdown. The chlorocatechol-degradative pathway is often found in bacteria that can use chlorinated aromatic compounds as carbon and energy sources. CbnR is found in the 3-chlorobenzoate degradative bacterium Ralstonia eutropha NH9 and forms a tetramer. CbnR activates the expression of the cbnABCD genes, which are responsible for the degradation of chlorocatechol converted from 3-chlorobenzoate and are transcribed divergently from cbnR. The structural topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	198
176176	cd08487	PBP2_BlaA	The C-terminal substrate-binding domain of LysR-type transcriptional regulator BlaA, which is involved in control of the beta-lactamase gene expression; contains the type 2 periplasmic binding fold. This CD represents the C-terminal substrate binding domain of LysR-type transcriptional regulator, BlaA, that is involved in control of the expression of beta-lactamase genes, blaA and blaB.  Beta-lactamases are responsible for bacterial resistance to beta-lactam antibiotics such as penicillins.  The blaA gene is located just upstream of blaB in the opposite direction and regulates the expression of the blaB. BlaA also negatively auto-regulates the expression of its own gene, blaA. BlaA (a constitutive class A penicillinase) belongs to the LysR family of transcriptional regulators, whereas BlaB (an inducible class C cephalosporinase or AmpC) can be referred to as a penicillin binding protein but it does not act as a beta-lactamase. The topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	189
176177	cd08488	PBP2_AmpR	The C-terminal substrate domain of LysR-type transcriptional regulator AmpR that is involved in control of the expression of beta-lactamase gene ampC, contains the type 2 periplasmic binding fold. AmpR acts as a transcriptional activator by binding to a DNA region immediately upstream of the ampC promoter. In the absence of a beta-lactam inducer, AmpR represses the synthesis of beta-lactamase, whereas expression is induced in the presence of a beta-lactam inducer. The AmpD, AmpG, and AmpR proteins are involved in the induction of AmpC-type beta-lactamase (class C), which is produced by enterobacterial strains and many other gram-negative bacilli. The activation of ampC by AmpR requires ampG for induction or high-level expression of AmpC. It is probable that the AmpD and AmpG work together to modulate the ability of AmpR to activate ampC expression. This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	191
173854	cd08489	PBP2_NikA	The substrate-binding component of an ABC-type nickel import system contains the type 2 periplasmic binding fold. This family represents the periplasmic substrate-binding domain of nickel transport system, which functions in the import of nickel and in the control of chemotactic response away from nickel. The ATP-binding cassette (ABC) type nickel transport system is comprised of five subunits NikABCDE: the two pore-forming integral inner membrane proteins NikB and NikC; the two inner membrane-associated proteins with ATPase activity NikD and NikE; and the periplasmic nickel binding NikA, the initial nickel receptor. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine.  The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.  Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction.	488
173855	cd08490	PBP2_NikA_DppA_OppA_like_3	The substrate-binding component of an uncharacterized ABC-type nickel/dipeptide/oligopeptide-like import system contains the type 2 periplasmic binding fold. This CD represents the substrate-binding domain of an uncharacterized ATP-binding cassette (ABC) type nickel/dipeptide/oligopeptide-like transporter. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA, the initial nickel receptor. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis.  Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine.  The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.  Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction.	470
173856	cd08491	PBP2_NikA_DppA_OppA_like_12	The substrate-binding component of an uncharacterized ABC-type nickel/dipeptide/oligopeptide-like import system contains the type 2 periplasmic binding fold. This CD represents the substrate-binding domain of an uncharacterized ATP-binding cassette (ABC) type nickel/dipeptide/oligopeptide-like transporter. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA, the initial nickel receptor. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis.  Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine.  The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.  Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators,  and unorthodox sensor proteins involved in signal transduction.	473
173857	cd08492	PBP2_NikA_DppA_OppA_like_15	The substrate-binding component of an uncharacterized ABC-type nickel/dipeptide/oligopeptide-like import system contains the type 2 periplasmic binding fold. This CD represents the substrate-binding domain of an uncharacterized ATP-binding cassette (ABC) type nickel/dipeptide/oligopeptide-like transporter. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA, the initial nickel receptor. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine.  The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.  Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators,  and unorthodox sensor proteins involved in signal transduction.	484
173858	cd08493	PBP2_DppA_like	The substrate-binding component of an ABC-type dipeptide import system contains the type 2 periplasmic binding fold. This family represents the substrate-binding domain of an ATP-binding cassette (ABC)-type dipeptide import system. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis.  Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine.  The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.  Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction.	482
173859	cd08494	PBP2_NikA_DppA_OppA_like_6	The substrate-binding component of an uncharacterized ABC-type nickel/dipeptide/oligopeptide-like import system contains the type 2 periplasmic binding fold. This CD represents the substrate-binding domain of an uncharacterized ATP-binding cassette (ABC) type nickel/dipeptide/oligopeptide-like transporter. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA, the initial nickel receptor. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine.  The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.  Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators,  and unorthodox sensor proteins involved in signal transduction.	448
173860	cd08495	PBP2_NikA_DppA_OppA_like_8	The substrate-binding component of an uncharacterized ABC-type nickel/dipeptide/oligopeptide-like import system contains the type 2 periplasmic binding fold. This CD represents the substrate-binding domain of an uncharacterized ATP-binding cassette (ABC) type nickel/dipeptide/oligopeptide-like transporter. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA, the initial nickel receptor. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine.  The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.  Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators,  and unorthodox sensor proteins involved in signal transduction.	482
173861	cd08496	PBP2_NikA_DppA_OppA_like_9	The substrate-binding component of an uncharacterized ABC-type nickel/dipeptide/oligopeptide-like import system contains the type 2 periplasmic binding fold. This CD represents the substrate-binding domain of an uncharacterized ATP-binding cassette (ABC) type nickel/dipeptide/oligopeptide-like transporter. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA, the initial nickel receptor. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA can bind peptides of a wide range of lengths (2-35 amino-acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine.  The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.  Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators,  and unorthodox sensor proteins involved in signal transduction.	454
173862	cd08497	PBP2_NikA_DppA_OppA_like_14	The substrate-binding component of an uncharacterized ABC-type nickel/dipeptide/oligopeptide-like import system contains the type 2 periplasmic binding fold. This CD represents the substrate-binding domain of an uncharacterized ATP-binding cassette (ABC) type nickel/dipeptide/oligopeptide-like transporter. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA, the initial nickel receptor. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine.  The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.  Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators,  and unorthodox sensor proteins involved in signal transduction.	491
173863	cd08498	PBP2_NikA_DppA_OppA_like_2	The substrate-binding component of an uncharacterized ABC-type nickel/dipeptide/oligopeptide-like import system contains the type 2 periplasmic binding fold. This CD represents the substrate-binding domain of an uncharacterized ATP-binding cassette (ABC) type nickel/dipeptide/oligopeptide-like transporter. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA, the initial nickel receptor. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine.  The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.  Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators,  and unorthodox sensor proteins involved in signal transduction.	481
173864	cd08499	PBP2_Ylib_like	The substrate-binding component of an uncharacterized ABC-type peptide import system YliB contains the type 2 periplasmic binding fold. This family represents the periplasmic substrate-binding component of an uncharacterized ATP-binding cassette (ABC)-type peptide transport system YliB. Although the ligand specificity of the YliB protein is not known, it shares significant sequence similarity with the ABC-type dipeptide and oligopeptide binding proteins. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine.  The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.  Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction.	474
173865	cd08500	PBP2_NikA_DppA_OppA_like_4	The substrate-binding component of an uncharacterized ABC-type nickel/dipeptide/oligopeptide-like import system contains the type 2 periplasmic binding fold. This CD represents the substrate-binding domain of an uncharacterized ATP-binding cassette (ABC) type nickel/dipeptide/oligopeptide-like transporter. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA, the initial nickel receptor. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine.  The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.  Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators,  and unorthodox sensor proteins involved in signal transduction.	499
173866	cd08501	PBP2_Lpqw	The substrate-binding domain of mycobacterial lipoprotein LpqW contains the type 2 periplasmic binding fold. LpqW is one of the key players in the synthesis and transport of the unique components of the mycobacterial cell wall, which is a complex structure rich in two related lipoglycans, the phosphatidylinositol mannosides (PIMs) and lipoarabinomannans (LAMs).  LpqW is a highly conserved lipoprotein that transports intermediates from a pathway for mature PIMs production into a pathway for LAMs biosynthesis, thus controlling the relative abundance of these two essential components of the cell wall.  LpqW is thought to have been adapted by the cell-wall biosynthesis machinery of mycobacteria and other closely related pathogens, evolving to play an important role in PIMs/LAMs biosynthesis.  Most of periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the LpqW protein. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine.  The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.  Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction.	486
173867	cd08502	PBP2_NikA_DppA_OppA_like_16	The substrate-binding component of an uncharacterized ABC-type nickel/dipeptide/oligopeptide-like import system contains the type 2 periplasmic binding fold. This CD represents the substrate-binding domain of an uncharacterized ATP-binding cassette (ABC) type nickel/dipeptide/oligopeptide-like transporter. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA, the initial nickel receptor. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine.  The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.  Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators,  and unorthodox sensor proteins involved in signal transduction.	472
173868	cd08503	PBP2_NikA_DppA_OppA_like_17	The substrate-binding component of an uncharacterized ABC-type nickel/dipeptide/oligopeptide-like import system contains the type 2 periplasmic binding fold. This CD represents the substrate-binding domain of an uncharacterized ATP-binding cassette (ABC) type nickel/dipeptide/oligopeptide-like transporter. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA, the initial nickel receptor. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine.  The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.  Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators,  and unorthodox sensor proteins involved in signal transduction.	460
173869	cd08504	PBP2_OppA	The substrate-binding component of an ABC-type oligopeptide import system contains the type 2 periplasmic binding fold. This family represents the periplasmic substrate-binding component of an ATP-binding cassette (ABC)-type oligopeptide transport system comprised of 5 subunits. The transport system OppABCDF contains two homologous integral membrane proteins OppB and OppC that form the translocation pore; two homologous nucleotide-binding domains OppD and OppF that drive the transport process through binding and hydrolysis of ATP; and the substrate-binding protein or receptor OppA that determines the substrate specificity of the transport system. The dipeptide (DppA) and oligopeptide (OppA) binding proteins differ in several ways. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis.  Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine.  The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.  Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction.	498
173870	cd08505	PBP2_NikA_DppA_OppA_like_18	The substrate-binding component of an uncharacterized ABC-type nickel/dipeptide/oligopeptide-like import system contains the type 2 periplasmic binding fold. This CD represents the substrate-binding domain of an uncharacterized ATP-binding cassette (ABC) type nickel/dipeptide/oligopeptide-like transporter. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA, the initial nickel receptor. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine.  The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.  Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators,  and unorthodox sensor proteins involved in signal transduction.	528
173871	cd08506	PBP2_clavulanate_OppA2	The substrate-binding domain of an oligopeptide binding protein (OppA2) from the biosynthesis pathway of the beta-lactamase inhibitor clavulanic acid contains the type 2 periplasmic binding fold. Clavulanic acid (CA), a clinically important beta-lactamase inhibitor, is one of a family of clavams produced as secondary metabolites by fermentation of Streptomyces clavuligerus. The biosynthesis of CA proceeds via multiple steps from the precursors, glyceraldehyde-3-phosphate and arginine. CA possesses a characteristic (3R,5R) stereochemistry essential for reaction with penicillin-binding proteins and beta-lactamases. Two genes (oppA1 and oppA2) in the clavulanic acid gene cluster encode oligopeptide-binding proteins that are required for CA biosynthesis. OppA1/2 is involved in the binding and transport of peptides across the cell membrane of Streptomyces clavuligerus.  Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine.  The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.  Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction.	466
173872	cd08507	PBP2_SgrR_like	The C-terminal solute-binding domain of DNA-binding transcriptional regulator SgrR is related to the ABC-type oligopeptide-binding proteins and contains the type 2 periplasmic-binding fold. A novel family of SgrR transcriptional regulator contains a two-domain structure with an N terminal DNA-binding domain of the winged helix family and a C-terminal solute-binding domain. The C-terminal domain shows strong homology with the ABC-type oligopeptide-binding protein family, a member of the type 2 periplasmic-binding fold protein (PBP2) superfamily that also includes the C-terminal substrate-binding domain of LysR-type transcriptional regulators. SgrR (SugaR transport-related Regulator) is negatively autoregulated and activates transcription of divergent operon SgrS, which encodes a small RNA required for recovery from glucose-phosphate stress.  Hence, the small RNA SgrS and SgrR, the transcription factor that controls sgrS expression, are both required for recovery from glucose-phosphate stress. Most of periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine.  The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	448
173873	cd08508	PBP2_NikA_DppA_OppA_like_1	The substrate-binding component of an uncharacterized ABC-type nickel/dipeptide/oligopeptide-like import system contains the type 2 periplasmic binding fold. This CD represents the substrate-binding domain of an uncharacterized ATP-binding cassette (ABC) type nickel/dipeptide/oligopeptide-like transporter. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA, the initial nickel receptor. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.  Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction.	470
173874	cd08509	PBP2_TmCBP_oligosaccharides_like	The substrate binding domain of a cellulose-binding protein from Thermotoga maritima contains the type 2 periplasmic binding fold. This family represents the substrate-binding domain of a cellulose-binding protein from the hyperthermophilic bacterium Thermotoga maritima (TmCBP) and its closest related proteins. TmCBP binds a variety of lengths of beta-1,4-linked glucose oligomers, ranging from two sugar rings (cellobiose) to five (cellopentose). TmCBP is structurally homologous to domains I and III of the ATP-binding cassette (ABC)-type oligopeptide-binding proteins and thus belongs to the type 2 periplasmic binding fold protein (PBP2) superfamily.  The type 2 periplasmic binding proteins are soluble ligand-binding components of ABC or tripartite ATP-independent transporters and chemotaxis systems. Members of the PBP2 superfamily function in uptake of a variety of metabolites in bacteria such as amino acids, carbohydrate, ions, and polyamines. Ligands are then transported across the cytoplasmic membrane energized by ATP hydrolysis or electrochemical ion gradient. Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction.	509
173875	cd08510	PBP2_Lactococcal_OppA_like	The substrate binding component of an ABC-type lactococcal OppA-like transport system contains the type 2 periplasmic binding fold. This family represents the substrate binding domain of an ATP-binding cassette (ABC)-type oligopeptide import system from Lactococcus lactis and other gram-positive bacteria, as well as its closest homologs from gram-negative bacteria. Oligopeptide-binding protein (OppA) from Lactococcus lactis can bind peptides of length from 4 to at least 35 residues without sequence preference.  The oligopeptide import system OppABCDF consists of five subunits: two homologous integral membrane proteins OppB and OppC that form the translocation pore; two homologous nucleotide-binding domains OppD and OppF that drive the transport process through binding and hydrolysis of ATP; and the substrate-binding protein or receptor OppA that determines the substrate specificity of the transport system. The dipeptide (DppA) and oligopeptide (OppA) binding proteins differ in several ways. The DppA binds dipeptides and some tripeptides and also is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis.  Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine.  The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.  Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction.	516
173876	cd08511	PBP2_NikA_DppA_OppA_like_5	The substrate-binding component of an uncharacterized ABC-type nickel/dipeptide/oligopeptide-like import system contains the type 2 periplasmic binding fold. This family represents the substrate-binding domain of an uncharacterized ATP-binding cassette (ABC) type nickel/dipeptide/oligopeptide-like transporter. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA, the initial nickel receptor. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine.  The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.  Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators,  and unorthodox sensor proteins involved in signal transduction.	467
173877	cd08512	PBP2_NikA_DppA_OppA_like_7	The substrate-binding component of an uncharacterized ABC-type nickel/dipeptide/oligopeptide-like import system contains the type 2 periplasmic binding fold. This CD represents the substrate-binding domain of an uncharacterized ATP-binding cassette (ABC) type nickel/dipeptide/oligopeptide-like transporter. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA, the initial nickel receptor. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine.  The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.  Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators,  and unorthodox sensor proteins involved in signal transduction.	476
173878	cd08513	PBP2_thermophilic_Hb8_like	The substrate-binding component of ABC-type thermophilic oligopeptide-binding protein Hb8-like import systems contains the type 2 periplasmic binding fold. This family includes the substrate-binding domain of an ABC-type oligopeptide-binding protein Hb8 from Thermus thermophilus and its closest homologs from other bacteria. The structural topology of this substrate-binding domain is similar to those of DppA from Escherichia coli and OppA from Salmonella typhimurium, and thus belongs to the type 2 periplasmic binding fold protein (PBP2) superfamily. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. The type 2 periplasmic binding proteins are soluble ligand-binding components of ABC or tripartite ATP-independent transporters and chemotaxis systems. Members of the PBP2 superfamily function in uptake of a variety of metabolites in bacteria such as amino acids, carbohydrate, ions, and polyamines. Ligands are then transported across the cytoplasmic membrane energized by ATP hydrolysis or electrochemical ion gradient. Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction.	482
173879	cd08514	PBP2_AppA_like	The substrate-binding component of the oligopeptide-binding protein, AppA, from Bacillus subtilis contains the type 2 periplasmic-binding fold. This family represents the substrate-binding domain of the oligopeptide-binding protein, AppA, from Bacillus subtilis and its closest homologs from other bacteria and archaea. Bacillus subtilis has three ABC-type peptide transport systems, a dipeptide-binding protein (DppA) and two oligopeptide-binding proteins (OppA and AppA) with overlapping specificity. The dipeptide (DppA) and oligopeptide (OppA) binding proteins differ in several ways. The DppA binds dipeptides and some tripeptides and also is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine.  The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.  Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction.	483
173880	cd08515	PBP2_NikA_DppA_OppA_like_10	The substrate-binding component of an uncharacterized ABC-type nickel/dipeptide/oligopeptide-like import system contains the type 2 periplasmic binding fold. This CD represents the substrate-binding domain of an uncharacterized ATP-binding cassette (ABC) type nickel/dipeptide/oligopeptide-like transporter. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA, the initial nickel receptor. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine.  The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.  Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators,  and unorthodox sensor proteins involved in signal transduction.	460
173881	cd08516	PBP2_NikA_DppA_OppA_like_11	The substrate-binding component of an uncharacterized ABC-type nickel/dipeptide/oligopeptide-like import system contains the type 2 periplasmic binding fold. This CD represents the substrate-binding domain of an uncharacterized ATP-binding cassette (ABC) type nickel/dipeptide/oligopeptide-like transporter. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA, the initial nickel receptor. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine.  The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.  Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators,  and unorthodox sensor proteins involved in signal transduction.	457
173882	cd08517	PBP2_NikA_DppA_OppA_like_13	The substrate-binding component of an uncharacterized ABC-type nickel/dipeptide/oligopeptide-like import system contains the type 2 periplasmic binding fold. This CD represents the substrate-binding domain of an uncharacterized ATP-binding cassette (ABC) type nickel/dipeptide/oligopeptide-like transporter. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA, the initial nickel receptor. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine.  The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.  Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators,  and unorthodox sensor proteins involved in signal transduction.	480
173883	cd08518	PBP2_NikA_DppA_OppA_like_19	The substrate-binding component of an uncharacterized ABC-type nickel/dipeptide/oligopeptide-like import system contains the type 2 periplasmic binding fold. This CD represents the substrate-binding domain of an uncharacterized ATP-binding cassette (ABC) type nickel/dipeptide/oligopeptide-like transporter. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA, the initial nickel receptor. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine.  The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.  Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators,  and unorthodox sensor proteins involved in signal transduction.	464
173884	cd08519	PBP2_NikA_DppA_OppA_like_20	The substrate-binding component of an uncharacterized ABC-type nickel/dipeptide/oligopeptide-like import system contains the type 2 periplasmic binding fold. This CD represents the substrate-binding domain of an uncharacterized ATP-binding cassette (ABC) type nickel/dipeptide/oligopeptide-like transporter. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA, the initial nickel receptor. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine.  The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.  Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators,  and unorthodox sensor proteins involved in signal transduction.	469
173885	cd08520	PBP2_NikA_DppA_OppA_like_21	The substrate-binding component of an uncharacterized ABC-type nickel/dipeptide/oligopeptide-like import system contains the type 2 periplasmic binding fold. This CD represents the substrate-binding domain of an uncharacterized ATP-binding cassette (ABC) type nickel/dipeptide/oligopeptide-like transporter. The oligopeptide-binding protein OppA and the dipeptide-binding protein DppA show significant sequence similarity to NikA, the initial nickel receptor. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine.  The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.  Besides transport proteins, the PBP2 superfamily includes the ligand-binding domains from ionotropic glutamate receptors, LysR-type transcriptional regulators,  and unorthodox sensor proteins involved in signal transduction.	468
176056	cd08521	C2A_SLP	C2 domain first repeat present in Synaptotagmin-like proteins. All Slp members basically share an N-terminal Slp homology domain (SHD) and C-terminal tandem C2 domains (named the C2A domain and the C2B domain) with the SHD and C2 domains being separated by a linker sequence of various length.  Slp1/JFC1 and Slp2/exophilin 4 promote granule docking to the plasma membrane.  Additionally, their C2A domains are both Ca2+ independent, unlike the case in Slp3 and Slp4/granuphilin in which their C2A domains are Ca2+ dependent.  It is thought that SHD (except for the Slp4-SHD) functions as a specific Rab27A/B-binding domain. In addition to Slps, rabphilin, Noc2, and  Munc13-4 also function as Rab27-binding proteins. It has been demonstrated that Slp3 and Slp4/granuphilin promote dense-core vesicle exocytosis. Slp5 mRNA has been shown to be restricted to human placenta and liver suggesting a role in Rab27A-dependent membrane trafficking in specific tissues. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.   This cd contains the first C2 repeat, C2A, and has a type-I topology.	123
260080	cd08523	Reeler_cohesin_like	Domains similar to the eukaryotic reeler domain and bacterial cohesins. This diverse family summarizes a set of distantly related domains, as revealed by structural similarity	128
197341	cd08524	Reelin_subrepeat_like	Tandem repeat subunit of reelin and related proteins. Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the subrepeats, which directly contact each other in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). Genetic deficiency of reelin, or ApoER2 and VLDLR, or Dab1, all exhibit the same phenotypes, including ataxia, cortical layer inversion and abnormal positioning patterns.	144
197342	cd08525	Reelin_subrepeat_1	N-terminal subrepeat in the tandem repeat unit of reelin and related proteins. Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the N-terminal subrepeat, which directly contacts the C-terminal subrepeat and the EGF domain in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1).	161
197343	cd08526	Reelin_subrepeat_2	C-terminal subrepeat in the tandem repeat unit of reelin and related proteins. Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the C-terminal subrepeat, which directly contacts the N-terminal subrepeat and the EGF domain in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1).	152
270867	cd08528	STKc_Nek10	Catalytic domain of the Serine/Threonine Kinase, Never In Mitosis gene A (NIMA)-related kinase 10. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. No function has yet been ascribed to Nek10. The gene encoding Nek10 is a putative causative gene for breast cancer; it is located within a breast cancer susceptibility locus on chromosome 3p24. Nek10 is one in a family of 11 different Neks (Nek1-11) that are involved in cell cycle control. The Nek family is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	270
270868	cd08529	STKc_FA2-like	Catalytic domain of the Serine/Threonine Kinases, Chlamydomonas reinhardtii FA2 and similar proteins. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Chlamydomonas reinhardtii FA2 was discovered in a genetic screen for deflagellation-defective mutants. It is essential for basal-body/centriole-associated microtubule severing, and plays a role in cell cycle progression. No cellular function has yet been ascribed to CNK4. The Chlamydomonas reinhardtii FA2-like subfamily belongs to the (NIMA)-related kinase (Nek) family, which includes seven different Chlamydomonas Neks (CNKs 1-6 and Fa2). This subfamily contains FA2 and CNK4. The Nek family is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	256
270869	cd08530	STKc_CNK2-like	Catalytic domain of the Serine/Threonine Kinases, Chlamydomonas reinhardtii CNK2 and similar proteins. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Chlamydomonas reinhardtii CNK2 has both ciliary and cell cycle functions. It influences flagellar length through promoting flagellar disassembly, and it regulates cell size, through influencing the size threshold at which cells commit to mitosis. This subfamily belongs to the (NIMA)-related kinase (Nek) family, which includes seven different Chlamydomonas Neks (CNKs 1-6 and Fa2). This subfamily includes CNK1 and CNK2.  The Nek family is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	256
188877	cd08531	SAM_PNT-ERG_FLI-1	Sterile alpha motif (SAM)/Pointed domain of ERG (Ets related gene) and FLI-1 (Friend leukemia integration 1) transcription factors. SAM Pointed domain of ERG/FLI-1 subfamily of ETS transcriptional regulators is a putative protein-protein interaction domain. The ERG and FLI regulators are involved in endothelial cell differentiation, bone morphogenesis and neural crest development. They are proto-oncogenes implicated in cancer development such as myeloid leukemia, Ewing's sarcoma and erythroleukemia. Members of this subfamily are potential targets for cancer therapy.	75
188878	cd08532	SAM_PNT-PDEF-like	Sterile alpha motif (SAM)/Pointed domain of prostate-derived ETS factor. SAM Pointed domain of PDEF-like (Prostate-Derived ETS Factor) subfamily of ETS transcriptional regulators is a putative protein-protein interaction domain. In human males this activator is highly expressed in the prostate gland and enhances androgen-mediated activation of the PSA promoter through interaction with the DNA binding domain of androgen receptor. PDEF may play a role in prostate cancer development as well as in goblet cell formation and mucus production in the epithelial lining of respiratory and intestinal tracts.	81
188879	cd08533	SAM_PNT-ETS-1,2	Sterile alpha motif (SAM)/Pointed domain of ETS-1,2 family. SAM Pointed domain of ETS-1,2 family of transcriptional activators is a protein-protein interaction domain. It carries a kinase docking site and mediates interaction between ETS transcriptional activators and protein kinases. This group of transcriptional factors is involved in the Ras/MAP kinase signaling pathway. MAP kinases phosphorylate the transcription factors.  Phosphorylated factors then recruit coactivators and enhance transactivation. Members of this group play a role in regulation of different embryonic developmental processes. ETS-1,2 transcriptional activators are proto-oncogenes involved in malignant transformation and tumor progression. They are potential molecular targets for selective cancer therapy.	71
176084	cd08534	SAM_PNT-GABP-alpha	Sterile alpha motif (SAM)/Pointed domain of GA-binding protein (GABP) alpha chain. SAM Pointed domain of GA-binding protein (GABP) alpha subfamily of ETS transcriptional regulators is a putative protein-protein interaction domain. This type of transcriptional regulators forms heterotetramers containing two alpha and two beta subunits.  It interacts with GA repeats (purine rich repeats). GABP transcriptional factors control gene expression in cell cycle control, apoptosis, and cellular respiration. GABP participates in regulation of transmembrane receptors and key hormones especially in myeloid cells and at the neuromuscular junction.	89
176085	cd08535	SAM_PNT-Tel_Yan	Sterile alpha motif (SAM)/Pointed domain of Tel/Yan protein. SAM Pointed domain of Tel (Translocation, Ets, Leukemia)/Yan subfamily of ETS transcriptional repressors is a protein-protein interaction domain. SAM Pointed domains of this type of regulators can interact with each other, forming head-to-tail homodimers or homooligomers, and/or interact with SAM Pointed domains of another subfamily of ETS factors forming heterodimers. The oligomeric form is able to block transcription of target genes and is involved in MAPK signaling. They participate in regulation of different processes during embryonic development including hematopoietic differentiation and eye development. Tel/Yan transcriptional factors are frequent targets of chromosomal translocations resulting in fusions of SAM domain with new neighboring genes. Such chimeric proteins were found in different tumors. Members of this subfamily are potential targets for cancer therapy.	68
176086	cd08536	SAM_PNT-Mae	Sterile alpha motif (SAM)/Pointed domain of Mae protein homolog. Mae (Modulator of the Activity of ETS) subfamily represents a group of SAM Pointed monodomain proteins. SAM Pointed domain is a protein-protein interaction domain. It can interact with other SAM pointed domains forming head-to-tail heterodimers and also provides a kinase docking site. For example, in Drosophila Mae is required for facilitating phosphorylation of the Yan factor and for blocking phosphorylation of the ETS-2 regulator. Mae interacts with the SAM Pointed domains of Yan and ETS-2. Binding enhances access of the kinase to the Yan phosphorylation site by providing a kinase docking site, or inhibits phosphorylation of ETS-2 by blocking its docking site. This type of factors participates in regulation of kinase signaling particularly during embryogenesis.	66
188880	cd08537	SAM_PNT-ESE-1-like	Sterile alpha motif (SAM)/Pointed domain of ESE-1 like ETS transcriptional regulators. SAM Pointed domain of ESE-1-like (Epithelium-Specific ETS) subfamily of ETS transcriptional regulators is a putative protein-protein interaction domain. SAM Pointed domain of ESE-1 provides a potential docking site for signaling kinase Pak1 in humans. ESE-1 factors are involved in regulation of gene expression in different types of epithelial cells. ESE-1 is expressed in many different organs including intestine, stomach, pancreas, lungs, kidneys, and prostate. The DNA binding consensus motif for ESE-1 consists of a purine-rich GGA[AT] core sequence. The expression profile of these factors is altered in epithelial cancers if compared to normal tissues. Members of this subfamily are potential targets for cancer therapy.	81
188881	cd08538	SAM_PNT-ESE-2-like	Sterile alpha motif (SAM)/Pointed domain of ESE-2 like ETS transcriptional regulators. SAM Pointed domain of ESE-2-like (Epithelium-Specific ETS) subfamily of ETS transcriptional regulators is a putative protein-protein interaction domain. It can act as a major transactivator by providing a potential docking site for co-activators. ESE-2 factors are involved in regulation of gene expression in a variety of epithelial (glandular and secretory) cells. ESE-2 mRNA was found in skin keratinocytes, salivary gland, mammary gland, stomach, prostate, and kidneys. The DNA binding consensus motif for ESE-2 consists of a GGA core and AT-rich flanks. The expression profiles of these factors are altered in epithelial cancers. Members of this subfamily are potential targets for cancer therapy.	84
188882	cd08539	SAM_PNT-ESE-3-like	Sterile alpha motif (SAM)/Pointed domain of ESE-3 like ETS transcriptional regulators. SAM Pointed domain of ESE-3-like (Epithelium-Specific ETS) subfamily of ETS transcriptional regulators is a putative protein-protein interaction domain. It can act as a major transactivator by providing a potential docking site for co-activators. The ESE-3 transcriptional activator is involved in regulation of glandular epithelium differentiation through the MAP kinase signaling cascade. It is found to be expressed in glandular epithelium of prostate, pancreas, salivary gland, and trachea. Additionally, ESE-3 is differentially expressed during monocyte-derived dendritic cells development. DNA binding consensus motif for ESE-3 consists of purine-rich GGAA/T core sequence. The expression profiles of these factors are altered in epithelial cancers. Members of this subfamily are potential targets for cancer therapy.	78
176090	cd08540	SAM_PNT-ERG	Sterile alpha motif (SAM)/Pointed domain of ERG transcription factor. SAM Pointed domain of ERG subfamily of ETS transcriptional regulators is a putative protein-protein interaction domain. It may participate in formation of homodimers or heterodimers with ETS-2, Fli-1, ER81, and Pu-1. However, dimeric forms are inactive and SAM Pointed domain is not essential for dimerization, since ER81 and Pu-1 do not have it. In mouse, a regulator of this type binds the ESET histone H3-specific methyltransferase (human homolog is SETDB1), which leads to modification of the local chromatin structure through histone methylation.  ERG regulators are involved in endothelial cell differentiation, bone morphogenesis and neural crest development. The Erg gene is a proto-oncogene. It is a target of chromosomal translocations resulting in fusions with other neighboring genes. Chimeric proteins were found in solid tumors such as myeloid leukemia or Ewing's sarcoma. Members of this subfamily are potential targets for cancer therapy.	75
188883	cd08541	SAM_PNT-FLI-1	Sterile alpha motif (SAM)/Pointed domain of friend leukemia integration 1 transcription activator. SAM Pointed domain of FLI-1 (Friend Leukemia Integration) subfamily of ETS transcriptional regulators is a putative protein-protein interaction domain. The FLI-1 protein participates in regulation of cellular differentiation, proliferation, and survival. The Fli-1 gene was initially described in Friend virus-induced erythroleukemias as a site for virus integration. It is highly expressed in hematopoietic tissues and at lower level in lungs, heart, and ovaries. Fli-1 is a proto-oncogene implicated in Ewing's sarcoma and erythroleukemia. Members of this subfamily are potential targets for cancer therapy.	91
176092	cd08542	SAM_PNT-ETS-1	Sterile alpha motif (SAM)/Pointed domain of ETS-1. SAM Pointed domain of ETS-1 subfamily of ETS transcriptional activators is a protein-protein interaction domain. The ETS-1 activator is regulated by phosphorylation. It contains a docking site for the ERK2 MAP (Mitogen Activated Protein) kinase, while the ERK2 phosphorylation site is located in the N-terminal disordered region upstream of the SAM Pointed domain. Mutations of the kinase docking site residues inhibit phosphorylation. ETS-1 activators play a role in a number of different physiological processes, and they are expressed during embryonic development, including blood vessel formation, hematopoietic, lymphoid, neuronal and osteogenic differentiation. The Ets-1 gene is a proto-oncogene involved in progression of different tumors (including breast cancer, meningioma, and prostate cancer). Members of this subfamily are potential molecular targets for selective cancer therapy.	88
188884	cd08543	SAM_PNT-ETS-2	Sterile alpha motif (SAM)/Pointed domain of ETS-2. SAM Pointed domain of ETS-2 subfamily of ETS transcriptional regulators is a protein-protein interaction domain. It contains a docking site for Cdk10 (cyclin-dependent kinase 10), a member of the Cdc2 kinase family. The interaction between ETS-2 and Cdk10 kinase inhibits ETS-2 transactivation activity in mammals. ETS-2 is also regulated by ERK2 MAP kinase. ETS-2, which is phosphorylated by ERK2, can interact with coactivators and enhance transactivation. ETS-2 transcriptional activators are involved in embryonic development and cell cycle control. The Ets-2 gene is a proto-oncogene. It is overexpressed in breast and prostate cancer cells and its overexpression is necessary for transformation of such cells. Members of ETS-2 subfamily are potential molecular targets for selective cancer therapy.	89
260081	cd08544	Reeler	Reeler, the N-terminal domain of reelin, F-spondin, and a variety of other proteins. This domain is found at the N-terminus of F-spondin, a protein attached to the extracellular matrix, which plays roles in neuronal development and vascular remodelling. The F-spondin reeler domain has been reported to bind heparin. The reeler domain is also found at the N-terminus of reelin, an extracellular glycoprotein involved in the development of the brain cortex, and in a variety of other eukaryotic proteins with different domain architectures, including the animal ferric-chelate reductase 1 or stromal cell-derived receptor 2, a member of the cytochrome B561 family, which reduces ferric iron before its transport from the endosome to the cytoplasm. Also included is the insect putative defense protein 1, which is expressed upon bacterial infection and appears to contain a single reeler domain.	135
260082	cd08545	YcnI_like	Reeler-like domain of YcnI and similar proteins. YcnI is a copper-responsive gene of Bacillus subtilis. It is homologous to an uncharacterized protein from Nocardia farcinica, which shares a conserved three-dimensional structure with cohesins and the reeler domain. Some members in this YcnI_like family have C-terminal domains (DUF461) that may bind copper.	152
260083	cd08546	cohesin_like	Cohesin domain, interaction partner of dockerin. Bacterial cohesin domains bind to a complementary protein domain named dockerin, and this interaction is required for the formation of the cellulosome, a cellulose-degrading complex. The cellulosome consists of scaffoldin, a noncatalytic scaffolding polypeptide, that comprises repeating cohesion modules and a single carbohydrate-binding module (CBM). Specific calcium-dependent interactions between cohesins and dockerins appear to be essential for cellulosome assembly. Cohesin modules are phylogenetically distributed into three groups: type I cohesin-dockerin interactions mediate assembly of a range of dockerin-borne enzymes to the complex, while type-II interactions mediate attachment of the cellulosome complex to the bacterial cell wall. Recently discovered type-III cohesins, such as found in the anchoring scaffoldin ScaE, appear to contribute to increased stability of the elaborate cellulosome complex. While the presence of cohesin and dockerin domains in a genome can be indicative of cellulolytic activity, cohesin domains may occur in a wider range of domain architectures, biological systems, and taxonomic lineages.	144
260084	cd08547	Type_II_cohesin	Type II cohesin domain, interaction partner of dockerin. Bacterial cohesin domains bind to a complementary protein domain named dockerin, and this interaction is required for the formation of the cellulosome, a cellulose-degrading complex. The cellulosome consists of scaffoldin, a noncatalytic scaffolding polypeptide, that comprises repeating cohesion modules and a single carbohydrate-binding module (CBM). Specific calcium-dependent interactions between cohesins and dockerins appear to be essential for cellulosome assembly. This subfamily represents type II cohesins; their interactions with dockerin mediate attachment of the cellulosome complex to the bacterial cell wall.	136
260085	cd08548	Type_I_cohesin_like	Type I cohesin domain, interaction partner of dockerin. Bacterial cohesin domains bind to a complementary protein domain named dockerin, and this interaction is required for the formation of the cellulosome, a cellulose-degrading complex. The cellulosome consists of scaffoldin, a noncatalytic scaffolding polypeptide, that comprises repeating cohesion modules and a single carbohydrate-binding module (CBM). Specific calcium-dependent interactions between cohesins and dockerins appear to be essential for cellulosome assembly. This subfamily represents type I cohesins; their interactions with dockerin mediate assembly of a range of dockerin-borne enzymes to the complex.	135
341479	cd08549	G1PDH_related	Glycerol-1-phosphate dehydrogenase and related proteins. This family contains bacterial and archaeal glycerol-1-phosphate dehydrogenase-like oxidoreductases. These proteins have similarity with glycerol-1-phosphate dehydrogenase (G1PDH) which plays a role in the synthesis of phosphoglycerolipids in gram-positive bacterial species. It catalyzes the reversible reduction of dihydroxyacetone phosphate (DHAP) to glycerol-1-phosphate (G1P) in a NADH-dependent manner. Its activity requires Ni++ ion. It also contains archaeal Sn-glycerol-1-phosphate dehydrogenase (Gro1PDH) that plays an important role in the formation of the enantiomeric configuration of the glycerophosphate backbone (sn-glycerol-1-phosphate) of archaeal ether lipids.	331
341480	cd08550	GlyDH-like	Glycerol_dehydrogenase-like. This family contains glycerol dehydrogenase (GlyDH)-like proteins. Glycerol dehydrogenase (GlyDH) is a key enzyme in the glycerol dissimilation pathway. In anaerobic conditions, many microorganisms utilize glycerol as a source of carbon through coupled oxidative and reductive pathways. One of the pathways involves the oxidation of glycerol to dihydroxyacetone with the reduction of NAD+ to NADH catalyzed by glycerol dehydrogenases. Dihydroxyacetone is then phosphorylated by dihydroxyacetone kinase and enters the glycolytic pathway for further degradation. The activity of GlyDH is zinc-dependent; the zinc ion plays a role in stabilizing an alkoxide intermediate at the active site. Some subfamilies have yet to be characterized.	347
341481	cd08551	Fe-ADH	iron-containing alcohol dehydrogenases (Fe-ADH)-like. This family contains large metal-containing alcohol dehydrogenases (ADH), known as iron-containing alcohol dehydrogenases. They contain a dehydroquinate synthase-like protein structural fold and mostly contain iron. They are distinct from other alcohol dehydrogenases which contain different protein domains. There are several distinct families of alcohol dehydrogenases: Zinc-containing long-chain alcohol dehydrogenases, insect-type, or short-chain alcohol dehydrogenases, iron-containing alcohol dehydrogenases, among others. The iron-containing family has a Rossmann fold-like topology that resembles the fold of the zinc-dependent alcohol dehydrogenases, but lacks sequence homology, and differs in strand arrangement. ADH catalyzes the reversible oxidation of alcohol to acetaldehyde with the simultaneous reduction of NAD(P)+ to NAD(P)H.	372
350202	cd08553	PIN_Fcf1-like	VapC-like PIN domain of rRNA-processing proteins, Fcf1 (Utp24, YDR339C), Utp23 (YOR004W), and other eukaryotic homologs. Fcf1 (FAF1-copurifying factor 1, also known as Utp24) and Utp23 (U three-associated protein 23) are essential proteins involved in pre-rRNA processing and 40S ribosomal subunit assembly. Components of the small subunit (SSU) processome, Fcf1 and Utp23 are essential nucleolar proteins that are required for processing of the 18S pre-rRNA at sites A0-A2. The Fcf1 protein was reported to interact with Pmc1p (vacuolar Ca2+ ATPase) and Cor1p (core subunit of the ubiquinol-cytochrome c reductase complex). The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found. The subfamily of Fcf1- and Utp23-like homologs have three of the four conserved residues found in S. cerevisiae Fcf1. Some members of the superfamily, including S. cerevisiae Utp23, lack several of these key catalytic residues. Mutation of the remaining conserved putative active site residues seen in Utp23 did not interfere with rRNA maturation and cell viability. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	123
176489	cd08554	Cyt_b561	Eukaryotic cytochrome b(561). Cytochrome b(561) is a family of endosomal or secretory vesicle-specific electron transport proteins. They are integral membrane proteins that bind two heme groups non-covalently, and may have six alpha-helical trans-membrane segments. This is an exclusively eukaryotic family. Members of the prokaryotic cytochrome b561 family are not deemed homologous.	131
176498	cd08555	PI-PLCc_GDPD_SF	Catalytic domain of phosphoinositide-specific phospholipase C-like phosphodiesterases superfamily. The PI-PLC-like phosphodiesterases superfamily represents the catalytic domains of bacterial phosphatidylinositol-specific phospholipase C (PI-PLC, EC 4.6.1.13), eukaryotic phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11), glycerophosphodiester phosphodiesterases (GP-GDE, EC 3.1.4.46), sphingomyelinases D (SMases D) (sphingomyelin phosphodiesterase D, EC 3.1.4.41) from spider venom, SMases D-like proteins, and phospholipase D (PLD) from several pathogenic bacteria, as well as their uncharacterized homologs found in organisms ranging from bacteria and archaea to metazoans, plants, and fungi. PI-PLCs are ubiquitous enzymes hydrolyzing the membrane lipid phosphoinositides to yield two important second messengers, inositol phosphates and diacylglycerol (DAG). GP-GDEs play essential roles in glycerol metabolism and catalyze the hydrolysis of glycerophosphodiesters to sn-glycerol-3-phosphate (G3P) and the corresponding alcohols that are major sources of carbon and phosphate. Both, PI-PLCs and GP-GDEs, can hydrolyze the 3'-5' phosphodiester bonds in different substrates, and utilize a similar mechanism of general base and acid catalysis with conserved histidine residues, which consists of two steps, a phosphotransfer and a phosphodiesterase reaction. This superfamily also includes Neurospora crassa ankyrin repeat protein NUC-2 and its Saccharomyces cerevisiae counterpart, Phosphate system positive regulatory protein PHO81, glycerophosphodiester phosphodiesterase (GP-GDE)-like protein SHV3 and SHV3-like proteins (SVLs). The residues essential for enzyme activities and metal binding are not conserved in these sequence homologs, which might suggest that the function of catalytic domains in these proteins might be distinct from those in typical PLC-like phosphodiesterases.	179
176499	cd08556	GDPD	Glycerophosphodiester phosphodiesterase domain as found in prokaryota and eukaryota, and similar proteins. The typical glycerophosphodiester phosphodiesterase domain (GDPD) consists of a TIM barrel and a small insertion domain named the GDPD-insertion (GDPD-I) domain, which is specific for GDPD proteins. This family corresponds to both typical GDPD domain and GDPD-like domain which lacks the GDPD-I region. Members in this family mainly consist of a large family of prokaryotic and eukaryotic glycerophosphodiester phosphodiesterases (GP-GDEs, EC 3.1.4.46), and a number of uncharacterized homologs. Sphingomyelinases D (SMases D) (sphingomyelin phosphodiesterase D, EC 3.1.4.41) from spider venom, SMases D-like proteins, and phospholipase D (PLD) from several pathogenic bacteria are also included in this family. GDPD plays an essential role in glycerol metabolism and catalyzes the hydrolysis of glycerophosphodiesters to sn-glycerol-3-phosphate (G3P) and the corresponding alcohols, which are major sources of carbon and phosphate. Its catalytic mechanism is based on the metal ion-dependent acid-base reaction, which is similar to that of phosphoinositide-specific phospholipases C (PI-PLCs, EC 3.1.4.11). Both, GDPD related proteins and PI-PLCs, belong to the superfamily of PI-PLC-like phosphodiesterases.	189
176500	cd08557	PI-PLCc_bacteria_like	Catalytic domain of bacterial phosphatidylinositol-specific phospholipase C and similar proteins. This subfamily corresponds to the catalytic domain present in bacterial phosphatidylinositol-specific phospholipase C (PI-PLC, EC 4.6.1.13) and their sequence homologs found in eukaryota. Bacterial PI-PLCs participate in Ca2+-independent PI metabolism, hydrolyzing the membrane lipid phosphatidylinositol (PI) to produce phosphorylated myo-inositol and diacylglycerol (DAG). Although their precise physiological function remains unclear, bacterial PI-PLCs may function as virulence factors in some pathogenic bacteria. Bacterial PI-PLCs contain a single TIM-barrel type catalytic domain. Its catalytic mechanism is based on general base and acid catalysis utilizing two well conserved histidines, and consists of two steps, a phosphotransfer and a phosphodiesterase reaction. Eukaryotic homologs in this family are named as phosphatidylinositol-specific phospholipase C X domain containing proteins (PI-PLCXD). They are distinct from the typical eukaryotic phosphoinositide-specific phospholipases C (PI-PLC, EC 3.1.4.11), which have a multidomain organization that consists of a PLC catalytic core domain, and various regulatory domains. The catalytic core domain is assembled from two highly conserved X- and Y-regions split by a divergent linker sequence. In contrast, eukaryotic PI-PLCXDs contain a single TIM-barrel type catalytic domain, X domain, which is closely related to that of bacterial PI-PLCs. Although the biological function of eukaryotic PI-PLCXDs still remains unclear, it may be distinct from that of typical eukaryotic PI-PLCs. This family also includes a distinctly different type of eukaryotic PLC, glycosylphosphatidylinositol-specific phospholipase C (GPI-PLC), an integral membrane protein characterized in the protozoan parasite Trypanosoma brucei. T. brucei GPI-PLC hydrolyzes the GPI-anchor on the variant specific glycoprotein (VSG), releasing dimyristyl glycerol (DMG), which may facilitate the evasion of the protozoan to the host's immune system. It does not require Ca2+ for its activity and is more closely related to bacterial PI-PLCs, but not mammalian PI-PLCs.	271
176501	cd08558	PI-PLCc_eukaryota	Catalytic domain of eukaryotic phosphoinositide-specific phospholipase C and similar proteins. This family corresponds to the catalytic domain present in eukaryotic phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11) and similar proteins. The higher eukaryotic PI-PLCs play a critical role in most signal transduction pathways, controlling numerous cellular events such as cell growth, proliferation, excitation and secretion. They strictly require Ca2+ for the catalytic activity. They display a clear preference towards the hydrolysis of the more highly phosphorylated membrane phospholipids PI-analogues, phosphatidylinositol 4,5-bisphosphate (PIP2) and phosphatidylinositol-4-phosphate (PIP), to generate two important second messengers in eukaryotic signal transduction cascades, inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which then phosphorylates other molecules, leading to altered cellular activity. The eukaryotic PI-PLCs have a multidomain organization that consists of a PLC catalytic core domain, and various regulatory domains, such as the pleckstrin homology (PH) domain, EF-hand motif, and C2 domain. The catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a linker region. The catalytic mechanism of eukaryotic PI-PLCs is based on general base and acid catalysis utilizing two well conserved histidines and consists of two steps, a phosphotransfer and a phosphodiesterase reaction. The mammalian PI-PLCs consist of 13 isozymes, which are classified into six-subfamilies, PI-PLC-delta (1,3 and 4), -beta(1-4), -gamma(1,2), -epsilon, -zeta, and -eta (1,2). Ca2+ is required for the activation of all forms of mammalian PI-PLCs, and the concentration of calcium influences substrate specificity. This family also includes metazoan phospholipase C related but catalytically inactive proteins (PRIP), which belong to a group of novel inositol trisphosphate binding proteins. Due to the replacement of critical catalytic residues, PRIP does not have PLC enzymatic activity.	226
176502	cd08559	GDPD_periplasmic_GlpQ_like	Periplasmic glycerophosphodiester phosphodiesterase domain (GlpQ) and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in bacterial and eukaryotic glycerophosphodiester phosphodiesterase (GP-GDE, EC 3.1.4.46) similar to Escherichia coli periplasmic phosphodiesterase GlpQ. GP-GDEs are involved in glycerol metabolism and catalyze the degradation of glycerophosphodiesters to produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols, which are major sources of carbon and phosphate. In E. coli, there are two major G3P uptake systems: Glp and Ugp, which contain genes coding for two different GP-GDEs. GlpQ gene from the glp operon codes for a periplasmic phosphodiesterase GlpQ. GlpQ is a dimeric enzyme that hydrolyzes periplasmic glycerophosphodiesters, such as glycerophosphocholine (GPC), glycerophosphoethanolamine (GPE), glycerophosphoglycerol (GPG), glycerophosphoinositol (GPI), and glycerophosphoserine (GPS), to the corresponding alcohols and G3P, which is subsequently transported into the cell through the GlpT transport system. Ca2+ is required for GlpQ enzymatic activity. This subfamily also includes some GP-GDEs in higher plants and their eukaryotic homologs, which show very high sequence similarities with bacterial periplasmic GP-GDEs.	296
176503	cd08560	GDPD_EcGlpQ_like_1	Glycerophosphodiester phosphodiesterase domain similar to Escherichia coli periplasmic phosphodiesterase (GlpQ) include uncharacterized proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in a group of uncharacterized glycerophosphodiester phosphodiesterases (GP-GDE, EC 3.1.4.46) and their hypothetical homologs. Members in this subfamily show high sequence similarity to Escherichia coli periplasmic phosphodiesterase GlpQ, which catalyzes the Ca2+-dependent degradation of periplasmic glycerophosphodiesters to produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols.	356
176504	cd08561	GDPD_cytoplasmic_ScUgpQ2_like	Glycerophosphodiester phosphodiesterase domain of Streptomyces coelicolor cytoplasmic phosphodiesterases UgpQ2 and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in a group of uncharacterized cytoplasmic phosphodiesterases which predominantly exist in bacteria. The prototype of this family is a putative cytoplasmic phosphodiesterase encoded by gene ulpQ2 (SCO1419) in the Streptomyces coelicolor genome. It is distantly related to the Escherichia coli cytoplasmic phosphodiesterases UgpQ that catalyzes the hydrolysis of glycerophosphodiesters at the inner side of the cytoplasmic membrane to produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols.	249
176505	cd08562	GDPD_EcUgpQ_like	Glycerophosphodiester phosphodiesterase domain in Escherichia coli cytosolic glycerophosphodiester phosphodiesterase UgpQ and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in Escherichia coli cytosolic glycerophosphodiester phosphodiesterase (GP-GDE, EC 3.1.4.46), UgpQ, and similar proteins. GP-GDE plays an essential role in the metabolic pathway of E. coli. It catalyzes the degradation of glycerophosphodiesters to produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols, which are major sources of carbon and phosphate. E. coli possesses two major G3P uptake systems: Glp and Ugp, which contain genes coding for two distinct GP-GDEs. UgpQ gene from the E. coli ugp operon codes for a cytosolic phosphodiesterase UgpQ, which is the prototype of this family. Various glycerophosphodiesters, such as glycerophosphocholine (GPC), glycerophosphoethanolamine (GPE), glycerophosphoglycerol (GPG), glycerophosphoinositol (GPI), and glycerophosphoserine (GPS), can only be hydrolyzed by UgpQ during transport at the inner side of the cytoplasmic membrane to alcohols and G3P, which is a source of phosphate. In contrast to Ca2+-dependent periplasmic phosphodiesterase GlpQ, cytosolic phosphodiesterase UgpQ requires divalent cations, such as Mg2+, Co2+, or Mn2+, for its enzyme activity.	229
176506	cd08563	GDPD_TtGDE_like	Glycerophosphodiester phosphodiesterase domain of Thermoanaerobacter tengcongensis and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in Thermoanaerobacter tengcongensis glycerophosphodiester phosphodiesterase (TtGDE, EC 3.1.4.46) and its uncharacterized homologs. Members in this family show high sequence similarity to Escherichia coli GP-GDE, which catalyzes the degradation of glycerophosphodiesters to produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols. Despite the fact that most of GDPD family members exist as the monomer, TtGDE can function as a dimeric unit. Its catalytic mechanism is based on the general base-acid catalysis, which is similar to that of phosphoinositide-specific phospholipases C (PI-PLCs, EC 3.1.4.11). A divalent metal cation is required for the enzyme activity of TtGDE.	230
176507	cd08564	GDPD_GsGDE_like	Glycerophosphodiester phosphodiesterase domain of putative Galdieria sulphuraria glycerophosphodiester phosphodiesterase and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in putative Galdieria sulphuraria glycerophosphodiester phosphodiesterase (GsGDE, EC 3.1.4.46) and its uncharacterized eukaryotic homologs. Members in this family show high sequence similarity to Escherichia coli GP-GDE, which catalyzes the degradation of glycerophosphodiesters to produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols.	265
176508	cd08565	GDPD_pAtGDE_like	Glycerophosphodiester phosphodiesterase domain of putative Agrobacterium tumefaciens glycerophosphodiester phosphodiesterase and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in putative Agrobacterium tumefaciens glycerophosphodiester phosphodiesterase (pAtGDE, EC 3.1.4.46) and its uncharacterized homologs. Members in this family show high sequence similarity to Escherichia coli GP-GDE, which catalyzes the degradation of glycerophosphodiesters to produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols.	235
176509	cd08566	GDPD_AtGDE_like	Glycerophosphodiester phosphodiesterase domain of Agrobacterium tumefaciens and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in Agrobacterium tumefaciens glycerophosphodiester phosphodiesterase (AtGDE, EC 3.1.4.46) and its uncharacterized eukaryotic homologues. Members in this family show high sequence similarity to Escherichia coli GP-GDE, which catalyzes the degradation of glycerophosphodiesters to produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols. AtGDE exists as a hexamer that is a trimer of dimers, which is unique among current known GDPD family members. However, it remains unclear if the hexamer plays a physiological role in AtGDE enzymatic function.	240
176510	cd08567	GDPD_SpGDE_like	Glycerophosphodiester phosphodiesterase domain of putative Silicibacter pomeroyi glycerophosphodiester phosphodiesterase and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in a group of uncharacterized bacterial glycerophosphodiester phosphodiesterases (GP-GDE, EC 3.1.4.46) and similar proteins. The prototype of this CD is a putative GP-GDE from Silicibacter pomeroyi (SpGDE). It shows high sequence similarity to Escherichia coli GP-GDE, which catalyzes the degradation of glycerophosphodiesters to produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols.	263
176511	cd08568	GDPD_TmGDE_like	Glycerophosphodiester phosphodiesterase domain of Thermotoga maritima and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in Thermotoga maritima glycerophosphodiester phosphodiesterase (TmGDE, EC 3.1.4.46) and its uncharacterized homologs. Members in this family show high sequence similarity to Escherichia coli GP-GDE, which catalyzes the degradation of glycerophosphodiesters to produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols. TmGDE exists as a monomer that might be the biologically relevant form.	226
176512	cd08570	GDPD_YPL206cp_fungi	Glycerophosphodiester phosphodiesterase domain of Saccharomyces cerevisiae YPL206cp and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in Saccharomyces cerevisiae YPL206cp and uncharacterized hypothetical homologs existing in fungi. The product of S. cerevisiae ORF YPL206c (PGC1), YPL206cp (Pgc1p), displays homology to bacterial and mammalian glycerophosphodiester phosphodiesterases (GP-GDE, EC 3.1.4.46), which catalyze the degradation of glycerophosphodiesters to produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols. S. cerevisiae YPL206cp is an integral membrane protein with a single GDPD domain followed by a short hydrophobic C-terminal tail that may function as a membrane anchor. This protein plays an essential role in the regulation of the cardiolipin (CL) biosynthetic pathway in yeast by removing the excess phosphatidylglycerol (PG) content of membranes via a phospholipase C-type degradation mechanism. YPL206cp has been characterized as a PG-specific phospholipase C that selectively catalyzes the cleavage of PG, not glycerophosphoinositol (GPI) or glycerophosphocholine (GPC), to diacylglycerol (DAG) and glycerophosphate. Members in this family are distantly related to S. cerevisiae YPL110cp, which selectively hydrolyzes glycerophosphocholine (GPC), not glycerophosphoinositol (GPI), to generate choline and glycerolphosphate, and has been characterized as a cytoplasmic GPC-specific phosphodiesterase.	234
176513	cd08571	GDPD_SHV3_plant	Glycerophosphodiester phosphodiesterase domain of glycerophosphodiester phosphodiesterase-like protein SHV3 and SHV3-like proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase (GDPD) domain present in glycerophosphodiester phosphodiesterase (GP-GDE)-like protein SHV3 and SHV3-like proteins (SVLs), which may play an important role in cell wall organization. The prototype of this family is a glycosylphosphatidylinositol (GPI) anchored protein SHV3 encoded by shaven3 (shv3) gene from Arabidopsis thaliana. Members in this family show sequence homology to bacterial GP-GDEs (EC 3.1.4.46) that catalyze the hydrolysis of various glycerophosphodiesters, and produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols.  Both, SHV3 and SVLs, have two tandemly repeated GDPD domains whose biochemical functions remain unclear. The residues essential for interactions with the substrates and calcium ions in bacterial GP-GDEs are not conserved in SHV3 and SVLs, which suggests that the function of GDPD domains in these proteins might be distinct from those in typical bacterial GP-GDEs. In addition, the two tandem repeats show low sequence similarity to each other, suggesting they have different biochemical function. Most members of this family are Arabidopsis-specific gene products. To date, SHV3 orthologues are only found in Physcomitrella patens.	302
176514	cd08572	GDPD_GDE5_like	Glycerophosphodiester phosphodiesterase domain of mammalian glycerophosphodiester phosphodiesterase GDE5-like proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in mammalian glycerophosphodiester phosphodiesterase GDE5-like proteins. GDE5 is widely expressed in mammalian tissues, with highest expression in spinal cord. Although its biological function remains unclear, mammalian GDE5 shows higher sequence homology to fungal and plant glycerophosphodiester phosphodiesterases (GP-GDEs, EC 3.1.4.46) than to other bacterial and mammalian GP-GDEs. It may also hydrolyze glycerophosphodiesters to sn-glycerol-3-phosphate (G3P) and the corresponding alcohols.	293
176515	cd08573	GDPD_GDE1	Glycerophosphodiester phosphodiesterase domain of mammalian glycerophosphodiester phosphodiesterase GDE1 and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in mammalian GDE1 (also known as MIR16, membrane interacting protein of RGS16) and their metazoan homologs. GDE1 is widely expressed in mammalian tissues, including the heart, brain, liver, and kidney. It shows sequence homology to bacterial glycerophosphodiester phosphodiesterases (GP-GDEs, EC 3.1.4.46), which catalyze the hydrolysis of various glycerophosphodiesters, and produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols. GDE1 has been characterized as GPI-GDE (EC 3.1.4.44) that selectively hydrolyzes extracellular glycerophosphoinositol (GPI) to generate glycerol phosphate and inositol. It functions as an integral membrane-bound glycoprotein interacting with regulator of G protein signaling protein RGS16, and is modulated by G protein-coupled receptor (GPCR) signaling. In addition, GDE1 may interact with PRA1 domain family, member 2 (PRAF2, also known as JM4), which is an interacting protein of the G protein-coupled chemokine receptor CCR5. The catalytic activity, which is dependent on the integrity of the GDPD domain, is required for GDE1 cellular function.	258
176516	cd08574	GDPD_GDE_2_3_6	Glycerophosphodiester phosphodiesterase domain of mammalian glycerophosphodiester phosphodiesterase GDE2, GDE3, GDE6-like proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in mammalian glycerophosphodiester phosphodiesterase domain-containing protein subtype 5 (GDE2), subtype 2 (GDE3), subtype 1 (GDE6), and their eukaryotic homologs. Mammalian GDE2, GDE3, and GDE6 show very high sequence similarity to each other and have been classified into the same family. Although they are all transmembrane proteins, based on different pattern of tissue distribution, these enzymes might display diverse cellular functions. Mammalian GDE2 is primarily expressed in mature neurons. It selectively hydrolyzes glycerophosphocholine (GPC) and mainly functions in a complex with an antioxidant scavenger peroxiredoxin1 (Prdx1) to control motor neuron differentiation in the spinal cord.  Mammalian GDE3 is specifically expressed in bone tissues and spleen. It selectively hydrolyzes extracellular glycerophosphoinositol (GPI) to generate inositol 1-phosphate (Ins1P) and glycerol and functions as an inducer of osteoblast differentiation. Mammalian GDE6 is predominantly expressed in the spermatocytes of testis, and its specific physiological function has not been elucidated yet.	252
176517	cd08575	GDPD_GDE4_like	Glycerophosphodiester phosphodiesterase domain of mammalian glycerophosphodiester phosphodiesterase GDE4-like proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in mammalian GDE4 (also known as glycerophosphodiester phosphodiesterase domain-containing protein 1 (GDPD1)) and similar proteins. Mammalian GDE4 is a transmembrane protein whose cellular function is not elucidated. It is expressed widely, including in placenta, liver, kidney, pancreas, spleen, thymus, ovary, small intestine and peripheral blood leukocytes. It is also expressed in the growth cones in neuroblastoma Neuro2a cells, which suggests mammalian GDE4 may play some distinct role from other members of mammalian GDEs family. Also included in this subfamily are uncharacterized mammalian glycerophosphodiester phosphodiesterase domain-containing protein 3 (GDPD3) and similar proteins which display very high sequence homology to mammalian GDE4.	264
176518	cd08576	GDPD_like_SMaseD_PLD	Glycerophosphodiester phosphodiesterase-like domain of spider venom sphingomyelinases D, bacterial phospholipase D, and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase-like domain (GDPD-like) present in sphingomyelinases D (SMases D) (sphingomyelin phosphodiesterase D, EC 3.1.4.41) from spider venom, the Corynebacterium pseudotuberculosis Phospholipase D (PLD)-like protein from pathogenic bacteria, and the Ajellomyces capsulatus H143 PLD-like protein from ascomycetes. Spider SMases D and bacterial PLD proteins catalyze the Mg2+-dependent hydrolysis of sphingomyelin producing choline and ceramide 1-phosphate (C1P), which possess a number of biological functions, such as regulating cell proliferation and apoptosis, participating in inflammatory responses, and playing a key role in phagocytosis. In the presence of Mg2+, SMases D can function as lysophospholipase D and hydrolyze lysophosphatidylcholine (LPC) to choline and lysophosphatidic acid (LPA), which is a multifunctional phospholipid involved in platelet aggregation, endothelial hyperpermeability, and pro-inflammatory responses. Loxosceles spider venoms' SMases D are the principal toxins responsible for dermonecrosis and complement dependent haemolysis induced by spider venom. Due to amino acid substitutions at the entrance to the active-site pocket, some members lack activity. The typical GDPD domain consists of a TIM barrel and a small insertion domain named as the GDPD-insertion (GDPD-I) domain, which is specific for GDPD proteins. Although proteins in this family contain a non-typical GDPD domain which lacks the GDPD-I, their catalytic mechanisms are based on Mg2+-dependent acid-base reactions similar to GDPD proteins. They might be divergent members of the GDPD family. Moreover, this family does not belong to phospholipase D (PLD) superfamily, since it lacks the conserved HKD sequence motif that characterizes the catalytic center of the PLD superfamily. It belongs to the superfamily of PLC-like phosphodiesterases.	265
176519	cd08577	PI-PLCc_GDPD_SF_unchar3	Uncharacterized hypothetical proteins similar to the catalytic domains of Phosphoinositide-specific phospholipase C and Glycerophosphodiester phosphodiesterases. This subfamily corresponds to a group of uncharacterized hypothetical proteins similar to the catalytic domains of Phosphoinositide-specific phospholipase C (PI-PLC), and glycerophosphodiester phosphodiesterases (GP-GDE), and also sphingomyelinases D (SMases D) and similar proteins. They hydrolyze the 3'-5' phosphodiester bonds in different substrates, utilizing a similar mechanism of general base and acid catalysis involving two conserved histidine residues.	228
176520	cd08578	GDPD_NUC-2_fungi	Putative glycerophosphodiester phosphodiesterase domain of ankyrin repeat protein NUC-2 and similar proteins. This subfamily corresponds to a putative glycerophosphodiester phosphodiesterase domain (GDPD) present in Neurospora crassa ankyrin repeat protein NUC-2 and its Saccharomyces cerevisiae counterpart, Phosphate system positive regulatory protein PHO81. Some uncharacterized NUC-2 sequence homologs are also included in this family. NUC-2 plays an important role in the phosphate-regulated signal transduction pathway in Neurospora crassa. It shows high similarity to a cyclin-dependent kinase inhibitory protein PHO81, which is part of the phosphate regulatory cascade in S. cerevisiae. Both NUC-2 and PHO81 have multi-domain architecture, including an SPX N-terminal domain followed by several ankyrin repeats and a putative C-terminal GDPD domain with unknown function. Although the putative GDPD domain displays sequence homology to that of bacterial glycerophosphodiester phosphodiesterases (GP-GDEs, EC 3.1.4.46), the residues essential for interactions with the substrates and calcium ions in bacterial GP-GDEs are not conserved in members of this family, which suggests the function of putative GDPD domains in these proteins might be distinct from those in typical bacterial GP-GDEs.	300
176521	cd08579	GDPD_memb_like	Glycerophosphodiester phosphodiesterase domain of uncharacterized bacterial glycerophosphodiester phosphodiesterases. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in uncharacterized bacterial glycerophosphodiester phosphodiesterases. In addition to a C-terminal GDPD domain, most members in this family have an N-terminus that functions as a membrane anchor.	220
176522	cd08580	GDPD_Rv2277c_like	Glycerophosphodiester phosphodiesterase domain of uncharacterized bacterial protein Rv2277c and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in uncharacterized bacterial protein Rv2277c and similar proteins. Members in this subfamily are bacterial homologs of mammalian GDE4, a transmembrane protein whose cellular function has not yet been elucidated.	263
176523	cd08581	GDPD_like_1	Glycerophosphodiester phosphodiesterase domain of uncharacterized bacterial glycerophosphodiester phosphodiesterases. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in a group of uncharacterized bacterial glycerophosphodiester phosphodiesterase and similar proteins. They show high sequence similarity to Escherichia coli glycerophosphodiester phosphodiesterase, which catalyzes the degradation of glycerophosphodiesters to produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols.	229
176524	cd08582	GDPD_like_2	Glycerophosphodiester phosphodiesterase domain of uncharacterized bacterial glycerophosphodiester phosphodiesterases. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in a group of uncharacterized bacterial glycerophosphodiester phosphodiesterase and similar proteins. They show high sequence similarity to Escherichia coli glycerophosphodiester phosphodiesterase, which catalyzes the degradation of glycerophosphodiesters to produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols.	233
176525	cd08583	PI-PLCc_GDPD_SF_unchar1	Uncharacterized hypothetical proteins similar to the catalytic domains of Phosphoinositide-specific phospholipase C and Glycerophosphodiester phosphodiesterases. This subfamily corresponds to a group of uncharacterized hypothetical proteins similar to the catalytic domains of Phosphoinositide-specific phospholipase C (PI-PLC), and glycerophosphodiester phosphodiesterases (GP-GDE), and also sphingomyelinases D (SMases D) and similar proteins. They hydrolyze the 3'-5' phosphodiester bonds in different substrates, utilizing a similar mechanism of general base and acid catalysis involving two conserved histidine residues.	237
176526	cd08584	PI-PLCc_GDPD_SF_unchar2	Uncharacterized hypothetical proteins similar to the catalytic domains of Phosphoinositide-specific phospholipase C and Glycerophosphodiester phosphodiesterases. This subfamily corresponds to a group of uncharacterized hypothetical proteins similar to the catalytic domains of Phosphoinositide-specific phospholipase C (PI-PLC), and glycerophosphodiester phosphodiesterases (GP-GDE), and also sphingomyelinases D (SMases D) and similar proteins. They hydrolyze the 3'-5' phosphodiester bonds in different substrates, utilizing a similar mechanism of general base and acid catalysis involving two conserved histidine residues.	192
176527	cd08585	GDPD_like_3	Glycerophosphodiester phosphodiesterase domain of uncharacterized bacterial glycerophosphodiester phosphodiesterases. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in a group of uncharacterized bacterial glycerophosphodiester phosphodiesterase and similar proteins. They show high sequence similarity with Escherichia coli glycerophosphodiester phosphodiesterase, which catalyzes the degradation of glycerophosphodiesters to produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols.	237
176528	cd08586	PI-PLCc_BcPLC_like	Catalytic domain of Bacillus cereus phosphatidylinositol-specific phospholipases C and similar proteins. This subfamily corresponds to the catalytic domain present in Bacillus cereus phosphatidylinositol-specific phospholipase C (PI-PLC, EC 4.6.1.13) and its sequence homologs found in bacteria and eukaryota. Bacterial PI-PLCs participate in Ca2+-independent PI metabolism, hydrolyzing the membrane lipid phosphatidylinositol (PI) to produce phosphorylated myo-inositol and diacylglycerol (DAG). Although their precise physiological function remains unclear, bacterial PI-PLCs may function as virulence factors in some pathogenic bacteria. Bacterial PI-PLCs contain a single TIM-barrel type catalytic domain. Their catalytic mechanism is based on general base and acid catalysis utilizing two well conserved histidines, and consists of two steps, a phosphotransfer and a phosphodiesterase reaction. This family also includes some uncharacterized eukaryotic homologs, which contain a single TIM-barrel type catalytic domain, X domain. They are similar to bacterial PI-PLCs, and distinct from typical eukaryotic PI-PLCs, which have a multidomain organization that consists of a PLC catalytic core domain, and various regulatory domains, and strictly require Ca2+ for their catalytic activities. The prototype of this family is Bacillus cereus PI-PLC, which has a moderate thermal stability and is active as a monomer.	279
176529	cd08587	PI-PLCXDc_like	Catalytic domain of phosphatidylinositol-specific phospholipase C X domain containing and similar proteins. This family corresponds to the catalytic domain present in phosphatidylinositol-specific phospholipase C X domain containing proteins (PI-PLCXD) which are bacterial phosphatidylinositol-specific phospholipase C (PI-PLC, EC 4.6.1.13) sequence homologs mainly found in eukaryota. The typical eukaryotic phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11) have a multidomain organization that consists of a PLC catalytic core domain, and various regulatory domains. The catalytic core domain is assembled from two highly conserved X- and Y-regions split by a divergent linker sequence. In contrast, eukaryotic PI-PLCXDs and their bacterial homologs contain a single TIM-barrel type catalytic domain, X domain, which is more closely related to that of bacterial PI-PLCs. Although the biological function of eukaryotic PI-PLCXDs still remains unclear, it may be distinct from that of typical eukaryotic PI-PLCs.	288
176530	cd08588	PI-PLCc_At5g67130_like	Catalytic domain of Arabidopsis thaliana PI-PLC X domain-containing protein At5g67130 and its uncharacterized homologs. This subfamily corresponds to the catalytic domain present in Arabidopsis thaliana PI-PLC X domain-containing protein At5g67130 and its uncharacterized homologs. Members in this family show high sequence similarity to bacterial phosphatidylinositol-specific phospholipase C (PI-PLC, EC 4.6.1.13), which participates in Ca2+-independent PI metabolism, hydrolyzing the membrane lipid phosphatidylinositol (PI) to produce phosphorylated myo-inositol and diacylglycerol (DAG).	270
176531	cd08589	PI-PLCc_SaPLC1_like	Catalytic domain of Streptomyces antibioticus phosphatidylinositol-specific phospholipase C1-like proteins. This subfamily corresponds to the catalytic domain present in Streptomyces antibioticus phosphatidylinositol-specific phospholipase C1 (SaPLC1) and similar proteins. The typical bacterial phosphatidylinositol-specific phospholipase C (PI-PLC, EC 4.6.1.13) catalyzes Ca2+-independent hydrolysis of the membrane lipid phosphatidylinositol (PI) to produce phosphorylated myo-inositol and diacylglycerol (DAG). The catalytic mechanism is based on general base and acid catalysis utilizing two well conserved histidines, and consists of two steps, a phosphotransfer and a phosphodiesterase reaction. In contrast, SaPLC1 is the first known natural Ca2+-dependent bacterial PI-PLC. It is more closely related to the eukaryotic PI-PLCs rather than the typical bacterial PI-PLCs. It participates in PI metabolism to generate myo-inositol-1-phosphate and myo-inositol-1:2-cyclic phosphate simultaneously. SaPLC1 and other members in this subfamily have two Ca2+-chelating amino acid substitutions which convert them from metal-independent enzymes to metal-dependent bacterial PI-PLC. Additionally, SaPLC1 active site utilizes a mechanism of amino acid juxtaposition, swapping amino acid positions, to adapt a calcium binding pocket and maintain more ideal active site geometry to support efficient catalysis.	324
176532	cd08590	PI-PLCc_Rv2075c_like	Catalytic domain of uncharacterized Mycobacterium tuberculosis Rv2075c-like proteins. This subfamily corresponds to the catalytic domain present in uncharacterized Mycobacterium tuberculosis Rv2075c and its homologs. Members in this family are more closely related to the Streptomyces antibioticus phosphatidylinositol-specific phospholipase C1(SaPLC1)-like proteins rather than the typical bacterial phosphatidylinositol-specific phospholipase C (PI-PLC, EC 4.6.1.13), which participate in Ca2+-independent PI metabolism, hydrolyzing the membrane lipid phosphatidylinositol (PI) to produce phosphorylated myo-inositol and diacylglycerol (DAG). In contrast, SaPLC1-like proteins have two Ca2+-chelating amino acid substitutions which convert them to metal-dependent bacterial PI-PLC. Rv2075c and its homologs have the same amino acid substitutions as well, which might suggest they have metal-dependent PI-PLC activity.	267
176533	cd08591	PI-PLCc_beta	Catalytic domain of metazoan phosphoinositide-specific phospholipase C-beta. This subfamily corresponds to the catalytic domain present in metazoan phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11)-beta isozymes. PI-PLC is a signaling enzyme that hydrolyzes the membrane phospholipids phosphatidylinositol-4,5-bisphosphate (PIP2)  to generate two important second messengers in eukaryotic signal transduction cascades,  Inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which goes on to phosphorylate other molecules, leading to altered cellular activity. Calcium is required for the catalysis. PLC-beta represents a class of mammalian PI-PLC that has an N-terminal pleckstrin homology (PH) domain, an array of EF hands, a PLC catalytic core domain, a C2 domain, and a unique C-terminal coiled-coil (CT) domain necessary for homodimerization. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. There are four PLC-beta isozymes (1-4). They are activated by the heterotrimeric G protein alpha q subunits through their C2 domain and long C-terminal extension. The beta-gamma subunits of heterotrimeric G proteins are known to activate the PLC-beta2 and -beta3 isozymes only. Aside from four PLC-beta isozymes identified in mammals, some eukaryotic PLC-beta homologs have been classified into this subfamily, such as NorpA and PLC-21 from Drosophila and PLC-beta from turkey, Xenopus, sponge, and hydra.	257
176534	cd08592	PI-PLCc_gamma	Catalytic domain of metazoan phosphoinositide-specific phospholipase C-gamma. This family corresponds to the catalytic domain present in metazoan phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11)-gamma isozymes. PI-PLC is a signaling enzyme that hydrolyzes the membrane phospholipids phosphatidylinositol-4,5-bisphosphate (PIP2) to generate two important second messengers in eukaryotic signal transduction cascades, inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which goes on to phosphorylate other molecules, leading to altered cellular activity. Calcium is required for the catalysis. PI-PLC-gamma represents a class of mammalian PI-PLC that has an N-terminal pleckstrin homology (PH) domain, an array of EF hands, a PLC catalytic core domain, and a C2 domain. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. Unique to PI-PLC-gamma, a second PH domain, two SH2 (Src homology 2) regions, and one SH3 (Src homology 3) region are present within this linker region. There are two PI-PLC-gamma isozymes (1-2). They are activated by receptor and non-receptor tyrosine kinases due to the presence of two SH2 and a single SH3 domain within the linker region. Aside from the two PI-PLC-gamma isozymes identified in mammals, some eukaryotic PI-PLC-gamma homologs have been classified with this subfamily.	229
176535	cd08593	PI-PLCc_delta	Catalytic domain of metazoan phosphoinositide-specific phospholipase C-delta. This subfamily corresponds to the catalytic domain present in metazoan phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11)-delta isozymes. PI-PLC is a signaling enzyme that hydrolyzes the membrane phospholipids phosphatidylinositol-4,5-bisphosphate (PIP2)  to generate two important second messengers in eukaryotic signal transduction cascades,  Inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which then phosphorylates other molecules, leading to altered cellular activity. Calcium is required for the catalysis. PLC-delta represents a class of mammalian PI-PLC that has an N-terminal pleckstrin homology (PH) domain, an array of EF hands, a PLC catalytic core domain, and a C-terminal C2 domain. This CD corresponds to the catalytic domain which is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. There are three PI-PLC-delta isozymes (1,3 and 4). PI-PLC-delta1 is relatively well characterized. It is activated by high calcium levels generated by other PI-PLC family members, and therefore functions as a calcium amplifier within the cell. Different PI-PLC-delta isozymes have different tissue distribution and different subcellular locations. PI-PLC-delta1 is mostly a cytoplasmic protein, PI-PLC-delta3 is located in the membrane, and PI-PLC-delta4 is predominantly detected in the cell nucleus. Aside from three PI-PLC-delta isozymes identified in mammals, some eukaryotic PI-PLC-delta homologs have been classified to this CD.	257
176536	cd08594	PI-PLCc_eta	Catalytic domain of metazoan phosphoinositide-specific phospholipase C-eta. This family corresponds to the catalytic domain present in metazoan phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11)-eta isozymes. PI-PLC is a signaling enzyme that hydrolyzes the membrane phospholipids phosphatidylinositol-4,5-bisphosphate (PIP2) to generate two important second messengers in eukaryotic signal transduction cascades, Inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which then phosphorylates other molecules, leading to altered cellular activity. Calcium is required for the catalysis. PI-PLC-eta represents a class of neuron-specific PI-PLC that has an N-terminal pleckstrin homology (PH) domain, an array of EF hands, a PLC catalytic core domain, a C2 domain, and a unique C-terminal tail that terminates with a PDZ-binding motif, a potential interaction site for other signaling proteins. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. There are two PI-PLC-eta isozymes (1-2), both neuron-specific enzymes. They function as calcium sensors that are activated by small increases in intracellular calcium concentrations. The PI-PLC-eta isozymes are also activated through GPCR stimulation. Aside from the PI-PLC-eta isozymes identified in mammals, their eukaryotic homologs are also present in this family.	227
176537	cd08595	PI-PLCc_zeta	Catalytic domain of metazoan phosphoinositide-specific phospholipase C-zeta. This family corresponds to the catalytic domain present in metazoan phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11)-zeta isozyme. PI-PLC is a signaling enzyme that hydrolyzes the membrane phospholipids phosphatidylinositol-4,5-bisphosphate (PIP2) to generate two important second messengers in eukaryotic signal transduction cascades, inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which then phosphorylates other molecules, leading to altered cellular activity. Calcium is required for the catalysis. PI-PLC-zeta represents a class of sperm-specific PI-PLC that has an N-terminal EF-hand domain, a PLC catalytic core domain, and a C-terminal C2 domain. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. There is one PLC-zeta isozyme (1). PLC-zeta plays a fundamental role in vertebrate fertilization by initiating intracellular calcium oscillations that trigger embryo development. However, the mechanism of its activation still remains unclear. Aside from PI-PLC-zeta identified in mammals, its eukaryotic homologs have been classified with this family.	257
176538	cd08596	PI-PLCc_epsilon	Catalytic domain of metazoan phosphoinositide-specific phospholipase C-epsilon. This family corresponds to the catalytic domain present in metazoan phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11)-epsilon isozymes. PI-PLC is a signaling enzyme that hydrolyzes the membrane phospholipids phosphatidylinositol-4,5-bisphosphate (PIP2) to generate two important second messengers in eukaryotic signal transduction cascades, inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which then phosphorylates other molecules, leading to altered cellular activity. Calcium is required for the catalysis. PI-PLC-epsilon represents a class of mammalian PI-PLC that has an N-terminal CDC25 homology domain with a guanyl-nucleotide exchange factor (GEF) activity, a pleckstrin homology (PH) domain, an array of EF hands, a PLC catalytic core domain, a C2 domain, and two predicted RA (Ras association) domains that are implicated in the binding of small GTPases, such as Ras or Rap, from the Ras family. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. There is one PI-PLC-epsilon isozyme (1). PI-PLC-epsilon is activated by G alpha(12/13), G beta gamma, and activated members of Ras and Rho small GTPases. Aside from PI-PLC-epsilon identified in mammals, its eukaryotic homologs have been classified with this family.	254
176539	cd08597	PI-PLCc_PRIP_metazoa	Catalytic domain of metazoan phospholipase C related, but catalytically inactive protein. This family corresponds to the catalytic domain present in metazoan phospholipase C related, but catalytically inactive proteins (PRIP), which belong to a group of novel Inositol 1,4,5-trisphosphate (InsP3) binding proteins. PRIP has a primary structure and domain architecture, incorporating a pleckstrin homology (PH) domain, an array of EF hands, a PLC catalytic core domain with highly conserved X- and Y-regions split by a linker sequence, and a C-terminal C2 domain, similar to phosphoinositide-specific phospholipases C (PI-PLC, EC 3.1.4.11)-delta isoforms. Due to replacement of critical catalytic residues, PRIP does not have PLC enzymatic activity. PRIP consists of two subfamilies, PRIP-1 (previously known as p130 or PLC-1), which is predominantly expressed in the brain, and PRIP-2 (previously known as PLC-2), which exhibits a relatively ubiquitous expression. Experiments show that both PRIP-1 and PRIP-2 are involved in InsP3-mediated calcium signaling pathway and GABA(A) receptor-mediated signaling pathway. In addition, PRIP-2 acts as a negative regulator of B-cell receptor signaling and immune responses.	260
176540	cd08598	PI-PLC1c_yeast	Catalytic domain of putative yeast phosphatidylinositide-specific phospholipases C. This family corresponds to the catalytic domain present in a group of putative phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11) encoded by PLC1 genes from yeasts, which are homologs of the delta isoforms of mammalian PI-PLC in terms of overall sequence similarity and domain organization. Mammalian PI-PLC is a signaling enzyme that hydrolyzes the membrane phospholipids phosphatidylinositol-4,5-bisphosphate (PIP2) to generate two important second messengers in eukaryotic signal transduction cascades, inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which then phosphorylates other molecules, leading to altered cellular activity. Calcium is required for the catalysis. The prototype of this CD is protein Plc1p encoded by PLC1 genes from Saccharomyces cerevisiae. Plc1p contains both highly conserved X- and Y- regions of PLC catalytic core domain, as well as a presumptive EF-hand like calcium binding motif. Experiments show that Plc1p displays calcium dependent catalytic properties with high similarity to those of the mammalian PLCs, and plays multiple roles in modulating the membrane/protein interactions in filamentation control. CaPlc1p encoded by CAPLC1 from the closely related yeast Candida albicans, an orthologue of S. cerevisiae Plc1p, is also included in this group. Like Plc1p, CaPlc1p has conserved presumptive catalytic domain, shows PLC activity when expressed in E. coli, and is involved in multiple cellular processes. There are two other gene copies of CAPLC1 in C. albicans, CAPLC2 (also named as PIPLC) and CAPLC3. Experiments show CaPlc1p is the only enzyme in C. albicans which functions as PLC. The biological functions of CAPLC2 and CAPLC3 gene products must be clearly different from CaPlc1p, but their exact roles remain unclear. Moreover, CAPLC2 and CAPLC3 gene products are more similar to extracellular bacterial PI-PLC than to the eukaryotic PI-PLC, and they are not included in this subfamily.	231
176541	cd08599	PI-PLCc_plant	Catalytic domain of plant phosphatidylinositide-specific phospholipases C. This family corresponds to the catalytic domain present in a group of phosphoinositide-specific phospholipases C (PI-PLC, EC 3.1.4.11) encoded by PLC genes from higher plants, which are homologs of mammalian PI-PLC in terms of overall sequence similarity and domain organization. Mammalian PI-PLC is a signaling enzyme that hydrolyzes the membrane phospholipids phosphatidylinositol-4,5-bisphosphate (PIP2)  to generate two important second messengers in eukaryotic signal transduction cascades, inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which then phosphorylates other molecules, leading to altered cellular activity. Calcium is required for the catalysis. The domain arrangement of plant PI-PLCs is structurally similar to the mammalian PLC-zeta isoform, which lacks the N-terminal pleckstrin homology (PH) domain, but contains EF-hand like motifs (which are absent in a few plant PLCs), a PLC catalytic core domain with X- and Y- highly conserved regions split by a linker sequence, and a C2 domain. However, at the sequence level, the plant PI-PLCs are closely related to the mammalian PLC-delta isoform. Experiments show that plant PLCs display calcium dependent PLC catalytic properties, although they lack some of the N-terminal motifs found in their mammalian counterparts. A putative calcium binding site may be located at the region spanning the X- and Y- domains.	228
176542	cd08600	GDPD_EcGlpQ_like	Glycerophosphodiester phosphodiesterase domain of Escherichia coli (GlpQ) and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in Escherichia coli periplasmic glycerophosphodiester phosphodiesterase (GP-GDE, EC 3.1.4.46), GlpQ, and similar proteins. GP-GDE plays an essential role in the metabolic pathway of E. coli. It catalyzes the degradation of glycerophosphodiesters to produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols, which are major sources of carbon and phosphate. E. coli possesses two major G3P uptake systems: Glp and Ugp, which contain genes coding for two different GP-GDEs. GlpQ gene from the E. coli glp operon codes for a periplasmic phosphodiesterase GlpQ, which is the prototype of this family. GlpQ is a dimeric enzyme that hydrolyzes periplasmic glycerophosphodiesters, such as glycerophosphocholine (GPC), glycerophosphoethanolamine (GPE), glycerophosphoglycerol (GPG), glycerophosphoinositol (GPI), and glycerophosphoserine (GPS), to the corresponding alcohols and G3P, which is subsequently transported into the cell through the GlpT transport system. Ca2+ is required for the enzymatic activity of GlpQ. This family also includes a surface-exposed lipoprotein, protein D (HPD), from Haemophilus influenzae Type b and nontypeable strains, which shows very high sequence similarity with E. coli GlpQ. HPD has been characterized as a human immunoglobulin D-binding protein with glycerophosphodiester phosphodiesterase activity. It can hydrolyze phosphatidylcholine from host membranes to produce free choline on the lipopolysaccharides on the surface of pathogenic bacteria.	318
176543	cd08601	GDPD_SaGlpQ_like	Glycerophosphodiester phosphodiesterase domain of Staphylococcus aureus and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in uncharacterized glycerophosphodiester phosphodiesterase (GP-GDE, EC 3.1.4.46) from Staphylococcus aureus, Bacillus subtilis and similar proteins. Members in this family show very high sequence similarity to Escherichia coli periplasmic phosphodiesterase GlpQ, which catalyzes the Ca2+-dependent degradation of periplasmic glycerophosphodiesters to produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols.	256
176544	cd08602	GDPD_ScGlpQ1_like	Glycerophosphodiester phosphodiesterase domain of Streptomyces coelicolor (GlpQ1) and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in a group of putative bacterial and eukaryotic glycerophosphodiester phosphodiesterases (GP-GDE, EC 3.1.4.46) similar to Escherichia coli periplasmic phosphodiesterase GlpQ, as well as plant glycerophosphodiester phosphodiesterases (GP-PDEs), all of which catalyze the Ca2+-dependent degradation of periplasmic glycerophosphodiesters to produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols. The prototypes of this family include putative secreted phosphodiesterase encoded by gene glpQ1 (SCO1565) from the pho regulon in Streptomyces coelicolor genome, and in plants, two distinct Arabidopsis thaliana genes, AT5G08030 and AT1G74210, coding putative GP-PDEs from the cell walls and vacuoles, respectively.	309
176545	cd08603	GDPD_SHV3_repeat_1	Glycerophosphodiester phosphodiesterase domain repeat 1 of glycerophosphodiester phosphodiesterase-like protein SHV3 and SHV3-like proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) repeat 1 present in glycerophosphodiester phosphodiesterase (GP-GDE)-like protein SHV3 and SHV3-like proteins (SVLs), which may play an important role in cell wall organization. The prototype of this family is a glycosylphosphatidylinositol (GPI) anchored protein SHV3 encoded by shaven3 (shv3) gene from Arabidopsis thaliana. Members in this family show sequence homology to bacterial GP-GDEs (EC 3.1.4.46) that catalyze the hydrolysis of various glycerophosphodiesters, and produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols.  Both, SHV3 and SVLs, have two tandemly repeated GDPD domains whose biochemical functions remain unclear. The residues essential for interactions with the substrates and calcium ions in bacterial GP-GDEs are not conserved in SHV3 and SVLs, which suggests that the function of GDPD domains in these proteins might be distinct from those in typical bacterial GP-GDEs. In addition, the two tandem repeats show low sequence similarity to each other, suggesting they have different biochemical function. Most of the members of this family are Arabidopsis-specific gene products. To date, SHV3 orthologues are only found in Physcomitrella patens. This family includes domain I, the first GDPD domain of SHV3 and SVLs.	299
176546	cd08604	GDPD_SHV3_repeat_2	Glycerophosphodiester phosphodiesterase domain repeat 2 of glycerophosphodiester phosphodiesterase-like protein SHV3 and SHV3-like proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) repeat 2 present in glycerophosphodiester phosphodiesterase (GP-GDE)-like protein SHV3 and SHV3-like proteins (SVLs), which may play an important role in cell wall organization. The prototype of this family is a glycosylphosphatidylinositol (GPI) anchored protein SHV3 encoded by shaven3 (shv3) gene from Arabidopsis thaliana. Members in this family show sequence homology to bacterial GP-GDEs (EC 3.1.4.46) that catalyze the hydrolysis of various glycerophosphodiesters, and produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols. Both SHV3 and SVLs have two tandemly repeated GDPD domains whose biochemical functions remain unclear. The residues essential for interactions with the substrates and calcium ions in bacterial GP-GDEs are not conserved in SHV3 and SVLs, which suggests that the function of GDPD domains in these proteins might be distinct from those in typical bacterial GP-GDEs. In addition, the two tandem repeats show low sequence similarity to each other, suggesting they have different biochemical function. Most of the members of this family are Arabidopsis-specific gene products. To date, SHV3 orthologues are only found in Physcomitrella patens. This CD includes domain II (the second GDPD domain of SHV3 and SVLs), which is necessary for SHV3 function.	300
176547	cd08605	GDPD_GDE5_like_1_plant	Glycerophosphodiester phosphodiesterase domain of uncharacterized plant glycerophosphodiester phosphodiesterase-like proteins similar to mammalian GDE5. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in a group of uncharacterized plant glycerophosphodiester phosphodiesterase (GP-PDE)-like proteins. Members in this family show very high sequence homology to mammalian glycerophosphodiester phosphodiesterase GDE5 and are distantly related to plant GP-PDEs.	282
176548	cd08606	GDPD_YPL110cp_fungi	Glycerophosphodiester phosphodiesterase domain of Saccharomyces cerevisiae YPL110cp and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in Saccharomyces cerevisiae YPL110cp and other uncharacterized fungal homologs. The product of S. cerevisiae ORF YPL110c (GDE1), YPL110cp (Gde1p), displays homology to bacterial and mammalian glycerophosphodiester phosphodiesterases (GP-GDE, EC 3.1.4.46), which catalyzes the degradation of glycerophosphodiesters to produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols. S. cerevisiae YPL110cp has been characterized as a cytoplasmic glycerophosphocholine (GPC)-specific phosphodiesterase that selectively hydrolyzes GPC, not glycerophosphoinositol (GPI), to generate choline and glycerolphosphate. YPL110cp has multi-domain architecture, including not only C-terminal GDPD, but also an SPX N-terminal domain along with several ankyrin repeats, which implies that YPL110cp may mediate protein-protein interactions in a variety of proteins and play a role in maintaining cellular phosphate levels. Members in this family are distantly related to S. cerevisiae YPL206cp, which selectively catalyzes the cleavage of phosphatidylglycerol (PG), not glycerophosphoinositol (GPI) or glycerophosphocholine (GPC), to diacylglycerol (DAG) and glycerophosphate, and has been characterized as a PG-specific phospholipase C.	286
176549	cd08607	GDPD_GDE5	Glycerophosphodiester phosphodiesterase domain of putative mammalian glycerophosphodiester phosphodiesterase GDE5 and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in putative mammalian GDE5 and similar proteins. Mammalian GDE5 is widely expressed in mammalian tissues, with highest expression in the spinal cord. Although its biological function remains unclear, mammalian GDE5 shows higher sequence homology to fungal and plant glycerophosphodiester phosphodiesterases (GP-GDEs, EC 3.1.4.46) than to other bacterial and mammalian GP-GDEs. It may also hydrolyze glycerophosphodiesters to sn-glycerol-3-phosphate (G3P) and the corresponding alcohols. In addition to C-terminal GDPD domain, all members in this subfamily have a starch binding domain (CBM20) in the N-terminus, which suggests these proteins may play a distinct role in glycerol metabolism.	290
176550	cd08608	GDPD_GDE2	Glycerophosphodiester phosphodiesterase domain of mammalian glycerophosphodiester phosphodiesterase GDE2 and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in mammalian GDE2 (also known as glycerophosphodiester phosphodiesterase domain-containing protein 5 (GDPD5)) and their metazoan homologs. Mammalian GDE2 is a transmembrane protein primarily expressed in mature neurons. It is a mammalian homolog of bacterial glycerophosphodiester phosphodiesterases (GP-GDEs, EC 3.1.4.46), which catalyze the hydrolysis of various glycerophosphodiesters, and produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols. Mammalian GDE2 selectively hydrolyzes glycerophosphocholine (GPC) and has been characterized as GPC-GDE (EC 3.1.4.2) that contributes to osmotic regulation of cellular GPC. Mammalian GDE2 functions in a complex with an antioxidant scavenger peroxiredoxin1 (Prdx1) to control motor neuron differentiation in the spinal cord. Mammalian GDE2 also plays a critical role for retinoid-induced neuronal outgrowth. The catalytic activity of GDPD domain is essential for mammalian GDE2 cellular function.	351
176551	cd08609	GDPD_GDE3	Glycerophosphodiester phosphodiesterase domain of mammalian glycerophosphodiester phosphodiesterase GDE3 and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in mammalian GDE3 (also known as glycerophosphodiester phosphodiesterase domain-containing protein 2 (GDPD2), Osteoblast differentiation promoting factor) and their metazoan homologs. Mammalian GDE3 is a transmembrane protein specifically expressed in bone tissues and spleen. It is a mammalian homolog of bacterial glycerophosphodiester phosphodiesterases (GP-GDEs, EC 3.1.4.46), which catalyzes the hydrolysis of various glycerophosphodiesters, and produce sn-glycerol-3-phosphate (G3P) and the corresponding alcohols. Mammalian GDE3 has been characterized as glycerophosphoinositol inositolphosphodiesterase (EC 3.1.4.43) that selectively hydrolyzes extracellular glycerophosphoinositol (GPI) to generate inositol 1-phosphate (Ins1P) and glycerol. Mammalian GDE3 functions as an inducer of osteoblast differentiation. It also plays a critical role for actin cytoskeletal modulation. The catalytic activity of GDPD domain is essential for mammalian GDE3 cellular function.	315
176552	cd08610	GDPD_GDE6	Glycerophosphodiester phosphodiesterase domain of mammalian glycerophosphodiester phosphodiesterase GDE6 and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in mammalian GDE6 (also known as glycerophosphodiester phosphodiesterase domain-containing protein 4 (GDPD4)) and their metazoan homologs. Mammalian GDE6 is a transmembrane protein predominantly expressed in the spermatocytes of testis. Although the specific physiological function of mammalian GDE6 has not been elucidated, its different pattern of tissue distribution suggests it might play a critical role in the completion of meiosis during male germ cell differentiation.	316
176553	cd08612	GDPD_GDE4	Glycerophosphodiester phosphodiesterase domain of mammalian glycerophosphodiester phosphodiesterase GDE4 and similar proteins. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in mammalian GDE4 (also known as glycerophosphodiester phosphodiesterase domain-containing protein 1 (GDPD1)) and similar proteins. Mammalian GDE4 is a transmembrane protein whose cellular function has not yet been elucidated. It is expressed widely, including in placenta, liver, kidney, pancreas, spleen, thymus, ovary, small intestine and peripheral blood leukocytes. It is also expressed in the growth cones in neuroblastoma Neuro2a cells, which suggests GDE4 may play some distinct role from other members of the GDE family.	300
176554	cd08613	GDPD_GDE4_like_1	Glycerophosphodiester phosphodiesterase domain of uncharacterized bacterial  homologs of mammalian glycerophosphodiester phosphodiesterase GDE4. This subfamily corresponds to the glycerophosphodiester phosphodiesterase domain (GDPD) present in uncharacterized bacterial homologs of mammalian GDE4, a transmembrane protein whose cellular function has not been elucidated yet.	309
176555	cd08616	PI-PLCXD1c	Catalytic domain of phosphatidylinositol-specific phospholipase C, X domain containing 1. This subfamily corresponds to the catalytic domain present in a group of phosphatidylinositol-specific phospholipase C X domain containing 1 (PI-PLCXD1), 2 (PI-PLCXD2) and 3 (PI-PLCXD3), which are bacterial phosphatidylinositol-specific phospholipase C (PI-PLC, EC 4.6.1.13) sequence homologs found in vertebrates. The typical eukaryotic phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11) has a multidomain organization that consists of a PLC catalytic core domain, and various regulatory domains. The catalytic core domain is assembled from two highly conserved X- and Y-regions split by a divergent linker sequence. In contrast, members in this group contain a single TIM-barrel type catalytic domain, X domain, and are more closely related to bacterial PI-PLCs, which participate in Ca2+-independent PI metabolism, hydrolyzing the membrane lipid phosphatidylinositol (PI) to produce phosphorylated myo-inositol and diacylglycerol (DAG). Although the biological function of eukaryotic PI-PLCXDs still remains unclear, it may be distinct from that of typical eukaryotic PI-PLCs.	290
176556	cd08619	PI-PLCXDc_plant	Catalytic domain of phosphatidylinositol-specific phospholipase C, X domain containing proteins found in plants. The CD corresponds to the catalytic domain present in uncharacterized plant phosphatidylinositol-specific phospholipase C, X domain containing proteins (PI-PLCXD). The typical eukaryotic phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11) has a multidomain organization that consists of a PLC catalytic core domain, and various regulatory domains. The catalytic core domain is assembled from two highly conserved X- and Y-regions split by a divergent linker sequence. In contrast, plant PI-PLCXDs contain a single TIM-barrel type catalytic domain, X domain, and are more closely related to bacterial PI-PLCs, which participate in Ca2+-independent PI metabolism, hydrolyzing the membrane lipid phosphatidylinositol (PI) to produce phosphorylated myo-inositol and diacylglycerol (DAG). Although the biological function of plant PI-PLCXDs still remains unclear, it may be distinct from that of typical eukaryotic PI-PLCs.	285
176557	cd08620	PI-PLCXDc_like_1	Catalytic domain of uncharacterized hypothetical proteins similar to eukaryotic phosphatidylinositol-specific phospholipase C, X domain containing proteins. This subfamily corresponds to the catalytic domain present in a group of uncharacterized hypothetical proteins found in bacteria and fungi, which are similar to eukaryotic phosphatidylinositol-specific phospholipase C, X domain containing proteins (PI-PLCXD). The typical eukaryotic phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11) has a multidomain organization that consists of a PLC catalytic core domain, and various regulatory domains. The catalytic core domain is assembled from two highly conserved X- and Y-regions split by a divergent linker sequence. In contrast, eukaryotic PI-PLCXDs contain a single TIM-barrel type catalytic domain, X domain, and are more closely related to bacterial PI-PLCs, which participate in Ca2+-independent PI metabolism, hydrolyzing the membrane lipid phosphatidylinositol (PI) to produce phosphorylated myo-inositol and diacylglycerol (DAG). Although the biological function of eukaryotic PI-PLCXDs still remains unclear, it may be distinct from that of typical eukaryotic PI-PLCs.	281
176558	cd08621	PI-PLCXDc_like_2	Catalytic domain of uncharacterized hypothetical proteins similar to eukaryotic phosphatidylinositol-specific phospholipase C, X domain containing proteins. This subfamily corresponds to the catalytic domain present in a group of uncharacterized hypothetical proteins found in bacteria and fungi, which are similar to eukaryotic phosphatidylinositol-specific phospholipase C, X domain containing proteins (PI-PLCXD). The typical eukaryotic phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11) has a multidomain organization that consists of a PLC catalytic core domain, and various regulatory domains. The catalytic core domain is assembled from two highly conserved X- and Y-regions split by a divergent linker sequence. In contrast, eukaryotic PI-PLCXDs contain a single TIM-barrel type catalytic domain, X domain, and are more closely related to bacterial PI-PLCs, which participate in Ca2+-independent PI metabolism, hydrolyzing the membrane lipid phosphatidylinositol (PI) to produce phosphorylated myo-inositol and diacylglycerol (DAG). Although the biological function of eukaryotic PI-PLCXDs still remains unclear, it may be distinct from that of typical eukaryotic PI-PLCs.	300
176559	cd08622	PI-PLCXDc_CG14945_like	Catalytic domain of Drosophila melanogaster CG14945-like proteins similar to phosphatidylinositol-specific phospholipase C, X domain containing. This subfamily corresponds to the catalytic domain present in uncharacterized metazoan Drosophila melanogaster CG14945-like proteins, which are similar to eukaryotic phosphatidylinositol-specific phospholipase C, X domain containing proteins (PI-PLCXD). The typical eukaryotic phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11) has a multidomain organization that consists of a PLC catalytic core domain, and various regulatory domains. The catalytic core domain is assembled from two highly conserved X- and Y-regions split by a divergent linker sequence. In contrast, eukaryotic PI-PLCXDs contain a single TIM-barrel type catalytic domain, X domain, and are more closely related to bacterial PI-PLCs, which participate in Ca2+-independent PI metabolism, hydrolyzing the membrane lipid phosphatidylinositol (PI) to produce phosphorylated myo-inositol and diacylglycerol (DAG). Although the biological function of eukaryotic PI-PLCXDs still remains unclear, it may be distinct from that of typical eukaryotic PI-PLCs.	276
176560	cd08623	PI-PLCc_beta1	Catalytic domain of metazoan phosphoinositide-specific phospholipase C-beta1. This subfamily corresponds to the catalytic domain present in metazoan phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11)-beta isozyme 1. PI-PLC is a signaling enzyme that hydrolyzes the membrane phospholipids phosphatidylinositol-4,5-bisphosphate (PIP2)  to generate two important second messengers in eukaryotic signal transduction cascades,  Inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which goes on to phosphorylate other molecules, leading to altered cellular activity. Calcium is required for the catalysis. PLC-beta represents a class of mammalian PI-PLC that has an N-terminal pleckstrin homology (PH) domain, an array of EF hands, a PLC catalytic core domain, a C2 domain, and a unique C-terminal coiled-coil (CT) domain necessary for homodimerization. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. PI-PLC-beta1 is expressed at highest levels in specific regions of the brain. It is activated by the heterotrimeric G protein alpha q subunits through their C2 domain and long C-terminal extension.	258
176561	cd08624	PI-PLCc_beta2	Catalytic domain of metazoan phosphoinositide-specific phospholipase C-beta2. This subfamily corresponds to the catalytic domain present in metazoan phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11)-beta isozyme 2. PI-PLC is a signaling enzyme that hydrolyzes the membrane phospholipids phosphatidylinositol-4,5-bisphosphate (PIP2)  to generate two important second messengers in eukaryotic signal transduction cascades,  Inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which goes on to phosphorylate other molecules, leading to altered cellular activity. Calcium is required for the catalysis. PLC-beta represents a class of mammalian PI-PLC that has an N-terminal pleckstrin homology (PH) domain, an array of EF hands, a PLC catalytic core domain, a C2 domain, and a unique C-terminal coiled-coil (CT) domain necessary for homodimerization. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. PI-PLC-beta2 is expressed at highest levels in cells of hematopoietic origin. It is activated by the heterotrimeric G protein alpha q subunits through their C2 domain and long C-terminal extension.  It is also activated by the beta-gamma subunits of heterotrimeric G proteins.	261
176562	cd08625	PI-PLCc_beta3	Catalytic domain of metazoan phosphoinositide-specific phospholipase C-beta3. This subfamily corresponds to the catalytic domain present in metazoan phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11)-beta isozyme 3. PI-PLC is a signaling enzyme that hydrolyzes the membrane phospholipids phosphatidylinositol-4,5-bisphosphate (PIP2)  to generate two important second messengers in eukaryotic signal transduction cascades,  Inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which goes on to phosphorylate other molecules, leading to altered cellular activity. Calcium is required for the catalysis. PLC-beta represents a class of mammalian PI-PLC that has an N-terminal pleckstrin homology (PH) domain, an array of EF hands, a PLC catalytic core domain, a C2 domain, and a unique C-terminal coiled-coil (CT) domain necessary for homodimerization. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. PI-PLC-beta3 is widely expressed at highest levels in brain, liver, and parotid gland. It is activated by the heterotrimeric G protein alpha q subunits through their C2 domain and long C-terminal extension.  It is also activated by the beta-gamma subunits of heterotrimeric G proteins.	258
176563	cd08626	PI-PLCc_beta4	Catalytic domain of metazoan phosphoinositide-specific phospholipase C-beta4. This subfamily corresponds to the catalytic domain present in metazoan phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11)-beta isozyme 4. PI-PLC is a signaling enzyme that hydrolyzes the membrane phospholipids phosphatidylinositol-4,5-bisphosphate (PIP2)  to generate two important second messengers in eukaryotic signal transduction cascades,  Inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which goes on to phosphorylate other molecules, leading to altered cellular activity. Calcium is required for the catalysis. PLC-beta represents a class of mammalian PI-PLC that has an N-terminal pleckstrin homology (PH) domain, an array of EF hands, a PLC catalytic core domain, a C2 domain, and a unique C-terminal coiled-coil (CT) domain necessary for homodimerization. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. PI-PLC-beta4 is expressed in high concentrations in cerebellar Purkinje and granule cells, the median geniculate body, and the lateral geniculate nucleus. It is activated by the heterotrimeric G protein alpha q subunits through their C2 domain and long C-terminal extension.	257
176564	cd08627	PI-PLCc_gamma1	Catalytic domain of metazoan phosphoinositide-specific phospholipase C-gamma1. This subfamily corresponds to the catalytic domain present in metazoan phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11)-gamma isozyme 1. PI-PLC is a signaling enzyme that hydrolyzes the membrane phospholipids phosphatidylinositol-4,5-bisphosphate (PIP2)  to generate two important second messengers in eukaryotic signal transduction cascades,  Inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which goes on to phosphorylate other molecules, leading to altered cellular activity. Calcium is required for the catalysis. PI-PLC-gamma represents a class of mammalian PI-PLC that has an N-terminal pleckstrin homology (PH) domain, an array of EF hands, a PLC catalytic core domain, and a C2 domain. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. Unique to PI-PLC-gamma1, a second PH domain, two SH2 (Src homology 2) regions, and one SH3 (Src homology 3) region is present within this linker region. PI-PLC-gamma1 is ubiquitously expressed. It is activated by receptor and non-receptor tyrosine kinases due to the presence of two SH2 and a single SH3 domain within the linker region.	229
176565	cd08628	PI-PLCc_gamma2	Catalytic domain of metazoan phosphoinositide-specific phospholipase C-gamma2. This subfamily corresponds to the catalytic domain present in metazoan phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11)-gamma isozyme 2. PI-PLC is a signaling enzyme that hydrolyzes the membrane phospholipids phosphatidylinositol-4,5-bisphosphate (PIP2)  to generate two important second messengers in eukaryotic signal transduction cascades,  Inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which goes on to phosphorylate other molecules, leading to altered cellular activity. Calcium is required for the catalysis. PI-PLC-gamma represents a class of mammalian PI-PLC that has an N-terminal pleckstrin homology (PH) domain, an array of EF hands, a PLC catalytic core domain, and a C2 domain. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. Unique to PI-PLC-gamma2, a second PH domain, two SH2 (Src homology 2) regions, and one SH3 (Src homology 3) region are present within this linker region. PI-PLC-gamma2 is highly expressed in cells of hematopoietic origin. It is activated by receptor and non-receptor tyrosine kinases due to the presence of two SH2 and a single SH3 domain within the linker region. Unlike PI-PLC-gamma1, the activation of PI-PLC-gamma2 may require concurrent stimulation of PI 3-kinase.	254
176566	cd08629	PI-PLCc_delta1	Catalytic domain of metazoan phosphoinositide-specific phospholipase C-delta1. This subfamily corresponds to the catalytic domain present in metazoan phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11)-delta1 isozymes. PI-PLC is a signaling enzyme that hydrolyzes the membrane phospholipids phosphatidylinositol-4,5-bisphosphate (PIP2)  to generate two important second messengers in eukaryotic signal transduction cascades,  Inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which then phosphorylates other molecules, leading to altered cellular activity. Calcium is required for the catalysis. PLC-delta represents a class of mammalian PI-PLC that has an N-terminal pleckstrin homology (PH) domain, an array of EF hands, a PLC catalytic core domain, and a C-terminal C2 domain. This subfamily corresponds to the catalytic domain which is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. There are three PI-PLC-delta isozymes (1, 3 and 4). PI-PLC-delta1 is relatively well characterized. It is activated by high calcium levels generated by other PI-PLC family members, and therefore functions as a calcium amplifier within the cell. Unlike PI-PLC-delta4, PI-PLC-delta1 and 3 possess a putative nuclear export sequence (NES) located in the EF-hand domain, which may be responsible for transporting PI-PLC-delta1 and 3 from the cell nucleus. Experiments show PI-PLC-delta1 is essential for normal hair formation.	258
176567	cd08630	PI-PLCc_delta3	Catalytic domain of metazoan phosphoinositide-specific phospholipase C-delta3. This subfamily corresponds to the catalytic domain present in metazoan phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11)-delta3 isozymes. PI-PLC is a signaling enzyme that hydrolyzes the membrane phospholipids phosphatidylinositol-4,5-bisphosphate (PIP2)  to generate two important second messengers in eukaryotic signal transduction cascades,  Inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which then phosphorylates other molecules, leading to altered cellular activity. Calcium is required for the catalysis. PLC-delta represents a class of mammalian PI-PLC that has an N-terminal pleckstrin homology (PH) domain, an array of EF hands, a PLC catalytic core domain, and a C-terminal C2 domain. This family corresponds to the catalytic domain which is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. There are three PI-PLC-delta isozymes (1, 3 and 4). Unlike PI-PLC-delta4, PI-PLC-delta1 and 3 possess a putative nuclear export sequence (NES) located in the EF-hand domain, which may be responsible for transporting PI-PLC-delta1 and 3 from the cell nucleus.	258
176568	cd08631	PI-PLCc_delta4	Catalytic domain of metazoan phosphoinositide-specific phospholipase C-delta4. This subfamily corresponds to the catalytic domain present in metazoan phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11)-delta4 isozymes. PI-PLC is a signaling enzyme that hydrolyzes the membrane phospholipids phosphatidylinositol-4,5-bisphosphate (PIP2)  to generate two important second messengers in eukaryotic signal transduction cascades,  Inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which then phosphorylates other molecules, leading to altered cellular activity. Calcium is required for the catalysis. PLC-delta represents a class of mammalian PI-PLC that has an N-terminal pleckstrin homology (PH) domain, an array of EF hands, a PLC catalytic core domain, and a C-terminal C2 domain. This CD corresponds to the catalytic domain which is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. There are three PI-PLC-delta isozymes (1, 3 and 4). Unlike PI-PLC-delta1 and 3, PI-PLC-delta4 lacks the putative nuclear export sequence (NES) located in the EF-hand domain, which may be responsible for transporting PI-PLC-delta1 and 3 from the cell nucleus. Experiments show PI-PLC-delta4 is required for the acrosome reaction in fertilization.	258
176569	cd08632	PI-PLCc_eta1	Catalytic domain of metazoan phosphoinositide-specific phospholipase C-eta1. This subfamily corresponds to the catalytic domain present in metazoan phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11)-eta isozyme 1. PI-PLC is a signaling enzyme that hydrolyzes the membrane phospholipids phosphatidylinositol-4,5-bisphosphate (PIP2)  to generate two important second messengers in eukaryotic signal transduction cascades,  Inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which then phosphorylates other molecules, leading to altered cellular activity. Calcium is required for the catalysis. PI-PLC-eta represents a class of neuron-specific PI-PLC that has an N-terminal pleckstrin homology (PH) domain, an array of EF hands, a PLC catalytic core domain, a C2 domain, and a unique C-terminal tail that terminates with a PDZ-binding motif, a potential interaction site for other signaling proteins. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. PI-PLC-eta1 is a neuron-specific enzyme and is expressed only in nerve tissues such as the brain and spinal cord. It may perform a fundamental role in the brain.	253
176570	cd08633	PI-PLCc_eta2	Catalytic domain of metazoan phosphoinositide-specific phospholipase C-eta2. This subfamily corresponds to the catalytic domain present in metazoan phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11)-eta isozyme 2. PI-PLC is a signaling enzyme that hydrolyzes the membrane phospholipids phosphatidylinositol-4,5-bisphosphate (PIP2)  to generate two important second messengers in eukaryotic signal transduction cascades,  Inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which then phosphorylates other molecules, leading to altered cellular activity. Calcium is required for the catalysis. PI-PLC-eta represents a class of neuron-specific PI-PLC that has an N-terminal pleckstrin homology (PH) domain, an array of EF hands, a PLC catalytic core domain, a C2 domain, and a unique C-terminal tail that terminates with a PDZ-binding motif, a potential interaction site for other signaling proteins. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. PI-PLC-eta2 is a neuron-specific enzyme and is expressed in the brain. It may in part function downstream of G-protein-coupled receptors and play an important role in the formation and maintenance of the neuronal network in the postnatal brain.	254
176474	cd08637	DNA_pol_A_pol_I_C	Polymerase I functions primarily to fill DNA gaps that arise during DNA repair, recombination and replication. Family A polymerase (polymerase I) functions primarily to fill DNA gaps that arise during DNA repair, recombination and replication. DNA-dependent DNA polymerases can be classified in six main groups based upon phylogenetic relationships with E. coli polymerase I (class A), E. coli polymerase II (class B), E. coli polymerase III (class C), Euryarchaeota polymerase II (class D), human polymerase beta (class X), E. coli UmuC/DinB and eukaryotic RAP 30/Xeroderma pigmentosum variant (class Y). Family A polymerases are found primarily in organisms related to prokaryotes and include prokaryotic DNA polymerase I (pol I), mitochondrial polymerase delta, and several bacteriophage polymerases including those from odd-numbered phage (T3, T5, and T7). Prokaryotic Pol Is have two functional domains located on the same polypeptide: a 5'-3' polymerase and a 5'-3' exonuclease. Pol I uses its 5' nuclease activity to remove the ribonucleotide portion of newly synthesized Okazaki fragments and DNA polymerase activity to fill in the resulting gap. A combination of phylogenomic and signature sequence-based (or phonetic) approaches is used to understand the evolutionary relationships among bacteria. DNA polymerase I is one of the conserved proteins that is used to search for protein signatures. The structure of these polymerases resembles in overall morphology a cupped human right hand, with fingers (which bind an incoming nucleotide and interact with the single-stranded template), palm (which harbors the catalytic amino acid residues and also binds an incoming dNTP) and thumb (which binds double-stranded DNA) subdomains.	377
176475	cd08638	DNA_pol_A_theta	DNA polymerase theta is a low-fidelity family A enzyme implicated in translesion synthesis and in somatic hypermutation. DNA polymerase theta is a low-fidelity family A enzyme implicated in translesion synthesis (TLS) and in somatic hypermutation (SHM). DNA-dependent DNA polymerases can be classified in six main groups based upon phylogenetic relationships with E. coli polymerase I (class A), E. coli polymerase II (class B), E. coli polymerase III (class C), Euryarchaeota polymerase II (class D), human polymerase beta (class X), E. coli UmuC/DinB and eukaryotic RAP 30/Xeroderma pigmentosum variant (class Y). Family A polymerase functions primarily to fill DNA gaps that arise during DNA repair, recombination and replication. Pol theta is an exception among family A polymerases and generates processive single base substitutions. Family A polymerases are found primarily in organisms related to prokaryotes and include prokaryotic DNA polymerase I (pol I), mitochondrial polymerase delta, and several bacteriophage polymerases including those from odd-numbered phage (T3, T5, and T7). Prokaryotic Pol Is have two functional domains located on the same polypeptide: a 5'-3' polymerase and a 5'-3' exonuclease. Pol I uses its 5' nuclease activity to remove the ribonucleotide portion of newly synthesized Okazaki fragments and DNA polymerase activity to fill in the resulting gap. Polymerase theta mostly has an amino-terminal helicase domain, a carboxy-terminal polymerase domain and an intervening space region.	373
176476	cd08639	DNA_pol_A_Aquificae_like	Phylum Aquificae Pol A is different from Escherichia coli Pol A by three signature sequences. Family A polymerase functions primarily to fill DNA gaps that arise during DNA repair, recombination and replication. DNA-dependent DNA polymerases can be classified in six main groups based upon phylogenetic relationships with E. coli polymerase I (class A), E. coli polymerase II (class B), E. coli polymerase III (class C), euryarchaeota polymerase II (class D), human polymerase beta (class X), E. coli UmuC/DinB and eukaryotic RAP 30/Xeroderma pigmentosum variant (class Y). Family A polymerases are found primarily in organisms related to prokaryotes and include prokaryotic DNA polymerase I, mitochondrial polymerase gamma, and several bacteriophage polymerases including those from odd-numbered phage (T3, T5, and T7). Prokaryotic Pol Is have two functional domains located on the same polypeptide: a 5'-3' polymerase and a 5'-3' exonuclease. Pol I uses its 5' nuclease activity to remove the ribonucleotide portion of newly synthesized Okazaki fragments and DNA polymerase activity to fill in the resulting gap. A combination of phylogenomic and signature sequence-based (or phenetic) approaches is used to understand the evolutionary relationships among bacteria. DNA polymerase I is one of the conserved proteins that is used for phylogenetic analysis of bacteria. Species of the phylum Aquificae grow in extreme thermophilic environments. The Aquificae are non-spore-forming, Gram-negative rods and strictly thermophilic. Phylum Aquificae Pol A is different from E. coli Pol I by three signature sequences consisting of a 2 amino acid (aa) insert, a 5-6 aa insert and a 6 aa deletion. These signature sequences may provide a molecular marker for the family Aquificaceae and related species.	324
176477	cd08640	DNA_pol_A_plastid_like	DNA polymerase A type from plastids of higher plants possibly involved in DNA replication or in the repair of errors occurring during replication. DNA polymerase A type from plastids of higher plants possibly involved in DNA replication or in the repair of errors occurring during replication. Family A polymerase functions primarily to fill DNA gaps that arise during DNA repair, recombination and replication. DNA-dependent DNA polymerases can be classified in six main groups based upon phylogenetic relationships with E. coli polymerase I (class A), E. coli polymerase II (class B), E. coli polymerase III (class C), euryarchaeota polymerase II (class D), human polymerase beta (class X), E. coli UmuC/DinB and eukaryotic RAP 30/Xeroderma pigmentosum variant (class Y). Family A polymerases are found primarily in organisms related to prokaryotes and include prokaryotic DNA polymerase I, mitochondrial polymerase gamma, and several bacteriophage polymerases including those from odd-numbered phage (T3, T5, and T7). The three-dimensional structure of plastid DNA polymerase has substantial similarity to Pol I. The structure of Pol I resembles in overall morphology a cupped human right hand, with fingers (which bind an incoming nucleotide and interact with the single-stranded template), palm (which harbors the catalytic amino acid residues and also binds an incoming dNTP) and thumb (which binds double-stranded DNA) subdomains.	371
176478	cd08641	DNA_pol_gammaA	Pol gammaA is a family A polymerase that is responsible for DNA replication and repair in mitochondria. DNA polymerase gamma (Pol gamma), 5'-3' polymerase domain (Pol gammaA). Pol gammaA is a family A polymerase that is responsible for DNA replication and repair in mitochondria. Family A polymerase functions primarily to fill DNA gaps that arise during DNA repair, recombination and replication. DNA-dependent DNA polymerases can be classified into six main groups based upon phylogenetic relationships with E. coli polymerase I (class A), E. coli polymerase II (class B), E. coli polymerase III (class C), euryarchaeota polymerase II (class D), human polymerase beta (class X), E. coli UmuC/DinB and eukaryotic RAP 30/Xeroderma pigmentosum variant (class Y). Family A polymerases are found primarily in organisms related to prokaryotes and include prokaryotic DNA polymerase I, mitochondrial polymerase gammaA, and several bacteriophage polymerases including those from odd-numbered phage (T3, T5, and T7). The structure of these polymerases resembles in overall morphology a cupped human right hand, with fingers (which bind an incoming nucleotide and interact with the single-stranded template), palm (which harbors the catalytic amino acid residues and also binds an incoming dNTP) and thumb (which binds double-stranded DNA) subdomains. Pol gammaA also has the right-hand configuration. Pol gammaA has both polymerase and proofreading exonuclease activities separated by a spacer. Pol gamma holoenzyme is a heterotrimer containing one Pol gammaA subunit and a dimeric Pol gammaB subunit. Pol gamma is important for mitochondrial DNA maintenance, and mutation of the catalytic subunit of Pol gamma is implicated in more than 30 human diseases.	425
176479	cd08642	DNA_pol_A_pol_I_A	Polymerase I functions primarily to fill DNA gaps that arise during DNA repair, recombination and replication. Family A polymerase (polymerase I) functions primarily to fill DNA gaps that arise during DNA repair, recombination and replication. DNA-dependent DNA polymerases can be classified in six main groups based upon phylogenetic relationships with E. coli polymerase I (class A), E. coli polymerase II (class B), E. coli polymerase III (class C), euryarchaeota polymerase II (class D), human polymerase beta (class X), E. coli UmuC/DinB and eukaryotic RAP 30/Xeroderma pigmentosum variant (class Y). Family A polymerases are found primarily in organisms related to prokaryotes and include prokaryotic DNA polymerase I, mitochondrial polymerase gamma, and several bacteriophage polymerases including those from odd-numbered phage (T3, T5, and T7). Prokaryotic Pol Is have two functional domains located on the same polypeptide: a 5'-3' polymerase and a 5'-3' exonuclease. Pol I uses its 5' nuclease activity to remove the ribonucleotide portion of newly synthesized Okazaki fragments and DNA polymerase activity to fill in the resulting gap. A combination of phylogenomic and signature sequence-based (or phenetic) approaches is used to understand the evolutionary relationships among bacteria. DNA polymerase I is one of the conserved proteins that is used to search for protein signatures. The structure of these polymerases resembles in overall morphology a cupped human right hand, with fingers (which bind an incoming nucleotide and interact with the single-stranded template), palm (which harbors the catalytic amino acid residues and also binds an incoming dNTP) and thumb (which binds double-stranded DNA) subdomains.	378
176480	cd08643	DNA_pol_A_pol_I_B	Polymerase I functions primarily to fill DNA gaps that arise during DNA repair, recombination and replication. Family A polymerase functions primarily to fill DNA gaps that arise during DNA repair, recombination and replication. DNA-dependent DNA polymerases can be classified in six main groups based upon phylogenetic relationships with E. coli polymerase I (class A), E. coli polymerase II (class B), E. coli polymerase III (class C), euryarchaeota polymerase II (class D), human polymerase beta (class X), E. coli UmuC/DinB and eukaryotic RAP 30/Xeroderma pigmentosum variant (class Y). Family A polymerases are found primarily in organisms related to prokaryotes and include prokaryotic DNA polymerase I, mitochondrial polymerase gamma, and several bacteriophage polymerases including those from odd-numbered phage (T3, T5, and T7). Prokaryotic Pol Is have two functional domains located on the same polypeptide: a 5'-3' polymerase and a 5'-3' exonuclease. Pol I uses its 5' nuclease activity to remove the ribonucleotide portion of newly synthesized Okazaki fragments and DNA polymerase activity to fill in the resulting gap. A combination of phylogenomic and signature sequence-based (or phenetic) approaches is used to understand the evolutionary relationships among bacteria. DNA polymerase I is one of the conserved proteins that is used to search for protein signatures. The structure of these polymerases resembles in overall morphology a cupped human right hand, with fingers (which bind an incoming nucleotide and interact with the single-stranded template), palm (which harbors the catalytic amino acid residues and also binds an incoming dNTP) and thumb (which binds double-stranded DNA) subdomains.	429
187713	cd08644	FMT_core_ArnA_N	ArnA, N-terminal formyltransferase domain. ArnA_N:  ArnA is a bifunctional enzyme required for the modification of lipid A with 4-amino-4-deoxy-L-arabinose (Ara4N) that leads to resistance to cationic antimicrobial peptides (CAMPs) and clinical antimicrobials such as polymyxin.  The C-terminal dehydrogenase domain of ArnA catalyzes the oxidative decarboxylation of UDP-glucuronic acid (UDP-GlcUA) to UDP-4-keto-arabinose (UDP-Ara4O), while the N-terminal formyltransferase domain of ArnA catalyzes the addition of a formyl group to UDP-4-amino-4-deoxy-L-arabinose (UDP-L-Ara4N) to form UDP-L-4-formamido-arabinose (UDP-L-Ara4FN). This domain family represents the catalytic core of the N-terminal formyltransferase domain. The formyltransferase also contains a smaller C-terminal domain that may be involved in substrate binding. ArnA forms a hexameric structure, in which the dehydrogenase domains are arranged at the center of the particle with the transformylase domains on the outside of the particle.	203
187714	cd08645	FMT_core_GART	Phosphoribosylglycinamide formyltransferase (GAR transformylase, GART). Phosphoribosylglycinamide formyltransferase, also known as GAR transformylase or GART, is an essential enzyme that catalyzes the third step in de novo purine biosynthesis. This enzyme uses formyl tetrahydrofolate as a formyl group donor to produce 5'-phosphoribosyl-N-formylglycinamide. In prokaryotes, GART is a single domain protein but in most eukaryotes it is the C-terminal portion of a large multifunctional protein which also contains GAR synthetase and aminoimidazole ribonucleotide synthetase activities.	183
187715	cd08646	FMT_core_Met-tRNA-FMT_N	Methionyl-tRNA formyltransferase, N-terminal hydrolase domain. Methionyl-tRNA formyltransferase (Met-tRNA-FMT), N-terminal formyltransferase domain.  Met-tRNA-FMT transfers a formyl group from N-10 formyltetrahydrofolate to the amino terminal end of a methionyl-aminoacyl-tRNA acyl moiety, yielding formyl-Met-tRNA. Formyl-Met-tRNA plays an essential role in protein translation initiation by forming a complex with IF2. The formyl group plays a dual role in the initiator identity of N-formylmethionyl-tRNA by promoting its recognition by IF2 and by impairing its binding to EFTU-GTP.  The N-terminal domain contains a Rossmann fold and is the catalytic domain of the enzyme.	204
187716	cd08647	FMT_core_FDH_N	10-formyltetrahydrofolate dehydrogenase (FDH), N-terminal hydrolase domain. This family represents the N-terminal hydrolase domain of the bifunctional protein 10-formyltetrahydrofolate dehydrogenase (FDH). This domain contains a 10-formyl-tetrahydrofolate (10-formyl-THF) binding site and shares sequence homology and structural topology with other enzymes utilizing this substrate. This domain functions as a hydrolase, catalyzing the conversion of 10-formyl-THF, a precursor for nucleotide biosynthesis, to tetrahydrofolate (THF). The overall FDH reaction mechanism is a coupling of two sequential reactions, a hydrolase and a formyl dehydrogenase, bridged by a substrate transfer step.  The N-terminal hydrolase domain removes the formyl group from 10-formyl-THF and the C-terminal NADP-dependent dehydrogenase domain then oxidizes the formyl group to carbon dioxide.  The two catalytic domains are connected by a third intermediate linker domain that transfers the formyl group, covalently attached to the sulfhydryl group of the phosphopantetheine arm, from the N-terminal domain to the C-terminal domain.	203
187717	cd08648	FMT_core_Formyl-FH4-Hydrolase_C	Formyltetrahydrofolate deformylase (Formyl-FH4 hydrolase), C-terminal hydrolase domain. Formyl-FH4 Hydrolase catalyzes the hydrolysis of 10-formyltetrahydrofolate (formyl-FH4) to FH4 and formate. Formate is the substrate of phosphoribosylglycinamide transformylase for step three of de novo purine nucleotide synthesis. Formyl-FH4 hydrolase has been proposed to regulate the balance of FH4 and C1-FH4 in the cell.  The enzyme uses methionine and glycine to sense the pools of C1-FH4 and FH4, respectively. This domain belongs to the formyltransferase (FMT) domain superfamily. Members of this family have an N-terminal ACT domain, which is commonly involved in specifically binding an amino acid or other small ligand, leading to regulation of the enzyme. The N-terminal region of this protein family may be responsible for the binding of the regulators methionine and glycine.	196
187718	cd08649	FMT_core_NRPS_like	N-terminal formyl transferase catalytic core domain of NRPS_like proteins, one of the proteins involved in the synthesis of Oxazolomycin. This family represents the N-terminal formyl transferase catalytic core domain present in a subgroup of non-ribosomal peptide synthetases. In Streptomyces albus a member of this family has been shown to be involved in the synthesis of oxazolomycin (OZM). OZM is a hybrid peptide-polyketide antibiotic and exhibits potent antitumor and antiviral activities. It is a multi-domain protein consisting of a formyl transferase domain, a flavin-utilizing monooxygenase domain, a LuxE domain functioning as an acyl protein synthetase and a pp-binding domain, which may function as an acyl carrier. It shows sequence similarity with other peptide-polyketide biosynthesis proteins.	166
187719	cd08650	FMT_core_HypX_N	HypX protein, N-terminal hydrolase domain. The family represents the N-terminal hydrolase domain of HypX protein.  HypX is involved in the maturation process of active [NiFe] hydrogenase. [NiFe] hydrogenases function in H2 metabolism in a variety of microorganisms, enabling them to use H2 as a source of reducing equivalents under aerobic and anaerobic conditions. [NiFe] hydrogenases consist of a large and a small subunit. The large subunit contains the [NiFe] active site and is synthesized as a precursor without the [NiFe] active site. This precursor then undergoes a complex post-translational maturation process that requires the presence of a number of accessory proteins. HypX has been shown to be involved in this maturation process and has been proposed to participate in the generation and transport of the CO and CN ligands. However, HypX is not present in all hydrogen-metabolizing bacteria. Furthermore, hypX deletion mutants have a reduced but detectable level of hydrogenase activity. Thus, HypX might not be a determining factor in the maturation process. Members of this group have an N-terminal formyl transferase domain and a C-terminal enoyl-CoA hydratase/isomerase domain.	151
187720	cd08651	FMT_core_like_4	Formyl transferase catalytic core domain found in a group of proteins with unknown functions. Formyl transferase catalytic core domain found in a group of proteins with unknown functions.  Formyl transferase catalyzes the transfer of one-carbon groups, specifically the formyl- or hydroxymethyl- group.  This domain contains a Rossmann fold and it is the catalytic domain of the enzyme.	180
187721	cd08653	FMT_core_like_3	Formyl transferase catalytic core domain found in a group of proteins with unknown functions. Formyl transferase catalytic core domain found in a group of proteins with unknown functions.  Formyl transferase catalyzes the transfer of one-carbon groups, specifically the formyl- or hydroxymethyl- group.  This domain contains a Rossmann fold and it is the catalytic domain of the enzyme.	152
349943	cd08656	M28_like	M28 Zn-peptidase; uncharacterized subfamily. Peptidase family M28 (also called aminopeptidase Y family), uncharacterized subfamily. The M28 family contains aminopeptidases as well as carboxypeptidases. They have co-catalytic zinc ions; each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions.	287
349944	cd08659	M20_ArgE_DapE-like	Peptidase M20 acetylornithine deacetylase/succinyl-diaminopimelate desuccinylase (ArgE/DapE)-like. Peptidase M20 acetylornithine deacetylase/succinyl-diaminopimelate desuccinylase (ArgE/DapE) like family of enzymes catalyze analogous reactions and share a common activator, the metal ion (usually Co2+ or Zn2+). ArgE catalyzes a broad range of substrates, including N-acetylornithine, alpha-N-acetylmethionine and alpha-N-formylmethionine, while DapE catalyzes the hydrolysis of N-succinyl-L,L-diaminopimelate (L,L-SDAP) to L,L-diaminopimelate and succinate. Proteins in this family are mostly bacterial and have been inferred by homology as being related to both ArgE and DapE. This family also includes N-acetyl-L-citrulline deacetylase (ACDase; acetylcitrulline deacetylase), a unique, novel enzyme found in Xanthomonas campestris, a plant pathogen, in which N-acetyl-L-ornithine is the substrate for transcarbamoylation reaction, and the product is N-acetyl-L-citrulline. Thus, in the arginine biosynthesis pathway, ACDase subsequently catalyzes the hydrolysis of N-acetyl-L-citrulline to acetate and L-citrulline.	361
349945	cd08660	M20_Acy1-like	M20 Peptidase Aminoacylase 1-like family. This family includes aminoacylase 1 (ACY1) and Aminoacylase 1-like protein 2 (ACY1L2). Aminoacylase 1 proteins are a class of zinc binding homodimeric enzymes involved in hydrolysis of N-acetylated proteins. ACY1 (acyl-L-amino-acid amidohydrolase; EC 3.5.1.14) is the most abundant of the aminoacylases, a class of zinc binding homodimeric enzymes involved in hydrolysis of N-acetylated proteins. It is encoded by the aminoacylase 1 gene (Acy1) on chromosome 3p21 that comprises 15 exons. N-terminal acetylation of proteins is a widespread and highly conserved process that is involved in protection and stability of proteins. Several types of aminoacylases can be distinguished on the basis of substrate specificity; substrates include indoleacetic acid (IAA) N-conjugates of amino acids, N-acetyl-L-amino acids and aminobenzoylglutamate. ACY1 breaks down cytosolic aliphatic N-acyl-alpha-amino acids (except L-aspartate), especially N-acetyl-methionine and acetyl-glutamate into L-amino acids and an acyl group. However, ACY1 can also catalyze the reverse reaction, the synthesis of acetylated amino acids. ACY1L2 family contains many uncharacterized proteins predicted as amidohydrolases, including gene products of abgA and abgB that catalyze the cleavage of p-aminobenzoyl-glutamate, a folate catabolite in E. coli, to p-aminobenzoate and glutamate. p-Aminobenzoyl-glutamate utilization is catalyzed by the abg region gene product, AbgT. Defects in ACY1 are the cause of aminoacylase-1 deficiency (ACY1D) resulting in a metabolic disorder manifesting with encephalopathy and psychomotor delay.	366
341056	cd08662	M13	Peptidase family M13 includes neprilysin and endothelin-converting enzyme I. The M13 family of metallopeptidases includes neprilysin (neutral endopeptidase, NEP, enkephalinase, CD10, CALLA, EC 3.4.24.11), endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), erythrocyte surface antigen KELL (ECE-3), phosphate-regulating gene on the X chromosome (PHEX), soluble secreted endopeptidase (SEP), and damage-induced neuronal endopeptidase (DINE)/X-converting enzyme (XCE). Proteins in this family fulfill a broad range of physiological roles due to the greater variation in the active site's S2' subsite allowing substrate specificity. NEP is expressed in a variety of tissues including kidney and brain, and is involved in many physiological and pathological processes, including blood pressure and inflammatory response. It degrades a wide array of substrates such as substance P, enkephalins, cholecystokinin, neurotensin and somatostatin.  It is an important enzyme in the regulation of amyloid-beta (Abeta) protein that forms amyloid plaques that are associated with Alzheimer's disease (AD). ECE-1 catalyzes the final rate-limiting step in the biosynthesis of endothelins via post-translational conversion of the biologically inactive big endothelins. Like NEP, it also hydrolyzes bradykinin, substance P, neurotensin, and Abeta. Endothelin-1 overproduction has been implicated in various diseases including stroke, asthma, hypertension, and cardiac and renal failure. Kell is a homolog of NEP and constitutes a major antigen on human erythrocytes; it preferentially cleaves big endothelin-3 to produce bioactive endothelin-3, but is also known to cleave substance P and neurokinin A. PHEX forms a complex interaction with fibroblast growth factor 23 (FGF23) and matrix extracellular phosphoglycoprotein, causing bone mineralization. A loss-of-function mutation in PHEX disrupts this interaction leading to hypophosphatemic rickets; X-linked hypophosphatemic (XLH) rickets is the most common form of metabolic rickets. ECEL1 is a brain metalloprotease which plays a critical role in the nervous regulation of the respiratory system, while DINE is abundantly expressed in the hypothalamus and its expression responds to nerve injury. A majority of these M13 proteases are prime therapeutic targets for selective inhibition.	642
176450	cd08663	DAP_dppA_1	Peptidase M55, D-aminopeptidase dipeptide-binding protein family. M55 Peptidase, D-Aminopeptidase dipeptide-binding protein (dppA; DAP dppA; EC 3.4.11.-) domain: Peptide transport systems are found in many bacterial species and generally function to accumulate intact peptides in the cell, where they are hydrolyzed. The dipeptide-binding protein (dppA) of Bacillus subtilis belongs to the dipeptide ABC transport (dpp) operon expressed early during sporulation. It is a binuclear zinc-dependent, D-specific aminopeptidase. The biologically active enzyme is a homodecamer with active sites buried in its channel. These self-compartmentalizing proteases are characterized by a SXDXEG motif. D-Ala-D-Ala and D-Ala-Gly-Gly are the preferred substrates. Bacillus subtilis dppA is thought to function as an adaptation to nutrient deficiency; hydrolysis of its substrate releases D-Ala which can be used subsequently as metabolic fuel. This family also contains a number of uncharacterized putative peptidases.	266
176485	cd08664	APC10-HERC2	APC10-like DOC1 domain present in HERC2 (HECT domain and RLD2). This model represents the APC10/DOC1 domain present in HERC2 (HECT domain and RLD2), a large multi-domain protein with three RCC1-like domains (RLDs), additional internal domains including a zinc finger ZZ-type and Cyt-b5 (Cytochrome b5-like Heme/Steroid binding) domains, and a C-terminal HECT (Homologous to the E6-AP Carboxyl Terminus) domain. The APC10/DOC1 domain of HERC2 is a homolog of the APC10 subunit and the DOC1 domain present in E3 ubiquitin ligases which mediate substrate ubiquitination (or ubiquitylation), a component of the ubiquitin-26S proteasome pathway for selective proteolytic degradation. As suggested by structural relationships between HERC2 and other proteins such as HERC1, the proposed role for HERC2 in protein trafficking and degradation pathways is consistent with observations that mutations in HERC2 lead to neuromuscular secretory vesicle and sperm acrosome defects, other developmental abnormalities, and juvenile lethality of jdf2 mice. Recent studies have shown that the protein complex, HERC2-RNF8, coordinates ubiquitin-dependent assembly of DNA repair factors on damaged chromosomes.	152
176486	cd08665	APC10-CUL7	APC10-like DOC1 domain of CUL7, subunit of the SCF-ROC1-like E3 ubiquitin ligase complex that mediates substrate ubiquitination. This model represents the APC10/DOC1 domain present in CUL7, a subunit of the SCF-ROC1-like E3 Ubiquitin (Ub) ligase complex, which mediates substrate ubiquitination (or ubiquitylation), and is a component of the ubiquitin-26S proteasome pathway for selective proteolytic degradation.  CUL7 is a member of the Cullin-RING ligase family and functions as a molecular scaffold assembling the SCF-ROC1-like E3 Ub ligase complex consisting of the adapter protein Skp1, CUL7, the WD40 repeat-containing F-box Fbw8 (also known as Fbx29), and ROC1 (RING-box protein 1). CUL7 is a large protein with a C-terminal cullin domain that binds ROC1 and additional domains, including an APC10/DOC1 domain. While the Fbw8 protein is responsible for substrate protein recognition, the ROC1 RING domain recruits an Ub-charged E2 Ub-conjugating enzyme for substrate ubiquitination. It remains to be determined how CUL7 binds to the Skp1-Fbw8 heterodimer. The CUL7 E3 Ub ligase has been implicated in the proteasomal degradation of the cellular proteins, cyclin D1, an important regulator of the G1 to S-phase cell cycle progression, and insulin receptor substrate 1, a critical component of the signaling pathways downstream of the insulin and insulin-like growth factor 1 receptor. CUL7 appears to be an important regulator of placental development. Germ line mutations of CUL7 are linked to 3-M syndrome and Yakuts short stature syndrome.	131
176487	cd08666	APC10-HECTD3	APC10-like DOC1 domain of HECTD3, a HECT E3 ubiquitin ligase protein that mediates substrate ubiquitination. This model represents the APC10/DOC1 domain present in HECTD3, a HECT (Homologous to the E6-AP Carboxyl Terminus) E3 ubiquitin ligase protein. HECT E3 ubiquitin ligases mediate substrate ubiquitination (or ubiquitylation), and are a component of the ubiquitin-26S proteasome pathway for selective proteolytic degradation. They also regulate the trafficking of many receptors, channels, transporters and viral proteins. HECTD3 (HECT domain-containing protein3) contains a C-terminal HECT domain with the active site for ubiquitin transfer onto substrates, and an N-terminal APC10/DOC1 domain, which is responsible for substrate recognition and binding. HECTD3 specifically recognizes the Trio-binding protein, Tara (Trio-associated repeat on actin), implicated in regulating actin cytoskeletal, cell motility and cell growth. Tara also binds to TRF1 and may participate in telomere maintenance and/or mitotic regulation through interacting with TRF1. HECTD3 interacts with and promotes the ubiquitination of Syntaxin 8, an endosomal syntaxin proposed to mediate distinct steps of endosomal protein trafficking. HECTD3-mediated Syntaxin 8 degradation has been suggested to contribute to the pathophysiology of neurodegenerative diseases.	134
176488	cd08667	APC10-ZZEF1	APC10/DOC1-like domain of uncharacterized Zinc finger ZZ-type and EF-hand domain-containing protein 1 (ZZEF1) and homologs. This model represents the APC10/DOC1-like domain present in the uncharacterized Zinc finger ZZ-type and EF-hand domain-containing protein 1 (ZZEF1) of Mus musculus. Members of this family contain EF-hand, APC10, CUB, and zinc finger ZZ-type domains. ZZEF1-like APC10 domains are homologous to the APC10 subunit/DOC1 domains present in E3 ubiquitin ligases, which mediate substrate ubiquitination (or ubiquitylation), and are components of the ubiquitin-26S proteasome pathway for selective proteolytic degradation.	131
176571	cd08674	Cdt1_m	The middle winged helix fold of replication licensing factor Cdt1 binds geminin to inhibit binding of the MCM complex to origins of replication and DNA. Cdt1 is a replication licensing factor in eukaryotes that recruits the Minichromosome Maintenance Complex (MCM2-7) to the origin recognition complex (ORC). The Cdt1 protein is divided into three regions based on sequence comparison and biochemical analyses: the N-terminal region (Cdt1_n) binds DNA in a sequence-, strand-, and conformation-independent manner; the middle winged helix fold (Cdt1_m) binds geminin to inhibit both binding of the MCM complex to origins of replication and DNA; and the C-terminal region (Cdt1_c) is essential for Cdt1 activity and directly interacts with the MCM2-7 helicase. Precise duplication of chromosomal DNA is required for genomic stability during replication. Assembly of replication factors to start DNA replication in eukaryotes must occur only once per cell cycle. To form a pre-replicative complex on replication origins in the G phase, ORC first binds origin DNA and triggers the binding of Cdc6 and Cdt1. These two factors recruit a putative replicative helicase and the MCM2-7. The MCM2-7 complex promotes the unwinding of DNA origins, and the binding of additional factors to initiate the DNA replication in S-phase. Cdt1 is present during G1 and early S phase of the cell cycle and degraded during the late S, G2, and M phases. The winged helix fold structure of Cdt1_m is similar to the structures of Cdt1_c and other archaeal homologues of the eukaryotic replication initiator, without apparent sequence similarity.	185
176057	cd08675	C2B_RasGAP	C2 domain second repeat of Ras GTPase activating proteins (GAPs). RasGAPs suppress Ras function by enhancing the GTPase activity of Ras proteins resulting in the inactive GDP-bound form of Ras.  In this way they can control cellular proliferation and differentiation.  The proteins here all contain two tandem C2 domains, a Ras-GAP domain, and a pleckstrin homology (PH)-like domain. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. Members here have a type-I topology.	137
176058	cd08676	C2A_Munc13-like	C2 domain first repeat in Munc13 (mammalian uncoordinated)-like proteins. C2-like domains are thought to be involved in phospholipid binding in a Ca2+ independent manner in both Unc13 and Munc13. Caenorhabditis elegans Unc13 has a central domain with sequence similarity to PKC, which includes C1 and C2-related domains. Unc13 binds phorbol esters and DAG with high affinity in a phospholipid manner.  Mutations in Unc13 result in abnormal neuronal connections and impairment in cholinergic neurotransmission in the nematode.  Munc13 is the mammalian homolog, which is expressed in the brain.  There are 3 isoforms (Munc13-1, -2, -3), which are thought to play a role in neurotransmitter release and are hypothesized to be high-affinity receptors for phorbol esters.  Unc13 and Munc13 contain both C1 and C2 domains.  There are two C2 related domains present, one central and one at the carboxyl end.  Munc13-1 contains a third C2-like domain.  Munc13 interacts with syntaxin, synaptobrevin, and synaptotagmin suggesting a role for these as scaffolding proteins. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This cd contains the second C2 repeat, C2B, and has a type-II topology.	153
176059	cd08677	C2A_Synaptotagmin-13	C2 domain. Synaptotagmin is a membrane-trafficking protein characterized by an N-terminal transmembrane region, a linker, and 2 C-terminal C2 domains. Synaptotagmin 13, a member of class 6 synaptotagmins, is located in the brain.  Its functions are unknown. It, like synaptotagmins 8 and 12, does not have any consensus Ca2+ binding sites. Previously all synaptotagmins were thought to be calcium sensors in the regulation of neurotransmitter release and hormone secretion, but it has been shown that not all of them bind calcium.  Of the 17 identified synaptotagmins only 8 bind calcium (1-3, 5-7, 9, 10).  The functions of the two C2 domains that bind calcium are: regulating the fusion step of synaptic vesicle exocytosis (C2A) and  binding to phosphatidyl-inositol-3,4,5-triphosphate (PIP3) in the absence of calcium ions and to phosphatidylinositol bisphosphate (PIP2) in their presence (C2B).  C2B also regulates the recycling step of synaptic vesicles. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This CD contains the first C2 repeat, C2A, and has a type-I topology.	118
176060	cd08678	C2_C21orf25-like	C2 domain found in the Human chromosome 21 open reading frame 25 (C21orf25) protein. The members in this cd are named after the Human C21orf25 which contains a single C2 domain.  Several other members contain a C1 domain downstream of the C2 domain.  No other information on this protein is currently known. The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.	126
176061	cd08679	C2_DOCK180_related	C2 domains found in Dedicator Of CytoKinesis 1 (DOCK 180) and related proteins. Dock180 was first identified as a 180kd proto-oncogene product c-Crk-interacting protein involved in actin cytoskeletal changes.  It is now known that it has Rac-specific GEF activity, but lacks the conventional Dbl homology (DH) domain. There are 10 additional related proteins that can be divided into four classes based on sequence similarity and domain organization: Dock-A which includes Dock180/Dock1, Dock2, and Dock5; Dock-B which includes Dock3/MOCA (modifier of cell adhesion) and Dock4; Dock-C which includes Dock6/Zir1, Dock7/Zir2, and Dock8/Zir3; and Dock-D, which includes Dock9/Zizimin1, Dock10/Zizimin3, and Dock11/Zizimin2/ACG (activated Cdc42-associated GEF).  Most members of classes Dock-A and Dock-B are GEFs specific for Rac.  Those of Dock-D are Cdc42-specific GEFs while those of Dock-C are GEFs for both. All Dock180-related proteins have two common homology domains: the C2 domain (AKA Dock homology region (DHR)-1, CED-5, Dock180, MBC-zizimin homology (CZH) 1) and the DHR-2 (AKA CZH2, or Docker). DHR-2 has the catalytic activity for Rac and/or Cdc42, but is structurally unrelated to the DH domain. The C2/DHR-1 domains of Dock180 and Dock4 have been shown to bind phosphatidylinositol-3, 4, 5-triphosphate (PtdIns(3,4,5)P3). The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.	178
176062	cd08680	C2_Kibra	C2 domain found in Human protein Kibra. Kibra is thought to be a regulator of the Salvador (Sav)/Warts (Wts)/Hippo (Hpo) (SWH) signaling network, which limits tissue growth by inhibiting cell proliferation and promoting apoptosis. The core of the pathway consists of a MST and LATS family kinase cascade that ultimately phosphorylates and inactivates the YAP/Yorkie (Yki) transcription coactivator. The FERM domain proteins Merlin (Mer) and Expanded (Ex) are part of the upstream regulation controlling pathway mechanism.  Kibra colocalizes and associates with Mer and Ex and is thought to transduce an extracellular signal via the SWH network. The apical scaffold machinery that contains Hpo, Wts, and Ex recruits Yki to the apical membrane facilitating its inhibitory phosphorylation by Wts.  Since Kibra associates with Ex and is apically located it is hypothesized that Kibra is part of the scaffold, helps in the Hpo/Wts complex, and helps recruit Yki for inactivation that promotes SWH pathway activity.  Kibra contains two amino-terminal WW domains, an internal C2-like domain, and a carboxy-terminal glutamic acid-rich stretch.  The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.	124
176063	cd08681	C2_fungal_Inn1p-like	C2 domain found in fungal Ingression 1 (Inn1) proteins. Saccharomyces cerevisiae Inn1 associates with the contractile actomyosin ring at the end of mitosis and is needed for cytokinesis. The C2 domain of Inn1, located at the N-terminus, is required for ingression of the plasma membrane. The C-terminus is relatively unstructured and contains eight PXXP motifs that are thought to mediate interaction of Inn1 with other proteins with SH3 domains in the cytokinesis proteins Hof1 (an F-BAR protein) and Cyk3 (whose overexpression can restore primary septum formation in Inn1Delta cells) as well as recruiting Inn1 to the bud-neck by binding to Cyk3. Inn1 and Cyk3 appear to cooperate in activating chitin synthase Chs2 for primary septum formation, which allows coordination of actomyosin ring contraction with ingression of the cleavage furrow. It is thought that the C2 domain of Inn1 helps to preserve the link between the actomyosin ring and the plasma membrane, contributing both to membrane ingression, as well as to stability of the contracting ring. Additionally, Inn1 might induce curvature of the plasma membrane adjacent to the contracting ring, thereby promoting ingression of the membrane. It has been shown that the C2 domain of human synaptotagmin induces curvature in target membranes and thereby contributes to fusion of these membranes with synaptic vesicles. The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.	118
176064	cd08682	C2_Rab11-FIP_classI	C2 domain found in Rab11-family interacting proteins (FIP) class I. Rab GTPases recruit various effector proteins to organelles and vesicles.  Rab11-family interacting proteins (FIPs) are involved in mediating the role of Rab11. FIPs can be divided into three classes: class I FIPs (Rip11a, Rip11b, RCP, and FIP2) which contain a C2 domain after the N-terminus of the protein, class II FIPs (FIP3 and FIP4) which contain two EF-hands and a proline rich region, and class III FIPs (FIP1) which exhibit no homology to known protein domains. All FIP proteins contain a highly conserved, 20-amino acid motif at the C-terminus of the protein, known as Rab11/25 binding domain (RBD).  Class I FIPs are thought to bind to endocytic membranes via their C2 domain, which interacts directly with phospholipids. Class II FIPs do not have any membrane binding domains leaving much to speculate about the mechanism involving FIP3 and FIP4 interactions with endocytic membranes. The members in this CD are class I FIPs.  The exact function of the Rab11 and FIP interaction is unknown, but there is speculation that it involves the role of forming a targeting complex that recruits a group of proteins involved in membrane transport to organelles. The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.	126
176065	cd08683	C2_C2cd3	C2 domain found in C2 calcium-dependent domain containing 3 (C2cd3) proteins. C2cd3 is a novel C2 domain-containing protein specific to vertebrates.  C2cd3 functions as a regulator of cilia formation, Hedgehog signaling, and mouse embryonic development. Mutations in C2cd3 mice resulted in lethality in some cases and exencephaly, a twisted body axis, and pericardial edema in others. The presence of calcium-dependent lipid-binding domains in C2cd3 suggests a potential role in vesicular transport. C2cd3 is also an interesting candidate for ciliopathy because of its orthology to certain cilia-related genetic disease loci on chromosome. The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.	143
176066	cd08684	C2A_Tac2-N	C2 domain first repeat found in Tac2-N (Tandem C2 protein in Nucleus). Tac2-N contains two C2 domains and a short C-terminus including a WHXL motif, which are key in stabilizing transport vesicles to the plasma membrane by binding to a plasma membrane.  However, unlike the usual carboxyl-terminal-type (C-type) tandem C2 proteins, it lacks a transmembrane domain, a Slp-homology domain, and a Munc13-1-interacting domain. Homology search analysis indicates that no known protein motifs are located in its N-terminus, making Tac2-N a novel class of Ca2+-independent, C-type tandem C2 proteins. The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.	103
176067	cd08685	C2_RGS-like	C2 domain of the Regulator Of G-Protein Signaling (RGS) family. This CD contains members of the regulator of G-protein signaling (RGS) family. RGS is a GTPase activating protein which inhibits G-protein mediated signal transduction. The protein is largely cytosolic, but G-protein activation leads to translocation of this protein to the plasma membrane. A nuclear form of this protein has also been described, but its sequence has not been identified. There are multiple alternatively spliced transcript variants in this family with some members having additional domains (e.g., PDZ and RGS) downstream of the C2 domain. The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.	119
176068	cd08686	C2_ABR	C2 domain in the Active BCR (Breakpoint cluster region) Related protein. The ABR protein is similar to the breakpoint cluster region protein.  It has homology to guanine nucleotide exchange proteins and GTPase-activating proteins (GAPs).  ABR is expressed primarily in the brain, but also in non-neuronal tissues such as the heart.  It has been associated with human diseases such as Miller-Dieker syndrome in which mental retardation and malformations of the heart are present.  ABR contains a RhoGEF domain and a PH-like domain upstream of its C2 domain and a RhoGAP domain downstream of this domain.  A few members also contain a Bcr-Abl oncoprotein oligomerization domain at the very N-terminal end. Splice variants of ABR have been identified. ABR is found in a wide variety of organisms including chimpanzee, dog, mouse, rat, fruit fly, and mosquito. The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.	118
176069	cd08687	C2_PKN-like	C2 domain in Protein kinase C-like (PKN) proteins. PKN is a lipid-activated serine/threonine kinase.  It is a member of the protein kinase C (PKC) superfamily, but lacks a C1 domain. There are at least 3 different isoforms of PKN (PRK1/PKNalpha/PAK1; PKNbeta, and PRK2/PAK2/PKNgamma). The C-terminal region contains the Ser/Thr type protein kinase domain, while the N-terminal region of PKN contains three antiparallel coiled-coil (ACC) finger domains which are relatively rich in charged residues and contain a leucine zipper-like sequence. These domains bind to the small GTPase RhoA.  Following these domains is a C2-like domain.  Its C-terminal part functions as an auto-inhibitory region.  PKNs are not activated by classical PKC activators such as diacylglycerol, phorbol ester or Ca2+, but instead are activated by phospholipids and unsaturated fatty acids. The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.	98
176070	cd08688	C2_KIAA0528-like	C2 domain found in the Human KIAA0528 cDNA clone. The members of this CD are named after the Human KIAA0528 cDNA clone.  All members here contain a single C2 repeat.  No other information on this protein is currently known. The C2 domain was first identified in PKC.  C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.	110
176071	cd08689	C2_fungal_Pkc1p	C2 domain found in protein kinase C (Pkc1p) in Saccharomyces cerevisiae. This family is named after the protein kinase C in Saccharomyces cerevisiae, Pkc1p. Protein kinase C is a member of a family of Ser/Thr phosphotransferases that are involved in many cellular signaling pathways. PKC has two antiparallel coiled-coil regions (ACC finger domain) (AKA PKC homology region 1 (HR1)/ Rho binding domain) upstream of the C2 domain and two C1 domains downstream. The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains, like those of PKC, are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.	109
176072	cd08690	C2_Freud-1	C2 domain found in 5' repressor element under dual repression binding protein-1 (Freud-1). Freud-1 is a novel calcium-regulated repressor that negatively regulates basal 5-HT1A receptor expression in neurons.  It may also play a role in the altered regulation of 5-HT1A receptors associated with anxiety or major depression. Freud-1 contains two DM-14 basic repeats, a helix-loop-helix DNA binding domain, and a C2 domain. The Freud-1 C2 domain is thought to be calcium insensitive and it lacks several acidic residues that mediate calcium binding of the PKC C2 domain. In addition, it contains a poly-basic insert that is not present in calcium-dependent C2 domains and may function as a nuclear localization signal. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.  This cd contains the first C2 repeat, C2A, and has a type-II topology.	155
176073	cd08691	C2_NEDL1-like	C2 domain present in NEDL1 (NEDD4-like ubiquitin protein ligase-1). NEDL1 (AKA  HECW1(HECT, C2 and WW domain containing E3 ubiquitin protein ligase 1)) is a newly identified HECT-type E3 ubiquitin protein ligase highly expressed in favorable neuroblastomas. In vertebrates it is found primarily in neuronal tissues, including the spinal cord. NEDL1 is thought to normally function in the quality control of cellular proteins by eliminating misfolded proteins.  This is thought to be accomplished via a mechanism analogous to that of ER-associated degradation by forming tight complexes and aggregating misfolded proteins that have escaped ubiquitin-mediated degradation.  NEDL1 is composed of a C2 domain, two WW domains, and a ubiquitin ligase Hect domain. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.	137
176074	cd08692	C2B_Tac2-N	C2 domain second repeat found in Tac2-N (Tandem C2 protein in Nucleus). Tac2-N contains two C2 domains and a short C-terminus including a WHXL motif, which are key in stabilizing transport vesicles to the plasma membrane by binding to a plasma membrane.  However, unlike the usual carboxyl-terminal-type (C-type) tandem C2 proteins, it lacks a transmembrane domain, a Slp-homology domain, and a Munc13-1-interacting domain. Homology search analysis indicates that no known protein motifs are located in its N-terminus, making Tac2-N a novel class of Ca2+-independent, C-type tandem C2 proteins. The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.	135
176075	cd08693	C2_PI3K_class_I_beta_delta	C2 domain present in class I beta and delta phosphatidylinositol 3-kinases (PI3Ks). PI3Ks (AKA phosphatidylinositol (PtdIns) 3-kinases) regulate cell processes such as cell growth, differentiation, proliferation, and motility.  PI3Ks work on phosphorylation of phosphatidylinositol, phosphatidylinositide (4)P (PtdIns (4)P), or PtdIns(4,5)P2. Specifically they phosphorylate the D3 hydroxyl group of phosphoinositol lipids on the inositol ring. There are 3 classes of PI3Ks based on structure, regulation, and specificity. All classes contain a C2 domain, a PIK domain, and a kinase catalytic domain.  The members here are class I, beta and delta isoforms of PI3Ks and contain both a Ras-binding domain and a p85-binding domain.  Class II PI3Ks contain both of these as well as a PX domain, and a C-terminal C2 domain containing a nuclear localization signal.  C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.  Members have a type-I topology.	173
176076	cd08694	C2_Dock-A	C2 domains found in Dedicator Of CytoKinesis (Dock) class A proteins. Dock-A is one of 4 classes of Dock family proteins.  The members here include: Dock180/Dock1, Dock2, and Dock5.  Most of these members have been shown to be GEFs specific for Rac.  Dock5 has not been well characterized to date, but most likely also is a GEF specific for Rac. In addition to the C2 domain (AKA Dock homology region (DHR)-1, CED-5, Dock180, MBC-zizimin homology (CZH) 1) and the DHR-2 (AKA CZH2, or Docker), which all Dock180-related proteins have, Dock-A members contain a proline-rich region and an SH3 domain upstream of the C2 domain. DHR-2 has the catalytic activity for Rac and/or Cdc42, but is structurally unrelated to the DH domain. The C2/DHR-1 domains of Dock180 and Dock4 have been shown to bind phosphatidylinositol-3, 4, 5-triphosphate (PtdIns(3,4,5)P3). The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.	196
176077	cd08695	C2_Dock-B	C2 domains found in Dedicator Of CytoKinesis (Dock) class B proteins. Dock-B is one of 4 classes of Dock family proteins.  The members here include: Dock3/MOCA (modifier of cell adhesion) and Dock4.  Most of these members have been shown to be GEFs specific for Rac, although Dock4 has also been shown to interact indirectly with the Ras family GTPase Rap1, probably through Rap regulatory proteins. In addition to the C2 domain (AKA Dock homology region (DHR)-1, CED-5, Dock180, MBC-zizimin homology (CZH) 1) and the DHR-2 (AKA CZH2, or Docker), which all Dock180-related proteins have, Dock-B members contain a SH3 domain upstream of the C2 domain and a proline-rich region downstream.  DHR-2 has the catalytic activity for Rac and/or Cdc42, but is structurally unrelated to the DH domain. The C2/DHR-1 domains of Dock180 and Dock4 have been shown to bind phosphatidylinositol-3, 4, 5-triphosphate (PtdIns(3,4,5)P3).  The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.	189
176078	cd08696	C2_Dock-C	C2 domains found in Dedicator Of CytoKinesis (Dock) class C proteins. Dock-C is one of 4 classes of Dock family proteins.  The members here include: Dock6/Zir1, Dock7/Zir2, and Dock8/Zir3.  Dock-C members are GEFs for both Rac and Cdc42. In addition to the C2 domain (AKA Dock homology region (DHR)-1, CED-5, Dock180, MBC-zizimin homology (CZH) 1) and the DHR-2 (AKA CZH2, or Docker), which all Dock180-related proteins have, Dock-C members contain a functionally uncharacterized domain upstream of the C2 domain. DHR-2 has the catalytic activity for Rac and/or Cdc42, but is structurally unrelated to the DH domain. The C2/DHR-1 domains of Dock180 and Dock4 have been shown to bind phosphatidylinositol-3, 4, 5-triphosphate (PtdIns(3,4,5)P3). The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.	179
176079	cd08697	C2_Dock-D	C2 domains found in Dedicator Of CytoKinesis (Dock) class D proteins. Dock-D is one of 4 classes of Dock family proteins.  The members here include: Dock9/Zizimin1, Dock10/Zizimin3, and Dock11/Zizimin2/ACG (activated Cdc42-associated GEF).  Dock-D are Cdc42-specific GEFs. In addition to the C2 domain (AKA Dock homology region (DHR)-1, CED-5, Dock180, MBC-zizimin homology (CZH) 1) and the DHR-2 (AKA CZH2, or Docker), which all Dock180-related proteins have, Dock-D members contain a functionally uncharacterized domain and a PH domain upstream of the C2 domain.  DHR-2 has the catalytic activity for Rac and/or Cdc42, but is structurally unrelated to the DH domain. The C2/DHR-1 domains of Dock180 and Dock4 have been shown to bind phosphatidylinositol-3, 4, 5-triphosphate (PtdIns(3,4,5)P3).  The PH domain broadly binds to phospholipids and is thought to be involved in targeting the plasma membrane.  The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins.  Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.  However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.  C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.	185
381627	cd08698	TGF_beta_SF	transforming growth factor beta (TGF-beta) like domain found in TGF-beta superfamily. TGF-beta superfamily consists of a large group of cell regulatory proteins, such as TGF-betas, Nodal, Activins/Inhibins, glial cell-line-derived neurotrophic factor (GDNF) family of ligands, bone morphogenetic proteins (BMPs), and growth and differentiation factors (GDFs). They play important roles in developmental and physiological processes in a variety of species, including invertebrates as well as vertebrates, through specific receptor complexes that are composed of type I and type II serine/threonine receptor kinases. The receptor kinases subsequently activate Smad proteins, which then propagate the signals into the nucleus to regulate target gene expression. Proteins from the TGF-beta superfamily are only active as homo- or heterodimers.	100
187728	cd08700	FMT_C_OzmH_like	C-terminal subdomain of the Formyltransferase-like domain found in OzmH-like proteins. Domain found in OzmH-like proteins with similarity to the C-terminal domain of Formyltransferase. OzmH is one of the proteins involved in the synthesis of Oxazolomycin (OZM), which is a hybrid peptide-polyketide antibiotic that exhibits potent antitumor and antiviral activities. OzmH is a multi-domain protein consisting of a formyl transferase domain, a flavin-utilizing monooxygenase domain, a LuxE domain functioning as an acyl protein synthetase and a phosphopantetheine (PP)-binding domain, which may function as an acyl carrier. It shows sequence similarity with other peptide-polyketide biosynthesis proteins.	100
187729	cd08701	FMT_C_HypX	C-terminal subdomain of the Formyltransferase-like domain found in HypX-like proteins. Domain found in HypX-like proteins with similarity to the C-terminal domain of Formyltransferase. HypX is involved in the maturation process of active [NiFe] hydrogenase. [NiFe] hydrogenases function in H2 metabolism in a variety of microorganisms, enabling them to use H2 as a source of reducing equivalents under aerobic and anaerobic conditions. [NiFe] hydrogenases consist of a large and a small subunit. The large subunit contains the [NiFe] active site but is synthesized as a precursor without the [NiFe] active site. This precursor undergoes a complex post-translational maturation process that requires the presence of a number of accessory proteins. HypX has been shown to be involved in this maturation process and has been proposed to participate in the generation and transport of the CO and CN ligands. However, HypX is not present in all hydrogen-metabolizing bacteria. Furthermore, hypX deletion mutants have a reduced but detectable level of hydrogenase activity. Thus, HypX might not be the determining factor in the maturation process. Members of this group have an N-terminal formyl transferase domain and a C-terminal enoyl-CoA hydratase/isomerase domain.	96
187730	cd08702	Arna_FMT_C	C-terminal subdomain of the formyltransferase domain on ArnA, which modifies lipid A with 4-amino-4-deoxy-l-arabinose. Domain found in ArnA with similarity to the C-terminal domain of Formyltransferase. ArnA is a bifunctional enzyme required for the modification of lipid A with 4-amino-4-deoxy-l-arabinose (Ara4N) that leads to resistance to cationic antimicrobial peptides (CAMPs) and clinical antimicrobials such as polymyxin. The C-terminal domain of ArnA is a dehydrogenase domain that catalyzes the oxidative decarboxylation of UDP-glucuronic acid (UDP-GlcUA) to UDP-4-keto-arabinose (UDP-Ara4O) and the N-terminal domain is a formyltransferase domain that catalyzes the addition of a formyl group to UDP-4-amino-4-deoxy-L-arabinose (UDP-L-Ara4N) to form UDP-L-4-formamido-arabinose (UDP-L-Ara4FN). This domain family represents the C-terminal subdomain of the formyltransferase domain, downstream of the N-terminal subdomain containing the catalytic center. ArnA forms a hexameric structure (a dimer of trimers), in which the dehydrogenase domains are arranged at the center with the transformylase domains on the outside of the complex.	92
187731	cd08703	FDH_Hydrolase_C	The C-terminal subdomain of the hydrolase domain on the bi-functional protein 10-formyltetrahydrofolate dehydrogenase. The family represents the C-terminal subdomain of the hydrolase domain on the bi-functional protein, 10-formyltetrahydrofolate dehydrogenase (FDH). FDH catalyzes the conversion of 10-formyltetrahydrofolate, a precursor for nucleotide biosynthesis, to tetrahydrofolate. The protein comprises two functional domains: the N-terminal hydrolase domain that removes a formyl group from 10-formyltetrahydrofolate and the C-terminal NADP-dependent dehydrogenase domain that reduces the formyl group to carbon dioxide. The hydrolase domain contains an N-terminal formyl transferase catalytic core subdomain and this C-terminal subdomain, which may be involved in substrate binding.	100
187732	cd08704	Met_tRNA_FMT_C	C-terminal domain of Formyltransferase and other enzymes. C-terminal domain of formyl transferase and other proteins with diverse enzymatic activities. Proteins found in this family include methionyl-tRNA formyltransferase, ArnA, and 10-formyltetrahydrofolate dehydrogenase. Methionyl-tRNA formyltransferases constitute the majority of the family and also demonstrate greater sequence diversity. Although most proteins with formyltransferase activity contain the C-terminal domain, some formyltransferases (for example, prokaryotic glycinamide ribonucleotide transformylase (GART)) only have the core catalytic domain, indicating that the C-terminal domain is not a requirement for catalytic activity and may be involved in substrate binding. For example, the C-terminal domain of methionyl-tRNA formyltransferase is involved in the tRNA binding.	87
188660	cd08705	RGS_R7-like	Regulator of G protein signaling (RGS) domain found in the R7 subfamily of proteins. The RGS (Regulator of G-protein Signaling) domain is an essential part of the R7 (Neuronal RGS) protein subfamily of the RGS protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. The R7 subfamily includes RGS6, RGS7, RGS9, and RGS11, all of which, in humans, are expressed predominantly in the nervous system, form an obligatory complex with G-beta-5, and play important roles in the regulation of crucial neuronal processes. In addition, R7 proteins were found to bind many other proteins outside of the G protein signaling pathways including: m-opioid receptor, beta-arrestin, alpha-actinin-2, NMDAR, polycystin, spinophilin, guanylyl cyclase, among others.	121
188661	cd08706	RGS_R12-like	Regulator of G protein signaling (RGS) domain found in the R12 subfamily of proteins. The RGS (Regulator of G-protein Signaling) domain is an essential part of the R12 (Neuronal RGS) protein subfamily of the RGS protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play a critical regulatory role as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. Deactivation of G-protein signaling, controlled by RGS domain, accelerates GTPase activity of the alpha subunit by hydrolysis of GTP to GDP that results in reassociation of the alpha-subunit with the beta-gamma-dimer and thereby inhibition of downstream activity. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. The R12 RGS subfamily includes RGS10, RGS12 and RGS14 all of which are highly selective for G-alpha-i1 over G-alpha-q.	113
188662	cd08707	RGS_Axin	Regulator of G protein signaling (RGS) domain found in the Axin protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the Axin protein. Axin is a member of the RA/RGS subfamily of the RGS protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, and skeletal and muscle development. The RGS domain of Axin specifically interacts with the heterotrimeric G-alpha12 protein, but not with closely related G-alpha13, and provides a unique tool to regulate G-alpha12-mediated signaling processes. The RGS domain of Axin also interacts with the tumor suppressor protein APC (Adenomatous Polyposis Coli) in order to control the cytoplasmic level of the proto-oncogene, beta-catenin.	117
188663	cd08708	RGS_FLBA	Regulator of G protein signaling (RGS) domain found in the FLBA (Fluffy Low BrlA) protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the FLBA (Fluffy Low BrlA) protein. FLBA is a member of the RGS protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS proteins play a critical regulatory role as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. Deactivation of the G-protein signaling controlled by the RGS domain accelerates the GTPase activity of the alpha subunit by hydrolysis of GTP to GDP which results in reassociation of the alpha-subunit with the beta-gamma-dimer and thereby inhibition of downstream activity. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes. The RGS domain of the FLBA protein antagonizes G protein signaling to block proliferation and allow development. It is required for control of mycelial proliferation and activation of asexual sporulation in yeast.	148
188664	cd08709	RGS_RGS2	Regulator of G protein signaling (RGS) domain found in the RGS2 protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the RGS2 protein. RGS2 is a member of R4/RGS subfamily of RGS family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. The RGS domain controls G-protein signaling by accelerating the GTPase activity of the G-alpha subunit which leads to G protein deactivation and promotes desensitization. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS2 plays important roles in the regulation of blood pressure and the pathogenesis of human hypertension, as well as in bone formation in osteoblasts. Outside of the GPCR pathway RGS2 interacts with calmodulin, beta-COP, tubulin, PKG1-alpha, and TRPV6.	114
188665	cd08710	RGS_RGS16	Regulator of G protein signaling (RGS) domain found in the RGS16 protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the RGS16 protein. RGS16 is a member of the RGS protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha subunits. The RGS domain controls G-protein signaling by accelerating the GTPase activity of the G-alpha subunit which leads to G protein deactivation and promotes desensitization. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS16 is a member of the R4/RGS subfamily and interacts with neuronal G-alpha0. RGS16 expression is upregulated by IL-17 of the NF-kappaB signaling pathway in autoimmune B cells.	114
188666	cd08711	RGS_RGS8	Regulator of G protein signaling (RGS) domain found in the RGS8 protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the RGS8 protein. RGS8 is a member of R4/RGS subfamily of RGS family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha subunits. The RGS domain controls G-protein signaling by accelerating the GTPase activity of the G-alpha subunit which leads to G protein deactivation and promotes desensitization. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS8 is involved in G-protein-gated potassium channels regulation and predominantly expressed in the brain. RGS8 also is selectively expressed in the hematopoietic system (NK cells).	125
188667	cd08712	RGS_RGS18	Regulator of G protein signaling (RGS) domain found in the RGS18 protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the RGS18 protein.  RGS18 is a member of the RGS protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha subunits. The RGS domain controls G-protein signaling by accelerating the GTPase activity of the G-alpha subunit which leads to G protein deactivation and promotes desensitization. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS18 is a member of the R4/RGS subfamily and is expressed predominantly in osteoclasts where it acts as a negative regulator of the acidosis-induced osteoclastogenic OGR1/NFAT signaling pathway. RANKL (receptor activator of nuclear factor kappa-B ligand) stimulates osteoclastogenesis by inhibiting expression of RGS18.	114
188668	cd08713	RGS_RGS3	Regulator of G protein signaling (RGS) domain found in the RGS3 protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the RGS3 protein. RGS3 is a member of the R4/RGS subfamily of the RGS family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha subunits. The RGS domain controls G-protein signaling by accelerating the GTPase activity of the G-alpha subunit which leads to G protein deactivation and promotes desensitization. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes. RGS3 induces apoptosis when overexpressed and is involved in cell migration through interaction with the Ephrin receptor. RGS3 exists as several splice isoforms and interacts with neuroligin, estrogen receptor-alpha, and 14-3-3 outside of the GPCR pathways.	114
188669	cd08714	RGS_RGS4	Regulator of G protein signaling (RGS) domain found in the RGS4 protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the RGS4 protein. RGS4 is a member of the R4/RGS subfamily of the RGS family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha subunits. The RGS domain controls G-protein signaling by accelerating the GTPase activity of the G-alpha subunit which leads to G protein deactivation and promotes desensitization. RGS4 is expressed widely in brain including prefrontal cortex, striatum, locus coeruleus (LC), and hippocampus and has been implicated in regulation of opioid, cholinergic, and serotonergic signaling. Dysfunctions in RGS4 proteins are involved  in etiology of Parkinson's disease, addiction, and schizophrenia. RGS4 also is up-regulated in the failing human heart. RGS4 interacts with many binding partners outside of GPCR pathways, including calmodulin, COP, Kir3, PIP, calcium/CaM, PA, ErbB3, and 14-3-3.	114
188670	cd08715	RGS_RGS1	Regulator of G protein signaling (RGS) domain found in the RGS1 protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the RGS1 protein. RGS1 is a member of the R4/RGS subfamily of the RGS family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha subunits. The RGS domain controls G-protein signaling by accelerating the GTPase activity of the G-alpha subunit which leads to G protein deactivation and promotes desensitization. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis.  RGS1 is expressed predominantly in hematopoietic compartments, including T and B lymphocytes, and may play a major role in chemokine-mediated homing of lymphocytes to secondary lymphoid organs. In addition, RGS1 interacts with calmodulin and 14-3-3 protein outside of the GPCR pathway.	114
188671	cd08716	RGS_RGS13	Regulator of G protein signaling (RGS) domain found in the RGS13 protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the RGS13 protein. RGS13 is a member of the R4/RGS subfamily of the RGS family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha subunits. The RGS domain controls G-protein signaling by accelerating the GTPase activity of the G-alpha subunit which leads to G protein deactivation and promotes desensitization. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis.  RGS13 is predominantly expressed in T and B lymphocytes and in mast cells, and plays a role in adaptive immune responses. RGS13 is also expressed in dendritic cells and in neuroendocrine cells of the thymus, gastrointestinal, and respiratory tracts. Outside of the GPCR pathway, RGS5 interacts with the PIP3 protein.	114
188672	cd08717	RGS_RGS5	Regulator of G protein signaling (RGS) domain found in the RGS5 protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the RGS5 protein. RGS5 is a member of the R4/RGS subfamily of the RGS family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha subunits. The RGS domain controls G-protein signaling by accelerating the GTPase activity of the G-alpha subunit which leads to G protein deactivation and promotes desensitization. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis.  Two splice isoforms of RGS5 have been found: RGS5L (long) which is expressed in smooth muscle cells (pericytes) and heart and RGS5S (short) which is highly expressed in the ciliary body of the eye, kidney, brain, spleen, skeletal muscle, and small intestine. Outside of the GPCR pathway, RGS5 interacts with the 14-3-3 protein.	114
188673	cd08718	RGS_RZ-like	Regulator of G protein signaling (RGS) domain found in the RZ protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the RZ subfamily of the RGS protein family.  They are a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. Deactivation of G-protein signaling is controlled by RGS domains, which accelerate GTPase activity of the alpha subunit by hydrolysis of GTP to GDP, which results in reassociation of the alpha-subunit with the beta-gamma-dimer and inhibition of downstream activity. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. The RZ subfamily of RGS proteins includes RGS17, RGS19 (former GAIP), RGS20, and its splice variant Ret-RGS.	118
188674	cd08719	RGS_SNX13	Regulator of G protein signaling (RGS) domain found in the Sorting Nexin 13 (SNX13) protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the SNX13 (Sorting Nexin 13) protein, a member of  the RGS protein family.  They are a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development. The RGS-domain of SNX13 plays a major role through attenuation of Galphas-mediated signaling and regulates endocytic trafficking and degradation of the epidermal growth factor receptor. Snx13-null mice were embryonic lethal around midgestation which supports an essential role for SNX13 in mouse development and regulation of endocytosis dynamics.	135
188675	cd08720	RGS_SNX25	Regulator of G protein signaling (RGS) domain found in the Sorting Nexin 25 (SNX25) protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the SNX25 (Sorting Nexin 25) protein, a member of  the RGS protein family. They are a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development. SNX25 is a member of the Dopamine receptors (DAR) signalplex and regulates the trafficking of D1 and D2 DARs.	110
188676	cd08721	RGS_AKAP2_2	Regulator of G protein signaling (RGS) domain 2 found in the A-kinase anchoring protein, D-AKAP2. The RGS (Regulator of G-protein Signaling) domain is an essential part of the D-AKAP2 (A-kinase anchoring protein), a member of the RGS protein family. They are a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development. D-AKAP2 contains two RGS domains which play an important role in spatiotemporal localization of cAMP-dependent PKA (cyclic AMP-dependent protein kinase) that regulates many different signaling pathways by phosphorylation of target proteins. This cd contains the second RGS domain.	121
188677	cd08722	RGS_SNX14	Regulator of G protein signaling (RGS) domain found in the Sorting Nexin14 (SNX14) protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the SNX14 (Sorting Nexin14) protein, a member of  the RGS protein family. They are a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development. SNX14 is believed to regulate membrane trafficking in motor neurons.	127
188678	cd08723	RGS_RGS21	Regulator of G protein signaling (RGS) domain found in the RGS21 protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the RGS21 protein, a member of the RGS protein family. They are a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes. RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, apoptosis, and cell proliferation, as well as modulation of cardiac development. RGS21 is a member of the R4/RGS subfamily and its mRNA was detected only in sensory taste cells that express sweet taste receptors and the taste G-alpha subunit, gustducin, suggesting a potential role in regulating taste transduction.	111
188679	cd08724	RGS_GRK-like	Regulator of G protein signaling domain (RGS) found in G protein-coupled receptor kinase (GRK). The RGS domain is found in G protein-coupled receptor kinases (GRKs). These proteins play a key role in phosphorylation-dependent desensitization/resensitization of GPCRs (G protein-coupled receptors), intracellular trafficking, endocytosis, as well as in the modulation of important intracellular signaling cascades by GPCR. GRKs also modulate cellular response in a phosphorylation-independent manner using their ability to interact with multiple signaling proteins involved in many essential cellular pathways. The RGS domain of the GRKs has very little sequence similarity with the canonical RGS domain of the RGS proteins and therefore is often referred to as the RH (RGS Homology) domain. Based on sequence homology the GRK family consists of three major subfamilies: the GRK4 subfamily (GRK4, GRK5 and GRK6), the rhodopsin kinase or visual GRK subfamily (GRK1 and GRK7), and the beta-adrenergic receptor kinases subfamily (GRK2/GRK3). RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development.	114
188680	cd08725	RGS_RGS22_4	Regulator of G protein signaling domain RGS_RGS22_4. The RGS (Regulator of G-protein Signaling) domain is found in the RGS22 protein, a member of the RA/RGS subfamily of the RGS protein family. They are a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development. RGS22 contains at least 3 copies of the RGS domain in vertebrata and exists in multiple splicing variants. RGS22 is predominantly expressed in testis and believed to play an important role in spermatogenesis.	123
188681	cd08726	RGS_RGS22_3	Regulator of G protein signaling domain RGS_RGS22_3. The RGS (Regulator of G-protein Signaling) domain is found in the RGS22 protein, a member of the RA/RGS subfamily of the RGS protein family. They are a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development. RGS22 contains at least 3 copies of the RGS domain in vertebrata and exists in multiple splicing variants. RGS22 is predominantly expressed in testis and believed to play an important role in spermatogenesis.	130
188682	cd08727	RGS_RGS22_2	Regulator of G protein signaling domain RGS_RGS22_2. The RGS (Regulator of G-protein Signaling) domain is found in the RGS22 protein, a member of the RA/RGS subfamily of the RGS protein family. They are a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development. RGS22 contains at least 3 copies of the RGS domain in vertebrata and exists in multiple splicing variants. RGS22 is predominantly expressed in testis and believed to play an important role in spermatogenesis.	116
188683	cd08728	RGS-like_2	Uncharacterized Regulator of G protein Signaling (RGS) domain subfamily, child 2. These uncharacterized RGS-like domains consist largely of hypothetical proteins. The RGS domain is an essential part of the Regulator of G-protein Signaling (RGS) protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development. Several RGS proteins can fine-tune immune responses, while others play an important role in neuronal signal modulation. Some RGS proteins are the principal elements needed for proper vision.	179
188684	cd08729	RGS_PX	Regulator of G protein signaling domain. These uncharacterized RGS-like domains are found in proteins that also contain one or more PX domains. The RGS domain is an essential part of the Regulator of G-protein Signaling (RGS) protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play a critical regulatory role as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes. RGS proteins regulate intracellular trafficking and provide vital support for signal transduction. RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development. Several RGS proteins can fine-tune immune responses, while others play an important role in neuronal signal modulation. Some RGS proteins are the principal elements needed for proper vision.	136
188685	cd08730	RGS-like_3	Uncharacterized Regulator of G protein Signaling (RGS) domain subfamily, child 3. These uncharacterized RGS-like domains consist largely of hypothetical proteins. The RGS domain is an essential part of the Regulator of G-protein Signaling (RGS) protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play a critical regulatory role as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes. RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development. Several RGS proteins can fine-tune immune responses, while others play an important role in neuronal signal modulation. Some RGS proteins are the principal elements needed for proper vision.	165
188686	cd08731	RGS_RGS22_1	Regulator of G protein signaling domain RGS_RGS22_1. The RGS (Regulator of G-protein Signaling) domain is found in the RGS22 protein, a member of the RA/RGS subfamily of the RGS protein family, which is a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play a critical regulatory role as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development. RGS22 contains at least 3 copies of the RGS domain in vertebrata and exists in multiple splicing variants. RGS22 is predominantly expressed in testis and believed to play an important role in spermatogenesis.	125
188687	cd08732	RGS-like_4	Uncharacterized Regulator of G protein Signaling (RGS) domain subfamily, child 4. These uncharacterized RGS-like domains consist largely of hypothetical proteins. The RGS domain is an essential part of the Regulator of G-protein Signaling (RGS) protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play a critical regulatory role as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development. Several RGS proteins can fine-tune immune responses, while others play an important role in neuronal signal modulation. Some RGS proteins are the principal elements needed for proper vision.	139
188688	cd08734	RGS-like_1	Uncharacterized Regulator of G protein Signaling (RGS) domain subfamily, child 1. These uncharacterized RGS-like domains consist largely of hypothetical proteins. The RGS domain is an essential part of the Regulator of G-protein Signaling (RGS) protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play a critical regulatory role as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes. RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development. Several RGS proteins can fine-tune immune responses, while others play an important role in neuronal signal modulation. Some RGS proteins are the principal elements needed for proper vision.	109
188689	cd08735	RGS_AKAP2_1	Regulator of G protein signaling (RGS) domain 1 found in the A-kinase anchoring protein, D-AKAP2. The RGS (Regulator of G-protein Signaling) domain is an essential part of the D-AKAP2 (A-kinase anchoring protein), a member of the RGS protein family. They are a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development. D-AKAP2 contains two RGS domains which play an important role in spatiotemporal localization of cAMP-dependent PKA (cyclic AMP-dependent protein kinase) that regulates many different signaling pathways by phosphorylation of target proteins. This cd contains the first RGS domain.	171
188690	cd08736	RGS_RhoGEF-like	Regulator of G protein signaling (RGS) domain found in the Rho guanine nucleotide exchange factor (RhoGEF) protein. The RGS domain is found in the Rho guanine nucleotide exchange factor (RhoGEF) protein subfamily of the RGS domain containing protein family, which is a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RhoGEFs link signals from heterotrimeric G-alpha12/13 protein-coupled receptors to Rho GTPase activation, leading to various cellular responses, such as actin reorganization and gene expression. The RGS domain of the RhoGEFs has very little sequence similarity with the canonical RGS domain of the RGS proteins and therefore is often referred to as the RH (RGS Homology) domain. The RGS-GEFs subfamily includes the leukemia-associated RhoGEF (LARG), p115RhoGEF, and PDZ-RhoGEF. RGS proteins play a critical regulatory role as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development.	120
188691	cd08737	RGS_RGS6	Regulator of G protein signaling (RGS) domain found in the RGS6 protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the RGS6 protein, a member of R7 subfamily of the RGS protein family. RGS is a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). Other members of the R7 subfamily (Neuronal RGS) include: RGS7, RGS9, and RGS11, all of which are expressed predominantly in the nervous system, form an obligatory complex with G-beta-5, and play important roles in the regulation of crucial neuronal processes such as vision and motor control.  Additionally they have been implicated in many neurological conditions such as anxiety, schizophrenia, and drug dependence. RGS6 exists in multiple splice isoforms with identical RGS domains, but possess complete or incomplete GGL domains and distinct N- and C-terminal domains. RGS6 interacts with SCG10, a neuronal growth-associated protein and therefore regulates neuronal differentiation. Another RGS6-binding protein is DMAP1, a component of the Dnmt1 complex involved in repression of newly replicated genes. Mutations of a critical residue required for interaction of RGS6 protein with G proteins did not affect the ability of RGS6 to interact with both SCG10 and DMAP1. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis.	125
188692	cd08738	RGS_RGS7	Regulator of G protein signaling (RGS) domain found in the RGS7 protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the RGS7 protein, a member of R7 subfamily of the RGS protein family. RGS is a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. Other members of the R7 subfamily (Neuronal RGS) include: RGS6, RGS9, and RGS11, all of which are expressed predominantly in the nervous system, form an obligatory complex with G-beta-5, and play important roles in the regulation of crucial neuronal processes such as vision and motor control. Additionally they have been implicated in many neurological conditions such as anxiety, schizophrenia, and drug dependence. R7 RGS proteins are key modulators of the pharmacological effects of drugs involved in the development of tolerance and addiction. In addition, RGS7 was found to bind a component of the synaptic fusion complex, snapin, and some other proteins outside of G protein signaling pathways.	121
188693	cd08739	RGS_RGS9	Regulator of G protein signaling (RGS) domain found in the RGS9 protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the RGS9 protein, a member of R7 subfamily of the RGS protein family. RGS is a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis.  Other members of the R7 subfamily (Neuronal RGS) include: RGS6, RGS7, and RGS11, all of which are expressed predominantly in the nervous system, form an obligatory complex with G-beta-5, and play important roles in the regulation of crucial neuronal processes such as vision and motor control.  Additionally they have been implicated in many neurological conditions such as anxiety, schizophrenia, and drug dependence. RGS9 forms constitutive complexes with G-beta-5 subunit and controls such fundamental functions as vision and behavior. RGS9 exists in two splice isoforms: RGS9-1 which regulates phototransduction in rods and cones and RGS9-2 which regulates dopamine and opioid signaling in the basal ganglia. In addition, RGS9 was found to bind many other proteins outside of G protein signaling pathways including: mu-opioid receptor, beta-arrestin, alpha-actinin-2, NMDAR, polycystin, spinophilin, and guanylyl cyclase, among others.	121
188694	cd08740	RGS_RGS11	Regulator of G protein signaling (RGS) domain found in the RGS11 protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the RGS11 protein, a member of R7 subfamily of the RGS protein family. RGS is a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. Other members of the R7 subfamily (Neuronal RGS) include: RGS6, RGS7, and RGS9, all of which are expressed predominantly in the nervous system, form an obligatory complex with G-beta-5, and play important roles in the regulation of crucial neuronal processes such as vision and motor control. Additionally they have been implicated in many neurological conditions such as anxiety, schizophrenia, and drug dependence. RGS11 is expressed exclusively in retinal ON-bipolar neurons in which it forms complexes with G-beta-5 and R7AP (RGS7 anchor protein) and plays crucial roles in processing the light responses of retinal neurons.	126
188695	cd08741	RGS_RGS10	Regulator of G protein signaling (RGS) domain found in the RGS10 protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the RGS10 protein. RGS10 is a member of the RA/RGS subfamily of the RGS protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS10 belongs to the R12 RGS subfamily, which includes RGS12 and RGS14, all of which are highly selective for G-alpha-i1 over G-alpha-q. RGS10 exists in 2 splice isoforms. RGS10A is specifically expressed in osteoclasts and is a key component in the RANKL signaling mechanism for osteoclast differentiation, whereas RGS10B is expressed in brain and in immune tissues and has been implicated in diverse processes including: promotion of dopaminergic neuron survival via regulation of the microglial inflammatory response, modulation of presynaptic and postsynaptic G-protein signalling, as well as a possible role in regulation of gene expression.	113
188696	cd08742	RGS_RGS12	Regulator of G protein signaling (RGS) domain found in the RGS12 protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the RGS12 protein. RGS12 is a member of the RA/RGS subfamily of the RGS protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS12 belongs to the R12 RGS subfamily, which includes RGS10 and RGS14, all of which are highly selective for G-alpha-i1 over G-alpha-q. RGS12 exists in multiple splice variants: RGS12s (short) contains the core RGS/RBD/GoLoco domains, while RGS12L (long) has additional N-terminal PDZ and PTB domains. RGS12 splice variants show distinct expression patterns, suggesting that they have discrete functions during mouse embryogenesis. RGS12 also may play a critical role in coordinating Ras-dependent signals that are required for promoting and maintaining neuronal differentiation.	115
188697	cd08743	RGS_RGS14	Regulator of G protein signaling (RGS) domain found in the RGS14 protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the RGS14 protein. RGS14 is a member of the RA/RGS subfamily of the RGS protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS14 belongs to the R12 RGS subfamily, which includes RGS10 and RGS12, all of which are highly selective for G-alpha-i1 over G-alpha-q. RGS14 binds and regulates the subcellular localization and activities of H-Ras and Raf kinases in cells and thereby integrates G protein and Ras/Raf signaling pathways.	129
188698	cd08744	RGS_RGS17	Regulator of G protein signaling (RGS) domain found in the RGS17 protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the RGS17 protein, a member of the RZ subfamily of the RGS protein family. They are a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). As a major G-protein regulator, the RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. Deactivation of the G-protein signaling controlled by the RGS domain, which accelerates GTPase activity of the alpha subunit by hydrolysis of GTP to GDP, results in reassociation of the alpha-subunit with the beta-gamma-dimer and inhibition of downstream activity. The RZ subfamily of RGS proteins includes RGS19 (former GAIP), RGS20, and its splice variant Ret-RGS. RGS17 is a relatively non-selective GAP for G-alpha-z and other G-alpha-i/o proteins. RGS17 blocks dopamine receptor-mediated inhibition of cAMP accumulation; it also blocks thyrotropin releasing hormone-stimulated Ca++ mobilization. RGS17, like other members of RZ subfamily, can act either as a GAP or as G-protein effector antagonist.	118
188699	cd08745	RGS_RGS19	Regulator of G protein signaling (RGS) domain found in the RGS19 protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the RGS19 protein (also known as GAIP), a member of the RZ subfamily of the RGS protein family. They are a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. Deactivation of G-protein signaling is controlled by RGS domains, which accelerate GTPase activity of the alpha subunit by hydrolysis of GTP to GDP, resulting in a reassociation of the alpha-subunit with the beta-gamma-dimer and an inhibition of downstream activity. As a major G-protein regulator, the RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. The RZ subfamily of RGS proteins includes RGS17, RGS20, and its splice variant Ret-RGS. RGS19 participates in regulation of dopamine receptor D2R and D3R, as well as beta-adrenergic receptors.	118
188700	cd08746	RGS_RGS20	Regulator of G protein signaling (RGS) domain found in the RGS20 protein. The RGS (Regulator of G-protein Signaling) domain is an essential part of the RGS20 protein (also known as RGSZ1), a member of the RZ subfamily of the RGS protein family. They are a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. Deactivation of G-protein signaling is controlled by the RGS domain, which accelerates GTPase activity of the alpha subunit by hydrolysis of GTP to GDP, resulting in reassociation of the alpha-subunit with the beta-gamma-dimer and inhibition of downstream activity. As a major G-protein regulator, the RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. The RZ subfamily of RGS proteins includes RGS17, RGS19 (former GAIP), and the splice variant of RGS20, Ret-RGS. RGS20 is expressed exclusively in brain, with the highest concentrations in the temporal lobe and the caudate nucleus, and may play a role in signaling regulation in these brain regions. RGS20 acts as a GAP of both G-alpha-z and G-alpha-I and controls signaling in the mu opioid receptor pathway.	167
188701	cd08747	RGS_GRK2_GRK3	Regulator of G protein signaling domain (RGS) found in G protein-coupled receptor kinase 2 (GRK2) and G protein-coupled receptor kinase 3 (GRK3). The RGS domain is an essential part of the GRK2 (G protein-coupled receptor kinase 2) and the GRK3 proteins, which are members of the beta-adrenergic receptor kinases subfamily. GRK2 and GRK3 are ubiquitously expressed and can phosphorylate many different GPCRs. The C-terminus of GRK2 and 3 contains a pleckstrin homology domain (PH) with binding sites for the membrane phospholipid PIP2 and free G-beta-gamma subunits. These specific interactions could help to maintain a membrane-bound population of GRK2 prior to the agonist-dependent overt GRK2 translocation. GRK2 and GRK3 are members of the GRK kinase family which includes three major subfamilies: the GRK4 subfamily (GRK4, GRK5 and GRK6), the rhodopsin kinase or visual GRK subfamily (GRK1 and GRK7), and the beta-adrenergic receptor kinases subfamily (GRK2/GRK3). The RGS domain of the GRKs has very little sequence similarity with the canonical RGS domain of the RGS proteins and therefore is often referred to as the RH (RGS Homology) domain. RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development.	157
188702	cd08748	RGS_GRK1	Regulator of G protein signaling domain (RGS) found in G protein-coupled receptor kinase 1 (GRK1). The RGS domain is found in G protein-coupled receptor kinase 1 (GRK1, also referred to as Rhodopsin kinase), which plays a key role in phosphorylation of rhodopsin (Rho), a G protein-coupled receptor responsible for visual signal transduction in rod cells. GRK1 is a member of the GRK kinase family which includes three major subfamilies: the GRK4 subfamily (GRK4, GRK5 and GRK6), the rhodopsin kinase or visual GRK subfamily (GRK1 and GRK7), and the beta-adrenergic receptor kinases subfamily (GRK2/GRK3). The RGS domain of the GRKs has very little sequence similarity with the canonical RGS domain of the RGS proteins and therefore is often referred to as the RH (RGS Homology) domain. A few inactivation mutations in GRK1 have been found in patients with Oguchi disease, a stationary form of night blindness. RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development.	138
188703	cd08749	RGS_GRK7	Regulator of G protein signaling domain (RGS) found in G protein-coupled receptor kinase 7 (GRK7). The RGS domain is an essential part of the GRK7 (G protein-coupled receptor kinase 7) protein, which together with GRK1 (Rhodopsin kinase) has been implicated in the shutoff of the photoresponse and adaptation to changing light conditions via rod and cone opsin phosphorylation. GRK7 is a member of the GRK kinase family which includes three major subfamilies: the GRK4 subfamily (GRK4, GRK5 and GRK6), the rhodopsin kinase or visual GRK subfamily (GRK1 and GRK7), and the beta-adrenergic receptor kinases subfamily (GRK2/GRK3). The RGS domain of the GRKs has very little sequence similarity with the canonical RGS domain of the RGS proteins and therefore is often referred to as the RH (RGS Homology) domain. GRK7 is expressed in all vertebrate cones except those of mice and rats, which do not have the gene for GRK7. Lack of either GRK7 or both GRK1 and GRK7 in humans leads to a vision defect called Enhanced S Cone syndrome. RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development.	139
188704	cd08750	RGS_GRK4	Regulator of G protein signaling domain (RGS) found in G protein-coupled receptor kinase 4 (GRK4). The RGS domain is an essential part of the GRK4 (G protein-coupled receptor kinase 4) proteins, which are membrane-associated serine/threonine protein kinases that phosphorylate G protein-coupled receptors (GPCRs) upon agonist stimulation. This phosphorylation initiates beta-arrestin-mediated receptor desensitization, internalization, and signaling events. GRK4 is a member of the GRK kinase family which includes three major subfamilies: the GRK4 subfamily (GRK4, GRK5 and GRK6), the rhodopsin kinase or visual GRK subfamily (GRK1 and GRK7), and the beta-adrenergic receptor kinases subfamily (GRK2/GRK3). The RGS domain of the GRKs has very little sequence similarity with the canonical RGS domain of the RGS proteins and therefore is often referred to as the RH (RGS Homology) domain. GRK4 plays a key role in regulating dopaminergic-mediated natriuresis and is associated with essential hypertension and/or salt-sensitive hypertension. GRK4 exists in four splice variants involved in hyperphosphorylation, desensitization, and internalization of two dopamine receptors (D1R and D3R). GRK4 also increases the expression of a key receptor of the renin-angiotensin system, the AT1R (angiotensin type 1 receptor). RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development.	132
188705	cd08751	RGS_GRK6	Regulator of G protein signaling domain (RGS) found in G protein-coupled receptor kinase 6 (GRK6). The RGS domain is an essential part of the GRK6 (G protein-coupled receptor kinase 6) protein, which plays an important role in regulating dopamine, opioid, M3 muscarinic, and chemokine receptor signaling. GRK6 is a member of the GRK kinase family which includes three major subfamilies: the GRK4 subfamily (GRK4, GRK5 and GRK6), the rhodopsin kinase or visual GRK subfamily (GRK1 and GRK7), and the beta-adrenergic receptor kinases subfamily (GRK2/GRK3). The RGS domain of the GRKs has very little sequence similarity with the canonical RGS domain of the RGS proteins and therefore is often referred to as the RH (RGS Homology) domain. The RH domain of GRK6 does not have the structural determinants that are required for binding the G-alpha subunit, in contrast to GRK2 and many other RGS proteins. GRK6 is an important target for treatment of addiction and Parkinson disease. RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development.	145
188706	cd08752	RGS_GRK5	Regulator of G protein signaling domain (RGS) found in G protein-coupled receptor kinase 5 (GRK5). The RGS domain is an essential part of the GRK5 (G protein-coupled receptor kinase 5) protein, a membrane-associated serine/threonine protein kinase that phosphorylates G protein-coupled receptors (GPCRs) upon agonist stimulation. This phosphorylation initiates beta-arrestin-mediated receptor desensitization, internalization, and signaling events. GRK5 is a member of the GRK kinase family which includes three major subfamilies: the GRK4 subfamily (GRK4, GRK5 and GRK6), the rhodopsin kinase or visual GRK subfamily (GRK1 and GRK7), and the beta-adrenergic receptor kinases subfamily (GRK2/GRK3). The RGS domain of the GRKs has very little sequence similarity with the canonical RGS domain of the RGS proteins and therefore is often referred to as the RH (RGS Homology) domain. RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development.	123
188707	cd08753	RGS_PDZRhoGEF	Regulator of G protein signaling (RGS) domain found in the PDZ-Rho guanine nucleotide exchange factor (RhoGEF) protein. The RGS domain is an essential part of the PDZ-RhoGEF (PDZ: Postsynaptic density 95, Disk large, Zona occludens-1; RhoGEF: Rho guanine nucleotide exchange factor; alias PRG) protein, a member of the RhoGEF subfamily of the RGS protein family. The RhoGEFs are peripheral membrane proteins that regulate essential cellular processes, including cell shape, cell migration, and cell cycle progression, as well as gene transcription by linking signals from heterotrimeric G-alpha12/13 protein-coupled receptors to Rho GTPase activation, leading to various cellular responses, such as actin reorganization and gene expression. The RhoGEF subfamily includes leukemia-associated RhoGEF protein (LARG), p115RhoGEF, PDZ-RhoGEF and its rat-specific splice variant GTRAP48. The RGS domain of RhoGEFs has very little sequence similarity with the canonical RGS domain of the RGS proteins and is often referred to as the RH (RGS Homology) domain. In contrast to p115RhoGEF and LARG, PDZ-RhoGEF cannot serve as a GTPase-activating protein (GAP), due to the mutation of sites in the RGS domain region that are crucial for GAP activity. RGS proteins play a critical regulatory role as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development.	145
188708	cd08754	RGS_LARG	Regulator of G protein signaling (RGS) domain found in the leukemia-associated Rho guanine nucleotide exchange factor (RhoGEF) protein (LARG). The RGS domain is an essential part of the leukemia-associated RhoGEF protein (LARG), a member of the RhoGEF (Rho guanine nucleotide exchange factor) subfamily of the RGS protein family. The RhoGEFs are peripheral membrane proteins that regulate essential cellular processes, including cell shape, cell migration, cell cycle progression, and gene transcription by linking signals from heterotrimeric G-alpha12/13 protein-coupled receptors to Rho GTPase activation, leading to various cellular responses, such as actin reorganization and gene expression. The RhoGEF subfamily includes p115RhoGEF, LARG, PDZ-RhoGEF, and its rat-specific splice variant GTRAP48. The RGS domain of RhoGEFs has very little sequence similarity with the canonical RGS domain of the RGS proteins and is often referred to as the RH (RGS Homology) domain. In addition to being a G-alpha13 effector, the LARG protein also functions as a GTPase-activating protein (GAP) for G-alpha13. RGS proteins play a critical regulatory role as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development.	222
188709	cd08755	RGS_p115RhoGEF	Regulator of G protein signaling (RGS) domain found in the Rho guanine nucleotide exchange factor (GEF), p115 RhoGEF. The RGS (Regulator of G-protein Signaling) domain is an essential part of the p115RhoGEF protein, a member of the RhoGEF (Rho guanine nucleotide exchange factor) subfamily of the RGS protein family. The RhoGEFs are peripheral membrane proteins that regulate essential cellular processes, including cell shape, cell migration, cell cycle progression, and gene transcription by linking signals from heterotrimeric G-alpha12/13 protein-coupled receptors to Rho GTPase activation, leading to various cellular responses, such as actin reorganization and gene expression. The RhoGEF subfamily includes p115RhoGEF, LARG, PDZ-RhoGEF and its rat-specific splice variant GTRAP48. The RGS domain of RhoGEFs has very little sequence similarity with the canonical RGS domain of the RGS proteins and is often referred to as the RH (RGS Homology) domain. In addition to being a G-alpha13/12 effector, the p115RhoGEF protein also functions as a GTPase-activating protein (GAP) for G-alpha13. RGS proteins play a critical regulatory role as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development.	193
188710	cd08756	RGS_GEF_like	Regulator of G protein signaling (RGS) domain found in the Rho guanine nucleotide exchange factor (RhoGEF) protein. The RGS domain is found in the Rho guanine nucleotide exchange factor (RhoGEF) protein subfamily of the RGS domain-containing protein family, which is a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). The RhoGEFs are peripheral membrane proteins that regulate essential cellular processes, including cell shape, cell migration and cell cycle progression as well as gene transcription by linking signals from heterotrimeric G-alpha12/13 protein-coupled receptors to Rho GTPase activation, leading to various cellular responses, such as actin reorganization and gene expression. The RhoGEF subfamily includes the leukemia-associated RhoGEF protein (LARG), p115RhoGEF, PDZ-RhoGEF, and its rat-specific splice variant GTRAP48. The RGS domain of RhoGEFs has very little sequence similarity with the canonical RGS domain of the RGS proteins and is often referred to as the RH (RGS Homology) domain. RGS proteins play a critical regulatory role as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. RGS proteins regulate many aspects of embryonic development such as glial differentiation, embryonic axis formation, skeletal and muscle development, cell migration during early embryogenesis, as well as apoptosis, cell proliferation, and modulation of cardiac development.	122
188885	cd08757	SAM_PNT_ESE	Sterile alpha motif (SAM)/Pointed domain of ESE-like ETS transcriptional regulators. SAM Pointed domain of ESE-like (Epithelium-Specific ETS) subfamily of ETS transcriptional regulators is a putative protein-protein interaction domain. It can act as a major transactivator by providing a potential docking site for co-activators. ETS factors are important for cell differentiation. They can be involved in regulation of gene expression in different types of epithelial cells. They are expressed in salivary gland, intestine, stomach, pancreas, lungs, kidneys, colon, mammary gland, and prostate. Members of this group are proto-oncogenes. Expression profiles of these factors are altered in epithelial cancers, which makes them potential targets for cancer therapy.	69
260086	cd08759	Type_III_cohesin_like	Cohesin domain, interaction partner of dockerin. Bacterial cohesin domains bind to a complementary protein domain named dockerin, and this interaction is required for the formation of the cellulosome, a cellulose-degrading complex. Two specific calcium-dependent interactions between cohesin and dockerin appear to be essential for cellulosome assembly, type I and type II. This subfamily represents type III cohesins and closely related domains.	167
176490	cd08760	Cyt_b561_FRRS1_like	Eukaryotic cytochrome b(561), including the FRRS1 gene product. Cytochrome b(561), as found in eukaryotes, similar to and including the human FRRS1 gene product (ferric-chelate reductase 1), also called SDR-2 (stromal cell-derived receptor 2). This family comprises a variety of domain architectures, many of which contain dopamine beta-monooxygenase (DOMON) domains. The protein might act as a ferric-chelate reductase, catalyzing the reduction of Fe(3+) to Fe(2+), such as associated with the transport of iron from the endosome to the cytoplasm. It is assumed that this protein uses ascorbate as the electron donor. Belongs to the cytochrome b(561) family, which are secretory vesicle-specific electron transport proteins. Cytochromes b(561) are integral membrane proteins that bind two heme groups non-covalently, and may have six alpha-helical trans-membrane segments.	191
176491	cd08761	Cyt_b561_CYB561D2_like	Eukaryotic cytochrome b(561), including the CYB561D2 gene product. Cytochrome b(561), as found in eukaryotes, similar to and including the human CYB561D2 gene product. CYB561D2 is a candidate tumor suppressor. The protein might act as a ferric-chelate reductase, catalyzing the reduction of Fe(3+) to Fe(2+), such as associated with the transport of iron from the endosome to the cytoplasm. It is assumed that this protein uses ascorbate as the electron donor. Belongs to the cytochrome b(561) family, which are secretory vesicle-specific electron transport proteins. Cytochromes b(561) are integral membrane proteins that bind two heme groups non-covalently, and may have six alpha-helical trans-membrane segments.	183
176492	cd08762	Cyt_b561_CYBASC3	Vertebrate cytochrome b(561), CYBASC3 gene product. Cytochrome b ascorbate-dependent 3, as found in vertebrates, which might act as a ferric-chelate reductase, catalyzing the reduction of Fe(3+) to Fe(2+), such as associated with the transport of iron from the endosome to the cytoplasm. It is assumed that this protein uses ascorbate as the electron donor. Belongs to the cytochrome b(561) family, which are secretory vesicle-specific electron transport proteins. Cytochromes b(561) are integral membrane proteins that bind two heme groups non-covalently, and may have six alpha-helical trans-membrane segments.	179
176493	cd08763	Cyt_b561_CYB561	Vertebrate cytochrome b(561), CYB561 gene product. Cytochrome b(561), as found in vertebrates, which might act as a ferric-chelate reductase, catalyzing the reduction of Fe(3+) to Fe(2+), such as associated with the transport of iron from the endosome to the cytoplasm. It is assumed that this protein uses ascorbate as the electron donor. Belongs to the cytochrome b(561) family, which are secretory vesicle-specific electron transport proteins. Cytochromes b(561) are integral membrane proteins that bind two heme groups non-covalently, and may have six alpha-helical trans-membrane segments.	143
176494	cd08764	Cyt_b561_CG1275_like	Non-vertebrate eumetazoan cytochrome b(561). Cytochrome b(561), as found in non-vertebrate eumetazoans, similar to the Drosophila melanogaster CG1275 gene product. This protein might act as a ferric-chelate reductase, catalyzing the reduction of Fe(3+) to Fe(2+), such as associated with the transport of iron from the endosome to the cytoplasm. It is assumed that this protein uses ascorbate as the electron donor. Belongs to the cytochrome b(561) family, which are secretory vesicle-specific electron transport proteins. Cytochromes b(561) are integral membrane proteins that bind two heme groups non-covalently, and may have six alpha-helical trans-membrane segments.	214
176495	cd08765	Cyt_b561_CYBRD1	Vertebrate cytochrome b(561), CYBRD1 gene product. Duodenal cytochrome b or ferric-chelate reductase 3, a cytochrome b(561), as found in vertebrates, which might act as a ferric-chelate reductase, catalyzing the reduction of Fe(3+) to Fe(2+), such as associated with the transport of iron from the endosome to the cytoplasm. It is assumed that this protein uses ascorbate as the electron donor. This protein is expressed at the brush border of duodenal enterocytes and may play a role in the uptake of dietary Fe(3+), facilitating its transport into the mucosal cells. It may also be involved in the recycling of extracellular ascorbate in erythrocyte membranes, and act as a ferrireductase in epithelial cells of the respiratory system. Belongs to the cytochrome b(561) family, which are secretory vesicle-specific electron transport proteins. Cytochromes b(561) are integral membrane proteins that bind two heme groups non-covalently, and may have six alpha-helical trans-membrane segments.	153
176496	cd08766	Cyt_b561_ACYB-1_like	Plant cytochrome b(561), including the carbon monoxide oxygenase ACYB-1. Cytochrome b(561), as found in plants, similar to the Arabidopsis thaliana ACYB-1 gene product, a cytochrome b561 isoform localized to the tonoplast. This protein might act as a ferric-chelate reductase, catalyzing the reduction of Fe(3+) to Fe(2+), and might be capable of trans-membrane electron transport from intracellular ascorbate to extracellular ferric chelates. It is assumed that this protein uses ascorbate as the electron donor. Belongs to the cytochrome b(561) family, which are secretory vesicle-specific electron transport proteins. Cytochromes b(561) are integral membrane proteins that bind two heme groups non-covalently, and may have six alpha-helical trans-membrane segments.	144
176572	cd08767	Cdt1_c	The C-terminal fold of replication licensing factor Cdt1 is essential for Cdt1 activity and directly interacts with MCM2-7 helicase. Cdt1 is a replication licensing factor in eukaryotes that recruits the Minichromosome Maintenance Complex (MCM2-7) to the Origin Recognition Complex (ORC). The Cdt1 protein is divided into three regions based on sequence comparison and biochemical analyses: the N-terminal region (Cdt1_n) binds DNA in a sequence-, strand-, and conformation-independent manner; the middle winged helix fold (Cdt1_m) binds geminin to inhibit both binding of the MCM complex to origins of replication and DNA; and the C-terminal region (Cdt1_c) is essential for Cdt1 activity and directly interacts with the MCM2-7 helicase. Precise duplication of chromosomal DNA is required for genomic stability during replication. Assembly of replication factors to start DNA replication in eukaryotes must occur only once per cell cycle. To form a pre-replicative complex on replication origins in the G phase, ORC first binds origin DNA and triggers the binding of Cdc6 and Cdt1. These two factors recruit a putative replicative helicase and the MCM2-7. The MCM2-7 complex promotes the unwinding of DNA origins, and the binding of additional factors to initiate the DNA replication in S-phase. Cdt1 is present during G1 and early S phase of the cell cycle and is degraded during the late S, G2, and M phases. The winged helix fold structure of Cdt1_m is similar to the structures of Cdt1_c and archaeal homologues of the eukaryotic replication initiator, without apparent sequence similarity.	126
176573	cd08768	Cdc6_C	Winged-helix domain of essential DNA replication protein Cell division control protein (Cdc6), which mediates DNA binding. This model characterizes the winged-helix, C-terminal domain of the Cell division control protein (Cdc6_C). Cdc6 (also known as Cell division cycle 6 or Cdc18) functions as a regulator at the early stages of DNA replication, by helping to recruit and load the Minichromosome Maintenance Complex (MCM) onto DNA and may have additional roles in the control of mitotic entry. Precise duplication of chromosomal DNA is required for genomic stability during replication. Cdc6 has an essential role in DNA replication and irregular expression of Cdc6 may lead to genomic instability. Cdc6 over-expression is observed in many cancerous lesions. DNA replication begins when an origin recognition complex (ORC) binds to a replication origin site on the chromatin. Studies indicate that Cdc6 interacts with ORC through the Orc1 subunit, and that this association increases the specificity of the ORC-origins interaction. Further studies suggest that hydrolysis of Cdc6-bound ATP promotes the association of the replication licensing factor Cdt1 with origins through an interaction with Orc6 and this in turn promotes the loading of MCM2-7 helicase onto chromatin. The MCM2-7 complex promotes the unwinding of DNA origins, and the binding of additional factors to initiate the DNA replication. S-Cdk (S-phase cyclin and cyclin-dependent kinase complex) prevents rereplication by causing the Cdc6 protein to dissociate from ORC and prevents the Cdc6 and MCM proteins from reassembling at any origin. By phosphorylating Cdc6, S-Cdk also triggers Cdc6's ubiquitination. The Cdc6 protein is composed of three domains, an N-terminal AAA+ domain with Walker A and B, and Sensor-1 and -2 motifs. The central region contains a conserved nucleotide binding/ATPase domain and is a member of the ATPase superfamily. The C-terminal domain (Cdc6_C) is a conserved winged-helix domain that possibly mediates protein-protein interactions or direct DNA interactions. Cdc6 is conserved in eukaryotes, and related genes are found in Archaea. The winged helix fold structure of Cdc6_C is similar to the structures of other eukaryotic replication initiators without apparent sequence similarity.	87
176451	cd08769	DAP_dppA_2	Peptidase M55, D-aminopeptidase dipeptide-binding protein family. M55 Peptidase, D-Aminopeptidase dipeptide-binding protein (dppA; DAP dppA; EC 3.4.11.-) domain: Peptide transport systems are found in many bacterial species and generally function to accumulate intact peptides in the cell, where they are hydrolyzed. The dipeptide-binding protein (dppA) of Bacillus subtilis belongs to the dipeptide ABC transport (dpp) operon expressed early during sporulation. It is a binuclear zinc-dependent, D-specific aminopeptidase. The biologically active enzyme is a homodecamer with active sites buried in its channel. These self-compartmentalizing proteases are characterized by a SXDXEG motif. D-Ala-D-Ala and D-Ala-Gly-Gly are the preferred substrates. Bacillus subtilis dppA is thought to function as an adaptation to nutrient deficiency; hydrolysis of its substrate releases D-Ala which can be used subsequently as metabolic fuel. This family also contains a number of uncharacterized putative peptidases.	270
176452	cd08770	DAP_dppA_3	Peptidase M55, D-aminopeptidase dipeptide-binding protein family. M55 Peptidase, D-Aminopeptidase dipeptide-binding protein (dppA; DAP dppA; EC 3.4.11.-) domain: Peptide transport systems are found in many bacterial species and generally function to accumulate intact peptides in the cell, where they are hydrolyzed. The dipeptide-binding protein (dppA) of Bacillus subtilis belongs to the dipeptide ABC transport (dpp) operon expressed early during sporulation. It is a binuclear zinc-dependent, D-specific aminopeptidase. The biologically active enzyme is a homodecamer with active sites buried in its channel. These self-compartmentalizing proteases are characterized by a SXDXEG motif. D-Ala-D-Ala and D-Ala-Gly-Gly are the preferred substrates. Bacillus subtilis dppA is thought to function as an adaptation to nutrient deficiency; hydrolysis of its substrate releases D-Ala which can be used subsequently as metabolic fuel. This family also contains a number of uncharacterized putative peptidases.	263
206738	cd08771	DLP_1	Dynamin_like protein family includes dynamins and Mx proteins. The dynamin family of large mechanochemical GTPases includes the classical dynamins and dynamin-like proteins (DLPs) that are found throughout the Eukarya. These proteins catalyze membrane fission during clathrin-mediated endocytosis. Dynamin consists of five domains; an N-terminal G domain that binds and hydrolyzes GTP, a middle domain (MD) involved in self-assembly and oligomerization, a pleckstrin homology (PH) domain responsible for interactions with the plasma membrane, GED, which is also involved in self-assembly, and a proline arginine rich domain (PRD) that interacts with SH3 domains on accessory proteins. To date, three vertebrate dynamin genes have been identified; dynamin 1, which is brain specific, mediates uptake of synaptic vesicles in presynaptic terminals; dynamin-2 is expressed ubiquitously and similarly participates in membrane fission; mutations in the MD, PH and GED domains of dynamin 2 have been linked to human diseases such as Charcot-Marie-Tooth peripheral neuropathy and rare forms of centronuclear myopathy. Dynamin 3 participates in megakaryocyte progenitor amplification, and is also involved in cytoplasmic enlargement and the formation of the demarcation membrane system. This family also includes interferon-induced Mx proteins that inhibit a wide range of viruses by blocking an early stage of the replication cycle. Dynamin oligomerizes into helical structures around the neck of budding vesicles in a GTP hydrolysis-dependent manner.	278
350091	cd08772	GH43_62_32_68_117_130	Glycosyl hydrolase families: GH43, GH62, GH32, GH68, GH117, GH130. Members of the glycosyl hydrolase families 32, 43, 62, 68, 117 and 130 (GH32, GH43, GH62, GH68, GH117, GH130) all possess 5-bladed beta-propeller domains and comprise clans F and J, as classified by the carbohydrate-active enzymes database (CAZY). Clan F consists of families GH43 and GH62. GH43 includes beta-xylosidases (EC 3.2.1.37), beta-xylanases (EC 3.2.1.8), alpha-L-arabinases (EC 3.2.1.99), and alpha-L-arabinofuranosidases (EC 3.2.1.55), using aryl-glycosides as substrates, while family GH62 contains alpha-L-arabinofuranosidases (EC 3.2.1.55) that specifically cleave either alpha-1,2 or alpha-1,3-L-arabinofuranose sidechains from xylans. These are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Clan J consists of families GH32 and GH68. GH32 comprises sucrose-6-phosphate hydrolases, invertases (EC 3.2.1.26), inulinases (EC 3.2.1.7), levanases (EC 3.2.1.65), eukaryotic fructosyltransferases, and bacterial fructanotransferases while GH68 consists of fructosyltransferases (FTFs) that include levansucrase (EC 2.4.1.10); beta-fructofuranosidase (EC 3.2.1.26); inulosucrase (EC 2.4.1.9), all of which use sucrose as their preferential donor substrate. Members of this clan are retaining enzymes (i.e. they retain the configuration at the anomeric carbon atom of the substrate) that catalyze hydrolysis in two steps involving a covalent glycosyl enzyme intermediate: an aspartate located close to the N-terminus acts as the catalytic nucleophile and a glutamate acts as the general acid/base; a conserved aspartate residue in the Arg-Asp-Pro (RDP) motif stabilizes the transition state. Structures of all families in the two clans manifest a funnel-shaped active site that comprises two subsites with a single route for access by ligands. Also included in this superfamily are GH117 enzymes that have exo-alpha-1,3-(3,6-anhydro)-l-galactosidase activity, removing terminal non-reducing alpha-1,3-linked 3,6-anhydro-l-galactose residues from their neoagarose substrate, and GH130 that are phosphorylases and hydrolases for beta-mannosides, involved in the bacterial utilization of mannans or N-linked glycans.	257
176798	cd08773	FpgNei_N	N-terminal domain of Fpg (formamidopyrimidine-DNA glycosylase, MutM)_Nei (endonuclease VIII) base-excision repair DNA glycosylases. DNA glycosylases maintain genome integrity by recognizing base lesions created by ionizing radiation, alkylating or oxidizing agents, and endogenous reactive oxygen species. These enzymes initiate the base-excision repair process, which is completed with the help of enzymes such as phosphodiesterases, AP endonucleases, DNA polymerases and DNA ligases. DNA glycosylases cleave the N-glycosyl bond between the sugar and the damaged base, creating an AP (apurinic/apyrimidinic) site. The FpgNei DNA glycosylases represent one of the two structural superfamilies of DNA glycosylases that recognize oxidized bases (the other is the HTH-GPD superfamily exemplified by Escherichia coli Nth). Most FpgNei DNA glycosylases use their N-terminal proline residue as the key catalytic nucleophile, and the reaction proceeds via a Schiff base intermediate. One exception is mouse Nei-like glycosylase 3 (Neil3) which forms a Schiff base intermediate via its N-terminal valine. In addition to this FpgNei_N domain, FpgNei proteins have a helix-two-turn-helix (H2TH) domain and a zinc (or zincless)-finger motif which also contribute residues to the active site. FpgNei DNA glycosylases have a broad substrate specificity. They are bifunctional: in addition to the glycosylase (recognition) activity, they have a lyase (cleaving) activity on the phosphodiester backbone of the DNA at the AP site. This superfamily includes eukaryotic, bacterial, and viral proteins.	117
206755	cd08774	14-3-3	14-3-3 domain. 14-3-3 domain is an essential part of 14-3-3 proteins, a ubiquitous class of regulatory, phosphoserine/threonine-binding proteins found in all eukaryotic cells, including yeast, protozoa and mammalian cells. 14-3-3 proteins play important roles in many biological processes that are regulated by phosphorylation, including cell cycle regulation, cell proliferation, protein trafficking, metabolic regulation and apoptosis. More than 300 binding partners of the 14-3-3 domain have been identified in all subcellular compartments and include transcription factors, signaling molecules, tumor suppressors, biosynthetic enzymes, cytoskeletal proteins and apoptosis factors. 14-3-3 binding can alter the conformation, localization, stability, phosphorylation state, activity as well as molecular interactions of a target protein. They function only as dimers, some preferring strictly homodimeric interaction, while others form heterodimers. Binding of the 14-3-3 domain to its target occurs in a phosphospecific manner where it binds to one of two consensus sequences of their target proteins; RSXpSXP (mode-1) and RXXXpSXP (mode-2). In some instances, 14-3-3 domain containing proteins are involved in regulation and signaling of a number of cellular processes in a phosphorylation-independent manner. Many organisms express multiple isoforms: there are seven mammalian 14-3-3 family members (beta, gamma, eta, theta, epsilon, sigma, zeta), each encoded by a distinct gene, while plants contain up to 13 isoforms. The flexible C-terminal segment of 14-3-3 isoforms shows the highest sequence variability and may significantly contribute to individual isoform uniqueness by playing an important regulatory role by occupying the ligand binding groove and blocking the binding of inappropriate ligands in a distinct manner. Elevated amounts of 14-3-3 proteins are found in the cerebrospinal fluid of patients with Creutzfeldt-Jakob disease. In protozoa, like Plasmodium or Cryptosporidium parvum, 14-3-3 proteins play an important role in key steps of parasite development.	225
176753	cd08775	DED_Caspase-like_r2	Death effector domain, repeat 2, of initiator caspase-like proteins. Death Effector Domain (DED), second repeat, found in initiator caspase-like proteins like caspase-8, -10 and c-FLIP. Caspases are aspartate-specific cysteine proteases with functions in apoptosis and immune signaling. Initiator caspases are the first to be activated following death- or inflammation-inducing signals. Caspase-8 and -10 are the initiators of death receptor mediated apoptosis. Together with FADD and the pseudo-caspase c-FLIP, they form the death-inducing signaling complex (DISC), whose formation is triggered by the activation of type 1 tumor necrosis factor (TNF) receptors such as Fas, TNF receptor 1, and TRAIL receptor. Caspase-8 and -10 also play important functions in cell adhesion and motility. c-FLIP is a catalytically inactive homolog of the initiator procaspases-8 and -10. It negatively influences apoptotic signaling by interfering with the efficient formation of DISC. All members contain two N-terminal DED domains and a C-terminal caspase domain. DEDs comprise a subfamily of the Death Domain (DD) superfamily. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and CARD (Caspase activation and recruitment domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	81
176754	cd08776	DED_Caspase-like_r1	Death effector domain, repeat 1, of initiator caspase-like proteins. Death Effector Domain (DED), first repeat, found in initiator caspase-like proteins, like caspase-8 and -10 and c-FLIP. Caspases are aspartate-specific cysteine proteases with functions in apoptosis and immune signaling. Initiator caspases are the first to be activated following death- or inflammation-inducing signals. Caspase-8 and -10 are the initiators of death receptor mediated apoptosis. Together with FADD and the pseudo-caspase c-FLIP, they form the death-inducing signaling complex (DISC), whose formation is triggered by the activation of type 1 tumor necrosis factor (TNF) receptors such as Fas, TNF receptor 1, and TRAIL receptor. Caspase-8 and -10 also play important functions in cell adhesion and motility. c-FLIP is a catalytically inactive homolog of the initiator procaspases-8 and -10. It negatively influences apoptotic signaling by interfering with the efficient formation of DISC. All members contain two N-terminal DED domains and a C-terminal caspase domain. DEDs comprise a subfamily of the Death Domain (DD) superfamily. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and CARD (Caspase activation and recruitment domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	71
260048	cd08777	Death_RIP1	Death Domain of Receptor-Interacting Protein 1. Death domain (DD) found in Receptor-Interacting Protein 1 (RIP1) and related proteins. RIP kinases serve as essential sensors of cellular stress. Vertebrates contain several types containing a homologous N-terminal kinase domain and varying C-terminal domains. RIP1 harbors a C-terminal DD, which binds death receptors (DRs) including TNF receptor 1, Fas, TNF-related apoptosis-inducing ligand receptor 1 (TRAILR1), and TRAILR2. It also interacts with other DD-containing adaptor proteins such as TRADD and FADD. RIP1 plays a crucial role in determining a cell's fate, between survival or death, following exposure to stress signals. It is important in the signaling of NF-kappaB and MAPKs, and it links DR-associated signaling to reactive oxygen species (ROS) production. Abnormal RIP1 function may result in ROS accumulation affecting inflammatory responses, innate immunity, stress responses, and cell survival. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	86
176756	cd08778	Death_TNFRSF21	Death domain of tumor necrosis factor receptor superfamily member 21. Death domain (DD) found in tumor necrosis factor receptor superfamily member 21 (TNFRSF21), also called death receptor-6, DR6. DR6 is an orphan receptor that is expressed ubiquitously, but shows high expression in lymphoid organs, heart, brain and pancreas. Results from DR6(-/-) mice indicate that DR6 plays an important regulatory role for the generation of adaptive immunity. It may also be involved in tumor cell survival and immune evasion. In neuronal cells, it binds beta-amyloid precursor protein (APP) and activates caspase-dependent cell death. It may contribute to the pathogenesis of Alzheimer's disease. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	84
260049	cd08779	Death_PIDD	Death Domain of p53-induced protein with a death domain. Death domain (DD) found in PIDD (p53-induced protein with a death domain) and similar proteins. PIDD is a component of the PIDDosome complex, which is an oligomeric caspase-activating complex involved in caspase-2 activation and plays a role in mediating stress-induced apoptosis. The PIDDosome complex is composed of three components, PIDD, RAIDD and caspase-2, which interact through their DDs and DD-like domains. The DD of PIDD interacts with the DD of RAIDD, which also contains a Caspase Activation and Recruitment Domain (CARD) that interacts with the caspase-2 CARD. Autoproteolysis of PIDD determines the downstream signaling event, between pro-survival NF-kB or pro-death caspase-2 activation. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD, DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	86
260050	cd08780	Death_TRADD	Death Domain of Tumor Necrosis Factor Receptor 1-Associated Death Domain protein. Death domain (DD) of TRADD (TNF Receptor 1-Associated Death Domain or TNFRSF1A-associated via death domain) protein. TRADD is a central signaling adaptor for TNF-receptor 1 (TNFR1), mediating activation of Nuclear Factor-kappaB (NF-kB) and c-Jun N-terminal kinase (JNK), as well as caspase-dependent apoptosis. It also carries important immunological roles including germinal center formation, DR3-mediated T-cell stimulation, and TNFalpha-mediated inflammatory responses. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	90
260051	cd08781	Death_UNC5-like	Death domain found in Uncoordinated-5 homolog family. Death Domain (DD) found in Uncoordinated-5 (UNC-5) homolog family, which includes Unc5A, B, C and D in vertebrates. UNC5 proteins are receptors for secreted netrins (netrin-1, -3 and -4) that are involved in diverse processes like axonal guidance, neuronal migration, blood vessel patterning, and apoptosis. They are transmembrane proteins with an extracellular domain consisting of two immunoglobulin repeats, two thrombospondin type-I modules and an intracellular region containing a ZU-5 domain, UPA domain and a DD. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	83
260052	cd08782	Death_DAPK1	Death domain found in death-associated protein kinase 1. Death domain (DD) found in death-associated protein kinase 1 (DAPK1). DAPK1 is composed of several functional domains, including a kinase domain, a CaM regulatory domain, ankyrin repeats, a cytoskeletal-binding domain and a C-terminal DD. It plays important roles in a diverse range of signal transduction pathways including apoptosis, growth factor signalling, and autophagy. Loss of DAPK1 expression, usually because of DNA methylation, is implicated in many tumor types. DAPK1 is highly abundant in the brain and has also been associated with neurodegeneration. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	82
260053	cd08783	Death_MALT1	Death domain similar to that found in Mucosa-associated lymphoid tissue-lymphoma-translocation gene 1. Death domain (DD) similar to that found in Malt1 (mucosa-associated lymphoid tissue-lymphoma-translocation gene 1). Malt1, together with Bcl10 (B-cell lymphoma 10), are the integral components of the CBM signalosome. They associate with CARD9 to form M-CBM (CBM complex in myeloid immune cells) and with CARMA1 to form L-CBM (CBM complex in lymphoid immune cells), to mediate activation of NF-kB and MAPK by ITAM-coupled receptors expressed on immune cells. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	96
260054	cd08784	Death_DRs	Death Domain of Death Receptors. Death domain (DD) found in death receptor proteins. Death receptors are members of the tumor necrosis factor (TNF) receptor superfamily, characterized by having a cytoplasmic DD. Known members of the family are Fas (CD95/APO-1), TNF-receptor 1 (TNFR1/TNFRSF1A/p55/CD120a), TNF-related apoptosis-inducing ligand receptor 1 (TRAIL-R1 /DR4), and receptor 2 (TRAIL-R2/DR5/APO-2/KILLER), as well as Death Receptor 3 (DR3/APO-3/TRAMP/WSL-1/LARD). They are involved in apoptosis signaling pathways. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	80
260055	cd08785	CARD_CARD9-like	Caspase activation and recruitment domain of CARD9 and related proteins. Caspase activation and recruitment domain (CARD) found in CARD9, CARD14 (CARMA2), CARD10 (CARMA3), CARD11 (CARMA1) and BCL10. BCL10 (B-cell lymphoma 10), together with Malt1 (mucosa-associated lymphoid tissue-lymphoma-translocation gene 1), are integral components of the CBM signalosome. They associate with CARD9 to form M-CBM (CBM complex in myeloid immune cells), and with CARD11 to form L-CBM (CBM complex in lymphoid immune cells), which mediates activation of NF-kB and MAPK by ITAM-coupled receptors expressed on immune cells. BCL10/Malt1 also associates with CARD10, which is more widely expressed and is not restricted to hematopoietic cells, to play a role in GPCR-induced NF-kB activation. CARD14 has also been shown to associate with BCL10. In general, CARDs are death domains (DDs) found associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation, and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	84
176764	cd08786	CARD_RIP2_CARD3	Caspase activation and recruitment domain of Receptor Interacting Protein 2. Caspase activation and recruitment domain (CARD) of Receptor Interacting Protein 2 (RIP2/RIPK2/RICK/CARDIAK/CARD3). RIP kinases serve as essential sensors of cellular stress. Vertebrates contain several types containing a homologous N-terminal kinase domain and varying C-terminal domains. RIP2 harbors a C-terminal CARD domain and functions as an effector kinase downstream of the pattern recognition receptors from the Nod-like (NLR)-family, NOD1 and NOD2, which recognizes bacterial peptidoglycans released upon infection. This cascade is implicated in inflammatory immune responses and the clearance of intracellular pathogens. RIP2 associates with NOD1 and NOD2 via CARD-CARD interactions. In general, CARDs are death domains (DDs) found associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation, and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	87
176765	cd08787	CARD_NOD2_1_CARD15	Caspase activation and recruitment domain of NOD2, repeat 1. Caspase activation and recruitment domain (CARD) similar to that found in human NOD2 (CARD15), repeat 1. NOD2 is a member of the Nod-like receptor (NLR) family, which plays a central role in the innate immune response. NLRs typically contain an N-terminal effector domain, a central nucleotide-binding domain and a C-terminal ligand-binding region of several leucine-rich repeats (LRRs). In NOD2, as well as NOD1, the N-terminal effector domain is a CARD. NOD2 contains two N-terminal CARD repeats. Mutations in NOD2 have been associated with Crohn's disease and Blau syndrome. Nod2-CARDs have been shown to interact with the CARD domain of the downstream effector RICK (RIP2, CARDIAK), a serine/threonine kinase. In general, CARDs are death domains (DDs) found associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation, and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	87
260056	cd08788	CARD_NOD2_2_CARD15	Caspase activation and recruitment domain of NOD2, repeat 2. Caspase activation and recruitment domain (CARD) similar to that found in human NOD2 (CARD15), repeat 2. NOD2 is a member of the Nod-like receptor (NLR) family, which plays a central role in the innate immune response. NLRs typically contain an N-terminal effector domain, a central nucleotide-binding domain and a C-terminal ligand-binding region of several leucine-rich repeats (LRRs). In NOD2, as well as NOD1, the N-terminal effector domain is a CARD. NOD2 contains two N-terminal CARD repeats. Mutations in NOD2 have been associated with Crohn's disease and Blau syndrome. Nod2-CARDs have been shown to interact with the CARD domain of the downstream effector RICK (RIP2, CARDIAK), a serine/threonine kinase. In general, CARDs are death domains (DDs) found associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation, and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	81
260057	cd08789	CARD_IPS-1_RIG-I	Caspase activation and recruitment domains (CARDs) found in IPS-1 and RIG-I-like RNA helicases. Caspase activation and recruitment domains (CARDs) found in IPS-1 (Interferon beta promoter stimulator protein 1) and Retinoic acid Inducible Gene I (RIG-I)-like DEAD box helicases. RIG-I-like helicases and IPS-1 play important roles in the induction of interferons in response to viral infection. They are crucial in triggering innate immunity and in developing adaptive immunity against viral pathogens. RIG-I-like helicases, including MDA5 and RIG-I, contain two N-terminal CARD domains and a C-terminal DEAD box RNA helicase domain. They are cytoplasmic RNA helicases that play an important role in host antiviral response by sensing incoming viral RNA. Upon activation, the signal is transferred to downstream pathways via the adaptor molecule IPS-1 (MAVS, VISA, CARDIF), leading to the induction of type I interferons. MDA5 and RIG-I associate with IPS-1 through a CARD-CARD interaction. In general, CARDs are death domains (DDs) found associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation, and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	91
260058	cd08790	DED_DEDD	Death Effector Domain of DEDD. Death Effector Domain (DED) found in DEDD. DEDD has been shown to block mitotic progression by inhibiting Cdk1 and to be involved in regulating the insulin signaling cascade. DEDD can bind to itself, to DEDD2, and to the two tandem DED-containing caspases, caspase-8 and -10. In general, DEDs comprise a subfamily of the Death Domain (DD) superfamily. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and CARD (Caspase activation and recruitment domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	97
176769	cd08791	DED_DEDD2	Death Effector Domain of DEDD2. Death Effector Domain (DED) found in DEDD2. DEDD2 has been shown to bind to itself, DEDD, and to the two tandem DED-containing caspases, caspase-8 and -10. It may play a role in apoptosis. In general, DEDs comprise a subfamily of the Death Domain (DD) superfamily. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and CARD (Caspase activation and recruitment domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. In mammals, they are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways.	106
260059	cd08792	DED_Caspase_8_10_r1	Death effector domain, repeat 1, of initiator caspases 8 and 10. Death Effector Domain (DED) found in caspase-8 and caspase-10, repeat 1. Caspases are aspartate-specific cysteine proteases with functions in apoptosis and immune signaling. Initiator caspases are the first to be activated following death- or inflammation-inducing signals. Caspase-8 and -10 are the initiators of death receptor mediated apoptosis, and they play partially redundant roles. Together with FADD and the pseudo-caspase c-FLIP, they form the death-inducing signaling complex (DISC), whose formation is triggered by the activation of type 1 tumor necrosis factor (TNF) receptors such as Fas, TNF receptor 1, and TRAIL receptor. Caspase-8 and -10 also play important functions in cell adhesion and motility. They contain two N-terminal DED domains and a C-terminal caspase domain. DEDs comprise a subfamily of the Death Domain (DD) superfamily. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and CARD (Caspase activation and recruitment domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	77
260060	cd08793	Death_IRAK4	Death domain of Interleukin-1 Receptor-Associated Kinase 4. Death Domain (DD) of Interleukin-1 Receptor-Associated Kinase 4 (IRAK4). IRAKs are essential components of innate immunity and inflammation in mammals and other vertebrates. They are involved in signal transduction pathways involving IL-1 and IL-18 receptors, Toll-like receptors, nuclear factor-kappaB, and mitogen-activated protein kinases. IRAKs contain an N-terminal DD domain and a C-terminal kinase domain. IRAK4 is an active kinase that is also involved in T-cell receptor signaling pathways, implying that it may function in acquired immunity and not just in innate immunity. It is known as the master IRAK member because its absence strongly impairs TLR- and IL-1-mediated signaling and innate immune defenses, while the absence of other IRAK proteins only shows slight effects. IRAK4-deficient patients have impaired inflammatory responses and recurrent life-threatening infections. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	100
260061	cd08794	Death_IRAK1	Death domain of Interleukin 1 Receptor Associated Kinase-1. Death Domain (DD) of Interleukin-1 Receptor-Associated Kinase 1 (IRAK1). IRAKs are essential components of innate immunity and inflammation in mammals and other vertebrates. They are involved in signal transduction pathways involving IL-1 and IL-18 receptors, Toll-like receptors, nuclear factor-kappaB (NF-kB), and mitogen-activated protein kinases (MAPKs). IRAKs contain an N-terminal DD domain and a C-terminal kinase domain. IRAK1 is an active kinase and also plays adaptor functions. It binds to the MyD88-IRAK4 complex via its DD, which facilitates its phosphorylation by IRAK4, activating it for further auto-phosphorylation. Hyper-phosphorylated IRAK1 forms a cytosolic complex with TRAF6, leading to the activation of NF-kB and MAPK pathways. IRAK1 is involved in autoimmunity and may be associated with lupus pathogenesis. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	84
176773	cd08795	Death_IRAK2	Death domain of Interleukin 1 Receptor Associated Kinase-2. Death Domain (DD) of Interleukin-1 Receptor-Associated Kinase 2 (IRAK2). IRAKs are essential components of innate immunity and inflammation in mammals and other vertebrates. They are involved in signal transduction pathways involving IL-1 and IL-18 receptors, Toll-like receptors (TLRs), nuclear factor-kappaB (NF-kB), and mitogen-activated protein kinases (MAPKs). IRAKs contain an N-terminal DD domain and a C-terminal kinase domain. IRAK2 is an essential component of several signaling pathways, including NF-kappaB and the IL-1 signaling pathways. It is an inactive kinase that participates in septic shock mediated by TLR4 and TLR9. It plays a redundant role with IRAK1 in early NF-kB and MAPK responses, and remains present at later stages whereas IRAK1 disappears. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	88
260062	cd08796	Death_IRAK-M	Death domain of Interleukin 1 Receptor Associated Kinase-M. Death Domain (DD) of Interleukin-1 Receptor-Associated Kinase M (IRAK-M). IRAKs are essential components of innate immunity and inflammation in mammals and other vertebrates. They are involved in signal transduction pathways involving IL-1 and IL-18 receptors, Toll-like receptors (TLRs), nuclear factor-kappaB (NF-kB), and mitogen-activated protein kinases (MAPKs). IRAKs contain an N-terminal DD domain and a C-terminal kinase domain. IRAK-M, also called IRAK-3, is an inactive kinase present only in macrophages in an inducible manner. It is a negative regulator of TLR signaling and it contributes to the attenuation of NF-kB activation. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	89
260063	cd08797	Death_NFkB1_p105	Death domain of the Nuclear Factor-KappaB1 precursor protein p105. Death Domain (DD) of the Nuclear Factor-KappaB1 (NF-kB1) precursor protein p105. The NF-kB family of transcription factors play a central role in cardiovascular growth, stress response, and inflammation by controlling the expression of a network of different genes. There are five NF-kB proteins, all containing an N-terminal REL Homology Domain (RHD). NF-kB1 (or p50) is produced from the processing of the precursor protein p105, which contains ANK repeats and a C-terminal DD in addition to the RHD. It is regulated by the classical (or canonical) NF-kB pathway. In the cytosol, p50 forms an inactive complex with RelA (or p65) and the Inhibitor of NF-kB (IkB). Activation is triggered by the phosphorylation and degradation of IkB, resulting in the active DNA-binding p50-RelA dimer to migrate to the nucleus. The classical pathway regulates the majority of genes activated by NF-kB including those encoding cytokines, chemokines, leukocyte adhesion molecules, and anti-apoptotic factors. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	76
176776	cd08798	Death_NFkB2_p100	Death domain of the Nuclear Factor-KappaB2 precursor protein p100. Death Domain (DD) of the Nuclear Factor-KappaB2 (NF-kB2) precursor protein p100. The NF-kB family of transcription factors play a central role in cardiovascular growth, stress response, and inflammation by controlling the expression of a network of different genes. There are five NF-kB proteins, all containing an N-terminal REL Homology Domain (RHD). NF-kB2 (or p52) is produced from the processing of the precursor protein p100, which contains ANK repeats and a C-terminal DD in addition to the RHD. It is regulated by the non-canonical NF-kB pathway. The p100 precursor is cytosolic and interacts with RelB. Upon phosphorylation by IKKalpha, p100 is processed to its 52kDa active, DNA-binding form and the p52/RelB complex is translocated into the nucleus. The non-canonical pathway plays a role in adaptive immunity and lymphorganogenesis. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	76
260064	cd08799	Death_UNC5C	Death domain found in Uncoordinated-5C. Death Domain (DD) found in Uncoordinated-5C (UNC5C). UNC5C is part of the UNC-5 homolog family. It is a receptor for the secreted netrin-1 and plays a role in axonal guidance, angiogenesis, and apoptosis. UNC5C plays a critical role in the development of spinal accessory motor neurons. Methylation of the UNC5C gene is associated with early stages of colorectal carcinogenesis. UNC5 proteins are transmembrane proteins with an extracellular domain consisting of two immunoglobulin repeats, two thrombospondin type-I modules and an intracellular region containing a ZU-5 domain, UPA domain and a DD. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	83
260065	cd08800	Death_UNC5A	Death domain found in Uncoordinated-5A. Death Domain (DD) found in Uncoordinated-5A (UNC5A). UNC5A is part of the UNC-5 homolog family. It is a receptor for the secreted netrin-1 and plays a critical role in neuronal development and differentiation, as well as axon-guidance. It also plays a role in regulating apoptosis in non-neuronal cells as a downstream target of p53. UNC5 proteins are transmembrane proteins with an extracellular domain consisting of two immunoglobulin repeats, two thrombospondin type-I modules and an intracellular region containing a ZU-5 domain, UPA domain and a DD. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	84
176779	cd08801	Death_UNC5D	Death domain found in Uncoordinated-5D. Death Domain (DD) found in Uncoordinated-5D (UNC5D). UNC5D is part of the UNC-5 homolog family. It is a receptor for the secreted netrin-1 and plays a role in axonal guidance, angiogenesis, and apoptosis. UNC5 proteins are transmembrane proteins with an extracellular domain consisting of two immunoglobulin repeats, two thrombospondin type-I modules and an intracellular region containing a ZU-5 domain, UPA domain and a DD. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	98
176780	cd08802	Death_UNC5B	Death domain found in Uncoordinated-5B. Death Domain (DD) found in Uncoordinated-5B (UNC5B). UNC5B is part of the UNC-5 homolog family. It is a receptor for the secreted netrin-1 and plays a role in axonal guidance, angiogenesis, and apoptosis. UNC5B signaling is involved in the netrin-1-induced proliferation and migration of renal proximal tubular cells. It is also required for vascular patterning during embryonic development, and its activation inhibits sprouting angiogenesis. UNC5 proteins are transmembrane proteins with an extracellular domain consisting of two immunoglobulin repeats, two thrombospondin type-I modules and an intracellular region containing a ZU-5 domain, UPA domain and a DD. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	84
176781	cd08803	Death_ank3	Death domain of Ankyrin-3. Death Domain (DD) of the human protein ankyrin-3 (ANK-3) and related proteins. Ankyrins are modular proteins comprising three conserved domains, an N-terminal membrane-binding domain containing ANK repeats, a spectrin-binding domain and a C-terminal DD. ANK-3, also called ankyrin-G (for general or giant), is found in neurons and at least one splice variant has been shown to be essential for propagation of action potentials as a binding partner to neurofascin and voltage-gated sodium channels. It is required for maintaining axo-dendritic polarity, and may be a genetic risk factor associated with bipolar disorder. ANK-3 may also play roles in other cell types. Mutations affecting ANK-3 pathways for Na channel localization are associated with Brugada syndrome, a potentially fatal arrhythmia. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	84
260066	cd08804	Death_ank2	Death domain of Ankyrin-2. Death Domain (DD) of Ankyrin-2 (ANK-2) and related proteins. Ankyrins are modular proteins comprising three conserved domains, an N-terminal membrane-binding domain containing ANK repeats, a spectrin-binding domain and a C-terminal DD. ANK-2, also called ankyrin-B (for broadly expressed), is required for proper function of the Na/Ca ion exchanger-1 in cardiomyocytes, and is thought to function in linking integral membrane proteins to the underlying cytoskeleton. Human ANK-2 is associated with "Ankyrin-B syndrome", an atypical arrhythmia disorder with risk of sudden cardiac death. It also plays key roles in the brain and striated muscle. Loss of ANK-2 is associated with significant nervous system defects and sarcomere disorganization. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	84
260067	cd08805	Death_ank1	Death domain of Ankyrin-1. Death Domain (DD) of the human protein ankyrin-1 (ANK-1) and related proteins. Ankyrins are modular proteins comprising three conserved domains, an N-terminal membrane-binding domain containing ANK repeats, a spectrin-binding domain and a C-terminal DD. ANK-1, also called ankyrin-R (for restricted), is found in brain, muscle, and erythrocytes and is thought to function in linking integral membrane proteins to the underlying cytoskeleton. It plays a critical nonredundant role in erythroid development and is associated with hereditary spherocytosis (HS), a common disorder of the red cell membrane. The small alternatively-spliced variant, sANK-1, found in striated muscle and concentrated in the sarcoplasmic reticulum (SR) binds obscurin and titin, which facilitates the anchoring of the network SR to the contractile apparatus. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	84
260068	cd08806	CARD_CARD14_CARMA2	Caspase activation and recruitment domain of CARD14-like proteins. Caspase activation and recruitment domain (CARD) similar to that found in CARD14, also known as BIMP2 or CARMA2 (caspase recruitment domain-containing membrane-associated guanylate kinase protein 2). CARD14 has been identified as a novel member of the MAGUK (membrane-associated guanylate kinase) family that functions as upstream activators of BCL10 (B-cell lymphoma 10) and NF-kB signaling. In general, CARDs are death domains (DDs) found associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation, and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	86
260069	cd08807	CARD_CARD10_CARMA3	Caspase activation and recruitment domain of CARD10-like proteins. Caspase activation and recruitment domain (CARD) similar to that found in CARD10, also known as CARMA3 (caspase recruitment domain-containing membrane-associated guanylate kinase protein 3) or BIMP1. The CARMA3-BCL10-MALT1 signalosome plays a role in the GPCR-induced NF-kB activation. CARMA3 is more widely expressed than CARMA1, which is found only in hematopoietic cells. In endothelial and smooth muscle cells, CARMA3-mediated NF-kB activation induces pro-inflammatory signals within the vasculature and is a key factor in atherogenesis. In bronchial epithelial cells, CARMA3-mediated NF-kB signaling is important for the development of allergic airway inflammation. In general, CARDs are death domains (DDs) found associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation, and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	86
260070	cd08808	CARD_CARD11_CARMA1	Caspase activation and recruitment domain of CARD11-like proteins. Caspase activation and recruitment domain (CARD) similar to that found in CARD11, also known as caspase recruitment domain-containing membrane-associated guanylate kinase protein 1 (CARMA1). CARMA1, together with BCL10 (B-cell lymphoma 10) and Malt1 (mucosa-associated lymphoid tissue-lymphoma-translocation gene 1), form the L-CBM signalosome (CBM complex in lymphoid immune cells) which mediates activation of NF-kB and MAPK by ITAM-coupled receptors expressed on immune cells. CARMA1 associates with BCL10 via a CARD-CARD interaction. In general, CARDs are death domains (DDs) found associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation, and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	86
260071	cd08809	CARD_CARD9	Caspase activation and recruitment domain of CARD9-like proteins. Caspase activation and recruitment domain (CARD) similar to that found in CARD9. CARD9 is a central regulator of innate immunity and is highly expressed in dendritic cells and macrophages. Together with BCL10 (B-cell lymphoma 10) and Malt1 (mucosa-associated lymphoid tissue-lymphoma-translocation gene 1), it forms the M-CBM signalosome (the CBM complex in myeloid immune cells), which mediates activation of NF-kB and MAPK by ITAM-coupled receptors expressed on immune cells. CARD9 associates with BCL10 via a CARD-CARD interaction. In general, CARDs are death domains (DDs) found associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation, and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	86
260072	cd08810	CARD_BCL10	Caspase activation and recruitment domain of B-cell lymphoma 10. Caspase activation and recruitment domain (CARD) similar to that found in BCL10 (B-cell lymphoma 10). BCL10 and Malt1 (mucosa-associated lymphoid tissue-lymphoma-translocation gene 1) are the integral components of CBM signalosomes. They associate with CARD9 to form M-CBM (CBM complex in myeloid immune cells) and with CARMA1 to form L-CBM (CBM complex in lymphoid immune cells), to mediate activation of NF-kB and MAPK by ITAM-coupled receptors expressed on immune cells. Both CARMA1 and CARD9 associate with BCL10 via a CARD-CARD interaction. In general, CARDs are death domains (DDs) found associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation, and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	85
260073	cd08811	CARD_IPS1	Caspase activation and recruitment domain (CARD) found in IPS-1. Caspase activation and recruitment domain (CARD) found in IPS-1 (Interferon beta promoter stimulator protein 1), also known as CARDIF, VISA or MAVS. IPS-1 is an adaptor protein that plays an important role in interferon induction in response to viral infection. It is crucial in triggering innate immunity and in developing adaptive immunity against viral pathogens. The CARD of IPS-1 associates with the CARDs of two RNA helicases, RIG-I and MDA5, which bind viral RNA in the cytoplasm during the initial stage of intracellular antiviral response, leading to the induction of type I interferons. In general, CARDs are death domains (DDs) found associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation, and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	92
176791	cd08813	DED_Caspase_8_r2	Death Effector Domain, repeat 2, of Caspase-8. Death effector domain (DED) found in caspase-8 (CASP8, FLICE), repeat 2. Caspases are aspartate-specific cysteine proteases with functions in apoptosis and immune signaling. Initiator caspases are the first to be activated following death- or inflammation-inducing signals. Caspase-8 is an initiator of death receptor mediated apoptosis. Together with FADD, caspase-10, and the pseudo-caspase c-FLIP, it forms the death-inducing signaling complex (DISC), whose formation is triggered by the activation of type 1 tumor necrosis factor (TNF) receptors such as Fas, TNF receptor 1, and TRAIL receptor. Caspase-8 also plays many important non-apoptotic functions including roles in embryonic development, cell adhesion and motility, immune cell proliferation and differentiation, T-cell activation, and NFkappaB signaling. It contains two N-terminal DED domains and a C-terminal caspase domain. DEDs comprise a subfamily of the Death Domain (DD) superfamily. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and CARD (Caspase activation and recruitment domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	83
260074	cd08814	DED_Caspase_10_r2	Death Effector Domain, repeat 2, of Caspase-10. Death effector domain (DED) found in Caspase-10, repeat 2. Caspases are aspartate-specific cysteine proteases with functions in apoptosis and immune signaling. Initiator caspases are the first to be activated following death- or inflammation-inducing signals. Caspase-10 is an initiator of death receptor mediated apoptosis. Together with FADD, caspase-8 and the pseudo-caspase c-FLIP, it forms the death-inducing signaling complex (DISC), whose formation is triggered by the activation of type 1 tumor necrosis factor (TNF) receptors such as Fas, TNF receptor 1, and TRAIL receptor. It contains two N-terminal DED domains and a C-terminal caspase domain. DEDs comprise a subfamily of the Death Domain (DD) superfamily. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and CARD (Caspase activation and recruitment domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	79
176793	cd08815	Death_TNFRSF25_DR3	Death domain of Tumor Necrosis Factor Receptor superfamily 25. Death Domain (DD) found in Tumor Necrosis Factor (TNF) receptor superfamily 25 (TNFRSF25), also known as TRAMP (TNF receptor-related apoptosis-mediating protein), LARD, APO-3, WSL-1, or DR3 (Death Receptor-3). TNFRSF25 is primarily expressed in T cells, is activated by binding to its ligand TL1A, and plays an important role in T-cell function. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	77
260075	cd08816	CARD_RIG-I_r1	Caspase activation and recruitment domain found in RIG-I, first repeat. Caspase activation and recruitment domain (CARD) found in RIG-I (Retinoic acid Inducible Gene I, also known as Ddx58), first repeat. RIG-I is a cytoplasmic RNA helicase that plays an important role in host antiviral response by sensing incoming viral RNA. RIG-I contains two N-terminal CARD domains and a C-terminal RNA helicase. Upon activation, the signal is transferred to downstream pathways via the adaptor molecule IPS-1 (MAVS, VISA, CARDIF), leading to the induction of type I interferons. Although very similar in sequence, RIG-I recognizes different sets of viruses compared to MDA5, a related RNA helicase. RIG-I associates with IPS-1 through a CARD-CARD interaction. In general, CARDs are death domains (DDs) found associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation, and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	90
260076	cd08817	CARD_RIG-I_r2	Caspase activation and recruitment domain found in RIG-I, second repeat. Caspase activation and recruitment domain (CARD) found in RIG-I (Retinoic acid Inducible Gene I, also known as Ddx58), second repeat. RIG-I is a cytoplasmic RNA helicase that plays an important role in host antiviral response by sensing incoming viral RNA. RIG-I contains two N-terminal CARD domains and a C-terminal RNA helicase. Upon activation, the signal is transferred to downstream pathways via the adaptor molecule IPS-1 (MAVS, VISA, CARDIF), leading to the induction of type I interferons. Although very similar in sequence, RIG-I recognizes different sets of viruses compared to MDA5, a related RNA helicase. RIG-I associates with IPS-1 through a CARD-CARD interaction. In general, CARDs are death domains (DDs) found associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation, and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	91
260077	cd08818	CARD_MDA5_r1	Caspase activation and recruitment domain found in MDA5, first repeat. Caspase activation and recruitment domain (CARD) found in MDA5 (melanoma-differentiation-associated gene 5), first repeat. MDA5, also known as IFIH1, contains two N-terminal CARD domains and a C-terminal RNA helicase domain. MDA5 is a cytoplasmic DEAD box RNA helicase that plays an important role in host antiviral response by sensing incoming viral RNA. Upon activation, the signal is transferred to downstream pathways via the adaptor molecule IPS-1 (MAVS, VISA, CARDIF), leading to the induction of type I interferons. Although very similar in sequence, MDA5 recognizes different sets of viruses compared to RIG-I, a related RNA helicase. MDA5 associates with IPS-1 through a CARD-CARD interaction. In general, CARDs are death domains (DDs) found associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation, and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	92
260078	cd08819	CARD_MDA5_r2	Caspase activation and recruitment domain found in MDA5, second repeat. Caspase activation and recruitment domain (CARD) found in MDA5 (melanoma-differentiation-associated gene 5), second repeat. MDA5, also known as IFIH1, contains two N-terminal CARD domains and a C-terminal RNA helicase domain. MDA5 is a cytoplasmic DEAD box RNA helicase that plays an important role in host antiviral response by sensing incoming viral RNA. Upon activation, the signal is transferred to downstream pathways via the adaptor molecule IPS-1 (MAVS, VISA, CARDIF), leading to the induction of type I interferons. Although very similar in sequence, MDA5 recognizes different sets of viruses compared to RIG-I, a related RNA helicase. MDA5 associates with IPS-1 through a CARD-CARD interaction. In general, CARDs are death domains (DDs) found associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation, and host-defense mechanisms. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	92
187722	cd08820	FMT_core_like_6	Formyl transferase catalytic core domain found in a group of proteins with unknown functions. Formyl transferase catalytic core domain found in a group of proteins with unknown functions.  Formyl transferase catalyzes the transfer of one-carbon groups, specifically the formyl- or hydroxymethyl- group.  This domain contains a Rossmann fold and it is the catalytic domain of the enzyme.	173
187723	cd08821	FMT_core_like_1	Formyl transferase catalytic core domain found in a group of proteins with unknown functions. Formyl transferase catalytic core domain found in a group of proteins with unknown functions.  Formyl transferase catalyzes the transfer of one-carbon groups, specifically the formyl- or hydroxymethyl- group.  This domain contains a Rossmann fold and it is the catalytic domain of the enzyme.	211
187724	cd08822	FMT_core_like_2	Formyl transferase catalytic core domain found in a group of proteins with unknown functions. Formyl transferase catalytic core domain found in a group of proteins with unknown functions.  Formyl transferase catalyzes the transfer of one-carbon groups, specifically the formyl- or hydroxymethyl- group.  This domain contains a Rossmann fold and it is the catalytic domain of the enzyme.	192
187725	cd08823	FMT_core_like_5	Formyl transferase catalytic core domain found in a group of proteins with unknown functions. Formyl transferase catalytic core domain found in a group of proteins with unknown functions.  Formyl transferase catalyzes the transfer of one-carbon groups, specifically the formyl- or hydroxymethyl- group.  This domain contains a Rossmann fold and it is the catalytic domain of the enzyme.	177
193585	cd08824	LOTUS	LOTUS is an uncharacterized small globular domain found in Limkain b1, Oskar and Tudor-containing proteins 5 and 7. LOTUS is an uncharacterized small globular domain found in Limkain b1, Oskar and Tudor-containing proteins 5 and 7. The LOTUS containing proteins are germline-specific and are found in the nuage/polar granules of germ cells. Tudor-containing protein 5 and 7 belong to the evolutionary conserved Tudor domain-containing protein (TDRD) family involved in germ cell development. In mice, TDRD5 and TDRD7 are components of the intermitochondrial cements (IMCs) and the chromatoid bodies (CBs), which are cytoplasmic ribonucleoprotein granules involved in RNA processing for spermatogenesis. Oskar protein is a critical component of the pole plasm in the Drosophila oocyte, which is required for germ cell formation. Limkain b1 is a novel human autoantigen, localized to a subset of ABCD3 and PXF marked peroxisomes. Limkain b1 may be a relatively common target of human autoantibodies reactive to cytoplasmic vesicle-like structures. Limkain b1 contains multiple copies of LOTUS domains and a conserved RNA recognition motif. The exact molecular function of LOTUS domain remains to be characterized. Its occurrence in proteins associated with RNA metabolism suggests that it might be involved in RNA binding function. The presence of several basic residues and RNA fold recognition motifs support this hypothesis. The RNA binding function might be the first step of regulating mRNA translation or localization.	70
259807	cd08825	MVP_shoulder	Shoulder domain of the major vault protein. The major vault protein is the major polypeptide component of a large cellular ribonuclear protein complex found in the cytoplasm of eukaryotic cells. Its shoulder domain appears to be a homolog of the SPFH core domain. Vault proteins may be involved in detoxification processes, and have been associated with the multi-drug resistance (MDR) phenotype in malignancies. Presumably they play a role in transport processes.	151
259808	cd08826	SPFH_eoslipins_u1	Uncharacterized prokaryotic subgroup of the stomatin-like proteins (slipins) family; belonging to the SPFH (stomatin, prohibitin, flotillin, and HflK/C) superfamily. This model summarizes a subgroup of the stomatin-like protein family (SLPs or slipins) that is found in bacteria and archaebacteria. The conserved domain common to the SPFH superfamily has also been referred to as the Band 7 domain. Individual proteins of the SPFH superfamily may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Bacterial and archaebacterial SLPs remain uncharacterized. This subgroup contains PH1511 from the hyperthermophilic archaeon Pyrococcus horikoshii.	178
259809	cd08827	SPFH_podocin	Podocin, a subgroup of the stomatin-like proteins (slipins) family; belonging to the SPFH (stomatin, prohibitin, flotillin, and HflK/C) superfamily. This model summarizes a subgroup of the stomatin-like protein family (SLPs or slipins) that is found in vertebrates. The conserved domain common to the SPFH superfamily has also been referred to as the Band 7 domain. Individual proteins of the SPFH superfamily may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Podocin is expressed in the kidney and mutations in the gene have been linked to familial idiopathic nephrotic syndrome. Podocin interacts with the TRP ion channel TRPV-6 and may function as a scaffolding protein in the organization of lipid-protein domains.	223
259810	cd08828	SPFH_SLP-3	Slipin-3 (SLP-3), an uncharacterized subgroup of the stomatin-like proteins (slipins) family; belonging to the SPFH (stomatin, prohibitin, flotillin, and HflK/C) superfamily. This model summarizes a subgroup of the stomatin-like protein family (SLPs or slipins) that is found in vertebrates. The conserved domain common to the SPFH superfamily has also been referred to as the Band 7 domain. Individual proteins of the SPFH superfamily may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Members of this slipin subgroup remain uncharacterized, except for Caenorhabditis elegans UNC-1. Mutations in the unc-1 gene result in abnormal motion and altered patterns of sensitivity to volatile anesthetics.	154
259811	cd08829	SPFH_paraslipin	Paraslipin or slipin-2 (SLP-2), a subgroup of the stomatin-like proteins (slipins) family; belonging to the SPFH (stomatin, prohibitin, flotillin, and HflK/C) superfamily. This model summarizes a subgroup of the stomatin-like protein family (SLPs or slipins) that is found in all three kingdoms of life. The conserved domain common to these families has also been referred to as the Band 7 domain. Individual proteins of the SPFH family may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. This subgroup of the SLPs remains largely uncharacterized. It includes human SLP-2 which is upregulated and involved in the progression and development in several types of cancer, including esophageal squamous cell carcinoma, endometrial adenocarcinoma, breast cancer, and glioma.	111
350059	cd08830	ArfGap_ArfGap1	Arf1 GTPase-activating protein 1. ArfGAP (ADP Ribosylation Factor GTPase Activating Protein) domain is a part of ArfGap1-like proteins that play a crucial role in controlling membrane trafficking, particularly in the formation of COPI (coat protein complex I)-coated vesicles on Golgi membranes. The ArfGAP1 protein subfamily consists of three members: ArfGAP1 (Gcs1p in yeast), ArfGAP2 and ArfGAP3 (both are homologs of yeast Glo3p). ArfGAP2/3 are closely related, but with little similarity to ArfGAP1, except the catalytic ArfGAP domain. They promote hydrolysis of GTP bound to the small G protein ADP-ribosylation factor 1 (Arf1), which leads to the dissociation of coat proteins from Golgi-derived membranes and vesicles. Dissociation of the coat proteins is required for the fusion of these vesicles with target compartments. Thus, the GAP catalytic activity plays a key role in the formation of COPI vesicles from Golgi membrane. In contrast to ArfGAP1, which displays membrane curvature-dependent ArfGAP activity, ArfGAP2 and ArfGAP3 activities are dependent on coatomer (the core COPI complex) which is required for efficient recruitment of ArfGAP2 and ArfGAP3 to the Golgi membrane. Accordingly, ArfGAP2/3 have been implicated in coatomer-mediated protein transport between the Golgi complex and the endoplasmic reticulum. Unlike ArfGAP1, which is controlled by membrane curvature through its amphipathic lipid packing sensor (ALPS) motifs, ArfGAP2/3 do not possess ALPS motifs.	115
350060	cd08831	ArfGap_ArfGap2_3_like	Arf1 GTPase-activating protein 2/3-like. ArfGAP (ADP Ribosylation Factor GTPase Activating Protein) domain is a part of ArfGap1-like proteins that play a crucial role in controlling membrane trafficking, particularly in the formation of COPI (coat protein complex I)-coated vesicles on Golgi membranes. The ArfGAP1 protein subfamily consists of three members: ArfGAP1 (Gcs1p in yeast), ArfGAP2 and ArfGAP3 (both are homologs of yeast Glo3p). ArfGAP2/3 are closely related, but with little similarity to ArfGAP1, except the catalytic ArfGAP domain. They promote hydrolysis of GTP bound to the small G protein ADP-ribosylation factor 1 (Arf1), which leads to the dissociation of coat proteins from Golgi-derived membranes and vesicles. Dissociation of the coat proteins is required for the fusion of these vesicles with target compartments. Thus, the GAP catalytic activity plays a key role in the formation of COPI vesicles from Golgi membrane. In contrast to ArfGAP1, which displays membrane curvature-dependent ArfGAP activity, ArfGAP2 and ArfGAP3 activities are dependent on coatomer (the core COPI complex) which is required for efficient recruitment of ArfGAP2 and ArfGAP3 to the Golgi membrane. Accordingly, ArfGAP2/3 have been implicated in coatomer-mediated protein transport between the Golgi complex and the endoplasmic reticulum. Unlike ArfGAP1, which is controlled by membrane curvature through its amphipathic lipid packing sensor (ALPS) motifs, ArfGAP2/3 do not possess ALPS motifs.	116
350061	cd08832	ArfGap_ADAP	ArfGap with dual PH domains. The ADAP subfamily, ArfGAPs with dual pleckstrin homology (PH) domains, includes two members: ADAP1 and ADAP2. Both ADAP1 (also known as centaurin-alpha1, p42(IP4), or PIP3BP) and ADAP2 (centaurin-alpha2) display a GTPase-activating protein (GAP) activity toward Arf6 (ADP-ribosylation factor 6), which is involved in protein trafficking that regulates endocytic recycling, cytoskeleton remodeling, and neuronal differentiation.  ADAP2 has high sequence similarity to ADAP1, and both contain an ArfGAP domain at the N-terminus, followed by two PH domains. However, ADAP1, unlike ADAP2, contains a putative N-terminal nuclear localization signal. The PH domains of ADAP1 bind to the two second messenger molecules phosphatidylinositol 3,4,5-trisphosphate (PI(3,4,5)P3) and inositol 1,3,4,5-tetrakisphosphate (I(1,3,4,5)P4) with identical high affinity, whereas those of ADAP2 specifically bind phosphatidylinositol 3,4-bisphosphate (PI(3,4)P2) and PI(3,4,5)P3, which are produced by activated phosphatidylinositol 3-kinase. ADAP1 is predominantly expressed in the brain neurons, while ADAP2 is broadly expressed, including the adipocytes, heart, and skeletal muscle but not in the brain. The limited distribution and high expression of ADAP1 in the brain indicate that ADAP1 is important for neuronal functions. ADAP1 has been shown to be highly expressed in the neurons and plaques of Alzheimer's disease patients. On the other hand, ADAP2 gene deletion has been shown to cause circulatory deficiencies and heart shape defects in zebrafish, indicating that ADAP2 has a vital role in heart development. Taken together, the hemizygous deletion of the ADAP2 gene may contribute to the cardiovascular malformation in patients with neurofibromatosis type 1 (NF1) microdeletions.	113
350062	cd08833	ArfGap_GIT	The GIT subfamily of ADP-ribosylation factor GTPase-activating proteins. The GIT (G-protein coupled receptor kinase-interacting protein) subfamily includes GIT1 and GIT2, which have three ANK repeats, a Spa-homology domain (SHD), a coiled-coil domain and a C-terminal paxillin-binding site (PBS). The GIT1/2 proteins are GTPase-activating proteins that function as inactivators of Arf signaling, and interact with the PIX/Cool family of Rac/Cdc42 guanine nucleotide exchange factors (GEFs). Unlike other ArfGAPs, GIT and PIX (Pak-interacting exchange factor) proteins are tightly associated to form an oligomeric complex that acts as a scaffold and signal integrator that can be recruited for multiple signaling pathways. The GIT/PIX complex functions as a signaling scaffold by binding to specific protein partners. As a result, the complex is transported to specific cellular locations. For instance, the GIT partners paxillin or integrin-alpha4 target the complex to focal adhesions, piccolo and liprin-alpha to synapses, and the beta-PIX partner Scribble to epithelial cell-cell contacts and synapses. Moreover, the GIT/PIX complex functions to integrate signals from multiple GTP-binding protein and protein kinase pathways to regulate the actin cytoskeleton and thus cell polarity, adhesion and migration.	109
350063	cd08834	ArfGap_ASAP	ArfGAP domain of ASAP (Arf GAP, SH3, ANK repeat and PH domains) subfamily of ADP-ribosylation factor GTPase-activating proteins. The ArfGAPs are a family of multidomain proteins with a common catalytic domain that promotes the hydrolysis of GTP bound to Arf, thereby inactivating Arf signaling. ASAP-subfamily GAPs include three members: ASAP1, ASAP2, and ASAP3. The ASAP subfamily comprises Arf GAP, SH3, ANK repeat and PH domains. From the N-terminus, each member has a BAR, PH, Arf GAP, ANK repeat, and proline rich domains. Unlike ASAP3, ASAP1 and ASAP2 also have an SH3 domain at the C-terminus. ASAP1 and ASAP2 show strong GTPase-activating protein (GAP) activity toward Arf1 and Arf5 and weak activity toward Arf6. ASAP1 is a target of Src and FAK signaling that regulates focal adhesions, circular dorsal ruffles (CDR), invadopodia, and podosomes. ASAP1 GAP activity is synergistically stimulated by phosphatidylinositol 4,5-bisphosphate (PIP2) and phosphatidic acid. ASAP2 is believed to function as an ArfGAP that controls ARF-mediated vesicle budding when recruited to Golgi membranes. It also functions as a substrate and downstream target for protein tyrosine kinases Pyk2 and Src, a pathway that may be involved in the regulation of vesicular transport. ASAP3 is a focal adhesion-associated ArfGAP that functions in cell migration and invasion. Similar to ASAP1, the GAP activity of ASAP3 is strongly enhanced by PIP2 via the PH domain. Like ASAP1, ASAP3 associates with focal adhesions and circular dorsal ruffles. However, unlike ASAP1, ASAP3 does not localize to invadopodia or podosomes. Both ASAP1 and ASAP3 have been implicated in oncogenesis, as ASAP1 is highly expressed in metastatic breast cancer and ASAP3 in hepatocellular carcinoma.	117
350064	cd08835	ArfGap_ACAP	ArfGAP domain of ACAP (ArfGAP with Coiled-coil, ANK repeat and PH domains) proteins. ArfGAP domain is an essential part of ACAP proteins that play an important role in endocytosis, actin remodeling and receptor tyrosine kinase-dependent cell movement. The ACAP subfamily of ArfGAPs is composed of coiled-coil (BAR, Bin-Amphiphysin-Rvs), PH, ArfGAP and ANK repeat domains. ACAP1 (centaurin beta1) and ACAP2 (centaurin beta2) have a GAP (GTPase-activating protein) activity preferentially toward Arf6, which regulates endocytic recycling. Both ACAP1/2 are activated by the phosphoinositides PI(4,5)P2 and PI(3,5)P2. ACAP1 binds specifically with recycling cargo proteins such as transferrin receptor (TfR) and cellubrevin. Thus, ACAP1 promotes cargo sorting to enhance TfR recycling from the recycling endosome. In addition, phosphorylation of ACAP by Akt, a serine/threonine protein kinase, regulates the recycling of integrin beta1 to control cell migration. In contrast, ACAP2 does not exhibit a similar interaction with the recycling cargo proteins. It has been shown that ACAP2 functions both as an effector of Ras-related protein Rab35 and as an Arf6-GTPase-activating protein (GAP) during neurite outgrowth of PC12 cells. In addition, ACAP2, together with Rab35, regulates phagocytosis in mammalian macrophages. ACAP3 also positively regulates neurite outgrowth through its GAP activity specific to Arf6 in mouse hippocampal neurons.	116
350065	cd08836	ArfGap_AGAP	ArfGAP with GTPase domain, ANK repeat and PH domains. The AGAP subfamily of ADP-ribosylation factor GTPase-activating proteins (Arf GAPs) includes three members: AGAP1-3. In addition to the Arf GAP domain, AGAP proteins contain GTP-binding protein-like, ANK repeat and pleckstrin homology (PH) domains. AGAP1 and AGAP2 have phosphatidylinositol 4,5-bisphosphate (PI(4,5)P2)-mediated GTPase-activating protein (GAP) activity preferentially toward Arf1, and function in the endocytic system. AGAP1 and AGAP2 independently regulate AP-3 endosomes and AP-1/Rab4 fast recycling endosomes, respectively. AGAP1, via its PH domain, directly interacts with the adapter protein 3 (AP-3), which is a coat protein involved in trafficking in the endosomal-lysosomal system, and regulates AP-3-dependent trafficking. On the other hand, AGAP2 specifically binds the clathrin adaptor protein AP-1 and regulates the AP-1/Rab4-dependent endosomal trafficking. AGAP2 is overexpressed in different human cancers including prostate carcinoma and glioblastoma, and promotes cancer cell invasion. AGAP3 exists as a component of the NMDA receptor complex that regulates Arf6 and Ras/ERK signaling pathways. Moreover, AGAP3 regulates AMPA receptor trafficking through the ArfGAP domain. Together, AGAP3 is believed to be involved in linking NMDA receptor activation to AMPA receptor trafficking.	108
350066	cd08837	ArfGap_ARAP	ArfGap with Rho-Gap domain, ANK repeat and PH domain-containing proteins. The ARAP subfamily includes three members, ARAP1-3, and belongs to the ADP-ribosylation factor GTPase-activating proteins (Arf GAPs) family of proteins that promotes the hydrolysis of GTP bound to Arf, thereby inactivating Arf signaling. The function of Arfs is dependent on GAPs and guanine nucleotide exchange factors (GEFs), which allow Arfs to cycle between the GDP-bound and GTP-bound forms. In addition to the Arf GAP domain, ARAPs contain the SAM (sterile-alpha motif) domain, 5 pleckstrin homology (PH) domains, the Rho-GAP domain, the Ras-association domain, and ANK repeats. ARAPs show phosphatidylinositol 3,4,5-trisphosphate (PI(3,4,5)P3)-dependent GAP activity toward Arf6. ARAPs play important roles in endocytic trafficking, cytoskeleton reorganization in response to growth factor stimulation, and focal adhesion dynamics.	116
350067	cd08838	ArfGap_AGFG	ArfGAP domain of the AGFG subfamily (ArfGAP domain and FG repeat-containing proteins). The ArfGAP domain and FG repeat-containing proteins (AGFG) subfamily of Arf GTPase-activating proteins consists of two structurally related members: AGFG1 and AGFG2. AGFG1 (alias: HIV-1 Rev binding protein, HRB; Rev interacting protein, RIP; Rev/Rex activating domain-binding protein, RAB) and AGFG2 are involved in the maintenance and spread of human immunodeficiency virus type 1 (HIV-1) infection. The ArfGAP domain of AGFG is related to nucleoporins, which are a class of proteins that mediate nucleocytoplasmic transport. AGFG plays a role in the Rev export pathway, which mediates the nucleocytoplasmic transfer of proteins and RNAs, possibly together with the nuclear export receptor CRM1. In humans, the presence of the FG repeat motifs (11 in AGFG1 and 7 in AGFG2) is thought to be required for these proteins to act as HIV-1 Rev cofactors. Hence, AGFG promotes movement of Rev-responsive element-containing RNAs from the nuclear periphery to the cytoplasm, which is an essential step for HIV-1 replication.	113
350068	cd08839	ArfGap_SMAP	Stromal membrane-associated proteins; a subfamily of the ArfGAP family. The SMAP subfamily of Arf GTPase-activating proteins consists of two structurally related members, SMAP1 and SMAP2. Each SMAP member exhibits common and distinct functions in vesicle trafficking. They both bind to clathrin heavy chain molecules and are involved in the trafficking of clathrin-coated vesicles. SMAP1 preferentially exhibits GAP activity toward Arf6, while SMAP2 prefers Arf1 as a substrate. SMAP1 is involved in Arf6-dependent vesicle trafficking, but not Arf6-mediated actin cytoskeleton reorganization, and regulates clathrin-dependent endocytosis of the transferrin receptors and E-cadherin. SMAP2 regulates Arf1-dependent retrograde transport of TGN38/46 from the early endosome to the trans-Golgi network (TGN). SMAP2 has the Clathrin Assembly Lymphoid Myeloid (CALM)-binding domain, but SMAP1 does not.	103
350069	cd08843	ArfGap_ADAP1	ADAP1 GTPase activating protein for Arf, with dual PH domains. The ADAP subfamily, ArfGAPs with dual pleckstrin homology (PH) domains, includes two members: ADAP1 and ADAP2. Both ADAP1 (also known as centaurin-alpha1, p42(IP4), or PIP3BP) and ADAP2 (centaurin-alpha2) display a GTPase-activating protein (GAP) activity toward Arf6 (ADP-ribosylation factor 6), which is involved in protein trafficking that regulates endocytic recycling, cytoskeleton remodeling, and neuronal differentiation. ADAP2 has high sequence similarity to ADAP1, and they both contain an ArfGAP domain at the N-terminus, followed by two PH domains. However, ADAP1, unlike ADAP2, contains a putative N-terminal nuclear localization signal. The PH domains of ADAP1 bind to the two second messenger molecules phosphatidylinositol 3,4,5-trisphosphate (PI(3,4,5)P3) and inositol 1,3,4,5-tetrakisphosphate (I(1,3,4,5)P4) with identical high affinity, whereas those of ADAP2 specifically bind phosphatidylinositol 3,4-bisphosphate (PI(3,4)P2) and PI(3,4,5)P3, which are produced by activated phosphatidylinositol 3-kinase. ADAP1 is predominantly expressed in brain neurons, while ADAP2 is broadly expressed, including in adipocytes, heart, and skeletal muscle, but not in the brain. The limited distribution and high expression of ADAP1 in the brain indicate that ADAP1 is important for neuronal functions. ADAP1 has been shown to be highly expressed in the neurons and plaques of Alzheimer's disease patients. On the other hand, ADAP2 gene deletion has been shown to cause circulatory deficiencies and heart shape defects in zebrafish, indicating that ADAP2 has a vital role in heart development. Taken together, the hemizygous deletion of the ADAP2 gene may contribute to the cardiovascular malformation in patients with neurofibromatosis type 1 (NF1) microdeletions.	112
350070	cd08844	ArfGap_ADAP2	ADAP2 GTPase activating protein for Arf, with dual PH domains. The ADAP subfamily, ArfGAPs with dual pleckstrin homology (PH) domains, includes two members: ADAP1 and ADAP2. Both ADAP1 (also known as centaurin-alpha1, p42(IP4), or PIP3BP) and ADAP2 (centaurin-alpha2) display a GTPase-activating protein (GAP) activity toward Arf6 (ADP-ribosylation factor 6), which is involved in protein trafficking that regulates endocytic recycling, cytoskeleton remodeling, and neuronal differentiation. ADAP2 has high sequence similarity to ADAP1, and they both contain an ArfGAP domain at the N-terminus, followed by two PH domains. However, ADAP1, unlike ADAP2, contains a putative N-terminal nuclear localization signal. The PH domains of ADAP1 bind to the two second messenger molecules phosphatidylinositol 3,4,5-trisphosphate (PI(3,4,5)P3) and inositol 1,3,4,5-tetrakisphosphate (I(1,3,4,5)P4) with identical high affinity, whereas those of ADAP2 specifically bind phosphatidylinositol 3,4-bisphosphate (PI(3,4)P2) and PI(3,4,5)P3, which are produced by activated phosphatidylinositol 3-kinase. ADAP1 is predominantly expressed in brain neurons, while ADAP2 is broadly expressed, including in adipocytes, heart, and skeletal muscle, but not in the brain. The limited distribution and high expression of ADAP1 in the brain indicate that ADAP1 is important for neuronal functions. ADAP1 has been shown to be highly expressed in the neurons and plaques of Alzheimer's disease patients. On the other hand, ADAP2 gene deletion has been shown to cause circulatory deficiencies and heart shape defects in zebrafish, indicating that ADAP2 has a vital role in heart development. Taken together, the hemizygous deletion of the ADAP2 gene may contribute to the cardiovascular malformation in patients with neurofibromatosis type 1 (NF1) microdeletions.	112
350071	cd08846	ArfGap_GIT1	GIT1 GTPase activating protein for Arf. The GIT (G-protein coupled receptor kinase-interacting protein) subfamily includes GIT1 and GIT2, which have three ANK repeats, a Spa-homology domain (SHD), a coiled-coil domain and a C-terminal paxillin-binding site (PBS). The GIT1/2 proteins are GTPase-activating proteins that function as inactivators of Arf signaling, and interact with the PIX/Cool family of Rac/Cdc42 guanine nucleotide exchange factors (GEFs). Unlike other ArfGAPs, GIT and PIX (Pak-interacting exchange factor) proteins are tightly associated to form an oligomeric complex that acts as a scaffold and signal integrator that can be recruited for multiple signaling pathways. The GIT/PIX complex functions as a signaling scaffold by binding to specific protein partners. As a result, the complex is transported to specific cellular locations. For instance, the GIT partners paxillin or integrin-alpha4 target the complex to focal adhesions, piccolo and liprin-alpha to synapses, and the beta-PIX partner Scribble to epithelial cell-cell contacts and synapses. Moreover, the GIT/PIX complex functions to integrate signals from multiple GTP-binding protein and protein kinase pathways to regulate the actin cytoskeleton and thus cell polarity, adhesion and migration.	111
350072	cd08847	ArfGap_GIT2	GIT2 GTPase activating protein for Arf. The GIT (G-protein coupled receptor kinase-interacting protein) subfamily includes GIT1 and GIT2, which have three ANK repeats, a Spa-homology domain (SHD), a coiled-coil domain and a C-terminal paxillin-binding site (PBS). The GIT1/2 proteins are GTPase-activating proteins that function as inactivators of Arf signaling, and interact with the PIX/Cool family of Rac/Cdc42 guanine nucleotide exchange factors (GEFs). Unlike other ArfGAPs, GIT and PIX (Pak-interacting exchange factor) proteins are tightly associated to form an oligomeric complex that acts as a scaffold and signal integrator that can be recruited for multiple signaling pathways. The GIT/PIX complex functions as a signaling scaffold by binding to specific protein partners. As a result, the complex is transported to specific cellular locations. For instance, the GIT partners paxillin or integrin-alpha4 target the complex to focal adhesions, piccolo and liprin-alpha to synapses, and the beta-PIX partner Scribble to epithelial cell-cell contacts and synapses. Moreover, the GIT/PIX complex functions to integrate signals from multiple GTP-binding protein and protein kinase pathways to regulate the actin cytoskeleton and thus cell polarity, adhesion and migration.	111
350073	cd08848	ArfGap_ASAP1	ArfGAP domain of ASAP1 (ArfGAP with SH3 domain, ANK repeat and PH domain-containing protein 1). The ArfGAPs are a family of multidomain proteins with a common catalytic domain that promotes the hydrolysis of GTP bound to Arf, thereby inactivating Arf signaling. ASAP-subfamily GAPs include three members: ASAP1, ASAP2, and ASAP3. The ASAP subfamily comprises Arf GAP, SH3, ANK repeat and PH domains. From the N-terminus, each member has a BAR, PH, Arf GAP, ANK repeat, and proline rich domains. Unlike ASAP3, ASAP1 and ASAP2 also have an SH3 domain at the C-terminus. ASAP1 and ASAP2 show strong GTPase-activating protein (GAP) activity toward Arf1 and Arf5 and weak activity toward Arf6. ASAP1 is a target of Src and FAK signaling that regulates focal adhesions, circular dorsal ruffles (CDR), invadopodia, and podosomes. ASAP1 GAP activity is synergistically stimulated by phosphatidylinositol 4,5-bisphosphate (PIP2) and phosphatidic acid. ASAP2 is believed to function as an ArfGAP that controls ARF-mediated vesicle budding when recruited to Golgi membranes. It also functions as a substrate and downstream target for protein tyrosine kinases Pyk2 and Src, a pathway that may be involved in the regulation of vesicular transport. ASAP3 is a focal adhesion-associated ArfGAP that functions in cell migration and invasion. Similar to ASAP1, the GAP activity of ASAP3 is strongly enhanced by PIP2 via the PH domain. Like ASAP1, ASAP3 associates with focal adhesions and circular dorsal ruffles. However, unlike ASAP1, ASAP3 does not localize to invadopodia or podosomes. ASAP1 and ASAP3 have been implicated in oncogenesis, as ASAP1 is highly expressed in metastatic breast cancer and ASAP3 in hepatocellular carcinoma.	122
350074	cd08849	ArfGap_ASAP2	ArfGAP domain of ASAP2 (ArfGAP with SH3 domain, ANK repeat and PH domain-containing protein 2). The Arf GAPs are a family of multidomain proteins with a common catalytic domain that promotes the hydrolysis of GTP bound to Arf, thereby inactivating Arf signaling. ASAP-subfamily GAPs include three members: ASAP1, ASAP2, and ASAP3. The ASAP subfamily comprises Arf GAP, SH3, ANK repeat and PH domains. From the N-terminus, each member has a BAR, PH, Arf GAP, ANK repeat, and proline rich domains. Unlike ASAP3, ASAP1 and ASAP2 also have an SH3 domain at the C-terminus. ASAP1 and ASAP2 show strong GTPase-activating protein (GAP) activity toward Arf1 and Arf5 and weak activity toward Arf6. ASAP1 is a target of Src and FAK signaling that regulates focal adhesions, circular dorsal ruffles (CDR), invadopodia, and podosomes. ASAP1 GAP activity is synergistically stimulated by phosphatidylinositol 4,5-bisphosphate (PIP2) and phosphatidic acid. ASAP2 is believed to function as an ArfGAP that controls ARF-mediated vesicle budding when recruited to Golgi membranes. It also functions as a substrate and downstream target for protein tyrosine kinases Pyk2 and Src, a pathway that may be involved in the regulation of vesicular transport.	123
350075	cd08850	ArfGap_ACAP3	ArfGAP domain of ACAP3 (ArfGAP with Coiled-coil, ANK repeat and PH domains 3). ACAP3 belongs to the ACAP subfamily of GAPs (GTPase-activating proteins) for the small GTPase Arf (ADP-ribosylation factor). The ACAP subfamily of ArfGAPs is composed of coiled-coil (BAR, Bin-Amphiphysin-Rvs), PH, ArfGAP and ANK repeat domains. It has been shown that ACAP3 positively regulates neurite outgrowth through its GAP activity specific to Arf6 in mouse hippocampal neurons. ACAP1 (centaurin beta1) and ACAP2 (centaurin beta2) also have a GAP (GTPase-activating protein) activity preferentially toward Arf6, which regulates endocytic recycling. Both ACAP1/2 are activated by the phosphoinositides PI(4,5)P2 and PI(3,5)P2. ACAP1 binds specifically with recycling cargo proteins such as transferrin receptor (TfR) and cellubrevin. Thus, ACAP1 promotes cargo sorting to enhance TfR recycling from the recycling endosome. In addition, phosphorylation of ACAP by Akt, a serine/threonine protein kinase, regulates the recycling of integrin beta1 to control cell migration. In contrast, ACAP2 does not exhibit a similar interaction with the recycling cargo proteins. It has been shown that ACAP2 functions both as an effector of Ras-related protein Rab35 and as an Arf6-GTPase-activating protein (GAP) during neurite outgrowth of PC12 cells. Moreover, ACAP2, together with Rab35, regulates phagocytosis in mammalian macrophages.	116
350076	cd08851	ArfGap_ACAP2	ArfGAP domain of ACAP2 (ArfGAP with Coiled-coil, ANK repeat and PH domains 2). ACAP2 belongs to the ACAP subfamily of GAPs (GTPase-activating proteins) for the small GTPase Arf (ADP-ribosylation factor). The ACAP subfamily of ArfGAPs is composed of coiled-coil (BAR, Bin-Amphiphysin-Rvs), PH, ArfGAP and ANK repeat domains. ACAP1 (centaurin beta1) and ACAP2 (centaurin beta2) have a GAP (GTPase-activating protein) activity preferentially toward Arf6, which regulates endocytic recycling. Both ACAP1/2 are activated by the phosphoinositides PI(4,5)P2 and PI(3,5)P2. ACAP1 binds specifically with recycling cargo proteins such as transferrin receptor (TfR) and cellubrevin. Thus, ACAP1 promotes cargo sorting to enhance TfR recycling from the recycling endosome. In addition, phosphorylation of ACAP by Akt, a serine/threonine protein kinase, regulates the recycling of integrin beta1 to control cell migration. In contrast, ACAP2 does not exhibit a similar interaction with the recycling cargo proteins. It has been shown that ACAP2 functions both as an effector of Ras-related protein Rab35 and as an Arf6-GTPase-activating protein (GAP) during neurite outgrowth of PC12 cells. Moreover, ACAP2, together with Rab35, regulates phagocytosis in mammalian macrophages. ACAP3 also positively regulates neurite outgrowth through its GAP activity specific to Arf6 in mouse hippocampal neurons.	116
350077	cd08852	ArfGap_ACAP1	ArfGAP domain of ACAP1 (ArfGAP with Coiled-coil, ANK repeat and PH domains 1). ACAP1 belongs to the ACAP subfamily of GAPs (GTPase-activating proteins) for the small GTPase Arf (ADP-ribosylation factor). The ACAP subfamily of ArfGAPs is composed of coiled-coil (BAR, Bin-Amphiphysin-Rvs), PH, ArfGAP and ANK repeat domains. ACAP1 (centaurin beta1) and ACAP2 (centaurin beta2) have a GAP (GTPase-activating protein) activity preferentially toward Arf6, which regulates endocytic recycling. Both ACAP1/2 are activated by the phosphoinositides PI(4,5)P2 and PI(3,5)P2. ACAP1 binds specifically with recycling cargo proteins such as transferrin receptor (TfR) and cellubrevin. Thus, ACAP1 promotes cargo sorting to enhance TfR recycling from the recycling endosome. In addition, phosphorylation of ACAP by Akt, a serine/threonine protein kinase, regulates the recycling of integrin beta1 to control cell migration. In contrast, ACAP2 does not exhibit a similar interaction with the recycling cargo proteins. It has been shown that ACAP2 functions both as an effector of Ras-related protein Rab35 and as an Arf6-GTPase-activating protein (GAP) during neurite outgrowth of PC12 cells. Moreover, ACAP2, together with Rab35, regulates phagocytosis in mammalian macrophages. ACAP3 also positively regulates neurite outgrowth through its GAP activity specific to Arf6 in mouse hippocampal neurons.	120
350078	cd08853	ArfGap_AGAP2	ArfGAP with GTPase domain, ANK repeat and PH domain 2. The AGAP subfamily of ADP-ribosylation factor GTPase-activating proteins (Arf GAPs) includes three members: AGAP1-3. In addition to the Arf GAP domain, AGAP proteins contain GTP-binding protein-like, ANK repeat and pleckstrin homology (PH) domains. AGAP1 and AGAP2 have phosphatidylinositol 4,5-bisphosphate (PI(4,5)P2)-mediated GTPase-activating protein (GAP) activity preferentially toward Arf1, and function in the endocytic system. AGAP1 and AGAP2 independently regulate AP-3 endosomes and AP-1/Rab4 fast recycling endosomes, respectively. AGAP1, via its PH domain, directly interacts with the adapter protein 3 (AP-3), which is a coat protein involved in trafficking in the endosomal-lysosomal system, and regulates AP-3-dependent trafficking. On the other hand, AGAP2 specifically binds the clathrin adaptor protein AP-1 and regulates the AP-1/Rab4-dependent endosomal trafficking. AGAP2 is overexpressed in different human cancers including prostate carcinoma and glioblastoma, and promotes cancer cell invasion. AGAP3 exists as a component of the NMDA receptor complex that regulates Arf6 and Ras/ERK signaling pathways. Moreover, AGAP3 regulates AMPA receptor trafficking through the ArfGAP domain. Together, AGAP3 is believed to be involved in linking NMDA receptor activation to AMPA receptor trafficking.	109
350079	cd08854	ArfGap_AGAP1	ArfGAP with GTPase domain, ANK repeat and PH domain 1. The AGAP subfamily of ADP-ribosylation factor GTPase-activating proteins (Arf GAPs) includes three members: AGAP1-3. In addition to the Arf GAP domain, AGAP proteins contain GTP-binding protein-like, ANK repeat and pleckstrin homology (PH) domains. AGAP1 and AGAP2 have phosphatidylinositol 4,5-bisphosphate (PI(4,5)P2)-mediated GTPase-activating protein (GAP) activity preferentially toward Arf1, and function in the endocytic system. AGAP1 and AGAP2 independently regulate AP-3 endosomes and AP-1/Rab4 fast recycling endosomes, respectively. AGAP1, via its PH domain, directly interacts with the adapter protein 3 (AP-3), which is a coat protein involved in trafficking in the endosomal-lysosomal system, and regulates AP-3-dependent trafficking. On the other hand, AGAP2 specifically binds the clathrin adaptor protein AP-1 and regulates the AP-1/Rab4-dependent endosomal trafficking. AGAP2 is overexpressed in different human cancers including prostate carcinoma and glioblastoma, and promotes cancer cell invasion. AGAP3 exists as a component of the NMDA receptor complex that regulates Arf6 and Ras/ERK signaling pathways. Moreover, AGAP3 regulates AMPA receptor trafficking through the ArfGAP domain. Together, AGAP3 is believed to be involved in linking NMDA receptor activation to AMPA receptor trafficking.	109
350080	cd08855	ArfGap_AGAP3	ArfGAP with GTPase domain, ANK repeat and PH domain 3. The AGAP subfamily of ADP-ribosylation factor GTPase-activating proteins (Arf GAPs) includes three members: AGAP1-3. In addition to the Arf GAP domain, AGAP proteins contain GTP-binding protein-like, ANK repeat and pleckstrin homology (PH) domains. AGAP3 exists as a component of the NMDA receptor complex that regulates Arf6 and Ras/ERK signaling pathways. Moreover, AGAP3 regulates AMPA receptor trafficking through the ArfGAP domain. Together, AGAP3 is believed to be involved in linking NMDA receptor activation to AMPA receptor trafficking. AGAP1 and AGAP2 have phosphatidylinositol 4,5-bisphosphate (PI(4,5)P2)-mediated GTPase-activating protein (GAP) activity preferentially toward Arf1, and function in the endocytic system. AGAP1 and AGAP2 independently regulate AP-3 endosomes and AP-1/Rab4 fast recycling endosomes, respectively. AGAP1, via its PH domain, directly interacts with the adapter protein 3 (AP-3), which is a coat protein involved in trafficking in the endosomal-lysosomal system, and regulates AP-3-dependent trafficking. On the other hand, AGAP2 specifically binds the clathrin adaptor protein AP-1 and regulates the AP-1/Rab4-dependent endosomal trafficking. AGAP2 is overexpressed in different human cancers including prostate carcinoma and glioblastoma, and promotes cancer cell invasion.	110
350081	cd08856	ArfGap_ARAP2	ArfGap with Rho-Gap domain, ANK repeat and PH domain-containing protein 2. The ARAP subfamily includes three members, ARAP1-3, and belongs to the ADP-ribosylation factor GTPase-activating proteins (Arf GAPs) family of proteins that promotes the hydrolysis of GTP bound to Arf, thereby inactivating Arf signaling. The function of Arfs is dependent on GAPs and guanine nucleotide exchange factors (GEFs), which allow Arfs to cycle between the GDP-bound and GTP-bound forms. In addition to the Arf GAP domain, ARAPs contain the SAM (sterile-alpha motif) domain, 5 pleckstrin homology (PH) domains, the Rho-GAP domain, the Ras-association domain, and ANK repeats. ARAPs show phosphatidylinositol 3,4,5-trisphosphate (PI(3,4,5)P3)-dependent GAP activity toward Arf6. ARAPs play important roles in endocytic trafficking, cytoskeleton reorganization in response to growth factor stimulation, and focal adhesion dynamics. ARAP2 localizes to the cell periphery and on focal adhesions composed of paxillin and vinculin, and functions downstream of RhoA to regulate focal adhesion dynamics. ARAP2 is a PI(3,4,5)P3-dependent Arf6 GAP that binds RhoA-GTP, but it lacks the predicted catalytic arginine in the RhoGAP domain and does not have RhoGAP activity. ARAP2 reduces Rac1-GTP levels by reducing Arf6-GTP levels through GAP activity. AGAP2 also binds to and regulates focal adhesion kinase (FAK). Thus, ARAP2 signals through Arf6 and Rac1 to control focal adhesion morphology.	121
350082	cd08857	ArfGap_AGFG1	ArfGAP domain of AGFG1 (ArfGAP domain and FG repeat-containing protein 1). The ArfGAP domain and FG repeat-containing proteins (AGFG) subfamily of Arf GTPase-activating proteins consists of two structurally related members: AGFG1 and AGFG2. AGFG1 (alias: HIV-1 Rev binding protein, HRB; Rev interacting protein, RIP; Rev/Rex activating domain-binding protein, RAB) and AGFG2 are involved in the maintenance and spread of human immunodeficiency virus type 1 (HIV-1) infection. The ArfGAP domain of AGFG1 is related to nucleoporins, which are a class of proteins that mediate nucleocytoplasmic transport. AGFG1 plays a role in the Rev export pathway, which mediates the nucleocytoplasmic transfer of proteins and RNAs, possibly together with the nuclear export receptor CRM1. In humans, the presence of the FG repeat motifs (11 in AGFG1 and 7 in AGFG2) is thought to be required for these proteins to act as HIV-1 Rev cofactors. Hence, AGFG1 promotes movement of Rev-responsive element-containing RNAs from the nuclear periphery to the cytoplasm, which is an essential step for HIV-1 replication.	116
350083	cd08859	ArfGap_SMAP2	Stromal membrane-associated protein 2; a subfamily of the ArfGAP family. The SMAP subfamily of Arf GTPase-activating proteins consists of two structurally related members, SMAP1 and SMAP2. Each SMAP member exhibits common and distinct functions in vesicle trafficking. They both bind to clathrin heavy chain molecules and are involved in the trafficking of clathrin-coated vesicles. SMAP1 preferentially exhibits GAP activity toward Arf6, while SMAP2 prefers Arf1 as a substrate. SMAP1 is involved in Arf6-dependent vesicle trafficking, but not Arf6-mediated actin cytoskeleton reorganization, and regulates clathrin-dependent endocytosis of the transferrin receptors and E-cadherin. SMAP2 regulates Arf1-dependent retrograde transport of TGN38/46 from the early endosome to the trans-Golgi network (TGN). SMAP2 has the Clathrin Assembly Lymphoid Myeloid (CALM)-binding domain, but SMAP1 does not.	107
176869	cd08860	TcmN_ARO-CYC_like	N-terminal aromatase/cyclase domain of the multifunctional protein tetracenomycin (TcmN) and related domains. This family includes the N-terminal aromatase/cyclase (ARO/CYC) domain of Streptomyces glaucescens TcmN, and related domains. It belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. ARO/CYC domains participate in the diversification of aromatic polyketides by promoting polyketide cyclization. They occur in two architectural forms, monodomain and didomain. Monodomain aromatase/cyclases have a single ARO/CYC domain. For some, such as TcmN, this single domain is linked to a second domain of unrelated function. TcmN is a multifunctional cyclase-dehydratase-O-methyl transferase. Its N-terminal ARO/CYC domain participates in polyketide binding and catalysis; it promotes C9-C14 first-ring (and C7-C16 second-ring) cyclizations. Its C-terminal domain has O-methyltransferase activity. Didomain aromatase/cyclases contain two ARO/CYC domains, and they biosynthesize C7-C12 first ring cyclized polyketides. These latter domains belong to a different subfamily in the SRPBCC superfamily.	146
176870	cd08861	OtcD1_ARO-CYC_like	N-terminal and C-terminal aromatase/cyclase domains of Streptomyces rimosus OtcD1 and related domains. This family includes the N- and C-terminal aromatase/cyclase (ARO/CYC) domains of Streptomyces rimosus OtcD1 and related domains. It belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. ARO/CYC domains participate in the diversification of aromatic polyketides by promoting polyketide cyclization. They occur in two architectural forms, didomain and monodomain. Didomain aromatase/cyclases (ARO/CYCs) contain two ARO/CYC domains and are associated with C7-C12 first ring cyclized polyketides. Streptomyces rimosus OtcD1 is a didomain ARO/CYC. The polyketide oxytetracycline (OTC) is a broad spectrum antibiotic made by Streptomyces rimosus. The gene encoding OtcD1 is part of the oxytetracycline (OTC) gene cluster. Disruption of this gene results in the production of novel polyketides having shorter chain lengths (by up to 10 carbons) than OTC. Monodomain ARO/CYCs have a single ARO/CYC domain and are often associated with C9-C14 first ring cyclizations; these latter domains belong to a different subfamily in the SRPBCC superfamily.	142
176871	cd08862	SRPBCC_Smu440-like	Ligand-binding SRPBCC domain of Streptococcus mutans Smu.440 and related proteins. This family includes the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain of Streptococcus mutans Smu.440 and related proteins. This domain belongs to the SRPBCC domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. Streptococcus mutans is a dental pathogen, and the leading cause of dental caries. In this pathogen, the gene encoding Smu.440 is in the same operon as the gene encoding SMU.441, a member of the MarR protein family of transcriptional regulators involved in multiple antibiotic resistance. It has been suggested that SMU.440 is involved in polyketide-like antibiotic resistance.	138
176872	cd08863	SRPBCC_DUF1857	DUF1857, an uncharacterized ligand-binding domain of the SRPBCC domain superfamily. Uncharacterized family of the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands. SRPBCC domains include the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD1-STARD15, the C-terminal catalytic domains of the alpha oxygenase subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs_alpha_C), Class I and II phosphatidylinositol transfer proteins (PITPs), Bet v 1 (the major pollen allergen of white birch, Betula verrucosa), CoxG, CalC, and related proteins. Other members of the superfamily include PYR/PYL/RCAR plant proteins, the aromatase/cyclase (ARO/CYC) domains of proteins such as Streptomyces glaucescens tetracenomycin, and the SRPBCC domains of Streptococcus mutans Smu.440 and related proteins.	141
176873	cd08864	SRPBCC_DUF3074	DUF3074, an uncharacterized ligand-binding domain of the SRPBCC domain superfamily. Uncharacterized family of the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands. SRPBCC domains include the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD1-STARD15, the C-terminal catalytic domains of the alpha oxygenase subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs_alpha_C), Class I and II phosphatidylinositol transfer proteins (PITPs), Bet v 1 (the major pollen allergen of white birch, Betula verrucosa), CoxG, CalC, and related proteins. Other members of the superfamily include PYR/PYL/RCAR plant proteins, the aromatase/cyclase (ARO/CYC) domains of proteins such as Streptomyces glaucescens tetracenomycin, and the SRPBCC domains of Streptococcus mutans Smu.440 and related proteins.	208
176874	cd08865	SRPBCC_10	Ligand-binding SRPBCC domain of an uncharacterized subfamily of proteins. Uncharacterized group of the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands. SRPBCC domains include the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD1-STARD15, the C-terminal catalytic domains of the alpha oxygenase subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs_alpha_C), Class I and II phosphatidylinositol transfer proteins (PITPs), Bet v 1 (the major pollen allergen of white birch, Betula verrucosa), CoxG, CalC, and related proteins. Other members of the superfamily include PYR/PYL/RCAR plant proteins, the aromatase/cyclase (ARO/CYC) domains of proteins such as Streptomyces glaucescens tetracenomycin, and the SRPBCC domains of Streptococcus mutans Smu.440 and related proteins.	140
176875	cd08866	SRPBCC_11	Ligand-binding SRPBCC domain of an uncharacterized subfamily of proteins. Uncharacterized group of the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands. SRPBCC domains include the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD1-STARD15, the C-terminal catalytic domains of the alpha oxygenase subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs_alpha_C), Class I and II phosphatidylinositol transfer proteins (PITPs), Bet v 1 (the major pollen allergen of white birch, Betula verrucosa), CoxG, CalC, and related proteins. Other members of the superfamily include PYR/PYL/RCAR plant proteins, the aromatase/cyclase (ARO/CYC) domains of proteins such as Streptomyces glaucescens tetracenomycin, and the SRPBCC domains of Streptococcus mutans Smu.440 and related proteins.	144
176876	cd08867	START_STARD4_5_6-like	Lipid-binding START domain of mammalian STARD4, -5, -6, and related proteins. This subfamily includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD4, -5, and -6. The START domain family belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. STARD4 plays an important role in steroidogenesis, trafficking cholesterol into mitochondria. It specifically binds cholesterol, and demonstrates limited binding to another sterol, 7a-hydroxycholesterol. STARD4 and STARD5 are ubiquitously expressed, with highest levels in liver and kidney. STARD5 functions in the kidney within the proximal tubule cells where it is associated with the endoplasmic reticulum (ER), and may participate in ER-associated cholesterol transport. It binds cholesterol and 25-hydroxycholesterol. Expression of the gene encoding STARD5 is increased by ER stress, and its mRNA and protein levels are elevated in a type I diabetic mouse model of human diabetic nephropathy. STARD6 is expressed in male germ cells of normal rats, and in the steroidogenic Leydig cells of perinatal hypothyroid testes. It may play a pivotal role in steroidogenesis as well as in spermatogenesis of normal rats. STARD6 has also been detected in the rat nervous system, and may participate in neurosteroid synthesis.	206
176877	cd08868	START_STARD1_3_like	Cholesterol-binding START domain of mammalian STARD1, -3 and related proteins. This subfamily includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of STARD1 (also known as StAR) and STARD3 (also known as metastatic lymph node 64/MLN64). The START domain family belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. This STARD1-like subfamily has a high affinity for cholesterol. STARD1/StAR can reduce macrophage lipid content and inflammatory status. It plays an essential role in steroidogenic tissues: transferring the steroid precursor, cholesterol, from the outer to the inner mitochondrial membrane, across the aqueous space. Mutations in the gene encoding STARD1/StAR can cause lipoid congenital adrenal hyperplasia (CAH), an autosomal recessive disorder characterized by a steroid synthesis deficiency and an accumulation of cholesterol in the adrenal glands and the gonads. STARD3 may function in trafficking endosomal cholesterol to a cytosolic acceptor or membrane. In addition to having a cytoplasmic START cholesterol-binding domain, STARD3 also contains an N-terminal MENTAL cholesterol-binding and protein-protein interaction domain. The MENTAL domain contains transmembrane helices and anchors MLN64 to endosome membranes. The gene encoding STARD3 is overexpressed in about 25% of breast cancers.	208
176878	cd08869	START_RhoGAP	C-terminal lipid-binding START domain of mammalian STARD8, -12, -13 and related proteins, which also have an N-terminal Rho GTPase-activating protein (RhoGAP) domain. This subfamily includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of STARD8 (also known as deleted in liver cancer 3/DLC3, and Arhgap38), STARD12 (also known as DLC-1, Arhgap7, and p122-RhoGAP), and STARD13 (also known as DLC-2, Arhgap37, and SDCCAG13). The START domain family belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. Proteins belonging to this subfamily also have a RhoGAP domain. Some, including STARD12 and -13, also have an N-terminal SAM (sterile alpha motif) domain; these have a SAM-RhoGAP-START domain organization. This subfamily is involved in cancer development. A large spectrum of cancers have dysregulated genes encoding these proteins. The precise function of the START domain in this subfamily is unclear.	197
176879	cd08870	START_STARD2_7-like	Lipid-binding START domain of mammalian STARD2, -7, and related proteins. This subfamily includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of STARD2 (also known as phosphatidylcholine transfer protein/PC-TP), and STARD7 (also known as gestational trophoblastic tumor 1/GTT1). The START domain family belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. STARD2 is a cytosolic phosphatidylcholine (PtdCho) transfer protein, which traffics PtdCho, the most common class of phospholipids in eukaryotes, between membranes. It represents a minimal START domain structure. STARD2 plays roles in hepatic cholesterol metabolism, in the development of atherosclerosis, and may also have a mitochondrial function. The gene encoding STARD7 is overexpressed in choriocarcinoma. STARD7 appears to be involved in the intracellular trafficking of PtdCho to mitochondria. STARD7 was shown to be surface active and to interact differentially with phospholipid monolayers. It showed a preference for phosphatidylserine, cholesterol, and phosphatidylglycerol.	209
176880	cd08871	START_STARD10-like	Lipid-binding START domain of mammalian STARD10 and related proteins. This subfamily includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD10 (also known as CGI-52, PTCP-like, and SDCCAG28). The START domain family belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. STARD10 binds phosphatidylcholine and phosphatidylethanolamine. This protein is widely expressed and is synthesized constitutively in many organs. It may function in the liver in the export of phospholipids into bile. It is concentrated in the sperm flagellum, and may play a role in energy metabolism. In the mammary gland it may participate in the enrichment of lipids in milk, and be a potential marker of differentiation. Its expression is induced in this gland during gestation and lactation. It is overexpressed in mammary tumors from Neu/ErbB2 transgenic mice, in several breast carcinoma cell lines, and in 35% of primary human breast cancers, and may cooperate with c-erbB receptor signaling in breast oncogenesis. It is a potential marker of disease outcome in breast cancer; loss of STARD10 expression in breast cancer strongly predicts an aggressive disease course. The lipid transfer activity of STARD10 is downregulated by phosphorylation of its Ser284 by CK2 (casein kinase 2).	222
176881	cd08872	START_STARD11-like	Ceramide-binding START domain of mammalian STARD11 and related domains. This subfamily includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD11 and related domains. The START domain family belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. STARD11 can mediate transfer of the natural ceramide isomers, dihydroceramide and phytoceramide, as well as ceramides having C14, C16, C18, and C20 chains. They can also transfer diacylglycerol, but with a lower efficiency. STARD11 is synthesized from two major transcripts: a larger one encoding Goodpasture antigen-binding protein (GPBP)/ceramide transporter long form (CERTL); and a smaller one encoding GPBPdelta26/CERT, which is deleted for 26 amino acids. Both splicing variants mediate ceramide transfer from the ER to the Golgi, in a non-vesicular manner. It is likely that these two carry out different functions in specific sub-cellular locations. These proteins have roles in brain homeostasis and disease processes. GPBP/CERTL exists in multiple isoforms originating from alternative translation initiation sites and post-translational modifications. Goodpasture syndrome is a human disorder caused by antibodies directed against the a3-chain of collagen type IV. GPBP/CERTL binds and phosphorylates this antigen. The human gene encoding STARD11 is referred to as COL4A3BP, reflecting its collagen binding function. It is unknown whether the ceramide-transfer function of GPBP/CERTL is related to this collagen interaction. The expression of GPBP/CERTL is elevated in these and other spontaneous autoimmune disorders including cutaneous lupus erythematosus, pemphigoid, and lichen planus. GPBP/CERTL contains an N-terminal pleckstrin homology domain (PH), which targets the protein to the Golgi, a middle region containing two serine-rich domains (SR1, SR2), a FFAT (two phenylalanine amino acids in an acidic tract) motif which is involved in endoplasmic reticulum targeting, and this C-terminal START domain. The shorter splicing variant, CERT, lacks the SR2 domain.	235
176882	cd08873	START_STARD14_15-like	Lipid-binding START domain of mammalian STARD14, -15, and related proteins. This subfamily includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian brown fat-inducible STARD14 (also known as Acyl-Coenzyme A Thioesterase 11 or ACOT11, BFIT, THEA, THEM1, KIAA0707, and MGC25974), STARD15/ACOT12 (also known as cytoplasmic acetyl-CoA hydrolase/CACH, THEAL, and MGC105114), and related domains. The START domain family belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. STARD14/ACOT11 and STARD15/ACOT12 are type II acetyl-CoA thioesterases; they catalyze the hydrolysis of acyl-CoAs to free fatty acid and CoASH. Human STARD14 displays acetyl-CoA thioesterase activity towards medium(C12)- and long(C16)-chain fatty acyl-CoA substrates. Rat CACH hydrolyzes acetyl-CoA to acetate and CoA. In addition to having a START domain, STARD14 and STARD15 each have two tandem copies of the hotdog domain. There are two splice variants of human STARD14, named BFIT1 and BFIT2, which differ in their C-termini. Human BFIT2 is equivalent to mouse mBFIT/Acot11, whose transcription is increased twofold in obesity-resistant mice compared with obesity-prone mice. Human STARD15 may have roles in cholesterol metabolism and in beta-oxidation.	235
176883	cd08874	START_STARD9-like	C-terminal START domain of mammalian STARD9, and related domains; lipid binding. This subfamily includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD9 (also known as KIAA1300), and related domains. The START domain family belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. Some members of this subfamily have N-terminal kinesin motor domains. STARD9 interacts with supervillin, a protein important for efficient cytokinesis, perhaps playing a role in coordinating microtubule motors with actin and myosin II functions at membranes. The human gene encoding STARD9 lies within a target region for LGMD2A, an autosomal recessive form of limb-girdle muscular dystrophy.	205
176884	cd08875	START_ArGLABRA2_like	C-terminal lipid-binding START domain of the Arabidopsis homeobox protein GLABRA 2 and related proteins. This subfamily includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of the Arabidopsis homeobox protein GLABRA 2 and related proteins. The START domain family belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. Most proteins in this subgroup contain an N-terminal homeobox DNA-binding domain, some contain a leucine zipper. ArGLABRA2 plays a role in the differentiation of hairless epidermal cells of the Arabidopsis root. It acts in a cell-position-dependent manner to suppress root hair formation in those cells.	229
176885	cd08876	START_1	Uncharacterized subgroup of the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domain family. Functionally uncharacterized subgroup of the START domain family. The START domain family belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. For some mammalian members of the START family (STARDs), it is known which lipids bind in this pocket; these include cholesterol (STARD1, -3, -4, and -5), 25-hydroxycholesterol (STARD5), phosphatidylcholine (STARD2, -7, and -10), phosphatidylethanolamine (STARD10) and ceramides (STARD11). Mammalian STARDs participate in the control of various cellular processes, including lipid trafficking between intracellular compartments, lipid metabolism, and modulation of signaling events. Mutation or altered expression of STARDs is linked to diseases such as cancer, genetic disorders, and autoimmune disease.	195
176886	cd08877	START_2	Uncharacterized subgroup of the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domain family. Functionally uncharacterized subgroup of the START domain family. The START domain family belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. For some mammalian members of the START family (STARDs), it is known which lipids bind in this pocket; these include cholesterol (STARD1, -3, -4, and -5), 25-hydroxycholesterol (STARD5), phosphatidylcholine (STARD2, -7, and -10), phosphatidylethanolamine (STARD10) and ceramides (STARD11). Mammalian STARDs participate in the control of various cellular processes, including lipid trafficking between intracellular compartments, lipid metabolism, and modulation of signaling events. Mutation or altered expression of STARDs is linked to diseases such as cancer, genetic disorders, and autoimmune disease.	215
176887	cd08878	RHO_alpha_C_DMO-like	C-terminal catalytic domain of the oxygenase alpha subunit of dicamba O-demethylase and related aromatic ring hydroxylating dioxygenases. C-terminal catalytic domain of the oxygenase alpha subunit of Stenotrophomonas maltophilia dicamba O-demethylase (DMO) and related Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs, also known as aromatic ring hydroxylating dioxygenases). RHOs utilize non-heme Fe(II) to catalyze the addition of hydroxyl groups to the aromatic ring, an initial step in the oxidative degradation of aromatic compounds. RHOs are composed of either two or three protein components, and are comprised of an electron transport chain (ETC) and an oxygenase. The ETC transfers reducing equivalents from the electron donor to the oxygenase component, which in turn transfers electrons to the oxygen molecules. The oxygenase components are oligomers, either (alpha)n or (alpha)n(beta)n. The alpha subunits are the catalytic components and have an N-terminal domain, which binds a Rieske-like 2Fe-2S cluster, and the C-terminal catalytic domain which binds the non-heme Fe(II). The Fe(II) is co-ordinated by conserved His and Asp residues. Oxygenases belonging to this subgroup include the alpha subunits of carbazole 1,9a-dioxygenase, phthalate dioxygenase, vanillate O-demethylase, Pseudomonas putida 2-oxoquinoline 8-monooxygenase, and Comamonas testosteroni T-2 p-toluenesulfonate dioxygenase. It also includes the C-terminal domain of the lignin biphenyl-specific O-demethylase (LigX) of the 5,5'-dehydrodivanillic acid O- demethylation system of Sphingomonas paucimobilis SYK-6. This subfamily belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket.	196
176888	cd08879	RHO_alpha_C_AntDO-like	C-terminal catalytic domain of the oxygenase alpha subunit of Pseudomonas resinovorans strain CA10 anthranilate 1,2-dioxygenase and related aromatic ring hydroxylating dioxygenases. C-terminal catalytic domain of the oxygenase alpha subunit of anthranilate 1,2-dioxygenase (AntDO) and related Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs, also known as aromatic ring hydroxylating dioxygenases). RHOs utilize non-heme Fe(II) to catalyze the addition of hydroxyl groups to the aromatic ring, an initial step in the oxidative degradation of aromatic compounds. RHOs are composed of either two or three protein components, and are comprised of an electron transport chain (ETC) and an oxygenase. The ETC transfers reducing equivalents from the electron donor to the oxygenase component, which in turn transfers electrons to the oxygen molecules. The oxygenase components are oligomers, either (alpha)n or (alpha)n(beta)n. The alpha subunits are the catalytic components and have an N-terminal domain, which binds a Rieske-like 2Fe-2S cluster, and the C-terminal catalytic domain which binds the non-heme Fe(II). The Fe(II) is co-ordinated by conserved His and Asp residues. Oxygenases belonging to this subgroup include the alpha subunits of AntDO, aniline dioxygenase, Acinetobacter calcoaceticus benzoate 1,2-dioxygenase, 2-halobenzoate 1,2-dioxygenase from Pseudomonas cepacia 2CBS, 2,4,5-trichlorophenoxyacetic acid oxygenase from Pseudomonas cepacia AC1100, 2,4-dichlorophenoxyacetic acid oxygenase from Bradyrhizobium sp. strain HW13, p-cumate 2,3-dioxygenase, and Pseudomonas putida IacC, which may be involved in the catabolism of the plant hormone indole 3-acetic acid. This subfamily belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket.	237
176889	cd08880	RHO_alpha_C_ahdA1c-like	C-terminal catalytic domain of the large/alpha subunit (ahdA1c) of a ring-hydroxylating dioxygenase from Sphingomonas sp. strain P2 and related proteins. C-terminal catalytic domain of the large subunit (ahdA1c) of the AhdA3A4A2cA1c salicylate 1-hydroxylase complex from Sphingomonas sp. strain P2, and related Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs, also known as aromatic ring hydroxylating dioxygenases). AhdA3A4A2cA1c is one of three known isofunctional salicylate 1-hydroxylase complexes in strain P2, involved in phenanthrene degradation, which catalyze the monooxygenation of salicylate, the metabolite of phenanthrene degradation, to produce catechol. This complex prefers salicylate over other substituted salicylates; the other two salicylate 1-hydroxylases have different substrate preferences. RHOs utilize non-heme Fe(II) to catalyze the addition of hydroxyl groups to the aromatic ring, an initial step in the oxidative degradation of aromatic compounds. RHOs are composed of either two or three protein components, and are comprised of an electron transport chain (ETC) and an oxygenase. The ETC transfers reducing equivalents from the electron donor to the oxygenase component, which in turn transfers electrons to the oxygen molecules. The oxygenase components are oligomers, either (alpha)n or (alpha)n(beta)n. The alpha subunits are the catalytic components and have an N-terminal domain, which binds a Rieske-like 2Fe-2S cluster, and a C-terminal domain which binds the non-heme Fe(II). The Fe(II) is co-ordinated by conserved His and Asp residues. Other oxygenases belonging to this subgroup include the alpha subunits of anthranilate 1,2-dioxygenase from Burkholderia cepacia DBO1, a polycyclic aromatic hydrocarbon dioxygenase from Cycloclasticus sp. strain A5 (PhnA dioxygenase), salicylate-5-hydroxylase from Ralstonia sp. U2, ortho-halobenzoate 1,2-dioxygenase from Pseudomonas aeruginosa strain JB2, and the terephthalate 1,2-dioxygenase system from Delftia tsuruhatensis strain T7. This subfamily belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket.	222
176890	cd08881	RHO_alpha_C_NDO-like	C-terminal catalytic domain of the oxygenase alpha subunit of naphthalene 1,2-dioxygenase (NDO) and related aromatic ring hydroxylating dioxygenases. C-terminal catalytic domain of the oxygenase alpha subunit of naphthalene 1,2-dioxygenase (NDO) and related Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs, also known as aromatic ring hydroxylating dioxygenases). This domain binds non-heme Fe(II). RHOs utilize non-heme Fe(II) to catalyze the addition of hydroxyl groups to the aromatic ring, an initial step in the oxidative degradation of aromatic compounds. RHOs are composed of either two or three protein components, and are comprised of an electron transport chain (ETC) and an oxygenase. The ETC transfers reducing equivalents from the electron donor to the oxygenase component, which in turn transfers electrons to the oxygen molecules. The oxygenase components are oligomers, either (alpha)n or (alpha)n(beta)n. The alpha subunits are the catalytic components and have an N-terminal domain, which binds a Rieske-like 2Fe-2S cluster, and a C-terminal domain which binds the non-heme Fe(II). The Fe(II) is co-ordinated by conserved His and Asp residues. Proteins belonging to this subgroup include the terminal oxygenase alpha subunits of biphenyl dioxygenase, cumene dioxygenase from Pseudomonas fluorescens IP01, ethylbenzene dioxygenase, naphthalene 1,2-dioxygenase, nitrobenzene dioxygenase from Comamonas sp. strain JS765, toluene 2,3-dioxygenase from Pseudomonas putida F1, dioxin dioxygenase of Sphingomonas sp. strain RW1, and the polycyclic aromatic hydrocarbon (PAH)-degrading ring-hydroxylating dioxygenase from Sphingomonas CHY-1. This subfamily belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket.	206
176891	cd08882	RHO_alpha_C_MupW-like	C-terminal catalytic domain of Pseudomonas fluorescens MupW and related aromatic ring hydroxylating dioxygenases. C-terminal catalytic domain of the oxygenase alpha subunit of Pseudomonas fluorescens MupW and related Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs, also known as aromatic ring hydroxylating dioxygenases). RHOs utilize non-heme Fe(II) to catalyze the addition of hydroxyl groups to the aromatic ring, an initial step in the oxidative degradation of aromatic compounds. RHOs are composed of either two or three protein components, and are comprised of an electron transport chain (ETC) and an oxygenase. The ETC transfers reducing equivalents from the electron donor to the oxygenase component, which in turn transfers electrons to the oxygen molecules. The oxygenase components are oligomers, either (alpha)n or (alpha)n(beta)n. The alpha subunits are the catalytic components and have an N-terminal domain, which binds a Rieske-like 2Fe-2S cluster, and a C-terminal domain which binds the non-heme Fe(II). The Fe(II) is co-ordinated by conserved His and Asp residues. MupW is part of the mupirocin biosynthetic gene cluster in Pseudomonas fluorescens, and may catalyze the oxidation of the 16-methyl group during biosynthesis of this polyketide antibiotic. Mupirocin is a mixture of pseudomonic acids which targets isoleucyl-tRNA synthetase and is a strong inhibitor of Gram-positive bacterial and mycoplasmal pathogens. This subfamily belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket.	243
176892	cd08883	RHO_alpha_C_CMO-like	C-terminal catalytic domain of plant choline monooxygenase (CMO) and related aromatic ring hydroxylating dioxygenases. C-terminal catalytic domain of plant choline monooxygenase and related Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs, also known as aromatic ring hydroxylating dioxygenases). RHOs utilize non-heme Fe(II) to catalyze the addition of hydroxyl groups to the aromatic ring, an initial step in the oxidative degradation of aromatic compounds. RHOs are composed of either two or three protein components, and are comprised of an electron transport chain (ETC) and an oxygenase. The ETC transfers reducing equivalents from the electron donor to the oxygenase component, which in turn transfers electrons to the oxygen molecules. The oxygenase components are oligomers, either (alpha)n or (alpha)n(beta)n. The alpha subunits are the catalytic components and have an N-terminal domain, which binds a Rieske-like 2Fe-2S cluster, and a C-terminal domain which binds the non-heme Fe(II). The Fe(II) is co-ordinated by conserved His and Asp residues. Plant choline monooxygenase catalyzes the first step in a two-step oxidation of choline to the osmoprotectant glycine betaine. This subfamily belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket.	175
176893	cd08884	RHO_alpha_C_GbcA-like	C-terminal catalytic domain of GbcA (glycine betaine catabolism A) from Pseudomonas aeruginosa PAO1 and related aromatic ring hydroxylating dioxygenases. C-terminal catalytic domain of GbcA (glycine betaine catabolism A) from Pseudomonas aeruginosa PAO1 and related Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs, also known as aromatic ring hydroxylating dioxygenases). RHOs utilize non-heme Fe(II) to catalyze the addition of hydroxyl groups to the aromatic ring, an initial step in the oxidative degradation of aromatic compounds. RHOs are composed of either two or three protein components, and are comprised of an electron transport chain (ETC) and an oxygenase. The ETC transfers reducing equivalents from the electron donor to the oxygenase component, which in turn transfers electrons to the oxygen molecules. The oxygenase components are oligomers, either (alpha)n or (alpha)n(beta)n. The alpha subunits are the catalytic components and have an N-terminal domain, which binds a Rieske-like 2Fe-2S cluster, and a C-terminal domain which binds the non-heme Fe(II). The Fe(II) is co-ordinated by conserved His and Asp residues. GbcA is involved in glycine betaine (GB) catabolism in Pseudomonas aeruginosa; it may remove a methyl group from GB via a dioxygenase mechanism, producing dimethylglycine and formaldehyde. This subfamily belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket.	205
176894	cd08885	RHO_alpha_C_1	C-terminal catalytic domain of the oxygenase alpha subunit of an uncharacterized subgroup of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases. C-terminal catalytic domain of the oxygenase alpha subunit of a functionally uncharacterized subgroup of the Rieske-type non-heme iron aromatic ring-hydroxylating oxygenase (RHO) family. RHOs, also known as aromatic ring hydroxylating dioxygenases, utilize non-heme Fe(II) to catalyze the addition of hydroxyl groups to the aromatic ring, an initial step in the oxidative degradation of aromatic compounds. RHOs are composed of either two or three protein components, and are comprised of an electron transport chain (ETC) and an oxygenase. The ETC transfers reducing equivalents from the electron donor to the oxygenase component, which in turn transfers electrons to the oxygen molecules. The oxygenase components are oligomers, either (alpha)n or (alpha)n(beta)n. The alpha subunits are the catalytic components and have an N-terminal domain, which binds a Rieske-like 2Fe-2S cluster, and a C-terminal domain which binds the non-heme Fe(II). The Fe(II) is co-ordinated by conserved His and Asp residues. This group contains two putative Parvibaculum lavamentivorans (T) DS-1 oxygenases; this organism catabolizes commercial linear alkylbenzenesulfonate surfactant (LAS) and other surfactants, by a pathway involving an undefined 'omega-oxygenation' and beta-oxidation of the LAS side chain. The nature of the LAS-oxygenase is unknown but is likely a multicomponent system. This subfamily belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket.	190
176895	cd08886	RHO_alpha_C_2	C-terminal catalytic domain of the oxygenase alpha subunit of an uncharacterized subgroup of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases. C-terminal catalytic domain of the oxygenase alpha subunit of a functionally uncharacterized subgroup of the Rieske-type non-heme iron aromatic ring-hydroxylating oxygenase (RHO) family. RHOs, also known as aromatic ring hydroxylating dioxygenases, utilize non-heme Fe(II) to catalyze the addition of hydroxyl groups to the aromatic ring, an initial step in the oxidative degradation of aromatic compounds. RHOs are composed of either two or three protein components, and are comprised of an electron transport chain (ETC) and an oxygenase. The ETC transfers reducing equivalents from the electron donor to the oxygenase component, which in turn transfers electrons to the oxygen molecules. The oxygenase components are oligomers, either (alpha)n or (alpha)n(beta)n. The alpha subunits are the catalytic components and have an N-terminal domain, which binds a Rieske-like 2Fe-2S cluster, and a C-terminal domain which binds the non-heme Fe(II). The Fe(II) is co-ordinated by conserved His and Asp residues. This subfamily belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket.	182
176896	cd08887	RHO_alpha_C_3	C-terminal catalytic domain of the oxygenase alpha subunit of an uncharacterized subgroup of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases. C-terminal catalytic domain of the oxygenase alpha subunit of a functionally uncharacterized subgroup of the Rieske-type non-heme iron aromatic ring-hydroxylating oxygenase (RHO) family. RHOs, also known as aromatic ring hydroxylating dioxygenases, utilize non-heme Fe(II) to catalyze the addition of hydroxyl groups to the aromatic ring, an initial step in the oxidative degradation of aromatic compounds. RHOs are composed of either two or three protein components, and are comprised of an electron transport chain (ETC) and an oxygenase. The ETC transfers reducing equivalents from the electron donor to the oxygenase component, which in turn transfers electrons to the oxygen molecules. The oxygenase components are oligomers, either (alpha)n or (alpha)n(beta)n. The alpha subunits are the catalytic components and have an N-terminal domain, which binds a Rieske-like 2Fe-2S cluster, and a C-terminal domain which binds the non-heme Fe(II). The Fe(II) is co-ordinated by conserved His and Asp residues. This group contains a putative Parvibaculum lavamentivorans (T) DS-1 oxygenase; this organism catabolizes commercial linear alkylbenzenesulfonate surfactant (LAS) and other surfactants, by a pathway involving an undefined 'omega-oxygenation' and beta-oxidation of the LAS side chain. The nature of the LAS-oxygenase is unknown but is likely a multicomponent system. This subfamily belongs to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket.	185
176897	cd08888	SRPBCC_PITPNA-B_like	Lipid-binding SRPBCC domain of mammalian PITPNA, -B, and related proteins (Class I PITPs). This subgroup includes the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain of mammalian Class 1 phosphatidylinositol transfer proteins (PITPs), PITPNA/PITPalpha and PITPNB/PITPbeta, Drosophila vibrator, and related proteins. These are single domain proteins belonging to the PITP family of lipid transfer proteins, and to the SRPBCC domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. In vitro, PITPs bind phosphatidylinositol (PtdIns), as well as phosphatidylcholine (PtdCho) but with a lower affinity. They transfer these lipids from one membrane compartment to another. The cellular roles of PITPs include inositol lipid signaling, PtdIns metabolism, and membrane trafficking. In addition, PITPNB transfers sphingomyelin in vitro, with a low affinity. PITPNA is found chiefly in the nucleus and cytoplasm; it is enriched in the brain and predominantly localized in the axons. A reduced expression of PITPNA contributes to the neurodegenerative phenotype of the mouse vibrator mutation. The role of PITPNA in vivo may be to provide PtdIns for localized PI3K-dependent signaling, thereby controlling the polarized extension of axonal processes. PITPNA homozygous null mice die soon after birth from complicated organ failure, including intestinal and hepatic steatosis, hypoglycemia, and spinocerebellar disease. PITPNB is associated with the Golgi and ER, and is highly expressed in the liver. Deletion of the PITPNB gene results in embryonic lethality. The PtdIns and PtdCho exchange activity of PITPNB is required for COPI-mediated retrograde transport from the Golgi to the ER. Drosophila vibrator localizes to the ER, and has an essential role in cytokinesis during mitosis and meiosis.	258
176898	cd08889	SRPBCC_PITPNM1-2_like	Lipid-binding SRPBCC domain of mammalian PITPNM1-2 and related proteins (Class IIA PITPs). This subgroup includes an N-terminal SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain of mammalian Class II phosphatidylinositol transfer proteins (PITPs), PITPNM1/PITPalphaI/Nir2 (PYK2 N-terminal domain-interacting receptor 2) and PITPNM2/PITPalphaII/Nir3, Drosophila RdgB, and related proteins. These are membrane associated multidomain proteins belonging to the PITP family of lipid transfer proteins, and to the SRPBCC domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. In vitro, PITPs bind phosphatidylinositol (PtdIns), as well as phosphatidylcholine (PtdCho) but with a lower affinity. They transfer these lipids from one membrane compartment to another. The cellular roles of PITPs include inositol lipid signaling, PtdIns metabolism, and membrane trafficking. Ablation of the mouse gene encoding PITPNM1 results in early embryonic death. PITPNM1 is localized chiefly to the Golgi apparatus, and under certain conditions translocates to the lipid droplets. Targeting to the latter is dependent on a specific threonine residue within the SRPBCC domain. PITPNM1 plays a part in Golgi-mediated transport. It regulates diacylglycerol (DAG) production at the trans-Golgi network (TGN) via the CDP-choline pathway. Drosophila RdgB, the founding member of the PITP family, is implicated in visual and olfactory transduction. RdgB is required for maintenance of ultrastructure in photoreceptors and for sensory transduction. The mouse PITPNM1 gene rescues the phenotype of Drosophila rdgB mutant flies. In addition to the SRPBCC domain, PITPNM1 and -2 contain a Rho-inhibitory domain (Rid), six hydrophobic stretches, a DDHD calcium binding region, and a C-terminal tyrosine kinase Pyk2-binding / HAD-like phosphohydrolase domain. PITPNM1 has a role in regulating cell morphogenesis through its Rho inhibitory domain (Rid). This SRPBCC_PITPNM1-2_like domain model includes the first 52 residues of the 224-residue Rid (Rho-inhibitory domain).	260
176899	cd08890	SRPBCC_PITPNC1_like	Lipid-binding SRPBCC domain of mammalian PITPNC1, and related proteins (Class IIB PITPs). This subgroup includes the N-terminal SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain of mammalian Class IIB phosphatidylinositol transfer protein (PITP), PITPNC1/RdgBbeta, and related proteins. These are metazoan proteins belonging to the PITP family of lipid transfer proteins, and to the SRPBCC domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. In vitro, PITPs bind phosphatidylinositol (PtdIns), as well as phosphatidylcholine (PtdCho) but with a lower affinity. They transfer these lipids from one membrane compartment to another. The cellular roles of PITPs include inositol lipid signaling, PtdIns metabolism, and membrane trafficking. Mammalian PITPNC1 contains an amino-terminal SRPBCC PITP-like domain and a short carboxyl-terminal domain. It is a cytoplasmic protein, and is ubiquitously expressed. It can transfer phosphatidylinositol (PtdIns) in vitro with a similar ability to other PITPs.	250
176900	cd08891	SRPBCC_CalC	Ligand-binding SRPBCC domain of Micromonospora echinospora CalC and related proteins. This subfamily includes Micromonospora echinospora CalC (MeCalC) and related proteins. These proteins belong to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins which bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. MeCalC confers resistance to the enediyne, calicheamicin gamma 1 (CLM). Enediyne antibiotics are antitumor agents. Enediynes have an in vitro and in vivo role as DNA damaging agents; they consist of a DNA recognition unit (e.g., aryltetrasaccharide of CLM), an activating component (e.g., methyl trisulfide of CLM), which promotes cycloaromatization, and the enediyne warhead which cycloaromatizes to a reactive diradical species, resulting in oxidative strand cleavage of the targeted DNA sequence. MeCalC confers resistance to CLM by a self-sacrificing mechanism: the transient enediyne diradical species abstracts a CalC Gly Calpha-hydrogen, thereby quenching the reactive enediyne moiety, and generating a CalC Gly Calpha radical. This radical then reacts with oxygen, leading to oxidative site-specific proteolysis of CalC. This antibiotic-induced proteolysis of CalC results in inactivation of both CalC and the highly reactive diradical species. CalC has also been shown to inactivate two other enediynes, shishijimicin and namenamicin. The crucial Gly of the MeCalC CLM resistance mechanism is contained in a loop (L1) which is displaced when CLM is bound; this Gly is not conserved in this subgroup.	149
176901	cd08892	SRPBCC_Aha1	Putative hydrophobic ligand-binding SRPBCC domain of the Hsp90 co-chaperone Aha1 and related proteins. This subfamily includes the C-terminal SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain of Aha1, and related domains. Proteins in this group belong to the SRPBCC domain superfamily of proteins which bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. Aha1 is one of several co-chaperones, which regulate the dimeric chaperone Hsp90. Hsp90, Aha1, and other accessory proteins interact in a chaperone cycle driven by ATP binding and hydrolysis. Aha1 promotes dimerization of the N-terminal domains of Hsp90, and stimulates its low intrinsic ATPase activity. One Aha1 molecule binds per Hsp90 dimer. The N- and C- terminal domains of Aha1 cooperatively bind across the dimer interface of Hsp90. The C-terminal domain of Aha1 binds the N-terminal Hsp90 ATPase domain. Aha1 may regulate the dwell time of Hsp90 with client proteins. Aha1 may act as either a negative or positive regulator of chaperone-dependent activation, depending on the client protein; for example, it acts as a negative regulator in the case of Saccharomyces cerevisiae MAL63 MAL-activator, and acts as a positive regulator in the case of glucocorticoid receptor and v-Src kinase. The mechanisms by which these opposing functions are achieved are unclear. Aha1 is upregulated in a number of tumor lines co-incident with the activation of several signaling kinases.	126
176902	cd08893	SRPBCC_CalC_Aha1-like_GntR-HTH	Putative hydrophobic ligand-binding SRPBCC domain of an uncharacterized subgroup of CalC- and Aha1-like proteins; some contain an N-terminal GntR family winged HTH DNA-binding domain. SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain of a functionally uncharacterized subgroup of CalC- and Aha1-like proteins. This group shows similarity to the SRPBCC domains of Micromonospora echinospora CalC (a protein which confers resistance to enediynes) and human Aha1 (one of several co-chaperones which regulate the dimeric chaperone Hsp90), and belongs to the SRPBCC domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands. Some proteins in this subgroup contain an N-terminal winged helix-turn-helix DNA-binding domain found in the GntR family of proteins which include bacterial transcriptional regulators and their putative homologs from eukaryota and archaea.	136
176903	cd08894	SRPBCC_CalC_Aha1-like_1	Putative hydrophobic ligand-binding SRPBCC domain of an uncharacterized subgroup of CalC- and Aha1-like proteins. SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain of a functionally uncharacterized subgroup of CalC- and Aha1-like proteins. This group shows similarity to the SRPBCC domains of Micromonospora echinospora CalC (a protein which confers resistance to enediynes) and human Aha1 (one of several co-chaperones which regulate the dimeric chaperone Hsp90), and belongs to the SRPBCC domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands.	139
176904	cd08895	SRPBCC_CalC_Aha1-like_2	Putative hydrophobic ligand-binding SRPBCC domain of an uncharacterized subgroup of CalC- and Aha1-like proteins. SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain of a functionally uncharacterized subgroup of CalC- and Aha1-like proteins. This group shows similarity to the SRPBCC domains of Micromonospora echinospora CalC (a protein which confers resistance to enediynes) and human Aha1 (one of several co-chaperones which regulate the dimeric chaperone Hsp90), and belongs to the SRPBCC domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands.	146
176905	cd08896	SRPBCC_CalC_Aha1-like_3	Putative hydrophobic ligand-binding SRPBCC domain of an uncharacterized subgroup of CalC- and Aha1-like proteins. SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain of a functionally uncharacterized subgroup of CalC- and Aha1-like proteins. This group shows similarity to the SRPBCC domains of Micromonospora echinospora CalC (a protein which confers resistance to enediynes) and human Aha1 (one of several co-chaperones which regulate the dimeric chaperone Hsp90), and belongs to the SRPBCC domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands.	146
176906	cd08897	SRPBCC_CalC_Aha1-like_4	Putative hydrophobic ligand-binding SRPBCC domain of an uncharacterized subgroup of CalC- and Aha1-like proteins. SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain of a functionally uncharacterized subgroup of CalC- and Aha1-like proteins. This group shows similarity to the SRPBCC domains of Micromonospora echinospora CalC (a protein which confers resistance to enediynes) and human Aha1 (one of several co-chaperones which regulate the dimeric chaperone Hsp90), and belongs to the SRPBCC domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands.	133
176907	cd08898	SRPBCC_CalC_Aha1-like_5	Putative hydrophobic ligand-binding SRPBCC domain of an uncharacterized subgroup of CalC- and Aha1-like proteins. SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain of a functionally uncharacterized subgroup of CalC- and Aha1-like proteins. This group shows similarity to the SRPBCC domains of Micromonospora echinospora CalC (a protein which confers resistance to enediynes) and human Aha1 (one of several co-chaperones which regulate the dimeric chaperone Hsp90), and belongs to the SRPBCC domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands.	145
176908	cd08899	SRPBCC_CalC_Aha1-like_6	Putative hydrophobic ligand-binding SRPBCC domain of an uncharacterized subgroup of CalC- and Aha1-like proteins. SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain of a functionally uncharacterized subgroup of CalC- and Aha1-like proteins. This group shows similarity to the SRPBCC domains of Micromonospora echinospora CalC (a protein which confers resistance to enediynes) and human Aha1 (one of several co-chaperones which regulate the dimeric chaperone Hsp90), and belongs to the SRPBCC domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands.	157
176909	cd08900	SRPBCC_CalC_Aha1-like_7	Putative hydrophobic ligand-binding SRPBCC domain of an uncharacterized subgroup of CalC- and Aha1-like proteins. SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain of a functionally uncharacterized subgroup of CalC- and Aha1-like proteins. This group shows similarity to the SRPBCC domains of Micromonospora echinospora CalC (a protein which confers resistance to enediynes) and human Aha1 (one of several co-chaperones which regulate the dimeric chaperone Hsp90), and belongs to the SRPBCC domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands.	143
176910	cd08901	SRPBCC_CalC_Aha1-like_8	Putative hydrophobic ligand-binding SRPBCC domain of an uncharacterized subgroup of CalC- and Aha1-like proteins. SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain of a functionally uncharacterized subgroup of CalC- and Aha1-like proteins. This group shows similarity to the SRPBCC domains of Micromonospora echinospora CalC (a protein which confers resistance to enediynes) and human Aha1 (one of several co-chaperones which regulate the dimeric chaperone Hsp90), and belongs to the SRPBCC domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket and they bind diverse ligands.	136
176911	cd08902	START_STARD4-like	Lipid-binding START domain of mammalian STARD4 and related proteins. This subgroup includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD4 and related domains. It belongs to the START domain family, and in turn to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. STARD4 plays an important role in steroidogenesis, trafficking cholesterol into mitochondria. It specifically binds cholesterol, and demonstrates limited binding to another sterol, 7alpha-hydroxycholesterol. STARD4 is ubiquitously expressed, with highest levels in liver and kidney.	202
176912	cd08903	START_STARD5-like	Lipid-binding START domain of mammalian STARD5 and related proteins. This subgroup includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD5, and related domains. It belongs to the START domain family, and in turn to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. STARD5 is ubiquitously expressed, with highest levels in liver and kidney. STARD5 functions in the kidney within the proximal tubule cells where it is associated with the Endoplasmic Reticulum (ER), and may participate in ER-associated cholesterol transport. It binds cholesterol and 25-hydroxycholesterol. Expression of the gene encoding STARD5 is increased by ER stress, and its mRNA and protein levels are elevated in a type I diabetic mouse model of human diabetic nephropathy.	208
176913	cd08904	START_STARD6-like	Lipid-binding START domain of mammalian STARD6 and related proteins. This subgroup includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD6 and related domains. It belongs to the START domain family, and in turn to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. STARD6 is expressed in male germ cells of normal rats, and in the steroidogenic Leydig cells of perinatal hypothyroid testes. It may play a pivotal role in steroidogenesis as well as in spermatogenesis in normal rats. STARD6 has also been detected in the rat nervous system, and may participate in neurosteroid synthesis.	204
176914	cd08905	START_STARD1-like	Cholesterol-binding START domain of mammalian STARD1 and related proteins. This subgroup includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of STARD1 (also known as StAR) and related proteins. It belongs to the START domain family, and in turn to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. STARD1 has a high affinity for cholesterol. It can reduce macrophage lipid content and inflammatory status. It plays an essential role in steroidogenic tissues: transferring the steroid precursor, cholesterol, from the outer to the inner mitochondrial membrane, across the aqueous space. Mutations in the gene encoding STARD1/StAR can cause lipid congenital adrenal hyperplasia (CAH), an autosomal recessive disorder characterized by a steroid synthesis deficiency and an accumulation of cholesterol in the adrenal glands and the gonads.	209
176915	cd08906	START_STARD3-like	Cholesterol-binding START domain of mammalian STARD3 and related proteins. This subgroup includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of STARD3 (also known as metastatic lymph node 64/MLN64) and related proteins. It belongs to the START domain family, and in turn to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. STARD3 has a high affinity for cholesterol. It may function in trafficking endosomal cholesterol to a cytosolic acceptor or membrane. In addition to having a cytoplasmic START cholesterol-binding domain, STARD3 also contains an N-terminal MENTAL cholesterol-binding and protein-protein interaction domain. The MENTAL domain contains transmembrane helices and anchors MLN64 to endosome membranes. The gene encoding STARD3 is overexpressed in about 25% of breast cancers.	209
176916	cd08907	START_STARD8-like	C-terminal lipid-binding START domain of mammalian STARD8 and related proteins, which also have an N-terminal Rho GTPase-activating protein (RhoGAP) domain. This subgroup includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of STARD8 (also known as deleted in liver cancer 3/DLC3, and Arhgap38) and related proteins. It belongs to the START domain family, and in turn to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. Proteins belonging to this subfamily also have a RhoGAP domain. The precise function of the START domain in this subgroup is unclear.	205
176917	cd08908	START_STARD12-like	C-terminal lipid-binding START domain of mammalian STARD12 and related proteins, which also have an N-terminal Rho GTPase-activating protein (RhoGAP) domain. This subgroup includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of STARD12 (also known as DLC-1, Arhgap7, and p122-RhoGAP) and related proteins. It belongs to the START domain family, and in turn to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. Proteins belonging to this subgroup also have an N-terminal SAM (sterile alpha motif) domain and a RhoGAP domain, and have a SAM-RhoGAP-START domain organization. The precise function of the START domain in this subgroup is unclear.	204
176918	cd08909	START_STARD13-like	C-terminal lipid-binding START domain of mammalian STARD13 and related proteins, which also have an N-terminal Rho GTPase-activating protein (RhoGAP) domain. This subgroup includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of STARD13 (also known as DLC-2, Arhgap37, and SDCCAG13) and related proteins. It belongs to the START domain family, and in turn to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. Proteins belonging to this subfamily also have a RhoGAP domain. The precise function of the START domain in this subgroup is unclear.	205
176919	cd08910	START_STARD2-like	Lipid-binding START domain of mammalian STARD2 and related proteins. This subgroup includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of STARD2 (also known as phosphatidylcholine transfer protein/PC-TP) and related proteins. It belongs to the START domain family, and in turn to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. STARD2 is a cytosolic phosphatidylcholine (PtdCho) transfer protein, which traffics PtdCho, the most common class of phospholipids in eukaryotes, between membranes. It represents a minimal START domain structure. STARD2 plays roles in hepatic cholesterol metabolism, in the development of atherosclerosis, and may have a mitochondrial function.	207
176920	cd08911	START_STARD7-like	Lipid-binding START domain of mammalian STARD7 and related proteins. This subgroup includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of STARD7 (also known as gestational trophoblastic tumor 1/GTT1). It belongs to the START domain family, and in turn to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. The gene encoding STARD7 is overexpressed in choriocarcinoma. STARD7 appears to be involved in the intracellular trafficking of phosphatidylcholine (PtdCho) to mitochondria. STARD7 was shown to be surface active and to interact differentially with phospholipid monolayers; it showed a preference for phosphatidylserine, cholesterol, and phosphatidylglycerol.	207
176921	cd08913	START_STARD14-like	Lipid-binding START domain of mammalian STARD14 and related proteins. This subgroup includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian brown fat-inducible STARD14 (also known as Acyl-Coenzyme A Thioesterase 11 or ACOT11, BFIT, THEA, THEM1, KIAA0707, and MGC25974) and related proteins. It belongs to the START domain family, and in turn to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. STARD14/ACOT11 is a type II acetyl-CoA thioesterase; it catalyzes the hydrolysis of acyl-CoAs to free fatty acid and CoASH. Human STARD14 displays acetyl-CoA thioesterase activity towards medium(C12)- and long(C16)-chain fatty acyl-CoA substrates. In addition to having a START domain, most proteins in this subgroup have two tandem copies of the hotdog domain. There are two splice variants of human STARD14, named BFIT1 and BFIT2, which differ in their C-termini. Human BFIT2 is equivalent to mouse mBFIT/Acot11, whose transcription is increased twofold in obesity-resistant mice compared with obesity-prone mice.	240
176922	cd08914	START_STARD15-like	Lipid-binding START domain of mammalian STARD15 and related proteins. This subgroup includes the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of STARD15/ACOT12 (also known as cytoplasmic acetyl-CoA hydrolase/CACH, THEAL, and MGC105114) and related domains. It belongs to the START domain family, and in turn to the SRPBCC (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC) domain superfamily of proteins that bind hydrophobic ligands. SRPBCC domains have a deep hydrophobic ligand-binding pocket. STARD15/ACOT12 is a type II acetyl-CoA thioesterase; it catalyzes the hydrolysis of acyl-CoAs to free fatty acid and CoASH. Rat CACH hydrolyzes acetyl-CoA to acetate and CoA. In addition to having a START domain, most proteins in this subgroup have two tandem copies of the hotdog domain. Human STARD15/ACOT12 may have roles in cholesterol metabolism and in beta-oxidation.	236
185746	cd08915	V_Alix_like	Protein-interacting V-domain of mammalian Alix and related domains. This superfamily contains the V-shaped (V) domain of mammalian Alix (apoptosis-linked gene-2 interacting protein X), His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), Bro1 and Rim20 (also known as PalA) from Saccharomyces cerevisiae, and related domains. Alix, HD-PTP, Bro1, and Rim20 all interact with the ESCRT (Endosomal Sorting Complexes Required for Transport) system. Alix, also known as apoptosis-linked gene-2 interacting protein 1 (AIP1), participates in membrane remodeling processes during the budding of enveloped viruses, vesicle budding inside late endosomal multivesicular bodies (MVBs), and the abscission reactions of mammalian cell division. It also functions in apoptosis. HD-PTP functions in cell migration and endosomal trafficking, Bro1 in endosomal trafficking, and Rim20 in the response to the external pH via the Rim101 pathway. The Alix V-domain contains a binding site, partially conserved in this superfamily, for the retroviral late assembly (L) domain YPXnL motif. The Alix V-domain is also a dimerization domain. Members of this superfamily have an N-terminal Bro1-like domain, which binds components of the ESCRT-III complex. The Bro1-like domains of Alix and HD-PTP can also bind human immunodeficiency virus type 1 (HIV-1) nucleocapsid. Many members, including Alix, HD-PTP, and Bro1, also have a proline-rich region (PRR), which binds multiple partners in Alix, including Tsg101 (tumor susceptibility gene 101, a component of ESCRT-1) and the apoptotic protein ALG-2. The C-terminal portion (V-domain and PRR) of Bro1 interacts with Doa4, a ubiquitin thiolesterase needed to remove ubiquitin from MVB cargoes; it interacts with a YPxL motif in Doa4s catalytic domain to stimulate its deubiquitination activity. 
Rim20 may bind the ESCRT-III subunit Snf7, bringing the protease Rim13 into proximity with Rim101 (a YPxL-containing transcription factor), and promoting the proteolytic activation of Rim101. HD-PTP is encoded by the PTPN23 gene, a tumor suppressor gene candidate often absent in human kidney, breast, lung, and cervical tumors. HD-PTP has a C-terminal catalytically inactive tyrosine phosphatase domain.	342
381257	cd08916	TrHb3_P	Truncated hemoglobins (TrHbs, 2/2Hb, 2/2 globins); group 3 (P). The M- and S families exhibit the canonical secondary structure of hemoglobins, a 3-over-3 alpha-helical sandwich structure (3/3 Mb-fold), built by eight alpha-helical segments. Truncated hemoglobins (TrHbs, 2/2Hb, or 2/2 globins) or the T family globins adopt a 2-on-2 alpha-helical sandwich structure, resulting from extensive and complex modifications of the canonical 3-on-3 alpha-helical sandwich that are distributed throughout the whole protein molecule. They are classified into three main groups based on their structural properties and named after Mycobacterium sp. genes glbN, glbO, and glbP: TrHb1s (N), TrHb2s (O) and TrHb3s (P). TrHb3s include Campylobacter jejuni Ctb, encoded by Cj0465c, which may play a role in moderating O2 flux within C. jejuni.	116
381258	cd08917	TrHb2_O	Truncated hemoglobins (TrHbs, 2/2Hb, 2/2 globins); group 2 (O). The M- and S families exhibit the canonical secondary structure of hemoglobins, a 3-over-3 alpha-helical sandwich structure (3/3 Mb-fold), built by eight alpha-helical segments. Truncated hemoglobins (TrHbs, 2/2Hb, or 2/2 globins) or the T family globins adopt a 2-on-2 alpha-helical sandwich structure, resulting from extensive and complex modifications of the canonical 3-on-3 alpha-helical sandwich that are distributed throughout the whole protein molecule. TrHbs are classified into three main groups based on their structural properties and named after Mycobacterium sp. genes glbN, glbO, and glbP: TrHb1s (N), TrHb2s (O) and TrHb3s (P). TrHb2s include the dimeric Arabidopsis thaliana TrHb2 AtGLB3. GLB3 is likely to have a function distinct from other plant globins: it exhibits a low O2 affinity, an unusual concentration-independent binding of O2 and CO, and does not respond to any of the treatments that induce plant 3-on-3 globins. Other TrHb2's include Bacillus subtilis trHb (Bs-trHb) which exhibits an extremely high oxygen affinity, and Pseudoalteromonas haloplanktis PhHbO (encoded by the PSHAa0030 gene) which appears to be involved in oxidative and nitrosative stress resistance.	116
381259	cd08919	PBP-like	Phycobiliproteins (PBPs) and related proteins. Phycobilisomes (PBSs) are the main light-harvesting complex in cyanobacteria and red algae. In general, they consist of a central core and surrounding rods and function to harvest and channel light energy toward the photosynthetic reaction centers within the membrane. They are composed of phycobiliproteins/chromophorylated proteins (PBPs) held together by linker polypeptides. PBPs have different numbers of chromophores, and the basic monomer component (alpha/beta heterodimers) can further oligomerize to ring-shaped trimers (heterohexamers) and hexamers (heterododecamers). Stacked PBP hexamers form both the core and the rods of the PBS; the core is mainly made up of allophycocyanin (APC) while the rods can be composed of the PBPs phycoerythrin (PE), phycocyanin (PC) and phycoerythrocyanin (PEC). This family also contains allophycocyanin-like (Apl) proteins, which conserve the residues critical for chromophore interactions, but may not maintain the proper alpha-beta subunit interactions and tertiary structure of PBPs. The genes encoding the Apl proteins cluster with light-responsive regulatory components, so these may have photoresponsive regulatory role(s). Included in this family is the PBP-like domain of the core-membrane linker polypeptide (LCM). The LCM serves both as a terminal energy acceptor and as a linker polypeptide. Its single phycocyanobilin (PCB) chromophore is one of two terminal energy transmitters, and transfers excitations from the hundreds of chromophores of the PBS to the RCs. This family also includes some proteins which have glutathione-S-transferases (GST) domains N-terminal to this PBP-like domain.	153
271272	cd08920	Ngb	Neuroglobins. The Ngb described in this subfamily is a hexacoordinated heme globin chiefly expressed in neurons of the brain and retina. In the human brain, it is highly expressed in the hypothalamus, amygdala, and in the pontine tegmental nuclei. It affords protection of brain neurons from ischemia and hypoxia. In rats, it plays a role in the neuroprotection of limb ischemic preconditioning (LIP). It plays roles as: a sensor of oxygen levels; a store or reservoir for oxygen; a facilitator for oxygen transport; a regulator of ROS; and a scavenger of nitric oxide. It also functions in the protection against apoptosis and in sleep regulation. This subgroup contains Ngb from mammalian and non-mammalian vertebrates, including fish, amphibians and reptiles; the functionally pentacoordinated acoelomorph Symsagittifera roscoffensis Ngb does not belong to this subgroup.	148
381260	cd08922	FHb-globin	Globin domain of flavohemoglobins (flavoHbs). FlavoHbs function primarily as nitric oxide dioxygenases (NODs, EC 1.14.12.17), converting NO and O2 to inert NO3- (nitrate). They have an N-terminal globin domain and a C-terminal ferredoxin reductase-like NAD- and FAD-binding domain, and use the reducing power of cellular NAD(P)H to drive regeneration of the ferrous heme. They protect from nitrosative stress (the broad range of cellular toxicities caused by NO), and modulate NO signaling pathways. NO scavenging by flavoHb attenuates the expression of the nitrosative stress response, affects the swarming behavior of Escherichia coli, and maintains squid-Vibrio fischeri and Medicago truncatula-Sinorhizobium meliloti symbioses. FlavoHb expression affects Aspergillus nidulans sexual development and mycotoxin production, and Dictyostelium discoideum development. This family also includes some single-domain globins (SDgbs).	140
381261	cd08923	class1-2_nsHbs_Lbs	Class 1 nonsymbiotic hemoglobins (nsHbs), class 2 nsHbs, leghemoglobins (Lbs), and related proteins. Class 1 nsHbs include the dimeric hexacoordinate Trema tomentosa nsHb and the dimeric hexacoordinate nsHb from monocot barley. Also belonging to this family is ParaHb, a dimeric pentacoordinate Hb from the root nodules of Parasponia andersonii, a non-legume capable of symbiotic nitrogen fixation. ParaHb is unusual in that it has different heme redox potentials for each subunit; it may have evolved from class 1 nsHbs. Lbs are pentacoordinate, and facilitate the diffusion of O2 to the respiring Rhizobium bacteroids within root nodules. They may have evolved from class 2 nonsymbiotic hemoglobins (class 2 nsHbs).	147
271275	cd08924	Cygb	Cytoglobin and related globins. Cygb is a hexacoordinated heme-containing protein, able to bind O2, NO and carbon monoxide. It has both nitric oxide dioxygenase and lipid peroxidase activities, and potentially participates in the maintenance of normal phenotype by implementing a homeostatic effect, to counteract stress conditions imposed on a cell. Cygb is implicated in multiple human pathologies: it is up-regulated in fibrosis and neurodegenerative disorders, and down-regulated in multiple cancer types, and may have a tumor suppressor role. It is expressed ubiquitously across a broad range of vertebrate organs including liver, heart, brain, lung, retina, and gut. In the human brain, it was detected at high levels in the habenula, hypothalamus, thalamus, hippocampus and pontine tegmental nuclei, detected at a low level in the cerebral cortex, and undetected in the cerebellar cortex.	153
381262	cd08925	Hb-beta-like	Hemoglobin beta, gamma, delta, epsilon, and related Hb subunits. Hb is the oxygen transport protein of erythrocytes. It is an allosterically modulated heterotetramer. Hemoglobin A (HbA) is the most common Hb in adult humans, and is formed from two alpha-chains and two beta-chains (alpha2beta2). An equilibrium exists between deoxygenated/unliganded/T (tense state) Hb having low oxygen affinity, and oxygenated/liganded/R (relaxed state) Hb having a high oxygen affinity. Various endogenous heterotropic effectors bind Hb to modulate its oxygen affinity and cooperative behavior, e.g. hydrogen ions, chloride ions, carbon dioxide and 2,3-bisphosphoglycerate. Hb is also an allosterically regulated nitrite reductase; the plasma nitrite anion may be activated by hemoglobin in areas of hypoxia to bring about vasodilation. Other Hb types are: HbA2 (alpha2delta2), which in normal individuals is naturally expressed at a low level; Hb Portland-1 (zeta2gamma2), Hb Gower-1 (zeta2epsilon2), and Hb Gower-2 (alpha2epsilon2), which are Hbs present during the embryonic period; and fetal hemoglobin (HbF, alpha2gamma2), the primary hemoglobin throughout most of gestation. These Hb types differ in O2 affinity and in their interactions with allosteric effectors.	139
271277	cd08926	Mb	Animal Myoglobins. Myoglobin (Mb) is a monomeric pentacoordinate heme-bound globin protein whose expression has long been considered limited to cardiomyocytes and striated skeletal muscle cells; however, it has recently been found localized in a wide variety of tissues including smooth muscle cells. As a physiological catalyst, it can modulate reactive oxygen species levels, facilitate oxygen diffusion within the cell, and scavenge or generate NO depending on oxygen tensions within the cell. Through its NO dioxygenase and nitrite reductase activities, Mb regulates mitochondrial function in energy-demanding tissues.	148
381263	cd08927	Hb-alpha-like	Hemoglobin alpha, zeta, mu, theta, and related Hb subunits. Hb is the oxygen transport protein of erythrocytes. It is an allosterically modulated heterotetramer. Hemoglobin A (HbA) is the most common Hb in adult humans, and is formed from two alpha-chains and two beta-chains (alpha2beta2). An equilibrium exists between deoxygenated/unliganded/T (tense state) Hb having low oxygen affinity, and oxygenated/liganded/R (relaxed state) Hb having a high oxygen affinity. Various endogenous heterotropic effectors bind Hb to modulate its oxygen affinity and cooperative behavior, e.g. hydrogen ions, chloride ions, carbon dioxide and 2,3-bisphosphoglycerate. Hb is also an allosterically regulated nitrite reductase; the plasma nitrite anion may be activated by hemoglobin in areas of hypoxia to bring about vasodilation. Other Hb types are: HbA2 (alpha2delta2), which in normal individuals is naturally expressed at a low level; Hb Portland-1 (zeta2gamma2), Hb Gower-1 (zeta2epsilon2), and Hb Gower-2 (alpha2epsilon2), which are Hbs present during the embryonic period; and fetal hemoglobin (HbF, alpha2gamma2), the primary hemoglobin throughout most of gestation. These Hb types differ in O2 affinity and in their interactions with allosteric effectors.	140
187633	cd08928	KR_fFAS_like_SDR_c_like	ketoacyl reductase (KR) domain of fungal-type fatty acid synthase (fFAS)-like, classical (c)-like SDRs. KR domain of FAS, including the fungal-type multidomain FAS alpha chain, and the single domain daunorubicin C-13 ketoreductase. Fungal-type FAS is a heterododecameric FAS composed of alpha and beta multifunctional polypeptide chains. The KR, an SDR family member, is located centrally in the alpha chain. KR catalyzes the NADP-dependent reduction of ketoacyl-ACP to hydroxyacyl-ACP. KR shares the critical active site Tyr of the classical SDR and has partial identity of the active site tetrad, but the upstream Asn is replaced in KR by Met. As in other SDRs, there is a glycine-rich NAD(P)-binding motif, but the pattern found in KR does not match the classical SDRs, and is not strictly conserved within this group. Daunorubicin is a clinically important therapeutic compound used in some cancer treatments. Single domain daunorubicin C-13 ketoreductase is a member of the classical SDR family with a canonical glycine-rich NAD(P)-binding motif, but lacking a complete match to the active site tetrad characteristic of this group. The critical Tyr, plus the Lys and upstream Asn are present, but the catalytic Ser is replaced, generally by Gln. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. 
These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human prostaglandin dehydrogenase (PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, PGDH numbering) and/or an Asn (Asn-107, PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type KRs have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	248
187634	cd08929	SDR_c4	classical (c) SDR, subgroup 4. This subgroup has a canonical active site tetrad and a typical Gly-rich NAD-binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 
Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	226
187635	cd08930	SDR_c8	classical (c) SDR, subgroup 8. This subgroup has a fairly well conserved active site tetrad and domain size of the classical SDRs, but has an atypical NAD-binding motif ([ST]G[GA]XGXXG). SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 
Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	250
187636	cd08931	SDR_c9	classical (c) SDR, subgroup 9. This subgroup has the canonical active site tetrad and NAD-binding motif of the classical SDRs. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 
Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	227
212493	cd08932	HetN_like_SDR_c	HetN oxidoreductase-like, classical (c) SDR. This subgroup includes Anabaena sp. strain PCC 7120 HetN, a putative oxidoreductase involved in heterocyst differentiation, and related proteins.  SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). 
Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	223
187638	cd08933	RDH_SDR_c	retinal dehydrogenase-like, classical (c) SDR. These classical SDRs include members identified as retinol dehydrogenases, which convert retinol to retinal, a property that overlaps with 17betaHSD activity. 17beta-dehydrogenases are a group of isozymes that catalyze activation and inactivation of estrogen and androgens, and include members of the short-chain dehydrogenases/reductase family. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. 
Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	261
187639	cd08934	CAD_SDR_c	clavulanic acid dehydrogenase (CAD), classical (c) SDR. CAD catalyzes the NADP-dependent reduction of clavulanate-9-aldehyde to clavulanic acid, a beta-lactamase inhibitor. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 
Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	243
187640	cd08935	mannonate_red_SDR_c	putative D-mannonate oxidoreductase, classical (c) SDR. D-mannonate oxidoreductase catalyzes the NAD-dependent interconversion of D-mannonate and D-fructuronate. This subgroup includes Bacillus subtilis UxuB/YjmF, a putative D-mannonate oxidoreductase; the B. subtilis UxuB gene is part of a putative ten-gene operon (the Yjm operon) involved in hexuronate catabolism. Escherichia coli UxuB does not belong to this subgroup. This subgroup has a canonical active site tetrad and a typical Gly-rich NAD-binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	271
187641	cd08936	CR_SDR_c	Porcine peroxisomal carbonyl reductase like, classical (c) SDR. This subgroup contains porcine peroxisomal carbonyl reductase and similar proteins. The porcine enzyme efficiently reduces retinals. This subgroup also includes human dehydrogenase/reductase (SDR family) member 4 (DHRS4), and human DHRS4L1. DHRS4 is a peroxisomal enzyme with 3beta-hydroxysteroid dehydrogenase activity; it catalyzes the reduction of 3-keto-C19/C21-steroids into 3beta-hydroxysteroids more efficiently than it does the retinal reduction. The human DHRS4 gene cluster contains DHRS4, DHRS4L2 and DHRS4L1. DHRS4L2 and DHRS4L1 are paralogs of DHRS4, DHRS4L2 being the most recent member. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. 
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	256
187642	cd08937	DHB_DH-like_SDR_c	1,6-dihydroxycyclohexa-2,4-diene-1-carboxylate dehydrogenase (DHB DH)-like, classical (c) SDR. DHB DH (aka 1,2-dihydroxycyclohexa-3,5-diene-1-carboxylate dehydrogenase) catalyzes the NAD-dependent conversion of 1,2-dihydroxycyclohexa-3,4-diene carboxylate to a catechol. This subgroup also contains Pseudomonas putida F1 CmtB, 2,3-dihydroxy-2,3-dihydro-p-cumate dehydrogenase, the second enzyme in the pathway for p-cumate catabolism. This subgroup shares the glycine-rich NAD-binding motif of the classical SDRs and shares the same catalytic triad; however, the upstream Asn implicated in cofactor binding or catalysis in other SDRs is generally substituted by a Ser. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. 
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	256
187643	cd08939	KDSR-like_SDR_c	3-ketodihydrosphingosine reductase (KDSR) and related proteins, classical (c) SDR. These proteins include members identified as KDSR, ribitol type dehydrogenase, and others. The group shows strong conservation of the active site tetrad and glycine rich NAD-binding motif of the classical SDRs. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. 
Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	239
187644	cd08940	HBDH_SDR_c	d-3-hydroxybutyrate dehydrogenase (HBDH), classical (c) SDRs. DHBDH, an NAD+-dependent enzyme, catalyzes the interconversion of D-3-hydroxybutyrate and acetoacetate. It is a classical SDR, with the canonical NAD-binding motif and active site tetrad. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). 
Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	258
187645	cd08941	3KS_SDR_c	3-keto steroid reductase, classical (c) SDRs. 3-keto steroid reductase (in concert with other enzymes) catalyzes NADP-dependent sterol C-4 demethylation, as part of steroid biosynthesis. 3-keto reductase is a classical SDR, with a well conserved canonical active site tetrad and fairly well conserved characteristic NAD-binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. 
Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	290
187646	cd08942	RhlG_SDR_c	RhlG and related beta-ketoacyl reductases, classical (c) SDRs. Pseudomonas aeruginosa RhlG is an SDR-family beta-ketoacyl reductase involved in rhamnolipid biosynthesis. RhlG is similar to but distinct from the FabG family of beta-ketoacyl-acyl carrier protein (ACP) reductases of type II fatty acid synthesis. RhlG and related proteins are classical SDRs, with a canonical active site tetrad and glycine-rich NAD(P)-binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. 
Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	250
187647	cd08943	R1PA_ADH_SDR_c	rhamnulose-1-phosphate aldolase/alcohol dehydrogenase, classical (c) SDRs. This family has bifunctional proteins with an N-terminal aldolase and a C-terminal classical SDR domain. One member is identified as a rhamnulose-1-phosphate aldolase/alcohol dehydrogenase. The SDR domain has a canonical SDR glycine-rich NAD(P) binding motif and a match to the characteristic active site triad. However, it lacks an upstream active site Asn typical of SDRs. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	250
187648	cd08944	SDR_c12	classical (c) SDR, subgroup 12. These are classical SDRs, with the canonical active site tetrad and glycine-rich NAD-binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. 
Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	246
187649	cd08945	PKR_SDR_c	Polyketide ketoreductase, classical (c) SDR. Polyketide ketoreductase (KR) is a classical SDR with a characteristic NAD-binding pattern and active site tetrad. Aromatic polyketides include various aromatic compounds of pharmaceutical interest. Polyketide KR, part of the type II polyketide synthase (PKS) complex, is composed of stand-alone domains that resemble the domains found in fatty acid synthase and multidomain type I PKS. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering) is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase, contain an additional helix-turn-helix motif that is not generally found among SDRs.	258
212494	cd08946	SDR_e	extended (e) SDRs. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	200
187651	cd08947	NmrA_TMR_like_SDR_a	NmrA (a transcriptional regulator), HSCARG (an NADPH sensor), and triphenylmethane reductase (TMR) like proteins, atypical (a) SDRs. Atypical SDRs belonging to this subgroup include NmrA, HSCARG, and TMR; these proteins bind NAD(P) but lack the usual catalytic residues of the SDRs. Atypical SDRs are distinct from classical SDRs. NmrA is a negative transcriptional regulator of various fungi, involved in the post-translational modulation of the GATA-type transcription factor AreA. NmrA lacks the canonical GXXGXXG NAD-binding motif and has altered residues at the catalytic triad, including a Met instead of the critical Tyr residue. NmrA may bind nucleotides but appears to lack any dehydrogenase activity. HSCARG has been identified as a putative NADP-sensing molecule, and redistributes and restructures in response to NADPH/NADP ratios. Like NmrA, it lacks most of the active site residues of the SDR family, but has an NAD(P)-binding motif similar to the extended SDR family, GXXGXXG. TMR, an NADP-binding protein, lacks the active site residues of the SDRs but has a glycine rich NAD(P)-binding motif that matches the extended SDRs. Atypical SDRs include biliverdin IX beta reductase (BVR-B, aka flavin reductase), NmrA (a negative transcriptional regulator of various fungi), progesterone 5-beta-reductase like proteins, phenylcoumaran benzylic ether and pinoresinol-lariciresinol reductases, phenylpropene synthases, eugenol synthase, triphenylmethane reductase, isoflavone reductases, and others. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. 
Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. In addition to the Rossmann fold core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	224
187652	cd08948	5beta-POR_like_SDR_a	progesterone 5-beta-reductase-like proteins (5beta-POR), atypical (a) SDRs. 5beta-POR catalyzes the reduction of progesterone to 5beta-pregnane-3,20-dione in Digitalis plants. This subgroup of atypical-extended SDRs shares the structure of an extended SDR, but has a different glycine-rich nucleotide binding motif (GXXGXXG) and lacks the YXXXK active site motif of classical and extended SDRs. Tyr-179 and Lys-147 are present in the active site, but not in the usual SDR configuration. Given these differences, it has been proposed that this subfamily represents a new SDR class. Other atypical SDRs include biliverdin IX beta reductase (BVR-B, aka flavin reductase), NmrA (a negative transcriptional regulator of various fungi), phenylcoumaran benzylic ether and pinoresinol-lariciresinol reductases, phenylpropene synthases, eugenol synthase, triphenylmethane reductase, isoflavone reductases, and others. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. 
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. In addition to the Rossmann fold core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	308
187653	cd08950	KR_fFAS_SDR_c_like	ketoacyl reductase (KR) domain of fungal-type fatty acid synthase (fFAS), classical (c)-like SDRs. KR domain of fungal-type fatty acid synthase (FAS), type I. Fungal-type FAS is a heterododecameric FAS composed of alpha and beta multifunctional polypeptide chains. The KR, an SDR family member, is located centrally in the alpha chain. KR catalyzes the NADP-dependent reduction of ketoacyl-ACP to hydroxyacyl-ACP. KR shares the critical active site Tyr of the classical SDRs and has partial identity of the active site tetrad, but the upstream Asn is replaced in KR by Met. As in other SDRs, there is a glycine rich NAD-binding motif, but the pattern found in KR does not match the classical SDRs, and is not strictly conserved within this group. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human prostaglandin dehydrogenase (PGDH) numbering). 
In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, PGDH numbering) and/or an Asn (Asn-107, PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type KRs have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	259
187654	cd08951	DR_C-13_KR_SDR_c_like	daunorubicin C-13 ketoreductase (KR), classical (c)-like SDRs. Daunorubicin is a clinically important therapeutic compound used in some cancer treatments. Daunorubicin C-13 ketoreductase is a member of the classical SDR family with a canonical glycine-rich NAD(P)-binding motif, but lacking a complete match to the active site tetrad characteristic of this group. The critical Tyr, plus the Lys and upstream Asn are present, but the catalytic Ser is replaced, generally by Gln. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human prostaglandin dehydrogenase (PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, PGDH numbering) and/or an Asn (Asn-107, PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type KRs have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	260
187655	cd08952	KR_1_SDR_x	ketoreductase (KR), subgroup 1, complex (x) SDRs. Ketoreductase, a module of the multidomain polyketide synthase (PKS), has 2 subdomains, each corresponding to an SDR family monomer. The C-terminal subdomain catalyzes the NADPH-dependent reduction of the beta-carbonyl of a polyketide to a hydroxyl group, a step in the biosynthesis of polyketides, such as erythromycin. The N-terminal subdomain, an interdomain linker, is a truncated Rossmann fold which acts to stabilize the catalytic subdomain. Unlike typical SDRs, the isolated domain does not oligomerize but is composed of 2 subdomains, each resembling an SDR monomer. The active site resembles that of typical SDRs, except that the usual positions of the catalytic Asn and Tyr are swapped, so that the canonical YXXXK motif changes to YXXXN. Modular PKSs are multifunctional structures in which the makeup recapitulates that found in (and may have evolved from) FAS. Polyketide synthesis also proceeds via the addition of 2-carbon units as in fatty acid synthesis. The complex SDR NADP-binding motif, GGXGXXG, is often present, but is not strictly conserved in each instance of the module. This subfamily includes KR domains found in many multidomain PKSs, including six of seven Sorangium cellulosum PKSs (encoded by spiDEFGHIJ) which participate in the synthesis of the polyketide scaffold of the cytotoxic spiroketal polyketide spirangien. These seven PKSs have either a single PKS module (SpiF), two PKS modules (SpiD,-E,-I,-J), or three PKS modules (SpiG,-H). This subfamily includes the single KR domain of SpiF, the first KR domains of SpiE,-G,-H,-I, and -J, the third KR domain of SpiG, and the second KR domain of SpiH. The second KR domains of SpiE,-G,-I, and -J, and the KR domains of SpiD, belong to a different KR_FAS_SDR subfamily. 
SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human prostaglandin dehydrogenase (PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, PGDH numbering) and/or an Asn (Asn-107, PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type KRs have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. 
Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	480
187656	cd08953	KR_2_SDR_x	ketoreductase (KR), subgroup 2, complex (x) SDRs. Ketoreductase, a module of the multidomain polyketide synthase (PKS), has 2 subdomains, each corresponding to an SDR family monomer. The C-terminal subdomain catalyzes the NADPH-dependent reduction of the beta-carbonyl of a polyketide to a hydroxyl group, a step in the biosynthesis of polyketides, such as erythromycin. The N-terminal subdomain, an interdomain linker, is a truncated Rossmann fold which acts to stabilize the catalytic subdomain. Unlike typical SDRs, the isolated domain does not oligomerize but is composed of 2 subdomains, each resembling an SDR monomer. The active site resembles that of typical SDRs, except that the usual positions of the catalytic Asn and Tyr are swapped, so that the canonical YXXXK motif changes to YXXXN. Modular PKSs are multifunctional structures in which the makeup recapitulates that found in (and may have evolved from) FAS. Polyketide synthesis also proceeds via the addition of 2-carbon units as in fatty acid synthesis. The complex SDR NADP-binding motif, GGXGXXG, is often present, but is not strictly conserved in each instance of the module. This subfamily includes both KR domains of the Bacillus subtilis PksJ, PksL, and PksM, and all three KR domains of PksN, components of the megacomplex bacillaene synthase, which synthesizes the antibiotic bacillaene. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. 
These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human prostaglandin dehydrogenase (PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, PGDH numbering) and/or an Asn (Asn-107, PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type KRs have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	436
187657	cd08954	KR_1_FAS_SDR_x	beta-ketoacyl reductase (KR) domain of fatty acid synthase (FAS), subgroup 1, complex (x) SDRs. NADP-dependent KR domain of the multidomain type I FAS, a complex SDR family. This subfamily also includes proteins identified as polyketide synthase (PKS), a protein with related modular protein architecture and similar function. It includes the KR domains of mammalian and chicken FAS, and Dictyostelium discoideum putative polyketide synthases (PKSs). These KR domains contain two subdomains, each of which is related to SDR Rossmann fold domains. However, while the C-terminal subdomain has an active site similar to the other SDRs and an NADP-binding capability, the N-terminal SDR-like subdomain is truncated and lacks these functions, serving a supportive structural role. In some instances, such as porcine FAS, an enoyl reductase (a Rossmann fold NAD-binding domain of the medium-chain dehydrogenase/reductase, MDR family) module is inserted between the sub-domains. Fatty acid synthesis occurs via the stepwise elongation of a chain (which is attached to acyl carrier protein, ACP) with 2-carbon units. Eukaryotic systems consist of large, multifunctional synthases (type I) while bacterial type II systems use single-function proteins. Fungal fatty acid synthesis uses a dodecamer of 6 alpha and 6 beta subunits. In mammalian type FAS cycles, ketoacyl synthase forms acetoacetyl-ACP which is reduced by the NADP-dependent beta-ketoacyl reductase (KR), forming beta-hydroxyacyl-ACP, which is in turn dehydrated by dehydratase to a beta-enoyl intermediate, which is reduced by NADP-dependent beta-enoyl reductase (ER); this KR and ER are members of the SDR family. This KR subfamily has an active site tetrad with a similar 3D orientation compared to archetypical SDRs, but the active site Lys and Asn residue positions are swapped. The characteristic NADP-binding is typical of the multidomain complex SDRs, with a GGXGXXG NADP binding motif. 
SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human prostaglandin dehydrogenase (PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, PGDH numbering) and/or an Asn (Asn-107, PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type KRs have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. 
Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	452
187658	cd08955	KR_2_FAS_SDR_x	beta-ketoacyl reductase (KR) domain of fatty acid synthase (FAS), subgroup 2, complex (x). Ketoreductase, a module of the multidomain polyketide synthase, has 2 subdomains, each corresponding to a short-chain dehydrogenase/reductase (SDR) family monomer. The C-terminal subdomain catalyzes the NADPH-dependent reduction of the beta-carbonyl of a polyketide to a hydroxyl group, a step in the biosynthesis of polyketides, such as erythromycin. The N-terminal subdomain, an interdomain linker, is a truncated Rossmann fold which acts to stabilize the catalytic subdomain. Unlike typical SDRs, the isolated domain does not oligomerize but is composed of 2 subdomains, each resembling an SDR monomer. In some instances, as in porcine FAS, an enoyl reductase (a Rossmann fold NAD binding domain of the MDR family) module is inserted between the sub-domains. The active site resembles that of typical SDRs, except that the usual positions of the catalytic asparagine and tyrosine are swapped, so that the canonical YXXXK motif changes to YXXXN. Modular polyketide synthases are multifunctional structures in which the makeup recapitulates that found in (and may have evolved from) fatty acid synthase. Fatty acid synthesis occurs via the stepwise elongation of a chain (which is attached to acyl carrier protein, ACP) with 2-carbon units. Eukaryotic systems consist of large, multifunctional synthases (type I) while bacterial type II systems use single-function proteins. Fungal fatty acid synthesis uses a dodecamer of 6 alpha and 6 beta subunits. In mammalian type FAS cycles, ketoacyl synthase forms acetoacetyl-ACP which is reduced by the NADP-dependent beta-ketoacyl reductase (KR), forming beta-hydroxyacyl-ACP, which is in turn dehydrated by dehydratase to a beta-enoyl intermediate, which is reduced by NADP-dependent beta-enoyl reductase (ER). 
Polyketide synthesis also proceeds via the addition of 2-carbon units as in fatty acid synthesis. The complex SDR NADP binding motif, GGXGXXG, is often present, but is not strictly conserved in each instance of the module. This subfamily includes the KR domain of the Lyngbya majuscula Jam J, -K, and -L, which are encoded on the jam gene cluster and are involved in the synthesis of the Jamaicamides (neurotoxins); Lyngbya majuscula Jam P belongs to a different KR_FAS_SDR_x subfamily. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human prostaglandin dehydrogenase (PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, PGDH numbering) and/or an Asn (Asn-107, PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type KRs have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	376
187659	cd08956	KR_3_FAS_SDR_x	beta-ketoacyl reductase (KR) domain of fatty acid synthase (FAS), subgroup 3, complex (x). Ketoreductase, a module of the multidomain polyketide synthase (PKS), has 2 subdomains, each corresponding to an SDR family monomer. The C-terminal subdomain catalyzes the NADPH-dependent reduction of the beta-carbonyl of a polyketide to a hydroxyl group, a step in the biosynthesis of polyketides, such as erythromycin. The N-terminal subdomain, an interdomain linker, is a truncated Rossmann fold which acts to stabilize the catalytic subdomain. Unlike typical SDRs, the isolated domain does not oligomerize but is composed of 2 subdomains, each resembling an SDR monomer. The active site resembles that of typical SDRs, except that the usual positions of the catalytic Asn and Tyr are swapped, so that the canonical YXXXK motif changes to YXXXN. Modular PKSs are multifunctional structures in which the makeup recapitulates that found in (and may have evolved from) FAS. In some instances, such as porcine FAS, an enoyl reductase (ER) module is inserted between the sub-domains. Fatty acid synthesis occurs via the stepwise elongation of a chain (which is attached to acyl carrier protein, ACP) with 2-carbon units. Eukaryotic systems consist of large, multifunctional synthases (type I) while bacterial type II systems use single-function proteins. Fungal fatty acid synthesis uses a dodecamer of 6 alpha and 6 beta subunits. In mammalian type FAS cycles, ketoacyl synthase forms acetoacetyl-ACP which is reduced by the NADP-dependent beta-KR, forming beta-hydroxyacyl-ACP, which is in turn dehydrated by dehydratase to a beta-enoyl intermediate, which is reduced by NADP-dependent beta-ER. Polyketide synthesis also proceeds via the addition of 2-carbon units as in fatty acid synthesis. The complex SDR NADP-binding motif, GGXGXXG, is often present, but is not strictly conserved in each instance of the module. 
This subfamily includes KR domains found in many multidomain PKSs, including six of seven Sorangium cellulosum PKSs (encoded by spiDEFGHIJ) which participate in the synthesis of the polyketide scaffold of the cytotoxic spiroketal polyketide spirangien. These seven PKSs have either a single PKS module (SpiF), two PKS modules (SpiD,-E,-I,-J), or three PKS modules (SpiG,-H). This subfamily includes the second KR domains of SpiE,-G,-I, and -J, both KR domains of SpiD, and the third KR domain of SpiH. The single KR domain of SpiF, the first and second KR domains of SpiH, the first KR domains of SpiE,-G,-I, and -J, and the third KR domain of SpiG, belong to a different KR_FAS_SDR subfamily. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human prostaglandin dehydrogenase (PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, PGDH numbering) and/or an Asn (Asn-107, PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. 
The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type KRs have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	448
187660	cd08957	WbmH_like_SDR_e	Bordetella bronchiseptica enzymes WbmH and WbmG-like, extended (e) SDRs. Bordetella bronchiseptica enzymes WbmH and WbmG, and related proteins. This subgroup exhibits the active site tetrad and NAD-binding motif of the extended SDR family. It has been proposed that the active site in Bordetella WbmG and WbmH cannot function as an epimerase, and that it plays a role in the O-antigen synthesis pathway from UDP-2,3-diacetamido-2,3-dideoxy-l-galacturonic acid. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	307
187661	cd08958	FR_SDR_e	flavonoid reductase (FR), extended (e) SDRs. This subgroup contains FRs of the extended SDR-type and related proteins. These FRs act in the NADP-dependent reduction of flavonoids, ketone-containing plant secondary metabolites; they have the characteristic active site triad of the SDRs (though not the upstream active site Asn) and an NADP-binding motif that is very similar to the typical extended SDR motif. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. 
Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	293
350084	cd08959	ArfGap_ArfGap1_like	ARF1 GTPase-activating protein 1-like. The ArfGAP (ADP Ribosylation Factor GTPase Activating Protein) domain is part of ArfGap1-like proteins, which play a crucial role in controlling membrane trafficking, particularly in the formation of COPI (coat protein complex I)-coated vesicles on Golgi membranes. The ArfGAP1 protein subfamily consists of three members: ArfGAP1 (Gcs1p in yeast), ArfGAP2 and ArfGAP3 (both are homologs of yeast Glo3p). ArfGAP2/3 are closely related, but with little similarity to ArfGAP1, except the catalytic ArfGAP domain. They promote hydrolysis of GTP bound to the small G protein ADP-ribosylation factor 1 (Arf1), which leads to the dissociation of coat proteins from Golgi-derived membranes and vesicles. Dissociation of the coat proteins is required for the fusion of these vesicles with target compartments. Thus, the GAP catalytic activity plays a key role in the formation of COPI vesicles from Golgi membrane. In contrast to ArfGAP1, which displays membrane curvature-dependent ArfGAP activity, ArfGAP2 and ArfGAP3 activities are dependent on coatomer (the core COPI complex), which is required for efficient recruitment of ArfGAP2 and ArfGAP3 to the Golgi membrane. Accordingly, ArfGAP2/3 have been implicated in coatomer-mediated protein transport between the Golgi complex and the endoplasmic reticulum. Unlike ArfGAP1, which is controlled by membrane curvature through its amphipathic lipid packing sensor (ALPS) motifs, ArfGAP2/3 do not possess an ALPS motif.	115
185752	cd08961	GH64-TLP-SF	glycoside hydrolase family 64 (beta-1,3-glucanases which produce specific pentasaccharide oligomers) and thaumatin-like proteins. This superfamily includes glycoside hydrolases of family 64 (GH64); these are mostly bacterial beta-1,3-glucanases which cleave long-chain polysaccharide beta-1,3-glucans into specific pentasaccharide oligomers and are implicated in fungal cell wall degradation. Also included in this superfamily are thaumatin, the sweet-tasting protein from the African berry Thaumatococcus daniellii, and thaumatin-like proteins (TLPs) which are involved in host defense and a wide range of developmental processes in fungi, plants, and animals. Like GH64s, some TLPs also hydrolyze the beta-1,3-glucans of the type commonly found in fungal walls. Plant TLPs are classified as pathogenesis-related (PR) protein family 5 (PR5); their expression is induced by environmental stresses such as pathogen/pest attack, drought and cold. Several members of the plant TLP family have been reported as food allergens from fruits, and pollen allergens from conifers. Streptomyces matensis laminaripentaose-producing beta-1,3-glucanase (GH64-LPHase) and TLPs have in common a core N-terminal barrel domain (domain I) composed of 10 beta-strands, two coming from the C-terminal region of the protein. In TLPs, this core domain is flanked by two shorter domains (domains II and III). Small TLPs, such as Triticum aestivum thaumatin-like xylanase inhibitor, have a deletion in the third domain (domain II). GH64-LPHase has a second C-terminal domain which corresponds positionally to, but is much larger than, domain III of TLP. GH64-LPHase and TLPs are described as crescent-fold structures. Critical functional residues, common to GH64-LPHase and TLPs, are a Glu and an Asp residue. LPHase has an electronegative, substrate-binding cleft and the aforementioned conserved Glu and Asp residues are the catalytic residues essential for beta-1,3-glucan cleavage. 
In TLPs, these residues are two of the four conserved residues which contribute to the strong electronegative character of the cleft which is associated with the antifungal activity of TLPs.	153
199206	cd08962	GatD	GatD subunit of archaeal Glu-tRNA amidotransferase. GatD is involved in the alternative synthesis of Gln-tRNA(Gln) in archaea via the transamidation of incorrectly charged Glu-tRNA(Gln). GatD is active as a dimer, and it provides the amino group required for this reaction. GatD is related to bacterial L-asparaginases (amidohydrolases), which catalyze the hydrolysis of asparagine to aspartic acid and ammonia. This CD spans both the L-asparaginase_like domain and an N-terminal supplementary domain.	402
199207	cd08963	L-asparaginase_I	Type I (cytosolic) bacterial L-asparaginase. Asparaginases (amidohydrolases, E.C. 3.5.1.1) are enzymes that catalyze the hydrolysis of asparagine to aspartic acid and ammonia. In bacteria, there are two classes of amidohydrolases. This model represents type I L-asparaginases, which are highly specific for asparagine and localized in the cytosol. Type I L-asparaginase acts as a dimer. A conserved threonine residue is thought to supply the nucleophile hydroxy-group that attacks the amide bond. Many bacterial L-asparaginases have both L-asparagine and L-glutamine hydrolysis activities, to a different degree, and some of them are annotated as asparaginase/glutaminase. One example of an enzyme with no L-glutaminase activity is the type I L-asparaginase from Wolinella succinogenes.	316
199208	cd08964	L-asparaginase_II	Type II (periplasmic) bacterial L-asparaginase. Asparaginases (amidohydrolases, E.C. 3.5.1.1) are enzymes that catalyze the hydrolysis of asparagine to aspartic acid and ammonia. In bacteria, there are two classes of amidohydrolases. This model represents type II L-asparaginases, which tend to be highly specific for asparagine and localized to the periplasm. They are potent antileukemic agents and have been used in the treatment of acute lymphoblastic leukemia (ALL), but not without severe side effects. Tumor cells appear to have a heightened dependence on exogenous L-asparagine, and depleting their surroundings of L-asparagine may starve cancerous ALL cells. Type II L-asparaginase acts as a tetramer, which is actually a dimer of two tightly bound dimers. A conserved threonine residue is thought to supply the nucleophile hydroxy-group that attacks the amide bond. Many bacterial L-asparaginases have both L-asparagine and L-glutamine hydrolysis activities, to a different degree, and some of them are annotated as asparaginase/glutaminase.	319
176799	cd08965	EcNei-like_N	N-terminal domain of Escherichia coli Nei/endonuclease VIII and related DNA glycosylases. This family contains the N-terminal domain of proteobacteria Nei and related DNA glycosylases. It includes Escherichia coli Nei, and belongs to the FpgNei_N, [N-terminal domain of Fpg (formamidopyrimidine-DNA glycosylase, MutM)_Nei (endonuclease VIII)] domain superfamily. DNA glycosylases maintain genome integrity by recognizing base lesions created by ionizing radiation, alkylating or oxidizing agents, and endogenous reactive oxygen species. They initiate the base-excision repair process, which is completed with the help of enzymes such as phosphodiesterases, AP endonucleases, DNA polymerases and DNA ligases. DNA glycosylases cleave the N-glycosyl bond between the sugar and the damaged base, creating an AP (apurinic/apyrimidinic) site. Most FpgNei DNA glycosylases use their N-terminal proline residue as the key catalytic nucleophile, and the reaction proceeds via a Schiff base intermediate. Escherichia coli Nei has been well studied; it is a DNA glycosylase/AP lyase that excises damaged pyrimidines, including 5-hydroxycytosine, 5-hydroxyuracil, and uracil glycol. In addition to this EcNei-like_N domain, enzymes belonging to this family contain a helix-two turn-helix (H2TH) domain and a canonical zinc-finger motif.	115
176800	cd08966	EcFpg-like_N	N-terminal domain of Escherichia coli Fpg1/MutM and related bacterial DNA glycosylases. This family contains the N-terminal domain of Escherichia coli Fpg1/MutM and related bacterial DNA glycosylases. It belongs to the FpgNei_N, [N-terminal domain of Fpg (formamidopyrimidine-DNA glycosylase, MutM)_Nei (endonuclease VIII)] domain superfamily. DNA glycosylases maintain genome integrity by recognizing base lesions created by ionizing radiation, alkylating or oxidizing agents, and endogenous reactive oxygen species. They initiate the base-excision repair process, which is completed with the help of enzymes such as phosphodiesterases, AP endonucleases, DNA polymerases and DNA ligases. DNA glycosylases cleave the N-glycosyl bond between the sugar and the damaged base, creating an AP (apurinic/apyrimidinic) site. Most FpgNei DNA glycosylases use their N-terminal proline residue as the key catalytic nucleophile, and the reaction proceeds via a Schiff base intermediate. Escherichia coli Fpg mainly recognizes and excises damaged purines such as 8-oxo-7,8-dihydroguanine (8-oxoG) and 2,6-diamino-4-hydroxy-5-formamidopyrimidine (FapyG). It is bifunctional, having both a DNA glycosylase (recognition activity) and an AP lyase activity. In addition to this EcFpg-like_N domain, enzymes belonging to this family contain a helix-two turn-helix (H2TH) domain and a zinc-finger motif, which also contribute residues to the active site.	120
176801	cd08967	MeNeil1_N	N-terminal domain of metazoan Nei-like glycosylase 1 (NEIL1). This family contains the N-terminal domain of metazoan NEIL1. It belongs to the FpgNei_N, [N-terminal domain of Fpg (formamidopyrimidine-DNA glycosylase, MutM)_Nei (endonuclease VIII)] domain superfamily. DNA glycosylases maintain genome integrity by recognizing base lesions created by ionizing radiation, alkylating or oxidizing agents, and endogenous reactive oxygen species. They initiate the base-excision repair process, which is completed with the help of enzymes such as phosphodiesterases, AP endonucleases, DNA polymerases and DNA ligases. DNA glycosylases cleave the N-glycosyl bond between the sugar and the damaged base, creating an AP (apurinic/apyrimidinic) site. Most FpgNei DNA glycosylases use their N-terminal proline residue as the key catalytic nucleophile, and the reaction proceeds via a Schiff base intermediate. NEIL1 recognizes the oxidized pyrimidines 2,6-diamino-4-hydroxy-5-formamidopyrimidine (FapyG) and 4,6-diamino-5-formamidopyrimidine (FapyA), thymine glycol (Tg) and 5-hydroxyuracil (5-OHU). However, even though it has weak activity on 8-oxo-7,8-dihydroguanine (8-oxoG), it does show strong preference for the products of its further oxidation: spiroiminodihydantoin and guanidinohydantoin. In addition to this MeNeil1_N domain, enzymes belonging to this family contain a helix-two turn-helix (H2TH) domain and a zincless finger motif. This characteristic "zincless finger" motif is a structural equivalent of the zinc finger common to other members of the Fpg/Nei family. Neil1 is one of three homologs found in eukaryotes and its lineage extends back as far as early metazoans.	131
176802	cd08968	MeNeil2_N	N-terminal domain of metazoan Nei-like glycosylase 2 (NEIL2). This family contains the N-terminal domain of the metazoan protein Neil2. It belongs to the FpgNei_N, [N-terminal domain of Fpg (formamidopyrimidine-DNA glycosylase, MutM)_Nei (endonuclease VIII)] domain superfamily. DNA glycosylases maintain genome integrity by recognizing base lesions created by ionizing radiation, alkylating or oxidizing agents, and endogenous reactive oxygen species. They initiate the base-excision repair process, which is completed with the help of enzymes such as phosphodiesterases, AP endonucleases, DNA polymerases and DNA ligases. DNA glycosylases cleave the N-glycosyl bond between the sugar and the damaged base, creating an AP (apurinic/apyrimidinic) site. Most FpgNei DNA glycosylases use their N-terminal proline residue as the key catalytic nucleophile, and the reaction proceeds via a Schiff base intermediate. NEIL2 repairs 5-hydroxyuracil (5-OHU) and other oxidized derivatives of cytosine, but it shows preference for DNA bubble structures. In addition to this MeNeil2_N domain, NEIL2 contains a helix-two turn-helix (H2TH) domain and a characteristic CHCC zinc finger motif. Neil2 is one of three homologs found in eukaryotes.	126
176803	cd08969	MeNeil3_N	N-terminal domain of metazoan Nei-like glycosylase 3 (NEIL3). This family contains the N-terminal domain of the Metazoan Neil3. It belongs to the FpgNei_N, [N-terminal domain of Fpg (formamidopyrimidine-DNA glycosylase, MutM)_Nei (endonuclease VIII)] domain superfamily. DNA glycosylases maintain genome integrity by recognizing base lesions created by ionizing radiation, alkylating or oxidizing agents, and endogenous reactive oxygen species. They initiate the base-excision repair process, which is completed with the help of enzymes such as phosphodiesterases, AP endonucleases, DNA polymerases and DNA ligases. DNA glycosylases cleave the N-glycosyl bond between the sugar and the damaged base, creating an AP (apurinic/apyrimidinic) site. Most FpgNei DNA glycosylases use their N-terminal proline residue as the key catalytic nucleophile, and the reaction proceeds via a Schiff base intermediate. In contrast, mouse NEIL3 (MmuNEIL3) forms a Schiff base intermediate via its N-terminal valine. The latter is a functional DNA glycosylase in vitro and in vivo. MmuNEIL3 prefers lesions in single-stranded DNA and in bubble structures. In duplex DNA, it recognizes the oxidized purines spiroiminodihydantoin (Sp), guanidinohydantoin (Gh), 2,6-diamino-4-hydroxy-5-formamidopyrimidine (FapyG) and 4,6-diamino-5-formamidopyrimidine (FapyA), but not 8-oxo-7,8-dihydroguanine (8-oxoG). Since the expression of the MmuNeil3 glycosylase domain (MmuNeil3delta324) reduces both the high spontaneous mutation frequency and the FapyG level in an Escherichia coli mutant lacking Fpg, Nei and MutY glycosylase activities, NEIL3 may play a role in repairing FapyG in vivo. In addition to this MeNeil3_N domain, enzymes belonging to this family contain a helix-two turn-helix (H2TH) domain and a zinc finger motif, plus a characteristic C-terminal extension that contains additional zinc fingers. Neil3 is one of three homologs found in eukaryotes.	140
176804	cd08970	AcNei1_N	N-terminal domain of the actinomycetal Nei1 and related DNA glycosylases. This family contains the N-terminal domain of the actinomycetal Nei1 and related DNA glycosylases. It belongs to the FpgNei_N, [N-terminal domain of Fpg (formamidopyrimidine-DNA glycosylase, MutM)_Nei (endonuclease VIII)] domain superfamily. DNA glycosylases maintain genome integrity by recognizing base lesions created by ionizing radiation, alkylating or oxidizing agents, and endogenous reactive oxygen species. They initiate the base-excision repair process, which is completed with the help of enzymes such as phosphodiesterases, AP endonucleases, DNA polymerases and DNA ligases. DNA glycosylases cleave the N-glycosyl bond between the sugar and the damaged base, creating an AP (apurinic/apyrimidinic) site. Most FpgNei DNA glycosylases use their N-terminal proline residue as the key catalytic nucleophile, and the reaction proceeds via a Schiff base intermediate. This family contains mostly actinomycetes and includes Mycobacterium tuberculosis Nei1 (MtuNei1). MtuNei1 recognizes oxidized pyrimidines such as thymine glycol (Tg) and 5,6-dihydrouracil on both double stranded and single stranded DNA; it has a strong preference for the 5R isomer of Tg. In addition to this domain, enzymes belonging to this family contain a helix-two turn-helix (H2TH) domain and a zinc-finger motif.	110
176805	cd08971	AcNei2_N	N-terminal domain of the actinomycetal Nei2 and related DNA glycosylases. This family contains the N-terminal domain of the actinomycetal Nei2 and related DNA glycosylases. It belongs to the FpgNei_N, [N-terminal domain of Fpg (formamidopyrimidine-DNA glycosylase, MutM)_Nei (endonuclease VIII)] domain superfamily. DNA glycosylases maintain genome integrity by recognizing base lesions created by ionizing radiation, alkylating or oxidizing agents, and endogenous reactive oxygen species. They initiate the base-excision repair process, which is completed with the help of enzymes such as phosphodiesterases, AP endonucleases, DNA polymerases and DNA ligases. DNA glycosylases cleave the N-glycosyl bond between the sugar and the damaged base, creating an AP (apurinic/apyrimidinic) site. Most FpgNei DNA glycosylases use their N-terminal proline residue as the key catalytic nucleophile, and the reaction proceeds via a Schiff base intermediate. This family contains mostly actinomycetes and includes Mycobacterium tuberculosis Nei2 (MtuNei2). Complementation experiments in repair-deficient Escherichia coli (fpg mutY nei triple and nei nth double mutants) support that MtuNei2 is functionally active in vivo and recognizes both guanine and cytosine oxidation products. In addition to this AcNei2_N domain, enzymes belonging to this family contain a helix-two turn-helix (H2TH) domain and a zinc-finger motif.	114
176806	cd08972	PF_Nei_N	N-terminal domain of the plant and fungal Nei and related proteins. This family contains the N-terminal domain of plant and fungal Nei and related proteins. It belongs to the FpgNei_N, [N-terminal domain of Fpg (formamidopyrimidine-DNA glycosylase, MutM)_Nei (endonuclease VIII)] domain superfamily. DNA glycosylases maintain genome integrity by recognizing base lesions created by ionizing radiation, alkylating or oxidizing agents, and endogenous reactive oxygen species. They initiate the base-excision repair process, which is completed with the help of enzymes such as phosphodiesterases, AP endonucleases, DNA polymerases and DNA ligases. DNA glycosylases cleave the N-glycosyl bond between the sugar and the damaged base, creating an AP (apurinic/apyrimidinic) site. Most FpgNei DNA glycosylases use their N-terminal proline residue as the key catalytic nucleophile, and the reaction proceeds via a Schiff base intermediate. The plant and fungal FpgNei glycosylases prefer the oxidized pyrimidines spiroiminodihydantoin (Sp) and guanidinohydantoin (Gh) over 8-oxoguanine in double stranded oligonucleotides and also show weak activity on single stranded DNA. In addition to this domain, enzymes belonging to this family contain a helix-two turn-helix (H2TH) domain and a characteristic zincless finger motif. They share a common ancestor not shared with other eukaryotic members of the FpgNei family.	137
176807	cd08973	BaFpgNei_N_1	Uncharacterized bacterial subgroup of the N-terminal domain of Fpg (formamidopyrimidine-DNA glycosylase, MutM)_Nei  (endonuclease VIII) base-excision repair DNA glycosylases. This family is an uncharacterized bacterial subgroup of the FpgNei_N domain superfamily. DNA glycosylases maintain genome integrity by recognizing base lesions created by ionizing radiation, alkylating or oxidizing agents, and endogenous reactive oxygen species. They initiate the base-excision repair process, which is completed with the help of enzymes such as phosphodiesterases, AP endonucleases, DNA polymerases and DNA ligases. DNA glycosylases cleave the N-glycosyl bond between the sugar and the damaged base, creating an AP (apurinic/apyrimidinic) site. Most FpgNei DNA glycosylases use their N-terminal proline residue as the key catalytic nucleophile, and the reaction proceeds via a Schiff base intermediate. This N-terminal proline is conserved in this family. Escherichia coli Fpg prefers 8-oxo-7,8-dihydroguanine (8-oxoG) and oxidized purines, and Escherichia coli Nei recognizes oxidized pyrimidines. However, neither Escherichia coli Fpg nor Nei belongs to this family. In addition to this BaFpgNei_N_1 domain, enzymes belonging to this family contain a helix-two turn-helix (H2TH) domain and a zinc-finger motif.	122
176808	cd08974	BaFpgNei_N_2	Uncharacterized bacterial subgroup of the N-terminal domain of Fpg (formamidopyrimidine-DNA glycosylase, MutM)_Nei  (endonuclease VIII) base-excision repair DNA glycosylases. This family is an uncharacterized bacterial subgroup of the FpgNei_N domain superfamily. DNA glycosylases maintain genome integrity by recognizing base lesions created by ionizing radiation, alkylating or oxidizing agents, and endogenous reactive oxygen species. They initiate the base-excision repair process, which is completed with the help of enzymes such as phosphodiesterases, AP endonucleases, DNA polymerases and DNA ligases. DNA glycosylases cleave the N-glycosyl bond between the sugar and the damaged base, creating an AP (apurinic/apyrimidinic) site. Most FpgNei DNA glycosylases use their N-terminal proline residue as the key catalytic nucleophile, and the reaction proceeds via a Schiff base intermediate. This N-terminal proline is conserved in this family. Escherichia coli Fpg prefers 8-oxo-7,8-dihydroguanine (8-oxoG) and oxidized purines, and Escherichia coli Nei recognizes oxidized pyrimidines. However, neither Escherichia coli Fpg nor Nei belongs to this family. In addition to this BaFpgNei_N_2 domain, enzymes belonging to this family contain a helix-two turn-helix (H2TH) domain. Most also contain a zinc-finger motif.	98
176809	cd08975	BaFpgNei_N_3	Uncharacterized bacterial subgroup of the N-terminal domain of Fpg (formamidopyrimidine-DNA glycosylase, MutM)_Nei  (endonuclease VIII) base-excision repair DNA glycosylases. This family is an uncharacterized bacterial subgroup of the FpgNei_N domain superfamily. DNA glycosylases maintain genome integrity by recognizing base lesions created by ionizing radiation, alkylating or oxidizing agents, and endogenous reactive oxygen species. They initiate the base-excision repair process, which is completed with the help of enzymes such as phosphodiesterases, AP endonucleases, DNA polymerases and DNA ligases. DNA glycosylases cleave the N-glycosyl bond between the sugar and the damaged base, creating an AP (apurinic/apyrimidinic) site. Most FpgNei DNA glycosylases use their N-terminal proline residue as the key catalytic nucleophile, and the reaction proceeds via a Schiff base intermediate. One exception is mouse Nei-like glycosylase 3 (Neil3), which forms a Schiff base intermediate via its N-terminal valine. In this family the N-terminal proline is replaced by an isoleucine or valine. Escherichia coli Fpg prefers 8-oxo-7,8-dihydroguanine (8-oxoG) and oxidized purines, and Escherichia coli Nei recognizes oxidized pyrimidines. However, neither Escherichia coli Fpg nor Nei belongs to this family. In addition to this BaFpgNei_N_3 domain, enzymes belonging to this family contain a helix-two turn-helix (H2TH) domain and a zinc-finger motif.	117
176810	cd08976	BaFpgNei_N_4	Uncharacterized bacterial subgroup of the N-terminal domain of Fpg (formamidopyrimidine-DNA glycosylase, MutM)_Nei  (endonuclease VIII) base-excision repair DNA glycosylases. This family is an uncharacterized bacterial subgroup of the FpgNei_N domain superfamily. DNA glycosylases maintain genome integrity by recognizing base lesions created by ionizing radiation, alkylating or oxidizing agents, and endogenous reactive oxygen species. They initiate the base-excision repair process, which is completed with the help of enzymes such as phosphodiesterases, AP endonucleases, DNA polymerases and DNA ligases. DNA glycosylases cleave the N-glycosyl bond between the sugar and the damaged base, creating an AP (apurinic/apyrimidinic) site. Most FpgNei DNA glycosylases use their N-terminal proline residue as the key catalytic nucleophile, and the reaction proceeds via a Schiff base intermediate. This N-terminal proline is conserved in this family. Escherichia coli Fpg prefers 8-oxo-7,8-dihydroguanine (8-oxoG) and oxidized purines, and Escherichia coli Nei recognizes oxidized pyrimidines. However, neither Escherichia coli Fpg nor Nei belongs to this family. In addition to this BaFpgNei_N_4 domain, most enzymes belonging to this family contain a helix-two turn-helix (H2TH) domain and a zinc-finger motif.	117
185760	cd08977	SusD	starch binding outer membrane protein SusD. SusD-like proteins from Bacteroidetes, members of the human distal gut microbiota, are part of the starch utilization system (Sus). Sus is one of the large clusters of glycosyl hydrolases, called polysaccharide utilization loci (PULs), which play an important role in polysaccharide recognition and uptake, and it is needed for growth on amylose, amylopectin, pullulan, and maltooligosaccharides. SusD, together with SusC, a predicted beta-barrel porin, forms the minimum outer-membrane starch-binding complex. The adult human distal gut microbiota is essential for digestion of a large variety of dietary polysaccharides, for which humans lack the necessary glycosyl hydrolases.	359
350092	cd08978	GH_F	Glycosyl hydrolase families 43 and 62 form CAZY clan GH-F. This glycosyl hydrolase clan F (according to carbohydrate-active enzymes database (CAZY)) includes family 43 (GH43) and 62 (GH62). GH43 includes enzymes with beta-xylosidase (EC 3.2.1.37), beta-1,3-xylosidase (EC 3.2.1.-), alpha-L-arabinofuranosidase (EC 3.2.1.55), arabinanase (EC 3.2.1.99), xylanase (EC 3.2.1.8), endo-alpha-L-arabinanases (beta-xylanases) and galactan 1,3-beta-galactosidase (EC 3.2.1.145) activities. GH62 includes enzymes characterized as arabinofuranosidases (alpha-L-arabinofuranosidases; EC 3.2.1.55) that specifically cleave either alpha-1,2 or alpha-1,3-L-arabinofuranose side chains from xylans. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many of the enzymes in this family display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. GH62 are also predicted to be inverting enzymes. A common structural feature of both, GH43 and GH62 enzymes, is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	251
350093	cd08979	GH_J	Glycosyl hydrolase families 32 and 68, which form the clan GH-J. This glycosyl hydrolase family clan J (according to carbohydrate-active enzymes database (CAZY)) includes family 32 (GH32) and 68 (GH68). GH32 enzymes include invertase (EC 3.2.1.26) and other fructofuranosidases such as inulinase (EC 3.2.1.7), exo-inulinase (EC 3.2.1.80), levanase (EC 3.2.1.65), and transfructosidases such as sucrose:sucrose 1-fructosyltransferase (EC 2.4.1.99), fructan:fructan 1-fructosyltransferase (EC 2.4.1.100), sucrose:fructan 6-fructosyltransferase (EC 2.4.1.10), fructan:fructan 6G-fructosyltransferase (EC 2.4.1.243) and levan fructosyltransferases (EC 2.4.1.-). The GH68 family consists of fructosyltransferases (FTFs) that include levansucrase (EC 2.4.1.10, also known as beta-D-fructofuranosyl transferase), beta-fructofuranosidase (EC 3.2.1.26) and inulosucrase (EC 2.4.1.9). GH32 and GH68 family enzymes are retaining enzymes (i.e. they retain the configuration at anomeric carbon atom of the substrate) and catalyze hydrolysis in two steps involving a covalent glycosyl enzyme intermediate: an aspartate located close to the N-terminus acts as the catalytic nucleophile and a glutamate acts as the general acid/base; a conserved aspartate residue in the Arg-Asp-Pro (RDP) motif stabilizes the transition state. A common structural feature of all these enzymes is a 5-bladed beta-propeller domain, similar to GH43, that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	292
350094	cd08980	GH43_LbAraf43-like	Glycosyl hydrolase family 43 proteins such as Lactobacillus brevis alpha-L-arabinofuranosidase LbAraf43 and Geobacillus thermoleovorans GbtXyl43B. This glycosyl hydrolase family 43 (GH43) subgroup includes enzymes with beta-xylosidase (EC 3.2.1.37), alpha-L-arabinofuranosidase (EC 3.2.1.55) and possibly bifunctional xylosidase/arabinofuranosidase activities. In addition to Lactobacillus brevis alpha-L-arabinofuranosidase LbAraf43 and Geobacillus thermoleovorans IT-08 beta-xylosidase / exo-xylanase (GbtXyl43B). It belongs to the glycosyl hydrolase clan F (according to carbohydrate-active enzymes database (CAZY)) which includes family 43 (GH43) and 62 (GH62) families. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	276
350095	cd08981	GH43_Bt1873-like	Glycosyl hydrolase family 43 protein such as Bacteroides thetaiotaomicron BT_1873. This glycosyl hydrolase family 43 (GH43) subfamily includes Bacteroides thetaiotaomicron VPI-5482 endo-arabinase (Bt1873;BT_1873), as well as uncharacterized enzymes similar to those with beta-1,4-xylosidase (xylan 1,4-beta-xylosidase; EC 3.2.1.37), beta-1,3-xylosidase (EC 3.2.1.-), alpha-L-arabinofuranosidase (EC 3.2.1.55), arabinanase (EC 3.2.1.99), xylanase (EC 3.2.1.8), endo-alpha-L-arabinanase and galactan 1,3-beta-galactosidase (EC 3.2.1.145) activities. These are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many of the GH43 enzymes in this family may display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	289
350096	cd08982	GH43-like	Glycosyl hydrolase family 43 protein; uncharacterized. This glycosyl hydrolase family 43 (GH43)-like subfamily includes uncharacterized enzymes similar to those with beta-1,4-xylosidase (xylan 1,4-beta-xylosidase; EC 3.2.1.37), beta-1,3-xylosidase (EC 3.2.1.-), alpha-L-arabinofuranosidase (EC 3.2.1.55), arabinanase (EC 3.2.1.99), xylanase (EC 3.2.1.8), endo-alpha-L-arabinanase and galactan 1,3-beta-galactosidase (EC 3.2.1.145) activities. These are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many of the enzymes in this family display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	308
350097	cd08983	GH43_Bt3655-like	Glycosyl hydrolase family 43 protein such as Bacteroides thetaiotaomicron VPI-5482 arabinofuranosidase Bt3655. This glycosyl hydrolase family 43 (GH43)-like family includes the characterized arabinofuranosidases (EC 3.2.1.55): Bacteroides thetaiotaomicron VPI-5482 (Bt3655;BT_3655) and Penicillium chrysogenum 31B Abf43B, as well as Bifidobacterium adolescentis ATCC 15703  beta-xylosidase (EC 3.2.1.37) BAD_1527. It belongs to the glycosyl hydrolase clan F (according to carbohydrate-active enzymes database (CAZY)) which includes family 43 (GH43) and 62 (GH62) families. GH43 includes enzymes with beta-xylosidase (EC 3.2.1.37), beta-1,3-xylosidase (EC 3.2.1.-), alpha-L-arabinofuranosidase (EC 3.2.1.55), arabinanase (EC 3.2.1.99), xylanase (EC 3.2.1.8), endo-alpha-L-arabinanases (beta-xylanases) and galactan 1,3-beta-galactosidase (EC 3.2.1.145) activities. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	262
350098	cd08984	GH43-like	Glycosyl hydrolase family 43. This glycosyl hydrolase family 43 (GH43)-like subfamily includes uncharacterized enzymes similar to those with beta-1,4-xylosidase (xylan 1,4-beta-xylosidase; EC 3.2.1.37), beta-1,3-xylosidase (EC 3.2.1.-), alpha-L-arabinofuranosidase (EC 3.2.1.55), arabinanase (EC 3.2.1.99), xylanase (EC 3.2.1.8), endo-alpha-L-arabinanase and galactan 1,3-beta-galactosidase (EC 3.2.1.145) activities. These are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many of the enzymes in this family display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	291
350099	cd08985	GH43_CtGH43-like	Glycosyl hydrolase family 43 protein such as Clostridium thermocellum exo-beta-1,3-galactanase CtGH43 and Ruminococcus champanellensis arabinanase Ara43A. This glycosyl hydrolase family 43 (GH43) subgroup includes characterized enzymes with exo-beta-1,3-galactanase (EC 3.2.1.145, also known as galactan 1,3-beta-galactosidase) activity such as Clostridium thermocellum (Ct1,3Gal43A or CtGH43) and Phanerochaete chrysosporium 1,3Gal43A (Pc1, 3Gal43A), and arabinanase (EC 3.2.1.99) activity such as Ruminococcus champanellensis Ara43A. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	273
350100	cd08986	GH43-like	Glycosyl hydrolase family 43 protein; uncharacterized. This glycosyl hydrolase family 43 (GH43)-like subfamily includes uncharacterized enzymes similar to those with beta-1,4-xylosidase (xylan 1,4-beta-xylosidase; EC 3.2.1.37), beta-1,3-xylosidase (EC 3.2.1.-), alpha-L-arabinofuranosidase (EC 3.2.1.55), arabinanase (EC 3.2.1.99), xylanase (EC 3.2.1.8), endo-alpha-L-arabinanase and galactan 1,3-beta-galactosidase (EC 3.2.1.145) activities. These are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many of the enzymes in this family display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	257
350101	cd08987	GH62	Glycosyl hydrolase family 62, characterized arabinofuranosidases. The glycosyl hydrolase family 62 (GH62) includes eukaryotic (mostly fungal) and prokaryotic enzymes which are characterized arabinofuranosidases (alpha-L-arabinofuranosidases; EC 3.2.1.55) that specifically cleave either alpha-1,2 or alpha-1,3-L-arabinofuranose side chains from xylans. These enzymes show significantly different substrate preference with rather low specific activity towards natural substrates and differ in catalytic efficiency. They do not act on xylose moieties in xylan that are adorned with an arabinose side chain at both O2 and O3 positions, nor do they display any non-specific arabinofuranosidase activity. The synergistic action in biomass degradation makes GH62 promising candidates for biotechnological improvements of biofuel production and in various biorefinery applications. These enzymes also contain carbohydrate binding modules (CBMs) that bind cellulose or xylan.	304
350102	cd08988	GH43_ABN	Glycosyl hydrolase family 43. This glycosyl hydrolase family 43 (GH43) subgroup includes mostly enzymes with alpha-L-arabinofuranosidase (ABF; EC 3.2.1.55) and endo-alpha-L-arabinanase (ABN; EC 3.2.1.99) activities. These are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. The GH43 ABN enzymes hydrolyze alpha-1,5-L-arabinofuranoside linkages while the ABF enzymes cleave arabinose side chains so that the combined actions of these two enzymes reduce arabinan to L-arabinose and/or arabinooligosaccharides. These arabinan-degrading enzymes are important in the food industry for efficient production of L-arabinose from agricultural waste; L-arabinose is often used as a bioactive sweetener. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	277
350103	cd08989	GH43_XYL-like	Glycosyl hydrolase family 43, beta-D-xylosidases and arabinofuranosidases. This glycosyl hydrolase family 43 (GH43) subgroup includes mostly enzymes that have been annotated as having beta-1,4-xylosidase (beta-D-xylosidase;xylan 1,4-beta-xylosidase; EC 3.2.1.37) activity, including Selenomonas ruminantium beta-D-xylosidase SXA. These are part of an array of hemicellulases that are involved in the final breakdown of plant cell-wall whereby they degrade xylan. They hydrolyze beta-1,4 glycosidic bonds between two xylose units in short xylooligosaccharides. It also includes various GH43 arabinofuranosidases (EC 3.2.1.55) including Humicola insolens alpha-L-arabinofuranosidase AXHd3, Bacteroides ovatus alpha-L-arabinofuranosidase (BoGH43, XynB), and the bifunctional Phanerochaete chrysosporium xylosidase/arabinofuranosidase (Xyl;PcXyl). GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	272
350104	cd08990	GH43_AXH_like	Glycosyl hydrolase family 43 protein, includes arabinoxylan arabinofuranohydrolase, beta-xylosidase, endo-1,4-beta-xylanase, and alpha-L-arabinofuranosidase. This subgroup includes Bacillus subtilis arabinoxylan arabinofuranohydrolase (XynD;BsAXH-m23;BSU18160), Butyrivibrio proteoclasticus alpha-L-arabinofuranosidase (Xsa43E;bpr_I2319), Clostridium stercorarium alpha-L-arabinofuranosidase XylA, and metagenomic beta-xylosidase (EC 3.2.1.37) / alpha-L-arabinofuranosidase (EC 3.2.1.55) CoXyl43. It belongs to the glycosyl hydrolase clan F (according to carbohydrate-active enzymes database (CAZY)) which includes family 43 (GH43) and 62 (GH62) families. The GH43_AXH-like subgroup includes enzymes that have been characterized with beta-xylosidase, alpha-L-arabinofuranosidase, endo-alpha-L-arabinanase as well as arabinoxylan arabinofuranohydrolase (AXH) activities. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. AXHs specifically hydrolyze the glycosidic bond between arabinofuranosyl substituents and xylopyranosyl backbone residues of arabinoxylan. Metagenomic beta-xylosidase/alpha-L-arabinofuranosidase CoXyl43 shows synergy with Trichoderma reesei cellulases and promotes plant biomass saccharification by degrading xylo-oligosaccharides, such as xylobiose and xylotriose, into the monosaccharide xylose. Studies show that the hydrolytic activity of CoXyl43 is stimulated in the presence of calcium. Several of these enzymes also contain carbohydrate binding modules (CBMs) that bind cellulose or xylan. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	269
350105	cd08991	GH43_HoAraf43-like	Glycosyl hydrolase family 43 protein such as Halothermothrix orenii H 168  alpha-L-arabinofuranosidase (HoAraf43;Hore_20580). This glycosyl hydrolase family 43 (GH43) subgroup includes Halothermothrix orenii H 168  alpha-L-arabinofuranosidase (EC 3.2.1.55) (HoAraf43;Hore_20580). It belongs to the glycosyl hydrolase clan F (according to carbohydrate-active enzymes database (CAZY)) which includes family 43 (GH43) and 62 (GH62) families. This GH43_ HoAraf43-like subgroup includes enzymes that have been annotated as having xylan-digesting beta-xylosidase (EC 3.2.1.37) and xylanase (endo-alpha-L-arabinanase, EC 3.2.1.8) activities. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	283
350106	cd08992	GH117	Glycosyl hydrolase family 117 (GH117). This glycoside hydrolase 117 (GH117) family includes alpha-1,3-L-neoagarooligosaccharide hydrolase (EC 3.2.1.-); alpha-1,3-L-neoagarobiase/neoagarobiose hydrolase (NABH, EC 3.2.1.-). In the agarolytic pathway, in order to metabolize agar, NABH is an essential enzyme because it converts alpha-neoagarobiose (O-3,6-anhydro-alpha-l-galactopyranosyl-(1,3)-d-galactose) into fermentable monosaccharides (d-galactose and 3,6-anhydro-l-galactose). Thus, these enzymes have exo-alpha-1,3-(3,6-anhydro)-l-galactosidase activity, removing terminal non-reducing alpha-1,3-linked 3,6-anhydro-l-galactose residues from their neoagarose substrate. This family includes Zobellia galactanivorans enzymes, Zg4663 and Zg3615 (also known as ZgAhgA and ZgAhgB, respectively) that have been shown to have similar activity on unsubstituted agarose oligosaccharides while Zg3597 has been shown to be inactive, possibly due to differences in dimerization conformation, active-site structure and function. GH117 shares distant sequence similarity with families GH43 and GH32. A common structural feature of all these enzymes is a 5-bladed beta-propeller domain, similar to GH43, that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	314
350107	cd08993	GH130	Glycosyl hydrolase family 130. This subfamily contains glycosyl hydrolase family 130 (GH130) proteins, as classified by the carbohydrate-active enzymes database (CAZY). They are phosphorylases and hydrolases for beta-mannosides, and include beta-1,4-mannosylglucose phosphorylase (EC 2.4.1.281) and beta-1,4-mannooligosaccharide phosphorylase (EC 2.4.1.319), among others that have yet to be characterized. They possess 5-bladed beta-propeller domains similar to families 32, 43, 62, 68, 117 (GH32, GH43, GH62, GH68, GH117). GH130 enzymes are involved in the bacterial utilization of mannans or N-linked glycans. Beta-1,4-mannosylglucose phosphorylase is involved in degradation of beta-1,4-D-mannosyl-N-acetyl-D-glucosamine linkages in the core of N-glycans; it produces alpha-mannose 1-phosphate and glucose from 4-O-beta-D-mannosyl-D-glucose and inorganic phosphate, using a critical catalytic Asp as a proton donor. This family includes Ruminococcus albus 4-O-beta-D-mannosyl-D-glucose phosphorylase (RaMP1) and beta-(1,4)-mannooligosaccharide phosphorylase (RaMP2), enzymes that phosphorolyze beta-mannosidic linkages at the non-reducing ends of their substrates, and have substantially diverse substrate specificities that are determined by three loop regions.	279
350108	cd08994	GH43_62_32_68_117_130-like	Glycosyl hydrolase families: GH43, GH62, GH32, GH68, GH117, GH130. Members of the glycosyl hydrolase families 32, 43, 62, 68, 117 and 130 (GH32, GH43, GH62, GH68, GH117, GH130) all possess 5-bladed beta-propeller domains and comprise clans F and J, as classified by the carbohydrate-active enzymes database (CAZY). Clan F consists of families GH43 and GH62. GH43 includes beta-xylosidases (EC 3.2.1.37), beta-xylanases (EC 3.2.1.8), alpha-L-arabinases (EC 3.2.1.99), and alpha-L-arabinofuranosidases (EC 3.2.1.55), using aryl-glycosides as substrates, while family GH62 contains alpha-L-arabinofuranosidases (EC 3.2.1.55) that specifically cleave either alpha-1,2 or alpha-1,3-L-arabinofuranose sidechains from xylans. These are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Clan J consists of families GH32 and GH68. GH32 comprises sucrose-6-phosphate hydrolases, invertases (EC 3.2.1.26), inulinases (EC 3.2.1.7), levanases (EC 3.2.1.65), eukaryotic fructosyltransferases, and bacterial fructanotransferases, while GH68 consists of fructosyltransferases (FTFs) that include levansucrase (EC 2.4.1.10), beta-fructofuranosidase (EC 3.2.1.26) and inulosucrase (EC 2.4.1.9), all of which use sucrose as their preferential donor substrate. Members of this clan are retaining enzymes (i.e. they retain the configuration at anomeric carbon atom of the substrate) that catalyze hydrolysis in two steps involving a covalent glycosyl enzyme intermediate: an aspartate located close to the N-terminus acts as the catalytic nucleophile and a glutamate acts as the general acid/base; a conserved aspartate residue in the Arg-Asp-Pro (RDP) motif stabilizes the transition state. Structures of all families in the two clans manifest a funnel-shaped active site that comprises two subsites with a single route for access by ligands. Also included in this superfamily are GH117 enzymes that have exo-alpha-1,3-(3,6-anhydro)-l-galactosidase activity, removing terminal non-reducing alpha-1,3-linked 3,6-anhydro-l-galactose residues from their neoagarose substrate, and GH130 that are phosphorylases and hydrolases for beta-mannosides, involved in the bacterial utilization of mannans or N-linked glycans.	294
350109	cd08995	GH32_EcAec43-like	Glycosyl hydrolase family 32, such as the putative glycoside hydrolase Escherichia coli Aec43 (FosGH2). This glycosyl hydrolase family 32 (GH32) subgroup includes Escherichia coli strain BEN2908 putative glycoside hydrolase Aec43 (FosGH2). GH32 enzymes cleave sucrose into fructose and glucose via beta-fructofuranosidase activity, producing invert sugar that is a mixture of dextrorotatory D-glucose and levorotatory D-fructose, thus named invertase (EC 3.2.1.26). GH32 family also contains other fructofuranosidases such as inulinase (EC 3.2.1.7), exo-inulinase (EC 3.2.1.80), levanase (EC 3.2.1.65), and transfructosidases such as sucrose:sucrose 1-fructosyltransferase (EC 2.4.1.99), fructan:fructan 1-fructosyltransferase (EC 2.4.1.100), sucrose:fructan 6-fructosyltransferase (EC 2.4.1.10), fructan:fructan 6G-fructosyltransferase (EC 2.4.1.243) and levan fructosyltransferases (EC 2.4.1.-). These retaining enzymes (i.e. they retain the configuration at anomeric carbon atom of the substrate) catalyze hydrolysis in two steps involving a covalent glycosyl enzyme intermediate: an aspartate located close to the N-terminus acts as the catalytic nucleophile and a glutamate acts as the general acid/base; a conserved aspartate residue in the Arg-Asp-Pro (RDP) motif stabilizes the transition state. These enzymes are predicted to display a 5-fold beta-propeller fold as found for GH43 and GH68. The breakdown of sucrose is widely used as a carbon or energy source by bacteria, fungi, and plants. Invertase is used commercially in the confectionery industry, since fructose has a sweeter taste than sucrose and a lower tendency to crystallize.	281
350110	cd08996	GH32_FFase	Glycosyl hydrolase family 32, beta-fructosidases. Glycosyl hydrolase family GH32 cleaves sucrose into fructose and glucose via beta-fructofuranosidase activity, producing invert sugar that is a mixture of dextrorotatory D-glucose and levorotatory D-fructose, thus named invertase (EC 3.2.1.26). This family also contains other fructofuranosidases such as inulinase (EC 3.2.1.7), exo-inulinase (EC 3.2.1.80), levanase (EC 3.2.1.65), and transfructosidases such as sucrose:sucrose 1-fructosyltransferase (EC 2.4.1.99), fructan:fructan 1-fructosyltransferase (EC 2.4.1.100), sucrose:fructan 6-fructosyltransferase (EC 2.4.1.10), fructan:fructan 6G-fructosyltransferase (EC 2.4.1.243) and levan fructosyltransferases (EC 2.4.1.-). These retaining enzymes (i.e. they retain the configuration at anomeric carbon atom of the substrate) catalyze hydrolysis in two steps involving a covalent glycosyl enzyme intermediate: an aspartate located close to the N-terminus acts as the catalytic nucleophile and a glutamate acts as the general acid/base; a conserved aspartate residue in the Arg-Asp-Pro (RDP) motif stabilizes the transition state. These enzymes are predicted to display a 5-fold beta-propeller fold as found for GH43 and GH68. The breakdown of sucrose is widely used as a carbon or energy source by bacteria, fungi, and plants. Invertase is used commercially in the confectionery industry, since fructose has a sweeter taste than sucrose and a lower tendency to crystallize. A common structural feature of all these enzymes is a 5-bladed beta-propeller domain, similar to GH43, that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	281
350111	cd08997	GH68	Glycosyl hydrolase family 68, includes levansucrase, beta-fructofuranosidase and inulosucrase. Glycosyl hydrolase family 68 (GH68) consists of fructosyltransferases (FTFs) that include levansucrase (EC 2.4.1.10), beta-fructofuranosidase (EC 3.2.1.26) and inulosucrase (EC 2.4.1.9), all of which use sucrose as their preferential donor substrate. Levansucrase, also known as beta-D-fructofuranosyl transferase, catalyzes the transfer of the sucrose fructosyl moiety to a growing levan chain. Similarly, inulosucrase catalyzes the synthesis of long inulin-type fructans, and beta-fructofuranosidases create fructooligosaccharides (FOS). However, in the absence of high fructan/sucrose ratio, some GH68 enzymes can also use fructan as donor substrate. GH68 retaining enzymes (i.e. they retain the configuration at anomeric carbon atom of the substrate) catalyze hydrolysis in two steps involving a covalent glycosyl enzyme intermediate: an aspartate located close to the N-terminus acts as the catalytic nucleophile and a glutamate acts as the general acid/base; a conserved aspartate residue in the Arg-Asp-Pro (RDP) motif stabilizes the transition state. A common structural feature of all these enzymes is a 5-bladed beta-propeller domain, similar to GH43, that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller. Biotechnological applications of these enzymes include use of inulin in inexpensive production of rich fructose syrups as well as use of FOS as health-promoting pre-biotics.	354
350112	cd08998	GH43_Arb43a-like	Glycosyl hydrolase family 43 protein such as Bacillus subtilis subsp. subtilis str. 168 endo-alpha-1,5-L-arabinanase Arb43A. This glycosyl hydrolase family 43 (GH43) subgroup belongs to the glycosyl hydrolase clan F (according to carbohydrate-active enzymes database (CAZY)) which includes family 43 (GH43) and 62 (GH62) families. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. The GH43 ABN enzymes hydrolyze alpha-1,5-L-arabinofuranoside linkages while the ABF enzymes cleave arabinose side chains so that the combined actions of these two enzymes reduce arabinan to L-arabinose and/or arabinooligosaccharides. Many of these enzymes such as the Bacillus subtilis arabinanase Abn2, that hydrolyzes sugar beet arabinan (branched), linear alpha-1,5-L-arabinan and pectin, are different from other arabinases; they are organized into two different domains with a divalent metal cluster close to the catalytic residues to guarantee the correct protonation state of the catalytic residues and consequently the enzyme activity. These arabinan-degrading enzymes are important in the food industry for efficient production of L-arabinose from agricultural waste; L-arabinose is often used as a bioactive sweetener. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	278
350113	cd08999	GH43_ABN-like	Glycosyl hydrolase family 43 protein such as endo-alpha-L-arabinanase. This glycosyl hydrolase family 43 (GH43) subgroup includes mostly enzymes with alpha-L-arabinofuranosidase (ABF; EC 3.2.1.55) and endo-alpha-L-arabinanase (ABN; EC 3.2.1.99) activities. These are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. The GH43 ABN enzymes hydrolyze alpha-1,5-L-arabinofuranoside linkages while the ABF enzymes cleave arabinose side chains so that the combined actions of these two enzymes reduce arabinan to L-arabinose and/or arabinooligosaccharides. These arabinan-degrading enzymes are important in the food industry for efficient production of L-arabinose from agricultural waste; L-arabinose is often used as a bioactive sweetener. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	284
350114	cd09000	GH43_SXA-like	Glycosyl hydrolase family 43, such as Selenomonas ruminantium beta-D-xylosidase SXA. This glycosyl hydrolase family 43 (GH43) includes enzymes that have been characterized to mainly have beta-1,4-xylosidase (beta-D-xylosidase;xylan 1,4-beta-xylosidase; EC 3.2.1.37) activity, including Selenomonas ruminantium (Xsa;Sxa;SXA), Bifidobacterium adolescentis ATCC 15703 (XylC;XynB;BAD_0428) and Bacillus sp. KK-1 XylB. They are part of an array of hemicellulases that are involved in the final breakdown of plant cell-wall whereby they degrade xylan. They hydrolyze beta-1,4 glycosidic bonds between two xylose units in short xylooligosaccharides. These are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. These enzymes possess an additional C-terminal beta-sandwich domain that restricts access for substrates to a portion of the active site to form a pocket. The active-site pockets comprise two subsites, with binding capacity for two monosaccharide moieties and a single route of access for small molecules such as substrate. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	292
350115	cd09001	GH43_FsAxh1-like	Glycosyl hydrolase family 43 such as Fibrobacter succinogenes subsp. succinogenes S85 arabinoxylan alpha-L-arabinofuranosidase. This glycosyl hydrolase family 43 (GH43) includes mostly enzymes that have been annotated as having beta-1,4-xylosidase (beta-D-xylosidase; xylan 1,4-beta-xylosidase; EC 3.2.1.37) activity. They are part of an array of hemicellulases that are involved in the final breakdown of plant cell-wall whereby they degrade xylan. They hydrolyze beta-1,4 glycosidic bonds between two xylose units in short xylooligosaccharides. These are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. This subfamily includes the characterized Clostridium stercorarium F-9 beta-xylosidase Xyl43B. It also includes Humicola insolens AXHd3 (HiAXHd3), a GH43 arabinofuranosidase (EC 3.2.1.55) that hydrolyzes O3-linked arabinose of doubly substituted xylans, a feature of the polysaccharide that is recalcitrant to degradation. It possesses an additional C-terminal beta-sandwich domain such that the interface between the domains comprises a xylan binding cleft that houses the active site pocket. The HiAXHd3 active site is tuned to hydrolyze arabinofuranosyl or xylosyl linkages, and the topology of the distal regions of the substrate binding surface confers specificity. It also includes Fibrobacter succinogenes subsp. succinogenes S85 arabinoxylan alpha-L-arabinofuranosidase (Axh1;Fisuc_1769;FSU_2269), Paenibacillus sp. E18 alpha-L-arabinofuranosidase (Abf43A), Bifidobacterium adolescentis ATCC 15703 double substituted xylan alpha-1,3-L-specific arabinofuranosidase d3 (AXHd3;AXH-d3;BaAXH-d3;BAD_0301;E-AFAM2), and Chrysosporium lucknowense C1 arabinoxylan hydrolase / double substituted xylan alpha-1,3-L-arabinofuranosidase (Abn7;AXHd). A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	270
350116	cd09002	GH43_XYL-like	Glycosyl hydrolase family 43, beta-D-xylosidase (uncharacterized). This glycosyl hydrolase family 43 (GH43) subgroup includes enzymes that have been annotated as having beta-1,4-xylosidase (beta-D-xylosidase;xylan 1,4-beta-xylosidase; EC 3.2.1.37) activity. They are part of an array of hemicellulases that are involved in the final breakdown of plant cell-wall whereby they degrade xylan. They hydrolyze beta-1,4 glycosidic bonds between two xylose units in short xylooligosaccharides. These are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	271
350117	cd09003	GH43_XynD-like	Glycosyl hydrolase family 43 protein such as Bacillus subtilis arabinoxylan arabinofuranohydrolase (XynD;BsAXH-m23;BSU18160). This glycosyl hydrolase family 43 (GH43) subgroup includes characterized Bacillus subtilis arabinoxylan arabinofuranohydrolase (AXH), Caldicellulosiruptor sp. Tok7B.1 beta-1,4-xylanase (EC 3.2.1.8) / alpha-L-arabinosidase (EC 3.2.1.55) XynA, Caldicellulosiruptor sp. Rt69B.1 xylanase C (EC 3.2.1.8) XynC, and Caldicellulosiruptor saccharolyticus beta-xylosidase (EC 3.2.1.37) / alpha-L-arabinofuranosidase (EC 3.2.1.55) XynF. It belongs to the glycosyl hydrolase clan F (according to carbohydrate-active enzymes database (CAZY)) which includes family 43 (GH43) and 62 (GH62) families. It belongs to the GH43_AXH-like subgroup which includes enzymes that have been annotated as having beta-xylosidase, alpha-L-arabinofuranosidase and arabinoxylan alpha-L-1,3-arabinofuranohydrolase, xylanase (endo-alpha-L-arabinanase) as well as AXH activities. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. AXHs specifically hydrolyze the glycosidic bond between arabinofuranosyl substituents and xylopyranosyl backbone residues of arabinoxylan. Bacillus subtilis AXH (BsAXH-m2,3) has been shown to cleave arabinose units from O-2- or O-3-mono-substituted xylose residues and superposition of its structure with known structures of the GH43 exo-acting enzymes, beta-xylosidase and alpha-L-arabinanase, each in complex with their substrate, reveals a different orientation of the sugar backbone. Several of these enzymes also contain carbohydrate binding modules (CBMs) that bind cellulose or xylan. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	315
350118	cd09004	GH43_bXyl-like	Glycosyl hydrolase family 43 protein such as Bacteroides thetaiotaomicron VPI-5482 alpha-L-arabinofuranosidases (BT3675;BT_3675) and (BT3662;BT_3662); includes mostly xylanases. This glycosyl hydrolase family 43 (GH43) subgroup includes enzymes that have been annotated as having xylan-digesting beta-xylosidase (EC 3.2.1.37) and xylanase (endo-alpha-L-arabinanase, EC 3.2.1.8) activities, as well as the Bacteroides thetaiotaomicron VPI-5482 alpha-L-arabinofuranosidases (EC 3.2.1.55) (BT3675;BT_3675) and (BT3662;BT_3662). It belongs to the glycosyl hydrolase clan F (according to carbohydrate-active enzymes database (CAZY)) which includes family 43 (GH43) and 62 (GH62) families. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	266
350156	cd09005	NP-I	nucleoside phosphorylase-I family. The nucleoside phosphorylase-I family members accept a range of purine nucleosides as well as the pyrimidine nucleoside uridine. The NP-I family includes phosphorolytic nucleosidases such as purine nucleoside phosphorylase (PNP, EC 2.4.2.1), uridine phosphorylase (UP, EC 2.4.2.3), and 5'-deoxy-5'-methylthioadenosine phosphorylase (MTAP, EC 2.4.2.28), and hydrolytic nucleosidases such as AMP nucleosidase (AMN, EC 3.2.2.4) and 5'-methylthioadenosine/S-adenosylhomocysteine (MTA/SAH) nucleosidase (MTAN, EC 3.2.2.16). Members of this family display different physiologically relevant quaternary structures: hexameric (trimer-of-dimers arrangement of Shewanella oneidensis MR-1 UP); homotrimeric (human PNP and Escherichia coli PNPII or XapA); hexameric (with some evidence for co-existence of a trimeric form) such as E. coli PNPI (DeoD); or homodimeric such as human and Trypanosoma brucei UP. The NP-I family is distinct from nucleoside phosphorylase-II, which belongs to a different structural family.	216
350157	cd09006	PNP_EcPNPI-like	purine nucleoside phosphorylases similar to Escherichia coli PNP-I (DeoD) and Trichomonas vaginalis PNP. Escherichia coli purine nucleoside phosphorylase (PNP)-I (or DeoD) accepts both 6-oxo and 6-amino purine nucleosides as substrates. Trichomonas vaginalis PNP has broad substrate specificity, having phosphorolytic catalytic activity with adenosine, inosine, and guanosine (with adenosine as the preferred substrate). This subfamily belongs to the nucleoside phosphorylase-I (NP-I) family, whose members accept a range of purine nucleosides as well as the pyrimidine nucleoside uridine. The NP-I family includes phosphorolytic nucleosidases, such as purine nucleoside phosphorylase (PNPs, EC 2.4.2.1), uridine phosphorylase (UP, EC 2.4.2.3), and 5'-deoxy-5'-methylthioadenosine phosphorylase (MTAP, EC 2.4.2.28), and hydrolytic nucleosidases, such as AMP nucleosidase (AMN, EC 3.2.2.4), and 5'-methylthioadenosine/S-adenosylhomocysteine (MTA/SAH) nucleosidase (MTAN, EC 3.2.2.16). The NP-I family is distinct from nucleoside phosphorylase-II, which belongs to a different structural family.	228
350158	cd09007	NP-I_spr0068	uncharacterized subfamily of the nucleoside phosphorylase-I family. This subfamily is composed of uncharacterized members including Streptococcus pneumoniae hypothetical protein spr0068. The nucleoside phosphorylase-I (NP-I) family members accept a range of purine nucleosides as well as the pyrimidine nucleoside uridine. The NP-I family includes phosphorolytic nucleosidases such as purine nucleoside phosphorylase (PNP, EC 2.4.2.1), uridine phosphorylase (UP, EC 2.4.2.3), and 5'-deoxy-5'-methylthioadenosine phosphorylase (MTAP, EC 2.4.2.28), and hydrolytic nucleosidases such as AMP nucleosidase (AMN, EC 3.2.2.4) and 5'-methylthioadenosine/S-adenosylhomocysteine (MTA/SAH) nucleosidase (MTAN, EC 3.2.2.16). Members of the NP-I family display different physiologically relevant quaternary structures: hexameric (trimer-of-dimers arrangement of Shewanella oneidensis MR-1 UP); homotrimeric (human PNP and Escherichia coli PNPII or XapA); hexameric (with some evidence for co-existence of a trimeric form) such as E. coli PNPI (DeoD); or homodimeric such as human and Trypanosoma brucei UP. The NP-I family is distinct from nucleoside phosphorylase-II, which belongs to a different structural family.	221
350159	cd09008	MTAN	5'-methylthioadenosine/S-adenosylhomocysteine nucleosidases. This subfamily includes both bacterial and plant 5'-methylthioadenosine/S-adenosylhomocysteine (MTA/SAH) nucleosidases (MTANs): bacterial MTANs show comparable efficiency in hydrolyzing MTA and SAH, while plant enzymes are highly specific for MTA and are unable to metabolize SAH or show significantly reduced activity towards SAH. MTAN is involved in methionine and S-adenosyl-methionine recycling, polyamine biosynthesis, and bacterial quorum sensing. This subfamily belongs to the nucleoside phosphorylase-I (NP-I) family, whose members accept a range of purine nucleosides as well as the pyrimidine nucleoside uridine. The NP-I family includes phosphorolytic nucleosidases, such as purine nucleoside phosphorylase (PNPs, EC 2.4.2.1), uridine phosphorylase (UP, EC 2.4.2.3), and 5'-deoxy-5'-methylthioadenosine phosphorylase (MTAP, EC 2.4.2.28), and hydrolytic nucleosidases, such as AMP nucleosidase (AMN, EC 3.2.2.4), and 5'-methylthioadenosine/S-adenosylhomocysteine (MTA/SAH) nucleosidase (MTAN, EC 3.2.2.16). The NP-I family is distinct from nucleoside phosphorylase-II, which belongs to a different structural family.	222
350160	cd09009	PNP-EcPNPII_like	purine nucleoside phosphorylases similar to human PNP and Escherichia coli PNP-II (XapA). Human PNP catalyzes the reversible phosphorolysis of the purine nucleosides and deoxynucleosides inosine, guanosine, deoxyinosine, and deoxyguanosine. Patients with PNP deficiency typically present with severe immunodeficiency, neurological dysfunction, and autoimmunity. Escherichia coli PNPII, product of the xapA/pndA gene, catalyzes the phosphorolysis of xanthosine, inosine and guanosine with equal efficiency and has been referred to as xanthosine phosphorylase and inosine-guanosine phosphorylase. E. coli PNPII is also capable of converting nicotinamide to nicotinamide riboside, and may be involved in the NAD+ salvage pathway. It is one of two purine nucleoside phosphorylases found in E. coli, which also contains PNPI, which displays a different substrate specificity and belongs to a different subgroup of the nucleoside phosphorylase-I (NP-I) family than PNPII. NP-I family members accept a range of purine nucleosides as well as the pyrimidine nucleoside uridine. The NP-I family includes phosphorolytic nucleosidases, such as purine nucleoside phosphorylase (PNPs, EC 2.4.2.1), uridine phosphorylase (UP, EC 2.4.2.3), and 5'-deoxy-5'-methylthioadenosine phosphorylase (MTAP, EC 2.4.2.28), and hydrolytic nucleosidases, such as AMP nucleosidase (AMN, EC 3.2.2.4), and 5'-methylthioadenosine/S-adenosylhomocysteine (MTA/SAH) nucleosidase (MTAN, EC 3.2.2.16). The NP-I family is distinct from nucleoside phosphorylase-II, which belongs to a different structural family.	265
350161	cd09010	MTAP_SsMTAPII_like_MTIP	5'-deoxy-5'-methylthioadenosine phosphorylases (MTAP) similar to Sulfolobus solfataricus MTAPII and Pseudomonas aeruginosa PAO1 5'-methylthioinosine phosphorylase (MTIP). MTAP catalyzes the reversible phosphorolysis of 5'-deoxy-5'-methylthioadenosine (MTA) to adenine and 5-methylthio-D-ribose-1-phosphate. This subfamily includes human MTAP which is highly specific for MTA, and Sulfolobus solfataricus MTAPII which accepts adenosine in addition to MTA. Two MTAPs have been isolated from S. solfataricus: SsMTAP1 and SsMTAPII; SsMTAP1 belongs to a different subfamily of the nucleoside phosphorylase-I (NP-I) family. This group also includes Pseudomonas aeruginosa PAO1 MTI phosphorylase (MTIP) which uses 5'-methylthioinosine (MTI) as a preferred substrate, and does not use MTA. NP-I family members accept a range of purine nucleosides as well as the pyrimidine nucleoside uridine. The NP-I family includes phosphorolytic nucleosidases, such as purine nucleoside phosphorylase (PNPs, EC 2.4.2.1), uridine phosphorylase (UP, EC 2.4.2.3), and 5'-deoxy-5'-methylthioadenosine phosphorylase (MTAP, EC 2.4.2.28), and hydrolytic nucleosidases, such as AMP nucleosidase (AMN, EC 3.2.2.4), and 5'-methylthioadenosine/S-adenosylhomocysteine (MTA/SAH) nucleosidase (MTAN, EC 3.2.2.16). The NP-I family is distinct from nucleoside phosphorylase-II, which belongs to a different structural family.	238
319953	cd09011	VOC_like	uncharacterized subfamily of vicinal oxygen chelate (VOC) family. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping.	122
319954	cd09012	VOC_like	uncharacterized subfamily of vicinal oxygen chelate (VOC) family. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping.	127
319955	cd09013	BphC-JF8_N_like	N-terminal, non-catalytic, domain of BphC_JF8, (2,3-dihydroxybiphenyl 1,2-dioxygenase) from Bacillus sp. JF8, and similar proteins. 2,3-dihydroxybiphenyl 1,2-dioxygenase (BphC) catalyzes the extradiol ring cleavage reaction of 2,3-dihydroxybiphenyl, a key step in the polychlorinated biphenyls (PCBs) degradation pathway (bph pathway). BphC belongs to the type I extradiol dioxygenase family, which requires a metal ion in the active site in its catalytic mechanism. Polychlorinated biphenyl degrading bacteria demonstrate a multiplicity of BphCs. This subfamily of BphC is represented by the enzyme purified from the thermophilic biphenyl and naphthalene degrader, Bacillus sp. JF8. The members in this family of BphC enzymes may use either Mn(II) or Fe(II) as cofactors. The enzyme purified from Bacillus sp. JF8 is Mn(II)-dependent; however, the enzyme from Rhodococcus jostii RHA1 has Fe(II) bound to it. BphC_JF8 is thermostable and its optimum activity is at 85 degrees C. The enzymes in this family have an internal duplication. This family represents the N-terminal repeat.	121
319956	cd09014	BphC-JF8_C_like	C-terminal, catalytic domain of BphC_JF8, (2,3-dihydroxybiphenyl 1,2-dioxygenase). 2,3-dihydroxybiphenyl 1,2-dioxygenase (BphC) catalyzes the extradiol ring cleavage reaction of 2,3-dihydroxybiphenyl, a key step in the polychlorinated biphenyls (PCBs) degradation pathway (bph pathway). BphC belongs to the type I extradiol dioxygenase family, which requires a metal ion in the active site in its catalytic mechanism. Polychlorinated biphenyl degrading bacteria demonstrate a multiplicity of BphCs. This subfamily of BphC is represented by the enzyme purified from the thermophilic biphenyl and naphthalene degrader, Bacillus sp. JF8. The members in this family of BphC enzymes may use either Mn(II) or Fe(II) as cofactors. The enzyme purified from Bacillus sp. JF8 is Mn(II)-dependent; however, the enzyme from Rhodococcus jostii RHA1 has Fe(II) bound to it. BphC_JF8 is thermostable and its optimum activity is at 85 degrees C. The enzymes in this family have an internal duplication. This family represents the C-terminal repeat.	167
212511	cd09015	Ureohydrolase	Ureohydrolase superfamily includes arginase, formiminoglutamase, agmatinase and proclavaminate amidinohydrolase (PAH). This family, also known as arginase-like amidino hydrolase family, includes Mn-dependent enzymes: arginase (Arg, EC 3.5.3.1), formimidoylglutamase (HutG, EC 3.5.3.8), agmatinase (SpeB, EC 3.5.3.11), guanidinobutyrase (Gbh, EC 3.5.3.7), proclavaminate amidinohydrolase (PAH, EC 3.5.3.22) and related proteins. These enzymes catalyze hydrolysis of an amide bond. They are involved in control of cellular levels of arginine and ornithine (both involved in protein biosynthesis, and production of creatine, polyamines, proline and nitric oxide), in histidine and arginine degradation, and in clavulanic acid biosynthesis.	270
176656	cd09018	DEDDy_polA_RNaseD_like_exo	DEDDy 3'-5' exonuclease domain of family-A DNA polymerases, RNase D, WRN, and similar proteins. DEDDy exonucleases, part of the DnaQ-like (or DEDD) exonuclease superfamily, catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. They contain four invariant acidic residues in three conserved sequence motifs termed ExoI, ExoII and ExoIII. DEDDy exonucleases are classified as such because of the presence of a specific YX(3)D pattern at ExoIII. The four conserved acidic residues serve as ligands for the two metal ions required for catalysis. This family of DEDDy exonucleases includes the proofreading domains of family A DNA polymerases, as well as RNases such as RNase D and yeast Rrp6p. The Egalitarian (Egl) and Bacillus-like DNA Polymerase I subfamilies do not possess a completely conserved YX(3)D pattern at the ExoIII motif. In addition, the Bacillus-like DNA polymerase I subfamily has inactive 3'-5' exonuclease domains which do not possess the metal-binding residues necessary for activity.	150
185696	cd09019	galactose_mutarotase_like	galactose mutarotase_like. Galactose mutarotase catalyzes the conversion of beta-D-galactose to alpha-D-galactose. Beta-D-galactose is produced by the degradation of lactose, a disaccharide composed of beta-D-glucose and beta-D-galactose. This epimerization reaction is the first step in the four-step Leloir pathway, which converts galactose into metabolically important glucose. This epimerization step is followed by the phosphorylation of alpha-D-galactose by galactokinase, an enzyme which can only act on the alpha anomer. A glutamate and a histidine residue of the galactose mutarotase have been shown to be critical for catalysis; the glutamate serves as the active site base to initiate the reaction by removing the proton from the C-1 hydroxyl group of the sugar substrate, and the histidine as the active site acid to protonate the C-5 ring oxygen. Galactose mutarotase is a member of the aldose-1-epimerase superfamily.	326
185697	cd09020	D-hex-6-P-epi_like	D-hexose-6-phosphate epimerase-like. D-Hexose-6-phosphate epimerase Ymr099c from Saccharomyces cerevisiae belongs to the large superfamily of aldose-1-epimerases. Its active site is very similar to the catalytic site of galactose mutarotase, the best studied member of the superfamily. It also contains the conserved glutamate and histidine residues that have been shown in galactose mutarotase to be critical for catalysis, the glutamate serving as the active site base to initiate the reaction by removing the proton from the C-1 hydroxyl group of the sugar substrate, and the histidine as the active site acid to protonate the C-5 ring oxygen. In addition Ymr099c contains 2 conserved arginine residues which are involved in phosphate binding, and exhibits hexose-6-phosphate mutarotase activity on glucose-6-P, galactose-6-P and mannose-6-P.	269
185698	cd09021	Aldose_epim_Ec_YphB	aldose 1-epimerase, similar to Escherichia coli YphB. Proteins similar to Escherichia coli YphB are uncharacterized members of the aldose-1-epimerase superfamily. Aldose 1-epimerases or mutarotases are key enzymes of carbohydrate metabolism, catalyzing the interconversion of the alpha- and beta-anomers of hexose sugars such as glucose and galactose. This interconversion is an important step that allows anomer specific metabolic conversion of sugars. Studies of the catalytic mechanism of the best known member of the family, galactose mutarotase, have shown a glutamate and a histidine residue to be critical for catalysis; the glutamate serves as the active site base to initiate the reaction by removing the proton from the C-1 hydroxyl group of the sugar substrate, and the histidine as the active site acid to protonate the C-5 ring oxygen.	273
185699	cd09022	Aldose_epim_Ec_YihR	Aldose 1-epimerase, similar to Escherichia coli YihR. Proteins similar to Escherichia coli YihR are uncharacterized members of aldose-1-epimerase superfamily. Aldose 1-epimerases or mutarotases are key enzymes of carbohydrate metabolism, catalyzing the interconversion of the alpha- and beta-anomers of hexose sugars such as glucose and galactose. This interconversion is an important step that allows anomer specific metabolic conversion of sugars. Studies of the catalytic mechanism of the best known member of the family, galactose mutarotase, have shown a glutamate and a histidine residue to be critical for catalysis; the glutamate serves as the active site base to initiate the reaction by removing the proton from the C-1 hydroxyl group of the sugar substrate, and the histidine as the active site acid to protonate the C-5 ring oxygen.	284
185700	cd09023	Aldose_epim_Ec_c4013	Aldose 1-epimerase, similar to Escherichia coli c4013. Proteins, similar to Escherichia coli c4013, are uncharacterized members of aldose-1-epimerase superfamily. Aldose 1-epimerases or mutarotases are key enzymes of carbohydrate metabolism, catalyzing the interconversion of the alpha- and beta-anomers of hexose sugars such as glucose and galactose. This interconversion is an important step that allows anomer specific metabolic conversion of sugars. Studies of the catalytic mechanism of the best known member of the family, galactose mutarotase, have shown a glutamate and a histidine residue to be critical for catalysis; the glutamate serves as the active site base to initiate the reaction by removing the proton from the C-1 hydroxyl group of the sugar substrate, and the histidine as the active site acid to protonate the C-5 ring oxygen.	284
185701	cd09024	Aldose_epim_lacX	Aldose 1-epimerase, similar to Lactococcus lactis lacX. Proteins similar to Lactococcus lactis lacX are uncharacterized members of aldose-1-epimerase superfamily. Aldose 1-epimerases or mutarotases are key enzymes of carbohydrate metabolism, catalyzing the interconversion of the alpha- and beta-anomers of hexose sugars such as glucose and galactose. This interconversion is an important step that allows anomer specific metabolic conversion of sugars. Studies of the catalytic mechanism of the best known member of the family, galactose mutarotase, have shown a glutamate and a histidine residue to be critical for catalysis; the glutamate serves as the active site base to initiate the reaction by removing the proton from the C-1 hydroxyl group of the sugar substrate, and the histidine as the active site acid to protonate the C-5 ring oxygen.	288
185702	cd09025	Aldose_epim_Slr1438	Aldose 1-epimerase, similar to Synechocystis Slr1438. Proteins similar to Synechocystis Slr1438 are uncharacterized members of aldose-1-epimerase superfamily. Aldose 1-epimerases or mutarotases are key enzymes of carbohydrate metabolism, catalyzing the interconversion of the alpha- and beta-anomers of hexose sugars such as glucose and galactose. This interconversion is an important step that allows anomer specific metabolic conversion of sugars. Studies of the catalytic mechanism of the best known member of the family, galactose mutarotase, have shown a glutamate and a histidine residue to be critical for catalysis; the glutamate serves as the active site base to initiate the reaction by removing the proton from the C-1 hydroxyl group of the sugar substrate, and the histidine as the active site acid to protonate the C-5 ring oxygen.	271
193601	cd09027	PET	PET (Prickle Espinas Testin) domain is involved in protein-protein interactions. PET domain is involved in protein-protein interactions and is usually found in conjunction with LIM domain, which is also a protein-protein interaction domain. The PET containing proteins serve as adaptors or scaffolds to support the assembly of multimeric protein complexes. The PET domain has been found at the N-terminal of four known groups of proteins: prickle, testin, LIMPETin/LIM-9 and overexpressed breast tumor protein (OEBT). Prickle has been implicated in regulation of cell movement through its association with the Dishevelled (Dsh) protein in the planar cell polarity (PCP) pathway. Testin is a cytoskeleton associated focal adhesion protein that localizes along actin stress fibers, at cell contact areas, and at focal adhesion plaques. It interacts with a variety of cytoskeletal proteins, including zyxin, mena, VASP, talin, and actin, and is involved in cell motility and adhesion events. Knockout mice experiments reveal tumor repressor function of Testin. LIMPETin/LIM-9 contains an N-terminal PET domain and 6 LIM domains at the C-terminal. In Schistosoma mansoni, where LIMPETin was first identified, it is down regulated in sexually mature adult females compared to sexually immature adult females and adult males. Its differential expression indicates that it is a transcription regulator. In C. elegans, LIM-9 may play a role in regulating the assembly and maintenance of the muscle A-band by forming a protein complex with SCPL-1 and UNC-89 and other proteins. OEBT displays a PET domain with two LIM domains, and is predicted to be localized in the nucleus with a possible role in cancer differentiation.	82
350085	cd09028	ArfGap_ArfGap3	Arf1 GTPase-activating protein 3. ArfGAP (ADP Ribosylation Factor GTPase Activating Protein) domain is a part of ArfGap1-like proteins that play a crucial role in controlling membrane trafficking, particularly in the formation of COPI (coat protein complex I)-coated vesicles on Golgi membranes. The ArfGAP1 protein subfamily consists of three members: ArfGAP1 (Gcs1p in yeast), ArfGAP2 and ArfGAP3 (both are homologs of yeast Glo3p). ArfGAP2/3 are closely related, but with little similarity to ArfGAP1, except the catalytic ArfGAP domain. They promote hydrolysis of GTP bound to the small G protein ADP-ribosylation factor 1 (Arf1), which leads to the dissociation of coat proteins from Golgi-derived membranes and vesicles. Dissociation of the coat proteins is required for the fusion of these vesicles with target compartments. Thus, the GAP catalytic activity plays a key role in the formation of COPI vesicles from Golgi membrane. In contrast to ArfGAP1, which displays membrane curvature-dependent ArfGAP activity, ArfGAP2 and ArfGAP3 activities are dependent on coatomer (the core COPI complex) which is required for efficient recruitment of ArfGAP2 and ArfGAP3 to the Golgi membrane. Accordingly, ArfGAP2/3 has been implicated in coatomer-mediated protein transport between the Golgi complex and the endoplasmic reticulum. Unlike ArfGAP1, which is controlled by membrane curvature through its amphipathic lipid packing sensor (ALPS) motifs, ArfGAP2/3 do not possess an ALPS motif.	120
350086	cd09029	ArfGap_ArfGap2	Arf1 GTPase-activating protein 2. ArfGAP (ADP Ribosylation Factor GTPase Activating Protein) domain is a part of ArfGap1-like proteins that play a crucial role in controlling membrane trafficking, particularly in the formation of COPI (coat protein complex I)-coated vesicles on Golgi membranes. The ArfGAP1 protein subfamily consists of three members: ArfGAP1 (Gcs1p in yeast), ArfGAP2 and ArfGAP3 (both are homologs of yeast Glo3p). ArfGAP2/3 are closely related, but with little similarity to ArfGAP1, except the catalytic ArfGAP domain. They promote hydrolysis of GTP bound to the small G protein ADP-ribosylation factor 1 (Arf1), which leads to the dissociation of coat proteins from Golgi-derived membranes and vesicles. Dissociation of the coat proteins is required for the fusion of these vesicles with target compartments. Thus, the GAP catalytic activity plays a key role in the formation of COPI vesicles from Golgi membrane. In contrast to ArfGAP1, which displays membrane curvature-dependent ArfGAP activity, ArfGAP2 and ArfGAP3 activities are dependent on coatomer (the core COPI complex) which is required for efficient recruitment of ArfGAP2 and ArfGAP3 to the Golgi membrane. Accordingly, ArfGAP2/3 has been implicated in coatomer-mediated protein transport between the Golgi complex and the endoplasmic reticulum. Unlike ArfGAP1, which is controlled by membrane curvature through its amphipathic lipid packing sensor (ALPS) motifs, ArfGAP2/3 do not possess an ALPS motif.	120
176923	cd09030	DUF1425	Putative periplasmic lipoprotein. This bacterial family of proteins contains members described as putative lipoproteins; some are also known as YcfL. The function of this family is unknown. Family members have also been annotated as predicted periplasmic lipoproteins (COG5633), and appear to contain an N-terminal membrane lipoprotein lipid attachment site (pfam08139), which is not included in this alignment model.	101
411807	cd09031	KH-I_NOVA_rpt3	third type I K homology (KH) RNA-binding domain found in the family of neuro-oncological ventral antigen (Nova). The family includes two related neuronal RNA-binding proteins, Nova-1 and Nova-2. Nova-1, also called onconeural ventral antigen 1, or paraneoplastic Ri antigen, or ventral neuron-specific protein 1, may regulate RNA splicing or metabolism in a specific subset of developing neurons. It interacts with RNA containing repeats of the YCAY sequence. It is a brain-enriched splicing factor regulating neuronal alternative splicing. Nova-1 is involved in neurological disorders and carcinogenesis.  Nova-2, also called astrocytic NOVA1-like RNA-binding protein, is a neuronal RNA-binding protein expressed in a broader central nervous system (CNS) distribution than Nova-1. It functions in neuronal RNA metabolism. NOVA family proteins contain three K-homology (KH) RNA-binding domains. The model corresponds to the third one.	71
411808	cd09032	KH-I_N4BP1_like_rpt1	first type I K homology (KH) RNA-binding domain found in the family of NEDD4-binding protein 1 (N4BP1). The N4BP1 family includes N4BP1, NYN domain and retroviral integrase catalytic domain-containing protein (NYNRIN) and KH and NYN domain-containing protein (KHNYN). These proteins are probably of retroviral origin. N4BP1 interacts with and is a substrate of NEDD4 ubiquitin ligase (neural precursor cell expressed, developmentally downregulated 4, E3 ubiquitin protein ligase). It is also an inhibitor of the E3 ubiquitin-protein ligase ITCH, a NEDD4 structurally related E3. N4BP1 acts by interacting with the second WW domain of ITCH, thereby competing with ITCH's substrates and impairing ubiquitination of substrates. NYNRIN, also known as CGIN1/Cousin of GIN1, may contribute to retroviral resistance in mammals by regulating the ubiquitination of viral proteins. KHNYN acts as a novel cofactor for zinc finger antiviral protein (ZAP) to target CpG-containing retroviral RNA for degradation. Members of this family contain two type I K homology (KH) RNA-binding domains. The model corresponds to the first one. The KH1 domain is a divergent KH domain that lacks the RNA-binding GXXG motif.	65
411809	cd09033	KH-I_PNPT1	type I K homology (KH) RNA-binding domain found in mitochondrial polyribonucleotide nucleotidyltransferase 1 (PNPT1) and similar proteins. PNPT1, also called 3'-5' RNA exonuclease OLD35, or PNPase old-35, or polynucleotide phosphorylase 1, or PNPase 1, or polynucleotide phosphorylase-like protein, is an RNA-binding protein implicated in numerous RNA metabolic processes. It catalyzes the phosphorolysis of single-stranded polyribonucleotides processively in the 3'-to-5' direction. It acts as a mitochondrial intermembrane factor with RNA-processing exoribonuclease activity. PNPT1 is a component of the mitochondrial degradosome (mtEXO) complex, that degrades 3' overhang double-stranded RNA with a 3'-to-5' directionality in an ATP-dependent manner. It is involved in the degradation of non-coding mitochondrial transcripts (MT-ncRNA) and tRNA-like molecules and required for correct processing and polyadenylation of mitochondrial mRNAs. PNPT1 also plays a role as a cytoplasmic RNA import factor that mediates the translocation of small RNA components, like the 5S RNA, the RNA subunit of ribonuclease P and the mitochondrial RNA-processing (MRP) RNA, into the mitochondrial matrix.	67
185761	cd09034	BRO1_Alix_like	Protein-interacting Bro1-like domain of mammalian Alix and related domains. This superfamily includes the Bro1-like domains of mammalian Alix (apoptosis-linked gene-2 interacting protein X), His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), RhoA-binding proteins Rhophilin-1 and Rhophilin-2, Brox, Bro1 and Rim20 (also known as PalA) from Saccharomyces cerevisiae, and related domains. Alix, HD-PTP, Brox, Bro1 and Rim20 interact with the ESCRT (Endosomal Sorting Complexes Required for Transport) system. Alix, also known as apoptosis-linked gene-2 interacting protein 1 (AIP1), participates in membrane remodeling processes during the budding of enveloped viruses, vesicle budding inside late endosomal multivesicular bodies (MVBs), and the abscission reactions of mammalian cell division. It also functions in apoptosis. HD-PTP functions in cell migration and endosomal trafficking, Bro1 in endosomal trafficking, and Rim20 in the response to the external pH via the Rim101 pathway. Bro1-like domains are boomerang-shaped, and part of the domain is a tetratricopeptide repeat (TPR)-like structure. Bro1-like domains bind components of the ESCRT-III complex: CHMP4 (in the case of Alix, HD-PTP, and Brox) and Snf7 (in the case of yeast Bro1, and Rim20). The single domain protein human Brox, and the isolated Bro1-like domains of Alix, HD-PTP and Rhophilin can bind human immunodeficiency virus type 1 (HIV-1) nucleocapsid. Alix, HD-PTP, Bro1, and Rim20 also have a V-shaped (V) domain, which in the case of Alix, has been shown to be a dimerization domain and to contain a binding site for the retroviral late assembly (L) domain YPXnL motif, which is partially conserved in this superfamily. Alix, HD-PTP and Bro1 also have a proline-rich region (PRR); the Alix PRR binds multiple partners. Rhophilin-1, and -2, in addition to this Bro1-like domain, have an N-terminal Rho-binding domain and a C-terminal PDZ (PSD-95, Disc-large, ZO-1) domain. HD-PTP is encoded by the PTPN23 gene, a tumor suppressor gene candidate frequently absent in human kidney, breast, lung, and cervical tumors. This protein has a C-terminal, catalytically inactive tyrosine phosphatase domain.	345
176924	cd09071	FAR_C	C-terminal domain of fatty acyl CoA reductases. C-terminal domain of fatty acyl CoA reductases, a family of SDR-like proteins. SDRs or short-chain dehydrogenases/reductases are Rossmann-fold NAD(P)H-binding proteins. Many proteins in this FAR_C family may function as fatty acyl-CoA reductases (FARs), acting on medium and long chain fatty acids, and have been reported to be involved in diverse processes such as the biosynthesis of insect pheromones, plant cuticular wax production, and mammalian wax biosynthesis. In Arabidopsis thaliana, proteins with this particular architecture have also been identified as the MALE STERILITY 2 (MS2) gene product, which is implicated in male gametogenesis. Mutations in MS2 inhibit the synthesis of exine (sporopollenin), rendering plants unable to reduce pollen wall fatty acids to corresponding alcohols. The function of this C-terminal domain is unclear.	92
197307	cd09073	ExoIII_AP-endo	Escherichia coli exonuclease III (ExoIII)-like apurinic/apyrimidinic (AP) endonucleases. The ExoIII family AP endonucleases belong to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. AP endonucleases participate in the DNA base excision repair (BER) pathway. AP sites are one of the most common lesions in cellular DNA. During BER, the damaged DNA is first recognized by DNA glycosylase. AP endonucleases then catalyze the hydrolytic cleavage of the phosphodiester bond 5' to the AP site, which is then followed by the coordinated actions of DNA polymerase, deoxyribose phosphatase, and DNA ligase. If left unrepaired, AP sites block DNA replication, which has both mutagenic and cytotoxic effects. AP endonucleases can carry out a wide range of excision and incision reactions on DNA, including 3'-5' exonuclease, 3'-deoxyribose phosphodiesterase, 3'-phosphatase, and occasionally, nonspecific DNase activities. Different AP endonuclease enzymes catalyze the different reactions with different efficiencies. Many organisms have two functional AP endonucleases, for example, APE1/Ref-1 and Ape2 in humans, Apn1 and Apn2 in baker's yeast, Nape and NExo in Neisseria meningitidis, and exonuclease III (ExoIII) and endonuclease IV (EndoIV) in Escherichia coli. Usually, one of the two is the dominant AP endonuclease; the other has weak AP endonuclease activity, but exhibits strong 3'-5' exonuclease, 3'-deoxyribose phosphodiesterase, and 3'-phosphatase activities. Class II AP endonucleases have been classified into two families, designated ExoIII and EndoIV, based on their homology to the Escherichia coli enzymes. This family contains the ExoIII family; the EndoIV family belongs to a different superfamily.	251
197308	cd09074	INPP5c	Catalytic domain of inositol polyphosphate 5-phosphatases. Inositol polyphosphate 5-phosphatases (5-phosphatases) are signal-modifying enzymes, which hydrolyze the 5-phosphate from the inositol ring of specific 5-position phosphorylated phosphoinositides (PIs) and inositol phosphates (IPs), such as PI(4,5)P2, PI(3,4,5)P3, PI(3,5)P2, I(1,4,5)P3, and I(1,3,4,5)P4. These enzymes are Mg2+-dependent, and belong to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. In addition to this INPP5c domain, 5-phosphatases often contain additional domains and motifs, such as the SH2 domain, the Sac-1 domain, the proline-rich domain (PRD), CAAX, RhoGAP (RhoGTPase-activating protein), and SKICH [SKIP (skeletal muscle- and kidney-enriched inositol phosphatase) carboxyl homology] domains, that are important for protein-protein interactions and/or for the subcellular localization of these enzymes. 5-phosphatases incorporate into large signaling complexes, and regulate diverse cellular processes including postsynaptic vesicular trafficking, insulin signaling, cell growth and survival, and endocytosis. Loss or gain of function of 5-phosphatases is implicated in certain human diseases. This family also contains a functionally unrelated nitric oxide transport protein, Cimex lectularius (bedbug) nitrophorin, which catalyzes a heme-assisted S-nitrosation of a proximal thiolate; the heme however binds at a site distinct from the active site of the 5-phosphatases.	299
197309	cd09075	DNase1-like	Deoxyribonuclease 1 and related proteins. This family includes Deoxyribonuclease 1 (DNase1, EC 3.1.21.1) and related proteins. DNase1, also known as DNase I, is a Ca2+, Mg2+/Mn2+-dependent secretory endonuclease, first isolated from bovine pancreas extracts. It cleaves DNA preferentially at phosphodiester linkages next to a pyrimidine nucleotide, producing 5'-phosphate terminated polynucleotides with a free hydroxyl group on position 3'. It generally produces tetranucleotides. DNase1 substrates include single-stranded DNA, double-stranded DNA, and chromatin. This enzyme may be responsible for apoptotic DNA fragmentation. Other deoxyribonucleases in this subfamily include human DNL1L (human DNase I lysosomal-like, also known as DNASE1L1, Xib and DNase X ), human DNASE1L2 (also known as DNAS1L2), and DNASE1L3 (also known as DNAS1L3, nhDNase, LS-DNase, DNase Y, and DNase gamma). DNASE1L3 is also implicated in apoptotic DNA fragmentation. DNase1 is also a cytoskeletal protein which binds actin. A recombinant form of human DNase1 is used as a mucoactive therapy in patients with cystic fibrosis; it hydrolyzes the extracellular DNA in sputum and reduces its viscosity. Mutations in the gene encoding DNase1 have been associated with Systemic Lupus Erythematosus, a multifactorial autoimmune disease. This family also includes a subfamily of mostly uncharacterized proteins, which includes Mycoplasma pulmonis MnuA, a membrane-associated nuclease. The in vivo role of MnuA is as yet undetermined. This family belongs to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds.	258
197310	cd09076	L1-EN	Endonuclease domain (L1-EN) of the non-LTR retrotransposon LINE-1 (L1), and related domains. This family contains the endonuclease domain (L1-EN) of the non-LTR retrotransposon LINE-1 (L1), and related domains, including the endonuclease of Xenopus laevis Tx1. These retrotransposons belong to the subtype 2, L1-clade. LINEs can be classified into two subtypes. Subtype 2 has two ORFs: the second (ORF2) encodes a modular protein consisting of an N-terminal apurine/apyrimidine endonuclease domain (EN), a central reverse transcriptase, and a zinc-finger-like domain at the C-terminus. LINE-1/L1 elements (full length and truncated) comprise about 17% of the human genome. This endonuclease nicks the genomic DNA at the consensus target sequence 5'TTTT-AA3' producing a ribose 3'-hydroxyl end as a primer for reverse transcription of associated template RNA. This subgroup also includes the endonuclease of Xenopus laevis Tx1, another member of the L1-clade. This family belongs to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds.	236
197311	cd09077	R1-I-EN	Endonuclease domain encoded by various R1- and I-clade non-long terminal repeat retrotransposons. This family contains the endonuclease (EN) domain of various non-long terminal repeat (non-LTR) retrotransposons, long interspersed nuclear elements (LINEs) which belong to the subtype 2, R1- and I-clade. LINEs can be classified into two subtypes. Subtype 2 has two ORFs: the second (ORF2) encodes a modular protein consisting of an N-terminal apurine/apyrimidine endonuclease domain (EN), a central reverse transcriptase, and a zinc-finger-like domain at the C-terminus. Most non-LTR retrotransposons are inserted throughout the host genome; however, many retrotransposons of the R1 clade exhibit target-specific retrotransposition. This family includes the endonucleases of SART1 and R1bm, from the silkworm Bombyx mori, which belong to the R1-clade. It also includes the endonuclease of snail (Biomphalaria glabrata) Nimbus/Bgl and mosquito Aedes aegypti (MosquI), both of which belong to the I-clade. This family belongs to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds.	205
197312	cd09078	nSMase	Neutral sphingomyelinases (nSMase) catalyze the hydrolysis of sphingomyelin in biological membranes to ceramide and phosphorylcholine. Sphingomyelinases (SMase) are phosphodiesterases that catalyze the hydrolysis of sphingomyelin to ceramide and phosphorylcholine. Eukaryotic SMases have been classified according to their pH optima and are known as acid SMase, alkaline SMase, and neutral SMase (nSMase). Eukaryotic proteins in this family are nSMases, and are activated by a variety of stress-inducing agents such as cytokines or UV radiation. Ceramides and other metabolic derivatives, including sphingosine, are lipid "second messenger" molecules that participate in the regulation of stress-induced cellular responses, including cell death, adhesion, differentiation, and proliferation. Bacterial neutral SMases, which also belong to this domain family, are secreted proteins that act as membrane-damaging virulence factors. They promote colonization of the host tissue. This family belongs to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds.	280
197313	cd09079	RgfB-like	Streptococcus agalactiae RgfB, part of a putative two component signal transduction system, and related proteins. This family includes Streptococcus agalactiae RgfB (for regulator of fibrinogen binding) and related proteins. The function of RgfB is unknown. It is part of a putative two component signal transduction system designated rgfBDAC (the rgf locus was identified in a screen for mutants of Streptococcus agalactiae with altered binding to fibrinogen). RgfA, -C, and -D do not belong to this superfamily: rgfA encodes a putative response regulator, and rgfC, a putative histidine kinase. All four genes are co-transcribed, and may be involved in regulating expression of bacterial cell surface components. This family belongs to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds.	259
197314	cd09080	TDP2	Phosphodiesterase domain of human TDP2, a 5'-tyrosyl DNA phosphodiesterase, and related domains. Human TDP2, also known as TTRAP (TRAF/TNFR-associated factors, and tumor necrosis factor receptor/TNFR-associated protein), is a 5'-tyrosyl DNA phosphodiesterase. It is required for the efficient repair of topoisomerase II-induced DNA double strand breaks. The topoisomerase is covalently linked by a phosphotyrosyl bond to the 5'-terminus of the break. TDP2 cleaves the DNA 5'-phosphodiester bond and restores 5'-phosphate termini, needed for subsequent DNA ligation, and hence repair of the break. TDP2 and 3'-tyrosyl DNA phosphodiesterase (TDP1) are complementary activities; together, they allow cells to remove trapped topoisomerase from both 3'- and 5'-DNA termini. TTRAP has been reported as being involved in apoptosis, embryonic development, and transcriptional regulation, and it may inhibit the activation of nuclear factor-kB. This family belongs to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds.	248
197315	cd09081	CdtB	CdtB, the catalytic DNase I-like subunit of cytolethal distending toxin (CDT) protein. CDT is a secreted protein toxin produced by a number of Gram-negative disease-causing bacteria. CDT causes cell cycle arrest and eventual cell death in eukaryotic cells, as a result of chromosomal DNA damage caused by the catalytic, DNase I-like, CdtB subunit. Bacterial CDTs are generally comprised of three subunits, CdtA, -B and -C. CdtB is translocated  into the host cell, where it acts as a genotoxin. CdtA and CdtC are needed for cell surface binding and cellular entry, and it is likely that they remain associated with the membrane, when CdtB is internalized. CdtB enters the target nucleus via nuclear translocation signal domain(s). This family belongs to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds.	247
197316	cd09082	Deadenylase	C-terminal deadenylase domain of CCR4, nocturnin, and related domains. This family contains the C-terminal catalytic domains of the deadenylases, CCR4 and nocturnin, and related domains. Nocturnin is a poly(A)-specific 3' exonuclease that specifically degrades the 3' poly(A) tail of RNA in a process known as deadenylation. This nuclease activity is manganese dependent. Nocturnin is expressed in the cytoplasm of the Xenopus laevis retinal photoreceptor cells in a rhythmic fashion, and it has been proposed that it participates in posttranscriptional regulation of the circadian clock or its outputs, and that the mRNA target(s) of this deadenylase are circadian clock-related. Saccharomyces cerevisiae CCR4p is a 3'-5' poly(A) RNA and ssDNA exonuclease. It is the catalytic subunit of the yeast mRNA deadenylase (Ccr4p/Pop2p/Not complex). This complex participates in various ways in mRNA metabolism, including transcription initiation and elongation, and mRNA degradation. The deadenylase activities of Ccr4p and nocturnin differ: nocturnin degrades poly(A), Ccr4p degrades both poly(A) and single-stranded DNA, and in contrast to Ccr4p, nocturnin appears to function in a highly processive manner. This family belongs to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds.	348
197317	cd09083	EEP-1	Exonuclease-Endonuclease-Phosphatase domain; uncharacterized family 1. This family of uncharacterized proteins belongs to a superfamily that includes the catalytic domain (exonuclease/endonuclease/phosphatase, EEP, domain) of a diverse set of proteins including the ExoIII family of apurinic/apyrimidinic (AP) endonucleases, inositol polyphosphate 5-phosphatases (INPP5), neutral sphingomyelinases (nSMases), deadenylases (such as the vertebrate circadian-clock regulated nocturnin), bacterial cytolethal distending toxin B (CdtB), deoxyribonuclease 1 (DNase1), the endonuclease domain of the non-LTR retrotransposon LINE-1, and related domains. These diverse enzymes share a common catalytic mechanism of cleaving phosphodiester bonds. Their substrates range from nucleic acids to phospholipids and perhaps, proteins.	252
197318	cd09084	EEP-2	Exonuclease-Endonuclease-Phosphatase (EEP) domain superfamily; uncharacterized family 2. This family of uncharacterized proteins belongs to a superfamily that includes the catalytic domain (exonuclease/endonuclease/phosphatase, EEP, domain) of a diverse set of proteins including the ExoIII family of apurinic/apyrimidinic (AP) endonucleases, inositol polyphosphate 5-phosphatases (INPP5), neutral sphingomyelinases (nSMases), deadenylases (such as the vertebrate circadian-clock regulated nocturnin), bacterial cytolethal distending toxin B (CdtB), deoxyribonuclease 1 (DNase1), the endonuclease domain of the non-LTR retrotransposon LINE-1, and related domains. These diverse enzymes share a common catalytic mechanism of cleaving phosphodiester bonds; their substrates range from nucleic acids to phospholipids and perhaps, proteins.	246
197319	cd09085	Mth212-like_AP-endo	Methanothermobacter thermautotrophicus Mth212-like subfamily of the ExoIII family apurinic/apyrimidinic (AP) endonucleases. This subfamily includes the thermophilic archaeon Methanothermobacter thermautotrophicus Mth212 and related proteins. These are Escherichia coli exonuclease III (ExoIII)-like AP endonucleases and they belong to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. AP endonucleases participate in the DNA base excision repair (BER) pathway. AP sites are one of the most common lesions in cellular DNA. During BER, the damaged DNA is first recognized by DNA glycosylase. AP endonucleases then catalyze the hydrolytic cleavage of the phosphodiester bond 5' to the AP site, and this is followed by the coordinated actions of DNA polymerase, deoxyribose phosphatase, and DNA ligase. If left unrepaired, AP sites block DNA replication, and have both mutagenic and cytotoxic effects. AP endonucleases can carry out a variety of excision and incision reactions on DNA, including 3'-5' exonuclease, 3'-deoxyribose phosphodiesterase, 3'-phosphatase, and occasionally, nonspecific DNase activities. Different AP endonuclease enzymes catalyze the different reactions with different efficiencies. Mth212 is an AP endonuclease, and a DNA uridine endonuclease (U-endo) that nicks double-stranded DNA at the 5'-side of a 2'-d-uridine residue. After incision at the 5'-side of a 2'-d-uridine residue by Mth212, DNA polymerase B takes over the 3'-OH terminus and carries out repair synthesis, generating a 5'-flap structure that is resolved by a 5'-flap endonuclease. Finally, DNA ligase seals the resulting nick. This U-endo activity shares the same catalytic center as its AP-endo activity, and is absent from other AP endonuclease homologues.	252
197320	cd09086	ExoIII-like_AP-endo	Escherichia coli exonuclease III (ExoIII) and Neisseria meningitidis NExo-like subfamily of the ExoIII family apurinic/apyrimidinic (AP) endonucleases. This subfamily includes Escherichia coli ExoIII, Neisseria meningitidis NExo, and related proteins. These are ExoIII family AP endonucleases and they belong to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. AP endonucleases participate in the DNA base excision repair (BER) pathway. AP sites are one of the most common lesions in cellular DNA. During BER, the damaged DNA is first recognized by DNA glycosylase. AP endonucleases then catalyze the hydrolytic cleavage of the phosphodiester bond 5' to the AP site, and this is followed by the coordinated actions of DNA polymerase, deoxyribose phosphatase, and DNA ligase. If left unrepaired, AP sites block DNA replication, and have both mutagenic and cytotoxic effects. AP endonucleases can carry out a variety of excision and incision reactions on DNA, including 3'-5' exonuclease, 3'-deoxyribose phosphodiesterase, 3'-phosphatase, and occasionally, nonspecific DNase activities. Different AP endonuclease enzymes catalyze the different reactions with different efficiencies. Many organisms have two AP endonucleases: usually one is the dominant AP endonuclease and the other has weak AP endonuclease activity. Examples are Neisseria meningitidis Nape and NExo, and exonuclease III (ExoIII) and endonuclease IV (EndoIV) in Escherichia coli. NExo and ExoIII are found in this subfamily. NExo is the non-dominant AP endonuclease. It exhibits strong 3'-5' exonuclease and 3'-deoxyribose phosphodiesterase activities. Escherichia coli ExoIII is an active AP endonuclease, and in addition, it exhibits double strand (ds)-specific 3'-5' exonuclease, exonucleolytic RNase H, 3'-phosphomonoesterase and 3'-phosphodiesterase activities, all catalyzed by a single active site. Class II AP endonucleases have been classified into two families, designated ExoIII and EndoIV, based on their homology to the Escherichia coli enzymes ExoIII and endonuclease IV (EndoIV). This subfamily belongs to the ExoIII family; the EndoIV family belongs to a different superfamily.	254
197321	cd09087	Ape1-like_AP-endo	Human Ape1-like subfamily of the ExoIII family apurinic/apyrimidinic (AP) endonucleases. This subfamily includes human Ape1 (also known as Apex, Hap1, or Ref-1) and related proteins. These are Escherichia coli exonuclease III (ExoIII)-like AP endonucleases and they belong to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. AP endonucleases participate in the DNA base excision repair (BER) pathway. AP sites are one of the most common lesions in cellular DNA. During BER, the damaged DNA is first recognized by DNA glycosylase. AP endonucleases then catalyze the hydrolytic cleavage of the phosphodiester bond 5' to the AP site, and this is followed by the coordinated actions of DNA polymerase, deoxyribose phosphatase, and DNA ligase. If left unrepaired, AP sites block DNA replication, and have both mutagenic and cytotoxic effects. AP endonucleases can carry out a variety of excision and incision reactions on DNA, including 3'-5' exonuclease, 3'-deoxyribose phosphodiesterase, 3'-phosphatase, and occasionally, nonspecific DNase activities. Different AP endonuclease enzymes catalyze the different reactions with different efficiencies. Many organisms have two AP endonucleases: usually one is the dominant AP endonuclease and the other has weak AP endonuclease activity; for example, Ape1 and Ape2 in humans. Ape1 is found in this subfamily; it exhibits strong AP-endonuclease activity but shows weak 3'-5' exonuclease and 3'-phosphodiesterase activities. Class II AP endonucleases have been classified into two families, designated ExoIII and EndoIV, based on their homology to the Escherichia coli enzymes exonuclease III (ExoIII) and endonuclease IV (EndoIV). This subfamily belongs to the ExoIII family; the EndoIV family belongs to a different superfamily.	253
197322	cd09088	Ape2-like_AP-endo	Human Ape2-like subfamily of the ExoIII family apurinic/apyrimidinic (AP) endonucleases. This subfamily includes human APE2, Saccharomyces cerevisiae Apn2/Eth1, and related proteins. These are Escherichia coli exonuclease III (ExoIII)-like AP endonucleases and they belong to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. AP endonucleases participate in the DNA base excision repair (BER) pathway. AP sites are one of the most common lesions in cellular DNA. During BER, the damaged DNA is first recognized by DNA glycosylase. AP endonucleases then catalyze the hydrolytic cleavage of the phosphodiester bond 5' to the AP site, and this is followed by the coordinated actions of DNA polymerase, deoxyribose phosphatase, and DNA ligase. If left unrepaired, AP sites block DNA replication, and have both mutagenic and cytotoxic effects. AP endonucleases can carry out a variety of excision and incision reactions on DNA, including 3'-5' exonuclease, 3'-deoxyribose phosphodiesterase, 3'-phosphatase, and occasionally, nonspecific DNase activities. Different AP endonuclease enzymes catalyze the different reactions with different efficiencies. Many organisms have two AP endonucleases: usually one is the dominant AP endonuclease and the other has weak AP endonuclease activity. Examples are Ape1 and Ape2 in humans, and Apn1 and Apn2 in baker's yeast. Ape2 and Apn2/Eth1 are both found in this subfamily, and have the weaker AP endonuclease activity. Ape2 shows strong 3'-5' exonuclease and 3'-phosphodiesterase activities; it can reduce the mutagenic consequences of attack by reactive oxygen species by removing 3'-end adenine opposite from 8-oxoG, in addition to repairing 3'-damaged termini. Apn2/Eth1 exhibits AP endonuclease activity, but has 30-40 fold more active 3'-phosphodiesterase and 3'-5' exonuclease activities. Class II AP endonucleases have been classified into two families, designated ExoIII and EndoIV, based on their homology to the Escherichia coli enzymes exonuclease III (ExoIII) and endonuclease IV (EndoIV). This subfamily belongs to the ExoIII family; the EndoIV family belongs to a different superfamily.	309
197323	cd09089	INPP5c_Synj	Catalytic inositol polyphosphate 5-phosphatase (INPP5c) domain of synaptojanins. This subfamily contains the INPP5c domains of two human synaptojanins, synaptojanin 1 (Synj1) and synaptojanin 2 (Synj2), and related proteins. It belongs to a family of Mg2+-dependent inositol polyphosphate 5-phosphatases, which hydrolyze the 5-phosphate from the inositol ring of various 5-position phosphorylated phosphoinositides (PIs) and inositol phosphates (IPs). They belong to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. Synj1 occurs as two main isoforms: a brain enriched 145 kDa protein (Synj1-145) and a ubiquitously expressed 170 kDa protein (Synj1-170). Synj1-145 participates in clathrin-mediated endocytosis. The primary substrate of the Synj1-145 INPP5c domain is PI(4,5)P2, which it converts to PI4P. Synj1-145 may work with membrane curvature sensors/generators (such as endophilin) to remove PI(4,5)P2 from curved membranes. The recruitment of the INPP5c domain of Synj1-145 to endophilin-induced membranes leads to a fragmentation and condensation of these structures. The PI(4,5)P2 to PI4P conversion may cooperate with dynamin to produce membrane fission. In addition to this INPP5c domain, Synjs contain an N-terminal Sac1-like domain; the Sac1 domain can dephosphorylate a variety of phosphoinositides in vitro. Synj2 can hydrolyze phosphatidylinositol diphosphate (PIP2) to phosphatidylinositol phosphate (PIP). Synj2 occurs as multiple alternative splice variants in various tissues. These variants share the INPP5c domain and the Sac1 domain. Synj2A is recruited to the mitochondria via its interaction with OMP25 (a mitochondrial outer membrane protein). Synj2B is found at nerve terminals in the brain and at the spermatid manchette in testis. Synj2B undergoes further alternative splicing to give 2B1 and 2B2. In clathrin-mediated endocytosis, Synj2 participates in the formation of clathrin-coated pits, and perhaps also in vesicle decoating. Rac1 GTPase regulates the intracellular localization of Synj2 forms, but not Synj1. Synj2 may contribute to the role of Rac1 in cell migration and invasion, and is a potential target for therapeutic intervention in malignant tumors.	328
197324	cd09090	INPP5c_ScInp51p-like	Catalytic inositol polyphosphate 5-phosphatase (INPP5c) domain of Saccharomyces cerevisiae Inp51p, Inp52p, and Inp53p, and related proteins. This subfamily contains the INPP5c domain of three Saccharomyces cerevisiae synaptojanin-like inositol polyphosphate 5-phosphatases (INP51, INP52, and INP53), Schizosaccharomyces pombe synaptojanin (SPsynaptojanin), and related proteins. It belongs to a family of Mg2+-dependent inositol polyphosphate 5-phosphatases, which hydrolyze the 5-phosphate from the inositol ring of various 5-position phosphorylated phosphoinositides (PIs) and inositol phosphates (IPs), and to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. In addition to this INPP5c domain, these proteins have an N-terminal catalytic Sac1-like domain (found in other proteins including the phosphoinositide phosphatase Sac1p), and a C-terminal proline-rich domain (PRD). The Sac1 domain allows Inp52p and Inp53p to recognize and dephosphorylate a wider range of substrates including PI3P, PI4P, and PI(3,5)P2. The Sac1 domain of Inp51p is non-functional. Disruption of any two of INP51, INP52, and INP53, in S. cerevisiae leads to abnormal vacuolar and plasma membrane morphology. During hyperosmotic stress, Inp52p and Inp53p localize at actin patches, where they may facilitate the hydrolysis of PI(4,5)P2, and consequently promote actin rearrangement to regulate cell growth. SPsynaptojanin is also active against a range of soluble and lipid inositol phosphates, including I(1,4,5)P3, I(1,3,4,5)P4, I(1,4,5,6)P4, PI(4,5)P2, and PIP3. Transformation of S. cerevisiae with a plasmid expressing the SPsynaptojanin 5-phosphatase domain rescues inp51/inp52/inp53 triple-mutant strains.	291
197325	cd09091	INPP5c_SHIP	Catalytic inositol polyphosphate 5-phosphatase (INPP5c) domain of SH2 domain containing inositol polyphosphate 5-phosphatase-1 and -2, and related proteins. This subfamily contains the INPP5c domain of SHIP1 (SH2 domain containing inositol polyphosphate 5-phosphatase-1, also known as SHIP/INPP5D), and SHIP2 (also known as INPPL1). It belongs to a family of Mg2+-dependent inositol polyphosphate 5-phosphatases, which hydrolyze the 5-phosphate from the inositol ring of various 5-position phosphorylated phosphoinositides (PIs) and inositol phosphates (IPs), and to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. Both SHIP1 and -2 catalyze the dephosphorylation of the PI, phosphatidylinositol 3,4,5-trisphosphate [PI(3,4,5)P3], to phosphatidylinositol 3,4-bisphosphate [PI(3,4)P2]. SHIP1 also converts inositol-1,3,4,5-polyphosphate [I(1,3,4,5)P4] to inositol-1,3,4-polyphosphate [I(1,3,4)P3]. SHIP1 and SHIP2 have little overlap in their in vivo functions. SHIP1 is a negative regulator of cell growth and plays a major part in mediating the inhibitory signaling in B cells; it is predominantly expressed in hematopoietic cells. SHIP2 is an inhibitor of the insulin signaling pathway, and is implicated in actin structure remodeling, cell adhesion and cell spreading, receptor endocytosis and degradation, and in the JIP1-mediated JNK pathway. SHIP2 is widely expressed, most prominently in brain, heart and skeletal muscle. In addition to this INPP5c domain, SHIP1 has an N-terminal SH2 domain, two NPXY motifs, and a C-terminal proline-rich region (PRD), while SHIP2 has an N-terminal SH2 domain, a C-terminal proline-rich domain (PRD), which includes a WW-domain binding motif (PPLP), an NPXY motif, and a sterile alpha motif (SAM) domain. The gene encoding SHIP2 is a candidate gene for conferring a predisposition for type 2 diabetes.	307
197326	cd09092	INPP5A	Type I inositol polyphosphate 5-phosphatase I. Type I inositol polyphosphate 5-phosphatase I (INPP5A) hydrolyzes the 5-phosphate from inositol 1,3,4,5-tetrakisphosphate [I(1,3,4,5)P4] and inositol 1,4,5-trisphosphate [I(1,4,5)P3]. It belongs to a family of Mg2+-dependent inositol polyphosphate 5-phosphatases, which hydrolyze the 5-phosphate from the inositol ring of various 5-position phosphorylated phosphoinositides (PIs) and inositol phosphates (IPs), and to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. As the substrates of INPP5A mobilize intracellular calcium ions, INPP5A is a calcium signal-terminating enzyme. In platelets, phosphorylated pleckstrin binds and activates INPP5A in a 1:1 complex, and accelerates the degradation of the calcium ion-mobilizing I(1,4,5)P3.	383
197327	cd09093	INPP5c_INPP5B	Catalytic inositol polyphosphate 5-phosphatase (INPP5c) domain of Type II inositol polyphosphate 5-phosphatase I, Oculocerebrorenal syndrome of Lowe 1, and related proteins. This subfamily contains the INPP5c domain of type II inositol polyphosphate 5-phosphatase I (INPP5B), Oculocerebrorenal syndrome of Lowe 1 (OCRL-1), and related proteins. It belongs to a family of Mg2+-dependent inositol polyphosphate 5-phosphatases, which hydrolyze the 5-phosphate from the inositol ring of various 5-position phosphorylated phosphoinositides (PIs) and inositol phosphates (IPs), and to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. INPP5B and OCRL1 preferentially hydrolyze the 5-phosphate of phosphatidylinositol (4,5)-bisphosphate [PI(4,5)P2] and phosphatidylinositol (3,4,5)-trisphosphate [PI(3,4,5)P3]. INPP5B can also hydrolyze soluble inositol (1,4,5)-trisphosphate [I(1,4,5)P3] and inositol (1,3,4,5)-tetrakisphosphate [I(1,3,4,5)P4]. INPP5B participates in the endocytic pathway and in the early secretory pathway. In the latter, it may function in retrograde ERGIC (ER-to-Golgi intermediate compartment)-to-ER transport; it binds specific RAB proteins within the secretory pathway. In the endocytic pathway, it binds RAB5 and during endocytosis, may function in a RAB5-controlled cascade for converting PI(3,4,5)P3 to phosphatidylinositol 3-phosphate (PI3P). This cascade may link growth factor signaling and membrane dynamics. Mutation in OCRL1 is implicated in Lowe syndrome, an X-linked recessive multisystem disorder, which includes defects in eye, brain, and kidney function, and in Type 2 Dent's disease, a disorder with only the renal symptoms. OCRL-1 may have a role in membrane trafficking within the endocytic pathway and at the trans-Golgi network, and may participate in actin dynamics or signaling from endomembranes. OCRL1 and INPP5B have overlapping functions: deletion of both 5-phosphatases in mice is embryonic lethal, deletion of OCRL1 alone has no phenotype, and deletion of Inpp5b alone has only a mild phenotype (male sterility). Several of the proteins that interact with OCRL1 also bind INPP5B, for example, inositol polyphosphate phosphatase interacting protein of 27kDa (IPIP27)A and B (also known as Ses1 and 2), and endocytic signaling adaptor APPL1. OCRL1, but not INPP5B, binds clathrin heavy chain and the plasma membrane AP2 adaptor subunit alpha-adaptin. In addition to this INPP5c domain, most proteins in this subfamily have a C-terminal RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain.	292
197328	cd09094	INPP5c_INPP5J-like	Catalytic inositol polyphosphate 5-phosphatase (INPP5c) domain of inositol polyphosphate 5-phosphatase J and related proteins. INPP5c domain of Inositol polyphosphate-5-phosphatase J (INPP5J), also known as PIB5PA or PIPP, and related proteins. This subfamily belongs to a family of Mg2+-dependent inositol polyphosphate 5-phosphatases, which hydrolyze the 5-phosphate from the inositol ring of various 5-position phosphorylated phosphoinositides (PIs) and inositol phosphates (IPs), and to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. INPP5J hydrolyzes PI(4,5)P2, I(1,4,5)P3, and I(1,3,4,5)P4 at ruffling membranes. These proteins contain a C-terminal, SKIP carboxyl homology domain (SKICH), which may direct plasma membrane ruffle localization.	300
197329	cd09095	INPP5c_INPP5E-like	Catalytic inositol polyphosphate 5-phosphatase (INPP5c) domain of Inositol polyphosphate-5-phosphatase E and related proteins. INPP5c domain of Inositol polyphosphate-5-phosphatase E (also called type IV or 72 kDa 5-phosphatase), rat pharbin, and related proteins. This subfamily belongs to a family of Mg2+-dependent inositol polyphosphate 5-phosphatases, which hydrolyze the 5-phosphate from the inositol ring of various 5-position phosphorylated phosphoinositides (PIs) and inositol phosphates (IPs), and to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. INPP5E hydrolyzes the 5-phosphate from PI(3,5)P2, PI(4,5)P2 and PI(3,4,5)P3, forming PI3P, PI4P, and PI(3,4)P2, respectively. It is a very potent PI(3,4,5)P3 5-phosphatase. Its intracellular localization is chiefly cytosolic, with pronounced perinuclear/Golgi localization. INPP5E also has an N-terminal proline rich domain (PRD) and a C-terminal CAAX motif. This protein is expressed in a variety of tissues, including the breast, brain, testis, and haemopoietic cells. It is differentially expressed in several cancers; for example, it is up-regulated in cervical cancer and down-regulated in stomach cancer. It is a candidate target for therapeutics of obesity and related disorders, as it is expressed in the hypothalamus, and following insulin stimulation, it undergoes tyrosine phosphorylation, associates with insulin receptor substrate-1, -2, and PI3-kinase, and becomes active as a 5-phosphatase. INPP5E may play a role, along with other 5-phosphatases SHIP2 and SKIP, in regulating glucose homoeostasis and energy metabolism. Mice deficient in INPP5E develop a multi-organ disorder associated with structural defects of the primary cilium.	298
197330	cd09096	Deadenylase_nocturnin	C-terminal deadenylase domain of nocturnin and related domains. This subfamily contains the C-terminal catalytic domain of the deadenylase, nocturnin, and related domains. Nocturnin is a poly(A)-specific 3' exonuclease that specifically degrades the 3' poly(A) tail of RNA in a process known as deadenylation. This nuclease activity is manganese dependent. Nocturnin is expressed in the cytoplasm of Xenopus laevis retinal photoreceptor cells in a rhythmic fashion, and it has been proposed that it participates in posttranscriptional regulation of the circadian clock or its outputs, and that the mRNA target(s) of this deadenylase are circadian clock-related. In mouse, the nocturnin gene, mNoc, is expressed in a circadian pattern in a range of tissues including retina, spleen, heart, kidney, and liver. It is highly expressed in bone-marrow stromal cells, adipocytes and hepatocytes. In mammals, nocturnin plays a role in regulating mesenchymal stem-cell lineage allocation, perhaps through regulating PPAR-gamma (peroxisome proliferator-activated receptor-gamma) nuclear translocation. This subfamily belongs to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds.	280
197331	cd09097	Deadenylase_CCR4	C-terminal deadenylase domain of CCR4 and related domains. This subfamily contains the C-terminal catalytic domain of the deadenylases, Saccharomyces cerevisiae Ccr4p and two vertebrate homologs (CCR4a and CCR4b), and related domains. CCR4 belongs to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. CCR4 is the major deadenylase subunit of the CCR4-NOT transcription complex, which contains two deadenylase subunits and several noncatalytic subunits. The other deadenylase subunit, Caf1 (called Pop2 in yeast), is a DEDD-type protein and does not belong in this superfamily. Saccharomyces cerevisiae CCR4 (or Ccr4p) is a 3'-5' poly(A) RNA and ssDNA exonuclease. It is the catalytic subunit of the yeast mRNA deadenylase (Ccr4p/Pop2p/Not complex). This complex participates in various ways in mRNA metabolism, including transcription initiation and elongation, and mRNA degradation. Ccr4p degrades both poly(A) and single-stranded DNA. There are two vertebrate homologs of Ccr4p, CCR4a (also called CCR4-NOT transcription complex subunit 6 or CNOT6) and CCR4b (also called CNOT6-like or CNOT6L), which independently associate with other components to form distinct CCR4-NOT multisubunit complexes. The nuclease domain of CNOT6 and CNOT6L exhibits Mg2+-dependent deadenylase activity, with specificity for poly (A) RNA as substrate. CCR4a is a component of P-bodies and is necessary for foci formation. CCR4b regulates p27/Kip1 mRNA levels, thereby influencing cell cycle progression. They both contribute to the prevention of cell death by regulating insulin-like growth factor-binding protein 5.	329
197332	cd09098	INPP5c_Synj1	Catalytic inositol polyphosphate 5-phosphatase (INPP5c) domain of synaptojanin 1. This subfamily contains the INPP5c domains of human synaptojanin 1 (Synj1) and related proteins. It belongs to a family of Mg2+-dependent inositol polyphosphate 5-phosphatases, which hydrolyze the 5-phosphate from the inositol ring of various 5-position phosphorylated phosphoinositides (PIs) and inositol phosphates (IPs), and to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. Synj1 occurs as two main isoforms: a brain enriched 145 kDa protein (Synj1-145) and a ubiquitously expressed 170 kDa protein (Synj1-170). Synj1-145 participates in clathrin-mediated endocytosis. The primary substrate of the Synj1-145 INPP5c domain is PI(4,5)P2, which it converts to PI4P. Synj1-145 may work with membrane curvature sensors/generators (such as endophilin) to remove PI(4,5)P2 from curved membranes. The recruitment of the INPP5c domain of Synj1-145 to endophilin-induced membranes leads to a fragmentation and condensation of these structures. The PI(4,5)P2 to PI4P conversion may cooperate with dynamin to produce membrane fission. In addition to this INPP5c domain, these proteins contain an N-terminal Sac1-like domain; the Sac1 domain can dephosphorylate a variety of phosphoinositides in vitro.	336
197333	cd09099	INPP5c_Synj2	Catalytic inositol polyphosphate 5-phosphatase (INPP5c) domain of synaptojanin 2. This subfamily contains the INPP5c domains of human synaptojanin 2 (Synj2) and related proteins. It belongs to a family of Mg2+-dependent inositol polyphosphate 5-phosphatases, which hydrolyze the 5-phosphate from the inositol ring of various 5-position phosphorylated  phosphoinositides (PIs) and inositol phosphates (IPs), and to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. Synj2 can hydrolyze phosphatidylinositol diphosphate (PIP2) to phosphatidylinositol phosphate (PIP). In addition to this INPP5c domain, these proteins contain an N-terminal Sac1-like domain; the Sac1 domain can dephosphorylate a variety of phosphoinositides in vitro. Synj2 occurs as multiple alternative splice variants in various tissues. These variants share the INPP5c domain and the Sac1 domain. Synj2A is recruited to the mitochondria via its interaction with OMP25, a mitochondrial outer membrane protein. Synj2B is found at nerve terminals in the brain and at the spermatid manchette in testis. Synj2B undergoes further alternative splicing to give 2B1 and 2B2. In clathrin-mediated endocytosis, Synj2 participates in the formation of clathrin-coated pits, and perhaps also in vesicle decoating. Rac1 GTPase regulates the intracellular localization of Synj2 forms, but not Synj1. Synj2 may contribute to the role of Rac1 in cell migration and invasion, and is a potential target for therapeutic intervention in malignant tumors.	336
197334	cd09100	INPP5c_SHIP1-INPP5D	Catalytic inositol polyphosphate 5-phosphatase (INPP5c) domain of SH2 domain containing inositol polyphosphate 5-phosphatase-1 and related proteins. This subfamily contains the INPP5c domain of SHIP1 (SH2 domain containing inositol polyphosphate 5-phosphatase-1, also known as SHIP/INPP5D) and related proteins. It belongs to a family of Mg2+-dependent inositol polyphosphate 5-phosphatases, which hydrolyze the 5-phosphate from the inositol ring of various 5-position phosphorylated phosphoinositides (PIs) and inositol phosphates (IPs), and to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. SHIP1's enzymic activity is restricted to phosphatidylinositol 3,4,5-trisphosphate [PI (3,4,5)P3] and inositol-1,3,4,5- polyphosphate [I(1,3,4,5)P4]. It converts these two phosphoinositides to phosphatidylinositol 3,4-bisphosphate [PI (3,4)P2] and inositol-1,3,4-polyphosphate [I(1,3,4)P3], respectively. SHIP1 is a negative regulator of cell growth and plays a major part in mediating the inhibitory signaling in B cells; it is predominantly expressed in hematopoietic cells. In addition to this INPP5c domain, SHIP1 has an N-terminal SH2 domain, two NPXY motifs, and a C-terminal proline-rich region (PRD). SHIP1's phosphorylated NPXY motifs interact with proteins with phosphotyrosine binding (PTB) domains, and facilitate the translocation of SHIP1 to the plasma membrane to hydrolyze PI(3,4,5)P3. SHIP1 generally acts to oppose the activity of phosphatidylinositol 3-kinase (PI3K). It acts as a negative signaling molecule, reducing the levels of PI(3,4,5)P3, thereby removing the latter as a membrane-targeting signal for PH domain-containing effector molecules. SHIP1 may also, in certain contexts, amplify PI3K signals. SHIP1 and SHIP2 have little overlap in their in vivo functions.	307
197335	cd09101	INPP5c_SHIP2-INPPL1	Catalytic inositol polyphosphate 5-phosphatase (INPP5c) domain of SH2 domain containing inositol 5-phosphatase-2 and related proteins. This subfamily contains the INPP5c domain of SHIP2 (SH2 domain containing inositol 5-phosphatase-2, also called INPPL1) and related proteins. It belongs to a family of Mg2+-dependent inositol polyphosphate 5-phosphatases, which hydrolyze the 5-phosphate from the inositol ring of various 5-position phosphorylated phosphoinositides (PIs) and inositol phosphates (IPs), and to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. SHIP2 catalyzes the dephosphorylation of the PI, phosphatidylinositol 3,4,5-trisphosphate [PI(3,4,5)P3], to phosphatidylinositol 3,4-bisphosphate [PI(3,4)P2]. SHIP2 is widely expressed, most prominently in brain, heart and in skeletal muscle. SHIP2 is an inhibitor of the insulin signaling pathway. It is implicated in actin structure remodeling, cell adhesion and cell spreading, receptor endocytosis and degradation, and in the JIP1-mediated JNK pathway. Its interacting partners include filamin/actin, p130Cas, Shc, Vinexin, Intersectin 1, and c-Jun NH2-terminal kinase (JNK)-interacting protein 1 (JIP1). A large variety of extracellular stimuli appear to lead to the tyrosine phosphorylation of SHIP2, including epidermal growth factor (EGF), platelet-derived growth factor (PDGF), insulin, macrophage colony-stimulating factor (M-CSF) and hepatocyte growth factor (HGF). SHIP2 is localized to the cytosol in quiescent cells; following growth factor stimulation and/or cell adhesion, it relocalizes to membrane ruffles. In addition to this INPP5c domain, SHIP2 has an N-terminal SH2 domain, a C-terminal proline-rich domain (PRD), which includes a WW-domain binding motif (PPLP), an NPXY motif and a sterile alpha motif (SAM) domain. The gene encoding SHIP2 is a candidate for conferring a predisposition for type 2 diabetes; it has been suggested that suppression of SHIP2 may be of benefit in the treatment of obesity and thereby prevent type 2 diabetes. SHIP2 and SHIP1 have little overlap in their in vivo functions.	304
197201	cd09102	PLDc_CDP-OH_P_transf_II_1	Catalytic domain, repeat 1, of CDP-alcohol phosphatidyltransferase class-II family members. Catalytic domain, repeat 1, of CDP-alcohol phosphatidyltransferase class-II family members, which mainly include gram-negative bacterial phosphatidylserine synthases (PSS; CDP-diacylglycerol--serine O-phosphatidyltransferase, EC 2.7.8.8), yeast phosphatidylglycerophosphate synthase (PGP synthase; CDP-diacylglycerol--glycerol-3-phosphate 3-phosphatidyltransferase, EC 2.7.8.5), and metazoan PGP synthase 1. All members in this subfamily have two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterize the phospholipase D (PLD) superfamily. They may utilize a common two-step ping-pong catalytic mechanism, involving a substrate-enzyme intermediate, to cleave phosphodiester bonds. The two motifs are suggested to constitute the active site involving phosphatidyl group transfer. Phosphatidylserine synthases from gram-positive bacteria and eukaryotes, and prokaryotic phosphatidylglycerophosphate synthases are not members of this subfamily.	168
197202	cd09103	PLDc_CDP-OH_P_transf_II_2	Catalytic domain, repeat 2, of CDP-alcohol phosphatidyltransferase class-II family members. Catalytic domain, repeat 2, of CDP-alcohol phosphatidyltransferase class-II family members, which mainly include gram-negative bacterial phosphatidylserine synthases (PSS; CDP-diacylglycerol--serine O-phosphatidyltransferase, EC 2.7.8.8), yeast phosphatidylglycerophosphate synthase (PGP synthase; CDP-diacylglycerol--glycerol-3-phosphate 3-phosphatidyltransferase, EC 2.7.8.5), and metazoan PGP synthase 1. All members in this subfamily have two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterize the phospholipase D (PLD) superfamily. They may utilize a common two-step ping-pong catalytic mechanism, involving a substrate-enzyme intermediate, to cleave phosphodiester bonds. The two motifs are suggested to constitute the active site involving phosphatidyl group transfer. Phosphatidylserine synthases from gram-positive bacteria and eukaryotes, and prokaryotic phosphatidylglycerophosphate synthases are not members of this subfamily.	184
197203	cd09104	PLDc_vPLD1_2_like_1	Catalytic domain, repeat 1, of vertebrate phospholipases, PLD1 and PLD2, and similar proteins. Catalytic domain, repeat 1, of phospholipase D (PLD, EC 3.1.4.4) found in yeast, plants, and vertebrates, and their bacterial homologs. PLDs are involved in signal transduction, vesicle formation, protein transport, and mitosis by participating in phospholipid metabolism. They hydrolyze the terminal phosphodiester bond of phospholipids resulting in the formation of phosphatidic acid and alcohols. Phosphatidic acid is an essential compound involved in signal transduction. PLDs also catalyze the transphosphatidylation of phospholipids to acceptor alcohols, by which various phospholipids can be synthesized. Both prokaryotic and eukaryotic PLDs have two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterize the phospholipase D (PLD) superfamily. PLDs are active as bi-lobed monomers. Each monomer contains two domains, each of which carries one copy of the HKD motif. Two HKD motifs from two domains form a single active site. PLDs utilize a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group.	147
197204	cd09105	PLDc_vPLD1_2_like_2	Catalytic domain, repeat 2, of vertebrate phospholipases, PLD1 and PLD2, and similar proteins. Catalytic domain, repeat 2, of phospholipase D (PLD, EC 3.1.4.4) found in yeast, plants, and vertebrates, and their bacterial homologs. PLDs are involved in signal transduction, vesicle formation, protein transport, and mitosis by participating in phospholipid metabolism. They hydrolyze the terminal phosphodiester bond of phospholipids resulting in the formation of phosphatidic acid and alcohols. Phosphatidic acid is an essential compound involved in signal transduction. PLDs also catalyze the transphosphatidylation of phospholipids to acceptor alcohols, by which various phospholipids can be synthesized. Both prokaryotic and eukaryotic PLDs have two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterize the phospholipase D (PLD) superfamily. PLDs are active as bi-lobed monomers. Each monomer contains two domains, each of which carries one copy of the HKD motif. Two HKD motifs from two domains form a single active site. PLDs utilize a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group.	146
197205	cd09106	PLDc_vPLD3_4_5_like_1	Putative catalytic domain, repeat 1, of vertebrate phospholipases, PLD3, PLD4 and PLD5, viral envelope proteins K4 and p37, and similar proteins. Putative catalytic domain, repeat 1, of vertebrate phospholipases D, PLD3, PLD4, and PLD5 (EC 3.1.4.4), viral envelope proteins (vaccinia virus proteins K4 and p37), and similar proteins. Most family members contain two copies of the HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue), and have been classified into the phospholipase D (PLD) superfamily. Proteins in this subfamily are associated with Golgi membranes, altering their lipid content by the conversion of phospholipids into phosphatidic acid, which is thought to be involved in the regulation of lipid movement. ADP ribosylation factor (ARF), a small guanosine triphosphate binding protein, might be required for activity. The vaccinia virus p37 protein, encoded by the F13L gene, is also associated with Golgi membranes and is required for the envelopment and spread of the extracellular enveloped virus (EEV). The vaccinia virus protein K4, encoded by the HindIII K4L gene, remains to be characterized. Sequence analysis indicates that the vaccinia virus proteins K4 and p37 might have evolved from one or more captured eukaryotic genes involved in cellular lipid metabolism. To date, no catalytic activity of PLD3 has been shown. Furthermore, due to the lack of functionally important histidine and lysine residues in the HKD motif, mammalian PLD5 has been characterized as an inactive PLD. The poxvirus p37 proteins may also lack PLD enzymatic activity, since they contain only one partially conserved HKD motif (N-x-K-x(4)-D).	153
197206	cd09107	PLDc_vPLD3_4_5_like_2	Putative catalytic domain, repeat 2, of vertebrate phospholipases, PLD3, PLD4 and PLD5, viral envelope proteins K4 and p37, and similar proteins. Putative catalytic domain, repeat 2, of vertebrate phospholipases D, PLD3, PLD4, and PLD5 (EC 3.1.4.4), viral envelope proteins (vaccinia virus proteins K4 and p37), and similar proteins. Most family members contain two copies of the HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue), and have been classified into the phospholipase D (PLD) superfamily. Proteins in this subfamily are associated with Golgi membranes, altering their lipid content by the conversion of phospholipids into phosphatidic acid, which is thought to be involved in the regulation of lipid movement. ADP ribosylation factor (ARF), a small guanosine triphosphate binding protein, might be required for activity. The vaccinia virus p37 protein, encoded by the F13L gene, is also associated with Golgi membranes and is required for the envelopment and spread of the extracellular enveloped virus (EEV). The vaccinia virus protein K4, encoded by the HindIII K4L gene, remains to be characterized. Sequence analysis indicates that the vaccinia virus proteins K4 and p37 might have evolved from one or more captured eukaryotic genes involved in cellular lipid metabolism. To date, no catalytic activity of PLD3 has been shown. Furthermore, due to the lack of functionally important histidine and lysine residues in the HKD motif, mammalian PLD5 has been characterized as an inactive PLD. The poxvirus p37 proteins may also lack PLD enzymatic activity, since they contain only one partially conserved HKD motif (N-x-K-x(4)-D).	175
197207	cd09108	PLDc_PMFPLD_like_1	Catalytic domain, repeat 1, of phospholipase D from Streptomyces Sp. Strain PMF and similar proteins. Catalytic domain, repeat 1, of phospholipases D (PLD, EC 3.1.4.4) from Streptomyces Sp. Strain PMF (PMFPLD) and similar proteins, which are generally extracellular and bear N-terminal signal sequences. PMFPLD hydrolyzes the terminal phosphodiester bond of phospholipids with the formation of phosphatidic acid and alcohols. Phosphatidic acid is an essential compound involved in signal transduction. It also catalyzes a transphosphatidylation of phospholipids to acceptor alcohols, by which various phospholipids can be synthesized. In contrast to eukaryotic PLDs, PMFPLD has a compact structure, which consists of two catalytic domains, but lacks the regulatory domains. Each catalytic domain contains one copy of the HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) that characterizes the PLD superfamily. Two HKD motifs from two domains form a single active site. Like other PLD enzymes, PMFPLD may utilize a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group. A calcium-dependent PLD from Streptomyces chromofuscus is excluded from this family, since it displays very little sequence homology with other Streptomyces PLDs. Moreover, it does not contain the conserved HKD motif and hydrolyzes the phospholipids via a different mechanism.	210
197208	cd09109	PLDc_PMFPLD_like_2	Catalytic domain, repeat 2, of phospholipase D from Streptomyces Sp. Strain PMF and similar proteins. Catalytic domain, repeat 2, of phospholipases D (PLD, EC 3.1.4.4) from Streptomyces Sp. Strain PMF (PMFPLD) and similar proteins, which are generally extracellular and bear N-terminal signal sequences. PMFPLD hydrolyzes the terminal phosphodiester bond of phospholipids with the formation of phosphatidic acid and alcohols. Phosphatidic acid is an essential compound involved in signal transduction. It also catalyzes a transphosphatidylation of phospholipids to acceptor alcohols, by which various phospholipids can be synthesized. In contrast to eukaryotic PLDs, PMFPLD has a compact structure, which consists of two catalytic domains, but lacks the regulatory domains. Each catalytic domain contains one copy of the HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) that characterizes the PLD superfamily. Two HKD motifs from two domains form a single active site. Like other PLD enzymes, PMFPLD may utilize a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group. A calcium-dependent PLD from Streptomyces chromofuscus is excluded from this family, since it displays very little sequence homology with other Streptomyces PLDs. Moreover, it does not contain the conserved HKD motif and hydrolyzes the phospholipids via a different mechanism.	212
197209	cd09110	PLDc_CLS_1	Catalytic domain, repeat 1, of bacterial cardiolipin synthase and similar proteins. Catalytic domain, repeat 1, of bacterial cardiolipin (CL) synthase and a few homologs found in eukaryotes and archaea. Bacterial CL synthases catalyze the reversible phosphatidyl group transfer between two phosphatidylglycerol molecules to form CL and glycerol. The monomer of bacterial CL synthase consists of two catalytic domains. Each catalytic domain contains one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) that characterizes the phospholipase D (PLD) superfamily. Two HKD motifs from two domains form a single active site involved in phosphatidyl group transfer. Bacterial CL synthases can be stimulated by phosphate and inhibited by CL, the product of the reaction, and by phosphatidate. Phosphate stimulation may be unique to enzymes with CL synthase activity belonging to the PLD superfamily. Like other PLD enzymes, bacterial CL synthases utilize a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group.	154
197210	cd09111	PLDc_ymdC_like_1	Putative catalytic domain, repeat 1, of Escherichia coli uncharacterized protein ymdC and similar proteins. Putative catalytic domain, repeat 1, of Escherichia coli uncharacterized protein ymdC and similar proteins. In Escherichia coli, there are two genes, f413 (ybhO) and o493 (ymdC), which are homologous to gene cls that encodes the Escherichia coli cardiolipin (CL) synthase. The prototype of this subfamily is an uncharacterized protein ymdC specified by the o493 (ymdC) gene. Although the functional characterization of ymdC and similar proteins remains unknown, members of this subfamily show high sequence homology to bacterial CL synthases, which catalyze the reversible phosphatidyl group transfer between two phosphatidylglycerol molecules to form CL and glycerol. Moreover, ymdC and its similar proteins contain two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterize the phospholipase D (PLD) superfamily. The two motifs may be part of the active site and may be involved in phosphatidyl group transfer.	162
197211	cd09112	PLDc_CLS_2	Catalytic domain, repeat 2, of bacterial cardiolipin synthase and similar proteins. This CD corresponds to the catalytic domain repeat 2 of bacterial cardiolipin synthase (CL synthase, EC 2.7.8.-) and a few homologs found in eukaryotes and archaea. Bacterial CL synthases catalyze reversible phosphatidyl group transfer between two phosphatidylglycerol molecules to form cardiolipin (CL) and glycerol. The monomer of bacterial CL synthase consists of two catalytic domains. Each catalytic domain contains one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) that characterizes the phospholipase D (PLD) superfamily. Two HKD motifs from two domains together form a single active site involved in phosphatidyl group transfer. Bacterial CL synthases can be stimulated by phosphate and inhibited by CL, the product of the reaction, and by phosphatidate. Phosphate stimulation may be unique to enzymes with CL synthase activity in the PLD superfamily. Like other PLD enzymes, bacterial CL synthases utilize a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group.	174
197212	cd09113	PLDc_ymdC_like_2	Putative catalytic domain, repeat 2, of Escherichia coli uncharacterized protein ymdC and similar proteins. Putative catalytic domain, repeat 2, of Escherichia coli uncharacterized protein ymdC and similar proteins. In Escherichia coli, there are two genes, f413 (ybhO) and o493 (ymdC), which are homologous to gene cls that encodes the Escherichia coli cardiolipin (CL) synthase. The prototype of this subfamily is an uncharacterized protein ymdC specified by the o493 (ymdC) gene. Although the functional characterization of ymdC and similar proteins remains unknown, members of this subfamily show high sequence homology to bacterial CL synthases, which catalyze the reversible phosphatidyl group transfer between two phosphatidylglycerol molecules to form CL and glycerol. Moreover, ymdC and its similar proteins contain two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterize the phospholipase D (PLD) superfamily. The two motifs may be part of the active site and may be involved in phosphatidyl group transfer.	218
197213	cd09114	PLDc_PPK1_C1	Catalytic C-terminal domain, first repeat, of prokaryotic polyphosphate kinase 1 and similar proteins. Catalytic C-terminal domain, first repeat (C1 domain), of bacterial polyphosphate kinases 1 (Poly P kinase 1 or PPK1, EC 2.7.4.1) and similar proteins. Inorganic polyphosphate (Poly P) plays an important role in bacterial stress responses and stationary-phase survival. PPK1 is the key enzyme responsible for the synthesis of Poly P in bacteria. It can catalyze the reversible conversion of the terminal-phosphate of ATP to Poly P. Therefore, PPK1 is essential for bacterial motility, quorum sensing, biofilm formation, and the production of virulence factors and may serve as an attractive antimicrobial drug target. Dimerization is crucial for the enzymatic activity of PPK1. Each PPK1 monomer includes four structural domains, the N-terminal (N) domain, the head (H) domain, and two closely related C-terminal (C1 and C2) domains. The N domain provides the upper binding interface for the adenine ring of the ATP. The H domain is involved in dimerization, while both the C1 and C2 domains contain residues crucial for catalytic activity. The intersection of the N, C1, and C2 domains forms a structural tunnel in which the PPK catalytic reactions are carried out. In spite of the lack of sequence homology, the C1 and C2 domains of PPK1 are structurally similar to the two repetitive catalytic domains of phospholipase D (PLD). Moreover, some residues in the HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) of the PLD superfamily are spatially conserved in the active site of PPK1. It is possible that the bacterial PPK1 family and the PLD family have a common ancestor and diverged early in evolution. There is a second bacterial-type enzyme, PPK2, which is involved in the synthesis of poly P from GTP or ATP. PPK2 shows no sequence similarity to PPK1 and belongs to a different superfamily.	162
197214	cd09115	PLDc_PPK1_C2	Catalytic C-terminal domain, second repeat, of prokaryotic polyphosphate kinase 1 and similar proteins. Catalytic C-terminal domain, second repeat (C2 domain), of bacterial polyphosphate kinases 1 (Poly P kinase 1 or PPK1, EC 2.7.4.1) and similar proteins. Inorganic polyphosphate (Poly P) plays an important role in bacterial stress responses and stationary-phase survival. PPK1 is the key enzyme responsible for the synthesis of Poly P in bacteria. It can catalyze the reversible conversion of the terminal-phosphate of ATP to Poly P. Therefore, PPK1 is essential for bacterial motility, quorum sensing, biofilm formation, and the production of virulence factors and may serve as an attractive antimicrobial drug target. Dimerization is crucial for the enzymatic activity of PPK1. Each PPK1 monomer includes four structural domains, the N-terminal (N) domain, the head (H) domain, and two closely related C-terminal (C1 and C2) domains. The N domain provides the upper binding interface for the adenine ring of the ATP. The H domain is involved in dimerization, while both the C1 and C2 domains contain residues crucial for catalytic activity. The intersection of the N, C1, and C2 domains forms a structural tunnel in which the PPK catalytic reactions are carried out. In spite of the lack of sequence homology, the C1 and C2 domains of PPK1 are structurally similar to the two repetitive catalytic domains of phospholipase D (PLD). Moreover, some residues in the HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) of the PLD superfamily are spatially conserved in the active site of PPK1. It is possible that the bacterial PPK1 family and the PLD family have a common ancestor and diverged early in evolution. There is a second bacterial-type enzyme, PPK2, which is involved in the synthesis of poly P from GTP or ATP. PPK2 shows no sequence similarity to PPK1 and belongs to a different superfamily.	162
197215	cd09116	PLDc_Nuc_like	Catalytic domain of EDTA-resistant nuclease Nuc, vertebrate phospholipase D6, and similar proteins. Catalytic domain of EDTA-resistant nuclease Nuc, vertebrate phospholipase D6 (PLD6, EC 3.1.4.4), and similar proteins. Nuc is an endonuclease from Salmonella typhimurium and the smallest known member of the PLD superfamily. It cleaves both single- and double-stranded DNA. PLD6 selectively hydrolyzes the terminal phosphodiester bond of phosphatidylcholine (PC), with the formation of phosphatidic acid and alcohols. Phosphatidic acid is an essential compound involved in signal transduction. PLD6 also catalyzes the transphosphatidylation of phospholipids to acceptor alcohols, by which various phospholipids can be synthesized. Both Nuc and PLD6 belong to the phospholipase D (PLD) superfamily. They contain a short conserved sequence motif, the HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue), which is essential for catalysis. PLDs utilize a two-step mechanism to cleave phosphodiester bonds: Upon substrate binding, the bond is first attacked by a histidine residue from one HKD motif to form a covalent phosphohistidine intermediate, which is then hydrolyzed by water with the aid of a second histidine residue from the other HKD motif in the opposite subunit. This subfamily also includes some uncharacterized hypothetical proteins, which have two HKD motifs in a single polypeptide chain.	138
197216	cd09117	PLDc_Bfil_DEXD_like	Catalytic domain of type II restriction endonucleases BfiI and NgoFVII, and uncharacterized proteins with a DEAD domain. Catalytic domain of type II restriction endonucleases BfiI and NgoFVII, uncharacterized type III restriction endonuclease Res subunit, and uncharacterized DNA/RNA helicase superfamily II members. Proteins in this family are found mainly in prokaryotes. They contain one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) in a single polypeptide chain, and have been classified as members of the phospholipase D (PLD, EC 3.1.4.4) superfamily. BfiI consists of two discrete domains with distinct functions: an N-terminal catalytic domain with non-specific nuclease activity and dimerization function that is more closely related to Nuc, an EDTA-resistant nuclease from the phospholipase D (PLD) superfamily; and a C-terminal domain that specifically recognizes its target sequences, 5'-ACTGGG-3'. BfiI forms a functionally active homodimer which has two DNA-binding surfaces located at the C-terminal domains but only one active site, located at the dimer interface between the two N-terminal catalytic domains that contain the two HKD motifs from both subunits. BfiI utilizes a single active site to cut both DNA strands, which represents a novel mechanism for the scission of double-stranded DNA. It uses a histidine residue from the HKD motif in one subunit as the nucleophile for the cleavage of the target phosphodiester bond in both of the anti-parallel DNA strands, while the symmetrically-related histidine residue from the HKD motif of the opposite subunit acts as the proton donor/acceptor during both strand-scission events.	117
197217	cd09118	PLDc_yjhR_C_like	C-terminal domain of Escherichia coli uncharacterized protein yjhR and similar proteins. C-terminal domain of Escherichia coli uncharacterized protein yjhR, encoded by the o338 gene, and similar proteins. Although the biological function of yjhR remains unknown, it shows sequence similarity to the C-terminal portions of superfamily I DNA and RNA helicases, which are ubiquitous enzymes mediating ATP-dependent unwinding of DNA and RNA duplexes, and play essential roles in gene replication and expression. Moreover, the C-termini of yjhR and similar proteins contain one HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) that characterizes the phospholipase D (PLD) superfamily. The PLDc-like domain of yjhR is similar to bacterial endonucleases, Nuc and BfiI, both of which have only one copy of the HKD motif per chain. They function as homodimers, with a single active site at the dimer interface containing the HKD motifs from both subunits. They utilize a two-step mechanism to cleave phosphodiester bonds. Upon substrate binding, the bond is first attacked by a histidine residue from one HKD motif to form a covalent phosphohistidine intermediate, which is then hydrolyzed by water with the aid of a second histidine residue from the other HKD motif in the opposite subunit.	144
197218	cd09119	PLDc_FAM83_N	N-terminal phospholipase D-like domain of proteins from the Family with sequence similarity 83. N-terminal phospholipase D (PLD)-like domain of vertebrate proteins from the Family with sequence similarity 83 (FAM83), which is comprised of 8 members, designated FAM83A through FAM83H. Since the N-terminal PLD-like domain of FAM83 proteins shows only trace similarity to the PLD catalytic domain and lacks the functionally important histidine residue, the FAM83 proteins may share a similar three-dimensional fold with PLD enzymes, but are unlikely to carry PLD activity. Members of the FAM83 family are mostly uncharacterized proteins. FAM83A, also known as tumor antigen BJ-TSA-9, is a novel tumor-specific gene highly expressed in human lung adenocarcinoma. FAM83D, also known as spindle protein CHICA, is a cell-cycle-regulated spindle component which localizes to the mitotic spindle and is both upregulated and phosphorylated during mitosis. The gene encoding protein FAM83H is the first gene involved in the etiology of amelogenesis imperfecta (AI), that encodes a non-secreted protein due to the absence of a signal peptide. Defects in gene FAM83H cause autosomal dominant hypocalcified amelogenesis imperfecta (ADHCAI). FAM83B, FAM83C, FAM83F, and FAM83G are uncharacterized proteins present across vertebrates while FAM83E is an uncharacterized protein found only in mammals.	269
197219	cd09120	PLDc_DNaseII_1	Catalytic domain, repeat 1, of Deoxyribonuclease II and similar proteins. Catalytic domain, repeat 1, of Deoxyribonuclease II (DNase II, EC 3.1.22.1), an endodeoxyribonuclease with ubiquitous tissue distribution. It is essential for accessory apoptotic DNA fragmentation and DNA clearance during development, as well as in tissue regeneration in higher eukaryotes. Unlike the majority of nucleases, DNase II functions optimally at acidic pH in the absence of divalent metal ion cofactors. It hydrolyzes the phosphodiester backbone of DNA by a single strand cleavage mechanism to generate 3'-phosphate termini. The majority of family members contain an N-terminal signal-peptide leader sequence, which is critical for N-glycosylation and DNase II activity. DNase II is a monomeric nuclease that contains two copies of a variant HKD motif, where the aspartic acid residue is not conserved. The HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) characterizes the phospholipase D (PLD, EC 3.1.4.4) superfamily. The catalytic center of DNase II is formed by the two variant HKD motifs from the N- and C-terminal domains in a pseudodimeric way. Members of this family are mainly found in metazoans, and vertebrate proteins have been further classified into DNase II alpha and beta (also known as DNase II-like acid DNase, DLAD) subtypes. A few homologs are found in non-metazoan species, but none are found in fungi, plants or prokaryotes, with the sole exception of Burkholderia pseudomallei. Among those homologs, the Caenorhabditis elegans C07B5.5 ORF encoding NUC-1 apoptotic nuclease, the uncharacterized C. elegans crn-6 (cell death related nuclease) gene encoding protein, and the putative gene CG7780 encoding Drosophila DNase II (dDNase II) have similar cleavage activity and specificity to mammalian DNase II enzymes. They may function like an acid DNase implicated in degrading DNA from apoptotic cells engulfed by macrophages. Plancitoxin I, the major lethal factor from the Acanthaster planci venom, is a unique homolog of mammalian DNase II. It has potent hepatotoxicity and the optimum pH for its activity is 7.2, unlike the optimum acidic pH for mammalian DNase II. Some members of this family contain substitutions of conserved residues found in the putative active site, which suggest that these proteins may have diverged from a canonical DNase II activity and may perform other functions.	141
197220	cd09121	PLDc_DNaseII_2	Catalytic domain, repeat 2, of Deoxyribonuclease II and similar proteins. Catalytic domain, repeat 2, of Deoxyribonuclease II (DNase II, EC 3.1.22.1), an endodeoxyribonuclease with ubiquitous tissue distribution. It is essential for accessory apoptotic DNA fragmentation and DNA clearance during development, as well as in tissue regeneration in higher eukaryotes. Unlike the majority of nucleases, DNase II functions optimally at acidic pH in the absence of divalent metal ion cofactors. It hydrolyzes the phosphodiester backbone of DNA by a single strand cleavage mechanism to generate 3'-phosphate termini. The majority of family members contain an N-terminal signal-peptide leader sequence, which is critical for N-glycosylation and DNase II activity. DNase II is a monomeric nuclease that contains two copies of a variant HKD motif, where the aspartic acid residue is not conserved. The HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) characterizes the phospholipase D (PLD, EC 3.1.4.4) superfamily. The catalytic center of DNase II is formed by the two variant HKD motifs from the N- and C-terminal domains in a pseudodimeric way. Members of this family are mainly found in metazoans, and vertebrate proteins have been further classified into DNase II alpha and beta (also known as DNase II-like acid DNase, DLAD) subtypes. A few homologs are found in non-metazoan species, but none are found in fungi, plants or prokaryotes, with the sole exception of Burkholderia pseudomallei. Among those homologs, the Caenorhabditis elegans C07B5.5 ORF encoding NUC-1 apoptotic nuclease, the uncharacterized C. elegans crn-6 (cell death related nuclease) gene encoding protein, and the putative gene CG7780 encoding Drosophila DNase II (dDNase II) have similar cleavage activity and specificity to mammalian DNase II enzymes. They may function like an acid DNase implicated in degrading DNA from apoptotic cells engulfed by macrophages. Plancitoxin I, the major lethal factor from the Acanthaster planci venom, is a unique homolog of mammalian DNase II. It has potent hepatotoxicity and the optimum pH for its activity is 7.2, unlike the optimum acidic pH for mammalian DNase II. Some members of this family contain substitutions of conserved residues found in the putative active site, which suggest that these proteins may have diverged from the canonical DNase II activity and may perform other functions.	139
197221	cd09122	PLDc_Tdp1_1	Catalytic domain, repeat 1, of Tyrosyl-DNA phosphodiesterase. Catalytic domain, repeat 1, of Tyrosyl-DNA phosphodiesterase (Tdp1, EC 3.1.4.-), which exists in eukaryotes but not in prokaryotes. Tdp1 acts as an important DNA repair enzyme that removes stalled topoisomerase I-DNA complexes by catalyzing the hydrolysis of a phosphodiester bond between a tyrosine side chain and a DNA 3'-phosphate. It is a monomeric protein that contains two copies of a variant HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue), which consists of the highly conserved histidine and lysine residues, but lacks the aspartate residue that is well conserved in other phospholipase D (PLD, EC 3.1.4.4) enzymes. Thus, this family represents a distinct class within the PLD superfamily. Like other PLD enzymes, Tdp1 may utilize a common two-step general acid/base catalytic mechanism, involving a DNA-enzyme intermediate to cleave phosphodiester bonds. A single active site involved in phosphatidyl group transfer would be formed by the two variant HKD motifs from the N- and C-terminal domains in a pseudodimeric way.	145
197222	cd09123	PLDc_Tdp1_2	Catalytic domain, repeat 2, of tyrosyl-DNA phosphodiesterase. Catalytic domain, repeat 2, of Tyrosyl-DNA phosphodiesterase (Tdp1, EC 3.1.4.-), which exists in eukaryotes but not in prokaryotes. Tdp1 acts as an important DNA repair enzyme that removes stalled topoisomerase I-DNA complexes by catalyzing the hydrolysis of a phosphodiester bond between a tyrosine side chain and a DNA 3'-phosphate. It is a monomeric protein that contains two copies of a variant HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue), which consists of the highly conserved histidine and lysine residues, but lacks the aspartate residue that is well conserved in other phospholipase D (PLD, EC 3.1.4.4) enzymes. Thus, this family represents a distinct class within the PLD superfamily. Like other PLD enzymes, Tdp1 may utilize a common two-step general acid/base catalytic mechanism, involving a DNA-enzyme intermediate to cleave phosphodiester bonds. A single active site involved in phosphatidyl group transfer would be formed by the two variant HKD motifs from the N- and C-terminal domains in a pseudodimeric way.	182
197223	cd09124	PLDc_like_TrmB_middle	Middle phospholipase D-like domain of the transcriptional regulator TrmB and similar proteins. Middle phospholipase D (PLD)-like domain of the transcriptional regulator TrmB and similar proteins. TrmB acts as a bifunctional sugar-sensing transcriptional regulator which controls two operons encoding maltose/trehalose and maltodextrin ABC transporters of Pyrococcus furiosus. It functions as a dimer. Full length TrmB includes an N-terminal DNA-binding domain, a C-terminal sugar-binding domain and a middle region that has been named as a PLD-like domain. The middle domain displays homology to PLD enzymes, which contain one or two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) per chain. The HKD motif characterizes the PLD superfamily. Due to the lack of key residues related to PLD activity in the PLD-like domain, members of this subfamily are unlikely to carry PLD activity.	126
197224	cd09126	PLDc_C_DEXD_like	C-terminal putative phospholipase D-like domain of uncharacterized prokaryotic HKD family nucleases fused to DEAD/DEAH box helicases. C-terminal putative phospholipase D (PLD)-like domain of uncharacterized prokaryotic HKD family nucleases fused to a DEAD/DEAH box helicase domain. All members of this subfamily are uncharacterized. In addition to the helicase-like region, members of this family also contain a PLD-like domain in the C-terminal region, which is characterized by a variant HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue). Due to the lack of key residues related to PLD activity in the variant HKD motif, members of this subfamily are most unlikely to carry PLD activity.	126
197225	cd09127	PLDc_unchar1_1	Putative catalytic domain, repeat 1, of uncharacterized phospholipase D-like proteins. Putative catalytic domain, repeat 1, of uncharacterized phospholipase D (PLD, EC 3.1.4.4)-like proteins. PLD enzymes hydrolyze phospholipid phosphodiester bonds to yield phosphatidic acid and a free polar head group. They can also catalyze transphosphatidylation of phospholipids to acceptor alcohols. Members of this subfamily contain two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterize the PLD superfamily. The two motifs may be part of the active site and may be involved in phosphatidyl group transfer.	141
197226	cd09128	PLDc_unchar1_2	Putative catalytic domain, repeat 2, of uncharacterized phospholipase D-like proteins. Putative catalytic domain, repeat 2, of uncharacterized phospholipase D (PLD, EC 3.1.4.4)-like proteins. PLD enzymes hydrolyze phospholipid phosphodiester bonds to yield phosphatidic acid and a free polar head group. They can also catalyze transphosphatidylation of phospholipids to acceptor alcohols. Members of this subfamily contain two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterize the PLD superfamily. The two motifs may be part of the active site and may be involved in phosphatidyl group transfer.	142
197227	cd09129	PLDc_unchar2_1	Putative catalytic domain, repeat 1, of uncharacterized phospholipase D-like proteins. Putative catalytic domain, repeat 1, of uncharacterized phospholipase D (PLD, EC 3.1.4.4)-like proteins. PLD enzymes hydrolyze phospholipid phosphodiester bonds to yield phosphatidic acid and a free polar head group. They can also catalyze transphosphatidylation of phospholipids to acceptor alcohols. Members of this subfamily contain two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterize the PLD superfamily. The two motifs may be part of the active site and may be involved in phosphatidyl group transfer.	196
197228	cd09130	PLDc_unchar2_2	Putative catalytic domain, repeat 2, of uncharacterized phospholipase D-like proteins. Putative catalytic domain, repeat 2, of uncharacterized phospholipase D (PLD, EC 3.1.4.4)-like proteins. PLD enzymes hydrolyze phospholipid phosphodiester bonds to yield phosphatidic acid and a free polar head group. They can also catalyze transphosphatidylation of phospholipids to acceptor alcohols. Members of this subfamily contain two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterize the PLD superfamily. The two motifs may be part of the active site and may be involved in phosphatidyl group transfer.	157
197229	cd09131	PLDc_unchar3	Putative catalytic domain of uncharacterized phospholipase D-like proteins. Putative catalytic domain of uncharacterized phospholipase D (PLD, EC 3.1.4.4)-like proteins. Members of this subfamily contain one copy of the HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) that characterizes the PLD superfamily.	143
197230	cd09132	PLDc_unchar4	Putative catalytic domain of uncharacterized phospholipase D-like proteins. Putative catalytic domain of uncharacterized phospholipase D (PLD, EC 3.1.4.4)-like proteins. Members of this subfamily contain one copy of the HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) that characterizes the PLD superfamily.	122
197231	cd09133	PLDc_unchar5	Putative catalytic domain of uncharacterized hypothetical proteins with one or two copies of the HKD motif. Putative catalytic domain of uncharacterized hypothetical proteins with similarity to phospholipase D (PLD, EC 3.1.4.4). PLD enzymes hydrolyze phospholipid phosphodiester bonds to yield phosphatidic acid and a free polar head group. They can also catalyze transphosphatidylation of phospholipids to acceptor alcohols. Members of this subfamily contain one or two copies of the HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) that characterizes the PLD superfamily.	127
197232	cd09134	PLDc_PSS_G_neg_1	Catalytic domain, repeat 1, of phosphatidylserine synthases from gram-negative bacteria. Catalytic domain, repeat 1, of phosphatidylserine synthases (PSSs) from gram-negative bacteria. There are two subclasses of PSS enzymes in bacteria: subclass I of gram-negative bacteria and subclass II of gram-positive bacteria. It is common that PSSs in gram-positive bacteria and yeast are tight membrane-associated enzymes. By contrast, the gram-negative bacterial PSSs, such as Escherichia coli PSS, are commonly bound to the ribosomes. They are peripheral membrane proteins that can interact with the surface of the inner membrane by binding to the lipid substrate (CDP-diacylglycerol) and the lipid product (phosphatidylserine). The prototypical member of this subfamily is Escherichia coli PSS (also called CDP-diacylglycerol-L-serine O-phosphatidyltransferase, EC 2.7.8.8), which catalyzes the exchange reactions between CMP and CDP-diacylglycerol, and between serine and phosphatidylserine. The phosphatidylserine is then decarboxylated by phosphatidylserine decarboxylase to yield phosphatidylethanolamine, the major phospholipid in Escherichia coli. It also catalyzes the hydrolysis of CDP-diacylglycerol to form phosphatidic acid with the release of CMP. PSS may utilize a ping-pong mechanism involving a phosphatidyl-enzyme intermediate, which is distinct from those of gram-positive bacterial phosphatidylserine synthases. Moreover, all members in this subfamily have two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterize the phospholipase D (PLD) superfamily. The two motifs constitute an active site for the formation of a covalent substrate-enzyme intermediate.	173
197233	cd09135	PLDc_PGS1_euk_1	Catalytic domain, repeat 1, of eukaryotic phosphatidylglycerophosphate synthases. Catalytic domain, repeat 1, of eukaryotic phosphatidylglycerophosphate (PGP) synthases, also called CDP-diacylglycerol--glycerol-3-phosphate 3-phosphatidyltransferase (EC 2.7.8.5). Eukaryotic PGP synthases are different from and unrelated to prokaryotic PGP synthases and yeast phosphatidylserine synthase. They catalyze the synthesis of PGP from CDP-diacylglycerol and sn-glycerol 3-phosphate, the committed and rate-limiting step in the biosynthesis of cardiolipin (CL), which is an essential component of many mitochondrial functions in eukaryotes. Members in this subfamily all have two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterize the phospholipase D (PLD) superfamily. They may utilize a common two-step ping-pong catalytic mechanism involving a substrate-enzyme intermediate to cleave phosphodiester bonds. The two motifs are suggested to constitute the active site involved in the phosphatidyl group transfer.	170
197234	cd09136	PLDc_PSS_G_neg_2	Catalytic domain, repeat 2, of phosphatidylserine synthases from gram-negative bacteria. Catalytic domain, repeat 2, of phosphatidylserine synthases (PSSs) from gram-negative bacteria. There are two subclasses of PSS enzymes in bacteria: subclass I of gram-negative bacteria and subclass II of gram-positive bacteria. It is common that PSSs in gram-positive bacteria and yeast are tight membrane-associated enzymes. By contrast, the gram-negative bacterial PSSs, such as Escherichia coli PSS, are commonly bound to the ribosomes. They are peripheral membrane proteins that can interact with the surface of the inner membrane by binding to the lipid substrate (CDP-diacylglycerol) and the lipid product (phosphatidylserine). The prototypical member of this subfamily is Escherichia coli PSS (also called CDP-diacylglycerol-L-serine O-phosphatidyltransferase, EC 2.7.8.8), which catalyzes the exchange reactions between CMP and CDP-diacylglycerol, and between serine and phosphatidylserine. The phosphatidylserine is then decarboxylated by phosphatidylserine decarboxylase to yield phosphatidylethanolamine, the major phospholipid in Escherichia coli. It also catalyzes the hydrolysis of CDP-diacylglycerol to form phosphatidic acid with the release of CMP. PSS may utilize a ping-pong mechanism involving a phosphatidyl-enzyme intermediate, which is distinct from those of gram-positive bacterial phosphatidylserine synthases. Moreover, all members in this subfamily have two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterize the phospholipase D (PLD) superfamily. The two motifs constitute an active site for the formation of a covalent substrate-enzyme intermediate.	215
197235	cd09137	PLDc_PGS1_euk_2	Catalytic domain, repeat 2, of eukaryotic phosphatidylglycerophosphate synthases. Catalytic domain, repeat 2, of eukaryotic phosphatidylglycerophosphate (PGP) synthases, also called CDP-diacylglycerol--glycerol-3-phosphate 3-phosphatidyltransferase (EC 2.7.8.5). Eukaryotic PGP synthases are different from and unrelated to prokaryotic PGP synthases and yeast phosphatidylserine synthase. They catalyze the synthesis of PGP from CDP-diacylglycerol and sn-glycerol 3-phosphate, the committed and rate-limiting step in the biosynthesis of cardiolipin (CL), which is an essential component of many mitochondrial functions in eukaryotes. Members in this subfamily all have two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterize the phospholipase D (PLD) superfamily. They may utilize a common two-step ping-pong catalytic mechanism involving a substrate-enzyme intermediate to cleave phosphodiester bonds. The two motifs are suggested to constitute the active site involved in the phosphatidyl group transfer.	186
197236	cd09138	PLDc_vPLD1_2_yPLD_like_1	Catalytic domain, repeat 1, of vertebrate phospholipases, PLD1 and PLD2, yeast PLDs, and similar proteins. Catalytic domain, repeat 1, of vertebrate phospholipases D (PLD1 and PLD2), yeast phospholipase D (PLD SPO14/PLD1), and other similar eukaryotic proteins. These PLD enzymes play a pivotal role in transmembrane signaling and cellular regulation. They hydrolyze the terminal phosphodiester bond of phospholipids resulting in the formation of phosphatidic acid and alcohols. Phosphatidic acid is an essential compound involved in signal transduction. PLDs also catalyze the transphosphatidylation of phospholipids to acceptor alcohols, by which various phospholipids can be synthesized. The vertebrate PLD1 and PLD2 are membrane associated phosphatidylinositol 4,5-bisphosphate (PIP2)-dependent enzymes that selectively hydrolyze phosphatidylcholine (PC). Protein cofactors and calcium may be required for their activation. Yeast SPO14/PLD1 is a calcium-independent PLD, which needs PIP2 for its activity. Instead of the regulatory calcium-dependent phospholipid-binding C2 domain in plants, most mammalian and yeast PLDs have adjacent Phox (PX) and the Pleckstrin homology (PH) domains at the N-terminus, which have been shown to mediate membrane targeting of the protein and are closely linked to polyphosphoinositide signaling. The PX and PH domains are also present in zeta-type PLD from Arabidopsis, which is more closely related to vertebrate PLDs than to other plant PLD types. In addition, this subfamily also includes some related proteins which have either PX-like or PH domains in their N-termini. Like other members of the PLD superfamily, the monomer of mammalian and yeast PLDs consists of two catalytic domains, each containing one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue). Two HKD motifs from the two domains form a single active site. These PLDs utilize a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group.	146
197237	cd09139	PLDc_pPLD_like_1	Catalytic domain, repeat 1, of plant phospholipase D and similar proteins. Catalytic domain, repeat 1, of plant phospholipase D (PLD, EC 3.1.4.4) and similar proteins. Plant PLDs have broad substrate specificity and can hydrolyze the terminal phosphodiester bond of several common membrane phospholipids such as phosphatidylcholine (PC), phosphatidylethanolamine (PE), phosphatidylglycerol (PG), and phosphatidylserine (PS), with the formation of phosphatidic acid and alcohols. Phosphatidic acid is an essential compound involved in signal transduction. PLDs also catalyze the transphosphatidylation of phospholipids to acceptor alcohols, by which various phospholipids can be synthesized. Most plant PLDs possess a regulatory calcium-dependent phospholipid-binding C2 domain in the N-terminus and require calcium for activity, which is unique to plant PLDs and is not present in animal or fungal PLDs. Like other PLD enzymes, the monomer of plant PLDs consists of two catalytic domains, each of which contains one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue). Two HKD motifs from two domains form a single active site. Plant PLDs may utilize a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group. This subfamily includes two types of plant PLDs, alpha-type and beta-type PLDs, which are derived from different gene products and distinctly regulated. The zeta-type PLD from Arabidopsis is not included in this subfamily.	176
197238	cd09140	PLDc_vPLD1_2_like_bac_1	Catalytic domain, repeat 1, of uncharacterized bacterial proteins with similarity to vertebrate phospholipases, PLD1 and PLD2. Catalytic domain, repeat 1, of uncharacterized bacterial counterparts of vertebrate, yeast and plant phospholipase D (PLD, EC 3.1.4.4). PLDs hydrolyze the terminal phosphodiester bond of phospholipids with the formation of phosphatidic acid and alcohols. They also catalyze the transphosphatidylation of phospholipids to acceptor alcohols, by which various phospholipids can be synthesized. Instead of the regulatory C2 (calcium-activated lipid binding) domain in plants and the adjacent Phox (PX) and the Pleckstrin homology (PH) N-terminal domains in most mammalian and yeast PLDs, many members in this subfamily contain a SNARE associated C-terminal domain, whose functional role is unclear.  Like other PLD enzymes, members in this subfamily contain two copies of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue), that may play an important role in the catalysis.	146
197239	cd09141	PLDc_vPLD1_2_yPLD_like_2	Catalytic domain, repeat 2, of vertebrate phospholipases, PLD1 and PLD2, yeast PLDs, and similar proteins. Catalytic domain, repeat 2, of vertebrate phospholipases D (PLD1 and PLD2), yeast phospholipase D (PLD SPO14/PLD1), and other similar eukaryotic proteins. These PLD enzymes play a pivotal role in transmembrane signaling and cellular regulation. They hydrolyze the terminal phosphodiester bond of phospholipids resulting in the formation of phosphatidic acid and alcohols. Phosphatidic acid is an essential compound involved in signal transduction. PLDs also catalyze the transphosphatidylation of phospholipids to acceptor alcohols, by which various phospholipids can be synthesized. The vertebrate PLD1 and PLD2 are membrane associated phosphatidylinositol 4,5-bisphosphate (PIP2)-dependent enzymes that selectively hydrolyze phosphatidylcholine (PC). Protein cofactors and calcium may be required for their activation. Yeast SPO14/PLD1 is a calcium-independent PLD, which needs PIP2 for its activity. Instead of the regulatory calcium-dependent phospholipid-binding C2 domain in plants, most mammalian and yeast PLDs have adjacent Phox (PX) and the Pleckstrin homology (PH) domains at the N-terminus, which have been shown to mediate membrane targeting of the protein and are closely linked to polyphosphoinositide signaling. The PX and PH domains are also present in zeta-type PLD from Arabidopsis, which is more closely related to vertebrate PLDs than to other plant PLD types. In addition, this subfamily also includes some related proteins which have either PX-like or PH domains in their N-termini. Like other members of the PLD superfamily, the monomer of mammalian and yeast PLDs consists of two catalytic domains, each containing one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue). Two HKD motifs from the two domains form a single active site. These PLDs utilize a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group.	183
197240	cd09142	PLDc_pPLD_like_2	Catalytic domain, repeat 2, of plant phospholipase D and similar proteins. Catalytic domain, repeat 2, of plant phospholipase D (PLD, EC 3.1.4.4) and similar proteins. Plant PLDs have broad substrate specificity and can hydrolyze the terminal phosphodiester bond of several common membrane phospholipids such as phosphatidylcholine (PC), phosphatidylethanolamine (PE), phosphatidylglycerol (PG), and phosphatidylserine (PS), with the formation of phosphatidic acid and alcohols. Phosphatidic acid is an essential compound involved in signal transduction. PLDs also catalyze the transphosphatidylation of phospholipids to acceptor alcohols, by which various phospholipids can be synthesized. Most plant PLDs possess a regulatory calcium-dependent phospholipid-binding C2 domain in the N-terminus and require calcium for activity, which is unique to plant PLDs and is not present in animal or fungal PLDs. Like other PLD enzymes, the monomer of plant PLDs consists of two catalytic domains, each of which contains one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue). Two HKD motifs from two domains form a single active site. Plant PLDs may utilize a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group. This subfamily includes two types of plant PLDs, alpha-type and beta-type PLDs, which are derived from different gene products and distinctly regulated. The zeta-type PLD from Arabidopsis is not included in this subfamily.	208
197241	cd09143	PLDc_vPLD1_2_like_bac_2	Catalytic domain, repeat 2, of uncharacterized bacterial proteins with similarity to vertebrate phospholipases, PLD1 and PLD2. Catalytic domain, repeat 2, of uncharacterized bacterial counterparts of vertebrate, yeast and plant phospholipase D (PLD, EC 3.1.4.4). PLDs hydrolyze the terminal phosphodiester bond of phospholipids with the formation of phosphatidic acid and alcohols. They also catalyze the transphosphatidylation of phospholipids to acceptor alcohols, by which various phospholipids can be synthesized. Instead of the regulatory C2 (calcium-activated lipid binding) domain in plants and the adjacent Phox (PX) and the Pleckstrin homology (PH) N-terminal domains in most mammalian and yeast PLDs, many members in this subfamily contain a SNARE associated C-terminal domain, whose functional role is unclear.  Like other PLD enzymes, members in this subfamily contain two copies of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue), that may play an important role in the catalysis.	142
197242	cd09144	PLDc_vPLD3_1	Putative catalytic domain, repeat 1, of vertebrate phospholipase PLD3. Putative catalytic domain, repeat 1, of phospholipase D3 (PLD3, EC 3.1.4.4). The human protein is also known as Hu-K4 or HUK4 and it was identified as a human homolog of the vaccinia virus protein K4, which is encoded by the HindIII K4L gene. PLD3 is found in many human organs with highest expression levels found in the central nervous system. Due to the presence of two copies of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue), PLD3 has been assigned to the PLD superfamily although no catalytic activity has been detected experimentally. PLD3 is a membrane-bound protein that colocalizes with protein disulfide isomerase, an endoplasmic reticulum (ER) protein. Like other homologs of protein K4, PLD3 might alter the lipid content of associated membranes by selectively hydrolyzing phosphatidylcholine (PC) into the corresponding phosphatidic acid, which is thought to be involved in the regulation of lipid movement.	172
197243	cd09145	PLDc_vPLD4_1	Putative catalytic domain, repeat 1, of vertebrate phospholipase PLD4. Putative catalytic domain, repeat 1, of vertebrate phospholipases D4 (PLD4, EC 3.1.4.4), homologs of the vaccinia virus protein K4 which is encoded by the HindIII K4L gene. Due to the presence of two copies of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue), PLD4 has been assigned to the PLD superfamily although no catalytic activity has been detected to date. Unlike PLD1 and PLD2, PLD4 does not contain Phox (PX) and Pleckstrin homology (PH) domains but has a putative transmembrane domain. Like other vertebrate homologs of protein K4, PLD4 might be associated with Golgi membranes and alter their lipid content by selectively hydrolyzing phosphatidylcholine (PC) into the corresponding phosphatidic acid, which is thought to be involved in the regulation of lipid movement.	170
197244	cd09146	PLDc_vPLD5_1	Putative catalytic domain, repeat 1, of inactive vertebrate phospholipase PLD5. Putative catalytic domain, repeat 1, of inactive vertebrate phospholipases D5 (PLD5, EC 3.1.4.4), homologs of the vaccinia virus protein K4 encoded by the HindIII K4L gene. Vertebrate PLD5 has been assigned to the PLD superfamily, since it shows high sequence similarity to other human homologs of protein K4, which contain two copies of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue). However, due to the lack of functionally important histidine and lysine residues in the HKD motif, vertebrate PLD5 has been characterized as an inactive PLD.	163
197245	cd09147	PLDc_vPLD3_2	Putative catalytic domain, repeat 2, of vertebrate phospholipase PLD3. Putative catalytic domain, repeat 2, of phospholipase D3 (PLD3, EC 3.1.4.4). The human protein is also known as Hu-K4 or HUK4 and it was identified as a human homolog of the vaccinia virus protein K4, which is encoded by the HindIII K4L gene. PLD3 is found in many human organs with highest expression levels found in the central nervous system. Due to the presence of two copies of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue), PLD3 has been assigned to the PLD superfamily although no catalytic activity has been detected experimentally. PLD3 is a membrane-bound protein that colocalizes with protein disulfide isomerase, an endoplasmic reticulum (ER) protein. Like other homologs of protein K4, PLD3 might alter the lipid content of associated membranes by selectively hydrolyzing phosphatidylcholine (PC) into the corresponding phosphatidic acid, which is thought to be involved in the regulation of lipid movement.	186
197246	cd09148	PLDc_vPLD4_2	Putative catalytic domain, repeat 2, of vertebrate phospholipase PLD4. Putative catalytic domain, repeat 2, of vertebrate phospholipases D4 (PLD4, EC 3.1.4.4), homologs of the vaccinia virus protein K4 which is encoded by the HindIII K4L gene. Due to the presence of two copies of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue), PLD4 has been assigned to the PLD superfamily although no catalytic activity has been detected to date. Unlike PLD1 and PLD2, PLD4 does not contain Phox (PX) and Pleckstrin homology (PH) domains but has a putative transmembrane domain. Like other vertebrate homologs of protein K4, PLD4 might be associated with Golgi membranes and alter their lipid content by selectively hydrolyzing phosphatidylcholine (PC) into the corresponding phosphatidic acid, which is thought to be involved in the regulation of lipid movement.	187
197247	cd09149	PLDc_vPLD5_2	Putative catalytic domain, repeat 2, of inactive vertebrate phospholipase PLD5. Putative catalytic domain, repeat 2, of inactive vertebrate phospholipases D5 (PLD5, EC 3.1.4.4), homologs of the vaccinia virus protein K4 encoded by the HindIII K4L gene. Vertebrate PLD5 has been assigned to the PLD superfamily, since it shows high sequence similarity to other human homologs of protein K4, which contain two copies of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue). However, due to the lack of functionally important histidine and lysine residues in the HKD motif, vertebrate PLD5 has been characterized as an inactive PLD.	188
197248	cd09150	PLDc_Ymt_1	Putative catalytic domain, repeat 1, of Yersinia pestis murine toxin-like proteins. Putative catalytic domain, repeat 1, of Yersinia pestis murine toxin (Ymt), a plasmid-encoded phospholipase D (PLD, EC 3.1.4.4), and similar proteins. Ymt is important in order for Yersinia pestis to survive and spread. It is toxic to mice and rats but not to other animals. It is not a conventional secreted exotoxin, but a cytoplasmic protein that is released upon bacterial lysis. Ymt may be active as a dimer. The monomeric Ymt consists of two catalytic domains, each of which contains one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue). Two HKD motifs from two domains form a single active site. Ymt has PLD-like activity and has been classified into the PLD superfamily. It hydrolyzes the terminal phosphodiester bond in several phospholipids, with preference for phosphatidylethanolamine (PE) over phosphatidylcholine (PC) and phosphatidylserine (PS). Like other PLD enzymes, Ymt may utilize a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group. In terms of sequence similarity, Ymt is closely related to Streptomyces PLDs.	215
197249	cd09151	PLDc_Ymt_2	Putative catalytic domain, repeat 2, of Yersinia pestis murine toxin-like proteins. Putative catalytic domain, repeat 2, of Yersinia pestis murine toxin (Ymt), a plasmid-encoded phospholipase D (PLD, EC 3.1.4.4), and similar proteins. Ymt is important in order for Yersinia pestis to survive and spread. It is toxic to mice and rats but not to other animals. It is not a conventional secreted exotoxin, but a cytoplasmic protein that is released upon bacterial lysis. Ymt may be active as a dimer. The monomeric Ymt consists of two catalytic domains, each of which contains one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue). Two HKD motifs from two domains form a single active site. Ymt has PLD-like activity and has been classified into the PLD superfamily. It hydrolyzes the terminal phosphodiester bond in several phospholipids, with preference for phosphatidylethanolamine (PE) over phosphatidylcholine (PC) and phosphatidylserine (PS). Like other PLD enzymes, Ymt may utilize a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group. In terms of sequence similarity, Ymt is closely related to Streptomyces PLDs.	264
197250	cd09152	PLDc_EcCLS_like_1	Catalytic domain, repeat 1, of Escherichia coli cardiolipin synthase and similar proteins. Catalytic domain, repeat 1, of Escherichia coli cardiolipin (CL) synthase and similar proteins. Escherichia coli CL synthase (EcCLS), specified by the cls gene, is the prototype of this family. EcCLS is a multi-pass membrane protein that catalyzes reversible phosphatidyl group transfer between two phosphatidylglycerol molecules to form cardiolipin (CL) and glycerol. The monomer of EcCLS consists of two catalytic domains. Each catalytic domain contains one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) that characterizes the phospholipase D (PLD) superfamily. Two HKD motifs from two domains form a single active site involved in phosphatidyl group transfer. EcCLS can be stimulated by phosphate and inhibited by CL, the product of the reaction, and by phosphatidate. Phosphate stimulation may be unique to enzymes with CL synthase activity belonging to the PLD superfamily. Like other PLD enzymes, EcCLS utilizes a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group.	163
197251	cd09154	PLDc_SMU_988_like_1	Putative catalytic domain, repeat 1, of Streptococcus mutans uncharacterized protein SMU_988 and similar proteins. Putative catalytic domain, repeat 1, of Streptococcus mutans uncharacterized protein SMU_988 and similar proteins. Although SMU_988 and similar proteins have not been functionally characterized, members in this subfamily show high sequence homology to bacterial cardiolipin (CL) synthases, which catalyze the reversible phosphatidyl group transfer between two phosphatidylglycerol molecules to form CL and glycerol. Members of this subfamily contain two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterize the phospholipase D (PLD) superfamily. The two motifs may be part of the active site and may be involved in phosphatidyl group transfer.	155
197252	cd09155	PLDc_PaCLS_like_1	Putative catalytic domain, repeat 1, of Pseudomonas aeruginosa cardiolipin synthase and similar proteins. Putative catalytic domain, repeat 1, of Pseudomonas aeruginosa cardiolipin (CL) synthase (PaCLS) and similar proteins. Although PaCLS and similar proteins have not been functionally characterized, members in this subfamily show high sequence homology to bacterial CL synthases, which catalyze the reversible phosphatidyl group transfer between two phosphatidylglycerol molecules to form CL and glycerol. Moreover, PaCLS and other members of this subfamily contain two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterize the phospholipase D (PLD) superfamily. The two motifs may be part of the active site and may be involved in phosphatidyl group transfer.	156
197253	cd09156	PLDc_CLS_unchar1_1	Putative catalytic domain, repeat 1, of uncharacterized proteins similar to bacterial cardiolipin synthase. Putative catalytic domain, repeat 1, of uncharacterized proteins similar to bacterial cardiolipin (CL) synthases, which catalyze the reversible phosphatidyl group transfer between two phosphatidylglycerol molecules to form CL and glycerol. Members of this subfamily contain two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterize the phospholipase D (PLD) superfamily. The two motifs may be part of the active site and may be involved in phosphatidyl group transfer.	154
197254	cd09157	PLDc_CLS_unchar2_1	Putative catalytic domain, repeat 1, of uncharacterized proteins similar to bacterial cardiolipin synthase. Putative catalytic domain, repeat 1, of uncharacterized proteins similar to bacterial cardiolipin (CL) synthases, which catalyze the reversible phosphatidyl group transfer between two phosphatidylglycerol molecules to form CL and glycerol. Members of this subfamily contain two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterize the phospholipase D (PLD) superfamily. The two motifs may be part of the active site and may be involved in phosphatidyl group transfer.	155
197255	cd09158	PLDc_EcCLS_like_2	Catalytic domain, repeat 2, of Escherichia coli cardiolipin synthase and similar proteins. Catalytic domain, repeat 2, of Escherichia coli cardiolipin (CL) synthase and similar proteins. Escherichia coli CL synthase (EcCLS), specified by the cls gene, is the prototype of this family. EcCLS is a multi-pass membrane protein that catalyzes reversible phosphatidyl group transfer between two phosphatidylglycerol molecules to form cardiolipin (CL) and glycerol. The monomer of EcCLS consists of two catalytic domains. Each catalytic domain contains one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) that characterizes the phospholipase D (PLD) superfamily. Two HKD motifs from two domains form a single active site involved in phosphatidyl group transfer. EcCLS can be stimulated by phosphate and inhibited by CL, the product of the reaction, and by phosphatidate. Phosphate stimulation may be unique to enzymes with CL synthase activity belonging to the PLD superfamily. Like other PLD enzymes, EcCLS utilizes a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group.	174
197256	cd09159	PLDc_ybhO_like_2	Catalytic domain, repeat 2, of Escherichia coli cardiolipin synthase ybhO and similar proteins. Catalytic domain, repeat 2, of Escherichia coli cardiolipin (CL) synthase ybhO and similar proteins. In Escherichia coli, there are two genes, f413 (ybhO) and o493 (ymdC), which are homologous to gene cls that encodes the Escherichia coli CL synthase. The prototype of this subfamily is Escherichia coli CL synthase ybhO specified by the f413 (ybhO) gene. ybhO is a membrane-bound protein that catalyzes the formation of cardiolipin (CL) by transferring phosphatidyl group between two phosphatidylglycerol molecules. It can also catalyze phosphatidyl group transfer to water to form phosphatidate. In contrast to the Escherichia coli CL synthase encoded by the cls gene (EcCLS), ybhO does not hydrolyze CL. Moreover, ybhO lacks an N-terminal segment encoded by Escherichia coli cls, which makes ybhO easy to denature. The monomer of ybhO consists of two catalytic domains. Each catalytic domain contains one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) that characterizes the phospholipase D (PLD) superfamily. Two HKD motifs from two domains form a single active site involved in phosphatidyl group transfer. ybhO can be stimulated by phosphate and inhibited by CL, the product of the reaction, and by phosphatidate. Phosphate stimulation may be unique to enzymes with CL synthase activity belonging to the PLD superfamily.	170
197257	cd09160	PLDc_SMU_988_like_2	Putative catalytic domain, repeat 2, of Streptococcus mutans uncharacterized protein SMU_988 and similar proteins. Putative catalytic domain, repeat 2, of Streptococcus mutans uncharacterized protein SMU_988 and similar proteins. Although SMU_988 and similar proteins have not been functionally characterized, members in this subfamily show high sequence homology to bacterial cardiolipin (CL) synthases, which catalyze the reversible phosphatidyl group transfer between two phosphatidylglycerol molecules to form CL and glycerol. Members of this subfamily contain two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterize the phospholipase D (PLD) superfamily. The two motifs may be part of the active site and may be involved in phosphatidyl group transfer.	176
197258	cd09161	PLDc_PaCLS_like_2	Putative catalytic domain, repeat 2, of Pseudomonas aeruginosa cardiolipin synthase and similar proteins. Putative catalytic domain, repeat 2, of Pseudomonas aeruginosa cardiolipin (CL) synthase (PaCLS) and similar proteins. Although PaCLS and similar proteins have not been functionally characterized, members in this subfamily show high sequence homology to bacterial CL synthases, which catalyze the reversible phosphatidyl group transfer between two phosphatidylglycerol molecules to form CL and glycerol. Moreover, PaCLS and other members of this subfamily contain two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterize the phospholipase D (PLD) superfamily. The two motifs may be part of the active site and may be involved in phosphatidyl group transfer.	176
197259	cd09162	PLDc_CLS_unchar1_2	Putative catalytic domain, repeat 2, of uncharacterized proteins similar to bacterial cardiolipin synthase. Putative catalytic domain, repeat 2, of uncharacterized proteins similar to bacterial cardiolipin (CL) synthases, which catalyze the reversible phosphatidyl group transfer between two phosphatidylglycerol molecules to form CL and glycerol. Members of this subfamily contain two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterize the phospholipase D (PLD) superfamily. The two motifs may be part of the active site and may be involved in phosphatidyl group transfer.	172
197260	cd09163	PLDc_CLS_unchar2_2	Putative catalytic domain, repeat 2, of uncharacterized proteins similar to bacterial cardiolipin synthase. Putative catalytic domain, repeat 2, of uncharacterized proteins similar to bacterial cardiolipin (CL) synthases, which catalyze the reversible phosphatidyl group transfer between two phosphatidylglycerol molecules to form CL and glycerol. Members of this subfamily contain two HKD motifs (H-x-K-x(4)-D, where x represents any amino acid residue) that characterize the phospholipase D (PLD) superfamily. The two motifs may be part of the active site and may be involved in phosphatidyl group transfer.	176
197261	cd09164	PLDc_EcPPK1_C1_like	Catalytic C-terminal domain, first repeat, of Escherichia coli polyphosphate kinase 1 and similar proteins. Catalytic C-terminal domain, first repeat (C1 domain), of Escherichia coli polyphosphate kinase 1 (Poly P kinase 1 or PPK1, EC 2.7.4.1) and similar proteins. Inorganic polyphosphate (Poly P) plays an important role in bacterial stress responses and stationary-phase survival. PPK1 is the key enzyme responsible for the synthesis of Poly P in bacteria. It can catalyze the reversible conversion of the terminal-phosphate of ATP to Poly P. Therefore, PPK1 is essential for bacterial motility, quorum sensing, biofilm formation, and the production of virulence factors and may serve as an attractive antimicrobial drug target. Dimerization is crucial for the enzymatic activity of PPK1. The prototype of this subfamily is Escherichia coli polyphosphate kinase (EcPPK), which forms a homotetramer in solution, and becomes a homodimer upon the binding of AMPPNP, a non-hydrolysable ATP analogue. Each EcPPK monomer includes four structural domains, the N-terminal (N) domain, the head (H) domain, and two closely related C-terminal (C1 and C2) domains. The N domain provides the upper binding interface for the adenine ring of the ATP. The H domain is involved in dimerization, while both the C1 and C2 domains contain residues crucial for catalytic activity. The intersection of the N, C1, and C2 domains forms a structural tunnel in which the PPK catalytic reactions are carried out. In spite of the lack of sequence homology, the C1 and C2 domains of EcPPK are structurally similar to the two repetitive catalytic domains of phospholipase D (PLD). Moreover, some residues in the HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) of the PLD superfamily are spatially conserved in the active site of EcPPK. It is possible that the bacterial PPK1 family and the PLD family have a common ancestor and diverged early in evolution.	162
197262	cd09165	PLDc_PaPPK1_C1_like	Catalytic C-terminal domain, first repeat, of Pseudomonas aeruginosa polyphosphate kinase 1 and similar proteins. Catalytic C-terminal domain, first repeat (C1 domain), of polyphosphate kinase (Poly P kinase 1 or PPK1, EC 2.7.4.1) from Pseudomonas aeruginosa (PaPPK1), Dictyostelium discoideum (DdPPK1), and other similar proteins. Inorganic polyphosphate (Poly P) plays an important role in bacterial stress responses and stationary-phase survival. PaPPK1 is the key enzyme responsible for the synthesis of Poly P in Pseudomonas aeruginosa. It can catalyze the reversible conversion of the terminal-phosphate of ATP to Poly P. PaPPK1 shows high sequence homology to Escherichia coli polyphosphate kinase (EcPPK), which contains four structural domains per chain: the N-terminal (N) domain, the head (H) domain, and two closely related C-terminal (C1 and C2) domains. The N domain provides the upper binding interface for the adenine ring of the ATP. The H domain is involved in dimerization, while both the C1 and C2 domains contain residues crucial for catalytic activity. The intersection of the N, C1, and C2 domains forms a structural tunnel in which the PPK catalytic reactions are carried out. The polyphosphate kinase from Dictyostelium discoideum (DdPPK1) shares similar structural features with EcPPK1 in the ATP-binding pocket and poly P tunnel, but has a unique N-terminal extension that may be responsible for its enzymatic activity, cellular localization, and physiological functions. In spite of the lack of sequence homology, the C1 and C2 domains of the family members are structurally similar to the two repetitive catalytic domains of phospholipase D (PLD). Moreover, some residues in the HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) of the PLD superfamily are spatially conserved in the active site of PPK1. It is possible that the bacterial PPK1 family and the PLD family have a common ancestor and diverged early in evolution. In some bacteria, such as Pseudomonas aeruginosa, a second enzyme, PPK2, which is involved in the alternative pathway of polyphosphate synthesis, has been found. It can catalyze the synthesis of poly P from GTP or ATP, with a preference for Mn2+ over Mg2+. PPK2 shows no sequence similarity to PPK1 and belongs to a different superfamily.	164
197263	cd09166	PLDc_PPK1_C1_unchar	Catalytic C-terminal domain, first repeat, of uncharacterized prokaryotic polyphosphate kinases. Catalytic C-terminal domain, first repeat (C1 domain), of a group of uncharacterized prokaryotic polyphosphate kinases (Poly P kinase 1 or PPK1, EC 2.7.4.1). Inorganic polyphosphate (Poly P) plays an important role in bacterial stress responses and stationary-phase survival. PPK1 is the key enzyme responsible for the synthesis of Poly P in bacteria. It can catalyze the reversible conversion of the terminal-phosphate of ATP to Poly P. Therefore, PPK1 is essential for bacterial motility, quorum sensing, biofilm formation, and the production of virulence factors and may serve as an attractive antimicrobial drug target. Dimerization is crucial for the enzymatic activity of PPK1. Each PPK1 monomer includes four structural domains, the N-terminal (N) domain, the head (H) domain, and two closely related C-terminal (C1 and C2) domains. The N domain provides the upper binding interface for the adenine ring of the ATP. The H domain is involved in dimerization, while both the C1 and C2 domains contain residues crucial for catalytic activity. The intersection of the N, C1, and C2 domains forms a structural tunnel in which the PPK catalytic reactions are carried out. In spite of the lack of sequence homology, the C1 and C2 domains of PPK1 are structurally similar to the two repetitive catalytic domains of phospholipase D (PLD). Moreover, some residues in the HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) of the PLD superfamily are spatially conserved in the active site of PPK1. It is possible that the bacterial PPK1 family and the PLD family have a common ancestor and diverged early in evolution.	162
197264	cd09167	PLDc_EcPPK1_C2_like	Catalytic C-terminal domain, second repeat, of Escherichia coli polyphosphate kinase 1 and similar proteins. Catalytic C-terminal domain, second repeat (C2 domain), of Escherichia coli polyphosphate kinase 1 (Poly P kinase 1 or PPK1, EC 2.7.4.1) and similar proteins. Inorganic polyphosphate (Poly P) plays an important role in bacterial stress responses and stationary-phase survival. PPK1 is the key enzyme responsible for the synthesis of Poly P in bacteria. It can catalyze the reversible conversion of the terminal-phosphate of ATP to Poly P. Therefore, PPK1 is essential for bacterial motility, quorum sensing, biofilm formation, and the production of virulence factors and may serve as an attractive antimicrobial drug target. Dimerization is crucial for the enzymatic activity of PPK1. The prototype of this subfamily is Escherichia coli polyphosphate kinase (EcPPK), which forms a homotetramer in solution, and becomes a homodimer upon the binding of AMPPNP, a non-hydrolysable ATP analogue. Each EcPPK monomer includes four structural domains, the N-terminal (N) domain, the head (H) domain, and two closely related C-terminal (C1 and C2) domains. The N domain provides the upper binding interface for the adenine ring of the ATP. The H domain is involved in dimerization, while both the C1 and C2 domains contain residues crucial for catalytic activity. The intersection of the N, C1, and C2 domains forms a structural tunnel in which the PPK catalytic reactions are carried out. In spite of the lack of sequence homology, the C1 and C2 domains of EcPPK are structurally similar to the two repetitive catalytic domains of phospholipase D (PLD). Moreover, some residues in the HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) of the PLD superfamily are spatially conserved in the active site of EcPPK. It is possible that the bacterial PPK1 family and the PLD family have a common ancestor and diverged early in evolution.	165
197265	cd09168	PLDc_PaPPK1_C2_like	Catalytic C-terminal domain, second repeat, of Pseudomonas aeruginosa polyphosphate kinase 1 and similar proteins. Catalytic C-terminal domain, second repeat (C2 domain), of polyphosphate kinase (Poly P kinase 1 or PPK1, EC 2.7.4.1) from Pseudomonas aeruginosa (PaPPK1), Dictyostelium discoideum (DdPPK1), and other similar proteins. Inorganic polyphosphate (Poly P) plays an important role in bacterial stress responses and stationary-phase survival. PaPPK1 is the key enzyme responsible for the synthesis of Poly P in Pseudomonas aeruginosa. It can catalyze the reversible conversion of the terminal-phosphate of ATP to Poly P. PaPPK1 shows high sequence homology to Escherichia coli polyphosphate kinase (EcPPK), which contains four structural domains per chain: the N-terminal (N) domain, the head (H) domain, and two closely related C-terminal (C1 and C2) domains. The N domain provides the upper binding interface for the adenine ring of the ATP. The H domain is involved in dimerization, while both the C1 and C2 domains contain residues crucial for catalytic activity. The intersection of the N, C1, and C2 domains forms a structural tunnel in which the PPK catalytic reactions are carried out. The polyphosphate kinase from Dictyostelium discoideum (DdPPK1) shares similar structural features with EcPPK1 in the ATP-binding pocket and poly P tunnel, but has a unique N-terminal extension that may be responsible for its enzymatic activity, cellular localization, and physiological functions. In spite of the lack of sequence homology, the C1 and C2 domains of the family members are structurally similar to the two repetitive catalytic domains of phospholipase D (PLD). Moreover, some residues in the HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) of the PLD superfamily are spatially conserved in the active site of PPK1. It is possible that the bacterial PPK1 family and the PLD family have a common ancestor and diverged early in evolution. In some bacteria, such as Pseudomonas aeruginosa, a second enzyme, PPK2, which is involved in the alternative pathway of polyphosphate synthesis, has been found. It can catalyze the synthesis of poly P from GTP or ATP, with a preference for Mn2+ over Mg2+. PPK2 shows no sequence similarity to PPK1 and belongs to a different superfamily.	163
197266	cd09169	PLDc_PPK1_C2_unchar	Catalytic C-terminal domain, second repeat, of uncharacterized prokaryotic polyphosphate kinases. Catalytic C-terminal domain, second repeat (C2 domain), of a group of uncharacterized prokaryotic polyphosphate kinases (Poly P kinase 1 or PPK1, EC 2.7.4.1). Inorganic polyphosphate (Poly P) plays an important role in bacterial stress responses and stationary-phase survival. PPK1 is the key enzyme responsible for the synthesis of Poly P in bacteria. It can catalyze the reversible conversion of the terminal-phosphate of ATP to Poly P. Therefore, PPK1 is essential for bacterial motility, quorum sensing, biofilm formation, and the production of virulence factors and may serve as an attractive antimicrobial drug target. Dimerization is crucial for the enzymatic activity of PPK1. Each PPK1 monomer includes four structural domains, the N-terminal (N) domain, the head (H) domain, and two closely related C-terminal (C1 and C2) domains. The N domain provides the upper binding interface for the adenine ring of the ATP. The H domain is involved in dimerization, while both the C1 and C2 domains contain residues crucial for catalytic activity. The intersection of the N, C1, and C2 domains forms a structural tunnel in which the PPK catalytic reactions are carried out. In spite of the lack of sequence homology, the C1 and C2 domains of PPK1 are structurally similar to the two repetitive catalytic domains of phospholipase D (PLD). Moreover, some residues in the HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) of the PLD superfamily are spatially conserved in the active site of PPK1. It is possible that the bacterial PPK1 family and the PLD family have a common ancestor and diverged early in evolution.	162
197267	cd09170	PLDc_Nuc	Catalytic domain of EDTA-resistant nuclease Nuc from Salmonella typhimurium and similar proteins. Catalytic domain of an EDTA-resistant nuclease Nuc from Salmonella typhimurium and similar proteins. Nuc is an endonuclease cleaving both single- and double-stranded DNA. It is the smallest known member of the phospholipase D (PLD, EC 3.1.4.4) superfamily that includes a diverse group of proteins with various catalytic functions. Most members of this superfamily have two copies of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) in a single polypeptide chain and both are required for catalytic activity. However, Nuc only has one copy of the HKD motif per subunit but forms a functionally active homodimer (it is most likely also active in solution as a multimeric protein), which has a single active site at the dimer interface containing the HKD motifs from both subunits. Due to the lack of a distinct domain for DNA binding, Nuc cuts DNA non-specifically. It utilizes a two-step mechanism to cleave phosphodiester bonds: Upon substrate binding, the bond is first attacked by a histidine residue from one HKD motif to form a covalent phosphohistidine intermediate, which is then hydrolyzed by water with the aid of a second histidine residue from the other HKD motif in the opposite subunit.	142
197268	cd09171	PLDc_vPLD6_like	Catalytic domain of vertebrate phospholipase D6 and similar proteins. Catalytic domain of vertebrate phospholipase D6 (PLD6, EC 3.1.4.4), a homolog of the EDTA-resistant nuclease Nuc from Salmonella typhimurium, and similar proteins. PLD6 can selectively hydrolyze the terminal phosphodiester bond of phosphatidylcholine (PC) with the formation of phosphatidic acid and alcohols. Phosphatidic acid is an essential compound involved in signal transduction. It also catalyzes the transphosphatidylation of phospholipids to acceptor alcohols, by which various phospholipids can be synthesized. PLD6 belongs to the phospholipase D (PLD) superfamily. Its monomer contains a short conserved sequence motif, H-x-K-x(4)-D (where x represents any amino acid residue), termed the HKD motif, which is essential in catalysis. PLD6 is more closely related to the nuclease Nuc than to other vertebrate phospholipases, which have two copies of the HKD motif in a single polypeptide chain. Like Nuc, PLD6 may utilize a two-step mechanism to cleave phosphodiester bonds: Upon substrate binding, the bond is first attacked by a histidine residue from the HKD motif of one subunit to form a covalent phosphohistidine intermediate, which is then hydrolyzed by water with the aid of a second histidine residue from the other HKD motif in the opposite subunit.	136
197269	cd09172	PLDc_Nuc_like_unchar1_1	Putative catalytic domain, repeat 1, of uncharacterized hypothetical proteins similar to Nuc, an endonuclease from Salmonella typhimurium. Putative catalytic domain, repeat 1, of uncharacterized hypothetical proteins, which show high sequence homology to the endonuclease from Salmonella typhimurium and vertebrate phospholipase D6. Nuc and PLD6 belong to the phospholipase D (PLD) superfamily. They contain a short conserved sequence motif, the HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue), which characterizes the PLD superfamily and is essential for catalysis. Nuc and PLD6 utilize a two-step mechanism to cleave phosphodiester bonds: Upon substrate binding, the bond is first attacked by a histidine residue from one HKD motif to form a covalent phosphohistidine intermediate, which is then hydrolyzed by water with the aid of a second histidine residue from the other HKD motif in the opposite subunit. However, proteins in this subfamily have two HKD motifs in a single polypeptide chain.	144
197270	cd09173	PLDc_Nuc_like_unchar1_2	Putative catalytic domain, repeat 2, of uncharacterized hypothetical proteins similar to Nuc, an endonuclease from Salmonella typhimurium. Putative catalytic domain, repeat 2, of uncharacterized hypothetical proteins, which show high sequence homology to the endonuclease from Salmonella typhimurium and vertebrate phospholipase D6. Nuc and PLD6 belong to the phospholipase D (PLD) superfamily. They contain a short conserved sequence motif, the HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue), which characterizes the PLD superfamily and is essential for catalysis. Nuc and PLD6 utilize a two-step mechanism to cleave phosphodiester bonds: Upon substrate binding, the bond is first attacked by a histidine residue from one HKD motif to form a covalent phosphohistidine intermediate, which is then hydrolyzed by water with the aid of a second histidine residue from the other HKD motif in the opposite subunit. However, proteins in this subfamily have two HKD motifs in a single polypeptide chain.	159
197271	cd09174	PLDc_Nuc_like_unchar2	Putative catalytic domain of uncharacterized hypothetical proteins closely related to Nuc, an endonuclease from Salmonella typhimurium. Putative catalytic domain of uncharacterized hypothetical proteins, which show high sequence homology to the endonuclease from Salmonella typhimurium and vertebrate phospholipase D6. Nuc and PLD6 belong to the phospholipase D (PLD) superfamily. They contain a short conserved sequence motif, the HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue), which characterizes the PLD superfamily and is essential for catalysis. Nuc and PLD6 utilize a two-step mechanism to cleave phosphodiester bonds: Upon substrate binding, the bond is first attacked by a histidine residue from one HKD motif to form a covalent phosphohistidine intermediate, which is then hydrolyzed by water with the aid of a second histidine residue from the other HKD motif in the opposite subunit. However, proteins in this subfamily have two HKD motifs in a single polypeptide chain.	136
197272	cd09175	PLDc_Bfil	Catalytic domain of type IIs restriction endonuclease BfiI and similar proteins. Catalytic domain of a novel type IIs restriction endonuclease BfiI and similar proteins. Type II restriction endonucleases are components of restriction modification (RM) systems that protect bacteria and archaea against invading foreign DNA. They usually function as homodimers or homotetramers that cleave DNA at defined sites of 4 to 8 bp in length, and they require Mg2+, not ATP or GTP, for catalysis. Unlike all other restriction enzymes known to date, BfiI is unique in cleaving DNA at fixed positions downstream of an asymmetric sequence in the absence of Mg2+. BfiI consists of two discrete domains with distinct functions: an N-terminal catalytic domain with non-specific nuclease activity and dimerization function that is more closely related to Nuc, an EDTA-resistant nuclease from the phospholipase D (PLD) superfamily; and a C-terminal domain that specifically recognizes its target sequence, 5'-ACTGGG-3'. BfiI presumably evolved through domain fusion of a DNA recognition domain to the catalytic Nuc-like domain from the PLD superfamily. Most PLD enzymes have two copies of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) in a single polypeptide chain and both are required for catalytic activity. However, BfiI contains only one HKD motif per protein chain and forms a functionally active homodimer which has two DNA-binding surfaces located at the C-terminal domains but only one active site, located at the dimer interface between the two N-terminal catalytic domains that contain the two HKD motifs from both subunits. BfiI utilizes a single active site to cut both DNA strands, which represents a novel mechanism for the scission of double-stranded DNA. It uses a histidine residue from the HKD motif in one subunit as the nucleophile for the cleavage of the target phosphodiester bond in both of the anti-parallel DNA strands, while the symmetrically-related histidine residue from the HKD motif of the opposite subunit acts as the proton donor/acceptor during both strand-scission events.	161
197273	cd09176	PLDc_unchar6	Putative catalytic domain of uncharacterized hypothetical proteins with one or two copies of the HKD motif. Putative catalytic domain of uncharacterized hypothetical proteins with similarity to phospholipase D (PLD, EC 3.1.4.4). PLD enzymes hydrolyze phospholipid phosphodiester bonds to yield phosphatidic acid and a free polar head group. They can also catalyze transphosphatidylation of phospholipids to acceptor alcohols. Members of this subfamily contain one or two copies of the HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) that characterizes the PLD superfamily.	114
197274	cd09177	PLDc_RE_NgoFVII	Putative catalytic domain of type II restriction enzyme NgoFVII and similar proteins. Putative catalytic domain of type II restriction enzyme NgoFVII (EC 3.1.21.4), which shows high sequence similarity to type IIs restriction endonuclease BfiI. Type II restriction endonucleases are components of restriction modification (RM) systems that protect bacteria and archaea against invading foreign DNA. They usually function as homodimers or homotetramers that cleave DNA at defined sites of 4 to 8 bp in length, and they require Mg2+, not ATP or GTP, for catalysis. The prototype of this subfamily is the NgoFVII restriction endonuclease from Neisseria gonorrhoeae. It plays an essential role in the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5'-phosphates. It recognizes the double-stranded sequence GCSGC and cleaves after G-4. Members of this subfamily contain one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) per protein chain and have been classified into the phospholipase D (PLD, EC 3.1.4.4) superfamily.	143
197275	cd09178	PLDc_N_Snf2_like	N-terminal putative catalytic domain of uncharacterized HKD family nucleases fused to putative helicases from the Snf2-like family. N-terminal putative catalytic domain of uncharacterized archaeal and prokaryotic HKD family nucleases fused to putative helicases from the Snf2-like family, which belong to the DNA/RNA helicase superfamily II (SF2). Although Snf2-like family enzymes do not possess helicase activity, they contain a helicase-like region, where seven helicase-related sequence motifs are found, similar to those in DEAD/DEAH box helicases, which represent the biggest family within the SF2 superfamily. In addition to the helicase-like region, members of this family also contain an N-terminal putative catalytic domain with one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue), and have been classified as members of the phospholipase D (PLD, EC 3.1.4.4) superfamily.	134
197276	cd09179	PLDc_N_DEXD_a	N-terminal putative catalytic domain of uncharacterized prokaryotic and archaeal HKD family nucleases fused to a DEAD/DEAH box helicase domain. N-terminal putative catalytic domain of uncharacterized prokaryotic and archaeal HKD family nucleases fused to a DEAD/DEAH box helicase domain. All members of this subfamily are uncharacterized. Other characterized members of the superfamily that have a related domain architecture (containing a DEAD/DEAH box helicase domain) include the DNA/RNA helicase superfamily II (SF2) and Res-subunit of type III restriction endonucleases. In addition to the helicase-like region, members of this subfamily also contain one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) in the N-terminal putative catalytic domain. The HKD motif characterizes the phospholipase D (PLD, EC 3.1.4.4) superfamily.	190
197277	cd09180	PLDc_N_DEXD_b	N-terminal putative catalytic domain of uncharacterized prokaryotic and archaeal HKD family nucleases fused to a DEAD/DEAH box helicase domain. N-terminal putative catalytic domain of uncharacterized prokaryotic and archaeal HKD family nucleases fused to a DEAD/DEAH box helicase domain. All members of this subfamily are uncharacterized. Other characterized members of the superfamily that have a related domain architecture (containing a DEAD/DEAH box helicase domain) include the DNA/RNA helicase superfamily II (SF2) and Res-subunit of type III restriction endonucleases. In addition to the helicase-like region, members of this subfamily also contain one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) in the N-terminal putative catalytic domain. The HKD motif characterizes the phospholipase D (PLD, EC 3.1.4.4) superfamily. A few family members contain additional domains, like a C-terminal peptidase S24-like domain.	142
197278	cd09181	PLDc_FAM83A_N	N-terminal phospholipase D-like domain of the uncharacterized protein, Family with sequence similarity 83A. N-terminal phospholipase D (PLD)-like domain of the uncharacterized protein, Family with sequence similarity 83A (FAM83A), also known as tumor antigen BJ-TSA-9. FAM83A or BJ-TSA-9 is a novel tumor-specific gene highly expressed in human lung adenocarcinoma. Due to this specific expression pattern, it may serve as a biomarker for lung cancer, especially in the early detection of micrometastasis for lung adenocarcinoma patients. Since the N-terminal PLD-like domain of FAM83 proteins shows only trace similarity to the PLD catalytic domain and lacks the functionally important histidine residue, FAM83 proteins may share a similar three-dimensional fold with PLD enzymes, but are most unlikely to carry PLD activity.	276
197279	cd09182	PLDc_FAM83B_N	N-terminal phospholipase D-like domain of the uncharacterized protein, Family with sequence similarity 83B. N-terminal phospholipase D (PLD)-like domain of the uncharacterized protein, Family with sequence similarity 83B (FAM83B). Since the N-terminal PLD-like domain of FAM83 proteins shows only trace similarity to the PLD catalytic domain and lacks the functionally important histidine residue, FAM83 proteins may share a similar three-dimensional fold with PLD enzymes, but are most unlikely to carry PLD activity. The N-terminus of FAM83B shows high homology to other FAM83 family members, indicating that FAM83B might have arisen early in vertebrate evolution by duplication of a gene in the FAM83 family.	266
197280	cd09183	PLDc_FAM83C_N	N-terminal phospholipase D-like domain of the uncharacterized protein, Family with sequence similarity 83C. N-terminal phospholipase D (PLD)-like domain of the uncharacterized protein, Family with sequence similarity 83C (FAM83C). Since the N-terminal PLD-like domain of FAM83 proteins shows only trace similarity to the PLD catalytic domain and lacks the functionally important histidine residue, FAM83 proteins may share a similar three-dimensional fold with PLD enzymes, but are most unlikely to carry PLD activity. The N-terminus of FAM83C shows high homology to other FAM83 family members, indicating that FAM83C might have arisen early in vertebrate evolution by duplication of a gene in the FAM83 family.	274
197281	cd09184	PLDc_FAM83D_N	N-terminal phospholipase D-like domain of the protein, Family with sequence similarity 83D. N-terminal phospholipase D (PLD)-like domain of the protein Family with sequence similarity 83D (FAM83D), also known as spindle protein CHICA. CHICA is a cell-cycle-regulated spindle component, which localizes to the mitotic spindle and is both upregulated and phosphorylated during mitosis. CHICA is required to localize the chromokinesin Kid to the mitotic spindle and serves as a novel interaction partner of Kid, which is required for the generation of polar ejection forces and chromosome congression. Since the N-terminal PLD-like domain of FAM83D shows only trace similarity to the PLD catalytic domain and lacks the functionally important histidine residue, FAM83D may share a similar three-dimensional fold with PLD enzymes, but is unlikely to carry PLD activity.	271
197282	cd09186	PLDc_FAM83F_N	N-terminal phospholipase D-like domain of the uncharacterized protein, Family with sequence similarity 83F. N-terminal phospholipase D (PLD)-like domain of the uncharacterized protein, Family with sequence similarity 83F (FAM83F). Since the N-terminal PLD-like domain of FAM83 proteins shows only trace similarity to the PLD catalytic domain and lacks the functionally important histidine residue, FAM83 proteins may share a similar three-dimensional fold with PLD enzymes, but are most unlikely to carry PLD activity. The N-terminus of FAM83F shows high homology to other FAM83 family members, indicating that FAM83F might have arisen early in vertebrate evolution by duplication of a gene in the FAM83 family.	268
197283	cd09187	PLDc_FAM83G_N	N-terminal phospholipase D-like domain of the uncharacterized protein Family with sequence similarity 83G. N-terminal phospholipase D (PLD)-like domain of the uncharacterized protein, Family with sequence similarity 83G (FAM83G). Since the N-terminal PLD-like domain of FAM83 proteins shows only trace similarity to the PLD catalytic domain and lacks the functionally important histidine residue, FAM83 proteins may share a similar three-dimensional fold with PLD enzymes, but are most unlikely to carry PLD activity. The N-terminus of FAM83G shows high homology to other FAM83 family members, indicating that FAM83G might have arisen early in vertebrate evolution by duplication of a gene in the FAM83 family.	275
197284	cd09188	PLDc_FAM83H_N	N-terminal phospholipase D-like domain of the uncharacterized protein, Family with sequence similarity 83H. N-terminal phospholipase D (PLD)-like domain of the protein, Family with sequence similarity 83H (FAM83H) on chromosome 8q24.3, which localizes in the intracellular environment and is associated with vesicles, can be regulated by kinases, and plays important roles during ameloblast differentiation and enamel matrix calcification. The gene encoding protein FAM83H is the first gene involved in the etiology of amelogenesis imperfecta (AI), that encodes a non-secreted protein due to the absence of a signal peptide. Defects in gene FAM83H cause autosomal dominant hypocalcified amelogenesis imperfecta (ADHCAI). Since the N-terminal PLD-like domain of FAM83H shows only trace similarity to the PLD catalytic domain and lacks the functionally important histidine residue, FAM83H may share a similar three-dimensional fold with PLD enzymes, but is most unlikely to carry PLD activity.	265
197285	cd09189	PLDc_DNaseII_alpha_1	Catalytic domain, repeat 1, of Deoxyribonuclease II alpha and similar proteins. Catalytic domain, repeat 1, of Deoxyribonuclease II alpha (DNase II alpha, EC 3.1.22.1) and similar proteins. DNase II is a monomeric nuclease that contains two copies of a variant HKD motif, where the aspartic acid residue is not conserved. The HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) characterizes the phospholipase D (PLD, EC 3.1.4.4) superfamily. The catalytic center of DNase II is formed by the two variant HKD motifs from the N- and C-terminal domains in a pseudodimeric way. Members of this family are mainly found in metazoans, and vertebrate proteins have been further classified into DNase II alpha and beta. DNase II alpha is an acidic endonuclease found in lysosomes, nuclei, and various secretions. It plays a critical role in the degradation of nuclear DNA expelled from erythroid precursor cells, as well as in the degradation of the apoptotic DNA after macrophages engulf them. It cleaves double-stranded DNA to short 3'-phosphoryl oligonucleotides, rather than 3'-hydroxyl groups, and functions optimally at acidic pH in the absence of divalent metal ion cofactors.	162
197286	cd09190	PLDc_DNaseII_beta_1	Catalytic domain, repeat 1, of Deoxyribonuclease II beta and similar proteins. Catalytic domain, repeat 1, of Deoxyribonuclease II beta (DNase II beta, EC 3.1.22.1), also known as DNase II-like acid DNase (DLAD), and similar proteins. DNase II is a monomeric nuclease that contains two copies of a variant HKD motif, where the aspartic acid residue is not conserved. The HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) characterizes the phospholipase D (PLD, EC 3.1.4.4) superfamily. The catalytic center of DNase II is formed by the two variant HKD motifs from the N- and C-terminal domains in a pseudodimeric way. Members of this family are mainly found in metazoans, and vertebrate proteins have been further classified into DNase II alpha and beta. DNase II beta, or DLAD, is a novel mammalian divalent cation-independent endonuclease with homology to DNase II alpha. It is highly expressed in the eye lens and in salivary glands and is responsible for the degradation of nuclear DNA during lens cell differentiation. DLAD mainly exists as a cytoplasmic protein and cleaves DNA to produce 3'-phosphoryl/5'-hydroxyl ends. Like DNase II alpha, DLAD is active under acidic conditions with maximum activity at pH 5.2. Aurintricarboxylic acid and Zn2+ are effective inhibitors of DLAD activity. Mice deficient in DLAD develop cataracts as they are unable to degrade DNA during differentiation of the lens cells.	165
197287	cd09191	PLDc_DNaseII_alpha_2	Catalytic domain, repeat 2, of Deoxyribonuclease II alpha and similar proteins. Catalytic domain, repeat 2, of Deoxyribonuclease II alpha (DNase II alpha, EC 3.1.22.1) and similar proteins. DNase II is a monomeric nuclease that contains two copies of a variant HKD motif, where the aspartic acid residue is not conserved. The HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) characterizes the phospholipase D (PLD, EC 3.1.4.4) superfamily. The catalytic center of DNase II is formed by the two variant HKD motifs from the N- and C-terminal domains in a pseudodimeric way. Members of this family are mainly found in metazoans, and vertebrate proteins have been further classified into DNase II alpha and beta. DNase II alpha is an acidic endonuclease found in lysosomes, nuclei, and various secretions. It plays a critical role in the degradation of nuclear DNA expelled from erythroid precursor cells, as well as in the degradation of the apoptotic DNA after macrophages engulf them. It cleaves double-stranded DNA to short 3'-phosphoryl oligonucleotides, rather than 3'-hydroxyl groups, and functions optimally at acidic pH in the absence of divalent metal ion cofactors.	137
197288	cd09192	PLDc_DNaseII_beta_2	Catalytic domain, repeat 2, of Deoxyribonuclease II beta and similar proteins. Catalytic domain, repeat 2, of Deoxyribonuclease II beta (DNase II beta, EC 3.1.22.1), also known as DNase II-like acid DNase (DLAD), and similar proteins. DNase II is a monomeric nuclease that contains two copies of a variant HKD motif, where the aspartic acid residue is not conserved. The HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) characterizes the phospholipase D (PLD, EC 3.1.4.4) superfamily. The catalytic center of DNase II is formed by the two variant HKD motifs from the N- and C-terminal domains in a pseudodimeric way. Members of this family are mainly found in metazoans, and vertebrate proteins have been further classified into DNase II alpha and beta. DNase II beta, or DLAD, is a novel mammalian divalent cation-independent endonuclease with homology to DNase II alpha. It is highly expressed in the eye lens and in salivary glands and is responsible for the degradation of nuclear DNA during lens cell differentiation. DLAD mainly exists as a cytoplasmic protein and cleaves DNA to produce 3'-phosphoryl/5'-hydroxyl ends. Like DNase II alpha, DLAD is active under acidic conditions with maximum activity at pH 5.2. Aurintricarboxylic acid and Zn2+ are effective inhibitors of DLAD activity. Mice deficient in DLAD develop cataracts as they are unable to degrade DNA during differentiation of the lens cells.	139
197289	cd09193	PLDc_mTdp1_1	Catalytic domain, repeat 1, of metazoan tyrosyl-DNA phosphodiesterase. Catalytic domain, repeat 1, of metazoan tyrosyl-DNA phosphodiesterase (Tdp1, EC 3.1.4.-). Human Tdp1 (hTdp1) acts as an important DNA repair enzyme with a preference for single-stranded or blunt-ended duplex oligonucleotides. It can remove stalled topoisomerase I-DNA complexes by catalyzing the hydrolysis of a phosphodiester bond between a tyrosine side chain and a DNA 3'-phosphate. It is therefore a potential molecular target for new anti-cancer drugs. hTdp1 has been shown to associate with additional proteins, such as XRCC1, to form a multi-enzyme complex. These additional proteins may be involved in recognizing 3'-phosphotyrosyl DNA in vivo. hTdp1 is a monomeric protein containing two copies of a variant HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue), which consists of the highly conserved histidine and lysine residues, but lacks the aspartate residue that is well conserved in other phospholipase D (PLD, EC 3.1.4.4) enzymes. Like other PLD enzymes, hTdp1 may utilize a common two-step general acid/base catalytic mechanism, involving a DNA-enzyme intermediate to cleave phosphodiester bonds. A single active site involved in phosphatidyl group transfer would be formed by the two variant HKD motifs from the N- and C-terminal domains in a pseudodimeric way.	169
197290	cd09194	PLDc_yTdp1_1	Catalytic domain, repeat 1, of yeast tyrosyl-DNA phosphodiesterase. Catalytic domain, repeat 1, of yeast tyrosyl-DNA phosphodiesterase (yTdp1, EC 3.1.4.-). yTdp1 is involved in the repair of topoisomerase I DNA lesions by hydrolyzing the topoisomerase from the 3'-end of the DNA during double-strand break repair. Unlike human Tdp1 whose substrate-binding pocket can accommodate a fairly large topoisomerase I peptide fragment, yTdp1 has a preference for substrates containing one to four amino acid residues. The monomeric yTdp1 contains two copies of a variant HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue), which consists of the highly conserved histidine and lysine residues, but lacks the aspartate residue that is well conserved in other phospholipase D (PLD, EC 3.1.4.4) enzymes. Like other PLD enzymes, yTdp1 may utilize a common two-step general acid/base catalytic mechanism, involving a DNA-enzyme intermediate to cleave phosphodiester bonds. A single active site involved in phosphatidyl group transfer would be formed by the two variant HKD motifs from the N- and C-terminal domains in a pseudodimeric way.	166
197291	cd09195	PLDc_mTdp1_2	Catalytic domain, repeat 2, of metazoan tyrosyl-DNA phosphodiesterase. Catalytic domain, repeat 2, of metazoan tyrosyl-DNA phosphodiesterase (Tdp1, EC 3.1.4.-). Human Tdp1 (hTdp1) acts as an important DNA repair enzyme with a preference for single-stranded or blunt-ended duplex oligonucleotides. It can remove stalled topoisomerase I-DNA complexes by catalyzing the hydrolysis of a phosphodiester bond between a tyrosine side chain and a DNA 3'-phosphate. It is therefore a potential molecular target for new anti-cancer drugs. hTdp1 has been shown to associate with additional proteins, such as XRCC1, to form a multi-enzyme complex. These additional proteins may be involved in recognizing 3'-phosphotyrosyl DNA in vivo. hTdp1 is a monomeric protein containing two copies of a variant HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue), which consists of the highly conserved histidine and lysine residues, but lacks the aspartate residue that is well conserved in other phospholipase D (PLD, EC 3.1.4.4) enzymes. Like other PLD enzymes, hTdp1 may utilize a common two-step general acid/base catalytic mechanism, involving a DNA-enzyme intermediate to cleave phosphodiester bonds. A single active site involved in phosphatidyl group transfer would be formed by the two variant HKD motifs from the N- and C-terminal domains in a pseudodimeric way.	191
197292	cd09196	PLDc_yTdp1_2	Catalytic domain, repeat 2, of yeast tyrosyl-DNA phosphodiesterase. Catalytic domain, repeat 2, of yeast tyrosyl-DNA phosphodiesterase (yTdp1, EC 3.1.4.-). yTdp1 is involved in the repair of topoisomerase I DNA lesions by hydrolyzing the topoisomerase from the 3'-end of the DNA during double-strand break repair. Unlike human Tdp1 whose substrate-binding pocket can accommodate a fairly large topoisomerase I peptide fragment, yTdp1 has a preference for substrates containing one to four amino acid residues. The monomeric yTdp1 contains two copies of a variant HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue), which consists of the highly conserved histidine and lysine residues, but lacks the aspartate residue that is well conserved in other phospholipase D (PLD, EC 3.1.4.4) enzymes. Like other PLD enzymes, yTdp1 may utilize a common two-step general acid/base catalytic mechanism, involving a DNA-enzyme intermediate to cleave phosphodiester bonds. A single active site involved in phosphatidyl group transfer would be formed by the two variant HKD motifs from the N- and C-terminal domains in a pseudodimeric way.	200
197293	cd09197	PLDc_pPLDalpha_1	Catalytic domain, repeat 1, of plant alpha-type phospholipase D. Catalytic domain, repeat 1, of plant alpha-type phospholipase D (PLDalpha, EC 3.1.4.4). Plant PLDalpha is a phosphatidylinositol 4,5-bisphosphate (PIP2)-independent PLD that possesses a regulatory calcium-dependent phospholipid-binding C2 domain in the N-terminus and requires millimolar calcium for optimal activity. The C2 domain is unique to plant PLDs and is not present in animal or fungal PLDs. Like other PLD enzymes, the monomer of plant PLDalpha consists of two catalytic domains, each of which contains one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue). Two HKD motifs from two domains form a single active site. Plant PLDalpha may utilize a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group.	178
197294	cd09198	PLDc_pPLDbeta_1	Catalytic domain, repeat 1, of plant beta-type phospholipase D. Catalytic domain, repeat 1, of plant beta-type phospholipase D (PLDbeta, EC 3.1.4.4). Plant PLDbeta is a phosphatidylinositol 4,5-bisphosphate (PIP2)-dependent PLD that possesses a regulatory calcium-dependent phospholipid-binding C2 domain in the N-terminus and requires nanomolar calcium and cytosolic factors for optimal activity. The C2 domain is unique to plant PLDs and is not present in animal or fungal PLDs. Sequence analysis shows that plant PLDbeta is evolutionarily divergent from alpha-type plant PLD, and plant PLDbeta is more closely related to mammalian and yeast PLDs than to plant PLDalpha. Like other PLD enzymes, the monomer of plant PLDbeta consists of two catalytic domains, each of which contains one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue). Two HKD motifs from two domains form a single active site. Plant PLDbeta may utilize a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group.	180
197295	cd09199	PLDc_pPLDalpha_2	Catalytic domain, repeat 2, of plant alpha-type phospholipase D. Catalytic domain, repeat 2, of plant alpha-type phospholipase D (PLDalpha, EC 3.1.4.4). Plant PLDalpha is a phosphatidylinositol 4,5-bisphosphate (PIP2)-independent PLD that possesses a regulatory calcium-dependent phospholipid-binding C2 domain in the N-terminus and requires millimolar calcium for optimal activity. The C2 domain is unique to plant PLDs and is not present in animal or fungal PLDs. Like other PLD enzymes, the monomer of plant PLDalpha consists of two catalytic domains, each of which contains one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue). Two HKD motifs from two domains form a single active site. Plant PLDalpha may utilize a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group.	211
197296	cd09200	PLDc_pPLDbeta_2	Catalytic domain, repeat 2, of plant beta-type phospholipase D. Catalytic domain, repeat 2, of plant beta-type phospholipase D (PLDbeta, EC 3.1.4.4). Plant PLDbeta is a phosphatidylinositol 4,5-bisphosphate (PIP2)-dependent PLD that possesses a regulatory calcium-dependent phospholipid-binding C2 domain in the N-terminus and requires nanomolar calcium and cytosolic factors for optimal activity. The C2 domain is unique to plant PLDs and is not present in animal or fungal PLDs. Sequence analysis shows that plant PLDbeta is evolutionarily divergent from alpha-type plant PLD, and plant PLDbeta is more closely related to mammalian and yeast PLDs than to plant PLDalpha. Like other PLD enzymes, the monomer of plant PLDbeta consists of two catalytic domains, each of which contains one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue). Two HKD motifs from two domains form a single active site. Plant PLDbeta may utilize a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group.	211
197297	cd09203	PLDc_N_DEXD_b1	N-terminal putative catalytic domain of uncharacterized prokaryotic and archaeal HKD family nucleases fused to a DEAD/DEAH box helicase domain. N-terminal putative catalytic domain of uncharacterized prokaryotic and archaeal HKD family nucleases fused to a DEAD/DEAH box helicase domain. All members of this subfamily are uncharacterized. Other characterized members of the superfamily that have a related domain architecture (containing a DEAD/DEAH box helicase domain) include the DNA/RNA helicase superfamily II (SF2) and Res-subunit of type III restriction endonucleases. In addition to the helicase-like region, members of this subfamily also contain one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) in the N-terminal putative catalytic domain. The HKD motif characterizes the phospholipase D (PLD, EC 3.1.4.4) superfamily.	143
197298	cd09204	PLDc_N_DEXD_b2	N-terminal putative catalytic domain of uncharacterized prokaryotic and archaeal HKD family nucleases fused to a DEAD/DEAH box helicase domain. N-terminal putative catalytic domain of uncharacterized prokaryotic and archaeal HKD family nucleases fused to a DEAD/DEAH box helicase domain. All members of this subfamily are uncharacterized. Other characterized members of the superfamily that have a related domain architecture (containing a DEAD/DEAH box helicase domain) include the DNA/RNA helicase superfamily II (SF2) and Res-subunit of type III restriction endonucleases. In addition to the helicase-like region, members of this subfamily also contain one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) in the N-terminal putative catalytic domain. The HKD motif characterizes the phospholipase D (PLD, EC 3.1.4.4) superfamily.	139
197299	cd09205	PLDc_N_DEXD_b3	N-terminal putative catalytic domain of uncharacterized prokaryotic and archaeal HKD family nucleases fused to a DEAD/DEAH box helicase domain. N-terminal putative catalytic domain of uncharacterized prokaryotic and archaeal HKD family nucleases fused to a DEAD/DEAH box helicase domain. All members of this subfamily are uncharacterized. Other characterized members of the superfamily that have a related domain architecture (containing a DEAD/DEAH box helicase domain) include the DNA/RNA helicase superfamily II (SF2) and Res-subunit of type III restriction endonucleases. In addition to the helicase-like region, members of this subfamily also contain one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) in the N-terminal putative catalytic domain. The HKD motif characterizes the phospholipase D (PLD, EC 3.1.4.4) superfamily. A few family members contain additional domains, like a C-terminal peptidase S24-like domain.	143
187741	cd09208	Lumazine_synthase-II	lumazine synthase (6,7-dimethyl-8-ribityllumazine synthase, LS), catalyzes the penultimate step in the biosynthesis of riboflavin (vitamin B2); type-II. Type-II LS, also known as RibH2, catalyzes the penultimate step in the biosynthesis of riboflavin in plants and microorganisms. LS catalyzes the formation of 6,7-dimethyl-8-ribityllumazine by the condensation of 5-amino-6-ribitylamino-2,4(1H,3H)-pyrimidinedione with 3,4-dihydroxy-2-butanone-4-phosphate. Subsequently, the lumazine intermediate dismutates yielding riboflavin and 5-amino-6-ribitylamino-2,4(1H,3H)-pyrimidinedione, in a reaction catalyzed by riboflavin synthase (RS); RS belongs to a different family of the Lumazine-synthase-like superfamily. Riboflavin is the precursor of flavin mononucleotide (FMN) and flavin adenine dinucleotide (FAD) which are essential cofactors for the catalysis of a wide range of redox reactions. These cofactors are also involved in many other processes involving DNA repair, circadian time-keeping, light sensing, and bioluminescence. Riboflavin is biosynthesized in plants, fungi and certain microorganisms; as animals lack the necessary enzymes to produce this vitamin, they acquire it from dietary sources. Type-II LSs are distinct from type-I LSs not only in protein sequence, but in that they exhibit different quaternary assemblies; type-II LSs form decamers (dimers of pentamers). Pathogenic Brucella spp. have both a type-I LS and a type-II LS, called RibH1 and RibH2, respectively. RibH1/type-I LS appears to be a functional LS in Brucella spp., whereas RibH2/type-II LS has much lower catalytic activity as LS and may be regulated by a riboswitch that senses FMN, suggesting that the type-II LSs may have evolved into very poor catalysts or that they may harbor a new, as-yet-unknown function.	137
187742	cd09209	Lumazine_synthase-I	lumazine synthase (6,7-dimethyl-8-ribityllumazine synthase, LS), catalyzes the penultimate step in the biosynthesis of riboflavin (vitamin B2); type-I. Type-I LS, also known as RibH1, catalyzes the penultimate step in the biosynthesis of riboflavin in plants and microorganisms. LS catalyzes the formation of 6,7-dimethyl-8-ribityllumazine by the condensation of 5-amino-6-ribitylamino-2,4(1H,3H)-pyrimidinedione with 3,4-dihydroxy-2-butanone-4-phosphate. Subsequently, the lumazine intermediate dismutates to yield riboflavin and 5-amino-6-ribitylamino-2,4(1H,3H)-pyrimidinedione, in a reaction catalyzed by riboflavin synthase (RS); RS belongs to a different family of the Lumazine-synthase-like superfamily. Riboflavin is the precursor of flavin mononucleotide (FMN) and flavin adenine dinucleotide (FAD) which are essential cofactors for the catalysis of a wide range of redox reactions. These cofactors are also involved in many other processes involving DNA repair, circadian time-keeping, light sensing, and bioluminescence. Riboflavin is biosynthesized in plants, fungi and certain microorganisms; as animals lack the necessary enzymes to produce this vitamin, they acquire it from dietary sources. Type-II LSs are distinct from type-I LSs not only in protein sequence, but in that they exhibit different quaternary assemblies; type-I LSs form pentamers. Pathogenic Brucella spp. encode both a type-I LS and a type-II LS, called RibH1 and RibH2, respectively. RibH1/type-I LS appears to be the functional LS in Brucella spp., whereas RibH2/type-II LS has much lower catalytic activity as LS.	133
187743	cd09210	Riboflavin_synthase_archaeal	archaeal riboflavin synthase (RS); involved in the biosynthesis pathway of riboflavin (vitamin B2). Archaeal RSs are homopentamers catalyzing the formation of riboflavin from 6,7-dimethyl-8-ribityllumazine in riboflavin biosynthesis. Divalent metal ions, preferably manganese or magnesium, are needed for maximum activity. Riboflavin serves as the precursor of flavin mononucleotide (FMN) and flavin adenine dinucleotide (FAD), essential cofactors for several oxidoreductases that are indispensable in most living cells. In the final steps of the riboflavin biosynthetic pathway, lumazine synthase (6,7-dimethyl-8-ribityllumazine synthase, LS) catalyzes the condensation of 5-amino-6-ribitylamino-2,4(1H,3H)-pyrimidinedione with 3,4-dihydroxy-2-butanone-4-phosphate to release water, inorganic phosphate and 6,7-dimethyl-8-ribityllumazine (DMRL), followed by RS which catalyzes a dismutation of DMRL yielding riboflavin and 5-amino-6-ribitylamino-2,4(1H,3H)-pyrimidinedione. In the latter reaction, a four-carbon moiety is transferred between two DMRL molecules serving as donor and acceptor, respectively. Both the LS and RS catalyzed reactions are thermodynamically irreversible and can proceed in the absence of a catalyst. Archaeal RSs share sequence similarity with LSs; both appear to have diverged early in the evolution of archaea from a common ancestor.	143
187744	cd09211	Lumazine_synthase_archaeal	lumazine synthase (6,7-dimethyl-8-ribityllumazine synthase, LS); catalyzes the penultimate step in the biosynthesis of riboflavin (vitamin B2). Archaeal LS is an important enzyme in the riboflavin biosynthetic pathway. Riboflavin is the precursor of flavin mononucleotide (FMN) and flavin adenine dinucleotide (FAD) which are essential cofactors for the catalysis of a wide range of redox reactions. These cofactors are also involved in many other processes involving DNA repair, circadian time-keeping, light sensing, and bioluminescence. In the final steps of the riboflavin biosynthetic pathway, LS catalyzes the condensation of 5-amino-6-ribitylamino-2,4(1H,3H)-pyrimidinedione with 3,4-dihydroxy-2-butanone-4-phosphate to release water, inorganic phosphate and 6,7-dimethyl-8-ribityllumazine (DMRL), and riboflavin synthase (RS) catalyzes a dismutation of DMRL which yields riboflavin and 5-amino-6-ribitylamino-2,4(1H,3H)-pyrimidinedione. In the latter reaction, a four-carbon moiety is transferred between two DMRL molecules serving as donor and acceptor, respectively. Both the LS and RS catalyzed reactions are thermodynamically irreversible and can proceed in the absence of a catalyst. LS from Methanococcus jannaschii forms capsids with icosahedral 532 symmetry consisting of 60 subunits. Archaeal LSs share sequence similarity with archaeal RSs; both appear to have diverged early in the evolution of archaea from a common ancestor.	131
198416	cd09212	PUB	PNGase/UBA or UBX (PUB) domain of p97 adaptor proteins. The PUB domain is found in p97 adaptor proteins such as PNGase, UBXD1 (UBX domain-containing protein 1), and RNF31 (RING finger protein 31). It functions as a p97 (also known as valosin-containing protein or VCP) adaptor by interacting with the D1 and/or D2 ATPase domains. The p97, a type II AAA+ ATPase, is involved in a variety of cellular processes such as the deglycosylation of ERAD substrates, membrane fusion, transcription factor activation and cell cycle regulation through differential binding to specific adaptor proteins.   The PUB domain in UBX-domain protein 1 (UBXD1), which is widely expressed in higher eukaryotes (except for fungi) and which is involved in substrate recruitment to p97, interacts strongly with the C-terminus of p97. Peptide:N-glycanase (PNGase), a deglycosylating enzyme that functions in proteasome-dependent degradation of misfolded glycoproteins which are translocated from the endoplasmic reticulum (ER) to the cytosol during ERAD, associates with the ubiquitin-proteasome system proteins mediated by the N-terminal PUB domain. PNGase is present in all eukaryotic organisms; however, the yeast PNGase ortholog does not contain the PUB domain. The RNF31 protein, also known as HOIP or Zibra, contains an N-terminal PUB domain similar to those in PNGase and UBXD1, suggesting its association with p97.	96
188873	cd09213	Luminal_IRE1_like	The Luminal domain, a dimerization domain, of Inositol-requiring protein 1-like proteins. The Luminal domain is a dimerization domain present in Inositol-requiring protein 1 (IRE1), eukaryotic translation Initiation Factor 2-Alpha Kinase 3  (EIF2AK3), and similar proteins. IRE1 and EIF2AK3 are serine/threonine protein kinases (STKs) and are type I transmembrane proteins that are localized in the endoplasmic reticulum (ER). They are kinase receptors that are activated through the release of BiP, a chaperone bound to their luminal domains under unstressed conditions. This results in dimerization through their luminal domains, allowing trans-autophosphorylation of their kinase domains and activation. They play roles in the signaling of the unfolded protein response (UPR), which is activated when protein misfolding is detected in the ER in order to decrease the synthesis of new proteins and increase the capacity of the ER to cope with the stress. IRE1, also called Endoplasmic reticulum (ER)-to-nucleus signaling protein (or ERN), contains an endoribonuclease domain in its cytoplasmic side and acts as an ER stress sensor. It is the oldest and most conserved component of the UPR in eukaryotes. Its activation results in the cleavage of its mRNA substrate, HAC1 in yeast and Xbp1 in metazoans, promoting a splicing event that enables translation into a transcription factor which activates the UPR. EIF2AK3, also called PKR-like Endoplasmic Reticulum Kinase (PERK), phosphorylates the alpha subunit of eIF-2, resulting in the downregulation of protein synthesis. It functions as the central regulator of translational control during the UPR pathway. In addition to the eIF-2 alpha subunit, EIF2AK3 also phosphorylates Nrf2, a leucine zipper transcription factor which regulates cellular redox status and promotes cell survival during the UPR.	312
185753	cd09214	GH64-like	glycosyl hydrolase 64 family. This family is represented by the laminaripentaose-producing, beta-1,3-glucanase (LPHase) of Streptomyces matensis and related bacterial and ascomycete proteins. LPHase is a member of glycoside hydrolase family 64 (GH64); it is an inverting enzyme involved in the cleavage of long-chain polysaccharide beta-1,3-glucans into specific pentasaccharide oligomers. LPHase is a two-domain crescent fold structure: one domain is composed of 10 beta-strands, eight coming from the N-terminus of the protein and two from the C-terminal region, and the protein has a second inserted domain; this cd includes both domains. This protein has an electronegative, substrate-binding cleft, and conserved Glu and Asp residues involved in the cleavage of the beta-1,3-glucan, laminarin, a plant and fungal cell wall component. Among bacteria, many beta-1,3-glucanases are implicated in fungal cell wall degradation. Also included in this family is GluB, the beta-1,3-glucanase B from Lysobacter enzymogenes Strain N4-7. Recombinant GluB demonstrated higher relative activity toward the branched-chain beta-1,3 glucan substrate zymosan A than toward linear beta-1,3 glucan substrates. Sometimes these two domains are found associated with other domains such as in the Catenulispora acidiphila DSM 44928 carbohydrate binding family 6 protein in which they are positioned N-terminal of a carbohydrate binding module, family 6 (CBM_6) domain. In the Cellulosimicrobium cellulans glucan endo-1,3-beta-glucosidase, they are positioned N-terminal of a RICIN carbohydrate-binding domain, and in the Salinispora tropica CNB-440 coagulation factor 5/8 C-terminal domain (FA58C) protein, they are positioned C-terminal of two FA58C domains which are proposed to function as cell surface-attached, carbohydrate-binding domains. This FA58C-containing protein has an internal peptide deletion (of approx. 44 residues) in the LPHase domain II.	319
185754	cd09215	Thaumatin-like	the sweet-tasting protein, thaumatin, and thaumatin-like proteins involved in host defense. This family is represented by the sweet-tasting protein thaumatin from the African berry Thaumatococcus daniellii and thaumatin-like proteins (TLPs) involved in host defense and a wide range of developmental processes in fungi, plants, and animals. Plant TLPs are classified as pathogenesis-related (PR) protein family 5 (PR5), their expression is induced by environmental stresses such as pathogen/pest attack, drought and cold. TLPs included in this family are such proteins as zeamatin, found in high concentrations in cereal seeds; osmotin, a salt-induced protein in osmotically stressed plants; and PpAZ44, a propylene-induced TLP in abscission of young fruit. Several members of the plant TLP family have been reported as food allergens from fruits (i.e., cherry, Pru av 2; bell pepper, Cap a1; tomatoes, Lyc e NP24) and pollen allergens from conifers (i.e., mountain cedar, Jun a 3; Arizona cypress, Cup a3; Japanese cedar, Cry j3). Thaumatin and TLPs are three-domain, crescent-fold structures with either an electronegative, electropositive, or neutral cleft occurring between domains I and II. It has been proposed that the antifungal activity of plant PR5 proteins relies on the strong electronegative character of this cleft. Some TLPs hydrolyze the beta-1,3-glucans of the type commonly found in fungal walls. Most TLPs contain 16 conserved Cys residues. A deletion within the third domain (domain II) of the Triticum aestivum thaumatin-like xylanase inhibitor is observed, thus, only 10 conserved Cys residues are present within this smaller TLP and similar homologs.	157
185755	cd09216	GH64-LPHase-like	glycoside hydrolase family 64: laminaripentaose-producing, beta-1,3-glucanase (LPHase)-like. This subfamily is represented by the laminaripentaose-producing, beta-1,3-glucanase (LPHase) of Streptomyces matensis and related bacterial and ascomycete proteins. LPHase is a member of glycoside hydrolase family 64 (GH64); it is an inverting enzyme involved in the cleavage of long-chain polysaccharide beta-1,3-glucans into specific pentasaccharide oligomers. LPHase is a two-domain crescent fold structure: one domain is composed of 10 beta-strands, eight coming from the N-terminus of the protein and two from the C-terminal region, and the protein has a second inserted domain; this cd includes both domains. This protein has an electronegative, substrate-binding cleft, and conserved Glu and Asp residues involved in the cleavage of the beta-1,3-glucan, laminarin, a plant and fungal cell wall component. Among bacteria, many beta-1,3-glucanases are implicated in fungal cell wall degradation. Also included in this family is GluB, the beta-1,3-glucanase B from Lysobacter enzymogenes Strain N4-7. Recombinant GluB demonstrated higher relative activity toward the branched-chain beta-1,3 glucan substrate zymosan A than toward linear beta-1,3 glucan substrates. Sometimes these two domains are found associated with other domains such as in the Catenulispora acidiphila DSM 44928 carbohydrate binding family 6 protein in which they are positioned N-terminal of a carbohydrate binding module, family 6 (CBM_6) domain. In the Cellulosimicrobium cellulans glucan endo-1,3-beta-glucosidase, they are positioned N-terminal of a RICIN carbohydrate-binding domain.	353
185756	cd09217	TLP-P	thaumatin and allergenic/antifungal thaumatin-like proteins: plant homologs. This subfamily is represented by the sweet-tasting protein thaumatin from the African berry Thaumatococcus daniellii, allergenic/antifungal Thaumatin-like proteins (TLPs), and related plant proteins. TLPs are involved in host defense and a wide range of developmental processes in fungi, plants, and animals. Plant TLPs are classified as pathogenesis-related (PR) protein family 5 (PR5), their expression is induced by environmental stresses such as pathogen/pest attack, drought and cold. TLPs in this subfamily include such proteins as zeamatin, found in high concentrations in cereal seeds, and osmotin, a salt-induced protein in osmotically stressed plants. Several members of the plant TLP family have been reported as food allergens from fruits (i.e., cherry, Pru av 2; bell pepper, Cap a1; tomatoes, Lyc e NP24) and pollen allergens from conifers (i.e., mountain cedar, Jun a 3; Arizona cypress, Cup a3; Japanese cedar, Cry j3). Thaumatin and TLPs are three-domain, crescent-fold structures with either an electronegative, electropositive, or neutral cleft occurring between domains I and II. It has been proposed that the antifungal activity of plant PR5 proteins relies on the strong electronegative character of this cleft. IgE-binding epitopes of mountain Cedar (Juniperus ashei) allergen Jun a 3, which interact with pooled IgE from patients suffering allergenic response to this allergen, were mainly located on the helical domain II; the best-conserved IgE-binding epitope predicted for TLPs corresponds to this region. Some TLPs hydrolyze the beta-1,3-glucans of the type commonly found in fungal walls. Most TLPs contain 16 conserved Cys residues. A deletion within the third domain (domain II) of the Triticum aestivum thaumatin-like xylanase inhibitor is observed, thus, only 10 conserved Cys residues are present within this smaller TLP and similar homologs.	151
185757	cd09218	TLP-PA	allergenic/antifungal thaumatin-like proteins: plant and animal homologs. This subfamily is represented by the thaumatin-like proteins (TLPs), Cherry Allergen Pru Av 2 TLP, Peach PpAZ44 TLP (a propylene-induced TLP in abscission), the Caenorhabditis elegans thaumatin family member (thn-6), and other plant and animal homologs. TLPs are involved in host defense and a wide range of developmental processes in fungi, plants, and animals. Due to their inducible expression by environmental stresses such as pathogen/pest attack, drought and cold, plant TLPs are classified as the pathogenesis-related (PR) protein family 5 (PR5). Several members of the plant TLP family have been reported as food allergens from fruits (i.e., cherry, Pru av 2; bell pepper, Cap a1; tomatoes, Lyc e NP24) and pollen allergens from conifers (i.e., mountain cedar, Jun a 3; Arizona cypress, Cup a3; Japanese cedar, Cry j3). TLPs are three-domain, crescent-fold structures with either an electronegative, electropositive, or neutral cleft occurring between domains I and II. It has been proposed that the antifungal activity of plant PR5 proteins relies on the strong electronegative character of this cleft. Some TLPs hydrolyze the beta-1,3-glucans of the type commonly found in fungal walls. TLPs within this subfamily contain 16 conserved Cys residues.	219
185758	cd09219	TLP-F	thaumatin-like proteins: basidiomycete homologs. This subfamily is represented by Lentinula edodes TLG1, a thaumatin-like protein (TLP), as well as other basidiomycete homologs. In general, TLPs are involved in host defense and a wide range of developmental processes in fungi, plants, and animals. TLG1 TLP is involved in lentinan degradation and fruiting body senescence. TLG1 expressed in Escherichia coli and Aspergillus oryzae exhibited beta-1,3-glucanase activity and demonstrated lentinan degrading activity. TLG1 is proposed to be involved in lentinan and cell wall degradation during senescence following harvest and spore diffusion. TLPs are three-domain, crescent-fold structures with either an electronegative, electropositive, or neutral cleft occurring between domains I and II. TLG1 from Lentinula edodes contains the required acidic amino acids conserved in the appropriate positions to possess an electronegative cleft. TLPs within this subfamily contain 13 conserved Cys residues; the number of total Cys residues in these TLPs varies from 16 in L. edodes TLG1 to 18 in other basidiomycete homologs.	229
185759	cd09220	GH64-GluB-like	glycoside hydrolase family 64: beta-1,3-glucanase B (GluB)-like. This subfamily is represented by GluB, beta-1,3-glucanase B, from Lysobacter enzymogenes Strain N4-7 and related bacterial and ascomycete proteins. GluB is a member of the glycoside hydrolase family 64 (GH64) involved in the cleavage of long-chain polysaccharide beta-1,3-glucans into specific pentasaccharide oligomers. Among bacteria, many beta-1,3-glucanases are implicated in fungal cell wall degradation. GluB possesses the conserved Glu and Asp residues required to cleave substrate beta-1,3-glucans. Recombinant GluB demonstrated higher relative activity toward the branched-chain beta-1,3 glucan substrate zymosan A than toward linear beta-1,3 glucan substrates. Based on the structure of laminaripentaose-producing, beta-1,3-glucanase (LPHase) of Streptomyces matensis, which belongs to the same family as GluB but to a different subfamily, this cd is a two-domain model. Sometimes these two domains are found associated with other domains such as in the Catenulispora acidiphila DSM 44928 carbohydrate binding family 6 protein in which they are positioned N-terminal of a carbohydrate binding module, family 6 (CBM_6) domain.	369
187745	cd09223	Photo_RC	D1, D2 subunits of photosystem II  (PSII); M, L subunits of bacterial photosynthetic reaction center. This protein superfamily contains the D1, D2 subunits of the photosystem II (PS II) and the M, L subunits of the bacterial photosynthetic reaction center (RC). These four proteins are highly homologous and share a common fold. PS II is a multi-subunit protein found in the photosynthetic membranes of plants, algae, and cyanobacteria.  It utilizes light-induced electron transfer and water-splitting reactions to produce protons, electrons, and molecular oxygen. The protons generated are instrumental in ATP formation.  Bacterial photosynthetic reaction center (RC) complex is found in photosynthetic bacteria, such as purple bacteria and other proteobacteria species. It couples light-induced electron transfer to proton pumping across the membrane by reactions of a quinone molecule (QB) that binds two electrons and two protons at the active site. Protons are translocated from the bacterial cytoplasm to the periplasmic space, generating an electrochemical gradient of protons (the protonmotive force) that can be used to power reactions such as the synthesis of ATP.	199
198423	cd09224	CytoC_RC	Cytochrome C subunit of the bacterial photosynthetic reaction center. Photosynthesis in purple bacteria is dependent on light-induced electron transfer in the reaction center (RC), coupled to the uptake of protons from the cytoplasm. The RC contains a cytochrome molecule which re-reduces the oxidized electron donor. The electron transfer reactions of photosynthesis are performed by the following three components: the photosynthetic reaction center (RC), the cytochrome, and the soluble electron carrier protein. Firstly, the RC promotes the light-induced charge separation across the plasma membrane, which results in the oxidation of  a pair of light-harvesting complexes, LH1 and LH2, and the reduction of quinone to quinol. The quinol then leaves the RC and moves to the cytochrome complex through the quinone pool of the plasma membrane. Secondly, the cytochrome complex reoxidizes the quinol to quinone, and the released electrons are transferred to soluble electron carriers. Third, the soluble electron carriers transport the electrons to the RC through the periplasmic space. Finally, the photo-oxidized light-harvesting complex is reduced by the soluble electron carriers, and the RC comes back to the initial state. In the course of the oxidation and reduction of the quinones, a transmembrane electrochemical gradient of protons is formed, and its energy is used to produce ATP by the ATP synthase complex.	300
185717	cd09232	Snurportin-1_C	C-terminal m3G cap-binding domain of nuclear import adaptor snurportin-1. Snurportin-1 (SPN1 or SNUPN) is a nuclear import adaptor for m3G-capped spliceosomal U small nucleoproteins (snRNPs), which are assembled in the cytoplasm. After capping and assembly, the U snRNPs are transported into the nucleus by SPN1 and importin beta; SPN1 is then returned to the cytoplasm by exportin 1 (CRM1), which also transports the non-capped U snRNPs. The U snRNPs are essential elements of the spliceosome, which catalyzes the excision of introns and the ligation of exons to form a mature mRNA. SPN1 contains two domains, an N-terminal importin beta-binding (IBB) domain and a C-terminal m3G cap-binding domain.	186
187750	cd09233	ACE1-Sec16-like	Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16. COPII coat complex plays an important role in vesicular traffic of newly synthesized proteins from the endoplasmic reticulum (ER) to the Golgi apparatus by mediating the formation of transport vesicles. COPII consists of an outer coat, made up of the scaffold proteins Sec31 and Sec13, and the cargo adaptor complex, Sec23 and Sec24, which are recruited by the small GTPase Sar1. Sec16 is involved in the early steps of the assembly process. Sec16 forms elongated heterotetramers with Sec13, Sec13-(Sec16)2-Sec13. It interacts with Sec13 by insertion of a single beta-blade to close the six-bladed beta propeller of Sec13. In the same way Sec13 interacts with Sec31 and Nup145C, a nuclear pore protein; all of these contain a structurally related ancestral coatomer element 1 (ACE1). Sec16 is believed to be a key component in maintaining the integrity of the ER exit site.	314
185747	cd09234	V_HD-PTP_like	Protein-interacting V-domain of mammalian His-Domain type N23 protein tyrosine phosphatase and related domains. This family contains the V-shaped (V) domain of mammalian His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23) and related domains. It belongs to the V_Alix_like superfamily which includes the V domains of Bro1 and Rim20 (also known as PalA) from Saccharomyces cerevisiae, mammalian Alix (apoptosis-linked gene-2 interacting protein X, also known as apoptosis-linked gene-2 interacting protein 1, AIP1), and related domains. HD-PTP interacts with the ESCRT (Endosomal Sorting Complexes Required for Transport) system, and participates in cell migration and endosomal trafficking. The related Alix V-domain (belonging to a different family in this superfamily) contains a binding site, partially conserved in the superfamily, for the retroviral late assembly (L) domain YPXnL motif. The Alix V-domain is also a dimerization domain. In addition to the V-domain, HD-PTP also has an N-terminal Bro1-like domain, a proline-rich region (PRR), a catalytically inactive tyrosine phosphatase domain, and a region containing a PEST motif. Bro1-like domains bind components of the ESCRT-III complex, specifically CHMP4 in the case of HD-PTP. The Bro1-like domain of HD-PTP can also bind human immunodeficiency virus type 1 (HIV-1) nucleocapsid. HD-PTP is encoded by the PTPN23 gene, a tumor suppressor gene candidate frequently absent in human kidney, breast, lung, and cervical tumors. This family also contains Drosophila Myopic, which promotes epidermal growth factor receptor (EGFR) signaling, and Caenorhabditis elegans (enhancer of glp-1) EGO-2 which promotes Notch signaling.	337
185748	cd09235	V_Alix	Middle V-domain of mammalian Alix and related domains are dimerization and protein interaction modules. This family contains the middle V-shaped (V) domain of mammalian Alix (apoptosis-linked gene-2 interacting protein X) and related domains. It belongs to the V_Alix_like superfamily which includes the V-domains of Bro1 and Rim20 (also known as PalA) from Saccharomyces cerevisiae, mammalian His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), and related domains. Alix, also known as apoptosis-linked gene-2 interacting protein 1 (AIP1), is part of the ESCRT (Endosomal Sorting Complexes Required for Transport) system, and participates in membrane remodeling processes, including the budding of enveloped viruses, vesicle budding inside late endosomal multivesicular bodies (MVBs), the abscission reactions of mammalian cell division, and in apoptosis. The Alix V-domain is a dimerization domain, and contains a binding site, partially conserved in the V_Alix_like superfamily, for the retroviral late assembly (L) domain YPXnL motif. In addition to the V-domain, Alix also has an N-terminal Bro1-like domain, which binds components of the ESCRT-III complex, in particular CHMP4. The Bro1-like domain of Alix can also bind to human immunodeficiency virus type 1 (HIV-1) nucleocapsid. Alix also has a C-terminal proline-rich region (PRR) that binds multiple partners including Tsg101 (tumor susceptibility gene 101, a component of ESCRT-1), and the apoptotic protein ALG-2.	339
185749	cd09236	V_AnPalA_UmRIM20_like	Protein-interacting V-domains of Aspergillus nidulans PalA/RIM20, Ustilago maydis RIM20, and related proteins. This family belongs to the V_Alix_like superfamily which includes the V-shaped (V) domains of Bro1 and Rim20 from Saccharomyces cerevisiae, mammalian Alix (apoptosis-linked gene-2 interacting protein X), His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), and related domains. Aspergillus nidulans PalA/RIM20 and Ustilago maydis RIM20, like Saccharomyces cerevisiae Rim20, participate in the response to the external pH via the Pal/Rim101 pathway; however, Saccharomyces cerevisiae Rim20 does not belong to this family. This pathway is a signaling cascade resulting in the activation of the transcription factor PacC/Rim101. The mammalian Alix V-domain (belonging to a different family) contains a binding site, partially conserved in the superfamily, for the retroviral late assembly (L) domain YPXnL motif. Aspergillus nidulans PalA binds a nonviral YPXnL motif (tandem YPXL/I motifs within PacC). The Alix V-domain is also a dimerization domain. In addition to this V-domain, members of the V_Alix_like superfamily also have an N-terminal Bro1-like domain, which has been shown to bind CHMP4/Snf7, a component of the ESCRT-III complex.	353
185750	cd09237	V_ScBro1_like	Protein-interacting V-domain of Saccharomyces cerevisiae Bro1 and related domains. This family contains the V-shaped (V) domain of Saccharomyces cerevisiae Bro1, and related domains. It belongs to the V_Alix_like superfamily which also includes the V-domain of Saccharomyces cerevisiae Rim20 (also known as PalA), mammalian Alix (apoptosis-linked gene-2 interacting protein X), His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), and related domains. Bro1 interacts with the ESCRT (Endosomal Sorting Complexes Required for Transport) system, and participates in endosomal trafficking. The mammalian Alix V-domain (belonging to a different family) contains a binding site, partially conserved in the superfamily, for the retroviral late assembly (L) domain YPXnL motif. The Alix V-domain is also a dimerization domain. Bro1 also has an N-terminal Bro1-like domain, which binds Snf7, a component of the ESCRT-III complex, and a C-terminal proline-rich region (PRR). The C-terminal portion (V-domain and PRR) of S. cerevisiae Bro1 interacts with Doa4, a ubiquitin thiolesterase needed to remove ubiquitin from MVB cargoes. It interacts with a YPxL motif in the Doa4 catalytic domain to stimulate its deubiquitination activity.	356
185751	cd09238	V_Alix_like_1	Protein-interacting V-domain of an uncharacterized family of the V_Alix_like superfamily. This domain family is comprised of uncharacterized plant proteins. It belongs to the V_Alix_like superfamily which includes the V-shaped (V) domains of Bro1 and Rim20 (also known as PalA) from Saccharomyces cerevisiae, mammalian Alix (apoptosis-linked gene-2 interacting protein X), (His-Domain) type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), and related domains. Alix, also known as apoptosis-linked gene-2 interacting protein 1 (AIP1), participates in membrane remodeling processes during the budding of enveloped viruses, vesicle budding inside late endosomal multivesicular bodies (MVBs), and the abscission reactions of mammalian cell division. It also functions in apoptosis. HD-PTP functions in cell migration and endosomal trafficking, Bro1 in endosomal trafficking, and Rim20 in the response to the external pH via the Rim101 pathway. Alix, HD-PTP, Bro1, and Rim20 all interact with the ESCRT (Endosomal Sorting Complexes Required for Transport) system. The mammalian Alix V-domain (belonging to a different family) contains a binding site, partially conserved in the superfamily, for the retroviral late assembly (L) domain YPXnL motif. The Alix V-domain is also a dimerization domain. In addition to this V-domain, members of the V_Alix_Rim20_Bro1_like superfamily also have an N-terminal Bro1-like domain, which binds components of the ESCRT-III complex. The Bro1-like domains of Alix and HD-PTP can also bind to human immunodeficiency virus type 1 (HIV-1) nucleocapsid. Many members of the V_Alix_like superfamily also have a proline-rich region (PRR).	339
185762	cd09239	BRO1_HD-PTP_like	Protein-interacting, N-terminal, Bro1-like domain of mammalian His-Domain type N23 protein tyrosine phosphatase and related domains. This family contains the N-terminal, Bro1-like domain of mammalian His-Domain type N23 protein tyrosine phosphatase (HD-PTP) and related domains. It belongs to the BRO1_Alix_like superfamily which also includes the Bro1-like domains of mammalian Alix (apoptosis-linked gene-2 interacting protein X), RhoA-binding proteins Rhophilin-1 and -2, Brox, Bro1 and Rim20 (also known as PalA) from Saccharomyces cerevisiae, Ustilago maydis Rim23 (also known as PalC), and related domains. Alix, also known as apoptosis-linked gene-2 interacting protein 1 (AIP1), HD-PTP, Brox, Bro1, Rim20, and Rim23, interact with the ESCRT (Endosomal Sorting Complexes Required for Transport) system. HD-PTP participates in cell migration and endosomal trafficking. Bro1-like domains are boomerang-shaped, and part of the domain is a tetratricopeptide repeat (TPR)-like structure. Bro1-like domains bind components of the ESCRT-III complex: CHMP4 in the case of HD-PTP. The Bro1-like domain of HD-PTP can also bind human immunodeficiency virus type 1 (HIV-1) nucleocapsid. HD-PTP, and some other members of the BRO1_Alix_like  superfamily including Alix, also have a V-shaped (V) domain. In the case of Alix, the V-domain contains a binding site for the retroviral late assembly (L) domain YPXnL motif, which is partially conserved in the V-domain superfamily. HD-PTP is encoded by the PTPN23 gene, a tumor suppressor gene candidate frequently absent in human kidney, breast, lung, and cervical tumors. This family also contains Drosophila Myopic which promotes epidermal growth factor receptor (EGFR) signaling, and Caenorhabditis elegans (enhancer of glp-1) EGO-2 which promotes Notch signaling.	361
185763	cd09240	BRO1_Alix	Protein-interacting, N-terminal, Bro1-like domain of mammalian Alix and related domains. This family contains the N-terminal, Bro1-like domain of mammalian Alix (apoptosis-linked gene-2 interacting protein X), also called apoptosis-linked gene-2 interacting protein 1 (AIP1). It belongs to the BRO1_Alix_like superfamily which also includes the Bro1-like domains of His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), RhoA-binding proteins Rhophilin-1 and -2, Brox, Bro1 and Rim20 (also known as PalA) from Saccharomyces cerevisiae, Ustilago maydis Rim23 (also known as PalC), and related domains. Alix, HD-PTP, Brox, Bro1, Rim20, and Rim23, interact with the ESCRT (Endosomal Sorting Complexes Required for Transport) system. Alix participates in membrane remodeling processes during the budding of enveloped viruses, vesicle budding inside late endosomal multivesicular bodies (MVBs), and the abscission reactions of mammalian cell division. It also functions in apoptosis. Bro1-like domains are boomerang-shaped, and part of the domain is a tetratricopeptide repeat (TPR)-like structure. Bro1-like domains bind components of the ESCRT-III complex: CHMP4, in the case of Alix. The Alix Bro1-like domain can also bind human immunodeficiency virus type 1 (HIV-1) nucleocapsid and Rab5-specific GAP (RabGAP5, also known as Rab-GAPLP). In addition to this Bro1-like domain, Alix has a middle V-shaped (V) domain. The Alix V-domain is a dimerization domain, and carries a binding site for the retroviral late assembly (L) domain YPXnL motif, which is partially conserved in the superfamily. Alix also has a C-terminal proline-rich region (PRR) that binds multiple partners including Tsg101 (tumor susceptibility gene 101, a component of ESCRT-I) and the apoptotic protein ALG-2.	346
185764	cd09241	BRO1_ScRim20-like	Protein-interacting, N-terminal, Bro1-like domain of Saccharomyces cerevisiae Rim20 and related proteins. This family contains the N-terminal, Bro1-like domain of Saccharomyces cerevisiae Rim20 (also known as PalA) and related proteins. It belongs to the BRO1_Alix_like superfamily which also includes the Bro1-like domains of mammalian Alix (apoptosis-linked gene-2 interacting protein X), His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), RhoA-binding proteins Rhophilin-1 and -2, Brox, Saccharomyces cerevisiae Bro1, Ustilago maydis Rim23 (also known as PalC), and related domains. Alix, HD-PTP, Brox, Bro1, Rim20, and Rim23, interact with the ESCRT (Endosomal Sorting Complexes Required for Transport) system. Rim20 and Rim23 participate in the response to the external pH via the Rim101 pathway. Bro1-like domains are boomerang-shaped, and part of the domain is a tetratricopeptide repeat (TPR)-like structure. Bro1-like domains bind components of the ESCRT-III complex: Snf7 in the case of Rim20. Rim20, and some other members of the BRO1_Alix_like superfamily including Alix, also have a V-shaped (V) domain. In the case of Alix, the V-domain is a dimerization domain that also contains a binding site for the retroviral late assembly (L) domain YPXnL motif, which is partially conserved in the V-domain superfamily. Rim20 localizes to endosomes under alkaline pH conditions. By binding Snf7, it may bring the protease Rim13 into proximity with Rim101 (a YPxL-containing transcription factor), and thus aid in the proteolytic activation of the latter. Rim20 and other intermediates in the Rim101 pathway play roles in the pathogenesis of fungal corneal infection during Candida albicans keratitis.	355
185765	cd09242	BRO1_ScBro1_like	Protein-interacting, N-terminal, Bro1-like domain of Saccharomyces cerevisiae Bro1 and related proteins. This family contains the N-terminal, Bro1-like domain of Saccharomyces cerevisiae Bro1 and related proteins. It belongs to the BRO1_Alix_like superfamily which also includes the Bro1-like domains of mammalian Alix (apoptosis-linked gene-2 interacting protein X), His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), RhoA-binding proteins Rhophilin-1 and -2, Brox, Saccharomyces cerevisiae Rim20 (also known as PalA), Ustilago maydis Rim23 (also known as PalC), and related domains. Alix, HD-PTP, Brox, Bro1, Rim20, and Rim23, interact with the ESCRT (Endosomal Sorting Complexes Required for Transport) system. Bro1 participates in endosomal trafficking. Bro1-like domains are boomerang-shaped, and part of the domain is a tetratricopeptide repeat (TPR)-like structure. Bro1-like domains bind components of the ESCRT-III complex: Snf7 in the case of Bro1. Snf7 binds to a conserved hydrophobic patch on the middle of the concave side of the Bro1 domain. Rim20, and some other members of the BRO1_Alix_like superfamily including Alix, also have a V-shaped (V) domain. In the case of Alix, the V-domain contains a binding site for the retroviral late assembly (L) domain YPXnL motif, which is partially conserved in the superfamily. The Alix V-domain is also a dimerization domain. The C-terminal portion (V-domain and proline-rich region) of Bro1 interacts with Doa4, a protease that deubiquitinates integral membrane proteins sorted into the lumenal vesicles of late-endosomal multivesicular bodies. It interacts with a YPxL motif in the Doa4 catalytic domain to stimulate its deubiquitination activity.	348
185766	cd09243	BRO1_Brox_like	Protein-interacting Bro1-like domain of human Brox1 and related proteins. This family contains the Bro1-like domain of a single-domain protein, human Brox, and related domains. It belongs to the BRO1_Alix_like superfamily which also includes the Bro1-like domains of mammalian Alix (apoptosis-linked gene-2 interacting protein X), His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), RhoA-binding proteins Rhophilin-1 and -2, Bro1 and Rim20 (also known as PalA) from Saccharomyces cerevisiae, Ustilago maydis Rim23 (also known as PalC), and related domains. Alix, HD-PTP, Brox, Bro1, Rim20, and Rim23, interact with the ESCRT (Endosomal Sorting Complexes Required for Transport) system. Bro1-like domains are boomerang-shaped, and part of the domain is a tetratricopeptide repeat (TPR)-like structure. Bro1-like domains bind components of the ESCRT-III complex: CHMP4 in the case of Brox. Human Brox can bind to human immunodeficiency virus type 1 (HIV-1) nucleocapsid. In addition to a Bro1-like domain, Brox also has a C-terminal thioester-linkage site for isoprenoid lipids (CaaX motif). This family lacks the V-shaped (V) domain found in many members of the BRO1_Alix_like superfamily.	353
185767	cd09244	BRO1_Rhophilin	Protein-interacting Bro1-like domain of RhoA-binding protein Rhophilin and related domains. This family contains the Bro1-like domain of RhoA-binding proteins, Rhophilin-1 and -2, and related domains. It belongs to the BRO1_Alix_like superfamily which also includes the Bro1-like domains of mammalian Alix (apoptosis-linked gene-2 interacting protein X), His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), Brox, Bro1 and Rim20 (also known as PalA) from Saccharomyces cerevisiae, Ustilago maydis Rim23 (also known as PalC), and related domains. Rhophilin-1 and -2 bind both GDP- and GTP-bound RhoA. Bro1-like domains are boomerang-shaped, and part of the domain is a tetratricopeptide repeat (TPR)-like structure. In addition to this Bro1-like domain, Rhophilin-1 and -2 contain an N-terminal Rho-binding domain and a C-terminal PDZ (PSD-95, Disc-large, ZO-1) domain. Their PDZ domains have limited homology. Rhophilin-1 and -2 have different activities. The Drosophila knockout of Rhophilin-1 is embryonic lethal, suggesting an essential role in embryonic development. Roles of Rhophilin-2 may include limiting stress fiber formation or increasing the turnover of F-actin in the absence of high levels of RhoA signaling activity. The isolated Bro1-like domain of Rhophilin-1 binds human immunodeficiency virus type 1 (HIV-1) nucleocapsid. This family lacks the V-shaped (V) domain found in many members of the BRO1_Alix_like superfamily.	350
185768	cd09245	BRO1_UmRIM23-like	Protein-interacting, Bro1-like domain of Ustilago maydis Rim23 (PalC), and related domains. This family contains the Bro1-like domain of Ustilago maydis Rim23 (also known as PalC), and related proteins. It belongs to the BRO1_Alix_like superfamily which includes the Bro1-like domains of mammalian Alix (apoptosis-linked gene-2 interacting protein X), His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), RhoA-binding proteins Rhophilin-1 and -2, Brox, Bro1 and Rim20 (also known as PalA) from Saccharomyces cerevisiae, and related domains. Alix, HD-PTP, Brox, Bro1, Rim20, and Rim23 interact with the ESCRT (Endosomal Sorting Complexes Required for Transport) system. Rim20 and Rim23 participate in the response to the external pH via the Rim101 pathway. Through its Bro1-like domain, Rim23 allows the interaction between the endosomal and plasma membrane complexes. Bro1-like domains are boomerang-shaped, and part of the domain is a tetratricopeptide repeat (TPR)-like structure. Intermediates in the Rim101 pathway may play roles in the pathogenesis of fungal corneal infection during Candida albicans keratitis. This family lacks the V-shaped (V) domain found in many members of the BRO1_Alix_like superfamily.	413
185769	cd09246	BRO1_Alix_like_1	Protein-interacting, N-terminal, Bro1-like domain of an Uncharacterized family of the BRO1_Alix_like superfamily. This domain family is comprised of uncharacterized proteins. It belongs to the BRO1_Alix_like superfamily which includes the Bro1-like domains of mammalian Alix (apoptosis-linked gene-2 interacting protein X), His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), RhoA-binding proteins Rhophilin-1 and -2, Brox, Bro1 and Rim20 (also known as PalA) from Saccharomyces cerevisiae, Ustilago maydis Rim23 (also known as PalC), and related domains. Alix, HD-PTP, Brox, Bro1, Rim20 and Rim23 interact with the ESCRT (Endosomal Sorting Complexes Required for Transport) system. Alix participates in membrane remodeling processes during the budding of enveloped viruses, vesicle budding inside late endosomal multivesicular bodies (MVBs), and the abscission reactions of mammalian cell division. It also functions in apoptosis. HD-PTP and Bro1 function in endosomal trafficking, with HD-PTP having additional functions in cell migration. Rim20 and Rim23 play roles in the response to the external pH via the Rim101 pathway. Bro1-like domains are boomerang-shaped, and part of the domain is a tetratricopeptide repeat (TPR)-like structure. Bro1-like domains bind components of the ESCRT-III complex: CHMP4 (in the case of Alix, Brox and HD-PTP) and Snf7 (in the case of yeast Bro1 and Rim20). The Bro1-like domains of Alix, HD-PTP, Brox, and Rhophilin can bind human immunodeficiency virus type 1 (HIV-1) nucleocapsid. In addition to this Bro1-like domain, Alix, Bro1, Rim20, HD-PTP, and proteins belonging to this uncharacterized family, also have a V-shaped (V) domain. The Alix V-domain is a dimerization domain, and contains a binding site for the retroviral late assembly (L) domain YPXnL motif, which is partially conserved in the BRO1_Alix_like superfamily. Many members of this superfamily also have a proline-rich region (PRR), a protein interaction domain.	353
185770	cd09247	BRO1_Alix_like_2	Protein-interacting Bro1-like domain of an Uncharacterized family of the BRO1_Alix_like superfamily. This domain family is comprised of uncharacterized proteins. It belongs to the BRO1_Alix_like superfamily which includes the Bro1-like domains of mammalian Alix (apoptosis-linked gene-2 interacting protein X), His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), RhoA-binding proteins Rhophilin-1 and -2, Brox, Bro1 and Rim20 (also known as PalA) from Saccharomyces cerevisiae, Ustilago maydis Rim23 (also known as PalC), and related domains. Alix, HD-PTP, Brox, Bro1, Rim20 and Rim23 interact with the ESCRT (Endosomal Sorting Complexes Required for Transport) system. Alix participates in membrane remodeling processes during the budding of enveloped viruses, vesicle budding inside late endosomal multivesicular bodies (MVBs), and the abscission reactions of mammalian cell division. It also functions in apoptosis. HD-PTP and Bro1 function in endosomal trafficking, with HD-PTP having additional functions in cell migration. Rim20 and Rim23 play roles in the response to the external pH via the Rim101 pathway. Bro1-like domains are boomerang-shaped, and part of the domain is a tetratricopeptide repeat (TPR)-like structure. These domains bind components of the ESCRT-III complex: CHMP4 (in the case of Alix, Brox and HD-PTP) and Snf7 (in the case of yeast Bro1 and Rim20). The Bro1-like domains of Alix, HD-PTP, Brox, and Rhophilin can bind human immunodeficiency virus type 1 (HIV-1) nucleocapsid. This family lacks the V-shaped (V) domain found in many members of the BRO1_Alix_like superfamily.	346
185771	cd09248	BRO1_Rhophilin_1	Protein-interacting Bro1-like domain of RhoA-binding protein Rhophilin-1. This subfamily contains the Bro1-like domain of the RhoA-binding protein, Rhophilin-1. It belongs to the BRO1_Alix_like superfamily which also includes the Bro1-like domains of mammalian Alix (apoptosis-linked gene-2 interacting protein X), His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), RhoA-binding protein Rhophilin-2, Brox, Bro1 and Rim20 (also known as PalA) from Saccharomyces cerevisiae, Ustilago maydis Rim23 (also known as PalC), and related domains. Rhophilin-1 binds both GDP- and GTP-bound RhoA. Bro1-like domains are boomerang-shaped, and part of the domain is a tetratricopeptide repeat (TPR)-like structure. In addition to this Bro1-like domain, Rhophilin-1 contains an N-terminal Rho-binding domain and a C-terminal PDZ (PSD-95, Disc-large, ZO-1) domain. The Drosophila knockout of Rhophilin-1 is embryonic lethal, suggesting an essential role in embryonic development. The isolated Bro1-like domain of Rhophilin-1 binds human immunodeficiency virus type 1 (HIV-1) nucleocapsid. Rhophilin-1 lacks the V-shaped (V) domain found in many members of the BRO1_Alix_like superfamily.	384
185772	cd09249	BRO1_Rhophilin_2	Protein-interacting Bro1-like domain of RhoA-binding protein Rhophilin-2. This subfamily contains the Bro1-like domain of RhoA-binding protein, Rhophilin-2. It belongs to the BRO1_Alix_like superfamily which also includes the Bro1-like domain of mammalian Alix (apoptosis-linked gene-2 interacting protein X), His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), RhoA-binding protein Rhophilin-1, Brox, Bro1 and Rim20 (also known as PalA) from Saccharomyces cerevisiae, Ustilago maydis Rim23 (also known as PalC), and related domains. Rhophilin-2 binds both GDP- and GTP-bound RhoA. Bro1-like domains are boomerang-shaped, and part of the domain is a tetratricopeptide repeat (TPR)-like structure. In addition to this Bro1-like domain, Rhophilin-2 contains an N-terminal Rho-binding domain and a C-terminal PDZ (PSD-95, Disc-large, ZO-1) domain. Roles for Rhophilin-2 may include limiting stress fiber formation or increasing the turnover of F-actin in the absence of high levels of RhoA signaling activity. Rhophilin-2 lacks the V-shaped (V) domain found in many members of the BRO1_Alix_like superfamily.	385
271158	cd09250	AP-1_Mu1_Cterm	C-terminal domain of medium Mu1 subunit in clathrin-associated adaptor protein (AP) complex AP-1. AP complexes participate in the formation of intracellular coated transport vesicles and select cargo molecules for incorporation into the coated vesicles in the late secretory and endocytic pathways. There are four AP complexes, AP-1, AP-2, AP-3, and AP-4, described in various eukaryotic organisms. Each AP complex consists of four subunits: two large chains (one each of gamma/alpha/delta/epsilon and beta1-4, respectively), a medium mu chain (mu1-4), and a small sigma chain (sigma1-4). Each of the four subunits from the different AP complexes exhibits similarity with each other. This family corresponds to the C-terminal domain of heterotetrameric clathrin-associated adaptor protein complex 1 (AP-1) medium mu1 subunit, which includes two closely related homologs, mu1A (encoded by ap1m1) and mu1B (encoded by ap1m2). Mu1A is ubiquitously expressed, but mu1B is expressed exclusively in polarized epithelial cells. AP-1 has been implicated in bi-directional transport between the trans-Golgi network (TGN) and endosomes. It plays an essential role in the formation of clathrin-coated vesicles (CCVs) from the trans-Golgi network (TGN). Epithelial cell-specific AP-1 is also involved in sorting to the basolateral surface of polarized epithelial cells. Recruitment of AP-1 to the TGN membrane is regulated by a small GTPase, ADP-ribosylation factor 1 (ARF1). Phosphorylation/dephosphorylation events can also regulate the function of AP-1. The membrane-anchored cargo molecules can be linked to the outer lattice of CCVs by AP-1. Those cargo molecules interact with adaptors through short sorting signals in their cytosolic segments. Tyrosine-based endocytotic signals are one of the most important sorting signals. They are of the form Y-X-X-Phi, where Y is tyrosine, X is any amino acid and Phi is a bulky hydrophobic residue that can be Leu, Ile, Met, Phe, or Val. These kinds of sorting signals can be recognized by the C-terminal domain of AP-1 mu1 subunit, also known as Y-X-X-Phi signal-binding domain that contains two hydrophobic pockets, one for the tyrosine-binding and one for the bulky hydrophobic residue-binding.	272
271159	cd09251	AP-2_Mu2_Cterm	C-terminal domain of medium Mu2 subunit in ubiquitously expressed clathrin-associated adaptor protein (AP) complex AP-2. AP complexes participate in the formation of intracellular coated transport vesicles and select cargo molecules for incorporation into the coated vesicles in the late secretory and endocytic pathways. There are four AP complexes, AP-1, -2, -3, and -4, described in various eukaryotic organisms. Each AP complex consists of four subunits: two large chains (one each of gamma/alpha/delta/epsilon and beta1-4, respectively), a medium mu chain (mu1-4), and a small sigma chain (sigma1-4). Each of the four subunits from the different AP complexes exhibits similarity with each other. This family corresponds to the C-terminal domain of heterotetrameric clathrin-associated adaptor protein complex 2 (AP-2) medium mu2 subunit. Mu2 is ubiquitously expressed in mammals. In higher eukaryotes, AP-2 plays a critical role in clathrin-mediated endocytosis from the plasma membrane in different cells. The membrane-anchored cargo molecules can be linked to the outer lattice of CCVs by AP-2. Those cargo molecules interact with adaptors through short sorting signals in their cytosolic segments. Tyrosine-based endocytotic signals are one of the most important sorting signals. They are of the form Y-X-X-Phi, where Y is tyrosine, X is any amino acid and Phi is a bulky hydrophobic residue that can be Leu, Ile, Met, Phe, or Val. These kinds of sorting signals can be recognized by the C-terminal domain of AP-2 mu2 subunit, also known as Y-X-X-Phi signal-binding domain that contains two hydrophobic pockets, one for the tyrosine-binding and one for the bulky hydrophobic residue-binding. Since the Y-X-X-Phi binding site is buried in the core structure of AP-2, a phosphorylation-induced conformational change is required when the cargo molecules bind to AP-2. In addition, the C-terminal domain of mu2 subunit has been shown to bind other molecules. For instance, it can bind phosphoinositides, in particular PI[4,5]P2, which might be involved in the recognition process of the tyrosine-based signals. It can also interact with synaptotagmins, a family of important modulators of calcium-dependent neurosecretion within the synaptic vesicle (SV) membrane. Since many of the other endocytic adaptors responsible for biogenesis of synaptic vesicles exist, in the absence of AP-2, clathrin-mediated endocytosis can still occur. However, the cells may not survive in the complete absence of clathrin as well as AP-2.	263
271160	cd09252	AP-3_Mu3_Cterm	C-terminal domain of medium Mu3 subunit in adaptor protein (AP) complex AP-3. AP complexes participate in the formation of intracellular coated transport vesicles and select cargo molecules for incorporation into the coated vesicles in the late secretory and endocytic pathways. There are four AP complexes, AP-1, AP-2, AP-3, and AP-4, described in various eukaryotic organisms. Each AP complex consists of four subunits: two large chains (one each of gamma/alpha/delta/epsilon and beta1-4, respectively), a medium mu chain (mu1-4), and a small sigma chain (sigma1-4). Each of the four subunits from the different AP complexes exhibits similarity with each other. This family corresponds to the C-terminal domain of heterotetrameric adaptor protein complex 3 (AP-3) medium mu3 subunit, which includes two closely related homologs, mu3A (P47A, encoded by ap3m1) and mu3B (P47B, encoded by ap3m2). Mu3A is ubiquitously expressed, but mu3B is specifically expressed in neurons and neuroendocrine cells. AP-3 is particularly important for targeting integral membrane proteins to lysosomes and lysosome-related organelles at trans-Golgi network (TGN) and/or endosomes, such as the yeast vacuole, fly pigment granules and mammalian melanosomes, platelet dense bodies and the secretory lysosomes of cytotoxic T lymphocytes. Unlike AP-1 and AP-2, which function in conjunction with clathrin which is a scaffolding protein participating in the formation of coated vesicles, the nature of the outer shell of AP-3-containing coats remains to be elucidated. Membrane-anchored cargo molecules interact with adaptors through short sorting signals in their cytosolic segments. Tyrosine-based endocytotic signals are one of the most important sorting signals. They are of the form Y-X-X-Phi, where Y is tyrosine, X is any amino acid and Phi is a bulky hydrophobic residue that can be Leu, Ile, Met, Phe, or Val. These kinds of sorting signals can be recognized by the C-terminal domain of AP-3 mu3 subunit, also known as Y-X-X-Phi signal-binding domain that contains two hydrophobic pockets, one for the tyrosine-binding and one for the bulky hydrophobic residue-binding.	251
271161	cd09253	AP-4_Mu4_Cterm	C-terminal domain of medium Mu4 subunit in adaptor protein (AP) complex AP-4. AP complexes participate in the formation of intracellular coated transport vesicles and select cargo molecules for incorporation into the coated vesicles in the late secretory and endocytic pathways. There are four AP complexes, AP-1, AP-2, AP-3, and AP-4, described in various eukaryotic organisms. Each AP complex consists of four subunits: two large chains (one each of gamma/alpha/delta/epsilon and beta1-4, respectively), a medium mu chain (mu1-4), and a small sigma chain (sigma1-4). Each of the four subunits from the different AP complexes exhibits similarity with each other. This family corresponds to the C-terminal domain of heterotetrameric adaptor protein complex 4 (AP-4) medium mu4 subunit. AP-4 plays a role in signal-mediated trafficking of integral membrane proteins in mammalian cells. Unlike other AP complexes, AP-4 is found only in mammals and plants. It is believed to be part of a nonclathrin coat, since it might function independently of clathrin, a scaffolding protein participating in the formation of coated vesicles. Recruitment of AP-4 to the trans-Golgi network (TGN) membrane is regulated by a small GTPase, ADP-ribosylation factor 1 (ARF1) or a related protein. Membrane-anchored cargo molecules interact with adaptors through short sorting signals in their cytosolic segments. One of the most important sorting signals binding to mu subunits of AP complexes are tyrosine-based endocytotic signals, which are of the form Y-X-X-Phi, where Y is tyrosine, X is any amino acid and Phi is a bulky hydrophobic residue that can be Leu, Ile, Met, Phe, or Val. However, AP-4 does not bind most canonical tyrosine-based signals except for two naturally occurring ones from the lysosomal membrane proteins CD63 and LAMP-2a. It binds the YX[FYL][FL]E motif, where X can be any residue, from the cytosolic tails of amyloid precursor protein (APP) family members in a distinct way.	271
271162	cd09254	AP_delta-COPI_MHD	Mu homology domain (MHD) of adaptor protein (AP) coat protein I (COPI) delta subunit. COPI complex-coated vesicles function in the early secretory pathway. They mediate the retrograde transport from the Golgi to the ER, and intra-Golgi transport. COPI complex-coated vesicles consist of a small GTPase, ADP-ribosylation factor 1 (ARF1) and a heteroheptameric coatomer composed of two subcomplexes, F-COPI and B-COPI. ARF1 regulates COPI vesicle formation by recruiting the coatomer onto Golgi membranes to initiate its coat function. Coatomer complexes then bind cargo molecules and self-assemble to form spherical cages that yield COPI-coated vesicles. The heterotetrameric F-COPI subcomplex contains beta-, gamma-, delta-, and zeta-COP subunits, where beta- and gamma-COP subunits are related to the large AP subunits, and delta- and zeta-COP subunits are related to the medium and small AP subunits, respectively. Due to the sequence similarity to the AP complexes, the F-COPI subcomplex might play a role in the cargo-binding. The heterotrimeric B-COPI contains alpha-, beta-, and epsilon-COP subunits, which are not related to the adaptins. This subcomplex is thought to participate in the cage-forming and might serve a function similar to that of clathrin. This family corresponds to the mu homology domain of delta-subunit of COPI complex (delta-COP), which is distantly related to the C-terminal domain of mu chains among AP complexes. The delta-COP subunit appears tightly associated with the beta-COP subunit to confer its interaction with ARF1. In addition, both delta- and beta-COP subunits contribute to a common binding site for arginine (R)-based signals, which are sorting motifs conferring transient endoplasmic reticulum (ER) localization to unassembled subunits of multimeric membrane proteins.	237
271163	cd09255	AP-like_stonins_MHD	Mu homology domain (MHD) of adaptor-like proteins (AP-like), stonins. A small family of proteins named stonins has been characterized as clathrin-dependent AP-2 mu2 chain related factors, which may act as cargo-specific sorting adaptors in endocytosis. Stonins include stonin 1 and stonin 2, which are the only mammalian homologs of Drosophila stoned B, a presynaptic protein implicated in neurotransmission and synaptic vesicle (SV) recycling. They are conserved from C. elegans to humans, but are not found in prokaryotes or yeasts. This family corresponds to the mu homology domain of stonins, which is distantly related to the C-terminal domain of mu chains among AP complexes. Due to the low degree of sequence conservation of the corresponding binding site, the mu homology domain of stonins is unable to recognize tyrosine-based endocytic sorting signals. To date, little is known about the localization and function of stonin 1. Stonin 2, also known as stoned B, acts as an AP-2-dependent synaptotagmin-specific sorting adaptor for SV endocytosis. Stoned A is not a stonin. It is structurally unrelated to the adaptins and does not appear to have mammalian homologs. It is not included in this family.	315
271164	cd09256	AP_MuD_MHD	Mu-homology domain (MHD) of an adaptor protein (AP) encoded by mu-2 related death-inducing gene, MuD (also known as MUDENG). This family corresponds to the MHD found in a protein encoded by MuD (also known as Adapter-related protein complex 5 subunit mu-1), which is distantly related to the C-terminal domain of the mu2 subunit of AP complexes that participates in clathrin-mediated endocytosis. MuD is evolutionarily conserved from mammals to amphibians. It is able to induce cell death by itself and plays an important role in cell death in various tissues.	276
271165	cd09257	AP_muniscins_like_MHD	Mu-homology domain (MHD) of muniscins adaptor proteins (AP) and similar proteins. This family corresponds to the MHD found in muniscins, a novel family of endocytic adaptor proteins. The term, muniscins, has been assigned to name the MHD of proteins with both EFC/F-BAR domain and MHD. These two domains are responsible for the membrane-tubulation activity associated with transmembrane cargo proteins. Members in this family include an endocytic adaptor Syp1, the mammalian FCH domain only proteins (FCHo1/2), SH3-containing GRB2-like protein 3-interacting protein 1 (SGIP1), and related uncharacterized proteins. Syp1 is a poorly characterized yeast protein with multiple biological functions. Syp1 contains an N-terminal EFC/F-BAR domain that induces membrane tubulation, a proline-rich domain (PRD) in the middle region, and a C-terminal MHD that can directly bind to the endocytic adaptor/scaffold protein Ede1 or a transmembrane stress sensor cargo protein Mid2. Thus, Syp1 represents a novel type of endocytic adaptor protein that participates in endocytosis, promotes vesicle tubulation, and contributes to cell polarity and stress response. Syp1 shares the same domain architecture with its two ubiquitously expressed mammalian counterparts, the membrane-sculpting F-BAR domain-containing Fer/Cip4 homology domain-only proteins 1 and 2 (FCHo1/2). FCHo1/2 represent key initial proteins ultimately controlling cellular nutrient uptake, receptor regulation, and synaptic vesicle retrieval. They are required for plasma membrane clathrin-coated vesicle (CCV) budding and mark sites of CCV formation. They bind specifically to the plasma membrane and recruit the scaffold proteins eps15 and intersectin, which subsequently engage the adaptor complex AP2 and clathrin, leading to coated vesicle formation. 
Another mammalian neuronal-specific protein, neuronal-specific transcript Src homology 3 (SH3)-domain growth factor receptor-bound 2 (GRB2)-like (endophilin) interacting protein 1 [SGIP1] does not contain EFC/F-BAR domain, but does have a PRD and a C-terminal MHD and has been classified into this family as well. SGIP1 is an endophilin-interacting protein that plays an obligatory role in the regulation of energy homeostasis. It is also involved in clathrin-mediated endocytosis by interacting with phospholipids and eps15.	244
271166	cd09258	AP-1_Mu1A_Cterm	C-terminal domain of medium Mu1A subunit in ubiquitously expressed clathrin-associated adaptor protein (AP) complex AP-1. AP complexes participate in the formation of intracellular coated transport vesicles and select cargo molecules for incorporation into the coated vesicles in the late secretory and endocytic pathways. There are four AP complexes, AP-1, AP-2, AP-3, and AP-4, described in various eukaryotic organisms. Each AP complex consists of four subunits: two large chains (one each of gamma/alpha/delta/epsilon and beta1-4, respectively), a medium mu chain (mu1-4), and a small sigma chain (sigma1-4). Each of the four subunits from the different AP complexes exhibits similarity with each other. This subfamily corresponds to the C-terminal domain of heterotetrameric clathrin-associated adaptor protein complex 1 (AP-1) medium mu1A subunit encoded by ap1m1 gene, which is ubiquitously expressed in all mammalian tissues and cells. AP-1 has been implicated in bidirectional transport between the trans-Golgi network (TGN) and endosomes. It is involved in the formation of clathrin-coated vesicles (CCVs) from the trans-Golgi network (TGN). The ubiquitous AP-1 is recruited to the TGN membrane, as well as to immature secretory granules. Recruitment of AP-1 to the TGN membrane is regulated by a small GTPase, ADP-ribosylation factor 1 (ARF1). Phosphorylation/dephosphorylation events can also regulate the function of AP-1. The membrane-anchored cargo molecules can be linked to the outer lattice of CCVs by AP-1. Those cargo molecules interact with adaptors through short sorting signals in their cytosolic segments. Tyrosine-based endocytotic signals are one of the most important sorting signals. They are of the form Y-X-X-Phi, where Y is tyrosine, X is any amino acid and Phi is a bulky hydrophobic residue that can be Leu, Ile, Met, Phe, or Val. 
These kinds of sorting signals can be recognized by the C-terminal domain of AP-1 mu1A subunit, also known as Y-X-X-Phi signal-binding domain that contains two hydrophobic pockets, one for the tyrosine-binding and one for the bulky hydrophobic residue-binding.	270
271167	cd09259	AP-1_Mu1B_Cterm	C-terminal domain of medium Mu1B subunit in epithelial cell-specific clathrin-associated adaptor protein (AP) complex AP-1. AP complexes participate in the formation of intracellular coated transport vesicles and select cargo molecules for incorporation into the coated vesicles in the late secretory and endocytic pathways. There are four AP complexes, AP-1, AP-2, AP-3, and AP-4, described in various eukaryotic organisms. Each AP complex consists of four subunits: two large chains (one each of gamma/alpha/delta/epsilon and beta1-4, respectively), a medium mu chain (mu1-4), and a small sigma chain (sigma1-4). Each of the four subunits from different AP complexes exhibits similarity with each other. This subfamily corresponds to the C-terminal domain of heterotetrameric clathrin-associated adaptor protein complex 1 (AP-1) medium mu1B subunit encoded by ap1m2 gene exclusively expressed in polarized epithelial cells. Epithelial cell-specific AP-1 is used to sort proteins to the basolateral plasma membrane, which involves the formation of clathrin-coated vesicles (CCVs) from the trans-Golgi network (TGN). Recruitment of AP-1 to the TGN membrane is regulated by a small GTPase, ADP-ribosylation factor 1 (ARF1). The phosphorylation/dephosphorylation events can also regulate the function of AP-1. The membrane-anchored cargo molecules can be linked to the outer lattice of CCVs by AP-1. Those cargo molecules interact with adaptors through short sorting signals in their cytosolic segments. Tyrosine-based endocytotic signals are one of the most important sorting signals. They are of the form Y-X-X-Phi, where Y is tyrosine, X is any amino acid and Phi is a bulky hydrophobic residue that can be Leu, Ile, Met, Phe, or Val. 
These kinds of sorting signals can be recognized by the C-terminal domain of AP-1 mu1B subunit, also known as Y-X-X-Phi signal-binding domain that contains two hydrophobic pockets, one for the tyrosine-binding and one for the bulky hydrophobic residue-binding. Besides, AP-1 mu1B subunit mediates the basolateral recycling of low-density lipoprotein receptor (LDLR) and transferrin receptor (TfR) from the sorting endosomes, where the basolateral sorting signal does not belong to the tyrosine-based signals. Thus, the binding site in mu1B subunit of AP-1 for the signals of LDLR and TfR might be distinct from that for YXXPhi signals.	268
211371	cd09260	AP-3_Mu3A_Cterm	C-terminal domain of medium Mu3A subunit in ubiquitously expressed adaptor protein (AP) complex AP-3. AP complexes participate in the formation of intracellular coated transport vesicles and select cargo molecules for incorporation into the coated vesicles in the late secretory and endocytic pathways. There are four AP complexes, AP-1, AP-2, AP-3, and AP-4, described in various eukaryotic organisms. Each AP complex consists of four subunits: two large chains (one each of gamma/alpha/delta/epsilon and beta1-4, respectively), a medium mu chain (mu1-4), and a small sigma chain (sigma1-4). Each of the four subunits from the different AP complexes exhibits similarity with each other. This subfamily corresponds to the C-terminal domain of heterotetrameric adaptor protein complex 3 (AP-3) medium mu3A subunit encoded by ap3m1 gene. Mu3A is ubiquitously expressed in all mammalian tissues and cells. It appears to be localized to the trans-Golgi network (TGN) and/or endosomes and participates in trafficking to the vacuole/lysosome in yeast, flies, and mammals. Unlike AP-1 and AP-2, which function in conjunction with clathrin which is a scaffolding protein participating in the formation of coated vesicles, the nature of the outer shell of ubiquitous AP-3 containing coats remains to be elucidated. Membrane-anchored cargo molecules interact with adaptors through short sorting signals in their cytosolic segments. Tyrosine-based endocytotic signals are one of the most important sorting signals. They are of the form Y-X-X-Phi, where Y is tyrosine, X is any amino acid and Phi is a bulky hydrophobic residue that can be Leu, Ile, Met, Phe, or Val. These kinds of sorting signals can be recognized by the C-terminal domain of AP-3 mu3A subunit, also known as Y-X-X-Phi signal-binding domain that contains two hydrophobic pockets, one for the tyrosine-binding and one for the bulky hydrophobic residue-binding.	254
211372	cd09261	AP-3_Mu3B_Cterm	C-terminal domain of medium Mu3B subunit in neuron-specific adaptor protein (AP) complex AP-3. AP complexes participate in the formation of intracellular coated transport vesicles and select cargo molecules for incorporation into the coated vesicles in the late secretory and endocytic pathways. There are four AP complexes, AP-1, AP-2, AP-3, and AP-4, described in various eukaryotic organisms. Each AP complex consists of four subunits: two large chains (one each of gamma/alpha/delta/epsilon and beta1-4, respectively), a medium mu chain (mu1-4), and a small sigma chain (sigma1-4). Each of the four subunits from the different AP complexes exhibits similarity with each other. This subfamily corresponds to the C-terminal domain of heterotetrameric adaptor protein complex 3 (AP-3) medium mu3B subunit encoded by ap3m2 gene. Mu3B is specifically expressed in neurons and neuroendocrine cells. Neuron-specific AP-3 appears to be involved in synaptic vesicle biogenesis from endosomes in neurons and plays an important role in synaptic transmission in the central nervous system. Unlike AP-1 and AP-2, which function in conjunction with clathrin which is a scaffolding protein participating in the formation of coated vesicles, the nature of the outer shell of neuron-specific AP-3 containing coats remains to be elucidated. Membrane-anchored cargo molecules interact with adaptors through short sorting signals in their cytosolic segments. Tyrosine-based endocytotic signals are one of the most important sorting signals. They are of the form Y-X-X-Phi, where Y is tyrosine, X is any amino acid and Phi is a bulky hydrophobic residue that can be Leu, Ile, Met, Phe, or Val. These kinds of sorting signals can be recognized by the C-terminal domain of AP-3 mu3B subunit, also known as Y-X-X-Phi signal-binding domain that contains two hydrophobic pockets, one for the tyrosine-binding and one for the bulky hydrophobic residue-binding.	254
271168	cd09262	AP_stonin-1_MHD	Mu homology domain (MHD) of adaptor-like protein (AP-like), stonin-1 (also called Stoned B-like factor). A small family of proteins named stonins has been characterized as clathrin-dependent AP-2 mu2 chain related factors, which may act as cargo-specific sorting adaptors in endocytosis. Stonins include stonin 1 and stonin 2, which are the only mammalian homologs of Drosophila stoned B, a presynaptic protein implicated in neurotransmission and synaptic vesicle (SV) recycling. They are conserved from C. elegans to humans, but are not found in prokaryotes or yeasts. This family corresponds to the mu homology domain of stonin 1, which is distantly related to the C-terminal domain of mu chains among AP complexes. Due to the low degree of sequence conservation of the corresponding binding site, the mu homology domain of stonin-1 is unable to recognize tyrosine-based endocytic sorting signals. To date, little is known about the localization and function of stonin-1.	314
271169	cd09263	AP_stonin-2_MHD	Mu homology domain (MHD) of adaptor-like protein (AP-like), stonin-2. A small family of proteins named stonins has been characterized as clathrin-dependent AP-2 mu2 chain related factors, which may act as cargo-specific sorting adaptors in endocytosis. Stonins include stonin 1 and stonin 2, which are the only mammalian homologs of Drosophila stoned B, a presynaptic protein implicated in neurotransmission and synaptic vesicle (SV) recycling. They are conserved from C. elegans to humans, but are not found in prokaryotes or yeasts. This family corresponds to the mu homology domain of stonin 2, which is distantly related to the C-terminal domain of mu chains among AP complexes. Due to the low degree of sequence conservation of the corresponding binding site, the mu homology domain of stonin-2 is unable to recognize tyrosine-based endocytic sorting signals. It acts as an AP-2-dependent synaptotagmin-specific sorting adaptor for SV endocytosis.	318
271170	cd09264	AP_Syp1_MHD	mu-homology domain (MHD) of adaptor protein (AP), Syp1, and related proteins. This family corresponds to the MHD found in a novel endocytic adaptor Syp1 and related proteins. Syp1 is a poorly characterized yeast protein with multiple biological functions. It was originally identified as a suppressor of a yeast profilin deletion and later as a suppressor of arf3delta (Arf3 is the yeast homologue of Arf6, a mammalian regulator of endocytosis). Syp1 can bind to septins and physically link with cell polarity factors. It also directly binds to the endocytic adaptor/scaffold protein Ede1, and plays a role in endocytosis. Further studies show that Syp1 is itself an endocytic adaptor protein contributing to stress responses. Its mu-homology domain at the C-terminus binds to the cargo protein Mid2, a transmembrane stress sensor protein, and mediates Mid2 internalization. In addition, Syp1 contains an EFC/F-BAR domain which can induce membrane tubulation.	257
271171	cd09265	AP_Syp1_like_MHD	Mu-homology domain (MHD) of endocytic adaptor protein (AP), Syp1. This family corresponds to the MHD found in the metazoan counterparts of yeast Syp1, which includes two ubiquitously expressed membrane-sculpting F-BAR domain-containing Fer/Cip4 homology domain-only proteins 1 and 2 (FCH domain only 1 and 2, or FCHo1/FCHo2), neuronal-specific SH3-containing GRB2-like protein 3-interacting protein 1 (SGIP1), and related uncharacterized proteins. FCHo1/FCHo2 represent key initial proteins ultimately controlling cellular nutrient uptake, receptor regulation, and synaptic vesicle retrieval. They are required for plasma membrane clathrin-coated vesicle (CCV) budding and mark sites of CCV formation. They bind specifically to the plasma membrane and recruit the scaffold proteins eps15 and intersectin, which subsequently engage the adaptor complex AP2 and clathrin, leading to coated vesicle formation. Both FCHo1/FCHo2 contain an N-terminal EFC/F-BAR domain that induces membrane tubulation, a proline-rich domain (PRD) in the middle region, and a C-terminal MHD responsible for the binding of eps15 and intersectin. Another mammalian neuronal-specific protein, neuronal-specific transcript Src homology 3 (SH3)-domain growth factor receptor-bound 2 (GRB2)-like (endophilin) interacting protein 1 [SGIP1] does not contain EFC/F-BAR domain, but does have a PRD and a C-terminal MHD and has been classified into this family as well. SGIP1 is an endophilin-interacting protein that plays an obligatory role in the regulation of energy homeostasis. It is also involved in clathrin-mediated endocytosis by interacting with phospholipids and eps15.	266
271172	cd09266	SGIP1_MHD	mu-homology domain (MHD) of Src homology 3 (SH3)-domain growth factor receptor-bound 2 (GRB2)-like (endophilin) interacting protein 1 (also known as endophilin-3-interacting protein, SGIP1) and similar proteins. This family corresponds to the MHD found in mammalian neuronal-specific transcript SGIP1 and similar proteins. Unlike other members in this family, SGIP1 does not contain EFC/F-BAR domain, but does have a proline-rich domain (PRD) and a C-terminal MHD. It is an endophilin-interacting protein that plays an obligatory role in the regulation of energy homeostasis, and is also involved in clathrin-mediated endocytosis by interacting with phospholipids and eps15.	267
211378	cd09267	FCHo2_MHD	mu-homology domain (MHD) of F-BAR domain-containing Fer/Cip4 homology domain-only protein 2 (FCH domain only 2 or FCHo2) and similar proteins. This family corresponds to the MHD found in the ubiquitously expressed mammalian membrane-sculpting FCHo2 and similar proteins. FCHo2 represents a key initial protein that ultimately controls cellular nutrient uptake, receptor regulation, and synaptic vesicle retrieval. It is required for plasma membrane clathrin-coated vesicle (CCV) budding and marks sites of CCV formation. It binds specifically to the plasma membrane and recruits the scaffold proteins eps15 and intersectin, which subsequently engage the adaptor complex AP2 and clathrin, leading to coated vesicle formation. FCHo2 contains an N-terminal EFC/F-BAR domain, a proline-rich domain (PRD) in the middle region, and a C-terminal MHD. The crescent-shaped EFC/F-BAR domain can form an antiparallel dimer structure that binds PtdIns(4,5)P2-enriched membranes and can polymerize into rings to generate membrane tubules. The MHD is structurally related to the cargo-binding mu2 subunit of adaptor complex 2 (AP-2) and is responsible for the binding of eps15 and intersectin.	267
271173	cd09268	FCHo1_MHD	mu-homology domain (MHD) of F-BAR domain-containing Fer/Cip4 homology domain-only protein 1 (FCH domain only 1 or FCHo1, also known as KIAA0290) and similar proteins. This family corresponds to the MHD found in ubiquitously expressed mammalian membrane-sculpting FCHo1 and similar proteins. FCHo1 represents a key initial protein that ultimately controls cellular nutrient uptake, receptor regulation, and synaptic vesicle retrieval. It is required for plasma membrane clathrin-coated vesicle (CCV) budding and marks sites of CCV formation. It binds specifically to the plasma membrane and recruits the scaffold proteins eps15 and intersectin, which subsequently engage the adaptor complex AP2 and clathrin, leading to coated vesicle formation. FCHo1 contains an N-terminal EFC/F-BAR domain, a proline-rich domain (PRD) in the middle region, and a C-terminal MHD. The crescent-shaped EFC/F-BAR domain can form an antiparallel dimer structure that binds PtdIns(4,5)P2-enriched membranes and can polymerize into rings to generate membrane tubules. The MHD is structurally related to the cargo-binding mu2 subunit of adaptor complex 2 (AP-2) and is responsible for the binding of eps15 and intersectin. Unlike other F-BAR domain containing proteins, FCHo1 has neither the Src homology 3 (SH3) domain nor any other known domain for interaction with dynamin and actin cytoskeleton. However,  it can periodically accumulate at the budding site of clathrin. FCHo1 may utilize a unique action mode for vesicle formation as compared with other F-BAR proteins.	265
185703	cd09269	deoxyribose_mutarotase	deoxyribose mutarotase_like. Salmonella enterica serovar Typhi DeoM (earlier named as DeoX) is a mutarotase with high specificity for deoxyribose. It is encoded by one of four genes belonging to the deoK operon. This operon has also been found in Escherichia coli where it is more common in pathogenic than in commensal strains and is associated with pathogenicity. It has been found on a pathogenicity island from a human blood isolate AL863 and confers the ability to use deoxyribose as a carbon source; deoxyribose is not fermented by non-pathogenic E. coli K-12. Proteins in this family are members of the aldose-1-epimerase superfamily. Aldose 1-epimerases, or mutarotases, are key enzymes of carbohydrate metabolism, catalyzing the interconversion of the alpha- and beta-anomers of hexose sugars such as glucose and galactose. This interconversion is an important step that allows anomer specific metabolic conversion of sugars. Studies of the catalytic mechanism of the best known member of the family, galactose mutarotase, have shown a glutamate and a histidine residue to be critical for catalysis; the glutamate serves as the active site base to initiate the reaction by removing the proton from the C-1 hydroxyl group of the sugar substrate, and the histidine as the active site acid to protonate the C-5 ring oxygen. Site directed mutagenesis of this latter histidine residue renders Salmonella enterica DeoM inactive.	293
187751	cd09270	RNase_H2-B	Ribonuclease H2-B is a subunit of the eukaryotic RNase H complex which cleaves RNA-DNA hybrids. Ribonuclease H2B is one of the three proteins of eukaryotic RNase H2 complex that is required for nucleic acid binding and hydrolysis. RNase H is classified into two families, type I (prokaryotic RNase HI, eukaryotic RNase H1 and viral RNase H) and type II (prokaryotic RNase HII and HIII, and eukaryotic RNase H2/HII). RNase H endonucleolytically hydrolyzes an RNA strand when it is annealed to a complementary DNA strand in the presence of divalent cations, in DNA replication and repair. The enzyme can be found in bacteria, archaea, and eukaryotes. Most prokaryotic and eukaryotic genomes contain multiple RNase H genes. Despite a lack of evidence for homology from sequence comparisons, type I and type II RNase H share a common fold and similar steric configurations of the four acidic active-site residues, suggesting identical or very similar catalytic mechanisms. Eukaryotic RNase HII is active during replication and is believed to play a role in removal of Okazaki fragment primers and single ribonucleotides in DNA-DNA duplexes. Eukaryotic RNase HII is functional when it forms a complex with RNase H2B and RNase H2C proteins. It is speculated that the two accessory subunits are required for correct folding of the catalytic subunit of RNase HII. Mutations in the three subunits of human RNase HII cause neurological disorder.	211
187752	cd09271	RNase_H2-C	Ribonuclease H2-C is a subunit of the eukaryotic RNase H complex which cleaves RNA-DNA hybrids. Ribonuclease H2C is one of the three proteins of eukaryotic RNase H2 complex that is required for nucleic acid binding and hydrolysis. RNase H is classified into two families, type I (prokaryotic RNase HI, eukaryotic RNase H1 and viral RNase H) and type II (prokaryotic RNase HII and HIII, and eukaryotic RNase H2/HII). RNase H endonucleolytically hydrolyzes an RNA strand when it is annealed to a complementary DNA strand in the presence of divalent cations, in DNA replication and repair. The enzyme can be found in bacteria, archaea, and eukaryotes. Most prokaryotic and eukaryotic genomes contain multiple RNase H genes. Despite a lack of evidence for homology from sequence comparisons, type I and type II RNase H share a common fold and similar steric configurations of the four acidic active-site residues, suggesting identical or very similar catalytic mechanisms. Eukaryotic RNase HII is active during replication and is believed to play a role in removal of Okazaki fragment primers and single ribonucleotides in DNA-DNA duplexes. Eukaryotic RNase HII is functional when it forms a complex with RNase H2B and RNase H2C proteins. It is speculated that the two accessory subunits are required for correct folding of the catalytic subunit of RNase HII. Mutations in the three subunits of human RNase HII cause neurological disorder.	93
260004	cd09272	RNase_HI_RT_Ty1	Ty1/Copia family of RNase HI in long terminal repeat retroelements. Ribonuclease H (RNase H) enzymes are divided into two major families, Type 1 and Type 2, based on amino acid sequence similarities and biochemical properties. RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner in the presence of divalent cations. RNase H is widely present in various organisms including bacteria, archaea, and eukaryotes. RNase HI has also been observed as adjunct domains to the reverse transcriptase gene in retroviruses, in long terminal repeat (LTR)-bearing and non-LTR retrotransposons. RNase HI in LTR retrotransposons performs degradation of the original RNA template, generation of a polypurine tract (the primer for plus-strand DNA synthesis), and final removal of RNA primers from newly synthesized minus and plus strands. The catalytic residues for RNase H enzymatic activity, three aspartic acids and one glutamic acid residue (DEDD), are invariant across all RNase H domains. RNase HI of LTR retroelements is classified phylogenetically into five major families: Ty3/Gypsy, Ty1/Copia, Bel/Pao, DIRS1, and the vertebrate retroviruses. The Ty1/Copia family is widely distributed among the genomes of plants, fungi, and animals. RNase H inhibitors have been explored as an anti-HIV drug target because RNase H inactivation inhibits reverse transcription.	140
260005	cd09273	RNase_HI_RT_Bel	Bel/Pao family of RNase HI in long terminal repeat retroelements. Ribonuclease H (RNase H) enzymes are divided into two major families, Type 1 and Type 2, based on amino acid sequence similarities and biochemical properties. RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner in the presence of divalent cations. RNase H is widely present in various organisms, including bacteria, archaea and eukaryotes. RNase HI has also been observed as adjunct domains to the reverse transcriptase gene in retroviruses, in long terminal repeat (LTR)-bearing retrotransposons and non-LTR retrotransposons. RNase HI in LTR retrotransposons performs degradation of the original RNA template, generation of a polypurine tract (the primer for plus-strand DNA synthesis), and final removal of RNA primers from newly synthesized minus and plus strands. The catalytic residues for RNase H enzymatic activity, three aspartic acids and one glutamic acid residue (DEDD), are invariant across all RNase H domains. RNase HI of LTR retroelements is classified phylogenetically into five major families: Ty3/Gypsy, Ty1/Copia, Bel/Pao, DIRS1 and the vertebrate retroviruses. The Bel/Pao family has been described only in metazoan genomes. RNase H inhibitors have been explored as an anti-HIV drug target because RNase H inactivation inhibits reverse transcription.	131
260006	cd09274	RNase_HI_RT_Ty3	Ty3/Gypsy family of RNase HI in long terminal repeat retroelements. Ribonuclease H (RNase H) enzymes are divided into two major families, Type 1 and Type 2, based on amino acid sequence similarities and biochemical properties. RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner in the presence of divalent cations. RNase H is widely present in various organisms, including bacteria, archaea and eukaryotes. RNase HI has also been observed as adjunct domains to the reverse transcriptase gene in retroviruses, in long terminal repeat (LTR)-bearing retrotransposons and non-LTR retrotransposons. RNase HI in LTR retrotransposons performs degradation of the original RNA template, generation of a polypurine tract (the primer for plus-strand DNA synthesis), and final removal of RNA primers from newly synthesized minus and plus strands. The catalytic residues for RNase H enzymatic activity, three aspartic acids and one glutamic acid residue (DEDD), are invariant across all RNase H domains. RNase HI of LTR retroelements is classified phylogenetically into five major families: Ty3/Gypsy, Ty1/Copia, Bel/Pao, DIRS1 and the vertebrate retroviruses. The Ty3/Gypsy family is widely distributed among the genomes of plants, fungi and animals. RNase H inhibitors have been explored as an anti-HIV drug target because RNase H inactivation inhibits reverse transcription.	121
260007	cd09275	RNase_HI_RT_DIRS1	DIRS1 family of RNase HI in long terminal repeat retroelements. Ribonuclease H (RNase H) enzymes are divided into two major families, Type 1 and Type 2, based on amino acid sequence similarities and biochemical properties. RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner in the presence of divalent cations. RNase H is widely present in various organisms, including bacteria, archaea and eukaryotes. RNase HI has also been observed as adjunct domains to the reverse transcriptase gene in retroviruses, in long terminal repeat (LTR)-bearing retrotransposons and non-LTR retrotransposons. RNase HI in LTR retrotransposons performs degradation of the original RNA template, generation of a polypurine tract (the primer for plus-strand DNA synthesis), and final removal of RNA primers from newly synthesized minus and plus strands. The catalytic residues for RNase H enzymatic activity, three aspartic acids and one glutamic acid residue (DEDD), are invariant across all RNase H domains. RNase HI of LTR retroelements is classified phylogenetically into five major families: Ty3/Gypsy, Ty1/Copia, Bel/Pao, DIRS1 and the vertebrate retroviruses. The structural features of DIRS1-group elements are different from typical LTR elements. RNase H inhibitors have been explored as an anti-HIV drug target because RNase H inactivation inhibits reverse transcription.	120
260008	cd09276	Rnase_HI_RT_non_LTR	non-LTR RNase HI domain of reverse transcriptases. Ribonuclease H (RNase H) is classified into two families, type 1 (prokaryotic RNase HI, eukaryotic RNase H1 and viral RNase H) and type 2 (prokaryotic RNase HII and HIII, and eukaryotic RNase H2). Ribonuclease HI (RNase HI) is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner. RNase H is widely present in various organisms, including bacteria, archaea and eukaryotes. RNase HI has also been observed as an adjunct domain to the reverse transcriptase gene in retroviruses, long terminal repeat (LTR)-bearing retrotransposons and non-LTR retrotransposons. RNase HI in LTR retrotransposons performs degradation of the original RNA template, generation of a polypurine tract (the primer for plus-strand DNA synthesis), and final removal of RNA primers from newly synthesized minus and plus strands. The catalytic residues for RNase H enzymatic activity, three aspartic acids and one glutamic acid residue (DEDD), are invariant across all RNase H domains. The position of the RNase domain of non-LTR and LTR transposons is at the carboxyl terminal of the reverse transcriptase (RT) domain and their RNase domains group together, indicating a common evolutionary origin. Many non-LTR transposons have lost the RNase domain because their activity is at the nucleus and cellular RNase may suffice; however, LTR retrotransposons always encode their own RNase domain because they require RNase activity in RNA-protein particles in the cytoplasm. RNase H inhibitors have been explored as an anti-HIV drug target because RNase H inactivation inhibits reverse transcription.	131
260009	cd09277	RNase_HI_bacteria_like	Bacterial RNase HI containing a hybrid binding domain (HBD) at the N-terminus. Ribonuclease H (RNase H) enzymes are divided into two major families, Type 1 and Type 2, based on amino acid sequence similarities and biochemical properties. RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner in the presence of divalent cations. RNase H is involved in DNA replication, repair and transcription. RNase H is widely present in various organisms, including bacteria, archaea and eukaryotes and most prokaryotic and eukaryotic genomes contain multiple RNase H genes. Despite the lack of amino acid sequence homology, Type 1 and Type 2 RNase H share a main-chain fold and steric configurations of the four acidic active-site (DEDD) residues and have the same catalytic mechanism and functions in cells. One of the important functions of RNase H is to remove Okazaki fragments during DNA replication. Prokaryotic RNase H varies greatly in domain structures and substrate specificities. Prokaryotes and some single-cell eukaryotes do not require RNase H for viability. Some bacterial RNase HI are distinguished from other bacterial RNase HI by the presence of a hybrid binding domain (HBD) at the N-terminus, which is commonly present at the N-termini of eukaryotic RNase HI. It has been reported that this domain is required for dimerization and processivity of RNase HI upon binding to RNA-DNA hybrids.	133
260010	cd09278	RNase_HI_prokaryote_like	RNase HI family found mainly in prokaryotes. Ribonuclease H (RNase H) is classified into two evolutionarily unrelated families, type 1 (prokaryotic RNase HI, eukaryotic RNase H1 and viral RNase H) and type 2 (prokaryotic RNase HII and HIII, and eukaryotic RNase H2). RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner. RNase H is involved in DNA replication, repair and transcription. RNase H is widely present in various organisms, including bacteria, archaea and eukaryotes and most prokaryotic and eukaryotic genomes contain multiple RNase H genes. Despite the lack of amino acid sequence homology, type 1 and type 2 RNase H share a main-chain fold and steric configurations of the four acidic active-site (DEDD) residues and have the same catalytic mechanism and functions in cells. One of the important functions of RNase H is to remove Okazaki fragments during DNA replication. Prokaryotic RNase H varies greatly in domain structures and substrate specificities. Prokaryotes and some single-cell eukaryotes do not require RNase H for viability.	139
260011	cd09279	RNase_HI_like	RNAse HI family that includes archaeal, some bacterial as well as plant RNase HI. Ribonuclease H (RNase H) is classified into two evolutionarily unrelated families, type 1 (prokaryotic RNase HI, eukaryotic RNase H1 and viral RNase H) and type 2 (prokaryotic RNase HII and HIII, and eukaryotic RNase H2). RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner. RNase H is involved in DNA replication, repair and transcription. RNase H is widely present in various organisms, including bacteria, archaea and eukaryotes and most prokaryotic and eukaryotic genomes contain multiple RNase H genes. Despite the lack of amino acid sequence homology, type 1 and type 2 RNase H share a main-chain fold and steric configurations of the four acidic active-site (DEDD) residues and have the same catalytic mechanism and functions in cells. One of the important functions of RNase H is to remove Okazaki fragments during DNA replication. Most archaeal genomes contain only type 2 RNase H (RNase HII); however, a few contain RNase HI as well. Although archaeal RNase HI sequences conserve the DEDD active-site motif, they lack other common features important for catalytic function, such as the basic protrusion region. Archaeal RNase HI homologs are more closely related to retroviral RNase HI than bacterial and eukaryotic type I RNase H in enzymatic properties.	128
260012	cd09280	RNase_HI_eukaryote_like	Eukaryotic RNase H is essential and is longer and more complex than its prokaryotic counterparts. Ribonuclease H (RNase H) is classified into two families, type 1 (prokaryotic RNase HI, eukaryotic RNase H1 and viral RNase H) and type 2 (prokaryotic RNase HII and HIII, and eukaryotic RNase H2). RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner. RNase H is involved in DNA replication, repair and transcription. One of the important functions of RNase H is to remove Okazaki fragments during DNA replication. RNase H is widely present in various organisms, including bacteria, archaea and eukaryotes, and most prokaryotic and eukaryotic genomes contain multiple RNase H genes. Despite the lack of amino acid sequence homology, type 1 and type 2 RNase H share a main-chain fold and steric configurations of the four acidic active-site (DEDD) residues and have the same catalytic mechanism and functions in cells. Eukaryotic RNase H is longer and more complex than in prokaryotes. Almost all eukaryotic RNase HI have highly conserved regions at their N-termini called hybrid binding domain (HBD). It is speculated that the HBD contributes to binding the RNA/DNA hybrid. Prokaryotes and some single-cell eukaryotes do not require RNase H for viability, but RNase H is essential in higher eukaryotes. RNase H knockout mice lack mitochondrial DNA replication and die as embryos.	145
187753	cd09281	UPF0066	Escherichia coli YaeB and related proteins. Uncharacterized protein family UPF0066.  This domain includes Escherichia coli YaeB, Archaeoglobus fulgidus AF0241, and Agrobacterium tumefaciens VirR.  Proteins with this domain are probable S-adenosylmethionine-dependent methyltransferases but they have not been functionally characterized and the substrate is unknown.	124
185681	cd09286	NMNAT_Eukarya	Nicotinamide/nicotinate mononucleotide adenylyltransferase, Eukaryotic. Nicotinamide/nicotinate mononucleotide (NMN/ NaMN)adenylyltransferase (NMNAT).  NMNAT represents the primary bacterial and eukaryotic adenylyltransferases for nicotinamide-nucleotide and for the deamido form, nicotinate nucleotide.  It is an indispensable enzyme in the biosynthesis of NAD(+) and NADP(+). Nicotinamide-nucleotide adenylyltransferase synthesizes NAD via the salvage pathway, while nicotinate-nucleotide adenylyltransferase synthesizes the immediate precursor of NAD via the de novo pathway. Human NMNAT displays unique dual substrate specificity toward both NMN and NaMN, and can participate in both de novo and salvage pathways of NAD synthesis.  This subfamily consists strictly of eukaryotic members and includes secondary structural elements not found in all NMNATs.	225
185682	cd09287	GluRS_non_core	catalytic core domain of non-discriminating glutamyl-tRNA synthetase. Non-discriminating Glutamyl-tRNA synthetase (GluRS) catalytic core domain. These enzymes attach Glu to the appropriate tRNA. Like other class I tRNA synthetases, they aminoacylate the 2'-OH of the nucleotide at the 3' end of the tRNA. The core domain is based on the Rossmann fold and is responsible for the ATP-dependent formation of the enzyme bound aminoacyl-adenylate. It contains the characteristic class I HIGH and KMSKS motifs, which are involved in ATP binding. These enzymes function as monomers. Archaea and most bacteria lack GlnRS. In these organisms, the "non-discriminating" form of GluRS aminoacylates both tRNA(Glu) and tRNA(Gln) with Glu, which is converted to Gln when appropriate by a transamidation enzyme.	240
187746	cd09288	Photosystem-II_D2	D2 subunit  of photosystem II (PS II). Photosystem II (PS II), D2 subunit.  PS II is a multi-subunit protein found in the photosynthetic membranes of plants, algae, and cyanobacteria.  It utilizes light-induced electron transfer and water-splitting reactions to produce protons, electrons, and molecular oxygen. The protons generated are instrumental in ATP formation.   Molecular dioxygen is released as a by-product. PS II can be described as containing two parts: the photochemical part and the catalytic part. The photochemical portion promotes the fast, efficient light-induced charge separation and stabilization that occur when light is absorbed by chlorophyll. The catalytic portion, where water is oxidized, involves a cluster of Mn ions close to a redox-active tyrosine residue. The Mn cluster and its ligands form a functional unit called the oxygen-evolving complex (OEC) or the water-oxidizing complex (WOC). The D1 and D2 subunits are a pair of intertwined polypeptides. They contain all the cofactors involved directly in water oxidation and plastoquinone reduction.  D1 and D2 are highly homologous and are also similar to the L and M proteins in bacterial photosynthetic reaction centers.	339
187747	cd09289	Photosystem-II_D1	D1 subunit  of photosystem II (PS II). Photosystem II (PS II), D1 subunit.  PS II is a multi-subunit protein found in the photosynthetic membranes of plants, algae, and cyanobacteria.  It utilizes light-induced electron transfer and water-splitting reactions to produce protons, electrons, and molecular oxygen. The protons generated are instrumental in ATP formation.   Molecular dioxygen is released as a by-product. PS II can be described as containing two parts: the photochemical part and the catalytic part. The photochemical portion promotes the fast, efficient light-induced charge separation and stabilization that occur when light is absorbed by chlorophyll. The catalytic portion, where water is oxidized, involves a cluster of Mn ions close to a redox-active tyrosine residue. The Mn cluster and its ligands form a functional unit called the oxygen-evolving complex (OEC) or the water-oxidizing complex (WOC). The D1 and D2 subunits are a pair of intertwined polypeptides. They contain all the cofactors involved directly in water oxidation and plastoquinone reduction. The D1 subunit contains the Mn cluster that constitutes the site of water oxidation. D1 and D2 are highly homologous and are also similar to the L and M proteins in bacterial photosynthetic reaction centers.	338
187748	cd09290	Photo-RC_L	Subunit L of bacterial photosynthetic reaction center. Bacterial photosynthetic reaction center (RC) complex, subunit L. The bacterial photosynthetic reaction center couples light-induced electron transfer with pumping protons across the membrane using reactions involving a quinone molecule (QB) that binds two electrons and two protons at the active site. The reaction center consists of three membrane-bound subunits, designated L, M, and H, plus an additional extracellular cytochrome subunit. The L and M subunits are arranged around an axis of 2-fold rotational symmetry perpendicular to the membrane, forming a scaffold that maintains the cofactors in a precise configuration. The L and M subunits have both sequence and structural similarity, suggesting a common evolutionary origin. The L and M subunits bind noncovalently to the nine cofactors in 2-fold symmetric branches: four bacteriochlorophylls (Bchl), two bacteriopheophytins (Bphe), two ubiquinone molecules (QA and QB), and a non-heme iron. Two Bchls on the periplasmic side of the membrane form the 'special pair' or dimer which is the primary electron donor for the photosynthetic reactions. The electron transfer reaction proceeds from the dimer to an intermediate acceptor (PA), a primary quinone (QA), and a secondary quinone (QB). Protons are translocated from the bacterial cytoplasm to the periplasmic space, generating an electrochemical gradient of protons (the protonmotive force) that can be used to power reactions such as ATP synthesis. The RC complex is found in photosynthetic bacteria, such as purple bacteria and other proteobacteria species.	273
187749	cd09291	Photo-RC_M	Subunit M of bacterial photosynthetic reaction center. Bacterial photosynthetic reaction center (RC) complex, subunit M. The bacterial photosynthetic reaction center couples light-induced electron transfer with pumping protons across the membrane using reactions involving a quinone molecule (QB) that binds two electrons and two protons at the active site. The reaction center consists of three membrane-bound subunits, designated L, M, and H, plus an additional extracellular cytochrome subunit. The L and M subunits are arranged around an axis of 2-fold rotational symmetry perpendicular to the membrane, forming a scaffold that maintains the cofactors in a precise configuration. The L and M subunits have both sequence and structural similarity, suggesting a common evolutionary origin. The L and M subunits bind noncovalently to the nine cofactors in 2-fold symmetric branches: four bacteriochlorophylls (Bchl), two bacteriopheophytins (Bphe), two ubiquinone molecules (QA and QB), and a non-heme iron. Two Bchls on the periplasmic side of the membrane form the 'special pair' or dimer which is the primary electron donor for the photosynthetic reactions. The electron transfer reaction proceeds from the dimer to an intermediate acceptor (PA), a primary quinone (QA), and a secondary quinone (QB). Protons are translocated from the bacterial cytoplasm to the periplasmic space, generating an electrochemical gradient of protons (the protonmotive force) that can be used to power reactions such as ATP synthesis. The RC complex is found in photosynthetic bacteria, such as purple bacteria and other proteobacteria species.	297
187754	cd09293	AMN1	Antagonist of mitotic exit network protein 1. Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells. The function of the vertebrate members of this family has not been determined experimentally; they have fewer LRRs that determine the extent of this model.	226
187755	cd09294	SmpB	Small protein B (SmpB) is a component of the trans-translation system in prokaryotes for releasing stalled ribosomes from damaged messenger RNAs. Small protein B (SmpB) is a component of the trans-translation system in prokaryotes for releasing stalled ribosomes from damaged messenger RNAs and targeting incompletely synthesized protein fragments for degradation. The trans-translation system is composed of a ribonucleoprotein complex of tmRNA, a specialized RNA with properties of both tRNA and mRNA, and SmpB. SmpB is highly conserved and present in all bacterial kingdoms and is also found in some chloroplasts and mitochondria. This suggests that trans-translation arose early in bacterial evolution and that its mechanism is a quality control for protein synthesis in spite of challenges such as transcription errors, mRNA damage, and translation frame shifting. SmpB deletion results in a phage development defect phenotype and absence of tagged proteins translated from defective mRNAs.	116
200495	cd09295	Sema	The Sema domain, a protein interacting module, of semaphorins and plexins. Both semaphorins and plexins have a Sema domain on their N-termini. Plexins function as receptors for the semaphorins. Evolutionarily, plexins may be the ancestor of semaphorins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems, and cancer. Semaphorins can be divided into 7 classes. Vertebrates have members in classes 3-7, whereas classes 1 and 2 are known only in invertebrates. Class 2 and 3 semaphorins are secreted; classes 1 and 4 through 6 are transmembrane proteins; and class 7 is membrane associated via glycosylphosphatidylinositol (GPI) linkage. Plexins are a large family of transmembrane proteins, which are divided into four types (A-D) according to sequence similarity. In vertebrates, type A plexins serve as co-receptors for neuropilins to mediate the signalling of class 3 semaphorins. Plexins serve as direct receptors for several other members of the semaphorin family: class 6 semaphorins signal through type A plexins and class 4 semaphorins through type B plexins. This family also includes the MET and RON receptor tyrosine kinases. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves to recognize and bind receptors.	392
187756	cd09299	TDT	The Tellurite-resistance/Dicarboxylate Transporter (TDT) family. The Tellurite-resistance/Dicarboxylate Transporter (TDT) family includes members from all three kingdoms, but only three members of the family have been functionally characterized: the TehA protein of E. coli functioning as a tellurite-resistance uptake permease, the Mae1 protein of S. pombe functioning in the uptake of malate and other dicarboxylates, and the sulfite efflux pump (SSU1) of Saccharomyces cerevisiae. In plants, the plasma membrane protein SLAC1 (Slow Anion Channel-Associated 1), which is preferentially expressed in guard cells, encodes a distant homolog of fungal and bacterial dicarboxylate/malic acid transport proteins. SLAC1 is essential  in mediating stomatal responses to physiological and stress stimuli. Members of the TDT family exhibit 10 putative transmembrane alpha-helical spanners (TMSs).	326
350171	cd09300	DEAD-like_helicase_C	C-terminal helicase domain of the DEAD-like helicases. This hierarchy of DEAD-like helicases is composed of two superfamilies, SF1 and SF2, that share almost identical folds and extensive structural similarity in their catalytic core. Helicases are involved in ATP-dependent RNA or DNA unwinding. Two distinct types of helicases exist, those forming toroidal, predominantly hexameric structures, and those that do not. SF1 and SF2 helicases do not form toroidal structures, while SF3-6 helicases do. Their conserved helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC.	59
212512	cd09301	HDAC	Histone deacetylase (HDAC) classes I, II, IV and related proteins. The HDAC/HDAC-like family includes Zn-dependent histone deacetylase classes I, II and IV (class III HDACs, also called sirtuins, are NAD-dependent and structurally unrelated, and therefore not part of this family). Histone deacetylases catalyze hydrolysis of N(6)-acetyl-lysine residues in histone amino termini to yield a deacetylated histone (EC 3.5.1.98), as opposed to the acetylation reaction by some histone acetyltransferases (EC 2.3.1.48). Deacetylases of this family are involved in signal transduction through histone and other protein modification, and can repress/activate transcription of a number of different genes. They usually act via the formation of large multiprotein complexes. They are involved in various cellular processes, including cell cycle regulation, DNA damage response, embryonic development, cytokine signaling important for immune response and post-translational control of the acetyl coenzyme A synthetase. In mammals, they are known to be involved in progression of different tumors. Specific inhibitors of mammalian histone deacetylases are an emerging class of promising novel anticancer drugs.	279
187706	cd09302	Jacalin_like	Jacalin-like lectin domain. Jacalin-like lectins are sugar-binding protein domains mostly found in plants. They adopt a beta-prism topology consistent with a circularly permuted three-fold repeat of a structural motif. Proteins containing this domain may bind mono- or oligosaccharides with high specificity. The domain can occur in tandem-repeat arrangements with up to six copies, and in architectures combined with a variety of other functional domains. Taxonomic distribution is not restricted to plants; the domain is also found in various mammalian proteins, for example.	128
187757	cd09317	TDT_Mae1_like	C4-dicarboxylate transporter/malic acid transport protein family includes Mae1. This family contains eukaryotic homologs of C4-dicarboxylate transporter/malic acid transport proteins which are part of the Tellurite-resistance/Dicarboxylate Transporter (TDT) family. This includes the Schizosaccharomyces pombe MAE1 gene that encodes malate permease, Mae1, which functions by proton symport and transports C4-dicarboxylates (malate, fumarate, succinate, oxaloacetate, etc.), but not alpha-ketoglutarate.	330
187758	cd09318	TDT_SSU1	Tellurite-resistance/Dicarboxylate Transporter (TDT) family includes sulfite sensitivity protein (sulfite efflux pump; SSU1). This family contains the sulfite sensitivity protein (sulfite efflux pump; SSU1) and belongs to the tellurite-resistance/dicarboxylate transporter (TDT) family. The SSU1 gene encodes the sulfite pump required for efficient sulfite efflux. Mutations in the SSU1 gene cause sensitivity to sulfite while overexpression confers heightened resistance to sulfite toxicity. In dermatophytes and other filamentous fungi, sulfite is excreted as a reducing agent during keratin degradation; thus sulfite transporters in keratinolytic fungi could be a new target for antifungal drugs in dermatology. The number of genes encoding sulfite efflux pumps in fungal genomes varies from species to species.	341
187759	cd09319	TDT_like_1	The Tellurite-resistance/Dicarboxylate Transporter (TDT) family. The Tellurite-resistance/Dicarboxylate Transporter (TDT) family includes members from all three kingdoms, but only three members of the family have been functionally characterized: the TehA protein of E. coli functioning as a tellurite-resistance uptake permease, the Mae1 protein of S. pombe functioning in the uptake of malate and other dicarboxylates, and the sulfite efflux pump (SSU1) of Saccharomyces cerevisiae. In plants, the plasma membrane protein SLAC1 (Slow Anion Channel-Associated 1), which is preferentially expressed in guard cells, encodes a distant homolog of fungal and bacterial dicarboxylate/malic acid transport proteins. SLAC1 is essential  in mediating stomatal responses to physiological and stress stimuli. Members of the TDT family exhibit 10 putative transmembrane alpha-helical spanners (TMSs).	317
187760	cd09320	TDT_like_2	The Tellurite-resistance/Dicarboxylate Transporter (TDT) family. The Tellurite-resistance/Dicarboxylate Transporter (TDT) family includes members from all three kingdoms, but only three members of the family have been functionally characterized: the TehA protein of E. coli functioning as a tellurite-resistance uptake permease, the Mae1 protein of S. pombe functioning in the uptake of malate and other dicarboxylates, and the sulfite efflux pump (SSU1) of Saccharomyces cerevisiae. In plants, the plasma membrane protein SLAC1 (Slow Anion Channel-Associated 1), which is preferentially expressed in guard cells, encodes a distant homolog of fungal and bacterial dicarboxylate/malic acid transport proteins. SLAC1 is essential  in mediating stomatal responses to physiological and stress stimuli. Members of the TDT family exhibit 10 putative transmembrane alpha-helical spanners (TMSs).	327
187761	cd09321	TDT_like_3	The Tellurite-resistance/Dicarboxylate Transporter (TDT) family. The Tellurite-resistance/Dicarboxylate Transporter (TDT) family includes members from all three kingdoms, but only three members of the family have been functionally characterized: the TehA protein of E. coli functioning as a tellurite-resistance uptake permease, the Mae1 protein of S. pombe functioning in the uptake of malate and other dicarboxylates, and the sulfite efflux pump (SSU1) of Saccharomyces cerevisiae. In plants, the plasma membrane protein SLAC1 (Slow Anion Channel-Associated 1), which is preferentially expressed in guard cells, encodes a distant homolog of fungal and bacterial dicarboxylate/malic acid transport proteins. SLAC1 is essential  in mediating stomatal responses to physiological and stress stimuli. Members of the TDT family exhibit 10 putative transmembrane alpha-helical spanners (TMSs).	327
187762	cd09322	TDT_TehA_like	The Tellurite-resistance/Dicarboxylate Transporter (TDT) family includes TehA proteins. The Tellurite-resistance/Dicarboxylate Transporter (TDT) family includes members from all three kingdoms, but only three members of the family have been functionally characterized: the TehA protein of E. coli functioning as a tellurite-resistance uptake permease, the Mae1 protein of S. pombe functioning in the uptake of malate and other dicarboxylates, and the sulfite efflux pump (SSU1) of Saccharomyces cerevisiae. In plants, the plasma membrane protein SLAC1 (Slow Anion Channel-Associated 1), which is preferentially expressed in guard cells, encodes a distant homolog of fungal and bacterial dicarboxylate/malic acid transport proteins. SLAC1 is essential  in mediating stomatal responses to physiological and stress stimuli. Members of the TDT family exhibit 10 putative transmembrane alpha-helical spanners (TMSs).	289
187763	cd09323	TDT_SLAC1_like	Tellurite-resistance/Dicarboxylate Transporter (TDT) family includes SLAC1 (Slow Anion Channel-Associated 1). SLAC1 (Slow Anion Channel-Associated 1) is a plasma membrane protein, preferentially expressed in guard cells, which encodes a distant homolog of fungal and bacterial dicarboxylate/malic acid transport proteins. It is essential for stomatal closure in response to carbon dioxide, abscisic acid, ozone, light/dark transitions, humidity change, calcium ions, hydrogen peroxide and nitric oxide. In the Arabidopsis genome, SLAC1 is part of a gene family with five members and encodes a membrane protein that has ten putative transmembrane domains flanked by large N- and C-terminal domains. Mutations in SLAC1 impair slow (S-type) anion channel currents that are activated by cytosolic calcium ions and abscisic acid, but do not affect rapid (R-type) anion channel currents or calcium ion channel function.	297
187764	cd09324	TDT_TehA	Tellurite-resistance/Dicarboxylate Transporter (TDT) family includes TehA protein. This subfamily includes Tellurite resistance protein TehA that belongs to the C4-dicarboxylate transporter/malic acid transport (TDT) protein family and is a homolog of plant Slow Anion Channel-Associated 1 (SLAC1). The tehA gene encodes an integral membrane protein that has been shown to have efflux activity of quaternary ammonium compounds. TehA protein of Escherichia coli functions as a tellurite-resistance uptake permease.	301
187765	cd09325	TDT_C4-dicarb_trans	C4-dicarboxylate transporters of the Tellurite-resistance/Dicarboxylate Transporter (TDT) family. This subfamily contains bacterial C4-dicarboxylate transporters, which are part of the Tellurite-resistance/Dicarboxylate Transporter (TDT) family. It includes Tellurite resistance protein tehA; the tehA gene encodes an integral membrane protein that has been shown to have efflux activity of quaternary ammonium compounds. TehA protein of Escherichia coli functions as a tellurite-resistance uptake permease.	293
188712	cd09326	LIM_CRP_like	The LIM domains of Cysteine Rich Protein (CRP) family. The LIM domains of Cysteine Rich Protein (CRP) family: Cysteine-rich proteins (CRPs) are characterized by the presence of two LIM domains linked to short glycine-rich repeats (GRRs). The known CRP family members include CRP1, CRP2, and CRP3/MLP. CRP1, CRP2 and CRP3 share a conserved nuclear targeting signal (K/R-K/R-Y-G-P-K), which supports the fact that these proteins function not only in the cytoplasm but also in the nucleus. CRPs control regulatory pathways during cellular differentiation, and are involved in complex transcription control, and the organization as well as the arrangement of the myofibrillar/cytoskeletal network. CRP1, CRP2, and CRP3/MLP are involved in promoting protein assembly along the actin-based cytoskeleton. All LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	53
188713	cd09327	LIM1_abLIM	The first LIM domain of actin binding LIM (abLIM) proteins. The first LIM domain of actin binding LIM (abLIM) proteins:  Three homologous members of the abLIM protein family have been identified; abLIM-1, abLIM-2 and abLIM-3. The N-terminal of abLIM consists of four tandem repeats of LIM domains and the C-terminal of actin binding LIM protein is a villin headpiece domain, which has strong actin binding activity. The abLIM-1, which is expressed in retina, brain, and muscle tissue, has been indicated to function as a tumor suppressor. AbLIM-2 and -3, mainly expressed in muscle and neuronal tissue, bind to F-actin strongly.  They may serve as a scaffold for signaling modules of the actin cytoskeleton and thereby modulate transcription. It has been shown that LIM domains of abLIMs interact with STARS (striated muscle activator of Rho signaling), which directly binds actin and stimulates serum-response factor (SRF)-dependent transcription. All LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	52
188714	cd09328	LIM2_abLIM	The second LIM domain of actin binding LIM (abLIM) proteins. The second LIM domain of actin binding LIM (abLIM) proteins:  Three homologous members of the abLIM protein family have been identified; abLIM-1, abLIM-2 and abLIM-3. The N-terminal of abLIM consists of four tandem repeats of LIM domains and the C-terminal of actin binding LIM protein is a villin headpiece domain, which has strong actin binding activity. The abLIM-1, which is expressed in retina, brain, and muscle tissue, has been indicated to function as a tumor suppressor. AbLIM-2 and -3, mainly expressed in muscle and neuronal tissue, bind to F-actin strongly.  They may serve as a scaffold for signaling modules of the actin cytoskeleton and thereby modulate transcription. It has been shown that LIM domains of abLIMs interact with STARS (striated muscle activator of Rho signaling), which directly binds actin and stimulates serum-response factor (SRF)-dependent transcription. All LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	56
188715	cd09329	LIM3_abLIM	The third LIM domain of actin binding LIM (abLIM) proteins. The third LIM domain of actin binding LIM (abLIM) proteins: Three homologous members of the abLIM protein family have been identified; abLIM-1, abLIM-2 and abLIM-3. The N-terminal of abLIM consists of four tandem repeats of LIM domains and the C-terminal of actin binding LIM protein is a villin headpiece domain, which has strong actin binding activity. The abLIM-1, which is expressed in retina, brain, and muscle tissue, has been indicated to function as a tumor suppressor. AbLIM-2 and -3, mainly expressed in muscle and neuronal tissue, bind to F-actin strongly.  They may serve as a scaffold for signaling modules of the actin cytoskeleton and thereby modulate transcription. It has been shown that LIM domains of abLIMs interact with STARS (striated muscle activator of Rho signaling), which directly binds actin and stimulates serum-response factor (SRF)-dependent transcription. All LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	52
188716	cd09330	LIM4_abLIM	The fourth LIM domain of actin binding LIM (abLIM) proteins. The fourth LIM domain of actin binding LIM (abLIM) proteins: Three homologous members of the abLIM protein family have been identified; abLIM-1, abLIM-2 and abLIM-3. The N-terminal of abLIM consists of four tandem repeats of LIM domains and the C-terminal of actin binding LIM protein is a villin headpiece domain, which has strong actin binding activity. The abLIM-1, which is expressed in retina, brain, and muscle tissue, has been indicated to function as a tumor suppressor. AbLIM-2 and -3, mainly expressed in muscle and neuronal tissue, bind to F-actin strongly.  They may serve as a scaffold for signaling modules of the actin cytoskeleton and thereby modulate transcription. It has been shown that LIM domains of abLIMs interact with STARS (striated muscle activator of Rho signaling), which directly binds actin and stimulates serum-response factor (SRF)-dependent transcription. All LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	56
188717	cd09331	LIM1_PINCH	The first LIM domain of protein PINCH. The first LIM domain of paxillin: Paxillin is an adaptor protein, which recruits key components of the signal-transduction machinery to specific sub-cellular locations to respond to environmental changes rapidly. The C-terminal region of paxillin contains four LIM domains which target paxillin to focal adhesions, presumably through a direct association with the cytoplasmic tail of beta-integrin. The N-terminal of paxillin is leucine-rich LD-motifs. Paxillin is found at the interface between the plasma membrane and the actin cytoskeleton. The binding partners of paxillin are diverse and include protein tyrosine kinases, such as Src and FAK, structural proteins, such as vinculin and actopaxin, and regulators of actin organization. Paxillin recruits these proteins to their function sites to control the dynamic changes in cell adhesion, cytoskeletal reorganization and gene expression. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	59
188718	cd09332	LIM2_PINCH	The second LIM domain of protein PINCH. The second LIM domain of protein PINCH: PINCH plays a pivotal role in the assembly of focal adhesions (FAs), regulating diverse functions in cell adhesion, growth, and differentiation through LIM-mediated protein-protein interactions. PINCH comprises an array of five LIM domains that interact with integrin-linked kinase (ILK), Nck2 (also called Nckbeta or Grb4) and other interaction partners.  These interactions are essential for triggering the FA assembly and for relaying diverse mechanical and biochemical signals between Cell-extracellular matrix and the actin cytoskeleton.  LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	52
188719	cd09333	LIM3_PINCH	The third LIM domain of protein PINCH. The third LIM domain of protein PINCH:  PINCH plays pivotal roles in the assembly of focal adhesions (FAs), regulating diverse functions in cell adhesion, growth, and differentiation through LIM-mediated protein-protein interactions. PINCH comprises an array of five LIM domains that interact with integrin-linked kinase (ILK), Nck2 (also called Nckbeta or Grb4) and other interaction partners.  These interactions are essential for triggering the FA assembly and for relaying diverse mechanical and biochemical signals between Cell-extracellular matrix and the actin cytoskeleton.  LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	51
188720	cd09334	LIM4_PINCH	The fourth LIM domain of protein PINCH. The fourth LIM domain of protein PINCH: PINCH plays a pivotal role in the assembly of focal adhesions (FAs), regulating diverse functions in cell adhesion, growth, and differentiation through LIM-mediated protein-protein interactions. PINCH comprises an array of five LIM domains that interact with integrin-linked kinase (ILK), Nck2 (also called Nckbeta or Grb4) and other interaction partners. These interactions are essential for triggering the FA assembly and for relaying diverse mechanical and biochemical signals between Cell-extracellular matrix and the actin cytoskeleton.  The PINCH LIM4 domain recognizes the third SH3 domain of another adaptor protein, Nck2. This step is an important component of integrin signaling event. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	54
188721	cd09335	LIM5_PINCH	The fifth LIM domain of protein PINCH. The fifth LIM domain of protein PINCH:  PINCH plays pivotal roles in the assembly of focal adhesions (FAs), regulating diverse functions in cell adhesion, growth, and differentiation through LIM-mediated protein-protein interactions. PINCH comprises an array of five LIM domains that interact with integrin-linked kinase (ILK), Nck2 (also called Nckbeta or Grb4) and other interaction partners.  These interactions are essential for triggering the FA assembly and for relaying diverse mechanical and biochemical signals between Cell-extracellular matrix and the actin cytoskeleton. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	54
259830	cd09336	LIM1_Paxillin_like	The first LIM domain of the paxillin like protein family. The first LIM domain of the paxillin like protein family: This family consists of paxillin, leupaxin, Hic-5 (ARA55), and other related proteins. There are four LIM domains in the C-terminal of the proteins and leucine-rich LD-motifs in the N-terminal region.  Members of this family are adaptor proteins to recruit key components of signal-transduction machinery to specific sub-cellular locations. Paxillin is found at the interface between the plasma membrane and the actin cytoskeleton. Paxillin serves as a platform for the recruitment of numerous regulatory and structural proteins that together control the dynamic changes in cell adhesion, cytoskeletal reorganization and gene expression that are necessary for cell migration and survival. Leupaxin is a cytoskeleton adaptor protein, which is preferentially expressed in hematopoietic cells. It associates with focal adhesion kinases PYK2 and pp125FAK and identified to be a component of the osteoclast pososomal signaling complex. Hic-5 controls cell proliferation, migration and senescence by functioning as coactivator for steroid receptors such as androgen receptor, glucocorticoid receptor and progesterone receptor. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	53
188723	cd09337	LIM2_Paxillin_like	The second LIM domain of the paxillin like protein family. The second LIM domain of the paxillin like protein family: This family consists of paxillin, leupaxin, Hic-5 (ARA55), and other related proteins. There are four LIM domains in the C-terminal of the proteins and leucine-rich LD-motifs in the N-terminal region.  Members of this family are adaptor proteins to recruit key components of signal-transduction machinery to specific sub-cellular locations. Paxillin is found at the interface between the plasma membrane and the actin cytoskeleton. Paxillin serves as a platform for the recruitment of numerous regulatory and structural proteins that together control the dynamic changes in cell adhesion, cytoskeletal reorganization and gene expression that are necessary for cell migration and survival. Leupaxin is a cytoskeleton adaptor protein, which is preferentially expressed in hematopoietic cells. It associates with focal adhesion kinases PYK2 and pp125FAK and identified to be a component of the osteoclast pososomal signaling complex. Hic-5 controls cell proliferation, migration and senescence by functioning as coactivator for steroid receptors such as androgen receptor, glucocorticoid receptor and progesterone receptor. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	52
188724	cd09338	LIM3_Paxillin_like	The third LIM domain of the paxillin like protein family. The third LIM domain of the paxillin like protein family: This family consists of paxillin, leupaxin, Hic-5 (ARA55), and other related proteins. There are four LIM domains in the C-terminal of the proteins and leucine-rich LD-motifs in the N-terminal region.  Members of this family are adaptor proteins to recruit key components of signal-transduction machinery to specific sub-cellular locations. Paxillin is found at the interface between the plasma membrane and the actin cytoskeleton. Paxillin serves as a platform for the recruitment of numerous regulatory and structural proteins that together control the dynamic changes in cell adhesion, cytoskeletal reorganization and gene expression that are necessary for cell migration and survival. Leupaxin is a cytoskeleton adaptor protein, which is preferentially expressed in hematopoietic cells. It associates with focal adhesion kinases PYK2 and pp125FAK and identified to be a component of the osteoclast pososomal signaling complex. Hic-5 controls cell proliferation, migration and senescence by functioning as coactivator for steroid receptors such as androgen receptor, glucocorticoid receptor and progesterone receptor. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	53
188725	cd09339	LIM4_Paxillin_like	The fourth LIM domain of the Paxillin-like protein family. The fourth LIM domain of the Paxillin like protein family: This family consists of paxillin, leupaxin, Hic-5 (ARA55), and other related proteins. There are four LIM domains in the C-terminal of the proteins and leucine-rich LD-motifs in the N-terminal region.  Members of this family are adaptor proteins to recruit key components of signal-transduction machinery to specific sub-cellular locations. Paxillin is found at the interface between the plasma membrane and the actin cytoskeleton. Paxillin serves as a platform for the recruitment of numerous regulatory and structural proteins that together control the dynamic changes in cell adhesion, cytoskeletal reorganization and gene expression that are necessary for cell migration and survival. Leupaxin is a cytoskeleton adaptor protein, which is preferentially expressed in hematopoietic cells. It associates with focal adhesion kinases PYK2 and pp125FAK and identified to be a component of the osteoclast pososomal signaling complex. Hic-5 controls cell proliferation, migration and senescence by functioning as coactivator for steroid receptors such as androgen receptor, glucocorticoid receptor and progesterone receptor. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	52
188726	cd09340	LIM1_Testin_like	The first LIM domain of Testin-like family. The first LIM domain of Testin_like family: This family includes testin, prickle, dyxin and LIMPETin. Structurally, testin and prickle proteins contain three LIM domains at C-terminal; LIMPETin has six LIM domains; and dyxin presents only two LIM domains. However, all members of the family contain a PET protein-protein interaction domain.  Testin is a cytoskeleton associated focal adhesion protein that localizes along actin stress fibers, at cell-cell-contact areas, and at focal adhesion plaques. Testin interacts with a variety of cytoskeletal proteins, including zyxin, mena, VASP, talin, and actin and it is involved in cell motility and adhesion events. Prickles have been implicated in roles of regulating tissue polarity or planar cell polarity (PCP).  Dyxin involves in lung and heart development by interaction with GATA6 and blocking GATA6 activated target genes. LIMPETin might be the recombinant product of genes coding testin and four and half LIM proteins and its function is not well understood. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	58
188727	cd09341	LIM2_Testin_like	The second LIM domain of Testin-like family. The second LIM domain of Testin-like family: This family includes testin, prickle, dyxin and LIMPETin. Structurally, testin and prickle proteins contain three LIM domains at C-terminal; LIMPETin has six LIM domains; and dyxin presents only two LIM domains. However, all members of the family contain a PET protein-protein interaction domain.  Testin is a cytoskeleton associated focal adhesion protein that localizes along actin stress fibers, at cell-cell-contact areas, and at focal adhesion plaques. Testin interacts with a variety of cytoskeletal proteins, including zyxin, mena, VASP, talin, and actin and it is involved in cell motility and adhesion events. Prickles have been implicated in roles of regulating tissue polarity or planar cell polarity (PCP).  Dyxin involves in lung and heart development by interaction with GATA6 and blocking GATA6 activated target genes. LIMPETin might be the recombinant product of genes coding testin and four and half LIM proteins and its function is not well understood. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	56
188728	cd09342	LIM3_Testin_like	The third LIM domain of Testin-like family. The third LIM domain of Testin_like family: This family includes testin, prickle, dyxin and LIMPETin. Structurally, testin and prickle proteins contain three LIM domains at C-terminal; LIMPETin has six LIM domains; and dyxin presents only two LIM domains. However, all members of the family contain a PET protein-protein interaction domain. Testin is a cytoskeleton associated focal adhesion protein that localizes along actin stress fibers, at cell-cell-contact areas, and at focal adhesion plaques. Testin interacts with a variety of cytoskeletal proteins, including zyxin, mena, VASP, talin, and actin and it is involved in cell motility and adhesion events. Prickles have been implicated in roles of regulating tissue polarity or planar cell polarity (PCP).  Dyxin involves in lung and heart development by interaction with GATA6 and blocking GATA6 activated target genes. LIMPETin might be the recombinant product of genes coding testin and four and half LIM proteins and its function is not well understood. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	57
188729	cd09343	LIM1_FHL	The first LIM domain of Four and a half LIM domains protein (FHL). The first LIM domain of Four and a half LIM domains protein (FHL): LIM-only protein family consists of five members, designated FHL1, FHL2, FHL3, FHL5 and LIMPETin. The first four members are composed of four complete LIM domains arranged in tandem and an N-terminal single zinc finger domain with a consensus sequence equivalent to the C-terminal half of a LIM domain. LIMPETin is an exception, containing six LIM domains. FHL1, 2 and 3 are predominantly expressed in muscle tissues, and FHL5 is highly expressed in male germ cells.  FHL proteins exert their roles as transcription co-activators or co-repressors through a wide array of interaction partners. For example, FHL1 binds to Myosin-binding protein C, regulating myosin filament formation and sarcomere assembly. FHL2 has shown to interact with more than 50 different proteins, including receptors, structural proteins, transcription factors and cofactors, signal transducers, splicing factors, DNA replication and repair enzymes, and metabolic enzymes. FHL3 interacts with many transcription factors, such as CREB, BKLF/KLF3, CtBP2, MyoD, and MZF_1. FHL5 is a tissue-specific coactivator of CREB/CREM family transcription factors. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	59
188730	cd09344	LIM1_FHL1	The first LIM domain of Four and a half LIM domains protein 1. The first LIM domain of Four and a half LIM domains protein 1 (FHL1):  FHL1 is heavily expressed in skeletal and cardiac muscles. It plays important roles in muscle growth, differentiation, and sarcomere assembly by acting as a modulator of transcription factors. Defects in FHL1 gene are responsible for a number of Muscular dystrophy-like muscle disorders. It has been detected that FHL1 binds to Myosin-binding protein C, regulating myosin filament formation and sarcomere assembly. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 	54
188731	cd09345	LIM2_FHL	The second LIM domain of Four and a half LIM domains protein (FHL). The second LIM domain of Four and a half LIM domains protein (FHL): LIM-only protein family consists of five members, designated FHL1, FHL2, FHL3, FHL5 and LIMPETin. The first four members are composed of four complete LIM domains arranged in tandem and an N-terminal single zinc finger domain with a consensus sequence equivalent to the C-terminal half of a LIM domain. LIMPETin is an exception, containing six LIM domains. FHL1, 2 and 3 are predominantly expressed in muscle tissues, and FHL5 is highly expressed in male germ cells.  FHL proteins exert their roles as transcription co-activators or co-repressors through a wide array of interaction partners. For example, FHL1 binds to Myosin-binding protein C, regulating myosin filament formation and sarcomere assembly. FHL2 has shown to interact with more than 50 different proteins, including receptors, structural proteins, transcription factors and cofactors, signal transducers, splicing factors, DNA replication and repair enzymes, and metabolic enzymes. FHL3 interacts with many transcription factors, such as CREB, BKLF/KLF3, CtBP2, MyoD, and MZF_1. FHL5 is a tissue-specific coactivator of CREB/CREM family transcription factors. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	54
188732	cd09346	LIM3_FHL	The third LIM domain of Four and a half LIM domains protein (FHL). The third LIM domain of Four and a half LIM domains protein (FHL): LIM-only protein family consists of five members, designated FHL1, FHL2, FHL3, FHL5 and LIMPETin. The first four members are composed of four complete LIM domains arranged in tandem and an N-terminal single zinc finger domain with a consensus sequence equivalent to the C-terminal half of a LIM domain. LIMPETin is an exception, containing six LIM domains. FHL1, 2 and 3 are predominantly expressed in muscle tissues, and FHL5 is highly expressed in male germ cells.  FHL proteins exert their roles as transcription co-activators or co-repressors through a wide array of interaction partners. For example, FHL1 binds to Myosin-binding protein C, regulating myosin filament formation and sarcomere assembly. FHL2 has shown to interact with more than 50 different proteins, including receptors, structural proteins, transcription factors and cofactors, signal transducers, splicing factors, DNA replication and repair enzymes, and metabolic enzymes. FHL3 interacts with many transcription factors, such as CREB, BKLF/KLF3, CtBP2, MyoD, and MZF_1. FHL5 is a tissue-specific coactivator of CREB/CREM family transcription factors. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	52
188733	cd09347	LIM4_FHL	The fourth LIM domain of Four and a half LIM domains protein (FHL). The fourth LIM domain of Four and a half LIM domains protein (FHL): LIM-only protein family consists of five members, designated FHL1, FHL2, FHL3, FHL5 and LIMPETin. The first four members are composed of four complete LIM domains arranged in tandem and an N-terminal single zinc finger domain with a consensus sequence equivalent to the C-terminal half of a LIM domain. LIMPETin is an exception, containing six LIM domains. FHL1, 2 and 3 are predominantly expressed in muscle tissues, and FHL5 is highly expressed in male germ cells.  FHL proteins exert their roles as transcription co-activators or co-repressors through a wide array of interaction partners. For example, FHL1 binds to Myosin-binding protein C, regulating myosin filament formation and sarcomere assembly. FHL2 has shown to interact with more than 50 different proteins, including receptors, structural proteins, transcription factors and cofactors, signal transducers, splicing factors, DNA replication and repair enzymes, and metabolic enzymes. FHL3 interacts with many transcription factors, such as CREB, BKLF/KLF3, CtBP2, MyoD, and MZF_1. FHL5 is a tissue-specific coactivator of CREB/CREM family transcription factors. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	56
188734	cd09348	LIM4_FHL1	The fourth LIM domain of Four and a half LIM domains protein 1 (FHL1). The fourth LIM domain of Four and a half LIM domains protein 1 (FHL1):  FHL1 is heavily expressed in skeletal and cardiac muscles. It plays important roles in muscle growth, differentiation, and sarcomere assembly by acting as a modulator of transcription factors. Defects in FHL1 gene are responsible for a number of Muscular dystrophy-like muscle disorders. It has been detected that FHL1 binds to Myosin-binding protein C, regulating myosin filament formation and sarcomere assembly. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	64
188735	cd09349	LIM1_Zyxin	The first LIM domain of Zyxin. The first LIM domain of Zyxin: Zyxin exhibits three copies of the LIM domain, an extensive proline-rich domain and a nuclear export signal.  Localized at sites of cell substratum adhesion in fibroblasts, Zyxin interacts with alpha-actinin, members of the cysteine-rich protein (CRP) family, proteins that display Src homology 3 (SH3) domains and Ena/VASP family members. Zyxin and its partners have been implicated in the spatial control of actin filament assembly as well as in pathways important for cell differentiation. In addition to its functions at focal adhesion plaques, recent work has shown that zyxin moves from the sites of cell contacts to the nucleus, where it directly participates in the regulation of gene expression. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein.	87
188736	cd09350	LIM1_TRIP6	The first LIM domain of Thyroid receptor-interacting protein 6 (TRIP6). The first LIM domain of Thyroid receptor-interacting protein 6 (TRIP6): TRIP6 is a member of the zyxin LIM protein family and contains three LIM zinc-binding domains at the C-terminal. TRIP6 protein localizes to focal adhesion sites and along actin stress fibers. Recruitment of this protein to the plasma membrane occurs in a lysophosphatidic acid (LPA)-dependent manner. TRIP6 recruits a number of molecules involved in actin assembly, cell motility, survival and transcriptional control. The function of TRIP6 in cell motility is regulated by Src-dependent phosphorylation at a Tyr residue. The phosphorylation activates the coupling to the Crk SH2 domain, which is required for the function of TRIP6 in promoting lysophosphatidic acid (LPA)-induced cell migration. TRIP6 can shuttle to the nucleus to serve as a coactivator of AP-1 and NF-kappaB transcriptional factors. Moreover, TRIP6 can form a ternary complex with the NHERF2 PDZ protein and LPA2 receptor to regulate LPA-induced activation of ERK and AKT, rendering cells resistant to chemotherapy. Recent evidence shows that TRIP6 antagonizes Fas-Induced apoptosis by enhancing the antiapoptotic effect of LPA in cells. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein.	54
188737	cd09351	LIM1_LPP	The first LIM domain of lipoma preferred partner (LPP). The first LIM domain of lipoma preferred partner (LPP): LPP is a member of the zyxin LIM protein family and contains three LIM zinc-binding domains at the C-terminal and proline-rich region at the N-terminal.  LPP was initially identified as the most frequent translocation partner of HMGA2 (High Mobility Group A2) in a subgroup of benign tumors of adipose tissue (lipomas). It was also shown to be rearranged in a number of other soft tissues, as well as in a case of acute monoblastic leukemia. In addition to its involvement in tumors, LPP was identified as a smooth muscle restricted LIM protein that plays an important role in SMC migration. LPP is localized at sites of cell adhesion, cell-cell contacts and transiently in the nucleus. In the nucleus, it acts as a coactivator for the ETS domain transcription factor PEA3. In addition to PEA3, it interacts with alpha-actinin, vasodilator-stimulated phosphoprotein (VASP), Palladin, and Scrib. The LIM domains are the main focal adhesion targeting elements, and the proline-rich region, which harbors binding sites for alpha-actinin and vasodilator-stimulated phosphoprotein (VASP), has a weak targeting capacity. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein.	54
188738	cd09352	LIM1_Ajuba_like	The first LIM domain of Ajuba-like proteins. The first LIM domain of Ajuba-like proteins: Ajuba like LIM protein family includes three highly homologous proteins Ajuba, Limd1, and WTIP. Members of the family contain three tandem C-terminal LIM domains and a proline-rich N-terminal region. This family of proteins functions as scaffolds, participating in the assembly of numerous protein complexes. In the cytoplasm, Ajuba binds Grb2 to modulate serum-stimulated ERK activation. Ajuba also recruits the TNF receptor-associated factor 6 (TRAF6) to p62 and activates PKCKappa activity. Ajuba interacts with alpha-catenin and F-actin to contribute to the formation or stabilization of adheren junctions by linking adhesive receptors to the actin cytoskeleton. Although Ajuba is a cytoplasmic protein, it can shuttle into the nucleus. In nucleus, Ajuba functions as a corepressor for the zinc finger-protein Snail. It binds to the SNAG repression domain of Snail through its LIM region.  Arginine methyltransferase-5 (Prmt5), a protein in the complex, is recruited to Snail through an interaction with Ajuba. This ternary complex functions to repress E-cadherin, a Snail target gene. In addition, Ajuba contains functional nuclear-receptor interacting motifs and selectively interacts with retinoic acid receptors (RARs) and rexinoid receptor (RXRs) to negatively regulate retinoic acid signaling. Wtip, the Wt1-interacting protein, was originally identified as an interaction partner of the Wilms tumour protein 1 (WT1). Wtip is involved in kidney and neural crest development. Wtip interacts with the receptor tyrosine kinase Ror2 and inhibits canonical Wnt signaling. LIMD1 was reported to inhibit cell growth and metastases. The inhibition may be mediated through an interaction with the protein barrier-to-autointegration (BAF), a component of SWI/SNF chromatin-remodeling protein; or through the interaction with retinoblastoma protein (pRB), resulting in inhibition of E2F-mediated transcription, and expression of the majority of genes with E2F1-responsive elements. Recently, Limd1 was shown to interact with the p62/sequestosome protein and influence IL-1 and RANKL signaling by facilitating the assembly of a p62/TRAF6/a-PKC multi-protein complex. The Limd1-p62 interaction affects both NF-kappaB and AP-1 activity in epithelial cells and osteoclasts. Moreover, LIMD1 functions as tumor repressor to block lung tumor cell line in vitro and in vivo. Recent studies revealed that LIM proteins Wtip, LIMD1 and Ajuba interact with components of RNA induced silencing complexes (RISC) as well as eIF4E and the mRNA m7GTP cap-protein complex and are required for microRNA-mediated gene silencing.  As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein.	54
188739	cd09353	LIM2_Zyxin	The second LIM domain of Zyxin. The second LIM domain of Zyxin: Zyxin exhibits three copies of the LIM domain, an extensive proline-rich domain and a nuclear export signal. Localized at sites of cell-substratum adhesion in fibroblasts, Zyxin interacts with alpha-actinin, members of the cysteine-rich protein (CRP) family, proteins that display Src homology 3 (SH3) domains and Ena/VASP family members. Zyxin and its partners have been implicated in the spatial control of actin filament assembly as well as in pathways important for cell differentiation. In addition to its functions at focal adhesion plaques, recent work has shown that zyxin moves from the sites of cell contacts to the nucleus, where it directly participates in the regulation of gene expression. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	60
188740	cd09354	LIM2_LPP	The second LIM domain of lipoma preferred partner (LPP). The second LIM domain of lipoma preferred partner (LPP): LPP is a member of the zyxin LIM protein family and contains three LIM zinc-binding domains at the C-terminus and a proline-rich region at the N-terminus. LPP was initially identified as the most frequent translocation partner of HMGA2 (High Mobility Group A2) in a subgroup of benign tumors of adipose tissue (lipomas). It was also shown to be rearranged in a number of other soft tissues, as well as in a case of acute monoblastic leukemia. In addition to its involvement in tumors, LPP was identified as a smooth muscle restricted LIM protein that plays an important role in SMC migration. LPP is localized at sites of cell adhesion, cell-cell contacts and transiently in the nucleus. In the nucleus, it acts as a coactivator for the ETS domain transcription factor PEA3. In addition to PEA3, it interacts with alpha-actinin, vasodilator-stimulated phosphoprotein (VASP), Palladin, and Scrib. The LIM domains are the main focal adhesion targeting elements, and the proline-rich region, which harbors binding sites for alpha-actinin and vasodilator-stimulated phosphoprotein (VASP), has a weak targeting capacity. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	60
188741	cd09355	LIM2_Ajuba_like	The second LIM domain of Ajuba-like proteins. The second LIM domain of Ajuba-like proteins: The Ajuba-like LIM protein family includes three highly homologous proteins: Ajuba, Limd1, and WTIP. Members of the family contain three tandem C-terminal LIM domains and a proline-rich N-terminal region. This family of proteins functions as scaffolds, participating in the assembly of numerous protein complexes. In the cytoplasm, Ajuba binds Grb2 to modulate serum-stimulated ERK activation. Ajuba also recruits the TNF receptor-associated factor 6 (TRAF6) to p62 and activates PKCKappa activity. Ajuba interacts with alpha-catenin and F-actin to contribute to the formation or stabilization of adherens junctions by linking adhesive receptors to the actin cytoskeleton. Although Ajuba is a cytoplasmic protein, it can shuttle into the nucleus. In the nucleus, Ajuba functions as a corepressor for the zinc finger protein Snail. It binds to the SNAG repression domain of Snail through its LIM region. Arginine methyltransferase-5 (Prmt5), a protein in the complex, is recruited to Snail through an interaction with Ajuba. This ternary complex functions to repress E-cadherin, a Snail target gene. In addition, Ajuba contains functional nuclear-receptor interacting motifs and selectively interacts with retinoic acid receptors (RARs) and rexinoid receptors (RXRs) to negatively regulate retinoic acid signaling. Wtip, the Wt1-interacting protein, was originally identified as an interaction partner of the Wilms tumour protein 1 (WT1). Wtip is involved in kidney and neural crest development. Wtip interacts with the receptor tyrosine kinase Ror2 and inhibits canonical Wnt signaling. LIMD1 was reported to inhibit cell growth and metastases. The inhibition may be mediated through an interaction with the protein barrier-to-autointegration factor (BAF), a component of SWI/SNF chromatin-remodeling protein; or through the interaction with retinoblastoma protein (pRB), resulting in inhibition of E2F-mediated transcription, and expression of the majority of genes with E2F1-responsive elements. Recently, Limd1 was shown to interact with the p62/sequestosome protein and influence IL-1 and RANKL signaling by facilitating the assembly of a p62/TRAF6/a-PKC multi-protein complex. The Limd1-p62 interaction affects both NF-kappaB and AP-1 activity in epithelial cells and osteoclasts. Moreover, LIMD1 functions as a tumor repressor to block lung tumor cell lines in vitro and in vivo. Recent studies revealed that the LIM proteins Wtip, LIMD1, and Ajuba interact with components of RNA-induced silencing complexes (RISC) as well as eIF4E and the mRNA m7GTP cap-protein complex and are required for microRNA-mediated gene silencing. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	53
188742	cd09356	LIM2_TRIP6	The second LIM domain of Thyroid receptor-interacting protein 6 (TRIP6). The second LIM domain of Thyroid receptor-interacting protein 6 (TRIP6): TRIP6 is a member of the zyxin LIM protein family and contains three LIM zinc-binding domains at the C-terminus. TRIP6 protein localizes to focal adhesion sites and along actin stress fibers. Recruitment of this protein to the plasma membrane occurs in a lysophosphatidic acid (LPA)-dependent manner. TRIP6 recruits a number of molecules involved in actin assembly, cell motility, survival and transcriptional control. The function of TRIP6 in cell motility is regulated by Src-dependent phosphorylation at a Tyr residue. The phosphorylation activates the coupling to the Crk SH2 domain, which is required for the function of TRIP6 in promoting lysophosphatidic acid (LPA)-induced cell migration. TRIP6 can shuttle to the nucleus to serve as a coactivator of AP-1 and NF-kappaB transcriptional factors. Moreover, TRIP6 can form a ternary complex with the NHERF2 PDZ protein and LPA2 receptor to regulate LPA-induced activation of ERK and AKT, rendering cells resistant to chemotherapy. Recent evidence shows that TRIP6 antagonizes Fas-induced apoptosis by enhancing the antiapoptotic effect of LPA in cells. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	53
188743	cd09357	LIM3_Zyxin_like	The third LIM domain of Zyxin-like family. The third LIM domain of Zyxin-like family: This family includes Ajuba, Limd1, WTIP, Zyxin, LPP, and Trip6 LIM proteins. Members of the Zyxin family contain three tandem C-terminal LIM domains and a proline-rich N-terminal region. Zyxin proteins are detected primarily in focal adhesion plaques. They function as scaffolds, participating in the assembly of multiple interactions and signal transduction networks, which regulate cell adhesion, spreading, and motility. They can also shuttle into the nucleus. In the nucleus, zyxin proteins affect gene transcription by interaction with a variety of nuclear proteins, including several transcription factors, playing regulatory roles in cell proliferation, differentiation and apoptosis. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	63
188744	cd09358	LIM_Mical_like	The LIM domain of Mical (molecule interacting with CasL) like family. The LIM domain of Mical (molecule interacting with CasL) like family: Known members of this family include the LIM domain-containing proteins Mical (molecule interacting with CasL), pollen-specific protein SF3, Eplin, xin actin-binding repeat-containing protein 2 (XIRP2), and Ltd-1. The members of this family function mainly at the cytoskeleton and focal adhesions. They interact with transcription factors or other signaling molecules to play roles in muscle development, neuronal differentiation, cell growth and mobility. Eplin has also been found to be a tumor suppressor. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	53
188745	cd09359	LIM_LASP_like	The LIM domain of LIM and SH3 Protein (LASP)-like proteins. The LIM domain of LIM and SH3 Protein (LASP) like proteins: This family contains two types of LIM-containing proteins: LASP and N-RAP. The LASP family contains two highly homologous members, LASP-1 and LASP-2. LASP contains a LIM motif at its amino terminus, a Src homology 3 (SH3) domain at its C-terminal part, and a nebulin-like region in the middle. LASP-1 and -2 are highly conserved in their LIM, nebulin-like, and SH3 domains, but differ significantly at their linker regions. Both proteins are ubiquitously expressed and involved in cytoskeletal architecture, especially in the organization of focal adhesions. LASP-1 and LASP-2 are important during early embryo- and fetogenesis and are highly expressed in the central nervous system of the adult. However, only LASP-1 seems to participate significantly in neuronal differentiation and plays an important functional role in migration and proliferation of certain cancer cells, while the role of LASP-2 is more structural. The expression of LASP-1 in breast tumors is increased significantly. N-RAP is a muscle-specific protein concentrated at myotendinous junctions in skeletal muscle and intercalated disks in cardiac muscle. A LIM domain is found at the N-terminus of N-RAP, and the C-terminus of N-RAP contains a region with multiple nebulin repeats. N-RAP functions as a scaffolding protein that organizes alpha-actinin and actin into symmetrical I-Z-I structures in developing myofibrils. The nebulin repeat is known as an actin-binding domain. N-RAP is hypothesized to form antiparallel dimers via its LIM domain. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	53
188746	cd09360	LIM_ALP_like	The LIM domain of ALP (actinin-associated LIM protein) family. This family represents the LIM domain of ALP (actinin-associated LIM protein) family. Four proteins (ALP, CLP36, RIL, and Mystique) have been classified into the ALP subfamily of LIM domain proteins. Each member of the subfamily contains an N-terminal PDZ domain and a C-terminal LIM domain. Functionally, these proteins bind to alpha-actinin through their PDZ domains and bind other signaling molecules through their LIM domains. ALP proteins have been implicated in cardiac and skeletal muscle structure, function and disease, and in platelet and epithelial cell motility. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	52
188747	cd09361	LIM1_Enigma_like	The first LIM domain of Enigma-like family. The first LIM domain of Enigma-like family: The Enigma LIM domain family is comprised of three members: Enigma, ENH, and Cypher (mouse)/ZASP (human). These subfamily members contain a single PDZ domain at the N-terminus and three LIM domains at the C-terminus. Enigma was initially characterized in humans and is expressed in multiple tissues, such as skeletal muscle, heart, bone, and brain. The third LIM domain specifically interacts with the insulin receptor and the second LIM domain interacts with the receptor tyrosine kinase Ret and the adaptor protein APS.  Thus Enigma is implicated in signal transduction processes, such as mitogenic activity, insulin related actin organization, and glucose metabolism. The second member, ENH protein, was first identified in rat brain. It has been shown that ENH interacts with protein kinase D1 (PKD1) via its LIM domains and forms a complex with PKD1 and the alpha1C subunit of cardiac L-type voltage-gated calcium channel in rat neonatal cardiomyocytes. The N-terminal PDZ domain interacts with alpha-actinin at the Z-line. ZASP/Cypher is required for maintenance of Z-line structure during muscle contraction, but not required for Z-line assembly. In heart, Cypher/ZASP plays a structural role through its interaction with cytoskeletal Z-line proteins. In addition, there is increasing evidence that Cypher/ZASP also performs signaling functions. Studies reveal that Cypher/ZASP interacts with and directs PKC to the Z-line, where PKC phosphorylates downstream signaling targets. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	52
188748	cd09362	LIM2_Enigma_like	The second LIM domain of Enigma-like family. The second LIM domain of Enigma-like family: The Enigma LIM domain family is comprised of three members: Enigma, ENH, and Cypher (mouse)/ZASP (human). These subfamily members contain a single PDZ domain at the N-terminus and three LIM domains at the C-terminus. Enigma was initially characterized in humans and is expressed in multiple tissues, such as skeletal muscle, heart, bone and brain. The third LIM domain specifically interacts with the insulin receptor and the second LIM domain interacts with the receptor tyrosine kinase Ret and the adaptor protein APS.  Thus Enigma is implicated in signal transduction processes, such as mitogenic activity, insulin related actin organization, and glucose metabolism. The second member, ENH protein, was first identified in rat brain.  It has been shown that ENH interacts with protein kinase D1 (PKD1) via its LIM domains and forms a complex with PKD1 and the alpha1C subunit of cardiac L-type voltage-gated calcium channel in rat neonatal cardiomyocytes. The N-terminal PDZ domain interacts with alpha-actinin at the Z-line. ZASP/Cypher is required for maintenance of Z-line structure during muscle contraction, but not required for Z-line assembly. In heart, Cypher/ZASP plays a structural role through its interaction with cytoskeletal Z-line proteins. In addition, there is increasing evidence that Cypher/ZASP also performs signaling functions. Studies reveal that Cypher/ZASP interacts with and directs PKC to the Z-line, where PKC phosphorylates downstream signaling targets. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	52
188749	cd09363	LIM3_Enigma_like	The third LIM domain of Enigma-like family. The third LIM domain of Enigma-like family: The Enigma LIM domain family is comprised of three members: Enigma, ENH, and Cypher (mouse)/ZASP (human). These subfamily members contain a single PDZ domain at the N-terminus and three LIM domains at the C-terminus. Enigma was initially characterized in humans and is expressed in multiple tissues, such as skeletal muscle, heart, bone, and brain. The third LIM domain specifically interacts with the insulin receptor and the second LIM domain interacts with the receptor tyrosine kinase Ret and the adaptor protein APS.  Thus Enigma is implicated in signal transduction processes, such as mitogenic activity, insulin related actin organization, and glucose metabolism. The second member, ENH protein, was first identified in rat brain.  It has been shown that ENH interacts with protein kinase D1 (PKD1) via its LIM domains and forms a complex with PKD1 and the alpha1C subunit of cardiac L-type voltage-gated calcium channel in rat neonatal cardiomyocytes. The N-terminal PDZ domain interacts with alpha-actinin at the Z-line. ZASP/Cypher is required for maintenance of Z-line structure during muscle contraction, but not required for Z-line assembly. In heart, Cypher/ZASP plays a structural role through its interaction with cytoskeletal Z-line proteins. In addition, there is increasing evidence that Cypher/ZASP also performs signaling functions. Studies reveal that Cypher/ZASP interacts with and directs PKC to the Z-line, where PKC phosphorylates downstream signaling targets. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	54
188750	cd09364	LIM1_LIMK	The first LIM domain of LIMK (LIM domain kinase). The first LIM domain of LIMK (LIM domain kinase): The LIMK protein family is comprised of two members, LIMK1 and LIMK2. LIMK contains two LIM domains, a PDZ domain and a kinase domain. LIMK is involved in the regulation of actin polymerization and microtubule disassembly. LIMK influences architecture of the actin cytoskeleton by regulating the activity of the cofilin family proteins cofilin1, cofilin2, and destrin. The mechanism of regulation is to phosphorylate cofilin on serine 3, which inactivates its actin-severing activity and alters the rate of actin depolymerization. LIMKs can function in both cytoplasm and nucleus and are expressed in all tissues. Both LIMK1 and LIMK2 can act in the nucleus to suppress Rac/Cdc42-dependent cyclin D1 expression. However, LIMK1 and LIMK2 have different cellular locations. While LIMK1 localizes mainly at focal adhesions, LIMK2 is found in cytoplasmic punctae, suggesting that they may have different cellular functions. The LIM domains of LIMK have been shown to play an important role in regulating kinase activity and likely also contribute to LIMK function by acting as sites of protein-to-protein interactions. All LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	53
188751	cd09365	LIM2_LIMK	The second LIM domain of LIMK (LIM domain kinase). The second LIM domain of LIMK (LIM domain kinase): The LIMK protein family is comprised of two members, LIMK1 and LIMK2. LIMK contains two LIM domains, a PDZ domain and a kinase domain. LIMK is involved in the regulation of actin polymerization and microtubule disassembly. LIMK influences architecture of the actin cytoskeleton by regulating the activity of the cofilin family proteins cofilin1, cofilin2, and destrin. The mechanism of regulation is to phosphorylate cofilin on serine 3, which inactivates its actin-severing activity and alters the rate of actin depolymerization. LIMKs can function in both cytoplasm and nucleus and are expressed in all tissues. Both LIMK1 and LIMK2 can act in the nucleus to suppress Rac/Cdc42-dependent cyclin D1 expression. However, LIMK1 and LIMK2 have different cellular locations. While LIMK1 localizes mainly at focal adhesions, LIMK2 is found in cytoplasmic punctae, suggesting that they may have different cellular functions. The LIM domains of LIMK have been shown to play an important role in regulating kinase activity and likely also contribute to LIMK function by acting as sites of protein-to-protein interactions. All LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	54
188752	cd09366	LIM1_Isl	The first LIM domain of Isl, a member of LHX protein family. The first LIM domain of Isl: Isl is a member of the LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Isl1 and Isl2 are the two conserved members of this family. Proteins in this group are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs, such as the pituitary gland and the pancreas. Isl-1 is one of the LHX proteins isolated originally by virtue of its ability to bind DNA sequences from the 5'-flanking region of the rat insulin gene in pancreatic insulin-producing cells. Mice deficient in Isl-1 fail to form the dorsal exocrine pancreas and islet cells fail to differentiate. On the other hand, Isl-1 takes part in pituitary development by activating the gonadotropin-releasing hormone receptor gene together with LHX3 and steroidogenic factor 1. Mouse Isl2 is expressed in the retinal ganglion cells and the developing spinal cord, where it plays a role in motor neuron development. Like Isl1, Isl2 may also be able to bind to the insulin gene enhancer to promote gene activation. All LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	55
188753	cd09367	LIM1_Lhx1_Lhx5	The first LIM domain of Lhx1 (also known as Lim1) and Lhx5. The first LIM domain of Lhx1 (also known as Lim1) and Lhx5. Lhx1 and Lhx5 are closely related members of the LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of the LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs, such as the pituitary gland and the pancreas. Lhx1 is required for regulating the vertebrate head organizer, the nervous system, and female reproductive tract development. During embryogenesis in the mouse, Lhx1 is expressed early in mesodermal tissue, then later during urogenital, kidney, liver, and nervous system development. In the adult, expression is restricted to the kidney and brain. Mouse embryos with an Lhx1 gene knockout cannot grow normal anterior head structures, kidneys, and gonads, but develop normal trunk and tail morphology. In the developing nervous system, Lhx1 is required to direct the trajectories of motor axons in the limb. Lhx1 null female mice lack the oviducts and uterus. Lhx5 protein may play complementary or overlapping roles with Lhx1. The expression of Lhx5 in the anterior portion of the mouse neural tube suggests a role in patterning of the forebrain. All LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	52
188754	cd09368	LIM1_Lhx3_Lhx4	The first LIM domain of Lhx3 and Lhx4 family. The first LIM domain of Lhx3-Lhx4 family: Lhx3 and Lhx4 belong to the LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of the LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs, such as the pituitary gland and the pancreas. The LHX3 and LHX4 LIM-homeodomain transcription factors play essential roles in pituitary gland and nervous system development. Although LHX3 and LHX4 share marked sequence homology, the genes have different expression patterns. They have overlapping but distinct functions during the establishment of the specialized cells of the mammalian pituitary gland and the nervous system. Lhx3 proteins have been demonstrated to bind directly to the promoters/enhancers of several pituitary hormone genes to cause increased transcription. Lhx3a and Lhx3b, whose mRNAs have distinct temporal expression profiles during development, are two isoforms of Lhx3. LHX4 plays essential roles in pituitary gland and nervous system development. In mice, the lhx4 gene is expressed in the developing hindbrain, cerebral cortex, pituitary gland, and spinal cord. LHX4 shows significant sequence similarity to LHX3, particularly to isoform Lhx3a. In gene regulation experiments, the LHX4 protein exhibits regulatory roles towards pituitary genes, acting on their promoters/enhancers. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	52
188755	cd09369	LIM1_Lhx2_Lhx9	The first LIM domain of Lhx2 and Lhx9 family. The first LIM domain of Lhx2 and Lhx9 family: Lhx2 and Lhx9 are highly homologous LHX regulatory proteins. They belong to the LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of the LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs, such as the pituitary gland and the pancreas. Although Lhx2 and Lhx9 are highly homologous, they seem to play regulatory roles in different organs. In animals, Lhx2 plays important roles in eye, cerebral cortex, limb, the olfactory organs, and erythrocyte development. Lhx2 gene knockout mice exhibit impaired patterning of the cortical hem and the telencephalon of the developing brain, and a lack of development in olfactory structures. Lhx9 is expressed in several regions of the developing mouse brain, the spinal cord, the pancreas, in limb mesenchyme, and in the urogenital region. Lhx9 plays critical roles in gonad development. Homozygous mice lacking functional Lhx9 alleles exhibit numerous urogenital defects, such as gonadal agenesis, infertility, and undetectable levels of testosterone and estradiol coupled with high FSH levels. Lhx9 null mice are phenotypically female, even those that are genotypically male. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	54
188756	cd09370	LIM1_Lmx1a	The first LIM domain of Lmx1a. The first LIM domain of Lmx1a: Lmx1a belongs to the LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of the LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs, such as the pituitary gland and the pancreas. Mouse Lmx1a is expressed in multiple tissues, including the roof plate of the neural tube, the developing brain, the otic vesicles, the notochord, and the pancreas. Human Lmx1a can be found in pancreas, skeletal muscle, adipose tissue, developing brain, mammary glands, and pituitary. The functions of Lmx1a in the developing nervous system were revealed by studies of mutant mice. In mouse, mutations in Lmx1a result in failure of the roof plate to develop. Lmx1a may act upstream of other roof plate markers such as MafB, Gdf7, Bmp6, and Bmp7. Further characterization of these mice reveals numerous defects including disorganized cerebellum, hippocampus, and cortex; altered pigmentation; female sterility; skeletal defects; and behavioral abnormalities. Within pancreatic cells, the Lmx1a protein interacts synergistically with the bHLH transcription factor E47 to activate the insulin gene enhancer/promoter. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	52
188757	cd09371	LIM1_Lmx1b	The first LIM domain of Lmx1b. The first LIM domain of Lmx1b: Lmx1b belongs to the LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of the LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs, such as the pituitary gland and the pancreas. In mouse, Lmx1b functions in the developing limbs and eyes, the kidneys, the brain, and in cranial mesenchyme. Disruption of the Lmx1b gene results in kidney and limb defects. In the brain, Lmx1b is important for generation of mesencephalic dopamine neurons and the differentiation of serotonergic neurons. In the mouse eye, Lmx1b regulates anterior segment (cornea, iris, ciliary body, trabecular meshwork, and lens) development. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	53
188758	cd09372	LIM2_FBLP-1	The second LIM domain of the filamin-binding LIM protein-1 (FBLP-1). The second LIM domain of the filamin-binding LIM protein-1 (FBLP-1): FBLP-1 contains a proline-rich domain near its N terminus and two LIM domains at its C terminus. FBLP-1 mRNA was detected in a variety of tissues and cells including platelets and endothelial cells. FBLP-1 binds to Filamins. The association between filamin B and FBLP-1 may play an unknown role in cytoskeletal function, cell adhesion, and cell motility. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein.	53
188759	cd09373	LIM1_AWH	The first LIM domain of Arrowhead (AWH). The first LIM domain of Arrowhead (AWH): Arrowhead belongs to the LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs, such as the pituitary gland and the pancreas. During embryogenesis of Drosophila, Arrowhead is expressed in each abdominal segment and in the labial segment. Late in embryonic development, expression of arrowhead is refined to the abdominal histoblasts and salivary gland imaginal ring cells themselves. The Arrowhead gene is required for establishment of a subset of imaginal tissues: the abdominal histoblasts and the salivary gland imaginal rings. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein.	54
188760	cd09374	LIM2_Isl	The second LIM domain of Isl, a member of LHX protein family. The second LIM domain of Isl: Isl is a member of LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Isl1 and Isl2 are the two conserved members of this family. Proteins in this group are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs, such as the pituitary gland and the pancreas. Isl-1 is one of the LHX proteins isolated originally by virtue of its ability to bind DNA sequences from the 5'-flanking region of the rat insulin gene in pancreatic insulin-producing cells. Mice deficient in Isl-1 fail to form the dorsal exocrine pancreas and islet cells fail to differentiate. On the other hand, Isl-1 takes part in the pituitary development by activating the gonadotropin-releasing hormone receptor gene together with LHX3 and steroidogenic factor 1. Mouse Isl2 is expressed in the retinal ganglion cells and the developing spinal cord where it plays a role in motor neuron development. Same as Isl1, Isl2 may also be able to bind to the insulin gene enhancer to promote gene activation. All LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	55
188761	cd09375	LIM2_Lhx1_Lhx5	The second LIM domain of Lhx1 (also known as Lim1) and Lhx5. The second LIM domain of Lhx1 (also known as Lim1) and Lhx5. Lhx1 and Lhx5 are closely related members of LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs, such as the pituitary gland and the pancreas. Lhx1 is required for regulating the vertebrate head organizer, the nervous system, and female reproductive tract development. During embryogenesis in the mouse, Lhx1 is expressed early in mesodermal tissue, then later during urogenital, kidney, liver, and nervous system development. In the adult, expression is restricted to the kidney and brain. Mouse embryos with an Lhx1 gene knockout cannot grow normal anterior head structures, kidneys, or gonads, but develop normal trunk and tail morphology. In the developing nervous system, Lhx1 is required to direct the trajectories of motor axons in the limb. Lhx1 null female mice lack the oviducts and uterus.  Lhx5 protein may play complementary or overlapping roles with Lhx1. The expression of Lhx5 in the anterior portion of the mouse neural tube suggests a role in patterning of the forebrain. All LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	56
188762	cd09376	LIM2_Lhx3_Lhx4	The second LIM domain of Lhx3-Lhx4 family. The second LIM domain of Lhx3-Lhx4 family: Lhx3 and Lhx4 belong to the LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs, such as the pituitary gland and the pancreas. The LHX3 and LHX4 LIM-homeodomain transcription factors play essential roles in pituitary gland and nervous system development. Although LHX3 and LHX4 share marked sequence homology, the genes have different expression patterns. They play overlapping, but distinct functions during the establishment of the specialized cells of the mammalian pituitary gland and the nervous system. Lhx3 proteins have been shown to bind directly to the promoters/enhancers of several pituitary hormone genes to cause increased transcription. Lhx3a and Lhx3b, whose mRNAs have distinct temporal expression profiles during development, are two isoforms of Lhx3. LHX4 plays essential roles in pituitary gland and nervous system development. In mice, the lhx4 gene is expressed in the developing hindbrain, cerebral cortex, pituitary gland, and spinal cord. LHX4 shows significant sequence similarity to LHX3, particularly to isoform Lhx3a. In gene regulation experiments, the LHX4 protein exhibits regulation roles towards pituitary genes, acting on their promoters/enhancers. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein.	56
188763	cd09377	LIM2_Lhx2_Lhx9	The second LIM domain of Lhx2 and Lhx9 family. The second LIM domain of Lhx2 and Lhx9 family: Lhx2 and Lhx9 are highly homologous LHX regulatory proteins. They belong to the LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs, such as the pituitary gland and the pancreas.  Although Lhx2 and Lhx9 are highly homologous, they seem to play regulatory roles in different organs.  In animals, Lhx2 plays important roles in eye, cerebral cortex, limb, the olfactory organs, and erythrocyte development. Lhx2 gene knockout mice exhibit impaired patterning of the cortical hem and the telencephalon of the developing brain, and a lack of development in olfactory structures. Lhx9 is expressed in several regions of the developing mouse brain, the spinal cord, the pancreas, in limb mesenchyme, and in the urogenital region. Lhx9 plays critical roles in gonad development.  Homozygous mice lacking functional Lhx9 alleles exhibit numerous urogenital defects, such as gonadal agenesis, infertility, and undetectable levels of testosterone and estradiol coupled with high FSH levels. Lhx9 null mice are phenotypically female, even those that are genotypically male. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein.	59
188764	cd09378	LIM2_Lmx1a_Lmx1b	The second LIM domain of Lmx1a and Lmx1b. The second LIM domain of Lmx1a and Lmx1b: Lmx1a and Lmx1b belong to the LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs such as the pituitary gland and the pancreas. Mouse Lmx1a is expressed in multiple tissues, including the roof plate of the neural tube, the developing brain, the otic vesicles, the notochord, and the pancreas. In mouse, mutations in Lmx1a result in failure of the roof plate to develop.  Lmx1a may act upstream of other roof plate markers such as MafB, Gdf7, Bmp6, and Bmp7. Further characterization of these mice reveals numerous defects including disorganized cerebellum, hippocampus, and cortex; altered pigmentation; female sterility; skeletal defects; and behavioral abnormalities.  In the mouse, Lmx1b functions in the developing limbs and eyes, the kidneys, the brain, and in cranial mesenchyme. The disruption of Lmx1b gene results in kidney and limb defects. In the brain, Lmx1b is important for generation of mesencephalic dopamine neurons and the differentiation of serotonergic neurons. In the mouse eye, Lmx1b regulates anterior segment (cornea, iris, ciliary body, trabecular meshwork, and lens) development. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein.	55
188765	cd09379	LIM2_AWH	The second LIM domain of Arrowhead (AWH). The second LIM domain of Arrowhead (AWH): Arrowhead belongs to the LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs such as the pituitary gland and the pancreas. During embryogenesis of Drosophila, Arrowhead is expressed in each abdominal segment and in the labial segment. Late in embryonic development, expression of arrowhead is refined to the abdominal histoblasts and salivary gland imaginal ring cells themselves. The Arrowhead gene is required for establishment of a subset of imaginal tissues: the abdominal histoblasts and the salivary gland imaginal rings. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein.	55
188766	cd09380	LIM1_Lhx6	The first LIM domain of Lhx6. The first LIM domain of Lhx6. Lhx6 is a member of LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs such as the pituitary gland and the pancreas. Lhx6 functions in the brain and nervous system.  It is expressed at high levels in several regions of the embryonic mouse CNS, including the telencephalon and hypothalamus, and the first branchial arch. Lhx6 is proposed to have a role in patterning of the mandible and maxilla, and in signaling during odontogenesis. In brain sections, knockdown of Lhx6 gene blocks the normal migration of neurons to the cortex. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	54
188767	cd09381	LIM1_Lhx7_Lhx8	The first LIM domain of Lhx7 and Lhx8. The first LIM domain of Lhx7 and Lhx8:  Lhx7 and Lhx8 belong to the LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs such as the pituitary gland and the pancreas.  Studies using mutant mice have revealed roles for Lhx7 and Lhx8 in the development of cholinergic neurons in the telencephalon and in basal forebrain development. Mice lacking alleles of the LIM-homeobox gene Lhx7 or Lhx8 display dramatically reduced number of forebrain cholinergic neurons. In addition, Lhx7 mutation affects male and female mice differently, with females appearing more affected than males. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein.	56
188768	cd09382	LIM2_Lhx6	The second LIM domain of Lhx6. The second LIM domain of Lhx6. Lhx6 is a member of LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs such as the pituitary gland and the pancreas. Lhx6 functions in brain and nervous system.  It is expressed at high levels in several regions of the embryonic mouse CNS, including the telencephalon and hypothalamus, and the first branchial arch. Lhx6 is proposed to have a role in patterning of the mandible and maxilla, and in signaling during odontogenesis. In brain sections, knockdown of Lhx6 gene blocks the normal migration of neurons to the cortex. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	55
188769	cd09383	LIM2_Lhx7_Lhx8	The second LIM domain of Lhx7 and Lhx8. The second LIM domain of Lhx7 and Lhx8:  Lhx7 and Lhx8 belong to the LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs such as the pituitary gland and the pancreas.  Studies using mutant mice have revealed roles for Lhx7 and Lhx8 in the development of cholinergic neurons in the telencephalon and in basal forebrain development. Mice lacking alleles of the LIM-homeobox gene Lhx7 or Lhx8 display dramatically reduced number of forebrain cholinergic neurons. In addition, Lhx7 mutation affects male and female mice differently, with females appearing more affected than males. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein.	55
188770	cd09384	LIM1_LMO2	The first LIM domain of LMO2 (LIM domain only protein 2). The first LIM domain of LMO2 (LIM domain only protein 2): LMO2 is a nuclear protein that  plays important roles in transcriptional regulation and development. The two tandem LIM domains of LMO2 support the assembly of a crucial cell-regulatory complex by interacting with both the TAL1-E47 and GATA1 transcription factors to form a DNA-binding complex that is capable of transcriptional activation. LMOs have also been shown to be involved in oncogenesis. LMO1 and LMO2 are activated in T-cell acute lymphoblastic leukemia by distinct chromosomal translocations. LMO2 was also shown to be involved in erythropoiesis and is required for the hematopoiesis in the adult animals. All LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	56
188771	cd09385	LIM2_LMO2	The second LIM domain of LMO2 (LIM domain only protein 2). The second LIM domain of LMO2 (LIM domain only protein 2): LMO2 is a nuclear protein that  plays important roles in transcriptional regulation and development. The two tandem LIM domains of LMO2 support the assembly of a crucial cell-regulatory complex by interacting with both the TAL1-E47 and GATA1 transcription factors to form a DNA-binding complex that is capable of transcriptional activation. LMOs have also been shown to be involved in oncogenesis. LMO1 and LMO2 are activated in T-cell acute lymphoblastic leukemia by distinct chromosomal translocations. LMO2 was also shown to be involved in erythropoiesis and is required for the hematopoiesis in the adult animals. All LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	56
188772	cd09386	LIM1_LMO4	The first LIM domain of LMO4 (LIM domain only protein 4). The first LIM domain of LMO4 (LIM domain only protein 4): LMO4 is a nuclear protein that plays important roles in transcriptional regulation and development. LMO4 is involved in various functions in tumorigenesis and cellular differentiation. LMO4 proteins regulate gene expression by interacting with a wide variety of transcription factors and cofactors to form large transcription complexes. It can interact with Smad proteins, and associate with the promoter of the PAI-1 (plasminogen activator inhibitor-1) gene in a TGFbeta (transforming growth factor beta)-dependent manner. LMO4 can also form a complex with transcription regulator CREB (cAMP response element-binding protein) and interact with CLIM1 and CLIM2. In breast tissue, LMO4 interacts with multiple proteins, including the cofactor CtIP [CtBP (C-terminal binding protein)-interacting protein], the breast and ovarian tumor suppressor BRCA1 (breast-cancer susceptibility gene 1) and the LIM-domain-binding protein LDB1. Functionally, LMO4 is shown to repress BRCA1-mediated transcription activation, thus invoking a potential role for LMO4 as a negative regulator of BRCA1 in sporadic breast cancer.  LMO4 also forms a complex with ERa (oestrogen receptor alpha), MTA1 (metastasis tumor antigen 1), and HDACs (histone deacetylases), implying that LMO4 is also a component of the MTA1 corepressor complex. Over-expressed LMO4 represses ERa transactivation functions in an HDAC-dependent manner, and contributes to the process of breast cancer progression by allowing the development of ERa-negative phenotypes. All LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	55
188773	cd09387	LIM2_LMO4	The second LIM domain of LMO4 (LIM domain only protein 4). The second LIM domain of LMO4 (LIM domain only protein 4): LMO4 is a nuclear protein that plays important roles in transcriptional regulation and development. LMO4 is involved in various functions in tumorigenesis and cellular differentiation. LMO4 proteins regulate gene expression by interacting with a wide variety of transcription factors and cofactors to form large transcription complexes. It can interact with Smad proteins, and associate with the promoter of the PAI-1 (plasminogen activator inhibitor-1) gene in a TGFbeta (transforming growth factor beta)-dependent manner. LMO4 can also form a complex with transcription regulator CREB (cAMP response element-binding protein) and interact with CLIM1 and CLIM2. In breast tissue, LMO4 interacts with multiple proteins, including the cofactor CtIP [CtBP (C-terminal binding protein)-interacting protein], the breast and ovarian tumor suppressor BRCA1 (breast-cancer susceptibility gene 1) and the LIM-domain-binding protein LDB1. Functionally, LMO4 is shown to repress BRCA1-mediated transcription activation, thus invoking a potential role for LMO4 as a negative regulator of BRCA1 in sporadic breast cancer.  LMO4 also forms a complex with ERa (oestrogen receptor alpha), MTA1 (metastasis tumor antigen 1), and HDACs (histone deacetylases), implying that LMO4 is also a component of the MTA1 corepressor complex. Over-expressed LMO4 represses ERa transactivation functions in an HDAC-dependent manner, and contributes to the process of breast cancer progression by allowing the development of ERa-negative phenotypes. All LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	55
188774	cd09388	LIM1_LMO1_LMO3	The first LIM domain of LMO1 and LMO3 (LIM domain only protein 1 and 3). The first LIM domain of LMO1 and LMO3 (LIM domain only protein 1 and 3): LMO1 and LMO3 are highly homologous and belong to the LMO protein family. LMO1 and LMO3 are nuclear proteins that play important roles in transcriptional regulation and development. As LIM domains lack intrinsic DNA-binding activity, nuclear LMOs are involved in transcriptional regulation by forming complexes with other transcription factors or cofactors. For example, LMO1 interacts with the bHLH domain of the bHLH transcription factor TAL1 (T-cell acute leukemia 1)/SCL (stem cell leukemia). LMO1 inhibits the expression of TAL1/SCL target genes.  LMO3 facilitates p53 binding to its response elements, which suggests that LMO3 acts as a co-repressor of p53, suppressing p53-dependent transcriptional regulation. In addition, LMO3 interacts with neuronal transcription factor, HEN2, and acts as an oncogene in neuroblastoma. Another binding partner of LMO3 is calcium- and integrin-binding protein CIB, which binds via the second LIM domain (LIM2) of LMO3. One role of the CIB/LMO3 complex is to inhibit cell proliferation. Although LMO1 and LMO3 are highly homologous proteins, they play different roles in the regulation of the pituitary glycoprotein hormone alpha-subunit (alpha GSU) gene. Alpha GSU promoter activity was markedly repressed by LMO1 but activated by LMO3. All LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	55
188775	cd09389	LIM2_LMO1_LMO3	The second LIM domain of LMO1 and LMO3 (LIM domain only protein 1 and 3). The second LIM domain of LMO1 and LMO3 (LIM domain only protein 1 and 3): LMO1 and LMO3 are highly homologous and belong to the LMO protein family. LMO1 and LMO3 are nuclear proteins that play important roles in transcriptional regulation and development. As LIM domains lack intrinsic DNA-binding activity, nuclear LMOs are involved in transcriptional regulation by forming complexes with other transcription factors or cofactors. For example, LMO1 interacts with the bHLH domain of the bHLH transcription factor TAL1 (T-cell acute leukemia 1)/SCL (stem cell leukemia). LMO1 inhibits the expression of TAL1/SCL target genes.  LMO3 facilitates p53 binding to its response elements, which suggests that LMO3 acts as a co-repressor of p53, suppressing p53-dependent transcriptional regulation. In addition, LMO3 interacts with neuronal transcription factor, HEN2, and acts as an oncogene in neuroblastoma. Another binding partner of LMO3 is calcium- and integrin-binding protein CIB, which binds via the second LIM domain (LIM2) of LMO3. One role of the CIB/LMO3 complex is to inhibit cell proliferation. Although LMO1 and LMO3 are highly homologous proteins, they play different roles in the regulation of the pituitary glycoprotein hormone alpha-subunit (alpha GSU) gene. Alpha GSU promoter activity was markedly repressed by LMO1 but activated by LMO3. All LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	55
188776	cd09390	LIM2_dLMO	The second LIM domain of dLMO (Beaderx). The second LIM domain of dLMO (Beaderx): dLMO is a nuclear protein that plays important roles in transcriptional regulation and development. In Drosophila, dLMO modulates the activity of LIM-homeodomain protein Apterous (Ap), which regulates the formation of the dorsal-ventral axis of the Drosophila wing. Biochemical analysis shows that dLMO protein influences the activity of Apterous by binding its cofactor Chip. Further studies have shown that dLMO proteins might function in an evolutionarily conserved mechanism involved in patterning the appendages. All LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	55
188777	cd09391	LIM1_Lrg1p_like	The first LIM domain of Lrg1p, a LIM and RhoGap domain containing protein. The first LIM domain of Lrg1p, a LIM and RhoGap domain containing protein: The members of this family contain three tandem repeats of LIM domains and a Rho-type GTPase activating protein (RhoGap) domain. Lrg1p is a Rho1 GTPase-activating protein required for efficient cell fusion in yeast. Lrg1p-GAP domain strongly and specifically stimulates the GTPase activity of Rho1p, a regulator of beta (1-3)-glucan synthase in vitro. The LIM domain is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein.	57
188778	cd09392	LIM2_Lrg1p_like	The second LIM domain of Lrg1p, a LIM and RhoGap domain containing protein. The second LIM domain of Lrg1p, a LIM and RhoGap domain containing protein: The members of this family contain three tandem repeats of LIM domains and a Rho-type GTPase activating protein (RhoGap) domain. Lrg1p is a Rho1 GTPase-activating protein required for efficient cell fusion in yeast. Lrg1p-GAP domain strongly and specifically stimulates the GTPase activity of Rho1p, a regulator of beta (1-3)-glucan synthase in vitro. The LIM domain is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein.	53
188779	cd09393	LIM3_Lrg1p_like	The third LIM domain of Lrg1p, a LIM and RhoGap domain containing protein. The third LIM domain of Lrg1p, a LIM and RhoGap domain containing protein: The members of this family contain three tandem repeats of LIM domains and a Rho-type GTPase activating protein (RhoGap) domain. Lrg1p is a Rho1 GTPase-activating protein required for efficient cell fusion in yeast. Lrg1p-GAP domain strongly and specifically stimulates the GTPase activity of Rho1p, a regulator of beta (1-3)-glucan synthase in vitro. The LIM domain is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein.	56
188780	cd09394	LIM1_Rga	The first LIM domain of  Rga GTPase-Activating Proteins. The first LIM domain of  Rga GTPase-Activating Proteins: The members of this family contain two tandem repeats of LIM domains and a Rho-type GTPase activating protein (RhoGap) domain. Rga activates GTPases during polarized morphogenesis. In yeast, a known regulating target of Rga is  CDC42p, a small GTPase. The LIM domain is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein.	55
188781	cd09395	LIM2_Rga	The second LIM domain of  Rga GTPase-Activating Proteins. The second LIM domain of  Rga GTPase-Activating Proteins: The members of this family contain two tandem repeats of LIM domains and a Rho-type GTPase activating protein (RhoGap) domain. Rga activates GTPases during polarized morphogenesis. In yeast, a known regulating target of Rga is CDC42p, a small GTPase. The LIM domain is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein.	53
188782	cd09396	LIM_DA1	The Lim domain of DA1. The Lim domain of DA1: DA1 contains one copy of LIM domain and a domain of unknown function. DA1 is predicted as an ubiquitin receptor, which sets final seed and organ size by restricting the period of cell proliferation. The LIM domain is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein.	53
188783	cd09397	LIM1_UF1	LIM domain in proteins of unknown function. The first Lim domain of a LIM domain containing protein: The functions of the proteins are unknown. The members of this family contain two copies of LIM domain. The LIM domain is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein.	58
188784	cd09400	LIM_like_1	LIM domain in proteins of unknown function. LIM domain in proteins of unknown function: LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation, and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. The LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid).	61
188785	cd09401	LIM_TLP_like	The  LIM domains of thymus LIM protein (TLP). The LIM domain of thymus LIM protein (TLP) like proteins:  This family includes the LIM domains of TLP and CRIP (Cysteine-Rich Intestinal Protein). TLP is the distant member of the CRP family of proteins. TLP has two isomers (TLP-A and TLP-B) and sharing approximately 30% with each of the three other CRPs.  Like CRP1, CRP2 and CRP3/MLP, TLP has two LIM domains, connected by a flexible linker region. Unlike the CRPs, TLP lacks the nuclear targeting signal (K/R-K/R-Y-G-P-K) and is localized solely in the cytoplasm. TLP is specifically expressed in the thymus in a subset of cortical epithelial cells.  TLP has a role in development of normal thymus and in controlling the development and differentiation of thymic epithelial cells. CRIP is a short LIM protein with only one LIM domain. CRIP gene is developmentally regulated and can be induced by glucocorticoid hormones during the first three postnatal weeks. The domain shows close sequence homology to LIM domain of thymus LIM protein. However, unlike the TLP proteins which have two LIM domains, the members of this family have only one LIM domain. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	53
188786	cd09402	LIM1_CRP	The first LIM domain of Cysteine Rich Protein (CRP). The first LIM domain of Cysteine Rich Protein (CRP): Cysteine-rich proteins (CRPs) are characterized by the presence of two LIM domains linked to a short glycine-rich repeats (GRRs). The CRP family members include CRP1, CRP2, CRP3/MLP. CRP1, CRP2 and CRP3 share a conserved nuclear targeting signal (K/R-K/R-Y-G-P-K), which supports the fact that these proteins function not only in the cytoplasm but also in the nucleus. CRPs control regulatory pathways during cellular differentiation, and involve in complex transcription control, and the organization as well as the arrangement of the myofibrillar/cytoskeletal network. It is evident that CRP1, CRP2, and CRP3/MLP are involved in promoting protein assembly along the actin-based cytoskeleton. Although members of the CRP family share common binding partners, they are also capable of recognizing different and specific targets. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	53
188787	cd09403	LIM2_CRP	The second LIM domain of Cysteine Rich Protein (CRP). The second LIM domain of Cysteine Rich Protein (CRP): Cysteine-rich proteins (CRPs) are characterized by the presence of two LIM domains linked to a short glycine-rich repeats (GRRs). The CRP family members include CRP1, CRP2, CRP3/MLP. CRP1, CRP2 and CRP3 share a conserved nuclear targeting signal (K/R-K/R-Y-G-P-K), which supports the fact that these proteins function not only in the cytoplasm but also in the nucleus. CRPs control regulatory pathways during cellular differentiation, and involve in complex transcription control, and the organization as well as the arrangement of the myofibrillar/cytoskeletal network. It is evident that CRP1, CRP2, and CRP3/MLP are involved in promoting protein assembly along the actin-based cytoskeleton. Although members of the CRP family share common binding partners, they are also capable of recognizing different and specific targets. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	54
188788	cd09404	LIM1_MLP84B_like	The LIM domain of Mlp84B and Mlp60A. The LIM domain of Mlp84B and Mlp60A: Mlp84B and Mlp60A belong to the CRP LIM domain protein family. The Mlp84B protein contains five copies of the LIM domains, each followed by a Glycin Rich Region (GRR). However, only the first LIM domain of Mlp84B is in this family. Mlp60A exhibits only one LIM domain linked to a glycin-rich region. Mlp84B and Mlp60A are muscle specific proteins and have been implicated in muscle differentiation. While Mlp84B transcripts are enriched at the terminal ends of muscle fibers, Mlp60A transcripts are found throughout the muscle fibers. All LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	54
188789	cd09405	LIM1_Paxillin	The first LIM domain of paxillin. The first LIM domain of paxillin: Paxillin is an adaptor protein, which recruits key components of the signal-transduction machinery to specific sub-cellular locations to respond to environmental changes rapidly. The C-terminal region of paxillin contains four LIM domains which target paxillin to focal adhesions, presumably through a direct association with the cytoplasmic tail of beta-integrin. The N-terminal of paxillin is leucine-rich LD-motifs. Paxillin is found at the interface between the plasma membrane and the actin cytoskeleton. The binding partners of paxillin are diverse and include protein tyrosine kinases, such as Src and FAK, structural proteins, such as vinculin and actopaxin, and regulators of actin organization. Paxillin recruits these proteins to their function sites to control the dynamic changes in cell adhesion, cytoskeletal reorganization and gene expression. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	54
188790	cd09406	LIM1_Leupaxin	The first LIM domain of Leupaxin. The first LIM domain of Leupaxin: Leupaxin is a cytoskeleton adaptor protein, which is preferentially expressed in hematopoietic cells.  Leupaxin belongs to the paxillin focal adhesion protein family. Same as other members of the family, it has four leucine-rich LD-motifs in the N-terminus and four LIM domains in the C-terminus. It may function in cell type-specific signaling by associating with interaction partners PYK2, FAK, PEP and p95PKL.  When expressed in human leukocytic cells, leupaxin significantly suppressed integrin-mediated cell adhesion to fibronectin and the tyrosine phosphorylation of paxillin. These findings indicate that leupaxin may negatively regulate the functions of paxillin during integrin signaling. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	55
188791	cd09407	LIM2_Paxillin	The second LIM domain of paxillin. The second LIM domain of paxillin: Paxillin is an adaptor protein, which recruits key components of the signal-transduction machinery to specific sub-cellular locations to respond to environmental changes rapidly. The C-terminal region of paxillin contains four LIM domains which target paxillin to focal adhesions, presumably through a direct association with the cytoplasmic tail of beta-integrin. The N-terminal of paxillin is leucine-rich LD-motifs. Paxillin is found at the interface between the plasma membrane and the actin cytoskeleton. The binding partners of paxillin are diverse and include protein tyrosine kinases, such as Src and FAK, structural proteins, such as vinculin and actopaxin, and regulators of actin organization. Paxillin recruits these proteins to their function sites to control the dynamic changes in cell adhesion, cytoskeletal reorganization and gene expression. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	52
188792	cd09408	LIM2_Leupaxin	The second LIM domain of Leupaxin. The second LIM domain of Leupaxin: Leupaxin is a cytoskeleton adaptor protein, which is preferentially expressed in hematopoietic cells. Leupaxin belongs to the paxillin focal adhesion protein family. Same as other members of the family, it has four leucine-rich LD-motifs in the N-terminus and four LIM domains in the C-terminus. It may function in cell type-specific signaling by associating with interaction partners PYK2, FAK, PEP and p95PKL.  When expressed in human leukocytic cells, leupaxin significantly suppressed integrin-mediated cell adhesion to fibronectin and the tyrosine phosphorylation of paxillin. These findings indicate that leupaxin may negatively regulate the functions of paxillin during integrin signaling. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	52
188793	cd09409	LIM3_Paxillin	The third LIM domain of paxillin. The third LIM domain of paxillin: Paxillin is an adaptor protein, which recruits key components of the signal-transduction machinery to specific sub-cellular locations to respond to environmental changes rapidly. The C-terminal region of paxillin contains four LIM domains which target paxillin to focal adhesions, presumably through a direct association with the cytoplasmic tail of beta-integrin. The N-terminal of paxillin is leucine-rich LD-motifs. Paxillin is found at the interface between the plasma membrane and the actin cytoskeleton. The binding partners of paxillin are diverse and include protein tyrosine kinases, such as Src and FAK, structural proteins, such as vinculin and actopaxin, and regulators of actin organization. Paxillin recruits these proteins to their function sites to control the dynamic changes in cell adhesion, cytoskeletal reorganization and gene expression. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	53
188794	cd09410	LIM3_Leupaxin	The third LIM domain of Leupaxin. The third LIM domain of Leupaxin: Leupaxin is a cytoskeleton adaptor protein, which is preferentially expressed in hematopoietic cells. Leupaxin belongs to the paxillin focal adhesion protein family. Same as other members of the family, it has four leucine-rich LD-motifs in the N-terminus and four LIM domains in the C-terminus. It may function in cell type-specific signaling by associating with interaction partners PYK2, FAK, PEP and p95PKL.  When expressed in human leukocytic cells, leupaxin significantly suppressed integrin-mediated cell adhesion to fibronectin and the tyrosine phosphorylation of paxillin. These findings indicate that leupaxin may negatively regulate the functions of paxillin during integrin signaling. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	53
188795	cd09411	LIM4_Paxillin	The fourth LIM domain of Paxillin. The fourth LIM domain of Paxillin: Paxillin is an adaptor protein, which recruits key components of the signal-transduction machinery to specific sub-cellular locations to respond to environmental changes rapidly. The C-terminal region of paxillin contains four LIM domains which target paxillin to focal adhesions, presumably through a direct association with the cytoplasmic tail of beta-integrin. The N-terminal of paxillin is leucine-rich LD-motifs. Paxillin is found at the interface between the plasma membrane and the actin cytoskeleton. The binding partners of paxillin are diverse and include protein tyrosine kinases, such as Src and FAK, structural proteins, such as vinculin and actopaxin, and regulators of actin organization. Paxillin recruits these proteins to their function sites to control the dynamic changes in cell adhesion, cytoskeletal reorganization and gene expression. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	52
188796	cd09412	LIM4_Leupaxin	The fourth LIM domain of Leupaxin. The fourth LIM domain of Leupaxin: Leupaxin is a cytoskeleton adaptor protein, which is preferentially expressed in hematopoietic cells. Leupaxin belongs to the paxillin focal adhesion protein family. Same as other members of the family, it has four leucine-rich LD-motifs in the N-terminus and four LIM domains in the C-terminus. It may function in cell type-specific signaling by associating with interaction partners PYK2, FAK, PEP and p95PKL.  When expressed in human leukocytic cells, leupaxin significantly suppressed integrin-mediated cell adhesion to fibronectin and the tyrosine phosphorylation of paxillin. These findings indicate that leupaxin may negatively regulate the functions of paxillin during integrin signaling. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	52
188797	cd09413	LIM1_Testin	The first LIM domain of Testin. The first LIM domain of Testin: Testin contains three C-terminal LIM domains and a PET protein-protein interaction domain at the N-terminal.   Testin is a cytoskeleton associated focal adhesion protein that localizes along actin stress fibers, at cell-cell-contact areas, and at focal adhesion plaques. Testin interacts with a variety of cytoskeletal proteins, including zyxin, mena, VASP, talin, and actin and it is involved in cell motility and adhesion events. Knockout mice experiments reveal that tumor repressor function of Testin. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	58
188798	cd09414	LIM1_LIMPETin	The first LIM domain of protein LIMPETin. The first LIM domain of protein LIMPETin: LIMPETin contains 6 LIM domains at the C-terminal and an N-terminal PET domain. Four of the six LIM domains are highly homologous to the four and half LIM domain protein family and two of them show sequence similarity to the LIM domains of the Testin family. Thus, LIMPETin may be the recombinant product of genes coding testin and FHL proteins.  In Schistosoma mansoni, where LIMPETin was first identified, LIMPETin is down regulated in sexually mature adult Schistosoma females compared to sexually immature adult females and adult male. Its differential expression indicates that it is a transcription regulator. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	58
188799	cd09415	LIM1_Prickle	The first LIM domain of Prickle. The first LIM domain of Prickle: Prickle contains three C-terminal LIM domains and a N-terminal PET domain.  Prickles have been implicated in roles of regulating tissue polarity or planar cell polarity (PCP).  PCP establishment requires the conserved Frizzled/Dishevelled PCP pathway. Prickle interacts with Dishevelled, thereby modulating Frizzled/Dishevelled activity and PCP signaling. Four forms of prickles have been identified: prickle 1-4. The best characterized is prickle 1 and prickle 2 which are differentially expressed. While prickle 1 is expressed in fetal heart and hematological malignancies, prickle 2 is found in fetal brain, adult cartilage, pancreatic islet, and some types of tumorous cells.  Mutations in prickle 1 have been linked to progressive myoclonus epilepsy. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	59
188800	cd09416	LIM2_Testin	The second LIM domain of Testin. The second LIM domain of Testin: Testin contains three C-terminal LIM domains and a PET protein-protein interaction domain at the N-terminal. Testin is a cytoskeleton associated focal adhesion protein that localizes along actin stress fibers, at cell-cell-contact areas, and at focal adhesion plaques. Testin interacts with a variety of cytoskeletal proteins, including zyxin, mena, VASP, talin, and actin and it is involved in cell motility and adhesion events. Knockout mice experiments reveal that tumor repressor function of testin. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	56
188801	cd09417	LIM2_LIMPETin_like	The second LIM domain of protein LIMPETin and related proteins. The second LIM domain of protein LIMPETin: LIMPETin contains 6 LIM domains at the C-terminal and an N-terminal PET domain. Four of the six LIM domains are highly homologous to the four and half LIM domain protein family and two of them show sequence similarity to the LIM domains of the testin family. Thus, LIMPETin may be the recombinant product of genes coding testin and FHL proteins.  In Schistosoma mansoni, where LIMPETin was first identified, LIMPETin is down regulated in sexually mature adult Schistosoma females compared to sexually immature adult females and adult male. Its differential expression indicates that it is a transcription regulator. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	56
188802	cd09418	LIM2_Prickle	The second LIM domain of Prickle. The second LIM domain of Prickle: Prickle contains three C-terminal LIM domains and a N-terminal PET domain.  Prickles have been implicated in roles of regulating tissue polarity or planar cell polarity (PCP).  PCP establishment requires the conserved Frizzled/Dishevelled PCP pathway. Prickle interacts with Dishevelled, thereby modulating Frizzled/Dishevelled activity and PCP signaling. Two forms of prickles have been identified; namely prickle 1 and prickle 2. Prickle 1 and prickle 2 are differentially expressed. While prickle 1 is expressed in fetal heart and hematological malignancies, prickle 2 is found in fetal brain, adult cartilage, pancreatic islet, and some types of tumorous cells. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	56
188803	cd09419	LIM3_Testin	The third LIM domain of Testin. The third LIM domain of Testin: Testin contains three C-terminal LIM domains and a PET protein-protein interaction domain at the N-terminal.  Testin is a cytoskeleton associated focal adhesion protein that localizes along actin stress fibers at cell-cell-contact areas and at focal adhesion plaques. Testin interacts with a variety of cytoskeletal proteins, including zyxin, mena, VASP, talin, and actin and it is involved in cell motility and adhesion events. Knockout mice experiments reveal that tumor repressor function of Testin. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	59
188804	cd09420	LIM3_Prickle	The third LIM domain of Prickle. The third LIM domain of Prickle: Prickle contains three C-terminal LIM domains and a N-terminal PET domain.  Prickles have been implicated in roles of regulating tissue polarity or planar cell polarity (PCP).  PCP establishment requires the conserved Frizzled/Dishevelled PCP pathway. Prickle interacts with Dishevelled, thereby modulating Frizzled/Dishevelled activity and PCP signaling. Two forms of prickles have been identified; namely prickle 1 and prickle 2. Prickle 1 and prickle 2 are differentially expressed. While prickle 1 is expressed in fetal heart and hematological malignancies, prickle 2 is found in fetal brain, adult cartilage, pancreatic islet, and some types of tumorous cells. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	59
188805	cd09421	LIM3_LIMPETin	The third LIM domain of protein LIMPETin. The third LIM domain of protein LIMPETin: LIMPETin contains 6 LIM domains at the C-terminal and an N-terminal PET domain. Four of the six LIM domains are highly homologous to the four and half LIM domain protein family and two of them show sequence similarity to the LIM domains of the testin family. Thus, LIMPETin may be the recombinant product of genes coding testin and FHL proteins.  In Schistosoma mansoni, where LIMPETin was first identified, LIMPETin is down regulated in sexually mature adult Schistosoma females compared to sexually immature adult females and adult male. Its differential expression indicates that it is a transcription regulator. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	59
188806	cd09422	LIM1_FHL2	The first LIM domain of Four and a half LIM domains protein 2 (FHL2). The first LIM domain of Four and a half LIM domains protein 2 (FHL2):  FHL2 is one of the best studied FHL proteins. FHL2 expression is most abundant in the heart, and in brain, liver and lung at lesser extent. FHL2 participates in a wide range of cellular processes, such as transcriptional regulation, signal transduction, and cell survival by binding to various protein partners. FHL2 has shown to interact with more than 50 different proteins, including receptors, structural proteins, transcription factors and cofactors, signal transducers, splicing factors, DNA replication and repair enzymes, and metabolic enzymes. Although FHL2 is abundantly expressed in heart, the fhl2 null mice are viable and had no detectable abnormal cardiac phenotype. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	62
188807	cd09423	LIM1_FHL3	The first LIM domain of Four and a half LIM domains protein 3 (FHL3). The first LIM domain of Four and a half LIM domains protein 3 (FHL3):  FHL3 is highly expressed in the skeleton and cardiac muscles and possesses the transactivation and repression activities. FHL3 interacts with many transcription factors, such as CREB, BKLF/KLF3, CtBP2, MyoD, and MZF_1. Moreover, FHL3 interacts with alpha- and beta-subunits of the muscle alpha7beta1 integrin receptor. FHL3 was also proved to possess the auto-activation ability and was confirmed that the second zinc finger motif in fourth LIM domain was responsible for the auto-activation of FHL3. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	59
188808	cd09424	LIM2_FHL1	The second LIM domain of Four and a half LIM domains protein 1 (FHL1). The second LIM domain of Four and a half LIM domains protein 1 (FHL1):  FHL1 is heavily expressed in skeletal and cardiac muscles. It plays important roles in muscle growth, differentiation, and sarcomere assembly by acting as a modulator of transcription factors. Defects in FHL1 gene are responsible for a number of Muscular dystrophy-like muscle disorders. It has been detected that FHL1 binds to Myosin-binding protein C, regulating myosin filament formation and sarcomere assembly. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	58
188809	cd09425	LIM4_LIMPETin	The fourth LIM domain of protein LIMPETin. The fourth LIM domain of protein LIMPETin: LIMPETin contains 6 LIM domains at the C-terminal and an N-terminal PET domain. Four of the six LIM domains are highly homologous to the four and half LIM domain protein family and two of them show sequence similarity to the LIM domains of the Testin family. Thus, LIMPETin may be the recombinant product of genes coding testin and FHL proteins.  In Schistosoma mansoni, where LIMPETin was first identified, LIMPETin is down regulated in sexually mature adult Schistosoma females compared to sexually immature adult females and adult male. Its differential expression indicates that it is a transcription regulator. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	54
188810	cd09426	LIM2_FHL2	The second LIM domain of Four and a half LIM domains protein 2 (FHL2). The second LIM domain of Four and a half LIM domains protein 2 (FHL2):  FHL2 is one of the best studied FHL proteins. FHL2 expression is most abundant in the heart, and in brain, liver and lung to a lesser extent. FHL2 participates in a wide range of cellular processes, such as transcriptional regulation, signal transduction, and cell survival by binding to various protein partners. FHL2 has shown to interact with more than 50 different proteins, including receptors, structural proteins, transcription factors and cofactors, signal transducers, splicing factors, DNA replication and repair enzymes, and metabolic enzymes. Although FHL2 is abundantly expressed in heart, the fhl2 null mice are viable and had no detectable abnormal cardiac phenotype. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	57
188811	cd09427	LIM2_FHL3	The second LIM domain of Four and a half LIM domains protein 3 (FHL3). The second LIM domain of Four and a half LIM domains protein 3 (FHL3):  FHL3 is highly expressed in the skeleton and cardiac muscles and possesses the transactivation and repression activities. FHL3 interacts with many transcription factors, such as CREB, BKLF/KLF3, CtBP2, MyoD, and MZF_1. Moreover, FHL3 interacts with alpha- and beta-subunits of the muscle alpha7beta1 integrin receptor. FHL3 was also proved to possess the auto-activation ability and was confirmed that the second zinc finger motif in fourth LIM domain was responsible for the auto-activation of FHL3. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	58
188812	cd09428	LIM2_FHL5	The second LIM domain of Four and a half LIM domains protein 5 (FHL5). The second LIM domain of Four and a half LIM domains protein 5 (FHL5): FHL5 is a tissue-specific coactivator of CREB/CREM family transcription factors, which are highly expressed in male germ cells and is required for post-meiotic gene expression. FHL5 associates with CREM and confers a powerful transcriptional activation function. Activation by CREB is known to occur upon phosphorylation at an essential regulatory site and the subsequent interaction with the ubiquitous coactivator CREB-binding protein (CBP). However, the activation by FHL5 is independent of phosphorylation and CBP association. It represents a new route for transcriptional activation by CREM and CREB.  LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	54
188813	cd09429	LIM3_FHL1	The third LIM domain of Four and a half LIM domains protein 1 (FHL1). The third LIM domain of Four and a half LIM domains protein 1 (FHL1):  FHL1 is heavily expressed in skeletal and cardiac muscles. It plays important roles in muscle growth, differentiation, and sarcomere assembly by acting as a modulator of transcription factors. Defects in FHL1 gene are responsible for a number of Muscular dystrophy-like muscle disorders. It has been detected that FHL1 binds to Myosin-binding protein C, regulating myosin filament formation and sarcomere assembly. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	53
188814	cd09430	LIM5_LIMPETin	The fifth LIM domain of protein LIMPETin. The fifth LIM domain of protein LIMPETin: LIMPETin contains 6 LIM domains at the C-terminal and an N-terminal PET domain. Four of the six LIM domains are highly homologous to the four and half LIM domain protein family and two of them show sequence similarity to the LIM domains of the testin family. Thus, LIMPETin may be the recombinant product of genes coding testin and FHL proteins.  In Schistosoma mansoni, where LIMPETin was first identified, LIMPETin is down regulated in sexually mature adult Schistosoma females compared to sexually immature adult females and adult male. Its differential expression indicates that it is a transcription regulator. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	52
188815	cd09431	LIM3_Fhl2	The third LIM domain of Four and a half LIM domains protein 2 (FHL2). The third LIM domain of Four and a half LIM domains protein 2 (FHL2):  FHL2 is one of the best studied FHL proteins. FHL2 expression is most abundant in the heart, and in brain, liver and lung to a lesser extent. FHL2 participates in a wide range of cellular processes, such as transcriptional regulation, signal transduction, and cell survival by binding to various protein partners. FHL2 has been shown to interact with more than 50 different proteins, including receptors, structural proteins, transcription factors and cofactors, signal transducers, splicing factors, DNA replication and repair enzymes, and metabolic enzymes. Although FHL2 is abundantly expressed in heart, the fhl2 null mice are viable and had no detectable abnormal cardiac phenotype. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	57
188816	cd09432	LIM6_LIMPETin	The sixth LIM domain of protein LIMPETin. The sixth LIM domain of protein LIMPETin: LIMPETin contains 6 LIM domains at the C-terminal and an N-terminal PET domain. Four of the six LIM domains are highly homologous to the four and half LIM domain protein family and two of them show sequence similarity to the LIM domains of the testin family. Thus, LIMPETin may be the recombinant product of genes coding testin and FHL proteins.  In Schistosoma mansoni, where LIMPETin was first identified, LIMPETin is down regulated in sexually mature adult Schistosoma females compared to sexually immature adult females and adult male. Its differential expression indicates that it is a transcription regulator. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	56
188817	cd09433	LIM4_FHL2	The fourth LIM domain of Four and a half LIM domains protein 2 (FHL2). The fourth LIM domain of Four and a half LIM domains protein 2 (FHL2):  FHL2 is one of the best studied FHL proteins. FHL2 expression is most abundant in the heart, and in brain, liver and lung to a lesser extent. FHL2 participates in a wide range of cellular processes, such as transcriptional regulation, signal transduction, and cell survival by binding to various protein partners. FHL2 has been shown to interact with more than 50 different proteins, including receptors, structural proteins, transcription factors and cofactors, signal transducers, splicing factors, DNA replication and repair enzymes, and metabolic enzymes. Although FHL2 is abundantly expressed in heart, the fhl2 null mice are viable and had no detectable abnormal cardiac phenotype. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	58
188818	cd09434	LIM4_FHL3	The fourth LIM domain of Four and a half LIM domains protein 3 (FHL3). The fourth LIM domain of Four and a half LIM domains protein 3 (FHL3):  FHL3 is highly expressed in the skeleton and cardiac muscles and possesses the transactivation and repression activities. FHL3 interacts with many transcription factors, such as CREB, BKLF/KLF3, CtBP2, MyoD, and MZF_1. Moreover, FHL3 interacts with alpha- and beta-subunits of the muscle alpha7beta1 integrin receptor. FHL3 was also proved to possess the auto-activation ability and was confirmed that the second zinc finger motif in fourth LIM domain was responsible for the auto-activation of FHL3. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	56
188819	cd09435	LIM3_Zyxin	The third LIM domain of Zyxin. The third LIM domain of Zyxin: Zyxin exhibits three copies of the LIM domain, an extensive proline-rich domain and a nuclear export signal.  Localized at sites of cell-substratum adhesion in fibroblasts, Zyxin interacts with alpha-actinin, members of the cysteine-rich protein (CRP) family, proteins that display Src homology 3 (SH3) domains and Ena/VASP family members. Zyxin and its partners have been implicated in the spatial control of actin filament assembly as well as in pathways important for cell differentiation. In addition to its functions at focal adhesion plaques, recent work has shown that zyxin moves from the sites of cell contacts to the nucleus, where it directly participates in the regulation of gene expression. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein.	67
188820	cd09436	LIM3_TRIP6	The third LIM domain of Thyroid receptor-interacting protein 6 (TRIP6). The third LIM domain of Thyroid receptor-interacting protein 6 (TRIP6): TRIP6 is a member of the zyxin LIM protein family and contains three LIM zinc-binding domains at the C-terminal. TRIP6 protein localizes to focal adhesion sites and along actin stress fibers. Recruitment of this protein to the plasma membrane occurs in a lysophosphatidic acid (LPA)-dependent manner. TRIP6 recruits a number of molecules involved in actin assembly, cell motility, survival and transcriptional control. The function of TRIP6 in cell motility is regulated by Src-dependent phosphorylation at a Tyr residue. The phosphorylation activates the coupling to the Crk SH2 domain, which is required for the function of TRIP6 in promoting lysophosphatidic acid (LPA)-induced cell migration. TRIP6 can shuttle to the nucleus to serve as a coactivator of AP-1 and NF-kappaB transcriptional factors. Moreover, TRIP6 can form a ternary complex with the NHERF2 PDZ protein and LPA2 receptor to regulate LPA-induced activation of ERK and AKT, rendering cells resistant to chemotherapy. Recent evidence shows that TRIP6 antagonizes Fas-Induced apoptosis by enhancing the antiapoptotic effect of LPA in cells. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein.	66
188821	cd09437	LIM3_LPP	The third LIM domain of lipoma preferred partner (LPP). The third LIM domain of lipoma preferred partner (LPP): LPP is a member of the zyxin LIM protein family and contains three LIM zinc-binding domains at the C-terminal and proline-rich region at the N-terminal.  LPP was initially identified as the most frequent translocation partner of HMGA2 (High Mobility Group A2) in a subgroup of benign tumors of adipose tissue (lipomas). It was also shown to be rearranged in a number of other soft tissues, as well as in a case of acute monoblastic leukemia. In addition to its involvement in tumors, LPP was identified as a smooth muscle restricted LIM protein that plays an important role in SMC migration. LPP is localized at sites of cell adhesion, cell-cell contacts and transiently in the nucleus. In nucleus, it acts as a coactivator for the ETS domain transcription factor PEA3. In addition to PEA3, it interacts with alpha-actinin, vasodilator-stimulated phosphoprotein (VASP), Palladin, and Scrib. The LIM domains are the main focal adhesion targeting elements, and the proline-rich region, which harbors binding sites for alpha-actinin and vasodilator-stimulated phosphoprotein (VASP), has a weak targeting capacity. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein.	68
188822	cd09438	LIM3_Ajuba_like	The third LIM domain of Ajuba-like proteins. The third LIM domain of Ajuba-like proteins: Ajuba like LIM protein family includes three highly homologous proteins Ajuba, Limd1, and WTIP. Members of the family contain three tandem C-terminal LIM domains and a proline-rich N-terminal region. This family of proteins functions as scaffolds, participating in the assembly of numerous protein complexes. In the cytoplasm, Ajuba binds Grb2 to modulate serum-stimulated ERK activation. Ajuba also recruits the TNF receptor-associated factor 6 (TRAF6) to p62 and activates PKCKappa activity. Ajuba interacts with alpha-catenin and F-actin to contribute to the formation or stabilization of adherens junctions by linking adhesive receptors to the actin cytoskeleton. Although Ajuba is a cytoplasmic protein, it can shuttle into the nucleus. In nucleus, Ajuba functions as a corepressor for the zinc finger-protein Snail. It binds to the SNAG repression domain of Snail through its LIM region.  Arginine methyltransferase-5 (Prmt5), a protein in the complex, is recruited to Snail through an interaction with Ajuba. This ternary complex functions to repress E-cadherin, a Snail target gene. In addition, Ajuba contains functional nuclear-receptor interacting motifs and selectively interacts with retinoic acid receptors (RARs) and rexinoid receptor (RXRs) to negatively regulate retinoic acid signaling. Wtip, the Wt1-interacting protein, was originally identified as an interaction partner of the Wilms tumour protein 1 (WT1). Wtip is involved in kidney and neural crest development. Wtip interacts with the receptor tyrosine kinase Ror2 and inhibits canonical Wnt signaling. LIMD1 was reported to inhibit cell growth and metastases. The inhibition may be mediated through an interaction with the protein barrier-to-autointegration (BAF), a component of SWI/SNF chromatin-remodeling protein; or through the interaction with retinoblastoma protein (pRB), resulting in inhibition of E2F-mediated transcription, and expression of the majority of genes with E2F1-responsive elements. Recently, Limd1 was shown to interact with the p62/sequestosome protein and influence IL-1 and RANKL signaling by facilitating the assembly of a p62/TRAF6/a-PKC multi-protein complex. The Limd1-p62 interaction affects both NF-kappaB and AP-1 activity in epithelial cells and osteoclasts. Moreover, LIMD1 functions as tumor repressor to block lung tumor cell line in vitro and in vivo. Recent studies revealed that LIM proteins Wtip, LIMD1 and Ajuba interact with components of RNA induced silencing complexes (RISC) as well as eIF4E and the mRNA m7GTP cap-protein complex and are required for microRNA-mediated gene silencing.  As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein.	62
188823	cd09439	LIM_Mical	The LIM domain of Mical (molecule interacting with CasL). The LIM domain of Mical (molecule interacting with CasL): MICAL is a large, multidomain, cytosolic protein with a single LIM domain, a calponin homology (CH) domain and a flavoprotein monooxygenase domain. In Drosophila, MICAL is expressed in axons, interacts with the neuronal A (PlexA)  receptor and is required for Semaphorin 1a (Sema-1a)-PlexA-mediated repulsive axon guidance.  The LIM domain and calponin homology domain are known for interactions with the cytoskeleton, cytoskeletal adaptor proteins, and other signaling proteins. The flavoprotein monooxygenase (MO) is required for semaphorin-plexin repulsive axon guidance during axonal pathfinding in the Drosophila neuromuscular system. In addition, MICAL was characterized to interact with Rab13 and Rab8 to coordinate the assembly of tight junctions and adherens junctions in epithelial cells. Thus, MICAL was also named junctional Rab13-binding protein (JRAB). As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein.	55
188824	cd09440	LIM1_SF3	The first Lim domain of pollen specific protein SF3. The first Lim domain of pollen specific protein SF3: SF3 is a Lim protein that is found exclusively in mature plant pollen grains. It contains two LIM domains. The exact function of SF3 is unknown. It may be a transcription factor required for the expression of late pollen genes. It is possible that SF3 protein is involved in controlling pollen-specific processes such as male gamete maturation, pollen tube formation, or even fertilization. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein.	63
188825	cd09441	LIM2_SF3	The second Lim domain of pollen specific protein SF3. The second Lim domain of pollen specific protein SF3: SF3 is a Lim protein that is found exclusively in mature plant pollen grains. It contains two LIM domains. The exact function of SF3 is unknown. It may be a transcription factor required for the expression of late pollen genes. It is possible that SF3 protein is involved in controlling pollen-specific processes such as male gamete maturation, pollen tube formation, or even fertilization. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein.	61
188826	cd09442	LIM_Eplin_like	The Lim domain of Epithelial Protein Lost in Neoplasm (Eplin) like proteins. The Lim domain of Epithelial Protein Lost in Neoplasm (Eplin) like proteins: This family contains Epithelial Protein Lost in Neoplasm (Eplin), xin actin-binding repeat-containing protein 2 (XIRP2) and a group of proteins with unknown function.  The members of this family all contain a single LIM domain. Epithelial Protein Lost in Neoplasm is a cytoskeleton-associated tumor suppressor whose expression inversely correlates with cell growth, motility, invasion and cancer mortality.  Eplin interacts with and stabilizes F-actin filaments and stress fibers, which correlates with its ability to suppress anchorage independent growth. In epithelial cells, Eplin is required for formation of the F-actin adhesion belt by binding to the E-cadherin-catenin complex through alpha-catenin. Eplin is expressed in two isoforms, a longer Eplin-beta and a shorter Eplin-alpha. Eplin-alpha mRNA is detected in various tissues and cell lines, but is absent or down regulated in cancer cells. Xirp2 contains a LIM domain and Xin repeats for binding to and stabilising F-actin. Xirp2 is expressed in muscles and is significantly induced in the heart in response to systemic administration of angiotensin II. Xirp2 is an important effector of the Ang II signaling pathway in the heart. The expression of Xirp2 is activated by myocyte enhancer factor (MEF)2A, whose  transcriptional activity is stimulated by angiotensin II. Thus, Xirp2 plays important pathological roles in the angiotensin II induced hypertension. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein.	53
188827	cd09443	LIM_Ltd-1	The LIM domain of LIM and transglutaminase domains protein (Ltd-1). The LIM domain of LIM and transglutaminase domains protein (Ltd-1): This family includes mouse Ky protein and Caenorhabditis elegans Ltd-1 protein. The members of this family consist of an N-terminal  Lim domain and a C-terminal transglutaminase domain. The mouse Ky protein has a putative function in muscle development. The mouse with ky mutant exhibits combined posterior and lateral curvature of the spine. The Ltd-1 gene in C. elegans is expressed in developing hypodermal cells from the twofold stage embryo through adulthood. These data define the ltd-1 gene as a novel marker for C. elegans epithelial cell development. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein.	55
188828	cd09444	LIM_Mical_like_1	This domain belongs to the LIM domain family which are found on Mical (molecule interacting with CasL) like proteins. The LIM domain on proteins of unknown function: This domain belongs to the LIM domain family which are found on Mical (molecule interacting with CasL) like proteins. Known members of the Mical-like family includes single LIM domain containing proteins, Mical (molecule interacting with CasL), pollen specific protein SF3, Eplin, xin actin-binding repeat-containing protein 2 (XIRP2), and Ltd-1. The members of this family function mainly at the cytoskeleton and focal adhesions. They interact with transcription factors or other signaling molecules to play roles in muscle development, neuronal differentiation, cell growth, and mobility.  As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein.	55
188829	cd09445	LIM_Mical_like_2	This domain belongs to the LIM domain family which are found on Mical (molecule interacting with CasL) like proteins. The LIM domain on proteins of unknown function: This domain belongs to the LIM domain family which are found on Mical (molecule interacting with CasL)-like proteins. Known members of the Mical-like family includes single LIM domain containing proteins, Mical (molecule interacting with CasL), pollen specific protein SF3, Eplin, xin actin-binding repeat-containing protein 2 (XIRP2), and Ltd-1. The members of this family function mainly at the cytoskeleton and focal adhesions. They interact with transcription factors or other signaling molecules to play roles in muscle development, neuronal differentiation, cell growth, and mobility.  As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein.	53
188830	cd09446	LIM_N_RAP	The LIM domain of N-RAP. The LIM domain of N-RAP:  N-RAP is a muscle-specific protein concentrated at myotendinous junctions in skeletal muscle and intercalated disks in cardiac muscle. LIM domain is found at the N-terminus of N-RAP and the C-terminal of N-RAP contains a region with multiple nebulin repeats. N-RAP functions as a scaffolding protein that organizes alpha-actinin and actin into symmetrical I-Z-I structures in developing myofibrils. Nebulin repeat is known as actin binding domain. The N-RAP is hypothesized to form antiparallel dimerization via its LIM domain. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein.	53
188831	cd09447	LIM_LASP	The LIM domain of LIM and SH3 Protein (LASP). The LIM domain of LIM and SH3 Protein (LASP):  LASP family contains two highly homologous members, LASP-1 and LASP-2. LASP contains a LIM motif at its amino terminus, a src homology 3 (SH3) domain at its C-terminal part, and a nebulin-like region in the middle. LASP-1 and -2 are highly conserved in their LIM, nebulin-like, and SH3 domains, but differ significantly at their linker regions. Both proteins are ubiquitously expressed and involved in cytoskeletal architecture, especially in the organization of focal adhesions. LASP-1 and LASP-2 are important during early embryo- and fetogenesis and are highly expressed in the central nervous system of the adult. However, only LASP-1 seems to participate significantly in neuronal differentiation and plays an important functional role in migration and proliferation of certain cancer cells while the role of LASP-2 is more structural. The expression of LASP-1 in breast tumors is increased significantly. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein.	53
188832	cd09448	LIM_CLP36	This family represents the LIM domain of CLP36. This family represents the LIM domain of CLP36.  CLP36 has also been named as CLIM1, Elfin, or PDLIM1. CLP36 contains a C-terminal LIM domain and an N-terminal PDZ domain. CLP36 is highly expressed in heart and is present in many other tissues including lung, liver, spleen, and blood. CLP36 has been implicated in many processes including hypoxia and regulation of actin stress fibers. CLP36 co-localizes with alpha-actinin-2 at the Z-lines in myocardium. In addition, CLP36 binds to alpha-actinin-1 and alpha-actinin-4, and associates with F-actin filaments and stress fibers. CLP36 might be involved in not only the function of sarcomeres in muscle cells, but also in actin stress fiber-mediated cellular processes, such as cell shape, migration, polarity, and cytokinesis in non-muscle cells. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	52
188833	cd09449	LIM_Mystique	The LIM domain of Mystique, a subfamily of ALP LIM domain proteins. The LIM domain of Mystique, a subfamily of ALP LIM domain proteins: Mystique is the most recently identified member of the ALP protein family. It also interacts with alpha-actinin, as other ALP proteins do. Mystique promotes cell attachment and migration and suppresses anchorage-independent growth. The LIM domain of Mystique is required for the suppression function. Moreover, Mystique functions as an ubiquitin E3 ligase acting on STAT proteins to cause their proteosome mediated degradation. As in all LIM domains, this domain is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	53
188834	cd09450	LIM_ALP	This family represents the LIM domain of ALP, actinin-associated LIM protein. This family represents the LIM domain of ALP, actinin-associated LIM protein. ALP contains an N-terminal PDZ domain, a C-terminal LIM domain and an ALP-subfamily-specific 34-amino-acid motif termed ALP-like motif (AM), which contains a putative consensus protein kinase C (PKC) phosphorylation site and two alpha-helices. ALP proteins are found in heart and in skeletal muscle. ALP may act as a signaling molecule which is regulated by PKC-dependent signaling. ALP plays an essential role in the development of RV (right ventricle) chamber. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	53
188835	cd09451	LIM_RIL	The LIM domain of RIL. The LIM domain of RIL: RIL contains an N-terminal PDZ domain, a LIM domain, and a short consensus C-terminal region. It is the smallest molecule in the ALP LIM domain containing protein family. RIL was identified in rat fibroblasts and in human lymphocytes. The LIM domain interacts with the AMPA glutamate receptor in dendritic spines. The consensus C-terminus interacts with PTP-BL, a submembranous protein tyrosine phosphatase and the PDZ domain is responsible to interact with alpha-actinin molecules. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	53
188836	cd09452	LIM1_Enigma	The first LIM domain of Enigma. The first LIM domain of Enigma: Enigma was initially characterized in humans as a protein containing three LIM domains at the C-terminus and a PDZ domain at N-terminus.  The third LIM domain specifically interacts with the insulin receptor and the second LIM domain interacts with the receptor tyrosine kinase Ret and the adaptor protein APS. Thus Enigma is implicated in signal transduction processes such as mitogenic activity, insulin related actin organization, and glucose metabolism. Enigma is expressed in multiple tissues, such as skeletal muscle, heart, bone and brain.  LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	52
188837	cd09453	LIM1_ENH	The first LIM domain of the Enigma Homolog (ENH) family. The first LIM domain of the Enigma Homolog (ENH) family: ENH was initially identified in rat brain. Same as enigma, it contains three LIM domains at the C-terminus and a PDZ domain at N-terminus.  ENH is implicated in signal transduction processes involving protein kinases.  It has also been shown that ENH interacts with protein kinase D1 (PKD1) via its LIM domains and forms a complex with PKD1 and the alpha1C subunit of cardiac L-type voltage-gated calcium channel in rat neonatal cardiomyocytes. The N-terminal PDZ domain interacts with alpha-actinin at the Z-line. ENH is expressed in multiple tissues, such as skeletal muscle, heart, bone, and brain. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	52
188838	cd09454	LIM1_ZASP_Cypher	The first LIM domain of ZASP/Cypher family. The first LIM domain of ZASP/Cypher family: ZASP was identified in human heart and skeletal muscle and Cypher is the mouse ortholog of ZASP. ZASP/Cypher contains three LIM domains at the C-terminus and a PDZ domain at N-terminus.  ZASP/Cypher is required for maintenance of Z-line structure during muscle contraction, but not required for Z-line assembly. In heart, Cypher/ZASP plays a structural role through its interaction with cytoskeletal Z-line proteins. In addition, there is increasing evidence that Cypher/ZASP also performs signaling functions. Studies reveal that Cypher/ZASP interacts with and directs PKC to the Z-line, where PKC phosphorylates downstream signaling targets. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	52
188839	cd09455	LIM1_Enigma_like_1	The first LIM domain of an Enigma subfamily with unknown function. The first LIM domain of an Enigma subfamily with unknown function: The Enigma LIM domain family is comprised of three characterized members: Enigma, ENH and Cypher (mouse)/ZASP (human). These subfamily members contain a single PDZ domain at the N-terminus and three LIM domains at the C-terminus. They serve as adaptor proteins, where the PDZ domain tethers the protein to the cytoskeleton and the LIM domains, recruit signaling proteins to implement corresponding functions. The members of the Enigma family have been implicated in regulating or organizing cytoskeletal structure, as well as involving multiple signaling pathways. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	54
188840	cd09456	LIM2_Enigma	The second LIM domain of Enigma. The second LIM domain of Enigma: Enigma was initially characterized in humans as a protein containing three LIM domains at the C-terminus and a PDZ domain at N-terminus.  The third LIM domain specifically interacts with the insulin receptor and the second LIM domain interacts with the receptor tyrosine kinase Ret and the adaptor protein APS.  Thus Enigma is implicated in signal transduction processes, such as mitogenic activity, insulin related actin organization, and glucose metabolism. Enigma is expressed in multiple tissues, such as skeletal muscle, heart, bone and brain.  LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	52
188841	cd09457	LIM2_ENH	The second LIM domain of the Enigma Homolog (ENH) family. The second LIM domain of the Enigma Homolog (ENH) family: ENH was initially identified in rat brain. Same as enigma, it contains three LIM domains at the C-terminus and a PDZ domain at N-terminus. ENH is implicated in signal transduction processes involving protein kinases.  It has also been shown that ENH interacts with protein kinase D1 (PKD1) via its LIM domains and forms a complex with PKD1 and the alpha1C subunit of cardiac L-type voltage-gated calcium channel in rat neonatal cardiomyocytes. The N-terminal PDZ domain interacts with alpha-actinin at the Z-line. ENH is expressed in multiple tissues, such as skeletal muscle, heart, bone, and brain. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	52
188842	cd09458	LIM3_Enigma	The third LIM domain of Enigma. The third LIM domain of Enigma: Enigma was initially characterized in humans as a protein containing three LIM domains at the C-terminus and a PDZ domain at N-terminus.  The third LIM domain specifically interacts with the insulin receptor and the second LIM domain interacts with the receptor tyrosine kinase Ret and the adaptor protein APS.  Thus Enigma is implicated in signal transduction processes such as mitogenic activity, insulin related actin organization, and glucose metabolism. Enigma is expressed in multiple tissues, such as skeletal muscle, heart, bone, and brain.  LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	55
188843	cd09459	LIM3_ENH	The third LIM domain of the Enigma Homolog (ENH) family. The third LIM domain of the Enigma Homolog (ENH) family: ENH was initially identified in rat brain. Same as enigma, it contains three LIM domains at the C-terminus and a PDZ domain at N-terminus. ENH is implicated in signal transduction processes involving protein kinases.  It has also been shown that ENH interacts with protein kinase D1 (PKD1) via its LIM domains and forms a complex with PKD1 and the alpha1C subunit of cardiac L-type voltage-gated calcium channel in rat neonatal cardiomyocytes. The N-terminal PDZ domain interacts with alpha-actinin at the Z-line. ENH is expressed in multiple tissues, such as skeletal muscle, heart, bone, and brain. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	55
188844	cd09460	LIM3_ZASP_Cypher	The third LIM domain of ZASP/Cypher family. The third LIM domain of ZASP/Cypher family: ZASP was identified in human heart and skeletal muscle and Cypher is the mouse ortholog of ZASP. ZASP/Cypher contains three LIM domains at the C-terminus and a PDZ domain at N-terminus.  ZASP/Cypher is required for maintenance of Z-line structure during muscle contraction, but not required for Z-line assembly. In heart, Cypher/ZASP plays a structural role through its interaction with cytoskeletal Z-line proteins. In addition, there is increasing evidence that Cypher/ZASP also performs signaling functions. Studies reveal that Cypher/ZASP interacts with and directs PKC to the Z-line, where PKC phosphorylates downstream signaling targets. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	55
188845	cd09461	LIM3_Enigma_like_1	The third LIM domain of an Enigma subfamily with unknown function. The third LIM domain of an Enigma subfamily with unknown function: The Enigma LIM domain family is comprised of three characterized members: Enigma, ENH, and Cypher (mouse)/ZASP (human). These subfamily members contain a single PDZ domain at the N-terminus and three LIM domains at the C-terminus. They serve as adaptor proteins, where the PDZ domain tethers the protein to the cytoskeleton and the LIM domains, recruit signaling proteins to implement corresponding functions. The members of the enigma family have been implicated in regulating or organizing cytoskeletal structure, as well as involving multiple signaling pathways. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	54
188846	cd09462	LIM1_LIMK1	The first LIM domain of LIMK1 (LIM domain Kinase 1). The first LIM domain of LIMK1 (LIM domain Kinase 1): LIMK1 belongs to the LIMK protein family, which comprises LIMK1 and LIMK2. LIMK contains two LIM domains, a PDZ domain, and a kinase domain. LIMK is involved in the regulation of actin polymerization and microtubule disassembly. LIMK influences architecture of the actin cytoskeleton by regulating the activity of the cofilin family proteins cofilin1, cofilin2, and destrin. LIMK phosphorylates cofilin on serine 3, which inactivates its actin-severing activity and alters the rate of actin depolymerization. LIMKs can function in both cytoplasm and nucleus. Both LIMK1 and LIMK2 can act in the nucleus to suppress Rac/Cdc42-dependent cyclin D1 expression. LIMK1 is expressed in all tissues and is localized to focal adhesions in the cell. LIMK1 can form homodimers upon binding of HSP90 and is activated by Rho effector Rho kinase and MAPKAPK2. LIMK1 is important for normal central nervous system development, and its deletion has been implicated in the development of the human genetic disorder Williams syndrome. Moreover, LIMK1 up-regulates the promoter activity of urokinase type plasminogen activator and induces its mRNA and protein expression in breast cancer cells. The LIM domains have been shown to play an important role in regulating kinase activity and likely also contribute to LIMK function by acting as sites of protein-to-protein interactions. All LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	74
188847	cd09463	LIM1_LIMK2	The first LIM domain of LIMK2 (LIM domain Kinase 2). The first LIM domain of LIMK2 (LIM domain Kinase 2): LIMK2 is a member of the LIMK protein family, which comprises LIMK1 and LIMK2. LIMK contains two LIM domains, a PDZ domain, and a kinase domain. LIMK is involved in the regulation of actin polymerization and microtubule disassembly. LIMK influences architecture of the actin cytoskeleton by regulating the activity of the cofilin family proteins cofilin1, cofilin2, and destrin. LIMK phosphorylates cofilin on serine 3, which inactivates its actin-severing activity and alters the rate of actin depolymerization. LIMK is activated by phosphorylation of a threonine residue within the activation loop of the kinase by p21-activated kinases 1 and 4 and by Rho kinase. LIMKs can function in both cytoplasm and nucleus. Both LIMK1 and LIMK2 can act in the nucleus to suppress Rac/Cdc42-dependent cyclin D1 expression. LIMK2 is expressed in all tissues. While LIMK1 localizes mainly at focal adhesions, LIMK2 is found in cytoplasmic punctae, suggesting that they may have different cellular functions. The activity of LIM kinase 2 to regulate cofilin phosphorylation is inhibited by the direct binding of Par-3. LIMK2 activation promotes cell cycle progression. The phenotype of Limk2 knockout mice shows a defect in spermatogenesis. The LIM domains have been shown to play an important role in regulating kinase activity and likely also contribute to LIMK function by acting as sites of protein-to-protein interactions. All LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	53
188848	cd09464	LIM2_LIMK1	The second LIM domain of LIMK1 (LIM domain Kinase 1). The second LIM domain of LIMK1 (LIM domain Kinase 1): LIMK1 belongs to the LIMK protein family, which comprises LIMK1 and LIMK2. LIMK contains two LIM domains, a PDZ domain, and a kinase domain. LIMK is involved in the regulation of actin polymerization and microtubule disassembly. LIMK influences architecture of the actin cytoskeleton by regulating the activity of the cofilin family proteins cofilin1, cofilin2, and destrin. LIMK phosphorylates cofilin on serine 3, which inactivates its actin-severing activity and alters the rate of actin depolymerization. LIMKs can function in both cytoplasm and nucleus. Both LIMK1 and LIMK2 can act in the nucleus to suppress Rac/Cdc42-dependent cyclin D1 expression. LIMK1 is expressed in all tissues and is localized to focal adhesions in the cell. LIMK1 can form homodimers upon binding of HSP90 and is activated by Rho effector Rho kinase and MAPKAPK2. LIMK1 is important for normal central nervous system development, and its deletion has been implicated in the development of the human genetic disorder Williams syndrome. Moreover, LIMK1 up-regulates the promoter activity of urokinase type plasminogen activator and induces its mRNA and protein expression in breast cancer cells. The LIM domains have been shown to play an important role in regulating kinase activity and likely also contribute to LIMK function by acting as sites of protein-to-protein interactions. All LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	55
188849	cd09465	LIM2_LIMK2	The second LIM domain of LIMK2 (LIM domain Kinase 2). The second LIM domain of LIMK2 (LIM domain Kinase 2): LIMK2 is a member of the LIMK protein family, which comprises LIMK1 and LIMK2. LIMK contains two LIM domains, a PDZ domain, and a kinase domain. LIMK is involved in the regulation of actin polymerization and microtubule disassembly. LIMK influences architecture of the actin cytoskeleton by regulating the activity of the cofilin family proteins cofilin1, cofilin2, and destrin. LIMK phosphorylates cofilin on serine 3, which inactivates its actin-severing activity and alters the rate of actin depolymerization. LIMK is activated by phosphorylation of a threonine residue within the activation loop of the kinase by p21-activated kinases 1 and 4 and by Rho kinase. LIMKs can function in both cytoplasm and nucleus. Both LIMK1 and LIMK2 can act in the nucleus to suppress Rac/Cdc42-dependent cyclin D1 expression. LIMK2 is expressed in all tissues. While LIMK1 localizes mainly at focal adhesions, LIMK2 is found in cytoplasmic punctae, suggesting that they may have different cellular functions. The activity of LIM kinase 2 to regulate cofilin phosphorylation is inhibited by the direct binding of Par-3. LIMK2 activation promotes cell cycle progression. The phenotype of Limk2 knockout mice shows a defect in spermatogenesis. The LIM domains have been shown to play an important role in regulating kinase activity and likely also contribute to LIMK function by acting as sites of protein-to-protein interactions. All LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	59
188850	cd09466	LIM1_Lhx3a	The first LIM domain of Lhx3a. The first LIM domain of Lhx3a: Lhx3a is a member of LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs, such as the pituitary gland and the pancreas. Lhx3a is one of the two isoforms of Lhx3. The Lhx3 gene is expressed in the ventral spinal cord, the pons, the medulla oblongata, and the pineal gland of the developing nervous system during mouse embryogenesis, and transcripts are found in the emergent pituitary gland. Lhx3 functions in concert with other transcription factors to specify interneuron and motor neuron fates during development. Lhx3 proteins have been demonstrated to directly bind to the promoters of several pituitary hormone gene promoters. The Lhx3 gene encodes two isoforms, LHX3a and LHX3b that differ in their amino-terminal sequences, where Lhx3a has longer N-terminal.  They show differential activation of pituitary hormone genes and distinct DNA binding properties. In human, Lhx3a trans-activated the alpha-glycoprotein subunit promoter and genes containing a high-affinity Lhx3 binding site more effectively than the hLhx3b isoform. In addition, hLhx3a induce transcription of the TSHbeta-subunit gene by acting on pituitary POU domain factor, Pit-1, while hLhx3b does not. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein.	56
188851	cd09467	LIM1_Lhx3b	The first LIM domain of Lhx3b. The first LIM domain of Lhx3b. Lhx3b is a member of LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs, such as the pituitary gland and the pancreas. Lhx3b is one of the two isoforms of Lhx3. The Lhx3 gene is expressed in the ventral spinal cord, the pons, the medulla oblongata, and the pineal gland of the developing nervous system during mouse embryogenesis, and transcripts are found in the emergent pituitary gland. Lhx3 functions in concert with other transcription factors to specify interneuron and motor neuron fates during development. Lhx3 proteins have been demonstrated to directly bind to the promoters of several pituitary hormone gene promoters. The Lhx3 gene encodes two isoforms, LHX3a and LHX3b that differ in their amino-terminal sequences, where Lhx3a has longer N-terminal.  They show differential activation of pituitary hormone genes and distinct DNA binding properties. In human, Lhx3a trans-activated the alpha-glycoprotein subunit promoter and genes containing a high-affinity Lhx3 binding site more effectively than the hLhx3b isoform. In addition, hLhx3a induce transcription of the TSHbeta-subunit gene by acting on pituitary POU domain factor, Pit-1, while hLhx3b does not. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein.	55
188852	cd09468	LIM1_Lhx4	The first LIM domain of Lhx4. The first LIM domain of Lhx4. Lhx4 belongs to the LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs, such as the pituitary gland and the pancreas. LHX4 plays essential roles in pituitary gland and nervous system development. In mice, the lhx4 gene is expressed in the developing hindbrain, cerebral cortex, pituitary gland, and spinal cord. LHX4 shows significant sequence similarity to LHX3, particularly to isoforms Lhx3a. In gene regulation experiments, the LHX4 protein exhibits regulation roles towards pituitary genes, acting on their promoters/enhancers. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein.	52
188853	cd09469	LIM1_Lhx2	The first LIM domain of Lhx2. The first LIM domain of Lhx2: Lhx2 belongs to the LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs, such as the pituitary gland and the pancreas.  In animals, Lhx2 plays important roles in eye, cerebral cortex, limb, the olfactory organs, and erythrocyte development. Lhx2 gene knockout mice exhibit impaired patterning of the cortical hem and the telencephalon of the developing brain, and a lack of development in olfactory structures. The Lhx2 protein has been shown to bind to the mouse M71 olfactory receptor promoter. Similar to other LIM domains, this domain family is 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein.	64
188854	cd09470	LIM1_Lhx9	The first LIM domain of Lhx9. The first LIM domain of Lhx9: Lhx9 belongs to the LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs, such as the pituitary gland and the pancreas.  Lhx9 is highly homologous to Lhx2. It is expressed in several regions of the developing mouse brain, the spinal cord, the pancreas, in limb mesenchyme, and in the urogenital region. Lhx9 plays critical roles in gonad development.  Homozygous mice lacking functional Lhx9 alleles exhibit numerous urogenital defects, such as gonadal agenesis, infertility, and undetectable levels of testosterone and estradiol coupled with high FSH levels. Lhx9 null mice have reduced levels of the Sf1 nuclear receptor that is required for gonadogenesis, and recent studies have shown that Lhx9 is able to activate the Sf1/FtzF1 gene. Lhx9 null mice are phenotypically female, even those that are genotypically male.  As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein.	54
188855	cd09471	LIM2_Isl2	The second LIM domain of Isl2. The second LIM domain of Isl2: Isl is a member of LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs, such as the pituitary gland and the pancreas. Isl proteins are found in the nucleus and act as transcription factors or cofactors. Isl1 and Isl2 are the two conserved members of this family. Mouse Isl2 is expressed in the retinal ganglion cells and the developing spinal cord where it plays a role in motor neuron development. Isl2 may be able to bind to the insulin gene enhancer to promote gene activation. All LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	55
188856	cd09472	LIM2_Lhx3b	The second LIM domain of Lhx3b. The second LIM domain of Lhx3b. Lhx3b is a member of LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs, such as the pituitary gland and the pancreas. Lhx3b is one of the two isoforms of Lhx3. The Lhx3 gene is expressed in the ventral spinal cord, the pons, the medulla oblongata, and the pineal gland of the developing nervous system during mouse embryogenesis, and transcripts are found in the emergent pituitary gland. Lhx3 functions in concert with other transcription factors to specify interneuron and motor neuron fates during development. Lhx3 proteins have been demonstrated to directly bind to the promoters of several pituitary hormone gene promoters. The Lhx3 gene encodes two isoforms, LHX3a and LHX3b that differ in their amino-terminal sequences, where Lhx3a has longer N-terminal.  They show differential activation of pituitary hormone genes and distinct DNA binding properties. In human, Lhx3a trans-activated the alpha-glycoprotein subunit promoter and genes containing a high-affinity Lhx3 binding site more effectively than the hLhx3b isoform. In addition, hLhx3a induce transcription of the TSHbeta-subunit gene by acting on pituitary POU domain factor, Pit-1, while hLhx3b does not.  As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein	57
188857	cd09473	LIM2_Lhx4	The second LIM domain of Lhx4. The second LIM domain of Lhx4. Lhx4 belongs to the LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs, such as the pituitary gland and the pancreas. LHX4 plays essential roles in pituitary gland and nervous system development. In mice, the lhx4 gene is expressed in the developing hindbrain, cerebral cortex, pituitary gland, and spinal cord. LHX4 shows significant sequence similarity to LHX3, particularly to isoforms Lhx3a. In gene regulation experiments, the LHX4 protein exhibits regulation roles towards pituitary genes, acting on their promoters/enhancers. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein.	56
188858	cd09474	LIM2_Lhx2	The second LIM domain of Lhx2. The second LIM domain of Lhx2: Lhx2 belongs to the LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs, such as the pituitary gland and the pancreas.  In animals, Lhx2 plays important roles in eye, cerebral cortex, limb, the olfactory organs, and erythrocyte development. Lhx2 gene knockout mice exhibit impaired patterning of the cortical hem and the telencephalon of the developing brain, and a lack of development in olfactory structures. The Lhx2 protein has been shown to bind to the mouse M71 olfactory receptor promoter. Similar to other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein.	59
188859	cd09475	LIM2_Lhx9	The second LIM domain of Lhx9. The second LIM domain of Lhx9: Lhx9 belongs to the LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs, such as the pituitary gland and the pancreas. Lhx9 is highly homologous to Lhx2. It is expressed in several regions of the developing mouse brain, the spinal cord, the pancreas, in limb mesenchyme, and in the urogenital region. Lhx9 plays critical roles in gonad development.  Homozygous mice lacking functional Lhx9 alleles exhibit numerous urogenital defects, such as gonadal agenesis, infertility, and undetectable levels of testosterone and estradiol coupled with high FSH levels. Lhx9 null mice have reduced levels of the Sf1 nuclear receptor that is required for gonadogenesis, and recent studies have shown that Lhx9 is able to activate the Sf1/FtzF1 gene. Lhx9 null mice are phenotypically female, even those that are genotypically male.  As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein.	59
188860	cd09476	LIM1_TLP	The first LIM domain of thymus LIM protein (TLP). The first LIM domain of thymus LIM protein (TLP):  TLP is the distant member of the CRP family of proteins. TLP has two isomers (TLP-A and TLP-B) and sharing approximately 30% with each of the three other CRPs.  Like CRP1, CRP2 and CRP3/MLP, TLP has two LIM domains, connected by a flexible linker region. Unlike the CRPs, TLP lacks the nuclear targeting signal (K/R-K/R-Y-G-P-K) and is localized solely in the cytoplasm. TLP is specifically expressed in the thymus in a subset of cortical epithelial cells.  TLP has a role in development of normal thymus and in controlling the development and differentiation of thymic epithelial cells. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	54
188861	cd09477	LIM2_TLP	The second LIM domain of thymus LIM protein (TLP). The second LIM domain of thymus LIM protein (TLP):  TLP is the distant member of the CRP family of proteins. TLP has two isomers (TLP-A and TLP-B) and sharing approximately 30% with each of the three other CRPs.  Like CRP1, CRP2 and CRP3/MLP, TLP has two LIM domains, connected by a flexible linker region. Unlike the CRPs, TLP lacks the nuclear targeting signal (K/R-K/R-Y-G-P-K) and is localized solely in the cytoplasm. TLP is specifically expressed in the thymus in a subset of cortical epithelial cells. TLP has a role in development of normal thymus and in controlling the development and differentiation of thymic epithelial cells. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	54
188862	cd09478	LIM_CRIP	The LIM domain of Cysteine-Rich Intestinal Protein (CRIP). The LIM domain of Cysteine-Rich Intestinal Protein (CRIP): CRIP is a short protein with only one LIM domain. CRIP gene is developmentally regulated and can be induced by glucocorticoid hormones during the first three postnatal weeks. The domain shows close sequence homology to LIM domain of thymus LIM protein. However, unlike the TLP proteins which have two LIM domains, the members of this family have only one LIM domain. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	54
188863	cd09479	LIM1_CRP1	The first LIM domain of Cysteine Rich Protein 1 (CRP1). The first LIM domain of Cysteine Rich Protein 1 (CRP1): Cysteine-rich proteins (CRPs) are characterized by the presence of two LIM domains linked to a short glycine-rich repeats (GRRs). The CRP family members include CRP1, CRP2, CRP3/MLP and TLP. CRP1, CRP2 and CRP3 share a conserved nuclear targeting signal (K/R-K/R-Y-G-P-K), which supports the fact that these proteins function not only in the cytoplasm but also in the nucleus. CRPs control regulatory pathways during cellular differentiation, and involve in complex transcription circuits, and the organization as well as the arrangement of the myofibrillar/cytoskeletal network. CRP1 can associate with the actin cytoskeleton and are capable of interacting with alpha-actinin and zyxin. CRP1 was shown to regulate actin filament bundling by interaction with alpha-actinin and direct binding to actin filaments. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	56
188864	cd09480	LIM1_CRP2	The first LIM domain of Cysteine Rich Protein 2 (CRP2). The first LIM domain of Cysteine Rich Protein 2 (CRP2): The CRP family members include CRP1, CRP2, CRP3/MLP and TLP. CRP1, CRP2 and CRP3 share a conserved nuclear targeting signal (K/R-K/R-Y-G-P-K), which supports the fact that these proteins function not only in the cytoplasm but also in the nucleus. CRPs control regulatory pathways during cellular differentiation, and involve in complex transcription circuits, and the organization as well as the arrangement of the myofibrillar/cytoskeletal network. CRP2 specifically binds to protein inhibitor of activated STAT-1 (PIAS1) and a novel human protein designed CRP2BP (for CRP2 binding partner). PIAS1 specifically inhibits the STAT-1 pathway and CRP2BP is homologous to members of the histone acetyltransferase family raising the possibility that CRP2 is a modulator of cytokine-controlled pathways or is functionally active in the transcriptional regulatory network. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	55
188865	cd09481	LIM1_CRP3	The first LIM domain of Cysteine Rich Protein 3 (CRP3/MLP). The first LIM domain of Cysteine Rich Protein 3 (CRP3/MLP): Cysteine-rich proteins (CRPs) are characterized by the presence of two LIM domains linked to short glycine-rich repeats (GRRs). The CRP family members include CRP1, CRP2, CRP3/MLP and TLP. CRP1, CRP2 and CRP3 share a conserved nuclear targeting signal (K/R-K/R-Y-G-P-K), which supports the fact that these proteins function not only in the cytoplasm but also in the nucleus. CRPs control regulatory pathways during cellular differentiation, and are involved in complex transcription circuits and in the organization and arrangement of the myofibrillar/cytoskeletal network. CRP3, also called Muscle LIM Protein (MLP), is a striated muscle-specific factor that enhances myogenic differentiation. CRP3/MLP interacts with cytoskeletal protein beta-spectrin. CRP3/MLP also interacts with the basic helix-loop-helix myogenic transcription factors MyoD, myogenin, and MRF4, thereby increasing their affinity for specific DNA regulatory elements. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	54
188866	cd09482	LIM2_CRP3	The second LIM domain of Cysteine Rich Protein 3 (CRP3/MLP). The second LIM domain of Cysteine Rich Protein 3 (CRP3/MLP):  Cysteine-rich proteins (CRPs) are characterized by the presence of two LIM domains linked to short glycine-rich repeats (GRRs). The CRP family members include CRP1, CRP2, CRP3/MLP and TLP. CRP1, CRP2 and CRP3 share a conserved nuclear targeting signal (K/R-K/R-Y-G-P-K), which supports the fact that these proteins function not only in the cytoplasm but also in the nucleus. CRPs control regulatory pathways during cellular differentiation, and are involved in complex transcription circuits and in the organization and arrangement of the myofibrillar/cytoskeletal network. CRP3, also called Muscle LIM Protein (MLP), is a striated muscle-specific factor that enhances myogenic differentiation. The second LIM domain of CRP3/MLP interacts with cytoskeletal protein beta-spectrin. CRP3/MLP also interacts with the basic helix-loop-helix myogenic transcription factors MyoD, myogenin, and MRF4, thereby increasing their affinity for specific DNA regulatory elements. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	54
188867	cd09483	LIM1_Prickle_1	The first LIM domain of Prickle 1. The first LIM domain of Prickle 1: Prickle contains three C-terminal LIM domains and a N-terminal PET domain. Prickles have been implicated in roles of regulating tissue polarity or planar cell polarity (PCP).  PCP establishment requires the conserved Frizzled/Dishevelled PCP pathway. Prickle interacts with Dishevelled, thereby modulating Frizzled/Dishevelled activity and PCP signaling. Four forms of prickles have been identified: prickle 1-4. The best characterized are prickle 1 and prickle 2, which are differentially expressed. While prickle 1 is expressed in fetal heart and hematological malignancies, prickle 2 is mainly expressed in fetal brain, adult cartilage, pancreatic islet, and some types of tumorous cells. In addition, Prickle 1 regulates cell movements during gastrulation and neuronal migration through interaction with the noncanonical Wnt11/Wnt5 pathway in zebrafish. Mutations in prickle 1 have been linked to progressive myoclonus epilepsy.  LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	59
188868	cd09484	LIM1_Prickle_2	The first LIM domain of Prickle 2. The first LIM domain of Prickle 2: Prickle contains three C-terminal LIM domains and a N-terminal PET domain.  Prickles have been implicated in roles of regulating tissue polarity or planar cell polarity (PCP).  PCP establishment requires the conserved Frizzled/Dishevelled PCP pathway. Prickle interacts with Dishevelled, thereby modulating Frizzled/Dishevelled activity and PCP signaling. Four forms of prickles have been identified: prickle 1-4. The best characterized are prickle 1 and prickle 2, which are differentially expressed. While prickle 1 is expressed in fetal heart and hematological malignancies, prickle 2 is found in fetal brain, adult cartilage, pancreatic islet, and some types of tumorous cells. Mutations in prickle 1 have been linked to progressive myoclonus epilepsy. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	59
188869	cd09485	LIM_Eplin_alpha_beta	The Lim domain of Epithelial Protein Lost in Neoplasm (Eplin). The Lim domain of Epithelial Protein Lost in Neoplasm (Eplin): Epithelial Protein Lost in Neoplasm is a cytoskeleton-associated tumor suppressor whose expression inversely correlates with cell growth, motility, invasion and cancer mortality.  Eplin interacts with and stabilizes F-actin filaments and stress fibers, which correlates with its ability to suppress anchorage independent growth. In epithelial cells, Eplin is required for formation of the F-actin adhesion belt by binding to the E-cadherin-catenin complex through alpha-catenin. Eplin is expressed in two isoforms, a longer Eplin-beta and a shorter Eplin-alpha. Eplin-alpha mRNA is detected in various tissues and cell lines, but is absent or down regulated in cancer cells. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	53
188870	cd09486	LIM_Eplin_like_1	a LIM domain subfamily on a group of proteins with unknown function. This model represents a LIM domain subfamily of Eplin-like family.  This family shows highest homology to the LIM domains on Eplin and XIRP2 protein families. Epithelial Protein Lost in Neoplasm is a cytoskeleton-associated tumor suppressor whose expression inversely correlates with cell growth, motility, invasion and cancer mortality. Xirp2 is expressed in muscles and is an important effector of the Ang II signaling pathway in the heart. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	53
188886	cd09487	SAM_superfamily	SAM (Sterile alpha motif ). SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases.	56
188887	cd09488	SAM_EPH-R	SAM domain of EPH family of tyrosine kinase receptors. SAM (sterile alpha motif) domain of EPH (erythropoietin-producing hepatocyte) family of receptor tyrosine kinases is a C-terminal signal transduction module located in the cytoplasmic region of these receptors. SAM appears to mediate cell-cell initiated signal transduction via binding proteins to a conserved tyrosine that is phosphorylated. In some cases the SAM domain mediates homodimerization/oligomerization and plays a role in the clustering process necessary for signaling. EPH kinases are the largest family of receptor tyrosine kinases. They are classified into two groups based on their abilities to bind ephrin-A and ephrin-B ligands. The EPH receptors are involved in regulation of cell movement, shape, and attachment during embryonic development; they control cell-cell interactions in the vascular, nervous, epithelial, and immune systems, and in many tumors. They are potential molecular markers for cancer diagnostics and potential targets for cancer therapy.	61
188888	cd09489	SAM_Smaug-like	SAM (Sterile alpha motif ). SAM (sterile alpha motif) domain of Smaug-like subfamily proteins is an RNA binding domain. SAM interacts with stem-loop structures in target mRNAs. Proteins of this subfamily are post-transcriptional regulators involved in mRNA silencing and deadenylation; they can be implicated in transcript stability regulation and vacuolar protein transport as well.  SAM_Smaug-like domain-containing proteins are found in metazoa from yeast to human. In animals they are active during early embryogenesis.	57
188889	cd09490	SAM_Arap1,2,3	SAM domain of Arap1,2,3 (angiotensin receptor-associated protein). SAM (sterile alpha motif) domain of Arap1,2,3 subfamily proteins (angiotensin receptor-associated) is a protein-protein interaction domain. Arap1,2,3 proteins are phosphatidylinositol-3,4,5-trisphosphate-dependent GTPase-activating proteins. They are involved in phosphatidylinositol-3 kinase (PI3K) signaling pathways. In addition to SAM domain, Arap1,2,3 proteins contain ArfGap, PH-like, RhoGAP and UBQ domains. SAM domain of Arap3 protein was shown to interact with SAM domain of Ship2 phosphatidylinositol-trisphosphate phosphatase proteins. Such interaction apparently plays a role in inhibition of PI3K regulated pathways since Ship2 converts PI(3,4,5)P3 into PI(3,4)P2. Proteins of this subfamily participate in regulation of signaling and trafficking associated with a number of different receptors (including EGFR, TRAIL-R1/DR4, TRAIL-R2/DR5) in normal and cancer cells; they are involved in regulation of actin cytoskeleton remodeling, cell spreading and formation of lamellipodia.	63
188890	cd09491	SAM_Ship2	SAM domain of Ship2 lipid phosphatase proteins. SAM (sterile alpha motif) domain of Ship2 subfamily is a protein-protein interaction domain. Ship2 proteins are lipid phosphatases (Phosphatidylinositol-3,4,5-trisphosphate 5-phosphatase 2) containing an N-terminal SH2 domain, a central phosphatase domain and a C-terminal SAM domain. Ship2 is involved in a number of PI3K signaling pathways. For example, it plays a role in regulation of the actin cytoskeleton remodeling, in insulin signaling pathways, and in EphA2 receptor endocytosis. SAM domain of Ship2 can interact with SAM domain of other proteins in these pathways, thus participating in signal transduction. In particular, SAM of Ship2 is known to form heterodimers with SAM domain of Eph-A2 receptor tyrosine kinase during receptor endocytosis as well as with SAM domain of PI3K effector protein Arap3 in the actin cytoskeleton signaling network. Since Ship2 plays a role in negatively regulating insulin signaling, it has been suggested that inhibition of its expression or function may contribute in treating type 2 diabetes and obesity-induced insulin resistance.	63
188891	cd09492	SAM_SASH1_repeat2	SAM domain of SASH1 proteins, repeat 2. SAM (sterile alpha motif) repeat 2 of SASH1 proteins is a protein-protein interaction domain. Members of this subfamily are putative adaptor proteins. They appear to mediate signal transduction. SASH1 can bind 14-3-3 proteins in response to IGF1/phosphatidylinositol 3-kinase signaling. SASH1 was found upregulated in different tissues including thymus, placenta, and lungs, and downregulated in some breast tumors, liver metastases and colon cancers compared to corresponding normal tissues.  SASH1 is a potential candidate for a tumor suppressor gene in breast cancers.  At the same time, downregulation of SASH1 in colon cancer is associated with metastasis and a poor prognosis.	70
188892	cd09493	SAM_SASH-like	SAM (Sterile alpha motif ), SASH1-like. SAM (sterile alpha motif) domain of SASH1-like proteins is a protein-protein interaction domain. Members of this subfamily are putative adaptor proteins. They appear to mediate signal transduction. Proteins of this subfamily are known to be involved in preventing DN thymocytes from premature initiation of programmed cell death and in B cell activation and differentiation. They have been found downregulated in some breast tumors, liver metastases and colon cancers compared to corresponding normal tissues.	60
188893	cd09494	SAM_liprin-kazrin_repeat1	SAM domain of liprin/kazrin proteins repeat 1. SAM (sterile alpha motif) domain repeat 1 of liprin/kazrin proteins is a protein-protein interaction domain. The long form of liprin/kazrin proteins contains three copies (repeats) of the SAM domain. Liprin-alpha may form heterodimers with liprin-beta through their SAM domains. Liprins were originally identified as LAR (leukocyte common antigen-related) transmembrane protein-tyrosine phosphatase-interacting proteins. They participate in mammary gland development and in axon guidance. In particular, liprin-alpha is involved in formation of the presynaptic active zone; liprin-beta is involved in the maintenance of lymphatic vessel integrity. Kazrins are involved in interplay between desmosomes and adherens junctions; additionally they play a role in regulation of intercellular differentiation, junction assembly, and cytoskeletal organization.	58
188894	cd09495	SAM_liprin-kazrin_repeat2	SAM domain of liprin/kazrin proteins repeat 2. SAM (sterile alpha motif) domain repeat 2 of liprin/kazrin proteins is a protein-protein interaction domain. The long form of liprin/kazrin proteins contains three copies (repeats) of SAM domain. Liprin-alpha may form heterodimers with liprin-beta through their SAM domains. Liprins were originally identified as LAR (leukocyte common antigen-related) transmembrane protein-tyrosine phosphatase-interacting proteins. They participate in mammary gland development and in axon guidance. In particular, liprin-alpha is involved in formation of the presynaptic active zone; liprin-beta is involved in the maintenance of lymphatic vessel integrity. Kazrins are involved in interplay between desmosomes and adherens junctions; additionally they play a role in regulation of intercellular differentiation, junction assembly, and cytoskeletal organization.	60
188895	cd09496	SAM_liprin-kazrin_repeat3	SAM domain of liprin/kazrin proteins repeat 3. SAM (sterile alpha motif) domain repeat 3 of liprin/kazrin proteins is a protein-protein interaction domain. The long form of liprin/kazrin proteins contains three copies (repeats) of SAM domain. Liprin-alpha may form heterodimers with liprin-beta through their SAM domains. Liprins were originally identified as LAR (leukocyte common antigen-related) transmembrane protein-tyrosine phosphatase-interacting proteins. They participate in mammary gland development and in axon guidance. In particular, liprin-alpha is involved in formation of the presynaptic active zone; liprin-beta is involved in the maintenance of lymphatic vessel integrity. Kazrins are involved in interplay between desmosomes and adherens junctions; additionally they play a role in regulation of intercellular differentiation, junction assembly, and cytoskeletal organization.	62
188896	cd09497	SAM_caskin1,2_repeat1	SAM domain of caskin protein repeat 1. SAM (sterile alpha motif) domain repeat 1 of caskin1,2 proteins is a protein-protein interaction domain. Caskin has two tandem SAM domains. Caskin protein is known to interact with membrane-associated guanylate kinase CASK, and apparently may play a role in neural development, synaptic protein targeting, and regulation of gene expression.	66
188897	cd09498	SAM_caskin1,2_repeat2	SAM domain of caskin protein repeat 2. SAM (sterile alpha motif) domain repeat 2 of caskin1,2 proteins is a protein-protein interaction domain. Caskin has two tandem SAM domains. Caskin protein is known to interact with membrane-associated guanylate kinase CASK, and may play a role in neural development, synaptic protein targeting, and regulation of gene expression.	71
188898	cd09499	SAM_AIDA1AB-like_repeat1	SAM domain of AIDA1AB-like proteins, repeat 1. SAM (sterile alpha motif) domain repeat 1 of AIDA1AB-like proteins is a protein-protein interaction domain. AIDA1AB-like proteins have two tandem SAM domains. They may form an intramolecular head-to-tail homodimer. One of two basic motifs of the nuclear localization signal (NLS) is located within helix 5 of SAM2 (motif HKRK). This signal plays a role in decoupling of SAM2 from SAM1, thus facilitating translocation of this type proteins into the nucleus. SAM1 domain has a potential phosphorylation site for CMGC group of serine/threonine kinases. SAM domains of the AIDA1-like subfamily can directly bind ubiquitin and participate in regulating the degradation of ubiquitinated EphA receptors, particularly EPH-A8 receptor. Additionally AIDA1AB-like proteins may participate in the regulation of nucleoplasmic coilin protein interactions.	67
188899	cd09500	SAM_AIDA1AB-like_repeat2	SAM domain of AIDA1AB-like proteins, repeat 2. SAM (sterile alpha motif) domain repeat 2 of AIDA1AB-like proteins is a protein-protein interaction domain. AIDA1AB-like proteins have two tandem SAM domains. They may form an intramolecular head-to-tail homodimer. One of two basic motifs of the nuclear localization signal (NLS) is located within helix 5 of the SAM2 (motif HKRK). This signal plays a role in decoupling of SAM2 from SAM1, thus facilitating translocation of this type proteins into the nucleus. SAM domains of the AIDA1AB-like subfamily can directly bind ubiquitin and participate in regulating the degradation of ubiquitinated EphA receptors, particularly EPH-A8 receptor. Additionally AIDA1AB-like proteins may participate in the regulation of nucleoplasmic coilin protein interactions.	65
188900	cd09501	SAM_SARM1-like_repeat1	SAM domain of SARM1-like proteins, repeat 1. SAM (sterile alpha motif) domain repeat 1 of SARM1-like adaptor proteins is a protein-protein interaction domain. SARM1-like proteins contain two tandem SAM domains. SARM1-like proteins are involved in TLR (Toll-like receptor) signaling. They are responsible for targeted localization of the whole protein to post-synaptic regions of axons. In humans SARM1 expression is detected in kidney and liver.	69
188901	cd09502	SAM_SARM1-like_repeat2	SAM domain of SARM1-like, repeat 2. SAM (sterile alpha motif) domain repeat 2 of SARM1-like adaptor proteins is a protein-protein interaction domain. SARM1-like proteins contain two tandem SAM domains. SARM1-like proteins are involved in TLR (Toll-like receptor) signaling. They are responsible for targeted localization of the whole protein to post-synaptic regions of axons. In humans SARM1 expression is detected in kidney and liver.	70
188902	cd09503	SAM_tumor-p63,p73	SAM domain of tumor-p63,p73 proteins. SAM (sterile alpha motif) domain of p63, p73 transcriptional factors is a putative protein-protein interaction domain and lipid-binding domain. p63 and p73 are homologs to the tumor suppressor p53. They have a C-terminal SAM domain in their longest spliced alpha forms, while p53 doesn't have it. p63 or p73 knockout mice show significant developmental abnormalities but no increased cancer susceptibility, suggesting that p63 and p73 play a role in regulation of normal development. It was shown that SAM domain of p73 is able to bind some membrane lipids. The structural rearrangements in SAM are necessary to accomplish the binding. No evidence for homooligomerization through SAM domains was found for p63/p73 subfamily. It was suggested that the partner proteins should be either more distantly related SAM-containing domain proteins or proteins without the SAM domain.	65
188903	cd09504	SAM_STIM-1,2-like	SAM domain of STIM-1,2-like proteins. SAM (sterile alpha motif) domain of STIM-1,2-like (Stromal interaction molecule) proteins is a putative protein-protein interaction domain. STIM1 and STIM2 human proteins are type I transmembrane proteins. The N-terminal part of them includes "hidden" EF-hand and SAM domains. This region is responsible for sensing changes in store-operated and basal cytoplasmic Ca2+ levels and initiates oligomerization. "Hidden" EF hand and SAM domains have a stable intramolecular association, and the SAM domain is a component that regulates stability within STIM proteins. Destabilization of the EF-SAM association during Ca2+ depletion leads to partial unfolding and aggregation (homooligomerization), thus activating the store-operated Ca2+ entry. Immunoprecipitation analysis indicates that STIM1 and STIM2 can form co-precipitable oligomeric associations in vivo. It was suggested that STIM1 and STIM2 are involved in opposite regulation of store operated channels in plasma membrane.	74
188904	cd09505	SAM_WDSUB1	SAM domain of WDSUB1 proteins. SAM (sterile alpha motif) domain of WDSUB1 subfamily proteins is a putative protein-protein interaction domain. Proteins of this group contain multiple domains: SAM, one or more WD40 repeats and U-box (derived version of the RING-finger domain). Apparently the WDSUB1 subfamily proteins participate in protein degradation through ubiquitination, since U-box domain are known as a member of E3 ubiquitin ligase family, while SAM and WD40 domains most probably are responsible for an E2 ubiquitin-conjugating enzyme binding and a target protein binding.	72
188905	cd09506	SAM_Shank1,2,3	SAM domain of Shank1,2,3 family proteins. SAM (sterile alpha motif) domain of Shank1,2,3 family proteins is a protein-protein interaction domain. Shank1,2,3 proteins are scaffold proteins that are known to interact with a variety of cytoplasmic and membrane proteins. SAM domains of the Shank1,2,3 family are prone to homooligomerization. They are highly enriched in the postsynaptic density, acting as scaffolds to organize assembly of postsynaptic proteins. SAM domains of Shank3 proteins can form large sheets of helical fibers. Shank genes show distinct patterns of expression, in rat Shank1 mRNA is found almost exclusively in brain, Shank2 in brain, kidney and liver, and Shank3 in heart, brain and spleen.	66
188906	cd09507	SAM_DGK-delta-eta	SAM domain of diacylglycerol kinase delta and eta subunits. SAM (sterile alpha motif) domain of DGK-eta-delta subfamily proteins is a protein-protein interaction domain. Proteins of this subfamily are multidomain diacylglycerol kinases with a SAM domain located at the C-terminus. DGK proteins participate in signal transduction. They regulate the level of second messengers such as diacylglycerol and phosphatidic acid. The SAM domain of DGK proteins can form high molecular weight homooligomers through head-to-tail interactions as well as heterooligomers between the SAM domains of DGK delta and eta proteins. The oligomerization plays a role in the regulation of DGK intracellular localization.	65
188907	cd09508	SAM_HD	SAM domain of HD-phosphohydrolase. SAM (sterile alpha motif) domain of SAM_HD subfamily proteins is a putative protein-protein interaction domain. Proteins of this group, additionally to the SAM domain, contain a HD hydrolase domain. Human SAM-HD1 is a nuclear protein involved in innate immune response and may act as a negative regulator of the cell-intrinsic antiviral response. Mutations in this gene lead to Aicardi-Goutieres syndrome (symptoms include cerebral atrophy, leukoencephalopathy, hepatosplenomegaly, and increased production of alpha-interferon).	70
188908	cd09509	SAM_Polycomb	SAM domain of Polycomb group. SAM (sterile alpha motif) domain of Polycomb group is a protein-protein interaction domain. The Polycomb group includes transcriptional repressors which are involved in the regulation of some key regulatory genes during development in many organisms. They are best known for silencing Hox (Homeobox) genes. Polycomb proteins work together in large multimeric and chromatin-associated complexes. They organize chromatin of the target genes and maintain repressed states during many cell divisions. Polycomb proteins are classified based on their common function, but not on conserved domains and/or motifs; however many Polycomb proteins (members of PRC1 class complex) contain SAM domains which are more similar to each other inside of the Polycomb group than to SAM domains outside of it. Most information about structure and function of Polycomb SAM domains comes from studies of Ph (Polyhomeotic) and Scm (Sex comb on midleg) proteins. Polycomb SAM domains usually can be found at the C-terminus of the proteins. Some members of this group contain, in addition to the SAM domain,  MTB repeats, Zn finger, and/or DUF3588 domains. Polycomb SAM domains can form homo- and/or heterooligomers through ML and EH surfaces. SAM/SAM oligomers apparently play a role in transcriptional repression through polymerization along the chromosome. Polycomb proteins are known to be highly expressed in some cells years before their cancer pathology; thus they are attractive markers for early cancer therapy.	64
188909	cd09510	SAM_aveugle-like	SAM domain of aveugle-like subfamily. SAM (sterile alpha motif) domain of SAM_aveugle-like subfamily is a protein-protein interaction domain. In Drosophila, the aveugle (AVE) protein (also known as HYP (Hyphen)) is involved in normal photoreceptor differentiation, and required for epidermal growth factor receptor (EGFR) signaling between ras and raf genes during eye development and wing vein formation. SAM domain of the HYP(AVE) protein interacts with SAM domain of CNK, the multidomain scaffold protein connector enhancer of kinase suppressor of ras. CNK/HYP(AVE) complex interacts with KSR (kinase suppressor of Ras) protein. This interaction leads to stimulation of Ras-dependent Raf activation. This subfamily also includes vertebrate AVE homologs - Samd10 and Samd12 proteins. Their exact function is unknown, but they may play a role in signal transduction during embryogenesis.	75
188910	cd09511	SAM_CNK1,2,3-suppressor	SAM domain of CNK1,2,3-suppressor subfamily. SAM (sterile alpha motif) domain of CNK (connector enhancer of kinase suppressor of ras (Ksr)) subfamily is a protein-protein interaction domain. CNK proteins are multidomain scaffold proteins containing a few protein-protein interaction domains and are required for connecting Rho and Ras signaling pathways. In Drosophila, the SAM domain of CNK is known to interact with the SAM domain of the aveugle protein, forming a heterodimer. Mutation of the SAM domain in human CNK1 abolishes the ability to cooperate with the Ras effector, supporting the idea that this interaction is necessary for proper Ras signal transduction.	69
188911	cd09512	SAM_Neurabin-like	SAM domain of SAM_Neurabin-like subfamily. SAM (sterile alpha motif) domain of Neurabin-like (Neural actin-binding) subfamily is a putative protein-protein interaction domain. This group currently includes the SAM domains of neurabin-I, SAMD14 and neurabin-I/SAMD14-like proteins.  Most are multidomain proteins and in addition to SAM domain they contain other protein-binding domains such as PDZ and actin-binding domains. Members of this subfamily participate in signal transduction. Neurabin-I is involved in the regulation of Ca signaling intensity in alpha-adrenergic receptors; it forms a functional pair of opposing regulators with neurabin-II. Neurabins are expressed almost exclusively in neuronal cells. They are known to interact with protein phosphatase 1 and inhibit its activity; they also can bind actin filaments; however, the exact role of the SAM domain is unclear, since SAM doesn't participate in these interactions.	70
188912	cd09513	SAM_BAR	SAM domain of BAR subfamily. SAM (sterile alpha motif) domain of BAR (Bifunctional Apoptosis Regulator) subfamily is a protein-protein interaction domain. In addition to the SAM domain, this type of regulator has a RING finger domain. Proteins of this subfamily are involved in the apoptosis signal network. Their overexpression in human neuronal cells significantly protects cells from a broad range of cell death stimuli.  SAM domain can interact with Caspase8, Bcl-2 and Bcl-X resulting in suppression of Bax-induced cell death.	71
188913	cd09514	SAM_SGMS1	SAM domain of sphingomyelin synthase. SAM (sterile alpha motif) domain of SGMS-1 (sphingomyelin synthase) subfamily is a potential protein-protein interaction domain. Sphingomyelin synthase 1 is a transmembrane protein with a SAM domain at the N-terminus and a catalytic domain at the C-terminus. Sphingomyelin synthase 1 is a Golgi-associated enzyme, and depending on the concentration of diacylglycerol and ceramide, can catalyze synthesis of phosphocholine or sphingomyelin, respectively. It plays a central role in sphingolipid and glycerophospholipid metabolism.	72
188914	cd09515	SAM_SGMS1-like	SAM domain of sphingomyelin synthase related subfamily. SAM (sterile alpha motif) domain of SGMS-like (sphingomyelin synthase) subfamily is a potential protein-protein interaction domain. This group of proteins is related to sphingomyelin synthase 1, and contains an N-terminal SAM domain. The function of SGMS1-like proteins is unknown; they may play a role in sphingolipid metabolism.	70
188915	cd09516	SAM_sec23ip-like	SAM domain of sec23ip-like subfamily. SAM (sterile alpha motif) domain of Sec23ip-like (Sec23 interacting protein) subfamily is a potential protein-protein interaction domain. This group of proteins includes Sec23ip and DDHD2 proteins. All of them contain at least two domains: a SAM domain and a predicted metal-binding domain. For mammalian DDHD2 members of this group, phospholipase activity has been demonstrated. Sec23ip proteins of this group interact with Sec23 proteins via an N-terminal proline-rich region. Members of this subfamily are involved in organization of ER/Golgi intermediate compartment.	69
188916	cd09517	SAM_USH1G_HARP	SAM domain of USH1G_HARP family. SAM (sterile alpha motif) domain of USH1G/HARP (Usher syndrome type-1G/ Harmonin-interacting Ankyrin Repeat-containing protein) family is a protein-protein interaction domain. Members of this family have an N-terminal ankyrin repeat region and a C-terminal SAM domain. In mammals these proteins can interact via the SAM domain with the PDZ domain of harmonin to form a scaffolding complex that facilitates signal transduction in epithelial and inner ear sensory cells. It was suggested that USH1G and HARP can be tissue specific partners of harmonin. Mutations in ush1g genes lead to Usher syndrome type 1G. This syndrome is the cause of deaf-blindness in humans.	66
188917	cd09518	SAM_ANKS6	SAM domain of ANKS6 (or SamCystin) subfamily. SAM (sterile alpha motif) domain of ANKS6 (or SamCystin) subfamily is a potential protein-protein interaction domain. Proteins of this subfamily have N-terminal ankyrin repeats and a C-terminal SAM domain. They are able to form self-associated complexes and both (SAM and ANK) domains play a role in such interactions.  Mutations in Anks6 gene are associated with polycystic kidney disease. They cause formation of renal cysts in rodent models. It was suggested that the ANKS6 protein can interact indirectly (through RNA and protein intermediates) with BICC1, another polycystic kidney disease-associated protein.	65
188918	cd09519	SAM_ANKS3	SAM domain of ANKS3 subfamily. SAM (sterile alpha motif) domain of ANKS3 subfamily is a potential protein-protein interaction domain. Proteins of this subfamily have N-terminal ankyrin repeats and a C-terminal SAM domain. SAM is a widespread domain in signaling proteins. In many cases it mediates homo-dimerization/oligomerization.	64
188919	cd09520	SAM_BICC1	SAM domain of BICC1 (bicaudal) subfamily. SAM (sterile alpha motif) domain of BICC1 (bicaudal) subfamily is a protein-protein interaction domain. Proteins of this group have N-terminal K homology RNA-binding vigilin-like repeats and a C-terminal SAM domain. BICC1 is involved in the regulation of embryonic differentiation. It plays a role in the regulation of Dvl (Dishevelled) signaling, particularly in the correct cilia orientation and nodal flow generation. In Drosophila, disruption of BICC1 can disturb the normal migration direction of the anterior follicle cell of oocytes; the specific function of SAM is to recruit whole protein to the periphery of P-bodies. In mammals, mutations in this gene are associated with polycystic kidney disease and it was suggested that the BICC1 protein can indirectly interact with ANKS6 protein (ANKS6 is also associated with polycystic kidney disease) through some protein and RNA intermediates.	65
188920	cd09521	SAM_ASZ1	SAM domain of ASZ1 subfamily. SAM (sterile alpha motif) domain of ASZ1 (Ankyrin, SAM, leucine Zipper) also known as GASZ (Germ cell-specific Ankyrin, SAM, leucine Zipper) subfamily is a potential protein-protein interaction domain. Proteins of this group are involved in the repression of transposable elements during spermatogenesis, oogenesis, and preimplantation embryogenesis. They support synthesis of PIWI-interacting RNA via association with some PIWI proteins, such as MILI and MIWI. This association is required for initiation and maintenance of retrotransposon repression during meiosis. In mice lacking ASZ1, DNA damage and delayed germ cell maturation were observed due to retrotransposons being released from their repressed state.	64
188921	cd09522	SAM_SLP76	SAM domain of SLP76 subfamily. SAM (sterile alpha motif) domain of SLP76 (SH2 domain-containing leukocyte protein 76), also known as LCP2 (Lymphocyte cytosolic protein), subfamily is a protein-protein interaction domain. Proteins of this group have an N-terminal SAM domain, 3 phosphotyrosine motifs, a proline-rich region and a C-terminal SH2 domain. They are scaffold proteins involved in protein complex formation. The complexes play a role in T-cell receptor mediated signaling pathways such as integrin activation, cytoskeletal organization, MAPK activation, and calcium flux.  SAM domain deleted SLP76 knockin mice show a number of defects, including partially blocked thymocyte development, impaired positive and negative thymic selection and changes in T-cell receptor mediated signaling.	69
188922	cd09523	SAM_TAL	SAM domain of TAL subfamily. SAM (sterile alpha motif) domain of TAL (Tsg101-associated ligase) proteins, also known as LRSAM1 (Leucine-rich repeat and sterile alpha motif-containing) proteins, is a putative protein-protein interaction domain. Proteins of this subfamily participate in the regulation of retrovirus budding and receptor endocytosis. They show E3 ubiquitin ligase activity. Human TAL protein interacts with Tsg101 and TAL's C-terminal ring finger domain is essential for the multiple monoubiquitylation of Tsg101.	65
188923	cd09524	SAM_tankyrase1,2	SAM domain of tankyrase1,2 subfamily. SAM (sterile alpha motif) domain of Tankyrase1,2 subfamily is a protein-protein interaction domain.  In addition to the SAM domain, proteins of this group have ankyrin repeats and an ADP-ribosyltransferase (poly-(ADP-ribose) synthase) domain. Tankyrases can polymerize through their SAM domains forming homooligomers and these complexes are disrupted by autoribosylation. Tankyrases apparently act as master scaffolding proteins and thus may interact simultaneously with multiple proteins, in particular with TRF1, NuMA, IRAP and Grb14 (ankyrin repeats are involved in these interactions). Tankyrases participate in a variety of cell signaling pathways as effector molecules. Their functions are different depending on the intracellular location: at telomeres they play a role in the regulation of telomere length via control of telomerase access to telomeres, at centrosomes they promote spindle assembly/disassembly, in Golgi vesicles they participate in the regulation of vesicle trafficking and Golgi dynamics. Tankyrase 1 may be of interest as a new potential target for telomerase-directed cancer therapy.	66
188924	cd09525	SAM_GAREM	SAM domain of GAREM subfamily. SAM (sterile alpha motif) domain of GAREM (Grb2-associated and regulator of Erk/MAPK) protein subfamily (also known as FAM59A) is a putative protein-protein interaction domain. SAM domain is a widespread domain in signaling proteins. Proteins of this group have SAM at the C-terminus. Human GAREM protein is known to play a role in regulation of the EGF (Epidermal Growth Factor) receptor and of Gab or insulin receptor substrate-1 family proteins. Grb2 (Growth factor receptor-bound) protein was identified as a binding partner of human GAREM. Proline-rich motifs and phosphorylation of two conserved tyrosines in GAREM are important for the interaction with the SH3 domains of Grb2 protein; however these motifs and residues do not belong to the SAM domain.	67
188925	cd09526	SAM_Samd3	SAM domain of Samd3 subfamily. SAM (sterile alpha motif) domain of the Samd3 subfamily is a putative protein-protein interaction domain. Proteins of this subfamily have a SAM domain at the N-terminus. SAM is a widespread domain in signaling and regulatory proteins. In many cases SAM mediates dimerization/oligomerization. Exact function of proteins belonging to this subfamily is unknown.	66
188926	cd09527	SAM_Samd5	SAM domain of Samd5 subfamily. SAM (sterile alpha motif) domain of Samd5 subfamily is a putative protein-protein interaction domain. Proteins of this subfamily have a SAM domain at the N-terminus. SAM is a widespread domain in signaling and regulatory proteins. In many cases SAM mediates dimerization/oligomerization.  The exact function of proteins belonging to this subfamily is unknown.	63
188927	cd09528	SAM_Samd9_Samd9L	SAM domain of Samd9/Samd9L subfamily. SAM (sterile alpha motif) domain of Samd9/Samd9L subfamily is a putative protein-protein interaction domain. SAM is a widespread domain in signaling proteins. Samd9 is a tumor suppressor gene. It is involved in death signaling of malignant glioblastoma. Samd9 suppression blocks cancer cell death induced by HVJ-E or IFN-beta treatment. Deleterious mutations in Samd9 lead to normophosphatemic familial tumoral calcinosis, a cutaneous disorder characterized by cutaneous calcification or ossification.	64
188928	cd09529	SAM_MLTK	SAM domain of MLTK subfamily. SAM (sterile alpha motif) domain of MLTK subfamily is a protein-protein interaction domain. Besides SAM domain, these proteins have N-terminal protein tyrosine kinase domain and leucine-zipper motif. Proteins of this group act as mitogen-activated protein triple kinase in a number of MAPK cascades. They can be activated by autophosphorylation in response to stress signals. MLTK-alpha is known to phosphorylate histone H3. In mammals, MLTKs participate in the activation of the JNK/SAPK, p38, ERK5 pathways, the transcriptional factor NF-kB, in the regulation of the cell cycle checkpoint, and in the induction of apoptosis in a hepatoma cell line. Some members of this subfamily are proto-oncogenes, thus MLTK-alpha is involved in neoplastic cell transformation and/or skin cancer development in athymic nude mice. Based on in vivo coprecipitation experiments in mammalian cells, it has been demonstrated that MLTK proteins might form homodimers/oligomers via their SAM domains.	71
188929	cd09530	SAM_Samd14	SAM domain of Samd14 subfamily. SAM (sterile alpha motif) domain of SamD14 (or FAM15A) subfamily is a putative protein-protein interaction domain. SAM is a widespread domain in proteins involved in signal transduction and regulation. In many cases SAM mediates homodimerization/oligomerization. The exact function of proteins belonging to this subfamily is unknown.	67
188930	cd09531	SAM_CS047	SAM domain of CS047 subfamily. SAM (sterile alpha motif) domain of CS047 subfamily is a putative protein-protein interaction domain. Proteins of this subfamily have a SAM domain at the N-terminus. SAM is a widespread domain in signaling and regulatory proteins. In many cases SAM mediates homodimerization/oligomerization. The exact function of proteins belonging to this group is unknown.	65
188931	cd09532	SAM_SLA1_fungal	SAM domain of SLA1 subfamily. SAM (sterile alpha motif) domain of fungal SLA1 proteins is a protein-protein interaction domain. Proteins of this group consist of a few N-terminal SH3 domains followed by SHD1 domain, SAM domain (also known as SHD2) and multiple C-terminal repeats. The yeast SLA1 protein is an endocytic clathrin adaptor. It is associated with a variety of endocytic accessory factors and required for endocytic vesicle formation and for clathrin and actin-dependent cargo recognition. SLA1 binds clathrin through a variant clathrin-binding motif (vCB). The SAM domain negatively regulates this binding by blocking the vCB site. The SAM domains of SLA1 proteins can form oligomers via their mid-loop (ML) and end-helix (EH) regions. Such self-associations apparently are important for SLA1 function. A proposed regulatory model suggests that SAM can be considered a mediator of two aspects of clathrin adaptor function. It plays a role in negative regulation of clathrin binding via an intramolecular interaction with the vCB, and a role in positive regulation of vesicle coat assembly via self-oligomerization.	62
188932	cd09533	SAM_Ste50-like_fungal	SAM domain of Ste50_like (ubc2) subfamily. SAM (sterile alpha motif) domain of Ste50-like (or Ubc2 for Ustilago bypass of cyclase) subfamily is a putative protein-protein interaction domain. This group includes only fungal proteins. Basidiomycetes have an N-terminal SAM domain, central UBQ domain, and C-terminal SH3 domain, while Ascomycetes lack the SH3 domain. Ubc2 of Ustilago maydis is a major virulence and maize pathogenicity factor. It is required for filamentous growth (the budding haploid form of Ustilago maydis is a saprophyte, while filamentous dikaryotic form is a pathogen). Also the Ubc2 protein is involved in the pheromone-responsive morphogenesis via the MAP kinase cascade. The SAM domain is necessary for ubc2 function; deletion of SAM eliminates this function.  A Lys-to-Glu mutation in the SAM domain of ubc2 gene induces temperature sensitivity.	58
188933	cd09534	SAM_Ste11_fungal	SAM domain of Ste11_fungal subfamily. SAM (sterile alpha motif) domain of Ste11 subfamily is a protein-protein interaction domain. Proteins of this subfamily have SAM domain at the N-terminus and protein kinase domain at the C-terminus. They participate in regulation of mating pheromone response, invasive growth and high osmolarity growth response. MAP triple kinase Ste11 from S.cerevisiae is known to interact with Ste20 kinase and Ste50 regulator. These kinases are able to form homodimers interacting through their SAM domains as well as heterodimers or heterogeneous complexes when either SAM domain of monomeric or homodimeric form of Ste11 interacts with Ste50 regulator.	62
188934	cd09535	SAM_BOI-like_fungal	SAM domain of BOI-like fungal subfamily. SAM (sterile alpha motif) domain of BOI-like fungal subfamily is a potential protein-protein interaction domain. Proteins of this subfamily are apparently scaffold proteins, since most contain SH3 and PH domains, which are also protein-protein interaction domains, in addition to SAM domain.  BOI-like proteins participate in cell cycle regulation.  In particular BOI1 and BOI2 proteins of budding yeast S.cerevisiae are involved in bud formation, and POB1 protein of fission yeast S.pombe plays a role in cell elongation and separation. Among binding partners of BOI-like fungal subfamily members are such proteins as Bem1 and Cdc42 (they are known to be involved in cell polarization and bud formation).	65
188935	cd09536	SAM_Ste50_fungal	SAM domain of Ste50 fungal subfamily. SAM (sterile alpha motif) domain of Ste50 fungal subfamily is a protein-protein interaction domain. Proteins of this subfamily have SAM domain at the N-terminus and Ras-associated UBQ superfamily domain at the C-terminus. They participate in regulation of mating pheromone response, invasive growth and high osmolarity growth response, and contribute to cell wall integrity in vegetative cells. Ste50 of S.cerevisiae acts as an adaptor protein between G protein and MAP triple kinase Ste11. Ste50 proteins are able to form homooligomers, binding each other via their SAM domains, as well as heterodimers and heterogeneous complexes with SAM domain or SAM homodimers of MAPKKK Ste11 protein kinase.	74
188936	cd09537	SAM_CP2-like	SAM domain of CP2-like transcription factors. SAM (sterile alpha motif) domain of CP2-like transcription factor is a putative protein-protein interaction domain. Proteins of this group have an N-terminal DNA-binding CP2 domain, a central predicted SAM domain and some also have a C-terminal dimerization domain. CP2-like family of transcriptional factors includes three subgroups: LBP1, TFCP2, and LBP9. Members of this family are involved in transcriptional regulation from early development to terminal differentiation. They play a role in regulation of expression of P450scc (the cholesterol side-chain cleavage enzyme, cytochrome) in placenta, and alpha-globin in erythroid cells. They are required for proper maturation of the duct (epithelial component of tubular organs) of kidney and salivary gland. Human LBP1 is known to be induced by HIV type I infection in lymphocytes; it represses HIV transcription by preventing the binding of TFIID to the virus promoter. Additionally, it has been suggested that UBP1 (LBP1) regulator might be a member of a blood pressure controlling network. LBP1 protein isoforms are able to form dimers apparently via SAM domain since SAM deletion or mutation resulted in a loss of this ability.	67
188937	cd09538	SAM_DLC1,2-like	SAM domain of DLC1,2-like subfamily. SAM (sterile alpha motif) domain of DLC-1,2-like (Deleted in liver cancer) subfamily is a protein-protein interaction domain located at the N-terminus of the protein. Members of this subfamily do not form dimers/oligomers through their SAM domains. They participate in regulation of cell migration and lipid transfer. SAM domain of human DLC1 protein contains the EF1A1 (eukaryotic elongation factor) binding motif, thus SAM facilitates recruitment of EF1A1 to the membrane periphery and suppresses cell migration. Human Dlc2 gene is known as a tumor suppressor gene. It was found underexpressed in hepatocellular carcinoma.	60
188938	cd09539	SAM_TNK-like	SAM domain of TNK(ACK)-like non-receptor tyrosine-protein kinases. SAM (sterile alpha motif) domain of TNK-like subfamily is a putative protein-protein interaction domain.  This subfamily includes TNK1 and TNK2 (also known as ACK1) non-receptor tyrosine-protein kinases. They contain a SAM domain at the N-terminus followed by a catalytic domain and a few other domains. Members of this group are involved in the regulation of cell adhesion and growth, receptor degradation, and axonal guidance. Deletion of the SAM domain reduces the ability of Ack1 to undergo autophosphorylation and dramatically reduces ubiquitination of Ack1 catalyzed by HECT E3 ubiquitin ligase (Nedd4-1) during EGF-induced Ack1 degradation. It has been suggested that the lysine-rich region in SAM domain might be a major ubiquitination site. Members of this group are also associated with some cancers. Amplification of the Ack1 gene correlates with prostate and lung cancer progression, and Ack1 overexpression increases invasiveness. Oncogenicity of Tnk1 gene apparently depends on cell context; it may play a role in tumor suppression since Tnk1 knockout mice can develop spontaneous tumors.	62
188939	cd09540	SAM_EPS8-like	SAM domain of EPS8-like subfamily. SAM (sterile alpha motif) domain of EPS8-like subfamily is a putative protein-protein interaction domain. This subfamily includes epidermal growth factor receptor kinase substrate 8 proteins (EPS8) and epidermal growth factor receptor kinase substrate 8-like (EPSL8) 1, 2, 3 proteins with the SAM domain located in the C-terminal effector region. This region is responsible for intracellular protein localization and is involved in small GTPases (such as Rac and Rab5) activation/inhibition. Proteins belonging to this group participate in coordination and integration of multiple signaling pathways; in particular, they play a role in the control of actin dynamics and in receptor endocytosis. They can form complexes with other proteins; for example, in the actin signaling network they interact with SOS1 and E3b1 (Abl1) proteins as well as with CRIB (via SH3 domains) during the actin filament formation, and in the receptor endocytosis their partner is RN-tre protein.	66
188940	cd09541	SAM_KIF24-like	SAM domain of KIF24-like subfamily. SAM (sterile alpha motif) domain of KIF24 subfamily is a putative protein-protein interaction domain. This subfamily includes proteins related to human kinesin-like protein KIF24. SAM domain is located at the N-terminus followed by kinesin motor domain. Kinesins are proteins involved in a number of different cell processes including microtubule dynamics and axonal transport. Kinesins of this group belong to N-type; they drive microtubule plus end-directed transport. SAM apparently plays the role of adaptor or scaffold domain. In many cases SAM is known as a mediator of dimerization/oligomerization.	60
188941	cd09542	SAM_EPH-A1	SAM domain of EPH-A1 subfamily of tyrosine kinase receptors. SAM (sterile alpha motif) domain of EPH-A1 subfamily of the receptor tyrosine kinases is a C-terminal protein-protein interaction domain. This domain is located in the cytoplasmic region of EPH-A1 receptors and appears to mediate cell-cell initiated signal transduction. Activation of these receptors leads to inhibition of cell spreading and migration in a RhoA-ROCK-dependent manner. EPH-A1 receptors are known to bind ILK (integrin-linked kinase) which is the mediator of interactions between integrin and the actin cytoskeleton. However SAM is not sufficient for this interaction; it rather plays an ancillary role.  SAM domains of Eph-A1 receptors do not form homo/hetero dimers/oligomers. EphA1 gene was found expressed widely in differentiated epithelial cells. In a number of different malignant tumors EphA1 genes are downregulated. In breast carcinoma the downregulation is associated with invasive behavior of the cell.	63
188942	cd09543	SAM_EPH-A2	SAM domain of EPH-A2 family of tyrosine kinase receptors. SAM (sterile alpha motif) domain of EPH-A2 subfamily of receptor tyrosine kinases is a C-terminal protein-protein interaction domain. This domain is located in the cytoplasmic region of EPH-A2 receptors and appears to mediate cell-cell initiated signal transduction. For example, SAM domain of EPH-A2 receptors interacts with SAM domain of Ship2 proteins (SH2 containing phosphoinositide 5-phosphatase-2) forming heterodimers; such recruitment of Ship2 by EPH-A2 attenuates the positive signal for receptor endocytosis. Eph-A2 is found overexpressed in many types of human cancer, including breast, prostate, lung and colon cancer. High level of expression could induce cancer progression by a variety of mechanisms and could be used as a novel tag for cancer immunotherapy. EPH-A2 receptors are attractive targets for drug design.	70
188943	cd09544	SAM_EPH-A3	SAM domain of EPH-A3 subfamily of tyrosine kinase receptors. SAM (sterile alpha motif) domain of EPH-A3 subfamily of receptor tyrosine kinases is a C-terminal putative protein-protein interaction domain. This domain is located in the cytoplasmic region of EPH-A3 receptors and appears to mediate cell-cell initiated signal transduction. EPH-A3 receptors bind SH2/SH3 containing adaptor protein Nck1 and this adaptor is a key factor in EPH-A3 mediated signaling. However SAM domain is not implicated in this interaction. Activation of EPH-A3 receptors inhibits outgrowth and cell migration. Mutations in SAM domain may play a role in development of hepatocellular carcinoma. Expression of EPH-A3 is associated with lymphocytic leukemia and defines the subset of rhabdomyosarcoma tumors. EPH-A3 receptors are attractive targets for drug design.	63
188944	cd09545	SAM_EPH-A4	SAM domain of EPH-A4 subfamily of tyrosine kinase receptors. SAM (sterile alpha motif) domain of EPH-A4 subfamily of receptor tyrosine kinases is a C-terminal potential protein-protein interaction domain. This domain is located in the cytoplasmic region of EPH-A4 receptors and appears to mediate cell-cell initiated signal transduction. SAM domains of EPH-A4 receptors can form homodimers. EPH-A4 receptors bind ligands such as ephrin A1, A4, A5. They are known to interact with a number of different proteins, including meltrin beta metalloprotease, Cdk5, and EFS2alpha; however, SAM domain doesn't participate in these interactions. EPH-A4 receptors are involved in regulation of corticospinal tract formation, in pathway controlling voluntary movements, in formation of motor neurons, and in axon guidance (SAM domain is not required for axon guidance or for EPH-A4 kinase signaling). In Xenopus embryos EPH-A4 induces loss of cell adhesion, ventro-lateral protrusions, and severely expanded posterior structures. Mutations in SAM domain conserved tyrosine (Y928F) enhance the ability of EPH-A4 to induce these phenotypes, thus supporting the idea that the SAM domain may negatively regulate some aspects of EPH-A4 activity. EphA4 gene was found overexpressed in a number of different cancers including human gastric cancer, colorectal cancer, and pancreatic ductal adenocarcinoma. It is likely to be a promising molecular target for the cancer therapy.	71
188945	cd09546	SAM_EPH-A5	SAM domain of EPH-A5 subfamily of tyrosine kinase receptors. SAM (sterile alpha motif) domain of EPH-A5 subfamily of receptor tyrosine kinases is a C-terminal potential protein-protein interaction domain. This domain is located in the cytoplasmic region of EPH-A5 receptors and appears to mediate cell-cell initiated signal transduction. Eph-A5 gene is almost exclusively expressed in the nervous system. Murine EPH-A5 receptors participate in axon guidance during embryogenesis and play a role in the adult synaptic plasticity, particularly in neuron-target interactions in multiple neural circuits. Additionally EPH-A5 receptors and its ligand ephrin A5 regulate dopaminergic axon outgrowth and influence the formation of the midbrain dopaminergic pathways. EphA5 gene expression was found decreased in a few different breast cancer cell lines, thus it might be a potential molecular marker for breast cancer carcinogenesis and progression.	66
188946	cd09547	SAM_EPH-A6	SAM domain of EPH-A6 subfamily of tyrosine kinase receptors. SAM (sterile alpha motif) domain of EPH-A6 subfamily of receptor tyrosine kinases is a C-terminal potential protein-protein interaction domain. This domain is located in the cytoplasmic region of EPH-A6 receptors and appears to mediate cell-cell initiated signal transduction. Eph-A6 gene is preferentially expressed in the nervous system. EPH-A6 receptors are involved in primate retina vascular and axon guidance, and in neural circuits responsible for learning and memory. EphA6 gene was significantly down regulated in colorectal cancer and in malignant melanomas. It is a potential molecular marker for these cancers.	64
188947	cd09548	SAM_EPH-A7	SAM domain of EPH-A7 subfamily of tyrosine kinase receptors. SAM (sterile alpha motif) domain of EPH-A7 subfamily of receptor tyrosine kinases is a C-terminal potential protein-protein interaction domain. This domain is located in the cytoplasmic region of EPH-A7 receptors and appears to mediate cell-cell initiated signal transduction. EphA7 was found expressed in human embryonic stem (ES) cells, neural tissues, kidney vasculature. EphA7 knockout mice show decrease in cortical progenitor cell death at mid-neurogenesis and significant increase in cortical size. EphA7 may be involved in the pathogenesis and development of different cancers; in particular, EphA7 was found upregulated in glioblastoma and downregulated in colorectal cancer and gastric cancer.  Thus, it is a potential molecular marker and/or therapy target for these types of cancers.	70
188948	cd09549	SAM_EPH-A10	SAM domain of EPH-A10 subfamily of tyrosine kinase receptors. SAM (sterile alpha motif) domain of EPH-A10 subfamily of receptor tyrosine kinases is a C-terminal potential protein-protein interaction domain. This domain is located in the cytoplasmic region of EPH-A10 receptors and appears to mediate cell-cell initiated signal transduction. It was found preferentially expressed in the testis. EphA10 may be involved in the pathogenesis and development of prostate carcinoma and lymphocytic leukemia. It is a potential molecular marker and/or therapy target for these types of cancers.	70
188949	cd09550	SAM_EPH-A8	SAM domain of EPH-A8 subfamily of tyrosine kinase receptors. SAM (sterile alpha motif) domain of EPH-A8 subfamily of receptor tyrosine kinases is a C-terminal potential protein-protein interaction domain. This domain is located in the cytoplasmic region of EPH-A8 receptors and appears to mediate cell-cell initiated signal transduction. EPH-A8 receptors are involved in ligand dependent (ephrin A2, A3, A5) regulation of cell adhesion and migration, and in ligand independent regulation of neurite outgrowth in neuronal cells. They perform signaling in kinase dependent and kinase independent manner. EPH-A8 receptors are known to interact with a number of different proteins including PI 3-kinase and AIDA1-like subfamily SAM repeat domain containing proteins. However other domains (not SAM) of EPH-A8 receptors are involved in these interactions.	65
188950	cd09551	SAM_EPH-B1	SAM domain of EPH-B1 subfamily of tyrosine kinase receptors. SAM (sterile alpha motif) domain of EPH-B1 subfamily of receptor tyrosine kinases is a C-terminal potential protein-protein interaction domain. This domain is located in the cytoplasmic region of EPH-B1 receptors. In human vascular endothelial cells it appears to mediate cell-cell initiated signal transduction via the binding of the adaptor protein GRB10 (growth factor) through its SH2 domain to a conserved tyrosine that is phosphorylated. EPH-B1 receptors play a role in neurogenesis, in particular in regulation of proliferation and migration of neural progenitors in the hippocampus and in corneal neovascularization; they are involved in converting the crossed retinal projection to ipsilateral retinal projection. They may be potential targets in angiogenesis-related disorders.	68
188951	cd09552	SAM_EPH-B2	SAM domain of EPH-B2 subfamily of tyrosine kinase receptors. SAM (sterile alpha motif) domain of EPH-B2 subfamily of receptor tyrosine kinases is a C-terminal potential protein-protein interaction domain. This domain is located in the cytoplasmic region of EPH-B2 receptors and appears to mediate cell-cell initiated signal transduction. SAM domains of this subfamily form homodimers/oligomers (in head-to-head/tail-to-tail orientation); apparently such clustering is necessary for signaling. EPH-B2 receptor is involved in regulation of synaptic function; it is needed for normal vestibular function, proper formation of anterior commissure, control of cell positioning, and ordered migration in the intestinal epithelium. EPH-B2 plays a tumor suppressor role in colorectal cancer. It was found  to be downregulated in gastric cancer and thus may be a negative biomarker for it.	71
188952	cd09553	SAM_EPH-B3	SAM domain of EPH-B3 subfamily of tyrosine kinase receptors. SAM (sterile alpha motif) domain of EPH-B3 subfamily of receptor tyrosine kinases is a C-terminal potential protein-protein interaction domain. This domain is located in the cytoplasmic region of EPH-B3 receptors and appears to mediate cell-cell initiated signal transduction. EPH-B3 receptor protein kinase performs kinase-dependent and kinase-independent functions. It is known to be involved in thymus morphogenesis, in regulation of cell adhesion and migration. Also EphB3 controls cell positioning and ordered migration in the intestinal epithelium and plays a role in the regulation of adult retinal ganglion cell axon plasticity after optic nerve injury. In some experimental models overexpression of EphB3 enhances cell/cell contacts and suppresses colon tumor growth.	69
188953	cd09554	SAM_EPH-B4	SAM domain of EPH-B4 subfamily of tyrosine kinase receptors. SAM (sterile alpha motif) domain of EPH-B4 subfamily of receptor tyrosine kinases is a C-terminal potential protein-protein interaction domain. This domain is located in the cytoplasmic region of EPH-B4 receptors and appears to mediate cell-cell initiated signal transduction. EPH-B4 protein kinase performs kinase-dependent and kinase-independent functions.  These receptors play a role in the regular vascular system development during embryogenesis. They were found overexpressed in a variety of cancers, including carcinoma of the head and neck, ovarian cancer, bladder cancer, and downregulated in bone myeloma. Thus, EphB4 is a potential biomarker and a target for drug design.	67
188954	cd09555	SAM_EPH-B6	SAM domain of EPH-B6 subfamily of tyrosine kinase receptors. SAM (sterile alpha motif) domain of EPH-B6 subfamily of receptor tyrosine kinases is a C-terminal potential protein-protein interaction domain. This domain is located in the cytoplasmic region of EPH-B6 receptors and appears to mediate cell-cell initiated signal transduction. Receptors of this type are highly expressed in embryo and adult nervous system, in thymus and also in T-cells. They are involved in regulation of cell adhesion and migration. (EPH-B6 receptor is unusual; it fails to show catalytic activity due to alteration in kinase domain). EPH-B6 may be considered as a biomarker in some types of tumors; EPH-B6 activates MAP kinase signaling in lung adenocarcinoma, suppresses metastasis formation in non-small cell lung cancer, and slows invasiveness in some breast cancer cell lines.	69
188955	cd09556	SAM_VTS1_fungal	SAM domain of VTS1 RNA-binding proteins. SAM (sterile alpha motif) domain of VTS1 subfamily proteins is an RNA-binding domain located in the C-terminal region. SAM interacts with stem-loop structures of mRNA. Proteins of this subfamily participate in regulation of transcript stability and degradation, and also may be involved in vacuolar protein transport regulation. VTS1 protein of S.cerevisiae induces mRNA degradation via the major deadenylation-dependent mRNA decay pathway; VTS1 recruits CCR4/POP2/NOT deadenylase complex to target mRNA. The recruitment is the initial step resulting in poly(A) tail removal from transcripts. Potentially SAM domain may be responsible not only for RNA binding but also for deadenylase binding.	69
188956	cd09557	SAM_Smaug	SAM domain of Smaug subfamily. SAM (sterile alpha motif) domain of Smaug proteins is an RNA recognition domain. It binds a specific RNA motif known as Smaug recognition element (SRE). Among members of this group are invertebrate Smaug (Smg) proteins and vertebrate Smaug1 and Smaug2 proteins. They are involved in post-transcriptional control during early embryogenesis in animals. In Drosophila, Smaug protein is a translational repressor of mRNA of Nanos (Nos) protein. Gradient of Nanos is required for proper abdominal segmentation. SAM domain interacts specifically with the Nanos mRNA regulatory regions. Moreover, Smaug protein is involved in regulation of specific maternal transcripts degradation in Drosophila early embryo via recruitment of the CCR4/POP2/NOT deadenylase.	63
188957	cd09558	SAM_ZCCH14	SAM domain of ZCCH14 subfamily. SAM (sterile alpha motif) domain of ZCCH14 (Zinc finger CCHC domain 14) protein subfamily (also known as BDG-29 or KIAA0579) is a putative RNA binding domain. Members of this group are believed to be involved in post-translational regulation during early embryogenesis.	65
188958	cd09559	SAM_SASH1_repeat1	SAM domain of SASH1 proteins, repeat 1. SAM (sterile alpha motif) repeat 1 of SASH1 proteins is a predicted protein-protein interaction domain. Members of this subfamily are putative adaptor proteins. They appear to mediate signal transduction. SASH1 can bind 14-3-3 proteins in response to IGF1/phosphatidylinositol 3-kinase signaling. SASH1 was found upregulated in different tissues including thymus, placenta, lungs and downregulated in some breast tumors, liver metastases and colon cancers, relative to corresponding normal tissues. SASH1 is a potential candidate for a tumor suppressor gene in breast cancers. At the same time, downregulation of SASH1 in colon cancer is associated with metastasis and a poor prognosis.	66
188959	cd09560	SAM_SASH3	SAM domain of SASH3 subfamily. SAM (sterile alpha motif) domain of SASH3 (also known as SLY) proteins is a predicted protein-protein interaction domain. Members of this subfamily are putative signaling/adaptor proteins. In addition to SAM, they contain SLY and SH3 domains. They appear to mediate signal transduction in lymphoid tissues. Murine SASH3 is involved in preventing DN thymocytes from premature initiation of programmed cell death and in mTOR (mammalian target of rapamycin) activation via signal integration of the Notch receptor and preTCR (T cell receptor) pathways.	68
188960	cd09561	SAM_SAMSN1	SAM domain of SAMSN1 subfamily. SAM (sterile alpha motif) domain of SAMSN1 (also known as HACS1 or NASH1) proteins is a predicted protein-protein interaction domain. Members of this group are putative signaling/adaptor proteins. They appear to mediate signal transduction in lymphoid tissues. Murine HACS1 protein likely plays a role in B cell activation and differentiation. Potential binding partners of HACS1 are SLAM, DEC205 and PIR-B receptors and also some unidentified tyrosine-phosphorylated proteins. Proteins of this group were found preferentially expressed in normal hematopoietic tissues and in some malignancies including lymphoma, myeloid leukemia and myeloma.	66
188961	cd09562	SAM_liprin-alpha1,2,3,4_repeat1	SAM domain of liprin-alpha1,2,3,4 proteins repeat 1. SAM (sterile alpha motif) domain repeat 1 of liprin-alpha1,2,3,4 proteins is a protein-protein interaction domain. Liprin-alpha proteins contain three copies (repeats) of SAM domain. They may form heterodimers with liprin-beta proteins through their SAM domains. Liprins were originally identified as LAR (leukocyte common antigen-related) transmembrane protein-tyrosine phosphatase-interacting proteins. They participate in mammary gland development and in axon guidance; in particular, liprin-alpha is involved in formation of the presynaptic active zone.	71
188962	cd09563	SAM_liprin-beta1,2_repeat1	SAM domain of liprin-beta1,2 proteins repeat 1. SAM (sterile alpha motif) domain repeat 1 of liprin-beta1,2 proteins is a protein-protein interaction domain. Liprin-beta proteins contain three copies (repeats) of SAM domain. They may form heterodimers with liprins-alpha through their SAM domains. It was suggested based on bioinformatic approaches that the second SAM domain of liprin-beta is potentially able to form polymers. Liprins were originally identified as LAR (leukocyte common antigen-related) transmembrane protein-tyrosine phosphatase-interacting proteins. They participate in mammary gland development, in axon guidance, and in the maintenance of lymphatic vessel integrity.	64
188963	cd09564	SAM_kazrin_repeat1	SAM domain of kazrin proteins repeat 1. SAM (sterile alpha motif) domain repeat 1 of kazrin proteins is a protein-protein interaction domain. The long isoform of kazrin contains three copies (repeats) of SAM domain. Kazrin can interact with periplakin. It is involved in interplay between desmosomes and in adherens junctions. Additionally kazrins play a role in regulation of intercellular differentiation, junction assembly, and cytoskeletal organization.	70
188964	cd09565	SAM_liprin-alpha1,2,3,4_repeat2	SAM domain of liprin-alpha1,2,3,4 proteins repeat 2. SAM (sterile alpha motif) domain repeat 2 of liprin-alpha1,2,3,4 proteins is a protein-protein interaction domain. Liprin-alpha proteins contain three copies (repeats) of SAM domain. They may form heterodimers with liprin-beta proteins through their SAM domains. Liprins were originally identified as LAR (leukocyte common antigen-related) transmembrane protein-tyrosine phosphatase-interacting proteins. They participate in mammary gland development, and in axon guidance; in particular, liprin-alpha is involved in formation of the presynaptic active zone.	66
188965	cd09566	SAM_liprin-beta1,2_repeat2	SAM domain of liprin-beta1,2 proteins repeat 2. SAM (sterile alpha motif) domain repeat 2 of liprin-beta1,2 proteins is a protein-protein interaction domain. Liprin-beta proteins contain three copies (repeats) of SAM domain. They may form heterodimers with liprin-alpha proteins through their SAM domains. It was suggested based on bioinformatic approaches that the second SAM domain of liprin-beta potentially is able to form polymers. Liprins were originally identified as LAR (leukocyte common antigen-related) transmembrane protein-tyrosine phosphatase-interacting proteins. They participate in mammary gland development, in axon guidance, and in the maintenance of lymphatic vessel integrity.	63
188966	cd09567	SAM_kazrin_repeat2	SAM domain of kazrin proteins repeat 2. SAM (sterile alpha motif) domain repeat 2 of kazrin proteins is a protein-protein interaction domain. The long isoform of kazrins contains three copies (repeats) of SAM domain. Kazrin can interact with periplakin. It is involved in interplay between desmosomes and in adherens junctions. Additionally kazrins play a role in regulation of intercellular differentiation, junction assembly, and cytoskeletal organization.	65
188967	cd09568	SAM_liprin-alpha1,2,3,4_repeat3	SAM domain of liprin-alpha1,2,3,4 proteins repeat 3. SAM (sterile alpha motif) domain repeat 3 of liprin-alpha1,2,3,4 proteins is a protein-protein interaction domain. Liprin-alpha proteins contain three copies (repeats) of SAM domain. They may form heterodimers with liprin-beta proteins through their SAM domains. Liprins were originally identified as LAR (leukocyte common antigen-related) transmembrane protein-tyrosine phosphatase-interacting proteins. They participate in mammary gland development and in axon guidance; in particular, liprin-alpha is involved in formation of the presynaptic active zone.	72
188968	cd09569	SAM_liprin-beta1,2_repeat3	SAM domain of liprin-beta proteins repeat 3. SAM (sterile alpha motif) domain repeat 3 of liprin-beta1,2 proteins is a protein-protein interaction domain. Liprin-beta proteins contain three copies (repeats) of SAM domain. They may form heterodimers with liprin-alpha proteins through their SAM domains. Liprins were originally identified as LAR (leukocyte common antigen-related) transmembrane protein-tyrosine phosphatase-interacting proteins. They participate in mammary gland development, in axon guidance, and in the maintenance of lymphatic vessel integrity.	72
188969	cd09570	SAM_kazrin_repeat3	SAM domain of kazrin proteins repeat 3. SAM (sterile alpha motif) domain repeat 3 of kazrin proteins is a protein-protein interaction domain. The long isoform of kazrins contains three copies (repeats) of SAM domain. Kazrin can interact with periplakin. It is involved in interplay between desmosomes and in adherens junctions. Additionally kazrins play a role in regulation of intercellular differentiation, junction assembly, and cytoskeletal organization.	72
188970	cd09571	SAM_tumor-p73	SAM domain of tumor-p73 proteins. SAM (sterile alpha motif) domain of p73 proteins is a putative protein-protein interaction and lipid-binding domain. p73 is a homolog to the tumor suppressor p53. p73 has a C-terminal SAM domain in the longest spliced alpha form, while p53 doesn't have it. p73 knockout mouse shows significant developmental abnormalities but no increased cancer susceptibility, suggesting that p73 plays a role in regulation of normal development. It was shown that SAM domain of p73 is able to bind some membrane lipids. The structural rearrangements in SAM are necessary to accomplish the binding.  No evidence for homooligomerization through SAM domains was found for the p73 subfamily. It was suggested that the partner proteins should be either more distantly related SAM-containing domain proteins or proteins without the SAM domain.	65
188971	cd09572	SAM_tumor-p63	SAM domain of tumor-p63 proteins. SAM (sterile alpha motif) domain of p63 proteins is a putative protein-protein interaction domain. p63 is a homolog to the tumor suppressor p53. p63 has a C-terminal SAM domain in the longest spliced alpha form, while p53 doesn't have it. p63 knockout mice show significant developmental abnormalities but no increased cancer susceptibility, suggesting that p63 plays a role in regulation of normal development. No evidence for homooligomerization through SAM domains was found for the p63 subfamily. It was suggested that the partner proteins should be either more distantly related SAM-containing domain proteins or proteins without the SAM domain. Mutations in the SAM domain of p63 are found in AEC syndrome patients.	65
188972	cd09573	SAM_STIM1	SAM domain of STIM1 subfamily proteins. SAM (sterile alpha motif) domain of STIM1 (Stromal interaction molecule) subfamily proteins is a putative protein-protein interaction domain. STIM1 and STIM2 human proteins are type I transmembrane proteins. The N-terminal part of them includes "hidden" EF-hand and SAM domains. This region is responsible for sensing changes in store-operated and basal cytoplasmic Ca2+ levels and initiates oligomerization. "Hidden" EF hand and SAM domains have a stable intramolecular association, and the SAM domain is a component that regulates stability within STIM proteins. Destabilization of the EF-SAM association during Ca2+ depletion leads to partial unfolding and aggregation (homooligomerization), thus activating the store-operated Ca2+ entry. Immunoprecipitation analysis indicates that STIM1 and STIM2 can form co-precipitable oligomeric associations in vivo. It was suggested that STIM1 protein is an activator of store operated channels in plasma membrane.	74
188973	cd09574	SAM_STIM2	SAM domain of STIM2 subfamily proteins. SAM (sterile alpha motif) domain of STIM2 (Stromal interaction molecule) subfamily proteins is a putative protein-protein interaction domain. STIM1 and STIM2 human proteins are type I transmembrane proteins. The N-terminal part of them includes "hidden" EF-hand and SAM domains. This region is responsible for sensing changes in store-operated and basal cytoplasmic Ca2+ levels and initiates oligomerization. "Hidden" EF hand and SAM domains have a stable intramolecular association, and the SAM domain is a component that regulates stability within STIM proteins. Destabilization of the EF-SAM association during Ca2+ depletion leads to partial unfolding and aggregation (homooligomerization), thus activating the store-operated Ca2+ entry. Immunoprecipitation analysis indicates that STIM1 and STIM2 can form co-precipitable oligomeric associations in vivo. It was suggested that STIM2 protein is an inhibitor of store operated channels in plasma membrane.	74
188974	cd09575	SAM_DGK-delta	SAM domain of diacylglycerol kinase delta. SAM (sterile alpha motif) domain of DGK-delta subfamily proteins is a protein-protein interaction domain. Proteins of this subfamily are multidomain diacylglycerol kinases with a SAM domain located at the C-terminus. DGK-delta proteins participate in signal transduction. They regulate the level of second messengers such as diacylglycerol and phosphatidic acid. In particular DGK-delta is involved in the regulation of clathrin-dependent endocytosis. The SAM domain of DGK-delta proteins can form high molecular weight homooligomers through head-to-tail interactions as well as heterooligomers with the SAM domain of DGK-eta proteins. The oligomerization plays a role in the regulation of the DGK-delta intracellular localization: it inhibits the translocation of the protein to the plasma membrane from the cytoplasm. The SAM domain also can bind Zn at multiple (not conserved) sites driving the formation of highly ordered large sheets of polymers, thus suggesting that Zn may play an important role in the function of DGK-delta.	65
188975	cd09576	SAM_DGK-eta	SAM domain of diacylglycerol kinase eta. SAM (sterile alpha motif) domain of DGK-eta subfamily proteins is a protein-protein interaction domain. Proteins of this subfamily are multidomain diacylglycerol kinases. The SAM domain is located at the C-terminus of two out of three isoforms of DGK-eta protein. DGK-eta proteins participate in signal transduction. They regulate the level of second messengers such as diacylglycerol and phosphatidic acid. The SAM domain of DGK-eta proteins can form high molecular weight homooligomers through head-to-tail interactions as well as heterooligomers with the SAM domain of DGK-delta proteins. The oligomerization plays a role in the regulation of the DGK-delta intracellular localization: it is responsible for sustained endosomal localization of the protein and results in negative regulation of DGK-eta catalytic activity.	65
188976	cd09577	SAM_Ph1,2,3	SAM domain of Ph (polyhomeotic) proteins of Polycomb group. SAM (sterile alpha motif) domain of Ph (polyhomeotic) proteins of Polycomb group is a protein-protein interaction domain. Ph1,2,3 proteins are members of PRC1 complex. This complex is involved in transcriptional repression of Hox (Homeobox) cluster genes. It is recruited through methylated H3Lys27 and supports the repression state by mediating monoubiquitination of histone H2A. Proteins of the Ph1,2,3 subfamily contribute to anterior-posterior neural tissue specification during embryogenesis. Additionally, the P2 protein of zebrafish is known to be involved in epiboly and tailbud formation. SAM domains of Ph proteins may interact with each other, forming homooligomers, as well as with SAM domains of other proteins, in particular with the SAM domain of Scm (sex comb on midleg) proteins, forming heterooligomers. Homooligomers are similar to the ones formed by SAM Pointed domains of the TEL proteins. Such SAM/SAM oligomers apparently play a role in transcriptional repression through polymerization along the chromosome.	69
188977	cd09578	SAM_Scm	SAM domain of Scm proteins of Polycomb group. SAM (sterile alpha motif) domain of Scm (Sex comb on midleg) subfamily of Polycomb group is a protein-protein interaction domain. Proteins of this subfamily are transcriptional repressors associated with PRC1 complex. This group includes invertebrate Scm protein and chordate Scm homolog 1 and Scm-like 1, 2, 3 proteins. Most have a SAM domain, two MBT repeats, and a DUF3588 domain, except Scm-like 4 proteins which do not have MBT repeats. Originally the Scm protein was described in Drosophila as a regulator required for proper spatial expression of homeotic genes. It plays a major role during early embryogenesis. SAM domains of Scm proteins can interact with each other, forming homooligomers, as well as with SAM domains of other proteins, in particular with SAM domains of Ph (polyhomeotic) proteins, forming heterooligomers. Homooligomers are similar to the ones formed by SAM Pointed domains of the TEL proteins. Such SAM/SAM oligomers apparently play a role in transcriptional repression through polymerization along the chromosome. Mammalian Scmh1 protein is known to be an indispensable member of PRC1 complex; it plays a regulatory role for the complex during meiotic prophase of male sperm cells, and is particularly involved in regulation of chromatin modification at the XY chromatin domain of the pachytene spermatocytes.	72
188978	cd09579	SAM_Samd7,11	SAM domain of Samd7,11 subfamily of Polycomb group. SAM (sterile alpha motif) domain is a protein-protein interaction domain. Phylogenetic analysis suggests that proteins of this subfamily are most closely related to SAM-Ph1,2,3 subfamily of Polycomb group. They are predicted transcriptional repressors in photoreceptor cells and pinealocytes of vertebrates. SAM domain containing protein 11 is also known as Mr-s (major retinal SAM) protein. In mouse, it is predominantly expressed in developing retinal photoreceptors and in adult pineal gland. The SAM domain is involved in homooligomerization of whole proteins (it was shown based on immunoprecipitation assay and mutagenesis), however its repression activity is not due to SAM/SAM interactions but to the C-terminal region.	68
188979	cd09580	SAM_Scm-like-4MBT	SAM domain of Scm-like-4MBT proteins of Polycomb group. SAM (sterile alpha motif) domain of Scm-like-4MBT (Sex comb on midleg like, Malignant Brain Tumor) subfamily proteins of the polycomb group is a putative protein-protein interaction domain. Additionally to the SAM domain, most of the proteins of this subfamily have 4 MBT repeats. In Drosophila SAM-Scm-like-4MBT protein (known as dSfmbt) is a member of Pho repressive complex (PhoRC). Additionally to dSfmbt, the PhoRC complex includes Pho or Pho-like proteins. This complex is responsible for HOX (Homeobox) gene silencing: Pho or Pho-like proteins bind DNA and dSfmbt binds methylated histones. dSfmbt can interact with mono- and di-methylated histones H3 and H4 (however this activity has been shown for the MBT repeats, while exact function of the SAM domain is unclear). Besides interaction with histones, dSfmbt can interact with Scm (a member of PRC complex), but this interaction also seems to be SAM domain independent.	67
188980	cd09581	SAM_Scm-like-4MBT1,2	SAM domain of Scm-like-4MBT1,2 proteins of Polycomb group. SAM (sterile alpha motif) domain of Scm-like-4MBT1,2 (Sex comb on midleg, Malignant Brain Tumor) subfamily proteins (also known as Sfmbt1,2 proteins) is a putative protein-protein interaction domain. Proteins of this subfamily are transcriptional regulators belonging to Polycomb group. The majority of them are multidomain proteins: in addition to the C-terminal SAM domain, they contain four MBT repeats and DUF5388 domain. The MBT repeats of the human sfmbt1 protein are responsible for association with the nuclear matrix and for selective binding of H3 histone N-terminal tails, while the exact function of the SAM domain is unclear.	85
188981	cd09582	SAM_Scm-like-3MBT3,4	SAM domain of Scm-like-3MBT3,4 proteins of Polycomb group. SAM (sterile alpha motif) domain of Scm-like-3MBT3,4 (Sex comb on midleg, Malignant brain tumor) subfamily proteins (also known as L3mbtl3,4 proteins)  is a putative protein-protein interaction domain.  Proteins of this subfamily are predicted transcriptional regulators belonging to Polycomb group. The majority of them are multidomain proteins: in addition to the C-terminal SAM domain, they contain three MBT repeats and Zn finger domain. Murine L3mbtl3 protein of this subfamily is essential for maturation of myeloid progenitor cells during differentiation. Human L3mbtl4 is a potential tumor suppressor gene in breast cancer, while deregulation of L3MBTL3 is associated with neuroblastoma.	66
188982	cd09583	SAM_Atherin-like	SAM domain of Atherin/Atherin-like subfamily. SAM (sterile alpha motif) domain of SAM_Atherin and Atherin-like subfamily proteins is a putative protein-protein and/or protein-lipid interaction domain.  In addition to the C-terminal SAM domain, the majority of proteins belonging to this group also have PHD (or Zn finger) domain. As potential members of the polycomb group, these proteins may be involved in regulation of some key regulatory genes during development. Atherin can be recruited by Ruk/CIN85 kinase-binding proteins via its SH3 domains thus participating in the signal transferring kinase cascades. Also, atherin was found associated with low density lipids (LDL) in atherosclerotic lesions in human. It was suggested that atherin plays an essential role in atherogenesis via immobilization of LDL in the arterial wall. SAM domains of atherins are predicted to form polymers. Inhibition of polymer formation could be a potential antiatherosclerotic therapy.	69
188983	cd09584	SAM_sec23ip	SAM domain of sec23ip. SAM (sterile alpha motif) domain of Sec23ip (Sec23 interacting protein) group is a potential protein-protein interaction domain. Sec23ip proteins (also known as p125) contain an N-terminal proline-rich region, a central region containing a SAM domain and a C-terminal region with a predicted metal-binding domain. Sec23ip interacts with Sec23p/Sec24p part of COPII-coated vesicles complex involved in protein transport from the ER to the Golgi apparatus. The proline-rich region plays an essential role in this interaction. Overexpression of Sec23ip leads to disorganization of ER/Golgi intermediate compartment.	69
188984	cd09585	SAM_DDHD2	SAM domain of DDHD2. SAM (sterile alpha motif) domain of DDHD2 group is a potential protein-protein interaction domain. DDHD2 proteins contain at least two domains: a SAM domain and a predicted metal-binding domain. Phospholipase A1 activity was demonstrated for the mammalian DDHD2 protein. Mutation of the putative catalytic serine resulted in elimination of activity. Unlike SEC23IP, DDHD2 proteins do not have an N-terminal proline-rich region and correspondingly they are not able to interact with Sec23p/Sec24p complex. Overexpression of DDHD2 is the cause of dispersion of ER/Golgi intermediate compartment and dispersion of tethering proteins located in the Golgi region, leading to aggregation in the endoplasmic reticulum.	69
188985	cd09586	SAM_USH1G	SAM domain of USH1G. SAM (sterile alpha motif) domain of USH1G (Usher syndrome type-1G protein) proteins (also known as SANS) is a putative protein-protein interaction domain. Members of this group have an N-terminal ankyrin repeat region and C-terminal SAM domain. USH1G is expressed in the hair bundles of the inner ear sensory cells. It can form a functional network with USH1B (myosin VIIa), USH1C (harmonin b), USH1F (protocadherin-related 15), and USH1D (cadherin 23). The SAM domain of the USH1G protein is involved in synergetic interactions with the PDZ domain of harmonin. Such interactions contribute to the stability of harmonin. The network is required for the correct cohesion of the hair bundle. Mutations in the ush1g gene lead to Usher syndrome type 1G. This syndrome is the cause of deaf-blindness in humans.	66
188986	cd09587	SAM_HARP	SAM domain of HARP subfamily. SAM (sterile alpha motif) domain of HARP (Harmonin-interacting Ankyrin Repeat-containing) proteins, also known as ANKS4B, is a protein-protein interaction domain. Proteins of this subfamily have an N-terminal ankyrin repeat region and C-terminal SAM. In mouse epithelial tissues, HARP protein interacts with the PDZ domain of harmonin. This scaffolding complex facilitates signal transduction in epithelia. HARP was found co-expressed with harmonin in a number of epithelial cells including pancreatic ductal epithelium, embryonic epithelia of the lung, kidney, salivary glands, and cochlea.	67
188987	cd09588	SAM_LBP1	SAM domain of LBP1 (UBP1) transcription factors. SAM (sterile alpha motif) domain of LBP1 (also known as UBP1) transcription factor is a putative protein-protein interaction domain. Proteins of this group have an N-terminal DNA-binding CP2 domain, a central predicted SAM domain and some also have a C-terminal dimerization domain. They are involved in transcriptional regulation from early development to terminal differentiation. In particular, they regulate alpha-globin in erythroid cells and P450scc (the cholesterol side-chain cleavage enzyme, cytochrome) in human placenta. Human LBP1 is known to be induced by HIV type I infection in lymphocytes; it represses HIV transcription by preventing the binding of TFIID to the virus promoter. Additionally, it has been suggested that UBP1 (LBP1) regulator might be a member of a blood pressure controlling network. LBP1 protein isoforms are able to form dimers, apparently via SAM domain since SAM deletion or mutation resulted in a loss of this ability.	67
188988	cd09589	SAM_TFCP2	SAM domain of TFCP2 transcription factors. SAM (sterile alpha motif) domain of TFCP2 transcription factors is a putative protein-protein interaction domain. Proteins of this group have an N-terminal DNA-binding CP2 domain, a central predicted SAM domain and a C-terminal dimerization domain. They are involved in transcriptional regulation from early development to terminal differentiation. In particular, they regulate expression of erythroid cell-specific alpha-globin, fibrinogen, and sex-determining gene SRY as well as lens alpha-crystallin. TFCP2 regulators can interact with NF-E4 proteins forming heteromeric stage selector protein complex (SSP). This complex is able to bind stage selector element (SSE) and regulate embryonic globin expression in fetal-erythroid cells.	67
188989	cd09590	SAM_LBP9	SAM domain of LBP9 transcriptional factors. SAM (sterile alpha motif) domain of LBP9 (also known as TFCP2L1 or CRTR-1 (CP2-Related Transcriptional Repressor-1)) transcription factor is a putative protein-protein interaction domain. Proteins of this group have an N-terminal DNA-binding CP2 domain, a central predicted SAM domain and a C-terminal dimerization domain. They are involved in transcriptional regulation from early development to terminal differentiation. In particular, they are required for proper maturation of the duct (epithelial component of tubular organs) of kidney and salivary gland as well as for regulation of P450scc (the cholesterol side-chain cleavage enzyme, cytochrome) in human placenta.	67
188990	cd09591	SAM_DLC1	SAM domain of DLC1 subfamily. SAM (sterile alpha motif) domain of DLC1 (Deleted in liver cancer) protein is a protein-protein interaction domain located at the N-terminus. Proteins of this subfamily do not form dimers/oligomers through their SAM domains. They participate in regulation of cell migration. SAM domain of human DLC1 protein contains the EF1A1 (eukaryotic elongation factor) binding motif, thus SAM facilitates recruitment of EF1A1 to the membrane periphery and suppresses cell migration.	60
188991	cd09592	SAM_DLC2	SAM domain of STARD13-like subfamily. SAM (sterile alpha motif) domain of DLC2 (Deleted in liver cancer) protein is a lipid-binding and putative protein-protein interaction domain located at the N-terminus of the protein. Members of this subfamily do not form dimers/oligomers through their SAM domains. They participate in lipid transfer. Human Dlc2 gene is known as a tumor suppressor gene. It was found underexpressed in hepatocellular carcinoma.	64
381677	cd09593	UDG-like	uracil-DNA glycosylases (UDG) and related enzymes. Uracil-DNA glycosylases (UDGs) initiate repair of uracils in DNA. Uracil may arise from misincorporation of dUMP residues by DNA polymerase or via deamination of cytosine. Uracil in DNA mispaired with guanine is one of the major pro-mutagenic events, causing G:C->A:T mutations; thus, UDG is an essential enzyme for maintaining the integrity of genetic information. UDGs have been classified into various families on the basis of their substrate specificity, conserved motifs, and structural similarities. Although these families demonstrate different substrate specificities, often the function of one enzyme can be complemented by the other. UDG family 1 is the most efficient uracil-DNA glycosylase (UDG, also known as UNG) and shows a specificity for uracil in DNA. UDG family 2 includes thymine DNA glycosylase which removes uracil and thymine from G:U and G:T mismatches, and mismatch-specific uracil DNA glycosylase (MUG) which in Escherichia coli is highly specific to G:U mismatches, but also repairs G:T mismatches at high enzyme concentration. UDG family 3 includes Human SMUG1 which can remove uracil and its oxidized pyrimidine derivatives from single-stranded DNA and double-stranded DNA with a preference for single-stranded DNA. Pedobacter heparinus SMUG2, which is UDG family 3 SMUG1-like, displays catalytic activities towards DNA containing uracil or hypoxanthine/xanthine. UDG family 4 includes Thermotoga maritima TTUDGA, a robust UDG which like family 1, acts on double-stranded and single-stranded uracil-containing DNA. UDG family 5 (UDGb) includes Thermus thermophilus HB8 TTUDGB which acts on double-stranded uracil-containing DNA; it is a hypoxanthine DNA glycosylase acting on double-stranded hypoxanthine-containing DNA except for the C/I base pair, as well as a xanthine DNA glycosylase which acts on both double-stranded and single-stranded xanthine-containing DNA. UDG family 6 hypoxanthine-DNA glycosylase lacks any detectable UDG activity; it excises hypoxanthine. Other UDG families include one represented by Bradyrhizobium diazoefficiens Blr0248 which prefers single-stranded DNA and removes uracil, 5-hydroxymethyl-uracil or xanthine from it.	125
341057	cd09594	GluZincin	Gluzincin Peptidase family (thermolysin-like proteinases, TLPs) which includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins). The Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), which contain the HEXXH motif as part of their active site. Peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. The M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The M3_like peptidases include the M2_ACE, M3 or neurolysin-like family (subfamilies M3B_PepF and M3A) and M32_Taq peptidases. The M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key component of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M3A includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; and M3B includes oligopeptidase F. The M32 family includes eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and from Leishmania major, a parasite that causes leishmaniasis, making these enzymes attractive targets for drug development. 
The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, and neutral protease as well as bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. The M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. The peptidase M36 fungalysin family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers.	105
341058	cd09595	M1	Peptidase M1 family includes the catalytic domains of aminopeptidase N and leukotriene A4 hydrolase. The model represents the catalytic domains of M1 peptidase family members including aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile upon activation during catalysis. APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. APN expression is dysregulated in many inflammatory diseases and is enhanced in numerous tumor cells, making it a lead target in the development of anti-cancer and anti-inflammatory drugs. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity. The two activities occupy different, but overlapping sites. The activity and physiological relevance of the aminopeptidase in LTA4H is as yet unknown, while the epoxide hydrolase converts leukotriene A4 (LTA4) into leukotriene B4 (LTB4), a potent chemotaxin that is fundamental to the inflammatory response of mammals.	413
341059	cd09596	M36	Peptidase M36 family, also known as fungalysin family. The M36 peptidase family, also known as fungalysin (elastinolytic metalloproteinase) family, includes endopeptidases from pathogenic fungi. Fungalysin can hydrolyze extracellular matrix proteins such as elastin and keratin, with a preference for cleavage on the amino side of hydrophobic residues with bulky side-chains. This family is similar to the M4 (thermolysin) family due to the presence of the HEXXH motif in the active site residues, as well as its fold prediction. Some of these enzymes also contain a protease-associated (PA) domain insert. The eukaryotic M36 and bacterial M4 families of metalloproteases also share a conserved domain in their propeptides called FTP (fungalysin/thermolysin propeptide). Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals; it secretes fungalysin that possibly breaks down proteinaceous structural barriers. A solid lesion known as an aspergilloma can grow in a lung cavity, particularly following recovery from tuberculosis. Fungalysins are also found as multiple copies in the human and animal pathogenic fungi such as Microsporum canis, Trichophyton rubrum and T. mentagrophytes, which cause cutaneous infections.	317
341060	cd09597	M4_TLP	Peptidase M4 family including thermolysin, protealysin, aureolysin, and neutral protease. This peptidase M4 family includes several endopeptidases such as thermolysin (EC 3.4.24.27), aureolysin (the extracellular metalloproteinase from Staphylococcus aureus), neutral protease from Bacillus cereus, protealysin, and bacillolysin (EC 3.4.24.28). Typically, the M4 peptidases consist of a presequence (signal sequence), a propeptide sequence, and a peptidase unit. The presequence is cleaved off during export while the propeptide has inhibitory and chaperone functions and facilitates folding. The propeptide remains attached until the peptidase is secreted and can be safely activated. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. The active site is found between two sub-domains; the N-terminal domain contains the HEXXH zinc-binding motif while the helical C-terminal domain, which is unique for the family, carries the third zinc ligand. These peptidases are secreted eubacterial endopeptidases from Gram-positive or Gram-negative sources that degrade extracellular proteins and peptides for bacterial nutrition. They are selectively inhibited by Streptomyces metalloproteinase inhibitor (SMPI) as well as by phosphoramidon from Streptomyces tanashiensis. A large number of these enzymes are implicated as key factors in the pathogenesis of various diseases, including gastritis, peptic ulcer, gastric carcinoma, cholera and several types of bacterial infections, and are therefore important drug targets. Some enzymes of the family can function at extremes of temperatures, while some function in organic solvents, thus rendering them novel targets for biotechnological applications. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing. It has also been used in production of the artificial sweetener aspartame.	278
341061	cd09598	M4_like	Peptidase M4 family containing mostly uncharacterized proteins. The uncharacterized bacterial proteins in this family are homologs of the M4 peptidase family that is also known as the thermolysin-like peptidase (TLP) family. Typically, the M4 peptidases consist of a presequence (signal sequence), a propeptide sequence and a peptidase unit. The presequence is cleaved off during export while the propeptide has inhibitory and chaperone functions and facilitates folding. The propeptide remains attached until the peptidase is secreted and can be safely activated. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. TLPs are secreted eubacterial endopeptidases from Gram-positive or Gram-negative sources that degrade extracellular proteins and peptides for bacterial nutrition. They contain the HEXXH motif as part of their active site, belong to the Gluzincins family, and are selectively inhibited by Streptomyces metalloproteinase inhibitor (SMPI) as well as by phosphoramidon from Streptomyces tanashiensis. A large number of these enzymes are implicated as key factors in the pathogenesis of various diseases, including gastritis, peptic ulcer, gastric carcinoma, cholera and several types of bacterial infections, and are therefore important drug targets. Some enzymes of the family can function at extremes of temperatures, while some function in organic solvents, thus rendering them novel targets for biotechnological applications.	263
341062	cd09599	M1_LTA4H	Peptidase M1 family including Leukotriene A4 hydrolase catalytic domain. This model represents the N-terminal catalytic domain of leukotriene A4 hydrolase (LTA4H; E.C. 3.3.2.6) and the close homolog cold-active aminopeptidase (Colwellia psychrerythraea-type peptidase; ColAP), both members of the aminopeptidase M1 family. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity. The two activities occupy different, but overlapping sites. The activity and physiological relevance of the aminopeptidase is poorly understood while the epoxide hydrolase converts leukotriene A4 (LTA4) into leukotriene B4 (LTB4), a potent chemotaxin that is fundamental to the inflammatory response of mammals. It accepts a variety of substrates, including some opioid, di- and tripeptides, as well as chromogenic aminoacyl-p-nitroanilide derivatives. The aminopeptidase activity of LTA4H is possibly involved in the processing of peptides related to inflammation and host defense. Kinetic analysis shows that LTA4H hydrolyzes arginyl tripeptides with high efficiency and specificity, indicating its function as an arginyl aminopeptidase. Thermodynamic characterization using different biophysical methods shows that structurally distinct inhibitors of the LTA4H occupy different regions of the binding site; while some (RB202, ARM1 and SC57461A) bind to the hydrophobic hydrolase side, both bestatin and captopril are located at the hydrophilic peptidase side. LTA4H overexpression is associated with different pathological conditions and diseases such as cystic fibrosis, coronary heart disease, sepsis, shock, connective tissue disease, and chronic obstructive pulmonary disease. It is also overexpressed in certain human cancers, and has been identified as a functionally important target for mediating anticancer properties of resveratrol, a well-known red wine polyphenolic compound with cancer chemopreventive activity.	442
341063	cd09600	M1_APN	Peptidase M1 family, including aminopeptidase N catalytic domain. This model represents the catalytic domain of aminopeptidase N (APN; CD13; alanyl aminopeptidase; EC 3.4.11.2), a type II integral membrane protease belonging to the M1 gluzincin family. It includes bacterial-type alanyl aminopeptidases as well as PfA-M1 aminopeptidase (Plasmodium falciparum-type). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and, in higher eukaryotes, is present in a variety of human tissues and cell types (leukocyte, fibroblast, endothelial and epithelial cells). APN expression is dysregulated in inflammatory diseases such as chronic pain, rheumatoid arthritis, multiple sclerosis, systemic sclerosis, systemic lupus erythematosus, polymyositis/dermatomyosytis and pulmonary sarcoidosis, and is enhanced in tumor cells such as melanoma, renal, prostate, pancreas, colon, gastric and thyroid cancers. It is predominantly expressed on stem cells and on cells of the granulocytic and monocytic lineages at distinct stages of differentiation, thus considered a marker of differentiation. Thus, APN inhibition may lead to the development of anti-cancer and anti-inflammatory drugs. APNs are also present in many pathogenic bacteria and represent potential drug targets. Some APNs have been used commercially, such as one from Lactococcus lactis used in the food industry. APN also serves as a receptor for coronaviruses, although the virus receptor interaction site seems to be distinct from the enzymatic site and aminopeptidase activity is not necessary for viral infection. APNs have also been extensively studied as putative Cry toxin receptors. Cry1 proteins are pore-forming toxins that bind to the midgut epithelial cell membrane of susceptible insect larvae, causing extensive damage. 
Several different toxins, including Cry1Aa, Cry1Ab, Cry1Ac, Cry1Ba, Cry1Ca and Cry1Fa, have been shown to bind to APNs; however, a direct role of APN in cytotoxicity has yet to be firmly established.	434
341064	cd09601	M1_APN-Q_like	Peptidase M1 aminopeptidase N catalytic domain family which includes aminopeptidase N (APN), aminopeptidase Q (APQ), tricorn interacting factor F3, and endoplasmic reticulum aminopeptidase 1 (ERAP1). This M1 peptidase family includes eukaryotic and bacterial members: the catalytic domains of aminopeptidase N (APN), aminopeptidase Q (APQ, laeverin), endoplasmic reticulum aminopeptidase 1 (ERAP1) as well as tricorn interacting factor F3. Aminopeptidase N (APN; CD13; alanyl aminopeptidase; EC 3.4.11.2), a type II integral membrane protease,  preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types (leukocyte, fibroblast, endothelial and epithelial cells). APN expression is dysregulated in inflammatory diseases such as chronic pain, rheumatoid arthritis, multiple sclerosis, systemic sclerosis, systemic lupus erythematosus, polymyositis/dermatomyosytis and pulmonary sarcoidosis, and is enhanced in tumor cells such as melanoma, renal, prostate, pancreas, colon, gastric and thyroid cancers. It is considered a marker of differentiation since it is predominantly expressed on stem cells and on cells of the granulocytic and monocytic lineages at distinct stages of differentiation. Thus, APN inhibition may lead to the development of anti-cancer and anti-inflammatory drugs. ERAP1, also known as endoplasmic reticulum aminopeptidase associated with antigen processing (ERAAP), adipocyte derived leucine aminopeptidase (A-LAP), or aminopeptidase regulating tumor necrosis factor receptor I (THFRI) shedding (ARTS-1), associates with the closely related ER aminopeptidase ERAP2, for the final trimming of peptides within the ER for presentation by MHC class I molecules. ERAP1 is associated with ankylosing spondylitis (AS), an inflammatory arthritis that predominantly affects the spine. ERAP1 also aids in the shedding of membrane-bound cytokine receptors. 
The tricorn interacting factor F3, together with factors F1 and F2, degrades the tricorn protease products, producing free amino acids, thus completing the proteasomal degradation pathway. F3 is homologous to F2, but not F1, and shows a strong preference for glutamate in the P1' position. APQ, also known as laeverin, is specifically expressed in human embryo-derived extravillous trophoblasts (EVTs) that invade the uterus during early placentation. It cleaves the N-terminal amino acid of various peptides such as angiotensin III, endokinin C, and kisspeptin-10, all expressed in the placenta in large quantities. APN is a receptor for coronaviruses, although the virus receptor interaction site seems to be distinct from the enzymatic site and aminopeptidase activity is not necessary for viral infection. APNs are also putative Cry toxin receptors. Cry1 proteins are pore-forming toxins that bind to the midgut epithelial cell membrane of susceptible insect larvae, causing extensive damage. Several different toxins, including Cry1Aa, Cry1Ab, Cry1Ac, Cry1Ba, Cry1Ca and Cry1Fa, have been shown to bind to APNs; however, a direct role of APN in cytotoxicity has yet to be firmly established.	442
341065	cd09602	M1_APN	Peptidase M1 family including aminopeptidase N catalytic domain. This model represents the catalytic domain of bacterial and eukaryotic aminopeptidase N (APN; CD13; alanyl aminopeptidase; EC 3.4.11.2), a type II integral membrane protease belonging to the M1 gluzincin family. APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and, in higher eukaryotes, is present in a variety of human tissues and cell types (leukocyte, fibroblast, endothelial and epithelial cells). APN expression is dysregulated in inflammatory diseases such as chronic pain, rheumatoid arthritis, multiple sclerosis, systemic sclerosis, systemic lupus erythematosus, polymyositis/dermatomyositis and pulmonary sarcoidosis, and is enhanced in tumor cells such as melanoma, renal, prostate, pancreas, colon, gastric and thyroid cancers. It is predominantly expressed on stem cells and on cells of the granulocytic and monocytic lineages at distinct stages of differentiation, thus considered a marker of differentiation. Thus, APN inhibition may lead to the development of anti-cancer and anti-inflammatory drugs. APNs are also present in many pathogenic bacteria and represent potential drug targets. Some APNs have been used commercially, such as one from Lactococcus lactis used in the food industry. APN also serves as a receptor for coronaviruses, although the virus receptor interaction site seems to be distinct from the enzymatic site and aminopeptidase activity is not necessary for viral infection. APNs have also been extensively studied as putative Cry toxin receptors. Cry1 proteins are pore-forming toxins that bind to the midgut epithelial cell membrane of susceptible insect larvae, causing extensive damage. Several different toxins, including Cry1Aa, Cry1Ab, Cry1Ac, Cry1Ba, Cry1Ca and Cry1Fa, have been shown to bind to APNs; however, a direct role of APN in cytotoxicity has yet to be firmly established.	440
341066	cd09603	M1_APN_like	Peptidase M1 family similar to aminopeptidase N catalytic domain. This family contains mostly bacterial and some archaeal M1 peptidases with smilarity to the catalytic domain of aminopeptidase N (APN; CD13; alanyl aminopeptidase; EC 3.4.11.2), a type II integral membrane protease belonging to the M1 gluzincin family. APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and, in higher eukaryotes, is present in a variety of human tissues and cell types (leukocyte, fibroblast, endothelial and epithelial cells). APN expression is dysregulated in inflammatory diseases such as chronic pain, rheumatoid arthritis, multiple sclerosis, systemic sclerosis, systemic lupus erythematosus, polymyositis/dermatomyosytis and pulmonary sarcoidosis, and is enhanced in tumor cells such as melanoma, renal, prostate, pancreas, colon, gastric and thyroid cancers. It is predominantly expressed on stem cells and on cells of the granulocytic and monocytic lineages at distinct stages of differentiation, thus considered a marker of differentiation. Thus, APN inhibition may lead to the development of anti-cancer and anti-inflammatory drugs. APNs are also present in many pathogenic bacteria and represent potential drug targets. Some APNs have been used commercially, such as one from Lactococcus lactis used in the food industry. APN also serves as a receptor for coronaviruses, although the virus receptor interaction site seems to be distinct from the enzymatic site and aminopeptidase activity is not necessary for viral infection. APNs have also been extensively studied as putative Cry toxin receptors. Cry1 proteins are pore-forming toxins that bind to the midgut epithelial cell membrane of susceptible insect larvae, causing extensive damage. 
Several different toxins, including Cry1Aa, Cry1Ab, Cry1Ac, Cry1Ba, Cry1Ca and Cry1Fa, have been shown to bind to APNs; however, a direct role of APN in cytotoxicity has yet to be firmly established.	410
341067	cd09604	M1_APN_like	Peptidase M1 family similar to aminopeptidase N catalytic domain. This family contains bacterial M1 peptidases with similarity to the catalytic domain of aminopeptidase N (APN; CD13; alanyl aminopeptidase; EC 3.4.11.2), a type II integral membrane protease belonging to the M1 gluzincin family. APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and, in higher eukaryotes, is present in a variety of human tissues and cell types (leukocyte, fibroblast, endothelial and epithelial cells). APN expression is dysregulated in inflammatory diseases such as chronic pain, rheumatoid arthritis, multiple sclerosis, systemic sclerosis, systemic lupus erythematosus, polymyositis/dermatomyositis and pulmonary sarcoidosis, and is enhanced in tumor cells such as melanoma, renal, prostate, pancreas, colon, gastric and thyroid cancers. It is predominantly expressed on stem cells and on cells of the granulocytic and monocytic lineages at distinct stages of differentiation, thus considered a marker of differentiation. Thus, APN inhibition may lead to the development of anti-cancer and anti-inflammatory drugs. APNs are also present in many pathogenic bacteria and represent potential drug targets. Some APNs have been used commercially, such as one from Lactococcus lactis used in the food industry. APN also serves as a receptor for coronaviruses, although the virus receptor interaction site seems to be distinct from the enzymatic site and aminopeptidase activity is not necessary for viral infection. APNs have also been extensively studied as putative Cry toxin receptors. Cry1 proteins are pore-forming toxins that bind to the midgut epithelial cell membrane of susceptible insect larvae, causing extensive damage. Several different toxins, including Cry1Aa, Cry1Ab, Cry1Ac, Cry1Ba, Cry1Ca and Cry1Fa, have been shown to bind to APNs; however, a direct role of APN in cytotoxicity has yet to be firmly established.	440
341068	cd09605	M3A	Peptidase M3A family includes thimet oligopeptidase, dipeptidyl carboxypeptidase and mitochondrial intermediate peptidase. The M3-like family also called neurolysin-like family, is part of the "zincins" metallopeptidases, and includes M3, M2 and M32 families of metallopeptidases. The M3 family is subdivided into two subfamilies: the widespread M3A, represented by this CD, which comprises a number of high-molecular mass endo- and exopeptidases from bacteria, archaea, protozoa, fungi, plants and animals, and the small M3B, whose members are enzymes primarily from bacteria. Well-known mammalian/eukaryotic M3A endopeptidases are the thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (alias endopeptidase 3.4.24.16), and the mitochondrial intermediate peptidase. The first two are intracellular oligopeptidases, which act only on relatively short substrates of less than 20 amino acid residues, while the latter cleaves N-terminal octapeptides from proteins during their import into the mitochondria. The M3A subfamily also contains several bacterial endopeptidases, called oligopeptidases A, as well as a large number of bacterial carboxypeptidases, called dipeptidyl peptidases (Dcp; Dcp II; peptidyl dipeptidase; EC 3.4.15.5). The peptidases in the M3 family contain the HEXXH motif that forms part of the active site in conjunction with a C-terminally-located Glutamic acid (Glu) residue. A single zinc ion is ligated by the side-chains of the two Histidine (His) residues, and the more C-terminal Glu. Most of the peptidases are synthesized without signal peptides or propeptides, and function intracellularly.	587
341069	cd09606	M3B_PepF	Peptidase family M3B, oligopeptidase F (PepF). Peptidase family M3 oligopeptidase F (oligoendopeptidase) is mostly bacterial and includes oligoendopeptidase F from Geobacillus stearothermophilus. This enzyme hydrolyzes peptides containing between 7 and 17 amino acids and may cleave proteins at Leu-Gly. The PepF gene is duplicated in Lactococcus lactis on the plasmid that bears it, while a shortened second copy is found in Bacillus subtilis. Most bacterial PepFs are cytoplasmic endopeptidases; however, the Bacillus amyloliquefaciens PepF oligopeptidase is a secreted protein and may facilitate the process of sporulation. Specifically, the yjbG gene encoding the homolog of the PepF1 and PepF2 oligoendopeptidases of Lactococcus lactis has been identified in Bacillus subtilis as an inhibitor of sporulation initiation when over-expressed from a multicopy plasmid.	543
341070	cd09607	M3B_PepF	Peptidase family M3B, oligopeptidase F (PepF). Peptidase family M3B Oligopeptidase F (PepF; Pz-peptidase B; EC 3.4.24.-) is mostly bacterial and is similar to oligoendopeptidase F from Lactococcus lactis. This enzyme hydrolyzes peptides containing between 7 and 17 amino acids with fairly broad specificity. The PepF gene is duplicated in L. lactis on the plasmid that bears it, while a shortened second copy is found in Bacillus subtilis. Most bacterial PepFs are cytoplasmic endopeptidases; however, the Bacillus amyloliquefaciens PepF oligopeptidase is a secreted protein and may facilitate the process of sporulation. Specifically, the yjbG gene encoding the homolog of the PepF1 and PepF2 oligoendopeptidases of Lactococcus lactis has been identified in Bacillus subtilis as an inhibitor of sporulation initiation when over-expressed from a multicopy plasmid.	580
341071	cd09608	M3B_PepF	Peptidase family M3B, oligopeptidase F (PepF). Peptidase family M3B oligopeptidase F (PepF; Pz-peptidase B; EC 3.4.24.-) is mostly bacterial and includes oligoendopeptidase F from Lactococcus lactis. This enzyme hydrolyzes peptides containing between 7 and 17 amino acids with fairly broad specificity. The PepF gene is duplicated in L. lactis on the plasmid that bears it, while a shortened second copy is found in Bacillus subtilis. Most bacterial PepFs are cytoplasmic endopeptidases; however, the Bacillus amyloliquefaciens PepF oligopeptidase is a secreted protein and may facilitate the process of sporulation. Specifically, the yjbG gene encoding the homolog of the PepF1 and PepF2 oligoendopeptidases of Lactococcus lactis has been identified in Bacillus subtilis as an inhibitor of sporulation initiation when over-expressed from a multicopy plasmid. This PepF family includes Streptococcus agalactiae PepB, a group B streptococcal oligopeptidase which has been shown to degrade a variety of bioactive peptides as well as the synthetic collagen-like substrate N-(3-[2-furyl]acryloyl)-Leu-Gly-Pro-Ala in vitro.	560
341072	cd09609	M3B_PepF	Peptidase family M3B, oligopeptidase F (PepF). Peptidase family M3B oligopeptidase F (PepF; Pz-peptidase B; EC 3.4.24.-) is mostly bacterial and is similar to oligoendopeptidase F from Lactococcus lactis. This enzyme hydrolyzes peptides containing between 7 and 17 amino acids with fairly broad specificity. The PepF gene is duplicated in L. lactis on the plasmid that bears it, while a shortened second copy is found in Bacillus subtilis. Most bacterial PepFs are cytoplasmic endopeptidases; however, the Bacillus amyloliquefaciens PepF oligopeptidase is a secreted protein and may facilitate the process of sporulation. Specifically, the yjbG gene encoding the homolog of the PepF1 and PepF2 oligoendopeptidases of Lactococcus lactis has been identified in Bacillus subtilis as an inhibitor of sporulation initiation when over-expressed from a multicopy plasmid.	586
341073	cd09610	M3B_PepF	Peptidase family M3B, oligopeptidase F (PepF). Peptidase family M3B oligopeptidase F (PepF; Pz-peptidase B; EC 3.4.24.-) is mostly bacterial and is similar to oligoendopeptidase F from Lactococcus lactis. This enzyme hydrolyzes peptides containing between 7 and 17 amino acids with fairly broad specificity. The PepF gene is duplicated in L. lactis on the plasmid that bears it, while a shortened second copy is found in Bacillus subtilis. Most bacterial PepFs are cytoplasmic endopeptidases; however, the Bacillus amyloliquefaciens PepF oligopeptidase is a secreted protein and may facilitate the process of sporulation. Specifically, the yjbG gene encoding the homolog of the PepF1 and PepF2 oligoendopeptidases of Lactococcus lactis has been identified in Bacillus subtilis as an inhibitor of sporulation initiation when over-expressed from a multicopy plasmid.	532
187707	cd09611	Jacalin_ZG16_like	Jacalin-like lectin domain of the zymogen granule protein 16 and related proteins. ZG16p is a conserved secreted vertebrate protein with tissue-specific expression profiles, which might play a role in glycoprotein secretion, perhaps as a linker protein that participates in the formation and/or transport of the zymogen granule. Its paralog ZG16b (PAUF) has been associated with roles in gene regulation and cancer. This domain family also contains mammalian proteins labelled as prostatic spermine-binding protein (SBP) and salivary-gland specific secreted proteins.	128
187708	cd09612	Jacalin	Jacalin-like plant lectin domain. Jacalin-like lectins are sugar-binding protein domains mostly found in plants. They adopt a beta-prism topology consistent with a circularly permuted three-fold repeat of a structural motif. Proteins containing this domain may bind mono- or oligosaccharides with high specificity. The domain can occur in tandem-repeat arrangements with up to six copies, and in architectures combined with a variety of other functional domains. The family was initially named after an abundant protein found in the jackfruit seed. Jacalin specifically binds to the alpha-O-glycoside of the disaccharide Gal-beta1-3-GalNAc, and has proven useful in the study of O-linked glycoproteins. Jacalin-like lectins in this family may occur in various oligomerization states.	130
187709	cd09613	Jacalin_metallopeptidase_like	Jacalin-like lectin domain of putative metalloproteases and similar proteins. Members of this family, which appears restricted to fungi, co-occur with protein domains that contain an HExxH motif characteristic of metallopeptidases. They have not been functionally characterized.	124
187710	cd09614	griffithsin_like	Jacalin-like lectin domain of griffithsin and related proteins. Griffithsin is a lectin isolated from a red alga, which has shown potential as an inhibitor of viral entry, exhibiting antiviral activity against HIV and SARS. The biological functions of griffithsin and griffithsin-like proteins with respect to their source organisms are not known.	128
187711	cd09615	Jacalin_EEP	Jacalin-like lectin domains of putative endonucleases/exonucleases/phosphatases and related proteins. Members of this taxonomically diverse family co-occur with metal-dependent endonucleases/exonucleases/phosphatases. They have not been functionally characterized.	134
187737	cd09616	Peptidase_C12_UCH_L1_L3	Cysteine peptidase C12 containing ubiquitin carboxyl-terminal hydrolase (UCH) families L1 and L3. This ubiquitin C-terminal hydrolase (UCH) family includes UCH-L1 and UCH-L3, the two members sharing around 53% sequence identity as well as conserved catalytic residues. Both enzymes hydrolyze carboxyl terminal esters and amides of ubiquitin (Ub). UCH-L1, in dimeric form, has additional enzymatic activity as a ubiquitin ligase. It is highly abundant in the brain, constituting up to 2% of total protein, and is expressed exclusively in neurons and testes. Abnormal expression of UCH-L1 has been shown to correlate with several forms of cancer, including several primary lung tumors, lung tumor cell lines, and colorectal cancers. Mutations in the UCH-L1 gene have been linked to susceptibility to and protection from Parkinson's disease (PD); dysfunction of the hydrolase activity can lead to an accumulation of alpha-synuclein, which is linked to Parkinson's disease (PD), while accumulation of neurofibrillary tangles is linked to Alzheimer's disease (AD).  UCH-L3 hydrolyzes isopeptide bonds at the C-terminal glycine of either Ub or Nedd8, a ubiquitin-like protein. It can also interact with Lys48-linked Ub dimers to protect them from degradation while inhibiting its hydrolase activity at the same time.  Unlike UCH-L1, neither dimerization nor ligase activity have been observed for UCH-L3. It has been shown that levels of Nedd8 and the apoptotic protein p53 and Bax are elevated in UCH-L3 knockout mice upon cryptorchid injury, possibly contributing to profound germ cell loss via apoptosis.	222
187738	cd09617	Peptidase_C12_UCH37_BAP1	Cysteine peptidase C12 containing ubiquitin carboxyl-terminal hydrolase (UCH) families UCH37 (UCH-L5) and BAP1. This ubiquitin C-terminal hydrolase (UCH) family includes UCH37 (also known as UCH-L5) and BRCA1-associated protein-1 (BAP1). They contain a UCH catalytic domain as well as an additional C-terminal extension which plays a role in protein-protein interactions. UCH37 is responsible for ubiquitin (Ub) isopeptidase activity in the 19S proteasome regulatory complex; it disassembles Lys48-linked poly-ubiquitin from the distal end of the chain. It is also associated with the human Ino80 chromatin-remodeling complex (hINO80) in the nucleus and can be activated through transient association of hINO80 with hRpn13 that is bound to the 19S regulatory particle or the proteasome. UCH37 possibly plays a role in oncogenesis; it competes with Smad ubiquitination regulatory factor 2 (Smurf2, ubiquitin ligase) in binding concurrently to Smad7 in order to deubiquitinate the activated type I transforming growth factor beta (TGF-beta) receptor, thus rescuing it from proteasomal degradation. BAP1 binds to the wild-type BRCA1 RING finger domain, localized in the nucleus.  In addition to the UCH catalytic domain, BAP1 contains a UCH37-like domain (ULD), binding domains for BRCA1 and BARD1, which form a tumor suppressor heterodimeric complex, and a binding domain for HCFC1, which interacts with histone-modifying complexes during cell division. The full-length human BRCA1 is a ubiquitin ligase. However, BAP1 does not appear to function in the deubiquitination of autoubiquitinated BRCA1. BAP1 exhibits tumor suppressor activity in cancer cells, and gene mutations have been reported in a small number of breast and lung cancer samples. In metastasis of uveal melanoma, the most common primary cancer of the eye, inactivating somatic mutations have been identified in the gene encoding BAP1 on chromosome 3p21.1. These mutations include several that cause premature protein termination as well as affect its UCH domain, thus implicating loss of BAP1 and suggesting that the BAP1 pathway may be a valuable therapeutic target.	219
187676	cd09618	CBM9_like_2	DOMON-like type 9 carbohydrate binding module. Family 9 carbohydrate-binding modules (CBM9) play a role in the microbial degradation of cellulose and hemicellulose (materials found in plants). The domain has previously been called cellulose-binding domain. The polysaccharide binding sites of CBMs with available 3D structure have been found to be either flat surfaces with interactions formed by predominantly aromatic residues (tryptophan and tyrosine), or extended shallow grooves. CBM9 domains found in this uncharacterized subfamily are typically found at the N-terminus of longer proteins that lack additional annotation with domain footprints.	186
187677	cd09619	CBM9_like_4	DOMON-like type 9 carbohydrate binding module. Family 9 carbohydrate-binding modules (CBM9) play a role in the microbial degradation of cellulose and hemicellulose (materials found in plants). The domain has previously been called cellulose-binding domain. The polysaccharide binding sites of CBMs with available 3D structure have been found to be either flat surfaces with interactions formed by predominantly aromatic residues (tryptophan and tyrosine), or extended shallow grooves. CBM9 domains found in this uncharacterized heterogeneous subfamily are often located at the C-terminus of longer proteins and may co-occur with various other domains.	187
187678	cd09620	CBM9_like_3	DOMON-like type 9 carbohydrate binding module. Family 9 carbohydrate-binding modules (CBM9) play a role in the microbial degradation of cellulose and hemicellulose (materials found in plants). The domain has previously been called cellulose-binding domain. The polysaccharide binding sites of CBMs with available 3D structure have been found to be either flat surfaces with interactions formed by predominantly aromatic residues (tryptophan and tyrosine), or extended shallow grooves. CBM9 domains found in this uncharacterized heterogeneous subfamily may co-occur with various other domains.	200
187679	cd09621	CBM9_like_5	DOMON-like type 9 carbohydrate binding module. Family 9 carbohydrate-binding modules (CBM9) play a role in the microbial degradation of cellulose and hemicellulose (materials found in plants). The domain has previously been called cellulose-binding domain. The polysaccharide binding sites of CBMs with available 3D structure have been found to be either flat surfaces with interactions formed by predominantly aromatic residues (tryptophan and tyrosine), or extended shallow grooves. CBM9 domains found in this uncharacterized heterogeneous subfamily are often located at the C-terminus of longer proteins and may co-occur with various other functional domains such as glycosyl hydrolases. The CBM9 module in these architectures may be involved in binding to carbohydrates.	188
187680	cd09622	CBM9_like_HisKa	DOMON-like type 9 carbohydrate binding module at the N-terminus of bacterial sensor histidine kinases. Family 9 carbohydrate-binding modules (CBM9) play a role in the microbial degradation of cellulose and hemicellulose (materials found in plants). The domain has previously been called cellulose-binding domain. The polysaccharide binding sites of CBMs with available 3D structure have been found to be either flat surfaces with interactions formed by predominantly aromatic residues (tryptophan and tyrosine), or extended shallow grooves. CBM9 domains found in this family are located at the N-terminus of bacterial sensor histidine kinases and may constitute or contribute to the ligand-binding moiety.	265
187681	cd09623	DOMON_EBDH	Heme-binding domain of bacterial ethylbenzene dehydrogenase. Ethylbenzene dehydrogenase (EBDH) is a bacterial molybdopterin enzyme. It catalyzes anaerobic hydroxylation of alkylaromatic compounds to secondary alcohols. The DOMON domain in EBDH and related proteins, typically called the gamma subunit, binds a heme; its function in the catalytic mechanism is unclear. It co-occurs with a molybdopterin-binding subunit and an iron-sulfur protein. This family also contains heme-binding domains of dimethylsulfide dehydrogenase, selenate reductases, and chlorate reductase.	224
187682	cd09624	DOMON_b558_566	DOMON-like heme-binding domain of CbsA. This family, conserved in some lineages of the Crenarchaeota, represents a mono-heme cytochrome b558/566. CbsA is reported to be a subunit in a heterodimeric complex (CbsA-CbsB in Sulfolobus species), and appears to be glycosylated.	279
187683	cd09625	DOMON_like_cytochrome	DOMON-like domain of an uncharacterized protein family. This family of uncharacterized bacterial proteins contains a DOMON-like domain and an N-terminal B- or C-type cytochrome domain. DOMON-like domains can be found in all three kingdoms of life and are a diverse group of ligand binding domains that have been shown to interact with sugars and hemes. DOMON domains were initially thought to confer protein-protein interactions. They were subsequently found as a heme-binding motif in cellobiose dehydrogenase, an extracellular fungal oxidoreductase that degrades both lignin and cellulose, and in ethylbenzene dehydrogenase, an enzyme that aids in the anaerobic degradation of hydrocarbons. The domain interacts with sugars in the type 9 carbohydrate binding modules (CBM9), which are present in a variety of glycosyl hydrolases, and it can also be found at the N-terminus of sensor histidine kinases.	348
187684	cd09626	DOMON_glucodextranase_like	DOMON-like domain of various glycoside hydrolases. This DOMON-like domain is found at the C-terminus of various bacterial proteins that play roles in metabolizing carbohydrates, such as glucodextranase (hydrolyzes alpha-1,6-glucosidic linkages of dextran from the non-reducing end), glucan alpha-1,4-glucosidase, pullulanase (degrades pullulan, a polysaccharide built from maltotriose units), arabinogalactan endo-1,4-beta-galactosidase, and others. Consequently, the DOMON-like domains in this family co-occur with catalytic domains from various glycosyl hydrolase families. The precise function of the DOMON domains in these proteins is not clear; they may be involved in interactions with carbohydrates.	220
187685	cd09627	DOMON_murB_like	DOMON-like domain of UDP-N-acetylenolpyruvoylglucosamine reductase. UDP-N-acetylenolpyruvoylglucosamine reductase (murB) catalyzes an essential step in peptidoglycan biosynthesis, the reduction of UDP-N-acetylglucosamine-enolpyruvate to UDP-N-acetylmuramate. A subset of these FAD-dependent enzymes contains a C-terminal DOMON-like domain. DOMON domains can be found in all three kingdoms of life and are a diverse group of ligand binding domains that have been shown to interact with sugars and hemes; initially DOMON domains were suspected to confer protein-protein interactions. The DOMON-like domain in murB may bind a heme.	179
187686	cd09628	DOMON_SDR_2_like	DOMON domain of stromal cell-derived receptor 2 (ferric chelate reductase 1) and related proteins. Stromal cell-derived receptor 2 (or ferric chelate reductase 1) reduces Fe(3+) to Fe(2+) ahead of iron transport from the endosome to the cytoplasm. This transmembrane protein is a member of the cytochrome b561 family and contains a DOMON domain which may bind to heme or another ligand. DOMON-like domains can be found in all three kingdoms of life and are a diverse group of ligand binding domains that have been shown to interact with sugars and hemes. DOMON domains were initially thought to confer protein-protein interactions. They were subsequently found as a heme-binding motif in cellobiose dehydrogenase, an extracellular fungal oxidoreductase that degrades both lignin and cellulose, and in ethylbenzene dehydrogenase, an enzyme that aids in the anaerobic degradation of hydrocarbons. The domain interacts with sugars in the type 9 carbohydrate binding modules (CBM9), which are present in a variety of glycosyl hydrolases, and it can also be found at the N-terminus of sensor histidine kinases.	169
187687	cd09629	DOMON_CIL1_like	DOMON-like domain of Brassica carinata CIL1 and similar proteins. Brassica carinata CIL1 has been described as involved in suppression of axillary meristem development. It contains a single DOMON domain, the function of which is unclear. Members in this diverse family of plant proteins may have a cytochrome b561 domain C-terminal to the DOMON domain; some members from Arabidopsis have been characterized as auxin-responsive or auxin-induced proteins. DOMON domains were initially thought to confer protein-protein interactions. They were subsequently found as a heme-binding motif in cellobiose dehydrogenase, an extracellular fungal oxidoreductase that degrades both lignin and cellulose, and in ethylbenzene dehydrogenase, an enzyme that aids in the anaerobic degradation of hydrocarbons. The domain interacts with sugars in the type 9 carbohydrate binding modules (CBM9), which are present in a variety of glycosyl hydrolases, and it can also be found at the N-terminus of sensor histidine kinases.	152
187688	cd09630	CDH_like_cytochrome	Heme-binding cytochrome domain of fungal cellobiose dehydrogenases. Cellobiose dehydrogenase (CellobioseDH or CDH) is an extracellular fungal oxidoreductase that degrades both lignin and cellulose. Specifically, CDHs oxidize cellobiose, cellodextrins, and lactose to corresponding lactones, utilizing a variety of electron acceptors. Class-II CDHs are monomeric hemoflavoenzymes that are comprised of a b-type cytochrome domain linked to a large flavodehydrogenase domain. The cytochrome domain of CDH and related enzymes, which this model describes, folds as a beta sandwich and complexes a heme molecule. It is found at the N-terminus of this family of enzymes, and belongs to the DOMON domain superfamily, a ligand-interacting motif found in all three kingdoms of life.	168
187689	cd09631	DOMON_DOH	DOMON-like domain of copper-dependent monooxygenases and related proteins. This diverse family characterizes DOMON domains found in dopamine beta-hydroxylase (DBH), monooxygenase X (MOX), and various other proteins, some of which contain DOMON domains exclusively; the family is not restricted to eukaryotes. DBH is a membrane-bound enzyme that converts dopamine to L-norepinephrine, and plays a central role in the metabolism of catecholamine neurotransmitters.  DOMON domains were initially thought to confer protein-protein interactions. They were subsequently found as a heme-binding motif in cellobiose dehydrogenase, an extracellular fungal oxidoreductase that degrades both lignin and cellulose, and in ethylbenzene dehydrogenase, an enzyme that aids in the anaerobic degradation of hydrocarbons. The domain interacts with sugars in the type 9 carbohydrate binding modules (CBM9), which are present in a variety of glycosyl hydrolases, and it can also be found at the N-terminus of sensor histidine kinases.	138
193606	cd09632	PliI_like	Periplasmic lysozyme inhibitor, I-type (PliI) and similar proteins. Aeromonas hydrophila PliI is a dimeric periplasmic protein that enables bacteria to resist permeabilization of the outer membrane by the bactericidal action of lysozyme. PliI may be a direct inhibitor of lysozyme that inserts a conserved loop into the active site of type I (invertebrate) lysozymes.	109
193607	cd09633	Deltex_C	Domain found at the C-terminus of deltex-like. The deltex family of proteins is involved in the regulation of Notch signaling, and therefore may play roles in cell-to-cell communications that regulate mechanisms determining cell fate. They have a central RING-type zinc finger domain and contain a C-terminal domain, described here, that is also found in other domain architectures.  Deltex-1 (DTX1) contains a RING finger and two WWE domains, indicating that it may be an E3 ubiquitin ligase. Human deltex 3-like, which contains an additional N-terminal domain (presumably with ubiquitin ligase activity) is also described as E3 ubiquitin-protein ligase DTX3L, B-lymphoma- and BAL-associated protein (BBAP), or rhysin-2. DTX3L mediates monoubiquitination of K91 of histone H4 in response to DNA damage.	131
187766	cd09634	Cas1_I-II-III	CRISPR/Cas system-associated protein Cas1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas1 is the most universal CRISPR system protein thought to be involved in spacer integration; Cas1 is metal-dependent deoxyribonuclease, also binds RNA; Shown to possess a unique fold consisting of a N-terminal beta-strand domain and a C-terminal alpha-helical domain	317
187767	cd09636	Cas1_I-II-III	CRISPR/Cas system-associated protein Cas1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas1 is the most universal CRISPR system protein thought to be involved in spacer integration; Cas1 is metal-dependent deoxyribonuclease, also binds RNA; Shown to possess a unique fold consisting of a N-terminal beta-strand domain and a C-terminal alpha-helical domain	260
187768	cd09637	Cas4_I-A_I-B_I-C_I-D_II-B	CRISPR/Cas system-associated protein Cas4. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas4 is RecB-like nuclease with three-cysteine C-terminal cluster	178
187769	cd09638	Cas2_I_II_III	CRISPR/Cas system-associated protein Cas2. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas2 is present in majority of CRISPR/Cas systems along with Cas1; RNAse specific to U-rich regions; Possesses an RRM/ferredoxin fold	90
187770	cd09639	Cas3_I	CRISPR/Cas system-associated protein Cas3. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; DEAD/DEAH box helicase DNA helicase cas3'; Often but not always is fused to HD nuclease domain; signature gene for Type I	353
187771	cd09640	Cas7_I-C	CRISPR/Cas system-associated RAMP superfamily protein Cas7. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas7 is a RAMP superfamily protein; Subunit of the Cascade complex; also known as CT1132 family	258
193608	cd09641	Cas3''_I	CRISPR/Cas system-associated protein Cas3''. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; HD-like nuclease, specifically digesting double-stranded oligonucleotides and preferably cleaving at G:C pairs; signature gene for Type I	200
187773	cd09642	Cas8c_I-C	CRISPR/Cas system-associated protein Cas8c. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Large proteins, some contain Zn-finger domain; signature gene for I-C subtype; also known as Csd1 family	574
187774	cd09643	Csn1	CRISPR/Cas system-associated protein Cas9. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Very large protein containing McrA/HNH-nuclease related domain and a RuvC-like nuclease domain; signature gene for type II	799
213407	cd09644	Csn2	CRISPR/Cas system-associated protein Csn2. Csn2 is a Nmeni subtype-specific Cas protein, which may function in the adaptation process which mediates the incorporation of foreign nucleic acids into the microbial host genome. Csn2 may interact directly with double-stranded DNA. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Csn2 has been predicted to be a functional analog of Cas4 based on anti-correlated phyletic patterns; also known as SPy1049 family.	223
187776	cd09645	Cas5_I-E	CRISPR/Cas system-associated RAMP superfamily protein Cas5. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas5 is a RAMP superfamily protein; Subunit of the Cascade complex	137
187777	cd09646	Cas7_I-E	CRISPR/Cas system-associated RAMP superfamily protein Cas7. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas7 is a RAMP superfamily protein; Subunit of the Cascade complex; also known as Cse4/CasC family	325
187778	cd09647	Csm2_III-A	CRISPR/Cas system-associated protein Csm2. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Small alpha-helical protein; signature gene for subtype III-A	95
187779	cd09648	Cas2_I-E	CRISPR/Cas system-associated protein Cas2. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas2 is present in majority of CRISPR/Cas systems along with Cas1; RNAse specific to U-rich regions; Possesses an RRM/ferredoxin fold	93
187780	cd09649	Cas5_I-A	CRISPR/Cas system-associated RAMP superfamily protein Cas5. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas5 is a RAMP superfamily protein; Subunit of the Cascade complex	143
187781	cd09650	Cas7_I	CRISPR/Cas system-associated RAMP superfamily protein Cas7. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas7 is a RAMP superfamily protein; Subunit of the Cascade complex; also known as MJ0381 family	189
187782	cd09651	Cas5_I-C	CRISPR/Cas system-associated RAMP superfamily protein Cas5. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas5 is a RAMP superfamily protein; Subunit of the Cascade complex; in subtype I-C this protein might be the endoribonuclease that generates crRNAs; also known as DevS family	198
410980	cd09652	Cas6	Class 1 CRISPR-associated endoribonuclease Cas6. CRISPR-Cas (clustered regularly interspaced short palindromic repeats and CRISPR-associated proteins) adaptive immune systems defend microbes against foreign nucleic acids via RNA-guided endonucleases. These systems are divided into two classes: class 1 systems utilize multiple Cas proteins and CRISPR RNA (crRNA) to form an effector complex while class 2 systems employ a large, single effector with crRNA to mediate interference. Cas6 family endoribonucleases are typically found within types I and III CRISPR-Cas systems and are metal-independent nucleases that catalyze RNA cleavage via a mechanism involving a 2'-3' cyclic intermediate. They share a common ferredoxin or RNA recognition motif (RRM) fold, and they recognize and excise CRISPR repeat RNAs that vary widely in primary and secondary structures. Cas6 is also found in the rare type IV system that includes rudimentary CRISPR-cas loci lacking the adaptation module.	258
187784	cd09653	Csa5_I-A	CRISPR/Cas system-associated protein Csa5. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Predicted transcriptional regulator of CRISPR/Cas system; contains DNA binding HTH domain; also known as Csa5 family	97
187785	cd09654	Cmr5_III-B	CRISPR/Cas system-associated protein Cmr5. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Small alpha-helical protein; signature gene for subtype III-B	127
187786	cd09655	CasRa_I-A	CRISPR/Cas system-associated transcriptional regulator CasRa. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Predicted transcriptional regulator of CRISPR/Cas system	198
187787	cd09656	Cmr3_III-B	CRISPR/Cas system-associated RAMP superfamily protein Cmr3. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; RAMP superfamily protein; This protein is a subunit of Cmr complex	318
187788	cd09657	Cmr1_III-B	CRISPR/Cas system-associated RAMP superfamily protein Cmr1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; RAMP superfamily protein; This protein is a subunit of Cmr complex	132
187789	cd09658	Cas5_I-B	CRISPR/Cas system-associated RAMP superfamily protein Cas5. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas5 is a RAMP superfamily protein; Subunit of the Cascade complex	181
187790	cd09659	Cas4_I-A	CRISPR/Cas system-associated protein Cas4. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas4 is RecB-like nuclease with three-cysteine C-terminal cluster	270
187791	cd09660	Csx1_III-U	CRISPR/Cas system-associated protein Csx1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Protein of this family often fused to HTH domain; Some proteins could have an additional fusion with RecB-family nuclease domain; Core domain appears to have a Rossmann-like fold; loosely associated with CRISPR/Cas systems; also known as MJ1666 family	394
187792	cd09661	Cmr6_III-B	CRISPR/Cas system-associated RAMP superfamily protein Cmr6. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; RAMP superfamily protein; This protein is a subunit of Cmr complex	210
187793	cd09662	Csm5_III-A	CRISPR/Cas system-associated RAMP superfamily protein Csm5. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; RAMP superfamily protein	365
187794	cd09663	Csm4_III-A	CRISPR/Cas system-associated RAMP superfamily protein Csm4. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; RAMP superfamily protein	301
187795	cd09664	Cas6_I-E	CRISPR/Cas system-associated RAMP superfamily protein Cas6e. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas6e is an endoribonuclease that generates crRNA; This family is specific for CRISPR/Cas system I-E subtype; Homologous to Cas6 (RAMP superfamily protein); Possesses double RRM/ferredoxin fold; also known as Cse3 family	210
187796	cd09665	Cas8a1_I-A	CRISPR/Cas system-associated protein Cas8a1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Large proteins, some contain Zn-finger domain; signature gene for I-A subtype; also known as CXXC_CXXC family	334
187797	cd09666	Cas8a2_I-A	CRISPR/Cas system-associated protein Cas8a2. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Zn-finger domain containing protein, distant homologs of Cas8 proteins; signature gene for I-A subtype; also known as Csa4 family	352
187798	cd09667	Csb2_I-U	CRISPR/Cas system-associated protein Csb2. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Duplicated RAMP domains; also known as GSU0054 family	418
187799	cd09668	Csx1_III-U	CRISPR/Cas system-associated protein Csx1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Protein of this family often fused to HTH domain; Some proteins could have an additional fusion with RecB-family nuclease domain; Core domain appears to have a Rossmann-like fold; loosely associated with CRISPR/Cas systems; also known as TM1812 family	214
187800	cd09669	Cse1_I-E	CRISPR/Cas system-associated protein Cse1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Large proteins, some contain Zn-finger domain; subunit of the Cascade complex; signature gene for I-E subtype; also known as Cse1/CasA/YgcL family	477
187801	cd09670	Cse2_I-E	CRISPR/Cas system-associated protein Cse2. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Small alpha-helical protein; also known as Cse2/CasB/YgcK family; specific gene for I-E subtype	152
187802	cd09671	Csx1_III-U	CRISPR/Cas system-associated protein Csx1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Protein of this family often fused to HTH domain; Some proteins could have an additional fusion with RecB-family nuclease domain; Core domain appears to have a Rossmann-like fold; loosely associated with CRISPR/Cas systems; also known as DxTHG family	346
187803	cd09672	Cas8a1_I-A	CRISPR/Cas system-associated protein Cas8a1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Large proteins, some contain Zn-finger domain; signature gene for I-A subtype; also known as TM1802 family	545
187804	cd09673	Cas3_Cas2_I-F	CRISPR/Cas system-associated protein Cas3/Cas2. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas3/Cas2 fusion; This protein includes both DEAH and HD motifs for helicase and N-terminal domain corresponding to Cas2 RNAse; signature gene for Type I and subtype I-F	1106
187805	cd09674	Cas6_I-F	CRISPR/Cas system-associated RAMP superfamily protein Cas6f. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas6f is an endoribonuclease that generates crRNA; This family is specific for CRISPR/Cas system I-F subtype; Possesses RRM fold; also known as Csy4 family	186
187806	cd09675	Csy1_I-F	CRISPR/Cas system-associated protein Csy1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Large proteins; Predicted subunit of the Cascade complex; signature gene for I-F subtype; also known as Csy1 family	384
187807	cd09676	Csy2_I-F	CRISPR/Cas system-associated RAMP superfamily protein Csy2. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; RAMP superfamily protein; predicted Cas5 ortholog	292
187808	cd09677	Csy3_I-F	CRISPR/Cas system-associated RAMP superfamily protein Csy3. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; RAMP superfamily protein; predicted Cas7 ortholog	339
187809	cd09678	Csb1_I-U	CRISPR/Cas system-associated protein Csb1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; RAMP superfamily protein; Contains several motifs similar to Cas7 family; also known as GSU0053 family	174
187810	cd09679	Cas10_III	CRISPR/Cas system-associated protein Cas10. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Multidomain protein with permuted HD nuclease domain, palm domain and Zn-ribbon; MTH326-like has inactivated polymerase catalytic domain; alr1562 and slr7011 - predicted only on the basis of size, presence of HD domain, and location with RAMPs in one operon; signature gene for type III; also known as Crm2 family	475
187811	cd09680	Cas10_III	CRISPR/Cas system-associated protein Cas10. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Multidomain protein with permuted HD nuclease domain, palm domain and Zn-ribbon; signature gene for type III; also known as Csm1 family	650
187812	cd09681	Csx3_III-U	CRISPR/Cas system-associated protein Csx3. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Small protein in some cases fused to Csx1 (COG1517) family domains	83
187813	cd09682	Cmr4_III-B	CRISPR/Cas system-associated RAMP superfamily protein Cmr4. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; RAMP superfamily protein; This protein is a subunit of Cmr complex	242
187814	cd09683	Csm3_III-A	CRISPR/Cas system-associated RAMP superfamily protein Csm3. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; RAMP superfamily protein	216
187815	cd09684	Csm3_III-A	CRISPR/Cas system-associated RAMP superfamily protein Csm3. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; RAMP superfamily protein	215
187816	cd09685	Cas7_I-A	CRISPR/Cas system-associated RAMP superfamily protein Cas7. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas7 is a RAMP superfamily protein; Subunit of the Cascade complex; also known as DevR family	274
187817	cd09686	Csx1_III-U	CRISPR/Cas system-associated protein Csx1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Protein of this family often fused to HTH domain; Some proteins could have an additional fusion with RecB-family nuclease domain; Core domain appears to have a Rossmann-like fold; loosely associated with CRISPR/Cas systems; also known as NE0113 family	209
187818	cd09687	Cas7_I-C	CRISPR/Cas system-associated RAMP superfamily protein Cas7. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas7 is a RAMP superfamily protein; Subunit of the Cascade complex; also known as Cst2/DevR family	302
187819	cd09688	Cas5_I-C	CRISPR/Cas system-associated RAMP superfamily protein Cas5. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas5 is a RAMP superfamily protein; Subunit of the Cascade complex; in subtype I-C this protein might be the endoribonuclease that generates crRNAs; also known as DevS family	174
187820	cd09689	Cas7_I-C	CRISPR/Cas system-associated RAMP superfamily protein Cas7. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas7 is a RAMP superfamily protein; Subunit of the Cascade complex; also known as Csd2 family	278
187821	cd09690	Cas7_I-B	CRISPR/Cas system-associated RAMP superfamily protein Cas7. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas7 is a RAMP superfamily protein; Subunit of the Cascade complex; also known as Csh2 family	286
187822	cd09691	Cas8b_I-B	CRISPR/Cas system-associated protein Cas8b. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Zn-finger domain containing protein, distant homologs of Cas8 proteins; signature gene for I-B subtype; also known as Csh1 family	381
187823	cd09692	Cas5_I-B	CRISPR/Cas system-associated RAMP superfamily protein Cas5. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas5 is a RAMP superfamily protein; Subunit of the Cascade complex	189
187824	cd09693	Cas5_I	CRISPR/Cas system-associated RAMP superfamily protein Cas5. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas5 is a RAMP superfamily protein; Subunit of the Cascade complex	202
187825	cd09694	Csm6_III-A	CRISPR/Cas system-associated protein Csm6. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Protein of this family often fused to HTH domain; loosely associated with CRISPR/Cas systems	181
187826	cd09695	Csx16_III-U	CRISPR/Cas system-associated protein Csx16. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Small protein often seen in proximity to Csx1 (COG1517) family; also known as VVA1548 family	76
187827	cd09696	Cas3_I	CRISPR/Cas system-associated protein Cas3; Distinct Cas3 family with HD domain fused to C-terminus of Helicase domain. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; DNA helicase Cas3; This protein includes both DEAH and HD motifs; signature gene for Type I	843
187828	cd09697	Cas8a1_I-A	CRISPR/Cas system-associated protein Cas8a1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Large proteins, some contain Zn-finger domain; signature gene for I-A subtype; also known as Csx8 family	441
187829	cd09698	Cas8a2_I-A	CRISPR/Cas system-associated protein Cas8a2. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Large proteins, some contain Zn-finger domain; signature gene for I-A subtype; also known as Csx9 family	377
187830	cd09699	Csm6_III-A	CRISPR/Cas system-associated protein Csm6. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Protein of this family often fused to HTH domain; loosely associated with CRISPR/Cas systems	360
187831	cd09700	Csx10	CRISPR/Cas system-associated RAMP superfamily protein Csx10. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Duplicated RAMP domains	386
187832	cd09701	Cas10_III	CRISPR/Cas system-associated protein Cas10. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Multidomain protein with permuted HD nuclease domain, inactivated palm domain and Zn-ribbon; signature gene for type III	909
187833	cd09702	Csx1_III-U	CRISPR/Cas system-associated protein Csx1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Protein of this family often fused to HTH domain; Some proteins could have an additional fusion with RecB-family nuclease domain; Core domain appears to have a Rossmann-like fold; loosely associated with CRISPR/Cas systems; also known as TIGR02710 family	378
187834	cd09703	Cas6-I-III	CRISPR/Cas system-associated RAMP superfamily protein Cas6. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas6 is an endoribonuclease that generates crRNAs, predicted subunit of Cascade complex; RAMP superfamily protein; Possesses double RRM/ferredoxin fold; also known as Cse3 family	188
187835	cd09704	Csx12	CRISPR/Cas system-associated protein Cas9. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Very large protein containing McrA/HNH-nuclease related domain and a RuvC-like nuclease domain; signature gene for type II	804
187836	cd09705	Csf1_U	CRISPR/Cas system-associated protein Csf1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Zn-finger domain containing protein; also known as Csf1 family	202
187837	cd09706	Csf2_U	CRISPR/Cas system-associated RAMP superfamily protein Csf2. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; RAMP superfamily protein; Contains several motifs similar to Cas7 family	328
187838	cd09707	Csf3_U	CRISPR/Cas system-associated RAMP superfamily protein Csf3. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; RAMP superfamily protein	214
187839	cd09708	Csf4_U	CRISPR/Cas system-associated DinG family helicase Csf4. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; DinG family DNA helicase	632
187840	cd09709	Csc2_I-D	CRISPR/Cas system-associated protein Csc2. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; RAMP superfamily protein; predicted Cas7 ortholog; also known as Cse1 family	274
187841	cd09710	Cas3_I-D	CRISPR/Cas system-associated protein Cas3; Distinct diverged subfamily of Cas3 helicase domain. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Diverged DNA helicase Cas3'; signature gene for Type I and subtype I-D	353
187842	cd09711	Csc1_I-D	CRISPR/Cas system-associated protein Csc1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; RAMP superfamily protein; predicted Cas5 ortholog; also known as CasA/Cse1 family	210
187843	cd09712	Cas10d_I-D	CRISPR/Cas system-associated protein Cas10d. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Large proteins, some contain Zn-finger domain. Fused to N-terminal HD domain; signature gene for I-D subtype; also known as Csc3 family	900
187844	cd09713	Cas8c_I-C	CRISPR/Cas system-associated protein Cas8c. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Large proteins, some contain Zn-finger domain; signature gene for I-C subtype; also known as Csx13_N family	316
187845	cd09714	Cas8c'_I-D	CRISPR/Cas system-associated protein Cas8c'. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Large proteins, some contain Zn-finger domain; signature gene for I-C subtype; also known as Csx13_C family	152
187846	cd09715	Csp2_I-U	CRISPR/Cas system-associated protein Cas8c. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Predicted Cas8 ortholog	474
187847	cd09716	Cas5_I	CRISPR/Cas system-associated RAMP superfamily protein Cas5. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas5 is a RAMP superfamily protein; Subunit of the Cascade complex	220
187848	cd09717	Cas7_I	CRISPR/Cas system-associated RAMP superfamily protein Cas7. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas7 is a RAMP superfamily protein; Subunit of the Cascade complex; also known as Csp1 family	292
187849	cd09718	Cas1_I-F	CRISPR/Cas system-associated protein Cas1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas1 is the most universal CRISPR system protein thought to be involved in spacer integration; Cas1 is a metal-dependent deoxyribonuclease, also binds RNA; Shown to possess a unique fold consisting of an N-terminal beta-strand domain and a C-terminal alpha-helical domain	306
187850	cd09719	Cas1_I-E	CRISPR/Cas system-associated protein Cas1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas1 is the most universal CRISPR system protein thought to be involved in spacer integration; Cas1 is a metal-dependent deoxyribonuclease, also binds RNA; Shown to possess a unique fold consisting of an N-terminal beta-strand domain and a C-terminal alpha-helical domain	262
187851	cd09720	Cas1_II	CRISPR/Cas system-associated protein Cas1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas1 is the most universal CRISPR system protein thought to be involved in spacer integration. Cas1 is a metal-dependent deoxyribonuclease, also binds RNA; Shown to possess a unique fold consisting of an N-terminal beta-strand domain and a C-terminal alpha-helical domain.	275
187852	cd09721	Cas1_I-C	CRISPR/Cas system-associated protein Cas1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas1 is the most universal CRISPR system protein thought to be involved in spacer integration; Cas1 is a metal-dependent deoxyribonuclease, also binds RNA; Shown to possess a unique fold consisting of an N-terminal beta-strand domain and a C-terminal alpha-helical domain	338
187853	cd09722	Cas1_I-B	CRISPR/Cas system-associated protein Cas1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas1 is the most universal CRISPR system protein thought to be involved in spacer integration; Cas1 is a metal-dependent deoxyribonuclease, also binds RNA; Shown to possess a unique fold consisting of an N-terminal beta-strand domain and a C-terminal alpha-helical domain	320
187854	cd09723	Csx1_III-U	CRISPR/Cas system-associated protein Csx1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Protein of this family often fused to HTH domain; Some proteins could have an additional fusion with RecB-family nuclease domain; Core domain appears to have a Rossmann-like fold; loosely associated with CRISPR/Cas systems; also known as csx13 family	132
187855	cd09724	CsaX_III-U	CRISPR/Cas system-associated protein CsaX. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; No prediction	296
187856	cd09725	Cas2_I_II_III	CRISPR/Cas system-associated protein Cas2. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas2 is present in majority of CRISPR/Cas systems along with Cas1; RNAse specific to U-rich regions; Possesses an RRM/ferredoxin fold	79
187857	cd09726	RAMP_I_III	CRISPR/Cas system-associated RAMP superfamily protein. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; RAMP superfamily proteins	177
187858	cd09727	Cas6_I-E	CRISPR/Cas system-associated RAMP superfamily protein Cas6e. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas6e is an endoribonuclease that generates crRNA; This family is specific for CRISPR/Cas system I-E subtype; Homologous to Cas6 (RAMP superfamily protein); Possesses double RRM/ferredoxin fold; also known as Cse3 family	210
187859	cd09728	Csx1_III-U	CRISPR/Cas system-associated protein Csx1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Protein of this family often fused to HTH domain; Some proteins could have an additional fusion with RecB-family nuclease domain; Core domain appears to have a Rossmann-like fold; loosely associated with CRISPR/Cas systems; also known as DxTHG family	400
187860	cd09729	Cse1_I-E	CRISPR/Cas system-associated protein Cse1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Large proteins, some contain Zn-finger domain; subunit of the Cascade complex; signature gene for I-E subtype; also known as Cse1/CasA/YgcL family	465
187861	cd09730	Cas8a1_I-A	CRISPR/Cas system-associated protein Cas8a1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Large proteins, some contain Zn-finger domain; signature gene for I-A subtype; also known as TM1802 family	579
187862	cd09731	Cse2_I-E	CRISPR/Cas system-associated protein Cse2. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Small alpha-helical protein; also known as Cse2/CasB/YgcK family; specific gene for I-E subtype	141
187863	cd09732	Csx1_III-U	CRISPR/Cas system-associated protein Csx1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Protein of this family often fused to HTH domain; Some proteins could have an additional fusion with RecB-family nuclease domain; Core domain appears to have a Rossmann-like fold; loosely associated with CRISPR/Cas systems; also known as TM1812 family	221
187864	cd09733	Cas6-I-III	CRISPR/Cas system-associated RAMP superfamily protein Cas6. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas6 is an endoribonuclease that generates crRNAs, predicted subunit of Cascade complex; RAMP superfamily protein; Possesses double RRM/ferredoxin fold; also known as AF0072 family	193
320705	cd09734	Csb2_I-U	CRISPR/Cas system-associated protein Csb2. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Duplicated RAMP domains; also known as GSU0054 family	496
187866	cd09735	Csy1_I-F	CRISPR/Cas system-associated protein Csy1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Large proteins; Predicted subunit of the Cascade complex; signature gene for I-F subtype; also known as Csy1 family	377
187867	cd09736	Csy2_I-F	CRISPR/Cas system-associated RAMP superfamily protein Csy2. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; RAMP superfamily protein; predicted Cas5 ortholog	289
187868	cd09737	Csy3_I-F	CRISPR/Cas system-associated RAMP superfamily protein Csy3. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; RAMP superfamily protein; predicted Cas7 ortholog	329
187869	cd09738	Csb1_I-U	CRISPR/Cas system-associated protein Csb1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; RAMP superfamily protein; Contains several motifs similar to Cas7 family; also known as GSU0053 family	168
187870	cd09739	Cas6_I-F	CRISPR/Cas system-associated RAMP superfamily protein Cas6f. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas6f is an endoribonuclease that generates crRNA; This family is specific for CRISPR/Cas system I-F subtype; Possesses RRM fold; also known as Csy4 family	185
187871	cd09740	Csx3_III-U	CRISPR/Cas system-associated protein Csx3. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Small protein in some cases fused to Csx1 (COG1517) family domains	84
187872	cd09741	Csx1_III-U	CRISPR/Cas system-associated protein Csx1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Protein of this family often fused to HTH domain; Some proteins could have an additional fusion with RecB-family nuclease domain; Core domain appears to have a Rossmann-like fold; loosely associated with CRISPR/Cas systems; also known as NE0113 family	219
187873	cd09742	Csm6_III-A	CRISPR/Cas system-associated protein Csm6. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Protein of this family often fused to HTH domain; loosely associated with CRISPR/Cas systems; also known as APE2256 family	183
187874	cd09743	Csx16_III-U	CRISPR/Cas system-associated protein Csx16. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Small protein often seen in proximity to Csx1 (COG1517) family; also known as VVA1548 family	90
187875	cd09744	Cas8a1_I-A	CRISPR/Cas system-associated protein Cas8a1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Large proteins, some contain Zn-finger domain; signature gene for I-A subtype; also known as Csx8 family	441
187876	cd09745	Cas8a2_I-A	CRISPR/Cas system-associated protein Cas8a2. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Large proteins, some contain Zn-finger domain; signature gene for I-A subtype; also known as Csx9 family	377
187877	cd09746	Csm6_III-A	CRISPR/Cas system-associated protein Csm6. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Protein of this family often fused to HTH domain; loosely associated with CRISPR/Cas systems	382
187878	cd09747	Csx1_III-U	CRISPR/Cas system-associated protein Csx1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Protein of this family often fused to HTH domain; Some proteins could have an additional fusion with RecB-family nuclease domain; Core domain appears to have a Rossmann-like fold; loosely associated with CRISPR/Cas systems; also known as Cas02710 family	378
187879	cd09748	Cmr3_III-B	CRISPR/Cas system-associated RAMP superfamily protein Cmr3. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; RAMP superfamily protein; This protein is a subunit of Cmr complex	356
187880	cd09749	Cmr5_III-B	CRISPR/Cas system-associated protein Cmr5. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Small alpha-helical protein; signature gene for subtype III-B	119
187881	cd09750	Csa5_I-A	CRISPR/Cas system-associated protein Csa5. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Predicted transcriptional regulator of CRISPR/Cas system; contains DNA binding HTH domain; also known as Csa5 family	101
187882	cd09751	Cas8a2_I-A	CRISPR/Cas system-associated protein Cas8a2. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Zn-finger domain containing protein, distant homologs of Cas8 proteins; signature gene for I-A subtype; also known as Csa4 family	355
187534	cd09752	Cas5_I-C	CRISPR/Cas system-associated RAMP superfamily protein Cas5. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas5 is a RAMP superfamily protein; Subunit of the Cascade complex; in subtype I-C this protein might be the endoribonuclease that generates crRNAs; also known as DevS family	198
187883	cd09753	Cas5_I-A	CRISPR/Cas system-associated RAMP superfamily protein Cas5. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas5 is a RAMP superfamily protein; Subunit of the Cascade complex	147
187884	cd09754	Cas8a1_I-A	CRISPR/Cas system-associated protein Cas8a1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Large proteins, some contain Zn-finger domain; signature gene for I-A subtype; also known as CXXC_CXXC family	65
187885	cd09755	Cas2_I-E	CRISPR/Cas system-associated protein Cas2. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas2 is present in majority of CRISPR/Cas systems along with Cas1; RNAse specific to U-rich regions; Possesses an RRM/ferredoxin fold	62
187886	cd09756	Cas5_I-E	CRISPR/Cas system-associated RAMP superfamily protein Cas5. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas5 is a RAMP superfamily protein; Subunit of the Cascade complex	135
187887	cd09757	Cas8c_I-C	CRISPR/Cas system-associated protein Cas8c. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Zn-finger domain containing protein, distant homologs of Cas8 proteins; signature gene for I-C subtype; also known as Csd1 family	569
213408	cd09758	Csn2	CRISPR/Cas system-associated protein Csn2. Csn2 is a Nmeni subtype-specific Cas protein, which may function in the adaptation process which mediates the incorporation of foreign nucleic acids into the microbial host genome. Csn2 may interact directly with double-stranded DNA. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA. Csn2 has been predicted to be a functional analog of Cas4 based on anti-correlated phyletic patterns; also known as SPy1049 family.	218
187889	cd09759	Cas6_I-A	CRISPR/Cas system-associated RAMP superfamily protein Cas6. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas6 is an endoribonuclease that generates crRNAs, predicted subunit of Cascade complex; RAMP superfamily protein; Possesses double RRM/ferredoxin fold	240
187890	cd09760	Cas6_III	CRISPR/Cas system-associated RAMP superfamily protein Cas6. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas6 is an endoribonuclease that generates crRNAs, predicted subunit of Cascade complex	289
187662	cd09761	A3DFK9-like_SDR_c	Clostridium thermocellum A3DFK9-like, a putative carbohydrate or polyalcohol metabolizing SDR, classical (c) SDRs. This subgroup includes a putative carbohydrate or polyalcohol metabolizing SDR (A3DFK9) from Clostridium thermocellum. Its members have a TGXXXGXG classical-SDR glycine-rich NAD-binding motif, and some have a canonical SDR active site tetrad (A3DFK9 lacks the upstream Asn). SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	242
187663	cd09762	HSDL2_SDR_c	human hydroxysteroid dehydrogenase-like protein 2 (HSDL2), classical (c) SDRs. This subgroup includes human HSDL2 and related proteins. These are members of the classical SDR family, with a canonical Gly-rich NAD-binding motif and the typical YXXXK active site motif. However, the rest of the catalytic tetrad is not strongly conserved. HSDL2 may play a part in fatty acid metabolism, as it is found in peroxisomes. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering), is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase, contain an additional helix-turn-helix motif that is not generally found among SDRs.	243
187664	cd09763	DHRS1-like_SDR_c	human dehydrogenase/reductase (SDR family) member 1 (DHRS1)-like, classical (c) SDRs. This subgroup includes human DHRS1 and related proteins. These are members of the classical SDR family, with a canonical Gly-rich NAD-binding motif and the typical YXXXK active site motif. However, the rest of the catalytic tetrad is not strongly conserved. DHRS1 mRNA has been detected in many tissues, including liver, heart, skeletal muscle, kidney, and pancreas; a longer transcript is predominantly expressed in the liver, a shorter one in the heart. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes have a 3-glycine N-terminal NAD(P)(H)-binding pattern (typically, TGxxxGxG in classical SDRs and TGxxGxxG in extended SDRs), while substrate binding is in the C-terminal region. A critical catalytic Tyr residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering), is often found in a conserved YXXXK pattern. In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) or additional Ser, contributing to the active site. Substrates for these enzymes include sugars, steroids, alcohols, and aromatic compounds. The standard reaction mechanism is a proton relay involving the conserved Tyr and Lys, as well as Asn (or Ser). Some SDR family members, including 17 beta-hydroxysteroid dehydrogenase, contain an additional helix-turn-helix motif that is not generally found among SDRs.	265
187733	cd09764	Csb3_I-U	CRISPR/Cas system-associated RAMP superfamily protein Csb3. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; RAMP superfamily protein; Might be a catalytically active RNA endoribonuclease	341
187734	cd09765	Csx14_I-U	CRISPR/Cas system-associated protein Csx14. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Protein containing C-terminal alpha-helical domain resembling Cas8a2, also known as GSU0052	272
187735	cd09766	Csx15_I-U	CRISPR/Cas system-associated protein Csx15. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Small protein loosely associated with CRISPR/Cas systems; some are fused to AAA ATPase domain, also known as TTE2665 family	101
187705	cd09767	Csx17_I-U	CRISPR/Cas system-associated protein Csx17. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Large proteins; Predicted subunit of the Cascade complex	652
188874	cd09768	Luminal_EIF2AK3	The Luminal domain, a dimerization domain, of the Serine/Threonine protein kinase, eukaryotic translation Initiation Factor 2-Alpha Kinase 3. The Luminal domain is a dimerization domain present in eukaryotic translation Initiation Factor 2-Alpha Kinase 3 (EIF2AK3), also called PKR-like Endoplasmic Reticulum Kinase (PERK). EIF2AK3 is a serine/threonine protein kinase (STK) and a type I transmembrane protein that is localized in the endoplasmic reticulum (ER). As an EIF2AK, it phosphorylates the alpha subunit of eIF-2, resulting in the downregulation of protein synthesis. eIF-2 phosphorylation is induced in response to cellular stresses including virus infection, heat shock, nutrient deficiency, and the accumulation of unfolded proteins, among others. There are four distinct kinases that phosphorylate eIF-2 and control protein synthesis: General Control Non-derepressible-2 (GCN2), protein kinase regulated by RNA (PKR), heme-regulated inhibitor kinase (HRI), and PERK. PERK contains a luminal domain bound with the chaperone BiP under unstressed conditions and a cytoplasmic catalytic kinase domain. In response to the accumulation of misfolded or unfolded proteins in the ER, PERK is activated through the release of BiP, allowing it to dimerize through its luminal domain and autophosphorylate. It functions as the central regulator of translational control during the Unfolded Protein Response (UPR) pathway. In addition to the eIF-2 alpha subunit, PERK also phosphorylates Nrf2, a leucine zipper transcription factor which regulates cellular redox status and promotes cell survival during the UPR.	301
188875	cd09769	Luminal_IRE1	The Luminal domain, a dimerization domain, of the Serine/Threonine protein kinase, Inositol-requiring protein 1. The Luminal domain is a dimerization domain present in Inositol-requiring protein 1 (IRE1), a serine/threonine protein kinase (STK) and a type I transmembrane protein that is localized in the endoplasmic reticulum (ER). IRE1, also called Endoplasmic reticulum (ER)-to-nucleus signaling protein (or ERN), is a kinase receptor that also contains an endoribonuclease domain in the cytoplasmic side. It plays roles in the signaling of the unfolded protein response (UPR), which is activated when protein misfolding is detected in the ER in order to decrease the synthesis of new proteins and increase the capacity of the ER to cope with the stress. IRE1 acts as an ER stress sensor and is the oldest and most conserved component of the UPR in eukaryotes. During ER stress, IRE1 dimerizes through its luminal domain and forms oligomers, allowing the kinase domain to undergo trans-autophosphorylation. This leads to a conformational change that stimulates its endoribonuclease activity and results in the cleavage of its mRNA substrate, HAC1 in yeast and Xbp1 in metazoans, promoting a splicing event that enables translation into a transcription factor which activates the UPR. Mammals contain two IRE1 proteins, IRE1alpha (or ERN1) and IRE1beta (or ERN2). IRE1alpha is expressed in all cells and tissues while IRE1beta is found only in intestinal epithelial cells.	295
197361	cd09803	UBAN	polyubiquitin binding domain of NEMO and related proteins. NEMO (NF-kappaB essential modulator) is a regulatory subunit of the kinase complex IKK, which is involved in the activation of NF-kappaB via phosphorylation of inhibitory IkappaBs. This mechanism requires the binding of NEMO to ubiquitinated substrates. Binding is achieved via the UBAN motif (ubiquitin binding in ABIN and NEMO), which is described in this model. This region of NEMO has also been named CoZi (for coiled-coil 2 and leucine zipper). ABINs (A20-binding inhibitors of NF-kappaB) are sensors for ubiquitin that are involved in regulation of apoptosis; ABIN-1 is presumed to inhibit signalling via the NF-kappaB route. The UBAN motif is also found in optineurin, the product of a gene associated with glaucoma, which has been characterized as a negative regulator of NF-kappaB as well.	87
197362	cd09804	Dcp1	mRNA decapping enzyme 1 (Dcp1). mRNA decapping enzyme 1 (Dcp1), together with Dcp2, is part of the decapping complex which catalyzes the removal of the 5' cap structure of mRNA. This decapping reaction is an essential step in mRNA degradation, by exposing the 5' end for exonucleolytic digestion. Dcp1 binds to the N-terminal helical domain of catalytic subunit Dcp2 and enhances its function by promoting Dcp2's closed conformation which is catalytically more active.	121
187665	cd09805	type2_17beta_HSD-like_SDR_c	human 17beta-hydroxysteroid dehydrogenase type 2 (type 2 17beta-HSD)-like, classical (c) SDRs. 17beta-hydroxysteroid dehydrogenases are a group of isozymes that catalyze activation and inactivation of estrogen and androgens. This classical-SDR subgroup includes the human proteins: type 2 17beta-HSD, type 6 17beta-HSD, type 2 11beta-HSD, dehydrogenase/reductase SDR family member 9, short-chain dehydrogenase/reductase family 9C member 7, 3-hydroxybutyrate dehydrogenase type 1, and retinol dehydrogenase 5. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	281
187666	cd09806	type1_17beta-HSD-like_SDR_c	human estrogenic 17beta-hydroxysteroid dehydrogenase type 1 (type 1 17beta-HSD)-like, classical (c) SDRs. 17beta-hydroxysteroid dehydrogenases are a group of isozymes that catalyze activation and inactivation of estrogen and androgens. This classical SDR subgroup includes human type 1 17beta-HSD, human retinol dehydrogenase 8, zebrafish photoreceptor associated retinol dehydrogenase type 2, and a chicken ovary-specific 17beta-hydroxysteroid dehydrogenase. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	258
212495	cd09807	retinol-DH_like_SDR_c	retinol dehydrogenases (retinol-DHs), classical (c) SDRs. Classical SDR-like subgroup containing retinol-DHs and related proteins. Retinol is processed by a medium chain alcohol dehydrogenase followed by retinol-DHs. Proteins in this subfamily share the glycine-rich NAD-binding motif of the classical SDRs, have a partial match to the canonical active site tetrad, but lack the typical active site Ser. This subgroup includes the human proteins: retinol dehydrogenase -12, -13, and -14. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	274
187668	cd09808	DHRS-12_like_SDR_c-like	human dehydrogenase/reductase SDR family member (DHRS)-12/FLJ13639-like, classical (c)-like SDRs. Classical SDR-like subgroup containing human DHRS-12/FLJ13639, the 36K protein of zebrafish CNS myelin, and related proteins. DHRS-12/FLJ13639 is expressed in neurons and oligodendrocytes in the human cerebral cortex. Proteins in this subgroup share the glycine-rich NAD-binding motif of the classical SDRs, have a partial match to the canonical active site tetrad, but lack the typical active site Ser. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	255
187669	cd09809	human_WWOX_like_SDR_c-like	human WWOX (WW domain-containing oxidoreductase)-like, classical (c)-like SDRs. Classical-like SDR domain of human WWOX and related proteins. Proteins in this subfamily share the glycine-rich NAD-binding motif of the classical SDRs, have a partial match to the canonical active site tetrad, but lack the typical active site Ser. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	284
187670	cd09810	LPOR_like_SDR_c_like	light-dependent protochlorophyllide reductase (LPOR)-like, classical (c)-like SDRs. Classical SDR-like subgroup containing LPOR and related proteins. Protochlorophyllide (Pchlide) reductases act in chlorophyll biosynthesis. There are distinct enzymes that catalyze Pchlide reduction in light or dark conditions. Light-dependent reduction is via an NADP-dependent SDR, LPOR. Proteins in this subfamily share the glycine-rich NAD-binding motif of the classical SDRs, have a partial match to the canonical active site tetrad, but lack the typical active site Ser. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase (15-PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, 15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	311
187671	cd09811	3b-HSD_HSDB1_like_SDR_e	human 3beta-HSD (hydroxysteroid dehydrogenase) and HSD3B1(delta 5-delta 4-isomerase)-like, extended (e) SDRs. This extended-SDR subgroup includes human 3 beta-HSD/HSD3B1 and C(27) 3beta-HSD/ [3beta-hydroxy-delta(5)-C(27)-steroid oxidoreductase; HSD3B7], and related proteins. These proteins have the characteristic active site tetrad and NAD(P)-binding motif of extended SDRs. 3 beta-HSD catalyzes the oxidative conversion of delta 5-3 beta-hydroxysteroids to the delta 4-3-keto configuration; this activity is essential for the biosynthesis of all classes of hormonal steroids. C(27) 3beta-HSD is a membrane-bound enzyme of the endoplasmic reticulum; it catalyzes the isomerization and oxidation of 7alpha-hydroxylated sterol intermediates, an early step in bile acid biosynthesis. Mutations in the human gene encoding C(27) 3beta-HSD underlie a rare autosomal recessive form of neonatal cholestasis. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	354
187672	cd09812	3b-HSD_like_1_SDR_e	3beta-hydroxysteroid dehydrogenase (3b-HSD)-like, subgroup1, extended (e) SDRs. An uncharacterized subgroup of the 3b-HSD-like extended-SDR family. Proteins in this subgroup have the characteristic active site tetrad and NAD(P)-binding motif of extended-SDRs. 3 beta-HSD catalyzes the oxidative conversion of delta 5-3 beta-hydroxysteroids to the delta 4-3-keto configuration; this activity is essential for the biosynthesis of all classes of hormonal steroids. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	339
187673	cd09813	3b-HSD-NSDHL-like_SDR_e	human NSDHL (NAD(P)H steroid dehydrogenase-like protein)-like, extended (e) SDRs. This subgroup includes human NSDHL and related proteins. These proteins have the characteristic active site tetrad of extended SDRs, and also have a close match to their NAD(P)-binding motif. Human NSDHL is a 3beta-hydroxysteroid dehydrogenase (3 beta-HSD) which functions in the cholesterol biosynthetic pathway. 3 beta-HSD catalyzes the oxidative conversion of delta 5-3 beta-hydroxysteroids to the delta 4-3-keto configuration; this activity is essential for the biosynthesis of all classes of hormonal steroids. Mutations in the gene encoding NSDHL cause CHILD syndrome (congenital hemidysplasia with ichthyosiform nevus and limb defects), an X-linked dominant, male-lethal trait. This subgroup also includes an unusual bifunctional 3beta-hydroxysteroid dehydrogenase (3b-HSD)/C-4 decarboxylase from Arabidopsis thaliana, and Saccharomyces cerevisiae ERG26, a 3b-HSD/C-4 decarboxylase, involved in the synthesis of ergosterol, the major sterol of yeast. Extended SDRs are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central beta-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn, contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.	335
381167	cd09815	TP_methylase	S-AdoMet-dependent tetrapyrrole methylases. This superfamily uses S-AdoMet (S-adenosyl-L-methionine or SAM) in the methylation of diverse substrates. Most members catalyze various methylation steps in cobalamin (vitamin B12) biosynthesis. There are two distinct cobalamin biosynthetic pathways in bacteria. The aerobic pathway requires oxygen, and cobalt is inserted late in the pathway; the anaerobic pathway does not require oxygen, and cobalt insertion is the first committed step towards cobalamin synthesis. The enzymes involved in the aerobic pathway are prefixed Cob and those of the anaerobic pathway Cbi. Most of the enzymes are shared by both pathways and a few enzymes are pathway-specific. Diphthine synthase and ribosomal RNA small subunit methyltransferase I (RsmI) are two superfamily members that are not involved in cobalamin biosynthesis. Diphthine synthase participates in the posttranslational modification of a specific histidine residue in elongation factor 2 (EF-2) of eukaryotes and archaea to diphthamide. RsmI catalyzes the 2-O-methylation of the ribose of cytidine 1402 (C1402) in 16S rRNA. Other superfamily members not involved in cobalamin biosynthesis include the N-terminal tetrapyrrole methylase domain of Bacillus subtilis YabN whose specific function is unknown, and Omphalotus olearius omphalotin methyltransferase which catalyzes the automethylation of its own C-terminus; this C terminus is subsequently released and macrocyclized to give Omphalotin A, a potent nematicide.	219
188648	cd09816	prostaglandin_endoperoxide_synthase	Animal prostaglandin endoperoxide synthase and related bacterial proteins. Animal prostaglandin endoperoxide synthases, including prostaglandin H2 synthase and a set of similar bacterial proteins which may function as cyclooxygenases. Prostaglandin H2 synthase catalyzes the synthesis of prostaglandin H2 from arachidonic acid. In two reaction steps, arachidonic acid is converted to Prostaglandin G2, a peroxide (cyclooxygenase activity) and subsequently converted to the end product via the enzyme's peroxidase activity. Prostaglandin H2 synthase is the target of aspirin and other non-steroid anti-inflammatory drugs such as ibuprofen, which block the substrate's access to the active site and may acetylate a conserved serine residue. In humans and other mammals, prostaglandin H2 synthase (PGHS), also called cyclooxygenase (COX) is present as at least two isozymes, PGHS-1 (or COX-1) and PGHS-2 (or COX-2), respectively. PGHS-1 is expressed constitutively in most mammalian cells, while the expression of PGHS-2 is induced via inflammation response in endothelial cells, activated macrophages, and others. COX-3 is a splice variant of COX-1.	490
188649	cd09817	linoleate_diol_synthase_like	Linoleate (8R)-dioxygenase and related enzymes. These fungal enzymes, related to animal heme peroxidases, catalyze the oxygenation of linoleate and similar targets. Linoleate (8R)-dioxygenase, also called linoleate:oxygen 7S,8S-oxidoreductase, generates (9Z,12Z)-(7S,8S)-dihydroxyoctadeca-9,12-dienoate as a product. Other members are 5,8-linoleate dioxygenase (LDS, ppoA) and linoleate 10R-dioxygenase (ppoC), involved in the biosynthesis of oxylipins.	550
188650	cd09818	PIOX_like	Animal heme oxidases similar to plant pathogen-inducible oxygenases. This is a diverse family of oxygenases related to the animal heme peroxidases, with members from plants, animals, and bacteria. The plant pathogen-inducible oxygenases (PIOX) oxygenate fatty acids into 2R-hydroperoxides. They may be involved in the hypersensitive reaction, rapid and localized cell death induced by infection with pathogens, and the rapidly induced expression of PIOX may be caused by the oxidative burst that occurs in the process of cell death.	484
188651	cd09819	An_peroxidase_bacterial_1	Uncharacterized bacterial family of heme peroxidases. Animal heme peroxidases are a diverse family of enzymes which are not restricted to metazoans; members are also found in fungi, plants, and bacteria, like this family of uncharacterized proteins.	465
188652	cd09820	dual_peroxidase_like	Dual oxidase and related animal heme peroxidases. Animal heme peroxidases of the dual-oxidase like subfamily play vital roles in the innate mucosal immunity of gut epithelia. They provide reactive oxygen species which help control infection.	558
188653	cd09821	An_peroxidase_bacterial_2	Uncharacterized bacterial family of heme peroxidases. Animal heme peroxidases are a diverse family of enzymes which are not restricted to metazoans; members are also found in fungi, plants, and bacteria, like this family of uncharacterized proteins.	570
188654	cd09822	peroxinectin_like_bacterial	Uncharacterized family of heme peroxidases, mostly bacterial. Animal heme peroxidases are a diverse family of enzymes which are not restricted to animals. Members are also found in fungi and plants, and also in bacteria, like most members of this family of uncharacterized proteins.	420
188655	cd09823	peroxinectin_like	peroxinectin_like animal heme peroxidases. Peroxinectin is an arthropod protein that plays a role in invertebrate immunity mechanisms. Specifically, peroxinectins are secreted as cell-adhesive and opsonic peroxidases. The immunity mechanism appears to involve an interaction between peroxinectin and a transmembrane receptor of the integrin family. Human myeloperoxidase, which is included in this wider family, has also been reported to interact with integrins.	378
188656	cd09824	myeloperoxidase_like	Myeloperoxidases, eosinophil peroxidases, and lactoperoxidases. This well conserved family of animal heme peroxidases contains members with somewhat diverse functions. Myeloperoxidases are lysosomal proteins found in azurophilic granules of neutrophils and the lysosomes of monocytes. They are involved in the formation of microbicidal agents upon neutrophil activation (neutrophils undergoing respiratory bursts as a result of phagocytosis), by catalyzing the conversion of hydrogen peroxide to hypochlorous acid. As a heme protein, myeloperoxidase is responsible for the greenish tint of pus, which is rich in neutrophils. Eosinophil peroxidases are haloperoxidases as well, preferring bromide over chloride. Expressed by eosinophil granulocytes, they are involved in attacking multicellular parasites and play roles in various inflammatory diseases such as asthma. The haloperoxidase lactoperoxidase is secreted from mucosal glands and provides antibacterial activity by oxidizing a variety of substrates such as bromide or chloride in the presence of hydrogen peroxide.	411
188657	cd09825	thyroid_peroxidase	Thyroid peroxidase (TPO). TPO is a member of the animal heme peroxidase family, which is expressed in the thyroid and involved in the processing of iodine and iodine compounds. Specifically, TPO oxidizes iodide via hydrogen peroxide to form active iodine, which is then, for example, incorporated into the tyrosine residues of thyroglobulin to yield mono- and di-iodotyrosines.	565
188658	cd09826	peroxidasin_like	Animal heme peroxidase domain of peroxidasin and related proteins. Peroxidasin is a secreted heme peroxidase which is involved in hydrogen peroxide metabolism and peroxidative reactions in the cardiovascular system. The domain co-occurs with extracellular matrix domains and may play a role in the formation of the extracellular matrix.	440
193602	cd09827	PET_Prickle	The PET domain of Prickle. The PET domain of Prickle: Prickle contains an N-terminal PET domain and three C-terminal LIM domains. Prickle has been implicated in regulation of cell movement in the planar cell polarity (PCP) pathway which requires the conserved Frizzled/Dishevelled (Dsh); Prickle interacts with Dishevelled, thereby modulating the activity of Frizzled/Dishevelled and the PCP signaling. Two forms of Prickle have been identified, namely Prickle 1 and Prickle 2. These are differentially expressed; Prickle 1 is found in fetal heart and hematological malignancies, while Prickle 2 is expressed in fetal brain, adult cartilage, pancreatic islet, and some types of tumorous cells. The PET domain is a protein-protein interaction domain, usually found in conjunction with the LIM domain, which is also involved in protein-protein interactions. The PET containing proteins serve as adaptors or scaffolds to support the assembly of multimeric protein complexes.	97
193603	cd09828	PET_OEBT	The PET domain of overexpressed breast tumor protein (OEBT). The PET domain of overexpressed breast tumor protein (OEBT): OEBT contains an N-terminal PET domain and two C-terminal LIM domains, and is predicted to be localized in the nucleus. The expression pattern of OEBT in malignant tissues indicates a possible role of OEBT in cancer differentiation. The PET domain is a protein-protein interaction domain and is usually found in conjunction with LIM domain, which is also involved in protein-protein interactions. PET containing proteins serve as adaptors or scaffolds to support the assembly of multimeric protein complexes.	116
193604	cd09829	PET_testin	The PET domain of Testin. The PET domain of Testin: Testin contains a PET domain at the N-terminus and three C-terminal LIM domains. Testin is a cytoskeleton associated focal adhesion protein that localizes along actin stress fibers, at cell-cell contact areas, and at focal adhesion plaques. Testin interacts with a variety of cytoskeletal proteins, including zyxin, mena, VASP, talin, and actin and is involved in cell motility and adhesion events. Knockout mice experiments reveal a tumor repressor function of Testin. The PET domain is a protein-protein interaction domain and is usually found in conjunction with LIM domain, which is also involved in protein-protein interactions. The PET containing proteins serve as adaptors or scaffolds to support the assembly of multimeric protein complexes.	88
193605	cd09830	PET_LIMPETin_LIM-9	The PET domain of protein LIMPETin and LIM-9. The PET domain of protein LIMPETin and LIM-9: Members of this family contain an N-terminal PET domain and five to six LIM domains at the C-terminus. Four of the six LIM domains are highly homologous to the four-and-half LIM (FHL) domain family while the other two show sequence similarity to LIM domains of the Testin family. Thus, proteins of this family may be the recombinant product of genes coding Testin and FHL proteins. In Schistosoma mansoni, where LIMPETin was first identified, LIMPETin (SmLIMPETin) is down regulated in sexually mature adult Schistosoma females compared to sexually immature adult females and adult males. Its differential expression indicates that it is a transcription regulator. In C. elegans, LIM-9 binds to UNC-97 and UNC-96, components of sarcomeric muscle M-lines. LIM-9 also forms a complex with SCPL-1 and UNC-89, whose function is to organize sarcomeric A-bands, especially the M-line of muscle. Thus, it might play a role in regulating the assembly and maintenance of muscle A-band. The PET domain is a protein-protein interaction domain and is usually found in conjunction with LIM domain, which is also involved in protein-protein interactions. The PET containing proteins serve as adaptors or scaffolds to support the assembly of multimeric protein complexes.	83
341402	cd09831	CBS_pair_ABC_Gly_Pro_assoc	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains found associated with the glycine betaine/L-proline ABC transporter. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains found in association with the ABC transporter OpuCA. OpuCA is the ATP binding component of a bacterial solute transporter that serves a protective role to cells growing in a hyperosmolar environment but the function of the CBS domains in OpuCA remains unknown.  In the related ABC transporter, OpuA, the tandem CBS domains have been shown to function as sensors for ionic strength, whereby they control the transport activity through an electronic switching mechanism. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. They are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	116
341403	cd09833	CBS_pair_GGDEF_PAS_repeat1	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains found in diguanylate cyclase/phosphodiesterase proteins with PAS sensors, repeat 1. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains found in diguanylate cyclase/phosphodiesterase proteins with PAS sensors.  PAS domains have been found to bind ligands, and to act as sensors for light and oxygen in signal transduction. The GGDEF domain has been suggested to be homologous to the adenylyl cyclase catalytic domain and is thought to be involved in regulating cell surface adhesiveness in bacteria. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	116
341404	cd09834	CBS_pair_bac	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains present in bacteria. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	118
341405	cd09836	CBS_pair_arch	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	116
341406	cd09837	CBS_pair_chlorobiales	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains present in chlorobiales. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	111
341074	cd09839	M1_like_TAF2	TATA binding protein (TBP) associated factor 2. This family includes TATA binding protein (TBP) associated factor 2 (TAF2, TBP-associated factor TAFII150, transcription initiation factor TFIID subunit 2, RNA polymerase II TBP-associated factor subunit B), and has homology to the M1 gluzincin family. TAF2 is part of the TFIID multidomain subunit complex essential for transcription of most protein-encoded genes by RNA polymerase II. TAF2 is known to interact with the initiator element (Inr) found at the transcription start site of many genes, thus possibly playing a key role in promoter binding as well as start-site selection. Image analysis has shown TAF2 to form a complex with TAF1 and TBP, inferring its role in promoter recognition. Peptidases in the M1 family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. TAF2, however, lacks these active site residues.	531
188871	cd09840	LIM2_CRP2	The second LIM domain of Cysteine Rich Protein 2 (CRP2). The second LIM domain of Cysteine Rich Protein 2 (CRP2):  Cysteine-rich proteins (CRPs) are characterized by the presence of two LIM domains linked to short glycine-rich repeats (GRRs). The CRP family members include CRP1, CRP2, CRP3/MLP and TLP. CRP1, CRP2 and CRP3 share a conserved nuclear targeting signal (K/R-K/R-Y-G-P-K), which supports the fact that these proteins function not only in the cytoplasm but also in the nucleus. CRPs control regulatory pathways during cellular differentiation, and are involved in complex transcription circuits, and the organization as well as the arrangement of the myofibrillar/cytoskeletal network. CRP3, also called Muscle LIM Protein (MLP), is a striated muscle-specific factor that enhances myogenic differentiation. The second LIM domain of CRP3/MLP interacts with cytoskeletal protein beta-spectrin. CRP3/MLP also interacts with the basic helix-loop-helix myogenic transcription factors MyoD, myogenin, and MRF4 thereby increasing their affinity for specific DNA regulatory elements. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	54
188872	cd09841	LIM1_Prickle_3	The first LIM domain of Prickle 3. The first LIM domain of Prickle 3/LIM domain only 6 (LMO6): Prickle contains three C-terminal LIM domains and an N-terminal PET domain.  Prickles have been implicated in roles of regulating tissue polarity or planar cell polarity (PCP).  PCP establishment requires the conserved Frizzled/Dishevelled PCP pathway. Prickle interacts with Dishevelled, thereby modulating Frizzled/Dishevelled activity and PCP signaling. Four forms of prickles have been identified: prickle 1-4. The best characterized are prickle 1 and prickle 2, which are differentially expressed. While prickle 1 is expressed in fetal heart and hematological malignancies, prickle 2 is found in fetal brain, adult cartilage, pancreatic islet, and some types of tumorous cells. Mutations in prickle 1 have been linked to progressive myoclonus epilepsy. LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.	59
197300	cd09842	PLDc_vPLD1_1	Catalytic domain, repeat 1, of vertebrate phospholipase D1. Catalytic domain, repeat 1, of vertebrate phospholipase D1 (PLD1). PLDs play a pivotal role in transmembrane signaling and cellular regulation. They hydrolyze the terminal phosphodiester bond of phospholipids resulting in the formation of phosphatidic acid and alcohols. Phosphatidic acid is an essential compound involved in signal transduction. PLDs also catalyze the transphosphatidylation of phospholipids to acceptor alcohols, by which various phospholipids can be synthesized. Vertebrate PLD1 is a membrane associated phosphatidylinositol 4,5-bisphosphate (PIP2)-dependent enzyme that selectively hydrolyzes phosphatidylcholine (PC). Protein cofactors and calcium might be required for its activation. Most vertebrate PLDs have adjacent Phox (PX) and the Pleckstrin homology (PH) domains at their N-terminus, which have been shown to mediate membrane targeting of the protein and are closely linked to polyphosphoinositide signaling. Like other members of the PLD superfamily, the monomer of vertebrate PLDs consists of two catalytic domains, each of which contains one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue). Two HKD motifs from two domains form a single active site. These PLDs utilize a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group.	151
197301	cd09843	PLDc_vPLD2_1	Catalytic domain, repeat 1, of vertebrate phospholipase D2. Catalytic domain, repeat 1, of vertebrate phospholipase D2 (PLD2). PLDs play a pivotal role in transmembrane signaling and cellular regulation. They hydrolyze the terminal phosphodiester bond of phospholipids with the formation of phosphatidic acid and alcohols. Phosphatidic acid is an essential compound involved in signal transduction. They also catalyze a transphosphatidylation of phospholipids to acceptor alcohols, by which various phospholipids can be synthesized. Vertebrate PLD2 is a membrane associated phosphatidylinositol 4,5-bisphosphate (PIP2)-dependent enzyme that selectively hydrolyzes phosphatidylcholine (PC). Protein cofactors and calcium might be required for its activation. Most vertebrate PLDs have adjacent Phox (PX) and the Pleckstrin homology (PH) domains at their N-terminus, which have been shown to mediate membrane targeting of the protein and are closely linked to polyphosphoinositide signaling. Like other members of the PLD superfamily, the monomer of vertebrate PLDs consists of two catalytic domains, each of which contains one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue). Two HKD motifs from two domains form a single active site. These PLDs utilize a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group.	145
197302	cd09844	PLDc_vPLD1_2	Catalytic domain, repeat 2, of vertebrate phospholipase D1. Catalytic domain, repeat 2, of vertebrate phospholipase D1 (PLD1). PLDs play a pivotal role in transmembrane signaling and cellular regulation. They hydrolyze the terminal phosphodiester bond of phospholipids resulting in the formation of phosphatidic acid and alcohols. Phosphatidic acid is an essential compound involved in signal transduction. PLDs also catalyze the transphosphatidylation of phospholipids to acceptor alcohols, by which various phospholipids can be synthesized. Vertebrate PLD1 is a membrane associated phosphatidylinositol 4,5-bisphosphate (PIP2)-dependent enzyme that selectively hydrolyzes phosphatidylcholine (PC). Protein cofactors and calcium might be required for its activation. Most vertebrate PLDs have adjacent Phox (PX) and the Pleckstrin homology (PH) domains at their N-terminus, which have been shown to mediate membrane targeting of the protein and are closely linked to polyphosphoinositide signaling. Like other members of the PLD superfamily, the monomer of vertebrate PLDs consists of two catalytic domains, each of which contains one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue). Two HKD motifs from two domains form a single active site. These PLDs utilize a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group.	182
197303	cd09845	PLDc_vPLD2_2	Catalytic domain, repeat 2, of vertebrate phospholipase D2. Catalytic domain, repeat 2, of vertebrate phospholipase D2 (PLD2). PLDs play a pivotal role in transmembrane signaling and cellular regulation. They hydrolyze the terminal phosphodiester bond of phospholipids with the formation of phosphatidic acid and alcohols. Phosphatidic acid is an essential compound involved in signal transduction. They also catalyze a transphosphatidylation of phospholipids to acceptor alcohols, by which various phospholipids can be synthesized. Vertebrate PLD2 is a membrane associated phosphatidylinositol 4,5-bisphosphate (PIP2)-dependent enzyme that selectively hydrolyzes phosphatidylcholine (PC). Protein cofactors and calcium might be required for its activation. Most vertebrate PLDs have adjacent Phox (PX) and the Pleckstrin homology (PH) domains at their N-terminus, which have been shown to mediate membrane targeting of the protein and are closely linked to polyphosphoinositide signaling. Like other members of the PLD superfamily, the monomer of vertebrate PLDs consists of two catalytic domains, each of which contains one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue). Two HKD motifs from two domains form a single active site. These PLDs utilize a common two-step ping-pong catalytic mechanism involving an enzyme-substrate intermediate to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine residue from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group.	182
197363	cd09846	DUF1312	N-Utilization Substance G (NusG) N terminal (NGN) insert and Lin0431 are part of DUF1312. Domains of Unknown Function 1312 (DUF1312) are represented in at least 71 bacterial species with no functional annotation. Included in this family are N-Utilization Substance G (NusG) N terminal (NGN) insert and Lin0431, having similar structure and surface features that appear to be conserved across these domain families, suggesting similar function. NusG contains NGN at the N-terminus and Kyrpides Ouzounis and Woese (KOW) repeats at the C-terminus in bacteria and archaea, and this insert (often known as Domain II) is found in several bacteria. Lin0431 is similar to NGN-insert but does not contain the disulfide bridge.	81
349946	cd09848	M28_TfR	M28 Zn-peptidase Transferrin Receptor family. Peptidase M28 family; Transferrin Receptor (TfR) subfamily. TfRs are homodimeric type II transmembrane proteins containing three distinct domains: protease-like, apical or protease-associated (PA), and helical domains. The protease-like domain is a large extracellular portion (ectodomain). In TfR, it contains a binding site for the transferrin molecule and has 28% identity to membrane glutamate carboxypeptidase II (mGCP-II or PSMA). The PA domain is inserted between the first and second strands of the central beta sheet in the protease-like domain. TfR1 is widely expressed, and is a key player in the uptake of iron-loaded transferrin (Tf) into cells. The TfR1 homodimer binds two molecules of Tf and the complex is then internalized. TfR1 may also participate in cell growth and proliferation. TfR2 binds Tf but with a significantly lower affinity than TfR1. It is expressed chiefly in hepatocytes, hematopoietic cells, and duodenal crypt cells; its expression overlaps with that of hereditary hemochromatosis protein (HFE). TfR2 is involved in iron homeostasis; in humans, mutations in TfR2 are associated with a form of hemochromatosis (HFE3). While related in sequence to peptidase M28 glutamate carboxypeptidase II (also called prostate-specific membrane antigen or PSMA), TfR lacks the metal ion coordination centers and protease activity of that group.	285
349947	cd09849	M20_Acy1L2-like	M20 Peptidase aminoacylase 1-like protein 2, amidohydrolase family. Peptidase M20 family, aminoacylase 1-like protein 2 (ACY1L2; amidohydrolase)-like subfamily. This group contains many uncharacterized proteins predicted as amidohydrolases, including gene products of abgA and abgB that catalyze the cleavage of p-aminobenzoyl-glutamate, a folate catabolite in Escherichia coli, to p-aminobenzoate and glutamate. p-Aminobenzoyl-glutamate utilization is catalyzed by the abg region gene product, AbgT. Aminoacylase 1 (ACY1) proteins are a class of zinc binding homodimeric enzymes involved in hydrolysis of N-acetylated proteins. N-terminal acetylation of proteins is a widespread and highly conserved process that is involved in the protection and stability of proteins. Several types of aminoacylases can be distinguished on the basis of substrate specificity. ACY1 breaks down cytosolic aliphatic N-acyl-alpha-amino acids (except L-aspartate), especially N-acetyl-methionine and acetyl-glutamate into L-amino acids and an acyl group. However, ACY1 can also catalyze the reverse reaction, the synthesis of acetylated amino acids. ACY1 may also play a role in xenobiotic bioactivation as well as the inter-organ processing of amino acid-conjugated xenobiotic derivatives (S-substituted-N-acetyl-L-cysteine).	389
197367	cd09850	Ebola-like_HR1-HR2	heptad repeat 1-heptad repeat 2 region of the transmembrane subunit of Filoviridae viruses, Ebola virus and Marburg virus, and related domains. This domain subfamily spans both heptad repeats of the glycoprotein (gp)/transmembrane subunit of various endogenous retroviruses (ERVs) and infectious retroviruses, including Ebola virus gp2, Marburg virus gp, and the envelope proteins of various ERVs, including human HERV-R_c7q21.2 (ERV-3). This domain includes an N-terminal heptad repeat, a CKS17-like immunosuppressive region, a CX6C motif that forms an intrasubunit disulfide bond, and a C-terminal heptad repeat. N-terminal to HR1-HR2 region is a fusion peptide (FP), and C-terminal, is a membrane-spanning region (MSR). Viral infection involves the formation of a trimer-of-hairpins structure (three HR1s helices, buttressed by three HR2 helices lying in antiparallel orientation). In this structure, the FP (inserted in the host cell membrane) and MSR (inserted in the viral membrane) are in close proximity. ERVs are likely to originate from ancient germ-line infections by active retroviruses. Some ERVs play specific roles in the host. However, it is unclear whether ERV-3 has a critical biological role: it is expressed in the placenta, but is not fusogenic, has an immunosuppressive domain, but lacks a fusion peptide. Filoviridae, the family of viruses including Ebola and Marburg, may have acquired this domain via horizontal transfer from retroviruses.	77
197368	cd09851	HTLV-1-like_HR1-HR2	heptad repeat 1-heptad repeat 2 region (ectodomain) of the transmembrane subunit of human T-cell leukemia virus type 1 (HTLV-1), and related domains. This domain subfamily spans both heptad repeats of the glycoprotein (gp)/transmembrane(TM) subunit of various endogenous retroviruses (ERVs) and infectious retroviruses, including HTLV-1, HTLV -2, primate Mason-Pfizer monkey virus, Moloney murine leukemia virus, simian T-cell lymphotropic virus, feline leukemia virus (FeLV), bovine leukemia virus, and various human endogenous retroviruses (HERVs), including, HERV-H1_c2q24.3, HERV-H2_3q26, HERV-F(c)1_cXq21.33, HERV-T_19q13.11, Syncytin-1 (HERV-W_c7q21.2/ ERVWE1), Syncytin-2 (HERV-FRD_6p24.1), and related domains. This domain includes an N-terminal heptad repeat, a CKS17-like immunosuppressive region, a CX6C motif that forms an intrasubunit disulfide bond, and a C-terminal heptad repeat. N-terminal to HR1-HR2 region is a fusion peptide (FP), and C-terminal, is a membrane-spanning region (MSR). Viral infection involves the formation of a trimer-of-hairpins structure (three HR1s helices, buttressed by three HR2 helices lying in antiparallel orientation). In this structure, the FP (inserted in the host cell membrane) and MSR (inserted in the viral membrane) are in close proximity. ERVs are likely to originate from ancient germ-line infections by active retroviruses. Some modern ERVs, those that integrated into the host genome post-speciation, have a currently active exogenous counterpart, such as FeLV. Some ERVs play specific roles in the host, including placental development, protection of the host from infection by related pathogenic and exogenous retroviruses, and genome plasticity. Syncytin-1 and Syncytin-2 are expressed in the placenta, and are fusogenic, although they have a different cell specificity for fusion. 
Syncytin-2, but not Syncytin-1, is immunosuppressive; its immunosuppressive domain may protect the fetus from the mother's immune system. Syncytin-1 may participate in the formation of the placental trophoblast; it is also implicated in cell fusions between cancer and host cells and between cancer cells, and in human osteoclast fusion. This subfamily also contains a mouse envelope protein encoded by the Fv-4 env gene, that blocks infection by exogenous MuLV.	78
350203	cd09852	PIN_SF	PIN (PilT N terminus) domain: Superfamily. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily, and were originally named for their sequence similarity to the N-terminal domain of an annotated pili biogenesis protein, PilT, a domain fusion between a PIN-domain and a PilT ATPase domain. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. The PIN domain superfamily includes: the FEN-like PIN domain family such as the PIN domains of Flap endonuclease-1 (FEN1), exonuclease-1 (EXO1), Mkt1, Gap Endonuclease 1 (GEN1), and Xeroderma pigmentosum complementation group G (XPG) nuclease, 5'-3' exonucleases of DNA polymerase I and bacteriophage T4- and T5-5' nucleases; the VapC-like PIN domain family which includes toxins of prokaryotic toxin/antitoxin operons FitAB and VapBC, as well as eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1; the LabA-like PIN domain family which includes the PIN domains of Synechococcus elongatus LabA (low-amplitude and bright); the PRORP-Zc3h12a-like PIN domain family which includes the PIN domains of RNase P (PRORP), ribonuclease Zc3h12a; and Bacillus subtilis YacP/Rae1-like PIN domains. It also includes the Mut7-C PIN domain family, which is not represented here as it is a shortened version of the PIN fold and lacks a core strand and helix (H3 and S3). The Mut7-C PIN domain family includes the C-terminus of Caenorhabditis elegans exonuclease Mut-7.	114
350204	cd09853	PIN_FEN-like	FEN-like PIN domains of structure-specific 5' nucleases (or Flap endonuclease-1-like) involved in DNA replication, repair, and recombination. Structure-specific 5' nucleases are capable of both 5'-3' exonucleolytic activity and cleaving bifurcated or branched DNA, in an endonucleolytic, structure-specific manner. The family includes the PIN (PilT N terminus) domains of Flap endonuclease-1 (FEN1), exonuclease-1 (EXO1), Mkt1, Gap Endonuclease 1 (GEN1), and Xeroderma pigmentosum complementation group G (XPG) nuclease. Also included are the PIN domains of the 5'-3' exonucleases of DNA polymerase I and single domain protein homologs, as well as, the bacteriophage T4- and T5-5' nucleases, and other homologs. Canonical members of this FEN-like family possess a PIN domain with a two-helical structure insert (also known as the helical arch, helical clamp or I domain) of variable length (approximately 16 to 800 residues), and at the C-terminus of the PIN domain a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included in this model) and the helical arch/clamp region are involved in DNA binding. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	174
350205	cd09854	PIN_VapC-like	VapC-like PIN domains of VapC and Smg6 ribonucleases, ribosome assembly factor NOB1, rRNA-processing protein Fcf1, Archaeoglobus fulgidus AF0591 protein, and homologs. PIN (PilT N terminus) domains of such ribonucleases as the toxins of prokaryotic toxin/antitoxin operons FitAB and VapBC, as well as eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1, are included in this VapC-like family. Also included are the PIN domains of the Pyrobaculum aerophilum Pea0151 and Archaeoglobus fulgidus AF0591 proteins and other similar archaeal homologs. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	129
350206	cd09856	PIN_FEN1-like	FEN-like PIN domains of Flap endonuclease-1 (FEN1)-like, structure-specific, divalent-metal-ion dependent, 5' nucleases. PIN (PilT N terminus) domain of Flap endonuclease-1 (FEN1)-like nucleases: FEN1, Gap endonuclease 1 (GEN1) and Xeroderma pigmentosum complementation group G (XPG) nuclease. Nucleases in this subfamily are members of the structure-specific, 5' nuclease family (FEN-like) that catalyzes hydrolysis of DNA duplex-containing nucleic acid structures during DNA replication, repair, and recombination. Canonical members of the FEN-like family possess a PIN domain with a two-helical structure insert (also known as the helical arch, helical clamp or I domain) of variable length (approximately 16 to 800 residues), and at the C-terminus of the PIN domain a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included in this model) and the helical arch/clamp region are involved in DNA binding. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	235
350207	cd09857	PIN_EXO1	FEN-like PIN domains of Exonuclease-1, a structure-specific, divalent-metal-ion dependent, 5' nuclease and homologs. exonuclease-1 (EXO1) is involved in multiple, eukaryotic DNA metabolic pathways, including DNA replication processes (5' flap DNA endonuclease activity and double stranded DNA 5'-exonuclease activity), DNA repair processes (DNA mismatch repair (MMR) and post-replication repair (PRR)), recombination, and telomere integrity. EXO1 functions in the MMS2 error-free branch of the PRR pathway in the maintenance and repair of stalled replication forks. Studies also suggest that EXO1 plays both structural and catalytic roles during MMR-mediated mutation avoidance. These nucleases are members of the structure-specific, 5' nuclease family (FEN-like) that catalyzes hydrolysis of DNA duplex-containing nucleic acid structures during DNA replication, repair, and recombination. Canonical members of the FEN-like family possess a PIN domain with a two-helical structure insert (also known as the helical arch, helical clamp or I domain) of variable length (approximately 16 to 800 residues), and at the C-terminus of the PIN domain a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included in this model) and the helical arch/clamp region are involved in DNA binding. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. EXO1 nucleases also have C-terminal Mlh1- and Msh2-binding domains which allow interaction with MMR and PRR proteins, respectively.	202
350208	cd09858	PIN_MKT1	FEN-like PIN domains of Mkt1, a global regulator of mRNAs encoding mitochondrial proteins and eukaryotic homologs. The Mkt1 gene product interacts with the Poly(A)-binding protein associated factor, Pbp1, and is present at the 3' end of RNA transcripts during translation. The Mkt1-Pbp1 complex is involved in the post-transcriptional regulation of HO endonuclease expression. Mkt1 and eukaryotic homologs are atypical members of the structure-specific, 5' nuclease family (FEN-like). Canonical members of the FEN-like family possess a PIN domain with a two-helical structure insert (also known as the helical arch, helical clamp or I domain) of variable length (approximately 16 to 800 residues), and at the C-terminus of the PIN domain a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included in this model) and the helical arch/clamp region are involved in DNA binding. Although Mkt1 appears to possess both a PIN and H3TH domain, the Mkt1 PIN domain lacks several of the active site residues necessary to bind essential divalent metal ion cofactors (Mg2+/Mn2+) required for nuclease activity in this family. Also, Mkt1 lacks the glycine-rich loop in the H3TH domain which is proposed to facilitate duplex DNA binding.	206
350209	cd09859	PIN_53EXO	FEN-like PIN domains of PIN domain of the 5'-3' exonuclease of Thermus aquaticus DNA polymerase I (Taq) and homologs. The 5'-3' exonuclease (53EXO) PIN (PilT N terminus) domain of multi-domain DNA polymerase I and single domain protein homologs are included in this family. Taq contains a polymerase domain for synthesizing a new DNA strand and a 53EXO PIN domain for cleaving RNA primers or damaged DNA strands. Taq's 53EXO PIN domain recognizes and endonucleolytically cleaves a structure-specific DNA substrate that has a bifurcated downstream duplex and an upstream template-primer duplex that overlaps the downstream duplex by 1 bp. The 53EXO PIN domain cleaves the unpaired 5'-arm of the overlap flap DNA substrate. 5'-3' exonucleases are members of the structure-specific, 5' nuclease family (FEN-like) that catalyzes hydrolysis of DNA duplex-containing nucleic acid structures during DNA replication, repair, and recombination. Canonical members of the FEN-like family possess a PIN domain with a two-helical structure insert (also known as the helical arch, helical clamp or I domain) of variable length (approximately 16 to 800 residues), and at the C-terminus of the PIN domain a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included in this model) and the helical arch/clamp region are involved in DNA binding. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	160
350210	cd09860	PIN_T4-like	FEN-like PIN domains of bacteriophage T3, T4 RNase H, T5-5'nuclease, and homologs. PIN (PilT N terminus) domain of bacteriophage T5-5'nuclease (5'-3' exonuclease or T5FEN), bacteriophage T4 RNase H (T4FEN), bacteriophage T3 (T3 phage exodeoxyribonuclease) and other similar 5' nucleases are included in this family. T5-5'nuclease is a 5'-3'exodeoxyribonuclease that also exhibits endonucleolytic activity on flap structures (branched duplex DNA containing a free single-stranded 5'end). T4 RNase H, which removes the RNA primers that initiate lagging strand fragments, has 5'- 3'exonuclease activity on DNA/DNA and RNA/DNA duplexes and has endonuclease activity on flap or forked DNA structures. Bacteriophage T3 is believed to function in the removal of DNA-linked RNA primers and is essential for phage DNA replication and also necessary for host DNA degradation and phage genetic recombination. These nucleases are members of the structure-specific, 5' nuclease family (FEN-like) that catalyzes hydrolysis of DNA duplex-containing nucleic acid structures during DNA replication, repair, and recombination. Canonical members of the FEN-like family possess a PIN domain with a two-helical structure insert (also known as the helical arch, helical clamp or I domain) of variable length (approximately 16 to 800 residues), and at the C-terminus of the PIN domain a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included in this model) and the helical arch/clamp region are involved in DNA binding. The PIN domain belongs to a large nuclease superfamily. 
The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. In the T5-5'nuclease, structure-specific endonuclease activity requires binding of a single metal ion in the high-affinity, metal binding site 1, whereas exonuclease activity requires both, the high-affinity, metal binding site 1 and the low-affinity, metal binding site 2 to be occupied by a divalent cofactor. The T5-5'nuclease is reported to be able to bind several metal ions including, Mg2+, Mn2+, Zn2+ and Co2+, as co-factors.	158
350211	cd09862	PIN_Rrp44-like	VapC-like PIN domain of yeast exosome subunit Rrp44 endoribonuclease and other eukaryotic homologs. PIN (PilT N terminus) domain of the Saccharomyces cerevisiae exosome subunit Rrp44 (Ribosomal RNA-processing protein 44 or Protein Dis3 homolog) and other similar eukaryotic homologs are included in this family. The eukaryotic exosome is a conserved macromolecular complex responsible for many RNA-processing and RNA-degradation reactions. It is composed of nine core subunits that directly binds Rrp44. The Rrp44 nuclease is the catalytic subunit of the exosome and has endonuclease activity in the PIN domain and an exoribonuclease activity in its RNase II-like region. Rrp44 binding to the exosome is mediated mainly by the PIN domain and by subunits Rrp41-Rrp45, and binding predictions indicate that the PIN domain active site is positioned on the outer surface of the exosome. This subgroup belongs to the VapC (virulence-associated protein C)-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. PIN domains within this subgroup contain four of these residues which cluster at the C-terminal end of the beta-sheet and form a negatively charged pocket near the center of the molecule. 
Recombinant Rrp44 was shown to possess manganese-dependent endonuclease activity in vitro that was abolished by point mutations in these putative metal binding residues of its PIN domain. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	178
350212	cd09864	PIN_Fcf1-like	VapC-like PIN domain of rRNA-processing protein, Fcf1 (Utp24, YDR339C), and other eukaryotic homologs. Fcf1/Utp24 (FAF1-copurifying factor 1/U three-associated protein 24) is an essential protein involved in pre-rRNA processing and 40S ribosomal subunit assembly. Component of the small subunit (SSU) processome, Fcf1 is an essential nucleolar protein that is required for processing of the 18S pre-rRNA at sites A0-A2. The Fcf1 protein was reported to interact with Pmc1p (vacuolar Ca2+ ATPase) and Cor1p (core subunit of the ubiquinol-cytochrome c reductase complex). The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. Most members of the Fcf1 PIN domain subfamily have four of these conserved residues and the Fcf1-Utp23 homolog PIN domain subfamily has three. Point mutation studies of the conserved acidic residues in the putative active site of Saccharomyces cerevisiae Fcf1 determined they were essential for pre-rRNA processing at sites A1 and A2, whereas the presence of the Fcf1 protein itself is also required for cleavage at site A0.	131
350213	cd09865	PIN_ScUtp23p-like	VapC-like PIN domain of rRNA-processing protein, Utp23 (YOR004W), and other fungal homologs. Saccharomyces cerevisiae Utp23 (U three-associated protein 23), component of the small subunit (SSU) processome, is an essential protein involved in pre-rRNA processing and 40S ribosomal subunit assembly. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily, including S. cerevisiae Utp23, lack several of these key catalytic residues. Mutation of the remaining conserved putative active site residues seen in Utp23 did not interfere with rRNA maturation and cell viability. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	149
350214	cd09866	PIN_Fcf1-Utp23-H	VapC-like PIN domain of rRNA-processing protein Fcf1- and Utp23-like homologs found in eukaryotes except fungi; similar to human rRNA-processing protein UTP23. PIN domain homologs of Fcf1/Utp24 (FAF1-copurifying factor 1/U three-associated protein 24) and Utp23, essential proteins involved in pre-rRNA processing and 40S ribosomal subunit assembly, are included in this subfamily. It includes human UTP24 (hUTP24), which plays a crucial role in human rRNA processing and is essential for accurate endonucleolytic cleavage at the 5'-end of 18S rRNA. Fcf1 is a component of the small subunit (SSU) processome and an essential nucleolar protein required for processing of the 18S pre-rRNA at sites A0-A2. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The Fcf1-Utp23 homolog PIN domain subfamily has three of these conserved acidic residues rather than the four seen in the Fcf1 PIN domain subfamily.	130
350215	cd09867	PIN_FEN1	FEN-like PIN domains of Flap endonuclease-1, a structure-specific, divalent-metal-ion dependent, 5' nuclease and homologs. Flap endonuclease-1 (FEN1) is involved in multiple DNA metabolic pathways, including DNA replication processes (5' flap DNA endonuclease activity and double stranded DNA 5'-exonuclease activity) and DNA repair processes (long-patch base excision repair) in eukaryotes and archaea. Interaction between FEN1 and PCNA (Proliferating cell nuclear antigen) is an essential prerequisite to FEN1's DNA replication functionality and stimulates FEN1 nuclease activity by 10-50 fold. FEN1 belongs to the FEN1-EXO1-like subfamily of structure-specific, 5' nucleases (FEN-like family). Canonical members of the FEN-like family possess a PIN domain with a two-helical structure insert (also known as the helical arch, helical clamp or I domain) of variable length (approximately 16 to 800 residues), and at the C-terminus of the PIN domain a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included in this model) and the helical arch/clamp region are involved in DNA binding. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. FEN1 has a C-terminal extension containing residues forming the consensus PIP-box - Qxx(M/L/I)xxF(Y/F) which serves to anchor FEN1 to PCNA.	251
350216	cd09868	PIN_XPG_RAD2	FEN-like PIN domains of Xeroderma pigmentosum complementation group G (XPG) nuclease, a structure-specific, divalent-metal-ion dependent, 5' nuclease and homologs. The Xeroderma pigmentosum complementation group G (XPG) nuclease plays a central role in nucleotide excision repair (NER) in cleaving DNA bubble structures or loops. XPG is a member of the structure-specific, 5' nuclease family (FEN-like) that catalyzes hydrolysis of DNA duplex-containing nucleic acid structures during DNA replication, repair, and recombination. Canonical members of the FEN-like family possess a PIN domain with a two-helical structure insert (also known as the helical arch, helical clamp or I domain) of variable length (approximately 16 to 800 residues), and at the C-terminus of the PIN domain a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included in this model) and the helical arch/clamp region are involved in DNA binding. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	209
350217	cd09869	PIN_GEN1	FEN-like PIN domains of Gap Endonuclease 1, a structure-specific, divalent-metal-ion dependent, 5' nuclease and homologs. Gap Endonuclease 1 (GEN1) is a Holliday junction resolvase reported to symmetrically cleave Holliday junctions and allow religation without additional processing. GEN1 is a member of the structure-specific, 5' nuclease family (FEN-like) that catalyzes hydrolysis of DNA duplex-containing nucleic acid structures during DNA replication, repair, and recombination. Canonical members of the FEN-like family possess a PIN domain with a two-helical structure insert (also known as the helical arch, helical clamp or I domain) of variable length (approximately 16 to 800 residues), and at the C-terminus of the PIN domain a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included in this model) and the helical arch/clamp region are involved in DNA binding. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	227
350218	cd09870	PIN_YEN1	FEN-like PIN domains of Saccharomyces cerevisiae endonuclease 1 (YEN1), Chaetomium thermophilum junction-resolving enzyme GEN1, and fungal homologs. Fungal Endonuclease 1 (YEN1 and GEN1, GEN1 is known as YEN1 in Saccharomyces cerevisiae) is a four-way (Holliday) junction resolvase. Members of this subgroup belong to the structure-specific, 5' nuclease family (FEN-like) that catalyzes hydrolysis of DNA duplex-containing nucleic acid structures during DNA replication, repair, and recombination. Canonical members of the FEN-like family possess a PIN domain with a two-helical structure insert (also known as the helical arch, helical clamp or I domain) of variable length (approximately 16 to 800 residues), and at the C-terminus of the PIN domain a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included in this model) and the helical arch/clamp region are involved in DNA binding. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	229
350219	cd09871	PIN_MtVapC28-VapC30-like	VapC-like PIN domain of Mycobacterium tuberculosis VapC28 and 30 and related proteins. This subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC28 and VapC30 toxins. M. tuberculosis VapC28 and VapC30 both cleave tRNA25Ser-TGA and tRNA28Ser-CGA. It belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is a PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	128
350220	cd09872	PIN_Sll0205-like	VapC-like PIN domain of Sll0205 protein and homologs. Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of the Synechocystis sp. (strain PCC 6803) Sll0205 protein and other uncharacterized homologs are included in this subfamily. They are similar to the PIN domains of the Mycobacterium tuberculosis VapC and Neisseria gonorrhoeae FitB toxins of the prokaryotic toxin/antitoxin operons, VapBC and FitAB, respectively, which are believed to be involved in growth inhibition by regulating translation. These toxins are nearly always co-expressed with an antitoxin, a cognate protein inhibitor, forming an inert protein complex. Disassociation of the protein complex activates the ribonuclease activity of the toxin by an, as yet undefined mechanism. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	125
350221	cd09873	PIN_Pae0151-like	VapC-like PIN domain of the Pyrobaculum aerophilum Pae0151 and Pae2754 proteins and homologs. Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of the Pyrobaculum aerophilum proteins, Pae0151 and Pae2754, and homologs are included in this subfamily. They are similar to the PIN domains of the Mycobacterium tuberculosis VapC and Neisseria gonorrhoeae FitB toxins of the prokaryotic toxin/antitoxin operons, VapBC and FitAB, respectively, which are believed to be involved in growth inhibition by regulating translation. These toxins are nearly always co-expressed with an antitoxin, a cognate protein inhibitor, forming an inert protein complex. Disassociation of the protein complex activates the ribonuclease activity of the toxin by an, as yet undefined mechanism. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	128
350222	cd09874	PIN_MT3492-like	VapC-like PIN domain of the hypothetical protein MT3492 of Mycobacterium tuberculosis CDC1551 and other uncharacterized, annotated PilT protein domain proteins. Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis CDC1551, hypothetical protein MT3492, and similar bacterial and archaeal proteins are included in this subfamily. They are PIN domain homologs of the Mycobacterium tuberculosis VapC and Neisseria gonorrhoeae FitB toxins of the prokaryotic toxin/antitoxin operons, VapBC and FitAB, respectively, which are believed to be involved in growth inhibition by regulating translation. These toxins are nearly always co-expressed with an antitoxin, a cognate protein inhibitor, forming an inert protein complex. Disassociation of the protein complex activates the ribonuclease activity of the toxin by an, as yet undefined mechanism. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	134
350223	cd09875	PIN_VapC-FitB-like	VapC-like PIN domain of ribonucleases (toxins), VapC and FitB, of prokaryotic toxin/antitoxin operons, Pyrococcus horikoshii protein PH0500, and other similar bacterial and archaeal homologs. PIN (PilT N terminus) domain-containing proteins of prokaryotic toxin/antitoxin (TA) operons, such as, Mycobacterium tuberculosis VapC of the VapBC (virulence associated proteins) TA operon, and Neisseria gonorrhoeae FitB of the FitAB (fast intracellular trafficking) TA operon, as well as, the archaeal Pyrococcus horikoshii protein PH0500 are included in this family. Toxins of TA operons are believed to be involved in growth inhibition by regulating translation and are nearly always co-expressed with an antitoxin, a cognate protein inhibitor, forming an inert protein complex. Disassociation of the complex activates the ribonuclease activity of the toxin. In N. gonorrhoeae, FitA and FitB form a heterodimer: FitA is the DNA binding subunit and FitB contains a ribonuclease activity that is blocked by the presence of FitA. A tetramer of FitAB heterodimers binds DNA from the fitAB upstream promoter region with high affinity. This results in both sequestration of FitAB and repression of fitAB transcription. It is thought that FitAB release from the DNA and subsequent dissociation both slows N. gonorrhoeae replication and transcytosis by an as yet undefined mechanism. The toxin M. tuberculosis VapC is a structural homolog of N. gonorrhoeae FitB, but their antitoxin partners, VapB and FitA, respectively, differ structurally. The M. tuberculosis VapC-5 is proposed to be both an endoribonuclease and an exoribonuclease that can act on free RNA in a similar manner to the endo and exonuclease Flap endonuclease-1 (FEN1). VapC-like toxins are structural homologs of FEN1-like PIN domains, but lack the extensive arch/clamp region and the H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region, seen in FEN1-like PIN domains. PIN domains within this group typically contain three or four conserved acidic residues that cluster at the C-terminal end of the beta-sheet and form a negatively charged pocket near the center of the molecule. These putative active site residues are thought to bind Mg2+ and/or Mn2+ ions and be essential for single-stranded ribonuclease activity. VapC-like PIN domains are single domain proteins that form dimers and dimerization configures the active sites in a groove along the long-axis of the structure.	130
350224	cd09876	PIN_Nob1-like	VapC-like PIN domain of eukaryotic ribosome assembly factor Nob1 and archaeal UPF0129 protein Ta0041-like homologs. PIN (PilT N terminus) domain of the Saccharomyces cerevisiae ribosome assembly factor, Nob1 (Nin one binding) protein, the Thermoplasma acidophilum DSM 1728, UPF0129 protein Ta0041, and similar eukaryotic and archaeal homologs are included in this family. The Nob1 PIN domain binds the single-stranded cleavage site D at the 3-prime end of 18S rRNA. Recombinant Nob1 binds as a tetramer to pre-18S rRNA fragments containing cleavage site D and is believed to cleave at this site. This subgroup belongs to the VapC (virulence-associated protein C)-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_6.	112
350225	cd09877	PIN_YacL-like	VapC-like PIN domain of Thermus Thermophilus Hb8, uncharacterized Bacillus subtilis YacL, and other bacterial homologs. PIN (PilT N terminus) domain of the conserved membrane protein of unknown function of Thermus Thermophilus Hb8, Bacillus subtilis YacL and other similar homologs are included in this family. This subgroup belongs to the VapC (virulence-associated protein C)-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Proteins in this group have a C-terminal TRAM domain whose function is unknown but predicted to be a RNA-binding domain common to tRNA uracil methylation and adenine thiolation enzymes.	127
350226	cd09878	PIN_VapC_VirB11L-ATPase-like	VapC-like PIN domain of an uncharacterized AAA+, VirB11-like ATPase-, KH- and PIN-domain containing protein MJ1533 from Methanocaldococcus jannaschii DSM 2661, and other similar archaeal homologs. PIN (PilT N terminus) domain present N-terminal of AAA+, VirB11-like ATPases. Several members of this subfamily possess an AAA+, VirB11-like ATPase domain, flanked by PIN and KH nucleic acid-binding domains. VirB11-ATPase is a type IV secretory pathway component required for T-pilus biogenesis and virulence. This subgroup belongs to the VapC (virulence-associated protein C)-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. PIN domains within this subgroup contain four of these highly conserved residues which cluster at the C-terminal end of the beta-sheet and form a negatively charged pocket near the center of the molecule. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	125
350227	cd09879	PIN_VapC_AF0591-like	VapC-like PIN domain of Archaeoglobus fulgidus AF0591 protein and other similar archaeal homologs. PIN (PilT N terminus) domain of Archaeoglobus fulgidus AF0591 protein and other similar uncharacterized archaeal homologs are included in this family. This subgroup belongs to the VapC (virulence-associated protein C)-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. PIN domains within this subgroup contain four of these highly conserved putative metal-binding, active site residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains and included distant subgroups; this subgroup includes some sequences belonging to one of these, PIN_14.	118
350228	cd09880	PIN_Smg5-6-like	VapC-like PIN domain of nonsense-mediated decay (NMD) factors, Smg5 and Smg6, and related proteins. PIN (PilT N terminus) domain of nonsense-mediated decay (NMD) factors, Smg5 and Smg6, and homologs are included in this family. Smg5 and Smg6 are essential factors in NMD, a post-transcriptional regulatory pathway that recognizes and rapidly degrades mRNAs containing premature translation termination codons. In vivo, the Smg6 PIN domain elicits degradation of bound mRNAs, as well as, metal-ion dependent, degradation of single-stranded RNA, in vitro. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases (also known as Flap endonuclease-1-like), PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Point mutation studies of the conserved aspartate residues in the catalytic center of the Smg6 PIN domain revealed that Smg6 is the endonuclease involved in human NMD. However, Smg5 lacks several of these key catalytic residues and does not degrade single-stranded RNA, in vivo. Many of the bacterial homologs in this group have an N-terminal PIN domain and a C-terminal PhoH-like ATPase domain.	152
350229	cd09881	PIN_VapC4-5_FitB-like	VapC-like PIN domain of Mycobacterium tuberculosis VapC4 and VapC5, and Neisseria gonorrhoeae FitB and related proteins. This family includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC4 and VapC5 ribonuclease toxins of the VapBC toxin/antitoxin (TA) system, and Neisseria gonorrhoeae FitB toxin of the FitAB TA system. This family belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	127
350230	cd09882	PIN_MtVapC3-like_start	VapC-like PIN domain of Mycobacterium tuberculosis VapC3 toxin and related proteins. The VapC3-like nuclease subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of various Mycobacterium tuberculosis VapC toxins including VapC3, VapC11, VapC15, and VapC21. It belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is a PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	128
350231	cd09883	PIN_VapC_PhoHL-ATPase	VapC-like PIN domain of bacterial Smg6-like proteins with C-terminal PhoH-like ATPase domains. PIN (PilT N terminus) domain of Smg6-like bacterial proteins with C-terminal PhoH-like ATPase domains and other similar homologs are included in this family. Eukaryotic Smg5 and Smg6 nucleases are essential factors in nonsense-mediated mRNA decay (NMD), a post-transcriptional regulatory pathway that recognizes and rapidly degrades mRNAs containing premature translation termination codons. In vivo, the Smg6 PIN domain elicits degradation of bound mRNAs, as well as, metal ion dependent, degradation of single-stranded RNA, in vitro. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases (also known as Flap endonuclease-1-like), PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. PIN domains within this subgroup contain four highly conserved acidic residues (putative metal-binding, active site residues). Many of the bacterial homologs in this group have an N-terminal PIN domain and a C-terminal PhoH-like ATPase domain and are predicted to be ATPases which are induced by phosphate starvation.	146
350232	cd09884	PIN_Smg5-like	VapC-like PIN domain of human nonsense-mediated decay factor Smg5, and other similar eukaryotic homologs. Nonsense-mediated decay (NMD) factors, Smg5 and Smg6 are essential to the post-transcriptional regulatory pathway, NMD, which recognizes and rapidly degrades mRNAs containing premature translation termination codons. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases (also known as Flap endonuclease-1-like), PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Point mutation studies of the conserved aspartate residues in the catalytic center of the Smg6 PIN domain revealed that Smg6 is the endonuclease involved in human NMD. However, Smg5 lacks several of these key catalytic residues and does not degrade single-stranded RNA, in vivo.	160
350233	cd09885	PIN_Smg6-like	VapC-like PIN domain of human telomerase-binding protein EST1, Smg6, and other similar eukaryotic homologs. Nonsense-mediated decay (NMD) factors, Smg5 and Smg6 are essential to the post-transcriptional regulatory pathway, NMD, which recognizes and rapidly degrades mRNAs containing premature translation termination codons. In vivo, the Smg6 PIN (PilT N terminus) domain elicits degradation of bound mRNAs, as well as, metal ion dependent, degradation of single-stranded RNA, in vitro. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases (also known as Flap endonuclease-1-like), PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. PIN domains within this subgroup contain four highly conserved acidic residues (putative metal-binding, active site residues) which cluster at the C-terminal end of the beta-sheet and form a negatively charged pocket near the center of the molecule. Point mutation studies of the conserved aspartate residues in the catalytic center of the Smg6 PIN domain revealed that Smg6 is the endonuclease involved in human NMD. However, Smg5 lacks several of these key catalytic residues and does not degrade single-stranded RNA, in vivo. Eukaryotic Smg6 PIN domains are present at the C-terminal end of the telomerase activating proteins, EST1.	178
193575	cd09886	NGN_SP	N-Utilization Substance G (NusG) N-terminal domain in the NusG Specialized Paralog (SP). The N-Utilization Substance G (NusG) protein is involved in transcription elongation and termination. NusG is essential in Escherichia coli and is associated with RNA polymerase elongation and Rho-termination in bacteria. Paralogs of eubacterial NusG, NusG SP (Specialized Paralog of NusG), are more diverse and often found as the first ORF in operons encoding secreted proteins and LPS biosynthesis genes. NusG SP family members are operon-specific transcriptional antitermination factors. The NusG N-terminal (NGN) domain is quite similar in all NusG orthologs, but its C-terminal domains and the linker that separate these two domains are different. The domain organization of NusG and its orthologs suggest that the common properties of NusG and its orthologs and paralogs are due to their similar NGN domains.	97
193576	cd09887	NGN_Arch	Archaeal N-Utilization Substance G (NusG) N-terminal (NGN) domain. The N-Utilization Substance G (NusG) protein and its eukaryotic homolog, Spt5, are involved in transcription elongation and termination. Archaea have a eukaryotic-type transcription apparatus but use bacterial-type transcription factors. NusG is one of the few archaeal transcription factors that has orthologs in both bacteria and eukaryotes. Archaeal NusG is similar to bacterial NusG, composed of an NGN domain and a Kyrpides Ouzounis and Woese (KOW) repeat. The eukaryotic ortholog, Spt5, is a large protein composed of an acidic N-terminus, an NGN domain, and multiple KOW motifs at its C-terminus. NusG was originally discovered as an N-dependent antitermination enhancing activity in Escherichia coli and has a variety of functions, such as being involved in RNA polymerase elongation and Rho-termination in bacteria. Archaeal NusG forms a complex with DNA-directed RNA polymerase subunit E (rpoE) that is similar to the Spt5-Spt4 complex in eukaryotes.	82
193577	cd09888	NGN_Euk	Eukaryotic N-Utilization Substance G (NusG) N-terminal (NGN) domain, including plant KTF1 (KOW domain-containing Transcription Factor 1). The N-Utilization Substance G (NusG) protein and its eukaryotic homolog, Spt5, are involved in transcription elongation and termination. NusG contains an NGN domain at its N-terminus and Kyrpides Ouzounis and Woese (KOW) repeats at its C-terminus. Spt5 forms an Spt4-Spt5 complex that is an essential RNA polymerase II elongation factor. NusG was originally discovered as an N-dependent antitermination enhancing activity in Escherichia coli, and has a variety of functions such as its involvement in RNA polymerase elongation and Rho-termination in bacteria. Orthologs of the NusG gene exist in all bacteria, but their functions and requirements are different. Spt5-like is homologous to the Spt5 proteins present in all eukaryotes, which is unique as it encodes a protein with an additional long carboxy-terminal extension that contains WG/GW motifs. Spt5-like, or KTF1 (KOW domain-containing Transcription Factor 1), is a RNA-directed DNA methylation (RdDM) pathway effector in plants.	86
193578	cd09889	NGN_Bact_2	Bacterial N-Utilization Substance G (NusG) N-terminal (NGN) domain, subgroup 2. The N-Utilization Substance G (NusG) protein is involved in transcription elongation and termination. NusG is essential in Escherichia coli and associates with RNA polymerase elongation and Rho-termination. Paralogs of eubacterial NusG, NusG SP (Specialized Paralog of NusG), are more diverse and often found as the first ORF in operons encoding secreted proteins and LPS biosynthesis genes. NusG SP family members are operon-specific transcriptional antitermination factors. The NusG N-terminal domain (NGN) is quite similar in all NusG orthologs, but its C-terminal domain and the linker that separates these two domains are different. The domain organization of NusG and its orthologs suggests that the common properties of NusG and its orthologs and paralogs are due to their similar NGN domains.	100
193579	cd09890	NGN_plant	Plant N-Utilization Substance G (NusG) N-terminal (NGN) domain. The N-Utilization Substance G (NusG) protein and its eukaryotic homolog, Spt5, are involved in transcription elongation and termination. NusG contains an NGN domain at its N-terminus and Kyrpides Ouzounis and Woese (KOW) repeats at its C-terminus in bacteria and archaea. The eukaryotic ortholog, Spt5, is a large protein comprising an acidic N-terminus, an NGN domain, and multiple KOW motifs at its C-terminus. Spt5 forms an Spt4-Spt5 complex that is an essential RNA polymerase II elongation factor. Bacterially infected plants contain bacterial DNA, such as NGN sequences, that can be used to clone the DNA of uncultured organisms.	113
193580	cd09891	NGN_Bact_1	Bacterial N-Utilization Substance G (NusG) N-terminal (NGN) domain, subgroup 1. The N-Utilization Substance G (NusG) protein is involved in transcription elongation and termination in bacteria. NusG is essential in Escherichia coli and associates with RNA polymerase elongation and Rho-termination. Homologs of the NusG gene exist in all bacteria. The NusG N-terminal domain (NGN) is similar in all NusG homologs, but its C-terminal domain and the linker that separates these two domains are different. The domain organization of NusG suggests that the common properties of NusG and its homologs are due to their similar NGN domains.	107
193581	cd09892	NGN_SP_RfaH	N-Utilization Substance G (NusG) N-terminal domain in the NusG Specialized Paralog (SP), RfaH. RfaH is an operon-specific virulence regulator, thought to have arisen from an early duplication of N-Utilization Substance G (NusG). Paralogs of eubacterial NusG, NusG SP (Specialized Paralog of NusG), are more diverse and often found as the first ORF in operons encoding secreted proteins and LPS biosynthesis genes. NusG SP family members are operon-specific transcriptional antitermination factors. NusG is essential in Escherichia coli and is associated with RNA polymerase elongation and Rho-termination in bacteria. In contrast, RfaH is a non-essential protein that controls expression of operons containing an ops (operon polarity suppressor) element in their transcribed DNA. RfaH and NusG are different in their response to Rho-dependent terminators and regulatory targets. The NusG N-terminal (NGN) domain is quite similar in all NusG orthologs, but its C-terminal domains and the linker that separates these two domains are different. The domain organization of NusG and its homologs suggests that the common properties of NusG and RfaH are due to their similar NGN domains.	96
193582	cd09893	NGN_SP_TaA	N-Utilization Substance G (NusG) N-terminal domain in the NusG Specialized Paralog (SP), TaA. The N-Utilization Substance G (NusG) protein is involved in transcription elongation and termination. NusG is essential in Escherichia coli and is associated with RNA polymerase elongation and Rho-termination in bacteria. Paralogs of eubacterial NusG, NusG SP (Specialized Paralog of NusG), are more diverse and often found as the first ORF in operons encoding secreted proteins and LPS biosynthesis genes. NusG SP family members are operon-specific transcriptional antitermination factors. TaA is a NusG SP factor that is required for synthesis of a polyketide antibiotic TA in Myxococcus xanthus. Orthologs of the NusG gene exist in all bacteria, but their functions and requirements are different. The NusG N-terminal (NGN) domain is quite similar in all NusG orthologs, but its C-terminal domains and the linker that separates these two domains are different. The domain organization of NusG and its orthologs suggests that the common properties of NusG and its orthologs and paralogs are due to their similar NGN domains.	95
193583	cd09894	NGN_SP_AnfA1	N-Utilization Substance G (NusG) N-terminal domain in the NusG Specialized Paralog (SP), AnFA1. Regulation of the afp, antifeeding prophage, gene cluster is mediated by AnFA1, an RfaH-like transcriptional antiterminator. RfaH is an operon-specific virulence regulator, thought to have arisen from an early duplication of N-Utilization Substance G (NusG). NusG is essential in Escherichia coli and is associated with RNA polymerase elongation and Rho-termination in bacteria. Paralogs of eubacterial NusG, NusG SP (Specialized Paralog of NusG), are more diverse and often found as the first ORF in operons encoding secreted proteins and LPS biosynthesis genes. NusG SP family members are operon-specific transcriptional antitermination factors. Orthologs of the NusG gene exist in all bacteria, but their functions and requirements are different. The NusG N-terminal domain (NGN) is similar in all NusG orthologs, but its C-terminal domain and the linker that separates these two domains are different. The domain organization of NusG and its orthologs suggests that the common properties of NusG and its orthologs and paralogs are due to their similar NGN domains.	99
193584	cd09895	NGN_SP_UpxY	N-Utilization Substance G (NusG) N-terminal domain in the NusG Specialized Paralog (SP), UpxY. The N-Utilization Substance G (NusG) proteins are involved in transcription elongation and termination. NusG is essential in Escherichia coli and is associated with RNA polymerase elongation and Rho-termination. Paralogs of eubacterial NusG, NusG SP (Specialized Paralog of NusG), are more diverse and often found as the first ORF in operons encoding secreted proteins and LPS (lipopolysaccharide) biosynthesis genes. NusG SP family members are operon-specific transcriptional antitermination factors. UpxY proteins, where the x is replaced by the letter designation of the specific polysaccharide (UpaY to UphY), are a family of NusG SP factors that act specifically in transcriptional antitermination of operons from which they are encoded. UpxYs are necessary and specific for transcription regulation of the polysaccharide biosynthesis operon. Orthologs of the NusG gene exist in all bacteria, but their functions and requirements are different. The NusG N-terminal (NGN) domain is similar in all NusG orthologs, but its C-terminal domain and the linker that separates these two domains are different. The domain organization of NusG and its orthologs suggests that the common properties of NusG and its orthologs and paralogs are due to their similar NGN domains.	95
188617	cd09897	H3TH_FEN1-XPG-like	H3TH domains of Flap endonuclease-1 (FEN1)-like structure specific 5' nucleases. The 5' nucleases within this family are capable of both 5'-3' exonucleolytic activity and cleaving bifurcated or branched DNA, in an endonucleolytic, structure-specific manner, and are involved in DNA replication, repair, and recombination. This family includes the H3TH (helix-3-turn-helix) domains of Flap Endonuclease-1 (FEN1), Exonuclease-1 (EXO1), Mkt1, Gap Endonuclease 1 (GEN1), Xeroderma pigmentosum complementation group G (XPG) nuclease, and other eukaryotic and archaeal homologs. These nucleases contain a PIN (PilT N terminus) domain with a helical arch/clamp region/I domain (not included here) and inserted within the PIN domain is an atypical helix-hairpin-helix-2 (HhH2)-like region. This atypical HhH2 region, the H3TH domain, has an extended loop with at least three turns between the first two helices, and only three of the four helices appear to be conserved. Both the H3TH domain and the helical arch/clamp region are involved in DNA binding. Studies suggest that a glycine-rich loop in the H3TH domain contacts the phosphate backbone of the template strand in the downstream DNA duplex. With the exception of the Mkt1-like proteins, the nucleases within this family have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (i.e., Mg2+, Mn2+, Zn2+, or Co2+) required for nuclease activity. The first metal binding site is composed entirely of Asp/Glu residues from the PIN domain, whereas the second metal binding site is composed generally of two Asp residues from the PIN domain and one Asp residue from the H3TH domain. Together with the helical arch and network of amino acids interacting with metal binding ions, the H3TH region defines a positively charged active-site DNA-binding groove in structure-specific 5' nucleases.	68
188618	cd09898	H3TH_53EXO	H3TH domain of the 5'-3' exonuclease of Taq DNA polymerase I and homologs. H3TH (helix-3-turn-helix) domains of the 5'-3' exonuclease (53EXO) of multi-domain DNA polymerase I and single domain protein homologs are included in this family. Taq DNA polymerase I contains a polymerase domain for synthesizing a new DNA strand and a 53EXO domain for cleaving RNA primers or damaged DNA strands. Taq's 53EXO recognizes and endonucleolytically cleaves a structure-specific DNA substrate that has a bifurcated downstream duplex and an upstream template-primer duplex that overlaps the downstream duplex by 1 bp. The 53EXO cleaves the unpaired 5'-arm of the overlap flap DNA substrate. 5'-3' exonucleases are members of the structure-specific, 5' nuclease family that catalyzes hydrolysis of DNA duplex-containing nucleic acid structures during DNA replication, repair, and recombination. These nucleases contain a PIN (PilT N terminus) domain with a helical arch/clamp region/I domain (not included here) and inserted within the PIN domain is an atypical helix-hairpin-helix-2 (HhH2)-like region. This atypical HhH2 region, the H3TH domain, has an extended loop with at least three turns between the first two helices, and only three of the four helices appear to be conserved. Both the H3TH domain and the helical arch/clamp region are involved in DNA binding. Studies suggest that a glycine-rich loop in the H3TH domain contacts the phosphate backbone of the template strand in the downstream DNA duplex. The nucleases within this family have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (i.e., Mg2+ or Mn2+ or Zn2+) required for nuclease activity. The first metal binding site is composed entirely of Asp/Glu residues from the PIN domain, whereas the second metal binding site is composed generally of two Asp residues from the PIN domain and two Asp residues from the H3TH domain. Together with the helical arch and network of amino acids interacting with metal binding ions, the H3TH region defines a positively charged active-site DNA-binding groove in structure-specific 5' nucleases.	73
188619	cd09899	H3TH_T4-like	H3TH domain of bacteriophage T3, T4 RNase H, T5-5' nucleases, and homologs. H3TH (helix-3-turn-helix) domains of bacteriophage T5-5'nuclease (5'-3' exonuclease or T5FEN), bacteriophage T4 RNase H (T4FEN), bacteriophage T3 (T3 phage exodeoxyribonuclease) and other similar 5' nucleases are included in this family. The T5-5'nuclease is a 5'-3' exodeoxyribonuclease that also exhibits endonucleolytic activity on flap structures (branched duplex DNA containing a free single-stranded 5'end). T4 RNase H, which removes the RNA primers that initiate lagging strand fragments, has 5'-3' exonuclease activity on DNA/DNA and RNA/DNA duplexes and has endonuclease activity on flap or forked DNA structures. Bacteriophage T3 is believed to function in the removal of DNA-linked RNA primers and is essential for phage DNA replication and also necessary for host DNA degradation and phage genetic recombination. These nucleases are members of the structure-specific, 5' nuclease family that catalyzes hydrolysis of DNA duplex-containing nucleic acid structures during DNA replication, repair, and recombination. They contain a PIN (PilT N terminus) domain with a helical arch/clamp region/I domain (not included here) and inserted within the PIN domain is an atypical helix-hairpin-helix-2 (HhH2)-like region. This atypical HhH2 region, the H3TH domain, has an extended loop with at least three turns between the first two helices, and only three of the four helices appear to be conserved. Both the H3TH domain and the helical arch/clamp region are involved in DNA binding. Studies suggest that a glycine-rich loop in the H3TH domain contacts the phosphate backbone of the template strand in the downstream DNA duplex. The nucleases within this family have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors required for nuclease activity. The first metal binding site (MBS-1) is composed entirely of Asp/Glu residues from the PIN domain, whereas the second metal binding site (MBS-2) is composed generally of two Asp residues from the PIN domain and two Asp residues from the H3TH domain. In the T5-5'nuclease, structure-specific endonuclease activity requires binding of a single metal ion in the high-affinity site, MBS-1, whereas exonuclease activity requires both the high-affinity MBS-1 and the low-affinity MBS-2 to be occupied by a divalent cofactor. The T5-5'nuclease is reported to be able to bind several metal ions, including Mg2+, Mn2+, Zn2+, and Co2+, as cofactors. Together with the helical arch and network of amino acids interacting with metal binding ions, the H3TH region defines a positively charged active-site DNA-binding groove in structure-specific 5' nucleases.	74
188620	cd09900	H3TH_XPG-like	H3TH domains of Flap endonuclease-1 (FEN1)-like structure specific 5' nucleases: FEN1 (archaeal), GEN1, YEN1, and XPG. The 5' nucleases within this family are capable of both 5'-3' exonucleolytic activity and cleaving bifurcated or branched DNA, in an endonucleolytic, structure-specific manner, and are involved in DNA replication, repair, and recombination. This family includes the H3TH (helix-3-turn-helix) domains of archaeal Flap Endonuclease-1 (FEN1), Gap Endonuclease 1 (GEN1), Yeast Endonuclease 1 (YEN1), Xeroderma pigmentosum complementation group G (XPG) nuclease, and other eukaryotic and archaeal homologs. These nucleases contain a PIN (PilT N terminus) domain with a helical arch/clamp region/I domain (not included here) and inserted within the PIN domain is an atypical helix-hairpin-helix-2 (HhH2)-like region. This atypical HhH2 region, the H3TH domain, has an extended loop with at least three turns between the first two helices, and only three of the four helices appear to be conserved. Both the H3TH domain and the helical arch/clamp region are involved in DNA binding. Studies suggest that a glycine-rich loop in the H3TH domain contacts the phosphate backbone of the template strand in the downstream DNA duplex. With the exception of the Mkt1-like proteins, the nucleases within this family have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (i.e., Mg2+, Mn2+, Zn2+, or Co2+) required for nuclease activity. The first metal binding site is composed entirely of Asp/Glu residues from the PIN domain, whereas the second metal binding site is composed generally of two Asp residues from the PIN domain and one Asp residue from the H3TH domain. Together with the helical arch and network of amino acids interacting with metal binding ions, the H3TH region defines a positively charged active-site DNA-binding groove in structure-specific 5' nucleases.	52
188621	cd09901	H3TH_FEN1-like	H3TH domains of Flap endonuclease-1 (FEN1)-like structure specific 5' nucleases: FEN1 (eukaryotic) and EXO1. The 5' nucleases within this family are capable of both 5'-3' exonucleolytic activity and cleaving bifurcated or branched DNA, in an endonucleolytic, structure-specific manner, and are involved in DNA replication, repair, and recombination. This family includes the H3TH (helix-3-turn-helix) domains of eukaryotic Flap Endonuclease-1 (FEN1), Exonuclease-1 (EXO1), and other eukaryotic homologs. These nucleases contain a PIN (PilT N terminus) domain with a helical arch/clamp region/I domain (not included here) and inserted within the PIN domain is an atypical helix-hairpin-helix-2 (HhH2)-like region. This atypical HhH2 region, the H3TH domain, has an extended loop with at least three turns between the first two helices, and only three of the four helices appear to be conserved. Both the H3TH domain and the helical arch/clamp region are involved in DNA binding. Studies suggest that a glycine-rich loop in the H3TH domain contacts the phosphate backbone of the template strand in the downstream DNA duplex. The nucleases within this family have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (i. e., Mg2+, Mn2+, Zn2+, or Co2+) required for nuclease activity. The first metal binding site is composed entirely of Asp/Glu residues from the PIN domain, whereas, the second metal binding site is composed generally of two Asp residues from the PIN domain and one Asp residue from the H3TH domain. Together with the helical arch and network of amino acids interacting with metal binding ions, the H3TH region defines a positively charged active-site DNA-binding groove in structure-specific 5' nucleases.	73
188622	cd09902	H3TH_MKT1	H3TH domain of Mkt1: A global regulator of mRNAs encoding mitochondrial proteins and eukaryotic homologs. The Mkt1 gene product interacts with the Poly(A)-binding protein associated factor, Pbp1, and is present at the 3' end of RNA transcripts during translation. The Mkt1-Pbp1 complex is involved in the post-transcriptional regulation of HO endonuclease expression. Mkt1 and eukaryotic homologs are atypical members of the structure-specific, 5' nuclease family. Canonical members of this family possess a PIN (PilT N terminus) domain with a helical arch/clamp region/I domain (not included here) and inserted within the PIN domain is an atypical helix-hairpin-helix-2 (HhH2)-like region. This atypical HhH2 region, the H3TH (helix-3-turn-helix) domain, has an extended loop with at least three turns between the first two helices, and only three of the four helices appear to be conserved. Although Mkt1 appears to possess both a PIN and H3TH domain, the Mkt1 PIN domain lacks several of the active site residues necessary to bind essential divalent metal ion cofactors (Mg2+/Mn2+) required for nuclease activity in this family. Also, Mkt1 lacks the glycine-rich loop in the H3TH domain which is proposed to facilitate duplex DNA binding.	81
188623	cd09903	H3TH_FEN1-Arc	H3TH domain of Flap Endonuclease-1, a structure-specific, divalent-metal-ion dependent, 5' nuclease: Archaeal homologs. Members of this subgroup include the H3TH (helix-3-turn-helix) domains of archaeal Flap endonuclease-1 (FEN1), 5' nucleases. FEN1 is involved in multiple DNA metabolic pathways, including DNA replication processes (5' flap DNA endonuclease activity and double stranded DNA 5'-exonuclease activity) and DNA repair processes (long-patch base excision repair) in eukaryotes and archaea. Interaction between FEN1 and PCNA (Proliferating cell nuclear antigen) is an essential prerequisite to FEN1's DNA replication functionality and stimulates FEN1 nuclease activity by 10-50 fold. These nucleases contain a PIN (PilT N terminus) domain with a helical arch/clamp region/I domain (not included here) and inserted within the PIN domain is an atypical helix-hairpin-helix-2 (HhH2)-like region. This atypical HhH2 region, the H3TH domain, has an extended loop with at least three turns between the first two helices, and only three of the four helices appear to be conserved. Both the H3TH domain and the helical arch/clamp region are involved in DNA binding. Studies suggest that a glycine-rich loop in the H3TH domain contacts the phosphate backbone of the template strand in the downstream DNA duplex. The nucleases within this subfamily have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (Mg2+ or Mn2+) required for nuclease activity. The first metal binding site is composed entirely of Asp/Glu residues from the PIN domain, whereas, the second metal binding site is composed generally of two Asp residues from the PIN domain and one Asp residue from the H3TH domain. Together with the helical arch and network of amino acids interacting with metal binding ions, the H3TH region defines a positively charged active-site DNA-binding groove in structure-specific 5' nucleases. Also, FEN1 has a C-terminal extension containing residues forming the consensus PIP-box - Qxx(M/L/I)xxF(Y/F) which serves to anchor FEN1 to PCNA.	65
188624	cd09904	H3TH_XPG	H3TH domain of Xeroderma pigmentosum complementation group G (XPG) nuclease, a structure-specific, divalent-metal-ion dependent, 5' nuclease. The Xeroderma pigmentosum complementation group G (XPG) nuclease plays a central role in nucleotide excision repair (NER) in cleaving DNA bubble structures or loops. XPG is a member of the structure-specific, 5' nuclease family that catalyzes hydrolysis of DNA duplex-containing nucleic acid structures during DNA replication, repair, and recombination.  Members of this subgroup include the H3TH (helix-3-turn-helix) domains of XPG and other similar eukaryotic 5' nucleases. These nucleases contain a PIN (PilT N terminus) domain with a helical arch/clamp region/I domain (not included here) and inserted within the PIN domain is an atypical helix-hairpin-helix-2 (HhH2)-like region. This atypical HhH2 region, the H3TH domain, has an extended loop with at least three turns between the first two helices, and only three of the four helices appear to be conserved. Both the H3TH domain and the helical arch/clamp region are involved in DNA binding.  Studies suggest that a glycine-rich loop in the H3TH domain contacts the phosphate backbone of the template strand in the downstream DNA duplex. These nucleases have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (Mg2+ or Mn2+) required for nuclease activity. The first metal binding site is composed entirely of Asp/Glu residues from the PIN domain, whereas, the second metal binding site is composed generally of two Asp residues from the PIN domain and one Asp residue from the H3TH domain. Together with the helical arch and network of amino acids interacting with metal binding ions, the H3TH region defines a positively charged active-site DNA-binding groove in structure-specific 5' nucleases.	97
188625	cd09905	H3TH_GEN1	H3TH domain of Gap Endonuclease 1, a structure-specific, divalent-metal-ion dependent, 5' nuclease. Gap Endonuclease 1 (GEN1): Holliday junction resolvase reported to symmetrically cleave Holliday junctions and allow religation without additional processing. GEN1 is a member of the structure-specific, 5' nuclease family that catalyzes hydrolysis of DNA duplex-containing nucleic acid structures during DNA replication, repair, and recombination. Members of this subgroup include the H3TH (helix-3-turn-helix) domains of GEN1 and other similar eukaryotic 5' nucleases. These nucleases contain a PIN (PilT N terminus) domain with a helical arch/clamp region/I domain (not included here) and inserted within the PIN domain is an atypical helix-hairpin-helix-2 (HhH2)-like region. This atypical HhH2 region, the H3TH domain, has an extended loop with at least three turns between the first two helices, and only three of the four helices appear to be conserved. Both the H3TH domain and the helical arch/clamp region are involved in DNA binding. Studies suggest that a glycine-rich loop in the H3TH domain contacts the phosphate backbone of the template strand in the downstream DNA duplex. These nucleases have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (Mg2+ or Mn2+) required for nuclease activity. The first metal binding site is composed entirely of Asp/Glu residues from the PIN domain, whereas, the second metal binding site is composed generally of two Asp residues from the PIN domain and one Asp residue from the H3TH domain. Together with the helical arch and network of amino acids interacting with metal binding ions, the H3TH region defines a positively charged active-site DNA-binding groove in structure-specific 5' nucleases.	108
188626	cd09906	H3TH_YEN1	H3TH domain of Yeast Endonuclease 1, a structure-specific, divalent-metal-ion dependent, 5' nuclease. Yeast Endonuclease 1 (YEN1): Holliday junction resolvase which promotes reciprocal exchange during mitotic recombination to maintain genome integrity in budding yeast. YEN1 is a member of the structure-specific, 5' nuclease family that catalyzes hydrolysis of DNA duplex-containing nucleic acid structures during DNA replication, repair, and recombination. Members of this subgroup include the H3TH (helix-3-turn-helix) domains of YEN1 and other similar fungal 5' nucleases. These nucleases contain a PIN (PilT N terminus) domain with a helical arch/clamp region/I domain (not included here) and inserted within the PIN domain is an atypical helix-hairpin-helix-2 (HhH2)-like region. This atypical HhH2 region, the H3TH domain, has an extended loop with at least three turns between the first two helices, and only three of the four helices appear to be conserved. Both the H3TH domain and the helical arch/clamp region are involved in DNA binding. Studies suggest that a glycine-rich loop in the H3TH domain contacts the phosphate backbone of the template strand in the downstream DNA duplex. These nucleases have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (Mg2+ or Mn2+) required for nuclease activity. The first metal binding site is composed entirely of Asp/Glu residues from the PIN domain, whereas, the second metal binding site is composed generally of two Asp residues from the PIN domain and one Asp residue from the H3TH domain. Together with the helical arch and network of amino acids interacting with metal binding ions, the H3TH region defines a positively charged active-site DNA-binding groove in structure-specific 5' nucleases.	105
188627	cd09907	H3TH_FEN1-Euk	H3TH domain of Flap Endonuclease-1, a structure-specific, divalent-metal-ion dependent, 5' nuclease: Eukaryotic homologs. Members of this subgroup include the H3TH (helix-3-turn-helix) domains of eukaryotic Flap endonuclease-1 (FEN1), 5' nucleases. FEN1 is involved in multiple DNA metabolic pathways, including DNA replication processes (5' flap DNA endonuclease activity and double stranded DNA 5'-exonuclease activity) and DNA repair processes (long-patch base excision repair) in eukaryotes and archaea. Interaction between FEN1 and PCNA (Proliferating cell nuclear antigen) is an essential prerequisite to FEN1's DNA replication functionality and stimulates FEN1 nuclease activity by 10-50 fold. These nucleases contain a PIN (PilT N terminus) domain with a helical arch/clamp region/I domain (not included here) and inserted within the PIN domain is an atypical helix-hairpin-helix-2 (HhH2)-like region. This atypical HhH2 region, the H3TH domain, has an extended loop with at least three turns between the first two helices, and only three of the four helices appear to be conserved. Both the H3TH domain and the helical arch/clamp region are involved in DNA binding. Studies suggest that a glycine-rich loop in the H3TH domain contacts the phosphate backbone of the template strand in the downstream DNA duplex. The nucleases within this subfamily have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (Mg2+ or Mn2+) required for nuclease activity. The first metal binding site is composed entirely of Asp/Glu residues from the PIN domain, whereas, the second metal binding site is composed generally of two Asp residues from the PIN domain and one Asp residue from the H3TH domain. Together with the helical arch and network of amino acids interacting with metal binding ions, the H3TH region defines a positively charged active-site DNA-binding groove in structure-specific 5' nucleases. Also, FEN1 has a C-terminal extension containing residues forming the consensus PIP-box - Qxx(M/L/I)xxF(Y/F) which serves to anchor FEN1 to PCNA.	70
188628	cd09908	H3TH_EXO1	H3TH domain of Exonuclease-1, a structure-specific, divalent-metal-ion dependent, 5' nuclease. Exonuclease-1 (EXO1) is involved in multiple, eukaryotic DNA metabolic pathways, including DNA replication processes (5' flap DNA endonuclease activity and double stranded DNA 5'-exonuclease activity), DNA repair processes (DNA mismatch repair (MMR) and post-replication repair (PRR)), recombination, and telomere integrity. EXO1 functions in the MMS2 error-free branch of the PRR pathway in the maintenance and repair of stalled replication forks. Studies also suggest that EXO1 plays both structural and catalytic roles during MMR-mediated mutation avoidance. Members of this subgroup include the H3TH (helix-3-turn-helix) domains of EXO1 and other similar eukaryotic 5' nucleases. These nucleases contain a PIN (PilT N terminus) domain with a helical arch/clamp region/I domain (not included here) and inserted within the PIN domain is an atypical helix-hairpin-helix-2 (HhH2)-like region. This atypical HhH2 region, the H3TH domain, has an extended loop with at least three turns between the first two helices, and only three of the four helices appear to be conserved. Both the H3TH domain and the helical arch/clamp region are involved in DNA binding. Studies suggest that a glycine-rich loop in the H3TH domain contacts the phosphate backbone of the template strand in the downstream DNA duplex. These nucleases have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (Mg2+ or Mn2+) required for nuclease activity. The first metal binding site is composed entirely of Asp/Glu residues from the PIN domain, whereas the second metal binding site is composed generally of two Asp residues from the PIN domain and one Asp residue from the H3TH domain. Together with the helical arch and network of amino acids interacting with metal binding ions, the H3TH region defines a positively charged active-site DNA-binding groove in structure-specific 5' nucleases. EXO1 nucleases also have C-terminal Mlh1- and Msh2-binding domains which allow interaction with MMR and PRR proteins, respectively.	73
197369	cd09909	HIV-1-like_HR1-HR2	heptad repeat 1-heptad repeat 2 region (ectodomain) of the gp41 subunit of human immunodeficiency virus (HIV-1), and related domains. This domain family spans both heptad repeats of the glycoprotein (gp)/transmembrane subunit of various endogenous retroviruses (ERVs) and infectious retroviruses, including human, simian, and feline immunodeficiency viruses (HIV, SIV, and FIV), bovine immunodeficiency-like virus (BIV), equine infectious anaemia virus (EIAV), and Jaagsiekte sheep retrovirus (JSRV), mouse mammary tumour virus (MMTV) and various ERVs including sheep enJSRV-26, and human ERVs (HERVs): HERV-K_c1q23.3 and HERV-K_c12q14.1. This domain belongs to a larger superfamily containing the HR1-HR2 domain of ERVs and infectious retroviruses, including Ebola virus, and Rous sarcoma virus. Proteins in this family lack the canonical CSK17-like immunosuppressive sequence, and the intrasubunit disulfide bond-forming CX6C motif found in linker region between HR1 and HR2 in the Ebola_RSV-like_HR1-HR2 family. N-terminal to the HR1-HR2 region is a fusion peptide (FP), and C-terminal is a membrane-spanning region (MSR). Viral infection involves the formation of a trimer-of-hairpins structure (three HR1 helices, buttressed by three HR2 helices lying in antiparallel orientation). In this structure, the FP (inserted in the host cell membrane) and MSR (inserted in the viral membrane) are in close proximity. ERVs are likely to originate from ancient germ-line infections by active retroviruses. Some modern ERVs, those that integrated into the host genome post-speciation, have a currently active exogenous counterpart, such as JSRV. Some ERVs play specific roles in the host, including placental development, protection of the host from infection by related pathogenic and exogenous retroviruses, and genome plasticity. Included in this subgroup are ERVs from domestic sheep that are related to JSRV, the agent of transmissible lung cancer in sheep, for example enJSRV-26 that retains an intact genome. These endogenous JSRVs protect the sheep against JSRV infection and are required for sheep placental development. HERV-K_c12q14.1 is potentially a complete envelope protein; however, it does not appear to be fusogenic.	128
197364	cd09910	NGN-insert_like	NGN-insert domain found between N-terminal domain (D1) and C-terminal KOW domain (DIII) repeats of some N-Utilization Substance G (NusG) N-terminal (NGN). This family contains a unique insert (domain II, DII) found between the highly conserved N-terminal domain (NGN, domain I, D1) and C-terminal Kyrpides Ouzounis and Woese domain (KOW, domain III, DIII) repeats of some N-Utilization Substance G (NusG) N-terminal (NGN) proteins in bacteria such as Aquifex aeolicus NusG (AaeNusG). NusG was originally discovered as having an N-dependent antitermination enhancing activity in Escherichia coli, and has since been shown to have a variety of functions such as being involved in RNA polymerase elongation and Rho-termination. Orthologs of the NusG gene exist in bacteria, but their functions and requirements are diverse. The function of DII is as yet unknown; it belongs to Domains of Unknown Function 1312 (DUF1312).	80
197365	cd09911	Lin0431_like	Listeria innocua Lin0431 is similar to the N-Utilization Substance G (NusG) N terminal (NGN) insert (DII). This family contains domains homologous to Listeria innocua Lin0431, a protein that is similar to the N-Utilization Substance G (NusG) N terminal (NGN) insert (domain II, DII). Lin0431 and Aquifex aeolicus NusG DII (AaeNusG DII) have similar structure and similar basic charged surface distributions that may bind negatively charged nucleic acids and/or another anionic binding partner, suggesting a possible role in transcription/translation regulating functions. Despite these two domains having low sequence similarity, the NusG DII and DUF1312 domain families may have diverged from common evolutionary ancestral proteins, and may have similar biochemical functions.	82
206739	cd09912	DLP_2	Dynamin-like protein including dynamins, mitofusins, and guanylate-binding proteins. The dynamin family of large mechanochemical GTPases includes the classical dynamins and dynamin-like proteins (DLPs) that are found throughout the Eukarya. This family also includes bacterial DLPs. These proteins catalyze membrane fission during clathrin-mediated endocytosis. Dynamin consists of five domains; an N-terminal G domain that binds and hydrolyzes GTP, a middle domain (MD) involved in self-assembly and oligomerization, a pleckstrin homology (PH) domain responsible for interactions with the plasma membrane, GED, which is also involved in self-assembly, and a proline arginine rich domain (PRD) that interacts with SH3 domains on accessory proteins. To date, three vertebrate dynamin genes have been identified; dynamin 1, which is brain specific, mediates uptake of synaptic vesicles in presynaptic terminals; dynamin-2 is expressed ubiquitously and similarly participates in membrane fission; mutations in the MD, PH and GED domains of dynamin 2 have been linked to human diseases such as Charcot-Marie-Tooth peripheral neuropathy and rare forms of centronuclear myopathy. Dynamin 3 participates in megakaryocyte progenitor amplification, and is also involved in cytoplasmic enlargement and the formation of the demarcation membrane system. This family also includes mitofusins (MFN1 and MFN2 in mammals) that are involved in mitochondrial fusion. Dynamin oligomerizes into helical structures around the neck of budding vesicles in a GTP hydrolysis-dependent manner.	180
206740	cd09913	EHD	Eps15 homology domain (EHD), C-terminal domain. Dynamin-like C-terminal Eps15 homology domain (EHD) proteins regulate endocytic events; they have been linked to a number of Rab proteins through their association with mutual effectors, suggesting a coordinate role in endocytic regulation. Eukaryotic EHDs comprise four members (EHD1-4) in mammals and single members in Caenorhabditis elegans (Rme-1), Drosophila melanogaster (Past1) as well as several eukaryotic parasites. EHD1 regulates trafficking of multiple receptors from the endocytic recycling compartment (ERC) to the plasma membrane; EHD2 regulates trafficking from the plasma membrane by controlling Rac1 activity; EHD3 regulates endosome-to-Golgi transport, and preserves Golgi morphology; EHD4 is involved in the control of trafficking at the early endosome and regulates exit of cargo toward the recycling compartment as well as late endocytic pathway. Rme-1, an ortholog of human EHD1, controls the recycling of internalized receptors from the endocytic recycling compartment to the plasma membrane. In D. melanogaster, deletion of the Past1 gene leads to infertility as well as premature death of adult flies. Arabidopsis thaliana also has homologs of EHD proteins (AtEHD1 and AtEHD2), possibly involved in regulating endocytosis and signaling.	241
206741	cd09914	RocCOR	Ras of complex proteins (Roc) C-terminal of Roc (COR) domain family. RocCOR (or Roco) protein family is characterized by a superdomain containing a Ras-like GTPase domain, called Roc (Ras of complex proteins), and a characteristic second domain called COR (C-terminal of Roc). A kinase domain and diverse regulatory domains are also often found in Roco proteins. Their functions are diverse; in Dictyostelium discoideum, which encodes 11 Roco proteins, they are involved in cell division, chemotaxis and development, while in human, where 4 Roco proteins (LRRK1, LRRK2, DAPK1, and MFHAS1) are encoded, these proteins are involved in epilepsy and cancer. Mutations in LRRK2 (leucine-rich repeat kinase 2) are known to cause familial Parkinson's disease.	161
206742	cd09915	Rag	Rag GTPase subfamily of Ras-related GTPases. Rag GTPases (ras-related GTP-binding proteins) constitute a unique subgroup of the Ras superfamily, playing an essential role in regulating amino acid-induced target of rapamycin complex 1 (TORC1) kinase signaling, exocytic cargo sorting at endosomes, and epigenetic control of gene expression. This subfamily consists of RagA and RagB as well as RagC and RagD that are closely related. Saccharomyces cerevisiae encodes single orthologs of metazoan RagA/B and RagC/D, Gtr1 and Gtr2, respectively. Dimer formation is important for their cellular function; these domains form heterodimers, as RagA or RagB dimerizes with RagC or RagD, and similarly, Gtr1 dimerizes with Gtr2. In response to amino acids, the Rag GTPases guide the TORC1 complex to activate the platform containing Rheb proto-oncogene by driving the relocalization of mTORC1 from discrete locations in the cytoplasm to a late endosomal and/or lysosomal compartment that is Rheb-enriched and contains Rab-7.	175
197366	cd09916	CpxP_like	CpxP component of the bacterial Cpx-two-component system and related proteins. This family summarizes bacterial proteins related to CpxP, a periplasmic protein that forms part of a two-component system which acts as a global modulator of cell-envelope stress in gram-negative bacteria. CpxP aids in combating extracytoplasmic protein-mediated toxicity, and may also be involved in the response to alkaline pH. Functioning as a dimer, it inhibits activation of the kinase CpxA, but also plays a vital role in the quality control system of P pili. It has been suggested that CpxP directly interacts with CpxA via its concave polar surface. Another member of this family, Spy, is also a periplasmic protein that may be involved in the response to stress. The homology between CpxP and Spy suggests similar functions. A characteristic 5-residue sequence motif LTXXQ is found repeated twice in many members of this family.	96
198174	cd09918	SH2_Nterm_SPT6_like	N-terminal Src homology 2 (SH2) domain found in Spt6. N-terminal SH2 domain in Spt6. Spt6 is an essential transcription elongation factor and histone chaperone that binds the C-terminal repeat domain (CTD) of RNA polymerase II. Spt6 contains a tandem SH2 domain with a novel structure and CTD-binding mode. The tandem SH2 domain binds to a serine 2-phosphorylated CTD peptide in vitro, whereas its N-terminal SH2 subdomain does not. CTD binding requires a positively charged crevice in the C-terminal SH2 subdomain, which lacks the canonical phospho-binding pocket of SH2 domains. The tandem SH2 domain is apparently required for transcription elongation in vivo as its deletion in cells is lethal in the presence of 6-azauracil.  In general SH2 domains are involved in signal transduction.  They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	85
198175	cd09919	SH2_STAT_family	Src homology 2 (SH2) domain found in signal transducer and activator of transcription (STAT) family. STAT proteins mediate the signaling of cytokines and a number of growth factors from the receptors of these extracellular signaling molecules to the cell nucleus. STATs are specifically phosphorylated by receptor-associated Janus kinases, receptor tyrosine kinases, or cytoplasmic tyrosine kinases. The phosphorylated STAT molecules dimerize by reciprocal binding of their SH2 domains to the phosphotyrosine residues. These dimeric STATs translocate into the nucleus, bind to specific DNA sequences, and regulate the transcription of their target genes. However there are a number of unphosphorylated STATs that travel between the cytoplasm and nucleus and some STATs that exist as dimers in unstimulated cells that can exert biological functions independent of being activated by a receptor. There are seven mammalian STAT family members which have been identified: STAT1, STAT2, STAT3, STAT4, STAT5 (STAT5A and STAT5B), and STAT6. There are 6 conserved domains in STAT: N-terminal domain (NTD), coiled-coil domain (CCD), DNA-binding domain (DBD), alpha-helical linker domain (LD), SH2 domain, and transactivation domain (TAD). NTD is involved in dimerization of unphosphorylated STAT monomers and for the tetramerization between STAT1, STAT3, STAT4 and STAT5 on promoters with two or more tandem STAT binding sites. It also plays a role in promoting interactions with transcriptional co-activators such as CREB binding protein (CBP)/p300, as well as being important for nuclear import and deactivation of STATs involving tyrosine de-phosphorylation. The CCD interacts with other proteins, such as IFN regulatory protein 9 (IRF-9/p48) with STAT1 and c-JUN with STAT3 and is also thought to participate in the negative regulation of these proteins. Distinct genes are bound to STATs via their DBD domain. This domain is also involved in nuclear translocation of activated STAT1 and STAT3 phosphorylated dimers upon cytokine stimulation. LD links the DNA-binding and SH2 domains and is important for the transcriptional activation of STAT1 in response to IFN-gamma. It also plays a role in protein-protein interactions and has also been implicated in the constitutive nucleocytoplasmic shuttling of unphosphorylated STATs in resting cells. The SH2 domain is necessary for receptor association and tyrosine phosphodimer formation. Residues within this domain may be particularly important for some cellular functions mediated by the STATs as well as residues adjacent to this domain. The TAD interacts with several proteins, namely minichromosome maintenance complex component 5 (MCM5), breast cancer 1 (BRCA1) and CBP/p300. TAD also contains a modulatory phosphorylation site that regulates STAT activity and is necessary for maximal transcription of a number of target genes. The conserved tyrosine residue present in the C-terminus is crucial for dimerization via interaction with the SH2 domain upon the interaction of the ligand with the receptor. STAT activation by tyrosine phosphorylation also determines nuclear import and retention, DNA binding to specific DNA elements in the promoters of responsive genes, and transcriptional activation of STAT dimers. In addition to the SH2 domain there is a coiled-coil domain, a DNA binding domain, and a transactivation domain in the STAT proteins. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	115
198176	cd09920	SH2_Cbl-b_TKB	Src homology 2 (SH2) domain found in the Cbl-b TKB domain. SH2 found in the Cbl-b TKB domain. The Cbl (for Casitas B-lineage lymphoma) family of E3 ubiquitin ligases contains three members Cbl, Cbl-b and Cbl-c. The founding member Cbl was discovered first as the oncogenic protein v-Cbl, a Gag-fusion transforming protein of Cas NS-1 retrovirus, which causes pre- and pro-B lymphomas in mice. The N-terminus of the Cbl proteins is composed of a tyrosine kinase-binding (TKB) domain, also called phosphotyrosine binding (PTB) domain, a short linker region and the RING-type zinc finger.  In addition, Cbl and Cbl-b contain a leucine zipper motif and a proline-rich domain in the C-terminus. The TKB domain consists of a four-helix bundle (4H), a calcium-binding EF hand and a divergent SH2 domain. Cbl-b plays a role in early hematopoietic development and is a negative regulator of T-cell receptor, B-cell receptor and high affinity immunoglobulin epsilon receptor signal transduction pathways. It also negatively regulates insulin-like growth factor 1 signaling during muscle atrophy caused by unloading and is involved in EGFR ubiquitination and internalization. Diseases associated with defects in Cbl-b include: multiple sclerosis, autoimmune diseases, including type 1 diabetes, and a craniofacial phenotype. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	97
198177	cd09921	SH2_Jak_family	Src homology 2 (SH2) domain in the Janus kinase (Jak) family. The Janus kinases (Jak) are a family of 4 non-receptor tyrosine kinases (Jak1, Jak2, Jak3, Tyk2) which respond to cytokine or growth factor receptor activation. To transduce cytokine signaling, a series of conformational changes occur in the receptor-Jak complex upon extracellular ligand binding. This results in trans-activation of the receptor-associated Jaks followed by phosphorylation of receptor tail tyrosine sites. The Signal Transducers and Activators of Transcription (STAT) are then recruited to the receptor tail, become phosphorylated and translocate to the nucleus to regulate transcription. Jaks have four domains: the pseudokinase domain, the catalytic tyrosine kinase domain, the FERM (band four-point-one, ezrin, radixin, and moesin) domain, and the SH2 (Src Homology-2) domain. The Jak kinases are regulated by several enzymatic and non-enzymatic mechanisms. First, the Jak kinase domain is regulated by phosphorylation of the activation loop which is associated with the catalytically competent kinase conformation and is distinct from the inactive kinase conformation. Second, the pseudokinase domain directly modulates Jak catalytic activity with the FERM domain maintaining an active state. Third, the suppressor of cytokine signaling (SOCS) family and tyrosine phosphatases directly regulate Jak activity. Dysregulation of Jak activity can manifest as either a reduction or an increase in kinase activity resulting in immunodeficiency, inflammatory diseases, hematological defects, autoimmune and myeloproliferative disorders, and susceptibility to infection. Altered Jak regulation occurs by many mechanisms, including: gene translocations, somatic or inherited point mutations, receptor mutations, and alterations in the activity of Jak regulators such as SOCS or phosphatases. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	97
198178	cd09923	SH2_SOCS_family	Src homology 2 (SH2) domain found in suppressor of cytokine signaling (SOCS) family. SH2 domain found in SOCS proteins. SOCS was first recognized as a group of cytokine-inducible SH2 (CIS) domain proteins comprising eight family members in human (CIS and SOCS1-SOCS7). In addition to the SH2 domain, SOCS proteins have a variable N-terminal domain and a conserved SOCS box in the C-terminal domain. SOCS proteins bind to a substrate via their SH2 domain. The prototypical members, CIS and SOCS1-SOCS3, have been shown to regulate growth hormone signaling in vitro and in a classic negative feedback response compete for binding at phosphotyrosine sites in JAK kinase and receptor pathways to displace effector proteins and target bound receptors for proteasomal degradation. Loss of SOCS activity results in excessive cytokine signaling associated with a variety of hematopoietic, autoimmune, and inflammatory diseases and certain cancers. Members (SOCS4-SOCS7) were identified by their conserved SOCS box, an adapter motif of 3 helices that associates substrate binding domains, such as the SOCS SH2 domain, ankyrin, and WD40 with ubiquitin ligase components. These show limited cytokine induction. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	81
198179	cd09925	SH2_SHC	Src homology 2 (SH2) domain found in SH2 adaptor protein C (SHC). SHC is involved in a wide variety of pathways including regulating proliferation, angiogenesis, invasion and metastasis, and bone metabolism. An adapter protein, SHC has been implicated in Ras activation following the stimulation of a number of different receptors, including growth factors [insulin, epidermal growth factor (EGF), nerve growth factor, and platelet derived growth factor (PDGF)], cytokines [interleukins 2, 3, and 5], erythropoietin, and granulocyte/macrophage colony-stimulating factor, and antigens [T-cell and B-cell receptors]. SHC has been shown to bind to tyrosine-phosphorylated receptors, and receptor stimulation leads to tyrosine phosphorylation of SHC. Upon phosphorylation, SHC interacts with another adapter protein, Grb2, which binds to the Ras GTP/GDP exchange factor mSOS which leads to Ras activation. SHC is composed of an N-terminal domain that interacts with proteins containing phosphorylated tyrosines, a (glycine/proline)-rich collagen-homology domain that contains the phosphorylated binding site, and a C-terminal SH2 domain. SH2 has been shown to interact with the tyrosine-phosphorylated receptors of EGF and PDGF and with the tyrosine-phosphorylated C chain of the T-cell receptor, providing one of the mechanisms of T-cell-mediated Ras activation. In general SH2 domains are involved in signal transduction.  They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	104
198180	cd09926	SH2_CRK_like	Src homology 2 domain found in cancer-related signaling adaptor protein CRK. SH2 domain in the CRK proteins. CRKI (SH2-SH3) and CRKII (SH2-SH3-SH3) are splicing isoforms of the oncoprotein CRK. CRKs regulate transcription and cytoskeletal reorganization for cell growth and motility by linking tyrosine kinases to small G proteins. The SH2 domain of CRK associates with tyrosine-phosphorylated receptors or components of focal adhesions, such as p130Cas and paxillin. CRK transmits signals to small G proteins through effectors that bind its SH3 domain, such as C3G, the guanine-nucleotide exchange factor (GEF) for Rap1 and R-Ras, and DOCK180, the GEF for Rac1. The binding of p130Cas to the CRK-C3G complex activates Rap1, leading to regulation of cell adhesion, and activates R-Ras, leading to JNK-mediated activation of cell proliferation, whereas the binding of CRK to DOCK180 induces Rac1-mediated activation of cellular migration. The activity of the different splicing isoforms varies greatly with CRKI displaying substantial transforming activity, CRKII less so, and phosphorylated CRKII with no biological activity whatsoever. CRKII has a linker region with a phosphorylated Tyr and an additional C-terminal SH3 domain. The phosphorylated Tyr creates a binding site for its SH2 domain which disrupts the association between CRK and its SH2 target proteins. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	106
198181	cd09927	SH2_Tensin_like	Src homology 2 domain found in Tensin-like proteins. SH2 domain found in Tensin-like proteins. The Tensins are a family of intracellular proteins that interact with receptor tyrosine kinases (RTKs), integrins, and actin. They are thought to act as signaling bridges between the extracellular space and the cytoskeleton. There are four homologues: Tensin1, Tensin2 (TENC1, C1-TEN), Tensin3 and Tensin4 (cten), all of which contain a C-terminal tandem SH2-PTB domain pairing, as well as actin-binding regions that may localize them to focal adhesions. The isoforms of Tensin2 and Tensin3 contain N-terminal C1 domains, which are atypical and not expected to bind to phorbol esters. Tensins 1-3 contain a phosphatase (PTPase) and C2 domain pairing which resembles PTEN (phosphatase and tensin homologue deleted on chromosome 10) protein. PTEN is a lipid phosphatase that dephosphorylates phosphatidylinositol 3,4,5-trisphosphate (PtdIns(3,4,5)P3) to yield phosphatidylinositol 4,5-bisphosphate (PtdIns(4,5)P2). As PtdIns(3,4,5)P3 is the product of phosphatidylinositol 3-kinase (PI3K) activity, PTEN is therefore a key negative regulator of the PI3K pathway. Because of their PTEN-like domains, the Tensins may also possess phosphoinositide-binding or phosphatase capabilities. However, only Tensin2 and Tensin3 have the potential to be phosphatases since only their PTPase domains contain a cysteine residue that is essential for catalytic activity. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	116
198182	cd09928	SH2_Cterm_SPT6_like	C-terminal Src homology 2 (SH2) domain found in Spt6. Spt6 is an essential transcription elongation factor and histone chaperone that binds the C-terminal repeat domain (CTD) of RNA polymerase II. Spt6 contains a tandem SH2 domain with a novel structure and CTD-binding mode. The tandem SH2 domain binds to a serine 2-phosphorylated CTD peptide in vitro, whereas its N-terminal SH2 subdomain does not. CTD binding requires a positively charged crevice in the C-terminal SH2 subdomain, which lacks the canonical phospho-binding pocket of SH2 domains. The tandem SH2 domain is apparently required for transcription elongation in vivo as its deletion in cells is lethal in the presence of 6-azauracil. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	89
198183	cd09929	SH2_BLNK_SLP-76	Src homology 2 (SH2) domain found in B-cell linker (BLNK) protein and SH2 domain-containing leukocyte protein of 76 kDa (SLP-76). BLNK (also known as SLP-65 or BASH) is an important adaptor protein expressed in B-lineage cells. BLNK consists of an N-terminal sterile alpha motif (SAM) domain and a C-terminal SH2 domain. BLNK is a cytoplasmic protein, but a part of it is bound to the plasma membrane through an N-terminal leucine zipper motif and transiently bound to a cytoplasmic domain of Ig-alpha through its C-terminal SH2 domain upon B cell antigen receptor (BCR)-stimulation. A non-ITAM phosphotyrosine in Ig-alpha is necessary for the binding with the BLNK SH2 domain and/or for normal BLNK function in signaling and B cell activation. Upon phosphorylation BLNK binds Btk and PLCgamma2 through their SH2 domains and mediates PLCgamma2 activation by Btk. BLNK also binds other signaling molecules such as Vav, Grb2, Syk, and HPK1. BLNK has been shown to be necessary for BCR-mediated Ca2+ mobilization, for the activation of mitogen-activated protein kinases such as ERK, JNK, and p38 in a chicken B cell line DT40, and for activation of transcription factors such as NF-AT and NF-kappaB in human or mouse B cells. BLNK is involved in B cell development, B cell survival, activation, proliferation, and T-independent immune responses. BLNK is structurally homologous to SLP-76. SLP-76 and linker for activation of T cells (LAT) are adaptor/linker proteins in T cell antigen receptor activation and T cell development. BLNK interacts with many downstream signaling proteins that interact directly with both SLP-76 and LAT. New data suggest functional complementation of SLP-76 and LAT in T cell antigen receptor function with BLNK in BCR function. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	121
198184	cd09930	SH2_cSH2_p85_like	C-terminal Src homology 2 (cSH2) domain found in p85. Phosphoinositide 3-kinases (PI3Ks) are essential for cell growth, migration, and survival. p110, the catalytic subunit, is composed of an adaptor-binding domain, a Ras-binding domain, a C2 domain, a helical domain, and a kinase domain. The regulatory unit is called p85 and is composed of an SH3 domain, a RhoGap domain, an N-terminal SH2 (nSH2) domain, an inter-SH2 (iSH2) domain, and a C-terminal SH2 (cSH2) domain. There are 2 inhibitory interactions between p110alpha and p85 of PI3K: 1) p85 nSH2 domain with the C2, helical, and kinase domains of p110alpha and 2) p85 iSH2 domain with C2 domain of p110alpha. There are 3 inhibitory interactions between p110beta and p85 of PI3K: 1) p85 nSH2 domain with the C2, helical, and kinase domains of p110beta, 2) p85 iSH2 domain with C2 domain of p110alpha, and 3) p85 cSH2 domain with the kinase domain of p110alpha. It is interesting to note that p110beta is oncogenic as a wild type protein while p110alpha lacks this ability. One explanation is the idea that the regulation of p110beta by p85 is unique because of the addition of inhibitory contacts from the cSH2 domain and the loss of contacts in the iSH2 domain. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	104
198185	cd09931	SH2_C-SH2_SHP_like	C-terminal Src homology 2 (C-SH2) domain found in SH2 domain Phosphatases (SHP) proteins. The SH2 domain phosphatases (SHP-1, SHP-2/Syp, Drosophila corkscrew (csw), and Caenorhabditis elegans Protein Tyrosine Phosphatase (Ptp-2)) are cytoplasmic signaling enzymes. They are both targeted and regulated by interactions of their SH2 domains with phosphotyrosine docking sites. These proteins contain two SH2 domains (N-SH2, C-SH2) followed by a tyrosine phosphatase (PTP) domain, and a C-terminal extension. Shp1 and Shp2 have two tyrosyl phosphorylation sites in their C-tails, which are phosphorylated differentially by receptor and nonreceptor PTKs. Csw retains the proximal tyrosine and Ptp-2 lacks both sites. Shp-binding proteins include receptors, scaffolding adapters, and inhibitory receptors. Some of these bind both Shp1 and Shp2 while others bind only one. Most proteins that bind a Shp SH2 domain contain one or more immuno-receptor tyrosine-based inhibitory motifs (ITIMs): [SIVL]xpYxx[IVL]. Shp1 N-SH2 domain blocks the catalytic domain and keeps the enzyme in the inactive conformation, and is thus believed to regulate the phosphatase activity of SHP-1. Its C-SH2 domain is thought to be involved in searching for phosphotyrosine activators. The SHP2 N-SH2 domain is a conformational switch; it either binds and inhibits the phosphatase, or it binds phosphoproteins and activates the enzyme. The C-SH2 domain contributes binding energy and specificity, but it does not have a direct role in activation. Csw SH2 domain function is essential, but either SH2 domain can fulfill this requirement. The role of the csw SH2 domains during Sevenless receptor tyrosine kinase (SEV) signaling is to bind Daughter of Sevenless rather than activated SEV. Ptp-2 acts in oocytes downstream of sheath/oocyte gap junctions to promote major sperm protein (MSP)-induced MAP Kinase (MPK-1) phosphorylation. Ptp-2 functions in the oocyte cytoplasm, not at the cell surface to inhibit multiple RasGAPs, resulting in sustained Ras activation. It is thought that MSP triggers PTP-2/Ras activation and ROS production to stimulate MPK-1 activity essential for oocyte maturation and that secreted MSP domains and Cu/Zn superoxide dismutases function antagonistically to control ROS and MAPK signaling. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	99
198186	cd09932	SH2_C-SH2_PLC_gamma_like	C-terminal Src homology 2 (C-SH2) domain in Phospholipase C gamma. Phospholipase C gamma is a signaling molecule that is recruited to the C-terminal tail of the receptor upon autophosphorylation of a highly conserved tyrosine. PLCgamma is composed of a Pleckstrin homology (PH) domain followed by an elongation factor (EF) domain, 2 catalytic regions of PLC domains that flank 2 tandem SH2 domains (N-SH2, C-SH2), and ending with a SH3 domain and C2 domain. N-SH2 SH2 domain-mediated interactions represent a crucial step in transmembrane signaling by receptor tyrosine kinases. SH2 domains recognize phosphotyrosine (pY) in the context of particular sequence motifs in receptor phosphorylation sites. Both N-SH2 and C-SH2 have a very similar binding affinity to pY. But in growth factor stimulated cells these domains bind to different target proteins. N-SH2 binds to pY containing sites in the C-terminal tails of tyrosine kinases and other receptors. Recently it has been shown that this interaction is mediated by phosphorylation-independent interactions between a secondary binding site found exclusively on the N-SH2 domain and a region of the FGFR1 tyrosine kinase domain. This secondary site on the SH2 cooperates with the canonical pY site to regulate selectivity in mediating a specific cellular process.  C-SH2 binds to an intramolecular site on PLCgamma itself which allows it to hydrolyze phosphatidylinositol-4,5-bisphosphate into diacylglycerol and inositol triphosphate. These then activate protein kinase C and release calcium. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	104
199827	cd09933	SH2_Src_family	Src homology 2 (SH2) domain found in the Src family of non-receptor tyrosine kinases. The Src family kinases are nonreceptor tyrosine kinases that have been implicated in pathways regulating proliferation, angiogenesis, invasion and metastasis, and bone metabolism. It is thought that transforming ability of Src is linked to its ability to activate key signaling molecules in these pathways, rather than through direct activity. As such blocking Src activation has been a target for drug companies. Src family members can be divided into 3 groups based on their expression pattern: 1) Src, Fyn, and Yes; 2) Blk, Fgr, Hck, Lck, and Lyn; and 3) Frk-related kinases Frk/Rak and Iyk/Bsk. Of these, cellular c-Src is the best studied and most frequently implicated in oncogenesis. The c-Src contains five distinct regions: a unique N-terminal domain, an SH3 domain, an SH2 domain, a kinase domain and a regulatory tail, as do the other members of the family. Src exists in both active and inactive conformations. Negative regulation occurs through phosphorylation of Tyr, resulting in an intramolecular association between phosphorylated Tyr and the SH2 domain of SRC, which locks the protein in a closed conformation. Further stabilization of the inactive state occurs through interactions between the SH3 domain and a proline-rich stretch of residues within the kinase domain. Conversely, dephosphorylation of Tyr allows SRC to assume an open conformation. Full activity requires additional autophosphorylation of a Tyr residue within the catalytic domain. Loss of the negative-regulatory C-terminal segment has been shown to result in increased activity and transforming potential. Phosphorylation of the C-terminal Tyr residue by C-terminal Src kinase (Csk) and Csk homology kinase results in increased intramolecular interactions and consequent Src inactivation. Specific phosphatases, protein tyrosine phosphatase alpha (PTP-alpha) and the SH-containing phosphatases SHP1/SHP2, have also been shown to take a part in Src activation. Src is also activated by direct binding of focal adhesion kinase (Fak) and Crk-associated substrate (Cas) to the SH2 domain. SRC activity can also be regulated by numerous receptor tyrosine kinases (RTKs), such as Her2, epidermal growth factor receptor (EGFR), fibroblast growth factor receptor, platelet-derived growth factor receptor (PDGFR), and vascular endothelial growth factor receptor (VEGFR). In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	101
198188	cd09934	SH2_Tec_family	Src homology 2 (SH2) domain found in Tec-like proteins. The Tec protein tyrosine kinase is the founding member of a family that includes Btk, Itk, Bmx, and Txk. The members have a PH domain, a zinc-binding motif, a SH3 domain, a SH2 domain, and a protein kinase catalytic domain. Btk is involved in B-cell receptor signaling with mutations in Btk responsible for X-linked agammaglobulinemia (XLA) in humans and X-linked immunodeficiency (xid) in mice. Itk is involved in T-cell receptor signaling. Tec is expressed in both T and B cells, and is thought to function in activated and effector T lymphocytes to induce the expression of genes regulated by NFAT transcription factors. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	104
198189	cd09935	SH2_ABL	Src homology 2 (SH2) domain found in Abelson murine lymphosarcoma virus (ABL) proteins. ABL-family proteins are highly conserved tyrosine kinases. Each ABL protein contains an SH3-SH2-TK (Src homology 3-Src homology 2-tyrosine kinase) domain cassette, which confers autoregulated kinase activity and is common among nonreceptor tyrosine kinases. Several types of posttranslational modifications control ABL catalytic activity, subcellular localization, and stability, with consequences for both cytoplasmic and nuclear ABL functions. Binding partners provide additional regulation of ABL catalytic activity, substrate specificity, and downstream signaling. By combining this cassette with actin-binding and -bundling domains, ABL proteins are capable of connecting phosphoregulation with actin-filament reorganization. Vertebrate paralogs, ABL1 and ABL2, have evolved to perform specialized functions. ABL1 includes nuclear localization signals and a DNA binding domain which is used to mediate DNA damage-repair functions, while ABL2 has additional binding capacity for actin and for microtubules to enhance its cytoskeletal remodeling functions.  SH2 is involved in several autoinhibitory mechanisms that constrain the enzymatic activity of the ABL-family kinases. In one mechanism SH2 and SH3 cradle the kinase domain while a cap sequence stabilizes the inactive conformation resulting in a locked inactive state. Another involves phosphatidylinositol 4,5-bisphosphate (PIP2) which binds the SH2 domain through residues normally required for phosphotyrosine binding in the linker segment between the SH2 and kinase domains. The SH2 domain contributes to ABL catalytic activity and target site specificity. It is thought that the ABL catalytic site and SH2 pocket have coevolved to recognize the same sequences. 
Recent work now supports a hierarchical processivity model in which the substrate target site most compatible with ABL kinase domain preferences is phosphorylated with greatest efficiency. If this site is compatible with the ABL SH2 domain specificity, it will then reposition and dock in the SH2 pocket. This mechanism also explains how ABL kinases phosphorylate poor targets on the same substrate if they are properly positioned and how relatively poor substrate proteins might be recruited to ABL through a complex with strong substrates that can also dock with the SH2 pocket. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	94
198190	cd09937	SH2_csk_like	Src homology 2 (SH2) domain found in Carboxyl-Terminal Src Kinase (Csk). Both the C-terminal Src kinase (CSK) and CSK-homologous kinase (CHK) are members of the CSK-family of protein tyrosine kinases. These proteins suppress activity of Src-family kinases (SFK) by selectively phosphorylating the conserved C-terminal tail regulatory tyrosine by a similar mechanism. CHK is also capable of inhibiting SFKs by a non-catalytic mechanism that involves binding of CHK to SFKs to form stable protein complexes. The unphosphorylated form of SFKs is inhibited by CSK and CHK by a two-step mechanism. The first step involves the formation of a complex of SFKs with CSK/CHK in which the SFKs are inactive. The second step involves the phosphorylation of the C-terminal tail tyrosine of SFKs, which then dissociate and adopt an inactive conformation. The structural basis of how the phosphorylated SFKs dissociate from CSK/CHK to adopt the inactive conformation is not known. The inactive conformation of SFKs is stabilized by two intramolecular inhibitory interactions: (a) the pYT:SH2 interaction in which the phosphorylated C-terminal tail tyrosine (YT) binds to the SH2 domain, and (b) the linker:SH3 interaction in which the SH2-kinase domain linker binds to the SH3 domain. SFKs are activated by multiple mechanisms including binding of the ligands to the SH2 and SH3 domains to displace the two inhibitory intramolecular interactions, autophosphorylation, and dephosphorylation of YT. By selective phosphorylation and the non-catalytic inhibitory mechanism CSK and CHK are able to inhibit the active forms of SFKs. CSK and CHK are regulated by phosphorylation and inter-domain interactions. They both contain SH3, SH2, and kinase domains separated by the SH3-SH2 connector and SH2 kinase linker, intervening segments separating the three domains. 
They lack a conserved tyrosine phosphorylation site in the kinase domain and the C-terminal tail regulatory tyrosine phosphorylation site. The CSK SH2 domain is crucial for stabilizing the kinase domain in the active conformation. A disulfide bond here regulates CSK kinase activity. The subcellular localization and activity of CSK are regulated by its SH2 domain. In general SH2 domains are involved in signal transduction.  They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	98
198191	cd09938	SH2_N-SH2_Zap70_Syk_like	N-terminal Src homology 2 (SH2) domain found in Zeta-chain-associated protein kinase 70 (ZAP-70) and Spleen tyrosine kinase (Syk) proteins. ZAP-70 and Syk comprise a family of hematopoietic cell specific protein tyrosine kinases (PTKs) that are required for antigen and antibody receptor function. ZAP-70 is expressed in T and natural killer (NK) cells and Syk is expressed in B cells, mast cells, polymorphonuclear leukocytes, platelets, macrophages, and immature T cells. They are required for the proper development of T and B cells, immune receptors, and activating NK cells. They consist of two N-terminal Src homology 2 (SH2) domains and a C-terminal kinase domain separated from the SH2 domains by a linker or hinge region. Phosphorylation of both tyrosine residues within the Immunoreceptor Tyrosine-based Activation Motifs (ITAM; consensus sequence Yxx[LI]x(7,8)Yxx[LI]) by the Src-family PTKs is required for efficient interaction of ZAP-70 and Syk with the receptor subunits and for receptor function. ZAP-70 forms two phosphotyrosine binding pockets, one of which is shared by both SH2 domains.  In Syk the two SH2 domains do not form such a phosphotyrosine-binding site.  The SH2 domains here are believed to function independently. In addition, the two SH2 domains of Syk display flexibility in their relative orientation, allowing Syk to accommodate a greater variety of spacing sequences between the ITAM phosphotyrosines and singly phosphorylated non-classical ITAM ligands. This model contains the N-terminus SH2 domains of both Syk and Zap70. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	104
198192	cd09939	SH2_STAP_family	Src homology 2 domain found in Signal-transducing adaptor protein (STAP) family. STAP1 and STAP2 are signal-transducing adaptor proteins. They are composed of a Pleckstrin homology (PH) and SH2 domains along with several tyrosine phosphorylation sites. STAP-1 is an ortholog of BRDG1 (BCR downstream signaling 1). STAP1 protein functions as a docking protein acting downstream of Tec tyrosine kinase in B cell antigen receptor signaling. The protein is phosphorylated by Tec and participates in a positive feedback loop, increasing Tec activity. STAP1 has been shown to interact with C19orf2, an unconventional prefoldin RPB5 interactor.  The STAP2 protein is the substrate of breast tumor kinase, an Src-type non-receptor tyrosine kinase that mediates the interactions linking proteins involved in signal transduction pathways. STAP2 has alternative splicing variants. STAP2 has been shown to interact with tyrosine-protein kinase 6 (PTK6). In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	94
198193	cd09940	SH2_Vav_family	Src homology 2 (SH2) domain found in the Vav family. Vav proteins are involved in several processes that require cytoskeletal reorganization, such as the formation of the immunological synapse (IS), phagocytosis, platelet aggregation, spreading, and transformation.  Vavs function as guanine nucleotide exchange factors (GEFs) for the Rho/Rac family of GTPases.  Vav family members have several conserved motifs/domains including: a leucine-rich region, a leucine-zipper, a calponin homology (CH) domain, an acidic domain, a Dbl-homology (DH) domain, a pleckstrin homology (PH) domain, a cysteine-rich domain, 2 SH3 domains,  a proline-rich region, and a SH2 domain.  Vavs are the only known Rho GEFs that have both the DH/PH motifs and SH2/SH3 domains in the same protein. The leucine-rich helix-loop-helix (HLH) domain is thought to be involved in protein heterodimerization with other HLH proteins and it may function as a negative regulator by forming inactive heterodimers. The CH domain  is usually involved in the association with filamentous actin, but in Vav it controls NFAT stimulation, Ca2+ mobilization, and its transforming activity. Acidic domains are involved in protein-protein interactions and contain regulatory tyrosines. The DH domain is a GDP-GTP exchange factor on Rho/Rac GTPases. The PH domain is involved in interactions with GTP-binding proteins, lipids and/or phosphorylated serine/threonine residues. The SH3 domain is involved in localization of proteins to specific sites within the cell by interacting with proteins with proline-rich sequences.  The SH2 domain mediates a high affinity interaction with tyrosine phosphorylated proteins.  There are three Vav mammalian family members: Vav1, which is expressed in the hematopoietic system, and Vav2 and Vav3, which are more ubiquitously expressed. The members here include insect and amphibian Vavs. In general SH2 domains are involved in signal transduction. 
They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	102
199828	cd09941	SH2_Grb2_like	Src homology 2 domain found in Growth factor receptor-bound protein 2 (Grb2) and similar proteins. The adaptor proteins here include homologs Grb2 in humans, Sex muscle abnormal protein 5 (Sem-5) in Caenorhabditis elegans, and Downstream of receptor kinase (drk) in Drosophila melanogaster. They are composed of one SH2 and two SH3 domains. Grb2/Sem-5/drk regulates the Ras pathway by linking the tyrosine kinases to the Ras guanine nucleotide releasing protein Sos, which converts Ras to the active GTP-bound state. The SH2 domain of Grb2/Sem-5/drk binds class II phosphotyrosyl peptides while its SH3 domain binds to Sos and Sos-derived, proline-rich peptides. Besides its function in Ras signaling, Grb2 is also thought to play a role in apoptosis. Unlike most SH2 structures in which the peptide binds in an extended conformation (such that the +3 peptide residue occupies a hydrophobic pocket in the protein, conferring a modest degree of selectivity), Grb2 forms several hydrogen bonds via main chain atoms with the side chain of +2 Asn. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	95
198195	cd09942	SH2_nSH2_p85_like	N-terminal Src homology 2 (nSH2) domain found in p85. Phosphoinositide 3-kinases (PI3Ks) are essential for cell growth, migration, and survival. p110, the catalytic subunit, is composed of an adaptor-binding domain, a Ras-binding domain, a C2 domain, a helical domain, and a kinase domain.  The regulatory unit is called p85 and is composed of an SH3 domain, a RhoGap domain, a N-terminal SH2 (nSH2) domain, an internal SH2 (iSH2) domain, and C-terminal (cSH2) domain.  There are 2 inhibitory interactions between p110alpha and p85 of PI3K: (1) p85 nSH2 domain with the C2, helical, and kinase domains of p110alpha and (2) p85 iSH2 domain with C2 domain of p110alpha. There are 3 inhibitory interactions between p110beta and p85 of PI3K: (1) p85 nSH2 domain with the C2, helical, and kinase domains of p110beta, (2) p85 iSH2 domain with C2 domain of p110alpha, and (3) p85 cSH2 domain with the kinase domain of p110alpha. It is interesting to note that p110beta is oncogenic as a wild type protein while p110alpha lacks this ability. One explanation is the idea that the regulation of p110beta by p85 is unique because of the addition of inhibitory contacts from the cSH2 domain and the loss of contacts in the iSH2 domain. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	110
198196	cd09943	SH2_Nck_family	Src homology 2 (SH2) domain found in the Nck family. Nck proteins are adaptors that modulate actin cytoskeleton dynamics by linking proline-rich effector molecules to tyrosine kinases or phosphorylated signaling intermediates. There are two members known in this family: Nck1 (Nckalpha) and Nck2 (Nckbeta and Growth factor receptor-bound protein 4 (Grb4)).  They are characterized by having 3 SH3 domains and a C-terminal SH2 domain. Nck1 and Nck2 have overlapping functions as determined by gene knockouts. Both bind receptor tyrosine kinases and other tyrosine-phosphorylated proteins through their SH2 domains. In addition they also bind distinct targets.  Neuronal signaling proteins: EphrinB1, EphrinB2, and Disabled-1 (Dab-1) all bind to Nck-2 exclusively. And in the case of PDGFR, Tyr(P)751 binds to  Nck1 while Tyr(P)1009 binds to Nck2. Nck1 and Nck2 have a role in the infection process of enteropathogenic Escherichia coli (EPEC). Their SH3 domains are involved in recruiting and activating the N-WASP/Arp2/3 complex inducing actin polymerization resulting in the production of pedestals, dynamic bacteria-presenting protrusions of the plasma membrane. A similar thing occurs in the vaccinia virus where motile plasma membrane projections are formed beneath the virus.  Recently it has been shown that the SH2 domains of both Nck1 and Nck2 bind the G-protein coupled receptor kinase-interacting protein 1 (GIT1) in a phosphorylation-dependent manner. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	93
198197	cd09944	SH2_Grb7_family	Src homology 2 (SH2) domain found in the growth factor receptor bound, subclass 7 (Grb7) proteins. The Grb family binds to the epidermal growth factor receptor (EGFR, erbB1) via their SH2 domains. There are 3 members of the Grb7 family of proteins: Grb7, Grb10, and Grb14. They are composed of an N-terminal Proline-rich domain, a Ras Associating-like (RA) domain, a Pleckstrin Homology (PH) domain, a phosphotyrosine interaction region (PIR, BPS) and a C-terminal SH2 domain. The SH2 domains of Grb7, Grb10 and Grb14 preferentially bind to a different RTK. Grb7 binds strongly to the erbB2 receptor, unlike Grb10 and Grb14 which bind weakly to it. Grb14 binds to Fibroblast Growth Factor Receptor (FGFR). Grb10 has been shown to interact with many different proteins, including the insulin and IGF1 receptors, platelet-derived growth factor (PDGF) receptor-beta, Ret, Kit, Raf1 and MEK1, and Nedd4.  Grb7 family proteins are phosphorylated on serine/threonine as well as tyrosine residues. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	108
198198	cd09945	SH2_SHB_SHD_SHE_SHF_like	Src homology 2 domain found in SH2 domain-containing adapter proteins B, D, E, and F (SHB, SHD, SHE, SHF). SHB, SHD, SHE, and SHF are SH2 domain-containing proteins that play various roles throughout the cell.  SHB functions in generating signaling compounds in response to tyrosine kinase activation. SHB contains proline-rich motifs, a phosphotyrosine binding (PTB) domain, tyrosine phosphorylation sites, and a SH2 domain. SHB mediates certain aspects of platelet-derived growth factor (PDGF) receptor-, fibroblast growth factor (FGF) receptor-, neural growth factor (NGF) receptor TRKA-, T cell receptor-, interleukin-2 (IL-2) receptor- and focal adhesion kinase- (FAK) signaling. SRC-like FYN-Related Kinase FRK/RAK (also named BSK/IYK or GTK) and SHB regulate apoptosis, proliferation and differentiation. SHB promotes apoptosis and is also required for proper mitogenicity, spreading and tubular morphogenesis in endothelial cells. SHB also plays a role in preventing early cavitation of embryoid bodies and reduces differentiation to cells expressing albumin, amylase, insulin and glucagon. SHB is a multifunctional protein that has different responses in different cells under various conditions. SHE is expressed in heart, lung, brain, and skeletal muscle, while expression of SHD is restricted to the brain. SHF is mainly expressed in skeletal muscle, brain, liver, prostate, testis, ovary, small intestine, and colon. SHD may be a physiological substrate of c-Abl and may function as an adapter protein in the central nervous system. It is also thought to be involved in apoptotic regulation.  SHD contains five YXXP motifs, a substrate sequence preferred by Abl tyrosine kinases, in addition to a poly-proline rich region and a C-terminal SH2 domain. 
SHE contains two pTyr protein binding domains, protein interaction domain (PID) and a SH2 domain, followed by a glycine-proline rich region, all of which are N-terminal to the phosphotyrosine binding (PTB) domain. SHF contains  four putative tyrosine phosphorylation sites and an SH2 domain. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	98
198199	cd09946	SH2_HSH2_like	Src homology 2 domain found in hematopoietic SH2 (HSH2) protein. HSH2 is thought to function as an adapter protein involved in tyrosine kinase signaling. It may also be involved in regulating cytokine signaling and cytoskeletal reorganization in hematopoietic cells. HSH2 contains several putative protein-binding motifs, SH3-binding proline-rich regions, and phosphotyrosine sites, but lacks enzymatic motifs. HSH2 was found to interact with cytokine-regulated tyrosine kinase c-FES and an activated Cdc42-associated tyrosine kinase ACK1. HSH2 binds c-FES through both its C-terminal region and its N-terminal region including the SH2 domain and binds ACK1 via its N-terminal proline-rich region. Both kinases bound and tyrosine-phosphorylated HSH2 in mammalian cells.  In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	102
197370	cd09947	Ebola_HIV-1-like_HR1-HR2	heptad repeat 1-heptad repeat 2 region (ectodomain) of the transmembrane subunit of various endogenous retroviruses (ERVs) and infectious retroviruses, including Ebola virus and human immunodeficiency virus type 1 (HIV-1). This domain superfamily spans both heptad repeats of the glycoprotein (gp)/transmembrane subunit of various endogenous retroviruses (ERVs) and infectious retroviruses, including Ebola virus gp2, Rous sarcoma virus gp37, human immunodeficiency virus type 1 (HIV-1) gp41, and the envelope proteins of various ERVs. In the HR1-HR2 region of Ebola virus and RSV, the linker region between the two repeats includes a CKS17-like immunosuppressive region and a CX6C motif that forms an intra-subunit disulfide bond; MMTV, HIV-1, HERV-K endogenous retroviruses and related sequences lack a canonical CKS17-like sequence, and CX6C motif.  N-terminal to the HR1-HR2 region is a fusion peptide (FP), and C-terminal, is a membrane-spanning region (MSR). Viral infection involves the formation of a trimer-of-hairpins structure (three HR1 helices, buttressed by three HR2 helices lying in antiparallel orientation). In this structure, the FP (inserted in the host cell membrane) and MSR (inserted in the viral membrane) are in close proximity. ERVs are likely to originate from ancient germ-line infections by active retroviruses. Some modern ERVs, those that integrated into the host genome post-speciation, have a currently active exogenous counterpart, such as Jaagsiekte sheep retrovirus (JSRV), feline leukemia virus (FeLV), and avian leukemia virus (ALV). Some ERVs play specific roles in the host, including placental development, protection of the host from infection by related pathogenic and exogenous retroviruses, and genome plasticity. 
Human ERVs (HERVs) belonging to this superfamily include Syncytin-1 (HERV-W_c7q21.2/ ERVWE1), and Syncytin-2 (HERV-FRD_6p24.1) which are expressed in the placenta, and are fusogenic, although they have a different cell specificity for fusion. Syncytin-2, but not Syncytin-1, is immunosuppressive; its immunosuppressive domain may protect the fetus from the mother's immune system. Syncytin-1 may participate in the formation of the placental trophoblast; it is also implicated in cell fusions between cancer and host cells and between cancer cells, and in human osteoclast fusion. This superfamily also contains human HERV-R_c7q21.2 (ERV-3), which is also expressed in the placenta, but is not fusogenic, and has an immunosuppressive domain, but lacks a fusion peptide. It is unclear whether ERV-3 has a critical biological role. Included in this superfamily are ERVs from domestic sheep that are related to JSRV, the agent of transmissible lung cancer in sheep; for example, enJSRV-26 that retains an intact genome. These endogenous JSRVs protect the sheep against JSRV infection and are required for sheep placental development.	73
197371	cd09948	Ebola_RSV-like_HR1-HR2	heptad repeat 1-heptad repeat 2 region of the transmembrane subunit of various endogenous retroviruses (ERVs) and infectious retroviruses, including Ebola virus and Rous sarcoma virus. This domain family spans both heptad repeats of the glycoprotein (gp)/transmembrane subunit of endogenous retroviruses (ERVs) and infectious retroviruses, including Ebola virus gp2, Rous sarcoma virus gp37, and the envelope proteins of various ERVs. This domain includes an N-terminal heptad repeat, a CKS17-like immunosuppressive region, a CX6C motif that forms an intra-subunit disulfide bond, and a C-terminal heptad repeat. N-terminal to HR1-HR2 region is a fusion peptide (FP), while C-terminal, is a membrane-spanning region (MSR). Viral infection involves the formation of a trimer-of-hairpins structure (three HR1 helices, buttressed by three HR2 helices lying in antiparallel orientation). In this structure, the FP (inserted in the host cell membrane) and MSR (inserted in the viral membrane) are in close proximity. ERVs are likely to originate from ancient germ-line infections by active retroviruses. Some modern ERVs, those that integrated into the host genome post-speciation, have a currently active exogenous counterpart, such as Jaagsiekte sheep retrovirus (JSRV), feline leukemia virus (FeLV), and avian leukemia virus (ALV). Some ERVs play specific roles in the host, including placental development, protection of the host from infection by related pathogenic and exogenous retroviruses, and genome plasticity. Human ERVs (HERVs) belonging to this family include Syncytin-1 (HERV-W_c7q21.2/ ERVWE1), and Syncytin-2 (HERV-FRD_6p24.1) which are expressed in the placenta, and are fusogenic, although they have a different cell specificity for fusion. Syncytin-2, but not Syncytin-1, is immunosuppressive. Its immunosuppressive domain may protect the fetus from the mother's immune system. 
Syncytin-1 may participate in the formation of the placental trophoblast. It is also implicated in cell fusions between cancer and host cells and between cancer cells, and in human osteoclast fusion. This family also contains human HERV-R_c7q21.2 (ERV-3), which is also expressed in the placenta, but is not fusogenic, has an immunosuppressive domain, but lacks a fusion peptide. It is unclear whether ERV-3 has a critical biological role.	72
197372	cd09949	RSV-like_HR1-HR2	heptad repeat 1-heptad repeat 2 region (ectodomain) of the transmembrane subunit of Rous sarcoma virus (RSV), and related domains. This domain subfamily spans both heptad repeats of the glycoprotein (gp)/transmembrane subunit of various endogenous retroviruses (ERVs) and infectious retroviruses, including Rous sarcoma virus gp37, Avian leukosis virus subgroup J (ALV-J)  envelope protein, and the envelope proteins of various ERVs, including those belonging to the ev/J (or EAV-HP) family of chicken ERVs, such as ev/J 4.1 Rb. ALV-J is a recently emerged avian pathogen, the causative agent of myeloid leukosis in meat-type chicken. ERVs are likely to originate from ancient germ-line infections by active retroviruses. ALV-J may have emerged from a recombination event between an unknown ALV and an EAV-HP ERV. This domain includes an N-terminal heptad repeat, a CKS17-like immunosuppressive region, a CX6C motif that forms an intrasubunit disulfide bond, and a C-terminal heptad repeat. N-terminal to HR1-HR2 region is a fusion peptide (FP), and C-terminal, is a membrane-spanning region (MSR). Viral infection involves the formation of a trimer-of-hairpins structure (three HR1 helices, buttressed by three HR2 helices lying in antiparallel orientation). In this structure, the FP (inserted in the host cell membrane) and MSR (inserted in the viral membrane) are in close proximity.	72
197373	cd09950	ENVV1-like_HR1-HR2	heptad repeat 1-heptad repeat 2 region (ectodomain) of the transmembrane subunit of the human endogenous retrovirus ENVV1, and related domains. This domain subfamily spans both heptad repeats of the glycoprotein (gp)/transmembrane subunit of various endogenous retroviruses (ERVs), including chicken FET-1 (Female Expressed Transcript 1) protein, and the envelope proteins of the human ERVs (HERVs): ENVV1 (also known as HERV-V2_c19q13.41) and ENVV2 (also known as HERV-V1_c19q13.41 ). This domain belongs to a larger superfamily containing the HR1-HR2 domain of endogenous retroviruses (ERVs) and infectious retroviruses, such as Ebola virus, Rous sarcoma virus and human immunodeficiency virus type 1. This domain includes an N-terminal heptad repeat, a CKS17-like immunosuppressive region, a CX6C motif that forms an intra-subunit disulfide bond, and a C-terminal heptad repeat. N-terminal to HR1-HR2 region is a fusion peptide (FP), and C-terminal, is a membrane-spanning region (MSR). Viral infection involves the formation of a trimer-of-hairpins structure (three HR1 helices, buttressed by three HR2 helices lying in antiparallel orientation). In this structure, the FP (inserted in the host cell membrane) and MSR (inserted in the viral membrane) are in close proximity. ERVs are likely to originate from ancient germ-line infections by active retroviruses. Some ERVs play specific roles in the host, including placental development, protection of the host from infection by related pathogenic and exogenous retroviruses, and genome plasticity. FET-1 may have an ovary-determining role. The FET-1 gene is located on the female specific W chromosome in chickens. During the sex-determining period, the FET-1 transcript is up-regulated in the cortex of the left gonad (the only gonad which develops in female chickens); it is also expressed at a lower level, in neural tissue and waste collection ducts. 
The genes encoding ENVV1 and ENVV2 proteins are located in tandem on chromosome 19q13.41, and show placenta-specific expression in human and baboon.	72
197374	cd09951	HERV-Rb-like_HR1-HR2	heptad repeat 1- heptad repeat 2 region (ectodomain) of the transmembrane subunit of the human endogenous retrovirus HERV-R(b)_c3p24.3 and related domains. This domain subfamily spans both heptad repeats of the glycoprotein (gp)/transmembrane subunit of various endogenous retroviruses (ERVs) including the human ERVs (HERVs): HERV-R(b)_c3p24.3 and Syncytin-3 (also known as HERV-P(b)_c14q32.12). This domain belongs to a larger superfamily containing the HR1-HR2 domain of endogenous retroviruses (ERVs) and infectious retroviruses, such as Ebola virus, Rous sarcoma virus (RSV) and human immunodeficiency virus type 1 (HIV-1). This domain includes an N-terminal heptad repeat, a CKS17-like immunosuppressive region, a CX6C motif that forms an intrasubunit disulfide bond, and a C-terminal heptad repeat. In intact retroviruses, N-terminal to HR1-HR2 region is a fusion peptide (FP), and C-terminal, is a membrane-spanning region (MSR). Viral infection involves the formation of a trimer-of-hairpins structure (three HR1 helices, buttressed by three HR2 helices lying in antiparallel orientation). In this structure, the FP (inserted in the host cell membrane) and MSR (inserted in the viral membrane) are in close proximity. ERVs are likely to originate from ancient germ-line infections by active retroviruses. Some ERVs play specific roles in the host, including placental development, protection of the host from infection by related pathogenic and exogenous retroviruses, and genome plasticity. Syncytin-3 is fusogenic, whereas HERV-R(b)_c3p24.3 appears not to have fusogenic activity.	81
197375	cd09966	UP_III_II	Uroplakin IIIb, IIIa and II. Uroplakins (UPs) are a family of proteins that associate with each other to form plaques on the apical surface of the urothelium, the pseudo-stratified epithelium lining the urinary tract from renal pelvis to the bladder outlet. UPs are classified into 3 types: UPIa and UPIb,  UPII, and UPIIIa and IIIb. UPIs are tetraspanins that have four transmembrane domains separating one large and one small extracellular domain while UPII and UPIIIs are single-pass transmembrane proteins. UPIa and UPIb form specific heterodimers with UPII and UPIII, respectively, which allows them to exit the endoplasmic reticulum. UPII/UPIa and UPIIIs/UPIb form heterotetramers; six of these tetramers form the 16 nm particle, seen in the hexagonal array of the asymmetric unit membrane, which is believed to form a urinary tract barrier. Uroplakins are also believed to play a role during urinary tract morphogenesis.	181
197376	cd09967	UP_II	Uroplakin II. Uroplakin II, the dimerization partner of uroplakin Ia, is a member of the uroplakin family. Uroplakins (UPs) are a family of proteins that associate with each other to form plaques on the apical surface of the urothelium, the pseudo-stratified epithelium lining the urinary tract from renal pelvis to the bladder outlet. UPs are classified into 3 types: UPIa and UPIb, UPII, and UPIIIa and IIIb. UPIs are tetraspanins that have four transmembrane domains separating one large and one small extracellular domain while UPII and UPIIIs are single-pass transmembrane proteins. UPIa and UPIb form specific heterodimers with UPII and UPIII, respectively, which allows them to exit the endoplasmic reticulum. UPII/UPIa and UPIIIs/UPIb form heterotetramers and six of these tetramers form the 16 nm particle, seen in the hexagonal array of the asymmetric unit membrane, which is believed to form a urinary tract barrier. Uroplakins are also believed to play a role during urinary tract morphogenesis.	165
197377	cd09968	UP_III	Uroplakin III. Uroplakin IIIa and IIIb, the dimerization partners of uroplakin Ib, are members of the uroplakin family. Uroplakins (UPs) are a family of proteins that associate with each other to form plaques on the apical surface of the urothelium, the pseudo-stratified epithelium lining the urinary tract from renal pelvis to the bladder outlet. UPs are classified into 3 types: UPIa and UPIb, UPII, and UPIIIa and IIIb. UPIs are tetraspanins that have four transmembrane domains separating one large and one small extracellular domain while UPII and UPIIIs are single-pass transmembrane proteins. UPIa and UPIb form specific heterodimers with UPII and UPIII, respectively, which allows them to exit the endoplasmic reticulum. UPII/UPIa and UPIIIs/UPIb form heterotetramers and six of these tetramers form the 16 nm particle, seen in the hexagonal array of the asymmetric unit membrane, which is believed to form a urinary tract barrier. Uroplakins are also believed to play a role during urinary tract morphogenesis.	187
197378	cd09969	UP_IIIb	Uroplakin IIIb. Uroplakin IIIb, minor isoform of the dimerization partner of uroplakin Ib, is a member of the uroplakin family. Uroplakins (UPs) are a family of proteins that associate with each other to form plaques on the apical surface of the urothelium, the pseudo-stratified epithelium lining the urinary tract from renal pelvis to the bladder outlet. UPs are classified into 3 types: UPIa and UPIb, UPII, and UPIIIa and IIIb. UPIs are tetraspanins that have four transmembrane domains separating one large and one small extracellular domain while UPII and UPIIIs are single-pass transmembrane proteins. UPIa and UPIb form specific heterodimers with UPII and UPIII, respectively, which allows them to exit the endoplasmic reticulum. UPII/UPIa and UPIIIs/UPIb form heterotetramers and six of these tetramers form the 16 nm particle, seen in the hexagonal array of the asymmetric unit membrane, which is believed to form a urinary tract barrier. Uroplakins are also believed to play a role during urinary tract morphogenesis.	184
197379	cd09970	UP_IIIa	Uroplakin IIIa. Uroplakin IIIa, major isoform of the dimerization partner of uroplakin Ib, is a member of the uroplakin family. Uroplakins (UPs) are a family of proteins that associate with each other to form plaques on the apical surface of the urothelium, the pseudo-stratified epithelium lining the urinary tract from renal pelvis to the bladder outlet. UPs are classified into 3 types: UPIa and UPIb, UPII, and UPIIIa and IIIb. UPIs are tetraspanins that have four transmembrane domains separating one large and one small extracellular domain while UPII and UPIIIs are single-pass transmembrane proteins. UPIa and UPIb form specific heterodimers with UPII and UPIII, respectively, which allows them to exit the endoplasmic reticulum. UPII/UPIa and UPIIIs/UPIb form heterotetramers and six of these tetramers form the 16 nm particle, seen in the hexagonal array of the asymmetric unit membrane, which is believed to form a urinary tract barrier. Uroplakins are also believed to play a role during urinary tract morphogenesis.	212
197380	cd09971	SdiA-regulated	SdiA-regulated. This model represents a bacterial family of proteins that may be regulated by SdiA, a member of the LuxR family of transcriptional regulators. The C-terminal domain included in the alignment forms a five-bladed beta-propeller structure. The X-ray structure of Escherichia coli yjiK (C-terminal domain) exhibits binding of calcium ions (Ca++) in what appears to be an evolutionarily conserved site. Sequence analysis suggests a distant relationship to proteins that are characterized as containing NHL-repeats. The latter also form beta-propeller structures, with several examples known to form six-bladed beta-propellers. Several of the six-bladed beta-propellers containing NHL repeats have been characterized functionally, including members with enzymatic functions that are dependent on metal ions. No functional characterization is available for this family of five-bladed propellers, though.	242
193586	cd09972	LOTUS_TDRD_OSKAR	The first LOTUS domain in Oskar and Tudor-containing proteins 5 and 7. The first LOTUS domain in Oskar and Tudor-containing proteins 5 and 7: The LOTUS-containing proteins are germline-specific and are found in the nuage/polar granules of germ cells. Tudor-containing proteins 5 and 7 belong to the evolutionarily conserved Tudor domain-containing protein (TDRD) family involved in germ cell development. In mice, TDRD5 and TDRD7 are components of the intermitochondrial cements (IMCs) and the chromatoid bodies (CBs), which are cytoplasmic ribonucleoprotein granules involved in RNA processing for spermatogenesis. Oskar protein is a critical component of the pole plasm in the Drosophila oocyte, which is required for germ cell formation. The exact molecular function of the LOTUS domain remains to be identified. Its occurrence in proteins associated with RNA metabolism suggests that it might be involved in RNA binding function. The presence of several basic residues and RNA fold recognition motifs supports this hypothesis. The RNA binding function might be the first step of regulating mRNA translation or localization.	87
193587	cd09973	LOTUS_2_TDRD7	The second LOTUS domain on Tudor-containing protein 7 (TDRD7). The second LOTUS domain on Tudor-containing protein 7 (TDRD7): TDRD7 contains three N-terminal LOTUS domains and three Tudor domain repeats at the C-terminus. It belongs to the evolutionarily conserved Tudor domain-containing protein (TDRD) family involved in germ cell development. In mice, TDRD7 together with TDRD1/MTR-1, TDRD5 and TDRD6 forms a ribonucleoprotein complex in the intermitochondrial cements (IMCs) and the chromatoid bodies (CBs) involved in RNA processing for spermatogenesis. TDRD7 is functionally essential for the differentiation of germ cells. The exact molecular function of the LOTUS domain on TDRD7 remains to be characterized. Its occurrence in proteins associated with RNA metabolism suggests that it might be involved in RNA binding function. The presence of several basic residues and RNA fold recognition motifs supports this hypothesis. The RNA binding function might be the first step of regulating mRNA translation or localization.	68
193588	cd09974	LOTUS_3_TDRD7	The third LOTUS domain on Tudor-containing protein 7 (TDRD7). The third LOTUS domain on Tudor-containing protein 7 (TDRD7): TDRD7 contains three N-terminal LOTUS domains and three Tudor domain repeats at the C-terminus. It belongs to the evolutionarily conserved Tudor domain-containing protein (TDRD) family involved in germ cell development. In mice, TDRD7 together with TDRD1/MTR-1, TDRD5 and TDRD6 forms a ribonucleoprotein complex in the intermitochondrial cements (IMCs) and the chromatoid bodies (CBs) involved in RNA processing for spermatogenesis. TDRD7 is functionally essential for the differentiation of germ cells. The exact molecular function of the LOTUS domain on TDRD7 remains to be characterized. Its occurrence in proteins associated with RNA metabolism suggests that it might be involved in RNA binding function. The presence of several basic residues and RNA fold recognition motifs supports this hypothesis. The RNA binding function might be the first step of regulating mRNA translation or localization.	67
193589	cd09975	LOTUS_2_TDRD5	The second LOTUS domain on Tudor-containing protein 5 (TDRD5). The second LOTUS domain on Tudor-containing protein 5 (TDRD5): TDRD5 contains three N-terminal LOTUS domains and a C-terminal Tudor domain. It belongs to the evolutionarily conserved Tudor domain-containing protein (TDRD) family involved in germ cell development. In mice, TDRD5 is a component of the intermitochondrial cements (IMCs) and the chromatoid bodies (CBs), which are cytoplasmic ribonucleoprotein granules involved in RNA processing for spermatogenesis. The exact molecular function of the LOTUS domain on TDRD5 remains to be discovered. Its occurrence in proteins associated with RNA metabolism suggests that it might be involved in RNA binding function. The presence of several basic residues and RNA fold recognition motifs supports this hypothesis. The RNA binding function might be the first step of regulating mRNA translation or localization.	70
193590	cd09976	LOTUS_3_TDRD5	The third LOTUS domain on Tudor-containing protein 5 (TDRD5). The third LOTUS domain on Tudor-containing protein 5 (TDRD5): TDRD5 contains three N-terminal LOTUS domains and a C-terminal Tudor domain. It belongs to the evolutionarily conserved Tudor domain-containing protein (TDRD) family involved in germ cell development. In mice, TDRD5 is a component of the intermitochondrial cements (IMCs) and the chromatoid bodies (CBs), which are cytoplasmic ribonucleoprotein granules involved in RNA processing for spermatogenesis. The exact molecular function of the LOTUS domain on TDRD5 remains to be identified. Its occurrence in proteins associated with RNA metabolism suggests that it might be involved in RNA binding function. The presence of several basic residues and RNA fold recognition motifs supports this hypothesis. The RNA binding function might be the first step of regulating mRNA translation or localization.	74
193591	cd09977	LOTUS_1_Limkain_b1	The first LOTUS domain on Limkain b1 (LKAP). The first LOTUS domain on Limkain b1 (LKAP): Limkain b1 is a novel human autoantigen, localized to a subset of ABCD3- and PXF-marked peroxisomes. Limkain b1 may be a relatively common target of human autoantibodies reactive to cytoplasmic vesicle-like structures. The protein contains multiple copies of LOTUS domains and a conserved RNA recognition motif. The exact molecular function of the LOTUS domain remains to be identified. Its occurrence in proteins associated with RNA metabolism suggests that it might be involved in RNA binding function. The presence of several basic residues and RNA fold recognition motifs supports this hypothesis. The RNA binding function might be the first step of regulating mRNA translation or localization.	62
193592	cd09978	LOTUS_2_Limkain_b1	The second LOTUS domain on Limkain b1 (LKAP). The second LOTUS domain on Limkain b1 (LKAP): Limkain b1 is a novel human autoantigen, localized to a subset of ABCD3- and PXF-marked peroxisomes. Limkain b1 may be a relatively common target of human autoantibodies reactive to cytoplasmic vesicle-like structures. The protein contains multiple copies of LOTUS domains and a conserved RNA recognition motif. The exact molecular function of the LOTUS domain remains to be identified. Its occurrence in proteins associated with RNA metabolism suggests that it might be involved in RNA binding function. The presence of several basic residues and RNA fold recognition motifs supports this hypothesis. The RNA binding function might be the first step of regulating mRNA translation or localization.	71
193593	cd09979	LOTUS_3_Limkain_b1	The third LOTUS domain on Limkain b1 (LKAP). The third LOTUS domain on Limkain b1 (LKAP): Limkain b1 is a novel human autoantigen, localized to a subset of ABCD3- and PXF-marked peroxisomes. Limkain b1 may be a relatively common target of human autoantibodies reactive to cytoplasmic vesicle-like structures. The protein contains multiple copies of LOTUS domains and a conserved RNA recognition motif. The exact molecular function of the LOTUS domain remains to be identified. Its occurrence in proteins associated with RNA metabolism suggests that it might be involved in RNA binding function. The presence of several basic residues and RNA fold recognition motifs supports this hypothesis. The RNA binding function might be the first step of regulating mRNA translation or localization.	72
193594	cd09980	LOTUS_4_Limkain_b1	The fourth LOTUS domain on Limkain b1 (LKAP). The fourth LOTUS domain on Limkain b1 (LKAP): Limkain b1 is a novel human autoantigen, localized to a subset of ABCD3- and PXF-marked peroxisomes. Limkain b1 may be a relatively common target of human autoantibodies reactive to cytoplasmic vesicle-like structures. The protein contains multiple copies of LOTUS domains and a conserved RNA recognition motif. The exact molecular function of the LOTUS domain remains to be identified. Its occurrence in proteins associated with RNA metabolism suggests that it might be involved in RNA binding function. The presence of several basic residues and RNA fold recognition motifs supports this hypothesis. The RNA binding function might be the first step of regulating mRNA translation or localization.	72
193595	cd09981	LOTUS_5_Limkain_b1	The fifth LOTUS domain on Limkain b1 (LKAP). The fifth LOTUS domain on Limkain b1 (LKAP): Limkain b1 is a novel human autoantigen, localized to a subset of ABCD3- and PXF-marked peroxisomes. Limkain b1 may be a relatively common target of human autoantibodies reactive to cytoplasmic vesicle-like structures. The protein contains multiple copies of LOTUS domains and a conserved RNA recognition motif. The exact molecular function of the LOTUS domain remains to be identified. Its occurrence in proteins associated with RNA metabolism suggests that it might be involved in RNA binding function. The presence of several basic residues and RNA fold recognition motifs supports this hypothesis. The RNA binding function might be the first step of regulating mRNA translation or localization.	71
193596	cd09982	LOTUS_6_Limkain_b1	The sixth LOTUS domain on Limkain b1 (LKAP). The sixth LOTUS domain on Limkain b1 (LKAP): Limkain b1 is a novel human autoantigen, localized to a subset of ABCD3- and PXF-marked peroxisomes. Limkain b1 may be a relatively common target of human autoantibodies reactive to cytoplasmic vesicle-like structures. The protein contains multiple copies of LOTUS domains and a conserved RNA recognition motif. The exact molecular function of the LOTUS domain remains to be identified. Its occurrence in proteins associated with RNA metabolism suggests that it might be involved in RNA binding function. The presence of several basic residues and RNA fold recognition motifs supports this hypothesis. The RNA binding function might be the first step of regulating mRNA translation or localization.	71
193597	cd09983	LOTUS_7_Limkain_b1	The seventh LOTUS domain on Limkain b1 (LKAP). The seventh LOTUS domain on Limkain b1 (LKAP): Limkain b1 is a novel human autoantigen, localized to a subset of ABCD3- and PXF-marked peroxisomes. Limkain b1 may be a relatively common target of human autoantibodies reactive to cytoplasmic vesicle-like structures. The protein contains multiple copies of LOTUS domains and a conserved RNA recognition motif. The exact molecular function of the LOTUS domain remains to be identified. Its occurrence in proteins associated with RNA metabolism suggests that it might be involved in RNA binding function. The presence of several basic residues and RNA fold recognition motifs supports this hypothesis. The RNA binding function might be the first step of regulating mRNA translation or localization.	73
193598	cd09984	LOTUS_8_Limkain_b1	The eighth LOTUS domain on Limkain b1 (LKAP). The eighth LOTUS domain on Limkain b1 (LKAP): Limkain b1 is a novel human autoantigen, localized to a subset of ABCD3- and PXF-marked peroxisomes. Limkain b1 may be a relatively common target of human autoantibodies reactive to cytoplasmic vesicle-like structures. The protein contains multiple copies of LOTUS domains and a conserved RNA recognition motif. The exact molecular function of the LOTUS domain remains to be identified. Its occurrence in proteins associated with RNA metabolism suggests that it might be involved in RNA binding function. The presence of several basic residues and RNA fold recognition motifs supports this hypothesis. The RNA binding function might be the first step of regulating mRNA translation or localization.	76
193599	cd09985	LOTUS_1_TDRD5	The first LOTUS domain on Tudor-containing protein 5 (TDRD5). The first LOTUS domain on Tudor-containing protein 5 (TDRD5): TDRD5 contains three N-terminal LOTUS domains and a C-terminal Tudor domain. It belongs to the evolutionarily conserved Tudor domain-containing protein (TDRD) family involved in germ cell development. In mice, TDRD5 is a component of the intermitochondrial cements (IMCs) and the chromatoid bodies (CBs), which are cytoplasmic ribonucleoprotein granules involved in RNA processing for spermatogenesis. The exact molecular function of the LOTUS domain on TDRD5 remains to be identified. Its occurrence in proteins associated with RNA metabolism suggests that it might be involved in RNA binding function. The presence of several basic residues and RNA fold recognition motifs supports this hypothesis. The RNA binding function might be the first step of regulating mRNA translation or localization.	95
193600	cd09986	LOTUS_1_TDRD7	The first LOTUS domain on Tudor-containing protein 7 (TDRD7). The first LOTUS domain on Tudor-containing protein 7 (TDRD7): TDRD7 contains three N-terminal LOTUS domains and three Tudor domain repeats at the C-terminus. It belongs to the evolutionarily conserved Tudor domain-containing protein (TDRD) family involved in germ cell development. In mice, TDRD7 together with TDRD1/MTR-1, TDRD5 and TDRD6 forms a ribonucleoprotein complex in the intermitochondrial cements (IMCs) and the chromatoid bodies (CBs) involved in RNA processing for spermatogenesis. TDRD7 is functionally essential for the differentiation of germ cells. The exact molecular function of the LOTUS domain on TDRD7 remains to be characterized. Its occurrence in proteins associated with RNA metabolism suggests that it might be involved in RNA binding function. The presence of several basic residues and RNA fold recognition motifs supports this hypothesis. The RNA binding function might be the first step of regulating mRNA translation or localization.	88
212513	cd09987	Arginase_HDAC	Arginase-like and histone-like hydrolases. The arginase-like/histone-like hydrolase superfamily includes metal-dependent enzymes that belong to the arginase-like amidino hydrolase family and the histone/histone-like deacetylase class I, II and IV families. These enzymes catalyze hydrolysis of an amide bond. Arginases are known to be involved in control of cellular levels of arginine and ornithine, in histidine and arginine degradation and in clavulanic acid biosynthesis. Deacetylases play a role in signal transduction through histone and/or other protein modification and can repress/activate transcription of a number of different genes. They participate in different cellular processes including cell cycle regulation, DNA damage response, embryonic development, cytokine signaling important for immune response and post-translational control of the acetyl coenzyme A synthetase. Mammalian histone deacetylases are known to be involved in progression of different tumors. Specific inhibitors of mammalian histone deacetylases are an emerging class of promising novel anticancer drugs.	217
212514	cd09988	Formimidoylglutamase	Formimidoylglutamase or HutE. Formimidoylglutamase (N-formimidoyl-L-glutamate formimidoylhydrolase; formiminoglutamase; N-formiminoglutamate hydrolase; N-formimino-L-glutamate formiminohydrolase; HutE; EC 3.5.3.8) is a metalloenzyme that catalyzes hydrolysis of N-formimidoyl-L-glutamate to L-glutamate and formamide. This enzyme is involved in histidine degradation, requiring Mn as a cofactor while glutathione may be required for maximal activity. In Pseudomonas PAO1, mutation studies show that histidine degradation proceeds via a 'four-step' pathway if the 'five-step' route is absent and vice versa; in the four-step pathway, formiminoglutamase (HutE, EC 3.5.3.8) directly converts formiminoglutamate (FIGLU) to L-glutamate and formamide in a single step. Formiminoglutamase has traditionally also been referred to as HutG; however, formiminoglutamase is structurally and mechanistically unrelated to N-formyl-glutamate deformylase (also called HutG). Phylogenetic analysis has suggested that HutE was acquired by horizontal gene transfer from a Ralstonia-like ancestor.	262
212515	cd09989	Arginase	Arginase family. This family includes arginase, also known as arginase-like amidino hydrolase family, and related proteins. Arginase is a binuclear Mn-dependent metalloenzyme and catalyzes hydrolysis of L-arginine to L-ornithine and urea (Arg, EC 3.5.3.1), the reaction being the fifth and final step in the urea cycle, providing the path for the disposal of nitrogenous compounds. Arginase controls cellular levels of arginine and ornithine which are involved in protein biosynthesis, and in production of creatine, polyamines, proline and nitric oxide. In vertebrates, at least two isozymes have been identified: type I (ARG1) cytoplasmic or hepatic liver-type arginase and type II (ARG2) mitochondrial or non-hepatic arginase. Point mutations in human arginase ARG1 gene lead to hyperargininemia with consequent mental disorders, retarded development and early death. Hyperargininemia is associated with a several-fold increase in the activity of the mitochondrial arginase (ARG2), causing persistent ureagenesis in patients. ARG2 overexpression plays a critical role in the pathophysiology of cholesterol mediated endothelial dysfunction. Thus, arginase is a therapeutic target to treat asthma, erectile dysfunction, atherosclerosis and cancer.	290
212516	cd09990	Agmatinase-like	Agmatinase-like family. Agmatinase subfamily currently includes metalloenzymes such as agmatinase, guanidinobutyrase, guanidopropionase, formimidoylglutamase and proclavaminate amidinohydrolase. Agmatinase (agmatine ureohydrolase; SpeB; EC=3.5.3.11) is the key enzyme in the synthesis of polyamine putrescine; it catalyzes hydrolysis of agmatine to yield putrescine and urea. This enzyme has been found in bacteria, archaea and eukaryotes, requiring divalent Mn and sometimes Zn, Co or Ca for activity. In mammals, the highest level of agmatinase mRNA was found in liver and kidney. However, catabolism of agmatine via agmatinase apparently is not a major path; it is mostly catabolized via diamine oxidase. Agmatinase has been shown to be down-regulated in tumor renal cells. Guanidinobutyrase (Gbh, EC=3.5.3.7) catalyzes hydrolysis of 4-guanidinobutanoate to yield 4-aminobutanoate and urea in arginine degradation pathway. Activity has been shown for purified enzyme from Arthrobacter sp. KUJ 8602. Additionally, guanidinobutyrase is able to hydrolyze D-arginine, 3-guanidinopropionate, 5-guanidinovaleriate and L-arginine with much less affinity, using divalent Zn ions for catalysis. Proclavaminate amidinohydrolase (Pah, EC 3.5.3.22) hydrolyzes amidinoproclavaminate to yield proclavaminate and urea in clavulanic acid biosynthesis. Activity has been shown for purified enzyme from Streptomyces clavuligerus. Clavulanic acid is an effective inhibitor of beta-lactamases. This acid is used in combination with the penicillin amoxicillin to protect the antibiotic's beta-lactam ring from hydrolysis, thus keeping the antibiotic biologically active.	275
212517	cd09991	HDAC_classI	Class I histone deacetylases. Class I histone deacetylases (HDACs) are Zn-dependent enzymes that catalyze hydrolysis of N(6)-acetyl-lysine residues in histone amino termini to yield a deacetylated histone (EC 3.5.1.98). Enzymes belonging to this group participate in regulation of a number of processes through protein (mostly different histones) modification (deacetylation). Class I histone deacetylases in general act via the formation of large multiprotein complexes. This group includes animal HDAC1, HDAC2, HDAC3, HDAC8, fungal RPD3, HOS1 and HOS2, plant HDA9, protist, archaeal and bacterial (AcuC) deacetylases. Members of this class are involved in cell cycle regulation, DNA damage response, embryonic development, cytokine signaling important for immune response and in posttranslational control of the acetyl coenzyme A synthetase. In mammals, they are known to be involved in progression of various tumors. Specific inhibitors of mammalian histone deacetylases are an emerging class of promising novel anticancer drugs.	306
212518	cd09992	HDAC_classII	Histone deacetylases and histone-like deacetylases, classII. Class II histone deacetylases are Zn-dependent enzymes that catalyze hydrolysis of N(6)-acetyl-lysine residues of histones (EC 3.5.1.98) and possibly other proteins to yield deacetylated histones/other proteins. This group includes animal HDAC4,5,6,7,8,9,10, fungal HOS3 and HDA1, plant HDA5 and HDA15 as well as other eukaryotes, archaeal and bacterial histone-like deacetylases. Eukaryotic deacetylases mostly use histones (H2, H3, H4) as substrates for deacetylation; however, non-histone substrates are known (for example, tubulin). Substrates for prokaryotic histone-like deacetylases are not known. Histone acetylation/deacetylation process is important for mediation of transcriptional regulation of many genes. Histone deacetylases usually act via association with DNA binding proteins to target specific chromatin regions. Interaction partners of class II deacetylases include 14-3-3 proteins, MEF2 family of transcriptional factors, CtBP, calmodulin (CaM), SMRT, N-CoR, BCL6, HP1alpha and SUMO. Histone deacetylases play a role in the regulation of cell cycle, cell differentiation and survival. Class II mammalian HDACs are differentially inhibited by structurally diverse compounds with known antitumor activities, thus presenting them as potential drug targets for human diseases resulting from aberrant acetylation.	291
212519	cd09993	HDAC_classIV	Histone deacetylase class IV also known as histone deacetylase 11. Class IV histone deacetylases (HDAC11; EC 3.5.1.98) are predicted Zn-dependent enzymes. This class includes animal HDAC11, plant HDA2 and related bacterial deacetylases. Enzymes in this subfamily participate in regulation of a number of different processes through protein modification (deacetylation). They catalyze hydrolysis of N(6)-acetyl-lysine of histones (or other proteins) to yield deacetylated proteins. Histone deacetylases often act as members of large multi-protein complexes such as mSin3A or SMRT/N-CoR. Human HDAC11 does not associate with them but can interact with HDAC6 in vivo. It has been suggested that HDAC11 and HDAC6 may use non-histone proteins as their substrates and play a role other than to directly modulate chromatin structure. In normal tissues, expression of HDAC11 is limited to kidney, heart, brain, skeletal muscle and testis, suggesting that its function might be tissue-specific. In mammals, HDAC11 proteins are known to be involved in progression of various tumors. HDAC11 plays an essential role in regulating OX40 ligand (OX40L) expression in Hodgkin lymphoma (HL); selective inhibition of HDAC11 expression significantly up-regulates OX40L and induces apoptosis in HL cell lines. Thus, inhibition of HDAC11 could be a therapeutic drug option for antitumor immune response in HL patients.	275
212520	cd09994	HDAC_AcuC_like	Class I histone deacetylase AcuC (Acetoin utilization protein)-like enzymes. AcuC (Acetoin utilization protein) is a class I deacetylase found only in bacteria and is involved in post-translational control of the acetyl-coenzyme A synthetase (AcsA). Deacetylase AcuC works in coordination with deacetylase SrtN (class III), possibly to maintain AcsA in active (deacetylated) form and let the cell grow under low concentration of acetate. B. subtilis AcuC is a member of operon acuABC; this operon is repressed by the presence of glucose and does not show induction by acetoin; acetoin is a bacterial fermentation product that can be converted to acetate via the butanediol cycle in absence of other carbon sources. Inactivation of AcuC leads to slower growth and lower cell yield under low-acetate conditions in Bacillus subtilis. In general, Class I histone deacetylases (HDACs) are Zn-dependent enzymes that catalyze hydrolysis of N(6)-acetyl-lysine residues in histone amino termini to yield a deacetylated histone (EC 3.5.1.98). Enzymes belonging to this group participate in regulation of a number of processes through protein (mostly different histones) modification (deacetylation). Class I histone deacetylases in general act via the formation of large multiprotein complexes. Members of this class are involved in cell cycle regulation, DNA damage response, embryonic development, cytokine signaling important for immune response and in posttranslational control of the acetyl coenzyme A synthetase.	313
212521	cd09996	HDAC_classII_1	Histone deacetylases and histone-like deacetylases, classII. This subfamily includes bacterial as well as eukaryotic Class II histone deacetylase (HDAC) and related proteins. Deacetylases of class II are Zn-dependent enzymes that catalyze hydrolysis of N(6)-acetyl-lysine residues of histones (EC 3.5.1.98) and possibly other proteins to yield deacetylated histones/other proteins. Included in this family is a bacterial HDAC-like amidohydrolase (Bordetella/Alcaligenes species FB18817, denoted as FB188 HDAH) shown to be most similar in sequence and function to class II HDAC6 domain 3 or b (HDAC6b). FB188 HDAH is able to remove the acetyl moiety from acetylated histones, and can be inhibited by common HDAC inhibitors such as SAHA (suberoylanilide hydroxamic acid) as well as class II-specific but not class I specific inhibitors.	359
212522	cd09998	HDAC_Hos3	Class II histone deacetylases Hos3 and related proteins. Fungal histone deacetylase Hos3 from Saccharomyces cerevisiae is a Zn-dependent enzyme belonging to HDAC class II. It catalyzes hydrolysis of an N(6)-acetyl-lysine residue of a histone to yield a deacetylated histone (EC 3.5.1.98). Histone acetylation/deacetylation process is important for mediation of transcriptional regulation of many genes. Histone deacetylases usually act via association with DNA binding proteins to target specific chromatin regions. Hos3 deacetylase is a homodimer; in vitro it shows specificity for H4, H3 and H2A.	353
212523	cd09999	Arginase-like_1	Arginase-like amidino hydrolase family. This family includes arginase, also known as arginase-like amidino hydrolase family, as well as arginase-like proteins; its members are found in bacteria, archaea and eukaryotes, but the family does not include metazoan arginases. Arginase is a binuclear Mn-dependent metalloenzyme and catalyzes hydrolysis of L-arginine to L-ornithine and urea (Arg, EC 3.5.3.1), the reaction being the fifth and final step in the urea cycle, providing the path for the disposal of nitrogenous compounds. Arginase controls cellular levels of arginine and ornithine which are involved in protein biosynthesis, and in production of creatine, polyamines, proline and nitric oxide.	272
212524	cd10000	HDAC8	Histone deacetylase 8 (HDAC8). HDAC8 is a Zn-dependent class I histone deacetylase that catalyzes hydrolysis of an N(6)-acetyl-lysine residue of a histone to yield a deacetylated histone (EC 3.5.1.98). Histone acetylation/deacetylation process is important for mediation of transcriptional regulation of many genes. Histone deacetylases usually act via association with DNA binding proteins to target specific chromatin regions. HDAC8 is found in human cytoskeleton-bound protein fraction and insoluble cell pellets. It plays a crucial role in intramembranous bone formation; germline deletion of HDAC8 is detrimental to skull bone formation. HDAC8 is possibly associated with the smooth muscle actin cytoskeleton and may regulate the contractile capacity of smooth muscle cells. HDAC8 is also involved in the metabolic control of the estrogen receptor related receptor (ERR)-alpha/peroxisome proliferator activated receptor (PPAR) gamma coactivator 1 alpha (PGC1-alpha) transcriptional complex as well as in the development of neuroblastoma and T-cell lymphoma. HDAC8-selective small-molecule inhibitors could be a therapeutic drug option for these diseases.	364
212525	cd10001	HDAC_classII_APAH	Histone deacetylase class IIa. This subfamily includes bacterial acetylpolyamine amidohydrolase (APAH) as well as other Class II histone deacetylase (HDAC) and related proteins. Deacetylases of class II are Zn-dependent enzymes that catalyze hydrolysis of N(6)-acetyl-lysine residues of histones (EC 3.5.1.98) and possibly other proteins to yield deacetylated histones/other proteins. Mycoplana ramosa APAH exhibits broad substrate specificity and catalyzes the deacetylation of polyamines such as putrescine, spermidine, and spermine by cleavage of a non-peptide amide bond.	298
212526	cd10002	HDAC10_HDAC6-dom1	Histone deacetylase 6, domain 1 and histone deacetylase 10. Histone deacetylases 6 and 10 are class IIb Zn-dependent enzymes that catalyze hydrolysis of N(6)-acetyl-lysine of a histone to yield a deacetylated histone (EC 3.5.1.98). Histone acetylation/deacetylation process is important for mediation of transcriptional regulation of many genes. HDACs usually act via association with DNA binding proteins to target specific chromatin regions. HDAC6 is the only histone deacetylase with internal duplication of two catalytic domains which appear to function independently of each other, and also has a C-terminal ubiquitin-binding domain. It is located in the cytoplasm and associates with microtubule motor complex, functioning as the tubulin deacetylase and regulating microtubule-dependent cell motility. HDAC10 has an N-terminal deacetylase domain and a C-terminal pseudo-repeat that shares significant similarity with its catalytic domain. It is located in the nucleus and cytoplasm, and is involved in regulation of melanogenesis. It transcriptionally down-regulates thioredoxin-interacting protein (TXNIP), leading to altered reactive oxygen species (ROS) signaling in human gastric cancer cells. Known interaction partners of HDAC6 are alpha tubulin (substrate) and ubiquitin-like modifier FAT10 (also known as Ubiquitin D or UBD) while interaction partners of HDAC10 are Pax3, KAP1, hsc70 and HDAC3 proteins.	336
212527	cd10003	HDAC6-dom2	Histone deacetylase 6, domain 2. Histone deacetylase 6 is a class IIb Zn-dependent enzyme that catalyzes hydrolysis of N(6)-acetyl-lysine residue of a histone to yield a deacetylated histone (EC 3.5.1.98). Histone acetylation/deacetylation process is important for mediation of transcriptional regulation of many genes. HDACs usually act via association with DNA binding proteins to target specific chromatin regions. HDAC6 is the only histone deacetylase with internal duplication of two catalytic domains which appear to function independently of each other, and also has a C-terminal ubiquitin-binding domain. It is located in the cytoplasm and associates with microtubule motor complex, functioning as the tubulin deacetylase and regulating microtubule-dependent cell motility. Known interaction partners of HDAC6 are alpha tubulin and ubiquitin-like modifier FAT10 (also known as Ubiquitin D or UBD).	350
212528	cd10004	RPD3-like	reduced potassium dependency-3 (RPD3)-like. Proteins of the Rpd3-like family are class I Zn-dependent Histone deacetylases that catalyze hydrolysis of an N(6)-acetyl-lysine residue of a histone to yield a deacetylated histone (EC 3.5.1.98). RPD3 is the yeast homolog of class I HDACs. The main function of RPD3-like group members is regulation of a number of different processes through protein (mostly different histones) modification (deacetylation). This group includes fungal RPD3 and acts via the formation of large multiprotein complexes. Members of this group are involved in cell cycle regulation, DNA damage response, embryonic development and cytokine signaling important for immune response. Histone deacetylation by yeast RPD3 represses genes regulated by the Ash1 and Ume6 DNA-binding proteins. In mammals, they are known to be involved in progression of various tumors. Specific inhibitors of mammalian histone deacetylases could be a therapeutic drug option.	375
212529	cd10005	HDAC3	Histone deacetylase 3 (HDAC3). HDAC3 is a Zn-dependent class I histone deacetylase that catalyzes hydrolysis of N(6)-acetyl-lysine residue of a histone to yield a deacetylated histone (EC 3.5.1.98). Histone acetylation/deacetylation process is important for mediation of transcriptional regulation of many genes. In order to target specific chromatin regions, HDAC3 can interact with DNA-binding proteins (transcription factors) either directly or after forming complexes with a number of other proteins, as observed for the SMRT/N-CoR complex which recruits human HDAC3 to specific chromatin loci and activates deacetylation. Human HDAC3 is also involved in deacetylation of non-histone substrates such as RelA, SPY and p53 factors. This protein can also down-regulate p53 function and subsequently modulate cell growth and apoptosis. This gene is therefore regarded as a potential tumor suppressor gene. HDAC3 plays a role in various physiological processes, including subcellular protein localization, cell cycle progression, cell differentiation, apoptosis and survival. HDAC3 has been found to be overexpressed in some tumors including leukemia, lung carcinoma, colon cancer and maxillary carcinoma. Thus, inhibitors precisely targeting HDAC3 (in some cases together with retinoic acid or hyperthermia) could be a therapeutic drug option.	381
212530	cd10006	HDAC4	Histone deacetylase 4. Histone deacetylase 4 is a class IIa Zn-dependent enzyme that catalyzes hydrolysis of an N(6)-acetyl-lysine residue of a histone to yield a deacetylated histone (EC 3.5.1.98). Histone acetylation/deacetylation process is important for mediation of transcriptional regulation of many genes. Histone deacetylases usually act via association with DNA binding proteins to target specific chromatin regions. Class IIa histone deacetylases are signal-dependent co-repressors, having N-terminal regulatory domain with two or three conserved serine residues; phosphorylation of these residues is important for ability to shuttle between the nucleus and cytoplasm and act as transcriptional co-repressors. HDAC4 participates in regulation of chondrocyte hypertrophy and skeletogenesis. However, biological substrates for HDAC4 have not been identified; only low lysine deacetylation activity has been demonstrated and active site mutant has enhanced activity toward acetylated lysines. HDAC4 does not bind DNA directly, but through transcription factors MEF2C (myocyte enhancer factor-2C) and MEF2D. Other known interaction partners of the protein are 14-3-3 proteins, SMRT and N-CoR co-repressors, BCL6, HP1, SUMO-1 ubiquitin-like protein, and ANKRA2. It appears to interact in a multiprotein complex with RbAp48 and HDAC3. Furthermore, HDAC4 is required for TGFbeta1-induced myofibroblastic differentiation.	409
212531	cd10007	HDAC5	Histone deacetylase 5. Histone deacetylase 5 is a class IIa Zn-dependent enzyme that catalyzes hydrolysis of an N(6)-acetyl-lysine residue of a histone to yield a deacetylated histone (EC 3.5.1.98). Histone acetylation/deacetylation process is important for mediation of transcriptional regulation of many genes. Histone deacetylases usually act via association with DNA binding proteins to target specific chromatin regions. Class IIa histone deacetylases are signal-dependent co-repressors, having N-terminal regulatory domain with two or three conserved serine residues; phosphorylation of these residues is important for ability to shuttle between the nucleus and cytoplasm and act as transcriptional co-repressors. HDAC5 is involved in integration of chronic drug (cocaine) addiction and depression with changes in chromatin structure and gene expression; cocaine regulates HDAC5 function to antagonize the rewarding impact of cocaine, possibly by blocking drug-stimulated gene expression that supports drug-induced behavioral change. It is also involved in regulation of angiogenesis and cell cycle as well as immune system development. HDAC5 and HDAC9 have been found to be significantly up-regulated in high-risk medulloblastoma compared with low-risk and may potentially be novel drug targets.	420
212532	cd10008	HDAC7	Histone deacetylase 7. Histone deacetylase 7 is a class IIa Zn-dependent enzyme that catalyzes hydrolysis of an N(6)-acetyl-lysine residue of a histone to yield a deacetylated histone (EC 3.5.1.98). Histone acetylation/deacetylation process is important for mediation of transcriptional regulation of many genes. Histone deacetylases usually act via association with DNA binding proteins to target specific chromatin regions. Class IIa histone deacetylases are signal-dependent co-repressors, having N-terminal regulatory domain with two or three conserved serine residues; phosphorylation of these residues is important for ability to shuttle between the nucleus and cytoplasm and act as transcriptional co-repressors. HDAC7 is involved in regulation of myocyte migration and differentiation. Known interaction partners of class IIa HDAC7 are myocyte enhancer factors - MEF2A, -2C, and -2D, 14-3-3 proteins, SMRT and N-CoR co-repressors, HDAC3, ETA (endothelin receptor). This enzyme is also involved in the development of the immune system as well as brain and heart development. Multiple alternatively spliced transcript variants encoding several isoforms have been found for this gene.	378
212533	cd10009	HDAC9	Histone deacetylase 9. Histone deacetylase 9 is a class IIa Zn-dependent enzyme that catalyzes hydrolysis of an N(6)-acetyl-lysine residue of a histone to yield a deacetylated histone (EC 3.5.1.98). Histone acetylation/deacetylation process is important for mediation of transcriptional regulation of many genes. Histone deacetylases usually act via association with DNA binding proteins to target specific chromatin regions. Class IIa histone deacetylases are signal-dependent co-repressors, having N-terminal regulatory domain with two or three conserved serine residues; phosphorylation of these residues is important for ability to shuttle between the nucleus and cytoplasm and act as transcriptional co-repressors. HDAC9 is involved in regulation of gene expression and dendritic growth in developing cortical neurons. It also plays a role in hematopoiesis. Its deregulated expression may be associated with some human cancers. HDAC5 and HDAC9 have been found to be significantly up-regulated in high-risk medulloblastoma compared with low-risk and may potentially be novel drug targets.	379
212534	cd10010	HDAC1	Histone deacetylase 1 (HDAC1). Histone deacetylase 1 (HDAC1) is a Zn-dependent class I enzyme that catalyzes hydrolysis of N(6)-acetyl-lysine residue of a histone to yield a deacetylated histone (EC 3.5.1.98). Histone acetylation/deacetylation process is important for mediation of transcriptional regulation of many genes. HDAC1 is involved in regulation through association with DNA binding proteins to target specific chromatin regions. In particular, HDAC1 appears to play a major role in pre-implantation embryogenesis in establishing a repressive chromatin state. Its interaction with retinoblastoma tumor-suppressor protein is essential in the control of cell proliferation and differentiation. Together with metastasis-associated protein-2 (MTA2), it deacetylates p53, thereby modulating its effect on cell growth and apoptosis. It participates in DNA-damage response, along with HDAC2; together, they promote DNA non-homologous end-joining. HDAC1 is also involved in tumorigenesis; its overexpression modulates cancer progression. Specific inhibitors of HDAC1 are currently used in cancer therapy.	371
212535	cd10011	HDAC2	Histone deacetylase 2 (HDAC2). Histone deacetylase 2 (HDAC2) is a Zn-dependent class I enzyme that catalyzes hydrolysis of N(6)-acetyl-lysine residue of a histone to yield a deacetylated histone (EC 3.5.1.98). Histone acetylation/deacetylation process is important for mediation of transcriptional regulation of many genes. HDAC2 is involved in regulation through association with DNA binding proteins to target specific chromatin regions. It forms transcriptional repressor complexes by associating with several proteins, including the mammalian zinc-finger transcription factor YY1, thus playing an important role in transcriptional regulation, cell cycle progression and developmental events. Additionally, a few non-histone HDAC2 substrates have been found. HDAC2 plays a role in embryonic development and cytokine signaling important for immune response, and is over-expressed in several solid tumors including oral, prostate, ovarian, endometrial and gastric cancer. It participates in DNA-damage response, along with HDAC1; together, they can promote DNA non-homologous end-joining. HDAC2 is considered an important cancer prognostic marker. Inhibitors specifically targeting HDAC2 could be a therapeutic drug option.	366
193609	cd10013	Cas3''_I	CRISPR/Cas system-associated protein Cas3''. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; HD-like nuclease, specifically digesting double-stranded oligonucleotides and preferably cleaving at G:C pairs; signature gene for Type I	188
199900	cd10014	TFIIA_gamma_C	Gamma subunit of transcription initiation factor IIA, C-terminal domain. Transcription factor II A (TFIIA) is one of the general transcription factors for RNA polymerase II. TFIIA increases the affinity of the TATA-binding protein (TBP) for DNA, in order to assemble the initiation complex. TFIIA also functions as an activator during development and differentiation, and is involved in transcription from TATA-less promoters. TFIIA is composed of more than one subunit in various organisms. Mammalian TFIIA large subunits (TFIIA alpha and beta), and the smaller subunit (TFIIA gamma) form a heterotrimer. TFIIA alpha and beta are encoded by a single TFIIA_alpha_beta gene and post-translationally processed and cleaved. TOA1 and TOA2 are the two subunits of Yeast TFIIA which correspond to Mammalian TFIIA_alpha_beta and TFIIA gamma, respectively. TOA1 and TOA2 form a heterodimeric protein complex. The TFIIA gamma subunit is highly conserved between humans, Drosophila and yeast and it is required for TFIIA function. The C-terminal domain of the gamma (TFIIA_gamma_C) subunit forms a beta-barrel structure together with TFIIA beta.	47
197381	cd10015	BfiI_C_EcoRII_N_B3	DNA binding domains of BfiI, EcoRII and plant B3 proteins. This family contains the N-terminal DNA binding domain of type IIE restriction endonuclease EcoRII-like proteins, the C-terminal DNA binding  domain of type IIS restriction endonuclease BfiI-like proteins and plant-specific B3 proteins. Type II restriction endonucleases are components of restriction modification (RM) systems that protect bacteria and archaea against invading foreign DNA. They usually function as homodimers or homotetramers that cleave DNA at defined sites of 4 to 8 bp in length, and they require Mg2+, not ATP or GTP, for catalysis. EcoRII is specific for the 5'-CCWGG sequence (W stands for A or T). EcoRII consists of 2 domains, the C-terminal catalytic/dimerization domain (EcoRII-C), and the N-terminal effector DNA binding domain (EcoRII-N). BfiI is unique in cleaving DNA at fixed positions downstream of an asymmetric sequence in the absence of Mg2+. BfiI consists of two discrete domains with distinct functions: an N-terminal catalytic domain with non-specific nuclease activity and dimerization function that is more closely related to Nuc, an EDTA-resistant nuclease from the phospholipase D (PLD) superfamily; and a C-terminal domain that specifically recognizes its target sequences, 5'-ACTGGG-3'. B3 proteins are a family of plant-specific transcription factors, involved in a great variety of processes, including seed development and auxin response.	109
197382	cd10016	EcoRII_N	N-terminal domain of type IIE restriction endonuclease EcoRII and similar proteins. N-terminal domain of type IIE restriction endonuclease EcoRII and similar proteins. Type II restriction endonucleases are components of restriction modification (RM) systems that protect bacteria and archaea against invading foreign DNA. They usually function as homodimers or homotetramers that cleave DNA at defined sites of 4 to 8 bp in length, and they require Mg2+,  not ATP or GTP, for catalysis. EcoRII is specific for the 5'-CCWGG sequence (W stands for A or T). EcoRII consists of 2 domains, the C-terminal catalytic/dimerization domain (EcoRII-C), and the N-terminal effector DNA binding domain (EcoRII-N). To be catalytically active, EcoRII has to form a dimer.	142
197383	cd10017	B3_DNA	Plant-specific B3-DNA binding domain. The plant-specific B3 DNA binding domain superfamily includes the well-characterized auxin response factor (ARF) and the LAV (Leafy cotyledon2 [LEC2]-Abscisic acid insensitive3 [ABI3]-VAL) families, as well as the RAV (Related to ABI3 and VP1) and REM (REproductive Meristem) families. LEC2 and ABI3 have been shown to be involved in seed development, while other members of the LAV family seem to have a more general role, being expressed in many organs during plant development. Members of the ARF family bind to the auxin response element and, depending on the presence of an activation or repression domain, they activate or repress transcription. RAV and REM families are less studied B3 protein families.	98
197384	cd10018	BfiI_C	C-terminal domain of type IIs restriction endonuclease BfiI and similar proteins. C-terminal domain of a novel type IIs restriction endonuclease BfiI and similar proteins. Type II restriction endonucleases are components of restriction modification (RM) systems that protect bacteria and archaea against invading foreign DNA. They usually function as homodimers or homotetramers that cleave DNA at defined sites of 4 to 8 bp in length, and they require Mg2+, not ATP or GTP, for catalysis. Unlike all other restriction enzymes known to date, BfiI is unique in cleaving DNA at fixed positions downstream of an asymmetric sequence in the absence of Mg2+. BfiI consists of two discrete domains with distinct functions: an N-terminal catalytic domain with non-specific nuclease activity and dimerization function that is more closely related to Nuc, an EDTA-resistant nuclease from the phospholipase D (PLD) superfamily; and a C-terminal domain that specifically recognizes its target sequences, 5'-ACTGGG-3'. BfiI presumably evolved through domain fusion of a DNA recognition domain to the catalytic Nuc-like domain from the PLD superfamily. BfiI forms a functionally active homodimer which has two DNA-binding surfaces located at the C-terminal domains but only one active site, located at the dimer interface between the two N-terminal catalytic domains.	157
206756	cd10019	14-3-3_sigma	14-3-3 sigma, an isoform of 14-3-3 protein. 14-3-3 protein sigma isoform, also known as stratifin or human mammary epithelial marker (HME) 1, has been most directly linked to tumor development. In humans, it is expressed by the SFN gene, strictly in stratified squamous epithelial cells in response to DNA damage where it is transcriptionally induced in a p53-dependent manner, subsequently causing cell-cycle arrest at the G2/M checkpoint. Up-regulation and down-regulation of 14-3-3 sigma expression have both been described in tumors. For example, in human breast cancer, 14-3-3 sigma is predominantly down-regulated by CpG methylation, acting as both a tumor suppressor and a prognostic indicator, while in human scirrhous-type gastric carcinoma (SGC), it is up-regulated and may play an important role in SGC carcinogenesis and progression. Loss of 14-3-3 sigma expression sensitizes tumor cells to treatment with conventional cytostatic drugs, making this protein an attractive therapeutic target. 14-3-3 domains are an essential part of 14-3-3 proteins, a ubiquitous class of regulatory, phosphoserine/threonine-binding proteins found in all eukaryotic cells, including yeast, protozoa and mammalian cells.	242
206757	cd10020	14-3-3_epsilon	14-3-3 epsilon, an isoform of 14-3-3 protein. 14-3-3 protein epsilon isoform (also known as tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein, epsilon polypeptide) is encoded by the YWHAE gene in humans and is involved in cancer cell survival and growth. It interacts with CDC25 phosphatases, RAF1 and IRS1 proteins, suggesting its role in diverse biochemical activities related to signal transduction, such as cell division and regulation of insulin sensitivity. Overexpression of 14-3-3 epsilon in primary hepatocellular carcinoma (HCC) tissues predicts a high risk of extrahepatic metastasis and worse survival, and is a potential therapeutic target. It has also been implicated in the pathogenesis of small cell lung cancer. 14-3-3 epsilon overexpression protects colorectal cancer and endothelial cells from oxidative stress-induced apoptosis, while its suppression by non-steroidal anti-inflammatory drugs induces cancer and endothelial cell death. Cellular levels of 14-3-3 epsilon could possibly serve as an important regulator of cell survival in response to oxidative stress and other death signals. 14-3-3 domains are an essential part of 14-3-3 proteins, a ubiquitous class of regulatory, phosphoserine/threonine-binding proteins found in all eukaryotic cells, including yeast, protozoa and mammalian cells.	230
206758	cd10022	14-3-3_beta_zeta	14-3-3 beta and zeta isoforms of 14-3-3 protein. 14-3-3 protein beta and zeta isoform (also known as tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein, beta and zeta polypeptide) are encoded by the YWHAB gene and YWHAZ gene in humans. They have been linked to mitogenic signaling and the cell cycle machinery, and to cancer initiation and progression, respectively. The beta isoform has been shown to interact with RAF1 and CDC25 phosphatases and its overexpression is associated with invasion, migration, metastasis and proliferation of tumor cells and its elevated levels are correlated with tumor size, the number of lymph node metastases and a reduced survival rate. It is significantly overexpressed in lung cancer tissues, mutated chronic lymphocytic leukemia (M-CLL), gastric cancer tissues, aflatoxin B1-induced rat hepatocellular carcinoma K1 and K2 cells, as well as renal cell carcinoma cysts, and can potentially be used as a diagnostic and prognostic biomarker in the cancer. Numerous proteins involved in anti-apoptosis and tumor progression were also found to be differentially expressed in gastric cancer cells where 14-3-3 beta is overexpressed. 14-3-3 beta also interacts with human Dapper1 (hDpr1), a key negative regulator of Wnt signaling, via hDpr1 phosphorylation by protein kinase A, thus attenuating the ability of hDpr1 to promote Dishevelled (Dvl) degradation, and subsequently enhancing Wnt signaling. The zeta isoform is ubiquitously expressed and localized to most subcellular regions, including the cytoplasm, plasma membrane, mitochondria, and nucleus. Its overexpression and gene amplification in multiple cancers are correlated with poor prognosis and chemoresistance in cancer patients. 14-3-3 zeta has been identified as a biomarker with high sensitivity and specificity for diagnosis and prognosis in multiple tumor types, including hepatocellular carcinoma, head and neck cancer, indicating a potential clinical application for using 14-3-3 zeta in selecting treatment options and predicting cancer outcome. It also interacts with IRS1 protein, suggesting a role in regulating insulin sensitivity. 14-3-3 domains are an essential part of 14-3-3 proteins, a ubiquitous class of regulatory, phosphoserine/threonine-binding proteins found in all eukaryotic cells, including yeast, protozoa and mammalian cells.	229
206759	cd10023	14-3-3_theta	14-3-3 theta/tau (theta in mice, tau in human), an isoform of 14-3-3 protein. 14-3-3 tau/theta (tau in humans, theta in mice) isoform (also known as tyrosine 3-monooxygenase/ tryptophan 5-monooxygenase activation protein, theta polypeptide) is encoded by the YWHAQ gene in humans and plays an important role in controlling apoptosis through interactions with ASK1, c-jun NH-terminal kinase, and p38 mitogen-activated protein kinase (MAPK). Its interaction with CDC25c regulates entry into the cell cycle and subsequent interaction with Bad prevents apoptosis. 14-3-3 theta protein expression is induced in patients with amyotrophic lateral sclerosis. 14-3-3 tau is often overexpressed in breast cancer, which is associated with the downregulation of p21, a p53 target gene, and thus leads to tamoxifen resistance in MCF7 breast cancer cells and shorter patient survival. Therefore, 14-3-3 tau may be a potential therapeutic target in breast cancer. Additionally, 14-3-3 theta mediates nucleocytoplasmic shuttling of the coronavirus nucleocapsid protein which causes severe acute respiratory syndrome. 14-3-3 domain is an essential part of 14-3-3 proteins, a ubiquitous class of regulatory, phosphoserine/threonine-binding proteins found in all eukaryotic cells, including yeast, protozoa and mammalian cells.	234
206760	cd10024	14-3-3_gamma	14-3-3 gamma, an isoform of 14-3-3 protein. 14-3-3 gamma isoform (also known as tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein, gamma polypeptide) is encoded by the YWHAG gene in humans and is induced by growth factors in human vascular smooth muscle cells. It is also highly expressed in skeletal and heart muscles, suggesting an important role in muscle tissue. It has been shown to interact with RAF1 and protein kinase C, proteins involved in various signal transduction pathways. 14-3-3 gamma mediates Cdc25A proteolysis to block premature mitotic entry after DNA damage. 14-3-3 gamma mediates the interaction between Chk1 and Cdc25A; this complex has an essential function in Cdc25A phosphorylation and degradation to block premature mitotic entry after DNA damage. Increased expression of 14-3-3 gamma in lung cancer coincides with loss of functional p53, possibly in a cooperative manner promoting genomic instability. Also, during cell cycle, 14-3-3 gamma protects p21, a cyclin-dependent kinase inhibitor, from degradation mediated by the p53 suppressor MDMX, which may account for elevation of p21 levels independent of p53 and in response to DNA damage. Elevated expression of 14-3-3 gamma in human hepatocellular carcinoma predicts extrahepatic metastasis and worse survival, thus making this protein a candidate biomarker and a potential target for novel therapies against the disease.	246
206761	cd10025	14-3-3_eta	14-3-3 eta, an isoform of 14-3-3 protein. 14-3-3 eta isoform (also known as tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein, eta polypeptide) is expressed mainly in brain, and is involved in hypothalamic-pituitary-adrenocortical (HPA) axis regulation. In humans, it is encoded by the YWHAH gene, and is a positional and functional candidate for schizophrenia as well as bipolar disorder (BP). This gene contains a 7 bp repeat sequence in its 5' Untranslated Region (UTR), and early-onset schizophrenia has been associated with changes in the number of this repeat. 14-3-3 eta and gamma are found in the serum and synovial fluid of patients with joint inflammation. Specifically, 14-3-3 eta, which plays a regulatory role in chondrogenic differentiation, is significantly overexpressed in juvenile rheumatoid arthritis (JRA), a chronic inflammatory disease often associated with growth impairment. Overexpression of Gremlin 1, the bone morphogenetic protein antagonist, may play an oncogenic role in carcinomas of the uterine cervix, lung, ovary, kidney, breast, colon, pancreas, and sarcoma, since it functions by interaction with the 14-3-3 eta domain. Therefore, Gremlin 1 and its binding protein 14-3-3 eta could be appropriate targets for developing diagnostic and therapeutic strategies against human cancers. 14-3-3 domain is an essential part of 14-3-3 proteins, a ubiquitous class of regulatory, phosphoserine/threonine-binding proteins found in all eukaryotic cells, including yeast, protozoa and mammalian cells.	239
206762	cd10026	14-3-3_plant	Plant 14-3-3 protein domain. Plant 14-3-3 isoforms, similar to their highly conserved homologs in mammals, bind to phosphorylated target proteins to modulate their function. They have been implicated in a variety of physiological functions; in particular, abiotic and biotic stress responses, primary metabolism, as well as various aspects of plant growth and development. They function through the regulation of a diverse range of proteins including transcription factors, kinases, structural proteins, ion channels as well as pathogen defense-related proteins. The 14-3-3 proteins are affected transcriptionally as well as functionally by the environment of the plant, both intracellular and extracellular, thus playing a key role in the response to environmental stress, pathogens and light conditions. Plant 14-3-3 proteins have been divided into epsilon-like groups and non-epsilon groups based on phylogenetic clustering. They have a varying number of isoforms (for example, Arabidopsis has thirteen known protein isoforms, cotton has six) with variation in their affinity for specific binding partners, suggesting specific roles in specific processes.	237
381678	cd10027	UDG-F1-like	Uracil DNA glycosylase family 1 subfamily, includes Human uracil DNA glycosylase and similar proteins. Uracil DNA glycosylase family 1 is the most efficient of all uracil-DNA glycosylases (UDGs, also known as UNGs) and shows a specificity for uracil in DNA. UDG catalyzes the removal of uracil from DNA to initiate the DNA base excision repair pathway. Uracil in DNA can arise as a result of mis-incorporation of dUMP residues by DNA polymerase or deamination of cytosine. Uracil mispaired with guanine in DNA is one of the major pro-mutagenic events, causing G:C->A:T mutations. Thus, UDG is an essential enzyme for maintaining the integrity of genetic information. UDGs have been classified into various families on the basis of their substrate specificity, conserved motifs, and structural similarities. Although these families demonstrate different substrate specificities, often the function of one enzyme can be complemented by the other.	200
381679	cd10028	UDG-F2_TDG_MUG	Uracil DNA glycosylase family 2, includes thymine DNA glycosylase, mismatch-specific uracil DNA glycosylase and similar proteins. Uracil DNA glycosylase family 2 consists of thymine DNA glycosylase (TDG), which removes uracil and thymine from G:U and G:T mismatches in double-stranded DNA. It includes mismatch-specific uracil DNA glycosylase (MUG), the prokaryotic homolog of TDG. Escherichia coli MUG is highly specific to G:U mismatches but also repairs G:T mismatches at high enzyme concentration. Uracil-DNA glycosylases (UDGs) initiate repair of uracils in DNA. Uracil in DNA can arise as a result of misincorporation of dUMP residues by DNA polymerase or via deamination of cytosine. Uracil in DNA mispaired with guanine is one of the major pro-mutagenic events, causing G:C->A:T mutations. Thus, UDG is an essential enzyme for maintaining the integrity of genetic information. UDGs have been classified into various families on the basis of their substrate specificity, conserved motifs, and structural similarities. Although these families demonstrate different substrate specificities, often the function of one enzyme can be complemented by the other.	163
381680	cd10030	UDG-F4_TTUDGA_SPO1dp_like	Uracil DNA glycosylase family 4, includes Thermotoga maritima TTUDGA, Bacillus phage SPO1 DNA polymerase, and similar proteins. Uracil DNA glycosylase family 4 includes Thermotoga maritima TTUDGA, a robust uracil DNA glycosylase that shares narrow substrate specificity and high catalytic efficiency with family 1, acting on double-stranded and single-stranded uracil-containing DNA. Members of this family possess four conserved cysteine residues required to coordinate the [4Fe-4S] iron-sulfur cluster. This family also includes the N-terminal domain of Bacillus phage SPO1 DNA polymerase. Bacteriophage SPO1 is one of a group of large, lytic, tailed bacteriophages of Bacillus subtilis, and contains hydroxymethyluracil (hmUra) in place of thymine in their DNA. It has been speculated that this UDG domain may help discriminate between hmUra containing SPO1 DNA and thymine-containing host DNA. Uracil-DNA glycosylases (UDGs) initiate repair of uracils in DNA. Uracil in DNA can arise as a result of mis-incorporation of dUMP residues by DNA polymerase or via deamination of cytosine. Uracil in DNA mispaired with guanine is one of the major pro-mutagenic events, causing G:C->A:T mutations. Thus, UDG is an essential enzyme for maintaining the integrity of genetic information. UDGs have been classified into various families on the basis of their substrate specificity, conserved motifs, and structural similarities. Although these families demonstrate different substrate specificities, often the function of one enzyme can be complemented by the other.	165
381681	cd10031	UDG-F5_TTUDGB_like	Uracil DNA glycosylase family 5, includes Thermus thermophilus TTUDGB and similar proteins. Uracil DNA glycosylase family 5 includes Thermus thermophilus HB8 TTUDGB (also called UDGb) which is not only a UDG acting on double-stranded uracil-containing DNA, but also a hypoxanthine DNA glycosylase acting on double-stranded hypoxanthine-containing DNA (except for the C/I base pair), as well as a xanthine DNA glycosylase acting on both double-stranded and single-stranded xanthine-containing DNA. TTUDGB also excises thymine from G:T mismatched DNA, and removes analogs of uracil from DNA, including 5-hydroxymethyluracil (hmU) and 5-fluorouracil (fU). This subfamily also contains Bradyrhizobium diazoefficiens family 5 homolog Blr5068 (UdgB) which has been found to efficiently excise uracil from ssDNA and dsDNA. Uracil-DNA glycosylases (UDGs) initiate repair of uracils in DNA. Similar to family 4 UDGs, members of this family possess four conserved cysteine residues required to coordinate the [4Fe-4S] iron-sulfur cluster. Uracil in DNA can arise as a result of mis-incorporation of dUMP residues by DNA polymerase or via deamination of cytosine. Uracil in DNA mispaired with guanine is one of the major pro-mutagenic events, causing G:C->A:T mutations. Thus, UDG is an essential enzyme for maintaining the integrity of genetic information. UDGs have been classified into various families on the basis of their substrate specificity, conserved motifs, and structural similarities. Although these families demonstrate different substrate specificities, often the function of one enzyme can be complemented by the other.	204
381682	cd10032	UDG-F6_HDG	Uracil DNA glycosylase family 6, includes hypoxanthine-DNA glycosylase and similar proteins. Uracil DNA glycosylase family 6 hypoxanthine-DNA glycosylase (HDG) lacks any detectable UDG activity; it excises hypoxanthine, a deamination product of adenine, from double-stranded DNA. Uracil-DNA glycosylases (UDGs) initiate repair of uracils in DNA. Uracil in DNA can arise as a result of misincorporation of dUMP residues by DNA polymerase or via deamination of cytosine. Uracil in DNA mispaired with guanine is one of the major pro-mutagenic events, causing G:C->A:T mutations. Thus, UDG is an essential enzyme for maintaining the integrity of genetic information. UDGs have been classified into various families on the basis of their substrate specificity, conserved motifs, and structural similarities. Although these families demonstrate different substrate specificities, often the function of one enzyme can be complemented by the other.	141
381683	cd10033	UDG_like	uncharacterized family of the uracil-DNA glycosylase superfamily. Uracil-DNA glycosylases (UDGs) initiate repair of uracils in DNA. Uracil may arise from mis-incorporation of dUMP residues by DNA polymerase or via deamination of cytosine. Uracil in DNA mispaired with guanine is one of the major pro-mutagenic events, causing G:C->A:T mutations; thus, UDG is an essential enzyme for maintaining the integrity of genetic information. UDGs have been classified into various families on the basis of their substrate specificity, conserved motifs, and structural similarities. Although these families demonstrate different substrate specificities, often the function of one enzyme can be complemented by the other. UDG family 1 is the most efficient uracil-DNA glycosylase (UDG, also known as UNG) and shows a specificity for uracil in DNA. UDG family 2 includes thymine DNA glycosylase which removes uracil and thymine from G:U and G:T mismatches, and mismatch-specific uracil DNA glycosylase (MUG) which in Escherichia coli is highly specific to G:U mismatches but also repairs G:T mismatches at high enzyme concentration. UDG family 3 includes human SMUG1 which can remove uracil and its oxidized pyrimidine derivatives from single-stranded DNA and double-stranded DNA with a preference for single-stranded DNA. Pedobacter heparinus SMUG2, which is UDG family 3 SMUG1-like, displays catalytic activities towards DNA containing uracil or hypoxanthine/xanthine. UDG family 4 includes Thermotoga maritima TTUDGA, a robust UDG which, like family 1, acts on double-stranded and single-stranded uracil-containing DNA. UDG family 5 (UDGb) includes Thermus thermophilus HB8 TTUDGB which acts on double-stranded uracil-containing DNA; it is a hypoxanthine DNA glycosylase acting on double-stranded hypoxanthine-containing DNA except for the C/I base pair, as well as a xanthine DNA glycosylase which acts on both double-stranded and single-stranded xanthine-containing DNA. UDG family 6 hypoxanthine-DNA glycosylase lacks any detectable UDG activity; it excises hypoxanthine. Other UDG families include one represented by Bradyrhizobium diazoefficiens Blr0248 which prefers single-stranded DNA and removes uracil, 5-hydroxymethyl-uracil or xanthine from it.	171
381684	cd10034	UDG_BdiUng_like	Uracil DNA glycosylase family which includes Bradyrhizobium diazoefficiens Blr0248 (BdiUng) and similar proteins. Bradyrhizobium diazoefficiens (previously B. japonicum) Blr0248 uracil-DNA glycosylase (BdiUng) has broad substrate specificity, preferring single-stranded DNA and removing uracil, 5-hydroxymethyl-uracil or xanthine from it. BdiUng is impervious to inhibition by AP DNA, and Ugi protein that specifically inhibits conventional family 1 UDGs. Uracil-DNA glycosylases (UDGs) initiate repair of uracils in DNA. Uracil in DNA can arise as a result of mis-incorporation of dUMP residues by DNA polymerase or via deamination of cytosine. Uracil in DNA mispaired with guanine is one of the major pro-mutagenic events, causing G:C->A:T mutations. Thus, UDG is an essential enzyme for maintaining the integrity of genetic information. UDGs have been classified into various families on the basis of their substrate specificity, conserved motifs, and structural similarities. Although these families demonstrate different substrate specificities, often the function of one enzyme can be complemented by the other.	181
381685	cd10035	UDG_like	uncharacterized family of the uracil-DNA glycosylase superfamily. Uracil-DNA glycosylases (UDGs) initiate repair of uracils in DNA. Uracil may arise from misincorporation of dUMP residues by DNA polymerase or via deamination of cytosine. Uracil in DNA mispaired with guanine is one of the major pro-mutagenic events, causing G:C->A:T mutations; thus, UDG is an essential enzyme for maintaining the integrity of genetic information. UDGs have been classified into various families on the basis of their substrate specificity, conserved motifs, and structural similarities. Although these families demonstrate different substrate specificities, often the function of one enzyme can be complemented by the other. UDG family 1 is the most efficient uracil-DNA glycosylase (UDG, also known as UNG) and shows a specificity for uracil in DNA. UDG family 2 includes thymine DNA glycosylase which removes uracil and thymine from G:U and G:T mismatches, and mismatch-specific uracil DNA glycosylase (MUG) which in Escherichia coli is highly specific to G:U mismatches but also repairs G:T mismatches at high enzyme concentration. UDG family 3 includes human SMUG1 which can remove uracil and its oxidized pyrimidine derivatives from single-stranded DNA and double-stranded DNA with a preference for single-stranded DNA. Pedobacter heparinus SMUG2, which is UDG family 3 SMUG1-like, displays catalytic activities towards DNA containing uracil or hypoxanthine/xanthine. UDG family 4 includes Thermotoga maritima TTUDGA, a robust UDG which, like family 1, acts on double-stranded and single-stranded uracil-containing DNA. UDG family 5 (UDGb) includes Thermus thermophilus HB8 TTUDGB which acts on double-stranded uracil-containing DNA; it is a hypoxanthine DNA glycosylase acting on double-stranded hypoxanthine-containing DNA except for the C/I base pair, as well as a xanthine DNA glycosylase which acts on both double-stranded and single-stranded xanthine-containing DNA. UDG family 6 hypoxanthine-DNA glycosylase lacks any detectable UDG activity; it excises hypoxanthine. Other UDG families include one represented by Bradyrhizobium diazoefficiens Blr0248 which prefers single-stranded DNA and removes uracil, 5-hydroxymethyl-uracil or xanthine from it.	143
197344	cd10036	Reelin_subrepeat_Nt	Additional N-terminal subrepeat of reelin. Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. Some family members appear to have an additional subrepeat at the N-terminus as characterized in this model. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). Genetic deficiency of reelin, or ApoER2 and VLDLR, or Dab1, all exhibit the same phenotypes, including ataxia, cortical layer inversion and abnormal positioning patterns.	151
197345	cd10037	Reelin_repeat_1_subrepeat_1	N-terminal subrepeat of tandem repeat unit 1 of reelin and related proteins. Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the N-terminal subrepeat, which directly contacts the C-terminal subrepeat and the EGF domain in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1).	146
197346	cd10038	Reelin_repeat_2_subrepeat_1	N-terminal subrepeat of tandem repeat unit 2 of reelin and related proteins. Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the N-terminal subrepeat, which directly contacts the C-terminal subrepeat and the EGF domain in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1).	168
197347	cd10039	Reelin_repeat_3_subrepeat_1	N-terminal subrepeat of tandem repeat unit 3 of reelin and related proteins. Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the N-terminal subrepeat, which directly contacts the C-terminal subrepeat and the EGF domain in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1).	170
197348	cd10040	Reelin_repeat_4_subrepeat_1	N-terminal subrepeat of tandem repeat unit 4 of reelin and related proteins. Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the N-terminal subrepeat, which directly contacts the C-terminal subrepeat and the EGF domain in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1).	170
197349	cd10041	Reelin_repeat_5_subrepeat_1	N-terminal subrepeat of tandem repeat unit 5 of reelin and related proteins. Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the N-terminal subrepeat, which directly contacts the C-terminal subrepeat and the EGF domain in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1).	174
197350	cd10042	Reelin_repeat_6_subrepeat_1	N-terminal subrepeat of tandem repeat unit 6 of reelin and related proteins. Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the N-terminal subrepeat, which directly contacts the C-terminal subrepeat and the EGF domain in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1).	157
197351	cd10043	Reelin_repeat_7_subrepeat_1	N-terminal subrepeat of tandem repeat unit 7 of reelin and related proteins. Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the N-terminal subrepeat, which directly contacts the C-terminal subrepeat and the EGF domain in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1).	171
197352	cd10044	Reelin_repeat_8_subrepeat_1	N-terminal subrepeat of tandem repeat unit 8 of reelin and related proteins. Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the N-terminal subrepeat, which directly contacts the C-terminal subrepeat and the EGF domain in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1).	176
197353	cd10045	Reelin_repeat_1_subrepeat_2	C-terminal subrepeat of tandem repeat unit 1 of reelin and related proteins. Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the C-terminal subrepeat, which directly contacts the N-terminal subrepeat and the EGF domain in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1).	155
197354	cd10046	Reelin_repeat_2_subrepeat_2	C-terminal subrepeat of tandem repeat unit 2 of reelin and related proteins. Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the C-terminal subrepeat, which directly contacts the N-terminal subrepeat and the EGF domain in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1).	156
197355	cd10047	Reelin_repeat_3_subrepeat_2	C-terminal subrepeat of tandem repeat unit 3 of reelin and related proteins. Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the C-terminal subrepeat, which directly contacts the N-terminal subrepeat and the EGF domain in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1).	151
197356	cd10048	Reelin_repeat_4_subrepeat_2	C-terminal subrepeat of tandem repeat unit 4 of reelin and related proteins. Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the C-terminal subrepeat, which directly contacts the N-terminal subrepeat and the EGF domain in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1).	150
197357	cd10049	Reelin_repeat_5_subrepeat_2	C-terminal subrepeat of tandem repeat unit 5 of reelin and related proteins. Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the C-terminal subrepeat, which directly contacts the N-terminal subrepeat and the EGF domain in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1).	150
197358	cd10050	Reelin_repeat_6_subrepeat_2	C-terminal subrepeat of tandem repeat unit 6 of reelin and related proteins. Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the C-terminal subrepeat, which directly contacts the N-terminal subrepeat and the EGF domain in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1).	148
197359	cd10051	Reelin_repeat_7_subrepeat_2	C-terminal subrepeat of tandem repeat unit 7 of reelin and related proteins. Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the C-terminal subrepeat, which directly contacts the N-terminal subrepeat and the EGF domain in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1).	162
197360	cd10052	Reelin_repeat_8_subrepeat_2	C-terminal subrepeat of tandem repeat unit 8 of reelin and related proteins. Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the C-terminal subrepeat, which directly contacts the N-terminal subrepeat and the EGF domain in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1).	161
380782	cd10140	PFM_aerolysin_family	pore-forming module of aerolysin-type beta-barrel pore-forming proteins. Pore-forming proteins (PFPs) are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta pore-forming proteins (beta-PFPs) form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin). Members of this family include enterolobin, a cytolytic, inflammatory and insecticidal protein from the Brazilian tree Enterolobium contortisiliquum.	92
381074	cd10141	CopZ-like_Fer2_BFD-like	bacterioferritin-associated ferredoxin (BFD)-like [2Fe-2S]-binding domain of Archaeoglobus fulgidus CopZ, and similar proteins. Archaeoglobus fulgidus CopZ is a fusion of a redox-active domain (containing a mononuclear metal center and a [2Fe-2S] cluster) with a CXXC-containing copper-binding domain. It is a soluble Cu+ chaperone which delivers cytoplasmic Cu+ to the transmembrane metal-binding sites in the Cu+-ATPase CopA; CopA couples the hydrolysis of ATP to the efflux of cytoplasmic Cu+. In addition to CopZ, the BFD-like [2Fe-2S]-binding domain is found in a variety of other proteins including bacterioferritin-associated ferredoxin (BFD), and the large subunit of NADH-dependent nitrite reductase. It comprises a helix-turn-helix fold, and binds a [2Fe-2S] cluster via 4 highly-conserved Cys residues, found in loops between the alpha-helices. For the class of proteins having a BFD-like [2Fe-2S]-binding domain, the Cys residues are organized in a unique C-X2-C-X31-35-C-X2-9-C-arrangement. [2Fe-2S] clusters are sulfide-linked diiron centers, a primary role for which is electron transport.	58
408998	cd10142	HD_SAS6_N	N-terminal head domain found in spindle assembly abnormal protein 6 and similar proteins. Spindle assembly abnormal protein 6 (SAS6) is a central scaffolding component of the centrioles that ensures their 9-fold symmetry. It is required for centrosome biogenesis and duplication, and is required for both mother-centriole-dependent centriole duplication and deuterosome-dependent centriole amplification in multiciliated cells. It is also required for the recruitment of microcephaly protein STIL to the procentriole and for STIL-mediated centriole amplification. SAS6 is comprised of an N-terminal globular head domain, a centrally located coiled-coil domain, and a disordered C-terminus. These monomers homodimerize symmetrically, through two dimerization domains, the N-terminal globular head domains and long extended alpha-helical coiled-coil regions. These homodimers can self-assemble into a 9-fold symmetric cartwheel structure comprised of nine SAS6 homodimers associated via their head domains; the dimerized coiled-coil domains being the spokes, the central hub being the head domains. This model corresponds to the N-terminal head domain of SAS6, which is structurally related to other XRCC4-superfamily members, XRCC4, PAXX, XLF and CCDC61.	137
381748	cd10144	Peptidase_S74_CIMCD	Peptidase S74 family, C-terminal intramolecular chaperone domain of Escherichia coli phage K1F endosialidase and related proteins. This peptidase S74 family includes C-terminal intramolecular chaperone domain (CIMCD) of Escherichia coli phage K1F endosialidase, Bacillus phage GA-1 neck appendage protein, and Bacteriophage T5 L-shaped tail fibre. This domain acts as a molecular chaperone; during virus particle assembly, the CIMCD of phage tailspike proteins induces the homo-trimerization of phage tailspike proteins by chaperoning the formation of a triple beta-helix. Homo-trimeric phage tailspike proteins are then auto-cleaved by the CIMCD domain. This family also includes the peptidase S74 Intramolecular Chaperone Auto-processing (ICA) domain of mammalian Myrf. The ICA domain drives the homo-oligomerization of Myrf in the endoplasmic reticulum (ER) membrane. The homo-oligomeric Myrf is proteolyzed by the ICA domain, releasing its N-terminal fragments from the ER membrane.	113
199901	cd10145	TFIIA_gamma_N	Gamma subunit of transcription initiation factor IIA, N-terminal helical domain. Transcription factor II A (TFIIA) is one of the general transcription factors for RNA polymerase II. TFIIA increases the affinity of the TATA-binding protein (TBP) for DNA, in order to assemble the initiation complex. TFIIA also functions as an activator during development and differentiation, and is involved in transcription from TATA-less promoters. TFIIA is composed of more than one subunit in various organisms. Mammalian TFIIA large subunits (TFIIA alpha and beta), and the smaller subunit (TFIIA gamma) form a heterotrimer. TFIIA alpha and beta are encoded by a single TFIIA_alpha_beta gene and post-translationally processed and cleaved. TOA1 and TOA2 are the two subunits of Yeast TFIIA which correspond to Mammalian TFIIA_alpha_beta and TFIIA gamma, respectively. TOA1 and TOA2 form a heterodimeric protein complex. The TFIIA gamma subunit is highly conserved between humans, Drosophila and yeast and it is required for TFIIA function. The N-terminal domain of the gamma subunit forms a 4-helix bundle together with the alpha subunit.	49
199214	cd10146	LabA_like_C	C-terminal domain of LabA_like proteins. This C-terminal domain is found in a well conserved group of mainly bacterial proteins with no defined function, which contain an N-terminal LabA-like domain. LabA from Synechococcus elongatus PCC 7942, (which does not contain this C-terminal domain) has been shown to play a role in cyanobacterial circadian timing. LabA-like C-terminal domains described here may be related to the LOTUS domain family (which also co-occurs with LabA-like N-terminal domains).	69
199902	cd10147	Wzt_C-like	C-terminal domain of O-antigenic polysaccharide transporter protein Wzt and related proteins. The Escherichia coli ABC protein Wzt consists of 2 domains, a conventional ABC domain that binds ATP and utilizes its energy to transport molecules across membranes, and a C-terminal domain which is responsible for its target molecule specificity. Wzt is part of the ATP-binding-cassette (ABC) transporter complex, responsible for the transport of the O-antigenic polysaccharide (O-PS) portion of lipopolysaccharide (LPS), a major component of the outer membrane of Gram-negative bacteria. This CD includes Wzt proteins from two Escherichia coli serotypes O8 and O9a, WztO8 and WztO9a; these proteins are specific for their cognate polysaccharides (O8 or O9a O-PS).	144
197385	cd10148	CsoR-like_DUF156	Transcriptional regulators CsoR (copper-sensitive operon repressor), RcnR, and FrmR, and related domains; this domain superfamily was previously known as DUF156. This superfamily includes various transcriptional regulators that respond to stressors including Cu(I), Ni(I), sulfite, and formaldehyde. It includes CsoR (copper-sensitive operon repressor) from Mycobacterium tuberculosis (MtCsoR), Bacillus subtilis (BsCsoR), Thermus thermophilus (TthCsoR), and Staphylococcus aureus (SaCsoR), Mycobacterium tuberculosis RicR (regulated in copper repressor, MtRicR), Escherichia coli RcnR (formerly known as YohL, nickel- and cobalt-sensitive), Alcaligenes xylosoxidans NreA (nickel-sensitive), E. coli FrmR (formerly known as YaiN, formaldehyde-sensitive), and Staphylococcus aureus CstR (CsoR-like sulfur transferase repressor, NWMN_0026.5, SaCstR). CsoR is Cu(I)-inducible, and regulates the expression of genes involved in copper homeostasis. For example, TthCsoR binds the promoter region of the copZ-csoR-copA operon, and represses expression of these genes, which encode the copper chaperone CopZ, CsoR, and the copper efflux P-type ATPase CopA, respectively. In the presence of excess Cu(I), TthCsoR binds this ion, and is released from the DNA, allowing expression of the downstream genes. TthCsoR also senses other metal ions such as Cu(II), Zn(II), Ag(I), Cd(II) and Ni(II). CsoRs form a homotetramer (dimer of dimers). In the case of MtCsoR, two Cys residues on opposite subunits within each dimer, along with a His residue, bind the Cu(I) ion. These residues are conserved in the majority of members of this superfamily. Exceptions include the functionally uncharacterized Bacillus subtilis YrkD where there is an Asn instead of His (C-N-C), E. coli RcnR where there is a Thr instead of the second Cys (C-H-T), or TthCsoR and E. coli FrmR where there is a His instead of the second Cys and which have an additional N-terminal His (not found in those family members having C-H-C) that may also be involved in metal binding (H-C-H-H). A conserved Tyr and a Glu residue facilitate allosteric regulation of DNA binding. SaCstR regulates genes predicted to function in sulfur metabolism; it is thought that oxidation of the intersubunit Cys pair to a mixture of disulphide and trisulphide linkages by sulfite results in a reduced affinity of SaCstR for the operator DNA. SaCstR exists as a mixture of oligomeric states, including dimers, tetramers and octamers. The sequence of SaCstR was not available at the time this hierarchy was curated and therefore was not included. Escherichia coli RcnR represses expression of the gene encoding the nickel and cobalt-efflux protein RcnA. The gene encoding Alcaligenes xylosoxidans NreA is part of the nre nickel resistance locus located on the pTOM9 plasmid from this bacterium. Escherichia coli FrmR regulates the formaldehyde degradation frmRAB operon.	80
197397	cd10149	ClassIIa_HDAC_Gln-rich-N	Glutamine-rich N-terminal helical domain of various Class IIa histone deacetylases (HDAC4, HDAC5 and HDAC9). This superfamily consists of a glutamine-rich N-terminal helical extension to certain Class IIa histone deacetylases (HDACs), including HDAC4, HDAC5 and HDAC9; it is missing in HDAC7. It is referred to as the glutamine-rich domain, and confers responsiveness to calcium signals and mediates interactions with transcription factors and cofactors. This domain is able to repress transcription independently of the HDAC's C-terminal, zinc-dependent catalytic domain. It has many intra- and inter-helical interactions which are possibly involved in reversible assembly and disassembly of proteins. HDACs regulate diverse cellular processes through enzymatic deacetylation of histone as well as non-histone proteins, in particular deacetylating N(6)-acetyl-lysine residues.	90
199903	cd10150	CobN_like	CobN subunit of cobaltochelatase, bchH and chlH subunits of magnesium chelatases, and similar proteins. Cobaltochelatase is a complex enzyme that catalyzes the insertion of cobalt into hydrogenobyrinic acid a,c-diamide, resulting in cobyrinic acid, as demonstrated for Pseudomonas denitrificans. This is an essential step in the bacterial synthesis of cobalamine (B12). The insertion of cobalt requires a complex composed of three polypeptides, cobN, cobS, and cobT. Also included in this family are protoporphyrin IX magnesium chelatases involved in the synthesis of chlorophyll and bacteriochlorophyll, specifically the large (chlH or bchH) subunits. They are thought to bind both the protoporphyrin and the magnesium ion. Hydrolysis of ATP by the smaller subunits in the complex may trigger a conformational change that results in the insertion of the ion into the protoporphyrin scaffold. Cryo electron microscopy studies have suggested that a distinct bchH C-terminal domain may bind tightly to the N-terminal domain upon substrate binding, requiring a substantial conformational change of the bchH subunit. It has also been suggested that chlH of higher plants binds abscisic acid via a C-terminal domain and plays a role in abscisic acid signaling, and that the protein spans the chloroplast envelope, with the C-terminus exposed to the cytosol.	910
197386	cd10151	TthCsoR-like_DUF156	Thermus thermophilus CsoR, a Cu(I)-sensing transcriptional regulator, and related domains; this domain family was previously known as part of DUF156. This domain family contains various Cu(I)-inducible transcriptional regulators including CsoR (copper-sensitive operon repressor) from Mycobacterium tuberculosis (MtCsoR), and Thermus thermophilus (TthCsoR). CsoR regulates the expression of genes involved in copper homeostasis. For example, TthCsoR binds the promoter region of the copZ-csoR-copA operon, and represses expression of these genes, which encode the copper chaperone CopZ, CsoR, and the copper efflux P-type ATPase CopA, respectively. In the presence of excess Cu(I), TthCsoR binds this ion, and is released from the DNA, allowing expression of the downstream genes. TthCsoR also senses other metal ions such as Cu(II), Zn(II), Ag(I), Cd(II) and Ni(II). MtCsoR regulates an operon that includes CsoR and a putative copper transporter gene, ctpV (cation transporter P-type ATPase). CsoRs form a homotetramer (dimer of dimers). In MtCsoR, within each dimer, two Cys residues on opposite subunits, along with a His residue, bind the Cu(I) ion (forming a trigonal S2N coordination complex, C-H-C). These residues are conserved in some but not all members of this family; for example, for TthCsoR, there is a His instead of the second Cys as well as an N-terminal His (not found in those family members having C-H-C) which may also be involved in metal binding (H-C-H-H). A conserved Tyr and a Glu residue facilitate allosteric regulation of DNA binding.	82
197387	cd10152	SaCsoR-like_DUF156	Staphylococcus aureus copper-sensitive operon repressor (CsoR), and related domains; this family was previously known as part of DUF156. This domain family includes Staphylococcus aureus CsoR (SaCsoR). SaCsoR is Cu(I)-inducible, and regulates the expression of genes involved in copper homeostasis; it represses a genetically unlinked copA-copZ operon. copA encodes a copper efflux P-type ATPase, and copZ, a copper chaperone. This family belongs to a larger superfamily that contains various transcriptional regulators that respond to different stressors such as Cu(I), Ni(I), sulfite, and formaldehyde, and includes Mycobacterium tuberculosis CsoR (MtCsoR), Bacillus subtilis CsoR, and Thermus thermophilus CsoR. The latter three proteins do not belong to this family. CsoRs form homotetramers (dimer of dimers). In MtCsoR, within each dimer, two Cys residues on opposite subunits, along with a His residue, bind the Cu(I) ion (forming a trigonal S2N coordination complex, C-H-C). These residues are conserved in the majority of members of this superfamily, including this family, and a conserved Tyr and a Glu residue that facilitate allosteric regulation of DNA binding for CsoRs are also well conserved.	82
197388	cd10153	RcnR-FrmR-like_DUF156	Transcriptional regulators RcnR and FrmR, and related domains; this domain family was previously known as part of DUF156. This domain family includes various transcriptional regulators that respond to different stressors. It includes Escherichia coli RcnR (formerly known as YohL, nickel and cobalt-sensitive), and E. coli FrmR (formerly known as YaiN, formaldehyde sensitive). Escherichia coli RcnR represses expression of the gene encoding the nickel and cobalt-efflux protein RcnA; RcnA may act through modulating NikR, to repress the NikABCDE nickel transporter. In vitro, purified RcnR binds to the rcnA promoter DNA fragment in the absence of Ni2+ or Co2+, and the affinity of RcnR for this promoter is reduced in the presence of excess nickel. Escherichia coli FrmR regulates the formaldehyde degradation frmRAB operon. This family belongs to a larger superfamily that includes CsoRs (copper-sensitive operon repressors). CsoRs form homotetramers (dimer of dimers). In Mycobacterium tuberculosis CsoR, within each dimer, two Cys residues on opposite subunits, along with a His residue, bind the Cu(I) ion (forming a trigonal S2N coordination complex, C-H-C). These residues are conserved in the majority of members of this superfamily. In this family, however, not all these residues are conserved; in E. coli RcnR and FrmR there is a Thr or a His instead of the second Cys (C-H-T or C-H-H), respectively. For E. coli FrmR, an N-terminal His residue, not conserved in all members of this family, is also involved in metal binding (H-C-H-H). A conserved Tyr and a Glu residue that facilitate allosteric regulation of DNA binding for CsoRs are poorly conserved in this family.	88
197389	cd10154	NreA-like_DUF156	Alcaligenes xylosoxidans NreA and related domains; this domain family was previously known as part of DUF156. This domain family includes Alcaligenes xylosoxidans NreA, Pseudomonas putida MreA, and related domains. The gene encoding Alcaligenes xylosoxidans NreA is part of the nre nickel resistance locus located on the pTOM9 plasmid from this bacterium; it confers low-level nickel resistance on both Ralstonia and Escherichia coli strains. The Pseudomonas putida MreA gene is found in association with a gene encoding mrdH, a heavy metal efflux transporter of broad specificity. MreA may have a role in cadmium and nickel resistance. This family is part of a larger superfamily that contains various transcriptional regulators that respond to different stressors such as Cu(I), Ni(I), sulfite, and formaldehyde, and includes CsoRs (copper-sensitive operon repressors). CsoRs form homotetramers (dimer of dimers). In Mycobacterium tuberculosis CsoR, within each dimer, two Cys residues on opposite subunits, along with a His residue, bind the Cu(I) ion (forming a trigonal S2N coordination complex, C-H-C). These residues are conserved in the majority of members of this superfamily, including members of this family; however, a conserved Tyr and a Glu residue that facilitate allosteric regulation of DNA binding for CsoRs are poorly conserved.	86
197390	cd10155	BsYrkD-like_DUF156	Uncharacterized protein YrkD from Bacillus subtilis and related domains; this domain superfamily was previously known as part of DUF156. This domain family contains an uncharacterized protein YrkD from Bacillus subtilis and related proteins. This family is part of a larger superfamily that contains various transcriptional regulators that respond to different stressors such as Cu(I), Ni(I), sulfite, and formaldehyde, and includes CsoRs (copper-sensitive operon repressors). CsoRs form homotetramers (dimer of dimers). In Mycobacterium tuberculosis CsoR, within each dimer, two Cys residues on opposite subunits, along with a His residue, bind the Cu(I) ion (forming a trigonal S2N coordination complex, C-H-C). These residues are conserved in the majority of members of this superfamily. In this family, however, not all these residues are conserved: there is an Asn instead of the His (C-N-C); also, a conserved Tyr and a Glu residue that facilitate allosteric regulation of DNA binding for CsoRs are very poorly conserved.	82
197391	cd10156	FpFrmR-Cterm-like_DUF156	C-terminal domain of Faecalibacterium prausnitzii A2-165 FrmR, and related domains; this domain family was previously known as part of DUF156. This domain family contains the C-terminal domain of the functionally uncharacterized protein Faecalibacterium prausnitzii A2-165 FrmR, and related domains. This family is part of a larger superfamily that contains various transcriptional regulators that respond to different stressors such as Cu(I), Ni(I), sulfite, and formaldehyde, and includes CsoRs (copper-sensitive operon repressors). CsoRs form homotetramers (dimer of dimers). In Mycobacterium tuberculosis CsoR, within each dimer, two Cys residues on opposite subunits, along with a His residue, bind the Cu(I) ion (forming a trigonal S2N coordination complex, C-H-C). These residues are conserved in the majority of members of this superfamily, including this family, and a conserved Tyr and a Glu residue that facilitate allosteric regulation of DNA binding for CsoRs are also conserved.	86
197392	cd10157	BsCsoR-like_DUF156	Bacillus subtilis copper-sensitive operon repressor (BsCsoR), and related domains; this family was previously known as part of DUF156. This domain family includes Bacillus subtilis CsoR (BsCsoR). CsoRs are Cu(I)-inducible, and regulate the expression of genes involved in copper homeostasis. BsCsoR regulates the copZA operon which encodes the copper chaperone CopZ, and the copper efflux P-type ATPase CopA. This family belongs to a larger superfamily that contains various transcriptional regulators that respond to different stressors such as Cu(I), Ni(I), sulfite, and formaldehyde, and includes Mycobacterium tuberculosis CsoR (MtCsoR), Thermus thermophilus CsoR, and Staphylococcus aureus CsoR. The latter three proteins do not belong to this family. CsoRs form homotetramers (dimer of dimers). In MtCsoR, within each dimer, two Cys residues on opposite subunits, along with a His residue, bind the Cu(I) ion (forming a trigonal S2N coordination complex, C-H-C). These residues are conserved in the majority of members of this superfamily, including this family, and the conserved Tyr and a Glu residue that facilitate allosteric regulation of DNA binding for CsoRs are also well conserved.	85
197393	cd10158	CsoR-like_DUF156_1	Uncharacterized family 1; belongs to a superfamily containing the transcriptional regulators CsoR (copper-sensitive operon repressor), RcnR, and FrmR, and related domains; this family was previously known as part of DUF156. Uncharacterized family 1, belonging to a larger superfamily that contains various transcriptional regulators that respond to different stressors such as Cu(I), Ni(I), sulfite, and formaldehyde, and includes CsoRs (copper-sensitive operon repressors). CsoRs form homotetramers (dimer of dimers). In Mycobacterium tuberculosis CsoR, within each dimer, two Cys residues on opposite subunits, along with a His residue, bind the Cu(I) ion (forming a trigonal S2N coordination complex, C-H-C). These residues are conserved in the majority of members of this superfamily, including this family; however, a conserved Tyr and a Glu residue that facilitate allosteric regulation of DNA binding for CsoRs are poorly conserved.	81
197394	cd10159	CsoR-like_DUF156_2	Uncharacterized family 2; belongs to a superfamily containing transcriptional regulators CsoR (copper-sensitive operon repressor), RcnR, and FrmR, and related domains; this family was previously known as part of DUF156. Uncharacterized family 2, belonging to a larger superfamily that contains various transcriptional regulators that respond to different stressors such as Cu(I), Ni(I), sulfite, and formaldehyde, and includes CsoRs (copper-sensitive operon repressors). CsoRs form homotetramers (dimer of dimers). In Mycobacterium tuberculosis CsoR, within each dimer, two Cys residues on opposite subunits, along with a His residue, bind the Cu(I) ion (forming a trigonal S2N coordination complex, C-H-C). These residues are conserved in the majority of members of this superfamily, including this family, and a conserved Tyr and a Glu residue that facilitate allosteric regulation of DNA binding for CsoRs are also conserved.	82
197395	cd10160	CsoR-like_DUF156_3	Uncharacterized family 3; belongs to a superfamily containing the transcriptional regulators CsoR (copper-sensitive operon repressor), RcnR, and FrmR, and related domains; this family was previously known as part of DUF156. Uncharacterized family 3, belonging to a larger superfamily that contains various transcriptional regulators that respond to different stressors such as Cu(I), Ni(I), sulfite, and formaldehyde, and includes CsoRs (copper-sensitive operon repressors). CsoRs form homotetramers (dimer of dimers). In Mycobacterium tuberculosis CsoR, within each dimer, two Cys residues on opposite subunits, along with a His residue, bind the Cu(I) ion (forming a trigonal S2N coordination complex, C-H-C). These residues are conserved in the majority of members of this superfamily, including this family; however, a conserved Tyr and a Glu residue that facilitate allosteric regulation of DNA binding for CsoRs are not conserved.	85
197396	cd10161	CsoR-like_DUF156_4	Uncharacterized family 4; belongs to a superfamily containing the transcriptional regulators CsoR (copper-sensitive operon repressor), RcnR, and FrmR, and related domains; this family was previously known as part of DUF156. Uncharacterized family 4, belonging to a larger superfamily that contains various transcriptional regulators that respond to different stressors such as Cu(I), Ni(I), sulfite, and formaldehyde, and includes CsoRs (copper-sensitive operon repressors). CsoRs form homotetramers (dimer of dimers). In Mycobacterium tuberculosis CsoR, within each dimer, two Cys residues on opposite subunits, along with a His residue, bind the Cu(I) ion (forming a trigonal S2N coordination complex, C-H-C). These residues are conserved in the majority of members of this superfamily. In this family, however, only one of these residues is conserved (the first Cys); and a conserved Tyr and a Glu residue that facilitate allosteric regulation of DNA binding for CsoRs are also not conserved.	82
197398	cd10162	ClassIIa_HDAC4_Gln-rich-N	Glutamine-rich N-terminal helical domain of HDAC4, a Class IIa histone deacetylase. This family consists of the glutamine-rich domain of histone deacetylase 4 (HDAC4). It belongs to a superfamily that consists of the glutamine-rich N-terminal helical extension to certain Class IIa histone deacetylases (HDACs), including HDAC4, HDAC5 and HDAC9; it is missing from HDAC7. This domain confers responsiveness to calcium signals and mediates interactions with transcription factors and cofactors, and it is able to repress transcription independently of the HDAC C-terminal, zinc-dependent catalytic domain. It has many intra- and inter-helical interactions which are possibly involved in reversible assembly and disassembly of proteins. HDACs regulate diverse cellular processes through enzymatic deacetylation of histone as well as non-histone proteins, in particular deacetylating N(6)-acetyl-lysine residues.	90
197399	cd10163	ClassIIa_HDAC9_Gln-rich-N	Glutamine-rich N-terminal helical domain of HDAC9, a Class IIa histone deacetylase. This family consists of the glutamine-rich domain of histone deacetylase 9 (HDAC9). It belongs to a superfamily that consists of the glutamine-rich N-terminal helical extension to certain Class IIa histone deacetylases (HDACs), including HDAC4, HDAC5 and HDAC9; it is missing from HDAC7. This domain confers responsiveness to calcium signals and mediates interactions with transcription factors and cofactors, and it is able to repress transcription independently of the HDAC C-terminal, zinc-dependent catalytic domain. It has many intra- and inter-helical interactions which are possibly involved in reversible assembly and disassembly of proteins. HDACs regulate diverse cellular processes through enzymatic deacetylation of histone as well as non-histone proteins, in particular deacetylating N(6)-acetyl-lysine residues.	90
197400	cd10164	ClassIIa_HDAC5_Gln-rich-N	Glutamine-rich N-terminal helical domain of HDAC5, a Class IIa histone deacetylase. This family consists of the glutamine-rich domain of histone deacetylase 5 (HDAC5). It belongs to a superfamily that consists of the glutamine-rich N-terminal helical extension to certain Class IIa histone deacetylases (HDACs), including HDAC4, HDAC5 and HDAC9; it is missing from HDAC7. This domain confers responsiveness to calcium signals and mediates interactions with transcription factors and cofactors, and it is able to repress transcription independently of the HDAC C-terminal, zinc-dependent catalytic domain. It has many intra- and inter-helical interactions which are possibly involved in reversible assembly and disassembly of proteins. HDACs regulate diverse cellular processes through enzymatic deacetylation of histone as well as non-histone proteins, in particular deacetylating N(6)-acetyl-lysine residues.	97
212667	cd10170	HSP70_NBD	Nucleotide-binding domain of the HSP70 family. HSP70 (70-kDa heat shock protein) family chaperones assist in protein folding and assembly and can direct incompetent "client" proteins towards degradation. Typically, HSP70s have a nucleotide-binding domain (NBD) and a substrate-binding domain (SBD). The nucleotide sits in a deep cleft formed between the two lobes of the NBD. The two subdomains of each lobe change conformation between ATP-bound, ADP-bound, and nucleotide-free states. ATP binding opens up the substrate-binding site; substrate-binding increases the rate of ATP hydrolysis. HSP70 chaperone activity is regulated by various co-chaperones: J-domain proteins and nucleotide exchange factors (NEFs). Some HSP70 family members are not chaperones but instead function as NEFs to remove ADP from their HSP70 chaperone partners during the ATP hydrolysis cycle; some may function as both chaperones and NEFs.	369
212668	cd10225	MreB_like	MreB and similar proteins. MreB is a bacterial protein which assembles into filaments resembling those of eukaryotic F-actin. It is involved in determining the shape of rod-like bacterial cells, by assembling into large fibrous spirals beneath the cell membrane. MreB has also been implicated in chromosome segregation; specifically MreB is thought to bind to and segregate the replication origin of bacterial chromosomes.	320
212669	cd10227	ParM_like	Plasmid segregation protein ParM and similar proteins. ParM is a plasmid-encoded bacterial homolog of actin, which polymerizes into filaments similar to F-actin, and plays a vital role in plasmid segregation. ParM filaments segregate plasmids paired at midcell into the individual daughter cells. This subfamily also contains Thermoplasma acidophilum Ta0583, an active ATPase at physiological temperatures, which has a propensity to form filaments.	312
212670	cd10228	HSPA4_like_NDB	Nucleotide-binding domain of 105/110 kDa heat shock proteins including HSPA4 and similar proteins. This subgroup includes the human proteins, HSPA4 (also known as 70-kDa heat shock protein 4, APG-2, HS24/P52, hsp70 RY, and HSPH2; the human HSPA4 gene maps to 5q31.1), HSPA4L (also known as 70-kDa heat shock protein 4-like, APG-1, HSPH3, and OSP94; the human HSPA4L gene maps to 4q28), and HSPH1 (also known as heat shock 105kDa/110kDa protein 1, HSP105; HSP105A; HSP105B; NY-CO-25; the human HSPH1 gene maps to 13q12.3), Saccharomyces cerevisiae Sse1p and Sse2p, and a sea urchin sperm receptor. It belongs to the 105/110 kDa heat shock protein (HSP105/110) subfamily of the HSP70-like family, and includes proteins believed to function generally as co-chaperones of HSP70 chaperones, acting as nucleotide exchange factors (NEFs), to remove ADP from their HSP70 chaperone partners during the ATP hydrolysis cycle. HSP70 chaperones assist in protein folding and assembly, and can direct incompetent "client" proteins towards degradation. Like HSP70 chaperones, HSP105/110s have an N-terminal nucleotide-binding domain (NBD) and a C-terminal substrate-binding domain (SBD). For HSP70 chaperones, the nucleotide sits in a deep cleft formed between the two lobes of the NBD. The two subdomains of each lobe change conformation between ATP-bound, ADP-bound, and nucleotide-free states. ATP binding opens up the substrate-binding site; substrate-binding increases the rate of ATP hydrolysis. Hsp70 chaperone activity is also regulated by J-domain proteins.	381
212671	cd10229	HSPA12_like_NBD	Nucleotide-binding domain of HSPA12A, HSPA12B and similar proteins. Human HSPA12A (also known as 70-kDa heat shock protein-12A) and HSPA12B (also known as 70-kDa heat shock protein-12B, chromosome 20 open reading frame 60/C20orf60, dJ1009E24.2) belong to the heat shock protein 70 (HSP70) family of chaperones that assist in protein folding and assembly, and can direct incompetent "client" proteins towards degradation. Typically, HSP70s have a nucleotide-binding domain (NBD) and a substrate-binding domain (SBD). The nucleotide sits in a deep cleft formed between the two lobes of the NBD. The two subdomains of each lobe change conformation between ATP-bound, ADP-bound, and nucleotide-free states. ATP binding opens up the substrate-binding site; substrate-binding increases the rate of ATP hydrolysis. HSP70 chaperone activity is regulated by various co-chaperones: J-domain proteins and nucleotide exchange factors (NEFs). No co-chaperones have yet been identified for HSPA12A or HSPA12B. The gene encoding HSPA12A maps to 10q26.12, a cytogenetic region that might represent a common susceptibility locus for both schizophrenia and bipolar affective disorder; reduced expression of HSPA12A has been shown in the prefrontal cortex of subjects with schizophrenia. HSPA12A is also a candidate gene for forelimb-girdle muscular anomaly, an autosomal recessive disorder of Japanese black cattle. HSPA12A is predominantly expressed in neuronal cells. It may also play a role in the atherosclerotic process. The gene encoding HSPA12B maps to 20p13. HSPA12B is predominantly expressed in endothelial cells, is required for angiogenesis, and may interact with known angiogenesis mediators. It may be important for host defense in microglia-mediated immune response. HSPA12B expression is up-regulated in lipopolysaccharide (LPS)-induced inflammatory response in the spinal cord, and mostly located in active microglia; this induced expression may be regulated by activation of MAPK-p38, ERK1/2 and SAPK/JNK signaling pathways. Overexpression of HSPA12B also protects against LPS-induced cardiac dysfunction and involves the preserved activation of the PI3K/Akt signaling pathway.	404
212672	cd10230	HYOU1-like_NBD	Nucleotide-binding domain of human HYOU1 and similar proteins. This subgroup includes human HYOU1 (also known as human hypoxia up-regulated 1, GRP170; HSP12A; ORP150; GRP-170; ORP-150; the human HYOU1 gene maps to 11q23.1-q23.3) and Saccharomyces cerevisiae Lhs1p (also known as Cer1p, SsI1). Mammalian HYOU1 functions as a nucleotide exchange factor (NEF) for HSPA5 (also known as BiP, Grp78 or HspA5) and may also function as a HSPA5-independent chaperone. S. cerevisiae Lhs1p does not have a detectable endogenous ATPase activity like canonical HSP70s, but functions as a NEF for Kar2p; its interaction with Kar2p is stimulated by nucleotide-binding. In addition, Lhs1p has a nucleotide-independent holdase activity that prevents heat-induced aggregation of proteins in vitro. This subgroup belongs to the 105/110 kDa heat shock protein (HSP105/110) subfamily of the HSP70-like family. HSP105/110s are believed to function generally as co-chaperones of HSP70 chaperones, acting as NEFs, to remove ADP from their HSP70 chaperone partners during the ATP hydrolysis cycle. HSP70 chaperones assist in protein folding and assembly, and can direct incompetent "client" proteins towards degradation. Like HSP70 chaperones, HSP105/110s have an N-terminal nucleotide-binding domain (NBD) and a C-terminal substrate-binding domain (SBD). For HSP70 chaperones, the nucleotide sits in a deep cleft formed between the two lobes of the NBD. The two subdomains of each lobe change conformation between ATP-bound, ADP-bound, and nucleotide-free states. ATP binding opens up the substrate-binding site; substrate-binding increases the rate of ATP hydrolysis. Hsp70 chaperone activity is also regulated by J-domain proteins.	388
212673	cd10231	YegD_like	Escherichia coli YegD, a putative chaperone protein, and related proteins. This bacterial subfamily includes the uncharacterized Escherichia coli YegD. It belongs to the heat shock protein 70 (HSP70) family of chaperones that assist in protein folding and assembly and can direct incompetent "client" proteins towards degradation. Typically, HSP70s have a nucleotide-binding domain (NBD) and a substrate-binding domain (SBD). The nucleotide sits in a deep cleft formed between the two lobes of the NBD. The two subdomains of each lobe change conformation between ATP-bound, ADP-bound, and nucleotide-free states. ATP binding opens up the substrate-binding site; substrate-binding increases the rate of ATP hydrolysis. YegD lacks the SBD. HSP70 chaperone activity is regulated by various co-chaperones: J-domain proteins and nucleotide exchange factors (NEFs). Some family members are not chaperones but instead function as NEFs for their Hsp70 partners; other family members function as both chaperones and NEFs.	415
212674	cd10232	ScSsz1p_like_NBD	Nucleotide-binding domain of Saccharomyces cerevisiae Ssz1p and similar proteins. Saccharomyces cerevisiae Ssz1p (also known as Pdr13p/YHR064C) belongs to the heat shock protein 70 (HSP70) family of chaperones that assist in protein folding and assembly and can direct incompetent "client" proteins towards degradation. Typically, HSP70s have a nucleotide-binding domain (NBD) and a substrate-binding domain (SBD). The nucleotide sits in a deep cleft formed between the two lobes of the NBD. The two subdomains of each lobe change conformation between ATP-bound, ADP-bound, and nucleotide-free states. ATP binding opens up the substrate-binding site; substrate-binding increases the rate of ATP hydrolysis. HSP70 chaperone activity is regulated by various co-chaperones: J-domain proteins and nucleotide exchange factors (NEFs). Some family members are not chaperones but rather function as NEFs for their Hsp70 partners, while other family members function as both chaperones and NEFs. Ssz1 does not function as a chaperone; it facilitates the interaction between the HSP70 Ssb protein and its partner J-domain protein Zuo1 (also known as zuotin) on the ribosome. Ssz1 is found in a stable heterodimer (called RAC, ribosome associated complex) with Zuo1. Zuo1 can only stimulate the ATPase activity of Ssb when it is in complex with Ssz1. Ssz1 binds ATP, but neither nucleotide binding, hydrolysis, nor its SBD is needed for its in vivo function.	386
212675	cd10233	HSPA1-2_6-8-like_NBD	Nucleotide-binding domain of HSPA1-A, -B, -L, HSPA-2, -6, -7, -8, and similar proteins. This subfamily includes human HSPA1A (70-kDa heat shock protein 1A, also known as HSP72; HSPA1; HSP70I; HSPA1B; HSP70-1; HSP70-1A), HSPA1B (70-kDa heat shock protein 1B, also known as HSPA1A; HSP70-2; HSP70-1B), and HSPA1L (70-kDa heat shock protein 1-like, also known as HSP70T; hum70t; HSP70-1L; HSP70-HOM). The genes for these three HSPA1 proteins map in close proximity on the major histocompatibility complex (MHC) class III region on chromosome 6, 6p21.3. This subfamily also includes human HSPA8 (heat shock 70kDa protein 8, also known as LAP1; HSC54; HSC70; HSC71; HSP71; HSP73; NIP71; HSPA10; the HSPA8 gene maps to 11q24.1), human HSPA2 (70-kDa heat shock protein 2, also known as HSP70-2; HSP70-3, the HSPA2 gene maps to 14q24.1), human HSPA6 (also known as heat shock 70kDa protein 6 (HSP70B') gi 94717614, the HSPA6 gene maps to 1q23.3), human HSPA7 (heat shock 70kDa protein 7, also known as HSP70B; the HSPA7 gene maps to 1q23.3) and Saccharomyces cerevisiae Stress-Seventy subfamily B/Ssb1p. This subfamily belongs to the heat shock protein 70 (HSP70) family of chaperones that assist in protein folding and assembly and can direct incompetent "client" proteins towards degradation. Typically, HSP70s have a nucleotide-binding domain (NBD) and a substrate-binding domain (SBD). The nucleotide sits in a deep cleft formed between the two lobes of the NBD. The two subdomains of each lobe change conformation between ATP-bound, ADP-bound, and nucleotide-free states. ATP binding opens up the substrate-binding site; substrate-binding increases the rate of ATP hydrolysis. HSP70 chaperone activity is regulated by various co-chaperones: J-domain proteins and nucleotide exchange factors (NEFs). Associations of polymorphisms within the MHC-III HSP70 gene locus with longevity, systemic lupus erythematosus, Meniere's disease, noise-induced hearing loss, high-altitude pulmonary edema, and coronary heart disease, have been found. HSPA2 is involved in cancer cell survival, is required for maturation of male gametophytes, and is linked to male infertility. The induction of HSPA6 is a biomarker of cellular stress. HSPA8 participates in the folding and trafficking of client proteins to different subcellular compartments, and in the signal transduction and apoptosis process; it has been shown to protect cardiomyocytes against oxidative stress partly through an interaction with alpha-enolase. S. cerevisiae Ssb1p is part of the ribosome-associated complex (RAC); it acts as a chaperone for nascent polypeptides, and is important for translation fidelity; Ssb1p is also a [PSI+] prion-curing factor.	376
212676	cd10234	HSPA9-Ssq1-like_NBD	Nucleotide-binding domain of human HSPA9 and similar proteins. This subfamily includes human mitochondrial HSPA9 (also known as 70-kDa heat shock protein 9, CSA; MOT; MOT2; GRP75; PBP74; GRP-75; HSPA9B; MTHSP75; the gene encoding HSPA9 maps to 5q31.1), Escherichia coli DnaK, Saccharomyces cerevisiae Stress-seventy subfamily Q protein 1/Ssq1p (also called Ssc2p, Ssh1p, mtHSP70 homolog), and S. cerevisiae Stress-Seventy subfamily C/Ssc1p (also called mtHSP70, Endonuclease SceI 75 kDa subunit). It belongs to the heat shock protein 70 (HSP70) family of chaperones that assist in protein folding and assembly, and can direct incompetent "client" proteins towards degradation. Typically, HSP70s have a nucleotide-binding domain (NBD) and a substrate-binding domain (SBD). The nucleotide sits in a deep cleft formed between the two lobes of the NBD. The two subdomains of each lobe change conformation between ATP-bound, ADP-bound, and nucleotide-free states. ATP binding opens up the substrate-binding site; substrate-binding increases the rate of ATP hydrolysis. Hsp70 chaperone activity is regulated by various co-chaperones: J-domain proteins and nucleotide exchange factors (NEFs); for Escherichia coli DnaK, these are the DnaJ and GrpE, respectively.	376
212677	cd10235	HscC_like_NBD	Nucleotide-binding domain of Escherichia coli HscC and similar proteins. This subfamily includes Escherichia coli HscC (also called heat shock cognate protein C, Hsc62, or YbeW) and the putative DnaK-like protein Escherichia coli ECs0689. It belongs to the heat shock protein 70 (Hsp70) family of chaperones that assist in protein folding and assembly and can direct incompetent "client" proteins towards degradation. Typically, Hsp70s have a nucleotide-binding domain (NBD) and a substrate-binding domain (SBD). The nucleotide sits in a deep cleft formed between the two lobes of the NBD. The two subdomains of each lobe change conformation between ATP-bound, ADP-bound, and nucleotide-free states. ATP binding opens up the substrate-binding site; substrate-binding increases the rate of ATP hydrolysis. Hsp70 chaperone activity is regulated by various co-chaperones: J-domain proteins and nucleotide exchange factors (NEFs). Two genes in the vicinity of the HscC gene code for potential cochaperones: J-domain containing proteins, DjlB/YbeS and DjlC/YbeV. HscC and its co-chaperone partners may play a role in the SOS DNA damage response. HscC does not appear to require a NEF.	339
212678	cd10236	HscA_like_NBD	Nucleotide-binding domain of HscA and similar proteins. Escherichia coli HscA (heat shock cognate protein A, also called Hsc66), belongs to the heat shock protein 70 (HSP70) family of chaperones that assist in protein folding and assembly and can direct incompetent "client" proteins towards degradation. Typically, HSP70s have a nucleotide-binding domain (NBD) and a substrate-binding domain (SBD). The nucleotide sits in a deep cleft formed between the two lobes of the NBD. The two subdomains of each lobe change conformation between ATP-bound, ADP-bound, and nucleotide-free states. ATP binding opens up the substrate-binding site; substrate-binding increases the rate of ATP hydrolysis. HSP70 chaperone activity is regulated by various co-chaperones: J-domain proteins and nucleotide exchange factors (NEFs). HscA's partner J-domain protein is HscB; it does not appear to require a NEF, and has been shown to be induced by cold-shock. The HscA-HscB chaperone/co-chaperone pair is involved in [Fe-S] cluster assembly.	355
212679	cd10237	HSPA13-like_NBD	Nucleotide-binding domain of human HSPA13 and similar proteins. Human HSPA13 (also called 70-kDa heat shock protein 13, STCH, "stress 70 protein chaperone, microsome-associated, 60kD", "stress 70 protein chaperone, microsome-associated, 60kDa"; the gene encoding HSPA13 maps to 21q11.1) belongs to the heat shock protein 70 (HSP70) family of chaperones that assist in protein folding and assembly and can direct incompetent "client" proteins towards degradation. Typically, HSP70s have a nucleotide-binding domain (NBD) and a substrate-binding domain (SBD). The nucleotide sits in a deep cleft formed between the two lobes of the NBD. The two subdomains of each lobe change conformation between ATP-bound, ADP-bound, and nucleotide-free states. ATP binding opens up the substrate-binding site; substrate-binding increases the rate of ATP hydrolysis. HSP70 chaperone activity is regulated by various co-chaperones: J-domain proteins and nucleotide exchange factors (NEFs). STCH contains an NBD but lacks an SBD. STCH may function to regulate cell proliferation and survival, and modulate the TRAIL-mediated cell death pathway. The HSPA13 gene is a candidate stomach cancer susceptibility gene; a mutation in the NBD coding region of HSPA13 has been identified in stomach cancer cells. The NBD of HSPA13 interacts with the ubiquitin-like proteins Chap1 and Chap2, implicating HSPA13 in regulating cell cycle and cell death events. HSPA13 is induced by the Ca2+ ionophore A23187.	417
212680	cd10238	HSPA14-like_NBD	Nucleotide-binding domain of human HSPA14 and similar proteins. Human HSPA14 (also known as 70-kDa heat shock protein 14, HSP70L1, HSP70-4; the gene encoding HSPA14 maps to 10p13), is ribosome-associated and belongs to the heat shock protein 70 (HSP70) family of chaperones that assist in protein folding and assembly, and can direct incompetent "client" proteins towards degradation. Typically, HSP70s have a nucleotide-binding domain (NBD) and a substrate-binding domain (SBD). The nucleotide sits in a deep cleft formed between the two lobes of the NBD. The two subdomains of each lobe change conformation between ATP-bound, ADP-bound, and nucleotide-free states. ATP binding opens up the substrate-binding site; substrate-binding increases the rate of ATP hydrolysis. HSP70 chaperone activity is regulated by various co-chaperones: J-domain proteins and nucleotide exchange factors (NEFs). HSPA14 interacts with the J-protein MPP11 to form the mammalian ribosome-associated complex (mRAC). HSPA14 participates in a pathway along with Nijmegen breakage syndrome 1 (NBS1, also known as p85 or nibrin), heat shock transcription factor 4b (HSF4b), and HSPA4 (belonging to a different subfamily), that induces tumor migration, invasion, and transformation. HSPA14 is a potent T helper cell (Th1) polarizing adjuvant that contributes to antitumor immune responses.	375
212681	cd10241	HSPA5-like_NBD	Nucleotide-binding domain of human HSPA5 and similar proteins. This subfamily includes human HSPA5 (also known as 70-kDa heat shock protein 5, glucose-regulated protein 78/GRP78, and immunoglobulin heavy chain-binding protein/BIP, MIF2; the gene encoding HSPA5 maps to 9q33.3), Saccharomyces cerevisiae Kar2p (also known as Grp78p), and related proteins. This subfamily belongs to the heat shock protein 70 (HSP70) family of chaperones that assist in protein folding and assembly and can direct incompetent "client" proteins towards degradation. HSPA5 and Kar2p are chaperones of the endoplasmic reticulum (ER). Typically, HSP70s have a nucleotide-binding domain (NBD) and a substrate-binding domain (SBD). The nucleotide sits in a deep cleft formed between the two lobes of the NBD. The two subdomains of each lobe change conformation between ATP-bound, ADP-bound, and nucleotide-free states. ATP binding opens up the substrate-binding site; substrate-binding increases the rate of ATP hydrolysis. HSP70 chaperone activity is regulated by various co-chaperones: J-domain proteins and nucleotide exchange factors (NEFs). Multiple ER DNAJ domain proteins have been identified and may exist in distinct complexes with HSPA5 in various locations in the ER, for example DNAJC3-p58IPK in the lumen. HSPA5-NEFs include SIL1 and an atypical HSP70 family protein HYOU1/ORP150. The ATPase activity of Kar2p is stimulated by the NEFs: Sil1p and Lhs1p.	374
199834	cd10276	BamB_YfgL	Beta-barrel assembly machinery (Bam) complex component B and related proteins. BamB (YfgL) is a non-essential component of the beta-barrel assembly machinery (Bam), a multi-subunit complex that inserts proteins with beta-barrel topology into the outer membrane. BamB has been found to interact with BamA, which in turn binds and stabilizes pre-folded beta-barrel proteins; it has been suggested that BamB participates in the stabilization.	358
199835	cd10277	PQQ_ADH_I	Ethanol dehydrogenase, a bacterial quinoprotein (PQQ-dependent type I alcohol dehydrogenase). This bacterial family of homodimeric ethanol dehydrogenases utilize pyrroloquinoline quinone (PQQ) as a cofactor. It represents proteins whose expression may be induced by ethanol, and which are similar to quinoprotein methanol dehydrogenases, but have higher specificities for ethanol and other primary and secondary alcohols. Dehydrogenases with PQQ cofactors, such as ethanol, methanol, and membrane-bound glucose dehydrogenases, form an 8-bladed beta-propeller.	529
199836	cd10278	PQQ_MDH	Large subunit of methanol dehydrogenase (moxF). Methanol dehydrogenase is a key enzyme in the utilization of C1 compounds as a source of energy and carbon by bacteria. It catalyzes the oxidation of methanol to formaldehyde, transferring two electrons per methanol to cytochrome c(L) as the acceptor. Methanol dehydrogenase belongs to a family of dehydrogenases with pyrroloquinoline quinone (PQQ) as cofactor, which also includes dehydrogenases specific to other alcohols and membrane-bound glucose dehydrogenases. This alignment model for the large subunit contains an 8-bladed beta-propeller; the functional enzyme forms a heterotetramer composed of two large and two small subunits.	553
199837	cd10279	PQQ_ADH_II	PQQ_like domain of the quinohemoprotein alcohol dehydrogenase (type II). This family of monomeric and soluble type II alcohol dehydrogenases utilizes pyrroloquinoline quinone (PQQ) as a cofactor and is related to ethanol, methanol, and membrane-bound glucose dehydrogenases. The alignment model contains an 8-bladed beta-propeller.	549
199838	cd10280	PQQ_mGDH	Membrane-bound PQQ-dependent glucose dehydrogenase. This bacterial subfamily of enzymes belongs to the dehydrogenase family with pyrroloquinoline quinone (PQQ) as cofactor, and is the only subfamily that is bound to the membrane. Glucose dehydrogenase converts D-glucose to D-glucono-1,5-lactone in a reaction that is coupled with the respiratory chain in the periplasmic oxidation of sugars and alcohols in gram-negative bacteria. Ubiquinone functions as the electron acceptor. The alignment model contains an 8-bladed beta-propeller.	616
197336	cd10281	Nape_like_AP-endo	Neisseria meningitidis Nape-like subfamily of the ExoIII family purinic/apyrimidinic (AP) endonucleases. This subfamily includes Neisseria meningitidis Nape and related proteins. These are Escherichia coli exonuclease III (ExoIII)-like AP endonucleases and belong to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. AP endonucleases participate in the DNA base excision repair (BER) pathway. AP sites are one of the most common lesions in cellular DNA. During BER the damaged DNA is first recognized by DNA glycosylase. AP endonucleases then catalyze the hydrolytic cleavage of the phosphodiester bond 5' to the AP site, and this is followed by the coordinated actions of DNA polymerase, deoxyribose phosphatase, and DNA ligase. If left unrepaired, AP sites block DNA replication, and have both mutagenic and cytotoxic effects. AP endonucleases can carry out a variety of excision and incision reactions on DNA, including 3'-5' exonuclease, 3'-deoxyribose phosphodiesterase, 3'-phosphatase, and occasionally, nonspecific DNase activities. Different AP endonuclease enzymes catalyze the different reactions with different efficiencies. Many organisms have two AP endonucleases; usually one is the dominant AP endonuclease and the other has weak AP endonuclease activity, for example, Neisseria meningitidis Nape and NExo. Nape, found in this subfamily, is the dominant AP endonuclease. It exhibits strong AP endonuclease activity, and also exhibits 3'-5' exonuclease and 3'-deoxyribose phosphodiesterase activities.	253
197337	cd10282	DNase1	Deoxyribonuclease 1. Deoxyribonuclease 1 (DNase1, EC 3.1.21.1), also known as DNase I, is a Ca2+, Mg2+/Mn2+-dependent secretory endonuclease, first isolated from bovine pancreas extracts. It cleaves DNA preferentially at phosphodiester linkages next to a pyrimidine nucleotide, producing 5'-phosphate terminated polynucleotides with a free hydroxyl group on position 3'. It generally produces tetranucleotides. DNase1 substrates include single-stranded DNA, double-stranded DNA, and chromatin. This enzyme may be responsible for apoptotic DNA fragmentation. Other deoxyribonucleases in this subfamily include human DNL1L (human DNase I lysosomal-like, also known as DNASE1L1, Xib, and DNase X), human DNASE1L2 (also known as DNAS1L2), and DNASE1L3 (also known as DNAS1L3, nhDNase, LS-DNase, DNase Y, and DNase gamma). DNASE1L3 is implicated in apoptotic DNA fragmentation. DNase I is also a cytoskeletal protein which binds actin. A recombinant form of human DNase1 is used as a mucoactive therapy in patients with cystic fibrosis; it hydrolyzes the extracellular DNA in sputum and reduces its viscosity. Mutations in the gene encoding DNase1 have been associated with Systemic Lupus Erythematosus, a multifactorial autoimmune disease. This subfamily belongs to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds.	256
197338	cd10283	MnuA_DNase1-like	Mycoplasma pulmonis MnuA nuclease-like. This subfamily includes Mycoplasma pulmonis MnuA, a membrane-associated nuclease related to Deoxyribonuclease 1 (DNase1 or DNase I, EC 3.1.21.1). The in vivo role of MnuA is as yet undetermined. This subfamily belongs to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds.	266
198434	cd10284	growth_hormone_like	Somatotropin/prolactin hormone family. The somatotropin/prolactin hormone family includes growth hormones 1 and 2, prolactin, prolactin 2, and other members that play vital roles in a variety of processes, including growth control. They are long-chain class-I helical cytokines, most of which are secreted by the pituitary gland, and are active as monomers, binding to cellular receptors with EpoR-like ligand binding domains.	178
198435	cd10285	somatotropin_like	Somatotropin or growth hormone (GH), placental lactogen, and related pituitary gland hormones. Growth hormone (GH) or somatotropin is a peptide hormone synthesized by the pituitary gland, which mediates anabolic effects in development. GH is known to activate, via binding to specific cellular receptors, the MAPK/ERK and JAK-STAT signaling pathways. Via the latter, it triggers the secretion of insulin-like growth factor 1 (mostly in the liver). Besides increasing body height, GH has been shown to have a host of other effects.	180
198436	cd10286	somatolactin	Somatolactin (SL) and somatolactin-like proteins. This family of hormones specific to Actinopterygii is expressed in the pars intermedia bordering the neurohypophysis (posterior pituitary). Somatolactin appears to be involved in acid-base regulation, but much of its physiological role remains to be understood.	207
198437	cd10287	prolactin_2	Vertebrate, non-mammalian prolactin 2 (PRL2). A functionally uncharacterized subfamily of the growth-hormone-like helical cytokines, which is found in vertebrata (except for mammals). The protein has been shown to be expressed in the zebrafish eye and brain, but not the pituitary gland, and might play a role in retina development.	184
198438	cd10288	prolactin_like	Prolactin (PRL or PRL1), chorionic somatomammotropin, and related pituitary gland hormones. Prolactin is primarily responsible for stimulating milk production and breast development in mammals. Aside from roles in reproduction, various functions have been attributed to prolactin, more than for other pituitary gland hormones combined. These are roles in growth and development, metamorphosis, metabolism of lipids, carbohydrates, and steroids, brain biochemistry and even immunoregulation, among others. Most of these roles are poorly understood, but it has become clear that many prolactin-like hormones are actually produced in the placenta and not the pituitary.	199
198322	cd10289	GST_C_AaRS_like	Glutathione S-transferase C-terminal-like, alpha helical domain of various Aminoacyl-tRNA synthetases and similar domains. Glutathione S-transferase (GST) C-terminal domain family, Aminoacyl-tRNA synthetase (AaRS)-like subfamily; This model characterizes the GST_C-like domain found in the N-terminal region of some eukaryotic AaRSs, as well as similar domains found in proteins involved in protein synthesis including Aminoacyl tRNA synthetase complex-Interacting Multifunctional Protein 2 (AIMP2), AIMP3, and eukaryotic translation Elongation Factor 1 beta (eEF1b). AaRSs comprise a family of enzymes that catalyze the coupling of amino acids with their matching tRNAs. This involves the formation of an aminoacyl adenylate using ATP, followed by the transfer of the activated amino acid to the 3'-adenosine moiety of the tRNA. AaRSs may also be involved in translational and transcriptional regulation, as well as in tRNA processing. AaRSs in this subfamily include GluRS from lower eukaryotes, as well as GluProRS, MetRS, and CysRS from higher eukaryotes. AIMPs are non-enzymatic cofactors that play critical roles in the assembly and formation of a macromolecular multi-tRNA synthetase protein complex found in higher eukaryotes. The GST_C-like domain is involved in protein-protein interactions, mediating the formation of aaRS complexes such as the MetRS-Arc1p-GluRS ternary complex in lower eukaryotes and the multi-aaRS complex in higher eukaryotes, that act as molecular hubs for protein synthesis. AaRSs from prokaryotes, which are active as dimers, do not contain this GST_C-like domain.	82
198323	cd10290	GST_C_MetRS_N_fungi	Glutathione S-transferase C-terminal-like, alpha helical domain of Saccharomycetales Methionyl-tRNA synthetase. Glutathione S-transferase (GST) C-terminal domain family, Saccharomycetales Methionyl-tRNA synthetase (MetRS) subfamily; This model characterizes the GST_C-like domain found in the N-terminal region of Saccharomycetales MetRS. Aminoacyl-tRNA synthetases (aaRSs) comprise a family of enzymes that catalyze the coupling of amino acids with their matching tRNAs. This involves the formation of an aminoacyl adenylate using ATP, followed by the transfer of the activated amino acid to the 3'-adenosine moiety of the tRNA. AaRSs may also be involved in translational and transcriptional regulation, as well as in tRNA processing. MetRS is a class I aaRS, containing a Rossmann fold catalytic core. It recognizes the initiator tRNA as well as the Met-tRNA for protein chain elongation. The GST_C-like domain of MetRS from Saccharomycetales is involved in protein-protein interactions, to mediate the formation of the MetRS-Arc1p-GluRS ternary complex which is considered an evolutionary intermediate between prokaryotic aaRS and the multi-aaRS complex found in higher eukaryotes. AaRSs from prokaryotes, which are active as dimers, do not contain this GST_C-like domain.	95
198324	cd10291	GST_C_YfcG_like	C-terminal, alpha helical domain of Escherichia coli YfcG Glutathione S-transferases and related uncharacterized proteins. Glutathione S-transferase (GST) C-terminal domain family, YfcG-like subfamily; composed of the Escherichia coli YfcG and related proteins. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST active site is located in a cleft between the N- and C-terminal domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. YfcG is one of nine GST homologs in Escherichia coli. It is expressed predominantly during the late stationary phase where the predominant form of GSH is glutathionylspermidine (GspSH), suggesting that YfcG might interact with GspSH. It has very low or no GSH transferase or peroxidase activity, but displays a unique disulfide bond reductase activity that is comparable to thioredoxins (TRXs) and glutaredoxins (GRXs). However, unlike TRXs and GRXs, YfcG does not contain a redox active cysteine residue and may use a bound thiol disulfide couple such as 2GSH/GSSG for activity. The crystal structure of YfcG reveals a bound GSSG molecule in its active site. The actual physiological substrates for YfcG are yet to be identified.	110
198325	cd10292	GST_C_YghU_like	C-terminal, alpha helical domain of Escherichia coli YghU Glutathione S-transferases and related uncharacterized proteins. Glutathione S-transferase (GST) C-terminal domain family, YghU-like subfamily; composed of the Escherichia coli YghU and related proteins. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST active site is located in a cleft between the N- and C-terminal domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. YghU is one of nine GST homologs in the genome of Escherichia coli. It is similar to Escherichia coli YfcG in that it has poor GSH transferase activity towards typical substrates. It shows modest reductase activity towards some organic hydroperoxides. Like YfcG, YghU also shows good disulfide bond oxidoreductase activity comparable to the activities of glutaredoxins and thioredoxins. YghU does not contain a redox active cysteine residue, and may use a bound thiol disulfide couple such as 2GSH/GSSG for activity. The crystal structure of YghU reveals two GSH molecules bound in its active site.	118
198326	cd10293	GST_C_Ure2p	C-terminal, alpha helical domain of fungal Ure2p Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, Ure2p subfamily; composed of the Saccharomyces cerevisiae Ure2p and related fungal proteins. Ure2p is a regulator for nitrogen catabolism in yeast. It represses the expression of several gene products involved in the use of poor nitrogen sources when rich sources are available. A transmissible conformational change of Ure2p results in a prion called [Ure3], an inactive, self-propagating and infectious amyloid. Ure2p displays a GST fold containing an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain. The N-terminal thioredoxin-fold domain is sufficient to induce the [Ure3] phenotype and is also called the prion domain of Ure2p. In addition to its role in nitrogen regulation, Ure2p confers protection to cells against heavy metal ion and oxidant toxicity, and shows glutathione (GSH) peroxidase activity. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of GSH with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST active site is located in a cleft between the N- and C-terminal domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain.	117
198327	cd10294	GST_C_ValRS_N	Glutathione S-transferase C-terminal-like, alpha helical domain of vertebrate Valyl-tRNA synthetase. Glutathione S-transferase (GST) C-terminal domain family, Valyl-tRNA synthetase (ValRS) subfamily; This model characterizes the GST_C-like domain found in the N-terminal region of human ValRS and its homologs from other vertebrates such as frog and zebrafish. Aminoacyl-tRNA synthetases (aaRSs) comprise a family of enzymes that catalyze the coupling of amino acids with their matching tRNAs. This involves the formation of an aminoacyl adenylate using ATP, followed by the transfer of the activated amino acid to the 3'-adenosine moiety of the tRNA. AaRSs may also be involved in translational and transcriptional regulation, as well as in tRNA processing. They typically form large stable complexes with other proteins. ValRS forms a stable complex with Elongation Factor-1H (EF-1H), and together, they catalyze consecutive steps in protein biosynthesis, tRNA aminoacylation and its transfer to EF. The GST_C-like domain of ValRS from higher eukaryotes is likely involved in protein-protein interactions, to mediate the formation of the multi-aaRS complex that acts as a molecular hub to coordinate protein synthesis. ValRSs from prokaryotes and lower eukaryotes, such as fungi and plants, do not appear to contain this GST_C-like domain.	123
198328	cd10295	GST_C_Sigma	C-terminal, alpha helical domain of Class Sigma Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, Class Sigma; GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Vertebrate class Sigma GSTs are characterized as GSH-dependent hematopoietic prostaglandin (PG) D synthases and are responsible for the production of PGD2 by catalyzing the isomerization of PGH2. The functions of PGD2 include the maintenance of body temperature, inhibition of platelet aggregation, bronchoconstriction, vasodilation, and mediation of allergy and inflammation.	100
198329	cd10296	GST_C_CLIC4	C-terminal, alpha helical domain of Chloride Intracellular Channel 4. Glutathione S-transferase (GST) C-terminal domain family, Chloride Intracellular Channel (CLIC) 4 subfamily; CLICs are auto-inserting, self-assembling intracellular anion channels involved in a wide variety of functions including regulated secretion, cell division, and apoptosis. They can exist in both water-soluble and membrane-bound states and are found in various vesicles and membranes, and they may play roles in the maintenance of these intracellular membranes. The membrane localization domain is present in the N-terminal part of the protein. Structures of soluble CLICs reveal that they adopt a fold similar to GSTs, containing an N-terminal domain with a thioredoxin fold and a C-terminal alpha helical domain. CLIC4, also known as p64H1, is expressed ubiquitously and its localization varies depending on the nature of the cells and tissues, from the plasma membrane to subcellular compartments including the nucleus, mitochondria, ER, and the trans-Golgi network, among others. In response to cellular stress such as DNA damage and senescence, cytoplasmic CLIC4 translocates to the nucleus, where it acts on the TGF-beta pathway. Studies on knockout mice suggest that CLIC4 also plays an important role in angiogenesis, specifically in network formation, capillary sprouting, and lumen formation. CLIC4 has been found to induce apoptosis in several cell types and to retard the growth of grafted tumors in vivo.	141
198330	cd10297	GST_C_CLIC5	C-terminal, alpha helical domain of Chloride Intracellular Channel 5. Glutathione S-transferase (GST) C-terminal domain family, Chloride Intracellular Channel (CLIC) 5 subfamily; CLICs are auto-inserting, self-assembling intracellular anion channels involved in a wide variety of functions including regulated secretion, cell division, and apoptosis. They can exist in both water-soluble and membrane-bound states and are found in various vesicles and membranes, and they may play roles in the maintenance of these intracellular membranes. The membrane localization domain is present in the N-terminal part of the protein. Structures of soluble CLICs reveal that they adopt a fold similar to GSTs, containing an N-terminal domain with a thioredoxin fold and a C-terminal alpha helical domain. CLIC5 exists in two alternatively-spliced isoforms, CLIC5A or CLIC5B (also called p64). It is expressed at high levels in hair cell stereocilia and is associated with the actin cytoskeleton and ezrin. A recessive mutation in the CLIC5 gene in mice led to the lack of coordination and deafness, due to a defect in the basal region of the hair bundle causing stereocilia to degrade. CLIC5 is therefore essential for normal inner ear function. CLIC5 is also highly expressed in podocytes where it is colocalized with the ezrin/radixin/moesin (ERM) complex. It is essential for foot process integrity, and for podocyte morphology and function.	141
198331	cd10298	GST_C_CLIC2	C-terminal, alpha helical domain of Chloride Intracellular Channel 2. Glutathione S-transferase (GST) C-terminal domain family, Chloride Intracellular Channel (CLIC) 2 subfamily; CLICs are auto-inserting, self-assembling intracellular anion channels involved in a wide variety of functions including regulated secretion, cell division, and apoptosis. They can exist in both water-soluble and membrane-bound states and are found in various vesicles and membranes, and they may play roles in the maintenance of these intracellular membranes. The membrane localization domain is present in the N-terminal part of the protein. Structures of soluble CLICs reveal that they adopt a fold similar to GSTs, containing an N-terminal domain with a thioredoxin fold and a C-terminal alpha helical domain. CLIC2 contains an intramolecular disulfide bond and exists as a monomer regardless of redox conditions, in contrast to CLIC1 which forms a dimer under oxidizing conditions. It is expressed in most tissues except the brain, and is highly expressed in the lung, spleen, and in cardiac and skeletal muscles. CLIC2 interacts with ryanodine receptors (cardiac RyR2 and skeletal RyR1) and modulates their activity, suggesting that CLIC2 may function in the regulation of calcium release and signaling in cardiac and skeletal muscles.	138
198332	cd10299	GST_C_CLIC3	C-terminal, alpha helical domain of Chloride Intracellular Channel 3. Glutathione S-transferase (GST) C-terminal domain family, Chloride Intracellular Channel (CLIC) 3 subfamily; CLICs are auto-inserting, self-assembling intracellular anion channels involved in a wide variety of functions including regulated secretion, cell division, and apoptosis. They can exist in both water-soluble and membrane-bound states and are found in various vesicles and membranes, and they may play roles in the maintenance of these intracellular membranes. The membrane localization domain is present in the N-terminal part of the protein. Structures of soluble CLICs reveal that they adopt a fold similar to GSTs, containing an N-terminal domain with a thioredoxin fold and a C-terminal alpha helical domain. CLIC3 is highly expressed in placental tissues, and may play a role in fetal development.	133
198333	cd10300	GST_C_CLIC1	C-terminal, alpha helical domain of Chloride Intracellular Channel 1. Glutathione S-transferase (GST) C-terminal domain family, Chloride Intracellular Channel (CLIC) 1 subfamily; CLICs are auto-inserting, self-assembling intracellular anion channels involved in a wide variety of functions including regulated secretion, cell division, and apoptosis. They can exist in both water-soluble and membrane-bound states and are found in various vesicles and membranes, and they may play roles in the maintenance of these intracellular membranes. The membrane localization domain is present in the N-terminal part of the protein. Soluble CLIC1 is monomeric and adopts a fold similar to GSTs, containing an N-terminal domain with a thioredoxin fold and a C-terminal alpha helical domain. Upon oxidation, the N-terminal domain of CLIC1 undergoes a structural change to form a non-covalent dimer stabilized by the formation of an intramolecular disulfide bond between two cysteines that are far apart in the reduced form. The CLIC1 dimer bears no similarity to GST dimers. The redox-controlled structural rearrangement exposes a large hydrophobic surface, which is masked by dimerization in vitro. In vivo, this surface may represent the docking interface of CLIC1 in its membrane-bound state. The two cysteines in CLIC1 that form the disulfide bond in oxidizing conditions are essential for dimerization and chloride channel activity. CLIC1 is widely expressed in many tissues and its subcellular localization is dependent on cell type and cell cycle phase. It acts as a sensor of cell oxidation and appears to have a role in diseases that involve oxidative stress including tumorigenic and neurodegenerative diseases.	139
198334	cd10301	GST_C_CLIC6	C-terminal, alpha helical domain of Chloride Intracellular Channel 6. Glutathione S-transferase (GST) C-terminal domain family, Chloride Intracellular Channel (CLIC) 6 subfamily; CLICs are auto-inserting, self-assembling intracellular anion channels involved in a wide variety of functions including regulated secretion, cell division, and apoptosis. They can exist in both water-soluble and membrane-bound states and are found in various vesicles and membranes, and they may play roles in the maintenance of these intracellular membranes. The membrane localization domain is present in the N-terminal part of the protein. Structures of soluble CLICs reveal that they adopt a fold similar to GSTs, containing an N-terminal domain with a thioredoxin fold and a C-terminal alpha helical domain. CLIC6 is expressed predominantly in the stomach, pituitary, and brain. It interacts with D2-like dopamine receptors directly and through scaffolding proteins. CLIC6 may be involved in the regulation of secretion, possibly through chloride ion transport regulation.	140
198335	cd10302	GST_C_GDAP1L1	C-terminal, alpha helical domain of Ganglioside-induced differentiation-associated protein 1-like 1. Glutathione S-transferase (GST) C-terminal domain family, Ganglioside-induced differentiation-associated protein 1-like 1 (GDAP1L1) subfamily; GDAP1L1 is a paralogue of GDAP1 with about 56% sequence identity and 70% similarity. Its function is unknown. Like GDAP1, it does not exhibit GST activity using standard substrates. GDAP1 was originally identified as a highly expressed gene at the differentiated stage of GD3 synthase-transfected cells. More recently, mutations in GDAP1 have been reported to cause both axonal and demyelinating autosomal-recessive Charcot-Marie-Tooth (CMT) type 4A neuropathy. CMT is characterized by slow and progressive weakness and atrophy of muscles. Sequence analysis of GDAP1 shows similarities and differences with GSTs; it appears to contain both N-terminal thioredoxin-fold and C-terminal alpha helical domains of GSTs, however, it also contains additional C-terminal transmembrane domains unlike GSTs. GDAP1 is mainly expressed in neuronal cells and is localized in the mitochondria through its transmembrane domains.	111
198336	cd10303	GST_C_GDAP1	C-terminal, alpha helical domain of Ganglioside-induced differentiation-associated protein 1. Glutathione S-transferase (GST) C-terminal domain family, Ganglioside-induced differentiation-associated protein 1 (GDAP1) subfamily; GDAP1 was originally identified as a highly expressed gene at the differentiated stage of GD3 synthase-transfected cells. More recently, mutations in GDAP1 have been reported to cause both axonal and demyelinating autosomal-recessive Charcot-Marie-Tooth (CMT) type 4A neuropathy. CMT is characterized by slow and progressive weakness and atrophy of muscles. Sequence analysis of GDAP1 shows similarities and differences with GSTs; it appears to contain both N-terminal thioredoxin-fold and C-terminal alpha helical domains of GSTs, however, it also contains additional C-terminal transmembrane domains unlike GSTs. GDAP1 is mainly expressed in neuronal cells and is localized in the mitochondria through its transmembrane domains. It does not exhibit GST activity using standard substrates.	111
198337	cd10304	GST_C_Arc1p_N_like	Glutathione S-transferase C-terminal-like, alpha helical domain of the Aminoacyl tRNA synthetase cofactor 1 and similar proteins. Glutathione S-transferase (GST) C-terminal domain family, Aminoacyl tRNA synthetase cofactor 1 (Arc1p)-like subfamily; Arc1p, also called GU4 nucleic binding protein 1 (G4p1) or p42, is a tRNA-aminoacylation and nuclear-export cofactor. It contains a domain in the N-terminal region with similarity to the C-terminal alpha helical domain of GSTs. This domain mediates the association of the aminoacyl tRNA synthetases (aaRSs), MetRS and GluRS, in yeast to form a stable stoichiometric ternary complex. The GST_C-like domain of Arc1p is a protein-protein interaction domain containing two binding sites which enable it to bind the two aaRSs simultaneously and independently. The MetRS-Arc1p-GluRS complex selectively recruits and aminoacylates its cognate tRNAs without additional cofactors. Arc1p also plays a role in the transport of tRNA from the nucleus to the cytoplasm. It may also control the subcellular distribution of GluRS in the cytoplasm, nucleoplasm, and the mitochondrial matrix.	100
198338	cd10305	GST_C_AIMP3	Glutathione S-transferase C-terminal-like, alpha helical domain of Aminoacyl tRNA synthetase complex-Interacting Multifunctional Protein 3. Glutathione S-transferase (GST) C-terminal domain family, Aminoacyl tRNA synthetase complex-Interacting Multifunctional Protein (AIMP) 3 subfamily; AIMPs are non-enzymatic cofactors that play critical roles in the assembly and formation of a macromolecular multi-tRNA synthetase protein complex that functions as a molecular hub to coordinate protein synthesis. There are three AIMPs, named AIMP1-3, which play diverse regulatory roles. AIMP3, also called p18 or eukaryotic translation elongation factor 1 epsilon-1 (EEF1E1), contains a C-terminal domain with similarity to the C-terminal alpha helical domain of GSTs. It specifically interacts with methionyl-tRNA synthetase (MetRS) and is translocated to the nucleus during DNA synthesis or in response to DNA damage and oncogenic stress. In the nucleus, it interacts with ATM and ATR, which are upstream kinase regulators of p53. It appears to work against DNA damage in cooperation with AIMP2, and similar to AIMP2, AIMP3 is also a haploinsufficient tumor suppressor. AIMP3 transgenic mice have shorter lifespans than wild-type mice and they show characteristics of progeria, suggesting that AIMP3 may also be involved in cellular and organismal aging.	101
198339	cd10306	GST_C_GluRS_N	Glutathione S-transferase C-terminal-like, alpha helical domain of Glutamyl-tRNA synthetase. Glutathione S-transferase (GST) C-terminal domain family, Glutamyl-tRNA synthetase (GluRS) subfamily; This model characterizes the GST_C-like domain found in the N-terminal region of GluRS from lower eukaryotes. Aminoacyl-tRNA synthetases (aaRSs) comprise a family of enzymes that catalyze the coupling of amino acids with their matching tRNAs. This involves the formation of an aminoacyl adenylate using ATP, followed by the transfer of the activated amino acid to the 3'-adenosine moiety of the tRNA. AaRSs may also be involved in translational and transcriptional regulation, as well as in tRNA processing. The GST_C-like domain of GluRS is involved in protein-protein interactions. This domain mediates the formation of the MetRS-Arc1p-GluRS ternary complex found in lower eukaryotes, which is considered an evolutionary intermediate between prokaryotic aaRS and the multi-aaRS complex found in higher eukaryotes. AaRSs from prokaryotes, which are active as dimers, do not contain this GST_C-like domain.	87
198340	cd10307	GST_C_MetRS_N	Glutathione S-transferase C-terminal-like, alpha helical domain of Methionyl-tRNA synthetase from higher eukaryotes. Glutathione S-transferase (GST) C-terminal domain family, Methionyl-tRNA synthetase (MetRS) subfamily; This model characterizes the GST_C-like domain found in the N-terminal region of MetRS from higher eukaryotes. Aminoacyl-tRNA synthetases (aaRSs) comprise a family of enzymes that catalyze the coupling of amino acids with their matching tRNAs. This involves the formation of an aminoacyl adenylate using ATP, followed by the transfer of the activated amino acid to the 3'-adenosine moiety of the tRNA. AaRSs may also be involved in translational and transcriptional regulation, as well as in tRNA processing. MetRS is a class I aaRS, containing a Rossmann fold catalytic core. It recognizes the initiator tRNA as well as the Met-tRNA for protein chain elongation. The GST_C-like domain of MetRS from higher eukaryotes is likely involved in protein-protein interactions, to mediate the formation of the multi-aaRS complex that acts as a molecular hub to coordinate protein synthesis. AaRSs from prokaryotes, which are active as dimers, do not contain this GST_C-like domain.	102
198341	cd10308	GST_C_eEF1b_like	Glutathione S-transferase C-terminal-like, alpha helical domain of eukaryotic translation Elongation Factor 1 beta. Glutathione S-transferase (GST) C-terminal domain family, eukaryotic translation Elongation Factor 1 beta (eEF1b) subfamily; eEF1b is a component of the eukaryotic translation elongation factor-1 (EF1) complex which plays a central role in the elongation cycle during protein biosynthesis. EF1 consists of two functionally distinct units, EF1A and EF1B. EF1A catalyzes the GTP-dependent binding of aminoacyl-tRNA to the ribosomal A site concomitant with the hydrolysis of GTP. The resulting inactive EF1A:GDP complex is recycled to the active GTP form by the guanine-nucleotide exchange factor EF1B, a complex composed of at least two subunits, alpha and gamma. Metazoan EF1B contains a third subunit, beta. eEF1b contains a GST_C-like alpha helical domain at the N-terminal region and a C-terminal guanine nucleotide exchange domain. The GST_C-like domain likely functions as a protein-protein interaction domain, similar to the function of the GST_C-like domains of EF1Bgamma and various aminoacyl-tRNA synthetases (aaRSs) from higher eukaryotes.	82
198342	cd10309	GST_C_GluProRS_N	Glutathione S-transferase C-terminal-like, alpha helical domain of bifunctional Glutamyl-Prolyl-tRNA synthetase. Glutathione S-transferase (GST) C-terminal domain family, bifunctional GluRS-Prolyl-tRNA synthetase (GluProRS) subfamily; This model characterizes the GST_C-like domain found in the N-terminal region of GluProRS from higher eukaryotes. Aminoacyl-tRNA synthetases (aaRSs) comprise a family of enzymes that catalyze the coupling of amino acids with their matching tRNAs. This involves the formation of an aminoacyl adenylate using ATP, followed by the transfer of the activated amino acid to the 3'-adenosine moiety of the tRNA. AaRSs may also be involved in translational and transcriptional regulation, as well as in tRNA processing. The GST_C-like domain of GluProRS may be involved in protein-protein interactions, mediating the formation of the multi-aaRS complex in higher eukaryotes. The multi-aaRS complex acts as a molecular hub for protein synthesis. AaRSs from prokaryotes, which are active as dimers, do not contain this GST_C-like domain.	81
198343	cd10310	GST_C_CysRS_N	Glutathione S-transferase C-terminal-like, alpha helical domain of Cysteinyl-tRNA synthetase from higher eukaryotes. Glutathione S-transferase (GST) C-terminal domain family, Cysteinyl-tRNA synthetase (CysRS) subfamily; This model characterizes the GST_C-like domain found in the N-terminal region of CysRS from higher eukaryotes. Aminoacyl-tRNA synthetases (aaRSs) comprise a family of enzymes that catalyze the coupling of amino acids with their matching tRNAs. This involves the formation of an aminoacyl adenylate using ATP, followed by the transfer of the activated amino acid to the 3'-adenosine moiety of the tRNA. AaRSs may also be involved in translational and transcriptional regulation, as well as in tRNA processing. The GST_C-like domain of CysRS from higher eukaryotes is likely involved in protein-protein interactions, to mediate the formation of the multi-aaRS complex that acts as a molecular hub to coordinate protein synthesis. CysRSs from prokaryotes and lower eukaryotes do not appear to contain this GST_C-like domain.	73
197304	cd10311	PLDc_N_DEXD_c	N-terminal putative catalytic domain of uncharacterized prokaryotic and archaeal HKD family nucleases fused to a DEAD/DEAH box helicase domain. N-terminal putative catalytic domain of uncharacterized prokaryotic and archaeal HKD family nucleases fused to a DEAD/DEAH box helicase domain. All members of this subfamily are uncharacterized. Other characterized members of the superfamily that have a related domain architecture (containing a DEAD/DEAH box helicase domain), include the DNA/RNA helicase superfamily II (SF2) and Res-subunit of type III restriction endonucleases. In addition to the helicase-like region, members of this subfamily also contain one copy of the conserved HKD motif (H-x-K-x(4)-D, where x represents any amino acid residue) in the N-terminal putative catalytic domain. The HKD motif characterizes the phospholipase D (PLD, EC 3.1.4.4) superfamily.	156
197339	cd10312	Deadenylase_CCR4b	C-terminal deadenylase domain of CCR4b, also known as CCR4-NOT transcription complex subunit 6-like. This subfamily contains the C-terminal catalytic domain of the deadenylase, CCR4b, also known as CCR4-NOT transcription complex subunit 6-like (CNOT6L). CCR4 belongs to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. CCR4 is the major deadenylase subunit of the CCR4-NOT transcription complex, which contains two deadenylase subunits and several noncatalytic subunits. The other deadenylase subunit, Caf1, is a DEDD-type protein and does not belong in this superfamily. There are two vertebrate CCR4 proteins, CCR4a (also called CCR4-NOT transcription complex subunit 6 or CNOT6) and CCR4b. CCR4b associates with other components, such as CNOT1-3 and Caf1, to form a CCR4-NOT multisubunit complex, which regulates transcription and mRNA degradation. The nuclease domain of CCR4b exhibits Mg2+-dependent deadenylase activity with strict specificity for poly (A) RNA as substrate. CCR4b is mainly localized in the cytoplasm. It regulates cell growth and influences cell cycle progression by regulating p27/Kip1 mRNA levels. It contributes to the prevention of cell death by regulating insulin-like growth factor-binding protein 5.	348
197340	cd10313	Deadenylase_CCR4a	C-terminal deadenylase domain of CCR4a, also known as CCR4-NOT transcription complex subunit 6. This subfamily contains the C-terminal catalytic domain of the deadenylase, CCR4a, also known as CCR4-NOT transcription complex subunit 6 (CNOT6). CCR4 belongs to the large EEP (exonuclease/endonuclease/phosphatase) superfamily that contains functionally diverse enzymes that share a common catalytic mechanism of cleaving phosphodiester bonds. CCR4 is the major deadenylase subunit of the CCR4-NOT transcription complex, which contains two deadenylase subunits and several noncatalytic subunits. The other deadenylase subunit, Caf1, is a DEDD-type protein and does not belong in this superfamily. There are two vertebrate CCR4 proteins, CCR4a and CCR4b (also called CNOT6-like or CNOT6L). CCR4a associates with other components, such as CNOT1-3 and Caf1, to form a CCR4-NOT multisubunit complex, which regulates transcription and mRNA degradation. The nuclease domain of CCR4a exhibits Mg2+-dependent deadenylase activity with specificity for poly (A) RNA as substrate. CCR4a is a component of P-bodies and is necessary for foci formation of various P-body components. It also plays a role in cellular responses to DNA damage, by regulating Chk2 activity.	350
198457	cd10314	FAM20_C	C-terminal putative kinase domain of FAM20 (family with sequence similarity 20) proteins. This family contains the C-terminal domain of FAM20A, -B, -C and related proteins. FAM20A may participate in enamel development and gingival homeostasis, FAM20B in proteoglycan production, and FAM20C in bone development. FAM20B is a xylose kinase that may regulate the number of glycosaminoglycan chains by phosphorylating the xylose residue in the glycosaminoglycan-protein linkage region of proteoglycans. FAM20C, also called Dentin Matrix Protein 4, is abundant in the dentin matrix, and may participate in the differentiation of mesenchymal precursor cells into functional odontoblast-like cells. Mutations in FAM20C are associated with lethal Osteosclerotic Bone Dysplasia (Raine Syndrome), and mutations in FAM20A with Amelogenesis imperfecta (AI) and Gingival Hyperplasia Syndrome. The C-terminal domains of members of this family are putative kinase domains, based on mutagenesis of the C-terminal domain of Drosophila Four-Jointed, a related Golgi kinase. This domain family is also known as DUF1193.	209
199215	cd10315	CBM41_pullulanase	Family 41 Carbohydrate-Binding Module from pullulanase-like enzymes. Pullulanases (EC 3.2.1.41) are a group of starch-debranching enzymes, catalyzing the hydrolysis of the alpha-1,6-glucosidic linkages of alpha-glucans, preferentially pullulan. Pullulan is a polysaccharide in which alpha-1,4 linked maltotriosyl units are combined via an alpha-1,6 linkage. These enzymes are of importance in the starch industry, where they are used to hydrolyze amylopectin starch. Pullulanases consist of multiple distinct domains, including a catalytic domain belonging to the glycoside hydrolase (GH) family 13 and carbohydrate-binding modules (CBM), including CBM41.	100
199904	cd10316	RGL4_M	Middle domain of rhamnogalacturonan lyase, a family 4 polysaccharide lyase. The rhamnogalacturonan lyase of the polysaccharide lyase family 4 (RGL4) is involved in the degradation of RG (rhamnogalacturonan) type-I, an important pectic plant cell wall polysaccharide, by cleaving the alpha-1,4 glycoside bond between L-rhamnose and D-galacturonic acids in the backbone of RG type-I through a beta-elimination reaction. RGL4 consists of three domains, an N-terminal catalytic domain, a middle domain with a FNIII type fold and a C-terminal domain with a jelly roll fold. Both the middle domain represented by this model and the C-terminal domain are putative carbohydrate binding modules. There are two types of RG lyases, which both cleave the alpha-1,4 bonds of the RG-I main chain (RG chain) through the beta-elimination reaction, but belong to two structurally unrelated polysaccharide lyase (PL) families, 4 and 11.	92
199905	cd10317	RGL4_C	C-terminal domain of rhamnogalacturonan lyase, a family 4 polysaccharide lyase. The rhamnogalacturonan lyase of the polysaccharide lyase family 4 (RGL4) is involved in the degradation of RG (rhamnogalacturonan) type-I, an important pectic plant cell wall polysaccharide, by cleaving the alpha-1,4 glycoside bond between L-rhamnose and D-galacturonic acids in the backbone of RG type-I through a beta-elimination reaction. RGL4 consists of three domains, an N-terminal catalytic domain, a middle domain with a FNIII type fold and a C-terminal domain with a jelly roll fold. Both the middle and the C-terminal domain are putative carbohydrate binding modules. There are two types of RG lyases, which both cleave the alpha-1,4 bonds of the RG-I main chain (RG chain) through the beta-elimination reaction, but belong to two structurally unrelated polysaccharide lyase (PL) families, 4 and 11.	161
199906	cd10318	RGL11	Rhamnogalacturonan lyase of the polysaccharide lyase family 11. The rhamnogalacturonan lyase of the polysaccharide lyase family 11 (RGL11) cleaves glycoside bonds in polygalacturonan as well as RG (rhamnogalacturonan) type-I through a beta-elimination reaction. Functionally characterized members of this family, YesW and YesX from Bacillus subtilis, cleave glycoside bonds between rhamnose and galacturonic acid residues in the RG-I region of plant cell wall pectin. YesW and YesX work synergistically, with YesW cleaving the glycoside bond of the RG chain endolytically, and YesX converting the resultant oligosaccharides through an exotype reaction. This domain is sometimes found in architectures with non-catalytic carbohydrate-binding modules (CBMs). There are two types of RG lyases, which both cleave the alpha-1,4 bonds of the RG-I main chain through a beta-elimination reaction, but belong to two structurally unrelated polysaccharide lyase (PL) families, 4 and 11.	564
198439	cd10319	EphR_LBD	Ligand Binding Domain of Ephrin Receptors. Ephrin receptors (EphRs) comprise the largest subfamily of receptor tyrosine kinases (RTKs). They are subdivided into 2 groups, A and B type receptors, depending on their ligand ephrin-A or ephrin-B, respectively. In general, class EphA receptors bind GPI-anchored ephrin-A ligands. There are ten vertebrate EphA receptors (EphA1-10), which display promiscuous interactions with six ephrin-A ligands. Class EphB receptors bind to transmembrane ephrin-B ligands. There are six vertebrate EphB receptors (EphB1-6), which display promiscuous interactions with three ephrin-B ligands. One exception is EphB2, which also interacts with ephrin A5. EphRs contain a ligand binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyrosine kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling). Ephrin/EphR interaction mainly results in cell-cell repulsion or adhesion, making it important in neural development and plasticity, cell morphogenesis, cell-fate determination, embryonic development, tissue patterning, and angiogenesis.	177
199907	cd10320	RGL4_N	N-terminal catalytic domain of rhamnogalacturonan lyase, a family 4 polysaccharide lyase. The rhamnogalacturonan lyase of the polysaccharide lyase family 4 (RGL4) is involved in the degradation of RG (rhamnogalacturonan) type-I, an important pectic plant cell wall polysaccharide, by cleaving the alpha-1,4 glycoside bond between L-rhamnose and D-galacturonic acids in the backbone of RG type-I through a beta-elimination reaction. RGL4 consists of three domains, an N-terminal catalytic domain, a middle domain with a FNIII type fold and a C-terminal domain with a jelly roll fold; the middle and C-terminal domains are both putative carbohydrate binding modules. There are two types of RG lyases, which both cleave the alpha-1,4 bonds of the RG-I main chain (RG chain) through the beta-elimination reaction, but belong to two structurally unrelated polysaccharide lyase (PL) families, 4 and 11.	265
199216	cd10321	RNase_Ire1_like	RNase domain (also known as the kinase extension nuclease domain) of Ire1 and RNase L. This RNase domain is found in the multi-functional protein Ire1; Ire1 also contains a type I transmembrane serine/threonine protein kinase (STK) domain, and a Luminal dimerization domain. Ire1 is essential for the endoplasmic reticulum (ER) unfolded protein response (UPR). The UPR is activated when protein misfolding is detected in the ER in order to reduce the synthesis of new proteins and increase the capacity of the ER to cope with the stress. IRE1 acts as an ER stress sensor; IRE1 dimerizes through its N-terminal luminal domain and forms oligomers, promoting trans-autophosphorylation by its cytosolic kinase domain which stimulates its endoribonuclease (RNase) activity and results in the cleavage of its mRNA substrate, Hac1 in yeast and Xbp1 in metazoans, thus promoting a splicing event that enables translation into a transcription factor which activates the UPR. This RNase domain is also found in Ribonuclease L (RNase L), sometimes referred to as the 2-5A-dependent RNase. RNase L is a highly regulated, latent endoribonuclease widely expressed in most mammalian tissues. It is involved in the mediation of the antiviral and pro-apoptotic activities of the interferon-inducible 2-5A system; the interferon (IFN)-inducible 2'-5'-oligoadenylate synthetase (OAS)/RNase L pathway blocks infections by certain types of viruses through cleavage of viral and cellular single-stranded RNA. RNase L has been shown to have an impact on the pathogenesis of prostate cancer; the RNase L gene, RNASEL, has been identified as a strong candidate for the hereditary prostate cancer 1 (HPC1) allele.	127
271357	cd10322	SLC5sbd	Solute carrier 5 family, sodium/glucose transporters and related proteins; solute-binding domain. This family represents the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporter family or solute sodium symporter family) that co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. Family members include: the human glucose (SGLT1, 2, 4, 5), chiro-inositol (SGLT5), myo-inositol (SMIT), choline (CHT), iodide (NIS), multivitamin (SMVT), and monocarboxylate (SMCT) cotransporters, as well as Vibrio parahaemolyticus glucose/galactose (vSGLT), and Escherichia coli proline (PutP) and pantothenate (PanF) cotransporters. Vibrio parahaemolyticus Na(+)/galactose cotransporter (vSGLT) has 13 transmembrane helices (TMs): TM-1, an inverted topology repeat: TMs1-5 and TMs6-10, and TMs 11-12 (TMs numbered to conform to the solute carrier 6 family Aquifex aeolicus LeuT). One member of this family, human SGLT3, has been characterized as a glucose sensor and not a transporter. Members of this family are important in human physiology and disease.	454
271358	cd10323	SLC-NCS1sbd	nucleobase-cation-symport-1 (NCS1) transporters; solute-binding domain. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. This family includes Microbacterium liquefaciens Mhp1, a transporter that mediates the uptake of indolyl methyl- and benzyl-hydantoins as part of a metabolic salvage pathway for their conversion to amino acids. It also includes various Saccharomyces cerevisiae transporters: Fcy21p (Purine-cytosine permease), vitamin B6 transporter Tpn1, nicotinamide riboside transporter 1 (Nrt1p, also called Thi71p), Dal4p (allantoin permease), Fui1p (uridine permease), and Fur4p (uracil permease). Mhp1 has 12 transmembrane (TM) helices (an inverted topology repeat: TMs1-5 and TMs6-10, and TMs11-12; TMs numbered to conform to the solute carrier 6 family Aquifex aeolicus LeuT). NCS1s belong to a superfamily which also contains the solute carrier 5 family sodium/glucose transporters (SLC5s), and SLC6 neurotransmitter transporters.	414
271359	cd10324	SLC6sbd	Solute carrier 6 family, neurotransmitter transporters; solute-binding domain. This family represents the solute-binding domain of SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporter family or Na+/Cl--dependent transporter family). These use sodium and chloride electrochemical gradients to catalyze the thermodynamically uphill movement of a variety of substrates, and include neurotransmitter transporters (NTTs). The latter are Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin (5-hydroxytryptamine), dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NTTs are widely expressed in the mammalian brain, and are involved in regulating neurotransmitter signaling and homeostasis, through facilitating the uptake of released neurotransmitters from the extracellular space into neurons and glial cells. NTTs are the target of a range of therapeutic drugs for the treatment of psychiatric diseases, such as major depression, anxiety disorders, attention deficit hyperactivity disorder and epilepsy. In addition, they are the primary targets of cocaine, amphetamines and other psychostimulants. This family also includes Drosophila Blot which is expressed primarily in epithelial tissues of ectodermal origin and in the nervous system of the embryo and larvae, but in addition found in the developing oocyte and the freshly laid egg. A lack or reduction of Blot function during oogenesis results in early arrest of embryonic development. 12 transmembrane helices (TMs) appear to be common for eukaryotic and some prokaryotic and archaeal SLC6s, (a core inverted topology repeat, TM1-5 and TM6-10, plus TMs11-12; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT), although a majority of bacterial, and some archaeal SLC6s lack TM12, for example the functional Fusobacterium nucleatum tyrosine transporter Tyt1.	415
271360	cd10325	SLC5sbd_vSGLT	Vibrio parahaemolyticus Na(+)/galactose cotransporter (vSGLT) and related proteins; solute binding domain. vSGLT transports D-galactose, D-glucose, and alpha-D-fucose, with a sugar specificity in the order of D-galactose >D-fucose >D-glucose. It transports one Na+ ion for each sugar molecule, and appears to function as a monomer. vSGLT has 13 transmembrane helices (TMs): TM-1, an inverted topology repeat: TMs1-5 and TMs6-10, and TMs 11-12 (TMs numbered to conform to the solute carrier 6 family Aquifex aeolicus LeuT). This subfamily belongs to the solute carrier 5 (SLC5) transporter family.	523
271361	cd10326	SLC5sbd_NIS-like	Na(+)/iodide (NIS) and Na(+)/multivitamin (SMVT) cotransporters, and related proteins; solute binding domain. NIS (product of the SLC5A5 gene) transports I-, and other anions including ClO4-, SCN-, and Br-. SMVT (product of the SLC5A6 gene) transports biotin, pantothenic acid and lipoate. This subfamily also includes SMCT1 and 2. SMCT1 (the product of the SLC5A8 gene) is a high-affinity transporter of various monocarboxylates including lactate and pyruvate, short-chain fatty acids, ketone bodies, nicotinate and its structural analogs, pyroglutamate, benzoate and its derivatives, and iodide. SMCT2 (product of the SLC5A12 gene) is a low-affinity transporter for short-chain fatty acids, lactate, pyruvate, and nicotinate. This subfamily belongs to the solute carrier 5 (SLC5) transporter family.	472
212037	cd10327	SLC5sbd_PanF	Na(+)/pantothenate cotransporters: PanF of Escherichia coli and related proteins; solute binding domain. PanF catalyzes the Na+-coupled uptake of extracellular pantothenate for coenzyme A biosynthesis in cells. This subfamily belongs to the solute carrier 5 (SLC5) transporter family.	472
271362	cd10328	SLC5sbd_YidK	uncharacterized SLC5 subfamily, Escherichia coli YidK-like; solute binding domain. Uncharacterized subfamily of the solute binding domain of the solute carrier 5 (SLC5) transporter family (also called the sodium/glucose cotransporter family or solute sodium symporter family) that co-transports Na+ with sugars, amino acids, inorganic ions or vitamins. One member of the SLC5 family, human SGLT3, has been characterized as a glucose sensor and not a transporter. This subfamily includes the uncharacterized Escherichia coli YidK protein, and belongs to the solute carrier 5 (SLC5) transporter family.	472
271363	cd10329	SLC5sbd_SGLT1-like	Na(+)/glucose cotransporter SGLT1 and related proteins; solute binding domain. This subfamily includes the solute-binding domain of SGLT proteins that cotransport Na+ with various solutes. Its members include: the human glucose (SGLT1, -2, -4, -5 ), chiro-inositol (SGLT5), and myo-inositol (SMIT) cotransporters. It also includes human SGLT3 which has been characterized as a glucose sensor and not a transporter. It belongs to the solute carrier 5 (SLC5) transporter family.	538
271364	cd10332	SLC6sbd-B0AT-like	System B(0) neutral amino acid transporter AT1, 2 and 3, and related proteins; solute-binding domain. This subgroup includes the solute-binding domain of transmembrane transporters, which transport, i) neutral amino acids: NTT4 (also called XT1), SBAT1 (also called B0AT2, v7-3, NTT7-3), and B0AT1 (also called HND); the human genes encoding these are SLC6A17, SLC6A15, and SLC6A19 respectively, ii) glycine: B0AT3 (also called Xtrp2, XT2), iii) imino acids, such as proline, pipecolate, MeAIB, and sarcosine: SIT1 (also called XTRP3, XT3, IMINO). The human genes encoding B0AT3 and SIT1 are SLC6A18 and SLC6A20 respectively. Transporters in this subgroup may play a role in disorders including major depression, Hartnup disorder, increased susceptibility to myocardial infarction, and iminoglycinuria. This subgroup belongs to the solute carrier 6 (SLC6) transporter family.	531
271365	cd10333	LeuT-like_sbd	Aquifex aeolicus LeuT and related proteins; solute binding domain. LeuT is a bacterial amino acid transporter with specificity for the hydrophobic amino acids glycine, alanine, methionine, and leucine. This subgroup belongs to the solute carrier 6 (SLC6) transporter family; LeuT has been used as a structural template for understanding fundamental aspects of SLC6 function. It has an arrangement of 12 transmembrane helices (TMs), which appears to be a common motif for eukaryotic and some prokaryotic and archaeal SLC6s: an inverted topology repeat: TMs1-5 and TMs6-10, and TMs11-12.	496
271366	cd10334	SLC6sbd_u1	uncharacterized bacterial and archaeal solute carrier 6 subfamily; solute-binding domain. SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporter family or Na+/Cl--dependent transporter family) include neurotransmitter transporters (NTTs): these are sodium- and chloride-dependent plasma membrane transporters for the monoamine neurotransmitters serotonin (5-hydroxytryptamine), dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. These NTTs are widely expressed in the mammalian brain, involved in regulating neurotransmitter signaling and homeostasis, and the target of a range of therapeutic drugs for the treatment of psychiatric diseases. Bacterial members of the SLC6 family include the LeuT amino acid transporter.	480
271367	cd10336	SLC6sbd_Tyt1-Like	solute carrier 6 subfamily, Fusobacterium nucleatum Tyt1-like; solute-binding domain. SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporter family or Na+/Cl--dependent transporter family) include neurotransmitter transporters (NTTs): these are sodium- and chloride-dependent plasma membrane transporters for the monoamine neurotransmitters serotonin (5-hydroxytryptamine), dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. These NTTs are widely expressed in the mammalian brain, involved in regulating neurotransmitter signaling and homeostasis, and the target of a range of therapeutic drugs for the treatment of psychiatric diseases. Bacterial members of the SLC6 family include the LeuT amino acid transporter. An arrangement of 12 transmembrane (TM) helices appears to be as a common topological motif for eukaryotic and some prokaryotic and archaeal NTTs. However, this subfamily which contains the majority of bacterial members and some archaeal members, appears to contain only 11 TMs; for example the functional Fusobacterium nucleatum tyrosine transporter Tyt1.	440
198200	cd10337	SH2_BCAR3	Src homology 2 (SH2) domain in the Breast Cancer Anti-estrogen Resistance protein 3. BCAR3 is part of a growing family of guanine nucleotide exchange factors responsible for activation of Ras-family GTPases, including Sos1 and 2, GRF1 and 2, CalDAG-GEF/GRP1-4, C3G, cAMP-GEF/Epac 1 and 2, PDZ-GEFs, MR-GEF, RalGDS family members, RalGPS, RasGEF, Smg GDS, and phospholipase C(epsilon). 12102558  21262352  BCAR3 binds to the carboxy-terminus of BCAR1/p130Cas, a focal adhesion adapter protein. Overexpression of BCAR1 (p130Cas) and BCAR3 induces estrogen-independent growth in normally estrogen-dependent cell lines. They have been linked to resistance to anti-estrogens in breast cancer, Rac activation, and cell motility, though the BCAR3/p130Cas complex is not required for this activity in BCAR3. Many BCAR3-mediated signaling events in epithelial and mesenchymal cells are independent of p130Cas association. Structurally these proteins contain a single SH2 domain upstream of their RasGEF domain, which is responsible for the ability of BCAR3 to enhance p130Cas over-expression-induced migration. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	136
198201	cd10338	SH2_SHA	Src homology 2 (SH2) domain found in SH2 adaptor proteins A (SHA) Signal transducers. Signal transducing adaptor proteins are accessory to main proteins in a signal transduction pathway. These proteins lack intrinsic enzymatic activity, but mediate specific protein-protein interactions that drive the formation of protein complexes. Adaptor proteins usually contain several domains within their structure (e.g. SH2 and SH3 domains) which allow specific interactions with several other specific proteins. Not much is known about the SHA protein except that it is predicted to act as a transcription factor. Arabidopsis SHA pulled down a 120-kD tyrosine-phosphorylated protein in vitro. In addition to the SH2 domain there is a coiled-coil domain, a DNA binding domain, and a transactivation domain in the STAT proteins. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	106
198202	cd10339	SH2_RIN_family	Src homology 2 (SH2) domain found in Ras and Rab interactor (RIN)-family. The RIN (AKA Ras interaction/interference) family is composed of RIN1, RIN2 and RIN3. These proteins have multifunctional domains including SH2 and proline-rich (PR) domains in the N-terminal region, and RIN-family homology (RH), VPS9 and Ras-association (RA) domains in the C-terminal region. RIN proteins function as Rab5-GEFs, and RIN3 specifically functions as a Rab31-GEF. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	101
198203	cd10340	SH2_N-SH2_SHP_like	N-terminal Src homology 2 (N-SH2) domain found in SH2 domain Phosphatases (SHP) proteins. The SH2 domain phosphatases (SHP-1, SHP-2/Syp, Drosophila corkscrew (csw), and Caenorhabditis elegans Protein Tyrosine Phosphatase (Ptp-2)) are cytoplasmic signaling enzymes. They are both targeted and regulated by interactions of their SH2 domains with phosphotyrosine docking sites. These proteins contain two SH2 domains (N-SH2, C-SH2) followed by a tyrosine phosphatase (PTP) domain, and a C-terminal extension. Shp1 and Shp2 have two tyrosyl phosphorylation sites in their C-tails, which are phosphorylated differentially by receptor and nonreceptor PTKs. Csw retains the proximal tyrosine and Ptp-2 lacks both sites. Shp-binding proteins include receptors, scaffolding adapters, and inhibitory receptors. Some of these bind both Shp1 and Shp2 while others bind only one. Most proteins that bind a Shp SH2 domain contain one or more immuno-receptor tyrosine-based inhibitory motifs (ITIMs): [IVL]xpYxx[IVL]. Shp1 N-SH2 domain blocks the catalytic domain and keeps the enzyme in the inactive conformation, and is thus believed to regulate the phosphatase activity of SHP-1. Its C-SH2 domain is thought to be involved in searching for phosphotyrosine activators. The SHP2 N-SH2 domain is a conformational switch; it either binds and inhibits the phosphatase, or it binds phosphoproteins and activates the enzyme. The C-SH2 domain contributes binding energy and specificity, but it does not have a direct role in activation. Csw SH2 domain function is essential, but either SH2 domain can fulfill this requirement. The role of the csw SH2 domains during Sevenless receptor tyrosine kinase (SEV) signaling is to bind Daughter of Sevenless rather than activated SEV. Ptp-2 acts in oocytes downstream of sheath/oocyte gap junctions to promote major sperm protein (MSP)-induced MAP Kinase (MPK-1) phosphorylation. Ptp-2 functions in the oocyte cytoplasm, not at the cell surface to inhibit multiple RasGAPs, resulting in sustained Ras activation. It is thought that MSP triggers PTP-2/Ras activation and ROS production to stimulate MPK-1 activity essential for oocyte maturation and that secreted MSP domains and Cu/Zn superoxide dismutases function antagonistically to control ROS and MAPK signaling. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	99
199829	cd10341	SH2_N-SH2_PLC_gamma_like	N-terminal Src homology 2 (N-SH2) domain in Phospholipase C gamma. Phospholipase C gamma is a signaling molecule that is recruited to the C-terminal tail of the receptor upon autophosphorylation of a highly conserved tyrosine.  PLCgamma is composed of a Pleckstrin homology (PH) domain followed by an elongation factor (EF) domain, 2 catalytic regions of PLC domains that flank 2 tandem SH2 domains (N-SH2, C-SH2), and ending with a SH3 domain and C2 domain. N-SH2 SH2 domain-mediated interactions represent a crucial step in transmembrane signaling by receptor tyrosine kinases. SH2 domains recognize phosphotyrosine (pY) in the context of particular sequence motifs in receptor phosphorylation sites. Both N-SH2 and C-SH2 have a very similar binding affinity to pY. But in growth factor stimulated cells these domains bind to different target proteins. N-SH2 binds to pY containing sites in the C-terminal tails of tyrosine kinases and other receptors. Recently it has been shown that this interaction is mediated by phosphorylation-independent interactions between a secondary binding site found exclusively on the N-SH2 domain and a region of the FGFR1 tyrosine kinase domain. This secondary site on the SH2 cooperates with the canonical pY site to regulate selectivity in mediating a specific cellular process.  C-SH2 binds to an intramolecular site on PLCgamma itself which allows it to hydrolyze phosphatidylinositol-4,5-bisphosphate into diacylglycerol and inositol triphosphate. These then activate protein kinase C and release calcium. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	99
198205	cd10342	SH2_SAP1	Src homology 2 (SH2) domain found in SLAM-associated protein (SAP)1. The X-linked lymphoproliferative syndrome (XLP) gene encodes SAP (also called SH2D1A/DSHP) a protein that consists of a 5 residue N-terminus, a single SH2 domain, and a short 25 residue C-terminal tail.  XLP is characterized by an extreme sensitivity to Epstein-Barr virus.  Both T and natural killer (NK) cell dysfunctions have been seen in XLP patients. SAP binds the cytoplasmic tail of Signaling lymphocytic activation molecule (SLAM), 2B4, Ly-9, and CD84. SAP is believed to function as a signaling inhibitor, by blocking or regulating binding of other signaling proteins. SAP and the SAP-like protein EAT-2 recognize the sequence motif TIpYXX[VI], which is found in the cytoplasmic domains of a restricted number of T, B, and NK cell surface receptors and are proposed to be natural inhibitors or regulators of the physiological role of a small family of receptors on the surface of these cells.  In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	103
198206	cd10343	SH2_SHIP	Src homology 2 (SH2) domain found in SH2-containing inositol-5'-phosphatase (SHIP) and SLAM-associated protein (SAP). The SH2-containing inositol-5'-phosphatase, SHIP (also called SHIP1/SHIP1a), is a hematopoietic-restricted phosphatidylinositide phosphatase that translocates to the plasma membrane after extracellular stimulation and hydrolyzes the phosphatidylinositol-3-kinase (PI3K)-generated second messenger PI-3,4,5-P3 (PIP3) to PI-3,4-P2. As a result, SHIP dampens down PIP3 mediated signaling and represses the proliferation, differentiation, survival, activation, and migration of hematopoietic cells. PIP3 recruits lipid-binding pleckstrin homology(PH) domain-containing proteins to the inner wall of the plasma membrane and activates them. PH domain-containing downstream effectors include the survival/proliferation enhancing serine/threonine kinase, Akt (protein kinase B), the tyrosine kinase, Btk, the regulator of protein translation, S6K, and the Rac and cdc42 guanine nucleotide exchange factor, Vav. SHIP is believed to act as a tumor suppressor during leukemogenesis and lymphomagenesis, and may play a role in activating the immune system to combat cancer. SHIP contains an N-terminal SH2 domain, a centrally located phosphatase domain that specifically hydrolyzes the 5'-phosphate from PIP3, PI-4,5-P2 and inositol-1,3,4,5- tetrakisphosphate (IP4), a C2 domain, that is an allosteric activating site when bound by SHIP's enzymatic product, PI-3,4-P2; 2 NPXY motifs that bind proteins with a phosphotyrosine binding (Shc, Dok 1, Dok 2) or an SH2 (p85a, SHIP2) domain; and a proline-rich domain consisting of four PxxP motifs that bind a subset of SH3-containing proteins including Grb2, Src, Lyn, Hck, Abl, PLCg1, and PIAS1. The SH2 domain of SHIP binds to the tyrosine phosphorylated forms of Shc, SHP-2, Doks, Gabs, CD150, platelet-endothelial cell adhesion molecule, Cas, c-Cbl, immunoreceptor tyrosine-based inhibitory motifs (ITIMs), and immunoreceptor tyrosine-based activation motifs (ITAMs). The X-linked lymphoproliferative syndrome (XLP) gene encodes SAP (also called SH2D1A/DSHP) a protein that consists of a 5 residue N-terminus, a single SH2 domain, and a short 25 residue C-terminal tail. XLP is characterized by an extreme sensitivity to Epstein-Barr virus. Both T and natural killer (NK) cell dysfunctions have been seen in XLP patients. SAP binds the cytoplasmic tail of Signaling lymphocytic activation molecule (SLAM), 2B4, Ly-9, and CD84. SAP is believed to function as a signaling inhibitor, by blocking or regulating binding of other signaling proteins. SAP and the SAP-like protein EAT-2 recognize the sequence motif TIpYXX(V/I), which is found in the cytoplasmic domains of a restricted number of T, B, and NK cell surface receptors and are proposed to be natural inhibitors or regulators of the physiological role of a small family of receptors on the surface of these cells. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	103
198207	cd10344	SH2_SLAP	Src homology 2 domain found in Src-like adaptor proteins. SLAP belongs to the subfamily of adapter proteins that negatively regulate cellular signaling initiated by tyrosine kinases. It has a myristylated N-terminus, SH3 and SH2 domains with high homology to Src family tyrosine kinases, and a unique C-terminal tail, which is important for c-Cbl binding. SLAP negatively regulates platelet-derived growth factor (PDGF)-induced mitogenesis in fibroblasts and regulates F-actin assembly for dorsal ruffles formation. c-Cbl mediated SLAP inhibition towards actin remodeling. Moreover, SLAP enhanced PDGF-induced c-Cbl phosphorylation by SFK. In contrast, SLAP mitogenic inhibition was not mediated by c-Cbl, but it rather involved a competitive mechanism with SFK for PDGF-receptor (PDGFR) association and mitogenic signaling. Accordingly, phosphorylation of the Src mitogenic substrates Stat3 and Shc was reduced by SLAP. Thus, SLAP regulates PDGFR signaling by two independent mechanisms: a competitive mechanism for PDGF-induced Src mitogenic signaling and a non-competitive mechanism for dorsal ruffles formation mediated by c-Cbl. SLAP is a hematopoietic adaptor containing Src homology (SH)3 and SH2 motifs and a unique carboxy terminus. Unlike c-Src, SLAP lacks a tyrosine kinase domain. Unlike c-Src, SLAP does not impact resorptive function of mature osteoclasts but induces their early apoptosis. SLAP negatively regulates differentiation of osteoclasts and proliferation of their precursors. Conversely, SLAP decreases osteoclast death by inhibiting activation of caspase 3. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	104
198208	cd10345	SH2_C-SH2_Zap70_Syk_like	C-terminal Src homology 2 (SH2) domain found in Zeta-chain-associated protein kinase 70 (ZAP-70) and Spleen tyrosine kinase (Syk) proteins. ZAP-70 and Syk comprise a family of hematopoietic cell specific protein tyrosine kinases (PTKs) that are required for antigen and antibody receptor function. ZAP-70 is expressed in T and natural killer (NK) cells and Syk is expressed in B cells, mast cells, polymorphonuclear leukocytes, platelets, macrophages, and immature T cells. They are required for the proper development of T and B cells, immune receptors, and activating NK cells. They consist of two N-terminal Src homology 2 (SH2) domains and a C-terminal kinase domain separated from the SH2 domains by a linker or hinge region. Phosphorylation of both tyrosine residues within the Immunoreceptor Tyrosine-based Activation Motifs (ITAM; consensus sequence Yxx[LI]x(7,8)Yxx[LI]) by the Src-family PTKs is required for efficient interaction of ZAP-70 and Syk with the receptor subunits and for receptor function. ZAP-70 forms two phosphotyrosine binding pockets, one of which is shared by both SH2 domains. In Syk the two SH2 domains do not form such a phosphotyrosine-binding site. The SH2 domains here are believed to function independently. In addition, the two SH2 domains of Syk display flexibility in their relative orientation, allowing Syk to accommodate a greater variety of spacing sequences between the ITAM phosphotyrosines and singly phosphorylated non-classical ITAM ligands. This model contains the C-terminus SH2 domains of both Syk and Zap70. In general SH2 domains are involved in signal transduction.  They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	95
198209	cd10346	SH2_SH2B_family	Src homology 2 (SH2) domain found in SH2B adapter protein family. The SH2B adapter protein family  has 3 members:  SH2B1 (SH2-B, PSM), SH2B2 (APS), and SH2B3 (Lnk). SH2B family members contain a pleckstrin homology domain, at least one dimerization domain, and a C-terminal SH2 domain which binds to phosphorylated tyrosines in a variety of tyrosine kinases.  SH2B1 and SH2B2  function in signaling pathways found downstream of growth hormone receptor and receptor tyrosine kinases, including the insulin, insulin-like growth factor-I (IGF-I), platelet-derived growth factor (PDGF), nerve growth factor, hepatocyte growth factor, and fibroblast growth factor receptors. SH2B2beta, a new isoform of SH2B2, is an endogenous inhibitor of SH2B1 and/or SH2B2 (SH2B2alpha), negatively regulating insulin signaling and/or JAK2-mediated cellular responses. SH2B3 negatively regulates lymphopoiesis and early hematopoiesis. The lnk-deficiency results in enhanced production of B cells, and expansion as well as enhanced function of hematopoietic stem cells (HSCs), demonstrating negative regulatory functions of Sh2b3/Lnk in cytokine signaling. Sh2b3/Lnk also functions in responses controlled by cell adhesion and in crosstalk between integrin- and cytokine-mediated signaling. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	97
198210	cd10347	SH2_Nterm_shark_like	N-terminal Src homology 2 (SH2) domain found in SH2 domains, ANK, and kinase domain (shark) proteins. These non-receptor protein-tyrosine kinases contain two SH2 domains, five ankyrin (ANK)-like repeats, and a potential tyrosine phosphorylation site in the carboxyl-terminal tail which resembles the phosphorylation site in members of the src family. Like the mammalian non-receptor protein-tyrosine kinases ZAP-70 and Syk, they do not have SH3 domains. However, the presence of ANK makes these unique among protein-tyrosine kinases. Both tyrosine kinases and ANK repeats have been shown to transduce developmental signals, and SH2 domains are known to participate intimately in tyrosine kinase signaling. These tyrosine kinases are believed to be involved in epithelial cell polarity. The members of this family include the shark (SH2 domains, ANK, and kinase domain) gene in Drosophila and yellow fever mosquitos, as well as the hydra protein HTK16. Drosophila Shark is proposed to transduce intracellularly the intercellular signal of Crumbs, a protein necessary for proper organization of ectodermal epithelia. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	81
198211	cd10348	SH2_Cterm_shark_like	C-terminal Src homology 2 (SH2) domain found in SH2 domains, ANK, and kinase domain (shark) proteins. These non-receptor protein-tyrosine kinases contain two SH2 domains, five ankyrin (ANK)-like repeats, and a potential tyrosine phosphorylation site in their carboxyl-terminal tail which resembles the phosphorylation site in members of the src family. Like the mammalian non-receptor protein-tyrosine kinases ZAP-70 and Syk, they do not have SH3 domains. However, the presence of ANK makes these unique among protein-tyrosine kinases. Both tyrosine kinases and ANK repeats have been shown to transduce developmental signals, and SH2 domains are known to participate intimately in tyrosine kinase signaling. These tyrosine kinases are believed to be involved in epithelial cell polarity. The members of this family include the shark (SH2 domains, ANK, and kinase domain) gene in Drosophila and yellow fever mosquitos, as well as the hydra protein HTK16. Drosophila Shark is proposed to transduce intracellularly the intercellular signal of Crumbs, a protein necessary for proper organization of ectodermal epithelia. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	86
199830	cd10349	SH2_SH2D2A_SH2D7	Src homology 2 domain found in the SH2 domain containing protein 2A and 7 (SH2D2A and SH2D7). SH2D2A and SH2D7 both contain a single SH2 domain. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	77
198213	cd10350	SH2_SH2D4A	Src homology 2 domain found in the SH2 domain containing protein 4A (SH2D4A). SH2D4A contains a single SH2 domain. In general SH2 domains are involved in signal transduction.  They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	103
198214	cd10351	SH2_SH2D4B	Src homology 2 domain found in the SH2 domain containing protein 4B (SH2D4B). SH2D4B contains a single SH2 domain. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	103
198215	cd10352	SH2_a2chimerin_b2chimerin	Src homology 2 (SH2) domain found in alpha2-chimerin and beta2-chimerin proteins. Chimerins are a family of phorbol ester- and diacylglycerol-responsive GTPase-activating proteins. Alpha1-chimerin (formerly known as n-chimerin) and alpha2-chimerin are alternatively spliced products of a single gene, as are beta1- and beta2-chimerin. Alpha1- and beta1-chimerin have a relatively short N-terminal region that does not encode any recognizable domains, whereas alpha2- and beta2-chimerin both include a functional SH2 domain that can bind to phosphotyrosine motifs within receptors. All of the isoforms contain a GAP domain with specificity in vitro for Rac1 and a diacylglycerol (DAG)-binding C1 domain which allows them to translocate to membranes in response to DAG signaling and anchors them in close proximity to activated Rac. Other C1 domain-containing diacylglycerol receptors include PKC, Munc-13 proteins, phorbol ester binding scaffolding proteins involved in Ca2+-stimulated exocytosis, and RasGRPs, diacylglycerol-activated guanine-nucleotide exchange factors (GEFs) for Ras and Rap1. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	91
198216	cd10353	SH2_Nterm_RasGAP	N-terminal Src homology 2 (SH2) domain found in Ras GTPase-activating protein 1 (GAP). RasGAP is part of the GAP1 family of GTPase-activating proteins. The protein is located in the cytoplasm and stimulates the GTPase activity of normal RAS p21, but not its oncogenic counterpart. Acting as a suppressor of RAS function, the protein enhances the weak intrinsic GTPase activity of RAS proteins resulting in RAS inactivation, thereby allowing control of cellular proliferation and differentiation. Mutations leading to changes in the binding sites of either protein are associated with basal cell carcinomas. Alternative splicing results in two isoforms. The shorter isoform, which lacks the N-terminal hydrophobic region, has the same activity and is expressed in placental tissues. In general the longer isoform contains 2 SH2 domains, a SH3 domain, a pleckstrin homology (PH) domain, and a calcium-dependent phospholipid-binding C2 domain. The C-terminus contains the catalytic domain of RasGap which catalyzes the inactivation of Ras by hydrolyzing GTP-bound active Ras into an inactive GDP-bound form of Ras. This model contains the N-terminal SH2 domain. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	103
198217	cd10354	SH2_Cterm_RasGAP	C-terminal Src homology 2 (SH2) domain found in Ras GTPase-activating protein 1 (GAP). RasGAP is part of the GAP1 family of GTPase-activating proteins. The protein is located in the cytoplasm and stimulates the GTPase activity of normal RAS p21, but not its oncogenic counterpart. Acting as a suppressor of RAS function, the protein enhances the weak intrinsic GTPase activity of RAS proteins resulting in RAS inactivation, thereby allowing control of cellular proliferation and differentiation. Mutations leading to changes in the binding sites of either protein are associated with basal cell carcinomas. Alternative splicing results in two isoforms. The shorter isoform, which lacks the N-terminal hydrophobic region, has the same activity and is expressed in placental tissues. In general the longer isoform contains 2 SH2 domains, a SH3 domain, a pleckstrin homology (PH) domain, and a calcium-dependent phospholipid-binding C2 domain. The C-terminus contains the catalytic domain of RasGap which catalyzes the inactivation of Ras by hydrolyzing GTP-bound active Ras into an inactive GDP-bound form of Ras. This model contains the C-terminal SH2 domain. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	77
198218	cd10355	SH2_DAPP1_BAM32_like	Src homology 2 domain found in dual adaptor for phosphotyrosine and 3-phosphoinositides (DAPP1)/B lymphocyte adaptor molecule of 32 kDa (Bam32)-like proteins. DAPP1/Bam32 contains a putative myristoylation site at its N-terminus, followed by a SH2 domain, and a pleckstrin homology (PH) domain at its C-terminus. DAPP1 could potentially be recruited to the cell membrane by any of these domains. Its putative myristoylation site could facilitate the interaction of DAPP1 with the lipid bilayer. Its SH2 domain may also interact with phosphotyrosine residues on membrane-associated proteins such as activated tyrosine kinase receptors. And finally its PH domain exhibits a high-affinity interaction with the PtdIns(3,4,5)P(3) and PtdIns(3,4)P(2) second messengers produced at the cell membrane following the activation of PI 3-kinases. DAPP1 is thought to interact with both tyrosine phosphorylated proteins and 3-phosphoinositides and therefore may play a role in regulating the location and/or activity of such protein(s) in response to agonists that elevate PtdIns(3,4,5)P(3) and PtdIns(3,4)P(2). This protein is likely to play an important role in triggering signal transduction pathways that lie downstream from receptor tyrosine kinases and PI 3-kinase. It is likely that DAPP1 functions as an adaptor to recruit other proteins to the plasma membrane in response to extracellular signals. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	92
198219	cd10356	SH2_ShkA_ShkC	Src homology 2 (SH2) domain found in SH2 domain-bearing protein kinases A and C (ShkA and ShkC). SH2-bearing genes cloned from Dictyostelium include two transcription factors, STATa and STATc, and a signaling factor, SHK1 (shkA). A database search of the Dictyostelium discoideum genome revealed two additional putative STAT sequences, dd-STATb and dd-STATd, and four additional putative SHK genes, dd-SHK2 (shkB), dd-SHK3 (shkC), dd-SHK4 (shkD), and dd-SHK5 (shkE). This model contains members of shkA and shkC.  All of the SHK members are most closely related to the protein kinases found in plants.  However these kinases in plants are not conjugated to any SH2 or SH2-like sequences. Alignment data indicates that the SHK SH2 domains carry some features of the STAT SH2 domains in Dictyostelium. When STATc's linker domain was used for a BLAST search, the sequence between the protein kinase domain and the SH2 domain (the linker) of SHK was recovered, suggesting a close relationship among these molecules within this region. SHK's linker domain is predicted to contain an alpha-helix which is indeed homologous to that of STAT. Based on the phylogenetic alignment, SH2 domains can be grouped into two categories, STAT-type and Src-type. SHK family members are in between, but are closer to the STAT-type which indicates a close relationship between SHK and STAT families in their SH2 domains and further supports the notion that SHKs linker-SH2 domain evolved from STAT or STATL (STAT-like Linker-SH2) domain found in plants. In SHK, STAT, and SPT6, the linker-SH2 domains all reside exclusively in the C-terminal regions. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	113
198220	cd10357	SH2_ShkD_ShkE	Src homology 2 (SH2) domain found in SH2 domain-bearing protein kinases D and E (ShkD and ShkE). SH2-bearing genes cloned from Dictyostelium include two transcription factors, STATa and STATc, and a signaling factor, SHK1 (shkA). A database search of the Dictyostelium discoideum genome revealed two additional putative STAT sequences, dd-STATb and dd-STATd, and four additional putative SHK genes, dd-SHK2 (shkB), dd-SHK3 (shkC), dd-SHK4 (shkD), and dd-SHK5 (shkE). This model contains members of shkD and shkE. All of the SHK members are most closely related to the protein kinases found in plants.  However these kinases in plants are not conjugated to any SH2 or SH2-like sequences. Alignment data indicates that the SHK SH2 domains carry some features of the STAT SH2 domains in Dictyostelium. When STATc's linker domain was used for a BLAST search, the sequence between the protein kinase domain and the SH2 domain (the linker) of SHK was recovered, suggesting a close relationship among these molecules within this region. SHK's linker domain is predicted to contain an alpha-helix which is indeed homologous to that of STAT. Based on the phylogenetic alignment, SH2 domains can be grouped into two categories, STAT-type and Src-type. SHK family members are in between, but are closer to the STAT-type which indicates a close relationship between SHK and STAT families in their SH2 domains and further supports the notion that SHKs linker-SH2 domain evolved from STAT or STATL (STAT-like Linker-SH2) domain found in plants. In SHK, STAT, and SPT6, the linker-SH2 domains all reside exclusively in the C-terminal regions.  In general SH2 domains are involved in signal transduction.  They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	87
198221	cd10358	SH2_PTK6_Brk	Src homology 2 domain found in protein-tyrosine kinase-6 (PTK6) which is also known as breast tumor kinase (Brk). Human protein-tyrosine kinase-6 (PTK6, also known as breast tumor kinase (Brk)) is a member of the non-receptor protein-tyrosine kinase family and is expressed in two-thirds of all breast tumors. PTK6 contains a SH3 domain, a SH2 domain, and catalytic domains. For the case of the non-receptor protein-tyrosine kinases, the SH2 domain is typically involved in negative regulation of kinase activity by binding to a phosphorylated tyrosine residue near the C terminus. The C-terminal sequence of PTK6 (PTSpYENPT where pY is phosphotyrosine) is thought to be a self-ligand for the SH2 domain. The structure of the SH2 domain resembles other SH2 domains except for a centrally located four-stranded antiparallel beta-sheet (strands betaA, betaB, betaC, and betaD). There are also differences in the loop length which might be responsible for PTK6 ligand specificity. There are two possible means of regulation of PTK6: autoinhibitory with the phosphorylation of Tyr playing a role in its negative regulation and autophosphorylation at this site, though it has been shown that PTK6 might phosphorylate signal transduction-associated proteins Sam68 and signal transducing adaptor family member 2 (STAP/BKS) in vivo. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	100
198222	cd10359	SH2_SH3BP2	Src homology 2 domain found in c-Abl SH3 domain-binding protein-2 (SH3BP2). The adaptor protein 3BP2/SH3BP2 plays a regulatory role in signaling from immunoreceptors. The protein-tyrosine kinase Syk phosphorylates 3BP2 which results in the activation of Rac1 through the interaction with the SH2 domain of Vav1 and induces the binding to the SH2 domain of the upstream protein-tyrosine kinase Lyn and enhances its kinase activity. 3BP2 has a positive regulatory role in IgE-mediated mast cell activation. In lymphocytes, engagement of T cell or B cell receptors triggers tyrosine phosphorylation of 3BP2. Suppression of the 3BP2 expression by siRNA results in the inhibition of T cell or B cell receptor-mediated activation of NFAT. 3BP2 is required for the proliferation of B cells and B cell receptor signaling. Mutations in the 3BP2 gene are responsible for cherubism resulting in excessive bone resorption in the jaw.  In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	101
198223	cd10360	SH2_Srm	Src homology 2 (SH2) domain found in Src-related kinase lacking C-terminal regulatory tyrosine and N-terminal myristoylation sites (srm). Srm is a nonreceptor protein kinase that has two SH2 domains, a SH3 domain, and a kinase domain with a tyrosine residue for autophosphorylation. However, it lacks an N-terminal glycine for myristoylation and a C-terminal tyrosine which suppresses kinase activity when phosphorylated. Srm is most similar to members of the Tec family, whose other members include Tec, Btk/Emb, and Itk/Tsk/Emt. However, Srm differs in its N-terminal unique domain, which is much smaller than in the Tec family and is closer to that of Src. Srm is thought to be a new family of nonreceptor tyrosine kinases that may be redundant in function. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	79
198224	cd10361	SH2_Fps_family	Src homology 2 (SH2) domain found in feline sarcoma, Fujinami poultry sarcoma, and fes-related (Fes/Fps/Fer) proteins. The Fps family consists of members Fps/Fes and Fer/Flk/Tyk3. They are cytoplasmic protein-tyrosine kinases implicated in signaling downstream from cytokines, growth factors and immune receptors.  Fes/Fps/Fer contains three coiled-coil regions, an SH2 (Src-homology-2) and a TK (tyrosine kinase catalytic) domain signature. Members here include: Fps/Fes, Fer, Kin-31, and  In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	90
198225	cd10362	SH2_Src_Lck	Src homology 2 (SH2) domain in lymphocyte cell kinase (Lck). Lck is a member of the Src non-receptor type tyrosine kinase family of proteins. It is expressed in the brain, T-cells, and NK cells. The unique domain of Lck mediates its interaction with two T-cell surface molecules, CD4 and CD8. It associates with their cytoplasmic tails on CD4 T helper cells and CD8 cytotoxic T cells to assist signaling from the T cell receptor (TCR) complex. When the T cell receptor is engaged by the specific antigen presented by MHC, Lck phosphorylates the intracellular chains of the CD3 and zeta-chains of the TCR complex, allowing ZAP-70 to bind them. Lck then phosphorylates and activates ZAP-70, which in turn phosphorylates Linker of Activated T cells (LAT), a transmembrane protein that serves as a docking site for proteins including: Shc-Grb2-SOS, PI3K, and phospholipase C (PLC). The tyrosine phosphorylation cascade culminates in the intracellular mobilization of calcium ions and activation of important signaling cascades within the lymphocyte, including the Ras-MEK-ERK pathway, which goes on to activate certain transcription factors such as NFAT, NF-kappaB, and AP-1. These transcription factors regulate the production of cytokines such as Interleukin-2 that promote long-term proliferation and differentiation of the activated lymphocytes. The N-terminal tail of Lck is myristoylated and palmitoylated and it tethers the protein to the plasma membrane of the cell. Lck also contains a SH3 domain, a SH2 domain, and a C-terminal tyrosine kinase domain. Lck has 2 phosphorylation sites, the first an autophosphorylation site that is linked to activation of the protein and the second which is phosphorylated by Csk, which inhibits it. Lck is also inhibited by SHP-1 dephosphorylation and by Cbl ubiquitin ligase, which is part of the ubiquitin-mediated pathway. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	101
198226	cd10363	SH2_Src_HCK	Src homology 2 (SH2) domain found in HCK. HCK is a member of the Src non-receptor type tyrosine kinase family of proteins and is expressed in hemopoietic cells. HCK is proposed to couple the Fc receptor to the activation of the respiratory burst. It may also play a role in neutrophil migration and in the degranulation of neutrophils. It has two different translational starts that have different subcellular localization. HCK has been shown to interact with BCR gene, ELMO1, Cbl gene, RAS p21 protein activator 1, RASA3, Granulocyte colony-stimulating factor receptor, ADAM15 and RAPGEF1. Like the other members of the Src family, the SH2 domain, in addition to binding the target, also plays an autoinhibitory role by binding to its C-terminal tail. HCK has a unique N-terminal domain, an SH3 domain, an SH2 domain, a kinase domain and a regulatory tail, as do the other members of the family. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	104
198227	cd10364	SH2_Src_Lyn	Src homology 2 (SH2) domain found in Lyn. Lyn is a member of the Src non-receptor type tyrosine kinase family of proteins and is expressed in the hematopoietic cells, in neural tissues, liver, and adipose tissue. There are two alternatively spliced forms of Lyn. Lyn plays an inhibitory role in myeloid lineage proliferation. Following engagement of the B cell receptors, Lyn undergoes rapid phosphorylation and activation, triggering a cascade of signaling events mediated by Lyn phosphorylation of tyrosine residues within the immunoreceptor tyrosine-based activation motifs (ITAM) of the receptor proteins, and subsequent recruitment and activation of other kinases including Syk, phospholipase C gamma 2 (PLCgamma2) and phosphatidyl inositol-3 kinase. These kinases play critical roles in proliferation, Ca2+ mobilization and cell differentiation. Lyn plays an essential role in the transmission of inhibitory signals through phosphorylation of tyrosine residues within the immunoreceptor tyrosine-based inhibitory motifs (ITIM) in regulatory proteins such as CD22, PIR-B and Fc gamma RIIb1. Their ITIM phosphorylation subsequently leads to recruitment and activation of phosphatases such as SHIP-1 and SHP-1 which further down modulate signaling pathways, attenuate cell activation and can mediate tolerance. Lyn also plays a role in the insulin signaling pathway. Activated Lyn phosphorylates insulin receptor substrate 1 (IRS1) leading to an increase in translocation of Glut-4 to the cell membrane and increased glucose utilization. It is the primary Src family member involved in signaling downstream of the B cell receptor. Lyn plays an unusual, 2-fold role in B cell receptor signaling; it is essential for initiation of signaling but is also later involved in negative regulation of the signal. Lyn has a unique N-terminal domain, an SH3 domain, an SH2 domain, a kinase domain and a regulatory tail, as do the other members of the family. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	101
198228	cd10365	SH2_Src_Src	Src homology 2 (SH2) domain found in tyrosine kinase sarcoma (Src). Src is a member of the Src non-receptor type tyrosine kinase family of proteins. Src is thought to play a role in the regulation of embryonic development and cell growth. Members here include v-Src and c-Src. v-Src lacks the C-terminal inhibitory phosphorylation site and is therefore constitutively active as opposed to normal cellular src (c-Src) which is only activated under certain circumstances where it is required (e.g. growth factor signaling). v-Src is an oncogene whereas c-Src is a proto-oncogene. c-Src consists of three domains, an N-terminal SH3 domain, a central SH2 domain and a tyrosine kinase domain. The SH2 and SH3 domains work together in the auto-inhibition of the kinase domain. The phosphorylation of an inhibitory tyrosine near the C-terminus of the protein produces a binding site for the SH2 domain which then facilitates binding of the SH3 domain to a polyproline site within the linker between the SH2 domain and the kinase domain. Binding of the SH3 domain inactivates the enzyme. This allows for multiple mechanisms for c-Src activation: dephosphorylation of the C-terminal tyrosine by a protein tyrosine phosphatase, binding of the SH2 domain by a competitive phospho-tyrosine residue, or competitive binding of a polyproline binding site to the SH3 domain. Unlike most other Src members Src lacks cysteine residues in the SH4 domain that undergo palmitylation. Serine and threonine phosphorylation sites have also been identified in the unique domains of Src and are believed to modulate protein-protein interactions or regulate catalytic activity. Alternatively spliced forms of Src, which contain 6- or 11-amino acid insertions in the SH3 domain, are expressed in CNS neurons. c-Src has a unique N-terminal domain, an SH3 domain, an SH2 domain, a kinase domain and a regulatory tail, as do the other members of the family. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	101
198229	cd10366	SH2_Src_Yes	Src homology 2 (SH2) domain found in Yes. Yes is a member of the Src non-receptor type tyrosine kinase family of proteins. Yes is the cellular homolog of the Yamaguchi sarcoma virus oncogene. In humans it is encoded by the YES1 gene which maps to chromosome 18 and is in close proximity to thymidylate synthase. A corresponding Yes pseudogene has been found on chromosome 22. YES1 has been shown to interact with Janus kinase 2, CTNND1, RPL10, and Occludin. Yes1 has a unique N-terminal domain, an SH3 domain, an SH2 domain, a kinase domain and a regulatory tail, as do the other members of the family. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	101
198230	cd10367	SH2_Src_Fgr	Src homology 2 (SH2) domain found in Gardner-Rasheed feline sarcoma viral (v-fgr) oncogene homolog, Fgr. Fgr is a member of the Src non-receptor type tyrosine kinase family of proteins. The protein contains N-terminal sites for myristoylation and palmitoylation, a PTK domain, and SH2 and SH3 domains which are involved in mediating protein-protein interactions with phosphotyrosine-containing and proline-rich motifs, respectively. Fgr is expressed in B-cells and myeloid cells, localizes to plasma membrane ruffles, and functions as a negative regulator of cell migration and adhesion triggered by the beta-2 integrin signal transduction pathway. Multiple alternatively spliced variants, encoding the same protein, have been identified. Fgr has been shown to interact with Wiskott-Aldrich syndrome protein. Fgr has a unique N-terminal domain, an SH3 domain, an SH2 domain, a kinase domain and a regulatory tail, as do the other members of the family. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	101
198231	cd10368	SH2_Src_Fyn	Src homology 2 (SH2) domain found in Fyn. Fyn is a member of the Src non-receptor type tyrosine kinase family of proteins. Fyn is involved in the control of cell growth and is required in the following pathways: T and B cell receptor signaling, integrin-mediated signaling, growth factor and cytokine receptor signaling, platelet activation, ion channel function, cell adhesion, axon guidance, fertilization, entry into mitosis, and differentiation of natural killer cells, oligodendrocytes and keratinocytes. The protein associates with the p85 subunit of phosphatidylinositol 3-kinase and interacts with the Fyn-binding protein. Alternatively spliced transcript variants encoding distinct isoforms exist. Fyn is primarily localized to the cytoplasmic leaflet of the plasma membrane. Tyrosine phosphorylation of target proteins by Fyn serves to either regulate target protein activity, and/or to generate a binding site on the target protein that recruits other signaling molecules. FYN has been shown to interact with a number of proteins including: BCAR1, Cbl, Janus kinase, nephrin, Sky, tyrosine kinase, Wiskott-Aldrich syndrome protein, and Zap-70. Fyn has a unique N-terminal domain, an SH3 domain, an SH2 domain, a kinase domain and a regulatory tail, as do the other members of the family. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	101
199831	cd10369	SH2_Src_Frk	Src homology 2 (SH2) domain found in the Fyn-related kinase (Frk). Frk is a member of the Src non-receptor type tyrosine kinase family of proteins. The Frk subfamily is composed of Frk/Rak and Iyk/Bsk/Gst. It is expressed primarily in epithelial cells. Frk is a nuclear protein and may function during G1 and S phase of the cell cycle and suppress growth. Unlike the other Src members it lacks a glycine at position 2 of SH4 which is important for addition of a myristic acid moiety that is involved in targeting Src PTKs to cellular membranes. FRK and SHB exert similar effects when overexpressed in rat phaeochromocytoma (PC12) and beta-cells, where both induce PC12 cell differentiation and beta-cell proliferation. Under conditions that cause beta-cell degeneration these proteins augment beta-cell apoptosis. The FRK-SHB responses involve FAK and insulin receptor substrates (IRS) -1 and -2. Frk has been demonstrated to interact with retinoblastoma protein. Frk regulates PTEN protein stability by phosphorylating PTEN, which in turn prevents PTEN degradation. Frk also plays a role in regulation of embryonal pancreatic beta cell formation. Frk has a unique N-terminal domain, an SH3 domain, an SH2 domain, a kinase domain and a regulatory tail, as do the other members of the family. Like the other members of the Src family, the SH2 domain, in addition to binding the target, also plays an autoinhibitory role by binding to its activation loop. The tyrosine involved is at the same site as the tyrosine involved in the autophosphorylation of Src. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	96
198233	cd10370	SH2_Src_Src42	Src homology 2 (SH2) domain found in the Src oncogene at 42A (Src42). Src42 is a member of the Src non-receptor type tyrosine kinase family of proteins. The integration of receptor tyrosine kinase-induced RAS and Src42 signals by Connector eNhancer of KSR (CNK) as a two-component input is essential for RAF activation in Drosophila. Src42 is present in a wide variety of organisms including: California sea hare, pea aphid, yellow fever mosquito, honey bee, Panamanian leafcutter ant, and sea urchin. Src42 has a unique N-terminal domain, an SH3 domain, an SH2 domain, a kinase domain and a regulatory tail, as do the other members of the family. Like the other members of the Src family the SH2 domain in addition to binding the target, also plays an autoinhibitory role by binding to its C-terminal tail.  In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	96
198234	cd10371	SH2_Src_Blk	Src homology 2 (SH2) domain found in B lymphoid kinase (Blk). Blk is a member of the Src non-receptor type tyrosine kinase family of proteins. Blk is expressed in the B-cells. Unlike most other Src members Blk lacks cysteine residues in the SH4 domain that undergo palmitylation. Blk is required for the development of IL-17-producing gamma-delta T cells. Furthermore, Blk is expressed in lymphoid precursors and, in this capacity, plays a role in regulating thymus cellularity during ontogeny. Blk has a unique N-terminal domain, an SH3 domain, an SH2 domain, a kinase domain and a regulatory tail, as do the other members of the family. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	100
198235	cd10372	SH2_STAT1	Src homology 2 (SH2) domain found in signal transducer and activator of transcription (STAT) 1 proteins. STAT1 is a member of the STAT family of transcription factors. STAT1 is involved in upregulating genes due to a signal by interferons. STAT1 forms homodimers or heterodimers with STAT3 that bind to the Interferon-Gamma Activated Sequence (GAS) promoter element in response to IFN-gamma stimulation. STAT1 forms a heterodimer with STAT2 that can bind Interferon Stimulated Response Element (ISRE) promoter element in response to either IFN-alpha or IFN-beta stimulation. Binding in both cases leads to an increased expression of ISG (Interferon Stimulated Genes). STAT1 has been shown to interact with protein kinase R, Src, IRF1, STAT3, MCM5, STAT2, CD117, Fanconi anemia, complementation group C, CREB-binding protein, Interleukin 27 receptor, alpha subunit, PIAS1, BRCA1, Epidermal growth factor receptor, PTK2, Mammalian target of rapamycin, IFNAR2, PRKCD, TRADD, C-jun, Calcitriol receptor, ISGF3G, and GNB2L1. STAT proteins mediate the signaling of cytokines and a number of growth factors from the receptors of these extracellular signaling molecules to the cell nucleus. STATs are specifically phosphorylated by receptor-associated Janus kinases, receptor tyrosine kinases, or cytoplasmic tyrosine kinases. The phosphorylated STAT molecules dimerize by reciprocal binding of their SH2 domains to the phosphotyrosine residues. These dimeric STATs translocate into the nucleus, bind to specific DNA sequences, and regulate the transcription of their target genes. However there are a number of unphosphorylated STATs that travel between the cytoplasm and nucleus and some STATs that exist as dimers in unstimulated cells that can exert biological functions independent of being activated. There are seven mammalian STAT family members which have been identified: STAT1, STAT2, STAT3, STAT4, STAT5 (STAT5A and STAT5B), and STAT6. There are 6 conserved domains in STAT: N-terminal domain (NTD), coiled-coil domain (CCD), DNA-binding domain (DBD), alpha-helical linker domain (LD), SH2 domain, and transactivation domain (TAD). NTD is involved in dimerization of unphosphorylated STAT monomers and for the tetramerization between STAT1, STAT3, STAT4 and STAT5 on promoters with two or more tandem STAT binding sites. It also plays a role in promoting interactions with transcriptional co-activators such as CREB binding protein (CBP)/p300, as well as being important for nuclear import and deactivation of STATs involving tyrosine de-phosphorylation. CCD interacts with other proteins, such as IFN regulatory protein 9 (IRF-9/p48) with STAT1 and c-JUN with STAT3 and is also thought to participate in the negative regulation of these proteins. Distinct genes are bound to STATs via their DBD domain. This domain is also involved in nuclear translocation of activated STAT1 and STAT3 phosphorylated dimers upon cytokine stimulation. LD links the DNA-binding and SH2 domains and is important for the transcriptional activation of STAT1 in response to IFN-gamma. It also plays a role in protein-protein interactions and has also been implicated in the constitutive nucleocytoplasmic shuttling of unphosphorylated STATs in resting cells. The SH2 domain is necessary for receptor association and tyrosine phosphodimer formation. Residues within this domain may be particularly important for some cellular functions mediated by the STATs as well as residues adjacent to this domain. The TAD interacts with several proteins, namely minichromosome maintenance complex component 5 (MCM5), breast cancer 1 (BRCA1) and CBP/p300. TAD also contains a modulatory phosphorylation site that regulates STAT activity and is necessary for maximal transcription of a number of target genes. The conserved tyrosine residue present in the C-terminus is crucial for dimerization via interaction with the SH2 domain upon the interaction of the ligand with the receptor. STAT activation by tyrosine phosphorylation also determines nuclear import and retention, DNA binding to specific DNA elements in the promoters of responsive genes, and transcriptional activation of STAT dimers. In addition to the SH2 domain there is a coiled-coil domain, a DNA binding domain, and a transactivation domain in the STAT proteins. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	151
198236	cd10373	SH2_STAT2	Src homology 2 (SH2) domain found in signal transducer and activator of transcription (STAT) 2 proteins. STAT2 is a member of the STAT protein family. In response to interferon, STAT2 forms a complex with STAT1 and IFN regulatory factor family protein p48 (ISGF3G), in which this protein acts as a transactivator, but lacks the ability to bind DNA directly. Transcription adaptor P300/CBP (EP300/CREBBP) has been shown to interact specifically with STAT2, which is thought to be involved in the process of blocking IFN-alpha response by adenovirus. STAT2 has been shown to interact with MED14, CREB-binding protein, SMARCA4, STAT1, IFNAR2, IFNAR1, and ISGF3G. STAT proteins mediate the signaling of cytokines and a number of growth factors from the receptors of these extracellular signaling molecules to the cell nucleus. STATs are specifically phosphorylated by receptor-associated Janus kinases, receptor tyrosine kinases, or cytoplasmic tyrosine kinases. The phosphorylated STAT molecules dimerize by reciprocal binding of their SH2 domains to the phosphotyrosine residues. These dimeric STATs translocate into the nucleus, bind to specific DNA sequences, and regulate the transcription of their target genes. However there are a number of unphosphorylated STATs that travel between the cytoplasm and nucleus and some STATs that exist as dimers in unstimulated cells that can exert biological functions independent of being activated. There are seven mammalian STAT family members which have been identified: STAT1, STAT2, STAT3, STAT4, STAT5 (STAT5A and STAT5B), and STAT6. There are 6 conserved domains in STAT: N-terminal domain (NTD), coiled-coil domain (CCD), DNA-binding domain (DBD), alpha-helical linker domain (LD), SH2 domain, and transactivation domain (TAD). NTD is involved in dimerization of unphosphorylated STAT monomers and for the tetramerization between STAT1, STAT3, STAT4 and STAT5 on promoters with two or more tandem STAT binding sites. It also plays a role in promoting interactions with transcriptional co-activators such as CREB binding protein (CBP)/p300, as well as being important for nuclear import and deactivation of STATs involving tyrosine de-phosphorylation. CCD interacts with other proteins, such as IFN regulatory protein 9 (IRF-9/p48) with STAT1 and c-JUN with STAT3 and is also thought to participate in the negative regulation of these proteins. Distinct genes are bound to STATs via their DBD domain. This domain is also involved in nuclear translocation of activated STAT1 and STAT3 phosphorylated dimers upon cytokine stimulation. LD links the DNA-binding and SH2 domains and is important for the transcriptional activation of STAT1 in response to IFN-gamma. It also plays a role in protein-protein interactions and has also been implicated in the constitutive nucleocytoplasmic shuttling of unphosphorylated STATs in resting cells. The SH2 domain is necessary for receptor association and tyrosine phosphodimer formation. Residues within this domain may be particularly important for some cellular functions mediated by the STATs as well as residues adjacent to this domain. The TAD interacts with several proteins, namely minichromosome maintenance complex component 5 (MCM5), breast cancer 1 (BRCA1) and CBP/p300. TAD also contains a modulatory phosphorylation site that regulates STAT activity and is necessary for maximal transcription of a number of target genes. The conserved tyrosine residue present in the C-terminus is crucial for dimerization via interaction with the SH2 domain upon the interaction of the ligand with the receptor. STAT activation by tyrosine phosphorylation also determines nuclear import and retention, DNA binding to specific DNA elements in the promoters of responsive genes, and transcriptional activation of STAT dimers. In addition to the SH2 domain there is a coiled-coil domain, a DNA binding domain, and a transactivation domain in the STAT proteins. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	151
198237	cd10374	SH2_STAT3	Src homology 2 (SH2) domain found in signal transducer and activator of transcription (STAT) 3 proteins. STAT3 encoded by this gene is a member of the STAT protein family. STAT3 mediates the expression of a variety of genes in response to cell stimuli, and plays a key role in many cellular processes such as cell growth and apoptosis. The small GTPase Rac1 regulates the activity of STAT3 and PIAS3 inhibits it. Three alternatively spliced transcript variants encoding distinct isoforms have been described. STAT3 activation is required for self-renewal of embryonic stem cells (ESCs) and is essential for the differentiation of the TH17 helper T cells. Mutations in the STAT3 gene result in Hyperimmunoglobulin E syndrome and human cancers. STAT3 has been shown to interact with Androgen receptor, C-jun, ELP2, EP300, Epidermal growth factor receptor, Glucocorticoid receptor, HIF1A, Janus kinase 1, KHDRBS1, Mammalian target of rapamycin, MyoD, NDUFA13, NFKB1, Nuclear receptor coactivator 1, Promyelocytic leukemia protein, RAC1, RELA, RET proto-oncogene, RPA2, Src, STAT1, and TRIP10. STAT proteins mediate the signaling of cytokines and a number of growth factors from the receptors of these extracellular signaling molecules to the cell nucleus. STATs are specifically phosphorylated by receptor-associated Janus kinases, receptor tyrosine kinases, or cytoplasmic tyrosine kinases. The phosphorylated STAT molecules dimerize by reciprocal binding of their SH2 domains to the phosphotyrosine residues. These dimeric STATs translocate into the nucleus, bind to specific DNA sequences, and regulate the transcription of their target genes. However there are a number of unphosphorylated STATs that travel between the cytoplasm and nucleus and some STATs that exist as dimers in unstimulated cells that can exert biological functions independent of being activated. There are seven mammalian STAT family members which have been identified: STAT1, STAT2, STAT3, STAT4, STAT5 (STAT5A and STAT5B), and STAT6. There are 6 conserved domains in STAT: N-terminal domain (NTD), coiled-coil domain (CCD), DNA-binding domain (DBD), alpha-helical linker domain (LD), SH2 domain, and transactivation domain (TAD). NTD is involved in dimerization of unphosphorylated STAT monomers and for the tetramerization between STAT1, STAT3, STAT4 and STAT5 on promoters with two or more tandem STAT binding sites. It also plays a role in promoting interactions with transcriptional co-activators such as CREB binding protein (CBP)/p300, as well as being important for nuclear import and deactivation of STATs involving tyrosine de-phosphorylation. CCD interacts with other proteins, such as IFN regulatory protein 9 (IRF-9/p48) with STAT1 and c-JUN with STAT3 and is also thought to participate in the negative regulation of these proteins. Distinct genes are bound to STATs via their DBD domain. This domain is also involved in nuclear translocation of activated STAT1 and STAT3 phosphorylated dimers upon cytokine stimulation. LD links the DNA-binding and SH2 domains and is important for the transcriptional activation of STAT1 in response to IFN-gamma. It also plays a role in protein-protein interactions and has also been implicated in the constitutive nucleocytoplasmic shuttling of unphosphorylated STATs in resting cells. The SH2 domain is necessary for receptor association and tyrosine phosphodimer formation. Residues within this domain may be particularly important for some cellular functions mediated by the STATs as well as residues adjacent to this domain. The TAD interacts with several proteins, namely minichromosome maintenance complex component 5 (MCM5), breast cancer 1 (BRCA1) and CBP/p300. TAD also contains a modulatory phosphorylation site that regulates STAT activity and is necessary for maximal transcription of a number of target genes. The conserved tyrosine residue present in the C-terminus is crucial for dimerization via interaction with the SH2 domain upon the interaction of the ligand with the receptor. STAT activation by tyrosine phosphorylation also determines nuclear import and retention, DNA binding to specific DNA elements in the promoters of responsive genes, and transcriptional activation of STAT dimers. In addition to the SH2 domain there is a coiled-coil domain, a DNA binding domain, and a transactivation domain in the STAT proteins. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	162
198238	cd10375	SH2_STAT4	Src homology 2 (SH2) domain found in signal transducer and activator of transcription (STAT) 4 proteins. STAT4 mediates signals from the IL-12 receptors. STAT4 is mainly phosphorylated by IL-12-mediated signaling pathway in T cells. STAT4 expression is restricted in myeloid cells, thymus and testis. IL-12 is the major cytokine that can activate STAT4, resulting in its tyrosine phosphorylation. The IL-12 receptor has two chains, termed IL-12R beta1 and IL-12R beta2, and ligand binding results in heterodimer formation and activation of the receptor associated JAK kinases, Jak2 and Tyk2. Phosphorylated STAT4 homo-dimerizes via its SH2 domain, and translocates into nucleus where it can recognize traditional N3 STAT target sequences in IL-12 responsive genes. STAT4 can also be phosphorylated in response to IFN-gamma stimulation through activation of Jak1 and Tyk2  in human. IL-17 can also activate STAT4 in human monocytic leukemia cell lines and IL-2 can induce Jak2 and Stat4 activation in NK cells but not in T cells. T helper 1 (Th1) cells produce IL-2 and IFNgamma, whereas Th2 cells secrete IL-4, IL-5, IL-6 and IL-13. Th1 cells are responsible for cell-mediated/inflammatory immunity and can enhance defenses against infectious agents and cancer, while Th2 cells are essential for humoral immunity and the clearance of parasitic antigens. The most potent factors that can promote Th1 and Th2 differentiation are the cytokines IL-12 and IL-4 respectively. Although STAT4 is expressed both in Th1 and Th2 cells, STAT4 can only be phosphorylated by IL-12 which suggests that STAT4 plays an important role in Th1 cell function or development. STAT4 activation leads to Th1 differentiation, including the target genes of STAT4 such as ERM, a transcription factor that belongs to the Ets family of transcription factors. The expression of ERM is specifically induced by IL-12 in wild-type Th1 cells, but not in STAT4-deficient T cells. 
STAT proteins mediate the signaling of cytokines and a number of growth factors from the receptors of these extracellular signaling molecules to the cell nucleus. STATs are specifically phosphorylated by receptor-associated Janus kinases, receptor tyrosine kinases, or cytoplasmic tyrosine kinases. The phosphorylated STAT molecules dimerize by reciprocal binding of their SH2 domains to the phosphotyrosine residues. These dimeric STATs translocate into the nucleus, bind to specific DNA sequences, and regulate the transcription of their target genes.  However there are a number of unphosphorylated STATs that travel between the cytoplasm and nucleus and some STATs that exist as dimers in unstimulated cells that can exert biological functions independent of being activated. There are seven mammalian STAT family members which have been identified: STAT1, STAT2, STAT3, STAT4, STAT5 (STAT5A and STAT5B), and STAT6. There are 6 conserved domains in STAT: N-terminal domain (NTD), coiled-coil domain (CCD), DNA-binding domain (DBD), alpha-helical linker domain (LD), SH2 domain, and transactivation domain (TAD).  NTD is involved in dimerization of unphosphorylated STATs monomers and for the tetramerization between STAT1, STAT3, STAT4 and STAT5 on promoters with two or more tandem STAT binding sites.  It also plays a role in promoting interactions with transcriptional co-activators such as CREB binding protein (CBP)/p300, as well as being important for nuclear import and deactivation of STATs involving tyrosine de-phosphorylation. CCD interacts with other proteins, such as IFN regulatory protein 9 (IRF-9/p48) with STAT1 and c-JUN with STAT3 and is also thought to participate in the negative regulation of these proteins. Distinct genes are bound to STATs via their DBD domain. This domain is also involved in nuclear translocation of activated STAT1 and STAT3 phosphorylated dimers upon cytokine stimulation.  
LD links the DNA-binding and SH2  domains and is important for the transcriptional activation of STAT1 in response to IFN-gamma. It also plays a role in protein-protein interactions and has also been implicated in the constitutive nucleocytoplasmic shuttling of unphosphorylated STATs in resting cells.  The SH2 domain is necessary for receptor association and tyrosine phosphodimer formation. Residues within this domain may be particularly important for some cellular functions mediated by the STATs as well as residues adjacent to this domain.  The TAD interacts with several proteins, namely minichromosome maintenance complex component 5 (MCM5), breast cancer 1 (BRCA1) and CBP/p300. TAD also contains a modulatory phosphorylation site that regulates STAT activity and is necessary for maximal transcription of a number of target genes. The conserved tyrosine residue present in the C-terminus is crucial for dimerization via interaction with the SH2 domain upon the interaction of the ligand with the receptor. STAT activation by tyrosine phosphorylation also determines nuclear import and retention, DNA binding to specific DNA elements in the promoters of responsive genes, and transcriptional activation of STAT dimers. In addition to the SH2 domain there is a coiled-coil domain, a DNA binding domain, and a transactivation domain in the STAT proteins. In general SH2 domains are involved in signal transduction.  They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	148
198239	cd10376	SH2_STAT5	Src homology 2 (SH2) domain found in signal transducer and activator of transcription (STAT) 5 proteins. STAT5 is a member of the STAT family of transcription factors.  Two highly related proteins, STAT5a and STAT5b are encoded by separate genes, but are 90% identical at the amino acid level.  Both STAT5a and STAT5b are ubiquitously expressed and  functionally interchangeable. Mice lacking either STAT5a or STAT5b have mild defects in prolactin dependent mammary differentiation or sexually dimorphic growth hormone-dependent effects, respectively. Mice lacking both STAT5a and STAT5b exhibit a perinatal lethal phenotype and have multiple defects, including anemia and a virtual absence of B and T lymphocytes. STAT proteins mediate the signaling of cytokines and a number of growth factors from the receptors of these extracellular signaling molecules to the cell nucleus.  STATs are specifically phosphorylated by receptor-associated Janus kinases, receptor tyrosine kinases, or cytoplasmic tyrosine kinases. The phosphorylated STAT molecules dimerize by reciprocal binding of their SH2 domains to the phosphotyrosine residues. These dimeric STATs translocate into the nucleus, bind to specific DNA sequences, and regulate the transcription of their target genes.  However there are a number of unphosphorylated STATs that travel between the cytoplasm and nucleus and some STATs that exist as dimers in unstimulated cells that can exert biological functions independent of being activated. There are seven mammalian STAT family members which have been identified: STAT1, STAT2, STAT3, STAT4, STAT5 (STAT5A and STAT5B), and STAT6. There are 6 conserved domains in STAT: N-terminal domain (NTD), coiled-coil domain (CCD), DNA-binding domain (DBD), alpha-helical linker domain (LD), SH2 domain, and transactivation domain (TAD).  
NTD is involved in dimerization of unphosphorylated STATs monomers and for the tetramerization between STAT1, STAT3, STAT4 and STAT5 on promoters with two or more tandem STAT binding sites.  It also plays a role in promoting interactions with transcriptional co-activators such as CREB binding protein (CBP)/p300, as well as being important for nuclear import and deactivation of STATs involving tyrosine de-phosphorylation. CCD interacts with other proteins, such as IFN regulatory protein 9 (IRF-9/p48) with STAT1 and c-JUN with STAT3 and is also thought to participate in the negative regulation of these proteins. Distinct genes are bound to STATs via their DBD domain. This domain is also involved in nuclear translocation of activated STAT1 and STAT3 phosphorylated dimers upon cytokine stimulation.  LD links the DNA-binding and SH2  domains and is important for the transcriptional activation of STAT1 in response to IFN-gamma. It also plays a role in protein-protein interactions and has also been implicated in the constitutive nucleocytoplasmic shuttling of unphosphorylated STATs in resting cells.  The SH2 domain is necessary for receptor association and tyrosine phosphodimer formation. Residues within this domain may be particularly important for some cellular functions mediated by the STATs as well as residues adjacent to this domain.  The TAD interacts with several proteins, namely minichromosome maintenance complex component 5 (MCM5), breast cancer 1 (BRCA1) and CBP/p300. TAD also contains a modulatory phosphorylation site that regulates STAT activity and is necessary for maximal transcription of a number of target genes. The conserved tyrosine residue present in the C-terminus is crucial for dimerization via interaction with the SH2 domain upon the interaction of the ligand with the receptor. 
STAT activation by tyrosine phosphorylation also determines nuclear import and retention, DNA binding to specific DNA elements in the promoters of responsive genes, and transcriptional activation of STAT dimers. In addition to the SH2 domain there is a coiled-coil domain, a DNA binding domain, and a transactivation domain in the STAT proteins.	137
198240	cd10377	SH2_STAT6	Src homology 2 (SH2) domain found in signal transducer and activator of transcription (STAT) 6 proteins. STAT6 mediates signals from the IL-4 receptor. Unlike the other STAT proteins which bind an IFNgamma Activating Sequence (GAS),  STAT6 stands out as having a unique binding site preference. This site consists of a palindromic sequence separated by a 3 bp spacer (TTCNNNG-AA)(N3 site). STAT6 is able to bind the GAS site but only at a low affinity. STAT6 may be an important regulator of mitogenesis when cells respond normally to IL-4. There is speculation that the inappropriate activation of STAT6 is involved in uncontrolled cell growth in an oncogenic state. IFNgamma is a negative regulator of STAT6 dependent transcription of target genes. Bcl-6 is another negative regulator of STAT6 activity. Bcl-6 is a transcriptional repressor normally expressed in germinal center B cells and some T cells. IL-4 signaling via STAT6 initially occurs unopposed, but is then dampened by a negative feedback mechanism through the IL-4/Stat6 dependent induction of SOCS1 expression. The IL-4 dependent aspect of Th2 differentiation requires the activation of STAT6. IL-4 signaling and STAT6 appear to play an important role in the immune response. Recently, it was shown that the large scale chromatin remodeling of the IL-4 gene that occurs as cells differentiate into Th2 effectors is STAT6 dependent. STAT proteins mediate the signaling of cytokines and a number of growth factors from the receptors of these extracellular signaling molecules to the cell nucleus. STATs are specifically phosphorylated by receptor-associated Janus kinases, receptor tyrosine kinases, or cytoplasmic tyrosine kinases. The phosphorylated STAT molecules dimerize by reciprocal binding of their SH2 domains to the phosphotyrosine residues. These dimeric STATs translocate into the nucleus, bind to specific DNA sequences, and regulate the transcription of their target genes.  
However there are a number of unphosphorylated STATs that travel between the cytoplasm and nucleus and some STATs that exist as dimers in unstimulated cells that can exert biological functions independent of being activated. There are seven mammalian STAT family members which have been identified: STAT1, STAT2, STAT3, STAT4, STAT5 (STAT5A and STAT5B), and STAT6. There are 6 conserved domains in STAT: N-terminal domain (NTD), coiled-coil domain (CCD), DNA-binding domain (DBD), alpha-helical linker domain (LD), SH2 domain, and transactivation domain (TAD).  NTD is involved in dimerization of unphosphorylated STATs monomers and for the tetramerization between STAT1, STAT3, STAT4 and STAT5 on promoters with two or more tandem STAT binding sites.  It also plays a role in promoting interactions with transcriptional co-activators such as CREB binding protein (CBP)/p300, as well as being important for nuclear import and deactivation of STATs involving tyrosine de-phosphorylation. CCD interacts with other proteins, such as IFN regulatory protein 9 (IRF-9/p48) with STAT1 and c-JUN with STAT3 and is also thought to participate in the negative regulation of these proteins. Distinct genes are bound to STATs via their DBD domain. This domain is also involved in nuclear translocation of activated STAT1 and STAT3 phosphorylated dimers upon cytokine stimulation.  LD links the DNA-binding and SH2  domains and is important for the transcriptional activation of STAT1 in response to IFN-gamma. It also plays a role in protein-protein interactions and has also been implicated in the constitutive nucleocytoplasmic shuttling of unphosphorylated STATs in resting cells.  The SH2 domain is necessary for receptor association and tyrosine phosphodimer formation. Residues within this domain may be particularly important for some cellular functions mediated by the STATs as well as residues adjacent to this domain.  
The TAD interacts with several proteins, namely minichromosome maintenance complex component 5 (MCM5), breast cancer 1 (BRCA1) and CBP/p300. TAD also contains a modulatory phosphorylation site that regulates STAT activity and is necessary for maximal transcription of a number of target genes. The conserved tyrosine residue present in the C-terminus is crucial for dimerization via interaction with the SH2 domain upon the interaction of the ligand with the receptor. STAT activation by tyrosine phosphorylation also determines nuclear import and retention, DNA binding to specific DNA elements in the promoters of responsive genes, and transcriptional activation of STAT dimers. In addition to the SH2 domain there is a coiled-coil domain, a DNA binding domain, and a transactivation domain in the STAT proteins. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	129
198241	cd10378	SH2_Jak1	Src homology 2 (SH2) domain in the Janus kinase 1 (Jak1) proteins. Janus kinase 1 (JAK1) is a member of a class of protein-tyrosine kinases (PTK) characterized by the presence of a second phosphotransferase-related domain immediately N-terminal to the PTK domain. The second phosphotransferase domain bears all the hallmarks of a protein kinase, although its structure differs significantly from that of the PTK and threonine/serine kinase family members. JAK1 is a large, widely expressed membrane-associated phosphoprotein. JAK1 is involved in the interferon-alpha/beta and -gamma signal transduction pathways. The reciprocal interdependence between JAK1 and TYK2 activities in the interferon-alpha pathway, and between JAK1 and JAK2 in the interferon-gamma pathway, may reflect a requirement for these kinases in the correct assembly of interferon receptor complexes. These kinases couple cytokine ligand binding to tyrosine phosphorylation of various known signaling proteins and of a unique family of transcription factors termed the signal transducers and activators of transcription, or STATs.  In general SH2 domains are involved in signal transduction.  They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	102
198242	cd10379	SH2_Jak2	Src homology 2 (SH2) domain in the Janus kinase 2 (Jak2) proteins. Jak2 is a protein tyrosine kinase involved in a specific subset of cytokine receptor signaling pathways. It has been found to be constitutively associated with the prolactin receptor and is required for responses to gamma interferon. Mice that do not express an active protein for this gene exhibit embryonic lethality associated with the absence of definitive erythropoiesis. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	97
198243	cd10380	SH2_Jak3	Src homology 2 (SH2) domain in the Janus kinase 3 (Jak3) proteins. Jak3 is a member of the Janus kinase (JAK) family of tyrosine kinases involved in cytokine receptor-mediated intracellular signal transduction. It is predominantly expressed in immune cells and transduces a signal in response to its activation via tyrosine phosphorylation by interleukin receptors. Mutations in this gene are associated with autosomal SCID (severe combined immunodeficiency disease). In general SH2 domains are involved in signal transduction.  They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	96
198244	cd10381	SH2_Jak_Tyk2	Src homology 2 (SH2) domain in Tyrosine Kinase 2 (Tyk2), a member of the Janus kinases (JAK). Tyk2 is a member of the tyrosine kinase and, more specifically, the Janus kinases (JAKs) protein families. This protein associates with the cytoplasmic domain of type I and type II cytokine receptors and promulgates cytokine signals by phosphorylating receptor subunits. It is also a component of both the type I and type III interferon signaling pathways. As such, it may play a role in anti-viral immunity. A mutation in this gene has been associated with hyperimmunoglobulin E syndrome (HIES) - a primary immunodeficiency characterized by elevated serum immunoglobulin E. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	102
198245	cd10382	SH2_SOCS1	Src homology 2 (SH2) domain found in  suppressor of cytokine signaling (SOCS) proteins. SH2 domain found in SOCS proteins. SOCS was first recognized as a group of cytokine-inducible SH2 (CIS) domain proteins comprising eight family members in human (CIS and SOCS1-SOCS7).  In addition to the SH2 domain, SOCS proteins have a variable N-terminal domain and a conserved SOCS box in the C-terminal domain. SOCS proteins bind to a substrate via their SH2 domain. The prototypical members, CIS and SOCS1-SOCS3, have been shown to regulate growth hormone signaling in vitro and in a classic negative feedback response compete for binding at phosphotyrosine sites in JAK kinase and receptor pathways to displace effector proteins and target bound receptors for proteasomal degradation. Loss of SOCS activity results in excessive cytokine signaling associated with a variety of hematopoietic, autoimmune, and inflammatory diseases and certain cancers. Members (SOCS4-SOCS7) were identified by their conserved SOCS box, an adapter motif of 3 helices that associates substrate binding domains, such as the SOCS SH2 domain, ankyrin, and WD40 with ubiquitin ligase components. These show limited cytokine induction. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	98
198246	cd10383	SH2_SOCS2	Src homology 2 (SH2) domain found in  suppressor of cytokine signaling (SOCS) proteins. SH2 domain found in SOCS proteins. SOCS was first recognized as a group of cytokine-inducible SH2 (CIS) domain proteins comprising eight family members in human (CIS and SOCS1-SOCS7).  In addition to the SH2 domain, SOCS proteins have a variable N-terminal domain and a conserved SOCS box in the C-terminal domain. SOCS proteins bind to a substrate via their SH2 domain. The prototypical members, CIS and SOCS1-SOCS3, have been shown to regulate growth hormone signaling in vitro and in a classic negative feedback response compete for binding at phosphotyrosine sites in JAK kinase and receptor pathways to displace effector proteins and target bound receptors for proteasomal degradation. Loss of SOCS activity results in excessive cytokine signaling associated with a variety of hematopoietic, autoimmune, and inflammatory diseases and certain cancers. Members (SOCS4-SOCS7) were identified by their conserved SOCS box, an adapter motif of 3 helices that associates substrate binding domains, such as the SOCS SH2 domain, ankyrin, and WD40 with ubiquitin ligase components. These show limited cytokine induction. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	103
198247	cd10384	SH2_SOCS3	Src homology 2 (SH2) domain found in  suppressor of cytokine signaling (SOCS) proteins. SH2 domain found in SOCS proteins. SOCS was first recognized as a group of cytokine-inducible SH2 (CIS) domain proteins comprising eight family members in human (CIS and SOCS1-SOCS7). In addition to the SH2 domain, SOCS proteins have a variable N-terminal domain and a conserved SOCS box in the C-terminal domain. SOCS proteins bind to a substrate via their SH2 domain. The prototypical members, CIS and SOCS1-SOCS3, have been shown to regulate growth hormone signaling in vitro and in a classic negative feedback response compete for binding at phosphotyrosine sites in JAK kinase and receptor pathways to displace effector proteins and target bound receptors for proteasomal degradation. Loss of SOCS activity results in excessive cytokine signaling associated with a variety of hematopoietic, autoimmune, and inflammatory diseases and certain cancers. Members (SOCS4-SOCS7) were identified by their conserved SOCS box, an adapter motif of 3 helices that associates substrate binding domains, such as the SOCS SH2 domain, ankyrin, and WD40 with ubiquitin ligase components. These show limited cytokine induction. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	101
198248	cd10385	SH2_SOCS4	Src homology 2 (SH2) domain found in suppressor of cytokine signaling (SOCS) proteins. SH2 domain found in SOCS proteins. SOCS was first recognized as a group of cytokine-inducible SH2 (CIS) domain proteins comprising eight family members in human (CIS and SOCS1-SOCS7). In addition to the SH2 domain, SOCS proteins have a variable N-terminal domain and a conserved SOCS box in the C-terminal domain. SOCS proteins bind to a substrate via their SH2 domain. The prototypical members, CIS and SOCS1-SOCS3, have been shown to regulate growth hormone signaling in vitro and in a classic negative feedback response compete for binding at phosphotyrosine sites in JAK kinase and receptor pathways to displace effector proteins and target bound receptors for proteasomal degradation. Loss of SOCS activity results in excessive cytokine signaling associated with a variety of hematopoietic, autoimmune, and inflammatory diseases and certain cancers. Members (SOCS4-SOCS7) were identified by their conserved SOCS box, an adapter motif of 3 helices that associates substrate binding domains, such as the SOCS SH2 domain, ankyrin, and WD40 with ubiquitin ligase components. These show limited cytokine induction. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	101
198249	cd10386	SH2_SOCS5	Src homology 2 (SH2) domain found in  suppressor of cytokine signaling (SOCS) family. SH2 domain found in SOCS proteins. SOCS was first recognized as a group of cytokine-inducible SH2 (CIS) domain proteins comprising eight family members in human (CIS and SOCS1-SOCS7).  In addition to the SH2 domain, SOCS proteins have a variable N-terminal domain and a conserved SOCS box in the C-terminal domain. SOCS proteins bind to a substrate via their SH2 domain. The prototypical members, CIS and SOCS1-SOCS3, have been shown to regulate growth hormone signaling in vitro and in a classic negative feedback response compete for binding at phosphotyrosine sites in JAK kinase and receptor pathways to displace effector proteins and target bound receptors for proteasomal degradation. Loss of SOCS activity results in excessive cytokine signaling associated with a variety of hematopoietic, autoimmune, and inflammatory diseases and certain cancers. Members (SOCS4-SOCS7) were identified by their conserved SOCS box, an adapter motif of 3 helices that associates substrate binding domains, such as the SOCS SH2 domain, ankyrin, and WD40 with ubiquitin ligase components. These show limited cytokine induction. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	81
198250	cd10387	SH2_SOCS6	Src homology 2 (SH2) domain found in  suppressor of cytokine signaling (SOCS) proteins. SH2 domain found in SOCS proteins. SOCS was first recognized as a group of cytokine-inducible SH2 (CIS) domain proteins comprising eight family members in human (CIS and SOCS1-SOCS7). In addition to the SH2 domain, SOCS proteins have a variable N-terminal domain and a conserved SOCS box in the C-terminal domain. SOCS proteins bind to a substrate via their SH2 domain. The prototypical members, CIS and SOCS1-SOCS3, have been shown to regulate growth hormone signaling in vitro and in a classic negative feedback response compete for binding at phosphotyrosine sites in JAK kinase and receptor pathways to displace effector proteins and target bound receptors for proteasomal degradation. Loss of SOCS activity results in excessive cytokine signaling associated with a variety of hematopoietic, autoimmune, and inflammatory diseases and certain cancers. Members (SOCS4-SOCS7) were identified by their conserved SOCS box, an adapter motif of 3 helices that associates substrate binding domains, such as the SOCS SH2 domain, ankyrin, and WD40 with ubiquitin ligase components. These show limited cytokine induction. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	100
198251	cd10388	SH2_SOCS7	Src homology 2 (SH2) domain found in  suppressor of cytokine signaling (SOCS) proteins. SH2 domain found in SOCS proteins. SOCS was first recognized as a group of cytokine-inducible SH2 (CIS) domain proteins comprising eight family members in human (CIS and SOCS1-SOCS7). In addition to the SH2 domain, SOCS proteins have a variable N-terminal domain and a conserved SOCS box in the C-terminal domain. SOCS proteins bind to a substrate via their SH2 domain. The prototypical members, CIS and SOCS1-SOCS3, have been shown to regulate growth hormone signaling in vitro and in a classic negative feedback response compete for binding at phosphotyrosine sites in JAK kinase and receptor pathways to displace effector proteins and target bound receptors for proteasomal degradation. Loss of SOCS activity results in excessive cytokine signaling associated with a variety of hematopoietic, autoimmune, and inflammatory diseases and certain cancers. Members (SOCS4-SOCS7) were identified by their conserved SOCS box, an adapter motif of 3 helices that associates substrate binding domains, such as the SOCS SH2 domain, ankyrin, and WD40 with ubiquitin ligase components. These show limited cytokine induction. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	101
198252	cd10389	SH2_SHB	Src homology 2 domain found in SH2 domain-containing adapter protein B (SHB). SHB functions in generating signaling compounds in response to tyrosine kinase activation. SHB contains proline-rich motifs, a phosphotyrosine binding (PTB) domain, tyrosine phosphorylation sites, and an SH2 domain. SHB mediates certain aspects of platelet-derived growth factor (PDGF) receptor-, fibroblast growth factor (FGF) receptor-, nerve growth factor (NGF) receptor TRKA-, T cell receptor-, interleukin-2 (IL-2) receptor- and focal adhesion kinase- (FAK) signaling. SRC-like FYN-Related Kinase FRK/RAK (also named BSK/IYK or GTK) and SHB regulate apoptosis, proliferation and differentiation. SHB promotes apoptosis and is also required for proper mitogenicity, spreading and tubular morphogenesis in endothelial cells. SHB also plays a role in preventing early cavitation of embryoid bodies and reduces differentiation to cells expressing albumin, amylase, insulin and glucagon. SHB is a multifunctional protein that has different responses in different cells under various conditions. In general SH2 domains are involved in signal transduction.  They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	97
198253	cd10390	SH2_SHD	Src homology 2 domain found in SH2 domain-containing adapter proteins D (SHD). The expression of SHD is restricted to the brain. SHD may be a physiological substrate of c-Abl and may function as an adapter protein in the central nervous system. It is also thought to be involved in apoptotic regulation. SHD contains five YXXP motifs, a substrate sequence preferred by Abl tyrosine kinases, in addition to a poly-proline rich region and a C-terminal SH2 domain. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	98
198254	cd10391	SH2_SHE	Src homology 2 domain found in SH2 domain-containing adapter protein E (SHE). SHE is expressed in heart, lung, brain, and skeletal muscle. SHE contains two pTyr protein binding domains, a protein interaction domain (PID) and a SH2 domain, followed by a glycine-proline rich region, all of which are N-terminal to the phosphotyrosine binding (PTB) domain. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	98
198255	cd10392	SH2_SHF	Src homology 2 domain found in SH2 domain-containing adapter protein F (SHF). SHF is thought to play a role in PDGF-receptor signaling and regulation of apoptosis. SHF is mainly expressed in skeletal muscle, brain, liver, prostate, testis, ovary, small intestine, and colon. SHF contains  four putative tyrosine phosphorylation sites and an SH2 domain. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	98
198256	cd10393	SH2_RIN1	Src homology 2 (SH2) domain found in Ras and Rab interactor 1 (RIN1)-like proteins. RIN1, a member of the RIN (AKA Ras interaction/interference) family, has multifunctional domains including SH2 and proline-rich (PR) domains in the N-terminal region, and RIN-family homology (RH), VPS9 and Ras-association (RA) domains in the C-terminal region. RIN proteins function as Rab5-GEFs. Previous studies showed that RIN1 interacts with EGF receptors via its SH2 domain and regulates trafficking and degradation of EGF receptors via its interaction with STAM, indicating a vital role for RIN1 in regulating endosomal trafficking of receptor tyrosine kinases (RTKs). RIN1 was first identified as a Ras-binding protein that suppresses the activated RAS2 allele in S. cerevisiae. RIN1 binds to the activated Ras through its carboxyl-terminal domain and this Ras-binding domain also binds to 14-3-3 proteins as Raf-1 does. The SH2 domain of RIN1 is thought to interact with phosphotyrosine-containing proteins, but the physiological partners for this domain are unknown. The proline-rich domain in RIN1 is similar to the consensus SH3 binding regions. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	101
198257	cd10394	SH2_RIN2	Src homology 2 (SH2) domain found in Ras and Rab interactor 2 (RIN2)-like proteins. RIN2, a member of the RIN (AKA Ras interaction/interference) family, has multifunctional domains including SH2 and proline-rich (PR) domains in the N-terminal region, and RIN-family homology (RH), VPS9 and Ras-association (RA) domains in the C-terminal region. RIN proteins function as Rab5-GEFs. Ras induces activation of Rab5 through RIN2, which is a direct downstream target of Ras and a direct upstream regulator of Rab5. In other words it is the binding of the GTP-bound form of Ras to the RA domain of RIN2 that enhances the GEF activity toward Rab5. It is thought that the RA domain negatively regulates the Rab5 GEF activity. In steady state, RIN2 is likely to form a closed conformation by an intramolecular interaction between the RA domain and the Vps9p-like (Rab5 GEF) domain, negatively regulating the Rab5 GEF activity. In the active state, the binding of Ras to the RA domain may reduce the intramolecular interaction and stabilize an open conformation of RIN2. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	100
198258	cd10395	SH2_RIN3	Src homology 2 (SH2) domain found in Ras and Rab interactor 3 (RIN3)-like proteins. RIN3, a member of the RIN (AKA Ras interaction/interference) family, has multifunctional domains including SH2 and proline-rich (PR) domains in the N-terminal region, and RIN-family homology (RH), VPS9 and Ras-association (RA) domains in the C-terminal region. RIN proteins function as Rab5-GEFs. RIN3 stimulated the formation of GTP-bound Rab31, a Rab5-subfamily GTPase, and formed enlarged vesicles and tubular structures, where it colocalized with Rab31. Transferrin appeared to be transported partly through the RIN3-positive vesicles to early endosomes. RIN3 interacts via its Pro-rich domain with amphiphysin II, which contains an SH3 domain and participates in receptor-mediated endocytosis. RIN3, a Rab5 and Rab31 GEF, plays an important role in the transport pathway from plasma membrane to early endosomes. Mutations in the region between the SH2 and RH domain of RIN3 specifically abolished its GEF action on Rab31, but not Rab5. RIN3 was also found to partially translocate the cation-dependent mannose 6-phosphate receptor from the trans-Golgi network to peripheral vesicles, and this translocation is dependent on its Rab31-GEF activity. These data indicate that RIN3 specifically acts as a GEF for Rab31. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	101
198259	cd10396	SH2_Tec_Itk	Src homology 2 (SH2) domain found in Tec protein, IL2-inducible T-cell kinase (Itk). A member of the Tec family of protein tyrosine kinases, Itk is expressed in thymus, spleen, lymph node, T lymphocytes, NK and mast cells. It plays a role in T-cell proliferation and differentiation, analogous to the Tec family kinase Txk. Itk has been shown to interact with Fyn, Wiskott-Aldrich syndrome protein, KHDRBS1, PLCG1, Lymphocyte cytosolic protein 2, Linker of activated T cells, Karyopherin alpha 2, Grb2, and Peptidylprolyl isomerase A. Most of the Tec family members have a PH domain (Txk and the short (type 1) splice variant of Drosophila Btk29A are exceptions), a Tec homology (TH) domain, a SH3 domain, a SH2 domain, and a protein kinase catalytic domain. The TH domain consists of a Zn2+-binding Btk motif and a proline-rich region. The Btk motif is found in Tec kinases, Ras GAP, and IGBP. It is crucial for the function of Tec PH domains and its absence from Txk is not surprising since Txk lacks a PH domain. The type 1 splice form of the Drosophila homolog also lacks both the PH domain and the Btk motif. The proline-rich regions are highly conserved for the most part with the exception of Bmx whose residues surrounding the PXXP motif are not conserved (TH-like) and Btk29A which is entirely unique with large numbers of glycine residues (TH-extended). Tec family members all lack a C-terminal tyrosine having an autoinhibitory function in its phosphorylated state. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	108
198260	cd10397	SH2_Tec_Btk	Src homology 2 (SH2) domain found in Tec protein, Bruton's tyrosine kinase (Btk). A member of the Tec family of protein tyrosine kinases, Btk is expressed in bone marrow, spleen, all hematopoietic cells except T lymphocytes and plasma cells where it plays a crucial role in B cell maturation and mast cell activation. Btk has been shown to interact with GNAQ, PLCG2, protein kinase D1, B-cell linker, SH3BP5, caveolin 1, ARID3A, and GTF2I. Most of the Tec family members have a PH domain (Txk and the short (type 1) splice variant of Drosophila Btk29A are exceptions), a Tec homology (TH) domain, a SH3 domain, a SH2 domain, and a protein kinase catalytic domain. Btk is implicated in the primary immunodeficiency disease X-linked agammaglobulinemia (Bruton's agammaglobulinemia). The TH domain consists of a Zn2+-binding Btk motif and a proline-rich region. The Btk motif is found in Tec kinases, Ras GAP, and IGBP. It is crucial for the function of Tec PH domains and its absence from Txk is not surprising since Txk lacks a PH domain. The type 1 splice form of the Drosophila homolog also lacks both the PH domain and the Btk motif. The proline-rich regions are highly conserved for the most part with the exception of Bmx whose residues surrounding the PXXP motif are not conserved (TH-like) and Btk29A which is entirely unique with large numbers of glycine residues (TH-extended). Tec family members all lack a C-terminal tyrosine having an autoinhibitory function in its phosphorylated state. Two tyrosine phosphorylation (pY) sites have been identified in Btk: one located in the activation loop of the catalytic domain which regulates the transition between open (active) and closed (inactive) states and the other in its SH3 domain. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	106
198261	cd10398	SH2_Tec_Txk	Src homology 2 (SH2) domain found  in Tec protein, Txk. A member of the Tec protein tyrosine kinase Txk is expressed in thymus, spleen, lymph node, T lymphocytes, NK cells, mast cell lines, and myeloid cell line. Txk plays a role in TCR signal transduction, T cell development, and selection which is analogous to the function of Itk. Txk has been shown to interact with IFN-gamma. Unlike most of the Tec family members Txk lacks a  PH domain. Instead Txk has a unique region containing a palmitoylated cysteine string which has a similar membrane tethering function as the PH domain. Txk also has a zinc-binding motif, a SH3 domain, a SH2 domain, and a protein kinase catalytic domain. The TH domain consists of a Zn2+-binding Btk motif and a proline-rich region. The Btk motif is found in Tec kinases, Ras GAP, and IGBP and crucial to the function of the PH domain. It is not present in Txk which is not surprising since it lacks a PH domain. The type 1 splice form of the Drosophila homolog also lacks both the PH domain and the Btk motif. The proline-rich regions are highly conserved for the most part with the exception of Bmx whose residues surrounding the PXXP motif are not conserved (TH-like) and Btk29A  which is entirely unique with large numbers of glycine residues (TH-extended).  Tec family members all lack a C-terminal tyrosine having an autoinhibitory function in its phosphorylated state. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	106
198262	cd10399	SH2_Tec_Bmx	Src homology 2 (SH2) domain found  in Tec protein, Bmx. A member of the Tec protein tyrosine kinase Bmx is expressed in the endothelium of large arteries, fetal endocardium, adult endocardium of the left ventricle, bone marrow, lung, testis, granulocytes, myeloid cell lines, and prostate cell lines. Bmx is involved in the regulation of Rho and serum response factor (SRF). Bmx has been shown to interact with PAK1, PTK2, PTPN21, and RUFY1. Most of the Tec family members have a PH domain (Txk and the short (type 1) splice variant of Drosophila Btk29A are exceptions), a Tec homology (TH) domain, a SH3 domain, a SH2 domain, and a protein kinase catalytic domain.  The TH domain consists of a Zn2+-binding Btk motif and a proline-rich region. The Btk motif is found in Tec kinases, Ras GAP, and IGBP.  It is crucial for the function of Tec PH domains. It is not present in Txk and the type 1 splice form of the Drosophila homolog.  The proline-rich regions are highly conserved for the most part with the exception of Bmx whose residues surrounding the PXXP motif are not conserved (TH-like) and Btk29A  which is entirely unique with large numbers of glycine residues (TH-extended). Tec family members all lack a C-terminal tyrosine having an autoinhibitory function in its phosphorylated state. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	106
198263	cd10400	SH2_SAP1a	Src homology 2 (SH2) domain found in SLAM-associated protein (SAP) 1a. The X-linked lymphoproliferative syndrome (XLP) gene encodes SAP (also called SH2D1A/DSHP) a protein that consists of a 5 residue N-terminus, a single SH2 domain, and a short 25 residue C-terminal tail. XLP is characterized by an extreme sensitivity to Epstein-Barr virus.  Both T and natural killer (NK) cell dysfunctions have been seen in XLP patients. SAP binds the cytoplasmic tail of Signaling lymphocytic activation molecule (SLAM), 2B4, Ly-9, and CD84. SAP is believed to function as a signaling inhibitor, by blocking or regulating binding of other signaling proteins. SAP and the SAP-like protein EAT-2 recognize the sequence motif TIpYXX[VI], which is found in the cytoplasmic domains of a restricted number of T, B, and NK cell surface receptors and are proposed to be natural inhibitors or regulators of the physiological role of a small family of receptors on the surface of these cells.  In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	103
198264	cd10401	SH2_C-SH2_Syk_like	C-terminal Src homology 2 (SH2) domain found in Spleen tyrosine kinase (Syk) proteins. ZAP-70 and Syk comprise a family of hematopoietic cell specific protein tyrosine kinases (PTKs) that are required for antigen and antibody receptor function. ZAP-70 is expressed in T and natural killer (NK) cells  and Syk is expressed in B cells, mast cells, polymorphonuclear leukocytes, platelets, macrophages, and immature T cells. They are required for the proper development of T and B cells, immune receptors, and activating NK cells. They consist of two N-terminal Src homology 2 (SH2) domains and a C-terminal kinase domain separated from the SH2 domains by a linker or hinge region. Phosphorylation of both tyrosine residues within the Immunoreceptor Tyrosine-based Activation Motifs (ITAM; consensus sequence Yxx[LI]x(7,8)Yxx[LI]) by the Src-family PTKs is required for efficient interaction of ZAP-70 and Syk with the receptor subunits and for receptor function. ZAP-70 forms two phosphotyrosine binding pockets, one of which is shared by both SH2 domains.  In Syk the two SH2 domains do not form such a phosphotyrosine-binding site.  The SH2 domains here are believed to function independently. In addition, the two SH2 domains of Syk display flexibility in their relative orientation, allowing Syk to accommodate a greater variety of spacing sequences between the ITAM phosphotyrosines and singly phosphorylated non-classical ITAM ligands. This model contains the C-terminus SH2 domains of Syk. In general SH2 domains are involved in signal transduction.  They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	99
198265	cd10402	SH2_C-SH2_Zap70	C-terminal Src homology 2 (SH2) domain found in Zeta-chain-associated protein kinase 70 (ZAP-70). ZAP-70 and Syk comprise a family of hematopoietic cell specific protein tyrosine kinases (PTKs) that are required for antigen and antibody receptor function. ZAP-70 is expressed in T and natural killer (NK) cells  and Syk is expressed in B cells, mast cells, polymorphonuclear leukocytes, platelets, macrophages, and immature T cells. They are required for the proper development of T and B cells, immune receptors, and activating NK cells. They consist of two N-terminal Src homology 2 (SH2) domains and a C-terminal kinase domain separated from the SH2 domains by a linker or hinge region. Phosphorylation of both tyrosine residues within the Immunoreceptor Tyrosine-based Activation Motifs (ITAM; consensus sequence Yxx[LI]x(7,8)Yxx[LI]) by the Src-family PTKs is required for efficient interaction of ZAP-70 and Syk with the receptor subunits and for receptor function. ZAP-70 forms two phosphotyrosine binding pockets, one of which is shared by both SH2 domains.  In Syk the two SH2 domains do not form such a phosphotyrosine-binding site.  The SH2 domains here are believed to function independently. In addition, the two SH2 domains of Syk display flexibility in their relative orientation, allowing Syk to accommodate a greater variety of spacing sequences between the ITAM phosphotyrosines and singly phosphorylated non-classical ITAM ligands. This model contains the C-terminus SH2 domains of Zap70. In general SH2 domains are involved in signal transduction.  They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	105
198266	cd10403	SH2_STAP1	Src homology 2 domain found in Signal-transducing adaptor protein 1 (STAP1). STAP1 is a signal-transducing adaptor protein. It is composed of a Pleckstrin homology (PH) and SH2 domains along with several tyrosine phosphorylation sites. STAP-1 is an ortholog of BRDG1 (BCR downstream signaling 1). STAP1 protein functions as a docking protein acting downstream of Tec tyrosine kinase in B cell antigen receptor signaling. The protein is phosphorylated by Tec and participates in a positive feedback loop, increasing Tec activity. STAP1 has been shown to interact with C19orf2, an unconventional prefoldin RPB5 interactor. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	94
198267	cd10404	SH2_STAP2	Src homology 2 domain found in Signal-transducing adaptor protein 2 (STAP2). STAP2 is a signal-transducing adaptor protein. It is composed of a Pleckstrin homology (PH) and SH2 domains along with several tyrosine phosphorylation sites. The STAP2 protein is the substrate of breast tumor kinase, an Src-type non-receptor tyrosine kinase that mediates the interactions linking proteins involved in signal transduction pathways. STAP2 has alternative splicing variants. STAP2 has been shown to interact with tyrosine-protein kinase 6 (PTK6). In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	97
198268	cd10405	SH2_Vav1	Src homology 2 (SH2) domain found in the Vav1 proteins. Proto-oncogene vav is a member of the Dbl family of guanine nucleotide exchange factors (GEF) for the Rho family of GTP binding proteins. All Vavs are activated by tyrosine phosphorylation. There are three Vav mammalian family members: Vav1 which is expressed in the hematopoietic system, and Vav2 and Vav3 which are more ubiquitously expressed. Vav1 plays a role in T-cell and B-cell development and activation. It has been identified as the specific binding partner of Nef proteins from HIV-1, resulting in morphological changes, cytoskeletal rearrangements, and the JNK/SAPK signaling cascade, leading to increased levels of viral transcription and replication. Vav1 has been shown to interact with Ku70, PLCG1, Lymphocyte cytosolic protein 2, Janus kinase 2, SIAH2, S100B, Abl gene, ARHGDIB, SHB, PIK3R1, PRKCQ, Grb2, MAPK1, Syk, Linker of activated T cells, Cbl gene and EZH2. Vav proteins are involved in several processes that require cytoskeletal reorganization, such as the formation of the immunological synapse (IS), phagocytosis, platelet aggregation, spreading, and transformation. Vavs function as guanine nucleotide exchange factors (GEFs) for the Rho/Rac family of GTPases. Vav family members have several conserved motifs/domains including: a leucine-rich region, a leucine-zipper, a calponin homology (CH) domain, an acidic domain, a Dbl-homology (DH) domain, a pleckstrin homology (PH) domain, a cysteine-rich domain, 2 SH3 domains, a proline-rich region, and a SH2 domain. Vavs are the only known Rho GEFs that have both the DH/PH motifs and SH2/SH3 domains in the same protein. The leucine-rich helix-loop-helix (HLH) domain is thought to be involved in protein heterodimerization with other HLH proteins and it may function as a negative regulator by forming inactive heterodimers. The CH domain is usually involved in the association with filamentous actin, but in Vav it controls NFAT stimulation, Ca2+ mobilization, and its transforming activity. Acidic domains are involved in protein-protein interactions and contain regulatory tyrosines. The DH domain is a GDP-GTP exchange factor on Rho/Rac GTPases. The PH domain is involved in interactions with GTP-binding proteins, lipids and/or phosphorylated serine/threonine residues. The SH3 domain is involved in localization of proteins to specific sites within the cell by interacting with proteins containing proline-rich sequences. The SH2 domain mediates a high affinity interaction with tyrosine phosphorylated proteins. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	103
198269	cd10406	SH2_Vav2	Src homology 2 (SH2) domain found in the Vav2 proteins. Proto-oncogene vav is a member of the Dbl family of guanine nucleotide exchange factors (GEF) for the Rho family of GTP binding proteins. All Vavs are activated by tyrosine phosphorylation. There are three Vav mammalian family members: Vav1 which is expressed in the hematopoietic system, and Vav2 and Vav3 which are more ubiquitously expressed. Vav2 is a GEF for RhoA, RhoB and RhoG and may activate Rac1 and Cdc42. Vav2 has been shown to interact with CD19 and Grb2. Alternatively spliced transcript variants encoding different isoforms have been found for Vav2. Vav proteins are involved in several processes that require cytoskeletal reorganization, such as the formation of the immunological synapse (IS), phagocytosis, platelet aggregation, spreading, and transformation. Vavs function as guanine nucleotide exchange factors (GEFs) for the Rho/Rac family of GTPases. Vav family members have several conserved motifs/domains including: a leucine-rich region, a leucine-zipper, a calponin homology (CH) domain, an acidic domain, a Dbl-homology (DH) domain, a pleckstrin homology (PH) domain, a cysteine-rich domain, 2 SH3 domains, a proline-rich region, and a SH2 domain. Vavs are the only known Rho GEFs that have both the DH/PH motifs and SH2/SH3 domains in the same protein. The leucine-rich helix-loop-helix (HLH) domain is thought to be involved in protein heterodimerization with other HLH proteins and it may function as a negative regulator by forming inactive heterodimers. The CH domain is usually involved in the association with filamentous actin, but in Vav it controls NFAT stimulation, Ca2+ mobilization, and its transforming activity. Acidic domains are involved in protein-protein interactions and contain regulatory tyrosines. The DH domain is a GDP-GTP exchange factor on Rho/Rac GTPases. The PH domain is involved in interactions with GTP-binding proteins, lipids and/or phosphorylated serine/threonine residues. The SH3 domain is involved in localization of proteins to specific sites within the cell by interacting with proteins containing proline-rich sequences. The SH2 domain mediates a high affinity interaction with tyrosine phosphorylated proteins. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	103
198270	cd10407	SH2_Vav3	Src homology 2 (SH2) domain found in the Vav3 proteins. Proto-oncogene vav is a member of the Dbl family of guanine nucleotide exchange factors (GEF) for the Rho family of GTP binding proteins. All Vavs are activated by tyrosine phosphorylation. There are three Vav mammalian family members: Vav1 which is expressed in the hematopoietic system, and Vav2 and Vav3 which are more ubiquitously expressed. Vav3 preferentially activates RhoA, RhoG and, to a lesser extent, Rac1. Alternatively spliced transcript variants encoding different isoforms have been described for this gene. Vav3 has been shown to interact with Grb2. Vav proteins are involved in several processes that require cytoskeletal reorganization, such as the formation of the immunological synapse (IS), phagocytosis, platelet aggregation, spreading, and transformation. Vavs function as guanine nucleotide exchange factors (GEFs) for the Rho/Rac family of GTPases. Vav family members have several conserved motifs/domains including: a leucine-rich region, a leucine-zipper, a calponin homology (CH) domain, an acidic domain, a Dbl-homology (DH) domain, a pleckstrin homology (PH) domain, a cysteine-rich domain, 2 SH3 domains, a proline-rich region, and a SH2 domain. Vavs are the only known Rho GEFs that have both the DH/PH motifs and SH2/SH3 domains in the same protein. The leucine-rich helix-loop-helix (HLH) domain is thought to be involved in protein heterodimerization with other HLH proteins and it may function as a negative regulator by forming inactive heterodimers. The CH domain is usually involved in the association with filamentous actin, but in Vav it controls NFAT stimulation, Ca2+ mobilization, and its transforming activity. Acidic domains are involved in protein-protein interactions and contain regulatory tyrosines. The DH domain is a GDP-GTP exchange factor on Rho/Rac GTPases. The PH domain is involved in interactions with GTP-binding proteins, lipids and/or phosphorylated serine/threonine residues. The SH3 domain is involved in localization of proteins to specific sites within the cell by interacting with proteins containing proline-rich sequences. The SH2 domain mediates a high affinity interaction with tyrosine phosphorylated proteins. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	103
198271	cd10408	SH2_Nck1	Src homology 2 (SH2) domain found in Nck. Nck proteins are adaptors that modulate actin cytoskeleton dynamics by linking proline-rich effector molecules to tyrosine kinases or phosphorylated signaling intermediates. There are two members known in this family: Nck1 (Nckalpha) and Nck2 (Nckbeta and Growth factor receptor-bound protein 4 (Grb4)). They are characterized by having 3 SH3 domains and a C-terminal SH2 domain. Nck1 and Nck2 have overlapping functions as determined by gene knockouts. Both bind receptor tyrosine kinases and other tyrosine-phosphorylated proteins through their SH2 domains. In addition they also bind distinct targets.  Neuronal signaling proteins: EphrinB1, EphrinB2, and Disabled-1 (Dab-1) all bind to Nck-2 exclusively. And in the case of PDGFR, Tyr(P)751 binds to  Nck1 while Tyr(P)1009 binds to Nck2. Nck1 and Nck2 have a role in the infection process of enteropathogenic Escherichia coli (EPEC). Their SH3 domains are involved in recruiting and activating the N-WASP/Arp2/3 complex inducing actin polymerization resulting in the production of pedestals, dynamic bacteria-presenting protrusions of the plasma membrane. A similar thing occurs in the vaccinia virus where motile plasma membrane projections are formed beneath the virus. Recently it has been shown that the SH2 domains of both Nck1 and Nck2 bind the G-protein coupled receptor kinase-interacting protein 1 (GIT1) in a phosphorylation-dependent manner. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	97
198272	cd10409	SH2_Nck2	Src homology 2 (SH2) domain found in Nck. Nck proteins are adaptors that modulate actin cytoskeleton dynamics by linking proline-rich effector molecules to tyrosine kinases or phosphorylated signaling intermediates.  There are two members known in this family: Nck1 (Nckalpha) and Nck2 (Nckbeta and Growth factor receptor-bound protein 4 (Grb4)).  They are characterized by having 3 SH3 domains and a C-terminal SH2 domain. Nck1 and Nck2 have overlapping functions as determined by gene knockouts. Both bind receptor tyrosine kinases and other tyrosine-phosphorylated proteins through their SH2 domains. In addition they also bind distinct targets.  Neuronal signaling proteins: EphrinB1, EphrinB2, and Disabled-1 (Dab-1) all bind to Nck-2 exclusively. And in the case of PDGFR, Tyr(P)751 binds to  Nck1 while Tyr(P)1009 binds to Nck2. Nck1 and Nck2 have a role in the infection process of enteropathogenic Escherichia coli (EPEC). Their SH3 domains are involved in recruiting and activating the N-WASP/Arp2/3 complex inducing actin polymerization resulting in the production of pedestals, dynamic bacteria-presenting protrusions of the plasma membrane. A similar thing occurs in the vaccinia virus where motile plasma membrane projections are formed beneath the virus.  Recently it has been shown that the SH2 domains of both Nck1 and Nck2 bind the G-protein coupled receptor kinase-interacting protein 1 (GIT1) in a phosphorylation-dependent manner. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	98
198273	cd10410	SH2_SH2B1	Src homology 2 (SH2) domain found in SH2B adapter proteins (SH2B1, SH2B2, SH2B3). SH2B1 (SH2-B, PSM), like other members of the SH2B adapter protein family, contains a pleckstrin homology domain, at least one dimerization domain, and a C-terminal SH2 domain which binds to phosphorylated tyrosines in a variety of tyrosine kinases.  SH2B1 and SH2B2  function in signaling pathways found downstream of growth hormone receptor and receptor tyrosine kinases, including the insulin, insulin-like growth factor-I (IGF-I), platelet-derived growth factor (PDGF), nerve growth factor, hepatocyte growth factor, and fibroblast growth factor receptors. SH2B2beta, a new isoform of SH2B2, is an endogenous inhibitor of SH2B1 and/or SH2B2 (SH2B2alpha), negatively regulating insulin signaling and/or JAK2-mediated cellular responses. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	97
198274	cd10411	SH2_SH2B2	Src homology 2 (SH2) domain found in SH2B adapter proteins (SH2B1, SH2B2, SH2B3). SH2B2 (APS), like other members of the SH2B adapter protein family, contains a pleckstrin homology domain, at least one dimerization domain, and a C-terminal SH2 domain which binds to phosphorylated tyrosines in a variety of tyrosine kinases. SH2B1 and SH2B2  function in signaling pathways found downstream of growth hormone receptor and receptor tyrosine kinases, including the insulin, insulin-like growth factor-I (IGF-I), platelet-derived growth factor (PDGF), nerve growth factor, hepatocyte growth factor, and fibroblast growth factor receptors. SH2B2beta, a new isoform of SH2B2, is an endogenous inhibitor of SH2B1 and/or SH2B2 (SH2B2alpha), negatively regulating insulin signaling and/or JAK2-mediated cellular responses. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	97
198275	cd10412	SH2_SH2B3	Src homology 2 (SH2) domain found in SH2B adapter proteins (SH2B1, SH2B2, SH2B3). SH2B3 (Lnk), like other members of the SH2B adapter protein family, contains a pleckstrin homology domain, at least one dimerization domain, and a C-terminal SH2 domain which binds to phosphorylated tyrosines in a variety of tyrosine kinases.  SH2B3 negatively regulates lymphopoiesis and early hematopoiesis. The lnk-deficiency results in enhanced production of B cells, and expansion as well as enhanced function of hematopoietic stem cells (HSCs), demonstrating negative regulatory functions of Sh2b3/Lnk in cytokine signaling. Sh2b3/Lnk also functions in responses controlled by cell adhesion and in crosstalk between integrin- and cytokine-mediated signaling. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	97
198276	cd10413	SH2_Grb7	Src homology 2 (SH2) domain found in the growth factor receptor bound, subclass 7 (Grb7) proteins. The Grb family binds to the epidermal growth factor receptor (EGFR, erbB1) via their SH2 domains. Grb7 is part of the Grb7 family of proteins which also includes Grb10, and Grb14. They are composed of an N-terminal Proline-rich domain, a Ras Associating-like (RA) domain, a Pleckstrin Homology (PH) domain, a phosphotyrosine interaction region (PIR, BPS) and a C-terminal SH2 domain. The SH2 domains of Grb7, Grb10 and Grb14 preferentially bind to a different RTK. Grb7 binds strongly to the erbB2 receptor, unlike Grb10 and Grb14 which bind weakly to it. Grb7 family proteins are phosphorylated on serine/threonine as well as tyrosine residues. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	108
198277	cd10414	SH2_Grb14	Src homology 2 (SH2) domain found in the growth factor receptor bound, subclass 14 (Grb14) proteins. The Grb family binds to the epidermal growth factor receptor (EGFR, erbB1) via their SH2 domains. Grb14 is part of the Grb7 family of proteins which also includes Grb7 and Grb10. They are composed of an N-terminal Proline-rich domain, a Ras Associating-like (RA) domain, a Pleckstrin Homology (PH) domain, a phosphotyrosine interaction region (PIR, BPS) and a C-terminal SH2 domain. The SH2 domains of Grb7, Grb10 and Grb14 preferentially bind to a different RTK. Grb14 binds to Fibroblast Growth Factor Receptor (FGFR) and weakly to the erbB2 receptor. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	108
198278	cd10415	SH2_Grb10	Src homology 2 (SH2) domain found in the growth factor receptor bound, subclass 10 (Grb10) proteins. The Grb family binds to the epidermal growth factor receptor (EGFR, erbB1) via their SH2 domains. Grb10 is part of the Grb7 family of proteins which also includes Grb7, and Grb14. They are composed of an N-terminal Proline-rich domain, a Ras Associating-like (RA) domain, a Pleckstrin Homology (PH) domain, a phosphotyrosine interaction region (PIR, BPS) and a C-terminal SH2 domain. The SH2 domains of Grb7, Grb10 and Grb14 preferentially bind to a different RTK. Grb10 has been shown to interact with many different proteins, including the insulin and IGF1 receptors, platelet-derived growth factor (PDGF) receptor-beta, Ret, Kit, Raf1 and MEK1, and Nedd4. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	108
198279	cd10416	SH2_SH2D2A	Src homology 2 domain found in the SH2 domain containing protein 2A (SH2D2A). SH2D2A contains a single SH2 domain. In general SH2 domains are involved in signal transduction.  They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	102
199832	cd10417	SH2_SH2D7	Src homology 2 domain found in the SH2 domain containing protein 7 (SH2D7). SH2D7 contains a single SH2 domain. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	102
198281	cd10418	SH2_Src_Fyn_isoform_a_like	Src homology 2 (SH2) domain found in Fyn isoform a like proteins. Fyn is a member of the Src non-receptor type tyrosine kinase family of proteins. This cd contains the SH2 domain found in Fyn isoform a type proteins.  Fyn is involved in the control of cell growth and is required in the following pathways: T and B cell receptor signaling, integrin-mediated signaling, growth factor and cytokine receptor signaling, platelet activation, ion channel function, cell adhesion, axon guidance, fertilization, entry into mitosis, and differentiation of natural killer cells, oligodendrocytes and keratinocytes. The protein associates with the p85 subunit of phosphatidylinositol 3-kinase and interacts with the Fyn-binding protein. Alternatively spliced transcript variants encoding distinct isoforms exist. Fyn is primarily localized to the cytoplasmic leaflet of the plasma membrane. Tyrosine phosphorylation of target proteins by Fyn serves to either regulate target protein activity, and/or to generate a binding site on the target protein that recruits other signaling molecules. FYN has been shown to interact with a number of proteins including: BCAR1, Cbl, Janus kinase, nephrin, Sky, tyrosine kinase, Wiskott-Aldrich syndrome protein, and Zap-70. Fyn has a unique N-terminal domain, an SH3 domain, an SH2 domain, a kinase domain and a regulatory tail, as do the other members of the family. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	101
198282	cd10419	SH2_Src_Fyn_isoform_b_like	Src homology 2 (SH2) domain found in Fyn isoform b like proteins. Fyn is a member of the Src non-receptor type tyrosine kinase family of proteins. This cd contains the SH2 domain found in Fyn isoform b type proteins. Fyn is involved in the control of cell growth and is required in the following pathways: T and B cell receptor signaling, integrin-mediated signaling, growth factor and cytokine receptor signaling, platelet activation, ion channel function, cell adhesion, axon guidance, fertilization, entry into mitosis, and differentiation of natural killer cells, oligodendrocytes and keratinocytes. The protein associates with the p85 subunit of phosphatidylinositol 3-kinase and interacts with the Fyn-binding protein. Alternatively spliced transcript variants encoding distinct isoforms exist. Fyn is primarily localized to the cytoplasmic leaflet of the plasma membrane. Tyrosine phosphorylation of target proteins by Fyn serves to either regulate target protein activity, and/or to generate a binding site on the target protein that recruits other signaling molecules. FYN has been shown to interact with a number of proteins including: BCAR1, Cbl, Janus kinase, nephrin, Sky, tyrosine kinase, Wiskott-Aldrich syndrome protein, and Zap-70. Fyn has a unique N-terminal domain, an SH3 domain, an SH2 domain, a kinase domain and a regulatory tail, as do the other members of the family. In general SH2 domains are involved in signal transduction.  They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	101
198283	cd10420	SH2_STAT5b	Src homology 2 (SH2) domain found in signal transducer and activator of transcription (STAT) 5b proteins. STAT5 is a member of the STAT family of transcription factors.  Two highly related proteins, STAT5a and STAT5b are encoded by separate genes, but are 90% identical at the amino acid level.  Both STAT5a and STAT5b are ubiquitously expressed and  functionally interchangeable. Mice lacking either STAT5a or STAT5b have mild defects in prolactin dependent mammary differentiation or sexually dimorphic growth hormone-dependent effects, respectively. Mice lacking both STAT5a and STAT5b exhibit a perinatal lethal phenotype and have multiple defects, including anemia and a virtual absence of B and T lymphocytes. STAT proteins mediate the signaling of cytokines and a number of growth factors from the receptors of these extracellular signaling molecules to the cell nucleus.  STATs are specifically phosphorylated by receptor-associated Janus kinases, receptor tyrosine kinases, or cytoplasmic tyrosine kinases. The phosphorylated STAT molecules dimerize by reciprocal binding of their SH2 domains to the phosphotyrosine residues. These dimeric STATs translocate into the nucleus, bind to specific DNA sequences, and regulate the transcription of their target genes.  However there are a number of unphosphorylated STATs that travel between the cytoplasm and nucleus and some STATs that exist as dimers in unstimulated cells that can exert biological functions independent of being activated. There are seven mammalian STAT family members which have been identified: STAT1, STAT2, STAT3, STAT4, STAT5 (STAT5A and STAT5B), and STAT6. There are 6 conserved domains in STAT: N-terminal domain (NTD), coiled-coil domain (CCD), DNA-binding domain (DBD), alpha-helical linker domain (LD), SH2 domain, and transactivation domain (TAD).  
NTD is involved in dimerization of unphosphorylated STATs monomers and for the tetramerization between STAT1, STAT3, STAT4 and STAT5 on promoters with two or more tandem STAT binding sites.  It also plays a role in promoting interactions with transcriptional co-activators such as CREB binding protein (CBP)/p300, as well as being important for nuclear import and deactivation of STATs involving tyrosine de-phosphorylation. CCD interacts with other proteins, such as IFN regulatory protein 9 (IRF-9/p48) with STAT1 and c-JUN with STAT3 and is also thought to participate in the negative regulation of these proteins. Distinct genes are bound to STATs via their DBD domain. This domain is also involved in nuclear translocation of activated STAT1 and STAT3 phosphorylated dimers upon cytokine stimulation.  LD links the DNA-binding and SH2  domains and is important for the transcriptional activation of STAT1 in response to IFN-gamma. It also plays a role in protein-protein interactions and has also been implicated in the constitutive nucleocytoplasmic shuttling of unphosphorylated STATs in resting cells.  The SH2 domain is necessary for receptor association and tyrosine phosphodimer formation. Residues within this domain may be particularly important for some cellular functions mediated by the STATs as well as residues adjacent to this domain.  The TAD interacts with several proteins, namely minichromosome maintenance complex component 5 (MCM5), breast cancer 1 (BRCA1) and CBP/p300. TAD also contains a modulatory phosphorylation site that regulates STAT activity and is necessary for maximal transcription of a number of target genes. The conserved tyrosine residue present in the C-terminus is crucial for dimerization via interaction with the SH2 domain upon the interaction of the ligand with the receptor. 
STAT activation by tyrosine phosphorylation also determines nuclear import and retention, DNA binding to specific DNA elements in the promoters of responsive genes, and transcriptional activation of STAT dimers. In addition to the SH2 domain there is a coiled-coil domain, a DNA binding domain, and a transactivation domain in the STAT proteins. In general SH2 domains are involved in signal transduction.  They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	145
198284	cd10421	SH2_STAT5a	Src homology 2 (SH2) domain found in signal transducer and activator of transcription (STAT) 5a proteins. STAT5 is a member of the STAT family of transcription factors.  Two highly related proteins, STAT5a and STAT5b are encoded by separate genes, but are 90% identical at the amino acid level.  Both STAT5a and STAT5b are ubiquitously expressed and functionally interchangeable. Mice lacking either STAT5a or STAT5b have mild defects in prolactin dependent mammary differentiation or sexually dimorphic growth hormone-dependent effects, respectively. Mice lacking both STAT5a and STAT5b exhibit a perinatal lethal phenotype and have multiple defects, including anemia and a virtual absence of B and T lymphocytes. STAT proteins mediate the signaling of cytokines and a number of growth factors from the receptors of these extracellular signaling molecules to the cell nucleus. STATs are specifically phosphorylated by receptor-associated Janus kinases, receptor tyrosine kinases, or cytoplasmic tyrosine kinases. The phosphorylated STAT molecules dimerize by reciprocal binding of their SH2 domains to the phosphotyrosine residues. These dimeric STATs translocate into the nucleus, bind to specific DNA sequences, and regulate the transcription of their target genes.  However there are a number of unphosphorylated STATs that travel between the cytoplasm and nucleus and some STATs that exist as dimers in unstimulated cells that can exert biological functions independent of being activated. There are seven mammalian STAT family members which have been identified: STAT1, STAT2, STAT3, STAT4, STAT5 (STAT5A and STAT5B), and STAT6. There are 6 conserved domains in STAT: N-terminal domain (NTD), coiled-coil domain (CCD), DNA-binding domain (DBD), alpha-helical linker domain (LD), SH2 domain, and transactivation domain (TAD). 
NTD is involved in dimerization of unphosphorylated STATs monomers and for the tetramerization between STAT1, STAT3, STAT4 and STAT5 on promoters with two or more tandem STAT binding sites.  It also plays a role in promoting interactions with transcriptional co-activators such as CREB binding protein (CBP)/p300, as well as being important for nuclear import and deactivation of STATs involving tyrosine de-phosphorylation. CCD interacts with other proteins, such as IFN regulatory protein 9 (IRF-9/p48) with STAT1 and c-JUN with STAT3 and is also thought to participate in the negative regulation of these proteins. Distinct genes are bound to STATs via their DBD domain. This domain is also involved in nuclear translocation of activated STAT1 and STAT3 phosphorylated dimers upon cytokine stimulation. LD links the DNA-binding and SH2 domains and is important for the transcriptional activation of STAT1 in response to IFN-gamma. It also plays a role in protein-protein interactions and has also been implicated in the constitutive nucleocytoplasmic shuttling of unphosphorylated STATs in resting cells.  The SH2 domain is necessary for receptor association and tyrosine phosphodimer formation. Residues within this domain may be particularly important for some cellular functions mediated by the STATs as well as residues adjacent to this domain.  The TAD interacts with several proteins, namely minichromosome maintenance complex component 5 (MCM5), breast cancer 1 (BRCA1) and CBP/p300. TAD also contains a modulatory phosphorylation site that regulates STAT activity and is necessary for maximal transcription of a number of target genes. The conserved tyrosine residue present in the C-terminus is crucial for dimerization via interaction with the SH2 domain upon the interaction of the ligand with the receptor. 
STAT activation by tyrosine phosphorylation also determines nuclear import and retention, DNA binding to specific DNA elements in the promoters of responsive genes, and transcriptional activation of STAT dimers. In addition to the SH2 domain there is a coiled-coil domain, a DNA binding domain, and a transactivation domain in the STAT proteins. In general SH2 domains are involved in signal transduction.  They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	140
199217	cd10422	RNase_Ire1	RNase domain (also known as the kinase extension nuclease domain) of Ire1. The model represents the C-terminal endoribonuclease domain of the multi-functional protein Ire1; Ire1 in addition contains a type I transmembrane serine/threonine protein kinase (STK) domain, and a Luminal dimerization domain. Ire1 is essential for the endoplasmic reticulum (ER) unfolded protein response (UPR), which acts as an ER stress sensor and is the oldest and most conserved component of the UPR in eukaryotes. During ER stress, IRE1 dimerizes through its N-terminal luminal domain and forms oligomers, promoting trans-autophosphorylation by its cytosolic kinase domain. This leads to a conformational change that stimulates its endoribonuclease (RNase) activity and results in the cleavage of its mRNA substrate, Hac1 in yeast and Xbp1 in metazoans, thus promoting a splicing event that enables translation into a transcription factor which activates the UPR. This RNase domain is homologous to the RNase domain of RNase L, and possesses a novel fold for a nuclease and appears to be rigid irrespective of the activation state of IRE1. Structural analysis and mutational studies have revealed that an early stage 'phosphoryl-transfer' competent conformation of IRE1 favors face-to-face dimerization of the kinase domains which precedes and is distinct from the RNase 'active' back-to-back conformation. Furthermore, in yeast IRE1, the flavonol quercetin activates the RNase and potentiates activation of the protein kinase by ADP, hinting at the possible existence of endogenous cytoplasmic ligands that may function along with stress signals from ER lumen in order to modulate IRE1 activity, thus identifying IRE1 as a target for development of ATP-competitive inhibitors to modulate the UPR with specific relevance for multiple myeloma.	129
199218	cd10423	RNase_RNase-L	RNase domain (also known as the kinase extension nuclease domain) of RNase L. Ribonuclease L (RNase L), sometimes referred to as the 2-5A-dependent RNase, is a highly regulated, latent endoribonuclease (thus the 'L' in RNase L) and is widely expressed in most mammalian tissues. It is involved in the mediation of the antiviral and pro-apoptotic activities of the interferon-inducible 2-5A system, which blocks infections by certain types of viruses through cleavage of viral and cellular single-stranded RNA. RNase L is unique in that it is composed of three major domains; N-terminus regulatory ankyrin repeat domain (ARD), followed by a linker, a protein kinase (PK)-like domain and a C-terminal ribonuclease (RNase) domain. The RNase domain has homology with IRE1, also containing both a kinase and an endoribonuclease, that functions in the unfolded protein response (UPR). RNase L has been shown to have an impact on the pathogenesis of prostate cancer; the RNase L gene, RNASEL, has been identified as a strong candidate for the hereditary prostate cancer 1 (HPC1) allele. The broad range of biological functions of RNase offers a possibility for RNase L as a therapeutic target.	119
198344	cd10424	GST_C_9	C-terminal, alpha helical domain of an unknown subfamily 9 of Glutathione S-transferases. Glutathione S-transferase (GST) C-terminal domain family, unknown subfamily 9; composed of uncharacterized proteins with similarity to GSTs. GSTs are cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain.	103
259896	cd10425	Ephrin-A_Ectodomain	Ectodomain of Ephrin A. Ephrins and their receptors EphR play an important role in cell communication in normal physiology, as well as in disease pathogenesis. Binding of the ephrin (Eph) ligand to EphR requires cell-cell contact, since both molecules are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling, depending on Eph kinase activity) and ephrin-expressing cells (reverse signaling). Eph signaling controls cell morphology, adhesion, migration and invasion. Ephrins can be subdivided into 2 groups, A and B, depending on their respective receptors EphA or EphB. The nine human EphA receptors bind to five GPI-linked ephrin-A ligands. Interactions are promiscuous within each class, and some Eph receptors can also bind to ephrins of the other class. All ephrin As contain a highly conserved receptor binding ectodomain described by this model. Although ephrin As do not have a cytoplasmic tail (in contrast to ephrin Bs), they are still capable of downstream activation of Src family kinases and phosphoinositide-3-kinases, most likely involving coreceptors such as neurotrophin receptors.	130
259897	cd10426	Ephrin-B_Ectodomain	Ectodomain of Ephrin B. Ephrin Bs have several conserved tyrosine phosphorylation sites in their cytoplasmic PDZ-like domain, which are important for signal transduction. Ephrins and their receptors EphR play an important role in cell communication in normal physiology, as well as in disease pathogenesis. Binding of the ephrin (Eph) ligand to EphR requires cell-cell contact, since both molecules are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling, depending on Eph kinase activity) and ephrin-expressing cells (reverse signaling). Eph signaling controls cell morphology, adhesion, migration and invasion. Ephrins can be subdivided into 2 groups, A and B, depending on their respective receptors EphA or EphB. The nine human EphA receptors bind to five GPI-linked ephrin-A ligands and the five EphB receptors bind to three transmembrane ephrin-B ligands. Interactions are promiscuous within each class, and some Eph receptors can also bind to ephrins of the other class. All ephrin Bs contain a highly conserved receptor binding ectodomain described in this model.	137
198378	cd10427	FGGY_GK_1	Uncharacterized subgroup; belongs to the glycerol kinases subfamily of the FGGY family of carbohydrate kinases. This subgroup contains uncharacterized bacterial proteins belonging to the glycerol kinase subfamily of the FGGY family of carbohydrate kinases. The glycerol kinase subfamily includes glycerol kinases (GK; EC 2.7.1.30), and glycerol kinase-like proteins from all three kingdoms of living organisms. Glycerol is an important intermediate of energy metabolism and it plays fundamental roles in several vital physiological processes. GKs are involved in the entry of external glycerol into cellular metabolism. They catalyze the rate-limiting step in glycerol metabolism by transferring a phosphate from ATP to glycerol thus producing glycerol 3-phosphate (G3P) in the cytoplasm. Under different conditions, GKs from different species may exist in different oligomeric states. The monomer of GKs is composed of two large domains separated by a deep cleft that forms the active site. This model includes both the N-terminal domain, which adopts a ribonuclease H-like fold, and the structurally related C-terminal domain.	487
198410	cd10428	LFG_like	Proteins similar to and including lifeguard (LFG), a putative regulator of apoptosis. Lifeguard (LFG) inhibits Fas-mediated apoptosis and interacts with the death receptor FasR/CD95/Apo1. LFG has been shown to interact with Bax and is supposed to be integral to cellular membranes such as the ER. A close homolog, PP1201 or RECS1, appears located in the Golgi compartment and also interacts with the Fas receptor CD95/Apo1. PP1201 is expressed in response to shear stress.	217
198411	cd10429	GAAP_like	Golgi antiapoptotic protein. GAAP (or transmembrane BAX inhibitor motif containing 4) is a regulator of apoptosis that is related to the BAX inhibitor (BI)-1 like family of small transmembrane proteins, which have been shown to have an antiapoptotic effect either by stimulating the antiapoptotic function of Bcl-2, a well-characterized oncogene, or by inhibiting the proapoptotic effect of Bax, another member of the Bcl-2 family. Human GAAP has been linked to the modulation of intracellular fluxes of Ca(2+), by suppressing influx from the extracellular medium and reducing release from intracellular stores. A viral homolog (vaccinia virus vGAAP) acts similar to its human counterpart in inhibiting apoptosis.	233
198412	cd10430	BI-1	BAX inhibitor (BI)-1. Mammalian members of the BAX inhibitor (BI)-1 like family of small transmembrane proteins have been shown to have an antiapoptotic effect either by stimulating the antiapoptotic function of Bcl-2, a well-characterized oncogene, or by inhibiting the proapoptotic effect of Bax, another member of the Bcl-2 family. Their broad tissue distribution and high degree of conservation suggests an important regulatory role. In plants, BI-1 like proteins play a role in pathogen resistance.	213
198413	cd10431	GHITM	Growth-hormone inducible transmembrane protein. GHITM appears to be ubiquitously expressed in mammalian cells and expression has also been observed in various cancer cell lines. A cytoprotective function has been suggested. It is closely related to the BAX inhibitor (BI)-1 like family of small transmembrane proteins, which have been shown to have an antiapoptotic effect.	264
198414	cd10432	BI-1-like_bacterial	Bacterial BAX inhibitor (BI)-1/YccA-like proteins. This family is comprised of bacterial relatives of the mammalian members of the BAX inhibitor (BI)-1 like family of small transmembrane proteins, which have been shown to have an antiapoptotic effect either by stimulating the antiapoptotic function of Bcl-2, a well-characterized oncogene, or by inhibiting the proapoptotic effect of Bax, another member of the Bcl-2 family. In plants, BI-1 like proteins play a role in pathogen resistance. A characterized prokaryotic member, Escherichia coli YccA, has been shown to interact with ATP-dependent protease FtsH, which degrades abnormal membrane proteins as part of a quality control mechanism to keep the integrity of biological membranes.	211
198415	cd10433	YccA_like	YccA-like proteins. A prokaryotic member of the BAX inhibitor (BI)-1 like family of small transmembrane proteins, Escherichia coli YccA, has been shown to interact with ATP-dependent protease FtsH, which degrades abnormal membrane proteins as part of a quality control mechanism to keep the integrity of biological membranes.	205
198381	cd10434	GIY-YIG_UvrC_Cho	Catalytic GIY-YIG domain of nucleotide excision repair endonucleases UvrC, Cho, and similar proteins. UvrC is essential for nucleotide excision repair (NER). The N-terminal catalytic GIY-YIG domain of UvrC (also known as Uri domain) is responsible for the 3' incision reaction and the C-terminal half of UvrC, consisting of an UvrB-binding domain (UvrBb), EndoV-like nuclease domain and a helix-hairpin-helix (HhH) DNA-binding domain, contains the residues involved in 5' incision. The N- and C-terminal regions are joined by a common Cys-rich domain containing four conserved Cys residues. Besides UvrC, protein Cho (UvrC homolog) serves as a second endonuclease  in E. coli NER. Cho contains GIY-YIG motif followed by a Cys-rich region and shares sequence homology with the N-terminal half of UvrC. It is capable of incising the DNA at the 3' side of a lesion in the presence of the UvrA and UvrB proteins during NER. The C-terminal half of Cho is a unique uncharacterized domain, which is distinct from that of UvrC. Moreover, unlike UvrC, Cho does not require the UvrC-binding domain of UvrB for the 3' incision reaction, which might cause the shift in incision position and the difference in incision efficiencies between Cho and UvrC on different damaged substrates. Due to this, the range of NER in E. coli can be broadened by combining action of Cho and UvrC. This family also includes many uncharacterized epsilon proofreading subunits of DNA polymerase III, which have an additional N-terminal ExoIII domain and  a 3'-5' exonuclease domain homolog, fused to an UvrC-like region or a Cho-like region. The UvrC-like region includes a GIY-YIG motif, followed by a Cys-rich region, and an UvrB-binding domain (UvrBb), but lacks the EndoV-like nuclease domain and the helix-hairpin-helix (HhH) DNA-binding domain. 
The Cho-like region consists of a GIY-YIG motif, followed by the Cys-rich region, and the unique uncharacterized domain present in the C-terminal half of Cho. Some family members may not carry the Cys-rich region. This family also includes a specific Cho-like protein from G. violaceus, which possesses only UvrBb domain at the C-terminus, but lacks the additional N-terminal ExoIII domain. The other two remote homologs of UvrC, Bacillus-I and -II, are included in this family as well. Both of them contain a GIY-YIG domain, but no Cys-rich region. Moreover, the whole C-terminal region of Bacillus-I is replaced by an unknown domain, and Bacillus-II possesses another unknown N-terminal extension.	81
198382	cd10435	GIY-YIG_RE_Eco29kI_like	Catalytic GIY-YIG domain of type II restriction endonucleases R.Eco29kI, R.Cfr42I, and similar proteins. This family corresponds to the catalytic GIY-YIG domain of a group of GGCGCC-specific type II restriction endonucleases R.Eco29kI, R.Cfr42I, and similar proteins. R.Eco29kI is encoded on plasmid pECO29 in the E. coli strain 29K. This enzyme recognizes the palindromic 5'-CCGC/GG-3' target and cuts between Cyt4 and Gua5 on each strand of the restriction site to generate 3'-staggered ends. R.Eco29kI forms a domain-swapped homodimeric catalytically active complex during DNA binding and cleavage. Each subunit contains one GIY-YIG catalytic motif. Restriction endonuclease R.Cfr42I is an isoschizomer of R.Eco29kI. Unlike R.Eco29kI, R.Cfr42I is functional as a homotetramer, binding and cleaving two cognate DNA molecules in a cooperative manner. Members in this family are single-domain proteins sharing sequence similarities with the catalytic domain of GIY-YIG endonucleases, such as homing endonuclease I-TevI. However, they utilize loop insertions and terminal extensions instead of the separate DNA-binding domain to interact with the target site 5'-CCGC/GG-3'. A divalent metal-ion cofactor is required for their catalysis, but not for substrate binding. This family also includes a hypothetical protein from Deinococcus radiodurans that corresponds to MraI, a type II restriction enzyme similar to the GIY-YIG family of homing endonucleases. MraI is shown to be an isoschizomer of Eco29kI and Cfr42I, recognizing the palindromic nucleotide sequence 5'-CCGC/GG-3'. The enzyme shows an absolute requirement of Mg2+, but is active in the absence of added 2-mercaptoethanol. MraI represents the first restriction enzyme from a bacterium whose DNA lacks modified methylated bases.	117
198383	cd10436	GIY-YIG_EndoII_Hpy188I_like	Catalytic GIY-YIG domain of coliphage T4 non-specific endonuclease II, type II restriction endonuclease R.Hpy188I, and similar proteins. This family includes two different GIY-YIG enzymes, coliphage T4 non-specific endonuclease II (EndoII), and type II restriction endonuclease R.Hpy188I. They display high sequence similarity to each other, and both of them contain an extra N-terminal hairpin that lacks counterparts in other GIY-YIG enzymes. EndoII encoded by gene denA catalyzes the initial step in degradation of host DNA, which permits scavenging of host-derived nucleotides for phage DNA synthesis. R.Hpy188I recognizes the unique sequence, 5'-TCNGA-3', and cleaves the DNA between nucleotides N and G in its recognition sequence to generate a single nucleotide 3'-overhang. EndoII binds to two DNA substrates as an X-shaped tetrameric structure composed as a dimer of dimers. In contrast, two subunits of R.Hpy188I form a dimer to embrace one bound DNA. Divalent metal-ion cofactors are required for their catalytic events, but not for the substrates binding.	97
198384	cd10437	GIY-YIG_HE_I-TevI_like	N-terminal catalytic domain of GIY-YIG intron endonuclease I-TevI, I-BmoI, I-BanI, I-BthII and similar proteins. I-TevI is a site-specific GIY-YIG homing endonuclease encoded within the group I intron of the thymidylate synthase gene (td) from Escherichia coli phage T4. It functions as an endonuclease that catalyzes the first step in intron homing by generating a double-strand break in the intronless td allele within a sequence designated the homing site. I-TevI recognizes its extensive 37 base pair DNA target in a site-specific, but sequence-tolerant manner. The cleavage site is located at 23 (upper strand) and 25 (lower strand) nucleotides upstream of the intron insertion site. A divalent cation, such as Mg2+, is required for the catalysis. I-TevI also acts as a repressor of its own transcription. It binds an operator that is located upstream of the I-TevI coding sequence and overlaps the T4 late promoter, which drives I-TevI expression from within the td intron. I-TevI binds the homing sites and the operator with the same affinity, but cleaves the homing site more efficiently than the operator. I-TevI consists of an N-terminal catalytic domain, containing the GIY-YIG motif, and a C-terminal DNA-binding domain that binds DNA as a monomer, joined by a flexible linker. The C-terminal domain includes three subdomains: a zinc finger, a minor-groove binding alpha-helix (NUMOD3, nuclease-associated modular domain 3), and a helix-turn-helix domain (HTH). The last two are responsible for DNA-binding. The zinc finger is part of the linker and not required for DNA-binding. It is implicated as a distance sensor to constrain the catalytic domain to cleave the homing site at a fixed position. None of other GIY-YIG endonucleases have been found to have the zinc finger motif. 
This family also includes a reduced activity isoschizomer of I-TevI, I-BmoI, which is encoded within the group I intron of the thymidylate synthase (TS) gene (thyA) from Bacillus mojavensis. I-BmoI catalyzes the first step in intron homing by generating a double-strand break in the intronless td allele within a sequence designated the homing site in the presence of a divalent cation cofactor, such as Mg2+. In the absence of Mg2+, I-BmoI only nicks one of the strands. Both I-BmoI and I-TevI bind a homologous stretch of TS-encoding DNA as monomers, but use different strategies to distinguish intronless from intron-containing substrates. I-TevI recognizes substrates at the level of DNA-binding. However, I-BmoI binds both intron-containing and intronless TS-encoding substrates, but efficiently cleaves only intronless substrate. Afterwards they cleave their respective intronless substrates in the same positions, and both require a critical G-C base pair adjacent to the top strand site for efficient cleavage. The C-terminal domain of I-BmoI has nuclease-associated modular DNA-binding domains (NUMODs), but lacks the zinc finger, which is different from that of I-TevI. Although the zinc finger implicated as a distance determination in I-TevI is absent, I-BmoI still possesses some cleavage distance discrimination. Besides I-TevI and I-BmoI, this family contains a putative GIY-YIG homing endonuclease, I-BanI, encoded within the self-splicing group I intron of nrdE gene from Bacillus anthracis. It contains two major domains, the N-terminal GIY-YIG domain and the C-terminal DNA-binding domain that consists of a minor-groove DNA binding alpha-helix motif and a helix-turn-helix (HTH) motif. I-BanI generates a double-strand break (DSB) in the intronless nrdE gene. The cleavage site is located at 5 and 7 nucleotides upstream of the intron insertion site, with 2-nucleotide 3' extensions. 
The recognition site is 35 to 40 base pairs and covers the cleavage site with a bias toward the downstream region including the (intervening sequence) IVS insertion site. Moreover, this family contains another putative GIY-YIG homing endonuclease, I-BthII, encoded within the self-splicing group I intron of nrdF gene from Bacillus thuringiensis ssp. pakistani. It contains a GIY-YIG motif that generates a double-strand break (DSB) in the intronless nrdF gene. The cleavage site is located at 7 and 9 nucleotides upstream of the intron insertion site, leaving 2-nucleotide 3' extensions. The recognition site is 27 to 29 base pairs with the DSB cleavage site at the 5'-end of the top strand, and with the intervening sequence (IVS) insertion site approximately in the middle of the recognition site.	90
198385	cd10438	GIY-YIG_MSH	Catalytic GIY-YIG domain of eukaryotic DNA mismatch repair protein MutS homologs. This family represents a putative GIY-YIG nuclease domain C-terminally fused to the DNA-repair ATPase on a small group of eukaryotic DNA mismatch repair protein mutS homologs (MSH). The MSH proteins in this family do not have the zinc finger domain, but have a predicted mitochondrial localization. They might play roles in the recognition and repair of errors made during the replication of DNA. The prototype of this family is the protein encoded by the chloroplast mutator (CHM) locus from Arabidopsis thaliana. It is suggested that this protein could be involved in the maintenance of mitochondrial genome stability.	72
198386	cd10439	GIY-YIG_COG3410	GIY-YIG domain of uncharacterized bacterial protein structurally related to COG3410. This family contains a group of uncharacterized bacterial proteins. Although their functional roles have not been recognized, these proteins contain a putative GIY-YIG domain in their N-terminus. Moreover, a conserved domain COG3410 with unknown function has been found in the C-terminus of most family members.	80
198387	cd10440	GIY-YIG_COG3680	GIY-YIG domain of uncharacterized proteins from bacteria and their eukaryotic homologs. This family includes a group of functionally uncharacterized proteins from bacteria and their eukaryotic homologs which are present only in metazoa. These proteins might have nuclease activities and possibly be engaged in DNA repair or recombination, since they share sequence homology with the catalytic GIY-YIG domain of bacterial UvrC DNA repair proteins. Distinct from their prokaryotic relatives, the eukaryotic homologs contain an N-terminal extension that includes the region of approximately 3-4 ankyrin repeats, unique motifs mediating protein-protein interactions. Some of eukaryotic homologs do have an additional LEM domain located between ankyrin repeats region and GIY-YIG domain. The LEM domain, found in inner nuclear membrane proteins, may be involved in protein- or DNA-binding. The different domain composition of the eukaryotic homologs suggests that  they might participate in interactions with multiple partners and implies  important cellular function.	94
198388	cd10441	GIY-YIG_COG1833	GIY-YIG domain of hypothetical proteins from archaea and their bacterial homologs. This family includes a group of functionally uncharacterized hypothetical proteins from archaea and their bacterial homologs. These proteins contain a putative GIY-YIG domain that shows sequence homology with bacterial UvrC DNA repair proteins. Meanwhile, all of them share a C-terminal extension with semi-conserved Cys and His residues, which suggests that the extended region may be a zinc-binding nucleic acid interaction domain. Although the majority of family members have a standalone GIY-YIG domain composition, some of them do have additional endonuclease III domain or sugar fermentation stimulation protein domain, both of which are N-terminally fused to the GIY-YIG domain. As a result, those proteins could perform some other role by cooperating with different domains, which remains to be determined in the future.	112
198389	cd10442	GIY-YIG_PLEs	Catalytic GIY-YIG endonuclease domain of penelope-like elements and similar proteins. This model corresponds to the EN domain of PLEs that contains catalytic module of the GIY-YIG endonucleases of group I bacterial/organellar introns, as well as bacterial UvrC DNA repair proteins. It can cleave DNA with low nucleotide sequence specificity. However, the PLEs EN domain is distinct from other GIY-YIG endonucleases by the presence of a well-conserved CCHH motif (CX(2-7)CX(33-39)HX(3-5)H, X can be any residue). The role of the CCHH motif has not yet been identified. Penelope-like elements (PLEs) represent a novel class of eukaryotic retroelements, which do not belong to either long terminal repeat (LTR) retrotransposons or non-LTR retrotransposons (often called LINEs), but instead form a sister clade to telomerase reverse transcriptases (TERTs), highly specialized non-mobile reverse transcriptases (RTs) which are responsible for the addition of telomeric repeats to the ends of eukaryotic chromosomes. The single open reading frame (ORF) encoded by PLE consists of two principal domains, RT domain and endonuclease (EN) domain, jointed by a linker region of variable length. Both of these two domains are functionally active.	92
198390	cd10443	GIY-YIG_HE_Tlr8p_PBC-V_like	GIY-YIG domain of uncharacterized hypothetical protein found in phycodnavirus PBCV-1 DNA virus, T. thermophila Tlr element encoding protein Tlr8p, and similar proteins found in bacteria. The family includes a group of diverse uncharacterized hypothetical proteins with a GIY-YIG domain that shows statistically significant similarity to the N-terminal catalytic domains of GIY-YIG family of intron-encoded homing endonuclease I-TevI. Similar to I-TevI, family members from phycodnavirus PBCV-1 DNA virus have nuclease-associated modular DNA-binding domains (NUMODs) and a helix-turn-helix (HTH) domain C-terminally fused to the GIY-YIG domain, which suggests that these PBCV-1 acquired the I-TevI-like homing endonucleases from phages by horizontal gene transfer. This family also includes proteins that appear to connect homing endonucleases with Penelope elements, such as Tetrahymena thermophila Tlr element encoding protein Tlr8p that possess additional N-terminal and central structural regions, followed by a putative superfamily 1 helicase domain and I-TevI-like GIY-YIG domain, but lacks the NUMOD domains and HTH domain. It is suggested that the Tlr8p element could have acquired its GIY-YIG domain within the nucleus of the ciliate cell infected by the Phycodnavirus. Some family members only contain a standalone GIY-YIG domain and their biological functions are unclear.	90
198391	cd10444	GIY-YIG_SegABCDEFG	N-terminal catalytic GIY-YIG domain of bacteriophage T4 segABCDEFG gene encoding proteins. The prototypes of Seg family are proteins SegA, B, C, D, E, F, and G encoded by seven seg genes segA, B, C, D, E, F, and G in the bacteriophage T4 genome, respectively. SegA, B, C, D, E, F, and G are not encoded by introns, but free-standing homologs of the GIY-YIG family of endonucleases encoded by group I introns, which are thought to initiate the homing of their own intron by cleaving the intronless DNA at or near the site of insertion. Both phage T4 intron-encoded and free-standing GIY-YIG endonucleases contribute to the exclusion of T2 markers from the progeny of mixed infections. SegA, encoded by the bacteriophage T4 segA gene, is a double-strand DNA endonuclease with a hierarchy of site specificity. The cleavage site of SegA is located in the uvsX gene of T4. Its cleaving activity requires the presence of Mg2+ and can be stimulated by the presence of ATP or ATPgammaS. Bacteriophage T4 segB gene encoding protein SegB is a site-specific endonuclease that recognizes a 27-bp sequence, cleaves DNA by introducing double-strand breaks in the adjacent gene 56 of T2 during mixed infection in the presence of Mg2+, Mn2+, or Ca2+ cations, and produces mostly 3' 2-nt protruding ends at its DNA cleavage site. It functions as a homing endonuclease to ensure spreading of its own gene and the surrounding tRNA genes among T4-related phages. Bacteriophage T4 segE gene encoding SegE is a site-specific endonuclease that preferentially cleaves DNA in a site located at the 5' end of the uvsW gene in the RB30 genome. It is responsible for a non-reciprocal genetic exchange between T-even-related phages. Bacteriophage T4 gene 69 encoding SegF is a site-specific double-strand DNA endonuclease that promotes marker exclusion. 
It preferentially introduces a double-strand break in the adjacent T2 gene 56 over T4 gene 56 both in vitro and in vivo during mixed infection, which results in the replacement of T2 gene 56 by T4 gene 56 in a process similar to group I intron homing. The cleavage site is located 210- and 212-bp upstream from its insertion site. Bacteriophage T4 segG gene (formerly gene 32.1) encoding SegG (also known as F-TevIV) is a double-strand DNA endonuclease adjacent to gene 32 of phage T4 that promotes marker exclusion. Although it is absent from phage T2, SegG preferentially introduces a double-strand break in T2 gene 32 during mixed infection, which results in replacement of T2 genetic markers by the corresponding T4 markers. The cleavage site is located 332- and 334-bp from its insertion site.	85
198392	cd10445	GIY-YIG_bI1_like	Catalytic GIY-YIG domain of putative intron-encoded endonuclease bI1 and similar proteins. The prototype of this family is a putative intron-encoded mitochondrial DNA endonuclease bI1 found in mitochondrion Ustilago maydis. This protein may arise from proteolytic cleavage of an in-frame translation of COB exon 1 plus intron 1, containing the bI1 open reading frame. It contains an N-terminal truncated non-functional cytochrome b region and a C-terminal intron-encoded endonuclease bI1 region. The bI1 region shows high sequence similarity to endonucleases of group I introns of fungi and phage and might be involved in intron homing. Many uncharacterized bI1 homologs existing in fungi and chlorophyta in this family do not contain the cytochrome b region, but have a standalone bI1-like region, which contains a GIY-YIG domain and a minor-groove binding alpha-helix nuclease-associated modular domain (NUMOD). This family also includes a Yarrowia lipolytica mobile group-II intron COX1-i1, also called intron alpha, encoding protein with reverse transcriptase activity. The group-II intron COX1-i1 may be involved both in the generation of the circular multimeric DNA molecules (senDNA alpha) which amplify during the senescence syndrome and in the generation of the site-specific deletion which accumulates in the premature-death syndrome.	88
198393	cd10446	GIY-YIG_unchar_1	GIY-YIG domain of uncharacterized hypothetical protein found in bacteria. The family includes a group of uncharacterized bacterial hypothetical proteins with a GIY-YIG domain that shows statistically significant similarity to the N-terminal catalytic domains of GIY-YIG family of intron-encoded homing endonuclease  I-TevI and catalytic GIY-YIG domain of nucleotide excision repair endonuclease UvrC.	103
198394	cd10447	GIY-YIG_unchar_2	GIY-YIG domain of uncharacterized hypothetical protein found in bacteria and archaea. The family includes a group of uncharacterized hypothetical proteins, mainly found in bacteria and a few found in archaea, with a GIY-YIG domain that shows statistically significant similarity to the N-terminal catalytic domains of GIY-YIG family of intron-encoded homing endonuclease I-TevI and catalytic GIY-YIG domain of nucleotide excision repair endonuclease UvrC.	80
198395	cd10448	GIY-YIG_unchar_3	GIY-YIG domain of uncharacterized hypothetical protein found in bacteria. The family includes a group of uncharacterized bacterial proteins with a GIY-YIG domain that shows statistically significant similarity to the N-terminal catalytic domains of GIY-YIG family of intron-encoded homing endonuclease  I-TevI and catalytic GIY-YIG domain of nucleotide excision repair endonuclease UvrC.	87
198396	cd10449	GIY-YIG_SLX1_like	Catalytic GIY-YIG domain of yeast structure-specific endonuclease subunit SLX1 and its homologs. Structure-specific endonuclease subunit SLX1 is a highly conserved protein from yeast to human, with an N-terminal GIY-YIG endonuclease domain and a C-terminal PHD-type zinc finger postulated to mediate protein-protein or protein-DNA interaction. SLX1 forms active heterodimeric complexes with its SLX4 partner, which has additional roles in the DNA damage response that are distinct from the function of the heterodimeric SLX1-SLX4 nuclease. In yeast, the SLX1-SLX4 complex functions as a 5' flap endonuclease that maintains ribosomal DNA copy number, where SLX1 and SLX4 are shown to be catalytic and regulatory subunits, respectively. This endonuclease introduces single-strand cuts in duplex DNA on the 3' side of junctions with single-strand DNA. In addition to 5' flap endonuclease activity, human SLX1-SLX4 complex has been identified as a Holliday junction resolvase that promotes symmetrical cleavage of static and migrating Holliday junctions. SLX1 also associates with MUS81, EME1, C20orf94, PLK1, and ERCC1. Some eukaryotic SLX1 homologs lack the zinc finger domain, but possess intrinsically unstructured extensions of unknown function. These unstructured segments might be involved in interactions with other proteins.	67
198397	cd10450	GIY-YIG_AtGrxS16_like	GIY-YIG domain found in CAXIP1-like proteins, iron-sulfur cluster assembly proteins, and similar proteins. The family includes CAX-interacting protein-1 (CXIP1)-like proteins and iron-sulfur cluster assembly proteins, both of which contain a GIY-YIG domain that shows statistically significant similarity to the N-terminal catalytic domains of GIY-YIG family of intron-encoded homing endonuclease I-TevI and catalytic GIY-YIG domain of nucleotide excision repair endonuclease UvrC. CAXIP1 is a novel PICOT (protein kinase C-interacting cousin of thioredoxin) domain-containing Arabidopsis protein that activates H+/Ca2+ exchanger CAX1, and its homolog CAX4, but not CAX2 or CAX3. Iron-sulfur cluster assembly proteins in this family also contain a C-terminal NifU-like domain that corresponds to a common region between the NifU protein from nitrogen-fixing bacteria and rhodobacterial species. The biochemical function of NifU is unknown.	70
198398	cd10451	GIY-YIG_LuxR_like	GIY-YIG domain of LuxR and ArsR family transcriptional regulators, and uncharacterized hypothetical proteins found in bacteria. The family includes some bacterial LuxR and ArsR family transcriptional regulators. Their C-terminal conserved domain shows sequence similarity to the N-terminal catalytic GIY-YIG domains of intron-encoded homing endonucleases. Besides, they have an N-terminally fused transcriptional regulator module, comprising the winged helix-turn-helix (wHTH) domain and uncharacterized domain DUF2087. In this respect, they are distinct from GIY-YIG homing endonucleases, which typically contain a variety of C-terminally fused nuclease-associated modular DNA-binding domains (NUMODs). Moreover, some key residues relevant to catalysis in GIY-YIG endonucleases are mutated or absent in this family, which suggests that members in this family might lose the catalytic function that GIY-YIG endonucleases possess. This family also includes many uncharacterized hypothetical proteins that consist of a standalone GIY-YIG like domain.	101
198399	cd10452	GIY-YIG_RE_Eco29kI_NgoMIII	Catalytic GIY-YIG domain of type II restriction enzyme R.Eco29kI, R.NgoMIII, and similar proteins. This family corresponds to the catalytic GIY-YIG domain of GGCGCC-specific type II restriction endonucleases R.Eco29kI, NgoMIII, and similar proteins. R.Eco29kI is encoded on plasmid pECO29 in the E. coli strain 29K. This enzyme recognizes the palindromic 5'-CCGC/GG-3' target and cuts between Cyt4 and Gua5 on each strand of the restriction site to generate 3'-staggered ends. R.Eco29kI forms a domain-swapped homodimeric catalytically active complex during DNA binding and cleavage. Each subunit contains one GIY-YIG catalytic motif. Restriction endonucleases R.NgoMIII is an isoschizomer of R.Eco29kI. Members in this family are single-domain proteins sharing sequence similarities with the catalytic domain of GIY-YIG endonucleases, such as  homing endonuclease I-TevI. However, they utilize loop insertions and terminal extensions instead of the separate DNA-binding domain to interact with the target site 5'-CCGC/GG-3'. A divalent metal-ion cofactor is required for their catalysis, but not for their substrate binding.	204
198400	cd10453	GIY-YIG_RE_Cfr42I	Catalytic GIY-YIG domain of type II restriction enzyme R.Cfr42I and similar proteins. This family corresponds to the catalytic GIY-YIG domain of GGCGCC-specific type II restriction endonucleases R.Cfr42I and similar proteins. R.Cfr42I is encoded on plasmid pET21b(+) in the Citrobacter freundii RFL42 strain. This enzyme recognizes the palindromic 5'-CCGC/GG-3' target and cuts between Cyt4 and Gua5 on each strand of the restriction site to generate 3'-staggered ends. It is an isoschizomer of R.Eco29kI. Unlike R.Eco29kI, R.Cfr42I is functional as a homotetramer, binding and cleaving two cognate DNA molecules in a cooperative manner. Members in this family are single-domain proteins sharing sequence similarities with the catalytic domain of GIY-YIG endonucleases, such as  homing endonuclease I-TevI. However, they utilize loop insertions and terminal extensions instead of the separate DNA-binding domain to interact with the target site 5'-CCGC/GG-3'. A divalent metal-ion cofactor is required for their catalysis.	156
198401	cd10454	GIY-YIG_COG3680_Meta	GIY-YIG domain of hypothetical proteins from Metazoa. Members of this family are functionally uncharacterized hypothetical proteins from Metazoa. They have bacterial homologs that display sequence homology with the catalytic GIY-YIG domain of bacterial UvrC DNA repair proteins. However, unlike their bacterial relatives, these Metazoan proteins contain an N-terminal extension that includes the region of approximately 3-4 ankyrin repeats, unique motifs mediating protein-protein interactions. Some of them do have an additional LEM domain located between ankyrin repeats region and GIY-YIG domain. The LEM domain, found in inner nuclear membrane proteins, may be involved in protein- or DNA-binding. The different domains composition suggests members in this subfamily might participate in interactions with multiple partners and imply some important cellular functions.	114
198402	cd10455	GIY-YIG_SLX1	Catalytic GIY-YIG domain of yeast structure-specific endonuclease subunit SLX1 and its eukaryotic homologs. Structure-specific endonuclease subunit SLX1 is a highly conserved protein from yeast to human, with an N-terminal GIY-YIG endonuclease domain and a C-terminal PHD-type zinc finger postulated to mediate protein-protein or protein-DNA interaction. SLX1 forms active heterodimeric complexes with its SLX4 partner, which has additional roles in the DNA damage response that are distinct from the function of the heterodimeric SLX1-SLX4 nuclease. In yeast, the SLX1-SLX4 complex functions as a 5' flap endonuclease that maintains ribosomal DNA copy number, where SLX1 and SLX4 are shown to be catalytic and regulatory subunits, respectively. This endonuclease introduces single-strand cuts in duplex DNA on the 3' side of junctions with single-strand DNA. In addition to 5' flap endonuclease activity, human SLX1-SLX4 complex has been identified as a Holliday junction resolvase that promotes symmetrical cleavage of static and migrating Holliday junctions. SLX1 also associates with MUS81, EME1, C20orf94, PLK1, and ERCC1. Some eukaryotic SLX1 homologs lack the zinc finger domain, but possess intrinsically unstructured extensions of unknown function. These unstructured segments might be involved in interactions with other proteins.	76
198403	cd10456	GIY-YIG_UPF0213	The GIY-YIG domain of uncharacterized protein family UPF0213 related to structure-specific endonuclease SLX1. This family contains a group of uncharacterized proteins found mainly in bacteria and several in dsDNA viruses. Although their functional roles have not been recognized, these proteins show significant sequence similarities with the N-terminal GIY-YIG endonuclease domain of structure-specific endonuclease subunit SLX1, which binds another structure-specific endonuclease subunit SLX4 to form an active heterodimeric SLX1-SLX4 complex. This complex functions as a 5' flap endonuclease in yeast, and has also been identified as a Holliday junction resolvase in human.	68
198404	cd10457	GIY-YIG_AtGrxS16	GIY-YIG domain found in CAXIP1-like proteins. The family includes CAX-interacting protein-1 (CXIP1)-like proteins which contain a GIY-YIG domain that shows statistically significant similarity to the N-terminal catalytic domains of GIY-YIG family of intron-encoded homing endonuclease I-TevI and catalytic GIY-YIG domain of nucleotide excision repair endonuclease UvrC. CAXIP1 is a novel PICOT (protein kinase C-interacting cousin of thioredoxin) domain-containing Arabidopsis protein that activates H+/Ca2+ exchanger CAX1, and its homolog CAX4, but not CAX2 or CAX3.	74
198405	cd10458	GIY-YIG_NifU	GIY-YIG domain found in iron-sulfur cluster assembly proteins. This family includes a group of uncharacterized iron-sulfur cluster assembly proteins that transiently bind the iron-sulfur cluster before transfer to target apoproteins. These iron-sulfur cluster assembly proteins contain a GIY-YIG domain that shows statistically significant similarity to the N-terminal catalytic domains of GIY-YIG family of intron-encoded homing endonuclease I-TevI and catalytic GIY-YIG domain of nucleotide excision repair endonuclease UvrC. They also contain a C-terminal NifU-like domain that corresponds to a common region between the NifU protein from nitrogen-fixing bacteria and rhodobacterial species. The biochemical function of NifU is unknown.	76
198417	cd10459	PUB_PNGase	PNGase/UBA or UBX (PUB) domain of the P97 adaptor protein Peptide:N-glycanase (PNGase). This PUB (PNGase/UBA or UBX) domain is found in the p97 adaptor protein PNGase (Peptide:N-glycanase). The PUB domain functions as a p97 (also known as valosin-containing protein or VCP) adaptor by interacting with the D1 and/or D2 ATPase domains. The type II AAA+ ATPase p97 is involved in a variety of cellular processes such as the deglycosylation of ERAD substrates, membrane fusion, transcription factor activation and cell cycle regulation through differential binding to specific adaptor proteins.  Peptide:N-glycanase (PNGase), a deglycosylating enzyme that functions in proteasome-dependent degradation of misfolded glycoproteins which are translocated from the endoplasmic reticulum (ER) to the cytosol during ERAD, associates with the ubiquitin-proteasome system proteins mediated by the N-terminal PUB domain. PNGase is present in all eukaryotic organisms; however, the yeast PNGase ortholog does not contain the PUB domain. The mammalian PNGase binds a considerable number of proteins via its PUB domain; these include ERAD E3 enzyme, the autocrine motility factor receptor (AMFR or gp78), SAKS and Derlin-1.	93
198418	cd10460	PUB_UBXD1	PNGase/UBA or UBX (PUB) domain of UBXD1. This PUB  domain is found in p97 adaptor protein UBXD1 (UBX domain-containing protein 1, also called UBXD6). It functions as a p97 (also known as valosin-containing protein or VCP) adaptor by interacting with the D1 and/or D2 ATPase domains. The type II AAA+ ATPase p97 is involved in a variety of cellular processes such as the deglycosylation of ERAD substrates, membrane fusion, transcription factor activation and cell cycle regulation through differential binding to specific adaptor proteins. The PUB domain in UBX-domain protein 1 (UBXD1), which is widely expressed in higher eukaryotes, except for fungi, and which is involved in substrate recruitment to p97, interacts strongly with the C-terminus of p97. UBXD1 also interacts with HRD1 and HERP, both components of the ERAD pathway, via p97. It is possibly involved in aggresome formation; aggresomes are perinuclear compartments that contain misfolded proteins colocalized with centrosome markers.	102
198419	cd10461	PUB_UBA_plant	PNGase/UBA or UBX (PUB) domain of plant Ubiquitin-associated (UBA) domain containing proteins. The PUB domain functions as a p97 (also known as valosin-containing protein or VCP) adaptor by interacting with the D1 and/or D2 ATPase domains. The type II AAA+ ATPase p97 is involved in a variety of cellular processes such as the deglycosylation of ERAD substrates, membrane fusion, transcription factor activation and cell cycle regulation through differential binding to specific adaptor proteins. The UBA domain, along with UBL (ubiquitin-like) domain, has been implicated in proteasomal degradation by associating with substrates destined for degradation as well as with subunits of the proteasome, thus regulating protein turnover. This family contains only plant UBA domain-containing proteins.	107
198420	cd10462	PUB_UBA	PNGase/UBA or UBX (PUB) domain of Ubiquitin-associated (UBA) domain containing proteins. The PUB domain functions as a p97 (also known as valosin-containing protein or VCP) adaptor by interacting with the D1 and/or D2 ATPase domains. The type II AAA+ ATPase p97 is involved in a variety of cellular processes such as the deglycosylation of ERAD substrates, membrane fusion, transcription factor activation and cell cycle regulation through differential binding to specific adaptor proteins. The UBA domain, along with UBL (ubiquitin-like) domain, has been implicated in proteasomal degradation by associating with substrates destined for degradation as well as with subunits of the proteasome, thus regulating protein turnover.	100
198421	cd10463	PUB_WLM	PNGase/UBA or UBX (PUB) domain of the Wss1p-like metalloprotease (WLM) family. The PUB domain functions as a p97 (also known as valosin-containing protein or VCP) adaptor by interacting with the D1 and/or D2 ATPase domains. The type II AAA+ ATPase p97 is involved in a variety of cellular processes such as the deglycosylation of ERAD substrates, membrane fusion, transcription factor activation and cell cycle regulation through differential binding to specific adaptor proteins.  WLM domains are found mostly in plant proteins, belonging to the Zincin-like superfamily of Zn-dependent peptidases that are linked to the ubiquitin signaling pathway through its fusion with the ubiquitin-binding PUB, ubiquitin-like, and Little Finger domains. More specifically, genetic evidence implicates the WLM family in de-SUMOylation.	96
198422	cd10464	PUB_RNF31	PNGase/UBA or UBX (PUB) domain of the RNF31 (or HOIP) protein. This PUB domain is found in the p97 adaptor protein RNF31 (RING finger protein 31). The PUB domain functions as a p97 (also known as valosin-containing protein or VCP) adaptor by interacting with the D1 and/or D2 ATPase domains. The type II AAA+ ATPase p97 is involved in a variety of cellular processes such as the deglycosylation of ERAD substrates, membrane fusion, transcription factor activation and cell cycle regulation through differential binding to specific adaptor proteins. The RNF31 protein, also known as HOIP or Zibra, contains an N-terminal PUB domain similar to those in PNGase and UBXD1, suggesting its association with p97. RNF31 functions in a complex with another RING-finger protein (HOIL-IL), displaying E3 ubiquitin-protein ligase activity, and forming linear ubiquitin chain assembly complex (LUBAC) through linkages between the N- and C-termini of ubiquitin. LUBAC has been shown to activate the NF-kappaB pathway.	111
198456	cd10466	FimH_man-bind	Mannose binding  domain of FimH and related proteins. This family, restricted to gammaproteobacteria, includes FimH, a mannose-specific adhesin of uropathogenic Escherichia coli strains. The domain appears to bind specifically to D-mannose and mediates cellular adhesion to mannosylated proteins, a prerequisite to colonization and subsequent invasion of epithelial tissues.	160
198458	cd10467	FAM20_C_like	C-terminal putative kinase domain of FAM20 (family with sequence similarity 20), Drosophila Four-jointed (Fj), and related proteins. Drosophila Fj is a Golgi kinase that phosphorylates Ser or Thr residues within extracellular cadherin domains of a transmembrane receptor Fat and its ligand, Dachsous (Ds). The Fat signaling pathway regulates growth, gene expression, and planar cell polarity (PCP). Defects from mutation in the Drosophila fj gene include loss of the intermediate leg joint, and a PCP defect in the eye. Fjx1, the murine homologue of Fj, has been shown to be involved in both the Fat and Hippo signaling pathways; these two pathways intersect at multiple points. The Hippo pathway is important in organ size control and in cancer. FAM20B is a xylose kinase that may regulate the number of glycosaminoglycan chains by phosphorylating the xylose residue in the glycosaminoglycan-protein linkage region of proteoglycans. This domain has homology to a kinase-active site; mutation of three conserved Asp residues at the Drosophila Fj putative active site abolished its ability to phosphorylate Ft and Ds cadherin domains. FAM20A may participate in enamel development and gingival homeostasis, FAM20B in proteoglycan production, and FAM20C in bone development. FAM20C, also called Dentin Matrix Protein 4, is abundant in the dentin matrix, and may participate in the differentiation of mesenchymal precursor cells into functional odontoblast-like cells. Mutations in FAM20C are associated with lethal Osteosclerotic Bone Dysplasia (Raine Syndrome), and mutations in FAM20A with Amelogenesis imperfecta (AI) and Gingival Hyperplasia Syndrome. This model includes the FAM20_C domain family, previously known as DUF1193; FAM20_C appears to be homologous to the catalytic domain of the phosphoinositide 3-kinase (PI3K)-like family.	210
198459	cd10468	Four-jointed-like_C	C-terminal kinase domain of Drosophila Four-jointed (Fj), mouse Fjx1, and related proteins. Drosophila Fj is a Golgi type II transmembrane protein that is partially secreted, and is a kinase that phosphorylates Ser or Thr residues within extracellular cadherin domains of a transmembrane receptor Fat and its ligand, Dachsous (Ds). Mutation of three conserved Asp residues at the Drosophila Fj putative active site abolished its ability to phosphorylate Ft and Ds cadherin domains. The Fat signaling pathway regulates growth, gene expression, and planar cell polarity (PCP). Defects from mutation in Drosophila Fj include loss of the intermediate leg joint, and a PCP defect in the eye. The expression of the Drosophila fj gene is modulated by Notch, Unpaired (JAK/STAT), and Wingless signals. Mouse Fjx1 has been shown to be involved in both the Fat and Hippo signaling pathways; these two pathways intersect at multiple points. The Hippo pathway is important in organ size control and in cancer. The expression of the mouse fjx1 gene is also Notch dependent; fjx1 is expressed in the brain, the peripheral nervous system, in epithelial structures of different organs, and during limb development.	286
198460	cd10469	FAM20A_C	C-terminal putative kinase domain of FAM20A. Human FAM20A may play a fundamental role in enamel development and gingival homeostasis as mutations in FAM20A may underlie the pathogenesis of the autosomal recessive Amelogenesis imperfecta (AI) and Gingival Hyperplasia Syndrome. It is expressed in ameloblasts and gingivae. AI refers to a heterogeneous group of disorders of biomineralization caused by a lack of normal enamel formation. Mouse FAM20A is a secreted protein and the gene encoding it is differentially expressed in hematopoietic cells undergoing myeloid differentiation. This protein has also been associated with growth disorder in mice. The C-terminal domain of FAM20A is a putative kinase domain, based on mutagenesis of the C-terminal domain of Drosophila Four-Jointed, a related Golgi kinase. This subfamily belongs to the FAM20_C (also known as DUF1193) domain family.	217
198461	cd10470	FAM20B_C	C-terminal putative kinase domain of FAM20B xylose kinase. Experiments with human FAM20B suggest that it is a xylose kinase that participates in proteoglycan production. It may regulate the number of glycosaminoglycan chains by phosphorylating the xylose residue in the glycosaminoglycan-protein linkage region of proteoglycans. The C-terminal domain of FAM20B is a putative kinase domain, based on mutagenesis of the C-terminal domain of Drosophila Four-Jointed, a related Golgi kinase. This subfamily belongs to the FAM20_C (also known as DUF1193) domain family.	206
198462	cd10471	FAM20C_C	C-terminal putative kinase domain of FAM20C (also known as Dentin Matrix Protein 4, DMP4). Mouse DMP4 is abundant in the dentin matrix, and is expressed in high levels in odontoblasts. These latter cells synthesize various nucleators or inhibitors of mineralization. The in vivo role of DMP4 in dentinogenesis is unclear. However, gain- and loss-of-function experiments suggest that it participates in the differentiation of mesenchymal precursor cells into functional odontoblast-like cells. In addition to this domain, DMP4 contains a Greek key calcium-binding domain. Human FAM20C participates in bone development; mutations in FAM20C are associated with lethal Osteosclerotic Bone Dysplasia (Raine Syndrome), an autosomal recessive disorder in which affected individuals die within days or weeks of birth, usually due to thoracic malformation resulting in respiratory failure. The C-terminal domain of FAM20C is a putative kinase domain, based on mutagenesis of the C-terminal domain of Drosophila Four-Jointed, a related Golgi kinase. This subfamily belongs to the FAM20_C (also known as DUF1193) domain family.	212
198440	cd10472	EphR_LBD_B	Ligand Binding Domain of Ephrin type-B receptors. Ephrin receptors (EphRs) comprise the largest subfamily of receptor tyrosine kinases (RTKs). Class EphB receptors bind to transmembrane ephrin-B ligands. They play important roles in synapse formation and plasticity, spine morphogenesis, axon guidance, and angiogenesis. In the intestinal epithelium, EphB receptors are Wnt signaling target genes that control cell compartmentalization. They function as suppressors of colon cancer progression. There are six vertebrate EphB receptors (EphB1-6), which display promiscuous interactions with three ephrin-B ligands. One exception is EphB2, which also interacts with ephrin A5. EphRs contain a ligand binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyrosine kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling). Ephrin/EphR interaction mainly results in cell-cell repulsion or adhesion, making it important in neural development and plasticity, cell morphogenesis, cell-fate determination, embryonic development, tissue patterning, and angiogenesis.	176
198441	cd10473	EphR_LBD_A	Ligand Binding Domain of Ephrin type-A Receptors. Ephrin receptors (EphRs) comprise the largest subfamily of receptor tyrosine kinases (RTKs). Class EphA receptors bind GPI-anchored ephrin-A ligands. There are ten vertebrate EphA receptors (EphA1-10), which display promiscuous interactions with six ephrin-A ligands. EphRs contain a ligand binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyrosine kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling). Ephrin/EphR interaction mainly results in cell-cell repulsion or adhesion, making it important in neural development and plasticity, cell morphogenesis, cell-fate determination, embryonic development, tissue patterning, and angiogenesis.	173
198442	cd10474	EphR_LBD_B4	Ligand Binding Domain of Ephrin type-B Receptor 4. Ephrin receptors (EphRs) comprise the largest subfamily of receptor tyrosine kinases (RTKs). Class EphB receptors bind to transmembrane ephrin-B ligands. There are six vertebrate EphB receptors (EphB1-6), which display promiscuous interactions with three ephrin-B ligands. EphB4 plays a role in osteoblast differentiation and has been linked to multiple myeloma. EphRs contain a ligand binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyrosine kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling).	180
198443	cd10475	EphR_LBD_B6	Ligand Binding Domain of Ephrin type-B Receptor 6. Ephrin receptors (EphRs) comprise the largest subfamily of receptor tyrosine kinases (RTKs). Class EphB receptors bind to transmembrane ephrin-B ligands. There are six vertebrate EphB receptors (EphB1-6), which display promiscuous interactions with three ephrin-B ligands. EphB6, a kinase-defective member of this family, is downregulated in MDA-MB-231-breast cancer cells and myeloid cancers and upregulated in neuroblastoma and glioblastoma. EphRs contain a ligand binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyrosine kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling).	180
198444	cd10476	EphR_LBD_B1	Ligand Binding Domain of Ephrin type-B Receptor 1. Ephrin receptors (EphRs) comprise the largest subfamily of receptor tyrosine kinases (RTKs). Class EphB receptors bind to transmembrane ephrin-B ligands. There are six vertebrate EphB receptors (EphB1-6), which display promiscuous interactions with three ephrin-B ligands. Using EphB1 knockout mice, EphB1 has been shown to be essential to the development of long-term potentiation (LTP), a cellular model of synaptic plasticity, learning and memory formation. EphRs contain a ligand binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyrosine kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling).	176
198445	cd10477	EphR_LBD_B2	Ligand Binding Domain of Ephrin type-B Receptor 2. Ephrin receptors (EphRs) comprise the largest subfamily of receptor tyrosine kinases (RTKs). Class EphB receptors bind to transmembrane ephrin-B ligands. There are six vertebrate EphB receptors (EphB1-6), which display promiscuous interactions with three ephrin-B ligands. EphB2 plays a role in cell positioning in the gastrointestinal tract by being expressed in proliferating progenitor cells. It also has been implicated in colorectal cancer. A loss of EphB2, as well as EphA4, also precedes memory decline in a murine model of Alzheimer's disease. EphRs contain a ligand binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyrosine kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling).	178
198446	cd10478	EphR_LBD_B3	Ligand Binding Domain of Ephrin type-B Receptor 3. Ephrin receptors (EphRs) comprise the largest subfamily of receptor tyrosine kinases (RTKs). Class EphB receptors bind to transmembrane ephrin-B ligands. There are six vertebrate EphB receptors (EphB1-6), which display promiscuous interactions with three ephrin-B ligands. EphB3 plays a role in cell positioning in the gastrointestinal tract by being preferentially expressed in Paneth cells. It also has been implicated in early colorectal cancer and early stage squamous cell lung cancer. EphRs contain a ligand binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyrosine kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling).	173
198447	cd10479	EphR_LBD_A1	Ligand Binding Domain of Ephrin type-A Receptor 1. Ephrin receptors (EphRs) comprise the largest subfamily of receptor tyrosine kinases (RTKs). Class EphA receptors bind GPI-anchored ephrin-A ligands. There are ten vertebrate EphA receptors (EphA1-10), which display promiscuous interactions with six ephrin-A ligands. EphA1 is downregulated in some advanced colorectal and myeloid cancers and upregulated in neuroblastoma and glioblastoma. EphRs contain a ligand binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyrosine kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling). Ephrin/EphR interaction mainly results in cell-cell repulsion or adhesion.	177
198448	cd10480	EphR_LBD_A2	Ligand Binding Domain of Ephrin type-A Receptor 2. EphRs comprise the largest subfamily of receptor tyr kinases (RTKs). Class EphA receptors bind GPI-anchored ephrin-A ligands. There are ten vertebrate EphA receptors (EphA1-10), which display promiscuous interactions with six ephrin-A ligands. EphA2 negatively regulates cell differentiation and has been shown to be overexpressed in tumor cells and tumor blood vessels in a variety of cancers including breast, prostate, lung, and colon. As a result, it is an attractive target for drug design since its inhibition could affect several aspects of tumor progression. EphRs contain a ligand binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyrosine kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling). Ephrin/EphR interaction mainly results in cell-cell repulsion or adhesion.	174
198449	cd10481	EphR_LBD_A3	Ligand Binding Domain of Ephrin type-A Receptor 3. Ephrin receptors (EphRs) comprise the largest subfamily of receptor tyrosine kinases (RTKs). Class EphA receptors bind GPI-anchored ephrin-A ligands. There are ten vertebrate EphA receptors (EphA1-10), which display promiscuous interactions with six ephrin-A ligands. EphA3 has been implicated in leukemia, lung and other cancers. EphRs contain a ligand binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyrosine kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling). Ephrin/EphR interaction mainly results in cell-cell repulsion or adhesion.	173
198450	cd10482	EphR_LBD_A4	Ligand Binding Domain of Ephrin type-A Receptor 4. Ephrin receptors (EphRs) comprise the largest subfamily of receptor tyrosine kinases (RTKs). Class EphA receptors bind GPI-anchored ephrin-A ligands. There are ten vertebrate EphA receptors (EphA1-10), which display promiscuous interactions with six ephrin-A ligands. A loss of EphA4, as well as EphB2, precedes memory decline in a murine model of Alzheimer's disease. EphA4 has been shown to have a negative effect on axon regeneration and functional restoration in corticospinal lesions and is downregulated in some cervical cancers. EphRs contain a ligand binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyrosine kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling).	174
198451	cd10483	EphR_LBD_A5	Ligand Binding Domain of Ephrin type-A Receptor 5. Ephrin receptors (EphRs) comprise the largest subfamily of receptor tyrosine kinases (RTKs). Class EphA receptors bind GPI-anchored ephrin-A ligands. There are ten vertebrate EphA receptors (EphA1-10), which display promiscuous interactions with six ephrin-A ligands. EphA5 is almost exclusively expressed in the nervous system. EphRs contain a ligand binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyrosine kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling).	173
198452	cd10484	EphR_LBD_A6	Ligand Binding Domain of Ephrin type-A Receptor 6. Ephrin receptors (EphRs) comprise the largest subfamily of receptor tyrosine kinases (RTKs). Class EphA receptors bind GPI-anchored ephrin-A ligands. There are ten vertebrate EphA receptors (EphA1-10), which display promiscuous interactions with six ephrin-A ligands. EphA6, like other Eph receptors and their ephrin ligands, seems to play a role in neural development, underlying learning and memory.  EphRs contain a ligand binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyrosine kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling).	173
198453	cd10485	EphR_LBD_A7	Ligand Binding Domain of Ephrin type-A Receptor 7. Ephrin receptors (EphRs) comprise the largest subfamily of receptor tyrosine kinases (RTKs). Class EphA receptors bind GPI-anchored ephrin-A ligands. There are ten vertebrate EphA receptors (EphA1-10), which display promiscuous interactions with six ephrin-A ligands. EphA7 has been implicated in various cancers, including prostate, gastric and colorectal cancers. EphRs contain a ligand binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyrosine kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling).	177
198454	cd10486	EphR_LBD_A8	Ligand Binding Domain of Ephrin type-A Receptor 8. Ephrin receptors (EphRs) comprise the largest subfamily of receptor tyrosine kinases (RTKs). Class EphA receptors bind GPI-anchored ephrin-A ligands. There are ten vertebrate EphA receptors (EphA1-10), which display promiscuous interactions with six ephrin-A ligands. EphA8 has been implicated in various cancers. EphRs contain a ligand binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyrosine kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling).	173
198455	cd10487	EphR_LBD_A10	Ligand Binding Domain of Ephrin type-A Receptor 10. Ephrin receptors (EphRs) comprise the largest subfamily of receptor tyrosine kinases (RTKs). Class EphA receptors bind GPI-anchored ephrin-A ligands. There are ten vertebrate EphA receptors (EphA1-10), which display promiscuous interactions with six ephrin-A ligands. EphA10, which contains an inactive tyr kinase domain, may function to attenuate signals of co-clustered active receptors. EphA10 is mainly expressed in the testis. EphRs contain a ligand binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyrosine kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling). Ephrin/EphR interaction results in cell-cell repulsion or adhesion.	173
199812	cd10488	MH1_R-SMAD	N-terminal Mad Homology 1 (MH1) domain of receptor regulated SMADs. The MH1 is a small DNA-binding domain present in SMAD (small mothers against decapentaplegic) family of proteins, which are signal transducers and transcriptional modulators that mediate multiple signaling pathways. It binds to the DNA major groove in an unusual manner via a beta hairpin structure. It negatively regulates the functions of the MH2 domain, the C-terminal domain of SMAD. This MH1 domain is found in all receptor regulated SMADs (R-SMADs) including SMAD1, SMAD2, SMAD3, SMAD5 and SMAD9. SMAD1 plays an essential role in bone development and postnatal bone formation through activation by bone morphogenetic protein (BMP) type 1 receptor kinase. SMAD2 regulates multiple cellular processes, such as cell proliferation, apoptosis and differentiation, while SMAD3 modulates signals of activin and TGF-beta. SMAD4, a common mediator SMAD (co-SMAD), binds R-SMADs, forming an oligomeric complex that binds to DNA and serves as a transcription factor. SMAD5 is involved in bone morphogenetic proteins (BMP) signal modulation, possibly playing a role in the pathway involving inhibition of hematopoietic progenitor cells by TGF-beta. SMAD9 (also known as SMAD8) can mediate the differentiation of mesenchymal stem cells (MSCs) into tendon-like cells by inhibiting the osteogenic pathway.	123
199813	cd10489	MH1_SMAD_6_7	N-terminal Mad Homology 1 (MH1) domain in SMAD6 and SMAD7. The MH1 is a small DNA-binding domain present in SMAD (small mothers against decapentaplegic) family of proteins, which are signal transducers and transcriptional modulators that mediate multiple signaling pathways.  MH1 binds to the DNA major groove in an unusual manner via a beta hairpin structure.  It negatively regulates the functions of the MH2 domain, the C-terminal domain of SMAD. This MH1 domain is found in SMAD6 and SMAD7, both inhibitory SMADs (I-SMADs) and negative regulators of signaling mediated by TGF-beta superfamily. SMAD6 specifically inhibits bone morphogenetic protein (BMP) type I receptor mediated signaling while SMAD7 enhances muscle differentiation and is often associated with cancer, tissue fibrosis and inflammatory diseases.	119
199814	cd10490	MH1_SMAD_1_5_9	N-terminal Mad Homology 1 (MH1) domain in SMAD1, SMAD5 and SMAD9 (also known as SMAD8). The MH1 is a small DNA-binding domain present in SMAD (small mothers against decapentaplegic) family of proteins, which are signal transducers and transcriptional modulators that mediate multiple signaling pathways. MH1 binds to the DNA major groove in an unusual manner via a beta hairpin structure.  It negatively regulates the functions of the MH2 domain, the C-terminal domain of SMAD. This MH1 domain is found in SMAD1, SMAD5 and SMAD9, all closely related receptor regulated SMADs (R-SMADs). SMAD1 plays an essential role in bone development and postnatal bone formation through activation by bone morphogenetic protein (BMP) type 1 receptor kinase. SMAD5 is involved in bone morphogenetic proteins (BMP) signal modulation and may also play a role in the pathway involving inhibition of hematopoietic progenitor cells by TGF-beta. SMAD9 mediates the differentiation of mesenchymal stem cells (MSCs) into tendon-like cells by inhibiting the osteogenic pathway.	124
199815	cd10491	MH1_SMAD_2_3	N-terminal Mad Homology 1 (MH1) domain in SMAD2 and SMAD3. The MH1 is a small DNA-binding domain present in SMAD (small mothers against decapentaplegic) family of proteins, which are signal transducers and transcriptional modulators that mediate multiple signaling pathways.  MH1 binds to the DNA major groove in an unusual manner via a beta hairpin structure.  It negatively regulates the functions of the MH2 domain, the C-terminal domain of SMAD. This MH1 is found in SMAD2 as well as SMAD3. SMAD2 mediates the signal of the transforming growth factor (TGF)-beta, and thereby regulates multiple cellular processes, such as cell proliferation, apoptosis, and differentiation. It plays a role in the transmission of extracellular signals from ligands of the TGF-beta superfamily growth factors into the cell nucleus. SMAD3 modulates signals of activin and TGF-beta. It binds SMAD4, enabling its transmigration into the nucleus where it forms complexes with other proteins and acts as a transcription factor. Increased SMAD3 activity has been implicated in the pathogenesis of scleroderma.	124
199816	cd10492	MH1_SMAD_4	N-terminal Mad Homology 1 (MH1) domain in SMAD4. The MH1 is a small DNA-binding domain present in SMAD (small mothers against decapentaplegic) family of proteins, which are signal transducers and transcriptional modulators that mediate multiple signaling pathways.  MH1 binds to the DNA major groove in an unusual manner via a beta hairpin structure.  It negatively regulates the functions of the MH2 domain, the C-terminal domain of SMAD. This MH1 belongs to SMAD4, a common mediator SMAD (co-SMAD), which belongs to the Dwarfin family of proteins and is involved in many cell functions such as differentiation, apoptosis, gastrulation, embryonic development and cell cycle. SMAD4 binds receptor regulated SMADs (R-SMADs) such as SMAD1 or SMAD2, and forms an oligomeric complex that binds to DNA and serves as a transcription factor. SMAD4 is often mutated in several cancers, such as multiploid colorectal cancer and pancreatic carcinoma, as well as in juvenile polyposis syndrome (JPS).	125
199817	cd10493	MH1_SMAD_6	N-terminal Mad Homology 1 (MH1) domain in SMAD6. The MH1 is a small DNA-binding domain present in SMAD (small mothers against decapentaplegic) family of proteins, which are signal transducers and transcriptional modulators that mediate multiple signaling pathways.  MH1 binds to the DNA major groove in an unusual manner via a beta hairpin structure.  It negatively regulates the functions of the MH2 domain, the C-terminal domain of SMAD. This MH1 belongs to SMAD6, an inhibitory SMAD (I-SMAD) or antagonistic SMAD, which acts as a negative regulator of signaling mediated by TGF-beta superfamily ligands, by competing with SMAD4 and preventing the transcription of SMAD4's gene products. SMAD6 specifically inhibits bone morphogenetic protein (BMP) type I receptor mediated signaling.	113
199818	cd10494	MH1_SMAD_7	N-terminal Mad Homology 1 (MH1) domain in SMAD7. The MH1 is a small DNA-binding domain present in SMAD (small mothers against decapentaplegic) family of proteins. It binds to the major groove in an unusual manner via a beta hairpin structure.  It negatively regulates the functions of the MH2 domain, the C-terminal domain of SMAD. This MH1 belongs to SMAD7, an inhibitory SMAD (I-SMAD) or antagonistic SMAD, which acts as a negative regulator of signaling mediated by TGF-beta superfamily ligands, by blocking TGF-beta type 1 and activin association with the receptor as well as access to SMAD2. SMAD7 enhances muscle differentiation, playing pivotal roles in embryonic development and adult homoeostasis. Altered expression of SMAD7 is often associated with cancer, tissue fibrosis and inflammatory diseases.	123
199820	cd10495	MH2_R-SMAD	C-terminal Mad Homology 2 (MH2) domain in receptor regulated SMADs. The MH2 domain is located at the C-terminus of the SMAD (small mothers against decapentaplegic) family of proteins, which are signal transducers and transcriptional modulators that mediate multiple signaling pathways. The MH2 domain is responsible for type I receptor interaction, phosphorylation-triggered homo- and hetero-oligomerization, and transactivation. It is negatively regulated by the N-terminal MH1 domain. Receptor regulated SMADs (R-SMADs) include SMAD1, SMAD2, SMAD3, SMAD5 and SMAD9. SMAD1 plays an essential role in bone development and postnatal bone formation through activation by bone morphogenetic protein (BMP) type 1 receptor kinase. SMAD2 regulates multiple cellular processes, such as cell proliferation, apoptosis and differentiation, while SMAD3 modulates signals of activin and TGF-beta. SMAD5 is involved in BMP signal modulation, possibly playing a role in the pathway involving inhibition of hematopoietic progenitor cells by TGF-beta. SMAD9 (also known as SMAD8) can mediate the differentiation of mesenchymal stem cells into tendon-like cells by inhibiting the osteogenic pathway.	182
199821	cd10496	MH2_I-SMAD	C-terminal Mad Homology 2 (MH2) domain in Inhibitory SMADs. The MH2 domain is located at the C-terminus of the SMAD (small mothers against decapentaplegic) family of proteins, which are signal transducers and transcriptional modulators that mediate multiple signaling pathways. The MH2 domain is responsible for type I receptor interaction, phosphorylation-triggered homo- and hetero-oligomerization, and transactivation. It is negatively regulated by the N-terminal MH1 domain, which prevents it from forming a complex with SMAD4. SMAD6 and SMAD7 are inhibitory SMADs (I-SMADs) that function as negative regulators of signaling mediated by the TGF-beta superfamily. SMAD6 specifically inhibits bone morphogenetic protein (BMP) type I receptor mediated signaling, while SMAD7 enhances muscle differentiation and is often associated with cancer, tissue fibrosis and inflammatory diseases.	165
199822	cd10497	MH2_SMAD_1_5_9	C-terminal Mad Homology 2 (MH2) domain in SMAD1, SMAD5 and SMAD9. The MH2 domain is located at the C-terminus of the SMAD (small mothers against decapentaplegic) family of proteins, which are signal transducers and transcriptional modulators that mediate multiple signaling pathways. The MH2 domain is responsible for type I receptor interaction, phosphorylation-triggered homo- and hetero-oligomerization, and transactivation. It is negatively regulated by the N-terminal MH1 domain, which prevents it from forming a complex with SMAD4. SMAD1, SMAD5 and SMAD9 (also known as SMAD8), are receptor regulated SMADs (R-SMADs). SMAD1 plays an essential role in bone development and postnatal bone formation through activation by bone morphogenetic protein (BMP) type 1 receptor kinase. SMAD5 is involved in BMP signal modulation and may also play a role in the pathway involving inhibition of hematopoietic progenitor cells by TGF-beta. SMAD9 mediates the differentiation of mesenchymal stem cells (MSCs) into tendon-like cells by inhibiting the osteogenic pathway.	201
199823	cd10498	MH2_SMAD_4	C-terminal Mad Homology 2 (MH2) domain in SMAD4. The MH2 domain is located at the C-terminus of the SMAD (small mothers against decapentaplegic) family of proteins, which are signal transducers and transcriptional modulators that mediate multiple signaling pathways. The MH2 domain is responsible for type I receptor interaction, phosphorylation-triggered homo- and hetero-oligomerization, and transactivation. It is negatively regulated by the N-terminal MH1 domain. SMAD4, which belongs to the Dwarfin family of proteins, is involved in many cell functions such as differentiation, apoptosis, gastrulation, embryonic development and the cell cycle. SMAD4 binds receptor regulated SMADs (R-SMADs) such as SMAD1 or SMAD2, and forms an oligomeric complex that binds to DNA and serves as a transcription factor. SMAD4 is often mutated in several cancers, such as multiploid colorectal cancer, cervical cancer and pancreatic carcinoma, as well as in juvenile polyposis syndrome.	222
199824	cd10499	MH2_SMAD_6	C-terminal Mad Homology 2 (MH2) domain in SMAD6. The MH2 domain is located at the C-terminus of the SMAD (small mothers against decapentaplegic) family of proteins, which are signal transducers and transcriptional modulators that mediate multiple signaling pathways. The MH2 domain is responsible for type I receptor interaction, phosphorylation-triggered homo- and hetero-oligomerization, and transactivation. It is negatively regulated by the N-terminal MH1 domain, which prevents it from forming a complex with SMAD4. SMAD6, an inhibitory or antagonistic SMAD (I-SMAD), acts as a negative regulator of signaling mediated by the TGF-beta superfamily of ligands, by competing with SMAD4 and preventing the transcription of SMAD4's gene products. SMAD6 specifically inhibits bone morphogenetic protein (BMP) type I receptor mediated signaling. SMAD6 and SMAD7 act as critical mediators for effective TGF-beta I-mediated suppression of Interleukin-1/Toll-like receptor (IL-1R/TLR) signaling through simultaneous binding to Pellino-1, an adaptor protein of interleukin-1 receptor associated kinase 1 (IRAK1), via their MH2 domains.	174
199825	cd10500	MH2_SMAD_7	C-terminal Mad Homology 2 (MH2) domain in SMAD7. The MH2 domain is located at the C-terminus of the SMAD (small mothers against decapentaplegic) family of proteins, which are signal transducers and transcriptional modulators that mediate multiple signaling pathways. The MH2 domain is responsible for type I receptor interaction, phosphorylation-triggered homo- and hetero-oligomerization, and transactivation. It is negatively regulated by the N-terminal MH1 domain, which prevents it from forming a complex with SMAD4. SMAD7, an inhibitory or antagonistic SMAD (I-SMAD), acts as a negative regulator of signaling mediated by the TGF-beta superfamily of ligands, by blocking TGF-beta type 1 and activin association with the receptor as well as access to SMAD2. SMAD7 enhances muscle differentiation, playing pivotal roles in embryonic development and adult homoeostasis. SMAD7 and SMAD6 act as critical mediators for effective TGF-beta I-mediated suppression of Interleukin-1/Toll-like receptor (IL-1R/TLR) signaling through simultaneous binding to Pellino-1, an adaptor protein of interleukin-1 receptor associated kinase 1 (IRAK1), via their MH2 domains. Altered expression of SMAD7 is often associated with cancer, tissue fibrosis and inflammatory diseases.	171
259849	cd10506	RNAP_IV_RPD1_N	Largest subunit (NRPD1) of higher plant RNA polymerase IV, N-terminal domain. NRPD1 and NRPE1 are the largest subunits of plant DNA-dependent RNA polymerase IV and V that, together with second largest subunits (NRPD2 and NRPE2), form the active site region of the DNA entry and RNA exit channel. Higher plants have five multi-subunit nuclear RNA polymerases; RNAP I, RNAP II and RNAP III, which are essential for viability, plus the two isoforms of the non-essential polymerase RNAP IV and V, which specialize in small RNA-mediated gene silencing pathways. RNAP IV and/or V might be involved in RNA-directed DNA methylation of endogenous repetitive elements, silencing of transgenes, regulation of flowering-time genes, inducible regulation of adjacent gene pairs, and spreading of mobile silencing signals. The subunit compositions of RNAP IV and V reveal that they evolved from RNAP II.	744
259792	cd10507	Zn-ribbon_RPA12	C-terminal zinc ribbon domain of RPA12 subunit of RNA polymerase I. The C-terminal zinc ribbon domain (C-ribbon) of subunit A12 (Zn-ribbon_RPA12) in RNA polymerase (Pol) I is involved in intrinsic transcript cleavage. Eukaryote genomes are transcribed by three nuclear RNA polymerases (Pol I, II and III) that share some subunits. RPA12 in Pol I, RPB9 in Pol II, RPC11 in Pol III and TFS in archaea are distantly related to each other and to the TFIIS elongation factor of Pol II. RPA12 has two zinc-binding domains separated by a flexible linker.	47
259793	cd10508	Zn-ribbon_RPB9	C-terminal zinc ribbon domain of RPB9 subunit of RNA polymerase II. The C-terminal zinc ribbon domain (C-ribbon) of subunit B9 (Zn-ribbon_RPB9) in RNA polymerase (Pol) II is involved in intrinsic transcript cleavage. Eukaryote genomes are transcribed by three nuclear RNA polymerases (Pol I, II and III) that share some subunits. RPB9 has strong homology to RPA12 of Pol I and RPC11 of Pol III, but its intrinsic cleavage activity is weaker for Pol II. Zn-ribbon_RPB9 is homologous to Pol II elongation factor TFIIS domain III. The very weak cleavage activity of Pol II is stimulated by TFIIS. RPB9 has two zinc-binding domains separated by a flexible linker.	49
259794	cd10509	Zn-ribbon_RPC11	C-terminal zinc ribbon domain of RPC11 subunit of RNA polymerase III. The C-terminal zinc ribbon domain (C-ribbon) of subunit C11 (Zn-ribbon_RPC11) in RNA polymerase (Pol) III is required for intrinsic transcript cleavage. RPC11 is also involved in Pol III termination. Eukaryote genomes are transcribed by three nuclear RNA polymerases (Pol I, II and III) that share some subunits. RPC11 has strong homology to RPB9 of Pol II and RPA12 of Pol I. Zn-ribbon_RPC11 is homologous to Pol II elongation factor TFIIS domain III. C11 has two zinc-binding domains separated by a flexible linker.	46
259795	cd10511	Zn-ribbon_TFS	C-terminal zinc ribbon domain of archaeal Transcription Factor S (TFS). TFS is an archaeal protein that stimulates the intrinsic cleavage activity of archaeal RNA polymerase. TFS C-terminal domain shows sequence similarity to the homologous C-terminal zinc ribbon domain of subunits A12.2, Rpb9, and C11 in eukaryotic RNA Polymerases (Pol) I, II, and III, respectively, and domain III of TFIIS. TFS is not a subunit of archaeal RNA polymerase even though its domain arrangement is similar to A12.2, Rpb9, and C11. TFS is a transcription factor with a similar function to eukaryotic TFIIS. TFS has external cleavage induction activity and improves the fidelity of transcription. TFS has two zinc-binding domains.	47
380915	cd10517	SET_SETDB1	SET domain (including pre-SET and post-SET domains) found in SET domain bifurcated 1 (SETDB1) and similar proteins. SETDB1 (EC 2.1.1.43; also termed ERG-associated protein with SET domain (ESET), histone H3-K9 methyltransferase 4, H3-K9-HMTase 4, or lysine N-methyltransferase 1E (KMT1E)) acts as a histone-lysine N-methyltransferase that specifically trimethylates 'Lys-9' of histone H3 (H3K9me3). It mainly functions in euchromatin regions, thereby playing a central role in the silencing of euchromatic genes.	288
380916	cd10518	SET_SETD1-like	SET domain (including post-SET domain) found in SET domain-containing proteins (SETD1A/SETD1B), histone-lysine N-methyltransferases (KMT2A/KMT2B/KMT2C/KMT2D) and similar proteins. This family includes SET domain-containing protein 1A (SETD1A), 1B (SETD1B), as well as histone-lysine N-methyltransferase 2A (KMT2A), 2B (KMT2B), 2C (KMT2C), 2D (KMT2D). These proteins are histone-lysine N-methyltransferases (EC 2.1.1.43) that specifically methylate 'Lys-4' of histone H3 (H3K4me).	150
380917	cd10519	SET_EZH	SET domain found in enhancer of zeste homolog 1 (EZH1), zeste homolog 2 (EZH2) and similar proteins. The family includes EZH1 and EZH2. EZH1 (EC 2.1.1.43; also termed ENX-2, or histone-lysine N-methyltransferase EZH1) is a catalytic subunit of the PRC2/EED-EZH1 complex, which methylates 'Lys-27' of histone H3, leading to transcriptional repression of the affected target gene. EZH2 (EC 2.1.1.43; also termed lysine N-methyltransferase 6, ENX-1, or histone-lysine N-methyltransferase EZH2) is a catalytic subunit of the PRC2/EED-EZH2 complex, which methylates 'Lys-9' (H3K9me) and 'Lys-27' (H3K27me) of histone H3, leading to transcriptional repression of the affected target gene. Both EZH1 and EZH2 can mono-, di- and trimethylate 'Lys-27' of histone H3 to form H3K27me1, H3K27me2 and H3K27me3, respectively.	117
380918	cd10520	PR-SET_PRDM17	PR-SET domain found in PR domain zinc finger protein 17 (PRDM17) and similar proteins. PRDM17 (also termed zinc finger protein 408 (ZNF408)) may be involved in transcriptional regulation.	121
380919	cd10521	SET_SMYD5	SET domain (including iSET domain and post-SET domain) found in SET and MYND domain-containing protein 5 (SMYD5) and similar proteins. SMYD5 (also termed protein NN8-4AG, or retinoic acid-induced protein 15) functions as histone lysine methyltransferase that mediates H4K20me3 at heterochromatin regions. It plays an important role in chromosome integrity by regulating heterochromatin and repressing endogenous repetitive DNA elements during differentiation. In zebrafish embryogenesis, it plays pivotal roles in both primitive and definitive hematopoiesis.	282
380920	cd10522	SET_LegAS4-like	SET domain found in Legionella pneumophila type IV secretion system effector LegAS4 and similar proteins. LegAS4 is a type IV secretion system effector of Legionella pneumophila. It contains a SET domain that is involved in the modification of Lys4 of histone H3 (H3K4) in the nucleolus of the host cell, thereby enhancing heterochromatic rDNA transcription. It also contains an ankyrin repeat domain of unknown function at its C-terminal region.	122
380921	cd10523	SET_SETDB2	SET domain (including pre-SET and post-SET domains) found in SET domain bifurcated 2 (SETDB2) and similar proteins. SETDB2 (EC 2.1.1.43; also termed chronic lymphocytic leukemia deletion region gene 8 protein (CLLD8), or lysine N-methyltransferase 1F (KMT1F)) acts as a histone-lysine N-methyltransferase that specifically trimethylates 'Lys-9' of histone H3 (H3K9me3). It is involved in left-right axis specification in early development and mitosis.	266
380922	cd10524	SET_Suv4-20-like	SET domain (including post-SET domain) found in Drosophila melanogaster suppressor of variegation 4-20 (Suv4-20) and similar proteins. Suv4-20 (also termed Su(var)4-20) is a histone-lysine N-methyltransferase that specifically trimethylates 'Lys-20' of histone H4. It acts as a dominant suppressor of position-effect variegation. The family also includes Suv4-20 homologs, lysine N-methyltransferase 5B (KMT5B) and lysine N-methyltransferase 5C (KMT5C). Both KMT5B (also termed lysine-specific methyltransferase 5B, or suppressor of variegation 4-20 homolog 1, or Su(var)4-20 homolog 1, or Suv4-20h1) and KMT5C (also termed lysine-specific methyltransferase 5C, or suppressor of variegation 4-20 homolog 2, or Su(var)4-20 homolog 2, or Suv4-20h2) are histone methyltransferases that specifically trimethylate 'Lys-20' of histone H4 (H4K20me3). They play central roles in the establishment of constitutive heterochromatin in pericentric heterochromatin regions.	141
380923	cd10525	SET_SUV39H1	SET domain (including pre-SET and post-SET domains) found in suppressor of variegation 3-9 homolog 1 (SUV39H1) and similar proteins. SUV39H1 (EC 2.1.1.43; also termed histone H3-K9 methyltransferase 1, H3-K9-HMTase 1, lysine N-methyltransferase 1A (KMT1A), position-effect variegation 3-9 homolog (SUV39H), or Su(var)3-9 homolog 1) acts as a histone-lysine N-methyltransferase that specifically trimethylates 'Lys-9' of histone H3 (H3K9me3) using monomethylated H3 'Lys-9' as substrate. It mainly functions in heterochromatin regions, thereby playing a central role in the establishment of constitutive heterochromatin at pericentric and telomere regions.	255
380924	cd10526	SET_SMYD1	SET domain (including post-SET domain) found in SET and MYND domain-containing protein 1 (SMYD1) and similar proteins. SMYD1 (EC 2.1.1.43), also termed BOP, is a heart and muscle specific SET-MYND domain containing protein, which functions as a histone methyltransferase and regulates downstream gene transcription. It methylates histone H3 at 'Lys-4' (H3K4me) and seems able to perform mono-, di-, and trimethylation. SMYD1 plays a critical role in cardiomyocyte differentiation, cardiac morphogenesis and myofibril organization, as well as in the regulation of endothelial cells (ECs). It is expressed in vascular endothelial cells; it has been shown that knockdown of SMYD1 in endothelial cells impairs EC migration and tube formation.	210
380925	cd10527	SET_LSMT	SET domain found in Rubisco large subunit methyltransferase (LSMT) and similar proteins. Rubisco LSMT is a non-histone protein methyl transferase responsible for the trimethylation of lysine14 in the large subunit of Rubisco (ribulose-1,5-bisphosphate carboxylase/oxygenase). The family also includes SET domain-containing proteins, SETD3, SETD4 and SETD6, which belong to methyltransferase class VII that represents classical non-histone SET domain methyltransferases. Members in this family contain a SET domain and a C-terminal RubisCO LSMT substrate-binding (Rubis-subs-bind) domain.	236
380926	cd10528	SET_SETD8	SET domain found in SET domain-containing protein 8 (SETD8) and similar proteins. SETD8 (EC 2.1.1.43; also termed N-lysine methyltransferase KMT5A, H4-K20-HMTase KMT5A, lysine N-methyltransferase 5A, lysine-specific methylase 5A, PR/SET domain-containing protein 07, PR-Set7 or PR/SET07) is a nucleosomal histone-lysine N-methyltransferase that specifically monomethylates 'Lys-20' of histone H4 (H4K20me1). It plays a central role in the silencing of euchromatic genes.	141
380927	cd10529	SET_SETD5-like	SET domain found in SET domain-containing protein 5 (SETD5), inactive histone-lysine N-methyltransferase 2E (KMT2E) and similar proteins. SETD5 is a probable transcriptional regulator that acts via the formation of large multiprotein complexes that modify and/or remodel the chromatin. KMT2E (also termed inactive lysine N-methyltransferase 2E or myeloid/lymphoid or mixed-lineage leukemia protein 5 (MLL5)) associates with chromatin regions downstream of transcriptional start sites of active genes and thus regulates gene transcription. The family also includes Saccharomyces cerevisiae SET domain-containing proteins, SET3 and SET4, and Schizosaccharomyces pombe SET3. Most of these family members contain a post-SET domain which harbors a zinc-binding site.	127
380928	cd10530	SET_SETD7	SET domain found in SET domain-containing protein 7 (SETD7) and similar proteins. SETD7 (EC 2.1.1.43; also termed histone H3-K4 methyltransferase SETD7, H3-K4-HMTase SETD7, lysine N-methyltransferase 7 (KMT7) or SET7/9) is a histone-lysine N-methyltransferase that specifically monomethylates 'Lys-4' of histone H3. It plays a central role in the transcriptional activation of genes such as collagenase or insulin. Set7/9 also methylates non-histone proteins, including estrogen receptor alpha (ERa), suggesting it has a role in diverse biological processes. ERa methylation by Set7/9 stabilizes ERa and activates its transcriptional activities, which are involved in the carcinogenesis of breast cancer. In a high-throughput screen, treatment of human breast cancer cells (MCF7 cells) with cyproheptadine, a Set7/9 inhibitor, decreased the expression and transcriptional activity of ERa, thereby inhibiting estrogen-dependent cell growth.	130
380929	cd10531	SET_SETD2-like	SET domain (including post-SET domain) found in SET domain-containing protein 2 (SETD2), nuclear SETD2 (NSD2), ASH1-like protein (ASH1L) and similar proteins. This family includes SET domain-containing protein 2 (SETD2), nuclear SETD2 (NSD2) and ASH1-like protein (ASH1L), which function as histone-lysine N-methyltransferases. SETD2 specifically trimethylates 'Lys-36' of histone H3 (H3K36me3) using dimethylated 'Lys-36' (H3K36me2) as substrate. NSD2 shows histone H3 'Lys-27' (H3K27me) methyltransferase activity. ASH1L specifically methylates 'Lys-36' of histone H3 (H3K36me). The family also includes Arabidopsis thaliana ASH1-related protein 3 (ASHR3) and similar proteins.	136
380930	cd10532	SET_SUV39H2	SET domain (including pre-SET and post-SET domains) found in suppressor of variegation 3-9 homolog 2 (SUV39H2) and similar proteins. SUV39H2 (EC 2.1.1.43; also termed histone H3-K9 methyltransferase 2, H3-K9-HMTase 2, lysine N-methyltransferase 1B (KMT1B), or Su(var)3-9 homolog 2) acts as a histone-lysine N-methyltransferase that specifically trimethylates 'Lys-9' of histone H3 (H3K9me3) using monomethylated H3 'Lys-9' as substrate. It mainly functions in heterochromatin regions, thereby playing a central role in the establishment of constitutive heterochromatin at pericentric and telomere regions.	243
380931	cd10533	SET_EHMT2	SET domain (including pre-SET and post-SET domains) found in euchromatic histone-lysine N-methyltransferase 2 (EHMT2) and similar proteins. EHMT2 (also termed Eu-HMTase2, HLA-B-associated transcript 8, histone H3-K9 methyltransferase 3, H3-K9-HMTase 3, lysine N-methyltransferase 1C (KMT1C), or protein G9a) acts as a histone-lysine N-methyltransferase that specifically mono- and dimethylates 'Lys-9' of histone H3 (H3K9me1 and H3K9me2, respectively) in euchromatin.	239
380932	cd10534	PR-SET_PRDM-like	PR-SET domain found in PRDM (PRDI-BF1 and RIZ homology domain) family of proteins. PRDM family of proteins is defined based on the conserved N-terminal PR domain, which is closely related to the Su(var)3-9, enhancer of zeste, and trithorax (SET) domains of histone methyltransferases, and is specifically called PR-SET domain. The family consists of 17 members in primates. PRDMs play diverse roles in cell-cycle regulation, differentiation, and meiotic recombination. The family also contains zinc finger protein ZFPM1 and ZFPM2. ZFPM1 (also termed friend of GATA protein 1, FOG-1, friend of GATA 1, zinc finger protein 89A, or zinc finger protein multitype 1) functions as a transcription regulator that plays an essential role in erythroid and megakaryocytic cell differentiation. ZFPM2 (also termed friend of GATA protein 2, FOG-2, friend of GATA 2, zinc finger protein 89B, or zinc finger protein multitype 2) functions as a transcription regulator that plays a central role in heart morphogenesis and development of coronary vessels from epicardium, by regulating genes that are essential during cardiogenesis.	83
380933	cd10535	SET_EHMT1	SET domain (including pre-SET and post-SET domains) found in euchromatic histone-lysine N-methyltransferase 1 (EHMT1) and similar proteins. EHMT1 (also termed Eu-HMTase1, G9a-like protein 1, GLP, GLP1, histone H3-K9 methyltransferase 5, H3-K9-HMTase 5, or lysine N-methyltransferase 1D (KMT1D)) acts as a histone-lysine N-methyltransferase that specifically mono- and dimethylates 'Lys-9' of histone H3 (H3K9me1 and H3K9me2, respectively) in euchromatin.	231
380934	cd10536	SET_SMYD4	SET domain (including iSET domain and post-SET domain) found in SET and MYND domain-containing protein 4 (SMYD4) and similar proteins. SMYD4 functions as a potential tumor suppressor that plays a critical role in breast carcinogenesis at least partly through inhibiting the expression of PDGFR-alpha. In zebrafish, SMYD4 is ubiquitously expressed in early embryos and becomes enriched in the developing heart; mutants show a strong defect in cardiomyocyte proliferation, which leads to a severe cardiac malformation.	218
380935	cd10537	SET_SETD9	SET domain found in SET domain-containing protein 9 (SETD9) and similar proteins. SETD9 is an uncharacterized protein that belongs to the class V-like SAM-binding methyltransferase superfamily.	150
380936	cd10538	SET_SETDB-like	SET domain (including pre-SET and post-SET domains) found in SET domain bifurcated 1 (SETDB1) and 2 (SETDB2), suppressor of variegation 3-9 homologs, SUV39H1 and SUV39H2, euchromatic histone-lysine N-methyltransferase EHMT1 and EHMT2, and similar proteins. The family includes SET domain bifurcated 1 (SETDB1) and 2 (SETDB2), suppressor of variegation 3-9 homologs, SUV39H1 and SUV39H2, euchromatic histone-lysine N-methyltransferase EHMT1 and EHMT2. SETDB1 (EC 2.1.1.43; also termed ERG-associated protein with SET domain (ESET), histone H3-K9 methyltransferase 4, H3-K9-HMTase 4, or lysine N-methyltransferase 1E (KMT1E)) acts as a histone-lysine N-methyltransferase that specifically trimethylates 'Lys-9' of histone H3 (H3K9me3). It mainly functions in euchromatin regions, thereby playing a central role in the silencing of euchromatic genes. SETDB2 (EC 2.1.1.43; also termed chronic lymphocytic leukemia deletion region gene 8 protein (CLLD8), or lysine N-methyltransferase 1F (KMT1F)) acts as a histone-lysine N-methyltransferase that specifically trimethylates 'Lys-9' of histone H3 (H3K9me3). It is involved in left-right axis specification in early development and mitosis. SUV39H1 (also termed histone H3-K9 methyltransferase 1, H3-K9-HMTase 1, lysine N-methyltransferase 1A, KMT1A, position-effect variegation 3-9 homolog, SUV39H, or Su(var)3-9 homolog 1) and SUV39H2 (also termed histone H3-K9 methyltransferase 2, H3-K9-HMTase 2, lysine N-methyltransferase 1B, KMT1B, or Su(var)3-9 homolog 2), both act as histone-lysine N-methyltransferases that specifically trimethylate 'Lys-9' of histone H3 (H3K9me3) using monomethylated H3 'Lys-9' as substrate. They mainly function in heterochromatin regions, thereby playing central roles in the establishment of constitutive heterochromatin at pericentric and telomere regions. EHMT1 (also termed Eu-HMTase1, G9a-like protein 1, GLP, GLP1, histone H3-K9 methyltransferase 5, H3-K9-HMTase 5, lysine N-methyltransferase 1D, or KMT1D) and EHMT2 (also termed Eu-HMTase2, HLA-B-associated transcript 8, histone H3-K9 methyltransferase 3, H3-K9-HMTase 3, lysine N-methyltransferase 1C, KMT1C, or protein G9a), both act as histone-lysine N-methyltransferases that specifically mono- and dimethylate 'Lys-9' of histone H3 (H3K9me1 and H3K9me2, respectively) in euchromatin. This family also includes the pre-SET domain, which is found in a number of histone methyltransferases (HMTase), N-terminal to the SET domain. Pre-SET domain is a zinc binding motif which contains 9 conserved cysteines that coordinate three zinc ions. It is thought that this region plays a structural role in stabilizing SET domains. Most family members, except for Arabidopsis thaliana SUVH9, contain a post-SET domain which harbors a zinc-binding site.	217
380937	cd10539	SET_ATXR5_6-like	SET domain found in fungal protein lysine methyltransferase SET5 and similar proteins. The family includes Arabidopsis thaliana ATXR5 and ATXR6. Both ATXR5 (also termed protein SET DOMAIN GROUP 15, or TRX-related protein 5) and ATXR6 (also termed protein SET DOMAIN GROUP 34, or TRX-related protein 6) function as histone methyltransferases that specifically monomethylate 'Lys-27' of histone H3 (H3K27me1). They are required for chromatin structure and gene silencing.	138
380938	cd10540	SET_SpSet7-like	SET domain found in Schizosaccharomyces pombe Set7 and similar proteins. Schizosaccharomyces pombe Set7 is a novel histone-lysine N-methyltransferase. The family also includes a viral histone H3 lysine 27 methyltransferase from Paramecium bursaria Chlorella virus 1 (PBCV-1).	112
380939	cd10541	SET_SETDB	SET domain (including pre-SET and post-SET domains) found in SET domain bifurcated 1 (SETDB1), SET domain bifurcated 2 (SETDB2), and similar proteins. SETDB1 (EC 2.1.1.43; also termed ERG-associated protein with SET domain (ESET), histone H3-K9 methyltransferase 4, H3-K9-HMTase 4, or lysine N-methyltransferase 1E (KMT1E)) acts as a histone-lysine N-methyltransferase that specifically trimethylates 'Lys-9' of histone H3 (H3K9me3). It mainly functions in euchromatin regions, thereby playing a central role in the silencing of euchromatic genes. SETDB2 (EC 2.1.1.43; also termed chronic lymphocytic leukemia deletion region gene 8 protein (CLLD8), or lysine N-methyltransferase 1F (KMT1F)) acts as a histone-lysine N-methyltransferase that specifically trimethylates 'Lys-9' of histone H3 (H3K9me3). It is involved in left-right axis specification in early development and mitosis.	236
380940	cd10542	SET_SUV39H	SET domain (including pre-SET and post-SET domains) found in suppressor of variegation 3-9 homologs, SUV39H1, SUV39H2 and similar proteins. This family includes SUV39H1 (also termed histone H3-K9 methyltransferase 1, H3-K9-HMTase 1, lysine N-methyltransferase 1A, KMT1A, position-effect variegation 3-9 homolog, SUV39H, or Su(var)3-9 homolog 1) and SUV39H2 (also termed histone H3-K9 methyltransferase 2, H3-K9-HMTase 2, lysine N-methyltransferase 1B, KMT1B, or Su(var)3-9 homolog 2), both act as histone-lysine N-methyltransferases that specifically trimethylate 'Lys-9' of histone H3 (H3K9me3) using monomethylated H3 'Lys-9' as substrate. They mainly function in heterochromatin regions, thereby playing central roles in the establishment of constitutive heterochromatin at pericentric and telomere regions. Also included are Schizosaccharomyces pombe H3K9 methyltransferase Clr4 (SUV39H homolog) and Neurospora crassa DIM-5, both of which also methylate 'Lys-9' of histone H3.	245
380941	cd10543	SET_EHMT	SET domain (including pre-SET and post-SET domains) found in euchromatic histone-lysine N-methyltransferase EHMT1, EHMT2 and similar proteins. This family includes EHMT1 (also termed Eu-HMTase1, G9a-like protein 1, GLP, GLP1, histone H3-K9 methyltransferase 5, H3-K9-HMTase 5, lysine N-methyltransferase 1D, or KMT1D) and EHMT2 (also termed Eu-HMTase2, HLA-B-associated transcript 8, histone H3-K9 methyltransferase 3, H3-K9-HMTase 3, lysine N-methyltransferase 1C, KMT1C, or protein G9a), both act as histone-lysine N-methyltransferases that specifically mono- and dimethylate 'Lys-9' of histone H3 (H3K9me1 and H3K9me2, respectively) in euchromatin.	231
380942	cd10544	SET_SETMAR	SET domain (including pre-SET and post-SET domains) found in SET domain and mariner transposase fusion protein (SETMAR) and similar proteins. SETMAR (also termed metnase) is a DNA-binding protein that is indirectly recruited to sites of DNA damage through protein-protein interactions. It has a sequence-specific DNA-binding activity recognizing the 19-mer core of the 5'-terminal inverted repeats (TIRs) of the Hsmar1 element and displays a DNA nicking and end joining activity. SETMAR also acts as a histone-lysine N-methyltransferase that methylates 'Lys-4' and 'Lys-36' of histone H3. It specifically mediates dimethylation of H3 'Lys-36' at sites of DNA double-strand break and may recruit proteins required for efficient DSB repair through non-homologous end-joining.	254
380943	cd10545	SET_AtSUVH-like	SET domain found in Arabidopsis thaliana histone H3-K9 methyltransferases (SUVHs) and similar proteins. Arabidopsis thaliana SUVH protein (also termed suppressor of variegation 3-9 homolog protein) is a histone-lysine N-methyltransferase that methylates 'Lys-9' of histone H3. H3 'Lys-9' methylation represents a specific tag for epigenetic transcriptional repression. Some family members contain a post-SET domain which binds a Zn2+ ion. Most family members, except for Arabidopsis thaliana SUVH9, contain a post-SET domain which harbors a zinc-binding site.	232
240598	cd10546	VKOR	Vitamin K epoxide reductase (VKOR) family. VKOR (also named VKORC1) is an integral membrane protein that catalyzes the reduction of vitamin K 2,3-epoxide and vitamin K to vitamin K hydroquinone, an essential co-factor subsequently used in the gamma-carboxylation of glutamic acid residues in blood coagulation enzymes. This family includes enzymes that are present in vertebrates, Drosophila, plants, bacteria, and archaea. All homologs of VKOR contain an active site CXXC motif, which is switched between reduced and disulfide-bonded states during the reaction cycle. In some plant and bacterial homologs, the VKOR domain is fused with domains of the thioredoxin family of oxidoreductases which may function as redox partners in initiating the reduction cascade. Warfarin, an oral anticoagulant widely used in medicine and as a rodenticide, inhibits the activity of VKOR, resulting in decreased levels of reduced vitamin K, which is required for the function of several clotting factors. However, the anticoagulation effect of warfarin is significantly associated with polymorphism of certain genes, including VKORC1. Interestingly, in rodents, an adaptive trait appears to have evolved convergently by selection on new or standing genetic polymorphisms in VKORC1 as well as by adaptive introgressive hybridization between species, likely brought about by human-mediated dispersal.	126
380415	cd10547	cupin_BacB_C	Bacillus subtilis bacilysin and related proteins, C-terminal cupin domain. This model represents the C-terminal domain of bacilysin (BacB, also known as AerE in Microcystis aeruginosa), a non-ribosomally synthesized dipeptide antibiotic that is produced and excreted by certain strains of Bacillus subtilis. Bacilysin is an oxidase that catalyzes the synthesis of 2-oxo-3-(4-oxocyclohexa-2,5-dienyl)propanoic acid, a precursor to L-anticapsin. Each bacilysin monomer has two tandem cupin domains. It is active against a wide range of bacteria and some fungi. The antimicrobial activity of bacilysin is antagonized by glucosamine and N-acetyl glucosamine, indicating that bacilysin interferes with glucosamine synthesis, and thus, with the synthesis of microbial cell walls. AerE is thought to be involved in the formation of the 2-carboxy-6-hydroxyoctahydroindole (Choi) moiety found on all aeruginosin tetrapeptides, based on gene knock-out experiments. It is encoded by the aerE gene of the aerABCDEF aeruginosin biosynthesis gene cluster in Microcystis aeruginosa. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold.	92
380416	cd10548	cupin_CDO	cysteine dioxygenase, cupin domain. This family contains cysteine dioxygenase (CDO; EC 1.13.11.20), which catalyzes the conversion of cysteine to cysteine sulfinic acid, the first step in the biosynthesis of essential oxidized cysteine metabolites such as sulfate, hypotaurine, and taurine. CDO also plays an important role in the regulation of intracellular cysteine levels in mammals; CDO expression is altered in cancer cells, and abnormal or deficient CDO activity has been linked to Parkinson's disease, Alzheimer's disease, and rheumatoid arthritis. CDO is an iron-dependent thiol dioxygenase that uses molecular oxygen to oxidize the sulfhydryl group of cysteine to generate cysteine sulfinic acid. The CDO active site contains an amino acid-derived cofactor. These enzymes are found in prokaryotes as well as eukaryotes and belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	100
319871	cd10549	MtMvhB_like	Uncharacterized polyferredoxin-like protein. This family contains uncharacterized polyferredoxin protein similar to Methanobacterium thermoautotrophicum MvhB. The mvhB is a gene of the methylviologen-reducing hydrogenase operon. It is predicted to contain 12 [4Fe-4S] clusters, and was therefore suggested to be a polyferredoxin. As a subfamily of the beta subunit of the DMSO Reductase (DMSOR) family, it is predicted to function as electron carrier in the reducing reaction.	128
319872	cd10550	DMSOR_beta_like	uncharacterized subfamily of DMSO Reductase beta subunit family. This family consists of the small beta iron-sulfur (FeS) subunit of the DMSO Reductase (DMSOR) family. Members of this family also contain a large, periplasmic molybdenum-containing alpha subunit and may have a small gamma subunit as well.  Examples of heterodimeric members with alpha and beta subunits include arsenite oxidase, and tungsten-containing formate dehydrogenase (FDH-T) while   heterotrimeric members containing alpha, beta, and gamma subunits include formate dehydrogenase-N (FDH-N), and nitrate reductase (NarGHI).  The beta subunit contains four Fe4/S4 and/or Fe3/S4 clusters which transfer the electrons from the alpha subunit to a hydrophobic integral membrane protein, presumably a cytochrome containing two b-type heme groups. The reducing equivalents are then transferred to menaquinone, which finally reduces the electron-accepting enzyme system.	130
319873	cd10551	PsrB	polysulfide reductase beta (PsrB) subunit. This family includes the beta subunit of bacterial polysulfide reductase (PsrABC), an integral membrane-bound enzyme responsible for quinone-coupled reduction of polysulfides, a process important in extreme environments such as deep-sea vents and hot springs.  Polysulfide reductase contains three subunits: a catalytic subunit PsrA, an electron transfer PsrB subunit and the hydrophobic transmembrane PsrC subunit. PsrB belongs to the DMSO reductase superfamily that contains [4Fe-4S] clusters which transfer the electrons from the A subunit to the hydrophobic integral membrane C subunit via the B subunit. In Shewanella oneidensis, which has highly diverse anaerobic respiratory pathways, PsrABC is responsible for H2S generation as well as its regulation via respiration of sulfur species. PsrB transfers electrons from PsrC (serving as quinol oxidase) to the catalytic subunit PsrA for reduction of corresponding electron acceptors. It has been shown that T. thermophilus polysulfide reductase could be a key energy-conserving enzyme of the respiratory chain, using polysulfide as the terminal electron acceptor and pumping protons across the membrane.	185
319874	cd10552	TH_beta_N	N-terminal FeS domain of pyrogallol-phloroglucinol transhydroxylase (TH), beta subunit. This family includes the beta subunit of pyrogallol-phloroglucinol transhydroxylase (TH), a cytoplasmic molybdenum (Mo) enzyme from anaerobic microorganisms like Pelobacter acidigallici and Desulfitobacterium hafniense which catalyzes the conversion of pyrogallol to phloroglucinol, an important building block of plant polymers. TH belongs to the DMSO reductase (DMSOR) family; it is a heterodimer consisting of a large alpha catalytic subunit and a small beta FeS subunit.  The beta subunit has two domains with the N-terminal domain containing three [4Fe-4S] centers and a seven-stranded, mainly antiparallel beta-barrel domain. In the anaerobic bacterium Pelobacter acidigallici, gallic acid, pyrogallol, phloroglucinol, or phloroglucinol carboxylic acid are fermented to three molecules of acetate (plus CO2), and TH is the key enzyme in the fermentation pathway, which converts pyrogallol to phloroglucinol in the absence of O2.	186
319875	cd10553	PhsB_like	uncharacterized beta subfamily of DMSO Reductase similar to Desulfonauticus sp PhsB. This family includes beta FeS subunits of anaerobic DMSO reductase (DMSOR) superfamily that have yet to be characterized. DMSOR consists of a large, periplasmic molybdenum-containing alpha subunit as well as a small beta FeS subunit, and may also have a small gamma subunit.  Examples of heterodimeric members with alpha and beta subunits include arsenite oxidase, and the tungsten-containing formate dehydrogenase (FDH-T).  Examples of heterotrimeric members containing alpha, beta, and gamma subunits include formate dehydrogenase-N (FDH-N), and nitrate reductase (NarGHI).  The beta subunit contains four Fe4/S4 and/or Fe3/S4 clusters which transfer the electrons from the alpha subunit to a hydrophobic integral membrane protein, presumably a cytochrome containing two b-type heme groups. The reducing equivalents are then transferred to menaquinone, which finally reduces the electron-accepting enzyme system.	146
319876	cd10554	HycB_like	HycB, HydN and similar proteins. This family includes HycB, the FeS subunit of a membrane-associated formate hydrogenlyase system (FHL-1) in Escherichia coli that breaks down formate, produced during anaerobic fermentation, to H2 and CO2.   FHL-1 consists of formate dehydrogenase H (FDH-H) and the hydrogenase 3 complex (Hyd-3). HycB is thought to code for the [4Fe-4S] ferredoxin subunit of hydrogenase 3, which functions as an intermediate electron carrier protein between hydrogenase 3 and formate dehydrogenase. HydN codes for the [4Fe-4S] ferredoxin subunit of FDH-H; a hydN in-frame deletion mutation causes only weak reduction in hydrogenase activity, but loss of more than 60% of FDH-H activity. This pathway is only active at low pH and high formate concentrations, and is thought to provide a detoxification/de-acidification system countering the buildup of formate during fermentation.	149
319877	cd10555	EBDH_beta	beta subunit of ethylbenzene-dehydrogenase (EBDH). This subfamily includes ethylbenzene dehydrogenase (EBDH, EC 1.17.99.2), a member of the DMSO reductase family.  EBDH oxidizes the hydrocarbon ethylbenzene to (S)-1-phenylethanol. It is a heterotrimer, with the alpha subunit containing the catalytic center with a molybdenum held by two molybdopterin-guanine dinucleotides, the beta subunit containing four iron-sulfur clusters (the electron transfer subunit) and the gamma subunit containing a methionine and a lysine as axial heme ligands.  During catalysis, electrons produced by substrate oxidation are transferred to a heme in the gamma subunit and then presumably to a separate cytochrome involved in nitrate respiration.	316
319878	cd10556	SER_beta	Beta subunit of selenate reductase. This subfamily includes the beta FeS subunit of selenate reductase (SER), a member of the DMSO reductase family. SER catalyzes the reduction of selenate to selenite in bacterial species that can obtain energy by respiring anaerobically with selenate as the terminal electron acceptor. The enzyme comprises three subunits SerABC, forming a heterotrimer, with the catalytic component (alpha-subunit), iron-sulfur protein (beta-subunit) and monomeric b-type heme-containing gamma subunit. The beta subunit coordinates one [3Fe-4S] cluster and three [4Fe-4S] clusters and functions as an electron carrier.	287
319879	cd10557	NarH_beta-like	beta subunit of nitrate reductase A (NarH) and similar proteins. This subfamily includes nitrate reductase A, a member of the DMSO reductase family. The respiratory nitrate reductase complex (NarGHI) from E. coli is a heterotrimer, with the catalytic subunit (NarG) with a molybdo-bis (molybdopterin guanine dinucleotide) cofactor and an [Fe-S] cluster, the electron transfer subunit (NarH) with four [Fe-S] clusters, and the integral membrane subunit (NarI) with two b-type hemes.  Nitrate reductase A often forms a respiratory chain with the formate dehydrogenase via the lipid soluble quinol pool. Electron transfer from formate to nitrate is coupled to proton translocation across the cytoplasmic membrane generating proton motive force by a redox loop mechanism. Demethylmenaquinol (DMKH2) has been shown to be a good substrate for NarGHI in nitrate respiration in E. coli.	363
319880	cd10558	FDH-N	The beta FeS subunit of formate dehydrogenase-N (FDH-N). This subfamily contains beta FeS subunit of formate dehydrogenase-N (FDH-N), a member of the DMSO reductase family.  FDH-N is involved in the major anaerobic respiratory pathway in the presence of nitrate, catalyzing the oxidation of formate to carbon dioxide at the expense of nitrate reduction to nitrite.  Thus, FDH-N is a major component of nitrate respiration of Escherichia coli. This integral membrane enzyme forms a heterotrimer; the alpha-subunit (FDH-G) is the catalytic site of formate oxidation and membrane-associated, incorporating a selenocysteine (SeCys) residue and a [4Fe/4S] cluster in addition to two bis-MGD cofactors, the beta subunit (FDH-H) contains four [4Fe/4S] clusters which transfer the electrons from the alpha subunit to the gamma-subunit (FDH-I), a hydrophobic integral membrane protein, presumably a cytochrome containing two b-type heme groups.	208
319881	cd10559	W-FDH	tungsten-containing formate dehydrogenase, small subunit. This subfamily contains beta subunit of Tungsten-containing formate dehydrogenase (W-FDH), a member of the DMSO reductase family. W-FDH contains a tungsten instead of molybdenum at the catalytic center. This enzyme seems to be exclusively found in organisms such as hyperthermophilic archaea that live in extreme environments. It is a heterodimer of a large and a small subunit; the large subunit harbors the W site and one [4Fe-4S] center and the small subunit, containing three [4Fe-4S] clusters, functions to transfer electrons.	200
319882	cd10560	FDH-O_like	beta subunit of formate dehydrogenase O (FDH-O) and similar proteins. This subfamily includes beta subunit of formate dehydrogenase family O (FDH-O), which is highly homologous to formate dehydrogenase N (FDH-N), a member of the DMSO reductase family. In E. coli three formate dehydrogenases are synthesized that are capable of oxidizing formate; Fdh-H, couples formate disproportionation to hydrogen and CO2, and is part of the cytoplasmically oriented formate hydrogenlyase complex, while FDH-N and FDH-O indicate their respective induction after growth with nitrate and oxygen. Little is known about FDH-O, although it shows formate oxidase activity during aerobic growth and is also synthesized during nitrate respiration, similar to FDH-N.	225
319883	cd10561	HybA_like	the FeS subunit of hydrogenase 2. This subfamily includes the beta-subunit of hydrogenase 2 (Hyd-2), an enzyme that catalyzes the reversible oxidation of H2 to protons and electrons. Hyd-2 is membrane-associated and forms an unusual heterotetrameric [NiFe]-hydrogenase in that it lacks the typical cytochrome b membrane anchor subunit that transfers electrons to the quinone pool. The electron transfer subunit of Hyd-2 (HybA) which is predicted to contain four iron-sulfur clusters, is essential for electron transfer from Hyd-2 to menaquinone/demethylmenaquinone (MQ/DMQ) to couple hydrogen oxidation to fumarate reduction.	196
319884	cd10562	FDH_b_like	uncharacterized subfamily of beta subunit of formate dehydrogenase. This subfamily includes the beta-subunit of formate dehydrogenases that are as yet uncharacterized.  Members of the DMSO reductase family include formate dehydrogenase N and O (FDH-N, FDH-O) and tungsten-containing formate dehydrogenase (W-FDH) and other similar proteins. FDH-N, a major component of nitrate respiration of Escherichia coli, is involved in the major anaerobic respiratory pathway in the presence of nitrate, catalyzing the oxidation of formate to carbon dioxide at the expense of nitrate reduction to nitrite.  It forms a heterotrimer; the alpha-subunit (FDH-G) is the catalytic site of formate oxidation and membrane-associated, incorporating a selenocysteine (SeCys) residue and a [4Fe/4S] cluster in addition to two bis-MGD cofactors, the beta subunit (FDH-H) contains four [4Fe/4S] clusters which transfer the electrons from the alpha subunit to the gamma-subunit (FDH-I), a hydrophobic integral membrane protein, presumably a cytochrome containing two b-type heme groups. W-FDH contains a tungsten instead of molybdenum at the catalytic center. This enzyme seems to be exclusively found in organisms such as hyperthermophilic archaea that live in extreme environments. It is a heterodimer of a large and a small subunit; the large subunit harbors the W site and one [4Fe-4S] center and the small subunit, containing three [4Fe-4S] clusters, functions to transfer electrons.	161
319885	cd10563	CooF_like	CooF, iron-sulfur subunit of carbon monoxide dehydrogenase. This family includes CooF, the iron-sulfur subunit of carbon monoxide dehydrogenase (CODH), found in anaerobic bacteria and archaea. Carbon monoxide dehydrogenase is a key enzyme for carbon monoxide (CO) metabolism, where CooF is the proposed mediator of electron transfer between CODH and the CO-induced hydrogenase, catalyzing the reaction that uses CO as a single carbon and energy source, and producing only H2 and CO2. The iron-sulfur subunit contains four Fe4/S4 and/or Fe3/S4 clusters which transfer the electrons in the protein complex during reaction.	140
319886	cd10564	NapF_like	NapF, iron-sulfur subunit of periplasmic nitrate reductase. This family contains NapF protein, the iron-sulfur subunit of periplasmic nitrate reductase. The periplasmic nitrate reductase NapABC of Escherichia coli likely functions during anaerobic growth in low-nitrate environments; napF operon expression is activated by cyclic AMP receptor protein (Crp). NapF is a subfamily of the beta subunit of DMSO reductase (DMSOR) family.  DMSOR family members have a large, periplasmic molybdenum-containing alpha subunit as well as a small beta FeS subunit, and may also have a small gamma subunit.  The beta subunit contains four Fe4/S4 and/or Fe3/S4 clusters which transfer the electrons from the alpha subunit to a hydrophobic integral membrane protein, presumably a cytochrome containing two b-type heme groups. The reducing equivalents are then transferred to menaquinone, which finally reduces the electron-accepting enzyme system.	139
349488	cd10566	MDM2_like	p53-binding domain found in E3 ubiquitin-protein ligase MDM2, MDM4, and similar proteins. MDM2 (also termed HDM2) and MDM4 (also termed MDMX or HDMX) are the primary negative regulators of p53 tumor suppressor. They have non-redundant roles in the regulation of p53. MDM2 mainly functions to control p53 stability, while MDM4 controls p53 transcriptional activity. Both MDM2 and MDM4 contain an N-terminal p53-binding domain, a RanBP2-type zinc finger (zf-RanBP2) domain near the central acidic region, and a C-terminal RING domain. Mdm2 can form homo-oligomers through its RING domain and display E3 ubiquitin ligase activity that catalyzes the attachment of ubiquitin to p53 as an essential step in the regulation of its level in cells. Despite its RING domain and structural similarity with MDM2, MDM4 does not homo-oligomerize and lacks ubiquitin-ligase function, but inhibits the transcriptional activity of p53. In addition, both their RING domains are responsible for the hetero-oligomerization, which is crucial for the suppression of p53 activity during embryonic development and the recruitment of E2 ubiquitin-conjugating enzymes. Moreover, MDM2 and MDM4 can be phosphorylated and destabilized in response to DNA damage stress. In response to ribosomal stress, MDM2-mediated p53 ubiquitination and degradation can be inhibited through the interaction with ribosomal proteins L5, L11 and L23. However, MDM4 is not bound to ribosomal proteins, suggesting its different response to regulation by small basic proteins such as ribosomal proteins and ARF.	75
349489	cd10567	SWIB-MDM2_like	SWIB/MDM2 domain found in SWIB/MDM2 homologous proteins. This family includes Schizosaccharomyces pombe upstream activation factor subunit spp27, Saccharomyces cerevisiae upstream activation factor subunit UAF30, Chlamydiae DNA topoisomerase/SWIB domain fusion protein, Arabidopsis thaliana zinc finger CCCH domain-containing proteins, AtC3H19 and AtC3H44, and similar proteins. S. pombe spp27, also termed upstream activation factor 27 KDa subunit (p27), or upstream activation factor 30 KDa subunit (p30), or upstream activation factor subunit uaf30, is a component of the UAF (upstream activation factor) complex which interacts with the upstream element of the RNA polymerase I promoter and forms a stable preinitiation complex. S. cerevisiae UAF30, also termed upstream activation factor 30 KDa subunit (p30), is a non-essential component of the UAF. It seems to play a role in silencing transcription by RNA polymerase II. The SWIB domain found in Chlamydiae DNA topoisomerase may play a role in chromatin condensation-decondensation, which is characteristic of the chlamydial developmental cycle and not found in any other types of bacteria.  AtC3H19, also termed protein needed for RDR2-independent DNA methylation (NERD), is a plant-specific GW repeat- and PHD finger-containing protein that plays a central role in integrating RNA silencing and chromatin signals in 21 nt small-interfering RNA (siRNA)-dependent DNA methylation on the cytosine pathway, leading to transcriptional gene silencing of specific sequences. This family also includes many uncharacterized proteins containing two copies of SWIB/MDM2 domain.	71
349490	cd10568	SWIB_like	SWIB domain found in the 60 kDa subunit of the ATP-dependent SWI/SNF chromatin-remodeling complexes and similar proteins. SWIB domain is a conserved region found within proteins in the SWI/SNF family of complexes. SWI/SNF complex proteins display helicase and ATPase activities and are thought to regulate transcription of certain genes by altering the chromatin structure around those genes. The mammalian complexes are made up of 9-12 proteins called BAFs (BRG1-associated factors), among which the BAF60 subunit serves as a key link between the core complexes and specific transcriptional factors. The BAF60 subunit has at least three members: BAF60a, which is ubiquitous, BAF60b and BAF60c, which are expressed in muscle and pancreatic tissues, respectively. The family also includes Saccharomyces cerevisiae transcription regulatory protein SNF12 and remodel the structure of chromatin complex subunit 6 (RSC6), and Schizosaccharomyces pombe SWI/SNF and RSC complexes subunit SSR3. SNF12, also termed 73-kDa subunit of the SWI/SNF transcriptional regulatory complex, or SWI/SNF complex component SWP73, is involved in transcriptional activation and repression of select genes by chromatin remodeling (alteration of DNA-nucleosome topology). RSC6 and SSR3 are components of the RSC, which is involved in transcription regulation and nucleosome positioning. RSC6 is essential for mitotic growth and suppresses formamide sensitivity of the RSC8 mutants.	69
269973	cd10569	FERM_C_Talin	FERM domain C-lobe/F3 of Talin. Talin (also called filopodin) plays an important role in initiating actin filament growth in motile cell protrusions. It is responsible for linking the cytoplasmic domains of integrins to the actin-based cytoskeleton, and is involved in vinculin, integrin and actin interactions. At the leading edge of motile cells, talin colocalises with the hyaluronan receptor layilin in transient adhesions, some of which become more stable focal adhesions (FA). During this maturation process, layilin is replaced with integrins, where localized production of PI(4,5)P(2) by type 1 phosphatidyl inositol phosphate kinase type 1gamma (PIPK1gamma) is thought to play a role in FA assembly. Talins are composed of an N-terminal region FERM domain which is made up of 3 subdomains (N, alpha-, and C-lobe; or A-lobe, B-lobe, and C-lobe; or F1, F2, and F3) connected by short linkers, a talin rod which binds vinculin, and a conserved C-terminal region with actin- and integrin-binding sites. There are 2 additional actin-binding domains, one in the talin rod and the other in the FERM domain. Both the F2 and F3 FERM subdomains contribute to F-actin binding. Subdomain F3 of the FERM domain contains overlapping binding sites for integrin cytoplasmic domains and for the type 1 gamma isoform of PIP-kinase (phosphatidylinositol 4-phosphate 5-kinase). The FERM domain has a cloverleaf tripartite structure. F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites.	92
275393	cd10570	PH-GRAM	Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. Mutations in this family cause the human neuromuscular disorders myotubular myopathy and type 4B Charcot-Marie-Tooth syndrome. 6 of the 13 MTMRs (MTMRs 5, 9-13) contain naturally occurring substitutions of residues required for catalysis by PTP family enzymes. Although these proteins are predicted to be enzymatically inactive, they are thought to function as antagonists of endogenous phosphatase activity or interaction modules. Most MTMRs contain a N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, a PTP domain (which may be active or inactive), a SET-interaction domain, and a C-terminal coiled-coil region. In addition some members contain DENN domain N-terminal to the PH-GRAM domain and FYVE, PDZ, and PH domains C-terminal to the coiled-coil region. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold.	94
269975	cd10571	PH_beta_spectrin	Beta-spectrin pleckstrin homology (PH) domain. Beta spectrin binds actin and functions as a major component of the cytoskeleton underlying cellular membranes. Beta spectrin consists of multiple spectrin repeats followed by a PH domain, which binds to inositol-1,4,5-trisphosphate. The PH domain of beta-spectrin is thought to play a role in the association of spectrin with the plasma membrane of cells. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	106
269976	cd10572	PH_RhoGEF3_XPLN	Rho guanine nucleotide exchange factor 3 Pleckstrin homology (PH) domain. RhoGEF3/XPLN, a Rho family GEF, preferentially stimulates guanine nucleotide exchange on RhoA and RhoB, but not RhoC, RhoG, Rac1, or Cdc42 in vitro. It also possesses transforming activity. RhoGEF3/XPLN contains a tandem Dbl homology and PH domain, but lacks homology with other known functional domains or motifs. It is expressed in the brain, skeletal muscle, heart, kidney, platelets, and macrophage and neuronal cell lines. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	133
269977	cd10573	PH_DAPP1	Dual Adaptor for Phosphotyrosine and 3-Phosphoinositides Pleckstrin homology (PH) domain. DAPP1 (also known as PHISH/3' phosphoinositide-interacting SH2 domain-containing protein or Bam32) plays a role in B-cell activation and has potential roles in T-cell and mast cell function. DAPP1 promotes B cell receptor (BCR) induced activation of Rho GTPases Rac1 and Cdc42, which feed into mitogen-activated protein kinases (MAPK) activation pathways and affect cytoskeletal rearrangement. DAPP1 can also regulate BCR-induced activation of extracellular signal-regulated kinase (ERK), and c-jun NH2-terminal kinase (JNK). DAPP1 contains an N-terminal SH2 domain and a C-terminal pleckstrin homology (PH) domain with a single tyrosine phosphorylation site located centrally. DAPP1 binds strongly to both PtdIns(3,4,5)P3 and PtdIns(3,4)P2. The PH domain is essential for plasma membrane recruitment of PI3K upon cell activation. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	96
269978	cd10574	EVH1_SPRED-like	Sprouty-related EVH1 domain-containing-like proteins EVH1 domain. The Spred family has the following domains: an N-terminal EVH1 domain, a unique KBD (c-Kit kinase binding) domain that is phosphorylated by the stem cell factor receptor c-Kit, and a C-terminal cysteine-rich SPR (Sprouty-related) domain which is involved in membrane localization. There are 3 Spred proteins: Spred1 which interacts with both Ras and Raf through its SPR domain; Spred2 which is the most abundant isoform; and Spred3 which has a non-functional KBD and maintains the inhibitory action on Raf. Legius syndrome is caused by heterozygous mutations in Spred1. Both EVH1 and SPR domains are involved in the inhibition of the MAP kinase pathway by Spred proteins. The specific function of the Spred2 EVH1 domain is unknown and there are no known interacting proteins to date. It is thought that its EVH1 domain will have a fourth distinct peptide binding mechanism within the EVH1 family. The EVH1 domains are part of the PH domain superfamily. There are 5 EVH1 subfamilies: Enabled/VASP, Homer/Vesl, WASP, Dcp1, and Spred. Ligands are known for three of the EVH1 subfamilies, all of which bind proline-rich sequences: the Enabled/VASP family binds to FPPPP peptides, the Homer/Vesl family binds PPxxF peptides, and the WASP family binds LPPPEP peptides. EVH1 has a PH-like fold, despite having minimal sequence similarity to PH or PTB domains.	113
276901	cd10575	TNFRSF6B	Tumor necrosis factor receptor superfamily member 6B (TNFRSF6B), also known as decoy receptor 3 (DcR3). The subfamily TNFRSF6B is also known as decoy receptor 3 (DcR3), M68, or TR6. This protein is a soluble receptor without death domain and cytoplasmic domain, and secreted by cells. It acts as a decoy receptor that competes with death receptors for ligand binding. It is a pleiotropic immunomodulator and biomarker for inflammatory diseases, autoimmune diseases, and cancer. Over-expression of this gene has been noted in several cancers, including pancreatic carcinoma, and gastrointestinal tract tumors. It can neutralize the biological effects of three tumor necrosis factor superfamily (TNFSF) members: TNFSF6 (Fas ligand/FasL/CD95L) and TNFSF14 (LIGHT) which are both involved in apoptosis and inflammation, and TNFSF15 (TNF-like molecule 1A/TL1A), which is a T cell co-stimulator and involved in gut inflammation. DcR3 is a novel inflammatory marker; higher DcR3 levels strongly correlate with inflammation and independently predict cardiovascular and all-cause mortality in chronic kidney disease (CKD) patients on hemodialysis. Increased synovial inflammatory cells infiltration in rheumatoid arthritis and ankylosing spondylitis is also associated with the elevated DcR3 expression. In cartilaginous fish, mRNA expression of DcR3 in the thymus and leydig, which are the representative lymphoid tissues of elasmobranchs, suggests that DcR3 may act as a modulator in the immune system. Interestingly, in banded dogfish (Triakis scyllia), DcR3 mRNA is strongly expressed in the gill, compared with human expression in the normal lung; both are respiratory organs, suggesting potential relevance of DcR3 to respiratory function.	163
276902	cd10576	TNFRSF1A	Tumor necrosis factor receptor superfamily member 1A (TNFRSF1A), also known as TNFR1. TNFRSF1A (also known as type I TNFR, TNFR1, DR1, TNFRSF1A, CD120a, p55) binds TNF-alpha, through the death domain (DD), and activates NF-kappaB, mediates apoptosis and activates signaling pathways controlling inflammatory, immune, and stress responses. It mediates signal transduction by interacting with antiapoptotic protein BCL2-associated athanogene 4 (BAG4/SODD) and adaptor proteins TRAF2 and TRADD that play regulatory roles. The human genetic disorder called tumor necrosis factor associated periodic syndrome (TRAPS), or periodic fever syndrome, is associated with germline mutations of the extracellular domains of this receptor, possibly due to impaired receptor clearance. TNFRSF1A polymorphisms rs1800693 and rs4149584 are associated with elevated risk of multiple sclerosis. Serum levels of TNFRSF1A are elevated in schizophrenia and bipolar disorder, and high levels are also associated with cognitive impairment and dementia. Patients with idiopathic recurrent acute pericarditis (IRAP), presumed to be an autoimmune process, have also been shown to carry rare mutations (R104Q and D12E) in the TNFRSF1A gene.	130
276903	cd10577	TNFRSF1B	Tumor necrosis factor receptor superfamily member 1B (TNFRSF1B), also known as TNFR2. TNFRSF1B (also known as TNFR2, type 2 TNFR, TNFBR, TNFR80, TNF-R75, TNF-R-II, p75, CD120b) binds TNF-alpha, but lacks the death domain (DD) that is associated with the cytoplasmic domain of TNFRSF1A (TNFR1). It is inducible and expressed exclusively by oligodendrocytes, astrocytes, T cells, thymocytes, myocytes, endothelial cells, and in human mesenchymal stem cells. TNFRSF1B protects oligodendrocyte progenitor cells (OLGs) against oxidative stress, and induces the up-regulation of cell survival genes. While pro-inflammatory and pathogen-clearing activities of TNF are mediated mainly through activation of TNFRSF1A, a strong activator of NF-kappaB, TNFRSF1B is more responsible for suppression of inflammation. Although the affinities of both receptors for soluble TNF are similar, TNFRSF1B is sometimes more abundantly expressed and thought to associate with TNF, thereby increasing its concentration near TNFRSF1A receptors, and making TNF available to activate TNFRSF1A (a ligand-passing mechanism).	163
276904	cd10578	TNFRSF3	Tumor necrosis factor receptor superfamily member 3 (TNFRSF3), also known as lymphotoxin beta receptor (LTBR). TNFRSF3 (also known as lymphotoxin beta receptor, LTbetaR, CD18, TNFCR, TNFR3, D12S370, TNFR-RP, TNFR2-RP, LT-BETA-R, TNF-R-III) plays a role in signaling during development of lymphoid and other organs, lipid metabolism, immune response, and programmed cell death. Its ligands include lymphotoxin (LT) alpha/beta membrane form (heterotrimer) and tumor necrosis factor ligand superfamily member 14 (also known as LIGHT). TNFRSF3 agonism by these ligands initiates canonical, as well as non-canonical nuclear factor-kappaB (NF-kappaB) signaling, and preferentially results in the translocation of p52-RELB complexes into the nucleus. While these ligands are often expressed by T and B cells, TNFRSF3 is conspicuously absent from T and B lymphocytes and NK cells, suggesting that signaling may be unidirectional for TNFRSF3. Activity of this receptor has also been linked to carcinogenesis; it helps trigger apoptosis and can also lead to release of interleukin 8 (IL8). Alternatively spliced transcript variants encoding multiple isoforms have been observed.	158
276905	cd10579	TNFRSF6	Tumor necrosis factor receptor superfamily member 6 (TNFRSF6), also known as fas cell surface death receptor (Fas). TNFRSF6 (also known as fas cell surface death receptor (FasR) or Fas, APT1, CD95, FAS1, APO-1, FASTM, ALPS1A) contains a death domain and plays a central role in the physiological regulation of programmed cell death. It has been implicated in the pathogenesis of various malignancies and diseases of the immune system. The receptor interacts with the Fas ligand (FasL), allowing the formation of a death-inducing signaling complex that includes Fas-associated death domain protein (FADD), caspase 8, and caspase 10; autoproteolytic processing of the caspases in the complex triggers a downstream caspase cascade, leading to apoptosis. This receptor has also been shown to activate NF-kappaB, MAPK3/ERK1, and MAPK8/JNK, and is involved in transducing the proliferating signals in normal diploid fibroblast and T cells. Of the several alternatively spliced transcript variants, some are candidates for nonsense-mediated mRNA decay (NMD). Isoforms lacking the transmembrane domain may negatively regulate the apoptosis mediated by the full length isoform.	129
276906	cd10580	TNFRSF10	Tumor necrosis factor receptor superfamily member 10 (TNFRSF10), includes TNFRSF10A (DR4), TNFRSF10B (DR5), TNFRSF10C (DcR1) and TNFRSF10D (DcR2). TNFRSF10 family contains TNFRSF10A (also known as DR4, Apo2, TRAIL-R1, CD261), TNFRSF10B (also known as DR5, KILLER, TRICK2A, TRAIL-R2, TRICKB, CD262), TNFRSF10C (also known as DcR1, TRAIL-R3, LIT, TRID, CD263), and TNFRSF10D (also known as DcR2, TRUNDD, TRAIL-R4, CD264). Tumor necrosis factor-related apoptosis inducing ligand (TNFSF10/TRAIL) binds to all 4 receptors. DR4 (TRAIL-R1) and DR5 (TRAIL-R2) are membrane-bound and contain a death domain in their intracellular portion, which is able to transmit an apoptotic signal, thus often called death receptors. In contrast, DcR1 (TRAIL-R3), which lacks the complete intracellular portion and DcR2 (TRAIL-R4), which has a truncated cytoplasmic death domain, do not transmit an apoptotic signal, thus known as decoy receptors. Apoptosis mediated by DR4 and DR5 requires Fas (TNFRSF6)-associated via death domain (FADD), a death domain containing adaptor protein. Two transcript variants encoding different isoforms and one non-coding transcript have been found for TNFRSF10B/DR5. DcR1 appears to function as an antagonistic receptor that protects cells from TRAIL-induced apoptosis; it has been found to be a p53-regulated DNA damage-inducible gene. The expression of this gene is detected in many normal tissues but not in most cancer cell lines, which may explain the specific sensitivity of cancer cells to the apoptosis-inducing activity of TRAIL. DcR2 has been shown to play an inhibitory role in TRAIL-induced cell apoptosis. The membrane expression of all of these receptors (DR4, DR5, DcR1, and DcR2) is greater in normal endometrium (NE) than in endometrioid adenocarcinoma (EAC). In EAC patients, membrane expression of these receptors is not an independent predictor of survival. DcR1 and DcR2 expression is critical in cell growth and apoptosis in cutaneous or uveal melanoma; DcR1 and DcR2 are frequently methylated in both, leading to loss of gene expression and melanomagenesis. On the other hand, DR4 and DR5 methylation is rare in cutaneous melanoma and frequent in uveal melanoma; their expression is wholly independent of the promoter methylation status. DcR1 and DcR2 genes are also reported to be hyper-methylated in prostate cancer. The TRAIL ligand, a potent and specific inducer of apoptosis in cancer cells, has been explored as a therapeutic drug; experimental data has shown that DR4 specific TRAIL variants are more efficacious than wild-type TRAIL in pancreatic cancer.	103
276907	cd10581	TNFRSF11B	Tumor necrosis factor receptor superfamily member 11B (TNFRSF11B), also known as Osteoprotegerin (OPG). TNFRSF11B (also known as Osteoprotegerin, OPG, TR1, OCIF) is a secreted glycoprotein that regulates bone resorption. It binds to two ligands, RANKL (receptor activator of nuclear factor kappaB ligand, also known as osteoprotegerin ligand, OPGL, TRANCE, TNF-related activation induced cytokine), a critical cytokine for osteoclast differentiation, and TRAIL (TNF-related apoptosis-inducing ligand), involved in immune surveillance. Therefore, acting as a decoy receptor for RANKL and TRAIL, OPG inhibits the regulatory effects of nuclear factor-kappaB on inflammation, skeletal, and vascular systems, and prevents TRAIL-induced apoptosis. Studies of the mouse counterparts suggest that this protein and its ligand also play a role in lymph-node organogenesis and vascular calcification. Circulating OPG levels have emerged as independent biomarkers of cardiovascular disease (CVD) in patients with acute or chronic heart disease. OPG has also been implicated in various inflammations and linked to diabetes and poor glycemic control. Alternatively spliced transcript variants of this gene have been reported, although their full length nature has not been determined.	147
276908	cd10582	TNFRSF14	Tumor necrosis factor receptor superfamily member 14 (TNFRSF14), also known as herpes virus entry mediator (HVEM). TNFRSF14 (also known as herpes virus entry mediator or HVEM, ATAR, CD270, HVEA, LIGHTR, TR2) regulates T-cell immune responses by activating inflammatory, as well as inhibitory signaling pathways. HVEM acts as a receptor for the canonical TNF-related ligand LIGHT (lymphotoxin-like), which exhibits inducible expression, and competes with herpes simplex virus glycoprotein D for HVEM. It also acts as a ligand for the immunoglobulin superfamily proteins BTLA (B and T lymphocyte attenuator) and CD160, a feature distinguishing HVEM from other immune regulatory molecules, thus creating a functionally diverse set of intrinsic and bidirectional signaling pathways. HVEM is highly expressed in the gut epithelium. Genome-wide association studies have shown that HVEM is an inflammatory bowel disease (IBD) risk gene, suggesting that HVEM could have a regulatory role influencing the regulation of epithelial barrier, host defense, and the microbiota. Mouse models have revealed that HVEM is involved in colitis pathogenesis, mucosal host defense, and epithelial immunity, thus acting as a mucosal gatekeeper with multiple regulatory functions in the mucosa. HVEM plays a critical role in both tumor progression and resistance to antitumor immune responses, possibly through direct and indirect mechanisms. It is known to be expressed in several human malignancies, including esophageal squamous cell carcinoma, follicular lymphoma and melanoma. The HVEM network may therefore be an attractive target for drug intervention.	101
276909	cd10583	TNFRSF21	Tumor necrosis factor receptor superfamily member 21 (TNFRSF21), also known as death receptor 6 (DR6). TNFRSF21 (also known as death receptor 6 (DR6), CD358, BM-018) is highly expressed in differentiating neurons as well as in the adult brain, and is upregulated in injured neurons. DR6 negatively regulates neuron, axon, and oligodendrocyte survival, hinders axon and oligodendrocyte regeneration and its inhibition has a neuro-protective effect in nerve injury. It activates nuclear factor kappa-B (NFkB) and mitogen-activated protein kinase 8 (MAPK8, also called c-Jun N-terminal kinase 1), and induces cell apoptosis by associating with TNFRSF1A-associated via death domain (TRADD), which is known to mediate signal transduction of tumor necrosis factor receptors. TNFRSF21 plays a role in T-helper cell activation, and may be involved in inflammation and immune regulation. Its possible ligand is amyloid precursor protein (APP), hence it is probably involved in the development of Alzheimer's disease; when released, APP binds in an autocrine/paracrine manner to activate a caspase-dependent self-destruction program that removes unnecessary or connectionless axons. Increasing beta-catenin levels in brain endothelium upregulates TNFRSF21 and TNFRSF19, indicating that these death receptors are downstream target genes of Wnt/beta-catenin signaling, which has been shown to be required for blood-brain barrier development. DR6 is up-regulated in numerous solid tumors as well as in tumor vascular cells, including ovarian cancer and may be a clinically useful diagnostic and predictive serum biomarker for some adult sarcoma subtypes.	159
213020	cd10585	CE4_SF	Catalytic NodB homology domain of the carbohydrate esterase 4 superfamily. The carbohydrate esterase 4 (CE4) superfamily mainly includes chitin deacetylases (EC 3.5.1.41), bacterial peptidoglycan N-acetylglucosamine deacetylases (EC 3.5.1.-), and acetylxylan esterases (EC 3.1.1.72), which catalyze the N- or O-deacetylation of substrates such as acetylated chitin, peptidoglycan, and acetylated xylan, respectively. Members in this superfamily contain a NodB homology domain that adopts a deformed (beta/alpha)8 barrel fold, which encompasses a mononuclear metalloenzyme employing a conserved His-His-Asp zinc-binding triad, closely associated with the conserved catalytic base (aspartic acid) and acid (histidine) to carry out acid/base catalysis. The NodB homology domain of CE4 superfamily is remotely related to the 7-stranded beta/alpha barrel catalytic domain of the superfamily consisting of family 38 glycoside hydrolases (GH38), family 57 heat stable retaining glycoside hydrolases (GH57), lactam utilization protein LamB/YcsF family proteins, and YdjC-family proteins.	142
198285	cd10718	SH2_CIS	Src homology 2 (SH2) domain found in cytokine-inducible SH2-containing protein (CIS). CIS family members are known to be cytokine-inducible negative regulators of cytokine signaling. The expression of the CIS gene can be induced by IL2, IL3, GM-CSF and EPO in hematopoietic cells. Proteasome-mediated degradation of this protein has been shown to be involved in the inactivation of the erythropoietin receptor. Suppressor of cytokine signalling (SOCS) was first recognized as a group of cytokine-inducible SH2 (CIS) domain proteins comprising eight family members in human (CIS and SOCS1-SOCS7).  In addition to the SH2 domain, SOCS proteins have a variable N-terminal domain and a conserved SOCS box in the C-terminal domain. SOCS proteins bind to a substrate via their SH2 domain. The prototypical members, CIS and SOCS1-SOCS3, have been shown to regulate growth hormone signaling in vitro and in a classic negative feedback response compete for binding at phosphotyrosine sites in JAK kinase and receptor pathways to displace effector proteins and target bound receptors for proteasomal degradation. Loss of SOCS activity results in excessive cytokine signaling associated with a variety of hematopoietic, autoimmune, and inflammatory diseases and certain cancers. In general SH2 domains are involved in signal transduction.  They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.	88
199908	cd10719	DnaJ_zf	Zinc finger domain of DnaJ and HSP40. Central/middle or CxxCxGxG-motif containing domain of DnaJ/Hsp40 (heat shock protein 40). DnaJ proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperone family. Hsp40 proteins are characterized by the presence of an N-terminal J domain, which mediates the interaction with Hsp70. This central domain contains four repeats of a CxxCxGxG motif and binds two zinc ions. It has been implicated in substrate binding.	65
199909	cd10747	DnaJ_C	C-terminal substrate binding domain of DnaJ and HSP40. The C-terminal region of the DnaJ/Hsp40 protein mediates oligomerization and binding to denatured polypeptide substrate. DnaJ/Hsp40 is a widely conserved heat-shock protein. It prevents the aggregation of unfolded substrate and forms a ternary complex with both substrate and DnaK/Hsp70; the N-terminal J-domain of DnaJ/Hsp40 stimulates the ATPase activity of DnaK/Hsp70.	158
199910	cd10748	anti-TRAP	anti-TRAP (AT) protein specific to Bacilli. In Bacillus subtilis and related bacteria, AT binds to the TRAP protein (tryptophan-activated trp RNA-binding attenuation protein), effectively disrupting interaction of TRAP with mRNAs. Upon binding of tryptophan, TRAP (which forms a complex of 11 identical subunits) interacts with a specific location in the leader RNA and blocks translation of the tryptophan biosynthetic operon. AT, in turn, recognizes the tryptophan-activated TRAP complex and prevents RNA binding. AT is expressed in response to high levels of uncharged tryptophan tRNA. AT contains a zinc-binding motif that closely resembles the zinc-binding motifs in the zinc-finger region of DnaJ/Hsp40. AT has been shown to form homo-dodecameric assemblies in two different relative orientations, resulting in two different dodecamers. Recent data suggest that the trimeric form of AT may be the biologically relevant active complex.	52
212097	cd10785	GH38-57_N_LamB_YdjC_SF	Catalytic domain of glycoside hydrolase (GH) families 38 and 57, lactam utilization protein LamB/YcsF family proteins, YdjC-family proteins, and similar proteins. The superfamily possesses strong sequence similarities across a wide range of all three kingdoms of life. It mainly includes four families, glycoside hydrolases family 38 (GH38), heat stable retaining glycoside hydrolases family 57 (GH57), lactam utilization protein LamB/YcsF family, and YdjC-family. The GH38 family corresponds to class II alpha-mannosidases (alphaMII, EC 3.2.1.24), which contain intermediate Golgi alpha-mannosidases II, acidic lysosomal alpha-mannosidases, animal sperm and epididymal alpha-mannosidases, neutral ER/cytosolic alpha-mannosidases, and some putative prokaryotic alpha-mannosidases. AlphaMII possesses a-1,3, a-1,6, and a-1,2 hydrolytic activity, and catalyzes the degradation of N-linked oligosaccharides by employing a two-step mechanism involving the formation of a covalent glycosyl enzyme complex. GH57 is a purely prokaryotic family with the majority of thermostable enzymes from extremophiles (many of them are archaeal hyperthermophiles), which exhibit the enzyme specificities of alpha-amylase (EC 3.2.1.1), 4-alpha-glucanotransferase (EC 2.4.1.25), amylopullulanase (EC 3.2.1.1/41), and alpha-galactosidase (EC 3.2.1.22). This family also includes many hypothetical proteins with uncharacterized activity and specificity. GH57 cleaves alpha-glycosidic bonds by employing a retaining mechanism, which involves a glycosyl-enzyme intermediate, allowing transglycosylation. Although the exact molecular function of LamB/YcsF family and YdjC-family remains unclear, they show high sequence and structure homology to the members of GH38 and GH57. Their catalytic domains adopt a similar parallel 7-stranded beta/alpha barrel, which is remotely related to the catalytic NodB homology domain of the carbohydrate esterase 4 superfamily.	203
212098	cd10786	GH38N_AMII_like	N-terminal catalytic domain of class II alpha-mannosidases and similar proteins; glycoside hydrolase family 38 (GH38). Alpha-mannosidases (EC 3.2.1.24) are extensively found in eukaryotes and play important roles in the processing of newly formed N-glycans and in degradation of mature glycoproteins. A deficiency of this enzyme causes the lysosomal storage disease alpha-mannosidosis. Many bacterial and archaeal species also possess putative alpha-mannosidases, but their activity and specificity are largely unknown. Based on different functional characteristics and sequence homology, alpha-mannosidases have been organized into two classes (class I, belonging to glycoside hydrolase family 47, and class II, belonging to glycoside hydrolase family 38). Members of this family correspond to class II alpha-mannosidases (alphaMII), which contain intermediate Golgi alpha-mannosidases II, acidic lysosomal alpha-mannosidases, animal sperm and epididymal alpha-mannosidases, neutral ER/cytosolic alpha-mannosidases, and some putative prokaryotic alpha-mannosidases. AlphaMII possesses a-1,3, a-1,6, and a-1,2 hydrolytic activity, and catalyzes the degradation of N-linked oligosaccharides. The N-terminal catalytic domain of alphaMII adopts a structure consisting of a parallel 7-stranded beta/alpha barrel. Members in this family are retaining glycosyl hydrolases of family GH38 that employ a two-step mechanism involving the formation of a covalent glycosyl enzyme complex. Two carboxylic acids positioned within the active site act in concert: one as a catalytic nucleophile and the other as a general acid/base catalyst.	251
212099	cd10787	LamB_YcsF_like	LamB/YcsF family of lactam utilization protein. The LamB/YbgL family includes the Aspergillus nidulans protein LamB, and its homologs from all three kingdoms of life. The lamB gene is located at the lam locus of Aspergillus nidulans, consisting of two divergently transcribed genes, lamA and lamB, needed for the utilization of lactams such as 2-pyrrolidinone. Both genes are under the control of the positive regulatory gene amdR and are subject to carbon and nitrogen metabolite repression. Although the exact molecular function of LamB is unknown, it might be required for conversion of exogenous 2-pyrrolidinone to endogenous GABA.	238
212100	cd10788	YdjC_like	YdjC-family proteins. YdjC-family proteins are widely distributed, from human to bacteria. The family is represented by an uncharacterised protein YdjC (also known as ChbG), encoded by the chb (N,N'-diacetylchitobiose, also called [GlcNAc]2) or cel operon in Escherichia coli, which encodes enzymes involved in growth on an N,N'-diacetylchitobiose carbon source. This subfamily also includes hopanoid biosynthesis associated proteins HpnK and many uncharacterized YdjC homologs. Although the exact molecular function of the YdjC-family proteins remains unclear, it has been suggested that they play a role in the cleavage of cellobiose-phosphate.	243
212101	cd10789	GH38N_AMII_ER_cytosolic	N-terminal catalytic domain of endoplasmic reticulum (ER)/cytosolic class II alpha-mannosidases; glycoside hydrolase family 38 (GH38). The subfamily is represented by Saccharomyces cerevisiae vacuolar alpha-mannosidase Ams1, rat ER/cytosolic alpha-mannosidase Man2C1, and similar proteins. Members in this family share high sequence similarity. None of them have any classical signal sequence or membrane spanning domains, which are typical of sorting or targeting signals. Ams1 functions as a second resident vacuolar hydrolase in S. cerevisiae. It aids in recycling macromolecular components of the cell through hydrolysis of terminal, non-reducing alpha-d-mannose residues. Ams1 utilizes both the cytoplasm to vacuole targeting (Cvt, nutrient-rich conditions) and autophagic (starvation conditions) pathways for biosynthetic delivery to the vacuole. Man2C1 is involved in oligosaccharide catabolism in both the ER and cytosol. It can catalyze the cobalt-dependent cleavage of alpha 1,2-, alpha 1,3-, and alpha 1,6-linked mannose residues. Members in this family are retaining glycosyl hydrolases of family GH38 that employ a two-step mechanism involving the formation of a covalent glycosyl-enzyme complex. Two carboxylic acids positioned within the active site act in concert: one as a catalytic nucleophile and the other as a general acid/base catalyst.	252
212102	cd10790	GH38N_AMII_1	N-terminal catalytic domain of putative prokaryotic class II alpha-mannosidases; glycoside hydrolase family 38 (GH38). This mainly bacterial subfamily corresponds to a group of putative class II alpha-mannosidases, including various proteins assigned as alpha-mannosidases, Streptococcus pyogenes SpGH38 encoded by ORF spy1604, Escherichia coli MngB encoded by the mngB/ybgG gene, Thermotoga maritima TMM, and similar proteins. SpGH38 targets alpha-1,3 mannosidic linkages. SpGH38 appears to exist as an elongated dimer and displays alpha-1,3 mannosidase activity. It is active on disaccharides and some aryl glycosides. SpGH38 can also effectively deglycosylate human N-glycans in vitro. MngB exhibits alpha-mannosidase activity that catalyzes the conversion of 2-O-(6-phospho-alpha-mannosyl)-D-glycerate to mannose-6-phosphate and glycerate in the pathway which enables use of mannosyl-D-glycerate as a sole carbon source. TMM is a homodimeric enzyme that hydrolyzes p-nitrophenyl-alpha-D-mannopyranoside, alpha-1,2-mannobiose, alpha-1,3-mannobiose, alpha-1,4-mannobiose, and alpha-1,6-mannobiose. The GH38 family contains retaining glycosyl hydrolases that employ a two-step mechanism involving the formation of a covalent glycosyl enzyme complex. Two carboxylic acids positioned within the active site act in concert: one as a catalytic nucleophile and the other as a general acid/base catalyst. Divalent metal ions, such as zinc or cobalt ions, are suggested to be required for the catalytic activities of typical class II alpha-mannosidases. However, TMM requires cobalt or cadmium for its activity. The cadmium ion dependency is unique to TMM. Moreover, TMM is inhibited by swainsonine but not 1-deoxymannojirimycin, which is in agreement with the features of cytosolic alpha-mannosidase.	273
212103	cd10791	GH38N_AMII_like_1	N-terminal catalytic domain of mainly uncharacterized eukaryotic proteins similar to alpha-mannosidases; glycoside hydrolase family 38 (GH38). The subfamily of mainly uncharacterized eukaryotic proteins shows sequence homology with class II alpha-mannosidases (alphaMIIs). AlphaMIIs possess a-1,3, a-1,6, and a-1,2 hydrolytic activity, and catalyze the degradation of N-linked oligosaccharides. The N-terminal catalytic domain of alphaMII adopts a structure consisting of a parallel 7-stranded beta/alpha barrel. This subfamily belongs to the GH38 family of retaining glycosyl hydrolases, which employ a two-step mechanism involving the formation of a covalent glycosyl enzyme complex; two carboxylic acids positioned within the active site act in concert: one as a catalytic nucleophile and the other as a general acid/base catalyst.	254
212104	cd10792	GH57N_AmyC_like	N-terminal catalytic domain of alpha-amylase (AmyC) and similar proteins. Alpha-amylases (alpha-1,4-glucan-4-glucanohydrolases, EC 3.2.1.1) play essential roles in alpha-glucan metabolism by catalyzing the hydrolysis of polysaccharides such as amylose, starch, and beta-limit dextrin. This subfamily is represented by a novel alpha-amylase (AmyC) encoded by the hyperthermophilic organism Thermotoga maritima ORF tm1438, and its prokaryotic homologs. AmyC functions as a homotetramer and shows thermostable amylolytic activity. It is strongly inhibited by acarbose. AmyC is composed of an N-terminal catalytic domain, containing a distorted TIM-barrel structure with a characteristic (beta/alpha)7 fold motif, and two additional less conserved domains. There are two other canonical alpha-amylases encoded by T. maritima that lack sequence similarity to AmyC and belong to a different superfamily.	412
212105	cd10793	GH57N_TLGT_like	N-terminal catalytic domain of 4-alpha-glucanotransferase; glycoside hydrolase family 57 (GH57). 4-alpha-glucanotransferase (TLGT, EC 2.4.1.25) plays a key role in maltose metabolism. It catalyzes the disproportionation of amylose and the formation of large cyclic alpha-1,4-glucan (cycloamylose) from linear amylose. TLGT functions as a homodimer. Each monomer is composed of two domains, an N-terminal catalytic domain with a (beta/alpha)7 barrel fold and a C-terminal domain with a twisted beta-sandwich fold. Some family members have been designated as alpha-amylases, such as the heat-stable eubacterial amylase from Dictyoglomus thermophilum (DtAmyA) and the extremely thermostable archaeal amylase from Pyrococcus furiosus (PfAmyA). However, both of these proteins are 4-alpha-glucanotransferases. DtAmyA was shown to have transglycosylating activity and PfAmyA exhibits 4-alpha-glucanotransferase activity.	279
212106	cd10794	GH57N_PfGalA_like	N-terminal catalytic domain of alpha-galactosidase; glycoside hydrolase family 57 (GH57). Alpha-galactosidases (GalA, EC 3.2.1.22) catalyze the hydrolysis of alpha-1,6-linked galactose residues from oligosaccharides and polymeric galactomannans. Based on sequence similarity, the majority of eukaryotic and bacterial GalAs have been classified into glycoside hydrolase family GH27, GH36, and GH4, respectively. This subfamily is represented by a novel type of GalA from Pyrococcus furiosus (PfGalA), which belongs to the GH57 family. PfGalA is an extremely thermo-active and thermostable GalA that functions as a bacterial-like GalA, however, without the capacity to hydrolyze polysaccharides. It specifically catalyzes the hydrolysis of para-nitrophenyl-alpha-galactopyranoside, and to some extent that of melibiose and raffinose. PfGalA has a pH optimum between 5.0-5.5.	305
212107	cd10795	GH57N_MJA1_like	N-terminal catalytic domain of a thermoactive alpha-amylase from Methanococcus jannaschii and similar proteins; glycoside hydrolase family 57 (GH57). The subfamily is represented by a thermostable alpha-amylase (MJA1, EC 3.2.1.1) encoded by the hyperthermophilic archaeon Methanococcus jannaschii locus MJ1611. MJA1 has a broad pH optimum of 5.0-8.0. It exhibits extremely thermophilic alpha-amylase activity that catalyzes the hydrolysis of large sugar polymers with alpha-1,6 and alpha-1,4 linkages, and yields products including glucose polymers of 1-7 units. MJ1611 also encodes another alpha-amylase with catalytic features distinct from MJA1, which belongs to glycoside hydrolase family 13 (GH-13), and is not included here. This subfamily also includes many uncharacterized proteins found in bacteria and archaea.	306
212108	cd10796	GH57N_APU	N-terminal catalytic domain of thermoactive amylopullulanases; glycoside hydrolase family 57 (GH57). Pullulanases (EC 3.2.1.41) are capable of hydrolyzing the alpha-1,6 glucosidic bonds of pullulan, producing maltotriose. Amylopullulanases (APU, EC 3.2.1.1/41) are type II pullulanases which can also degrade both the alpha-1,6 and alpha-1,4 glucosidic bonds of starch, producing oligosaccharides. This subfamily includes GH57 archaeal thermoactive APUs, which show both pullulanolytic and amylolytic activities. They have an acid pH optimum and the presence of Ca2+ might increase their activity, thermostability, and substrate affinity. Besides GH57 thermoactive APUs, all mesophilic and some thermoactive APUs belong to glycoside hydrolase family 13 with catalytic features distinct from GH57. This subfamily also includes many uncharacterized proteins found in bacteria and archaea.	313
212109	cd10797	GH57N_APU_like_1	N-terminal putative catalytic domain of mainly uncharacterized prokaryotic proteins similar to archaeal thermoactive amylopullulanases; glycoside hydrolase family 57 (GH57). This subfamily of mainly uncharacterized bacterial proteins shows high sequence homology to GH57 archaeal thermoactive amylopullulanases (APU, EC 3.2.1.1/41). Thermoactive APUs are type II pullulanases with both pullulanolytic and amylolytic activities. They have an acid pH optimum and the presence of Ca2+ might increase their activity, thermostability, and substrate affinity.	327
212110	cd10798	GH57N_like_1	Uncharacterized subfamily of glycoside hydrolase family 57 (GH57). This subfamily of uncharacterized bacterial proteins shows high sequence homology to glycoside hydrolase family 57 (GH57). Glycoside hydrolase family 57 (GH57) is a chiefly prokaryotic family with the majority of thermostable enzymes coming from extremophiles (many of these are archaeal hyperthermophiles), which exhibit the enzyme specificities of alpha-amylase (EC 3.2.1.1), 4-alpha-glucanotransferase (EC 2.4.1.25), amylopullulanase (EC 3.2.1.1/41), and alpha-galactosidase (EC 3.2.1.22).	330
212111	cd10800	LamB_YcsF_YbgL_like	Escherichia coli putative lactam utilization protein YbgL and similar proteins. This subfamily of the LamB/YbgL family is represented by the Escherichia coli putative lactam utilization protein YbgL. Although the molecular function of members of this subfamily is unknown, they show high sequence similarity to the Aspergillus nidulans lactam utilization protein LamB, which might be required for conversion of exogenous 2-pyrrolidinone to endogenous GABA.	240
212112	cd10801	LamB_YcsF_like_1	uncharacterized proteins similar to the Aspergillus nidulans lactam utilization protein LamB. This mainly bacterial subfamily of the LamB/YbgL family contains many well conserved uncharacterized proteins. Although their molecular function remains unknown, these proteins show high sequence similarity to the Aspergillus nidulans lactam utilization protein LamB, which might be required for conversion of exogenous 2-pyrrolidinone to endogenous GABA.	233
212113	cd10802	YdjC_TTHB029_like	Thermus thermophilus TTHB029 and similar proteins. This subfamily is represented by a YdjC-family protein TTHB029 from Thermus thermophilus HB8; it is similar to Escherichia coli YdjC, a hypothetical protein encoded by the celG gene. TTHB029 functions as a homodimer. Each monomer consists of a (beta/alpha)-barrel fold. The molecular function of TTHB029 is unclear.	251
212114	cd10803	YdjC_EF3048_like	Enterococcus faecalis EF3048 and similar proteins. This subfamily is represented by a putative cellobiose-phosphate cleavage protein EF3048 from Enterococcus faecalis V583. It is similar to Escherichia coli YdjC, a hypothetical protein encoded by the celG gene. EF3048 might function as a homodimer. Each of the monomers consists of a (beta/alpha)-barrel fold that forms an active homodimer. The molecular function of EF3048 is unclear.	228
212115	cd10804	YdjC_HpnK_like	hopanoid biosynthesis associated protein HpnK and similar proteins. The subfamily includes some uncharacterized proteins annotated as hopanoid biosynthesis associated proteins, HpnK. They show high sequence similarity to proteins from the YdjC-family, the latter is represented by an uncharacterised protein YdjC (also known as ChbG) encoded by the chb (N,N'-diacetylchitobiose, also called [GlcNAc]2) or cel operon in Escherichia coli, which encodes enzymes involved in growth on an N,N'-diacetylchitobiose carbon source.	261
212116	cd10805	YdjC_like_1	uncharacterized YdjC-like family proteins from bacteria. The subfamily contains many hypothetical proteins, and belongs to the YdjC-like family of uncharacterized proteins from bacteria. The YdjC-family is represented by an uncharacterised protein YdjC (also known as ChbG) encoded by the chb (N,N'-diacetylchitobiose, also called [GlcNAc]2) or cel operon in Escherichia coli, which encodes enzymes involved in growth on an N,N'-diacetylchitobiose carbon source. The molecular function of this subfamily is unclear.	251
212117	cd10806	YdjC_like_2	uncharacterized YdjC-like family proteins from eukaryotes. This eukaryotic subfamily contains hypothetical and uncharacterized proteins, and belongs to the YdjC-like family of uncharacterized proteins. The YdjC-family is represented by an uncharacterised protein YdjC (also known as ChbG) encoded by the chb (N,N'-diacetylchitobiose, also called [GlcNAc]2) or cel operon in Escherichia coli, which encodes enzymes involved in growth on an N,N'-diacetylchitobiose carbon source. The molecular function of this subfamily is unclear.	280
212118	cd10807	YdjC_like_3	uncharacterized YdjC-like family proteins from bacteria. This subfamily contains many hypothetical proteins, and belongs to the YdjC-like family of uncharacterized proteins from bacteria. The YdjC-family is represented by an uncharacterised protein YdjC (also known as ChbG) encoded by the chb (N,N'-diacetylchitobiose, also called [GlcNAc]2) or cel operon in Escherichia coli, which encodes enzymes involved in growth on an N,N'-diacetylchitobiose carbon source. The molecular function of this subfamily is unclear.	251
212119	cd10808	YdjC	Escherichia coli YdjC-like family of proteins. Uncharacterized subfamily of the YdjC-like family of proteins. Included in this subfamily is the uncharacterized Escherichia coli protein YdjC (also known as ChbG), encoded by the chb (N,N'-diacetylchitobiose, also called [GlcNAc]2) or cel operon, which encodes enzymes involved in growth on an N,N'-diacetylchitobiose carbon source. The molecular function of this subfamily is unclear.	259
212120	cd10809	GH38N_AMII_GMII_SfManIII_like	N-terminal catalytic domain of Golgi alpha-mannosidase II, Spodoptera frugiperda Sf9 alpha-mannosidase III, and similar proteins; glycoside hydrolase family 38 (GH38). This subfamily is represented by Golgi alpha-mannosidase II (GMII, also known as mannosyl-oligosaccharide 1,3- 1,6-alpha mannosidase, EC 3.2.1.114, Man2A1), a monomeric, membrane-anchored class II alpha-mannosidase existing in the Golgi apparatus of eukaryotes. GMII plays a key role in the N-glycosylation pathway. It catalyzes the hydrolysis of both the terminal alpha-1,3-linked and alpha-1,6-linked mannoses from the high-mannose oligosaccharide GlcNAc(Man)5(GlcNAc)2 to yield GlcNAc(Man)3(GlcNAc)2 (GlcNAc, N-acetylglucosamine), which is the committed step of complex N-glycan synthesis. GMII is activated by zinc or cobalt ions and is strongly inhibited by swainsonine. Inhibition of GMII provides a route to block cancer-induced changes in cell surface oligosaccharide structures. GMII has a pH optimum of 5.5-6.0, which is intermediate between those of acidic (lysosomal alpha-mannosidase) and neutral (ER/cytosolic alpha-mannosidase) enzymes. GMII is a retaining glycosyl hydrolase of family GH38 that employs a two-step mechanism involving the formation of a covalent glycosyl enzyme complex; two carboxylic acids positioned within the active site act in concert: one as a catalytic nucleophile and the other as a general acid/base catalyst. This subfamily also includes human alpha-mannosidase 2x (MX, also known as mannosyl-oligosaccharide 1,3- 1,6-alpha mannosidase, EC 3.2.1.114, Man2A2). MX is enzymatically and functionally very similar to GMII, and is thought to also function in the N-glycosylation pathway. Also found in this subfamily is class II alpha-mannosidase encoded by Spodoptera frugiperda Sf9 cells. This alpha-mannosidase is an integral membrane glycoprotein localized in the Golgi apparatus. It shows high sequence homology with mammalian Golgi alpha-mannosidase II (GMII). It can hydrolyze p-nitrophenyl alpha-D-mannopyranoside (pNP-alpha-Man), and it is inhibited by swainsonine. However, the Sf9 enzyme is stimulated by cobalt and can hydrolyze (Man)5(GlcNAc)2 to (Man)3(GlcNAc)2, but it cannot hydrolyze GlcNAc(Man)5(GlcNAc)2, which is distinct from that of GMII. Thus, this enzyme has been designated as Sf9 alpha-mannosidase III (SfManIII). It probably functions in an alternate N-glycan processing pathway in Sf9 cells.	340
212121	cd10810	GH38N_AMII_LAM_like	N-terminal catalytic domain of lysosomal alpha-mannosidase and similar proteins; glycoside hydrolase family 38 (GH38). The subfamily is represented by lysosomal alpha-mannosidase (LAM, Man2B1, EC 3.2.1.114), which is a broad specificity exoglycosidase hydrolyzing all known alpha 1,2-, alpha 1,3-, and alpha 1,6-mannosidic linkages from numerous high mannose type oligosaccharides. LAM is expressed in all tissues and in many species. In mammals, the absence of LAM can cause the autosomal recessive disease alpha-mannosidosis. LAM has an acidic pH optimum at 4.0-4.5. It is stimulated by zinc ion and is inhibited by cobalt ion and plant alkaloids, such as swainsonine (SW). LAM catalyzes hydrolysis by a double displacement mechanism in which a glycosyl-enzyme intermediate is formed and hydrolyzed via oxacarbenium ion-like transition states. A carboxylic acid in the active site acts as the catalytic nucleophile in the formation of the covalent intermediate while a second carboxylic acid acts as a general acid catalyst. The same residue is thought to assist in the hydrolysis (deglycosylation) step, this time acting as a general base.	278
212122	cd10811	GH38N_AMII_Epman_like	N-terminal catalytic domain of mammalian core-specific lysosomal alpha 1,6-mannosidase and similar proteins; glycoside hydrolase family 38 (GH38). The subfamily is represented by a novel human core-specific lysosomal alpha 1,6-mannosidase (Epman, Man2B2) and similar proteins. Although it was previously named as epididymal alpha-mannosidase, Epman has a broadly distributed transcript expression profile. Different from the major broad specificity lysosomal alpha-mannosidases (LAM, MAN2B1), Epman is not associated with genetic alpha-mannosidosis that is caused by the absence of LAM. Furthermore, Epman has unique substrate specificity. It can efficiently cleave only the alpha 1,6-linked mannose residue from (Man)3GlcNAc, but not (Man)3(GlcNAc)2 or other larger high mannose oligosaccharides, in the core of N-linked glycans. In contrast, the major LAM can cleave all of the alpha-linked mannose residues from high mannose oligosaccharides except the core alpha 1,6-linked mannose residue. Moreover, it is suggested that the catalytic activity of Epman is dependent on prior action by di-N-acetyl-chitobiase (chitobiase), which indicates there is a functional cooperation between these two enzymes for the full and efficient catabolism of mammalian lysosomal N-glycan core structures. Epman has an acidic pH optimum. It is strongly stimulated by cobalt or zinc ions and strongly inhibited by furanose analogues swainsonine (SW) and 1,4-dideoxy-1,4-imino-d-mannitol (DIM).	326
212123	cd10812	GH38N_AMII_ScAms1_like	N-terminal catalytic domain of yeast vacuolar alpha-mannosidases and similar proteins; glycoside hydrolase family 38 (GH38). The family is represented by Saccharomyces cerevisiae alpha-mannosidase (Ams1) and its eukaryotic homologs. Ams1 functions as a second resident vacuolar hydrolase in S. cerevisiae. It aids in recycling macromolecular components of the cell through hydrolysis of terminal, non-reducing alpha-d-mannose residues. Ams1 forms an oligomer in the cytoplasm and retains its oligomeric form during the import process. It utilizes both the Cvt (nutrient-rich conditions) and autophagic (starvation conditions) pathways for biosynthetic delivery to the vacuole. Mutants in either pathway are defective in Ams1 import. Members in this family show high sequence similarity with rat ER/cytosolic alpha-mannosidase Man2C1.	258
212124	cd10813	GH38N_AMII_Man2C1	N-terminal catalytic domain of mammalian cytosolic alpha-mannosidase Man2C1 and similar proteins; glycoside hydrolase family 38 (GH38). The subfamily corresponds to cytosolic alpha-mannosidase Man2C1 (also known as ER-mannosidase II or neutral/cytosolic mannosidase), mainly found in various vertebrates, and similar proteins. Man2C1 plays an essential role in the catabolism of cytosolic free oligomannosides derived from dolichol intermediates and the degradation of newly synthesized glycoproteins in ER or cytosol. It can catalyze the cleavage of alpha 1,2-, alpha 1,3-, and alpha 1,6-linked mannose residues. Man2C1 is a cobalt-dependent enzyme belonging to alpha-mannosidase class II. It has a neutral pH optimum and is strongly inhibited by furanose analogs swainsonine (SW) and 1,4-dideoxy-1,4-imino-D-mannitol (DIM), moderately by deoxymannojirimycin (DMM), but not by kifunensine (KIF). DMM and KIF, both pyranose analogs, are normally known to inhibit class I alpha-mannosidase.	252
212125	cd10814	GH38N_AMII_SpGH38_like	N-terminal catalytic domain of SpGH38, a putative alpha-mannosidase of Streptococcus pyogenes, and its prokaryotic homologs; glycoside hydrolase family 38 (GH38). The subfamily is represented by SpGH38 of Streptococcus pyogenes, which has been assigned as a putative alpha-mannosidase, and is encoded by ORF spy1604. SpGH38 appears to exist as an elongated dimer and display alpha-1,3 mannosidase activity. It is active on disaccharides and some aryl glycosides. SpGH38 can also effectively deglycosylate human N-glycans in vitro. A divalent metal ion, such as a zinc ion, is required for its activity. SpGH38 is inhibited by swainsonine. The absence of any secretion signal peptide suggests that SpGH38 may be intracellular.	271
212126	cd10815	GH38N_AMII_EcMngB_like	N-terminal catalytic domain of Escherichia coli alpha-mannosidase MngB and its bacterial homologs; glycoside hydrolase family 38 (GH38). The bacterial subfamily is represented by Escherichia coli alpha-mannosidase MngB, which is encoded by the mngB gene (previously called ybgG). MngB exhibits alpha-mannosidase activity that converts 2-O-(6-phospho-alpha-mannosyl)-D-glycerate to mannose-6-phosphate and glycerate in the pathway which enables use of mannosyl-D-glycerate as a sole carbon source. A divalent metal ion is required for its activity.	270
212127	cd10816	GH57N_BE_TK1436_like	N-terminal catalytic domain of GH57 branching enzyme TK1436 and similar proteins. The subfamily is represented by a novel branching-enzyme TK1436 of hyperthermophilic archaeon Thermococcus kodakaraensis KOD1. Branching enzymes (BEs, EC 2.4.1.18) play a key role in synthesis of alpha-glucans and they generally are classified into glycoside hydrolase family 13 (GH13). However, TK1436 belongs to the GH57 family. It functions as a monomer and possesses BE activity. TK1436 is composed of a distorted N-terminal (beta/alpha)7-barrel domain and a C-terminal five alpha-helical domain, both of which participate in the formation of the active-site cleft.	423
380680	cd10843	DSRM_DICER	double-stranded RNA binding motif of endoribonuclease Dicer and similar proteins. Dicer (also known as helicase with RNase motif (HERNA), or helicase MOI) is a double-stranded RNA (dsRNA) endoribonuclease playing a central role in short dsRNA-mediated post-transcriptional gene silencing. It cleaves naturally occurring long dsRNAs and short hairpin pre-microRNAs (miRNA) into fragments of twenty-one to twenty-three nucleotides with 3' overhang of two nucleotides, producing respectively short interfering RNAs (siRNA) and mature microRNAs. Dicer contains a double-stranded RNA binding motif (DSRM) at the C-terminus. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	63
380681	cd10844	DSRM_TARBP2_rpt2	second double-stranded RNA binding motif of the RISC-loading complex subunit TARBP2 and similar proteins. TARBP2 (also known as TAR RNA-binding protein 2, or trans-activation-responsive RNA-binding protein (TRBP)) participates in the formation of the RNA-induced silencing complex (RISC). It is part of the RISC-loading complex (RLC), together with dicer1 and eif2c2/ago2, and is required to process precursor miRNAs. TARBP2 contains three double-stranded RNA binding motifs (DSRMs). This model describes the second motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	67
380682	cd10845	DSRM_RNAse_III_family	double-stranded RNA binding motif of ribonuclease III (RNase III) and similar proteins. RNase III (EC 3.1.26.3; also known as ribonuclease 3) digests double-stranded RNA formed within single-strand substrates, but not RNA-DNA hybrids. It is involved in the processing of rRNA precursors, viral transcripts, some mRNAs, and at least 1 tRNA (metY, a minor form of tRNA-init-Met). It cleaves the 30S primary rRNA transcript to yield the immediate precursors to the 16S and 23S rRNAs. The cleavage can occur in assembled 30S, 50S, and even 70S subunits and is influenced by the presence of ribosomal proteins. The RNase III family also includes the mitochondrion-specific ribosomal protein mL44 subfamily, which is composed of mitochondrial 54S ribosomal protein L3 (MRPL3) and mitochondrial 39S ribosomal protein L44 (MRPL44). Members of this family contain an RNase III domain and a C-terminal double-stranded RNA binding motif (DSRM). DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	69
211315	cd10909	ChtBD1_GH18_2	Hevein or type 1 chitin binding domain (ChtBD1) subfamily; in some members co-occurs with family 18 glycosyl hydrolases. This subfamily includes a Toxoplasma gondii ME49 protein annotated as a putative mannosyl-oligosaccharide glucosidase. ChtBD1 is a lectin domain found in proteins from plants and fungi that bind N-acetylglucosamine, plant endochitinases, wound-induced proteins such as hevein, a major IgE-binding allergen in natural rubber latex, and the alpha subunit of Kluyveromyces lactis killer toxin. This domain is involved in the recognition and/or binding of chitin subunits; it typically occurs N-terminal to glycosyl hydrolase domains in chitinases, together with other carbohydrate-binding domains, or by itself in tandem-repeat arrangements.	51
350234	cd10910	PIN_limkain_b1_N_like	N-terminal LabA-like PIN domain of limkain b1 and similar proteins. Limkain b1 is a human autoantigen, localized to a subset of ABCD3 and PXF marked peroxisomes. Limkain b1 may be a relatively common target of human autoantibodies reactive to cytoplasmic vesicle-like structures. Limkain b1 contains multiple copies of LOTUS domains and a conserved RNA recognition motif, this and similar domain architectures are shared by several members of this family, and a function of these architectures in RNA binding or RNA metabolism has been suggested. The function of the N-terminal domain is unknown. This subfamily belongs to LabA-like PIN domain family which includes Synechococcus elongatus PCC 7942 LabA, human ZNF451, uncharacterized Bacillus subtilis YqxD and Escherichia coli YaiI, and the N-terminal domain of a well-conserved group of mainly bacterial proteins with no defined function, which contain a C-terminal LabA_like_C domain. Curiously, a gene labeled NicB from Pseudomonas putida S16, which is described as a putative NADH-dependent hydroxylase involved in the microbial degradation of nicotine also falls into the LabA-like PIN family. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	126
350235	cd10911	PIN_LabA	PIN domain of Synechococcus elongatus LabA (low-amplitude and bright) and related proteins. This subfamily contains Synechococcus elongatus PCC 7942 LabA which participates in cyanobacterial circadian timing; it is required for negative feedback regulation of the autokinase/autophosphatase KaiC, a central component of the circadian clock system, and appears to be necessary for KaiC-dependent repression of gene expression. This subfamily belongs to the LabA-like domain family which includes the N-terminal domain of limkain b1, a human autoantigen localized to a subset of ABCD3 and PXF marked peroxisomes. Also included in the LabA-like domain family are human ZNF451, uncharacterized Bacillus subtilis YqxD, uncharacterized Escherichia coli YaiI, and the N-terminal domain of a well-conserved group of mainly bacterial proteins with no defined function, which contain a C-terminal LabA_like_C domain. Curiously, Pseudomonas putida S16 NicB, which is described as a putative NADH-dependent hydroxylase involved in the microbial degradation of nicotine, also falls into the LabA-like family. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	154
350236	cd10912	PIN_YacP-like	PIN_domain of Bacillus subtilis YacP/Rae1 and related proteins. Bacillus subtilis YacP, also known as Rae1, is an endoribonuclease involved in ribosome-dependent mRNA decay. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. PIN domains were originally named for their sequence similarity to the N-terminal domain of an annotated pili biogenesis protein, PilT, a domain fusion between a PIN-domain and a PilT ATPase domain. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases (also known as Flap endonuclease-1-like), PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	142
199211	cd10913	Peptidase_C25_N_gingipain	gingipain subgroup of the Peptidase C25 family N-terminal domain. Gingipain, produced by Porphyromonas gingivalis, exemplifies the Peptidase family C25, a unique class of cysteine proteases. P. gingivalis is one of the primary gram-negative pathogens that causes periodontitis, a disease also associated with other diseases such as diabetes and cardiovascular disease. The gingipain subgroup contains extracellular Arg- and Lys-specific proteinases called Arg-gingipain (Rgp) and Lys-gingipain (Kgp); RgpA and RgpB are homologous Arg-specific gingipains encoded by two closely related genes, rgpA and rgpB, while Lys-specific gingipain is encoded by the single kgp gene. Mutant studies have shown that, among the large quantities of proteolytic enzymes produced by P. gingivalis, these three proteases are major virulence factors of this bacterium. All three genes encode an N-terminal pre-pro fragment, followed by the protease domain; however, rgpA and kgp also encode additional C-terminal HA (hemagglutinin/adhesion) subunits which consist of several sequence-related adhesion domains. Although unique, their cysteine protease active site residues (His and Cys) forming the catalytic dyad are well-conserved, cleaving the C-terminal peptide bond with Arg or Lys residues. Gingipains are evolutionarily related to other highly specific proteases including caspases, clostripain, legumains, and separase. Gingipains function by dysregulating host defense and inflammatory responses, and degrading host proteins, e.g. tissue, cells, matrix, plasma and immunological proteins. It has been suggested that they enhance gingival crevicular fluid (GCF) production through activation of the kallikrein/kinin pathways, thus increasing vascular permeability and causing gingival inflammation, a distinctive feature of periodontitis. RgpA and RgpB are also able to cleave and activate coagulation factors IX and X in order to activate prothrombin to produce thrombin, which in turn increases production of GCF. The gingipains also play a pivotal role in the survival of P. gingivalis in the host by attacking the host defense system through cleavage of several immunological molecules, while at the same time evading the host-immune response by dysregulating the cytokine network.	348
199212	cd10914	Peptidase_C25_N_1	uncharacterized subgroup of the Peptidase C25 family N-terminal domain. Domains in this subgroup are uncharacterized members of the Peptidase family C25 N-terminal domain family. Peptidase family C25 is a unique class of cysteine proteases, exemplified by gingipain, which is produced by Porphyromonas gingivalis. P. gingivalis is one of the primary gram-negative pathogens that causes periodontitis, a disease that is also associated with other diseases such as diabetes and cardiovascular disease. Gingipains are a group of extracellular Arg- and Lys-specific proteinases called Arg-gingipain (Rgp) and Lys-gingipain (Kgp); RgpA and RgpB are homologous Arg-specific gingipains encoded by two closely related genes, rgpA and rgpB, while Lys-specific gingipain is encoded by the single kgp gene (also called prtK, prkP). Mutant studies have shown that, among the large quantities of proteolytic enzymes produced by P. gingivalis, these three proteases are major virulence factors of this bacterium. All three genes encode an N-terminal pre-pro fragment, followed by the protease domain; however, rgpA and kgp also encode additional C-terminal HA (hemagglutinin/adhesion) subunits which consist of several sequence-related adhesion domains. Although unique, their cysteine protease active site residues (His and Cys) forming the catalytic dyad are well-conserved, cleaving the C-terminal peptide bond with Arg or Lys residues. Gingipains are evolutionarily related to other highly specific proteases including caspases, clostripain, legumains, and separase. Gingipains function by dysregulating host defense and inflammatory responses, and degrading host proteins, e.g. tissue, cells, matrix, plasma and immunological proteins. They are proposed to enhance gingival crevicular fluid (GCF) production through activation of the kallikrein/kinin pathways, thus increasing vascular permeability and causing gingival inflammation, a distinctive feature of periodontitis. RgpA and RgpB are also able to cleave and activate coagulation factors IX and X in order to activate prothrombin to produce thrombin, which in turn increases production of GCF. The gingipains also play a pivotal role in the survival of P. gingivalis in the host by attacking the host defense system through cleavage of several immunological molecules, while at the same time evading the host-immune response by dysregulating the cytokine network.	365
199213	cd10915	Peptidase_C25_N_2	uncharacterized subgroup of the Peptidase C25 family N-terminal domain. Domains in this subgroup are uncharacterized members of the Peptidase family C25 N-terminal domain family. Peptidase family C25 is a unique class of cysteine proteases, exemplified by gingipain, which is produced by Porphyromonas gingivalis. P. gingivalis is one of the primary gram-negative pathogens that causes periodontitis, a disease that is also associated with other diseases such as diabetes and cardiovascular disease. Gingipains are a group of extracellular Arg- and Lys-specific proteinases called Arg-gingipain (Rgp) and Lys-gingipain (Kgp); RgpA and RgpB are homologous Arg-specific gingipains encoded by two closely related genes, rgpA and rgpB, while Lys-specific gingipain is encoded by the single kgp gene. Mutant studies have shown that, among the large quantities of proteolytic enzymes produced by P. gingivalis, these three proteases are major virulence factors of this bacterium. All three genes encode an N-terminal pre-pro fragment, followed by the protease domain; however, rgpA and kgp also encode additional C-terminal HA (hemagglutinin/adhesion) subunits which consist of several sequence-related adhesion domains. Although unique, their cysteine protease active site residues (His and Cys) forming the catalytic dyad are well-conserved, cleaving the C-terminal peptide bond with Arg or Lys residues. Gingipains are evolutionarily related to other highly specific proteases including caspases, clostripain, legumains, and separase. Gingipains function by dysregulating host defense and inflammatory responses, and degrading host proteins, e.g. tissue, cells, matrix, plasma and immunological proteins. They are proposed to enhance gingival crevicular fluid (GCF) production through activation of the kallikrein/kinin pathways, thus increasing vascular permeability and causing gingival inflammation, a distinctive feature of periodontitis. RgpA and RgpB are also able to cleave and activate coagulation factors IX and X in order to activate prothrombin to produce thrombin, which in turn increases production of GCF. The gingipains also play a pivotal role in the survival of P. gingivalis in the host by attacking the host defense system through cleavage of several immunological molecules, while at the same time evading the host-immune response by dysregulating the cytokine network.	403
213021	cd10916	CE4_PuuE_HpPgdA_like	Catalytic domain of bacterial PuuE allantoinases, Helicobacter pylori peptidoglycan deacetylase (HpPgdA), and similar proteins. This family is a member of the very large and functionally diverse carbohydrate esterase 4 (CE4) superfamily. It contains bacterial PuuE (purine utilization E) allantoinases, a peptidoglycan deacetylase from Helicobacter pylori (HpPgdA), Escherichia coli ArnD, and many uncharacterized homologs from all three kingdoms of life. PuuE allantoinase appears to be metal-independent and specifically catalyzes the hydrolysis of (S)-allantoin into allantoic acid. Different from PuuE allantoinase, HpPgdA has the ability to bind a metal ion at the active site and is responsible for a peptidoglycan modification that counteracts the host immune response. Both PuuE allantoinase and HpPgdA function as a homotetramer. The monomer is composed of a 7-stranded barrel with detectable sequence similarity to the 6-stranded barrel NodB homology domain of polysaccharide deacetylase (DCA)-like proteins in the CE4 superfamily, which removes N-linked or O-linked acetyl groups from cell wall polysaccharides. However, in contrast with the typical DCAs, PuuE allantoinase and HpPgdA might not exhibit a solvent-accessible polysaccharide binding groove and only recognize a small substrate molecule. ArnD catalyzes the deformylation of 4-deoxy-4-formamido-L-arabinose-phosphoundecaprenol to 4-amino-4-deoxy-L-arabinose-phosphoundecaprenol.	247
213022	cd10917	CE4_NodB_like_6s_7s	Catalytic NodB homology domain of rhizobial NodB-like proteins. This family belongs to the large and functionally diverse carbohydrate esterase 4 (CE4) superfamily, whose members show strong sequence similarity with some variability due to their distinct carbohydrate substrates. It includes many rhizobial NodB chitooligosaccharide N-deacetylase (EC 3.5.1.-)-like proteins, mainly from bacteria and eukaryotes, such as chitin deacetylases (EC 3.5.1.41), bacterial peptidoglycan N-acetylglucosamine deacetylases (EC 3.5.1.-), and acetylxylan esterases (EC 3.1.1.72), which catalyze the N- or O-deacetylation of substrates such as acetylated chitin, peptidoglycan, and acetylated xylan. All members of this family contain a catalytic NodB homology domain with the same overall topology and a deformed (beta/alpha)8 barrel fold with 6- or 7 strands. Their catalytic activity is dependent on the presence of a divalent cation, preferably cobalt or zinc, and they employ a conserved His-His-Asp zinc-binding triad closely associated with the conserved catalytic base (aspartic acid) and acid (histidine) to carry out acid/base catalysis. Several family members show diversity both in metal ion specificities and in the residues that coordinate the metal.	171
213023	cd10918	CE4_NodB_like_5s_6s	Putative catalytic NodB homology domain of PgaB, IcaB, and similar proteins which consist of a deformed (beta/alpha)8 barrel fold with 5- or 6-strands. This family belongs to the large and functionally diverse carbohydrate esterase 4 (CE4) superfamily, whose members show strong sequence similarity with some variability due to their distinct carbohydrate substrates. It includes bacterial poly-beta-1,6-N-acetyl-D-glucosamine N-deacetylase PgaB, hemin storage system HmsF protein in gram-negative species, intercellular adhesion proteins IcaB, and many uncharacterized prokaryotic polysaccharide deacetylases. It also includes a putative polysaccharide deacetylase YxkH encoded by the Bacillus subtilis yxkH gene, which is one of six polysaccharide deacetylase gene homologs present in the Bacillus subtilis genome. Sequence comparison shows all family members contain a conserved domain similar to the catalytic NodB homology domain of rhizobial NodB-like proteins, which consists of a deformed (beta/alpha)8 barrel fold with 6 or 7 strands. However, in this family, most proteins have 5 strands and some have 6 strands. Moreover, long insertions are found in many family members, whose function remains unknown.	157
200545	cd10919	CE4_CDA_like	Putative catalytic domain of chitin deacetylase-like proteins from insects and similar proteins. Chitin deacetylases (CDAs, EC 3.5.1.41) are secreted metalloproteins belonging to a family of extracellular chitin-modifying enzymes that catalyze the N-deacetylation of chitin, a beta-1,4-linked N-acetylglucosamine polymer, to form chitosan, a polymer of beta-(1,4)-linked d-glucosamine residues. CDAs have been isolated and characterized from various bacterial and fungal species and belong to the larger carbohydrate esterase family 4 (CE4). This family includes many CDA-like proteins, mainly from insects, which contain a putative CDA-like catalytic domain similar to the catalytic NodB homology domain of CE4 esterases. Some family members have an additional chitin binding domain (ChBD), or an additional low-density lipoprotein receptor class A domain (LDLa), or both. Due to the lack of some catalytically relevant residues, several insect CDA-like proteins are devoid of enzymatic activity and may simply bind to chitin and thus influence the mechanical or permeability properties of chitin-containing structures such as the cuticle or the peritrophic membrane. This family also includes many uncharacterized hypothetical proteins from bacteria, exhibiting high sequence similarity to insect CDA-like proteins.	273
200546	cd10920	CE4_WbmS	Catalytic domain of a putative polysaccharide deacetylase WbmS from Bordetella bronchiseptica and similar proteins. This family is represented by a putative polysaccharide deacetylase encoded by the O-antigen-related gene wbmS in Bordetella bronchiseptica. Although its precise function remains unknown, it has been suggested that WbmS might be involved in the biosynthesis of O-antigen, an important component of the gram-negative bacterial outer membrane, and may also play a role in sugar phosphate transfer. Structural superposition and sequence comparison show that WbmS consists of a conserved domain similar to the 7-stranded barrel catalytic domain of polysaccharide deacetylases (DACs) from the carbohydrate esterase 4 (CE4) superfamily, which removes N-linked acetyl groups from cell wall polysaccharides.	233
200547	cd10921	CE4_MJ0505_like	Putative catalytic domain of uncharacterized protein MJ0505 from Methanocaldococcus jannaschii and similar proteins. This family contains an uncharacterized protein MJ0505 from Methanocaldococcus jannaschii and its prokaryotic homologs. Although their biochemical properties remain to be determined, members in this family are composed of a seven-stranded barrel with a detectable sequence similarity to the six-stranded barrel rhizobial NodB-like proteins, which remove N-linked or O-linked acetyl groups of cell wall polysaccharides and belong to a larger carbohydrate esterase 4 (CE4) superfamily.	206
200548	cd10922	CE4_PelA_like_C	C-terminal Putative NodB-like catalytic domain of PelA-like uncharacterized hypothetical proteins found in bacteria. This family is represented by a protein PelA of unknown function that is encoded by a gene in the pelA-G gene cluster for pellicle production and biofilm formation in Pseudomonas aeruginosa. PelA and most of the family members contain a domain of unknown function, DUF297, in the N-terminus and a C-terminal domain that shows high sequence similarity to the catalytic domain of the six-stranded barrel rhizobial NodB-like proteins, which remove N-linked or O-linked acetyl groups from cell wall polysaccharides and belong to the larger carbohydrate esterase 4 (CE4) superfamily.	266
200549	cd10923	CE4_COG5298	Putative NodB-like catalytic domain of uncharacterized proteins found in bacteria. This family corresponds to a group of uncharacterized bacterial proteins with high sequence similarity to the catalytic domain of the six-stranded barrel rhizobial NodB-like proteins, which remove N-linked or O-linked acetyl groups from cell wall polysaccharides and belong to the larger carbohydrate esterase 4 (CE4) superfamily. Some family members contain an additional copper amine oxidase N-terminal domain.	250
200550	cd10924	CE4_COG4878	Putative NodB-like catalytic domain of uncharacterized proteins found in bacteria. The family corresponds to a group of uncharacterized bacterial proteins with high sequence similarity to the catalytic domain of the six-stranded barrel rhizobial NodB-like proteins, which remove N-linked or O-linked acetyl groups from cell wall polysaccharides and belong to the larger carbohydrate esterase 4 (CE4) superfamily.	273
200551	cd10925	CE4_u1	Putative catalytic domain of uncharacterized bacterial proteins from the carbohydrate esterase 4 superfamily. This family corresponds to a group of uncharacterized bacterial proteins with high sequence similarity to the catalytic domain of the six-stranded barrel rhizobial NodB-like proteins, which remove N-linked or O-linked acetyl groups from cell wall polysaccharides and belong to the larger carbohydrate esterase 4 (CE4) superfamily.	216
200552	cd10926	CE4_u2	Putative catalytic domain of uncharacterized bacterial proteins from the carbohydrate esterase 4 superfamily. This family corresponds to a group of uncharacterized bacterial proteins with high sequence similarity to the catalytic domain of the six-stranded barrel rhizobial NodB-like proteins, which remove N-linked or O-linked acetyl groups from cell wall polysaccharides and belong to the larger carbohydrate esterase 4 (CE4) superfamily.	253
200553	cd10927	CE4_u3	Putative catalytic domain of uncharacterized bacterial proteins from the carbohydrate esterase 4 superfamily. This family corresponds to a group of uncharacterized bacterial proteins with high sequence similarity to the catalytic domain of the six-stranded barrel rhizobial NodB-like proteins, which remove N-linked or O-linked acetyl groups from cell wall polysaccharides and belong to the larger carbohydrate esterase 4 (CE4) superfamily.	227
200554	cd10928	CE4_u4	Putative catalytic domain of uncharacterized bacterial proteins from the carbohydrate esterase 4 superfamily. This family corresponds to a group of uncharacterized bacterial proteins with high sequence similarity to the catalytic domain of the six-stranded barrel rhizobial NodB-like proteins, which remove N-linked or O-linked acetyl groups from cell wall polysaccharides and belong to the larger carbohydrate esterase 4 (CE4) superfamily.	222
200555	cd10929	CE4_u5	Putative catalytic domain of uncharacterized bacterial proteins from the carbohydrate esterase 4 superfamily. This family corresponds to a group of uncharacterized bacterial proteins with high sequence similarity to the catalytic domain of the six-stranded barrel rhizobial NodB-like proteins, which remove N-linked or O-linked acetyl groups from cell wall polysaccharides and belong to the larger carbohydrate esterase 4 (CE4) superfamily.	263
200556	cd10930	CE4_u6	Putative catalytic domain of uncharacterized bacterial proteins from the carbohydrate esterase 4 superfamily. This family corresponds to a group of uncharacterized bacterial proteins with high sequence similarity to the catalytic domain of the six-stranded barrel rhizobial NodB-like proteins, which remove N-linked or O-linked acetyl groups from cell wall polysaccharides and belong to the larger carbohydrate esterase 4 (CE4) superfamily.	240
200557	cd10931	CE4_u7	Putative catalytic domain of uncharacterized bacterial proteins from the carbohydrate esterase 4 superfamily. This family corresponds to a group of uncharacterized bacterial proteins with high sequence similarity to the catalytic domain of the six-stranded barrel rhizobial NodB-like proteins, which remove N-linked or O-linked acetyl groups from cell wall polysaccharides and belong to the larger carbohydrate esterase 4 (CE4) superfamily.	224
200558	cd10932	CE4_u8	Putative catalytic domain of uncharacterized bacterial proteins from the carbohydrate esterase 4 superfamily. This family corresponds to a group of uncharacterized bacterial proteins with high sequence similarity to the catalytic domain of the six-stranded barrel rhizobial NodB-like proteins, which remove N-linked or O-linked acetyl groups from cell wall polysaccharides and belong to the larger carbohydrate esterase 4 (CE4) superfamily.	324
200559	cd10933	CE4_u9	Putative catalytic domain of uncharacterized bacterial proteins from the carbohydrate esterase 4 superfamily. This family corresponds to a group of uncharacterized bacterial proteins with high sequence similarity to the catalytic domain of the six-stranded barrel rhizobial NodB-like proteins, which remove N-linked or O-linked acetyl groups from cell wall polysaccharides and belong to the larger carbohydrate esterase 4 (CE4) superfamily.	266
200560	cd10934	CE4_cadherin_MopE_like_N	N-terminal Putative NodB-like catalytic domain of hypothetical proteins containing C-terminal cadherin or MopE copper binding domains. The family includes several cadherin or MopE copper binding domain containing hypothetical proteins found in bacteria. Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin domains occur as repeats in the extracellular regions which are thought to mediate cell-cell contact when bound to calcium. They play a role in cell fate, signalling, proliferation, differentiation, and migration. The copper binding domain involves a tryptophan metabolite, kynurenine, in the protein MopE. Members of this family contain an additional conserved domain, which is N-terminally fused to the cadherin domain or the MopE copper binding domain. Although its function remains unclear, the conserved domain exhibits a seven-stranded barrel with a detectable sequence similarity to the six-stranded barrel rhizobial NodB-like proteins, which remove N-linked or O-linked acetyl groups from cell wall polysaccharides and belong to the larger carbohydrate esterase 4 (CE4) superfamily.	267
200561	cd10935	CE4_WalW	Putative catalytic domain of lipopolysaccharide biosynthesis protein WalW and its bacterial homologs. This family corresponds to a group of uncharacterized lipopolysaccharide biosynthesis WalW proteins found in bacteria. Although their biochemical properties remain to be determined, members of this family are composed of a seven-stranded barrel with detectable sequence similarity to the six-stranded barrel rhizobial NodB-like proteins, which remove N-linked or O-linked acetyl groups from cell wall polysaccharides and belong to the larger carbohydrate esterase 4 (CE4) superfamily.	295
200562	cd10936	CE4_DAC2	Putative catalytic domain of family 2 polysaccharide deacetylases (DACs) from bacteria. This family contains an uncharacterized protein BH1492 from Bacillus halodurans, an uncharacterized protein ATU2773 from Agrobacterium tumefaciens C58, and other bacterial hypothetical proteins. Although their functions are still unknown, structural superposition and sequence comparison suggest that BH1492 and ATU2773 might be divergently related to the 7-stranded barrel catalytic domain of polysaccharide deacetylases (DACs) from the carbohydrate esterase 4 (CE4) superfamily, which remove N-linked acetyl groups from cell wall polysaccharides. This family is designated as DAC family 2, a divergent DAC family.	215
200563	cd10938	CE4_HpPgdA_like	Catalytic domain of Helicobacter pylori peptidoglycan deacetylase (HpPgdA) and similar proteins. This family is represented by a peptidoglycan deacetylase (HP0310, HpPgdA) from the gram-negative pathogen Helicobacter pylori. HpPgdA has the ability to bind a metal ion at the active site and is responsible for a peptidoglycan modification that counteracts the host immune response. It functions as a homotetramer. The monomer is composed of a 7-stranded barrel with detectable sequence similarity to the 6-stranded barrel NodB homology domain of polysaccharide deacetylase (DCA)-like proteins in the CE4 superfamily, which removes N-linked or O-linked acetyl groups from cell wall polysaccharides. In contrast to typical NodB-like DCAs, HpPgdA does not exhibit a solvent-accessible polysaccharide binding groove, suggesting that the enzyme binds a small molecule at the active site.	258
200564	cd10939	CE4_ArnD	Catalytic domain of Escherichia coli 4-deoxy-4-formamido-L-arabinose-phosphoundecaprenol deformylase ArnD and other bacterial homologs. This family is represented by Escherichia coli 4-deoxy-4-formamido-L-arabinose-phosphoundecaprenol deformylase ArnD (EC 3.5.1.n3). ArnD plays an important role in the biosynthesis of undecaprenyl phosphate alpha-4-amino-4-deoxy-L-arabinose (alpha-L-Ara4N). It catalyzes the deformylation of 4-deoxy-4-formamido-L-arabinose-phosphoundecaprenol to 4-amino-4-deoxy-L-arabinose-phosphoundecaprenol. The ArnD-dependent deformylation likely occurs on the inner leaflet of the inner membrane. This family also includes many uncharacterized bacterial polysaccharide deacetylases. All family members show high sequence homology to the catalytic domain of bacterial PuuE (purine utilization E) allantoinases and Helicobacter pylori peptidoglycan deacetylase (HpPgdA), and are classified within the larger carbohydrate esterase 4 (CE4) superfamily.	290
200565	cd10940	CE4_PuuE_HpPgdA_like_1	Putative catalytic domain of uncharacterized bacterial polysaccharide deacetylases similar to bacterial PuuE allantoinases and Helicobacter pylori peptidoglycan deacetylase (HpPgdA). This family contains many uncharacterized bacterial polysaccharide deacetylases (DCAs) that show high sequence similarity to the catalytic domain of bacterial PuuE allantoinases and Helicobacter pylori peptidoglycan deacetylase (HpPgdA). PuuE allantoinase appears to be metal-independent and specifically catalyzes the hydrolysis of (S)-allantoin into allantoic acid. Different from PuuE allantoinase, HpPgdA has the ability to bind a metal ion at the active site and is responsible for a peptidoglycan modification that counteracts the host immune response. Both PuuE allantoinase and HpPgdA function as homotetramers. The monomer is composed of a 7-stranded barrel with detectable sequence similarity to the 6-stranded barrel NodB homology domain of DCA-like proteins in the CE4 superfamily, which removes N-linked or O-linked acetyl groups from cell wall polysaccharides. In contrast to typical NodB-like DCAs, PuuE allantoinase and HpPgdA do not exhibit a solvent-accessible polysaccharide binding groove and might only bind a small molecule at the active site.	306
200566	cd10941	CE4_PuuE_HpPgdA_like_2	Putative catalytic domain of uncharacterized prokaryotic polysaccharide deacetylases similar to bacterial PuuE allantoinases and Helicobacter pylori peptidoglycan deacetylase (HpPgdA). This family contains many uncharacterized prokaryotic polysaccharide deacetylases (DCAs) that show high sequence similarity to the catalytic domain of bacterial PuuE allantoinases and Helicobacter pylori peptidoglycan deacetylase (HpPgdA). PuuE allantoinase appears to be metal-independent and specifically catalyzes the hydrolysis of (S)-allantoin into allantoic acid. Different from PuuE allantoinase, HpPgdA has the ability to bind a metal ion at the active site and is responsible for a peptidoglycan modification that counteracts the host immune response. Both PuuE allantoinase and HpPgdA function as homotetramers. The monomer is composed of a 7-stranded barrel with detectable sequence similarity to the 6-stranded barrel NodB homology domain of DCA-like proteins in the CE4 superfamily, which removes N-linked or O-linked acetyl groups from cell wall polysaccharides. In contrast to typical NodB-like DCAs, PuuE allantoinase and HpPgdA do not exhibit a solvent-accessible polysaccharide binding groove and might only bind a small molecule at the active site.	258
200567	cd10942	CE4_u11	Putative catalytic domain of uncharacterized bacterial proteins from the carbohydrate esterase 4 superfamily. This family corresponds to a group of uncharacterized bacterial proteins with high sequence similarity to the catalytic domain of the six-stranded barrel rhizobial NodB-like proteins, which remove N-linked or O-linked acetyl groups from cell wall polysaccharides and belong to the larger carbohydrate esterase 4 (CE4) superfamily.	252
200568	cd10943	CE4_NodB	Putative catalytic domain of rhizobial NodB chitooligosaccharide N-deacetylase and its bacterial homologs. This family corresponds to rhizobial NodB chitooligosaccharide N-deacetylase (EC 3.5.1.-), encoded by nodB gene from the nodulation (nod) gene cluster that is responsible for the biosynthesis of bacterial nodulation signals, termed Nod factors. NodB is involved in de-N-acetylating the nonreducing N-acetylglucosamine residue of chitooligosaccharides to allow for the attachment of the fatty acyl group by the acyltransferase NodA. The monosaccharide N-acetylglucosamine cannot be deacetylated by NodB. NodB is composed of a 6-stranded barrel catalytic domain with detectable sequence similarity to the 7-stranded barrel homology domain of polysaccharide deacetylase (DCA)-like proteins in the larger carbohydrate esterase 4 (CE4) superfamily.	193
200569	cd10944	CE4_SmPgdA_like	Catalytic NodB homology domain of Streptococcus mutans polysaccharide deacetylase PgdA, Bacillus subtilis YheN, and similar proteins. This family is represented by a putative polysaccharide deacetylase PgdA from the oral pathogen Streptococcus mutans (SmPgdA) and Bacillus subtilis YheN (BsYheN), which are members of the carbohydrate esterase 4 (CE4) superfamily. SmPgdA is an extracellular metal-dependent polysaccharide deacetylase with a typical CE4 fold, with metal bound to a His-His-Asp triad. It possesses de-N-acetylase activity toward a hexamer of chitooligosaccharide N-acetylglucosamine, but not shorter chitooligosaccharides or a synthetic peptidoglycan tetrasaccharide. SmPgdA plays a role in tuning cell surface properties and in interactions with (salivary) agglutinin, an essential component of the innate immune system, most likely through deacetylation of an as-yet-unidentified polysaccharide. SmPgdA shows significant homology to the catalytic domains of peptidoglycan deacetylases from Streptococcus pneumoniae (SpPgdA) and Listeria monocytogenes (LmPgdA), both of which are involved in the bacterial defense mechanism against human mucosal lysozyme. The Bacillus subtilis genome contains six polysaccharide deacetylase gene homologs: pdaA, pdaB (previously known as ybaN), yheN, yjeA, yxkH and ylxY. The biological function of BsYheN is still unknown. This family also includes many uncharacterized polysaccharide deacetylases mainly found in bacteria.	189
200570	cd10946	CE4_Mll8295_like	Putative catalytic NodB homology domain of uncharacterized Mll8295 protein from Rhizobium loti and its bacterial homologs. This family is represented by a putative polysaccharide deacetylase Mll8295 from Rhizobium loti. Although its biological function still remains unknown, Mll8295 shows high sequence homology to the catalytic domain of Streptococcus pneumoniae polysaccharide deacetylase PgdA (SpPgdA), which is an extracellular metal-dependent polysaccharide deacetylase with de-N-acetylase activity toward a hexamer of chitooligosaccharide N-acetylglucosamine, but not shorter chitooligosaccharides or a synthetic peptidoglycan tetrasaccharide. Both Mll8295 and SpPgdA belong to the carbohydrate esterase 4 (CE4) superfamily. This family also includes many uncharacterized bacterial polysaccharide deacetylases.	217
200571	cd10947	CE4_SpPgdA_BsYjeA_like	Catalytic NodB homology domain of Streptococcus pneumoniae peptidoglycan deacetylase PgdA, Bacillus subtilis BsYjeA protein, and their bacterial homologs. This family is represented by Streptococcus pneumoniae peptidoglycan GlcNAc deacetylase (SpPgdA), a member of the carbohydrate esterase 4 (CE4) superfamily. SpPgdA protects gram-positive bacterial cell wall from host lysozymes by deacetylating peptidoglycan N-acetylglucosamine (GlcNAc) residues. It consists of three separate domains: N-terminal, middle and C-terminal (catalytic) domains. The catalytic NodB homology domain is similar to the deformed (beta/alpha)8 barrel fold adopted by other CE4 esterases, which harbors a mononuclear metalloenzyme employing a conserved His-His-Asp zinc-binding triad closely associated with conserved catalytic base (aspartic acid) and acid (histidine) to carry out acid/base catalysis. The enzyme is able to accept GlcNAc3 as a substrate, with the N-acetyl of the middle sugar being removed by the enzyme. This family also includes Bacillus subtilis BsYjeA protein encoded by the yjeA gene, which is one of the six polysaccharide deacetylase gene homologs (pdaA, pdaB/ybaN, yheN, yjeA, yxkH and ylxY) in the Bacillus subtilis genome. Although homology comparison shows that the BsYjeA protein contains a polysaccharide deacetylase domain, and was predicted to be a membrane-bound xylanase or a membrane-bound chitooligosaccharide deacetylase, more recent research indicates BsYjeA might be a novel non-specific secretory endonuclease which creates random nicks progressively on the two strands of dsDNA, resulting in highly distinguishable intermediates/products very different in chemical and physical compositions over time. In addition, BsYjeA shares several enzymatic properties with the well-understood DNase I endonuclease. Both enzymes are active on ssDNA and dsDNA, both generate random nicks, and both require Mg2+ or Mn2+ for hydrolytic activity.	177
200572	cd10948	CE4_BsPdaA_like	Catalytic NodB homology domain of Bacillus subtilis polysaccharide deacetylase PdaA, and its bacterial homologs. The Bacillus subtilis genome contains six polysaccharide deacetylase gene homologs: pdaA, pdaB (previously known as ybaN), yheN, yjeA, yxkH and ylxY. This family is represented by Bacillus subtilis pdaA gene encoding polysaccharide deacetylase BsPdaA, which is a member of the carbohydrate esterase 4 (CE4) superfamily. BsPdaA deacetylates peptidoglycan N-acetylmuramic acid (MurNAc) residues to facilitate the formation of muramic delta-lactam, which is required for recognition of germination lytic enzymes. BsPdaA deficiency leads to the absence of muramic delta-lactam residues in the spore cortex. Like other CE4 esterases, BsPdaA consists of a single catalytic NodB homology domain that appears to adopt a deformed (beta/alpha)8 barrel fold with a putative substrate binding groove harboring the majority of the conserved residues. It utilizes a general acid/base catalytic mechanism involving a tetrahedral transition intermediate, where a water molecule functions as the nucleophile tightly associated to the zinc cofactor.	223
200573	cd10949	CE4_BsPdaB_like	Putative catalytic NodB homology domain of Bacillus subtilis putative polysaccharide deacetylase PdaB, and its bacterial homologs. The Bacillus subtilis genome contains six polysaccharide deacetylase gene homologs: pdaA, pdaB (previously known as ybaN), yheN, yjeA, yxkH and ylxY. This family is represented by the putative polysaccharide deacetylase PdaB encoded by the pdaB gene on sporulation of Bacillus subtilis. Although its biochemical properties remain to be determined, the PdaB (YbaN) protein is essential for maintaining spores after the late stage of sporulation and is highly conserved in spore-forming bacteria. The glycans of the spore cortex may be candidate PdaB substrates. Based on sequence similarity, the family members are classified as carbohydrate esterase 4 (CE4) superfamily members. However, the classical His-His-Asp zinc-binding motif of CE4 esterases is missing in this family.	192
200574	cd10950	CE4_BsYlxY_like	Putative catalytic NodB homology domain of uncharacterized protein YlxY from Bacillus subtilis and its bacterial homologs. The Bacillus subtilis genome contains six polysaccharide deacetylase gene homologs: pdaA, pdaB (previously known as ybaN), yheN, yjeA, yxkH and ylxY. This family is represented by Bacillus subtilis putative polysaccharide deacetylase BsYlxY, encoded by the ylxY gene, which is a member of the carbohydrate esterase 4 (CE4) superfamily. Although its biological function still remains unknown, BsYlxY shows high sequence homology to the catalytic domain of Bacillus subtilis pdaB gene encoding a putative polysaccharide deacetylase (BsPdaB), which is essential for the maintenance of spores after the late stage of sporulation and is highly conserved in spore-forming bacteria. However, disruption of the ylxY gene in B. subtilis did not cause any sporulation defect. Moreover, the Asp residue in the classical His-His-Asp zinc-binding motif of CE4 esterases is mutated to a Val residue in this family. Other catalytically relevant residues of CE4 esterases are also not conserved, which suggests that members of this family may be inactive.	188
200575	cd10951	CE4_ClCDA_like	Catalytic NodB homology domain of Colletotrichum lindemuthianum chitin deacetylase and similar proteins. This family is represented by the chitin deacetylase (endo-chitin de-N-acetylase, ClCDA, EC 3.5.1.41) from Colletotrichum lindemuthianum (also known as Glomerella lindemuthiana), which is a member of the carbohydrate esterase 4 (CE4) superfamily. ClCDA catalyzes the hydrolysis of N-acetamido groups of N-acetyl-D-glucosamine residues in chitin, converting it to chitosan in fungal cell walls. It consists of a single catalytic domain similar to the deformed (alpha/beta)8 barrel fold adopted by other CE4 esterases, which encompasses a mononuclear metalloenzyme employing a conserved His-His-Asp zinc-binding triad closely associated with the conserved catalytic base (aspartic acid) and acid (histidine), to carry out acid/base catalysis. It possesses a highly conserved substrate-binding groove, with subtle alterations that influence substrate specificity and subsite affinity. Unlike its bacterial homologs, ClCDA contains two intramolecular disulfide bonds that may add stability to this secreted protein. The family also includes many uncharacterized deacetylases and hypothetical proteins mainly from eukaryotes, which show high sequence similarity to ClCDA.	197
200576	cd10952	CE4_MrCDA_like	Catalytic NodB homology domain of Mucor rouxii chitin deacetylase and similar proteins. This family is represented by the chitin deacetylase (MrCDA, EC 3.5.1.41) encoded from the fungus Mucor rouxii (also known as Amylomyces rouxii). MrCDA is an acidic glycoprotein with a very stringent specificity for beta1-4-linked N-acetylglucosamine homopolymers. It requires at least four residues (chitotetraose) for catalysis, and can achieve extensive deacetylation on chitin polymers. MrCDA shows high sequence similarity to Colletotrichum lindemuthianum chitin deacetylase (endo-chitin de-N-acetylase, ClCDA), which consists of a single catalytic domain similar to the deformed (beta/alpha)8 barrel fold adopted by the carbohydrate esterase 4 (CE4) superfamily, which encompasses a mononuclear metalloenzyme employing a conserved His-His-Asp zinc-binding triad closely associated with the conserved catalytic base (aspartic acid) and acid (histidine) to carry out acid/base catalysis. The family also includes some uncharacterized eukaryotic and bacterial homologs of MrCDA.	178
200577	cd10953	CE4_SlAXE_like	Catalytic NodB homology domain of Streptomyces lividans acetylxylan esterase and its bacterial homologs. This family is represented by Streptomyces lividans acetylxylan esterase (SlAXE, EC 3.1.1.72), a member of the carbohydrate esterase 4 (CE4) superfamily. SlAXE deacetylates O-acetylated xylan, a key component of plant cell walls. It shows no detectable activity on generic esterase substrates including para-nitrophenyl acetate. It is specific for sugar-based substrates and will precipitate acetylxylan as a result of deacetylation. SlAXE also functions as a chitin and chitooligosaccharide de-N-acetylase with equal efficiency to its activity on xylan. SlAXE forms a dimer. Each monomer contains a catalytic NodB homology domain with the same overall topology and a deformed (beta/alpha)8 barrel fold as other CE4 esterases, which encompasses a mononuclear metalloenzyme employing a conserved His-His-Asp zinc-binding triad closely associated with the conserved catalytic base (aspartic acid) and acid (histidine), to carry out acid/base catalysis. SlAXE possesses a single metal center with a chemical preference for Co2+.	179
200578	cd10954	CE4_CtAXE_like	Catalytic NodB homology domain of Clostridium thermocellum acetylxylan esterase and its bacterial homologs. This family is represented by Clostridium thermocellum acetylxylan esterase (CtAXE, EC 3.1.1.72), a member of the carbohydrate esterase 4 (CE4) superfamily. CtAXE deacetylates O-acetylated xylan, a key component of plant cell walls. It shows no detectable activity on generic esterase substrates including para-nitrophenyl acetate. It is specific for sugar-based substrates and will precipitate acetylxylan, as a consequence of deacetylation. CtAXE is a monomeric protein containing a catalytic NodB homology domain with the same overall topology and a deformed (beta/alpha)8 barrel fold as other CE4 esterases. However, due to differences in the topography of the substrate-binding groove, the chemistry of the active center, and metal ion coordination, CtAXE has different metal ion preference and lacks activity on N-acetyl substrates. It is significantly activated by Co2+. Moreover, CtAXE displays distinctly different ligand coordination to the metal ion, utilizing an aspartate, a histidine, and four water molecules, as opposed to the conserved His-His-Asp zinc-binding triad of other CE4 esterases.	180
200579	cd10955	CE4_BH0857_like	Putative catalytic NodB homology domain of uncharacterized BH0857 protein from Bacillus halodurans and its bacterial homologs. This family is represented by a putative polysaccharide deacetylase BH0857 from Bacillus halodurans. Although its biological function still remains unknown, BH0857 shows high sequence homology to the catalytic NodB homology domain of Streptococcus pneumoniae polysaccharide deacetylase PgdA (SpPgdA), which is an extracellular metal-dependent polysaccharide deacetylase with de-N-acetylase activity toward a hexamer of chitooligosaccharide N-acetylglucosamine, but not shorter chitooligosaccharides or a synthetic peptidoglycan tetrasaccharide. Both BH0857 and SpPgdA belong to the carbohydrate esterase 4 (CE4) superfamily. This family also includes many uncharacterized bacterial polysaccharide deacetylases.	195
200580	cd10956	CE4_BH1302_like	Putative catalytic NodB homology domain of uncharacterized BH1302 protein from Bacillus halodurans and its bacterial homologs. This family is represented by a putative polysaccharide deacetylase BH1302 from Bacillus halodurans. Although its biological function is unknown, BH1302 shows high sequence homology to the catalytic NodB homology domain of Streptococcus pneumoniae polysaccharide deacetylase PgdA (SpPgdA), which is an extracellular metal-dependent polysaccharide deacetylase with de-N-acetylase activity toward a hexamer of chitooligosaccharide N-acetylglucosamine, but not shorter chitooligosaccharides or a synthetic peptidoglycan tetrasaccharide. Both BH1302 and SpPgdA belong to the carbohydrate esterase 4 (CE4) superfamily. This family also includes many uncharacterized bacterial polysaccharide deacetylases.	194
200581	cd10958	CE4_NodB_like_2	Catalytic NodB homology domain of uncharacterized chitin deacetylases and hypothetical proteins. This family includes some uncharacterized chitin deacetylases and hypothetical proteins, mainly from eukaryotes. Although their biological function is unknown, members in this family show high sequence homology to the catalytic NodB homology domain of Colletotrichum lindemuthianum chitin deacetylase (endo-chitin de-N-acetylase, ClCDA, EC 3.5.1.41), which catalyzes the hydrolysis of N-acetamido groups of N-acetyl-D-glucosamine residues in chitin, converting it to chitosan in fungal cell walls. Like ClCDA, this family is a member of the carbohydrate esterase 4 (CE4) superfamily.	190
200582	cd10959	CE4_NodB_like_3	Catalytic NodB homology domain of uncharacterized bacterial polysaccharide deacetylases. This family includes many uncharacterized bacterial polysaccharide deacetylases. Although their biological function still remains unknown, members in this family show high sequence homology to the catalytic NodB homology domain of Streptococcus pneumoniae polysaccharide deacetylase PgdA (SpPgdA), which is an extracellular metal-dependent polysaccharide deacetylase with de-N-acetylase activity toward a hexamer of chitooligosaccharide N-acetylglucosamine, but not shorter chitooligosaccharides or a synthetic peptidoglycan tetrasaccharide. Like SpPgdA, this family is a member of the carbohydrate esterase 4 (CE4) superfamily.	187
200583	cd10960	CE4_NodB_like_1	Catalytic NodB homology domain of uncharacterized bacterial polysaccharide deacetylases. This family includes many uncharacterized bacterial polysaccharide deacetylases. Although their biological function still remains unknown, members in this family show high sequence homology to the catalytic NodB homology domain of Streptococcus pneumoniae polysaccharide deacetylase PgdA (SpPgdA), which is an extracellular metal-dependent polysaccharide deacetylase with de-N-acetylase activity toward a hexamer of chitooligosaccharide N-acetylglucosamine, but not shorter chitooligosaccharides or a synthetic peptidoglycan tetrasaccharide. Like SpPgdA, this family is a member of the carbohydrate esterase 4 (CE4) superfamily.	238
200584	cd10962	CE4_GT2-like	Catalytic NodB homology domain of uncharacterized bacterial glycosyl transferase, group 2-like family proteins. This family includes many uncharacterized bacterial proteins containing an N-terminal GH18 (glycosyl hydrolase, family 18) domain, a middle NodB-like homology domain, and a C-terminal GT2-like (glycosyl transferase group 2) domain. Although their biological function is unknown, members in this family contain a middle NodB homology domain that is similar to the catalytic domain of Streptococcus pneumoniae polysaccharide deacetylase PgdA (SpPgdA), an extracellular metal-dependent polysaccharide deacetylase with de-N-acetylase activity toward a hexamer of chitooligosaccharide N-acetylglucosamine, but not shorter chitooligosaccharides or a synthetic peptidoglycan tetrasaccharide. Like SpPgdA, this family is a member of the carbohydrate esterase 4 (CE4) superfamily. The presence of three domains suggests that members of this family may be multifunctional.	196
200585	cd10963	CE4_RC0012_like	Putative catalytic NodB homology domain of uncharacterized protein RC0012 from Rickettsia conorii and its bacterial homologs. This family contains an uncharacterized protein RC0012 from Rickettsia conorii and its bacterial homologs. Although their biochemical properties remain to be determined, members of this family seem to be composed of a seven-stranded barrel with detectable sequence similarity to the six-stranded barrel rhizobial NodB-like proteins, which remove N-linked or O-linked acetyl groups from cell wall polysaccharides and belong to the larger carbohydrate esterase 4 (CE4) superfamily.	182
200586	cd10964	CE4_PgaB_5s	N-terminal putative catalytic polysaccharide deacetylase domain of bacterial poly-beta-1,6-N-acetyl-D-glucosamine N-deacetylase PgaB, and similar proteins. This family is represented by an outer membrane lipoprotein, poly-beta-1,6-N-acetyl-D-glucosamine N-deacetylase (PgaB, EC 3.5.1.-), encoded by Escherichia coli pgaB gene from the pgaABCD (formerly ycdSRQP) operon, which affects biofilm development by promoting abiotic surface binding and intercellular adhesion. PgaB catalyzes the N-deacetylation of poly-beta-1,6-N-acetyl-D-glucosamine (PGA), a biofilm adhesin polysaccharide that stabilizes biofilms of E. coli and other bacteria. PgaB contains an N-terminal NodB homology domain with a 5-stranded beta/alpha barrel, and a C-terminal carbohydrate binding domain required for PGA N-deacetylation, which may be involved in binding to unmodified poly-beta-1,6-GlcNAc and assisting catalysis by the deacetylase domain. This family also includes several orthologs of PgaB, such as the hemin storage system HmsF protein, encoded by Yersinia pestis hmsF gene from the hmsHFRS operon, which is essential for Y. pestis biofilm formation. Like PgaB, HmsF is an outer membrane protein with an N-terminal NodB homology domain, which is likely involved in the modification of the exopolysaccharide (EPS) component of the biofilm. HmsF also has a conserved but uncharacterized C-terminal domain that is present in other HmsF-like proteins in Gram-negative bacteria. This alignment model corresponds to the N-terminal NodB homology domain.	193
200587	cd10965	CE4_IcaB_5s	Putative catalytic polysaccharide deacetylase domain of bacterial intercellular adhesion protein IcaB and similar proteins. The family is represented by the surface-attached protein intercellular adhesion protein IcaB (Poly-beta-1,6-N-acetyl-D-glucosamine N-deacetylase, EC 3.5.1.-), encoded by Staphylococcus epidermidis icaB gene from the icaABC gene cluster that is involved in the synthesis of polysaccharide intercellular adhesin (PIA), which is located mainly on the cell surface. IcaB is a secreted, cell wall-associated protein that plays a crucial role in exopolysaccharide modification in bacterial biofilm formation. It catalyzes the N-deacetylation of poly-beta-1,6-N-acetyl-D-glucosamine (PNAG, also referred to as PIA), a biofilm adhesin polysaccharide. IcaB shows high homology to the N-terminal NodB homology domain of Escherichia coli PgaB. At this point, they are classified in the same family.	172
213024	cd10966	CE4_yadE_5s	Putative catalytic polysaccharide deacetylase domain of uncharacterized protein yadE and similar proteins. This family contains an uncharacterized protein yadE from Escherichia coli and its bacterial homologs. Although its molecular function remains unknown, yadE shows high sequence similarity with the catalytic NodB homology domain of outer membrane lipoprotein PgaB and the surface-attached protein intercellular adhesion protein IcaB. Both PgaB and IcaB are essential in bacterial biofilm formation.	164
200589	cd10967	CE4_GLA_like_6s	Putative catalytic NodB homology domain of gellan lyase and similar proteins. This family is represented by the extracellular polysaccharide-degrading enzyme, gellan lyase (gellanase, EC 4.2.2.-), from Bacillus sp. The enzyme acts on gellan exolytically and releases a tetrasaccharide of glucuronyl-glucosyl-rhamnosyl-glucose with unsaturated glucuronic acid at the nonreducing terminus. The family also includes many uncharacterized prokaryotic polysaccharide deacetylases, which show high sequence similarity to Bacillus sp. gellan lyase. Although their biological functions remain unknown, all members of the family contain a conserved domain with a 6-stranded beta/alpha barrel, which is similar to the catalytic NodB homology domain of rhizobial NodB-like proteins, belonging to the larger carbohydrate esterase 4 (CE4) superfamily.	202
213025	cd10968	CE4_Mlr8448_like_5s	Putative catalytic NodB homology domain of Mesorhizobium loti Mlr8448 protein and its bacterial homologs. This family contains Mesorhizobium loti Mlr8448 protein and its bacterial homologs. Although their biochemical properties are yet to be determined, members in this subfamily contain a conserved domain with a 5-stranded beta/alpha barrel, which is similar to the catalytic NodB homology domain of rhizobial NodB-like proteins, belonging to the larger carbohydrate esterase 4 (CE4) superfamily.	161
213026	cd10969	CE4_Ecf1_like_5s	Putative catalytic NodB homology domain of a hypothetical protein Ecf1 from Escherichia coli and similar proteins. This family contains a hypothetical protein Ecf1 from Escherichia coli and its prokaryotic homologs. Although their biochemical properties remain to be determined, members in this family contain a conserved domain with a 5-stranded beta/alpha barrel, which is similar to the catalytic NodB homology domain of rhizobial NodB-like proteins, belonging to the larger carbohydrate esterase 4 (CE4) superfamily.	218
213027	cd10970	CE4_DAC_u1_6s	Putative catalytic NodB homology domain of uncharacterized prokaryotic polysaccharide deacetylases which consist of a 6-stranded beta/alpha barrel. This family contains uncharacterized prokaryotic polysaccharide deacetylases. Although their biological functions remain unknown, all members of the family contain a conserved domain with a 6-stranded beta/alpha barrel, which is similar to the catalytic NodB homology domain of rhizobial NodB-like proteins, belonging to the larger carbohydrate esterase 4 (CE4) superfamily.	194
200593	cd10971	CE4_DAC_u2_5s	Putative catalytic NodB homology domain of uncharacterized prokaryotic polysaccharide deacetylases which consist of a 5-stranded beta/alpha barrel. This family contains many uncharacterized prokaryotic polysaccharide deacetylases. Although their biological functions remain unknown, all members of this family are predicted to contain a conserved domain with a 5-stranded beta/alpha barrel, which is similar to the catalytic NodB homology domain of rhizobial NodB-like proteins, belonging to the larger carbohydrate esterase 4 (CE4) superfamily.	198
200594	cd10972	CE4_DAC_u3_5s	Putative catalytic NodB homology domain of uncharacterized bacterial polysaccharide deacetylases which consist of a 5-stranded beta/alpha barrel. This family contains uncharacterized bacterial polysaccharide deacetylases. Although their biological functions remain unknown, all members of the family are predicted to contain a conserved domain with a 5-stranded beta/alpha barrel, which is similar to the catalytic NodB homology domain of rhizobial NodB-like proteins, belonging to the larger carbohydrate esterase 4 (CE4) superfamily.	216
213028	cd10973	CE4_DAC_u4_5s	Putative catalytic NodB homology domain of uncharacterized bacterial polysaccharide deacetylases which consist of a 5-stranded beta/alpha barrel. This family contains many uncharacterized bacterial polysaccharide deacetylases. Although their biological functions remain unknown, all members of the family are predicted to contain a conserved domain with a 5-stranded beta/alpha barrel, which is similar to the catalytic NodB homology domain of rhizobial NodB-like proteins, belonging to the larger carbohydrate esterase 4 (CE4) superfamily.	157
200596	cd10974	CE4_CDA_like_1	Putative catalytic domain of chitin deacetylase-like proteins with additional chitin-binding peritrophin-A domain (ChBD) and/or a low-density lipoprotein receptor class A domain (LDLa). Chitin deacetylases (CDAs, EC 3.5.1.41) are secreted metalloproteins belonging to a family of extracellular chitin-modifying enzymes that catalyze the N-deacetylation of chitin, a beta-1,4-linked N-acetylglucosamine polymer, to form chitosan, a polymer of beta-(1,4)-linked d-glucosamine residues. CDAs have been isolated and characterized from various bacterial and fungal species and belong to the larger carbohydrate esterase 4 (CE4) superfamily. This family includes many CDA-like proteins mainly from insects, which contain a putative CDA-like catalytic domain similar to the catalytic NodB homology domain of CE4 esterases. In addition to the CDA-like domain, family members contain two additional domains, a chitin-binding peritrophin-A domain (ChBD) and a low-density lipoprotein receptor class A domain (LDLa), or have the ChBD domain but do not have the LDLa domain.	269
200597	cd10975	CE4_CDA_like_2	Putative catalytic domain of chitin deacetylase-like proteins. Chitin deacetylases (CDAs, EC 3.5.1.41) are secreted metalloproteins belonging to a family of extracellular chitin-modifying enzymes that catalyze the N-deacetylation of chitin, a beta-1,4-linked N-acetylglucosamine polymer, to form chitosan, a polymer of beta-(1,4)-linked d-glucosamine residues. CDAs have been isolated and characterized from various bacterial and fungal species and belong to the larger carbohydrate esterase 4 (CE4) superfamily. This family includes many midgut-specific CDA-like proteins mainly from insects, such as Tribolium castaneum CDAs (TcCDA6-9). These proteins contain a putative CDA-like catalytic domain similar to the catalytic NodB homology domain of CE4 esterases. In addition to the CDA-like domain, some family members have an additional chitin-binding peritrophin-A domain (ChBD).	268
200598	cd10976	CE4_CDA_like_3	Putative catalytic domain of uncharacterized bacterial hypothetical proteins similar to insect chitin deacetylase-like proteins. The family includes many uncharacterized bacterial hypothetical proteins that show high sequence similarity to insect chitin deacetylase-like proteins. Chitin deacetylases (CDAs, EC 3.5.1.41) are secreted metalloproteins belonging to a family of extracellular chitin-modifying enzymes that catalyze the N-deacetylation of chitin, a beta-1,4-linked N-acetylglucosamine polymer, to form chitosan, a polymer of beta-(1,4)-linked d-glucosamine residues.	299
200599	cd10977	CE4_PuuE_SpCDA1	Catalytic domain of bacterial PuuE allantoinases, Schizosaccharomyces pombe chitin deacetylase 1 (SpCDA1), and similar proteins. Allantoinase (EC 3.5.2.5) can hydrolyze allantoin ((2,5-dioxoimidazolidin-4-yl)urea), one of the most important nitrogen carriers for some plants, soil animals, and microorganisms, to allantoate. DAL1 gene from Saccharomyces cerevisiae encodes an allantoinase. However, some organisms possess allantoinase activity but lack DAL1 allantoinase. In those organisms, a defective allantoinase gene, named puuE (purine utilization E), encodes an allantoinase that specifically catalyzes the hydrolysis of (S)-allantoin into allantoic acid. PuuE allantoinase is related to polysaccharide deacetylase (DCA), one member of the carbohydrate esterase 4 (CE4) superfamily, that removes N-linked or O-linked acetyl groups of cell wall polysaccharides, and lacks sequence similarity with the known DAL1 allantoinase that belongs to the amidohydrolase superfamily. PuuE allantoinase functions as a homotetramer. Its monomer is composed of a 7-stranded barrel with detectable sequence similarity to the 6-stranded barrel NodB homology domain of DCAs. It appears to be metal-independent and acts on a small substrate molecule, which is distinct from the common features of DCAs that are normally metal ion dependent and recognize multimeric substrates. This family also includes a chitin deacetylase 1 (SpCDA1) encoded by the Schizosaccharomyces pombe cda1 gene. Although the general function of chitin deacetylase (CDA) is the synthesis of chitosan from chitin, a polymer of N-acetyl glucosamine, to build up the proper ascospore wall, the actual function of SpCDA1 might involve allantoin hydrolysis. It is likely orthologous to PuuE allantoinase, whereas it is more distantly related to the CDAs found in other fungi, such as Saccharomyces cerevisiae and Mucor rouxii. Those CDAs are similar to rhizobial NodB protein and are not included in this family.	273
200600	cd10978	CE4_Sll1306_like	Putative catalytic domain of Synechocystis sp. Sll1306 protein and other bacterial homologs. The family contains Synechocystis sp. Sll1306 protein and uncharacterized bacterial polysaccharide deacetylases. Although their biological function remains unknown, they show very high sequence homology to the catalytic domain of bacterial PuuE (purine utilization E) allantoinases. PuuE allantoinase specifically catalyzes the hydrolysis of (S)-allantoin into allantoic acid. It functions as a homotetramer. Its monomer is composed of a 7-stranded barrel with detectable sequence similarity to the 6-stranded barrel NodB homology domain of polysaccharide deacetylase-like proteins in the CE4 superfamily, which removes N-linked or O-linked acetyl groups from cell wall polysaccharides. PuuE allantoinase appears to be metal-independent and acts on a small substrate molecule, which is distinct from the common feature of polysaccharide deacetylases that are normally metal ion dependent and recognize multimeric substrates.	271
200601	cd10979	CE4_PuuE_like	Putative catalytic domain of uncharacterized prokaryotic polysaccharide deacetylases similar to bacterial PuuE allantoinases. The family includes a group of uncharacterized prokaryotic polysaccharide deacetylases (DCAs) that show high sequence similarity to the catalytic domain of bacterial PuuE (purine utilization E) allantoinases. PuuE allantoinase specifically catalyzes the hydrolysis of (S)-allantoin into allantoic acid. It functions as a homotetramer. Its monomer is composed of a 7-stranded barrel with detectable sequence similarity to the 6-stranded barrel NodB homology domain of DCA-like proteins in the CE4 superfamily, which removes N-linked or O-linked acetyl groups from cell wall polysaccharides. PuuE allantoinase appears to be metal-independent and acts on a small substrate molecule, which is distinct from the common feature of DCAs which are normally metal ion dependent and recognize multimeric substrates.	281
200602	cd10980	CE4_SpCDA1	Putative catalytic domain of Schizosaccharomyces pombe chitin deacetylase 1 (SpCDA1), and similar proteins. This family is represented by Schizosaccharomyces pombe chitin deacetylase 1 (SpCDA1), encoded by the cda1 gene. The general function of chitin deacetylase (CDA) is the synthesis of chitosan from chitin, a polymer of N-acetyl glucosamine, to build up the proper ascospore wall. The actual function of SpCDA1 might involve allantoin hydrolysis. It is likely an ortholog of bacterial PuuE allantoinase, whereas it is more distantly related to the CDAs found in other fungi, such as Saccharomyces cerevisiae and Mucor rouxii. Those CDAs are similar to rhizobial NodB protein and are not included in this family.	297
211380	cd10981	ZnPC_S1P1	Zinc dependent phospholipase C/S1-P1 nuclease. This model describes both the bacterial and archaeal zinc-dependent phospholipase C, a domain found in the alpha toxin of Clostridium perfringens, as well as S1/P1 nucleases, which predominantly act on single-stranded DNA and RNA.	238
199826	cd10985	MH2_SMAD_2_3	C-terminal Mad Homology 2 (MH2) domain in SMAD2 and SMAD3. The MH2 domain is located at the C-terminus of the SMAD (small mothers against decapentaplegic) family of proteins, which are signal transducers and transcriptional modulators that mediate multiple signaling pathways. The MH2 domain is responsible for type I receptor interaction, phosphorylation-triggered homo- and hetero-oligomerization, and transactivation. It is negatively regulated by the N-terminal MH1 domain. SMAD2 and SMAD3 are receptor regulated SMADs (R-SMADs). SMAD2 regulates multiple cellular processes, such as cell proliferation, apoptosis and differentiation, while SMAD3 modulates signals of activin and TGF-beta.	191
199911	cd11005	M35_like	Peptidase M35 family. Family M35 Zn2+-metallopeptidase domain, also known as the deuterolysin family, contains fungal as well as bacterial metalloendopeptidases that include deuterolysin (EC 3.4.24.39), peptidyl-Lys metalloendopeptidase (MEP), penicillolysin, as well as uncharacterized sequences. Typically, members of this family of extracellular peptidases contain a unique zinc-binding motif (the aspzincin motif), defined by the HExxH + D motif where an aspartic acid is the third zinc ligand and is found in a GTXDXXYG motif C-terminal to the His zinc ligands. Deuterolysins are highly active towards basic nuclear proteins such as histones and protamines, with a preference for a Lys or Arg residue in the P1' subsite. MEPs specifically cleave peptidyl-lysine bonds (-X-Lys-) in proteins and peptides. Penicillolysin, a thermolabile protease from Penicillium citrinum, strongly hydrolyzes nuclear proteins such as clupeine, salmine and histone. Many members of the M35 peptidases display unusual thermostabilities.	167
199912	cd11006	M35_peptidyl-Lys_like	Peptidase M35 domain of peptidyl-Lys metalloendopeptidases and related proteins. This family M35 Zn2+-metallopeptidase extracellular domain is mostly found in proteins characterized as peptidyl-Lys metalloendopeptidases (MEP; peptidyllysine metalloproteinase; EC 3.4.24.20), including some well-characterized domains in Aeromonas salmonicida subsp. Achromogenes (AsaP1) and Grifola frondosa (GfMEP). These proteins specifically cleave peptidyl-lysine bonds (-X-Lys- where X may even be Pro) in proteins and peptides. AsaP1 peptidase has been shown to be important in the virulence of A. salmonicida subsp. achromogenes, having a major role in the fish innate immune response. Members of this family contain a unique zinc-binding motif (the aspzincin motif), defined by the HExxH + D motif where an aspartic acid is the third zinc ligand and is found in a GTXDXXYG or similar motif C-terminal to the His zinc ligands.	163
199913	cd11007	M35_like_1	Peptidase M35-like domain of uncharacterized proteins. This family contains proteins similar to the M35 Zn2+-metallopeptidases, also known as the deuterolysin family, presumably these are bacterial metalloendopeptidases that have yet to be characterized. Typically, members of this family of extracellular peptidases contain a unique zinc-binding motif (the aspzincin motif), defined by the HExxH + D motif where an aspartic acid is the third zinc ligand; however, members of this family do not contain the GTXDXXYG motif C-terminal to the His zinc ligands that is typical for the M35 proteases. Deuterolysins are highly active towards basic nuclear proteins such as histones and protamines, with a preference for a Lys or Arg residue in the P1' subsite. MEPs specifically cleave peptidyl-lysine bonds (-X-Lys-) in proteins and peptides. Many members of the M35 peptidases display unusual thermostabilities.	183
199914	cd11008	M35_deuterolysin_like	Peptidase M35 domain of deuterolysins and related proteins. This family M35 Zn2+-metallopeptidase extracellular domain is found in fungal deuterolysins (acid metalloproteinase, neutral proteinase II), including some well-characterized metallopeptidase domains in Aspergillus oryzae (NpII), Aspergillus fumigatus (MEP20), Penicillium roqueforti (protease II) and Emericella nidulans (PepJ peptidase). The neutral proteinase II from Aspergillus oryzae (NpII) unfolds reversibly upon incubation at higher temperatures, and loss in activity is mainly due to autoproteolysis. MEP20 is encoded by the mepB gene, which appears to be associated with the cytoplasmic degradation of small peptides. PepJ peptidase is a thermostable enzyme released under carbon starvation. Most members of this family contain a unique zinc-binding motif (the aspzincin motif), defined by the HExxH + D motif where an aspartic acid is the third zinc ligand and is found in a GTXDXXYG or similar motif C-terminal to the His zinc ligands. The aspzincin motif is poorly conserved in one subgroup, that includes Asp f2, a major allergen from Aspergillus fumigatus. This subgroup in addition lacks the key conserved Tyr residue which acts as a proton donor during catalysis, and no protease activity has been detected to date for Asp f2.	167
211381	cd11009	Zn_dep_PLPC	Zinc dependent phospholipase C (alpha toxin). This domain conveys a zinc dependent phospholipase C activity (EC 3.1.4.3). It is found in a monomeric phospholipase C of Bacillus cereus as well as in the alpha toxin of Clostridium perfringens and Clostridium bifermentans, which is involved in haemolysis and cell rupture. It is also found in a lecithinase of Listeria monocytogenes, which is involved in breaking the 2-membrane vacuoles that surround the bacterium.	218
211382	cd11010	S1-P1_nuclease	S1/P1 nucleases and related enzymes. This family summarizes both S1 and P1 nucleases (EC:3.1.30.1) which cleave RNA and single stranded DNA with no base specificity. S1 nuclease is more active on DNA than RNA. Its reaction products are oligonucleotides or single nucleotides with 5' phosphoryl groups. Although its primary substrate is single-stranded, it may also introduce single-stranded breaks in double-stranded DNA or RNA, or DNA-RNA hybrids. It is used as a reagent in nuclease protection assays and in removing single stranded tails from DNA molecules to create blunt ended molecules and opening hairpin loops generated during synthesis of double stranded cDNA. P1 nuclease cleaves its substrate at every position yielding nucleoside 5' monophosphates, and it does not recognize or act on double-stranded DNA. It is useful at removing single stranded strands hanging off the end of double stranded DNA and at completely cleaving melted DNA for simple DNA composition analysis.	249
259898	cd11012	CuRO_6_ceruloplasmin	The sixth cupredoxin domain of Ceruloplasmin. Ceruloplasmin is a multicopper oxidase essential for normal iron homeostasis and copper transport in blood. It also functions in amine oxidation and as an antioxidant preventing free radicals in serum. The protein has 6 cupredoxin domains with six copper centers; three mononuclear sites in domain 2, 4 and 6 and three in the form of trinuclear clusters at the interface of domains 1 and 6. Ceruloplasmin exhibits internal sequence homology that appears to have evolved from the triplication of a sequence unit composed of two tandem cupredoxin domains. This model represents the sixth cupredoxin domain of ceruloplasmin.	145
259899	cd11013	Plantacyanin	Plantacyanin is a subclass of phytocyanins, plant type I copper proteins. Plantacyanins belong to the phytocyanin family of blue copper proteins, a ubiquitous family of plant cupredoxins. Plantacyanin is involved in electron transfer reactions with the Cu center transitioning between the oxidized Cu(II) form and the reduced Cu(I) form. The exact function of plantacyanin is unknown. However plantacyanin is shown to play a role in reproduction in Arabidopsis. Plantacyanins may also be stress-related proteins and be involved in plant defense responses.	95
259900	cd11014	Mavicyanin	Mavicyanin is a subclass of phytocyanins, a plant blue copper protein. Mavicyanin is a glycosylated protein isolated from Cucurbita pepo medullosa (zucchini) peelings. It belongs to the phytocyanin family of blue copper proteins, a ubiquitous family of plant cupredoxins. Mavicyanin is involved in electron transfer reactions with the Cu center transitioning between the oxidized Cu(II) form and the reduced Cu(I) form. The copper is tetrahedrally coordinated by a cysteine, 2 histidines, and a glutamine residue, like in the case of stellacyanin. The biological roles of mavicyanin have not been elucidated yet.	101
259901	cd11015	CuRO_2_FVIII_like	The second cupredoxin domain of coagulation factor VIII and similar proteins. Factor VIII functions in the factor X-activating complex of the intrinsic coagulation pathway. It facilitates blood clotting by acting as a cofactor for factor IXa. In the presence of Ca2+ and phospholipids, Factor VIII and IXa form a complex that converts factor X to the activated form Xa. A variety of mutations in the Factor VIII gene can cause hemophilia A, which typically requires replacement therapy with purified protein. Factor VIII is synthesized as a single polypeptide with six cupredoxin domains and a domain structure of 1-2-3-4-B-5-6-C1-C2, where 1-6 are cupredoxin domains, B is a domain with no known structural homologs and is dispensable for coagulant activity, and C are domains distantly related to discoidin protein-fold family members. Factor VIII is initially processed through proteolysis to generate a heterodimer consisting of a heavy chain (1-2-3-4) and a light chain (5-6-C1-C2), which circulates in a tight complex with von Willebrand factor (VWF). Further processing of the heavy chain produces activated factor VIIIa, a heterotrimer composed of polypeptides (1-2), (3-4), and the light chain. This model represents the cupredoxin domain 2 of unprocessed Factor VIII or the heavy chain of circulating Factor VIII, and similar proteins.	134
259902	cd11016	CuRO_4_FVIII_like	The fourth cupredoxin domain of coagulation factor VIII and similar proteins. Factor VIII functions in the factor X-activating complex of the intrinsic coagulation pathway. It facilitates blood clotting by acting as a cofactor for factor IXa. In the presence of Ca2+ and phospholipids, Factor VIII and IXa form a complex that converts factor X to the activated form Xa. A variety of mutations in the Factor VIII gene can cause hemophilia A, which typically requires replacement therapy with purified protein. Factor VIII is synthesized as a single polypeptide with six cupredoxin domains and a domain structure of 1-2-3-4-B-5-6-C1-C2, where 1-6 are cupredoxin domains, B is a domain with no known structural homologs and is dispensable for coagulant activity, and C are domains distantly related to discoidin protein-fold family members. Factor VIII is initially processed through proteolysis to generate a heterodimer consisting of a heavy chain (1-2-3-4) and a light chain (5-6-C1-C2), which circulates in a tight complex with von Willebrand factor (VWF). Further processing of the heavy chain produces activated factor VIIIa, a heterotrimer composed of polypeptides (1-2), (3-4), and the light chain. This model represents the cupredoxin domain 4 of unprocessed Factor VIII or the heavy chain of circulating Factor VIII, and similar proteins.	143
259903	cd11017	Phytocyanin_like_1	A subclass of phytocyanins, plant blue or type I copper proteins. Phytocyanins are plant blue or type I copper proteins. They are involved in electron transfer reactions with the Cu center transitioning between the oxidized Cu(II) form and the reduced Cu(I) form. Phytocyanins are classified into four groups: stellacyanin, plantacyanin, uclacyanin and early nodulin groups. Members of this unknown subgroup appear to have lost the T1 copper binding site.	99
259904	cd11018	CuRO_6_FVIII_like	The sixth cupredoxin domain of coagulation factor VIII and similar proteins. Factor VIII functions in the factor X-activating complex of the intrinsic coagulation pathway. It facilitates blood clotting by acting as a cofactor for factor IXa. In the presence of Ca2+ and phospholipids, Factor VIII and IXa form a complex that converts factor X to the activated form Xa. A variety of mutations in the Factor VIII gene can cause hemophilia A, which typically requires replacement therapy with purified protein. Factor VIII is synthesized as a single polypeptide with six cupredoxin domains and a domain structure of 1-2-3-4-B-5-6-C1-C2, where 1-6 are cupredoxin domains, B is a domain with no known structural homologs and is dispensable for coagulant activity, and C are domains distantly related to discoidin protein-fold family members. Factor VIII is initially processed through proteolysis to generate a heterodimer consisting of a heavy chain (1-2-3-4) and a light chain (5-6-C1-C2), which circulates in a tight complex with von Willebrand factor (VWF). Further processing of the heavy chain produces activated factor VIIIa, a heterotrimer composed of polypeptides (1-2), (3-4), and the light chain. This model represents the cupredoxin domain 6 of unprocessed Factor VIII or the second cupredoxin domain of the light chain of circulating Factor VIII, and similar proteins.	144
259905	cd11019	OsENODL1_like	Early nodulin-like protein (OsENODL1) and similar proteins. This family includes early nodulin-like protein (OsENODL1) from Oryza sativa and similar proteins. It belongs to the phytocyanin family of blue copper proteins, a ubiquitous family of plant cupredoxins. Phytocyanin is involved in electron transfer reactions with the Cu center transitioning between the oxidized Cu(II) form and the reduced Cu(I) form. OsENODL1 expression occurs specifically at the late developmental stage of the seeds. Members of this subgroup appear to have lost the T1 copper binding site.	103
259906	cd11020	CuRO_1_CuNIR	Cupredoxin domain 1 of Copper-containing nitrite reductase. Copper-containing nitrite reductase (CuNIR), which catalyzes the reduction of NO2- to NO, is the key enzyme in the denitrification process in denitrifying bacteria. CuNIR contains at least one type 1 copper center and a type 2 copper center, which serves as the active site of the enzyme. A histidine, bound to the Type 2 Cu center, is responsible for binding and reducing nitrite. A Cys-His bridge plays an important role in facilitating rapid electron transfer from the type 1 center to the type 2 center. A reduced type I blue copper protein (pseudoazurin) was found to be a specific electron transfer donor for the copper-containing NIR in bacteria Alcaligenes faecalis.	119
259907	cd11021	CuRO_2_ceruloplasmin	The second cupredoxin domain of Ceruloplasmin. Ceruloplasmin is a multicopper oxidase essential for normal iron homeostasis and copper transport in blood. It also functions in amine oxidation and as an antioxidant preventing free radicals in serum. The protein has 6 cupredoxin domains with six copper centers; three mononuclear sites in domain 2, 4 and 6 and three in the form of trinuclear clusters at the interface of domains 1 and 6. Ceruloplasmin exhibits internal sequence homology that appears to have evolved from the triplication of a sequence unit composed of two tandem cupredoxin domains. This model represents the second cupredoxin domain of ceruloplasmin.	141
259908	cd11022	CuRO_4_ceruloplasmin	The fourth cupredoxin domain of Ceruloplasmin. Ceruloplasmin is a multicopper oxidase essential for normal iron homeostasis and copper transport in blood. It also functions in amine oxidation and as an antioxidant preventing free radicals in serum. The protein has 6 cupredoxin domains with six copper centers; three mononuclear sites in domain 2, 4 and 6 and three in the form of trinuclear clusters at the interface of domains 1 and 6. Ceruloplasmin exhibits internal sequence homology that appears to have evolved from the triplication of a sequence unit composed of two tandem cupredoxin domains. This model represents the fourth cupredoxin domain of ceruloplasmin.	144
259909	cd11023	CuRO_2_ceruloplasmin_like_2	cupredoxin domain of ceruloplasmin homologs. Uncharacterized subfamily of ceruloplasmin homologous proteins. Ceruloplasmin (ferroxidase) is a multicopper oxidase essential for normal iron homeostasis. Ceruloplasmin also functions in copper transport, amine oxidation and as an antioxidant preventing free radicals in serum. The protein has 6 cupredoxin domains and exhibits internal sequence homology that appears to have evolved from the triplication of a sequence unit composed of two tandem cupredoxin domains. This model represents the first domain of the triplicated units.	118
259910	cd11024	CuRO_1_2DMCO_NIR_like	The cupredoxin domain 1 of a two-domain laccase related to nitrite reductase. The two-domain laccase (small laccase) in this family differs significantly from all laccases. It resembles the two domain nitrite reductase in both sequence and structure. It consists of two cupredoxin domains and forms trimers and hence resembles the quaternary structure of nitrite reductases more than that of large laccases. There are three trinuclear copper clusters in the enzyme localized between domains 1 and 2 of each pair of neighbor chains. Three copper ions of type 1 lie close to one another near the surface of the central part of the trimer, and, effectively, a trimeric substrate binding site is formed in their vicinity. Laccase is a blue multi-copper enzyme that catalyzes the oxidation of a variety of organic substrates coupled to the reduction of molecular oxygen to water. It displays broad substrate specificity, catalyzing the oxidation of a wide variety of aromatic, notably phenolic, and inorganic substances. Laccase has been implicated in a wide spectrum of biological activities.	119
410652	cd11026	CYP2	cytochrome P450 family 2. The cytochrome P450 family 2 (CYP2 or Cyp2) is one of the largest, most diverse CYP families in vertebrates. It includes many subfamilies across vertebrate species but not all subfamilies are found in multiple vertebrate taxonomic classes. The CYP2U and CYP2R genes are present in the vertebrate ancestor and are shared across all vertebrate classes, whereas some subfamilies are lineage-specific, such as CYP2B and CYP2S in mammals. CYP2 enzymes play important roles in drug metabolism. The CYP2 family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	425
410653	cd11027	CYP17A1-like	cytochrome P450 family 17, subfamily A, polypeptide 1, and similar cytochrome P450s. This subfamily contains cytochrome P450 17A1 (CYP17A1 or Cyp17a1), cytochrome P450 21 (CYP21 or Cyp21) and similar proteins. CYP17A1, also called cytochrome P450c17, steroid 17-alpha-hydroxylase (EC 1.14.14.19)/17,20 lyase (EC 1.14.14.32), or 17-alpha-hydroxyprogesterone aldolase, catalyzes the conversion of pregnenolone and progesterone to their 17-alpha-hydroxylated products and subsequently to dehydroepiandrosterone (DHEA) and androstenedione; it catalyzes both the 17-alpha-hydroxylation and the 17,20-lyase reaction. This subfamily also contains CYP21, also called steroid 21-hydroxylase (EC 1.14.14.16) or cytochrome P-450c21 or CYP21A2, which catalyzes the 21-hydroxylation of steroids and is required for the adrenal synthesis of mineralocorticoids and glucocorticoids. The CYP17A1-like subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	428
410654	cd11028	CYP1	cytochrome P450 family 1. The cytochrome P450 family 1 (CYP1 or Cyp1) is composed of three functional human members: CYP1A1, CYP1A2 and CYP1B1, which are regulated by the aryl hydrocarbon receptor (AhR), a ligand-activated transcriptional factor that dimerizes with AhR nuclear translocator (ARNT). CYP1 enzymes are involved in the metabolism of endogenous hormones, xenobiotics, and drugs. Included in the CYP1 family is CYP1D1 (cytochrome P450 family 1, subfamily D, polypeptide 1), which is not expressed in humans as its gene is pseudogenized due to five nonsense mutations in the putative coding region, but is functional in other organisms including cynomolgus monkey. Zebrafish CYP1D1 expression is not regulated by AhR. The CYP1 family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	430
410655	cd11029	CYP107-like	cytochrome P450 family 107 and similar cytochrome P450s. This group contains bacterial cytochrome P450s from families 107 (CYP107), 154 (CYP154), 197 (CYP197), and similar proteins. Among the members of this group are: Pseudonocardia autotrophica vitamin D(3) 25-hydroxylase (also known as CYP197A; EC 1.14.15.15) that catalyzes the hydroxylation of vitamin D(3) into 25-hydroxyvitamin D(3) and 1-alpha,25-dihydroxyvitamin D(3), its physiologically active forms; Saccharopolyspora erythraea CYP107A1, also called P450eryF or 6-deoxyerythronolide B hydroxylase (EC 1.14.15.35), that catalyzes the conversion of 6-deoxyerythronolide B (6-DEB) to erythronolide B (EB) by the insertion of an oxygen at the 6S position of 6-DEB; Bacillus megaterium CYP107DY1 that displays C6-hydroxylation activity towards mevastatin to produce pravastatin; Streptomyces coelicolor CYP154C1 that shows activity towards 12- and 14-membered ring macrolactones in vitro and may be involved in catalyzing the site-specific oxidation of the precursors to macrolide antibiotics, which introduces regiochemical diversity into the macrolide ring system; and Nocardia farcinica CYP154C5 that acts on steroids with regioselectivity and stereoselectivity, converting various pregnans and androstans to yield 16 alpha-hydroxylated steroid products. Bacillus subtilis CYP107H1 is not included in this group. The CYP107-like group belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	384
410656	cd11030	CYP105-like	cytochrome P450 family 105 and similar cytochrome P450s. This group predominantly contains bacterial cytochrome P450s, including those belonging to families 105 (CYP105) and 165 (CYP165). Also included in this group are fungal family 55 proteins (CYP55). CYP105s are predominantly found in bacteria belonging to the phylum Actinobacteria and the order Actinomycetales, and are associated with a wide variety of pathways and processes, from steroid biotransformation to production of macrolide metabolites. CYP105A1 catalyzes two sequential hydroxylations of vitamin D3 with differing specificity and cytochrome P450-SOY (also known as CYP105D1) has been shown to be capable of both oxidation and dealkylation reactions. CYP105D6 and CYP105P1, from the filipin biosynthetic pathway, perform highly regio- and stereospecific hydroxylations. Other members of this group include, but are not limited to: CYP165D3 (also called OxyE) from the teicoplanin biosynthetic gene cluster of Actinoplanes teichomyceticus, which is responsible for the phenolic coupling of the aromatic side chains of the first and third peptide residues in the teicoplanin peptide; Micromonospora griseorubida cytochrome P450 MycCI that catalyzes hydroxylation at the C21 methyl group of mycinamicin VIII, the earliest macrolide form in the postpolyketide synthase tailoring pathway; and Fusarium oxysporum CYP55A1 (also called nitric oxide reductase cytochrome P450nor) that catalyzes an unusual reaction, the direct electron transfer from NAD(P)H to bound heme. The CYP105-like group belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	381
410657	cd11031	Cyp158A-like	cytochrome P450 family 158, subfamily A and similar cytochrome P450s. This family is composed of cytochrome P450s (CYPs) with similarity to Streptomyces coelicolor CYP158A1 and CYP158A2, Streptomyces natalensis PimD (also known as CYP107E), Mycobacterium tuberculosis CYP121, and Micromonospora griseorubida MycG (also known as CYP107B).  CYP158A1 and CYP158A2 catalyze an unusual oxidative C-C coupling reaction to polymerize flaviolin and form highly conjugated pigments; CYP158A2 produces three isomers of biflaviolin and one triflaviolin while CYP158A1 produces only two isomers of biflaviolin. PimD is a cytochrome P450 monooxygenase with native epoxidase activity that is critical in the biosynthesis of the polyene macrolide antibiotic pimaricin. CYP121 is essential for the viability of M. tuberculosis and is a novel drug target for the inhibition of mycobacterial growth. MycG catalyzes both hydroxylation and epoxidation reactions in the biosynthesis of the 16-membered ring macrolide antibiotic mycinamicin II. This family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	380
410658	cd11032	P450_EryK-like	cytochrome P450 EryK and similar cytochrome P450s. This subfamily contains archaeal and bacterial CYPs including Saccharopolyspora erythraea P450 EryK, Saccharolobus solfataricus cytochrome P450 119 (CYP119), Picrophilus torridus CYP231A2, Bacillus subtilis CYP109, Streptomyces himastatinicus HmtT and HmtN, and Bacillus megaterium CYP106A2, among others. EryK, also called erythromycin C-12 hydroxylase, is active during the final steps of erythromycin A (ErA) biosynthesis. CYP106A2 catalyzes the hydroxylation of a variety of 3-oxo-delta(4)-steroids such as progesterone and deoxycorticosterone, mainly in the 15beta-position. It is also capable of hydroxylating a variety of terpenoids. HmtT and HmtN are involved in the post-tailoring of the cyclohexadepsipeptide backbone during the biosynthesis of the himastatin antibiotic. The EryK-like subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	368
410659	cd11033	CYP142-like	cytochrome P450 family 142 and similar cytochrome P450s. This family is composed of cytochrome P450s (CYPs) with similarity to Streptomyces sp. P450sky (also called CYP163B3), Sphingopyxis macrogoltabida P450pyr hydroxylase, Novosphingobium aromaticivorans CYP108D1, Pseudomonas sp. cytochrome P450-Terp (P450terp), and Amycolatopsis balhimycina P450 OxyD, as well as several Mycobacterium proteins CYP124, CYP125, CYP126, and CYP142. P450sky is involved in the hydroxylation of three beta-hydroxylated amino acid precursors required for the biosynthesis of the cyclic depsipeptide skyllamycin. P450pyr hydroxylase is an active and selective catalyst for the regio- and stereo-selective hydroxylation at non-activated carbon atoms with a broad substrate range. P450terp catalyzes the hydroxylation of alpha-terpineol as part of its catabolic assimilation. OxyD is involved in beta-hydroxytyrosine formation during vancomycin biosynthesis. CYP124 is a methyl-branched lipid omega-hydroxylase while CYP142 is a cholesterol 27-oxidase with likely roles in host response modulation and cholesterol metabolism. This family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	378
410660	cd11034	P450cin-like	P450cin and similar cytochrome P450s. This group is composed of Citrobacter braakii cytochrome P450cin (P450cin, also called CYP176A1) and similar proteins. P450cin is a bacterial P450 enzyme that catalyzes the enantiospecific hydroxylation of 1,8-cineole to (1R)-6beta-hydroxycineole; its natural reduction-oxidation partner is cindoxin. This family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	361
410661	cd11035	P450cam-like	P450cam and similar cytochrome P450s. This family is composed of cytochrome P450s (CYPs) with similarity to Pseudomonas putida P450cam and Cyp101 proteins from Novosphingobium aromaticivorans such as CYP101C1 and CYP101D2. P450cam catalyzes the hydroxylation of camphor in a process that involves two electron transfers from the iron-sulfur protein, putidaredoxin. CYP101D2 is capable of oxidizing camphor while CYP101C1 does not bind camphor but is capable of binding and hydroxylating ionone derivatives such as alpha- and beta-ionone and beta-damascone. This family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	359
410662	cd11036	AknT-like	AknT-like proteins. This family is composed of proteins similar to Streptomyces biosynthesis proteins including anthracycline biosynthesis proteins DnrQ and AknT, and macrolide antibiotic biosynthesis proteins TylM3 and DesVIII. Streptomyces peucetius DnrQ is involved in the biosynthesis of carminomycin and daunorubicin (daunomycin) while Streptomyces galilaeus AknT functions in the biosynthesis of aclacinomycin A. Streptomyces fradiae TylM3 is involved in the biosynthesis of tylosin derived from the polyketide lactone tylactone, and Streptomyces venezuelae DesVIII functions in the biosynthesis of methymycin, neomethymycin, narbomycin, and pikromycin. These proteins are required for the glycosylation of specific substrates during the biosynthesis of specific anthracyclines and macrolide antibiotics. Although members of this family belong to the large cytochrome P450 (P450, CYP) superfamily and show significant similarity to cytochrome P450s, they lack heme-binding sites and are not functional cytochromes.	340
410663	cd11037	CYP199A2-like	cytochrome P450 family 199, subfamily A, polypeptide 2 and similar cytochrome P450s. This family is composed of cytochrome P450s (CYPs) with similarity to Rhodopseudomonas palustris CYP199A2 and CYP199A4. CYP199A2 catalyzes the oxidation of aromatic carboxylic acids including indole-2-carboxylic acid, 2-naphthoic acid and 4-ethylbenzoic acid. CYP199A4 catalyzes the hydroxylation of para-substituted benzoic acids. This family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	371
410664	cd11038	CYP_AurH-like	cytochrome P450 AurH and similar cytochrome P450s. This group includes Streptomyces thioluteus P450 monooxygenase AurH which is uniquely capable of forming a homochiral tetrahydrofuran ring, a vital component of the polyketide antibiotic aureothin. AurH catalyzes an unprecedented tandem oxygenation process: first, it catalyzes an asymmetric hydroxylation of deoxyaureothin to yield (7R)-7-hydroxydeoxyaureothin as an intermediate; and second, it mediates another C-O bond formation that leads to O-heterocyclization. The AurH-like subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	382
410665	cd11039	P450-pinF2-like	P450-pinF2 and similar cytochrome P450s. This family is composed of cytochrome P450s (CYPs) with similarity to Agrobacterium tumefaciens P450-pinF2, whose expression is induced by the presence of wounded plant tissue and by plant phenolic compounds such as acetosyringone. P450-pinF2 may be involved in the detoxification of plant protective agents at the site of wounding. This family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	372
410666	cd11040	CYP7_CYP8-like	cytochrome P450s similar to cytochrome P450 family 7, subfamily A, polypeptide 1, cytochrome P450 family 7, subfamily B, polypeptide 1, cytochrome P450 family 8, subfamily A, polypeptide 1. This family is composed of cytochrome P450s (CYPs) with similarity to the human P450s CYP7A1, CYP7B1, CYP8B1, CYP39A1 and prostacyclin synthase (CYP8A1). CYP7A1, CYP7B1, CYP8B1, and CYP39A1 are involved in the catabolism of cholesterol to bile acids (BAs) in two major pathways. CYP7A1 (cholesterol 7alpha-hydroxylase) and CYP8B1 (sterol 12-alpha-hydroxylase) function in the classic (or neutral) pathway, which leads to two bile acids: cholic acid (CA) and chenodeoxycholic acid (CDCA). CYP7B1 and CYP39A1 are 7-alpha-hydroxylases involved in the alternative (or acidic) pathway, which leads mainly to the formation of CDCA. Prostacyclin synthase (CYP8A1) catalyzes the isomerization of prostaglandin H2 to prostacyclin (or prostaglandin I2), a potent mediator of vasodilation and anti-platelet aggregation. This family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	432
410667	cd11041	CYP503A1-like	cytochrome P450 family 503, subfamily A, polypeptide 1 and similar cytochrome P450s. This family is composed of predominantly fungal cytochrome P450s (CYPs) with similarity to Fusarium fujikuroi Cytochrome P450 503A1 (CYP503A1, also called ent-kaurene oxidase or cytochrome P450-4), Aspergillus nidulans austinol synthesis protein I (ausI), Alternaria alternata tentoxin synthesis protein 1 (TES1), and Acanthamoeba polyphaga mimivirus cytochrome P450 51 (CYP51, also called P450-LIA1 or sterol 14-alpha demethylase). Ent-kaurene oxidase catalyzes three successive oxidations of the 4-methyl group of ent-kaurene to form kaurenoic acid, an intermediate in gibberellin biosynthesis. AusI and TES1 are cytochrome P450 monooxygenases that mediate the biosynthesis of the meroterpenoids, austinol and dehydroaustinol, and the phytotoxin tentoxin, respectively. P450-LIA1 catalyzes the 14-alpha demethylation of obtusifoliol and functions in steroid biosynthesis. This family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	441
410668	cd11042	CYP51-like	cytochrome P450 family 51 and similar cytochrome P450s. This family is composed of cytochrome P450 51 (CYP51 or sterol 14alpha-demethylase) and related cytochrome P450s. CYP51 is the only cytochrome P450 enzyme with a conserved function across animals, fungi, and plants, in the synthesis of essential sterols. In mammals, it is expressed in many different tissues, with highest expression in testis, ovary, adrenal gland, prostate, liver, kidney, and lung. In fungi, CYP51 is a significant drug target for treatment of human protozoan infections. In plants, it functions within a specialized defense-related metabolic pathway. CYP51 is also found in several bacterial species. This family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	416
410669	cd11043	CYP90-like	plant cytochrome P450s similar to cytochrome P450 family 90, subfamily A, polypeptide 1, cytochrome P450 family 90, subfamily B, polypeptide 1, and cytochrome P450 family 90, subfamily D, polypeptide 2. This family is composed of plant cytochrome P450s including: Arabidopsis thaliana cytochrome P450s 85A1 (CYP85A1 or brassinosteroid-6-oxidase 1), 90A1 (CYP90A1), 88A3 (CYP88A3 or ent-kaurenoic acid oxidase 1), 90B1 (CYP90B1 or Dwarf4 or steroid 22-alpha-hydroxylase), and 90C1 (CYP90C1 or 3-epi-6-deoxocathasterone 23-monooxygenase); Oryza sativa cytochrome P450s 90D2 (CYP90D2 or C6-oxidase), 87A3 (CYP87A3), and 724B1 (CYP724B1 or dwarf protein 11); and Taxus cuspidata cytochrome P450 725A2 (CYP725A2 or taxane 13-alpha-hydroxylase). These enzymes are monooxygenases that catalyze oxidation reactions involved in steroid or hormone biosynthesis. CYP85A1, CYP90D2, and CYP90C1 are involved in brassinosteroids biosynthesis, while CYP88A3 catalyzes three successive oxidations of ent-kaurenoic acid, which is a key step in the synthesis of gibberellins. This family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	408
410670	cd11044	CYP120A1_CYP26-like	cyanobacterial cytochrome P450 family 120, subfamily A, polypeptide 1 (CYP120A1), vertebrate cytochrome P450 family 26 enzymes, and similar cytochrome P450s. This family includes cyanobacterial CYP120A1 and vertebrate cytochrome P450s 26A1 (CYP26A1), 26B1 (CYP26B1), and 26C1 (CYP26C1). These are retinoic acid-metabolizing cytochromes that play key roles in retinoic acid (RA) metabolism. Human and zebrafish CYP26a1, as well as Synechocystis CYP120A1 are characterized as RA hydroxylases. RA is a critical signaling molecule that regulates gene transcription and the cell cycle. This family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	420
410671	cd11045	CYP136-like	putative cytochrome P450 family 136 and similar cytochrome P450s. This group is composed of Mycobacterium tuberculosis putative cytochrome P450 136 (CYP136) and similar proteins. It belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	407
410672	cd11046	CYP97	cytochrome P450 family/clan 97. CYPs have been classified into families and subfamilies based on homology and phylogenetic criteria; family membership is defined as 40% amino acid sequence identity or higher. The plant CYPs have also been classified according to clans; land plants have 11 clans that form two groups: single-family clans (CYP51, CYP74, CYP97, CYP710, CYP711, CYP727, CYP746) and multi-family clans (CYP71, CYP72, CYP85, CYP86). Members of the CYP97 clan include Arabidopsis thaliana cytochrome P450s 97A3 (CYP97A3), CYP97B3, and CYP97C1. CYP97A3 is also called protein LUTEIN DEFICIENT 5 (LUT5) and CYP97C1 is also called carotene epsilon-monooxygenase or protein LUTEIN DEFICIENT 1 (LUT1). These cytochromes function as beta- and epsilon-ring carotenoid hydroxylases and are involved in the biosynthesis of xanthophylls. CYP97 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	441
410673	cd11049	CYP170A1-like	cytochrome P450 family 170, subfamily A, polypeptide 1-like actinobacterial cytochrome P450s. This subfamily is composed of Streptomyces coelicolor cytochrome P450 170A1 (CYP170A1), Streptomyces avermitilis pentalenene oxygenase, and similar actinobacterial cytochrome P450s. CYP170A1, also called epi-isozizaene 5-monooxygenase (EC 1.14.13.106)/(E)-beta-farnesene synthase (EC 4.2.3.47), catalyzes the two-step allylic oxidation of epi-isozizaene to albaflavenone, which is a sesquiterpenoid antibiotic. Pentalenene oxygenase (EC 1.14.15.32) catalyzes the conversion of pentalenene to pentalen-13-al by stepwise oxidation via pentalen-13-ol, a precursor of the neopentalenolactone antibiotic. The CYP170A1-like subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	415
410674	cd11051	CYP59-like	cytochrome P450 family 59 and similar cytochrome P450s. This family is composed of Aspergillus nidulans cytochrome P450 59 (CYP59), also called sterigmatocystin biosynthesis P450 monooxygenase stcS, and similar fungal proteins. CYP59 is required for the conversion of versicolorin A to sterigmatocystin. This family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	403
410675	cd11052	CYP72_clan	Plant cytochrome P450s, clan CYP72. CYPs have been classified into families and subfamilies based on homology and phylogenetic criteria; family membership is defined as 40% amino acid sequence identity or higher. The plant CYPs have also been classified according to clans; land plants have 11 clans that form two groups: single-family clans (CYP51, CYP74, CYP97, CYP710, CYP711, CYP727, CYP746) and multi-family clans (CYP71, CYP72, CYP85, CYP86). The CYP72 clan is associated with the metabolism of a diversity of fairly hydrophobic compounds including fatty acids and isoprenoids, with the catabolism of hormones (brassinosteroids and gibberellin, GA) and with the biosynthesis of cytokinins. This clan includes: CYP734 enzymes that are involved in brassinosteroid (BRs) catabolism and regulation of BRs homeostasis; CYP714 enzymes that are involved in the biosynthesis of gibberellins (GAs) and the mechanism to control their bioactive endogenous levels; and CYP72 family enzymes, among others. The CYP72 clan belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	427
410676	cd11053	CYP110-like	cytochrome P450 family 110 and similar cytochrome P450s. This group is composed of mostly uncharacterized proteins, including Nostoc sp. probable cytochrome P450 110 (CYP110) and putative cytochrome P450s 139 (CYP139), 138 (CYP138), and 135B1 (CYP135B1) from Mycobacterium bovis. CYP110 genes, unique to cyanobacteria, are widely distributed in heterocyst-forming cyanobacteria including nitrogen-fixing genera Nostoc and Anabaena. This family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	415
410677	cd11054	CYP24A1-like	cytochrome P450 family 24 subfamily A, polypeptide 1 and similar cytochrome P450s. This family is composed of vertebrate cytochrome P450 24A1 (CYP24A1) and similar proteins including several Drosophila proteins such as CYP315A1 (also called protein shadow) and CYP314A1 (also called ecdysone 20-monooxygenase), and vertebrate CYP11 and CYP27 subfamilies. Both CYP314A1 and CYP315A1, which has ecdysteroid C2-hydroxylase activity, are involved in the metabolism of insect hormones. CYP24A1 and CYP27B1 have roles in calcium homeostasis and metabolism, and the regulation of vitamin D. CYP24A1 catabolizes calcitriol (1,25(OH)2D), the physiologically active vitamin D hormone, by catalyzing its hydroxylation, while CYP27B1 is a calcidiol 1-monooxygenase that converts 25-hydroxyvitamin D3 to calcitriol. The CYP24A1-like family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	426
410678	cd11055	CYP3A-like	cytochrome P450 family 3, subfamily A and similar cytochrome P450s. This family includes vertebrate CYP3A subfamily enzymes and CYP5A1, and similar proteins. CYP5A1, also called thromboxane-A synthase, converts prostaglandin H2 into thromboxane A2, a biologically active metabolite of arachidonic acid. CYP3A enzymes are drug-metabolizing enzymes embedded in the endoplasmic reticulum, where they can catalyze a wide variety of biochemical reactions including hydroxylation, N-demethylation, O-dealkylation, S-oxidation, deamination, or epoxidation of substrates. The CYP3A-like family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	422
410679	cd11056	CYP6-like	cytochrome P450 family 6 and similar cytochrome P450s. This family is composed of cytochrome P450s from insects and crustaceans, including the CYP6, CYP9 and CYP310 subfamilies, which are involved in the metabolism of insect hormones and xenobiotic detoxification. The CYP6-like family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	429
410680	cd11057	CYP313-like	cytochrome P450 family 313 and similar cytochrome P450s. This subfamily is composed of insect cytochrome P450s from families 313 (CYP313) and 318 (CYP318), and similar proteins. These proteins may be involved in the metabolism of insect hormones and in the breakdown of synthetic insecticides. Their specific function is yet unknown. They belong to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	427
410681	cd11058	CYP60B-like	cytochrome P450 family 60, subfamily B and similar cytochrome P450s. This family is composed of fungal cytochrome P450s including: Aspergillus nidulans cytochrome P450 60B (CYP60B), also called versicolorin B desaturase, which catalyzes the conversion of versicolorin B to versicolorin A during sterigmatocystin biosynthesis; Fusarium sporotrichioides cytochrome P450 65A1 (CYP65A1), also called isotrichodermin C-15 hydroxylase, which catalyzes the hydroxylation at C-15 of isotrichodermin in trichothecene biosynthesis; and Penicillium aethiopicum P450 monooxygenase vrtK, also called viridicatumtoxin synthesis protein K, which catalyzes the spirocyclization of the geranyl moiety of previridicatumtoxin to produce viridicatumtoxin, a tetracycline-like fungal meroterpenoid. The CYP60B-like family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	419
410682	cd11059	CYP_fungal	unknown subfamily of fungal cytochrome P450s. This subfamily is composed of uncharacterized fungal cytochrome P450s. Cytochrome P450 (P450, CYP) is a large superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. Their monooxygenase activity relies on the reductive scission of molecular oxygen bound to the P450 heme iron, and the delivery of two electrons to the heme iron during the catalytic cycle.	422
410683	cd11060	CYP57A1-like	cytochrome P450 family 57, subfamily A, polypeptide 1 and similar cytochrome P450s. This family is composed of fungal cytochrome P450s including: Nectria haematococca cytochrome P450 57A1 (CYP57A1), also called pisatin demethylase, which detoxifies the phytoalexin pisatin; Penicillium aethiopicum P450 monooxygenase gsfF, also called griseofulvin synthesis protein F, which catalyzes the coupling of orcinol and phloroglucinol rings in griseophenone B to form desmethyl-dehydrogriseofulvin A during the biosynthesis of griseofulvin, a spirocyclic fungal natural product used to treat dermatophyte infections; and Penicillium aethiopicum P450 monooxygenase vrtE, also called viridicatumtoxin synthesis protein E, which catalyzes hydroxylation at C5 of the polyketide backbone during the biosynthesis of viridicatumtoxin, a tetracycline-like fungal meroterpenoid. The CYP57A1-like family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	425
410684	cd11061	CYP67-like	cytochrome P450 family 67 and similar cytochrome P450s. This subfamily includes Uromyces viciae-fabae cytochrome P450 67 (CYP67), also called planta-induced rust protein 16, Cystobasidium minutum (Rhodotorula minuta) cytochrome P450rm, and other fungal cytochrome P450s. P450rm catalyzes the formation of isobutene and 4-hydroxylation of benzoate. The gene encoding CYP67 is a planta-induced gene that is expressed in haustoria and rust-infected leaves. The CYP67-like subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	418
410685	cd11062	CYP58-like	cytochrome P450 family 58-like fungal cytochrome P450s. This group includes Fusarium sporotrichioides cytochrome P450 58 (CYP58, also known as Tri4 and trichodiene oxygenase), and similar fungal proteins. CYP58 catalyzes the oxygenation of trichodiene during the biosynthesis of trichothecenes, which are sesquiterpenoid toxins that act by inhibiting protein biosynthesis. The CYP58-like subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	425
410686	cd11063	CYP52	cytochrome P450 family 52. Cytochrome P450 52 (CYP52), also called P450ALK, monooxygenases catalyze the first hydroxylation step in the assimilation of alkanes and fatty acids by filamentous fungi. The number of CYP52 proteins depends on the fungal species: for example, Candida tropicalis has seven, Candida maltosa has eight, and Yarrowia lipolytica has twelve. The CYP52 family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	419
410687	cd11064	CYP86A	cytochrome P450 family 86, subfamily A. This subfamily includes several Arabidopsis thaliana cytochrome P450s (CYP86A1, CYP86A2, CYP86A4, among others), Petunia x hybrida CYP86A22, and Vicia sativa CYP94A1 and CYP94A2. They are P450-dependent fatty acid omega-hydroxylases that catalyze the omega-hydroxylation of various fatty acids. CYP86A2 acts on saturated and unsaturated fatty acids with chain lengths from C12 to C18; CYP86A22 prefers substrates with chain lengths of C16 and C18; and CYP94A1 acts on various fatty acids from 10 to 18 carbons. They play roles in the biosynthesis of extracellular lipids, cutin synthesis, and plant defense. The CYP86A subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	432
410688	cd11065	CYP64-like	cytochrome P450 family 64-like fungal cytochrome P450s. This group includes Aspergillus flavus cytochrome P450 64 (CYP64), also called O-methylsterigmatocystin (OMST) oxidoreductase or aflatoxin B synthase or aflatoxin biosynthesis protein Q, and similar fungal cytochrome P450s. CYP64 converts OMST to aflatoxin B1 and converts dihydro-O-methylsterigmatocystin (DHOMST) to aflatoxin B2 in the aflatoxin biosynthesis pathway. The CYP64-like subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	425
410689	cd11066	CYP_PhacA-like	fungal cytochrome P450s similar to Aspergillus nidulans phenylacetate 2-hydroxylase. This group includes Aspergillus nidulans phenylacetate 2-hydroxylase (encoded by the phacA gene) and similar fungal cytochrome P450s. PhacA catalyzes the ortho-hydroxylation of phenylacetate, the first step of A. nidulans phenylacetate catabolism. The PhacA-like subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	434
410690	cd11067	CYP152	cytochrome P450 family 152, also called fatty acid hydroxylases or P450 peroxygenases. The cytochrome P450 152 (CYP152) family enzymes act as peroxygenases, converting fatty acids through oxidative decarboxylation, yielding terminal alkenes, and via alpha- and beta-hydroxylation to yield hydroxy-fatty acids. Included in this family are Bacillus subtilis CYP152A1, also called cytochrome P450BsBeta, that catalyzes the alpha- and beta-hydroxylation of long-chain fatty acids such as myristic acid in the presence of hydrogen peroxide, and Sphingomonas paucimobilis CYP152B1, also called cytochrome P450(SPalpha), that hydroxylates fatty acids with high alpha-regioselectivity. The CYP152 family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	400
410691	cd11068	CYP120A1	cytochrome P450 family 102, subfamily A, polypeptide 1, also called bifunctional cytochrome P450/NADPH--P450 reductase. Cytochrome P450 102A1, also called cytochrome P450(BM-3) or P450BM-3, is a bifunctional cytochrome P450/NADPH--P450 reductase. These proteins fuse an N-terminal cytochrome P450 with a C-terminal cytochrome P450 reductase (CYPOR). It functions as a fatty acid monooxygenase, catalyzing the hydroxylation of fatty acids at omega-1, omega-2 and omega-3 positions, with activity towards fatty acids with a chain length of 9-18 carbons. Its NADPH-dependent reductase activity (via the C-terminal domain) allows electron transfer from NADPH to the heme iron of the N-terminal cytochrome P450. CYP120A1 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	430
410692	cd11069	CYP_FUM15-like	Fusarium verticillioides cytochrome P450 monooxygenase FUM15, and similar cytochrome P450s. Fusarium verticillioides cytochrome P450 monooxygenase FUM15, is also called fumonisin biosynthesis cluster protein 15. The FUM15 gene is part of the gene cluster that mediates the biosynthesis of fumonisins B1, B2, B3, and B4, which are carcinogenic mycotoxins. This FUM15-like subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	437
410693	cd11070	CYP56-like	cytochrome P450 family 56-like fungal cytochrome P450s. This group includes Saccharomyces cerevisiae cytochrome P450 56, also called cytochrome P450-DIT2, and similar fungal proteins. CYP56 is involved in spore wall maturation and is thought to catalyze the oxidation of tyrosine residues in the formation of LL-dityrosine-containing precursors of the spore wall. The CYP56-like subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	438
410694	cd11071	CYP74	cytochrome P450 family 74. The cytochrome P450 74 (CYP74) family controls several enzymatic conversions of fatty acid hydroperoxides to bioactive oxylipins in plants, some invertebrates, and bacteria. It includes two dehydrases, namely allene oxide synthase (AOS) and divinyl ether synthase (DES), and two isomerases, hydroperoxide lyase (HPL) and epoxyalcohol synthase (EAS). AOS (EC 4.2.1.92, also called hydroperoxide dehydratase), such as Arabidopsis thaliana CYP74A acts on a number of unsaturated fatty-acid hydroperoxides, forming the corresponding allene oxides. DES (EC 4.2.1.121), also called colneleate synthase or CYP74D, catalyzes the selective removal of pro-R hydrogen at C-8 in the biosynthesis of colneleic acid. The linolenate HPL, Arabidopsis thaliana CYP74B2, is required for the synthesis of the green leaf volatiles (GLVs) hexanal and trans-2-hexenal. The fatty acid HPL, Solanum lycopersicum CYP74B, is involved in the biosynthesis of traumatin and C6 aldehydes. The epoxyalcohol synthase Ranunculus japonicus CYP74A88 (also known as RjEAS) specifically converts linoleic acid 9- and 13-hydroperoxides to oxiranyl carbinols 9,10-epoxy-11-hydroxy-12-octadecenoic acid and 11-hydroxy-12,13-epoxy-9-octadecenoic acid, respectively. The CYP74 family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	424
410695	cd11072	CYP71-like	cytochrome P450 family 71 and similar cytochrome P450s. The group includes plant cytochrome P450 family 71 (CYP71) proteins, as well as some CYPs designated as belonging to a different family including CYP99A1, CYP83B1, and CYP84A1, among others. Characterized CYP71 enzymes include: parsnip (Pastinaca sativa) CYP71AJ4, also called angelicin synthase, that converts (+)-columbianetin to angelicin, an angular furanocoumarin; periwinkle (Catharanthus roseus) CYP71D351, also called tabersonine 16-hydroxylase 2, that is involved in the foliar biosynthesis of vindoline; sorghum CYP71E1, also called 4-hydroxyphenylacetaldehyde oxime monooxygenase, that catalyzes the conversion of p-hydroxyphenylacetaldoxime to p-hydroxymandelonitrile; as well as maize CYP71C1, CYP71C2, and CYP71C4, which are monooxygenases catalyzing the oxidation of 3-hydroxyindolin-2-one, indolin-2-one, and indole, respectively. CYPs within a single CYP71 subfamily, such as the C subfamily, usually metabolize similar/related compounds. The CYP71-like family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	428
410696	cd11073	CYP76-like	cytochrome P450 family 76 and similar cytochrome P450s. Characterized members of the plant cytochrome P450 family 76 (CYP76 or Cyp76) include: Catharanthus roseus CYP76B6, a multifunctional enzyme catalyzing two sequential oxidation steps leading to the formation of 8-oxogeraniol from geraniol; the Brassicaceae-specific CYP76C subfamily of enzymes that are involved in the metabolism of monoterpenols and phenylurea herbicides; and two P450s from Lamiaceae, CYP76AH and CYP76AK, that are involved in the oxidation of abietane diterpenes. CYP76AH produces ferruginol and 11-hydroxyferruginol, while CYP76AK catalyzes oxidations at the C20 position. Also included in this group is Berberis stolonifera Cyp80, also called berbamunine synthase or (S)-N-methylcoclaurine oxidase [C-O phenol-coupling], that catalyzes the phenol oxidation of N-methylcoclaurine to form the bisbenzylisoquinoline alkaloid berbamunine. The CYP76-like family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	435
410697	cd11074	CYP73	cytochrome P450 family 73. Cytochrome P450 family 73 (CYP73 or Cyp73), also called trans-cinnamate 4-monooxygenase (EC 1.14.14.91) or cinnamic acid 4-hydroxylase, catalyzes the regiospecific 4-hydroxylation of cinnamic acid to form precursors of lignin and many other phenolic compounds. It controls the general phenylpropanoid pathway, and controls carbon flux to pigments essential for pollination or UV protection. CYP73 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	434
410698	cd11075	CYP77_89	cytochrome P450 families 77 and 89, and similar cytochrome P450s. This group includes cytochrome P450 families 77 (CYP77) and 89 (CYP89), which are sister families that share a common ancestor. CYP89, present only in angiosperms, is younger than CYP77, which is already found in lycopods; thus, CYP89 may have evolved from CYP77 after duplication and divergence. Also included in this group is ent-kaurene oxidase, called CYP701A3 in Arabidopsis thaliana and CYP701B1 in Physcomitrella patens, that catalyzes the oxidation of ent-kaurene to form ent-kaurenoic acid. CYP701A3 is sensitive to inhibitor uniconazole-P while CYP701B1 is not. This CYP77/89 group belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	433
410699	cd11076	CYP78	cytochrome P450 family 78. Characterized cytochrome P450 family 78 (CYP78 or Cyp78) proteins include: CYP78A5, which is expressed in leaf, flower and embryo, and has been reported to stimulate plant organ growth in Arabidopsis thaliana and to regulate plant architecture, ripening time, and fruit mass in tomato; Glycine max CYP78A10 that functions in regulating seed size/weight and pod number; and Physcomitrella patens CYP78A27 or CYP78A28, which together are essential in bud formation. The CYP78 family belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	426
410700	cd11078	CYP130-like	cytochrome P450 family 130-like and similar cytochrome P450s. This subfamily includes Mycobacterium tuberculosis cytochrome P450 130 (CYP130), Rhodococcus erythropolis CYP116, and similar bacterial proteins. CYP130 catalyzes the N-demethylation of dextromethorphan, and has also shown a natural propensity to bind primary arylamines. CYP116 is involved in the degradation of thiocarbamate herbicides. The CYP130-like subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	380
410701	cd11079	Cyp_unk	unknown subfamily of mostly bacterial cytochrome P450s. This subfamily is composed of uncharacterized cytochrome P450s, predominantly from bacteria. Cytochrome P450 (P450, CYP) is a large superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. Their monooxygenase activity relies on the reductive scission of molecular oxygen bound to the P450 heme iron, and the delivery of two electrons to the heme iron during the catalytic cycle.	350
410702	cd11080	CYP134A1	cytochrome P450 family 134, subfamily A, polypeptide 1. Cytochrome P450 134A1 (CYP134A1, EC 1.14.15.13), also called pulcherriminic acid synthase or cyclo-L-leucyl-L-leucyl dipeptide oxidase or cytochrome P450 CYPX, catalyzes the oxidation of cyclo(L-Leu-L-Leu) (cLL) to yield pulcherriminic acid which forms the red pigment pulcherrimin via a non-enzymatic spontaneous reaction with Fe(3+). It belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	370
410703	cd11082	CYP61_CYP710	C-22 sterol desaturase subfamily, such as fungal cytochrome P450 61 and plant cytochrome P450 710. C-22 sterol desaturase (EC 1.14.19.41), also called sterol 22-desaturase, is required for the formation of the C-22 double bond in the sterol side chain of delta22-unsaturated sterols, which are present specifically in fungi and plants. This enzyme is also called cytochrome P450 61 (CYP61) in fungi and cytochrome P450 710 (CYP710) in plants. The CYP61/CYP710 subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	415
410704	cd11083	CYP_unk	unknown subfamily of cytochrome P450s. This subfamily is composed of uncharacterized cytochrome P450s. Cytochrome P450 (P450, CYP) is a large superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. Their monooxygenase activity relies on the reductive scission of molecular oxygen bound to the P450 heme iron, and the delivery of two electrons to the heme iron during the catalytic cycle.	421
199893	cd11234	E_set_GDE_N	N-terminal Early set domain associated with the catalytic domain of Glycogen debranching enzyme. E or "early" set domains are associated with the catalytic domain of the glycogen debranching enzyme at the N-terminal end. Glycogen debranching enzymes have both 4-alpha-glucanotransferase and amylo-1,6-glucosidase activities. As a transferase, it transfers a segment of a 1,4-alpha-D-glucan to a new 4-position in an acceptor, which may be glucose or another 1,4-alpha-D-glucan. As a glucosidase, it catalyzes the endohydrolysis of 1,6-alpha-D-glucoside linkages at points of branching in chains of 1,4-linked alpha-D-glucose residues. The N-terminal domain of the glycogen debranching enzyme may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase. This domain is also a member of the CBM48 (Carbohydrate Binding Module 48) family whose members include pullulanase, maltooligosyl trehalose synthase, starch branching enzyme, glycogen branching enzyme, isoamylase, and the beta subunit of AMP-activated protein kinase.	101
200496	cd11235	Sema_semaphorin	The Sema domain, a protein interacting module, of semaphorins. Semaphorins are regulator molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. They can be divided into 7 classes. Vertebrates have members in classes 3-7, whereas classes 1 and 2 are known only in invertebrates. Class 2 and 3 semaphorins are secreted proteins; classes 1 and 4 through 6 are transmembrane proteins; and class 7 is membrane associated via glycosylphosphatidylinositol (GPI) linkage. The semaphorins exert their function through their receptors, the neuropilin and plexin families. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module.	437
200497	cd11236	Sema_plexin_like	The Sema domain, a protein interacting module, of Plexins and MET-like receptor tyrosine kinases. Plexins form a conserved family of transmembrane receptors for semaphorins and may be the ancestor of semaphorins. Ligand binding activates signal transduction pathways controlling axon guidance in the nervous system and other developmental processes including cell migration and morphogenesis, immune function, and tumor progression. Plexins are divided into four types (A-D) according to sequence similarity. In vertebrates, type A Plexins serve as the co-receptors for neuropilins to mediate the signalling of class 3 semaphorins except Sema3E, which signals through Plexin D1. Plexins serve as direct receptors for several other members of the semaphorin family: class 6 semaphorins signal through type A plexins and class 4 semaphorins through type B. Plexin C1 serves as the receptor of Sema7A and plays regulation roles in both immune and nervous systems. This family also includes the Met and RON receptor tyrosine kinases. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a ligand-recognition and -binding module.	401
200498	cd11237	Sema_1A	The Sema domain, a protein interacting module, of semaphorin 1A (Sema1A). Sema1A is a transmembrane protein. It has been shown to mediate the defasciculation of motor axon bundles at specific choice points. Sema1A binds to its receptor plexin A (PlexA), which in turn triggers downstream signaling events involving the receptor tyrosine kinase Otk, the evolutionarily conserved flavoprotein monooxygenase molecule interacting with CasL (MICAL), and the A kinase anchoring protein Nervy, leading to repulsive growth-cone response. Sema1A has also been shown to be involved in synaptic formation. It is a member of the semaphorin family of proteins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module.	446
200499	cd11238	Sema_2A	The Sema domain, a protein interacting module, of semaphorin 2A (Sema2A). Sema2A, a secreted semaphorin, signals through its receptor plexin B (PlexB) to regulate central and peripheral axon pathfinding. In the Drosophila embryo, Sema2A secreted by oenocytes interacts with PlexB to guide sensory axons. Sema2A is a member of the semaphorin family of proteins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module.	452
200500	cd11239	Sema_3	The Sema domain, a protein interacting module, of class 3 semaphorins. Class 3 semaphorins (Sema3s) are secreted regulator molecules involved in the development of the nervous system, vasculogenesis, angiogenesis, and tumorigenesis. There are 7 distinct subfamilies named Sema3A to 3G. Sema3s function as repellent signals during axon guidance by repelling neurons away from the source of Sema3s. However, Sema3s that are secreted by tumor cells play an inhibitory role in tumor growth and angiogenesis (specifically Sema3B and Sema3F). Sema3s function by forming complexes with neuropilins and A-type plexins, where neuropilins serve as the ligand binding moiety and the plexins function as the signal transduction component. Sema3s primarily inhibit the cell motility and migration of tumor and endothelial cells by inducing collapse of the actin cytoskeleton via neuropilins and plexins. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module.	471
200501	cd11240	Sema_4	The Sema domain, a protein interacting module, of class 4 semaphorins (Sema4). Class 4 semaphorins (Sema4s) are transmembrane regulator molecules involved in the development of the nervous system, immune response, cytoskeletal organization, angiogenesis, and cell-cell interactions. There are 7 distinct subfamilies in class 4 semaphorins, named 4A to 4G. Several class 4 subfamilies play important roles in the immune system and are called "immune semaphorins". Sema4A plays critical roles in T cell-DC interactions in the immune response. Sema4D/CD100, expressed by lymphocytes, promotes the aggregation and survival of B lymphocytes and inhibits cytokine-induced migration of immune cells in vitro. It is required for normal activation of B and T lymphocytes. Sema4B negatively regulates basophil functions through T cell-basophil contacts and significantly inhibits IL-4 and IL-6 production from basophils in response to various stimuli, including IL-3 and papain. Sema4s not only influence the activation state of cells but also modulate their migration and survival. The effects of Sema4s on nonlymphoid cells are mediated by plexin D1 and plexin Bs. The Sema4G and Sema4C genes are expressed in the developing cerebellar cortex and are involved in neural tube closure and development of cerebellar granule cells through receptor plexin B2. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module.	456
200502	cd11241	Sema_5	The Sema domain, a protein interacting module, of semaphorin 5 (Sema5). Class 5 semaphorins are transmembrane glycoproteins characterized by unique thrombospondin specific repeats in the extracellular region of the protein. There are three subfamilies in class 5 semaphorins, namely 5A, 5B and 5C. Sema5A and Sema5B function as guidance cues for optic and corticofugal nerve development, respectively. Sema5A-induced cell migration requires Met signaling. Sema5C is an early development gene and may play a role in odor-guided behavior. Sema5A is also implicated in cancer. In a screening model for metastasis, the Drosophila Sema5A ortholog, Dsema-5C, has been found to be required in tumorigenicity and metastasis. Sema5A is highly expressed in human pancreatic cancer cells and is associated with tumor growth, invasion and metastasis. Semaphorins are regulatory molecules involved in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues.  It serves as a receptor-recognition and -binding module.	438
200503	cd11242	Sema_6	The Sema domain, a protein interacting module, of class 6 semaphorins (Sema6). Class 6 semaphorins (Sema6s) are membrane associated semaphorins. There are 6 subfamilies named 6A to 6D. Sema6s bind to plexin As in a neuropilin independent fashion. Sema6-plexin A signaling plays important roles in lamina-specific axon projections. Interactions between plexin A2, plexin A4, and Sema6A control lamina-restricted projection of hippocampal mossy fibers. Interactions between Sema6C, Sema6D and plexin A1 shape the stereotypic trajectories of sensory axons in the spinal cord. In addition to axon targeting, Sema6D-plexin A1 interactions influence a wide range of other biological processes. During cardiac development, Sema6D attracts or repels endothelial cells in the cardiac tube depending on the expression patterns of specific coreceptors in addition to plexin A1. Furthermore, Sema6D binds a receptor complex comprising of plexin A1, Trem2 (triggering receptor expressed on myeloid cells 2), and DAP12 on dendritic cells and osteoclasts to mediate T-cell-DC interactions and to control bone development, respectively.  The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues.  It serves as a receptor-recognition and -binding module.	465
200504	cd11243	Sema_7A	The Sema domain, a protein interacting module, of semaphorin 7A (Sema7A, also called CD108). Sema7A plays regulatory roles in both immune and nervous systems. Unlike other semaphorins, which act as repulsive guidance cues, Sema7A enhances central and peripheral axon growth and is required for proper axon tract formation during embryonic development. Sema7A also plays a critical role in the negative regulation of T cell activation and function. Sema7A is a membrane-anchored member of the semaphorin family of proteins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module.	414
200505	cd11244	Sema_plexin_A	The Sema domain, a protein interacting module, of Plexin A. Plexins serve as receptors of semaphorins and may be the ancestor of semaphorins. Members of the Plexin A subfamily are receptors for Sema1s, Sema3s, and Sema6s, and they mediate diverse biological functions including axon guidance, cardiovascular development, and immune function.  Guanylyl cyclase Gyc76C and Off-track kinase (OTK), a putative receptor tyrosine kinase, modulate Sema1a-Plexin A mediated axon repulsion. Sema3s do not interact directly with plexin A receptors, but instead bind Neuropilin-1 or Neuropilin-2 to activate neuropilin-plexin A holoreceptor complexes. In contrast to Sema3s, Sema6s do not require neuropilins for plexin A binding. In the complex, plexin As serve as signal-transducing subunits. An increasing number of molecules that interact with the intracellular region of Plexin A have been identified; among them are IgCAMs (in axon guidance events) and Trem2-DAP12 (in immune responses). The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a ligand-recognition and -binding module.	470
200506	cd11245	Sema_plexin_B	The Sema domain, a protein interacting module, of Plexin B. Plexins, which contain semaphorin domains, function as receptors of semaphorins and may be the ancestors of semaphorins. There are three members of the Plexin B subfamily, namely B1, B2 and B3. Plexins B1, B2 and B3 are receptors for Sema4D, Sema4C and Sema4G, and Sema5A, respectively. The activation of plexin B1 by Sema4D produces an acute collapse of axonal growth cones in hippocampal and retinal neurons over the early stages of neurite outgrowth and promotes branching and complexity. By signaling the effect of Sema4C and Sema4G, the plexin B2 receptor is critically involved in neural tube closure and cerebellar granule cell development.  Plexin B3, the receptor of Sema5A, is a highly potent stimulator of neurite outgrowth of primary murine cerebellar neurons. Plexin B3 has been linked to verbal performance and white matter volume in human brain. Small GTPases play important roles in plexin B signaling. Plexin B1 activates Rho through Rho-specific guanine nucleotide exchange factors, leading to neurite retraction. Plexin B1 possesses an intrinsic GTPase-activating protein activity for R-Ras and induces growth cone collapse through R-Ras inactivation. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a ligand-recognition and -binding module.	440
200507	cd11246	Sema_plexin_C1	The Sema domain, a protein interacting module, of Plexin C1. Plexins serve as semaphorin receptors. Plexin C1 has been identified as the receptor of semaphorin 7A, which plays regulation roles in both the immune and nervous systems. Unlike other semaphorins which act as repulsive guidance cues, Sema7A enhances central and peripheral axon growth and is required for proper axon tract formation during embryonic development. Plexin C1 is a potential tumor suppressor for melanoma progression. The expression of Plexin C1 is diminished or absent in human melanoma cell lines. Cofilin, an actin-binding protein involved in cell migration, is a downstream target of Sema7A-Plexin C1 signaling. Cofilin is not phosphorylated when Plexin C1 expression is silenced. Thus, melanoma invasion and metastasis may be promoted through the loss of Plexin C1 inhibitory signaling on cofilin activation. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a ligand-recognition and -binding module.	401
200508	cd11247	Sema_plexin_D1	The Sema domain, a protein interacting module, of Plexin D1. Plexins are known as semaphorin receptors and Plexin D1 has been identified as the receptor of Sema3E. It binds to Sema3E directly with high affinity. Sema3E is implicated in axonal path finding and inhibition of developmental and post-ischemic angiogenesis. Plexin D1 is broadly expressed on tumor vessels and tumor cells in a number of different types of human tumors. Plexin D1-Sema3E interaction inhibits tumor growth but promotes invasiveness and metastasis. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a ligand-recognition and -binding module.	483
200509	cd11248	Sema_MET_like	The Sema domain, a protein interacting module, of MET and RON receptor tyrosine kinases. This family includes MET and RON receptor tyrosine kinases. MET is encoded by the c-met protooncogene. MET is the receptor for hepatocyte growth factor/scatter factor (HGF/SF). HGF/SF and MET regulate multiple cellular events and are essential for the development of several tissues and organs, including the placenta, liver, and several groups of skeletal muscles. RON receptor tyrosine kinase is a Macrophage-stimulating protein (MSP) receptor. Upon binding of MSP, RON is activated via autophosphorylation within its kinase catalytic domain, resulting in a variety of effects including proliferation, tubular morphogenesis, angiogenesis, cellular motility and invasiveness. By interacting with downstream signaling molecules, it regulates macrophage migration, phagocytosis, and nitric oxide production. MET and RON receptors have been implicated in cancer development and migration. They are composed of alpha-beta heterodimers. The extracellular alpha chain is disulfide linked to the beta chain, which contains an extracellular ligand-binding region with a Sema domain, a PSI domain and four IPT repeats, a transmembrane segment, and an intracellular catalytic tyrosine kinase domain. The Sema domain is necessary for receptor dimerization and activation.	467
200510	cd11249	Sema_3A	The Sema domain, a protein interacting module, of semaphorin 3A (Sema3A). Sema3A has been reported to inhibit the growth of certain experimental tumors and to regulate endothelial cell migration and apoptosis in vitro, as well as arteriogenesis in the muscle, skin vessel permeability, and tumor angiogenesis in vivo. The function of Sema3A is mediated through receptors neuropilin-1 (NP1) and plexins, although little is known about the requirement of specific plexins in its receptor complex. It is known however that Plexin-A4 is the receptor for Sema3A in the Toll-like receptor- and sepsis-induced cytokine storm during immune response. Sema3A is a member of the Class 3 semaphorin family of secreted proteins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module.	493
200511	cd11250	Sema_3B	The Sema domain, a protein interacting module, of semaphorin 3B (Sema3B). Sema3B is coexpressed with semaphorin 3F and both proteins are candidate tumor suppressors. Both Sema3B and Sema3F show high levels of expression in normal tissues and low-grade tumors but are down-regulated in highly metastatic tumors in the lung, melanoma cells, bladder carcinoma cells and prostate carcinoma. They are upregulated by estrogen and inhibit cell motility and invasiveness through decreased FAK phosphorylation and inhibition of MMP-2 and MMP-9 expression. Two receptor families, the neuropilins (NP) and plexins, have been implicated in mediating the actions of semaphorins 3B and 3F. Sema3B is a member of the class 3 semaphorin family of proteins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module.	471
200512	cd11251	Sema_3C	The Sema domain, a protein interacting module, of semaphorin 3C (Sema3C). Sema3C is a secreted semaphorin expressed in and adjacent to cardiac neural crest cells, and causes impaired migration of neural crest cells to the developing cardiac outflow tract, resulting in the interruption of the aortic arch and persistent truncus arteriosus. It has been proposed that Sema3C acts as a guidance molecule, regulating migration of neural crest cells that express semaphorin receptors such as plexin A2. Sema3C may also participate in tumor progression. The cleavage of Sema3C induced by ADAMTS1 promotes the migration of breast cancer cells. Sema3C is a member of the class 3 semaphorin family of secreted proteins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module.	470
200513	cd11252	Sema_3D	The Sema domain, a protein interacting module, of semaphorin 3D (Sema3D). Sema3D is a secreted semaphorin expressed during the development of the nervous system. In zebrafish, Sema3D is expressed in the ventral tectum. It guides retinal axons along the dorsoventral axis of the tectum and guides the laterality of retinal ganglion cell (RGC) projections. Both Sema3D knockdown or its ubiquitous overexpression induced aberrant ipsilateral projections. Proper balance of Sema3D is needed at the midline for the progression of RGC axons from the chiasm midline into the contralateral optic tract. Sema3D is a member of the class 3 semaphorin family of proteins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module.	474
200514	cd11253	Sema_3E	The Sema domain, a protein interacting module, of semaphorin 3E (Sema3E). Sema3E is a secreted molecule implicated in axonal path finding and inhibition of developmental and postischemic angiogenesis. It is also highly expressed in metastatic cancer cells. Sema3E signaling, through its high affinity functional receptor Plexin D1, drives cancer cell invasiveness and metastatic spreading. Sema3E is a member of the class 3 semaphorin family of proteins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module.	471
200515	cd11254	Sema_3F	The Sema domain, a protein interacting module, of semaphorin 3F (Sema3F). Sema3F is coexpressed with semaphorin3B. Both Sema3B and Sema3F proteins are candidate tumor suppressors that are down-regulated in highly metastatic tumors. Two receptor families, the neuropilins and plexins, have been implicated in mediating the actions of semaphorins 3B and 3F. Sema3F is a member of the class 3 semaphorin family of proteins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module.	470
200516	cd11255	Sema_3G	The Sema domain, a protein interacting module, of semaphorin 3G (Sema3G). Semaphorin 3G is identified as a primarily endothelial cell- expressed class 3 semaphorin that controls endothelial and smooth muscle cell functions in autocrine and paracrine manners, respectively. It is mainly expressed in the lung and kidney, and a little in the brain. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module.	474
200517	cd11256	Sema_4A	The Sema domain, a protein interacting module, of semaphorin 4A (Sema4A). Sema4A is expressed in immune cells and is thus termed an "immune semaphorin". It plays critical roles in T cell-DC interactions in the immune response. It has been reported to enhance activation and differentiation of T cells in vitro and generation of antigen-specific T cells in vivo. The function of Sema4A in the immune response implicates its role in infectious and noninfectious diseases. Sema4A exerts its function through three receptors, namely Plexin B, Plexin D1, and Tim-2. Sema4A belongs to the class 4 transmembrane semaphorin family of proteins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module.	447
200518	cd11257	Sema_4B	The Sema domain, a protein interacting module, of semaphorin 4B (Sema4B). Sema4B, expressed in T and B cells, is an immune semaphorin. It functions as a negative regulator of basophils through T cell-basophil contacts and it significantly inhibits IL-4 and IL-6 production from basophils in response to various stimuli, including IL-3 and papain. In addition, T cell-derived Sema4B suppresses basophil-mediated Th2 skewing and humoral memory responses. Sema4B may also be involved in lung cancer cell mobility by inducing the degradation of CLCP1 (CUB, LCCL-homology, coagulation factor V/VIII homology domains protein). Sema4B is characterized by a PDZ-binding motif at the carboxy-terminus, which mediates interaction with the post-synaptic density protein PSD-95/SAP90, which is thought to play a central role during synaptogenesis and in the structure and function of post-synaptic specializations of excitatory synapses. Sema4B belongs to the class 4 transmembrane semaphorin family of proteins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module.	464
200519	cd11258	Sema_4C	The Sema domain, a protein interacting module, of semaphorin 4C (Sema4C). Sema4C acts as a Plexin B2 ligand to regulate the development of cerebellar granule cells and to modulate ureteric branching in the developing kidney. The binding of Sema4C to Plexin B2 results in the phosphorylation of downstream regulator ErbB-2 and the plexin protein itself. The cytoplasmic region of Sema4C binds a neurite-outgrowth-related protein SFAP75, suggesting that Sema4C may also play a role in neural function. Sema4C belongs to the class 4 transmembrane semaphorin family of proteins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module.	458
200520	cd11259	Sema_4D	The Sema domain, a protein interacting module, of semaphorin 4D (Sema4D, also known as CD100). Sema4D/CD100 is expressed in immune cells and plays critical roles in immune response; it is thus termed an "immune semaphorin". It is expressed by lymphocytes and promotes the aggregation and survival of B lymphocytes and inhibits cytokine-induced migration of immune cells in vitro. Sema4D/CD100 knock-out mice demonstrate that Sema4D is required for normal activation of B and T lymphocytes. Sema4D increases B-cell and DC function using either Plexin B1 or CD72 as receptors. The function of Sema4D in immune response implicates its role in infectious and noninfectious diseases. Sema4D belongs to the class 4 transmembrane semaphorin family of proteins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module.	471
200521	cd11260	Sema_4E	The Sema domain, a protein interacting module, of semaphorin 4E (Sema4E). Sema4E is expressed in the epithelial cells that line the pharyngeal arches in zebrafish. It may act as a guidance molecule to restrict the branchiomotor axons to the mesenchymal cells. Gain-of-function and loss-of-function studies demonstrate that Sema4E is essential for the guidance of facial axons from the hindbrain into their pharyngeal arch targets and is sufficient for guidance of gill motor axons. Sema4E guides facial motor axons by a repulsive action. Sema4E belongs to the class 4 transmembrane semaphorin family of proteins. Semaphorins are regulatory molecules involved in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module.	456
200522	cd11261	Sema_4F	The Sema domain, a protein interacting module, of semaphorin 4F (Sema4F). Sema4F plays a role in heterotypic cell-cell contacts, controls cell proliferation, and suppresses tumorigenesis. In neurofibromatosis type 1 (NF1) patients, reduced Sema4F level disrupts Schwann cell/axonal interactions. Experiments using a yeast two-hybrid system show that the extreme C-terminus of Sema4F interacts with the PDZ domains of post-synaptic density protein SAP90/PSD-95, indicating possible functional involvement of Sema4F at glutamatergic synapses. Recent work also suggests a role for Sema4F in the injury response of intramedullary axotomized motoneurons. Sema4F belongs to the class 4 transmembrane semaphorin family of proteins. Semaphorins are regulatory molecules involved in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module.	460
200523	cd11262	Sema_4G	The Sema domain, a protein interacting module, of semaphorin 4G (Sema4G). The Sema4G and Sema4C genes are expressed in the developing cerebellar cortex. Sema4G and Sema4C proteins specifically bind to Plexin B2 expressed in the cerebellar granule cells. Sema4G and Sema4C are involved in neural tube closure and cerebellar granule cell development through Plexin B2. Sema4G belongs to the class 4 transmembrane semaphorin family of proteins. Semaphorins are regulatory molecules involved in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module.	457
200524	cd11263	Sema_5A	The Sema domain, a protein interacting module, of semaphorin 5A (Sema5A). Originally, mouse Sema5A was identified as a protein that induces inhibitory responses during optic nerve development. Recent studies show that Sema5A controls innate immunity in mice. It also has been identified as a candidate gene for causing idiopathic autism in humans. Plexin B3 functions as a binding partner and receptor for Sema5A. Furthermore, Sema5A is also implicated in cancer. The role of the Drosophila Sema5A ortholog, Dsema-5C, in tumorigenicity and metastasis has been reported. Sema5A is highly expressed in human pancreatic cancer cells and is associated with tumor growth, invasion and metastasis. Sema5A belongs to class 5 semaphorin family of proteins, which are transmembrane glycoproteins characterized by unique thrombospondin specific repeats in the extracellular region of the protein. Semaphorins are regulatory molecules involved in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module.	436
200525	cd11264	Sema_5B	The Sema domain, a protein interacting module, of semaphorin 5B (Sema5B). Sema5B is expressed in regions of the basal telencephalon in rat. Sema5B is an inhibitory cue for corticofugal axons and acts as a source of repulsion for the appropriate guidance of cortical axons away from structures such as the ventricular zone as they navigate toward and within subcortical regions. In addition to its role as a guidance cue, Sema5B regulates the development and maintenance of synapse size and number in hippocampal neurons. In addition, the sema domain of Sema5B can be cleaved off the whole protein and exerts its function in regulation of synapse morphology. Sema5B belongs to the class 5 semaphorin family of proteins, which are transmembrane glycoproteins characterized by unique thrombospondin specific repeats in the extracellular region of the protein. Semaphorins are regulatory molecules involved in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module.	437
200526	cd11265	Sema_5C	The Sema domain, a protein interacting module, of semaphorin 5C (sema5C). In Drosophila, Sema5C was identified as an early development gene, which is expressed in stage 2 embryos with a striped pattern emerging at later stages. Sema5c may play a role in odor-guided behavior and in tumorigenesis. Sema5C belongs to class 5 semaphorin family of proteins, which are transmembrane glycoproteins characterized by unique thrombospondin specific repeats in the extracellular region of the protein. Semaphorins are regulatory molecules involved in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module.	433
200527	cd11266	Sema_6A	The Sema domain, a protein interacting module, of semaphorin 6A (Sema6A). In the cerebellum, Sema6A-plexin A2 signaling modulates granule cell migration by controlling centrosome positioning. Besides plexin A2, plexin A4 is also found to be a receptor of Sema6A.  Interactions between plexin A2, plexin A4, and Sema6A control lamina-restricted projection of hippocampal mossy fibers. It is required for the clustering of boundary cap cells at the PNS/CNS interface and thus, prevents motoneurons from streaming out of the ventral spinal cord. At the dorsal root entry site, it organizes the segregation of dorsal roots. Sema6A may also be involved in axonal pathfinding processes in the periinfarct and homotopic contralateral cortex. Sema6A is a member of the class 6 semaphorin family of proteins, which are membrane associated semaphorins. Semaphorins are regulatory molecules involved in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module.	466
200528	cd11267	Sema_6B	The Sema domain, a protein interacting module, of semaphorin 6B (Sema6B). Sema6B functions as repellents for axon growth; this repulsive activity is mediated by its receptor Plexin A4. Sema6B is expressed in CA3, and repels mossy fibers in a Plexin A4 dependent manner. In human, it was shown that peroxisome proliferator-activated receptors (PPARs) and 9-cis-retinoic acid receptor (RXR) regulate human semaphorin 6B (Sema6B) gene expression. Sema6B is a member of the class 6 semaphorin family of proteins, which are membrane associated semaphorins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module.	466
200529	cd11268	Sema_6C	The Sema domain, a protein interacting module, of semaphorin 6C (Sema6C, also called semaphorin Y). Sema6C is highly expressed in adult brain and skeletal muscle and it shows growth cone collapsing activity. It may play a role in the maintenance and remodelling of neuronal connections. In adult skeletal muscle, this role includes prevention of motor neuron sprouting and uncontrolled motor neuron growth. The expression of Sema6C in adult skeletal muscle is down-regulated following denervation. Sema6C is a member of the class 6 semaphorin family of proteins, which are membrane associated semaphorins. Semaphorins are regulatory molecules involved in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module.	465
200530	cd11269	Sema_6D	The Sema domain, a protein interacting module, of semaphorin 6D (Sema6D). Sema6D is expressed predominantly in the nervous system during embryogenesis and it uses Plexin-A1 as a receptor. It displays repellent activity for dorsal root ganglion axons. Sema6D also acts as a regulator of late phase primary immune responses. In addition, Sema6D is overexpressed in gastric carcinoma, indicating that it may have an important role in the occurrence and development of the cancer. Sema6D is a member of the class 6 semaphorin family of proteins, which are membrane associated semaphorins. Semaphorins are regulatory molecules involved in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module.	465
200531	cd11270	Sema_6E	The Sema domain, a protein interacting module, of semaphorin 6E (Sema6E). Sema6E is expressed predominantly in the nervous system during embryogenesis. It binds Plexin A1 and might utilize it as a receptor to repel axons of specific types during development. Sema6E acts as a repellent to dorsal root ganglion axons as well as sympathetic axons. Sema6E is a member of the class 6 semaphorin family of proteins, which are membrane associated semaphorins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems and cancer. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a receptor-recognition and -binding module.	462
200532	cd11271	Sema_plexin_A1	The Sema domain, a protein interacting module, of Plexin A1. Plexin A1 is found in both the nervous and immune systems. Its external Sema domain is also shared by semaphorin proteins. In the nervous system, Plexin A1 mediates Sema3A axon guidance function by interacting with the Sema3A coreceptor neuropilin, resulting in actin depolarization and cell repulsion. In the immune system, Plexin A1 mediates Sema6D signaling by binding to the Sema6D-Trem2-DAP12 complex on immune cells and osteoclasts to promote Rac activation and DAP12 phosphorylation. In gene profiling experiments, Plexin A1 was identified as a CIITA (class II transactivator) regulated gene in primary dendritic cells (DCs). The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a ligand-recognition and -binding module.	474
200533	cd11272	Sema_plexin_A2	The Sema domain, a protein interacting module, of Plexin A2. Plexin A2 serves as a receptor for class 6 semaphorins. Interactions between Plexin A2, A4 and semaphorins 6A and 6B control the lamina-restricted projection of hippocampal mossy fibers. Sema6B also repels the growth of mossy fibers in a Plexin A4 dependent manner. Plexin A2 does not suppress Sema6B function. In addition, studies have shown that Plexin A2 may be related to anxiety and other psychiatric disorders. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a ligand-recognition and -binding module.	515
200534	cd11273	Sema_plexin_A3	The Sema domain, a protein interacting module, of Plexin A3. Plexin-A3 forms a receptor complex with neuropilin-2 and transduces signals for class 3 semaphorins in the nervous system. Both plexins A3 and A4 are essential for normal sympathetic neuron development. They function cooperatively to regulate the migration of sympathetic neurons, and differentially to guide sympathetic axons. Neither plexin A3 nor plexin A4 is required for guiding neural crest precursors prior to reaching the sympathetic anlagen. Plexin A3 is a major driving force for intraspinal motor growth cone guidance. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a ligand-recognition and -binding module.	469
200535	cd11274	Sema_plexin_A4	The Sema domain, a protein interacting module, of Plexin A4. Plexin A4 forms a receptor complex with neuropilins (NRPs) and transduces signals for class 3 semaphorins in the nervous system. It regulates facial nerve development by functioning as a receptor for Sema3A/NRP1. Both plexins A3 and A4 are essential for normal sympathetic development. They function both cooperatively, to regulate the migration of sympathetic neurons, and differentially, to guide sympathetic axons. Plexin A4 is also expressed in lymphoid tissues and functions in the immune system. It negatively regulates T lymphocyte responses. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a ligand-recognition and -binding module.	473
200536	cd11275	Sema_plexin_B1	The Sema domain, a protein interacting module, of Plexin B1. Plexin B1 serves as the Semaphorin 4D receptor and functions as a regulator of developing neurons and a tumor suppressor protein for melanoma. The Sema4D-plexin B signaling complex regulates dendritic and axonal complexity. The activation of Plexin B1 by Sema4D produces an acute collapse of axonal growth cones in hippocampal and retinal neurons over the early stages of neurite outgrowth and promotes branching and complexity. As a tumor suppressor, plexin B1 abrogates activation of the oncogenic receptor, c-Met, by its ligand, hepatocyte growth factor (HGF), in melanoma. Furthermore, plexin B1 suppresses integrin-dependent migration and activation of pp125FAK and inhibits Rho activity. Plexin B1 is highly expressed in endothelial cells and its activation by Sema4D elicits a potent proangiogenic response. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a ligand-recognition and -binding module.	461
200537	cd11276	Sema_plexin_B2	The Sema domain, a protein interacting module, of Plexin B2. Plexin B2 serves as the receptor of Sema4C and Sema4G. By signaling the effect of Sema4C and Sema4G, the plexin B2 receptor plays important roles in neural tube closure and cerebellar granule cell development. Mice lacking Plexin B2 demonstrated defects in closure of the neural tube and disorganization of the embryonic brain. In developing kidney, Sema4C-Plexin B2 signaling modulates ureteric branching. Plexin B2 is expressed both in the pretubular aggregates and the ureteric epithelium in the developing kidney. Deletion of Plexin B2 results in renal hypoplasia and occasional double ureters. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a ligand-recognition and -binding module.	449
200538	cd11277	Sema_plexin_B3	The Sema domain, a protein interacting module, of Plexin B3. Plexin B3 is the receptor of semaphorin 5A. It is a highly potent stimulator of neurite outgrowth of primary murine cerebellar neurons. Plexin B3 has been linked to verbal performance and white matter volume in human brain. Furthermore, Sema5A and plexin B3 have been implicated in the progression of various types of cancer. They play an important role in the invasion and metastasis of gastric carcinoma. The stimulation of plexin B3 by Sema5A binding in human glioma cells results in the inhibition of cell migration and invasion. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a ligand-recognition and -binding module.	434
200539	cd11278	Sema_MET	The Sema domain, a protein interacting module, of MET (also called hepatocyte growth factor receptor, HGFR). MET is encoded by the c-met protooncogene. MET is a receptor tyrosine kinase that binds its ligand, hepatocyte growth factor/scatter factor (HGF/SF). HGF/SF and MET are essential for the development of several tissues and organs, including the placenta, liver, and several groups of skeletal muscles. It also plays a major role in the abnormal migration of cancer cells as a result of overexpression or MET mutations. MET is composed of an alpha-beta heterodimer. The extracellular alpha chain is disulfide linked to the beta chain, which contains an extracellular ligand-binding region with a Sema domain, a PSI domain and four IPT repeats, a transmembrane segment, and an intracellular catalytic tyrosine kinase domain. The cytoplasmic C-terminal region acts as a docking site for multiple protein substrates, including Grb2, Gab1, STAT3, Shc, SHIP-1 and Src. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. The Sema domain of Met is necessary for receptor dimerization and activation.	492
200540	cd11279	Sema_RON	The Sema domain, a protein interacting module, of RON Receptor Tyrosine Kinase. RON receptor tyrosine kinase is a Macrophage-stimulating protein (MSP) receptor. Upon binding of MSP, RON is activated via autophosphorylation within its kinase catalytic domain, resulting in a wide range of effects, including proliferation, tubular morphogenesis, angiogenesis, cellular motility and invasiveness. By interacting with downstream signaling molecules, it regulates macrophage migration, phagocytosis, and nitric oxide production. RON has been implicated in cancers of the breast, colon, pancreas and ovaries because both splice variants and receptor overexpression have been identified in these tumors. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves as a ligand-recognition and -binding module. RON is composed of an alpha-beta heterodimer. The extracellular alpha chain is disulfide linked to the beta chain, which contains an extracellular ligand-binding region with a Sema domain, a PSI domain and four IPT repeats, a transmembrane segment, and an intracellular catalytic tyrosine kinase domain. The Sema domain of RON may be necessary for receptor dimerization and activation.	493
200436	cd11280	gelsolin_like	Tandemly repeated domains found in gelsolin, severin, villin, and related proteins. Gelsolin repeats occur in gelsolin, severin, villin, advillin, villidin, supervillin, flightless, quail, fragmin, and other proteins, usually in several copies. They co-occur with villin headpiece domains, leucine-rich repeats, and several other domains. These gelsolin-related actin binding proteins (GRABPs) play regulatory roles in the assembly and disassembly of actin filaments; they are involved in F-actin capping, uncapping, severing, or the nucleation of actin filaments. Severing of actin filaments is Ca2+ dependent. Villins are also linked to generating bundles of F-actin with uniform filament polarity, which is most likely mediated by their extra villin headpiece domain. Many family members have also adopted functions in the nucleus, including the regulation of transcription. Supervillin, gelsolin, and flightless I are involved in intracellular signaling via nuclear hormone receptors. The gelsolin-like domain is distantly related to the actin depolymerizing domains found in cofilin and similar proteins.	88
200437	cd11281	ADF_drebrin_like	ADF homology domain of drebrin and actin-binding protein 1 (abp1). Actin depolymerization factor/cofilin-like domains (ADF domains) are present in a family of essential eukaryotic actin regulatory proteins. Many of these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Abp1 and drebrin (developmentally regulated brain protein) are multidomain proteins with an N-terminal ADF homology domain and one or more C-terminal SH3 domains. They have been shown to interact with polymeric F-actin, but not with monomeric G-actin, and do not appear to promote the disassembly of actin filaments. Drebrin rather stabilizes actin filaments by inducing changes in the helical twist and may promote or interfere with the interactions of other proteins with actin filaments.	136
200438	cd11282	ADF_coactosin_like	Coactosin-like members of the ADF homology domain family. Actin depolymerization factor/cofilin-like domains (ADF domains) are present in a family of essential eukaryotic actin regulatory proteins. Many of these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. The function of coactosins is not well understood. They appear to interfere with the capping of actin filaments in Dictyostelium, and may not be able to bind monomeric globular actin. A role for coactosins as chaperones stabilizing 5-lipoxygenase (5LO) has been suggested; 5LO plays a crucial role in leukotriene synthesis.	114
200439	cd11283	ADF_GMF-beta_like	ADF-homology domain of glia maturation factor beta and related proteins. Actin depolymerization factor/cofilin-like domains (ADF domains) are present in a family of essential eukaryotic actin regulatory proteins. Most of these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. The glia maturation factor (GMF), however, does not bind actin but interacts with the Arp2/3 complex (which contains actin-related proteins, amongst others) and suppresses Arp2/3 activity, inducing the dissociation of branched daughter filaments from their mother filaments. This family includes both mammalian GMF isoforms, GMF-beta and GMF-gamma. GMF-beta regulates cellular growth, fission, differentiation and apoptosis. GMF-gamma is important in myeloid cell development and is an important regulator for cell migration and polarity in neutrophils.	122
200440	cd11284	ADF_Twf-C_like	C-terminal ADF domain of twinfilin and related proteins. Actin depolymerization factor/cofilin-like domains (ADF domains) are present in a family of essential eukaryotic actin regulatory proteins. Twinfilin contains two ADF domains, and inhibits the assembly of actin filaments by strongly interacting with monomeric ADP-actin (ADP-G-actin) in a 1:1 stoichiometry (with its C-terminal ADF domain, Twf-C) and inhibiting the actin monomer's nucleotide exchange. Mammalian twinfilin may also cap the barbed ends of F-actin filaments and prevent further assembly (or disassembly), in a process which requires both ADF domains. The N-terminal ADF domain (Twf-N) binds G-actin with a lower affinity than Twf-C; Twf-C can also bind F-actin. During capping, Twf-N may interact with the terminal actin subunit, and Twf-C may bind between two adjacent subunits at the side of the filament.	132
200441	cd11285	ADF_Twf-N_like	N-terminal ADF domain of twinfilin and related proteins. Actin depolymerization factor/cofilin-like domains (ADF domains) are present in a family of essential eukaryotic actin regulatory proteins. Twinfilin contains two ADF domains, and inhibits the assembly of actin filaments by strongly interacting with monomeric ADP-actin (ADP-G-actin) in a 1:1 stoichiometry (with its C-terminal ADF domain, Twf-C) and inhibiting the actin monomer's nucleotide exchange. Mammalian twinfilin may also cap the barbed ends of F-actin filaments and prevent further assembly (or disassembly), in a process which requires both ADF domains. The N-terminal ADF domain (Twf-N) binds G-actin with a lower affinity than Twf-C; Twf-C can also bind F-actin. During capping, Twf-N may interact with the terminal actin subunit, and Twf-C may bind between two adjacent subunits at the side of the filament.	139
200442	cd11286	ADF_cofilin_like	Cofilin, Destrin, and related actin depolymerizing factors. Actin depolymerization factor/cofilin-like domains (ADF domains) are present in a family of essential eukaryotic actin regulatory proteins. These proteins enhance the turnover rate of actin, and interact with actin monomers (G-actin) as well as actin filaments (F-actin), typically with a preference for ADP-G-actin subunits. The basic function of cofilin is to promote disassembly of aged actin filaments. Vertebrates have three isoforms of cofilin: cofilin-1 (Cfl1, non-muscle cofilin), cofilin-2 (muscle cofilin), and ADF (destrin). When bound to actin monomers, cofilins inhibit their spontaneous exchange of nucleotides. The cooperative binding to (aged) ADP-F-actin induces a local change in the actin filament structure and further promotes aging.	133
200443	cd11287	Sec23_C	C-terminal Actin depolymerization factor-homology domain of Sec23. The C-terminal domain of the Sec23 subunit of the coat protein complex II (COPII) is distantly related to gelsolin-like repeats and the actin depolymerizing domains found in cofilin and similar proteins. Sec23 forms a tight complex with Sec24. The cytoplasmic Sec23/24 complex is recruited together with Sar1-GTP and Sec13/31 to induce coat polymerization and membrane deformation in the forming of COPII-coated endoplasmic reticulum vesicles. The function of the Sec23 C-terminal domain is unclear.	121
200444	cd11288	gelsolin_S5_like	Gelsolin sub-domain 5-like domain found in gelsolin, severin, villin, and related proteins. Gelsolin repeats occur in gelsolin, severin, villin, advillin, villidin, supervillin, flightless, quail, fragmin, and other proteins, usually in several copies. They co-occur with villin headpiece domains, leucine-rich repeats, and several other domains. These gelsolin-related actin binding proteins (GRABPs) play regulatory roles in the assembly and disassembly of actin filaments; they are involved in F-actin capping, uncapping, severing, or the nucleation of actin filaments. Severing of actin filaments is Ca2+ dependent. Villins are also linked to generating bundles of F-actin with uniform filament polarity, which is most likely mediated by their extra villin headpiece domain. Many family members have also adopted functions in the nucleus, including the regulation of transcription. Supervillin, gelsolin, and flightless I are involved in intracellular signaling via nuclear hormone receptors. The gelsolin-like domain is distantly related to the actin depolymerizing domains found in cofilin and similar proteins.	92
200445	cd11289	gelsolin_S2_like	Gelsolin sub-domain 2-like domain found in gelsolin, severin, villin, and related proteins. Gelsolin repeats occur in gelsolin, severin, villin, advillin, villidin, supervillin, flightless, quail, fragmin, and other proteins, usually in several copies. They co-occur with villin headpiece domains, leucine-rich repeats, and several other domains. These gelsolin-related actin binding proteins (GRABPs) play regulatory roles in the assembly and disassembly of actin filaments; they are involved in F-actin capping, uncapping, severing, or the nucleation of actin filaments. Severing of actin filaments is Ca2+ dependent. Villins are also linked to generating bundles of F-actin with uniform filament polarity, which is most likely mediated by their extra villin headpiece domain. Many family members have also adopted functions in the nucleus, including the regulation of transcription. Supervillin, gelsolin, and flightless I are involved in intracellular signaling via nuclear hormone receptors. The gelsolin-like domain is distantly related to the actin depolymerizing domains found in cofilin and similar proteins.	92
200446	cd11290	gelsolin_S1_like	Gelsolin sub-domain 1-like domain found in gelsolin, severin, villin, and related proteins. Gelsolin repeats occur in gelsolin, severin, villin, advillin, villidin, supervillin, flightless, quail, fragmin, and other proteins, usually in several copies. They co-occur with villin headpiece domains, leucine-rich repeats, and several other domains. These gelsolin-related actin binding proteins (GRABPs) play regulatory roles in the assembly and disassembly of actin filaments; they are involved in F-actin capping, uncapping, severing, or the nucleation of actin filaments. Severing of actin filaments is Ca2+ dependent. Villins are also linked to generating bundles of F-actin with uniform filament polarity, which is most likely mediated by their extra villin headpiece domain. Many family members have also adopted functions in the nucleus, including the regulation of transcription. Supervillin, gelsolin, and flightless I are involved in intracellular signaling via nuclear hormone receptors. The gelsolin-like domain is distantly related to the actin depolymerizing domains found in cofilin and similar proteins.	113
200447	cd11291	gelsolin_S6_like	Gelsolin sub-domain 6-like domain found in gelsolin, severin, villin, and related proteins. Gelsolin repeats occur in gelsolin, severin, villin, advillin, villidin, supervillin, flightless, quail, fragmin, and other proteins, usually in several copies. They co-occur with villin headpiece domains, leucine-rich repeats, and several other domains. These gelsolin-related actin binding proteins (GRABPs) play regulatory roles in the assembly and disassembly of actin filaments; they are involved in F-actin capping, uncapping, severing, or the nucleation of actin filaments. Severing of actin filaments is Ca2+ dependent. Villins are also linked to generating bundles of F-actin with uniform filament polarity, which is most likely mediated by their extra villin headpiece domain. Many family members have also adopted functions in the nucleus, including the regulation of transcription. Supervillin, gelsolin, and flightless I are involved in intracellular signaling via nuclear hormone receptors. The gelsolin-like domain is distantly related to the actin depolymerizing domains found in cofilin and similar proteins.	99
200448	cd11292	gelsolin_S3_like	Gelsolin sub-domain 3-like domain found in gelsolin, severin, villin, and related proteins. Gelsolin repeats occur in gelsolin, severin, villin, advillin, villidin, supervillin, flightless, quail, fragmin, and other proteins, usually in several copies. They co-occur with villin headpiece domains, leucine-rich repeats, and several other domains. These gelsolin-related actin binding proteins (GRABPs) play regulatory roles in the assembly and disassembly of actin filaments; they are involved in F-actin capping, uncapping, severing, or the nucleation of actin filaments. Severing of actin filaments is Ca2+ dependent. Villins are also linked to generating bundles of F-actin with uniform filament polarity, which is most likely mediated by their extra villin headpiece domain. Many family members have also adopted functions in the nucleus, including the regulation of transcription. Supervillin, gelsolin, and flightless I are involved in intracellular signaling via nuclear hormone receptors. The gelsolin-like domain is distantly related to the actin depolymerizing domains found in cofilin and similar proteins.	98
200449	cd11293	gelsolin_S4_like	Gelsolin sub-domain 4-like domain found in gelsolin, severin, villin, and related proteins. Gelsolin repeats occur in gelsolin, severin, villin, advillin, villidin, supervillin, flightless, quail, fragmin, and other proteins, usually in several copies. They co-occur with villin headpiece domains, leucine-rich repeats, and several other domains. These gelsolin-related actin binding proteins (GRABPs) play regulatory roles in the assembly and disassembly of actin filaments; they are involved in F-actin capping, uncapping, severing, or the nucleation of actin filaments. Severing of actin filaments is Ca2+ dependent. Villins are also linked to generating bundles of F-actin with uniform filament polarity, which is most likely mediated by their extra villin headpiece domain. Many family members have also adopted functions in the nucleus, including the regulation of transcription. Supervillin, gelsolin, and flightless I are involved in intracellular signaling via nuclear hormone receptors. The gelsolin-like domain is distantly related to the actin depolymerizing domains found in cofilin and similar proteins.	101
199894	cd11294	E_set_Esterase_like_N	N-terminal Early set domain associated with the catalytic domain of putative esterases. E or "early" set domains are associated with the catalytic domain of esterase at the N-terminal end. Esterases catalyze the hydrolysis of organic esters to release an alcohol or thiol and acid. The term esterase can be applied to enzymes that hydrolyze carboxylate, phosphate and sulphate esters, but is more often restricted to the first class of substrate. The N-terminal domain of esterase may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase, among others.	83
199917	cd11295	Mago_nashi	Mago nashi proteins, integral members of the exon junction complex. Members of this family, which was originally identified in Drosophila and called mago nashi, are integral members of the exon junction complex (EJC). The EJC is a multiprotein complex that is deposited on spliced mRNAs after intron removal at a conserved position upstream of the exon-exon junction, and transported to the cytoplasm where it has been shown to influence translation, surveillance, and localization of the spliced mRNA. It consists of four core proteins (eIF4AIII, Barentsz [Btz], Mago, and Y14), mRNA, and ATP and is supposed to be a binding platform for more peripherally and transiently associated factors along mRNA travel. Mago and Y14 form a stable heterodimer that stabilizes the complex by inhibiting eIF4AIII's ATPase activity. In humans, but not Drosophila, EJC is involved in nonsense-mediated mRNA decay (NMD) via binding to Upf3b, a central NMD effector. EJC is stripped off the mRNA during the first round of translation and then the complex components are transported back into the nucleus and recycled. The Mago-Y14 heterodimer has been shown to interact with the cytoplasmic protein PYM, an EJC disassembly factor, and specifically binds to the karyopherin nuclear receptor importin 13.	143
211383	cd11296	O-FucT_like	GDP-fucose protein O-fucosyltransferase and related proteins. O-fucosyltransferase-like proteins are GDP-fucose dependent enzymes with similarities to the family 1 glycosyltransferases (GT1). They are soluble ER proteins that may be proteolytically cleaved from a membrane-associated preprotein, and are involved in the O-fucosylation of protein substrates, the core fucosylation of growth factor receptors, and other processes.	206
350237	cd11297	PIN_LabA-like_N_1	uncharacterized subfamily of N-terminal LabA-like PIN domains. This N-terminal LabA-like PIN domain is found in a well conserved group of mainly bacterial proteins with no defined function, which contain a C-terminal LabA_like_C domain. LabA from Synechococcus elongatus PCC 7942, (which does not contain this C-terminal domain), has been shown to play a role in cyanobacterial circadian timing. The LabA-like C-terminal domains characteristic of this subfamily may be related to the LOTUS domain family (which also co-occurs with LabA-like N-terminal domains). The function of the N-terminal domain is unknown. The LabA-like PIN domain family also includes the N-terminal domain of limkain b1, a human autoantigen localized to a subset of ABCD3 and PXF marked peroxisomes. Other members are the LabA-like PIN domains of human ZNF451, uncharacterized Bacillus subtilis YqxD and Escherichia coli YaiI. Curiously, a gene labeled NicB from Pseudomonas putida S16, which is described as a putative NADH-dependent hydroxylase involved in the microbial degradation of nicotine also falls into the LabA-like PIN family. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	117
211384	cd11298	O-FucT-2	GDP-fucose protein O-fucosyltransferase 2. O-FucT-2 adds O-fucose to thrombospondin type 1 repeats (TSRs), and appears conserved in bilateria. The O-fucosylation of TSRs appears to play a role in regulating secretion of metalloproteases of the ADAMTS superfamily. O-fucosyltransferase-like proteins are GDP-fucose dependent enzymes with similarities to the family 1 glycosyltransferases (GT1). They are soluble ER proteins that may be proteolytically cleaved from a membrane-associated preprotein, and are involved in the O-fucosylation of protein substrates, the core fucosylation of growth factor receptors, and other processes.	374
211385	cd11299	O-FucT_plant	GDP-fucose protein O-fucosyltransferase, plant specific subfamily. Some members of this plant-specific family of O-fucosyltransferases have been annotated as auxin-independent growth promoters. The function of the protein seems unclear. O-fucosyltransferase-like proteins are GDP-fucose dependent enzymes with similarities to the family 1 glycosyltransferases (GT1). They are soluble ER proteins that may be proteolytically cleaved from a membrane-associated preprotein, and are involved in the O-fucosylation of protein substrates, the core fucosylation of growth factor receptors, and other processes.	290
211386	cd11300	Fut8_like	Alpha 1-6-fucosyltransferase. Alpha 1,6-fucosyltransferase (Fut8) transfers a fucose moiety from GDP-fucose to the reducing terminal N-acetylglucosamine of the core structure of Asn-linked oligosaccharides, in a process termed core fucosylation. Core fucosylation is essential for the function of growth factor receptors. O-fucosyltransferase-like proteins are GDP-fucose dependent enzymes with similarities to the family 1 glycosyltransferases (GT1). They are soluble ER proteins that may be proteolytically cleaved from a membrane-associated preprotein, and are involved in the O-fucosylation of protein substrates, the core fucosylation of growth factor receptors, and other processes.	328
211387	cd11301	Fut1_Fut2_like	Alpha-1,2-fucosyltransferase. Alpha-1,2-fucosyltransferases (Fut1, Fut2) catalyze the transfer of alpha-L-fucose to the terminal beta-D-galactose residue of glycoconjugates via an alpha-1,2-linkage, generating carbohydrate structures that exhibit H-antigenicity for blood-group carbohydrates. These structures also act as ligands for morphogenesis, the adhesion of microbes, and metastasizing cancer cells. Fut1 is responsible for producing the H antigen on red blood cells. Fut2 is expressed in epithelia of secretory tissues, and individuals termed "secretors" have at least one functional copy of the gene; they secrete H antigen which is further processed into A and/or B antigens depending on the ABO genotype. O-fucosyltransferase-like proteins are GDP-fucose dependent enzymes with similarities to the family 1 glycosyltransferases (GT1). They are soluble ER proteins that may be proteolytically cleaved from a membrane-associated preprotein, and are involved in the O-fucosylation of protein substrates, the core fucosylation of growth factor receptors, and other processes.	265
211388	cd11302	O-FucT-1	GDP-fucose protein O-fucosyltransferase 1. The protein O-fucosyltransferase 1 (Ofut1 or O-FucT-1) adds O-fucose to EGF (epidermal growth factor-like) repeats. The O-fucsosylation of the Notch receptor signaling protein is dependent on this enzyme, which requires GDP-fucose as a substrate. O-fucose residues added to the target of O-FucT-1 may be further elongated by other glycosyltransferases. On top of O-fucosylation, O-FucT-1 may have other functions such as the regulation of the Notch receptor exit from the ER. Six highly conserved cysteines are present in O-FucT-1, which is a soluble ER protein, as well as a DXD-like motif (ERD), conserved in mammals, Drosophila, and C. elegans. Both features are characteristic of several glycosyltransferase families. The membrane-bound pre-protein is released by proteolysis and, as for most glycosyltransferases, is strongly activated by manganese. O-FucT-1 is similar to family 1 glycosyltransferases (GT1).	347
206636	cd11303	Dystroglycan_repeat	Cadherin-like repeat domain of alpha dystroglycan. Dystroglycan is a glycoprotein widely distributed in skeletal muscle and other tissues; the pre-protein is cleaved into two subunits (alpha and beta) that form a complex which links the extracellular matrix to the cytoskeleton. Cadherin-like dystroglycan repeats are present in the extracellular alpha-dystroglycan subunit, which binds to the alpha-2-laminin G-domain in the basement membrane as part of the dystrophin-dystroglycan-complex (DGC). DGC has been shown to interact with other extracellular matrix components as well, such as perlecan and m-agrin, suggesting that the complex may play various different roles depending on the extracellular ligand.	99
206637	cd11304	Cadherin_repeat	Cadherin tandem repeat domain. Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers.	98
206765	cd11305	alpha_DG_C	C-terminal domain of alpha dystroglycan. Dystroglycan is a glycoprotein widely distributed in skeletal muscle and other tissues; the pre-protein is cleaved into two subunits (alpha and beta) that form a complex which links the extracellular matrix to the cytoskeleton. This C-terminal domain of the alpha-subunit appears to contact neighboring cadherin-like repeats of alpha dystroglycan, and may also be involved in interactions with other components of the dystrophin-dystroglycan-complex (DGC). DGC has been shown to interact with extracellular matrix components such as laminin, perlecan and m-agrin, suggesting that the complex may play various different roles depending on the extracellular ligand.	124
199915	cd11306	M35_peptidyl-Lys	Peptidase M35 domain of peptidyl-Lys metalloendopeptidases. This family M35 Zn2+-metallopeptidase extracellular domain is mostly found in proteins characterized as peptidyl-Lys metalloendopeptidases (MEP; peptidyllysine metalloproteinase; EC 3.4.24.20), including some well-characterized domains in Aeromonas salmonicida subsp. achromogenes (AsaP1) and Grifola frondosa (GfMEP). These proteins specifically cleave peptidyl-lysine bonds (-X-Lys- where X may even be Pro) in proteins and peptides. AsaP1 peptidase has been shown to be important in the virulence of A. salmonicida subsp. achromogenes, having a major role in the fish innate immune response. Members of this family contain a unique zinc-binding motif (the aspzincin motif), defined by the HExxH + D motif where an aspartic acid is the third zinc ligand and is found in a GTXDXXYG or similar motif C-terminal to the His zinc ligands.	160
199916	cd11307	M35_Asp_f2_like	Peptidase M35 domain of Asp f2, a major allergen from Aspergillus fumigatus, and related proteins; non-catalytic. In this domain subgroup the unique zinc-binding motif (the aspzincin motif, characteristic of the M35 deuterolysin family, and defined as the "HEXXH + D" motif: two His ligands and Asp as third ligand), is poorly conserved and may not bind zinc. Members of this subgroup also lack a key conserved Tyr residue which acts as a proton donor during metallopeptidase catalysis. These include Asp f2, a major allergen from Aspergillus fumigatus, which reacts with serum from patients with ABPA (allergic bronchopulmonary aspergillosis), and pH-regulated antigen 1 (PRA1) from Candida albicans, which has a role in fungal morphogenesis and perhaps in the host-parasite interaction during candidal infection. No protease activity has been detected for Asp f2 to date. This subgroup also includes Saccharomyces cerevisiae Zps1p. The expression of the ZPS1 gene is increased in response to zinc deficiency; it is a target of the Zap1p transcription factor.	179
200604	cd11308	Peptidase_M14NE-CP-C_like	Peptidase associated domain: C-terminal domain of M14 N/E carboxypeptidase; putative folding, regulation, or interaction domain. This domain is found C-terminal to the M14 carboxypeptidase (CP) N/E subfamily containing zinc-binding enzymes that hydrolyze single C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. The N/E subfamily includes enzymatically active members (carboxypeptidase N, E, M, D, and Z), as well as non-active members (carboxypeptidase-like protein 1, -2, aortic CP-like protein, and adipocyte enhancer binding protein-1) which lack the critical active site and substrate-binding residues considered necessary for activity. The active N/E enzymes fulfill a variety of cellular functions, including prohormone processing, regulation of peptide hormone activity, alteration of protein-protein or protein-cell interactions and transcriptional regulation. For M14 CPs, it has been suggested that this domain may assist in folding of the CP domain, regulate enzyme activity,  or be involved in interactions with other proteins or with membranes; for carboxypeptidase M, it may interact with the bradykinin 1 receptor at the cell surface. This domain may also be found in other peptidase families.	76
206763	cd11309	14-3-3_fungi	Fungal 14-3-3 protein domain. This family containing fungal 14-3-3 domains includes the yeasts Saccharomyces cerevisiae (BMH1 and BMH2) and Schizosaccharomyces pombe (rad24 and rad25) isoforms. They possess distinctively variant C-terminal segments that differentiate them from the mammalian isoforms; the C-terminus is longer and BMH1/2 isoforms contain polyglutamine (polyQ) sequences of unknown function. The C-terminal segments of yeast 14-3-3 isoforms may thus behave in a different manner compared to the higher eukaryote isoforms. Yeast 14-3-3 proteins bind to numerous proteins involved in a variety of yeast cellular processes making them excellent model organisms for elucidating the function of the 14-3-3 protein family.  BMH1 and BMH2 are positive regulators of rapamycin-sensitive signaling via TOR kinases while they play an inhibitory role in Rtg3p-dependent transcription involved in retrograde signaling. 14-3-3 domains are an essential part of 14-3-3 proteins, a ubiquitous class of regulatory, phosphoserine/threonine-binding proteins found in all eukaryotic cells, including yeast, protozoa and mammalian cells.	231
206764	cd11310	14-3-3_1	14-3-3 protein domain. This 14-3-3 domain family includes proteins in Caenorhabditis elegans, the silkworm (Bombyx mori) as well as barley (Hordeum vulgare). In C. elegans, 14-3-3 proteins are SIR-2.1 binding partners which induce transcriptional activation of DAF-16 during stress and are required for the life-span extension conferred by extra copies of sir-2.1. In B. mori, the 14-3-3 proteins are expressed widely in larval and adult tissues, including the brain, fat body, Malpighian tube, silk gland, midgut, testis, ovary, antenna, and pheromone gland, and interact with the N-terminal fragment of Hsp60, suggesting that 14-3-3 (a molecular adaptor) and Hsp60 (a molecular chaperone) work together to achieve a wide range of cellular functions in B. mori. In barley aleurone cells, 14-3-3 proteins and members of the ABF transcription factor family have a regulatory function in the gibberellic acid (GA) pathway since the balance of GA and abscisic acid (ABA) is a determining factor during transition of embryogenesis and seed germination. 14-3-3 is an essential part of 14-3-3 proteins, a ubiquitous class of regulatory, phosphoserine/threonine-binding proteins found in all eukaryotic cells, including yeast, protozoa and mammalian cells.	230
200452	cd11313	AmyAc_arch_bac_AmyA	Alpha amylase catalytic domain found in archaeal and bacterial Alpha-amylases (also called 1,4-alpha-D-glucan-4-glucanohydrolase). AmyA (EC 3.2.1.1) catalyzes the hydrolysis of alpha-(1,4) glycosidic linkages of glycogen, starch, related polysaccharides, and some oligosaccharides. This group includes firmicutes, bacteroidetes, and proteobacteria. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	336
200453	cd11314	AmyAc_arch_bac_plant_AmyA	Alpha amylase catalytic domain found in archaeal, bacterial, and plant Alpha-amylases (also called 1,4-alpha-D-glucan-4-glucanohydrolase). AmyA (EC 3.2.1.1) catalyzes the hydrolysis of alpha-(1,4) glycosidic linkages of glycogen, starch, related polysaccharides, and some oligosaccharides. This group includes AmyA from bacteria, archaea, water fleas, and plants. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	302
200454	cd11315	AmyAc_bac1_AmyA	Alpha amylase catalytic domain found in bacterial Alpha-amylases (also called 1,4-alpha-D-glucan-4-glucanohydrolase). AmyA (EC 3.2.1.1) catalyzes the hydrolysis of alpha-(1,4) glycosidic linkages of glycogen, starch, related polysaccharides, and some oligosaccharides. This group includes Firmicutes, Proteobacteria, Actinobacteria, and Cyanobacteria. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	352
200455	cd11316	AmyAc_bac2_AmyA	Alpha amylase catalytic domain found in bacterial Alpha-amylases (also called 1,4-alpha-D-glucan-4-glucanohydrolase). AmyA (EC 3.2.1.1) catalyzes the hydrolysis of alpha-(1,4) glycosidic linkages of glycogen, starch, related polysaccharides, and some oligosaccharides. This group includes Chloroflexi, Dictyoglomi, and Fusobacteria. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	403
200456	cd11317	AmyAc_bac_euk_AmyA	Alpha amylase catalytic domain found in bacterial and eukaryotic Alpha amylases (also called 1,4-alpha-D-glucan-4-glucanohydrolase). AmyA (EC 3.2.1.1) catalyzes the hydrolysis of alpha-(1,4) glycosidic linkages of glycogen, starch, related polysaccharides, and some oligosaccharides. This group includes AmyA proteins from bacteria, fungi, mammals, insects, mollusks, and nematodes. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	329
200457	cd11318	AmyAc_bac_fung_AmyA	Alpha amylase catalytic domain found in bacterial and fungal Alpha amylases (also called 1,4-alpha-D-glucan-4-glucanohydrolase). AmyA (EC 3.2.1.1) catalyzes the hydrolysis of alpha-(1,4) glycosidic linkages of glycogen, starch, related polysaccharides, and some oligosaccharides. This group includes bacterial and fungal proteins. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	391
200458	cd11319	AmyAc_euk_AmyA	Alpha amylase catalytic domain found in eukaryotic Alpha-amylases (also called 1,4-alpha-D-glucan-4-glucanohydrolase). AmyA (EC 3.2.1.1) catalyzes the hydrolysis of alpha-(1,4) glycosidic linkages of glycogen, starch, related polysaccharides, and some oligosaccharides. This group includes eukaryotic alpha-amylases including proteins from fungi, sponges, and protozoans. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	375
200459	cd11320	AmyAc_AmyMalt_CGTase_like	Alpha amylase catalytic domain found in maltogenic amylases, cyclodextrin glycosyltransferase, and related proteins. Enzymes such as amylases, cyclomaltodextrinase (CDase), and cyclodextrin glycosyltransferase (CGTase) degrade starch to smaller oligosaccharides by hydrolyzing the alpha-D-(1,4) linkages between glucose residues. In the case of CGTases, an additional cyclization reaction is catalyzed yielding mixtures of cyclic oligosaccharides which are referred to as alpha-, beta-, or gamma-cyclodextrins (CDs), consisting of six, seven, or eight glucose residues, respectively. CGTases are characterized depending on the major product of the cyclization reaction. Besides having similar catalytic site residues, amylases and CGTases contain carbohydrate binding domains that are distant from the active site and are implicated in attaching the enzyme to raw starch granules and in guiding the amylose chain into the active site. The maltogenic alpha-amylase from Bacillus is a five-domain structure, unlike most alpha-amylases, but similar to that of cyclodextrin glycosyltransferase. In addition to the A, B, and C domains, they have a domain D and a starch-binding domain E. Maltogenic amylase is an endo-acting amylase that has activity on cyclodextrins, terminally modified linear maltodextrins, and amylose. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	389
200460	cd11321	AmyAc_bac_euk_BE	Alpha amylase catalytic domain found in bacterial and eukaryotic branching enzymes. Branching enzymes (BEs) catalyze the formation of alpha-1,6 branch points in either glycogen or starch by cleavage of the alpha-1,4 glucosidic linkage yielding a non-reducing end oligosaccharide chain, and subsequent attachment to the alpha-1,6 position. By increasing the number of non-reducing ends, glycogen is more reactive to synthesis and digestion as well as being more soluble. This group includes bacterial and eukaryotic proteins. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	406
200461	cd11322	AmyAc_Glg_BE	Alpha amylase catalytic domain found in the Glycogen branching enzyme (also called 1,4-alpha-glucan branching enzyme). The glycogen branching enzyme catalyzes the third step of glycogen biosynthesis by the cleavage of an alpha-(1,4)-glucosidic linkage and the formation of a new alpha-(1,6)-branch by subsequent transfer of the cleaved oligosaccharide. They are part of a group called branching enzymes which catalyze the formation of alpha-1,6 branch points in either glycogen or starch. This group includes proteins from bacteria, eukaryotes, and archaea. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	402
200462	cd11323	AmyAc_AGS	Alpha amylase catalytic domain found in Alpha 1,3-glucan synthase (also called uridine diphosphoglucose-1,3-alpha-glucan glucosyltransferase and 1,3-alpha-D-glucan synthase). Alpha 1,3-glucan synthase (AGS, EC 2.4.1.183) is an enzyme that catalyzes the reversible chemical reaction of UDP-glucose and [alpha-D-glucosyl-(1-3)]n to form UDP and [alpha-D-glucosyl-(1-3)]n+1. AGS is a component of fungal cell walls. The cell wall of filamentous fungi is composed of 10-15% chitin and 10-35% alpha-1,3-glucan. AGS is triggered in fungi as a response to cell wall stress and elongates the glucan chains in cell wall synthesis. This group includes proteins from Ascomycetes and Basidiomycetes. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	569
200463	cd11324	AmyAc_Amylosucrase	Alpha amylase catalytic domain found in Amylosucrase. Amylosucrase is a glucosyltransferase that catalyzes the transfer of a D-glucopyranosyl moiety from sucrose onto an acceptor molecule. When the acceptor is another saccharide, only alpha-1,4 linkages are produced. Unlike most amylopolysaccharide synthases, it does not require any alpha-D-glucosyl nucleoside diphosphate substrate. In the presence of glycogen it catalyzes the transfer of a D-glucose moiety onto a glycogen branch, but in its absence, it hydrolyzes sucrose and synthesizes polymers, smaller maltosaccharides, and sucrose isoforms. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	536
200464	cd11325	AmyAc_GTHase	Alpha amylase catalytic domain found in Glycosyltrehalose trehalohydrolase (also called Maltooligosyl trehalose Trehalohydrolase). Glycosyltrehalose trehalohydrolase (GTHase) was discovered as part of a coupled system for the production of trehalose from soluble starch. In the first half of the reaction, glycosyltrehalose synthase (GTSase), an intramolecular glycosyl transferase, converts the glycosidic bond between the last two glucose residues of amylose from an alpha-1,4 bond to an alpha-1,1 bond, making a non-reducing glycosyl trehaloside. In the second half of the reaction, GTHase cleaves the alpha-1,4 glycosidic bond adjacent to the trehalose moiety to release trehalose and malto-oligosaccharide. Like isoamylase and other glycosidases that recognize branched oligosaccharides, GTHase contains an N-terminal extension and does not have the conserved calcium ion present in other alpha amylase family enzymes. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	436
200465	cd11326	AmyAc_Glg_debranch	Alpha amylase catalytic domain found in glycogen debranching enzymes. Debranching enzymes facilitate the breakdown of glycogen through glucosyltransferase and glucosidase activity. These activities are performed by a single enzyme in mammals, yeast, and some bacteria, but by two distinct enzymes in Escherichia coli and other bacteria. Debranching enzymes perform two activities: 4-alpha-D-glucanotransferase (EC 2.4.1.25) and amylo-1,6-glucosidase (EC 3.2.1.33). 4-alpha-D-glucanotransferase transfers a segment of a 1,4-alpha-D-glucan to a new position in an acceptor, which may be glucose or another 1,4-alpha-D-glucan. Amylo-alpha-1,6-glucosidase catalyzes the endohydrolysis of 1,6-alpha-D-glucoside linkages at points of branching in chains of 1,4-linked alpha-D-glucose residues. In Escherichia coli, GlgX is the debranching enzyme and malQ is the 4-alpha-glucanotransferase. TreX, an archaeal glycogen-debranching enzyme, has dual activities like those of mammals and yeast, but is structurally similar to GlgX. TreX exists in two oligomeric states, a dimer and tetramer. Isoamylase (EC 3.2.1.68) is one of the starch-debranching enzymes that catalyzes the hydrolysis of alpha-1,6-glucosidic linkages specifically in alpha-glucans such as amylopectin or glycogen and their beta-limit dextrins. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. 
The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	433
200466	cd11327	AmyAc_Glg_debranch_2	Alpha amylase catalytic domain found in glycogen debranching enzymes. Debranching enzymes facilitate the breakdown of glycogen through glucosyltransferase and glucosidase activity. These activities are performed by a single enzyme in mammals, yeast, and some bacteria, but by two distinct enzymes in Escherichia coli and other bacteria. Debranching enzymes perform two activities, 4-alpha-D-glucanotransferase (EC 2.4.1.25) and amylo-1,6-glucosidase (EC 3.2.1.33). 4-alpha-D-glucanotransferase transfers a segment of a 1,4-alpha-D-glucan to a new position in an acceptor, which may be glucose or another 1,4-alpha-D-glucan. Amylo-alpha-1,6-glucosidase catalyzes the endohydrolysis of 1,6-alpha-D-glucoside linkages at points of branching in chains of 1,4-linked alpha-D-glucose residues. The catalytic triad (DED), which is highly conserved in other debranching enzymes, is not present in this group. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). 
The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	478
200467	cd11328	AmyAc_maltase	Alpha amylase catalytic domain found in maltase (also known as alpha glucosidase) and related proteins. Maltase (EC 3.2.1.20) hydrolyzes the terminal, non-reducing (1->4)-linked alpha-D-glucose residues in maltose, releasing alpha-D-glucose. In most cases, maltase is equivalent to alpha-glucosidase, but the term "maltase" emphasizes the disaccharide nature of the substrate from which glucose is cleaved, and the term "alpha-glucosidase" emphasizes the bond, whether the substrate is a disaccharide or polysaccharide. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	470
200468	cd11329	AmyAc_maltase-like	Alpha amylase catalytic domain family found in maltase. Maltase (EC 3.2.1.20) hydrolyzes the terminal, non-reducing (1->4)-linked alpha-D-glucose residues in maltose, releasing alpha-D-glucose. The catalytic triad (DED) which is highly conserved in the other maltase group is not present in this subfamily. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	477
200469	cd11330	AmyAc_OligoGlu	Alpha amylase catalytic domain found in oligo-1,6-glucosidase (also called isomaltase; sucrase-isomaltase; alpha-limit dextrinase) and related proteins. Oligo-1,6-glucosidase (EC 3.2.1.10) hydrolyzes the alpha-1,6-glucosidic linkage of isomalto-oligosaccharides, panose, and dextran. Unlike alpha-1,4-glucosidases (EC 3.2.1.20), it fails to hydrolyze the alpha-1,4-glucosidic bonds of maltosaccharides. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	472
200470	cd11331	AmyAc_OligoGlu_like	Alpha amylase catalytic domain found in oligo-1,6-glucosidase (also called isomaltase; sucrase-isomaltase; alpha-limit dextrinase) and related proteins. Oligo-1,6-glucosidase (EC 3.2.1.10) hydrolyzes the alpha-1,6-glucosidic linkage of isomalto-oligosaccharides, panose, and dextran. Unlike alpha-1,4-glucosidases (EC 3.2.1.20), it fails to hydrolyze the alpha-1,4-glucosidic bonds of maltosaccharides. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	450
200471	cd11332	AmyAc_OligoGlu_TS	Alpha amylase catalytic domain found in oligo-1,6-glucosidase (also called isomaltase; sucrase-isomaltase; alpha-limit dextrinase), trehalose synthase (also called maltose alpha-D-glucosyltransferase), and related proteins. Oligo-1,6-glucosidase (EC 3.2.1.10) hydrolyzes the alpha-1,6-glucosidic linkage of isomaltooligosaccharides, panose, and dextran. Unlike alpha-1,4-glucosidases (EC 3.2.1.20), it fails to hydrolyze the alpha-1,4-glucosidic bonds of maltosaccharides. Trehalose synthase (EC 5.4.99.16) catalyzes the isomerization of maltose to produce trehalose. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	481
200472	cd11333	AmyAc_SI_OligoGlu_DGase	Alpha amylase catalytic domain found in Sucrose isomerases, oligo-1,6-glucosidase (also called isomaltase; sucrase-isomaltase; alpha-limit dextrinase), dextran glucosidase (also called glucan 1,6-alpha-glucosidase), and related proteins. The sucrose isomerases (SIs) Isomaltulose synthase (EC 5.4.99.11) and Trehalose synthase (EC 5.4.99.16) catalyze the isomerization of sucrose and maltose to produce isomaltulose and trehalose, respectively. Oligo-1,6-glucosidase (EC 3.2.1.10) hydrolyzes the alpha-1,6-glucosidic linkage of isomaltooligosaccharides, panose, and dextran. Unlike alpha-1,4-glucosidases (EC 3.2.1.20), it fails to hydrolyze the alpha-1,4-glucosidic bonds of maltosaccharides. Dextran glucosidase (DGase, EC 3.2.1.70) hydrolyzes alpha-1,6-glucosidic linkages at the non-reducing end of panose, isomaltooligosaccharides and dextran to produce alpha-glucose. The common reaction chemistry of the alpha-amylase family enzymes is based on a two-step acid catalytic mechanism that requires two critical carboxylates: one acting as a general acid/base (Glu) and the other as a nucleophile (Asp). Both hydrolysis and transglycosylation proceed via the nucleophilic substitution reaction between the anomeric carbon, C1 and a nucleophile. Both enzymes contain the three catalytic residues (Asp, Glu and Asp) common to the alpha-amylase family as well as two histidine residues which are predicted to be critical to binding the glucose residue adjacent to the scissile bond in the substrates. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. 
A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	428
200473	cd11334	AmyAc_TreS	Alpha amylase catalytic domain found in Trehalose synthetase. Trehalose synthetase (TreS) catalyzes the reversible interconversion of trehalose and maltose. The enzyme catalyzes the reaction in both directions, but the preferred substrate is maltose. Glucose is formed as a by-product of this reaction. It is believed that the catalytic mechanism may involve the cutting of the incoming disaccharide and transfer of a glucose to an enzyme-bound glucose. This enzyme also catalyzes production of a glucosamine disaccharide from maltose and glucosamine. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	447
200474	cd11335	AmyAc_MTase_N	Alpha amylase catalytic domain found in maltosyltransferase. Maltosyltransferase (MTase), a maltodextrin glycosyltransferase, acts on starch and maltooligosaccharides. It catalyzes the transfer of maltosyl units from alpha-1,4-linked glucans or maltooligosaccharides to other alpha-1,4-linked glucans, maltooligosaccharides or glucose. MTase is a homodimer. The catalytic core domain has the (beta/alpha) 8 barrel fold with the active-site cleft formed at the C-terminal end of the barrel. Substrate binding experiments have led to the location of two distinct maltose-binding sites: one lies in the active-site cleft and the other is located in a pocket adjacent to the active-site cleft. It is a member of the alpha-amylase family, but unlike typical alpha-amylases, MTase does not require calcium for activity and lacks two histidine residues which are predicted to be critical for binding the glucose residue adjacent to the scissile bond in the substrates. The common reaction chemistry of the alpha-amylase family of enzymes is based on a two-step acid catalytic mechanism that requires two critical carboxylates: one acting as a general acid/base (Glu) and the other as a nucleophile (Asp). Both hydrolysis and transglycosylation proceed via the nucleophilic substitution reaction between the anomeric carbon, C1 and a nucleophile. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. 
The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	538
200475	cd11336	AmyAc_MTSase	Alpha amylase catalytic domain found in maltooligosyl trehalose synthase (MTSase). Maltooligosyl trehalose synthase (MTSase) domain. MTSase and maltooligosyl trehalose trehalohydrolase (MTHase) work together to produce trehalose. MTSase is responsible for converting the alpha-1,4-glucosidic linkage to an alpha,alpha-1,1-glucosidic linkage at the reducing end of the maltooligosaccharide through an intramolecular transglucosylation reaction, while MTHase hydrolyzes the penultimate alpha-1,4 linkage of the reducing end, resulting in the release of trehalose. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	660
200476	cd11337	AmyAc_CMD_like	Alpha amylase catalytic domain found in cyclomaltodextrinases and related proteins. Cyclomaltodextrinase (CDase; EC 3.2.1.54), neopullulanase (NPase; EC 3.2.1.135), and maltogenic amylase (MA; EC 3.2.1.133) catalyze the hydrolysis of alpha-(1,4) glycosidic linkages on a number of substrates including cyclomaltodextrins (CDs), pullulan, and starch. These enzymes hydrolyze CDs and starch to maltose and pullulan to panose by cleavage of alpha-1,4 glycosidic bonds whereas alpha-amylases essentially lack activity on CDs and pullulan. They also catalyze transglycosylation of oligosaccharides to the C3-, C4- or C6-hydroxyl groups of various acceptor sugar molecules. Since these proteins are nearly indistinguishable from each other, they are referred to as cyclomaltodextrinases (CMDs). This group of CMDs is mainly bacterial. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). 
The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	328
200477	cd11338	AmyAc_CMD	Alpha amylase catalytic domain found in cyclomaltodextrinases and related proteins. Cyclomaltodextrinase (CDase; EC 3.2.1.54), neopullulanase (NPase; EC 3.2.1.135), and maltogenic amylase (MA; EC 3.2.1.133) catalyze the hydrolysis of alpha-(1,4) glycosidic linkages on a number of substrates including cyclomaltodextrins (CDs), pullulan, and starch. These enzymes hydrolyze CDs and starch to maltose and pullulan to panose by cleavage of alpha-1,4 glycosidic bonds whereas alpha-amylases essentially lack activity on CDs and pullulan. They also catalyze transglycosylation of oligosaccharides to the C3-, C4- or C6-hydroxyl groups of various acceptor sugar molecules. Since these proteins are nearly indistinguishable from each other, they are referred to as cyclomaltodextrinases (CMDs). The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). 
The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	389
200478	cd11339	AmyAc_bac_CMD_like_2	Alpha amylase catalytic domain found in bacterial cyclomaltodextrinases and related proteins. Cyclomaltodextrinase (CDase; EC 3.2.1.54), neopullulanase (NPase; EC 3.2.1.135), and maltogenic amylase (MA; EC 3.2.1.133) catalyze the hydrolysis of alpha-(1,4) glycosidic linkages on a number of substrates including cyclomaltodextrins (CDs), pullulan, and starch. These enzymes hydrolyze CDs and starch to maltose and pullulan to panose by cleavage of alpha-1,4 glycosidic bonds whereas alpha-amylases essentially lack activity on CDs and pullulan. They also catalyze transglycosylation of oligosaccharides to the C3-, C4- or C6-hydroxyl groups of various acceptor sugar molecules. Since these proteins are nearly indistinguishable from each other, they are referred to as cyclomaltodextrinases (CMDs). This group of CMDs is bacterial. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). 
The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	344
200479	cd11340	AmyAc_bac_CMD_like_3	Alpha amylase catalytic domain found in bacterial cyclomaltodextrinases and related proteins. Cyclomaltodextrinase (CDase; EC 3.2.1.54), neopullulanase (NPase; EC 3.2.1.135), and maltogenic amylase (MA; EC 3.2.1.133) catalyze the hydrolysis of alpha-(1,4) glycosidic linkages on a number of substrates including cyclomaltodextrins (CDs), pullulan, and starch. These enzymes hydrolyze CDs and starch to maltose and pullulan to panose by cleavage of alpha-1,4 glycosidic bonds whereas alpha-amylases essentially lack activity on CDs and pullulan. They also catalyze transglycosylation of oligosaccharides to the C3-, C4- or C6-hydroxyl groups of various acceptor sugar molecules. Since these proteins are nearly indistinguishable from each other, they are referred to as cyclomaltodextrinases (CMDs). This group of CMDs is bacterial. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). 
The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	407
200480	cd11341	AmyAc_Pullulanase_LD-like	Alpha amylase catalytic domain found in Pullulanase (also called dextrinase; alpha-dextrin endo-1,6-alpha glucosidase), limit dextrinase, and related proteins. Pullulanase is an enzyme with action similar to that of isoamylase; it cleaves 1,6-alpha-glucosidic linkages in pullulan, amylopectin, and glycogen, and in alpha- and beta-amylase limit-dextrins of amylopectin and glycogen. Pullulanases are very similar to limit dextrinases, although they differ in their action on glycogen and the rate of hydrolysis of limit dextrins. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	406
200481	cd11343	AmyAc_Sucrose_phosphorylase-like	Alpha amylase catalytic domain found in sucrose phosphorylase (also called sucrose glucosyltransferase, disaccharide glucosyltransferase, and sucrose-phosphate alpha-D glucosyltransferase). Sucrose phosphorylase is a bacterial enzyme that catalyzes the phosphorolysis of sucrose to yield glucose-1-phosphate and fructose. These enzymes do not have the conserved calcium ion present in other alpha amylase family enzymes. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	445
200482	cd11344	AmyAc_GlgE_like	Alpha amylase catalytic domain found in GlgE-like proteins. GlgE is a (1,4)-a-D-glucan:phosphate a-D-maltosyltransferase, involved in a-glucan biosynthesis in bacteria. It is also an anti-tuberculosis drug target. GlgE isoform I from Streptomyces coelicolor has the same catalytic and very similar kinetic properties to GlgE from Mycobacterium tuberculosis. GlgE from Streptomyces coelicolor forms a homodimer with each subunit comprising five domains (A, B, C, N, and S) and 2 inserts. Domain A is a catalytic alpha-amylase-type domain that along with domain N, which has a beta-sandwich fold and forms the core of the dimer interface, binds cyclodextrins. Domain A, B, and the 2 inserts define a well conserved donor pocket that binds maltose. Cyclodextrins competitively inhibit the binding of maltooligosaccharides to the S. coelicolor enzyme, indicating that the hydrophobic patch overlaps with the acceptor binding site. This is not the case in M. tuberculosis GlgE because cyclodextrins do not inhibit this enzyme, despite acceptor length specificity being conserved. Domain C is hypothesized to help stabilize domain A and could be involved in substrate binding. Domain S is a helix bundle that is inserted within the N domain and it plays a role in the dimer interface and interacts directly with domain B. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	355
200483	cd11345	AmyAc_SLC3A2	Alpha amylase catalytic domain found in solute carrier family 3 member 2 proteins. 4F2 cell-surface antigen heavy chain (hc) is a protein that in humans is encoded by the SLC3A2 gene. 4F2hc is a multifunctional type II membrane glycoprotein involved in amino acid transport and cell fusion, adhesion, and transformation. It is related to bacterial alpha-glycosidases, but lacks alpha-glycosidase activity. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	326
200484	cd11346	AmyAc_plant_IsoA	Alpha amylase catalytic domain family found in plant isoamylases. Two types of debranching enzymes exist in plants: isoamylase-type (EC 3.2.1.68) and a pullulanase-type (EC 3.2.1.41, also known as limit-dextrinase). These efficiently hydrolyze alpha-(1,6)-linkages in amylopectin and pullulan. This group does not contain the conserved catalytic triad present in other alpha-amylase-like proteins. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	347
200485	cd11347	AmyAc_1	Alpha amylase catalytic domain found in an uncharacterized protein family. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	391
200486	cd11348	AmyAc_2	Alpha amylase catalytic domain found in an uncharacterized protein family. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The catalytic triad (DED) is not present here. The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	429
200487	cd11349	AmyAc_3	Alpha amylase catalytic domain found in an uncharacterized protein family. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	456
200488	cd11350	AmyAc_4	Alpha amylase catalytic domain found in an uncharacterized protein family. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	390
200489	cd11352	AmyAc_5	Alpha amylase catalytic domain found in an uncharacterized protein family. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	443
200490	cd11353	AmyAc_euk_bac_CMD_like	Alpha amylase catalytic domain found in eukaryotic and bacterial cyclomaltodextrinases and related proteins. Cyclomaltodextrinase (CDase; EC 3.2.1.54), neopullulanase (NPase; EC 3.2.1.135), and maltogenic amylase (MA; EC 3.2.1.133) catalyze the hydrolysis of alpha-(1,4) glycosidic linkages on a number of substrates including cyclomaltodextrins (CDs), pullulan, and starch. These enzymes hydrolyze CDs and starch to maltose and pullulan to panose by cleavage of alpha-1,4 glycosidic bonds whereas alpha-amylases essentially lack activity on CDs and pullulan. They also catalyze transglycosylation of oligosaccharides to the C3-, C4- or C6-hydroxyl groups of various acceptor sugar molecules. Since these proteins are nearly indistinguishable from each other, they are referred to as cyclomaltodextrinases (CMDs). This group of CMDs is mainly bacterial. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	366
200491	cd11354	AmyAc_bac_CMD_like	Alpha amylase catalytic domain found in bacterial cyclomaltodextrinases and related proteins. Cyclomaltodextrinase (CDase; EC 3.2.1.54), neopullulanase (NPase; EC 3.2.1.135), and maltogenic amylase (MA; EC 3.2.1.133) catalyze the hydrolysis of alpha-(1,4) glycosidic linkages on a number of substrates including cyclomaltodextrins (CDs), pullulan, and starch. These enzymes hydrolyze CDs and starch to maltose and pullulan to panose by cleavage of alpha-1,4 glycosidic bonds whereas alpha-amylases essentially lack activity on CDs and pullulan. They also catalyze transglycosylation of oligosaccharides to the C3-, C4- or C6-hydroxyl groups of various acceptor sugar molecules. Since these proteins are nearly indistinguishable from each other, they are referred to as cyclomaltodextrinases (CMDs). This group of CMDs is bacterial. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	357
200492	cd11355	AmyAc_Sucrose_phosphorylase	Alpha amylase catalytic domain found in sucrose phosphorylase (also called sucrose glucosyltransferase, disaccharide glucosyltransferase, and sucrose-phosphate alpha-D glucosyltransferase). Sucrose phosphorylase is a bacterial enzyme that catalyzes the phosphorolysis of sucrose to yield glucose-1-phosphate and fructose. These enzymes do not have the conserved calcium ion present in other alpha amylase family enzymes. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	433
200493	cd11356	AmyAc_Sucrose_phosphorylase-like_1	Alpha amylase catalytic domain found in sucrose phosphorylase-like proteins (also called sucrose glucosyltransferase, disaccharide glucosyltransferase, and sucrose-phosphate alpha-D glucosyltransferase). Sucrose phosphorylase is a bacterial enzyme that catalyzes the phosphorolysis of sucrose to yield glucose-1-phosphate and fructose. These enzymes do not have the conserved calcium ion present in other alpha amylase family enzymes. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	458
206766	cd11358	RNase_PH	RNase PH-like 3'-5' exoribonucleases. RNase PH-like 3'-5' exoribonucleases are enzymes that catalyze the 3' to 5' processing and decay of RNA substrates. Evolutionarily related members can be found in prokaryotes, archaea, and eukaryotes. Bacterial ribonuclease PH contains a single copy of this domain, and removes nucleotide residues following the -CCA terminus of tRNA. Polyribonucleotide nucleotidyltransferase (PNPase) contains two tandem copies of the domain and is involved in mRNA degradation in a 3'-5' direction. Archaeal exosomes contain two individually encoded RNase PH-like 3'-5' exoribonucleases and are required for 3' processing of the 5.8S rRNA. The eukaryotic exosome core is composed of six individually encoded RNase PH-like subunits, but it is not a phosphorolytic enzyme per se; it directly associates with Rrp44 and Rrp6, which are hydrolytic exoribonucleases related to bacterial RNase II/R and RNase D. All members of the RNase PH-like family form ring structures by oligomerization of six domains or subunits, except for a total of 3 subunits with tandem repeats in the case of PNPase, with a central channel through which the RNA substrate must pass to gain access to the phosphorolytic active sites.	218
200494	cd11359	AmyAc_SLC3A1	Alpha amylase catalytic domain found in Solute Carrier family 3 member 1 proteins. SLC3A1, also called Neutral and basic amino acid transport protein rBAT or NBAT, plays a role in amino acid and cystine absorption. Mutations in the gene encoding SLC3A1 cause cystinuria, an autosomal recessive disorder characterized by the failure of proximal tubules to reabsorb filtered cystine and dibasic amino acids. The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost the catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase.	456
206767	cd11362	RNase_PH_bact	Ribonuclease PH. Ribonuclease PH (RNase PH)-like 3'-5' exoribonucleases are enzymes that catalyze the 3' to 5' processing and decay of RNA substrates. Structurally all members of this family form hexameric rings (trimers of dimers). Bacterial RNase PH forms a homohexameric ring, and removes nucleotide residues following the -CCA terminus of tRNA.	227
206768	cd11363	RNase_PH_PNPase_1	Polyribonucleotide nucleotidyltransferase, repeat 1. Polyribonucleotide nucleotidyltransferase (PNPase) is a member of the RNase_PH family, named after the bacterial Ribonuclease PH, a 3'-5' exoribonuclease. Structurally, all members of this family form hexameric rings. In the case of PNPase the complex is a trimer, since each monomer contains two tandem copies of the domain. PNPase is involved in mRNA degradation in a 3'-5' direction and in quality control of ribosomal RNA precursors. It is part of the RNA degradosome complex and binds to the scaffolding domain of the endoribonuclease RNase E.	229
206769	cd11364	RNase_PH_PNPase_2	Polyribonucleotide nucleotidyltransferase, repeat 2. Polyribonucleotide nucleotidyltransferase (PNPase) is a member of the RNase_PH family, named after the bacterial Ribonuclease PH, a 3'-5' exoribonuclease. Structurally, all members of this family form hexameric rings. In the case of PNPase the complex is a trimer, since each monomer contains two tandem copies of the domain. PNPase is involved in mRNA degradation in a 3'-5' direction and in quality control of ribosomal RNA precursors, with the second repeat containing the active site. PNPase is part of the RNA degradosome complex and binds to the scaffolding domain of the endoribonuclease RNase E.	223
206770	cd11365	RNase_PH_archRRP42	RRP42 subunit of archaeal exosome. The RRP42 subunit of the archaeal exosome is a member of the RNase_PH family, named after the bacterial Ribonuclease PH, a 3'-5' exoribonuclease. Structurally all members of this family form hexameric rings (trimers of dimers). In archaea, the ring is formed by three Rrp41:Rrp42 dimers. The central chamber within the ring contains three phosphorolytic active sites located in an Rrp41 pocket at the interface between Rrp42 and Rrp41. The ring is capped by three copies of Rrp4 and/or Csl4 which contain putative RNA interaction domains. The archaeal exosome degrades single-stranded RNA (ssRNA) in the 3'-5' direction, but also can catalyze the reverse reaction of adding nucleoside diphosphates to the 3'-end of RNA which has been shown to lead to the formation of poly-A-rich tails on RNA. It is required for 3' processing of the 5.8S rRNA.	256
206771	cd11366	RNase_PH_archRRP41	RRP41 subunit of archaeal exosome. The RRP41 subunit of the archaeal exosome is a member of the RNase_PH family, named after the bacterial Ribonuclease PH, a 3'-5' exoribonuclease. Structurally all members of this family form hexameric rings (trimers of dimers). In archaea, the ring is formed by three Rrp41:Rrp42 dimers. The central chamber within the ring contains three phosphorolytic active sites located in an Rrp41 pocket at the interface between Rrp42 and Rrp41. The ring is capped by three copies of Rrp4 and/or Csl4 which contain putative RNA interaction domains. The archaeal exosome degrades single-stranded RNA (ssRNA) in the 3'-5' direction, but also can catalyze the reverse reaction of adding nucleoside diphosphates to the 3'-end of RNA which has been shown to lead to the formation of poly-A-rich tails on RNA.	214
206772	cd11367	RNase_PH_RRP42	RRP42 subunit of eukaryotic exosome. The RRP42 subunit of eukaryotic exosome is a member of the RNase_PH family, named after the bacterial Ribonuclease PH, a 3'-5' exoribonuclease. Structurally all members of this family form hexameric rings (trimers of Rrp41-Rrp45, Rrp46-Rrp43, and Mtr3-Rrp42 dimers). The eukaryotic exosome core is composed of six individually encoded RNase PH-like subunits and three additional proteins (Rrp4, Csl4 and Rrp40) that form a stable cap and contain RNA-binding domains. The RNase PH-like subunits are no longer phosphorolytic enzymes; the exosome directly associates with Rrp44 and Rrp6, hydrolytic exoribonucleases related to bacterial RNase II/R and RNase D. The exosome plays an important role in RNA turnover. It plays a crucial role in the maturation of stable RNA species such as rRNA, snRNA and snoRNA, quality control of mRNA, and the degradation of RNA processing by-products and non-coding transcripts.	272
206773	cd11368	RNase_PH_RRP45	RRP45 subunit of eukaryotic exosome. The RRP45 subunit of eukaryotic exosome is a member of the RNase_PH family, named after the bacterial Ribonuclease PH, a 3'-5' exoribonuclease. Structurally all members of this family form hexameric rings (trimers of Rrp41-Rrp45, Rrp46-Rrp43, and Mtr3-Rrp42 dimers). The eukaryotic exosome core is composed of six individually encoded RNase PH-like subunits and three additional proteins (Rrp4, Csl4 and Rrp40) that form a stable cap and contain RNA-binding domains. The RNase PH-like subunits are no longer phosphorolytic enzymes; the exosome directly associates with Rrp44 and Rrp6, hydrolytic exoribonucleases related to bacterial RNase II/R and RNase D. The exosome plays an important role in RNA turnover. It plays a crucial role in the maturation of stable RNA species such as rRNA, snRNA and snoRNA, quality control of mRNA, and the degradation of RNA processing by-products and non-coding transcripts.	259
206774	cd11369	RNase_PH_RRP43	RRP43 subunit of eukaryotic exosome. The RRP43 subunit of eukaryotic exosome is a member of the RNase_PH family, named after the bacterial Ribonuclease PH, a 3'-5' exoribonuclease. Structurally all members of this family form hexameric rings (trimers of Rrp41-Rrp45, Rrp46-Rrp43, and Mtr3-Rrp42 dimers). The eukaryotic exosome core is composed of six individually encoded RNase PH-like subunits and three additional proteins (Rrp4, Csl4 and Rrp40) that form a stable cap and contain RNA-binding domains. The RNase PH-like subunits are no longer phosphorolytic enzymes; the exosome directly associates with Rrp44 and Rrp6, hydrolytic exoribonucleases related to bacterial RNase II/R and RNase D. The exosome plays an important role in RNA turnover. It plays a crucial role in the maturation of stable RNA species such as rRNA, snRNA and snoRNA, quality control of mRNA, and the degradation of RNA processing by-products and non-coding transcripts.	261
206775	cd11370	RNase_PH_RRP41	RRP41 subunit of eukaryotic exosome. The RRP41 subunit of eukaryotic exosome is a member of the RNase_PH family, named after the bacterial Ribonuclease PH, a 3'-5' exoribonuclease. Structurally all members of this family form hexameric rings (trimers of Rrp41-Rrp45, Rrp46-Rrp43, and Mtr3-Rrp42 dimers). The eukaryotic exosome core is composed of six individually encoded RNase PH-like subunits and three additional proteins (Rrp4, Csl4 and Rrp40) that form a stable cap and contain RNA-binding domains. The RNase PH-like subunits are no longer phosphorolytic enzymes; the exosome directly associates with Rrp44 and Rrp6, hydrolytic exoribonucleases related to bacterial RNase II/R and RNase D. The exosome plays an important role in RNA turnover. It plays a crucial role in the maturation of stable RNA species such as rRNA, snRNA and snoRNA, quality control of mRNA, and the degradation of RNA processing by-products and non-coding transcripts.	226
206776	cd11371	RNase_PH_MTR3	MTR3 subunit of eukaryotic exosome. The MTR3 subunit of eukaryotic exosome is a member of the RNase_PH family, named after the bacterial Ribonuclease PH, a 3'-5' exoribonuclease. Structurally all members of this family form hexameric rings (trimers of Rrp41-Rrp45, Rrp46-Rrp43, and Mtr3-Rrp42 dimers). The eukaryotic exosome core is composed of six individually encoded RNase PH-like subunits and three additional proteins (Rrp4, Csl4 and Rrp40) that form a stable cap and contain RNA-binding domains. The RNase PH-like subunits are no longer phosphorolytic enzymes; the exosome directly associates with Rrp44 and Rrp6, hydrolytic exoribonucleases related to bacterial RNase II/R and RNase D. The exosome plays an important role in RNA turnover. It plays a crucial role in the maturation of stable RNA species such as rRNA, snRNA and snoRNA, quality control of mRNA, and the degradation of RNA processing by-products and non-coding transcripts.	210
206777	cd11372	RNase_PH_RRP46	RRP46 subunit of eukaryotic exosome. The RRP46 subunit of eukaryotic exosome is a member of the RNase_PH family, named after the bacterial Ribonuclease PH, a 3'-5' exoribonuclease. Structurally all members of this family form hexameric rings (trimers of Rrp41-Rrp45, Rrp46-Rrp43, and Mtr3-Rrp42 dimers). The eukaryotic exosome core is composed of six individually encoded RNase PH-like subunits and three additional proteins (Rrp4, Csl4 and Rrp40) that form a stable cap and contain RNA-binding domains. The RNase PH-like subunits are no longer phosphorolytic enzymes; the exosome directly associates with Rrp44 and Rrp6, hydrolytic exoribonucleases related to bacterial RNase II/R and RNase D. The exosome plays an important role in RNA turnover. It plays a crucial role in the maturation of stable RNA species such as rRNA, snRNA and snoRNA, quality control of mRNA, and the degradation of RNA processing by-products and non-coding transcripts.	199
200603	cd11374	CE4_u10	Putative catalytic domain of uncharacterized bacterial proteins from the carbohydrate esterase 4 superfamily. The family corresponds to a group of uncharacterized bacterial proteins with high sequence similarity to the catalytic domain of the six-stranded barrel rhizobial NodB-like proteins, which remove N-linked or O-linked acetyl groups of cell wall polysaccharides and belong to the larger carbohydrate esterase 4 (CE4) superfamily.	226
213029	cd11375	Peptidase_M54	Peptidase family M54, also called archaemetzincins or archaelysins. Peptidase M54 (archaemetzincin or archaelysin) is a zinc-dependent aminopeptidase that contains the consensus zinc-binding sequence HEXXHXXGXXH/D and a conserved Met residue at the active site, and is thus classified as a metzincin. Archaemetzincins, first identified in archaea, are also found in bacteria and eukaryotes, including two human members, archaemetzincin-1 and -2 (AMZ1 and AMZ2). AMZ1 is mainly found in the liver and heart while AMZ2 is primarily expressed in testis and heart; both have been reported to degrade synthetic substrates and peptides. The Peptidase M54 family contains an extended metzincin consensus sequence of HEXXHXXGX3CX4CXMX17CXXC such that a second zinc ion is bound to four cysteines, thus resembling a zinc finger. Phylogenetic analysis of this family reveals a complex evolutionary process involving a series of lateral gene transfer, gene loss and genetic duplication events.	173
271138	cd11376	Imelysin-like	imelysin also called Peptidase M75. This family includes insulin-cleaving membrane protease (imelysin, ICMP), imelysin-like protein (IPPA from Psychrobacter arcticus), iron-regulated protein A (IrpA) and iron-transporter EfeO-like alginate-binding protein (Algp7). Imelysin is a membrane protein with the active site outside the cell envelope. It is also called the peptidase M75 since the HxxE sequence motif characteristic of the M14 peptidase is completely conserved. However, the overall structure and the GxHxxE motif region differ from the known HxxE metallopeptidases, suggesting that imelysin-like proteins may not be peptidases. Imelysin's cleavage of the oxidized insulin B chain shows a preference for aromatic hydrophobic amino acids at P1'. Imelysin was first identified in Pseudomonas aeruginosa and has also been shown to cleave fibrinogen. The tertiary structure shows a fold consisting of two domains, each consisting of a bundle of four helices that are similar to each other, implying an ancient gene duplication and fusion event. In addition to an imelysin-like domain, Algp7 typically contains an N-terminal cupredoxin (CUP) domain and has a deep cleft between the 4-helix bundles sufficiently large to accommodate macromolecules such as alginate polysaccharide.	253
206778	cd11377	Pro-peptidase_S53	Activation domain of S53 peptidases. Members of this family are found in various subtilase propeptides, such as pro-kumamolysin and tripeptidyl peptidase I, and adopt a ferredoxin-like fold, with an alpha+beta sandwich. Cleavage of the domain results in activation of the peptidase.	139
211390	cd11378	DUF296	Domain of unknown function found in archaea, bacteria, and plants. This domain is found in proteins that contain AT-hook motifs, which suggests a role in DNA-binding for the proteins as a whole. Three conserved histidine residues appear to form a zinc-binding site, and the domain has been observed to form homotrimers. It co-occurs with a thioredoxin-like domain in uncharacterized cyanobacterial proteins.	113
211391	cd11379	DUF4425	Uncharacterized protein conserved in Bacteroidetes. This family appears to form homodimers; the 3D structure has been determined by both NMR and X-ray crystallography.	119
211392	cd11380	Ribosomal_S8e_like	Eukaryotic/archaeal ribosomal protein S8e and similar proteins. This family contains the eukaryotic/archaeal ribosomal protein S8, a component of the small ribosomal subunits, as well as the NSA2 gene product.	138
211393	cd11381	NSA2	pre-ribosomal protein NSA2 (Nop seven-associated 2). NSA2 appears to be a protein required for the maturation of 27S pre-rRNA in yeast; it has been characterized in mammalian cells as a nucleolar protein that might play a role in the regulation of the cell cycle and in cell proliferation.	257
211394	cd11382	Ribosomal_S8e	Eukaryotic/archaeal ribosomal protein S8e (RPS8). The eukaryotic/archaeal ribosomal protein S8 is a component of the small (40S in eukaryotes, 30S in archaea) ribosomal subunits and interacts tightly with 18S rRNA (16S rRNA in archaea, presumably).	122
206743	cd11383	YfjP	YfjP GTPase. The Era (E. coli Ras-like protein)-like YfjP subfamily includes several uncharacterized bacterial GTPases that are similar to Era. They generally show sequence conservation in the region between the Walker A and B motifs (G1 and G3 box motifs), to the exclusion of other GTPases. Era is characterized by a distinct derivative of the KH domain (the pseudo-KH domain) which is located C-terminal to the GTPase domain.	140
206744	cd11384	RagA_like	Rag GTPase, subfamily of Ras-related GTPases, includes Ras-related GTP-binding proteins A and B. RagA and RagB are closely related Rag GTPases (ras-related GTP-binding protein A and B) that constitute a unique subgroup of the Ras superfamily, and are functional homologs of Saccharomyces cerevisiae Gtr1. These domains function by forming heterodimers with RagC or RagD, and similarly, Gtr1 dimerizes with Gtr2, through the carboxy-terminal segments. They play an essential role in regulating amino acid-induced target of rapamycin complex 1 (TORC1) kinase signaling, exocytic cargo sorting at endosomes, and epigenetic control of gene expression. In response to amino acids, the Rag GTPases guide the TORC1 complex to activate the platform containing Rheb proto-oncogene by driving the relocalization of mTORC1 from discrete locations in the cytoplasm to a late endosomal and/or lysosomal compartment that is Rheb-enriched and contains Rab-7.	286
206745	cd11385	RagC_like	Rag GTPase, subfamily of Ras-related GTPases, includes Ras-related GTP-binding proteins C and D. RagC and RagD are closely related Rag GTPases (ras-related GTP-binding protein C and D) that constitute a unique subgroup of the Ras superfamily, and are functional homologs of Saccharomyces cerevisiae Gtr2. These domains form heterodimers with RagA or RagB, and similarly, Gtr2 dimerizes with Gtr1 in order to function. They play an essential role in regulating amino acid-induced target of rapamycin complex 1 (TORC1) kinase signaling, exocytic cargo sorting at endosomes, and epigenetic control of gene expression. In response to amino acids, the Rag GTPases guide the TORC1 complex to activate the platform containing Rheb proto-oncogene by driving the relocalization of mTORC1 from discrete locations in the cytoplasm to a late endosomal and/or lysosomal compartment that is Rheb-enriched and contains Rab-7.	175
206779	cd11386	MCP_signal	Methyl-accepting chemotaxis protein (MCP), signaling domain. Methyl-accepting chemotaxis proteins (MCPs or chemotaxis receptors) are an integral part of the transmembrane protein complex that controls bacterial chemotaxis, together with the histidine kinase CheA, the receptor-coupling protein CheW, receptor-modification enzymes, and localized phosphatases. MCPs contain a four-helix transmembrane region, an N-terminal periplasmic ligand binding domain, and a C-terminal HAMP domain followed by a cytoplasmic signaling domain. This C-terminal signaling domain dimerizes into a four-helix bundle and interacts with CheA through the adaptor protein CheW.	200
381393	cd11387	bHLHzip_USF_MITF	basic Helix-Loop-Helix-zipper (bHLHzip) domain found in USF/MITF family. The USF (upstream stimulatory factor)/MITF (microphthalmia-associated transcription factor) family includes two bHLHzip transcription factor subfamilies. USFs are ubiquitously expressed and key regulators of a wide number of gene regulation networks, including the stress and immune responses, cell cycle and proliferation, lipid and glucid metabolism. USFs recruit chromatin remodeling enzymes and interact with co-activators and the members of the transcription pre-initiation complex. USFs interact with high affinity to E-box regulatory elements. The MITF (also known as microphthalmia-TFE, or MiT) subfamily comprises four genes in mammals (MITF, TFE3, TFEB, and TFEC); each gene has different functions. MITF is involved in neural crest melanocytes development as well as the pigmented retinal epithelium. TFEB is required for vascularization of the mouse placenta. TFE3 is involved in B cell function. TFEC regulates gene expression in macrophages. The MITF subfamily proteins can form homodimers or heterodimers with each other but not with other bHLH or bHLHzip proteins.	58
381394	cd11388	bHLH_ScINO2_like	basic helix-loop-helix (bHLH) domain found in Saccharomyces cerevisiae protein INO2 and similar proteins. INO2 is a positive regulatory factor required for derepression of the co-regulated phospholipid biosynthetic enzymes in Saccharomyces cerevisiae. It is also involved in the expression of ITR1.	68
381395	cd11389	bHLH-O_HERP_like	basic helix-loop-helix-orange (bHLH-O) domain found in hairy and enhancer of split (HES)-related repressor protein (HERP)-like family. The HERP-like family includes bHLH-O transcriptional regulators that are related to the Drosophila hairy and Enhancer-of-split proteins. They contain a basic helix-loop-helix (bHLH) domain with an invariant glycine residue in its basic region, an orange domain in the central region and YXXW sequence motif at its C-terminal region. HERP proteins (HEY1, HEY2 and HEYL) act as downstream effectors of Notch signaling. They are involved in cardiovascular development and have roles in somitogenesis, myogenesis and gliogenesis. Hairy and enhancer of split-related protein HELT is a transcriptional repressor expressed in the developing central nervous system. It binds preferentially to the canonical E box sequence 5'-CACGCG-3' and regulates neuronal differentiation and/or identity. Differentially expressed in chondrocytes proteins, DEC1 and DEC2, are widely expressed in both embryonic and adult tissues and have been implicated in apoptosis, cell proliferation, and circadian rhythms, as well as malignancy in various cancers.  Drosophila melanogaster protein clockwork orange (Cwo) is also included in this family. It is involved in the regulation of Drosophila circadian rhythms. It functions as both an activator and a repressor of clock gene expression.	55
381396	cd11390	bHLH_TS	tissue specific basic helix-loop-helix (bHLH-TS) domain family. Tissue specific bHLH domain family includes transcription regulators whose expression are restricted to certain tissues. They are involved in cell-fate determination and process in neurogenesis, cardiogenesis, myogenesis, and hematopoiesis and include proteins from myogenic regulatory factor (MRF) family, twist-related protein (TWIST) family, scleraxis-like family, heart- and neural crest derivatives-expressed protein (HAND) family, helix-loop-helix protein (HEN) family, musculin-like family, germline alpha (FIGLA) family, T-cell acute lymphocytic leukemia protein/ lymphoblastic leukemia-derived sequence (TAL/LYL) family, ovary, uterus and testis protein (OUT) family, mesoderm posterior protein (Mesp) family, muscle, intestine and stomach expression 1 (MIST-1) family, protein atonal homologs (ATOH) family, neurogenin (NGN) family, neurogenic differentiation factor (NeuroD) family, achaete-scute complex-like (ASCL) family, Fer3-like protein (FERD3L)-like family, and Oligodendrocyte lineage genes (OLIG) family of transcription factors.	55
381397	cd11391	bHLH_PAS	basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain family. bHLH-PAS domain has been found in a large group of bHLH transcription regulators that are involved in gene expression responding to environmental change and controlling aspects of neural development, including proteins from aryl hydrocarbon receptor nuclear translocator (ARNT) family, hypoxia-inducible factor (HIF) family, aryl hydrocarbon receptor (AhR) family, neuronal PAS domain-containing protein (NPAS) family, Circadian locomotor output cycles protein kaput (CLOCK)-like family, and single-minded (SIM) family. bHLH-PAS transcriptional regulatory factors have a bHLH DNA-binding domain followed by two PAS domains and a C-terminal activation or repression domain.  bHLH-PAS family members can be divided into class I and class II based on their dimerization partner. bHLH-PAS class I factors include AhR, HIF and SIM. The best characterized bHLH-PAS Class II protein is the ubiquitous ARNT. Some members of bHLH-PAS family act as transcriptional coactivators (such as NCoA) that lack the ability to dimerize and bind DNA.	55
381398	cd11392	bHLH_ScPHO4_like	basic helix-loop-helix (bHLH) domain found in Saccharomyces cerevisiae phosphate system positive regulatory protein PHO4 and similar proteins. PHO4 is a transcriptional activator that regulates the expression of repressible phosphatase under phosphate starvation conditions in Saccharomyces cerevisiae. The PHO4 protein has four functional domains with the bHLH domain at its carboxyl-terminal region. It regulates transcription by binding to promoter of the genes as a homodimer.	80
381399	cd11393	bHLH_AtbHLH_like	basic helix-loop-helix (bHLH) domain found in Arabidopsis thaliana genes coding transcription factors and similar proteins. bHLH proteins are the second largest class of plant transcription factors that regulate transcription of genes that are involved in many essential physiological and developmental processes. bHLH proteins are transcriptional regulators that are found in organisms from yeast to humans. The Arabidopsis bHLH proteins that have been characterized so far have roles in regulation of fruit dehiscence, cell development (carpel, anther and epidermal), phytochrome signaling, flavonoid biosynthesis, hormone signaling and stress responses.	53
381400	cd11394	bHLHzip_SREBP	basic Helix-Loop-Helix-zipper (bHLHzip) domain found in sterol regulatory element-binding protein (SREBP) family. The SREBP family includes SREBP1 and SREBP2, which are bHLHzip transcriptional activators of genes encoding proteins essential for cholesterol biosynthesis/uptake and fatty acid biosynthesis. SREBP1 and SREBP2 are principally found in the liver and in adipocytes and made up of an N-terminal transcription factor portion (composed of an activation domain, a bHLHzip domain, and a nuclear localization signal), a hydrophobic region containing two membrane spanning regions, and a C-terminal regulatory segment. They recognize a symmetric sterol regulatory element (TCACNCCAC) instead of E-box.	73
381401	cd11395	bHLHzip_SREBP_like	basic Helix-Loop-Helix-zipper (bHLHzip) domain found in sterol regulatory element-binding protein (SREBP) family and similar proteins. The SREBP family includes SREBP1 and SREBP2, which are bHLHzip transcriptional activator of genes encoding proteins essential for cholesterol biosynthesis/uptake and fatty acid biosynthesis. SREBP1 and SREBP2 are principally found in the liver and in adipocytes and made up of an N-terminal transcription factor portion (composed of an activation domain, a bHLHzip domain, and a nuclear localization signal), a hydrophobic region containing two membrane spanning regions, and a C-terminal regulatory segment. They recognize a symmetric sterol regulatory element (TCACNCCAC) instead of E-box. The family also includes Saccharomyces cerevisiae transcription factor HMS1 (also termed high-copy MEP suppressor protein 1) and serine-rich protein TYE7. HMS1 is a putative bHLHzip transcription factor involved in exit from mitosis and pseudohyphal differentiation. TYE7, also termed basic-helix-loop-helix protein SGC1, is a putative bHLHzip transcription activator required for Ty1-mediated glycolytic gene expression. TYE7 N-terminal is extremely rich in serine residues. It binds DNA on E-box motifs, 5'-CANNTG-3'. TYE7 is not essential for growth.	87
381402	cd11396	bHLHzip_USF	basic Helix-Loop-Helix-zipper (bHLHzip) domain found in upstream stimulatory factors, USF1, USF2 and similar proteins. Upstream stimulatory factor 1 and 2 (USF-1 and USF-2) are members of bHLHzip transcription factor family. USFs are ubiquitously expressed and key regulators of a wide number of gene regulation networks, including the stress and immune responses, cell cycle and proliferation, lipid and glucid metabolism. USFs recruit chromatin remodeling enzymes and interact with co-activators and the members of the transcription pre-initiation complex. USFs interact with high affinity to E-box regulatory elements.	58
381403	cd11397	bHLHzip_MITF_like	basic Helix-Loop-Helix-zipper (bHLHzip) domain found in the microphthalmia-associated transcription factor (MITF) family. The MITF (also known as microphthalmia-TFE, or MiT) family is a small family whose members contain a basic helix loop helix domain associated with a leucine zipper (bHLHZip). The MITF family comprises four genes in mammals (MITF, TFE3, TFEB, and TFEC); each gene has different functions. MITF is involved in neural crest melanocytes development as well as the pigmented retinal epithelium. TFEB is required for vascularization of the mouse placenta. TFE3 is involved in B cell function. TFEC regulates gene expression in macrophages. The MITF family can form homodimers or heterodimers with each other but not with other bHLH or bHLHzip proteins.	69
381404	cd11398	bHLHzip_scCBP1	basic Helix-Loop-Helix-zipper (bHLHzip) domain found in Saccharomyces cerevisiae centromere-binding protein 1 (CBP-1) and similar proteins. CBP-1, also termed centromere promoter factor 1 (CPF1), or centromere-binding factor 1 (CBF1), is a bHLHzip protein that is required for chromosome stability and methionine prototrophy. It binds as a homodimer to the centromere DNA elements I (CDEI, GTCACATG) region of the centromere that is required for optimal centromere function.	89
381405	cd11399	bHLHzip_scHMS1_like	basic Helix-Loop-Helix-zipper (bHLHzip) domain found in Saccharomyces cerevisiae transcription factor HMS1 and similar proteins. HMS1, also termed high-copy MEP suppressor protein 1, is a putative bHLHzip transcription factor involved in exit from mitosis and pseudohyphal differentiation.	96
381406	cd11400	bHLHzip_Myc	basic Helix-Loop-Helix-zipper (bHLHzip) domain found in the Myc family. The Myc family is a member of the bHLHzip family of transcription factors that play important roles in the control of normal cell proliferation, growth, survival and differentiation. All Myc isoforms contain two independently functioning polypeptide chain regions: N-terminal transactivating residues and a C-terminal bHLHzip segment. The bHLHzip family of bHLH transcription factors are characterized by a highly conserved N-terminal basic region that may bind DNA at a consensus hexanucleotide sequence known as the E-box (CANNTG) followed by HLH and leucine zipper motifs that may interact with other proteins to form homo- and heterodimers. Myc heterodimerizes with Max enabling specific binding to E-box DNA sequences in the promoters of target genes. The Myc proto-oncoprotein family includes at least five different functional members: c-, N-, L-, S- and B-Myc (which is lacking the bHLH domain).	80
381407	cd11401	bHLHzip_Mad	basic Helix-Loop-Helix-zipper (bHLHzip) domain found in the Mad family. Members of the Mad family (Mad1, Mxi, Mad3, and Mad4) bear the bHLHzip domain (also known as basic-helix-loop-helix-leucine-zipper or bHLH-LZ domain), which mediates heterodimerization to Max and the sequence-specific DNA binding ability to E-box DNA. Mad family proteins can repress transcription at the E-box through their interaction with co-repressors. Mad family proteins antagonize Myc function in transactivation and transformation and they are growth/tumor suppressors. The developmental phenotypes of the individual Mad family member knockout mice are relatively mild; all these mice have been shown to be viable and normal.	76
381408	cd11402	bHLHzip_Mnt	basic Helix-Loop-Helix-zipper (bHLHzip) domain found in Max-binding protein Mnt and similar proteins. Mnt, also termed Class D basic helix-loop-helix protein 3 (bHLHd3), or Myc antagonist MNT, or protein ROX, is a bHLHZip transcriptional repressor that binds DNA as a heterodimer with MAX. It binds to the canonical E box sequence 5'-CACGTG-3' and, with higher affinity, to 5'-CACGCG-3'. Mnt has an important role as an antagonist and regulator of Myc activities and it is a potential tumor suppressor. Mnt is ubiquitously expressed. Mnt-deficient mice have been shown to exhibit early postnatal lethality.	77
381409	cd11403	bHLH_scINO4_like	basic Helix-Loop-Helix (bHLH) domain found in Saccharomyces cerevisiae INO4 and similar proteins. INO4 is a bHLH transcriptional activator of phospholipid synthetic genes (such as INO1, CHO1/PSS, CHO2/PEM1, OPI3/PEM2, etc.). It is required for de-repression of phospholipid biosynthetic gene expression in response to inositol deprivation in yeast. INO4 dimerizes with INO2 and binds to an UAS DNA element to control expression of the genes whose expression is inositol-responsive.	71
381410	cd11404	bHLHzip_Mlx_like	basic Helix-Loop-Helix-zipper (bHLHzip) domain found in Max-like protein X (Mlx) family. Mlx, also termed Class D basic helix-loop-helix protein 13 (bHLHd13), or Max-like bHLHZip protein, or protein BigMax, or transcription factor-like protein 4, is a Max-like bHLHZip transcription regulator that interacts with the Max network of transcription factors. It forms a sequence-specific DNA-binding protein complex with some members of the Mad family (Mad1 and Mad4) and the Mondo family, but not the Myc family, and binds E-box DNA to control transcription. The family also includes Saccharomyces cerevisiae INO4, which is a bHLH transcriptional activator of phospholipid synthetic genes (such as INO1, CHO1/PSS, CHO2/PEM1, OPI3/PEM2, etc.). It is required for de-repression of phospholipid biosynthetic gene expression in response to inositol deprivation in yeast.	70
381411	cd11405	bHLHzip_MLXIP_like	basic Helix-Loop-Helix-zipper (bHLHzip) domain found in MLX-interacting protein (MLXIP), MLX-interacting protein-like (MLXIPL) and similar proteins. The family includes MLXIP and MLXIPL. MLXIP, also termed Class E basic helix-loop-helix protein 36 (bHLHe36), or transcriptional activator MondoA, is a bHLHZip transcriptional activator that binds DNA as a heterodimer with Mlx. It binds to the canonical E box sequence 5'-CACGTG-3' and plays a role in transcriptional activation of glycolytic target genes. MLXIP is most highly expressed in skeletal muscle and functions as an indirect glucose sensor, by sensing glucose 6-phosphate and shuttling between the nucleus and the cytoplasm. MLXIPL, also termed carbohydrate-responsive element-binding protein (ChREBP), or Class D basic helix-loop-helix protein 14 (bHLHd14), or MLX interactor, or WS basic-helix-loop-helix leucine zipper protein (WS-bHLH), or Williams-Beuren syndrome chromosomal region 14 protein (WBSCR14), is a bHLHZip transcriptional factor integral to the regulation of glycolysis and lipogenesis in the liver. It forms heterodimers with the bHLHZip protein Mlx to bind the DNA sequence 5'-CACGTG-3'.	74
381412	cd11406	bHLHzip_Max	basic Helix-Loop-Helix-zipper (bHLHzip) domain found in protein Max and similar proteins. Max, also termed Class D basic helix-loop-helix protein 4 (bHLHd4), or Myc-associated factor X, is a bHLHZip transcription regulator that forms a sequence-specific DNA-binding protein complex with MYC or MAD which recognizes the core sequence 5'-CAC[GA]TG-3'. The MYC:MAX complex is a transcriptional activator, whereas the MAD:MAX complex is a transcriptional repressor. Max homodimers bind DNA but are transcriptionally inactive. Targeted deletion of max results in early embryonic lethality in mice.	69
381413	cd11407	bHLH-O_HERP	basic helix-loop-helix-orange (bHLH-O) domain found in hairy and enhancer of split (HES)-related repressor protein (HERP) family. HERP (also called Hey/Hesr/HRT/CHF/gridlock) proteins correspond to a family of bHLH-O transcriptional repressors that are related to the Drosophila hairy and Enhancer-of-split proteins and act as downstream effectors of Notch signaling. They contain a basic helix-loop-helix (bHLH) domain with an invariant glycine residue in its basic region, an orange domain in the central region and YXXW sequence motif at its C-terminal region. HERP proteins are involved in cardiovascular development and have roles in somitogenesis, myogenesis and gliogenesis.	59
381414	cd11408	bHLH-O_HELT	basic helix-loop-helix-orange (bHLH-O) domain found in hairy and enhancer of split-related protein HELT and similar proteins. HELT, also termed HES/HEY-like transcription factor, is a bHLH-O transcriptional repressor expressed in the developing central nervous system. It binds preferentially to the canonical E box sequence 5'-CACGCG-3' and regulates neuronal differentiation and/or identity. HELT can homodimerize, or heterodimerize with other bHLH-O proteins such as HES-5 or HEY-2, and bind to the E box to repress gene transcription.	56
381415	cd11409	bHLH-O_DEC	basic helix-loop-helix-orange (bHLH-O) domain found in differentially expressed in chondrocytes protein (DEC) family. The DEC family includes two bHLH-O transcriptional repressors, DEC1 and DEC2, which are widely expressed in both embryonic and adult tissues and have been implicated in apoptosis, cell proliferation, and circadian rhythms, as well as malignancy in various cancers. They mediate the circadian rhythm by negatively regulating the activity of the clock genes and clock-controlled genes. They are induced by CLOCK:BMAL1 heterodimer via the CACGTG E-box in the promoter.	75
381416	cd11410	bHLH_O_HES	basic helix-loop-helix-orange (bHLH-O) domain found in hairy and enhancer of split (HES) family. The HES family includes bHLH-O transcriptional regulators that are related to the Drosophila hairy and Enhancer-of-split (HES) proteins. They contain a basic helix-loop-helix (bHLH) domain with an invariant proline residue in its basic region, an orange domain in the central region and a conserved tetrapeptide motif, WRPW, at its C-terminal region. HES family proteins form heterodimers or homodimers via their HLH domain and bind DNA to repress gene transcription that play an essential role in development of both compartment and boundary cells of the central nervous system.	54
381417	cd11411	bHLH_TS_MRF	basic helix-loop-helix (bHLH) domain found in myogenic regulatory factor (MRF) family. MRFs are a family of muscle-specific bHLH transcription proteins (MyoD, Myf5, Mrf4 and MyoG) that play an essential role in regulating skeletal muscle development and growth. MRFs are capable of binding to E-box motifs as a heterodimer with E-proteins to regulate transcription.	56
381418	cd11412	bHLH_TS_TWIST1	basic helix-loop-helix (bHLH) domain found in twist-related protein 1 (TWIST1) and similar proteins. TWIST1, also termed Class A basic helix-loop-helix protein 38 (bHLHa38), or H-twist, is a bHLH transcriptional regulator that inhibits myogenesis by sequestering E proteins, inhibiting trans-activation by MEF2, and inhibiting DNA-binding by MYOD1 through physical interaction. It also represses expression of proinflammatory cytokines such as TNFA and IL1B. In addition, TWIST1 is involved in cancer development and progression.	77
381419	cd11413	bHLH_TS_TAL_LYL	basic helix-loop-helix (bHLH) domain found in T-cell acute lymphocytic leukemia protein/ lymphoblastic leukemia-derived sequence (TAL/LYL) family. The TAL/LYL family includes a group of bHLH transcription factors (TAL1, TAL2 and LYL1) implicated in T cell acute leukaemia. They act as mediators of T cell leukaemogenesis. TAL-1, also termed Class A basic helix-loop-helix protein 17 (bHLHa17), or stem cell protein (SCL), or T-cell leukemia/lymphoma protein 5, is a hematopoietic-specific bHLH transcription factor that functions in embryonic and adult hematopoiesis in vertebrates. It is also required for embryonic vascular remodeling.  It acts as a regulator of erythroid differentiation and binds to regulatory regions of a large cohort of erythroid genes as part of a complex with GATA-1, LMO2 and Ldb1. TAL-2, also termed Class A basic helix-loop-helix protein 19 (bHLHa19), is a bHLH transcription factor essential for the normal brain development. Lyl-1, also termed Class A basic helix-loop-helix protein 18 (bHLHa18), or lymphoblastic leukemia-derived sequence 1, is a proto-oncogenic bHLH transcription factor that plays an important role in hematopoietic stem cell function and is required for the late stages of postnatal angiogenesis to limit the formation of new blood vessels, notably by regulating the activity of the small GTPase Rap1. LYL-1 deficiency induces a stress erythropoiesis.	60
381420	cd11414	bHLH_TS_HEN	basic helix-loop-helix (bHLH) domain found in helix-loop-helix protein (HEN) family. The HEN family includes two neuron-specific bHLH transcription factors, HEN-1 (also known as Nhlh1 or bHLHa35 or NSCL-1) and HEN-2 (also known as Nhlh2 or bHLHa34 or NSCL-2). They may serve as DNA-binding protein that is involved in the control of cell-type determination, possibly within the developing nervous system.	57
381421	cd11415	bHLH_TS_FERD3L_NATO3	basic helix-loop-helix (bHLH) domain found in Fer3-like protein (FERD3L) and similar proteins. FERD3L, also termed basic helix-loop-helix protein N-twist, or Class A basic helix-loop-helix protein 31 (bHLHa31), or nephew of atonal 3 (NATO3), or Neuronal twist (NTWIST), is a bHLH transcription factor expressed in the developing central nervous system (CNS). It regulates floor plate (FP) cell development. FP is a critical organizing center located at the ventral-most midline of the neural tube. FERD3L binds to the E-box and functions as an inhibitor of transcription.	64
381422	cd11416	bHLH_TS_ceHLH13_like	basic helix-loop-helix (bHLH) domain found in Caenorhabditis elegans Helix-loop-helix protein 13 (HLH13) and similar proteins. Caenorhabditis elegans HLH13, also termed Fer3-like protein, or nephew of atonal 3, is a bHLH transcription factor that plays a role in the negative regulation of exit from L1 arrest and dauer diapause dependent on IIS signaling (insulin and insulin-like growth factor (IGF) signaling).	63
381423	cd11417	bHLH_TS_PTF1A	basic helix-loop-helix (bHLH) domain found in pancreas transcription factor 1 subunit alpha (PTF1A) and similar proteins. PTF1A, also termed Class A basic helix-loop-helix protein 29 (bHLHa29), or pancreas-specific transcription factor 1a, or bHLH transcription factor p48, or p48 DNA-binding subunit of transcription factor PTF1 (PTF1-p48), is a bHLH transcription factor implicated in the cell fate determination in various organs. It binds to the E-box consensus sequence 5'-CANNTG-3' and plays a role in early and late pancreas development and differentiation.	56
381424	cd11418	bHLH_TS_ASCL	basic helix-loop-helix (bHLH) domain found in achaete-scute complex-like (ASCL) family. The achaete-scute complex-like (ASCL, also known as achaete-scute complex homolog or ASH) family of bHLH transcription factors, ASCL1-5, have been implicated in cell fate specification and differentiation. They are critical for proper development of the nervous system. The deregulation of ASCL plays a key role in psychiatric and neurological disorders. ASCL-1, also termed Class A basic helix-loop-helix protein 46 (bHLHa46), or achaete-scute homolog 1 (ASH-1), or mammalian achaete-scute homolog 1 (Mash1), is a neural-specific bHLH transcription factor that is expressed in subsets of neural progenitors in both the central and peripheral nervous system. It plays a key role in neuronal differentiation and specification in the nervous system. ASCL-2, also termed achaete-scute homolog 2 (ASH-2), or Class A basic helix-loop-helix protein 45 (bHLHa45), or mammalian achaete-scute homolog 2 (Mash2), is a bHLH transcription factor that is involved in Schwann cell differentiation and control of proliferation in adult peripheral nerves. ASCL-3, also termed Class A basic helix-loop-helix protein 42 (bHLHa42), or bHLH transcriptional regulator Sgn-1, or achaete-scute homolog 3 (ASH-3), is a bHLH transcription factor specifically localized in the duct cells of the salivary glands. It may act as a transcriptional repressor that inhibits myogenesis. The family also includes Drosophila melanogaster achaete-scute complex (AS-C) proteins, which consist of lethal of scute (also known as achaete-scute complex protein T3 or AST3), scute (also known as achaete-scute complex protein T4 or AST4), achaete (also known as achaete-scute complex protein T5 or AST5), and asense (also known as achaete-scute complex protein T8 or AST8). They are involved in the determination of the neuronal precursors in the peripheral nervous system and the central nervous system, as well as in sex determination and dosage compensation.	56
381425	cd11419	bHLHzip_TFAP4	basic Helix-Loop-Helix-zipper (bHLHzip) domain found in transcription factor AP-4 (TFAP4) and similar proteins. TFAP4, also termed activating enhancer-binding protein 4, or Class C basic helix-loop-helix protein 41 (bHLHc41), is a bHLHzip transcription factor that activates both viral and cellular genes involved in the regulation of cellular proliferation, stemness, and epithelial-mesenchymal transition by binding to the symmetrical DNA sequence 5'-CAGCTG-3'.	61
381426	cd11420	bHLH_E-protein	basic helix-loop-helix (bHLH) domain found in E proteins family. The E proteins family corresponds to class I bHLH proteins, which are widely expressed within the immune system. Members in this family include E2A (also referred to as TCF-3), E47, TCF-12 (also referred to as HEB), and TCF-4 (also referred to as E2-2) in vertebrates, as well as the E protein ortholog, Daughterless (Da), from Drosophila melanogaster. E-proteins are expressed broadly and in certain complexes they are restricted to specific cell types. E-proteins homodimerize and heterodimerize with the tissue specific bHLH factors to bind DNA and regulate transcription and differentiation of cells during development. The activity of the E-proteins is regulated by two main mechanisms: first, the relative concentrations of E-proteins, tissue specific bHLH factors, and the Id proteins, and second, covalent modification.	47
381427	cd11421	bHLH_TS_ATOH8	basic helix-loop-helix (bHLH) domain found in protein atonal homolog 8 (ATOH8) and similar proteins. ATOH8, also termed Class A basic helix-loop-helix protein 21 (bHLHa21), or helix-loop-helix protein hATH-6 (hATH6), is a bHLH shear-stress-responsive transcription factor expressed in activated satellite cells and proliferating myoblasts of human skeletal muscle tissue. It regulates endothelial cell proliferation, migration and tube-like structure formation. ATOH8 binds a palindromic (canonical) core consensus DNA sequence 5'-CANNTG-3' known as an E-box element, possibly as a heterodimer with other bHLH proteins.	68
381428	cd11422	bHLH_TS_FIGLA	basic helix-loop-helix (bHLH) domain found in factor in the germline alpha (FIGLA) and similar proteins. FIGLA, also termed FIGalpha, or Class C basic helix-loop-helix protein 8 (bHLHc8), or folliculogenesis-specific basic helix-loop-helix protein, or transcription factor FIGa, is a germ-cell-specific bHLH transcription factor expressed abundantly in female and less so in male germ cells. It is essential for primordial follicle formation and expression of many genes required for folliculogenesis, fertilization and early embryonic survival. FIGLA knockout mice cannot form primordial follicles and lose oocytes rapidly after birth, whereas male gonads are unaffected.	56
381429	cd11423	bHLH_TS_musculin_like	basic helix-loop-helix (bHLH) domain found in musculin, transcription factor 21 (TCF-21) and similar proteins. The family includes two bHLH transcription factors, musculin and transcription factor 21 (TCF-21). Musculin, also termed activated B-cell factor 1 (ABF-1), or Class A basic helix-loop-helix protein 22 (bHLHa22), is a bHLH transcription factor expressed in activated B lymphocytes. It acts as a transcription repressor capable of inhibiting the transactivation capability of TCF3/E47. Musculin may play a role in regulating antigen-dependent B-cell differentiation. The mouse homolog, musculin, is suggested to be a repressor of myogenesis that is expressed in developing muscle and in the spleen. TCF-21, also termed capsulin, or Class A basic helix-loop-helix protein 23 (bHLHa23), or epicardin, or podocyte-expressed 1 (Pod-1), is a bHLH transcription factor expressed specifically in mesodermally-derived cells that surround the epithelium of the developing gastrointestinal, genitourinary and respiratory systems during mouse embryogenesis. It may play a role in the specification or differentiation of one or more subsets of epicardial cell types.	56
381430	cd11424	bHLH_TS_OUT	basic helix-loop-helix (bHLH) domain found in ovary, uterus and testis protein (OUT) family. The OUT family includes transcription factor 23 (TCF-23), transcription factor 24 (TCF-24) and similar proteins. TCF-23, also termed Class A basic helix-loop-helix protein 24 (bHLHa24), is a bHLH transcription factor that is essential for progesterone-dependent decidualization. The mouse homolog is also called ovary, uterus and testis protein (OUT), which is expressed predominantly in the reproductive organs such as the uterus, ovary and testis. It shows an Id-like inhibitory activity and functions as a negative regulator of bHLH factors through the formation of a functionally inactive heterodimeric complex. OUT inhibits the formation of TCF3 and MYOD1 homodimers and heterodimers, but lacks DNA binding activity. OUT is involved in the regulation or modulation of smooth muscle contraction of the uterus during pregnancy and particularly around the time of delivery. It also plays a role in the inhibition of myogenesis. Unlike typical bHLH factors, OUT proteins do not bind E-box (CANNTG) or N-box DNA sequences and inhibit DNA binding of homo- and heterodimers consisting of E12 and MyoD in gel mobility shift assays. TCF-24 is an uncharacterized bHLH transcription factor that shows high sequence similarity with TCF-23.	55
381431	cd11425	bHLH_TS_Mesp_like	basic helix-loop-helix (bHLH) domain found in mesoderm posterior protein (Mesp) family. Mesp, a bHLH tissue specific transcription factor, acts as a key regulator of the cardiovascular transcriptional network by inducing directly and/or indirectly the expression of the majority of key cardiovascular transcription factors. The Mesp family includes two bHLH transcription factors, Mesp1 and Mesp2. Mesp1, also termed Class C basic helix-loop-helix protein 5 (bHLHc5), promotes cardiovascular differentiation during embryonic development and embryonic stem cell differentiation. Mesp2, also termed Class C basic helix-loop-helix protein 6 (bHLHc6), plays an important role in somitogenesis. The family also includes mesogenin-1 (Msgn1) and similar proteins. Msgn1, also termed paraxial mesoderm-specific mesogenin1, or pMesogenin1 (pMsgn1), is a bHLH transcription factor required for maturation and segmentation of paraxial mesoderm. It may regulate the expression of T-box transcription factors essential for mesoderm formation and differentiation.	59
381432	cd11426	bHLH_TS_MIST1_like	basic helix-loop-helix (bHLH) domain found in muscle, intestine and stomach expression 1 (MIST-1) family. MIST-1, also termed Class A basic helix-loop-helix protein 15 (bHLHa15), or Class B basic helix-loop-helix protein 8 (bHLHb8), is a bHLH transcription factor expressed in pancreatic acinar cells and other serous exocrine cells. It is essential for cytoskeletal organization and secretory activity. It also functions as a potent endoplasmic reticulum (ER) stress-inducible transcriptional regulator. MIST-1 is capable of binding to E-box (CANNTG) motifs as a homodimer or a heterodimer with E-proteins (E12 and E47) to regulate transcription. The family also includes Drosophila melanogaster protein dimmed and similar proteins. Dimmed, also termed DIMM, is a bHLH transcription factor that regulates neurosecretory (NS) cell function and neuroendocrine cell fate in Drosophila.	56
381433	cd11427	bHLH_TS_NeuroD	basic helix-loop-helix (bHLH) domain found in neurogenic differentiation factor (NeuroD) family. The NeuroD family includes NeuroD1, NeuroD2, NeuroD4 and NeuroD6. NeuroD1, also termed Class A basic helix-loop-helix protein 3 (bHLHa3), is a neuronal bHLH transcription factor involved in the development and maintenance of the endocrine pancreas and neuronal elements. It acts as an essential regulator of glutamatergic neuronal differentiation. Loss of NeuroD1 causes ataxia, cerebellar hypoplasia, sensorineural deafness, and severe retinal dystrophy in mice. NeuroD2, also termed Class A basic helix-loop-helix protein 1 (bHLHa1), or NeuroD-related factor (NDRF), is a neuronal calcium-dependent bHLH transcription factor that induces neuronal differentiation and promotes neuronal survival. It plays a central role in thalamocortical synaptic maturation. NeuroD2 mediates calcium-dependent transcription activation by binding to E box-containing promoters. NeuroD4, also termed Class A basic helix-loop-helix protein 4 (bHLHa4), or protein atonal homolog 3 (ATH-3), or Atoh3, or Math-3, is a bHLH transcriptional activator that mediates neuronal differentiation. NeuroD6, also termed Class A basic helix-loop-helix protein 2 (bHLHa2), or protein atonal homolog 2 (ATH-2), or Atoh2, or Math2, or Nex1, is a neurogenic bHLH transcription factor involved in neuronal development, differentiation, and survival in Alzheimer's disease (AD) brains. It plays an integrative role in coordinating increase in mitochondrial mass with cytoskeletal remodeling, suggesting that it may act as a co-regulator of neuronal differentiation and energy metabolism.	55
381434	cd11428	bHLH_TS_NGN	basic helix-loop-helix (bHLH) domain found in neurogenin (NGN) family. The NGN family includes three neural-specific bHLH transcription factors, NGN1-3, which may function as neuroblast selection genes during the development of several neuronal lineages. NGN-1, also termed Class A basic helix-loop-helix protein 6 (bHLHa6), or neurogenic basic-helix-loop-helix protein, or neurogenic differentiation factor 3 (NeuroD3), is a neural-specific bHLH transcription factor involved in the initiation of neuronal differentiation. NGN-2, also termed Class A basic helix-loop-helix protein 8 (bHLHa8), or protein atonal homolog 4 (ATOH4), is a neural-specific bHLH transcription factor required for sensory neurogenesis. NGN-3, also termed Class A basic helix-loop-helix protein 7 (bHLHa7), or protein atonal homolog 5 (ATOH5), is a neural-specific bHLH transcription factor expressed in the developing central nervous system and the embryonic pancreas. It is involved in neurogenesis and plays an important role in spermatogenesis.	57
381435	cd11429	bHLH_TS_OLIG	basic helix-loop-helix (bHLH) domain found in Oligodendrocyte lineage genes (OLIG) family of transcription factors. The OLIG family includes three bHLH transcription factors, Oligo1-3, which are expressed in both the developing and mature central nervous system. Oligo1 and Oligo2 are expressed in a nervous tissue-specific manner, but Oligo3 is found mainly in non-neural tissues. Oligo (also known as Olig) proteins have key roles in the specification of motor neurons, dorsal interneurons, and oligodendrocytes. Oligo1, also termed Class B basic helix-loop-helix protein 6 (bHLHb6), or Class E basic helix-loop-helix protein 21 (bHLHe21), promotes formation and maturation of oligodendrocytes, especially within the brain. Oligo2, also termed Class B basic helix-loop-helix protein 1 (bHLHb1), or Class E basic helix-loop-helix protein 19 (bHLHe19), or protein kinase C-binding protein 2, or protein kinase C-binding protein RACK17, is required for oligodendrocyte and motor neuron specification in the spinal cord, as well as for the development of somatic motor neurons in the hindbrain. It cooperates with OLIG1 to establish the MN progenitors (pMN) domain of the embryonic neural tube. Oligo3, also termed Class B basic helix-loop-helix protein 7 (bHLHb7), or Class E basic helix-loop-helix protein 20 (bHLHe20), is expressed in the ventricular zone of the dorsal alar plate of the hindbrain and involved in regulating the development of dorsal and ventral spinal cord. It may determine the distinct specification program of class A neurons in the dorsal part of the spinal cord and suppress specification of class B neurons. This family also includes two OLIG-related bHLH transcription factors, bHLHe22 and bHLHe23. bHLHe22, also termed Class B basic helix-loop-helix protein 5 (bHLHb5), or trinucleotide repeat-containing gene 20 protein, is a neural-specific transcriptional repressor that is expressed in both excitatory (unipolar brush cells) and inhibitory neurons (cartwheel cells) of the dorsal cochlear nucleus (DCN) during development. It is important for the proper development and/or survival of a number of neural cell types. bHLHe23, also termed Class B basic helix-loop-helix protein 4 (bHLHb4), is expressed in rod bipolar cells and is required for rod bipolar cell maturation. bHLHe23 has roles in spinal interneuron differentiation by mechanisms linked to the Notch signaling pathway. It modulates the expression of genes required for the differentiation and/or maintenance of pancreatic and neuronal cell types.	61
381436	cd11430	bHLH_TS_ATOH1_like	basic helix-loop-helix (bHLH) domain found in protein atonal homologs ATOH1, ATOH7 and similar proteins. The family includes ATOH1 and ATOH7. ATOH1, also termed Class A basic helix-loop-helix protein 14 (bHLHa14), or helix-loop-helix protein hATH-1 (hATH1), or Math1, or Cath1, is a proneural bHLH transcription factor that is essential for inner ear hair cell differentiation. It dimerizes with E47 and activates E-box (CANNTG) dependent transcription. ATOH1 is a mammalian homolog of the Drosophila melanogaster gene atonal and mouse atonal homolog 1 (Math1). ATOH7, also termed Class A basic helix-loop-helix protein 13 (bHLHa13), or helix-loop-helix protein hATH-5 (hATH5), or Math5, is a bHLH transcription factor involved in the differentiation of retinal ganglion cells. The family also includes protein Amos (also termed absent MD neurons and olfactory sensilla protein, or reduced olfactory organs protein, or rough eye protein). It is a bHLH transcription factor that promotes multiple dendritic neuron formation in the Drosophila peripheral nervous system.	56
381437	cd11431	bHLH_TS_taxi_Dei	basic helix-loop-helix (bHLH) domain found in Drosophila melanogaster protein taxi and similar proteins. Protein taxi, also termed protein delilah (Dei), is a bHLH transcription factor that is involved in regulation of cell adhesion and attachment that is expressed in specialized cells that provide anchoring sites to either muscles (tendon cells), or proprioceptors (chordotonal attachment cells) during embryonic development. It probably plays an important role in the differentiation of epidermal cells into the tendon cells that form the attachment sites for all muscles.	59
381438	cd11432	bHLH-PAS_NPAS1_3_like	basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in neuronal PAS domain-containing proteins, NPAS1, NPAS3 and similar proteins. The family includes neuronal PAS domain proteins NPAS1 and NPAS3, both of which are master regulators of neuropsychiatric function. NPAS1, also termed neuronal PAS1, or Basic-helix-loop-helix-PAS protein MOP5, or Class E basic helix-loop-helix protein 11 (bHLHe11), or member of PAS protein 5, or PAS domain-containing protein 5 (PASD5), is a bHLH-PAS transcriptional repressor expressed in the central nervous system and involved in neuronal differentiation. It is active during late embryogenesis and postnatal development. NPAS3, also termed neuronal PAS3, or Basic-helix-loop-helix-PAS protein MOP6, or Class E basic helix-loop-helix protein 12 (bHLHe12), or member of PAS protein 6, or PAS domain-containing protein 6 (PASD6), is a bHLH-PAS brain-enriched transcription factor that is involved in central nervous system development and neurogenesis. It is a replicated genetic risk factor for psychiatric disorders. Human chromosomal rearrangements that affect NPAS3 normal expression are associated with schizophrenia and mental retardation.	55
381439	cd11433	bHLH-PAS_HIF	basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in hypoxia-inducible factor (HIF) family. The HIF family contains bHLH-PAS transcription regulators involved in oxygen homeostasis, including HIF1a, HIF2a, and HIF3a. They have been implicated in development, postnatal physiology as well as disease pathogenesis. HIF1a, also termed HIF-1-alpha, or HIF1-alpha, or ARNT-interacting protein, or Basic-helix-loop-helix-PAS protein MOP1, or Class E basic helix-loop-helix protein 78 (bHLHe78), or Member of PAS protein 1, or PAS domain-containing protein 8 (PASD8), functions as a master transcriptional regulator of the adaptive response to hypoxia. HIF2a, also termed HIF-2-alpha, or HIF2-alpha, or endothelial PAS domain-containing protein 1 (EPAS-1), or Basic-helix-loop-helix-PAS protein MOP2, or Class E basic helix-loop-helix protein 73 (bHLHe73), or Member of PAS protein 2, or PAS domain-containing protein 2 (PASD2), or HIF-1-alpha-like factor (HLF), is a bHLH-PAS transcription factor involved in the induction of oxygen regulated genes. HIF3a, also termed HIF-3-alpha, or HIF3-alpha, or Basic-helix-loop-helix-PAS protein MOP7, or Class E basic helix-loop-helix protein 17 (bHLHe17), or Member of PAS protein 7, or PAS domain-containing protein 7 (PASD7), or HIF3-alpha-1, or inhibitory PAS domain protein (IPAS), is a bHLH-PAS transcriptional regulator in adaptive response to low oxygen tension. It plays a role in the regulation of hypoxia-inducible gene expression.	58
381440	cd11434	bHLH-PAS_SIM	basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in single-minded (SIM) family. The SIM family includes Drosophila melanogaster protein SIM and its homologs from vertebrates, single-minded homolog 1 (SIM1) and single-minded homolog 2 (SIM2). SIM is a nuclear bHLH-PAS transcription factor that functions as a master developmental regulator controlling midline development of the ventral nerve cord in Drosophila. SIM1, also termed Class E basic helix-loop-helix protein 14 (bHLHe14), is a bHLH-PAS transcription factor that may have pleiotropic effects during embryogenesis and in the adult. SIM2, also termed Class E basic helix-loop-helix protein 15 (bHLHe15), is a bHLH-PAS transcription factor that may be a master gene of central nervous system (CNS) development in cooperation with ARNT. It may have pleiotropic effects in the tissues expressed during development.	61
381441	cd11435	bHLH-PAS_AhRR	basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in aryl hydrocarbon receptor repressor (AhRR) and similar proteins. AhRR, also termed AhR repressor, or Class E basic helix-loop-helix protein 77 (bHLHe77), is a member of bHLH-PAS transcription factors that acts as a negative regulator of AhR (or Dioxin Receptor), playing key roles in development and environmental sensing. AhR is activated by Dioxin to control the expression of certain genes to influence biological processes such as apoptosis, proliferation, cell growth and differentiation. To form active DNA binding complexes, AhR dimerizes with a bHLH-PAS factor ARNT (Aryl hydrocarbon Nuclear Receptor Translocator). AhRR functions by competing with AhR for its partner ARNT. AhRR-ARNT complexes are transcriptionally inactive.	60
381442	cd11436	bHLH-PAS_AhR	basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in aryl hydrocarbon receptor (AhR) and similar proteins. AhR, also termed Ah receptor, or Dioxin receptor (DR), or Class E basic helix-loop-helix protein 76 (bHLHe76), is the only member of bHLH-PAS transcription regulators that binds and is activated by small chemical ligands. It is activated by Dioxin to control the expression of certain genes to influence biological processes such as apoptosis, proliferation, cell growth and differentiation. To form active DNA binding complexes, AhR dimerizes with a bHLH-PAS factor ARNT (Aryl hydrocarbon Nuclear Receptor Translocator).	61
381443	cd11437	bHLH-PAS_ARNT_like	basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in aryl hydrocarbon receptor nuclear translocator (ARNT) family. The ARNT family of bHLH-PAS transcription regulators includes ARNT, ARNT-like proteins (ARNTL and ARNTL2), and Drosophila melanogaster protein cycle. They act as the heterodimeric partner for bHLH-PAS proteins such as aryl hydrocarbon receptor (AhR), hypoxia-inducible factor (HIF), and single-minded (SIM). These bHLH-PAS transcription complexes are involved in transcriptional responses to xenobiotic, hypoxia, and developmental pathways. Heterodimerization of bHLH-PAS proteins with ARNT is mediated by contacts between both the bHLH and the tandem PAS domains. ARNT uses bHLH and/or PAS domains to interact with several transcriptional coactivators. It is required for activity of the aryl hydrocarbon (dioxin) receptor. ARNTL, also termed Basic-helix-loop-helix-PAS protein MOP3, or brain and muscle ARNT-like 1 (BMAL1), or Class E basic helix-loop-helix protein 5 (bHLHe5), or member of PAS protein 3, or PAS domain-containing protein 3 (PASD3), or bHLH-PAS protein JAP3, is a member of the bHLH-PAS transcription factor family that forms heterodimers with another bHLH-PAS protein, CLOCK (circadian locomotor output cycle kaput), which regulates circadian rhythm. ARNTL-CLOCK heterodimer complex activates transcription from E-box (CANNTG) elements found in the promoter of circadian responsive genes. ARNTL is highly homologous to ARNT. ARNTL2, also termed Basic-helix-loop-helix-PAS protein MOP9, or brain and muscle ARNT-like 2 (BMAL2), or CYCLE-like factor (CLIF), or Class E basic helix-loop-helix protein 6 (bHLHe6), or member of PAS protein 9, or PAS domain-containing protein 9 (PASD9), is a neuronal bHLH-PAS transcriptional factor, regulating cell cycle progression and preventing cell death, whose sustained expression might ensure brain neuron survival. It also plays important roles in tumor angiogenesis. Protein cycle, also termed brain and muscle ARNT-like 1 (BMAL1), or MOP3, is a putative bHLH-PAS transcription factor involved in the generation of biological rhythms in Drosophila. It activates cycling transcription of Period (PER) and Timeless (TIM) by binding to the E-box (5'-CACGTG-3') present in their promoters.	58
381444	cd11438	bHLH-PAS_ARNTL_PASD3	basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in aryl hydrocarbon receptor nuclear translocator-like protein 1 (ARNTL) and similar proteins. ARNTL, also termed Basic-helix-loop-helix-PAS protein MOP3, or brain and muscle ARNT-like 1 (BMAL1), or Class E basic helix-loop-helix protein 5 (bHLHe5), or member of PAS protein 3, or PAS domain-containing protein 3 (PASD3), or bHLH-PAS protein JAP3, is a member of the bHLH-PAS transcription factor family that forms heterodimers with another bHLH-PAS protein, CLOCK (circadian locomotor output cycle kaput), which regulates circadian rhythm. ARNTL-CLOCK heterodimer complex activates transcription from E-box (CANNTG) elements found in the promoter of circadian responsive genes. ARNTL is highly homologous to ARNT.	64
381445	cd11439	bHLH-PAS_SRC	basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in steroid receptor coactivator (SRC) family. The SRC family of coactivators includes SRC-1 (NcoA-1/p160), SRC-2 (TIF2/GRIP1/NcoA-2) and SRC-3 (NcoA-3/pCIP/RAC3/ACTR/AIB1/TRAM1), which are critical mediators of steroid receptor action. They contain a bHLH-PAS domain at the N-terminus, followed by a receptor-interacting domain and a C-terminal transcriptional activation domain. SRC coactivators interact with nuclear receptors in a ligand-dependent manner and enhance transcriptional activation by the receptor via histone acetylation/methylation.	58
381446	cd11440	bHLH-O_Cwo_like	basic helix-loop-helix-orange (bHLH-O) domain found in Drosophila melanogaster protein clockwork orange (Cwo) and similar proteins. Cwo is a bHLH-O transcriptional regulator involved in the regulation of Drosophila circadian rhythms. It functions as both an activator and a repressor of clock gene expression.	60
381447	cd11441	bHLH-PAS_CLOCK_like	basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in Circadian locomotor output cycles protein kaput (CLOCK) and similar proteins. The family includes CLOCK, neuronal PAS domain-containing protein 2 (NPAS2) and non-mammalian circadian clock protein PASD1. CLOCK, also termed Class E basic helix-loop-helix protein 8 (bHLHe8), is a transcriptional activator which forms a core component of the circadian clock. NPAS2, also termed neuronal PAS2, or basic-helix-loop-helix-PAS protein MOP4, or Class E basic helix-loop-helix protein 9 (bHLHe9), or member of PAS protein 4, or PAS domain-containing protein 4, is a transcriptional activator which forms a core component of the circadian clock. PASD1 is evolutionarily related to Circadian locomotor output cycles protein kaput (CLOCK) and functions as a suppressor of the biological clock that drives the daily circadian rhythms of cells throughout the body.	54
381448	cd11442	bHLH_AtPRE_like	basic helix-loop-helix (bHLH) domain found in Arabidopsis thaliana paclobutrazol resistance (PRE) family. The PRE family includes several bHLH transcription factors from Arabidopsis thaliana, such as PRE1-6. PRE1 (also termed AtbHLH136, or protein banquo 1), PRE2 (also termed AtbHLH134, or protein banquo 2, or EN 52), PRE4 (also termed AtbHLH161, or protein banquo 3), and PRE5 (also termed AtbHLH164) are atypical and probable non DNA-binding bHLH transcription factors that integrate multiple signaling pathways to regulate cell elongation and plant development. PRE3 (also termed AtbHLH135, or protein activation-tagged BRI1 suppressor 1, or ATBS1, or protein target of MONOPTEROS 7, or EN 67) is an atypical and probable non DNA-binding bHLH transcription factor required for MONOPTEROS-dependent root initiation in embryo. It promotes the correct definition of the hypophysis cell division plane. PRE6 (also termed AtbHLH163, or protein KIDARI) is an atypical and probable non DNA-binding bHLH transcription factor that regulates light-mediated responses in daylight conditions by binding and inhibiting the activity of the bHLH transcription factor HFR1, a critical regulator of light signaling and shade avoidance.	65
381449	cd11443	bHLH_AtAMS_like	basic helix-loop-helix (bHLH) domain found in Arabidopsis thaliana protein aborted microspores (AMS) and similar proteins. The family includes several bHLH transcription factors from Arabidopsis thaliana, such as AMS, ICE1 and SCREAM2. AMS, also termed AtbHLH21, or EN 48, plays a crucial role in tapetum development and it is required for male fertility and pollen differentiation. ICE1, also termed inducer of CBF expression 1, or AtbHLH116, or EN 45, or SCREAM, acts as a transcriptional activator that regulates the cold-induced transcription of CBF/DREB1 genes. It binds specifically to the MYC recognition sites (5'-CANNTG-3') found in the CBF3/DREB1A promoter. SCREAM2, also termed AtbHLH33, or EN 44, mediates stomatal differentiation in the epidermis probably by controlling successive roles of SPCH, MUTE, and FAMA.	72
381450	cd11444	bHLH_AtIBH1_like	basic helix-loop-helix (bHLH) domain found in Arabidopsis thaliana ILI1-BINDING BHLH 1 (IBH1) and similar proteins. The family includes several bHLH transcription factors from Arabidopsis thaliana, such as IBH1, UPBEAT1, PAR1 and PAR2. IBH1, also termed bHLH zeta, or AtbHLH158, is an atypical and probable non DNA-binding bHLH transcription factor that acts as a transcriptional repressor that negatively regulates cell and organ elongation in response to gibberellin (GA) and brassinosteroid (BR) signaling. IBH1 forms a heterodimer with BHLH49, thus inhibiting DNA binding of BHLH49, which is a transcriptional activator that regulates the expression of a subset of genes involved in cell expansion by binding to the G-box motif. UPBEAT1, also termed AtbHLH151, or EN 146, is a bHLH transcription factor that modulates the balance between cellular proliferation and differentiation in root growth. It does not act through cytokinin and auxin signaling, but by repressing peroxidase expression in the elongation zone. PAR1 (also termed AtbHLH165, or protein helix-loop-helix 1, or protein phytochrome rapidly regulated 1) and PAR2 (also termed AtbHLH166, or protein helix-loop-helix 2, or protein phytochrome rapidly regulated 2) are two atypical bHLH transcription factors that act as negative regulators of a variety of shade avoidance syndrome (SAS) responses, including seedling elongation and photosynthetic pigment accumulation. They act as direct transcriptional repressors of two auxin-responsive genes, SAUR15 and SAUR68. They may function in integrating shade and hormone transcriptional networks in response to light and auxin changes.	57
381451	cd11445	bHLH_AtPIF_like	basic helix-loop-helix (bHLH) domain found in Arabidopsis thaliana phytochrome interacting factors (PIFs) and similar proteins. The family includes several bHLH transcription factors from Arabidopsis thaliana, such as PIFs, ALC, PIL1, SPATULA, and UNE10. PIFs (PIF1, PIF3, PIF4, PIF5, PIF6 and PIF7) have been shown to control light-regulated gene expression. They directly bind to the photoactivated phytochromes and are degraded in response to light signals. ALC, also termed AtbHLH73, or protein ALCATRAZ, or EN 98, is required for the dehiscence of fruit, especially for the separation of the valve cells from the replum. It promotes the differentiation of a strip of labile non-lignified cells sandwiched between layers of lignified cells. PIL1, also termed AtbHLH124, or protein phytochrome interacting factor 3-like 1, or EN 110, is involved in responses to transient and long-term shade. It is required for the light-mediated inhibition of hypocotyl elongation and necessary for rapid light-induced expression of the photomorphogenesis- and circadian-related gene APRR9. PIL1 seems to play a role in multiple PHYB responses, such as flowering transition and petiole elongation. SPATULA, also termed AtbHLH24, or EN 99, plays a role in floral organogenesis. It promotes the growth of carpel margins and of pollen tract tissues derived from them. UNE10, also termed AtbHLH16, or protein UNFERTILIZED EMBRYO SAC 10, or EN 99, is required during the fertilization of ovules by pollen.	64
381452	cd11446	bHLH_AtILR3_like	basic helix-loop-helix (bHLH) domain found in Arabidopsis thaliana protein IAA-leucine resistant 3 (ILR3) and similar proteins. ILR3, also termed AtbHLH105, or EN 133, is a bHLH transcription factor that plays a role in resistance to amide-linked indole-3-acetic acid (IAA) conjugates such as IAA-Leu and IAA-Phe. It may regulate gene expression in response to metal homeostasis changes.	76
381453	cd11447	bHLH-O_HEYL	basic helix-loop-helix-orange (bHLH-O) domain found in hairy/enhancer-of-split related with YRPW motif-like protein (HEYL) and similar proteins. HEYL, also termed Class B basic helix-loop-helix protein 33 (bHLHb33), or hairy-related transcription factor 3 (HRT-3), is a bHLH-O transcriptional repressor that is strongly expressed in the presomitic mesoderm, the somites, the peripheral nervous system and smooth muscle of all arteries and is a downstream effector of the Notch and transforming growth factor-beta pathways. It promotes neuronal differentiation by activating proneural genes and inhibiting other hairy and enhancer of split (HES) and hairy/enhancer-of-split related with YRPW motif protein (HEY) proteins. HEYL also functions as a tumor suppressor involved in the progression of human cancers.	74
381454	cd11448	bHLH_AtFAMA_like	basic helix-loop-helix (bHLH) domain found in Arabidopsis thaliana protein FAMA and similar proteins. The family includes several bHLH transcription factors from Arabidopsis thaliana, such as FAMA, MUTE and SPEECHLESS, which work together to regulate the sequential cell fate specification during stomatal development and differentiation. FAMA, also termed AtbHLH97, or EN 14, is a transcription activator required to promote differentiation and morphogenesis of stomatal guard cells and to halt proliferative divisions in their immediate precursors. It mediates the formation of stomata. MUTE, also termed AtbHLH45, or EN 20, is required for the differentiation of stomatal guard cells, by promoting successive asymmetric cell divisions and the formation of guard mother cells. It promotes the conversion of the leaf epidermis into stomata. SPEECHLESS, also termed AtbHLH98, or EN 19, is required for the initiation and the formation of stomata, by promoting the first asymmetric cell divisions. FAMA, MUTE and SPEECHLESS form heterodimers with SCREAM/ICE1 and SCRM2 to regulate transcription of genes during stomatal development.	74
381455	cd11449	bHLH_AtAIB_like	basic helix-loop-helix (bHLH) domain found in Arabidopsis thaliana protein ABA-INDUCIBLE bHLH-TYPE (AIB) and similar proteins. The family includes several bHLH transcription factors from Arabidopsis thaliana, such as AIB and MYC proteins (MYC2, MYC3 and MYC4). AIB, also termed AtbHLH17, or EN 35, is a transcription activator that regulates positively abscisic acid (ABA) response. MYC2, also termed protein jasmonate insensitive 1, or R-homologous Arabidopsis protein 1 (RAP-1), or AtbHLH6, or EN 38, or Z-box binding factor 1 protein, is a transcriptional activator involved in abscisic acid (ABA), jasmonic acid (JA), and light signaling pathways. MYC3, also termed protein altered tryptophan regulation 2, or AtbHLH5, or transcription factor ATR2, or EN 36, is a transcription factor involved in tryptophan, jasmonic acid (JA) and other stress-responsive gene regulation. MYC4, also termed AtbHLH4, or EN 37, is a transcription factor involved in jasmonic acid (JA) gene regulation. MYC2, together with MYC3 and MYC4, controls additively subsets of JA-dependent responses.	78
381456	cd11450	bHLH_AtFIT_like	basic helix-loop-helix (bHLH) domain found in Arabidopsis thaliana Fe-deficiency induced transcription factor 1 (FIT) and similar proteins. The family includes bHLH transcription factors from Arabidopsis thaliana, such as FIT and DYT1. FIT, also termed FER-like iron deficiency-induced transcription factor, or FER-like regulator of iron uptake, or AtbHLH29, or EN 43, is a bHLH transcription factor that is required for the iron deficiency response in plant. It regulates FRO2 at the level of mRNA accumulation and IRT1 at the level of protein accumulation. DYT1, also termed AtbHLH22, or protein dysfunctional tapetum 1, or EN 49, is a bHLH transcription factor involved in the control of tapetum development. It is required for male fertility and pollen differentiation, especially during callose deposition.	76
381457	cd11451	bHLH_AtTT8_like	basic helix-loop-helix (bHLH) domain found in Arabidopsis thaliana protein transparent testa 8 (TT8) and similar proteins. The family includes several bHLH transcription factors from Arabidopsis thaliana, such as TT8, EGL1, and GL3. TT8, also termed AtbHLH42, or EN 32, is involved in the control of flavonoid pigmentation and plays a key role in regulating leucoanthocyanidin reductase (BANYULS) and dihydroflavonol-4-reductase (DFR). EGL1, also termed AtbHLH2, or EN 30, or AtMYC146, or protein enhancer of GLABRA 3, is involved in epidermal cell fate specification and regulates negatively stomata formation but promotes trichome formation. GL3, also termed AtbHLH1, or AtMYC6, or protein shapeshifter, or EN 31, is involved in epidermal cell fate specification. It regulates negatively stomata formation, but, in association with TTG1 and MYB0/GL1, promotes trichome formation, branching and endoreplication.	75
381458	cd11452	bHLH_AtNAI1_like	basic helix-loop-helix (bHLH) domain found in Arabidopsis thaliana protein NAI1 and similar proteins. NAI1, also termed AtbHLH20, or EN 27, is a bHLH transcription activator that regulates the expression of at least NAI2, PYK10 and PBP1. It is required for and mediates the formation of endoplasmic reticulum bodies (ER bodies). It plays a role in the symbiotic interactions with the endophytes of the Sebacinaceae fungus family, such as Piriformospora indica and Sebacina.	75
381459	cd11453	bHLH_AtBIM_like	basic helix-loop-helix (bHLH) domain found in Arabidopsis thaliana BES1-interacting Myc-like proteins (BIMs) and similar proteins. The family includes Arabidopsis thaliana BIM1 and its homologs (BIM2 and BIM3), which are bHLH transcription factors that interact with BES1 to regulate transcription of Brassinosteroid (BR)-induced genes. BR regulates many growth and developmental processes such as cell elongation, vascular development, senescence, stress responses, and photomorphogenesis. BIM1 heterodimerizes with BES1 and binds to E-box sequences present in many BR-induced promoters to regulate BR-induced genes.	77
381460	cd11454	bHLH_AtIND_like	basic helix-loop-helix (bHLH) domain found in Arabidopsis thaliana protein INDEHISCENT (IND) and similar proteins. The family includes several bHLH transcription factors from Arabidopsis thaliana, such as IND, HEC proteins (HEC1, HEC2 and HEC3) and UNE12. IND, also termed AtbHLH40, or EN 120, is a bHLH transcription regulator required for seed dispersal. It is involved in the differentiation of all three cell types required for fruit dehiscence. HEC1 (also termed AtbHLH88, or protein HECATE 1, or EN 118), HEC2 (also termed AtbHLH37, or protein HECATE 2, or EN 117) and HEC3 (also termed AtbHLH43, or protein HECATE 3, or EN 119) are required for the female reproductive tract development and fertility. Both IND and HEC proteins have been implicated in regulation of auxin signaling. They heterodimerize with SPATULA (SPT) bHLH transcription factor to regulate reproductive tract development in plant. UNE12, also termed AtbHLH59, or protein UNFERTILIZED EMBRYO SAC 12, or EN 93, is required for ovule fertilization.	63
381461	cd11455	bHLH_AtAIG1_like	basic helix-loop-helix (bHLH) domain found in Arabidopsis thaliana protein AIG1 and similar proteins. AIG1, also termed AtbHLH32, or EN 54, or protein target of MONOPTEROS 5, is a transcription factor required for MONOPTEROS-dependent root initiation in embryo.	80
381462	cd11456	bHLHzip_N-Myc_like	basic Helix-Loop-Helix-zipper (bHLHzip) domain found in N-Myc and similar proteins. N-Myc, also termed Class E basic helix-loop-helix protein 37 (bHLHe37), is a bHLHZip proto-oncogene protein that positively regulates the transcription of MYCNOS in neuroblastoma cells. It is also essential during embryonic development. N-Myc has a critical role in regulating the switch between proliferation and differentiation of progenitor cells. It binds DNA as a heterodimer with MAX. The family also includes S-Myc, encoded by rat or mouse intronless myc gene, which has apoptosis-inducing activity.	87
381463	cd11457	bHLHzip_L-Myc	basic Helix-Loop-Helix-zipper (bHLHzip) domain found in L-Myc and similar proteins. L-Myc, also termed Class E basic helix-loop-helix protein 38 (bHLHe38), or protein L-Myc-1, or V-myc myelocytomatosis viral oncogene homolog, is a bHLHZip oncoprotein belonging to the Myc oncogene protein family. It binds DNA as a heterodimer with MAX. L-Myc is co-expressed with another Myc family member and has weaker transformation/transactivation activities. L-Myc knockout mouse did not exhibit any phenotypic abnormalities.	89
381464	cd11458	bHLHzip_c-Myc	basic Helix-Loop-Helix-zipper (bHLHzip) domain found in c-Myc and similar proteins. c-Myc, also termed Myc proto-oncogene protein, or Class E basic helix-loop-helix protein 39 (bHLHe39), or transcription factor p64, a bHLHZip proto-oncogene protein that functions as a transcription factor, which binds DNA in a non-specific manner, yet also specifically recognizes the core sequence 5'-CAC[GA]TG-3'. It activates the transcription of growth-related genes.	84
381465	cd11459	bHLH-O_HES1_4	basic helix-loop-helix-orange (bHLH-O) domain found in hairy and enhancer of split HES-1, HES-4 and similar proteins. The family includes two bHLH-O transcriptional repressors, HES-1 and HES-4. HES-1, also termed Class B basic helix-loop-helix protein 39 (bHLHb39), or hairy homolog, or hairy-like protein (HL), plays an essential role in development of both compartment and boundary cells of the central nervous system. It regulates the maintenance of neural stem/progenitor cells by inhibiting proneural gene expression via Notch signaling. HES-4, also termed Class B basic helix-loop-helix protein 42 (bHLHb42), or bHLH factor Hes4, antagonizes the function of Twist-1 to regulate lineage commitment of bone marrow stromal/stem cells (BMSC). Epigenetic dysregulation of HES-4 is associated with striatal degeneration in postmortem Huntington brains. Both HES-1 and HES-4 are mammalian counterparts of the Hairy and Enhancer of split proteins that play a critical role in many physiological processes including cellular differentiation, cell cycle arrest, apoptosis and self-renewal ability.	63
381466	cd11460	bHLH-O_HES6	basic helix-loop-helix-orange (bHLH-O) domain found in transcription factor HES-6 and similar proteins. HES-6, also termed Class B basic helix-loop-helix protein 41 (bHLHb41), or hairy and enhancer of split 6, or C-HAIRY1, is a bHLH-O transcription factor that is expressed in developing muscle and involved in angiogenesis, myogenesis, neural differentiation and neurogenesis. HES-6 antagonizes Notch signaling but is not regulated by Notch signaling. It is a transcription co-factor associated with stem cell characteristics in neural tissue. It may act as an inhibitor of Hes-1 during neuronal development and forms a heterodimer with HES-1 to prevent its association with transcriptional co-repressors. The overexpression of HES-6 has been reported in metastatic cancers of different origins. HES-6 is one mammalian counterpart of the Hairy and Enhancer of split proteins that play a critical role in many physiological processes including cellular differentiation, cell cycle arrest, apoptosis and self-renewal ability.	58
381467	cd11461	bHLH-O_HES5	basic helix-loop-helix-orange (bHLH-O) domain found in transcription factor HES-5 and similar proteins. HES-5, also termed Class B basic helix-loop-helix protein 38 (bHLHb38), or hairy and enhancer of split 5, is a bHLH-O transcription factor that is involved in cell differentiation and proliferation in a variety of tissues. HES-5 is an essential effector for Notch signaling. It acts as a transducer of Notch signals in brain vascular development. It also acts as a key mediator of Wnt-3a-induced neuronal differentiation and plays a crucial role in normal inner ear hair cell development. HES-5 is one mammalian counterpart of the Hairy and Enhancer of split proteins that play a critical role in many physiological processes including cellular differentiation, cell cycle arrest, apoptosis and self-renewal ability.	59
381468	cd11462	bHLH-O_HES7	basic helix-loop-helix-orange (bHLH-O) domain found in hairy and enhancer of split 7 (HES-7) and similar proteins. HES-7, also termed Class B basic helix-loop-helix protein 37 (bHLHb37), or bHLH factor Hes7, is a bHLH-O transcriptional repressor that is expressed in an oscillatory manner and acts as a key regulator of the pace of the segmentation clock. It is regulated by the Notch and Fgf/Mapk pathways. HES-7 is one mammalian counterpart of the Hairy and Enhancer of split proteins that play a critical role in many physiological processes including cellular differentiation, cell cycle arrest, apoptosis and self-renewal ability.	61
381469	cd11463	bHLH-O_HES2	basic helix-loop-helix-orange (bHLH-O) domain found in hairy and enhancer of split 2 (HES-2) and similar proteins. HES-2, also termed Class B basic helix-loop-helix protein 40 (bHLHb40), is a bHLH-O transcriptional repressor of genes that require a bHLH protein for their transcription. It acts as a negative regulator through interaction with both E-box and N-box sequences. HES-2 is one mammalian counterpart of the Hairy and Enhancer of split proteins that play a critical role in many physiological processes including cellular differentiation, cell cycle arrest, apoptosis and self-renewal ability.	65
381470	cd11464	bHLH_TS_TWIST	basic helix-loop-helix (bHLH) domain found in twist-related protein (TWIST) family. The TWIST family includes TWIST1 and TWIST2, which are highly homologous bHLH transcription factors that promote epithelial-mesenchymal transition (EMT) during development and tumor metastasis. They are involved in the negative regulation of cellular determination and in the differentiation of several lineages, including myogenesis, osteogenesis, and neurogenesis. TWIST factors are expressed in broad, partially overlapping patterns during embryo development and dimerize with a broad set of dimer partners that form numerous unique transcriptional complexes to regulate embryonic development.	59
381471	cd11465	bHLH_TS_scleraxis_like	basic helix-loop-helix (bHLH) domain found in scleraxis, transcription factor 15 (TCF-15) and similar proteins. The family includes scleraxis and transcription factor 15 (TCF-15). Scleraxis, also termed SCX, or Class A basic helix-loop-helix protein 41 (bHLHa41), or Class A basic helix-loop-helix protein 48 (bHLHa48), is a bHLH transcription factor that is expressed in sclerotome limb bud cranial and body wall mesenchyme, pericardium and heart valves, ligaments and tendons. It is required for tendon formation ligaments, connective tissue, the diaphragm, and testis development. Scleraxis plays a central role in promoting fibroblast proliferation and matrix synthesis during the embryonic development of tendons. TCF-15, also termed Class A basic helix-loop-helix protein 40 (bHLHa40), or paraxis, or protein bHLH-EC2, is a bHLH transcription factor expressed in caudal lateral and paraxial mesoderm dermomyotome and sclerotome fore limb buds during embryo development. It may function as an early transcriptional regulator involved in the patterning of the mesoderm and in lineage determination of cell types derived from the mesoderm.	55
381472	cd11466	bHLH_TS_HAND	basic helix-loop-helix (bHLH) domain found in heart- and neural crest derivatives-expressed protein (HAND) family. The HAND family includes two bHLH transcription factors, HAND1 and HAND2. HAND1, also termed Class A basic helix-loop-helix protein 27 (bHLHa27), or extraembryonic tissues, heart, autonomic nervous system and neural crest derivatives-expressed protein 1 (eHAND), plays an essential role in both trophoblast-giant cells differentiation and in cardiac morphogenesis. HAND2, also termed Class A basic helix-loop-helix protein 26 (bHLHa26), or deciduum, heart, autonomic nervous system and neural crest derivatives-expressed protein 2 (dHAND), is essential for cardiac morphogenesis, particularly for the formation of the right ventricle and of the aortic arch arteries.	56
381473	cd11467	bHLH_E-protein_Da_like	basic helix-loop-helix (bHLH) domain found in Drosophila melanogaster protein daughterless (Da) and similar proteins. Da is a nuclear bHLH transcription factor that is a sole E protein ortholog essential for both neurogenesis and sex determination in Drosophila. Da is expressed in a broad range of tissues and is involved in diverse developmental processes such as oogenesis, sex determination and neurogenesis depending on its bHLH-binding partners. Da and achaete-scute complex (AS-C) form heterodimers that act as transcriptional activators of neural cell fates and are involved in sex determination.	70
381474	cd11468	bHLH_TS_bHLHe22_like	basic helix-loop-helix (bHLH) domain found in Class E basic helix-loop-helix protein bHLHe22, bHLHe23 and similar proteins. The family includes two OLIG-related bHLH transcription factors, bHLHe22 and bHLHe23. bHLHe22, also termed Class B basic helix-loop-helix protein 5 (bHLHb5), or trinucleotide repeat-containing gene 20 protein, is a neural-specific transcriptional repressor that is expressed in both excitatory (unipolar brush cells) and inhibitory neurons (cartwheel cells) of the dorsal cochlear nucleus (DCN) during development. It is important for the proper development and/or survival of a number of neural cell types. bHLHe23, also termed Class B basic helix-loop-helix protein 4 (bHLHb4), is expressed in rod bipolar cells and is required for rod bipolar cell maturation. bHLHe23 has roles in spinal interneuron differentiation by mechanisms linked to the Notch signaling pathway. It modulates the expression of genes required for the differentiation and/or maintenance of pancreatic and neuronal cell types.	62
381475	cd11469	bHLH-PAS_ARNTL2_PASD9	basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in aryl hydrocarbon receptor nuclear translocator-like protein 2 (ARNTL2) and similar proteins. ARNTL2, also termed Basic-helix-loop-helix-PAS protein MOP9, or brain and muscle ARNT-like 2 (BMAL2), or CYCLE-like factor (CLIF), or Class E basic helix-loop-helix protein 6 (bHLHe6), or member of PAS protein 9, or PAS domain-containing protein 9 (PASD9), is a neuronal bHLH-PAS transcriptional factor, regulating cell cycle progression and preventing cell death, whose sustained expression might ensure brain neuron survival. It also plays important roles in tumor angiogenesis. ARNT-2 heterodimerize with other bHLH-PAS proteins such as aryl hydrocarbon receptor (AhR), hypoxia-inducible factor (HIF), and single-minded (SIM).	60
381476	cd11470	bHLH_TS_TCF15_paraxis	basic helix-loop-helix (bHLH) domain found in transcription factor 15 (TCF-15) and similar proteins. TCF-15, also termed Class A basic helix-loop-helix protein 40 (bHLHa40), or paraxis, or protein bHLH-EC2, is a bHLH transcription factor expressed in caudal lateral and paraxial mesoderm dermomyotome and sclerotome fore limb buds during embryo development. It may function as an early transcriptional regulator involved in the patterning of the mesoderm and in lineage determination of cell types derived from the mesoderm.	66
381477	cd11471	bHLH_TS_HAND2	basic helix-loop-helix (bHLH) domain found in heart- and neural crest derivatives-expressed protein 2 (HAND2) and similar proteins. HAND2, also termed Class A basic helix-loop-helix protein 26 (bHLHa26), or deciduum, heart, autonomic nervous system and neural crest derivatives-expressed protein 2 (dHAND), is a bHLH transcription factor that is essential for cardiac morphogenesis, particularly for the formation of the right ventricle and of the aortic arch arteries.	62
211395	cd11473	W2	C-terminal domain of eIF4-gamma/eIF5/eIF2b-epsilon. This domain is found at the C-terminus of several translation initiation factors, including the epsilon chain of eIF2b, where it has been found to catalyze the conversion of eIF2.GDP to its active eIF2.GTP form. The structure of the domain resembles that of a set of concatenated HEAT repeats.	135
271368	cd11474	SLC5sbd_CHT	Na(+)- and Cl(-)-dependent choline cotransporter CHT and related proteins; solute-binding domain. Na+/choline co-transport by CHT is Cl- dependent. Human CHT (also called CHT1) is encoded by the SLC5A7 gene, and is expressed in the central nervous system. hCHT1-mediated choline uptake may be the rate-limiting step in acetylcholine synthesis, and essential for cholinergic transmission. Changes in this choline uptake in cortical neurons may contribute to Alzheimer's dementia. This subfamily belongs to the solute carrier 5 (SLC5) transporter family.	464
271369	cd11475	SLC5sbd_PutP	Na(+)/proline cotransporter PutP and related proteins; solute binding domain. Escherichia coli PutP catalyzes the Na+-coupled uptake of proline with a stoichiometry of 1:1. The putP gene is part of the put operon; this operon in addition encodes a proline dehydrogenase, allowing the use of proline as a source of nitrogen and/or carbon. This subfamily also includes the Bacillus subtilis Na+/proline cotransporter (OpuE) which has an osmoprotective instead of catabolic role. Expression of the opuE gene is under osmotic control and different sigma factors contribute to its regulation; it is also a putative CcpA-activated gene. This subfamily belongs to the solute carrier 5 (SLC5) transporter family.	464
271370	cd11476	SLC5sbd_DUR3	Na(+)/urea-polyamine cotransporter DUR3, and related proteins; solute-binding domain. Dur3 is the yeast plasma membrane urea transporter. Saccharomyces cerevisiae DUR3 also transports polyamine. The polyamine uptake of S. cerevisiae DUR3 is activated upon its phosphorylation by polyamine transport protein kinase 2 (PTK2). S. cerevisiae DUR3 also appears to play a role in regulating the cellular boron concentration. This subfamily belongs to the solute carrier 5 (SLC5) transporter family.	493
271371	cd11477	SLC5sbd_u1	Uncharacterized bacterial solute carrier 5 subfamily; putative solute-binding domain. SLC5 (also called the sodium/glucose cotransporter family or solute sodium symporter family) is a family of proteins that co-transports Na+ with sugars, amino acids, inorganic ions or vitamins. Prokaryotic members of this family include Vibrio parahaemolyticus glucose/galactose (vSGLT), and Escherichia coli proline (PutP) and pantothenate (PanF) cotransporters. One member of the SLC5 family, human SGLT3, has been characterized as a glucose sensor and not a transporter. This subfamily belongs to the solute carrier 5 (SLC5) transporter family.	493
271372	cd11478	SLC5sbd_u2	Uncharacterized bacterial solute carrier 5 subfamily; putative solute-binding domain. SLC5 (also called the sodium/glucose cotransporter family or solute sodium symporter family) is a family of proteins that co-transports Na+ with sugars, amino acids, inorganic ions or vitamins. Prokaryotic members of this family include Vibrio parahaemolyticus glucose/galactose (vSGLT), and Escherichia coli proline (PutP) and pantothenate (PanF) cotransporters. One member of the SLC5 family, human SGLT3, has been characterized as a glucose sensor and not a transporter. This subfamily belongs to the solute carrier 5 (SLC5) transporter family.	496
271373	cd11479	SLC5sbd_u3	Uncharacterized bacterial solute carrier 5 subfamily; putative solute-binding domain. SLC5 (also called the sodium/glucose cotransporter family or solute sodium symporter family) is a family of proteins that co-transports Na+ with sugars, amino acids, inorganic ions or vitamins. Prokaryotic members of this family include Vibrio parahaemolyticus glucose/galactose (vSGLT), and Escherichia coli proline (PutP) and pantothenate (PanF) cotransporters. One member of the SLC5 family, human SGLT3, has been characterized as a glucose sensor and not a transporter. This subfamily belongs to the solute carrier 5 (SLC5) transporter family.	454
271374	cd11480	SLC5sbd_u4	Uncharacterized bacterial solute carrier 5 subfamily; putative solute-binding domain. SLC5 (also called the sodium/glucose cotransporter family or solute sodium symporter family) is a family of proteins that co-transports Na+ with sugars, amino acids, inorganic ions or vitamins. Prokaryotic members of this family include Vibrio parahaemolyticus glucose/galactose (vSGLT), and Escherichia coli proline (PutP) and pantothenate (PanF) cotransporters. One member of the SLC5 family, human SGLT3, has been characterized as a glucose sensor and not a transporter. This subfamily belongs to the solute carrier 5 (SLC5) transporter family.	488
271375	cd11482	SLC-NCS1sbd_NRT1-like	nucleobase-cation-symport-1 (NCS1) transporter NRT1-like; solute-binding domain. This fungal NCS1 subfamily includes various Saccharomyces cerevisiae transporters: nicotinamide riboside transporter 1 (Nrt1p, also called Thi71p), Dal4p (allantoin permease), Fui1p (uridine permease), Fur4p (uracil permease), and Thi7p (thiamine transporter). NCS1s are essential components of salvage pathways for nucleobases and related metabolites. NCS1s belong to a superfamily which also contains the solute carrier 5 family sodium/glucose transporters, and solute carrier 6 family neurotransmitter transporters.	480
271376	cd11483	SLC-NCS1sbd_Mhp1-like	nucleobase-cation-symport-1 (NCS1) transporter Mhp1-like; solute-binding domain. This NCS1 subfamily includes Microbacterium liquefaciens Mhp1, and various uncharacterized NCS1s. Mhp1 mediates the uptake of indolyl methyl- and benzyl-hydantoins as part of a metabolic salvage pathway for their conversion to amino acids. Mhp1 has 12 transmembrane (TM) helices (an inverted topology repeat: TMs1-5 and TMs6-10, and TMs11-12; TMs numbered to conform to the Solute carrier 6 (SLC6) family Aquifex aeolicus LeuT). NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their other known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. NCS1s belong to a superfamily which also contains the solute carrier 5 family sodium/glucose transporters (SLC5s), and SLC6 neurotransmitter transporters.	451
271377	cd11484	SLC-NCS1sbd_CobB-like	nucleobase-cation-symport-1 (NCS1) transporter CobB-like; solute-binding domain. This NCS1 subfamily includes Escherichia coli CodB (cytosine permease), and the Saccharomyces cerevisiae transporters: Fcy21p (Purine-cytosine permease), and vitamin B6 transporter Tpn1. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. NCS1s belong to a superfamily which also contains the solute carrier 5 family sodium/glucose transporters (SLC5s), and solute carrier 6 family neurotransmitter transporters (SLC6s).	406
271378	cd11485	SLC-NCS1sbd_YbbW-like	uncharacterized nucleobase-cation-symport-1 (NCS1) transporter subfamily, YbbW-like; solute-binding domain. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. This subfamily includes the putative allantoin transporter Escherichia coli YbbW (also known as GlxB2). NCS1s belong to a superfamily which also contains the solute carrier 5 family sodium/glucose transporters (SLC5s), and solute carrier 6 family neurotransmitter transporters (SLC6s).	456
271379	cd11486	SLC5sbd_SGLT1	Na(+)/glucose cotransporter SGLT1; solute binding domain. Human SGLT1 (hSGLT1) is a high-affinity/low-capacity glucose transporter, which can also transport galactose. In the transport mechanism, two Na+ ions first bind to the extracellular side of the transporter and induce a conformational change in the glucose binding site. This results in an increased affinity for glucose. A second conformational change in the transporter follows, bringing the Na+ and glucose binding sites to the inner surface of the membrane. Glucose is then released, followed by the Na+ ions. In the process, hSGLT1 is also able to transport water and urea and may be a major pathway for transport of these across the intestinal brush-border membrane. hSGLT1 is encoded by the SLC5A1 gene and expressed mostly in the intestine, but also in the trachea, kidney, heart, brain, testis, and prostate. The WHO/UNICEF oral rehydration solution (ORS) for the treatment of secretory diarrhea contains salt and glucose. The glucose, along with sodium ions, is transported by hSGLT1 and water is either co-transported along with these or follows by osmosis. Mutations in SGLT1 are associated with intestinal glucose galactose malabsorption (GGM). Up-regulation of intestinal SGLT1 may protect against enteric infections. SGLT1 is expressed in colorectal, head and neck, and prostate tumors. Epidermal growth factor receptor (EGFR) functions in cell survival by stabilizing SGLT1, and thereby maintaining intracellular glucose levels. SGLT1 is predicted to have 14 membrane-spanning regions. This subgroup belongs to the solute carrier 5 (SLC5) transporter family.	636
212056	cd11487	SLC5sbd_SGLT2	Na(+)/glucose cotransporter SGLT2 and related proteins; solute-binding domain. Human SGLT2 (hSGLT2) is a high-capacity, low-affinity glucose transporter, that plays an important role in renal glucose reabsorption. It is encoded by the SLC5A2 gene and expressed almost exclusively in renal proximal tubule cells. Mutations in hSGLT2 cause Familial Renal Glucosuria (FRG), a rare autosomal defect in glucose transport. hSGLT2 is a major drug target for regulating blood glucose levels in diabetes. hSGLT2 is predicted to have 14 membrane-spanning regions. This subgroup belongs to the solute carrier 5 (SLC5) transporter family.	583
271380	cd11488	SLC5sbd_SGLT4	Na(+)/glucose cotransporter SGLT4 and related proteins; solute-binding domain. Human SGLT4 (hSGLT4) has been reported to be a low-affinity glucose transporter with unusual sugar selectivity: it transports D-mannose but not galactose or 3-O-methyl-D-glucoside. It is encoded by the SLC5A9 gene and is expressed in intestine, kidney, liver, brain, lung, trachea, uterus, and pancreas. hSGLT4 is predicted to contain 14 membrane-spanning regions. This subgroup belongs to the solute carrier 5 (SLC5) transporter family.	605
212058	cd11489	SLC5sbd_SGLT5	Na(+)/glucose cotransporter SGLT5 and related proteins; solute-binding domain. Human SGLT5 is a glucose transporter, which also transports galactose. It is encoded by the SLC5A10 gene, and is exclusively expressed in the renal cortex. This subgroup belongs to the solute carrier 5 (SLC5) transporter family.	604
271381	cd11490	SLC5sbd_SGLT6	Na(+)/chiro-inositol cotransporter SGLT6 and related proteins; solute-binding domain. Human SGLT6 (also called KST1, SMIT2) is a chiro-inositol transporter, which also transports myo-inositol. It is encoded by the SLC5A11 gene. Xenopus Na1-glucose cotransporter type 1 (SGLT-1)-like protein is predicted to contain 14 membrane-spanning regions. This subgroup belongs to the solute carrier 5 (SLC5) transporter family.	602
271382	cd11491	SLC5sbd_SMIT	Na(+)/myo-inositol cotransporter SMIT and related proteins; solute-binding domain. Human SMIT is a high-affinity myo-inositol transporter, and is expressed in brain, heart, kidney, and lung. Inhibition of myo-inositol uptake, through down-regulation of SMIT, may be a common mechanism of action of mood stabilizers, including lithium, carbamazepine, and valproate. SMIT is encoded by the SLC5A3 gene, which is a candidate gene for pathogenesis of nervous system dysfunction in Down syndrome (DS). The SNP, 21q22 near SLC5A3-MRPS6-KCNE2, has been associated with coronary heart disease, cardiovascular disease, and myocardial infarction. SMIT may also be involved in the pathogenesis of congenital cataract. SMIT also plays roles in osteogenesis, bone formation, and bone mineral density determination. This subgroup belongs to the solute carrier 5 (SLC5) transporter family.	609
271383	cd11492	SLC5sbd_NIS-SMVT	Na(+)/iodide (NIS) and Na(+)/multivitamin (SMVT) cotransporters, and related proteins; solute binding domain. NIS (encoded by the SLC5A5 gene) transports I-, and other anions including ClO4-, SCN-, and Br-. SMVT (encoded by the SLC5A6 gene) transports biotin, pantothenic acid and lipoate. This subfamily also includes SMCT1 and -2. SMCT1 (encoded by the SLC5A8 gene) is a high-affinity transporter of various monocarboxylates including lactate and pyruvate, short-chain fatty acids, ketone bodies, nicotinate and its structural analogs, pyroglutamate, benzoate and its derivatives, and iodide. SMCT2 (encoded by the SLC5A12 gene) is a low-affinity transporter for short-chain fatty acids, lactate, pyruvate, and nicotinate. This subgroup belongs to the solute carrier 5 (SLC5) transporter family.	522
271384	cd11493	SLC5sbd_NIS-like_u1	uncharacterized subgroup of the Na(+)/iodide (NIS) cotransporter subfamily; putative solute-binding domain. Proteins belonging to the same subfamily as this uncharacterized subgroup include i) NIS, which transports I-, and other anions including ClO4-, SCN-, and Br-, ii) SMVT, which transports biotin, pantothenic acid and lipoate, and iii) the Na(+)/monocarboxylate cotransporters SMCT1 and 2. SMCT1 is a high-affinity transporter while SMCT2 is a low-affinity transporter. This subgroup belongs to the solute carrier 5 (SLC5) transporter family.	479
271385	cd11494	SLC5sbd_NIS-like_u2	uncharacterized subgroup of the Na(+)/iodide (NIS) cotransporter subfamily; putative solute-binding domain. Proteins belonging to the same subfamily as this uncharacterized subgroup include i) NIS, which transports I-, and other anions including ClO4-, SCN-, and Br-, ii) SMVT, which transports biotin, pantothenic acid and lipoate, and iii) the Na(+)/monocarboxylate cotransporters, SMCT1 and 2. SMCT1 is a high-affinity transporter while SMCT2 is a low-affinity transporter. This subgroup belongs to the solute carrier 5 (SLC5) transporter family.	473
271386	cd11495	SLC5sbd_NIS-like_u3	uncharacterized subgroup of the Na(+)/iodide (NIS) cotransporter subfamily; putative solute-binding domain. Proteins belonging to the same subfamily as this uncharacterized subgroup include i) NIS, which transports I-, and other anions including ClO4-, SCN-, and Br-, ii) SMVT, which transports biotin, pantothenic acid and lipoate, and iii) the Na(+)/monocarboxylate cotransporters SMCT1 and 2. SMCT1 is a high-affinity transporter while SMCT2 is a low-affinity transporter. This subgroup belongs to the solute carrier 5 (SLC5) transporter family.	473
271387	cd11496	SLC6sbd-TauT-like	Na(+)- and Cl(-)-dependent taurine transporter TauT, and related proteins; solute-binding domain. This subgroup represents the solute-binding domain of TauT-like Na(+)- and Cl(-)-dependent transporters. Family members include: human TauT which transports taurine, human GAT1, GAT2, and GAT3, and BGT1, which transport gamma-aminobutyric acid (GABA), and human CT1 which transports creatine. This subgroup belongs to the solute carrier 6 (SLC6) transporter family.	543
271388	cd11497	SLC6sbd_SERT-like	Na(+)- and Cl(-)-dependent monoamine transporters, SERT, NET, DAT1 and related proteins; solute binding domain. This subgroup represents the solute-binding domain of transmembrane transporters that transport monoamine neurotransmitters from synaptic spaces into presynaptic neurons. Members include: NET which transports norepinephrine, SERT which transports serotonin, and DAT1 which transports dopamine. These transporters may play a role in diseases including depression, anxiety disorders, attention-deficit hyperactivity disorder, and in the control of human behavior and emotional states. This subgroup belongs to the solute carrier 6 (SLC6) transporter family.	537
212067	cd11498	SLC6sbd_GlyT1	Na(+)- and Cl(-)-dependent glycine transporter GlyT1; solute-binding domain. GlyT1 is a membrane-bound transporter that re-uptakes glycine from the synaptic cleft. Human GlyT1 is encoded by the SLC6A9 gene. GlyT1 is expressed in brain, pancreas, uterus, stomach, spleen, liver, and retina. GlyT1 may play a role in schizophrenia. This subgroup belongs to the solute carrier 6 (SLC6) transporter family.	585
271389	cd11499	SLC6sbd_GlyT2	Na(+)- and Cl(-)-dependent glycine transporter GlyT2; solute-binding domain. GlyT2 (also called NET1) is a membrane-bound transporter that re-uptakes glycine from the synaptic cleft. Human GlyT2 is encoded by the SLC6A5 gene. GlyT2 is expressed in brain and spinal cord. GlyT2 may play a role in pain, and in spasticity. This subgroup belongs to the solute carrier 6 (SLC6) transporter family.	597
271390	cd11500	SLC6sbd_PROT	Na(+)- and Cl(-)-dependent L-proline transporter PROT; solute-binding domain. PROT is a high-affinity L-proline transporter that transports L-proline, and may have a role in excitatory neurotransmission. Human PROT is encoded by the SLC6A7 gene, a potential susceptible gene for asthma. PROT is expressed in the brain. This subgroup belongs to the solute carrier 6 (SLC6) transporter family.	541
271391	cd11501	SLC6sbd_ATB0	Na(+)- and Cl(-)-dependent beta-alanine transporter ATB0+; solute-binding domain. ATB0+ (also known as the beta-alanine carrier) is a transmembrane transporter with a broad substrate specificity; it can transport non-alpha-amino acids such as beta-alanine with low affinity, and can transport dipolar and cationic amino acids such as leucine and lysine, with a higher affinity. It may have a role in the absorption of essential nutrients and drugs in the distal regions of the human gastrointestinal tract. Human ATB0+ is encoded by the SLC6A14 gene. ATB0+ is expressed in the lung, trachea, salivary gland, mammary gland, stomach, and pituitary gland. ATB0+ may play a role in obesity, and its upregulation may have a pathogenic role in colorectal cancer. This subgroup belongs to the solute carrier 6 (SLC6) transporter family.	602
271392	cd11502	SLC6sbd_NTT5	Neurotransmitter transporter 5; solute-binding domain. Human NTT5 is encoded by the SLC6A16 gene. NTT5 is expressed in testis, pancreas, and prostate; its expression is predominantly intracellular, indicative of a vesicular location. Its substrates are unknown. This subgroup belongs to the solute carrier 6 (SLC6) transporter family.	535
271393	cd11503	SLC5sbd_NIS	Na(+)/iodide cotransporter NIS and related proteins; solute-binding domain. NIS (product of the SLC5A5 gene) transports I-, and other anions including ClO4-, SCN-, and Br-. NIS is expressed in the thyroid, colon, ovary, and in human breast cancers. It mediates the active transport and the concentration of iodide from the blood into thyroid follicular cells, a fundamental step in thyroid hormone biosynthesis, and is the basis of radioiodine therapy for thyroid cancer. Mutation in the SLC5A5 gene can result in a form of thyroid hormone dysgenesis. Human NIS exists mainly as a dimer stabilized by a disulfide bridge. This subgroup belongs to the solute carrier 5 (SLC5) transporter family.	535
271394	cd11504	SLC5sbd_SMVT	Na(+)/multivitamin cotransporter SMVT and related proteins; solute-binding domain. This multivitamin transporter SMVT (product of the SLC5A6 gene) transports biotin, pantothenic acid and lipoate, and is essential for mediating biotin uptake into mammalian cells. SMVT is expressed in the placenta, intestine, heart, brain, lung, liver, kidney and pancreas. Biotin may regulate its own cellular uptake through participation in holocarboxylase synthetase-dependent chromatin remodeling events at SMVT promoter loci. The cis regulatory elements, Kruppel-like factor 4 and activator protein-2, regulate the activity of the human SMVT promoter in the intestine. Glycosylation of the hSMVT is important for its transport function. This subgroup belongs to the solute carrier 5 (SLC5) transporter family.	527
271395	cd11505	SLC5sbd_SMCT	Na(+)/monocarboxylate cotransporters SMCT1 and 2 and related proteins; solute-binding domain. SMCT1 is a high-affinity transporter of various monocarboxylates including lactate and pyruvate, short-chain fatty acids, ketone bodies, nicotinate and its structural analogs, pyroglutamate, benzoate and its derivatives, and iodide. Human SMCT1 (hSMCT1, also called AIT) is encoded by the tumor suppressor gene SLC5A8. SMCT1 is expressed in the colon, small intestine, kidney, thyroid gland, retina, and brain. SMCT1 may contribute to the intestinal/colonic and oral absorption of monocarboxylate drugs. It also mediates iodide transport from thyrocyte into the colloid lumen in thyroid gland and, through transporting L-lactate and ketone bodies, helps maintain the energy status and the function of neurons. SMCT2 is a low-affinity transporter for short-chain fatty acids, lactate, pyruvate, and nicotinate. hSMCT2 is encoded by the SLC5A12 gene. SMCT2 is expressed in the kidney, small intestine, skeletal muscle, and retina. In the kidney, SMCT2 may initiate lactate absorption in the early parts of the tubule, SMCT1 in the latter parts of the tubule. In the retina, SMCT1 and SMCT2 may play a differential role in monocarboxylate transport in a cell type-specific manner. This subgroup belongs to the solute carrier 5 (SLC5) transporter family.	538
212075	cd11506	SLC6sbd_GAT1	Na(+)- and Cl(-)-dependent GABA transporter 1; solute-binding domain. GAT1 transports gamma-aminobutyric acid (GABA). GABA is the main inhibitory neurotransmitter within the mammalian CNS. Human GAT1 is encoded by the SLC6A1 gene. GAT1 is expressed in brain and peripheral nervous system. The antiepileptic drug, Tiagabine, inhibits GAT1. This subgroup belongs to the solute carrier 6 (SLC6) transporter family.	598
271396	cd11507	SLC6sbd_GAT2	Na(+)- and Cl(-)-dependent GABA transporter 2; solute-binding domain. This family includes human GAT2 (hGAT2) which transports gamma-aminobutyric acid (GABA). GABA is the main inhibitory neurotransmitter within the mammalian CNS. hGAT2 is encoded by the SLC6A13 gene, and is similar to mouse GAT-3, and rat GAT2. hGAT2 is expressed in brain, kidney, lung, and testis. hGAT2 is a potential drug target for treatment of epilepsy. This subgroup belongs to the solute carrier 6 (SLC6) transporter family.	544
212077	cd11508	SLC6sbd_GAT3	Na(+)- and Cl(-)-dependent GABA transporter 3; solute-binding domain. This family includes human GAT3 (hGAT3) a high-affinity transporter of gamma-aminobutyric acid (GABA). GABA is the main inhibitory neurotransmitter within the mammalian CNS. hGAT3 is encoded by the SLC6A11 gene, and is similar to mouse GAT4, and rat GAT3/GATB. GAT3 is expressed primarily in the glia of the brain, and is a potential drug target for antiepileptic drugs. This subgroup belongs to the solute carrier 6 (SLC6) transporter family.	542
271397	cd11509	SLC6sbd_CT1	Na(+)- and Cl(-)-dependent creatine transporter 1; solute-binding domain. CT1 (also called CRTR, CRT) transports creatine. Human CT1 is encoded by the SLC6A8 gene. CT1 is ubiquitously expressed, with highest levels found in skeletal muscle and kidney. Creatine is absorbed from food or synthesized from arginine and plays an important role in energy metabolism. Deficiency in human CT1 leads to X-linked cerebral creatine transporter deficiency. In males, this disorder is characterized by language and speech delays, autistic-like behavior, seizures in about 50% of cases, and can also involve midfacial hypoplasia, and short stature. In females, it is characterized by mild cognitive impairment with behavior and learning problems. This subgroup belongs to the solute carrier 6 (SLC6) transporter family.	589
271398	cd11510	SLC6sbd_TauT	Na(+)- and Cl(-)-dependent taurine transporter; solute-binding domain. TauT is a Na(+)- and Cl(-)-dependent, high-affinity, low-capacity transporter of taurine and beta-alanine. Human TauT is encoded by the SLC6A6 gene. TauT is expressed in brain, retina, liver, kidney, heart, spleen, and pancreas. It may play a part in the supply of taurine to the intestinal epithelium and in the between-meal-capture of taurine. It may also participate in re-absorbing taurine that has been deconjugated from bile acids in the distal lumen. Functional TauT protects kidney cells from nephrotoxicity caused by the chemotherapeutic agent cisplatin; cisplatin down-regulates TauT in a p53-dependent manner. In mice, TauT has been shown to be important for the maintenance of skeletal muscle function and total exercise capacity. TauT-/- mice develop additional clinically important diseases, some of which are characterized by apoptosis, including vision loss, olfactory dysfunction, and chronic liver disease. This subgroup belongs to the solute carrier 6 (SLC6) transporter family.	542
212080	cd11511	SLC6sbd_BGT1	Na(+)- and Cl(-)-dependent betaine/GABA transporter-1, and related proteins; solute-binding domain. BGT1 is a relatively low-affinity transporter of gamma-aminobutyric acid (GABA), and can also transport betaine. GABA is the main inhibitory neurotransmitter within the mammalian CNS. Human BGT1 is encoded by the SLC6A12 gene, and is similar to mouse GAT2. Mouse GAT2 plays a role in transporting GABA across the blood-brain barrier. In addition to being expressed in cells of the central nervous system, BGT1 is expressed in peripheral tissues, including kidney, liver, and heart. An association has been shown between the SLC6A12 gene and the occurrence of aspirin-intolerant asthma, and BGT1 is a drug target for antiepileptic drugs. This subgroup belongs to the solute carrier 6 (SLC6) transporter family.	541
212081	cd11512	SLC6sbd_NET	Na(+)- and Cl(-)-dependent norepinephrine transporter NET; solute-binding domain. NET (also called NAT1, NET1), is a transmembrane transporter that transports the neurotransmitter norepinephrine from synaptic spaces into presynaptic neurons. Human NET is encoded by the SLC6A2 gene. NET is expressed in brain, peripheral nervous system, adrenal gland, and placenta. NET may play a role in diseases or disorders including depression, orthostatic intolerance, anorexia nervosa, cardiovascular diseases, alcoholism, and attention-deficit hyperactivity disorder. This subgroup belongs to the solute carrier 6 (SLC6) transporter family.	560
271399	cd11513	SLC6sbd_SERT	Na(+)- and Cl(-)-dependent serotonin transporter SERT; solute-binding domain. SERT (also called 5-HTT), is a transmembrane transporter that transports the neurotransmitter serotonin from synaptic spaces into presynaptic neurons. The antiport of a K+ ion is believed to follow the transport of serotonin and promote the reorientation of SERT for another transport cycle. Human SERT is encoded by the SLC6A4 gene. SERT is expressed in brain, peripheral nervous system, placenta, epithelium, and platelets. SERT may play a role in diseases or disorders including anxiety, depression, autism, gastrointestinal disorders, premature ejaculation, and obesity. It may also have a role in social cognition. This subgroup belongs to the solute carrier 6 (SLC6) transporter family.	537
212083	cd11514	SLC6sbd_DAT1	Na(+)- and Cl(-)-dependent dopamine transporter 1; solute-binding domain. DAT1 (also called DAT), is a plasma membrane transport protein that functions at the dopaminergic synapses to transport dopamine from the extracellular space back into the presynaptic nerve terminal. Human DAT1 is encoded by the SLC6A3 gene, and is expressed in the brain. DAT1 may play a role in diseases or disorders related to dopaminergic neurons, including attention-deficit hyperactivity disorder (ADHD), Tourette syndrome, Parkinson's disease, alcoholism, drug abuse, schizophrenia, extraversion, and risky behavior. This subgroup belongs to the solute carrier 6 (SLC6) transporter family.	555
271400	cd11515	SLC6sbd_NTT4-like	Na(+)-dependent neurotransmitter transporter 4, and related proteins; solute-binding domain. This subgroup includes the solute-binding domain of NTT4 (also called XT1) and SBAT1 (also called B0AT2, v7-3, NTT7-3); both these proteins can transport neutral amino acids. Human SBAT1 is encoded by the SLC6A15 gene, a susceptibility gene for major depression. SBAT1 is expressed in brain, and may have a role in transporting neurotransmitter precursors into neurons. Human NTT4 is encoded by the SLC6A17 gene. NTT4 is specifically expressed in the nervous system, in synaptic vesicles of glutamatergic and GABAergic neurons, and may play an important role in synaptic transmission. This subgroup belongs to the solute carrier 6 (SLC6) transporter family.	530
212085	cd11516	SLC6sbd_B0AT1	Na(+)-dependent neutral amino acids transporter, B0AT1; solute-binding domain. B0AT1 (also called HND) transports neutral amino acids. Human B0AT1 is encoded by the SLC6A19 gene. B0AT1 is expressed primarily in the kidney and intestine; it requires collectrin for expression in the kidney, and angiotensin-converting enzyme 2 for expression in the intestine. Interaction with these two proteins implicates B0AT1 in more complex processes such as glomerular structure, exocytosis, and blood pressure control. The autosomal recessive disorder, Hartnup disorder, is caused by mutations in B0AT1. This subgroup belongs to the solute carrier 6 (SLC6) transporter family.	581
212086	cd11517	SLC6sbd_B0AT3	glycine transporter, B0AT3; solute-binding domain. B0AT3 (also called Xtrp2, XT2) transports glycine. Human B0AT3 is encoded by the SLC6A18 gene. B0AT3 is expressed in the kidney. Mutations in the SLC6A18 gene may contribute to the autosomal recessive disorder iminoglycinuria and its related disorder hyperglycinuria. SLC6A18 or its neighboring genes are associated with increased susceptibility to myocardial infarction. This subgroup belongs to the solute carrier 6 (SLC6) transporter family.	576
271401	cd11518	SLC6sbd_SIT1	Na(+)- and Cl(-)-dependent imino acid transporter SIT1; solute-binding domain. SIT1 (also called XTRP3, XT3, IMINO) transports imino acids, such as proline, pipecolate, MeAIB, and sarcosine. It has weak affinity for neutral amino acids such as phenylalanine. Human SIT1 is encoded by the SLC6A20 gene. SIT1 is expressed in brain, kidney, small intestine, thymus, spleen, ovary, and lung. SLC6A20 is a candidate gene for the rare disorder iminoglycinuria. This subgroup belongs to the solute carrier 6 (SLC6) transporter family.	576
271402	cd11519	SLC5sbd_SMCT1	Na(+)/monocarboxylate cotransporter SMCT1 and related proteins; solute-binding domain. SMCT1 is a high-affinity transporter of various monocarboxylates including lactate and pyruvate, short-chain fatty acids, ketone bodies, nicotinate and its structural analogs, pyroglutamate, benzoate and its derivatives, and iodide. Human SMCT1 (hSMCT1, also called AIT) is encoded by the tumor suppressor gene SLC5A8. Its expression is under the control of the C/EBP transcription factor. Its tumor-suppressive role is related to uptake of butyrate, propionate, and pyruvate, these latter are inhibitors of histone deacetylases. SMCT1 is expressed in the colon, small intestine, kidney, thyroid gland, retina, and brain. SMCT1 may contribute to the intestinal/colonic and oral absorption of monocarboxylate drugs. SMCT1 also mediates iodide transport from thyrocyte into the colloid lumen in thyroid gland and through transporting l-lactate and ketone bodies helps maintain the energy status and the function of neurons. In the kidney its expression is limited to the S3 segment of the proximal convoluted tubule (in contrast to the low-affinity monocarboxylate transporter SMCT2, belonging to a different family, which is expressed along the entire length of the tubule). In the retina, SMCT1 and SMCT2 may play a differential role in monocarboxylate transport in a cell type-specific manner, SMCT1 is expressed predominantly in retinal neurons and in retinal pigmented epithelial (RPE) cells. This subgroup belongs to the solute carrier 5 (SLC5) transporter family.	542
212089	cd11520	SLC5sbd_SMCT2	Na(+)/monocarboxylate cotransporter SMCT2 and related proteins; solute-binding domain. SMCT2 is a low-affinity transporter for short-chain fatty acids, lactate, pyruvate, and nicotinate. Human SMCT2 (hSMCT2) is encoded by the SLC5A12 gene. SMCT2 is expressed in the kidney, small intestine, skeletal muscle, and retina. In the kidney, it is expressed in the apical membrane of the proximal convoluted tubule, along the entire length of the tubule (in contrast to the high-affinity monocarboxylate transporter SMCT1, belonging to a different family, which is limited to the S3 segment of the tubule). SMCT2 may initiate lactate absorption in the early parts of the tubule. In the retina, SMCT1 and SMCT2 may play a differential role in monocarboxylate transport in a cell type-specific manner, SMCT2 is expressed exclusively in Muller cells. Nicotine transport by hSMCT2 is inhibited by several non-steroidal anti-inflammatory drugs. This subgroup belongs to the solute carrier 5 (SLC5) transporter family.	529
271403	cd11521	SLC6sbd_NTT4	Na(+)-dependent neurotransmitter transporter 4; solute-binding domain. NTT4 (also called XT1) transports the neutral amino acids, proline, glycine, leucine, and alanine, and may play an important role in synaptic transmission. Human NTT4 is encoded by the SLC6A17 gene. NTT4 is specifically expressed in the nervous system, in synaptic vesicles of glutamatergic and GABAergic neurons. This subgroup belongs to the solute carrier 6 (SLC6) transporter family.	589
212091	cd11522	SLC6sbd_SBAT1	Sodium-coupled branched-chain amino-acid transporter 1; solute-binding domain. SBAT1 (also called B0AT2, v7-3, NTT7-3) is a high-affinity Na(+)-dependent transporter for large neutral amino acids, including leucine, isoleucine, valine, proline and methionine. Human SBAT1 is encoded by the SLC6A15 gene, a susceptibility gene for major depression. SBAT1 is expressed in brain, and may have a role in transporting neurotransmitter precursors into neurons. This subgroup belongs to the solute carrier 6 (SLC6) transporter family.	580
212133	cd11523	NTP-PPase	Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain superfamily. This superfamily contains enzymes that hydrolyze the alpha-beta phosphodiester bond of all canonical NTPs into monophosphate derivatives and pyrophosphate (PPi). Divalent ions, such as Mg2+ ion(s), are essential to activate a proposed water nucleophile and stabilize the charged intermediates to facilitate catalysis. These enzymes share a conserved divalent ion-binding motif EXX[E/D] in their active sites. They also share a highly conserved four-helix bundle, where one face forms the active site, while the other participates in oligomer assembly. The four-helix bundle consists of two central antiparallel alpha-helices that can be contained within a single protomer or form upon dimerization. The superfamily members include dimeric dUTP pyrophosphatases (dUTPases; EC 3.6.1.23), the nonspecific NTP-PPase MazG proteins, HisE-encoded phosphoribosyl ATP pyrophosphohydrolase (PRA-PH), fungal histidine biosynthesis trifunctional proteins, and several uncharacterized protein families.	72
211400	cd11524	SYLF	The SYLF domain (also called DUF500), a novel lipid-binding module. The SYLF domain is named after SH3YL1, Ysc84p/Lsb4p, Lsb3p, and plant FYVE, which are proteins that contain it. It is also called DUF500 and is highly conserved from bacteria to mammals. Some members, such as SH3YL1, Ysc84p, and Lsb3p, which represent the best characterized members of the family, also contain an SH3 domain, while family members from plants and stramenopiles also contain a FYVE zinc finger domain. Other members only contain a stand-alone SYLF domain. The SYLF domain of SH3YL1 binds phosphoinositides with high affinity, while the N-terminal SYLF domains of both Ysc84p and Lsb3p have been shown to bind and bundle actin filaments, as well as bind liposomes with high affinity.	194
211401	cd11525	SYLF_SH3YL1_like	The SYLF domain (also called DUF500), a novel lipid-binding module, of SH3 domain containing Ysc84-like 1 (SH3YL1) and similar proteins. This subfamily is composed of yeast Ysc84 (also called LAS17-binding protein 4, Lsb4p) and Lsb3p proteins, vertebrate SH3YL1 (SH3 domain containing Ysc84-like 1), and similar proteins. They contain an N-terminal SYLF domain (also called DUF500) and a C-terminal SH3 domain. SH3YL1 localizes to the plasma membrane and is required for dorsal ruffle formation. Ysc84p localizes to actin patches and plays an important role in actin polymerization during endocytosis. A study of the yeast SH3 domain interactome predicts that Lsb3p and Lsb4p may function as molecular hubs for the assembly of endocytic complexes. The SYLF domain of SH3YL1 binds phosphoinositides with high affinity, while the N-terminal SYLF domains of both Ysc84p and Lsb3p have been shown to bind and bundle actin filaments, as well as bind liposomes with high affinity.	199
211402	cd11526	SYLF_FYVE	The SYLF domain (also called DUF500), a novel lipid-binding module, of FYVE zinc finger domain containing proteins. This subfamily is composed of uncharacterized proteins from plants and stramenopiles containing a FYVE zinc finger domain followed by a SYLF domain (also called DUF500). The SYLF domain of the related protein, SH3YL1, binds phosphoinositides with high affinity, while the N-terminal SYLF domains of both Ysc84p and Lsb3p have been shown to bind and bundle actin filaments, as well as bind liposomes with high affinity.	201
212134	cd11527	NTP-PPase_dUTPase	Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain found in dimeric 2-Deoxyuridine 5'-triphosphate nucleotidohydrolase and similar proteins. dUTPase (dUTP pyrophosphatase; EC 3.6.1.23) catalyzes the hydrolysis of dUTP to dUMP and pyrophosphate. It acts to ensure chromosomal integrity by reducing the effective ratio of dUTP/dTTP. Members in this family are dimeric dUTPases, such as those from Leishmania major, Trypanosoma cruzi, and Campylobacter jejuni, which differ from the monomeric and trimeric forms and adopt an all-alpha topology. A central four-helix bundle, consisting of two alpha-helices from the rigid domain and two helices from the mobile domain and connecting loops, form the active site in dimeric dUTPase-like proteins, requiring the presence of metal ion cofactors to hydrolyze both dUTP and dUDP.	94
212135	cd11528	NTP-PPase_MazG_Nterm	Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) N-terminal tandem-domain of MazG proteins from Escherichia coli and bacterial homologs. MazG is a NTP-PPase that hydrolyzes all canonical NTPs into their corresponding nucleoside monophosphates and pyrophosphate. The prototype of this family is MazG proteins from Escherichia coli (EcMazG) that represents the most abundant form consisting two sequence-related domains in tandem, this family corresponding to the N-terminal MazG-like domain. EcMazG functions as a regulator of cellular response to starvation by lowering the cellular concentration of guanosine 3',5'-bispyrophosphate (ppGpp). EcMazG exists as a dimer; each monomer contains two tandem MazG-like domains with similarly folded globular structures. However, only the C-terminal domain has well-ordered active site and exhibits an NTPase activity responsible for the regulation of bacterial cell survival under nutritional stress. Divalent ions, such as Mg2+ or Mn2+, are required for activity; however, this domain does not exhibit an NTPase activity despite containing structural features such as the EEXX(E/D) motif and key basic catalytic residues responsible for nucleotide pyrophosphohydrolysis activity. It is suggested that the N-terminal domain of EcMazG might have a house-cleaning function by hydrolyzing noncanonical NTPs whose incorporation into the nascent DNA leads to increased mutagenesis and DNA damage.	114
212136	cd11529	NTP-PPase_MazG_Cterm	Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) C-terminal tandem-domain of MazG proteins from Escherichia coli and bacterial homologs. MazG is a NTP-PPase that hydrolyzes all canonical NTPs into their corresponding nucleoside monophosphates and pyrophosphate. The prototype of this family is MazG proteins from Escherichia coli (EcMazG) that represents the most abundant form consisting of two sequence-related domains in tandem, this family corresponding to the C-terminal MazG-like domain. EcMazG functions as a regulator of cellular response to starvation by lowering the cellular concentration of guanosine 3',5'-bispyrophosphate (ppGpp). EcMazG exists as a dimer. Each monomer contains two tandem MazG-like domains with similarly folded globular structures. However, only the C-terminal domain has well-ordered active sites and exhibits an NTPase activity responsible for the regulation of bacterial cell survival under nutritional stress. Divalent ions, such as Mg2+ or Mn2+, are required for activity, along with structural features such as EEXX(E/D) motifs and key basic catalytic residues. It has been shown that the C-terminus NTPase activity is responsible for regulation of bacterial cell survival under nutritional stress.	116
212137	cd11530	NTP-PPase_DR2231_like	Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain found in Deinococcus radiodurans DR2231 protein and its bacterial homologs. This family includes a MazG-like NTP-PPase from Deinococcus radiodurans (DR2231), a putative NTP-PPase YP_001813558.1 from Exiguobacterium sibiricum and their bacterial homologs. DR2231 shows significant structural resemblance to MazG proteins, but is functionally related to the dimeric dUTPases. It can hydrolyze dUTP into dUMP. DR2231-like proteins contain a well conserved divalent ion binding motif, EXXEX(12-28)EXXD, which is the identity signature for the all-alpha-helical NTP-PPase superfamily. Unlike normal dimeric dUTPase-like proteins with a central four-helix bundle forming the active site, YP_001813558.1 displays a very unusual interlaced segment-swapped dimer. It potentially prefers to hydrolyze dCTPs or its derivatives. YP_001813558.1-like proteins contain a variant divalent ion binding motif, EXXEX(12-28)AXXD.	88
212138	cd11531	NTP-PPase_BsYpjD	Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain putative pyrophosphatase YpjD from Bacillus subtilis and its bacterial homologs. This family includes a putative pyrophosphatase YpjD from Bacillus subtilis (BsYpjD) and its homologs. Although its biological role has not been described in detail, BsYpjD shows significant sequence similarity to the dimeric 2-deoxyuridine 5'-triphosphate nucleotidohydrolase (dUTP pyrophosphatase or dUTPase) and NTP-PPase MazG proteins. However, unlike typical tandem-domain MazG proteins, BsYpjD contains a single MazG-like domain.	93
212139	cd11532	NTP-PPase_COG4997	Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain found in a group of uncharacterized proteins from archaea and bacteria. The family includes some uncharacterized hypothetical proteins from archaea and bacteria. Although their biological roles remain unclear, the family members show significant sequence similarity to the dimeric 2-deoxyuridine 5'-triphosphate nucleotidohydrolase (dUTP pyrophosphatase or dUTPase) and NTP-PPase MazG proteins. However, unlike typical tandem-domain MazG proteins, the family contains a single MazG-like domain.	95
212140	cd11533	NTP-PPase_Af0060_like	Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain found in uncharacterized protein from Archaeoglobus fulgidus (Af0060) and its bacterial homologs. This family includes an uncharacterized protein from Archaeoglobus fulgidus (Af0060) and its homologs from bacteria. Although its biological role remains unclear, Af0060 shows high sequence similarity to the dimeric 2-deoxyuridine 5'-triphosphate nucleotidohydrolase (dUTP pyrophosphatase or dUTPase) and NTP-PPase MazG proteins. However, unlike typical tandem-domain MazG proteins, members in this family consist of a single MazG-like domain that contains a well conserved divalent ion-binding motif EXX[E/D].	75
212141	cd11534	NTP-PPase_HisIE_like	Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain found in Escherichia coli phosphoribosyl-ATP pyrophosphohydrolase (HisIE or PRATP-PH) and its homologs. This family includes Escherichia coli phosphoribosyl-ATP pyrophosphohydrolase, HisIE, and its homologs from all three kingdoms of life. E. coli HisIE is encoded by the hisIE gene, which is formed by the hisE gene fused to hisI. HisIE is a bifunctional enzyme responsible for the second and third steps of the histidine-biosynthesis pathway. Its N-terminal and C-terminal domains have phosphoribosyl-AMP cyclohydrolase (HisI) and phosphoribosyl-ATP pyrophosphohydrolase (HisE or PRATP-PH) activity, respectively. This family corresponds to the C-terminal domain of HisIE and includes many hisE gene encoding proteins, all of which show significant sequence similarity to Mycobacterium tuberculosis phosphoribosyl-ATP pyrophosphohydrolase (HisE or PRATP-PH). These proteins may be responsible for only the second step in the histidine-biosynthetic pathway, irreversibly hydrolyzing phosphoribosyl-ATP (PRATP) to phosphoribosyl-AMP (PRAMP) and pyrophosphate.	84
212142	cd11535	NTP-PPase_SsMazG	Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain found in Sulfolobus solfataricus (Ss) and its homologs from archaea and bacteria. This family includes a MazG-like protein from Sulfolobus solfataricus (SsMazG) and its homologs from archaea and bacteria. Although its biological roles remain unclear, SsMazG shows significant sequence similarity to the NTP-PPase MazG proteins. However, unlike typical tandem-domain MazG proteins, SsMazG contains a single MazG-like domain. It is predicted that SsMazG might participate in house-cleaning by preventing incorporation of the oxidation product 2-oxo-(d)ATP (iso-dGTP), a mutagenic derivative of ATP, into DNA.	76
212143	cd11536	NTP-PPase_iMazG	Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain found in integron-associated MazG (iMazG) proteins. This family corresponds to the iMazG proteins representing a new subfamily of MazG NTP-PPases. iMazG is likely to act as a house-cleaning enzyme capable of removing aberrant dNTPs, preventing the incorporation of damaging non-canonical nucleotides into host-cell DNA. It can convert dNTP to dNMP and pyrophosphate by cleaving between the alpha- and beta-phosphates of its dNTP substrates, with a marked preference for dCTP and dATP. Unlike typical tandem-domain MazG proteins, iMazG contains a single MazG-like domain and functions as a tetramer (a dimer of dimers) with a typical four-helical bundle. The divalent ions, such as Mg2+, are required for its pyrophosphatase activity.	90
212144	cd11537	NTP-PPase_RS21-C6_like	Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain found in mouse RS21-C6 protein and its homologs. RS21-C6 proteins, highly expressed in all vertebrate genomes and green plants, act as house-cleaning enzymes, removing 5-methyl dCTP (m5dCTP) in order to prevent gene silencing. They show significant sequence similarity to the dimeric 2-deoxyuridine 5'-triphosphate nucleotidohydrolase (dUTP pyrophosphatase or dUTPase) and NTP-PPase MazG proteins. However, unlike typical tandem-domain MazG proteins, RS21-C6 contains a single MazG-like domain and functions as a tetramer (a dimer of dimers) with a typical four-helical bundle. Divalent ions, such as Mg2+, are required for its pyrophosphatase activity. This family also includes a pyrophosphatase from Archaeoglobus fulgidus (Af1178). Although its biological role remains unclear, Af1178 shows significant sequence similarity to the mouse RS21-C6 protein.	90
212145	cd11538	NTP-PPase_u1	Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain found in a group of uncharacterized proteins from bacteria. This family corresponds to a group of uncharacterized hypothetical proteins from bacteria, showing a high sequence similarity to the dimeric 2-deoxyuridine 5'-triphosphate nucleotidohydrolase (dUTP pyrophosphatase or dUTPase) and NTP-PPase MazG proteins. However, unlike typical tandem-domain MazG proteins, members in this family consist of a single MazG-like domain that contains a well conserved divalent ion-binding motif EXX[E/D].	97
212146	cd11539	NTP-PPase_u2	Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain found in a group of uncharacterized proteins from bacteria and archaea. The family corresponds to a group of uncharacterized hypothetical proteins from bacteria and archaea, showing a high sequence similarity to the dimeric 2-deoxyuridine 5'-triphosphate nucleotidohydrolase (dUTP pyrophosphatase or dUTPase) and NTP-PPase MazG proteins. However, unlike typical tandem-domain MazG proteins, members in this family consist of a single MazG-like domain that contains a well conserved divalent ion-binding motif EXX[E/D].	85
212147	cd11540	NTP-PPase_u3	Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain found in a group of uncharacterized proteins from bacteria and archaea. This family corresponds to a group of uncharacterized hypothetical proteins from bacteria and archaea, showing a high sequence similarity to the dimeric 2-deoxyuridine 5'-triphosphate nucleotidohydrolase (dUTP pyrophosphatase or dUTPase) and NTP-PPase MazG proteins. However, unlike typical tandem-domain MazG proteins, members in this family consist of a single MazG-like domain that contains a well conserved divalent ion-binding motif EXX[E/D].	76
212148	cd11541	NTP-PPase_u4	Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain found in a group of uncharacterized proteins from bacteria and archaea. This family corresponds to a group of uncharacterized hypothetical proteins from bacteria, showing a high sequence similarity to the dimeric 2-deoxyuridine 5'-triphosphate nucleotidohydrolase (dUTP pyrophosphatase or dUTPase) and NTP-PPase MazG proteins. However, unlike typical tandem-domain MazG proteins, members in this family consist of a single MazG-like domain that contains a well conserved divalent ion-binding motif EXX[E/D].	91
212149	cd11542	NTP-PPase_u5	Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain found in a group of uncharacterized proteins from bacteria and archaea. This family corresponds to a group of uncharacterized hypothetical proteins from bacteria, showing a high sequence similarity to the dimeric 2-deoxyuridine 5'-triphosphate nucleotidohydrolase (dUTP pyrophosphatase or dUTPase) and NTP-PPase MazG proteins. However, unlike typical tandem-domain MazG proteins, members in this family consist of a single MazG-like domain that contains a well conserved divalent ion-binding motif EXX[E/D].	99
212150	cd11543	NTP-PPase_u6	Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain found in a group of uncharacterized proteins from bacteria and archaea. This family corresponds to a group of uncharacterized hypothetical proteins from bacteria, showing a high sequence similarity to the dimeric 2-deoxyuridine 5'-triphosphate nucleotidohydrolase (dUTP pyrophosphatase or dUTPase) and NTP-PPase MazG proteins. However, unlike typical tandem-domain MazG proteins, members in this family consist of a single MazG-like domain.	87
212151	cd11544	NTP-PPase_DR2231	Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain found in Deinococcus radiodurans DR2231 protein and its bacterial homologs. This family corresponds to the DR2231 protein, a MazG-like NTP-PPase from Deinococcus radiodurans, and its bacterial homologs. All family members contain a well-conserved divalent ion binding motif, EXXEX(12-28)EXXD, which is the identity signature for the all-alpha-helical NTP-PPase superfamily. DR2231 shows significant structural resemblance to MazG proteins, but is functionally related to the dimeric dUTPases. It might be an evolutionary precursor of dimeric dUTPases with very high specificity in hydrolyzing dUTP into dUMP, but an inability to hydrolyze dTTP, a typical feature of dUTPases. Moreover, unlike the dUTPase monomer containing a single active site, the DR2231 protein dimer holds two putative active sites.	116
212152	cd11545	NTP-PPase_YP_001813558	Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain found in Exiguobacterium sibiricum YP_001813558.1 protein and its bacterial homologs. This family contains a putative NTP-PPase (YP_001813558.1) from Exiguobacterium sibiricum and its bacterial homologs. Unlike normal dimeric dUTPase-like proteins with a central four-helix bundle forming the active site, YP_001813558.1 displays a very unusual interlaced segment-swapped dimer that might be important for it to adapt to an extremely cold environment. Moreover, structural analysis and comparisons indicate that YP_001813558.1 potentially prefers to hydrolyze dCTPs or its derivatives.	115
212153	cd11546	NTP-PPase_His4	Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain found in His4-like fungal histidine biosynthesis trifunctional proteins and their homologs. This family includes fungal histidine biosynthesis trifunctional proteins and their homologs from eukaryotes and bacteria. Some family members contain three domains responsible for phosphoribosyl-AMP cyclohydrolase (PRAMP-CH), phosphoribosyl-ATP pyrophosphohydrolase (PRATP-PH), and histidinol dehydrogenase (Histidinol-DH) activity, respectively. Some others do not have Histidinol-DH domain, but have an additional N-terminal TIM phosphate binding domain. This family corresponds to the domain for PRATP-PH activity, which shows significant sequence similarity to Mycobacterium tuberculosis PRATP-PH that catalyzes the second step in the histidine-biosynthetic pathway, irreversibly hydrolyzing phosphoribosyl-ATP (PRATP) to phosphoribosyl-AMP (PRAMP) and pyrophosphate.	84
212154	cd11547	NTP-PPase_HisE	Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain found in Mycobacterium tuberculosis phosphoribosyl-ATP pyrophosphohydrolase (HisE or PRATP-PH) and its bacterial homologs. This family includes M. tuberculosis phosphoribosyl-ATP pyrophosphohydrolase (HisE or PRATP-PH) and its bacterial homologs. M. tuberculosis HisE is encoded by the hisE gene, which is a separate gene present in many bacteria and archaea but is fused to hisI in other bacteria, fungi and plants. HisE is responsible for the second step in the histidine-biosynthetic pathway. It can irreversibly hydrolyze phosphoribosyl-ATP (PRATP) to phosphoribosyl-AMP (PRAMP) and pyrophosphate. HisE dimerizes into a four alpha-helix bundle, forming two inferred PRATP active sites on the outer faces. M. tuberculosis HisE has been found to be essential for growth in vitro, thus making it a potential drug target for tuberculosis.	86
211389	cd11548	NodZ_like	Alpha 1,6-fucosyltransferase similar to Bradyrhizobium NodZ. Bradyrhizobium NodZ is an alpha 1,6-fucosyltransferase involved in the biosynthesis of the nodulation factor, a lipo-chitooligosaccharide formed by three-to-six beta-1,4-linked N-acetyl-d-glucosamine (GlcNAc) residues and a fatty acid acyl group attached to the nitrogen atom at the non-reducing end. NodZ transfers L-fucose from the GDP-beta-L-fucose donor to the reducing residue of the chitin oligosaccharide backbone, before the attachment of a fatty acid group. O-fucosyltransferase-like proteins are GDP-fucose dependent enzymes with similarities to the family 1 glycosyltransferases (GT1). They are soluble ER proteins that may be proteolytically cleaved from a membrane-associated preprotein, and are involved in the O-fucosylation of protein substrates, the core fucosylation of growth factor receptors, and other processes.	287
211403	cd11549	Serine_rich_CAS	Serine rich Four helix bundle domain of CAS (Crk-Associated Substrate) scaffolding proteins; a protein interaction module. CAS proteins function as molecular scaffolds to regulate protein complexes that are involved in many cellular processes including migration, chemotaxis, apoptosis, differentiation, and progenitor cell function. They mediate the signaling of integrins at focal adhesions where they localize, and thus, regulate cell invasion and survival. Over-expression of these proteins is implicated in poor prognosis, increased metastasis, and resistance to chemotherapeutics in many cancers such as breast, lung, melanoma, and glioblastoma. CAS proteins have also been linked to the pathogenesis of inflammatory disorders, Alzheimer's, Parkinson's, and developmental defects. They share a common domain structure containing protein interaction modules that enable their scaffolding function, including an N-terminal SH3 domain, an unstructured substrate domain that contains many YxxP motifs, a serine-rich four-helix bundle, and a FAT-like C-terminal domain. Vertebrates contain four CAS proteins: BCAR1 (or p130Cas), NEDD9 (or HEF1), EFS (or SIN), and CASS4 (or HEPL). CAS proteins associate with the 14-3-3 family; this interaction is regulated by integrin-mediated cell adhesion. The serine rich four helix bundle domain of BCAR1 has been shown to bind 14-3-3 in a phosphorylation-dependent manner. This domain is structurally similar to other helical bundles found in cell adhesion components such as alpha-catenin, vinculin, and FAK, and may bind other proteins in addition to the 14-3-3 family.	159
211404	cd11550	Serine_rich_NEDD9	Serine rich Four helix bundle domain of CAS (Crk-Associated Substrate) scaffolding protein, Neural precursor cell Expressed, Developmentally Down-regulated 9; a protein interaction module. NEDD9 is also called human enhancer of filamentation 1 (HEF1) or CAS-L (Crk-associated substrate in lymphocyte). It was first described as a gene predominantly expressed in early embryonic brain, and was also isolated from a screen of human proteins that regulate filamentous budding in yeast, and as a tyrosine phosphorylated protein in lymphocytes. It promotes metastasis in different solid tumors. NEDD9 localizes in focal adhesions and associates with FAK and Abl kinase. It also interacts with SMAD3 and the proteasomal machinery which allows its rapid turnover; these interactions are not shared by other CAS proteins. CAS proteins function as molecular scaffolds to regulate protein complexes that are involved in many cellular processes. They share a common domain structure containing protein interaction modules that enable their scaffolding function, including an N-terminal SH3 domain, an unstructured substrate domain that contains many YxxP motifs, a serine-rich four-helix bundle, and a FAT-like C-terminal domain. CAS proteins associate with the 14-3-3 family; this interaction is regulated by integrin-mediated cell adhesion. The serine rich four helix bundle domain of BCAR1, another CAS protein, has been shown to bind 14-3-3 in a phosphorylation-dependent manner. This domain is structurally similar to other helical bundles found in cell adhesion components such as alpha-catenin, vinculin, and FAK, and may bind other proteins in addition to the 14-3-3 family.	162
211405	cd11551	Serine_rich_CASS4	Serine rich Four helix bundle domain of CAS (Crk-Associated Substrate) scaffolding protein family member 4; a protein interaction module. CASS4, also called HEPL (HEF1-EFS-p130Cas-like), localizes to focal adhesions and plays a role in regulating FAK activity, focal adhesion integrity, and cell spreading. It is most abundant in blood cells and lung tissue, and is also found in high levels in leukemia and ovarian cell lines. CAS proteins function as molecular scaffolds to regulate protein complexes that are involved in many cellular processes. They share a common domain structure containing protein interaction modules that enable their scaffolding function, including an N-terminal SH3 domain, an unstructured substrate domain that contains many YxxP motifs, a serine-rich four-helix bundle, and a FAT-like C-terminal domain. CAS proteins associate with the 14-3-3 family; this interaction is regulated by integrin-mediated cell adhesion. The serine rich four helix bundle domain of BCAR1, another CAS protein, has been shown to bind 14-3-3 in a phosphorylation-dependent manner. This domain is structurally similar to other helical bundles found in cell adhesion components such as alpha-catenin, vinculin, and FAK, and may bind other proteins in addition to the 14-3-3 family.	159
211406	cd11552	Serine_rich_BCAR1	Serine rich Four helix bundle domain of CAS (Crk-Associated Substrate) scaffolding protein, Breast Cancer Anti-estrogen Resistance 1; a protein interaction module. BCAR1, also called p130cas or CASS1, is the founding member of the CAS family of scaffolding proteins and was originally identified through its ability to associate with Crk. The name BCAR1 was designated because the human gene was identified in a screen for genes that promote resistance to tamoxifen. It is widely expressed and its deletion is lethal in mice. It plays a role in regulating cell motility, survival, proliferation, transformation, cancer progression, and bacterial pathogenesis. CAS proteins function as molecular scaffolds to regulate protein complexes that are involved in many cellular processes. They share a common domain structure containing protein interaction modules that enable their scaffolding function, including an N-terminal SH3 domain, an unstructured substrate domain that contains many YxxP motifs, a serine-rich four-helix bundle, and a FAT-like C-terminal domain. CAS proteins associate with the 14-3-3 family; this interaction is regulated by integrin-mediated cell adhesion. The serine rich four helix bundle domain of BCAR1 has been shown to bind 14-3-3 in a phosphorylation-dependent manner. This domain is structurally similar to other helical bundles found in cell adhesion components such as alpha-catenin, vinculin, and FAK, and may bind other proteins in addition to the 14-3-3 family.	157
212092	cd11554	SLC6sbd_u2	uncharacterized eukaryotic solute carrier 6 subfamily; solute-binding domain. SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporter family or Na+/Cl--dependent transporter family) include neurotransmitter transporters (NTTs): these are sodium- and chloride-dependent plasma membrane transporters for the monoamine neurotransmitters serotonin (5-hydroxytryptamine), dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. These NTTs are widely expressed in the mammalian brain, and are involved in regulating neurotransmitter signaling and homeostasis, and are the target of a range of therapeutic drugs for the treatment of psychiatric diseases. Bacterial members of the SLC6 family include the LeuT amino acid transporter.	406
271404	cd11555	SLC-NCS1sbd_u1	uncharacterized nucleobase-cation-symport-1 (NCS1) transporter subfamily; solute-binding domain. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. NCS1s belong to a superfamily which also contains the solute carrier 5 family sodium/glucose transporters (SLC5s), and solute carrier 6 family neurotransmitter transporters (SLC6s).	461
271405	cd11556	SLC6sbd_SERT-like_u1	uncharacterized subgroup of the SERT-like Na(+)- and Cl(-)-dependent monoamine transporter subfamily; solute binding domain. SERT-like Na(+)- and Cl(-)-dependent monoamine transporters transport monoamine neurotransmitters from synaptic spaces into presynaptic neurons. Members include: the norepinephrine transporter NET, the serotonin transporter SERT, and the dopamine transporter DAT1. These latter may play a role in diseases or disorders including depression, anxiety disorders, and attention-deficit hyperactivity disorder, and in the control of human behavior and emotional states. They belong to the solute carrier 6 (SLC6) transporter family. Members of this subgroup are uncharacterized.	552
211407	cd11557	ST7	Suppression of tumorigenicity 7. ST7 is a metazoan protein that behaves as a tumor suppressor in human cancer cells. It appears to localize to the cytoplasm and plasma membrane, and may mediate tumor suppression by regulating genes that are involved in oncogenic pathways and/or maintain cellular structure. It has been suggested that the suppression of tumorigenicity is associated with a function in mediating the remodeling of the extracellular matrix. However, somatic mutations of ST7 have not been observed as being commonly associated with molecular pathogenesis in various human neoplasias.	458
211396	cd11558	W2_eIF2B_epsilon	C-terminal W2 domain of eukaryotic translation initiation factor 2B epsilon. eIF2B is a heteropentameric complex which functions as a guanine nucleotide exchange factor in the recycling of eIF-2 during the initiation of translation in eukaryotes. The epsilon and gamma subunits are sequence similar and both are essential in yeast. Epsilon appears to be the catalytically active subunit, with gamma enhancing its activity. The C-terminal domain of the eIF2B epsilon subunit contains bipartite motifs rich in acidic and aromatic residues, which are responsible for the interaction with eIF2. The structure of the domain resembles that of a set of concatenated HEAT repeats.	169
211397	cd11559	W2_eIF4G1_like	C-terminal W2 domain of eukaryotic translation initiation factor 4 gamma 1 and similar proteins. eIF4G1 is a component of the multi-subunit eukaryotic translation initiation factor 4F, which facilitates recruitment of the mRNA to the ribosome, a rate-limiting step during translation initiation. This C-terminal domain, whose structure resembles that of a set of concatenated HEAT repeats, has been associated with binding to/recruiting the kinase Mnk1, which phosphorylates eIF4E.	134
211398	cd11560	W2_eIF5C_like	C-terminal W2 domain of the eukaryotic translation initiation factor 5C and similar proteins. eIF5C appears to be essential for the initiation of protein translation; its actual function, and specifically that of the C-terminal W2 domain, are not well understood. The Drosophila ortholog, kra (krasavietz) or exba (extra bases), may be involved in translational inhibition in neural development. The structure of this C-terminal domain resembles that of a set of concatenated HEAT repeats.	194
211399	cd11561	W2_eIF5	C-terminal W2 domain of eukaryotic translation initiation factor 5. eIF5 functions as a GTPase acceleration protein (GAP), as well as a GDP dissociation inhibitor (GDI) during translational initiation in eukaryotes. The structure of this C-terminal domain resembles that of a set of concatenated HEAT repeats.	157
211408	cd11564	FAT-like_CAS_C	C-terminal FAT-like Four helix bundle domain, also called DUF3513, of CAS (Crk-Associated Substrate) scaffolding proteins; a protein interaction module. CAS proteins function as molecular scaffolds to regulate protein complexes that are involved in many cellular processes including migration, chemotaxis, apoptosis, differentiation, and progenitor cell function. They mediate the signaling of integrins at focal adhesions where they localize, and thus, regulate cell invasion and survival. Over-expression of these proteins is implicated in poor prognosis, increased metastasis, and resistance to chemotherapeutics in many cancers such as breast, lung, melanoma, and glioblastoma. CAS proteins have also been linked to the pathogenesis of inflammatory disorders, Alzheimer's, Parkinson's, and developmental defects. They share a common domain structure containing protein interaction modules that enable their scaffolding function, including an N-terminal SH3 domain, an unstructured substrate domain that contains many YxxP motifs, a serine-rich four-helix bundle, and a FAT-like C-terminal domain. Vertebrates contain four CAS proteins: BCAR1 (or p130Cas), NEDD9 (or HEF1), EFS (or SIN), and CASS4 (or HEPL). The FAT-like C-terminal domain of CAS proteins binds to the C-terminal domain of NSPs (novel SH2-containing proteins) to form multidomain signaling modules that mediate cell migration and invasion.	126
211318	cd11566	eIF1_SUI1	Eukaryotic initiation factor 1. eIF1/SUI1 (eukaryotic initiation factor 1) plays an important role in accurate initiator codon recognition during translation initiation. eIF1 interacts with 18S rRNA in the 40S ribosomal subunit during eukaryotic translation initiation. Point mutations in the yeast eIF1 implicate the protein in maintaining accurate start-site selection but its mechanism of action is unknown.	84
211319	cd11567	YciH_like	Homologs of eIF1/SUI1 including Escherichia coli YciH. Members of the eIF1/SUI1 (eukaryotic initiation factor 1) family are found in eukaryotes, archaea, and some bacteria; eukaryotic members are understood to play an important role in accurate initiator codon recognition during translation initiation. The function of non-eukaryotic family members is unclear. Escherichia coli YciH is a non-essential protein and was reported to be able to perform some of the functions of IF3 in prokaryotic initiation.	76
211409	cd11568	FAT-like_CASS4_C	C-terminal FAT-like Four helix bundle domain, also called DUF3513, of CAS (Crk-Associated Substrate) scaffolding protein family member 4; a protein interaction module. CASS4, also called HEPL (HEF1-EFS-p130Cas-like), localizes to focal adhesions and plays a role in regulating FAK activity, focal adhesion integrity, and cell spreading. It is most abundant in blood cells and lung tissue, and is also found in high levels in leukemia and ovarian cell lines. CAS proteins function as molecular scaffolds to regulate protein complexes that are involved in many cellular processes. They share a common domain structure containing protein interaction modules that enable their scaffolding function, including an N-terminal SH3 domain, an unstructured substrate domain that contains many YxxP motifs, a serine-rich four-helix bundle, and a FAT-like C-terminal domain, which binds to the C-terminal domain of NSPs (novel SH2-containing proteins) to form multidomain signaling modules that mediate cell migration and invasion.	123
211410	cd11569	FAT-like_BCAR1_C	C-terminal FAT-like Four helix bundle domain, also called DUF3513, of CAS (Crk-Associated Substrate) scaffolding protein, Breast Cancer Anti-estrogen Resistance 1; a protein interaction module. BCAR1, also called p130cas or CASS1, is the founding member of the CAS family of scaffolding proteins and was originally identified through its ability to associate with Crk. The name BCAR1 was designated because the human gene was identified in a screen for genes that promote resistance to tamoxifen. It is widely expressed and its deletion is lethal in mice. It plays a role in regulating cell motility, survival, proliferation, transformation, cancer progression, and bacterial pathogenesis. CAS proteins function as molecular scaffolds to regulate protein complexes that are involved in many cellular processes. They share a common domain structure containing protein interaction modules that enable their scaffolding function, including an N-terminal SH3 domain, an unstructured substrate domain that contains many YxxP motifs, a serine-rich four-helix bundle, and a FAT-like C-terminal domain, which binds to the C-terminal domain of NSPs (novel SH2-containing proteins) to form multidomain signaling modules that mediate cell migration and invasion.	133
211411	cd11570	FAT-like_NEDD9_C	C-terminal FAT-like Four helix bundle domain, also called DUF3513, of CAS (Crk-Associated Substrate) scaffolding protein, Neural precursor cell Expressed, Developmentally Down-regulated 9; a protein interaction module. NEDD9 is also called human enhancer of filamentation 1 (HEF1) or CAS-L (Crk-associated substrate in lymphocyte). It was first described as a gene predominantly expressed in early embryonic brain, and was also isolated from a screen of human proteins that regulate filamentous budding in yeast, and as a tyrosine phosphorylated protein in lymphocytes. It promotes metastasis in different solid tumors. NEDD9 localizes in focal adhesions and associates with FAK and Abl kinase. It also interacts with SMAD3 and the proteasomal machinery which allows its rapid turnover; these interactions are not shared by other CAS proteins. CAS proteins function as molecular scaffolds to regulate protein complexes that are involved in many cellular processes. They share a common domain structure containing protein interaction modules that enable their scaffolding function, including an N-terminal SH3 domain, an unstructured substrate domain that contains many YxxP motifs, a serine-rich four-helix bundle, and a FAT-like C-terminal domain, which binds to the C-terminal domain of NSPs (novel SH2-containing proteins) to form multidomain signaling modules that mediate cell migration and invasion.	128
211412	cd11571	FAT-like_EFS_C	C-terminal FAT-like Four helix bundle domain, also called DUF3513, of CAS (Crk-Associated Substrate) scaffolding protein, Embryonal Fyn-associated Substrate; a protein interaction module. EFS is also called HEFS, CASS3 (CAS scaffolding protein family member 3) or SIN (Src-interacting protein). It was identified based on interactions with the Src kinases, Fyn and Yes. It plays a role in thymocyte development and acts as a negative regulator of T cell proliferation. CAS proteins function as molecular scaffolds to regulate protein complexes that are involved in many cellular processes. They share a common domain structure containing protein interaction modules that enable their scaffolding function, including an N-terminal SH3 domain, an unstructured substrate domain that contains many YxxP motifs, a serine-rich four-helix bundle, and a FAT-like C-terminal domain, which binds to the C-terminal domain of NSPs (novel SH2-containing proteins) to form multidomain signaling modules that mediate cell migration and invasion.	130
211413	cd11572	RlmI_M_like	Middle domain of the SAM-dependent methyltransferase RlmI and related proteins. This middle or central domain is typically found between an N-terminal PUA domain and a C-terminal SAM-dependent methyltransferase domain, such as in the Escherichia coli ribosomal RNA large subunit methyltransferase RlmI (YccW). It may be involved in binding to the RNA substrate.	99
211414	cd11573	GH99_GH71_like	Glycoside hydrolase families 71, 99, and related domains. This superfamily of glycoside hydrolases contains families GH71 and GH99 (following the CAZY nomenclature), as well as other members with undefined function and specificity.	284
211415	cd11574	GH99	Glycoside hydrolase family 99, an endo-alpha-1,2-mannosidase. This family of glycoside hydrolases 99 (following the CAZY nomenclature) includes endo-alpha-1,2-mannosidase (EC 3.2.1.130), which is an important membrane-associated eukaryotic enzyme involved in the maturation of N-linked glycans. Specifically, it cleaves mannoside linkages internal to N-linked glycan chains by hydrolyzing an alpha-1,2-mannosidic bond between a glucose-substituted mannose and the remainder of the chain. The biological function and significance of the soluble bacterial orthologs, which may have obtained the genes via horizontal transfer, is not clear.	338
211416	cd11575	GH99_GH71_like_3	Uncharacterized glycoside hydrolase family 99-like domain. This family of putative glycoside hydrolases resembles glycosyl hydrolase families 71 and 99 (following the CAZY nomenclature) and may share a similar catalytic site and mechanism.	376
211417	cd11576	GH99_GH71_like_2	Uncharacterized glycoside hydrolase family 99-like domain. This family of putative glycoside hydrolases resembles glycosyl hydrolase families 71 and 99 (following the CAZY nomenclature) and may share a similar catalytic site and mechanism. The domain may co-occur with other domains involved in the binding/processing of glycans.	378
211418	cd11577	GH71	Glycoside hydrolase family 71. This family of glycoside hydrolases 71 (following the CAZY nomenclature) function as alpha-1,3-glucanases (mutanases, EC 3.2.1.59). They appear to have an endo-hydrolytic mode of enzymatic activity and bacterial members are investigated as candidates for the development of dental caries treatments. The member from fission yeast, endo-alpha-1,3-glucanase Agn1p, plays a vital role in daughter cell separation, while Agn2p has been associated with endolysis of the ascus wall.	283
211419	cd11578	GH99_GH71_like_1	Uncharacterized glycoside hydrolase family 99-like domain. This family of putative glycoside hydrolases resembles glycosyl hydrolase families 71 and 99 (following the CAZY nomenclature) and may share a similar catalytic site and mechanism.	313
211420	cd11579	Glyco_tran_WbsX	Glycosyl hydrolase family 99-like domain of WbsX-like glycosyltransferases. Members of this domain family are found in proteins within O-antigen biosynthesis clusters in Gram-negative bacteria, where they may function as glycosyl hydrolases and typically co-occur with glycosyltransferase domains. They bear resemblance to GH71 and the GH99 family of alpha-1,2-mannosidases and may share a similar catalytic site and mechanism. The O-antigens are essential lipopolysaccharides in the outer membrane of Gram-negative bacteria and have been linked to pathogenicity.	347
211421	cd11580	eIF2D_N_like	N-terminal domain of eIF2D, malignant T cell-amplified sequence 1 and related proteins. This N-terminal domain of various proteins co-occurs with a PUA domain. Members of this family are: (1) MCTS-1 (malignant T cell-amplified sequence 1) or MCT-1 (multiple copies T cell malignancies), which may play roles in the regulation of the cell cycle, (2) the eukaryotic translation initiation factor 2D, and (3) an uncharacterized archaeal family.	72
212547	cd11581	GINS_A	Alpha-helical domain of GINS complex proteins; Sld5, Psf1, Psf2 and Psf3. The GINS complex is involved in both initiation and elongation stages of eukaryotic chromosome replication, with GINS being the component that most likely serves as the replicative helicase that unwinds duplex DNA ahead of the moving replication fork. In eukaryotes, GINS is a tetrameric arrangement of four subunits Sld5, Psf1, Psf2 and Psf3. The GINS complex has been found in eukaryotes and archaea, but not in bacteria. The four subunits of the complex are homologous and consist of two domains each, termed the alpha-helical (A) and beta-strand (B) domains. The A and B domains of Sld5/Psf1 are permuted with respect to Psf2/Psf3.	103
211424	cd11582	Axin_TNKS_binding	Tankyrase binding N-terminal segment of axin. This N-terminal region of axin mediates interactions with the ankyrin-repeat clusters 2 and 3 of tankyrase, which controls the turnover of axin via poly-ADP-ribosylation. Axin functions as a negative regulator of the WNT signaling pathway.	69
211425	cd11583	Orc6_mid	Middle domain of the origin recognition complex subunit 6. Orc6 is a subunit of the origin recognition complex in eukaryotes, and it may be involved in binding to DNA. This model describes the central or middle domain of Orc6, whose structure resembles that of TFIIB, a DNA-binding transcription factor. Orc6 appears to form distinct complexes with DNA, and a putative DNA-binding site has been identified.	94
211426	cd11585	SATB1_N	N-terminal domain of SATB1 and similar proteins. SATB1, the special AT-rich sequence-binding protein 1, is involved in organizing chromosomal loci into distinct loops, creating a "loopscape" that has a direct bearing on gene expression. This N-terminal domain, which may be involved in various interactions with chromatin proteins, resembles a ubiquitin domain and has been shown to form tetramers, a function critical to SATB1-DNA interactions. The related Drosophila homeobox gene defective proventriculus (dve) plays a key role in the functional specification during endoderm development.	100
212155	cd11586	VbhA_like	VbhA antitoxin and related proteins. VbhA is the antitoxin to VbhT. The VbhT toxin of the mammalian pathogen Bartonella schoenbuchensis is responsible for the disruptive adenylation of host proteins. VbhT also induces FIC-domain-mediated growth arrest in bacteria; it is inhibited by this antitoxin which binds to block the ATP binding site of the VbhT FIC domain.	54
212536	cd11587	Arginase-like	Arginase types I and II and arginase-like family. This family includes arginase, also known as arginase-like amidino hydrolase family, and related proteins, found in bacteria, archaea and eukaryotes. Arginase is a binuclear Mn-dependent metalloenzyme and catalyzes hydrolysis of L-arginine to L-ornithine and urea (Arg, EC 3.5.3.1), the reaction being the fifth and final step in the urea cycle, providing the path for the disposal of nitrogenous compounds. Arginase controls cellular levels of arginine and ornithine which are involved in protein biosynthesis, and in production of creatine, polyamines, proline and nitric oxide. In vertebrates, at least two isozymes have been identified: type I cytoplasmic or hepatic liver-type arginase and type II mitochondrial or non-hepatic arginase. Point mutations in human arginase gene lead to hyperargininemia with consequent mental disorders, retarded development and early death. Arginase is a therapeutic target to treat asthma, erectile dysfunction, atherosclerosis and cancer.	294
212537	cd11589	Agmatinase_like_1	Agmatinase and related proteins. This family includes known and predicted bacterial agmatinase (agmatine ureohydrolase; AUH; SpeB; EC=3.5.3.11), a binuclear manganese metalloenzyme, belonging to the ureohydrolase superfamily. It is a key enzyme in the synthesis of polyamine putrescine; it catalyzes hydrolysis of agmatine to yield urea and putrescine, the precursor for biosynthesis of higher polyamines, spermidine, and spermine. Agmatinase from Deinococcus radiodurans shows approximately 33% of sequence identity to human mitochondrial agmatinase. An analysis of the evolutionary relationship among ureohydrolase superfamily enzymes indicates the pathway involving arginine decarboxylase and agmatinase evolved earlier than the arginase pathway of polyamine.	274
212538	cd11592	Agmatinase_PAH	Agmatinase-like family includes proclavaminic acid amidinohydrolase. This agmatinase subfamily contains bacterial and fungal/metazoan enzymes, including proclavaminic acid amidinohydrolase (PAH, EC 3.5.3.22) and Pseudomonas aeruginosa guanidinobutyrase (GbuA) and guanidinopropionase (GpuA). PAH hydrolyzes amidinoproclavaminate to yield proclavaminate and urea in clavulanic acid biosynthesis. Clavulanic acid is an effective inhibitor of beta-lactamases and is used in combination with amoxicillin to protect the beta-lactam ring of the antibiotic from hydrolysis, thus keeping the antibiotic biologically active. GbuA hydrolyzes 4-guanidinobutyrate (4-GB) into 4-aminobutyrate and urea while GpuA hydrolyzes 3-guanidinopropionate (3-GP) into beta-alanine and urea. Mutation studies show that significant variations in two active site loops in these two enzymes may be important for substrate specificity. This subfamily belongs to the ureohydrolase superfamily, which includes arginase, agmatinase, proclavaminate amidinohydrolase, and formiminoglutamase.	289
212539	cd11593	Agmatinase-like_2	Agmatinase and related proteins. This family includes known and predicted bacterial and archaeal agmatinase (agmatine ureohydrolase; AUH; SpeB; EC=3.5.3.11), a binuclear manganese metalloenzyme that belongs to the ureohydrolase superfamily. It is a key enzyme in the synthesis of polyamine putrescine; it catalyzes hydrolysis of agmatine to yield urea and putrescine, the precursor for biosynthesis of higher polyamines, spermidine, and spermine. As compared to E. coli where two paths to putrescine exist, via decarboxylation of an amino acid, ornithine or arginine, a single path is found in Bacillus subtilis, where polyamine synthesis starts with agmatine; the speE and speB encode spermidine synthase and agmatinase, respectively. The level of agmatinase synthesis is very low, allowing strict control on the synthesis of putrescine and therefore, of all polyamines, consistent with polyamine levels in the cell. This subfamily belongs to the ureohydrolase superfamily, which includes arginase, agmatinase, proclavaminate amidinohydrolase, and formiminoglutamase.	263
212540	cd11598	HDAC_Hos2	Class I histone deacetylases including ScHos2 and SpPhd1. This subfamily includes Class I histone deacetylase (HDAC) Hos2 from Saccharomyces cerevisiae as well as a histone deacetylase Phd1 from Schizosaccharomyces pombe. Hos2 binds to the coding regions of genes during gene activation, specifically it deacetylates the lysines in H3 and H4 histone tails. It is preferentially associated with genes of high activity genome-wide and is shown to be necessary for efficient transcription. Thus, Hos2 is directly required for gene activation in contrast to other class I histone deacetylases. Protein encoded by phd1 is inhibited by trichostatin A (TSA), a specific inhibitor of histone deacetylase, and is involved in the meiotic cell cycle in S. pombe. Class 1 HDACs are Zn-dependent enzymes that catalyze hydrolysis of N(6)-acetyl-lysine residues in histone amino termini to yield a deacetylated histone (EC 3.5.1.98).	311
212541	cd11599	HDAC_classII_2	Histone deacetylases and histone-like deacetylases, classII. This subfamily includes eukaryotic as well as bacterial Class II histone deacetylase (HDAC) and related proteins. Deacetylases of class II are Zn-dependent enzymes that catalyze hydrolysis of N(6)-acetyl-lysine residues of histones (EC 3.5.1.98) and possibly other proteins to yield deacetylated histones/other proteins. In D. discoideum, where four homologs (HdaA, HdaB, HdaC, HdaD) have been identified, HDAC activity is important for regulating the timing of gene expression during development. Also, inhibition of HDAC activity by trichostatin A is shown to cause hyperacetylation of the histone and a delay in cell aggregation and differentiation.	288
212542	cd11600	HDAC_Clr3	Class II Histone deacetylase Clr3 and similar proteins. Clr3 is a class II Histone deacetylase Zn-dependent enzyme that catalyzes hydrolysis of an N(6)-acetyl-lysine residue of a histone to yield a deacetylated histone (EC 3.5.1.98). Clr3 is the homolog of the class-II HDAC Hda1 in S. cerevisiae, and is essential for silencing in heterochromatin regions, such as centromeric regions, ribosomal DNA, the mating-type region and telomeric loci. Clr3 has also been implicated in the regulation of stress-related genes; the histone acetyltransferase, Gcn5, in S. cerevisiae, preferentially acetylates global histone H3K14 while Clr3 preferentially deacetylates H3K14ac, and therefore, interplay between Gcn5 and Clr3 is crucial for the regulation of many stress-response genes.	313
409282	cd11601	Nip7_N-like	N-terminal domain of Nip7 and similar proteins. This domain of various proteins is often found N-terminal to a PUA (PseudoUridine synthase and Archaeosine transglycosylase) RNA binding domain. The family contains Nip7, a protein that was shown to be required for efficient biogenesis of the 60S ribosome subunit in Saccharomyces cerevisiae. Recently, it was demonstrated that human Nip7 is essential in the accurate processing of pre-rRNA. Also included are KD93, a human homolog of Nip7, as well as an archaeal homolog and bacterial RsmB/RsmF family ribosomal methyltransferases.	76
211427	cd11602	Ndc10	Ndc10 component of the yeast centromere-binding factor 3. Ndc10 is a multidomain protein conserved in Saccharomycotina that interacts with kinetochore components. This model characterizes the majority of the protein; some family members may have an additional C-terminal domain that is homologous to transcriptional activators (GCR1_C). Ndc10 is part of the centromere-binding factor 3 (CBF3) complex in budding yeast. The CBF3 complex contains four essential proteins, Ndc10, Cep3, Ctf13, and Skp1. CBF3/Ndc10 is essential for the recruitment of the centromeric nucleosome and formation of the kinetochore. The kinetochore is the large, multiprotein assembly that serves to connect condensed sister chromatids to the mitotic spindle. Ndc10 forms a dimer and it has non-sequence-specific DNA binding activity via the DNA backbone. Ndc10 also plays an important role in the coordination of cell division. It has been noted that the protein bears resemblance to the tyrosine recombinases (type IB topoisomerase/lambda-integrase).	413
211428	cd11603	ThermoDBP	Thermoproteales single-stranded DNA-binding (SSB) domain. ThermoDBP is a SSB protein of the Thermoproteales. SSB proteins are essential for the genome maintenance of all known cellular organisms. Many SSBs contain an OB fold domain, albeit with low sequence conservation, and OB fold-containing SSB proteins have been detected in all three domains of life. However, one group of Crenarchaea, the Thermoproteales, lack SSB encoding genes. The Thermoproteales SSB protein, ThermoDBP, lacks the OB fold and binds specifically to ssDNA with low sequence specificity. Its three-dimensional structure resembles that of the Hut operon positive regulatory protein HutP.	141
211429	cd11604	RTT106_N	histone chaperone RTT106, regulator of Ty1 transposition protein 106; N-terminal homodimerization domain. This cd includes the N-terminal homodimerization domain of Saccharomyces cerevisiae Rtt106, a histone chaperone. In addition to this domain, Rtt106 contains two C-terminal pleckstrin-homology (PH) domains. The acetylation of lysine 56 in histone H3 (H3K56ac) is implicated in regulating nucleosome disassembly during gene transcription, and nucleosome assembly during DNA replication and repair. Rtt106 has been shown to aid in the efficient deposition of newly synthesized H3K56ac onto replicating DNA. The interaction of Rtt106 with (H3-H4)2, most likely in the form of a (H3-H4)2 tetramer, is important for gene silencing and for the DNA damage response. Data supports a combinatorial interaction: this N-terminal domain homodimerizes and intercalates between the two H3-H4 components of the (H3-H4)2 tetramer, independent of acetylation, and the two double PH domains bind the K56-containing region of H3. Acetylation of K56 increases the affinity of the interaction. Rtt106 also interacts with both the SWI/SNF and RSC chromatin remodeling complexes and is involved in their cell-cycle dependent recruitment to histone gene pairs regulated by the HIR co-repressor complex (HTA1-HTB1, HHT1-HHF1, and HHT2-HHF2). Saccharomyces cerevisiae Rtt106 also plays a role in regulating Ty1 transposition.	54
212156	cd11606	COE_DBD	Collier/Olf/Early B-cell factor (EBF) DNA Binding Domain. COE_DBD is the amino-terminal DNA binding domain of the COE protein family. The COE transcription factor is a regulator of development in several organs and tissues that contain the DBD domain as well as IPT/TIG (immunoglobulin-like, Plexins, transcription factors/transcription factor immunoglobulin) and basic helix-loop-helix (bHLH) domains. COE has four members in mammals (COE1-4) with high sequence similarity at the amino-terminal region. COE_DBD requires a zinc ion to bind DNA and contains a zinc finger motif (H-X(3)-C-X(2)-C-X(5)-C) termed the zinc knuckle. COE is homo- or heterodimerized through the bHLH domain to bind DNA. COE1-4 each has a variant due to alternative splicing. However, this alternative splicing does not occur at the DBD domain.	212
211320	cd11607	DENR_C	C-terminal domain of DENR and related proteins. DENR (density regulated protein), together with MCT-1 (multiple copies T cell malignancies), has been shown to have similar function as eIF2D translation initiation factor (also known as ligatin), which is involved in the recruitment and delivery of aminoacyl-tRNAs to the P-site of the eukaryotic ribosome in a GTP-independent manner.	86
211321	cd11608	eIF2D_C	C-terminal domain of eIF2D and related proteins. eIF2D translation initiation factor (also known as ligatin) is involved in the recruitment and delivery of aminoacyl-tRNAs to the P-site of the eukaryotic ribosome in a GTP-independent manner.	85
211422	cd11609	MCT1_N	N-terminal domain of multiple copies T cell malignancies 1 and related proteins. This N-terminal domain of MCT-1 (multiple copies T cell malignancies 1), also known as MCTS-1 (malignant T cell-amplified sequence 1), co-occurs with a PUA domain. MCT-1, together with DENR (density regulated protein), has been shown to have similar function as eIF2D translation initiation factor (also known as ligatin), which is involved in the recruitment and delivery of aminoacyl-tRNAs to the P-site of the eukaryotic ribosome in a GTP-independent manner.	77
211423	cd11610	eIF2D_N	N-terminal domain of eIF2D and related proteins. This N-terminal domain of eIF2D co-occurs with a PUA domain. eIF2D translation initiation factor (also known as ligatin) is involved in the recruitment and delivery of aminoacyl-tRNAs to the P-site of the eukaryotic ribosome in a GTP-independent manner.	76
212157	cd11611	SAF	Domains similar to fish antifreeze type III protein. SAF domains are found in a wide variety of proteins with quite different functions. They are components of enzymes, such as D-altronate-dehydratases or sialic acid synthetases, of antifreeze proteins conserved in fish (where they bind to nascent ice crystals), and may act as periplasmic chaperones in bacterial flagella basal body P-ring formation.	56
212158	cd11613	SAF_AH_GD	Domains similar to fish antifreeze type III protein. Altronate dehydratase (EC 4.2.1.7) converts D-altronate into 2-dehydro-3-deoxy-D-gluconate and is part of a bacterial pathway for the degradation of D-galacturonate. D-galactarate dehydratase (EC 4.2.1.42) eliminates water from D-galactarate to yield 5-dehydro-4-deoxy-D-glucarate, initializing the degradation of D-galactarate. The function of the SAF domain in these enzymes is not clear. It may participate in dimerization.	80
212159	cd11614	SAF_CpaB_FlgA_like	SAF domains of the flagella basal body P-ring formation protein FlgA and the flp pilus assembly CpaB. FlgA is a putative periplasmic chaperone that assists in the formation of the flagellar P ring; CpaB is a protein involved in the assembly of the flp pili, which are bacterial virulence factors mediating non-specific adherence to surfaces; these proteins appear to contain a single SAF domain. This intermediate family also contains the SAF domains of sialic acid synthetases and type III antifreeze proteins, which also share the same extensive core structure.	61
212160	cd11615	SAF_NeuB_like	C-terminal SAF domain of sialic acid synthetase. Sialic acid synthetase (N-acetylneuraminate synthase or N-acetylneuraminate-9-phosphate synthase) catalyzes the condensation of phosphoenolpyruvate with N-acetylmannosamine (ManNAc, in bacteria) or N-acetylmannosamine-6-phosphate (ManNAc-6P, in mammals), to yield N-acetylneuraminic acid (NeuNAc) or N-acetylneuraminic acid-9-phosphate (NeuNAc-9P), respectively. The N-terminal NeuB domain, a TIM-barrel-like structure, contains the catalytic site; the function of the SAF domain is not as clear. It may participate in domain-swapped dimerization and play a role in binding the substrate, in either domain-swapped dimers or by directly interacting with the N-terminal domain. Also included in the family are PEP-sugar pyruvyltransferases known as spore coat polysaccharide biosynthesis proteins (SpsE).	58
212161	cd11616	SAF_DH_OX_like	SAF domain of putative dehydrogenases or oxidoreductases. C-terminal SAF domain of an uncharacterized family of putative dehydrogenases or oxidoreductases, which are otherwise members of the NAD(P)-dependent Rossmann-fold superfamily.	80
212162	cd11617	Antifreeze_III	Type III antifreeze protein, may be specific to the Zoarcoidei. Antifreeze protein III inhibits the growth of ice crystals and protects fish from cold damage in sub-freezing temperatures.	62
211316	cd11618	ChtBD1_1	Hevein or type 1 chitin binding domain; filamentous ascomycete subfamily. Hevein or type 1 chitin binding domain (ChtBD1), a lectin domain found in proteins from plants and fungi that bind N-acetylglucosamine, plant endochitinases, wound-induced proteins such as hevein, a major IgE-binding allergen in natural rubber latex, and the alpha subunit of Kluyveromyces lactis killer toxin. This domain is involved in the recognition and/or binding of chitin subunits; it typically occurs N-terminal to glycosyl hydrolase domains in chitinases, together with other carbohydrate-binding domains, or by itself in tandem-repeat arrangements.	44
212009	cd11619	HR1_CIP4-like	Protein kinase C-related kinase homology region 1 (HR1) Rho-binding domain of Cdc42-Interacting Protein 4 and similar proteins. This subfamily is composed of Cdc42-Interacting Protein 4 (CIP4), Formin Binding Protein 17 (FBP17), FormiN Binding Protein 1-Like (FNBP1L), and similar proteins. CIP4 and FNBP1L are Cdc42 effectors that bind Wiskott-Aldrich syndrome protein (WASP) and function in endocytosis. CIP4 and FBP17 bind to the Fas ligand and may be implicated in the inflammatory response. CIP4 may also play a role in phagocytosis. It functions downstream of Cdc42 in PDGF-dependent actin reorganization and cell migration, and also regulates the activity of PDGFRbeta. It uses Src as a substrate in regulating the invasiveness of breast tumor cells. CIP4 may also play a role in the pathogenesis of Huntington's disease. Members of this subfamily typically contain an N-terminal F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain, central HR1 domain, and a C-terminal SH3 domain. HR1 domains are anti-parallel coiled-coil (ACC) domains that bind small GTPases from the Rho family; the HR1 domain of CIP4 binds Cdc42 and TC10. Translocation of CIP4 is facilitated by its binding to TC10 at the plasma membrane.	77
212010	cd11620	HR1_PKC-like_2_fungi	Second Protein kinase C-related kinase homology region 1 (HR1) Rho-binding domain of fungal Protein Kinase C-like proteins. This subfamily is composed of fungal PKC-like proteins including Pkc1p from Saccharomyces cerevisiae, and Pck1p and Pck2p from Schizosaccharomyces pombe. The yeast PKC-like proteins play a critical role in regulating cell wall biosynthesis and maintaining cell wall integrity. They contain two HR1 domains, C2 and C1 domains, and a kinase domain. This model characterizes the second HR1 domain. HR1 domains are anti-parallel coiled-coil (ACC) domains that bind small GTPases from the Rho family. The HR1 domains of Pck1p and Pck2p interact with GTP-bound Rho1p and Rho2p.	72
212011	cd11621	HR1_PKC-like_1_fungi	First Protein kinase C-related kinase homology region 1 (HR1) Rho-binding domain of fungal Protein Kinase C-like proteins. This subfamily is composed of fungal PKC-like proteins including Pkc1p from Saccharomyces cerevisiae, and Pck1p and Pck2p from Schizosaccharomyces pombe. The yeast PKC-like proteins play a critical role in regulating cell wall biosynthesis and maintaining cell wall integrity. They contain two HR1 domains, C2 and C1 domains, and a kinase domain. This model characterizes the first HR1 domain. HR1 domains are anti-parallel coiled-coil (ACC) domains that bind small GTPases from the Rho family. The HR1 domains of Pck1p and Pck2p interact with GTP-bound Rho1p and Rho2p.	72
212012	cd11622	HR1_PKN_1	First Protein kinase C-related kinase homology region 1 (HR1) Rho-binding domain of Protein Kinase N. PKN, also called Protein-kinase C-related kinase (PRK), is a serine/threonine protein kinase that can be activated by the small GTPase Rho, and by fatty acids such as arachidonic and linoleic acids. It is involved in many biological processes including cytoskeletal regulation, cell adhesion, vesicle transport, glucose transport, regulation of meiotic maturation and embryonic cell cycles, signaling to the nucleus, and tumorigenesis. In some vertebrates, there are three PKN isoforms from different genes (designated PKN1, PKN2, and PKN3), which show different enzymatic properties, tissue distribution, and varied functions. PKN proteins contain three HR1 domains, a C2 domain, and a kinase domain. This model characterizes the first HR1 domain of PKN. HR1 domains are anti-parallel coiled-coil (ACC) domains that bind small GTPases from the Rho family.	66
212013	cd11623	HR1_PKN_2	Second Protein kinase C-related kinase homology region 1 (HR1) Rho-binding domain of Protein Kinase N. PKN, also called Protein-kinase C-related kinase (PRK), is a serine/threonine protein kinase that can be activated by the small GTPase Rho, and by fatty acids such as arachidonic and linoleic acids. It is involved in many biological processes including cytoskeletal regulation, cell adhesion, vesicle transport, glucose transport, regulation of meiotic maturation and embryonic cell cycles, signaling to the nucleus, and tumorigenesis. In some vertebrates, there are three PKN isoforms from different genes (designated PKN1, PKN2, and PKN3), which show different enzymatic properties, tissue distribution, and varied functions. PKN proteins contain three HR1 domains, a C2 domain, and a kinase domain. This model characterizes the second HR1 domain of PKN. HR1 domains are anti-parallel coiled-coil (ACC) domains that bind small GTPases from the Rho family.	71
212014	cd11624	HR1_Rhophilin	Protein kinase C-related kinase homology region 1 (HR1) Rho-binding domain of Rhophilin. Rhophilins are scaffolding proteins that function as effectors of the Rho family of small GTPases. Vertebrates harbor two proteins, Rhophilin-1 and Rhophilin-2, whose exact functions are yet to be determined. Rhophilin-1 has been implicated in sperm motility. Rhophilin-2 regulates the organization of the actin cytoskeleton. Rhophilins contain N-terminal HR1, central Bro1-like, and C-terminal PDZ domains; all are protein-interacting domains. HR1 domains are anti-parallel coiled-coil (ACC) domains that bind small GTPases from the Rho family; both Rhophilin-1 and Rhophilin-2 bind RhoA, and Rhophilin-2 has also been shown to bind RhoB.	76
212015	cd11625	HR1_PKN_3	Third Protein kinase C-related kinase homology region 1 (HR1) Rho-binding domain of Protein Kinase N. PKN, also called Protein-kinase C-related kinase (PRK), is a serine/threonine protein kinase that can be activated by the small GTPase Rho, and by fatty acids such as arachidonic and linoleic acids. It is involved in many biological processes including cytoskeletal regulation, cell adhesion, vesicle transport, glucose transport, regulation of meiotic maturation and embryonic cell cycles, signaling to the nucleus, and tumorigenesis. In some vertebrates, there are three PKN isoforms from different genes (designated PKN1, PKN2, and PKN3), which show different enzymatic properties, tissue distribution, and varied functions. PKN proteins contain three HR1 domains, a C2 domain, and a kinase domain. This model characterizes the third HR1 domain of PKN. HR1 domains are anti-parallel coiled-coil (ACC) domains that bind small GTPases from the Rho family.	74
212016	cd11626	HR1_ROCK	Protein kinase C-related kinase homology region 1 (HR1) Rho-binding domain of Rho-associated coiled-coil containing protein kinase. ROCK is also referred to as Rho-associated kinase or simply as Rho kinase. It is a serine/threonine protein kinase that is activated via interaction with Rho GTPases and is involved in many cellular functions including contraction, adhesion, migration, motility, proliferation, and apoptosis. ROCKs are the best-described effectors of RhoA. There are two isoforms, ROCK1 and ROCK2, which may be functionally redundant in some systems, but exhibit different tissue distributions. Both isoforms are ubiquitously expressed in most tissues, but ROCK2 is more prominent in brain and skeletal muscle while ROCK1 is more pronounced in the liver, testes, and kidney. Studies in knockout mice result in different phenotypes, suggesting that the two isoforms do not compensate for each other during embryonic development. ROCK contains an N-terminal extension, a catalytic kinase domain, and a long C-terminal extension, which contains a Rho-binding HR1 domain and a pleckstrin homology (PH) domain. ROCK is auto-inhibited by HR1 and PH domains interacting with the catalytic domain. HR1 domains are anti-parallel coiled-coil (ACC) domains that bind small GTPases from the Rho family.	66
212017	cd11627	HR1_Ste20-like	Protein kinase C-related kinase homology region 1 (HR1) Rho-binding domain of Schizosaccharomyces pombe Ste20-like proteins. This group is composed of predominantly uncharacterized fungal proteins, which contain two known domains: HR1 at the N-terminal region and REM (Ras exchanger motif) at the C-terminal region. One member protein from Schizosaccharomyces pombe is named Ste16 while its gene is called ste20 (a target of rapamycin complex 2 subunit). It is a subunit in the protein kinase TOR complexes in fission yeast. The REM domain is usually found in nucleotide exchange factors for Ras-like small GTPases. HR1 domains are anti-parallel coiled-coil (ACC) domains that bind small GTPases from the Rho family.	71
212018	cd11628	HR1_CIP4_FNBP1L	Protein kinase C-related kinase homology region 1 (HR1) Rho-binding domain of vertebrate Cdc42-Interacting Protein 4 and FormiN Binding Protein 1-Like. CIP4 and FNBP1L are Cdc42 effectors that bind Wiskott-Aldrich syndrome protein (WASP) and function in endocytosis. FNBP1L, also called Toca-1 (Transducer of Cdc42-dependent actin assembly 1), forms a complex with neural WASP; the complex induces the formation of filopodia and endocytic vesicles. FNBP1L is required for Cdc42-induced actin assembly and is essential for autophagy of intracellular pathogens. CIP4 may also play a role in phagocytosis. It functions downstream of Cdc42 in PDGF-dependent actin reorganization and cell migration, and also regulates the activity of PDGFRbeta. It uses Src as a substrate in regulating the invasiveness of breast tumor cells. CIP4 may also play a role in the pathogenesis of Huntington's disease. CIP4 and FNBP1L contain an N-terminal F-BAR domain, a central HR1 domain, and a C-terminal SH3 domain. HR1 domains are anti-parallel coiled-coil (ACC) domains that bind small GTPases from the Rho family; the HR1 domain of CIP4 binds Cdc42 and TC10. Translocation of CIP4 is facilitated by its binding to TC10 at the plasma membrane.	81
212019	cd11629	HR1_FBP17	Protein kinase C-related kinase homology region 1 (HR1) Rho-binding domain of Formin Binding Protein 17. FBP17, also called FormiN Binding Protein 1 (FNBP1), is involved in dynamin-mediated endocytosis. It is recruited to clathrin-coated pits late in the endocytosis process and may play a role in the invagination and scission steps. FBP17 binds in vivo to tankyrase, a protein involved in telomere maintenance and mitogen activated protein kinase (MAPK) signaling. It also binds to the Fas ligand and may be implicated in the inflammatory response. FBP17 contains an N-terminal F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain, central HR1 domain, and a C-terminal SH3 domain. HR1 domains are anti-parallel coiled-coil (ACC) domains that bind small GTPases from the Rho family; the HR1 domain of the related protein, CIP4, binds Cdc42 and TC10. Translocation of CIP4 is facilitated by its binding to TC10 at the plasma membrane.	77
212020	cd11630	HR1_PKN1_2	Second Protein kinase C-related kinase homology region 1 (HR1) Rho-binding domain of Protein Kinase N1. PKN1, also called PKNalpha or Protein-kinase C-related kinase 1 (PRK1), is a serine/threonine protein kinase that is activated by the Rho family of small GTPases, and by fatty acids such as arachidonic and linoleic acids. It is expressed ubiquitously and is the most abundant PKN isoform in neurons. PKN1 is implicated in a variety of functions including cytoskeletal reorganization, cardiac cell survival, cell adhesion, and glucose transport, among others. PKN1 contains three HR1 domains, a C2 domain, and a kinase domain. This model characterizes the second HR1 domain of PKN1. HR1 domains are anti-parallel coiled-coil (ACC) domains that bind small GTPases from the Rho family; PKN1 binds the GTPases RhoA, RhoB, and RhoC, and can also interact weakly with Rac.	78
212021	cd11631	HR1_PKN2_2	Second Protein kinase C-related kinase homology region 1 (HR1) Rho-binding domain of Protein Kinase N2. PKN2, also called PKNgamma or Protein-kinase C-related kinase 2 (PRK2), is a serine/threonine protein kinase and an effector of the small GTPase Rho/Rac. It regulates G2/M cell cycle progression and the exit from cytokinesis. It also phosphorylates hepatitis C virus (HCV) RNA polymerase and thus, plays a role in HCV RNA replication. PKN2 shares a common domain architecture with other PKNs, containing three HR1 domains, a C2 domain, and a kinase domain. In addition, PKN2 contains a proline-rich region in between its C2 and kinase domains and has been shown to associate with SH3 domain containing proteins like NCK and Grb4. This model characterizes the second HR1 domain of PKN2. HR1 domains are anti-parallel coiled-coil (ACC) domains that bind small GTPases from the Rho family; PKN2 specifically binds to RhoA GTPase in a GTP-dependent manner. The HR1 domains of PKN2, together with its C2 domain, also facilitate the recruitment of PKN2 to primordial junctions at nascent cell-cell contacts, where it promotes junctional maturation.	74
212022	cd11632	HR1_PKN3_2	Second Protein kinase C-related kinase homology region 1 (HR1) Rho-binding domain of Protein Kinase N3. PKN3, also called PKNbeta, is a serine/threonine protein kinase that is activated by the Rho family of small GTPases, preferentially by RhoC. Both PKN1 and RhoC show limited and barely detectable expression in normal tissues, but are both upregulated in cancer cells, particularly in late-stage malignancies. PKN3 has been implicated to play a role in the metastatic growth and invasiveness of cancer cells, downstream of the oncogenic phosphoinositide 3-kinase signaling network. PKN3 shares a common domain architecture with other PKNs, containing three HR1 domains, a C2 domain, and a kinase domain. In addition, PKN3 contains two proline-rich regions between its C2 and kinase domains, and has been shown to associate with SH3 domain containing proteins like GRAFs, GAP for RhoA, and Cdc42Hs. This model characterizes the second HR1 domain of PKN3. HR1 domains are anti-parallel coiled-coil (ACC) domains that bind small GTPases from the Rho family; PKN3 binds Rho family GTPases, preferentially RhoC.	74
212023	cd11633	HR1_Rhophilin-1	Protein kinase C-related kinase homology region 1 (HR1) Rho-binding domain of Rhophilin-1. Rhophilin-1 is a scaffolding protein that functions as an effector of the Rho family of small GTPases. It has been implicated in sperm motility. Rhophilin-1 contains an N-terminal HR1, a central Bro1-like, and a C-terminal PDZ domain; all are protein-interacting domains. HR1 domains are anti-parallel coiled-coil (ACC) domains that bind small GTPases from the Rho family; Rhophilin-1 binds RhoA and was isolated initially as a RhoA-binding protein.	85
212024	cd11634	HR1_Rhophilin-2	Protein kinase C-related kinase homology region 1 (HR1) Rho-binding domain of Rhophilin-2. Rhophilin-2 is a scaffolding protein that functions as an effector of the Rho family of small GTPases. It plays a role in regulating the organization of the actin cytoskeleton. Rhophilin-2 contains an N-terminal HR1, a central Bro1-like, and a C-terminal PDZ domain; all are protein-interacting domains. HR1 domains are anti-parallel coiled-coil (ACC) domains that bind small GTPases from the Rho family; Rhophilin-2 has been shown to bind both RhoA and RhoB.	82
212025	cd11635	HR1_PKN2_3	Third Protein kinase C-related kinase homology region 1 (HR1) Rho-binding domain of Protein Kinase N2. PKN2, also called PKNgamma or Protein-kinase C-related kinase 2 (PRK2), is a serine/threonine protein kinase and an effector of the small GTPase Rho/Rac. It regulates G2/M cell cycle progression and the exit from cytokinesis. It also phosphorylates hepatitis C virus (HCV) RNA polymerase and thus, plays a role in HCV RNA replication. PKN2 shares a common domain architecture with other PKNs, containing three HR1 domains, a C2 domain, and a kinase domain. In addition, PKN2 contains a proline-rich region in between its C2 and kinase domains and has been shown to associate with SH3 domain containing proteins like NCK and Grb4. This model characterizes the third HR1 domain of PKN2. HR1 domains are anti-parallel coiled-coil (ACC) domains that bind small GTPases from the Rho family; PKN2 specifically binds to RhoA GTPase in a GTP-dependent manner. The HR1 domains of PKN2, together with its C2 domain, also facilitate the recruitment of PKN2 to primordial junctions at nascent cell-cell contacts, where it promotes junctional maturation.	74
212026	cd11636	HR1_PKN1_3	Third Protein kinase C-related kinase homology region 1 (HR1) Rho-binding domain of Protein Kinase N1. PKN1, also called PKNalpha or Protein-kinase C-related kinase 1 (PRK1), is a serine/threonine protein kinase that is activated by the Rho family of small GTPases, and by fatty acids such as arachidonic and linoleic acids. It is expressed ubiquitously and is the most abundant PKN isoform in neurons. PKN1 is implicated in a variety of functions including cytoskeletal reorganization, cardiac cell survival, cell adhesion, and glucose transport, among others. PKN1 contains three HR1 domains, a C2 domain, and a kinase domain. This model characterizes the third HR1 domain of PKN1. HR1 domains are anti-parallel coiled-coil (ACC) domains that bind small GTPases from the Rho family; PKN1 binds the GTPases RhoA, RhoB, and RhoC, and can also interact weakly with Rac.	74
212027	cd11637	HR1_PKN3_3	Third Protein kinase C-related kinase homology region 1 (HR1) Rho-binding domain of Protein Kinase N3. PKN3, also called PKNbeta, is a serine/threonine protein kinase that is activated by the Rho family of small GTPases, preferentially by RhoC. Both PKN1 and RhoC show limited and barely detectable expression in normal tissues, but are both upregulated in cancer cells, particularly in late-stage malignancies. PKN3 has been implicated to play a role in the metastatic growth and invasiveness of cancer cells, downstream of the oncogenic phosphoinositide 3-kinase signaling network. PKN3 shares a common domain architecture with other PKNs, containing three HR1 domains, a C2 domain, and a kinase domain. In addition, PKN3 contains two proline-rich regions between its C2 and kinase domains, and has been shown to associate with SH3 domain containing proteins like GRAFs, GAP for RhoA, and Cdc42Hs. This model characterizes the third HR1 domain of PKN3. HR1 domains are anti-parallel coiled-coil (ACC) domains that bind small GTPases from the Rho family; PKN3 binds Rho family GTPases, preferentially RhoC.	74
212028	cd11638	HR1_ROCK2	Protein kinase C-related kinase homology region 1 (HR1) Rho-binding domain of Rho-associated coiled-coil containing protein kinase 2. ROCK2 is a serine/threonine protein kinase and was the first identified target of activated RhoA. It plays a role in stress fiber and focal adhesion formation, and is prominently expressed in the brain, heart, and skeletal muscles. It is implicated in vascular and neurological disorders, such as hypertension and vasospasm of the coronary and cerebral arteries. ROCK2 is also activated by caspase-2 cleavage, resulting in thrombin-induced microparticle generation in response to cell activation. Mice deficient in ROCK2 show intrauterine growth retardation and embryonic lethality because of placental dysfunction. ROCK2 contains an N-terminal extension, a catalytic kinase domain, and a long C-terminal extension, which contains a Rho-binding HR1 domain and a pleckstrin homology (PH) domain. ROCK2 is auto-inhibited by HR1 and PH domains interacting with the catalytic domain. HR1 domains are anti-parallel coiled-coil (ACC) domains that bind small GTPases from the Rho family.	67
212029	cd11639	HR1_ROCK1	Protein kinase C-related kinase homology region 1 (HR1) Rho-binding domain of Rho-associated coiled-coil containing protein kinase 1. ROCK1 is a serine/threonine kinase and is preferentially expressed in the liver, lung, spleen, testes, and kidney. It mediates signaling from Rho to the actin cytoskeleton. It is implicated in the development of cardiac fibrosis, cardiomyocyte apoptosis, and hyperglycemia. Mice deficient in ROCK1 display eyelids open at birth (EOB) and omphalocele phenotypes due to the disorganization of actin filaments in the eyelids and the umbilical ring. ROCK1 contains an N-terminal extension, a catalytic kinase domain, and a long C-terminal extension, which contains a Rho-binding HR1 domain and a pleckstrin homology (PH) domain. It is auto-inhibited by HR1 and PH domains interacting with the catalytic domain. HR1 domains are anti-parallel coiled-coil (ACC) domains that bind small GTPases from the Rho family.	66
212163	cd11640	HutP	Histidine Utilizing Protein, the hut operon positive regulatory protein. The HutP protein family regulates the expression of 'hut' structural genes in Bacillus and other bacteria. It forms an anti-termination complex, which recognizes three UAG triplet units, separated by four non-conserved nucleotides on the RNA terminator region. In an L-histidine and Mg2+ dependent manner, HutP binds to the nascent hut mRNA leader transcript, and the ensuing anti-termination complex inhibits formation of a stem-loop terminator, clearing the way for transcription of the hut structural genes.	134
381168	cd11641	Precorrin-4_C11-MT	Precorrin-4 C11-methyltransferase (CbiF/CobM). Precorrin-4 C11-methyltransferase participates in the pathway toward the biosynthesis of cobalamin (vitamin B12). There are two distinct cobalamin biosynthetic pathways in bacteria. The aerobic pathway requires oxygen, and cobalt is inserted late in the pathway; the anaerobic pathway does not require oxygen, and cobalt insertion is the first committed step towards cobalamin synthesis. In the aerobic pathway, CobM catalyzes the methylation of precorrin-4 at C-11 to yield precorrin-5. In the anaerobic pathway, CbiF catalyzes the methylation of cobalt-precorrin-4 to cobalt-precorrin-5. Both CbiF and CobM, which are homologous, are included in this model. There are about 30 enzymes involved in the vitamin B12 synthetic pathway. The enzymes involved in the aerobic pathway are prefixed Cob and those of the anaerobic pathway Cbi. Most of the enzymes are shared in both pathways and several of these enzymes are pathway-specific.	225
381169	cd11642	SUMT	Uroporphyrin-III C-methyltransferase (also known as S-Adenosyl-L-methionine:uroporphyrinogen III methyltransferase, SUMT). SUMT is an enzyme of the cobalamin and siroheme biosynthetic pathway. It catalyzes the first of three steps leading to the formation of siroheme from uroporphyrinogen III; it transfers two methyl groups from S-adenosyl-L-methionine to the C-2 and C-7 atoms of uroporphyrinogen III to yield precorrin-2 via the intermediate formation of precorrin-1. Precorrin-2 is also a precursor for the biosynthesis of vitamin B12, coenzyme F430, siroheme and heme d1. This family includes proteins in which the SUMT domain is fused to other functional domains, such as to a uroporphyrinogen-III synthase domain to form bifunctional uroporphyrinogen-III methylase/uroporphyrinogen-III synthase, or to a dual function dehydrogenase-chelatase domain, as in the case of the multifunctional S-adenosyl-L-methionine (SAM)-dependent bismethyltransferase/dehydrogenase/ferrochelatase CysG, which catalyzes all three steps that transform uroporphyrinogen III into siroheme.	228
381170	cd11643	Precorrin-6A-synthase	Precorrin-6A synthase (also named CobF). Precorrin-6A synthase participates in the pathway toward the biosynthesis of cobalamin (vitamin B12). There are two distinct cobalamin biosynthetic pathways in bacteria. The aerobic pathway requires oxygen, and cobalt is inserted late in the pathway; the anaerobic pathway does not require oxygen, and cobalt insertion is the first committed step towards cobalamin synthesis. This model represents CobF, the precorrin-6A synthase, an enzyme specific to the aerobic pathway. After precorrin-4 is methylated at C-11 by CobM to produce precorrin-5, CobF catalyzes the removal of the extruded acyl group in the subsequent step, and the addition of a methyl group at C-1. The product of this reaction is precorrin-6A, which gets reduced by an NADH-dependent reductase to yield precorrin-6B. This family includes enzymes in GC-rich Gram-positive bacteria, alpha proteobacteria and Pseudomonas-related species.	244
381171	cd11644	Precorrin-6Y-MT	Precorrin-6Y methyltransferase (also named CbiE). CbiE (precorrin-6Y methyltransferase, also known as cobalt-precorrin-7 C(5)-methyltransferase, also known as cobalt-precorrin-6Y C(5)-methyltransferase) catalyzes the methylation of C-5 in cobalt-precorrin-7 to form cobalt-precorrin-8. It participates in the pathway toward the biosynthesis of cobalamin (vitamin B12). There are two distinct cobalamin biosynthetic pathways in bacteria. The aerobic pathway requires oxygen, and cobalt is inserted late in the pathway; the anaerobic pathway does not require oxygen, and cobalt insertion is the first committed step towards cobalamin synthesis. CbiE functions in the anaerobic pathway; it is a subunit of precorrin-6Y C5,15-methyltransferase, a bifunctional enzyme: cobalt-precorrin-7 C(5)-methyltransferase (CbiE)/cobalt-precorrin-6B C(15)-methyltransferase (decarboxylating) (CbiT), that catalyzes two methylations (at C-5 and C-15) in precorrin-6Y, as well as the decarboxylation of the acetate side chain located in ring C, in order to generate precorrin-8X. CbiE and CbiT can be found fused (CbiET, also called CobL), or on separate protein chains (CbiE and CbiT). In the aerobic pathway, a single enzyme called CobL catalyzes the methylations at C-5 and C-15, and the decarboxylation of the C-12 acetate side chain of precorrin-6B.	198
381172	cd11645	Precorrin_2_C20_MT	Precorrin-2 C20-methyltransferase, also named CobI or CbiL. Precorrin-2 C20-methyltransferase (also known as S-adenosyl-L-methionine--precorrin-2 methyltransferase) participates in the pathway toward the biosynthesis of cobalamin (vitamin B12). There are two distinct cobalamin biosynthetic pathways in bacteria. The aerobic pathway requires oxygen, and cobalt is inserted late in the pathway; the anaerobic pathway does not require oxygen, and cobalt insertion is the first committed step towards cobalamin synthesis. Precorrin-2 C20-methyltransferase catalyzes methylation at the C-20 position of a cyclic tetrapyrrole ring of precorrin-2 using S-adenosylmethionine as a methyl group source to produce precorrin-3A. In the anaerobic pathway, cobalt is inserted into precorrin-2 by CbiK to generate cobalt-precorrin-2, which is the substrate for CbiL, a C20 methyltransferase. In Clostridium difficile, CbiK and CbiL are fused into a bifunctional enzyme. In the aerobic pathway, the precorrin-2 C20-methyltransferase is named CobI. This family includes CbiL and CobI precorrin-2 C20-methyltransferases, both as stand-alone enzymes and when CbiL forms part of a bifunctional enzyme.	223
381173	cd11646	Precorrin_3B_C17_MT	Precorrin-3B C(17)-methyltransferase (also named CobJ or CbiH). Precorrin-3B C(17)-methyltransferase participates in the pathway toward the biosynthesis of cobalamin (vitamin B12). There are two distinct cobalamin biosynthetic pathways. The aerobic pathway requires oxygen, and cobalt is inserted late in the pathway; the anaerobic pathway does not require oxygen, and cobalt insertion is the first committed step towards cobalamin synthesis. This model includes CobJ of the aerobic pathway and CbiH of the anaerobic pathway, both as stand-alone enzymes and when CobJ or CbiH form part of bifunctional enzymes, such as in Mycobacterium tuberculosis CobIJ, where CobJ fuses with a precorrin-2 C(20)-methyltransferase domain, or Bacillus megaterium CbiH60, where CbiH is fused to a nitrite and sulfite reductase-like domain. In the aerobic pathway, once CobG has generated precorrin-3b, CobJ catalyzes the methylation of precorrin-3b at C-17 to form precorrin-4 (the extruded methylated C-20 fragment is left attached as an acyl group at C-1). In the corresponding anaerobic pathway, CbiH carries out this ring contraction, using cobalt-precorrin-3b as a substrate to generate a tetramethylated delta-lactone.	238
381174	cd11647	DHP5_DphB	diphthine methyl ester synthase and diphthine synthase. Eukaryotic diphthine methyl ester synthase (DHP5) and archaeal diphthamide synthase (DphB) participate in the second step of the biosynthetic pathway of diphthamide. The eukaryotic enzyme catalyzes four methylations of the modified target histidine residue in translation elongation factor 2 (EF-2), to form an intermediate called diphthine methyl ester; the archaeal enzyme catalyzes only 3 methylations, producing diphthine. Diphtheria toxin ADP-ribosylates diphthamide leading to inhibition of protein synthesis in the eukaryotic host cells.	241
381175	cd11648	RsmI	Ribosomal RNA small subunit methyltransferase I (RsmI), also known as rRNA (cytidine-2'-O-)-methyltransferase. RsmI is an S-AdoMet (S-adenosyl-L-methionine or SAM)-dependent methyltransferase responsible for the 2'-O-methylation of cytidine 1402 (C1402) at the P site of bacterial 16S rRNA. Another S-AdoMet-dependent methyltransferase, RsmH (not included in this family), is responsible for N4-methylation at C1402. These methylation reactions may occur at a late step during 30S assembly in the cell. The dimethyl modification is believed to be conserved in bacteria, may play a role in fine-tuning the shape and functions of the P-site to increase the translation fidelity, and has been shown, in Staphylococcus aureus, to contribute to virulence in host animals by conferring resistance to oxidative stress.	216
381176	cd11649	RsmI_like	uncharacterized subfamily of the tetrapyrrole methylase family similar to Ribosomal RNA small subunit methyltransferase I (RsmI). RsmI, also known as rRNA (cytidine-2'-O-)-methyltransferase, is an S-AdoMet (S-adenosyl-L-methionine or SAM)-dependent methyltransferase responsible for the 2'-O-methylation of cytidine 1402 (C1402) at the P site of bacterial 16S rRNA. Another S-AdoMet-dependent methyltransferase, RsmH (not included in this family), is responsible for N4-methylation at C1402. These methylation reactions may occur at a late step during 30S assembly in the cell. The dimethyl modification is believed to be conserved in bacteria, may play a role in fine-tuning the shape and functions of the P-site to increase the translation fidelity, and has been shown, in Staphylococcus aureus, to contribute to virulence in host animals by conferring resistance to oxidative stress.	229
212164	cd11650	AT4G37440_like	Uncharacterized protein domain conserved in plants. This domain contains an extensive protein sequence fragment that appears conserved in a number of plant proteins, including the gene product of Arabidopsis thaliana locus AT4G37440, which has been identified in transcriptional profiling as expressed at different levels in white cabbage cultivars.	253
212165	cd11651	YPK1_N_like	Fungal protein kinase domain similar to the N-terminus of YPK1. This fungal domain family includes the N-terminal region of the Saccharomyces cerevisiae AGC kinases YPK1 and YPK2, which were found to be essential for the proliferation of yeast. YPK1 is required for cell growth and acts as a downstream kinase in the sphingolipid-mediated signaling pathway of yeast. It also plays a role in efficient endocytosis and in the maintenance of cell wall integrity.	174
212166	cd11652	SSH-N	N-terminal domain conserved in slingshot (SSH) phosphatases. This domain or region conserved in Bilateria is found N-terminal to the DEK_C-like and catalytic domains of slingshot phosphatases. Slingshot is a cofilin-specific phosphatase. Dephosphorylation reactivates cofilin, which in turn depolymerizes actin and is thus required for actin filament reorganization. Slingshot is a member of the dual-specificity protein phosphatase family. This N-terminal SSH region may be involved in P-cofilin binding (the model C-terminus plus the DEK_C-like domain, which are characterized as the "B" domain in some of the literature), and may be required for the F-actin mediated activation of slingshot (the N-terminal region of this model, sometimes referred to as the "A" domain).	233
212167	cd11653	rap1_RCT	C-terminal domain of RAP1 recruits proteins to telomeres. The RAP1 (repressor activator protein 1) C-terminal domain (RCT) mediates interactions with other proteins such as TRF2 (human), Rif1, Rif2, Sir3, Sir4 (Saccharomyces cerevisiae), and Taz1 (Schizosaccharomyces pombe) at telomeres and other loci. RAP1, identified in budding yeast as repressor/activator protein 1, is a well-conserved telomere binding protein, also found in fission yeast and mammals. In Saccharomyces cerevisiae, RAP1 directly binds DNA and is involved in transcriptional activation, gene silencing, as well as binding at numerous sites at each telomere, where it functions in telomere length regulation, telomeric position effect gene silencing and telomere end protection. Human RAP1 apparently does not bind telomeric DNA directly, but binds telomere repeat binding factor 2 (TRF2) via the RCT. RAP1 might act by suppressing nonhomologous end-joining. Yeast RAP1 has two myb-type DNA binding modules, and an RCT domain that recruits Sir proteins 3 and 4 (Sir3, Sir4) for gene silencing, and Rif1 and Rif2 for telomere length maintenance. Schizosaccharomyces pombe RAP1 (spRap1), like human RAP1, lacks direct DNA-binding activity and is localized to telomeres via Taz1, an ortholog of TRF1 and TRF2. The S. pombe RCT resembles the first 3-helix bundle of the yeast and human RCT forms, but is not included in this larger model.	100
212553	cd11654	TRF2_RBM	RAP1 binding motif of telomere repeat binding factor. TRF2 (Telomere repeat binding factor 2) functions as part of the 6-component shelterin complex. TRF2 binds DNA and recruits RAP1 (via binding to the RAP1 protein c-terminus (RCT)) and TIN2 in the protection of telomeres from DNA repair machinery. Metazoan shelterin consists of 3 DNA-binding proteins (TRF2, TRF1 and POT1) and 3 recruited proteins that bind to one or more of these DNA-binding proteins (RAP1, TIN2, TPP1). Human TRF1 and TRF2 bind double-stranded DNA. hTRF2 consists of a basic N-terminus, a TRF homology domain, the RAP1 binding motif (RBM) described by this model, the TIN2 binding motif (TBM), and a myb-like DNA binding domain.	42
212554	cd11655	rap1_myb-like	DNA-binding modules of yeast Rap1 and related proteins. Yeast Rap1 DNA-binding activity is mediated by a pair of DNA-binding modules comprised of 2 3-helix bundles with an N-terminal arm, closely matching the structure of homeodomain and myb-type proteins. Human Rap1 has a single myb-like module, and may not bind DNA directly. Rap1, identified in budding yeast as repressor-activator protein 1, is a conserved telomere binding protein, also identified in fission yeast and mammals. In Saccharomyces cerevisiae, Rap1 directly binds DNA and is involved in transcriptional activation, gene silencing, as well as binding at numerous binding sites at each telomere, where it functions in telomere length regulation, telomeric position effect gene silencing and telomere end protection. Human Rap1 apparently does not bind telomeric DNA directly, but binds telomere repeat binding factor 2 (TRF2) via the Rap C-terminal domain (RCT). Rap1 may act by suppressing non-homologous end-joining. Yeast Rap1 has 2 myb-type DNA binding modules, a BRCT domain, and a RCT domain that recruits Sir3 and Sir4 proteins for gene silencing and Rif1 and Rif2 for telomere length maintenance. Human Rap1 has a similar domain architecture but has a single myb-like domain.	57
212555	cd11656	FBX4_GTPase_like	C-terminal GTPase-like domain of F-Box Only Protein 4. F-box proteins are involved in substrate recognition as part of SCF (Skp1-Cul1-Rbx1-F-box protein) ubiquitin ligase complexes. Fbx4 (or Fbxo4) binds to the telomere repeat binding factor 1 (TRF1), whose activity at telomeres is regulated in part by selective ubiquitination and degradation. This ubiquitination of TRF1 is mediated by Fbx4, which binds to the TRFH domain of TRF1, via the C-terminal domain characterized by this model, a module resembling a small GTPase domain that lacks the GTP-binding site. When bound to telomeres, TIN2 acts to protect TRF1 from SCF-Fbx4 mediated ubiquitination. Tankyrase-mediated ADP-ribosylation releases TRF1 from telomeres, rendering them susceptible to ubiquitination and degradation, which in turn promotes telomere elongation. Fbx4 has also been reported to target cyclin D1 for degradation by the proteasome, a mechanism ensuring the fidelity of DNA replication. More recently, these findings have been disputed.	223
240667	cd11657	TIN2_N	N-terminal domain of TRF-interacting nuclear factor 2; shelterin complex protein of telomeres. TIN2 is one of the six proteins of the shelterin complex, which acts to protect telomeres from DNA damage repair machinery. TIN2 binds directly to TRF1 and TRF2 and stabilizes TRF2 complex-telomere binding by tethering it to the TRF1 complex. TIN2 binding to TRF2 is primarily via the TRF binding motif (TBM) region and the N-terminus, while the far C-terminal region has lower affinity. The TIN2 TBM, but not the N-terminal region, is involved in TIN2 binding to TRF1. Truncation of the TIN2 N-terminus in mouse results in telomere elongation, suggesting a negative regulatory function of this region. Three shelterin components (TRF1, TRF2, POT1) bind DNA and 3 components (TIN2, RAP1, TPP1) are recruited by these DNA binding factors. TRF1 activity at telomeres is regulated in part by selective ubiquitination and degradation. Ubiquitination of TRF1 is mediated by Fbx4, which binds TRF1 in the TRFH domain, via a small GTPase module. When bound to telomeres, TIN2 acts to protect TRF1 from SCF-Fbx4 mediated ubiquitination. F-box proteins act in substrate recognition as part of Skp1-Cul1-Rbx1-F-box (SCF) protein complexes. Tankyrase-mediated ADP-ribosylation releases TRF1 from telomeres, rendering them susceptible to ubiquitination and degradation, promoting telomere elongation. TIN2 also binds PIP1, which recruits POT1 to telomeres.	188
212556	cd11658	SANT_DMAP1_like	SANT/myb-like domain of Human Dna Methyltransferase 1 Associated Protein 1-like. These proteins are members of the SANT/myb group. SANT is named after 'SWI3, ADA2, N-CoR and TFIIIB', several factors that share this domain. The SANT domain resembles the 3 alpha-helix bundle of the DNA-binding Myb domains and is found in a diverse set of proteins.	46
212557	cd11659	SANT_CDC5_II	SANT/myb-like DNA-binding domain of Cell Division Cycle 5-Like Protein repeat II. In humans, cell division cycle 5-like protein (CDC5) functions in pre-mRNA splicing in cell cycle control. The DNA-binding, myb-like domain of CDC5 is a member of the SANT/myb group. SANT is named after 'SWI3, ADA2, N-CoR and TFIIIB', several factors that share this domain. The SANT domain resembles the 3 alpha-helix bundle of DNA-binding Myb domains and is found in a diverse set of proteins.	53
212558	cd11660	SANT_TRF	Telomere repeat binding factor-like DNA-binding domains of the SANT/myb-like family. Human telomere repeat binding factors, TRF1 and TRF2, function as part of the 6 component shelterin complex. TRF2 binds DNA and recruits RAP1 (via binding to the RAP1 protein C-terminal (RCT)) and TIN2 in the protection of telomeres from DNA repair machinery. Metazoan shelterin consists of 3 DNA binding proteins (TRF2, TRF1, and POT1) and 3 recruited proteins that bind to one or more of these DNA-binding proteins (RAP1, TIN2, TPP1). Schizosaccharomyces pombe TAZ1 is an ortholog and binds RAP1. Human TRF1 and TRF2 bind double-stranded DNA. hTRF2 consists of a basic N-terminus, a TRF homology domain, the RAP1 binding motif (RBM), the TIN2 binding motif (TBM) and a myb-like DNA binding domain, SANT, named after 'SWI3, ADA2, N-CoR and TFIIIB', several factors that share this domain. Tandem copies of the domain bind telomeric DNA tandem repeats as part of the capping complex. The single myb-like domain of TRF-type proteins is similar to the tandem myb-like domains found in yeast RAP1.	50
212559	cd11661	SANT_MTA3_like	Myb-Like Dna-Binding Domain of MTA3 and related proteins. Members in this SANT/myb family include domains found in mouse metastasis-associated protein 3 (MTA3) proteins and arginine-glutamic dipeptide (RERE) repeats proteins. SANT (SWI3, ADA2, N-CoR and TFIIIB) DNA-binding domains are a diverse set of proteins that share a common 3 alpha-helix bundle. MTA3 has been shown to interact with nucleosome remodeling and deacetylase (NuRD) proteins CHD4 and HDAC1, and the core cohesin complex protein RAD21 in the ovary, and regulate G2/M progression in proliferating granulosa cells. RERE belongs to the atrophin family and has been identified as a nuclear receptor corepressor; altered expression levels of RERE are associated with cancer in humans while mutations of Rere in mice cause failure in closing the anterior neural tube and fusion of the telencephalic and optic vesicles during embryogenesis.	46
212560	cd11662	apollo_TRF2_binding	TRF2-binding region of apollo and similar proteins. Apollo protein, a DNA repair nuclease, is recruited to telomeres by TRF2 where it is associated with the principal components of the shelterin complex. Apollo is a member of the metallo-beta-lactamase family that is required for telomere integrity during S phase; its 5' exonuclease activity is regulated by binding to TRF2. Apollo and TRF2 also suppress damage to engineered interstitial telomere repeat tracts at the chromosome ends. TRF2, which binds preferentially to positively supercoiled DNA substrates, together with Apollo, negatively regulates the amount of DNA topoisomerases (TOP1, TOP2-alpha, and TOP2-beta) at telomeres since they also act in the same pathway of telomere protection. The shelterin complex protein identified in mammals is principally comprised of 6 factors that act to protect telomeres from DNA damage repair machinery. 3 components (TRF1, TRF2, POT1) bind DNA and 3 components are recruited by these factors (TIN2, RAP1, TPP1).	34
212128	cd11663	GH119_BcIgtZ-like	putative catalytic domain of glycoside hydrolase family 119 (GH119). The prokaryotic subgroup is represented by IgtZ, an alpha-amylase from a Bacillus circulans strain. The GH119 family is related to GH57, a chiefly prokaryotic family with the majority of thermostable enzymes coming from extremophiles (many of these are archaeal hyperthermophiles), which exhibit the enzyme specificities of alpha-amylase (EC 3.2.1.1), 4-alpha-glucanotransferase (EC 2.4.1.25), amylopullulanase (EC 3.2.1.1/41), and alpha-galactosidase (EC 3.2.1.22). GH57s cleave alpha-glycosidic bonds by employing a retaining mechanism, which involves a glycosyl-enzyme intermediate, allowing transglycosylation.	363
212129	cd11664	LamB_YcsF_like_2	uncharacterized proteins similar to the Aspergillus nidulans lactam utilization protein LamB. This bacterial subfamily of the LamB/YbgL family, contains many well conserved uncharacterized proteins. Although their molecular function is unknown, those proteins show high sequence similarity to the Aspergillus nidulans lactam utilization protein LamB, which might be required for conversion of exogenous 2-pyrrolidinone to endogenous GABA.	238
212130	cd11665	LamB_like	Aspergillus nidulans lactam utilization protein LamB and similar proteins. This eukaryotic and bacterial subfamily of the LamB/YbgL family includes Aspergillus nidulans protein LamB. The lamB gene is located at the lam locus of Aspergillus nidulans, which consists of two divergently transcribed genes, lamA and lamB, needed for the utilization of lactams such as 2-pyrrolidinone. Both genes are under the control of the positive regulatory gene amdR and are subject to carbon and nitrogen metabolite repression. Although the exact molecular function of the lamB-encoded protein LamB is unknown, it might be required for conversion of exogenous 2-pyrrolidinone to endogenous GABA.	238
212131	cd11666	GH38N_Man2A1	N-terminal catalytic domain of Golgi alpha-mannosidase II and similar proteins; glycoside hydrolase family 38 (GH38). This subfamily is represented by Golgi alpha-mannosidase II (GMII, also known as mannosyl-oligosaccharide 1,3- 1,6-alpha mannosidase, EC 3.2.1.114, Man2A1), a monomeric, membrane-anchored class II alpha-mannosidase existing in the Golgi apparatus of eukaryotes. GMII plays a key role in the N-glycosylation pathway. It catalyzes the hydrolysis of the terminal of both alpha-1,3-linked and alpha-1,6-linked mannoses from the high-mannose oligosaccharide GlcNAc(Man)5(GlcNAc)2 to yield GlcNAc(Man)3(GlcNAc)2(GlcNAc, N-acetylglucosamine), which is the committed step of complex N-glycan synthesis. GMII is activated by zinc or cobalt ions and is strongly inhibited by swainsonine. Inhibition of GMII provides a route to block cancer-induced changes in cell surface oligosaccharide structures. GMII has a pH optimum of 5.5-6.0, which is intermediate between those of acidic (lysosomal alpha-mannosidase) and neutral (ER/cytosolic alpha-mannosidase) enzymes. GMII is a retaining glycosyl hydrolase of family GH38 that employs a two-step mechanism involving the formation of a covalent glycosyl enzyme complex; two carboxylic acids positioned within the active site act in concert: one as a catalytic nucleophile and the other as a general acid/base catalyst.	344
212132	cd11667	GH38N_Man2A2	N-terminal catalytic domain of Golgi alpha-mannosidase IIx, and similar proteins; glycoside hydrolase family 38 (GH38). This subfamily is represented by human alpha-mannosidase 2x (MX, also known as mannosyl-oligosaccharide 1,3- 1,6-alpha mannosidase, EC 3.2.1.114, Man2A2). MX is enzymatically and functionally very similar to GMII (found in another subfamily), and as an isoenzyme of GMII, it is thought to also function in the N-glycosylation pathway. MX specifically hydrolyzes the same oligosaccharide substrate as does GMII. It specifically removes two mannosyl residues from GlcNAc(Man)5(GlcNAc)2 to yield GlcNAc(Man)3(GlcNAc)2(GlcNAc, N-acetylglucosamine).	344
212168	cd11669	TTHB210-like	Hypothetical protein TTHB210, a sigma(E)-regulated gene product found in Thermus thermophilus, and similar proteins. TTHB210 is an uncharacterized protein found in Thermus thermophilus, and is controlled by the sigma(E)/anti-sigma(E) regulatory system. It is one of the five proteins of the extracytoplasmic function (ECF) sigma factor sigma(E)-regulated gene products whose physiological functions have not been determined. Its crystallographic structure reveals a novel homodecamer although it is a dimer in solution.	115
212561	cd11670	Sp_RAP1_RCT	C-terminal domain of S. pombe RAP1 protein. The Schizosaccharomyces pombe RAP1 (repressor activator protein 1) protein C-terminal (RCT) domain structurally resembles the first 3-helix bundle found in yeast and human RAP1 RCT. S. pombe RAP1 (spRap1), like human RAP1, lacks direct DNA-binding activity and is localized to telomeres via Taz1, an ortholog of TRF1 and TRF2. The RAP1 RCT domain interacts with RAP1 binding motif (RBM) of TAZ1. RAP1, identified in budding yeast as repressor/activator protein 1 is a well-conserved telomere binding protein, found in budding yeast, fission yeast and mammals. In Saccharomyces cerevisiae, RAP1 directly binds DNA and is involved in transcriptional activation and mating type information gene silencing, as well as binding at numerous sites at each telomere, where it functions in telomere length regulation, telomeric position effect gene silencing and telomere end protection. Human RAP1 does not bind telomeric DNA directly, but binds telomere repeat binding factor 2 (TRF2) via the RAP C-terminal domain (RCT). Yeast RAP1 has 2 myb-type DNA binding modules, a BRCT domain, and a RCT domain that recruits Sir3 and Sir4 for gene silencing and Rif1 and Rif2 for telomere length maintenance. S. pombe RAP1 has a BRCT domain, 2 myb like domains, and the RCT.	52
212562	cd11671	TAZ1_RBM	RAP1 binding motif of Schizosaccharomyces pombe TAZ1. S. pombe TAZ1 recruits the spRAP1 protein to telomeres. The TAZ1 RAP1-binding motif (RBM) binds the RAP1 C-terminal domain (RCT), which structurally resembles the first 3-helix bundle found in yeast and human RAP1 RCT. TAZ1, an ortholog of TRF1 and TRF2, has a TRF homology (TRFH) domain, the RBM domain, a dimerization domain, and a myb-like C-terminus. RAP1, identified in budding yeast as repressor/activator protein 1, is a well-conserved telomere binding protein and is also found in fission yeast and mammals. In Saccharomyces cerevisiae, RAP1 directly binds DNA and is involved in transcriptional activation and mating type information gene silencing, as well as in binding to numerous binding sites at each telomere, where it functions in telomere length regulation, telomeric position effect gene silencing, and telomere end protection. Like S. pombe RAP1, human RAP1 does not bind telomeric DNA directly, but binds telomere repeat binding factor 2 (TRF2) through the RAP C-terminal domain (RCT).	49
277250	cd11672	ADDz	ATRX, Dnmt3 and Dnmt3l PHD-like zinc finger domain (ADDz). The ADDz zinc finger domain is present in the chromatin-associated proteins cytosine-5-methyltransferase 3 (Dnmt3) and ATRX, a SNF2 type transcription factor protein. The Dnmt3 family includes two active DNA methyltransferases, Dnmt3a and -3b, and one regulatory factor Dnmt3l. DNA methylation is an important epigenetic mechanism involved in diverse biological processes such as embryonic development, gene expression, and genomic imprinting. The ADDz domain is a PHD-like zinc finger motif that contains two parts, a C2-C2 and a PHD-like zinc finger. PHD zinc finger domains have been identified in more than 40 proteins that are mainly involved in chromatin mediated transcriptional control; the classical PHD zinc finger has a C4-H-C3 motif that spans about 50-80 amino acids. In ADDz, the conserved histidine residue of the PHD finger is replaced by a cysteine, and an additional zinc finger C2-C2 like motif is located about twenty residues upstream of the C4-C-C3 motif.	99
212563	cd11673	hemoglobin_linker_C	Globular domain of extracellular hemoglobin linker. This family of hemoglobin linker chains is restricted to annelid worms, and participates in the formation of the large erythrocruorin respiratory complex. Via its N-terminal coiled-coil segment (not included in this model), the molecule forms trimers, which are part of a scaffold organizing the overall complex architecture; the latter encompasses 36 linkers and 144 hemoglobins in total. This C-terminal globular domain is involved in trimerization, and also interacts with globins and other C-terminal globular linker domains of neighboring trimers. The structure resembles that of nitrophorins and lipocalins.	120
212564	cd11674	lambda-1	inner capsid protein lambda-1 or VP3. The reovirus inner capsid protein lambda-1 displays nucleoside triphosphate phosphohydrolase (NTPase), RNA-5'-triphosphatase (RTPase), and RNA helicase activity and may play a role in the transcription of the virus genome, the unwinding or reannealing of double-stranded RNA during RNA synthesis. The RTPase activity constitutes the first step in the capping of RNA, resulting in a 5'-diphosphorylated RNA plus-strand. lambda1 is an Orthoreovirus core protein, VP3 is the homologous core protein in Aquareoviruses.	1166
212565	cd11675	SCAB1_middle	middle domain of the stomatal closure-related actin binding protein1. SCAB1 is a dimeric actin crosslinker conserved in plants. The three-dimensional structure of this domain resembles that of fibronectin type III repeat units and immunoglobulins. It is situated between a coiled-coil dimerization domain and a C-terminal pleckstrin homology-like module. SCAB1 appears to be required for normal actin dynamics in guard cells during stomatal movement. The function of the middle domain is not clear.	85
212487	cd11676	Gemin6	Gemin 6. Gemin 6, together with the survival motor neuron (SMN) protein, other Gemins, and Unr-interacting protein (UNRIP), forms the SMN complex, which plays an important role in the Sm core assembly reaction, by binding directly to the Sm proteins, as well as UsnRNAs. Gemin 6 forms a heterodimer with Gemin 7, which serves as a surrogate for the SmB-SmD3 dimer during the formation of the heptameric Sm ring.	63
212488	cd11677	Gemin7	Gemin 7. Gemin 7, together with the survival motor neuron (SMN) protein, other Gemins, and Unr-interacting protein (UNRIP), forms the SMN complex, which plays an important role in the Sm core assembly reaction, by binding directly to the Sm proteins, as well as UsnRNAs. Gemin 7 forms a heterodimer with Gemin 6, which serves as a surrogate for the SmB-SmD3 dimer during the formation of the heptameric Sm ring.	77
212489	cd11678	archaeal_LSm	archaeal Like-Sm protein. The archaeal Sm-like (LSm): The Sm proteins are conserved in all three domains of life and are always associated with U-rich RNA sequences. They function to mediate RNA-RNA interactions and RNA biogenesis. All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. Eukaryotic Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6). Since archaebacteria do not have any splicing apparatus, their Sm proteins may play a more general role. Archaeal LSm proteins are likely to represent the ancestral Sm domain. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.	69
212490	cd11679	archaeal_Sm_like	archaeal Sm-related protein. Archaeal Sm-related proteins: The Sm proteins are conserved in all three domains of life and are always associated with U-rich RNA sequences. They function to mediate RNA-RNA interactions and RNA biogenesis. All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. Eukaryotic Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6). Since archaebacteria do not have any splicing apparatus, their Sm proteins may play a more general role. Archaeal Lsm proteins are likely to represent the ancestral Sm domain.	65
212543	cd11680	HDAC_Hos1	Class I histone deacetylases Hos1 and related proteins. Saccharomyces cerevisiae Hos1 is responsible for Smc3 deacetylation. Smc3 is an important player during the establishment of sister chromatid cohesion. Hos1 belongs to the class I histone deacetylases (HDACs). HDACs are Zn-dependent enzymes that catalyze hydrolysis of N(6)-acetyl-lysine residues in histone amino termini to yield a deacetylated histone (EC 3.5.1.98). Enzymes belonging to this group participate in regulation of a number of processes through protein (mostly different histones) modification (deacetylation). Class I histone deacetylases in general act via the formation of large multiprotein complexes. Other class I HDACs are animal HDAC1, HDAC2, HDAC3, HDAC8, fungal RPD3 and HOS2, plant HDA9, protist, archaeal and bacterial (AcuC) deacetylases. Members of this class are involved in cell cycle regulation, DNA damage response, embryonic development, cytokine signaling important for immune response and in posttranslational control of the acetyl coenzyme A synthetase.	294
212544	cd11681	HDAC_classIIa	Histone deacetylases, class IIa. Class IIa histone deacetylases are Zn-dependent enzymes that catalyze hydrolysis of N(6)-acetyl-lysine residues of histones (EC 3.5.1.98) to yield deacetylated histones. This subclass includes animal HDAC4, HDAC5, HDAC7, and HDAC9. The histone acetylation/deacetylation process is important for mediation of transcriptional regulation of many genes. Histone deacetylases usually act via association with DNA binding proteins to target specific chromatin regions. Class IIa histone deacetylases are signal-dependent co-repressors; they have an N-terminal regulatory domain with two or three conserved serine residues, and phosphorylation of these residues is important for their ability to shuttle between the nucleus and cytoplasm and act as transcriptional co-repressors. HDAC9 is involved in regulation of gene expression and dendritic growth in developing cortical neurons. It also plays a role in hematopoiesis. HDAC7 is involved in regulation of myocyte migration and differentiation. HDAC5 is involved in integration of chronic drug (cocaine) addiction and depression with changes in chromatin structure and gene expression. HDAC4 participates in regulation of chondrocyte hypertrophy and skeletogenesis.	377
212545	cd11682	HDAC6-dom1	Histone deacetylase 6, domain 1. Histone deacetylases 6 are class IIb Zn-dependent enzymes that catalyze hydrolysis of N(6)-acetyl-lysine of a histone to yield a deacetylated histone (EC 3.5.1.98). Histone acetylation/deacetylation process is important for mediation of transcriptional regulation of many genes. HDACs usually act via association with DNA binding proteins to target specific chromatin regions. HDAC6 is the only histone deacetylase with internal duplication of two catalytic domains which appear to function independently of each other, and also has a C-terminal ubiquitin-binding domain. It is located in the cytoplasm and associates with microtubule motor complex, functioning as the tubulin deacetylase and regulating microtubule-dependent cell motility. Known interaction partners of HDAC6 are alpha tubulin (substrate) and ubiquitin-like modifier FAT10 (also known as Ubiquitin D or UBD).	337
212546	cd11683	HDAC10	Histone deacetylase 10. Histone deacetylases 10 are class IIb Zn-dependent enzymes that catalyze hydrolysis of N(6)-acetyl-lysine of a histone to yield a deacetylated histone (EC 3.5.1.98). Histone acetylation/deacetylation process is important for mediation of transcriptional regulation of many genes. HDACs usually act via association with DNA binding proteins to target specific chromatin regions. HDAC10 has an N-terminal deacetylase domain and a C-terminal pseudo-repeat that shares significant similarity with its catalytic domain. It is located in the nucleus and cytoplasm, and is involved in regulation of melanogenesis. It transcriptionally down-regulates thioredoxin-interacting protein (TXNIP), leading to altered reactive oxygen species (ROS) signaling in human gastric cancer cells. Known interaction partners of HDAC10 are Pax3, KAP1, hsc70 and HDAC3 proteins.	337
212566	cd11684	DHR2_DOCK	Dock Homology Region 2, a GEF domain, of Dedicator of Cytokinesis proteins. DOCK proteins comprise a family of atypical guanine nucleotide exchange factors (GEFs) that lack the conventional Dbl homology (DH) domain. As GEFs, they activate the small GTPases Rac and Cdc42 by exchanging bound GDP for free GTP. They are also called the CZH (CED-5, Dock180, and MBC-zizimin homology) family, after the first family members identified. Dock180 was first isolated as a binding partner for the adaptor protein Crk. The Caenorhabditis elegans protein, Ced-5, is essential for cell migration and phagocytosis, while the Drosophila ortholog, Myoblast city (MBC), is necessary for myoblast fusion and dorsal closure. DOCKs are divided into four classes (A-D) based on sequence similarity and domain architecture: class A includes Dock1 (or Dock180), 2 and 5; class B includes Dock3 and 4; class C includes Dock6, 7, and 8; and class D includes Dock9, 10 and 11. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1, and DHR-2 (also called CZH2 or Docker). This alignment model represents the DHR-2 domain of DOCK proteins, which contains the catalytic GEF activity for Rac and/or Cdc42.	392
212582	cd11687	PpPFK_gamma	Pichia pastoris 6-phosphofructokinase, gamma subunit. Pichia pastoris 6-phosphofructokinase (PpPfk) is the most complex and probably largest (1 MDa) eukaryotic Pfk. It forms a dodecamer of four alpha-beta-gamma trimers. The gamma unit is unique, in contrast to other eukaryotic ATP-dependent 6-phosphofructokinases, and participates in oligomerization of the alpha and beta chains. It is not essential for enzymatic activity, but it modulates the allosteric behavior of the enzyme.	346
212583	cd11688	THUMP	THUMP domain, predicted to bind RNA. The THUMP domain is named after THioUridine synthases, RNA Methyltransferases and Pseudo-uridine synthases. It is predicted to be an RNA-binding domain and probably functions by delivering a variety of RNA modification enzymes to their targets.	148
212588	cd11689	SidM_DrrA_GEF	guanine nucleotide-exchange factor domain of Legionella SidM/DrrA. Effector protein DrrA of Legionella pneumophila, an intracellular pathogen, is a potent guanine nucleotide-exchange factor (GEF) specific for the host Rab1 GTPase. It competes with endogenous exchange factors to recruit and activate Rab1 on plasma membrane-derived organelles, thereby effectively hijacking the host's vesicle trafficking to avoid phagosome-lysosome fusion.	187
212589	cd11690	Tsi2_like	Tse2 immunity protein Tsi2 and similar proteins. Tsi2 is an essential protein in Pseudomonas aeruginosa, providing protection from the activity of Tse2, most likely by directly interacting with Tse2. Tse2 is a toxin transported via the type VI secretion system and is targeted towards other bacteria in the environment.	72
212590	cd11691	HRI1_like	Tandem repeat domain of HRI1 and related proteins. Saccharomyces cerevisiae Hri1p (Hrr25-interacting protein 1, YLR301w) is a non-essential gene product named for its interaction with the yeast protein kinase Hrr25p. It has also been characterized as an interaction partner for Sec72p, but does not seem to be required for protein translocation into the ER. It may be a cytosolic protein. Hri1p contains a tandem repeat of a structural unit that forms a beta-barrel with structural similarity to nitrobindin. The two repeats are sequence dissimilar, and the second (c-terminal) repeat is missing several strands, forming an incomplete barrel.	101
212591	cd11692	HRI1_N_like	N-terminal domain of HRI1 and related proteins. Saccharomyces cerevisiae Hri1p (Hrr25-interacting protein 1, YLR301w) is a non-essential gene product named for its interaction with the yeast protein kinase Hrr25p. It has also been characterized as an interaction partner for Sec72p, but does not seem to be required for protein translocation into the ER. It may be a cytosolic protein. Hri1p contains a tandem repeat of a structural unit that forms a beta-barrel with structural similarity to nitrobindin. This N-terminal repeat is involved in homodimerization and may contain a ligand binding site.	134
212592	cd11693	HRI1_C_like	C-terminal domain of HRI1 and related proteins. Saccharomyces cerevisiae Hri1p (Hrr25-interacting protein 1, YLR301w) is a non-essential gene product named for its interaction with the yeast protein kinase Hrr25p. It has also been characterized as an interaction partner for Sec72p, but does not seem to be required for protein translocation into the ER. It may be a cytosolic protein. Hri1p contains a tandem repeat of a structural unit that forms a beta-barrel with structural similarity to nitrobindin. This C-terminal repeat is missing several strands and forms an incomplete barrel.	90
212567	cd11694	DHR2_DOCK_D	Dock Homology Region 2, a GEF domain, of Class D Dedicator of Cytokinesis proteins. DOCK proteins are atypical guanine nucleotide exchange factors (GEFs) that lack the conventional Dbl homology (DH) domain. As GEFs, they activate small GTPases by exchanging bound GDP for free GTP. They are divided into four classes (A-D) based on sequence similarity and domain architecture; class D, also called the Zizimin subfamily, includes Dock9, 10 and 11. Class D Docks are specific GEFs for Cdc42. Dock9 plays important roles in spine formation and dendritic growth. Dock10 and Dock11 are preferentially expressed in lymphocytes. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). The DHR-1 domain binds phosphatidylinositol-3,4,5-triphosphate. This alignment model represents the DHR-2 domain of class D DOCKs, which contains the catalytic GEF activity for Cdc42. Class D DOCKs also contain a Pleckstrin homology (PH) domain at the N-terminus.	376
212568	cd11695	DHR2_DOCK_C	Dock Homology Region 2, a GEF domain, of Class C Dedicator of Cytokinesis proteins. DOCK proteins are atypical guanine nucleotide exchange factors (GEFs) that lack the conventional Dbl homology (DH) domain. As GEFs, they activate small GTPases by exchanging bound GDP for free GTP. They are divided into four classes (A-D) based on sequence similarity and domain architecture; class C, also called the Zizimin-related (Zir) subfamily, includes Dock6, 7 and 8. Class C DOCKs have been shown to have GEF activity for both Rac and Cdc42. Dock6 regulates neurite outgrowth. Dock7 plays critical roles in the early stages of axon formation, neuronal polarity, and myelination. Dock8 regulates T and B cell numbers and functions, and plays essential roles in humoral immune responses and the proper formation of B cell immunological synapses. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). The DHR-1 domain binds phosphatidylinositol-3,4,5-triphosphate. This alignment model represents the DHR-2 domain of Class C Docks, which contains the catalytic GEF activity for Rac and Cdc42.	368
212569	cd11696	DHR2_DOCK_B	Dock Homology Region 2, a GEF domain, of Class B Dedicator of Cytokinesis proteins. DOCK proteins are atypical guanine nucleotide exchange factors (GEFs) that lack the conventional Dbl homology (DH) domain. As GEFs, they activate small GTPases by exchanging bound GDP for free GTP. They are divided into four classes (A-D) based on sequence similarity and domain architecture; class B includes Dock3 and 4. Dock3 is a specific GEF for Rac and it regulates N-cadherin dependent cell-cell adhesion, cell polarity, and neuronal morphology. It promotes axonal growth by stimulating actin polymerization and microtubule assembly. Dock4 activates the Ras family GTPase Rap1, probably indirectly through interaction with Rap regulatory proteins. It plays a role in regulating dendritic growth and branching in hippocampal neurons, where it is highly expressed. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). The DHR-1 domain binds phosphatidylinositol-3,4,5-triphosphate. This alignment model represents the DHR-2 domain of class B DOCKs, which contains the catalytic GEF activity for Rac and/or Cdc42. Class B DOCKs also contain an SH3 domain at the N-terminal region and a PxxP motif at the C-terminus.	391
212570	cd11697	DHR2_DOCK_A	Dock Homology Region 2, a GEF domain, of Class A Dedicator of Cytokinesis proteins. DOCK proteins are atypical guanine nucleotide exchange factors (GEFs) that lack the conventional Dbl homology (DH) domain. As GEFs, they activate small GTPases by exchanging bound GDP for free GTP. They are divided into four classes (A-D) based on sequence similarity and domain architecture; class A includes Dock1, 2 and 5. Class A DOCKs are specific GEFs for Rac. Dock1 interacts with the scaffold protein Elmo and the resulting complex functions upstream of Rac in many biological events including phagocytosis of apoptotic cells, cell migration and invasion. Dock2 plays an important role in lymphocyte migration and activation, T-cell differentiation, neutrophil chemotaxis, and type I interferon induction. Dock5 functions upstream of Rac1 to regulate osteoclast function. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). The DHR-1 domain binds phosphatidylinositol-3,4,5-triphosphate. This alignment model represents the DHR-2 domain of class A DOCKs, which contains the catalytic GEF activity for Rac and/or Cdc42. Class A DOCKs also contain an SH3 domain at the N-terminal region and a PxxP motif at the C-terminus.	400
212571	cd11698	DHR2_DOCK9	Dock Homology Region 2, a GEF domain, of Class D Dedicator of Cytokinesis 9. Dock9, also called Zizimin1, is an atypical guanine nucleotide exchange factor (GEF) that lacks the conventional Dbl homology (DH) domain. As a GEF, it activates the small GTPase Cdc42 by exchanging bound GDP for free GTP. It plays important roles in spine formation and dendritic growth. DOCK proteins are divided into four classes (A-D) based on sequence similarity and domain architecture; class D includes Dock9, 10 and 11. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). The DHR-1 domain binds phosphatidylinositol-3,4,5-triphosphate. This alignment model represents the DHR-2 domain of Dock9, which contains the catalytic GEF activity for Cdc42. Class D DOCKs also contain a Pleckstrin homology (PH) domain at the N-terminus.	415
212572	cd11699	DHR2_DOCK10	Dock Homology Region 2, a GEF domain, of Class D Dedicator of Cytokinesis 10. Dock10, also called Zizimin3, is an atypical guanine nucleotide exchange factor (GEF) that lacks the conventional Dbl homology (DH) domain. As a GEF, it activates the small GTPase Cdc42 by exchanging bound GDP for free GTP. Dock10 is preferentially expressed in lymphocytes and may play a role in interleukin-4 induced activation of B cells. It may also play a role in the invasion of tumor cells. DOCK proteins are divided into four classes (A-D) based on sequence similarity and domain architecture; class D includes Dock9, 10 and 11. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). The DHR-1 domain binds phosphatidylinositol-3,4,5-triphosphate. This alignment model represents the DHR-2 domain of Dock10, which contains the catalytic GEF activity for Cdc42. Class D DOCKs also contain a Pleckstrin homology (PH) domain at the N-terminus.	446
212573	cd11700	DHR2_DOCK11	Dock Homology Region 2, a GEF domain, of Class D Dedicator of Cytokinesis 11. Dock11, also called Zizimin2 or activated Cdc42-associated GEF (ACG), is an atypical guanine nucleotide exchange factor (GEF) that lacks the conventional Dbl homology (DH) domain. As a GEF, it activates the small GTPase Cdc42 by exchanging bound GDP for free GTP. Dock11 is predominantly expressed in lymphocytes and is found in high levels in germinal center B lymphocytes after T cell dependent antigen immunization. DOCK proteins are divided into four classes (A-D) based on sequence similarity and domain architecture; class D includes Dock9, 10 and 11. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). The DHR-1 domain binds phosphatidylinositol-3,4,5-triphosphate. This alignment model represents the DHR-2 domain of Dock11, which contains the catalytic GEF activity for Cdc42. Class D DOCKs also contain a Pleckstrin homology (PH) domain at the N-terminus.	413
212574	cd11701	DHR2_DOCK8	Dock Homology Region 2, a GEF domain, of Class C Dedicator of Cytokinesis 8. Dock8, also called Zizimin-related 3 (Zir3), is an atypical guanine nucleotide exchange factor (GEF) that lacks the conventional Dbl homology (DH) domain. As a GEF, it activates the small GTPases Rac1 and Cdc42 by exchanging bound GDP for free GTP. Dock8 is highly expressed in the immune system and it regulates T and B cell numbers and functions. It plays essential roles in humoral immune responses and the proper formation of B cell immunological synapses. Dock8 deficiency is a primary immune deficiency that results in extreme susceptibility to cutaneous viral infections, elevated IgE levels, and eosinophilia. It was originally described as an autosomal recessive form of hyper IgE syndrome (AR-HIES). DOCK proteins are divided into four classes (A-D) based on sequence similarity and domain architecture; class C includes Dock6, 7 and 8. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). The DHR-1 domain binds phosphatidylinositol-3,4,5-triphosphate. This alignment model represents the DHR-2 domain of Dock8, which contains the catalytic GEF activity for Rac and/or Cdc42.	422
212575	cd11702	DHR2_DOCK6	Dock Homology Region 2, a GEF domain, of Class C Dedicator of Cytokinesis 6. Dock6, also called Zizimin-related 1 (Zir1), is an atypical guanine nucleotide exchange factor (GEF) that lacks the conventional Dbl homology (DH) domain. As a GEF, it activates the small GTPases Rac and Cdc42 by exchanging bound GDP for free GTP. It is widely expressed and shows highest expression in the dorsal root ganglion and the brain. It regulates neurite outgrowth. DOCK proteins are divided into four classes (A-D) based on sequence similarity and domain architecture; class C includes Dock6, 7 and 8. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). The DHR-1 domain binds phosphatidylinositol-3,4,5-triphosphate. This alignment model represents the DHR-2 domain of Dock6, which contains the catalytic GEF activity for Rac and/or Cdc42.	423
212576	cd11703	DHR2_DOCK7	Dock Homology Region 2, a GEF domain, of Class C Dedicator of Cytokinesis 7. Dock7, also called Zizimin-related 2 (Zir2), is an atypical guanine nucleotide exchange factor (GEF) that lacks the conventional Dbl homology (DH) domain. As a GEF, it activates the small GTPases Rac1 and Cdc42 by exchanging bound GDP for free GTP. It plays a critical role in the initial specification of axon formation in hippocampal neurons. It affects neuronal polarity by regulating microtubule dynamics. Dock7 also plays a role in controlling myelination by Schwann cells. It may also play important roles in the function and distribution of dermal and follicular melanocytes. DOCK proteins are divided into four classes (A-D) based on sequence similarity and domain architecture; class C includes Dock6, 7 and 8. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). The DHR-1 domain binds phosphatidylinositol-3,4,5-triphosphate. This alignment model represents the DHR-2 domain of Dock7, which contains the catalytic GEF activity for Rac and/or Cdc42.	473
212577	cd11704	DHR2_DOCK3	Dock Homology Region 2, a GEF domain, of Class B Dedicator of Cytokinesis 3. Dock3, also called modifier of cell adhesion (MOCA), is an atypical guanine nucleotide exchange factor (GEF) that lacks the conventional Dbl homology (DH) domain. As a GEF, it activates small GTPases by exchanging bound GDP for free GTP. Dock3 is a specific GEF for Rac. It regulates N-cadherin dependent cell-cell adhesion, cell polarity, and neuronal morphology. It promotes axonal growth by stimulating actin polymerization and microtubule assembly. DOCK proteins are divided into four classes (A-D) based on sequence similarity and domain architecture; class B includes Dock3 and 4. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). The DHR-1 domain binds phosphatidylinositol-3,4,5-triphosphate. This alignment model represents the DHR-2 domain of Dock3, which contains the catalytic GEF activity for Rac and/or Cdc42. Class B DOCKs also contain an SH3 domain at the N-terminal region and a PxxP motif at the C-terminus.	392
212578	cd11705	DHR2_DOCK4	Dock Homology Region 2, a GEF domain, of Class B Dedicator of Cytokinesis 4. Dock4 is an atypical guanine nucleotide exchange factor (GEF) that lacks the conventional Dbl homology (DH) domain. As a GEF, it activates small GTPases by exchanging bound GDP for free GTP. It plays a role in regulating dendritic growth and branching in hippocampal neurons, where it is highly expressed. It may also regulate spine morphology and synapse formation. Dock4 activates the Ras family GTPase Rap1, probably indirectly through interaction with Rap regulatory proteins. DOCK proteins are divided into four classes (A-D) based on sequence similarity and domain architecture; class B includes Dock3 and 4. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). The DHR-1 domain binds phosphatidylinositol-3,4,5-triphosphate. This alignment model represents the DHR-2 domain of Dock4, which contains the catalytic GEF activity for Rac and/or Cdc42. Class B DOCKs also contain an SH3 domain at the N-terminal region and a PxxP motif at the C-terminus.	391
212579	cd11706	DHR2_DOCK2	Dock Homology Region 2, a GEF domain, of Class A Dedicator of Cytokinesis 2. Dock2 is a hematopoietic cell-specific, class A DOCK and is an atypical guanine nucleotide exchange factor (GEF) that lacks the conventional Dbl homology (DH) domain. As a GEF, it activates small GTPases by exchanging bound GDP for free GTP. It plays an important role in lymphocyte migration and activation, T-cell differentiation, neutrophil chemotaxis, and type I interferon induction. DOCK proteins are divided into four classes (A-D) based on sequence similarity and domain architecture; class A includes Dock1, 2 and 5. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). The DHR-1 domain binds phosphatidylinositol-3,4,5-triphosphate. This alignment model represents the DHR-2 domain of Dock2, which contains the catalytic GEF activity for Rac and/or Cdc42. Class A DOCKs, like Dock2, are specific GEFs for Rac and they contain an SH3 domain at the N-terminal region and a PxxP motif at the C-terminus.	421
212580	cd11707	DHR2_DOCK1	Dock Homology Region 2, a GEF domain, of Class A Dedicator of Cytokinesis 1. Dock1, also called Dock180, is an atypical guanine nucleotide exchange factor (GEF) that lacks the conventional Dbl homology (DH) domain. As a GEF, it activates small GTPases by exchanging bound GDP for free GTP. Dock1 interacts with the scaffold protein Elmo and the resulting complex functions upstream of Rac in many biological events including phagocytosis of apoptotic cells, cell migration and invasion. In the nervous system, it mediates attractive responses to netrin-1 and thus, plays a role in axon outgrowth and pathfinding. DOCK proteins are divided into four classes (A-D) based on sequence similarity and domain architecture; class A includes Dock1, 2 and 5. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). The DHR-1 domain binds phosphatidylinositol-3,4,5-triphosphate. This alignment model represents the DHR-2 domain of Dock1, which contains the catalytic GEF activity for Rac and/or Cdc42. Class A DOCKs, like Dock1, are specific GEFs for Rac and they contain an SH3 domain at the N-terminal region and a PxxP motif at the C-terminus.	400
212581	cd11708	DHR2_DOCK5	Dock Homology Region 2, a GEF domain, of Class A Dedicator of Cytokinesis 5. Dock5 is an atypical guanine nucleotide exchange factor (GEF) that lacks the conventional Dbl homology (DH) domain. As a GEF, it activates small GTPases by exchanging bound GDP for free GTP. It functions upstream of Rac1 to regulate osteoclast function. DOCK proteins are divided into four classes (A-D) based on sequence similarity and domain architecture; class A includes Dock1, 2 and 5. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). The DHR-1 domain binds phosphatidylinositol-3,4,5-triphosphate. This alignment model represents the DHR-2 domain of Dock5, which contains the catalytic GEF activity for Rac and/or Cdc42. Class A DOCKs, like Dock5, are specific GEFs for Rac and they contain an SH3 domain at the N-terminal region and a PxxP motif at the C-terminus.	400
293931	cd11709	SPRY	SPRY domain. SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). TRIM/RBCC proteins are involved in a variety of processes, including apoptosis, cell cycle regulation, cell growth, senescence, viral response, meiosis, cell differentiation, and vesicular transport. Genes belonging to this family are implicated in several human diseases that vary from cancer to rare genetic syndromes. The PRY-SPRY domain in these TRIM families is suggested to serve as the target binding site. While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have been shown to cause Mediterranean fever and Opitz syndrome.	118
212548	cd11710	GINS_A_psf1	Alpha-helical domain of GINS complex protein Psf1. Psf1 is a component of the GINS tetrameric protein complex. Psf1 is mainly expressed in highly proliferative tissues, such as blastocysts, adult bone marrow, and testis, in which the stem cell system is active. Loss of Psf1 causes embryonic lethality. GINS is a complex of four subunits (Sld5, Psf1, Psf2 and Psf3) that is involved in both initiation and elongation stages of eukaryotic chromosome replication. Besides being essential for the maintenance of genomic integrity, GINS plays a central role in coordinating DNA replication with cell cycle checkpoints and is involved in cell growth. The eukaryotic GINS subunits are homologous and homologs are also found in the archaea; the complex is not found in bacteria. The four subunits of the complex consist of two domains each, termed the alpha-helical (A) and beta-strand (B) domains. The A and B domains of Sld5/Psf1 are permuted with respect to Psf2/Psf3.	129
212549	cd11711	GINS_A_Sld5	Alpha-helical domain of GINS complex protein Sld5. Sld5 is a component of the GINS tetrameric protein complex, and within the complex Sld5 interacts with Psf1 via its N-terminal A-domain, and with Psf2 through a combination of the A and B domains. Sld5 in Drosophila is required for normal cell cycle progression and the maintenance of genomic integrity. GINS is a complex of four subunits (Sld5, Psf1, Psf2 and Psf3) that is involved in both initiation and elongation stages of eukaryotic chromosome replication. Besides being essential for the maintenance of genomic integrity, GINS plays a central role in coordinating DNA replication with cell cycle checkpoints and is involved in cell growth. The eukaryotic GINS subunits are homologous and homologs are also found in the archaea; the complex is not found in bacteria. The four subunits of the complex consist of two domains each, termed the alpha-helical (A) and beta-strand (B) domains. The A and B domains of Sld5/Psf1 are permuted with respect to Psf2/Psf3.	119
212550	cd11712	GINS_A_psf2	Alpha-helical domain of GINS complex protein Psf2 (partner of Sld5 2). Psf2 is a component of the GINS tetrameric protein complex and has been found to play important roles in normal eye development in Xenopus laevis. GINS is a complex of four subunits (Sld5, Psf1, Psf2 and Psf3) that is involved in both initiation and elongation stages of eukaryotic chromosome replication. Besides being essential for the maintenance of genomic integrity, GINS plays a central role in coordinating DNA replication with cell cycle checkpoints and is involved in cell growth. The eukaryotic GINS subunits are homologous and homologs are also found in the archaea; the complex is not found in bacteria. The four subunits of the complex consist of two domains each, termed the alpha-helical (A) and beta-strand (B) domains. The A and B domains of Sld5/Psf1 are permuted with respect to Psf2/Psf3.	119
212551	cd11713	GINS_A_psf3	Alpha-helical domain of GINS complex protein Psf3 (partner of Sld5 3). Psf3 is a component of GINS, a tetrameric protein complex. Psf3 expression is upregulated in malignant colon cancer and it might be involved in cancer cell proliferation. GINS is a complex of four subunits (Sld5, Psf1, Psf2 and Psf3) that is involved in both initiation and elongation stages of eukaryotic chromosome replication. Besides being essential for the maintenance of genomic integrity, GINS plays a central role in coordinating DNA replication with cell cycle checkpoints and is involved in cell growth. The eukaryotic GINS subunits are homologous and homologs are also found in the archaea; the complex is not found in bacteria. The four subunits of the complex consist of two domains each, termed the alpha-helical (A) and beta-strand (B) domains. The A and B domains of Sld5/Psf1 are permuted with respect to Psf2/Psf3.	109
212552	cd11714	GINS_A_archaea	Alpha-helical domain of archaeal GINS complex proteins. The GINS complex is involved in replication of archaeal and eukaryotic genomes. The archaeal DNA replication system is a simplified version of that of the eukaryotes. Like its eukaryotic counterpart, the archaeal GINS complex is tetrameric, but instead of four different subunits (Sld5, Psf1, Psf2 and Psf3) it consists of two different proteins named Gins51 and Gins23. All GINS subunits are homologs and they can be classified into two groups. One group (the eukaryotic Sld5 and Psf1, as well as the archaeal Gins51) has the alpha-helical (A) domain at the N-terminus and the beta-strand domain (B) at the C-terminus (this arrangement is called AB-type). The arrangement of the A and B domains is reversed in the second group (eukaryotic Psf2 and Psf3 and archaeal Gins23, also referred to as BA-type). The overall fold of each archaeal subunit and the overall tetrameric assembly of GINS are similar, but the relative locations of the C-terminal small domains are different with respect to the alpha helical domain characterized by this model, resulting in different subunit contacts in the archaeal GINS complex. Some archaea may have a homotetrameric GINS complex (4 copies of an AB-type module).	105
212584	cd11715	THUMP_AdoMetMT	THUMP domain associated with S-adenosylmethionine-dependent methyltransferases. Proteins of this family contain an N-terminal THUMP domain and a C-terminal S-adenosylmethionine-dependent methyltransferase domain. Members have been implicated in the modification of 23S RNA m2G2445, a highly conserved modification in bacteria and in the m2G6 modification of tRNA.  The THUMP domain is named after thiouridine synthases, methylases and PSUSs. The domain consists of about 110 amino acid residues. It is predicted to be an RNA-binding domain and probably functions by delivering a variety of RNA modification enzymes to their targets.	152
212585	cd11716	THUMP_ThiI	THUMP domain of thiamine biosynthesis protein ThiI. ThiI is an enzyme responsible for the formation of the modified base S(4)U (4-thiouridine) found at position 8 in some prokaryotic tRNAs. This modification acts as a signal for UV exposure, triggering a response that provides protection against its damaging effects. ThiI consists of an N-terminal THUMP domain, followed by an NFLD domain, and a C-terminal PP-loop pyrophosphatase domain. The N-terminal THUMP domain has been implicated in the recognition of the acceptor-stem region. The THUMP domain is named after thiouridine synthases, methylases and PSUSs. The domain consists of about 110 amino acid residues. It is predicted to be an RNA-binding domain and probably functions by delivering a variety of RNA modification enzymes to their targets.	166
212586	cd11717	THUMP_THUMPD1_like	THUMP domain-containing protein 1-like. This family contains THUMP domain-only proteins including THUMP domain-containing protein 1 and Saccharomyces cerevisiae Tan1. Tan1 is nonessential and has been shown to be required for the formation of the modified nucleoside N(4)-acetylcytidine (ac(4)C) in tRNA. To date, there is no functional information available about THUMPD1. The THUMP domain is named after thiouridine synthases, methylases and PSUSs. The domain consists of about 110 amino acid residues. It is predicted to be an RNA-binding domain and probably functions by delivering a variety of RNA modification enzymes to their targets.	158
212587	cd11718	THUMP_SPOUT	THUMP domain associated with SPOUT RNA Methylases. Members of this archaeal protein family are characterized by containing an N-terminal THUMP domain and a C-terminal SPOUT RNA methyltransferase domain. No functional information is available. The THUMP domain is named after thiouridine synthases, methylases and PSUSs. The domain consists of about 110 amino acid residues. It is predicted to be an RNA-binding domain and probably functions by delivering a variety of RNA modification enzymes to their targets.	145
212593	cd11719	FANC	Fanconi anemia ID complex proteins FANCI and FANCD2. The Fanconi anemia ID complex consists of two subunits, Fanconi anemia I and Fanconi anemia D2 (FANCI-FANCD2) and plays a central role in the repair of DNA interstrand cross-links (ICLs). The complex is activated via DNA damage-induced phosphorylation by ATR (ataxia telangiectasia and Rad3-related) and monoubiquitination by the FA core complex ubiquitin ligase, and it binds to DNA at the ICL site, recognizing branched DNA structures. Defects in the complex cause Fanconi anemia, a cancer predisposition syndrome.	977
212594	cd11720	FANCI	Fanconi anemia I protein. The Fanconi anemia ID complex consists of two subunits, Fanconi anemia I and Fanconi anemia D2 (FANCI-FANCD2) and plays a central role in the repair of DNA interstrand cross-links (ICLs). The complex is activated via DNA damage-induced phosphorylation by ATR (ataxia telangiectasia and Rad3-related) and monoubiquitination by the FA core complex ubiquitin ligase, and it binds to DNA at the ICL site, recognizing branched DNA structures. Defects in the complex cause Fanconi anemia, a cancer predisposition syndrome. The phosphorylation of FANCI may function as a molecular switch to turn on the FA pathway.	1202
212595	cd11721	FANCD2	Fanconi anemia D2 protein. The Fanconi anemia ID complex consists of two subunits, Fanconi anemia I and Fanconi anemia D2 (FANCI-FANCD2) and plays a central role in the repair of DNA interstrand cross-links (ICLs). The complex is activated via DNA damage-induced phosphorylation by ATR (ataxia telangiectasia and Rad3-related) and monoubiquitination by the FA core complex ubiquitin ligase, and it binds to DNA at the ICL site, recognizing branched DNA structures. Defects in the complex cause Fanconi anemia, a cancer predisposition syndrome. The phosphorylation of FANCD2 is required for DNA damage-induced intra-S phase checkpoint and for cellular resistance to DNA crosslinking agents.	1161
212596	cd11722	SOAR	STIM1 Orai1-activating region. STIM1 (stromal interaction module 1) is a metazoan transmembrane protein located in the endoplasmic reticulum (ER) membrane, which functions as a sensor for ER calcium ion levels and activates store-operated Ca2+ influx channels (SOCs), such as the Orai1 Ca2+ channel located in the plasma membrane. STIM1 has an N-terminal Ca-binding EF-hand domain, which is located in the ER lumen. Responding to the release of Ca2+ from the ER, STIM1 was found to aggregate near the plasma membrane and contact Orai1. This model describes a region near the C-terminus of STIM1, which has been shown to mediate the interaction with Orai1 and has been labeled SOAR (STIM1 Orai1-activating region). STIM1 has also been linked to sensing oxidative and temperature-variation stress and may play a rather general role in mediating calcium signaling in response to stress. Dimerization of STIM1 via the SOAR domain appears required for the activation of the Orai1 calcium channel. A model for STIM1 activation has been proposed, in which an inhibitory helix N-terminal to the SOAR domain prevents STIM1 clustering or aggregation, and in which conformational changes triggered by depletion of the calcium stores allow the clustering and activation of Orai1.	92
381177	cd11723	YabN_N_like	N-terminal S-AdoMet-dependent tetrapyrrole methylase domain of Bacillus subtilis YabN and similar domains. This family includes the S-AdoMet (S-adenosyl-L-methionine or SAM)-dependent tetrapyrrole methylase (TP-methylase) domain of Bacillus subtilis YabN, and similar domains. YabN is a fusion of an N-terminal TP-methylase and a C-terminal MazG-type nucleotide pyrophosphohydrolase domain. MazG-like NTP-PPases have been implicated in house-cleaning functions such as degrading abnormal (d)NTPs. TP-methylases use S-AdoMet in the methylation of diverse substrates. Most TP-methylase family members catalyze various methylation steps in cobalamin (vitamin B12) biosynthesis; other members like diphthine synthase and ribosomal RNA small subunit methyltransferase I (RsmI) act on other substrates. The specific function of YabN's TP-methylase domain is not known.	218
381178	cd11724	TP_methylase	uncharacterized family of the tetrapyrrole methylase superfamily. Members of this superfamily use S-AdoMet (S-adenosyl-L-methionine or SAM) in the methylation of diverse substrates. Most members catalyze various methylation steps in cobalamin (vitamin B12) biosynthesis. There are two distinct cobalamin biosynthetic pathways in bacteria. The aerobic pathway requires oxygen, and cobalt is inserted late in the pathway; the anaerobic pathway does not require oxygen, and cobalt insertion is the first committed step towards cobalamin synthesis. The enzymes involved in the aerobic pathway are prefixed Cob and those of the anaerobic pathway Cbi. Most of the enzymes are shared by both pathways and a few enzymes are pathway-specific. Diphthine synthase and Ribosomal RNA small subunit methyltransferase I (RsmI) are two superfamily members that are not involved in cobalamin biosynthesis. Diphthine synthase participates in the posttranslational modification of a specific histidine residue in elongation factor 2 (EF-2) of eukaryotes and archaea to diphthamide. RsmI catalyzes the 2-O-methylation of the ribose of cytidine 1402 (C1402) in 16S rRNA. Other superfamily members not involved in cobalamin biosynthesis include the N-terminal tetrapyrrole methylase domain of Bacillus subtilis YabN whose specific function is unknown, and Omphalotus olearius omphalotin methyltransferase which catalyzes the automethylation of its own C-terminus; this C terminus is subsequently released and macrocyclized to give Omphalotin A, a potent nematicide.	243
277251	cd11725	ADDz_Dnmt3	ADDz domain found in DNA (cytosine-5) methyltransferases (C5-MTases) 3 (Dnmt3). Dnmt3 is a de novo DNA methyltransferase family that includes two active enzymes Dnmt3a and -3b and one regulatory factor Dnmt3l. The ADDz domain of Dnmt3 is located in the C-terminal region of Dnmt3, which is an active catalytic domain in Dnmt3a and -b, but lacks some residues for enzymatic activity in Dnmt3l. DNA methylation is an important epigenetic mechanism involved in diverse biological processes such as embryonic development, gene expression, and genomic imprinting. The ADDz_Dnmt3 domain is a PHD-like zinc finger motif that contains two parts, a C2-C2 and a PHD-like zinc finger. PHD zinc finger domains have been identified in more than 40 proteins that are mainly involved in chromatin mediated transcriptional control; the classical PHD zinc finger has a C4-H-C3 motif that spans about 50-80 amino acids. In ADDz, the conserved histidine residue of the PHD finger is replaced by a cysteine, and an additional zinc finger C2-C2 like motif is located about twenty residues upstream of the C4-C-C3 motif.	108
277252	cd11726	ADDz_ATRX	ADDz domain found in ATRX (alpha-thalassemia/mental retardation, X-linked). ADDz_ATRX is a PHD-like zinc finger domain of ATRX, which belongs to the SNF2 family of chromatin remodeling proteins. ATRX is a large chromatin-associated nuclear protein with two domains, ADDz_ATRX at the N-terminus, followed by a C-terminal ATPase/helicase domain. The ADDz_ATRX domain recognizes a specific methylated histone, and this interaction is required for heterochromatin localization of the ATRX protein. Missense mutations in either of the two ATRX domains lead to the X-linked alpha-thalassemia and mental retardation syndrome; however the mutations in the ADDz_ATRX domain produce a more severe disease phenotype that may also relate to disturbing unknown functions or interaction sites of this domain. The ADDz domain is also present in chromatin-associated proteins cytosine-5-methyltransferase 3 (Dnmt3); it is a PHD-like zinc finger motif that contains two parts, a C2-C2 and a PHD-like zinc finger. PHD zinc finger domains have been identified in more than 40 proteins that are mainly involved in chromatin mediated transcriptional control; the classical PHD zinc finger has a C4-H-C3 motif that spans about 50-80 amino acids. In ADDz, the conserved histidine residue of the PHD finger is replaced by a cysteine, and an additional zinc finger C2-C2 like motif is located about twenty residues upstream of the C4-C-C3 motif.	102
277253	cd11727	ADDz_Dnmt3l	ADDz domain found in DNA (cytosine-5) methyltransferases (C5-MTases) 3 like (Dnmt3l). Dnmt3l is a regulator of DNA methylation, which acts by recognizing unmethylated histone H3 tails and interacting with Dnmt3a to stimulate its de novo DNA methylation activity. The ADDz_Dnmt3l domain is located in the C-terminal region of Dnmt3l that otherwise lacks some residues required for DNA methyltransferase activity. DNA methylation is an important epigenetic mechanism involved in diverse biological processes such as embryonic development, gene expression, and genomic imprinting. Dnmt3l also associates with HDAC1 and acts as a transcriptional repressor. The ADDz_Dnmt3l domain is a PHD-like zinc finger motif that contains two parts, a C2-C2 and a PHD-like zinc finger. PHD zinc finger domains have been identified in more than 40 proteins that are mainly involved in chromatin mediated transcriptional control; the classical PHD zinc finger has a C4-H-C3 motif that spans about 50-80 amino acids. In ADDz, the conserved histidine residue of the PHD finger is replaced by a cysteine, and an additional zinc finger C2-C2 like motif is located about twenty residues upstream of the C4-C-C3 motif.	123
277254	cd11728	ADDz_Dnmt3b	ADDz domain found in DNA (cytosine-5) methyltransferases (C5-MTases) 3b (Dnmt3b). ADDz_Dnmt3b is an active catalytic domain of Dnmt3b. Dnmt3b is a member of the Dnmt3 family and is a de novo DNA methyltransferase that has an N-terminal variable region followed by a conserved PWWP region and the cysteine-rich ADDz domain. DNA methylation is an important epigenetic mechanism involved in diverse biological processes such as embryonic development, gene expression, and genomic imprinting. The methyltransferase activity of Dnmt3a is not only responsible for the establishment of DNA methylation pattern, but is also essential for the inheritance of these patterns during mitosis. Dnmt3b is ubiquitously expressed in most adult tissues. The ADDz_Dnmt3 domain is a PHD-like zinc finger motif that contains two parts, a C2-C2 and a PHD-like zinc finger. PHD zinc finger domains have been identified in more than 40 proteins that are mainly involved in chromatin mediated transcriptional control; the classical PHD zinc finger has a C4-H-C3 motif that spans about 50-80 amino acids. In ADDz, the conserved histidine residue of the PHD finger is replaced by a cysteine, and an additional zinc finger C2-C2 like motif is located about twenty residues upstream of the C4-C-C3 motif. A knockout of Dnmt3b has been shown to be lethal in the mouse model.	120
277255	cd11729	ADDz_Dnmt3a	ADDz domain found in DNA (cytosine-5) methyltransferases (C5-MTases) 3a (Dnmt3a). Dnmt3a is a member of the Dnmt3 family and is a protein with de novo DNA methyltransferase activity. Dnmt3 family members are Dnmt3a, Dnmt3b, and Dnmt3l, the non-enzymatic regulatory factor. Dnmt3a is recruited by Dnmt3l to unmethylated histone H3 and methylates the target. Dnmt3a has a variable region at the N-terminus, followed by a conserved PWWP region and the cysteine-rich ADDz domain. ADDz_Dnmt3a is an active catalytic domain of Dnmt3a. DNA methylation is an important epigenetic mechanism involved in diverse biological processes such as embryonic development, gene expression, and genomic imprinting. The methyltransferase activity of Dnmt3a is not only responsible for the establishment of DNA methylation pattern, but is also essential for the inheritance of these patterns during mitosis. The ADDz_Dnmt3 domain is a PHD-like zinc finger motif that contains two parts, a C2-C2 and a PHD-like zinc finger. PHD zinc finger domains have been identified in more than 40 proteins that are mainly involved in chromatin mediated transcriptional control; the classical PHD zinc finger has a C4-H-C3 motif that spans about 50-80 amino acids. In ADDz, the conserved histidine residue of the PHD finger is replaced by a cysteine, and an additional zinc finger C2-C2 like motif is located about twenty residues upstream of the C4-C-C3 motif. A knockout of Dnmt3a has been shown to be lethal in the mouse model.	128
212496	cd11730	Tthb094_like_SDR_c	Tthb094 and related proteins, classical (c) SDRs. Tthb094 from Thermus thermophilus is a classical SDR which binds NADP. Members of this subgroup contain the YXXXK active site characteristic of SDRs. Also, an upstream Asn residue of the canonical catalytic tetrad is partially conserved in this subgroup of proteins of undetermined function. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human prostaglandin dehydrogenase (PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, PGDH numbering) and/or an Asn (Asn-107, PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, Enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	206
212497	cd11731	Lin1944_like_SDR_c	Lin1944 and related proteins, classical (c) SDRs. Lin1944 protein from Listeria Innocua is a classical SDR, it contains a glycine-rich motif similar to the canonical motif of the SDR NAD(P)-binding site. However, the typical SDR active site residues are absent in this subgroup of proteins of undetermined function. SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes are typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns. These enzymes catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have an TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human prostaglandin dehydrogenase (PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, PGDH numbering) and/or an Asn (Asn-107, PGDH numbering) contributing to the active site; while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. 
Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif. Some atypical SDRs have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues. Reactions catalyzed within the SDR family include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.	198
212682	cd11732	HSP105-110_like_NBD	Nucleotide-binding domain of 105/110 kDa heat shock proteins including HSPA4, HYOU1, and similar proteins. This subfamily includes the human proteins, HSPA4 (also known as 70-kDa heat shock protein 4, APG-2, HS24/P52, hsp70 RY, and HSPH2; the human HSPA4 gene maps to 5q31.1), HSPA4L (also known as 70-kDa heat shock protein 4-like, APG-1, HSPH3, and OSP94; the human HSPA4L gene maps to 4q28), and HSPH1 (also known as heat shock 105kDa/110kDa protein 1, HSP105; HSP105A; HSP105B; NY-CO-25; the human HSPH1 gene maps to 13q12.3), HYOU1 (also known as human hypoxia up-regulated 1, GRP170; HSP12A; ORP150; GRP-170; ORP-150; the human HYOU1 gene maps to 11q23.1-q23.3), Saccharomyces cerevisiae Sse1p, Sse2p, and Lhs1p, and a sea urchin sperm receptor. It belongs to the 105/110 kDa heat shock protein (HSP105/110) subfamily of the HSP70-like family, and includes proteins believed to function generally as co-chaperones of HSP70 chaperones, acting as nucleotide exchange factors (NEFs), to remove ADP from their HSP70 chaperone partners during the ATP hydrolysis cycle. HSP70 chaperones assist in protein folding and assembly, and can direct incompetent "client" proteins towards degradation. Like HSP70 chaperones, HSP105/110s have an N-terminal nucleotide-binding domain (NBD) and a C-terminal substrate-binding domain (SBD). For HSP70 chaperones, the nucleotide sits in a deep cleft formed between the two lobes of the NBD. The two subdomains of each lobe change conformation between ATP-bound, ADP-bound, and nucleotide-free states. ATP binding opens up the substrate-binding site; substrate-binding increases the rate of ATP hydrolysis. HSP70 chaperone activity is also regulated by J-domain proteins.	377
212683	cd11733	HSPA9-like_NBD	Nucleotide-binding domain of human HSPA9, Escherichia coli DnaK, and similar proteins. This subgroup includes human mitochondrial HSPA9 (also known as 70-kDa heat shock protein 9, CSA; MOT; MOT2; GRP75; PBP74; GRP-75; HSPA9B; MTHSP75; the gene encoding HSPA9 maps to 5q31.1), Escherichia coli DnaK, and Saccharomyces cerevisiae Stress-Seventy subfamily C/Ssc1p (also called mtHSP70, Endonuclease SceI 75 kDa subunit). It belongs to the heat shock protein 70 (HSP70) family of chaperones that assist in protein folding and assembly, and can direct incompetent "client" proteins towards degradation. Typically, HSP70s have a nucleotide-binding domain (NBD) and a substrate-binding domain (SBD). The nucleotide sits in a deep cleft formed between the two lobes of the NBD. The two subdomains of each lobe change conformation between ATP-bound, ADP-bound, and nucleotide-free states. ATP binding opens up the substrate-binding site; substrate-binding increases the rate of ATP hydrolysis. Hsp70 chaperone activity is regulated by various co-chaperones: J-domain proteins and nucleotide exchange factors (NEFs); for Escherichia coli DnaK, these are DnaJ and GrpE, respectively. HSPA9 is involved in multiple processes including mitochondrial import, antigen processing, control of cellular proliferation and differentiation, and regulation of glucose responses. During glucose deprivation-induced cellular stress, HSPA9 plays an important role in the suppression of apoptosis by inhibiting a conformational change in Bax that allows the release of cytochrome c. DnaK modulates the heat shock response in Escherichia coli. It protects E. coli from protein carbonylation, an irreversible oxidative modification that increases during organism aging and bacterial growth arrest. Under severe thermal stress, it functions as part of a bi-chaperone system: the DnaK system and the ring-forming AAA+ chaperone ClpB (Hsp104) system, to promote cell survival. 
DnaK has also been shown to cooperate with GroEL and the ribosome-associated Escherichia coli Trigger Factor in the proper folding of cytosolic proteins. S. cerevisiae Ssc1p is the major HSP70 chaperone of the mitochondrial matrix, promoting translocation of proteins from the cytosol, across the inner membrane, to the matrix, and their subsequent folding. Ssc1p interacts with Tim44, a peripheral inner membrane protein associated with the TIM23 protein translocase. It is also a subunit of the endoSceI site-specific endoDNase and is required for full endoSceI activity. Ssc1p plays roles in the import of Yfh1p, a nucleus-encoded mitochondrial protein involved in iron homeostasis (and a homolog of human frataxin, implicated in the neurodegenerative disease, Friedreich's ataxia). Ssc1 also participates in translational regulation of cytochrome c oxidase (COX) biogenesis by interacting with Mss51 and Mss51-containing complexes.	377
212684	cd11734	Ssq1_like_NBD	Nucleotide-binding domain of Saccharomyces cerevisiae Ssq1 and similar proteins. Ssq1p (also called Stress-seventy subfamily Q protein 1, Ssc2p, Ssh1p, mtHSP70 homolog) belongs to the heat shock protein 70 (HSP70) family of chaperones that assist in protein folding and assembly, and can direct incompetent "client" proteins towards degradation. Typically, HSP70s have a nucleotide-binding domain (NBD) and a substrate-binding domain (SBD). The nucleotide sits in a deep cleft formed between the two lobes of the NBD. The two subdomains of each lobe change conformation between ATP-bound, ADP-bound, and nucleotide-free states. ATP binding opens up the substrate-binding site; substrate-binding increases the rate of ATP hydrolysis. Hsp70 chaperone activity is regulated by various co-chaperones: J-domain proteins and nucleotide exchange factors (NEFs). S. cerevisiae Ssq1p is a mitochondrial chaperone that is involved in iron-sulfur (Fe/S) center biogenesis. Ssq1p plays a role in the maturation of Yfh1p, a nucleus-encoded mitochondrial protein involved in iron homeostasis (and a homolog of human frataxin, implicated in the neurodegenerative disease, Friedreich's ataxia).	373
212685	cd11735	HSPA12A_like_NBD	Nucleotide-binding domain of HSPA12A and similar proteins. HSPA12A (also known as 70-kDa heat shock protein-12A) belongs to the heat shock protein 70 (HSP70) family of chaperones that assist in protein folding and assembly, and can direct incompetent "client" proteins towards degradation. Typically, HSP70s have a nucleotide-binding domain (NBD) and a substrate-binding domain (SBD). The nucleotide sits in a deep cleft formed between the two lobes of the NBD. The two subdomains of each lobe change conformation between ATP-bound, ADP-bound, and nucleotide-free states. ATP binding opens up the substrate-binding site; substrate-binding increases the rate of ATP hydrolysis. HSP70 chaperone activity is regulated by various co-chaperones: J-domain proteins and nucleotide exchange factors (NEFs). No co-chaperones have yet been identified for HSPA12A. The gene encoding HSPA12A maps to 10q26.12, a cytogenetic region that might represent a common susceptibility locus for both schizophrenia and bipolar affective disorder; reduced expression of HSPA12A has been shown in the prefrontal cortex of subjects with schizophrenia. HSPA12A is also a candidate gene for forelimb-girdle muscular anomaly, an autosomal recessive disorder of Japanese black cattle. HSPA12A is predominantly expressed in neuronal cells. It may play a role in the atherosclerotic process.	467
212686	cd11736	HSPA12B_like_NBD	Nucleotide-binding domain of HSPA12B and similar proteins. Human HSPA12B (also known as 70-kDa heat shock protein-12B, chromosome 20 open reading frame 60/C20orf60, dJ1009E24.2; the gene encoding HSPA12B maps to 20p13) belongs to the heat shock protein 70 (HSP70) family of chaperones that assist in protein folding and assembly, and can direct incompetent "client" proteins towards degradation. Typically, HSP70s have a nucleotide-binding domain (NBD) and a substrate-binding domain (SBD). The nucleotide sits in a deep cleft formed between the two lobes of the NBD. The two subdomains of each lobe change conformation between ATP-bound, ADP-bound, and nucleotide-free states. ATP binding opens up the substrate-binding site; substrate-binding increases the rate of ATP hydrolysis. HSP70 chaperone activity is regulated by various co-chaperones: J-domain proteins and nucleotide exchange factors (NEFs). No co-chaperones have yet been identified for HSPA12B. HSPA12B is predominantly expressed in endothelial cells, is required for angiogenesis, and may interact with known angiogenesis mediators. HSPA12B may be important for host defense in microglia-mediated immune response. HSPA12B expression is up-regulated in lipopolysaccharide (LPS)-induced inflammatory response in the spinal cord, and mostly located in active microglia; this induced expression may be regulated by activation of MAPK-p38, ERK1/2 and SAPK/JNK signaling pathways. Overexpression of HSPA12B also protects against LPS-induced cardiac dysfunction and involves the preserved activation of the PI3K/Akt signaling pathway.	468
212687	cd11737	HSPA4_NBD	Nucleotide-binding domain of HSPA4. Human HSPA4 (also known as 70-kDa heat shock protein 4, APG-2, HS24/P52, hsp70 RY, and HSPH2; the human HSPA4 gene maps to 5q31.1) responds to acidic pH stress, is involved in the radioadaptive response, is required for normal spermatogenesis and is overexpressed in hepatocellular carcinoma. It participates in a pathway along with NBS1 (Nijmegen breakage syndrome 1, also known as p85 or nibrin), heat shock transcription factor 4b (HSF4b), and HSPA14 (belonging to a different HSP70 subfamily) that induces tumor migration, invasion, and transformation. HSPA4 expression in sperm was increased in men with oligozoospermia, especially in those with varicocele. HSPA4 belongs to the 105/110 kDa heat shock protein (HSP105/110) subfamily of the HSP70-like family. HSP105/110s are believed to function generally as co-chaperones of HSP70 chaperones, acting as nucleotide exchange factors (NEFs), to remove ADP from their HSP70 chaperone partners during the ATP hydrolysis cycle. HSP70 chaperones assist in protein folding and assembly, and can direct incompetent "client" proteins towards degradation. Like HSP70 chaperones, HSP105/110s have an N-terminal nucleotide-binding domain (NBD) and a C-terminal substrate-binding domain (SBD). For HSP70 chaperones, the nucleotide sits in a deep cleft formed between the two lobes of the NBD. The two subdomains of each lobe change conformation between ATP-bound, ADP-bound, and nucleotide-free states. ATP binding opens up the substrate-binding site; substrate-binding increases the rate of ATP hydrolysis. Hsp70 chaperone activity is also regulated by J-domain proteins.	383
212688	cd11738	HSPA4L_NBD	Nucleotide-binding domain of HSPA4L. Human HSPA4L (also known as 70-kDa heat shock protein 4-like, APG-1, HSPH3, and OSP94; the human HSPA4L gene maps to 4q28) is expressed ubiquitously and predominantly in the testis. It is required for normal spermatogenesis and plays a role in osmotolerance. HSPA4L belongs to the 105/110 kDa heat shock protein (HSP105/110) subfamily of the HSP70-like family. HSP105/110s are believed to function generally as co-chaperones of HSP70 chaperones, acting as nucleotide exchange factors (NEFs), to remove ADP from their HSP70 chaperone partners during the ATP hydrolysis cycle. HSP70 chaperones assist in protein folding and assembly, and can direct incompetent "client" proteins towards degradation. Like HSP70 chaperones, HSP105/110s have an N-terminal nucleotide-binding domain (NBD) and a C-terminal substrate-binding domain (SBD). For HSP70 chaperones, the nucleotide sits in a deep cleft formed between the two lobes of the NBD. The two subdomains of each lobe change conformation between ATP-bound, ADP-bound, and nucleotide-free states. ATP binding opens up the substrate-binding site; substrate-binding increases the rate of ATP hydrolysis. Hsp70 chaperone activity is also regulated by J-domain proteins.	383
212689	cd11739	HSPH1_NBD	Nucleotide-binding domain of HSPH1. Human HSPH1 (also known as heat shock 105kDa/110kDa protein 1, HSP105; HSP105A; HSP105B; NY-CO-25; the human HSPH1 gene maps to 13q12.3) suppresses the aggregation of denatured proteins caused by heat shock in vitro, and may substitute for HSP70 family proteins to suppress the aggregation of denatured proteins in cells under severe stress. It reduces the protein aggregation and cytotoxicity associated with Polyglutamine (PolyQ) diseases, including Huntington's disease, which are a group of inherited neurodegenerative disorders sharing the characteristic feature of having insoluble protein aggregates in neurons. The expression of HSPH1 is elevated in various malignant tumors, including malignant melanoma, and there is a direct correlation between HSPH1 expression and B-cell non-Hodgkin lymphomas (B-NHLs) aggressiveness and proliferation. HSPH1 belongs to the 105/110 kDa heat shock protein (HSP105/110) subfamily of the HSP70-like family. HSP105/110s are believed to function generally as co-chaperones of HSP70 chaperones, acting as nucleotide exchange factors (NEFs), to remove ADP from their HSP70 chaperone partners during the ATP hydrolysis cycle. HSP70 chaperones assist in protein folding and assembly, and can direct incompetent "client" proteins towards degradation. Like HSP70 chaperones, HSP105/110s have an N-terminal nucleotide-binding domain (NBD) and a C-terminal substrate-binding domain (SBD). For HSP70 chaperones, the nucleotide sits in a deep cleft formed between the two lobes of the NBD. The two subdomains of each lobe change conformation between ATP-bound, ADP-bound, and nucleotide-free states. ATP binding opens up the substrate-binding site; substrate-binding increases the rate of ATP hydrolysis. Hsp70 chaperone activity is also regulated by J-domain proteins.	383
213038	cd11740	YajQ_like	Proteins similar to Escherichia coli YajQ. In Pseudomonas syringae, YajQ functions as a host protein involved in the temporal control of bacteriophage Phi6 gene transcription. It has been shown to bind to the phage's major structural core protein P1, most likely activating transcription by acting indirectly on the RNA polymerase. YajQ may remain bound to the phage particles throughout the infection period. Earlier, YajQ was characterized as a putative nucleic acid-binding protein based on the similarity of its (ferredoxin-like) three-dimensional topology with that of RNP-like RNA-binding domains.	159
240666	cd11741	TIN2_TBM	TRF-binding motif region of TRF-Interacting Nuclear factor 2. The C-terminal region of TIN2 contains the TRF-binding motif (TBM), while the TIN2 N-terminal region acts in the modulation of TRF1 activity via the inhibition of tankyrase 1. TIN2 binding to TRF2 is primarily via the TRF binding motif (TBM) and the N-terminus, while the far C-terminal region interacts with lower affinity. The TIN2 TBM, but not the N-terminal region, is involved in TIN2 binding to TRF1. Truncation of the TIN2 N-terminus in mouse results in telomere elongation, suggesting a negative regulatory function of this region. TIN2 is a shelterin complex protein identified in mammals, one of 6 factors that act to protect telomeres from DNA damage repair machinery. Three shelterin components (TRF1, TRF2, POT1) bind DNA and 3 components (TIN2, RAP1, TPP1) are recruited by these DNA binding factors. TIN2 binds directly to TRF1 and TRF2 and stabilizes TRF2 complex-telomere binding by tethering it to the TRF1 complex. TRF1 activity at telomeres is regulated in part by selective ubiquitination and degradation. Ubiquitination of TRF1 is mediated by Fbx4, which binds TRF1 in the TRFH domain, via a small GTPase module. When bound to telomeres, TIN2 acts to protect TRF1 from SCF-Fbx4 mediated ubiquitination. F-box proteins act in substrate recognition as part of SCF complexes (SCF: Skp1-Cul1-Rbx1-F-box protein). Tankyrase-mediated ADP-ribosylation releases TRF1 from telomeres, rendering them susceptible to ubiquitination and degradation, promoting telomere elongation. TIN2 also binds TPP1, which recruits POT1 to telomeres.	108
213039	cd11743	Cthe_2751_like	Uncharacterized protein domain similar to Clostridium thermocellum 2751. Cthe_2751 has been found to form homodimers. Based on structural similarity to other families, a role in processing nucleic acids was suggested, though interactions with DNA could not be demonstrated.	122
213354	cd11744	MIT_CorA-like	metal ion transporter CorA-like divalent cation transporter superfamily. This superfamily of essential membrane proteins is involved in transporting divalent cations (uptake or efflux) across membranes. They are found in most bacteria and archaea, and in some eukaryotes. It is a functionally diverse group which includes the Mg2+ transporters of Escherichia coli and Salmonella typhimurium CorAs (which can also transport Co2+ and Ni2+), the CorA Co2+ transporter from the hyperthermophilic Thermotoga maritima, and the Zn2+ transporter Salmonella typhimurium ZntB, which mediates the efflux of Zn2+ (and Cd2+). It includes five Saccharomyces cerevisiae members: i) two plasma membrane proteins, the Mg2+ transporter Alr1p/Swc3p and the putative Mg2+ transporter, Alr2p, ii) two mitochondrial inner membrane Mg2+ transporters: Mfm1p/Lpe10p, and Mrs2p, and iii) the vacuole membrane protein Mnr2p, a putative Mg2+ transporter. It also includes a family of Arabidopsis thaliana members (AtMGTs), some of which are localized to distinct tissues, and not all of which can transport Mg2+. Thermotoga maritima CorA and Vibrio parahaemolyticus and Salmonella typhimurium ZntB form funnel-shaped homopentamers; the tip of the funnel is formed from two C-terminal transmembrane (TM) helices from each monomer, and the large opening of the funnel from the N-terminal cytoplasmic domains. The GMN signature motif of the MIT superfamily occurs just after TM1; mutation within this motif is known to abolish Mg2+ transport through Salmonella typhimurium CorA, Mrs2p, and Alr1p. Natural variants such as GVN and GIN, as in some ZntB family proteins, may be associated with the transport of different divalent cations, such as zinc and cadmium. The functional diversity of MIT transporters may also be due to minor structural differences regulating gating, substrate selection, and transport.	286
213372	cd11745	Yos9_DD	C-terminal dimerization domain (DD) of Saccharomyces cerevisiae Yos9 and related proteins. Yos9 participates in the ER-associated protein degradation pathway that targets misfolded proteins for proteolysis. Yos9 is a component of the HMG-CoA reductase degradation (HRD) ubiquitin-ligase complex, specifically part of the luminal submodule of the ligase. Yos9 scans proteins for specific oligosaccharide modifications, which are critical determinants of the degradation signal. It has been shown to be involved in the degradation of glycosylated proteins and various nonglycosylated proteins. Yos9 functions as a homodimer where this domain is responsible for the self-association; it has an alphabeta-roll domain architecture, and is found at the C-terminus of the protein. The N-terminal portion of Yos9 which includes an MRH domain is required for binding to Hrd3p, another component of the HRD complex. The DD domain does not appear to bind Hrd3p directly.	124
213062	cd11746	GH94N_like	N-terminal domain of glycoside hydrolase family 94 and related domains. The glycoside hydrolase family 94 (previously known as glycosyltransferase family 36) includes cellobiose phosphorylase (EC:2.4.1.20), cellodextrin phosphorylase (EC:2.4.1.49), chitobiose phosphorylase (EC:2.4.1.-), amongst other members. Their N-terminal domain is involved in oligomerization and may play a role in catalysis, but it is separate from the catalytic domain [an (alpha/alpha)(6) barrel]. This GH94N domain also occurs in tandem repeat arrangements (not at the N-terminus) in cyclic beta 1-2 glucan synthetase and related proteins, and as a standalone domain in distantly related proteins of unknown function.	179
213063	cd11747	GH94N_like_1	Glycoside hydrolase family 94 N-terminal-like domain of uncharacterized function. The glycoside hydrolase family 94 (previously known as glycosyltransferase family 36) includes cellobiose phosphorylase and many other members. Their N-terminal domain is involved in oligomerization and may play a role in catalysis, but it is separate from the catalytic domain. This GH94N domain also occurs as a standalone domain in distantly related proteins of unknown function, as represented by this model, which also includes N-terminal GH94N-like domains of bacterial rhamnosidases and as found at the C-terminus of polygalacturonases.	204
213064	cd11748	GH94N_NdvB_like	Glycoside hydrolase family 94 N-terminal-like domain of NdvB-like proteins. The glycoside hydrolase family 94 (previously known as glycosyltransferase family 36) includes cellobiose phosphorylase (EC:2.4.1.20), cellodextrin phosphorylase (EC:2.4.1.49), chitobiose phosphorylase (EC:2.4.1.-), amongst other members. Their N-terminal domain is involved in oligomerization and may play a role in catalysis, but it is separate from the catalytic domain [an (alpha/alpha)(6) barrel]. The GH94N domain, as represented by this model, is found at the N-terminus of largely uncharacterized proteins; some members from Xanthomonas campestris and related organisms are annotated as NdvB (nodule development B) gene products, glycosyltransferases required for the synthesis of cyclic beta-(1,2)-glucans, which play a role in interactions between bacteria and plants.	294
213065	cd11749	GH94N_LBP_like	N-terminal-like domain of Paenibacillus sp. YM-1 Laminaribiose Phosphorylase and similar proteins. The glycoside hydrolase family 94 (previously known as glycosyltransferase family 36) includes bacterial laminaribiose phosphorylase. This N-terminal domain is involved in oligomerization and may play a role in catalysis, but it is separate from the catalytic domain [an (alpha/alpha)(6) barrel]. Bacterial laminaribiose phosphorylase phosphorolyzes laminaribiose into alpha-glucose 1-phosphate and glucose, but does not phosphorolyze other glucobioses; it only slightly phosphorolyzes laminaritriose and higher laminarioligosaccharides. The GH94N domain, as represented by this model, is also found at the N-terminus of GH94 members with uncharacterized specificities.	229
213066	cd11750	GH94N_like_3	Glycoside hydrolase family 94 N-terminal-like domain of uncharacterized function. The glycoside hydrolase family 94 (previously known as glycosyltransferase family 36) includes cellobiose phosphorylase (EC:2.4.1.20), cellodextrin phosphorylase (EC:2.4.1.49), chitobiose phosphorylase (EC:2.4.1.-), amongst other members. Their N-terminal domain is involved in oligomerization and may play a role in catalysis, but it is separate from the catalytic domain [an (alpha/alpha)(6) barrel]. The GH94N domain, as represented by this model, is found at the N-terminus of GH94 members with uncharacterized specificities.	282
213067	cd11751	GH94N_like_4	Glycoside hydrolase family 94 N-terminal-like domain of uncharacterized function. The glycoside hydrolase family 94 (previously known as glycosyltransferase family 36) includes cellobiose phosphorylase (EC:2.4.1.20), cellodextrin phosphorylase (EC:2.4.1.49), chitobiose phosphorylase (EC:2.4.1.-), amongst other members. Their N-terminal domain is involved in oligomerization and may play a role in catalysis, but it is separate from the catalytic domain [an (alpha/alpha)(6) barrel]. The GH94N domain, as represented by this model, is found near the N-terminus of GH94 members and related proteins with uncharacterized specificities.	223
213068	cd11752	GH94N_CDP_like	N-terminal domain of cellodextrin phosphorylase (CDP) and similar proteins. The glycoside hydrolase family 94 (previously known as glycosyltransferase family 36) includes cellodextrin phosphorylase (EC:2.4.1.49), also known as 1,4-beta-D-oligo-D-glucan:phosphate alpha-D-glucosyltransferase or CepB. This N-terminal domain is involved in oligomerization and may play a role in catalysis, but it is separate from the catalytic domain [an (alpha/alpha)(6) barrel]. Cellodextrin phosphorylase catalyzes the reversible and phosphate dependent removal of a single alpha-D-glucose-1-phosphate unit from a (1,4-beta-D-glucosyl) oligomer.	214
213069	cd11753	GH94N_ChvB_NdvB_2_like	Second GH94N domain of cyclic beta 1-2 glucan synthetase and similar domains. The glycoside hydrolase family 94 (previously known as glycosyltransferase family 36) includes cyclic beta 1-2 glucan synthetase (EC:2.4.1.20) or ChvB (encoded by the chromosomal chvB virulence gene). This second of two tandemly repeated GH94-N-terminal-like domains has not been characterized functionally. Some beta 1-2 glucan synthetases are annotated as NdvB (nodule development B) gene products, glycosyltransferases required for the synthesis of cyclic beta-(1,2)-glucans, which play a role in interactions between bacteria and plants.	336
213070	cd11754	GH94N_CBP_like	N-terminal domain of cellobiose phosphorylase (CBP) and similar proteins. The glycoside hydrolase family 94 (previously known as glycosyltransferase family 36) includes cellobiose phosphorylase (EC:2.4.1.20) or cellobiose:phosphate alpha-D-glucosyltransferase, or CepA. This N-terminal domain is involved in oligomerization and may play a role in catalysis, but it is separate from the catalytic domain [an (alpha/alpha)(6) barrel]. Cellobiose phosphorylase participates in the degradation of cellulose; it catalyzes the phosphate dependent hydrolysis of cellobiose into alpha-D-glucose-1-phosphate and D-glucose, a reversible reaction.	303
213071	cd11755	GH94N_ChBP_like	N-terminal domain of chitobiose phosphorylase (ChBP) and similar proteins. The glycoside hydrolase family 94 (previously known as glycosyltransferase family 36) includes chitobiose phosphorylase (EC:2.4.1.-). This N-terminal domain is involved in oligomerization and may play a role in catalysis, but it is separate from the catalytic domain [an (alpha/alpha)(6) barrel]. Chitobiose phosphorylase catalyzes the reversible phosphate dependent hydrolysis of chitobiose [(GlcNAc)2] into alpha-GlcNAc-1-phosphate and GlcNAc. In some organisms, ChBP may be involved in the production of GlcNAc-6-phosphate in intracellular pathways.	300
213072	cd11756	GH94N_ChvB_NdvB_1_like	First GH94N domain of cyclic beta 1-2 glucan synthetase and similar domains. The glycoside hydrolase family 94 (previously known as glycosyltransferase family 36) includes cyclic beta 1-2 glucan synthetase (EC:2.4.1.20) or ChvB (encoded by the chromosomal chvB virulence gene). This first of two tandemly repeated GH94-N-terminal-like domains has not been characterized functionally. Some beta 1-2 glucan synthetases are annotated as NdvB (nodule development B) gene products, glycosyltransferases required for the synthesis of cyclic beta-(1,2)-glucans, which play a role in interactions between bacteria and plants.	284
212691	cd11757	SH3_SH3BP4	Src Homology 3 domain of SH3 domain-binding protein 4. SH3 domain-binding protein 4 (SH3BP4) is also called transferrin receptor trafficking protein (TTP). SH3BP4 is an endocytic accessory protein that interacts with endocytic proteins including clathrin and dynamin, and regulates the internalization of the transferrin receptor (TfR). SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	52
212692	cd11758	SH3_CRK_N	N-terminal Src Homology 3 domain of Ct10 Regulator of Kinase adaptor proteins. CRK adaptor proteins consist of SH2 and SH3 domains, which bind tyrosine-phosphorylated peptides and proline-rich motifs, respectively. They function downstream of protein tyrosine kinases in many signaling pathways started by various extracellular signals, including growth and differentiation factors. Cellular CRK (c-CRK) contains a single SH2 domain, followed by N-terminal and C-terminal SH3 domains. It is involved in the regulation of many cellular processes including cell growth, motility, adhesion, and apoptosis. CRK has been implicated in the malignancy of various human cancers. The N-terminal SH3 domain of CRK binds a number of target proteins including DOCK180, C3G, SOS, and cABL. The CRK family includes two alternatively spliced protein forms, CRKI and CRKII, that are expressed by the CRK gene, and the CRK-like (CRKL) protein, which is expressed by a distinct gene (CRKL). SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212693	cd11759	SH3_CRK_C	C-terminal Src Homology 3 domain of Ct10 Regulator of Kinase adaptor proteins. CRK adaptor proteins consist of SH2 and SH3 domains, which bind tyrosine-phosphorylated peptides and proline-rich motifs, respectively. They function downstream of protein tyrosine kinases in many signaling pathways started by various extracellular signals, including growth and differentiation factors. Cellular CRK (c-CRK) contains a single SH2 domain, followed by N-terminal and C-terminal SH3 domains. It is involved in the regulation of many cellular processes including cell growth, motility, adhesion, and apoptosis. CRK has been implicated in the malignancy of various human cancers. The C-terminal SH3 domain of CRK has not been shown to bind any target protein; it acts as a negative regulator of CRK function by stabilizing a structure that inhibits the access by target proteins to the N-terminal SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components, and mediating the formation of multiprotein complex assemblies.	57
212694	cd11760	SH3_MIA_like	Src Homology 3 domain of Melanoma Inhibitory Activity protein and similar proteins. MIA is a single domain protein that adopts a SH3 domain-like fold; it contains an additional antiparallel beta sheet and two disulfide bonds compared to classical SH3 domains. MIA is secreted from malignant melanoma cells and it plays an important role in melanoma development and invasion. MIA is expressed by chondrocytes in normal tissues and may be important in the cartilage cell phenotype. Unlike classical SH3 domains, MIA does not bind proline-rich ligands. MIA is a member of the recently identified family that also includes MIA-like (MIAL), MIA2, and MIA3 (also called TANGO); the biological functions of this family are not yet fully understood.	76
212695	cd11761	SH3_FCHSD_1	First Src Homology 3 domain of FCH and double SH3 domains proteins. This group is composed of FCH and double SH3 domains protein 1 (FCHSD1) and FCHSD2. These proteins have a common domain structure consisting of an N-terminal F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs), two SH3, and C-terminal proline-rich domains. They have only been characterized in silico and their functions remain unknown. This group also includes the insect protein, nervous wreck, which acts as a regulator of synaptic growth signaling. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	57
212696	cd11762	SH3_FCHSD_2	Second Src Homology 3 domain of FCH and double SH3 domains proteins. This group is composed of FCH and double SH3 domains protein 1 (FCHSD1) and FCHSD2. These proteins have a common domain structure consisting of an N-terminal F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs), two SH3, and C-terminal proline-rich domains. They have only been characterized in silico and their functions remain unknown. This group also includes the insect protein, nervous wreck, which acts as a regulator of synaptic growth signaling. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	57
212697	cd11763	SH3_SNX9_like	Src Homology 3 domain of Sorting Nexin 9 and similar proteins. Sorting nexins (SNXs) are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. This subfamily consists of SH3 domain containing SNXs including SNX9, SNX18, SNX33, and similar proteins. SNX9 is localized to plasma membrane endocytic sites and acts primarily in clathrin-mediated endocytosis, while SNX18 is localized to peripheral endosomal structures, and acts in a trafficking pathway that is clathrin-independent but relies on AP-1 and PACS1. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212698	cd11764	SH3_Eps8	Src Homology 3 domain of Epidermal growth factor receptor kinase substrate 8 and similar proteins. This group is composed of Eps8 and Eps8-like proteins including Eps8-like 1-3, among others. These proteins contain N-terminal Phosphotyrosine-binding (PTB), central SH3, and C-terminal effector domains. Eps8 binds either Abi1 (also called E3b1) or Rab5 GTPase activating protein RN-tre through its SH3 domain. With Abi1 and Sos1, it becomes part of a trimeric complex that is required to activate Rac. Together with RN-tre, it inhibits the internalization of EGFR. The SH3 domains of Eps8 and similar proteins recognize peptides containing a PxxDY motif, instead of the classical PxxP motif. SH3 domains are protein interaction domains that usually bind to proline-rich ligands with moderate affinity and selectivity. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	54
212699	cd11765	SH3_Nck_1	First Src Homology 3 domain of Nck adaptor proteins. Nck adaptor proteins regulate actin cytoskeleton dynamics by linking proline-rich effector molecules to protein tyrosine kinases and phosphorylated signaling intermediates. They contain three SH3 domains and a C-terminal SH2 domain. They function downstream of the PDGFbeta receptor and are involved in Rho GTPase signaling and actin dynamics. Vertebrates contain two Nck adaptor proteins: Nck1 (also called Nckalpha) and Nck2 (also called Nckbeta or Growth factor receptor-bound protein 4, Grb4), which show partly overlapping functions but also bind distinct targets. Their SH3 domains are involved in recruiting downstream effector molecules, such as the N-WASP/Arp2/3 complex, which when activated induces actin polymerization that results in the production of pedestals, or protrusions of the plasma membrane. The first SH3 domain of Nck proteins preferentially binds the PxxDY sequence, which is present in the CD3e cytoplasmic tail. This binding inhibits phosphorylation by Src kinases, resulting in the downregulation of TCR surface expression. SH3 domains are protein interaction domains that usually bind to proline-rich ligands with moderate affinity and selectivity, preferentially a PxxP motif. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	51
212700	cd11766	SH3_Nck_2	Second Src Homology 3 domain of Nck adaptor proteins. Nck adaptor proteins regulate actin cytoskeleton dynamics by linking proline-rich effector molecules to protein tyrosine kinases and phosphorylated signaling intermediates. They contain three SH3 domains and a C-terminal SH2 domain. They function downstream of the PDGFbeta receptor and are involved in Rho GTPase signaling and actin dynamics. Vertebrates contain two Nck adaptor proteins: Nck1 (also called Nckalpha) and Nck2 (also called Nckbeta or Growth factor receptor-bound protein 4, Grb4), which show partly overlapping functions but also bind distinct targets. Their SH3 domains are involved in recruiting downstream effector molecules, such as the N-WASP/Arp2/3 complex, which when activated induces actin polymerization that results in the production of pedestals, or protrusions of the plasma membrane. The second SH3 domain of Nck appears to prefer ligands containing the APxxPxR motif. SH3 domains are protein interaction domains that usually bind to proline-rich ligands with moderate affinity and selectivity, preferentially a PxxP motif. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212701	cd11767	SH3_Nck_3	Third Src Homology 3 domain of Nck adaptor proteins. This group contains the third SH3 domain of Nck, the first SH3 domain of Caenorhabditis elegans Ced-2 (Cell death abnormality protein 2), and similar domains. Nck adaptor proteins regulate actin cytoskeleton dynamics by linking proline-rich effector molecules to protein tyrosine kinases and phosphorylated signaling intermediates. They contain three SH3 domains and a C-terminal SH2 domain. They function downstream of the PDGFbeta receptor and are involved in Rho GTPase signaling and actin dynamics. Vertebrates contain two Nck adaptor proteins: Nck1 (also called Nckalpha) and Nck2 (also called Nckbeta or Growth factor receptor-bound protein 4, Grb4), which show partly overlapping functions but also bind distinct targets. Their SH3 domains are involved in recruiting downstream effector molecules, such as the N-WASP/Arp2/3 complex, which when activated induces actin polymerization that results in the production of pedestals, or protrusions of the plasma membrane. The third SH3 domain of Nck appears to prefer ligands with a PxAPxR motif. SH3 domains are protein interaction domains that usually bind to proline-rich ligands with moderate affinity and selectivity, preferentially a PxxP motif. Ced-2 is a cell corpse engulfment protein that interacts with Ced-5 in a pathway that regulates the activation of Ced-10, a Rac small GTPase.	56
212702	cd11768	SH3_Tec_like	Src Homology 3 domain of Tec-like Protein Tyrosine Kinases. The Tec (Tyrosine kinase expressed in hepatocellular carcinoma) subfamily is composed of Tec, Btk, Bmx (Etk), Itk (Tsk, Emt), Rlk (Txk), and similar proteins. They are cytoplasmic (or nonreceptor) tyr kinases containing Src homology protein interaction domains (SH3, SH2) N-terminal to the catalytic tyr kinase domain. Most Tec subfamily members (except Rlk) also contain an N-terminal pleckstrin homology (PH) domain, which binds the products of PI3K and allows membrane recruitment and activation. In addition, some members contain the Tec homology (TH) domain, which contains proline-rich and zinc-binding regions. Tec kinases are expressed mainly by haematopoietic cells, although Tec and Bmx are also found in endothelial cells. B-cells express Btk and Tec, while T-cells express Itk, Txk, and Tec. Collectively, Tec kinases are expressed in a variety of myeloid cells such as mast cells, platelets, macrophages, and dendritic cells. Each Tec kinase shows a distinct cell-type pattern of expression. The function of Tec kinases in lymphoid cells has been studied extensively. They play important roles in the development, differentiation, maturation, regulation, survival, and function of B-cells and T-cells. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	54
212703	cd11769	SH3_CSK	Src Homology 3 domain of C-terminal Src kinase. CSK is a cytoplasmic (or nonreceptor) tyr kinase containing the Src homology domains, SH3 and SH2, N-terminal to the catalytic tyr kinase domain. It negatively regulates the activity of Src kinases that are anchored to the plasma membrane. To inhibit Src kinases, CSK is translocated to the membrane via binding to specific transmembrane proteins, G-proteins, or adaptor proteins near the membrane. CSK catalyzes the tyr phosphorylation of the regulatory C-terminal tail of Src kinases, resulting in their inactivation. It is expressed in a wide variety of tissues and plays a role, as a regulator of Src, in cell proliferation, survival, and differentiation, and consequently, in cancer development and progression. In addition, CSK also shows Src-independent functions. It is a critical component in G-protein signaling, and plays a role in cytoskeletal reorganization and cell migration. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	57
212704	cd11770	SH3_Nephrocystin	Src Homology 3 domain of Nephrocystin (or Nephrocystin-1). Nephrocystin contains an SH3 domain involved in signaling pathways that regulate cell adhesion and cytoskeletal organization. It is a protein that in humans is associated with juvenile nephronophthisis, an inherited kidney disease characterized by renal fibrosis that leads to chronic renal failure in children. It is localized in cell-cell junctions in renal duct cells, and is known to interact with Ack1, an activated Cdc42-associated kinase. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	54
212705	cd11771	SH3_Pex13p_fungal	Src Homology 3 domain of fungal peroxisomal membrane protein Pex13p. Pex13p, located in the peroxisomal membrane, contains two transmembrane regions and a C-terminal SH3 domain. It binds to the peroxisomal targeting type I (PTS1) receptor Pex5p and the docking factor Pex14p through its SH3 domain. It is essential for both PTS1 and PTS2 protein import pathways into the peroxisomal matrix. Pex13p binds Pex14p, which contains a PxxP motif, in a classical fashion to the proline-rich ligand binding site of its SH3 domain. It binds the WxxxF/Y motif of Pex5p in a novel site that does not compete with Pex14p binding. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	60
212706	cd11772	SH3_OSTF1	Src Homology 3 domain of metazoan osteoclast stimulating factor 1. OSTF1, also named OSF or SH3P2, is a signaling protein containing SH3 and ankyrin-repeat domains. It acts through a Src-related pathway to enhance the formation of osteoclasts and bone resorption. It also acts as a negative regulator of cell motility. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212707	cd11773	SH3_Sla1p_1	First Src Homology 3 domain of the fungal endocytic adaptor protein Sla1p. Sla1p facilitates endocytosis by playing a role as an adaptor protein in coupling components of the actin cytoskeleton to the endocytic machinery. It interacts with Abp1p, Las17p and Pan1p, which are activator proteins of actin-related protein 2/3 (Arp2/3). Sla1p contains multiple domains including three SH3 domains, a SAM (sterile alpha motif) domain, and a Sla1 homology domain 1 (SHD1), which binds to the NPFXD motif that is found in many integral membrane proteins such as the Golgi-localized Arf-binding protein Lsb5p and the P4-ATPases, Drs2p and Dnf1p. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	57
212708	cd11774	SH3_Sla1p_2	Second Src Homology 3 domain of the fungal endocytic adaptor protein Sla1p. Sla1p facilitates endocytosis by playing a role as an adaptor protein in coupling components of the actin cytoskeleton to the endocytic machinery. It interacts with Abp1p, Las17p and Pan1p, which are activator proteins of actin-related protein 2/3 (Arp2/3). Sla1p contains multiple domains including three SH3 domains, a SAM (sterile alpha motif) domain, and a Sla1 homology domain 1 (SHD1), which binds to the NPFXD motif that is found in many integral membrane proteins such as the Golgi-localized Arf-binding protein Lsb5p and the P4-ATPases, Drs2p and Dnf1p. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	52
212709	cd11775	SH3_Sla1p_3	Third Src Homology 3 domain of the fungal endocytic adaptor protein Sla1p. Sla1p facilitates endocytosis by playing a role as an adaptor protein in coupling components of the actin cytoskeleton to the endocytic machinery. It interacts with Abp1p, Las17p and Pan1p, which are activator proteins of actin-related protein 2/3 (Arp2/3). Sla1p contains multiple domains including three SH3 domains, a SAM (sterile alpha motif) domain, and a Sla1 homology domain 1 (SHD1), which binds to the NPFXD motif that is found in many integral membrane proteins such as the Golgi-localized Arf-binding protein Lsb5p and the P4-ATPases, Drs2p and Dnf1p. The third SH3 domain of Sla1p can bind ubiquitin while retaining the ability to bind proline-rich ligands; monoubiquitination of target proteins signals internalization and sorting through the endocytic pathway. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	57
212710	cd11776	SH3_PI3K_p85	Src Homology 3 domain of the p85 regulatory subunit of Class IA Phosphatidylinositol 3-kinases. Class I PI3Ks convert PtdIns(4,5)P2 to the critical second messenger PtdIns(3,4,5)P3. They are heterodimers and exist in multiple isoforms consisting of one catalytic subunit (out of four isoforms) and one of several regulatory subunits. Class IA PI3Ks associate with the p85 regulatory subunit family, which contains SH3, RhoGAP, and SH2 domains. The p85 subunits recruit the PI3K p110 catalytic subunit to the membrane, where p110 phosphorylates inositol lipids. Vertebrates harbor two p85 isoforms, called alpha and beta. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	72
212711	cd11777	SH3_CIP4_Bzz1_like	Src Homology 3 domain of Cdc42-Interacting Protein 4, Bzz1 and similar domains. This subfamily is composed of Cdc42-Interacting Protein 4 (CIP4) and similar proteins such as Formin Binding Protein 17 (FBP17) and FormiN Binding Protein 1-Like (FNBP1L), as well as yeast Bzz1 (or Bzz1p). CIP4 and FNBP1L are Cdc42 effectors that bind Wiskott-Aldrich syndrome protein (WASP) and function in endocytosis. CIP4 and FBP17 bind to the Fas ligand and may be implicated in the inflammatory response. CIP4 may also play a role in phagocytosis. Bzz1 is also a WASP/Las17-interacting protein involved in endocytosis and trafficking to the vacuole. It physically interacts with type I myosins and functions in the early steps of endocytosis. Members of this subfamily contain an N-terminal F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain as well as at least one C-terminal SH3 domain. Bzz1 contains a second SH3 domain at the C-terminus. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212712	cd11778	SH3_Bzz1_2	Second Src Homology 3 domain of Bzz1 and similar domains. Bzz1 (or Bzz1p) is a WASP/Las17-interacting protein involved in endocytosis and trafficking to the vacuole. It physically interacts with type I myosins and functions in the early steps of endocytosis. Together with other proteins, it induces membrane scission in yeast. Bzz1 contains an N-terminal F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs), a central coiled-coil, and two C-terminal SH3 domains. This model represents the second C-terminal SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	51
212713	cd11779	SH3_Irsp53_BAIAP2L	Src Homology 3 domain of Insulin Receptor tyrosine kinase Substrate p53, Brain-specific Angiogenesis Inhibitor 1-Associated Protein 2 (BAIAP2)-Like proteins, and similar proteins. Proteins in this family include IRSp53, BAIAP2L1, BAIAP2L2, and similar proteins. They all contain an Inverse-Bin/Amphiphysin/Rvs (I-BAR) or IMD domain in addition to the SH3 domain. IRSp53, also known as BAIAP2, is a scaffolding protein that takes part in many signaling pathways including Cdc42-induced filopodia formation, Rac-mediated lamellipodia extension, and spine morphogenesis. IRSp53 exists as multiple splicing variants that differ mainly at the C-termini. BAIAP2L1, also called IRTKS (Insulin Receptor Tyrosine Kinase Substrate), serves as a substrate for the insulin receptor and binds the small GTPase Rac. It plays a role in regulating the actin cytoskeleton and colocalizes with F-actin, cortactin, VASP, and vinculin. IRSp53 and IRTKS also mediate the recruitment of effector proteins Tir and EspFu, which regulate host cell actin reorganization, to bacterial attachment sites. BAIAP2L2 co-localizes with clathrin plaques but its function has not been determined. The SH3 domains of IRSp53 and IRTKS have been shown to bind the proline-rich C-terminus of EspFu. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	57
212714	cd11780	SH3_Sorbs_3	Third (or C-terminal) Src Homology 3 domain of Sorbin and SH3 domain containing (Sorbs) proteins and similar domains. This family, also called the vinexin family, is composed predominantly of adaptor proteins containing one sorbin homology (SoHo) and three SH3 domains. Members include the third SH3 domains of Sorbs1 (or ponsin), Sorbs2 (or ArgBP2), Vinexin (or Sorbs3), and similar domains. They are involved in the regulation of cytoskeletal organization, cell adhesion, and growth factor signaling. Members of this family bind multiple partners including signaling molecules like c-Abl, c-Arg, Sos, and c-Cbl, as well as cytoskeletal molecules such as vinculin and afadin. They may have overlapping functions. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212715	cd11781	SH3_Sorbs_1	First Src Homology 3 domain of Sorbin and SH3 domain containing (Sorbs) proteins and similar domains. This family, also called the vinexin family, is composed predominantly of adaptor proteins containing one sorbin homology (SoHo) and three SH3 domains. Members include the first SH3 domains of Sorbs1 (or ponsin), Sorbs2 (or ArgBP2), Vinexin (or Sorbs3), and similar domains. They are involved in the regulation of cytoskeletal organization, cell adhesion, and growth factor signaling. Members of this family bind multiple partners including signaling molecules like c-Abl, c-Arg, Sos, and c-Cbl, as well as cytoskeletal molecules such as vinculin and afadin. They may have overlapping functions. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212716	cd11782	SH3_Sorbs_2	Second Src Homology 3 domain of Sorbin and SH3 domain containing (Sorbs) proteins and similar domains. This family, also called the vinexin family, is composed predominantly of adaptor proteins containing one sorbin homology (SoHo) and three SH3 domains. Members include the second SH3 domains of Sorbs1 (or ponsin), Sorbs2 (or ArgBP2), Vinexin (or Sorbs3), and similar domains. They are involved in the regulation of cytoskeletal organization, cell adhesion, and growth factor signaling. Members of this family bind multiple partners including signaling molecules like c-Abl, c-Arg, Sos, and c-Cbl, as well as cytoskeletal molecules such as vinculin and afadin. They may have overlapping functions. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212717	cd11783	SH3_SH3RF_3	Third Src Homology 3 domain of SH3 domain containing ring finger 1 (SH3RF1), SH3RF3, and similar domains. SH3RF1 (or POSH) and SH3RF3 (or POSH2) are scaffold proteins that function as E3 ubiquitin-protein ligases. They contain an N-terminal RING finger domain and four SH3 domains. This model represents the third SH3 domain, located in the middle of SH3RF1 and SH3RF3, and similar domains. SH3RF1 plays a role in calcium homeostasis through the control of the ubiquitin domain protein Herp. It may also have a role in regulating death receptor mediated and JNK mediated apoptosis. SH3RF3 interacts with p21-activated kinase 2 (PAK2) and GTP-loaded Rac1. It may play a role in regulating JNK mediated apoptosis in certain conditions. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212718	cd11784	SH3_SH3RF2_3	Third Src Homology 3 domain of SH3 domain containing ring finger 2. SH3RF2 is also called POSHER (POSH-eliminating RING protein) or HEPP1 (heart protein phosphatase 1-binding protein). It acts as an anti-apoptotic regulator of the JNK pathway by binding to and promoting the degradation of SH3RF1 (or POSH), a scaffold protein that is required for pro-apoptotic JNK activation. It may also play a role in cardiac functions together with protein phosphatase 1. SH3RF2 contains an N-terminal RING finger domain and three SH3 domains. This model represents the third SH3 domain, located in the middle, of SH3RF2. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212719	cd11785	SH3_SH3RF_C	C-terminal (Fourth) Src Homology 3 domain of SH3 domain containing ring finger 1 (SH3RF1), SH3RF3, and similar domains. SH3RF1 (or POSH) and SH3RF3 (or POSH2) are scaffold proteins that function as E3 ubiquitin-protein ligases. They contain an N-terminal RING finger domain and four SH3 domains. This model represents the fourth SH3 domain, located at the C-terminus of SH3RF1 and SH3RF3, and similar domains. SH3RF1 plays a role in calcium homeostasis through the control of the ubiquitin domain protein Herp. It may also have a role in regulating death receptor mediated and JNK mediated apoptosis. SH3RF3 interacts with p21-activated kinase 2 (PAK2) and GTP-loaded Rac1. It may play a role in regulating JNK mediated apoptosis in certain conditions. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212720	cd11786	SH3_SH3RF_1	First Src Homology 3 domain of SH3 domain containing ring finger proteins. This model represents the first SH3 domain of SH3RF1 (or POSH), SH3RF2 (or POSHER), SH3RF3 (POSH2), and similar domains. Members of this family are scaffold proteins that function as E3 ubiquitin-protein ligases. They all contain an N-terminal RING finger domain and multiple SH3 domains; SH3RF1 and SH3RF3 have four SH3 domains while SH3RF2 has three. SH3RF1 plays a role in calcium homeostasis through the control of the ubiquitin domain protein Herp. It may also have a role in regulating death receptor mediated and JNK mediated apoptosis. SH3RF3 interacts with p21-activated kinase 2 (PAK2) and GTP-loaded Rac1. It may play a role in regulating JNK mediated apoptosis in certain conditions. SH3RF2 acts as an anti-apoptotic regulator of the JNK pathway by binding to and promoting the degradation of SH3RF1. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212721	cd11787	SH3_SH3RF_2	Second Src Homology 3 domain of SH3 domain containing ring finger proteins. This model represents the second SH3 domain of SH3RF1 (or POSH), SH3RF2 (or POSHER), SH3RF3 (POSH2), and similar domains. Members of this family are scaffold proteins that function as E3 ubiquitin-protein ligases. They all contain an N-terminal RING finger domain and multiple SH3 domains; SH3RF1 and SH3RF3 have four SH3 domains while SH3RF2 has three. SH3RF1 plays a role in calcium homeostasis through the control of the ubiquitin domain protein Herp. It may also have a role in regulating death receptor mediated and JNK mediated apoptosis. SH3RF3 interacts with p21-activated kinase 2 (PAK2) and GTP-loaded Rac1. It may play a role in regulating JNK mediated apoptosis in certain conditions. SH3RF2 acts as an anti-apoptotic regulator of the JNK pathway by binding to and promoting the degradation of SH3RF1. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212722	cd11788	SH3_RasGAP	Src Homology 3 domain of Ras GTPase-Activating Protein 1. RasGAP, also called Ras p21 protein activator, RASA1, or p120RasGAP, is part of the GAP1 family of GTPase-activating proteins. It is a 120kD cytosolic protein containing an SH3 domain flanked by two SH2 domains at the N-terminal end, a pleckstrin homology (PH) domain, a calcium dependent phospholipid binding domain (CaLB/C2), and a C-terminal catalytic GAP domain. It stimulates the GTPase activity of normal RAS p21. It acts as a positive effector of Ras in tumor cells. It also functions as a regulator downstream of tyrosine receptors such as those of PDGF, EGF, ephrin, and insulin, among others. The SH3 domain of RasGAP is unable to bind proline-rich sequences but has been shown to interact with protein partners such as the G3BP protein, Aurora kinases, and the Calpain small subunit 1. The RasGAP SH3 domain is necessary for the downstream signaling of Ras and it also influences Rho-mediated cytoskeletal reorganization. SH3 domains are protein interaction domains that typically bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	59
212723	cd11789	SH3_Nebulin_family_C	C-terminal Src Homology 3 domain of the Nebulin family of proteins. Nebulin family proteins contain multiple nebulin repeats, and may contain an N-terminal LIM domain and/or a C-terminal SH3 domain. They have molecular weights ranging from 34 to 900 kD, depending on the number of nebulin repeats, and they all bind actin. They are involved in the regulation of actin filament architecture and function as stabilizers and scaffolds for cytoskeletal structures with which they associate, such as long actin filaments or focal adhesions. Nebulin family proteins that contain a C-terminal SH3 domain include the giant filamentous protein nebulin, nebulette, Lasp1, and Lasp2. Lasp2, also called LIM-nebulette, is an alternatively spliced variant of nebulette. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212724	cd11790	SH3_Amphiphysin	Src Homology 3 domain of Amphiphysin and related domains. Amphiphysins function primarily in endocytosis and other membrane remodeling events. They exist in several isoforms and mammals possess two amphiphysin proteins from distinct genes. Amphiphysin I proteins, enriched in the brain and nervous system, contain domains that bind clathrin, Adaptor Protein complex 2 (AP2), dynamin, and synaptojanin. They function in synaptic vesicle endocytosis. Human autoantibodies to amphiphysin I hinder GABAergic signaling and contribute to the pathogenesis of paraneoplastic stiff-person syndrome. Some amphiphysin II isoforms, also called Bridging integrator 1 (Bin1), are localized in many different tissues and may function in intracellular vesicle trafficking. In skeletal muscle, Bin1 plays a role in the organization and maintenance of the T-tubule network. Mutations in Bin1 are associated with autosomal recessive centronuclear myopathy. Amphiphysins contain an N-terminal BAR domain with an additional N-terminal amphipathic helix (an N-BAR), a variable central domain, and a C-terminal SH3 domain. The SH3 domain of amphiphysins binds proline-rich motifs present in binding partners such as dynamin, synaptojanin, and nsP3. It also belongs to a subset of SH3 domains that bind ubiquitin in a site that overlaps with the peptide binding site. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	64
212725	cd11791	SH3_UBASH3	Src homology 3 domain of Ubiquitin-associated and SH3 domain-containing proteins, also called TULA (T cell Ubiquitin LigAnd) family of proteins. UBASH3 or TULA proteins are also referred to as Suppressor of T cell receptor Signaling (STS) proteins. They contain an N-terminal UBA domain, a central SH3 domain, and a C-terminal histidine phosphatase domain. They bind c-Cbl through the SH3 domain and ubiquitin via the UBA domain. In some vertebrates, there are two TULA family proteins, called UBASH3A (also called TULA or STS-2) and UBASH3B (also called TULA-2 or STS-1), which show partly overlapping as well as distinct functions. UBASH3B is widely expressed while UBASH3A is only found in lymphoid cells. UBASH3A facilitates apoptosis induced in T cells through its interaction with the apoptosis-inducing factor AIF. UBASH3B is an active phosphatase while UBASH3A is not. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	59
212726	cd11792	SH3_Fut8	Src homology 3 domain of Alpha1,6-fucosyltransferase (Fut8). Fut8 catalyzes the alpha1,6-linkage of a fucose residue from a donor substrate to N-linked oligosaccharides on glycoproteins in a process called core fucosylation, which is crucial for growth factor receptor-mediated biological functions. Fut8-deficient mice show severe growth retardation, early death, and a pulmonary emphysema-like phenotype. Fut8 is also implicated in aging and cancer metastasis. It contains an N-terminal coiled-coil domain, a catalytic domain, and a C-terminal SH3 domain. The SH3 domain of Fut8 is located in the lumen and its role in glycosyl transfer is unclear. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212727	cd11793	SH3_ephexin1_like	Src homology 3 domain of ephexin-1-like SH3 domain containing Rho guanine nucleotide exchange factors. Members of this family contain RhoGEF (also called Dbl-homologous or DH), Pleckstrin Homology (PH), and C-terminal SH3 domains. They include the Rho guanine nucleotide exchange factors ARHGEF5, ARHGEF16, ARHGEF19, ARHGEF26, ARHGEF27 (also called ephexin-1), and similar proteins, and are also called ephexins because they interact directly with ephrin A receptors. GEFs interact with Rho GTPases via their DH domains to catalyze nucleotide exchange by stabilizing the nucleotide-free GTPase intermediate. They play important roles in neuronal development. The SH3 domains of ARHGEFs play an autoinhibitory role through intramolecular interactions with a proline-rich region N-terminal to the DH domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212728	cd11794	SH3_DNMBP_N1	First N-terminal Src homology 3 domain of Dynamin Binding Protein, also called Tuba. DNMBP or Tuba is a cdc42-specific guanine nucleotide exchange factor (GEF) that contains four N-terminal SH3 domains, a central RhoGEF [or Dbl homology (DH)] domain followed by a Bin/Amphiphysin/Rvs (BAR) domain, and two C-terminal SH3 domains. It provides a functional link between dynamin and key regulatory proteins of the actin cytoskeleton. It plays an important role in regulating cell junction configuration. The four N-terminal SH3 domains of DNMBP bind the GTPase dynamin, which plays an important role in the fission of endocytic vesicles. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	51
212729	cd11795	SH3_DNMBP_N2	Second N-terminal Src homology 3 domain of Dynamin Binding Protein, also called Tuba. DNMBP or Tuba is a cdc42-specific guanine nucleotide exchange factor (GEF) that contains four N-terminal SH3 domains, a central RhoGEF [or Dbl homology (DH)] domain followed by a Bin/Amphiphysin/Rvs (BAR) domain, and two C-terminal SH3 domains. It provides a functional link between dynamin and key regulatory proteins of the actin cytoskeleton. It plays an important role in regulating cell junction configuration. The four N-terminal SH3 domains of DNMBP bind the GTPase dynamin, which plays an important role in the fission of endocytic vesicles. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	54
212730	cd11796	SH3_DNMBP_N3	Third N-terminal Src homology 3 domain of Dynamin Binding Protein, also called Tuba. DNMBP or Tuba is a cdc42-specific guanine nucleotide exchange factor (GEF) that contains four N-terminal SH3 domains, a central RhoGEF [or Dbl homology (DH)] domain followed by a Bin/Amphiphysin/Rvs (BAR) domain, and two C-terminal SH3 domains. It provides a functional link between dynamin and key regulatory proteins of the actin cytoskeleton. It plays an important role in regulating cell junction configuration. The four N-terminal SH3 domains of DNMBP bind the GTPase dynamin, which plays an important role in the fission of endocytic vesicles. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	51
212731	cd11797	SH3_DNMBP_N4	Fourth N-terminal Src homology 3 domain of Dynamin Binding Protein, also called Tuba. DNMBP or Tuba is a cdc42-specific guanine nucleotide exchange factor (GEF) that contains four N-terminal SH3 domains, a central RhoGEF [or Dbl homology (DH)] domain followed by a Bin/Amphiphysin/Rvs (BAR) domain, and two C-terminal SH3 domains. It provides a functional link between dynamin and key regulatory proteins of the actin cytoskeleton. It plays an important role in regulating cell junction configuration. The four N-terminal SH3 domains of DNMBP bind the GTPase dynamin, which plays an important role in the fission of endocytic vesicles. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	50
212732	cd11798	SH3_DNMBP_C1	First C-terminal Src homology 3 domain of Dynamin Binding Protein, also called Tuba. DNMBP or Tuba is a cdc42-specific guanine nucleotide exchange factor (GEF) that contains four N-terminal SH3 domains, a central RhoGEF [or Dbl homology (DH)] domain followed by a Bin/Amphiphysin/Rvs (BAR) domain, and two C-terminal SH3 domains. It provides a functional link between dynamin, Rho GTPase signaling, and actin dynamics. It plays an important role in regulating cell junction configuration. The C-terminal SH3 domains of DNMBP bind to N-WASP and Ena/VASP proteins, which are key regulatory proteins of the actin cytoskeleton. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	57
212733	cd11799	SH3_ARHGEF37_C1	First C-terminal Src homology 3 domain of Rho guanine nucleotide exchange factor 37. ARHGEF37 contains a RhoGEF [or Dbl homology (DH)] domain followed by a Bin/Amphiphysin/Rvs (BAR) domain, and two C-terminal SH3 domains. Its specific function is unknown. Its domain architecture is similar to the C-terminal half of DNMBP or Tuba, a cdc42-specific GEF that provides a functional link between dynamin, Rho GTPase signaling, and actin dynamics, and plays an important role in regulating cell junction configuration. GEFs activate small GTPases by exchanging bound GDP for free GTP. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	54
212734	cd11800	SH3_DNMBP_C2_like	Second C-terminal Src homology 3 domain of Dynamin Binding Protein, also called Tuba, and similar domains. DNMBP or Tuba is a cdc42-specific guanine nucleotide exchange factor (GEF) that contains four N-terminal SH3 domains, a central RhoGEF [or Dbl homology (DH)] domain followed by a Bin/Amphiphysin/Rvs (BAR) domain, and two C-terminal SH3 domains. It provides a functional link between dynamin, Rho GTPase signaling, and actin dynamics. It plays an important role in regulating cell junction configuration. The C-terminal SH3 domains of DNMBP bind to N-WASP and Ena/VASP proteins, which are key regulatory proteins of the actin cytoskeleton. Also included in this subfamily is the second C-terminal SH3 domain of Rho guanine nucleotide exchange factor 37 (ARHGEF37), whose function is still unknown. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	57
212735	cd11801	SH3_JIP1_like	Src homology 3 domain of JNK-interacting proteins 1 and 2, and similar domains. JNK-interacting proteins (JIPs) function as scaffolding proteins for c-Jun N-terminal kinase (JNK) signaling pathways. They bind to components of Mitogen-activated protein kinase (MAPK) pathways such as JNK, MKK, and several MAP3Ks such as MLK and DLK. There are four JIPs (JIP1-4); all contain a JNK binding domain. JIP1 and JIP2 also contain SH3 and Phosphotyrosine-binding (PTB) domains. Both are highly expressed in the brain and pancreatic beta-cells. JIP1 functions as an adaptor linking motor to cargo during axonal transport and also is involved in regulating insulin secretion. JIP2 forms complexes with fibroblast growth factor homologous factors (FHFs), which facilitates activation of the p38delta MAPK. The SH3 domain of JIP1 homodimerizes at the interface usually involved in proline-rich ligand recognition, despite the lack of this motif in the domain itself. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212736	cd11802	SH3_Endophilin_B	Src homology 3 domain of Endophilin-B. Endophilins play roles in synaptic vesicle formation, virus budding, mitochondrial morphology maintenance, receptor-mediated endocytosis inhibition, and endosomal sorting. They are classified into two types, A and B. Vertebrates contain two endophilin-B isoforms. Endophilin-B proteins are cytoplasmic proteins expressed mainly in the heart, placenta, and skeletal muscle. Endophilins contain an N-terminal N-BAR domain (BAR domain with an additional N-terminal amphipathic helix), followed by a variable region containing proline clusters, and a C-terminal SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	52
212737	cd11803	SH3_Endophilin_A	Src homology 3 domain of Endophilin-A. Endophilins play roles in synaptic vesicle formation, virus budding, mitochondrial morphology maintenance, receptor-mediated endocytosis inhibition, and endosomal sorting. They are classified into two types, A and B. Vertebrates contain three endophilin-A isoforms (A1, A2, and A3). Endophilin-A proteins are enriched in the brain and play multiple roles in receptor-mediated endocytosis. They tubulate membranes and regulate calcium influx into neurons to trigger the activation of the endocytic machinery. They are also involved in the sorting of plasma membrane proteins, actin filament assembly, and the uncoating of clathrin-coated vesicles for fusion with endosomes. Endophilins contain an N-terminal N-BAR domain (BAR domain with an additional N-terminal amphipathic helix), followed by a variable region containing proline clusters, and a C-terminal SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212738	cd11804	SH3_GRB2_like_N	N-terminal Src homology 3 domain of Growth factor receptor-bound protein 2 (GRB2) and related proteins. This family includes the adaptor protein GRB2 and related proteins including Drosophila melanogaster Downstream of receptor kinase (DRK), Caenorhabditis elegans Sex muscle abnormal protein 5 (Sem-5), GRB2-related adaptor protein (GRAP), GRAP2, and similar proteins. Family members contain an N-terminal SH3 domain, a central SH2 domain, and a C-terminal SH3 domain. GRB2/Sem-5/DRK is a critical signaling molecule that regulates the Ras pathway by linking tyrosine kinases to the Ras guanine nucleotide releasing protein Sos (son of sevenless), which converts Ras to the active GTP-bound state. GRAP2 plays an important role in T cell receptor (TCR) signaling by promoting the formation of the SLP-76:LAT complex, which couples the TCR to the Ras pathway. GRAP acts as a negative regulator of T cell receptor (TCR)-induced lymphocyte proliferation by downregulating the signaling to the Ras/ERK pathway. The N-terminal SH3 domain of GRB2 binds to Sos and Sos-derived proline-rich peptides. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	52
212739	cd11805	SH3_GRB2_like_C	C-terminal Src homology 3 domain of Growth factor receptor-bound protein 2 (GRB2) and related proteins. This family includes the adaptor protein GRB2 and related proteins including Drosophila melanogaster Downstream of receptor kinase (DRK), Caenorhabditis elegans Sex muscle abnormal protein 5 (Sem-5), GRB2-related adaptor protein (GRAP), GRAP2, and similar proteins. Family members contain an N-terminal SH3 domain, a central SH2 domain, and a C-terminal SH3 domain. GRB2/Sem-5/DRK is a critical signaling molecule that regulates the Ras pathway by linking tyrosine kinases to the Ras guanine nucleotide releasing protein Sos (son of sevenless), which converts Ras to the active GTP-bound state. GRAP2 plays an important role in T cell receptor (TCR) signaling by promoting the formation of the SLP-76:LAT complex, which couples the TCR to the Ras pathway. GRAP acts as a negative regulator of T cell receptor (TCR)-induced lymphocyte proliferation by downregulating the signaling to the Ras/ERK pathway. The C-terminal SH3 domains (SH3c) of GRB2 and GRAP2 have been shown to bind to classical PxxP motif ligands, as well as to non-classical motifs. GRB2 SH3c binds Gab2 (Grb2-associated binder 2) through epitopes containing RxxK motifs, while the SH3c of GRAP2 binds to the phosphatase-like protein HD-PTP via a RxxxxK motif. SH3 domains are protein interaction domains that typically bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212740	cd11806	SH3_PRMT2	Src homology 3 domain of Protein arginine N-methyltransferase 2. PRMT2, also called HRMT1L1, belongs to the arginine methyltransferase protein family. It functions as a coactivator to both estrogen receptor alpha (ER-alpha) and androgen receptor (AR), presumably through arginine methylation. The ER-alpha transcription factor is involved in cell proliferation, differentiation, morphogenesis, and apoptosis, and is also implicated in the development and progression of breast cancer. PRMT2 and its variants are upregulated in breast cancer cells and may be involved in modulating the ER-alpha signaling pathway during formation of breast cancer. PRMT2 also plays a role in regulating the function of E2F transcription factors, which are critical cell cycle regulators, by binding to the retinoblastoma gene product (RB). It contains an N-terminal SH3 domain and an AdoMet binding domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212741	cd11807	SH3_ASPP	Src homology 3 domain of Apoptosis Stimulating of p53 proteins (ASPP). The ASPP family of proteins bind to important regulators of apoptosis (p53, Bcl-2, and RelA) and cell growth (APCL, PP1). They share similarity at their C-termini, where they harbor a proline-rich region, four ankyrin (ANK) repeats, and an SH3 domain. Vertebrates contain three members of the family: ASPP1, ASPP2, and iASPP. ASPP1 and ASPP2 activate the apoptotic function of the p53 family of tumor suppressors (p53, p63, and p73), while iASPP is an oncoprotein that specifically inhibits p53-induced apoptosis. The expression of ASPP proteins is altered in tumors; ASPP1 and ASPP2 are downregulated whereas iASPP is upregulated in some cancer types. ASPP proteins also bind and regulate protein phosphatase 1 (PP1), and this binding is competitive with p53 binding. The SH3 domain and the ANK repeats of ASPP contribute to the p53 binding site; they bind to the DNA binding domain of p53. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	57
212742	cd11808	SH3_Alpha_Spectrin	Src homology 3 domain of Alpha Spectrin. Spectrin is a major structural component of the red blood cell membrane skeleton and is important in erythropoiesis and membrane biogenesis. It is a flexible, rope-like molecule composed of two subunits, alpha and beta, which consist of many spectrin-type repeats. Alpha and beta spectrin associate to form heterodimers and tetramers; spectrin tetramer formation is critical for red cell shape and deformability. Defects in alpha spectrin have been associated with inherited hemolytic anemias including hereditary spherocytosis (HSp), hereditary elliptocytosis (HE), and hereditary pyropoikilocytosis (HPP). Alpha spectrin contains a middle SH3 domain and a C-terminal EF-hand binding motif in addition to multiple spectrin repeats. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212743	cd11809	SH3_srGAP	Src homology 3 domain of Slit-Robo GTPase Activating Proteins. Slit-Robo GTPase Activating Proteins (srGAPs) are Rho GAPs that interact with Robo1, the transmembrane receptor of Slit proteins. Slit proteins are secreted proteins that control axon guidance and the migration of neurons and leukocytes. Vertebrates contain three isoforms of srGAPs (srGAP1-3), all of which are expressed during embryonic and early development in the nervous system but with different localization and timing. A fourth member has also been reported (srGAP4, also called ARHGAP4). srGAPs contain an N-terminal F-BAR domain, a Rho GAP domain, and a C-terminal SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212744	cd11810	SH3_RUSC1_like	Src homology 3 domain of RUN and SH3 domain-containing proteins 1 and 2. RUSC1 and RUSC2 were originally characterized in silico. They are adaptor proteins consisting of RUN, leucine zipper, and SH3 domains. RUSC1, also called NESCA (New molecule containing SH3 at the carboxy-terminus), is highly expressed in the brain and is translocated to the nuclear membrane from the cytoplasm upon stimulation with neurotrophin. It plays a role in facilitating neurotrophin-dependent neurite outgrowth. It also interacts with NEMO (or IKKgamma) and may function in NEMO-mediated activation of NF-kB. RUSC2, also called Iporin, is expressed ubiquitously with highest amounts in the brain and testis. It interacts with the small GTPase Rab1 and the Golgi matrix protein GM130, and may function in linking GTPases to certain intracellular signaling pathways. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	50
212745	cd11811	SH3_CHK	Src Homology 3 domain of CSK homologous kinase. CHK is also referred to as megakaryocyte-associated tyrosine kinase (Matk). It inhibits Src kinases using a noncatalytic mechanism by simply binding to them. As a negative regulator of Src kinases, CHK may play important roles in cell proliferation, survival, and differentiation, and consequently, in cancer development and progression. To inhibit Src kinases that are anchored to the plasma membrane, CHK is translocated to the membrane via binding to specific transmembrane proteins, G-proteins, or adaptor proteins near the membrane. CHK also plays a role in neural differentiation in a manner independent of Src by enhancing MAPK activation via Ras-mediated signaling. It is a cytoplasmic (or nonreceptor) tyr kinase containing the Src homology domains, SH3 and SH2, N-terminal to the catalytic tyr kinase domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	59
212746	cd11812	SH3_AHI-1	Src Homology 3 domain of Abelson helper integration site-1 (AHI-1). AHI-1, also called Jouberin, is expressed in high levels in the brain, gonad tissues, and skeletal muscle. It is an adaptor protein that interacts with the small GTPase Rab8a and regulates its distribution and function, affecting cilium formation and vesicle transport. Mutations in the AHI-1 gene can cause Joubert syndrome, a disorder characterized by brainstem malformations, cerebellar aplasia/hypoplasia, and retinal dystrophy. AHI-1 variation is also associated with susceptibility to schizophrenia and type 2 diabetes mellitus progression. AHI-1 contains WD40 and SH3 domains. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	52
212747	cd11813	SH3_SGSM3	Src Homology 3 domain of Small G protein Signaling Modulator 3. SGSM3 is also called Merlin-associated protein (MAP), RUN and SH3 domain-containing protein (RUSC3), RUN and TBC1 domain-containing protein 3 (RUTBC3), Rab GTPase-activating protein 5 (RabGAP5), or Rab GAP-like protein (RabGAPLP). It is expressed ubiquitously and functions as a regulator of small G protein RAP- and RAB-mediated neuronal signaling. It is involved in modulating NGF-mediated neurite outgrowth and differentiation. It also interacts with the tumor suppressor merlin and may play a role in the merlin-associated suppression of cell growth. SGSM3 contains TBC, SH3, and RUN domains. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212748	cd11814	SH3_Eve1_1	First Src homology 3 domain of ADAM-binding protein Eve-1. Eve-1, also called SH3 domain-containing protein 19 (SH3D19) or EEN-binding protein (EBP), exists in multiple alternatively spliced isoforms. The longest isoform contains five SH3 domains in the C-terminal region and seven proline-rich motifs in the N-terminal region. It is abundantly expressed in skeletal muscle and heart, and may be involved in regulating the activity of ADAMs (A disintegrin and metalloproteases). Eve-1 interacts with EEN, an endophilin involved in endocytosis and may be the target of the MLL-EEN fusion protein that is implicated in leukemogenesis. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	50
212749	cd11815	SH3_Eve1_2	Second Src homology 3 domain of ADAM-binding protein Eve-1. Eve-1, also called SH3 domain-containing protein 19 (SH3D19) or EEN-binding protein (EBP), exists in multiple alternatively spliced isoforms. The longest isoform contains five SH3 domains in the C-terminal region and seven proline-rich motifs in the N-terminal region. It is abundantly expressed in skeletal muscle and heart, and may be involved in regulating the activity of ADAMs (A disintegrin and metalloproteases). Eve-1 interacts with EEN, an endophilin involved in endocytosis and may be the target of the MLL-EEN fusion protein that is implicated in leukemogenesis. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	52
212750	cd11816	SH3_Eve1_3	Third Src homology 3 domain of ADAM-binding protein Eve-1. Eve-1, also called SH3 domain-containing protein 19 (SH3D19) or EEN-binding protein (EBP), exists in multiple alternatively spliced isoforms. The longest isoform contains five SH3 domains in the C-terminal region and seven proline-rich motifs in the N-terminal region. It is abundantly expressed in skeletal muscle and heart, and may be involved in regulating the activity of ADAMs (A disintegrin and metalloproteases). Eve-1 interacts with EEN, an endophilin involved in endocytosis and may be the target of the MLL-EEN fusion protein that is implicated in leukemogenesis. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	51
212751	cd11817	SH3_Eve1_4	Fourth Src homology 3 domain of ADAM-binding protein Eve-1. Eve-1, also called SH3 domain-containing protein 19 (SH3D19) or EEN-binding protein (EBP), exists in multiple alternatively spliced isoforms. The longest isoform contains five SH3 domains in the C-terminal region and seven proline-rich motifs in the N-terminal region. It is abundantly expressed in skeletal muscle and heart, and may be involved in regulating the activity of ADAMs (A disintegrin and metalloproteases). Eve-1 interacts with EEN, an endophilin involved in endocytosis and may be the target of the MLL-EEN fusion protein that is implicated in leukemogenesis. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	50
212752	cd11818	SH3_Eve1_5	Fifth Src homology 3 domain of ADAM-binding protein Eve-1. Eve-1, also called SH3 domain-containing protein 19 (SH3D19) or EEN-binding protein (EBP), exists in multiple alternatively spliced isoforms. The longest isoform contains five SH3 domains in the C-terminal region and seven proline-rich motifs in the N-terminal region. It is abundantly expressed in skeletal muscle and heart, and may be involved in regulating the activity of ADAMs (A disintegrin and metalloproteases). Eve-1 interacts with EEN, an endophilin involved in endocytosis and may be the target of the MLL-EEN fusion protein that is implicated in leukemogenesis. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	50
212753	cd11819	SH3_Cortactin_like	Src homology 3 domain of Cortactin and related proteins. This subfamily includes cortactin, Abp1 (actin-binding protein 1), hematopoietic lineage cell-specific protein 1 (HS1), and similar proteins. These proteins are involved in regulating actin dynamics through direct or indirect interaction with the Arp2/3 complex, which is required to initiate actin polymerization. They all contain at least one C-terminal SH3 domain. Cortactin and HS1 bind Arp2/3 and actin through an N-terminal region that contains an acidic domain and several copies of a repeat domain found in cortactin and HS1. Abp1 binds actin via an N-terminal actin-depolymerizing factor (ADF) homology domain. Yeast Abp1 binds Arp2/3 directly through two acidic domains. Mammalian Abp1 does not directly interact with Arp2/3; instead, it regulates actin dynamics indirectly by interacting with dynamin and WASP family proteins. The C-terminal region of these proteins acts as an adaptor or scaffold that can connect membrane trafficking and signaling proteins that bind the SH3 domain within the actin network. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	54
212754	cd11820	SH3_STAM	Src homology 3 domain of Signal Transducing Adaptor Molecules. STAMs were discovered as proteins that are highly phosphorylated following cytokine and growth factor stimulation. They function in cytokine signaling and surface receptor degradation, as well as regulate Golgi morphology. They associate with many proteins including Jak2 and Jak3 tyrosine kinases, Hrs, AMSH, and UBPY. STAM adaptor proteins contain VHS (Vps27, Hrs, STAM homology), ubiquitin interacting (UIM), and SH3 domains. There are two vertebrate STAMs, STAM1 and STAM2, which may be functionally redundant; vertebrate STAMs contain ITAM motifs. They are part of the endosomal sorting complex required for transport (ESCRT-0). STAM2 deficiency in mice did not cause any obvious abnormality, while STAM1 deficiency resulted in growth retardation. Loss of both STAM1 and STAM2 in mice proved lethal, indicating that STAMs are important for embryonic development. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	54
212755	cd11821	SH3_ASAP	Src homology 3 domain of ArfGAP with SH3 domain, ankyrin repeat and PH domain containing proteins. ASAPs are Arf GTPase activating proteins (GAPs) and they function in regulating cell growth, migration, and invasion. They contain an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, an Arf GAP domain, ankyrin (ANK) repeats, and a C-terminal SH3 domain. Vertebrates contain at least three members, ASAP1, ASAP2, and ASAP3, but some ASAP3 proteins do not seem to harbor a C-terminal SH3 domain. ASAP1 and ASAP2 show GTPase activating protein (GAP) activity towards Arf1 and Arf5. They do not show GAP activity towards Arf6, but are able to mediate Arf6 signaling by binding stably to GTP-Arf6. ASAP3 is an Arf6-specific GAP. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212756	cd11822	SH3_SASH_like	Src homology 3 domain of SAM And SH3 Domain Containing Proteins. This subfamily, also called the SLY family, is composed of SAM And SH3 Domain Containing Protein 1 (SASH1), SASH2, SASH3, and similar proteins. These are adaptor proteins containing a central conserved region with a bipartite nuclear localization signal (NLS) as well as SAM (sterile alpha motif) and SH3 domains. SASH1 is a potential tumor suppressor in breast and colon cancer. It is widely expressed in normal tissues (except lymphocytes and dendritic cells) and is localized in the nucleus and the cytoplasm. SASH1 interacts with the oncoprotein cortactin and is important in cell migration and adhesion. SASH2 (also called SAMSN-1, SLY2, HACS1 or NASH1) and SASH3 (also called SLY/SLY1) are expressed mainly in hematopoietic cells, although SASH2 is also found in endothelial cells as well as myeloid leukemias and myeloma. SASH2 was found to be differentially expressed in malignant haematopoietic cells and in colorectal tumors, and is a potential tumor suppressor in lung cancer. SASH3 is essential in the full activation of adaptive immunity and is involved in the signaling of T cell receptors. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	52
212757	cd11823	SH3_Nostrin	Src homology 3 domain of Nitric Oxide Synthase TRaffic INducer. Nostrin is expressed in endothelial and epithelial cells and is involved in the regulation, trafficking and targeting of endothelial NOS (eNOS). It facilitates the endocytosis of eNOS by coordinating the functions of dynamin and the Wiskott-Aldrich syndrome protein (WASP). Increased expression of Nostrin may be correlated to preeclampsia. Nostrin contains an N-terminal F-BAR domain and a C-terminal SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212758	cd11824	SH3_PSTPIP1	Src homology 3 domain of Proline-Serine-Threonine Phosphatase-Interacting Protein 1. PSTPIP1, also called CD2 Binding Protein 1 (CD2BP1), is mainly expressed in hematopoietic cells. It is a binding partner of the cell surface receptor CD2 and PTP-PEST, a tyrosine phosphatase which functions in cell motility and Rac1 regulation. It also plays a role in the activation of the Wiskott-Aldrich syndrome protein (WASP), which couples actin rearrangement and T cell activation. Mutations in the gene encoding PSTPIP1 cause the autoinflammatory disorder known as PAPA (pyogenic sterile arthritis, pyoderma gangrenosum, and acne) syndrome. PSTPIP1 contains an N-terminal F-BAR domain, PEST motifs, and a C-terminal SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212759	cd11825	SH3_PLCgamma	Src homology 3 domain of Phospholipase C (PLC) gamma. PLC catalyzes the hydrolysis of phosphatidylinositol (4,5)-bisphosphate [PtdIns(4,5)P2] to produce Ins(1,4,5)P3 and diacylglycerol (DAG) in response to various receptors. Ins(1,4,5)P3 initiates the calcium signaling cascade while DAG functions as an activator of PKC. PLCgamma catalyzes this reaction in tyrosine kinase-dependent signaling pathways. It is activated and recruited to its substrate at the membrane. Vertebrates contain two forms of PLCgamma, PLCgamma1, which is widely expressed, and PLCgamma2, which is primarily found in haematopoietic cells. PLCgamma contains a Pleckstrin homology (PH) domain followed by an elongation factor (EF) domain, two catalytic regions of PLC domains that flank two tandem SH2 domains, followed by a SH3 domain and C2 domain. The SH3 domain of PLCgamma1 directly interacts with dynamin-1 and can serve as a guanine nucleotide exchange factor (GEF). It also interacts with Cbl, inhibiting its phosphorylation and activity. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	54
212760	cd11826	SH3_Abi	Src homology 3 domain of Abl Interactor proteins. Abl interactor (Abi) proteins are adaptor proteins serving as binding partners and substrates of Abl tyrosine kinases. They are involved in regulating actin cytoskeletal reorganization and play important roles in membrane-ruffling, endocytosis, cell motility, and cell migration. They localize to sites of actin polymerization in epithelial adherens junction and immune synapses, as well as to the leading edge of lamellipodia. Vertebrates contain two Abi proteins, Abi1 and Abi2. Abi1 displays a wide expression pattern while Abi2 is highly expressed in the eye and brain. Abi proteins contain a homeobox homology domain, a proline-rich region, and a SH3 domain. The SH3 domain of Abi binds to a PxxP motif in Abl. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	52
212761	cd11827	SH3_MyoIe_If_like	Src homology 3 domain of Myosins Ie, If, and similar proteins. Myosins Ie (MyoIe) and If (MyoIf) are nonmuscle, unconventional, long tailed, class I myosins containing an N-terminal motor domain and a myosin tail with TH1, TH2, and SH3 domains. MyoIe interacts with the endocytic proteins, dynamin and synaptojanin-1, through its SH3 domain; it may play a role in clathrin-dependent endocytosis. In the kidney, MyoIe is critical for podocyte function and normal glomerular filtration. Mutations in MyoIe are associated with focal segmental glomerulosclerosis, a disease characterized by massive proteinuria and progression to end-stage kidney disease. MyoIf is predominantly expressed in the immune system; it plays a role in immune cell motility and innate immunity. Mutations in MyoIf may be associated with the loss of hearing. The MyoIf gene has also been found to be fused to the MLL (Mixed lineage leukemia) gene in infant acute myeloid leukemias (AML). SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212762	cd11828	SH3_ARHGEF9_like	Src homology 3 domain of ARHGEF9-like Rho guanine nucleotide exchange factors. Members of this family contain a SH3 domain followed by RhoGEF (also called Dbl-homologous or DH) and Pleckstrin Homology (PH) domains. They include the Rho guanine nucleotide exchange factors ARHGEF9, ASEF (also called ARHGEF4), ASEF2, and similar proteins. GEFs activate small GTPases by exchanging bound GDP for free GTP. ARHGEF9 specifically activates Cdc42, while both ASEF and ASEF2 can activate Rac1 and Cdc42. ARHGEF9 is highly expressed in the brain and it interacts with gephyrin, a postsynaptic protein associated with GABA and glycine receptors. ASEF plays a role in angiogenesis and cell migration. ASEF2 is important in cell migration and adhesion dynamics. ASEF exists in an autoinhibited form and is activated upon binding of the tumor suppressor APC (adenomatous polyposis coli), leading to the activation of Rac1 or Cdc42. In its autoinhibited form, the SH3 domain of ASEF forms an extensive interface with the DH and PH domains, blocking the Rac binding site. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212763	cd11829	SH3_GAS7	Src homology 3 domain of Growth Arrest Specific protein 7. GAS7 is mainly expressed in the brain and is required for neurite outgrowth. It may also play a role in the protection and migration of embryonic stem cells. Treatment-related acute myeloid leukemia (AML) has been reported resulting from mixed-lineage leukemia (MLL)-GAS7 translocations as a complication of primary cancer treatment. GAS7 contains an N-terminal SH3 domain, followed by a WW domain, and a central F-BAR domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	52
212764	cd11830	SH3_VAV_2	C-terminal (or second) Src homology 3 domain of VAV proteins. VAV proteins function both as cytoplasmic guanine nucleotide exchange factors (GEFs) for Rho GTPases and scaffold proteins and they play important roles in cell signaling by coupling cell surface receptors to various effector functions. They play key roles in processes that require cytoskeletal reorganization including immune synapse formation, phagocytosis, cell spreading, and platelet aggregation, among others. Vertebrates have three VAV proteins (VAV1, VAV2, and VAV3). VAV proteins contain several domains that enable their function: N-terminal calponin homology (CH), acidic, RhoGEF (also called Dbl-homologous or DH), Pleckstrin Homology (PH), C1 (zinc finger), SH2, and two SH3 domains. The SH3 domain of VAV is involved in the localization of proteins to specific sites within the cell, by interacting with proline-rich sequences within target proteins. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	54
212765	cd11831	SH3_VAV_1	First Src homology 3 domain of VAV proteins. VAV proteins function both as cytoplasmic guanine nucleotide exchange factors (GEFs) for Rho GTPases and scaffold proteins and they play important roles in cell signaling by coupling cell surface receptors to various effector functions. They play key roles in processes that require cytoskeletal reorganization including immune synapse formation, phagocytosis, cell spreading, and platelet aggregation, among others. Vertebrates have three VAV proteins (VAV1, VAV2, and VAV3). VAV proteins contain several domains that enable their function: N-terminal calponin homology (CH), acidic, RhoGEF (also called Dbl-homologous or DH), Pleckstrin Homology (PH), C1 (zinc finger), SH2, and two SH3 domains. The SH3 domain of VAV is involved in the localization of proteins to specific sites within the cell, by interacting with proline-rich sequences within target proteins. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	62
212766	cd11832	SH3_Shank	Src homology 3 domain of SH3 and multiple ankyrin repeat domains (Shank) proteins. Shank proteins carry out scaffolding functions through multiple sites of protein-protein interaction in their domain architecture, including ankyrin (ANK) repeats, a long proline rich region, as well as SH3, PDZ, and SAM domains. They bind a variety of membrane and cytosolic proteins, and exist in alternatively spliced isoforms. They are highly enriched in postsynaptic density (PSD) where they interact with the cytoskeleton and with postsynaptic membrane receptors including NMDA and glutamate receptors. They are crucial in the construction and organization of the PSD and dendritic spines of excitatory synapses. There are three members of this family (Shank1, Shank2, Shank3) which show distinct and cell-type specific patterns of expression. Shank1 is brain-specific; Shank2 is found in neurons, glia, endocrine cells, liver, and kidney; Shank3 is widely expressed. The SH3 domain of Shank binds GRIP, a scaffold protein that binds AMPA receptors and Eph receptors/ligands. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	50
212767	cd11833	SH3_Stac_1	First C-terminal Src homology 3 domain of SH3 and cysteine-rich domain-containing (Stac) proteins. Stac proteins are putative adaptor proteins that contain a cysteine-rich C1 domain and one or two SH3 domains at the C-terminus. There are three mammalian members (Stac1, Stac2, and Stac3) of this family. Stac1 and Stac3 contain two SH3 domains while Stac2 contains a single SH3 domain at the C-terminus. This model represents the first C-terminal SH3 domain of Stac1 and Stac3, and the single C-terminal SH3 domain of Stac2. Stac1 and Stac2 have been found to be expressed differently in mature dorsal root ganglia (DRG) neurons. Stac1 is mainly expressed in peptidergic neurons while Stac2 is found in a subset of nonpeptidergic and all trkB+ neurons. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212768	cd11834	SH3_Stac_2	Second C-terminal Src homology 3 domain of SH3 and cysteine-rich domain-containing proteins 1 and 3. This model represents the second C-terminal SH3 domain of Stac1 and Stac3. Stac proteins are putative adaptor proteins that contain a cysteine-rich C1 domain and one or two SH3 domains at the C-terminus. There are three mammalian members (Stac1, Stac2, and Stac3) of this family. Stac1 and Stac3 contain two SH3 domains while Stac2 contains a single SH3 domain at the C-terminus. Stac1 and Stac2 have been found to be expressed differently in mature dorsal root ganglia (DRG) neurons. Stac1 is mainly expressed in peptidergic neurons while Stac2 is found in a subset of nonpeptidergic and all trkB+ neurons. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	51
212769	cd11835	SH3_ARHGAP32_33	Src homology 3 domain of Rho GTPase-activating proteins 32 and 33, and similar proteins. Members of this family contain N-terminal PX and Src Homology 3 (SH3) domains, a central Rho GAP domain, and C-terminal extensions. RhoGAPs (or ARHGAPs) bind to Rho proteins and enhance the hydrolysis rates of bound GTP. ARHGAP32 is also called RICS, PX-RICS, p250GAP, or p200RhoGAP. It is a Rho GTPase-activating protein for Cdc42 and Rac1, and is implicated in the regulation of postsynaptic signaling and neurite outgrowth. PX-RICS, a variant of RICS that contains PX and SH3 domains, is the main isoform expressed during neural development. It is involved in neural functions including axon and dendrite extension, postnatal remodeling, and fine-tuning of neural circuits during early brain development. ARHGAP33, also called sorting nexin 26 or TCGAP (Tc10/CDC42 GTPase-activating protein), is widely expressed in the brain where it is involved in regulating the outgrowth of axons and dendrites and is regulated by the protein tyrosine kinase Fyn. It is translocated to the plasma membrane in adipocytes in response to insulin and may be involved in the regulation of insulin-stimulated glucose transport. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	54
212770	cd11836	SH3_Intersectin_1	First Src homology 3 domain (or SH3A) of Intersectin. Intersectins (ITSNs) are adaptor proteins that function in exo- and endocytosis, actin cytoskeletal reorganization, and signal transduction. They are essential for initiating clathrin-coated pit formation. They bind to many proteins through their multidomain structure and facilitate the assembly of multimeric complexes. Vertebrates contain two ITSN proteins, ITSN1 and ITSN2, which exist in alternatively spliced short and long isoforms. The short isoforms contain two Eps15 homology domains (EH1 and EH2), a coiled-coil region and five SH3 domains (SH3A-E), while the long isoforms, in addition, contain RhoGEF (also called Dbl-homologous or DH), Pleckstrin homology (PH) and C2 domains. ITSN1 and ITSN2 are both widely expressed, with variations depending on tissue type and stage of development. The first SH3 domain (or SH3A) of ITSN1 has been shown to bind many proteins including Sos1, dynamin1/2, CIN85, c-Cbl, PI3K-C2, SHIP2, N-WASP, and CdGAP, among others. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212771	cd11837	SH3_Intersectin_2	Second Src homology 3 domain (or SH3B) of Intersectin. Intersectins (ITSNs) are adaptor proteins that function in exo- and endocytosis, actin cytoskeletal reorganization, and signal transduction. They are essential for initiating clathrin-coated pit formation. They bind to many proteins through their multidomain structure and facilitate the assembly of multimeric complexes. Vertebrates contain two ITSN proteins, ITSN1 and ITSN2, which exist in alternatively spliced short and long isoforms. The short isoforms contain two Eps15 homology domains (EH1 and EH2), a coiled-coil region and five SH3 domains (SH3A-E), while the long isoforms, in addition, contain RhoGEF (also called Dbl-homologous or DH), Pleckstrin homology (PH) and C2 domains. ITSN1 and ITSN2 are both widely expressed, with variations depending on tissue type and stage of development. The second SH3 domain (or SH3B) of ITSN1 has been shown to bind WNK and CdGAP. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212772	cd11838	SH3_Intersectin_3	Third Src homology 3 domain (or SH3C) of Intersectin. Intersectins (ITSNs) are adaptor proteins that function in exo- and endocytosis, actin cytoskeletal reorganization, and signal transduction. They are essential for initiating clathrin-coated pit formation. They bind to many proteins through their multidomain structure and facilitate the assembly of multimeric complexes. Vertebrates contain two ITSN proteins, ITSN1 and ITSN2, which exist in alternatively spliced short and long isoforms. The short isoforms contain two Eps15 homology domains (EH1 and EH2), a coiled-coil region and five SH3 domains (SH3A-E), while the long isoforms, in addition, contain RhoGEF (also called Dbl-homologous or DH), Pleckstrin homology (PH) and C2 domains. ITSN1 and ITSN2 are both widely expressed, with variations depending on tissue type and stage of development. The third SH3 domain (or SH3C) of ITSN1 has been shown to bind many proteins including dynamin1/2, CIN85, c-Cbl, SHIP2, Reps1, synaptojanin-1, and WNK, among others. The SH3C of ITSN2 has been shown to bind the K15 protein of Kaposi's sarcoma-associated herpesvirus. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	52
212773	cd11839	SH3_Intersectin_4	Fourth Src homology 3 domain (or SH3D) of Intersectin. Intersectins (ITSNs) are adaptor proteins that function in exo- and endocytosis, actin cytoskeletal reorganization, and signal transduction. They are essential for initiating clathrin-coated pit formation. They bind to many proteins through their multidomain structure and facilitate the assembly of multimeric complexes. Vertebrates contain two ITSN proteins, ITSN1 and ITSN2, which exist in alternatively spliced short and long isoforms. The short isoforms contain two Eps15 homology domains (EH1 and EH2), a coiled-coil region and five SH3 domains (SH3A-E), while the long isoforms, in addition, contain RhoGEF (also called Dbl-homologous or DH), Pleckstrin homology (PH) and C2 domains. ITSN1 and ITSN2 are both widely expressed, with variations depending on tissue type and stage of development. The fourth SH3 domain (or SH3D) of ITSN1 has been shown to bind SHIP2, Numb, CdGAP, and N-WASP. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	58
212774	cd11840	SH3_Intersectin_5	Fifth Src homology 3 domain (or SH3E) of Intersectin. Intersectins (ITSNs) are adaptor proteins that function in exo- and endocytosis, actin cytoskeletal reorganization, and signal transduction. They are essential for initiating clathrin-coated pit formation. They bind to many proteins through their multidomain structure and facilitate the assembly of multimeric complexes. Vertebrates contain two ITSN proteins, ITSN1 and ITSN2, which exist in alternatively spliced short and long isoforms. The short isoforms contain two Eps15 homology domains (EH1 and EH2), a coiled-coil region and five SH3 domains (SH3A-E), while the long isoforms, in addition, contain RhoGEF (also called Dbl-homologous or DH), Pleckstrin homology (PH) and C2 domains. ITSN1 and ITSN2 are both widely expressed, with variations depending on tissue type and stage of development. The fifth SH3 domain (or SH3E) of ITSN1 has been shown to bind many protein partners including SGIP1, Sos1, dynamin1/2, CIN85, c-Cbl, SHIP2, N-WASP, and synaptojanin-1, among others. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212775	cd11841	SH3_SH3YL1_like	Src homology 3 domain of SH3 domain containing Ysc84-like 1 (SH3YL1) protein. SH3YL1 localizes to the plasma membrane and is required for dorsal ruffle formation. It binds phosphoinositides (PIs) with high affinity through its N-terminal SYLF domain (also called DUF500). In addition, SH3YL1 contains a C-terminal SH3 domain which has been reported to bind to N-WASP, dynamin 2, and SHIP2 (a PI 5-phosphatase). SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	54
212776	cd11842	SH3_Ysc84p_like	Src homology 3 domain of Ysc84p and similar fungal proteins. This family is composed of the Saccharomyces cerevisiae proteins, Ysc84p (also called LAS17-binding protein 4, Lsb4p) and Lsb3p, and similar fungal proteins. They contain an N-terminal SYLF domain (also called DUF500) and a C-terminal SH3 domain. Ysc84p localizes to actin patches and plays an important in actin polymerization during endocytosis. The N-terminal domain of both Ysc84p and Lsb3p can bind and bundle actin filaments. A study of the yeast SH3 domain interactome predicts that the SH3 domains of Lsb3p and Lsb4p may function as molecular hubs for the assembly of endocytic complexes. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212777	cd11843	SH3_PACSIN	Src homology 3 domain of Protein kinase C and Casein kinase Substrate in Neurons (PACSIN) proteins. PACSINs, also called Synaptic dynamin-associated proteins (Syndapins), act as regulators of cytoskeletal and membrane dynamics. They bind both dynamin and Wiskott-Aldrich syndrome protein (WASP), and may provide direct links between the actin cytoskeletal machinery through WASP and dynamin-dependent endocytosis. Vertebrates harbor three isoforms with distinct expression patterns and specific functions. PACSINs contain an N-terminal F-BAR domain and a C-terminal SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212778	cd11844	SH3_CAS	Src homology 3 domain of CAS (Crk-Associated Substrate) scaffolding proteins. CAS proteins function as molecular scaffolds to regulate protein complexes that are involved in many cellular processes including migration, chemotaxis, apoptosis, differentiation, and progenitor cell function. They mediate the signaling of integrins at focal adhesions where they localize, and thus, regulate cell invasion and survival. Over-expression of these proteins is implicated in poor prognosis, increased metastasis, and resistance to chemotherapeutics in many cancers such as breast, lung, melanoma, and glioblastoma. CAS proteins have also been linked to the pathogenesis of inflammatory disorders, Alzheimer's, Parkinson's, and developmental defects. They share a common domain structure that includes an N-terminal SH3 domain, an unstructured substrate domain that contains many YxxP motifs, a serine-rich four-helix bundle, and a FAT-like C-terminal domain. Vertebrates contain four CAS proteins: BCAR1 (or p130Cas), NEDD9 (or HEF1), EFS (or SIN), and CASS4 (or HEPL). The SH3 domain of CAS proteins binds to diverse partners including FAK, FRNK, Pyk2, PTP-PEST, DOCK180, among others. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	56
212779	cd11845	SH3_Src_like	Src homology 3 domain of Src kinase-like Protein Tyrosine Kinases. Src subfamily members include Src, Lck, Hck, Blk, Lyn, Fgr, Fyn, Yrk, Yes, and Brk. Src (or c-Src) proteins are cytoplasmic (or non-receptor) PTKs which are anchored to the plasma membrane. They contain an N-terminal SH4 domain with a myristoylation site, followed by SH3 and SH2 domains, a tyr kinase domain, and a regulatory C-terminal region containing a conserved tyr. They are activated by autophosphorylation at the tyr kinase domain, but are negatively regulated by phosphorylation at the C-terminal tyr by Csk (C-terminal Src Kinase). However, Brk lacks the N-terminal myristoylation sites. Src proteins are involved in signaling pathways that regulate cytokine and growth factor responses, cytoskeleton dynamics, cell proliferation, survival, and differentiation. They were identified as the first proto-oncogene products, and they regulate cell adhesion, invasion, and motility in cancer cells, and tumor vasculature, contributing to cancer progression and metastasis. Src kinases are overexpressed in a variety of human cancers, making them attractive targets for therapy. They are also implicated in acute inflammatory responses and osteoclast function. Src, Fyn, Yes, and Yrk are widely expressed, while Blk, Lck, Hck, Fgr, Lyn, and Brk show a limited expression pattern. This subfamily also includes Drosophila Src42A, Src oncogene at 42A (also known as Dsrc41) which accumulates at sites of cell-cell or cell-matrix adhesion, and participates in Drosophila development and wound healing. It has been shown to promote tube elongation in the tracheal system, is essential for proper cell-cell matching during dorsal closure, and regulates cell-cell contacts in developing Drosophila eyes. The SH3 domain of Src kinases contributes to substrate recruitment by binding adaptor proteins/substrates, and regulation of kinase activity through an intramolecular interaction. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	52
212780	cd11846	SH3_Srms	Src homology 3 domain of Srms Protein Tyrosine Kinase. Src-related kinase lacking C-terminal regulatory tyrosine and N-terminal myristoylation sites (Srms) is a cytoplasmic (or non-receptor) PTK with limited homology to Src kinases. Src kinases in general contain an N-terminal SH4 domain with a myristoylation site, followed by SH3 and SH2 domains, a tyr kinase domain, and a regulatory C-terminal region containing a conserved tyr; they are activated by autophosphorylation at the tyr kinase domain, but are negatively regulated by phosphorylation at the C-terminal tyr by Csk (C-terminal Src Kinase). However, Srms lacks the N-terminal myristoylation sites. Src proteins are involved in signaling pathways that regulate cytokine and growth factor responses, cytoskeleton dynamics, cell proliferation, survival, and differentiation. The SH3 domain of Src kinases contributes to substrate recruitment by binding adaptor proteins/substrates, and regulation of kinase activity through an intramolecular interaction. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212781	cd11847	SH3_Brk	Src homology 3 domain of Brk (Breast tumor kinase) Protein Tyrosine Kinase (PTK), also called PTK6. Brk is a cytoplasmic (or non-receptor) PTK with limited homology to Src kinases. It has been found to be overexpressed in a majority of breast tumors. It plays roles in normal cell differentiation, proliferation, survival, migration, and cell cycle progression. Brk substrates include RNA-binding proteins (SLM-1/2, Sam68), transcription factors (STAT3/5), and signaling molecules (Akt, paxillin, IRS-4). Src kinases in general contain an N-terminal SH4 domain with a myristoylation site, followed by SH3 and SH2 domains, a tyr kinase domain, and a regulatory C-terminal region containing a conserved tyr; they are activated by autophosphorylation at the tyr kinase domain, but are negatively regulated by phosphorylation at the C-terminal tyr by Csk (C-terminal Src Kinase). However, Brk lacks the N-terminal myristoylation site. The SH3 domain of Src kinases contributes to substrate recruitment by binding adaptor proteins/substrates, and regulation of kinase activity through an intramolecular interaction. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	58
212782	cd11848	SH3_SLAP-like	Src homology 3 domain of Src-Like Adaptor Proteins. SLAPs are adaptor proteins with limited similarity to Src family tyrosine kinases. They contain an N-terminal SH3 domain followed by an SH2 domain, and a unique C-terminal sequence. They function in regulating the signaling, ubiquitination, and trafficking of T-cell receptor (TCR) and B-cell receptor (BCR) components. Vertebrates contain two SLAPs, named SLAP (or SLA1) and SLAP2 (or SLA2). SLAP has been shown to interact with the EphA receptor, EpoR, Lck, PDGFR, Syk, CD79a, among others, while SLAP2 interacts with CSF1R. Both SLAPs interact with c-Cbl, LAT, CD247, and Zap70. SLAP modulates TCR surface expression levels as well as surface and total BCR levels. As an adaptor to c-Cbl, SLAP increases the ubiquitination, intracellular retention, and targeted degradation of the BCR complex components. SLAP2 plays a role in c-Cbl-dependent regulation of CSF1R, a tyrosine kinase important for myeloid cell growth and differentiation. The SH3 domain of SLAP forms a complex with v-Abl. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212783	cd11849	SH3_SPIN90	Src homology 3 domain of SH3 protein interacting with Nck, 90 kDa (SPIN90). SPIN90 is also called NCK interacting protein with SH3 domain (NCKIPSD), Dia-interacting protein (DIP), 54 kDa vimentin-interacting protein (VIP54), or WASP-interacting SH3-domain protein (WISH). It is an F-actin binding protein that regulates actin polymerization and endocytosis. It associates with the Arp2/3 complex near actin filaments and determines filament localization at the leading edge of lamellipodia. SPIN90 is expressed in the early stages of neuronal differentiation and plays a role in regulating growth cone dynamics and neurite outgrowth. It also interacts with IRSp53 and regulates cell motility by playing a role in the formation of membrane protrusions. SPIN90 contains an N-terminal SH3 domain, a proline-rich domain, and a C-terminal VCA (verprolin-homology and cofilin-like acidic) domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212784	cd11850	SH3_Abl	Src homology 3 domain of the Protein Tyrosine Kinase, Abelson kinase. Abl (or c-Abl) is a ubiquitously-expressed cytoplasmic (or nonreceptor) PTK that contains SH3, SH2, and tyr kinase domains in its N-terminal region, as well as nuclear localization motifs, a putative DNA-binding domain, and F- and G-actin binding domains in its C-terminal tail. It also contains a short autoinhibitory cap region in its N-terminus. Abl function depends on its subcellular localization. In the cytoplasm, Abl plays a role in cell proliferation and survival. In response to DNA damage or oxidative stress, Abl is transported to the nucleus where it induces apoptosis. In chronic myelogenous leukemia (CML) patients, an aberrant translocation results in the replacement of the first exon of Abl with the BCR (breakpoint cluster region) gene. The resulting BCR-Abl fusion protein is constitutively active and associates into tetramers, resulting in a hyperactive kinase sending a continuous signal. This leads to uncontrolled proliferation, morphological transformation and anti-apoptotic effects. BCR-Abl is the target of selective inhibitors, such as imatinib (Gleevec), used in the treatment of CML. Abl2, also known as ARG (Abelson-related gene), is thought to play a cooperative role with Abl in the proper development of the nervous system. The Tel-ARG fusion protein, resulting from reciprocal translocation between chromosomes 1 and 12, is associated with acute myeloid leukemia (AML). SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	56
212785	cd11851	SH3_RIM-BP	Src homology 3 domains of Rab3-interacting molecules (RIMs) binding proteins. RIMs binding proteins (RBPs, RIM-BPs) associate with calcium channels present in photoreceptors, neurons, and hair cells; they interact simultaneously with specific calcium channel subunits, and active zone proteins, RIM1 and RIM2. RIMs are part of the matrix at the presynaptic active zone and are associated with synaptic vesicles through their interaction with the small GTPase Rab3. RIM-BPs play a role in regulating synaptic transmission by serving as adaptors and linking calcium channels with the synaptic vesicle release machinery. RIM-BPs contain three SH3 domains and two to three fibronectin III repeats. Invertebrates contain one, while vertebrates contain at least two RIM-BPs, RIM-BP1 and RIM-BP2. RIM-BP1 is also called peripheral-type benzodiazepine receptor associated protein 1 (PRAX-1). Mammals contain a third protein, RIM-BP3. RIM-BP1 and RIM-BP2 are predominantly expressed in the brain where they display overlapping but distinct expression patterns, while RIM-BP3 is almost exclusively expressed in the testis and is essential in spermiogenesis. The SH3 domains of RIM-BPs bind to the PxxP motifs of RIM1, RIM2, and L-type (alpha1D) and N-type (alpha1B) calcium channel subunits. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	62
212786	cd11852	SH3_Kalirin_1	First Src homology 3 domain of the RhoGEF kinase, Kalirin. Kalirin, also called Duo, Duet, or TRAD, is a large neuronal dual Rho guanine nucleotide exchange factor (RhoGEF) that activates Rac1, RhoA, and RhoG using two RhoGEF domains. Kalirin exists in many isoforms generated by alternative splicing and the use of multiple promoters; the major isoforms are kalirin-7, -9, and -12, which differ at their C-terminal ends. Kalirin-12, the longest isoform, contains an N-terminal Sec14p domain, spectrin-like repeats, two RhoGEF domains, two SH3 domains, as well as Ig, FNIII, and kinase domains at the C-terminal end. Kalirin-7 contains only a single RhoGEF domain and does not contain an SH3 domain. Kalirin, through its many isoforms, interacts with many different proteins and is able to localize to different locations within the cell. It influences neurite initiation, axon growth, dendritic morphogenesis, vesicle trafficking, neuronal maintenance, and neurodegeneration. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	62
212787	cd11853	SH3_Kalirin_2	Second Src homology 3 domain of the RhoGEF kinase, Kalirin. Kalirin, also called Duo, Duet, or TRAD, is a large neuronal dual Rho guanine nucleotide exchange factor (RhoGEF) that activates Rac1, RhoA, and RhoG using two RhoGEF domains. Kalirin exists in many isoforms generated by alternative splicing and the use of multiple promoters; the major isoforms are kalirin-7, -9, and -12, which differ at their C-terminal ends. Kalirin-12, the longest isoform, contains an N-terminal Sec14p domain, spectrin-like repeats, two RhoGEF domains, two SH3 domains, as well as Ig, FNIII, and kinase domains at the C-terminal end. Kalirin-7 contains only a single RhoGEF domain and does not contain an SH3 domain. Kalirin, through its many isoforms, interacts with many different proteins and is able to localize to different locations within the cell. It influences neurite initiation, axon growth, dendritic morphogenesis, vesicle trafficking, neuronal maintenance, and neurodegeneration. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	59
212788	cd11854	SH3_Fus1p	Src homology 3 domain of yeast cell fusion protein Fus1p. Fus1p is required at the cell surface for cell fusion during the mating response in yeast. It requires Bch1p and Bud7p, which are Chs5p-Arf1p binding proteins, for localization to the plasma membrane. It acts as a scaffold protein to assemble a cell surface complex which is involved in septum degradation and inhibition of the HOG pathway to promote cell fusion. The SH3 domain of Fus1p interacts with Bni1p, a formin that controls the assembly of actin cables in response to Cdc42 signaling. It has been shown to bind the motif, R(S/T)(S/T)SL, instead of PxxP motifs. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	56
212789	cd11855	SH3_Sho1p	Src homology 3 domain of High osmolarity signaling protein Sho1p. Sho1p (or Sho1), also called SSU81 (Suppressor of SUA8-1 mutation), is a yeast membrane protein that regulates adaptation to high salt conditions by activating the HOG (high-osmolarity glycerol) pathway. High salt concentrations lead to the localization to the membrane of the MAPKK Pbs2, which is then activated by the MAPKKK Ste11 and in turn, activates the MAPK Hog1. Pbs2 is localized to the membrane through the interaction of its PxxP motif with the SH3 domain of Sho1p. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212790	cd11856	SH3_p47phox_like	Src homology 3 domains of the p47phox subunit of NADPH oxidase and similar domains. This family is composed of the tandem SH3 domains of p47phox subunit of NADPH oxidase and Nox Organizing protein 1 (NoxO1), the four SH3 domains of Tks4 (Tyr kinase substrate with four SH3 domains), the five SH3 domains of Tks5, the SH3 domain of obscurin, Myosin-I, and similar domains. Most members of this group also contain Phox homology (PX) domains, except for obscurin and Myosin-I. p47phox and NoxO1 are regulators of the phagocytic NADPH oxidase complex (also called Nox2 or gp91phox) and nonphagocytic NADPH oxidase Nox1, respectively. They play roles in the activation of their respective NADPH oxidase, which catalyzes the transfer of electrons from NADPH to molecular oxygen to form superoxide. Tks proteins are Src substrates and scaffolding proteins that play important roles in the formation of podosomes and invadopodia, the dynamic actin-rich structures that are related to cell migration and cancer cell invasion. Obscurin is a giant muscle protein that plays important roles in the organization and assembly of the myofibril and the sarcoplasmic reticulum. Type I myosins (Myosin-I) are actin-dependent motors in endocytic actin structures and actin patches. They play roles in membrane traffic in endocytic and secretory pathways, cell motility, and mechanosensing. Myosin-I contains an N-terminal actin-activated ATPase, a phospholipid-binding TH1 (tail homology 1) domain, and a C-terminal extension which includes an F-actin-binding TH2 domain, an SH3 domain, and an acidic peptide that participates in activating the Arp2/3 complex. The SH3 domain of myosin-I is required for myosin-I-induced actin polymerization. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212791	cd11857	SH3_DBS	Src homology 3 domain of DBL's Big Sister (DBS), a guanine nucleotide exchange factor. DBS, also called MCF2L (MCF2-transforming sequence-like protein) or OST, is a Rho GTPase guanine nucleotide exchange factor (RhoGEF), facilitating the exchange of GDP and GTP. It was originally isolated from a cDNA screen for sequences that cause malignant growth. It plays roles in regulating clathrin-mediated endocytosis and cell migration through its activation of Rac1 and Cdc42. Depending on cell type, DBS can also activate RhoA and RhoG. DBS contains a Sec14-like domain, spectrin-like repeats, a RhoGEF [or Dbl homology (DH)] domain, a Pleckstrin homology (PH) domain, and an SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212792	cd11858	SH3_Myosin-I_fungi	Src homology 3 domain of Type I fungal Myosins. Type I myosins (myosin-I) are actin-dependent motors in endocytic actin structures and actin patches. They play roles in membrane traffic in endocytic and secretory pathways, cell motility, and mechanosensing. Saccharomyces cerevisiae has two myosins-I, Myo3 and Myo5, which are involved in endocytosis and the polarization of the actin cytoskeleton. Myosin-I contains an N-terminal actin-activated ATPase, a phospholipid-binding TH1 (tail homology 1) domain, and a C-terminal extension which includes an F-actin-binding TH2 domain, an SH3 domain, and an acidic peptide that participates in activating the Arp2/3 complex. The SH3 domain of myosin-I is required for myosin-I-induced actin polymerization. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212793	cd11859	SH3_ZO	Src homology 3 domain of the Tight junction proteins, Zonula occludens (ZO) proteins. ZO proteins are scaffolding proteins that associate with each other and with other proteins of the tight junction, zonula adherens, and gap junctions. They play roles in regulating cytoskeletal dynamics at these cell junctions. They are considered members of the MAGUK (membrane-associated guanylate kinase) protein family, which is characterized by the presence of a core of three domains: PDZ, SH3, and guanylate kinase (GuK). The GuK domain in MAGUK proteins is enzymatically inactive; instead, the domain mediates protein-protein interactions and associates intramolecularly with the SH3 domain. Vertebrates contain three ZO proteins (ZO-1, ZO-2, and ZO-3) with redundant and non-redundant roles. They contain three PDZ domains, followed by SH3 and GuK domains; in addition, ZO-1 and ZO-2 contain a proline-rich (PR) actin binding domain at the C-terminus while ZO-3 contains this PR domain between the second and third PDZ domains. The C-terminal regions of the three ZO proteins are unique. The SH3 domain of ZO-1 has been shown to bind ZONAB, ZAK, afadin, and Galpha12. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	62
212794	cd11860	SH3_DLG5	Src homology 3 domain of Disks Large homolog 5. DLG5 is a multifunctional scaffold protein that is located at sites of cell-cell contact and is involved in the maintenance of cell shape and polarity. Mutations in the DLG5 gene are associated with Crohn's disease (CD) and inflammatory bowel disease (IBD). DLG5 is a member of the MAGUK (membrane-associated guanylate kinase) protein family, which is characterized by the presence of a core of three domains: PDZ, SH3, and guanylate kinase (GuK). The GuK domain in MAGUK proteins is enzymatically inactive; instead, the domain mediates protein-protein interactions and associates intramolecularly with the SH3 domain. DLG5 contains 4 PDZ domains as well as an N-terminal domain of unknown function. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	63
212795	cd11861	SH3_DLG-like	Src Homology 3 domain of Disks large homolog proteins. The DLG-like proteins are scaffolding proteins that cluster at synapses and are also called PSD (postsynaptic density)-95 proteins or SAPs (synapse-associated proteins). They play important roles in synaptic development and plasticity, cell polarity, migration and proliferation. They are members of the MAGUK (membrane-associated guanylate kinase) protein family, which is characterized by the presence of a core of three domains: PDZ, SH3, and guanylate kinase (GuK). The GuK domain in MAGUK proteins is enzymatically inactive; instead, the domain mediates protein-protein interactions and associates intramolecularly with the SH3 domain. DLG-like proteins contain three PDZ domains and varying N-terminal regions. All DLG proteins exist as alternatively-spliced isoforms. Vertebrates contain four DLG proteins from different genes, called DLG1-4. DLG4 and DLG2 are found predominantly at postsynaptic sites and they mediate surface ion channel and receptor clustering. DLG3 is found in axons and some presynaptic terminals. DLG1 interacts with AMPA-type glutamate receptors and is critical in their maturation and delivery to synapses. The SH3 domain of DLG4 binds and clusters the kainate subgroup of glutamate receptors via two proline-rich sequences in their C-terminal tail. It also binds AKAP79/150 (A-kinase anchoring protein). SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	61
212796	cd11862	SH3_MPP	Src Homology 3 domain of Membrane Protein, Palmitoylated (or MAGUK p55 subfamily member) proteins. The MPP/p55 subfamily of MAGUK (membrane-associated guanylate kinase) proteins includes at least eight vertebrate members (MPP1-7 and CASK), four Drosophila proteins (Stardust, Varicose, CASK and Skiff), and other similar proteins; they all contain one each of the core of three domains characteristic of MAGUK proteins: PDZ, SH3, and guanylate kinase (GuK). In addition, most members except for MPP1 contain N-terminal L27 domains and some also contain a Hook (Protein 4.1 Binding) motif in between the SH3 and GuK domains. CASK has an additional calmodulin-dependent kinase (CaMK)-like domain at the N-terminus. Members of this subfamily are scaffolding proteins that play important roles in regulating and establishing cell polarity, cell adhesion, and synaptic targeting and transmission, among others. The GuK domain in MAGUK proteins is enzymatically inactive; instead, the domain mediates protein-protein interactions and associates intramolecularly with the SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	61
212797	cd11863	SH3_CACNB	Src Homology 3 domain of Voltage-dependent L-type calcium channel subunit beta. Voltage-dependent calcium channels (Ca(V)s) are multi-protein complexes that regulate the entry of calcium into cells. They impact muscle contraction, neuronal migration, hormone and neurotransmitter release, and the activation of calcium-dependent signaling pathways. They are composed of four subunits: alpha1, alpha2delta, beta, and gamma. The beta subunit is a soluble and intracellular protein that interacts with the transmembrane alpha1 subunit. It facilitates the trafficking and proper localization of the alpha1 subunit to the cellular plasma membrane. Vertebrates contain four different beta subunits from distinct genes (beta1-4); each exists as multiple splice variants. All are expressed in the brain while other tissues show more specific expression patterns. The beta subunits show similarity to MAGUK (membrane-associated guanylate kinase) proteins in that they contain SH3 and inactive guanylate kinase (GuK) domains; however, they do not appear to contain a PDZ domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	62
212798	cd11864	SH3_PEX13_eumet	Src Homology 3 domain of eumetazoan Peroxisomal biogenesis factor 13. PEX13 is a peroxin and is required for protein import into the peroxisomal matrix and membrane. It is an integral membrane protein that is essential for the localization of PEX14 and the import of proteins containing the peroxisome matrix targeting signals, PTS1 and PTS2. Mutations of the PEX13 gene in humans lead to a wide range of peroxisome biogenesis disorders (PBDs), the most severe of which is known as Zellweger syndrome (ZS), a severe multisystem disorder characterized by hypotonia, psychomotor retardation, and neuronal migration defects. PEX13 contains two transmembrane regions and a C-terminal SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	58
212799	cd11865	SH3_Nbp2-like	Src Homology 3 domain of Saccharomyces cerevisiae Nap1-binding protein 2 and similar fungal proteins. This subfamily includes Saccharomyces cerevisiae Nbp2 (Nucleosome assembly protein 1 (Nap1)-binding protein 2), Schizosaccharomyces pombe Skb5, and similar proteins. Nbp2 interacts with Nap1, which is essential for maintaining proper nucleosome structures in transcription and replication. It is also the binding partner of the yeast type II protein phosphatase Ptc1p and serves as a scaffolding protein that brings seven kinases in close contact to Ptc1p. Nbp2 plays a role in many cell processes including organelle inheritance, mating hormone response, cell wall stress, mitotic cell growth at elevated temperatures, and high osmolarity. Skb5 interacts with the p21-activated kinase (PAK) homolog Shk1, which is critical for fission yeast cell viability. Skb5 activates Shk1 and plays a role in regulating cell morphology and growth under hypertonic conditions. Nbp2 and Skb5 contain an SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212800	cd11866	SH3_SKAP1-like	Src Homology 3 domain of Src Kinase-Associated Phosphoprotein 1 and similar proteins. This subfamily is composed of SKAP1, SKAP2, and similar proteins. SKAP1 and SKAP2 are immune cell-specific adaptor proteins that play roles in T- and B-cell adhesion, respectively, and are thus important in the migration of T- and B-cells to sites of inflammation and for movement during T-cell conjugation with antigen-presenting cells. Both SKAP1 and SKAP2 bind to ADAP (adhesion and degranulation-promoting adaptor protein), among many other binding partners. They contain a pleckstrin homology (PH) domain, a C-terminal SH3 domain, and several tyrosine phosphorylation sites. The SH3 domain of SKAP1 is necessary for its ability to regulate T-cell conjugation with antigen-presenting cells and the formation of LFA-1 clusters. SKAP1 binds primarily to a proline-rich region of ADAP through its SH3 domain; its degradation is regulated by ADAP. A secondary interaction occurs via the ADAP SH3 domain and the RKxxYxxY motif in SKAP1. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212801	cd11867	hSH3_ADAP	Helically extended Src Homology 3 domain of Adhesion and Degranulation-promoting Adaptor Protein. ADAP, also called Fyn T-binding protein (FYB) or SLP-76-associated protein (SLAP), is expressed mainly in hematopoietic cells but not in B cells. It is required for the proliferation of mature T-cells and plays an important role in T-cell activation, TCR-induced integrin clustering, and T-cell adhesion. ADAP has been shown to bind many partners including SLP-76, Fyn, Src, SKAP1, SKAP2, dynein, Ena/VASP, Carma1, among others. It is connected to the cytoskeleton via its binding to Ena and VASP, which impacts actin cytoskeletal remodeling upon TCR ligation. The SH3 domain of ADAP adopts an altered fold referred to as a helically extended SH3 (hSH3) domain characterized by clusters of positive charges. The hSH3 domain can no longer bind conventional proline-rich peptides; instead, it functions as a novel lipid interaction domain and can bind acidic lipids such as phosphatidylserine, phosphatidylinositol, phosphatidic acid, and polyphosphoinositides.	77
212802	cd11869	SH3_p40phox	Src Homology 3 domain of the p40phox subunit of NADPH oxidase. p40phox, also called Neutrophil cytosol factor 4 (NCF-4), is a cytosolic subunit of the phagocytic NADPH oxidase complex (also called Nox2 or gp91phox) which plays a crucial role in the cellular response to bacterial infection. NADPH oxidase catalyzes the transfer of electrons from NADPH to oxygen during phagocytosis forming superoxide and reactive oxygen species. p40phox positively regulates NADPH oxidase in both phosphatidylinositol-3-phosphate (PI3P)-dependent and PI3P-independent manner. It contains an N-terminal PX domain, a central SH3 domain, and a C-terminal PB1 domain that interacts with p67phox. The SH3 domain of p40phox binds to canonical polyproline and noncanonical motifs at the C-terminus of p47phox. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	54
212803	cd11870	SH3_p67phox-like_C	C-terminal Src Homology 3 domain of the p67phox subunit of NADPH oxidase and similar proteins. This subfamily is composed of p67phox, NADPH oxidase activator 1 (Noxa1), and similar proteins. p67phox, also called Neutrophil cytosol factor 2 (NCF-2), and Noxa1 are homologs and are the cytosolic subunits of the phagocytic (Nox2) and nonphagocytic (Nox1) NADPH oxidase complexes, respectively. NADPH oxidase catalyzes the transfer of electrons from NADPH to oxygen during phagocytosis forming superoxide and reactive oxygen species. p67phox and Noxa1 play regulatory roles. p67phox contains N-terminal TPR, first SH3 (or N-terminal or central SH3), PB1, and C-terminal SH3 domains. Noxa1 has a similar domain architecture except it is lacking the N-terminal SH3 domain. The TPR domain of both binds activated GTP-bound Rac, while the C-terminal SH3 domain of p67phox and Noxa1 binds the polyproline motif found at the C-terminus of p47phox and Noxo1, respectively. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212804	cd11871	SH3_p67phox_N	N-terminal (or first) Src Homology 3 domain of the p67phox subunit of NADPH oxidase. p67phox, also called Neutrophil cytosol factor 2 (NCF-2), is a cytosolic subunit of the phagocytic NADPH oxidase complex (also called Nox2 or gp91phox) which plays a crucial role in the cellular response to bacterial infection. NADPH oxidase catalyzes the transfer of electrons from NADPH to oxygen during phagocytosis forming superoxide and reactive oxygen species. p67phox plays a regulatory role and contains N-terminal TPR, first SH3 (or N-terminal or central SH3), PB1, and C-terminal SH3 domains. It binds, via its C-terminal SH3 domain, to a proline-rich region of p47phox and upon activation, this complex assembles with flavocytochrome b558, the Nox2-p22phox heterodimer. Concurrently, RacGTP translocates to the membrane and interacts with the TPR domain of p67phox, which leads to the activation of NADPH oxidase. The PB1 domain of p67phox binds to its partner PB1 domain in p40phox, and this facilitates the assembly of p47phox-p67phox at the membrane. The N-terminal SH3 domain increases the affinity of p67phox for the oxidase complex. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	54
212805	cd11872	SH3_DOCK_AB	Src Homology 3 domain of Class A and B Dedicator of Cytokinesis proteins. DOCK proteins are atypical guanine nucleotide exchange factors (GEFs) that lack the conventional Dbl homology (DH) domain. They are divided into four classes (A-D) based on sequence similarity and domain architecture: class A includes Dock1, 2 and 5; class B includes Dock3 and 4; class C includes Dock6, 7, and 8; and class D includes Dock9, 10 and 11. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). The DHR-1 domain binds phosphatidylinositol-3,4,5-triphosphate while DHR-2 contains the catalytic activity for Rac and/or Cdc42. This subfamily includes only Class A and B DOCKs, which also contain an SH3 domain at the N-terminal region and a PxxP motif at the C-terminus. Class A/B DOCKs are mostly specific GEFs for Rac, except Dock4 which activates the Ras family GTPase Rap1, probably indirectly through interaction with Rap regulatory proteins. The SH3 domain of class A/B DOCKs has been shown to bind Elmo, a scaffold protein that promotes GEF activity of DOCKs by releasing DHR-2 autoinhibition by the intramolecular SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	56
212806	cd11873	SH3_CD2AP-like_1	First Src Homology 3 domain (SH3A) of CD2-associated protein and similar proteins. This subfamily is composed of the first SH3 domain (SH3A) of CD2AP, CIN85 (Cbl-interacting protein of 85 kDa), and similar domains. CD2AP and CIN85 are adaptor proteins that bind to protein partners and assemble complexes that have been implicated in T cell activation, kidney function, and apoptosis of neuronal cells. They also associate with endocytic proteins, actin cytoskeleton components, and other adaptor proteins involved in receptor tyrosine kinase (RTK) signaling. CD2AP and the main isoform of CIN85 contain three SH3 domains, a proline-rich region, and a C-terminal coiled-coil domain. All of these domains enable CD2AP and CIN85 to bind various protein partners and assemble complexes that have been implicated in many different functions. SH3A of both proteins binds to an atypical PXXXPR motif at the C-terminus of Cbl and the cytoplasmic domain of the cell adhesion protein CD2. CIN85 SH3A binds to internal proline-rich motifs within the proline-rich region; this intramolecular interaction serves as a regulatory mechanism to keep CIN85 in a closed conformation, preventing the recruitment of other proteins. CIN85 SH3A has also been shown to bind ubiquitin. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212807	cd11874	SH3_CD2AP-like_2	Second Src Homology 3 domain (SH3B) of CD2-associated protein and similar proteins. This subfamily is composed of the second SH3 domain (SH3B) of CD2AP, CIN85 (Cbl-interacting protein of 85 kDa), and similar domains. CD2AP and CIN85 are adaptor proteins that bind to protein partners and assemble complexes that have been implicated in T cell activation, kidney function, and apoptosis of neuronal cells. They also associate with endocytic proteins, actin cytoskeleton components, and other adaptor proteins involved in receptor tyrosine kinase (RTK) signaling. CD2AP and the main isoform of CIN85 contain three SH3 domains, a proline-rich region, and a C-terminal coiled-coil domain. All of these domains enable CD2AP and CIN85 to bind various protein partners and assemble complexes that have been implicated in many different functions. SH3B of both proteins has been shown to bind to Cbl. In the case of CD2AP, its SH3B binds to Cbl at a site distinct from the c-Cbl/SH3A binding site. The CIN85 SH3B also binds ubiquitin. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212808	cd11875	SH3_CD2AP-like_3	Third Src Homology 3 domain (SH3C) of CD2-associated protein and similar proteins. This subfamily is composed of the third SH3 domain (SH3C) of CD2AP, CIN85 (Cbl-interacting protein of 85 kDa), and similar domains. CD2AP and CIN85 are adaptor proteins that bind to protein partners and assemble complexes that have been implicated in T cell activation, kidney function, and apoptosis of neuronal cells. They also associate with endocytic proteins, actin cytoskeleton components, and other adaptor proteins involved in receptor tyrosine kinase (RTK) signaling. CD2AP and the main isoform of CIN85 contain three SH3 domains, a proline-rich region, and a C-terminal coiled-coil domain. All of these domains enable CD2AP and CIN85 to bind various protein partners and assemble complexes that have been implicated in many different functions. SH3C of both proteins has been shown to bind to ubiquitin. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212809	cd11876	SH3_MLK	Src Homology 3 domain of Mixed Lineage Kinases. MLKs are Serine/Threonine Kinases (STKs), catalyzing the transfer of the gamma-phosphoryl group from ATP to S/T residues on protein substrates. MLKs act as mitogen-activated protein kinase kinase kinases (MAP3Ks, MKKKs, MAPKKKs), which phosphorylate and activate MAPK kinases (MAPKKs or MKKs or MAP2Ks), which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. MLKs play roles in immunity and inflammation, as well as in cell death, proliferation, and cell cycle regulation. Mammals have four MLKs (MLK1-4), mostly conserved in vertebrates, which contain an SH3 domain, a catalytic kinase domain, a leucine zipper, a proline-rich region, and a CRIB domain that mediates binding to GTP-bound Cdc42 and Rac. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	58
212810	cd11877	SH3_PIX	Src Homology 3 domain of Pak Interactive eXchange factors. PIX proteins are Rho guanine nucleotide exchange factors (GEFs), which activate small GTPases by exchanging bound GDP for free GTP. They act as GEFs for both Cdc42 and Rac1, and have been implicated in cell motility, adhesion, neurite outgrowth, and cell polarity. Vertebrates contain two proteins from the PIX subfamily, alpha-PIX and beta-PIX. Alpha-PIX, also called ARHGEF6, is localized in dendritic spines where it regulates spine morphogenesis. Mutations in the ARHGEF6 gene cause X-linked intellectual disability in humans. Beta-PIX plays roles in regulating neuroendocrine exocytosis, focal adhesion maturation, cell migration, synaptic vesicle localization, and insulin secretion. PIX proteins contain an N-terminal SH3 domain followed by RhoGEF (also called Dbl-homologous or DH) and Pleckstrin Homology (PH) domains, and a C-terminal leucine-zipper domain for dimerization. The SH3 domain of PIX binds to an atypical PxxxPR motif in p21-activated kinases (PAKs) with high affinity. The binding of PAKs to PIX facilitates the localization of PAKs to focal complexes and also localizes PAKs to PIX targets Cdc42 and Rac, leading to the activation of PAKs. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212811	cd11878	SH3_Bem1p_1	First Src Homology 3 domain of Bud emergence protein 1 and similar domains. Members of this subfamily bear similarity to Saccharomyces cerevisiae Bem1p, containing two Src Homology 3 (SH3) domains at the N-terminus, a central PX domain, and a C-terminal PB1 domain. Bem1p is a scaffolding protein that is critical for proper Cdc42p activation during bud formation in yeast. During budding and mating, Bem1p migrates to the plasma membrane where it can serve as an adaptor for Cdc42p and some other proteins. Bem1p also functions as an effector of the G1 cyclin Cln3p and the cyclin-dependent kinase Cdc28p in promoting vacuolar fusion. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediate multiprotein complex assemblies.	54
212812	cd11879	SH3_Bem1p_2	Second Src Homology 3 domain of Bud emergence protein 1 and similar domains. Members of this subfamily bear similarity to Saccharomyces cerevisiae Bem1p, containing two Src Homology 3 (SH3) domains at the N-terminus, a central PX domain, and a C-terminal PB1 domain. Bem1p is a scaffolding protein that is critical for proper Cdc42p activation during bud formation in yeast. During budding and mating, Bem1p migrates to the plasma membrane where it can serve as an adaptor for Cdc42p and some other proteins. Bem1p also functions as an effector of the G1 cyclin Cln3p and the cyclin-dependent kinase Cdc28p in promoting vacuolar fusion. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediate multiprotein complex assemblies.	56
212813	cd11880	SH3_Caskin	Src Homology 3 domain of CASK interacting protein. Caskin proteins are multidomain adaptor proteins that contain six ankyrin repeats, a single SH3 domain, tandem sterile alpha motif (SAM) domains, and a long disordered proline-rich region. There are two Caskin proteins called Caskin1 and Caskin2. Caskin1 binds to the multidomain scaffolding protein CASK through the CaM domain in competition with Munc-interacting protein 1 (Mint1). CASK participates in one of two evolutionarily conserved tripartite complexes containing either Mint1 and Velis or Caskin1 and Velis. Caskin1 may play a role in infantile myoclonic epilepsy. There is not much known about Caskin2; despite sharing a domain architecture with Caskin1, Caskin2 does not bind CASK. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediate multiprotein complex assemblies.	61
212814	cd11881	SH3_MYO7A	Src Homology 3 domain of Myosin VIIa and similar proteins. Myo7A is an unconventional myosin that is involved in organelle transport. It is required for sensory function in both Drosophila and mammals. Mutations in the Myo7A gene cause both syndromic deaf-blindness [Usher syndrome I (USH1)] and nonsyndromic (DFNB2 and DFNA11) deafness in humans. It contains an N-terminal motor domain, light chain-binding IQ motifs, a coiled-coil region for heavy chain dimerization, and a tail consisting of a pair of MyTH4-FERM tandems separated by an SH3 domain. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediate multiprotein complex assemblies.	64
212815	cd11882	SH3_GRAF-like	Src Homology 3 domain of GTPase Regulator Associated with Focal adhesion kinase and similar proteins. This subfamily is composed of Rho GTPase activating proteins (GAPs) with similarity to GRAF. Members contain an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, a Rho GAP domain, and a C-terminal SH3 domain. Although vertebrates harbor four Rho GAPs in the GRAF subfamily including GRAF, GRAF2, GRAF3, and Oligophrenin-1 (OPHN1), only three are included in this model. OPHN1 contains the BAR, PH and GAP domains, but not the C-terminal SH3 domain. GRAF and GRAF2 show GAP activity towards RhoA and Cdc42. GRAF influences Rho-mediated cytoskeletal rearrangements and binds focal adhesion kinase. GRAF2 regulates caspase-activated p21-activated protein kinase-2. The SH3 domain of GRAF and GRAF2 binds PKNbeta, a target of the small GTPase Rho. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediate multiprotein complex assemblies.	54
212816	cd11883	SH3_Sdc25	Src Homology 3 domain of Sdc25/Cdc25 guanine nucleotide exchange factors. This subfamily is composed of the Saccharomyces cerevisiae guanine nucleotide exchange factors (GEFs) Sdc25 and Cdc25, and similar proteins. These GEFs regulate Ras by stimulating the GDP/GTP exchange on Ras. Cdc25 is involved in the Ras/PKA pathway that plays an important role in the regulation of metabolism, stress responses, and proliferation, depending on available nutrients and conditions. Proteins in this subfamily contain an N-terminal SH3 domain as well as REM (Ras exchanger motif) and RasGEF domains at the C-terminus. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediate multiprotein complex assemblies.	55
212817	cd11884	SH3_MYO15	Src Homology 3 domain of Myosin XV. This subfamily is composed of proteins with similarity to Myosin XVa. Myosin XVa is an unconventional myosin that is critical for the normal growth of mechanosensory stereocilia of inner ear hair cells. Mutations in the myosin XVa gene are associated with nonsyndromic hearing loss. Myosin XVa contains a unique N-terminal extension followed by a motor domain, light chain-binding IQ motifs, and a tail consisting of a pair of MyTH4-FERM tandems separated by an SH3 domain, and a PDZ domain. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediate multiprotein complex assemblies.	56
212818	cd11885	SH3_SH3TC	Src Homology 3 domain of SH3 domain and tetratricopeptide repeat-containing (SH3TC) proteins and similar domains. This subfamily is composed of vertebrate SH3TC proteins and hypothetical fungal proteins containing BAR and SH3 domains. Mammals contain two SH3TC proteins, SH3TC1 and SH3TC2. The function of SH3TC1 is unknown. SH3TC2 is localized in Schwann cells in the peripheral nervous system, where it interacts with Rab11 and plays a role in peripheral nerve myelination. Mutations in SH3TC2 are associated with Charcot-Marie-Tooth disease type 4C, a severe hereditary peripheral neuropathy with symptoms that include progressive scoliosis, delayed age of walking, muscular atrophy, distal weakness, and reduced nerve conduction velocity. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediate multiprotein complex assemblies.	55
212819	cd11886	SH3_BOI	Src Homology 3 domain of fungal BOI-like proteins. This subfamily includes the Saccharomyces cerevisiae proteins BOI1 and BOI2, and similar proteins. They contain an N-terminal SH3 domain, a Sterile alpha motif (SAM), and a Pleckstrin homology (PH) domain at the C-terminus. BOI1 and BOI2 interact with the SH3 domain of Bem1p, a protein involved in bud formation. They promote polarized cell growth and participate in the NoCut signaling pathway, which is involved in the control of cytokinesis. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediate multiprotein complex assemblies.	55
212820	cd11887	SH3_Bbc1	Src Homology 3 domain of Bbc1 and similar domains. This subfamily is composed of Saccharomyces cerevisiae Bbc1p, also called Mti1p (Myosin tail region-interacting protein), and similar proteins. Bbc1p interacts with and regulates type I myosins in yeast, Myo3p and Myo5p, which are involved in actin cytoskeletal reorganization. It also binds and inhibits Las17, a WASp family protein that functions as an activator of the Arp2/3 complex. Bbc1p contains an N-terminal SH3 domain. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediate multiprotein complex assemblies.	60
212821	cd11888	SH3_ARHGAP9_like	Src Homology 3 domain of Rho GTPase-activating protein 9 and similar proteins. This subfamily is composed of Rho GTPase-activating proteins including mammalian ARHGAP9, and vertebrate ARHGAPs 12 and 27. RhoGAPs (or ARHGAPs) bind to Rho proteins and enhance the hydrolysis rates of bound GTP. ARHGAP9 functions as a GAP for Rac and Cdc42, but not for RhoA. It negatively regulates cell migration and adhesion. It also acts as a docking protein for the MAP kinases Erk2 and p38alpha, and may facilitate cross-talk between the Rho GTPase and MAPK pathways to control actin remodeling. ARHGAP27, also called CAMGAP1, shows GAP activity towards Rac1 and Cdc42. It binds the adaptor protein CIN85 and may play a role in clathrin-mediated endocytosis. ARHGAP12 has been shown to display GAP activity towards Rac1. It plays a role in regulating HGF-driven cell growth and invasiveness. ARHGAPs in this subfamily contain SH3, WW, Pleckstrin homology (PH), and RhoGAP domains. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediate multiprotein complex assemblies.	54
212822	cd11889	SH3_Cyk3p-like	Src Homology 3 domain of Cytokinesis protein 3 and similar proteins. Cytokinesis protein 3 (Cyk3 or Cyk3p) is a component of the actomyosin ring independent cytokinesis pathway in yeast. It interacts with Inn1 and facilitates its recruitment to the bud neck, thereby promoting cytokinesis. Cyk3p contains an N-terminal SH3 domain and a C-terminal transglutaminase-like domain. The Cyk3p SH3 domain binds to the C-terminal proline-rich region of Inn1. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediate multiprotein complex assemblies.	53
212823	cd11890	MIA	Melanoma Inhibitory Activity protein. MIA is a single domain protein that adopts a Src Homology 3 (SH3) domain-like fold; it contains an additional antiparallel beta sheet and two disulfide bonds compared to classical SH3 domains. MIA is secreted from malignant melanoma cells and it plays an important role in melanoma development and invasion. MIA is expressed by chondrocytes in normal tissues and may be important in the cartilage cell phenotype. Unlike classical SH3 domains, MIA does not bind proline-rich ligands. It binds peptide ligands with sequence similarity to type III human fibronectin repeats.	98
212824	cd11891	MIAL	Melanoma Inhibitory Activity-Like protein. MIAL is specifically expressed in the cochlea and the vestibule of the inner ear and may contribute to inner ear dysfunction in humans. MIAL is a member of the recently identified family that also includes MIA, MIA2, and MIA3 (also called TANGO); MIA is the most studied member of the family. MIA is a single domain protein that adopts a Src Homology 3 (SH3) domain-like fold; it contains an additional antiparallel beta sheet and two disulfide bonds compared to classical SH3 domains. MIA is secreted from malignant melanoma cells and it plays an important role in melanoma development and invasion. MIA is expressed by chondrocytes in normal tissues and may be important in the cartilage cell phenotype. Unlike classical SH3 domains, MIA does not bind proline-rich ligands.	83
212825	cd11892	SH3_MIA2	Src Homology 3 domain of Melanoma Inhibitory Activity 2 protein. MIA2 is expressed specifically in hepatocytes and its expression is controlled by hepatocyte nuclear factor 1 binding sites in the MIA2 promoter. It inhibits the growth and invasion of hepatocellular carcinomas (HCC) and may act as a tumor suppressor. A mutation in MIA2 in mice resulted in reduced cholesterol and triglycerides. Since MIA2 localizes to ER exit sites, it may function as an ER-to-Golgi trafficking protein that regulates lipid metabolism. MIA2 contains an N-terminal SH3-like domain, similar to MIA. It is a member of the recently identified family that also includes MIA, MIAL, and MIA3 (also called TANGO). MIA is a single domain protein that adopts an SH3 domain-like fold; it contains an additional antiparallel beta sheet and two disulfide bonds compared to classical SH3 domains. Unlike classical SH3 domains, MIA does not bind proline-rich ligands.	73
212826	cd11893	SH3_MIA3	Src Homology 3 domain of Melanoma Inhibitory Activity 3 protein. MIA3, also called TANGO or TANGO1, acts as a tumor suppressor of malignant melanoma. It is downregulated or lost in melanoma cell lines. Unlike other MIA family members, MIA3 is widely expressed except in hematopoietic cells. MIA3 is an ER resident transmembrane protein that is required for the loading of collagen VII into transport vesicles. SNPs in the MIA3 gene have been associated with coronary arterial disease and myocardial infarction. MIA3 contains an N-terminal SH3-like domain, similar to MIA. It is a member of the recently identified family that also includes MIA, MIAL, and MIA2. MIA is a single domain protein that adopts an SH3 domain-like fold; it contains an additional antiparallel beta sheet and two disulfide bonds compared to classical SH3 domains. Unlike classical SH3 domains, MIA does not bind proline-rich ligands.	73
212827	cd11894	SH3_FCHSD2_2	Second Src Homology 3 domain of FCH and double SH3 domains protein 2. FCHSD2 has a domain structure consisting of an N-terminal F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs), two SH3, and C-terminal proline-rich domains. It has only been characterized in silico and its function is unknown. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	56
212828	cd11895	SH3_FCHSD1_2	Second Src Homology 3 domain of FCH and double SH3 domains protein 1. FCHSD1 has a domain structure consisting of an N-terminal F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs), two SH3, and C-terminal proline-rich domains. It has only been characterized in silico and its function is unknown. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	58
212829	cd11896	SH3_SNX33	Src Homology 3 domain of Sorting Nexin 33. SNX33 interacts with Wiskott-Aldrich syndrome protein (WASP) and plays a role in the maintenance of cell shape and cell cycle progression. It modulates the shedding and endocytosis of cellular prion protein (PrP(c)) and amyloid precursor protein (APP). SNXs are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNX33 also contains BAR and SH3 domains. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212830	cd11897	SH3_SNX18	Src Homology 3 domain of Sorting nexin 18. SNX18 is localized to peripheral endosomal structures, and acts in a trafficking pathway that is clathrin-independent but relies on AP-1 and PACS1. It binds FIP5 and is required for apical lumen formation. It may also play a role in axonal elongation. SNXs are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNX18 also contains BAR and SH3 domains. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212831	cd11898	SH3_SNX9	Src Homology 3 domain of Sorting nexin 9. Sorting nexin 9 (SNX9), also known as SH3PX1, is a cytosolic protein that interacts with proteins associated with clathrin-coated pits such as Cdc-42-associated tyrosine kinase 2 (ACK2). It binds class I polyproline sequences found in dynamin 1/2 and the WASP/N-WASP actin regulators. SNX9 is localized to plasma membrane endocytic sites and acts primarily in clathrin-mediated endocytosis. Its array of interacting partners suggests that SNX9 functions at the interface between endocytosis and actin cytoskeletal organization. SNXs are Phox homology (PX) domain containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNX9 also contains BAR and SH3 domains. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	57
212832	cd11899	SH3_Nck2_1	First Src Homology 3 domain of Nck2 adaptor protein. Nck2 (also called Nckbeta or Growth factor receptor-bound protein 4, Grb4) plays a crucial role in connecting signaling pathways of tyrosine kinase receptors and important effectors in actin dynamics and cytoskeletal remodeling. It binds neuronal signaling proteins such as ephrinB and Disabled-1 (Dab-1) exclusively. Nck adaptor proteins regulate actin cytoskeleton dynamics by linking proline-rich effector molecules to protein tyrosine kinases and phosphorylated signaling intermediates. They contain three SH3 domains and a C-terminal SH2 domain. They function downstream of the PDGFbeta receptor and are involved in Rho GTPase signaling and actin dynamics. Vertebrates contain two Nck adaptor proteins: Nck1 (also called Nckalpha) and Nck2, which show partly overlapping functions but also bind distinct targets. The first SH3 domain of Nck2 binds the PxxDY sequence in the CD3e cytoplasmic tail; this binding inhibits phosphorylation by Src kinases, resulting in the downregulation of TCR surface expression. SH3 domains are protein interaction domains that usually bind to proline-rich ligands with moderate affinity and selectivity, preferentially a PxxP motif. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	58
212833	cd11900	SH3_Nck1_1	First Src Homology 3 domain of Nck1 adaptor protein. Nck1 (also called Nckalpha) plays a crucial role in connecting signaling pathways of tyrosine kinase receptors and important effectors in actin dynamics and cytoskeletal remodeling. It binds and activates RasGAP, resulting in the downregulation of Ras. It is also involved in the signaling of endothelin-mediated inhibition of cell migration. Nck adaptor proteins regulate actin cytoskeleton dynamics by linking proline-rich effector molecules to protein tyrosine kinases and phosphorylated signaling intermediates. They contain three SH3 domains and a C-terminal SH2 domain. They function downstream of the PDGFbeta receptor and are involved in Rho GTPase signaling and actin dynamics. Vertebrates contain two Nck adaptor proteins: Nck1 (also called Nckalpha) and Nck2, which show partly overlapping functions but also bind distinct targets. The first SH3 domain of Nck1 binds the PxxDY sequence in the CD3e cytoplasmic tail; this binding inhibits phosphorylation by Src kinases, resulting in the downregulation of TCR surface expression. SH3 domains are protein interaction domains that usually bind to proline-rich ligands with moderate affinity and selectivity, preferentially a PxxP motif. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	59
212834	cd11901	SH3_Nck1_2	Second Src Homology 3 domain of Nck1 adaptor protein. Nck1 (also called Nckalpha) plays a crucial role in connecting signaling pathways of tyrosine kinase receptors and important effectors in actin dynamics and cytoskeletal remodeling. It binds and activates RasGAP, resulting in the downregulation of Ras. It is also involved in the signaling of endothelin-mediated inhibition of cell migration. Nck adaptor proteins regulate actin cytoskeleton dynamics by linking proline-rich effector molecules to protein tyrosine kinases and phosphorylated signaling intermediates. They contain three SH3 domains and a C-terminal SH2 domain. They function downstream of the PDGFbeta receptor and are involved in Rho GTPase signaling and actin dynamics. Vertebrates contain two Nck adaptor proteins: Nck1 (also called Nckalpha) and Nck2, which show partly overlapping functions but also bind distinct targets. The second SH3 domain of Nck appears to prefer ligands containing the APxxPxR motif. SH3 domains are protein interaction domains that usually bind to proline-rich ligands with moderate affinity and selectivity, preferentially a PxxP motif. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212835	cd11902	SH3_Nck2_2	Second Src Homology 3 domain of Nck2 adaptor protein. Nck2 (also called Nckbeta or Growth factor receptor-bound protein 4, Grb4) plays a crucial role in connecting signaling pathways of tyrosine kinase receptors and important effectors in actin dynamics and cytoskeletal remodeling. It binds neuronal signaling proteins such as ephrinB and Disabled-1 (Dab-1) exclusively. Nck adaptor proteins regulate actin cytoskeleton dynamics by linking proline-rich effector molecules to protein tyrosine kinases and phosphorylated signaling intermediates. They contain three SH3 domains and a C-terminal SH2 domain. They function downstream of the PDGFbeta receptor and are involved in Rho GTPase signaling and actin dynamics. Vertebrates contain two Nck adaptor proteins: Nck1 (also called Nckalpha) and Nck2, which show partly overlapping functions but also bind distinct targets. The second SH3 domain of Nck appears to prefer ligands containing the APxxPxR motif. SH3 domains are protein interaction domains that usually bind to proline-rich ligands with moderate affinity and selectivity, preferentially a PxxP motif. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212836	cd11903	SH3_Nck2_3	Third Src Homology 3 domain of Nck2 adaptor protein. Nck2 (also called Nckbeta or Growth factor receptor-bound protein 4, Grb4) plays a crucial role in connecting signaling pathways of tyrosine kinase receptors and important effectors in actin dynamics and cytoskeletal remodeling. It binds neuronal signaling proteins such as ephrinB and Disabled-1 (Dab-1) exclusively. Nck adaptor proteins regulate actin cytoskeleton dynamics by linking proline-rich effector molecules to protein tyrosine kinases and phosphorylated signaling intermediates. They contain three SH3 domains and a C-terminal SH2 domain. They function downstream of the PDGFbeta receptor and are involved in Rho GTPase signaling and actin dynamics. Vertebrates contain two Nck adaptor proteins: Nck1 (also called Nckalpha) and Nck2, which show partly overlapping functions but also bind distinct targets. The third SH3 domain of Nck appears to prefer ligands with a PxAPxR motif. SH3 domains are protein interaction domains that usually bind to proline-rich ligands with moderate affinity and selectivity, preferentially a PxxP motif. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	59
212837	cd11904	SH3_Nck1_3	Third Src Homology 3 domain of Nck1 adaptor protein. Nck1 (also called Nckalpha) plays a crucial role in connecting signaling pathways of tyrosine kinase receptors and important effectors in actin dynamics and cytoskeletal remodeling. It binds and activates RasGAP, resulting in the downregulation of Ras. It is also involved in the signaling of endothelin-mediated inhibition of cell migration. Nck adaptor proteins regulate actin cytoskeleton dynamics by linking proline-rich effector molecules to protein tyrosine kinases and phosphorylated signaling intermediates. They contain three SH3 domains and a C-terminal SH2 domain. They function downstream of the PDGFbeta receptor and are involved in Rho GTPase signaling and actin dynamics. Vertebrates contain two Nck adaptor proteins: Nck1 (also called Nckalpha) and Nck2, which show partly overlapping functions but also bind distinct targets. The third SH3 domain of Nck appears to prefer ligands with a PxAPxR motif. SH3 domains are protein interaction domains that usually bind to proline-rich ligands with moderate affinity and selectivity, preferentially a PxxP motif. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	57
212838	cd11905	SH3_Tec	Src Homology 3 domain of Tec (Tyrosine kinase expressed in hepatocellular carcinoma). Tec is a cytoplasmic (or nonreceptor) tyr kinase containing Src homology protein interaction domains (SH3, SH2) N-terminal to the catalytic tyr kinase domain. It also contains an N-terminal pleckstrin homology (PH) domain, which binds the products of PI3K and allows membrane recruitment and activation, and the Tec homology (TH) domain, which contains proline-rich and zinc-binding regions. It is more widely expressed than other Tec subfamily kinases. Tec is found in endothelial cells, both B- and T-cells, and a variety of myeloid cells including mast cells, erythroid cells, platelets, macrophages and neutrophils. Tec is a key component of T-cell receptor (TCR) signaling, and is important in TCR-stimulated proliferation, IL-2 production and phospholipase C-gamma1 activation. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	56
212839	cd11906	SH3_BTK	Src Homology 3 domain of Bruton's tyrosine kinase. BTK is a cytoplasmic (or nonreceptor) tyr kinase containing Src homology protein interaction domains (SH3, SH2) N-terminal to the catalytic tyr kinase domain. It also contains an N-terminal pleckstrin homology (PH) domain, which binds the products of PI3K and allows membrane recruitment and activation, and the Tec homology (TH) domain with proline-rich and zinc-binding regions. Btk is expressed in B-cells, and a variety of myeloid cells including mast cells, platelets, neutrophils, and dendritic cells. It interacts with a variety of partners, from cytosolic proteins to nuclear transcription factors, suggesting a diversity of functions. Stimulation of a diverse array of cell surface receptors, including antigen engagement of the B-cell receptor (BCR), leads to PH-mediated membrane translocation of Btk and subsequent phosphorylation by Src kinase and activation. Btk plays an important role in the life cycle of B-cells including their development, differentiation, proliferation, survival, and apoptosis. Mutations in Btk cause the primary immunodeficiency disease, X-linked agammaglobulinaemia (XLA) in humans. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212840	cd11907	SH3_TXK	Src Homology 3 domain of TXK, also called Resting lymphocyte kinase (Rlk). TXK is a cytoplasmic (or nonreceptor) tyr kinase containing Src homology protein interaction domains (SH3, SH2) N-terminal to the catalytic tyr kinase domain. It also contains an N-terminal cysteine-rich region. Rlk is expressed in T-cells and mast cell lines, and is a key component of T-cell receptor (TCR) signaling. It is important in TCR-stimulated proliferation, IL-2 production and phospholipase C-gamma1 activation. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212841	cd11908	SH3_ITK	Src Homology 3 domain of Interleukin-2-inducible T-cell Kinase. ITK (also known as Tsk or Emt) is a cytoplasmic (or nonreceptor) tyr kinase containing Src homology protein interaction domains (SH3, SH2) N-terminal to the catalytic tyr kinase domain. It also contains an N-terminal pleckstrin homology (PH) domain, which binds the products of PI3K and allows membrane recruitment and activation, and the Tec homology (TH) domain, which contains proline-rich and zinc-binding regions. ITK is expressed in T-cells and mast cells, and is important in their development and differentiation. Of the three Tec kinases expressed in T-cells, ITK plays the predominant role in T-cell receptor (TCR) signaling. It is activated by phosphorylation upon TCR crosslinking and is involved in the pathway resulting in phospholipase C-gamma1 activation and actin polymerization. It also plays a role in the downstream signaling of the T-cell costimulatory receptor CD28, the T-cell surface receptor CD2, and the chemokine receptor CXCR4. In addition, ITK is crucial for the development of T-helper(Th)2 effector responses. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	56
212842	cd11909	SH3_PI3K_p85beta	Src Homology 3 domain of the p85beta regulatory subunit of Class IA Phosphatidylinositol 3-kinases. Class I PI3Ks convert PtdIns(4,5)P2 to the critical second messenger PtdIns(3,4,5)P3. They are heterodimers and exist in multiple isoforms consisting of one catalytic subunit (out of four isoforms) and one of several regulatory subunits. Class IA PI3Ks associate with the p85 regulatory subunit family, which contains SH3, RhoGAP, and SH2 domains. The p85 subunits recruit the PI3K p110 catalytic subunit to the membrane, where p110 phosphorylates inositol lipids. Vertebrates harbor two p85 isoforms, called alpha and beta. In addition to regulating the p110 subunit, p85beta binds CD28 and may be involved in the activation and differentiation of antigen-stimulated T cells. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	74
212843	cd11910	SH3_PI3K_p85alpha	Src Homology 3 domain of the p85alpha regulatory subunit of Class IA Phosphatidylinositol 3-kinases. Class I PI3Ks convert PtdIns(4,5)P2 to the critical second messenger PtdIns(3,4,5)P3. They are heterodimers and exist in multiple isoforms consisting of one catalytic subunit (out of four isoforms) and one of several regulatory subunits. Class IA PI3Ks associate with the p85 regulatory subunit family, which contains SH3, RhoGAP, and SH2 domains. The p85 subunits recruit the PI3K p110 catalytic subunit to the membrane, where p110 phosphorylates inositol lipids. Vertebrates harbor two p85 isoforms, called alpha and beta. In addition to regulating the p110 subunit, p85alpha interacts with activated FGFR3. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	75
212844	cd11911	SH3_CIP4-like	Src Homology 3 domain of Cdc42-Interacting Protein 4. This subfamily is composed of Cdc42-Interacting Protein 4 (CIP4), Formin Binding Protein 17 (FBP17), FormiN Binding Protein 1-Like (FNBP1L), and similar proteins. CIP4 and FNBP1L are Cdc42 effectors that bind Wiskott-Aldrich syndrome protein (WASP) and function in endocytosis. CIP4 and FBP17 bind to the Fas ligand and may be implicated in the inflammatory response. CIP4 may also play a role in phagocytosis. It functions downstream of Cdc42 in PDGF-dependent actin reorganization and cell migration, and also regulates the activity of PDGFRbeta. It uses Src as a substrate in regulating the invasiveness of breast tumor cells. CIP4 may also play a role in the pathogenesis of Huntington's disease. Members of this subfamily typically contain an N-terminal F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain, a central Cdc42-binding HR1 domain, and a C-terminal SH3 domain. The SH3 domain of CIP4 associates with Gapex-5, a Rab31 GEF. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212845	cd11912	SH3_Bzz1_1	First Src Homology 3 domain of Bzz1 and similar domains. Bzz1 (or Bzz1p) is a WASP/Las17-interacting protein involved in endocytosis and trafficking to the vacuole. It physically interacts with type I myosins and functions in the early steps of endocytosis. Together with other proteins, it induces membrane scission in yeast. Bzz1 contains an N-terminal F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs), a central coiled-coil, and two C-terminal SH3 domains. This model represents the first C-terminal SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212846	cd11913	SH3_BAIAP2L1	Src Homology 3 domain of Brain-specific Angiogenesis Inhibitor 1-Associated Protein 2-Like 1, also called Insulin Receptor Tyrosine Kinase Substrate (IRTKS). BAIAP2L1 or IRTKS is widely expressed, serves as a substrate for the insulin receptor, and binds the small GTPase Rac. It plays a role in regulating the actin cytoskeleton and colocalizes with F-actin, cortactin, VASP, and vinculin. BAIAP2L1 expression leads to the formation of short actin bundles, distinct from filopodia-like protrusions induced by the expression of the related protein IRSp53. IRTKS mediates the recruitment of effector proteins Tir and EspFu, which regulate host cell actin reorganization, to bacterial attachment sites. It contains an N-terminal IMD or Inverse-Bin/Amphiphysin/Rvs (I-BAR) domain, an SH3 domain, and a WASP homology 2 (WH2) actin-binding motif at the C-terminus. The SH3 domain of IRTKS has been shown to bind the proline-rich C-terminus of EspFu. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	58
212847	cd11914	SH3_BAIAP2L2	Src Homology 3 domain of Brain-specific Angiogenesis Inhibitor 1-Associated Protein 2-Like 2. BAIAP2L2 co-localizes with clathrin plaques but its function has not been determined. It contains an N-terminal IMD or Inverse-Bin/Amphiphysin/Rvs (I-BAR) domain, an SH3 domain, and a WASP homology 2 (WH2) actin-binding motif at the C-terminus. The related proteins, BAIAP2L1 and IRSp53, function as regulators of membrane dynamics and the actin cytoskeleton. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	59
212848	cd11915	SH3_Irsp53	Src Homology 3 domain of Insulin Receptor tyrosine kinase Substrate p53. IRSp53 is also known as BAIAP2 (Brain-specific Angiogenesis Inhibitor 1-Associated Protein 2). It is a scaffolding protein that takes part in many signaling pathways including Cdc42-induced filopodia formation, Rac-mediated lamellipodia extension, and spine morphogenesis. IRSp53 exists as multiple splicing variants that differ mainly at the C-termini. One variant (T-form) is expressed exclusively in human breast cancer cells. The gene encoding IRSp53 is a putative susceptibility gene for Gilles de la Tourette syndrome. IRSp53 can also mediate the recruitment of effector proteins Tir and EspFu, which regulate host cell actin reorganization, to bacterial attachment sites. It contains an N-terminal IMD, a CRIB (Cdc42 and Rac interactive binding motif), an SH3 domain, and a WASP homology 2 (WH2) actin-binding motif at the C-terminus. The SH3 domain of IRSp53 has been shown to bind the proline-rich C-terminus of EspFu. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	59
212849	cd11916	SH3_Sorbs1_3	Third (or C-terminal) Src Homology 3 domain of Sorbin and SH3 domain containing 1 (Sorbs1), also called ponsin. Sorbs1 is also called ponsin, SH3P12, or CAP (c-Cbl associated protein). It is an adaptor protein containing one sorbin homology (SoHo) and three SH3 domains. It binds Cbl and plays a major role in regulating the insulin signaling pathway by enhancing insulin-induced phosphorylation of Cbl. Sorbs1, like vinexin, localizes at cell-ECM and cell-cell adhesion sites where it binds vinculin, paxillin, and afadin. It may function in the control of cell motility. Other interaction partners of Sorbs1 include c-Abl, Sos, flotillin, Grb4, ataxin-7, filamin C, among others. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	59
212850	cd11917	SH3_Sorbs2_3	Third (or C-terminal) Src Homology 3 domain of Sorbin and SH3 domain containing 2 (Sorbs2), also called Arg-binding protein 2 (ArgBP2). Sorbs2 or ArgBP2 is an adaptor protein containing one sorbin homology (SoHo) and three SH3 domains. It regulates actin-dependent processes including cell adhesion, morphology, and migration. It is expressed in many tissues and is abundant in the heart. Like vinexin, it is found in focal adhesions, where it interacts with vinculin and afadin. It also localizes in epithelial cell stress fibers and in cardiac muscle cell Z-discs. Sorbs2 has been implicated in the signaling of c-Arg, Akt, and Pyk2. Other interaction partners of Sorbs2 include c-Abl, flotillin, spectrin, dynamin 1/2, synaptojanin, and PTP-PEST, among others. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	61
212851	cd11918	SH3_Vinexin_3	Third (or C-terminal) Src Homology 3 domain of Vinexin, also called Sorbin and SH3 domain containing 3 (Sorbs3). Vinexin is also called Sorbs3, SH3P3, and SH3-containing adapter molecule 1 (SCAM-1). It is an adaptor protein containing one sorbin homology (SoHo) and three SH3 domains. Vinexin was first identified as a vinculin binding protein; it is co-localized with vinculin at cell-ECM and cell-cell adhesion sites. There are several splice variants of vinexin: alpha, which contains the SoHo and three SH3 domains and displays tissue-specific expression; and beta, which contains only the three SH3 domains and is widely expressed. Vinexin alpha stimulates the accumulation of F-actin at focal contact sites. Vinexin also promotes keratinocyte migration and wound healing. The SH3 domains of vinexin have been reported to bind a number of ligands including vinculin, WAVE2, DLG5, Abl, and Cbl. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	58
212852	cd11919	SH3_Sorbs1_1	First Src Homology 3 domain of Sorbin and SH3 domain containing 1 (Sorbs1), also called ponsin. Sorbs1 is also called ponsin, SH3P12, or CAP (c-Cbl associated protein). It is an adaptor protein containing one sorbin homology (SoHo) and three SH3 domains. It binds Cbl and plays a major role in regulating the insulin signaling pathway by enhancing insulin-induced phosphorylation of Cbl. Sorbs1, like vinexin, localizes at cell-ECM and cell-cell adhesion sites where it binds vinculin, paxillin, and afadin. It may function in the control of cell motility. Other interaction partners of Sorbs1 include c-Abl, Sos, flotillin, Grb4, ataxin-7, filamin C, among others. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212853	cd11920	SH3_Sorbs2_1	First Src Homology 3 domain of Sorbin and SH3 domain containing 2 (Sorbs2), also called Arg-binding protein 2 (ArgBP2). Sorbs2 or ArgBP2 is an adaptor protein containing one sorbin homology (SoHo) and three SH3 domains. It regulates actin-dependent processes including cell adhesion, morphology, and migration. It is expressed in many tissues and is abundant in the heart. Like vinexin, it is found in focal adhesions, where it interacts with vinculin and afadin. It also localizes in epithelial cell stress fibers and in cardiac muscle cell Z-discs. Sorbs2 has been implicated in the signaling of c-Arg, Akt, and Pyk2. Other interaction partners of Sorbs2 include c-Abl, flotillin, spectrin, dynamin 1/2, synaptojanin, and PTP-PEST, among others. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212854	cd11921	SH3_Vinexin_1	First Src Homology 3 domain of Vinexin, also called Sorbin and SH3 domain containing 3 (Sorbs3). Vinexin is also called Sorbs3, SH3P3, and SH3-containing adapter molecule 1 (SCAM-1). It is an adaptor protein containing one sorbin homology (SoHo) and three SH3 domains. Vinexin was first identified as a vinculin binding protein; it is co-localized with vinculin at cell-ECM and cell-cell adhesion sites. There are several splice variants of vinexin: alpha, which contains the SoHo and three SH3 domains and displays tissue-specific expression; and beta, which contains only the three SH3 domains and is widely expressed. Vinexin alpha stimulates the accumulation of F-actin at focal contact sites. Vinexin also promotes keratinocyte migration and wound healing. The SH3 domains of vinexin have been reported to bind a number of ligands including vinculin, WAVE2, DLG5, Abl, and Cbl. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212855	cd11922	SH3_Sorbs1_2	Second Src Homology 3 domain of Sorbin and SH3 domain containing 1 (Sorbs1), also called ponsin. Sorbs1 is also called ponsin, SH3P12, or CAP (c-Cbl associated protein). It is an adaptor protein containing one sorbin homology (SoHo) and three SH3 domains. It binds Cbl and plays a major role in regulating the insulin signaling pathway by enhancing insulin-induced phosphorylation of Cbl. Sorbs1, like vinexin, localizes at cell-ECM and cell-cell adhesion sites where it binds vinculin, paxillin, and afadin. It may function in the control of cell motility. Other interaction partners of Sorbs1 include c-Abl, Sos, flotillin, Grb4, ataxin-7, filamin C, among others. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	58
212856	cd11923	SH3_Sorbs2_2	Second Src Homology 3 domain of Sorbin and SH3 domain containing 2 (Sorbs2), also called Arg-binding protein 2 (ArgBP2). Sorbs2 or ArgBP2 is an adaptor protein containing one sorbin homology (SoHo) and three SH3 domains. It regulates actin-dependent processes including cell adhesion, morphology, and migration. It is expressed in many tissues and is abundant in the heart. Like vinexin, it is found in focal adhesion where it interacts with vinculin and afadin. It also localizes in epithelial cell stress fibers and in cardiac muscle cell Z-discs. Sorbs2 has been implicated to play roles in the signaling of c-Arg, Akt, and Pyk2. Other interaction partners of Sorbs2 include c-Abl, flotillin, spectrin, dynamin 1/2, synaptojanin, PTP-PEST, among others. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	57
212857	cd11924	SH3_Vinexin_2	Second Src Homology 3 domain of Vinexin, also called Sorbin and SH3 domain containing 3 (Sorbs3). Vinexin is also called Sorbs3, SH3P3, and SH3-containing adapter molecule 1 (SCAM-1). It is an adaptor protein containing one sorbin homology (SoHo) and three SH3 domains. Vinexin was first identified as a vinculin binding protein; it is co-localized with vinculin at cell-ECM and cell-cell adhesion sites. There are several splice variants of vinexin: alpha, which contains the SoHo and three SH3 domains and displays tissue-specific expression; and beta, which contains only the three SH3 domains and is widely expressed. Vinexin alpha stimulates the accumulation of F-actin at focal contact sites. Vinexin also promotes keratinocyte migration and wound healing. The SH3 domains of vinexin have been reported to bind a number of ligands including vinculin, WAVE2, DLG5, Abl, and Cbl. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	56
212858	cd11925	SH3_SH3RF3_3	Third Src Homology 3 domain of SH3 domain containing ring finger 3, an E3 ubiquitin-protein ligase. SH3RF3 is also called POSH2 (Plenty of SH3s 2) or SH3MD4 (SH3 multiple domains protein 4). It is a scaffold protein with E3 ubiquitin-protein ligase activity. It was identified in the screen for interacting partners of p21-activated kinase 2 (PAK2). It may play a role in regulating JNK mediated apoptosis in certain conditions. It also interacts with GTP-loaded Rac1. SH3RF3 is highly homologous to SH3RF1; it also contains an N-terminal RING finger domain and four SH3 domains. This model represents the third SH3 domain, located in the middle, of SH3RF3. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	57
212859	cd11926	SH3_SH3RF1_3	Third Src Homology 3 domain of SH3 domain containing ring finger 1, an E3 ubiquitin-protein ligase. SH3RF1 is also called POSH (Plenty of SH3s) or SH3MD2 (SH3 multiple domains protein 2). It is a scaffold protein that acts as an E3 ubiquitin-protein ligase. It plays a role in calcium homeostasis through the control of the ubiquitin domain protein Herp. It may also have a role in regulating death receptor mediated and JNK mediated apoptosis. SH3RF1 also enhances the ubiquitination of ROMK1 potassium channel resulting in its increased endocytosis. It contains an N-terminal RING finger domain and four SH3 domains. This model represents the third SH3 domain, located in the middle, of SH3RF1. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212860	cd11927	SH3_SH3RF1_1	First Src Homology 3 domain of SH3 domain containing ring finger protein 1, an E3 ubiquitin-protein ligase. SH3RF1 is also called POSH (Plenty of SH3s) or SH3MD2 (SH3 multiple domains protein 2). It is a scaffold protein that acts as an E3 ubiquitin-protein ligase. It plays a role in calcium homeostasis through the control of the ubiquitin domain protein Herp. It may also have a role in regulating death receptor mediated and JNK mediated apoptosis. SH3RF1 also enhances the ubiquitination of ROMK1 potassium channel resulting in its increased endocytosis. It contains an N-terminal RING finger domain and four SH3 domains. This model represents the first SH3 domain, located at the N-terminal half, of SH3RF1. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	54
212861	cd11928	SH3_SH3RF3_1	First Src Homology 3 domain of SH3 domain containing ring finger 3, an E3 ubiquitin-protein ligase. SH3RF3 is also called POSH2 (Plenty of SH3s 2) or SH3MD4 (SH3 multiple domains protein 4). It is a scaffold protein with E3 ubiquitin-protein ligase activity. It was identified in the screen for interacting partners of p21-activated kinase 2 (PAK2). It may play a role in regulating JNK mediated apoptosis in certain conditions. It also interacts with GTP-loaded Rac1. SH3RF3 is highly homologous to SH3RF1; it also contains an N-terminal RING finger domain and four SH3 domains. This model represents the first SH3 domain, located at the N-terminal half, of SH3RF3. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	54
212862	cd11929	SH3_SH3RF2_1	First Src Homology 3 domain of SH3 domain containing ring finger 2. SH3RF2 is also called POSHER (POSH-eliminating RING protein) or HEPP1 (heart protein phosphatase 1-binding protein). It acts as an anti-apoptotic regulator of the JNK pathway by binding to and promoting the degradation of SH3RF1 (or POSH), a scaffold protein that is required for pro-apoptotic JNK activation. It may also play a role in cardiac functions together with protein phosphatase 1. SH3RF2 contains an N-terminal RING finger domain and three SH3 domains. This model represents the first SH3 domain, located at the N-terminal half, of SH3RF2. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	54
212863	cd11930	SH3_SH3RF1_2	Second Src Homology 3 domain of SH3 domain containing ring finger protein 1, an E3 ubiquitin-protein ligase. SH3RF1 is also called POSH (Plenty of SH3s) or SH3MD2 (SH3 multiple domains protein 2). It is a scaffold protein that acts as an E3 ubiquitin-protein ligase. It plays a role in calcium homeostasis through the control of the ubiquitin domain protein Herp. It may also have a role in regulating death receptor mediated and JNK mediated apoptosis. SH3RF1 also enhances the ubiquitination of ROMK1 potassium channel resulting in its increased endocytosis. It contains an N-terminal RING finger domain and four SH3 domains. This model represents the second SH3 domain, located C-terminal of the first SH3 domain at the N-terminal half, of SH3RF1. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212864	cd11931	SH3_SH3RF3_2	Second Src Homology 3 domain of SH3 domain containing ring finger 3, an E3 ubiquitin-protein ligase. SH3RF3 is also called POSH2 (Plenty of SH3s 2) or SH3MD4 (SH3 multiple domains protein 4). It is a scaffold protein with E3 ubiquitin-protein ligase activity. It was identified in the screen for interacting partners of p21-activated kinase 2 (PAK2). It may play a role in regulating JNK mediated apoptosis in certain conditions. It also interacts with GTP-loaded Rac1. SH3RF3 is highly homologous to SH3RF1; it also contains an N-terminal RING finger domain and four SH3 domains. This model represents the second SH3 domain, located C-terminal of the first SH3 domain at the N-terminal half, of SH3RF3. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212865	cd11932	SH3_SH3RF2_2	Second Src Homology 3 domain of SH3 domain containing ring finger 2. SH3RF2 is also called POSHER (POSH-eliminating RING protein) or HEPP1 (heart protein phosphatase 1-binding protein). It acts as an anti-apoptotic regulator of the JNK pathway by binding to and promoting the degradation of SH3RF1 (or POSH), a scaffold protein that is required for pro-apoptotic JNK activation. It may also play a role in cardiac functions together with protein phosphatase 1. SH3RF2 contains an N-terminal RING finger domain and three SH3 domains. This model represents the second SH3 domain, located C-terminal of the first SH3 domain at the N-terminal half, of SH3RF2. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	57
212866	cd11933	SH3_Nebulin_C	C-terminal Src Homology 3 domain of Nebulin. Nebulin is a giant filamentous protein (600-900 kD) that is expressed abundantly in skeletal muscle. It binds to actin thin filaments and regulates its assembly and function. Nebulin was thought to be part of a molecular ruler complex that is critical in determining the lengths of actin thin filaments in skeletal muscle since its length, which varies due to alternative splicing, correlates with the length of thin filaments in various muscle types. Recent studies indicate that nebulin regulates thin filament length by stabilizing the filaments and preventing depolymerization. Mutations in nebulin can cause nemaline myopathy, characterized by muscle weakness which can be severe and can lead to neonatal lethality. Nebulin contains an N-terminal LIM domain, many nebulin repeats/super repeats, and a C-terminal SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	58
212867	cd11934	SH3_Lasp1_C	C-terminal Src Homology 3 domain of LIM and SH3 domain protein 1. Lasp1 is a cytoplasmic protein that binds focal adhesion proteins and is involved in cell signaling, migration, and proliferation. It is overexpressed in several cancer cells including breast, ovarian, bladder, and liver. In cancer cells, it can be found in the nucleus; its degree of nuclear localization correlates with tumor size and poor prognosis. Lasp1 is a 36kD protein containing an N-terminal LIM domain, two nebulin repeats, and a C-terminal SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	59
212868	cd11935	SH3_Nebulette_C	C-terminal Src Homology 3 domain of Nebulette and LIM-nebulette (or Lasp2). Nebulette is a cardiac-specific protein that localizes to the Z-disc. It interacts with tropomyosin and is important in stabilizing actin thin filaments in cardiac muscles. Polymorphisms in the nebulette gene are associated with dilated cardiomyopathy, with some mutations resulting in severe heart failure. Nebulette is a 107kD protein that contains an N-terminal acidic region, multiple nebulin repeats, and a C-terminal SH3 domain. LIM-nebulette, also called Lasp2 (LIM and SH3 domain protein 2), is an alternatively spliced variant of nebulette. Although it shares a gene with nebulette, Lasp2 is not transcribed from a muscle-specific promoter, giving rise to its multiple tissue expression pattern with highest amounts in the brain. It can crosslink actin filaments and it affects cell spreading. Lasp2 is a 34kD protein containing an N-terminal LIM domain, three nebulin repeats, and a C-terminal SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	58
212869	cd11936	SH3_UBASH3B	Src homology 3 domain of Ubiquitin-associated and SH3 domain-containing protein B. UBASH3B, also called Suppressor of T cell receptor Signaling (STS)-1 or T cell Ubiquitin LigAnd (TULA)-2, is an active phosphatase that is expressed ubiquitously. The phosphatase activity of UBASH3B is essential for its roles in the suppression of TCR signaling and the regulation of EGFR. It also interacts with Syk and functions as a negative regulator of platelet glycoprotein VI signaling. TULA proteins contain an N-terminal UBA domain, a central SH3 domain, and a C-terminal histidine phosphatase domain. They bind c-Cbl through the SH3 domain and to ubiquitin via UBA. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	62
212870	cd11937	SH3_UBASH3A	Src homology 3 domain of Ubiquitin-associated and SH3 domain-containing protein A. UBASH3A is also called Cbl-Interacting Protein 4 (CLIP4), T cell Ubiquitin LigAnd (TULA), or Suppressor of T cell receptor Signaling (STS)-2. It is only found in lymphoid cells and exhibits weak phosphatase activity. UBASH3A facilitates T cell-induced apoptosis through interaction with the apoptosis-inducing factor AIF. It is involved in regulating the level of phosphorylation of the zeta-associated protein (ZAP)-70 tyrosine kinase. TULA proteins contain an N-terminal UBA domain, a central SH3 domain, and a C-terminal histidine phosphatase domain. They bind c-Cbl through the SH3 domain and to ubiquitin via UBA. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	60
212871	cd11938	SH3_ARHGEF16_26	Src homology 3 domain of the Rho guanine nucleotide exchange factors ARHGEF16 and ARHGEF26. ARHGEF16, also called ephexin-4, acts as a GEF for RhoG, activating it by exchanging bound GDP for free GTP. RhoG is a small GTPase that is a crucial regulator of Rac in migrating cells. ARHGEF16 interacts directly with the ephrin receptor EphA2 and mediates cell migration and invasion in breast cancer cells by activating RhoG. ARHGEF26, also called SGEF (SH3 domain-containing guanine exchange factor), also activates RhoG. It is highly expressed in liver and may play a role in regulating membrane dynamics. ARHGEF16 and ARHGEF26 contain RhoGEF (also called Dbl-homologous or DH), Pleckstrin Homology (PH), and SH3 domains. The SH3 domains of ARHGEFs play an autoinhibitory role through intramolecular interactions with a proline-rich region N-terminal to the DH domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212872	cd11939	SH3_ephexin1	Src homology 3 domain of the Rho guanine nucleotide exchange factor, ephexin-1 (also called NGEF or ARHGEF27). Ephexin-1, also called NGEF (neuronal GEF) or ARHGEF27, activates RhoA, Rac1, and Cdc42 by exchanging bound GDP for free GTP. It is expressed mainly in the brain in a region associated with movement control. It regulates the stability of postsynaptic acetylcholine receptor (AChR) clusters and thus, plays a critical role in the maturation and neurotransmission of neuromuscular junctions. Ephexin-1 directly interacts with the ephrin receptor EphA4 and their coexpression enhances the ability of ephexin-1 to activate RhoA. It is required for normal axon growth and EphA-induced growth cone collapse. Ephexin-1 contains RhoGEF (also called Dbl-homologous or DH), Pleckstrin Homology (PH), and SH3 domains. The SH3 domains of ARHGEFs play an autoinhibitory role through intramolecular interactions with a proline-rich region N-terminal to the DH domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212873	cd11940	SH3_ARHGEF5_19	Src homology 3 domain of the Rho guanine nucleotide exchange factors ARHGEF5 and ARHGEF19. ARHGEF5, also called ephexin-3 or TIM (Transforming immortalized mammary oncogene), is a potent activator of RhoA and it plays roles in regulating cell shape, adhesion, and migration. It binds to the SH3 domain of Src and is involved in regulating Src-induced podosome formation. ARHGEF19, also called ephexin-2 or WGEF (weak-similarity GEF), is highly expressed in the intestine, liver, heart and kidney. It activates RhoA, Cdc42, and Rac1, and has been shown to activate RhoA in the Wnt-PCP (planar cell polarity) pathway. It is involved in the regulation of cell polarity and cytoskeletal reorganization. ARHGEF5 and ARHGEF19 contain RhoGEF (also called Dbl-homologous or DH), Pleckstrin Homology (PH), and SH3 domains. The SH3 domains of ARHGEFs play an autoinhibitory role through intramolecular interactions with a proline-rich region N-terminal to the DH domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212874	cd11941	SH3_ARHGEF37_C2	Second C-terminal Src homology 3 domain of Rho guanine nucleotide exchange factor 37. ARHGEF37 contains a RhoGEF [or Dbl homology (DH)] domain followed by a Bin/Amphiphysin/Rvs (BAR) domain, and two C-terminal SH3 domains. Its specific function is unknown. Its domain architecture is similar to the C-terminal half of DNMBP or Tuba, a cdc42-specific GEF that provides a functional link between dynamin, Rho GTPase signaling, and actin dynamics, and plays an important role in regulating cell junction configuration. GEFs activate small GTPases by exchanging bound GDP for free GTP. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	57
212875	cd11942	SH3_JIP2	Src homology 3 domain of JNK-interacting protein 2. JNK-interacting protein 2 (JIP2) is also called Mitogen-activated protein kinase 8-interacting protein 2 (MAPK8IP2) or Islet-brain-2 (IB2). It is widely expressed in the brain, where it forms complexes with fibroblast growth factor homologous factors (FHFs), which facilitates activation of the p38delta MAPK. JIP2 is enriched in postsynaptic densities and may play a role in motor and cognitive function. In addition to a JNK binding domain, JIP2 also contains SH3 and Phosphotyrosine-binding (PTB) domains. The SH3 domain of the related protein JIP1 homodimerizes at the interface usually involved in proline-rich ligand recognition, despite the lack of this motif in the domain itself. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212876	cd11943	SH3_JIP1	Src homology 3 domain of JNK-interacting protein 1. JNK-interacting protein 1 (JIP1) is also called Islet-brain 1 (IB1) or Mitogen-activated protein kinase 8-interacting protein 1 (MAPK8IP1). It is highly expressed in neurons, where it functions as an adaptor linking motor to cargo during axonal transport. It also affects microtubule dynamics in neurons. JIP1 is also found in pancreatic beta-cells, where it is involved in regulating insulin secretion. In addition to a JNK binding domain, JIP1 also contains SH3 and Phosphotyrosine-binding (PTB) domains. Its SH3 domain homodimerizes at the interface usually involved in proline-rich ligand recognition, despite the lack of this motif in the domain itself. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212877	cd11944	SH3_Endophilin_B2	Src homology 3 domain of Endophilin-B2. Endophilin-B2, also called SH3GLB2 (SH3-domain GRB2-like endophilin B2), is a cytoplasmic protein that interacts with the apoptosis inducer Bax. It is overexpressed in prostate cancer metastasis and has been identified as a cancer antigen with potential utility in immunotherapy. Endophilins play roles in synaptic vesicle formation, virus budding, mitochondrial morphology maintenance, receptor-mediated endocytosis inhibition, and endosomal sorting. They contain an N-terminal N-BAR domain (BAR domain with an additional N-terminal amphipathic helix), followed by a variable region containing proline clusters, and a C-terminal SH3 domain. Endophilin-B2 forms homo- and heterodimers (with endophilin-B1) through its BAR domain. The related protein endophilin-B1 interacts with amphiphysin 1 and dynamin 1 through its SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212878	cd11945	SH3_Endophilin_B1	Src homology 3 domain of Endophilin-B1. Endophilin-B1, also called Bax-interacting factor 1 (Bif-1) or SH3GLB1 (SH3-domain GRB2-like endophilin B1), is localized mainly to the Golgi apparatus. It is involved in the regulation of many biological events including autophagy, tumorigenesis, nerve growth factor (NGF) trafficking, neurite outgrowth, mitochondrial outer membrane dynamics, and cell death. Endophilins play roles in synaptic vesicle formation, virus budding, mitochondrial morphology maintenance, receptor-mediated endocytosis inhibition, and endosomal sorting. They contain an N-terminal N-BAR domain (BAR domain with an additional N-terminal amphipathic helix), followed by a variable region containing proline clusters, and a C-terminal SH3 domain. Endophilin-B1 forms homo- and heterodimers (with endophilin-B2) through its BAR domain. It interacts with amphiphysin 1 and dynamin 1 through its SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	61
212879	cd11946	SH3_GRB2_N	N-terminal Src homology 3 domain of Growth factor receptor-bound protein 2. GRB2 is a critical signaling molecule that regulates the Ras pathway by linking tyrosine kinases to the Ras guanine nucleotide releasing protein Sos (son of sevenless), which converts Ras to the active GTP-bound state. It is ubiquitously expressed in all tissues throughout development and is important in cell cycle progression, motility, morphogenesis, and angiogenesis. In lymphocytes, GRB2 is associated with antigen receptor signaling components. GRB2 contains an N-terminal SH3 domain, a central SH2 domain, and a C-terminal SH3 domain. Its N-terminal SH3 domain binds to Sos and Sos-derived proline-rich peptides. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	56
212880	cd11947	SH3_GRAP2_N	N-terminal Src homology 3 domain of GRB2-related adaptor protein 2. GRAP2 is also called GADS (GRB2-related adapter downstream of Shc), GrpL, GRB2L, Mona, or GRID (Grb2-related protein with insert domain). It is expressed specifically in the hematopoietic system. It plays an important role in T cell receptor (TCR) signaling by promoting the formation of the SLP-76:LAT complex, which couples the TCR to the Ras pathway. It also has roles in antigen-receptor and tyrosine kinase mediated signaling. GRAP2 is unique from other GRB2-like adaptor proteins in that it can be regulated by caspase cleavage. It contains an N-terminal SH3 domain, a central SH2 domain, and a C-terminal SH3 domain. The N-terminal SH3 domain of the related protein GRB2 binds to Sos and Sos-derived proline-rich peptides. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	52
212881	cd11948	SH3_GRAP_N	N-terminal Src homology 3 domain of GRB2-related adaptor protein. GRAP is a GRB-2 like adaptor protein that is highly expressed in lymphoid tissues. It acts as a negative regulator of T cell receptor (TCR)-induced lymphocyte proliferation by downregulating the signaling to the Ras/ERK pathway. It has been identified as a regulator of TGFbeta signaling in diabetic kidney tubules and may have a role in the pathogenesis of the disease. GRAP contains an N-terminal SH3 domain, a central SH2 domain, and a C-terminal SH3 domain. The N-terminal SH3 domain of the related protein GRB2 binds to Sos and Sos-derived proline-rich peptides. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	54
212882	cd11949	SH3_GRB2_C	C-terminal Src homology 3 domain of Growth factor receptor-bound protein 2. GRB2 is a critical signaling molecule that regulates the Ras pathway by linking tyrosine kinases to the Ras guanine nucleotide releasing protein Sos (son of sevenless), which converts Ras to the active GTP-bound state. It is ubiquitously expressed in all tissues throughout development and is important in cell cycle progression, motility, morphogenesis, and angiogenesis. In lymphocytes, GRB2 is associated with antigen receptor signaling components. GRB2 contains an N-terminal SH3 domain, a central SH2 domain, and a C-terminal SH3 domain. The C-terminal SH3 domain of GRB2 binds to Gab2 (Grb2-associated binder 2) through epitopes containing RxxK motifs, as well as to the proline-rich C-terminus of FGFR2. SH3 domains are protein interaction domains that typically bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212883	cd11950	SH3_GRAP2_C	C-terminal Src homology 3 domain of GRB2-related adaptor protein 2. GRAP2 is also called GADS (GRB2-related adapter downstream of Shc), GrpL, GRB2L, Mona, or GRID (Grb2-related protein with insert domain). It is expressed specifically in the hematopoietic system. It plays an important role in T cell receptor (TCR) signaling by promoting the formation of the SLP-76:LAT complex, which couples the TCR to the Ras pathway. It also has roles in antigen-receptor and tyrosine kinase mediated signaling. GRAP2 is unique from other GRB2-like adaptor proteins in that it can be regulated by caspase cleavage. It contains an N-terminal SH3 domain, a central SH2 domain, and a C-terminal SH3 domain. The C-terminal SH3 domain of GRAP2 binds to different motifs found in substrate peptides including the typical PxxP motif in hematopoietic progenitor kinase 1 (HPK1), the RxxK motif in SLP-76 and HPK1, and the RxxxxK motif in phosphatase-like protein HD-PTP. SH3 domains are protein interaction domains that typically bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212884	cd11951	SH3_GRAP_C	C-terminal Src homology 3 domain of GRB2-related adaptor protein. GRAP is a GRB-2 like adaptor protein that is highly expressed in lymphoid tissues. It acts as a negative regulator of T cell receptor (TCR)-induced lymphocyte proliferation by downregulating the signaling to the Ras/ERK pathway. It has been identified as a regulator of TGFbeta signaling in diabetic kidney tubules and may have a role in the pathogenesis of the disease. GRAP contains an N-terminal SH3 domain, a central SH2 domain, and a C-terminal SH3 domain. The C-terminal SH3 domains (SH3c) of the related proteins, GRB2 and GRAP2, have been shown to bind to classical PxxP motif ligands, as well as to non-classical motifs. GRB2 SH3c binds Gab2 (Grb2-associated binder 2) through epitopes containing RxxK motifs, while the SH3c of GRAP2 binds to the phosphatase-like protein HD-PTP via a RxxxxK motif. SH3 domains are protein interaction domains that typically bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212885	cd11952	SH3_iASPP	Src Homology 3 (SH3) domain of Inhibitor of ASPP protein (iASPP). iASPP, also called RelA-associated inhibitor (RAI), is an oncoprotein that inhibits the apoptotic transactivation potential of p53. It is upregulated in human breast cancers expressing wild-type p53, in acute leukemias regardless of the p53 mutation status, as well as in ovarian cancer where it is associated with poor patient outcome and chemoresistance. iASPP is also a binding partner and negative regulator of p65RelA, which promotes cell proliferation and inhibits apoptosis; p65RelA has the opposite effect on cell growth compared to the p53 family. It contains a proline-rich region, four ankyrin (ANK) repeats, and an SH3 domain at its C-terminal half. The SH3 domain and the ANK repeats of iASPP contribute to the p53 binding site; they bind to the DNA binding domain of p53. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	56
212886	cd11953	SH3_ASPP2	Src Homology 3 (SH3) domain of Apoptosis Stimulating of p53 protein 2. ASPP2 is the full length form of the previously-identified tumor suppressor, p53-binding protein 2 (p53BP2). ASPP2 activates the apoptotic function of the p53 family of tumor suppressors (p53, p63, and p73). It plays a central role in regulating apoptosis and cell growth; ASPP2-deficient mice show postnatal death. Downregulated expression of ASPP2 is frequently found in breast tumors, lung cancer, and diffuse large B-cell lymphoma where it is correlated with a poor clinical outcome. ASPP2 contains a proline-rich region, four ankyrin (ANK) repeats, and an SH3 domain at its C-terminal half. The SH3 domain and the ANK repeats of ASPP2 contribute to the p53 binding site; they bind to the DNA binding domain of p53. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	57
212887	cd11954	SH3_ASPP1	Src Homology 3 domain of Apoptosis Stimulating of p53 protein 1. ASPP1, like ASPP2, activates the apoptotic function of the p53 family of tumor suppressors (p53, p63, and p73). In addition, it functions in the cytoplasm to regulate the nuclear localization of the transcriptional cofactors YAP and TAZ by inhibiting their phosphorylation; YAP and TAZ are important regulators of cell expansion, differentiation, migration, and invasion. ASPP1 is downregulated in breast tumors expressing wild-type p53. It contains a proline-rich region, four ankyrin (ANK) repeats, and an SH3 domain at its C-terminal half. The SH3 domain and the ANK repeats of ASPP1 contribute to the p53 binding site; they bind to the DNA binding domain of p53. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	57
212888	cd11955	SH3_srGAP1-3	Src homology 3 domain of Slit-Robo GTPase Activating Proteins 1, 2, and 3. srGAP1, also called Rho GTPase-Activating Protein 13 (ARHGAP13), is a Cdc42- and RhoA-specific GAP and is expressed later in the development of central nervous system tissues. srGAP2 is expressed in zones of neuronal differentiation. It plays a role in the regeneration of neurons and axons. srGAP3, also called MEGAP (MEntal disorder associated GTPase-Activating Protein), is a Rho GAP with activity towards Rac1 and Cdc42. It impacts cell migration by regulating actin and microtubule cytoskeletal dynamics. The association between srGAP3 haploinsufficiency and mental retardation is under debate. srGAPs are Rho GAPs that interact with Robo1, the transmembrane receptor of Slit proteins. Slit proteins are secreted proteins that control axon guidance and the migration of neurons and leukocytes. srGAPs contain an N-terminal F-BAR domain, a Rho GAP domain, and a C-terminal SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212889	cd11956	SH3_srGAP4	Src homology 3 domain of Slit-Robo GTPase Activating Protein 4. srGAP4, also called ARHGAP4, is highly expressed in hematopoietic cells and may play a role in lymphocyte differentiation. It is able to stimulate the GTPase activity of Rac1, Cdc42, and RhoA. In the nervous system, srGAP4 has been detected in differentiating neurites and may be involved in axon and dendritic growth. srGAPs are Rho GAPs that interact with Robo1, the transmembrane receptor of Slit proteins. Slit proteins are secreted proteins that control axon guidance and the migration of neurons and leukocytes. srGAPs contain an N-terminal F-BAR domain, a Rho GAP domain, and a C-terminal SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212890	cd11957	SH3_RUSC2	Src homology 3 domain of RUN and SH3 domain-containing protein 2. RUSC2, also called Iporin or Interacting protein of Rab1, is expressed ubiquitously with highest amounts in the brain and testis. It interacts with the small GTPase Rab1 and the Golgi matrix protein GM130, and may function in linking GTPases to certain intracellular signaling pathways. RUSC proteins are adaptor proteins consisting of RUN, leucine zipper, and SH3 domains. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	52
212891	cd11958	SH3_RUSC1	Src homology 3 domain of RUN and SH3 domain-containing protein 1. RUSC1, also called NESCA (New molecule containing SH3 at the carboxy-terminus), is highly expressed in the brain and is translocated to the nuclear membrane from the cytoplasm upon stimulation with neurotrophin. It plays a role in facilitating neurotrophin-dependent neurite outgrowth. It also interacts with NEMO (or IKKgamma) and may function in NEMO-mediated activation of NF-kB. RUSC proteins are adaptor proteins consisting of RUN, leucine zipper, and SH3 domains. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	51
212892	cd11959	SH3_Cortactin	Src homology 3 domain of Cortactin. Cortactin was originally identified as a substrate of Src kinase. It is an actin regulatory protein that binds to the Arp2/3 complex and stabilizes branched actin filaments. It is involved in cellular processes that affect cell motility, adhesion, migration, endocytosis, and invasion. It is expressed ubiquitously except in hematopoietic cells, where the homolog hematopoietic lineage cell-specific 1 (HS1) is expressed instead. Cortactin contains an N-terminal acidic domain, several copies of a repeat domain found in cortactin and HS1, a proline-rich region, and a C-terminal SH3 domain. The N-terminal region interacts with the Arp2/3 complex and F-actin, and is crucial in regulating branched actin assembly. Cortactin also serves as a scaffold and provides a bridge to the actin cytoskeleton for membrane trafficking and signaling proteins that bind to its SH3 domain. Binding partners for the SH3 domain of cortactin include dynamin2, N-WASp, MIM, FGD1, among others. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212893	cd11960	SH3_Abp1_eu	Src homology 3 domain of eumetazoan Actin-binding protein 1. Abp1, also called drebrin-like protein, is an adaptor protein that functions in receptor-mediated endocytosis and vesicle trafficking. It contains an N-terminal actin-binding module, the actin-depolymerizing factor (ADF) homology domain, a helical domain, and a C-terminal SH3 domain. Mammalian Abp1, unlike yeast Abp1, does not contain an acidic domain that interacts with the Arp2/3 complex. It regulates actin dynamics indirectly by interacting with dynamin and WASP family proteins. Abp1 deficiency causes abnormal organ structure and function of the spleen, heart, and lung of mice. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	54
212894	cd11961	SH3_Abp1_fungi_C2	Second C-terminal Src homology 3 domain of Fungal Actin-binding protein 1. Abp1 is an adaptor protein that functions in receptor-mediated endocytosis and vesicle trafficking. It contains an N-terminal actin-binding module, the actin-depolymerizing factor (ADF) homology domain, a central proline-rich region, and a C-terminal SH3 domain (many yeast Abp1 proteins contain two C-terminal SH3 domains). Yeast Abp1 also contains two acidic domains that bind directly to the Arp2/3 complex, which is required to initiate actin polymerization. The SH3 domain of yeast Abp1 binds and localizes the kinases, Ark1p and Prk1p, which facilitate actin patch disassembly following vesicle internalization. It also mediates the localization to the actin patch of the synaptojanin-like protein, Sjl2p, which plays a key role in endocytosis. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212895	cd11962	SH3_Abp1_fungi_C1	First C-terminal Src homology 3 domain of Fungal Actin-binding protein 1. Abp1 is an adaptor protein that functions in receptor-mediated endocytosis and vesicle trafficking. It contains an N-terminal actin-binding module, the actin-depolymerizing factor (ADF) homology domain, a central proline-rich region, and a C-terminal SH3 domain (many yeast Abp1 proteins contain two C-terminal SH3 domains). Yeast Abp1 also contains two acidic domains that bind directly to the Arp2/3 complex, which is required to initiate actin polymerization. The SH3 domain of yeast Abp1 binds and localizes the kinases, Ark1p and Prk1p, which facilitate actin patch disassembly following vesicle internalization. It also mediates the localization to the actin patch of the synaptojanin-like protein, Sjl2p, which plays a key role in endocytosis. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	54
212896	cd11963	SH3_STAM2	Src homology 3 domain of Signal Transducing Adaptor Molecule 2. STAM2, also called EAST (Epidermal growth factor receptor-associated protein with SH3 and TAM domain) or Hbp (Hrs binding protein), is part of the endosomal sorting complex required for transport (ESCRT-0). It plays a role in sorting mono-ubiquitinated endosomal cargo for trafficking to the lysosome for degradation. It is also involved in the regulation of exocytosis. STAMs were discovered as proteins that are highly phosphorylated following cytokine and growth factor stimulation. They function in cytokine signaling and surface receptor degradation, as well as regulate Golgi morphology. They associate with many proteins including Jak2 and Jak3 tyrosine kinases, Hrs, AMSH, and UBPY. STAM adaptor proteins contain VHS (Vps27, Hrs, STAM homology), ubiquitin interacting (UIM), and SH3 domains. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	57
212897	cd11964	SH3_STAM1	Src homology 3 domain of Signal Transducing Adaptor Molecule 1. STAM1 is part of the endosomal sorting complex required for transport (ESCRT-0) and is involved in sorting ubiquitinated cargo proteins from the endosome. It may also be involved in the regulation of IL2 and GM-CSF mediated signaling, and has been implicated in neural cell survival. STAMs were discovered as proteins that are highly phosphorylated following cytokine and growth factor stimulation. They function in cytokine signaling and surface receptor degradation, as well as regulate Golgi morphology. They associate with many proteins including Jak2 and Jak3 tyrosine kinases, Hrs, AMSH, and UBPY. STAM adaptor proteins contain VHS (Vps27, Hrs, STAM homology), ubiquitin interacting (UIM), and SH3 domains. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212898	cd11965	SH3_ASAP1	Src homology 3 domain of ArfGAP with SH3 domain, ankyrin repeat and PH domain containing protein 1. ASAP1 is also called DDEF1 (Development and Differentiation Enhancing Factor 1), AMAP1, centaurin beta-4, or PAG2. It is an Arf GTPase activating protein (GAP) with activity towards Arf1 and Arf5 but not Arf6. However, it has been shown to bind GTP-Arf6 stably without GAP activity. It has been implicated in cell growth, migration, and survival, as well as in tumor invasion and malignancy. It binds paxillin and cortactin, two components of invadopodia which are essential for tumor invasiveness. It also binds focal adhesion kinase (FAK) and the SH2/SH3 adaptor CrkL. ASAP1 contains an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, an Arf GAP domain, ankyrin (ANK) repeats, and a C-terminal SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	57
212899	cd11966	SH3_ASAP2	Src homology 3 domain of ArfGAP with SH3 domain, ankyrin repeat and PH domain containing protein 2. ASAP2 is also called DDEF2 (Development and Differentiation Enhancing Factor 2), AMAP2, centaurin beta-3, or PAG3. It mediates the functions of Arf GTPases via dual mechanisms: it exhibits GTPase activating protein (GAP) activity towards class I (Arf1) and II (Arf5) Arfs; and it binds class III Arfs (GTP-Arf6) stably without GAP activity. It binds paxillin and is implicated in Fcgamma receptor-mediated phagocytosis in macrophages and in cell migration. ASAP2 contains an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, an Arf GAP domain, ankyrin (ANK) repeats, and a C-terminal SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	56
212900	cd11967	SH3_SASH1	Src homology 3 domain of SAM And SH3 Domain Containing Protein 1. SASH1 is a potential tumor suppressor in breast and colon cancer. Its decreased expression is associated with aggressive tumor growth, metastasis, and poor prognosis. It is widely expressed in normal tissues (except lymphocytes and dendritic cells) and is localized in the nucleus and the cytoplasm. SASH1 interacts with the oncoprotein cortactin and is important in cell migration and adhesion. It is a member of the SLY family of proteins, which are adaptor proteins containing a central conserved region with a bipartite nuclear localization signal (NLS) as well as SAM (sterile alpha motif) and SH3 domains. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	57
212901	cd11968	SH3_SASH3	Src homology 3 domain of Sam And SH3 Domain Containing Protein 3. SASH3, also called SLY/SLY1 (SH3-domain containing protein expressed in lymphocytes), is expressed exclusively in lymphocytes and is essential in the full activation of adaptive immunity. It is involved in the signaling of T cell receptors. It was the first described member of the SLY family of proteins, which are adaptor proteins containing a central conserved region with a bipartite nuclear localization signal (NLS) as well as SAM (sterile alpha motif) and SH3 domains. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	56
212902	cd11969	SH3_PLCgamma2	Src homology 3 domain of Phospholipase C (PLC) gamma 2. PLCgamma2 is primarily expressed in haematopoietic cells, specifically in B cells. It is activated by tyrosine phosphorylation by B cell receptor (BCR) kinases and is recruited to the plasma membrane where its substrate is located. It is required in pre-BCR signaling and in the maturation of B cells. PLCs catalyze the hydrolysis of phosphatidylinositol (4,5)-bisphosphate [PtdIns(4,5)P2] to produce Ins(1,4,5)P3 and diacylglycerol (DAG). Ins(1,4,5)P3 initiates the calcium signaling cascade while DAG functions as an activator of PKC. PLCgamma contains a Pleckstrin homology (PH) domain followed by an elongation factor (EF) domain, two catalytic regions of PLC domains that flank two tandem SH2 domains, followed by a SH3 domain and C2 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212903	cd11970	SH3_PLCgamma1	Src homology 3 domain of Phospholipase C (PLC) gamma 1. PLCgamma1 is widely expressed and is essential in growth and development. It is activated by the TrkA receptor tyrosine kinase and functions as a key regulator of cell differentiation. It is also the predominant PLCgamma in T cells and is required for T cell and NK cell function. PLCs catalyze the hydrolysis of phosphatidylinositol (4,5)-bisphosphate [PtdIns(4,5)P2] to produce Ins(1,4,5)P3 and diacylglycerol (DAG). Ins(1,4,5)P3 initiates the calcium signaling cascade while DAG functions as an activator of PKC. PLCgamma contains a Pleckstrin homology (PH) domain followed by an elongation factor (EF) domain, two catalytic regions of PLC domains that flank two tandem SH2 domains, followed by a SH3 domain and C2 domain. The SH3 domain of PLCgamma1 directly interacts with dynamin-1 and can serve as a guanine nucleotide exchange factor (GEF). It also interacts with Cbl, inhibiting its phosphorylation and activity. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	60
212904	cd11971	SH3_Abi1	Src homology 3 domain of Abl Interactor 1. Abi1, also called e3B1, is a central regulator of actin cytoskeletal reorganization through interactions with many protein complexes. It is part of WAVE, a nucleation-promoting factor complex, that links Rac1 activation to actin polymerization causing lamellipodia protrusion at the plasma membrane. Abi1 interacts with formins to promote protrusions at the leading edge of motile cells. It also is a target of alpha4 integrin, regulating membrane protrusions at sites of integrin engagement. Abi proteins are adaptor proteins serving as binding partners and substrates of Abl tyrosine kinases. They are involved in regulating actin cytoskeletal reorganization and play important roles in membrane-ruffling, endocytosis, cell motility, and cell migration. Abi proteins contain a homeobox homology domain, a proline-rich region, and a SH3 domain. The SH3 domain of Abi binds to a PxxP motif in Abl. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	59
212905	cd11972	SH3_Abi2	Src homology 3 domain of Abl Interactor 2. Abi2 is highly expressed in the brain and eye. It regulates actin cytoskeletal reorganization at adherens junctions and dendritic spines, which is important in cell morphogenesis, migration, and cognitive function. Mice deficient in Abi2 show defects in orientation and migration of lens fibers, neuronal migration, dendritic spine morphology, as well as deficits in learning and memory. Abi proteins are adaptor proteins serving as binding partners and substrates of Abl tyrosine kinases. They are involved in regulating actin cytoskeletal reorganization and play important roles in membrane-ruffling, endocytosis, cell motility, and cell migration. Abi proteins contain a homeobox homology domain, a proline-rich region, and a SH3 domain. The SH3 domain of Abi binds to a PxxP motif in Abl. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	61
212906	cd11973	SH3_ASEF	Src homology 3 domain of APC-Stimulated guanine nucleotide Exchange Factor. ASEF, also called ARHGEF4, exists in an autoinhibited form and is activated upon binding of the tumor suppressor APC (adenomatous polyposis coli). GEFs activate small GTPases by exchanging bound GDP for free GTP. ASEF can activate Rac1 or Cdc42. Truncated ASEF, which is found in colorectal cancers, is constitutively active and has been shown to promote angiogenesis and cancer cell migration. ASEF contains a SH3 domain followed by RhoGEF (also called Dbl-homologous or DH) and Pleckstrin Homology (PH) domains. In its autoinhibited form, the SH3 domain of ASEF forms an extensive interface with the DH and PH domains, blocking the Rac binding site. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	73
212907	cd11974	SH3_ASEF2	Src homology 3 domain of APC-Stimulated guanine nucleotide Exchange Factor 2. ASEF2, also called Spermatogenesis-associated protein 13 (SPATA13), is a GEF that localizes with actin at the leading edge of cells and is important in cell migration and adhesion dynamics. GEFs activate small GTPases by exchanging bound GDP for free GTP. ASEF2 can activate both Rac1 and Cdc42, but only Rac1 activation is necessary for increased cell migration and adhesion turnover. Together with APC (adenomatous polyposis coli) and Neurabin2, a scaffold protein that binds F-actin, it is involved in regulating HGF-induced cell migration. ASEF2 contains a SH3 domain followed by RhoGEF (also called Dbl-homologous or DH) and Pleckstrin Homology (PH) domains. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	54
212908	cd11975	SH3_ARHGEF9	Src homology 3 domain of the Rho guanine nucleotide exchange factor ARHGEF9. ARHGEF9, also called PEM2 or collybistin, selectively activates Cdc42 by exchanging bound GDP for free GTP. It is highly expressed in the brain and it interacts with gephyrin, a postsynaptic protein associated with GABA and glycine receptors. Mutations in the ARHGEF9 gene cause X-linked mental retardation with associated features like seizures, hyper-anxiety, aggressive behavior, and sensory hyperarousal. ARHGEF9 contains a SH3 domain followed by RhoGEF (also called Dbl-homologous or DH) and Pleckstrin Homology (PH) domains. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	62
212909	cd11976	SH3_VAV1_2	C-terminal (or second) Src homology 3 domain of VAV1 protein. VAV1 is expressed predominantly in the hematopoietic system and it plays an important role in the development and activation of B and T cells. It is activated by tyrosine phosphorylation to function as a guanine nucleotide exchange factor (GEF) for Rho GTPases following cell surface receptor activation, triggering various effects such as cytoskeletal reorganization, transcription regulation, cell cycle progression, and calcium mobilization. It also serves as a scaffold protein and has been shown to interact with Ku70, Socs1, Janus kinase 2, SIAH2, S100B, Abl gene, ZAP-70, SLP76, and Syk, among others. VAV proteins contain several domains that enable their function: N-terminal calponin homology (CH), acidic, RhoGEF (also called Dbl-homologous or DH), Pleckstrin Homology (PH), C1 (zinc finger), SH2, and two SH3 domains. The C-terminal SH3 domain of Vav1 interacts with a wide variety of proteins including cytoskeletal regulators (zyxin), RNA-binding proteins (Sam68), transcriptional regulators, viral proteins, and dynamin 2. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	54
212910	cd11977	SH3_VAV2_2	C-terminal (or second) Src homology 3 domain of VAV2 protein. VAV2 is widely expressed and functions as a guanine nucleotide exchange factor (GEF) for RhoA, RhoB and RhoG and also activates Rac1 and Cdc42. It is implicated in many cellular and physiological functions including blood pressure control, eye development, neurite outgrowth and branching, EGFR endocytosis and degradation, and cell cluster morphology, among others. It has been reported to associate with Nek3. VAV proteins contain several domains that enable their function: N-terminal calponin homology (CH), acidic, RhoGEF (also called Dbl-homologous or DH), Pleckstrin Homology (PH), C1 (zinc finger), SH2, and two SH3 domains. The SH3 domain of VAV is involved in the localization of proteins to specific sites within the cell, by interacting with proline-rich sequences within target proteins. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	58
212911	cd11978	SH3_VAV3_2	C-terminal (or second) Src homology 3 domain of VAV3 protein. VAV3 is ubiquitously expressed and functions as a phosphorylation-dependent guanine nucleotide exchange factor (GEF) for RhoA, RhoG, and Rac1. It has been implicated to function in the hematopoietic, bone, cerebellar, and cardiovascular systems. VAV3 is essential in axon guidance in neurons that control blood pressure and respiration. It is overexpressed in prostate cancer cells and it plays a role in regulating androgen receptor transcriptional activity. VAV proteins contain several domains that enable their function: N-terminal calponin homology (CH), acidic, RhoGEF (also called Dbl-homologous or DH), Pleckstrin Homology (PH), C1 (zinc finger), SH2, and two SH3 domains. The SH3 domain of VAV is involved in the localization of proteins to specific sites within the cell, by interacting with proline-rich sequences within target proteins. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	56
212912	cd11979	SH3_VAV1_1	First Src homology 3 domain of VAV1 protein. VAV1 is expressed predominantly in the hematopoietic system and it plays an important role in the development and activation of B and T cells. It is activated by tyrosine phosphorylation to function as a guanine nucleotide exchange factor (GEF) for Rho GTPases following cell surface receptor activation, triggering various effects such as cytoskeletal reorganization, transcription regulation, cell cycle progression, and calcium mobilization. It also serves as a scaffold protein and has been shown to interact with Ku70, Socs1, Janus kinase 2, SIAH2, S100B, Abl gene, ZAP-70, SLP76, and Syk, among others. VAV proteins contain several domains that enable their function: N-terminal calponin homology (CH), acidic, RhoGEF (also called Dbl-homologous or DH), Pleckstrin Homology (PH), C1 (zinc finger), SH2, and two SH3 domains. The first SH3 domain of Vav1 has been shown to bind the adaptor protein Grb2. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	63
212913	cd11980	SH3_VAV2_1	First Src homology 3 domain of VAV2 protein. VAV2 is widely expressed and functions as a guanine nucleotide exchange factor (GEF) for RhoA, RhoB and RhoG and also activates Rac1 and Cdc42. It is implicated in many cellular and physiological functions including blood pressure control, eye development, neurite outgrowth and branching, EGFR endocytosis and degradation, and cell cluster morphology, among others. It has been reported to associate with Nek3. VAV proteins contain several domains that enable their function: N-terminal calponin homology (CH), acidic, RhoGEF (also called Dbl-homologous or DH), Pleckstrin Homology (PH), C1 (zinc finger), SH2, and two SH3 domains. The SH3 domain of VAV is involved in the localization of proteins to specific sites within the cell, by interacting with proline-rich sequences within target proteins. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	60
212914	cd11981	SH3_VAV3_1	First Src homology 3 domain of VAV3 protein. VAV3 is ubiquitously expressed and functions as a phosphorylation-dependent guanine nucleotide exchange factor (GEF) for RhoA, RhoG, and Rac1. It has been implicated to function in the hematopoietic, bone, cerebellar, and cardiovascular systems. VAV3 is essential in axon guidance in neurons that control blood pressure and respiration. It is overexpressed in prostate cancer cells and it plays a role in regulating androgen receptor transcriptional activity. VAV proteins contain several domains that enable their function: N-terminal calponin homology (CH), acidic, RhoGEF (also called Dbl-homologous or DH), Pleckstrin Homology (PH), C1 (zinc finger), SH2, and two SH3 domains. The SH3 domain of VAV is involved in the localization of proteins to specific sites within the cell, by interacting with proline-rich sequences within target proteins. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	62
212915	cd11982	SH3_Shank1	Src homology 3 domain of SH3 and multiple ankyrin repeat domains protein 1. Shank1, also called SSTRIP (Somatostatin receptor-interacting protein), is a brain-specific protein that plays a role in the construction of postsynaptic density (PSD) and the maturation of dendritic spines. Mice deficient in Shank1 show altered PSD composition, thinner PSDs, smaller dendritic spines, and weaker basal synaptic transmission, although synaptic plasticity is normal. They show increased anxiety and impaired fear memory, but also show better spatial learning. Shank proteins carry scaffolding functions through multiple sites of protein-protein interaction in their domain architecture, including ankyrin (ANK) repeats, a long proline-rich region, as well as SH3, PDZ, and SAM domains. The SH3 domain of Shank binds GRIP, a scaffold protein that binds AMPA receptors and Eph receptors/ligands. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	52
212916	cd11983	SH3_Shank2	Src homology 3 domain of SH3 and multiple ankyrin repeat domains protein 2. Shank2, also called ProSAP1 (Proline-rich synapse-associated protein 1) or CortBP1 (Cortactin-binding protein 1), is found in neurons, glia, endocrine cells, liver, and kidney. It plays a role in regulating dendritic spine volume and branching and postsynaptic clustering. Mutations in the Shank2 gene are associated with autism spectrum disorder and mental retardation. Shank proteins carry scaffolding functions through multiple sites of protein-protein interaction in their domain architecture, including ankyrin (ANK) repeats, a long proline-rich region, as well as SH3, PDZ, and SAM domains. The SH3 domain of Shank binds GRIP, a scaffold protein that binds AMPA receptors and Eph receptors/ligands. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	52
212917	cd11984	SH3_Shank3	Src homology 3 domain of SH3 and multiple ankyrin repeat domains protein 3. Shank3, also called ProSAP2 (Proline-rich synapse-associated protein 2), is widely expressed. It plays a role in the formation of dendritic spines and synapses. Haploinsufficiency of the Shank3 gene causes the 22q13 deletion/Phelan-McDermid syndrome, and variants of Shank3 have been implicated in autism spectrum disorder, schizophrenia, and intellectual disability. Shank proteins carry scaffolding functions through multiple sites of protein-protein interaction in their domain architecture, including ankyrin (ANK) repeats, a long proline-rich region, as well as SH3, PDZ, and SAM domains. The SH3 domain of Shank binds GRIP, a scaffold protein that binds AMPA receptors and Eph receptors/ligands. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	52
212918	cd11985	SH3_Stac2_C	C-terminal Src homology 3 domain of SH3 and cysteine-rich domain-containing protein 2 (Stac2). Stac proteins are putative adaptor proteins that contain a cysteine-rich C1 domain and one or two SH3 domains at the C-terminus. There are three mammalian members (Stac1, Stac2, and Stac3) of this family. Stac2 contains a single SH3 domain at the C-terminus unlike Stac1 and Stac3, which contain two C-terminal SH3 domains. Stac1 and Stac2 have been found to be expressed differently in mature dorsal root ganglia (DRG) neurons. Stac1 is mainly expressed in peptidergic neurons while Stac2 is found in a subset of nonpeptidergic and all trkB+ neurons. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212919	cd11986	SH3_Stac3_1	First C-terminal Src homology 3 domain of SH3 and cysteine-rich domain-containing protein 3 (Stac3). Stac proteins are putative adaptor proteins that contain a cysteine-rich C1 domain and one or two SH3 domains at the C-terminus. There are three mammalian members (Stac1, Stac2, and Stac3) of this family. Stac1 and Stac3 contain two SH3 domains while Stac2 contains a single SH3 domain at the C-terminus. Stac1 and Stac2 have been found to be expressed differently in mature dorsal root ganglia (DRG) neurons. Stac1 is mainly expressed in peptidergic neurons while Stac2 is found in a subset of nonpeptidergic and all trkB+ neurons. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212920	cd11987	SH3_Intersectin1_1	First Src homology 3 domain (or SH3A) of Intersectin-1. Intersectin-1 (ITSN1) is an adaptor protein that functions in exo- and endocytosis, actin cytoskeletal reorganization, and signal transduction. It plays a role in clathrin-coated pit (CCP) formation. It binds to many proteins through its multidomain structure and facilitates the assembly of multimeric complexes. ITSN1 localizes in membranous organelles, CCPs, the Golgi complex, and may be involved in the cell membrane trafficking system. It exists in alternatively spliced short and long isoforms. The short isoform contains two Eps15 homology domains (EH1 and EH2), a coiled-coil region and five SH3 domains (SH3A-E), while the long isoform, in addition, contains RhoGEF (also called Dbl-homologous or DH), Pleckstrin homology (PH) and C2 domains. The first SH3 domain (or SH3A) of ITSN1 has been shown to bind many proteins including Sos1, dynamin1/2, CIN85, c-Cbl, PI3K-C2, SHIP2, N-WASP, and CdGAP, among others. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212921	cd11988	SH3_Intersectin2_1	First Src homology 3 domain (or SH3A) of Intersectin-2. Intersectin-2 (ITSN2) is an adaptor protein that functions in exo- and endocytosis, actin cytoskeletal reorganization, and signal transduction. It plays a role in clathrin-coated pit (CCP) formation. It binds to many proteins through its multidomain structure and facilitates the assembly of multimeric complexes. ITSN2 also functions as a specific GEF for Cdc42 activation in epithelial morphogenesis, and is required in mitotic spindle orientation. It exists in alternatively spliced short and long isoforms. The short isoform contains two Eps15 homology domains (EH1 and EH2), a coiled-coil region and five SH3 domains (SH3A-E), while the long isoform, in addition, contains RhoGEF (also called Dbl-homologous or DH), Pleckstrin homology (PH) and C2 domains. The first SH3 domain (or SH3A) of ITSN2 is expected to bind many protein partners, similar to ITSN1 which has been shown to bind Sos1, dynamin1/2, CIN85, c-Cbl, PI3K-C2, SHIP2, N-WASP, and CdGAP, among others. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	57
212922	cd11989	SH3_Intersectin1_2	Second Src homology 3 domain (or SH3B) of Intersectin-1. Intersectin-1 (ITSN1) is an adaptor protein that functions in exo- and endocytosis, actin cytoskeletal reorganization, and signal transduction. It plays a role in clathrin-coated pit (CCP) formation. It binds to many proteins through its multidomain structure and facilitates the assembly of multimeric complexes. ITSN1 localizes in membranous organelles, CCPs, the Golgi complex, and may be involved in the cell membrane trafficking system. It exists in alternatively spliced short and long isoforms. The short isoform contains two Eps15 homology domains (EH1 and EH2), a coiled-coil region and five SH3 domains (SH3A-E), while the long isoform, in addition, contains RhoGEF (also called Dbl-homologous or DH), Pleckstrin homology (PH) and C2 domains. The second SH3 domain (or SH3B) of ITSN1 has been shown to bind WNK and CdGAP. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	52
212923	cd11990	SH3_Intersectin2_2	Second Src homology 3 domain (or SH3B) of Intersectin-2. Intersectin-2 (ITSN2) is an adaptor protein that functions in exo- and endocytosis, actin cytoskeletal reorganization, and signal transduction. It plays a role in clathrin-coated pit (CCP) formation. It binds to many proteins through its multidomain structure and facilitates the assembly of multimeric complexes. ITSN2 also functions as a specific GEF for Cdc42 activation in epithelial morphogenesis, and is required in mitotic spindle orientation. It exists in alternatively spliced short and long isoforms. The short isoform contains two Eps15 homology domains (EH1 and EH2), a coiled-coil region and five SH3 domains (SH3A-E), while the long isoform, in addition, contains RhoGEF (also called Dbl-homologous or DH), Pleckstrin homology (PH) and C2 domains. The second SH3 domain (or SH3B) of ITSN2 is expected to bind protein partners, similar to ITSN1 which has been shown to bind WNK and CdGAP. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	52
212924	cd11991	SH3_Intersectin1_3	Third Src homology 3 domain (or SH3C) of Intersectin-1. Intersectin-1 (ITSN1) is an adaptor protein that functions in exo- and endocytosis, actin cytoskeletal reorganization, and signal transduction. It plays a role in clathrin-coated pit (CCP) formation. It binds to many proteins through its multidomain structure and facilitates the assembly of multimeric complexes. ITSN1 localizes in membranous organelles, CCPs, the Golgi complex, and may be involved in the cell membrane trafficking system. It exists in alternatively spliced short and long isoforms. The short isoform contains two Eps15 homology domains (EH1 and EH2), a coiled-coil region and five SH3 domains (SH3A-E), while the long isoform, in addition, contains RhoGEF (also called Dbl-homologous or DH), Pleckstrin homology (PH) and C2 domains. The third SH3 domain (or SH3C) of ITSN1 has been shown to bind many proteins including dynamin1/2, CIN85, c-Cbl, SHIP2, Reps1, synaptojanin-1, and WNK, among others. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	52
212925	cd11992	SH3_Intersectin2_3	Third Src homology 3 domain (or SH3C) of Intersectin-2. Intersectin-2 (ITSN2) is an adaptor protein that functions in exo- and endocytosis, actin cytoskeletal reorganization, and signal transduction. It plays a role in clathrin-coated pit (CCP) formation. It binds to many proteins through its multidomain structure and facilitates the assembly of multimeric complexes. ITSN2 also functions as a specific GEF for Cdc42 activation in epithelial morphogenesis, and is required in mitotic spindle orientation. It exists in alternatively spliced short and long isoforms. The short isoform contains two Eps15 homology domains (EH1 and EH2), a coiled-coil region and five SH3 domains (SH3A-E), while the long isoform, in addition, contains RhoGEF (also called Dbl-homologous or DH), Pleckstrin homology (PH) and C2 domains. The third SH3 domain (SH3C) of ITSN2 has been shown to bind the K15 protein of Kaposi's sarcoma-associated herpesvirus. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	52
212926	cd11993	SH3_Intersectin1_4	Fourth Src homology 3 domain (or SH3D) of Intersectin-1. Intersectin-1 (ITSN1) is an adaptor protein that functions in exo- and endocytosis, actin cytoskeletal reorganization, and signal transduction. It plays a role in clathrin-coated pit (CCP) formation. It binds to many proteins through its multidomain structure and facilitates the assembly of multimeric complexes. ITSN1 localizes in membranous organelles, CCPs, the Golgi complex, and may be involved in the cell membrane trafficking system. It exists in alternatively spliced short and long isoforms. The short isoform contains two Eps15 homology domains (EH1 and EH2), a coiled-coil region and five SH3 domains (SH3A-E), while the long isoform, in addition, contains RhoGEF (also called Dbl-homologous or DH), Pleckstrin homology (PH) and C2 domains. The fourth SH3 domain (or SH3D) of ITSN1 has been shown to bind SHIP2, Numb, CdGAP, and N-WASP. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	65
212927	cd11994	SH3_Intersectin2_4	Fourth Src homology 3 domain (or SH3D) of Intersectin-2. Intersectin-2 (ITSN2) is an adaptor protein that functions in exo- and endocytosis, actin cytoskeletal reorganization, and signal transduction. It plays a role in clathrin-coated pit (CCP) formation. It binds to many proteins through its multidomain structure and facilitates the assembly of multimeric complexes. ITSN2 also functions as a specific GEF for Cdc42 activation in epithelial morphogenesis, and is required in mitotic spindle orientation. It exists in alternatively spliced short and long isoforms. The short isoform contains two Eps15 homology domains (EH1 and EH2), a coiled-coil region and five SH3 domains (SH3A-E), while the long isoform, in addition, contains RhoGEF (also called Dbl-homologous or DH), Pleckstrin homology (PH) and C2 domains. The fourth SH3 domain (or SH3D) of ITSN2 is expected to bind protein partners, similar to ITSN1 which has been shown to bind SHIP2, Numb, CdGAP, and N-WASP. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	59
212928	cd11995	SH3_Intersectin1_5	Fifth Src homology 3 domain (or SH3E) of Intersectin-1. Intersectin-1 (ITSN1) is an adaptor protein that functions in exo- and endocytosis, actin cytoskeletal reorganization, and signal transduction. It plays a role in clathrin-coated pit (CCP) formation. It binds to many proteins through its multidomain structure and facilitates the assembly of multimeric complexes. ITSN1 localizes in membranous organelles, CCPs, the Golgi complex, and may be involved in the cell membrane trafficking system. It exists in alternatively spliced short and long isoforms. The short isoform contains two Eps15 homology domains (EH1 and EH2), a coiled-coil region and five SH3 domains (SH3A-E), while the long isoform, in addition, contains RhoGEF (also called Dbl-homologous or DH), Pleckstrin homology (PH) and C2 domains. The fifth SH3 domain (or SH3E) of ITSN1 has been shown to bind many protein partners including SGIP1, Sos1, dynamin1/2, CIN85, c-Cbl, SHIP2, N-WASP, and synaptojanin-1, among others. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	54
212929	cd11996	SH3_Intersectin2_5	Fifth Src homology 3 domain (or SH3E) of Intersectin-2. Intersectin-2 (ITSN2) is an adaptor protein that functions in exo- and endocytosis, actin cytoskeletal reorganization, and signal transduction. It plays a role in clathrin-coated pit (CCP) formation. It binds to many proteins through its multidomain structure and facilitates the assembly of multimeric complexes. ITSN2 also functions as a specific GEF for Cdc42 activation in epithelial morphogenesis, and is required in mitotic spindle orientation. It exists in alternatively spliced short and long isoforms. The short isoform contains two Eps15 homology domains (EH1 and EH2), a coiled-coil region and five SH3 domains (SH3A-E), while the long isoform, in addition, contains RhoGEF (also called Dbl-homologous or DH), Pleckstrin homology (PH) and C2 domains. The fifth SH3 domain (or SH3E) of ITSN2 is expected to bind protein partners, similar to ITSN1 which has been shown to bind many protein partners including SGIP1, Sos1, dynamin1/2, CIN85, c-Cbl, SHIP2, N-WASP, and synaptojanin-1, among others. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	54
212930	cd11997	SH3_PACSIN3	Src homology 3 domain of Protein kinase C and Casein kinase Substrate in Neurons 3 (PACSIN3). PACSIN 3 or Syndapin III (Synaptic dynamin-associated protein III) is expressed ubiquitously and regulates glucose uptake in adipocytes through its role in GLUT1 trafficking. It also modulates the subcellular localization and stimulus-specific function of the cation channel TRPV4. PACSINs act as regulators of cytoskeletal and membrane dynamics. Vertebrates harbor three isoforms with distinct expression patterns and specific functions. PACSINs contain an N-terminal F-BAR domain and a C-terminal SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	56
212931	cd11998	SH3_PACSIN1-2	Src homology 3 domain of Protein kinase C and Casein kinase Substrate in Neurons 1 (PACSIN1) and PACSIN 2. PACSIN 1 or Syndapin I (Synaptic dynamin-associated protein I) is expressed specifically in the brain and is localized in neurites and synaptic boutons. It binds the brain-specific proteins dynamin I, synaptojanin, synapsin I, and neural Wiskott-Aldrich syndrome protein (nWASP), and functions as a link between the cytoskeletal machinery and synaptic vesicle endocytosis. PACSIN 1 interacts with huntingtin and may be implicated in the neuropathology of Huntington's disease. PACSIN 2 or Syndapin II is expressed ubiquitously and is involved in the regulation of tubulin polymerization. It associates with Golgi membranes and forms a complex with dynamin II which is crucial in promoting vesicle formation from the trans-Golgi network. PACSINs act as regulators of cytoskeletal and membrane dynamics. Vertebrates harbor three isoforms with distinct expression patterns and specific functions. PACSINs contain an N-terminal F-BAR domain and a C-terminal SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	56
212932	cd11999	SH3_PACSIN_like	Src homology 3 domain of an unknown subfamily of proteins with similarity to Protein kinase C and Casein kinase Substrate in Neurons (PACSIN) proteins. PACSINs, also called Synaptic dynamin-associated proteins (Syndapins), act as regulators of cytoskeletal and membrane dynamics. They bind both dynamin and Wiskott-Aldrich syndrome protein (WASP), and may provide direct links between the actin cytoskeletal machinery through WASP and dynamin-dependent endocytosis. Vertebrates harbor three isoforms with distinct expression patterns and specific functions. PACSINs contain an N-terminal F-BAR domain and a C-terminal SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	56
212933	cd12000	SH3_CASS4	Src homology 3 domain of CAS (Crk-Associated Substrate) scaffolding protein family member 4. CASS4, also called HEPL (HEF1-EFS-p130Cas-like), localizes to focal adhesions and plays a role in regulating FAK activity, focal adhesion integrity, and cell spreading. It is most abundant in blood cells and lung tissue, and is also found in high levels in leukemia and ovarian cell lines. CAS proteins function as molecular scaffolds to regulate protein complexes that are involved in many cellular processes. They share a common domain structure that includes an N-terminal SH3 domain, an unstructured substrate domain that contains many YxxP motifs, a serine-rich four-helix bundle, and a FAT-like C-terminal domain. The SH3 domain of CAS proteins binds to diverse partners including FAK, FRNK, Pyk2, PTP-PEST, DOCK180, among others. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	57
212934	cd12001	SH3_BCAR1	Src homology 3 domain of the CAS (Crk-Associated Substrate) scaffolding protein family member, Breast Cancer Anti-estrogen Resistance 1. BCAR1, also called p130cas or CASS1, is the founding member of the CAS family of scaffolding proteins and was originally identified through its ability to associate with Crk. The name BCAR1 was designated because the human gene was identified in a screen for genes that promote resistance to tamoxifen. It is widely expressed and its deletion is lethal in mice. It plays a role in regulating cell motility, survival, proliferation, transformation, cancer progression, and bacterial pathogenesis. CAS proteins function as molecular scaffolds to regulate protein complexes that are involved in many cellular processes. They share a common domain structure that includes an N-terminal SH3 domain, an unstructured substrate domain that contains many YxxP motifs, a serine-rich four-helix bundle, and a FAT-like C-terminal domain. The SH3 domain of CAS proteins binds to diverse partners including FAK, FRNK, Pyk2, PTP-PEST, DOCK180, among others. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	68
212935	cd12002	SH3_NEDD9	Src homology 3 domain of CAS (Crk-Associated Substrate) scaffolding protein family member, Neural precursor cell Expressed, Developmentally Down-regulated 9. NEDD9 is also called human enhancer of filamentation 1 (HEF1) or CAS-L (Crk-associated substrate in lymphocyte). It was first described as a gene predominantly expressed in early embryonic brain, and was also isolated from a screen of human proteins that regulate filamentous budding in yeast, and as a tyrosine phosphorylated protein in lymphocytes. It promotes metastasis in different solid tumors. NEDD9 localizes in focal adhesions and associates with FAK and Abl kinase. It also interacts with SMAD3 and the proteasomal machinery which allows its rapid turnover; these interactions are not shared by other CAS proteins. CAS proteins function as molecular scaffolds to regulate protein complexes that are involved in many cellular processes. They share a common domain structure that includes an N-terminal SH3 domain, an unstructured substrate domain that contains many YxxP motifs, a serine-rich four-helix bundle, and a FAT-like C-terminal domain. The SH3 domain of CAS proteins binds to diverse partners including FAK, FRNK, Pyk2, PTP-PEST, DOCK180, among others. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	57
212936	cd12003	SH3_EFS	Src homology 3 domain of CAS (Crk-Associated Substrate) scaffolding protein family member, Embryonal Fyn-associated Substrate. EFS is also called HEFS, CASS3 (Cas scaffolding protein family member 3) or SIN (Src-interacting protein). It was identified based on interactions with the Src kinases, Fyn and Yes. It plays a role in thymocyte development and acts as a negative regulator of T cell proliferation. CAS proteins function as molecular scaffolds to regulate protein complexes that are involved in many cellular processes. They share a common domain structure that includes an N-terminal SH3 domain, an unstructured substrate domain that contains many YxxP motifs, a serine-rich four-helix bundle, and a FAT-like C-terminal domain. The SH3 domain of CAS proteins binds to diverse partners including FAK, FRNK, Pyk2, PTP-PEST, DOCK180, among others. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	62
212937	cd12004	SH3_Lyn	Src homology 3 domain of Lyn Protein Tyrosine Kinase. Lyn is a member of the Src subfamily of proteins, which are cytoplasmic (or non-receptor) PTKs. Lyn is expressed in B lymphocytes and myeloid cells. It exhibits both positive and negative regulatory roles in B cell receptor (BCR) signaling. Lyn, as well as Fyn and Blk, promotes B cell activation by phosphorylating ITAMs (immunoreceptor tyr activation motifs) in CD19 and in Ig components of BCR. It negatively regulates signaling by its unique ability to phosphorylate ITIMs (immunoreceptor tyr inhibition motifs) in cell surface receptors like CD22 and CD5. Lyn also plays an important role in G-CSF receptor signaling by phosphorylating a variety of adaptor molecules. Src kinases contain an N-terminal SH4 domain with a myristoylation site, followed by SH3 and SH2 domains, a tyr kinase domain, and a regulatory C-terminal region containing a conserved tyr. They are activated by autophosphorylation at the tyr kinase domain, but are negatively regulated by phosphorylation at the C-terminal tyr by Csk (C-terminal Src Kinase). The SH3 domain of Src kinases contributes to substrate recruitment by binding adaptor proteins/substrates, and regulation of kinase activity through an intramolecular interaction. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	56
212938	cd12005	SH3_Lck	Src homology 3 domain of Lck Protein Tyrosine Kinase. Lck is a member of the Src subfamily of proteins, which are cytoplasmic (or non-receptor) PTKs. Lck is expressed in T-cells and natural killer cells. It plays a critical role in T-cell maturation, activation, and T-cell receptor (TCR) signaling. Lck phosphorylates ITAM (immunoreceptor tyr activation motif) sequences on several subunits of TCRs, leading to the activation of different second messenger cascades. Phosphorylated ITAMs serve as binding sites for other signaling factors such as Syk and ZAP-70, leading to their activation and propagation of downstream events. In addition, Lck regulates drug-induced apoptosis by interfering with the mitochondrial death pathway. The apoptotic role of Lck is independent of its primary function in T-cell signaling. Src kinases contain an N-terminal SH4 domain with a myristoylation site, followed by SH3 and SH2 domains, a tyr kinase domain, and a regulatory C-terminal region containing a conserved tyr. They are activated by autophosphorylation at the tyr kinase domain, but are negatively regulated by phosphorylation at the C-terminal tyr by Csk (C-terminal Src Kinase). The SH3 domain of Src kinases contributes to substrate recruitment by binding adaptor proteins/substrates, and regulation of kinase activity through an intramolecular interaction. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	54
212939	cd12006	SH3_Fyn_Yrk	Src homology 3 domain of Fyn and Yrk Protein Tyrosine Kinases. Fyn and Yrk (Yes-related kinase) are members of the Src subfamily of proteins, which are cytoplasmic (or non-receptor) PTKs. Fyn, together with Lck, plays a critical role in T-cell signal transduction by phosphorylating ITAM (immunoreceptor tyr activation motif) sequences on T-cell receptors, ultimately leading to the proliferation and differentiation of T-cells. In addition, Fyn is involved in the myelination of neurons, and is implicated in Alzheimer's and Parkinson's diseases. Yrk has been detected only in chickens. It is primarily found in neuronal and epithelial cells and in macrophages. It may play a role in inflammation and in response to injury. Src kinases contain an N-terminal SH4 domain with a myristoylation site, followed by SH3 and SH2 domains, a tyr kinase domain, and a regulatory C-terminal region containing a conserved tyr. They are activated by autophosphorylation at the tyr kinase domain, but are negatively regulated by phosphorylation at the C-terminal tyr by Csk (C-terminal Src Kinase). The SH3 domain of Src kinases contributes to substrate recruitment by binding adaptor proteins/substrates, and regulation of kinase activity through an intramolecular interaction. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	56
212940	cd12007	SH3_Yes	Src homology 3 domain of Yes Protein Tyrosine Kinase. Yes (or c-Yes) is a member of the Src subfamily of proteins, which are cytoplasmic (or non-receptor) PTKs. c-Yes kinase is the cellular homolog of the oncogenic protein (v-Yes) encoded by the Yamaguchi 73 and Esh sarcoma viruses. It displays functional overlap with other Src subfamily members, particularly Src. It also shows some unique functions such as binding to occludins, transmembrane proteins that regulate extracellular interactions in tight junctions. Yes also associates with a number of proteins in different cell types that Src does not interact with, like JAK2 and gp130 in pre-adipocytes, and Pyk2 in treated pulmonary vein endothelial cells. Although the biological function of Yes remains unclear, it appears to have a role in regulating cell-cell interactions and vesicle trafficking in polarized cells. Src kinases contain an N-terminal SH4 domain with a myristoylation site, followed by SH3 and SH2 domains, a tyr kinase domain, and a regulatory C-terminal region containing a conserved tyr. They are activated by autophosphorylation at the tyr kinase domain, but are negatively regulated by phosphorylation at the C-terminal tyr by Csk (C-terminal Src Kinase). The SH3 domain of Src kinases contributes to substrate recruitment by binding adaptor proteins/substrates, and regulation of kinase activity through an intramolecular interaction. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	58
212941	cd12008	SH3_Src	Src homology 3 domain of Src Protein Tyrosine Kinase. Src (or c-Src) is a cytoplasmic (or non-receptor) PTK and is the vertebrate homolog of the oncogenic protein (v-Src) from Rous sarcoma virus. Together with other Src subfamily proteins, it is involved in signaling pathways that regulate cytokine and growth factor responses, cytoskeleton dynamics, cell proliferation, survival, and differentiation. Src also plays a role in regulating cell adhesion, invasion, and motility in cancer cells, and tumor vasculature, contributing to cancer progression and metastasis. Elevated levels of Src kinase activity have been reported in a variety of human cancers. Several inhibitors of Src have been developed as anti-cancer drugs. Src is also implicated in acute inflammatory responses and osteoclast function. Src kinases contain an N-terminal SH4 domain with a myristoylation site, followed by SH3 and SH2 domains, a tyr kinase domain, and a regulatory C-terminal region containing a conserved tyr. They are activated by autophosphorylation at the tyr kinase domain, but are negatively regulated by phosphorylation at the C-terminal tyr by Csk (C-terminal Src Kinase). The SH3 domain of Src kinases contributes to substrate recruitment by binding adaptor proteins/substrates, and regulation of kinase activity through an intramolecular interaction. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	56
212942	cd12009	SH3_Blk	Src homology 3 domain of Blk Protein Tyrosine Kinase. Blk is a member of the Src subfamily of proteins, which are cytoplasmic (or non-receptor) PTKs. It is expressed specifically in B-cells and is involved in pre-BCR (B-cell receptor) signaling. Src kinases contain an N-terminal SH4 domain with a myristoylation site, followed by SH3 and SH2 domains, a tyr kinase domain, and a regulatory C-terminal region containing a conserved tyr. They are activated by autophosphorylation at the tyr kinase domain, but are negatively regulated by phosphorylation at the C-terminal tyr by Csk (C-terminal Src Kinase). The SH3 domain of Src kinases contributes to substrate recruitment by binding adaptor proteins/substrates, and regulation of kinase activity through an intramolecular interaction. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	54
212943	cd12010	SH3_SLAP	Src homology 3 domain of Src-Like Adaptor Protein. SLAP (or SLA1) modulates TCR surface expression levels as well as surface and total BCR levels. As an adaptor to c-Cbl, SLAP increases the ubiquitination, intracellular retention, and targeted degradation of the BCR complex components. SLAP has been shown to interact with the EphA receptor, EpoR, Lck, PDGFR, Syk, CD79a, c-Cbl, LAT, CD247, and Zap70, among others. SLAPs are adaptor proteins with limited similarity to Src family tyrosine kinases. They contain an N-terminal SH3 domain followed by an SH2 domain, and a unique C-terminal sequence. The SH3 domain of SLAP forms a complex with v-Abl. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212944	cd12011	SH3_SLAP2	Src homology 3 domain of Src-Like Adaptor Protein 2. SLAP2 plays a role in c-Cbl-dependent regulation of CSF1R, a tyrosine kinase important for myeloid cell growth and differentiation. It has been shown to interact with CSF1R, c-Cbl, LAT, CD247, and Zap70. SLAPs are adaptor proteins with limited similarity to Src family tyrosine kinases. They contain an N-terminal SH3 domain followed by an SH2 domain, and a unique C-terminal sequence. They function in regulating the signaling, ubiquitination, and trafficking of T-cell receptor (TCR) and B-cell receptor (BCR) components. The SH3 domain of SLAP forms a complex with v-Abl. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212945	cd12012	SH3_RIM-BP_2	Second Src homology 3 domain of Rab3-interacting molecules (RIMs) binding proteins. RIMs binding proteins (RBPs, RIM-BPs) associate with calcium channels present in photoreceptors, neurons, and hair cells; they interact simultaneously with specific calcium channel subunits, and active zone proteins, RIM1 and RIM2. RIMs are part of the matrix at the presynaptic active zone and are associated with synaptic vesicles through their interaction with the small GTPase Rab3. RIM-BPs play a role in regulating synaptic transmission by serving as adaptors and linking calcium channels with the synaptic vesicle release machinery. RIM-BPs contain three SH3 domains and two to three fibronectin III repeats. Invertebrates contain one, while vertebrates contain at least two RIM-BPs, RIM-BP1 and RIM-BP2. RIM-BP1 is also called peripheral-type benzodiazepine receptor associated protein 1 (PRAX-1). Mammals contain a third protein, RIM-BP3. RIM-BP1 and RIM-BP2 are predominantly expressed in the brain where they display overlapping but distinct expression patterns, while RIM-BP3 is almost exclusively expressed in the testis and is essential in spermiogenesis. The SH3 domains of RIM-BPs bind to the PxxP motifs of RIM1, RIM2, and L-type (alpha1D) and N-type (alpha1B) calcium channel subunits. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	62
212946	cd12013	SH3_RIM-BP_3	Third Src homology 3 domain of Rab3-interacting molecules (RIMs) binding proteins. RIMs binding proteins (RBPs, RIM-BPs) associate with calcium channels present in photoreceptors, neurons, and hair cells; they interact simultaneously with specific calcium channel subunits, and active zone proteins, RIM1 and RIM2. RIMs are part of the matrix at the presynaptic active zone and are associated with synaptic vesicles through their interaction with the small GTPase Rab3. RIM-BPs play a role in regulating synaptic transmission by serving as adaptors and linking calcium channels with the synaptic vesicle release machinery. RIM-BPs contain three SH3 domains and two to three fibronectin III repeats. Invertebrates contain one, while vertebrates contain at least two RIM-BPs, RIM-BP1 and RIM-BP2. RIM-BP1 is also called peripheral-type benzodiazepine receptor associated protein 1 (PRAX-1). Mammals contain a third protein, RIM-BP3. RIM-BP1 and RIM-BP2 are predominantly expressed in the brain where they display overlapping but distinct expression patterns, while RIM-BP3 is almost exclusively expressed in the testis and is essential in spermiogenesis. The SH3 domains of RIM-BPs bind to the PxxP motifs of RIM1, RIM2, and L-type (alpha1D) and N-type (alpha1B) calcium channel subunits. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	61
212947	cd12014	SH3_RIM-BP_1	First Src homology 3 domain of Rab3-interacting molecules (RIMs) binding proteins. RIMs binding proteins (RBPs, RIM-BPs) associate with calcium channels present in photoreceptors, neurons, and hair cells; they interact simultaneously with specific calcium channel subunits, and active zone proteins, RIM1 and RIM2. RIMs are part of the matrix at the presynaptic active zone and are associated with synaptic vesicles through their interaction with the small GTPase Rab3. RIM-BPs play a role in regulating synaptic transmission by serving as adaptors and linking calcium channels with the synaptic vesicle release machinery. RIM-BPs contain three SH3 domains and two to three fibronectin III repeats. Invertebrates contain one, while vertebrates contain at least two RIM-BPs, RIM-BP1 and RIM-BP2. RIM-BP1 is also called peripheral-type benzodiazepine receptor associated protein 1 (PRAX-1). Mammals contain a third protein, RIM-BP3. RIM-BP1 and RIM-BP2 are predominantly expressed in the brain where they display overlapping but distinct expression patterns, while RIM-BP3 is almost exclusively expressed in the testis and is essential in spermiogenesis. The SH3 domains of RIM-BPs bind to the PxxP motifs of RIM1, RIM2, and L-type (alpha1D) and N-type (alpha1B) calcium channel subunits. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	62
212948	cd12015	SH3_Tks_1	First Src homology 3 domain of Tyrosine kinase substrate (Tks) proteins. Tks proteins are Src substrates and scaffolding proteins that play important roles in the formation of podosomes and invadopodia, the dynamic actin-rich structures that are related to cell migration and cancer cell invasion. Vertebrates contain two Tks proteins, Tks4 (Tyr kinase substrate with four SH3 domains) and Tks5 (Tyr kinase substrate with five SH3 domains), which display partially overlapping but non-redundant functions. Both associate with the ADAMs family of transmembrane metalloproteases, which function as sheddases and mediators of cell and matrix interactions. Tks5 interacts with N-WASP and Nck, while Tks4 is essential for the localization of MT1-MMP (membrane-type 1 matrix metalloproteinase) to invadopodia. Tks proteins contain an N-terminal Phox homology (PX) domain and four or five SH3 domains. This model characterizes the first SH3 domain of Tks proteins. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212949	cd12016	SH3_Tks_2	Second Src homology 3 domain of Tyrosine kinase substrate (Tks) proteins. Tks proteins are Src substrates and scaffolding proteins that play important roles in the formation of podosomes and invadopodia, the dynamic actin-rich structures that are related to cell migration and cancer cell invasion. Vertebrates contain two Tks proteins, Tks4 (Tyr kinase substrate with four SH3 domains) and Tks5 (Tyr kinase substrate with five SH3 domains), which display partially overlapping but non-redundant functions. Both associate with the ADAMs family of transmembrane metalloproteases, which function as sheddases and mediators of cell and matrix interactions. Tks5 interacts with N-WASP and Nck, while Tks4 is essential for the localization of MT1-MMP (membrane-type 1 matrix metalloproteinase) to invadopodia. Tks proteins contain an N-terminal Phox homology (PX) domain and four or five SH3 domains. This model characterizes the second SH3 domain of Tks proteins. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	54
212950	cd12017	SH3_Tks_3	Third Src homology 3 domain of Tyrosine kinase substrate (Tks) proteins. Tks proteins are Src substrates and scaffolding proteins that play important roles in the formation of podosomes and invadopodia, the dynamic actin-rich structures that are related to cell migration and cancer cell invasion. Vertebrates contain two Tks proteins, Tks4 (Tyr kinase substrate with four SH3 domains) and Tks5 (Tyr kinase substrate with five SH3 domains), which display partially overlapping but non-redundant functions. Both associate with the ADAMs family of transmembrane metalloproteases, which function as sheddases and mediators of cell and matrix interactions. Tks5 interacts with N-WASP and Nck, while Tks4 is essential for the localization of MT1-MMP (membrane-type 1 matrix metalloproteinase) to invadopodia. Tks proteins contain an N-terminal Phox homology (PX) domain and four or five SH3 domains. This model characterizes the third SH3 domain of Tks proteins. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212951	cd12018	SH3_Tks4_4	Fourth (C-terminal) Src homology 3 domain of Tyrosine kinase substrate with four SH3 domains. Tks4, also called SH3 and PX domain-containing protein 2B (SH3PXD2B) or HOFI, is a Src substrate and scaffolding protein that plays an important role in the formation of podosomes and invadopodia, the dynamic actin-rich structures that are related to cell migration and cancer cell invasion. It is required in the formation of functional podosomes, EGF-induced membrane ruffling, and lamellipodia generation. It plays an important role in cellular attachment and cell spreading. Tks4 is essential for the localization of MT1-MMP (membrane-type 1 matrix metalloproteinase) to invadopodia. It contains an N-terminal Phox homology (PX) domain and four SH3 domains. This model characterizes the fourth (C-terminal) SH3 domain of Tks4. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	56
212952	cd12019	SH3_Tks5_4	Fourth Src homology 3 domain of Tyrosine kinase substrate with five SH3 domains. Tks5, also called SH3 and PX domain-containing protein 2A (SH3PXD2A) or Five SH (FISH), is a scaffolding protein and Src substrate that is localized in podosomes, which are electron-dense structures found in Src-transformed fibroblasts, osteoclasts, macrophages, and some invasive cancer cells. It binds and regulates some members of the ADAMs family of transmembrane metalloproteases, which function as sheddases and mediators of cell and matrix interactions. It is required for podosome formation, degradation of the extracellular matrix, and cancer cell invasion. Tks5 contains an N-terminal Phox homology (PX) domain and five SH3 domains. This model characterizes the fourth SH3 domain of Tks5. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212953	cd12020	SH3_Tks5_5	Fifth (C-terminal) Src homology 3 domain of Tyrosine kinase substrate with five SH3 domains. Tks5, also called SH3 and PX domain-containing protein 2A (SH3PXD2A) or Five SH (FISH), is a scaffolding protein and Src substrate that is localized in podosomes, which are electron-dense structures found in Src-transformed fibroblasts, osteoclasts, macrophages, and some invasive cancer cells. It binds and regulates some members of the ADAMs family of transmembrane metalloproteases, which function as sheddases and mediators of cell and matrix interactions. It is required for podosome formation, degradation of the extracellular matrix, and cancer cell invasion. Tks5 contains an N-terminal Phox homology (PX) domain and five SH3 domains. This model characterizes the fifth (C-terminal) SH3 domain of Tks5. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	57
212954	cd12021	SH3_p47phox_1	First or N-terminal Src homology 3 domain of the p47phox subunit of NADPH oxidase, also called Neutrophil Cytosolic Factor 1. p47phox, or NCF1, is a cytosolic subunit of the phagocytic NADPH oxidase complex (also called Nox2 or gp91phox), which plays a key role in the ability of phagocytes to defend against bacterial infections. NADPH oxidase catalyzes the transfer of electrons from NADPH to oxygen during phagocytosis forming superoxide and reactive oxygen species. p47phox is required for activation of NADPH oxidase and plays a role in translocation. It contains an N-terminal Phox homology (PX) domain, tandem SH3 domains (N-SH3 and C-SH3), a polybasic/autoinhibitory region, and a C-terminal proline-rich region (PRR). This model characterizes the first SH3 domain (or N-SH3) of p47phox. In its inactive state, the tandem SH3 domains interact intramolecularly with the autoinhibitory region; upon activation, the tandem SH3 domains are exposed through a conformational change, resulting in their binding to the PRR of p22phox and the activation of NADPH oxidase. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212955	cd12022	SH3_p47phox_2	Second or C-terminal Src homology 3 domain of the p47phox subunit of NADPH oxidase, also called Neutrophil Cytosolic Factor 1. p47phox, or NCF1, is a cytosolic subunit of the phagocytic NADPH oxidase complex (also called Nox2 or gp91phox), which plays a key role in the ability of phagocytes to defend against bacterial infections. NADPH oxidase catalyzes the transfer of electrons from NADPH to oxygen during phagocytosis forming superoxide and reactive oxygen species. p47phox is required for activation of NADPH oxidase and plays a role in translocation. It contains an N-terminal Phox homology (PX) domain, tandem SH3 domains (N-SH3 and C-SH3), a polybasic/autoinhibitory region, and a C-terminal proline-rich region (PRR). This model characterizes the second SH3 domain (or C-SH3) of p47phox. In its inactive state, the tandem SH3 domains interact intramolecularly with the autoinhibitory region; upon activation, the tandem SH3 domains are exposed through a conformational change, resulting in their binding to the PRR of p22phox and the activation of NADPH oxidase. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212956	cd12023	SH3_NoxO1_1	First or N-terminal Src homology 3 domain of Nox Organizing protein 1. Nox Organizing protein 1 (NoxO1) is a critical regulator of enzyme kinetics of the nonphagocytic NADPH oxidase Nox1, which catalyzes the transfer of electrons from NADPH to molecular oxygen to form superoxide. Nox1 is expressed in colon, stomach, uterus, prostate, and vascular smooth muscle cells. NoxO1 is involved in targeting activator subunits (such as NoxA1) to Nox1. It is co-localized with Nox1 in the membranes of resting cells and directs the subcellular localization of Nox1. NoxO1 contains an N-terminal Phox homology (PX) domain, tandem SH3 domains (N-SH3 and C-SH3), and a C-terminal proline-rich region (PRR). This model characterizes the first SH3 domain (or N-SH3) of NoxO1. The tandem SH3 domains of NoxO1 interact with the PRR of p22phox, which also complexes with Nox1. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	56
212957	cd12024	SH3_NoxO1_2	Second or C-terminal Src homology 3 domain of NADPH oxidase (Nox) Organizing protein 1. Nox Organizing protein 1 (NoxO1) is a critical regulator of enzyme kinetics of the nonphagocytic NADPH oxidase Nox1, which catalyzes the transfer of electrons from NADPH to molecular oxygen to form superoxide. Nox1 is expressed in colon, stomach, uterus, prostate, and vascular smooth muscle cells. NoxO1 is involved in targeting activator subunits (such as NoxA1) to Nox1. It is co-localized with Nox1 in the membranes of resting cells and directs the subcellular localization of Nox1. NoxO1 contains an N-terminal Phox homology (PX) domain, tandem SH3 domains (N-SH3 and C-SH3), and a C-terminal proline-rich region (PRR). This model characterizes the second SH3 domain (or C-SH3) of NoxO1. The tandem SH3 domains of NoxO1 interact with the PRR of p22phox, which also complexes with Nox1. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212958	cd12025	SH3_Obscurin_like	Src homology 3 domain of Obscurin and similar proteins. Obscurin is a giant muscle protein that is concentrated at the peripheries of Z-disks and M-lines. It binds small ankyrin I, a component of the sarcoplasmic reticulum (SR) membrane. It is associated with the contractile apparatus through binding with titin and sarcomeric myosin. It plays important roles in the organization and assembly of the myofibril and the SR. Obscurin has been observed as alternatively-spliced isoforms. The major isoform in skeletal muscle, approximately 800 kDa in size, is composed of many adhesion modules and signaling domains. It harbors 49 Ig and 2 FNIII repeats at the N-terminus, a complex middle region with additional Ig domains, an IQ motif, and a conserved SH3 domain near RhoGEF and PH domains, and a non-modular C-terminus with phosphorylation motifs. The obscurin gene also encodes two kinase domains, which are not part of the 800 kDa form of the protein, but are part of smaller spliced products that are present in heart muscle. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	63
212959	cd12026	SH3_ZO-1	Src homology 3 domain of the Tight junction protein, Zonula occludens protein 1. ZO-1 is a scaffolding protein that associates with other ZO proteins and other proteins of the tight junction, zonula adherens, and gap junctions. ZO proteins play roles in regulating cytoskeletal dynamics at these cell junctions. ZO-1 plays an essential role in embryonic development. It regulates the assembly and dynamics of the cortical cytoskeleton at cell-cell junctions. It is considered a member of the MAGUK (membrane-associated guanylate kinase) protein family, which is characterized by the presence of a core of three domains: PDZ, SH3, and guanylate kinase (GuK). The GuK domain in MAGUK proteins is enzymatically inactive; instead, the domain mediates protein-protein interactions and associates intramolecularly with the SH3 domain. The C-terminal region of ZO-1 is the largest of the three ZO proteins and contains an actin-binding region and domains of unknown function designated alpha and ZU5. The SH3 domain of ZO-1 has been shown to bind ZONAB, ZAK, afadin, and Galpha12. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	65
212960	cd12027	SH3_ZO-2	Src homology 3 domain of the Tight junction protein, Zonula occludens protein 2. ZO-2 is a scaffolding protein that associates with other ZO proteins and other proteins of the tight junction, zonula adherens, and gap junctions. ZO proteins play roles in regulating cytoskeletal dynamics at these cell junctions. ZO-2 plays an essential role in embryonic development. It is critical for the blood-testis barrier integrity and male fertility. It also regulates the expression of cyclin D1 and cell proliferation. It is considered a member of the MAGUK (membrane-associated guanylate kinase) protein family, which is characterized by the presence of a core of three domains: PDZ, SH3, and guanylate kinase (GuK). The GuK domain in MAGUK proteins is enzymatically inactive; instead, the domain mediates protein-protein interactions and associates intramolecularly with the SH3 domain. The C-terminal region of ZO-2 contains an actin-binding region and a domain of unknown function designated beta. The SH3 domain of the related protein ZO-1 has been shown to bind ZONAB, ZAK, afadin, and Galpha12. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	63
212961	cd12028	SH3_ZO-3	Src homology 3 domain of the Tight junction protein, Zonula occludens protein 3. ZO-3 is a scaffolding protein that associates with other ZO proteins and other proteins of the tight junction, zonula adherens, and gap junctions. ZO proteins play roles in regulating cytoskeletal dynamics at these cell junctions. ZO-3 is critical for epidermal barrier function. It regulates cyclin D1-dependent cell proliferation. It is considered a member of the MAGUK (membrane-associated guanylate kinase) protein family, which is characterized by the presence of a core of three domains: PDZ, SH3, and guanylate kinase (GuK). The GuK domain in MAGUK proteins is enzymatically inactive; instead, the domain mediates protein-protein interactions and associates intramolecularly with the SH3 domain. The C-terminal region of ZO-3 is the smallest of the three ZO proteins. The SH3 domain of the related protein ZO-1 has been shown to bind ZONAB, ZAK, afadin, and Galpha12. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	65
212962	cd12029	SH3_DLG3	Src Homology 3 domain of Disks Large homolog 3. DLG3, also called synapse-associated protein 102 (SAP102), is a scaffolding protein that clusters at synapses and plays an important role in synaptic development and plasticity. Mutations in DLG3 cause midgestational embryonic lethality in mice and may be associated with nonsyndromic X-linked mental retardation in humans. It interacts with the NEDD4 (neural precursor cell-expressed developmentally downregulated 4) family of ubiquitin ligases and promotes apical tight junction formation. DLG3 is a member of the MAGUK (membrane-associated guanylate kinase) protein family, which is characterized by the presence of a core of three domains: PDZ, SH3, and guanylate kinase (GuK). The GuK domain in MAGUK proteins is enzymatically inactive; instead, the domain mediates protein-protein interactions and associates intramolecularly with the SH3 domain. DLG3 contains three PDZ domains. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	67
212963	cd12030	SH3_DLG4	Src Homology 3 domain of Disks Large homolog 4. DLG4, also called postsynaptic density-95 (PSD95) or synapse-associated protein 90 (SAP90), is a scaffolding protein that clusters at synapses and plays an important role in synaptic development and plasticity. It is responsible for the membrane clustering and retention of many transporters and receptors such as potassium channels and PMCA4b, a P-type ion transport ATPase, among others. DLG4 is a member of the MAGUK (membrane-associated guanylate kinase) protein family, which is characterized by the presence of a core of three domains: PDZ, SH3, and guanylate kinase (GuK). The GuK domain in MAGUK proteins is enzymatically inactive; instead, the domain mediates protein-protein interactions and associates intramolecularly with the SH3 domain. DLG4 contains three PDZ domains. The SH3 domain of DLG4 binds and clusters the kainate subgroup of glutamate receptors via two proline-rich sequences in their C-terminal tail. It also binds AKAP79/150 (A-kinase anchoring protein). SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	66
212964	cd12031	SH3_DLG1	Src Homology 3 domain of Disks Large homolog 1. DLG1, also called synapse-associated protein 97 (SAP97), is a scaffolding protein that clusters at synapses and plays an important role in synaptic development and plasticity. DLG1 plays roles in regulating cell polarity, proliferation, migration, and cycle progression. It interacts with AMPA-type glutamate receptors and is critical in their maturation and delivery to synapses. It also interacts with PKCalpha and promotes wound healing. DLG1 is a member of the MAGUK (membrane-associated guanylate kinase) protein family, which is characterized by the presence of a core of three domains: PDZ, SH3, and guanylate kinase (GuK). The GuK domain in MAGUK proteins is enzymatically inactive; instead, the domain mediates protein-protein interactions and associates intramolecularly with the SH3 domain. DLG1 contains three PDZ domains. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	67
212965	cd12032	SH3_DLG2	Src Homology 3 domain of Disks Large homolog 2. DLG2, also called postsynaptic density-93 (PSD93) or Channel-associated protein of synapse-110 (chapsyn 110), is a scaffolding protein that clusters at synapses and plays an important role in synaptic development and plasticity. The DLG2 delta isoform binds inwardly rectifying potassium Kir2 channels, which determine resting membrane potential in neurons. It regulates the spatial and temporal distribution of Kir2 channels within neuronal membranes. DLG2 is a member of the MAGUK (membrane-associated guanylate kinase) protein family, which is characterized by the presence of a core of three domains: PDZ, SH3, and guanylate kinase (GuK). The GuK domain in MAGUK proteins is enzymatically inactive; instead, the domain mediates protein-protein interactions and associates intramolecularly with the SH3 domain. DLG2 contains three PDZ domains. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	74
212966	cd12033	SH3_MPP7	Src Homology 3 domain of Membrane Protein, Palmitoylated 7 (or MAGUK p55 subfamily member 7). MPP7 is a scaffolding protein that binds to DLG1 and promotes tight junction formation and epithelial cell polarity. Mutations in the MPP7 gene may be associated with the pathogenesis of diabetes and extreme bone mineral density. It is one of seven vertebrate homologs of the Drosophila Stardust protein, which is required in establishing cell polarity, and it contains two L27 domains followed by the core of three domains characteristic of MAGUK (membrane-associated guanylate kinase) proteins: PDZ, SH3, and guanylate kinase (GuK). The GuK domain in MAGUK proteins is enzymatically inactive; instead, the domain mediates protein-protein interactions and associates intramolecularly with the SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	61
212967	cd12034	SH3_MPP4	Src Homology 3 domain of Membrane Protein, Palmitoylated 4 (or MAGUK p55 subfamily member 4). MPP4, also called Disks Large homolog 6 (DLG6) or Amyotrophic lateral sclerosis 2 chromosomal region candidate gene 5 protein (ALS2CR5), is a retina-specific scaffolding protein that plays a role in organizing presynaptic protein complexes in the photoreceptor synapse, where it localizes to the plasma membrane. It is required in the proper localization of calcium ATPases and for maintenance of calcium homeostasis. MPP4 is one of seven vertebrate homologs of the Drosophila Stardust protein, which is required in establishing cell polarity, and it contains two L27 domains followed by the core of three domains characteristic of MAGUK (membrane-associated guanylate kinase) proteins: PDZ, SH3, and guanylate kinase (GuK). The GuK domain in MAGUK proteins is enzymatically inactive; instead, the domain mediates protein-protein interactions and associates intramolecularly with the SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	61
212968	cd12035	SH3_MPP1-like	Src Homology 3 domain of Membrane Protein, Palmitoylated 1 (or MAGUK p55 subfamily member 1)-like proteins. This subfamily includes MPP1, CASK (Calcium/calmodulin-dependent Serine protein Kinase), Caenorhabditis elegans lin-2, and similar proteins. MPP1 and CASK are scaffolding proteins from the MAGUK (membrane-associated guanylate kinase) protein family, which is characterized by the presence of a core of three domains: PDZ, SH3, and guanylate kinase (GuK). In addition, they also have the Hook (Protein 4.1 Binding) motif in between the SH3 and GuK domains. The GuK domain in MAGUK proteins is enzymatically inactive; instead, the domain mediates protein-protein interactions and associates intramolecularly with the SH3 domain. CASK and lin-2 also contain an N-terminal calmodulin-dependent kinase (CaMK)-like domain and two L27 domains. MPP1 is ubiquitously-expressed and plays roles in regulating neutrophil polarity, cell shape, hair cell development, and neural development and patterning of the retina. CASK is highly expressed in the mammalian nervous system and plays roles in synaptic protein targeting, neural development, and gene expression regulation. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	62
212969	cd12036	SH3_MPP5	Src Homology 3 domain of Membrane Protein, Palmitoylated 5 (or MAGUK p55 subfamily member 5). MPP5, also called PALS1 (Protein associated with Lin7) or Nagie oko protein in zebrafish or Stardust in Drosophila, is a scaffolding protein which associates with Crumbs homolog 1 (CRB1), CRB2, or CRB3 through its PDZ domain and with PALS1-associated tight junction protein (PATJ) or multi-PDZ domain protein 1 (MUPP1) through its L27 domain. The resulting tri-protein complexes are core proteins of the Crumbs complex, which localizes at tight junctions or subapical regions, and is involved in the maintenance of apical-basal polarity in epithelial cells and the morphogenesis and function of photoreceptor cells. MPP5 is critical for the proper stratification of the retina and is also expressed in T lymphocytes where it is important for TCR-mediated activation of NFkB. Drosophila Stardust exists in several isoforms, some of which show opposing functions in photoreceptor cells, which suggests that the relative ratio of different Crumbs complexes regulates photoreceptor homeostasis. MPP5 contains two L27 domains followed by the core of three domains characteristic of MAGUK (membrane-associated guanylate kinase) proteins: PDZ, SH3, and guanylate kinase (GuK). In addition, it also contains the Hook (Protein 4.1 Binding) motif in between the SH3 and GuK domains. The GuK domain in MAGUK proteins is enzymatically inactive; instead, the domain mediates protein-protein interactions and associates intramolecularly with the SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	63
212970	cd12037	SH3_MPP2	Src Homology 3 domain of Membrane Protein, Palmitoylated 2 (or MAGUK p55 subfamily member 2). MPP2 is a scaffolding protein that interacts with the non-receptor tyrosine kinase c-Src in epithelial cells to negatively regulate its activity and morphological function. It is one of seven vertebrate homologs of the Drosophila Stardust protein, which is required in establishing cell polarity, and it contains two L27 domains followed by the core of three domains characteristic of MAGUK (membrane-associated guanylate kinase) proteins: PDZ, SH3, and guanylate kinase (GuK). In addition, it also contains the Hook (Protein 4.1 Binding) motif in between the SH3 and GuK domains. The GuK domain in MAGUK proteins is enzymatically inactive; instead, the domain mediates protein-protein interactions and associates intramolecularly with the SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	59
212971	cd12038	SH3_MPP6	Src Homology 3 domain of Membrane Protein, Palmitoylated 6 (or MAGUK p55 subfamily member 6). MPP6, also called Veli-associated MAGUK 1 (VAM-1) or PALS2, is a scaffolding protein that binds to Veli-1, a homolog of Caenorhabditis Lin-7. It is one of seven vertebrate homologs of the Drosophila Stardust protein, which is required in establishing cell polarity, and it contains two L27 domains followed by the core of three domains characteristic of MAGUK (membrane-associated guanylate kinase) proteins: PDZ, SH3, and guanylate kinase (GuK). In addition, it also contains the Hook (Protein 4.1 Binding) motif in between the SH3 and GuK domains. The GuK domain in MAGUK proteins is enzymatically inactive; instead, the domain mediates protein-protein interactions and associates intramolecularly with the SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	61
212972	cd12039	SH3_MPP3	Src Homology 3 domain of Membrane Protein, Palmitoylated 3 (or MAGUK p55 subfamily member 3). MPP3 is a scaffolding protein that colocalizes with MPP5 and CRB1 at the subapical region adjacent to adherens junctions and may function in photoreceptor polarity. It interacts with some nectins and regulates their trafficking and processing. Nectins are cell-cell adhesion proteins involved in the establishment of apical-basal polarity at cell adhesion sites. It is one of seven vertebrate homologs of the Drosophila Stardust protein, which is required in establishing cell polarity, and it contains two L27 domains followed by the core of three domains characteristic of MAGUK (membrane-associated guanylate kinase) proteins: PDZ, SH3, and guanylate kinase (GuK). The GuK domain in MAGUK proteins is enzymatically inactive; instead, the domain mediates protein-protein interactions and associates intramolecularly with the SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	62
212973	cd12040	SH3_CACNB2	Src Homology 3 domain of Voltage-dependent L-type calcium channel subunit beta2. The beta2 subunit of voltage-dependent calcium channels (Ca(V)s) is one of four beta subunits present in vertebrates. It is expressed in the heart and is present in specific neuronal cells including cerebellar Purkinje cells, hippocampal pyramidal neurons, and photoreceptors. Knockout of the beta2 gene in mice results in embryonic lethality, demonstrating its importance in development. Ca(V)s are multi-protein complexes that regulate the entry of calcium into cells. They impact muscle contraction, neuronal migration, hormone and neurotransmitter release, and the activation of calcium-dependent signaling pathways. They are composed of four subunits: alpha1, alpha2delta, beta, and gamma. The beta subunit is a soluble and intracellular protein that interacts with the transmembrane alpha1 subunit. It facilitates the trafficking and proper localization of the alpha1 subunit to the cellular plasma membrane. Vertebrates contain four different beta subunits from distinct genes (beta1-4); each exists as multiple splice variants. All are expressed in the brain while other tissues show more specific expression patterns. The beta subunits show similarity to MAGUK (membrane-associated guanylate kinase) proteins in that they contain SH3 and inactive guanylate kinase (GuK) domains; however, they do not appear to contain a PDZ domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	69
212974	cd12041	SH3_CACNB1	Src Homology 3 domain of Voltage-dependent L-type calcium channel subunit beta-1. The beta1 subunit of voltage-dependent calcium channels (Ca(V)s) is one of four beta subunits present in vertebrates. It is the only beta subunit, as the beta1a variant, expressed in skeletal muscle; the beta1b variant is also widely expressed in other tissues including the heart and brain. Knockout of the beta1 gene in mice results in embryonic lethality, demonstrating its importance in development. Ca(V)s are multi-protein complexes that regulate the entry of calcium into cells. They impact muscle contraction, neuronal migration, hormone and neurotransmitter release, and the activation of calcium-dependent signaling pathways. They are composed of four subunits: alpha1, alpha2delta, beta, and gamma. The beta subunit is a soluble and intracellular protein that interacts with the transmembrane alpha1 subunit. It facilitates the trafficking and proper localization of the alpha1 subunit to the cellular plasma membrane. Vertebrates contain four different beta subunits from distinct genes (beta1-4); each exists as multiple splice variants. All are expressed in the brain while other tissues show more specific expression patterns. The beta subunits show similarity to MAGUK (membrane-associated guanylate kinase) proteins in that they contain SH3 and inactive guanylate kinase (GuK) domains; however, they do not appear to contain a PDZ domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	68
212975	cd12042	SH3_CACNB3	Src Homology 3 domain of Voltage-dependent L-type calcium channel subunit beta3. The beta3 subunit of voltage-dependent calcium channels (Ca(V)s) is one of four beta subunits present in vertebrates. It is the main beta subunit present in smooth muscles and is strongly expressed in the brain; it is predominant in the olfactory bulb, cortex, and hippocampus. It may play a role in regulating the NMDAR (N-methyl-d-aspartate receptor) activity in the hippocampus and thus, activity-dependent synaptic plasticity and cognitive behaviors. Ca(V)s are multi-protein complexes that regulate the entry of calcium into cells. They impact muscle contraction, neuronal migration, hormone and neurotransmitter release, and the activation of calcium-dependent signaling pathways. They are composed of four subunits: alpha1, alpha2delta, beta, and gamma. The beta subunit is a soluble and intracellular protein that interacts with the transmembrane alpha1 subunit. It facilitates the trafficking and proper localization of the alpha1 subunit to the cellular plasma membrane. Vertebrates contain four different beta subunits from distinct genes (beta1-4); each exists as multiple splice variants. All are expressed in the brain while other tissues show more specific expression patterns. The beta subunits show similarity to MAGUK (membrane-associated guanylate kinase) proteins in that they contain SH3 and inactive guanylate kinase (GuK) domains; however, they do not appear to contain a PDZ domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	68
212976	cd12043	SH3_CACNB4	Src Homology 3 domain of Voltage-dependent L-type calcium channel subunit beta4. The beta4 subunit of voltage-dependent calcium channels (Ca(V)s) is one of four beta subunits present in vertebrates. It is the only beta subunit expressed in the cochlea and is highly expressed in the brain, predominantly in the cerebellum. Ca(V)s are multi-protein complexes that regulate the entry of calcium into cells. They impact muscle contraction, neuronal migration, hormone and neurotransmitter release, and the activation of calcium-dependent signaling pathways. They are composed of four subunits: alpha1, alpha2delta, beta, and gamma. The beta subunit is a soluble and intracellular protein that interacts with the transmembrane alpha1 subunit. It facilitates the trafficking and proper localization of the alpha1 subunit to the cellular plasma membrane. Vertebrates contain four different beta subunits from distinct genes (beta1-4); each exists as multiple splice variants. All are expressed in the brain while other tissues show more specific expression patterns. The beta subunits show similarity to MAGUK (membrane-associated guanylate kinase) proteins in that they contain SH3 and inactive guanylate kinase (GuK) domains; however, they do not appear to contain a PDZ domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	68
212977	cd12044	SH3_SKAP1	Src Homology 3 domain of Src Kinase-Associated Phosphoprotein 1. SKAP1, also called SKAP55 (Src kinase-associated protein of 55kDa), is an immune cell-specific adaptor protein that plays an important role in T-cell adhesion, migration, and integrin clustering. It is expressed exclusively in T-lymphocytes, mast cells, and macrophages. Binding partners include ADAP (adhesion and degranulation-promoting adaptor protein), Fyn, Riam, RapL, and RasGRP. It contains a pleckstrin homology (PH) domain, a C-terminal SH3 domain, and several tyrosine phosphorylation sites. The SH3 domain of SKAP1 is necessary for its ability to regulate T-cell conjugation with antigen-presenting cells and the formation of LFA-1 clusters. SKAP1 binds primarily to a proline-rich region of ADAP through its SH3 domain; its degradation is regulated by ADAP. A secondary interaction occurs via the ADAP SH3 domain and the RKxxYxxY motif in SKAP1. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212978	cd12045	SH3_SKAP2	Src Homology 3 domain of Src Kinase-Associated Phosphoprotein 2. SKAP2, also called SKAP55-Related (SKAP55R) or SKAP55 homolog (SKAP-HOM or SKAP55-HOM), is an immune cell-specific adaptor protein that plays an important role in adhesion and migration of B-cells and macrophages. Binding partners include ADAP (adhesion and degranulation-promoting adaptor protein), YopH, SHPS1, and HPK1. SKAP2 has also been identified as a substrate for lymphoid-specific tyrosine phosphatase (Lyp), which has been implicated in a wide variety of autoimmune diseases. It contains a pleckstrin homology (PH) domain, a C-terminal SH3 domain, and several tyrosine phosphorylation sites. Like SKAP1, SKAP2 is expected to bind primarily to a proline-rich region of ADAP through its SH3 domain; its degradation may be regulated by ADAP. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212979	cd12046	SH3_p67phox_C	C-terminal (or second) Src Homology 3 domain of the p67phox subunit of NADPH oxidase. p67phox, also called Neutrophil cytosol factor 2 (NCF-2), is a cytosolic subunit of the phagocytic NADPH oxidase complex (also called Nox2 or gp91phox) which plays a crucial role in the cellular response to bacterial infection. NADPH oxidase catalyzes the transfer of electrons from NADPH to oxygen during phagocytosis forming superoxide and reactive oxygen species. p67phox plays a regulatory role and contains N-terminal TPR, first SH3 (or N-terminal or central SH3), PB1, and C-terminal SH3 domains. It binds, via its C-terminal SH3 domain, to a proline-rich region of p47phox and upon activation, this complex assembles with flavocytochrome b558, the Nox2-p22phox heterodimer. Concurrently, RacGTP translocates to the membrane and interacts with the TPR domain of p67phox, which leads to the activation of NADPH oxidase. The PB1 domain of p67phox binds to its partner PB1 domain in p40phox, and this facilitates the assembly of p47phox-p67phox at the membrane. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212980	cd12047	SH3_Noxa1_C	C-terminal Src Homology 3 domain of NADPH oxidase activator 1. Noxa1 is a homolog of p67phox and is a cytosolic subunit of the nonphagocytic NADPH oxidase complex Nox1, which catalyzes the transfer of electrons from NADPH to molecular oxygen to form superoxide. Noxa1 is co-expressed with Nox1 in colon, stomach, uterus, prostate, and vascular smooth muscle cells, consistent with its regulatory role. It does not interact with p40phox, unlike p67phox, making Nox1 activity independent of p40phox, unlike Nox2. Noxa1 contains TPR, PB1, and C-terminal SH3 domains, but lacks the central SH3 domain that is present in p67phox. The TPR domain binds activated GTP-bound Rac. The C-terminal SH3 domain binds the polyproline motif found at the C-terminus of Noxo1, a homolog of p47phox. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212981	cd12048	SH3_DOCK3_B	Src Homology 3 domain of Class B Dedicator of Cytokinesis 3. Dock3, also called modifier of cell adhesion (MOCA) and presenilin binding protein (PBP), is a class B DOCK and is an atypical guanine nucleotide exchange factor (GEF) that lacks the conventional Dbl homology (DH) domain. It regulates N-cadherin dependent cell-cell adhesion, cell polarity, and neuronal morphology. It promotes axonal growth by stimulating actin polymerization and microtubule assembly. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). The DHR-1 domain binds phosphatidylinositol-3,4,5-triphosphate while DHR-2 contains the catalytic activity for Rac and/or Cdc42. Class B DOCKs also contain an SH3 domain at the N-terminal region and a PxxP motif at the C-terminus; Dock3 is a specific GEF for Rac. The SH3 domain of Dock3 binds to DHR-2 in an autoinhibitory manner; binding of the scaffold protein Elmo to the SH3 domain of Dock3 exposes the DHR-2 domain and promotes GEF activity. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	56
212982	cd12049	SH3_DOCK4_B	Src Homology 3 domain of Class B Dedicator of Cytokinesis 4. Dock4 is a class B DOCK and is an atypical guanine nucleotide exchange factor (GEF) that lacks the conventional Dbl homology (DH) domain. It plays a role in regulating dendritic growth and branching in hippocampal neurons, where it is highly expressed. It may also regulate spine morphology and synapse formation. Dock4 activates the Ras family GTPase Rap1, probably indirectly through interaction with Rap regulatory proteins. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). The DHR-1 domain binds phosphatidylinositol-3,4,5-triphosphate while DHR-2 contains the catalytic activity for Rac and/or Cdc42. Class B DOCKs also contain an SH3 domain at the N-terminal region and a PxxP motif at the C-terminus. The SH3 domain of Dock4 binds to DHR-2 in an autoinhibitory manner; binding of the scaffold protein Elmo to the SH3 domain of Dock4 exposes the DHR-2 domain and promotes GEF activity. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	56
212983	cd12050	SH3_DOCK2_A	Src Homology 3 domain of Class A Dedicator of Cytokinesis protein 2. Dock2 is a hematopoietic cell-specific, class A DOCK and is an atypical guanine nucleotide exchange factor (GEF) that lacks the conventional Dbl homology (DH) domain. It plays an important role in lymphocyte migration and activation, T-cell differentiation, neutrophil chemotaxis, and type I interferon induction. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). The DHR-1 domain binds phosphatidylinositol-3,4,5-triphosphate while DHR-2 contains the catalytic activity for Rac and/or Cdc42. Class A DOCKs also contain an SH3 domain at the N-terminal region and a PxxP motif at the C-terminus; they are specific GEFs for Rac. The SH3 domain of Dock2 binds to DHR-2 in an autoinhibitory manner; binding of the scaffold protein Elmo to the SH3 domain of Dock2 exposes the DHR-2 domain and promotes GEF activity. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	56
212984	cd12051	SH3_DOCK1_5_A	Src Homology 3 domain of Class A Dedicator of Cytokinesis proteins 1 and 5. Dock1, also called Dock180, and Dock5 are class A DOCKs and are atypical guanine nucleotide exchange factors (GEFs) that lack the conventional Dbl homology (DH) domain. Dock1 interacts with the scaffold protein Elmo and the resulting complex functions upstream of Rac in many biological events including phagocytosis of apoptotic cells, cell migration and invasion. Dock5 functions upstream of Rac1 to regulate osteoclast function. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). The DHR-1 domain binds phosphatidylinositol-3,4,5-triphosphate while DHR-2 contains the catalytic activity for Rac and/or Cdc42. Class A DOCKs also contain an SH3 domain at the N-terminal region and a PxxP motif at the C-terminus; they are specific GEFs for Rac. The SH3 domain of Dock1 binds to DHR-2 in an autoinhibitory manner; binding of Elmo to the SH3 domain of Dock1 exposes the DHR-2 domain and promotes GEF activity. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	56
212985	cd12052	SH3_CIN85_1	First Src Homology 3 domain (SH3A) of Cbl-interacting protein of 85 kDa. CIN85, also called SH3 domain-containing kinase-binding protein 1 (SH3KBP1) or CD2-binding protein 3 (CD2BP3) or Ruk, is an adaptor protein that is involved in the downregulation of receptor tyrosine kinases by facilitating endocytosis through interaction with endophilin-associated ubiquitin ligase Cbl proteins. It is also important in many other cellular processes including vesicle-mediated transport, cytoskeletal remodelling, apoptosis, cell adhesion and migration, and viral infection, among others. CIN85 exists as multiple variants from alternative splicing; the main variant contains three SH3 domains, a proline-rich region, and a C-terminal coiled-coil domain. All of these domains enable CIN85 to bind various protein partners and assemble complexes that have been implicated in many different functions. This alignment model represents the first SH3 domain (SH3A) of CIN85; SH3A binds to internal proline-rich motifs within the proline-rich region. This intramolecular interaction serves as a regulatory mechanism to keep CIN85 in a closed conformation, preventing the recruitment of other proteins. SH3A has also been shown to bind ubiquitin and to an atypical PXXXPR motif at the C-terminus of Cbl and the cytoplasmic end of the cell adhesion protein CD2. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212986	cd12053	SH3_CD2AP_1	First Src Homology 3 domain (SH3A) of CD2-associated protein. CD2AP, also called CMS (Cas ligand with Multiple SH3 domains) or METS1 (Mesenchyme-to-Epithelium Transition protein with SH3 domains), is a cytosolic adaptor protein that plays a role in regulating the cytoskeleton. It is critical in cell-to-cell union necessary for kidney function. It also stabilizes the contact between a T cell and antigen-presenting cells. It is primarily expressed in podocytes at the cytoplasmic face of the slit diaphragm and serves as a linker anchoring podocin and nephrin to the actin cytoskeleton. CD2AP contains three SH3 domains, a proline-rich region, and a C-terminal coiled-coil domain. All of these domains enable CD2AP to bind various protein partners and assemble complexes that have been implicated in many different functions. This alignment model represents the first SH3 domain (SH3A) of CD2AP. SH3A binds to the PXXXPR motif present in c-Cbl and the cytoplasmic domain of cell adhesion protein CD2. Its interaction with CD2 anchors CD2 at sites of cell contact. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	56
212987	cd12054	SH3_CD2AP_2	Second Src Homology 3 domain (SH3B) of CD2-associated protein. CD2AP, also called CMS (Cas ligand with Multiple SH3 domains) or METS1 (Mesenchyme-to-Epithelium Transition protein with SH3 domains), is a cytosolic adaptor protein that plays a role in regulating the cytoskeleton. It is critical in cell-to-cell union necessary for kidney function. It also stabilizes the contact between a T cell and antigen-presenting cells. It is primarily expressed in podocytes at the cytoplasmic face of the slit diaphragm and serves as a linker anchoring podocin and nephrin to the actin cytoskeleton. CD2AP contains three SH3 domains, a proline-rich region, and a C-terminal coiled-coil domain. All of these domains enable CD2AP to bind various protein partners and assemble complexes that have been implicated in many different functions. This alignment model represents the second SH3 domain (SH3B) of CD2AP. SH3B binds to c-Cbl in a site (TPSSRPLR is the core binding motif) distinct from the c-Cbl/SH3A binding site. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
212988	cd12055	SH3_CIN85_2	Second Src Homology 3 domain (SH3B) of Cbl-interacting protein of 85 kDa. CIN85, also called SH3 domain-containing kinase-binding protein 1 (SH3KBP1) or CD2-binding protein 3 (CD2BP3) or Ruk, is an adaptor protein that is involved in the downregulation of receptor tyrosine kinases by facilitating endocytosis through interaction with endophilin-associated ubiquitin ligase Cbl proteins. It is also important in many other cellular processes including vesicle-mediated transport, cytoskeletal remodelling, apoptosis, cell adhesion and migration, and viral infection, among others. CIN85 exists as multiple variants from alternative splicing; the main variant contains three SH3 domains, a proline-rich region, and a C-terminal coiled-coil domain. All of these domains enable CIN85 to bind various protein partners and assemble complexes that have been implicated in many different functions. This alignment model represents the second SH3 domain (SH3B) of CIN85. SH3B has been shown to bind Cbl proline-rich peptides and ubiquitin. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
212989	cd12056	SH3_CD2AP_3	Third Src Homology 3 domain (SH3C) of CD2-associated protein. CD2AP, also called CMS (Cas ligand with Multiple SH3 domains) or METS1 (Mesenchyme-to-Epithelium Transition protein with SH3 domains), is a cytosolic adaptor protein that plays a role in regulating the cytoskeleton. It is critical in cell-to-cell union necessary for kidney function. It also stabilizes the contact between a T cell and antigen-presenting cells. It is primarily expressed in podocytes at the cytoplasmic face of the slit diaphragm and serves as a linker anchoring podocin and nephrin to the actin cytoskeleton. CD2AP contains three SH3 domains, a proline-rich region, and a C-terminal coiled-coil domain. All of these domains enable CD2AP to bind various protein partners and assemble complexes that have been implicated in many different functions. This alignment model represents the third SH3 domain (SH3C) of CD2AP. SH3C has been shown to bind ubiquitin. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	57
212990	cd12057	SH3_CIN85_3	Third Src Homology 3 domain (SH3C) of Cbl-interacting protein of 85 kDa. CIN85, also called SH3 domain-containing kinase-binding protein 1 (SH3KBP1) or CD2-binding protein 3 (CD2BP3) or Ruk, is an adaptor protein that is involved in the downregulation of receptor tyrosine kinases by facilitating endocytosis through interaction with endophilin-associated ubiquitin ligase Cbl proteins. It is also important in many other cellular processes including vesicle-mediated transport, cytoskeletal remodelling, apoptosis, cell adhesion and migration, and viral infection, among others. CIN85 exists as multiple variants from alternative splicing; the main variant contains three SH3 domains, a proline-rich region, and a C-terminal coiled-coil domain. All of these domains enable CIN85 to bind various protein partners and assemble complexes that have been implicated in many different functions. This alignment model represents the third SH3 domain (SH3C) of CIN85. SH3C has been shown to bind ubiquitin. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	56
212991	cd12058	SH3_MLK4	Src Homology 3 domain of Mixed Lineage Kinase 4. MLK4 is a Serine/Threonine Kinase (STK), catalyzing the transfer of the gamma-phosphoryl group from ATP to S/T residues on protein substrates. MLKs act as mitogen-activated protein kinase kinase kinases (MAP3Ks, MKKKs, MAPKKKs), which phosphorylate and activate MAPK kinases (MAPKKs or MKKs or MAP2Ks), which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. MLKs play roles in immunity and inflammation, as well as in cell death, proliferation, and cell cycle regulation. The specific function of MLK4 is yet to be determined. Mutations in the kinase domain of MLK4 have been detected in colorectal cancers. MLK4 contains an SH3 domain, a catalytic kinase domain, a leucine zipper, a proline-rich region, and a CRIB domain that mediates binding to GTP-bound Cdc42 and Rac. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	58
212992	cd12059	SH3_MLK1-3	Src Homology 3 domain of Mixed Lineage Kinases 1, 2, and 3. MLKs 1, 2, and 3 are Serine/Threonine Kinases (STKs), catalyzing the transfer of the gamma-phosphoryl group from ATP to S/T residues on protein substrates. MLKs act as mitogen-activated protein kinase kinase kinases (MAP3Ks, MKKKs, MAPKKKs), which phosphorylate and activate MAPK kinases (MAPKKs or MKKs or MAP2Ks), which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. MLKs play roles in immunity and inflammation, as well as in cell death, proliferation, and cell cycle regulation. Little is known about the specific function of MLK1, also called MAP3K9. It is capable of activating the c-Jun N-terminal kinase pathway. Mice lacking both MLK1 and MLK2 are viable, fertile, and have normal life spans. MLK2, also called MAP3K10, is abundant in brain, skeletal muscle, and testis. It functions upstream of the MAPK, c-Jun N-terminal kinase. It binds hippocalcin, a calcium-sensor protein that protects neurons against calcium-induced cell death. Both MLK2 and hippocalcin may be associated with the pathogenesis of Parkinson's disease. MLK3, also called MAP3K11, is highly expressed in breast cancer cells and its signaling through c-Jun N-terminal kinase has been implicated in the migration, invasion, and malignancy of cancer cells. It also functions as a negative regulator of Inhibitor of Nuclear Factor-KappaB Kinase (IKK) and thus, impacts inflammation and immunity. MLKs contain an SH3 domain, a catalytic kinase domain, a leucine zipper, a proline-rich region, and a CRIB domain that mediates binding to GTP-bound Cdc42 and Rac. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	58
212993	cd12060	SH3_alphaPIX	Src Homology 3 domain of alpha-Pak Interactive eXchange factor. Alpha-PIX, also called Rho guanine nucleotide exchange factor 6 (ARHGEF6) or Cool (Cloned out of Library)-2, activates small GTPases by exchanging bound GDP for free GTP. It acts as a GEF for both Cdc42 and Rac1, and is localized in dendritic spines where it regulates spine morphogenesis. It controls dendritic length and spine density in the hippocampus. Mutations in the ARHGEF6 gene cause X-linked intellectual disability in humans. PIX proteins contain an N-terminal SH3 domain followed by RhoGEF (also called Dbl-homologous or DH) and Pleckstrin Homology (PH) domains, and a C-terminal leucine-zipper domain for dimerization. The SH3 domain of PIX binds to an atypical PxxxPR motif in p21-activated kinases (PAKs) with high affinity. The binding of PAKs to PIX facilitates the localization of PAKs to focal complexes and also localizes PAKs to PIX targets Cdc42 and Rac, leading to the activation of PAKs. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	58
212994	cd12061	SH3_betaPIX	Src Homology 3 domain of beta-Pak Interactive eXchange factor. Beta-PIX, also called Rho guanine nucleotide exchange factor 7 (ARHGEF7) or Cool (Cloned out of Library)-1, activates small GTPases by exchanging bound GDP for free GTP. It acts as a GEF for both Cdc42 and Rac1, and plays important roles in regulating neuroendocrine exocytosis, focal adhesion maturation, cell migration, synaptic vesicle localization, and insulin secretion. PIX proteins contain an N-terminal SH3 domain followed by RhoGEF (also called Dbl-homologous or DH) and Pleckstrin Homology (PH) domains, and a C-terminal leucine-zipper domain for dimerization. The SH3 domain of PIX binds to an atypical PxxxPR motif in p21-activated kinases (PAKs) with high affinity. The binding of PAKs to PIX facilitates the localization of PAKs to focal complexes and also localizes PAKs to PIX targets Cdc42 and Rac, leading to the activation of PAKs. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	54
212995	cd12062	SH3_Caskin1	Src Homology 3 domain of CASK interacting protein 1. Caskin1 is a multidomain adaptor protein that contains six ankyrin repeats, a single SH3 domain, tandem sterile alpha motif (SAM) domains, and a long disordered proline-rich region. It is expressed at high levels in the brain and is localized in presynaptic regions. It binds to the multidomain scaffolding protein CASK through the CaMK domain in competition with Munc-interacting protein 1 (Mint1). CASK participates in one of two evolutionarily conserved tripartite complexes containing either Mint1 and Velis or Caskin1 and Velis. Caskin1 may play a role in infantile myoclonic epilepsy. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediate multiprotein complex assemblies.	62
212996	cd12063	SH3_Caskin2	Src Homology 3 domain of CASK interacting protein 2. Caskin2 is a multidomain adaptor protein that contains six ankyrin repeats, a single SH3 domain, tandem sterile alpha motif (SAM) domains, and a long disordered proline-rich region. It shares a domain architecture with Caskin1, but does not bind CASK. The function of Caskin2 is still unknown. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediate multiprotein complex assemblies.	62
212997	cd12064	SH3_GRAF	Src Homology 3 domain of GTPase Regulator Associated with Focal adhesion kinase. GRAF, also called Rho GTPase activating protein 26 (ARHGAP26), Oligophrenin-1-like (OPHN1L) or GRAF1, is a GAP with activity towards RhoA and Cdc42 and is only weakly active towards Rac1. It influences Rho-mediated cytoskeletal rearrangements and binds focal adhesion kinase (FAK), which is a critical component of integrin signaling. It is essential for the major clathrin-independent endocytic pathway mediated by pleiomorphic membranes. GRAF contains an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, a Rho GAP domain, and a C-terminal SH3 domain. The SH3 domain of GRAF binds PKNbeta, a target of the small GTPase Rho. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediate multiprotein complex assemblies.	56
212998	cd12065	SH3_GRAF2	Src Homology 3 domain of GTPase Regulator Associated with Focal adhesion kinase 2. GRAF2, also called Rho GTPase activating protein 10 (ARHGAP10) or PS-GAP, is a GAP with activity towards Cdc42 and RhoA. It regulates caspase-activated p21-activated protein kinase-2 (PAK-2p34). GRAF2 interacts with PAK-2p34, leading to its stabilization and decrease of cell death. It is highly expressed in skeletal muscle, and is involved in alpha-catenin recruitment at cell-cell junctions. GRAF2 contains an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, a Rho GAP domain, and a C-terminal SH3 domain. The SH3 domain of GRAF binds PKNbeta, a target of the small GTPase Rho. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediate multiprotein complex assemblies.	54
212999	cd12066	SH3_GRAF3	Src Homology 3 domain of GTPase Regulator Associated with Focal adhesion kinase 3. GRAF3 is also called Rho GTPase activating protein 42 (ARHGAP42) or ARHGAP10-like. Though its function has not been characterized, it may be a GAP with activity towards RhoA and Cdc42, based on its similarity to GRAF and GRAF2. It contains an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, a Rho GAP domain, and a C-terminal SH3 domain. The SH3 domain of GRAF and GRAF2 binds PKNbeta, a target of the small GTPase Rho. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediate multiprotein complex assemblies.	55
213000	cd12067	SH3_MYO15A	Src Homology 3 domain of Myosin XVa. Myosin XVa is an unconventional myosin that is critical for the normal growth of mechanosensory stereocilia of inner ear hair cells. Mutations in the myosin XVa gene are associated with nonsyndromic hearing loss. Myosin XVa contains a unique N-terminal extension followed by a motor domain, light chain-binding IQ motifs, and a tail consisting of a pair of MyTH4-FERM tandems separated by a SH3 domain, and a PDZ domain. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediate multiprotein complex assemblies.	80
213001	cd12068	SH3_MYO15B	Src Homology 3 domain of Myosin XVb. Myosin XVb, also called KIAA1783, was named based on its similarity with myosin XVa. It is a transcribed and unprocessed pseudogene whose predicted amino acid sequence contains mutated or deleted amino acid residues that are normally conserved and important for myosin function. The related myosin XVa is important for normal growth of mechanosensory stereocilia of inner ear hair cells. Myosin XVa contains a unique N-terminal extension followed by a motor domain, light chain-binding IQ motifs, and a tail consisting of a pair of MyTH4-FERM tandems separated by a SH3 domain, and a PDZ domain. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediate multiprotein complex assemblies.	55
213002	cd12069	SH3_ARHGAP27	Src Homology 3 domain of Rho GTPase-activating protein 27. Rho GTPase-activating proteins (RhoGAPs or ARHGAPs) bind to Rho proteins and enhance the hydrolysis rates of bound GTP. ARHGAP27, also called CAMGAP1, shows GAP activity towards Rac1 and Cdc42. It binds the adaptor protein CIN85 and may play a role in clathrin-mediated endocytosis. It contains SH3, WW, Pleckstrin homology (PH), and RhoGAP domains. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediate multiprotein complex assemblies.	57
213003	cd12070	SH3_ARHGAP12	Src Homology 3 domain of Rho GTPase-activating protein 12. Rho GTPase-activating proteins (RhoGAPs or ARHGAPs) bind to Rho proteins and enhance the hydrolysis rates of bound GTP. ARHGAP12 has been shown to display GAP activity towards Rac1. It plays a role in regulating hepatocyte growth factor (HGF)-driven cell growth and invasiveness. It contains SH3, WW, Pleckstrin homology (PH), and RhoGAP domains. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediate multiprotein complex assemblies.	60
213004	cd12071	SH3_FBP17	Src Homology 3 domain of Formin Binding Protein 17. Formin Binding Protein 17 (FBP17), also called FormiN Binding Protein 1 (FNBP1), is involved in dynamin-mediated endocytosis. It is recruited to clathrin-coated pits late in the endocytosis process and may play a role in the invagination and scission steps. FBP17 binds in vivo to tankyrase, a protein involved in telomere maintenance and mitogen activated protein kinase (MAPK) signaling. It contains an N-terminal F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs) domain, a Cdc42-binding HR1 domain, and a C-terminal SH3 domain. The SH3 domain of the related protein, CIP4, associates with Gapex-5, a Rab31 GEF. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	57
213005	cd12072	SH3_FNBP1L	Src Homology 3 domain of Formin Binding Protein 1-Like. FormiN Binding Protein 1-Like (FNBP1L), also known as Toca-1 (Transducer of Cdc42-dependent actin assembly), forms a complex with neural Wiskott-Aldrich syndrome protein (N-WASP). The FNBP1L/N-WASP complex induces the formation of filopodia and endocytic vesicles. FNBP1L is required for Cdc42-induced actin assembly and is essential for autophagy of intracellular pathogens. It contains an N-terminal F-BAR domain, a central Cdc42-binding HR1 domain, and a C-terminal SH3 domain. The SH3 domain of the related protein, CIP4, associates with Gapex-5, a Rab31 GEF. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	57
213006	cd12073	SH3_HS1	Src homology 3 domain of Hematopoietic lineage cell-specific protein 1. HS1, also called HCLS1 (hematopoietic cell-specific Lyn substrate 1), is a cortactin homolog expressed specifically in hematopoietic cells. It is an actin regulatory protein that binds the Arp2/3 complex and stabilizes branched actin filaments. It is required for cell spreading and signaling in lymphocytes. It regulates cytoskeletal remodeling that controls lymphocyte trafficking, and it also affects tissue invasion and infiltration of leukemic B cells. Like cortactin, HS1 contains an N-terminal acidic domain, several copies of a repeat domain found in cortactin and HS1, a proline-rich region, and a C-terminal SH3 domain. The N-terminal region binds the Arp2/3 complex and F-actin, while the C-terminal region acts as an adaptor or scaffold that can connect varied proteins that bind the SH3 domain within the actin network. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
213007	cd12074	SH3_Tks5_1	First Src homology 3 domain of Tyrosine kinase substrate with five SH3 domains. Tks5, also called SH3 and PX domain-containing protein 2A (SH3PXD2A) or Five SH (FISH), is a scaffolding protein and Src substrate that is localized in podosomes, which are electron-dense structures found in Src-transformed fibroblasts, osteoclasts, macrophages, and some invasive cancer cells. It binds and regulates some members of the ADAMs family of transmembrane metalloproteases, which function as sheddases and mediators of cell and matrix interactions. It is required for podosome formation, degradation of the extracellular matrix, and cancer cell invasion. Tks5 contains an N-terminal Phox homology (PX) domain and five SH3 domains. This model characterizes the first SH3 domain of Tks5. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
213008	cd12075	SH3_Tks4_1	First Src homology 3 domain of Tyrosine kinase substrate with four SH3 domains. Tks4, also called SH3 and PX domain-containing protein 2B (SH3PXD2B) or HOFI, is a Src substrate and scaffolding protein that plays an important role in the formation of podosomes and invadopodia, the dynamic actin-rich structures that are related to cell migration and cancer cell invasion. It is required in the formation of functional podosomes, EGF-induced membrane ruffling, and lamellipodia generation. It plays an important role in cellular attachment and cell spreading. Tks4 is essential for the localization of MT1-MMP (membrane-type 1 matrix metalloproteinase) to invadopodia. It contains an N-terminal Phox homology (PX) domain and four SH3 domains. This model characterizes the first SH3 domain of Tks4. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
213009	cd12076	SH3_Tks4_2	Second Src homology 3 domain of Tyrosine kinase substrate with four SH3 domains. Tks4, also called SH3 and PX domain-containing protein 2B (SH3PXD2B) or HOFI, is a Src substrate and scaffolding protein that plays an important role in the formation of podosomes and invadopodia, the dynamic actin-rich structures that are related to cell migration and cancer cell invasion. It is required in the formation of functional podosomes, EGF-induced membrane ruffling, and lamellipodia generation. It plays an important role in cellular attachment and cell spreading. Tks4 is essential for the localization of MT1-MMP (membrane-type 1 matrix metalloproteinase) to invadopodia. It contains an N-terminal Phox homology (PX) domain and four SH3 domains. This model characterizes the second SH3 domain of Tks4. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	54
213010	cd12077	SH3_Tks5_2	Second Src homology 3 domain of Tyrosine kinase substrate with five SH3 domains. Tks5, also called SH3 and PX domain-containing protein 2A (SH3PXD2A) or Five SH (FISH), is a scaffolding protein and Src substrate that is localized in podosomes, which are electron-dense structures found in Src-transformed fibroblasts, osteoclasts, macrophages, and some invasive cancer cells. It binds and regulates some members of the ADAMs family of transmembrane metalloproteases, which function as sheddases and mediators of cell and matrix interactions. It is required for podosome formation, degradation of the extracellular matrix, and cancer cell invasion. Tks5 contains an N-terminal Phox homology (PX) domain and five SH3 domains. This model characterizes the second SH3 domain of Tks5. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	54
213011	cd12078	SH3_Tks4_3	Third Src homology 3 domain of Tyrosine kinase substrate with four SH3 domains. Tks4, also called SH3 and PX domain-containing protein 2B (SH3PXD2B) or HOFI, is a Src substrate and scaffolding protein that plays an important role in the formation of podosomes and invadopodia, the dynamic actin-rich structures that are related to cell migration and cancer cell invasion. It is required in the formation of functional podosomes, EGF-induced membrane ruffling, and lamellipodia generation. It plays an important role in cellular attachment and cell spreading. Tks4 is essential for the localization of MT1-MMP (membrane-type 1 matrix metalloproteinase) to invadopodia. It contains an N-terminal Phox homology (PX) domain and four SH3 domains. This model characterizes the third SH3 domain of Tks4. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	53
213012	cd12079	SH3_Tks5_3	Third Src homology 3 domain of Tyrosine kinase substrate with five SH3 domains. Tks5, also called SH3 and PX domain-containing protein 2A (SH3PXD2A) or Five SH (FISH), is a scaffolding protein and Src substrate that is localized in podosomes, which are electron-dense structures found in Src-transformed fibroblasts, osteoclasts, macrophages, and some invasive cancer cells. It binds and regulates some members of the ADAMs family of transmembrane metalloproteases, which function as sheddases and mediators of cell and matrix interactions. It is required for podosome formation, degradation of the extracellular matrix, and cancer cell invasion. Tks5 contains an N-terminal Phox homology (PX) domain and five SH3 domains. This model characterizes the third SH3 domain of Tks5. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	54
213013	cd12080	SH3_MPP1	Src Homology 3 domain of Membrane Protein, Palmitoylated 1 (or MAGUK p55 subfamily member 1). MPP1, also called 55 kDa erythrocyte membrane protein (p55), is a ubiquitously-expressed scaffolding protein that plays roles in regulating neutrophil polarity, cell shape, hair cell development, and neural development and patterning of the retina. It was originally identified as an erythrocyte protein that stabilizes the actin cytoskeleton to the plasma membrane by forming a complex with 4.1R protein and glycophorin C. MPP1 is one of seven vertebrate homologs of the Drosophila Stardust protein, which is required in establishing cell polarity, and it contains the three domains characteristic of MAGUK (membrane-associated guanylate kinase) proteins: PDZ, SH3, and guanylate kinase (GuK). In addition, it also contains the Hook (Protein 4.1 Binding) motif in between the SH3 and GuK domains. The GuK domain in MAGUK proteins is enzymatically inactive; instead, the domain mediates protein-protein interactions and associates intramolecularly with the SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	62
213014	cd12081	SH3_CASK	Src Homology 3 domain of Calcium/calmodulin-dependent Serine protein Kinase. CASK is a scaffolding protein that is highly expressed in the mammalian nervous system and plays roles in synaptic protein targeting, neural development, and gene expression regulation. CASK interacts with many different binding partners including parkin, neurexin, syndecans, calcium channel proteins, caskin, among others, to perform specific functions in different subcellular locations. Disruption of the CASK gene in mice results in neonatal lethality while mutations in the human gene have been associated with X-linked mental retardation. Drosophila CASK is associated with both pre- and postsynaptic membranes and is crucial in synaptic transmission and vesicle cycling. CASK contains an N-terminal calmodulin-dependent kinase (CaMK)-like domain, two L27 domains, followed by the core of three domains characteristic of MAGUK (membrane-associated guanylate kinase) proteins: PDZ, SH3, and guanylate kinase (GuK). In addition, it also contains the Hook (Protein 4.1 Binding) motif in between the SH3 and GuK domains. The GuK domain in MAGUK proteins is enzymatically inactive; instead, the domain mediates protein-protein interactions and associates intramolecularly with the SH3 domain. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	62
240527	cd12082	MATE_like	Multidrug and toxic compound extrusion family and similar proteins. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. MATE has also been identified as a large multigene family in plants, where the proteins are linked to disease resistance. A number of family members are involved in the synthesis of peptidoglycan components in bacteria.	420
213373	cd12083	DD_cGKI	Dimerization/Docking domain of Cyclic GMP-dependent Protein Kinase I. Cyclic GMP-dependent Protein Kinase I (PKG1 or cGKI) is a Serine/Threonine Kinase (STK), catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. cGKI exists as two splice variants, cGKI-alpha and cGKI-beta. They contain an N-terminal regulatory domain containing a dimerization/docking region and an autoinhibitory pseudosubstrate region, two cGMP-binding domains, and a C-terminal catalytic domain. Binding of cGMP to both binding sites releases the inhibition of the catalytic center by the pseudosubstrate region, allowing autophosphorylation and activation of the kinase. cGKI is a soluble protein expressed in all smooth muscles, platelets, cerebellum, and kidney. It is also expressed at lower concentrations in other tissues. It is involved in the regulation of smooth muscle tone, smooth cell proliferation, and platelet activation. The dimerization/docking (D/D) domain is a leucine/isoleucine zipper that mediates both homodimerization and interaction with isotype-specific G-kinase-anchoring proteins (GKAPs). The D/D domains of the two variants (alpha and beta) differ, allowing their targeting to different subcellular compartments and intracellular substrates.	48
213043	cd12084	DD_R_PKA	Dimerization/Docking domain of the Regulatory subunit of cAMP-dependent protein kinase and similar domains. cAMP-dependent protein kinase (PKA) is a serine/threonine kinase (STK), catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The inactive PKA holoenzyme is a heterotetramer composed of two phosphorylated and active catalytic subunits with a dimer of regulatory (R) subunits. Activation is achieved through the binding of the important second messenger cAMP to the R subunits, which leads to the dissociation of PKA into the R dimer and two active subunits. There are two classes of R subunits, RI and RII; each exists as two isoforms (alpha and beta) from distinct genes. These functionally non-redundant R isoforms allow for specificity in PKA signaling. The R subunit contains an N-terminal dimerization/docking (D/D) domain, a linker with an inhibitory sequence (IS), and two c-AMP binding domains. RI and RII subunits are distinguished by their IS; RII subunits contain a phosphorylation site and are both substrates and inhibitors while RI subunits are pseudo-substrates. RI subunits require ATP and Mg ions to form a stable holoenzyme while RII subunits do not. The D/D domain dimerizes to form a four-helix bundle that serves as a docking site for A-kinase-anchoring proteins (AKAPs), which facilitates the localization of PKA to specific sites in the cell. PKA is present ubiquitously in cells and interacts with many different downstream targets. It plays a role in the regulation of diverse processes such as growth, development, memory, metabolism, gene expression, immunity, and lipolysis.	37
213374	cd12085	DD_cGKI-alpha	Dimerization/Docking domain of Cyclic GMP-dependent Protein Kinase I alpha. Cyclic GMP-dependent Protein Kinase I (PKG1 or cGKI) is a Serine/Threonine Kinase (STK), catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. cGKI exists as two splice variants, cGKI-alpha and cGKI-beta. They contain an N-terminal regulatory domain containing a dimerization/docking region and an autoinhibitory pseudosubstrate region, two cGMP-binding domains, and a C-terminal catalytic domain. Binding of cGMP to both binding sites releases the inhibition of the catalytic center by the pseudosubstrate region, allowing autophosphorylation and activation of the kinase. cGKI is a soluble protein expressed in all smooth muscles, platelets, cerebellum, and kidney. It is involved in the regulation of smooth muscle tone, smooth cell proliferation, and platelet activation. The dimerization/docking (D/D) domain is a leucine/isoleucine zipper that mediates both homodimerization and interaction with isotype-specific G-kinase-anchoring proteins (GKAPs). The D/D domains of the two variants (alpha and beta) differ, allowing their targeting to different subcellular compartments and intracellular substrates. cGKI-alpha specifically binds to myosin light chain phosphatase targeting subunit (MYPT1) and the regulator of G-protein signaling-2 (RGS-2). cGKI-alpha activates the phosphatase activity of MYPT1, resulting in vasorelaxation. It increases the activity of RGS-2 toward G proteins, with implications in the downstream signaling for vasoconstrictive agents.	48
213375	cd12086	DD_cGKI-beta	Dimerization/Docking domain of Cyclic GMP-dependent Protein Kinase I beta. Cyclic GMP-dependent Protein Kinase I (PKG1 or cGKI) is a Serine/Threonine Kinase (STK), catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. cGKI exists as two splice variants, cGKI-alpha and cGKI-beta. They contain an N-terminal regulatory domain containing a dimerization/docking region and an autoinhibitory pseudosubstrate region, two cGMP-binding domains, and a C-terminal catalytic domain. Binding of cGMP to both binding sites releases the inhibition of the catalytic center by the pseudosubstrate region, allowing autophosphorylation and activation of the kinase. cGKI is a soluble protein expressed in all smooth muscles, platelets, cerebellum, and kidney. It is involved in the regulation of smooth muscle tone, smooth cell proliferation, and platelet activation. The dimerization/docking (D/D) domain is a leucine/isoleucine zipper that mediates both homodimerization and interaction with isotype-specific G-kinase-anchoring proteins (GKAPs). The D/D domains of the two variants (alpha and beta) differ, allowing their targeting to different subcellular compartments and intracellular substrates. cGKI-beta binds specifically to inositol triphosphate receptor-associated PKG substrate (IRAG) and the transcriptional regulator TFII-I. Phosphorylation of IRAG by cGKI-beta contributes to smooth muscle relaxation while phosphorylation of TFII-I modulates its co-activator functions for serum response factor and Smad transcription factors.	52
213052	cd12087	TM_EGFR-like	Transmembrane domain of the Epidermal Growth Factor Receptor family of Protein Tyrosine Kinases. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. EGFR (HER, ErbB) subfamily members include EGFR (HER1, ErbB1), HER2 (ErbB2), HER3 (ErbB3), HER4 (ErbB4), and similar proteins. They are receptor PTKs (RTKs) containing an extracellular EGF-related ligand-binding region, a transmembrane (TM) helix, and a cytoplasmic region with a tyr kinase domain and a regulatory C-terminal tail. They are activated by ligand-induced dimerization, resulting in the phosphorylation of tyr residues in the C-terminal tail, which serve as binding sites for downstream signaling molecules. Collectively, they can recognize a variety of ligands including EGF, TGFalpha, and neuregulins, among others. All four subfamily members can form homo- or heterodimers. HER3 contains an impaired kinase domain and depends on its heterodimerization partner for activation. EGFR subfamily members are involved in signaling pathways leading to a broad range of cellular responses including cell proliferation, differentiation, migration, growth inhibition, and apoptosis. The TM domain not only serves as a membrane anchor, but also plays an important role in receptor dimerization and optimal activation. Mutations in the TM domain of EGFR family RTKs have been associated with increased breast cancer risk.	38
277187	cd12088	helicase_insert_domain	helicase_insert_domain. helicase_insert_domain; This helical domain can be found inserted in a subset of SF2-type DEAD-box related helicases, like archaeal Hef helicase, MDA5-like helicases and FancM-like helicases. The exact function of this domain is unknown, but seems to play a role in interaction with nucleotides and/or the stabilization of the nucleotide complex.	82
277188	cd12089	Hef_ID	Insert domain of Archaeal Hef helicase/nuclease. Archaeal Hef helicase/nuclease, originally identified in the hyperthermophilic archaeon Pyrococcus furiosus, contains an N-terminal SF2 helicase domain and a C-terminal XPF/Mus81-type nuclease domain. Hef has been shown to process flap- or fork-DNA structures, and both its helicase and nuclease domains independently recognize branched DNA, with a strong preference for forked DNA. The SF2 helicase domain is comprised of 3 structural domains, the 2 generally conserved helicase domains and a helical domain inserted between the two domains. This inserted domain, which is not present in all SF2 helicases, has been shown to play an important role in branched structure processing.	119
277189	cd12090	MDA5_ID	Insert domain of MDA5. MDA5 (melanoma-differentiation-associated gene 5, also known as IFIH1), as well as RIG-I (Retinoic acid Inducible Gene I, also known as DDX58) and LGP2 (also known as DHX58), contain two N-terminal CARD domains and a C-terminal SF2 helicase domain. They are cytoplasmic DEAD box RNA helicases acting as key innate immune pattern-recognition receptors (PRRs) that play an important role in host antiviral response by sensing incoming viral RNA. Their SF2 helicase domain is comprised of 3 structural domains, the 2 generally conserved helicase domains and a helical domain inserted between the two domains. The inserted domain is involved in conformational changes upon ligand binding.	120
277190	cd12091	FANCM_ID	Insert domain of FANCM and related proteins. FANCM and related proteins, like Mph1 and Fml1, are DNA junction-specific helicases/translocases that bind to and process perturbed replication forks and intermediates of homologous recombination. FANCM contains an N-terminal superfamily 2 helicase (SF2) domain, although FANCM, in contrast to other members of this family, does not exhibit DNA helicase activity. The SF2 helicase domain is comprised of 3 structural domains, the 2 generally conserved helicase domains and a helical domain inserted between the two domains. FANCM is a component of the Fanconi anaemia (FA) core complex. FA is a rare genetic disease in humans that is associated with progressive bone marrow failure, a variety of developmental abnormalities, and a high incidence of cancer. A key role of this complex is the monoubiquitination of FANCD2 and FANCI during S-phase and in response to DNA damage. The role of FANCM during this process seems to be the recruitment of the complex to chromatin.	116
213053	cd12092	TM_ErbB4	Transmembrane domain of ErbB4, a Protein Tyrosine Kinase. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. ErbB4 (HER4) is a member of the EGFR (HER, ErbB) subfamily of proteins, which are receptor PTKs (RTKs) containing an extracellular EGF-related ligand-binding region, a transmembrane (TM) helix, and a cytoplasmic region with a tyr kinase domain and a regulatory C-terminal tail. It is activated by ligand-induced dimerization, leading to the phosphorylation of tyr residues in the C-terminal tail, which serve as binding sites for downstream signaling molecules. Ligands that bind ErbB4 fall into two groups, the neuregulins (or heregulins) and some EGFR (HER1, ErbB1) ligands including betacellulin, HBEGF, and epiregulin. All four neuregulins (NRG1-4) interact with ErbB4. Upon ligand binding, ErbB4 forms homo- or heterodimers with other ErbB proteins. The TM domain not only serves as a membrane anchor, but also plays an important role in receptor dimerization and optimal activation. Mutations in the TM domain of ErbB4 have been associated with increased breast cancer risk. ErbB4 is essential in embryonic development. It is implicated in mammary gland, cardiac, and neural development. As a postsynaptic receptor of NRG1, ErbB4 plays an important role in synaptic plasticity and maturation. The impairment of NRG1/ErbB4 signaling may contribute to schizophrenia.	44
213054	cd12093	TM_ErbB1	Transmembrane domain of Epidermal Growth Factor Receptor or ErbB1, a Protein Tyrosine Kinase. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. EGFR (HER1, ErbB1) is a receptor PTK (RTK) containing an extracellular EGF-related ligand-binding region, a transmembrane (TM) helix, and a cytoplasmic region with a tyr kinase domain and a regulatory C-terminal tail. It is activated by ligand-induced dimerization, leading to the phosphorylation of tyr residues in the C-terminal tail, which serve as binding sites for downstream signaling molecules. Ligands for ErbB1 include EGF, heparin binding EGF-like growth factor (HBEGF), epiregulin, amphiregulin, TGFalpha, and betacellulin. Upon ligand binding, ErbB1 can form homo- or heterodimers with other EGFR/ErbB subfamily members. The TM domain not only serves as a membrane anchor, but also plays an important role in receptor dimerization and optimal activation. Mutations in the TM domain of ErbB1 have been associated with increased breast cancer risk. The ErbB1 signaling pathway is one of the most important pathways regulating cell proliferation, differentiation, survival, and growth. A number of monoclonal antibodies and small molecule inhibitors have been developed that target ErbB1, including the antibodies Cetuximab and Panitumumab, which are used in combination with other therapies for the treatment of colorectal cancer and non-small cell lung carcinoma (NSCLC). The small molecule inhibitors Gefitinib (Iressa) and Erlotinib (Tarceva), already used for NSCLC, are undergoing clinical trials for other types of cancer including gastrointestinal, breast, head and neck, and bladder.	44
213055	cd12094	TM_ErbB2	Transmembrane domain of ErbB2, a Protein Tyrosine Kinase. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. ErbB2 (HER2, HER2/neu) is a member of the EGFR (HER, ErbB) subfamily of proteins, which are receptor PTKs (RTKs) containing an extracellular EGF-related ligand-binding region, a transmembrane (TM) helix, and a cytoplasmic region with a tyr kinase domain and a regulatory C-terminal tail. It is activated by ligand-induced dimerization, leading to the phosphorylation of tyr residues in the C-terminal tail, which serve as binding sites for downstream signaling molecules. ErbB2 does not bind to any known EGFR subfamily ligands, but contributes to the kinase activity of all possible heterodimers. It acts as the preferred partner of other ligand-bound EGFR proteins and functions as a signal amplifier, with the ErbB2-ErbB3 heterodimer being the most potent pair in mitogenic signaling. The TM domain not only serves as a membrane anchor, but also plays an important role in receptor dimerization and optimal activation. Mutations in the TM domain of ErbB2 have been associated with increased breast cancer risk. ErbB2 plays an important role in cell development, proliferation, survival and motility. Overexpression of ErbB2 results in its activation and downstream signaling, even in the absence of ligand. ErbB2 overexpression, mainly due to gene amplification, has been shown in a variety of human cancers. Its role in breast cancer is especially well-documented. ErbB2 is up-regulated in about 25% of breast tumors and is associated with increases in tumor aggressiveness, recurrence and mortality. ErbB2 is a target for monoclonal antibodies and small molecule inhibitors, which are being developed as treatments for cancer. The first humanized antibody approved for clinical use is Trastuzumab (Herceptin), which is being used in combination with other therapies to improve the survival rates of patients with HER2-overexpressing breast cancer.	44
213056	cd12095	TM_ErbB3	Transmembrane domain of ErbB3, a Protein Tyrosine Kinase. ErbB3 (HER3) is a member of the EGFR (HER, ErbB) subfamily of proteins, which are receptor PTKs (RTKs) containing an extracellular EGF-related ligand-binding region, a transmembrane (TM) helix, and a cytoplasmic region with a tyr kinase domain and a regulatory C-terminal tail. ErbB receptors are activated by ligand-induced dimerization, leading to the phosphorylation of tyr residues in the C-terminal tail, which serve as binding sites for downstream signaling molecules. ErbB3 contains an impaired tyr kinase domain, which lacks crucial residues for catalytic activity against exogenous substrates but is still able to bind ATP and autophosphorylate. ErbB3 binds the neuregulin ligands, NRG1 and NRG2, and it relies on its heterodimerization partners for activity following ligand binding. The ErbB2-ErbB3 heterodimer constitutes a high affinity co-receptor capable of potent mitogenic signaling. The TM domain not only serves as a membrane anchor, but also plays an important role in receptor dimerization and optimal activation. Mutations in the TM domain of ErbB receptors have been associated with increased breast cancer risk. ErbB3 participates in a signaling pathway involved in the proliferation, survival, adhesion, and motility of tumor cells.	39
213044	cd12097	DD_RI_PKA	Dimerization/Docking domain of the Type I Regulatory subunit of cAMP-dependent protein kinase. cAMP-dependent protein kinase (PKA) is a serine/threonine kinase (STK), catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The inactive PKA holoenzyme is a heterotetramer composed of two phosphorylated and active catalytic subunits with a dimer of regulatory (R) subunits. Activation is achieved through the binding of the important second messenger cAMP to the R subunits, which leads to the dissociation of PKA into the R dimer and two active subunits. There are two classes of R subunits, RI and RII; each exists as two isoforms (alpha and beta) from distinct genes. These functionally non-redundant R isoforms allow for specificity in PKA signaling. RI subunits are pseudo-substrates as they do not contain a phosphorylation site in their inhibitory site unlike RII subunits. RIalpha function is required for normal development as its deletion is embryonically lethal. RIbeta is expressed highly in the brain and is associated with hippocampal function. The R subunit contains an N-terminal dimerization/docking (D/D) domain, a linker with an inhibitory sequence, and two c-AMP binding domains. The D/D domain dimerizes to form a four-helix bundle that serves as a docking site for A-kinase-anchoring proteins (AKAPs), which facilitates the localization of PKA to specific sites in the cell. PKA is present ubiquitously in cells and interacts with many different downstream targets. It plays a role in the regulation of diverse processes such as growth, development, memory, metabolism, gene expression, immunity, and lipolysis.	44
213045	cd12098	DD_R_PKA_fungi	Dimerization/Docking domain of the Regulatory subunit of fungal cAMP-dependent protein kinase. cAMP-dependent protein kinase (PKA) is a serine/threonine kinase (STK), catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The inactive PKA holoenzyme is a heterotetramer composed of two phosphorylated and active catalytic subunits with a dimer of regulatory (R) subunits. Activation is achieved through the binding of the important second messenger cAMP to the R subunits, which leads to the dissociation of PKA into the R dimer and two active subunits. The R subunit of fungal PKA is encoded by a single gene, which is called by various names in different organisms (for example: Yarrowia lipolytica RKA1, Saccharomyces cerevisiae Bcy1, and Schizosaccharomyces pombe Cgs1). Although most characterized PKA holoenzymes are tetramers, Y. lipolytica PKA has been reported to be a dimer of RKA1 and the catalytic subunit TPK1. RKA1 is essential and promotes hyphal growth. Cgs1 is essential for sexual differentiation of S. pombe; mutants with defective Cgs1 are partially sterile. The R subunit contains an N-terminal dimerization/docking (D/D) domain, a linker with an inhibitory sequence, and two c-AMP binding domains. The D/D domain of metazoan R subunits dimerizes to form a four-helix bundle that serves as a docking site for A-kinase-anchoring proteins (AKAPs). The D/D domain of fungal R subunits may also serve as a dimerization domain, in the case of heterotetrameric PKAs. Fungal PKA plays a major role in controlling cell growth and metabolism in response to nutrients and stress conditions.	38
213046	cd12099	DD_RII_PKA	Dimerization/Docking domain of the Type II Regulatory subunit of cAMP-dependent protein kinase. cAMP-dependent protein kinase (PKA) is a serine/threonine kinase (STK), catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The inactive PKA holoenzyme is a heterotetramer composed of two phosphorylated and active catalytic subunits with a dimer of regulatory (R) subunits. Activation is achieved through the binding of the important second messenger cAMP to the R subunits, which leads to the dissociation of PKA into the R dimer and two active subunits. There are two classes of R subunits, RI and RII; each exists as two isoforms (alpha and beta) from distinct genes. These functionally non-redundant R isoforms allow for specificity in PKA signaling. RII subunits contain a phosphorylation site in their inhibitory site and are both substrates and inhibitors. RIIalpha plays a role in the association and dissociation of PKA with the centrosome during interphase and mitosis, respectively. RIIbeta plays an important role in adipocytes and neuronal tissues. The R subunit contains an N-terminal dimerization/docking (D/D) domain, a linker with an inhibitory sequence, and two c-AMP binding domains. The D/D domain dimerizes to form a four-helix bundle that serves as a docking site for A-kinase-anchoring proteins (AKAPs), which facilitates the localization of PKA to specific sites in the cell. PKA is present ubiquitously in cells and interacts with many different downstream targets. It plays a role in the regulation of diverse processes such as growth, development, memory, metabolism, gene expression, immunity, and lipolysis.	39
213047	cd12100	DD_CABYR_SP17	Dimerization/Docking domain of the sperm fibrous sheath proteins, Calcium-Binding tYrosine-phosphorylation Regulated protein and Sperm Protein 17. CABYR and SP17 are naturally located in human sperm fibrous sheath (FS). CABYR was originally isolated from spermatozoa and was thought to be testis-specific, but has recently been observed in lung and brain tumors. It is a polymorphic calcium binding protein that is phosphorylated during capacitation. SP17 plays an important role in the interaction of sperm with the zona pellucida during fertilization. It also promotes cell-cell adhesion. SP17 is found in various human tumors of unrelated histological origin including metastatic squamous cell carcinoma, multiple myeloma, ovarian cancer, primary nervous system tumors, among others. Both CABYR and SP17 contain an N-terminal dimerization/docking (D/D) domain with similarity to the D/D domain of the R subunit of cAMP-dependent protein kinase (PKA). The D/D domain of the R subunit dimerizes to form a four-helix bundle that serves as a docking site for A-kinase-anchoring proteins (AKAPs), which facilitates the localization of PKA to specific sites in the cell. The D/D domains of CABYR and SP17 have been shown to bind to AKAP3, a protein that is also associated with the FS of mammalian spermatozoa.	39
213048	cd12101	DD_RIalpha_PKA	Dimerization/Docking domain of the Type I alpha Regulatory subunit of cAMP-dependent protein kinase. cAMP-dependent protein kinase (PKA) is a serine/threonine kinase (STK), catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The inactive PKA holoenzyme is a heterotetramer composed of two phosphorylated and active catalytic subunits with a dimer of regulatory (R) subunits. Activation is achieved through the binding of the important second messenger cAMP to the R subunits, which leads to the dissociation of PKA into the R dimer and two active subunits. There are two classes of R subunits, RI and RII; each exists as two isoforms (alpha and beta) from distinct genes. These functionally non-redundant R isoforms allow for specificity in PKA signaling. RI subunits are pseudo-substrates as they do not contain a phosphorylation site in their inhibitory site unlike RII subunits. RIalpha is the key regulatory subunit responsible for maintaining cAMP control of the catalytic subunit. RIalpha function is required for normal development as its deletion is embryonically lethal due to failed cardiac morphogenesis. The R subunit contains an N-terminal dimerization/docking (D/D) domain, a linker with an inhibitory sequence, and two c-AMP binding domains. The D/D domain dimerizes to form a four-helix bundle that serves as a docking site for A-kinase-anchoring proteins (AKAPs), which facilitates the localization of PKA to specific sites in the cell. PKA is present ubiquitously in cells and interacts with many different downstream targets. It plays a role in the regulation of diverse processes such as growth, development, memory, metabolism, gene expression, immunity, and lipolysis.	50
213049	cd12102	DD_RIbeta_PKA	Dimerization/Docking domain of the Type I beta Regulatory subunit of cAMP-dependent protein kinase. cAMP-dependent protein kinase (PKA) is a serine/threonine kinase (STK), catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The inactive PKA holoenzyme is a heterotetramer composed of two phosphorylated and active catalytic subunits with a dimer of regulatory (R) subunits. Activation is achieved through the binding of the important second messenger cAMP to the R subunits, which leads to the dissociation of PKA into the R dimer and two active subunits. There are two classes of R subunits, RI and RII; each exists as two isoforms (alpha and beta) from distinct genes. These functionally non-redundant R isoforms allow for specificity in PKA signaling. RI subunits are pseudo-substrates as they do not contain a phosphorylation site in their inhibitory site unlike RII subunits. RIbeta is expressed highly in the brain and is associated with hippocampal function. The R subunit contains an N-terminal dimerization/docking (D/D) domain, a linker with an inhibitory sequence, and two c-AMP binding domains. The D/D domain dimerizes to form a four-helix bundle that serves as a docking site for A-kinase-anchoring proteins (AKAPs), which facilitates the localization of PKA to specific sites in the cell. PKA is present ubiquitously in cells and interacts with many different downstream targets. It plays a role in the regulation of diverse processes such as growth, development, memory, metabolism, gene expression, immunity, and lipolysis.	54
213050	cd12103	DD_RIIalpha_PKA	Dimerization/Docking domain of the Type II alpha Regulatory subunit of cAMP-dependent protein kinase. cAMP-dependent protein kinase (PKA) is a serine/threonine kinase (STK), catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The inactive PKA holoenzyme is a heterotetramer composed of two phosphorylated and active catalytic subunits with a dimer of regulatory (R) subunits. Activation is achieved through the binding of the important second messenger cAMP to the R subunits, which leads to the dissociation of PKA into the R dimer and two active subunits. There are two classes of R subunits, RI and RII; each exists as two isoforms (alpha and beta) from distinct genes. These functionally non-redundant R isoforms allow for specificity in PKA signaling. RII subunits contain a phosphorylation site in their inhibitory site and are both substrates and inhibitors. RIIalpha plays a role in the association and dissociation of PKA with the centrosome during interphase and mitosis, respectively. It is also involved in endosome-to-Golgi and Golgi-to-ER transport. The R subunit contains an N-terminal dimerization/docking (D/D) domain, a linker with an inhibitory sequence, and two c-AMP binding domains. The D/D domain dimerizes to form a four-helix bundle that serves as a docking site for A-kinase-anchoring proteins (AKAPs), which facilitates the localization of PKA to specific sites in the cell. PKA is present ubiquitously in cells and interacts with many different downstream targets. It plays a role in the regulation of diverse processes such as growth, development, memory, metabolism, gene expression, immunity, and lipolysis.	41
213051	cd12104	DD_RIIbeta_PKA	Dimerization/Docking domain of the Type II beta Regulatory subunit of cAMP-dependent protein kinase. cAMP-dependent protein kinase (PKA) is a serine/threonine kinase (STK), catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The inactive PKA holoenzyme is a heterotetramer composed of two phosphorylated and active catalytic subunits with a dimer of regulatory (R) subunits. Activation is achieved through the binding of the important second messenger cAMP to the R subunits, which leads to the dissociation of PKA into the R dimer and two active subunits. There are two classes of R subunits, RI and RII; each exists as two isoforms (alpha and beta) from distinct genes. These functionally non-redundant R isoforms allow for specificity in PKA signaling. RII subunits contain a phosphorylation site in their inhibitory site and are both substrates and inhibitors. RIIbeta plays an important role in adipocytes and neuronal tissues. Mice deficient in RIIbeta have small fat cells and are resistant to obesity, diet-induced diabetes, and alcohol-induced motor defects. The R subunit contains an N-terminal dimerization/docking (D/D) domain, a linker with an inhibitory sequence, and two c-AMP binding domains. The D/D domain dimerizes to form a four-helix bundle that serves as a docking site for A-kinase-anchoring proteins (AKAPs), which facilitates the localization of PKA to specific sites in the cell. PKA is present ubiquitously in cells and interacts with many different downstream targets. It plays a role in the regulation of diverse processes such as growth, development, memory, metabolism, gene expression, immunity, and lipolysis.	41
213031	cd12105	HmuY	Bacterial proteins similar to Porphyromonas gingivalis HmuY. HmuY is a hemophore that scavenges heme from infected hosts and delivers it to the outer membrane receptor HmuR. Related but uncharacterized proteins do not appear to share the specific heme-binding site.	121
213061	cd12106	PARMER_03128_N	N-terminal domain of PARMER_03128. PARMER_03128 is an uncharacterized protein from Parabacteroides merdae. This model characterizes its N-terminal domain plus that of related proteins from Bacteroidetes. Structurally, they resemble domains found in streptococcal surface proteins such as SpaP.	137
213982	cd12107	Hemerythrin	Hemerythrin. Hemerythrin (Hr) is a non-heme diiron oxygen transport protein found in four marine invertebrate phyla including priapulida, brachiopoda, sipunculida, and annelida, as well as in protozoa. Myohemerythrin (Mhr), a hemerythrin homolog, is found in the muscle tissue of sipunculids as well as in polychaete and oligochaete annelids. In addition to oxygen transport, Mhr proteins are involved in cadmium fixation and host anti-bacterial defense. Hr and Mhr proteins have the same "four alpha helix bundle" motif and active site structure. Hr forms oligomers, the octameric form being most prevalent, while Mhr is monomeric.	113
213983	cd12108	Hr-like	Hemerythrin-like domain. Hemerythrin (Hr) like domains have the same four alpha helix bundle and a similar, but slightly different active site structure than hemerythrin. They are non-heme diiron binding proteins mainly found in bacteria and eukaryotes. Like Hr, they may be involved in oxygen transport or like human FBXL5 (F-box and leucine-rich repeat protein 5), a member of this group, play a role in cellular iron homeostasis.	130
213984	cd12109	Hr_FBXL5	Hemerythrin-like domain of FBXL5-like proteins. Human FBXL5 (F-box and leucine-rich repeat protein 5) protein plays a role in cellular iron homeostasis. It is part of an E3 ubiquitin ligase complex that targets the iron regulatory protein IRP2 for proteasomal degradation. The FBXL5's stability is regulated by iron concentration, with its iron- and oxygen-binding hemerythrin domain acting as a ligand-dependent regulatory switch.	158
213994	cd12110	PHP_HisPPase_Hisj_like	Polymerase and Histidinol Phosphatase domain of Histidinol phosphate phosphatase of Hisj like. Bacillus subtilis YtvP HisJ has strong histidinol phosphate phosphatase (HisPPase) activity. The PHP (also called histidinol phosphatase-2/HIS2) domain is associated with several types of DNA polymerases, such as PolIIIA and family X DNA polymerases, stand alone histidinol phosphate phosphatases (HisPPases), and a number of uncharacterized protein families. HisPPase catalyzes the eighth step of histidine biosynthesis, in which L-histidinol phosphate undergoes dephosphorylation to produce histidinol. The PHP domain has four conserved sequence motifs and contains an invariant histidine that is involved in metal ion coordination. The PHP domain of HisPPase is structurally homologous to other members of the PHP family that have a distorted (beta/alpha)7 barrel fold with a trinuclear metal site on the C-terminal side of the barrel.	244
213995	cd12111	PHP_HisPPase_Thermotoga_like	Polymerase and Histidinol Phosphatase domain of Thermotoga like. The PHP (also called histidinol phosphatase-2/HIS2) domain is associated with several types of DNA polymerases, such as PolIIIA and family X DNA polymerases, stand alone histidinol phosphate phosphatases (HisPPases), and a number of uncharacterized protein families. Thermotoga PHP is an uncharacterized protein. HisPPase catalyzes the eighth step of histidine biosynthesis, in which L-histidinol phosphate undergoes dephosphorylation to give histidinol. The HisPPase can be classified into two types: the bifunctional HisPPase found in proteobacteria that belongs to the DDDD superfamily and the monofunctional Bacillus subtilis type that is a member of the PHP family. The PHP domain has four conserved sequence motifs and contains an invariant histidine that is involved in metal ion coordination. The PHP domain of HisPPase is structurally homologous to other members of the PHP family that have a distorted (beta/alpha)7 barrel fold with a trinuclear metal site on the C-terminal side of the barrel.	226
213996	cd12112	PHP_HisPPase_Chlorobi_like	Polymerase and Histidinol Phosphatase domain of Chlorobi like. The PHP (also called histidinol phosphatase-2/HIS2) domain is associated with several types of DNA polymerases, such as PolIIIA and family X DNA polymerases, stand alone histidinol phosphate phosphatases (HisPPases), and a number of uncharacterized protein families. Chlorobi PHP is an uncharacterized protein. HisPPase catalyzes the eighth step of histidine biosynthesis, in which L-histidinol phosphate undergoes dephosphorylation to produce histidinol. The HisPPase can be classified into two types: the bifunctional HisPPase found in proteobacteria that belongs to the DDDD superfamily and the monofunctional Bacillus subtilis type that is a member of the PHP family. The PHP domain has four conserved sequence motifs and contains an invariant histidine that is involved in metal ion coordination. The PHP domain of HisPPase is structurally homologous to other members of the PHP family that have a distorted (beta/alpha)7 barrel fold with a trinuclear metal site on the C-terminal side of the barrel.	235
213997	cd12113	PHP_PolIIIA_DnaE3	Polymerase and Histidinol Phosphatase domain of alpha-subunit of bacterial polymerase III DnaE3. PolIIIAs that contain an N-terminal PHP domain have been classified into four basic groups based on genome composition, phylogenetic, and domain structural analysis: polC, dnaE1, dnaE2, and dnaE3. The PHP (also called histidinol phosphatase-2/HIS2) domain is associated with several types of DNA polymerases, such as PolIIIA and family X DNA polymerases, stand alone histidinol phosphate phosphatases (HisPPases), and a number of uncharacterized protein families. DNA polymerase III holoenzyme is one of the five eubacterial DNA polymerases that is responsible for the replication of the DNA duplex. The alpha subunit of DNA polymerase III core enzyme catalyzes the reaction for polymerizing both DNA strands. The PolIIIA PHP domain has four conserved sequence motifs and contains an invariant histidine that is involved in metal ion coordination, and like other PHP structures, the PolIIIA PHP exhibits a distorted (beta/alpha)7 barrel and coordinates up to 3 metals. Initially, it was proposed that the PHP region might be involved in pyrophosphate hydrolysis, but such an activity has not been found. It has been shown that the PHP of PolIIIA has a trinuclear metal complex and is capable of proofreading activity. Bacterial genome replication and DNA repair mechanisms are related to the GC content of the genome. There is a correlation between GC content variations and the dimeric combinations of PolIIIA subunits. Eubacteria can be grouped into different GC variable groups: the full-spectrum or dnaE1 group, the high-GC or dnaE2-dnaE1 group, and the low GC or polC-dnaE3 group.	283
341279	cd12114	A_NRPS_TlmIV_like	The adenylation domain of nonribosomal peptide synthetases (NRPS), including Streptoalloteichus tallysomycin biosynthesis genes. The adenylation (A) domain of NRPS recognizes a specific amino acid or hydroxy acid and activates it as an (amino) acyl adenylate by hydrolysis of ATP. The activated acyl moiety then forms a thioester to the enzyme-bound cofactor phosphopantetheine of a peptidyl carrier protein domain. NRPSs are large multifunctional enzymes which synthesize many therapeutically useful peptides in bacteria and fungi via a template-directed, nucleic acid independent nonribosomal mechanism. These natural products include antibiotics, immunosuppressants, plant and animal toxins, and enzyme inhibitors. NRPS has a distinct modular structure in which each module is responsible for the recognition, activation, and in some cases, modification of a single amino acid residue of the final peptide product. The modules can be subdivided into domains that catalyze specific biochemical reactions. This family includes the TLM biosynthetic gene cluster from Streptoalloteichus that consists of nine NRPS genes; the N-terminal module of TlmVI (NRPS-5) and the starter module of BlmVI (NRPS-5) are comprised of the acyl CoA ligase (AL) and acyl carrier protein (ACP)-like domains, which are thought to be involved in the biosynthesis of the beta-aminoalaninamide moiety.	477
341280	cd12115	A_NRPS_Sfm_like	The adenylation domain of nonribosomal peptide synthetases (NRPS), including Saframycin A gene cluster from Streptomyces lavendulae. The adenylation (A) domain of NRPS recognizes a specific amino acid or hydroxy acid and activates it as an (amino) acyl adenylate by hydrolysis of ATP. The activated acyl moiety then forms a thioester to the enzyme-bound cofactor phosphopantetheine of a peptidyl carrier protein domain. NRPSs are large multifunctional enzymes which synthesize many therapeutically useful peptides in bacteria and fungi via a template-directed, nucleic acid independent nonribosomal mechanism. These natural products include antibiotics, immunosuppressants, plant and animal toxins, and enzyme inhibitors. NRPS has a distinct modular structure in which each module is responsible for the recognition, activation, and in some cases, modification of a single amino acid residue of the final peptide product. The modules can be subdivided into domains that catalyze specific biochemical reactions. This family includes the saframycin A gene cluster from Streptomyces lavendulae which implicates the NRPS system for assembling the unusual tetrapeptidyl skeleton in an iterative manner. It also includes saframycin Mx1 produced by Myxococcus xanthus NRPS.	447
341281	cd12116	A_NRPS_Ta1_like	The adenylation domain of nonribosomal peptide synthetases (NRPS), including salinosporamide A polyketide synthase. The adenylation (A) domain of NRPS recognizes a specific amino acid or hydroxy acid and activates it as an (amino) acyl adenylate by hydrolysis of ATP. The activated acyl moiety then forms a thioester to the enzyme-bound cofactor phosphopantetheine of a peptidyl carrier protein domain. NRPSs are large multifunctional enzymes which synthesize many therapeutically useful peptides in bacteria and fungi via a template-directed, nucleic acid independent nonribosomal mechanism. These natural products include antibiotics, immunosuppressants, plant and animal toxins, and enzyme inhibitors. NRPS has a distinct modular structure in which each module is responsible for the recognition, activation, and in some cases, modification of a single amino acid residue of the final peptide product. The modules can be subdivided into domains that catalyze specific biochemical reactions. This family includes the myxovirescin (TA) antibiotic biosynthetic gene in Myxococcus xanthus; TA production plays a role in predation. It also includes the salinosporamide A polyketide synthase which is involved in the biosynthesis of salinosporamide A, a marine microbial metabolite whose chlorine atom is crucial for potent proteasome inhibition and anticancer activity.	470
341282	cd12117	A_NRPS_Srf_like	The adenylation domain of nonribosomal peptide synthetases (NRPS), including Bacillus subtilis termination module Surfactin (SrfA-C). The adenylation (A) domain of NRPS recognizes a specific amino acid or hydroxy acid and activates it as an (amino) acyl adenylate by hydrolysis of ATP. The activated acyl moiety then forms a thioester to the enzyme-bound cofactor phosphopantetheine of a peptidyl carrier protein domain. NRPSs are large multifunctional enzymes which synthesize many therapeutically useful peptides in bacteria and fungi via a template-directed, nucleic acid independent nonribosomal mechanism. These natural products include antibiotics, immunosuppressants, plant and animal toxins, and enzyme inhibitors. NRPS has a distinct modular structure in which each module is responsible for the recognition, activation, and, in some cases, modification of a single amino acid residue of the final peptide product. The modules can be subdivided into domains that catalyze specific biochemical reactions. This family includes the adenylation domain of the Bacillus subtilis termination module (Surfactin domain, SrfA-C) which recognizes a specific amino acid building block, which is then activated and transferred to the terminal thiol of the 4'-phosphopantetheine (Ppan) arm of the downstream peptidyl carrier protein (PCP) domain.	483
341283	cd12118	ttLC_FACS_AEE21_like	Fatty acyl-CoA synthetases similar to LC-FACS from Thermus thermophilus and Arabidopsis. This family includes fatty acyl-CoA synthetases that can activate medium to long-chain fatty acids. These enzymes catalyze the ATP-dependent acylation of fatty acids in a two-step reaction. The carboxylate substrate first reacts with ATP to form an acyl-adenylate intermediate, which then reacts with CoA to produce an acyl-CoA ester. Fatty acyl-CoA synthetases are responsible for fatty acid degradation as well as physiological regulation of cellular functions via the production of fatty acyl-CoA esters. The fatty acyl-CoA synthetase from Thermus thermophilus in this family has been shown to catalyze the activation of the long-chain fatty acid, myristoyl acid. Also included in this family are acyl activating enzymes from Arabidopsis, which contains a large number of proteins from this family with up to 63 different genes, many of which are uncharacterized.	486
341284	cd12119	ttLC_FACS_AlkK_like	Fatty acyl-CoA synthetases similar to LC-FACS from Thermus thermophilus. This family includes fatty acyl-CoA synthetases that can activate medium-chain to long-chain fatty acids. They catalyze the ATP-dependent acylation of fatty acids in a two-step reaction. The carboxylate substrate first reacts with ATP to form an acyl-adenylate intermediate, which then reacts with CoA to produce an acyl-CoA ester. The fatty acyl-CoA synthetases are responsible for fatty acid degradation as well as physiological regulation of cellular functions via the production of fatty acyl-CoA esters. The fatty acyl-CoA synthetase from Thermus thermophilus in this family catalyzes the activation of the long-chain fatty acid, myristoyl acid, while another member in this family, the AlkK protein identified from Pseudomonas oleovorans, targets medium chain fatty acids. This family also includes uncharacterized FACS proteins.	518
213376	cd12120	AMPKA_C_like	C-terminal regulatory domain of 5'-AMP-activated protein kinase (AMPK) alpha subunit and similar domains. This family is composed of AMPKs, microtubule-associated protein/microtubule affinity regulating kinases (MARKs), yeast Kcc4p-like proteins, plant calcineurin B-Like (CBL)-interacting protein kinases (CIPKs), and similar proteins. They are serine/threonine protein kinases (STKs) that catalyze the transfer of the gamma-phosphoryl group from ATP to S/T residues on protein substrates. AMPKs act as sensors for the energy status of the cell and are activated by cellular stresses that lead to ATP depletion such as hypoxia, heat shock, and glucose deprivation, among others. MARKs phosphorylate the tau protein and related microtubule-associated proteins (MAPs) on tubulin binding sites to induce detachment from microtubules, and are involved in the regulation of cell shape and polarity, cell cycle control, transport, and the cytoskeleton. Kcc4p and related proteins are septin-associated proteins that are involved in septin organization and in the yeast morphogenesis checkpoint coordinating the cell cycle with bud formation. CIPKs interact with the calcineurin B-like (CBL) calcium sensors to form a signaling network that decodes specific calcium signals triggered by a variety of environmental stimuli including salinity, drought, cold, light, and mechanical perturbation, among others. All members of this family contain an N-terminal catalytic kinase domain and a C-terminal regulatory domain which is also called kinase associated domain 1 (KA1) in some cases. The C-terminal regulatory domain serves as a protein interaction domain in AMPKs and CIPKs. In MARKs and Kcc4p-like proteins, this domain binds phospholipids and may be involved in membrane localization.	95
213377	cd12121	MARK_C_like	C-terminal kinase associated domain 1 (KA1), a phospholipid binding domain, of microtubule affinity-regulating kinases, and similar domains. Microtubule-associated protein/microtubule affinity regulating kinases (MARKs), also called partition-defective (Par-1) kinases, are serine/threonine protein kinases (STKs) that catalyze the transfer of the gamma-phosphoryl group from ATP to S/T residues on protein substrates. They phosphorylate the tau protein and related microtubule-associated proteins (MAPs) on tubulin binding sites to induce detachment from microtubules, and are involved in the regulation of cell shape and polarity, cell cycle control, transport, and the cytoskeleton. Mammals contain four proteins, MARK1-4, encoded by distinct genes belonging to this subfamily, with additional isoforms arising from alternative splicing. In yeast, MARK/Par-1 homologs are called Kin1/2 kinases. Kin1 is a membrane-associated kinase that is involved in regulating cytokinesis and the cell surface. MARKs contain an N-terminal catalytic kinase domain, a ubiquitin-associated domain (UBA), and a C-terminal kinase associated domain (KA1). The KA1 domain binds anionic phospholipids and may be involved in membrane localization as well as in auto-inhibition of the kinase domain.	96
213378	cd12122	AMPKA_C	C-terminal regulatory domain of 5'-AMP-activated protein kinase (AMPK) alpha catalytic subunit. AMPK, a serine/threonine protein kinase (STK), catalyzes the transfer of the gamma-phosphoryl group from ATP to S/T residues on protein substrates. It acts as a sensor for the energy status of the cell and is activated by cellular stresses that lead to ATP depletion such as hypoxia, heat shock, and glucose deprivation, among others. AMPK is a heterotrimer of three subunits: alpha, beta, and gamma. Co-expression of the three subunits is required for kinase activity; in the absence of one, the other two subunits get degraded. The AMPK alpha subunit is the catalytic subunit and it contains an N-terminal kinase domain and a C-terminal regulatory domain (RD). Vertebrates contain two isoforms of the alpha subunit, alpha1 and alpha2, which are encoded by different genes, PRKAA1 and PRKAA2, respectively. The C-terminal RD of the AMPK alpha subunit is involved in AMPK heterotrimer formation. It mainly interacts with the C-terminal region of the beta subunit to form a tight alpha-beta complex that is associated with the gamma subunit. The AMPK alpha subunit RD also contains an auto-inhibitory region that interacts with the kinase domain; this inhibition is negated by the interaction with the AMPK gamma subunit. AMPK is conserved throughout evolution; the AMPK alpha subunit homologs in yeast and plants are called Snf1 and SnRK1 (Snf1 related kinase), respectively.	132
381264	cd12124	Pgbs	Protoglobins (Pgbs). Pgbs are single-domain globins of yet unknown biological function. Included in this subfamily are Pgbs from the strictly anaerobic methanogen Methanosarcina acetivorans (MaPgb) and from the obligate aerobic hyperthermophile Aeropyrum pernix (ApPgb). MaPgb is a dimeric globin which in addition to the 3-on-3 helical sandwich contains an N-terminal extension. This extension, along with other Pgb-specific loops, buries the heme within the protein; two orthogonal apolar tunnels grant access of small ligand molecules to the heme. Like other globins, MaPgb can bind O2, CO and NO reversibly in vitro; however, it has an unusually low O2 dissociation rate, along with a large structural distortion of the heme moiety. CO binding to and dissociation from the heme occurs through biphasic kinetics. ApPgb also contains heme, and can bind O2, CO and NO. This subfamily belongs to a family which includes the globin-coupled-sensors (GCSs) and single-domain sensor globins. It has been demonstrated that Pgbs and other single-domain globins can function as sensors, when coupled to an appropriate regulator domain.	185
381265	cd12125	APC_alpha	Allophycocyanin alpha subunit of the phycobilisome core. Phycobilisomes (PBSs) are the main light-harvesting complex in cyanobacteria and red algae. In general, they consist of a central core and surrounding rods and function to harvest and channel light energy toward the photosynthetic reaction centers within the membrane. They are comprised of phycobiliproteins/chromophorylated proteins (PBPs) maintained together by linker polypeptides. PBPs have different numbers of chromophores, and the basic monomer component (alpha/beta heterodimers) can further oligomerize to ring-shaped trimers (heterohexamers) and hexamers (heterododecamers). Stacked PBP hexamers form both the core and the rods of the PBS; the core is mainly made up by allophycocyanin (APC) while the rods can be composed of the PBPs phycoerythrin (PE), phycocyanin (PC) and phycoerythrocyanin (PEC).	159
271281	cd12126	APC_beta	Allophycocyanin beta subunit of the phycobilisome core. Phycobilisomes (PBSs) are the main light-harvesting complex in cyanobacteria and red algae. In general, they consist of a central core and surrounding rods and function to harvest and channel light energy toward the photosynthetic reaction centers within the membrane. They are comprised of phycobiliproteins/chromophorylated proteins (PBPs) maintained together by linker polypeptides. PBPs have different numbers of chromophores, and the basic monomer component (alpha/beta heterodimers) can further oligomerize to ring-shaped trimers (heterohexamers) and hexamers (heterododecamers). Stacked PBP hexamers form both the core and the rods of the PBS; the core is mainly made up by allophycocyanin (APC) while the rods can be composed of the PBPs phycoerythrin (PE), phycocyanin (PC) and phycoerythrocyanin (PEC).	163
381266	cd12127	PE-PC-PEC_beta	Beta subunits of phycocyanin, phycoerythrin and phycoerythrocyanin; phycobilisome rod components. Phycobilisomes (PBSs) are the main light-harvesting complex in cyanobacteria and red algae. In general, they consist of a central core and surrounding rods and function to harvest and channel light energy toward the photosynthetic reaction centers within the membrane. They are comprised of phycobiliproteins/chromophorylated proteins (PBPs) maintained together by linker polypeptides. PBPs have different numbers of chromophores, and the basic monomer component (alpha/beta heterodimers) can further oligomerize to ring-shaped trimers (heterohexamers) and hexamers (heterododecamers). Stacked PBP hexamers form both the core and the rods of the PBS; the core is mainly made up by allophycocyanin (APC) while the rods can be composed of the PBPs phycoerythrin (PE), phycocyanin (PC) and phycoerythrocyanin (PEC). This family also includes the beta subunits of Cryptophyte phycobiliproteins which represent another type of biliprotein antenna with different structure and organization. The beta subunits of cryptophyte PBPs share a high degree of sequence identity with both the alpha and beta subunits of the cyanobacterial and red algal PBPs; however, the alpha cryptophyte subunits are shorter and unrelated. There is only one type of PBP present in a single species, either phycocyanin or phycoerythrin, but not allophycocyanin. Structurally, phycoerythrin in cryptophytes is an alpha1alpha2betabeta dimer and not a trimer as in the PBS.	174
381267	cd12128	PBP_PBS-LCM	Phycobiliprotein-like domain of the phycobilisome core-membrane linker polypeptide. Phycobilisomes (PBSs) are the main light-harvesting complex in cyanobacteria and red algae. They consist of a central core and radiating rods and function to harvest and channel light energy toward the photosynthetic reaction centers (RCs) within the membrane. They are comprised of phycobiliproteins or chromophorylated proteins (PBPs) maintained together by linker polypeptides. LCM is a chromophore-bearing PBS linker protein; it facilitates PBS assembly and functionally connects the PBS to the chlorophyll-containing core-complexes in the photosynthetic membrane. In addition to being a linker polypeptide that stabilizes the PBS architecture, the LCM also serves as a terminal energy acceptor. The single phycocyanobilin (PCB) chromophore of LCM is one of two terminal energy transmitters that transfer excitations from the hundreds of chromophores of the PBS to the RCs within the membrane.	172
271284	cd12129	PE-PC-PEC_alpha	Alpha subunits of phycoerythrin, phycocyanin and phycoerythrocyanin; phycobilisome rod components. Phycobilisomes (PBSs) are the main light-harvesting complex in cyanobacteria and red algae. In general, they consist of a central core and surrounding rods and function to harvest and channel light energy toward the photosynthetic reaction centers within the membrane. They are comprised of phycobiliproteins/chromophorylated proteins (PBPs) maintained together by linker polypeptides. PBPs have different numbers of chromophores, and the basic monomer component (alpha/beta heterodimers) can further oligomerize to ring-shaped trimers (heterohexamers) and hexamers (heterododecamers). Stacked PBP hexamers form both the core and the rods of the PBS; the core is mainly made up by allophycocyanin (APC) while the rods can be composed of the PBPs phycoerythrin (PE), phycocyanin (PC) and phycoerythrocyanin (PEC).	161
381268	cd12130	Apl	Allophycocyanin-like globins. Phycobilisomes (PBSs) are the main light-harvesting complex in cyanobacteria and red algae. In general, they consist of a central core and surrounding rods and function to harvest and channel light energy toward the photosynthetic reaction centers within the membrane. They are comprised of phycobiliproteins/chromophorylated proteins (PBPs) maintained together by linker polypeptides. PBPs have different numbers of chromophores, and the basic monomer component (alpha/beta heterodimers) can further oligomerize to ring-shaped trimers (heterohexamers) and hexamers (heterododecamers). Stacked PBP hexamers form both the core and the rods of the PBS; the core is mainly made up by allophycocyanin (APC) while the rods can be composed of the PBPs phycoerythrin (PE), phycocyanin (PC) and phycoerythrocyanin (PEC). This subfamily contains allophycocyanin-like proteins (Apls), which have conserved the residues critical for chromophore interactions, but may not maintain the proper alpha-beta subunit interactions and tertiary structure of phycobiliproteins. Indeed, AplA isolated from Fremyella diplosiphon was not detected in phycobilisomes. As the genes encoding Apls cluster with light-responsive regulatory components, Apls may have photoresponsive regulatory role(s).	154
381269	cd12131	HGbI-like	Hell's gate globin I (HGbI) from Methylacidiphilum infernorum and related proteins. HGbI is a single-domain heme-containing protein isolated from Methylacidiphilum infernorum, an aerobic acidophilic and thermophilic methanotroph. M. infernorum grows optimally at pH 2.0 and 60C and its home is New Zealand's Hell's Gate geothermal park. The physiological role of HGbI has yet to be determined. It has an extremely strong resistance to auto-oxidation, and has fast oxygen-binding/slow release characteristics. Its CO on-rate is comparable to the O2 on-rate, and it is able to bind acetate with high affinity in the ferric state. The coordination of the heme iron changes in the ferrous form from pentacoordinate at low pH to predominantly hexacoordinate at high pH; in the ferric form, it is predominantly hexacoordinate at all pH.	128
271287	cd12137	GbX	Globin_X (GbX). Zebrafish globin X (GbX) is expressed at low levels in neurons of the central nervous system, and appears to be associated with the sensory system. GbX is likely to be attached to the cell membrane via S-palmitoylation and N-myristoylation. It's unlikely to have a true respiratory function as it is membrane-associated. It has been suggested that it may protect the lipids in the cell membrane from oxidation or act as a redox-sensing or signaling protein. Zebrafish GbX is hexacoordinate, and displays cooperative O2 binding.	145
213015	cd12139	SH3_Bin1	Src Homology 3 domain of Bridging integrator 1 (Bin1), also called Amphiphysin-2. Bin1 isoforms are localized in many different tissues and may function in intracellular vesicle trafficking. It plays a role in the organization and maintenance of the T-tubule network in skeletal muscle. Mutations in Bin1 are associated with autosomal recessive centronuclear myopathy. Bin1 contains an N-terminal BAR domain with an additional N-terminal amphipathic helix (an N-BAR) and a C-terminal SH3 domain. The SH3 domain of Bin1 forms transient complexes with actin, myosin filaments, and CDK5, to facilitate sarcomere organization and myofiber maturation. It also binds dynamin and prevents its self-assembly. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	72
213016	cd12140	SH3_Amphiphysin_I	Src Homology 3 domain of Amphiphysin I. Amphiphysins function primarily in endocytosis and other membrane remodeling events. They exist in several isoforms and mammals possess two amphiphysin proteins from distinct genes. Amphiphysin I proteins, enriched in the brain and nervous system, contain domains that bind clathrin, Adaptor Protein complex 2 (AP2), dynamin, and synaptojanin. They function in synaptic vesicle endocytosis. Human autoantibodies to amphiphysin I hinder GABAergic signaling and contribute to the pathogenesis of paraneoplastic stiff-person syndrome. Amphiphysins contain an N-terminal BAR domain with an additional N-terminal amphipathic helix (an N-BAR), a variable central domain, and a C-terminal SH3 domain. The SH3 domain of amphiphysins binds proline-rich motifs present in binding partners such as dynamin, synaptojanin, and nsP3. It also belongs to a subset of SH3 domains that bind ubiquitin in a site that overlaps with the peptide binding site. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	72
213017	cd12141	SH3_DNMBP_C2	Second C-terminal Src homology 3 domain of Dynamin Binding Protein, also called Tuba, and similar domains. DNMBP or Tuba is a cdc42-specific guanine nucleotide exchange factor (GEF) that contains four N-terminal SH3 domains, a central RhoGEF [or Dbl homology (DH)] domain followed by a Bin/Amphiphysin/Rvs (BAR) domain, and two C-terminal SH3 domains. It provides a functional link between dynamin, Rho GTPase signaling, and actin dynamics. It plays an important role in regulating cell junction configuration. The C-terminal SH3 domains of DNMBP bind to N-WASP and Ena/VASP proteins, which are key regulatory proteins of the actin cytoskeleton. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	57
213018	cd12142	SH3_D21-like	Src Homology 3 domain of SH3 domain-containing protein 21 (SH3D21) and similar proteins. The N-terminal SH3 domain of the uncharacterized protein SH3 domain-containing protein 21, and similar uncharacterized domains, belong to the CD2AP-like_3 subfamily of proteins. The CD2AP-like_3 subfamily is composed of the third SH3 domain (SH3C) of CD2AP, CIN85 (Cbl-interacting protein of 85 kDa), and similar domains. CD2AP and CIN85 are adaptor proteins that bind to protein partners and assemble complexes that have been implicated in T cell activation, kidney function, and apoptosis of neuronal cells. They also associate with endocytic proteins, actin cytoskeleton components, and other adaptor proteins involved in receptor tyrosine kinase (RTK) signaling. CD2AP and the main isoform of CIN85 contain three SH3 domains, a proline-rich region, and a C-terminal coiled-coil domain. All of these domains enable CD2AP and CIN85 to bind various protein partners and assemble complexes that have been implicated in many different functions. SH3C of both proteins has been shown to bind to ubiquitin. SH3 domains are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. They play versatile and diverse roles in the cell including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies.	55
213019	cd12143	SH3_ARHGAP9	Src Homology 3 domain of Rho GTPase-activating protein 9 and similar proteins. Rho GTPase-activating proteins (RhoGAPs or ARHGAPs) bind to Rho proteins and enhance the hydrolysis rates of bound GTP. ARHGAP9 functions as a GAP for Rac and Cdc42, but not for RhoA. It negatively regulates cell migration and adhesion. It also acts as a docking protein for the MAP kinases Erk2 and p38alpha, and may facilitate cross-talk between the Rho GTPase and MAPK pathways to control actin remodeling. It contains SH3, WW, Pleckstrin homology (PH), and RhoGAP domains. SH3 domains bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs; they play a role in the regulation of enzymes by intramolecular interactions, changing the subcellular localization of signal pathway components and mediate multiprotein complex assemblies.	57
213387	cd12144	SDH_N_domain	Saccharopine dehydrogenase N-terminal domain. SDH N-terminal domain is named due to its appearance at the N-terminal of SDH in eukaryotes, but can be found C-terminal of the SDH-like domain in other enzymes, such as the bifunctional lysine ketoglutarate reductase/saccharopine dehydrogenase enzyme. SDH catalyzes the final step in the reversible NAD-dependent oxidative deamination of saccharopine to alpha-ketoglutarate and lysine, in the alpha-aminoadipate pathway of L-lysine biosynthesis. SDH is structurally related to formate dehydrogenase and similar enzymes, having a 2-domain structure in which a Rossmann-fold NAD(P)-binding domain is inserted within the linear sequence of a catalytic domain of a related structure.	114
213388	cd12145	Rev1_C	C-terminal domain of the Y-family polymerase Rev1. Rev1 is a eukaryotic translesion synthesis (TLS) polymerase; TLS is a process that allows the bypass of a variety of DNA lesions. TLS polymerases lack proofreading activity and have low fidelity and low processivity. They use damaged DNA as templates and insert nucleotides opposite the lesions. Rev1 has both structural and enzymatic roles. Structurally, it is believed to interact with other nonclassical polymerases and replication machinery to act as a scaffold. The C-terminal domain modeled here is essential for TLS and has been shown to mediate interactions with the Rev7 subunit of the B-family TLS polymerase Pol zeta (Rev3/Rev7), as well as with the RIRs (Rev1-interacting regions) of polymerases kappa, iota, and eta. Rev1 is known to actively promote the introduction of mutations, potentially making it a significant target for cancer treatment.	94
213389	cd12146	STING_C	C-terminal domain of STING. STING (stimulator of interferon genes, also known as MITA, ERIS, MPYS and TMEM173) is a master regulator that mediates cytokine production in response to microbial invasion by directly sensing bacterial secondary messengers such as the cyclic dinucleotide bis-(3'-5')-cyclic dimeric GMP (c-di-GMP) and leading to the activation of IFN regulatory factor 3 (IRF3) through TANK-binding kinase 1 (TBK1) stimulation. STING is also a signaling adaptor in the IFN response to cytosolic DNA. This detection of foreign materials is the first step to a successful immune response. STING is localized in the ER and comprised of a predicted N-terminal transmembrane region and a C-terminal c-di-GMP binding domain.	181
213390	cd12147	Cep3_C	C-terminal domain of Cep3, a subunit of the yeast centromere-binding factor 3. Cep3, together with Skp1, Ctf13, and Ndc10, forms the yeast centromere-binding factor 3 (CBF3) which initiates kinetochore assembly by binding to the CDEIII locus of centromeric DNA. Cep3 is comprised of two domains: an N-terminal DNA-binding module, a Zn2Cys6 cluster, and a C-terminal domain, which dimerizes and is believed to be involved in the recruitment of the Skp1-Ctf13 heterodimer.	552
213391	cd12148	fungal_TF_MHR	fungal transcription factor regulatory middle homology region. This domain is present in the large family of fungal zinc cluster transcription factors that contain an N-terminal GAL4-like C6 zinc binuclear cluster DNA-binding domain. Examples of members of this large fungal group are the following Saccharomyces cerevisiae transcription factors, GAL4, STB5, DAL81, CAT8, RDR1, HAL9, PUT3, PPR1, ASG1, RSF2, PIP2, as well as the C-terminal domain of the Cep3, a subunit of the yeast centromere-binding factor 3. It has been suggested that this region plays a regulatory role.	410
213392	cd12149	Flavi_E_C	Immunoglobulin-like domain III (C-terminal domain) of Flavivirus envelope glycoprotein E. The C-terminal domain (domain III) of Flavivirus glycoprotein E appears to be involved in low-affinity interactions with negatively charged glycosaminoglycans on the host cell surface. Domain III may also play a role in interactions with alpha-v-beta-3 integrins in West Nile virus, Japanese encephalitis virus, and Dengue virus. The interface between domain I and domain III appears to be destabilized by the low-pH environment of the endosome, and domain III may play a vital role in the conformational changes of envelope glycoprotein E that follow the clathrin-mediated endocytosis of viral particles and are a prerequisite to membrane fusion.	91
213393	cd12150	talin-RS	rod-segment of the talin C-terminal domain. The talin rod-segment characterized by this model interacts with its N-terminal FERM domain to mask its integrin-binding site and interferes with interactions between the FERM domain and the cellular membrane. Talin is a large and ubiquitous cytoskeletal protein concentrated at focal adhesion sites. It is involved in linking integrins to the actin cytoskeleton.	172
213394	cd12151	F1-ATPase_gamma	mitochondrial ATP synthase gamma subunit. The F-ATPase is found in bacterial plasma membranes, mitochondrial inner membranes and in chloroplast thylakoid membranes. It has also been found in the archaeon Methanosarcina barkeri. It uses a proton gradient to drive ATP synthesis and hydrolyzes ATP to build the proton gradient. The extrinsic membrane domain of F-ATPases is composed of alpha, beta, gamma, delta, and epsilon (not present in bacteria) subunits with a stoichiometry of 3:3:1:1:1. Alpha and beta subunits form the globular catalytic moiety, a hexameric ring of alternating subunits. Gamma, delta and epsilon subunits form a stalk, connecting F1 to F0, the integral membrane proton translocating domain.	282
213395	cd12152	F1-ATPase_delta	mitochondrial ATP synthase delta subunit. The F-ATPase is found in bacterial plasma membranes, mitochondrial inner membranes and in chloroplast thylakoid membranes. It has also been found in the archaeon Methanosarcina barkeri. It uses a proton gradient to drive ATP synthesis and hydrolyzes ATP to build the proton gradient. The extrinsic membrane domain, F1, is composed of alpha, beta, gamma, delta, and epsilon subunits with a stoichiometry of 3:3:1:1:1. Alpha and beta subunits form the globular catalytic moiety, a hexameric ring of alternating subunits. Gamma, delta and epsilon subunits form a stalk, connecting F1 to F0, the integral membrane proton translocating domain. In bacteria, which lack a eukaryotic epsilon subunit homolog, this subunit is called the epsilon subunit.	123
213396	cd12153	F1-ATPase_epsilon	eukaryotic mitochondrial ATP synthase epsilon subunit. The F-ATPase is found in bacterial plasma membranes, mitochondrial inner membranes, and in chloroplast thylakoid membranes. It uses a proton gradient to drive ATP synthesis and hydrolyzes ATP to build the proton gradient. The extrinsic membrane domain, F1, is composed of alpha, beta, gamma, delta, and epsilon subunits (only found in eukaryotes, lacking in bacteria) with a stoichiometry of 3:3:1:1:1. Alpha and beta subunits form the globular catalytic moiety, a hexameric ring of alternating subunits. Gamma, delta and epsilon subunits form a stalk, connecting F1 to F0, the integral membrane proton translocating domain. The epsilon subunit is thought to be involved in the regulation of ATP synthase, since a null mutation increased oligomycin sensitivity and decreased inhibition by inhibitor protein IF1.	45
240631	cd12154	FDH_GDH_like	Formate/glycerate dehydrogenases, D-specific 2-hydroxy acid dehydrogenases and related dehydrogenases. The formate/glycerate dehydrogenase like family contains a diverse group of enzymes such as formate dehydrogenase (FDH), glycerate dehydrogenase (GDH), D-lactate dehydrogenase, L-alanine dehydrogenase, and S-Adenosylhomocysteine hydrolase, that share a common 2-domain structure. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar domains of the alpha/beta Rossmann fold NAD+ binding form. The NAD(P) binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD(P) is bound, primarily to the C-terminal portion of the 2nd (internal) domain. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. 2-hydroxyacid dehydrogenases are enzymes that catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate dehydrogenase (FDH) catalyzes the NAD+-dependent oxidation of formate ion to carbon dioxide with the concomitant reduction of NAD+ to NADH. FDHs of this family contain no metal ions or prosthetic groups. Catalysis occurs through direct transfer of a hydride ion to NAD+ without the stages of acid-base catalysis typically found in related dehydrogenases.	310
240632	cd12155	PGDH_1	Phosphoglycerate Dehydrogenase, 2-hydroxyacid dehydrogenase family. Phosphoglycerate Dehydrogenase (PGDH) catalyzes the NAD-dependent conversion of 3-phosphoglycerate into 3-phosphohydroxypyruvate, which is the first step in serine biosynthesis. Over-expression of PGDH has been implicated as supporting proliferation of certain breast cancers, while PGDH deficiency is linked to defects in mammalian central nervous system development. PGDH is a member of the 2-hydroxyacid dehydrogenase family, enzymes that catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-Adenosylhomocysteine Hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann-fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric.	314
240633	cd12156	HPPR	Hydroxy(phenyl)pyruvate Reductase, D-isomer-specific 2-hydroxyacid-related dehydrogenase. Hydroxy(phenyl)pyruvate reductase (HPPR) catalyzes the NADP-dependent reduction of hydroxyphenylpyruvates, hydroxypyruvate, or pyruvate to its respective lactate. HPPR acts as a dimer and is related to D-isomer-specific 2-hydroxyacid dehydrogenases, a superfamily that includes groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-Adenosylhomocysteine Hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric.	301
240634	cd12157	PTDH	Thermostable Phosphite Dehydrogenase. Phosphite dehydrogenase (PTDH), a member of the D-specific 2-hydroxyacid dehydrogenase family, catalyzes the NAD-dependent formation of phosphate from phosphite (hydrogen phosphonate). PTDH has been suggested as a potential enzyme for cofactor regeneration systems. The D-specific 2-hydroxyacid dehydrogenase superfamily includes groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD-binding domain.	318
240635	cd12158	ErythrP_dh	D-Erythronate-4-Phosphate Dehydrogenase NAD-binding and catalytic domains. D-Erythronate-4-phosphate Dehydrogenase (E. coli gene PdxB), a D-specific 2-hydroxyacid dehydrogenase family member, catalyzes the NAD-dependent oxidation of erythronate-4-phosphate, which is followed by transamination to form 4-hydroxy-L-threonine-4-phosphate within the de novo biosynthesis pathway of vitamin B6. D-Erythronate-4-phosphate dehydrogenase has the common architecture shared with D-isomer specific 2-hydroxyacid dehydrogenases but contains an additional C-terminal dimerization domain in addition to an NAD-binding domain and the "lid" domain. The lid domain corresponds to the catalytic domain of phosphoglycerate dehydrogenase and other proteins of the D-isomer specific 2-hydroxyacid dehydrogenase family, which include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence.	343
240636	cd12159	2-Hacid_dh_2	Putative D-isomer specific 2-hydroxyacid dehydrogenases. 2-Hydroxyacid dehydrogenases catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric.	303
240637	cd12160	2-Hacid_dh_3	Putative D-isomer specific 2-hydroxyacid dehydrogenases. 2-Hydroxyacid dehydrogenases catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric.	310
240638	cd12161	GDH_like_1	Putative glycerate dehydrogenase and related proteins of the D-specific 2-hydroxy dehydrogenase family. This group contains a variety of proteins variously identified as glycerate dehydrogenase (GDH, aka Hydroxypyruvate Reductase) and other enzymes of the 2-hydroxyacid dehydrogenase family. GDH catalyzes the reversible reaction of (R)-glycerate + NAD+ to hydroxypyruvate + NADH + H+. 2-hydroxyacid dehydrogenases catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann-fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric.	315
240639	cd12162	2-Hacid_dh_4	Putative D-isomer specific 2-hydroxyacid dehydrogenases. 2-Hydroxyacid dehydrogenases catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric.	307
240640	cd12163	2-Hacid_dh_5	Putative D-isomer specific 2-hydroxyacid dehydrogenases. 2-Hydroxyacid dehydrogenases catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric.	334
240641	cd12164	GDH_like_2	Putative glycerate dehydrogenase and related proteins of the D-specific 2-hydroxy dehydrogenase family. This group contains a variety of proteins variously identified as glycerate dehydrogenase (GDH, also known as hydroxypyruvate reductase) and other enzymes of the 2-hydroxyacid dehydrogenase family. GDH catalyzes the reversible reaction of (R)-glycerate + NAD+ to hydroxypyruvate + NADH + H+. 2-hydroxyacid dehydrogenases catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann-fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric.	306
240642	cd12165	2-Hacid_dh_6	Putative D-isomer specific 2-hydroxyacid dehydrogenases. 2-Hydroxyacid dehydrogenases catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric.	314
240643	cd12166	2-Hacid_dh_7	Putative D-isomer specific 2-hydroxyacid dehydrogenases. 2-Hydroxyacid dehydrogenases catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric.	300
240644	cd12167	2-Hacid_dh_8	Putative D-isomer specific 2-hydroxyacid dehydrogenases. 2-Hydroxyacid dehydrogenases catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric.	330
240645	cd12168	Mand_dh_like	D-Mandelate Dehydrogenase-like dehydrogenases. D-Mandelate dehydrogenase (D-ManDH), identified as an enzyme that interconverts benzoylformate and D-mandelate, is a D-2-hydroxyacid dehydrogenase family member that catalyzes the conversion of C3-branched 2-ketoacids. D-ManDH exhibits broad substrate specificities for 2-ketoacids with large hydrophobic side chains, particularly those with C3-branched side chains. 2-hydroxyacid dehydrogenases catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Glycerate dehydrogenase catalyzes the reaction (R)-glycerate + NAD+ to hydroxypyruvate + NADH + H+. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain.	321
240646	cd12169	PGDH_like_1	Putative D-3-Phosphoglycerate Dehydrogenases. Phosphoglycerate dehydrogenases (PGDHs) catalyze the initial step in the biosynthesis of L-serine from D-3-phosphoglycerate. PGDHs come in 3 distinct structural forms, with this first group being related to 2-hydroxy acid dehydrogenases, sharing structural similarity to formate and glycerate dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily, which also include groups such as L-alanine dehydrogenase and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. Many, not all, members of this family are dimeric.	308
240647	cd12170	2-Hacid_dh_9	Putative D-isomer specific 2-hydroxyacid dehydrogenases. 2-Hydroxyacid dehydrogenases catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric.	294
240648	cd12171	2-Hacid_dh_10	Putative D-isomer specific 2-hydroxyacid dehydrogenases. 2-Hydroxyacid dehydrogenases catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric.	310
240649	cd12172	PGDH_like_2	Putative D-3-Phosphoglycerate Dehydrogenases, NAD-binding and catalytic domains. Phosphoglycerate dehydrogenases (PGDHs) catalyze the initial step in the biosynthesis of L-serine from D-3-phosphoglycerate. PGDHs come in 3 distinct structural forms, with this first group being related to 2-hydroxy acid dehydrogenases, sharing structural similarity to formate and glycerate dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily, which also include groups such as L-alanine dehydrogenase and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. Many, not all, members of this family are dimeric.	306
240650	cd12173	PGDH_4	Phosphoglycerate dehydrogenases, NAD-binding and catalytic domains. Phosphoglycerate dehydrogenases (PGDHs) catalyze the initial step in the biosynthesis of L-serine from D-3-phosphoglycerate. PGDHs come in 3 distinct structural forms, with this first group being related to 2-hydroxy acid dehydrogenases, sharing structural similarity to formate and glycerate dehydrogenases. PGDH in E. coli and Mycobacterium tuberculosis form tetramers, with subunits containing a Rossmann-fold NAD binding domain. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence.	304
240651	cd12174	PGDH_like_3	Putative D-3-Phosphoglycerate Dehydrogenases, NAD-binding and catalytic domains. Phosphoglycerate dehydrogenases (PGDHs) catalyze the initial step in the biosynthesis of L-serine from D-3-phosphoglycerate. PGDHs come in 3 distinct structural forms, with this first group being related to 2-hydroxy acid dehydrogenases, sharing structural similarity to formate and glycerate dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily, which also include groups such as L-alanine dehydrogenase and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. Many, not all, members of this family are dimeric.	305
240652	cd12175	2-Hacid_dh_11	Putative D-isomer specific 2-hydroxyacid dehydrogenases, NAD-binding and catalytic domains. 2-Hydroxyacid dehydrogenases catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric.	311
240653	cd12176	PGDH_3	Phosphoglycerate dehydrogenases, NAD-binding and catalytic domains. Phosphoglycerate dehydrogenases (PGDHs) catalyze the initial step in the biosynthesis of L-serine from D-3-phosphoglycerate. PGDHs come in 3 distinct structural forms, with this first group being related to 2-hydroxy acid dehydrogenases, sharing structural similarity to formate and glycerate dehydrogenases. PGDH in E. coli and Mycobacterium tuberculosis form tetramers, with subunits containing a Rossmann-fold NAD binding domain. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence.	304
240654	cd12177	2-Hacid_dh_12	Putative D-isomer specific 2-hydroxyacid dehydrogenases, NAD-binding and catalytic domains. 2-Hydroxyacid dehydrogenases catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric.	321
240655	cd12178	2-Hacid_dh_13	Putative D-isomer specific 2-hydroxyacid dehydrogenases, NAD-binding and catalytic domains. 2-Hydroxyacid dehydrogenases catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric.	317
240656	cd12179	2-Hacid_dh_14	Putative D-isomer specific 2-hydroxyacid dehydrogenases, NAD-binding and catalytic domains. 2-Hydroxyacid dehydrogenases catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric.	306
240657	cd12180	2-Hacid_dh_15	Putative D-isomer specific 2-hydroxyacid dehydrogenases, NAD-binding and catalytic domains. 2-Hydroxyacid dehydrogenases catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain. Some related proteins have similar structural subdomain but with a tandem arrangement of the catalytic and NAD-binding subdomains in the linear sequence. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric.	308
240658	cd12181	ceo_syn	N(5)-(carboxyethyl)ornithine synthase. N(5)-(carboxyethyl)ornithine synthase (ceo_syn) catalyzes the NADP-dependent conversion of N5-(L-1-carboxyethyl)-L-ornithine to L-ornithine + pyruvate. Ornithine plays a key role in the urea cycle, which in mammals is used in arginine biosynthesis, and is a precursor in polyamine synthesis. ceo_syn is related to the NAD-dependent L-alanine dehydrogenases. Like formate dehydrogenase and related enzymes, ceo_syn is comprised of 2 domains connected by a long alpha helical stretch, each resembling a Rossmann fold NAD-binding domain. The NAD-binding domain is inserted within the linear sequence of the more divergent catalytic domain. These ceo_syn proteins have a partially conserved NAD-binding motif and active site residues that are characteristic of related enzymes such as Saccharopine Dehydrogenase.	295
240659	cd12183	LDH_like_2	D-Lactate and related Dehydrogenases, NAD-binding and catalytic domains. D-Lactate dehydrogenase (LDH) catalyzes the interconversion of pyruvate and lactate, and is a member of the 2-hydroxyacid dehydrogenase family. LDH is homologous to D-2-hydroxyisocaproic acid dehydrogenase (D-HicDH) and shares the 2-domain structure of formate dehydrogenase. D-2-hydroxyisocaproate dehydrogenase-like (HicDH) proteins are NAD-dependent members of the hydroxycarboxylate dehydrogenase family, and share the Rossmann fold typical of many NAD binding proteins. HicDH from Lactobacillus casei forms a monomer and catalyzes the reaction R-CO-COO(-) + NADH + H+ to R-COH-COO(-) + NAD+. D-HicDH, like the structurally distinct L-HicDH, exhibits low side-chain R specificity, accepting a wide range of 2-oxocarboxylic acid side chains. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-adenosylhomocysteine hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain.	328
240660	cd12184	HGDH_like	(R)-2-Hydroxyglutarate Dehydrogenase and related dehydrogenases, NAD-binding and catalytic domains. (R)-2-hydroxyglutarate dehydrogenase (HGDH) catalyzes the NAD-dependent reduction of 2-oxoglutarate to (R)-2-hydroxyglutarate. HGDH is a member of the D-2-hydroxyacid NAD(+)-dependent dehydrogenase family; these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain.	330
240661	cd12185	HGDH_LDH_like	Putative Lactate dehydrogenase and (R)-2-Hydroxyglutarate Dehydrogenase-like proteins, NAD-binding and catalytic domains. This group contains various putative dehydrogenases related to D-lactate dehydrogenase (LDH), (R)-2-hydroxyglutarate dehydrogenase (HGDH), and related enzymes, members of the 2-hydroxyacid dehydrogenases family. LDH catalyzes the interconversion of pyruvate and lactate, and HGDH catalyzes the NAD-dependent reduction of 2-oxoglutarate to (R)-2-hydroxyglutarate. Despite often low sequence identity within this 2-hydroxyacid dehydrogenase family, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain.	322
240662	cd12186	LDH	D-Lactate dehydrogenase and D-2-Hydroxyisocaproic acid dehydrogenase (D-HicDH), NAD-binding and catalytic domains. D-Lactate dehydrogenase (LDH) catalyzes the interconversion of pyruvate and lactate, and is a member of the 2-hydroxyacid dehydrogenases family. LDH is homologous to D-2-hydroxyisocaproic acid dehydrogenase (D-HicDH) and shares the 2-domain structure of formate dehydrogenase. D-HicDH is a NAD-dependent member of the hydroxycarboxylate dehydrogenase family, and shares the Rossmann fold typical of many NAD binding proteins. HicDH from Lactobacillus casei forms a monomer and catalyzes the reaction R-CO-COO(-) + NADH + H+ to R-COH-COO(-) + NAD+. D-HicDH, like the structurally distinct L-HicDH, exhibits low side-chain R specificity, accepting a wide range of 2-oxocarboxylic acid side chains. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-Adenosylhomocysteine Hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain.	329
240663	cd12187	LDH_like_1	D-Lactate and related Dehydrogenase like proteins, NAD-binding and catalytic domains. D-Lactate dehydrogenase (LDH) catalyzes the interconversion of pyruvate and lactate, and is a member of the 2-hydroxyacid dehydrogenase family. LDH is homologous to D-2-Hydroxyisocaproic acid dehydrogenase (D-HicDH) and shares the 2-domain structure of formate dehydrogenase. D-2-hydroxyisocaproate dehydrogenase-like (HicDH) proteins are NAD-dependent members of the hydroxycarboxylate dehydrogenase family, and share the Rossmann fold typical of many NAD binding proteins. HicDH from Lactobacillus casei forms a monomer and catalyzes the reaction R-CO-COO(-) + NADH + H+ to R-COH-COO(-) + NAD+. D-HicDH, like the structurally distinct L-HicDH, exhibits low side-chain R specificity, accepting a wide range of 2-oxocarboxylic acid side chains. Formate/glycerate and related dehydrogenases of the D-specific 2-hydroxyacid dehydrogenase superfamily include groups such as formate dehydrogenase, glycerate dehydrogenase, L-alanine dehydrogenase, and S-Adenosylhomocysteine Hydrolase. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar subdomains of the alpha/beta Rossmann fold NAD+ binding form. The NAD+ binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain, which has a similar domain structure to the internal NAD binding domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD is bound, primarily to the C-terminal portion of the 2nd (internal) domain.	329
240664	cd12188	SDH	Saccharopine Dehydrogenase NAD-binding and catalytic domains. Saccharopine Dehydrogenase (SDH) catalyzes the final step in the reversible NAD-dependent oxidative deamination of saccharopine to alpha-ketoglutarate and lysine, in the alpha-aminoadipate pathway of L-lysine biosynthesis. SDH is structurally related to formate dehydrogenase and similar enzymes, having a 2-domain structure in which a Rossmann-fold NAD(P)-binding domain is inserted within the linear sequence of a catalytic domain of related structure.	351
240665	cd12189	LKR_SDH_like	bifunctional lysine ketoglutarate reductase/saccharopine dehydrogenase enzyme. Bifunctional lysine ketoglutarate reductase/saccharopine dehydrogenase protein is a pair of enzymes linked on a single polypeptide chain that catalyze the initial, consecutive steps of lysine degradation. These proteins are related to the 2-domain saccharopine dehydrogenases. Along with formate dehydrogenase and similar enzymes, SDH consists of paired domains resembling Rossmann folds in which the NAD-binding domain is inserted within the linear sequence of the catalytic domain. In this bifunctional enzyme, the LKR domain is N-terminal of the SDH domain. These proteins have a close match to the active site motif of SDHs, and an NAD-binding site motif that is a partial match to that found in SDH and other FDH-related proteins.	433
213397	cd12190	Bacova_04320_like	Uncharacterized proteins similar to Bacteroides ovatus 4320. This model characterizes a family of proteins conserved in Bacteroidetes, similar to B. ovatus ATCC 8483 reading frame 04320. Structurally, the protein resembles members of the SRPBCC domain superfamily (START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC).	159
213398	cd12191	gal11_coact	Gal11 coactivator domain. Gal11/MED15 acts in the general regulation of GAL structural genes and is required for full expression of several genes in this pathway, including GALs 1, 7, and 10 in Saccharomyces cerevisiae. GAL11 function is dependent on GCN4 functionality and binds GCN4 in a degenerate manner with multiple orientations found at the GCN4-Gal11 interface.	90
213399	cd12192	GCN4_cent	GCN4 central activation domain-like acidic activation domain. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region.	40
269833	cd12193	bZIP_GCN4	Basic leucine zipper (bZIP) domain of General control protein GCN4: a DNA-binding and dimerization domain. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain. In amino acid-deprived cells, GCN4 is up-regulated leading to transcriptional activation of genes encoding amino acid biosynthetic enzymes. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription.	54
213379	cd12194	Kcc4p_like_C	C-terminal kinase associated domain 1 (KA1), a phospholipid binding domain, of Kcc4p and similar proteins. This subfamily is composed of three Saccharomyces cerevisiae proteins, Kcc4p, Gin4p, and Hsl1p, as well as similar serine/threonine protein kinases (STKs). They catalyze the transfer of the gamma-phosphoryl group from ATP to S/T residues on protein substrates. Kcc4p, Gin4p, and Hsl1p are septin-associated proteins that are involved in septin organization and in the yeast morphogenesis checkpoint coordinating the cell cycle with bud formation. They negatively regulate the Wee1-related kinase Swe1, which phosphorylates the cyclin-dependent kinase Cdc28, and is involved in regulating the entry of cells into mitosis. Kcc4p, Gin4p, and Hsl1p localize in the bud neck in a septin-dependent manner and display distinct but partially overlapping functions. They contain an N-terminal catalytic kinase domain and a C-terminal KA1 domain. The KA1 domain of Kcc4p, Gin4p, and Hsl1p binds acidic phospholipids including phosphatidylserine (PtdSer) and is required for bud neck localization.	122
213380	cd12195	CIPK_C	C-terminal regulatory domain of Calcineurin B-Like (CBL)-interacting protein kinases. CIPKs are serine/threonine protein kinases (STKs), catalyzing the transfer of the gamma-phosphoryl group from ATP to S/T residues on protein substrates. They comprise a unique family in higher plants of proteins that interact with the calcineurin B-like (CBL) calcium sensors to form a signaling network that decodes specific calcium signals triggered by a variety of environmental stimuli including salinity, drought, cold, light, and mechanical perturbation, among others. The specificity of the response relies on differences in expression and localization of both CBLs and CIPKs, as well as on the interaction specificity of CBL-CIPK combinations. There are 25, 30, and 43 CIPK genes identified in the Arabidopsis thaliana, Oryza sativa, and Zea mays genomes, respectively. The founding member of the CIPK family is Arabidopsis thaliana CIPK24, also called SOS2 (Salt Overly Sensitive 2). CIPKs contain an N-terminal catalytic kinase domain and a C-terminal regulatory domain that contains the FISL (also called NAF for Asn-Ala-Phe) and PPI-binding motifs, which are involved in the interaction with CBLs and PP2C-type protein phosphatases, respectively. Studies using SOS2, SOS3, and ABI2 phosphatase show that the binding of CBL and PP2C-type protein phosphatase to CIPK is mutually exclusive. The binding of CBL to CIPK is inhibitory to kinase activity.	116
213381	cd12196	MARK1-3_C	C-terminal, kinase associated domain 1 (KA1), a phospholipid binding domain, of microtubule affinity-regulating kinases 1-3. Microtubule-associated protein/microtubule affinity regulating kinases (MARKs), also called partition-defective (Par-1) kinases, are serine/threonine protein kinases (STKs) that catalyze the transfer of the gamma-phosphoryl group from ATP to S/T residues on protein substrates. They phosphorylate the tau protein and related microtubule-associated proteins (MAPs) on tubulin binding sites to induce detachment from microtubules, and are involved in the regulation of cell shape and polarity, cell cycle control, transport, and the cytoskeleton. Mammals contain four proteins, MARK1-4, encoded by distinct genes belonging to this subfamily, with additional isoforms arising from alternative splicing. MARK1/2, through their activation by death-associated protein kinase (DAPK), modulate polarized neurite outgrowth. MARK1, also called Par-1c, is also involved in axon-dendrite specification, and SNPs in the MARK1 gene are associated with autism spectrum disorders. MARK2, also called Par-1b, is implicated in many physiological processes including fertility, immune system homeostasis, learning and memory, growth, and metabolism. MARK3, also called Par-1a, is implicated in gluconeogenesis and adiposity; mice deficient in MARK3 display reduced adiposity, resistance to hepatic steatosis, and defective gluconeogenesis. MARKs contain an N-terminal catalytic kinase domain, a ubiquitin-associated domain (UBA), and a C-terminal kinase associated domain (KA1). The KA1 domain binds anionic phospholipids and may be involved in membrane localization as well as in auto-inhibition of the kinase domain.	98
213382	cd12197	MARK4_C	C-terminal, kinase associated domain 1 (KA1), a phospholipid binding domain, of microtubule affinity-regulating kinase 4. Microtubule-associated protein/microtubule affinity regulating kinases (MARKs), also called partition-defective (Par-1) kinases, are serine/threonine protein kinases (STKs) that catalyze the transfer of the gamma-phosphoryl group from ATP to S/T residues on protein substrates. They phosphorylate the tau protein and related microtubule-associated proteins (MAPs) on tubulin binding sites to induce detachment from microtubules, and are involved in the regulation of cell shape and polarity, cell cycle control, transport, and the cytoskeleton. Mammals contain four proteins, MARK1-4, encoded by distinct genes belonging to this subfamily, with additional isoforms arising from alternative splicing. MARK4 has two splicing isoforms: MARK4S, predominantly expressed in the brain; and MARK4L, expressed in all tissues. Unlike MARK1-3 that show cytoplasmic localization, MARK4 colocalizes with the centrosome and with microtubules. Decreased MARK4 expression in the brain may be involved in the pathogenesis of Prion diseases and may be correlated to PrP(Sc) deposits. MARK4 is also a component of the ectoplasmic specialization, a testis-specific adherens junction. MARKs contain an N-terminal catalytic kinase domain, a ubiquitin-associated domain (UBA), and a C-terminal kinase associated domain (KA1). The KA1 domain binds anionic phospholipids and may be involved in membrane localization as well as in auto-inhibition of the kinase domain.	99
213383	cd12198	MELK_C	C-terminal kinase associated domain 1 (KA1) of Maternal embryonic leucine zipper kinase. MELK, also called protein kinase 38 (PK38) or pEg3 kinase, is a cell cycle-regulated serine/threonine protein kinase (STK) that catalyzes the transfer of the gamma-phosphoryl group from ATP to S/T residues on protein substrates. It is phosphorylated and maximally active during mitosis and is involved in regulating cell cycle progression, division, proliferation, tumor growth, and mRNA splicing. MELK shows a broad substrate specificity, including the zinc finger-like protein ZPR9, the transcription and splicing factor NIPP1, and the protein-tyrosine phosphatase Cdc25B, among others. MELK contains an N-terminal catalytic domain followed by a ubiquitin-associated (UBA) domain, a TP dipeptide-rich region, and a C-terminal KA1 domain. The KA1 domain of MELK, together with its TP dipeptide-rich region, functions as an autoinhibitory domain. The KA1 domain of the related microtubule affinity-regulating kinases (MARKs) has been shown to bind anionic phospholipids and may be involved in membrane localization.	96
213384	cd12199	AMPKA1_C	C-terminal regulatory domain of 5'-AMP-activated protein kinase (AMPK) alpha 1 catalytic subunit. AMPK, a serine/threonine protein kinase (STK), catalyzes the transfer of the gamma-phosphoryl group from ATP to S/T residues on protein substrates. It acts as a sensor for the energy status of the cell and is activated by cellular stresses that lead to ATP depletion such as hypoxia, heat shock, and glucose deprivation, among others. AMPK is a heterotrimer of three subunits: alpha, beta, and gamma. Co-expression of the three subunits is required for kinase activity; in the absence of one, the other two subunits get degraded. The AMPK alpha subunit is the catalytic subunit and it contains an N-terminal kinase domain and a C-terminal regulatory domain (RD). Vertebrates contain two isoforms of the alpha subunit, alpha1 and alpha2, which are encoded by different genes, PRKAA1 and PRKAA2, respectively, and show varying expression patterns. AMPKalpha1 is the predominant isoform expressed in bone; it plays a role in bone remodeling in response to hormonal regulation. It is selectively regulated by nucleoside diphosphate kinase (NDPK)-A in an AMP-independent manner. AMPKalpha1 impacts the regulation of fat metabolism through its in vivo target, acetyl coenzyme A carboxylase (ACC). It also mediates the vasoprotective effects of estrogen through phosphorylation of another in vivo substrate, RhoA. The C-terminal RD of the AMPK alpha 1 subunit is involved in AMPK heterotrimer formation. It mainly interacts with the C-terminal region of the beta subunit to form a tight alpha-beta complex that is associated with the gamma subunit. The AMPK alpha subunit RD also contains an auto-inhibitory region that interacts with the kinase domain; this inhibition is negated by the interaction with the AMPK gamma subunit.	96
213385	cd12200	AMPKA2_C	C-terminal regulatory domain of 5'-AMP-activated serine/threonine kinase, subunit alpha. AMPK, a serine/threonine protein kinase (STK), catalyzes the transfer of the gamma-phosphoryl group from ATP to S/T residues on protein substrates. It acts as a sensor for the energy status of the cell and is activated by cellular stresses that lead to ATP depletion such as hypoxia, heat shock, and glucose deprivation, among others. AMPK is a heterotrimer of three subunits: alpha, beta, and gamma. Co-expression of the three subunits is required for kinase activity; in the absence of one, the other two subunits get degraded. The AMPK alpha subunit is the catalytic subunit and it contains an N-terminal kinase domain and a C-terminal regulatory domain (RD). Vertebrates contain two isoforms of the alpha subunit, alpha1 and alpha2, which are encoded by different genes, PRKAA1 and PRKAA2, respectively, and show varying expression patterns. AMPKalpha2 shows cytoplasmic and nuclear localization, whereas AMPKalpha1 is localized only in the cytoplasm. The C-terminal RD of the AMPK alpha 1 subunit is involved in AMPK heterotrimer formation. It mainly interacts with the C-terminal region of the beta subunit to form a tight alpha-beta complex that is associated with the gamma subunit. The AMPK alpha subunit RD also contains an auto-inhibitory region that interacts with the kinase domain; this inhibition is negated by the interaction with the AMPK gamma subunit.	102
213386	cd12201	MARK2_C	C-terminal, kinase associated domain 1 (KA1), a phospholipid binding domain, of microtubule affinity-regulating kinase 2. Microtubule-associated protein/microtubule affinity regulating kinases (MARKs), also called partition-defective (Par-1) kinases, are serine/threonine protein kinases (STKs) that catalyze the transfer of the gamma-phosphoryl group from ATP to S/T residues on protein substrates. They phosphorylate the tau protein and related microtubule-associated proteins (MAPs) on tubulin binding sites to induce detachment from microtubules, and are involved in the regulation of cell shape and polarity, cell cycle control, transport, and the cytoskeleton. Mammals contain four proteins, MARK1-4, encoded by distinct genes belonging to this subfamily, with additional isoforms arising from alternative splicing. MARK2, also called Par-1b or ELKL motif kinase 1 (EMK-1), is implicated in many physiological processes including fertility, immune system homeostasis, learning and memory, growth, and metabolism. It also regulates axon formation and has been implicated in neurodegeneration. MARKs contain an N-terminal catalytic kinase domain, a ubiquitin-associated domain (UBA), and a C-terminal kinase associated domain (KA1). The KA1 domain binds anionic phospholipids and may be involved in membrane localization as well as in auto-inhibition of the kinase domain.	99
213401	cd12202	CASP8AP2	Caspase 8-associated protein 2 myb-like domain. This domain is the SANT/myb-like domain of Caspase 8-associated protein 2 (CASP8AP2) / GON-4 like proteins. CASP8AP2 (aka Flice-Associated Huge Protein (FLASH)) is implicated in numerous gene regulatory roles including roles in embryogenesis, oncogenesis, down-regulation of replication-dependent histone genes, regulation of Caspase 8 activity at the death-inducing signaling complex (DISC), and as a useful marker in leukemia prognosis. Gon-4 is critical in Caenorhabditis elegans gonadogenesis. Danio rerio GON4 is a regulator of gene expression in hematopoietic development, possibly by repressing expression. These proteins are members of the SANT/myb group. SANT is named after 'SWI3, ADA2, N-CoR and TFIIIB', several factors that share this domain. The SANT domain resembles the 3 alpha-helix bundle of the DNA-binding Myb domains and is found in a diverse set of proteins.	66
213402	cd12203	GT1	GT1, myb-like, SANT family. GT-1, a myb-like protein, is one of the GT trihelix transcription factors. GT-1 binds the GT cis-element of rbcS-3A, a light-induced gene, as a dimer. Arabidopsis GT-1 is a trans-activator and acts in the stabilization of components of the transcription pre-initiation complex comprised of TFIIA-TBP-TATA. The isolated GT-1 DNA-binding domain is sufficient to bind DNA. This region closely resembles the myb domain, but with longer helices. It has been proposed that GT-1 may respond to light signals via calcium-dependent phosphorylation to create a light-modulated molecular switch. These proteins are members of the SANT/myb group. SANT is named after 'SWI3, ADA2, N-CoR and TFIIIB', several factors that share this domain. The SANT domain resembles the 3 alpha-helix bundle of the DNA-binding Myb domains and is found in a diverse set of proteins.	66
213176	cd12204	CBD_like	Cellulose-binding domain, chitinase and related proteins. This group contains proteins related to the cellulose-binding domain of Erwinia chrysanthemi endoglucanase Z (EGZ) and Serratia marcescens chitinase B (ChiB). The Gram-negative plant parasite Erwinia chrysanthemi produces a variety of depolymerizing enzymes to metabolize pectin and cellulose on the host plant. Cellulase EGZ has a modular structure, with an N-terminal catalytic domain linked to a C-terminal cellulose-binding domain (CBD). CBD mediates the secretion activity of EGZ. Chitinases allow certain bacteria to utilize chitin as an energy source. Typically, non-plant chitinases are of the glycosidase family 18.	48
213344	cd12205	RasGAP_plexin	Ras-GTPase Activating Domain of plexins. Plexins form a conserved family of transmembrane receptors for semaphorins and may be the ancestors of semaphorins. Ligand binding activates signal transduction pathways controlling axon guidance in the nervous system and other developmental processes, including cell migration and morphogenesis, immune function, and tumor progression. Plexins are divided into four types (A-D) according to sequence similarity. In vertebrates, type A Plexins serve as the co-receptors for neuropilins to mediate the signaling of class 3 semaphorins except Sema3E, which signals through Plexin D1. Plexins serve as direct receptors for several other members of the semaphorin family: class 6 semaphorins signal through type A plexins and class 4 semaphorins through type B. Plexin C1 serves as the receptor of Sema7A and plays regulatory roles in both immune and nervous systems. Plexins contain a C-terminal RasGAP domain, which functions as an enhancer of the hydrolysis of GTP that is bound to Ras-GTPases. Plexins display GAP activity towards the Ras homolog Rap. Other proteins having a RasGAP domain include p120GAP, IQGAP, Rab5-activating protein 6, and Neurofibromin. Although the Rho (Ras homolog) GTPases are most closely related to members of the Ras family, RhoGAP and RasGAP show no sequence homology at their amino acid level. RasGTPases function as molecular switches in a large number of signaling pathways. When bound to GTP they are in the on state and when bound to GDP they are in the off state. The RasGAP domain speeds up the hydrolysis of GTP in Ras-like proteins acting as a negative regulator.	382
213345	cd12206	RasGAP_IQGAP_related	Ras-GTPase Activating Domain of proteins related to IQGAPs. RasGAP: Ras-GTPase Activating Domain. RasGAP functions as an enhancer of the hydrolysis of GTP that is bound to Ras-GTPases. Proteins having a RasGAP domain include p120GAP, IQGAP, Rab5-activating protein 6, and Neurofibromin. Although the Rho (Ras homolog) GTPases are most closely related to members of the Ras family, RhoGAP and RasGAP show no sequence homology at their amino acid level. RasGTPases function as molecular switches in a myriad of signaling pathways. When bound to GTP they are in the on state and when bound to GDP they are in the off state. The RasGap domain speeds up the hydrolysis of GTP in Ras-like proteins acting as a negative regulator.	359
213346	cd12207	RasGAP_IQGAP3	Ras-GTPase Activating Domain of IQ motif containing GTPase activating protein 3. This family represents the IQ motif containing GTPase activating protein 3 (IQGAP3), which associates with Ras GTP-binding proteins. A primary function of IQGAP proteins is to modulate cytoskeletal architecture. There are three known IQGAP family members: IQGAP1, IQGAP2 and IQGAP3. Human IQGAP1 and IQGAP2 share 62% identity. IQGAPs are multi-domain molecules having a calponin-homology (CH) domain which binds F-actin, IQGAP-specific repeats, a single WW domain, four IQ motifs that mediate interactions with calmodulin, and a RasGAP related domain that binds active Rho family GTPases. IQGAP is an essential regulator of cytoskeletal function. IQGAP1 negatively regulates Ras family GTPases by stimulating their intrinsic GTPase activity, the protein actually lacks GAP activity. Both IQGAP1 and IQGAP2 specifically bind to Cdc42 and Rac1, but not to RhoA. Despite of their similarities to part of the sequence of RasGAP, neither IQGAP1 nor IQGAP2 interacts with Ras. IQGAP3, only present in mammals, regulates the organization of the cytoskeleton under the regulation of Rac1 and Cdc42 in neuronal cells. The depletion of IQGAP3 is shown to impair neurite or axon outgrowth in neuronal cells with disorganized cytoskeleton.	350
411994	cd12208	DIP1984-like	DIP1984 family protein and similar proteins. DIP1984 is an uncharacterized protein from Corynebacterium diphtheriae. Some members of this family may have been misnamed as septicolysin.	150
213404	cd12211	Bc2l-C_N	N-Terminal Domain Of Bc2l-C Lectin. Lectin BC2L-C of Burkholderia cenocepacia is one of several lectins produced by this pathogen. BC2L-C has been shown to bind fucosylated human histo-blood group epitopes H-type 1, Lewis B, and Lewis Y. The C-terminal domain resembles BC2L-A, a calcium dependent mannose-binding protein. The N-terminal domain trimerizes and binds alpha-MeSeFuc in pockets between the monomeric units. The N-terminal domain has a similar structure to tumor necrosis factor (TNF).	131
276936	cd12212	Fis1	Mitochondrial Fission Protein Fis1, cytosolic domain. Fis1, along with Dnm1 and Mdv1, is an essential protein in mediating mitochondrial fission. Dnm1 and Fis1 are highly conserved, with a common mechanism in disparate species. In mutants of these proteins, mitochondrial fission is impaired, resulting in networks of undivided mitochondria. The Fis1 N-terminus is cytosolic and tethered to the mitochondrial outer membrane via a C-terminal transmembrane domain. Fis1 appears to act via the recruitment of division complexes to the mitochondrial outer membrane, via interactions with Mdv1 or Caf4. Fis1 has tandem Tetratricopeptide repeat (TPR) motifs which are known to mediate protein-protein interactions.	115
213406	cd12213	ABD	Alpha-Mannosidase Binding Domain of Atg19/34. These proteins are related to the Alpha-mannosidase (Ams1) Binding Domain of Atg19/Atg34, a key component in the targeting pathway that directs alpha-mannosidase and aminopeptidase I to the vacuole, either through cytoplasm-to-vacuole trafficking or via autophagy in starvation conditions. Autophagy is a eukaryotic mechanism in which cytoplasm is enclosed in double-membraned autophagosomes which fuse with a vacuole for transport into the lumen. In Saccharomyces cerevisiae, alpha-mannosidase is selectively directed to the vacuole via the direct interaction with Atg19 (and paralog Atg34) in the Cvt pathway. The Ams1 binding domains (ABD) of Atg19/34 have an immunoglobulin fold with eight beta-strands. The ABD is responsible for Ams1 recognition, but its deletion does not affect the fusion of Atg19 with prApe1, and the transport of prApe1 to the vacuole. The Atg19 N-terminal region is a distinct coiled-coil domain.	112
213177	cd12214	ChiA1_BD	chitin-binding domain of Chi A1-like proteins. This group contains proteins related to the chitin binding domain of chitinase A1 (ChiA1) of Bacillus circulans WL-12. Glycosidase ChiA1 hydrolyzes chitin and is comprised of several domains: the C-terminal chitin binding domain, an N-terminal and catalytic domain, and 2 fibronectin type III-like domains. Chitinases function in invertebrates in the degradation of old exoskeletons, in fungi to utilize chitin in cell walls, and in bacteria which use chitin as an energy source. Bacillus circulans WL-12 ChiA1 facilitates invasion of fungal cell walls. The ChiA1 chitin binding domain is required for the specific recognition of insoluble chitin. Although topologically and structurally related, ChiA1 lacks the characteristic aromatic residues of Erwinia chrysanthemi endoglucanase Z (CBD(EGZ)).	45
213178	cd12215	ChiC_BD	Chitin-binding domain of chitinase C. Chitin-binding domain of chitinase C (ChiC) of Streptomyces griseus and related proteins. Chitinase C is a family 19 chitinase, and consists of a N-terminal chitin binding domain and a C-terminal chitin-catalytic domain that effects degradation. Chitinases function in invertebrates in the degradation of old exoskeletons, in fungi to utilize chitin in cell walls, and in bacteria which use chitin as an energy source. ChiC contains the characteristic chitin-binding aromatic residues.	42
213409	cd12216	Csn2_like	CRISPR/Cas system-associated protein Csn2. Csn2 is a Nmeni subtype-specific Cas protein, which may function in the adaptation process which mediates the incorporation of foreign nucleic acids into the microbial host genome. Csn 2 may interact directly with double-stranded DNA. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA. Csn2 has been predicted to be a functional analog of Cas4 based on anti-correlated phyletic patterns; also known as SPy1049 family.	217
213410	cd12217	Stu0660_Csn2	Stu0660-like CRISPR/Cas system-associated protein Csn2. Csn2 is a Nmeni subtype-specific Cas protein, which may function in the adaptation process which mediates the incorporation of foreign nucleic acids into the microbial host genome. Csn 2 may interact directly with double-stranded DNA. This family of Csn2 proteins includes Stu0660, the proteins are larger than other (canonical) Csn2 proteins as they have an additional alpha-helical C-terminal domain. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA. Csn2 has been predicted to be a functional analog of Cas4 based on anti-correlated phyletic patterns; also known as SPy1049 family.	343
213411	cd12218	Csn2	CRISPR/Cas system-associated protein Csn2. Csn2 is a Nmeni subtype-specific Cas protein, which may function in the adaptation process which mediates the incorporation of foreign nucleic acids into the microbial host genome. Csn 2 may interact directly with double-stranded DNA. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA. Csn2 has been predicted to be a functional analog of Cas4 based on anti-correlated phyletic patterns; also known as SPy1049 family.	219
340518	cd12219	Ubl_TBK1_like	ubiquitin-like (Ubl) domain found in non-canonical Inhibitor of kappa B kinases IKKepsilon and TBK1, and similar proteins. IKKepsilon and TBK1 (TRAF family member-associated NF-kappaB activator-binding kinase 1) are non-canonical members of the IKK family. They have been characterized as activators of nuclear factor-kappaB (NF-kappaB), but they are not essential for NF-kappaB activation. They play critical roles in antiviral response via phosphorylation and activation of transcription factors IRF3, IRF7, STAT1 and STAT3. They are also involved in the survival, tumorigenesis and development of various cancers. Both IKKepsilon and TBK1 contain an N-terminal protein kinase domain followed by a ubiquitin-like (Ubl) domain. The Ubl domain acts as a protein-protein interaction domain, and has been implicated in regulating kinase activity, which modulates interactions in the interferon pathway.	77
240617	cd12220	Pesticin_RB	Pesticin Translocation And Receptor Binding Domain. Pesticin (Pst) is an anti-bacterial toxin produced by Yersinia pestis that acts through uptake by the target related bacteria and the hydrolysis of peptidoglycan in the periplasm. Pst contains an N-terminal translocation domain, an intermediate receptor binding domain, and a phage-lysozyme like C-terminal activity domain. The N-terminal domain is further divided into the TonB box (which binds TonB), the T (translocation domain) and the R (receptor binding domain). Bacteriocins such as pesticin are produced by gram-negative bacteria to attack related bacterial strains. Pst is transported to the periplasm via FyuA, an outer-membrane receptor of Y. pestis and E. coli, where it hydrolyzes peptidoglycan via the cleavage of N-acetylmuramic acid and C4 of N-acetylglucosamine. Disruption of the peptidoglycan layer renders the bacteria vulnerable to lysis via osmotic pressure.	166
240616	cd12221	Cin1	Cellophane induced protein repeats of fungus Venturia inaequalis. Cin1 (cellulose induced protein 1) repeat protein of Venturia inaequalis, the fungus responsible for scab disease of apple, encodes 8 cysteine-rich repeats and is greatly upregulated within the plant and on cellophane membranes. The crystal structure reveals a pair of disulfide bridges in each repeat. The repeats have been described as adopting a beads-on-a-string organization. Cin1 function is undetermined, however the alpha-helical structure may be involved in protein-protein or protein-carbohydrate interactions in the extracellular matrix.	114
240615	cd12222	Caa3-IV	Caa3-Type Cytochrome Oxidase subunit 4 interacts with cyt c subunits I/III. Cytochrome c oxidase, a haem copper oxidase superfamily member, is the final step in the electron-transport chain, linking O2 reduction to transmembrane pumping in mitochondria and aerobic prokaryotes. Cytochrome c oxidase (aka Complex IV) catalyzes the reduction of O2 to 2H2O, and acts downstream of Complexes I-III: NADH-Q oxidoreductase, succinate-Q reductase, and Q-cytochrome c oxidoreductase. In Thermus thermophilus caa3-oxidase is comprised of subunit (SU) I/III, a fusion of classical SU I and SU III, and IIc as well as SU IV, which is composed of 2 connected transmembrane helices that interface with SU I/III.	63
409670	cd12223	RRM_SR140	RNA recognition motif (RRM) found in U2-associated protein SR140 and similar proteins. This subgroup corresponds to the RRM of SR140 (also termed U2 snRNP-associated SURP motif-containing protein or U2SURP, or 140 kDa Ser/Arg-rich domain protein) which is a putative splicing factor mainly found in higher eukaryotes. Although it was initially identified as one of the 17S U2 snRNP-associated proteins, the molecular and physiological function of SR140 remains unclear. SR140 contains an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), a SWAP/SURP domain that is found in a number of pre-mRNA splicing factors in the middle region, and a C-terminal arginine/serine-rich domain (RS domain).	84
409671	cd12224	RRM_RBM22	RNA recognition motif (RRM) found in Pre-mRNA-splicing factor RBM22 and similar proteins. This subgroup corresponds to the RRM of RBM22 (also known as RNA-binding motif protein 22, or Zinc finger CCCH domain-containing protein 16), a newly discovered RNA-binding motif protein which belongs to the SLT11 gene family. The protein encoded by the SLT11 gene (Slt11p) is a splicing factor in yeast, which is required for spliceosome assembly. Slt11p has two distinct biochemical properties: RNA-annealing and RNA-binding activities. RBM22 is the homolog of SLT11 in vertebrates. It has been reported to be involved in pre-spliceosome assembly and to interact with the Ca2+-signaling protein ALG-2. It also plays an important role in embryogenesis. RBM22 contains a conserved RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), a zinc finger of the unusual type C-x8-C-x5-C-x3-H, and a C-terminus that is unusually rich in the amino acids Gly and Pro, including sequences of tetraprolines.	74
409672	cd12225	RRM1_2_CID8_like	RNA recognition motif 1 and 2 (RRM1, RRM2) found in Arabidopsis thaliana CTC-interacting domain protein CID8, CID9, CID10, CID11, CID12, CID 13 and similar proteins. This subgroup corresponds to the RRM domains found in A. thaliana CID8, CID9, CID10, CID11, CID12, CID 13 and mainly their plant homologs. These highly related RNA-binding proteins contain an N-terminal PAM2 domain (PABP-interacting motif 2), two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a basic region that resembles a bipartite nuclear localization signal. The biological role of this family remains unclear.	76
409673	cd12226	RRM_NOL8	RNA recognition motif (RRM) found in nucleolar protein 8 (NOL8) and similar proteins. This model corresponds to the RRM of NOL8 (also termed Nop132) encoded by a novel NOL8 gene that is up-regulated in the majority of diffuse-type, but not intestinal-type, gastric cancers. Thus, NOL8 may be a good molecular target for treatment of diffuse-type gastric cancer. Also, NOL8 is a phosphorylated protein that contains an N-terminal RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), suggesting NOL8 is likely to function as a novel RNA-binding protein. It may be involved in regulation of gene expression at the post-transcriptional level or in ribosome biogenesis in cancer cells.	77
409674	cd12227	RRM_SCAF4_SCAF8	RNA recognition motif (RRM) found in SR-related and CTD-associated factor 4 (SCAF4), SR-related and CTD-associated factor 8 (SCAF8) and similar proteins. This subfamily corresponds to the RRM in a new class of SCAFs (SR-like CTD-associated factors), including SCAF4, SCAF8 and similar proteins. The biological role of SCAF4 remains unclear, but it shows high sequence similarity to SCAF8 (also termed CDC5L complex-associated protein 7, or RNA-binding motif protein 16, or CTD-binding SR-like protein RA8). SCAF8 is a nuclear matrix protein that interacts specifically with a highly serine-phosphorylated form of the carboxy-terminal domain (CTD) of the largest subunit of RNA polymerase II (pol II). The pol II CTD plays a role in coupling transcription and pre-mRNA processing. In addition, SCAF8 co-localizes primarily with transcription sites that are enriched in nuclear matrix fraction, which is known to contain proteins involved in pre-mRNA processing. Thus, SCAF8 may play a direct role in coupling transcription and pre-mRNA processing. SCAF8 and SCAF4 both contain a conserved N-terminal CTD-interacting domain (CID), an atypical RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and serine/arginine-rich motifs.	77
409675	cd12228	RRM_ENOX	RNA recognition motif (RRM) found in the cell surface Ecto-NOX disulfide-thiol exchanger (ECTO-NOX or ENOX) proteins. This subgroup corresponds to the conserved RNA recognition motif (RRM) in ECTO-NOX proteins (also termed ENOX), comprising a family of plant and animal NAD(P)H oxidases exhibiting both, oxidative and protein disulfide isomerase-like, activities. They are growth-related and drive cell enlargement, and may play roles in aging and neurodegenerative diseases. ENOX proteins function as terminal oxidases of plasma membrane electron transport (PMET) through catalyzing electron transport from plasma membrane quinones to extracellular oxygen, forming water as a product. They are also hydroquinone oxidases that oxidize externally supplied NADH, hence NOX. ENOX proteins harbor a di-copper center that lack flavin. ENOX proteins display protein disulfide interchange activity that is also possessed by protein disulfide isomerase. In contrast to the classic protein disulfide isomerases, ENOX proteins lack the double CXXC motif. This family includes two ENOX proteins, ENOX1 and ENOX2. ENOX1, also termed candidate growth-related and time keeping constitutive hydroquinone [NADH] oxidase (cCNOX), or cell proliferation-inducing gene 38 protein, or Constitutive Ecto-NOX (cNOX), is the constitutively expressed cell surface NADH (ubiquinone) oxidase that is ubiquitous and refractory to drugs. ENOX2, also termed APK1 antigen, or cytosolic ovarian carcinoma antigen 1, or tumor-associated hydroquinone oxidase (tNOX), is a cancer-specific variant of ENOX1 and plays a key role in cell proliferation and tumor progression. In contrast to ENOX1, ENOX2 is drug-responsive and harbors a drug binding site to which the cancer-specific S-peptide tagged pan-ENOX2 recombinant (scFv) is directed. Moreover, ENOX2 is specifically inhibited by a variety of quinone site inhibitors that have anticancer activity and is unique to the surface of cancer cells. ENOX proteins contain many functional motifs.	84
409676	cd12229	RRM_G3BP	RNA recognition motif (RRM) found in ras GTPase-activating protein-binding protein G3BP1, G3BP2 and similar proteins. This subfamily corresponds to the RRM domain in the G3BP family of RNA-binding and SH3 domain-binding proteins. G3BP acts at the level of RNA metabolism in response to cell signaling, possibly as RNA transcript stabilizing factors or an RNase. Members include G3BP1, G3BP2 and similar proteins. These proteins associate directly with the SH3 domain of GTPase-activating protein (GAP), which functions as an inhibitor of Ras. They all contain an N-terminal nuclear transfer factor 2 (NTF2)-like domain, an acidic domain, a domain containing PXXP motif(s), an RNA recognition motif (RRM), and an Arg-Gly-rich region (RGG-rich region, or arginine methylation motif).	81
409677	cd12230	RRM1_U2AF65	RNA recognition motif 1 (RRM1) found in U2 large nuclear ribonucleoprotein auxiliary factor U2AF 65 kDa subunit (U2AF65) and similar proteins. The subfamily corresponds to the RRM1 of U2AF65 and dU2AF50. U2AF65, also termed U2AF2, is the large subunit of U2 small nuclear ribonucleoprotein (snRNP) auxiliary factor (U2AF), which has been implicated in the recruitment of U2 snRNP to pre-mRNAs and is a highly conserved heterodimer composed of large and small subunits. U2AF65 specifically recognizes the intron polypyrimidine tract upstream of the 3' splice site and promotes binding of U2 snRNP to the pre-mRNA branchpoint. U2AF65 also plays an important role in the nuclear export of mRNA. It facilitates the formation of a messenger ribonucleoprotein export complex, containing both the NXF1 receptor and the RNA substrate. Moreover, U2AF65 interacts directly and specifically with expanded CAG RNA, and serves as an adaptor to link expanded CAG RNA to NXF1 for RNA export. U2AF65 contains an N-terminal RS domain rich in arginine and serine, followed by a proline-rich segment and three C-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The N-terminal RS domain stabilizes the interaction of U2 snRNP with the branch point (BP) by contacting the branch region, and further promotes base pair interactions between U2 snRNA and the BP. The proline-rich segment mediates protein-protein interactions with the RRM domain of the small U2AF subunit (U2AF35 or U2AF1). The RRM1 and RRM2 are sufficient for specific RNA binding, while RRM3 is responsible for protein-protein interactions. The family also includes Splicing factor U2AF 50 kDa subunit (dU2AF50), the Drosophila ortholog of U2AF65. dU2AF50 functions as an essential pre-mRNA splicing factor in flies. It associates with intronless mRNAs and plays a significant and unexpected role in the nuclear export of a large number of intronless mRNAs.	82
409678	cd12231	RRM2_U2AF65	RNA recognition motif 2 (RRM2) found in U2 large nuclear ribonucleoprotein auxiliary factor U2AF 65 kDa subunit (U2AF65) and similar proteins. This subfamily corresponds to the RRM2 of U2AF65 and dU2AF50. U2AF65, also termed U2AF2, is the large subunit of U2 small nuclear ribonucleoprotein (snRNP) auxiliary factor (U2AF), which has been implicated in the recruitment of U2 snRNP to pre-mRNAs and is a highly conserved heterodimer composed of large and small subunits. U2AF65 specifically recognizes the intron polypyrimidine tract upstream of the 3' splice site and promotes binding of U2 snRNP to the pre-mRNA branchpoint. U2AF65 also plays an important role in the nuclear export of mRNA. It facilitates the formation of a messenger ribonucleoprotein export complex, containing both the NXF1 receptor and the RNA substrate. Moreover, U2AF65 interacts directly and specifically with expanded CAG RNA, and serves as an adaptor to link expanded CAG RNA to NXF1 for RNA export. U2AF65 contains an N-terminal RS domain rich in arginine and serine, followed by a proline-rich segment and three C-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The N-terminal RS domain stabilizes the interaction of U2 snRNP with the branch point (BP) by contacting the branch region, and further promotes base pair interactions between U2 snRNA and the BP. The proline-rich segment mediates protein-protein interactions with the RRM domain of the small U2AF subunit (U2AF35 or U2AF1). The RRM1 and RRM2 are sufficient for specific RNA binding, while RRM3 is responsible for protein-protein interactions. The family also includes Splicing factor U2AF 50 kDa subunit (dU2AF50), the Drosophila ortholog of U2AF65. dU2AF50 functions as an essential pre-mRNA splicing factor in flies. It associates with intronless mRNAs and plays a significant and unexpected role in the nuclear export of a large number of intronless mRNAs.	77
409679	cd12232	RRM3_U2AF65	RNA recognition motif 3 (RRM3) found in U2 large nuclear ribonucleoprotein auxiliary factor U2AF 65 kDa subunit (U2AF65) and similar proteins. This subfamily corresponds to the RRM3 of U2AF65 and dU2AF50. U2AF65, also termed U2AF2, is the large subunit of U2 small nuclear ribonucleoprotein (snRNP) auxiliary factor (U2AF), which has been implicated in the recruitment of U2 snRNP to pre-mRNAs and is a highly conserved heterodimer composed of large and small subunits. U2AF65 specifically recognizes the intron polypyrimidine tract upstream of the 3' splice site and promotes binding of U2 snRNP to the pre-mRNA branchpoint. U2AF65 also plays an important role in the nuclear export of mRNA. It facilitates the formation of a messenger ribonucleoprotein export complex, containing both the NXF1 receptor and the RNA substrate. Moreover, U2AF65 interacts directly and specifically with expanded CAG RNA, and serves as an adaptor to link expanded CAG RNA to NXF1 for RNA export. U2AF65 contains an N-terminal RS domain rich in arginine and serine, followed by a proline-rich segment and three C-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The N-terminal RS domain stabilizes the interaction of U2 snRNP with the branch point (BP) by contacting the branch region, and further promotes base pair interactions between U2 snRNA and the BP. The proline-rich segment mediates protein-protein interactions with the RRM domain of the small U2AF subunit (U2AF35 or U2AF1). The RRM1 and RRM2 are sufficient for specific RNA binding, while RRM3 is responsible for protein-protein interactions. The family also includes Splicing factor U2AF 50 kDa subunit (dU2AF50), the Drosophila ortholog of U2AF65. dU2AF50 functions as an essential pre-mRNA splicing factor in flies. It associates with intronless mRNAs and plays a significant and unexpected role in the nuclear export of a large number of intronless mRNAs.	89
240679	cd12233	RRM_Srp1p_AtRSp31_like	RNA recognition motif (RRM) found in fission yeast pre-mRNA-splicing factor Srp1p, Arabidopsis thaliana arginine/serine-rich-splicing factor RSp31 and similar proteins. This subfamily corresponds to the RRM of Srp1p and RRM2 of plant SR splicing factors. Srp1p is encoded by gene srp1 from fission yeast Schizosaccharomyces pombe. It plays a role in the pre-mRNA splicing process, but is not essential for growth. Srp1p is closely related to the SR protein family found in Metazoa. It contains an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), a glycine hinge and a RS domain in the middle, and a C-terminal domain. The family also includes a novel group of arginine/serine (RS) or serine/arginine (SR) splicing factors existing in plants, such as A. thaliana RSp31, RSp35, RSp41 and similar proteins. Like vertebrate RS splicing factors, these proteins function as plant splicing factors and play crucial roles in constitutive and alternative splicing in plants. They all contain two RRMs at their N-terminus and an RS domain at their C-terminus.	70
409680	cd12234	RRM1_AtRSp31_like	RNA recognition motif (RRM) found in Arabidopsis thaliana arginine/serine-rich-splicing factor RSp31 and similar proteins from plants. This subfamily corresponds to the RRM1 in a family that represents a novel group of arginine/serine (RS) or serine/arginine (SR) splicing factors existing in plants, such as A. thaliana RSp31, RSp35, RSp41 and similar proteins. Like vertebrate RS splicing factors, these proteins function as plant splicing factors and play crucial roles in constitutive and alternative splicing in plants. They all contain two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), at their N-terminus, and an RS domain at their C-terminus.	72
409681	cd12235	RRM_PPIL4	RNA recognition motif (RRM) found in peptidyl-prolyl cis-trans isomerase-like 4 (PPIase) and similar proteins. This subfamily corresponds to the RRM of PPIase, also termed cyclophilin-like protein PPIL4, or rotamase PPIL4, a novel nuclear RNA-binding protein encoded by cyclophilin-like PPIL4 gene. The precise role of PPIase remains unclear. PPIase contains a conserved N-terminal peptidyl-prolyl cis-trans isomerase (PPIase) motif, a central RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), followed by a lysine-rich domain, and a pair of bipartite nuclear targeting sequences (NLS) at the C-terminus.	83
409682	cd12236	RRM_snRNP70	RNA recognition motif (RRM) found in U1 small nuclear ribonucleoprotein 70 kDa (U1-70K) and similar proteins. This subfamily corresponds to the RRM of U1-70K, also termed snRNP70, a key component of the U1 snRNP complex, which is one of the key factors facilitating the splicing of pre-mRNA via interaction at the 5' splice site, and is involved in regulation of polyadenylation of some viral and cellular genes, enhancing or inhibiting efficient poly(A) site usage. U1-70K plays an essential role in targeting the U1 snRNP to the 5' splice site through protein-protein interactions with regulatory RNA-binding splicing factors, such as the RS protein ASF/SF2. Moreover, U1-70K protein can specifically bind to stem-loop I of the U1 small nuclear RNA (U1 snRNA) contained in the U1 snRNP complex. It also mediates the binding of U1C, another U1-specific protein, to the U1 snRNP complex. U1-70K contains a conserved RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), followed by an adjacent glycine-rich region at the N-terminal half, and two serine/arginine-rich (SR) domains at the C-terminal half. The RRM is responsible for the binding of stem-loop I of U1 snRNA molecule. Additionally, the most prominent immunodominant region that can be recognized by auto-antibodies from autoimmune patients may be located within the RRM. The SR domains are involved in protein-protein interaction with SR proteins that mediate 5' splice site recognition. For instance, the first SR domain is necessary and sufficient for ASF/SF2 binding. The family also includes Drosophila U1-70K that is an essential splicing factor required for viability in flies, but its SR domain is dispensable. The yeast U1-70K does not contain easily recognizable SR domains and shows low sequence similarity in the RRM region with other U1-70K proteins, and is therefore not included in this family. The RRM domain is dispensable for yeast U1-70K function.	91
409683	cd12237	RRM_snRNP35	RNA recognition motif (RRM) found in U11/U12 small nuclear ribonucleoprotein 35 kDa protein (U11/U12-35K) and similar proteins. This subfamily corresponds to the RRM of U11/U12-35K, also termed protein HM-1, or U1 snRNP-binding protein homolog, and is one of the components of the U11/U12 snRNP, which is a subunit of the minor (U12-dependent) spliceosome required for splicing U12-type nuclear pre-mRNA introns. U11/U12-35K is highly conserved among bilateria and plants, but is absent from some organisms, such as Saccharomyces cerevisiae and Caenorhabditis elegans. Moreover, U11/U12-35K shows significant sequence homology to U1 snRNP-specific 70 kDa protein (U1-70K or snRNP70). It contains a conserved RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), followed by an adjacent glycine-rich region, and Arg-Asp and Arg-Glu dipeptide repeats rich domain, making U11/U12-35K a possible functional analog of U1-70K. It may facilitate 5' splice site recognition in the minor spliceosome and play a role in exon bridging, interacting with components of the major spliceosome bound to the pyrimidine tract of an upstream U2-type intron. The family corresponds to the RRM of U11/U12-35K that may directly contact the U11 or U12 snRNA through the RRM domain.	94
409684	cd12238	RRM1_RBM40_like	RNA recognition motif 1 (RRM1) found in RNA-binding protein 40 (RBM40) and similar proteins. This subfamily corresponds to the RRM1 of RBM40, also known as RNA-binding region-containing protein 3 (RNPC3) or U11/U12 small nuclear ribonucleoprotein 65 kDa protein (U11/U12-65K protein). It serves as a bridging factor between the U11 and U12 snRNPs. It contains two repeats of RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), connected by a linker that includes a proline-rich region. It binds to the U11-associated 59K protein via its RRM1 and employs the RRM2 to bind hairpin III of the U12 small nuclear RNA (snRNA). The proline-rich region might be involved in protein-protein interactions. 	73
409685	cd12239	RRM2_RBM40_like	RNA recognition motif 2 (RRM2) found in RNA-binding protein 40 (RBM40) and similar proteins. This subfamily corresponds to the RRM2 of RBM40 and the RRM of RBM41. RBM40, also known as RNA-binding region-containing protein 3 (RNPC3) or U11/U12 small nuclear ribonucleoprotein 65 kDa protein (U11/U12-65K protein), serves as a bridging factor between the U11 and U12 snRNPs. It contains two RNA recognition motifs (RRMs), also known as RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), connected by a linker that includes a proline-rich region. It binds to the U11-associated 59K protein via its RRM1 and employs the RRM2 to bind hairpin III of the U12 small nuclear RNA (snRNA). The proline-rich region might be involved in protein-protein interactions. RBM41 contains only one RRM. Its biological function remains unclear. 	82
409686	cd12240	RRM_NCBP2	RNA recognition motif (RRM) found in nuclear cap-binding protein subunit 2 (CBP20) and similar proteins. This subfamily corresponds to the RRM of CBP20, also termed nuclear cap-binding protein subunit 2 (NCBP2), or cell proliferation-inducing gene 55 protein, or NCBP-interacting protein 1 (NIP1). CBP20 is the small subunit of the nuclear cap binding complex (CBC), which is a conserved eukaryotic heterodimeric protein complex binding to 5'-capped polymerase II transcripts and plays a central role in the maturation of pre-mRNA and uracil-rich small nuclear RNA (U snRNA). CBP20 is most likely responsible for the binding of capped RNA. It contains an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and interacts with the second and third domains of CBP80, the large subunit of CBC. 	78
409687	cd12241	RRM_SF3B14	RNA recognition motif (RRM) found in pre-mRNA branch site protein p14 (SF3B14) and similar proteins. This subfamily corresponds to the RRM of SF3B14 (also termed p14), a 14 kDa protein subunit of SF3B which is a multiprotein complex that is an integral part of the U2 small nuclear ribonucleoprotein (snRNP) and the U11/U12 di-snRNP. SF3B is essential for the accurate excision of introns from pre-messenger RNA and has been involved in the recognition of the pre-mRNA's branch site within the major and minor spliceosomes. SF3B14 associates directly with another SF3B subunit called SF3B155. It is also present in both U2- and U12-dependent spliceosomes and may contribute to branch site positioning in both the major and minor spliceosome. Moreover, SF3B14 interacts directly with the pre-mRNA branch adenosine early in spliceosome assembly and within the fully assembled spliceosome. SF3B14 contains one well conserved RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	77
409688	cd12242	RRM_SLIRP	RNA recognition motif (RRM) found in SRA stem-loop-interacting RNA-binding protein (SLIRP) and similar proteins. This subfamily corresponds to the RRM of SLIRP, a widely expressed small steroid receptor RNA activator (SRA) binding protein, which binds to STR7, a functional substructure of SRA. SLIRP is localized predominantly to the mitochondria and plays a key role in modulating several nuclear receptor (NR) pathways. It functions as a co-repressor to repress SRA-mediated nuclear receptor coactivation. It modulates SHARP- and SKIP-mediated co-regulation of NR activity. SLIRP contains an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), which is required for SLIRP's corepression activities. 	73
409689	cd12243	RRM1_MSSP	RNA recognition motif 1 (RRM1) found in the c-myc gene single-strand binding proteins (MSSP) family. This subfamily corresponds to the RRM1 of c-myc gene single-strand binding proteins (MSSP) family, including single-stranded DNA-binding protein MSSP-1 (also termed RBMS1 or SCR2) and MSSP-2 (also termed RBMS2 or SCR3). All MSSP family members contain two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), both of which are responsible for the specific DNA binding activity. Both, MSSP-1 and -2, have been identified as protein factors binding to a putative DNA replication origin/transcriptional enhancer sequence present upstream from the human c-myc gene in both single- and double-stranded forms. Thus, they have been implied in regulating DNA replication, transcription, apoptosis induction, and cell-cycle movement, via the interaction with c-MYC, the product of protooncogene c-myc. Moreover, the family includes a new member termed RNA-binding motif, single-stranded-interacting protein 3 (RBMS3), which is not a transcriptional regulator. RBMS3 binds with high affinity to A/U-rich stretches of RNA, and to A/T-rich DNA sequences, and functions as a regulator of cytoplasmic activity. In addition, a putative meiosis-specific RNA-binding protein termed sporulation-specific protein 5 (SPO5, or meiotic RNA-binding protein 1, or meiotically up-regulated gene 12 protein), encoded by Schizosaccharomyces pombe Spo5/Mug12 gene, is also included in this family. SPO5 is a novel meiosis I regulator that may function in the vicinity of the Mei2 dot. 	71
409690	cd12244	RRM2_MSSP	RNA recognition motif 2 (RRM2) found in the c-myc gene single-strand binding proteins (MSSP) family. This subfamily corresponds to the RRM2 of c-myc gene single-strand binding proteins (MSSP) family, including single-stranded DNA-binding protein MSSP-1 (also termed RBMS1 or SCR2) and MSSP-2 (also termed RBMS2 or SCR3). All MSSP family members contain two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), both of which are responsible for the specific DNA binding activity. Both, MSSP-1 and -2, have been identified as protein factors binding to a putative DNA replication origin/transcriptional enhancer sequence present upstream from the human c-myc gene in both single- and double-stranded forms. Thus, they have been implied in regulating DNA replication, transcription, apoptosis induction, and cell-cycle movement, via the interaction with c-MYC, the product of protooncogene c-myc. Moreover, the family includes a new member termed RNA-binding motif, single-stranded-interacting protein 3 (RBMS3), which is not a transcriptional regulator. RBMS3 binds with high affinity to A/U-rich stretches of RNA, and to A/T-rich DNA sequences, and functions as a regulator of cytoplasmic activity. In addition, a putative meiosis-specific RNA-binding protein termed sporulation-specific protein 5 (SPO5, or meiotic RNA-binding protein 1, or meiotically up-regulated gene 12 protein), encoded by Schizosaccharomyces pombe Spo5/Mug12 gene, is also included in this family. SPO5 is a novel meiosis I regulator that may function in the vicinity of the Mei2 dot. 	82
409691	cd12245	RRM_scw1_like	RNA recognition motif (RRM) found in yeast cell wall integrity protein scw1 and similar proteins. This subfamily corresponds to the RRM of the family including yeast cell wall integrity protein scw1, yeast Whi3 protein, yeast Whi4 protein and similar proteins. The strong cell wall protein 1, scw1, is a nonessential cytoplasmic RNA-binding protein that regulates septation and cell-wall structure in fission yeast. It may function as an inhibitor of septum formation, such that its loss of function allows weak SIN signaling to promote septum formation. Its RRM domain shows high homology to two budding yeast proteins, Whi3 and Whi4. Whi3 is a dose-dependent modulator of cell size and has been implicated in cell cycle control in the yeast Saccharomyces cerevisiae. It functions as a negative regulator of ceroid-lipofuscinosis, neuronal 3 (Cln3), a G1 cyclin that promotes transcription of many genes to trigger the G1/S transition in budding yeast. It specifically binds the CLN3 mRNA and localizes it into discrete cytoplasmic loci that may locally restrict Cln3 synthesis to modulate cell cycle progression. Moreover, Whi3 plays a key role in cell fate determination in budding yeast. The RRM domain is essential for Whi3 function. Whi4 is a partially redundant homolog of Whi3, also containing one RRM. Some uncharacterized family members of this subfamily contain two RRMs; their RRM1 shows high sequence homology to the RRM of RNA-binding protein with multiple splicing (RBP-MS)-like proteins.	79
409692	cd12246	RRM1_U1A_like	RNA recognition motif 1 (RRM1) found in the U1A/U2B"/SNF protein family. This subfamily corresponds to the RRM1 of U1A/U2B"/SNF protein family which contains Drosophila sex determination protein SNF and its two mammalian counterparts, U1 small nuclear ribonucleoprotein A (U1 snRNP A or U1-A or U1A) and U2 small nuclear ribonucleoprotein B" (U2 snRNP B" or U2B"), all of which consist of two RNA recognition motifs (RRMs), connected by a variable, flexible linker. SNF is an RNA-binding protein found in the U1 and U2 snRNPs of Drosophila where it is essential in sex determination and possesses a novel dual RNA binding specificity. SNF binds with high affinity to both Drosophila U1 snRNA stem-loop II (SLII) and U2 snRNA stem-loop IV (SLIV). It can also bind to poly(U) RNA tracts flanking the alternatively spliced Sex-lethal (Sxl) exon, as does Drosophila Sex-lethal protein (SXL). U1A is an RNA-binding protein associated with the U1 snRNP, a small RNA-protein complex involved in pre-mRNA splicing. U1A binds with high affinity and specificity to stem-loop II (SLII) of U1 snRNA. It is predominantly a nuclear protein that shuttles between the nucleus and the cytoplasm independently of interactions with U1 snRNA. Moreover, U1A may be involved in RNA 3'-end processing, specifically cleavage, splicing and polyadenylation, through interacting with a large number of non-snRNP proteins. U2B", initially identified to bind to stem-loop IV (SLIV) at the 3' end of U2 snRNA, is a unique protein that comprises of the U2 snRNP. Additional research indicates U2B" binds to U1 snRNA stem-loop II (SLII) as well and shows no preference for SLIV or SLII on the basis of binding affinity. Moreover, U2B" does not require an auxiliary protein for binding to RNA, and its nuclear transport is independent of U2 snRNA binding. 	78
409693	cd12247	RRM2_U1A_like	RNA recognition motif 2 (RRM2) found in the U1A/U2B"/SNF protein family. This subfamily corresponds to the RRM2 of U1A/U2B"/SNF protein family, containing Drosophila sex determination protein SNF and its two mammalian counterparts, U1 small nuclear ribonucleoprotein A (U1 snRNP A or U1-A or U1A) and U2 small nuclear ribonucleoprotein B" (U2 snRNP B" or U2B"), all of which consist of two RNA recognition motifs (RRMs) connected by a variable, flexible linker. SNF is an RNA-binding protein found in the U1 and U2 snRNPs of Drosophila where it is essential in sex determination and possesses a novel dual RNA binding specificity. SNF binds with high affinity to both Drosophila U1 snRNA stem-loop II (SLII) and U2 snRNA stem-loop IV (SLIV). It can also bind to poly(U) RNA tracts flanking the alternatively spliced Sex-lethal (Sxl) exon, as does Drosophila Sex-lethal protein (SXL). U1A is an RNA-binding protein associated with the U1 snRNP, a small RNA-protein complex involved in pre-mRNA splicing. U1A binds with high affinity and specificity to stem-loop II (SLII) of U1 snRNA. It is predominantly a nuclear protein that shuttles between the nucleus and the cytoplasm independently of interactions with U1 snRNA. Moreover, U1A may be involved in RNA 3'-end processing, specifically cleavage, splicing and polyadenylation, through interacting with a large number of non-snRNP proteins. U2B", initially identified to bind to stem-loop IV (SLIV) at the 3' end of U2 snRNA, is a unique protein that comprises of the U2 snRNP. Additional research indicates U2B" binds to U1 snRNA stem-loop II (SLII) as well and shows no preference for SLIV or SLII on the basis of binding affinity. U2B" does not require an auxiliary protein for binding to RNA, and its nuclear transport is independent of U2 snRNA binding. 	72
409694	cd12248	RRM_RBM44	RNA recognition motif (RRM) found in RNA-binding protein 44 (RBM44) and similar proteins.  This subgroup corresponds to the RRM of RBM44, a novel germ cell intercellular bridge protein that is localized in the cytoplasm and intercellular bridges from pachytene to secondary spermatocyte stages. RBM44 interacts with itself and testis-expressed gene 14 (TEX14). Unlike TEX14, RBM44 does not function in the formation of stable intercellular bridges. It carries an RNA recognition motif (RRM) that could potentially bind a multitude of RNA sequences in the cytoplasm and help to shuttle them through the intercellular bridge, facilitating their dispersion into the interconnected neighboring cells.	77
409695	cd12249	RRM1_hnRNPR_like	RNA recognition motif 1 (RRM1) found in heterogeneous nuclear ribonucleoprotein R (hnRNP R) and similar proteins. This subfamily corresponds to the RRM1 in hnRNP R, hnRNP Q, APOBEC-1 complementation factor (ACF), and dead end protein homolog 1 (DND1). hnRNP R is a ubiquitously expressed nuclear RNA-binding protein that specifically binds mRNAs with a preference for poly(U) stretches. It has been implicated in mRNA processing and mRNA transport, and also acts as a regulator to modify binding to ribosomes and RNA translation. hnRNP Q is also a ubiquitously expressed nuclear RNA-binding protein. It has been identified as a component of the spliceosome complex, as well as a component of the apobec-1 editosome, and has been implicated in the regulation of specific mRNA transport. ACF is an RNA-binding subunit of a core complex that interacts with apoB mRNA to facilitate C to U RNA editing. It may also act as an apoB mRNA recognition factor and chaperone, and play a key role in cell growth and differentiation. DND1 is essential for maintaining viable germ cells in vertebrates. It interacts with the 3'-untranslated region (3'-UTR) of multiple messenger RNAs (mRNAs) and prevents micro-RNA (miRNA) mediated repression of mRNA. This family also includes two functionally unknown RNA-binding proteins, RBM46 and RBM47. All members in this family, except for DND1, contain three conserved RNA recognition motifs (RRMs); DND1 harbors only two RRMs. 	78
409696	cd12250	RRM2_hnRNPR_like	RNA recognition motif 2 (RRM2) found in heterogeneous nuclear ribonucleoprotein R (hnRNP R) and similar proteins. This subfamily corresponds to the RRM2 in hnRNP R, hnRNP Q, APOBEC-1 complementation factor (ACF), and dead end protein homolog 1 (DND1). hnRNP R is a ubiquitously expressed nuclear RNA-binding protein that specifically binds mRNAs with a preference for poly(U) stretches. It has been implicated in mRNA processing and mRNA transport, and also acts as a regulator to modify binding to ribosomes and RNA translation. hnRNP Q is also a ubiquitously expressed nuclear RNA-binding protein. It has been identified as a component of the spliceosome complex, as well as a component of the apobec-1 editosome, and has been implicated in the regulation of specific mRNA transport. ACF is an RNA-binding subunit of a core complex that interacts with apoB mRNA to facilitate C to U RNA editing. It may also act as an apoB mRNA recognition factor and chaperone and play a key role in cell growth and differentiation. DND1 is essential for maintaining viable germ cells in vertebrates. It interacts with the 3'-untranslated region (3'-UTR) of multiple messenger RNAs (mRNAs) and prevents micro-RNA (miRNA) mediated repression of mRNA. This family also includes two functionally unknown RNA-binding proteins, RBM46 and RBM47. All members in this family, except for DND1, contain three conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains); DND1 harbors only two RRMs. 	82
409697	cd12251	RRM3_hnRNPR_like	RNA recognition motif 3 (RRM3) found in heterogeneous nuclear ribonucleoprotein R (hnRNP R) and similar proteins. This subfamily corresponds to the RRM3 in hnRNP R, hnRNP Q, and APOBEC-1 complementation factor (ACF). hnRNP R is a ubiquitously expressed nuclear RNA-binding protein that specifically binds mRNAs with a preference for poly(U) stretches and has been implicated in mRNA processing and mRNA transport, and also acts as a regulator to modify binding to ribosomes and RNA translation. hnRNP Q is also a ubiquitously expressed nuclear RNA-binding protein. It has been identified as a component of the spliceosome complex, as well as a component of the apobec-1 editosome, and has been implicated in the regulation of specific mRNA transport. ACF is an RNA-binding subunit of a core complex that interacts with apoB mRNA to facilitate C to U RNA editing. It may also act as an apoB mRNA recognition factor and chaperone and play a key role in cell growth and differentiation. This family also includes two functionally unknown RNA-binding proteins, RBM46 and RBM47. All members contain three conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains).	72
409698	cd12252	RRM_DbpA	RNA recognition motif (RRM) found in the DbpA subfamily of prokaryotic DEAD-box rRNA helicases. This subfamily corresponds to the C-terminal RRM homology domain of dbpA proteins implicated in ribosome biogenesis. They bind with high affinity and specificity to RNA substrates containing hairpin 92 of 23S rRNA (HP92), which is part of the ribosomal A-site. The majority of dbpA proteins contain two N-terminal ATPase catalytic domains and a C-terminal RNA binding domain, an atypical RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). The catalytic domains bind to nearby regions of RNA to stimulate ATP hydrolysis and disrupt RNA structures. The C-terminal domain is responsible for the high-affinity RNA binding. Several members of this family lack specificity for 23S rRNA. These proteins can generally be distinguished by a basic region that extends beyond the C-terminal domain.	71
240699	cd12253	RRM_PIN4_like	RNA recognition motif (RRM) found in yeast RNA-binding protein PIN4, fission yeast RNA-binding post-transcriptional regulators cip1, cip2 and similar proteins. This subfamily corresponds to the RRM in PIN4, also termed psi inducibility protein 4 or modifier of damage tolerance Mdt1, a novel phosphothreonine (pThr)-containing protein that specifically interacts with the pThr-binding site of the Rad53 FHA1 domain. It is encoded by gene MDT1 (YBL051C) from yeast Saccharomyces cerevisiae. PIN4 is involved in normal G2/M cell cycle progression in the absence of DNA damage and functions as a novel target of checkpoint-dependent cell cycle arrest pathways. It contains an N-terminal RRM, a nuclear localization signal, a coiled coil, and a total of 15 SQ/TQ motifs. cip1 (Csx1-interacting protein 1) and cip2 (Csx1-interacting protein 2) are novel cytoplasmic RRM-containing proteins that counteract Csx1 function during oxidative stress. They are not essential for viability in fission yeast Schizosaccharomyces pombe. Both cip1 and cip2 contain one RRM. Like PIN4, Cip2 also possesses an R3H motif that may function in sequence-specific binding to single-stranded nucleic acids. 	79
409699	cd12254	RRM_hnRNPH_ESRPs_RBM12_like	RNA recognition motif (RRM) found in heterogeneous nuclear ribonucleoprotein (hnRNP) H protein family, epithelial splicing regulatory proteins (ESRPs), Drosophila RNA-binding protein Fusilli, RNA-binding protein 12 (RBM12) and similar proteins. The family includes RRM domains in the hnRNP H protein family, G-rich sequence factor 1 (GRSF-1), ESRPs (also termed RBM35), Drosophila Fusilli, RBM12 (also termed SWAN), RBM12B, RBM19 (also termed RBD-1) and similar proteins. The hnRNP H protein family includes hnRNP H (also termed mcs94-1), hnRNP H2 (also termed FTP-3 or hnRNP H'), hnRNP F and hnRNP H3 (also termed hnRNP 2H9), which represent a group of nuclear RNA binding proteins that are involved in pre-mRNA processing. GRSF-1 is a cytoplasmic poly(A)+ mRNA binding protein which interacts with RNA in a G-rich element-dependent manner. It may function in RNA packaging, stabilization of RNA secondary structure, or other macromolecular interactions. ESRP1 (also termed RBM35A) and ESRP2 (also termed RBM35B) are epithelial-specific RNA binding proteins that promote splicing of the epithelial variant of fibroblast growth factor receptor 2 (FGFR2), ENAH (also termed hMena), CD44 and CTNND1 (also termed p120-Catenin) transcripts. Fusilli shows high sequence homology to ESRPs. It can regulate endogenous FGFR2 splicing and functions as a splicing factor. The biological roles of both, RBM12 and RBM12B, remain unclear. RBM19 is a nucleolar protein conserved in eukaryotes. It is involved in ribosome biogenesis by processing rRNA. In addition, it is essential for preimplantation development. Members in this family contain 2~6 conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 	73
409700	cd12255	RRM1_LKAP	RNA recognition motif 1 (RRM1) found in Limkain-b1 (LKAP) and similar proteins. This subfamily corresponds to the RRM1 of LKAP, a novel peroxisomal autoantigen that co-localizes with a subset of cytoplasmic microbodies marked by ABCD3 (ATP-binding cassette subfamily D member 3, known previously as PMP-70) and/or PXF (peroxisomal farnesylated protein, known previously as PEX19). It associates with LIM kinase 2 (LIMK2) and may serve as a relatively common target of human autoantibodies reactive to cytoplasmic vesicle-like structures. LKAP contains two RNA recognition motifs (RRMs), also known as RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). However, whether those RRMs are bona fide RNA binding sites remains unclear. Moreover, there is no evidence of LKAP localization in the nucleus. Therefore, if the RRMs are functional, their interaction with RNA species would be restricted to the cytoplasm and peroxisomes. 	73
409701	cd12256	RRM2_LKAP	RNA recognition motif 2 (RRM2) found in Limkain-b1 (LKAP) and similar proteins. This subfamily corresponds to the RRM2 of LKAP, a novel peroxisomal autoantigen that co-localizes with a subset of cytoplasmic microbodies marked by ABCD3 (ATP-binding cassette subfamily D member 3, known previously as PMP-70) and/or PXF (peroxisomal farnesylated protein, known previously as PEX19). It associates with LIM kinase 2 (LIMK2) and may serve as a relatively common target of human autoantibodies reactive to cytoplasmic vesicle-like structures. LKAP contains two RNA recognition motifs (RRMs), also known as RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). However, whether those RRMs are bona fide RNA binding sites remains unclear. Moreover, there is no evidence of LKAP localization in the nucleus. Therefore, if the RRMs are functional, their interaction with RNA species would be restricted to the cytoplasm and peroxisomes.	89
409702	cd12257	RRM1_RBM26_like	RNA recognition motif 1 (RRM1) found in vertebrate RNA-binding protein 26 (RBM26) and similar proteins. This subfamily corresponds to the RRM1 of RBM26, and the RRM of RBM27. RBM26, also known as cutaneous T-cell lymphoma (CTCL) tumor antigen se70-2, represents a cutaneous lymphoma (CL)-associated antigen. It contains two RNA recognition motifs (RRMs), also known as RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The RRMs may play some functional roles in RNA-binding or protein-protein interactions. RBM27 contains only one RRM; its biological function remains unclear. 	72
409703	cd12258	RRM2_RBM26_like	RNA recognition motif 2 (RRM2) found in vertebrate RNA-binding protein 26 (RBM26) and similar proteins. This subfamily corresponds to the RRM2 of RBM26, also known as cutaneous T-cell lymphoma (CTCL) tumor antigen se70-2, which represents a cutaneous lymphoma (CL)-associated antigen. RBM26 contains two RNA recognition motifs (RRMs), also known as RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The RRMs may play some functional roles in RNA-binding or protein-protein interactions.	72
409704	cd12259	RRM_SRSF11_SREK1	RNA recognition motif (RRM) found in serine/arginine-rich splicing factor 11 (SRSF11), splicing regulatory glutamine/lysine-rich protein 1 (SREK1) and similar proteins. This subfamily corresponds to the RRM domain of SRSF11 (SRp54 or p54), SREK1 (SFRS12 or SRrp86) and similar proteins, a group of proteins containing regions rich in serine-arginine dipeptides (SR protein family). These are involved in bridge-complex formation and splicing by mediating protein-protein interactions across either introns or exons. SR proteins have been identified as crucial regulators of alternative splicing. Different SR proteins display different substrate specificity, have distinct functions in alternative splicing of different pre-mRNAs, and can even negatively regulate splicing. All SR family members are characterized by the presence of one or two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and the C-terminal regions rich in serine and arginine dipeptides (SR domains). The RRM domain is responsible for RNA binding and specificity in both alternative and constitutive splicing. In contrast, SR domains are thought to be protein-protein interaction domains that are often interchangeable. 	76
409705	cd12260	RRM2_SREK1	RNA recognition motif 2 (RRM2) found in splicing regulatory glutamine/lysine-rich protein 1 (SREK1) and similar proteins. This subfamily corresponds to the RRM2 of SREK1, also termed serine/arginine-rich-splicing regulatory protein 86-kDa (SRrp86), or splicing factor arginine/serine-rich 12 (SFRS12), or splicing regulatory protein 508 amino acid (SRrp508). SREK1 belongs to a family of proteins containing regions rich in serine-arginine dipeptides (SR proteins family), which is involved in bridge-complex formation and splicing by mediating protein-protein interactions across either introns or exons. It is a unique SR family member and it may play a crucial role in determining tissue specific patterns of alternative splicing. SREK1 can alter splice site selection by both positively and negatively modulating the activity of other SR proteins. For instance, SREK1 can activate SRp20 and repress SC35 in a dose-dependent manner both in vitro and in vivo. In addition, SREK1 contains two (some contain only one) RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and two serine-arginine (SR)-rich domains (SR domains) separated by an unusual glutamic acid-lysine (EK) rich region. The RRM and SR domains are highly conserved among other members of the SR superfamily. However, the EK domain is unique to SREK1. It plays a modulatory role controlling SR domain function by involvement in the inhibition of both constitutive and alternative splicing and in the selection of splice-site. 	85
240707	cd12261	RRM1_3_MRN1	RNA recognition motif 1 (RRM1) and 3 (RRM3) found in RNA-binding protein MRN1 and similar proteins. This subfamily corresponds to the RRM1 and RRM3 of MRN1, also termed multicopy suppressor of RSC-NHP6 synthetic lethality protein 1, or post-transcriptional regulator of 69 kDa, which is an RNA-binding protein found in yeast. Although its specific biological role remains unclear, MRN1 might be involved in translational regulation. Members in this family contain four copies of conserved RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	73
409706	cd12262	RRM2_4_MRN1	RNA recognition motif 2 (RRM2) and 4 (RRM4) found in RNA-binding protein MRN1 and similar proteins. This subgroup corresponds to the RRM2 and RRM4 of MRN1, also termed multicopy suppressor of RSC-NHP6 synthetic lethality protein 1, or post-transcriptional regulator of 69 kDa, and is an RNA-binding protein found in yeast. Although its specific biological role remains unclear, MRN1 might be involved in translational regulation. Members in this family contain four copies of conserved RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain).	78
409707	cd12263	RRM_ABT1_like	RNA recognition motif (RRM) found in activator of basal transcription 1 (ABT1) and similar proteins. This subfamily corresponds to the RRM of novel nuclear proteins termed ABT1 and its homologous counterpart, pre-rRNA-processing protein ESF2 (eighteen S factor 2), from yeast. ABT1 associates with the TATA-binding protein (TBP) and enhances basal transcription activity of class II promoters. Meanwhile, ABT1 could be a transcription cofactor that can bind to DNA in a sequence-independent manner. The yeast ABT1 homolog, ESF2, is a component of 90S preribosomes and 5' ETS-based RNPs. It is previously identified as a putative partner of the TATA-element binding protein. However, it is primarily localized to the nucleolus and physically associates with pre-rRNA processing factors. ESF2 may play a role in ribosome biogenesis. It is required for normal pre-rRNA processing, as well as for SSU processome assembly and function. Both ABT1 and ESF2 contain an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	98
409708	cd12264	RRM_AKAP17A	RNA recognition motif (RRM) found in A-kinase anchor protein 17A (AKAP-17A) and similar proteins. This subfamily corresponds to the RRM domain of AKAP-17A, also termed 721P, or splicing factor, arginine/serine-rich 17A (SFRS17A). It was originally reported as the pseudoautosomal or X inactivation escape gene 7 (XE7) and as B-lymphocyte antigen precursor. It has been suggested that AKAP-17A is an alternative splicing factor and an SR-related splicing protein that interacts with the classical SR protein ASF/SF2 and the SR-related factor ZNF265. Additional studies have indicated that AKAP-17A is a dual-specific protein kinase A anchoring protein (AKAP) that can bind both type I and type II protein kinase A (PKA) with high affinity and co-localizes with the catalytic subunit of PKA in nuclear speckles as well as the splicing factor SC35 in splicing factor compartments. It is involved in regulation of pre-mRNA splicing possibly by docking a pool of PKA in splicing factor compartments. AKAP-17A contains an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	122
409709	cd12265	RRM_SLT11	RNA recognition motif (RRM) found in pre-mRNA-splicing factor SLT11 and similar proteins. This subfamily corresponds to the RRM of SLT11, also known as extracellular mutant protein 2, or synthetic lethality with U2 protein 11, and is a splicing factor required for spliceosome assembly in yeast. It contains a conserved RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). SLT11 can facilitate the cooperative formation of U2/U6 helix II in association with stem II in the yeast spliceosome by utilizing its RNA-annealing and -binding activities. 	86
409710	cd12266	RRM_like_XS	RNA recognition motif (RRM)-like XS domain found in plants. This XS (named after rice gene X and SGS3) domain is a single-stranded RNA-binding domain (RBD) and possesses a unique version of a RNA recognition motif (RRM) fold. It is conserved in a family of plant proteins including gene X and SGS3. Although its function is still unknown, the plant SGS3 proteins are thought to be involved in post-transcriptional gene silencing (PTGS) pathways. In addition, they contain a conserved aspartate residue that may be functionally important. 	107
409711	cd12267	RRM_YRA1_MLO3	RNA recognition motif (RRM) found in yeast RNA annealing protein YRA1 (Yra1p), yeast mRNA export protein mlo3 and similar proteins. This subfamily corresponds to the RRM of Yra1p and mlo3. Yra1p is an essential nuclear RNA-binding protein encoded by Saccharomyces cerevisiae YRA1 gene. It belongs to the evolutionarily conserved REF (RNA and export factor binding proteins) family of hnRNP-like proteins. Yra1p possesses potent RNA annealing activity and interacts with a number of proteins involved in nuclear transport and RNA processing. It binds to the mRNA export factor Mex67p/TAP and couples transcription to export in yeast. Yra1p is associated with Pse1p and Kap123p, two members of the beta-importin family, further mediating transport of Yra1p into the nucleus. In addition, the co-transcriptional loading of Yra1p is required for autoregulation. Yra1p consists of two highly conserved N- and C-terminal boxes and a central RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). This subfamily includes RNA-annealing protein mlo3, also termed mRNA export protein mlo3, which has been identified in fission yeast as a protein that causes defects in chromosome segregation when overexpressed. It shows high sequence similarity with Yra1p. 	78
240714	cd12268	RRM_Vip1	RNA recognition motif (RRM) found in fission yeast protein Vip1 and similar proteins. This subfamily corresponds to Vip1, an RNA-binding protein encoded by gene vip1 from fission yeast Schizosaccharomyces pombe. Its biological role remains unclear. Vip1 contains an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	68
409712	cd12269	RRM_Vip1_like	RNA recognition motif (RRM) found in a group of uncharacterized plant proteins similar to fission yeast Vip1. This subfamily corresponds to the Vip1-like, uncharacterized proteins found in plants. Although their biological roles remain unclear, these proteins show high sequence similarity to the fission yeast Vip1. Like Vip1 protein, members in this family contain an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	69
409713	cd12270	RRM_MTHFSD	RNA recognition motif (RRM) found in vertebrate methenyltetrahydrofolate synthetase domain-containing proteins. This subfamily corresponds to methenyltetrahydrofolate synthetase domain (MTHFSD), a putative RNA-binding protein found in various vertebrate species. It contains an N-terminal 5-formyltetrahydrofolate cyclo-ligase domain and a C-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). The biological role of MTHFSD remains unclear. 	72
409714	cd12271	RRM1_PHIP1	RNA recognition motif 1 (RRM1) found in Arabidopsis thaliana phragmoplastin interacting protein 1 (PHIP1) and similar proteins. This subfamily corresponds to the RRM1 of PHIP1. A. thaliana PHIP1 and its homologs represent a novel class of plant-specific RNA-binding proteins that may play a unique role in the polarized mRNA transport to the vicinity of the cell plate. The family members consist of multiple functional domains, including a lysine-rich domain (KRD domain) that contains three nuclear localization motifs (KKKR/NK), two RNA recognition motifs (RRMs), and three CCHC-type zinc fingers. PHIP1 is a peripheral membrane protein and is localized at the cell plate during cytokinesis in plants. In addition to phragmoplastin, PHIP1 interacts with two Arabidopsis small GTP-binding proteins, Rop1 and Ran2. However, PHIP1 interacted only with the GTP-bound form of Rop1 but not the GDP-bound form. It also binds specifically to Ran2 mRNA. 	72
409715	cd12272	RRM2_PHIP1	RNA recognition motif 2 (RRM2) found in Arabidopsis thaliana phragmoplastin interacting protein 1 (PHIP1) and similar proteins. The CD corresponds to the RRM2 of PHIP1. A. thaliana PHIP1 and its homologs represent a novel class of plant-specific RNA-binding proteins that may play a unique role in the polarized mRNA transport to the vicinity of the cell plate. The family members consist of multiple functional domains, including a lysine-rich domain (KRD domain) that contains three nuclear localization motifs (KKKR/NK), two RNA recognition motifs (RRMs), and three CCHC-type zinc fingers. PHIP1 is a peripheral membrane protein and is localized at the cell plate during cytokinesis in plants. In addition to phragmoplastin, PHIP1 interacts with two Arabidopsis small GTP-binding proteins, Rop1 and Ran2. However, PHIP1 interacted only with the GTP-bound form of Rop1 but not the GDP-bound form. It also binds specifically to Ran2 mRNA. 	73
409716	cd12273	RRM1_NEFsp	RNA recognition motif 1 (RRM1) found in vertebrate putative RNA exonuclease NEF-sp. This subfamily corresponds to the RRM1 of NEF-sp, including uncharacterized putative RNA exonuclease NEF-sp found in vertebrates. Although its cellular functions remain unclear, NEF-sp contains an exonuclease domain and two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), suggesting it may possess both exonuclease and RNA-binding activities. 	71
409717	cd12274	RRM2_NEFsp	RNA recognition motif 2 (RRM2) found in vertebrate putative RNA exonuclease NEF-sp. This subfamily corresponds to the RRM2 of NEF-sp, including uncharacterized putative RNA exonuclease NEF-sp found in vertebrates. Although its cellular functions remain unclear, NEF-sp contains an exonuclease domain and two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), suggesting it may possess both exonuclease and RNA-binding activities. 	71
240721	cd12275	RRM1_MEI2_EAR1_like	RNA recognition motif 1 (RRM1) found in Mei2-like proteins and terminal EAR1-like proteins. This subfamily corresponds to the RRM1 of Mei2-like proteins from plant and fungi, terminal EAR1-like proteins from plant, and other eukaryotic homologs. Mei2-like proteins represent an ancient eukaryotic RNA-binding protein family whose corresponding Mei2-like genes appear to have arisen early in eukaryote evolution, been lost from some lineages such as Saccharomyces cerevisiae and metazoans, and diversified in the plant lineage. The plant Mei2-like genes may function in cell fate specification during development, rather than as stimulators of meiosis. In the fission yeast Schizosaccharomyces pombe, the Mei2 protein is an essential component of the switch from mitotic to meiotic growth. S. pombe Mei2 stimulates meiosis in the nucleus upon binding a specific non-coding RNA. The terminal EAR1-like protein 1 and 2 (TEL1 and TEL2) are mainly found in land plants. They may play a role in the regulation of leaf initiation. All members in this family are putative RNA-binding proteins carrying three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). In addition to the RRMs, the terminal EAR1-like proteins also contain TEL characteristic motifs that allow sequence and putative functional discrimination between them and Mei2-like proteins. 	71
409718	cd12276	RRM2_MEI2_EAR1_like	RNA recognition motif 2 (RRM2) found in Mei2-like proteins and terminal EAR1-like proteins. This subfamily corresponds to the RRM2 of Mei2-like proteins from plant and fungi, terminal EAR1-like proteins from plant, and other eukaryotic homologs. Mei2-like proteins represent an ancient eukaryotic RNA-binding protein family whose corresponding Mei2-like genes appear to have arisen early in eukaryote evolution, been lost from some lineages such as Saccharomyces cerevisiae and metazoans, and diversified in the plant lineage. The plant Mei2-like genes may function in cell fate specification during development, rather than as stimulators of meiosis. In the fission yeast Schizosaccharomyces pombe, the Mei2 protein is an essential component of the switch from mitotic to meiotic growth. S. pombe Mei2 stimulates meiosis in the nucleus upon binding a specific non-coding RNA. The terminal EAR1-like protein 1 and 2 (TEL1 and TEL2) are mainly found in land plants. They may play a role in the regulation of leaf initiation. All members in this family are putative RNA-binding proteins carrying three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). In addition to the RRMs, the terminal EAR1-like proteins also contain TEL characteristic motifs that allow sequence and putative functional discrimination between them and Mei2-like proteins. 	71
409719	cd12277	RRM3_MEI2_EAR1_like	RNA recognition motif 3 (RRM3) found in Mei2-like proteins and terminal EAR1-like proteins. This subfamily corresponds to the RRM3 of Mei2-like proteins from plant and fungi, terminal EAR1-like proteins from plant, and other eukaryotic homologs. Mei2-like proteins represent an ancient eukaryotic RNA-binding protein family whose corresponding Mei2-like genes appear to have arisen early in eukaryote evolution, been lost from some lineages such as Saccharomyces cerevisiae and metazoans, and diversified in the plant lineage. The plant Mei2-like genes may function in cell fate specification during development, rather than as stimulators of meiosis. In the fission yeast Schizosaccharomyces pombe, the Mei2 protein is an essential component of the switch from mitotic to meiotic growth. S. pombe Mei2 stimulates meiosis in the nucleus upon binding a specific non-coding RNA. The terminal EAR1-like protein 1 and 2 (TEL1 and TEL2) are mainly found in land plants. They may play a role in the regulation of leaf initiation. All members in this family are putative RNA-binding proteins carrying three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). In addition to the RRMs, the terminal EAR1-like proteins also contain TEL characteristic motifs that allow sequence and putative functional discrimination between them and Mei2-like proteins. 	86
409720	cd12278	RRM_eIF3B	RNA recognition motif (RRM) found in eukaryotic translation initiation factor 3 subunit B (eIF-3B) and similar proteins. This subfamily corresponds to the RRM domain in eukaryotic translation initiation factor 3 (eIF-3), a large multisubunit complex that plays a central role in the initiation of translation by binding to the 40 S ribosomal subunit and promoting the binding of methionyl-tRNAi and mRNA. eIF-3B, also termed eIF-3 subunit 9, or Prt1 homolog, eIF-3-eta, eIF-3 p110, or eIF-3 p116, is the major scaffolding subunit of eIF-3. It interacts with eIF-3 subunits A, G, I, and J. eIF-3B contains an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), which is involved in the interaction with eIF-3J. The interaction between eIF-3B and eIF-3J is crucial for the eIF-3 recruitment to the 40 S ribosomal subunit. eIF-3B also binds directly to domain III of the internal ribosome-entry site (IRES) element of hepatitis-C virus (HCV) RNA through its N-terminal RRM, which may play a critical role in both cap-dependent and cap-independent translation. Additional research has shown that eIF-3B may function as an oncogene in glioma cells and can serve as a potential therapeutic target for anti-glioma therapy. This family also includes the yeast homolog of eIF-3 subunit B (eIF-3B, also termed PRT1 or eIF-3 p90) that interacts with the yeast homologs of eIF-3 subunits A(TIF32), G(TIF35), I(TIF34), J(HCR1), and E(Pci8). In yeast, eIF-3B (PRT1) contains an N-terminal RRM that is directly involved in the interaction with eIF-3A (TIF32) and eIF-3J (HCR1). In contrast to its human homolog, yeast eIF-3B (PRT1) may have potential to bind its total RNA through its RRM domain. 	84
409721	cd12279	RRM_TUT1	RNA recognition motif (RRM) found in speckle targeted PIP5K1A-regulated poly(A) polymerase (Star-PAP) and similar proteins. This subfamily corresponds to the RRM of Star-PAP, also termed RNA-binding motif protein 21 (RBM21), which is a ubiquitously expressed U6 snRNA-specific terminal uridylyltransferase (U6-TUTase) essential for cell proliferation. Although it belongs to the well-characterized poly(A) polymerase protein superfamily, Star-PAP is highly divergent from both, the poly(A) polymerase (PAP) and the terminal uridylyl transferase (TUTase), identified within the editing complexes of trypanosomes. Star-PAP predominantly localizes at nuclear speckles and catalyzes RNA-modifying nucleotidyl transferase reactions. It functions in mRNA biosynthesis and may be regulated by phosphoinositides. It binds to glutathione S-transferase (GST)-PIPKIalpha. Star-PAP preferentially uses ATP as a nucleotide substrate and possesses PAP activity that is stimulated by PtdIns4,5P2. It contains an N-terminal C2H2-type zinc finger motif followed by an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), a split PAP domain linked by a proline-rich region, a PAP catalytic and core domain, a PAP-associated domain, an RS repeat, and a nuclear localization signal (NLS). 	74
409722	cd12280	RRM_FET	RNA recognition motif (RRM) found in the FET family of RNA-binding proteins. This subfamily corresponds to the RRM of FET (previously TET) (FUS/TLS, EWS, TAF15) family of RNA-binding proteins. This ubiquitously expressed family of similarly structured proteins, which predominantly localize to the nucleus, includes FUS (also known as TLS or Pigpen or hnRNP P2), EWS (also known as EWSR1), TAF15 (also known as hTAFII68 or TAF2N or RPB56), and Drosophila Cabeza (also known as SARFH). The corresponding coding genes of these proteins are involved in deleterious genomic rearrangements with transcription factor genes in a variety of human sarcomas and acute leukemias. All FET proteins interact with each other and are therefore likely to be part of the very same protein complexes, which suggests a general bridging role for FET proteins coupling RNA transcription, processing, transport, and DNA repair. The FET proteins contain multiple copies of a degenerate hexapeptide repeat motif at the N-terminus. The C-terminal region consists of a conserved nuclear import and retention signal (C-NLS), a putative zinc-finger domain, and a conserved RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), which is flanked by 3 arginine-glycine-glycine (RGG) boxes. FUS and EWS might have similar sequence specificity; both bind preferentially to GGUG-containing RNAs. FUS has also been shown to bind strongly to human telomeric RNA and to small low-copy-number RNAs tethered to the promoter of cyclin D1. To date, nothing is known about the RNA binding specificity of TAF15. 	82
409723	cd12281	RRM1_TatSF1_like	RNA recognition motif 1 (RRM1) found in HIV Tat-specific factor 1 (Tat-SF1) and similar proteins. This subfamily corresponds to the RRM1 of Tat-SF1 and CUS2. Tat-SF1 is the cofactor for stimulation of transcriptional elongation by human immunodeficiency virus-type 1 (HIV-1) Tat. It is a substrate of an associated cellular kinase. Tat-SF1 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a highly acidic carboxyl-terminal half. The family also includes CUS2, a yeast homolog of human Tat-SF1. CUS2 interacts with U2 RNA in splicing extracts and functions as a splicing factor that aids assembly of the splicing-competent U2 snRNP in vivo. CUS2 also associates with PRP11 that is a subunit of the conserved splicing factor SF3a. Like Tat-SF1, CUS2 contains two RRMs as well. 	92
409724	cd12282	RRM2_TatSF1_like	RNA recognition motif 2 (RRM2) found in HIV Tat-specific factor 1 (Tat-SF1) and similar proteins. This subfamily corresponds to the RRM2 of Tat-SF1 and CUS2. Tat-SF1 is the cofactor for stimulation of transcriptional elongation by human immunodeficiency virus-type 1 (HIV-1) Tat. It is a substrate of an associated cellular kinase. Tat-SF1 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a highly acidic carboxyl-terminal half. The family also includes CUS2, a yeast homolog of human Tat-SF1. CUS2 interacts with U2 RNA in splicing extracts and functions as a splicing factor that aids assembly of the splicing-competent U2 snRNP in vivo. CUS2 also associates with PRP11 that is a subunit of the conserved splicing factor SF3a. Like Tat-SF1, CUS2 contains two RRMs as well. 	91
409725	cd12283	RRM1_RBM39_like	RNA recognition motif 1 (RRM1) found in vertebrate RNA-binding protein 39 (RBM39) and similar proteins. This subfamily corresponds to the RRM1 of RNA-binding protein 39 (RBM39), RNA-binding protein 23 (RBM23) and similar proteins. RBM39 (also termed HCC1) is a nuclear autoantigen that contains an N-terminal arginine/serine rich (RS) motif and three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). An octapeptide sequence called the RS-ERK motif is repeated six times in the RS region of RBM39. Although the cellular function of RBM23 remains unclear, it shows high sequence homology to RBM39 and contains two RRMs. It may possibly function as a pre-mRNA splicing factor. 	73
409726	cd12284	RRM2_RBM23_RBM39	RNA recognition motif 2 (RRM2) found in vertebrate RNA-binding protein RBM23, RBM39 and similar proteins. This subfamily corresponds to the RRM2 of RBM39 (also termed HCC1), a nuclear autoantigen that contains an N-terminal arginine/serine rich (RS) motif and three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). An octapeptide sequence called the RS-ERK motif is repeated six times in the RS region of RBM39. Although the cellular function of RBM23 remains unclear, it shows high sequence homology to RBM39 and contains two RRMs. It may possibly function as a pre-mRNA splicing factor. 	78
409727	cd12285	RRM3_RBM39_like	RNA recognition motif 3 (RRM3) found in vertebrate RNA-binding protein 39 (RBM39) and similar proteins. This subfamily corresponds to the RRM3 of RBM39, also termed hepatocellular carcinoma protein 1, or RNA-binding region-containing protein 2, or splicing factor HCC1, a nuclear autoantigen that contains an N-terminal arginine/serine rich (RS) motif and three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). An octapeptide sequence called the RS-ERK motif is repeated six times in the RS region of RBM39. Based on the specific domain composition, RBM39 has been classified into a family of non-snRNP (small nuclear ribonucleoprotein) splicing factors that are usually not complexed to snRNAs. 	85
409728	cd12286	RRM_Man1	RNA recognition motif (RRM) found in inner nuclear membrane protein Man1 (Man1) and similar proteins. This subfamily corresponds to the RRM of Man1, also termed LEM domain-containing protein 3 (LEMD3), an integral protein of the inner nuclear membrane that binds to nuclear lamins and emerin, thus playing a role in nuclear organization. It is part of a protein complex essential for chromatin organization and cell division. It also functions as an important negative regulator for the transforming growth factor (TGF) beta/activin/Nodal signaling pathway by directly interacting with chromatin-associated proteins and transcriptional regulators, including the R-Smads, Smad1, Smad2, and Smad3. Moreover, Man1 is a unique type of left-right (LR) signaling regulator that acts on the inner nuclear membrane. Man1 plays a crucial role in angiogenesis. The vascular remodeling can be regulated at the inner nuclear membrane through the interaction between Man1 and Smads. Man1 contains an N-terminal LEM domain, two putative transmembrane domains, a MAN1-Src1p C-terminal (MSC) domain, and a C-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). The LEM domain interacts with the DNA and chromatin-binding protein Barrier-to-Autointegration Factor, and is also necessary for efficient localization of MAN1 in the inner nuclear membrane. Research has indicated that C-terminal nucleoplasmic region of Man1 exhibits a DNA binding winged helix domain and is responsible for both DNA- and Smad-binding. 	92
409729	cd12287	RRM_U2AF35_like	RNA recognition motif (RRM) found in U2 small nuclear ribonucleoprotein auxiliary factor U2AF 35 kDa subunit (U2AF35) and similar proteins. This subfamily corresponds to the RRM in U2 small nuclear ribonucleoprotein (snRNP) auxiliary factor (U2AF) which has been implicated in the recruitment of U2 snRNP to pre-mRNAs. It is a highly conserved heterodimer composed of large and small subunits; this family includes the small subunit of U2AF (U2AF35 or U2AF1) and U2AF 35 kDa subunit B (U2AF35B or C3H60). U2AF35 directly binds to the 3' splice site of the conserved AG dinucleotide and performs multiple functions in the splicing process in a substrate-specific manner. It promotes U2 snRNP binding to the branch-point sequences of introns through association with the large subunit of U2AF (U2AF65 or U2AF2). Although the biological role of U2AF35B remains unclear, it shows high sequence homology to U2AF35, which contains two N-terminal zinc fingers, a central RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal arginine/serine (SR)-rich segment interrupted by glycines. In contrast to U2AF35, U2AF35B has a plant-specific conserved C-terminal region containing SERE motif(s), which may have an important function specific to higher plants. 	101
409730	cd12288	RRM_La_like_plant	RNA recognition motif (RRM) found in plant proteins related to the La autoantigen. This subfamily corresponds to the RRM of plant La-like proteins related to the La autoantigen. A variety of La-related proteins (LARPs or La ribonucleoproteins), with differing domain architecture, appear to function as RNA-binding proteins in eukaryotic cellular processes. Members in this family contain an LAM domain followed by an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain).	90
409731	cd12289	RRM_LARP6	RNA recognition motif (RRM) found in La-related protein 6 (LARP6) and similar proteins. This subfamily corresponds to the RRM of LARP6, also termed Acheron (Achn), a novel member of the lupus antigen (La) family. It is expressed predominantly in neurons and muscle in vertebrates. LARP6 functions as a key regulatory protein that may play a role in mediating a variety of developmental and homeostatic processes in animals, including myogenesis, neurogenesis and possibly metastasis. LARP6 binds to Ca2+/calmodulin-dependent serine protein kinase (CASK), and forms a complex with inhibitor of differentiation transcription factors. It is structurally related to the La autoantigen and contains a La motif (LAM), nuclear localization and export (NLS and NES) signals, and an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	93
409732	cd12290	RRM1_LARP7	RNA recognition motif 1 (RRM1) found in La-related protein 7 (LARP7) and similar proteins. This subfamily corresponds to the RRM1 of LARP7, also termed La ribonucleoprotein domain family member 7, or P-TEFb-interaction protein for 7SK stability (PIP7S), an oligopyrimidine-binding protein that binds to the highly conserved 3'-terminal U-rich stretch (3' -UUU-OH) of 7SK RNA. LARP7 is a stable component of the 7SK small nuclear ribonucleoprotein (7SK snRNP). It intimately associates with all the nuclear 7SK and is required for 7SK stability. LARP7 also acts as a negative transcriptional regulator of cellular and viral polymerase II genes, acting by means of the 7SK snRNP system. It plays an essential role in the inhibition of positive transcription elongation factor b (P-TEFb)-dependent transcription, which has been linked to the global control of cell growth and tumorigenesis. LARP7 contains a La motif (LAM) and an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), at the N-terminal region, which mediates binding to the U-rich 3' terminus of 7SK RNA. LARP7 also carries another putative RRM domain at its C-terminus. 	79
409733	cd12291	RRM1_La	RNA recognition motif 1 in La autoantigen (La or LARP3) and similar proteins. This subfamily corresponds to the RRM1 of La autoantigen, also termed Lupus La protein, or La ribonucleoprotein, or Sjoegren syndrome type B antigen (SS-B), a highly abundant nuclear phosphoprotein and well conserved in eukaryotes. It specifically binds the 3'-terminal UUU-OH motif of nascent RNA polymerase III transcripts and protects them from exonucleolytic degradation by 3' exonucleases. In addition, La can directly facilitate the translation and/or metabolism of many UUU-3' OH-lacking cellular and viral mRNAs, through binding internal RNA sequences within the untranslated regions of target mRNAs. La contains an N-terminal La motif (LAM), followed by two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). It also possesses a short basic motif (SBM) and a nuclear localization signal (NLS) at the C-terminus. 	73
409734	cd12292	RRM2_La_like	RNA recognition motif 2 in La autoantigen (La or SS-B or LARP3), La-related protein 7 (LARP7 or PIP7S) and similar proteins. This subfamily corresponds to the RRM2 of La and LARP7. La is a highly abundant nuclear phosphoprotein and well conserved in eukaryotes. It specifically binds the 3'-terminal UUU-OH motif of nascent RNA polymerase III transcripts and protects them from exonucleolytic degradation by 3' exonucleases. In addition, La can directly facilitate the translation and/or metabolism of many UUU-3' OH-lacking cellular and viral mRNAs, through binding internal RNA sequences within the untranslated regions of target mRNAs. LARP7 is an oligopyrimidine-binding protein that binds to the highly conserved 3'-terminal U-rich stretch (3' -UUU-OH) of 7SK RNA. It is a stable component of the 7SK small nuclear ribonucleoprotein (7SK snRNP), intimately associates with all the nuclear 7SK and is required for 7SK stability. LARP7 also acts as a negative transcriptional regulator of cellular and viral polymerase II genes, acting by means of the 7SK snRNP system. LARP7 plays an essential role in the inhibition of positive transcription elongation factor b (P-TEFb)-dependent transcription, which has been linked to the global control of cell growth and tumorigenesis. Both La and LARP7 contain an N-terminal La motif (LAM), followed by two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 	74
410983	cd12293	dRRM_Rrp7p	deviant RNA recognition motif (dRRM) in yeast ribosomal RNA-processing protein 7 (Rrp7p) and similar proteins. Rrp7p is encoded by YCL031C gene from Saccharomyces cerevisiae. It is an essential yeast protein involved in pre-rRNA processing and ribosome assembly, and is speculated to be required for correct assembly of rpS27 into the pre-ribosomal particle. Rrp7p contains a deviant RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a RRP7 domain. The classic RRM fold has a topology of beta1-alpha1-beta2-beta3-alpha2-beta4 with juxtaposed N- and C-termini. By contrast, the N-terminal region of Rrp7 displays a cyclic permutation of RRM topology: the strand equivalent to RRM beta4 is shuffled to the N-terminus of the strand equivalent to RRM beta1. Moreover, Rrp7 has an extra strand beta1, which, together with other four beta-strands, forms an antiparallel five-stranded beta-sheet.	105
409735	cd12294	RRM_Rrp7A	RNA recognition motif in ribosomal RNA-processing protein 7 homolog A (Rrp7A) and similar proteins. This subfamily corresponds to the RRM of Rrp7A, also termed gastric cancer antigen Zg14, a homolog of yeast ribosomal RNA-processing protein 7 (Rrp7p), and mainly found in Metazoa. Rrp7p is an essential yeast protein involved in pre-rRNA processing and ribosome assembly, and is speculated to be required for correct assembly of rpS27 into the pre-ribosomal particle. In contrast, the cellular function of Rrp7A remains unclear. Rrp7A harbors an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal Rrp7 domain. 	103
409736	cd12295	RRM_YRA2	RNA recognition motif in yeast RNA annealing protein YRA2 (Yra2p) and similar proteins. This subfamily corresponds to the RRM of Yra2p, a nonessential nuclear RNA-binding protein encoded by Saccharomyces cerevisiae YRA2 gene. It may share some overlapping functions with Yra1p, and is able to complement an YRA1 deletion when overexpressed in yeast. Yra2p belongs to the evolutionarily conserved REF (RNA and export factor binding proteins) family of hnRNP-like proteins. It is a major component of endogenous Yra1p complexes. It interacts with Yra1p and functions as a negative regulator of Yra1p. Yra2p consists of two highly conserved N- and C-terminal boxes and a central RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	74
409737	cd12296	RRM1_Prp24	RNA recognition motif 1 in fungal pre-messenger RNA splicing protein 24 (Prp24) and similar proteins. This subfamily corresponds to the RRM1 of Prp24, also termed U4/U6 snRNA-associated-splicing factor PRP24 (U4/U6 snRNP), an RNA-binding protein with four well conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). It facilitates U6 RNA base-pairing with U4 RNA during spliceosome assembly. Prp24 specifically binds free U6 RNA primarily with RRMs 1 and 2 and facilitates pairing of U6 RNA bases with U4 RNA bases. Additionally, it may also be involved in dissociation of the U4/U6 complex during spliceosome activation. 	71
409738	cd12297	RRM2_Prp24	RNA recognition motif 2 in fungal pre-messenger RNA splicing protein 24 (Prp24) and similar proteins. This subfamily corresponds to the RRM2 of Prp24, also termed U4/U6 snRNA-associated-splicing factor PRP24 (U4/U6 snRNP), an RNA-binding protein with four well conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). It facilitates U6 RNA base-pairing with U4 RNA during spliceosome assembly. Prp24 specifically binds free U6 RNA primarily with RRMs 1 and 2 and facilitates pairing of U6 RNA bases with U4 RNA bases. Additionally, it may also be involved in dissociation of the U4/U6 complex during spliceosome activation. 	78
409739	cd12298	RRM3_Prp24	RNA recognition motif 3 in fungal pre-messenger RNA splicing protein 24 (Prp24) and similar proteins. This subfamily corresponds to the RRM3 of Prp24, also termed U4/U6 snRNA-associated-splicing factor PRP24 (U4/U6 snRNP), an RNA-binding protein with four well conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). It facilitates U6 RNA base-pairing with U4 RNA during spliceosome assembly. Prp24 specifically binds free U6 RNA primarily with RRMs 1 and 2 and facilitates pairing of U6 RNA bases with U4 RNA bases. Additionally, it may also be involved in dissociation of the U4/U6 complex during spliceosome activation. 	78
409740	cd12299	RRM4_Prp24	RNA recognition motif 4 in fungal pre-messenger RNA splicing protein 24 (Prp24) and similar proteins. This subfamily corresponds to the RRM4 of Prp24, also termed U4/U6 snRNA-associated-splicing factor PRP24 (U4/U6 snRNP), an RNA-binding protein with four well conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). It facilitates U6 RNA base-pairing with U4 RNA during spliceosome assembly. Prp24 specifically binds free U6 RNA primarily with RRMs 1 and 2 and facilitates pairing of U6 RNA bases with U4 RNA bases. Additionally, it may also be involved in dissociation of the U4/U6 complex during spliceosome activation. 	71
409741	cd12300	RRM1_PAR14	RNA recognition motif 1 in vertebrate poly [ADP-ribose] polymerase 14 (PARP-14). This subfamily corresponds to the RRM1 of PARP-14, also termed aggressive lymphoma protein 2, a member of the B aggressive lymphoma (BAL) family of macrodomain-containing PARPs. It is expressed in B lymphocytes and interacts with the IL-4-induced transcription factor Stat6. It plays a fundamental role in the regulation of IL-4-induced B-cell protection against apoptosis after irradiation or growth factor withdrawal. It mediates IL-4 effects on the levels of gene products that regulate cell survival, proliferation, and lymphomagenesis. PARP-14 acts as a transcriptional switch for Stat6-dependent gene activation. In the presence of IL-4, PARP-14 activates transcription by facilitating the binding of Stat6 to the promoter and release of HDACs from the promoter with an IL-4 signal. In contrast, in the absence of a signal, PARP-14 acts as a transcriptional repressor by recruiting HDACs. Moreover, the absence of PARP-14 protects against Myc-induced developmental block and lymphoma. Thus, PARP-14 may play an important role in Myc-induced oncogenesis. Research indicates that PARP-14 is also a binding partner with phosphoglucose isomerase (PGI)/ autocrine motility factor (AMF). It can inhibit PGI/AMF ubiquitination, thus contributing to its stabilization and secretion. PARP-14 contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), three tandem macro domains, and C-terminal region with sequence homology to PARP catalytic domain. 	82
409742	cd12301	RRM1_2_PAR10_like	RNA recognition motif 1 and 2 in poly [ADP-ribose] polymerase PARP-10, RNA recognition motif 2 in PARP-14, RNA recognition motif in N-myc-interactor (Nmi), interferon-induced 35 kDa protein (IFP 35), RNA-binding protein 43 (RBM43) and similar proteins. This subfamily corresponds to the RRM1 and RRM2 of PARP-10, RRM2 of PARP-14, RRM of N-myc-interactor (Nmi), interferon-induced 35 kDa protein (IFP 35) and RNA-binding protein 43 (RBM43). PARP-10 is a novel oncoprotein c-Myc-interacting protein with poly(ADP-ribose) polymerase activity. It is localized to the nuclear and cytoplasmic compartments. In addition to PARP activity, PARP-10 is also involved in the control of cell proliferation by inhibiting c-Myc- and E1A-mediated cotransformation of primary cells. PARP-10 may also play a role in nuclear processes including the regulation of chromatin, gene transcription, and nuclear/cytoplasmic transport. PARP-10 contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two overlapping C-terminal domains composed of a glycine-rich region and a region with homology to catalytic domains of PARP enzymes (PARP domain). In addition, PARP-10 contains two ubiquitin-interacting motifs (UIM). PARP-14, also termed aggressive lymphoma protein 2, is a member of the B aggressive lymphoma (BAL) family of macrodomain-containing PARPs. Like PARP-10, PARP-14 also includes two RRMs at the N-terminus. Nmi, also termed N-myc and STAT interactor, is an interferon inducible protein that interacts with c-Myc, N-Myc, Max and c-Fos, and other transcription factors containing bHLH-ZIP, bHLH or ZIP domains. Besides binding Myc proteins, Nmi also associates with all the Stat family of transcription factors except Stat2. In response to cytokine (e.g. IL-2 and IFN-gamma) stimulation, Nmi can enhance Stat-mediated transcriptional activity through recruiting the Stat1 and Stat5 transcriptional coactivators, CREB-binding protein (CBP) and p300. IFP 35 is an interferon-induced leucine zipper protein that can specifically form homodimers. Distinct from known bZIP proteins, IFP 35 lacks a basic domain critical for DNA binding. In addition, IFP 35 may negatively regulate other bZIP transcription factors by protein-protein interaction. For instance, it can form heterodimers with B-ATF, a member of the AP1 transcription factor family. Both Nmi and IFP35 harbor one RRM. RBM43 is a putative RNA-binding protein containing one RRM, but its biological function remains unclear. 	74
409743	cd12302	RRM_scSet1p_like	RNA recognition motif in budding yeast Saccharomyces cerevisiae SET domain-containing protein 1 (scSet1p) and similar proteins. This subfamily corresponds to the RRM of scSet1p, also termed H3 lysine-4 specific histone-lysine N-methyltransferase, or COMPASS component SET1, or lysine N-methyltransferase 2, which is encoded by SET1 from the yeast S. cerevisiae. It is a nuclear protein that may play a role in both silencing and activating transcription. scSet1p is closely related to the SET domain proteins of multicellular organisms, which are implicated in diverse aspects of cell morphology, growth control, and chromatin-mediated transcriptional silencing. scSet1p contains an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), followed by a conserved SET domain that may play a role in DNA repair and telomere function. 	110
409744	cd12303	RRM_spSet1p_like	RNA recognition motif in fission yeast Schizosaccharomyces pombe SET domain-containing protein 1 (spSet1p) and similar proteins. This subfamily corresponds to the RRM of spSet1p, also termed H3 lysine-4 specific histone-lysine N-methyltransferase, or COMPASS component SET1, or lysine N-methyltransferase 2, or Set1 complex component, which is encoded by SET1 from the fission yeast S. pombe. It is essential for H3 lysine-4 methylation in vivo, and plays an important role in telomere maintenance and DNA repair in an ATM kinase Rad3-dependent pathway. spSet1p is the homologous counterpart of Saccharomyces cerevisiae Set1p (scSet1p). However, it is more closely related to Set1 found in mammals. Moreover, unlike scSet1p, spSet1p is not required for heterochromatin assembly in fission yeast. spSet1p contains an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), followed by a conserved SET domain that may play a role in DNA repair and telomere function. 	86
409745	cd12304	RRM_Set1	RNA recognition motif in the Set1-like family of histone-lysine N-methyltransferases. This subfamily corresponds to the RRM of the Set1-like family of histone-lysine N-methyltransferases which includes Set1A and Set1B that are ubiquitously expressed vertebrate histone methyltransferases exhibiting high homology to yeast Set1. Set1A and Set1B proteins exhibit a largely non-overlapping subnuclear distribution in euchromatic nuclear speckles, strongly suggesting that they bind to a unique set of target genes and thus make non-redundant contributions to the epigenetic control of chromatin structure and gene expression. With the exception of the catalytic component, the subunit composition of the Set1A and Set1B histone methyltransferase complexes is identical. Each complex contains six human homologs of the yeast Set1/COMPASS complex, including Set1A or Set1B, Ash2 (homologous to yeast Bre2), CXXC finger protein 1 (CFP1; homologous to yeast Spp1), Rbbp5 (homologous to yeast Swd1), Wdr5 (homologous to yeast Swd3), and Wdr82 (homologous to yeast Swd2). The genomic targeting of these complexes is determined by the identity of the catalytic subunit present in each histone methyltransferase complex. Thus, the Set1A and Set1B complexes may exhibit both overlapping and non-redundant properties. Both Set1A and Set1B contain an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), an N-SET domain, and a C-terminal catalytic SET domain followed by a post-SET domain. In contrast to Set1B, Set1A additionally contains an HCF-1 binding motif that interacts with HCF-1 in vivo. 	93
409746	cd12305	RRM_NELFE	RNA recognition motif in negative elongation factor E (NELF-E) and similar proteins. This subfamily corresponds to the RRM of NELF-E, also termed RNA-binding protein RD. NELF-E is the RNA-binding subunit of cellular negative transcription elongation factor NELF (negative elongation factor) involved in transcriptional regulation of HIV-1 by binding to the stem of the viral transactivation-response element (TAR) RNA which is synthesized by cellular RNA polymerase II at the viral long terminal repeat. NELF is a heterotetrameric protein consisting of NELF A, B, C or the splice variant D, and E. NELF-E contains an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). It plays a role in the control of HIV transcription by binding to TAR RNA. In addition, NELF-E is associated with the NELF-B subunit, probably via a leucine zipper motif. 	75
409747	cd12306	RRM_II_PABPs	RNA recognition motif in type II polyadenylate-binding proteins. This subfamily corresponds to the RRM of type II polyadenylate-binding proteins (PABPs), including polyadenylate-binding protein 2 (PABP-2 or PABPN1), embryonic polyadenylate-binding protein 2 (ePABP-2 or PABPN1L) and similar proteins. PABPs are highly conserved proteins that bind to the poly(A) tail present at the 3' ends of most eukaryotic mRNAs. They have been implicated in the regulation of poly(A) tail length during the polyadenylation reaction, translation initiation, mRNA stabilization by influencing the rate of deadenylation and inhibition of mRNA decapping. ePABP-2 is predominantly located in the cytoplasm and PABP-2 is located in the nucleus. In contrast to the type I PABPs containing four copies of RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), the type II PABPs contain a single highly-conserved RRM. This subfamily also includes Saccharomyces cerevisiae RBP29 (SGN1, YIR001C) gene encoding cytoplasmic mRNA-binding protein Rbp29 that binds preferentially to poly(A). Although not essential for cell viability, Rbp29 plays a role in modulating the expression of cytoplasmic mRNA. Like other type II PABPs, Rbp29 contains one RRM only. 	73
409748	cd12307	RRM_NIFK_like	RNA recognition motif in nucleolar protein interacting with the FHA domain of pKi-67 (NIFK) and similar proteins. This subgroup corresponds to the RRM of NIFK and Nop15p. NIFK, also termed MKI67 FHA domain-interacting nucleolar phosphoprotein, or nucleolar phosphoprotein Nopp34, is a putative RNA-binding protein interacting with the forkhead associated (FHA) domain of pKi-67 antigen in a mitosis-specific and phosphorylation-dependent manner. It is nucleolar in interphase but associates with condensed mitotic chromosomes. This family also includes Saccharomyces cerevisiae YNL110C gene encoding ribosome biogenesis protein 15 (Nop15p), also termed nucleolar protein 15. Both NIFK and Nop15p contain an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	74
409749	cd12308	RRM1_Spen	RNA recognition motif 1 (RRM1) found in the Spen (split end) protein family. This subfamily corresponds to the RRM1 domain in the Spen (split end) family which includes RNA binding motif protein 15 (RBM15), putative RNA binding motif protein 15B (RBM15B), and similar proteins found in Metazoa. RBM15, also termed one-twenty two protein 1 (OTT1), conserved in eukaryotes, is a novel mRNA export factor and component of the NXF1 pathway. It binds to NXF1 and serves as receptor for the RNA export element RTE. It also possesses mRNA export activity and can facilitate the access of DEAD-box protein DBP5 to mRNA at the nuclear pore complex (NPC). RNA-binding protein 15B (RBM15B), also known as one twenty-two 3 (OTT3), is a paralog of RBM15 and therefore has post-transcriptional regulatory activity. It is a nuclear protein sharing with RBM15 the association with the splicing factor compartment and the nuclear envelope as well as the binding to mRNA export factors NXF1 and Aly/REF. Members in this family belong to the Spen (split end) protein family, which share a domain architecture comprising three N-terminal RNA recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal SPOC (Spen paralog and ortholog C-terminal) domain. 	78
240755	cd12309	RRM2_Spen	RNA recognition motif 2 (RRM2) found in the Spen (split end) protein family. This subfamily corresponds to the RRM2 domain in the Spen (split end) protein family which includes RNA binding motif protein 15 (RBM15), putative RNA binding motif protein 15B (RBM15B), and similar proteins found in Metazoa. RBM15, also termed one-twenty two protein 1 (OTT1), conserved in eukaryotes, is a novel mRNA export factor and component of the NXF1 pathway. It binds to NXF1 and serves as receptor for the RNA export element RTE. It also possesses mRNA export activity and can facilitate the access of DEAD-box protein DBP5 to mRNA at the nuclear pore complex (NPC). RNA-binding protein 15B (RBM15B), also termed one twenty-two 3 (OTT3), is a paralog of RBM15 and therefore has post-transcriptional regulatory activity. It is a nuclear protein sharing with RBM15 the association with the splicing factor compartment and the nuclear envelope as well as the binding to mRNA export factors NXF1 and Aly/REF. Members in this family belong to the Spen (split end) protein family, which share a domain architecture comprising three N-terminal RNA recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal SPOC (Spen paralog and ortholog C-terminal) domain. 	79
409750	cd12310	RRM3_Spen	RNA recognition motif 3 (RRM3) found in the Spen (split end) protein family. This subfamily corresponds to the RRM3 domain in the Spen (split end) protein family which includes RNA binding motif protein 15 (RBM15), putative RNA binding motif protein 15B (RBM15B) and similar proteins found in Metazoa. RBM15, also termed one-twenty two protein 1 (OTT1), conserved in eukaryotes, is a novel mRNA export factor and component of the NXF1 pathway. It binds to NXF1 and serves as receptor for the RNA export element RTE. It also possesses mRNA export activity and can facilitate the access of DEAD-box protein DBP5 to mRNA at the nuclear pore complex (NPC). RNA-binding protein 15B (RBM15B), also termed one twenty-two 3 (OTT3), is a paralog of RBM15 and therefore has post-transcriptional regulatory activity. It is a nuclear protein sharing with RBM15 the association with the splicing factor compartment and the nuclear envelope as well as the binding to mRNA export factors NXF1 and Aly/REF. Members in this family belong to the Spen (split end) protein family, which shares a domain architecture comprising three N-terminal RNA recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal SPOC (Spen paralog and ortholog C-terminal) domain. 	72
409751	cd12311	RRM_SRSF2_SRSF8	RNA recognition motif (RRM) found in serine/arginine-rich splicing factor SRSF2, SRSF8 and similar proteins. This subfamily corresponds to the RRM of SRSF2 and SRSF8. SRSF2, also termed protein PR264, or splicing component, 35 kDa (splicing factor SC35 or SC-35), is a prototypical SR protein that plays important roles in the alternative splicing of pre-mRNA. It is also involved in transcription elongation by directly or indirectly mediating the recruitment of elongation factors to the C-terminal domain of polymerase II. SRSF2 is exclusively localized in the nucleus and is restricted to nuclear processes. It contains a single N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), followed by a C-terminal RS domain rich in serine-arginine dipeptides. The RRM is responsible for the specific recognition of 5'-SSNG-3' (S=C/G) RNA. In the regulation of alternative splicing events, it specifically binds to cis-regulatory elements on the pre-mRNA. The RS domain modulates SRSF2 activity through phosphorylation, directly contacts RNA, and promotes protein-protein interactions with the spliceosome. SRSF8, also termed SRP46 or SFRS2B, is a novel mammalian SR splicing factor encoded by a PR264/SC35 functional retropseudogene. SRSF8 is localized in the nucleus and does not display the same activity as PR264/SC35. It functions as an essential splicing factor in complementing a HeLa cell S100 extract deficient in SR proteins. Like SRSF2, SRSF8 contains a single N-terminal RRM and a C-terminal RS domain. 	73
240758	cd12312	RRM_SRSF10_SRSF12	RNA recognition motif (RRM) found in serine/arginine-rich splicing factor SRSF10, SRSF12 and similar proteins. This subfamily corresponds to the RRM of SRSF10 and SRSF12. SRSF10, also termed 40 kDa SR-repressor protein (SRrp40), or FUS-interacting serine-arginine-rich protein 1 (FUSIP1), or splicing factor SRp38, or splicing factor, arginine/serine-rich 13A (SFRS13A), or TLS-associated protein with Ser-Arg repeats (TASR), is a serine-arginine (SR) protein that acts as a potent and general splicing repressor when dephosphorylated. It mediates global inhibition of splicing both in M phase of the cell cycle and in response to heat shock. SRSF10 emerges as a modulator of cholesterol homeostasis through the regulation of low-density lipoprotein receptor (LDLR) splicing efficiency. It also regulates cardiac-specific alternative splicing of triadin pre-mRNA and is required for proper Ca2+ handling during embryonic heart development. In contrast, the phosphorylated SRSF10 functions as a sequence-specific splicing activator in the presence of a nuclear cofactor. It activates distal alternative 5' splice site of adenovirus E1A pre-mRNA in vivo. Moreover, SRSF10 strengthens pre-mRNA recognition by U1 and U2 snRNPs. SRSF10 localizes to the nuclear speckles and can shuttle between nucleus and cytoplasm. SRSF12, also termed 35 kDa SR repressor protein (SRrp35), or splicing factor, arginine/serine-rich 13B (SFRS13B), or splicing factor, arginine/serine-rich 19 (SFRS19), is a serine/arginine (SR) protein-like alternative splicing regulator that antagonizes authentic SR proteins in the modulation of alternative 5' splice site choice. For instance, it activates distal alternative 5' splice site of the adenovirus E1A pre-mRNA in vivo. Both SRSF10 and SRSF12 contain a single N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), followed by a C-terminal RS domain rich in serine-arginine dipeptides. 	84
409752	cd12313	RRM1_RRM2_RBM5_like	RNA recognition motif 1 (RRM1) and 2 (RRM2) found in RNA-binding protein 5 (RBM5) and similar proteins. This subfamily includes the RRM1 and RRM2 of RNA-binding protein 5 (RBM5 or LUCA15 or H37) and RNA-binding protein 10 (RBM10 or S1-1), and the RRM2 of RNA-binding protein 6 (RBM6 or NY-LU-12 or g16 or DEF-3). These RBMs share high sequence homology and may play an important role in regulating apoptosis. RBM5 is a known modulator of apoptosis. It may also act as a tumor suppressor or an RNA splicing factor. RBM6 has been predicted to be a nuclear factor based on its nuclear localization signal. Both RBM6 and RBM5 specifically bind poly(G) RNA. RBM10 is a paralog of RBM5. It may play an important role in mRNA generation, processing and degradation in several cell types. The rat homolog of human RBM10 is protein S1-1, a hypothetical RNA binding protein with poly(G) and poly(U) binding capabilities. All family members contain two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two C2H2-type zinc fingers, and a G-patch/D111 domain. 	85
409753	cd12314	RRM1_RBM6	RNA recognition motif 1 (RRM1) found in vertebrate RNA-binding protein 6 (RBM6). This subfamily corresponds to the RRM1 of RBM6, also termed lung cancer antigen NY-LU-12, or protein G16, or RNA-binding protein DEF-3, which has been predicted to be a nuclear factor based on its nuclear localization signal. It shows high sequence similarity to RNA-binding protein 5 (RBM5 or LUCA15 or NY-REN-9). Both RBM6 and RBM5 specifically bind poly(G) RNA. They contain two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two C2H2-type zinc fingers, a nuclear localization signal, and a G-patch/D111 domain. In contrast to RBM5, RBM6 has two additional unique domains: the decamer repeat occurring more than 20 times, and the POZ (poxvirus and zinc finger) domain. The POZ domain may be involved in protein-protein interactions and inhibit binding of target sequences by zinc fingers. 	78
409754	cd12315	RRM1_RBM19_MRD1	RNA recognition motif 1 (RRM1) found in RNA-binding protein 19 (RBM19), yeast multiple RNA-binding domain-containing protein 1 (MRD1) and similar proteins. This subfamily corresponds to the RRM1 of RBM19 and MRD1. RBM19, also termed RNA-binding domain-1 (RBD-1), is a nucleolar protein conserved in eukaryotes. It is involved in ribosome biogenesis by processing rRNA and is essential for preimplantation development. It has a unique domain organization containing 6 conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). MRD1 is encoded by a novel yeast gene MRD1 (multiple RNA-binding domain). It is well-conserved in yeast and its homologs exist in all eukaryotes. MRD1 is present in the nucleolus and the nucleoplasm. It interacts with the 35 S precursor rRNA (pre-rRNA) and U3 small nucleolar RNAs (snoRNAs). It is essential for the initial processing at the A0-A2 cleavage sites in the 35 S pre-rRNA. MRD1 contains 5 conserved RRMs, which may play an important structural role in organizing specific rRNA processing events. 	81
409755	cd12316	RRM3_RBM19_RRM2_MRD1	RNA recognition motif 3 (RRM3) found in RNA-binding protein 19 (RBM19) and RNA recognition motif 2 found in multiple RNA-binding domain-containing protein 1 (MRD1). This subfamily corresponds to the RRM3 of RBM19 and RRM2 of MRD1. RBM19, also termed RNA-binding domain-1 (RBD-1), is a nucleolar protein conserved in eukaryotes involved in ribosome biogenesis by processing rRNA and is essential for preimplantation development. It has a unique domain organization containing 6 conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). MRD1 is encoded by a novel yeast gene MRD1 (multiple RNA-binding domain). It is well conserved in yeast and its homologs exist in all eukaryotes. MRD1 is present in the nucleolus and the nucleoplasm. It interacts with the 35 S precursor rRNA (pre-rRNA) and U3 small nucleolar RNAs (snoRNAs). It is essential for the initial processing at the A0-A2 cleavage sites in the 35 S pre-rRNA. MRD1 contains 5 conserved RRMs, which may play an important structural role in organizing specific rRNA processing events. 	74
409756	cd12317	RRM4_RBM19_RRM3_MRD1	RNA recognition motif 4 (RRM4) found in RNA-binding protein 19 (RBM19) and RNA recognition motif 3 (RRM3) found in multiple RNA-binding domain-containing protein 1 (MRD1). This subfamily corresponds to the RRM4 of RBM19 and the RRM3 of MRD1. RBM19, also termed RNA-binding domain-1 (RBD-1), is a nucleolar protein conserved in eukaryotes involved in ribosome biogenesis by processing rRNA and is essential for preimplantation development. It has a unique domain organization containing 6 conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). MRD1 is encoded by a novel yeast gene MRD1 (multiple RNA-binding domain). It is well conserved in yeast and its homologues exist in all eukaryotes. MRD1 is present in the nucleolus and the nucleoplasm. It interacts with the 35 S precursor rRNA (pre-rRNA) and U3 small nucleolar RNAs (snoRNAs). MRD1 is essential for the initial processing at the A0-A2 cleavage sites in the 35 S pre-rRNA. MRD1 contains 5 conserved RRMs, which may play an important structural role in organizing specific rRNA processing events. 	72
409757	cd12318	RRM5_RBM19_like	RNA recognition motif 5 (RRM5) found in RNA-binding protein 19 (RBM19 or RBD-1) and similar proteins. This subfamily corresponds to the RRM5 of RBM19 and RRM4 of MRD1. RBM19, also termed RNA-binding domain-1 (RBD-1), is a nucleolar protein conserved in eukaryotes involved in ribosome biogenesis by processing rRNA and is essential for preimplantation development. It has a unique domain organization containing 6 conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 	80
409758	cd12319	RRM4_MRD1	RNA recognition motif 4 (RRM4) found in yeast multiple RNA-binding domain-containing protein 1 (MRD1) and similar proteins. This subfamily corresponds to the RRM4 of MRD1, which is encoded by a novel yeast gene MRD1 (multiple RNA-binding domain). It is well-conserved in yeast and its homologs exist in all eukaryotes. MRD1 is present in the nucleolus and the nucleoplasm. It interacts with the 35 S precursor rRNA (pre-rRNA) and U3 small nucleolar RNAs (snoRNAs). MRD1 is essential for the initial processing at the A0-A2 cleavage sites in the 35 S pre-rRNA. It contains 5 conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), which may play an important structural role in organizing specific rRNA processing events. 	84
409759	cd12320	RRM6_RBM19_RRM5_MRD1	RNA recognition motif 6 (RRM6) found in RNA-binding protein 19 (RBM19 or RBD-1) and RNA recognition motif 5 (RRM5) found in multiple RNA-binding domain-containing protein 1 (MRD1). This subfamily corresponds to the RRM6 of RBM19 and RRM5 of MRD1. RBM19, also termed RNA-binding domain-1 (RBD-1), is a nucleolar protein conserved in eukaryotes. It is involved in ribosome biogenesis by processing rRNA and is essential for preimplantation development. It has a unique domain organization containing 6 conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). MRD1 is encoded by a novel yeast gene MRD1 (multiple RNA-binding domain). It is well-conserved in yeast and its homologs exist in all eukaryotes. MRD1 is present in the nucleolus and the nucleoplasm. It interacts with the 35 S precursor rRNA (pre-rRNA) and U3 small nucleolar RNAs (snoRNAs). It is essential for the initial processing at the A0-A2 cleavage sites in the 35 S pre-rRNA. MRD1 contains 5 conserved RRMs, which may play an important structural role in organizing specific rRNA processing events. 	76
409760	cd12321	RRM1_TDP43	RNA recognition motif 1 (RRM1) found in TAR DNA-binding protein 43 (TDP-43) and similar proteins. This subfamily corresponds to the RRM1 of TDP-43 (also termed TARDBP), a ubiquitously expressed pathogenic protein whose normal function and abnormal aggregation are directly linked to the genetic disease cystic fibrosis, and two neurodegenerative disorders: frontotemporal lobar degeneration (FTLD) and amyotrophic lateral sclerosis (ALS). TDP-43 binds both DNA and RNA, and has been implicated in transcriptional repression, pre-mRNA splicing and translational regulation. TDP-43 is a dimeric protein with two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a C-terminal glycine-rich domain. The RRMs are responsible for DNA and RNA binding; they bind to TAR DNA and RNA sequences with UG-repeats. The glycine-rich domain can interact with the hnRNP family proteins to form the hnRNP-rich complex involved in splicing inhibition. It is also essential for the cystic fibrosis transmembrane conductance regulator (CFTR) exon 9-skipping activity. 	74
409761	cd12322	RRM2_TDP43	RNA recognition motif 2 (RRM2) found in TAR DNA-binding protein 43 (TDP-43) and similar proteins. This subfamily corresponds to the RRM2 of TDP-43 (also termed TARDBP), a ubiquitously expressed pathogenic protein whose normal function and abnormal aggregation are directly linked to the genetic disease cystic fibrosis, and two neurodegenerative disorders: frontotemporal lobar degeneration (FTLD) and amyotrophic lateral sclerosis (ALS). TDP-43 binds both DNA and RNA, and has been implicated in transcriptional repression, pre-mRNA splicing and translational regulation. TDP-43 is a dimeric protein with two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a C-terminal glycine-rich domain. The RRMs are responsible for DNA and RNA binding; they bind to TAR DNA and RNA sequences with UG-repeats. The glycine-rich domain can interact with the hnRNP family proteins to form the hnRNP-rich complex involved in splicing inhibition. It is also essential for the cystic fibrosis transmembrane conductance regulator (CFTR) exon 9-skipping activity. 	71
240769	cd12323	RRM2_MSI	RNA recognition motif 2 (RRM2) found in RNA-binding protein Musashi homologs Musashi-1, Musashi-2 and similar proteins. This subfamily corresponds to the RRM2 in Musashi-1 (also termed Msi1), a neural RNA-binding protein putatively expressed in central nervous system (CNS) stem cells and neural progenitor cells, and associated with asymmetric divisions in neural progenitor cells. It is evolutionarily conserved from invertebrates to vertebrates. Musashi-1 is a homolog of Drosophila Musashi and Xenopus laevis nervous system-specific RNP protein-1 (Nrp-1). It has been implicated in the maintenance of the stem-cell state, differentiation, and tumorigenesis. It translationally regulates the expression of a mammalian numb gene by binding to the 3'-untranslated region of mRNA of Numb, encoding a membrane-associated inhibitor of Notch signaling, and further influences neural development. Moreover, Musashi-1 represses translation by interacting with the poly(A)-binding protein and competes for binding of the eukaryotic initiation factor-4G (eIF-4G). Musashi-2 (also termed Msi2) has been identified as a regulator of the hematopoietic stem cell (HSC) compartment and of leukemic stem cells after transplantation of cells with loss and gain of function of the gene. It influences proliferation and differentiation of HSCs and myeloid progenitors, and further modulates normal hematopoiesis and promotes aggressive myeloid leukemia. Both, Musashi-1 and Musashi-2, contain two conserved N-terminal tandem RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), along with other domains of unknown function. 	74
409762	cd12324	RRM_RBM8	RNA recognition motif (RRM) found in RNA-binding protein RBM8A, RBM8B and similar proteins. This subfamily corresponds to the RRM of RBM8, also termed binder of OVCA1-1 (BOV-1), or RNA-binding protein Y14, which is one of the components of the exon-exon junction complex (EJC). It has two isoforms, RBM8A and RBM8B, both of which are identical except that RBM8B is 16 amino acids shorter at its N-terminus. RBM8, together with other EJC components (such as Magoh, Aly/REF, RNPS1, Srm160, and Upf3), plays critical roles in postsplicing processing, including nuclear export and cytoplasmic localization of the mRNA, and the nonsense-mediated mRNA decay (NMD) surveillance process. RBM8 binds to mRNA 20-24 nucleotides upstream of a spliced exon-exon junction. It is also involved in spliced mRNA nuclear export, and the process of nonsense-mediated decay of mRNAs with premature stop codons. RBM8 forms a specific heterodimer complex with the EJC protein Magoh which then associates with Aly/REF, RNPS1, DEK, and SRm160 on the spliced mRNA, and inhibits ATP turnover by eIF4AIII, thereby trapping the EJC core onto RNA. RBM8 contains an N-terminal putative bipartite nuclear localization signal, one RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), in the central region, and a C-terminal serine-arginine rich region (SR domain) and glycine-arginine rich region (RG domain). 	88
409763	cd12325	RRM1_hnRNPA_hnRNPD_like	RNA recognition motif 1 (RRM1) found in heterogeneous nuclear ribonucleoprotein hnRNP A and hnRNP D subfamilies and similar proteins. This subfamily corresponds to the RRM1 in the hnRNP A subfamily which includes hnRNP A0, hnRNP A1, hnRNP A2/B1, hnRNP A3 and similar proteins. hnRNP A0 is a low abundance hnRNP protein that has been implicated in mRNA stability in mammalian cells. hnRNP A1 is an abundant eukaryotic nuclear RNA-binding protein that may modulate splice site selection in pre-mRNA splicing. hnRNP A2/B1 is an RNA trafficking response element-binding protein that interacts with the hnRNP A2 response element (A2RE). hnRNP A3 is also an RNA trafficking response element-binding protein that participates in the trafficking of A2RE-containing RNA. The hnRNP A subfamily is characterized by two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a long glycine-rich region at the C-terminus. The hnRNP D subfamily includes hnRNP D0, hnRNP A/B, hnRNP DL and similar proteins. hnRNP D0 is a UUAG-specific nuclear RNA binding protein that may be involved in pre-mRNA splicing and telomere elongation. hnRNP A/B is an RNA unwinding protein with a high affinity for G- followed by U-rich regions. hnRNP A/B has also been identified as an APOBEC1-binding protein that interacts with apolipoprotein B (apoB) mRNA transcripts around the editing site and thus, plays an important role in apoB mRNA editing. hnRNP DL (or hnRNP D-like) is a dual functional protein that possesses DNA- and RNA-binding properties. It has been implicated in mRNA biogenesis at the transcriptional and post-transcriptional levels. All members in this subfamily contain two putative RRMs and a glycine- and tyrosine-rich C-terminus. The family also contains DAZAP1 (Deleted in azoospermia-associated protein 1), RNA-binding protein Musashi homolog Musashi-1, Musashi-2 and similar proteins. They all harbor two RRMs. 	72
409764	cd12326	RRM1_hnRNPA0	RNA recognition motif 1 (RRM1) found in heterogeneous nuclear ribonucleoprotein A0 (hnRNP A0) and similar proteins. This subfamily corresponds to the RRM1 of hnRNP A0 which is a low abundance hnRNP protein that has been implicated in mRNA stability in mammalian cells. It has been identified as the substrate for MAPKAP-K2 and may be involved in the lipopolysaccharide (LPS)-induced post-transcriptional regulation of tumor necrosis factor-alpha (TNF-alpha), cyclooxygenase 2 (COX-2) and macrophage inflammatory protein 2 (MIP-2). hnRNP A0 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a long glycine-rich region at the C-terminus. 	79
409765	cd12327	RRM2_DAZAP1	RNA recognition motif 2 (RRM2) found in Deleted in azoospermia-associated protein 1 (DAZAP1) and similar proteins. This subfamily corresponds to the RRM2 of DAZAP1 or DAZ-associated protein 1, also termed proline-rich RNA binding protein (Prrp), a multi-functional ubiquitous RNA-binding protein expressed most abundantly in the testis and essential for normal cell growth, development, and spermatogenesis. DAZAP1 is a shuttling protein whose acetylated form is predominantly nuclear and whose nonacetylated form is in the cytoplasm. DAZAP1 also functions as a translational regulator that activates translation in an mRNA-specific manner. DAZAP1 was initially identified as a binding partner of Deleted in Azoospermia (DAZ). It also interacts with numerous hnRNPs, including hnRNP U, hnRNP U like-1, hnRNPA1, hnRNPA/B, and hnRNP D, suggesting DAZAP1 might associate and cooperate with hnRNP particles to regulate adenylate-uridylate-rich elements (AU-rich element or ARE)-containing mRNAs. DAZAP1 contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a C-terminal proline-rich domain. 	80
409766	cd12328	RRM2_hnRNPA_like	RNA recognition motif 2 (RRM2) found in heterogeneous nuclear ribonucleoprotein A subfamily. This subfamily corresponds to the RRM2 of hnRNP A0, hnRNP A1, hnRNP A2/B1, hnRNP A3 and similar proteins. hnRNP A0 is a low abundance hnRNP protein that has been implicated in mRNA stability in mammalian cells. It has been identified as the substrate for MAPKAP-K2 and may be involved in the lipopolysaccharide (LPS)-induced post-transcriptional regulation of tumor necrosis factor-alpha (TNF-alpha), cyclooxygenase 2 (COX-2) and macrophage inflammatory protein 2 (MIP-2). hnRNP A1 is an abundant eukaryotic nuclear RNA-binding protein that may modulate splice site selection in pre-mRNA splicing. hnRNP A2/B1 is an RNA trafficking response element-binding protein that interacts with the hnRNP A2 response element (A2RE). Many mRNAs, such as myelin basic protein (MBP), myelin-associated oligodendrocytic basic protein (MOBP), carboxyanhydrase II (CAII), microtubule-associated protein tau, and amyloid precursor protein (APP) are trafficked by hnRNP A2/B1. hnRNP A3 is also an RNA trafficking response element-binding protein that participates in the trafficking of A2RE-containing RNA. The hnRNP A subfamily is characterized by two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a long glycine-rich region at the C-terminus. 	73
240775	cd12329	RRM2_hnRNPD_like	RNA recognition motif 2 (RRM2) found in heterogeneous nuclear ribonucleoprotein hnRNP D0, hnRNP A/B, hnRNP DL and similar proteins. This subfamily corresponds to the RRM2 of hnRNP D0, hnRNP A/B, hnRNP DL and similar proteins. hnRNP D0 is a UUAG-specific nuclear RNA binding protein that may be involved in pre-mRNA splicing and telomere elongation. hnRNP A/B is an RNA unwinding protein with a high affinity for G- followed by U-rich regions. It has also been identified as an APOBEC1-binding protein that interacts with apolipoprotein B (apoB) mRNA transcripts around the editing site and thus plays an important role in apoB mRNA editing. hnRNP DL (or hnRNP D-like) is a dual functional protein that possesses DNA- and RNA-binding properties. It has been implicated in mRNA biogenesis at the transcriptional and post-transcriptional levels. All members in this family contain two putative RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a glycine- and tyrosine-rich C-terminus. 	75
409767	cd12330	RRM2_Hrp1p	RNA recognition motif 2 (RRM2) found in yeast nuclear polyadenylated RNA-binding protein 4 (Hrp1p or Nab4p) and similar proteins. This subfamily corresponds to the RRM2 of Hrp1p and similar proteins. Hrp1p or Nab4p, also termed cleavage factor IB (CFIB), is a sequence-specific trans-acting factor that is essential for mRNA 3'-end formation in yeast Saccharomyces cerevisiae. It can be UV cross-linked to RNA and specifically recognizes the (UA)6 RNA element required for both, the cleavage and poly(A) addition steps. Moreover, Hrp1p can shuttle between the nucleus and the cytoplasm, and play an additional role in the export of mRNAs to the cytoplasm. Hrp1p also interacts with Rna15p and Rna14p, two components of CF1A. In addition, Hrp1p functions as a factor directly involved in modulating the activity of the nonsense-mediated mRNA decay (NMD) pathway; it binds specifically to a downstream sequence element (DSE)-containing RNA and interacts with Upf1p, a component of the surveillance complex, further triggering the NMD pathway. Hrp1p contains two central RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and an arginine-glycine-rich region harboring repeats of the sequence RGGF/Y. 	78
409768	cd12331	RRM_NRD1_SEB1_like	RNA recognition motif (RRM) found in Saccharomyces cerevisiae protein Nrd1, Schizosaccharomyces pombe Rpb7-binding protein seb1 and similar proteins. This subfamily corresponds to the RRM of Nrd1 and Seb1. Nrd1 is a novel heterogeneous nuclear ribonucleoprotein (hnRNP)-like RNA-binding protein encoded by gene NRD1 (for nuclear pre-mRNA down-regulation) from yeast S. cerevisiae. It is implicated in 3' end formation of small nucleolar and small nuclear RNAs transcribed by polymerase II, and plays a critical role in pre-mRNA metabolism. Nrd1 contains an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), a short arginine-, serine-, and glutamate-rich segment similar to the regions rich in RE and RS dipeptides (RE/RS domains) in many metazoan splicing factors, and a proline- and glutamine-rich C-terminal domain (P+Q domain) similar to domains found in several yeast hnRNPs. Disruption of the NRD1 gene is lethal to yeast cells. Its N-terminal domain is sufficient for viability, which may facilitate interactions with RNA polymerase II where Nrd1 may function as an auxiliary factor. By contrast, the RRM, RE/RS domains, and P+Q domain are dispensable. Seb1 is an RNA-binding protein encoded by gene seb1 (for seven binding) from fission yeast S. pombe. It is essential for cell viability and binds directly to the Rpb7 subunit of RNA polymerase II. Seb1 is involved in processing of polymerase II transcripts. It also contains one RRM motif and a region rich in arginine-serine dipeptides (RS domain).	79
409769	cd12332	RRM1_p54nrb_like	RNA recognition motif 1 (RRM1) found in the p54nrb/PSF/PSP1 family. This subfamily corresponds to the RRM1 of the p54nrb/PSF/PSP1 family, including 54 kDa nuclear RNA- and DNA-binding protein (p54nrb or NonO or NMT55), polypyrimidine tract-binding protein (PTB)-associated-splicing factor (PSF or POMp100), paraspeckle protein 1 (PSP1 or PSPC1), which are ubiquitously expressed and are conserved in vertebrates. p54nrb is a multi-functional protein involved in numerous nuclear processes including transcriptional regulation, splicing, DNA unwinding, nuclear retention of hyperedited double-stranded RNA, viral RNA processing, control of cell proliferation, and circadian rhythm maintenance. PSF is also a multi-functional protein that binds RNA, single-stranded DNA (ssDNA), double-stranded DNA (dsDNA) and many factors, and mediates diverse activities in the cell. PSP1 is a novel nucleolar factor that accumulates within a new nucleoplasmic compartment, termed paraspeckles, and diffusely distributes in the nucleoplasm. The cellular function of PSP1 remains unknown currently. This subfamily also includes some p54nrb/PSF/PSP1 homologs from invertebrate species, such as the Drosophila melanogaster gene no-on-transient A (nonA) encoding puff-specific protein Bj6 (also termed NONA) and Chironomus tentans hrp65 gene encoding protein Hrp65. D. melanogaster NONA is involved in eye development and behavior, and may play a role in circadian rhythm maintenance, similar to vertebrate p54nrb. C. tentans Hrp65 is a component of nuclear fibers associated with ribonucleoprotein particles in transit from the gene to the nuclear pore. All family members contain a DBHS domain (for Drosophila behavior, human splicing), which comprises two conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a charged protein-protein interaction module. PSF has an additional large N-terminal domain that differentiates it from other family members. 	71
409770	cd12333	RRM2_p54nrb_like	RNA recognition motif 2 (RRM2) found in the p54nrb/PSF/PSP1 family. This subfamily corresponds to the RRM2 of the p54nrb/PSF/PSP1 family, including 54 kDa nuclear RNA- and DNA-binding protein (p54nrb or NonO or NMT55), polypyrimidine tract-binding protein (PTB)-associated-splicing factor (PSF or POMp100), paraspeckle protein 1 (PSP1 or PSPC1), which are ubiquitously expressed and are conserved in vertebrates. p54nrb is a multi-functional protein involved in numerous nuclear processes including transcriptional regulation, splicing, DNA unwinding, nuclear retention of hyperedited double-stranded RNA, viral RNA processing, control of cell proliferation, and circadian rhythm maintenance. PSF is also a multi-functional protein that binds RNA, single-stranded DNA (ssDNA), double-stranded DNA (dsDNA) and many factors, and mediates diverse activities in the cell. PSP1 is a novel nucleolar factor that accumulates within a new nucleoplasmic compartment, termed paraspeckles, and diffusely distributes in the nucleoplasm. The cellular function of PSP1 remains unknown currently. The family also includes some p54nrb/PSF/PSP1 homologs from invertebrate species, such as the Drosophila melanogaster gene no-on-transient A (nonA) encoding puff-specific protein Bj6 (also termed NONA) and Chironomus tentans hrp65 gene encoding protein Hrp65. D. melanogaster NONA is involved in eye development and behavior and may play a role in circadian rhythm maintenance, similar to vertebrate p54nrb. C. tentans Hrp65 is a component of nuclear fibers associated with ribonucleoprotein particles in transit from the gene to the nuclear pore. All family members contain a DBHS domain (for Drosophila behavior, human splicing), which comprises two conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a charged protein-protein interaction module. PSF has an additional large N-terminal domain that differentiates it from other family members. 	80
409771	cd12334	RRM1_SF3B4	RNA recognition motif 1 (RRM1) found in splicing factor 3B subunit 4 (SF3B4) and similar proteins. This subfamily corresponds to the RRM1 of SF3B4, also termed pre-mRNA-splicing factor SF3b 49 kDa (SF3b50), or spliceosome-associated protein 49 (SAP 49). SF3B4 is a component of the multiprotein complex splicing factor 3b (SF3B), an integral part of the U2 small nuclear ribonucleoprotein (snRNP) and the U11/U12 di-snRNP. SF3B is essential for the accurate excision of introns from pre-messenger RNA, and is involved in the recognition of the pre-mRNA's branch site within the major and minor spliceosomes. SF3B4 functions to tether U2 snRNP with pre-mRNA at the branch site during spliceosome assembly. It is an evolutionarily highly conserved protein with orthologs across diverse species. SF3B4 contains two closely adjacent N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). It binds directly to pre-mRNA and also interacts directly and highly specifically with another SF3B subunit called SAP 145. 	74
409772	cd12335	RRM2_SF3B4	RNA recognition motif 2 (RRM2) found in splicing factor 3B subunit 4 (SF3B4) and similar proteins. This subfamily corresponds to the RRM2 of SF3B4, also termed pre-mRNA-splicing factor SF3b 49 kDa (SF3b50), or spliceosome-associated protein 49 (SAP 49). SF3B4 is a component of the multiprotein complex splicing factor 3b (SF3B), an integral part of the U2 small nuclear ribonucleoprotein (snRNP) and the U11/U12 di-snRNP. SF3B is essential for the accurate excision of introns from pre-messenger RNA, and is involved in the recognition of the pre-mRNA's branch site within the major and minor spliceosomes. SF3B4 functions to tether U2 snRNP with pre-mRNA at the branch site during spliceosome assembly. It is an evolutionarily highly conserved protein with orthologs across diverse species. SF3B4 contains two closely adjacent N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). It binds directly to pre-mRNA and also interacts directly and highly specifically with another SF3B subunit called SAP 145. 	83
409773	cd12336	RRM_RBM7_like	RNA recognition motif (RRM) found in RNA-binding protein 7 (RBM7) and similar proteins. This subfamily corresponds to the RRM of RBM7, RBM11 and their eukaryotic homologs. RBM7 is a ubiquitously expressed pre-mRNA splicing factor that enhances messenger RNA (mRNA) splicing in a cell-specific manner or in a certain developmental process, such as spermatogenesis. It interacts with splicing factors SAP145 (the spliceosomal splicing factor 3b subunit 2) and SRp20, and may play a more specific role in meiosis entry and progression. Together with additional testis-specific RNA-binding proteins, RBM7 may regulate the splicing of specific pre-mRNA species that are important in the meiotic cell cycle. RBM11 is a novel tissue-specific splicing regulator that is selectively expressed in brain, cerebellum and testis, and to a lower extent in kidney. It is localized in the nucleoplasm and enriched in SRSF2-containing splicing speckles. It may play a role in the modulation of alternative splicing during neuron and germ cell differentiation. Both, RBM7 and RBM11, contain an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a region lacking known homology at the C-terminus. The RRM is responsible for RNA binding, whereas the C-terminal region permits nuclear localization and homodimerization. 	75
409774	cd12337	RRM1_SRSF4_like	RNA recognition motif 1 (RRM1) found in serine/arginine-rich splicing factor 4 (SRSF4) and similar proteins. This subfamily corresponds to the RRM1 in three serine/arginine (SR) proteins: serine/arginine-rich splicing factor 4 (SRSF4 or SRp75 or SFRS4), serine/arginine-rich splicing factor 5 (SRSF5 or SRp40 or SFRS5 or HRS), serine/arginine-rich splicing factor 6 (SRSF6 or SRp55). SRSF4 plays an important role in both constitutive and alternative splicing of many pre-mRNAs. It can shuttle between the nucleus and cytoplasm. SRSF5 regulates both alternative splicing and basal splicing. It is the only SR protein efficiently selected from nuclear extracts (NE) by the splicing enhancer (ESE) and essential for enhancer activation. SRSF6 preferentially interacts with a number of purine-rich splicing enhancers (ESEs) to activate splicing of the ESE-containing exon. It is the only protein from HeLa nuclear extract or purified SR proteins that specifically binds B element RNA after UV irradiation. SRSF6 may also recognize different types of RNA sites. Members in this family contain two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a C-terminal RS domain rich in serine-arginine dipeptides. 	70
409775	cd12338	RRM1_SRSF1_like	RNA recognition motif 1 (RRM1) found in serine/arginine-rich splicing factor 1 (SRSF1) and similar proteins. This subgroup corresponds to the RRM1 in three serine/arginine (SR) proteins: serine/arginine-rich splicing factor 1 (SRSF1 or ASF-1), serine/arginine-rich splicing factor 9 (SRSF9 or SRp30C), and plant pre-mRNA-splicing factor SF2 (SR1). SRSF1 is a shuttling SR protein involved in constitutive and alternative splicing, nonsense-mediated mRNA decay (NMD), mRNA export and translation. It also functions as a splicing-factor oncoprotein that regulates apoptosis and proliferation to promote mammary epithelial cell transformation. SRSF9 has been implicated in the activity of many elements that control splice site selection, the alternative splicing of the glucocorticoid receptor beta in neutrophils and in the gonadotropin-releasing hormone pre-mRNA. It can also interact with other proteins implicated in alternative splicing, including YB-1, rSLM-1, rSLM-2, E4-ORF4, Nop30, and p32. Both, SRSF1 and SRSF9, contain two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a C-terminal RS domain rich in serine-arginine dipeptides. In contrast, SF2 contains two N-terminal RRMs and a C-terminal PSK domain rich in proline, serine and lysine residues. 	72
409776	cd12339	RRM2_SRSF1_4_like	RNA recognition motif 2 (RRM2) found in serine/arginine-rich splicing factor SRSF1, SRSF4 and similar proteins. This subfamily corresponds to the RRM2 of several serine/arginine (SR) proteins that have been classified into two subgroups. The first subgroup consists of serine/arginine-rich splicing factor 4 (SRSF4 or SRp75 or SFRS4), serine/arginine-rich splicing factor 5 (SRSF5 or SRp40 or SFRS5 or HRS) and serine/arginine-rich splicing factor 6 (SRSF6 or SRp55). The second subgroup is composed of serine/arginine-rich splicing factor 1 (SRSF1 or ASF-1), serine/arginine-rich splicing factor 9 (SRSF9 or SRp30C) and plant pre-mRNA-splicing factor SF2 (SR1). These SR proteins are mainly involved in regulating constitutive and alternative pre-mRNA splicing. They also have been implicated in transcription, genomic stability, mRNA export and translation. All SR proteins in this family, except SRSF5, undergo nucleocytoplasmic shuttling, suggesting their widespread roles in gene expression. These SR proteins share a common domain architecture comprising two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a C-terminal RS domain rich in serine-arginine dipeptides. Both domains can directly contact RNA. The RRMs appear to determine the binding specificity and the SR domain also mediates protein-protein interactions. In addition, this subfamily includes the yeast nucleolar protein 3 (Npl3p), also termed mitochondrial targeting suppressor 1 protein, or nuclear polyadenylated RNA-binding protein 1. It is a major yeast RNA-binding protein that competes with 3'-end processing factors, such as Rna15, for binding to the nascent RNA, protecting the transcript from premature termination and coordinating transcription termination and the packaging of the fully processed transcript for export. It specifically recognizes a class of G/U-rich RNAs. Npl3p is a multi-domain protein with two RRMs, separated by a short linker and a C-terminal domain rich in glycine, arginine and serine residues. 	70
409777	cd12340	RBD_RRM1_NPL3	RNA recognition motif 1 (RRM1) found in yeast nucleolar protein 3 (Npl3p) and similar proteins. This subfamily corresponds to the RRM1 of Npl3p, also termed mitochondrial targeting suppressor 1 protein, or nuclear polyadenylated RNA-binding protein 1. Npl3p is a major yeast RNA-binding protein that competes with 3'-end processing factors, such as Rna15, for binding to the nascent RNA, protecting the transcript from premature termination and coordinating transcription termination and the packaging of the fully processed transcript for export. It specifically recognizes a class of G/U-rich RNAs. Npl3p is a multi-domain protein containing two central RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), separated by a short linker and a C-terminal domain rich in glycine, arginine and serine residues. 	69
409778	cd12341	RRM_hnRNPC_like	RNA recognition motif (RRM) found in heterogeneous nuclear ribonucleoprotein C (hnRNP C)-related proteins. This subfamily corresponds to the RRM in the hnRNP C-related protein family, including hnRNP C proteins, Raly, and Raly-like protein (RALYL). hnRNP C proteins, C1 and C2, are produced by a single coding sequence. They are the major constituents of the heterogeneous nuclear RNA (hnRNA) ribonucleoprotein (hnRNP) complex in vertebrates. They bind hnRNA tightly, suggesting a central role in the formation of the ubiquitous hnRNP complex; they are involved in the packaging of the hnRNA in the nucleus and in processing of pre-mRNA such as splicing and 3'-end formation. Raly, also termed autoantigen p542, is an RNA-binding protein that may play a critical role in embryonic development. The biological role of RALYL remains unclear. It shows high sequence homology with hnRNP C proteins and Raly. Members of this family are characterized by an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal auxiliary domain. The Raly proteins contain a glycine/serine-rich stretch within the C-terminal regions, which is absent in the hnRNP C proteins. Thus, the Raly proteins represent a newly identified class of evolutionarily conserved autoepitopes. 	68
240788	cd12342	RRM_Nab3p	RNA recognition motif (RRM) found in yeast nuclear polyadenylated RNA-binding protein 3 (Nab3p) and similar proteins. This subfamily corresponds to the RRM of Nab3p, an acidic nuclear polyadenylated RNA-binding protein encoded by Saccharomyces cerevisiae NAB3 gene that is essential for cell viability. Nab3p is predominantly localized within the nucleoplasm and essential for growth in yeast. It may play an important role in packaging pre-mRNAs into ribonucleoprotein structures amenable to efficient nuclear RNA processing. Nab3p contains an N-terminal aspartic/glutamic acid-rich region, a central RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal region rich in glutamine and proline residues. 	71
409779	cd12343	RRM1_2_CoAA_like	RNA recognition motif 1 (RRM1) and 2 (RRM2) found in RRM-containing coactivator activator/modulator (CoAA) and similar proteins. This subfamily corresponds to the RRM in CoAA (also known as RBM14 or PSP2) and RNA-binding protein 4 (RBM4). CoAA is a heterogeneous nuclear ribonucleoprotein (hnRNP)-like protein identified as a nuclear receptor coactivator. It mediates transcriptional coactivation and RNA splicing effects in a promoter-preferential manner, and is enhanced by thyroid hormone receptor-binding protein (TRBP). CoAA contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a TRBP-interacting domain. RBM4 is a ubiquitously expressed splicing factor with two isoforms, RBM4A (also known as Lark homolog) and RBM4B (also known as RBM30), which are very similar in structure and sequence. RBM4 may also function as a translational regulator of stress-associated mRNAs as well as play a role in micro-RNA-mediated gene regulation. RBM4 contains two N-terminal RRMs, a CCHC-type zinc finger, and three alanine-rich regions within their C-terminal regions. This family also includes Drosophila RNA-binding protein lark (Dlark), a homolog of human RBM4. It plays an important role in embryonic development and in the circadian regulation of adult eclosion. Dlark shares high sequence similarity with RBM4 at the N-terminal region. However, Dlark has three proline-rich segments instead of three alanine-rich segments within the C-terminal region. 	66
409780	cd12344	RRM1_SECp43_like	RNA recognition motif 1 (RRM1) found in tRNA selenocysteine-associated protein 1 (SECp43) and similar proteins. This subfamily corresponds to the RRM1 in tRNA selenocysteine-associated protein 1 (SECp43), yeast negative growth regulatory protein NGR1 (RBP1), yeast protein NAM8, and similar proteins. SECp43 is an RNA-binding protein associated specifically with eukaryotic selenocysteine tRNA [tRNA(Sec)]. It may play an adaptor role in the mechanism of selenocysteine insertion. SECp43 is located primarily in the nucleus and contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a C-terminal polar/acidic region. Yeast proteins, NGR1 and NAM8, show high sequence similarity with SECp43. NGR1 is a putative glucose-repressible protein that binds both RNA and single-stranded DNA (ssDNA). It may function in regulating cell growth in early log phase, possibly through its participation in RNA metabolism. NGR1 contains three RRMs, two of which are followed by a glutamine-rich stretch that may be involved in transcriptional activity. In addition, NGR1 has an asparagine-rich region near the C-terminus which also harbors a methionine-rich region. NAM8 is a putative RNA-binding protein that acts as a suppressor of mitochondrial splicing deficiencies when overexpressed in yeast. It may be a non-essential component of the mitochondrial splicing machinery. NAM8 also contains three RRMs.  	82
409781	cd12345	RRM2_SECp43_like	RNA recognition motif 2 (RRM2) found in tRNA selenocysteine-associated protein 1 (SECp43) and similar proteins. This subfamily corresponds to the RRM2 in tRNA selenocysteine-associated protein 1 (SECp43), yeast negative growth regulatory protein NGR1 (RBP1), yeast protein NAM8, and similar proteins. SECp43 is an RNA-binding protein associated specifically with eukaryotic selenocysteine tRNA [tRNA(Sec)]. It may play an adaptor role in the mechanism of selenocysteine insertion. SECp43 is located primarily in the nucleus and contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a C-terminal polar/acidic region. Yeast proteins, NGR1 and NAM8, show high sequence similarity with SECp43. NGR1 is a putative glucose-repressible protein that binds both RNA and single-stranded DNA (ssDNA). It may function in regulating cell growth in early log phase, possibly through its participation in RNA metabolism. NGR1 contains three RRMs, two of which are followed by a glutamine-rich stretch that may be involved in transcriptional activity. In addition, NGR1 has an asparagine-rich region near the C-terminus which also harbors a methionine-rich region. NAM8 is a putative RNA-binding protein that acts as a suppressor of mitochondrial splicing deficiencies when overexpressed in yeast. It may be a non-essential component of the mitochondrial splicing machinery. NAM8 also contains three RRMs.  	80
409782	cd12346	RRM3_NGR1_NAM8_like	RNA recognition motif 3 (RRM3) found in yeast negative growth regulatory protein NGR1 (RBP1), yeast protein NAM8 and similar proteins. This subfamily corresponds to the RRM3 of NGR1 and NAM8. NGR1, also termed RNA-binding protein RBP1, is a putative glucose-repressible protein that binds both RNA and single-stranded DNA (ssDNA) in yeast. It may function in regulating cell growth in early log phase, possibly through its participation in RNA metabolism. NGR1 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a glutamine-rich stretch that may be involved in transcriptional activity. In addition, NGR1 has an asparagine-rich region near the carboxyl terminus which also harbors a methionine-rich region. The family also includes protein NAM8, which is a putative RNA-binding protein that acts as a suppressor of mitochondrial splicing deficiencies when overexpressed in yeast. It may be a non-essential component of the mitochondrial splicing machinery. Like NGR1, NAM8 contains two RRMs. 	72
409783	cd12347	RRM_PPIE	RNA recognition motif (RRM) found in cyclophilin-33 (Cyp33) and similar proteins. This subfamily corresponds to the RRM of Cyp33, also termed peptidyl-prolyl cis-trans isomerase E (PPIase E), or cyclophilin E, or rotamase E. Cyp33 is a nuclear RNA-binding cyclophilin with an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal PPIase domain. Cyp33 possesses RNA-binding activity and preferentially binds to polyribonucleotide polyA and polyU, but hardly to polyG and polyC. It binds specifically to mRNA, which can stimulate its PPIase activity. Moreover, Cyp33 interacts with the third plant homeodomain (PHD3) zinc finger cassette of the mixed lineage leukemia (MLL) proto-oncoprotein and a poly-A RNA sequence through its RRM domain. It further mediates downregulation of the expression of MLL target genes HOXC8, HOXA9, CDKN1B, and C-MYC, in a proline isomerase-dependent manner. Cyp33 also possesses a PPIase activity that catalyzes cis-trans isomerization of the peptide bond preceding a proline, which has been implicated in the stimulation of folding and conformational changes in folded and unfolded proteins. The PPIase activity can be inhibited by the immunosuppressive drug cyclosporin A. 	75
409784	cd12348	RRM1_SHARP	RNA recognition motif 1 (RRM1) found in SMART/HDAC1-associated repressor protein (SHARP) and similar proteins. This subfamily corresponds to the RRM1 of SHARP, also termed Msx2-interacting protein (MINT), or SPEN homolog, an estrogen-inducible transcriptional repressor that interacts directly with the nuclear receptor corepressor SMRT, histone deacetylases (HDACs) and components of the NuRD complex. SHARP recruits HDAC activity and binds to the steroid receptor RNA coactivator SRA through four conserved N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), further suppressing SRA-potentiated steroid receptor transcription activity. Thus, SHARP has the capacity to modulate both liganded and nonliganded nuclear receptors. SHARP also has been identified as a component of transcriptional repression complexes in Notch/RBP-Jkappa signaling pathways. In addition to the N-terminal RRMs, SHARP possesses a C-terminal SPOC domain (Spen paralog and ortholog C-terminal domain), which is highly conserved among Spen proteins.  	75
409785	cd12349	RRM2_SHARP	RNA recognition motif 2 (RRM2) found in SMART/HDAC1-associated repressor protein (SHARP) and similar proteins. This subfamily corresponds to the RRM2 of SHARP, also termed Msx2-interacting protein (MINT), or SPEN homolog, an estrogen-inducible transcriptional repressor that interacts directly with the nuclear receptor corepressor SMRT, histone deacetylases (HDACs) and components of the NuRD complex. SHARP recruits HDAC activity and binds to the steroid receptor RNA coactivator SRA through four conserved N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), further suppressing SRA-potentiated steroid receptor transcription activity. Thus, SHARP has the capacity to modulate both liganded and nonliganded nuclear receptors. SHARP also has been identified as a component of transcriptional repression complexes in Notch/RBP-Jkappa signaling pathways. In addition to the N-terminal RRMs, SHARP possesses a C-terminal SPOC domain (Spen paralog and ortholog C-terminal domain), which is highly conserved among Spen proteins. 	74
409786	cd12350	RRM3_SHARP	RNA recognition motif 3 (RRM3) found in SMART/HDAC1-associated repressor protein (SHARP) and similar proteins. This subfamily corresponds to the RRM3 of SHARP, also termed Msx2-interacting protein (MINT), or SPEN homolog, an estrogen-inducible transcriptional repressor that interacts directly with the nuclear receptor corepressor SMRT, histone deacetylases (HDACs) and components of the NuRD complex. SHARP recruits HDAC activity and binds to the steroid receptor RNA coactivator SRA through four conserved N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), further suppressing SRA-potentiated steroid receptor transcription activity. Thus, SHARP has the capacity to modulate both liganded and nonliganded nuclear receptors. SHARP also has been identified as a component of transcriptional repression complexes in Notch/RBP-Jkappa signaling pathways. In addition to the N-terminal RRMs, SHARP possesses a C-terminal SPOC domain (Spen paralog and ortholog C-terminal domain), which is highly conserved among Spen proteins.  	74
409787	cd12351	RRM4_SHARP	RNA recognition motif 4 (RRM4) found in SMART/HDAC1-associated repressor protein (SHARP) and similar proteins. This subfamily corresponds to the RRM4 of SHARP, also termed Msx2-interacting protein (MINT), or SPEN homolog, an estrogen-inducible transcriptional repressor that interacts directly with the nuclear receptor corepressor SMRT, histone deacetylases (HDACs) and components of the NuRD complex. SHARP recruits HDAC activity and binds to the steroid receptor RNA coactivator SRA through four conserved N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), further suppressing SRA-potentiated steroid receptor transcription activity. Thus, SHARP has the capacity to modulate both liganded and nonliganded nuclear receptors. SHARP also has been identified as a component of transcriptional repression complexes in Notch/RBP-Jkappa signaling pathways. In addition to the N-terminal RRMs, SHARP possesses a C-terminal SPOC domain (Spen paralog and ortholog C-terminal domain), which is highly conserved among Spen proteins. 	77
409788	cd12352	RRM1_TIA1_like	RNA recognition motif 1 (RRM1) found in granule-associated RNA binding proteins p40-TIA-1 and TIAR. This subfamily corresponds to the RRM1 of nucleolysin TIA-1 isoform p40 (p40-TIA-1 or TIA-1) and nucleolysin TIA-1-related protein (TIAR), both of which are granule-associated RNA binding proteins involved in inducing apoptosis in cytotoxic lymphocyte (CTL) target cells. TIA-1 and TIAR share high sequence similarity. They are expressed in a wide variety of cell types. TIA-1 can be phosphorylated by a serine/threonine kinase that is activated during Fas-mediated apoptosis. TIAR is mainly localized in the nucleus of hematopoietic and nonhematopoietic cells. It is translocated from the nucleus to the cytoplasm in response to exogenous triggers of apoptosis. Both TIA-1 and TIAR bind specifically to poly(A) but not to poly(C) homopolymers. They are composed of three N-terminal highly homologous RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a glutamine-rich C-terminal auxiliary domain containing a lysosome-targeting motif. TIA-1 and TIAR interact with RNAs containing short stretches of uridylates and their RRM2 can mediate the specific binding to uridylate-rich RNAs. The C-terminal auxiliary domain may be responsible for interacting with other proteins. In addition, TIA-1 and TIAR share a potential serine protease-cleavage site (Phe-Val-Arg) localized at the junction between their RNA binding domains and their C-terminal auxiliary domains.	73
409789	cd12353	RRM2_TIA1_like	RNA recognition motif 2 (RRM2) found in granule-associated RNA binding proteins p40-TIA-1 and TIAR. This subfamily corresponds to the RRM2 of nucleolysin TIA-1 isoform p40 (p40-TIA-1 or TIA-1) and nucleolysin TIA-1-related protein (TIAR), both of which are granule-associated RNA binding proteins involved in inducing apoptosis in cytotoxic lymphocyte (CTL) target cells. TIA-1 and TIAR share high sequence similarity. They are expressed in a wide variety of cell types. TIA-1 can be phosphorylated by a serine/threonine kinase that is activated during Fas-mediated apoptosis. TIAR is mainly localized in the nucleus of hematopoietic and nonhematopoietic cells. It is translocated from the nucleus to the cytoplasm in response to exogenous triggers of apoptosis. Both, TIA-1 and TIAR, bind specifically to poly(A) but not to poly(C) homopolymers. They are composed of three N-terminal highly homologous RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a glutamine-rich C-terminal auxiliary domain containing a lysosome-targeting motif. TIA-1 and TIAR interact with RNAs containing short stretches of uridylates and their RRM2 can mediate the specific binding to uridylate-rich RNAs. The C-terminal auxiliary domain may be responsible for interacting with other proteins. In addition, TIA-1 and TIAR share a potential serine protease-cleavage site (Phe-Val-Arg) localized at the junction between their RNA binding domains and their C-terminal auxiliary domains.	75
409790	cd12354	RRM3_TIA1_like	RNA recognition motif 3 (RRM3) found in granule-associated RNA binding proteins (p40-TIA-1 and TIAR), and yeast nuclear and cytoplasmic polyadenylated RNA-binding protein PUB1. This subfamily corresponds to the RRM3 of TIA-1, TIAR, and PUB1. Nucleolysin TIA-1 isoform p40 (p40-TIA-1 or TIA-1) and nucleolysin TIA-1-related protein (TIAR) are granule-associated RNA binding proteins involved in inducing apoptosis in cytotoxic lymphocyte (CTL) target cells. They share high sequence similarity and are expressed in a wide variety of cell types. TIA-1 can be phosphorylated by a serine/threonine kinase that is activated during Fas-mediated apoptosis. TIAR is mainly localized in the nucleus of hematopoietic and nonhematopoietic cells. It is translocated from the nucleus to the cytoplasm in response to exogenous triggers of apoptosis. Both TIA-1 and TIAR bind specifically to poly(A) but not to poly(C) homopolymers. They are composed of three N-terminal highly homologous RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a glutamine-rich C-terminal auxiliary domain containing a lysosome-targeting motif. TIA-1 and TIAR interact with RNAs containing short stretches of uridylates and their RRM2 can mediate the specific binding to uridylate-rich RNAs. The C-terminal auxiliary domain may be responsible for interacting with other proteins. In addition, TIA-1 and TIAR share a potential serine protease-cleavage site (Phe-Val-Arg) localized at the junction between their RNA binding domains and their C-terminal auxiliary domains. This subfamily also includes a yeast nuclear and cytoplasmic polyadenylated RNA-binding protein PUB1, termed ARS consensus-binding protein ACBP-60, or poly uridylate-binding protein, or poly(U)-binding protein, which has been identified as both a heterogeneous nuclear RNA-binding protein (hnRNP) and a cytoplasmic mRNA-binding protein (mRNP). It may be stably bound to a translationally inactive subpopulation of mRNAs within the cytoplasm. PUB1 is distributed in both the nucleus and the cytoplasm, and binds to poly(A)+ RNA (mRNA or pre-mRNA). Although it is one of the major cellular proteins cross-linked by UV light to polyadenylated RNAs in vivo, PUB1 is nonessential for cell growth in yeast. PUB1 also binds to T-rich single stranded DNA (ssDNA); however, there is no strong evidence implicating PUB1 in the mechanism of DNA replication. PUB1 contains three RRMs, and a GAR motif (glycine and arginine rich stretch) that is located between RRM2 and RRM3. 	71
409791	cd12355	RRM_RBM18	RNA recognition motif (RRM) found in eukaryotic RNA-binding protein 18 and similar proteins. This subfamily corresponds to the RRM of RBM18, a putative RNA-binding protein containing a well-conserved RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). The biological role of RBM18 remains unclear. 	80
409792	cd12356	RRM_PPARGC1B	RNA recognition motif (RRM) found in peroxisome proliferator-activated receptor gamma coactivator 1-beta (PGC-1-beta) and similar proteins. This subfamily corresponds to the RRM of PGC-1beta, also termed PPAR-gamma coactivator 1-beta, or PPARGC-1-beta, or PGC-1-related estrogen receptor alpha coactivator, which is one of the members of PGC-1 transcriptional coactivators family, including PGC-1alpha and PGC-1-related coactivator (PRC). PGC-1beta plays a nonredundant role in controlling mitochondrial oxidative energy metabolism and affects both insulin sensitivity and mitochondrial biogenesis, and functions in a number of oxidative tissues. It is involved in maintaining baseline mitochondrial function and cardiac contractile function following pressure overload hypertrophy by preserving glucose metabolism and preventing oxidative stress. PGC-1beta induces hypertriglyceridemia in response to dietary fats through activating hepatic lipogenesis and lipoprotein secretion. It can stimulate apolipoprotein C3 (APOC3) expression, further mediating the hypolipidemic effect of nicotinic acid. PGC-1beta also drives nuclear respiratory factor 1 (NRF-1) target gene expression and NRF-1 and estrogen related receptor alpha (ERRalpha)-dependent mitochondrial biogenesis. The modulation of the expression of PGC-1beta can trigger ERRalpha-induced adipogenesis. PGC-1beta is also a potent regulator inducing angiogenesis in skeletal muscle. The transcriptional activity of PGC-1beta can be increased through binding to host cell factor (HCF), a cellular protein involved in herpes simplex virus (HSV) infection and cell cycle regulation. PGC-1beta is a multi-domain protein containing an N-terminal activation domain, an LXXLL coactivator signature, a tetrapeptide motif (DHDY) responsible for HCF binding, two glutamic/aspartic acid-rich acidic domains, and an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). In contrast to PGC-1alpha, PGC-1beta lacks most of the arginine/serine (SR)-rich domain that is responsible for the regulation of RNA processing. 	97
409793	cd12357	RRM_PPARGC1A_like	RNA recognition motif (RRM) found in the peroxisome proliferator-activated receptor gamma coactivator 1A (PGC-1alpha) family of regulated coactivators. This subfamily corresponds to the RRM of PGC-1alpha, PGC-1beta, and PGC-1-related coactivator (PRC), which serve as mediators between environmental or endogenous signals and the transcriptional machinery governing mitochondrial biogenesis. They play an important integrative role in the control of respiratory gene expression through interacting with a number of transcription factors, such as NRF-1, NRF-2, ERR, CREB and YY1. All family members are multi-domain proteins containing the N-terminal activation domain, an LXXLL coactivator signature, a tetrapeptide motif (DHDY) responsible for HCF binding, and an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). In contrast to PGC-1alpha and PRC, PGC-1beta possesses two glutamic/aspartic acid-rich acidic domains, but lacks most of the arginine/serine (SR)-rich domain that is responsible for the regulation of RNA processing. 	91
240804	cd12358	RRM1_VICKZ	RNA recognition motif 1 (RRM1) found in the VICKZ family proteins. This subfamily corresponds to the RRM1 of IGF2BPs (or IMPs) found in the VICKZ family that have been implicated in the post-transcriptional regulation of several different RNAs and in subcytoplasmic localization of mRNAs during embryogenesis. IGF2BPs are composed of two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and four hnRNP K homology (KH) domains.	73
409794	cd12359	RRM2_VICKZ	RNA recognition motif 2 (RRM2) found in the VICKZ family proteins. This subfamily corresponds to the RRM2 of IGF-II mRNA-binding proteins (IGF2BPs or IMPs) in the VICKZ family that have been implicated in the post-transcriptional regulation of several different RNAs and in subcytoplasmic localization of mRNAs during embryogenesis. IGF2BPs are composed of two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and four hnRNP K homology (KH) domains. 	76
409795	cd12360	RRM_cwf2	RNA recognition motif (RRM) found in yeast pre-mRNA-splicing factor Cwc2 and similar proteins. This subfamily corresponds to the RRM of yeast protein Cwc2, also termed Complexed with CEF1 protein 2, or PRP19-associated complex protein 40 (Ntc40), or synthetic lethal with CLF1 protein 3, one of the components of the Prp19-associated complex [nineteen complex (NTC)] that can bind to RNA. NTC is composed of the scaffold protein Prp19 and a number of associated splicing factors, and plays a crucial role in intron removal during premature mRNA splicing in eukaryotes. Cwc2 functions as an RNA-binding protein that can bind both small nuclear RNAs (snRNAs) and pre-mRNA in vitro. It interacts directly with the U6 snRNA to link the NTC to the spliceosome during pre-mRNA splicing. In the N-terminal half, Cwc2 contains a CCCH-type zinc finger (ZnF domain), a RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and an intervening loop, also termed RNA-binding loop or RB loop, between ZnF and RRM, all of which are necessary and sufficient for RNA binding. The ZnF is also responsible for mediating protein-protein interaction. The C-terminal flexible region of Cwc2 interacts with the WD40 domain of Prp19.	79
409796	cd12361	RRM1_2_CELF1-6_like	RNA recognition motif 1 (RRM1) and 2 (RRM2) found in CELF/Bruno-like family of RNA binding proteins and plant flowering time control protein FCA. This subfamily corresponds to the RRM1 and RRM2 domains of the CUGBP1 and ETR-3-like factors (CELF) as well as plant flowering time control protein FCA. CELF, also termed BRUNOL (Bruno-like) proteins, is a family of structurally related RNA-binding proteins involved in regulation of pre-mRNA splicing in the nucleus, and control of mRNA translation and deadenylation in the cytoplasm. The family contains six members: CELF-1 (also known as BRUNOL-2, CUG-BP1, NAPOR, EDEN-BP), CELF-2 (also known as BRUNOL-3, ETR-3, CUG-BP2, NAPOR-2), CELF-3 (also known as BRUNOL-1, TNRC4, ETR-1, CAGH4, ER DA4), CELF-4 (BRUNOL-4), CELF-5 (BRUNOL-5) and CELF-6 (BRUNOL-6). They all contain three highly conserved RNA recognition motifs (RRMs), also known as RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains): two consecutive RRMs (RRM1 and RRM2) situated in the N-terminal region followed by a linker region and the third RRM (RRM3) close to the C-terminus of the protein. The low sequence conservation of the linker region is highly suggestive of a large variety in the co-factors that associate with the various CELF family members. Based on both, sequence similarity and function, the CELF family can be divided into two subfamilies, the first containing CELFs 1 and 2, and the second containing CELFs 3, 4, 5, and 6. The different CELF proteins may act through different sites on at least some substrates. Furthermore, CELF proteins may interact with each other in varying combinations to influence alternative splicing in different contexts. This subfamily also includes plant flowering time control protein FCA that functions in the posttranscriptional regulation of transcripts involved in the flowering process. FCA contains two RRMs, and a WW protein interaction domain.  	77
409797	cd12362	RRM3_CELF1-6	RNA recognition motif 3 (RRM3) found in CELF/Bruno-like family of RNA binding proteins CELF1, CELF2, CELF3, CELF4, CELF5, CELF6 and similar proteins. This subgroup corresponds to the RRM3 of the CUGBP1 and ETR-3-like factors (CELF) or BRUNOL (Bruno-like) proteins, a family of structurally related RNA-binding proteins involved in the regulation of pre-mRNA splicing in the nucleus and in the control of mRNA translation and deadenylation in the cytoplasm. The family contains six members: CELF-1 (also termed BRUNOL-2, or CUG-BP1, or NAPOR, or EDEN-BP), CELF-2 (also termed BRUNOL-3, or ETR-3, or CUG-BP2, or NAPOR-2), CELF-3 (also termed BRUNOL-1, or TNRC4, or ETR-1, or CAGH4, or ER DA4), CELF-4 (also termed BRUNOL-4), CELF-5 (also termed BRUNOL-5), CELF-6 (also termed BRUNOL-6). They all contain three highly conserved RNA recognition motifs (RRMs), also known as RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains): two consecutive RRMs (RRM1 and RRM2) situated in the N-terminal region followed by a linker region and the third RRM (RRM3) close to the C-terminus of the protein. The low sequence conservation of the linker region is highly suggestive of a large variety in the co-factors that associate with the various CELF family members. Based on both sequence similarity and function, the CELF family can be divided into two subfamilies, the first containing CELFs 1 and 2, and the second containing CELFs 3, 4, 5, and 6. The different CELF proteins may act through different sites on at least some substrates. Furthermore, CELF proteins may interact with each other in varying combinations to influence alternative splicing in different contexts. 	73
409798	cd12363	RRM_TRA2	RNA recognition motif (RRM) found in transformer-2 protein homolog TRA2-alpha, TRA2-beta and similar proteins. This subfamily corresponds to the RRM of two mammalian homologs of Drosophila transformer-2 (Tra2), TRA2-alpha, TRA2-beta (also termed SFRS10), and similar proteins found in eukaryotes. TRA2-alpha is a 40-kDa serine/arginine-rich (SR) protein that specifically binds to gonadotropin-releasing hormone (GnRH) exonic splicing enhancer on exon 4 (ESE4) and is necessary for enhanced GnRH pre-mRNA splicing. It strongly stimulates GnRH intron A excision in a dose-dependent manner. In addition, TRA2-alpha can interact with either 9G8 or SRp30c, which may also be crucial for ESE-dependent GnRH pre-mRNA splicing. TRA2-beta is a serine/arginine-rich (SR) protein that controls the pre-mRNA alternative splicing of the calcitonin/calcitonin gene-related peptide (CGRP), the survival motor neuron 1 (SMN1) protein and the tau protein. Both TRA2-alpha and TRA2-beta contain a well-conserved RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), flanked by the N- and C-terminal arginine/serine (RS)-rich regions. 	80
409799	cd12364	RRM_RDM1	RNA recognition motif (RRM) found in RAD52 motif-containing protein 1 (RDM1) and similar proteins. This subfamily corresponds to the RRM of RDM1, also termed RAD52 homolog B, a novel factor involved in the cellular response to the anti-cancer drug cisplatin in vertebrates. RDM1 contains a small RD motif that it shares with the recombination and repair protein RAD52, and an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). The RD motif is responsible for the acidic pH-dependent DNA-binding properties of RDM1. It interacts with ss- and dsDNA, and may act as a DNA-damage recognition factor by recognizing the distortions of the double helix caused by cisplatin-DNA adducts in vitro. In addition, due to the presence of RRM, RDM1 can bind to RNA as well as DNA. 	81
409800	cd12365	RRM_RNPS1	RNA recognition motif (RRM) found in RNA-binding protein with serine-rich domain 1 (RNPS1) and similar proteins. This subfamily corresponds to the RRM of RNPS1 and its eukaryotic homologs. RNPS1, also termed RNA-binding protein prevalent during the S phase, or SR-related protein LDC2, was originally characterized as a general pre-mRNA splicing activator, which activates both constitutive and alternative splicing of pre-mRNA in vitro. It has been identified as a protein component of the splicing-dependent mRNP complex, or exon-exon junction complex (EJC), and is directly involved in mRNA surveillance. Furthermore, RNPS1 is a splicing regulator whose activator function is controlled in part by CK2 (casein kinase II) protein kinase phosphorylation. It can also function as a squamous-cell carcinoma antigen recognized by T cells-3 (SART3)-binding protein, and is involved in the regulation of mRNA splicing. RNPS1 contains an N-terminal serine-rich (S) domain, a central RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and the C-terminal arginine/serine/proline-rich (RS/P) domain. 	73
409801	cd12366	RRM1_RBM45	RNA recognition motif 1 (RRM1) found in RNA-binding protein 45 (RBM45) and similar proteins. This subfamily corresponds to the RRM1 of RBM45, also termed developmentally-regulated RNA-binding protein 1 (DRB1), a new member of RNA recognition motif (RRM)-type neural RNA-binding proteins, which expresses under spatiotemporal control. It is encoded by gene drb1 that is expressed in neurons, not in glial cells. RBM45 predominantly localizes in cytoplasm of cultured cells and specifically binds to poly(C) RNA. It could play an important role during neurogenesis. RBM45 carries four RRMs, also known as RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 	81
409802	cd12367	RRM2_RBM45	RNA recognition motif 2 (RRM2) found in RNA-binding protein 45 (RBM45) and similar proteins. This subfamily corresponds to the RRM2 of RBM45, also termed developmentally-regulated RNA-binding protein 1 (DRB1), a new member of RNA recognition motif (RRM)-type neural RNA-binding proteins, which expresses under spatiotemporal control. It is encoded by gene drb1 that is expressed in neurons, not in glial cells. RBM45 predominantly localizes in cytoplasm of cultured cells and specifically binds to poly(C) RNA. It could play an important role during neurogenesis. RBM45 carries four RRMs, also known as RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 	74
409803	cd12368	RRM3_RBM45	RNA recognition motif 3 (RRM3) found in RNA-binding protein 45 (RBM45) and similar proteins. This subfamily corresponds to the RRM3 of RBM45, also termed developmentally-regulated RNA-binding protein 1 (DRB1), a new member of RNA recognition motif (RRM)-type neural RNA-binding proteins, which expresses under spatiotemporal control. It is encoded by gene drb1 that is expressed in neurons, not in glial cells. RBM45 predominantly localizes in cytoplasm of cultured cells and specifically binds to poly(C) RNA. It could play an important role during neurogenesis. RBM45 carries four RRMs, also known as RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 	75
409804	cd12369	RRM4_RBM45	RNA recognition motif 4 (RRM4) found in RNA-binding protein 45 (RBM45) and similar proteins. This subfamily corresponds to the RRM4 of RBM45, also termed developmentally-regulated RNA-binding protein 1 (DRB1), a new member of RNA recognition motif (RRM)-type neural RNA-binding proteins, which expresses under spatiotemporal control. It is encoded by gene drb1 that is expressed in neurons, not in glial cells. RBM45 predominantly localizes in cytoplasm of cultured cells and specifically binds to poly(C) RNA. It could play an important role during neurogenesis. RBM45 carries four RRMs, also known as RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 	68
409805	cd12370	RRM1_PUF60	RNA recognition motif 1 (RRM1) found in (U)-binding-splicing factor PUF60 and similar proteins. This subfamily corresponds to the RRM1 of PUF60, also termed FUSE-binding protein-interacting repressor (FBP-interacting repressor or FIR), or Ro-binding protein 1 (RoBP1), or Siah-binding protein 1 (Siah-BP1). PUF60 is an essential splicing factor that functions as a poly-U RNA-binding protein required to reconstitute splicing in depleted nuclear extracts. Its function is enhanced through interaction with U2 auxiliary factor U2AF65. PUF60 also controls human c-myc gene expression by binding and inhibiting the transcription factor far upstream sequence element (FUSE)-binding-protein (FBP), an activator of c-myc promoters. PUF60 contains two central RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a C-terminal U2AF (U2 auxiliary factor) homology motifs (UHM) that harbors another RRM and binds to tryptophan-containing linear peptide motifs (UHM ligand motifs, ULMs) in several nuclear proteins. Research indicates that PUF60 binds FUSE as a dimer, and only the first two RRM domains participate in the single-stranded DNA recognition. 	76
409806	cd12371	RRM2_PUF60	RNA recognition motif 2 (RRM2) found in (U)-binding-splicing factor PUF60 and similar proteins. This subfamily corresponds to the RRM2 of PUF60, also termed FUSE-binding protein-interacting repressor (FBP-interacting repressor or FIR), or Ro-binding protein 1 (RoBP1), or Siah-binding protein 1 (Siah-BP1). PUF60 is an essential splicing factor that functions as a poly-U RNA-binding protein required to reconstitute splicing in depleted nuclear extracts. Its function is enhanced through interaction with U2 auxiliary factor U2AF65. PUF60 also controls human c-myc gene expression by binding and inhibiting the transcription factor far upstream sequence element (FUSE)-binding-protein (FBP), an activator of c-myc promoters. PUF60 contains two central RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a C-terminal U2AF (U2 auxiliary factor) homology motifs (UHM) that harbors another RRM and binds to tryptophan-containing linear peptide motifs (UHM ligand motifs, ULMs) in several nuclear proteins. Research indicates that PUF60 binds FUSE as a dimer, and only the first two RRM domains participate in the single-stranded DNA recognition. 	77
409807	cd12372	RRM_CFIm68_CFIm59	RNA recognition motif (RRM) found in pre-mRNA cleavage factor Im 68 kDa subunit (CFIm68 or CPSF6), pre-mRNA cleavage factor Im 59 kDa subunit (CFIm59 or CPSF7), and similar proteins. This subfamily corresponds to the RRM of cleavage factor Im (CFIm) subunits. Cleavage factor Im (CFIm) is a highly conserved component of the eukaryotic mRNA 3' processing machinery that functions in UGUA-mediated poly(A) site recognition, the regulation of alternative poly(A) site selection, mRNA export, and mRNA splicing. It is a complex composed of a small 25 kDa (CFIm25) subunit and a larger 59/68/72 kDa subunit. Two separate genes, CPSF6 and CPSF7, code for two isoforms of the large subunit, CFIm68 and CFIm59. Structurally related CFIm68 and CFIm59, also termed cleavage and polyadenylation specificity factor subunit 6 (CPSF7), or cleavage and polyadenylation specificity factor 59 kDa subunit (CPSF59), are functionally redundant. Both contain an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), a central proline-rich region, and a C-terminal RS-like domain. Their N-terminal RRM mediates the interaction with CFIm25, and also serves to enhance RNA binding and facilitate RNA looping. 	76
409808	cd12373	RRM_SRSF3_like	RNA recognition motif (RRM) found in serine/arginine-rich splicing factor 3 (SRSF3) and similar proteins. This subfamily corresponds to the RRM of two serine/arginine (SR) proteins, serine/arginine-rich splicing factor 3 (SRSF3) and serine/arginine-rich splicing factor 7 (SRSF7). SRSF3, also termed pre-mRNA-splicing factor SRp20, modulates alternative splicing by interacting with RNA cis-elements in a concentration- and cell differentiation-dependent manner. It is also involved in termination of transcription, alternative RNA polyadenylation, RNA export, and protein translation. SRSF3 is critical for cell proliferation, and tumor induction and maintenance. It can shuttle between the nucleus and cytoplasm. SRSF7, also termed splicing factor 9G8, plays a crucial role in both constitutive splicing and alternative splicing of many pre-mRNAs. Its localization and functions are tightly regulated by phosphorylation. SRSF7 is predominantly present in the nucleus and can shuttle between nucleus and cytoplasm. It cooperates with the export protein, Tap/NXF1, helps mRNA export to the cytoplasm, and enhances the expression of unspliced mRNA. Moreover, SRSF7 inhibits tau E10 inclusion through directly interacting with the proximal downstream intron of E10, a clustering region for frontotemporal dementia with Parkinsonism (FTDP) mutations. Both SRSF3 and SRSF7 contain a single N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal RS domain rich in serine-arginine dipeptides. The RRM domain is involved in RNA binding, and the RS domain has been implicated in protein shuttling and protein-protein interactions. 	73
409809	cd12374	RRM_UHM_SPF45_PUF60	RNA recognition motif (RRM) found in UHM domain of 45 kDa-splicing factor (SPF45) and similar proteins. This subfamily corresponds to the RRM found in UHM domain of 45 kDa-splicing factor (SPF45 or RBM17), poly(U)-binding-splicing factor PUF60 (FIR or Hfp or RoBP1 or Siah-BP1), and similar proteins. SPF45 is an RNA-binding protein consisting of an unstructured N-terminal region, followed by a G-patch motif and a C-terminal U2AF (U2 auxiliary factor) homology motif (UHM) that harbors an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and an Arg-Xaa-Phe sequence motif. SPF45 regulates alternative splicing of the apoptosis regulatory gene FAS (also known as CD95). It induces exon 6 skipping in FAS pre-mRNA through the UHM domain that binds to tryptophan-containing linear peptide motifs (UHM ligand motifs, ULMs) present in the 3' splice site-recognizing factors U2AF65, SF1 and SF3b155. PUF60 is an essential splicing factor that functions as a poly-U RNA-binding protein required to reconstitute splicing in depleted nuclear extracts. Its function is enhanced through interaction with U2 auxiliary factor U2AF65. PUF60 also controls human c-myc gene expression by binding and inhibiting the transcription factor far upstream sequence element (FUSE)-binding-protein (FBP), an activator of c-myc promoters. PUF60 contains two central RRMs and a C-terminal UHM domain. 	85
409810	cd12375	RRM1_Hu_like	RNA recognition motif 1 (RRM1) found in the Hu proteins family, Drosophila sex-lethal (SXL), and similar proteins. This subfamily corresponds to the RRM1 of Hu proteins and SXL. The Hu proteins family represents a group of RNA-binding proteins involved in diverse biological processes. Since the Hu proteins share high homology with the Drosophila embryonic lethal abnormal vision (ELAV) protein, the Hu family is sometimes referred to as the ELAV family. Drosophila ELAV is exclusively expressed in neurons and is required for the correct differentiation and survival of neurons in flies. The neuronal members of the Hu family include Hu-antigen B (HuB or ELAV-2 or Hel-N1), Hu-antigen C (HuC or ELAV-3 or PLE21), and Hu-antigen D (HuD or ELAV-4), which play important roles in neuronal differentiation, plasticity and memory. HuB is also expressed in gonads. Hu-antigen R (HuR or ELAV-1 or HuA) is the ubiquitously expressed Hu family member. It has a variety of biological functions mostly related to the regulation of cellular response to DNA damage and other types of stress. Hu proteins perform their cytoplasmic and nuclear molecular functions by coordinately regulating functionally related mRNAs. In the cytoplasm, Hu proteins recognize and bind to AU-rich RNA elements (AREs) in the 3' untranslated regions (UTRs) of certain target mRNAs, such as GAP-43, vascular epithelial growth factor (VEGF), the glucose transporter GLUT1, eotaxin and c-fos, and stabilize those ARE-containing mRNAs. They also bind and regulate the translation of some target mRNAs, such as neurofilament M, GLUT1, and p27. In the nucleus, Hu proteins function as regulators of polyadenylation and alternative splicing. Each Hu protein contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). RRM1 and RRM2 may cooperate in binding to an ARE. RRM3 may help to maintain the stability of the RNA-protein complex, and might also bind to poly(A) tails or be involved in protein-protein interactions. This family also includes the sex-lethal protein (SXL) from Drosophila melanogaster. SXL governs sexual differentiation and X chromosome dosage compensation in flies. It induces female-specific alternative splicing of the transformer (tra) pre-mRNA by binding to the tra uridine-rich polypyrimidine tract at the non-sex-specific 3' splice site during the sex-determination process. SXL binds to its own pre-mRNA and promotes female-specific alternative splicing. It contains an N-terminal Gly/Asn-rich domain that may be responsible for the protein-protein interaction, and tandem RRMs that show high preference to bind single-stranded, uridine-rich target RNA transcripts. 	76
240822	cd12376	RRM2_Hu_like	RNA recognition motif 2 (RRM2) found in the Hu proteins family, Drosophila sex-lethal (SXL), and similar proteins. This subfamily corresponds to the RRM2 of Hu proteins and SXL. The Hu proteins family represents a group of RNA-binding proteins involved in diverse biological processes. Since the Hu proteins share high homology with the Drosophila embryonic lethal abnormal vision (ELAV) protein, the Hu family is sometimes referred to as the ELAV family. Drosophila ELAV is exclusively expressed in neurons and is required for the correct differentiation and survival of neurons in flies. The neuronal members of the Hu family include Hu-antigen B (HuB or ELAV-2 or Hel-N1), Hu-antigen C (HuC or ELAV-3 or PLE21), and Hu-antigen D (HuD or ELAV-4), which play important roles in neuronal differentiation, plasticity and memory. HuB is also expressed in gonads. Hu-antigen R (HuR or ELAV-1 or HuA) is the ubiquitously expressed Hu family member. It has a variety of biological functions mostly related to the regulation of cellular response to DNA damage and other types of stress. Hu proteins perform their cytoplasmic and nuclear molecular functions by coordinately regulating functionally related mRNAs. In the cytoplasm, Hu proteins recognize and bind to AU-rich RNA elements (AREs) in the 3' untranslated regions (UTRs) of certain target mRNAs, such as GAP-43, vascular epithelial growth factor (VEGF), the glucose transporter GLUT1, eotaxin and c-fos, and stabilize those ARE-containing mRNAs. They also bind and regulate the translation of some target mRNAs, such as neurofilament M, GLUT1, and p27. In the nucleus, Hu proteins function as regulators of polyadenylation and alternative splicing. Each Hu protein contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). RRM1 and RRM2 may cooperate in binding to an ARE. RRM3 may help to maintain the stability of the RNA-protein complex, and might also bind to poly(A) tails or be involved in protein-protein interactions. Also included in this subfamily is the sex-lethal protein (SXL) from Drosophila melanogaster. SXL governs sexual differentiation and X chromosome dosage compensation in flies. It induces female-specific alternative splicing of the transformer (tra) pre-mRNA by binding to the tra uridine-rich polypyrimidine tract at the non-sex-specific 3' splice site during the sex-determination process. SXL binds also to its own pre-mRNA and promotes female-specific alternative splicing. SXL contains an N-terminal Gly/Asn-rich domain that may be responsible for the protein-protein interaction, and tandem RRMs that show high preference to bind single-stranded, uridine-rich target RNA transcripts. 	79
409811	cd12377	RRM3_Hu	RNA recognition motif 3 (RRM3) found in the Hu proteins family. This subfamily corresponds to the RRM3 of the Hu proteins family which represent a group of RNA-binding proteins involved in diverse biological processes. Since the Hu proteins share high homology with the Drosophila embryonic lethal abnormal vision (ELAV) protein, the Hu family is sometimes referred to as the ELAV family. Drosophila ELAV is exclusively expressed in neurons and is required for the correct differentiation and survival of neurons in flies. The neuronal members of the Hu family include Hu-antigen B (HuB or ELAV-2 or Hel-N1), Hu-antigen C (HuC or ELAV-3 or PLE21), and Hu-antigen D (HuD or ELAV-4), which play important roles in neuronal differentiation, plasticity and memory. HuB is also expressed in gonads. Hu-antigen R (HuR or ELAV-1 or HuA) is the ubiquitously expressed Hu family member. It has a variety of biological functions mostly related to the regulation of cellular response to DNA damage and other types of stress. Hu proteins perform their cytoplasmic and nuclear molecular functions by coordinately regulating functionally related mRNAs. In the cytoplasm, Hu proteins recognize and bind to AU-rich RNA elements (AREs) in the 3' untranslated regions (UTRs) of certain target mRNAs, such as GAP-43, vascular epithelial growth factor (VEGF), the glucose transporter GLUT1, eotaxin and c-fos, and stabilize those ARE-containing mRNAs. They also bind and regulate the translation of some target mRNAs, such as neurofilament M, GLUT1, and p27. In the nucleus, Hu proteins function as regulators of polyadenylation and alternative splicing. Each Hu protein contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). RRM1 and RRM2 may cooperate in binding to an ARE. RRM3 may help to maintain the stability of the RNA-protein complex, and might also bind to poly(A) tails or be involved in protein-protein interactions. 	76
409812	cd12378	RRM1_I_PABPs	RNA recognition motif 1 (RRM1) found in type I polyadenylate-binding proteins. This subfamily corresponds to the RRM1 of type I poly(A)-binding proteins (PABPs), highly conserved proteins that bind to the poly(A) tail present at the 3' ends of most eukaryotic mRNAs. They have been implicated in the regulation of poly(A) tail length during the polyadenylation reaction, translation initiation, mRNA stabilization by influencing the rate of deadenylation and inhibition of mRNA decapping. The family represents type I polyadenylate-binding proteins (PABPs), including polyadenylate-binding protein 1 (PABP-1 or PABPC1), polyadenylate-binding protein 3 (PABP-3 or PABPC3), polyadenylate-binding protein 4 (PABP-4 or APP-1 or iPABP), polyadenylate-binding protein 5 (PABP-5 or PABPC5), polyadenylate-binding protein 1-like (PABP-1-like or PABPC1L), polyadenylate-binding protein 1-like 2 (PABPC1L2 or RBM32), polyadenylate-binding protein 4-like (PABP-4-like or PABPC4L), yeast polyadenylate-binding protein, cytoplasmic and nuclear (PABP or ACBP-67), and similar proteins. PABP-1 is a ubiquitously expressed multifunctional protein that may play a role in 3' end formation of mRNA, translation initiation, mRNA stabilization, protection of poly(A) from nuclease activity, mRNA deadenylation, inhibition of mRNA decapping, and mRNP maturation. Although PABP-1 is thought to be a cytoplasmic protein, it is also found in the nucleus. PABP-1 may be involved in nucleocytoplasmic trafficking and utilization of mRNP particles. PABP-1 contains four copies of RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), a less well conserved linker region, and a proline-rich C-terminal conserved domain (CTD). PABP-3 is a testis-specific poly(A)-binding protein specifically expressed in round spermatids. It is mainly found in mammals and may play an important role in the testis-specific regulation of mRNA homeostasis. PABP-3 shows significant sequence similarity to PABP-1. However, it binds to poly(A) with a lower affinity than PABP-1. Moreover, PABP-1 possesses an A-rich sequence in its 5'-UTR and allows binding of PABP and blockage of translation of its own mRNA. In contrast, PABP-3 lacks the A-rich sequence in its 5'-UTR. PABP-4 is an inducible poly(A)-binding protein (iPABP) that is primarily localized to the cytoplasm. It shows significant sequence similarity to PABP-1 as well. The RNA binding properties of PABP-1 and PABP-4 appear to be identical. PABP-5 is encoded by PABPC5 gene within the X-specific subinterval, and expressed in fetal brain and in a range of adult tissues in mammals, such as ovary and testis. It may play an important role in germ cell development. Moreover, unlike other PABPs, PABP-5 contains only four RRMs, but lacks both the linker region and the CTD. PABP-1-like and PABP-1-like 2 are the orthologs of PABP-1. PABP-4-like is the ortholog of PABP-5. Their cellular functions remain unclear. The family also includes yeast PABP, a conserved poly(A) binding protein containing poly(A) tails that can be attached to the 3'-ends of mRNAs. The yeast PABP and its homologs may play important roles in the initiation of translation and in mRNA decay. Like vertebrate PABP-1, the yeast PABP contains four RRMs, a linker region, and a proline-rich CTD as well. The first two RRMs are mainly responsible for specific binding to poly(A). The proline-rich region may be involved in protein-protein interactions. 	80
409813	cd12379	RRM2_I_PABPs	RNA recognition motif 2 (RRM2) found in type I polyadenylate-binding proteins. This subfamily corresponds to the RRM2 of type I poly(A)-binding proteins (PABPs), highly conserved proteins that bind to the poly(A) tail present at the 3' ends of most eukaryotic mRNAs. They have been implicated in the regulation of poly(A) tail length during the polyadenylation reaction, translation initiation, mRNA stabilization by influencing the rate of deadenylation and inhibition of mRNA decapping. The family represents type I polyadenylate-binding proteins (PABPs), including polyadenylate-binding protein 1 (PABP-1 or PABPC1), polyadenylate-binding protein 3 (PABP-3 or PABPC3), polyadenylate-binding protein 4 (PABP-4 or APP-1 or iPABP), polyadenylate-binding protein 5 (PABP-5 or PABPC5), polyadenylate-binding protein 1-like (PABP-1-like or PABPC1L), polyadenylate-binding protein 1-like 2 (PABPC1L2 or RBM32), polyadenylate-binding protein 4-like (PABP-4-like or PABPC4L), yeast polyadenylate-binding protein, cytoplasmic and nuclear (PABP or ACBP-67), and similar proteins. PABP-1 is a ubiquitously expressed multifunctional protein that may play a role in 3' end formation of mRNA, translation initiation, mRNA stabilization, protection of poly(A) from nuclease activity, mRNA deadenylation, inhibition of mRNA decapping, and mRNP maturation. Although PABP-1 is thought to be a cytoplasmic protein, it is also found in the nucleus. PABP-1 may be involved in nucleocytoplasmic trafficking and utilization of mRNP particles. PABP-1 contains four copies of RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), a less well conserved linker region, and a proline-rich C-terminal conserved domain (CTD). PABP-3 is a testis-specific poly(A)-binding protein specifically expressed in round spermatids. It is mainly found in mammals and may play an important role in the testis-specific regulation of mRNA homeostasis. PABP-3 shows significant sequence similarity to PABP-1. However, it binds to poly(A) with a lower affinity than PABP-1. Moreover, PABP-1 possesses an A-rich sequence in its 5'-UTR and allows binding of PABP and blockage of translation of its own mRNA. In contrast, PABP-3 lacks the A-rich sequence in its 5'-UTR. PABP-4 is an inducible poly(A)-binding protein (iPABP) that is primarily localized to the cytoplasm. It shows significant sequence similarity to PABP-1 as well. The RNA binding properties of PABP-1 and PABP-4 appear to be identical. PABP-5 is encoded by PABPC5 gene within the X-specific subinterval, and expressed in fetal brain and in a range of adult tissues in mammals, such as ovary and testis. It may play an important role in germ cell development. Unlike other PABPs, PABP-5 contains only four RRMs, but lacks both the linker region and the CTD. PABP-1-like and PABP-1-like 2 are the orthologs of PABP-1. PABP-4-like is the ortholog of PABP-5. Their cellular functions remain unclear. The family also includes the yeast PABP, a conserved poly(A) binding protein containing poly(A) tails that can be attached to the 3'-ends of mRNAs. The yeast PABP and its homologs may play important roles in the initiation of translation and in mRNA decay. Like vertebrate PABP-1, the yeast PABP contains four RRMs, a linker region, and a proline-rich CTD as well. The first two RRMs are mainly responsible for specific binding to poly(A). The proline-rich region may be involved in protein-protein interactions. 	77
409814	cd12380	RRM3_I_PABPs	RNA recognition motif 3 (RRM3) found in type I polyadenylate-binding proteins. This subfamily corresponds to the RRM3 of type I poly(A)-binding proteins (PABPs), highly conserved proteins that bind to the poly(A) tail present at the 3' ends of most eukaryotic mRNAs. They have been implicated in the regulation of poly(A) tail length during the polyadenylation reaction, translation initiation, mRNA stabilization by influencing the rate of deadenylation and inhibition of mRNA decapping. The family represents type I polyadenylate-binding proteins (PABPs), including polyadenylate-binding protein 1 (PABP-1 or PABPC1), polyadenylate-binding protein 3 (PABP-3 or PABPC3), polyadenylate-binding protein 4 (PABP-4 or APP-1 or iPABP), polyadenylate-binding protein 5 (PABP-5 or PABPC5), polyadenylate-binding protein 1-like (PABP-1-like or PABPC1L), polyadenylate-binding protein 1-like 2 (PABPC1L2 or RBM32), polyadenylate-binding protein 4-like (PABP-4-like or PABPC4L), yeast polyadenylate-binding protein, cytoplasmic and nuclear (PABP or ACBP-67), and similar proteins. PABP-1 is a ubiquitously expressed multifunctional protein that may play a role in 3' end formation of mRNA, translation initiation, mRNA stabilization, protection of poly(A) from nuclease activity, mRNA deadenylation, inhibition of mRNA decapping, and mRNP maturation. Although PABP-1 is thought to be a cytoplasmic protein, it is also found in the nucleus. PABP-1 may be involved in nucleocytoplasmic trafficking and utilization of mRNP particles. PABP-1 contains four copies of RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), a less well conserved linker region, and a proline-rich C-terminal conserved domain (CTD). PABP-3 is a testis-specific poly(A)-binding protein specifically expressed in round spermatids. It is mainly found in mammals and may play an important role in the testis-specific regulation of mRNA homeostasis. PABP-3 shows significant sequence similarity to PABP-1. However, it binds to poly(A) with a lower affinity than PABP-1. PABP-1 possesses an A-rich sequence in its 5'-UTR and allows binding of PABP and blockage of translation of its own mRNA. In contrast, PABP-3 lacks the A-rich sequence in its 5'-UTR. PABP-4 is an inducible poly(A)-binding protein (iPABP) that is primarily localized to the cytoplasm. It shows significant sequence similarity to PABP-1 as well. The RNA binding properties of PABP-1 and PABP-4 appear to be identical. PABP-5 is encoded by PABPC5 gene within the X-specific subinterval, and expressed in fetal brain and in a range of adult tissues in mammals, such as ovary and testis. It may play an important role in germ cell development. Moreover, unlike other PABPs, PABP-5 contains only four RRMs, but lacks both the linker region and the CTD. PABP-1-like and PABP-1-like 2 are the orthologs of PABP-1. PABP-4-like is the ortholog of PABP-5. Their cellular functions remain unclear. The family also includes the yeast PABP, a conserved poly(A) binding protein containing poly(A) tails that can be attached to the 3'-ends of mRNAs. The yeast PABP and its homologs may play important roles in the initiation of translation and in mRNA decay. Like vertebrate PABP-1, the yeast PABP contains four RRMs, a linker region, and a proline-rich CTD as well. The first two RRMs are mainly responsible for specific binding to poly(A). The proline-rich region may be involved in protein-protein interactions. 	80
409815	cd12381	RRM4_I_PABPs	RNA recognition motif 4 (RRM4) found in type I polyadenylate-binding proteins. This subfamily corresponds to the RRM4 of type I poly(A)-binding proteins (PABPs), highly conserved proteins that bind to the poly(A) tail present at the 3' ends of most eukaryotic mRNAs. They have been implicated in the regulation of poly(A) tail length during the polyadenylation reaction, translation initiation, mRNA stabilization by influencing the rate of deadenylation and inhibition of mRNA decapping. The family represents type I polyadenylate-binding proteins (PABPs), including polyadenylate-binding protein 1 (PABP-1 or PABPC1), polyadenylate-binding protein 3 (PABP-3 or PABPC3), polyadenylate-binding protein 4 (PABP-4 or APP-1 or iPABP), polyadenylate-binding protein 5 (PABP-5 or PABPC5), polyadenylate-binding protein 1-like (PABP-1-like or PABPC1L), polyadenylate-binding protein 1-like 2 (PABPC1L2 or RBM32), polyadenylate-binding protein 4-like (PABP-4-like or PABPC4L), yeast polyadenylate-binding protein, cytoplasmic and nuclear (PABP or ACBP-67), and similar proteins. PABP-1 is a ubiquitously expressed multifunctional protein that may play a role in 3' end formation of mRNA, translation initiation, mRNA stabilization, protection of poly(A) from nuclease activity, mRNA deadenylation, inhibition of mRNA decapping, and mRNP maturation. Although PABP-1 is thought to be a cytoplasmic protein, it is also found in the nucleus. PABP-1 may be involved in nucleocytoplasmic trafficking and utilization of mRNP particles. PABP-1 contains four copies of RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), a less well conserved linker region, and a proline-rich C-terminal conserved domain (CTD). PABP-3 is a testis-specific poly(A)-binding protein specifically expressed in round spermatids. It is mainly found in mammals and may play an important role in the testis-specific regulation of mRNA homeostasis. PABP-3 shows significant sequence similarity to PABP-1. However, it binds to poly(A) with a lower affinity than PABP-1. Moreover, PABP-1 possesses an A-rich sequence in its 5'-UTR and allows binding of PABP and blockage of translation of its own mRNA. In contrast, PABP-3 lacks the A-rich sequence in its 5'-UTR. PABP-4 is an inducible poly(A)-binding protein (iPABP) that is primarily localized to the cytoplasm. It shows significant sequence similarity to PABP-1 as well. The RNA binding properties of PABP-1 and PABP-4 appear to be identical. PABP-5 is encoded by PABPC5 gene within the X-specific subinterval, and expressed in fetal brain and in a range of adult tissues in mammals, such as ovary and testis. It may play an important role in germ cell development. Moreover, unlike other PABPs, PABP-5 contains only four RRMs, but lacks both the linker region and the CTD. PABP-1-like and PABP-1-like 2 are the orthologs of PABP-1. PABP-4-like is the ortholog of PABP-5. Their cellular functions remain unclear. The family also includes the yeast PABP, a conserved poly(A) binding protein containing poly(A) tails that can be attached to the 3'-ends of mRNAs. The yeast PABP and its homologs may play important roles in the initiation of translation and in mRNA decay. Like vertebrate PABP-1, the yeast PABP contains four RRMs, a linker region, and a proline-rich CTD as well. The first two RRMs are mainly responsible for specific binding to poly(A). The proline-rich region may be involved in protein-protein interactions. 	79
409816	cd12382	RRM_RBMX_like	RNA recognition motif (RRM) found in heterogeneous nuclear ribonucleoprotein G (hnRNP G), Y chromosome RNA recognition motif 1 (hRBMY), testis-specific heterogeneous nuclear ribonucleoprotein G-T (hnRNP G-T) and similar proteins. This subfamily corresponds to the RRM domain of hnRNP G, also termed glycoprotein p43 or RBMX, an RNA-binding motif protein located on the X chromosome. It is expressed ubiquitously and has been implicated in the splicing control of several pre-mRNAs. Moreover, hnRNP G may function as a regulator of transcription for SREBP-1c and GnRH1. Research has shown that hnRNP G may also act as a tumor-suppressor since it upregulates the Txnip gene and promotes the fidelity of DNA end-joining activity. In addition, hnRNP G appears to play a critical role in proper neural development of zebrafish and frog embryos. The family also includes several paralogs of hnRNP G, such as hRBMY and hnRNP G-T (also termed RNA-binding motif protein, X-linked-like-2). Both hRBMY and hnRNP G-T are exclusively expressed in testis and are critical for male fertility. Like hnRNP G, hRBMY and hnRNP G-T interact with factors implicated in the regulation of pre-mRNA splicing, such as hTra2-beta1 and T-STAR. Although members in this family share a highly conserved N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), they appear to recognize different RNA targets. For instance, hRBMY interacts specifically with a stem-loop structure in which the loop is formed by the sequence CA/UCAA. In contrast, hnRNP G associates with single stranded RNA sequences containing a CCA/C motif. In addition to the RRM, hnRNP G contains a nascent transcripts targeting domain (NTD) in the middle region and a novel auxiliary RNA-binding domain (RBD) in its C-terminal region. The C-terminal RBD exhibits distinct RNA binding specificity, and would play a critical role in the regulation of alternative splicing by hnRNP G. 	80
409817	cd12383	RRM_RBM42	RNA recognition motif (RRM) found in RNA-binding protein 42 (RBM42) and similar proteins. This subfamily corresponds to the RRM of RBM42 which has been identified as a heterogeneous nuclear ribonucleoprotein K (hnRNP K)-binding protein. It also directly binds the 3' untranslated region of p21 mRNA that is one of the target mRNAs for hnRNP K. Both, hnRNP K and RBM42, are components of stress granules (SGs). Under nonstress conditions, RBM42 predominantly localizes within the nucleus and co-localizes with hnRNP K. Under stress conditions, hnRNP K and RBM42 form cytoplasmic foci where the SG marker TIAR localizes, and may play a role in the maintenance of cellular ATP level by protecting their target mRNAs. RBM42 contains an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	83
409818	cd12384	RRM_RBM24_RBM38_like	RNA recognition motif (RRM) found in eukaryotic RNA-binding protein RBM24, RBM38 and similar proteins. This subfamily corresponds to the RRM of RBM24 and RBM38 from vertebrate, SUPpressor family member SUP-12 from Caenorhabditis elegans and similar proteins. Both, RBM24 and RBM38, are preferentially expressed in cardiac and skeletal muscle tissues. They regulate myogenic differentiation by controlling the cell cycle in a p21-dependent or -independent manner. RBM24, also termed RNA-binding region-containing protein 6, interacts with the 3'-untranslated region (UTR) of myogenin mRNA and regulates its stability in C2C12 cells. RBM38, also termed CLL-associated antigen KW-5, or HSRNASEB, or RNA-binding region-containing protein 1(RNPC1), or ssDNA-binding protein SEB4, is a direct target of the p53 family. It is required for maintaining the stability of the basal and stress-induced p21 mRNA by binding to their 3'-UTRs. It also binds the AU-/U-rich elements in p63 3'-UTR and regulates p63 mRNA stability and activity. SUP-12 is a novel tissue-specific splicing factor that controls muscle-specific splicing of the ADF/cofilin pre-mRNA in C. elegans. All family members contain a conserved RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	76
409819	cd12385	RRM1_hnRNPM_like	RNA recognition motif 1 (RRM1) found in heterogeneous nuclear ribonucleoprotein M (hnRNP M) and similar proteins. This subfamily corresponds to the RRM1 of heterogeneous nuclear ribonucleoprotein M (hnRNP M), myelin expression factor 2 (MEF-2 or MyEF-2 or MST156) and similar proteins. hnRNP M is a pre-mRNA binding protein that may play an important role in pre-mRNA processing. It also preferentially binds to poly(G) and poly(U) RNA homopolymers. Moreover, hnRNP M is able to interact with early spliceosomes, further influencing splicing patterns of specific pre-mRNAs. hnRNP M functions as the receptor of carcinoembryonic antigen (CEA) that contains the penta-peptide sequence PELPK signaling motif. In addition, hnRNP M and another splicing factor Nova-1 work together as dopamine D2 receptor (D2R) pre-mRNA-binding proteins. They regulate alternative splicing of D2R pre-mRNA in an antagonistic manner. hnRNP M contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and an unusual hexapeptide-repeat region rich in methionine and arginine residues (MR repeat motif). MEF-2 is a sequence-specific single-stranded DNA (ssDNA) binding protein that binds specifically to ssDNA derived from the proximal (MB1) element of the myelin basic protein (MBP) promoter and represses transcription of the MBP gene. MEF-2 shows high sequence homology with hnRNP M. It also contains three RRMs, which may be responsible for its ssDNA binding activity. 	76
409820	cd12386	RRM2_hnRNPM_like	RNA recognition motif 2 (RRM2) found in heterogeneous nuclear ribonucleoprotein M (hnRNP M) and similar proteins. This subfamily corresponds to the RRM2 of heterogeneous nuclear ribonucleoprotein M (hnRNP M), myelin expression factor 2 (MEF-2 or MyEF-2 or MST156) and similar proteins. hnRNP M is a pre-mRNA binding protein that may play an important role in pre-mRNA processing. It also preferentially binds to poly(G) and poly(U) RNA homopolymers. hnRNP M is able to interact with early spliceosomes, further influencing splicing patterns of specific pre-mRNAs. It functions as the receptor of carcinoembryonic antigen (CEA) that contains the penta-peptide sequence PELPK signaling motif. In addition, hnRNP M and another splicing factor Nova-1 work together as dopamine D2 receptor (D2R) pre-mRNA-binding proteins. They regulate alternative splicing of D2R pre-mRNA in an antagonistic manner. hnRNP M contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and an unusual hexapeptide-repeat region rich in methionine and arginine residues (MR repeat motif). MEF-2 is a sequence-specific single-stranded DNA (ssDNA) binding protein that binds specifically to ssDNA derived from the proximal (MB1) element of the myelin basic protein (MBP) promoter and represses transcription of the MBP gene. MEF-2 shows high sequence homology with hnRNP M. It also contains three RRMs, which may be responsible for its ssDNA binding activity. 	74
409821	cd12387	RRM3_hnRNPM_like	RNA recognition motif 3 (RRM3) found in heterogeneous nuclear ribonucleoprotein M (hnRNP M) and similar proteins. This subfamily corresponds to the RRM3 of heterogeneous nuclear ribonucleoprotein M (hnRNP M), myelin expression factor 2 (MEF-2 or MyEF-2 or MST156) and similar proteins. hnRNP M is a pre-mRNA binding protein that may play an important role in pre-mRNA processing. It also preferentially binds to poly(G) and poly(U) RNA homopolymers. hnRNP M is able to interact with early spliceosomes, further influencing splicing patterns of specific pre-mRNAs. hnRNP M functions as the receptor of carcinoembryonic antigen (CEA) that contains the penta-peptide sequence PELPK signaling motif. In addition, hnRNP M and another splicing factor Nova-1 work together as dopamine D2 receptor (D2R) pre-mRNA-binding proteins. They regulate alternative splicing of D2R pre-mRNA in an antagonistic manner. hnRNP M contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and an unusual hexapeptide-repeat region rich in methionine and arginine residues (MR repeat motif). MEF-2 is a sequence-specific single-stranded DNA (ssDNA) binding protein that binds specifically to ssDNA derived from the proximal (MB1) element of the myelin basic protein (MBP) promoter and represses transcription of the MBP gene. MEF-2 shows high sequence homology with hnRNP M. It also contains three RRMs, which may be responsible for its ssDNA binding activity. 	71
409822	cd12388	RRM1_RAVER	RNA recognition motif 1 (RRM1) found in ribonucleoprotein PTB-binding raver-1, raver-2 and similar proteins. This subfamily corresponds to the RRM1 of raver-1 and raver-2. Raver-1 is a ubiquitously expressed heterogeneous nuclear ribonucleoprotein (hnRNP) that serves as a co-repressor of the nucleoplasmic splicing repressor polypyrimidine tract-binding protein (PTB)-directed splicing of select mRNAs. It shuttles between the cytoplasm and the nucleus and can accumulate in the perinucleolar compartment, a dynamic nuclear substructure that harbors PTB. Raver-1 also modulates focal adhesion assembly by binding to the cytoskeletal proteins, including alpha-actinin, vinculin, and metavinculin (an alternatively spliced isoform of vinculin) at adhesion complexes, particularly in differentiated muscle tissue. Raver-2 is a novel member of the heterogeneous nuclear ribonucleoprotein (hnRNP) family. It shows high sequence homology to raver-1. Raver-2 exerts a spatio-temporal expression pattern during embryogenesis and is mainly limited to differentiated neurons and glia cells. Although it displays nucleo-cytoplasmic shuttling in heterokaryons, raver-2 localizes to the nucleus in glia cells and neurons. Raver-2 can interact with PTB and may participate in PTB-mediated RNA-processing. However, there is no evidence indicating that raver-2 can bind to cytoplasmic proteins. Both raver-1 and raver-2 contain three N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two putative nuclear localization signals (NLS) at the N- and C-termini, a central leucine-rich region, and a C-terminal region harboring two [SG][IL]LGxxP motifs. They bind to RNA through the RRMs. In addition, the two [SG][IL]LGxxP motifs serve as the PTB-binding motifs in raver-1. However, raver-2 interacts with PTB through the SLLGEPP motif only. 	70
409823	cd12389	RRM2_RAVER	RNA recognition motif 2 (RRM2) found in ribonucleoprotein PTB-binding raver-1, raver-2 and similar proteins. This subfamily corresponds to the RRM2 of raver-1 and raver-2. Raver-1 is a ubiquitously expressed heterogeneous nuclear ribonucleoprotein (hnRNP) that serves as a co-repressor of the nucleoplasmic splicing repressor polypyrimidine tract-binding protein (PTB)-directed splicing of select mRNAs. It shuttles between the cytoplasm and the nucleus and can accumulate in the perinucleolar compartment, a dynamic nuclear substructure that harbors PTB. Raver-1 also modulates focal adhesion assembly by binding to the cytoskeletal proteins, including alpha-actinin, vinculin, and metavinculin (an alternatively spliced isoform of vinculin) at adhesion complexes, particularly in differentiated muscle tissue. Raver-2 is a novel member of the heterogeneous nuclear ribonucleoprotein (hnRNP) family. It shows high sequence homology to raver-1. Raver-2 exerts a spatio-temporal expression pattern during embryogenesis and is mainly limited to differentiated neurons and glia cells. Although it displays nucleo-cytoplasmic shuttling in heterokaryons, raver-2 localizes to the nucleus in glia cells and neurons. Raver-2 can interact with PTB and may participate in PTB-mediated RNA-processing. However, there is no evidence indicating that raver-2 can bind to cytoplasmic proteins. Both raver-1 and raver-2 contain three N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two putative nuclear localization signals (NLS) at the N- and C-termini, a central leucine-rich region, and a C-terminal region harboring two [SG][IL]LGxxP motifs. They bind to RNA through the RRMs. In addition, the two [SG][IL]LGxxP motifs serve as the PTB-binding motifs in raver-1. However, raver-2 interacts with PTB through the SLLGEPP motif only. 	77
409824	cd12390	RRM3_RAVER	RNA recognition motif 3 (RRM3) found in ribonucleoprotein PTB-binding raver-1, raver-2 and similar proteins. This subfamily corresponds to the RRM3 of raver-1 and raver-2. Raver-1 is a ubiquitously expressed heterogeneous nuclear ribonucleoprotein (hnRNP) that serves as a co-repressor of the nucleoplasmic splicing repressor polypyrimidine tract-binding protein (PTB)-directed splicing of select mRNAs. It shuttles between the cytoplasm and the nucleus and can accumulate in the perinucleolar compartment, a dynamic nuclear substructure that harbors PTB. Raver-1 also modulates focal adhesion assembly by binding to the cytoskeletal proteins, including alpha-actinin, vinculin, and metavinculin (an alternatively spliced isoform of vinculin) at adhesion complexes, particularly in differentiated muscle tissue. Raver-2 is a novel member of the heterogeneous nuclear ribonucleoprotein (hnRNP) family. It shows high sequence homology to raver-1. Raver-2 exerts a spatio-temporal expression pattern during embryogenesis and is mainly limited to differentiated neurons and glia cells. Although it displays nucleo-cytoplasmic shuttling in heterokaryons, raver-2 localizes to the nucleus in glia cells and neurons. Raver-2 can interact with PTB and may participate in PTB-mediated RNA-processing. However, there is no evidence indicating that raver-2 can bind to cytoplasmic proteins. Both raver-1 and raver-2 contain three N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two putative nuclear localization signals (NLS) at the N- and C-termini, a central leucine-rich region, and a C-terminal region harboring two [SG][IL]LGxxP motifs. They bind to RNA through the RRMs. In addition, the two [SG][IL]LGxxP motifs serve as the PTB-binding motifs in raver-1. However, raver-2 interacts with PTB through the SLLGEPP motif only. 	91
409825	cd12391	RRM1_SART3	RNA recognition motif 1 (RRM1) found in squamous cell carcinoma antigen recognized by T-cells 3 (SART3) and similar proteins. This subfamily corresponds to the RRM1 of SART3, also termed Tat-interacting protein of 110 kDa (Tip110), an RNA-binding protein expressed in the nucleus of the majority of proliferating cells, including normal cells and malignant cells, but not in normal tissues except for the testes and fetal liver. It is involved in the regulation of mRNA splicing probably via its complex formation with RNA-binding protein with a serine-rich domain (RNPS1), a pre-mRNA-splicing factor. SART3 has also been identified as a nuclear Tat-interacting protein that regulates Tat transactivation activity through direct interaction and functions as an important cellular factor for HIV-1 gene expression and viral replication. In addition, SART3 is required for U6 snRNP targeting to Cajal bodies. It binds specifically and directly to the U6 snRNA, interacts transiently with the U6 and U4/U6 snRNPs, and promotes the reassembly of U4/U6 snRNPs after splicing in vitro. SART3 contains an N-terminal half-a-tetratricopeptide repeat (HAT)-rich domain, a nuclear localization signal (NLS) domain, and two C-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 	72
409826	cd12392	RRM2_SART3	RNA recognition motif 2 (RRM2) found in squamous cell carcinoma antigen recognized by T-cells 3 (SART3) and similar proteins. This subfamily corresponds to the RRM2 of SART3, also termed Tat-interacting protein of 110 kDa (Tip110), an RNA-binding protein expressed in the nucleus of the majority of proliferating cells, including normal cells and malignant cells, but not in normal tissues except for the testes and fetal liver. It is involved in the regulation of mRNA splicing probably via its complex formation with RNA-binding protein with a serine-rich domain (RNPS1), a pre-mRNA-splicing factor. SART3 has also been identified as a nuclear Tat-interacting protein that regulates Tat transactivation activity through direct interaction and functions as an important cellular factor for HIV-1 gene expression and viral replication. In addition, SART3 is required for U6 snRNP targeting to Cajal bodies. It binds specifically and directly to the U6 snRNA, interacts transiently with the U6 and U4/U6 snRNPs, and promotes the reassembly of U4/U6 snRNPs after splicing in vitro. SART3 contains an N-terminal half-a-tetratricopeptide repeat (HAT)-rich domain, a nuclear localization signal (NLS) domain, and two C-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 	81
409827	cd12393	RRM_ZCRB1	RNA recognition motif (RRM) found in Zinc finger CCHC-type and RNA-binding motif-containing protein 1 (ZCRB1) and similar proteins. This subfamily corresponds to the RRM of ZCRB1, also termed MADP-1, or U11/U12 small nuclear ribonucleoprotein 31 kDa protein (U11/U12 snRNP 31 or U11/U12-31K), a novel multi-functional nuclear factor, which may be involved in morphine dependence, cold/heat stress, and hepatocarcinoma. It is located in the nucleoplasm, but outside the nucleolus. ZCRB1 is one of the components of U11/U12 snRNPs that bind to U12-type pre-mRNAs and form a di-snRNP complex, simultaneously recognizing the 5' splice site and branchpoint sequence. ZCRB1 is characterized by an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a CCHC-type Zinc finger motif. In addition, it contains core nucleocapsid motifs, and Lys- and Glu-rich domains.  	76
409828	cd12394	RRM1_RBM34	RNA recognition motif 1 (RRM1) found in RNA-binding protein 34 (RBM34) and similar proteins. This subfamily corresponds to the RRM1 of RBM34, a putative RNA-binding protein containing two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). Although the function of RBM34 remains unclear currently, its RRM domains may participate in mRNA processing. RBM34 may act as an mRNA processing-related protein. 	91
409829	cd12395	RRM2_RBM34	RNA recognition motif 2 (RRM2) found in RNA-binding protein 34 (RBM34) and similar proteins. This subfamily corresponds to the RRM2 of RBM34, a putative RNA-binding protein containing two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). Although the function of RBM34 remains unclear currently, its RRM domains may participate in mRNA processing. RBM34 may act as an mRNA processing-related protein. 	73
409830	cd12396	RRM1_Nop13p_fungi	RNA recognition motif 1 (RRM1) found in yeast nucleolar protein 13 (Nop13p) and similar proteins. This subfamily corresponds to the RRM1 of Nop13p encoded by YNL175c from Saccharomyces cerevisiae. It shares high sequence similarity with nucleolar protein 12 (Nop12p). Both, Nop12p and Nop13p, are not essential for growth. However, unlike Nop12p that is localized to the nucleolus, Nop13p localizes primarily to the nucleolus but is also present in the nucleoplasm to a lesser extent. Nop13p contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 	85
409831	cd12397	RRM2_Nop13p_fungi	RNA recognition motif 2 (RRM2) found in yeast nucleolar protein 13 (Nop13p) and similar proteins. This subfamily corresponds to the RRM2 of Nop13p encoded by YNL175c from Saccharomyces cerevisiae. It shares high sequence similarity with nucleolar protein 12 (Nop12p). Both Nop12p and Nop13p are not essential for growth. However, unlike Nop12p that is localized to the nucleolus, Nop13p localizes primarily to the nucleolus but is also present in the nucleoplasm to a lesser extent. Nop13p contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 	76
409832	cd12398	RRM_CSTF2_RNA15_like	RNA recognition motif (RRM) found in cleavage stimulation factor subunit 2 (CSTF2), yeast ortholog mRNA 3'-end-processing protein RNA15 and similar proteins. This subfamily corresponds to the RRM domain of CSTF2, its tau variant and eukaryotic homologs. CSTF2, also termed cleavage stimulation factor 64 kDa subunit (CstF64), is the vertebrate counterpart of yeast mRNA 3'-end-processing protein RNA15. It is expressed in all somatic tissues and is one of three cleavage stimulatory factor (CstF) subunits required for polyadenylation. CstF64 contains an N-terminal RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), a CstF77-binding domain, a repeated MEARA helical region and a conserved C-terminal domain reported to bind the transcription factor PC-4. During polyadenylation, CstF interacts with the pre-mRNA through the RRM of CstF64 at U- or GU-rich sequences within 10 to 30 nucleotides downstream of the cleavage site. CSTF2T, also termed tauCstF64, is a paralog of the X-linked cleavage stimulation factor CstF64 protein that supports polyadenylation in most somatic cells. It is expressed during meiosis and subsequent haploid differentiation in a more limited set of tissues and cell types, largely in meiotic and postmeiotic male germ cells, and to a lesser extent in brain. The loss of CSTF2T will cause male infertility, as it is necessary for spermatogenesis and fertilization. Moreover, CSTF2T is required for expression of genes involved in morphological differentiation of spermatids, as well as for genes having products that function during interaction of motile spermatozoa with eggs. It promotes germ cell-specific patterns of polyadenylation by using its RRM to bind to different sequence elements downstream of polyadenylation sites than does CstF64. The family also includes yeast ortholog mRNA 3'-end-processing protein RNA15 and similar proteins. RNA15 is a core subunit of cleavage factor IA (CFIA), an essential transcriptional 3'-end processing factor from Saccharomyces cerevisiae. RNA recognition by CFIA is mediated by an N-terminal RRM, which is contained in the RNA15 subunit of the complex. The RRM of RNA15 has a strong preference for GU-rich RNAs, mediated by a binding pocket that is entirely conserved in both yeast and vertebrate RNA15 orthologs.	77
409833	cd12399	RRM_HP0827_like	RNA recognition motif (RRM) found in Helicobacter pylori HP0827 protein and similar proteins. This subfamily corresponds to the RRM of H. pylori HP0827, a putative ssDNA-binding protein 12rnp2 precursor, containing one RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). The ssDNA binding may be important in activation of HP0827. 	75
409834	cd12400	RRM_Nop6	RNA recognition motif (RRM) found in Saccharomyces cerevisiae nucleolar protein 6 (Nop6) and similar proteins. This subfamily corresponds to the RRM of Nop6, also known as Ydl213c, a component of 90S pre-ribosomal particles in yeast S. cerevisiae. It is enriched in the nucleolus and is required for 40S ribosomal subunit biogenesis. Nop6 is a non-essential putative RNA-binding protein with two N-terminal putative nuclear localisation sequences (NLS-1 and NLS-2) and an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). It binds to the pre-rRNA early during transcription and plays an essential role in pre-rRNA processing. 	74
409835	cd12401	RRM_eIF4H	RNA recognition motif (RRM) found in eukaryotic translation initiation factor 4H (eIF-4H) and similar proteins. This subfamily corresponds to the RRM of eIF-4H, also termed Williams-Beuren syndrome chromosomal region 1 protein, which, together with eIF-4B/eIF-4G, serves as the accessory protein of RNA helicase eIF-4A. eIF-4H contains a well conserved RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). It stimulates protein synthesis by enhancing the helicase activity of eIF-4A in the initiation step of mRNA translation. 	84
409836	cd12402	RRM_eIF4B	RNA recognition motif (RRM) found in eukaryotic translation initiation factor 4B (eIF-4B) and similar proteins. This subfamily corresponds to the RRM of eIF-4B, a multi-domain RNA-binding protein that has been primarily implicated in promoting the binding of 40S ribosomal subunits to mRNA during translation initiation. It contains two RNA-binding domains; the N-terminal well-conserved RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), binds the 18S rRNA of the 40S ribosomal subunit and the C-terminal basic domain (BD), including two arginine-rich motifs (ARMs), binds mRNA during initiation, and is primarily responsible for the stimulation of the helicase activity of eIF-4A. eIF-4B also contains a DRYG domain (a region rich in Asp, Arg, Tyr, and Gly amino acids) in the middle, which is responsible for both self-association of eIF-4B and binding to the p170 subunit of eIF3. Additional research indicates that eIF-4B can interact with the poly(A) binding protein (PABP) in mammalian cells, which can stimulate both the eIF-4B-mediated activation of the helicase activity of eIF-4A and binding of poly(A) by PABP. eIF-4B has also been shown to interact specifically with the internal ribosome entry sites (IRES) of several picornaviruses which facilitate cap-independent translation initiation. 	81
409837	cd12403	RRM1_NCL	RNA recognition motif 1 (RRM1) found in vertebrate nucleolin. This subfamily corresponds to the RRM1 of ubiquitously expressed protein nucleolin, also termed protein C23. Nucleolin is a multifunctional major nucleolar phosphoprotein that has been implicated in various metabolic processes, such as ribosome biogenesis, cytokinesis, nucleogenesis, cell proliferation and growth, cytoplasmic-nucleolar transport of ribosomal components, transcriptional repression, replication, signal transduction, inducing chromatin decondensation, etc. Nucleolin exhibits intrinsic self-cleaving, DNA helicase, RNA helicase and DNA-dependent ATPase activities. It can be phosphorylated by many protein kinases, such as the major mitotic kinase Cdc2, casein kinase 2 (CK2), and protein kinase C-zeta. Nucleolin shares similar domain architecture with gar2 from Schizosaccharomyces pombe and NSR1 from Saccharomyces cerevisiae. The highly phosphorylated N-terminal domain of nucleolin is made up of highly acidic regions separated from each other by basic sequences, and contains multiple phosphorylation sites. The central domain of nucleolin contains four closely adjacent N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), which suggests that nucleolin is potentially able to interact with multiple RNA targets. The C-terminal RGG (or GAR) domain of nucleolin is rich in glycine, arginine and phenylalanine residues, and contains high levels of NG,NG-dimethylarginines. RRM1, together with RRM2, binds specifically to RNA stem-loops containing the sequence (U/G)CCCG(A/G) in the loop.  	75
409838	cd12404	RRM2_NCL	RNA recognition motif 2 (RRM2) found in vertebrate nucleolin. This subfamily corresponds to the RRM2 of ubiquitously expressed protein nucleolin, also termed protein C23, a multifunctional major nucleolar phosphoprotein that has been implicated in various metabolic processes, such as ribosome biogenesis, cytokinesis, nucleogenesis, cell proliferation and growth, cytoplasmic-nucleolar transport of ribosomal components, transcriptional repression, replication, signal transduction, inducing chromatin decondensation, etc. Nucleolin exhibits intrinsic self-cleaving, DNA helicase, RNA helicase and DNA-dependent ATPase activities. It can be phosphorylated by many protein kinases, such as the major mitotic kinase Cdc2, casein kinase 2 (CK2), and protein kinase C-zeta. Nucleolin shares similar domain architecture with gar2 from Schizosaccharomyces pombe and NSR1 from Saccharomyces cerevisiae. The highly phosphorylated N-terminal domain of nucleolin is made up of highly acidic regions separated from each other by basic sequences, and contains multiple phosphorylation sites. The central domain of nucleolin contains four closely adjacent N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), which suggests that nucleolin is potentially able to interact with multiple RNA targets. The C-terminal RGG (or GAR) domain of nucleolin is rich in glycine, arginine and phenylalanine residues, and contains high levels of NG,NG-dimethylarginines. RRM2, together with RRM1, binds specifically to RNA stem-loops containing the sequence (U/G)CCCG(A/G) in the loop.  	77
409839	cd12405	RRM3_NCL	RNA recognition motif 3 (RRM3) found in vertebrate nucleolin. This subfamily corresponds to the RRM3 of ubiquitously expressed protein nucleolin, also termed protein C23, a multifunctional major nucleolar phosphoprotein that has been implicated in various metabolic processes, such as ribosome biogenesis, cytokinesis, nucleogenesis, cell proliferation and growth, cytoplasmic-nucleolar transport of ribosomal components, transcriptional repression, replication, signal transduction, inducing chromatin decondensation, etc. Nucleolin exhibits intrinsic self-cleaving, DNA helicase, RNA helicase and DNA-dependent ATPase activities. It can be phosphorylated by many protein kinases, such as the major mitotic kinase Cdc2, casein kinase 2 (CK2), and protein kinase C-zeta. Nucleolin shares similar domain architecture with gar2 from Schizosaccharomyces pombe and NSR1 from Saccharomyces cerevisiae. The highly phosphorylated N-terminal domain of nucleolin is made up of highly acidic regions separated from each other by basic sequences, and contains multiple phosphorylation sites. The central domain of nucleolin contains four closely adjacent N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), which suggests that nucleolin is potentially able to interact with multiple RNA targets. The C-terminal RGG (or GAR) domain of nucleolin is rich in glycine, arginine and phenylalanine residues, and contains high levels of NG,NG-dimethylarginines. 	72
409840	cd12406	RRM4_NCL	RNA recognition motif 4 (RRM4) found in vertebrate nucleolin. This subfamily corresponds to the RRM4 of ubiquitously expressed protein nucleolin, also termed protein C23, a multifunctional major nucleolar phosphoprotein that has been implicated in various metabolic processes, such as ribosome biogenesis, cytokinesis, nucleogenesis, cell proliferation and growth, cytoplasmic-nucleolar transport of ribosomal components, transcriptional repression, replication, signal transduction, inducing chromatin decondensation, etc. Nucleolin exhibits intrinsic self-cleaving, DNA helicase, RNA helicase and DNA-dependent ATPase activities. It can be phosphorylated by many protein kinases, such as the major mitotic kinase Cdc2, casein kinase 2 (CK2), and protein kinase C-zeta. Nucleolin shares similar domain architecture with gar2 from Schizosaccharomyces pombe and NSR1 from Saccharomyces cerevisiae. The highly phosphorylated N-terminal domain of nucleolin is made up of highly acidic regions separated from each other by basic sequences, and contains multiple phosphorylation sites. The central domain of nucleolin contains four closely adjacent N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), which suggests that nucleolin is potentially able to interact with multiple RNA targets. The C-terminal RGG (or GAR) domain of nucleolin is rich in glycine, arginine and phenylalanine residues, and contains high levels of NG,NG-dimethylarginines. 	78
409841	cd12407	RRM_FOX1_like	RNA recognition motif (RRM) found in vertebrate RNA binding protein fox-1 homologs and similar proteins. This subfamily corresponds to the RRM of several tissue-specific alternative splicing isoforms of vertebrate RNA binding protein Fox-1 homologs, which show high sequence similarity to the Caenorhabditis elegans feminizing locus on X (Fox-1) gene encoding Fox-1 protein. RNA binding protein Fox-1 homolog 1 (RBFOX1), also termed ataxin-2-binding protein 1 (A2BP1), or Fox-1 homolog A, or hexaribonucleotide-binding protein 1 (HRNBP1), is predominantly expressed in neurons, skeletal muscle and heart. It regulates alternative splicing of tissue-specific exons by binding to UGCAUG elements. Moreover, RBFOX1 binds to the C-terminus of ataxin-2 and forms an ataxin-2/A2BP1 complex involved in RNA processing. RNA binding protein fox-1 homolog 2 (RBFOX2), also termed Fox-1 homolog B, or hexaribonucleotide-binding protein 2 (HRNBP2), or RNA-binding motif protein 9 (RBM9), or repressor of tamoxifen transcriptional activity, is expressed in ovary, whole embryo, and human embryonic cell lines in addition to neurons and muscle. RBFOX2 activates splicing of neuron-specific exons through binding to downstream UGCAUG elements. RBFOX2 also functions as a repressor of tamoxifen activation of the estrogen receptor. RNA binding protein Fox-1 homolog 3 (RBFOX3 or NeuN or HRNBP3), also termed Fox-1 homolog C, is a nuclear RNA-binding protein that regulates alternative splicing of the RBFOX2 pre-mRNA, producing a message encoding a dominant negative form of the RBFOX2 protein. Its message is detected exclusively in post-mitotic regions of embryonic brain. Like RBFOX1, both RBFOX2 and RBFOX3 bind to the hexanucleotide UGCAUG elements and modulate brain and muscle-specific splicing of exon EIIIB of fibronectin, exon N1 of c-src, and calcitonin/CGRP. Members in this family also harbor one RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	76
409842	cd12408	RRM_eIF3G_like	RNA recognition motif (RRM) found in eukaryotic translation initiation factor 3 subunit G (eIF-3G) and similar proteins. This subfamily corresponds to the RRM of eIF-3G and similar proteins. eIF-3G, also termed eIF-3 subunit 4, or eIF-3-delta, or eIF3-p42, or eIF3-p44, is the RNA-binding subunit of eIF3, a large multisubunit complex that plays a central role in the initiation of translation by binding to the 40 S ribosomal subunit and promoting the binding of methionyl-tRNAi and mRNA. eIF-3G binds 18 S rRNA and beta-globin mRNA, and therefore appears to be a nonspecific RNA-binding protein. eIF-3G is one of the cytosolic targets and interacts with mature apoptosis-inducing factor (AIF). eIF-3G contains one RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). This family also includes yeast eIF3-p33, a homolog of vertebrate eIF-3G, which plays an important role in the initiation phase of protein synthesis in yeast. It binds both mRNA and rRNA fragments due to an RRM near its C-terminus. 	76
409843	cd12409	RRM1_RRT5	RNA recognition motif 1 (RRM1) found in yeast regulator of rDNA transcription protein 5 (RRT5) and similar proteins. This subfamily corresponds to the RRM1 of the lineage specific family containing a group of uncharacterized yeast regulators of rDNA transcription protein 5 (RRT5), which may play roles in the modulation of rDNA transcription. RRT5 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 	84
409844	cd12410	RRM2_RRT5	RNA recognition motif 2 (RRM2) found in yeast regulator of rDNA transcription protein 5 (RRT5) and similar proteins. This subfamily corresponds to the RRM2 of the lineage specific family containing a group of uncharacterized yeast regulators of rDNA transcription protein 5 (RRT5), which may play roles in the modulation of rDNA transcription. RRT5 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 	93
409845	cd12411	RRM_ist3_like	RNA recognition motif (RRM) found in ist3 family. This subfamily corresponds to the RRM of the ist3 family that includes fungal U2 small nuclear ribonucleoprotein (snRNP) component increased sodium tolerance protein 3 (ist3), X-linked 2 RNA-binding motif proteins (RBMX2) found in Metazoa and plants, and similar proteins. Gene IST3 encoding ist3, also termed U2 snRNP protein SNU17 (Snu17p), is a novel yeast Saccharomyces cerevisiae protein required for the first catalytic step of splicing and for progression of spliceosome assembly. It binds specifically to the U2 snRNP and is an intrinsic component of prespliceosomes and spliceosomes. Yeast ist3 contains an atypical RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). In the yeast pre-mRNA retention and splicing complex, the atypical RRM of ist3 functions as a scaffold that organizes the other two constituents, Bud13p (bud site selection 13) and Pml1p (pre-mRNA leakage 1). Fission yeast Schizosaccharomyces pombe gene cwf29 encoding ist3, also termed cell cycle control protein cwf29, is an RNA-binding protein complexed with cdc5 protein 29. It also contains one RRM. The biological function of RBMX2 remains unclear. It shows high sequence similarity to yeast ist3 protein and harbors one RRM as well. 	89
409846	cd12412	RRM_DAZL_BOULE	RNA recognition motif (RRM) found in AZoospermia (DAZ) autosomal homologs, DAZL (DAZ-like) and BOULE. This subfamily corresponds to the RRM domain of two Deleted in AZoospermia (DAZ) autosomal homologs, DAZL (DAZ-like) and BOULE. BOULE is the founder member of the family and DAZL arose from BOULE in an ancestor of vertebrates. The DAZ gene subsequently originated from a duplication transposition of the DAZL gene. Invertebrates contain a single DAZ homolog, BOULE, while vertebrates, other than catarrhine primates, possess both BOULE and DAZL genes. The catarrhine primates possess BOULE, DAZL, and DAZ genes. The family members encode closely related RNA-binding proteins that are required for fertility in numerous organisms. These proteins contain an RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a varying number of copies of a DAZ motif, believed to mediate protein-protein interactions. DAZL and BOULE contain a single copy of the DAZ motif, while DAZ proteins can contain 8-24 copies of this repeat. Although their specific biochemical functions remain to be investigated, DAZL proteins may interact with poly(A)-binding proteins (PABPs), and act as translational activators of specific mRNAs during gametogenesis.  	81
409847	cd12413	RRM1_RBM28_like	RNA recognition motif 1 (RRM1) found in RNA-binding protein 28 (RBM28) and similar proteins. This subfamily corresponds to the RRM1 of RBM28 and Nop4p. RBM28 is a specific nucleolar component of the spliceosomal small nuclear ribonucleoproteins (snRNPs), possibly coordinating their transition through the nucleolus. It specifically associates with U1, U2, U4, U5, and U6 small nuclear RNAs (snRNAs), and may play a role in the maturation of both small nuclear and ribosomal RNAs. RBM28 has four RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and an extremely acidic region between RRM2 and RRM3. The family also includes nucleolar protein 4 (Nop4p or Nop77p) encoded by YPL043W from Saccharomyces cerevisiae. It is an essential nucleolar protein involved in processing and maturation of 27S pre-rRNA and biogenesis of 60S ribosomal subunits. Nop4p also contains four RRMs.  	79
409848	cd12414	RRM2_RBM28_like	RNA recognition motif 2 (RRM2) found in RNA-binding protein 28 (RBM28) and similar proteins. This subfamily corresponds to the RRM2 of RBM28 and Nop4p. RBM28 is a specific nucleolar component of the spliceosomal small nuclear ribonucleoproteins (snRNPs), possibly coordinating their transition through the nucleolus. It specifically associates with U1, U2, U4, U5, and U6 small nuclear RNAs (snRNAs), and may play a role in the maturation of both small nuclear and ribosomal RNAs. RBM28 has four RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and an extremely acidic region between RRM2 and RRM3. The family also includes nucleolar protein 4 (Nop4p or Nop77p) encoded by YPL043W from Saccharomyces cerevisiae. It is an essential nucleolar protein involved in processing and maturation of 27S pre-rRNA and biogenesis of 60S ribosomal subunits. Nop4p also contains four RRMs.  	76
409849	cd12415	RRM3_RBM28_like	RNA recognition motif 3 (RRM3) found in RNA-binding protein 28 (RBM28) and similar proteins. This subfamily corresponds to the RRM3 of RBM28 and Nop4p. RBM28 is a specific nucleolar component of the spliceosomal small nuclear ribonucleoproteins (snRNPs), possibly coordinating their transition through the nucleolus. It specifically associates with U1, U2, U4, U5, and U6 small nuclear RNAs (snRNAs), and may play a role in the maturation of both small nuclear and ribosomal RNAs. RBM28 has four RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and an extremely acidic region between RRM2 and RRM3. The family also includes nucleolar protein 4 (Nop4p or Nop77p) encoded by YPL043W from Saccharomyces cerevisiae. It is an essential nucleolar protein involved in processing and maturation of 27S pre-rRNA and biogenesis of 60S ribosomal subunits. Nop4p also contains four RRMs.  	83
409850	cd12416	RRM4_RBM28_like	RNA recognition motif 4 (RRM4) found in RNA-binding protein 28 (RBM28) and similar proteins. This subfamily corresponds to the RRM4 of RBM28 and Nop4p. RBM28 is a specific nucleolar component of the spliceosomal small nuclear ribonucleoproteins (snRNPs), possibly coordinating their transition through the nucleolus. It specifically associates with U1, U2, U4, U5, and U6 small nuclear RNAs (snRNAs), and may play a role in the maturation of both small nuclear and ribosomal RNAs. RBM28 has four RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and an extremely acidic region between RRM2 and RRM3. The family also includes nucleolar protein 4 (Nop4p or Nop77p) encoded by YPL043W from Saccharomyces cerevisiae. It is an essential nucleolar protein involved in processing and maturation of 27S pre-rRNA and biogenesis of 60S ribosomal subunits. Nop4p also contains four RRMs. 	98
409851	cd12417	RRM_SAFB_like	RNA recognition motif (RRM) found in the scaffold attachment factor (SAFB) family. This subfamily corresponds to the RRM domain of the SAFB family, including scaffold attachment factor B1 (SAFB1), scaffold attachment factor B2 (SAFB2), SAFB-like transcriptional modulator (SLTM), and similar proteins, which are ubiquitously expressed. SAFB1, SAFB2 and SLTM have been implicated in many diverse cellular processes including cell growth and transformation, stress response, and apoptosis. They share high sequence similarities and all contain a scaffold attachment factor-box (SAF-box, also known as SAP domain) DNA-binding motif, an RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a region rich in glutamine and arginine residues. SAFB1 is a nuclear protein with a distribution similar to that of SLTM, but unlike that of SAFB2, which is also found in the cytoplasm. To a large extent, SAFB1 and SLTM might share similar functions, such as the inhibition of an oestrogen reporter gene. The additional cytoplasmic localization of SAFB2 implies that it could play additional roles in the cytoplasmic compartment which are distinct from the nuclear functions shared with SAFB1 and SLTM. 	74
409852	cd12418	RRM_Aly_REF_like	RNA recognition motif (RRM) found in the Aly/REF family. This subfamily corresponds to the RRM of the Aly/REF family which includes THO complex subunit 4 (THOC4, also termed Aly/REF), S6K1 Aly/REF-like target (SKAR, also termed PDIP3 or PDIP46) and similar proteins. THOC4 is an mRNA transporter protein with a well conserved RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). It is involved in RNA transportation from the nucleus, and was initially identified as a transcription coactivator of LEF-1 and AML-1 for the TCRalpha enhancer function. In addition, THOC4 specifically binds to rhesus (RH) promoter in erythroid, and might be a novel transcription cofactor for erythroid-specific genes. SKAR shows high sequence homology with THOC4 and possesses one RRM as well. SKAR is widely expressed and localizes to the nucleus. It may be a critical player in the function of S6K1 in cell and organism growth control by binding the activated, hyperphosphorylated form of S6K1 but not S6K2. Furthermore, SKAR functions as a protein partner of the p50 subunit of DNA polymerase delta. In addition, SKAR may have particular importance in pancreatic beta cell size determination and insulin secretion. 	75
409853	cd12419	RRM_Ssp2_like	RNA recognition motif (RRM) found in yeast sporulation-specific protein 2 (Ssp2) and similar proteins. This subfamily corresponds to the RRM of the lineage specific yeast sporulation-specific protein 2 (Ssp2) and similar proteins. Ssp2 is encoded by a sporulation-specific gene necessary for outer spore wall assembly in the yeast Saccharomyces cerevisiae. It localizes to the spore wall and may play an important role after meiosis II and during spore wall formation. Ssp2 contains one RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	85
409854	cd12420	RRM_RBPMS_like	RNA recognition motif (RRM) found in RNA-binding protein with multiple splicing (RBP-MS)-like proteins. This subfamily corresponds to the RRM of RNA-binding proteins with multiple splicing (RBP-MS)-like proteins, including protein products of RBPMS genes (RBP-MS and its paralogue RBP-MS2), the Drosophila couch potato (cpo), and Caenorhabditis elegans Mec-8 genes. RBP-MS may be involved in regulation of mRNA translation and localization during Xenopus laevis development. It has also been shown to physically interact with Smad2, Smad3 and Smad4, and stimulates Smad-mediated transactivation. Cpo may play an important role in regulating normal function of the nervous system, whereas mutations in Mec-8 affect mechanosensory and chemosensory neuronal function. All members contain a well conserved RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). Some uncharacterized family members contain two RRMs; this subfamily includes their RRM1. Their RRM2 shows high sequence homology to the RRM of yeast proteins scw1, Whi3, and Whi4.	76
409855	cd12421	RRM1_PTBP1_hnRNPL_like	RNA recognition motif (RRM) found in polypyrimidine tract-binding protein 1 (PTB or hnRNP I), heterogeneous nuclear ribonucleoprotein L (hnRNP-L), and similar proteins. This subfamily corresponds to the RRM1 of the majority of family members that include polypyrimidine tract-binding protein 1 (PTB or hnRNP I), polypyrimidine tract-binding protein 2 (PTBP2 or nPTB), regulator of differentiation 1 (Rod1), heterogeneous nuclear ribonucleoprotein L (hnRNP-L), heterogeneous nuclear ribonucleoprotein L-like (hnRNP-LL), polypyrimidine tract-binding protein homolog 3 (PTBPH3), polypyrimidine tract-binding protein homolog 1 and 2 (PTBPH1 and PTBPH2), and similar proteins. PTB is an important negative regulator of alternative splicing in mammalian cells and also functions at several other aspects of mRNA metabolism, including mRNA localization, stabilization, polyadenylation, and translation. PTBP2 is highly homologous to PTB and is perhaps specific to the vertebrates. Unlike PTB, PTBP2 is enriched in the brain and in some neural cell lines. It binds more stably to the downstream control sequence (DCS) RNA than PTB does but is a weaker repressor of splicing in vitro. PTBP2 also greatly enhances the binding of two other proteins, heterogeneous nuclear ribonucleoprotein (hnRNP) H and KH-type splicing-regulatory protein (KSRP), to the DCS RNA. The binding properties of PTBP2 and its reduced inhibitory activity on splicing imply roles in controlling the assembly of other splicing-regulatory proteins. Rod1 is a mammalian polypyrimidine tract binding protein (PTB) homolog of a regulator of differentiation in the fission yeast Schizosaccharomyces pombe, where the nrd1 gene encodes an RNA binding protein that negatively regulates the onset of differentiation. ROD1 is predominantly expressed in hematopoietic cells or organs. It might play a role in controlling differentiation in mammals. hnRNP-L is a higher eukaryotic specific subunit of human KMT3a (also known as HYPB or hSet2) complex required for histone H3 Lys-36 trimethylation activity. It plays both nuclear and cytoplasmic roles in mRNA export of intronless genes, IRES-mediated translation, mRNA stability, and splicing. hnRNP-LL protein plays a critical and unique role in the signal-induced regulation of CD45 and acts as a global regulator of alternative splicing in activated T cells. The family also includes polypyrimidine tract binding protein homolog 3 (PTBPH3) found in plants. Although its biological roles remain unclear, PTBPH3 shows significant sequence similarity to other family members, all of which contain four RNA recognition motifs (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). Although their biological roles remain unclear, both PTBPH1 and PTBPH2 show significant sequence similarity to PTB. However, in contrast to PTB, they have three RRMs. In addition, this family also includes RNA-binding motif protein 20 (RBM20) that is an alternative splicing regulator associated with dilated cardiomyopathy (DCM) and contains only one RRM. 	74
409856	cd12422	RRM2_PTBP1_hnRNPL_like	RNA recognition motif (RRM) found in polypyrimidine tract-binding protein 1 (PTB or hnRNP I), heterogeneous nuclear ribonucleoprotein L (hnRNP-L), and similar proteins. This subfamily corresponds to the RRM2 of polypyrimidine tract-binding protein 1 (PTB or hnRNP I), polypyrimidine tract-binding protein 2 (PTBP2 or nPTB), regulator of differentiation 1 (Rod1), heterogeneous nuclear ribonucleoprotein L (hnRNP-L), heterogeneous nuclear ribonucleoprotein L-like (hnRNP-LL), polypyrimidine tract-binding protein homolog 3 (PTBPH3), polypyrimidine tract-binding protein homolog 1 and 2 (PTBPH1 and PTBPH2), and similar proteins, and RRM3 of PTBPH1 and PTBPH2. PTB is an important negative regulator of alternative splicing in mammalian cells and also functions at several other aspects of mRNA metabolism, including mRNA localization, stabilization, polyadenylation, and translation. PTBP2 is highly homologous to PTB and is perhaps specific to the vertebrates. Unlike PTB, PTBP2 is enriched in the brain and in some neural cell lines. It binds more stably to the downstream control sequence (DCS) RNA than PTB does but is a weaker repressor of splicing in vitro. PTBP2 also greatly enhances the binding of two other proteins, heterogeneous nuclear ribonucleoprotein (hnRNP) H and KH-type splicing-regulatory protein (KSRP), to the DCS RNA. The binding properties of PTBP2 and its reduced inhibitory activity on splicing imply roles in controlling the assembly of other splicing-regulatory proteins. Rod1 is a mammalian polypyrimidine tract binding protein (PTB) homolog of a regulator of differentiation in the fission yeast Schizosaccharomyces pombe, where the nrd1 gene encodes an RNA binding protein that negatively regulates the onset of differentiation. ROD1 is predominantly expressed in hematopoietic cells or organs. It might play a role in controlling differentiation in mammals. hnRNP-L is a higher eukaryotic specific subunit of human KMT3a (also known as HYPB or hSet2) complex required for histone H3 Lys-36 trimethylation activity. It plays both nuclear and cytoplasmic roles in mRNA export of intronless genes, IRES-mediated translation, mRNA stability, and splicing. hnRNP-LL protein plays a critical and unique role in the signal-induced regulation of CD45 and acts as a global regulator of alternative splicing in activated T cells. This family also includes polypyrimidine tract binding protein homolog 3 (PTBPH3) found in plants. Although its biological roles remain unclear, PTBPH3 shows significant sequence similarity to other family members, all of which contain four RNA recognition motifs (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). Although their biological roles remain unclear, both PTBPH1 and PTBPH2 show significant sequence similarity to PTB. However, in contrast to PTB, they have three RRMs. 	85
409857	cd12423	RRM3_PTBP1_like	RNA recognition motif 3 (RRM3) found in polypyrimidine tract-binding protein 1 (PTB or hnRNP I) and similar proteins. This subfamily corresponds to the RRM3 of polypyrimidine tract-binding protein 1 (PTB or hnRNP I), polypyrimidine tract-binding protein 2 (PTBP2 or nPTB), regulator of differentiation 1 (Rod1), and similar proteins found in Metazoa. PTB is an important negative regulator of alternative splicing in mammalian cells and also functions at several other aspects of mRNA metabolism, including mRNA localization, stabilization, polyadenylation, and translation. PTBP2 is highly homologous to PTB and is perhaps specific to the vertebrates. Unlike PTB, PTBP2 is enriched in the brain and in some neural cell lines. It binds more stably to the downstream control sequence (DCS) RNA than PTB does but is a weaker repressor of splicing in vitro. PTBP2 also greatly enhances the binding of two other proteins, heterogeneous nuclear ribonucleoprotein (hnRNP) H and KH-type splicing-regulatory protein (KSRP), to the DCS RNA. The binding properties of PTBP2 and its reduced inhibitory activity on splicing imply roles in controlling the assembly of other splicing-regulatory proteins. PTBP2 also contains four RRMs. ROD1 coding protein Rod1 is a mammalian PTB homolog of a regulator of differentiation in the fission yeast Schizosaccharomyces pombe, where the nrd1 gene encodes an RNA binding protein that negatively regulates the onset of differentiation. ROD1 is predominantly expressed in hematopoietic cells or organs. It may play a role in controlling differentiation in mammals. All members in this family contain four RNA recognition motifs (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	74
409858	cd12424	RRM3_hnRNPL_like	RNA recognition motif 3 (RRM3) found in heterogeneous nuclear ribonucleoprotein L (hnRNP-L) and similar proteins. This subfamily corresponds to the RRM3 of heterogeneous nuclear ribonucleoprotein L (hnRNP-L), heterogeneous nuclear ribonucleoprotein L-like (hnRNP-LL), and similar proteins. hnRNP-L is a higher eukaryotic specific subunit of human KMT3a (also known as HYPB or hSet2) complex required for histone H3 Lys-36 trimethylation activity. It plays both nuclear and cytoplasmic roles in mRNA export of intronless genes, IRES-mediated translation, mRNA stability, and splicing. hnRNP-LL plays a critical and unique role in the signal-induced regulation of CD45 and acts as a global regulator of alternative splicing in activated T cells. It is closely related in domain structure and sequence to hnRNP-L, which contains three RNA-recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). The family also includes polypyrimidine tract binding protein homolog 3 (PTBPH3) found in plants. Although its biological roles remain unclear, PTBPH3 shows significant sequence similarity to polypyrimidine tract binding protein (PTB) that is an important negative regulator of alternative splicing in mammalian cells and also functions at several other aspects of mRNA metabolism, including mRNA localization, stabilization, polyadenylation, and translation. Like PTB, PTBPH3 contains four RRMs.	74
409859	cd12425	RRM4_PTBP1_like	RNA recognition motif 4 (RRM4) found in polypyrimidine tract-binding protein 1 (PTB or hnRNP I) and similar proteins. This subfamily corresponds to the RRM4 of polypyrimidine tract-binding protein 1 (PTB or hnRNP I), polypyrimidine tract-binding protein 2 (PTBP2 or nPTB), regulator of differentiation 1 (Rod1), and similar proteins found in Metazoa. PTB is an important negative regulator of alternative splicing in mammalian cells and also functions at several other aspects of mRNA metabolism, including mRNA localization, stabilization, polyadenylation, and translation. PTBP2 is highly homologous to PTB and is perhaps specific to the vertebrates. Unlike PTB, PTBP2 is enriched in the brain and in some neural cell lines. It binds more stably to the downstream control sequence (DCS) RNA than PTB does but is a weaker repressor of splicing in vitro. PTBP2 also greatly enhances the binding of two other proteins, heterogeneous nuclear ribonucleoprotein (hnRNP) H and KH-type splicing-regulatory protein (KSRP), to the DCS RNA. The binding properties of PTBP2 and its reduced inhibitory activity on splicing imply roles in controlling the assembly of other splicing-regulatory proteins. PTBP2 also contains four RRMs. ROD1 coding protein Rod1 is a mammalian PTB homolog of a regulator of differentiation in the fission yeast Schizosaccharomyces pombe, where the nrd1 gene encodes an RNA binding protein that negatively regulates the onset of differentiation. ROD1 is predominantly expressed in hematopoietic cells or organs. It may play a role in controlling differentiation in mammals. All members in this family contain four RNA recognition motifs (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	76
409860	cd12426	RRM4_PTBPH3	RNA recognition motif 4 (RRM4) found in plant polypyrimidine tract-binding protein homolog 3 (PTBPH3). This subfamily corresponds to the RRM4 of PTBPH3. Although its biological roles remain unclear, PTBPH3 shows significant sequence similarity to polypyrimidine tract binding protein (PTB) that is an important negative regulator of alternative splicing in mammalian cells and also functions at several other aspects of mRNA metabolism, including mRNA localization, stabilization, polyadenylation, and translation. Like PTB, PTBPH3 contains four RNA recognition motifs (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	79
409861	cd12427	RRM4_hnRNPL_like	RNA recognition motif 4 (RRM4) found in heterogeneous nuclear ribonucleoprotein L (hnRNP-L) and similar proteins. This subfamily corresponds to the RRM4 of heterogeneous nuclear ribonucleoprotein L (hnRNP-L), heterogeneous nuclear ribonucleoprotein L-like (hnRNP-LL), and similar proteins. hnRNP-L is a higher eukaryotic specific subunit of human KMT3a (also known as HYPB or hSet2) complex required for histone H3 Lys-36 trimethylation activity. It plays both nuclear and cytoplasmic roles in mRNA export of intronless genes, IRES-mediated translation, mRNA stability, and splicing. hnRNP-LL plays a critical and unique role in the signal-induced regulation of CD45 and acts as a global regulator of alternative splicing in activated T cells. It is closely related in domain structure and sequence to hnRNP-L, which contains three RNA-recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	84
409862	cd12428	RRM_PARN	RNA recognition motif (RRM) found in poly(A)-specific ribonuclease PARN and similar proteins. The subfamily corresponds to the RRM of PARN, also termed deadenylating nuclease, or deadenylation nuclease, or polyadenylate-specific ribonuclease, a processive poly(A)-specific 3'-exoribonuclease involved in the decay of eukaryotic mRNAs. It specifically binds both the poly(A) tail at the 3' end and the 7-methylguanosine (m7G) cap located at the 5' end of eukaryotic mRNAs, and catalyzes the 3'- to 5'-end deadenylation of single-stranded mRNA with a free 3' hydroxyl group both in the nucleus and in the cytoplasm. PARN belongs to the DEDD superfamily of exonucleases. It contains a nuclease domain, an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and an R3H domain. PARN exists as a homodimer. The nuclease domain is involved in the dimerization. RRM and R3H domains are essential for the RNA-binding. 	66
409863	cd12429	RRM_DNAJC17	RNA recognition motif (RRM) found in the DnaJ homolog subfamily C member 17. The CD corresponds to the RRM of some eukaryotic DnaJ homolog subfamily C member 17 and similar proteins. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperone family. Members in this family contain an N-terminal DnaJ domain or J-domain, which mediates the interaction with Hsp70. They also contain an RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), at the C-terminus, which may play an essential role in RNA binding. 	74
409864	cd12430	RRM_LARP4_5_like	RNA recognition motif (RRM) found in La-related protein 4 (LARP4), La-related protein 5 (LARP5 or LARP4B) and similar proteins. This subfamily corresponds to the RRM of LARP4 and LARP5. LARP4 is a cytoplasmic factor that can bind poly(A) RNA and interact with poly(A) binding protein (PABP). It may play a role in promoting translation by stabilizing mRNA. LARP5 is a cytosolic protein that co-sediments with polysomes and accumulates upon stress induction in cellular stress granules. It can interact with the cytosolic poly(A) binding protein 1 (PABPC1) and the receptor for activated C Kinase (RACK1), a component of the 40S ribosomal subunit. LARP5 may function as a stimulatory factor of translation through bridging mRNA factors of the 3' end with initiating ribosomes. Both LARP4 and LARP5 are structurally related to the La autoantigen. Like other La-related proteins (LARPs) family members, LARP4 and LARP5 contain a La motif (LAM) and an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	76
409865	cd12431	RRM_ALKBH8	RNA recognition motif (RRM) found in alkylated DNA repair protein alkB homolog 8 (ALKBH8) and similar proteins. This subfamily corresponds to the RRM of ALKBH8, also termed alpha-ketoglutarate-dependent dioxygenase ABH8, or S-adenosyl-L-methionine-dependent tRNA methyltransferase ABH8, expressed in various types of human cancers. It is essential in urothelial carcinoma cell survival mediated by NOX-1-dependent ROS signals. ALKBH8 has also been identified as a tRNA methyltransferase that catalyzes methylation of tRNA to yield 5-methylcarboxymethyl uridine (mcm5U) at the wobble position of the anticodon loop. Thus, ALKBH8 plays a crucial role in the DNA damage survival pathway through a distinct mechanism involving the regulation of tRNA modification. ALKBH8 localizes to the cytoplasm. It contains the characteristic AlkB domain that is composed of a tRNA methyltransferase motif, a motif homologous to the bacterial AlkB DNA/RNA repair enzyme, and a dioxygenase catalytic core domain encompassing cofactor-binding sites for iron and 2-oxoglutarate. In addition, unlike other AlkB homologs, ALKBH8 contains an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal S-adenosylmethionine (SAM)-dependent methyltransferase (MT) domain. 	80
409866	cd12432	RRM_ACINU	RNA recognition motif (RRM) found in apoptotic chromatin condensation inducer in the nucleus (acinus) and similar proteins. This subfamily corresponds to the RRM of Acinus, a caspase-3-activated nuclear factor that induces apoptotic chromatin condensation after cleavage by caspase-3 without inducing DNA fragmentation. It is essential for apoptotic chromatin condensation and may also participate in nuclear structural changes occurring in normal cells. Acinus contains a P-loop motif and an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), which indicates Acinus might have ATPase and DNA/RNA-binding activity. 	90
409867	cd12433	RRM_Yme2p_like	RNA recognition motif (RRM) found in yeast mitochondrial escape protein 2 (Yme2p) and similar proteins. This subfamily corresponds to the RRM of Yme2p, also termed protein RNA12, an inner mitochondrial membrane protein that plays a critical role in mitochondrial DNA transactions. It may serve as a mediator of nucleoid structure and number in mitochondria of the yeast Saccharomyces cerevisiae. Yme2p contains an exonuclease domain, an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal domain. 	86
409868	cd12434	RRM_RCAN_like	RNA recognition motif (RRM) found in regulators of calcineurin (RCANs) and similar proteins. This subfamily corresponds to the RRM of RCANs, a novel family of calcineurin regulators that are key factors contributing to Down syndrome in humans. They can stimulate and inhibit the Ca2+/calmodulin-dependent phosphatase calcineurin (also termed PP2B or PP3C) signaling in vivo through direct interactions with its catalytic subunit. Overexpressed RCANs may bind and inhibit calcineurin. In contrast, low levels of phosphorylated RCANs may stimulate the calcineurin signaling. RCANs are characterized by harboring a central short, unique serine-proline motif containing FLIISPPxSPP box, which is strongly conserved from yeast to human but is absent in bacteria. They consist of an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), a highly conserved SP repeat domain containing the phosphorylation site by GSK-3, a well-known PxIxIT motif responsible for docking many substrates to calcineurin, and an unrecognized C-terminal TxxP motif of unknown function. 	75
409869	cd12435	RRM_GW182_like	RNA recognition motif (RRM) found in the GW182 family proteins. This subfamily corresponds to the RRM of the GW182 family which includes three paralogs of TNRC6 (GW182-related) proteins comprising GW182/TNGW1, TNRC6B (containing three isoforms) and TNRC6C in mammal, a single Drosophila ortholog (dGW182, also called Gawky) and two Caenorhabditis elegans orthologs AIN-1 and AIN-2, which contain multiple miRNA-binding sites and have important functions in miRNA-mediated translational repression, as well as mRNA degradation in Metazoa. The GW182 family proteins directly interact with Argonaute (Ago) proteins, and thus function as downstream effectors in the miRNA pathway, responsible for inhibition of translation and acceleration of mRNA decay. Members in this family are characterized by an abnormally high content of glycine/tryptophan (G/W) repeats, one or more glutamine (Q)-rich motifs, and a C-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). The only exception is the worm protein that does not contain a recognizable RRM domain. The GW182 family proteins are recruited to miRNA targets through an interaction between their N-terminal domain and an Argonaute protein. Then they promote translational repression and/or degradation of miRNA targets through their C-terminal silencing domain.  	71
409870	cd12436	RRM1_2_MATR3_like	RNA recognition motif 1 (RRM1) and 2 (RRM2) found in the matrin 3 family of nuclear proteins. This subfamily corresponds to the RRM of the matrin 3 family of nuclear proteins consisting of Matrin 3 (MATR3), nuclear protein 220 (NP220) and similar proteins. MATR3 is a highly conserved inner nuclear matrix protein that has been implicated in various biological processes. NP220 is a large nucleoplasmic DNA-binding protein that binds to cytidine-rich sequences, such as CCCCC (G/C), in double-stranded DNA (dsDNA). Both Matrin 3 and NP220 contain two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a Cys2-His2 zinc finger-like motif at the C-terminal region. 	76
409871	cd12437	RRM_BRAP2_like	RNA recognition motif (RRM) found in BRCA1-associated protein (BRAP2) and similar proteins. This subfamily corresponds to the RRM domain of BRAP2, also termed impedes mitogenic signal propagation (IMP), or ring finger protein 52, or renal carcinoma antigen NY-REN-63, a novel cytoplasmic protein interacting with the two functional nuclear localisation signal (NLS) motifs of BRCA1, a nuclear protein linked to breast cancer. It also binds to the SV40 large T antigen NLS motif and the bipartite NLS motif found in mitosin. BRAP2 may serve as a cytoplasmic retention protein and play a role in the regulation of nuclear protein transport. The family also includes RING finger protein ETP1 and its homologs found in fungi. ETP1, also termed BRAP2 homolog, or ethanol tolerance protein 1, is the yeast homolog of BRCA1-associated protein (BRAP2) found in vertebrates. It may be involved in ethanol and salt-induced transcriptional activation of the NHA1 promoter and heat shock protein genes (HSP12 and HSP26), and participate in ethanol-induced turnover of the low-affinity hexose transporter Hxt3p. Members in this family contain an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), followed by a C3HC4-type ring finger domain and a UBP-type zinc finger. 	82
409872	cd12438	RRM_CNOT4	RNA recognition motif (RRM) found in Eukaryotic CCR4-NOT transcription complex subunit 4 (NOT4) and similar proteins. This subfamily corresponds to the RRM of NOT4, also termed CCR4-associated factor 4, or E3 ubiquitin-protein ligase CNOT4, or potential transcriptional repressor NOT4Hp, a component of the CCR4-NOT complex, a global negative regulator of RNA polymerase II transcription. NOT4 functions as an ubiquitin-protein ligase (E3). It contains an N-terminal C4C4 type RING finger motif, followed by a RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). The RING fingers may interact with a subset of ubiquitin-conjugating enzymes (E2s), including UbcH5B, and mediate protein-protein interactions. T	98
409873	cd12439	RRM_TRMT2A	RNA recognition motif (RRM) found in tRNA (uracil-5-)-methyltransferase homolog A (TRMT2A) and similar proteins. This subfamily corresponds to the RRM of TRMT2A, also known as HpaII tiny fragments locus 9c protein (HTF9C), a novel cell cycle regulated protein. It is an independent biologic factor expressed in tumors associated with clinical outcome in HER2 expressing breast cancer. The function of TRMT2A remains unclear although by sequence homology it has a RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), related to RNA methyltransferases. 	79
409874	cd12440	RRM_SYNJ	RNA recognition motif (RRM) found in synaptojanin-1, synaptojanin-2 and similar proteins. This subfamily corresponds to the RRM of two active phosphatidylinositol phosphate phosphatases, synaptojanin-1 and synaptojanin-2. They have different interaction partners and are likely to have different biological functions. Synaptojanin-1 was originally identified as one of the major Grb2-binding proteins that may participate in synaptic vesicle endocytosis. It also acts as a Src homology 3 (SH3) domain-binding brain-specific inositol 5-phosphatase with a putative role in clathrin-mediated endocytosis. Synaptojanin-2 is a ubiquitously expressed homolog of synaptojanin-1. It is a novel Rac1 effector regulating the early step of clathrin-mediated endocytosis. Synaptojanin-2 directly and specifically interacts with Rac1 in a GTP-dependent manner. It mediates the inhibitory effect of Rac1 on endocytosis and plays an important role in the Rac1-mediated control of cell growth. Both, synaptojanin-1 and synaptojanin-2, have two tissue-specific alternative splicing isoforms, a shorter isoform expressed in brain and a longer isoform in peripheral tissues. Synaptojanin-1 contains an N-terminal domain homologous to the cytoplasmic portion of the yeast protein Sac1p, a central inositol 5-phosphatase domain followed by a putative RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal proline-rich region mediating the binding of synaptojanin-1 to various SH3 domain-containing proteins including amphiphysin, SH3p4, SH3p8, SH3p13, and Grb2. Synaptojanin-2 shows high sequence homology to the N-terminal Sac1p homology domain, the central inositol 5-phosphatase domain, the putative RNA recognition motif (RRM) of synaptojanin-1, but differs in the proline-rich region. 	77
409875	cd12441	RRM_Nup53_like	RNA recognition motif (RRM) found in nucleoporin Nup53 and similar proteins. This subfamily corresponds to the RRM domain of nucleoporin Nup53, also termed mitotic phosphoprotein 44 (MP-44), or nuclear pore complex protein Nup53, required for normal cell growth and nuclear morphology in vertebrate. It tightly associates with the nuclear envelope membrane and the nuclear lamina where it interacts with lamin B. It may also interact with a group of nucleoporins including Nup93, Nup155, and Nup205 and play a role in the association of the mitotic checkpoint protein Mad1 with the nuclear pore complex (NPC). The family also includes Saccharomyces cerevisiae Nup53p, an ortholog of vertebrate nucleoporin Nup53. A unique property of yeast Nup53p is that it contains an additional Kap121p-binding domain and interacts specifically with the karyopherin Kap121p, which is involved in the assembly of Nup53p into NPCs. Both, vertebrate Nup35 and yeast Nup53p, contain an atypical RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), a C-terminal amphipathic alpha-helix and several FG repeats. This family corresponds to the RRM domain which lacks the conserved residues that typically bind RNA in canonical RRM domains.	73
409876	cd12442	RRM_RBM48	RNA recognition motif (RRM) found in RNA-binding protein 48 (RBM48) and similar proteins. This subfamily corresponds to the RRM of RBM48, a putative RNA-binding protein of unknown function. It contains one RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	100
409877	cd12443	RRM_MCM3A_like	RNA recognition motif (RRM) found in 80 kDa MCM3-associated protein (Map80) and similar proteins. This subfamily corresponds to the RRM of Map80, also termed germinal center-associated nuclear protein (GANP), involved in the nuclear localization pathway of MCM3, a protein that is necessary for the initiation of DNA replication and is also involved in the controls that ensure DNA replication is initiated once per cell cycle. Map80 contains one RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	75
409878	cd12444	RRM1_CPEBs	RNA recognition motif 1 (RRM1) found in cytoplasmic polyadenylation element-binding protein CPEB-1, CPEB-2, CPEB-3, CPEB-4 and similar proteins. This subfamily corresponds to the RRM1 of the CPEB family of proteins that bind to defined groups of mRNAs and act as either translational repressors or activators to regulate their translation. CPEB proteins are well conserved in both vertebrates and invertebrates. Based on sequence similarity, RNA-binding specificity, and functional regulation of translation, the CPEB proteins have been classified into two subfamilies. The first subfamily includes CPEB-1 and related proteins. CPEB-1 is an RNA-binding protein that interacts with the cytoplasmic polyadenylation element (CPE), a short U-rich motif in the 3' untranslated regions (UTRs) of certain mRNAs. It functions as a translational regulator that plays a major role in the control of maternal CPE-containing mRNA in oocytes, as well as of subsynaptic CPE-containing mRNA in neurons. Once phosphorylated and recruiting the polyadenylation complex, CPEB-1 may function as a translational activator stimulating polyadenylation and translation. Otherwise, it may function as a translational inhibitor when dephosphorylated and bound to a protein such as maskin or neuroguidin, which blocks translation initiation through interfering with the assembly of eIF-4E and eIF-4G. Although CPEB-1 is mainly located in cytoplasm, it can shuttle between nucleus and cytoplasm. The second subfamily includes CPEB-2, CPEB-3, CPEB-4, and related proteins. Due to high sequence similarity, members in this subfamily may share similar expression patterns and functions. CPEB-2 is an RNA-binding protein that is abundantly expressed in testis and localized in cytoplasm in transfected HeLa cells. It preferentially binds to poly(U) RNA oligomers and may regulate the translation of stored mRNAs during spermiogenesis. 
CPEB-2 impedes target RNA translation at elongation; it directly interacts with the elongation factor, eEF2, to reduce eEF2/ribosome-activated GTP hydrolysis in vitro and inhibit peptide elongation of CPEB2-bound RNA in vivo. CPEB-3 is a sequence-specific translational regulatory protein that regulates translation in a polyadenylation-independent manner. It functions as a translational repressor that governs the synthesis of the AMPA receptor GluR2 through binding GluR2 mRNA. It also represses translation of a reporter RNA in transfected neurons and stimulates translation in response to NMDA. CPEB-4 is an RNA-binding protein that mediates meiotic mRNA cytoplasmic polyadenylation and translation. It is essential for neuron survival and present on the endoplasmic reticulum (ER). It is accumulated in the nucleus upon ischemia or the depletion of ER calcium. CPEB-4 is overexpressed in a large variety of tumors and is associated with many mRNAs in cancer cells. All CPEB proteins are nucleus-cytoplasm shuttling proteins. They contain an N-terminal unstructured region, followed by two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a Zn-finger motif. CPEB-2, -3, and -4 have conserved nuclear export signals that are not present in CPEB-1. 	95
409879	cd12445	RRM2_CPEBs	RNA recognition motif 2 (RRM2) found in cytoplasmic polyadenylation element-binding protein CPEB-1, CPEB-2, CPEB-3, CPEB-4 and similar proteins. This subfamily corresponds to the RRM2 of the CPEB family of proteins that bind to defined groups of mRNAs and act as either translational repressors or activators to regulate their translation. CPEB proteins are well conserved in both vertebrates and invertebrates. Based on sequence similarity, RNA-binding specificity, and functional regulation of translation, the CPEB proteins have been classified into two subfamilies. The first subfamily includes CPEB-1 and related proteins. CPEB-1 is an RNA-binding protein that interacts with the cytoplasmic polyadenylation element (CPE), a short U-rich motif in the 3' untranslated regions (UTRs) of certain mRNAs. It functions as a translational regulator that plays a major role in the control of maternal CPE-containing mRNA in oocytes, as well as of subsynaptic CPE-containing mRNA in neurons. Once phosphorylated and recruiting the polyadenylation complex, CPEB-1 may function as a translational activator stimulating polyadenylation and translation. Otherwise, it may function as a translational inhibitor when dephosphorylated and bound to a protein such as maskin or neuroguidin, which blocks translation initiation through interfering with the assembly of eIF-4E and eIF-4G. Although CPEB-1 is mainly located in cytoplasm, it can shuttle between nucleus and cytoplasm. The second subfamily includes CPEB-2, CPEB-3, CPEB-4, and related proteins. Due to the high sequence similarity, members in this subfamily may share similar expression patterns and functions. CPEB-2 is an RNA-binding protein that is abundantly expressed in testis and localized in cytoplasm in transfected HeLa cells. It preferentially binds to poly(U) RNA oligomers and may regulate the translation of stored mRNAs during spermiogenesis. Moreover, CPEB-2 impedes target RNA translation at elongation. 
It directly interacts with the elongation factor, eEF2, to reduce eEF2/ribosome-activated GTP hydrolysis in vitro and inhibit peptide elongation of CPEB2-bound RNA in vivo. CPEB-3 is a sequence-specific translational regulatory protein that regulates translation in a polyadenylation-independent manner. It functions as a translational repressor that governs the synthesis of the AMPA receptor GluR2 through binding GluR2 mRNA. It also represses translation of a reporter RNA in transfected neurons and stimulates translation in response to NMDA. CPEB-4 is an RNA-binding protein that mediates meiotic mRNA cytoplasmic polyadenylation and translation. It is essential for neuron survival and present on the endoplasmic reticulum (ER). It is accumulated in the nucleus upon ischemia or the depletion of ER calcium. CPEB-4 is overexpressed in a large variety of tumors and is associated with many mRNAs in cancer cells. All CPEB proteins are nucleus-cytoplasm shuttling proteins. They contain an N-terminal unstructured region, followed by two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a Zn-finger motif. CPEB-2, -3, and -4 have conserved nuclear export signals that are not present in CPEB-1. 	81
409880	cd12446	RRM_RBM25	RNA recognition motif (RRM) found in eukaryotic RNA-binding protein 25 and similar proteins. This subfamily corresponds to the RRM of RBM25, also termed Arg/Glu/Asp-rich protein of 120 kDa (RED120), or protein S164, or RNA-binding region-containing protein 7, an evolutionarily conserved splicing coactivator SRm160 (SR-related nuclear matrix protein of 160 kDa)-interacting protein. RBM25 belongs to a family of RNA-binding proteins containing a well conserved RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), at the N-terminus, a RE/RD-rich (ER) central region, and a C-terminal proline-tryptophan-isoleucine (PWI) motif. It localizes to the nuclear speckles and associates with multiple splicing components, including splicing cofactors SRm160/300, U snRNAs, assembled splicing complexes, and spliced mRNAs. It may play an important role in pre-mRNA processing by coupling splicing with mRNA 3'-end formation. Additional research indicates that RBM25 is one of the RNA-binding regulators that direct the alternative splicing of apoptotic factors. It can activate proapoptotic Bcl-xS 5'ss by binding to the exonic splicing enhancer, CGGGCA, and stabilize the pre-mRNA-U1 snRNP through interaction with hLuc7A, a U1 snRNP-associated factor. 	83
409881	cd12447	RRM1_gar2	RNA recognition motif 1 (RRM1) found in yeast protein gar2 and similar proteins. This subfamily corresponds to the RRM1 of yeast protein gar2, a novel nucleolar protein required for 18S rRNA and 40S ribosomal subunit accumulation. It shares similar domain architecture with nucleolin from vertebrates and NSR1 from Saccharomyces cerevisiae. The highly phosphorylated N-terminal domain of gar2 is made up of highly acidic regions separated from each other by basic sequences, and contains multiple phosphorylation sites. The central domain of gar2 contains two closely adjacent N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The C-terminal RGG (or GAR) domain of gar2 is rich in glycine, arginine and phenylalanine residues. 	76
409882	cd12448	RRM2_gar2	RNA recognition motif 2 (RRM2) found in yeast protein gar2 and similar proteins. This subfamily corresponds to the RRM2 of yeast protein gar2, a novel nucleolar protein required for 18S rRNA and 40S ribosomal subunit accumulation. It shares similar domain architecture with nucleolin from vertebrates and NSR1 from Saccharomyces cerevisiae. The highly phosphorylated N-terminal domain of gar2 is made up of highly acidic regions separated from each other by basic sequences, and contains multiple phosphorylation sites. The central domain of gar2 contains two closely adjacent N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The C-terminal RGG (or GAR) domain of gar2 is rich in glycine, arginine and phenylalanine residues. 	73
409883	cd12449	RRM_CIRBP_RBM3	RNA recognition motif (RRM) found in cold inducible RNA binding protein (CIRBP), RNA binding motif protein 3 (RBM3) and similar proteins. This subfamily corresponds to the RRM domain of two structurally related heterogeneous nuclear ribonucleoproteins, CIRBP (also termed CIRP or A18 hnRNP) and RBM3 (also termed RNPL), both of which belong to a highly conserved cold shock protein family. The cold shock proteins can be induced after exposure to a moderate cold-shock and other cellular stresses such as UV radiation and hypoxia. CIRBP and RBM3 may function in posttranscriptional regulation of gene expression by binding to different transcripts, thus allowing the cell to respond rapidly to environmental signals. However, the kinetics and degree of cold induction are different between CIRBP and RBM3. Tissue distribution of their expression is different. CIRBP and RBM3 may be differentially regulated under physiological and stress conditions and may play distinct roles in cold responses of cells. CIRBP, also termed glycine-rich RNA-binding protein CIRP, is localized in the nucleus and mediates the cold-induced suppression of cell cycle progression. CIRBP also binds DNA and possibly serves as a chaperone that assists in the folding/unfolding, assembly/disassembly and transport of various proteins. RBM3 may enhance global protein synthesis and the formation of active polysomes while reducing the levels of ribonucleoprotein complexes containing microRNAs. RBM3 may also serve to prevent the loss of muscle mass by its ability to decrease cell death. Furthermore, RBM3 may be essential for cell proliferation and mitosis. Both CIRBP and RBM3 contain an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), that is involved in RNA binding, and a C-terminal glycine-rich domain (RGG motif) that probably enhances RNA-binding via protein-protein and/or protein-RNA interactions. 
Like CIRBP, RBM3 can also bind to both RNA and DNA via its RRM domain. 	80
409884	cd12450	RRM1_NUCLs	RNA recognition motif 1 (RRM1) found in nucleolin-like proteins mainly from plants. This subfamily corresponds to the RRM1 of a group of plant nucleolin-like proteins, including nucleolin 1 (also termed protein nucleolin like 1) and nucleolin 2 (also termed protein nucleolin like 2, or protein parallel like 1). They play roles in the regulation of ribosome synthesis and in the growth and development of plants. Like yeast nucleolin, nucleolin-like proteins possess two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains).  	78
409885	cd12451	RRM2_NUCLs	RNA recognition motif 2 (RRM2) found in nucleolin-like proteins mainly from plants. This subfamily corresponds to the RRM2 of a group of plant nucleolin-like proteins, including nucleolin 1 (also termed protein nucleolin like 1) and nucleolin 2 (also termed protein nucleolin like 2, or protein parallel like 1). They play roles in the regulation of ribosome synthesis and in the growth and development of plants. Like yeast nucleolin, nucleolin-like proteins possess two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains).  	79
409886	cd12452	RRM_ARP_like	RNA recognition motif (RRM) found in yeast asparagine-rich protein (ARP) and similar proteins. This subfamily corresponds to the RRM of ARP, also termed NRP1, encoded by Saccharomyces cerevisiae YDL167C. Although its exact biological function remains unclear, ARP contains an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), two Ran-binding protein zinc fingers (zf-RanBP), and an asparagine-rich region. It may possess RNA-binding and zinc ion binding activities. Additional research has indicated that ARP may function as a factor involved in the stress response. 	83
409887	cd12453	RRM1_RIM4_like	RNA recognition motif 1 (RRM1) found in yeast meiotic activator RIM4 and similar proteins. This subfamily corresponds to the RRM1 of RIM4, also termed regulator of IME2 protein 4, a putative RNA binding protein that is expressed at elevated levels early in meiosis. It functions as a meiotic activator required for both the IME1- and IME2-dependent pathways of meiotic gene expression, as well as early events of meiosis, such as meiotic division and recombination, in Saccharomyces cerevisiae. RIM4 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The family also includes a putative RNA-binding protein termed multicopy suppressor of sporulation protein Msa1. It is a putative RNA-binding protein encoded by a novel gene, msa1, from the fission yeast Schizosaccharomyces pombe. Msa1 may be involved in the inhibition of sexual differentiation by controlling the expression of Ste11-regulated genes, possibly through the pheromone-signaling pathway. Like RIM4, Msa1 also contains two RRMs, both of which are essential for the function of Msa1. 	86
409888	cd12454	RRM2_RIM4_like	RNA recognition motif 2 (RRM2) found in yeast meiotic activator RIM4 and similar proteins. This subfamily corresponds to the RRM2 of RIM4, also termed regulator of IME2 protein 4, a putative RNA binding protein that is expressed at elevated levels early in meiosis. It functions as a meiotic activator required for both the IME1- and IME2-dependent pathways of meiotic gene expression, as well as early events of meiosis, such as meiotic division and recombination, in Saccharomyces cerevisiae. RIM4 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The family also includes a putative RNA-binding protein termed multicopy suppressor of sporulation protein Msa1. It is a putative RNA-binding protein encoded by a novel gene, msa1, from the fission yeast Schizosaccharomyces pombe. Msa1 may be involved in the inhibition of sexual differentiation by controlling the expression of Ste11-regulated genes, possibly through the pheromone-signaling pathway. Like RIM4, Msa1 also contains two RRMs, both of which are essential for the function of Msa1. 	80
409889	cd12455	RRM_like_Smg4_UPF3	RNA recognition motif (RRM)-like Smg4_UPF3 domain in yeast up-frameshift suppressor 3 (Upf3p), Caenorhabditis elegans SMG-4, their human orthologs Upf3A and Upf3B, and similar proteins. This subfamily corresponds to the RRM-like Smg4_UPF3 domain found in yeast up-frameshift suppressor 3 (Upf3p), Caenorhabditis elegans SMG-4, their human orthologs Upf3A and Upf3B, and similar proteins. Upf3p, also termed nonsense-mediated mRNA decay protein 3, or Sua6p, a surveillance factor encoded by UPF3 gene from Saccharomyces cerevisiae. It is required for nonsense-mediated mRNA decay (NMD) in yeast. Upf3p is primarily cytoplasmic but accumulates inside the nucleus. Its nuclear import is mediated by the Srp1p (importin-alpha)/beta heterodimer while its nuclear export is mediated by a leucine-rich nuclear export sequence (NES-A), but not the Crm1p exportin. C. elegans SMG-4 is a nuclear shuttling protein that shuttles between the cytoplasm and nucleus through nuclear import and export signals similar to that of the yeast Upf3p. It is regulated by phosphorylation. Human orthologs of yeast Upf3p and C. elegans SMG-4 include Upf3A and Upf3B, which derive from two genes, UPF3A and X-linked UPF3B, respectively. Both, Upf3A (Up-frameshift suppressor 3 homolog A, also termed regulator of nonsense transcripts 3A, or nonsense mRNA reducing factor 3A) and Upf3B (Up-frameshift suppressor 3 homolog B on chromosome X, also termed regulator of nonsense transcripts 3B, or nonsense mRNA reducing factor 3B), are nucleocytoplasmic shuttling proteins. They associate selectively with spliced beta-globin mRNA in vivo, and tethering of any human Upf protein to the 3'UTR of beta-globin mRNA prevents NMD. The function of the Upf proteins in identifying and targeting nonsense mRNAs for rapid decay is conserved among eukaryotes. 
Besides, all Upf proteins in this family contain a conserved Smg4_UPF3 domain with some similarity to an RNA recognition motif (RRM), indicating that they may be RNA binding proteins. 	88
240902	cd12456	RRM_p65	RNA recognition motif (RRM) found in the holoenzyme La family protein p65. This subfamily corresponds to the RRM of a lineage specific family containing the essential La family protein p65 found in Tetrahymena thermophila. It is a telomerase holoenzyme protein necessary for telomerase RNA (TER) accumulation in vivo. p65, together with TER and telomerase reverse transcriptase (TERT), comprise a ternary catalytic core complex of Tetrahymena telomerase, which is a ribonucleoprotein complex essential for maintenance of telomere DNA at linear chromosome ends. p65 harbors a cryptic, atypical RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), which displays high structural homology to the RRM in genuine La and LARP7 proteins. 	76
409890	cd12457	RRM_XMAS2	RNA recognition motif (RRM) found in X-linked male sterile 2 (Xmas-2) and similar proteins. This subfamily corresponds to the RRM in Xmas-2, the Drosophila homolog of yeast Sac3p protein, together with E(y)2, the Drosophila homologue of yeast Sus1p protein, forming an endogenous complex that is required in the regulation of  mRNA transport and also involved in the efficient transcription regulation of the heat-shock protein 70 (hsp70) loci. All family members are found in insects and contain an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), followed by a PCI domain.	71
409891	cd12458	RRM_AtC3H46_like	RNA recognition motif (RRM) found in Arabidopsis thaliana zinc finger CCCH domain-containing protein 46 (AtC3H46) and similar proteins. This subfamily corresponds to the RRM domain in AtC3H46, a putative RNA-binding protein that contains an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a CCCH class of zinc finger, typically C-X8-C-X5-C-X3-H. It may possess ribonuclease activity. 	70
409892	cd12459	RRM1_CID8_like	RNA recognition motif 1 (RRM1) found in Arabidopsis thaliana CTC-interacting domain protein CID8, CID9, CID10, CID11, CID12, CID 13 and similar proteins. This subgroup corresponds to the RRM1 domains found in A. thaliana CID8, CID9, CID10, CID11, CID12, CID 13 and mainly their plant homologs. These highly related RNA-binding proteins contain an N-terminal PAM2 domain (PABP-interacting motif 2), two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a basic region that resembles a bipartite nuclear localization signal. The biological role of this family remains unclear.	80
409893	cd12460	RRM2_CID8_like	RNA recognition motif 2 (RRM2) found in Arabidopsis thaliana CTC-interacting domain protein CID8, CID9, CID10, CID11, CID12, CID 13 and similar proteins. This subgroup corresponds to the RRM2 domains found in A. thaliana CID8, CID9, CID10, CID11, CID12, CID 13 and mainly their plant homologs. These highly related RNA-binding proteins contain an N-terminal PAM2 domain (PABP-interacting motif 2), two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a basic region that resembles a bipartite nuclear localization signal. The biological role of this family remains unclear.	82
409894	cd12461	RRM_SCAF4	RNA recognition motif (RRM) found in SR-related and CTD-associated factor 4 (SCAF4) and similar proteins. The CD corresponds to the RRM of SCAF4 (also termed splicing factor, arginine/serine-rich 15 or SFR15, or CTD-binding SR-like protein RA4) that belongs to a new class of SCAFs (SR-like CTD-associated factors). Although its biological function remains unclear, SCAF4 shows high sequence similarity to SCAF8 that interacts specifically with a highly serine-phosphorylated form of the carboxy-terminal domain (CTD) of the largest subunit of RNA polymerase II (pol II) and may play a direct role in coupling transcription and pre-mRNA processing. SCAF4 and SCAF8 both contain a conserved N-terminal CTD-interacting domain (CID), an atypical RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and serine/arginine-rich motifs.	81
409895	cd12462	RRM_SCAF8	RNA recognition motif (RRM) found in SR-related and CTD-associated factor 8 (SCAF8) and similar proteins. This subgroup corresponds to the RRM of SCAF8 (also termed CDC5L complex-associated protein 7, or RNA-binding motif protein 16, or CTD-binding SR-like protein RA8), a nuclear matrix protein that interacts specifically with a highly serine-phosphorylated form of the carboxy-terminal domain (CTD) of the largest subunit of RNA polymerase II (pol II). The pol II CTD plays a role in coupling transcription and pre-mRNA processing. SCAF8 co-localizes primarily with transcription sites that are enriched in nuclear matrix fraction, which is known to contain proteins involved in pre-mRNA processing. Thus, SCAF8 may play a direct role in coupling transcription and pre-mRNA processing. SCAF8, together with SCAF4, represents a new class of SCAFs (SR-like CTD-associated factors). They contain a conserved N-terminal CTD-interacting domain (CID), an atypical RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and serine/arginine-rich motifs.	79
409896	cd12463	RRM_G3BP1	RNA recognition motif (RRM) found in ras GTPase-activating protein-binding protein 1 (G3BP1) and similar proteins. This subgroup corresponds to the RRM of G3BP1, also termed ATP-dependent DNA helicase VIII (DH VIII), or GAP SH3 domain-binding protein 1, which has been identified as a phosphorylation-dependent endoribonuclease that interacts with the SH3 domain of RasGAP, a multi-functional protein controlling Ras activity. The acidic RasGAP binding domain of G3BP1 harbors an arsenite-regulated phosphorylation site and dominantly inhibits stress granule (SG) formation. G3BP1 also contains an N-terminal nuclear transfer factor 2 (NTF2)-like domain, an RNA recognition motif (RRM domain), and an Arg-Gly-rich region (RGG-rich region, or arginine methylation motif). The RRM domain and RGG-rich region are canonically associated with RNA binding. G3BP1 co-immunoprecipitates with mRNAs. It binds to and cleaves the 3'-untranslated region (3'-UTR) of the c-myc mRNA in a phosphorylation-dependent manner. Thus, G3BP1 may play a role in coupling extra-cellular stimuli to mRNA stability. It has been shown that G3BP1 is a novel Dishevelled-associated protein that is methylated upon Wnt3a stimulation and that arginine methylation of G3BP1 regulates both Ctnnb1 mRNA and canonical Wnt/beta-catenin signaling. Furthermore, G3BP1 can be associated with the 3'-UTR of beta-F1 mRNA in cytoplasmic RNA-granules, demonstrating that G3BP1 may specifically repress the translation of the transcript.	80
409897	cd12464	RRM_G3BP2	RNA recognition motif (RRM) found in ras GTPase-activating protein-binding protein 2 (G3BP2) and similar proteins. This subgroup corresponds to the RRM of G3BP2, also termed GAP SH3 domain-binding protein 2, a cytoplasmic protein that interacts with both IkappaBalpha and IkappaBalpha/NF-kappaB complexes, indicating that G3BP2 may play a role in the control of nucleocytoplasmic distribution of IkappaBalpha and cytoplasmic anchoring of the IkappaBalpha/NF-kappaB complex. G3BP2 contains an N-terminal nuclear transfer factor 2 (NTF2)-like domain, an acidic domain, a domain containing five PXXP motifs, an RNA recognition motif (RRM domain), and an Arg-Gly-rich region (RGG-rich region, or arginine methylation motif). It binds to the SH3 domain of RasGAP, a multi-functional protein controlling Ras activity, through its N-terminal NTF2-like domain. The acidic domain is sufficient for the interaction of G3BP2 with the IkappaBalpha cytoplasmic retention sequence. Furthermore, G3BP2 might influence stability or translational efficiency of particular mRNAs by binding to RNA-containing structures within the cytoplasm through its RNA-binding domain.	83
409898	cd12465	RRM_UHMK1	RNA recognition motif (RRM) found in U2AF homology motif kinase 1 (UHMK1) and similar proteins. This subgroup corresponds to the RRM of UHMK1. UHMK1, also termed kinase interacting with stathmin (KIS) or P-CIP2, is a serine/threonine protein kinase functionally related to RNA metabolism and neurite outgrowth. It contains an N-terminal kinase domain and a C-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), with high homology to the corresponding motif of the mammalian U2 small nuclear ribonucleoprotein auxiliary factor U2AF 65 kDa subunit (U2AF65 or U2AF2). UHMK1 targets two key regulators of cell proliferation and migration, the cyclin-dependent kinase (CDK) inhibitor p27Kip1 and the microtubule-destabilizing protein stathmin. It plays a critical role during vascular wound repair by preventing excessive vascular smooth muscle cell (VSMC) migration into the vascular lesion. Moreover, UHMK1 may control cell migration and neurite outgrowth by interacting with and phosphorylating the splicing factor SF1, thereby probably contributing to the control of protein expression. Furthermore, UHMK1 may be functionally related to microtubule dynamics and axon development. It localizes to RNA granules, interacts with three proteins found in RNA granules (KIF3A, NonO, and eEF1A), and further enhances the local translation. UHMK1 is highly expressed in regions of the brain implicated in schizophrenia and may play a role in susceptibility to schizophrenia.	88
409899	cd12466	RRM2_AtRSp31_like	RNA recognition motif 2 (RRM2) found in Arabidopsis thaliana arginine/serine-rich-splicing factor RSp31 and similar proteins from plants. This subgroup corresponds to the RRM2 in a family that represents a novel group of arginine/serine (RS) or serine/arginine (SR) splicing factors existing in plants, such as A. thaliana RSp31, RSp35, RSp41 and similar proteins. Like vertebrate RS splicing factors, these proteins function as plant splicing factors and play crucial roles in constitutive and alternative splicing in plants. They all contain two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), at their N-terminus, and an RS domain at their C-terminus.	70
240913	cd12467	RRM_Srp1p_like	RNA recognition motif 1 (RRM1) found in fission yeast pre-mRNA-splicing factor Srp1p and similar proteins. This subgroup corresponds to the RRM domain in Srp1p encoded by gene srp1 from fission yeast Schizosaccharomyces pombe. It plays a role in the pre-mRNA splicing process, but is not essential for growth. Srp1p is closely related to the SR protein family found in metazoa. It contains an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), a glycine hinge and an RS domain in the middle, and a C-terminal domain. Some family members also contain another RRM domain.	78
409900	cd12470	RRM1_MSSP1	RNA recognition motif 1 (RRM1) found in vertebrate single-stranded DNA-binding protein MSSP-1. This subgroup corresponds to the RRM1 of MSSP-1, also termed RNA-binding motif, single-stranded-interacting protein 1 (RBMS1), or suppressor of CDC2 with RNA-binding motif 2 (SCR2), a double- and single-stranded DNA binding protein that belongs to the c-myc single-strand binding proteins (MSSP) family. It specifically recognizes the sequence CT(A/T)(A/T)T, and stimulates DNA replication in the system using SV40 DNA. MSSP-1 is identical with Scr2, a human protein which complements the defect of cdc2 kinase in Schizosaccharomyces pombe. MSSP-1 has been implied in regulating DNA replication, transcription, apoptosis induction, and cell-cycle movement, via the interaction with C-MYC, the product of protooncogene c-myc. MSSP-1 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), both of which are responsible for the specific DNA binding activity as well as induction of apoptosis. 	86
409901	cd12471	RRM1_MSSP2	RNA recognition motif 1 (RRM1) found in vertebrate single-stranded DNA-binding protein MSSP-2. This subgroup corresponds to the RRM1 of MSSP-2, also termed RNA-binding motif, single-stranded-interacting protein 2 (RBMS2), or suppressor of CDC2 with RNA-binding motif 3 (SCR3), a double- and single-stranded DNA binding protein that belongs to the c-myc single-strand binding proteins (MSSP) family. It specifically recognizes the sequence T(C/A)TT, and stimulates DNA replication in the system using SV40 DNA. MSSP-2 is identical with Scr3, a human protein which complements the defect of cdc2 kinase in Schizosaccharomyces pombe. MSSP-2 has been implied in regulating DNA replication, transcription, apoptosis induction, and cell-cycle movement, via the interaction with C-MYC, the product of protooncogene c-myc. MSSP-2 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), both of which are responsible for the specific DNA binding activity as well as induction of apoptosis. 	84
409902	cd12472	RRM1_RBMS3	RNA recognition motif 1 (RRM1) found in vertebrate RNA-binding motif, single-stranded-interacting protein 3 (RBMS3). This subgroup corresponds to the RRM1 of RBMS3, a new member of the c-myc gene single-strand binding proteins (MSSP) family of DNA regulators. Unlike other MSSP proteins, RBMS3 is not a transcriptional regulator. It binds with high affinity to A/U-rich stretches of RNA, and to A/T-rich DNA sequences, and functions as a regulator of cytoplasmic activity. RBMS3 contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and its C-terminal region is acidic and enriched in prolines, glutamines and threonines. 	80
409903	cd12473	RRM2_MSSP1	RNA recognition motif 2 (RRM2) found in vertebrate single-stranded DNA-binding protein MSSP-1. This subgroup corresponds to the RRM2 of MSSP-1, also termed RNA-binding motif, single-stranded-interacting protein 1 (RBMS1), or suppressor of CDC2 with RNA-binding motif 2 (SCR2). MSSP-1 is a double- and single-stranded DNA binding protein that belongs to the c-myc single-strand binding proteins (MSSP) family. It specifically recognizes the sequence CT(A/T)(A/T)T, and stimulates DNA replication in the system using SV40 DNA. MSSP-1 is identical with Scr2, a human protein which complements the defect of cdc2 kinase in Schizosaccharomyces pombe. MSSP-1 has been implied in regulating DNA replication, transcription, apoptosis induction, and cell-cycle movement, via the interaction with c-MYC, the product of protooncogene c-myc. MSSP-1 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), both of which are responsible for the specific DNA binding activity as well as induction of apoptosis. 	85
409904	cd12474	RRM2_MSSP2	RNA recognition motif 2 (RRM2) found in vertebrate single-stranded DNA-binding protein MSSP-2. This subgroup corresponds to the RRM2 of MSSP-2, also termed RNA-binding motif, single-stranded-interacting protein 2 (RBMS2), or suppressor of CDC2 with RNA-binding motif 3 (SCR3). MSSP-2 is a double- and single-stranded DNA binding protein that belongs to the c-myc single-strand binding proteins (MSSP) family. It specifically recognizes the sequence T(C/A)TT, and stimulates DNA replication in the system using SV40 DNA. MSSP-2 is identical with Scr3, a human protein which complements the defect of cdc2 kinase in Schizosaccharomyces pombe. MSSP-2 has been implied in regulating DNA replication, transcription, apoptosis induction, and cell-cycle movement, via the interaction with C-MYC, the product of protooncogene c-myc. MSSP-2 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), both of which are responsible for the specific DNA binding activity as well as induction of apoptosis. 	86
240919	cd12475	RRM2_RBMS3	RNA recognition motif 2 (RRM2) found in vertebrate RNA-binding motif, single-stranded-interacting protein 3 (RBMS3). This subgroup corresponds to the RRM2 of RBMS3, a new member of the c-myc gene single-strand binding proteins (MSSP) family of DNA regulators. Unlike other MSSP proteins, RBMS3 is not a transcriptional regulator. It binds with high affinity to A/U-rich stretches of RNA, and to A/T-rich DNA sequences, and functions as a regulator of cytoplasmic activity. RBMS3 contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and its C-terminal region is acidic and enriched in prolines, glutamines and threonines. 	88
409905	cd12476	RRM1_SNF	RNA recognition motif 1 (RRM1) found in Drosophila melanogaster sex determination protein SNF and similar proteins. This subgroup corresponds to the RRM1 of SNF (Sans fille), also termed U1 small nuclear ribonucleoprotein A (U1 snRNP A or U1-A or U1A), an RNA-binding protein found in the U1 and U2 snRNPs of Drosophila. It is essential in Drosophila sex determination and possesses a novel dual RNA binding specificity. SNF binds with high affinity to both Drosophila U1 snRNA stem-loop II (SLII) and U2 snRNA stem-loop IV (SLIV). It can also bind to poly(U) RNA tracts flanking the alternatively spliced Sex-lethal (Sxl) exon, as does Drosophila Sex-lethal protein (SXL). SNF contains two RNA recognition motifs (RRMs); it can self-associate through RRM1, and each RRM can recognize poly(U) RNA binding independently. 	85
409906	cd12477	RRM1_U1A	RNA recognition motif 1 (RRM1) found in vertebrate U1 small nuclear ribonucleoprotein A (U1A). This subgroup corresponds to the RRM1 of U1A (also termed U1 snRNP A or U1-A), an RNA-binding protein associated with the U1 snRNP, a small RNA-protein complex involved in pre-mRNA splicing. U1A binds with high affinity and specificity to stem-loop II (SLII) of U1 snRNA. It is predominantly a nuclear protein and it also shuttles between the nucleus and the cytoplasm independently of interactions with U1 snRNA. U1A may be involved in RNA 3'-end processing, specifically cleavage, splicing and polyadenylation, through interacting with a large number of non-snRNP proteins, including polypyrimidine tract binding protein (PTB), polypyrimidine-tract binding protein-associated factor (PSF), and non-POU-domain-containing, octamer-binding (NONO), DEAD (Asp-Glu-Ala-Asp) box polypeptide 5 (DDX5). It also binds to a flavivirus NS5 protein and plays an important role in virus replication. U1A contains two RNA recognition motifs (RRMs); the N-terminal RRM (RRM1) binds tightly and specifically to the U1 snRNA SLII and its own 3'-UTR, while in contrast, the C-terminal RRM (RRM2) does not appear to associate with any RNA and may be free to bind other proteins. U1A also contains a proline-rich region, and a nuclear localization signal (NLS) in the central domain that is responsible for its nuclear import. 	89
409907	cd12478	RRM1_U2B	RNA recognition motif 1 in U2 small nuclear ribonucleoprotein B" (U2B") and similar proteins. This subgroup corresponds to the RRM1 of U2B" (also termed U2 snRNP B"), a unique protein that comprises the U2 snRNP. It was initially identified as binding to stem-loop IV (SLIV) at the 3' end of U2 snRNA. Additional research indicates U2B" binds to U1 snRNA stem-loop II (SLII) as well and shows no preference for SLIV or SLII on the basis of binding affinity. U2B" does not require an auxiliary protein for binding to RNA. In addition, the nuclear transport of U2B" is independent of U2 snRNA binding. U2B" contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). It also contains a nuclear localization signal (NLS) in the central domain. However, nuclear import of U2B" does not depend on this NLS. The N-terminal RRM is sufficient to direct U2B" to the nucleus. 	91
240923	cd12479	RRM2_SNF	RNA recognition motif 2 (RRM2) found in Drosophila melanogaster sex determination protein SNF and similar proteins. This subgroup corresponds to the RRM2 of SNF (Sans fille), also termed U1 small nuclear ribonucleoprotein A (U1 snRNP A or U1-A or U1A), an RNA-binding protein found in the U1 and U2 snRNPs of Drosophila. It is essential in Drosophila sex determination and possesses a novel dual RNA binding specificity. SNF binds with high affinity to both Drosophila U1 snRNA stem-loop II (SLII) and U2 snRNA stem-loop IV (SLIV). It can also bind to poly(U) RNA tracts flanking the alternatively spliced Sex-lethal (Sxl) exon, as does Drosophila Sex-lethal protein (SXL). SNF contains two RNA recognition motifs (RRMs); it can self-associate through RRM1, and each RRM can recognize poly(U) RNA binding independently. 	80
409908	cd12480	RRM2_U1A	RNA recognition motif 2 (RRM2) found in vertebrate U1 small nuclear ribonucleoprotein A (U1 snRNP A or U1-A or U1A). This subgroup corresponds to the RRM2 of U1A (also termed U1 snRNP A or U1-A), an RNA-binding protein associated with the U1 snRNP, a small RNA-protein complex involved in pre-mRNA splicing. U1A binds with high affinity and specificity to stem-loop II (SLII) of U1 snRNA. It is predominantly a nuclear protein that shuttles between the nucleus and the cytoplasm independently of interactions with U1 snRNA. U1A may be involved in RNA 3'-end processing, specifically cleavage, splicing and polyadenylation, through interacting with a large number of non-snRNP proteins, including polypyrimidine tract binding protein (PTB), polypyrimidine-tract binding protein-associated factor (PSF), and non-POU-domain-containing, octamer-binding (NONO), DEAD (Asp-Glu-Ala-Asp) box polypeptide 5 (DDX5). U1A also binds to a flavivirus NS5 protein and plays an important role in virus replication. It contains two RNA recognition motifs (RRMs); the N-terminal RRM (RRM1) binds tightly and specifically to the U1 snRNA SLII and its own 3'-UTR, while in contrast, the C-terminal RRM (RRM2) does not appear to associate with any RNA and it may be free for binding other proteins. U1A also contains a proline-rich region, and a nuclear localization signal (NLS) in the central domain that is responsible for its nuclear import. 	86
240925	cd12481	RRM2_U2B	RNA recognition motif 2 (RRM2) found in vertebrate U2 small nuclear ribonucleoprotein B" (U2B"). This subgroup corresponds to the RRM2 of U2B" (also termed U2 snRNP B"), a unique protein that comprises the U2 snRNP. It was initially identified as binding to stem-loop IV (SLIV) at the 3' end of U2 snRNA. Additional research indicates U2B" binds to U1 snRNA stem-loop II (SLII) as well and shows no preference for SLIV or SLII on the basis of binding affinity. U2B" does not require an auxiliary protein for binding to RNA and its nuclear transport is independent of U2 snRNA binding. U2B" contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). It also contains a nuclear localization signal (NLS) in the central domain. However, nuclear import of U2B" does not depend on this NLS. The N-terminal RRM is sufficient to direct U2B" to the nucleus. 	80
409909	cd12482	RRM1_hnRNPR	RNA recognition motif 1 (RRM1) found in vertebrate heterogeneous nuclear ribonucleoprotein R (hnRNP R). This subgroup corresponds to the RRM1 of hnRNP R, which is a ubiquitously expressed nuclear RNA-binding protein that specifically binds mRNAs with a preference for poly(U) stretches. Upon binding of RNA, hnRNP R forms oligomers, most probably dimers. hnRNP R has been implicated in mRNA processing and mRNA transport, and also acts as a regulator to modify binding to ribosomes and RNA translation. It is predominantly located in axons of motor neurons and to a much lower degree in sensory axons. In axons of motor neurons, it also functions as a cytosolic protein and interacts with wild type of survival motor neuron (SMN) proteins directly, further providing a molecular link between SMN and the spliceosome. Moreover, hnRNP R plays an important role in neural differentiation and development, and in retinal development and light-elicited cellular activities. hnRNP R contains an acidic auxiliary N-terminal region, followed by two well defined and one degenerated RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a C-terminal RGG motif; it binds RNA through its RRM domains. 	79
409910	cd12483	RRM1_hnRNPQ	RNA recognition motif 1 (RRM1) found in vertebrate heterogeneous nuclear ribonucleoprotein Q (hnRNP Q).  This subgroup corresponds to the RRM1 of hnRNP Q, also termed glycine- and tyrosine-rich RNA-binding protein (GRY-RBP), or NS1-associated protein 1 (NASP1), or synaptotagmin-binding, cytoplasmic RNA-interacting protein (SYNCRIP). It is a ubiquitously expressed nuclear RNA-binding protein identified as a component of the spliceosome complex, as well as a component of the apobec-1 editosome. As an alternatively spliced version of NSAP, it acts as an interaction partner of a multifunctional protein required for viral replication, and is implicated in the regulation of specific mRNA transport. hnRNP Q has also been identified as SYNCRIP, a dual functional protein participating in both viral RNA replication and translation. As a synaptotagmin-binding protein, hnRNP Q plays a putative role in organelle-based mRNA transport along the cytoskeleton. Moreover, hnRNP Q has been found in protein complexes involved in translationally coupled mRNA turnover and mRNA splicing. It functions as a wild-type survival motor neuron (SMN)-binding protein that may participate in pre-mRNA splicing and modulate mRNA transport along microtubuli. hnRNP Q contains an acidic auxiliary N-terminal region, followed by two well-defined and one degenerated RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a C-terminal RGG motif; hnRNP Q binds RNA through its RRM domains.	84
409911	cd12484	RRM1_RBM46	RNA recognition motif 1 (RRM1) found in vertebrate RNA-binding protein 46 (RBM46). This subgroup corresponds to the RRM1 of RBM46, also termed cancer/testis antigen 68 (CT68), a putative RNA-binding protein that shows high sequence homology with heterogeneous nuclear ribonucleoprotein R (hnRNP R) and heterogeneous nuclear ribonucleoprotein Q (hnRNP Q). Its biological function remains unclear. Like hnRNP R and hnRNP Q, RBM46 contains two well-defined and one degenerated RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 	78
240929	cd12485	RRM1_RBM47	RNA recognition motif 1 (RRM1) found in vertebrate RNA-binding protein 47 (RBM47). This subgroup corresponds to the RRM1 of RBM47, a putative RNA-binding protein that shows high sequence homology with heterogeneous nuclear ribonucleoprotein R (hnRNP R) and heterogeneous nuclear ribonucleoprotein Q (hnRNP Q). Its biological function remains unclear. Like hnRNP R and hnRNP Q, RBM47 contains two well-defined and one degenerated RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 	78
409912	cd12486	RRM1_ACF	RNA recognition motif 1 (RRM1) found in vertebrate APOBEC-1 complementation factor (ACF). This subgroup corresponds to the RRM1 of ACF, also termed APOBEC-1-stimulating protein, an RNA-binding subunit of a core complex that interacts with apoB mRNA to facilitate C to U RNA editing. It may also act as an apoB mRNA recognition factor and chaperone, and play a key role in cell growth and differentiation. ACF shuttles between the cytoplasm and nucleus. It contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), which display high affinity for an 11 nucleotide AU-rich mooring sequence 3' of the edited cytidine in apoB mRNA. All three RRMs may be required for complementation of editing activity in living cells. RRM2/3 are implicated in ACF interaction with APOBEC-1. 	78
409913	cd12487	RRM1_DND1	RNA recognition motif 1 (RRM1) found in vertebrate dead end protein homolog 1 (DND1). This subgroup corresponds to the RRM1 of DND1, also termed RNA-binding motif, single-stranded-interacting protein 4, an RNA-binding protein that is essential for maintaining viable germ cells in vertebrates. It interacts with the 3'-untranslated region (3'-UTR) of multiple messenger RNAs (mRNAs) and prevents micro-RNA (miRNA) mediated repression of mRNA. For instance, DND1 binds cell cycle inhibitor, P27 (p27Kip1, CDKN1B), and cell cycle regulator and tumor suppressor, LATS2 (large tumor suppressor, homolog 2 of Drosophila). It helps maintain their protein expression through blocking the inhibitory function of microRNAs (miRNA) from these transcripts. DND1 may also impose another level of translational regulation to modulate expression of critical factors in embryonic stem (ES) cells. DND1 interacts specifically with apolipoprotein B editing complex 3 (APOBEC3), a multi-functional protein inhibiting retroviral replication. The DND1-APOBEC3 interaction may play a role in maintaining viability of germ cells and for preventing germ cell tumor development. DND1 contains two conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 	78
240932	cd12488	RRM2_hnRNPR	RNA recognition motif 2 (RRM2) found in vertebrate heterogeneous nuclear ribonucleoprotein R (hnRNP R). This subgroup corresponds to the RRM2 of hnRNP R, a ubiquitously expressed nuclear RNA-binding protein that specifically binds mRNAs with a preference for poly(U) stretches. Upon binding of RNA, hnRNP R forms oligomers, most probably dimers. hnRNP R has been implicated in mRNA processing and mRNA transport, and also acts as a regulator to modify binding to ribosomes and RNA translation. hnRNP R is predominantly located in axons of motor neurons and to a much lower degree in sensory axons. In axons of motor neurons, it also functions as a cytosolic protein and interacts with wild type of survival motor neuron (SMN) proteins directly, further providing a molecular link between SMN and the spliceosome. Moreover, hnRNP R plays an important role in neural differentiation and development, as well as in retinal development and light-elicited cellular activities. It contains an acidic auxiliary N-terminal region, followed by two well-defined and one degenerated RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a C-terminal RGG motif. hnRNP R binds RNA through its RRM domains. 	85
240933	cd12489	RRM2_hnRNPQ	RNA recognition motif 2 (RRM2) found in vertebrate heterogeneous nuclear ribonucleoprotein Q (hnRNP Q). This subgroup corresponds to the RRM2 of hnRNP Q, also termed glycine- and tyrosine-rich RNA-binding protein (GRY-RBP), or NS1-associated protein 1 (NASP1), or synaptotagmin-binding, cytoplasmic RNA-interacting protein (SYNCRIP). It is a ubiquitously expressed nuclear RNA-binding protein identified as a component of the spliceosome complex, as well as a component of the apobec-1 editosome. As an alternatively spliced version of NSAP, it acts as an interaction partner of a multifunctional protein required for viral replication, and is implicated in the regulation of specific mRNA transport. hnRNP Q has also been identified as SYNCRIP, a dual functional protein participating in both viral RNA replication and translation. As a synaptotagmin-binding protein, hnRNP Q plays a putative role in organelle-based mRNA transport along the cytoskeleton. Moreover, hnRNP Q has been found in protein complexes involved in translationally coupled mRNA turnover and mRNA splicing. It functions as a wild-type survival motor neuron (SMN)-binding protein that may participate in pre-mRNA splicing and modulate mRNA transport along microtubuli. hnRNP Q contains an acidic auxiliary N-terminal region, followed by two well-defined and one degenerated RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a C-terminal RGG motif; hnRNP Q binds RNA through its RRM domains. 	85
409914	cd12490	RRM2_ACF	RNA recognition motif 2 (RRM2) found in vertebrate APOBEC-1 complementation factor (ACF). This subgroup corresponds to the RRM2 of ACF, also termed APOBEC-1-stimulating protein, an RNA-binding subunit of a core complex that interacts with apoB mRNA to facilitate C to U RNA editing. It may also act as an apoB mRNA recognition factor and chaperone and play a key role in cell growth and differentiation. ACF shuttles between the cytoplasm and nucleus. ACF contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), which display high affinity for an 11 nucleotide AU-rich mooring sequence 3' of the edited cytidine in apoB mRNA. All three RRMs may be required for complementation of editing activity in living cells. RRM2/3 are implicated in ACF interaction with APOBEC-1. 	89
409915	cd12491	RRM2_RBM47	RNA recognition motif 2 (RRM2) found in vertebrate RNA-binding protein 47 (RBM47). This subgroup corresponds to the RRM2 of RBM47, a putative RNA-binding protein that shows high sequence homology with heterogeneous nuclear ribonucleoprotein R (hnRNP R) and heterogeneous nuclear ribonucleoprotein Q (hnRNP Q). Its biological function remains unclear. Like hnRNP R and hnRNP Q, RBM47 contains two well-defined and one degenerated RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 	95
240936	cd12492	RRM2_RBM46	RNA recognition motif 2 (RRM2) found in vertebrate RNA-binding protein 46 (RBM46). This subgroup corresponds to the RRM2 of RBM46, also termed cancer/testis antigen 68 (CT68). It is a putative RNA-binding protein that shows high sequence homology with heterogeneous nuclear ribonucleoprotein R (hnRNP R) and heterogeneous nuclear ribonucleoprotein Q (hnRNP Q). Its biological function remains unclear. Like hnRNP R and hnRNP Q, RBM46 contains two well-defined and one degenerated RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 	85
409916	cd12493	RRM2_DND1	RNA recognition motif 2 (RRM2) found in vertebrate dead end protein homolog 1 (DND1). This subgroup corresponds to the RRM2 of DND1, also termed RNA-binding motif, single-stranded-interacting protein 4. It is an RNA-binding protein that is essential for maintaining viable germ cells in vertebrates. It interacts with the 3'-untranslated region (3'-UTR) of multiple messenger RNAs (mRNAs) and prevents micro-RNA (miRNA) mediated repression of mRNA. For instance, DND1 binds cell cycle inhibitor, P27 (p27Kip1, CDKN1B), and cell cycle regulator and tumor suppressor, LATS2 (large tumor suppressor, homolog 2 of Drosophila). It helps maintain their protein expression through blocking the inhibitory function of microRNAs (miRNA) from these transcripts. DND1 may also impose another level of translational regulation to modulate expression of critical factors in embryonic stem (ES) cells. Moreover, DND1 interacts specifically with apolipoprotein B editing complex 3 (APOBEC3), a multi-functional protein inhibiting retroviral replication. The DND1-APOBEC3 interaction may play a role in maintaining viability of germ cells and for preventing germ cell tumor development. DND1 contains two conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 	83
409917	cd12494	RRM3_hnRNPR	RNA recognition motif 3 (RRM3) found in vertebrate heterogeneous nuclear ribonucleoprotein R (hnRNP R). This subgroup corresponds to the RRM3 of hnRNP R, a ubiquitously expressed nuclear RNA-binding protein that specifically binds mRNAs with a preference for poly(U) stretches. Upon binding of RNA, hnRNP R forms oligomers, most probably dimers. hnRNP R has been implicated in mRNA processing and mRNA transport, and also acts as a regulator to modify binding to ribosomes and RNA translation. hnRNP R is predominantly located in axons of motor neurons and to a much lower degree in sensory axons. In axons of motor neurons, it also functions as a cytosolic protein and interacts with wild type of survival motor neuron (SMN) proteins directly, further providing a molecular link between SMN and the spliceosome. Moreover, hnRNP R plays an important role in neural differentiation and development, as well as in retinal development and light-elicited cellular activities. hnRNP R contains an acidic auxiliary N-terminal region, followed by two well-defined and one degenerated RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a C-terminal RGG motif; hnRNP R binds RNA through its RRM domains. 	72
409918	cd12495	RRM3_hnRNPQ	RNA recognition motif 3 (RRM3) found in vertebrate heterogeneous nuclear ribonucleoprotein Q (hnRNP Q). This subgroup corresponds to the RRM3 of hnRNP Q, also termed glycine- and tyrosine-rich RNA-binding protein (GRY-RBP), or NS1-associated protein 1 (NASP1), or synaptotagmin-binding, cytoplasmic RNA-interacting protein (SYNCRIP). It is a ubiquitously expressed nuclear RNA-binding protein identified as a component of the spliceosome complex, as well as a component of the apobec-1 editosome. As an alternatively spliced version of NSAP, it acts as an interaction partner of a multifunctional protein required for viral replication, and is implicated in the regulation of specific mRNA transport. hnRNP Q has also been identified as SYNCRIP that is a dual functional protein participating in both viral RNA replication and translation. As a synaptotagmin-binding protein, hnRNP Q plays a putative role in organelle-based mRNA transport along the cytoskeleton. Moreover, hnRNP Q has been found in protein complexes involved in translationally coupled mRNA turnover and mRNA splicing. It functions as a wild-type survival motor neuron (SMN)-binding protein that may participate in pre-mRNA splicing and modulate mRNA transport along microtubuli. hnRNP Q contains an acidic auxiliary N-terminal region, followed by two well defined and one degenerated RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a C-terminal RGG motif; hnRNP Q binds RNA through its RRM domains. 	72
409919	cd12496	RRM3_RBM46	RNA recognition motif 3 (RRM3) found in vertebrate RNA-binding protein 46 (RBM46). This subgroup corresponds to the RRM3 of RBM46, also termed cancer/testis antigen 68 (CT68), a putative RNA-binding protein that shows high sequence homology with heterogeneous nuclear ribonucleoprotein R (hnRNP R) and heterogeneous nuclear ribonucleoprotein Q (hnRNP Q). Its biological function remains unclear. Like hnRNP R and hnRNP Q, RBM46 contains two well defined and one degenerated RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 	74
409920	cd12497	RRM3_RBM47	RNA recognition motif 3 (RRM3) found in vertebrate RNA-binding protein 47 (RBM47). This subgroup corresponds to the RRM3 of RBM47, a putative RNA-binding protein that shows high sequence homology with heterogeneous nuclear ribonucleoprotein R (hnRNP R) and heterogeneous nuclear ribonucleoprotein Q (hnRNP Q). Its biological function remains unclear. Like hnRNP R and hnRNP Q, RBM47 contains two well defined and one degenerated RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 	74
409921	cd12498	RRM3_ACF	RNA recognition motif 3 (RRM3) found in vertebrate APOBEC-1 complementation factor (ACF). This subgroup corresponds to the RRM3 of ACF, also termed APOBEC-1-stimulating protein, an RNA-binding subunit of a core complex that interacts with apoB mRNA to facilitate C to U RNA editing. It may also act as an apoB mRNA recognition factor and chaperone and play a key role in cell growth and differentiation. ACF shuttles between the cytoplasm and nucleus. ACF contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), which display high affinity for an 11 nucleotide AU-rich mooring sequence 3' of the edited cytidine in apoB mRNA. All three RRMs may be required for complementation of editing activity in living cells. RRM2/3 are implicated in ACF interaction with APOBEC-1. 	83
409922	cd12499	RRM_EcCsdA_like	RNA recognition motif (RRM) found in Escherichia coli cold-shock DEAD box protein A (CsdA) and similar proteins. This subgroup corresponds to the C-terminal RRM homology domain of E. coli CsdA, also termed ATP-dependent RNA helicase deaD, or translation factor W2, a member of the DbpA subfamily of prokaryotic DEAD-box rRNA helicases that have been implicated in ribosome biogenesis. CsdA may be involved in translation initiation, gene regulation after cold-shock, mRNA decay and biogenesis of the large or small ribosomal subunit. It contains two N-terminal ATPase catalytic domains and a C-terminal RNA binding domain, an atypical RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNPs (ribonucleoprotein domain). The catalytic domains bind to nearby regions of RNA to stimulate ATP hydrolysis and disrupt RNA structures. The C-terminal domain is responsible for the high-affinity RNA binding.	73
409923	cd12500	RRM_BsYxiN_like	RNA recognition motif (RRM) found in Bacillus subtilis ATP-dependent RNA helicase YxiN and similar proteins. This subgroup corresponds to the C-terminal RRM homology domain of YxiN. B. subtilis YxiN is a member of the DbpA subfamily of prokaryotic DEAD-box rRNA helicases that have been implicated in ribosome biogenesis. It binds with high affinity and specificity to RNA substrates containing hairpin 92 of 23S rRNA (HP92) with either 3' or 5' extensions in an ATP-dependent manner. YxiN contains two N-terminal ATPase catalytic domains and a C-terminal RNA binding domain, an atypical RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNPs (ribonucleoprotein domain). The catalytic domains bind to nearby regions of RNA to stimulate ATP hydrolysis and disrupt RNA structures. The C-terminal domain is responsible for the high-affinity RNA binding. 	73
409924	cd12501	RRM_EcDbpA_like	RNA recognition motif (RRM) found in Escherichia coli RNA helicase dbpA and similar proteins. This subgroup corresponds to the C-terminal RRM homology domain of dbpA. E. coli dbpA is a member of the DbpA subfamily of prokaryotic DEAD-box rRNA helicases that have been implicated in ribosome biogenesis. It binds with high affinity and specificity for RNA substrates containing hairpin 92 of 23S rRNA (HP92) with either 3' or 5' extensions. As a non-processive ATP-dependent helicase, DbpA destabilizes and unwinds short <9bp (base pairs) RNA duplexes as well as long duplex RNA stretches. It disrupts RNA helices exclusively in a 3'- 5' direction and requires a single-stranded loading site 3' of the substrate helix. dbpA contains two N-terminal ATPase catalytic domains and a C-terminal RNA binding domain, an atypical RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNPs (ribonucleoprotein domain). The catalytic domains bind to nearby regions of RNA to stimulate ATP hydrolysis and disrupt RNA structures. The C-terminal domain binds specifically to hairpin 92.	73
409925	cd12502	RRM2_RMB19	RNA recognition motif 2 (RRM2) found in RNA-binding protein 19 (RBM19) and similar proteins. This subfamily corresponds to the RRM2 of RBM19, also termed RNA-binding domain-1 (RBD-1), a nucleolar protein conserved in eukaryotes. It is involved in ribosome biogenesis by processing rRNA and is also essential for preimplantation development. RBM19 has a unique domain organization containing 6 conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 	72
409926	cd12503	RRM1_hnRNPH_GRSF1_like	RNA recognition motif 1 (RRM1) found in heterogeneous nuclear ribonucleoprotein (hnRNP) H protein family, G-rich sequence factor 1 (GRSF-1) and similar proteins. This subfamily corresponds to the RRM1 of hnRNP H proteins and GRSF-1. The hnRNP H protein family includes hnRNP H (also termed mcs94-1), hnRNP H2 (also termed FTP-3 or hnRNP H'), hnRNP F and hnRNP H3 (also termed hnRNP 2H9), which represent a group of nuclear RNA binding proteins that are involved in pre-mRNA processing. These proteins have similar RNA binding affinities and specifically recognize the sequence GGGA. They can either stimulate or repress splicing upon binding to a GGG motif. hnRNP H binds to the RNA substrate in the presence or absence of these proteins, whereas hnRNP F binds to the nuclear mRNA only in the presence of cap-binding proteins. hnRNP H and hnRNP H2 are almost identical; both have been found to bind nuclear-matrix proteins. hnRNP H activates exon inclusion by binding G-rich intronic elements downstream of the 5' splice site in the transcripts of c-src, human immunodeficiency virus type 1 (HIV-1), Bcl-X, GRIN1, and myelin. It silences exons when bound to exonic elements in the transcripts of beta-tropomyosin, HIV-1, and alpha-tropomyosin. hnRNP H2 has been implicated in pre-mRNA 3' end formation. hnRNP H3 may be involved in splicing arrest induced by heat shock. Most family members contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), except for hnRNP H3, in which the RRM1 is absent. RRM1 and RRM2 are responsible for the binding to the RNA at DGGGD motifs, and play an important role in efficiently silencing the exon. Members in this family can regulate the alternative splicing of fibroblast growth factor receptor 2 (FGFR2) transcripts, and function as silencers of FGFR2 exon IIIc through an interaction with the exonic GGG motifs. The lack of RRM1 could account for the reduced silencing activity within hnRNP H3. Members in this family have an extensive glycine-rich region near the C-terminus, which may allow them to homo- or heterodimerize. They also include a cytoplasmic poly(A)+ mRNA binding protein, GRSF-1, which interacts with RNA in a G-rich element-dependent manner. They may function in RNA packaging, stabilization of RNA secondary structure, or other macromolecular interactions. GRSF-1 contains three potential RRMs responsible for the RNA binding, and two auxiliary domains (an acidic alpha-helical domain and an N-terminal alanine-rich region) that may play a role in protein-protein interactions and provide binding specificity. 	77
409927	cd12504	RRM2_hnRNPH_CRSF1_like	RNA recognition motif 2 (RRM2) found in heterogeneous nuclear ribonucleoprotein (hnRNP) H protein family. This subfamily corresponds to the RRM2 of hnRNP H protein family which includes hnRNP H (also termed mcs94-1), hnRNP H2 (also termed FTP-3 or hnRNP H'), hnRNP F and hnRNP H3 (also termed hnRNP 2H9). They represent a group of nuclear RNA binding proteins that are involved in pre-mRNA processing, having similar RNA binding affinities and specifically recognizing the sequence GGGA. They can either stimulate or repress splicing upon binding to a GGG motif. hnRNP H binds to the RNA substrate in the presence or absence of these proteins, whereas hnRNP F binds to the nuclear mRNA only in the presence of cap-binding proteins. Furthermore, hnRNP H and hnRNP H2 are almost identical; both have been found to bind nuclear-matrix proteins. hnRNP H activates exon inclusion by binding G-rich intronic elements downstream of the 5' splice site in the transcripts of c-src, human immunodeficiency virus type 1 (HIV-1), Bcl-X, GRIN1, and myelin. It silences exons when bound to exonic elements in the transcripts of beta-tropomyosin, HIV-1, and alpha-tropomyosin. hnRNP H2 has been implicated in pre-mRNA 3' end formation. hnRNP H3 may be involved in the splicing arrest induced by heat shock. Most family members contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), except for hnRNP H3, in which the RRM1 is absent. RRM1 and RRM2 are responsible for the binding to the RNA at DGGGD motifs, and they play an important role in efficiently silencing the exon. Members in this family can regulate the alternative splicing of the fibroblast growth factor receptor 2 (FGFR2) transcripts, and function as silencers of FGFR2 exon IIIc through an interaction with the exonic GGG motifs. The lack of RRM1 could account for the reduced silencing activity within hnRNP H3. In addition, the family members have an extensive glycine-rich region near the C-terminus, which may allow them to homo- or heterodimerize. The family also includes a cytoplasmic poly(A)+ mRNA binding protein, GRSF-1, which interacts with RNA in a G-rich element-dependent manner. It may function in RNA packaging, stabilization of RNA secondary structure, or other macromolecular interactions. GRSF-1 also contains three potential RRMs responsible for the RNA binding, and two auxiliary domains (an acidic alpha-helical domain and an N-terminal alanine-rich region) that may play a role in protein-protein interactions and provide binding specificity.	77
409928	cd12505	RRM2_GRSF1	RNA recognition motif 2 (RRM2) found in G-rich sequence factor 1 (GRSF-1) and similar proteins. This subfamily corresponds to the RRM2 of GRSF-1, a cytoplasmic poly(A)+ mRNA binding protein which interacts with RNA in a G-rich element-dependent manner. It may function in RNA packaging, stabilization of RNA secondary structure, or other macromolecular interactions. GRSF-1 contains three potential RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), which are responsible for the RNA binding. In addition, GRSF-1 has two auxiliary domains, an acidic alpha-helical domain and an N-terminal alanine-rich region, that may play a role in protein-protein interactions and provide binding specificity. 	77
409929	cd12506	RRM3_hnRNPH_CRSF1_like	RNA recognition motif 3 (RRM3) found in heterogeneous nuclear ribonucleoprotein hnRNP H protein family, G-rich sequence factor 1 (GRSF-1) and similar proteins. This subfamily corresponds to the RRM3 of hnRNP H proteins and GRSF-1. The hnRNP H protein family includes hnRNP H (also termed mcs94-1), hnRNP H2 (also termed FTP-3 or hnRNP H'), hnRNP F and hnRNP H3 (also termed hnRNP 2H9), which represent a group of nuclear RNA binding proteins that are involved in pre-mRNA processing. These proteins have similar RNA binding affinities and specifically recognize the sequence GGGA. They can either stimulate or repress splicing upon binding to a GGG motif. hnRNP H binds to the RNA substrate in the presence or absence of these proteins, whereas hnRNP F binds to the nuclear mRNA only in the presence of cap-binding proteins. hnRNP H and hnRNP H2 are almost identical; both have been found to bind nuclear-matrix proteins. hnRNP H activates exon inclusion by binding G-rich intronic elements downstream of the 5' splice site in the transcripts of c-src, human immunodeficiency virus type 1 (HIV-1), Bcl-X, GRIN1, and myelin. It silences exons when bound to exonic elements in the transcripts of beta-tropomyosin, HIV-1, and alpha-tropomyosin. hnRNP H2 has been implicated in pre-mRNA 3' end formation. hnRNP H3 may be involved in the splicing arrest induced by heat shock. Most family members contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), except for hnRNP H3, in which the RRM1 is absent. RRM1 and RRM2 are responsible for the binding to the RNA at DGGGD motifs, and they play an important role in efficiently silencing the exon. For instance, members in this family can regulate the alternative splicing of the fibroblast growth factor receptor 2 (FGFR2) transcripts, and function as silencers of FGFR2 exon IIIc through an interaction with the exonic GGG motifs. The lack of RRM1 could account for the reduced silencing activity within hnRNP H3. In addition, the family members have an extensive glycine-rich region near the C-terminus, which may allow them to homo- or heterodimerize. The family also includes a cytoplasmic poly(A)+ mRNA binding protein, GRSF-1, which interacts with RNA in a G-rich element-dependent manner. It may function in RNA packaging, stabilization of RNA secondary structure, or other macromolecular interactions. GRSF-1 also contains three potential RRMs responsible for the RNA binding, and two auxiliary domains (an acidic alpha-helical domain and an N-terminal alanine-rich region) that may play a role in protein-protein interactions and provide binding specificity.	75
240951	cd12507	RRM1_ESRPs_Fusilli	RNA recognition motif 1 (RRM1) found in epithelial splicing regulatory protein ESRP1, ESRP2, Drosophila RNA-binding protein Fusilli and similar proteins. This subfamily corresponds to the RRM1 of ESRPs and Fusilli. ESRP1 (also termed RBM35A) and ESRP2 (also termed RBM35B) are epithelial-specific RNA binding proteins that promote splicing of the epithelial variant of the fibroblast growth factor receptor 2 (FGFR2), ENAH (also termed hMena), CD44 and CTNND1 (also termed p120-Catenin) transcripts. They are highly conserved paralogs and specifically bind to GU-rich binding site. ESRP1 and ESRP2 contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The family also includes Drosophila fusilli (fus) gene encoding RNA-binding protein Fusilli. Loss of fusilli activity causes lethality during embryogenesis in flies. Drosophila Fusilli can regulate endogenous fibroblast growth factor receptor 2 (FGFR2) splicing and functions as a splicing factor. It shows high sequence homology to ESRPs and contains three RRMs as well. It also has an N-terminal domain with unknown function and a C-terminal domain particularly rich in alanine, glutamine, and serine. 	75
409930	cd12508	RRM2_ESRPs_Fusilli	RNA recognition motif 2 (RRM2) found in epithelial splicing regulatory protein ESRP1, ESRP2, Drosophila RNA-binding protein Fusilli and similar proteins. This subfamily corresponds to the RRM2 of ESRPs and Fusilli. ESRP1 (also termed RBM35A) and ESRP2 (also termed RBM35B) are epithelial-specific RNA binding proteins that promote splicing of the epithelial variant of the fibroblast growth factor receptor 2 (FGFR2), ENAH (also termed hMena), CD44 and CTNND1 (also termed p120-Catenin) transcripts. They are highly conserved paralogs and specifically bind to GU-rich binding site. ESRP1 and ESRP2 contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The family also includes Drosophila fusilli (fus) gene encoding RNA-binding protein Fusilli. Loss of fusilli activity causes lethality during embryogenesis in flies. Drosophila Fusilli can regulate endogenous FGFR2 splicing and functions as a splicing factor. It shows high sequence homology to ESRPs and contains three RRMs as well. It also has an N-terminal domain with unknown function and a C-terminal domain particularly rich in alanine, glutamine, and serine. 	80
409931	cd12509	RRM3_ESRPs_Fusilli	RNA recognition motif 3 (RRM3) found in epithelial splicing regulatory protein ESRP1, ESRP2, Drosophila RNA-binding protein Fusilli and similar proteins. This subfamily corresponds to the RRM3 of ESRPs and Fusilli. ESRP1 (also termed RBM35A) and ESRP2 (also termed RBM35B) are epithelial-specific RNA binding proteins that promote splicing of the epithelial variant of the fibroblast growth factor receptor 2 (FGFR2), ENAH (also termed hMena), CD44 and CTNND1 (also termed p120-Catenin) transcripts. They are highly conserved paralogs and specifically bind to GU-rich binding site. ESRP1 and ESRP2 contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The family also includes Drosophila fusilli (fus) gene encoding RNA-binding protein Fusilli. Loss of fusilli activity causes lethality during embryogenesis in flies. Drosophila Fusilli can regulate endogenous FGFR2 splicing and functions as a splicing factor. Fusilli shows high sequence homology to ESRPs and contains three RRMs as well. It also has an N-terminal domain with unknown function and a C-terminal domain particularly rich in alanine, glutamine, and serine. 	81
409932	cd12510	RRM1_RBM12_like	RNA recognition motif 1 (RRM1) found in RNA-binding protein RBM12, RBM12B and similar proteins. This subfamily corresponds to the RRM1 of RBM12 and RBM12B. RBM12, also termed SH3/WW domain anchor protein in the nucleus (SWAN), is ubiquitously expressed. It contains five distinct RNA binding motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two proline-rich regions, and several putative transmembrane domains. RBM12B shows high sequence similarity with RBM12. It contains five distinct RRMs as well. The biological roles of both RBM12 and RBM12B remain unclear. 	74
409933	cd12511	RRM2_RBM12_like	RNA recognition motif 2 (RRM2) found in RNA-binding protein RBM12, RBM12B and similar proteins. This subfamily corresponds to the RRM2 of RBM12 and RBM12B. RBM12, also termed SH3/WW domain anchor protein in the nucleus (SWAN), is ubiquitously expressed. It contains five distinct RNA binding motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two proline-rich regions, and several putative transmembrane domains. RBM12B shows high sequence similarity with RBM12. It contains five distinct RRMs as well. The biological roles of both RBM12 and RBM12B remain unclear. 	73
409934	cd12512	RRM3_RBM12	RNA recognition motif 3 (RRM3) found in RNA-binding protein 12 (RBM12) and similar proteins. This subfamily corresponds to the RRM3 of RBM12. RBM12, also termed SH3/WW domain anchor protein in the nucleus (SWAN), is ubiquitously expressed. It contains five distinct RNA binding motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two proline-rich regions, and several putative transmembrane domains. The biological role of RBM12 remains unclear. 	101
409935	cd12513	RRM3_RBM12B	RNA recognition motif 3 (RRM3) found in RNA-binding protein 12B (RBM12B) and similar proteins. This subgroup corresponds to the RRM3 of RBM12B which contains five distinct RNA binding motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). Its biological role remains unclear. 	81
409936	cd12514	RRM4_RBM12_like	RNA recognition motif 4 (RRM4) found in RNA-binding protein RBM12, RBM12B and similar proteins. This subfamily corresponds to the RRM4 of RBM12 and RBM12B. RBM12, also termed SH3/WW domain anchor protein in the nucleus (SWAN), is ubiquitously expressed. It contains five distinct RNA binding motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two proline-rich regions, and several putative transmembrane domains. RBM12B shows high sequence similarity with RBM12. It contains five distinct RRMs as well. The biological roles of both RBM12 and RBM12B remain unclear. 	73
409937	cd12515	RRM5_RBM12_like	RNA recognition motif 5 (RRM5) found in RNA-binding protein RBM12, RBM12B and similar proteins. This subfamily corresponds to the RRM5 of RBM12 and RBM12B. RBM12, also termed SH3/WW domain anchor protein in the nucleus (SWAN), is ubiquitously expressed. It contains five distinct RNA binding motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two proline-rich regions, and several putative transmembrane domains. RBM12B shows high sequence similarity with RBM12. It contains five distinct RRMs as well. The biological roles of both RBM12 and RBM12B remain unclear. 	75
409938	cd12516	RRM1_RBM26	RNA recognition motif 1 (RRM1) found in vertebrate RNA-binding protein 26 (RBM26). This subgroup corresponds to the RRM1 of RBM26, also known as cutaneous T-cell lymphoma (CTCL) tumor antigen se70-2, which represents a cutaneous lymphoma (CL)-associated antigen. It contains two RNA recognition motifs (RRMs), also known as RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The RRMs may play some functional roles in RNA-binding or protein-protein interactions. 	76
409939	cd12517	RRM_RBM27	RNA recognition motif (RRM) found in vertebrate RNA-binding protein 27 (RBM27). This subgroup corresponds to the RRM of RBM27 which contains a single RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). Although the specific function of the RRM in RBM27 remains unclear, it shows high sequence similarity with RRM1 of RBM26, which functions as a cutaneous lymphoma (CL)-associated antigen. 	76
409940	cd12518	RRM_SRSF11	RNA recognition motif (RRM) found in serine/arginine-rich splicing factor 11 (SRSF11) and similar proteins. This subgroup corresponds to the RRM of SRSF11, also termed arginine-rich 54 kDa nuclear protein (SRp54 or p54), which belongs to a family of proteins containing regions rich in serine-arginine dipeptides (SR proteins family). It is involved in bridge-complex formation and splicing by mediating protein-protein interactions across either introns or exons. SRSF11 has been identified as a tau exon 10 splicing repressor. It interacts with a purine-rich element in exon 10, and suppresses exon 10 inclusion by antagonizing Tra2beta, an SR-domain-containing protein that enhances exon 10 inclusion. SRSF11 is a unique SR family member and may regulate the alternative splicing in a tissue- and substrate-dependent manner. It can directly interact with the U2 auxiliary factor 65-kDa subunit (U2AF65), a protein associated with the 3' splice site. In addition, unlike the typical SR proteins, SRSF11 associates with other SR proteins but not with the U1 small nuclear ribonucleoprotein U1-70K or the U2 auxiliary factor 35-kDa subunit (U2AF35). SREK1 has unique properties in regulating alternative splicing of different pre-mRNAs; it promotes the use of the distal 5' splice site in E1A pre-mRNA alternative splicing. It also inhibits cryptic splice site selection on the beta-globin pre-mRNA containing competing 5' splice sites. SREK1 contains an RNA recognition motif (RRM), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and one serine-arginine (SR)-rich domains (SR domains). 	80
409941	cd12519	RRM1_SREK1	RNA recognition motif 1 (RRM1) found in splicing regulatory glutamine/lysine-rich protein 1 (SREK1) and similar proteins. This subgroup corresponds to the RRM1 of SREK1, also termed serine/arginine-rich-splicing regulatory protein 86-kDa (SRrp86), or splicing factor arginine/serine-rich 12 (SFRS12), or splicing regulatory protein 508 amino acid (SRrp508). SREK1 belongs to a family of proteins containing regions rich in serine-arginine dipeptides (SR proteins family), and is involved in bridge-complex formation and splicing by mediating protein-protein interactions across either introns or exons. It is a unique SR family member and may play a crucial role in determining tissue specific patterns of alternative splicing. SREK1 can alter splice site selection by both positively and negatively modulating the activity of other SR proteins. For instance, SREK1 can activate SRp20 and repress SC35 in a dose-dependent manner both in vitro and in vivo. In addition, SREK1 generally contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and two serine-arginine (SR)-rich domains (SR domains) separated by an unusual glutamic acid-lysine (EK) rich region. The RRM and SR domains are highly conserved among other members of the SR superfamily. However, the EK domain is unique to SREK1; plays a modulatory role controlling SR domain function by involvement in the inhibition of both constitutive and alternative splicing and in the selection of splice-site. 	80
240964	cd12520	RRM1_MRN1	RNA recognition motif 1 (RRM1) found in RNA-binding protein MRN1 and similar proteins. This subgroup corresponds to the RRM1 of MRN1, also termed multicopy suppressor of RSC-NHP6 synthetic lethality protein 1, or post-transcriptional regulator of 69 kDa, which is an RNA-binding protein found in yeast. Although its specific biological role remains unclear, MRN1 might be involved in translational regulation. Members in this family contain four copies of conserved RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	74
240965	cd12521	RRM3_MRN1	RNA recognition motif 3 (RRM3) found in RNA-binding protein MRN1 and similar proteins. This subgroup corresponds to the RRM3 of MRN1, also termed multicopy suppressor of RSC-NHP6 synthetic lethality protein 1, or post-transcriptional regulator of 69 kDa, which is an RNA-binding protein found in yeast. Although its specific biological role remains unclear, MRN1 might be involved in translational regulation. Members in this family contain four copies of conserved RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	74
409942	cd12522	RRM4_MRN1	RNA recognition motif 4 (RRM4) found in RNA-binding protein MRN1 and similar proteins. This subgroup corresponds to the RRM4 of MRN1, also termed multicopy suppressor of RSC-NHP6 synthetic lethality protein 1, or post-transcriptional regulator of 69 kDa, which is an RNA-binding protein found in yeast. Although its specific biological role remains unclear, MRN1 might be involved in translational regulation. Members in this family contain four copies of conserved RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	81
409943	cd12523	RRM2_MRN1	RNA recognition motif 2 (RRM2) found in RNA-binding protein MRN1 and similar proteins. This subgroup corresponds to the RRM2 of MRN1, also termed multicopy suppressor of RSC-NHP6 synthetic lethality protein 1, or post-transcriptional regulator of 69 kDa, which is an RNA-binding protein found in yeast. Although its specific biological role remains unclear, MRN1 might be involved in translational regulation. Members in this family contain four copies of conserved RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	78
409944	cd12524	RRM1_MEI2_like	RNA recognition motif 1 (RRM1) found in plant Mei2-like proteins. This subgroup corresponds to the RRM1 of Mei2-like proteins that represent an ancient eukaryotic RNA-binding proteins family. Their corresponding Mei2-like genes appear to have arisen early in eukaryote evolution, been lost from some lineages such as Saccharomyces cerevisiae and metazoans, and diversified in the plant lineage. The plant Mei2-like genes may function in cell fate specification during development, rather than as stimulators of meiosis. Members in this family contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The C-terminal RRM (RRM3) is unique to Mei2-like proteins and is highly conserved between plants and fungi. To date, the intracellular localization, RNA target(s), cellular interactions and phosphorylation states of Mei2-like proteins in plants remain unclear. 	77
409945	cd12525	RRM1_MEI2_fungi	RNA recognition motif 1 (RRM1) found in fungal Mei2-like proteins. This subgroup corresponds to the RRM1 of fungal Mei2-like proteins. The Mei2 protein is an essential component of the switch from mitotic to meiotic growth in the fission yeast Schizosaccharomyces pombe. It is an RNA-binding protein that contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). In the nucleus, S. pombe Mei2 stimulates meiosis upon binding a specific non-coding RNA through its C-terminal RRM motif. 	91
409946	cd12526	RRM1_EAR1_like	RNA recognition motif 1 (RRM1) found in terminal EAR1-like proteins. This subgroup corresponds to the RRM1 of terminal EAR1-like proteins, including terminal EAR1-like protein 1 and 2 (TEL1 and TEL2) found in land plants. They may play a role in the regulation of leaf initiation. The terminal EAR1-like proteins are putative RNA-binding proteins carrying three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and TEL characteristic motifs that allow sequence and putative functional discrimination between the terminal EAR1-like proteins and Mei2-like proteins. 	71
409947	cd12527	RRM2_EAR1_like	RNA recognition motif 2 (RRM2) found in terminal EAR1-like proteins. This subgroup corresponds to the RRM2 of terminal EAR1-like proteins, including terminal EAR1-like protein 1 and 2 (TEL1 and TEL2) found in land plants. They may play a role in the regulation of leaf initiation. The terminal EAR1-like proteins are putative RNA-binding proteins carrying three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and TEL characteristic motifs that allow sequence and putative functional discrimination between the terminal EAR1-like proteins and Mei2-like proteins. 	71
240972	cd12528	RRM2_MEI2_fungi	RNA recognition motif 2 (RRM2) found in fungal Mei2-like proteins. This subgroup corresponds to the RRM2 of fungal Mei2-like proteins. The Mei2 protein is an essential component of the switch from mitotic to meiotic growth in the fission yeast Schizosaccharomyces pombe. It is an RNA-binding protein that contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). In the nucleus, S. pombe Mei2 stimulates meiosis upon binding a specific non-coding RNA through its C-terminal RRM motif. 	81
409948	cd12529	RRM2_MEI2_like	RNA recognition motif 2 (RRM2) found in plant Mei2-like proteins. This subgroup corresponds to the RRM2 of Mei2-like proteins that represent an ancient eukaryotic RNA-binding proteins family. Their corresponding Mei2-like genes appear to have arisen early in eukaryote evolution, been lost from some lineages such as Saccharomyces cerevisiae and metazoans, and diversified in the plant lineage. The plant Mei2-like genes may function in cell fate specification during development, rather than as stimulators of meiosis. Members in this family contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The C-terminal RRM (RRM3) is unique to Mei2-like proteins and is highly conserved between plants and fungi. To date, the intracellular localization, RNA target(s), cellular interactions and phosphorylation states of Mei2-like proteins in plants remain unclear. 	71
240974	cd12530	RRM3_EAR1_like	RNA recognition motif 3 (RRM3) found in terminal EAR1-like proteins. This subgroup corresponds to the RRM3 of terminal EAR1-like proteins, including terminal EAR1-like protein 1 and 2 (TEL1 and TEL2) found in land plants. They may play a role in the regulation of leaf initiation. The terminal EAR1-like proteins are putative RNA-binding proteins carrying three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and TEL characteristic motifs that allow sequence and putative functional discrimination between the terminal EAR1-like proteins and Mei2-like proteins. 	101
240975	cd12531	RRM3_MEI2_like	RNA recognition motif 3 (RRM3) found in plant Mei2-like proteins. This subgroup corresponds to the RRM3 of Mei2-like proteins, representing an ancient eukaryotic RNA-binding proteins family. Their corresponding Mei2-like genes appear to have arisen early in eukaryote evolution, been lost from some lineages such as Saccharomyces cerevisiae and metazoans, and diversified in the plant lineage. The plant Mei2-like genes may function in cell fate specification during development, rather than as stimulators of meiosis. Members in this family contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The C-terminal RRM (RRM3) is unique to Mei2-like proteins and is highly conserved between plants and fungi. To date, the intracellular localization, RNA target(s), cellular interactions and phosphorylation states of Mei2-like proteins in plants remain unclear. 	86
409949	cd12532	RRM3_MEI2_fungi	RNA recognition motif 3 (RRM3) found in fungal Mei2-like proteins. This subgroup corresponds to the RRM3 of fungal Mei2-like proteins. The Mei2 protein is an essential component of the switch from mitotic to meiotic growth in the fission yeast Schizosaccharomyces pombe. It is an RNA-binding protein that contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). In the nucleus, S. pombe Mei2 stimulates meiosis upon binding a specific non-coding RNA through its C-terminal RRM motif. 	90
409950	cd12533	RRM_EWS	RNA recognition motif (RRM) found in vertebrate Ewing Sarcoma Protein (EWS). This subgroup corresponds to the RRM of EWS, also termed Ewing sarcoma breakpoint region 1 protein, a member of the FET (previously TET) (FUS/TLS, EWS, TAF15) family of RNA- and DNA-binding proteins whose expression is altered in cancer. It is a multifunctional protein and may play roles in transcription and RNA processing. EWS is involved in transcriptional regulation by interacting with the preinitiation complex TFIID and the RNA polymerase II (RNAPII) complexes. It is also associated with splicing factors, such as the U1 snRNP protein U1C, suggesting its implication in pre-mRNA splicing. Additionally, EWS has been shown to regulate DNA damage-induced alternative splicing (AS). Like other members in the FET family, EWS contains an N-terminal Ser, Gly, Gln and Tyr-rich region composed of multiple copies of a degenerate hexapeptide repeat motif. The C-terminal region consists of a conserved nuclear import and retention signal (C-NLS), a C2/C2 zinc-finger motif, a conserved RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and at least 1 arginine-glycine-glycine (RGG)-repeat region. EWS specifically binds to poly G and poly U RNA. It also binds to the proximal-element DNA of the macrophage-specific promoter of the CSF-1 receptor gene. 	84
240978	cd12534	RRM_SARFH	RNA recognition motif (RRM) found in Drosophila melanogaster RNA-binding protein cabeza and similar proteins. This subgroup corresponds to the RRM in cabeza, also termed P19, or sarcoma-associated RNA-binding fly homolog (SARFH). It is a putative homolog of human RNA-binding proteins FUS (also termed TLS or Pigpen or hnRNP P2), EWS (also termed EWSR1), TAF15 (also termed hTAFII68 or TAF2N or RPB56), and belongs to the FET (previously TET) (FUS/TLS, EWS, TAF15) family of RNA- and DNA-binding proteins whose expression is altered in cancer. It is a nuclear RNA binding protein that may play an important role in the regulation of RNA metabolism during fly development. Cabeza contains one RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	83
409951	cd12535	RRM_FUS_TAF15	RNA recognition motif (RRM) found in vertebrate fused in Ewing's sarcoma protein (FUS), TATA-binding protein-associated factor 15 (TAF15) and similar proteins. This subgroup corresponds to the RRM of FUS and TAF15. FUS (TLS or Pigpen or hnRNP P2), also termed 75 kDa DNA-pairing protein (POMp75), or oncoprotein TLS (Translocated in liposarcoma), is a member of the FET (previously TET) (FUS/TLS, EWS, TAF15) family of RNA- and DNA-binding proteins whose expression is altered in cancer. It is a multi-functional protein and has been implicated in pre-mRNA splicing, chromosome stability, cell spreading, and transcription. FUS was originally identified in human myxoid and round cell liposarcomas as an oncogenic fusion with the stress-induced DNA-binding transcription factor CHOP (CCAAT enhancer-binding homologous protein) and later as hnRNP P2, a component of hnRNP H complex assembled on pre-mRNA. It can form ternary complexes with hnRNP A1 and hnRNP C1/C2. Additional research indicates that FUS binds preferentially to GGUG-containing RNAs. In the presence of Mg2+, it can bind both single- and double-stranded DNA (ssDNA/dsDNA) and promote ATP-independent annealing of complementary ssDNA and D-loop formation in superhelical dsDNA. FUS has been shown to be recruited by single stranded noncoding RNAs to the regulatory regions of target genes such as cyclin D1, where it represses transcription by disrupting complex formation. TAF15 (TAFII68), also termed TATA-binding protein-associated factor 2N (TAF2N), or RNA-binding protein 56 (RBP56), originally identified as a TAF in the general transcription initiation TFIID complex, is a novel RNA/ssDNA-binding protein with homology to the proto-oncoproteins FUS and EWS (also termed EWSR1), belonging to the FET family as well. TAF15 likely functions in RNA polymerase II (RNAP II) transcription by interacting with TFIID and subunits of RNAP II itself. TAF15 is also associated with U1 snRNA, chromatin and RNA, in a complex distinct from the Sm-containing U1 snRNP that functions in splicing. Like other members in the FET family, both FUS and TAF15 contain an N-terminal Ser, Gly, Gln and Tyr-rich region composed of multiple copies of a degenerate hexapeptide repeat motif. The C-terminal region consists of a conserved nuclear import and retention signal (C-NLS), a C2/C2 zinc-finger motif, a conserved RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and at least 1 arginine-glycine-glycine (RGG)-repeat region. 	86
409952	cd12536	RRM1_RBM39	RNA recognition motif 1 (RRM1) found in vertebrate RNA-binding protein 39 (RBM39). This subgroup corresponds to the RRM1 of RBM39, also termed hepatocellular carcinoma protein 1, or RNA-binding region-containing protein 2, or splicing factor HCC1, a nuclear autoantigen that contains an N-terminal arginine/serine rich (RS) motif and three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). An octapeptide sequence called the RS-ERK motif is repeated six times in the RS region of RBM39. Based on the specific domain composition, RBM39 has been classified into a family of non-snRNP (small nuclear ribonucleoprotein) splicing factors that are usually not complexed to snRNAs. 	83
409953	cd12537	RRM1_RBM23	RNA recognition motif 1 (RRM1) found in vertebrate probable RNA-binding protein 23 (RBM23). This subgroup corresponds to the RRM1 of RBM23, also termed RNA-binding region-containing protein 4, or splicing factor SF2, which may function as a pre-mRNA splicing factor. It shows high sequence homology to RNA-binding protein 39 (RBM39 or HCC1), a nuclear autoantigen that contains an N-terminal arginine/serine rich (RS) motif and three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). In contrast to RBM39, RBM23 contains only two RRMs. 	85
409954	cd12538	RRM_U2AF35	RNA recognition motif (RRM) found in U2 small nuclear ribonucleoprotein auxiliary factor U2AF 35 kDa subunit (U2AF35). This subgroup corresponds to the RRM of U2AF35, also termed U2AF1, which is one of the small subunits of U2 small nuclear ribonucleoprotein (snRNP) auxiliary factor (U2AF). It has been implicated in the recruitment of U2 snRNP to pre-mRNAs and is a highly conserved heterodimer composed of large and small subunits. U2AF35 directly binds to the 3' splice site of the conserved AG dinucleotide and performs multiple functions in the splicing process in a substrate-specific manner. It promotes U2 snRNP binding to the branch-point sequences of introns through association with the large subunit of U2AF, U2AF65 (also termed U2AF2). U2AF35 contains two N-terminal zinc fingers, a central RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal arginine/serine (SR)-rich segment interrupted by glycines. U2AF35 binds both U2AF65 and the pre-mRNA through its RRM domain. 	104
409955	cd12539	RRM_U2AF35B	RNA recognition motif (RRM) found in splicing factor U2AF 35 kDa subunit B (U2AF35B). This subgroup corresponds to the RRM of U2AF35B, also termed zinc finger CCCH domain-containing protein 60 (C3H60), which is one of the small subunits of U2 small nuclear ribonucleoprotein (snRNP) auxiliary factor (U2AF). It has been implicated in the recruitment of U2 snRNP to pre-mRNAs and is a highly conserved heterodimer composed of large and small subunits. Members in this family are mainly found in plants. They show high sequence homology to vertebrate U2AF35 that directly binds to the 3' splice site of the conserved AG dinucleotide and performs multiple functions in the splicing process in a substrate-specific manner. U2AF35B contains two N-terminal zinc fingers, a central RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal arginine/serine (SR)-rich domain. In contrast to U2AF35, U2AF35B has a plant-specific conserved C-terminal region containing SERE motif(s), which may have an important function specific to higher plants. 	102
409956	cd12540	RRM_U2AFBPL	RNA recognition motif (RRM) found in U2 small nuclear ribonucleoprotein auxiliary factor 35 kDa subunit-related protein 1 (U2AFBPL) and similar proteins. This subgroup corresponds to the RRM of U2AFBPL, a human homolog of the imprinted mouse gene U2afbp-rs, which encodes a U2 small nuclear ribonucleoprotein auxiliary factor 35 kDa subunit-related protein 1 (U2AFBPL), also termed CCCH type zinc finger, RNA-binding motif and serine/arginine rich protein 1 (U2AF1RS1), or U2 small nuclear RNA auxiliary factor 1-like 1 (U2AF1L1). Although the biological role of U2AFBPL remains unclear, it shows high sequence homology to splicing factor U2AF 35 kDa subunit (U2AF35 or U2AF1) that directly binds to the 3' splice site of the conserved AG dinucleotide and performs multiple functions in the splicing process in a substrate-specific manner. Like U2AF35, U2AFBPL contains two N-terminal zinc fingers, a central RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal arginine/serine (SR)-rich domain. 	105
409957	cd12541	RRM2_La	RNA recognition motif 2 in La autoantigen (La or LARP3) and similar proteins. This subgroup corresponds to the RRM2 of La autoantigen, also termed Lupus La protein, or La ribonucleoprotein, or Sjoegren syndrome type B antigen (SS-B), a highly abundant nuclear phosphoprotein and well conserved in eukaryotes. It specifically binds the 3'-terminal UUU-OH motif of nascent RNA polymerase III transcripts and protects them from exonucleolytic degradation by 3' exonucleases. In addition, La can directly facilitate the translation and/or metabolism of many UUU-3' OH-lacking cellular and viral mRNAs, through binding internal RNA sequences within the untranslated regions of target mRNAs. La contains an N-terminal La motif (LAM), followed by two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). In addition, it possesses a short basic motif (SBM) and a nuclear localization signal (NLS) at the C-terminus. 	77
409958	cd12542	RRM2_LARP7	RNA recognition motif 2 in La-related protein 7 (LARP7) and similar proteins. This subgroup corresponds to the RRM2 of LARP7, also termed La ribonucleoprotein domain family member 7, or P-TEFb-interaction protein for 7SK stability (PIP7S), an oligopyrimidine-binding protein that binds to the highly conserved 3'-terminal U-rich stretch (3' -UUU-OH) of 7SK RNA. LARP7 is a stable component of the 7SK small nuclear ribonucleoprotein (7SK snRNP). It intimately associates with all the nuclear 7SK and is required for 7SK stability. LARP7 also acts as a negative transcriptional regulator of cellular and viral polymerase II genes, acting by means of the 7SK snRNP system. LARP7 plays an essential role in the inhibition of positive transcription elongation factor b (P-TEFb)-dependent transcription, which has been linked to the global control of cell growth and tumorigenesis. LARP7 contains a La motif (LAM) and an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), at the N-terminal region, which mediates binding to the U-rich 3' terminus of 7SK RNA. LARP7 also carries another putative RRM domain at its C-terminus. 	78
409959	cd12543	RRM2_PAR14	RNA recognition motif 2 in vertebrate poly [ADP-ribose] polymerase 14 (PARP-14). This subgroup corresponds to the RRM2 of PARP-14, also termed aggressive lymphoma protein 2, a member of the B aggressive lymphoma (BAL) family of macrodomain-containing PARPs. It is expressed in B lymphocytes and interacts with the IL-4-induced transcription factor Stat6. It plays a fundamental role in the regulation of IL-4-induced B-cell protection against apoptosis after irradiation or growth factor withdrawal. It mediates IL-4 effects on the levels of gene products that regulate cell survival, proliferation, and lymphomagenesis. PARP-14 acts as a transcriptional switch for Stat6-dependent gene activation. In the presence of IL-4, PARP-14 activates transcription by facilitating the binding of Stat6 to the promoter and release of HDACs from the promoter with an IL-4 signal. In contrast, in the absence of a signal, PARP-14 acts as a transcriptional repressor by recruiting HDACs. Absence of PARP-14 protects against Myc-induced developmental block and lymphoma. Thus, PARP-14 may play an important role in Myc-induced oncogenesis. Additional research indicates that PARP-14 is also a binding partner with phosphoglucose isomerase (PGI)/ autocrine motility factor (AMF). It can inhibit PGI/AMF ubiquitination, thus contributing to its stabilization and secretion. PARP-14 contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), three tandem macro domains, and C-terminal region with sequence homology to PARP catalytic domain. 	75
409960	cd12544	RRM_NMI	RNA recognition motif in N-myc-interactor (Nmi) and similar proteins. This subgroup corresponds to the RRM in Nmi, also termed N-myc and STAT interactor, an interferon inducible protein that interacts with c-Myc, N-Myc, Max and c-Fos, and other transcription factors containing bHLH-ZIP, bHLH or ZIP domains. In addition to binding Myc proteins, Nmi also associates with all the Stat family of transcription factors except Stat2. In response to cytokines (e.g. IL-2 and IFN-gamma) stimulation, Nmi can enhance Stat-mediated transcriptional activity through recruiting the Stat1 and Stat5 transcriptional coactivators, CREB-binding protein (CBP) and p300. Nmi contains one RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	81
409961	cd12545	RRM_IN35	RNA recognition motif in interferon-induced 35 kDa protein (IFP 35) and similar proteins. This subgroup corresponds to the RRM in IFP 35, an interferon-induced leucine zipper protein that can specifically form homodimers. Distinct from known bZIP proteins, IFP 35 lacks a basic domain critical for DNA binding. IFP 35 may negatively regulate other bZIP transcription factors by protein-protein interaction. For instance, it can form heterodimers with B-ATF, a member of the AP1 transcription factor family. IFP 35 contains one RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	79
409962	cd12546	RRM_RBM43	RNA recognition motif in vertebrate RNA-binding protein 43 (RBM43). This subgroup corresponds to the RRM of RBM43, a putative RNA-binding protein containing one RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). Although its biological function remains unclear, RBM43 shows high sequence homology to poly [ADP-ribose] polymerase 10 (PARP-10), which is a novel oncoprotein c-Myc-interacting protein with poly(ADP-ribose) polymerase activity. 	77
409963	cd12547	RRM1_2_PAR10	RNA recognition motif 1 and 2 in poly [ADP-ribose] polymerase 10 (PARP-10) and similar proteins. This subgroup corresponds to the RRM1 and RRM2 of PARP-10, a novel oncoprotein c-Myc-interacting protein with poly(ADP-ribose) polymerase activity. It is localized to the nuclear and cytoplasmic compartments. In addition to the PARP activity, PARP-10 is also involved in the control of cell proliferation by inhibiting c-Myc- and E1A-mediated cotransformation of primary cells. PARP-10 may play a role in nuclear processes including the regulation of chromatin, gene transcription, and nuclear/cytoplasmic transport. It contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two overlapping C-terminal domains composed of a glycine-rich region and a region with homology to catalytic domains of PARP enzymes (PARP domain). In addition, PARP-10 contains two ubiquitin-interacting motifs (UIM). 	72
409964	cd12548	RRM_Set1A	RNA recognition motif in vertebrate histone-lysine N-methyltransferase Setd1A (Set1A). This subgroup corresponds to the RRM of Setd1A, also termed SET domain-containing protein 1A (Set1A), or lysine N-methyltransferase 2F, or Set1/Ash2 histone methyltransferase complex subunit Set1, a ubiquitously expressed vertebrate histone methyltransferase that exhibits high homology to yeast Set1. Set1A is localized to euchromatic nuclear speckles and associates with a complex containing six human homologs of the yeast Set1/COMPASS complex, including CXXC finger protein 1 (CFP1; homologous to yeast Spp1), Rbbp5 (homologous to yeast Swd1), Ash2 (homologous to yeast Bre2), Wdr5 (homologous to yeast Swd3), and Wdr82 (homologous to yeast Swd2). Set1A contains an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), an N-SET domain, and a C-terminal catalytic SET domain followed by a post-SET domain. In contrast to Set1B, Set1A additionally contains an HCF-1 binding motif that interacts with HCF-1 in vivo. 	95
409965	cd12549	RRM_Set1B	RNA recognition motif in vertebrate histone-lysine N-methyltransferase Setd1B (Set1B). This subgroup corresponds to the RRM of Setd1B, also termed SET domain-containing protein 1B (Set1B), or lysine N-methyltransferase 2G, a ubiquitously expressed vertebrate histone methyltransferase that exhibits high homology to yeast Set1. Set1B is localized to euchromatic nuclear speckles and associates with a complex containing six human homologs of the yeast Set1/COMPASS complex, including CXXC finger protein 1 (CFP1; homologous to yeast Spp1), Rbbp5 (homologous to yeast Swd1), Ash2 (homologous to yeast Bre2), Wdr5 (homologous to yeast Swd3), and Wdr82 (homologous to yeast Swd2). Set1B complex is a histone methyltransferase that produces trimethylated histone H3 at Lys4. Set1B contains an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), an N-SET domain, and a C-terminal catalytic SET domain followed by a post-SET domain. 	93
409966	cd12550	RRM_II_PABPN1	RNA recognition motif in type II polyadenylate-binding protein 2 (PABP-2) and similar proteins. This subgroup corresponds to the RRM of PABP-2, also termed poly(A)-binding protein 2, or nuclear poly(A)-binding protein 1 (PABPN1), or poly(A)-binding protein II (PABII), which is a ubiquitously expressed type II nuclear poly(A)-binding protein that directs the elongation of mRNA poly(A) tails during pre-mRNA processing. Although PABP-2 binds poly(A) with high affinity and specificity as type I poly(A)-binding proteins, it contains only one highly conserved RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), which is responsible for the poly(A) binding. In addition, PABP-2 possesses an acidic N-terminal domain that is essential for the stimulation of PAP, and an arginine-rich C-terminal domain. 	76
409967	cd12551	RRM_II_PABPN1L	RNA recognition motif in vertebrate type II embryonic polyadenylate-binding protein 2 (ePABP-2). This subgroup corresponds to the RRM of ePABP-2, also termed embryonic poly(A)-binding protein 2, or poly(A)-binding protein nuclear-like 1 (PABPN1L). ePABP-2 is a novel embryonic-specific cytoplasmic type II poly(A)-binding protein that is expressed during the early stages of vertebrate development and in adult ovarian tissue. It may play an important role in the poly(A) metabolism of stored mRNAs during early vertebrate development. ePABP-2 shows significant sequence similarity to the ubiquitously expressed nuclear polyadenylate-binding protein 2 (PABP-2 or PABPN1). Like PABP-2, ePABP-2 contains one RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), which is responsible for the poly(A) binding. In addition, it possesses an acidic N-terminal domain predicted to form a coiled-coil and an arginine-rich C-terminal domain. 	77
409968	cd12552	RRM_Nop15p	RNA recognition motif in yeast ribosome biogenesis protein 15 (Nop15p) and similar proteins. This subgroup corresponds to the RRM of Nop15p, also termed nucleolar protein 15, which is encoded by YNL110C from Saccharomyces cerevisiae, and localizes to the nucleoplasm and nucleolus. Nop15p has been identified as a component of a pre-60S particle. It interacts with RNA components of the early pre-60S particles. Furthermore, Nop15p binds directly to a pre-rRNA transcript in vitro and is required for pre-rRNA processing. It functions as a ribosome synthesis factor required for the 5' to 3' exonuclease digestion that generates the 5' end of the major, short form of the 5.8S rRNA as well as for processing of 27SB to 7S pre-rRNA. Nop15p also plays a specific role in cell cycle progression. Nop15p contains an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	77
409969	cd12553	RRM1_RBM15	RNA recognition motif 1 (RRM1) found in vertebrate RNA binding motif protein 15 (RBM15). This subgroup corresponds to the RRM1 of RBM15, also termed one-twenty two protein 1 (OTT1), conserved in eukaryotes, a novel mRNA export factor and component of the NXF1 pathway. It binds to NXF1 and serves as receptor for the RNA export element RTE. It also possesses mRNA export activity and can facilitate the access of DEAD-box protein DBP5 to mRNA at the nuclear pore complex (NPC). RBM15 belongs to the Spen (split end) protein family, which contains three N-terminal RNA recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal SPOC (Spen paralog and ortholog C-terminal) domain. This family also includes a RBM15-MKL1 (OTT-MAL) fusion protein that RBM15 is N-terminally fused to megakaryoblastic leukemia 1 protein (MKL1) at the C-terminus in a translocation involving chromosome 1 and 22, resulting in acute megakaryoblastic leukemia. The fusion protein could interact with the mRNA export machinery. Although it maintains the specific transactivator function of MKL1, the fusion protein cannot activate RTE-mediated mRNA expression and has lost the post-transcriptional activator function of RBM15. However, it has transdominant suppressor function contributing to its oncogenic properties.	78
409970	cd12554	RRM1_RBM15B	RNA recognition motif 1 (RRM1) found in putative RNA binding motif protein 15B (RBM15B) from vertebrate. This subfamily corresponds to the RRM1 of RBM15B, also termed one twenty-two 3 (OTT3), a paralog of RNA binding motif protein 15 (RBM15), also known as One-twenty two protein 1 (OTT1). Like RBM15, RBM15B has post-transcriptional regulatory activity. It is a nuclear protein sharing with RBM15 the association with the splicing factor compartment and the nuclear envelope as well as the binding to mRNA export factors NXF1 and Aly/REF. RBM15B belongs to the Spen (split end) protein family, which shares a domain architecture comprising three N-terminal RNA recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal SPOC (Spen paralog and ortholog C-terminal) domain. 	80
409971	cd12555	RRM2_RBM15	RNA recognition motif 2 (RRM2) found in vertebrate RNA binding motif protein 15 (RBM15). This subgroup corresponds to the RRM2 of RBM15, also termed one-twenty two protein 1 (OTT1), conserved in eukaryotes, a novel mRNA export factor and component of the NXF1 pathway. It binds to NXF1 and serves as receptor for the RNA export element RTE. It also possesses mRNA export activity and can facilitate the access of DEAD-box protein DBP5 to mRNA at the nuclear pore complex (NPC). RBM15 belongs to the Spen (split end) protein family, which contains three N-terminal RNA recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal SPOC (Spen paralog and ortholog C-terminal) domain. This family also includes a RBM15-MKL1 (OTT-MAL) fusion protein that RBM15 is N-terminally fused to megakaryoblastic leukemia 1 protein (MKL1) at the C-terminus in a translocation involving chromosome 1 and 22, resulting in acute megakaryoblastic leukemia. The fusion protein could interact with the mRNA export machinery. Although it maintains the specific transactivator function of MKL1, the fusion protein cannot activate RTE-mediated mRNA expression and has lost the post-transcriptional activator function of RBM15. However, it has transdominant suppressor function contributing to its oncogenic properties. 	87
409972	cd12556	RRM2_RBM15B	RNA recognition motif 2 (RRM2) found in putative RNA binding motif protein 15B (RBM15B) from vertebrate. This subgroup corresponds to the RRM2 of RBM15B, also termed one twenty-two 3 (OTT3), a paralog of RNA binding motif protein 15 (RBM15), also known as One-twenty two protein 1 (OTT1). Like RBM15, RBM15B has post-transcriptional regulatory activity. It is a nuclear protein sharing with RBM15 the association with the splicing factor compartment and the nuclear envelope as well as the binding to mRNA export factors NXF1 and Aly/REF. RBM15B belongs to the Spen (split end) protein family, which shares a domain architecture comprising three N-terminal RNA recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal SPOC (Spen paralog and ortholog C-terminal) domain. 	85
409973	cd12557	RRM3_RBM15	RNA recognition motif 3 (RRM3) found in vertebrate RNA binding motif protein 15 (RBM15). This subgroup corresponds to the RRM3 of RBM15, also termed one-twenty two protein 1 (OTT1), conserved in eukaryotes, a novel mRNA export factor and component of the NXF1 pathway. It binds to NXF1 and serves as receptor for the RNA export element RTE. It also possesses mRNA export activity and can facilitate the access of DEAD-box protein DBP5 to mRNA at the nuclear pore complex (NPC). RBM15 belongs to the Spen (split end) protein family, which contains three N-terminal RNA recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal SPOC (Spen paralog and ortholog C-terminal) domain. This family also includes a RBM15-MKL1 (OTT-MAL) fusion protein that RBM15 is N-terminally fused to megakaryoblastic leukemia 1 protein (MKL1) at the C-terminus in a translocation involving chromosome 1 and 22, resulting in acute megakaryoblastic leukemia. The fusion protein could interact with the mRNA export machinery. Although it maintains the specific transactivator function of MKL1, the fusion protein cannot activate RTE-mediated mRNA expression and has lost the post-transcriptional activator function of RBM15. However, it has transdominant suppressor function contributing to its oncogenic properties. 	73
409974	cd12558	RRM3_RBM15B	RNA recognition motif 3 (RRM3) found in putative RNA-binding protein 15B (RBM15B) from vertebrate. This subgroup corresponds to the RRM3 of RBM15B, also termed one twenty-two 3 (OTT3), a paralog of RNA binding motif protein 15 (RBM15), also known as One-twenty two protein 1 (OTT1). Like RBM15, RBM15B has post-transcriptional regulatory activity. It is a nuclear protein sharing with RBM15 the association with the splicing factor compartment and the nuclear envelope as well as the binding to mRNA export factors NXF1 and Aly/REF. RBM15B belongs to the Spen (split end) protein family, which shares a domain architecture comprising three N-terminal RNA recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal SPOC (Spen paralog and ortholog C-terminal) domain. 	76
409975	cd12559	RRM_SRSF10	RNA recognition motif (RRM) found in serine/arginine-rich splicing factor 10 (SRSF10) and similar proteins. This subgroup corresponds to the RRM of SRSF10, also termed 40 kDa SR-repressor protein (SRrp40), or FUS-interacting serine-arginine-rich protein 1 (FUSIP1), or splicing factor SRp38, or splicing factor, arginine/serine-rich 13A (SFRS13A), or TLS-associated protein with Ser-Arg repeats (TASR). SRSF10 is a serine-arginine (SR) protein that acts as a potent and general splicing repressor when dephosphorylated. It mediates global inhibition of splicing both in M phase of the cell cycle and in response to heat shock. SRSF10 emerges as a modulator of cholesterol homeostasis through the regulation of low-density lipoprotein receptor (LDLR) splicing efficiency. It also regulates cardiac-specific alternative splicing of triadin pre-mRNA and is required for proper Ca2+ handling during embryonic heart development. In contrast, the phosphorylated SRSF10 functions as a sequence-specific splicing activator in the presence of a nuclear cofactor. It activates distal alternative 5' splice site of adenovirus E1A pre-mRNA in vivo. Moreover, SRSF10 strengthens pre-mRNA recognition by U1 and U2 snRNPs. SRSF10 localizes to the nuclear speckles and can shuttle between nucleus and cytoplasm. It contains a single N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), followed by a C-terminal RS domain rich in serine-arginine dipeptides. 	95
409976	cd12560	RRM_SRSF12	RNA recognition motif (RRM) found in serine/arginine-rich splicing factor 12 (SRSF12) and similar proteins. This subgroup corresponds to the RRM of SRSF12, also termed 35 kDa SR repressor protein (SRrp35), or splicing factor, arginine/serine-rich 13B (SFRS13B), or splicing factor, arginine/serine-rich 19 (SFRS19). SRSF12 is a serine/arginine (SR) protein-like alternative splicing regulator that antagonizes authentic SR proteins in the modulation of alternative 5' splice site choice. For instance, it activates distal alternative 5' splice site of the adenovirus E1A pre-mRNA in vivo. SRSF12 contains a single N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), followed by a C-terminal RS domain rich in serine-arginine dipeptides. 	84
409977	cd12561	RRM1_RBM5_like	RNA recognition motif 1 (RRM1) found in RNA-binding protein 5 (RBM5) and similar proteins. This subgroup corresponds to the RRM1 of RNA-binding protein 5 (RBM5 or LUCA15 or H37), RNA-binding protein 10 (RBM10 or S1-1) and similar proteins. RBM5 is a known modulator of apoptosis. It may also act as a tumor suppressor or an RNA splicing factor; it specifically binds poly(G) RNA. RBM10, a paralog of RBM5, may play an important role in mRNA generation, processing and degradation in several cell types. The rat homolog of human RBM10 is protein S1-1, a hypothetical RNA binding protein with poly(G) and poly(U) binding capabilities. Both, RBM5 and RBM10, contain two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two C2H2-type zinc fingers, and a G-patch/D111 domain. 	81
409978	cd12562	RRM2_RBM5_like	RNA recognition motif 2 (RRM2) found in RNA-binding protein 5 (RBM5) and similar proteins. This subgroup corresponds to the RRM2 of RNA-binding protein 5 (RBM5 or LUCA15 or H37), RNA-binding protein 10 (RBM10 or S1-1) and similar proteins. RBM5 is a known modulator of apoptosis. It may also act as a tumor suppressor or an RNA splicing factor; it specifically binds poly(G) RNA. RBM10, a paralog of RBM5, may play an important role in mRNA generation, processing and degradation in several cell types. The rat homolog of human RBM10 is protein S1-1, a hypothetical RNA binding protein with poly(G) and poly(U) binding capabilities. Both, RBM5 and RBM10, contain two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two C2H2-type zinc fingers, and a G-patch/D111 domain. 	86
409979	cd12563	RRM2_RBM6	RNA recognition motif 2 (RRM2) found in vertebrate RNA-binding protein 6 (RBM6). This subgroup corresponds to the RRM2 of RBM6, also termed lung cancer antigen NY-LU-12, or protein G16, or RNA-binding protein DEF-3, which has been predicted to be a nuclear factor based on its nuclear localization signal. It shows high sequence similarity to RNA-binding protein 5 (RBM5 or LUCA15 or NY-REN-9). Both, RBM6 and RBM5, specifically bind poly(G) RNA. They contain two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two C2H2-type zinc fingers, a nuclear localization signal, and a G-patch/D111 domain. In contrast to RBM5, RBM6 has two additional unique domains: the decamer repeat occurring more than 20 times, and the POZ (poxvirus and zinc finger) domain. The POZ domain may be involved in protein-protein interactions and inhibit binding of target sequences by zinc fingers. 	87
409980	cd12564	RRM1_RBM19	RNA recognition motif 1 (RRM1) found in RNA-binding protein 19 (RBM19) and similar proteins. This subgroup corresponds to the RRM1 of RBM19, also termed RNA-binding domain-1 (RBD-1), a nucleolar protein conserved in eukaryotes. It is involved in ribosome biogenesis by processing rRNA. In addition, it is essential for preimplantation development. RBM19 has a unique domain organization containing 6 conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 	76
409981	cd12565	RRM1_MRD1	RNA recognition motif 1 (RRM1) found in yeast multiple RNA-binding domain-containing protein 1 (MRD1) and similar proteins. This subgroup corresponds to the RRM1 of MRD1 which is encoded by a novel yeast gene MRD1 (multiple RNA-binding domain). It is well-conserved in yeast and its homologs exist in all eukaryotes. MRD1 is present in the nucleolus and the nucleoplasm. It interacts with the 35 S precursor rRNA (pre-rRNA) and U3 small nucleolar RNAs (snoRNAs). MRD1 is essential for the initial processing at the A0-A2 cleavage sites in the 35 S pre-rRNA. It contains 5 conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), which may play an important structural role in organizing specific rRNA processing events. 	76
409982	cd12566	RRM2_MRD1	RNA recognition motif 2 (RRM2) found in yeast multiple RNA-binding domain-containing protein 1 (MRD1) and similar proteins. This subgroup corresponds to the RRM2 of MRD1 which is encoded by a novel yeast gene MRD1 (multiple RNA-binding domain). It is well-conserved in yeast and its homologs exist in all eukaryotes. MRD1 is present in the nucleolus and the nucleoplasm. It interacts with the 35 S precursor rRNA (pre-rRNA) and U3 small nucleolar RNAs (snoRNAs). It is essential for the initial processing at the A0-A2 cleavage sites in the 35 S pre-rRNA. MRD1 contains 5 conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), which may play an important structural role in organizing specific rRNA processing events. 	79
409983	cd12567	RRM3_RBM19	RNA recognition motif 3 (RRM3) found in RNA-binding protein 19 (RBM19) and similar proteins. This subgroup corresponds to the RRM3 of RBM19, also termed RNA-binding domain-1 (RBD-1), which is a nucleolar protein conserved in eukaryotes. It is involved in ribosome biogenesis by processing rRNA. In addition, it is essential for preimplantation development. RBM19 has a unique domain organization containing 6 conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 	79
241012	cd12568	RRM3_MRD1	RNA recognition motif 3 (RRM3) found in yeast multiple RNA-binding domain-containing protein 1 (MRD1) and similar proteins. This subgroup corresponds to the RRM3 of MRD1 which is encoded by a novel yeast gene MRD1 (multiple RNA-binding domain). It is well-conserved in yeast and its homologs exist in all eukaryotes. MRD1 is present in the nucleolus and the nucleoplasm. It interacts with the 35 S precursor rRNA (pre-rRNA) and U3 small nucleolar RNAs (snoRNAs). MRD1 is essential for the initial processing at the A0-A2 cleavage sites in the 35 S pre-rRNA. It contains 5 conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), which may play an important structural role in organizing specific rRNA processing events. 	72
409984	cd12569	RRM4_RBM19	RNA recognition motif 4 (RRM4) found in RNA-binding protein 19 (RBM19) and similar proteins. This subgroup corresponds to the RRM4 of RBM19, also termed RNA-binding domain-1 (RBD-1), which is a nucleolar protein conserved in eukaryotes. It is involved in ribosome biogenesis by processing rRNA. In addition, it is essential for preimplantation development. RBM19 has a unique domain organization containing 6 conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 	72
241014	cd12570	RRM5_MRD1	RNA recognition motif 5 (RRM5) found in yeast multiple RNA-binding domain-containing protein 1 (MRD1) and similar proteins. This subgroup corresponds to the RRM5 of MRD1 which is encoded by a novel yeast gene MRD1 (multiple RNA-binding domain). It is well-conserved in yeast and its homologs exist in all eukaryotes. MRD1 is present in the nucleolus and the nucleoplasm. It interacts with the 35 S precursor rRNA (pre-rRNA) and U3 small nucleolar RNAs (snoRNAs). MRD1 is essential for the initial processing at the A0-A2 cleavage sites in the 35 S pre-rRNA. It contains 5 conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), which may play an important structural role in organizing specific rRNA processing events. 	76
409985	cd12571	RRM6_RBM19	RNA recognition motif 6 (RRM6) found in RNA-binding protein 19 (RBM19) and similar proteins. This subgroup corresponds to the RRM6 of RBM19, also termed RNA-binding domain-1 (RBD-1), which is a nucleolar protein conserved in eukaryotes. It is involved in ribosome biogenesis by processing rRNA. In addition, it is essential for preimplantation development. RBM19 has a unique domain organization containing 6 conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 	79
409986	cd12572	RRM2_MSI1	RNA recognition motif 2 (RRM2) found in RNA-binding protein Musashi homolog 1 (Musashi-1) and similar proteins. This subgroup corresponds to the RRM2 of Musashi-1. The mammalian MSI1 gene encoding Musashi-1 (also termed Msi1) is a neural RNA-binding protein putatively expressed in central nervous system (CNS) stem cells and neural progenitor cells, and associated with asymmetric divisions in neural progenitor cells. Musashi-1 is evolutionarily conserved from invertebrates to vertebrates. It is a homolog of Drosophila Musashi and Xenopus laevis nervous system-specific RNP protein-1 (Nrp-1) and has been implicated in the maintenance of the stem-cell state, differentiation, and tumorigenesis. It translationally regulates the expression of a mammalian numb gene by binding to the 3'-untranslated region of mRNA of Numb, encoding a membrane-associated inhibitor of Notch signaling, and further influences neural development. It represses translation by interacting with the poly(A)-binding protein and competes for binding of the eukaryotic initiation factor-4G (eIF-4G). Musashi-1 contains two conserved N-terminal tandem RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), along with other domains of unknown function. 	74
409987	cd12573	RRM2_MSI2	RNA recognition motif 2 (RRM2) found in RNA-binding protein Musashi homolog 2 (Musashi-2) and similar proteins. This subgroup corresponds to the RRM2 of Musashi-2 (also termed Msi2) which has been identified as a regulator of the hematopoietic stem cell (HSC) compartment and of leukemic stem cells after transplantation of cells with loss and gain of function of the gene. It influences proliferation and differentiation of HSCs and myeloid progenitors, and further modulates normal hematopoiesis and promotes aggressive myeloid leukemia. Musashi-2 contains two conserved N-terminal tandem RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), along with other domains of unknown function. 	76
409988	cd12574	RRM1_DAZAP1	RNA recognition motif 1 (RRM1) found in Deleted in azoospermia-associated protein 1 (DAZAP1) and similar proteins. This subfamily corresponds to the RRM1 of DAZAP1 or DAZ-associated protein 1, also termed proline-rich RNA binding protein (Prrp), a multi-functional ubiquitous RNA-binding protein expressed most abundantly in the testis and essential for normal cell growth, development, and spermatogenesis. DAZAP1 is a shuttling protein whose acetylated form is predominantly nuclear and the nonacetylated form is in cytoplasm. It also functions as a translational regulator that activates translation in an mRNA-specific manner. DAZAP1 was initially identified as a binding partner of Deleted in Azoospermia (DAZ). It also interacts with numerous hnRNPs, including hnRNP U, hnRNP U like-1, hnRNPA1, hnRNPA/B, and hnRNP D, suggesting DAZAP1 might associate and cooperate with hnRNP particles to regulate adenylate-uridylate-rich elements (AU-rich element or ARE)-containing mRNAs. DAZAP1 contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a C-terminal proline-rich domain. 	82
409989	cd12575	RRM1_hnRNPD_like	RNA recognition motif 1 (RRM1) found in heterogeneous nuclear ribonucleoprotein hnRNP D0, hnRNP A/B, hnRNP DL and similar proteins. This subfamily corresponds to the RRM1 in hnRNP D0, hnRNP A/B, hnRNP DL and similar proteins. hnRNP D0 is a UUAG-specific nuclear RNA binding protein that may be involved in pre-mRNA splicing and telomere elongation. hnRNP A/B is an RNA unwinding protein with a high affinity for G- followed by U-rich regions. hnRNP A/B has also been identified as an APOBEC1-binding protein that interacts with apolipoprotein B (apoB) mRNA transcripts around the editing site and thus plays an important role in apoB mRNA editing. hnRNP DL (or hnRNP D-like) is a dual functional protein that possesses DNA- and RNA-binding properties. It has been implicated in mRNA biogenesis at the transcriptional and post-transcriptional levels. All members in this family contain two putative RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a glycine- and tyrosine-rich C-terminus. 	72
409990	cd12576	RRM1_MSI	RNA recognition motif 1 (RRM1) found in RNA-binding protein Musashi homolog Musashi-1, Musashi-2 and similar proteins. This subfamily corresponds to the RRM1 in Musashi-1 and Musashi-2. Musashi-1 (also termed Msi1) is a neural RNA-binding protein putatively expressed in central nervous system (CNS) stem cells and neural progenitor cells, and associated with asymmetric divisions in neural progenitor cells. It is evolutionarily conserved from invertebrates to vertebrates. Musashi-1 is a homolog of Drosophila Musashi and Xenopus laevis nervous system-specific RNP protein-1 (Nrp-1). It has been implicated in the maintenance of the stem-cell state, differentiation, and tumorigenesis. It translationally regulates the expression of a mammalian numb gene by binding to the 3'-untranslated region of mRNA of Numb, encoding a membrane-associated inhibitor of Notch signaling, and further influences neural development. Moreover, Musashi-1 represses translation by interacting with the poly(A)-binding protein and competes for binding of the eukaryotic initiation factor-4G (eIF-4G). Musashi-2 (also termed Msi2) has been identified as a regulator of the hematopoietic stem cell (HSC) compartment and of leukemic stem cells after transplantation of cells with loss and gain of function of the gene. It influences proliferation and differentiation of HSCs and myeloid progenitors, and further modulates normal hematopoiesis and promotes aggressive myeloid leukemia. Both, Musashi-1 and Musashi-2, contain two conserved N-terminal tandem RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), along with other domains of unknown function. 	76
409991	cd12577	RRM1_Hrp1p	RNA recognition motif 1 (RRM1) found in yeast nuclear polyadenylated RNA-binding protein 4 (Hrp1p or Nab4p) and similar proteins. This subfamily corresponds to the RRM1 of Hrp1p and similar proteins. Hrp1p or Nab4p, also termed cleavage factor IB (CFIB), is a sequence-specific trans-acting factor that is essential for mRNA 3'-end formation in yeast Saccharomyces cerevisiae. It can be UV cross-linked to RNA and specifically recognizes the (UA)6 RNA element required for both, the cleavage and poly(A) addition, steps. Moreover, Hrp1p can shuttle between the nucleus and the cytoplasm, and play an additional role in the export of mRNAs to the cytoplasm. Hrp1p also interacts with Rna15p and Rna14p, two components of CF1A. In addition, Hrp1p functions as a factor directly involved in modulating the activity of the nonsense-mediated mRNA decay (NMD) pathway. It binds specifically to a downstream sequence element (DSE)-containing RNA and interacts with Upf1p, a component of the surveillance complex, further triggering the NMD pathway. Hrp1p contains two central RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and an arginine-glycine-rich region harboring repeats of the sequence RGGF/Y. 	76
409992	cd12578	RRM1_hnRNPA_like	RNA recognition motif 1 (RRM1) found in heterogeneous nuclear ribonucleoprotein A subfamily. This subfamily corresponds to the RRM1 in hnRNP A0, hnRNP A1, hnRNP A2/B1, hnRNP A3 and similar proteins. hnRNP A0 is a low abundance hnRNP protein that has been implicated in mRNA stability in mammalian cells. It has been identified as the substrate for MAPKAP-K2 and may be involved in the lipopolysaccharide (LPS)-induced post-transcriptional regulation of tumor necrosis factor-alpha (TNF-alpha), cyclooxygenase 2 (COX-2) and macrophage inflammatory protein 2 (MIP-2). hnRNP A1 is an abundant eukaryotic nuclear RNA-binding protein that may modulate splice site selection in pre-mRNA splicing. hnRNP A2/B1 is an RNA trafficking response element-binding protein that interacts with the hnRNP A2 response element (A2RE). Many mRNAs, such as myelin basic protein (MBP), myelin-associated oligodendrocytic basic protein (MOBP), carboxyanhydrase II (CAII), microtubule-associated protein tau, and amyloid precursor protein (APP) are trafficked by hnRNP A2/B1. hnRNP A3 is also a RNA trafficking response element-binding protein that participates in the trafficking of A2RE-containing RNA. The hnRNP A subfamily is characterized by two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a long glycine-rich region at the C-terminus. 	78
409993	cd12579	RRM2_hnRNPA0	RNA recognition motif 2 (RRM2) found in heterogeneous nuclear ribonucleoprotein A0 (hnRNP A0) and similar proteins. This subgroup corresponds to the RRM2 of hnRNP A0, a low abundance hnRNP protein that has been implicated in mRNA stability in mammalian cells. It has been identified as the substrate for MAPKAP-K2 and may be involved in the lipopolysaccharide (LPS)-induced post-transcriptional regulation of tumor necrosis factor-alpha (TNF-alpha), cyclooxygenase 2 (COX-2) and macrophage inflammatory protein 2 (MIP-2). hnRNP A0 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a long glycine-rich region at the C-terminus. 	80
409994	cd12580	RRM2_hnRNPA1	RNA recognition motif 2 (RRM2) found in heterogeneous nuclear ribonucleoprotein A1 (hnRNP A1) and similar proteins. This subgroup corresponds to the RRM2 of hnRNP A1, also termed helix-destabilizing protein, or single-strand RNA-binding protein, or hnRNP core protein A1, an abundant eukaryotic nuclear RNA-binding protein that may modulate splice site selection in pre-mRNA splicing. hnRNP A1 has been characterized as a splicing silencer, often acting in opposition to an activating hnRNP H. It silences exons when bound to exonic elements in the alternatively spliced transcripts of c-src, HIV, GRIN1, and beta-tropomyosin. hnRNP A1 can shuttle between the nucleus and the cytoplasm. Thus, it may be involved in transport of cellular RNAs, including the packaging of pre-mRNA into hnRNP particles and transport of poly A+ mRNA from the nucleus to the cytoplasm. The cytoplasmic hnRNP A1 has high affinity with AU-rich elements, whereas the nuclear hnRNP A1 has high affinity with a polypyrimidine stretch bordered by AG at the 3' ends of introns. hnRNP A1 is also involved in the replication of an RNA virus, such as mouse hepatitis virus (MHV), through an interaction with the transcription-regulatory region of viral RNA. Moreover, hnRNP A1, together with the scaffold protein septin 6, serves as host proteins to form a complex with NS5b and viral RNA, and further play important roles in the replication of Hepatitis C virus (HCV). hnRNP A1 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a long glycine-rich region at the C-terminus. The RRMs of hnRNP A1 play an important role in silencing the exon and the glycine-rich domain is responsible for protein-protein interactions. 	77
409995	cd12581	RRM2_hnRNPA2B1	RNA recognition motif 2 (RRM2) found in heterogeneous nuclear ribonucleoprotein A2/B1 (hnRNP A2/B1) and similar proteins. This subgroup corresponds to the RRM2 of hnRNP A2/B1, an RNA trafficking response element-binding protein that interacts with the hnRNP A2 response element (A2RE). Many mRNAs, such as myelin basic protein (MBP), myelin-associated oligodendrocytic basic protein (MOBP), carboxyanhydrase II (CAII), microtubule-associated protein tau, and amyloid precursor protein (APP) are trafficked by hnRNP A2/B1. hnRNP A2/B1 also functions as a splicing factor that regulates alternative splicing of the tumor suppressors, such as BIN1, WWOX, the antiapoptotic proteins c-FLIP and caspase-9B, the insulin receptor (IR), and the RON proto-oncogene among others. Overexpression of hnRNP A2/B1 has been described in many cancers. It functions as a nuclear matrix protein involving in RNA synthesis and the regulation of cellular migration through alternatively splicing pre-mRNA. It may play a role in tumor cell differentiation. hnRNP A2/B1 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a long glycine-rich region at the C-terminus. 	80
409996	cd12582	RRM2_hnRNPA3	RNA recognition motif 2 (RRM2) found in heterogeneous nuclear ribonucleoprotein A3 (hnRNP A3) and similar proteins. This subgroup corresponds to the RRM2 of hnRNP A3, a novel RNA trafficking response element-binding protein that interacts with the hnRNP A2 response element (A2RE) independently of hnRNP A2 and participates in the trafficking of A2RE-containing RNA. hnRNP A3 can shuttle between the nucleus and the cytoplasm. It contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a long glycine-rich region at the C-terminus. 	80
241027	cd12583	RRM2_hnRNPD	RNA recognition motif 2 (RRM2) found in heterogeneous nuclear ribonucleoprotein D0 (hnRNP D0) and similar proteins. This subgroup corresponds to the RRM2 of hnRNP D0, also termed AU-rich element RNA-binding protein 1, a UUAG-specific nuclear RNA binding protein that may be involved in pre-mRNA splicing and telomere elongation. hnRNP D0 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), in the middle and an RGG box rich in glycine and arginine residues in the C-terminal part. Each of RRMs can bind solely to the UUAG sequence specifically. 	75
409997	cd12584	RRM2_hnRNPAB	RNA recognition motif 2 (RRM2) found in heterogeneous nuclear ribonucleoprotein A/B (hnRNP A/B) and similar proteins. This subgroup corresponds to the RRM2 of hnRNP A/B, also termed APOBEC1-binding protein 1 (ABBP-1), an RNA unwinding protein with a high affinity for G- followed by U-rich regions. hnRNP A/B has also been identified as an APOBEC1-binding protein that interacts with apolipoprotein B (apoB) mRNA transcripts around the editing site and thus plays an important role in apoB mRNA editing. hnRNP A/B contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a long C-terminal glycine-rich domain that contains a potential ATP/GTP binding loop. 	80
409998	cd12585	RRM2_hnRPDL	RNA recognition motif 2 (RRM2) found in heterogeneous nuclear ribonucleoprotein D-like (hnRNP DL) and similar proteins. This subgroup corresponds to the RRM2 of hnRNP DL (or hnRNP D-like), also termed AU-rich element RNA-binding factor, or JKT41-binding protein (protein laAUF1 or JKTBP), a dual functional protein that possesses DNA- and RNA-binding properties. It has been implicated in mRNA biogenesis at the transcriptional and post-transcriptional levels. hnRNP DL binds single-stranded DNA (ssDNA) or double-stranded DNA (dsDNA) in a non-sequence-specific manner, and interacts with poly(G) and poly(A) tenaciously. It contains two putative RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a glycine- and tyrosine-rich C-terminus. 	75
409999	cd12586	RRM1_PSP1	RNA recognition motif 1 (RRM1) found in vertebrate paraspeckle protein 1 (PSP1). This subgroup corresponds to the RRM1 of PSPC1, also termed paraspeckle component 1 (PSPC1), a novel nucleolar factor that accumulates within a new nucleoplasmic compartment, termed paraspeckles, and diffusely distributes in the nucleoplasm. It is ubiquitously expressed and highly conserved in vertebrates. Although its cellular function remains unknown currently, PSPC1 forms a novel heterodimer with the nuclear protein p54nrb, also known as non-POU domain-containing octamer-binding protein (NonO), which localizes to paraspeckles in an RNA-dependent manner. PSPC1 contains two conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), at the N-terminus. 	71
410000	cd12587	RRM1_PSF	RNA recognition motif 1 (RRM1) found in vertebrate polypyrimidine tract-binding protein (PTB)-associated-splicing factor (PSF). This subgroup corresponds to the RRM1 of PSF, also termed proline- and glutamine-rich splicing factor, or 100 kDa DNA-pairing protein (POMp100), or 100 kDa subunit of DNA-binding p52/p100 complex, a multifunctional protein that mediates diverse activities in the cell. It is ubiquitously expressed and highly conserved in vertebrates. PSF binds not only RNA but also both single-stranded DNA (ssDNA) and double-stranded DNA (dsDNA) and facilitates the renaturation of complementary ssDNAs. Besides, it promotes the formation of D-loops in superhelical duplex DNA, and is involved in cell proliferation. PSF can also interact with multiple factors. It is an RNA-binding component of spliceosomes and binds to insulin-like growth factor response element (IGFRE). PSF functions as a transcriptional repressor interacting with Sin3A and mediating silencing through the recruitment of histone deacetylases (HDACs) to the DNA binding domain (DBD) of nuclear hormone receptors. Additionally, PSF is an essential pre-mRNA splicing factor and is dissociated from PTB and binds to U1-70K and serine-arginine (SR) proteins during apoptosis. PSF forms a heterodimer with the nuclear protein p54nrb, also known as non-POU domain-containing octamer-binding protein (NonO). The PSF/p54nrb complex displays a variety of functions, such as DNA recombination and RNA synthesis, processing, and transport. PSF contains two conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), which are responsible for interactions with RNA and for the localization of the protein in speckles. It also contains an N-terminal region rich in proline, glycine, and glutamine residues, which may play a role in interactions recruiting other molecules. 	71
410001	cd12588	RRM1_p54nrb	RNA recognition motif 1 (RRM1) found in vertebrate 54 kDa nuclear RNA- and DNA-binding protein (p54nrb). This subgroup corresponds to the RRM1 of p54nrb, also termed non-POU domain-containing octamer-binding protein (NonO), or 55 kDa nuclear protein (NMT55), or DNA-binding p52/p100 complex 52 kDa subunit. p54nrb is a multifunctional protein involved in numerous nuclear processes including transcriptional regulation, splicing, DNA unwinding, nuclear retention of hyperedited double-stranded RNA, viral RNA processing, control of cell proliferation, and circadian rhythm maintenance. It is ubiquitously expressed and highly conserved in vertebrates. p54nrb binds both, single- and double-stranded RNA and DNA, and also possesses inherent carbonic anhydrase activity. It forms a heterodimer with paraspeckle component 1 (PSPC1 or PSP1), localizing to paraspeckles in an RNA-dependent manner, as well as with polypyrimidine tract-binding protein-associated-splicing factor (PSF). p54nrb contains two conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), at the N-terminus. 	71
410002	cd12589	RRM2_PSP1	RNA recognition motif 2 (RRM2) found in vertebrate paraspeckle protein 1 (PSP1 or PSPC1). This subgroup corresponds to the RRM2 of PSPC1, also termed paraspeckle component 1 (PSPC1), a novel nucleolar factor that accumulates within a new nucleoplasmic compartment, termed paraspeckles, and diffusely distributes in the nucleoplasm. It is ubiquitously expressed and highly conserved in vertebrates. Although its cellular function remains unknown currently, PSPC1 forms a novel heterodimer with the nuclear protein p54nrb, also known as non-POU domain-containing octamer-binding protein (NonO), which localizes to paraspeckles in an RNA-dependent manner. PSPC1 contains two conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), at the N-terminus. 	80
410003	cd12590	RRM2_PSF	RNA recognition motif 2 (RRM2) found in vertebrate polypyrimidine tract-binding protein (PTB)-associated-splicing factor (PSF). This subgroup corresponds to the RRM2 of PSF, also termed proline- and glutamine-rich splicing factor, or 100 kDa DNA-pairing protein (POMp100), or 100 kDa subunit of DNA-binding p52/p100 complex, a multifunctional protein that mediates diverse activities in the cell. It is ubiquitously expressed and highly conserved in vertebrates. PSF binds not only RNA but also both single-stranded DNA (ssDNA) and double-stranded DNA (dsDNA) and facilitates the renaturation of complementary ssDNAs. It promotes the formation of D-loops in superhelical duplex DNA, and is involved in cell proliferation. PSF can also interact with multiple factors. It is an RNA-binding component of spliceosomes and binds to insulin-like growth factor response element (IGFRE). Moreover, PSF functions as a transcriptional repressor interacting with Sin3A and mediating silencing through the recruitment of histone deacetylases (HDACs) to the DNA binding domain (DBD) of nuclear hormone receptors. PSF is an essential pre-mRNA splicing factor and is dissociated from PTB and binds to U1-70K and serine-arginine (SR) proteins during apoptosis. PSF forms a heterodimer with the nuclear protein p54nrb, also known as non-POU domain-containing octamer-binding protein (NonO). The PSF/p54nrb complex displays a variety of functions, such as DNA recombination and RNA synthesis, processing, and transport. PSF contains two conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), which are responsible for interactions with RNA and for the localization of the protein in speckles. It also contains an N-terminal region rich in proline, glycine, and glutamine residues, which may play a role in interactions recruiting other molecules. 	80
410004	cd12591	RRM2_p54nrb	RNA recognition motif 2 (RRM2) found in vertebrate 54 kDa nuclear RNA- and DNA-binding protein (p54nrb). This subgroup corresponds to the RRM2 of p54nrb, also termed non-POU domain-containing octamer-binding protein (NonO), or 55 kDa nuclear protein (NMT55), or DNA-binding p52/p100 complex 52 kDa subunit. p54nrb is a multifunctional protein involved in numerous nuclear processes including transcriptional regulation, splicing, DNA unwinding, nuclear retention of hyperedited double-stranded RNA, viral RNA processing, control of cell proliferation, and circadian rhythm maintenance. It is ubiquitously expressed and highly conserved in vertebrates. It binds both, single- and double-stranded RNA and DNA, and also possesses inherent carbonic anhydrase activity. p54nrb forms a heterodimer with paraspeckle component 1 (PSPC1 or PSP1), localizing to paraspeckles in an RNA-dependent manner. It also forms a heterodimer with polypyrimidine tract-binding protein-associated-splicing factor (PSF). p54nrb contains two conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), at the N-terminus. 	80
410005	cd12592	RRM_RBM7	RNA recognition motif (RRM) found in vertebrate RNA-binding protein 7 (RBM7). This subfamily corresponds to the RRM of RBM7, a ubiquitously expressed pre-mRNA splicing factor that enhances messenger RNA (mRNA) splicing in a cell-specific manner or in a certain developmental process, such as spermatogenesis. RBM7 interacts with splicing factors SAP145 (the spliceosomal splicing factor 3b subunit 2) and SRp20. It may play a more specific role in meiosis entry and progression. Together with additional testis-specific RNA-binding proteins, RBM7 may regulate the splicing of specific pre-mRNA species that are important in the meiotic cell cycle. RBM7 contains an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a region lacking known homology at the C-terminus. 	75
410006	cd12593	RRM_RBM11	RNA recognition motif (RRM) found in vertebrate RNA-binding protein 11 (RBM11). This subfamily corresponds to the RRM of RBM11, a novel tissue-specific splicing regulator that is selectively expressed in brain, cerebellum and testis, and to a lower extent in kidney. RBM11 is localized in the nucleoplasm and enriched in SRSF2-containing splicing speckles. It may play a role in the modulation of alternative splicing during neuron and germ cell differentiation. RBM11 contains an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a region lacking known homology at the C-terminus. The RRM of RBM11 is responsible for RNA binding, whereas the C-terminal region permits nuclear localization and homodimerization. 	75
410007	cd12594	RRM1_SRSF4	RNA recognition motif 1 (RRM1) found in vertebrate serine/arginine-rich splicing factor 4 (SRSF4). This subgroup corresponds to the RRM1 of SRSF4, also termed pre-mRNA-splicing factor SRp75, or SRP001LB, or splicing factor, arginine/serine-rich 4 (SFRS4). SRSF4 is a splicing regulatory serine/arginine (SR) protein that plays an important role in both constitutive splicing and alternative splicing of many pre-mRNAs. For instance, it interacts with heterogeneous nuclear ribonucleoproteins, hnRNP G and hnRNP E2, and further regulates the 5' splice site of tau exon 10, whose misregulation causes frontotemporal dementia. SRSF4 also induces production of HIV-1 vpr mRNA through the inhibition of the 5'-splice site of exon 3. In addition, it activates splicing of the cardiac troponin T (cTNT) alternative exon by direct interactions with the cTNT exon 5 enhancer RNA. SRSF4 can shuttle between the nucleus and cytoplasm. It contains an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), a glycine-rich region, an internal region homologous to the RRM, and a very long, highly phosphorylated C-terminal SR domain rich in serine-arginine dipeptides. 	87
410008	cd12595	RRM1_SRSF5	RNA recognition motif 1 (RRM1) found in vertebrate serine/arginine-rich splicing factor 5 (SRSF5). This subgroup corresponds to the RRM1 of SRSF5, also termed delayed-early protein HRS, or pre-mRNA-splicing factor SRp40, or splicing factor, arginine/serine-rich 5 (SFRS5). SRSF5 is an essential splicing regulatory serine/arginine (SR) protein that regulates both alternative splicing and basal splicing. It is the only SR protein efficiently selected from nuclear extracts (NE) by the splicing enhancer (ESE) and it is necessary for enhancer activation. SRSF5 also functions as a factor required for insulin-regulated splice site selection for protein kinase C (PKC) betaII mRNA. It is involved in the regulation of PKCbetaII exon inclusion by insulin via its increased phosphorylation by a phosphatidylinositol 3-kinase (PI 3-kinase) signaling pathway. Moreover, SRSF5 can regulate alternative splicing in exon 9 of glucocorticoid receptor pre-mRNA in a dose-dependent manner. SRSF5 contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a C-terminal RS domain rich in serine-arginine dipeptides. The specific RNA binding by SRSF5 requires the phosphorylation of its SR domain.  	70
410009	cd12596	RRM1_SRSF6	RNA recognition motif 1 (RRM1) found in vertebrate serine/arginine-rich splicing factor 6 (SRSF6). This subfamily corresponds to the RRM1 of SRSF6, also termed pre-mRNA-splicing factor SRp55, which is an essential splicing regulatory serine/arginine (SR) protein that preferentially interacts with a number of purine-rich splicing enhancers (ESEs) to activate splicing of the ESE-containing exon. It is the only protein from HeLa nuclear extract or purified SR proteins that specifically binds B element RNA after UV irradiation. SRSF6 may also recognize different types of RNA sites. For instance, it does not bind to the purine-rich sequence in the calcitonin-specific ESE, but binds to a region adjacent to the purine tract. Moreover, cellular levels of SRSF6 may control tissue-specific alternative splicing of the calcitonin/ calcitonin gene-related peptide (CGRP) pre-mRNA. SRSF6 contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a C-terminal SR domains rich in serine-arginine dipeptides. 	72
410010	cd12597	RRM1_SRSF1	RNA recognition motif 1 (RRM1) found in serine/arginine-rich splicing factor 1 (SRSF1) and similar proteins. This subgroup corresponds to the RRM1 of SRSF1, also termed alternative-splicing factor 1 (ASF-1), or pre-mRNA-splicing factor SF2, P33 subunit. SRSF1 is a splicing regulatory serine/arginine (SR) protein involved in constitutive and alternative splicing, nonsense-mediated mRNA decay (NMD), mRNA export and translation. It also functions as a splicing-factor oncoprotein that regulates apoptosis and proliferation to promote mammary epithelial cell transformation. SRSF1 is a shuttling SR protein and contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), separated by a long glycine-rich spacer, and a C-terminal RS domains rich in serine-arginine dipeptides. 	79
241042	cd12598	RRM1_SRSF9	RNA recognition motif 1 (RRM1) found in vertebrate serine/arginine-rich splicing factor 9 (SRSF9). This subgroup corresponds to the RRM1 of SRSF9, also termed pre-mRNA-splicing factor SRp30C. SRSF9 is an essential splicing regulatory serine/arginine (SR) protein that has been implicated in the activity of many elements that control splice site selection, the alternative splicing of the glucocorticoid receptor beta in neutrophils and in the gonadotropin-releasing hormone pre-mRNA. SRSF9 can also interact with other proteins implicated in alternative splicing, including YB-1, rSLM-1, rSLM-2, E4-ORF4, Nop30, and p32. SRSF9 contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by an unusually short C-terminal RS domains rich in serine-arginine dipeptides. 	72
410011	cd12599	RRM1_SF2_plant_like	RNA recognition motif 1 (RRM1) found in plant pre-mRNA-splicing factor SF2 and similar proteins. This subgroup corresponds to the RRM1 of SF2, also termed SR1 protein, a plant serine/arginine (SR)-rich phosphoprotein similar to the mammalian splicing factor SF2/ASF. It promotes splice site switching in mammalian nuclear extracts. SF2 contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a C-terminal domain rich in proline, serine and lysine residues (PSK domain), a composition reminiscent of histones. This PSK domain harbors a putative phosphorylation site for the mitotic kinase cyclin/p34cdc2. 	72
410012	cd12600	RRM2_SRSF4_like	RNA recognition motif 2 (RRM2) found in serine/arginine-rich splicing factor 4 (SRSF4) and similar proteins. This subfamily corresponds to the RRM2 of three serine/arginine (SR) proteins: serine/arginine-rich splicing factor 4 (SRSF4 or SRp75 or SFRS4), serine/arginine-rich splicing factor 5 (SRSF5 or SRp40 or SFRS5 or HRS), serine/arginine-rich splicing factor 6 (SRSF6 or SRp55). SRSF4 plays an important role in both, constitutive  and alternative, splicing of many pre-mRNAs. It can shuttle between the nucleus and cytoplasm. SRSF5 regulates both alternative splicing and basal splicing. It is the only SR protein efficiently selected from nuclear extracts (NE) by the splicing enhancer (ESE) and is essential for enhancer activation. SRSF6 preferentially interacts with a number of purine-rich splicing enhancers (ESEs) to activate splicing of the ESE-containing exon. It is the only protein from HeLa nuclear extract or purified SR proteins that specifically binds B element RNA after UV irradiation. SRSF6 may also recognize different types of RNA sites. Members in this family contain two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a C-terminal RS domains rich in serine-arginine dipeptides.  	72
410013	cd12601	RRM2_SRSF1_like	RNA recognition motif 2 (RRM2) found in serine/arginine-rich splicing factor SRSF1, SRSF9 and similar proteins. This subfamily corresponds to the RRM2 of serine/arginine-rich splicing factor SRSF1, SRSF9 and similar proteins. SRSF1, also termed ASF-1, is a shuttling SR protein involved in constitutive and alternative splicing, nonsense-mediated mRNA decay (NMD), mRNA export and translation. It also functions as a splicing-factor oncoprotein that regulates apoptosis and proliferation to promote mammary epithelial cell transformation. SRSF9, also termed SRp30C, has been implicated in the activity of many elements that control splice site selection, the alternative splicing of the glucocorticoid receptor beta in neutrophils and in the gonadotropin-releasing hormone pre-mRNA. SRSF9 can also interact with other proteins implicated in alternative splicing, including YB-1, rSLM-1, rSLM-2, E4-ORF4, Nop30, and p32. Both, SRSF1 and SRSF9, contain two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a C-terminal RS domains rich in serine-arginine dipeptides. 	74
410014	cd12602	RRM2_SF2_plant_like	RNA recognition motif 2 (RRM2) found in plant pre-mRNA-splicing factor SF2 and similar proteins. This subfamily corresponds to the RRM2 of SF2, also termed SR1 protein, a plant serine/arginine (SR)-rich phosphoprotein similar to the mammalian splicing factor SF2/ASF. It promotes splice site switching in mammalian nuclear extracts. SF2 contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a C-terminal domain rich in proline, serine and lysine residues (PSK domain), a composition reminiscent of histones. This PSK domain harbors a putative phosphorylation site for the mitotic kinase cyclin/p34cdc2. 	76
410015	cd12603	RRM_hnRNPC	RNA recognition motif (RRM) found in vertebrate heterogeneous nuclear ribonucleoprotein C1/C2 (hnRNP C1/C2). This subgroup corresponds to the RRM of heterogeneous nuclear ribonucleoprotein C (hnRNP) proteins C1 and C2, produced by a single coding sequence. They are the major constituents of the heterogeneous nuclear RNA (hnRNA) ribonucleoprotein (hnRNP) complex in vertebrates. They bind hnRNA tightly, suggesting a central role in the formation of the ubiquitous hnRNP complex. They are involved in the packaging of hnRNA in the nucleus and in processing of pre-mRNA such as splicing and 3'-end formation. hnRNP C proteins contain two distinct domains, an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal auxiliary domain that includes the variable region, the basic region and the KSG box rich in repeated Lys-Ser-Gly sequences, the leucine zipper, and the acidic region. The RRM is capable of binding poly(U). The KSG box may bind to RNA. The leucine zipper may be involved in dimer formation. The acidic and hydrophilic C-terminus harbors a putative nucleoside triphosphate (NTP)-binding fold and a protein kinase phosphorylation site. 	84
410016	cd12604	RRM_RALY	RNA recognition motif (RRM) found in vertebrate RNA-binding protein Raly. This subgroup corresponds to the RRM of Raly, also termed autoantigen p542, or heterogeneous nuclear ribonucleoprotein C-like 2, or hnRNP core protein C-like 2, or hnRNP associated with lethal yellow protein homolog, an RNA-binding protein that may play a critical role in embryonic development. It is encoded by Raly, a ubiquitously expressed gene of unknown function. Raly shows a high degree of identity with the 5' sequences of p542 gene encoding autoantigen, which can cross-react with EBNA-1 of the Epstein Barr virus. Raly contains two distinct domains, an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal auxiliary domain that includes a unique glycine/serine-rich stretch. 	76
410017	cd12605	RRM_RALYL	RNA recognition motif (RRM) found in vertebrate RNA-binding Raly-like protein (RALYL). This subgroup corresponds to the RRM of RALYL, also termed heterogeneous nuclear ribonucleoprotein C-like 3, or hnRNP core protein C-like 3, a putative RNA-binding protein that shows high sequence homology with Raly, an RNA-binding protein playing a critical role in embryonic development. The biological role of RALYL remains unclear. Like Raly, RALYL contains two distinct domains, an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal auxiliary domain. 	69
410018	cd12606	RRM1_RBM4	RNA recognition motif 1 (RRM1) found in vertebrate RNA-binding protein 4 (RBM4). This subgroup corresponds to the RRM1 of RBM4, a ubiquitously expressed splicing factor that has two isoforms, RBM4A (also known as Lark homolog) and RBM4B (also known as RBM30), which are very similar in structure and sequence. RBM4 may function as a translational regulator of stress-associated mRNAs and also plays a role in micro-RNA-mediated gene regulation. RBM4 contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), a CCHC-type zinc finger, and three alanine-rich regions within their C-terminal regions. The C-terminal region may be crucial for nuclear localization and protein-protein interaction. The RRMs, in combination with the C-terminal region, are responsible for the splicing function of RBM4. 	67
410019	cd12607	RRM2_RBM4	RNA recognition motif 2 (RRM2) found in vertebrate RNA-binding protein 4 (RBM4). This subgroup corresponds to the RRM2 of RBM4, a ubiquitously expressed splicing factor that has two isoforms, RBM4A (also known as Lark homolog) and RBM4B (also known as RBM30), which are very similar in structure and sequence. RBM4 may function as a translational regulator of stress-associated mRNAs and also plays a role in micro-RNA-mediated gene regulation. RBM4 contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), a CCHC-type zinc finger, and three alanine-rich regions within their C-terminal regions. The C-terminal region may be crucial for nuclear localization and protein-protein interaction. The RRMs, in combination with the C-terminal region, are responsible for the splicing function of RBM4. 	67
410020	cd12608	RRM1_CoAA	RNA recognition motif 1 (RRM1) found in vertebrate RRM-containing coactivator activator/modulator (CoAA). This subgroup corresponds to the RRM1 of CoAA, also termed RNA-binding protein 14 (RBM14), or paraspeckle protein 2 (PSP2), or synaptotagmin-interacting protein (SYT-interacting protein), a heterogeneous nuclear ribonucleoprotein (hnRNP)-like protein identified as a nuclear receptor coactivator. It mediates transcriptional coactivation and RNA splicing effects in a promoter-preferential manner and is enhanced by thyroid hormone receptor-binding protein (TRBP). CoAA contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a TRBP-interacting domain. It stimulates transcription through its interactions with coactivators, such as TRBP and CREB-binding protein CBP/p300, via the TRBP-interacting domain and interaction with an RNA-containing complex, such as DNA-dependent protein kinase-poly(ADP-ribose) polymerase complexes, via the RRMs. 	69
410021	cd12609	RRM2_CoAA	RNA recognition motif 2 (RRM2) found in vertebrate RRM-containing coactivator activator/modulator (CoAA). This subgroup corresponds to the RRM2 of CoAA, also termed RNA-binding protein 14 (RBM14), or paraspeckle protein 2 (PSP2), or synaptotagmin-interacting protein (SYT-interacting protein), a heterogeneous nuclear ribonucleoprotein (hnRNP)-like protein identified as a nuclear receptor coactivator. It mediates transcriptional coactivation and RNA splicing effects in a promoter-preferential manner and is enhanced by thyroid hormone receptor-binding protein (TRBP). CoAA contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a TRBP-interacting domain. It stimulates transcription through its interactions with coactivators, such as TRBP and CREB-binding protein CBP/p300, via the TRBP-interacting domain and interaction with an RNA-containing complex, such as DNA-dependent protein kinase-poly(ADP-ribose) polymerase complexes, via the RRMs. 	68
410022	cd12610	RRM1_SECp43	RNA recognition motif 1 (RRM1) found in tRNA selenocysteine-associated protein 1 (SECp43). This subgroup corresponds to the RRM1 of SECp43, an RNA-binding protein associated specifically with eukaryotic selenocysteine tRNA [tRNA(Sec)]. It may play an adaptor role in the mechanism of selenocysteine insertion. SECp43 is located primarily in the nucleus and contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a C-terminal polar/acidic region. 	84
410023	cd12611	RRM1_NGR1_NAM8_like	RNA recognition motif 1 (RRM1) found in yeast negative growth regulatory protein NGR1, yeast protein NAM8 and similar proteins. This subgroup corresponds to the RRM1 of NGR1 and NAM8. NGR1, also termed RNA-binding protein RBP1, is a putative glucose-repressible protein that binds both, RNA and single-stranded DNA (ssDNA), in yeast. It may function in regulating cell growth in early log phase, possibly through its participation in RNA metabolism. NGR1 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two of which are followed by a glutamine-rich stretch that may be involved in transcriptional activity. In addition, NGR1 has an asparagine-rich region near the carboxyl terminus which also harbors a methionine-rich region. The subgroup also includes NAM8, a putative RNA-binding protein that acts as a suppressor of mitochondrial splicing deficiencies when overexpressed in yeast. It may be a non-essential component of the mitochondrial splicing machinery. Like NGR1, NAM8 contains two RRMs. 	84
410024	cd12612	RRM2_SECp43	RNA recognition motif 2 (RRM2) found in tRNA selenocysteine-associated protein 1 (SECp43). This subgroup corresponds to the RRM2 of SECp43, an RNA-binding protein associated specifically with eukaryotic selenocysteine tRNA [tRNA(Sec)]. It may play an adaptor role in the mechanism of selenocysteine insertion. SECp43 is located primarily in the nucleus and contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a C-terminal polar/acidic region. 	82
410025	cd12613	RRM2_NGR1_NAM8_like	RNA recognition motif 2 (RRM2) found in yeast negative growth regulatory protein NGR1, yeast protein NAM8 and similar proteins. This subgroup corresponds to the RRM2 of NGR1 and NAM8. NGR1, also termed RNA-binding protein RBP1, is a putative glucose-repressible protein that binds both, RNA and single-stranded DNA (ssDNA), in yeast. It may function in regulating cell growth in early log phase, possibly through its participation in RNA metabolism. NGR1 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a glutamine-rich stretch that may be involved in transcriptional activity. In addition, NGR1 has an asparagine-rich region near the carboxyl terminus which also harbors a methionine-rich region. The family also includes protein NAM8, which is a putative RNA-binding protein that acts as a suppressor of mitochondrial splicing deficiencies when overexpressed in yeast. It may be a non-essential component of the mitochondrial splicing machinery. Like NGR1, NAM8 contains two RRMs. 	80
410026	cd12614	RRM1_PUB1	RNA recognition motif 1 (RRM1) found in yeast nuclear and cytoplasmic polyadenylated RNA-binding protein PUB1 and similar proteins. This subgroup corresponds to the RRM1 of yeast protein PUB1, also termed ARS consensus-binding protein ACBP-60, or poly uridylate-binding protein, or poly(U)-binding protein. PUB1 has been identified as both, a heterogeneous nuclear RNA-binding protein (hnRNP) and a cytoplasmic mRNA-binding protein (mRNP), which may be stably bound to a translationally inactive subpopulation of mRNAs within the cytoplasm. It is distributed in both, the nucleus and the cytoplasm, and binds to poly(A)+ RNA (mRNA or pre-mRNA). Although it is one of the major cellular proteins cross-linked by UV light to polyadenylated RNAs in vivo, PUB1 is nonessential for cell growth in yeast. PUB1 also binds to T-rich single stranded DNA (ssDNA); however, there is no strong evidence implicating PUB1 in the mechanism of DNA replication. PUB1 contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a GAR motif (glycine and arginine rich stretch) that is located between RRM2 and RRM3. 	74
410027	cd12615	RRM1_TIA1	RNA recognition motif 1 (RRM1) found in nucleolysin TIA-1 isoform p40 (p40-TIA-1) and similar proteins. This subgroup corresponds to the RRM1 of TIA-1, the 40-kDa isoform of T-cell-restricted intracellular antigen-1 (TIA-1) and a cytotoxic granule-associated RNA-binding protein mainly found in the granules of cytotoxic lymphocytes. TIA-1 can be phosphorylated by a serine/threonine kinase that is activated during Fas-mediated apoptosis, and functions as the granule component responsible for inducing apoptosis in cytolytic lymphocyte (CTL) targets. It is composed of three N-terminal highly homologous RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a glutamine-rich C-terminal auxiliary domain containing a lysosome-targeting motif. TIA-1 interacts with RNAs containing short stretches of uridylates and its RRM2 can mediate the specific binding to uridylate-rich RNAs. 	74
410028	cd12616	RRM1_TIAR	RNA recognition motif 1 (RRM1) found in nucleolysin TIAR and similar proteins. This subgroup corresponds to the RRM1 of nucleolysin TIAR, also termed TIA-1-related protein, and a cytotoxic granule-associated RNA-binding protein that shows high sequence similarity with 40-kDa isoform of T-cell-restricted intracellular antigen-1 (p40-TIA-1). TIAR is mainly localized in the nucleus of hematopoietic and nonhematopoietic cells. It is translocated from the nucleus to the cytoplasm in response to exogenous triggers of apoptosis. TIAR possesses nucleolytic activity against cytolytic lymphocyte (CTL) target cells. It can trigger DNA fragmentation in permeabilized thymocytes, and thus may function as an effector responsible for inducing apoptosis. TIAR is composed of three N-terminal highly homologous RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a glutamine-rich C-terminal auxiliary domain containing a lysosome-targeting motif. It interacts with RNAs containing short stretches of uridylates and its RRM2 can mediate the specific binding to uridylate-rich RNAs. 	81
410029	cd12617	RRM2_TIAR	RNA recognition motif 2 (RRM2) found in nucleolysin TIAR and similar proteins. This subgroup corresponds to the RRM2 of nucleolysin TIAR, also termed TIA-1-related protein, a cytotoxic granule-associated RNA-binding protein that shows high sequence similarity with 40-kDa isoform of T-cell-restricted intracellular antigen-1 (p40-TIA-1). TIAR is mainly localized in the nucleus of hematopoietic and nonhematopoietic cells. It is translocated from the nucleus to the cytoplasm in response to exogenous triggers of apoptosis. TIAR possesses nucleolytic activity against cytolytic lymphocyte (CTL) target cells. It can trigger DNA fragmentation in permeabilized thymocytes, and thus may function as an effector responsible for inducing apoptosis. TIAR is composed of three N-terminal, highly homologous RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a glutamine-rich C-terminal auxiliary domain containing a lysosome-targeting motif. It interacts with RNAs containing short stretches of uridylates and its RRM2 can mediate the specific binding to uridylate-rich RNAs. 	80
410030	cd12618	RRM2_TIA1	RNA recognition motif 2 (RRM2) found in nucleolysin TIA-1 isoform p40 (p40-TIA-1) and similar proteins. This subgroup corresponds to the RRM2 of p40-TIA-1, the 40-kDa isoform of T-cell-restricted intracellular antigen-1 (TIA-1), and a cytotoxic granule-associated RNA-binding protein mainly found in the granules of cytotoxic lymphocytes. TIA-1 can be phosphorylated by a serine/threonine kinase that is activated during Fas-mediated apoptosis, and functions as the granule component responsible for inducing apoptosis in cytolytic lymphocyte (CTL) targets. It is composed of three N-terminal highly homologous RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a glutamine-rich C-terminal auxiliary domain containing a lysosome-targeting motif. TIA-1 interacts with RNAs containing short stretches of uridylates and its RRM2 can mediate the specific binding to uridylate-rich RNAs. 	78
410031	cd12619	RRM2_PUB1	RNA recognition motif 2 (RRM2) found in yeast nuclear and cytoplasmic polyadenylated RNA-binding protein PUB1 and similar proteins. This subgroup corresponds to the RRM2 of yeast protein PUB1, also termed ARS consensus-binding protein ACBP-60, or poly uridylate-binding protein, or poly(U)-binding protein. PUB1 has been identified as both, a heterogeneous nuclear RNA-binding protein (hnRNP) and a cytoplasmic mRNA-binding protein (mRNP), which may be stably bound to a translationally inactive subpopulation of mRNAs within the cytoplasm. It is distributed in both, the nucleus and the cytoplasm, and binds to poly(A)+ RNA (mRNA or pre-mRNA). Although it is one of the major cellular proteins cross-linked by UV light to polyadenylated RNAs in vivo, PUB1 is nonessential for cell growth in yeast. PUB1 also binds to T-rich single stranded DNA (ssDNA). However, there is no strong evidence implicating PUB1 in the mechanism of DNA replication. PUB1 contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a GAR motif (glycine and arginine rich stretch) that is located between RRM2 and RRM3. 	80
241064	cd12620	RRM3_TIAR	RNA recognition motif 3 (RRM3) found in nucleolysin TIAR and similar proteins. This subgroup corresponds to the RRM3 of nucleolysin TIAR, also termed TIA-1-related protein, a cytotoxic granule-associated RNA-binding protein that shows high sequence similarity with 40-kDa isoform of T-cell-restricted intracellular antigen-1 (p40-TIA-1). TIAR is mainly localized in the nucleus of hematopoietic and nonhematopoietic cells. It is translocated from the nucleus to the cytoplasm in response to exogenous triggers of apoptosis. TIAR possesses nucleolytic activity against cytolytic lymphocyte (CTL) target cells. It can trigger DNA fragmentation in permeabilized thymocytes, and thus may function as an effector responsible for inducing apoptosis. TIAR is composed of three N-terminal highly homologous RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a glutamine-rich C-terminal auxiliary domain containing a lysosome-targeting motif. It interacts with RNAs containing short stretches of uridylates and its RRM2 can mediate the specific binding to uridylate-rich RNAs. 	73
410032	cd12621	RRM3_TIA1	RNA recognition motif 3 (RRM3) found in nucleolysin TIA-1 isoform p40 (p40-TIA-1) and similar proteins. This subgroup corresponds to the RRM3 of p40-TIA-1, the 40-kDa isoform of T-cell-restricted intracellular antigen-1 (TIA-1) and a cytotoxic granule-associated RNA-binding protein mainly found in the granules of cytotoxic lymphocytes. TIA-1 can be phosphorylated by a serine/threonine kinase that is activated during Fas-mediated apoptosis, and functions as the granule component responsible for inducing apoptosis in cytolytic lymphocyte (CTL) targets. It is composed of three N-terminal highly homologous RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a glutamine-rich C-terminal auxiliary domain containing a lysosome-targeting motif. TIA-1 interacts with RNAs containing short stretches of uridylates and its RRM2 can mediate the specific binding to uridylate-rich RNAs. 	72
410033	cd12622	RRM3_PUB1	RNA recognition motif 3 (RRM3) found in yeast nuclear and cytoplasmic polyadenylated RNA-binding protein PUB1 and similar proteins. This subfamily corresponds to the RRM3 of yeast protein PUB1, also termed ARS consensus-binding protein ACBP-60, or poly uridylate-binding protein, or poly(U)-binding protein. PUB1 has been identified as both, a heterogeneous nuclear RNA-binding protein (hnRNP) and a cytoplasmic mRNA-binding protein (mRNP), which may be stably bound to a translationally inactive subpopulation of mRNAs within the cytoplasm. PUB1 is distributed in both, the nucleus and the cytoplasm, and binds to poly(A)+ RNA (mRNA or pre-mRNA). Although it is one of the major cellular proteins cross-linked by UV light to polyadenylated RNAs in vivo, PUB1 is nonessential for cell growth in yeast. PUB1 also binds to T-rich single stranded DNA (ssDNA); however, there is no strong evidence implicating PUB1 in the mechanism of DNA replication. PUB1 contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a GAR motif (glycine and arginine rich stretch) that is located between RRM2 and RRM3. 	74
410034	cd12623	RRM_PPARGC1A	RNA recognition motif (RRM) found in peroxisome proliferator-activated receptor gamma coactivator 1-alpha (PGC-1alpha, or PPARGC-1-alpha) and similar proteins. This subgroup corresponds to the RRM of PGC-1alpha, also termed PPARGC-1-alpha, or ligand effect modulator 6, a member of a family of transcription coactivators that plays a central role in the regulation of cellular energy metabolism. As an inducible transcription coactivator, PGC-1alpha can interact with a broad range of transcription factors involved in a wide variety of biological responses, such as adaptive thermogenesis, skeletal muscle fiber type switching, glucose/fatty acid metabolism, and heart development. PGC-1alpha stimulates mitochondrial biogenesis and promotes oxidative metabolism. It participates in the regulation of both carbohydrate and lipid metabolism and plays a role in disorders such as obesity, diabetes, and cardiomyopathy. PGC-1alpha is a multi-domain protein containing an N-terminal activation domain region, a central region involved in the interaction with at least a nuclear receptor, and a C-terminal domain region. The N-terminal domain region consists of three leucine-rich motifs (L1, NR box 2 and 3), among which the two last are required for interaction with nuclear receptors, potential nuclear localization signals (NLS), and a proline-rich region overlapping a putative repression domain. The C-terminus of PGC-1alpha is composed of two arginine/serine-rich regions (SR domains), a putative dimerization domain, and an RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). PGC-1alpha could interact favorably with single-stranded RNA. 	91
410035	cd12624	RRM_PRC	RNA recognition motif (RRM) found in peroxisome proliferator-activated receptor gamma coactivator-related protein 1 (PRC) and similar proteins. This subgroup corresponds to the RRM of PRC, also termed PGC-1-related coactivator, a member of the PGC-1 family of transcriptional coactivators, which also includes peroxisome proliferator-activated receptor gamma coactivators PGC-1alpha and PGC-1beta. Unlike PGC-1alpha and PGC-1beta, PRC is ubiquitous and more abundantly expressed in proliferating cells than in growth-arrested cells. PRC has been implicated in the regulation of several metabolic pathways, mitochondrial biogenesis, and cell growth. It functions as a growth-regulated transcriptional cofactor activating many nuclear genes specifying mitochondrial respiratory function. PRC directly interacts with nuclear transcriptional factors implicated in respiratory chain expression including nuclear respiratory factors 1 and 2 (NRF-1 and NRF-2), CREB (cAMP-response element-binding protein), and estrogen-related receptor alpha (ERRalpha). It interacts indirectly with the NRF-2beta subunit through host cell factor (HCF), a cellular protein involved in herpes simplex virus (HSV) infection and cell cycle regulation. Furthermore, like PGC-1alpha and PGC-1beta, PRC can transactivate a number of NRF-dependent nuclear genes required for mitochondrial respiratory function, including those encoding cytochrome c, 5-aminolevulinate synthase, Tfam, TFB1M, and TFB2M. Further research indicates that PRC may also act as a sensor of metabolic stress that orchestrates a redox-sensitive program of inflammatory gene expression. PRC is a multi-domain protein containing an N-terminal activation domain, an LXXLL coactivator signature, a central proline-rich region, a tetrapeptide motif (DHDY) responsible for HCF binding, a C-terminal arginine/serine-rich (SR) domain, and an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	91
241069	cd12625	RRM1_IGF2BP1	RNA recognition motif 1 (RRM1) found in vertebrate insulin-like growth factor 2 mRNA-binding protein 1 (IGF2BP1). This subgroup corresponds to the RRM1 of IGF2BP1 (IGF2 mRNA-binding protein 1 or IMP-1), also termed coding region determinant-binding protein (CRD-BP), or VICKZ family member 1, or zipcode-binding protein 1 (ZBP-1). IGF2BP1 is a multi-functional regulator of RNA metabolism that has been implicated in the control of aspects of localization, stability, and translation for many mRNAs. It is predominantly located in cytoplasm and was initially identified as a trans-acting factor that interacts with the zipcode in the 3'- untranslated region (UTR) of the beta-actin mRNA, which is important for its localization and translational regulation. It inhibits IGF-II mRNA translation through binding to the 5'-UTR of the transcript. IGF2BP1 also acts as human immunodeficiency virus type 1 (HIV-1) Gag-binding factor that interacts with HIV-1 Gag protein and blocks the formation of infectious HIV-1 particles. IGF2BP1 promotes mRNA stabilization; it functions as a coding region determinant (CRD)-binding protein that binds to the coding region of betaTrCP1 mRNA and prevents miR-183-mediated degradation of betaTrCP1 mRNA. It also promotes c-myc mRNA stability by associating with the CRD and stabilizes CD44 mRNA via interaction with the 3'-UTR of the transcript. In addition, IGF2BP1 specifically interacts with both Hepatitis C virus (HCV) 5'-UTR and 3'-UTR, further recruiting eIF3 and enhancing HCV internal ribosome entry site (IRES)-mediated translation initiation via the 3'-UTR. IGF2BP1 contains four hnRNP K-homology (KH) domains, two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a RGG RNA-binding domain. It also contains two putative nuclear export signals (NESs) and a putative nuclear localization signal (NLS). 	77
241070	cd12626	RRM1_IGF2BP2	RNA recognition motif 1 (RRM1) found in vertebrate insulin-like growth factor 2 mRNA-binding protein 2 (IGF2BP2). This subgroup corresponds to the RRM1 of IGF2BP2 (IGF2 mRNA-binding protein 2 or IMP-2), also termed hepatocellular carcinoma autoantigen p62, or VICKZ family member 2, which is a ubiquitously expressed RNA-binding protein involved in the stimulation of insulin action. It is predominantly nuclear. SNPs in the IGF2BP2 gene are implicated in susceptibility to type 2 diabetes. IGF2BP2 plays an important role in cellular motility; it regulates the expression of PINCH-2, an important mediator of cell adhesion and motility, and MURF-3, a microtubule-stabilizing protein, through direct binding to their mRNAs. IGF2BP2 may be involved in the regulation of mRNA stability through the interaction with the AU-rich element-binding factor AUF1. IGF2BP2 binds initially to nascent beta-actin transcripts and facilitates the subsequent binding of the shuttling IGF2BP1. IGF2BP2 contains four hnRNP K-homology (KH) domains, two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a RGG RNA-binding domain. 	77
410036	cd12627	RRM1_IGF2BP3	RNA recognition motif 1 (RRM1) found in vertebrate insulin-like growth factor 2 mRNA-binding protein 3 (IGF2BP3). This subgroup corresponds to the RRM1 of IGF2BP3 (IGF2 mRNA-binding protein 3 or IMP-3), also termed KH domain-containing protein overexpressed in cancer (KOC), or VICKZ family member 3, an RNA-binding protein that plays an important role in the differentiation process during early embryogenesis. It is known to bind to and repress the translation of IGF2 leader 3 mRNA. IGF2BP3 also acts as a Glioblastoma-specific proproliferative and proinvasive marker acting through IGF2 resulting in the activation of oncogenic phosphatidylinositol 3-kinase/mitogen-activated protein kinase (PI3K/MAPK) pathways. IGF2BP3 contains four hnRNP K-homology (KH) domains, two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a RGG RNA-binding domain. 	77
410037	cd12628	RRM2_IGF2BP1	RNA recognition motif 2 (RRM2) found in vertebrate insulin-like growth factor 2 mRNA-binding protein 1 (IGF2BP1). This subgroup corresponds to the RRM2 of IGF2BP1 (IGF2 mRNA-binding protein 1 or IMP-1), also termed coding region determinant-binding protein (CRD-BP), or VICKZ family member 1, or zipcode-binding protein 1 (ZBP-1). IGF2BP1 is a multi-functional regulator of RNA metabolism that has been implicated in the control of aspects of localization, stability, and translation for many mRNAs. It is predominantly located in cytoplasm and was initially identified as a trans-acting factor that interacts with the zipcode in the 3'- untranslated region (UTR) of the beta-actin mRNA, which is important for its localization and translational regulation. It inhibits IGF-II mRNA translation through binding to the 5'-UTR of the transcript. IGF2BP1 also acts as human immunodeficiency virus type 1 (HIV-1) Gag-binding factor that interacts with HIV-1 Gag protein and blocks the formation of infectious HIV-1 particles. It promotes mRNA stabilization and functions as a coding region determinant (CRD)-binding protein that binds to the coding region of betaTrCP1 mRNA and prevents miR-183-mediated degradation of betaTrCP1 mRNA. It also promotes c-myc mRNA stability by associating with the CRD. It stabilizes CD44 mRNA via interaction with the 3'-UTR of the transcript. In addition, IGF2BP1 specifically interacts with both Hepatitis C virus (HCV) 5'-UTR and 3'-UTR, further recruiting eIF3 and enhancing HCV internal ribosome entry site (IRES)-mediated translation initiation via the 3'-UTR. IGF2BP1 contains four hnRNP K-homology (KH) domains, two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a RGG RNA-binding domain. It also contains two putative nuclear export signals (NESs) and a putative nuclear localization signal (NLS). 	76
410038	cd12629	RRM2_IGF2BP2	RNA recognition motif 2 (RRM2) found in vertebrate insulin-like growth factor 2 mRNA-binding protein 2 (IGF2BP2). This subgroup corresponds to the RRM2 of IGF2BP2 (IGF2 mRNA-binding protein 2 or IMP-2), also termed hepatocellular carcinoma autoantigen p62, or VICKZ family member 2, a ubiquitously expressed RNA-binding protein involved in the stimulation of insulin action. It is predominantly nuclear. SNPs in IGF2BP2 gene are implicated in susceptibility to type 2 diabetes. IGF2BP2 plays an important role in cellular motility; it regulates the expression of PINCH-2, an important mediator of cell adhesion and motility, and MURF-3, a microtubule-stabilizing protein, through direct binding to their mRNAs. IGF2BP2 may be involved in the regulation of mRNA stability through the interaction with the AU-rich element-binding factor AUF1. In addition, IGF2BP2 binds initially to nascent beta-actin transcripts and facilitates the subsequent binding of the shuttling IGF2BP1. IGF2BP2 contains four hnRNP K-homology (KH) domains, two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a RGG RNA-binding domain. 	76
410039	cd12630	RRM2_IGF2BP3	RNA recognition motif 2 (RRM2) found in vertebrate insulin-like growth factor 2 mRNA-binding protein 3 (IGF2BP3). This subgroup corresponds to the RRM2 of IGF2BP3 (IGF2 mRNA-binding protein 3 or IMP-3), also termed KH domain-containing protein overexpressed in cancer (KOC), or VICKZ family member 3, an RNA-binding protein that plays an important role in the differentiation process during early embryogenesis. It is known to bind to and repress the translation of IGF2 leader 3 mRNA. IGF2BP3 also acts as a Glioblastoma-specific proproliferative and proinvasive marker acting through IGF2 resulting in the activation of oncogenic phosphatidylinositol 3-kinase/mitogen-activated protein kinase (PI3K/MAPK) pathways. IGF2BP3 contains four hnRNP K-homology (KH) domains, two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a RGG RNA-binding domain. 	76
410040	cd12631	RRM1_CELF1_2_Bruno	RNA recognition motif 1 (RRM1) found in CUGBP Elav-like family member CELF-1, CELF-2, Drosophila melanogaster Bruno protein and similar proteins. This subgroup corresponds to the RRM1 of CELF-1, CELF-2 and Bruno protein. CELF-1 (also termed BRUNOL-2, or CUG-BP1, or EDEN-BP) and CELF-2 (also termed BRUNOL-3, or ETR-3, or CUG-BP2, or NAPOR) belong to the CUGBP1 and ETR-3-like factors (CELF) or BRUNOL (Bruno-like) family of RNA-binding proteins that have been implicated in regulation of pre-mRNA splicing, and control of mRNA translation and deadenylation. CELF-1 is strongly expressed in all adult and fetal tissues tested. The human CELF-1 is a nuclear and cytoplasmic RNA-binding protein that regulates multiple aspects of nuclear and cytoplasmic mRNA processing, with implications for onset of type 1 myotonic dystrophy (DM1), a neuromuscular disease associated with an unstable CUG triplet expansion in the 3'-UTR (3'-untranslated region) of the DMPK (myotonic dystrophy protein kinase) gene; it preferentially targets UGU-rich mRNA elements. It has been shown to bind to a Bruno response element, a cis-element involved in translational control of oskar mRNA in Drosophila, and share sequence similarity to Bruno, the Drosophila protein that mediates this process. The Xenopus homolog embryo deadenylation element-binding protein (EDEN-BP) mediates sequence-specific deadenylation of Eg5 mRNA. It binds specifically to the EDEN motif in the 3'-untranslated regions of maternal mRNAs and targets these mRNAs for deadenylation and translational repression. CELF-1 contains three highly conserved RNA recognition motifs (RRMs), also known as RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains): two consecutive RRMs (RRM1 and RRM2) situated in the N-terminal region followed by a linker region and the third RRM (RRM3) close to the C-terminus of the protein. The two N-terminal RRMs of EDEN-BP are necessary for the interaction with EDEN as well as a part of the linker region (between RRM2 and RRM3). Oligomerization of EDEN-BP is required for specific mRNA deadenylation and binding. CELF-2 is expressed in all tissues at some level, but highest in brain, heart, and thymus. It has been implicated in the regulation of nuclear and cytoplasmic RNA processing events, including alternative splicing, RNA editing, stability and translation. CELF-2 shares high sequence identity with CELF-1, but shows different binding specificity; it binds preferentially to sequences with UG repeats and UGUU motifs. It has been shown to bind to a Bruno response element, a cis-element involved in translational control of oskar mRNA in Drosophila, and share sequence similarity to Bruno, the Drosophila protein that mediates this process. It also binds to the 3'-UTR of cyclooxygenase-2 messages, affecting both translation and mRNA stability, and binds to apoB mRNA, regulating its C to U editing. CELF-2 also contains three highly conserved RRMs. It binds to RNA via the first two RRMs, which are also important for localization in the cytoplasm. The splicing activation or repression activity of CELF-2 on some specific substrates is mediated by RRM1/RRM2. Both RRM1 and RRM2 of CELF-2 can activate cardiac troponin T (cTNT) exon 5 inclusion. In addition, CELF-2 possesses a typical arginine and lysine-rich nuclear localization signal (NLS) in the C-terminus, within RRM3. This subgroup also includes Drosophila melanogaster Bruno protein, which plays a central role in regulation of Oskar (Osk) expression in flies. It mediates repression by binding to regulatory Bruno response elements (BREs) in the Osk mRNA 3' UTR. The full-length Bruno protein contains three RRMs, two located in the N-terminal half of the protein and the third near the C-terminus, separated by a linker region. 	84
410041	cd12632	RRM1_CELF3_4_5_6	RNA recognition motif 1 (RRM1) found in CUGBP Elav-like family member CELF-3, CELF-4, CELF-5, CELF-6 and similar proteins. This subfamily corresponds to the RRM1 of CELF-3, CELF-4, CELF-5, CELF-6, all of which belong to the CUGBP1 and ETR-3-like factors (CELF) or BRUNOL (Bruno-like) family of RNA-binding proteins that display dual nuclear and cytoplasmic localizations and have been implicated in the regulation of pre-mRNA splicing and in the control of mRNA translation and deadenylation. CELF-3, expressed in brain and testis only, is also known as bruno-like protein 1 (BRUNOL-1), or CAG repeat protein 4, or CUG-BP- and ETR-3-like factor 3, or embryonic lethal abnormal vision (ELAV)-type RNA-binding protein 1 (ETR-1), or expanded repeat domain protein CAG/CTG 4, or trinucleotide repeat-containing gene 4 protein (TNRC4). It plays an important role in the pathogenesis of tauopathies. CELF-3 contains three highly conserved RNA recognition motifs (RRMs), also known as RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains): two consecutive RRMs (RRM1 and RRM2) situated in the N-terminal region followed by a linker region and the third RRM (RRM3) close to the C-terminus of the protein. The effect of CELF-3 on tau splicing is mediated mainly by the RNA-binding activity of RRM2. The divergent linker region might mediate the interaction of CELF-3 with other proteins regulating its activity or involved in target recognition. CELF-4, highly expressed throughout the brain and in glandular tissues, moderately expressed in heart, skeletal muscle, and liver, is also known as bruno-like protein 4 (BRUNOL-4), or CUG-BP- and ETR-3-like factor 4. Like CELF-3, CELF-4 also contains three highly conserved RRMs. The splicing activation or repression activity of CELF-4 on some specific substrates is mediated by its RRM1/RRM2. On the other hand, both RRM1 and RRM2 of CELF-4 can activate cardiac troponin T (cTNT) exon 5 inclusion. CELF-5, expressed in brain, is also known as bruno-like protein 5 (BRUNOL-5), or CUG-BP- and ETR-3-like factor 5. Although its biological role remains unclear, CELF-5 shares the same domain architecture with CELF-3. CELF-6, strongly expressed in kidney, brain, and testis, is also known as bruno-like protein 6 (BRUNOL-6), or CUG-BP- and ETR-3-like factor 6. It activates exon inclusion of a cardiac troponin T minigene in transient transfection assays in a muscle-specific splicing enhancer (MSE)-dependent manner and can activate inclusion via multiple copies of a single element, MSE2. CELF-6 also promotes skipping of exon 11 of insulin receptor, a known target of CELF activity that is expressed in kidney. In addition to three highly conserved RRMs, CELF-6 also possesses numerous potential phosphorylation sites, a potential nuclear localization signal (NLS) at the C terminus, and an alanine-rich region within the divergent linker region. 	87
241077	cd12633	RRM1_FCA	RNA recognition motif 1 (RRM1) found in plant flowering time control protein FCA and similar proteins. This subgroup corresponds to the RRM1 of FCA, a gene controlling flowering time in Arabidopsis, encoding a flowering time control protein that functions in the posttranscriptional regulation of transcripts involved in the flowering process. FCA contains two RNA recognition motifs (RRMs), also known as RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a WW protein interaction domain. 	80
410042	cd12634	RRM2_CELF1_2	RNA recognition motif 2 (RRM2) found in CUGBP Elav-like family member CELF-1, CELF-2 and similar proteins. This subgroup corresponds to the RRM2 of CELF-1 (also termed BRUNOL-2, or CUG-BP1, or EDEN-BP) and CELF-2 (also termed BRUNOL-3, or ETR-3, or CUG-BP2, or NAPOR), both of which belong to the CUGBP1 and ETR-3-like factors (CELF) or BRUNOL (Bruno-like) family of RNA-binding proteins that have been implicated in the regulation of pre-mRNA splicing and in the control of mRNA translation and deadenylation. CELF-1 is strongly expressed in all adult and fetal tissues tested. Human CELF-1 is a nuclear and cytoplasmic RNA-binding protein that regulates multiple aspects of nuclear and cytoplasmic mRNA processing, with implications for onset of type 1 myotonic dystrophy (DM1), a neuromuscular disease associated with an unstable CUG triplet expansion in the 3'-UTR (3'-untranslated region) of the DMPK (myotonic dystrophy protein kinase) gene; it preferentially targets UGU-rich mRNA elements. It has been shown to bind to a Bruno response element, a cis-element involved in translational control of oskar mRNA in Drosophila, and share sequence similarity to Bruno, the Drosophila protein that mediates this process. The Xenopus homolog embryo deadenylation element-binding protein (EDEN-BP) mediates sequence-specific deadenylation of Eg5 mRNA. It binds specifically to the EDEN motif in the 3'-untranslated regions of maternal mRNAs and targets these mRNAs for deadenylation and translational repression. CELF-1 contains three highly conserved RNA recognition motifs (RRMs), also known as RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains): two consecutive RRMs (RRM1 and RRM2) situated in the N-terminal region followed by a linker region and the third RRM (RRM3) close to the C-terminus of the protein. The two N-terminal RRMs of EDEN-BP are necessary for the interaction with EDEN as well as a part of the linker region (between RRM2 and RRM3). Oligomerization of EDEN-BP is required for specific mRNA deadenylation and binding. CELF-2 is expressed in all tissues at some level, but highest in brain, heart, and thymus. It has been implicated in the regulation of nuclear and cytoplasmic RNA processing events, including alternative splicing, RNA editing, stability and translation. CELF-2 shares high sequence identity with CELF-1, but shows different binding specificity; it preferentially binds to sequences with UG repeats and UGUU motifs. It has been shown to bind to a Bruno response element, a cis-element involved in translational control of oskar mRNA in Drosophila, and share sequence similarity to Bruno, the Drosophila protein that mediates this process. It also binds to the 3'-UTR of cyclooxygenase-2 messages, affecting both translation and mRNA stability, and binds to apoB mRNA, regulating its C to U editing. CELF-2 also contains three highly conserved RRMs. It binds to RNA via the first two RRMs, which are also important for localization in the cytoplasm. The splicing activation or repression activity of CELF-2 on some specific substrates is mediated by RRM1/RRM2. Both RRM1 and RRM2 of CELF-2 can activate cardiac troponin T (cTNT) exon 5 inclusion. In addition, CELF-2 possesses a typical arginine and lysine-rich nuclear localization signal (NLS) in the C-terminus, within RRM3. 	81
410043	cd12635	RRM2_CELF3_4_5_6	RNA recognition motif 2 (RRM2) found in CUGBP Elav-like family member CELF-3, CELF-4, CELF-5, CELF-6 and similar proteins. This subgroup corresponds to the RRM2 of CELF-3, CELF-4, CELF-5, and CELF-6, all of which belong to the CUGBP1 and ETR-3-like factors (CELF) or BRUNOL (Bruno-like) family of RNA-binding proteins that display dual nuclear and cytoplasmic localizations and have been implicated in the regulation of pre-mRNA splicing and in the control of mRNA translation and deadenylation. CELF-3, expressed in brain and testis only, is also known as bruno-like protein 1 (BRUNOL-1), or CAG repeat protein 4, or CUG-BP- and ETR-3-like factor 3, or embryonic lethal abnormal vision (ELAV)-type RNA-binding protein 1 (ETR-1), or expanded repeat domain protein CAG/CTG 4, or trinucleotide repeat-containing gene 4 protein (TNRC4). It plays an important role in the pathogenesis of tauopathies. CELF-3 contains three highly conserved RNA recognition motifs (RRMs), also known as RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains): two consecutive RRMs (RRM1 and RRM2) situated in the N-terminal region followed by a linker region and the third RRM (RRM3) close to the C-terminus of the protein. The effect of CELF-3 on tau splicing is mediated mainly by the RNA-binding activity of RRM2. The divergent linker region might mediate the interaction of CELF-3 with other proteins regulating its activity or involved in target recognition. CELF-4, highly expressed throughout the brain and in glandular tissues, moderately expressed in heart, skeletal muscle, and liver, is also known as bruno-like protein 4 (BRUNOL-4), or CUG-BP- and ETR-3-like factor 4. Like CELF-3, CELF-4 also contains three highly conserved RRMs. The splicing activation or repression activity of CELF-4 on some specific substrates is mediated by its RRM1/RRM2. On the other hand, both RRM1 and RRM2 of CELF-4 can activate cardiac troponin T (cTNT) exon 5 inclusion. CELF-5, expressed in brain, is also known as bruno-like protein 5 (BRUNOL-5), or CUG-BP- and ETR-3-like factor 5. Although its biological role remains unclear, CELF-5 shares the same domain architecture with CELF-3. CELF-6, strongly expressed in kidney, brain, and testis, is also known as bruno-like protein 6 (BRUNOL-6), or CUG-BP- and ETR-3-like factor 6. It activates exon inclusion of a cardiac troponin T minigene in transient transfection assays in a muscle-specific splicing enhancer (MSE)-dependent manner and can activate inclusion via multiple copies of a single element, MSE2. CELF-6 also promotes skipping of exon 11 of insulin receptor, a known target of CELF activity that is expressed in kidney. In addition to three highly conserved RRMs, CELF-6 also possesses numerous potential phosphorylation sites, a potential nuclear localization signal (NLS) at the C terminus, and an alanine-rich region within the divergent linker region. 	81
410044	cd12636	RRM2_Bruno_like	RNA recognition motif 2 (RRM2) found in Drosophila melanogaster Bruno protein and similar proteins. This subgroup corresponds to the RRM2 of Bruno, a Drosophila RNA recognition motif (RRM)-containing protein that plays a central role in regulation of Oskar (Osk) expression. It mediates repression by binding to regulatory Bruno response elements (BREs) in the Osk mRNA 3' UTR. The full-length Bruno protein contains three RRMs, two located in the N-terminal half of the protein and the third near the C-terminus, separated by a linker region. 	81
410045	cd12637	RRM2_FCA	RNA recognition motif 2 (RRM2) found in plant flowering time control protein FCA and similar proteins. This subgroup corresponds to the RRM2 of FCA, a gene controlling flowering time in Arabidopsis, which encodes a flowering time control protein that functions in the posttranscriptional regulation of transcripts involved in the flowering process. The flowering time control protein FCA contains two RNA recognition motifs (RRMs), also known as RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a WW protein interaction domain. 	81
241082	cd12638	RRM3_CELF1_2	RNA recognition motif 3 (RRM3) found in CUGBP Elav-like family member CELF-1, CELF-2 and similar proteins. This subgroup corresponds to the RRM3 of CELF-1 (also termed BRUNOL-2, or CUG-BP1, or EDEN-BP) and CELF-2 (also termed BRUNOL-3, or ETR-3, or CUG-BP2, or NAPOR), both of which belong to the CUGBP1 and ETR-3-like factors (CELF) or BRUNOL (Bruno-like) family of RNA-binding proteins that have been implicated in the regulation of pre-mRNA splicing and in the control of mRNA translation and deadenylation. CELF-1 is strongly expressed in all adult and fetal tissues tested. Human CELF-1 is a nuclear and cytoplasmic RNA-binding protein that regulates multiple aspects of nuclear and cytoplasmic mRNA processing, with implications for onset of type 1 myotonic dystrophy (DM1), a neuromuscular disease associated with an unstable CUG triplet expansion in the 3'-UTR (3'-untranslated region) of the DMPK (myotonic dystrophy protein kinase) gene; it preferentially targets UGU-rich mRNA elements. It has been shown to bind to a Bruno response element, a cis-element involved in translational control of oskar mRNA in Drosophila, and share sequence similarity to Bruno, the Drosophila protein that mediates this process. The Xenopus homolog embryo deadenylation element-binding protein (EDEN-BP) mediates sequence-specific deadenylation of Eg5 mRNA. It specifically binds to the EDEN motif in the 3'-untranslated regions of maternal mRNAs and targets these mRNAs for deadenylation and translational repression. CELF-1 contains three highly conserved RNA recognition motifs (RRMs), also known as RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains): two consecutive RRMs (RRM1 and RRM2) situated in the N-terminal region followed by a linker region and the third RRM (RRM3) close to the C-terminus of the protein. The two N-terminal RRMs of EDEN-BP are necessary for the interaction with EDEN as well as a part of the linker region (between RRM2 and RRM3). Oligomerization of EDEN-BP is required for specific mRNA deadenylation and binding. CELF-2 is expressed in all tissues at some level, but highest in brain, heart, and thymus. It has been implicated in the regulation of nuclear and cytoplasmic RNA processing events, including alternative splicing, RNA editing, stability and translation. CELF-2 shares high sequence identity with CELF-1, but shows different binding specificity; it binds preferentially to sequences with UG repeats and UGUU motifs. It has been shown to bind to a Bruno response element, a cis-element involved in translational control of oskar mRNA in Drosophila, and share sequence similarity to Bruno, the Drosophila protein that mediates this process. It also binds to the 3'-UTR of cyclooxygenase-2 messages, affecting both translation and mRNA stability, and binds to apoB mRNA, regulating its C to U editing. CELF-2 also contains three highly conserved RRMs. It binds to RNA via the first two RRMs, which are important for localization in the cytoplasm. The splicing activation or repression activity of CELF-2 on some specific substrates is mediated by RRM1/RRM2. Both RRM1 and RRM2 of CELF-2 can activate cardiac troponin T (cTNT) exon 5 inclusion. In addition, CELF-2 possesses a typical arginine and lysine-rich nuclear localization signal (NLS) in the C-terminus, within RRM3. 	92
241083	cd12639	RRM3_CELF3_4_5_6	RNA recognition motif 3 (RRM3) found in CUGBP Elav-like family member CELF-3, CELF-4, CELF-5, CELF-6 and similar proteins. This subgroup corresponds to the RRM3 of CELF-3, CELF-4, CELF-5, and CELF-6, all of which belong to the CUGBP1 and ETR-3-like factors (CELF) or BRUNOL (Bruno-like) family of RNA-binding proteins that display dual nuclear and cytoplasmic localizations and have been implicated in the regulation of pre-mRNA splicing and in the control of mRNA translation and deadenylation. CELF-3, expressed in brain and testis only, is also known as bruno-like protein 1 (BRUNOL-1), or CAG repeat protein 4, or CUG-BP- and ETR-3-like factor 3, or embryonic lethal abnormal vision (ELAV)-type RNA-binding protein 1 (ETR-1), or expanded repeat domain protein CAG/CTG 4, or trinucleotide repeat-containing gene 4 protein (TNRC4). It plays an important role in the pathogenesis of tauopathies. CELF-3 contains three highly conserved RNA recognition motifs (RRMs), also known as RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains): two consecutive RRMs (RRM1 and RRM2) situated in the N-terminal region followed by a linker region and the third RRM (RRM3) close to the C-terminus of the protein. The effect of CELF-3 on tau splicing is mediated mainly by the RNA-binding activity of RRM2. The divergent linker region might mediate the interaction of CELF-3 with other proteins regulating its activity or involved in target recognition. CELF-4, highly expressed throughout the brain and in glandular tissues, moderately expressed in heart, skeletal muscle, and liver, is also known as bruno-like protein 4 (BRUNOL-4), or CUG-BP- and ETR-3-like factor 4. Like CELF-3, CELF-4 also contains three highly conserved RRMs. The splicing activation or repression activity of CELF-4 on some specific substrates is mediated by its RRM1/RRM2. Both RRM1 and RRM2 of CELF-4 can activate cardiac troponin T (cTNT) exon 5 inclusion. CELF-5, expressed in brain, is also known as bruno-like protein 5 (BRUNOL-5), or CUG-BP- and ETR-3-like factor 5. Although its biological role remains unclear, CELF-5 shares the same domain architecture with CELF-3. CELF-6, strongly expressed in kidney, brain, and testis, is also known as bruno-like protein 6 (BRUNOL-6), or CUG-BP- and ETR-3-like factor 6. It activates exon inclusion of a cardiac troponin T minigene in transient transfection assays in a muscle-specific splicing enhancer (MSE)-dependent manner and can activate inclusion via multiple copies of a single element, MSE2. CELF-6 also promotes skipping of exon 11 of insulin receptor, a known target of CELF activity that is expressed in kidney. In addition to three highly conserved RRMs, CELF-6 also possesses numerous potential phosphorylation sites, a potential nuclear localization signal (NLS) at the C terminus, and an alanine-rich region within the divergent linker region. 	79
241084	cd12640	RRM3_Bruno_like	RNA recognition motif 3 (RRM3) found in Drosophila melanogaster Bruno protein and similar proteins. This subgroup corresponds to the RRM3 of Bruno protein, a Drosophila RNA recognition motif (RRM)-containing protein that plays a central role in regulation of Oskar (Osk) expression. It mediates repression by binding to regulatory Bruno response elements (BREs) in the Osk mRNA 3' UTR. The full-length Bruno protein contains three RRMs, two located in the N-terminal half of the protein and the third near the C-terminus, separated by a linker region. 	79
410046	cd12641	RRM_TRA2B	RNA recognition motif (RRM) found in Transformer-2 protein homolog beta (TRA-2 beta) and similar proteins. This subgroup corresponds to the RRM of TRA2-beta or TRA-2-beta, also termed splicing factor, arginine/serine-rich 10 (SFRS10), or transformer-2 protein homolog B, a mammalian homolog of Drosophila transformer-2 (Tra2). TRA2-beta is a serine/arginine-rich (SR) protein that controls the pre-mRNA alternative splicing of the calcitonin/calcitonin gene-related peptide (CGRP), the survival motor neuron 1 (SMN1) protein and the tau protein. It contains a well conserved RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), flanked by the N- and C-terminal arginine/serine (RS)-rich regions. TRA2-beta specifically binds to two types of RNA sequences, the CAA and (GAA)2 sequences, through the RRMs in different RNA binding modes.  	87
410047	cd12642	RRM_TRA2A	RNA recognition motif (RRM) found in transformer-2 protein homolog alpha (TRA-2 alpha) and similar proteins. This subgroup corresponds to the RRM of TRA2-alpha or TRA-2-alpha, also termed transformer-2 protein homolog A, a mammalian homolog of Drosophila transformer-2 (Tra2). TRA2-alpha is a 40-kDa serine/arginine-rich (SR) protein (SRp40) that specifically binds to gonadotropin-releasing hormone (GnRH) exonic splicing enhancer on exon 4 (ESE4) and is necessary for enhanced GnRH pre-mRNA splicing. It strongly stimulates GnRH intron A excision in a dose-dependent manner. In addition, TRA2-alpha can interact with either 9G8 or SRp30c, which may also be crucial for ESE-dependent GnRH pre-mRNA splicing. TRA2-alpha contains a well conserved RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), flanked by the N- and C-terminal arginine/serine (RS)-rich regions. 	84
410048	cd12643	RRM_CFIm68	RNA recognition motif (RRM) found in pre-mRNA cleavage factor Im 68 kDa subunit (CFIm68 or CPSF6) and similar proteins. This subgroup corresponds to the RRM of CFIm68. Cleavage factor Im (CFIm) is a highly conserved component of the eukaryotic mRNA 3' processing machinery that functions in UGUA-mediated poly(A) site recognition, the regulation of alternative poly(A) site selection, mRNA export, and mRNA splicing. It is a complex composed of a small 25 kDa (CFIm25) subunit and a larger 59/68/72 kDa subunit. Two separate genes, CPSF6 and CPSF7, code for two isoforms of the large subunit, CFIm68 and CFIm59. The family includes CFIm68, also termed cleavage and polyadenylation specificity factor subunit 6 (CPSF6), or cleavage and polyadenylation specificity factor 68 kDa subunit (CPSF68), or protein HPBRII-4/7. CFIm68 contains an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), a central proline-rich region, and a C-terminal RS-like domain. The N-terminal RRM of CFIm68 mediates the interaction with CFIm25. It also serves to enhance RNA binding and facilitate RNA looping. 	77
410049	cd12644	RRM_CFIm59	RNA recognition motif (RRM) found in pre-mRNA cleavage factor Im 59 kDa subunit (CFIm59 or CPSF7) and similar proteins. This subgroup corresponds to the RRM of CFIm59. Cleavage factor Im (CFIm) is a highly conserved component of the eukaryotic mRNA 3' processing machinery that functions in UGUA-mediated poly(A) site recognition, the regulation of alternative poly(A) site selection, mRNA export, and mRNA splicing. It is a complex composed of a small 25 kDa (CFIm25) subunit and a larger 59/68/72 kDa subunit. Two separate genes, CPSF6 and CPSF7, code for two isoforms of the large subunit, CFIm68 and CFIm59. The family includes CFIm59, also termed cleavage and polyadenylation specificity factor subunit 7 (CPSF7), or cleavage and polyadenylation specificity factor 59 kDa subunit (CPSF59). CFIm59 contains an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), a central proline-rich region, and a C-terminal RS-like domain. The N-terminal RRM of CFIm59 mediates the interaction with CFIm25. It also serves to enhance RNA binding and facilitate RNA looping. 	90
241089	cd12645	RRM_SRSF3	RNA recognition motif (RRM) found in vertebrate serine/arginine-rich splicing factor 3 (SRSF3). This subgroup corresponds to the RRM of SRSF3, also termed pre-mRNA-splicing factor SRp20, a splicing regulatory serine/arginine (SR) protein that modulates alternative splicing by interacting with RNA cis-elements in a concentration- and cell differentiation-dependent manner. It is also involved in termination of transcription, alternative RNA polyadenylation, RNA export, and protein translation. SRSF3 is critical for cell proliferation and tumor induction and maintenance. SRSF3 can shuttle between the nucleus and cytoplasm. It contains a single N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal RS domain rich in serine-arginine dipeptides. The RRM domain is involved in RNA binding, and the RS domain has been implicated in protein shuttling and protein-protein interactions. 	81
410050	cd12646	RRM_SRSF7	RNA recognition motif (RRM) found in vertebrate serine/arginine-rich splicing factor 7 (SRSF7). This subgroup corresponds to the RRM of SRSF7, also termed splicing factor 9G8, a splicing regulatory serine/arginine (SR) protein that plays a crucial role in both constitutive splicing and alternative splicing of many pre-mRNAs. Its localization and functions are tightly regulated by phosphorylation. SRSF7 is predominantly present in the nucleus and can shuttle between nucleus and cytoplasm. It cooperates with the export protein, Tap/NXF1, helps mRNA export to the cytoplasm, and enhances the expression of unspliced mRNA. SRSF7 inhibits tau E10 inclusion through directly interacting with the proximal downstream intron of E10, a clustering region for frontotemporal dementia with Parkinsonism (FTDP) mutations. SRSF7 contains a single N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), followed by a CCHC-type zinc knuckle motif in its median region, and a C-terminal RS domain rich in serine-arginine dipeptides. The RRM domain is involved in RNA binding, and the RS domain has been implicated in protein shuttling and protein-protein interactions. 	77
410051	cd12647	RRM_UHM_SPF45	RNA recognition motif (RRM) found in UHM domain of 45 kDa-splicing factor (SPF45) and similar proteins. This subgroup corresponds to the RRM of SPF45, also termed RNA-binding motif protein 17 (RBM17), an RNA-binding protein consisting of an unstructured N-terminal region, followed by a G-patch motif and a C-terminal U2AF (U2 auxiliary factor) homology motif (UHM) that harbors an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and an Arg-Xaa-Phe sequence motif. SPF45 regulates alternative splicing of the apoptosis regulatory gene FAS (also known as CD95). It induces exon 6 skipping in FAS pre-mRNA through the UHM domain that binds to tryptophan-containing linear peptide motifs (UHM ligand motifs, ULMs) present in the 3' splice site-recognizing factors U2AF65, SF1 and SF3b155. 	95
410052	cd12648	RRM3_UHM_PUF60	RNA recognition motif 3 (RRM3) found in UHM domain of poly(U)-binding-splicing factor PUF60 and similar proteins. This subgroup corresponds to the RRM3 of PUF60, also termed FUSE-binding protein-interacting repressor (FBP-interacting repressor or FIR), or Ro-binding protein 1 (RoBP1), or Siah-binding protein 1 (Siah-BP1), an essential splicing factor that functions as a poly-U RNA-binding protein required to reconstitute splicing in depleted nuclear extracts. Its function is enhanced through interaction with U2 auxiliary factor U2AF65. PUF60 also controls human c-myc gene expression by binding and inhibiting the transcription factor far upstream sequence element (FUSE)-binding-protein (FBP), an activator of c-myc promoters. PUF60 contains two central RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a C-terminal U2AF (U2 auxiliary factor) homology motif (UHM) that harbors another RRM and binds to tryptophan-containing linear peptide motifs (UHM ligand motifs, ULMs) in several nuclear proteins. Research indicates that PUF60 binds FUSE as a dimer, and only the first two RRM domains participate in the single-stranded DNA recognition. 	98
241093	cd12649	RRM1_SXL	RNA recognition motif 1 (RRM1) found in Drosophila sex-lethal (SXL) and similar proteins. This subfamily corresponds to the RRM1 of SXL which governs sexual differentiation and X chromosome dosage compensation in Drosophila melanogaster. It induces female-specific alternative splicing of the transformer (tra) pre-mRNA by binding to the tra uridine-rich polypyrimidine tract at the non-sex-specific 3' splice site during the sex-determination process. SXL binds also to its own pre-mRNA and promotes female-specific alternative splicing. SXL contains an N-terminal Gly/Asn-rich domain that may be responsible for the protein-protein interaction, and tandem RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), that show high preference to bind single-stranded, uridine-rich target RNA transcripts. 	81
410053	cd12650	RRM1_Hu	RNA recognition motif 1 (RRM1) found in the Hu proteins family. This subfamily corresponds to the RRM1 of the Hu proteins family which represents a group of RNA-binding proteins involved in diverse biological processes. Since the Hu proteins share high homology with the Drosophila embryonic lethal abnormal vision (ELAV) protein, the Hu family is sometimes referred to as the ELAV family. Drosophila ELAV is exclusively expressed in neurons and is required for the correct differentiation and survival of neurons in flies. The neuronal members of the Hu family include Hu-antigen B (HuB or ELAV-2 or Hel-N1), Hu-antigen C (HuC or ELAV-3 or PLE21), and Hu-antigen D (HuD or ELAV-4), which play important roles in neuronal differentiation, plasticity and memory. HuB is also expressed in gonads. Hu-antigen R (HuR or ELAV-1 or HuA) is the ubiquitously expressed Hu family member. It has a variety of biological functions mostly related to the regulation of cellular response to DNA damage and other types of stress. HuR has an anti-apoptotic function during early cell stress response. It binds to mRNAs and enhances the expression of several anti-apoptotic proteins, such as p21waf1, p53, and prothymosin alpha. HuR also has pro-apoptotic function by promoting apoptosis when cell death is unavoidable. Furthermore, HuR may be important in muscle differentiation, adipogenesis, suppression of inflammatory response and modulation of gene expression in response to chronic ethanol exposure and amino acid starvation. Hu proteins perform their cytoplasmic and nuclear molecular functions by coordinately regulating functionally related mRNAs. In the cytoplasm, Hu proteins recognize and bind to AU-rich RNA elements (AREs) in the 3' untranslated regions (UTRs) of certain target mRNAs, such as GAP-43, vascular endothelial growth factor (VEGF), the glucose transporter GLUT1, eotaxin and c-fos, and stabilize those ARE-containing mRNAs. They also bind and regulate the translation of some target mRNAs, such as neurofilament M, GLUT1, and p27. In the nucleus, Hu proteins function as regulators of polyadenylation and alternative splicing. Each Hu protein contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). RRM1 and RRM2 may cooperate in binding to an ARE. RRM3 may help to maintain the stability of the RNA-protein complex, and might also bind to poly(A) tails or be involved in protein-protein interactions. 	77
410054	cd12651	RRM2_SXL	RNA recognition motif 2 (RRM2) found in Drosophila sex-lethal (SXL) and similar proteins. This subfamily corresponds to the RRM2 of the sex-lethal protein (SXL) which governs sexual differentiation and X chromosome dosage compensation in Drosophila melanogaster. It induces female-specific alternative splicing of the transformer (tra) pre-mRNA by binding to the tra uridine-rich polypyrimidine tract at the non-sex-specific 3' splice site during the sex-determination process. SXL binds also to its own pre-mRNA and promotes female-specific alternative splicing. SXL contains an N-terminal Gly/Asn-rich domain that may be responsible for the protein-protein interaction, and tandem RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), that show high preference to bind single-stranded, uridine-rich target RNA transcripts. 	81
410055	cd12652	RRM2_Hu	RNA recognition motif 2 (RRM2) found in the Hu proteins family. This subfamily corresponds to the RRM2 of the Hu proteins family which represents a group of RNA-binding proteins involved in diverse biological processes. Since the Hu proteins share high homology with the Drosophila embryonic lethal abnormal vision (ELAV) protein, the Hu family is sometimes referred to as the ELAV family. Drosophila ELAV is exclusively expressed in neurons and is required for the correct differentiation and survival of neurons in flies. The neuronal members of the Hu family include Hu-antigen B (HuB or ELAV-2 or Hel-N1), Hu-antigen C (HuC or ELAV-3 or PLE21), and Hu-antigen D (HuD or ELAV-4), which play important roles in neuronal differentiation, plasticity and memory. HuB is also expressed in gonads. Hu-antigen R (HuR or ELAV-1 or HuA) is the ubiquitously expressed Hu family member. It has a variety of biological functions mostly related to the regulation of cellular response to DNA damage and other types of stress. Moreover, HuR has an anti-apoptotic function during early cell stress response. It binds to mRNAs and enhances the expression of several anti-apoptotic proteins, such as p21waf1, p53, and prothymosin alpha. HuR also has pro-apoptotic function by promoting apoptosis when cell death is unavoidable. Furthermore, HuR may be important in muscle differentiation, adipogenesis, suppression of inflammatory response and modulation of gene expression in response to chronic ethanol exposure and amino acid starvation. Hu proteins perform their cytoplasmic and nuclear molecular functions by coordinately regulating functionally related mRNAs. In the cytoplasm, Hu proteins recognize and bind to AU-rich RNA elements (AREs) in the 3' untranslated regions (UTRs) of certain target mRNAs, such as GAP-43, vascular endothelial growth factor (VEGF), the glucose transporter GLUT1, eotaxin and c-fos, and stabilize those ARE-containing mRNAs. They also bind and regulate the translation of some target mRNAs, such as neurofilament M, GLUT1, and p27. In the nucleus, Hu proteins function as regulators of polyadenylation and alternative splicing. Each Hu protein contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). RRM1 and RRM2 may cooperate in binding to an ARE. RRM3 may help to maintain the stability of the RNA-protein complex, and might also bind to poly(A) tails or be involved in protein-protein interactions. 	79
410056	cd12653	RRM3_HuR	RNA recognition motif 3 (RRM3) found in vertebrate Hu-antigen R (HuR). This subgroup corresponds to the RRM3 of HuR, also termed ELAV-like protein 1 (ELAV-1), the ubiquitously expressed Hu family member. It has a variety of biological functions mostly related to the regulation of cellular response to DNA damage and other types of stress. HuR has an anti-apoptotic function during early cell stress response. It binds to mRNAs and enhances the expression of several anti-apoptotic proteins, such as p21waf1, p53, and prothymosin alpha. HuR also has pro-apoptotic function by promoting apoptosis when cell death is unavoidable. Furthermore, HuR may be important in muscle differentiation, adipogenesis, suppression of inflammatory response and modulation of gene expression in response to chronic ethanol exposure and amino acid starvation. Like other Hu proteins, HuR contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). RRM1 and RRM2 may cooperate in binding to an AU-rich RNA element (ARE). RRM3 may help to maintain the stability of the RNA-protein complex, and might also bind to poly(A) tails or be involved in protein-protein interactions. 	85
241098	cd12654	RRM3_HuB	RNA recognition motif 3 (RRM3) found in vertebrate Hu-antigen B (HuB). This subgroup corresponds to the RRM3 of HuB, also termed ELAV-like protein 2 (ELAV-2), or ELAV-like neuronal protein 1, or nervous system-specific RNA-binding protein Hel-N1 (Hel-N1), one of the neuronal members of the Hu family. The neuronal Hu proteins play important roles in neuronal differentiation, plasticity and memory. HuB is also expressed in gonads. It is up-regulated during neuronal differentiation of embryonic carcinoma P19 cells. Like other Hu proteins, HuB contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). RRM1 and RRM2 may cooperate in binding to an AU-rich RNA element (ARE). RRM3 may help to maintain the stability of the RNA-protein complex, and might also bind to poly(A) tails or be involved in protein-protein interactions. 	86
410057	cd12655	RRM3_HuC	RNA recognition motif 3 (RRM3) found in vertebrate Hu-antigen C (HuC). This subgroup corresponds to the RRM3 of HuC, also termed ELAV-like protein 3 (ELAV-3), or paraneoplastic cerebellar degeneration-associated antigen, or paraneoplastic limbic encephalitis antigen 21 (PLE21), one of the neuronal members of the Hu family. The neuronal Hu proteins play important roles in neuronal differentiation, plasticity and memory. Like other Hu proteins, HuC contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). RRM1 and RRM2 may cooperate in binding to an AU-rich RNA element (ARE). The AU-rich element binding of HuC can be inhibited by flavonoids. RRM3 may help to maintain the stability of the RNA-protein complex, and might also bind to poly(A) tails or be involved in protein-protein interactions. 	85
241100	cd12656	RRM3_HuD	RNA recognition motif 3 (RRM3) found in vertebrate Hu-antigen D (HuD). This subgroup corresponds to the RRM3 of HuD, also termed ELAV-like protein 4 (ELAV-4), or paraneoplastic encephalomyelitis antigen HuD, one of the neuronal members of the Hu family. The neuronal Hu proteins play important roles in neuronal differentiation, plasticity and memory. HuD has been implicated in various aspects of neuronal function, such as the commitment and differentiation of neuronal precursors as well as synaptic remodeling in mature neurons. HuD also functions as an important regulator of mRNA expression in neurons by interacting with AU-rich RNA element (ARE) and stabilizing multiple transcripts. Moreover, HuD regulates the nuclear processing/stability of N-myc pre-mRNA in neuroblastoma cells. It also regulates neurite elongation and morphological differentiation. HuD specifically binds poly(A) RNA. Like other Hu proteins, HuD contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). RRM1 and RRM2 may cooperate in binding to an ARE. RRM3 may help to maintain the stability of the RNA-protein complex, and might also bind to poly(A) tails or be involved in protein-protein interactions. 	86
410058	cd12657	RRM1_hnRNPM	RNA recognition motif 1 (RRM1) found in vertebrate heterogeneous nuclear ribonucleoprotein M (hnRNP M). This subgroup corresponds to the RRM1 of hnRNP M, a pre-mRNA binding protein that may play an important role in the pre-mRNA processing. It also preferentially binds to poly(G) and poly(U) RNA homopolymers. Moreover, hnRNP M is able to interact with early spliceosomes, further influencing splicing patterns of specific pre-mRNAs. hnRNP M functions as the receptor of carcinoembryonic antigen (CEA) that contains the penta-peptide sequence PELPK signaling motif. In addition, hnRNP M and another splicing factor Nova-1 work together as dopamine D2 receptor (D2R) pre-mRNA-binding proteins. They regulate alternative splicing of D2R pre-mRNA in an antagonistic manner. hnRNP M contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and an unusual hexapeptide-repeat region rich in methionine and arginine residues (MR repeat motif). 	76
410059	cd12658	RRM1_MYEF2	RNA recognition motif 1 (RRM1) found in vertebrate myelin expression factor 2 (MEF-2). This subgroup corresponds to the RRM1 of MEF-2, also termed MyEF-2 or MST156, a sequence-specific single-stranded DNA (ssDNA) binding protein that binds specifically to ssDNA derived from the proximal (MB1) element of the myelin basic protein (MBP) promoter and represses transcription of the MBP gene. MEF-2 contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), which may be responsible for its ssDNA binding activity. 	76
410060	cd12659	RRM2_hnRNPM	RNA recognition motif 2 (RRM2) found in vertebrate heterogeneous nuclear ribonucleoprotein M (hnRNP M). This subgroup corresponds to the RRM2 of hnRNP M, a pre-mRNA binding protein that may play an important role in the pre-mRNA processing. It also preferentially binds to poly(G) and poly(U) RNA homopolymers. hnRNP M is able to interact with early spliceosomes, further influencing splicing patterns of specific pre-mRNAs. It functions as the receptor of carcinoembryonic antigen (CEA) that contains the penta-peptide sequence PELPK signaling motif. In addition, hnRNP M and another splicing factor Nova-1 work together as dopamine D2 receptor (D2R) pre-mRNA-binding proteins. They regulate alternative splicing of D2R pre-mRNA in an antagonistic manner. hnRNP M contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and an unusual hexapeptide-repeat region rich in methionine and arginine residues (MR repeat motif). 	76
410061	cd12660	RRM2_MYEF2	RNA recognition motif 2 (RRM2) found in vertebrate myelin expression factor 2 (MEF-2). This subgroup corresponds to the RRM2 of MEF-2, also termed MyEF-2 or MST156, a sequence-specific single-stranded DNA (ssDNA) binding protein that binds specifically to ssDNA derived from the proximal (MB1) element of the myelin basic protein (MBP) promoter and represses transcription of the MBP gene. MEF-2 contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), which may be responsible for its ssDNA binding activity. 	76
410062	cd12661	RRM3_hnRNPM	RNA recognition motif 3 (RRM3) found in vertebrate heterogeneous nuclear ribonucleoprotein M (hnRNP M). This subgroup corresponds to the RRM3 of hnRNP M, a pre-mRNA binding protein that may play an important role in the pre-mRNA processing. It also preferentially binds to poly(G) and poly(U) RNA homopolymers. Moreover, hnRNP M is able to interact with early spliceosomes, further influencing splicing patterns of specific pre-mRNAs. hnRNP M functions as the receptor of carcinoembryonic antigen (CEA) that contains the penta-peptide sequence PELPK signaling motif. In addition, hnRNP M and another splicing factor Nova-1 work together as dopamine D2 receptor (D2R) pre-mRNA-binding proteins. They regulate alternative splicing of D2R pre-mRNA in an antagonistic manner. hnRNP M contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and an unusual hexapeptide-repeat region rich in methionine and arginine residues (MR repeat motif). 	77
410063	cd12662	RRM3_MYEF2	RNA recognition motif 3 (RRM3) found in vertebrate myelin expression factor 2 (MEF-2). This subgroup corresponds to the RRM3 of MEF-2, also termed MyEF-2 or MST156, a sequence-specific single-stranded DNA (ssDNA) binding protein that binds specifically to ssDNA derived from the proximal (MB1) element of the myelin basic protein (MBP) promoter and represses transcription of the MBP gene. MEF-2 contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), which may be responsible for its ssDNA binding activity.	77
410064	cd12663	RRM1_RAVER1	RNA recognition motif 1 (RRM1) found in vertebrate ribonucleoprotein PTB-binding 1 (raver-1). This subgroup corresponds to the RRM1 of raver-1, a ubiquitously expressed heterogeneous nuclear ribonucleoprotein (hnRNP) that serves as a co-repressor of the nucleoplasmic splicing repressor polypyrimidine tract-binding protein (PTB)-directed splicing of select mRNAs. It shuttles between the cytoplasm and the nucleus and can accumulate in the perinucleolar compartment, a dynamic nuclear substructure that harbors PTB. Raver-1 also modulates focal adhesion assembly by binding to the cytoskeletal proteins, including alpha-actinin, vinculin, and metavinculin (an alternatively spliced isoform of vinculin) at adhesion complexes, particularly in differentiated muscle tissue. Raver-1 contains three N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two putative nuclear localization signals (NLS) at the N- and C-termini, a central leucine-rich region, and a C-terminal region harboring two PTB-binding [SG][IL]LGxxP motifs. Raver1 binds to PTB through the PTB-binding motifs at its C-terminal half, and binds to other partners, such as RNA having the sequence UCAUGCAGUCUG, through its N-terminal RRMs. Interestingly, the 12-nucleotide RNA having the sequence UCAUGCAGUCUG with micromolar affinity is found in vinculin mRNA. Additional research indicates that the RRM1 of raver-1 directs its interaction with the tail domain of activated vinculin. Then the raver1/vinculin tail (Vt) complex binds to vinculin mRNA, which is permissive for vinculin binding to F-actin. 	71
410065	cd12664	RRM1_RAVER2	RNA recognition motif 1 (RRM1) found in vertebrate ribonucleoprotein PTB-binding 2 (raver-2). This subgroup corresponds to the RRM1 of raver-2, a novel member of the heterogeneous nuclear ribonucleoprotein (hnRNP) family. It is present in vertebrates and shows high sequence homology to raver-1, a ubiquitously expressed co-repressor of the nucleoplasmic splicing repressor polypyrimidine tract-binding protein (PTB)-directed splicing of select mRNAs. In contrast, raver-2 exerts a distinct spatio-temporal expression pattern during embryogenesis and is mainly limited to differentiated neurons and glia cells. Although it displays nucleo-cytoplasmic shuttling in heterokaryons, raver2 localizes to the nucleus in glia cells and neurons. Raver-2 can interact with PTB and may participate in PTB-mediated RNA-processing. However, there is no evidence indicating that raver-2 can bind to cytoplasmic proteins. Raver-2 contains three N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two putative nuclear localization signals (NLS) at the N- and C-termini, a central leucine-rich region, and a C-terminal region harboring two [SG][IL]LGxxP motifs. Raver-2 binds to PTB through the SLLGEPP motif only, and binds to RNA through its RRMs. 	70
410066	cd12665	RRM2_RAVER1	RNA recognition motif 2 (RRM2) found in vertebrate ribonucleoprotein PTB-binding 1 (raver-1). This subgroup corresponds to the RRM2 of raver-1, a ubiquitously expressed heterogeneous nuclear ribonucleoprotein (hnRNP) that serves as a co-repressor of the nucleoplasmic splicing repressor polypyrimidine tract-binding protein (PTB)-directed splicing of select mRNAs. It shuttles between the cytoplasm and the nucleus and can accumulate in the perinucleolar compartment, a dynamic nuclear substructure that harbors PTB. Raver-1 also modulates focal adhesion assembly by binding to the cytoskeletal proteins, including alpha-actinin, vinculin, and metavinculin (an alternatively spliced isoform of vinculin) at adhesion complexes, particularly in differentiated muscle tissue. Raver-1 contains three N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two putative nuclear localization signals (NLS) at the N- and C-termini, a central leucine-rich region, and a C-terminal region harboring two PTB-binding [SG][IL]LGxxP motifs. Raver1 binds to PTB through the PTB-binding motifs at its C-terminal half, and binds to other partners, such as RNA having the sequence UCAUGCAGUCUG, through its N-terminal RRMs. Interestingly, the 12-nucleotide RNA having the sequence UCAUGCAGUCUG with micromolar affinity is found in vinculin mRNA. Additional research indicates that the RRM1 of raver-1 directs its interaction with the tail domain of activated vinculin. Then the raver1/vinculin tail (Vt) complex binds to vinculin mRNA, which is permissive for vinculin binding to F-actin. 	77
410067	cd12666	RRM2_RAVER2	RNA recognition motif 2 (RRM2) found in vertebrate ribonucleoprotein PTB-binding 2 (raver-2). This subgroup corresponds to the RRM2 of raver-2, a novel member of the heterogeneous nuclear ribonucleoprotein (hnRNP) family. It is present in vertebrates and shows high sequence homology to raver-1, a ubiquitously expressed co-repressor of the nucleoplasmic splicing repressor polypyrimidine tract-binding protein (PTB)-directed splicing of select mRNAs. In contrast, raver-2 exerts a distinct spatio-temporal expression pattern during embryogenesis and is mainly limited to differentiated neurons and glia cells. Although it displays nucleo-cytoplasmic shuttling in heterokaryons, raver2 localizes to the nucleus in glia cells and neurons. Raver-2 can interact with PTB and may participate in PTB-mediated RNA-processing. However, there is no evidence indicating that raver-2 can bind to cytoplasmic proteins. Raver-2 contains three N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two putative nuclear localization signals (NLS) at the N- and C-termini, a central leucine-rich region, and a C-terminal region harboring two [SG][IL]LGxxP motifs. Raver-2 binds to PTB through the SLLGEPP motif only, and binds to RNA through its RRMs. 	77
410068	cd12667	RRM3_RAVER1	RNA recognition motif 3 (RRM3) found in vertebrate ribonucleoprotein PTB-binding 1 (raver-1). This subgroup corresponds to the RRM3 of raver-1, a ubiquitously expressed heterogeneous nuclear ribonucleoprotein (hnRNP) that serves as a co-repressor of the nucleoplasmic splicing repressor polypyrimidine tract-binding protein (PTB)-directed splicing of select mRNAs. It shuttles between the cytoplasm and the nucleus and can accumulate in the perinucleolar compartment, a dynamic nuclear substructure that harbors PTB. Raver-1 also modulates focal adhesion assembly by binding to the cytoskeletal proteins, including alpha-actinin, vinculin, and metavinculin (an alternatively spliced isoform of vinculin) at adhesion complexes, particularly in differentiated muscle tissue. Raver-1 contains three N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two putative nuclear localization signals (NLS) at the N- and C-termini, a central leucine-rich region, and a C-terminal region harboring two PTB-binding [SG][IL]LGxxP motifs. Raver1 binds to PTB through the PTB-binding motifs at its C-terminal half, and binds to other partners, such as RNA having the sequence UCAUGCAGUCUG, through its N-terminal RRMs. Interestingly, the 12-nucleotide RNA having the sequence UCAUGCAGUCUG with micromolar affinity is found in vinculin mRNA. Additional research indicates that the RRM1 of raver-1 directs its interaction with the tail domain of activated vinculin. Then the raver1/vinculin tail (Vt) complex binds to vinculin mRNA, which is permissive for vinculin binding to F-actin. 	92
410069	cd12668	RRM3_RAVER2	RNA recognition motif 3 (RRM3) found in vertebrate ribonucleoprotein PTB-binding 2 (raver-2). This subgroup corresponds to the RRM3 of raver-2, a novel member of the heterogeneous nuclear ribonucleoprotein (hnRNP) family. It is present in vertebrates and shows high sequence homology to raver-1, a ubiquitously expressed co-repressor of the nucleoplasmic splicing repressor polypyrimidine tract-binding protein (PTB)-directed splicing of select mRNAs. In contrast, raver-2 exerts a distinct spatio-temporal expression pattern during embryogenesis and is mainly limited to differentiated neurons and glia cells. Although it displays nucleo-cytoplasmic shuttling in heterokaryons, raver2 localizes to the nucleus in glia cells and neurons. Raver-2 can interact with PTB and may participate in PTB-mediated RNA-processing. However, there is no evidence indicating that raver-2 can bind to cytoplasmic proteins. Raver-2 contains three N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two putative nuclear localization signals (NLS) at the N- and C-termini, a central leucine-rich region, and a C-terminal region harboring two [SG][IL]LGxxP motifs. Raver-2 binds to PTB through the SLLGEPP motif only, and binds to RNA through its RRMs. 	98
410070	cd12669	RRM1_Nop12p_like	RNA recognition motif 1 (RRM1) found in yeast nucleolar protein 12 (Nop12p) and similar proteins. This subgroup corresponds to the RRM1 of Nop12p which is encoded by YOL041C from Saccharomyces cerevisiae. It is a novel nucleolar protein required for pre-25S rRNA processing and normal rates of cell growth at low temperatures. Nop12p shares high sequence similarity with nucleolar protein 13 (Nop13p). Both, Nop12p and Nop13p, are not essential for growth. However, unlike Nop13p that localizes primarily to the nucleolus but also present in the nucleoplasm to a lesser extent, Nop12p is localized to the nucleolus. Nop12p contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 	100
410071	cd12670	RRM2_Nop12p_like	RNA recognition motif 2 (RRM2) found in yeast nucleolar protein 12 (Nop12p) and similar proteins. This subgroup corresponds to the RRM2 of Nop12p, which is encoded by YOL041C from Saccharomyces cerevisiae. It is a novel nucleolar protein required for pre-25S rRNA processing and normal rates of cell growth at low temperatures. Nop12p shares high sequence similarity with nucleolar protein 13 (Nop13p). Both, Nop12p and Nop13p, are not essential for growth. However, unlike Nop13p that localizes primarily to the nucleolus but is also present in the nucleoplasm to a lesser extent, Nop12p is localized to the nucleolus. Nop12p contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 	77
410072	cd12671	RRM_CSTF2_CSTF2T	RNA recognition motif (RRM) found in cleavage stimulation factor subunit 2 (CSTF2), cleavage stimulation factor subunit 2 tau variant (CSTF2T) and similar proteins. This subgroup corresponds to the RRM domain of CSTF2, its tau variant and eukaryotic homologs. CSTF2, also termed cleavage stimulation factor 64 kDa subunit (CstF64), is the vertebrate counterpart of yeast mRNA 3'-end-processing protein RNA15. It is expressed in all somatic tissues and is one of three cleavage stimulatory factor (CstF) subunits required for polyadenylation. CstF64 contains an N-terminal RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), a CstF77-binding domain, a repeated MEARA helical region and a conserved C-terminal domain reported to bind the transcription factor PC-4. During polyadenylation, CstF interacts with the pre-mRNA through the RRM of CstF64 at U- or GU-rich sequences within 10 to 30 nucleotides downstream of the cleavage site. CSTF2T, also termed tauCstF64, is a paralog of the X-linked cleavage stimulation factor CstF64 protein that supports polyadenylation in most somatic cells. It is expressed during meiosis and subsequent haploid differentiation in a more limited set of tissues and cell types, largely in meiotic and postmeiotic male germ cells, and to a lesser extent in brain. The loss of CSTF2T will cause male infertility, as it is necessary for spermatogenesis and fertilization. Moreover, CSTF2T is required for expression of genes involved in morphological differentiation of spermatids, as well as for genes having products that function during interaction of motile spermatozoa with eggs. It promotes germ cell-specific patterns of polyadenylation by using its RRM to bind to different sequence elements downstream of polyadenylation sites than does CstF64. 	85
410073	cd12672	RRM_DAZL	RNA recognition motif (RRM) found in vertebrate deleted in azoospermia-like (DAZL) proteins. This subgroup corresponds to the RRM of DAZL, also termed SPGY-like-autosomal, encoded by the autosomal homolog of DAZ gene, DAZL. It is ancestral to the deleted in azoospermia (DAZ) protein. DAZL is a germ-cell-specific RNA-binding protein that contains an RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a DAZ motif, a protein-protein interaction domain. Although their specific biochemical functions remain to be investigated, DAZL proteins may interact with poly(A)-binding proteins (PABPs), and act as translational activators of specific mRNAs during gametogenesis. 	82
410074	cd12673	RRM_BOULE	RNA recognition motif (RRM) found in protein BOULE. This subgroup corresponds to the RRM of BOULE, the founder member of the human DAZ gene family. Invertebrates contain a single BOULE, while vertebrates, other than catarrhine primates, possess both BOULE and DAZL genes. The catarrhine primates possess BOULE, DAZL, and DAZ genes. BOULE encodes an RNA-binding protein containing an RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a single copy of the DAZ motif. Although its specific biochemical functions remain to be investigated, BOULE protein may interact with poly(A)-binding proteins (PABPs), and act as translational activators of specific mRNAs during gametogenesis. 	81
410075	cd12674	RRM1_Nop4p	RNA recognition motif 1 (RRM1) found in yeast nucleolar protein 4 (Nop4p) and similar proteins. This subgroup corresponds to the RRM1 of Nop4p (also known as Nop77p), encoded by YPL043W from Saccharomyces cerevisiae. It is an essential nucleolar protein involved in processing and maturation of 27S pre-rRNA and biogenesis of 60S ribosomal subunits. Nop4p has four RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 	80
410076	cd12675	RRM2_Nop4p	RNA recognition motif 2 (RRM2) found in yeast nucleolar protein 4 (Nop4p) and similar proteins. This subgroup corresponds to the RRM2 of Nop4p (also known as Nop77p), encoded by YPL043W from Saccharomyces cerevisiae. It is an essential nucleolar protein involved in processing and maturation of 27S pre-rRNA and biogenesis of 60S ribosomal subunits. Nop4p has four RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 	83
410077	cd12676	RRM3_Nop4p	RNA recognition motif 3 (RRM3) found in yeast nucleolar protein 4 (Nop4p) and similar proteins. This subgroup corresponds to the RRM3 of Nop4p (also known as Nop77p), encoded by YPL043W from Saccharomyces cerevisiae. It is an essential nucleolar protein involved in processing and maturation of 27S pre-rRNA and biogenesis of 60S ribosomal subunits. Nop4p has four RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 	107
410078	cd12677	RRM4_Nop4p	RNA recognition motif 4 (RRM4) found in yeast nucleolar protein 4 (Nop4p) and similar proteins. This subgroup corresponds to the RRM4 of Nop4p (also known as Nop77p), encoded by YPL043W from Saccharomyces cerevisiae. It is an essential nucleolar protein involved in processing and maturation of 27S pre-rRNA and biogenesis of 60S ribosomal subunits. Nop4p has four RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 	158
410079	cd12678	RRM_SLTM	RNA recognition motif (RRM) found in Scaffold attachment factor (SAF)-like transcription modulator (SLTM) and similar proteins. This subgroup corresponds to the RRM domain of SLTM, also termed modulator of estrogen-induced transcription, which shares high sequence similarity with scaffold attachment factor B1 (SAFB1). It contains a scaffold attachment factor-box (SAF-box, also known as SAP domain) DNA-binding motif, an RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a region rich in glutamine and arginine residues. To a large extent, SLTM co-localizes with SAFB1 in the nucleus, which suggests that they share similar functions, such as the inhibition of an oestrogen reporter gene. However, rather than mediating a specific inhibitory effect on oestrogen action, SLTM is shown to exert a generalized inhibitory effect on gene expression associated with induction of apoptosis in a wide range of cell lines. 	74
410080	cd12679	RRM_SAFB1_SAFB2	RNA recognition motif (RRM) found in scaffold attachment factor B1 (SAFB1), scaffold attachment factor B2 (SAFB2), and similar proteins. This subgroup corresponds to RRM of SAFB1, also termed scaffold attachment factor B (SAF-B), heat-shock protein 27 estrogen response element ERE and TATA-box-binding protein (HET), or heterogeneous nuclear ribonucleoprotein hnRNP A1- associated protein (HAP), a large multi-domain protein with well-described functions in transcriptional repression, RNA splicing and metabolism, and a proposed role in chromatin organization. Based on the numerous functions, SAFB1 has been implicated in many diverse cellular processes including cell growth and transformation, stress response, and apoptosis. SAFB1 specifically binds to AT-rich scaffold or matrix attachment region DNA elements (S/MAR DNA) by using its N-terminal scaffold attachment factor-box (SAF-box, also known as SAP domain), a homeodomain-like DNA binding motif. The central region of SAFB1 is composed of an RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a nuclear localization signal (NLS). The C-terminus of SAFB1 contains Glu/Arg- and Gly-rich regions that might be involved in protein-protein interaction. Additional studies indicate that the C-terminal region contains a potent and transferable transcriptional repression domain. Another family member is SAFB2, a homolog of SAFB1. Both SAFB1 and SAFB2 are ubiquitously coexpressed and share very high sequence similarity, suggesting that they might function in a similar manner. However, unlike SAFB1, exclusively existing in the nucleus, SAFB2 is also present in the cytoplasm. The additional cytoplasmic localization of SAFB2 implies that it could play additional roles in the cytoplasmic compartment which are distinct from the nuclear functions shared with SAFB1.	76
410081	cd12680	RRM_THOC4	RNA recognition motif (RRM) found in THO complex subunit 4 (THOC4) and similar proteins. This subgroup corresponds to the RRM of THOC4, also termed transcriptional coactivator Aly/REF, or ally of AML-1 and LEF-1, or bZIP-enhancing factor BEF, an mRNA transporter protein with a well conserved RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). It is involved in RNA transportation from the nucleus. THOC4 was initially identified as a transcription coactivator of LEF-1 and AML-1 for the TCRalpha enhancer function. In addition, THOC4 specifically binds to rhesus (RH) promoter in erythroid. It might be a novel transcription cofactor for erythroid-specific genes. 	75
410082	cd12681	RRM_SKAR	RNA recognition motif (RRM) found in S6K1 Aly/REF-like target (SKAR) and similar proteins. This subgroup corresponds to the RRM of SKAR, also termed polymerase delta-interacting protein 3 (PDIP3), 46 kDa DNA polymerase delta interaction protein (PDIP46), belonging to the Aly/REF family of RNA binding proteins that have been implicated in coupling transcription with pre-mRNA splicing and nucleo-cytoplasmic mRNA transport. SKAR is widely expressed and localizes to the nucleus. It may be a critical player in the function of S6K1 in cell and organism growth control by binding the activated, hyperphosphorylated form of S6K1 but not S6K2. Furthermore, SKAR functions as a protein partner of the p50 subunit of DNA polymerase delta. In addition, SKAR may have particular importance in pancreatic beta cell size determination and insulin secretion. SKAR contains a well conserved RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain).	69
410083	cd12682	RRM_RBPMS	RNA recognition motif (RRM) found in vertebrate RNA-binding protein with multiple splicing (RBP-MS). This subfamily corresponds to the RRM of RBP-MS, also termed heart and RRM expressed sequence (hermes), an RNA-binding protein found in various vertebrate species. It contains an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). RBP-MS physically interacts with Smad2, Smad3 and Smad4 and plays a role in regulation of Smad-mediated transcriptional activity. In addition, RBP-MS may be involved in regulation of mRNA translation and localization during Xenopus laevis development. 	76
410084	cd12683	RRM_RBPMS2	RNA recognition motif (RRM) found in vertebrate RNA-binding protein with multiple splicing 2 (RBP-MS2). This subfamily corresponds to the RRM of RBP-MS2, encoded by RBPMS2 gene, a paralog of RNA-binding protein with multiple splicing (RBP-MS). The biological function of RBP-MS2 remains unclear. Like RBP-MS, RBP-MS2 contains an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	76
410085	cd12684	RRM_cpo	RNA recognition motif (RRM) found in Drosophila couch potato (cpo) coding RNA-binding protein and similar proteins. This subfamily corresponds to the RRM of Cpo, an RNA-binding protein encoded by Drosophila couch potato (cpo) gene. Cpo contains a well conserved RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). It may control the processing of RNA molecules required for the proper functioning of the peripheral nervous system (PNS). 	83
410086	cd12685	RRM_RBM20	RNA recognition motif (RRM) found in vertebrate RNA-binding protein 20 (RBM20). This subfamily corresponds to the RRM of RBM20, an alternative splicing regulator associated with dilated cardiomyopathy (DCM). It contains only one copy of RNA-recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	76
410087	cd12686	RRM1_PTBPH1_PTBPH2	RNA recognition motif 1 (RRM1) found in plant polypyrimidine tract-binding protein homolog 1 and 2 (PTBPH1 and PTBPH2). This subfamily corresponds to the RRM1 of PTBPH1 and PTBPH2. Although their biological roles remain unclear, PTBPH1 and PTBPH2 show significant sequence similarity to polypyrimidine tract binding protein (PTB) that is an important negative regulator of alternative splicing in mammalian cells and also functions at several other aspects of mRNA metabolism, including mRNA localization, stabilization, polyadenylation, and translation. Both, PTBPH1 and PTBPH2, contain three RNA recognition motifs (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	81
410088	cd12687	RRM1_PTBPH3	RNA recognition motif 1 (RRM1) found in plant polypyrimidine tract-binding protein homolog 3 (PTBPH3). This subfamily corresponds to the RRM1 of PTBPH3. Although its biological roles remain unclear, PTBPH3 shows significant sequence similarity to polypyrimidine tract binding protein (PTB) that is an important negative regulator of alternative splicing in mammalian cells and also functions at several other aspects of mRNA metabolism, including mRNA localization, stabilization, polyadenylation, and translation. Like PTB, PTBPH3 contains four RNA recognition motifs (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	75
410089	cd12688	RRM1_PTBP1_like	RNA recognition motif 1 (RRM1) found in polypyrimidine tract-binding protein 1 (PTB or hnRNP I) and similar proteins. This subfamily corresponds to the RRM1 of polypyrimidine tract-binding protein 1 (PTB or hnRNP I), polypyrimidine tract-binding protein 2 (PTBP2 or nPTB), regulator of differentiation 1 (Rod1), and similar proteins found in Metazoa. PTB is an important negative regulator of alternative splicing in mammalian cells and functions at several aspects of mRNA metabolism, including mRNA localization, stabilization, polyadenylation, and translation. PTBP2 is highly homologous to PTB and is perhaps specific to the vertebrates. Unlike PTB, PTBP2 is enriched in the brain and in some neural cell lines. It binds more stably to the downstream control sequence (DCS) RNA than PTB does but is a weaker repressor of splicing in vitro. PTBP2 also greatly enhances the binding of two other proteins, heterogeneous nuclear ribonucleoprotein (hnRNP) H and KH-type splicing-regulatory protein (KSRP), to the DCS RNA. The binding properties of PTBP2 and its reduced inhibitory activity on splicing imply roles in controlling the assembly of other splicing-regulatory proteins. PTBP2 also contains four RRMs. ROD1 coding protein Rod1 is a mammalian PTB homolog of a regulator of differentiation in the fission yeast Schizosaccharomyces pombe, where the nrd1 gene encodes an RNA binding protein and negatively regulates the onset of differentiation. ROD1 is predominantly expressed in hematopoietic cells or organs. It may play a role controlling differentiation in mammals. All members in this family contain four RNA recognition motifs (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	81
410090	cd12689	RRM1_hnRNPL_like	RNA recognition motif 1 (RRM1) found in heterogeneous nuclear ribonucleoprotein L (hnRNP-L) and similar proteins. This subfamily corresponds to the RRM1 of heterogeneous nuclear ribonucleoprotein L (hnRNP-L), heterogeneous nuclear ribonucleoprotein L-like (hnRNP-LL), and similar proteins. hnRNP-L is a higher eukaryotic specific subunit of human KMT3a (also known as HYPB or hSet2) complex required for histone H3 Lys-36 trimethylation activity. It plays both, nuclear and cytoplasmic, roles in mRNA export of intronless genes, IRES-mediated translation, mRNA stability, and splicing. hnRNP-LL plays a critical and unique role in the signal-induced regulation of CD45 and acts as a global regulator of alternative splicing in activated T cells. It is closely related in domain structure and sequence to hnRNP-L, which contains three RNA-recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	80
410091	cd12690	RRM3_PTBPH1_PTBPH2	RNA recognition motif 3 (RRM3) found in plant polypyrimidine tract-binding protein homolog 1 and 2 (PTBPH1 and PTBPH2). This subfamily corresponds to the RRM3 of PTBPH1 and PTBPH2. Although their biological roles remain unclear, PTBPH1 and PTBPH2 show significant sequence similarity to polypyrimidine tract binding protein (PTB) that is an important negative regulator of alternative splicing in mammalian cells and also functions at several other aspects of mRNA metabolism, including mRNA localization, stabilization, polyadenylation, and translation. Both, PTBPH1 and PTBPH2, contain three RNA recognition motifs (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	97
241135	cd12691	RRM2_PTBPH1_PTBPH2	RNA recognition motif 2 (RRM2) found in plant polypyrimidine tract-binding protein homolog 1 and 2 (PTBPH1 and PTBPH2). This subfamily corresponds to the RRM2 of PTBPH1 and PTBPH2. Although their biological roles remain unclear, PTBPH1 and PTBPH2 show significant sequence similarity to polypyrimidine tract binding protein (PTB) that is an important negative regulator of alternative splicing in mammalian cells and also functions at several other aspects of mRNA metabolism, including mRNA localization, stabilization, polyadenylation, and translation. Both, PTBPH1 and PTBPH2, contain three RNA recognition motifs (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain).	95
410092	cd12692	RRM2_PTBPH3	RNA recognition motif 2 (RRM2) found in plant polypyrimidine tract-binding protein homolog 3 (PTBPH3). This subfamily corresponds to the RRM2 of PTBPH3. Although its biological roles remain unclear, PTBPH3 shows significant sequence similarity to polypyrimidine tract binding protein (PTB) that is an important negative regulator of alternative splicing in mammalian cells and also functions at several other aspects of mRNA metabolism, including mRNA localization, stabilization, polyadenylation, and translation. Like PTB, PTBPH3 contains four RNA recognition motifs (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	88
410093	cd12693	RRM2_PTBP1_like	RNA recognition motif 2 (RRM2) found in polypyrimidine tract-binding protein 1 (PTB or hnRNP I) and similar proteins. This subfamily corresponds to the RRM2 of polypyrimidine tract-binding protein 1 (PTB or hnRNP I), polypyrimidine tract-binding protein 2 (PTBP2 or nPTB), regulator of differentiation 1 (Rod1), and similar proteins found in Metazoa. PTB is an important negative regulator of alternative splicing in mammalian cells and also functions at several other aspects of mRNA metabolism, including mRNA localization, stabilization, polyadenylation, and translation. PTBP2 is highly homologous to PTB and is perhaps specific to the vertebrates. Unlike PTB, PTBP2 is enriched in the brain and in some neural cell lines. It binds more stably to the downstream control sequence (DCS) RNA than PTB does but is a weaker repressor of splicing in vitro. PTBP2 also greatly enhances the binding of two other proteins, heterogeneous nuclear ribonucleoprotein (hnRNP) H and KH-type splicing-regulatory protein (KSRP), to the DCS RNA. The binding properties of PTBP2 and its reduced inhibitory activity on splicing imply roles in controlling the assembly of other splicing-regulatory proteins. PTBP2 also contains four RRMs. ROD1 coding protein Rod1 is a mammalian PTB homolog of a regulator of differentiation in the fission yeast Schizosaccharomyces pombe, where the nrd1 gene encodes an RNA binding protein that negatively regulates the onset of differentiation. ROD1 is predominantly expressed in hematopoietic cells or organs. It may play a role controlling differentiation in mammals. All members in this family contain four RNA recognition motifs (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	96
410094	cd12694	RRM2_hnRNPL_like	RNA recognition motif 2 (RRM2) found in heterogeneous nuclear ribonucleoprotein L (hnRNP-L) and similar proteins. This subfamily corresponds to the RRM2 of heterogeneous nuclear ribonucleoprotein L (hnRNP-L), heterogeneous nuclear ribonucleoprotein L-like (hnRNP-LL), and similar proteins. hnRNP-L is a higher eukaryotic specific subunit of human KMT3a (also known as HYPB or hSet2) complex required for histone H3 Lys-36 trimethylation activity. It plays both nuclear and cytoplasmic roles in mRNA export of intronless genes, IRES-mediated translation, mRNA stability, and splicing. hnRNP-LL plays a critical and unique role in the signal-induced regulation of CD45 and acts as a global regulator of alternative splicing in activated T cells. It is closely related in domain structure and sequence to hnRNP-L, which contains three RNA-recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	86
410095	cd12695	RRM3_PTBP1	RNA recognition motif 3 (RRM3) found in vertebrate polypyrimidine tract-binding protein 1 (PTB). This subgroup corresponds to the RRM3 of PTB, also known as 58 kDa RNA-binding protein PPTB-1 or heterogeneous nuclear ribonucleoprotein I (hnRNP I), an important negative regulator of alternative splicing in mammalian cells. PTB also functions at several other aspects of mRNA metabolism, including mRNA localization, stabilization, polyadenylation, and translation. PTB contains four RNA recognition motifs (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). RRM1 and RRM2 are independent from each other and separated by flexible linkers. By contrast, there is an unusual and conserved interdomain interaction between RRM3 and RRM4. It is widely held that only RRMs 3 and 4 are involved in RNA binding and RRM2 mediates PTB homodimer formation. However, new evidence shows that the RRMs 1 and 2 also contribute substantially to RNA binding. Moreover, PTB may not always dimerize to repress splicing. It is a monomer in solution. 	93
410096	cd12696	RRM3_PTBP2	RNA recognition motif 3 (RRM3) found in vertebrate polypyrimidine tract-binding protein 2 (PTBP2). This subgroup corresponds to the RRM3 of PTBP2, also known as neural polypyrimidine tract-binding protein or neurally-enriched homolog of PTB (nPTB), highly homologous to polypyrimidine tract binding protein (PTB) and perhaps specific to the vertebrates. Unlike PTB, PTBP2 is enriched in the brain and in some neural cell lines. It binds more stably to the downstream control sequence (DCS) RNA than PTB does but is a weaker repressor of splicing in vitro. PTBP2 also greatly enhances the binding of two other proteins, heterogeneous nuclear ribonucleoprotein (hnRNP) H and KH-type splicing-regulatory protein (KSRP), to the DCS RNA. The binding properties of PTBP2 and its reduced inhibitory activity on splicing imply roles in controlling the assembly of other splicing-regulatory proteins. PTBP2 contains four RNA recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	107
410097	cd12697	RRM3_ROD1	RNA recognition motif 3 (RRM3) found in vertebrate regulator of differentiation 1 (Rod1). This subgroup corresponds to the RRM3 of ROD1 coding protein Rod1, a mammalian polypyrimidine tract binding protein (PTB) homolog of a regulator of differentiation in the fission yeast Schizosaccharomyces pombe, where the nrd1 gene encodes an RNA binding protein that negatively regulates the onset of differentiation. ROD1 is predominantly expressed in hematopoietic cells or organs. It might play a role controlling differentiation in mammals. Rod1 contains four repeats of RNA recognition motifs (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain) and does have RNA binding activities. 	76
410098	cd12698	RRM3_PTBPH3	RNA recognition motif 3 (RRM3) found in plant polypyrimidine tract-binding protein homolog 3 (PTBPH3). This subgroup corresponds to the RRM3 of PTBPH3. Although its biological roles remain unclear, PTBPH3 shows significant sequence similarity to polypyrimidine tract binding protein (PTB) that is an important negative regulator of alternative splicing in mammalian cells and also functions at several other aspects of mRNA metabolism, including mRNA localization, stabilization, polyadenylation, and translation. Like PTB, PTBPH3 contains four RNA recognition motifs (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	76
410099	cd12699	RRM3_hnRNPL	RNA recognition motif 3 (RRM3) found in vertebrate heterogeneous nuclear ribonucleoprotein L (hnRNP-L). This subgroup corresponds to the RRM3 of hnRNP-L, a higher eukaryotic specific subunit of human KMT3a (also known as HYPB or hSet2) complex required for histone H3 Lys-36 trimethylation activity. It plays both, nuclear and cytoplasmic, roles in mRNA export of intronless genes, IRES-mediated translation, mRNA stability, and splicing. hnRNP-L shows significant sequence homology with polypyrimidine tract-binding protein (PTB or hnRNP I). Both, hnRNP-L and PTB, are localized in the nucleus but excluded from the nucleolus. hnRNP-L is an RNA-binding protein with three RNA recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	77
410100	cd12700	RRM3_hnRPLL	RNA recognition motif 3 (RRM3) found in vertebrate heterogeneous nuclear ribonucleoprotein L-like (hnRNP-LL). The subgroup corresponds to the RRM3 of hnRNP-LL which plays a critical and unique role in the signal-induced regulation of CD45 and acts as a global regulator of alternative splicing in activated T cells. It is closely related in domain structure and sequence to heterogeneous nuclear ribonucleoprotein L (hnRNP-L), which is an abundant nuclear, multifunctional RNA-binding protein with three RNA-recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	74
410101	cd12701	RRM4_PTBP1	RNA recognition motif 4 (RRM4) found in vertebrate polypyrimidine tract-binding protein 1 (PTB). This subgroup corresponds to the RRM4 of PTB, also known as 58 kDa RNA-binding protein PPTB-1 or heterogeneous nuclear ribonucleoprotein I (hnRNP I), an important negative regulator of alternative splicing in mammalian cells. PTB also functions at several other aspects of mRNA metabolism, including mRNA localization, stabilization, polyadenylation, and translation. PTB contains four RNA recognition motifs (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). RRM1 and RRM2 are independent from each other and separated by flexible linkers. By contrast, there is an unusual and conserved interdomain interaction between RRM3 and RRM4. It is widely held that only RRMs 3 and 4 are involved in RNA binding and RRM2 mediates PTB homodimer formation. However, new evidence shows that the RRMs 1 and 2 also contribute substantially to RNA binding. Moreover, PTB may not always dimerize to repress splicing. It is a monomer in solution. 	76
241146	cd12702	RRM4_PTBP2	RNA recognition motif 4 (RRM4) found in vertebrate polypyrimidine tract-binding protein 2 (PTBP2). This subgroup corresponds to the RRM4 of PTBP2, also known as neural polypyrimidine tract-binding protein or neurally-enriched homolog of PTB (nPTB), highly homologous to polypyrimidine tract binding protein (PTB) and perhaps specific to the vertebrates. Unlike PTB, PTBP2 is enriched in the brain and in some neural cell lines. It binds more stably to the downstream control sequence (DCS) RNA than PTB does but is a weaker repressor of splicing in vitro. PTBP2 also greatly enhances the binding of two other proteins, heterogeneous nuclear ribonucleoprotein (hnRNP) H and KH-type splicing-regulatory protein (KSRP), to the DCS RNA. The binding properties of PTBP2 and its reduced inhibitory activity on splicing imply roles in controlling the assembly of other splicing-regulatory proteins. PTBP2 contains four RNA recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	80
410102	cd12703	RRM4_ROD1	RNA recognition motif 4 (RRM4) found in vertebrate regulator of differentiation 1 (Rod1). This subgroup corresponds to the RRM4 of ROD1 coding protein Rod1, a mammalian polypyrimidine tract binding protein (PTB) homolog of a regulator of differentiation in the fission yeast Schizosaccharomyces pombe, where the nrd1 gene encodes an RNA binding protein that negatively regulates the onset of differentiation. ROD1 is predominantly expressed in hematopoietic cells or organs. It might play a role controlling differentiation in mammals. Rod1 contains four repeats of RNA recognition motifs (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain) and does have RNA binding activities. 	91
410103	cd12704	RRM4_hnRNPL	RNA recognition motif 4 (RRM4) found in vertebrate heterogeneous nuclear ribonucleoprotein L (hnRNP-L). This subgroup corresponds to the RRM4 of hnRNP-L, a higher eukaryotic specific subunit of human KMT3a (also known as HYPB or hSet2) complex required for histone H3 Lys-36 trimethylation activity. It plays both, nuclear and cytoplasmic, roles in mRNA export of intronless genes, IRES-mediated translation, mRNA stability, and splicing. hnRNP-L shows significant sequence homology with polypyrimidine tract-binding protein (PTB or hnRNP I). Both hnRNP-L and PTB are localized in the nucleus but excluded from the nucleolus. hnRNP-L is an RNA-binding protein with three RNA recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	84
410104	cd12705	RRM4_hnRPLL	RNA recognition motif 4 (RRM4) found in vertebrate heterogeneous nuclear ribonucleoprotein L-like (hnRNP-LL). The subgroup corresponds to the RRM4 of hnRNP-LL which plays a critical and unique role in the signal-induced regulation of CD45 and acts as a global regulator of alternative splicing in activated T cells. It is closely related in domain structure and sequence to heterogeneous nuclear ribonucleoprotein L (hnRNP-L), which is an abundant nuclear, multifunctional RNA-binding protein with three RNA-recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	85
410105	cd12706	RRM_LARP5	RNA recognition motif (RRM) found in vertebrate La-related protein 5 (LARP5 or LARP4B). This subgroup corresponds to the RRM of LARP5, a cytosolic protein that co-sediments with polysomes and accumulates upon stress induction in cellular stress granules. It can interact with the cytosolic poly(A) binding protein 1 (PABPC1) and the receptor for activated C Kinase (RACK1), a component of the 40S ribosomal subunit. LARP5 may function as a stimulatory factor of translation through bridging mRNA factors of the 3' end with initiating ribosomes. Like other La-related proteins (LARPs) family members, LARP5 contains a La motif (LAM) and an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	77
410106	cd12707	RRM_LARP4	RNA recognition motif (RRM) found in vertebrate La-related protein 4 (LARP4). This subgroup corresponds to the RRM of LARP4, a cytoplasmic factor that can bind poly(A) RNA and interact with poly(A) binding protein (PABP). It may play a role in promoting translation by stabilizing mRNA. LARP4 is structurally related to the La autoantigen. Like other La-related proteins (LARPs) family members, LARP4 contains a La motif (LAM) and an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	77
410107	cd12708	RRM_RCAN1	RNA recognition motif (RRM) found in vertebrate regulator of calcineurin 1 (RCAN1). This subgroup corresponds to the RRM of RCAN1, also termed calcipressin-1, or Adapt78, or Down syndrome critical region protein 1, or myocyte-enriched calcineurin-interacting protein 1 (MCIP1), encoded by the Down syndrome critical region 1 (DSCR1) gene that is abundantly expressed in human brain, heart and muscles. Overexpressed RCAN1 functions as an inhibitor of the Ca2+/calmodulin-dependent phosphatase calcineurin (also termed PP2B or PP3C), and is associated with Alzheimer's disease (AD) and Down syndrome (DS). RCAN1 can be phosphorylated by several kinases such as big MAP kinase 1 (BMK1), glycogen synthase kinase-3 (GSK-3), NF-kappaB inducing kinase (NIK), and protein kinase A (PKA). The phosphorylation of RCAN1 can positively or negatively regulate calcineurin-mediated gene transcription, and also affect its protein stability in the ubiquitin-proteasome pathway. RCAN1 consists of an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), a highly conserved SP repeat domain containing the phosphorylation site by GSK-3, a well-known PxIxIT motif responsible for docking many substrates to calcineurin, and an unrecognized C-terminal TxxP motif of unknown function. 	93
410108	cd12709	RRM_RCAN2	RNA recognition motif (RRM) found in vertebrate regulator of calcineurin 2 (RCAN2). This subgroup corresponds to the RRM of RCAN2, also termed calcipressin-2, or Down syndrome candidate region 1-like 1 (DSCR1L1), or myocyte-enriched calcineurin-interacting protein 2 (MCIP2), or thyroid hormone-responsive protein ZAKI-4, encoded by a novel thyroid hormone-responsive gene ZAKI-4 that is abundantly expressed in human brain, heart and muscles. RCAN2 binds to the catalytic subunit of Ca2+/calmodulin-dependent phosphatase calcineurin (also termed PP2B or PP3C), calcineurin A, and inhibits its phosphatase activity through its C-terminal region. RCAN2 consists of an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), a highly conserved SP repeat domain containing the phosphorylation site by GSK-3, a well-known PxIxIT motif responsible for docking many substrates to calcineurin, and an unrecognized C-terminal TxxP motif of unknown function. 	77
410109	cd12710	RRM_RCAN3	RNA recognition motif (RRM) found in vertebrate regulator of calcineurin 3 (RCAN3). This subgroup corresponds to the RRM of RCAN3, also termed calcipressin-3, or Down syndrome candidate region 1-like protein 2 (DSCR1L2), or myocyte-enriched calcineurin-interacting protein 3 (MCIP3), encoded by a ubiquitously expressed DSCR1L2 gene. Overexpressed RCAN3 binds and inhibits the Ca2+/calmodulin-dependent phosphatase calcineurin (also termed PP2B or PP3C), and further down-regulates nuclear factor of activated T cells (NFAT)-dependent cytokine gene expression in activated human Jurkat T cells. Moreover, RCAN3 interacts with cardiac troponin I (TNNI3), a heart-specific inhibitory subunit of the troponin complex, and may play a role in cardiac contraction. RCAN3 consists of an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), a highly conserved SP repeat domain containing the phosphorylation site by GSK-3, a well-known PxIxIT motif responsible for docking many substrates to calcineurin, and an unrecognized C-terminal TxxP motif of unknown function. 	77
410110	cd12711	RRM_TNRC6A	RNA recognition motif (RRM) found in vertebrate GW182 autoantigen. This subgroup corresponds to the RRM of the GW182 autoantigen, also termed trinucleotide repeat-containing gene 6A protein (TNRC6A), or CAG repeat protein 26, or EMSY interactor protein, or protein GW1, or glycine-tryptophan protein of 182 kDa, a phosphorylated cytoplasmic autoantigen involved in stabilizing and/or regulating translation and/or storing several different mRNAs. GW182 is characterized by multiple glycine/tryptophan (G/W) repeats and is a critical component of GW bodies (GWBs, also called mammalian processing bodies, or P bodies). The mRNAs associated with GW182 are presumed to reside within GWBs. GW182 has been shown to bind multiple Ago-miRNA complexes, and thus plays a key role in miRNA-mediated translational repression and mRNA degradation. In the absence of Ago2, GW182 may induce a translational silencing effect. GW182 is composed of an N-terminal G/W-rich region containing an Ago hook responsible for Ago protein-binding; a ubiquitin-associated (UBA) domain and a glutamine (Q)-rich region in the middle region; a middle G/W-rich region, an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal G/W-rich region, at the C-terminus. A bipartite C-terminal region including the middle and C-terminal G/W-rich regions is referred to as the silencing domain that triggers silencing of bound transcripts by inhibiting protein expression and promoting mRNA decay via deadenylation. 	92
410111	cd12712	RRM_TNRC6B	RNA recognition motif (RRM) found in vertebrate trinucleotide repeat-containing gene 6B protein (TNRC6B). This subgroup corresponds to the RRM of TNRC6B, one of three GW182 paralogs in mammalian genomes. It is involved in miRNA-mediated mRNA degradation. TNRC6B is composed of an N-terminal glycine/tryptophan (G/W)-rich region; a ubiquitin-associated (UBA) domain and a glutamine (Q)-rich region in the middle region; a middle G/W-rich region, an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal G/W-rich region, at the C-terminus. TNRC6B directly interacts with Argonaute (Ago) proteins through its N-terminal glycine/tryptophan (G/W)-rich region that is called the Ago protein-binding domain. TNRC6B is enriched in P-bodies and its Q-rich domain is responsible for P-body localization. A bipartite C-terminal region including the middle and C-terminal G/W-rich regions is referred to as the silencing domain that triggers silencing of bound transcripts by inhibiting protein expression and promoting mRNA decay via deadenylation. The C-terminal half of TNRC6B comprising an RRM domain exerts a strong translation inhibition potential, which does not require either association with Agos or localization to P-bodies.  	83
410112	cd12713	RRM_TNRC6C	RNA recognition motif (RRM) found in vertebrate trinucleotide repeat-containing gene 6C protein (TNRC6C). This subgroup corresponds to the RRM of TNRC6C, one of three GW182 paralogs in mammalian genomes. It is enriched in P-bodies and important for efficient miRNA-mediated repression. TNRC6C is composed of an N-terminal glycine/tryptophan (G/W)-rich region containing an Ago hook responsible for Ago protein-binding; a ubiquitin-associated (UBA) domain and a glutamine (Q)-rich region in the middle region; a middle G/W-rich region, an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal G/W-rich region, at the C-terminus. A bipartite C-terminal region including the middle and C-terminal G/W-rich regions is referred to as the silencing domain that triggers silencing of bound transcripts by inhibiting protein expression and promoting mRNA decay via deadenylation. The C-terminal half containing the RRM domain functions as a key effector domain mediating protein synthesis repression by TNRC6C. 	88
410113	cd12714	RRM1_MATR3	RNA recognition motif 1 (RRM1) found in vertebrate matrin-3. This subgroup corresponds to the RRM1 of Matrin 3 (MATR3 or P130), a highly conserved inner nuclear matrix protein with a bipartite nuclear localization signal (NLS), two zinc finger domains predicted to bind DNA, and two RNA recognition motifs (RRM), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), that are known to interact with RNA. MATR3 has been implicated in various biological processes. It is involved in RNA processing by interacting with other nuclear proteins to anchor hyperedited RNAs to the nuclear matrix. It plays a role in mRNA stabilization through maintaining the stability of certain mRNA species. Besides, it modulates the activity of proximal promoters by binding to highly repetitive sequences of matrix/scaffold attachment region (MAR/SAR). The phosphorylation of MATR3 is assumed to cause neuronal death. It is phosphorylated by the protein kinase ATM, which activates the cellular response to double strand breaks in the DNA. Its phosphorylation by protein kinase A (PKA) is responsible for the activation of the N-methyl-d-aspartic acid (NMDA) receptor. Furthermore, MATR3 has been identified as both a Ca2+-dependent CaM-binding protein and a downstream substrate of caspases. Additional research indicates that matrin 3 also binds Rev/Rev responsive element (RRE)-containing viral RNA and functions as a cofactor that mediates the post-transcriptional regulation of HIV-1. 	76
410114	cd12715	RRM2_MATR3	RNA recognition motif 2 (RRM2) found in vertebrate matrin-3. This subgroup corresponds to the RRM2 of Matrin 3 (MATR3 or P130), a highly conserved inner nuclear matrix protein with a bipartite nuclear localization signal (NLS), two zinc finger domains predicted to bind DNA, and two RNA recognition motifs (RRM), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), that are known to interact with RNA. MATR3 has been implicated in various biological processes. It is involved in RNA processing by interacting with other nuclear proteins to anchor hyperedited RNAs to the nuclear matrix. It plays a role in mRNA stabilization through maintaining the stability of certain mRNA species. Besides, it modulates the activity of proximal promoters by binding to highly repetitive sequences of matrix/scaffold attachment region (MAR/SAR). The phosphorylation of MATR3 is assumed to cause neuronal death. It is phosphorylated by the protein kinase ATM, which activates the cellular response to double strand breaks in the DNA. Its phosphorylation by protein kinase A (PKA) is responsible for the activation of the N-methyl-d-aspartic acid (NMDA) receptor. Furthermore, MATR3 has been identified as both a Ca2+-dependent CaM-binding protein and a downstream substrate of caspases. Additional research indicates that matrin 3 also binds Rev/Rev responsive element (RRE)-containing viral RNA and functions as a cofactor that mediates the post-transcriptional regulation of HIV-1. 	80
410115	cd12716	RRM1_2_NP220	RNA recognition motif 1 (RRM1) and 2 (RRM2) found in vertebrate nuclear protein 220 (NP220). This subgroup corresponds to RRM1 and RRM2 of NP220, also termed zinc finger protein 638 (ZN638), or cutaneous T-cell lymphoma-associated antigen se33-1, or zinc finger matrin-like protein, a large nucleoplasmic DNA-binding protein that binds to cytidine-rich sequences, such as CCCCC (G/C), in double-stranded DNA (dsDNA). NP220 contains multiple domains, including MH1, MH2, and MH3, domains homologous to the acidic nuclear protein matrin 3; RS, an arginine/serine-rich domain commonly found in pre-mRNA splicing factors; PstI-HindIII, a domain essential for DNA binding; acidic repeat, a domain with nine repeats of the sequence LVTVDEVIEEEDL; and a Cys2-His2 zinc finger-like motif that is also present in matrin 3. It may be involved in packaging, transferring, or processing transcripts. This subgroup corresponds to the domain of MH2 that contains two tandem RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains).	76
410116	cd12717	RRM_ETP1	RNA recognition motif (RRM) found in yeast RING finger protein ETP1 and similar proteins. This subgroup corresponds to the RRM of ETP1, also termed BRAP2 homolog, or ethanol tolerance protein 1, the yeast homolog of BRCA1-associated protein (BRAP2) found in vertebrates. It may be involved in ethanol and salt-induced transcriptional activation of the NHA1 promoter and heat shock protein genes (HSP12 and HSP26), and participate in ethanol-induced turnover of the low-affinity hexose transporter Hxt3p. ETP1 contains an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), followed by a C3HC4-type ring finger domain and a UBP-type zinc finger. 	83
410117	cd12718	RRM_BRAP2	RNA recognition motif (RRM) found in BRCA1-associated protein (BRAP2). This subgroup corresponds to the RRM of BRAP2, also termed impedes mitogenic signal propagation (IMP), or ring finger protein 52, or renal carcinoma antigen NY-REN-63, a novel cytoplasmic protein interacting with the two functional nuclear localisation signal (NLS) motifs of BRCA1, a nuclear protein linked to breast cancer. It also binds to the SV40 large T antigen NLS motif and the bipartite NLS motif found in mitosin. BRAP2 may serve as a cytoplasmic retention protein and play a role in the regulation of nuclear protein transport. It contains an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), followed by a C3HC4-type ring finger domain and a UBP-type zinc finger. 	84
410118	cd12719	RRM_SYNJ1	RNA recognition motif (RRM) found in synaptojanin-1 and similar proteins. This subgroup corresponds to the RRM of synaptojanin-1, also termed synaptojanin, or synaptic inositol-1,4,5-trisphosphate 5-phosphatase 1, originally identified as one of the major Grb2-binding proteins that may participate in synaptic vesicle endocytosis. It also acts as a Src homology 3 (SH3) domain-binding brain-specific inositol 5-phosphatase with a putative role in clathrin-mediated endocytosis. Synaptojanin-1 contains an N-terminal domain homologous to the cytoplasmic portion of the yeast protein Sac1p, a central inositol 5-phosphatase domain followed by a putative RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal proline-rich region mediating the binding of synaptojanin-1 to various SH3 domain-containing proteins including amphiphysin, SH3p4, SH3p8, SH3p13, and Grb2. Synaptojanin-1 has two tissue-specific alternative splicing isoforms, synaptojanin-145 expressed in brain and synaptojanin-170 expressed in peripheral tissues. Synaptojanin-145 is very abundant in nerve terminals and may play an essential role in the clathrin-mediated endocytosis of synaptic vesicles. In contrast to synaptojanin-145, synaptojanin-170 contains three unique asparagine-proline-phenylalanine (NPF) motifs in the C-terminal region and may function as a potential binding partner for Eps15, a clathrin coat-associated protein acting as a major substrate for the tyrosine kinase activity of the epidermal growth factor receptor. 	77
410119	cd12720	RRM_SYNJ2	RNA recognition motif (RRM) found in synaptojanin-2 and similar proteins. This subgroup corresponds to the RRM of synaptojanin-2, also termed synaptic inositol-1,4,5-trisphosphate 5-phosphatase 2, a ubiquitously expressed central regulatory enzyme in the phosphoinositide-signaling cascade. As a novel Rac1 effector regulating the early step of clathrin-mediated endocytosis, synaptojanin-2 acts as a polyphosphoinositide phosphatase directly and specifically interacting with Rac1 in a GTP-dependent manner. It mediates the inhibitory effect of Rac1 on endocytosis and plays an important role in the Rac1-mediated control of cell growth. Synaptojanin-2 shows high sequence homology to the N-terminal Sac1p homology domain, the central inositol 5-phosphatase domain, and the putative RNA recognition motif (RRM) of synaptojanin-1, but differs in the proline-rich region. 	78
410120	cd12721	RRM_Nup53p_fungi	RNA recognition motif (RRM) found in yeast nucleoporin Nup53p and similar proteins. This subgroup corresponds to the RRM of Saccharomyces cerevisiae Nup53p, the ortholog of vertebrate nucleoporin Nup53. A unique property of yeast Nup53p is that it contains an additional Kap121p-binding domain and interacts specifically with the karyopherin Kap121p, which is involved in the assembly of Nup53p into NPCs. Like vertebrate Nup35, yeast Nup53p contains an atypical RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), a C-terminal amphipathic alpha-helix and several FG repeats. The RRM domain lacks the conserved residues that typically bind RNA in canonical RRM domains.	86
410121	cd12722	RRM_Nup53	RNA recognition motif (RRM) found in nucleoporin Nup53. This subgroup corresponds to the RRM of nucleoporin Nup53, also termed mitotic phosphoprotein 44 (MP-44), or nuclear pore complex protein Nup53, required for normal cell growth and nuclear morphology in vertebrates. It tightly associates with the nuclear envelope membrane and the nuclear lamina where it interacts with lamin B. It may also interact with a group of nucleoporins including Nup93, Nup155, and Nup205 and play a role in the association of the mitotic checkpoint protein Mad1 with the nuclear pore complex (NPC). Nup35 contains an atypical RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), a C-terminal amphipathic alpha-helix and several FG repeats. This RRM lacks the conserved residues that typically bind RNA in canonical RRM domains.	74
410122	cd12723	RRM1_CPEB1	RNA recognition motif 1 (RRM1) found in cytoplasmic polyadenylation element-binding protein 1 (CPEB-1) and similar proteins. This subgroup corresponds to the RRM1 of CPEB-1 (also termed CPE-BP1 or CEBP), an RNA-binding protein that interacts with the cytoplasmic polyadenylation element (CPE), a short U-rich motif in the 3' untranslated regions (UTRs) of certain mRNAs. It functions as a translational regulator that plays a major role in the control of maternal CPE-containing mRNA in oocytes, as well as of subsynaptic CPE-containing mRNA in neurons. Once phosphorylated and recruiting the polyadenylation complex, CPEB-1 may function as a translational activator stimulating polyadenylation and translation. Otherwise, it may function as a translational inhibitor when dephosphorylated and bound to a protein such as maskin or neuroguidin, which blocks translation initiation through interfering with the assembly of eIF-4E and eIF-4G. Although CPEB-1 is mainly located in cytoplasm, it can shuttle between nucleus and cytoplasm. CPEB-1 contains an N-terminal unstructured region, two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a Zn-finger motif. Both of the RRMs and the Zn finger are required for CPEB-1 to bind CPE. The N-terminal regulatory region may be responsible for CPEB-1 interacting with other proteins. 	101
410123	cd12724	RRM1_CPEB2_like	RNA recognition motif 1 (RRM1) found in cytoplasmic polyadenylation element-binding protein CPEB-2, CPEB-3, CPEB-4 and similar proteins. This subgroup corresponds to the RRM1 of the paralog proteins CPEB-2, CPEB-3 and CPEB-4, all well conserved in both vertebrates and invertebrates. Due to the high sequence similarity, members in this family may share similar expression patterns and functions. CPEB-2 is an RNA-binding protein that is abundantly expressed in testis and localized in cytoplasm in transfected HeLa cells. It preferentially binds to poly(U) RNA oligomers and may regulate the translation of stored mRNAs during spermiogenesis. Moreover, CPEB-2 impedes target RNA translation at elongation; it directly interacts with the elongation factor, eEF2, to reduce eEF2/ribosome-activated GTP hydrolysis in vitro and inhibit peptide elongation of CPEB2-bound RNA in vivo. CPEB-3 is a sequence-specific translational regulatory protein that regulates translation in a polyadenylation-independent manner. It functions as a translational repressor that governs the synthesis of the AMPA receptor GluR2 through binding GluR2 mRNA. It also represses translation of a reporter RNA in transfected neurons and stimulates translation in response to NMDA. CPEB-4 is an RNA-binding protein that mediates meiotic mRNA cytoplasmic polyadenylation and translation. It is essential for neuron survival and present on the endoplasmic reticulum (ER). It is accumulated in the nucleus upon ischemia or the depletion of ER calcium. CPEB-4 is overexpressed in a large variety of tumors and is associated with many mRNAs in cancer cells. All family members contain an N-terminal unstructured region, two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a Zn-finger motif. In addition, they do have conserved nuclear export signals that are not present in CPEB-1. 	92
410124	cd12725	RRM2_CPEB1	RNA recognition motif 2 (RRM2) found in cytoplasmic polyadenylation element-binding protein 1 (CPEB-1) and similar proteins. This subgroup corresponds to the RRM2 of CPEB-1 (also termed CPE-BP1 or CEBP), an RNA-binding protein that interacts with the cytoplasmic polyadenylation element (CPE), a short U-rich motif in the 3' untranslated regions (UTRs) of certain mRNAs. It functions as a translational regulator that plays a major role in the control of maternal CPE-containing mRNA in oocytes, as well as of subsynaptic CPE-containing mRNA in neurons. Once phosphorylated and recruiting the polyadenylation complex, CPEB-1 may function as a translational activator stimulating polyadenylation and translation. Otherwise, it may function as a translational inhibitor when dephosphorylated and bound to a protein such as maskin or neuroguidin, which blocks translation initiation through interfering with the assembly of eIF-4E and eIF-4G. Although CPEB-1 is mainly located in cytoplasm, it can shuttle between nucleus and cytoplasm. CPEB-1 contains an N-terminal unstructured region, two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a Zn-finger motif. Both of the RRMs and the Zn finger are required for CPEB-1 to bind CPE. The N-terminal regulatory region may be responsible for CPEB-1 interacting with other proteins. 	84
410125	cd12726	RRM2_CPEB2_like	RNA recognition motif 2 (RRM2) found in cytoplasmic polyadenylation element-binding protein CPEB-2, CPEB-3, CPEB-4 and similar proteins. This subgroup corresponds to the RRM2 of the paralog proteins CPEB-2, CPEB-3 and CPEB-4, all well conserved in both vertebrates and invertebrates. Due to the high sequence similarity, members in this family may share similar expression patterns and functions. CPEB-2 is an RNA-binding protein that is abundantly expressed in testis and localized in cytoplasm in transfected HeLa cells. It preferentially binds to poly(U) RNA oligomers and may regulate the translation of stored mRNAs during spermiogenesis. Moreover, CPEB-2 impedes target RNA translation at elongation; it directly interacts with the elongation factor, eEF2, to reduce eEF2/ribosome-activated GTP hydrolysis in vitro and inhibit peptide elongation of CPEB2-bound RNA in vivo. CPEB-3 is a sequence-specific translational regulatory protein that regulates translation in a polyadenylation-independent manner. It functions as a translational repressor that governs the synthesis of the AMPA receptor GluR2 through binding GluR2 mRNA. It also represses translation of a reporter RNA in transfected neurons and stimulates translation in response to NMDA. CPEB-4 is an RNA-binding protein that mediates meiotic mRNA cytoplasmic polyadenylation and translation. It is essential for neuron survival and present on the endoplasmic reticulum (ER). It is accumulated in the nucleus upon ischemia or the depletion of ER calcium. CPEB-4 is overexpressed in a large variety of tumors and is associated with many mRNAs in cancer cells. All family members contain an N-terminal unstructured region, two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a Zn-finger motif. In addition, they do have conserved nuclear export signals that are not present in CPEB-1. 	81
410126	cd12727	RRM_like_Smg4_UPF3A	RNA recognition motif (RRM)-like Smg4_UPF3 domain in up-frameshift suppressor 3 homolog A (Upf3A). This subgroup corresponds to the RRM-like Smg4_UPF3 domain in Upf3A, also termed regulator of nonsense transcripts 3A, or nonsense mRNA reducing factor 3A, a human ortholog of yeast Upf3p and Caenorhabditis elegans SMG-4. It derives from gene UPF3A and is required for nonsense-mediated mRNA decay (NMD) in human. Upf3A is a nucleocytoplasmic shuttling protein that associates selectively with spliced beta-globin mRNA in vivo. Like other Upf3 proteins, Upf3A contains nuclear import and export signals, and a conserved Smg4_UPF3 domain with some similarity to an RNA recognition motif (RRM), indicating that it may be an RNA binding protein.  	87
410127	cd12728	RRM_like_Smg4_UPF3B	RNA recognition motif (RRM)-like Smg4_UPF3 domain in up-frameshift suppressor 3 homolog B on chromosome X (Upf3B). This subgroup corresponds to the RRM-like Smg4_UPF3 domain in Upf3B, also termed regulator of nonsense transcripts 3B, or nonsense mRNA reducing factor 3B, a human ortholog of yeast Upf3p and Caenorhabditis elegans SMG-4. It derives from X-linked gene UPF3B and is required for nonsense-mediated mRNA decay (NMD) in human. Upf3B is a nucleocytoplasmic shuttling protein that associates selectively with spliced beta-globin mRNA in vivo. Like other Upf3 proteins, Upf3B contains nuclear import and export signals, and a conserved Smg4_UPF3 domain with some similarity to an RNA recognition motif (RRM), indicating that it may be an RNA binding protein.  	89
410128	cd12729	RRM1_hnRNPH_hnRNPH2_hnRNPF	RNA recognition motif 1 (RRM1) found in heterogeneous nuclear ribonucleoprotein hnRNP H, hnRNP H2, hnRNP F and similar proteins. This subgroup corresponds to the RRM1 of hnRNP H (also termed mcs94-1), hnRNP H2 (also termed FTP-3 or hnRNP H') and hnRNP F. These represent a group of nuclear RNA binding proteins that play important roles in the regulation of alternative splicing decisions. hnRNP H and hnRNP F are two closely related proteins, both of which bind to the RNA sequence DGGGD. They are present in a complex with the tissue-specific splicing factor Fox2, and regulate the alternative splicing of the fibroblast growth factor receptor 2 (FGFR2) transcripts. The presence of Fox2 allows hnRNP H and hnRNP F to better compete with the SR protein ASF/SF2 for binding to FGFR2 exon IIIc. Thus, hnRNP H and hnRNP F can function as potent silencers of FGFR2 exon IIIc inclusion through an interaction with the exonic GGG motifs. Furthermore, hnRNP H and hnRNP H2 are almost identical. Both of them have been found to bind nuclear-matrix proteins. hnRNP H activates exon inclusion by binding G-rich intronic elements downstream of the 5' splice site in the transcripts of c-src, human immunodeficiency virus type 1 (HIV-1), Bcl-X, GRIN1, and myelin. It silences exons when bound to exonic elements in the transcripts of beta-tropomyosin, HIV-1, and alpha-tropomyosin. hnRNP H2 has been implicated in pre-mRNA 3' end formation. Members in this family contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). RRM1 and RRM2 are responsible for the binding to the RNA at DGGGD motifs, and they play an important role in efficiently silencing the exon. In addition, the family members have an extensive glycine-rich region near the C-terminus, which may allow them to homo- or heterodimerize. 	79
410129	cd12730	RRM1_GRSF1	RNA recognition motif 1 (RRM1) found in G-rich sequence factor 1 (GRSF-1) and similar proteins. This subgroup corresponds to the RRM1 of GRSF-1, a cytoplasmic poly(A)+ mRNA binding protein which interacts with RNA in a G-rich element-dependent manner. It may function in RNA packaging, stabilization of RNA secondary structure, or other macromolecular interactions. GRSF-1 contains three potential RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), which are responsible for the RNA binding. In addition, GRSF-1 has two auxiliary domains, an acidic alpha-helical domain and an N-terminal alanine-rich region, that may play a role in protein-protein interactions and provide binding specificity. 	79
410130	cd12731	RRM2_hnRNPH_hnRNPH2_hnRNPF	RNA recognition motif 2 (RRM2) found in heterogeneous nuclear ribonucleoprotein hnRNP H, hnRNP H2, hnRNP F and similar proteins. This subgroup corresponds to the RRM2 of hnRNP H (also termed mcs94-1), hnRNP H2 (also termed FTP-3 or hnRNP H') and hnRNP F. These represent a group of nuclear RNA binding proteins that play important roles in the regulation of alternative splicing decisions. hnRNP H and hnRNP F are two closely related proteins, both of which bind to the RNA sequence DGGGD. They are present in a complex with the tissue-specific splicing factor Fox2, and regulate the alternative splicing of the fibroblast growth factor receptor 2 (FGFR2) transcripts. The presence of Fox2 allows hnRNP H and hnRNP F to better compete with the SR protein ASF/SF2 for binding to FGFR2 exon IIIc. Thus, hnRNP H and hnRNP F can function as potent silencers of FGFR2 exon IIIc inclusion through an interaction with the exonic GGG motifs. Furthermore, hnRNP H and hnRNP H2 are almost identical; both have been found to bind nuclear-matrix proteins. hnRNP H activates exon inclusion by binding G-rich intronic elements downstream of the 5' splice site in the transcripts of c-src, human immunodeficiency virus type 1 (HIV-1), Bcl-X, GRIN1, and myelin. It silences exons when bound to exonic elements in the transcripts of beta-tropomyosin, HIV-1, and alpha-tropomyosin. hnRNP H2 has been implicated in pre-mRNA 3' end formation. Members in this family contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). RRM1 and RRM2 are responsible for the binding to the RNA at DGGGD motifs, and they play an important role in efficiently silencing the exon. In addition, the family members have an extensive glycine-rich region near the C-terminus, which may allow them to homo- or heterodimerize. 	90
410131	cd12732	RRM2_hnRNPH3	RNA recognition motif 2 (RRM2) found in heterogeneous nuclear ribonucleoprotein H3 (hnRNP H3) and similar proteins. This subgroup corresponds to the RRM2 of hnRNP H3 (also termed hnRNP 2H9), a nuclear RNA binding protein that belongs to the hnRNP H protein family that also includes hnRNP H (also termed mcs94-1), hnRNP H2 (also termed FTP-3 or hnRNP H') and hnRNP F. This family is involved in mRNA processing and exhibits extensive sequence homology. Currently, little is known about the functions of hnRNP H3 except for its role in the splicing arrest induced by heat shock. In addition, the typical hnRNP H proteins contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), except for hnRNP H3, in which the RRM1 is absent. RRM1 and RRM2 are responsible for the binding to the RNA at DGGGD motifs, and play an important role in efficiently silencing the exon. Members in this family can regulate the alternative splicing of the fibroblast growth factor receptor 2 (FGFR2) transcripts, and function as silencers of FGFR2 exon IIIc through an interaction with the exonic GGG motifs. The lack of RRM1 could account for the reduced silencing activity within hnRNP H3. In addition, like other hnRNP H protein family members, hnRNP H3 has an extensive glycine-rich region near the C-terminus, which may allow it to homo- or heterodimerize. 	96
410132	cd12733	RRM3_GRSF1	RNA recognition motif 3 (RRM3) found in G-rich sequence factor 1 (GRSF-1) and similar proteins. This subgroup corresponds to the RRM3 of G-rich sequence factor 1 (GRSF-1), a cytoplasmic poly(A)+ mRNA binding protein which interacts with RNA in a G-rich element-dependent manner. It may function in RNA packaging, stabilization of RNA secondary structure, or other macromolecular interactions. GRSF-1 contains three potential RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), which are responsible for the RNA binding. In addition, GRSF-1 has two auxiliary domains, an acidic alpha-helical domain and an N-terminal alanine-rich region, that may play a role in protein-protein interactions and provide binding specificity. 	75
410133	cd12734	RRM3_hnRNPH_hnRNPH2_hnRNPF	RNA recognition motif 3 (RRM3) found in heterogeneous nuclear ribonucleoprotein hnRNP H, hnRNP H2, hnRNP F and similar proteins. This subgroup corresponds to the RRM3 of hnRNP H (also termed mcs94-1), hnRNP H2 (also termed FTP-3 or hnRNP H') and hnRNP F, which represent a group of nuclear RNA binding proteins that play important roles in the regulation of alternative splicing decisions. hnRNP H and hnRNP F are two closely related proteins, both of which bind to the RNA sequence DGGGD. They are present in a complex with the tissue-specific splicing factor Fox2, and regulate the alternative splicing of the fibroblast growth factor receptor 2 (FGFR2) transcripts. The presence of Fox2 allows hnRNP H and hnRNP F to better compete with the SR protein ASF/SF2 for binding to FGFR2 exon IIIc. Thus, hnRNP H and hnRNP F can function as potent silencers of FGFR2 exon IIIc inclusion through an interaction with the exonic GGG motifs. Furthermore, hnRNP H and hnRNP H2 are almost identical; both have been found to bind nuclear-matrix proteins. hnRNP H activates exon inclusion by binding G-rich intronic elements downstream of the 5' splice site in the transcripts of c-src, human immunodeficiency virus type 1 (HIV-1), Bcl-X, GRIN1, and myelin. It silences exons when bound to exonic elements in the transcripts of beta-tropomyosin, HIV-1, and alpha-tropomyosin. hnRNP H2 has been implicated in pre-mRNA 3' end formation. Members in this family contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). RRM1 and RRM2 are responsible for the binding to the RNA at DGGGD motifs, and they play an important role in efficiently silencing the exon. In addition, the family members have an extensive glycine-rich region near the C-terminus, which may allow them to homo- or heterodimerize. 	76
241179	cd12735	RRM3_hnRNPH3	RNA recognition motif 3 (RRM3) found in heterogeneous nuclear ribonucleoprotein H3 (hnRNP H3) and similar proteins. This subgroup corresponds to the RRM3 of hnRNP H3 (also termed hnRNP 2H9), a nuclear RNA binding protein that belongs to the hnRNP H protein family that also includes hnRNP H (also termed mcs94-1), hnRNP H2 (also termed FTP-3 or hnRNP H'), and hnRNP F. This family is involved in mRNA processing and exhibits extensive sequence homology. Currently, little is known about the functions of hnRNP H3 except for its role in the splicing arrest induced by heat shock. In addition, the typical hnRNP H proteins contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), except for hnRNP H3, in which the RRM1 is absent. RRM1 and RRM2 are responsible for the binding to the RNA at DGGGD motifs, and they play an important role in efficiently silencing the exon. Members in this family can regulate the alternative splicing of the fibroblast growth factor receptor 2 (FGFR2) transcripts, and function as silencers of FGFR2 exon IIIc through an interaction with the exonic GGG motifs. The lack of RRM1 could account for the reduced silencing activity within hnRNP H3. In addition, like other hnRNP H protein family members, hnRNP H3 has an extensive glycine-rich region near the C-terminus, which may allow it to homo- or heterodimerize. 	75
410134	cd12736	RRM1_ESRP1	RNA recognition motif 1 (RRM1) found in epithelial splicing regulatory protein 1 (ESRP1) and similar proteins. This subgroup corresponds to the RRM1 of ESRP1, also termed RNA-binding motif protein 35A (RBM35A), which has been identified as an epithelial cell type-specific regulator of fibroblast growth factor receptor 2 (FGFR2) splicing. It is required for expression of epithelial FGFR2-IIIb and the regulation of CD44, CTNND1 (p120-Catenin) and ENAH (hMena) splicing. It enhances epithelial-specific exons of CD44 and ENAH, silences mesenchymal exons of CTNND1, or both within FGFR2. Additional research indicated that ESRP1 functions as a tumor suppressor in colon cancer cells. It may be involved in posttranscriptional regulation of various genes by exerting a differential effect on protein translation via 5' untranslated regions (UTRs) of mRNAs. ESRP1 contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 	93
410135	cd12737	RRM1_ESRP2	RNA recognition motif 1 (RRM1) found in epithelial splicing regulatory protein 2 (ESRP2) and similar proteins. This subgroup corresponds to the RRM1 of ESRP2, also termed RNA-binding motif protein 35B (RBM35B), which has been identified as an epithelial cell type-specific regulator of fibroblast growth factor receptor 2 (FGFR2) splicing. It is required for expression of epithelial FGFR2-IIIb and the regulation of CD44, CTNND1 (also termed p120-Catenin) and ENAH (also termed hMena) splicing. It enhances epithelial-specific exons of CD44 and ENAH, silences mesenchymal exons of CTNND1, or both within FGFR2. ESRP2 contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 	80
241182	cd12738	RRM1_Fusilli	RNA recognition motif 1 (RRM1) found in Drosophila RNA-binding protein Fusilli and similar proteins. This subgroup corresponds to the RRM1 of RNA-binding protein Fusilli which is encoded by Drosophila fusilli (fus) gene. Loss of Fusilli activity causes lethality during embryogenesis in flies. Drosophila Fusilli can regulate endogenous fibroblast growth factor receptor 2 (FGFR2) splicing and functions as a splicing factor. Fusilli contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), an N-terminal domain with unknown function and a C-terminal domain particularly rich in alanine, glutamine, and serine. 	80
410136	cd12739	RRM2_ESRP1	RNA recognition motif 2 (RRM2) found in epithelial splicing regulatory protein 1 (ESRP1) and similar proteins. This subgroup corresponds to the RRM2 of ESRP1, also termed RNA-binding motif protein 35A (RBM35A), which has been identified as an epithelial cell type-specific regulator of fibroblast growth factor receptor 2 (FGFR2) splicing. It is required for expression of epithelial FGFR2-IIIb and the regulation of CD44, CTNND1 (also termed p120-Catenin) and ENAH (also termed hMena) splicing. It enhances epithelial-specific exons of CD44 and ENAH, silences mesenchymal exons of CTNND1, or both within FGFR2. Additional research indicated that ESRP1 functions as a tumor suppressor in colon cancer cells. It may be involved in posttranscriptional regulation of various genes by exerting a differential effect on protein translation via 5' untranslated regions (UTRs) of mRNAs. ESRP1 contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 	111
241184	cd12740	RRM2_ESRP2	RNA recognition motif 2 (RRM2) found in epithelial splicing regulatory protein 2 (ESRP2) and similar proteins. This subgroup corresponds to the RRM2 of ESRP2, also termed RNA-binding motif protein 35B (RBM35B), which has been identified as an epithelial cell type-specific regulator of fibroblast growth factor receptor 2 (FGFR2) splicing. It is required for expression of epithelial FGFR2-IIIb and the regulation of CD44, CTNND1 (also termed p120-Catenin) and ENAH (also termed hMena) splicing. It enhances epithelial-specific exons of CD44 and ENAH, silences mesenchymal exons of CTNND1, or both within FGFR2. ESRP2 contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 	107
410137	cd12741	RRM2_Fusilli	RNA recognition motif 2 (RRM2) found in Drosophila RNA-binding protein Fusilli and similar proteins. This subgroup corresponds to the RRM2 of RNA-binding protein Fusilli which is encoded by Drosophila fusilli (fus) gene. Loss of Fusilli activity causes lethality during embryogenesis in flies. Drosophila Fusilli can regulate endogenous fibroblast growth factor receptor 2 (FGFR2) splicing and functions as a splicing factor. Fusilli contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), an N-terminal domain with unknown function and a C-terminal domain particularly rich in alanine, glutamine, and serine. 	99
410138	cd12742	RRM3_ESRP1_ESRP2	RNA recognition motif 3 (RRM3) found in epithelial splicing regulatory protein ESRP1, ESRP2 and similar proteins. This subgroup corresponds to the RRM3 of ESRP1 (also termed RBM35A) and ESRP2 (also termed RBM35B). These are epithelial-specific RNA binding proteins that promote splicing of the epithelial variant of the fibroblast growth factor receptor 2 (FGFR2), ENAH (also termed hMena), CD44 and CTNND1 (also termed p120-Catenin) transcripts. They are highly conserved paralogs and specifically bind to GU-rich binding sites. ESRP1 and ESRP2 contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). 	81
241187	cd12743	RRM3_Fusilli	RNA recognition motif 3 (RRM3) found in Drosophila RNA-binding protein Fusilli and similar proteins. This subgroup corresponds to the RRM3 of RNA-binding protein Fusilli which is encoded by Drosophila fusilli (fus) gene. Loss of Fusilli activity causes lethality during embryogenesis in flies. Drosophila Fusilli can regulate endogenous fibroblast growth factor receptor 2 (FGFR2) splicing and functions as a splicing factor. Fusilli contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), an N-terminal domain with unknown function and a C-terminal domain particularly rich in alanine, glutamine, and serine. 	85
410139	cd12744	RRM1_RBM12B	RNA recognition motif 1 (RRM1) found in RNA-binding protein 12B (RBM12B) and similar proteins. This subgroup corresponds to the RRM1 of RBM12B which contains five distinct RNA binding motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). Its biological role remains unclear. 	79
241189	cd12745	RRM1_RBM12	RNA recognition motif 1 (RRM1) found in RNA-binding protein 12 (RBM12) and similar proteins. This subgroup corresponds to the RRM1 of RBM12, also termed SH3/WW domain anchor protein in the nucleus (SWAN), which is ubiquitously expressed. It contains five distinct RNA binding motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two proline-rich regions, and several putative transmembrane domains. The biological role of RBM12 remains unclear. 	92
410140	cd12746	RRM2_RBM12B	RNA recognition motif 2 (RRM2) found in RNA-binding protein 12B (RBM12B) and similar proteins. This subgroup corresponds to the RRM2 of RBM12B which contains five distinct RNA binding motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). Its biological role remains unclear. 	86
410141	cd12747	RRM2_RBM12	RNA recognition motif 2 (RRM2) found in RNA-binding protein 12 (RBM12) and similar proteins. This subgroup corresponds to the RRM2 of RBM12, also termed SH3/WW domain anchor protein in the nucleus (SWAN), which is ubiquitously expressed. It contains five distinct RNA binding motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two proline-rich regions, and several putative transmembrane domains. The biological role of RBM12 remains unclear. 	75
410142	cd12748	RRM4_RBM12B	RNA recognition motif 4 (RRM4) found in RNA-binding protein 12B (RBM12B) and similar proteins. This subgroup corresponds to the RRM4 of RBM12B which contains five distinct RNA binding motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). Its biological role remains unclear. 	76
410143	cd12749	RRM4_RBM12	RNA recognition motif 4 (RRM4) found in RNA-binding protein 12 (RBM12) and similar proteins. This subgroup corresponds to the RRM4 of RBM12, also termed SH3/WW domain anchor protein in the nucleus (SWAN), which is ubiquitously expressed. It contains five distinct RNA binding motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two proline-rich regions, and several putative transmembrane domains. The biological role of RBM12 remains unclear. 	88
410144	cd12750	RRM5_RBM12B	RNA recognition motif 5 (RRM5) found in RNA-binding protein 12B (RBM12B) and similar proteins. This subgroup corresponds to the RRM5 of RBM12B which contains five distinct RNA binding motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). Its biological role remains unclear. 	77
410145	cd12751	RRM5_RBM12	RNA recognition motif 5 (RRM5) found in RNA-binding protein 12 (RBM12) and similar proteins. This subgroup corresponds to the RRM5 of RBM12, also termed SH3/WW domain anchor protein in the nucleus (SWAN), which is ubiquitously expressed. It contains five distinct RNA binding motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two proline-rich regions, and several putative transmembrane domains. The biological role of RBM12 remains unclear. 	76
410146	cd12752	RRM1_RBM5	RNA recognition motif 1 (RRM1) found in vertebrate RNA-binding protein 5 (RBM5). This subgroup corresponds to the RRM1 of RBM5, also termed protein G15, or putative tumor suppressor LUCA15, or renal carcinoma antigen NY-REN-9, a known modulator of apoptosis. It may also act as a tumor suppressor or an RNA splicing factor. RBM5 shows high sequence similarity to RNA-binding protein 6 (RBM6 or NY-LU-12 or g16 or DEF-3). Both RBM5 and RBM6 specifically bind poly(G) RNA. They contain two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two C2H2-type zinc fingers, a nuclear localization signal, and a G-patch/D111 domain. 	87
410147	cd12753	RRM1_RBM10	RNA recognition motif 1 (RRM1) found in vertebrate RNA-binding protein 10 (RBM10). This subgroup corresponds to the RRM1 of RBM10, also termed G patch domain-containing protein 9, or RNA-binding protein S1-1 (S1-1), a paralog of putative tumor suppressor RNA-binding protein 5 (RBM5 or LUCA15 or H37). It may play an important role in mRNA generation, processing and degradation in several cell types. The rat homolog of human RBM10 is protein S1-1, a hypothetical RNA binding protein with poly(G) and poly(U) binding capabilities. RBM10 is structurally related to RBM5 and RNA-binding protein 6 (RBM6 or NY-LU-12 or g16 or DEF-3). It contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two C2H2-type zinc fingers, and a G-patch/D111 domain. 	84
410148	cd12754	RRM2_RBM10	RNA recognition motif 2 (RRM2) found in vertebrate RNA-binding protein 10 (RBM10). This subgroup corresponds to the RRM2 of RBM10, also termed G patch domain-containing protein 9, or RNA-binding protein S1-1 (S1-1), a paralog of putative tumor suppressor RNA-binding protein 5 (RBM5 or LUCA15 or H37). It may play an important role in mRNA generation, processing and degradation in several cell types. The rat homolog of human RBM10 is protein S1-1, a hypothetical RNA binding protein with poly(G) and poly(U) binding capabilities. RBM10 is structurally related to RBM5 and RNA-binding protein 6 (RBM6 or NY-LU-12 or g16 or DEF-3). It contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two C2H2-type zinc fingers, and a G-patch/D111 domain. 	87
410149	cd12755	RRM2_RBM5	RNA recognition motif 2 (RRM2) found in vertebrate RNA-binding protein 5 (RBM5). This subgroup corresponds to the RRM2 of RBM5, also termed protein G15, or putative tumor suppressor LUCA15, or renal carcinoma antigen NY-REN-9, a known modulator of apoptosis. It may also act as a tumor suppressor or an RNA splicing factor. RBM5 shows high sequence similarity to RNA-binding protein 6 (RBM6 or NY-LU-12 or g16 or DEF-3). Both RBM5 and RBM6 specifically bind poly(G) RNA. They contain two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), two C2H2-type zinc fingers, a nuclear localization signal, and a G-patch/D111 domain. 	86
410150	cd12756	RRM1_hnRNPD	RNA recognition motif 1 (RRM1) found in heterogeneous nuclear ribonucleoprotein D0 (hnRNP D0) and similar proteins. This subgroup corresponds to the RRM1 of hnRNP D0, also termed AU-rich element RNA-binding protein 1, which is a UUAG-specific nuclear RNA binding protein that may be involved in pre-mRNA splicing and telomere elongation. hnRNP D0 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), in the middle and an RGG box rich in glycine and arginine residues in the C-terminal part. Each of the RRMs can bind the UUAG sequence specifically on its own. 	74
410151	cd12757	RRM1_hnRNPAB	RNA recognition motif 1 (RRM1) found in heterogeneous nuclear ribonucleoprotein A/B (hnRNP A/B) and similar proteins. This subgroup corresponds to the RRM1 of hnRNP A/B, also termed APOBEC1-binding protein 1 (ABBP-1), which is an RNA unwinding protein with a high affinity for G- followed by U-rich regions. hnRNP A/B has also been identified as an APOBEC1-binding protein that interacts with apolipoprotein B (apoB) mRNA transcripts around the editing site and thus plays an important role in apoB mRNA editing. hnRNP A/B contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a long C-terminal glycine-rich domain that contains a potential ATP/GTP binding loop. 	80
410152	cd12758	RRM1_hnRPDL	RNA recognition motif 1 (RRM1) found in heterogeneous nuclear ribonucleoprotein D-like (hnRNP D-like or hnRNP DL) and similar proteins. This subgroup corresponds to the RRM1 of hnRNP DL (or hnRNP D-like), also termed AU-rich element RNA-binding factor, or JKT41-binding protein (protein laAUF1 or JKTBP), which is a dual functional protein that possesses DNA- and RNA-binding properties. It has been implicated in mRNA biogenesis at the transcriptional and post-transcriptional levels. hnRNP DL binds single-stranded DNA (ssDNA) or double-stranded DNA (dsDNA) in a non-sequence-specific manner, and interacts with poly(G) and poly(A) tenaciously. It contains two putative RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a glycine- and tyrosine-rich C-terminus. 	76
241203	cd12759	RRM1_MSI1	RNA recognition motif 1 (RRM1) found in RNA-binding protein Musashi homolog 1 (Musashi-1) and similar proteins. This subgroup corresponds to the RRM1 of Musashi-1. The mammalian MSI1 gene encoding Musashi-1 (also termed Msi1) is a neural RNA-binding protein putatively expressed in central nervous system (CNS) stem cells and neural progenitor cells and associated with asymmetric divisions in neural progenitor cells. Musashi-1 is evolutionarily conserved from invertebrates to vertebrates. It is a homolog of Drosophila Musashi and Xenopus laevis nervous system-specific RNP protein-1 (Nrp-1). Musashi-1 has been implicated in the maintenance of the stem-cell state, differentiation, and tumorigenesis. It translationally regulates the expression of a mammalian numb gene by binding to the 3'-untranslated region of mRNA of Numb, encoding a membrane-associated inhibitor of Notch signaling, and further influences neural development. Moreover, it represses translation by interacting with the poly(A)-binding protein and competes for binding of the eukaryotic initiation factor-4G (eIF-4G). Musashi-1 contains two conserved N-terminal tandem RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), along with other domains of unknown function. 	77
410153	cd12760	RRM1_MSI2	RNA recognition motif 1 (RRM1) found in RNA-binding protein Musashi homolog 2 (Musashi-2) and similar proteins. This subgroup corresponds to the RRM1 of Musashi-2 (also termed Msi2) which has been identified as a regulator of the hematopoietic stem cell (HSC) compartment and of leukemic stem cells after transplantation of cells with loss and gain of function of the gene. It influences proliferation and differentiation of HSCs and myeloid progenitors, and further modulates normal hematopoiesis and promotes aggressive myeloid leukemia. Musashi-2 contains two conserved N-terminal tandem RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), along with other domains of unknown function. 	93
410154	cd12761	RRM1_hnRNPA1	RNA recognition motif 1 (RRM1) found in heterogeneous nuclear ribonucleoprotein A1 (hnRNP A1) and similar proteins. This subgroup corresponds to the RRM1 of hnRNP A1, also termed helix-destabilizing protein, or single-strand RNA-binding protein, or hnRNP core protein A1, and is an abundant eukaryotic nuclear RNA-binding protein that may modulate splice site selection in pre-mRNA splicing. hnRNP A1 has been characterized as a splicing silencer, often acting in opposition to an activating hnRNP H. It silences exons when bound to exonic elements in the alternatively spliced transcripts of c-src, HIV, GRIN1, and beta-tropomyosin. hnRNP A1 can shuttle between the nucleus and the cytoplasm. Thus, it may be involved in transport of cellular RNAs, including the packaging of pre-mRNA into hnRNP particles and transport of poly A+ mRNA from the nucleus to the cytoplasm. The cytoplasmic hnRNP A1 has high affinity with AU-rich elements, whereas the nuclear hnRNP A1 has high affinity with a polypyrimidine stretch bordered by AG at the 3' ends of introns. hnRNP A1 is also involved in the replication of an RNA virus, such as mouse hepatitis virus (MHV), through an interaction with the transcription-regulatory region of viral RNA. hnRNP A1, together with the scaffold protein septin 6, serves as host protein to form a complex with NS5b and viral RNA, and further plays important roles in the replication of Hepatitis C virus (HCV). hnRNP A1 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a long glycine-rich region at the C-terminus. The RRMs of hnRNP A1 play an important role in silencing the exon and the glycine-rich domain is responsible for protein-protein interactions. 	81
410155	cd12762	RRM1_hnRNPA2B1	RNA recognition motif 1 (RRM1) found in heterogeneous nuclear ribonucleoprotein A2/B1 (hnRNP A2/B1) and similar proteins. This subgroup corresponds to the RRM1 of hnRNP A2/B1 which is an RNA trafficking response element-binding protein that interacts with the hnRNP A2 response element (A2RE). Many mRNAs, such as myelin basic protein (MBP), myelin-associated oligodendrocytic basic protein (MOBP), carboxyanhydrase II (CAII), microtubule-associated protein tau, and amyloid precursor protein (APP) are trafficked by hnRNP A2/B1. hnRNP A2/B1 also functions as a splicing factor that regulates alternative splicing of the tumor suppressors, such as BIN1, WWOX, the antiapoptotic proteins c-FLIP and caspase-9B, the insulin receptor (IR), and the RON proto-oncogene among others. Moreover, the overexpression of hnRNP A2/B1 has been described in many cancers. It functions as a nuclear matrix protein involved in RNA synthesis and the regulation of cellular migration through alternatively splicing pre-mRNA. It may play a role in tumor cell differentiation. hnRNP A2/B1 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a long glycine-rich region at the C-terminus. 	81
410156	cd12763	RRM1_hnRNPA3	RNA recognition motif 1 (RRM1) found in heterogeneous nuclear ribonucleoprotein A3 (hnRNP A3) and similar proteins. This subgroup corresponds to the RRM1 of hnRNP A3 which is a novel RNA trafficking response element-binding protein that interacts with the hnRNP A2 response element (A2RE) independently of hnRNP A2 and participates in the trafficking of A2RE-containing RNA. hnRNP A3 can shuttle between the nucleus and the cytoplasm. It contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a long glycine-rich region at the C-terminus. 	81
410157	cd12764	RRM2_SRSF4	RNA recognition motif 2 (RRM2) found in vertebrate serine/arginine-rich splicing factor 4 (SRSF4). This subgroup corresponds to the RRM2 of SRSF4, also termed pre-mRNA-splicing factor SRp75, or SRP001LB, or splicing factor, arginine/serine-rich 4 (SFRS4), a splicing regulatory serine/arginine (SR) protein that plays an important role in both constitutive splicing and alternative splicing of many pre-mRNAs. For instance, it interacts with heterogeneous nuclear ribonucleoproteins, hnRNP G and hnRNP E2, and further regulates the 5' splice site of tau exon 10, whose misregulation causes frontotemporal dementia. SFRS4 also induces production of HIV-1 vpr mRNA through the inhibition of the 5'-splice site of exon 3. In addition, SRSF4 activates splicing of the cardiac troponin T (cTNT) alternative exon by direct interactions with the cTNT exon 5 enhancer RNA. SRSF4 can shuttle between the nucleus and cytoplasm. It contains an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), a glycine-rich region, an internal region homologous to the RRM, and a very long, highly phosphorylated C-terminal RS domain rich in serine-arginine dipeptides. 	97
410158	cd12765	RRM2_SRSF5	RNA recognition motif 2 (RRM2) found in vertebrate serine/arginine-rich splicing factor 5 (SRSF5). This subgroup corresponds to the RRM2 of SRSF5, also termed delayed-early protein HRS, or pre-mRNA-splicing factor SRp40, or splicing factor, arginine/serine-rich 5 (SFRS5), an essential splicing regulatory serine/arginine (SR) protein that regulates both alternative splicing and basal splicing. It is the only SR protein efficiently selected from nuclear extracts (NE) by the splicing enhancer (ESE) and it is necessary for enhancer activation. SRSF5 also functions as a factor required for insulin-regulated splice site selection for protein kinase C (PKC) betaII mRNA. It is involved in the regulation of PKCbetaII exon inclusion by insulin via its increased phosphorylation by a phosphatidylinositol 3-kinase (PI 3-kinase) signaling pathway. Moreover, SRSF5 can regulate alternative splicing in exon 9 of glucocorticoid receptor pre-mRNA in a dose-dependent manner. SRSF5 contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a C-terminal RS domain rich in serine-arginine dipeptides. The specific RNA binding by SRSF5 requires the phosphorylation of its SR domain. 	81
410159	cd12766	RRM2_SRSF6	RNA recognition motif 2 (RRM2) found in vertebrate serine/arginine-rich splicing factor 6 (SRSF6). This subgroup corresponds to the RRM2 of SRSF6, also termed pre-mRNA-splicing factor SRp55, an essential splicing regulatory serine/arginine (SR) protein that preferentially interacts with a number of purine-rich splicing enhancers (ESEs) to activate splicing of the ESE-containing exon. It is the only protein from HeLa nuclear extract or purified SR proteins that specifically binds B element RNA after UV irradiation. SRSF6 may also recognize different types of RNA sites. For instance, it does not bind to the purine-rich sequence in the calcitonin-specific ESE, but binds to a region adjacent to the purine tract. Moreover, cellular levels of SRSF6 may control tissue-specific alternative splicing of the calcitonin/calcitonin gene-related peptide (CGRP) pre-mRNA. SRSF6 contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by a C-terminal RS domain rich in serine-arginine dipeptides. 	73
410160	cd12767	RRM2_SRSF1	RNA recognition motif 2 (RRM2) found in serine/arginine-rich splicing factor 1 (SRSF1) and similar proteins. This subgroup corresponds to the RRM2 of SRSF1, also termed alternative-splicing factor 1 (ASF-1), or pre-mRNA-splicing factor SF2, P33 subunit, a splicing regulatory serine/arginine (SR) protein involved in constitutive and alternative splicing, nonsense-mediated mRNA decay (NMD), mRNA export and translation. It also functions as a splicing-factor oncoprotein that regulates apoptosis and proliferation to promote mammary epithelial cell transformation. SRSF1 is a shuttling SR protein and contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), separated by a long glycine-rich spacer, and a C-terminal SR domain rich in serine-arginine dipeptides. 	84
410161	cd12768	RRM2_SRSF9	RNA recognition motif 2 (RRM2) found in vertebrate serine/arginine-rich splicing factor 9 (SRSF9). This subgroup corresponds to the RRM2 of SRSF9, also termed pre-mRNA-splicing factor SRp30C, an essential splicing regulatory serine/arginine (SR) protein that has been implicated in the activity of many elements that control splice site selection, the alternative splicing of the glucocorticoid receptor beta in neutrophils and in the gonadotropin-releasing hormone pre-mRNA. SRSF9 can also interact with other proteins implicated in alternative splicing, including YB-1, rSLM-1, rSLM-2, E4-ORF4, Nop30, and p32. SRSF9 contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), followed by an unusually short C-terminal RS domain rich in serine-arginine dipeptides. 	84
410162	cd12769	RRM1_HuR	RNA recognition motif 1 (RRM1) found in vertebrate Hu-antigen R (HuR). This subgroup corresponds to the RRM1 of HuR, also termed ELAV-like protein 1 (ELAV-1), a ubiquitously expressed Hu family member. It has a variety of biological functions mostly related to the regulation of cellular response to DNA damage and other types of stress. HuR has an anti-apoptotic function during early cell stress response; it binds to mRNAs and enhances the expression of several anti-apoptotic proteins, such as p21waf1, p53, and prothymosin alpha. Meanwhile, HuR also has pro-apoptotic function by promoting apoptosis when cell death is unavoidable. Furthermore, HuR may be important in muscle differentiation, adipogenesis, suppression of inflammatory response and modulation of gene expression in response to chronic ethanol exposure and amino acid starvation. Like other Hu proteins, HuR contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). RRM1 and RRM2 may cooperate in binding to an AU-rich RNA element (ARE). RRM3 may help to maintain the stability of the RNA-protein complex, and might also bind to poly(A) tails or be involved in protein-protein interactions. 	82
410163	cd12770	RRM1_HuD	RNA recognition motif 1 (RRM1) found in vertebrate Hu-antigen D (HuD). This subgroup corresponds to the RRM1 of HuD, also termed ELAV-like protein 4 (ELAV-4), or paraneoplastic encephalomyelitis antigen HuD, one of the neuronal members of the Hu family. The neuronal Hu proteins play important roles in neuronal differentiation, plasticity and memory. HuD has been implicated in various aspects of neuronal function, such as the commitment and differentiation of neuronal precursors as well as synaptic remodeling in mature neurons. HuD also functions as an important regulator of mRNA expression in neurons by interacting with AU-rich RNA element (ARE) and stabilizing multiple transcripts. Moreover, HuD regulates the nuclear processing/stability of N-myc pre-mRNA in neuroblastoma cells, as well as the neurite elongation and morphological differentiation. HuD specifically binds poly(A) RNA. Like other Hu proteins, HuD contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). RRM1 and RRM2 may cooperate in binding to an ARE. RRM3 may help to maintain the stability of the RNA-protein complex, and might also bind to poly(A) tails or be involved in protein-protein interactions. 	81
410164	cd12771	RRM1_HuB	RNA recognition motif 1 (RRM1) found in vertebrate Hu-antigen B (HuB). This subgroup corresponds to the RRM1 of HuB, also termed ELAV-like protein 2 (ELAV-2), or ELAV-like neuronal protein 1, or nervous system-specific RNA-binding protein Hel-N1 (Hel-N1), one of the neuronal members of the Hu family. The neuronal Hu proteins play important roles in neuronal differentiation, plasticity and memory. HuB is also expressed in gonads and is up-regulated during neuronal differentiation of embryonic carcinoma P19 cells. Like other Hu proteins, HuB contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). RRM1 and RRM2 may cooperate in binding to an AU-rich RNA element (ARE). RRM3 may help to maintain the stability of the RNA-protein complex, and might also bind to poly(A) tails or be involved in protein-protein interactions. 	83
410165	cd12772	RRM1_HuC	RNA recognition motif 1 (RRM1) found in vertebrate Hu-antigen C (HuC). This subgroup corresponds to the RRM1 of HuC, also termed ELAV-like protein 3 (ELAV-3), or paraneoplastic cerebellar degeneration-associated antigen, or paraneoplastic limbic encephalitis antigen 21 (PLE21), one of the neuronal members of the Hu family. The neuronal Hu proteins play important roles in neuronal differentiation, plasticity and memory. Like other Hu proteins, HuC contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). RRM1 and RRM2 may cooperate in binding to an AU-rich RNA element (ARE). The AU-rich element binding of HuC can be inhibited by flavonoids. RRM3 may help to maintain the stability of the RNA-protein complex, and might also bind to poly(A) tails or be involved in protein-protein interactions. 	85
410166	cd12773	RRM2_HuR	RNA recognition motif 2 (RRM2) found in vertebrate Hu-antigen R (HuR). This subgroup corresponds to the RRM2 of HuR, also termed ELAV-like protein 1 (ELAV-1), the ubiquitously expressed Hu family member. It has a variety of biological functions mostly related to the regulation of cellular response to DNA damage and other types of stress. HuR has an anti-apoptotic function during early cell stress response. It binds to mRNAs and enhances the expression of several anti-apoptotic proteins, such as p21waf1, p53, and prothymosin alpha. HuR also has pro-apoptotic function by promoting apoptosis when cell death is unavoidable. Furthermore, HuR may be important in muscle differentiation, adipogenesis, suppression of inflammatory response and modulation of gene expression in response to chronic ethanol exposure and amino acid starvation. Like other Hu proteins, HuR contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). RRM1 and RRM2 may cooperate in binding to an AU-rich RNA element (ARE). RRM3 may help to maintain the stability of the RNA-protein complex, and might also bind to poly(A) tails or be involved in protein-protein interactions. 	84
410167	cd12774	RRM2_HuD	RNA recognition motif 2 (RRM2) found in vertebrate Hu-antigen D (HuD). This subgroup corresponds to the RRM2 of HuD, also termed ELAV-like protein 4 (ELAV-4), or paraneoplastic encephalomyelitis antigen HuD, one of the neuronal members of the Hu family. The neuronal Hu proteins play important roles in neuronal differentiation, plasticity and memory. HuD has been implicated in various aspects of neuronal function, such as the commitment and differentiation of neuronal precursors as well as synaptic remodeling in mature neurons. HuD also functions as an important regulator of mRNA expression in neurons by interacting with AU-rich RNA element (ARE) and stabilizing multiple transcripts. Moreover, HuD regulates the nuclear processing/stability of N-myc pre-mRNA in neuroblastoma cells and also regulates the neurite elongation and morphological differentiation. HuD specifically binds poly(A) RNA. Like other Hu proteins, HuD contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). RRM1 and RRM2 may cooperate in binding to an ARE. RRM3 may help to maintain the stability of the RNA-protein complex, and might also bind to poly(A) tails or be involved in protein-protein interactions. 	84
410168	cd12775	RRM2_HuB	RNA recognition motif 2 (RRM2) found in vertebrate Hu-antigen B (HuB). This subgroup corresponds to the RRM2 of HuB, also termed ELAV-like protein 2 (ELAV-2), or ELAV-like neuronal protein 1, or nervous system-specific RNA-binding protein Hel-N1 (Hel-N1), one of the neuronal members of the Hu family. The neuronal Hu proteins play important roles in neuronal differentiation, plasticity and memory. HuB is also expressed in gonads. It is up-regulated during neuronal differentiation of embryonic carcinoma P19 cells. Like other Hu proteins, HuB contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). RRM1 and RRM2 may cooperate in binding to an AU-rich RNA element (ARE). RRM3 may help to maintain the stability of the RNA-protein complex, and might also bind to poly(A) tails or be involved in protein-protein interactions. 	84
241220	cd12776	RRM2_HuC	RNA recognition motif 2 (RRM2) found in vertebrate Hu-antigen C (HuC). This subgroup corresponds to the RRM2 of HuC, also termed ELAV-like protein 3 (ELAV-3), or paraneoplastic cerebellar degeneration-associated antigen, or paraneoplastic limbic encephalitis antigen 21 (PLE21), one of the neuronal members of the Hu family. The neuronal Hu proteins play important roles in neuronal differentiation, plasticity and memory. Like other Hu proteins, HuC contains three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). RRM1 and RRM2 may cooperate in binding to an AU-rich RNA element (ARE). The AU-rich element binding of HuC can be inhibited by flavonoids. RRM3 may help to maintain the stability of the RNA-protein complex, and might also bind to poly(A) tails or be involved in protein-protein interactions. 	81
410169	cd12777	RRM1_PTBP1	RNA recognition motif 1 (RRM1) found in vertebrate polypyrimidine tract-binding protein 1 (PTB). This subgroup corresponds to the RRM1 of PTB, also known as 58 kDa RNA-binding protein PPTB-1 or heterogeneous nuclear ribonucleoprotein I (hnRNP I), an important negative regulator of alternative splicing in mammalian cells. PTB also functions at several other aspects of mRNA metabolism, including mRNA localization, stabilization, polyadenylation, and translation. PTB contains four RNA recognition motifs (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). RRM1 and RRM2 are independent from each other and separated by flexible linkers. By contrast, there is an unusual and conserved interdomain interaction between RRM3 and RRM4. It is widely held that only RRMs 3 and 4 are involved in RNA binding and RRM2 mediates PTB homodimer formation. However, new evidence shows that the RRMs 1 and 2 also contribute substantially to RNA binding. Moreover, PTB may not always dimerize to repress splicing. It is a monomer in solution. 	81
410170	cd12778	RRM1_PTBP2	RNA recognition motif 1 (RRM1) found in vertebrate polypyrimidine tract-binding protein 2 (PTBP2). This subgroup corresponds to the RRM1 of PTBP2, also known as neural polypyrimidine tract-binding protein or neurally-enriched homolog of PTB (nPTB), highly homologous to polypyrimidine tract binding protein (PTB) and perhaps specific to the vertebrates. Unlike PTB, PTBP2 is enriched in the brain and in some neural cell lines. It binds more stably to the downstream control sequence (DCS) RNA than PTB does but is a weaker repressor of splicing in vitro. PTBP2 also greatly enhances the binding of two other proteins, heterogeneous nuclear ribonucleoprotein (hnRNP) H and KH-type splicing-regulatory protein (KSRP), to the DCS RNA. The binding properties of PTBP2 and its reduced inhibitory activity on splicing imply roles in controlling the assembly of other splicing-regulatory proteins. PTBP2 contains four RNA recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	82
410171	cd12779	RRM1_ROD1	RNA recognition motif 1 (RRM1) found in vertebrate regulator of differentiation 1 (Rod1). This subgroup corresponds to the RRM1 of Rod1, the protein encoded by ROD1, a mammalian polypyrimidine tract binding protein (PTB) homolog of a regulator of differentiation in the fission yeast Schizosaccharomyces pombe, where the nrd1 gene encodes an RNA binding protein that negatively regulates the onset of differentiation. ROD1 is predominantly expressed in hematopoietic cells or organs. It might play a role in controlling differentiation in mammals. Rod1 contains four repeats of RNA recognition motifs (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and has RNA binding activity. 	90
410172	cd12780	RRM1_hnRNPL	RNA recognition motif 1 (RRM1) found in vertebrate heterogeneous nuclear ribonucleoprotein L (hnRNP-L). This subgroup corresponds to the RRM1 of hnRNP-L, a higher eukaryotic specific subunit of human KMT3a (also known as HYPB or hSet2) complex required for histone H3 Lys-36 trimethylation activity. It plays both nuclear and cytoplasmic roles in mRNA export of intronless genes, IRES-mediated translation, mRNA stability, and splicing. hnRNP-L shows significant sequence homology to polypyrimidine tract-binding protein (PTB or hnRNP I). Both hnRNP-L and PTB are localized in the nucleus but excluded from the nucleolus. hnRNP-L is an RNA-binding protein with three RNA recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	80
410173	cd12781	RRM1_hnRPLL	RNA recognition motif 1 (RRM1) found in vertebrate heterogeneous nuclear ribonucleoprotein L-like (hnRNP-LL). This subgroup corresponds to the RRM1 of hnRNP-LL, which plays a critical and unique role in the signal-induced regulation of CD45 and acts as a global regulator of alternative splicing in activated T cells. It is closely related in domain structure and sequence to heterogeneous nuclear ribonucleoprotein L (hnRNP-L), which is an abundant nuclear, multifunctional RNA-binding protein with three RNA-recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	84
410174	cd12782	RRM2_PTBP1	RNA recognition motif 2 (RRM2) found in vertebrate polypyrimidine tract-binding protein 1 (PTB). This subgroup corresponds to the RRM2 of PTB, also known as 58 kDa RNA-binding protein PPTB-1 or heterogeneous nuclear ribonucleoprotein I (hnRNP I), an important negative regulator of alternative splicing in mammalian cells. PTB also functions at several other aspects of mRNA metabolism, including mRNA localization, stabilization, polyadenylation, and translation. PTB contains four RNA recognition motifs (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). RRM1 and RRM2 are independent from each other and separated by flexible linkers. By contrast, there is an unusual and conserved interdomain interaction between RRM3 and RRM4. It is widely held that only RRMs 3 and 4 are involved in RNA binding and RRM2 mediates PTB homodimer formation. However, new evidence shows that the RRMs 1 and 2 also contribute substantially to RNA binding. Moreover, PTB may not always dimerize to repress splicing. It is a monomer in solution. 	108
410175	cd12783	RRM2_PTBP2	RNA recognition motif 2 (RRM2) found in vertebrate polypyrimidine tract-binding protein 2 (PTBP2). This subgroup corresponds to the RRM2 of PTBP2, also known as neural polypyrimidine tract-binding protein or neurally-enriched homolog of PTB (nPTB), highly homologous to polypyrimidine tract binding protein (PTB) and perhaps specific to the vertebrates. Unlike PTB, PTBP2 is enriched in the brain and in some neural cell lines. It binds more stably to the downstream control sequence (DCS) RNA than PTB does but is a weaker repressor of splicing in vitro. PTBP2 also greatly enhances the binding of two other proteins, heterogeneous nuclear ribonucleoprotein (hnRNP) H and KH-type splicing-regulatory protein (KSRP), to the DCS RNA. The binding properties of PTBP2 and its reduced inhibitory activity on splicing imply roles in controlling the assembly of other splicing-regulatory proteins. PTBP2 contains four RNA recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	107
410176	cd12784	RRM2_ROD1	RNA recognition motif 2 (RRM2) found in vertebrate regulator of differentiation 1 (Rod1). This subgroup corresponds to the RRM2 of Rod1, the protein encoded by ROD1, a mammalian polypyrimidine tract binding protein (PTB) homolog of a regulator of differentiation in the fission yeast Schizosaccharomyces pombe, where the nrd1 gene encodes an RNA binding protein that negatively regulates the onset of differentiation. ROD1 is predominantly expressed in hematopoietic cells or organs. It might play a role in controlling differentiation in mammals. Rod1 contains four repeats of RNA recognition motifs (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and has RNA binding activity. 	108
410177	cd12785	RRM2_hnRNPL	RNA recognition motif 2 (RRM2) found in vertebrate heterogeneous nuclear ribonucleoprotein L (hnRNP-L). This subgroup corresponds to the RRM2 of hnRNP-L, a higher eukaryotic specific subunit of human KMT3a (also known as HYPB or hSet2) complex required for histone H3 Lys-36 trimethylation activity. It plays both nuclear and cytoplasmic roles in mRNA export of intronless genes, IRES-mediated translation, mRNA stability, and splicing. hnRNP-L shows significant sequence homology to polypyrimidine tract-binding protein (PTB or hnRNP I). Both hnRNP-L and PTB are localized in the nucleus but excluded from the nucleolus. hnRNP-L is an RNA-binding protein with three RNA recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	100
241230	cd12786	RRM2_hnRPLL	RNA recognition motif 2 (RRM2) found in vertebrate heterogeneous nuclear ribonucleoprotein L-like (hnRNP-LL). The subgroup corresponds to the RRM2 of hnRNP-LL which plays a critical and unique role in the signal-induced regulation of CD45 and acts as a global regulator of alternative splicing in activated T cells. It is closely related in domain structure and sequence to heterogeneous nuclear ribonucleoprotein L (hnRNP-L), which is an abundant nuclear, multifunctional RNA-binding protein with three RNA-recognition motifs (RRMs), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain). 	96
213347	cd12787	RasGAP_plexin_B	Ras-GTPase Activating Domain of type B plexins. Plexins form a conserved family of transmembrane receptors for semaphorins and may be the ancestors of semaphorins. Plexins are divided into four types (A-D) according to sequence similarity. There are three members of the Plexin-B subfamily, namely B1, B2 and B3. Plexins-B1, B2 and B3 are receptors for Sema4D, Sema4C and Sema4G, and Sema5A, respectively. The activation of plexin-B1 by Sema4D produces an acute collapse of axonal growth cones in hippocampal and retinal neurons over the early stages of neurite outgrowth and promotes branching and complexity. By signaling the effect of Sema4C and Sema4G, the plexin-B2 receptor is critically involved in neural tube closure and cerebellar granule cell development. Plexin-B3, the receptor of Sema5A, is a highly potent stimulator of neurite outgrowth of primary murine cerebellar neurons. Plexin-B3 has been linked to verbal performance and white matter volume in human brain. Small GTPases play important roles in plexin-B signaling. Plexin-B1 activates Rho through Rho-specific guanine nucleotide exchange factors, leading to neurite retraction. Plexin-B1 possesses an intrinsic GTPase-activating protein activity for R-Ras and induces growth cone collapse through R-Ras inactivation. Plexins contain a C-terminal RasGAP domain, which functions as an enhancer of the hydrolysis of GTP that is bound to Ras-GTPases. Plexins display GAP activity towards the Ras homolog Rap. Although the Rho (Ras homolog) GTPases are most closely related to members of the Ras family, RhoGAP and RasGAP show no sequence homology at their amino acid level. RasGTPases function as molecular switches in a large number of signaling pathways. When bound to GTP they are in the on state and when bound to GDP they are in the off state. The RasGAP domain speeds up the hydrolysis of GTP in Ras-like proteins acting as a negative regulator.	391
213348	cd12788	RasGAP_plexin_D1	Ras-GTPase Activating Domain of plexin-D1. Plexins form a conserved family of transmembrane receptors for semaphorins and may be the ancestors of semaphorins. Plexins are divided into four types (A-D) according to sequence similarity. Plexin-D1 has been identified as the receptor of Sema3E. It binds to Sema3E directly with high affinity. Sema3E is implicated in axonal path finding and inhibition of developmental and postischemic angiogenesis. Plexin-D1 is broadly expressed on tumor vessels and tumor cells in a number of different types of human tumors. The Plexin-D1 and Sema3E interaction inhibits tumor growth but promotes invasiveness and metastasis. Plexins contain a C-terminal RasGAP domain, which functions as an enhancer of the hydrolysis of GTP that is bound to Ras-GTPases. Plexins display GAP activity towards the Ras homolog Rap. Although the Rho (Ras homolog) GTPases are most closely related to members of the Ras family, RhoGAP and RasGAP show no sequence homology at their amino acid level. RasGTPases function as molecular switches in a large number of signaling pathways. When bound to GTP they are in the on state and when bound to GDP they are in the off state. The RasGAP domain speeds up the hydrolysis of GTP in Ras-like proteins acting as a negative regulator.	419
213349	cd12789	RasGAP_plexin_C1	Ras-GTPase Activating Domain of plexin-C1. Plexins form a conserved family of transmembrane receptors for semaphorins and may be the ancestors of semaphorins. Plexins are divided into four types (A-D) according to sequence similarity. Plexin-C1 has been identified as the receptor of semaphorin 7A, which plays regulatory roles in both the immune and nervous systems. Unlike other semaphorins which act as repulsive guidance cues, Sema7A enhances central and peripheral axon growth and is required for proper axon tract formation during embryonic development. Plexin-C1 is a potential tumor suppressor for melanoma progression. The expression of Plexin-C1 is diminished or absent in human melanoma cell lines. Cofilin, an actin-binding protein involved in cell migration, is a downstream target of Sema7A and Plexin-C1 signaling. Melanoma invasion and metastasis may be promoted through the loss of Plexin-C1 inhibitory signaling on cofilin activation. Plexins contain a C-terminal RasGAP domain, which functions as an enhancer of the hydrolysis of GTP that is bound to Ras-GTPases. Plexins display GAP activity towards the Ras homolog Rap. Although the Rho (Ras homolog) GTPases are most closely related to members of the Ras family, RhoGAP and RasGAP show no sequence homology at their amino acid level. RasGTPases function as molecular switches in a large number of signaling pathways. When bound to GTP they are in the on state and when bound to GDP they are in the off state. The RasGAP domain speeds up the hydrolysis of GTP in Ras-like proteins acting as a negative regulator.	393
213350	cd12790	RasGAP_plexin_A	Ras-GTPase Activating Domain of type A plexins. Plexins form a conserved family of transmembrane receptors for semaphorins and may be the ancestors of semaphorins. They are divided into four types (A-D) according to sequence similarity. In vertebrates, there are four type A plexins (A1-A4) that serve as the co-receptors for neuropilins to mediate the signaling of class 3 semaphorins except Sema3E, which signals through Plexin-D1. Plexins serve as direct receptors for several other members of the semaphorin family: class 1 and class 6 semaphorins signal through type A plexins, which mediate diverse biological functions including axon guidance, cardiovascular development, and immune function. Guanylyl cyclase Gyc76C and Off-track kinase (OTK), a putative receptor tyrosine kinase, modulate Sema1a and Plexin-A mediated axon repulsion. In their complex with Sema6s, type A plexins serve as signal-transducing subunits. An increasing number of molecules that interact with the intracellular region of Plexin-A have been identified; among them are IgCAMs (in axon guidance events) and Trem2-DAP12 (in immune responses). Plexins contain a C-terminal RasGAP domain, which functions as an enhancer of the hydrolysis of GTP that is bound to Ras-GTPases. Plexins display GAP activity towards the Ras homolog Rap. Although the Rho (Ras homolog) GTPases are most closely related to members of the Ras family, RhoGAP and RasGAP show no sequence homology at their amino acid level. RasGTPases function as molecular switches in a large number of signaling pathways. When bound to GTP they are in the on state and when bound to GDP they are in the off state. The RasGAP domain speeds up the hydrolysis of GTP in Ras-like proteins acting as a negative regulator.	385
213351	cd12791	RasGAP_plexin_B3	Ras-GTPase Activating Domain of plexin-B3. Plexins form a conserved family of transmembrane receptors for semaphorins and may be the ancestors of semaphorins. Plexins are divided into four types (A-D) according to sequence similarity. Plexin-B3 is the receptor of semaphorin 5A. It is a highly potent stimulator of neurite outgrowth of primary murine cerebellar neurons. Plexin-B3 has been linked to verbal performance and white matter volume in human brain. Furthermore, Sema5A and plexin-B3 have been implicated in the progression of various types of cancer. They play an important role in the invasion and metastasis of gastric carcinoma. The protein and mRNA expression of Sema5A and its receptor plexin-B3 increased gradually in non-neoplastic mucosa, primary gastric carcinoma, and lymph node metastasis, and their expression is correlated. The stimulation of plexin-B3 by Sema5A binding in human glioma cells results in the inhibition of cell migration and invasion. Plexins contain a C-terminal RasGAP domain, which functions as an enhancer of the hydrolysis of GTP that is bound to Ras-GTPases. Plexins display GAP activity towards the Ras homolog Rap. Although the Rho (Ras homolog) GTPases are most closely related to members of the Ras family, RhoGAP and RasGAP show no sequence homology at their amino acid level. RasGTPases function as molecular switches in a large number of signaling pathways. When bound to GTP they are in the on state and when bound to GDP they are in the off state. The RasGAP domain speeds up the hydrolysis of GTP in Ras-like proteins acting as a negative regulator.	397
213352	cd12792	RasGAP_plexin_B2	Ras-GTPase Activating Domain of plexin-B2. Plexins form a conserved family of transmembrane receptors for semaphorins and may be the ancestors of semaphorins. Plexins are divided into four types (A-D) according to sequence similarity. Plexin-B2 serves as the receptor of Sema4C and Sema4G. By signaling the effect of Sema4C and Sema4G, the plexin-B2 receptor is critically involved in neural tube closure and cerebellar granule cell development. Mice lacking Plexin-B2 demonstrated defects in closure of the neural tube and disorganization of the embryonic brain. In developing kidney, Sema4C and Plexin-B2 signaling modulates ureteric branching. Plexin-B2 is expressed both in the pretubular aggregates and the ureteric epithelium in the developing kidney. Deletion of Plexin-B2 results in renal hypoplasia and occasional double ureters. Plexins contain a C-terminal RasGAP domain, which functions as an enhancer of the hydrolysis of GTP that is bound to Ras-GTPases. Plexins display GAP activity towards the Ras homolog Rap. Although the Rho (Ras homolog) GTPases are most closely related to members of the Ras family, RhoGAP and RasGAP show no sequence homology at their amino acid level. RasGTPases function as molecular switches in a large number of signaling pathways. When bound to GTP they are in the on state and when bound to GDP they are in the off state. The RasGAP domain speeds up the hydrolysis of GTP in Ras-like proteins acting as a negative regulator.	400
213353	cd12793	RasGAP_plexin_B1	Ras-GTPase Activating Domain of plexin-B1. Plexins form a conserved family of transmembrane receptors for semaphorins and may be the ancestors of semaphorins. Plexins are divided into four types (A-D) according to sequence similarity. Plexin-B1 serves as the Semaphorin 4D receptor and functions as a regulator of developing neurons and a tumor suppressor protein for melanoma. The Sema4D and plexin-B1 signaling complex regulates dendritic and axonal complexity. The activation of Plexin-B1 by Sema4D produces an acute collapse of axonal growth cones in hippocampal and retinal neurons over the early stages of neurite outgrowth and promotes branching and complexity. As a tumor suppressor, plexin-B1 abrogates activation of the oncogenic receptor, c-Met, by its ligand, hepatocyte growth factor (HGF), in melanoma. Furthermore, plexin-B1 suppresses integrin-dependent migration and activation of pp125FAK and inhibits Rho activity. Plexin-B1 is highly expressed in endothelial cells and its activation by Sema4D elicits a potent proangiogenic response. Plexins contain a C-terminal RasGAP domain, which functions as an enhancer of the hydrolysis of GTP that is bound to Ras-GTPases. Plexins display GAP activity towards the Ras homolog Rap. Although the Rho (Ras homolog) GTPases are most closely related to members of the Ras family, RhoGAP and RasGAP show no sequence homology at their amino acid level. RasGTPases function as molecular switches in a large number of signaling pathways. When bound to GTP they are in the on state and when bound to GDP they are in the off state. The RasGAP domain speeds up the hydrolysis of GTP in Ras-like proteins acting as a negative regulator.	394
240614	cd12794	Hsm3_like	Hsm3 is a yeast proteasome chaperone of the 19S regulatory particle and related proteins. This group contains proteins related to the Hsm3 protein (Yeast Proteasome Interacting Protein) of Saccharomyces cerevisiae. S. cerevisiae Hsm3 is a chaperone of regulatory particles involved in proteasome assembly. The 26S Proteasome is a large, 2.5 MDa complex comprised of at least 33 subunits, and relies on chaperones to facilitate correct assembly. The proteasome contains a cylindrical 20S core particle and 1-2 19S regulatory particles, comprised of AAA-ATPase and non-ATPase subunits. The proteasome acts in ubiquitin-dependent proteolysis. The 19S RP targets and opens the ubiquitin-tagged substrate and releases ubiquitin. Hsm3 acts as a 19S chaperone, binding to the C-terminal domain of Rpt1 (one of the 6 ATPase subunits of the 19S regulatory particle). Hsm3 has a C-shape composed of 11 HEAT repeats. Mutations in the Hsm3-Rpt interface disrupt formation of the 26S Proteasome complex.	455
240613	cd12795	FILIA_N_like	FILIA-N KH-like domain. This group contains the N-terminal atypical KH domain of FILIA and related domains. FILIA is expressed in oocytes and embryo, and contains an atypical KH domain at the N-terminus with an N-terminal extension that interacts with RNA. RNA-binding may mediate RNA transcript regulation in oogenesis and embryogenesis. FILIA-N differs from typical KH domains by forming a stable dimer in solution and crystal structure.	114
240609	cd12796	LbR_Ice_bind	Ice-binding protein, left-handed beta-roll. The ice-binding protein of the grass Lolium perenne (LpIBP) discourages the recrystallization of ice. Ice-binding proteins produced by organisms to prevent the growth of ice are termed anti-freeze proteins. LpIBP consists of an unusual left-handed beta roll. Ice-binding is mediated by a flat beta-sheet on one side of the helix.	114
410984	cd12797	M23_peptidase	M23 family metallopeptidase, also known as beta-lytic metallopeptidase, and similar proteins. This model describes the metallopeptidase M23 family, which includes beta-lytic metallopeptidase and lysostaphin. Members of this family are zinc endopeptidases that lyse bacterial cell wall peptidoglycans; they cleave either the N-acylmuramoyl-Ala bond between the cell wall peptidoglycan and the cross-linking peptide (e.g. beta-lytic endopeptidase) or a bond within the cross-linking peptide (e.g. staphylolysin and lysostaphin). Beta-lytic metallopeptidase, formerly known as beta-lytic protease, has a preference for cleavage of Gly-X bonds and favors hydrophobic or apolar residues on either side. It inhibits growth of sensitive organisms and may potentially serve as an antimicrobial agent. Lysostaphin, produced by members of the genus Staphylococcus, cleaves pentaglycine cross-bridges of cell wall peptidoglycan, acting as an autolysin to maintain cell wall metabolism or as a toxin and weapon against competing strains. Staphylolysin (also known as LasA) is implicated in a range of processes related to Pseudomonas virulence, including stimulating shedding of the ectodomain of cell surface heparan sulphate proteoglycan syndecan-1, and elastin degradation in connective tissue. Its active site is less constricted and contains a five-coordinate zinc ion with trigonal bipyramidal geometry and two metal-bound water molecules, possibly contributing to its activity against a wider range of substrates than those used by related lytic enzymes, consistent with its multiple roles in Pseudomonas virulence. The family includes members that do not appear to have the conserved zinc-binding site and might be lipoproteins lacking proteolytic activity.	85
213998	cd12798	Alt_A1	Alternaria alternata allergen Alt a 1. Alt a 1 defines a new homologous protein family with unknown function exclusively found in fungi. The unique structure of Alt a 1 contains intramolecular disulfide bonds that are conserved among the Alt a 1 homologs. Residues reported to be IgE antibody-binding epitopes are exposed through dimerization via a conserved disulfide bond and hydrophobic and polar interactions. Further mechanistic structure/function studies will give insight into immunologic studies directed toward new forms of immunotherapy for Alternaria species-sensitive allergic patients.	132
340366	cd12799	pesticin_lyz-like	lysozyme-like C-terminal domain of pesticin and related proteins. Pesticin (Pst) is an anti-bacterial toxin produced by Yersinia pestis that acts through uptake by the target related bacteria and the hydrolysis of peptidoglycan in the periplasm. Pst contains an N-terminal translocation domain, an intermediate receptor binding domain, and a phage-lysozyme like C-terminal activity domain. Bacteriocins such as pesticin are produced by gram-negative bacteria to attack related bacterial strains. Pst is transported to the periplasm via FyuA, an outer-membrane receptor of Y. pestis and E. coli, where it hydrolyzes peptidoglycan via the cleavage of N-acetylmuramic acid and C4 of N-acetylglucosamine. Disruption of the peptidoglycan layer renders the bacteria vulnerable to lysis via osmotic pressure. The pesticin C-terminal domain resembles the lysozyme-like family, which includes soluble lytic transglycosylases (SLT), goose egg-white lysozymes (GEWL), hen egg-white lysozymes (HEWL), chitinases, bacteriophage lambda lysozymes, endolysins, autolysins, and chitosanases. All the members are involved in the hydrolysis of beta-1,4-linked polysaccharides.	129
213999	cd12800	Sol_i_2	Sol i 2, a major allergen from fire ant venom. Sol i 2, one of four known potent allergens from the venom of red imported fire ant, is a powerful trigger of anaphylaxis. It causes production of IgE antibody in many individuals stung by fire ants. The closest structural homolog of Sol i 2 is the sequence-unrelated odorant binding protein and pheromone binding protein LUSH of the fruit fly Drosophila, suggesting a possible similar biological function.	118
214000	cd12801	HopAB_KID	Kinase-interacting domains of the HopAB family of Type III Effector proteins. HopAB family members are type III effector proteins that are secreted by the plant pathogen Pseudomonas syringae into the host plant to inhibit its immune system and facilitate the spread of the pathogen. AvrPtoB, also called HopAB3, is the best studied member of the family. It suppresses host basal defenses by interfering with PAMP (pathogen-associated molecular signature)-triggered immunity (PTI) through binding and inhibiting BAK1, a kinase which serves to activate defense signaling. It also recognizes the kinase Pto to activate effector-triggered immunity (ETI). AvrPtoB contains an N-terminal region that contains two kinase-interacting domains (KID) and a C-terminal E3 ligase domain. The first KID recognizes the PTI-associated kinase Bti9 as well as Pto, and is referred to as the Pto-binding domain (PID). The second KID interacts with BAK1 and FLS2, which are leucine-rich repeat-containing receptor-like kinases, and is called the BAK1-interacting domain (BID). This family also contains a unique member, HopPmaL, which is shorter and lacks the C-terminal E3 ligase domain.	77
214001	cd12802	HopAB_PID	Pto-interacting domain of the HopAB family of Type III Effector proteins. HopAB family members are type III effector proteins that are secreted by the plant pathogen Pseudomonas syringae into the host plant to inhibit its immune system and facilitate the spread of the pathogen. AvrPtoB, also called HopAB3, is the best studied member of the family. It suppresses host basal defenses by interfering with PAMP (pathogen-associated molecular signature)-triggered immunity (PTI) through binding and inhibiting BAK1, a kinase which serves to activate defense signaling. It also recognizes the kinase Pto to activate effector-triggered immunity (ETI). AvrPtoB contains an N-terminal region that contains two kinase-interacting domains (KID) and a C-terminal E3 ligase domain. The first KID recognizes the PTI-associated kinase Bti9 as well as Pto, and is referred to as the Pto-binding domain (PID). The second KID interacts with BAK1 and FLS2, which are leucine-rich repeat-containing receptor-like kinases, and is called the BAK1-interacting domain (BID). This family also contains a unique member, HopPmaL, which is shorter and lacks the C-terminal E3 ligase domain.	79
214002	cd12803	HopAB_BID	BAK1-interacting domain of the HopAB family of Type III Effector proteins. HopAB family members are type III effector proteins that are secreted by the plant pathogen Pseudomonas syringae into the host plant to inhibit its immune system and facilitate the spread of the pathogen. AvrPtoB, also called HopAB3, is the best studied member of the family. It suppresses host basal defenses by interfering with PAMP (pathogen-associated molecular signature)-triggered immunity (PTI) through binding and inhibiting BAK1, a kinase which serves to activate defense signaling. It also recognizes the kinase Pto to activate effector-triggered immunity (ETI). AvrPtoB contains an N-terminal region that contains two kinase-interacting domains (KID) and a C-terminal E3 ligase domain. The first KID recognizes the PTI-associated kinase Bti9 as well as Pto, and is referred to as the Pto-binding domain (PID). The second KID interacts with BAK1 and FLS2, which are leucine-rich repeat-containing receptor-like kinases, and is called the BAK1-interacting domain (BID). This family also contains a unique member, HopPmaL, which is shorter and lacks the C-terminal E3 ligase domain.	80
214003	cd12804	AKAP10_AKB	PKA-binding (AKB) domain of A Kinase Anchor Protein 10. AKAPs coordinate the specificity of PKA signaling by facilitating the localization of the kinase to subcellular sites through their binding to regulatory (R) subunits of PKA. AKAP-10, also called PRKA10 or Dual-specific AKAP 2 (D-AKAP2), is a multisubunit protein containing two regulator of G protein signaling (RGS)-like domains and a PKA-binding (AKB) domain. The AKB domain of AKAP10 can bind to the dimerization/docking (D/D) domains of both RI and RII regulatory subunits of PKA. This model also includes a C-terminal PDZ-binding motif that binds to PDZK1 and NHERF-1, allowing AKAP10 to link indirectly to membrane proteins. Mutations in AKAP10 can alter its binding to R subunits, which may alter the targeting of PKA; some AKAP10 mutations are associated with abnormalities including hypertension, increased risk of severe arrhythmias during kidney transplantation, and familial breast cancer.	45
214004	cd12805	Allergen_V_VI	Group V, VI major allergens from grass, including Phl p 5, Phl p 6, Pha a 5 and Lol p 5. This family contains major allergens from various grass pollen, including Phl p 5 and Phl p 6 (timothy grass), Lol p 5 (rye grass) and Pha a 5 (canary grass). They induce allergic rhinitis and bronchial asthma in millions of allergic patients worldwide. These group V and group VI grass-pollen allergens belong to a new class of protease-resistant four-helix-bundle domains, which also have internal helix-turn-helix homology pointing to a special type of four-helix bundle topology, defined as twinned two-helix bundle. IgE binding experiments with recombinant Phl p 6 fragments indicated that the N terminus of the allergen is required for IgE recognition. Immunotherapy treatment for these allergies generally involves administration of grass pollen extracts which induce an initial rise in specific immunoglobulin E (sIgE) production followed by a progressive decline during the treatment.	85
214005	cd12806	Esterase_713_like	Novel bacterial esterase that cleaves esters on halogenated cyclic compounds. This family contains proteins similar to a novel bacterial esterase (Alcaligenes esterase 713) with the alpha/beta hydrolase fold but does not contain the GXSXXG pentapeptide around the active site serine residue as commonly seen in other enzymes of this class. Esterase 713 shows negligible sequence homology to other esterase and lipase enzymes. It is active as a dimer and cleaves esters on halogenated cyclic compounds though its natural substrate is unknown. This enzyme is possibly exported from the cytosol to the periplasmic space. A large majority of sequences in this family have yet to be characterized.	261
214006	cd12807	Esterase_713	Novel bacterial esterase 713 that cleaves esters on halogenated cyclic compounds. This family contains proteins similar to a novel bacterial esterase (esterase 713) with the alpha/beta hydrolase fold that cleaves esters on halogenated cyclic compounds. This Alcaligenes esterase, however, does not contain the GXSXXG pentapeptide around the active site serine residue as seen in other esterase families. This enzyme is active as a dimer though its natural substrate is unknown. It has two distinct disulfide bridges; one formed between adjacent cysteines appears to facilitate the correct formation of the oxyanion cleft in the catalytic site. Esterase 713 also resembles human pancreatic lipase in its location of the acidic residue of the catalytic triad. It is possibly exported from the cytosol to the periplasmic space. A large majority of sequences in this family have yet to be characterized.	315
214007	cd12808	Esterase_713_like-1	Uncharacterized enzymes similar to novel bacterial esterase that cleaves esters on halogenated cyclic compounds. This family contains uncharacterized proteins similar to a novel bacterial esterase (Alcaligenes esterase 713) with the alpha/beta hydrolase fold but does not contain the GXSXXG pentapeptide around the active site serine residue as commonly seen in other enzymes of this class. Esterase 713 shows negligible sequence homology to other esterase and lipase enzymes. It is active as a dimer and cleaves esters on halogenated cyclic compounds though its natural substrate is unknown.	309
214008	cd12809	Esterase_713_like-2	Uncharacterized enzymes similar to novel bacterial esterase that cleaves esters on halogenated cyclic compounds. This family contains uncharacterized proteins similar to a novel bacterial esterase (Alcaligenes esterase 713) with the alpha/beta hydrolase fold but does not contain the GXSXXG pentapeptide around the active site serine residue as commonly seen in other enzymes of this class. Esterase 713 shows negligible sequence homology to other esterase and lipase enzymes. It is active as a dimer and cleaves esters on halogenated cyclic compounds though its natural substrate is unknown.	280
214009	cd12810	Esterase_713_like-3	Uncharacterized enzymes similar to novel bacterial esterase that cleaves esters on halogenated cyclic compounds. This family contains uncharacterized proteins similar to a novel bacterial esterase (Alcaligenes esterase 713) with the alpha/beta hydrolase fold but does not contain the GXSXXG pentapeptide around the active site serine residue as commonly seen in other enzymes of this class. Esterase 713 shows negligible sequence homology to other esterase and lipase enzymes. It is active as a dimer and cleaves esters on halogenated cyclic compounds though its natural substrate is unknown.	328
411995	cd12811	MALA	Mala s 1 allergenic protein and similar proteins. This family includes the yeast Malassezia sympodialis allergen Mala s 1 which is localized in the cell wall and exposed on the cell surface. It can elicit specific IgE and T-cell activity in patients with atopic eczema (AE), a chronic inflammatory disease. Mala s 1 does not show any significant sequence homology to characterized proteins. However, its structure is a beta-propeller which is a novel fold among allergens.	304
214010	cd12812	BPSL1549	Burkholderia Lethal Factor 1. BPSL1549, also suggested to be called Burkholderia lethal factor 1, is a protein of unknown function from Burkholderia pseudomallei, a causative agent of melioidosis (also called Whitmore's disease). This protein shows similarity to Escherichia coli cytotoxic necrotizing factor 1 which has been found to act as a potent cytotoxin against eukaryotic cells and is lethal when administered to mice. BPSL1549 expression levels correlate with suppression or promotion of pathogenic conditions. BPSL1549 inhibits helicase activity of translation initiation factor eIF4A. As yet, there is no vaccine and the organism is multidrug resistant.	203
240610	cd12813	LbR-like	Left-handed beta-roll, including virulence factors and various other proteins. This family contains a variety of protein domains with a left-handed beta-roll structure including cell surface adhesion proteins, bacterial virulence factors, and ice-binding proteins, and other activities. UspA1 Head And Neck Domain and YadA of Yersinia are part of a class of pathogenicity factors that act as cell surface adhesion molecules, in which N-terminal head and neck domains extend from the bacterial outer membrane. The UspA1 head domain of Moraxella catarrhalis is formed from trimeric beta-rolls of 14-16 amino acid repeats. The UspA1 head domain connects to a neck region of large extended, charged loops that may be ligand binding, which is in turn connected to an extended coiled coil domain that tethers the head and neck region to the cell surface via a transmembrane region. The collagen-binding domain virulence factor YadA is an adhesion protein of several Yersinia species; related cell surface proteins are also known. The collagen-binding portion is found in the hydrophobic N-terminal region. YadA forms a matrix on the bacterial outer membrane, which mediates binding to collagen and epithelial cells. YadA inhibits the complement-activating pathway with the coating of the cell surface with factor H, which impedes C3b molecules. The ice-binding protein of the grass Lolium perenne (LpIBP) discourages the recrystallization of ice. Ice-binding proteins produced by organisms to prevent the growing of ice are termed anti-freeze proteins. LpIBP consists of an unusual left-handed beta roll. Ice-binding is mediated by a flat beta-sheet on one side of the helix. These domains form a left-handed beta roll made up of a series of short repeated elements.	99
240611	cd12819	LbR_vir_like	Cell adhesion-like domain, left-handed beta-roll. This group contains proteins of unknown function related to characterized cell surface adhesion proteins with a left-handed beta-roll, like the UspA1 Head And Neck Domain and YadA of Yersinia. UspA1 and UspA2 are part of a class of pathogenicity factors that act as cell surface adhesion molecules, in which N-terminal head and neck domains extend from the bacterial outer membrane. The UspA1 head domain of Moraxella catarrhalis is formed from trimeric beta-helices of 14-16 amino acid repeats. The UspA1 head domain connects to a neck region of large extended, charged loops that may be ligand binding, which is in turn connected to an extended coiled coil domain that tethers the head and neck region to the cell surface via a transmembrane region. The collagen-binding domain virulence factor YadA is an adhesion protein of several Yersinia species; related cell surface proteins are also known. The collagen-binding portion is found in the hydrophobic N-terminal region. YadA forms a matrix on the bacterial outer membrane, which mediates binding to collagen and epithelial cells. YadA inhibits the complement-activating pathway with the coating of the cell surface with factor H, which impedes C3b molecules. These domains form a left-handed beta roll made up of a series of short repeated elements.	111
240612	cd12820	LbR_YadA-like	YadA-like, left-handed beta-roll. This group contains the collagen-binding domain virulence factor YadA, an adhesion protein of several Yersinia species, and related cell surface proteins, including Moraxella catarrhalis UspA-like proteins. The collagen-binding portion is found in the hydrophobic N-terminal region. YadA forms a matrix on the bacterial outer membrane, which mediates binding to collagen and epithelial cells. YadA inhibits the complement-activating pathway with the coating of the cell surface with factor H, which impedes C3b molecules. These domains form a left-handed beta roll made up of a series of short repeated elements. UspA1 and UspA2 are part of a class of pathogenicity factors that act as cell surface adhesion molecules, in which N-terminal head and neck domains extend from the bacterial outer membrane. The UspA1 head domain of Moraxella catarrhalis is formed from trimeric left-handed parallel beta-helices of 14-16 amino acid repeats. The UspA1 head domain connects to a neck region of large extended, charged loops that may be ligand binding, which is in turn connected to an extended coiled coil domain that tethers the head and neck region to the cell surface via a transmembrane region.	126
213355	cd12821	EcCorA_ZntB-like	Escherichia coli CorA-Salmonella typhimurium ZntB_like family. A family of the MIT superfamily of essential membrane proteins involved in transporting divalent cations (uptake or efflux) across membranes. Members of this family are found in all three kingdoms of life. It is a functionally diverse family, including the Mg2+ transporters Escherichia coli and Salmonella typhimurium CorAs (which can also transport Co2+, and Ni2+), and the Zn2+ transporter Salmonella typhimurium ZntB which mediates the efflux of Zn2+ (and Cd2+). It also includes two Saccharomyces cerevisiae members: the inner membrane Mg2+ transporters Mfm1p/Lpe10p, and Mrs2p, and a family of Arabidopsis thaliana members (AtMGTs) some of which are localized to distinct tissues, and not all of which can transport Mg2+. Structures of the intracellular domain of Vibrio parahaemolyticus and Salmonella typhimurium ZntB form funnel-shaped homopentamers, the tip of the funnel is formed from two C-terminal transmembrane (TM) helices from each monomer, and the large opening of the funnel from the N-terminal cytoplasmic domains. The GMN signature motif of the MIT superfamily occurs just after TM1, mutation within this motif is known to abolish Mg2+ transport through Salmonella typhimurium CorA, and Mrs2p. Natural variants such as GVN and GIN, such as occur in some ZntB family proteins, may be associated with the transport of different divalent cations, such as zinc and cadmium. The functional diversity of MIT transporters may also be due to minor structural differences regulating gating, substrate selection, and transport.	285
213356	cd12822	TmCorA-like	Thermotoga maritima CorA-like family. This family belongs to the MIT superfamily of essential membrane proteins involved in transporting divalent cations (uptake or efflux) across membranes. Members of the Thermotoga maritima CorA_like family are found in all three kingdoms of life. It is a functionally diverse family, in addition to the CorA Co2+ transporter from the hyperthermophilic Thermotoga maritima, it includes three Saccharomyces cerevisiae members: two plasma membrane proteins, the Mg2+ transporter Alr1p/Swc3p and the putative Mg2+ transporter, Alr2p, and the vacuole membrane protein Mnr2p, a putative Mg2+ transporter. Thermotoga maritima CorA forms funnel-shaped homopentamers, the tip of the funnel is formed from two C-terminal transmembrane (TM) helices from each monomer, and the large opening of the funnel from the N-terminal cytoplasmic domains. The GMN signature motif of the MIT superfamily occurs just after TM1, mutation within this motif is known to abolish Mg2+ transport by Alr1p. Natural variants in this signature sequence may be associated with the transport of different divalent cations. The functional diversity of the MIT superfamily may also be due to minor structural differences regulating gating, substrate selection, and transport.	289
213357	cd12823	Mrs2_Mfm1p-like	Saccharomyces cerevisiae inner mitochondrial membrane Mg2+ transporters Mfm1p and Mrs2p-like family. A eukaryotic subfamily belonging to the Escherichia coli CorA-Salmonella typhimurium ZntB_like (EcCorA_ZntB-like) family of the MIT superfamily of essential membrane proteins involved in transporting divalent cations (uptake or efflux) across membranes. This functionally diverse subfamily includes the inner mitochondrial membrane Mg2+ transporters Saccharomyces cerevisiae Mfm1p/Lpe10p, Mrs2p, and human MRS2/MRS2L. It also includes a family of Arabidopsis thaliana proteins (AtMGTs) some of which are localized to distinct tissues, and not all of which can transport Mg2+. Structures of the intracellular domain of two EcCorA_ZntB-like family transporters: Vibrio parahaemolyticus and Salmonella typhimurium ZntB form funnel-shaped homopentamers, the tip of the funnel is formed from two C-terminal transmembrane (TM) helices from each monomer, and the large opening of the funnel from the N-terminal cytoplasmic domains. The GMN signature motif of the MIT superfamily occurs just after TM1, mutation within this motif is known to abolish Mg2+ transport through Salmonella typhimurium CorA, and Mrs2p. Natural variants such as GVN and GIN, as in some ZntB family proteins, may be associated with the transport of different divalent cations, such as zinc and cadmium. The functional diversity of MIT transporters may also be due to minor structural differences regulating gating, substrate selection, and transport.	323
213358	cd12824	ZntB-like	Salmonella typhimurium Zn2+ transporter ZntB-like subfamily. A bacterial subfamily belonging to the Escherichia coli CorA-Salmonella typhimurium ZntB_like (EcCorA_ZntB-like) family of the MIT superfamily of essential membrane proteins involved in transporting divalent cations (uptake or efflux) across membranes. This subfamily includes the Zn2+ transporter Salmonella typhimurium ZntB which mediates the efflux of Zn2+ (and Cd2+). Structures of the intracellular domain of Vibrio parahaemolyticus and Salmonella typhimurium ZntB form funnel-shaped homopentamers, the tip of the funnel is formed from two C-terminal transmembrane (TM) helices from each monomer, and the large opening of the funnel from the N-terminal cytoplasmic domains. The GMN signature motif of the MIT superfamily occurs just after TM1, mutation within this motif is known to abolish Mg2+ transport through Salmonella typhimurium CorA, and Mrs2p. Natural variants such as GVN and GIN, which occur in proteins belonging to this subfamily, may be associated with the transport of different divalent cations, such as zinc and cadmium. The functional diversity of MIT transporters may also be due to minor structural differences regulating gating, substrate selection, and transport.	290
213359	cd12825	EcCorA-like	Escherichia coli Mg2+ transporter CorA_like subfamily. A bacterial subfamily of the Escherichia coli CorA-Salmonella typhimurium ZntB_like (EcCorA_ZntB-like) family of the MIT superfamily of essential membrane proteins involved in transporting divalent cations (uptake or efflux) across membranes. This subfamily includes the Mg2+ transporters Escherichia coli, Salmonella typhimurium, and Helicobacter pylori CorAs (which can also transport Co2+, and Ni2+). Structures of the intracellular domain of Vibrio parahaemolyticus and Salmonella typhimurium ZntB form funnel-shaped homopentamers, the tip of the funnel is formed from two C-terminal transmembrane (TM) helices from each monomer, and the large opening of the funnel from the N-terminal cytoplasmic domains. The GMN signature motif of the MIT superfamily occurs just after TM1, mutation within this motif is known to abolish Mg2+ transport through Salmonella typhimurium CorA, and Mrs2p. Natural variants such as GVN and GIN, such as occur in some ZntB family proteins, may be associated with the transport of different divalent cations, such as zinc and cadmium. The functional diversity of MIT transporters may also be due to minor structural differences regulating gating, substrate selection, and transport.	287
213360	cd12826	EcCorA_ZntB-like_u1	uncharacterized bacterial subfamily of the Escherichia coli CorA-Salmonella typhimurium ZntB family. An uncharacterized subfamily of the Escherichia coli CorA-Salmonella typhimurium ZntB (EcCorA-ZntB_like) family of the MIT superfamily of essential membrane proteins involved in transporting divalent cations (uptake or efflux) across membranes. The EcCorA-ZntB_like family includes the Mg2+ transporters Escherichia coli and Salmonella typhimurium CorAs, which can also transport Co2+, and Ni2+. Structures of the intracellular domain of EcCorA-ZntB_like family members, Vibrio parahaemolyticus and Salmonella typhimurium ZntB, form funnel-shaped homopentamers, the tip of the funnel is formed from two C-terminal transmembrane (TM) helices from each monomer, and the large opening of the funnel from the N-terminal cytoplasmic domains. The GMN signature motif of the MIT superfamily occurs just after TM1, mutation within this motif is known to abolish Mg2+ transport through Salmonella typhimurium CorA. Natural variants such as GVN and GIN, as in some ZntB family proteins, may be associated with the transport of different divalent cations, such as zinc and cadmium. The functional diversity of MIT transporters may also be due to minor structural differences regulating gating, substrate selection, and transport.	281
213361	cd12827	EcCorA_ZntB-like_u2	uncharacterized bacterial subfamily of the Escherichia coli CorA-Salmonella typhimurium ZntB family. An uncharacterized subfamily of the Escherichia coli CorA-Salmonella typhimurium ZntB (EcCorA-ZntB_like) family of the MIT superfamily of essential membrane proteins involved in transporting divalent cations (uptake or efflux) across membranes. The EcCorA-ZntB-like family includes the Mg2+ transporters Escherichia coli and Salmonella typhimurium CorAs, which can also transport Co2+, and Ni2+. Structures of the intracellular domain of EcCorA-ZntB-like family members, Vibrio parahaemolyticus and Salmonella typhimurium ZntB, form funnel-shaped homopentamers, the tip of the funnel is formed from two C-terminal transmembrane (TM) helices from each monomer, and the large opening of the funnel from the N-terminal cytoplasmic domains. The GMN signature motif of the MIT superfamily occurs just after TM1, mutation within this motif is known to abolish Mg2+ transport through Salmonella typhimurium CorA. Natural variants such as GVN and GIN, such as occur in some ZntB family proteins, may be associated with the transport of different divalent cations, such as zinc and cadmium. The functional diversity of MIT transporters may also be due to minor structural differences regulating gating, substrate selection, and transport.	289
213362	cd12828	TmCorA-like_1	Thermotoga maritima CorA_like subfamily. This subfamily belongs to the Thermotoga maritima CorA (TmCorA)-family of the MIT superfamily of essential membrane proteins involved in transporting divalent cations (uptake or efflux) across membranes. Members of this subfamily are found in all three kingdoms of life. It is a functionally diverse subfamily, in addition to the CorA Co2+ transporter from the hyperthermophilic Thermotoga maritima, it includes Methanosarcina mazei CorA which may be involved in transport of copper and/or other divalent metal ions. Thermotoga maritima CorA forms funnel-shaped homopentamers, the tip of the funnel is formed from two C-terminal transmembrane (TM) helices from each monomer, and the large opening of the funnel from the N-terminal cytoplasmic domains. The GMN signature motif of the MIT superfamily occurs just after TM1, mutation within this motif is known to abolish Mg2+ transport by a related protein, Saccharomyces cerevisiae Alr1p. Natural variants in this signature sequence may be associated with the transport of different divalent cations. The functional diversity of the MIT superfamily may also be due to minor structural differences regulating gating, substrate selection, and transport.	294
213363	cd12829	Alr1p-like	Saccharomyces cerevisiae Alr1p-like subfamily. This eukaryotic subfamily belongs to the Thermotoga maritima CorA (TmCorA)-family of the MIT superfamily of essential membrane proteins involved in transporting divalent cations (uptake or efflux) across membranes. This subfamily includes three Saccharomyces cerevisiae members: two plasma membrane proteins, the Mg2+ transporter Alr1p/Swc3p and the putative Mg2+ transporter, Alr2p, and the vacuole membrane protein Mnr2p, a putative Mg2+ transporter. Thermotoga maritima CorA forms funnel-shaped homopentamers, the tip of the funnel is formed from two C-terminal transmembrane (TM) helices from each monomer, and the large opening of the funnel from the N-terminal cytoplasmic domains. The GMN signature motif of the MIT superfamily occurs just after TM1, mutation within this motif is known to abolish Mg2+ transport by Alr1p. Natural variants in this signature sequence may be associated with the transport of different divalent cations. The functional diversity of the MIT superfamily may also be due to minor structural differences regulating gating, substrate selection, and transport.	305
213364	cd12830	MtCorA-like	Mycobacterium tuberculosis CorA-like subfamily. This bacterial subfamily belongs to the Thermotoga maritima CorA (TmCorA)-like family of the MIT superfamily of essential membrane proteins involved in transporting divalent cations (uptake or efflux) across membranes. This subfamily includes the Mg2+ transporter Mycobacterium tuberculosis CorA (which also transports Co2+). Thermotoga maritima CorA forms funnel-shaped homopentamers, the tip of the funnel is formed from two C-terminal transmembrane (TM) helices from each monomer, and the large opening of the funnel from the N-terminal cytoplasmic domains. The GMN signature motif of the MIT superfamily occurs just after TM1, mutation within this motif is known to abolish Mg2+ transport by a related protein, Saccharomyces cerevisiae Alr1p. Natural variants in this signature sequence may be associated with the transport of different divalent cations. The functional diversity of the MIT superfamily may also be due to minor structural differences regulating gating, substrate selection, and transport.	292
213365	cd12831	TmCorA-like_u2	Uncharacterized bacterial subfamily of the Thermotoga maritima CorA-like family. This subfamily belongs to the Thermotoga maritima CorA (TmCorA)-like family of the MIT superfamily of essential membrane proteins involved in transporting divalent cations (uptake or efflux) across membranes. Members of the TmCorA-like family are found in all three kingdoms of life. It is a functionally diverse family which includes the CorA Co2+ transporter from the hyperthermophilic Thermotoga maritima, and three Saccharomyces cerevisiae proteins: two located in the plasma membrane: the Mg2+ transporter Alr1p/Swc3p and the putative Mg2+ transporter, Alr2p, and the vacuole membrane protein Mnr2p, a putative Mg2+ transporter. Thermotoga maritima CorA forms funnel-shaped homopentamers, the tip of the funnel is formed from two C-terminal transmembrane (TM) helices from each monomer, and the large opening of the funnel from the N-terminal cytoplasmic domains. The GMN signature motif of the MIT superfamily occurs just after TM1, mutation within this motif is known to abolish Mg2+ transport by a related protein, Saccharomyces cerevisiae Alr1p. Natural variants in this signature sequence may be associated with the transport of different divalent cations. The functional diversity of the MIT superfamily may also be due to minor structural differences regulating gating, substrate selection, and transport.	287
213366	cd12832	TmCorA-like_u3	Uncharacterized subfamily of the Thermotoga maritima CorA-like family. This subfamily belongs to the Thermotoga maritima CorA (TmCorA)-like family of the MIT superfamily of essential membrane proteins involved in transporting divalent cations (uptake or efflux) across membranes. Members of the TmCorA-like family are found in all three kingdoms of life. It is a functionally diverse family which includes the CorA Co2+ transporter from the hyperthermophilic Thermotoga maritima, and three Saccharomyces cerevisiae proteins: two located in the plasma membrane: the Mg2+ transporter Alr1p/Swc3p and the putative Mg2+ transporter, Alr2p, and the vacuole membrane protein Mnr2p, a putative Mg2+ transporter. Thermotoga maritima CorA forms funnel-shaped homopentamers, the tip of the funnel is formed from two C-terminal transmembrane (TM) helices from each monomer, and the large opening of the funnel from the N-terminal cytoplasmic domains. The GMN signature motif of the MIT superfamily occurs just after TM1, mutation within this motif is known to abolish Mg2+ transport by a related protein, Saccharomyces cerevisiae Alr1p. Natural variants in this signature sequence may be associated with the transport of different divalent cations. The functional diversity of the MIT superfamily may also be due to minor structural differences regulating gating, substrate selection, and transport.	287
213367	cd12833	ZntB-like_1	Salmonella typhimurium Zn2+ transporter ZntB-like subgroup. A bacterial subgroup belonging to the Escherichia coli CorA-Salmonella typhimurium ZntB_like family (EcCorA_ZntB-like) of the MIT superfamily of essential membrane proteins involved in transporting divalent cations (uptake or efflux) across membranes. This subgroup includes the Zn2+ transporter Salmonella typhimurium ZntB which mediates the efflux of Zn2+ (and Cd2+). Structures of the intracellular domain of Vibrio parahaemolyticus and Salmonella typhimurium ZntB form funnel-shaped homopentamers, the tip of the funnel is formed from two C-terminal transmembrane (TM) helices from each monomer, and the large opening of the funnel from the N-terminal cytoplasmic domains. The GMN signature motif of the MIT superfamily occurs just after TM1, mutation within this motif is known to abolish Mg2+ transport through Salmonella typhimurium CorA, and Mrs2p. Natural variants such as GVN and GIN, which occur in proteins belonging to this subfamily, may be associated with the transport of different divalent cations, such as zinc and cadmium. The functional diversity of MIT transporters may also be due to minor structural differences regulating gating, substrate selection, and transport.	290
213368	cd12834	ZntB_u1	Uncharacterized bacterial subgroup of the Salmonella typhimurium Zn2+ transporter ZntB-like subfamily. The MIT superfamily of essential membrane proteins is involved in transporting divalent cations (uptake or efflux) across membranes. The ZntB-like subfamily includes the Zn2+ transporter Salmonella typhimurium ZntB which mediates the efflux of Zn2+ (and Cd2+). Structures of the intracellular domain of Vibrio parahaemolyticus and Salmonella typhimurium ZntB form funnel-shaped homopentamers, the tip of the funnel is formed from two C-terminal transmembrane (TM) helices from each monomer, and the large opening of the funnel from the N-terminal cytoplasmic domains. The GMN signature motif of the MIT superfamily occurs just after TM1, mutation within this motif is known to abolish Mg2+ transport through Salmonella typhimurium CorA, and Mrs2p. Natural variants such as GVN and GIN, which occur in proteins belonging to this subfamily, may be associated with the transport of different divalent cations, such as zinc and cadmium. The functional diversity of MIT transporters may also be due to minor structural differences regulating gating, substrate selection, and transport.	290
213369	cd12835	EcCorA-like_1	Escherichia coli Mg2+ transporter CorA_like subgroup. A bacterial subgroup of the Escherichia coli CorA-Salmonella typhimurium ZntB_like (EcCorA_ZntB-like) family of the MIT superfamily of essential membrane proteins involved in transporting divalent cations (uptake or efflux) across membranes. This subgroup includes the Mg2+ transporters Escherichia coli CorA and Salmonella typhimurium CorA (which can also transport Co2+, and Ni2+). Structures of the intracellular domain of Vibrio parahaemolyticus and Salmonella typhimurium ZntB form funnel-shaped homopentamers, the tip of the funnel is formed from two C-terminal transmembrane (TM) helices from each monomer, and the large opening of the funnel from the N-terminal cytoplasmic domains. The GMN signature motif of the MIT superfamily occurs just after TM1, mutation within this motif is known to abolish Mg2+ transport through Salmonella typhimurium CorA, and Mrs2p. Natural variants such as GVN and GIN, such as occur in some ZntB family proteins, may be associated with the transport of different divalent cations, such as zinc and cadmium. The functional diversity of MIT transporters may also be due to minor structural differences regulating gating, substrate selection, and transport.	287
213370	cd12836	HpCorA-like	Mg2+ transporter Helicobacter pylori CorA-like subgroup. A bacterial subgroup of the Escherichia coli CorA-Salmonella typhimurium ZntB_like (EcCorA_ZntB-like) family of the MIT superfamily of essential membrane proteins involved in transporting divalent cations (uptake or efflux) across membranes. This subgroup includes the Mg2+ transporter Helicobacter pylori CorAs (which can also transport Co2+, and Ni2+); CorA plays an important role in the viability of this pathogen. Structures of the intracellular domain of Vibrio parahaemolyticus and Salmonella typhimurium ZntB (members of the EcCorA_ZntB-like family) form funnel-shaped homopentamers, the tip of the funnel is formed from two C-terminal transmembrane (TM) helices from each monomer, and the large opening of the funnel from the N-terminal cytoplasmic domains. The GMN signature motif of the MIT superfamily occurs just after TM1, mutation within this motif is known to abolish Mg2+ transport through Salmonella typhimurium CorA, and Mrs2p. Natural variants such as GVN and GIN, such as occur in some ZntB family proteins, may be associated with the transport of different divalent cations, such as zinc and cadmium. The functional diversity of MIT transporters may also be due to minor structural differences regulating gating, substrate selection, and transport.	288
213371	cd12837	EcCorA-like_u1	uncharacterized subgroup of the Escherichia coli Mg2+ transporter CorA_like subfamily. An uncharacterized subgroup of the Escherichia coli CorA-Salmonella typhimurium ZntB_like (EcCorA_ZntB-like) family of the MIT superfamily of essential membrane proteins involved in transporting divalent cations (uptake or efflux) across membranes. The EcCorA_ZntB-like family includes the Mg2+ transporters Escherichia coli and Salmonella typhimurium CorAs, which can also transport Co2+, and Ni2+. Structures of the intracellular domain of EcCorA_ZntB-like family members, Vibrio parahaemolyticus and Salmonella typhimurium ZntB, form funnel-shaped homopentamers, the tip of the funnel is formed from two C-terminal transmembrane (TM) helices from each monomer, and the large opening of the funnel from the N-terminal cytoplasmic domains. The GMN signature motif of the MIT superfamily occurs just after TM1, mutation within this motif is known to abolish Mg2+ transport through Salmonella typhimurium CorA. Natural variants such as GVN and GIN, such as occur in some ZntB family proteins, may be associated with the transport of different divalent cations, such as zinc and cadmium. The functional diversity of MIT transporters may also be due to minor structural differences regulating gating, substrate selection, and transport.	298
214011	cd12838	Killer_toxin_alpha	Alpha subunit of killer toxin from halotolerant yeast. This family contains the alpha subunit of killer toxins that are secreted by several strains of yeasts and fungi. These toxins are proteinaceous substances that kill sensitive strains. The halotolerant yeast Pichia farinosa KK1 strain produces the SMK toxin, with maximum killer activity under acidic pH and high salt concentration. This toxin is composed of alpha and beta subunits that interact tightly with each other under acidic conditions but easily dissociate and lose activity under neutral conditions. Its topology is similar to that of the fungal killer toxin, KP4, which contains a rare structural motif, suggesting that these toxins may be evolutionarily and/or functionally related.	62
214012	cd12839	Killer_toxin_beta	Beta subunit of killer toxin from halotolerant yeast. This family contains the beta subunit of killer toxins that are secreted by several strains of yeasts and fungi. These toxins are proteinaceous substances that kill sensitive strains. The halotolerant yeast Pichia farinosa KK1 strain produces the SMK toxin, with maximum killer activity under acidic pH and high salt concentration. This toxin is composed of alpha and beta subunits that interact tightly with each other under acidic conditions but easily dissociate and lose activity under neutral conditions. Its topology is similar to that of the fungal killer toxin, KP4, which contains a rare structural motif, suggesting that these toxins may be evolutionarily and/or functionally related.	74
214013	cd12840	CarS	Antirepressor CarS. CarS, an antirepressor present in Cystobacterineae, recognizes repressors to turn on the photo-inducible promoter P(B). In the dark, access to the P(B) promoter is blocked by the repressor CarA. Blue light causes expression of CarS, leading the way to the CarA-CarS interaction which dismantles the CarA-operator complex, resulting in the derepression of the P(B) promoter. A parallel pathway for regulating P(B) involves the interaction of CarS with the repressor CarH, which shares the domain architecture of CarA. CarH and CarA contain an N-terminal, MerR-type winged-helix DNA-binding domain that recognizes CarS. CarS adopts an SH3-like fold with loop length variations and acts as an operator DNA mimic.	80
214014	cd12841	TM_EphA1	Transmembrane domain of Ephrin Receptor A1 Protein Tyrosine Kinase. Ephrin receptors (EphRs) comprise the largest subfamily of receptor PTKs, and are classified into two classes (EphA and EphB), corresponding to binding preferences for either GPI-anchored ephrin-A ligands or transmembrane ephrin-B ligands. Vertebrates have ten EphA and six EphB receptors, which display promiscuous ligand interactions within each class. EphA1 has been associated with late-onset Alzheimer's disease and certain cancers such as colorectal and gastric carcinomas. EphRs contain an ephrin binding domain and two fibronectin repeats extracellularly, a single-span transmembrane (TM) domain, and a cytoplasmic tyr kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. This allows ephrin/EphR dimers to form, leading to the activation of the intracellular tyr kinase domain. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling). The main effect of ephrin/EphR interaction is cell-cell repulsion or adhesion. Ephrin/EphR signaling is important in neural development and plasticity, cell morphogenesis and proliferation, cell-fate determination, embryonic development, tissue patterning, and angiogenesis. The TM domain mediates dimerization.	38
410985	cd12842	IGCP_Hfx_cass2	integron gene cassette protein (IGCP) Hfx_cass2 and similar proteins. This family contains the unique integron gene cassette protein Hfx_cass2 and similar proteins that have yet to be characterized. The structure of Hfx_cass2 depicts a homodimer incorporating a compact all-alpha fold of six helical segments with a core central bundle of helices. It has a surface cleft reminiscent of an enzyme active site. This family may allow an assessment of the impact of the integron/gene cassette system on the emergence of new phenotypes, such as drug resistance or virulence.	110
240608	cd12843	Bvu_2165_C_like	The C-terminal domain of uncharacterized bacterial proteins. This family contains the C-terminal domain of uncharacterized hypothetical proteins from bacteria, including Bacteroides vulgatus Bvu_2165. The structure of Bvu_2165 is dimeric, with an extensive binding interface.	105
240607	cd12869	MqsR	Motility quorum-sensing regulator (MqsR). This family includes domains similar to the motility quorum-sensing regulator MqsR, a toxin that is highly upregulated in persisters (dormant cells found in biofilms that are a source of antibiotic resistance). MqsR pairs with its antitoxin MqsA, forming a unique family of toxin:antitoxin (TA) systems. MqsR has been found to be structurally homologous to the bacterial ribonuclease (RelE) toxins; however, its sequence is not similar to any other known toxins and therefore its molecular function is as yet unknown.	98
240606	cd12870	MqsA	antitoxin MqsA for MqsR toxin. This family includes domains similar to the antitoxin MqsA that binds motility quorum-sensing regulator MqsR, a toxin that is highly upregulated in persisters (dormant cells found in biofilms that are a source of antibiotic resistance), thus forming a unique toxin:antitoxin (TA) pair. MqsA neutralizes MqsR toxicity. It binds its own promoter as well as those of genes important for E. coli physiology, such as mcbR and spy. It also binds zinc and has been shown to coordinate DNA via its C-terminal domain. This family also includes the B. subtilis YokU protein, which is functionally uncharacterized.	66
214015	cd12871	Bacuni_01323_like	Uncharacterized protein conserved in Bacteroidetes. A well-conserved family of 16-stranded beta barrels resembling outer membrane porins. The interior of the barrels is mostly occupied by an insert with partially helical structure.	231
293932	cd12872	SPRY_Ash2	SPRY domain in Ash2. This SPRY domain is found at the C-terminus of Ash2 (absent, small, or homeotic discs 2) -like proteins, core components of all mixed-lineage leukemia (MLL) family histone methyltransferases. Ash2 is a member of the trithorax group of transcriptional regulators of the Hox genes. Recent studies show that the SPRY domain of Ash2 mediates the interaction with RbBP5 and has an important role in regulating the methyltransferase activity of MLL complexes. In yeast, Ash2 is involved in histone methylation and is required for the earliest stages of embryogenesis.	150
293933	cd12873	SPRY_DDX1	SPRY domain associated with DEAD box gene DDX1. This SPRY domain is associated with the DEAD box gene, DDX1, an RNA-dependent ATPase involved in HIV-1 Rev function and virus replication. It is suggested that DDX1 acts as a cellular cofactor by promoting oligomerization of Rev on the Rev response element (RRE). DDX1 RNA is overexpressed in breast cancer, data showing a strong and independent association between poor prognosis and deregulation of the DEAD box protein DDX1, thus potentially serving as an effective prognostic biomarker for early recurrence in primary breast cancer. DDX1 also interacts with RelA and enhances nuclear factor kappaB-mediated transcription. DEAD-box proteins are associated with all levels of RNA metabolism and function, and have been implicated in translation initiation, transcription, RNA splicing, ribosome assembly, RNA transport, and RNA decay.	155
293934	cd12874	SPRY_PRY	PRY/SPRY domain, also known as B30.2. This domain contains residues in the N-terminus that form a distinct PRY domain structure such that the B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil core); Stonustoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Among the TRIM proteins, also known as the N-terminal RING finger/B-box/coiled coil (RBCC) family, only Classes I and II contain the B30.2 domain that has evolved under positive selection. Class I TRIM proteins include multiple members involved in antiviral immunity at various levels of interferon signaling cascade. Among the 75 human TRIMs, roughly half enhance immune response, which they do at multiple levels in signaling pathways. The PRY-SPRY domain in these TRIM families is suggested to serve as the target binding site.	168
293935	cd12875	SPRY_SOCS_Fbox	SPRY domain in Fbxo45 and suppressors of cytokine signaling (SOCS) proteins. This family consists of the SPRY domain-containing SOCS box protein family (SPSB1-4, also known as SSB-1 to -4) as well as F-box protein 45 (Fbxo45), a novel synaptic E3 ubiquitin ligase. The SPSB protein is composed of a central SPRY protein interaction domain and a C-terminal SOCS box. SPSB1, SPSB2, and SPSB4 interact with prostate apoptosis response protein 4 (Par-4) and are negative regulators that recruit the ECS E3 ubiquitin ligase complex to polyubiquitinate inducible nitric-oxide synthase (iNOS), resulting in its proteasomal degradation. Fbxo45 is related to this family; it is located N-terminal to the SPRY domain, and known to induce the degradation of a synaptic vesicle-priming factor, Munc13-1, via the SPRY domain, thus playing an important role in the regulation of neurotransmission by modulating Munc13-1 at the synapse. Suppressor of cytokine signaling (SOCS) proteins negatively regulate signaling from JAK-associated cytokine receptor complexes, and play key roles in the regulation of immune homeostasis.	169
293936	cd12876	SPRY_SOCS3	SPRY domain in the suppressor of cytokine signaling 3 (SOCS3) family. The SPRY domain-containing SOCS box protein family (SPSB1-4, also known as SSB-1 to -4) is composed of a central SPRY protein interaction domain and a C-terminal SOCS box. All four SPSB proteins interact with c-Met, the hepatocyte growth factor receptor, but SOCS3 regulates cellular response to a variety of cytokines such as leukemia inhibitory factor (LIF) and interleukin 6. SOCS3, along with SOCS1, are expressed by immune cells and cells of the central nervous system (CNS) and have the potential to impact immune processes within the CNS. In non-small cell lung cancer (NSCLC), SOCS3 is silenced and proline-rich tyrosine kinase 2 (Pyk2) is over-expressed; it has been suggested that SOCS3 could be an effective way to prevent the progression of NSCLC due to its role in regulating Pyk2 expression.	185
240457	cd12877	SPRY1_RyR	SPRY domain 1 (SPRY1) of ryanodine receptor (RyR). This SPRY domain is the first of three structural repeats in all three isoforms of the ryanodine receptor (RyR), which are the major Ca2+ release channels in the membranes of sarcoplasmic reticulum (SR). There are three RyR genes in mammals; the skeletal RyR1, the cardiac RyR2 and the brain RyR3. The three SPRY domains are located in the N-terminal part of the cytoplasmic region of the RyRs, but no specific function has been found for this first SPRY domain of the RyRs.	151
240458	cd12878	SPRY2_RyR	SPRY domain 2 (SPRY2) of ryanodine receptor (RyR). This SPRY domain (SPRY2) is the second of three structural repeats in all three isoforms of the ryanodine receptor (RyR), which are the major Ca2+ release channels in the membranes of sarcoplasmic reticulum (SR). There are three RyR genes in mammals; the skeletal RyR1, the cardiac RyR2 and the brain RyR3. The three SPRY domains are located in the N-terminal part of the cytoplasmic region of the RyRs. The SPRY2 domain has been shown to bind to the dihydropyridine receptor (DHPR) II-III loop and the ASI region of RyR1.	133
293937	cd12879	SPRY3_RyR	SPRY domain 3 (SPRY3) of ryanodine receptor (RyR). This SPRY domain (SPRY3) is the third of three structural repeats in all three isoforms of the ryanodine receptor (RyR), which are the major Ca2+ release channels in the membranes of sarcoplasmic reticulum (SR). There are three RyR genes in mammals; the skeletal RyR1, the cardiac RyR2 and the brain RyR3. The three SPRY domains are located in the N-terminal part of the cytoplasmic region of the RyRs, but no specific function has been found for this third SPRY domain of the RyRs.	151
293938	cd12880	SPRYD7	SPRY domain-containing protein 7. This family contains SPRY domain-containing protein 7 (also known as CLL deletion region gene 6 protein homolog, CLLD6, or chronic lymphocytic leukemia deletion region gene 6 protein homolog). In humans, CLLD6 is highly expressed in heart, skeletal muscle, and testis as well as cancer cell lines. It also has cross-species conservation, suggesting that it is likely to carry out important cellular processes.	160
293939	cd12881	SPRY_HERC1	SPRY domain in HERC1. This SPRY domain is found in HERC1, a large protein related to chromosome condensation regulator RCC1. It is widely expressed in many tissues, playing an important role in intracellular membrane trafficking in the cytoplasm as well as Golgi apparatus. HERC1 also interacts with tuberous sclerosis 2 (TSC2, tuberin), which suppresses cell growth, and results in the destabilization of TSC2. However, the biological function of HERC1 has yet to be defined.	162
293940	cd12882	SPRY_RNF123	SPRY domain at N-terminus of ring finger protein 123. This SPRY domain is found at the N-terminus of RING finger protein 123 domain (also known as E3 ubiquitin-protein ligase RNF123). The ring finger domain motif is present in a variety of functionally distinct proteins and known to be involved in protein-protein and protein-DNA interactions. RNF123 displays E3 ubiquitin ligase activity toward the cyclin-dependent kinase inhibitor p27 (Kip1).	128
293941	cd12883	SPRY_RING	SPRY domain at N-terminus of Really Interesting New Gene (RING) finger domain. This SPRY domain is found at the N-terminus of RING finger domains which are present in a variety of functionally distinct proteins and known to be involved in protein-protein and protein-DNA interactions. RING-finger domain is a type of Zn-finger that binds two Zn atoms and is identified in proteins with a wide range of functions such as viral replication, signal transduction, and development.	121
293942	cd12884	SPRY_hnRNP	SPRY domain in heterogeneous nuclear ribonucleoprotein U-like (hnRNP) protein 1. This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of heterogeneous nuclear ribonucleoprotein U-like (hnRNP) protein 1 (also known as HNRPUL1 ) which is a major constituent of nuclear matrix or scaffold and binds directly to DNA sequences through the N-terminal acidic region named serum amyloid P (SAP). Its function is specifically modulated by E1B-55kDa in adenovirus-infected cells. HNRPUL1 also participates in ATR protein kinase signaling pathways during adenovirus infection. Two transcript variants encoding different isoforms have been found for this gene. When associated with bromodomain-containing protein 7 (BRD7), it activates transcription of glucocorticoid-responsive promoter in the absence of ligand-stimulation.	177
293943	cd12885	SPRY_RanBP_like	SPRY domain in Ran binding proteins, SSH4, HECT E3 and SPRYD3. This family includes SPRY domains found in Ran binding proteins (RBP or RanBPM) 9 and 10, SSH4 (suppressor of SHR3 null mutation protein 4), SPRY domain-containing protein 3 (SPRYD3) as well as HECT, a C-terminal catalytic domain of a subclass of ubiquitin-protein ligase (E3). RanBP9 and RanBP10 act as androgen receptor (AR) coactivators. Both consist of the N-terminal proline- and glutamine-rich regions, the SPRY domain, and LisH-CTLH and CRA motifs. The SPRY domain in SSH4 may be involved in cargo recognition, either directly or by combination with other adaptors, possibly leading to a higher selectivity. SPRYD3 is highly expressed in most tissues in humans, possibly involved in important cellular processes. HECT E3 mediates the direct transfer of ubiquitin from E2 to substrate.	132
293944	cd12886	SPRY_like	SPRY domain-like in bacteria. This family contains SPRY-like domains that are found only in bacteria and are mostly uncharacterized. SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 eukaryotic protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L).	129
293945	cd12887	SPRY_NHR_like	SPRY domain in neuralized homology repeat. This family contains the neuralized homology repeat 1 (NHR1) domain similar to the SPRY domain (known to mediate specific protein-protein interactions) at the C-terminus of a conserved region within eukaryotic neuralized and neuralized-like proteins. In Drosophila, the neuralized protein (Neur) belongs to a group of ubiquitin ligases and is required in a subset of Notch pathway-mediated cell fate decisions during development of the nervous system. Neur binds to the Notch receptor ligand Delta through its first NHR1 domain and mediates its ubiquitination for endocytosis. Multiple copies of this region are found in some members of the family.	161
293946	cd12888	SPRY_PRY_TRIM7_like	PRY/SPRY domain in tripartite motif-binding protein 7 (TRIM7)-like, including TRIM7, TRIM10, TRIM15, TRIM26, TRIM39, TRIM41. This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of several tripartite motif-containing (TRIM) proteins, including TRIM7 (also referred to as glycogenin-interacting protein, RING finger protein 90 or RNF90), TRIM10, TRIM15, TRIM26, TRIM39 and TRIM41. TRIM7 or GNIP interacts with glycogenin and stimulates its self-glucosylating activity via its SPRY domain. TRIM10 (also known as hematopoietic RING finger 1 (HERF1) or TRIM10/HERF1) plays a key role in definitive erythroid development; downregulation of the Spi-1/PU.1 oncogene induces the expression of TRIM10/HERF1, a key factor required for terminal erythroid cell differentiation and survival. Antiviral activity of TRIM15 is dependent on the ability of its B-box to interact with the MLV Gag precursor protein; downregulation of TRIM15, along with TRIM11, enhances virus release suggesting that these proteins contribute to the endogenous restriction of retroviruses in cells. Tripartite motif-containing 26 (TRIM26) function is as yet unknown; however, since it is localized in the human histocompatibility complex (MHC) class I region, TRIM26 may play a role in immune response although studies show no association between TRIM26 polymorphisms and the risk of aspirin-exacerbated respiratory disease. TRIM39 is a MOAP-1 (Modulator of Apoptosis)-binding protein that stabilizes MOAP-1 through inhibition of its poly-ubiquitination process. TRIM41 (also known as RING finger-interacting protein with C kinase or RINCK) functions as an E3 ligase that catalyzes the ubiquitin-mediated degradation of protein kinase C.	169
293947	cd12889	SPRY_PRY_TRIM67_9	PRY/SPRY domain in tripartite motif-containing proteins, TRIM9 and TRIM67. This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM9 proteins. TRIM9 protein is expressed mainly in the cerebral cortex, and functions as an E3 ubiquitin ligase. It has been shown that TRIM9 is localized to the neurons in the normal human brain and its immunoreactivity in affected brain areas in Parkinson's disease and dementia with Lewy bodies is severely decreased, possibly playing an important role in the regulation of neuronal function and participating in pathological process of Lewy body disease through its ligase. TRIM67 negatively regulates Ras activity via degradation of 80K-H, leading to neural differentiation, including neuritogenesis.	172
293948	cd12890	SPRY_PRY_TRIM16	PRY/SPRY domain in tripartite motif-containing protein 16 (TRIM16). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM16 and TRIM-like proteins. TRIM16 (also known as estrogen-responsive B box protein or EBBP) does not possess a RING domain like the other TRIM proteins, but contains two B-box domains and can heterodimerize with other TRIM proteins such as TRIM24, Promyelocytic leukemia (PML) protein and Midline-1 (MID1 or TRIM18). It is a regulator of keratinocyte differentiation and a tumor suppressor in retinoid-sensitive neuroblastoma. It has been shown that loss of TRIM16 expression plays an important role in the development of cutaneous squamous cell carcinoma (SCC) and is a determinant of retinoid sensitivity. TRIM16 also has E3 ubiquitin ligase activity.	182
293949	cd12891	SPRY_PRY_C-I_2	PRY/SPRY domain in tripartite motif-containing (TRIM) proteins, including TRIM14-like, TRIM16-like, TRIM25-like, TRIM47-like, TRIM65 and RNF135, and stonustoxin. This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of several Class I TRIM proteins, including TRIM14, TRIM16 and TRIM25, TRIM47 as well as RING finger protein RNF135 and stonustoxin, a secreted poisonous protein of the stonefish Synanceja horrida. TRIM16 (also known as estrogen-responsive B box protein or EBBP) has E3 ubiquitin ligase activity. It is a regulator of keratinocyte differentiation and a tumor suppressor in retinoid-sensitive neuroblastoma. TRIM25 (also called Efp) ubiquitinates the N terminus of the viral RNA receptor retinoic acid-inducible gene-I (RIG-I) in response to viral infection, leading to activation of the RIG-I signaling pathway, thus resulting in type I interferon production to limit viral replication. It has been shown that the influenza A virus targets TRIM25 and disables its antiviral function. TRIM47, also known as GOA (Gene overexpressed in astrocytoma protein) or RNF100 (RING finger protein 100), is highly expressed in kidney tubular cells, but low expressed in most tissue. It is overexpressed in astrocytoma tumor cells and plays an important role in the process of dedifferentiation that is associated with astrocytoma tumorigenesis. RNF135 ubiquitinates RIG-I (retinoic acid-inducible gene-I) to promote interferon-beta induction during the early phase of viral infection. Stonustoxin (STNX) is a hypotensive and lethal protein factor that also possesses other biological activities such as species-specific hemolysis (due to its ability to form pores in the cell membrane) and platelet aggregation, edema-induction, and endothelium-dependent vasorelaxation (mediated by the nitric oxide pathway and activation of potassium channels). The PRY-SPRY domain in these TRIM families is suggested to serve as the target binding site.	167
240472	cd12892	SPRY_PRY_TRIM18	PRY/SPRY domain of TRIM18/MID1, also known as FXY or RNF59. This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is at the C-terminus of the overall domain architecture of MID1 (also known as FXY, RNF59, TRIM18) gene represented by a RING finger domain (RING), two B-box motifs (BBOX), coiled-coil C-terminal to Bbox domain (BBC) and fibronectin type 3 domain (FN3). Mutations in the human MID1 gene result in X-linked Opitz G/BBB syndrome (OS), a disorder affecting development of midline structures, causing craniofacial, urogenital, gastrointestinal and cardiovascular abnormalities. A unique MID1 gene mutation located in a variable loop in the SPRY domain alters conformation of the binding pocket and may affect the binding affinity to the PRY/SPRY domain.	177
293950	cd12893	SPRY_PRY_TRIM35	PRY/SPRY domain in tripartite motif-containing protein 35 (TRIM35). This PRY/SPRY domain is found at the C-terminus of the overall domain architecture of tripartite motif 35, TRIM35 (also known as hemopoietic lineage switch protein), which includes a RING finger domain (RING) and a B-box motif (BBOX). TRIM35 may play a role as a tumor suppressor and is implicated in the cell death mechanism.	171
293951	cd12894	SPRY_PRY_TRIM36	PRY/SPRY domain in tripartite motif-containing protein 36 (TRIM36). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM36, a Class I TRIM protein. TRIM36 (also known as Haprin or RNF98) has a ubiquitin ligase activity and interacts with centromere protein-H, one of the kinetochore proteins. It has been shown that TRIM36 is potentially associated with chromosome segregation and that an excess of TRIM36 may cause chromosomal instability. In Xenopus laevis, TRIM36 is expressed during early embryogenesis and plays an important role in the arrangement of somites during their formation.	204
293952	cd12895	SPRY_PRY_TRIM46	PRY/SPRY domain in tripartite motif-containing protein 46 (TRIM46). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM46 proteins (composed of RING/B-box/coiled-coil core and also known as RBCC proteins). The SPRY/PRY combination is a possible component of immune defense. This protein family has not yet been characterized.	209
293953	cd12896	SPRY_PRY_TRIM65	PRY/SPRY domain in tripartite motif-containing protein 65 (TRIM65). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM65 proteins (composed of RING/B-box/coiled-coil core and also known as RBCC proteins). The SPRY/PRY combination is a possible component of immune defense. This protein family has not been characterized.	182
293954	cd12897	SPRY_PRY_TRIM50_72	PRY/SPRY domain in tripartite motif-binding (TRIM) proteins TRIM50 and TRIM72. This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of several TRIM proteins, including TRIM72 and TRIM50. TRIM72 (also known as MG53) has been shown to perform a critical function in membrane repair following acute muscle injury by nucleating the assembly of the repair machinery at injury sites. It is expressed specifically in skeletal muscle and heart, and tethered to the plasma membrane and cytoplasmic vesicles via its interaction with phosphatidylserine. TRIM50, an E3 ubiquitin ligase, is deleted in Williams-Beuren (WBS) syndrome, a multi-system neurodevelopmental disorder caused by the deletion of contiguous genes at chromosome region 7q11.23.	191
293955	cd12898	SPRY_PRY_TRIM76	PRY/SPRY domain in tripartite motif-containing protein 76 (TRIM76), also called cardiomyopathy-associated protein 5. This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM76, a Class I TRIM protein. TRIM76 (also known as cardiomyopathy-associated protein 5 or CMYA5 or myospryn or SPRYD2) is a muscle-specific member of the TRIM superfamily, but lacks the RING domain. It has been suggested that TRIM76 is involved in two distinct processes, protein kinase A signaling and vesicular trafficking. It has also been implicated in Duchenne muscular dystrophy and cardiac disease; gene polymorphism of TRIM76 is associated with left ventricular wall thickness in patients with hypertension while its interactions with M-band titin and calpain 3 link it to tibial and limb-girdle muscular dystrophies.	171
293956	cd12899	SPRY_PRY_TRIM76_like	PRY/SPRY domain in tripartite motif-containing protein 76 (TRIM76)-like. This domain is similar to the distinct PRY/SPRY subdomain found at the C-terminus of TRIM76, a Class I TRIM protein. TRIM76 (also known as cardiomyopathy-associated protein 5 or CMYA5 or myospryn or SPRYD2) is a muscle-specific member of the TRIM superfamily, but lacks the RING domain. It has been suggested that TRIM76 is involved in two distinct processes, protein kinase A signaling and vesicular trafficking.	176
293957	cd12900	SPRY_PRY_TRIM21	PRY/SPRY domain in tripartite motif-binding protein 21 (TRIM21) also known as 52kD Ribonucleoprotein Autoantigen (Ro52). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM21, which is also known as Sjogren Syndrome Antigen A (SSA), SSA1, 52kD Ribonucleoprotein Autoantigen (Ro52, Ro/SSA, SS-A/Ro) or RING finger protein 81 (RNF81). TRIM21 domains are composed of RING/B-box/coiled-coil core and also known as RBCC proteins. As an E3 ligase, TRIM21 mediates target specificity in ubiquitination; it regulates type 1 interferon and proinflammatory cytokines via ubiquitination of interferon regulatory factors (IRFs). It is up-regulated at the site of autoimmune inflammation, such as cutaneous lupus lesions, indicating a central role in the tissue destructive inflammatory process. It interacts with auto-antigens in patients with Sjogren syndrome and systemic lupus erythematosus, a chronic systemic autoimmune disease characterized by the presence of autoantibodies against the protein component of the human intracellular ribonucleoprotein-RNA complexes and more specifically TRIM21, Ro60/TROVE2 and La/SSB proteins. It binds the Fc part of IgG molecules via its PRY-SPRY domain with unexpectedly high affinity.	180
293958	cd12901	SPRY_PRY_FSD1	Fibronectin type III and SPRY containing 1 (FSD1) domain includes PRY at the N-terminus. This domain is part of the fibronectin type III and SPRY domain containing 1 (FSD1) and FSD1-like (FSD1L) proteins. These are centrosome-associated proteins that are characterized by an N-terminal coiled-coil region downstream of B-box (BBC) domain, a central fibronectin type III (FN3) domain, and C-terminal repeats in PRY/SPRY domain. The FSD1 protein associates with a subset of microtubules and may be involved in the stability and organization of microtubules during cytokinesis.	207
293959	cd12902	SPRY_PRY_RNF135	PRY/SPRY domain in RING finger protein RNF135. This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of the RING finger protein RNF135 (also known as Riplet/RNF135), which ubiquitinates RIG-I (retinoic acid-inducible gene-I) to promote interferon-beta induction during the early phase of viral infection. Normally, RIG-I is activated by TRIM25 in response to viral infection, leading to activation of the RIG-I signaling pathway, thus resulting in type I interferon production to limit viral replication. However, RNF135, consisting of an N-terminal RING finger domain, C-terminal SPRY and PRY motifs and showing sequence similarity to TRIM25, acts as an alternative factor that promotes RIG-I activation independent of TRIM25.	168
293960	cd12903	SPRY_PRY_SPRYD4	PRY/SPRY domain containing protein 4 (SPRYD4). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is encoded by the SPRYD4 gene. SPRYD4 (SPRY domain containing 4) is ubiquitously expressed in many human tissues, most strongly in kidney, bladder, brain, thymus and stomach. Subcellular localization studies show that the SPRYD4 protein is localized in the nucleus when overexpressed in COS-7 green monkey cells. It has remained uncharacterized thus far.	169
293961	cd12904	SPRY_BSPRY	SPRY domain in Ro-Ret family. This domain, named BSPRY, has been identified in the Ro-Ret family, since the protein is composed of a B-box, an alpha-helical coiled coil and a SPRY domain. The gene for BSPRY resides on human chromosome 9 and is specifically expressed in testis. The function of BSPRY is not known, but several related proteins of the RING-Box-coiled-coil (RBCC) family have been implicated in cell transformation.	171
293962	cd12905	SPRY_PRY_A33L	zinc-binding protein A33-like. This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM69 and TRIM proteins NF7 and bloodthirsty (bty). TRIM69 is a novel testis E3 ubiquitin ligase that may function to ubiquitinate its particular substrates during spermatogenesis. In humans, TRIM69 localizes in the cytoplasm and nucleus, and requires an intact RING finger domain to function. TRIM protein NF7, which also contains a chromodomain (CHD) at the N-terminus and an RFP (Ret finger protein)-like domain at the C-terminus, is required for its association with transcriptional units of RNA polymerase II which is mediated by a trimeric B box. In Xenopus oocytes, xNF7 has been identified as a nuclear microtubule-associated protein (MAP) whose microtubule-bundling activity, but not E3-ligase activity, contributes to microtubule organization and spindle integrity. Bloodthirsty (bty) is a novel gene identified in zebrafish and has been shown to likely play a role in regulation of the terminal steps of erythropoiesis.	178
293963	cd12906	SPRY_SOCS1-2-4	SPRY domain in the suppressor of cytokine signaling 1, 2, 4 families (SOCS1, SOCS2, SOCS4). The SPRY domain-containing SOCS box protein family (SPSB1-4, also known as SSB-1 to -4) is composed of a central SPRY protein interaction domain and a C-terminal SOCS box. All four SPSB proteins interact with c-Met, the hepatocyte growth factor receptor, but only SPSB1, SPSB2, and SPSB4 interact with prostate apoptosis response protein 4 (Par-4). They are negative regulators that recruit the ECS E3 ubiquitin ligase complex to polyubiquitinate inducible nitric-oxide synthase (iNOS), resulting in its proteasomal degradation, thus contributing to protection against the cytotoxic effect of iNOS in activated macrophages. It has been shown that SPSB1 and SPSB4 induce the degradation of iNOS more strongly than SPSB2. The Drosophila melanogaster SPSB1 homolog, GUSTAVUS, interacts with the DEAD box RNA helicase Vasa. Suppressor of cytokine signaling (SOCS) proteins negatively regulate signaling from JAK-associated cytokine receptor complexes, and play key roles in the regulation of immune homeostasis.	174
293964	cd12907	SPRY_Fbox	SPRY domain in the F-box family Fbxo45. Fbxo45 is a novel synaptic E3 and ubiquitin ligase, related to the suppressor of cytokine signaling (SOCS) proteins and located N-terminal to a SPRY (SPla and the ryanodine receptor) domain. Fbxo45 induces the degradation of a synaptic vesicle-priming factor, Munc13-1, via the SPRY domain, thus playing an important role in the regulation of neurotransmission by modulating Munc13-1 at the synapse. F-box motifs are found in proteins that function as the substrate recognition component of SCF E3 complexes.	175
293965	cd12908	SPRYD3	SPRY domain-containing protein 3. This family contains SPRY domain-containing protein 3 (SPRYD3). In humans, it is highly expressed in most tissues, including brain, kidney, heart, intestine, skeletal muscle, and testis. It also has cross-species conservation, suggesting that it is likely to carry out important cellular processes.	171
293966	cd12909	SPRY_RanBP9_10	SPRY domain in Ran binding proteins 9 and 10. This family includes SPRY domain in Ran binding protein (RBP or RanBPM) 9 and 10, and similar proteins. RanBP9 (also known as RanBPM), a binding partner of Ran, is a small Ras-like GTPase that exerts multiple functions via interactions with various proteins. RanBP9 and RanBP10 also act as androgen receptor (AR) coactivators. Both consist of the N-terminal proline- and glutamine-rich regions, the SPRY domain, and LisH-CTLH and CRA motifs. SPRY domain of RanBPM forms a complex with CD39, a prototypic member of the NTPDase family, thus down-regulating activity substantially. RanBP10 enhances the transcriptional activity of AR in a ligand-dependent manner and exhibits a protein expression pattern different from RanBPM in various cell lines. RanBP10 is highly expressed in AR-positive prostate cancer LNCaP cells, while RanBPM is abundant in WI-38 and MCF-7 cells.	144
293967	cd12910	SPRY_SSH4_like	SPRY domain in SSH4 and similar proteins. This family includes SPRY domain in SSH4 (suppressor of SHR3 null mutation protein 4) and similar proteins. SSH4 is a component of the endosome-vacuole trafficking pathway that regulates nutrient transport and may be involved in processes determining whether plasma membrane proteins are degraded or routed to the plasma membrane. The SPRY domain in SSH4 may be involved in cargo recognition, either directly or by combination with other adaptors, possibly leading to a higher selectivity. In yeast, SSH4 and the homologous protein EAR1 (endosomal adapter of RSP5) recruit Rsp5p, an essential ubiquitin ligase of the Nedd4 family, and assist it in its function at multivesicular bodies by directing the ubiquitylation of specific cargoes.	192
350336	cd12911	HK_sensor	Sensor domains of Histidine Kinase receptors. Histidine kinase (HK) receptors are part of two-component systems (TCS) in bacteria that play a critical role for sensing and adapting to environmental changes. Typically, HK receptors contain an extracellular sensing domain flanked by two transmembrane helices, an intracellular dimerization histidine phosphorylation domain (DHp), and a C-terminal kinase domain, with many variations on this theme. HK receptors in this family contain double PDC (PhoQ/DcuS/CitA) sensor domains. Signals detected by the sensor domain are transmitted through DHp to the kinase domain, resulting in the phosphorylation of a conserved histidine residue in DHp; phosphotransfer to a conserved aspartate in its cognate response regulator (RR) follows, which leads to the activation of genes for downstream cellular responses. The HK family includes not just histidine kinase receptors but also sensors for chemotaxis proteins and diguanylate cyclase receptors, implying a combinatorial molecular evolution.	100
350337	cd12912	PDC2_MCP_like	second PDC (PhoQ/DcuS/CitA) domain of methyl-accepting chemotaxis proteins and similar domains. Members of this subfamily display varying domain architectures but all contain double PDC (PhoQ/DcuS/CitA) sensor domains. This model represents the second PDC domain of Methyl-accepting chemotaxis proteins (MCPs), Histidine kinases (HKs), and other similar domains. Many members contain both HAMP (HK, Adenylyl cyclase, MCP, and Phosphatase) and MCP domains, which are signalling domains that interact with protein partners to relay a signal. MCPs are part of a transmembrane protein complex that controls bacterial chemotaxis. HK receptors are part of two-component systems (TCS) in bacteria that play a critical role for sensing and adapting to environmental changes. Typically, HK receptors contain an extracellular sensing domain flanked by two transmembrane helices, an intracellular dimerization histidine phosphorylation domain (DHp), and a C-terminal kinase domain, with many variations on this theme. In the case of HKs, signals detected by the sensor domain are transmitted through DHp to the kinase domain, resulting in the phosphorylation of a conserved histidine residue in DHp; phosphotransfer to a conserved aspartate in its cognate response regulator (RR) follows, which leads to the activation of genes for downstream cellular responses.	92
350338	cd12913	PDC1_MCP_like	first PDC (PhoQ/DcuS/CitA) domain of methyl-accepting chemotaxis proteins and similar domains. Members of this subfamily display varying domain architectures but all contain double PDC (PhoQ/DcuS/CitA) sensor domains. This model represents the first PDC domain of Methyl-accepting chemotaxis proteins (MCPs), Histidine kinases (HKs), and other similar domains. Many members contain both HAMP (HK, Adenylyl cyclase, MCP, and Phosphatase) and MCP domains, which are signalling domains that interact with protein partners to relay a signal. MCPs are part of a transmembrane protein complex that controls bacterial chemotaxis. HK receptors are part of two-component systems (TCS) in bacteria that play a critical role for sensing and adapting to environmental changes. Typically, HK receptors contain an extracellular sensing domain flanked by two transmembrane helices, an intracellular dimerization histidine phosphorylation domain (DHp), and a C-terminal kinase domain, with many variations on this theme. In the case of HKs, signals detected by the sensor domain are transmitted through DHp to the kinase domain, resulting in the phosphorylation of a conserved histidine residue in DHp; phosphotransfer to a conserved aspartate in its cognate response regulator (RR) follows, which leads to the activation of genes for downstream cellular responses.	139
350339	cd12914	PDC1_DGC_like	first PDC (PhoQ/DcuS/CitA) domain of diguanylate-cyclase and similar domains. Members of this subfamily display varying domain architectures but all contain double PDC (PhoQ/DcuS/CitA) sensor domains. This model represents the first PDC domain of Diguanylate-cyclases (DGCs), Histidine kinases (HKs), and other similar domains. Many members of this subfamily contain a C-terminal DGC (also called GGDEF) domain. DGCs regulate the turnover of cyclic diguanosine monophosphate. HK receptors are part of two-component systems (TCS) in bacteria that play a critical role for sensing and adapting to environmental changes. Typically, HK receptors contain an extracellular sensing domain flanked by two transmembrane helices, an intracellular dimerization histidine phosphorylation domain (DHp), and a C-terminal kinase domain, with many variations on this theme. In the case of HKs, signals detected by the sensor domain are transmitted through DHp to the kinase domain, resulting in the phosphorylation of a conserved histidine residue in DHp; phosphotransfer to a conserved aspartate in its cognate response regulator (RR) follows, which leads to the activation of genes for downstream cellular responses.	123
350340	cd12915	PDC2_DGC_like	second PDC (PhoQ/DcuS/CitA) domain of diguanylate-cyclase and similar domains. Members of this subfamily display varying domain architectures but all contain double PDC (PhoQ/DcuS/CitA) sensor domains. This model represents the second PDC domain of Diguanylate-cyclases (DGCs), Histidine kinases (HKs), and other similar domains. Many members of this subfamily contain a C-terminal DGC (also called GGDEF) domain. DGCs regulate the turnover of cyclic diguanosine monophosphate. HK receptors are part of two-component systems (TCS) in bacteria that play a critical role for sensing and adapting to environmental changes. Typically, HK receptors contain an extracellular sensing domain flanked by two transmembrane helices, an intracellular dimerization histidine phosphorylation domain (DHp), and a C-terminal kinase domain, with many variations on this theme. In the case of HKs, signals detected by the sensor domain are transmitted through DHp to the kinase domain, resulting in the phosphorylation of a conserved histidine residue in DHp; phosphotransfer to a conserved aspartate in its cognate response regulator (RR) follows, which leads to the activation of genes for downstream cellular responses.	96
240599	cd12916	VKOR_1	Vitamin K epoxide reductase family in bacteria and plants. This family includes vitamin K epoxide reductase (VKOR) present in bacteria and plants. VKOR (also named VKORC1) is an integral membrane protein that catalyzes the reduction of vitamin K 2,3-epoxide and vitamin K to vitamin K hydroquinone, an essential co-factor subsequently used in the gamma-carboxylation of glutamic acid residues in blood coagulation enzymes. All homologs of VKOR contain an active site CXXC motif, which is switched between reduced and disulfide-bonded states during the reaction cycle. In some plant and bacterial homologs, the VKOR domain is fused with domains of the thioredoxin family of oxidoreductases which may function as redox partners in initiating the reduction cascade.	133
240600	cd12917	VKOR_euk	Vitamin K epoxide reductase family in eukaryotes, excluding plants. This family includes vitamin K epoxide reductase (VKOR) present in eukaryotes, excluding plants. VKOR (also named VKORC1) is an integral membrane protein that catalyzes the reduction of vitamin K 2,3-epoxide and vitamin K to vitamin K hydroquinone, an essential co-factor subsequently used in the gamma-carboxylation of glutamic acid residues in blood coagulation enzymes. All homologs of VKOR contain an active site CXXC motif, which is switched between reduced and disulfide-bonded states during the reaction cycle. Warfarin, a widely used oral anticoagulant used in medicine as well as rodenticides, inhibits the activity of VKOR, resulting in decreased levels of reduced vitamin K, which is required for the function of several clotting factors. However, the anticoagulation effect of warfarin is significantly associated with polymorphism of certain genes, including VKORC1. Interestingly, in rodents, an adaptive trait appears to have evolved convergently by selection on new or standing genetic polymorphisms in VKORC1 as well as by adaptive introgressive hybridization between species, likely brought about by human-mediated dispersal.	140
240601	cd12918	VKOR_arc	Vitamin K epoxide reductase family in archaea and some bacteria. This family includes vitamin K epoxide reductase (VKOR) mostly present in archaea and some bacteria. VKOR (also named VKORC1) is an integral membrane protein that catalyzes the reduction of vitamin K 2,3-epoxide and vitamin K to vitamin K hydroquinone, an essential co-factor subsequently used in the gamma-carboxylation of glutamic acid residues in blood coagulation enzymes. All homologs of VKOR contain an active site CXXC motif, which is switched between reduced and disulfide-bonded states during the reaction cycle. In some bacterial homologs, the VKOR domain is fused with domains of the thioredoxin family of oxidoreductases which may function as redox partners in initiating the reduction cascade.	126
240602	cd12919	VKOR_2	Vitamin K epoxide reductase family in bacteria. This family includes vitamin K epoxide reductase (VKOR) present only in bacteria. VKOR (also named VKORC1) is an integral membrane protein that catalyzes the reduction of vitamin K 2,3-epoxide and vitamin K to vitamin K hydroquinone, an essential co-factor subsequently used in the gamma-carboxylation of glutamic acid residues in blood coagulation enzymes. All homologs of VKOR contain an active site CXXC motif, which is switched between reduced and disulfide-bonded states during the reaction cycle. In some bacterial homologs, the VKOR domain is fused with domains of the thioredoxin family of oxidoreductases which may function as redox partners in initiating the reduction cascade.	169
240603	cd12920	VKOR_3	Vitamin K epoxide reductase family in bacteria. This family includes vitamin K epoxide reductase (VKOR) present in proteobacteria and spirochetes. VKOR (also named VKORC1) is an integral membrane protein that catalyzes the reduction of vitamin K 2,3-epoxide and vitamin K to vitamin K hydroquinone, an essential co-factor subsequently used in the gamma-carboxylation of glutamic acid residues in blood coagulation enzymes. All homologs of VKOR contain an active site CXXC motif, which is switched between reduced and disulfide-bonded states during the reaction cycle. In some bacterial homologs, the VKOR domain is fused with domains of the thioredoxin family of oxidoreductases which may function as redox partners in initiating the reduction cascade.	134
240604	cd12921	VKOR_4	Vitamin K epoxide reductase (VKOR) family in bacteria. This family includes vitamin K epoxide reductase (VKOR) present only in bacteria. VKOR (also named VKORC1) is an integral membrane protein that catalyzes the reduction of vitamin K 2,3-epoxide and vitamin K to vitamin K hydroquinone, an essential co-factor subsequently used in the gamma-carboxylation of glutamic acid residues in blood coagulation enzymes. All homologs of VKOR contain an active site CXXC motif, which is switched between reduced and disulfide-bonded states during the reaction cycle. In some bacterial homologs, the VKOR domain is fused with domains of the thioredoxin family of oxidoreductases which may function as redox partners in initiating the reduction cascade. This family also has a cysteine peptidase domain present at the N-terminus of the VKOR domain.	128
240605	cd12922	VKOR_5	Vitamin K epoxide reductase family in bacteria. This family includes vitamin K epoxide reductase (VKOR) mostly present in actinobacteria. VKOR (also named VKORC1) is an integral membrane protein that catalyzes the reduction of vitamin K 2,3-epoxide and vitamin K to vitamin K hydroquinone, an essential co-factor subsequently used in the gamma-carboxylation of glutamic acid residues in blood coagulation enzymes. All homologs of VKOR contain an active site CXXC motif, which is switched between reduced and disulfide-bonded states during the reaction cycle. In some bacterial homologs, the VKOR domain is fused with domains of the thioredoxin family of oxidoreductases which may function as redox partners in initiating the reduction cascade.	133
214016	cd12923	iSH2_PI3K_IA_R	Inter-Src homology 2 (iSH2) helical domain of Class IA Phosphoinositide 3-kinase Regulatory subunits. PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives. They play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation, and apoptosis. They are classified according to their substrate specificity, regulation, and domain structure. Class IA PI3Ks are heterodimers of a p110 catalytic (C) subunit and a p85-related regulatory (R) subunit. The R subunit down-regulates PI3K basal activity, stabilizes the C subunit, and plays a role in the activation downstream of tyrosine kinases. All R subunits contain two SH2 domains that flank an intervening helical domain (iSH2), which binds to the N-terminal adaptor-binding domain (ABD) of the catalytic subunit. In vertebrates, there are three genes (PIK3R1, PIK3R2, and PIK3R3) that encode for different Class IA PI3K R subunits.	152
214017	cd12924	iSH2_PIK3R1	Inter-Src homology 2 (iSH2) helical domain of Class IA Phosphoinositide 3-kinase Regulatory subunit 1, PIK3R1, also called p85alpha. PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives. They play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. They are classified according to their substrate specificity, regulation, and domain structure. Class IA PI3Ks are heterodimers of a p110 catalytic (C) subunit and a p85-related regulatory (R) subunit. The R subunit down-regulates PI3K basal activity, stabilizes the C subunit, and plays a role in the activation downstream of tyrosine kinases. All R subunits contain two SH2 domains that flank an intervening helical domain (iSH2), which binds to the N-terminal adaptor-binding domain (ABD) of the catalytic subunit. In addition, p85alpha, also called PIK3R1, contains N-terminal SH3 and GAP domains. p85alpha carries out functions independent of its PI3K regulatory role. It can independently stimulate signaling pathways involved in cytoskeletal rearrangements. Insulin-sensitive tissues express splice variants of the PIK3R1 gene, p50alpha and p55alpha, which may play important roles in insulin signaling during lipid and glucose metabolism. Mice deficient in PIK3R1 die perinatally, indicating its importance in development.	161
214018	cd12925	iSH2_PIK3R3	Inter-Src homology 2 (iSH2) helical domain of Class IA Phosphoinositide 3-kinase Regulatory subunit 3, PIK3R3, also called p55gamma. PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives. They play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation, and apoptosis. They are classified according to their substrate specificity, regulation, and domain structure. Class IA PI3Ks are heterodimers of a p110 catalytic (C) subunit and a p85-related regulatory (R) subunit. The R subunit down-regulates PI3K basal activity, stabilizes the C subunit, and plays a role in the activation downstream of tyrosine kinases. All R subunits contain two SH2 domains that flank an intervening helical domain (iSH2), which binds to the N-terminal adaptor-binding domain (ABD) of the catalytic subunit. p55gamma, also called PIK3R3 or p55PIK, also contains a unique N-terminal 24-amino acid residue (N24) that interacts with cell cycle modulators to promote cell cycle progression.	161
214019	cd12926	iSH2_PIK3R2	Inter-Src homology 2 (iSH2) helical domain of Class IA Phosphoinositide 3-kinase Regulatory subunit 2, PIK3R2, also called p85beta. PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives. They play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation, and apoptosis. They are classified according to their substrate specificity, regulation, and domain structure. Class IA PI3Ks are heterodimers of a p110 catalytic (C) subunit and a p85-related regulatory (R) subunit. The R subunit down-regulates PI3K basal activity, stabilizes the C subunit, and plays a role in the activation downstream of tyrosine kinases. All R subunits contain two SH2 domains that flank an intervening helical domain (iSH2), which binds to the N-terminal adaptor-binding domain (ABD) of the catalytic subunit. p85beta, also called PIK3R2, contains N-terminal SH3 and GAP domains. It is expressed ubiquitously but at lower levels than p85alpha. Its expression is increased in breast and colon cancer and correlates with tumor progression and enhanced invasion. During viral infection, the viral nonstructural (NS1) protein binds p85beta specifically, which leads to PI3K activation and the promotion of viral replication. Mice deficient in PIK3R2 develop normally and exhibit moderate metabolic and immunological defects.	161
240571	cd12927	MMP_TTHA0227_like	Minimal MMP-like domain found in Thermus thermophilus TTHA0227, Acidothermus cellulolyticus ACEL2062 and similar proteins. The family includes hypothetical proteins from bacteria that contain a minimal metalloprotease (MMP)-like domain consisting of 3-stranded mixed 2-beta sheets. These proteins may belong to a superfamily of bacterial zinc metallo-peptidases, which is characterized by a conserved HExxHxxGxxD (x could be any amino acid) motif. However, some family members carry a shorter HExxHxxG motif or HExxH motif. Some others do not have such a motif, but still share very high sequence similarity.	97
240592	cd12929	GUCT	RNA-binding GUCT domain found in the RNA helicase II/Gu protein family. This family includes vertebrate RNA helicase II/Gualpha (RH-II/Gualpha) and RNA helicase II/Gubeta (RH-II/Gubeta), both of which consist of a DEAD box helicase domain (DEAD), a helicase conserved C-terminal domain, and a Gu C-terminal (GUCT) domain. They localize to nucleoli, suggesting roles in ribosomal RNA production, but RH-II/Gubeta also localizes to nuclear speckles containing the splicing factor SC35, suggesting its possible involvement in pre-mRNA splicing. In contrast to RH-II/Gualpha, RH-II/Gubeta has RNA-unwinding activity, but no RNA-folding activity. The family also contains plant DEAD-box ATP-dependent RNA helicase 7 (RH7 or PRH75), Thermus thermophilus heat resistant RNA-dependent ATPase (Hera) and similar proteins. RH7 is a new nucleus-localized member of the DEAD-box protein family from higher plants. It displays a weak ATPase activity which is barely stimulated by RNA ligands. RH7 contains an N-terminal KDES domain rich in lysine, glutamic acid, aspartic acid, and serine residues, seven highly conserved helicase motifs in the central region, a GUCT domain, and a C-terminal GYR domain harboring a large number of glycine residues interrupted by either arginines or tyrosines. Thermus thermophilus Hera is a DEAD box helicase that binds fragments of 23S rRNA and RNase P RNA via its C-terminal domain. It contains a helicase core that harbors two RecA-like domains termed RecA_N and RecA_C, a dimerization domain (DD), and a C-terminal RNA-binding domain (RBD) that reveals a compact, RRM-like fold and shows sequence similarity with the typical GUCT domain found in the RNA helicase II/Gu protein family.	72
410577	cd12930	GAT_SF	GAT domain found in eukaryotic GGAs, metazoan Tom1-like proteins, metazoan STAMs, fungal Vps27, and similar proteins. The GAT (GGA and Tom1) domain superfamily includes the canonical GAT domain found in ADP-ribosylation factor (Arf)-binding proteins (GGAs) from eukaryotes, myb protein 1 (Tom1)-like proteins from metazoa, and LAS seventeen-binding protein 5 (Lsb5p)-like proteins from fungi. The canonical GAT domain is a monomeric three-helix bundle that binds ubiquitin. GGAs, also called Golgi-localized gamma-ear-containing Arf-binding proteins, belong to a family of ubiquitously expressed, monomeric, motif-binding cargo/clathrin adaptor proteins that regulate clathrin-mediated trafficking of cargo proteins from the trans-Golgi network (TGN) to endosomes. GGAs play important roles in ubiquitin-dependent sorting of cargo proteins both in biosynthetic and endocytic pathways. Tom1 and its related proteins, Tom1L1 and Tom1L2, form a protein family sharing an N-terminal VHS-domain followed by a GAT domain. Tom1 family proteins bind to ubiquitin, ubiquitinated proteins, and Toll-interacting protein (Tollip) through its GAT domain. They do not associate with either Arf GTPases through its GAT domain nor with acidic cluster-dileucine sequences through its VHS domain. The GAT domain superfamily also includes the non-canonical GAT domain found in several components of the ESCRT-0 complex, including signal transducing adapter molecules (STAMs) and hepatocyte growth factor-regulated tyrosine kinase substrate (Hrs) from metazoa, as well as vacuolar protein sorting-associated protein 27 (Vps27) and class E vacuolar protein-sorting machinery protein Hse1 from fungi. Hrs, together with STAM, forms a Hrs/STAM core complex. Vps27, together with Hse1, forms a Vps27/Hse1 core complex. Those complexes consist of two intertwined GAT domains, each consisting of two helices from one subunit, and one from the other subunit. The intertwined GAT heterodimer acts as a scaffold for binding of ubiquitinated cargo proteins and coordinating ubiquitination and deubiquitination reactions that regulate sorting.	77
240579	cd12931	eNOPS_SF	NOPS domain, including C-terminal helical extension region, in the p54nrb/PSF/PSP1 family. All members in this family contain a DBHS domain (for Drosophila behavior, human splicing), which comprises two conserved RNA recognition motifs (RRM1 and RRM2), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a charged protein-protein interaction NOPS (NONA and PSP1) domain with a long helical C-terminal extension. The NOPS domain specifically binds to RRM2 domain of the partner DBHS protein via a substantial interaction surface. Its highly conserved C-terminal residues are critical for functional DBHS dimerization while the highly conserved C-terminal helical extension, forming a right-handed antiparallel heterodimeric coiled-coil, is essential for localization of these proteins to subnuclear bodies. PSF has an additional large N-terminal domain that differentiates it from other family members. The p54nrb/PSF/PSP1 family includes 54 kDa nuclear RNA- and DNA-binding protein (p54nrb), polypyrimidine tract-binding protein (PTB)-associated-splicing factor (PSF) and paraspeckle protein 1 (PSP1), which are ubiquitously expressed and are well conserved in vertebrates. p54nrb, also termed NONO or NMT55, is a multi-functional protein involved in numerous nuclear processes including transcriptional regulation, splicing, DNA unwinding, nuclear retention of hyperedited double-stranded RNA, viral RNA processing, control of cell proliferation, and circadian rhythm maintenance. PSF, also termed POMp100, is also a multi-functional protein that binds RNA, single-stranded DNA (ssDNA), double-stranded DNA (dsDNA) and many factors, and mediates diverse activities in the cell. PSP1, also termed PSPC1, is a novel nucleolar factor that accumulates within a new nucleoplasmic compartment, termed paraspeckles, and diffusely distributes in the nucleoplasm. The cellular function of PSP1 remains unknown currently. The family also includes some p54nrb/PSF/PSP1 homologs from invertebrate species, for instance puff-specific protein Bj6 (also termed NONA), encoded by the Drosophila melanogaster gene no-on-transient A (nonA), and protein Hrp65, encoded by the Chironomus tentans hrp65 gene. D. melanogaster NONA is involved in eye development and behavior and may play a role in circadian rhythm maintenance, similar to vertebrate p54nrb. C. tentans Hrp65 is a component of nuclear fibers associated with ribonucleoprotein particles in transit from the gene to the nuclear pore.	90
240576	cd12932	RRP7_like	RRP7 domain ribosomal RNA-processing protein 7 (Rrp7p), ribosomal RNA-processing protein 7 homolog A (Rrp7A), and similar proteins. This CD corresponds to the RRP7 domain of Rrp7p and Rrp7A. Rrp7p is encoded by YCL031C gene from Saccharomyces cerevisiae. It is an essential yeast protein involved in pre-rRNA processing and ribosome assembly, and is speculated to be required for correct assembly of rpS27 into the pre-ribosomal particle. Rrp7A, also termed gastric cancer antigen Zg14, is the Rrp7p homolog mainly found in Metazoans. The cellular function of Rrp7A remains unclear currently. Both Rrp7p and Rrp7A harbor an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal RRP7 domain.	118
240597	cd12933	eIF3G	eIF3G domain found in eukaryotic translation initiation factor 3 subunit G (eIF-3G) and similar proteins. eIF-3G, also termed eIF-3 subunit 4, or eIF-3-delta, or eIF3-p42, or eIF3-p44, is the RNA-binding subunit of eIF3. eIF3 is a large multi-subunit complex that plays a central role in the initiation of translation by binding to the 40 S ribosomal subunit and promoting the binding of methionyl-tRNAi and mRNA. eIF-3G binds 18 S rRNA and beta-globin mRNA, and therefore appears to be a nonspecific RNA-binding protein. Besides, eIF-3G is one of the cytosolic targets; it interacts with mature apoptosis-inducing factor (AIF). This family also includes yeast eIF3-p33, a homolog of vertebrate eIF-3G; it plays an important role in the initiation phase of protein synthesis in yeast. It binds both mRNA and rRNA fragments due to an RNA recognition motif near its C-terminus.	114
240585	cd12934	LEM	LEM (Lap2/Emerin/Man1) domain found in emerin, lamina-associated polypeptide 2 (LAP2), inner nuclear membrane protein Man1 and similar proteins. The family corresponds to a group of inner nuclear membrane proteins containing LEM domain. Emerin occurs in four phosphorylated forms and plays a role in cell cycle-dependent events. It is absent from the inner nuclear membrane in most patients with X-linked muscular dystrophy. Emerin interacts with A-type and B-type lamins. Man1, also termed LEM domain-containing protein 3 (LEMD3) is an integral protein of the inner nuclear membrane that binds to nuclear lamins and emerin, thus playing a role in nuclear organization. LAP2, also termed thymopoietin (TP), or thymopoietin-related peptide (TPRP), is composed of isoform alpha and isoforms beta/gamma and may be involved in chromatin organization and post-mitotic reassembly. Some LAP2 isoforms are inner nuclear membrane proteins that can bind to nuclear lamins and chromatin, while others are non-membrane nuclear polypeptides. This family also contains LEM domain-containing protein LEMP-1 and LEM2. LEMP-1, also termed cancer/testis antigen 50 (CT50), is encoded by LEMD1, a novel testis-specific gene expressed in colorectal cancers. LEMP-1 may function as a cancer-testis antigen for immunotherapy of colorectal carcinoma (CRC). LEM2, also termed LEMD2, is a novel Man1-related ubiquitously expressed inner nuclear membrane protein required for normal nuclear envelope morphology. Association with lamin A is required for its proper nuclear envelope localization while its binding to lamin C plays an important role in the organization of lamin A/C complexes. Some uncharacterized LEM domain-containing proteins are also included in this family. Unlike other family members, these harbor an ankyrin repeat region that may mediate protein-protein interactions.	37
240596	cd12935	LEM_like	LEM-like domain of lamina-associated polypeptide 2 (LAP2) and similar proteins. LAP2, also termed thymopoietin (TP), or thymopoietin-related peptide (TPRP), is composed of isoform alpha and isoforms beta/gamma and may be involved in chromatin organization and postmitotic reassembly. Some of the LAP2 isoforms are inner nuclear membrane proteins that can bind to nuclear lamins and chromatin, while others are nonmembrane nuclear polypeptides. All LAP2 isoforms contain an N-terminal lamina-associated polypeptide-Emerin-MAN1 (LEM)-domain that is connected to a highly divergent LEM-like domain by an unstructured linker. Both LEM and LEM-like domains share the same structural fold, mainly composed of two large parallel alpha helices. However, the biochemical nature of their solvent-accessible residues is completely different, which indicates the two domains may target different protein surfaces. The LEM domain is responsible for the interaction with the nonspecific DNA binding protein barrier-to-autointegration factor (BAF), and the LEM-like domain is involved in chromosome binding. The family also includes the yeast helix-extension-helix domain-containing proteins, Heh1p (formerly called Src1p) and Heh2p, and their uncharacterized homologs found mainly in fungi and several in bacteria. Heh1p and Heh2p are inner nuclear membrane proteins that might interact with nuclear pore complexes (NPCs). Heh1p is involved in mitosis. It functions at the interface between subtelomeric gene expression and transcription export (TREX)-dependent messenger RNA export through NPCs. The function of Heh2p remains ill-defined. Both Heh1p and Heh2p contain a LEM-like domain (also termed HeH domain), but lack a LEM domain.	36
240593	cd12936	GUCT_RHII_Gualpha_beta	RNA-binding GUCT domain found in vertebrate RNA helicase II/Gualpha (RH-II/Gualpha), RNA helicase II/Gubeta (RH-II/Gubeta) and similar proteins. This subfamily corresponds to the Gu C-terminal (GUCT) domain of RH-II/Gualpha and RH-II/Gubeta, two paralogues found in vertebrates. RH-II/Gualpha, also termed nucleolar RNA helicase 2, or DEAD box protein 21, or nucleolar RNA helicase Gu, is a bifunctional enzyme that displays independent RNA-unwinding and RNA-folding activities. It unwinds double-stranded RNA in the 5' to 3' direction in the presence of Mg2+ through the domains in its N-terminal region. In contrast, it folds single-stranded RNA in an ATP-dependent manner and its C-terminal region is responsible for the Mg2+ independent RNA-foldase activity. RH-II/Gualpha consists of a DEAD box helicase domain (DEAD), a helicase conserved C-terminal domain (helicase_C), and a GUCT followed by three FRGQR repeats and one PRGQR sequence. The DEAD and helicase_C domains may play critical roles in the RNA-helicase activity of RH-II/Gualpha. The function of GUCT domain remains unclear. The C-terminal region responsible for the RNA-foldase activity does not overlap with the GUCT domain. RH-II/Gubeta, also termed ATP-dependent RNA helicase DDX50, or DEAD box protein 50, or nucleolar protein Gu2, shows significant sequence homology with RH-II/Gualpha. It contains a DEAD domain, a helicase_C domain, and a GUCT domain followed by an arginine-serine-rich sequence but not (F/P)RGQR repeats in RH-II/Gualpha. Both RH-II/Gualpha and RH-II/Gubeta localize to nucleoli, suggesting roles in ribosomal RNA production, but RH-II/Gubeta also localizes to nuclear speckles containing the splicing factor SC35, suggesting its possible involvement in pre-mRNA splicing. In contrast to RH-II/Gualpha, RH-II/Gubeta has RNA-unwinding activity, but no RNA-folding activity.	93
240594	cd12937	GUCT_RH7_like	RNA-binding GUCT domain found in plant DEAD-box ATP-dependent RNA helicase 7 (RH7) and similar proteins. This subfamily corresponds to the Gu C-terminal (GUCT) domain of RH7 and similar proteins. RH7, also termed plant RNA helicase 75 (PRH75), is a new nucleus-localized member of the DEAD-box protein family from higher plants. It displays a weak ATPase activity which is barely stimulated by RNA ligands. RH7 contains an N-terminal KDES domain rich in lysine, glutamic acid, aspartic acid, and serine residues, seven highly conserved helicase motifs in the central region, a GUCT domain, and a C-terminal GYR domain harboring a large number of glycine residues interrupted by either arginines or tyrosines. RH7 is RNA specific and harbors two possible RNA-binding motifs, the helicase motif VI (HRIGRTGR) and the C-terminal glycine-rich GYR domain.	86
240595	cd12938	GUCT_Hera	RNA-binding GUCT-like domain found in Thermus thermophilus heat resistant RNA-dependent ATPase (Hera) and similar proteins. This subfamily corresponds to the Gu C-terminal (GUCT)-like domain of Hera and similar proteins. Thermus thermophilus Hera is a DEAD box helicase that binds fragments of 23S rRNA and RNase P RNA via its C-terminal domain. It contains a helicase core that harbors two RecA-like domains termed RecA_N and RecA_C, a dimerization domain (DD), and a C-terminal RNA-binding domain (RBD) that reveals a compact, RRM-like fold and shows sequence similarity with the GUCT domain found in vertebrate RNA helicase II/Gualpha (RH-II/Gualpha), RNA helicase II/Gubeta (RH-II/Gubeta) and plant DEAD-box ATP-dependent RNA helicase 7 (RH7 or PRH75).	74
240586	cd12939	LEM_emerin	LEM (Lap2/Emerin/Man1) domain found in emerin. This CD corresponds to the LEM domain that is critical for binding to lamin A/C and is also involved in interaction with the DNA binding protein barrier-to-autointegration factor (BAF). Emerin is an inner nuclear membrane protein that occurs in four differently phosphorylated forms and plays a role in cell cycle-dependent events. It is absent from the inner nuclear membrane in most patients with X-linked muscular dystrophy. Emerin interacts with A-type and B-type lamins. It contains an N-terminal LEM domain followed by a poly-serine segment, a region rich in hydrophobic amino acids comprising the nuclear localization signal (NLS) followed by another poly-serine segment, and a C-terminal transmembrane region.	43
240587	cd12940	LEM_LAP2_LEMD1	LEM (Lap2/Emerin/Man1) domain found in lamina-associated polypeptide 2 (LAP2), LEM domain-containing protein 1 (LEMP-1) and similar proteins. This CD corresponds to the LEM domain of LAP2, LEMP-1 and similar proteins. LAP2, also termed thymopoietin (TP), or thymopoietin-related peptide (TPRP), is composed of isoform alpha and isoforms beta/gamma and may be involved in chromatin organization and post-mitotic reassembly. Some of LAP2 isoforms are inner nuclear membrane proteins that can bind to nuclear lamins and chromatin, while others are non-membrane nuclear polypeptides. All LAP2 isoforms contain an N-terminal LEM domain that is connected to a highly divergent LEM-like domain by an unstructured linker. Although LEM and LEM-like domains share the same structural fold composed of two large parallel alpha helices, the biochemical nature of the solvent-accessible residues is completely different, indicating that the two domains may target different protein surfaces. The LEM domain interacts with the nonspecific DNA binding protein barrier-to-autointegration factor (BAF) while the LEM-like domain is involved in chromosome binding. LEMP-1, also termed cancer/testis antigen 50 (CT50), is encoded by LEMD1, a novel testis-specific gene expressed in colorectal cancers. It may function as a cancer-testis antigen for immunotherapy of colorectal carcinoma (CRC). LEMP-1 contains an N-terminal LEM domain.	42
240588	cd12941	LEM_LEMD2	LEM (Lap2/Emerin/Man1) domain found in LEM domain-containing protein 2 (LEM2). This CD corresponds to the LEM domain that is responsible for the interaction with chromatin protein barrier-to-autointegration factor (BAF). LEM2, also termed LEMD2, is a novel Man1-related ubiquitously expressed inner nuclear membrane protein required for normal nuclear envelope morphology. Association with lamin A is required for its proper nuclear envelope localization. It also binds to lamin C and plays an important role in the organization of lamin A/C complexes. LEM2 contains an N-terminal LEM domain, two putative transmembrane domains and a MAN1-Src1p C-terminal (MSC) domain, but lacks the Man1-specific C-terminal RNA recognition motif (RRM).	38
240589	cd12942	LEM_Man1	LEM (Lap2/Emerin/Man1) domain found in inner nuclear membrane protein Man1. This CD corresponds to the LEM domain of Man1 and similar proteins. Man1, also termed LEM domain-containing protein 3 (LEMD3), is an integral protein of the inner nuclear membrane that binds to nuclear lamins and emerin, thus playing a role in nuclear organization. It is part of a protein complex essential for chromatin organization and cell division. It also functions as an important negative regulator for the transforming growth factor beta (TGF-beta)/activin/Nodal signaling pathway and bone morphogenetic protein (BMP) signaling pathway by directly interacting with chromatin-associated proteins and transcriptional regulators, including the R-Smads, Smad1, Smad2, and Smad3. Man1 is a unique type of left/right (LR) signaling regulator that acts on the inner nuclear membrane. Furthermore, Man1 plays a crucial role in angiogenesis. The vascular remodeling can be regulated at the inner nuclear membrane through interactions between Man1 and Smads. Man1 contains an N-terminal LEM domain, two putative transmembrane domains, a Man1-Src1p C-terminal (MSC) domain, and a C-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). The LEM domain interacts with DNA and chromatin-binding protein Barrier-to-Autointegration Factor, and is also necessary for efficient localization of Man1 in the inner nuclear membrane. It has been shown that the C-terminal nucleoplasmic region of Man1 exhibits a DNA binding winged helix domain and is responsible for both DNA and Smad binding.	44
240590	cd12943	LEM_ANKL1	LEM (Lap2/Emerin/Man1) domain found in ankyrin repeat and LEM domain-containing protein 1 (ANKL1). The family includes ANKL1, also termed ankyrin repeat domain-containing protein 41 (ANKRD41), or LEM-domain containing protein 3 (LEM3), and similar proteins. Although their biological roles remain unclear, the family members contain an N-terminal ankyrin repeat region, LEM domain and C-terminal GIY-YIG nuclease domain. The ankyrin repeats are unique motifs mediating protein-protein interactions. The LEM domain, mainly found in inner nuclear membrane proteins, may be involved in protein- or DNA-binding.	38
240591	cd12944	LEM_ANKL2	LEM (Lap2/Emerin/Man1) domain found in ankyrin repeat and LEM domain-containing protein 2 (ANKL2). The family includes ANKL2 and similar proteins. Although their biological roles remain unclear, the family members share an N-terminal LEM domain and an ankyrin repeat region. The LEM domain, mainly found in inner nuclear membrane proteins, may be involved in protein- or DNA-binding. The ankyrin repeats are unique motifs mediating protein-protein interactions.	43
240580	cd12945	NOPS_NONA_like	NOPS domain, including C-terminal coiled-coil region, in p54nrb/PSF/PSP1 homologs from invertebrate species. The family contains a DBHS domain (for Drosophila behavior, human splicing), which comprises two conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a charged protein-protein interaction NOPS (NONA and PSP1) domain. This model corresponds to the NOPS domain, with a long helical C-terminal extension, found in Drosophila melanogaster gene no-on-transient A (nonA) encoding puff-specific protein Bj6 (also termed NONA), Chironomus tentans hrp65 gene encoding protein Hrp65 and similar proteins. D. melanogaster NONA is involved in eye development and behavior, and may play a role in circadian rhythm maintenance, similar to vertebrate p54nrb. C. tentans Hrp65 is a component of nuclear fibers associated with ribonucleoprotein particles in transit from the gene to the nuclear pore. The NOPS domain specifically binds to the second RNA recognition motif (RRM2) domain of the partner DBHS protein via a substantial interaction surface. Its highly conserved C-terminal residues are critical for functional DBHS dimerization while the highly conserved C-terminal helical extension, forming a right-handed antiparallel heterodimeric coiled-coil, is essential for localization of these proteins to subnuclear bodies.	100
240581	cd12946	NOPS_p54nrb_PSF_PSPC1	NOPS domain, including C-terminal coiled-coil region, in p54nrb/PSF/PSPC1 family proteins. The family contains a DBHS domain (for Drosophila behavior, human splicing), which comprises two conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a charged protein-protein interaction NOPS (NONA and PSP1) domain. This model corresponds to the NOPS domain, with a long helical C-terminal extension, found in the p54nrb/PSF/PSPC1 proteins. The NOPS domain specifically binds to the second RNA recognition motif (RRM2) domain of the partner DBHS protein via a substantial interaction surface. Its highly conserved C-terminal residues are critical for functional DBHS dimerization while the highly conserved C-terminal helical extension, forming a right-handed antiparallel heterodimeric coiled-coil, is essential for localization of these proteins to subnuclear bodies. Members in the family include 54 kDa nuclear RNA- and DNA-binding protein (p54nrb), polypyrimidine tract-binding protein (PTB)-associated-splicing factor (PSF) and paraspeckle protein component 1 (PSPC1 or PSP1), which are ubiquitously expressed and are conserved in vertebrates. p54nrb, also termed NONO or NMT55, is a multi-functional protein involved in numerous nuclear processes including transcriptional regulation, splicing, DNA unwinding, nuclear retention of hyperedited double-stranded RNA, viral RNA processing, control of cell proliferation, and circadian rhythm maintenance. PSF, also termed POMp100, is a multi-functional protein that binds RNA, single-stranded DNA (ssDNA), double-stranded DNA (dsDNA) and many factors, and mediates diverse activities in the cell. PSPC1 is a novel nucleolar factor that accumulates within a new nucleoplasmic compartment, termed paraspeckles, and diffusely distributes in the nucleoplasm. The cellular function of PSPC1 remains unknown currently. PSF has an additional large N-terminal domain that differentiates it from other family members.	93
240582	cd12947	NOPS_p54nrb	NOPS domain, including C-terminal coiled-coil region, in 54 kDa nuclear RNA- and DNA-binding protein (p54nrb) and similar proteins. The family contains a DBHS domain (for Drosophila behavior, human splicing), which comprises two conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a charged protein-protein interaction NOPS (NONA and PSP1) domain. This model corresponds to the NOPS domain, with a long helical C-terminal extension, found in p54nrb, also termed non-POU domain-containing octamer-binding protein (NONO), or 55 kDa nuclear protein (NMT55), or DNA-binding p52/p100 complex 52 kDa subunit. It is a multi-functional protein involved in numerous nuclear processes including transcriptional regulation, splicing, DNA unwinding, nuclear retention of hyperedited double-stranded RNA, viral RNA processing, control of cell proliferation, and circadian rhythm maintenance. p54nrb is ubiquitously expressed and highly conserved in vertebrates. It binds both single- and double-stranded RNA and DNA, and also possesses inherent carbonic anhydrase activity. p54nrb forms a heterodimer with paraspeckle component 1 (PSPC1 or PSP1), localizing to paraspeckles in an RNA-dependent manner. It also forms a heterodimer with polypyrimidine tract-binding protein-associated-splicing factor (PSF). The NOPS domain specifically binds to the second RNA recognition motif (RRM2) domain of the partner DBHS protein via a substantial interaction surface. Its highly conserved C-terminal residues are critical for functional DBHS dimerization while the highly conserved C-terminal helical extension, forming a right-handed antiparallel heterodimeric coiled-coil, is essential for paraspeckle localization to subnuclear bodies.	94
240583	cd12948	NOPS_PSF	NOPS domain, including C-terminal coiled-coil region, in polypyrimidine tract-binding protein (PTB)-associated-splicing factor (PSF) and similar proteins. This model contains the NOPS (NONA and PSP1) domain of PSF (also termed proline- and glutamine-rich splicing factor, or 100 kDa DNA-pairing protein (POMp100), or 100 kDa subunit of DNA-binding p52/p100 complex), with a long helical C-terminal extension. PSF is a multifunctional protein that mediates diverse activities in the cell. It is ubiquitously expressed and highly conserved in vertebrates. PSF binds not only RNA but also single-stranded DNA (ssDNA) as well as double-stranded DNA (dsDNA) and facilitates the renaturation of complementary ssDNAs. Additionally, it promotes the formation of D-loops in superhelical duplex DNA, and is involved in cell proliferation. PSF can also interact with multiple factors. It is an RNA-binding component of spliceosomes and binds to insulin-like growth factor response element (IGFRE). Moreover, PSF functions as a transcriptional repressor interacting with Sin3A and mediating silencing through the recruitment of histone deacetylases (HDACs) to the DNA binding domain (DBD) of nuclear hormone receptors. As an RNA-binding component of spliceosomes, PSF binds to the insulin-like growth factor response element (IGFRE), and acts as an independent negative regulator of the transcriptional activity of the porcine P-450 cholesterol side-chain cleavage enzyme gene (P450scc) IGFRE. PSF is an essential pre-mRNA splicing factor and is dissociated from PTB and binds to U1-70K and serine-arginine (SR) proteins during apoptosis. In addition, PSF forms a heterodimer with the nuclear protein p54nrb, also known as non-POU domain-containing octamer-binding protein (NONO). The PSF/p54nrb complex displays a variety of functions, such as DNA recombination and RNA synthesis, processing, and transport. PSF contains two conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), which are responsible for interactions with RNA and for the localization of the protein in speckles. It also contains an N-terminal region rich in proline, glycine, and glutamine residues, which may play a role in interactions recruiting other molecules. The NOPS domain specifically binds to the second RNA recognition motif (RRM2) domain of the partner DBHS protein via a substantial interaction surface. Its highly conserved C-terminal residues are critical for functional DBHS dimerization while the highly conserved C-terminal helical extension, forming a right-handed antiparallel heterodimeric coiled-coil, is essential for localization of these proteins to subnuclear bodies.	97
240584	cd12949	NOPS_PSPC1	NOPS domain, including C-terminal coiled-coil region, in paraspeckle protein component 1 (PSPC1) and similar proteins. The family contains a DBHS domain (for Drosophila behavior, human splicing), which comprises two conserved RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a charged protein-protein interaction NOPS (NONA and PSP1) domain. This model corresponds to the NOPS domain, with a long helical C-terminal extension, of paraspeckle component 1 (PSPC1, also termed PSP1), a novel nucleolar factor that accumulates within a new nucleoplasmic compartment, termed paraspeckles, and diffusely distributes in the nucleoplasm. It is ubiquitously expressed and highly conserved in vertebrates. Although its cellular function remains unknown currently, PSPC1 forms a novel heterodimer with the nuclear protein p54nrb, also known as non-POU domain-containing octamer-binding protein (NONO), which localizes to paraspeckles in an RNA-dependent manner. The NOPS domain specifically binds to the second RNA recognition motif (RRM2) domain of the partner DBHS protein via a substantial interaction surface. Its highly conserved C-terminal residues are critical for functional DBHS dimerization while the highly conserved C-terminal helical extension, forming a right-handed antiparallel heterodimeric coiled-coil, is essential for localization of these proteins to subnuclear bodies.	94
240577	cd12950	RRP7_Rrp7p	RRP7 domain ribosomal RNA-processing protein 7 (Rrp7p) and similar proteins. This CD corresponds to the RRP7 domain of Rrp7p. Rrp7p is encoded by YCL031C gene from Saccharomyces cerevisiae. It is an essential yeast protein involved in pre-rRNA processing and ribosome assembly. Rrp7p contains an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal RRP7 domain.	128
240578	cd12951	RRP7_Rrp7A	RRP7 domain ribosomal RNA-processing protein 7 homolog A (Rrp7A) and similar proteins. The family corresponds to the RRP7 domain of Rrp7A, also termed gastric cancer antigen Zg14, and similar proteins which are yeast ribosomal RNA-processing protein 7 (Rrp7p) homologs mainly found in Metazoans. The cellular function of Rrp7A remains unclear currently. Rrp7A harbors an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal RRP7 domain.	129
240572	cd12952	MMP_ACEL2062	Minimal MMP-like domain found in Acidothermus cellulolyticus hypothetical protein ACEL2062 and similar proteins. The subfamily includes an uncharacterized protein from Acidothermus cellulolyticus (ACEL2062) and its homologs from bacteria. Although its biological role remains unclear, ACEL2062 contains a minimal metalloprotease (MMP)-like domain consisting of 3-stranded mixed 2-beta sheets and a HExxHxxGxxD/S (x could be any amino acid) motif. It may belong to a superfamily of bacterial zinc metallo-peptidases, which is characterized by a conserved HExxHxxGxxD motif.	117
240573	cd12953	MMP_TTHA0227	Minimal MMP-like domain found in Thermus thermophilus hypothetical protein TTHA0227 and similar proteins. The subfamily includes an uncharacterized protein from Thermus thermophilus (TTHA0227) and its homologs from bacteria. Although its biological role remains unclear, TTHA0227 contains a minimal metalloprotease (MMP)-like domain consisting of 3-stranded mixed 2-beta sheets and a HExxH (x could be any amino acid) motif. It may belong to a superfamily of bacterial zinc metallo-peptidases, which is characterized by a conserved HExxHxxGxxD motif.	112
240574	cd12954	MMP_TTHA0227_like_1	Minimal MMP-like domain found in a group of hypothetical proteins from alphaproteobacteria and actinobacteria. The subfamily includes some uncharacterized bacterial proteins which show high sequence similarity with Thermus thermophilus hypothetical protein TTHA0227. However, they do not contain the conserved HExxH (x could be any amino acid) motif. They may not have any zinc metallo-peptidase activity.	99
214020	cd12955	SKA2	Spindle and kinetochore-associated protein 2. SKA2, also called FAM33A, is a component of the SKA complex, which is formed by the association of three subunits (SKA1, SKA2, and SKA3). The SKA complex is essential for accurate cell division. It functions with the Ndc80 network to establish stable kinetochore-microtubule interactions, which are crucial for the highly orchestrated chromosome movements during mitosis. The biological unit is a W-shaped homodimer of the three-subunit complex. SKA2 has also been identified as a glucocorticoid receptor-interacting protein and may be involved in regulating cancer cell proliferation.	116
240562	cd12956	CBM_SusE-F_like	carbohydrate-binding modules from Bacteroides thetaiotaomicron SusE, SusF and similar proteins. This group includes five starch-specific CBMs (carbohydrate-binding modules) of SusE and SusF, two cell surface lipoproteins within the Sus (starch-utilization system) of the human gut symbiont Bacteroides thetaiotaomicron. These CBMs have no enzymatic activity. The precise mechanistic roles of SusE and SusF in starch metabolism are unclear. Both proteins contain an N-terminal domain which may belong to the immunoglobulin superfamily (IgSF), followed by two or three tandem starch-binding CBMs. SusF has three CBMs (CBM-Fa, -Fb, and -Fc; F denotes SusF, and they are labeled alphabetically from the N- to C- terminus). SusE has two CBMs (CBM-Eb and -Ec, corresponding to CBM-Fb and -Fc). Each starch-binding site contains an arc of aromatic amino acids for hydrophobic stacking with glucose, and hydrogen-bonding acceptors and donors for interacting with the O-2 and O-3 of glucose. These five CBMs show differences in their affinity for various starch oligosaccharides, and they also contribute differently to binding insoluble starch. CBM-Fa (the CBM unique to SusF) does not bind insoluble starch; CBM-Fb and CBM-Fc both do, and deletion of one or the other results in a decrease in the overall affinity of SusF for starch. Both CBM-Eb and CBM-Ec are needed for SusE to bind tightly to starch. CBM-Ec has an additional starch-binding loop that may mediate interactions with partially unwound single helical forms of starch or small starch-breakdown products. Proteins in this group are present in the species of the Gram-negative Bacteroidetes phylum.	93
240570	cd12957	SKA3_N	Spindle and kinetochore-associated protein 3, N-terminal domain. SKA3, also called RAMA1 or C13orf3, is a component of the SKA complex, which is formed by the association of three subunits (SKA1, SKA2, and SKA3). The SKA complex is essential for accurate cell division. It functions with the Ndc80 network to establish stable kinetochore-microtubule interactions, which are crucial for the highly orchestrated chromosome movements during mitosis. The biological unit is a W-shaped homodimer of the three-subunit complex. SKA3 contributes to SAC (spindle-assembly checkpoint) signaling through its interaction with Bub3. This model represents the N-terminal domain of SKA3, which is involved in interactions with SKA1 and SKA2 to form the SKA complex. The C-terminal portion of SKA3 is involved in creating a microtubule-binding surface.	100
214021	cd12958	SKA1_N	Spindle and kinetochore-associated protein 1, N-terminal domain. SKA1 is a component of the SKA complex, which is formed by the association of three subunits (SKA1, SKA2, and SKA3). The SKA complex is essential for accurate cell division. It functions with the Ndc80 network to establish stable kinetochore-microtubule interactions, which are crucial for the highly orchestrated chromosome movements during mitosis. The biological unit is a W-shaped homodimer of the three-subunit complex. This model represents the N-terminal domain of SKA1, which is involved in interactions with SKA2 and SKA3 to form the SKA complex. The C-terminal portion of SKA1 is involved in creating a microtubule-binding surface.	89
214022	cd12959	MMACHC-like	Methylmalonic aciduria and homocystinuria type C protein and similar proteins. MMACHC, also called CblC, is involved in the intracellular processing of vitamin B12 by catalyzing two reactions: the reductive decyanation of cyanocobalamin in the presence of a flavoprotein oxidoreductase and the dealkylation of alkylcobalamins through the nucleophilic displacement of the alkyl group by glutathione. Mutations in MMACHC cause combined methylmalonic acidemia/aciduria and homocystinuria (CblC type), the most common inherited disorder of cobalamin metabolism. The structure of MMACHC reveals it to be the most divergent member of the NADPH-dependent flavin reductase family that can use FMN or FAD to catalyze reductive decyanation; it is also the first enzyme with glutathione transferase (GST) activity that is unrelated to the GST superfamily in structure and sequence.	226
240575	cd12960	Spider_toxin	Spider neurotoxins including agatoxin, purotoxin and ctenitoxin. This domain family contains spider toxins that include the omega-Aga-IVB, a P-type calcium channel antagonist from venom of the funnel web spider, Agelenopsis aperta, as well as purotoxin-1 (PT1), a spider peptide venom of the Central Asian spider Geolycosa sp., which specifically exerts inhibitory action on P2X3 purinoreceptors at nanomolar concentrations. These spider toxins, which are ion channel blockers, share a common structural motif composed of a triple-stranded antiparallel beta-sheet, stabilized by internal disulfide bonds known as cystine knots.	36
240569	cd12961	CBM58_SusG	Carbohydrate-binding module 58 from Bacteroides thetaiotaomicron SusG and similar CBMs. This group includes the starch-specific CBM (carbohydrate-binding module) of SusG, a cell surface lipoprotein within the Sus (Starch-utilization system) system of the human gut symbiont Bacteroides thetaiotaomicron. It represents the CBM58 class of CBMs in the carbohydrate active enzymes (CAZy) database. SusG is an alpha-amylase, and is essential for growth on high molecular weight starch. SusG-CBM58 binds maltooligosaccharide distal to, and on the opposite side of, the amylase catalytic site; it is one of two starch-binding sites in SusG, the other being adjacent to the active site. SusG-CBM58 is required for efficient degradation of insoluble starch by the purified enzyme. Its starch-binding site contains an arc of aromatic amino acids for hydrophobic stacking with glucose, and hydrogen-bonding acceptors and donors for interacting with the O-2 and O-3 of glucose. It may play a role in product exchange with other Sus components.	110
240568	cd12962	X25_BaPul_like	X25 domain of Bacillus acidopullulyticus pullulanase and similar proteins. Pullulanase (EC 3.2.1.41) cleaves 1,6-alpha-glucosidic linkages in pullulan, amylopectin, and glycogen, and in alpha-and beta-amylase limit-dextrins of amylopectin and glycogen. BaPul is used industrially in the production of high fructose corn syrup, high maltose content syrups and low calorie and ''light'' beers.  Pullulanases, in addition to the catalytic domain, include several carbohydrate-binding domains (CBMs) as well as domains of unknown function (termed ''X'' modules). X25 was identified in Bacillus acidopullulyticus pullulanase, and splits another domain of unknown function (X45). X25 is present in multiple copy in some pullulanases. It has been suggested that X25 and X45 are CBMs which target mixed alpha-1,6/alpha-1,4 linked D-glucan polysaccharides.	95
240567	cd12963	X45_BaPul_like	X45 domain of Bacillus acidopullulyticus pullulanase and similar proteins. Pullulanase (EC 3.2.1.41) cleaves 1,6-alpha-glucosidic linkages in pullulan, amylopectin, and glycogen, and in alpha-and beta-amylase limit-dextrins of amylopectin and glycogen. BaPul is used industrially in the production of high fructose corn syrup, high maltose content syrups and low calorie and ''light'' beers.  Pullulanases, in addition to the catalytic domain, include several carbohydrate-binding domains (CBMs) as well as domains of unknown function (termed ''X'' modules). X45 was identified in Bacillus acidopullulyticus pullulanase; it is interrupted by another domain of unknown function (X25). It has been suggested that X25 and X45 are CBMs which target mixed alpha-1,6/alpha-1,4 linked D-glucan polysaccharides.	89
240563	cd12964	CBM-Fa	carbohydrate-binding module Fa from Bacteroides thetaiotaomicron SusF, and similar CBMs. CBM-Fa is the first of three starch-specific CBMs (carbohydrate-binding modules) of SusF, a cell surface lipoprotein within the Sus (Starch-utilization system) system of the human gut symbiont Bacteroides thetaiotaomicron. The precise mechanistic role of SusF in starch metabolism is unclear. SusF has an N-terminal domain which may belong to the immunoglobulin superfamily (IgSF), followed by three tandem starch-binding CBMs: CBM-Fa, -Fb, and -Fc; F denotes SusF, and they are labeled alphabetically from the N- to C- terminus. These CBMs have no enzymatic activity. Each starch-binding site contains an arc of aromatic amino acids for hydrophobic stacking with glucose, and hydrogen-bonding acceptors and donors for interacting with the O-2 and O-3 of glucose. These three CBMs show differences in their affinity for various different starch oligosaccharides, and they contribute differently to binding insoluble starch. CBM-Fa does not bind insoluble starch, and can bind smaller maltooligosaccharides. Proteins in this subgroup are present in the species of the Gram-negative Bacteroidetes phylum.	110
240564	cd12965	CBM-Eb_CBM-Fb	carbohydrate-binding modules Eb and Fb from SusE and SusF, respectively, and similar CBMs. Included in this subgroup are CBM-Eb and CBM-Fb, starch-specific carbohydrate-binding modules of SusE and SusF, cell surface lipoproteins within the Sus (Starch-utilization system) system of the human gut symbiont Bacteroides thetaiotaomicron. These CBMs have no enzymatic activity. The precise mechanistic roles of SusE and SusF in starch metabolism are unclear. Both proteins have an N-terminal domain which may belong to the immunoglobulin superfamily (IgSF), followed by two or three tandem starch-binding CBMs. SusF has three CBMs (CBM-Fa, -Fb, and -Fc; F denotes SusF, and they are labeled alphabetically from the N- to C- terminus). SusE has two CBMs (CBM-Eb and -Ec, corresponding to CBM-Fb and -Fc). Each starch-binding site contains an arc of aromatic amino acids for hydrophobic stacking with glucose, and hydrogen-bonding acceptors and donors for interacting with the O-2 and O-3 of glucose. These five CBMs show differences in their affinity for various different starch oligosaccharides, and they contribute differently to binding insoluble starch. CBM-Fb and CBM-Fc both bind insoluble starch, deletion of one or the other results in a decrease in the overall affinity of SusF for starch. Both CBM-Eb and CBM-Ec are needed for SusE to bind tightly to starch. Proteins in this group are present in the species of the Gram-negative Bacteroidetes phylum.	98
240565	cd12966	CBM-Ec_CBM-Fc	carbohydrate-binding modules Ec and Fc from SusE and SusF, respectively, and similar CBMs. Included in this subgroup are CBM-Ec and CBM-Fc, starch-specific carbohydrate-binding modules of SusE and SusF, cell surface lipoproteins within the Sus (Starch-utilization system) system of the human gut symbiont Bacteroides thetaiotaomicron. These CBMs have no enzymatic activity. The precise mechanistic roles of SusE and SusF in starch metabolism are unclear. Both proteins have an N-terminal domain which may belong to the immunoglobulin superfamily (IgSF), followed by two or three tandem starch-binding CBMs. SusF has three CBMs (CBM-Fa, -Fb, and -Fc; F denotes SusF, and they are labeled alphabetically from the N- to C- terminus). SusE has two CBMs (CBM-Eb and -Ec, corresponding to CBM-Fb and -Fc). Each starch-binding site contains an arc of aromatic amino acids for hydrophobic stacking with glucose, and hydrogen-bonding acceptors and donors for interacting with the O-2 and O-3 of glucose. These five CBMs show differences in their affinity for various different starch oligosaccharides, and they contribute differently to binding insoluble starch. CBM-Fb and CBM-Fc both bind insoluble starch, deletion of one or the other results in a decrease in the overall affinity of SusF for starch. Both CBM-Eb and CBM-Ec are needed for SusE to bind tightly to starch. Proteins in this group are present in the species of the Gram-negative Bacteroidetes phylum.	98
240566	cd12967	CBM_SusE-F_like_u1	Uncharacterized subgroup of the CBM-SusE-F_like superfamily. The CBM SusE-F_like superfamily includes starch-specific CBMs (carbohydrate-binding modules) of SusE and SusF, two cell surface lipoproteins within the Sus (Starch-utilization system) system of the human gut symbiont Bacteroides thetaiotaomicron. These CBMs have no enzymatic activity. The precise mechanistic roles of SusE and SusF in starch metabolism are unclear. Both proteins have an N-terminal domain which may belong to the immunoglobulin superfamily (IgSF), followed by two or three tandem starch-binding CBMs. SusF has three CBMs (CBM-Fa, -Fb, and -Fc; F denotes SusF, and they are labeled alphabetically from the N- to C- terminus). SusE has two CBMs (CBM-Eb and -Ec, corresponding to CBM-Fb and -Fc). Each starch-binding site contains an arc of aromatic amino acids for hydrophobic stacking with glucose, and hydrogen-bonding acceptors and donors for interacting with the O-2 and O-3 of glucose. These five CBMs show differences in their affinity for various different starch oligosaccharides, and they also contribute differently to binding insoluble starch. Proteins in this group are present in the species of the Gram-negative Bacteroidetes phylum.	91
240556	cd13112	POLO_box	Polo-box domain (PBD), a C-terminal tandemly repeated region of polo-like kinases. The polo-like Ser/Thr kinases (Plk1, Plk2/Snk, Plk3/Prk/Fnk, Plk4/Sak, and the inactive kinase Plk5) play various roles in cytokinesis and mitosis. At their C-terminus, they contain a tandemly repeated polo-box domain (in the case of Plk4, a tandem repeat of cryptic PBDs is found in the middle of the protein followed by a C-terminal single repeat), which appears to be involved in autoinhibition and in mediating the subcellular localization. The latter may be controlled via interactions between the polo-box domain and phospho-peptide motifs. The phosphopeptide binding site is formed at the interface between the two tandemly repeated PBDs. The PBDs of Plk4/Sak appear unique in participating in homodimer interactions, though it is not clear whether and how they interact with phosphopeptides.	76
381706	cd13113	Wnt	Wnt domain found in the WNT signaling gene family, also called Wingless-type mouse mammary tumor virus (MMTV) integration site family. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about the structure of Wnt proteins, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. Wnt signaling mediated by Wnt proteins orchestrates and influences a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, apoptosis, and participation in immune defense during microbe infection.	288
240557	cd13114	POLO_box_Plk4_1	First (cryptic) polo-box domain (PBD) of polo-like kinase 4 (Plk4/Sak). The polo-like Ser/Thr kinases (Plk1, Plk2/Snk, Plk3/Prk/Fnk, Plk4/Sak, and the inactive kinase Plk5) play various roles in cytokinesis and mitosis. At their C-terminus, they contain a tandemly repeated polo-box domain (in the case of Plk4, a tandem repeat of cryptic PBDs is found in the middle of the protein followed by a C-terminal single repeat), which appears to be involved in autoinhibition and in mediating the subcellular localization. The latter may be controlled via interactions between the polo-box domain and phospho-peptide motifs. The phosphopeptide binding site is formed at the interface between the two tandemly repeated PBDs. The PBDs of Plk4/Sak appear unique in participating in homodimer interactions, though it is not clear whether and how they interact with phosphopeptides.	112
240558	cd13115	POLO_box_Plk4_2	Second (cryptic) polo-box domain (PBD) of polo-like kinase 4 (Plk4/Sak). The polo-like Ser/Thr kinases (Plk1, Plk2/Snk, Plk3/Prk/Fnk, Plk4/Sak, and the inactive kinase Plk5) play various roles in cytokinesis and mitosis. At their C-terminus, they contain a tandemly repeated polo-box domain (in the case of Plk4, a tandem repeat of cryptic PBDs is found in the middle of the protein followed by a C-terminal single repeat), which appears to be involved in autoinhibition and in mediating the subcellular localization. The latter may be controlled via interactions between the polo-box domain and phospho-peptide motifs. The phosphopeptide binding site is formed at the interface between the two tandemly repeated PBDs. The PBDs of Plk4/Sak appear unique in participating in homodimer interactions, though it is not clear whether and how they interact with phosphopeptides.	108
240559	cd13116	POLO_box_Plk4_3	C-terminal (third) polo-box domain (PBD) of polo-like kinase 4 (Plk4/Sak). The polo-like Ser/Thr kinases (Plk1, Plk2/Snk, Plk3/Prk/Fnk, Plk4/Sak, and the inactive kinase Plk5) play various roles in cytokinesis and mitosis. At their C-terminus, they contain a tandemly repeated polo-box domain (in the case of Plk4, a tandem repeat of cryptic PBDs is found in the middle of the protein followed by a C-terminal single repeat), which appears to be involved in autoinhibition and in mediating the subcellular localization. The latter may be controlled via interactions between the polo-box domain and phospho-peptide motifs. The phosphopeptide binding site is formed at the interface between the two tandemly repeated PBDs. The PBDs of Plk4/Sak appear unique in participating in homodimer interactions, though it is not clear whether and how they interact with phosphopeptides.	81
240560	cd13117	POLO_box_2	Second polo-box domain (PBD) of polo-like kinases Plk1, Plk2, Plk3, and Plk5. The polo-like Ser/Thr kinases (Plk1, Plk2/Snk, Plk3/Prk/Fnk, Plk4/Sak, and the inactive kinase Plk5) play various roles in cytokinesis and mitosis. At their C-terminus, they contain a tandemly repeated polo-box domain (in the case of Plk4, a tandem repeat of cryptic PBDs is found in the middle of the protein followed by a C-terminal single repeat), which appears to be involved in autoinhibition and in mediating the subcellular localization. The latter may be controlled via interactions between the polo-box domain and phospho-peptide motifs. The phosphopeptide binding site is formed at the interface between the two tandemly repeated PBDs. The PBDs of Plk4/Sak appear unique in participating in homodimer interactions, though it is not clear whether and how they interact with phosphopeptides.	81
240561	cd13118	POLO_box_1	First polo-box domain (PBD) of polo-like kinases Plk1, Plk2, Plk3, and Plk5. The polo-like Ser/Thr kinases (Plk1, Plk2/Snk, Plk3/Prk/Fnk, Plk4/Sak, and the inactive kinase Plk5) play various roles in cytokinesis and mitosis. At their C-terminus, they contain a tandemly repeated polo-box domain (in the case of Plk4, a tandem repeat of cryptic PBDs is found in the middle of the protein followed by a C-terminal single repeat), which appears to be involved in autoinhibition and in mediating the subcellular localization. The latter may be controlled via interactions between the polo-box domain and phospho-peptide motifs. The phosphopeptide binding site is formed at the interface between the two tandemly repeated PBDs. The PBDs of Plk4/Sak appear unique in participating in homodimer interactions, though it is not clear whether and how they interact with phosphopeptides.	91
240524	cd13119	BF2867_like	Tandemly repeated domain found in Bacteroides fragilis Nctc 9343 BF2867 and related proteins. Two structurally similar domains with low sequence similarity form a protein that may have a role in cell adhesion. This family also includes BF1858 and overlaps with DUF3988.	115
240525	cd13120	BF2867_like_N	N-terminal domain found in Bacteroides fragilis Nctc 9343 BF2867 and related proteins. Two structurally similar domains with low sequence similarity in a tandem repeat arrangement form a protein that may have a role in cell adhesion. This family overlaps with DUF3988.	156
240526	cd13121	BF2867_like_C	C-terminal domain found in Bacteroides fragilis Nctc 9343 BF2867 and related proteins. Two structurally similar domains with low sequence similarity in a tandem repeat arrangement form a protein that may have a role in cell adhesion. This family overlaps with DUF3988.	138
240555	cd13122	MSL2_CXC	DNA-binding cysteine-rich domain of male-specific lethal 2 and related proteins. The CXC domain of Drosophila melanogaster MSL2 forms a Zn(3)Cys(9) cluster and is involved in recruiting members of the dosage compensation complex (DCC) to sites on the X chromosome.	50
240528	cd13123	MATE_MurJ_like	MurJ/MviN, a subfamily of the multidrug and toxic compound extrusion (MATE)-like proteins. Escherichia coli MurJ (MviN) has been identified as essential for murein biosynthesis. It has been suggested that MurJ functions as the peptidoglycan lipid II flippase which is involved in translocation of lipid-anchored peptidoglycan precursors across the cytoplasmic membrane, though results obtained in Bacillus subtilis seem to indicate that its MurJ homologs are not essential for growth. Some MviN family members (e.g. in Mycobacterium tuberculosis) possess an extended C-terminal region that contains an intracellular pseudo-kinase domain and an extracellular domain resembling carbohydrate-binding proteins. Proteins from the MATE family are involved in exporting metabolites across the cell membrane and are often responsible for multidrug resistance (MDR).	420
240529	cd13124	MATE_SpoVB_like	Stage V sporulation protein B, also known as Stage III sporulation protein F, and related proteins. The integral membrane protein SpoVB has been implicated in the biosynthesis of the peptidoglycan component of the spore cortex in Bacillus subtilis. This model represents a subfamily of the multidrug and toxic compound extrusion (MATE)-like proteins. Proteins from the MATE family are involved in exporting metabolites across the cell membrane and are often responsible for multidrug resistance (MDR).	434
240530	cd13125	MATE_like_10	Uncharacterized subfamily of the multidrug and toxic compound extrusion (MATE) proteins. This family might function as a translocase for lipopolysaccharides, such as O-antigen. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. A number of family members are involved in the synthesis of peptidoglycan components in bacteria.	409
240531	cd13126	MATE_like_11	Uncharacterized subfamily of the multidrug and toxic compound extrusion (MATE) proteins. This family might function as a translocase for lipopolysaccharides, such as O-antigen. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. A number of family members are involved in the synthesis of peptidoglycan components in bacteria.	396
240532	cd13127	MATE_tuaB_like	Uncharacterized subfamily of the multidrug and toxic compound extrusion (MATE) proteins. This family might function as a translocase for lipopolysaccharides and participate in the biosynthesis of cell wall components such as teichuronic acid. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. A number of family members are involved in the synthesis of peptidoglycan components in bacteria.	406
240533	cd13128	MATE_Wzx_like	Wzx, a subfamily of the multidrug and toxic compound extrusion (MATE)-like proteins. Escherichia coli Wzx and related proteins from other gram-negative bacteria are thought to act as flippases, assisting in the membrane translocation of lipopolysaccharides including those containing O-antigens. Proteins from the MATE family are involved in exporting metabolites across the cell membrane and are often responsible for multidrug resistance (MDR).	402
240534	cd13129	MATE_epsE_like	Multidrug and toxic compound extrusion family and similar proteins. This model represents a subfamily of the multidrug and toxic compound extrusion (MATE)-like proteins, including Ralstonia solanacearum GMI1000 epsE, which may be involved in exporting exopolysaccharide EPS I, a virulence factor. Proteins from the MATE family are involved in exporting metabolites across the cell membrane and are often responsible for multidrug resistance (MDR).	411
240535	cd13130	MATE_rft1	Rft1-like subfamily of the multidrug and toxic compound extrusion family (MATE). This eukaryotic family may function as a transporter, shuttling phospholipids, lipopolysaccharides or oligosaccharides from cytoplasmic to the lumenal side of the endoplasmic reticulum. Proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. MATE has also been identified as a large multigene family in plants, where the proteins are linked to disease resistance.	441
240536	cd13131	MATE_NorM_like	Subfamily of the multidrug and toxic compound extrusion (MATE)-like proteins similar to Vibrio cholerae NorM. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. This subfamily includes Vibrio cholerae NorM and functions most likely as a multidrug efflux pump, removing xenobiotics from the interior of the cell. The pump utilizes a cation gradient across the membrane to facilitate the export process. NorM appears to bind monovalent cations in an outward-facing conformation and may subsequently cycle through an inward-facing and outward-facing conformation to capture and release its substrate.	435
240537	cd13132	MATE_eukaryotic	Eukaryotic members of the multidrug and toxic compound extrusion (MATE) family. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. MATE has also been identified as a large multigene family in plants, where the proteins are linked to disease resistance. A number of family members are involved in the synthesis of peptidoglycan components in bacteria. This subfamily, which is restricted to eukaryotes, contains vertebrate solute transporters responsible for secretion of cationic drugs across the brush border membranes, yeast proteins located in the vacuole membrane, and plant proteins involved in disease resistance and iron homeostasis under osmotic stress.	436
240538	cd13133	MATE_like_7	Uncharacterized subfamily of the multidrug and toxic compound extrusion (MATE) proteins. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. A number of family members are involved in the synthesis of peptidoglycan components in bacteria.	438
240539	cd13134	MATE_like_8	Uncharacterized subfamily of the multidrug and toxic compound extrusion (MATE) proteins. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. A number of family members are involved in the synthesis of peptidoglycan components in bacteria.	438
240540	cd13135	MATE_like_9	Uncharacterized subfamily of the multidrug and toxic compound extrusion (MATE) proteins. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. A number of family members are involved in the synthesis of peptidoglycan components in bacteria.	429
240541	cd13136	MATE_DinF_like	DinF and similar proteins, a subfamily of the multidrug and toxic compound extrusion (MATE)-like proteins. Escherichia coli DinF is a membrane protein that has been found to protect cells against oxidative stress and bile salts. The expression of DinF is regulated as part of the SOS system. It may act by detoxifying oxidizing molecules that have the potential to damage DNA. Some members of this family have been reported to enhance the virulence of plant pathogenic bacteria by enhancing their ability to grow in the presence of toxic compounds. Proteins from the MATE family are involved in exporting metabolites across the cell membrane and are often responsible for multidrug resistance (MDR).	424
240542	cd13137	MATE_NorM_like	Subfamily of the multidrug and toxic compound extrusion (MATE)-like proteins similar to Thermotoga marina NorM. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. A number of family members are involved in the synthesis of peptidoglycan components in bacteria.	432
240543	cd13138	MATE_yoeA_like	Subfamily of the multidrug and toxic compound extrusion (MATE)-like proteins similar to Bacillus subtilis yoeA. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. A number of family members are involved in the synthesis of peptidoglycan components in bacteria.	431
240544	cd13139	MATE_like_14	Uncharacterized subfamily of the multidrug and toxic compound extrusion (MATE) proteins. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. A number of family members are involved in the synthesis of peptidoglycan components in bacteria.	448
240545	cd13140	MATE_like_1	Uncharacterized subfamily of the multidrug and toxic compound extrusion (MATE) proteins. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. MATE has also been identified as a large multigene family in plants, where the proteins are linked to disease resistance. A number of family members are involved in the synthesis of peptidoglycan components in bacteria.	435
240546	cd13141	MATE_like_13	Uncharacterized subfamily of the multidrug and toxic compound extrusion (MATE) proteins. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. A number of family members are involved in the synthesis of peptidoglycan components in bacteria.	443
240547	cd13142	MATE_like_12	Uncharacterized subfamily of the multidrug and toxic compound extrusion (MATE) proteins. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. A number of family members are involved in the synthesis of peptidoglycan components in bacteria.	444
240548	cd13143	MATE_MepA_like	Subfamily of the multidrug and toxic compound extrusion (MATE)-like proteins similar to Staphylococcus aureus MepA. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. This subfamily includes Staphylococcus aureus MepA and Vibrio vulnificus VmrA and functions most likely as a multidrug efflux pump.	426
240549	cd13144	MATE_like_4	Uncharacterized subfamily of the multidrug and toxic compound extrusion (MATE) proteins. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. A number of family members are involved in the synthesis of peptidoglycan components in bacteria.	434
240550	cd13145	MATE_like_5	Uncharacterized subfamily of the multidrug and toxic compound extrusion (MATE) proteins. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. A number of family members are involved in the synthesis of peptidoglycan components in bacteria.	440
240551	cd13146	MATE_like_6	Uncharacterized subfamily of the multidrug and toxic compound extrusion (MATE) proteins. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. A number of family members are involved in the synthesis of peptidoglycan components in bacteria.	433
240552	cd13147	MATE_MJ0709_like	Uncharacterized subfamily of the multidrug and toxic compound extrusion (MATE) proteins, similar to Methanocaldococcus jannaschii MJ0709. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. A number of family members are involved in the synthesis of peptidoglycan components in bacteria.	441
240553	cd13148	MATE_like_3	Uncharacterized subfamily of the multidrug and toxic compound extrusion (MATE) proteins. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. A number of family members are involved in the synthesis of peptidoglycan components in bacteria.	441
240554	cd13149	MATE_like_2	Uncharacterized subfamily of the multidrug and toxic compound extrusion (MATE) proteins. The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. A number of family members are involved in the synthesis of peptidoglycan components in bacteria.	434
240523	cd13150	DAXX_histone_binding	Histone binding domain of the death-domain associated protein (DAXX). DAXX is a nuclear protein that modulates transcription of various genes and is involved in cell death and/or the suppression of growth. DAXX is also a histone chaperone conserved in Metazoa that acts specifically on histone H3.3. This alignment models a functional domain of DAXX that interacts with the histone H3.3-H4 dimer, and in doing so competes with DNA binding and interactions between the histone chaperone ASF1/CIA and the H3-H4 dimer.	198
240522	cd13151	DAXX_helical_bundle	Helical bundle domain of the death-domain associated protein (DAXX). DAXX is a nuclear protein that modulates transcription of various genes and is involved in cell death and/or the suppression of growth. DAXX is also a histone chaperone conserved in Metazoa that acts specifically on histone H3.3. This alignment models the N-terminal helical bundle domain of DAXX, which was shown to interact with the tumor suppressor Ras-association domain family 1C (RASSF1C).	88
240516	cd13152	KOW_GPKOW_A	KOW motif of the "G-patch domain and KOW motifs-containing protein" (GPKOW) repeat A. GPKOW contains one G-patch domain and two KOW motifs. GPKOW is a nuclear protein that is regulated by the catalytic (C) subunit of Protein Kinase A (PKA) and binds RNA in vivo. PKA may be involved in regulating multiple steps in post-transcriptional processing of pre-mRNAs. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. GPKOW is also known as the T54 protein or MOS2 homolog.	57
240517	cd13153	KOW_GPKOW_B	KOW motif of the "G-patch domain and KOW motifs-containing protein" (GPKOW) repeat B. GPKOW contains one G-patch domain and two KOW motifs. GPKOW is a nuclear protein that is regulated by the catalytic (C) subunit of Protein Kinase A (PKA) and binds RNA in vivo. PKA may be involved in regulating multiple steps in post-transcriptional processing of pre-mRNAs. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. GPKOW is also known as the T54 protein or MOS2 homolog.	51
240518	cd13154	KOW_Mtr4	KOW_Mtr4 is an inserted domain in the Mtr4 globular domain. Mtr4 is a conserved helicase with a core DExH region that cooperates with the eukaryotic nuclear exosome in RNA processing and degradation. The KOW_Mtr4 motif might be involved in presenting RNA substrates to the helicase core. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. The KOW motif is located in the extended insertion of the Mtr4 protein.	129
240519	cd13155	KOW_KIN17	KOW_Kin17 is an RNA-binding motif. The KOW domain of the KIN17 protein contributes to the RNA-binding properties of the whole protein. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KIN17 is conserved from yeast to human, is ubiquitously expressed at low levels in mammalian tissues, and functions in DNA replication, DNA repair and cell cycle control.	54
240520	cd13156	KOW_RPL6	KOW motif of Ribosomal Protein L6. RPL6 contains a KOW motif that has an extraribosomal, oncogenic role. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4.	152
269979	cd13157	PTB_tensin-related	Tensin-related Phosphotyrosine-binding (PTB) domain. Tensin plays critical roles in renal function, muscle regeneration, and cell migration. It binds to actin filaments and interacts with the cytoplasmic tails of beta-integrin via its PTB domain, allowing tensin to link actin filaments to integrin receptors. Tensin functions as a platform for assembly and disassembly of signaling complexes at focal adhesions by recruiting tyrosine-phosphorylated signaling molecules, and also by providing interaction sites for other proteins. In addition to its PTB domain, it contains a C-terminal SH2 domain. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides that lack tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains.	129
269980	cd13158	PTB_APPL	Adaptor protein containing PH domain, PTB domain, and Leucine zipper motif (APPL; also called DCC-interacting protein (DIP)-13alpha) Phosphotyrosine-binding (PTB) domain. APPL interacts with oncoprotein serine/threonine kinase AKT2, tumor suppressor protein DCC (deleted in colorectal cancer), Rab5, GIPC (GAIP-interacting protein, C terminus), human follicle-stimulating hormone receptor (FSHR), and the adiponectin receptors AdipoR1 and AdipoR2. There are two isoforms of human APPL: APPL1 and APPL2, which share about 50% sequence identity. APPL has a BAR and a PH domain near its N terminus, and the two domains are thought to function as a unit (BAR-PH domain). C-terminal to this is a PTB domain. Lipid binding assays show that the BAR, PH, and PTB domains can bind phospholipids. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides that lack tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains.	135
269981	cd13159	PTB_LDLRAP-mammal-like	Low Density Lipoprotein Receptor Adaptor Protein 1 (LDLRAP1) in mammals and similar proteins Phosphotyrosine-binding (PTB) PH-like fold. Null mutations in the LDL receptor adaptor protein 1 (LDLRAP1) gene, which serves as an adaptor for LDLR endocytosis in the liver, cause autosomal recessive hypercholesterolemia (ARH). LDLRAP1 contains a single PTB domain. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides that lack tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd contains mammals, insects, and sponges.	123
269982	cd13160	PTB_LDLRAP_insect-like	Low Density Lipoprotein Receptor Adaptor Protein 1 (LDLRAP1) in insects and similar proteins Phosphotyrosine-binding (PTB) PH-like fold. Null mutations in the LDL receptor adaptor protein 1 (LDLRAP1) gene, which serves as an adaptor for LDLR endocytosis in the liver, cause autosomal recessive hypercholesterolemia (ARH). LDLRAP1 contains a single PTB domain. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides that lack tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd contains insects, ticks, sea urchins, and nematodes.	125
269983	cd13161	PTB_TK_HMTK	Tyrosine-specific kinase/HM-motif TK (TM/HMTK) Phosphotyrosine-binding (PTB) PH-like fold. TK kinases catalyze the transfer of the terminal phosphate of ATP to a specific tyrosine residue on their target proteins. TK kinases play significant roles in development and cell division. Tyrosine-protein kinases can be divided into two subfamilies: receptor tyrosine kinases, which have an intracellular tyrosine kinase domain, a transmembrane domain and an extracellular ligand-binding domain; and non-receptor (cytoplasmic) tyrosine kinases, which are soluble, cytoplasmic kinases. In HMTK the conserved His-Arg-Asp sequence within the catalytic loop is replaced by a His-Met sequence. TM/HMTK have 2-3 N-terminal PTB domains. PTB domains in TKs are thought to function analogously to the membrane targeting (PH, myristoylation) and pTyr binding (SH2) domains of Src subgroup kinases. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides that lack tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the Dab-like subgroup.	120
269984	cd13162	PTB_RGS12	Regulator of G-protein signaling 12 Phosphotyrosine-binding (PTB) PH-like fold. RGS12 functions as a GTPase-activating protein and a transcriptional repressor. It is thought to play a role in tumorigenesis. RGS12 specifically interacts with guanine nucleotide-binding protein G(i), alpha-1 subunit and guanine nucleotide-binding protein G(k) subunit alpha. RGS proteins are multi-functional, GTPase-accelerating proteins that promote GTP hydrolysis by the alpha subunit of heterotrimeric G proteins, thereby inactivating the G protein and rapidly switching off G protein-coupled receptor signalling pathways. Upon activation by GPCRs, heterotrimeric G proteins exchange GDP for GTP, are released from the receptor, and dissociate into free, active GTP-bound alpha subunit and beta-gamma dimer, both of which activate downstream effectors. The response is terminated upon GTP hydrolysis by the alpha subunit, which can then bind the beta-gamma dimer and the receptor. RGS proteins markedly reduce the lifespan of GTP-bound alpha subunits by stabilizing the G protein transition state. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides that lack tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the Dab-like subgroup.	131
269985	cd13163	PTB_ICAP1	Integrin beta-1-binding protein 1 Phosphotyrosine-binding (PTB) PH-like fold. ICAP1 (also called Integrin cytoplasmic domain-associated protein 1) binds specifically to the beta1 integrin subunit cytoplasmic domain and the cerebral cavernous malformation (CCM) protein CCM1. It regulates beta1 integrin-dependent cell migration by affecting the pattern of focal adhesion formation. ICAP1 recruits CCM1 to the cell membrane and activates CCM1 by changing its conformation. Since CCM1 plays a role in cardiovascular development, it is hypothesized that ICAP1 is involved in vascular differentiation. ICAP-1 has an N-terminal domain that is rich in serine and threonine and a C-terminal PTB domain. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides that lack tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the Dab-like subgroup.	129
241318	cd13164	PTB_DOK4_DOK5_DOK6	Downstream of tyrosine kinase 4, 5, and 6 proteins phosphotyrosine-binding domain (PTBi). The Dok family adapters are phosphorylated by different protein tyrosine kinases. Dok proteins are involved in processes such as modulation of cell differentiation and proliferation, as well as in control of cell spreading and migration. The Dok protein contains an N-terminal pleckstrin homology (PH) domain followed by a central phosphotyrosine binding (PTB) domain, which has a PH-like fold, and a proline- and tyrosine-rich C-terminal tail. The PH domain binds to acidic phospholipids and localizes proteins to the plasma membrane, while the PTB domain mediates protein-protein interactions by binding to phosphotyrosine-containing motifs. The C-terminal part of Dok contains multiple tyrosine phosphorylation sites that serve as potential docking sites for Src homology 2-containing proteins such as ras GTPase-activating protein and Nck, leading to inhibition of ras signaling pathway activation and the c-Jun N-terminal kinase (JNK) and c-Jun activation, respectively. There are 7 mammalian Dok members: Dok-1 to Dok-7. Dok-1 and Dok-2 act as negative regulators of the Ras-Erk pathway downstream of many immunoreceptor-mediated signaling systems, and it is believed that recruitment of p120 rasGAP by Dok-1 and Dok-2 is critical to their negative regulation. Dok-3 is a negative regulator of the activation of JNK and mobilization of Ca2+ in B-cell receptor-mediated signaling, interacting with SHIP-1 and Grb2. Dok-4 to Dok-6 play roles in protein tyrosine kinase (PTK)-mediated signaling in neural cells and Dok-7 is the key cytoplasmic activator of MuSK (Muscle-Specific Protein Tyrosine Kinase). PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides that lack tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the IRS-like subgroup.	103
269986	cd13165	PTB_DOK7	Downstream of tyrosine kinase 7 phosphotyrosine-binding domain (PTBi). The Dok family adapters are phosphorylated by different protein tyrosine kinases. Dok proteins are involved in processes such as modulation of cell differentiation and proliferation, as well as in control of cell spreading and migration. The Dok protein contains an N-terminal pleckstrin homology (PH) domain followed by a central phosphotyrosine binding (PTB) domain, which has a PH-like fold, and a proline- and tyrosine-rich C-terminal tail. The PH domain binds to acidic phospholipids and localizes proteins to the plasma membrane, while the PTB domain mediates protein-protein interactions by binding to phosphotyrosine-containing motifs. The C-terminal part of Dok contains multiple tyrosine phosphorylation sites that serve as potential docking sites for Src homology 2-containing proteins such as ras GTPase-activating protein and Nck, leading to inhibition of ras signaling pathway activation and the c-Jun N-terminal kinase (JNK) and c-Jun activation, respectively. There are 7 mammalian Dok members: Dok-1 to Dok-7. Dok-1 and Dok-2 act as negative regulators of the Ras-Erk pathway downstream of many immunoreceptor-mediated signaling systems, and it is believed that recruitment of p120 rasGAP by Dok-1 and Dok-2 is critical to their negative regulation. Dok-3 is a negative regulator of the activation of JNK and mobilization of Ca2+ in B-cell receptor-mediated signaling, interacting with SHIP-1 and Grb2. Dok-4 to Dok-6 play roles in protein tyrosine kinase (PTK)-mediated signaling in neural cells and Dok-7 is the key cytoplasmic activator of MuSK (Muscle-Specific Protein Tyrosine Kinase). PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides that lack tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the IRS-like subgroup.	101
269987	cd13166	PTB_CCM2	Cerebral cavernous malformation 2 FERM domain C-lobe. CCM2 (also called malcavernin; C7orf22/chromosome 7 open reading frame 22; OSM) along with CCM1 and CCM3 constitutes a set of proteins which when mutated are responsible for cerebral cavernous malformations, an autosomal dominant neurovascular disease characterized by cerebral hemorrhages and vascular malformations in the central nervous system. CCM2 plays many functional roles. CCM2 functions as a scaffold involved in small GTPase Rac-dependent p38 mitogen-activated protein kinase (MAPK) activation when the cell is under hyperosmotic stress. It associates with CCM1 in the signalling cascades that regulate vascular integrity and participates in HEG1 (the transmembrane receptor heart of glass 1) mediated endothelial cell junctions. CCM proteins also inhibit the activation of small GTPase RhoA and its downstream effector Rho kinase (ROCK) to limit vascular permeability. CCM2 mediates TrkA-dependent cell death via its N-terminal PTB domain in pediatric neuroblastic tumours. CCM2 possesses an N-terminal PTB domain and a C-terminal Karet domain. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides that lack tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the Dab-like subgroup.	193
269988	cd13167	PTB_P-CLI1	PTB-containing, cubilin and LRP1-interacting protein Phosphotyrosine-binding (PTB) PH-like fold. P-CLI1 (also called Phosphotyrosine interaction domain-containing protein 1) increases proliferation of preadipocytes without affecting adipocytic differentiation. It forms a complex with PID1/PCLI1, LRP1 and CUBNI. It is found in subcutaneous fat, heart, skeletal muscle, brain, colon, thymus, spleen, kidney, liver, small intestine, placenta, lung and peripheral blood leukocyte. P-CLI1 contains a single PTB domain. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides that lack tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains.	139
269989	cd13168	PTB_LOC417372	uncharacterized protein LOC417372 Phosphotyrosine-binding (PTB) PH-like fold. The functions of LOC417372 and its related proteins are unknown to date. Members here contain an N-terminal RUN domain, followed by a PDZ domain, and a C-terminal PTB domain. The RUN domain is involved in Ras-like GTPase signaling. The PDZ domain (also called DHR/Dlg homologous region or GLGF after its conserved sequence motif) binds C-terminal polypeptides, internal (non-C-terminal) polypeptides, and lipids. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides that lack tyrosine residues altogether. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains. This cd is part of the Dab-like subgroup.	125
269990	cd13169	RanBD_NUP50_plant	Ran-binding protein 2, repeat 1. RanBP2 (also called E3 SUMO-protein ligase RanBP2, 358 kDa nucleoporin, and nuclear pore complex (NPC) protein Nup358) is a giant nucleoporin that localizes to the cytosolic face of the NPC. RanBP2 contains a leucine-rich region, 8 zinc-finger motifs, a cyclophilin A homologous domain, and 4 RanBDs. Ran is a Ras-like nuclear small GTPase, which regulates receptor-mediated transport between the nucleus and the cytoplasm. RanGTP hydrolysis is stimulated by RanGAP together with the Ran-binding domain-containing accessory proteins RanBP1 and RanBP2. These accessory proteins stabilize the active GTP-bound form of Ran. All eukaryotic cells contain RanBP1, but in vertebrates the main RanBP seems to be RanBP2. There is no RanBP2 ortholog in yeast. Transport complex disassembly is accomplished by a small ubiquitin-related modifier-1 (SUMO-1)-modified version of RanGAP that is bound to RanBP2. RanBP1 acts as a second line of defense against exported RanGTP-importin complexes which have escaped from dissociation by RanBP2. RanBP2 also interacts with the importin subunit beta-1. RanBD shares structural similarity to the PH domain, but lacks detectable sequence similarity. The first RanBD2 is present in this hierarchy.	117
269991	cd13170	RanBD_NUP50	Nucleoporin 50 Ran-binding domain. NUP50 acts as a cofactor for the importin-alpha:importin-beta heterodimer, which allows for transportation of many nuclear-targeted proteins through nuclear pore complexes. It is thought to function primarily at the terminal stages of nuclear protein import to coordinate import complex disassembly and importin recycling. NUP50 is composed of an N-terminal NUP50 domain which binds the C-terminus of importin-beta, a central domain which binds importin-beta, and a C-terminal RanBD which binds importin-beta through Ran-GTP. NUP50:importin-alpha then binds cargo and can stimulate nuclear import. The N-terminal domain of NUP50 is also able to displace nuclear localization signals from importin-alpha. NUP50 interacts with cyclin-dependent kinase inhibitor 1B which binds to cyclin E-CDK2 or cyclin D-CDK4 complexes and prevents their activation, thereby controlling the cell cycle progression at G1. Fungal Nup2 transiently associates with nuclear pore complexes (NPCs) and when artificially tethered to DNA, can prevent the spread of transcriptional activation or repression between flanking genes, a function termed boundary activity (BA). Nup2 and the Ran guanylyl-nucleotide exchange factor, Prp20, interact at specific chromatin regions and enable the NPC to play an active role in chromatin organization. Nup60p, the nup responsible for anchoring Nup2 and the Mlp proteins to the NPC, is required for Nup2-dependent BA. Nup2 contains an N-terminal Nup50 family domain and a C-terminal RanBD. Ran is a Ras-like nuclear small GTPase, which regulates receptor-mediated transport between the nucleus and the cytoplasm. RanGTP hydrolysis is stimulated by RanGAP together with the Ran-binding domain-containing accessory proteins RanBP1 and RanBP2. These accessory proteins stabilize the active GTP-bound form of Ran. RanBD shares structural similarity to the PH domain, but lacks detectable sequence similarity.	111
269992	cd13171	RanBD1_RanBP2_insect-like	Ran-binding protein 2, Ran binding domain repeat 1. RanBP2 (also called E3 SUMO-protein ligase RanBP2, 358 kDa nucleoporin, and nuclear pore complex (NPC) protein Nup358) is a giant nucleoporin that localizes to the cytosolic face of the NPC. RanBP2 contains a leucine-rich region, 8 zinc-finger motifs, a cyclophilin A homologous domain, and 4 RanBDs. Ran is a Ras-like nuclear small GTPase, which regulates receptor-mediated transport between the nucleus and the cytoplasm. RanGTP hydrolysis is stimulated by RanGAP together with the Ran-binding domain containing acessory proteins RanBP1 and RanBP2. These accessory proteins stabilize the active GTP-bound form of Ran. All eukaryotic cells contain RanBP1, but in vertebrates however, the main RanBP seems to be RanBP2. There is no RanBP2 ortholog in yeast. Transport complex disassembly is accomplished by a small ubiquitin-related modifier-1 (SUMO-1)-modified version of RanGAP that is bound to RanBP2. RanBP1 acts as a second line of defense against exported RanGTP-importin complexes which have escaped from dissociation by RanBP2. RanBP2 also interacts with the importin subunit beta-1. RabBD shares structural similarity to the PH domain, but lacks detectable sequence similarity. The members here include insects and nematodes. RanBD repeat 1 is present in this hierarchy.	117
269993	cd13172	RanBD2_RanBP2_insect-like	Ran-binding protein 2, Ran binding domain repeat 2. RanBP2 (also called E3 SUMO-protein ligase RanBP2, 358 kDa nucleoporin, and nuclear pore complex (NPC) protein Nup358) is a giant nucleoporin that localizes to the cytosolic face of the NPC. RanBP2 contains a leucine-rich region, 8 zinc-finger motifs, a cyclophilin A homologous domain, and 4 RanBDs. Ran is a Ras-like nuclear small GTPase, which regulates receptor-mediated transport between the nucleus and the cytoplasm. RanGTP hydrolysis is stimulated by RanGAP together with the Ran-binding domain containing acessory proteins RanBP1 and RanBP2. These accessory proteins stabilize the active GTP-bound form of Ran. All eukaryotic cells contain RanBP1, but in vertebrates however, the main RanBP seems to be RanBP2. There is no RanBP2 ortholog in yeast. Transport complex disassembly is accomplished by a small ubiquitin-related modifier-1 (SUMO-1)-modified version of RanGAP that is bound to RanBP2. RanBP1 acts as a second line of defense against exported RanGTP-importin complexes which have escaped from dissociation by RanBP2. RanBP2 also interacts with the importin subunit beta-1. RabBD shares structural similarity to the PH domain, but lacks detectable sequence similarity. The members here include insects and nematodes. RanBD repeat 2 is present in this hierarchy.	118
269994	cd13173	RanBD3_RanBP2_insect-like	Ran-binding protein 2, Ran binding domain repeat 3. RanBP2 (also called E3 SUMO-protein ligase RanBP2, 358 kDa nucleoporin, and nuclear pore complex (NPC) protein Nup358) is a giant nucleoporin that localizes to the cytosolic face of the NPC. RanBP2 contains a leucine-rich region, 8 zinc-finger motifs, a cyclophilin A homologous domain, and 4 RanBDs. Ran is a Ras-like nuclear small GTPase, which regulates receptor-mediated transport between the nucleus and the cytoplasm. RanGTP hydrolysis is stimulated by RanGAP together with the Ran-binding domain containing acessory proteins RanBP1 and RanBP2. These accessory proteins stabilize the active GTP-bound form of Ran. All eukaryotic cells contain RanBP1, but in vertebrates however, the main RanBP seems to be RanBP2. There is no RanBP2 ortholog in yeast. Transport complex disassembly is accomplished by a small ubiquitin-related modifier-1 (SUMO-1)-modified version of RanGAP that is bound to RanBP2. RanBP1 acts as a second line of defense against exported RanGTP-importin complexes which have escaped from dissociation by RanBP2. RanBP2 also interacts with the importin subunit beta-1. RabBD shares structural similarity to the PH domain, but lacks detectable sequence similarity. The members here include insects and nematodes. RanBD repeat 3 is present in this hierarchy.	115
269995	cd13174	RanBD4_RanBP2_insect-like	Ran-binding protein 2, Ran binding domain repeat 4. RanBP2 (also called E3 SUMO-protein ligase RanBP2, 358 kDa nucleoporin, and nuclear pore complex (NPC) protein Nup358) is a giant nucleoporin that localizes to the cytosolic face of the NPC. RanBP2 contains a leucine-rich region, 8 zinc-finger motifs, a cyclophilin A homologous domain, and 4 RanBDs. Ran is a Ras-like nuclear small GTPase, which regulates receptor-mediated transport between the nucleus and the cytoplasm. RanGTP hydrolysis is stimulated by RanGAP together with the Ran-binding domain containing acessory proteins RanBP1 and RanBP2. These accessory proteins stabilize the active GTP-bound form of Ran. All eukaryotic cells contain RanBP1, but in vertebrates however, the main RanBP seems to be RanBP2. There is no RanBP2 ortholog in yeast. Transport complex disassembly is accomplished by a small ubiquitin-related modifier-1 (SUMO-1)-modified version of RanGAP that is bound to RanBP2. RanBP1 acts as a second line of defense against exported RanGTP-importin complexes which have escaped from dissociation by RanBP2. RanBP2 also interacts with the importin subunit beta-1. RabBD shares structural similarity to the PH domain, but lacks detectable sequence similarity. The members here include insects and nematodes. RanBD repeat 4 is present in this hierarchy.	118
269996	cd13175	RanBD5_RanBP2_insect-like	Ran-binding protein 2, Ran binding domain repeat 5. RanBP2 (also called E3 SUMO-protein ligase RanBP2, 358 kDa nucleoporin, and nuclear pore complex (NPC) protein Nup358) is a giant nucleoporin that localizes to the cytosolic face of the NPC. RanBP2 contains a leucine-rich region, 8 zinc-finger motifs, a cyclophilin A homologous domain, and 4 RanBDs. Ran is a Ras-like nuclear small GTPase, which regulates receptor-mediated transport between the nucleus and the cytoplasm. RanGTP hydrolysis is stimulated by RanGAP together with the Ran-binding domain containing accessory proteins RanBP1 and RanBP2. These accessory proteins stabilize the active GTP-bound form of Ran. All eukaryotic cells contain RanBP1, but in vertebrates the main RanBP seems to be RanBP2. There is no RanBP2 ortholog in yeast. Transport complex disassembly is accomplished by a small ubiquitin-related modifier-1 (SUMO-1)-modified version of RanGAP that is bound to RanBP2. RanBP1 acts as a second line of defense against exported RanGTP-importin complexes which have escaped from dissociation by RanBP2. RanBP2 also interacts with the importin subunit beta-1. The RanBD shares structural similarity to the PH domain, but lacks detectable sequence similarity. The members here include insects and nematodes. RanBD repeat 5 is present in this hierarchy.	114
269997	cd13176	RanBD_RanBP2-like	Ran-binding protein 2, Ran binding domains. RanBP2 (also called E3 SUMO-protein ligase RanBP2, 358 kDa nucleoporin, and nuclear pore complex (NPC) protein Nup358) is a giant nucleoporin that localizes to the cytosolic face of the NPC. RanBP2 contains a leucine-rich region, 8 zinc-finger motifs, a cyclophilin A homologous domain, and 4 RanBDs. Ran is a Ras-like nuclear small GTPase, which regulates receptor-mediated transport between the nucleus and the cytoplasm. RanGTP hydrolysis is stimulated by RanGAP together with the Ran-binding domain containing accessory proteins RanBP1 and RanBP2. These accessory proteins stabilize the active GTP-bound form of Ran. All eukaryotic cells contain RanBP1, but in vertebrates the main RanBP seems to be RanBP2. There is no RanBP2 ortholog in yeast. Transport complex disassembly is accomplished by a small ubiquitin-related modifier-1 (SUMO-1)-modified version of RanGAP that is bound to RanBP2. RanBP1 acts as a second line of defense against exported RanGTP-importin complexes which have escaped from dissociation by RanBP2. RanBP2 also interacts with the importin subunit beta-1. The RanBD shares structural similarity to the PH domain, but lacks detectable sequence similarity. The members here include human, chicken, frog, tunicates, sea urchins, ticks, sea anemones, and sponges. RanBD repeats 1 and 3 are present in this hierarchy.	117
269998	cd13177	RanBD2_RanBP2-like	Ran-binding protein 2, Ran binding domain repeat 2. RanBP2 (also called E3 SUMO-protein ligase RanBP2, 358 kDa nucleoporin, and nuclear pore complex (NPC) protein Nup358) is a giant nucleoporin that localizes to the cytosolic face of the NPC. RanBP2 contains a leucine-rich region, 8 zinc-finger motifs, a cyclophilin A homologous domain, and 4 RanBDs. Ran is a Ras-like nuclear small GTPase, which regulates receptor-mediated transport between the nucleus and the cytoplasm. RanGTP hydrolysis is stimulated by RanGAP together with the Ran-binding domain containing accessory proteins RanBP1 and RanBP2. These accessory proteins stabilize the active GTP-bound form of Ran. All eukaryotic cells contain RanBP1, but in vertebrates the main RanBP seems to be RanBP2. There is no RanBP2 ortholog in yeast. Transport complex disassembly is accomplished by a small ubiquitin-related modifier-1 (SUMO-1)-modified version of RanGAP that is bound to RanBP2. RanBP1 acts as a second line of defense against exported RanGTP-importin complexes which have escaped from dissociation by RanBP2. RanBP2 also interacts with the importin subunit beta-1. The RanBD shares structural similarity to the PH domain, but lacks detectable sequence similarity. The members here include human, chicken, frog, tunicates, sea urchins, ticks, sea anemones, and sponges. RanBD repeat 2 is present in this hierarchy.	117
269999	cd13178	RanBD4_RanBP2-like	Ran-binding protein 2, Ran binding domain repeat 4. RanBP2 (also called E3 SUMO-protein ligase RanBP2, 358 kDa nucleoporin, and nuclear pore complex (NPC) protein Nup358) is a giant nucleoporin that localizes to the cytosolic face of the NPC. RanBP2 contains a leucine-rich region, 8 zinc-finger motifs, a cyclophilin A homologous domain, and 4 RanBDs. Ran is a Ras-like nuclear small GTPase, which regulates receptor-mediated transport between the nucleus and the cytoplasm. RanGTP hydrolysis is stimulated by RanGAP together with the Ran-binding domain containing accessory proteins RanBP1 and RanBP2. These accessory proteins stabilize the active GTP-bound form of Ran. All eukaryotic cells contain RanBP1, but in vertebrates the main RanBP seems to be RanBP2. There is no RanBP2 ortholog in yeast. Transport complex disassembly is accomplished by a small ubiquitin-related modifier-1 (SUMO-1)-modified version of RanGAP that is bound to RanBP2. RanBP1 acts as a second line of defense against exported RanGTP-importin complexes which have escaped from dissociation by RanBP2. RanBP2 also interacts with the importin subunit beta-1. The RanBD shares structural similarity to the PH domain, but lacks detectable sequence similarity. The members here include human, chicken, frog, tunicates, sea urchins, ticks, sea anemones, and sponges. RanBD repeat 4 is present in this hierarchy.	117
270000	cd13179	RanBD_RanBP1	Ran-binding domain. RanBP1 interacts specifically with GTP-charged Ran. RanBP1 does not activate GTPase activity of Ran, but does markedly increase GTP hydrolysis by the RanGTPase-activating protein (RanGAP1). In both mammalian cells and in yeast, RanBP1 acts as a negative regulator of Regulator of chromosome condensation 1 (RCC1) by inhibiting RCC1-stimulated guanine nucleotide release from Ran. In addition to Ran, RanBP1 has been shown to interact with Exportin-1 and Importin subunit beta-1 which docks the NPC at the cytoplasmic side of the nuclear pore complex. RanBP1 contains a single RanBD. The RanBD is present in RanBP1, RanBP2, RanBP3, Nup2, and Nup50. Most of these proteins have a single RanBD, with the exception of RanBP2 which has 4 RanBDs. Ran is a Ras-like nuclear small GTPase, which regulates receptor-mediated transport between the nucleus and the cytoplasm. RanGTP hydrolysis is stimulated by RanGAP together with the Ran-binding domain containing accessory proteins RanBP1 and RanBP2. These accessory proteins stabilize the active GTP-bound form of Ran. The Ran-binding domain is found in multiple copies in Nuclear pore complex proteins. The RanBD shares structural similarity to the PH domain, but lacks detectable sequence similarity.	136
270001	cd13180	RanBD_RanBP3	Ran-binding protein 3 Ran-binding domain. RanBP3, a Ran-interacting nuclear protein, unlike the related proteins RanBP1 and RanBP2, which promote disassembly of the export complex in the cytosol, acts as a CRM1 cofactor, enhancing nuclear export signal (NES) export by stabilizing the export complex in the nucleus. CRM1/Exportin1 is responsible for exporting many proteins and ribonucleoproteins from the nucleus to the cytosol. RanBP3 also alters the cargo selectivity of CRM1, promoting recognition of the NES of HIV-1 Rev and of other cargos while deterring recognition of the import adaptor protein Snurportin1. RanBP3 contains an N-terminal nuclear localization signal (NLS), 2 FxFG motifs, and a single RanBD. The RanBD shares structural similarity to the PH domain, but lacks detectable sequence similarity.	113
270002	cd13181	RanBD_NUP2	Nucleoporin 2 Ran-binding domain. Yeast protein Nup2 transiently associates with Nuclear pore complexes (NPCs) and when artificially tethered to DNA, can prevent the spread of transcriptional activation or repression between flanking genes, a function termed boundary activity (BA). Nup2 and the Ran guanylyl-nucleotide exchange factor, Prp20, interact at specific chromatin regions and enable the NPC to play an active role in chromatin organization. Nup60p, the nup responsible for anchoring Nup2 and the Mlp proteins to the NPC, is required for Nup2-dependent BA. Nup2 contains an N-terminal Nup50 family domain and a C-terminal RanBD. Ran is a Ras-like nuclear small GTPase, which regulates receptor-mediated transport between the nucleus and the cytoplasm. RanGTP hydrolysis is stimulated by RanGAP together with the Ran-binding domain containing accessory proteins RanBP1 and RanBP2. These accessory proteins stabilize the active GTP-bound form of Ran. The RanBD shares structural similarity to the PH domain, but lacks detectable sequence similarity.	115
270003	cd13182	EVH1-like_Dcp1	Decapping enzyme EVH1-like domain. Dcp1 is a small protein containing an EVH1 domain. The Dcp1-Dcp2 complex carries out a critical step in mRNA degradation, the removal of the 5' cap structure. Dcp1 stimulates the activity of Dcp2 by promoting and/or stabilizing the closed complex. The interface of Dcp1 and Dcp2 is not fully conserved and in higher eukaryotes it requires an additional factor. The proline-rich sequence (PRS)-binding site in Dcp1p indicates that it belongs to a novel class of EVH1 domains. Dcp1 has 2 prominent sites, one required for the function of the Dcp1p-Dcp2p complex, and the other, the PRS-binding site of EVH1 domains, a binding site for decapping regulatory proteins. It also has a conserved hydrophobic patch that has been shown to be critical for decapping. The EVH1 domains are part of the PH domain superfamily.	116
270004	cd13183	FERM_C_FRMPD1_FRMPD3_FRMPD4	FERM domain C-lobe of FERM and PDZ domain containing proteins 1, 3, and 4 (FRMPD1, 3, 4). The function of FRMPD1, FRMPD3, and FRMPD4 is unknown at present. These proteins contain an N-terminal PDZ (post synaptic density protein (PSD95), Drosophila disc large tumor suppressor (Dlg1), and zonula occludens-1 protein (zo-1) domain and a C-terminal FERM domain. PDZ (also known as DHR (Dlg homologous region) or GLGF (glycine-leucine-glycine-phenylalanine) domains) help anchor transmembrane proteins to the cytoskeleton and hold together signaling complexes. PDZ domains bind to a short region of the C-terminus of other specific proteins. The FERM domain is composed of three subdomains: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3), which form a clover leaf fold. The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs) , the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites.	105
270005	cd13184	FERM_C_4_1_family	FERM domain C-lobe of Protein 4.1 family. The protein 4.1 family includes four well-defined members: erythroid protein 4.1 (4.1R), the best known and characterized member, 4.1G (general), 4.1N (neuronal), and 4.1 B (brain). The less well understood 4.1O/FRMD3 is not a true member of this family and is not included in this hierarchy. Besides three highly conserved domains, FERM, SAB (spectrin and actin binding domain) and CTD (C-terminal domain), the proteins from this family contain several unique domains: U1, U2 and U3. FERM domains like other members of the FERM domain superfamily have a cloverleaf architecture with three distinct lobes: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. The brain is a particularly rich source of protein 4.1 isoforms. The various 4.1R, 4.1G, 4.1N, and 4.1B mRNAs are all expressed in distinct patterns within the brain. It is likely that 4.1 proteins play important functional roles in the brain including motor coordination and spatial learning, postmitotic differentiation, and synaptic architecture and function. In addition they are found in nonerythroid, nonneuronal cells where they may play a general structural role in nuclear architecture and/or may interact with splicing factors. The FERM C domain is the third structural domain within the FERM domain. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs) , the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites.	94
270006	cd13185	FERM_C_FRMD1_FRMD6	FERM domain C-lobe of FERM domain containing 1 and 6 proteins. FRMD6 (also called willin and hEx/human expanded) is localized throughout the cytoplasm or along the plasma membrane. The Drosophila protein Ex is a regulator of the Hippo/SWH (Sav/Wts/Hpo) signaling pathway, a signaling pathway that plays a pivotal role in organ size control and in tumor suppression by restricting proliferation and promoting apoptosis. Surprisingly, hEx is thought to function independently of the Hippo pathway. Instead it is hypothesized that hEx inhibits progression through the S phase of the cell cycle by upregulating p21(Cip1) and downregulating Cyclin A. It is also implicated in the progression of Alzheimer disease. Not much is known about FRMD1 to date. Both FRMD1 and FRMD6 contain a single FERM domain which has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe is a member of the PH superfamily. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites.	107
270007	cd13186	FERM_C_NBL4_NBL5	FERM domain C-lobe of Novel band 4.1-like protein 4 and 5 (NBL4 and 5). NBL4 (also called Erythrocyte protein band 4.1-like 4; Epb4 1l4) plays a role the beta-catenin/Tcf signaling pathway and is thought to be involved in establishing the cell polarity or proliferation. NBL4 may be also involved in adhesion, in cell motility and/or in cell-to-cell communication. No role for NBL5 has been proposed to date. Both NBL4 and NBL5 contain a N-terminal FERM domain which has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe is a member of the PH superfamily. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs) , the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites.	92
270008	cd13187	FERM_C_PTPH13	FERM domain C-lobe of Protein tyrosine phosphatase non-receptor 13 (PTPH13). There are many functions of PTPN13 (also called PTPL1, PTP-BAS, hPTP1E, or FAP1). Mice lacking PTPN13 activity have abnormal regulation of signal transducer and activator of transcription signaling in their T cells, mild impairment of motor nerve repair, and a significant reduction in the growth of retinal glia cultures. It also plays a role in adipocyte differentiation. PTPN13 contains a kinase non-catalytic C-lobe domain (KIND), a FERM domain with two potential phosphatidylinositol 4,5-bisphosphate [PtdIns(4,5)P2]-binding motifs, 5 PDZ domains, and a carboxy-terminal catalytic domain. There is an interaction between the FERM domain of PTPL1 and PtdIns(4,5)P2 which is thought to regulate the membrane localization of PTPN13. PDZ domains are protein-protein interaction domains, so there is the potential for numerous partners that can actively participate in the regulation of its phosphatase activity or can permit direct or indirect recruitment of tyrosine phosphorylated PTPL1 substrates. The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites.	103
270009	cd13188	FERM_C_PTPN14_PTPN21	FERM domain C-lobe of Protein tyrosine phosphatase non-receptor proteins 14 and 21 (PTPN14 and 21). This CD contains PTP members: pez/PTPN14 and PTPN21. A number of mutations in Pez have been shown to be associated with breast and colorectal cancer. The PTPN protein family belong to larger family of PTPs. PTPs are known to be signaling molecules that regulate a variety of cellular processes including cell growth, differentiation, mitotic cycle, and oncogenic transformation. The members are composed of a N-terminal FERM domain and a C-terminal PTP catalytic domain. The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. Like most other ERM members they have a phosphoinositide-binding site in their FERM domain. The FERM C domain is the third structural domain within the FERM domain. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs) , the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites.	91
270010	cd13189	FERM_C_PTPN4_PTPN3_like	FERM domain C-lobe of Protein tyrosine phosphatase non-receptor proteins 3 and 4 (PTPN4 and PTPN3). PTPN4 (also called PTPMEG, protein tyrosine phosphatase, megakaryocyte) is a cytoplasmic protein-tyrosine phosphatase (PTP) thought to play a role in cerebellar function. PTPMEG-knockout mice have impaired memory formation and cerebellar long-term depression. PTPN3/PTPH1 is a membrane-associated PTP that is implicated in regulating tyrosine phosphorylation of growth factor receptors, p97 VCP (valosin-containing protein, or Cdc48 in Saccharomyces cerevisiae), and HBV (Hepatitis B Virus) gene expression; it is mutated in a subset of colon cancers. PTPMEG and PTPN3/PTPH1 contains a N-terminal FERM domain, a middle PDZ domain, and a C-terminal phosphatase domain. PTP1/Tyrosine-protein phosphatase 1 from nematodes and a FERM_C repeat 1 from Tetraodon nigroviridis are also included in this cd. The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs) , the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites.	95
270011	cd13190	FERM_C_FAK1	FERM domain C-lobe of Focal Adhesion Kinase 1 and 2. FAK1 (also called FRNK/Focal adhesion kinase-related nonkinase; p125FAK/pp125FAK; PTK2/Protein-tyrosine kinase 2) is a non-receptor tyrosine kinase that localizes to focal adhesions in adherent cells. It has been implicated in diverse cellular roles including cell locomotion, mitogen response and cell survival. The N-terminal region of FAK1 contains a FERM domain, a linker, a kinase domain, and a C-terminal FRNK (FAK-related-non-kinase) domain. Three subdomains of FERM: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3), form a cloverleaf fold, similar to those of known FERM structures despite the low sequence conservation. The C-lobe/F3 within the FERM domain is part of the PH domain family. The phosphoinositide-binding site found in ERM family proteins is not present in the FERM domain of FAK1. The adjacent Src SH3 and SH2 binding sites in the linker of FAK1 associate with the F3 and F1 lobes and are thought to be involved in regulation. The FERM domain of FAK1 can inhibit enzymatic activity and repress FAK signaling. In an inactive state of FAK1, the FERM domain is thought to interact with the catalytic domain of FAK1 to repress its activity. Upon activation this interaction is disrupted and its kinase activity restored. The FRNK domain is thought to function as a negative regulator of kinase activity. The C-lobe/F3 is the third structural domain within the FERM domain. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites.	111
270012	cd13191	FERM_C_FRMD4A_FRMD4B	FERM domain C-lobe of FERM domain-containing protein 4A and 4B (FRMD4A and 4B). FRMD4A is part of the Par-3/FRMD4A/cytohesin-1 complex that activates Arf6, a central player in actin cytoskeleton dynamics and membrane trafficking, during junctional remodeling and epithelial polarization. The Par-3/Par-6/aPKC/Cdc42 complex regulates the conversion of primordial adherens junctions (AJs) into belt-like AJs and the formation of linear actin cables. When primordial AJs are formed, Par-3 recruits scaffolding protein FRMD4A which connects Par-3 and the Arf6 guanine-nucleotide exchange factor (GEF), cytohesin-1. FRMD4B (also called GRP1-binding protein, GRSP1) is a novel member of GRP1 signaling complexes that are recruited to plasma membrane ruffles in response to insulin receptor signaling. The GRSP1/FRMD4B protein contains a FERM protein domain as well as two coiled coil domains and may function as a scaffolding protein. GRP1 and GRSP1 interact through the coiled coil domains in the two proteins. The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs) , the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites.	113
270013	cd13192	FERM_C_FRMD3_FRMD5	FERM domain C-lobe of FERM domain-containing protein 3 and 5 (FRMD3 and 5). FRMD3 (also called Band 4.1-like protein 4O/4.1O though it is not a true member of that family) is a novel putative tumor suppressor gene that is implicated in the origin and progression of lung cancer. In humans there are 5 isoforms that are produced by alternative splicing. Less is known about FRMD5, though there are 2 isoforms of the human protein are produced by alternative splicing. Both FRMD3 and FRMD5 contain a N-terminal FERM domain, followed by a FERM adjacent (FA) domain, and 4.1 protein C-terminal domain (CTD). The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites.	105
270014	cd13193	FERM_C_FARP1-like	FERM domain C-lobe of FERM, RhoGEF and pleckstrin domain-containing protein 1 and related proteins. Members here include FARP1 (also called Chondrocyte-derived ezrin-like protein; PH domain-containing family C member 2), FARP2 (also called FIR/FERM domain including RhoGEF; FGD1-related Cdc42-GEF/FRG), and FRMD7(FERM domain containing 7). FARP1 and FARP2 are members of the Dbl family guanine nucleotide exchange factors (GEFs) which are upstream positive regulators of Rho GTPases. FARP1 has increased expression in differentiated chondrocytes. FARP2 is thought to regulate neurite remodeling by mediating the signaling pathways from membrane proteins to Rac. It is found in brain, lung, and testis, as well as embryonic hippocampal and cortical neurons. These members are composed of a N-terminal FERM domain, a proline-rich (PR) domain, Dbl-homology (DH), and two C-terminal PH domains. Other members in this family do not contain the DH domains such as the Human FERM domain containing protein 7 and Caenorhabditis elegans CFRM3, both of which have unknown functions. They contain an N-terminal FERM domain, a PH domain, followed by a FA (FERM adjacent) domain. The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites.	122
270015	cd13194	FERM_C_ERM	FERM domain C-lobe/F3 of the ERM family. The ERM family includes ezrin, radixin, moesin and merlin. They are composed of a N-terminal FERM (ERM) domain (also called N-ERMAD (N-terminal ERM association domain)), a coiled coil region (CRR), and a C-terminal domain CERMAD (C-terminal ERM association domain) which has an F-actin-binding site (ABD). Two actin-binding sites have been identified in the middle and N-terminal domains. Merlin is structurally similar to the ERM proteins, but instead of an actin-binding domain (ABD), it contains a C-terminal domain (CTD), just like the proteins from the 4.1 family. Activated ezrin, radixin and moesin are thought to be involved in the linking of actin filaments to CD43, CD44, ICAM1-3 cell adhesion molecules, various membrane channels and receptors, such as the Na+/H+ exchanger-3 (NHE3), cystic fibrosis transmembrane conductance regulator (CFTR), and the beta2-adrenergic receptor. The ERM proteins exist in two states, a dormant state in which the FERM domain binds to its own C-terminal tail and thereby precludes binding of some partner proteins, and an activated state, in which the FERM domain binds to one of many membrane binding proteins and the C-terminal tail binds to F-actin. The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain of ERM is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites.	97
270016	cd13195	FERM_C_MYLIP_IDOL	FERM domain C-lobe of E3 ubiquitin ligase myosin regulatory light chain-interacting protein (MYLIP; also called inducible degrader of the LDL receptor, IDOL). MYLIP/IDOL is a regulator of the LDL receptor (LDLR) pathway via the nuclear receptor liver X receptor (LXR). In response to cellular cholesterol loading, the activation of LXR leads to the induction of MYLIP expression. MYLIP stimulates ubiquitination of the LDLR on its cytoplasmic tail, directing its degradation. The LXR-MYLIP-LDLR pathway provides a complementary pathway to sterol regulatory element-binding proteins for the feedback inhibition of cholesterol uptake. MYLIP has an N-terminal FERM domain and in some cases a C-terminal RING domain. The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites.	111
275394	cd13196	FERM_C_JAK	FERM domain C-lobe of Janus kinase (JAK). JAK (also called Just Another Kinase) is a family of intracellular, non-receptor tyrosine kinases that transduce cytokine-mediated signals via the JAK-STAT pathway. The JAK family in mammals consists of 4 members: JAK1, JAK2, JAK3 and TYK2. JAKs are composed of seven JAK homology (JH) domains (JH1-JH7) . The C-terminal JH1 domain is the main catalytic domain, followed by JH2, which is often referred to as a pseudokinase domain, followed by JH3-JH4 which is homologous to the SH2 domain, and lastly JH5-JH7 which is a FERM domain. Named after Janus, the two-faced Roman god of doorways, JAKs possess two near-identical phosphate-transferring domains; one which displays the kinase activity (JH1), while the other negatively regulates the kinase activity of the first (JH2). The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites.	109
270018	cd13197	FERM_C_CCM1	FERM domain C-lobe of Cerebral cavernous malformation 1. CCM1 (also called KRIT-1/Krev interaction trapped 1;ankyrin repeat-containing protein Krit1; CAM), a Rap1-binding protein, is expressed in endothelial cells where it is present in cell-cell junctions and associated with junctional proteins. Together with CCM2/MGC4607 and CCM3/PDCD10, KRIT1 constitutes a set of proteins, mutations of which are found in cerebral cavernous malformations which are characterized by cerebral hemorrhages and vascular malformations in the central nervous system. KRIT-1 possesses four ankyrin repeats, a FERM domain, and multiple NPXY sequences, one of which is essential for integrin cytoplasmic domain-associated protein-1alpha (ICAP1alpha) binding and all of which mediate binding of CCM2. KRIT-1 localization is mediated by its FERM domain. The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites.	100
270019	cd13198	FERM_C1_MyoVII	FERM domain C-lobe, repeat 1, of Myosin VII (MyoVII/Myo7). MyoVII, a MyTH-FERM myosin, is an actin-based motor protein essential for a variety of biological processes in the actin cytoskeleton function. Mutations in MyoVII leads to problems in sensory perception: deafness and blindness in humans (Usher Syndrome), retinal defects and deafness in mice (shaker 1), and aberrant auditory and vestibular function in zebrafish. Myosin VIIAs have plus (barbed) end-directed motor activity on actin filaments and a characteristic actin-activated ATPase activity. MyoVII consists of a conserved spectrin-like, SH3 subdomain N-terminal region, a motor/head region, a neck made of 4-5 IQ motifs, and a tail consisting of a coiled-coil domain, followed by a tandem repeat of myosin tail homology 4 (MyTH4) domains and partial FERM domains that are separated by an SH3 subdomain and are thought to mediate dimerization and binding to other proteins or cargo. Members include: MyoVIIa, MyoVIIb, and MyoVII members that do not have distinct myosin VIIA and myosin VIIB genes. The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs) , the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites.	99
270020	cd13199	FERM_C2_MyoVII	FERM domain C-lobe, repeat 2, of Myosin VII (MyoVII, Myo7). MyoVII, a MyTH-FERM myosin, is an actin-based motor protein essential for a variety of biological processes in the actin cytoskeleton function. Mutations in MyoVII leads to problems in sensory perception: deafness and blindness in humans (Usher Syndrome), retinal defects and deafness in mice (shaker 1), and aberrant auditory and vestibular function in zebrafish. Myosin VIIAs have plus (barbed) end-directed motor activity on actin filaments and a characteristic actin-activated ATPase activity. MyoVII consists of a conserved spectrin-like, SH3 subdomain N-terminal region, a motor/head region, a neck made of 4-5 IQ motifs, and a tail consisting of a coiled-coil domain, followed by a tandem repeat of myosin tail homology 4 (MyTH4) domains and partial FERM domains that are separated by an SH3 subdomain and are thought to mediate dimerization and binding to other proteins or cargo. Members include: MyoVIIa, MyoVIIb, and MyoVII members that do not have distinct myosin VIIA and myosin VIIB genes. The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites.	96
270021	cd13200	FERM_C_KCBP	FERM domain C-lobe of Kinesin-like calmodulin binding protein. KCBPs (also called KIPK/Kinesin-like Calmodulin-Binding Protein-Interacting Protein Kinase), a member of the Kinesin-14 family, is a C-terminal microtubule motor with three unique domains including a myosin tail homology region 4 (MyTH4), a talin-like domain, and a calmodulin-binding domain (CBD). Binding of the Ca2+-activated calmodulin to KCBP causes the motor to dissociate from microtubules. The microtubule binding of KCBP is controlled by the calcium binding protein KIC containing a single EF-hand motif. KCBPs are unique to land plants and green algae. The MyTH4 and talin-like domains are not found in other kinesins, while the CBD domain is also only found in Strongylocentrotus purpuratus kinesin-C (SpKinC). The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites.	109
270022	cd13201	FERM_C_MyoXV	FERM domain C-lobe of Myosin XV (MyoXV/Myo15). MyoXV, a MyTH-FERM myosin, are actin-based motor proteins essential for a variety of biological processes in actin cytoskeleton function. Specifically MyoXV functions in the actin organization in hair cells of the organ of Corti. Mutations in Human MyoXVa causes non-syndromic deafness, DFNB3 and the mouse shaker-2 mutation. MyoXV consists of a N-terminal motor/head region, a neck made of 1-3 IQ motifs, and a tail that consists of either a myosin tail homology 4 (MyTH4) domains, followed by an SH3 domain, and a MyTH-FERM domains as in rat Myo15 or two MyTH-FERM domains separated by a SH3 domain as in human Myo15A. The MyTH-FERM domains are thought to mediate dimerization and binding to other proteins or cargo. The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites.	101
270023	cd13202	FERM_C_MyoX	FERM domain C-lobe of Myosin X (MyoX, Myo10). MyoX, a MyTH-FERM myosin, is a molecular motor that has crucial functions in the transport and/or tethering of integrins in the actin-based extensions known as filopodia, microtubule binding, and in netrin-mediated axon guidance. It functions as a dimer. MyoX walks on bundles of actin, rather than single filaments, unlike the other unconventional myosins. MyoX is present in organisms ranging from humans to choanoflagellates, but not in Drosophila and Caenorhabditis elegans. MyoX consists of an N-terminal motor/head region, a neck made of 3 IQ motifs, and a tail consisting of a coiled-coil domain, a PEST region, 3 PH domains, a myosin tail homology 4 (MyTH4), and a FERM domain at its very C-terminus. The MyoX FERM domain binds to the NPXY motif of several beta-integrins, a key family of cell surface receptors that are involved in cell adhesion and migration. In addition the FERM domain binds to the cytoplasmic domains of the netrin receptors DCC (deleted in colorectal cancer) and neogenin. The FERM domain also forms a supramodule with its MyTH4 domain which binds to the negatively charged E-hook region in the tails of alpha- and beta-tubulin forming a proposed motorized link between actin filaments and microtubules. The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites.	90
270024	cd13203	FERM_C1_myosin_like	FERM domain C-lobe, repeat 1, of Myosin-like proteins. These myosin-like proteins are unidentified though they are similar in sequence to myosin 1/myo1, myosin 7/myoVII, and myosin 10/myoX. These myosin-like proteins contain an N-terminal motor/head region and a C-terminal tail consisting of two myosin tail homology 4 (MyTH4) domains and two FERM domains. In myoX the FERM domain forms a supramodule with its MyTH4 domain which binds to the negatively charged E-hook region in the tails of alpha- and beta-tubulin forming a proposed motorized link between actin filaments and microtubules, and a similar interaction might occur in these myosins. The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The first FERM_N repeat is present in this hierarchy. The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites.	97
270025	cd13204	FERM_C2_myosin_like	FERM domain C-lobe, repeat 2, of Myosin-like proteins. These myosin-like proteins are unidentified though they are similar in sequence to myosin 1/myo1, myosin 7/myoVII, and myosin 10/myoX. These myosin-like proteins contain an N-terminal motor/head region and a C-terminal tail consisting of two myosin tail homology 4 (MyTH4) domains and two FERM domains. In myoX the FERM domain forms a supramodule with its MyTH4 domain which binds to the negatively charged E-hook region in the tails of alpha- and beta-tubulin forming a proposed motorized link between actin filaments and microtubules, and a similar interaction might occur in these myosins. The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The second FERM_N repeat is present in this hierarchy. The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites.	93
270026	cd13205	FERM_C_fermitin	FERM domain C-lobe of the Fermitin family. Fermitin functions as a mediator of integrin inside-out signalling. The recruitment of Fermitin proteins and Talin to the membrane mediates the terminal event of integrin signalling, via interaction with integrin beta subunits. Fermitin has a FERM domain interrupted with a pleckstrin homology (PH) domain. Fermitin family homologs (Fermt1, 2, and 3, also known as Kindlins) are each encoded by a different gene. In mammalian studies, Fermt1 is generally expressed in epithelial cells, Fermt2 is expressed in muscle tissues, and Fermt3 is expressed in hematopoietic lineages. Specifically Fermt2 is expressed in smooth and striated muscle tissues in mice and in the somites (a trunk muscle precursor) and neural crest in Xenopus embryos. As such it has been proposed that Fermt2 plays a role in cardiomyocyte and neural crest differentiation. Expression of mammalian Fermt3 is associated with hematopoietic lineages: the anterior ventral blood islands, vitelline veins, and early myeloid cells. In Xenopus embryos this expression also includes the notochord and cement gland. The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). This cd is not included in the C-lobe hierarchy based on its position in the tree. One thing to note is that unlike the other members of the C-lobe hierarchy it contains 2 FERM M domains which might also reflect a difference in its evolutionary history. The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites.	91
241360	cd13206	FERM_C-lobe_PLEKHH1_PLEKHH2	FERM domain C-lobe of Pleckstrin homology domain-containing family H. PLEKHH1 and PLEKHH2 (also called PLEKHH1L) are thought to function in phospholipid binding and signal transduction. There are 3 Human PLEKHH genes: PLEKHH1, PLEKHH2, and PLEKHH3. There are many isoforms, the longest of which contain a FERM domain, a MyTH4 domain, two PH domains, a peroximal domain, a vacuolar domain, and a coiled coil stretch. The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites.	100
275395	cd13207	FERM-like_C_SNX	Atypical FERM-like domain C-lobe of Sorting nexin family. Sorting nexins function in regulating recycling from endosomes to the cell surface. SNX17, SNX27, and SNX31 contain a N-terminal PX domain, a FERM-like domain, and a unique C-terminal region. All three proteins are able to bind the Ras GTPase through their FERM-like domains. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. These interactions place the PX-FERM-like proteins at a hub of endosomal sorting and signaling processes. These proteins participate in a network of interactions that will impact on both endosomal protein trafficking and compartment specific Ras signaling cascades. The typical FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. FERM domains are found in cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites.	116
275396	cd13208	PH-GRAM_MTMR5_MTMR13	Myotubularin (MTM) related 5 and 13 proteins (MTMR5 and MTMR13) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. MTMR5 is a catalytically inactive phosphatase that plays a role as an adapter for the phosphatase myotubularin to regulate myotubularin intracellular location. It lacks several amino acids in the dsPTPase catalytic pocket which renders it catalytically inactive as a phosphatase. MTMR5 is the most well-studied inactive member of this family and has been implicated in cellular growth control and oncogenic transformation. MTMR13 is a catalytically inactive phosphatase that plays a role as an adapter for the phosphatase myotubularin to regulate myotubularin intracellular location. It contains a Leu residue instead of a conserved Cys residue in the dsPTPase catalytic loop which renders it catalytically inactive as a phosphatase. MTMR13 has high sequence similarity to MTMR5 and has recently been shown to be a second gene mutated in type 4B Charcot-Marie-Tooth syndrome. Both MTMR5 and MTMR13 contain an N-terminal DENN domain, a PH-GRAM domain, an inactive PTP domain, a SET interaction domain, a coiled-coil domain, and a C-terminal PH domain. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. Mutations in this family cause the human neuromuscular disorders myotubular myopathy and type 4B Charcot-Marie-Tooth syndrome. 6 of the 13 MTMRs (MTMRs 5, 9-13) contain naturally occurring substitutions of residues required for catalysis by PTP family enzymes. Although these proteins are predicted to be enzymatically inactive, they are thought to function as antagonists of endogenous phosphatase activity or interaction modules. Most MTMRs contain an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, a PTP domain (which may be active or inactive), a SET-interaction domain, and a C-terminal coiled-coil region. In addition some members contain a DENN domain N-terminal to the PH-GRAM domain and FYVE, PDZ, and PH domains C-terminal to the coiled-coil region. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. The PH domain family possesses multiple functions including the ability to bind phosphoinositides via its beta1/beta2, beta3/beta4, and beta6/beta7 connecting loops and to other proteins. However, no phosphoinositide binding sites have been found for the MTMRs to date. Although the majority of the sequences are MTMR 5 and 13, this cd also contains MTM5 nematode sequences.	120
275397	cd13209	PH-GRAM_MTMR3_MTMR4	Myotubularin (MTM) related 3 and 4 proteins (MTMR3 and MTMR4) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. MTMR3 is a member of the myotubularin dual specificity protein phosphatase gene family. MTMR3 binds to phosphoinositide lipids through its PH-GRAM domain, and can hydrolyze phosphatidylinositol(3)-phosphate and phosphatidylinositol(3,5)-bisphosphate in vitro. The protein can self-associate and also form heteromers with MTMR4. MTMR4 is a member of the myotubularin dual specificity protein phosphatase gene family. MTMR4 binds to phosphoinositide lipids through its PH-GRAM domain, and can hydrolyze phosphatidylinositol(3)-phosphate and phosphatidylinositol(3,5)-bisphosphate in vitro. The protein forms heteromers with MTMR3. Both MTMR3 and MTMR4 contain an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, an active PTP domain, a SET-interaction domain, a coiled-coil region, and a C-terminal lipid-binding FYVE domain which binds phosphatidylinositol-3-phosphate. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. Mutations in this family cause the human neuromuscular disorders myotubular myopathy and type 4B Charcot-Marie-Tooth syndrome. 6 of the 13 MTMRs (MTMRs 5, 9-13) contain naturally occurring substitutions of residues required for catalysis by PTP family enzymes. Although these proteins are predicted to be enzymatically inactive, they are thought to function as antagonists of endogenous phosphatase activity or interaction modules. Most MTMRs contain an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, a PTP domain (which may be active or inactive), a SET-interaction domain, and a C-terminal coiled-coil region. In addition some members contain a DENN domain N-terminal to the PH-GRAM domain and FYVE, PDZ, and PH domains C-terminal to the coiled-coil region. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. The PH domain family possesses multiple functions including the ability to bind phosphoinositides via its beta1/beta2, beta3/beta4, and beta6/beta7 connecting loops and to other proteins. However, no phosphoinositide binding sites have been found for the MTMRs to date.	94
270030	cd13210	PH-GRAM_MTMR6-like	Myotubularin (MTM) related (MTMR) 7 and 8 proteins Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. MTMR6, MTMR7, and MTMR8 are all members of the myotubularin dual specificity protein phosphatase gene family. They bind to phosphoinositide lipids through their PH-GRAM domains. These proteins also interact with each other as well as MTMR9. They contain an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, an active PTP domain, a SET-interaction domain, and a C-terminal coiled-coil region. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. Mutations in this family cause the human neuromuscular disorders myotubular myopathy and type 4B Charcot-Marie-Tooth syndrome. 6 of the 13 MTMRs (MTMRs 5, 9-13) contain naturally occurring substitutions of residues required for catalysis by PTP family enzymes. Although these proteins are predicted to be enzymatically inactive, they are thought to function as antagonists of endogenous phosphatase activity or interaction modules. Most MTMRs contain an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, a PTP domain (which may be active or inactive), a SET-interaction domain, and a C-terminal coiled-coil region. In addition some members contain a DENN domain N-terminal to the PH-GRAM domain and FYVE, PDZ, and PH domains C-terminal to the coiled-coil region. The lipid-binding FYVE domain has been shown to bind phosphatidylinositol-3-phosphate. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. The PH domain family possesses multiple functions including the ability to bind phosphoinositides via its beta1/beta2, beta3/beta4, and beta6/beta7 connecting loops and to other proteins. However, no phosphoinositide binding sites have been found for the MTMRs to date.	98
275398	cd13211	PH-GRAM_MTMR9	Myotubularin (MTM) related 9 protein (MTMR9) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. MTMR9 is a catalytically inactive phosphatase that plays a role as an adapter for the phosphatase myotubularin to regulate myotubularin intracellular location. It contains a Gly residue instead of a conserved Cys residue in the dsPTPase catalytic loop which renders it catalytically inactive as a phosphatase. MTMR9 contains an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, an inactive PTP domain, a SET interaction domain, and a C-terminal coiled-coil region. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. Mutations in this family cause the human neuromuscular disorders myotubular myopathy and type 4B Charcot-Marie-Tooth syndrome. 6 of the 13 MTMRs (MTMRs 5, 9-13) contain naturally occurring substitutions of residues required for catalysis by PTP family enzymes. Although these proteins are predicted to be enzymatically inactive, they are thought to function as antagonists of endogenous phosphatase activity or interaction modules. Most MTMRs contain an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, a PTP domain (which may be active or inactive), a SET-interaction domain, and a C-terminal coiled-coil region. In addition some members contain a DENN domain N-terminal to the PH-GRAM domain and FYVE, PDZ, and PH domains C-terminal to the coiled-coil region. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. The PH domain family possesses multiple functions including the ability to bind phosphoinositides via its beta1/beta2, beta3/beta4, and beta6/beta7 connecting loops and to other proteins. However, no phosphoinositide binding sites have been found for the MTMRs to date.	99
275399	cd13212	PH-GRAM_MTMR10-like	Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. MTMR10, MTMR11, and MTMR12 are catalytically inactive phosphatases that play a role as adapters for the phosphatase myotubularin to regulate myotubularin intracellular location. They contain a Glu residue instead of a conserved Cys residue in the dsPTPase catalytic loop which renders them catalytically inactive as phosphatases. They contain an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, an inactive PTP domain, a SET interaction domain, and a C-terminal coiled-coil domain. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. Mutations in this family cause the human neuromuscular disorders myotubular myopathy and type 4B Charcot-Marie-Tooth syndrome. 6 of the 13 MTMRs (MTMRs 5, 9-13) contain naturally occurring substitutions of residues required for catalysis by PTP family enzymes. Although these proteins are predicted to be enzymatically inactive, they are thought to function as antagonists of endogenous phosphatase activity or interaction modules. Most MTMRs contain an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, a PTP domain (which may be active or inactive), a SET-interaction domain, and a C-terminal coiled-coil region. In addition some members contain a DENN domain N-terminal to the PH-GRAM domain and FYVE, PDZ, and PH domains C-terminal to the coiled-coil region. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold.	125
275400	cd13213	PH-GRAM_MTMR14	Myotubularin (MTM) related 14 protein (MTMR14) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. MTMR14 is a member of the myotubularin protein phosphatase gene family. MTMR14 binds to phosphoinositide lipids through its PH-GRAM domain, and can hydrolyze phosphatidylinositol(3)-phosphate and phosphatidylinositol(3,5)-bisphosphate in vitro. MTMR14 plays a role in the regulation of autophagy and mutations in MTMR14 result in autosomal dominant centronuclear myopathy. MTMR14 contains an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, an active PTP domain, a SET-interaction domain (SID), a coiled-coil region, and a C-terminal PDZ domain. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. Mutations in this family cause the human neuromuscular disorders myotubular myopathy and type 4B Charcot-Marie-Tooth syndrome. 6 of the 13 MTMRs (MTMRs 5, 9-13) contain naturally occurring substitutions of residues required for catalysis by PTP family enzymes. Although these proteins are predicted to be enzymatically inactive, they are thought to function as antagonists of endogenous phosphatase activity or interaction modules. Most MTMRs contain an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, a PTP domain (which may be active or inactive), a SET-interaction domain (SID), and a C-terminal coiled-coil region. In addition some members contain a DENN domain N-terminal to the PH-GRAM domain and FYVE, PDZ, and PH domains C-terminal to the coiled-coil region. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. The PH domain family possesses multiple functions including the ability to bind phosphoinositides via its beta1/beta2, beta3/beta4, and beta6/beta7 connecting loops and to other proteins. However, no phosphoinositide binding sites have been found for the MTMRs to date.	116
275401	cd13214	PH-GRAM_WBP2	WW binding protein 2 (WBP2) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. WBP2 plays a number of roles including: acting as a tyrosine kinase substrate, activation of estrogen receptor alpha (ERalpha)/progesterone receptor (PR) transcription, and playing a role in breast cancer. WBP2 contains an N-terminal PH-GRAM domain and a C-terminal WWbp domain. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. The WWbp domain is characterized by several short PY and PT-like motifs of the PPPPY form and binds to WW domains. WW domains contain two highly conserved tryptophans that are spaced 20-23 residues apart. They bind proline-rich peptide motifs [AP]-P-P-[AP]-Y, and/or phosphoserine- or phosphothreonine-containing motifs.	103
275402	cd13215	PH-GRAM1_AGT26	Autophagy-related protein 26/Sterol 3-beta-glucosyltransferase Pleckstrin homology (PH) domain, repeat 1. ATG26 (also called UGT51/UDP-glycosyltransferase 51), a member of the glycosyltransferase 28 family, resulting in the biosynthesis of sterol glucoside. ATG26 in decane metabolism and autophagy. There are 32 known autophagy-related (ATG) proteins, 17 are components of the core autophagic machinery essential for all autophagy-related pathways and 15 are the additional components required only for certain pathways or species. The core autophagic machinery includes 1) the ATG9 cycling system (ATG1, ATG2, ATG9, ATG13, ATG18, and ATG27), 2) the phosphatidylinositol 3-kinase complex (ATG6/VPS30, ATG14, VPS15, and ATG34), and 3) the ubiquitin-like protein system (ATG3, ATG4, ATG5, ATG7, ATG8, ATG10, ATG12, and ATG16). Less is known about how the core machinery is adapted or modulated with additional components to accommodate the nonselective sequestration of bulk cytosol (autophagosome formation) or selective sequestration of specific cargos (Cvt vesicle, pexophagosome, or bacteria-containing autophagosome formation). The pexophagosome-specific additions include the ATG30-ATG11-ATG17 receptor-adaptors complex, the coiled-coil protein ATG25, and the sterol glucosyltransferase ATG26. ATG26 is necessary for the degradation of medium peroxisomes. It contains 2 GRAM domains and a single PH domain. PH domains are only found in eukaryotes. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	116
275403	cd13216	PH-GRAM2_AGT26	Autophagy-related protein 26/Sterol 3-beta-glucosyltransferase Pleckstrin homology (PH) domain, repeat 2. ATG26 (also called UGT51/UDP-glycosyltransferase 51), a member of the glycosyltransferase 28 family, resulting in the biosynthesis of sterol glucoside. ATG26 in decane metabolism and autophagy. There are 32 known autophagy-related (ATG) proteins, 17 are components of the core autophagic machinery essential for all autophagy-related pathways and 15 are the additional components required only for certain pathways or species. The core autophagic machinery includes 1) the ATG9 cycling system (ATG1, ATG2, ATG9, ATG13, ATG18, and ATG27), 2) the phosphatidylinositol 3-kinase complex (ATG6/VPS30, ATG14, VPS15, and ATG34), and 3) the ubiquitin-like protein system (ATG3, ATG4, ATG5, ATG7, ATG8, ATG10, ATG12, and ATG16). Less is known about how the core machinery is adapted or modulated with additional components to accommodate the nonselective sequestration of bulk cytosol (autophagosome formation) or selective sequestration of specific cargos (Cvt vesicle, pexophagosome, or bacteria-containing autophagosome formation). The pexophagosome-specific additions include the ATG30-ATG11-ATG17 receptor-adaptors complex, the coiled-coil protein ATG25, and the sterol glucosyltransferase ATG26. ATG26 is necessary for the degradation of medium peroxisomes. It contains 2 GRAM domains and a single PH domain. PH domains are only found in eukaryotes. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains also have diverse functions. They are often involved in targeting proteins to the plasma membrane, but few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	93
275404	cd13217	PH-GRAM1_TCB1D8_TCB1D9_family	TBC1D8 and TBC1D9 family Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain, repeat 1. TBC1D8, TBC1D8B, TBC1D9 and TBC1D9B may act as GTPase-activating proteins for Rab family protein(s). They all contain an N-terminal PH-GRAM domain and a C-terminal Rab-GTPase-TBC (Tre-2, BUB2p, and Cdc16p) domain. This cd contains the first repeat of the PH-GRAM domain. The GRAM domain is found in glucosyltransferases, myotubularins and other putative membrane-associated proteins. The GRAM domain is part of a larger motif with a pleckstrin homology (PH) domain fold.	99
275405	cd13218	PH-GRAM2_TCB1D8_TCB1D9_family	TBC1D8 and TBC1D9 family Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain, repeat 2. TBC1D8, TBC1D8B, TBC1D9 and TBC1D9B may act as GTPase-activating proteins for Rab family protein(s). They all contain an N-terminal PH-GRAM domain and a C-terminal Rab-GTPase-TBC (Tre-2, BUB2p, and Cdc16p) domain. This cd contains the second repeat of the PH-GRAM domain. The GRAM domain is found in glucosyltransferases, myotubularins and other putative membrane-associated proteins. The GRAM domain is part of a larger motif with a pleckstrin homology (PH) domain fold.	96
270039	cd13219	PH-GRAM_C2-GRAM	C2 and GRAM domain-containing protein Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. C2GRAM contains two N-terminal C2 domains followed by a single PH-GRAM domain. Since it contains both of these domains it is assumed that this gene cross-links both calcium and phosphoinositide signaling pathways. In general the C2 domain is involved in binding phospholipids in a calcium-dependent or calcium-independent manner. The GRAM domain is found in glucosyltransferases, myotubularins and other putative membrane-associated proteins. The GRAM domain is part of a larger motif with a pleckstrin homology (PH) domain fold.	111
275406	cd13220	PH-GRAM_GRAMDC	GRAM domain-containing protein (GRAMDC) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. The GRAMDC proteins are membrane proteins. Nothing is known about their function. Members include: GRAMDC1A, GRAMDC1B, GRAMDC1C, GRAMDC2, GRAMDC3, GRAMDC4, and GRAMDC-like proteins. All of the members, except for GRAMDC4, are included in this hierarchy. Each contains a single PH-GRAM domain at its N-terminus. The GRAM domain is found in glucosyltransferases, myotubularins and other putative membrane-associated proteins. The GRAM domain is part of a larger motif with a pleckstrin homology (PH) domain fold.	94
270041	cd13221	PH-GRAM_GRAMDC4	GRAM domain-containing protein 4 (GRAMDC4) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. GRAMDC4 is a membrane protein. Nothing is known about its function. Paralogs include: GRAMDC1A, GRAMDC1B, GRAMDC1C, GRAMDC2, GRAMDC3, and GRAMDC-like proteins. It contains a single PH-GRAM domain at its N-terminus. The GRAM domain is found in glucosyltransferases, myotubularins and other putative membrane-associated proteins. The GRAM domain is part of a larger motif with a pleckstrin homology (PH) domain fold.	104
270042	cd13222	PH-GRAM_GEM	GLABRA 2 expression modulator (GEM) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. GEM interacts with CDT1, a pre-replication complex component that is involved in DNA replication, and with TTG1 (Transparent Testa GLABRA 1), a transcriptional regulator of epidermal cell fate. GEM controls the level of histone H3K9 methylation at the promoters of the GLABRA 2 and CAPRICE (CPC) genes, which are essential for epidermis patterning. GEM also regulates cell division in different root cell types. GEM regulates proliferation-differentiation decisions by integrating DNA replication, cell division and transcriptional controls. The GRAM domain is found in glucosyltransferases, myotubularins and other putative membrane-associated proteins. The GRAM domain is part of a larger motif with a pleckstrin homology (PH) domain fold.	109
275407	cd13223	PH-GRAM_MTM-like	Myotubularin 1 and related proteins Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase. MTM1, MTMR1, and MTMR2 are members of the myotubularin protein phosphatase gene family. They contain an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, an active PTP domain, a SET-interaction domain, and a C-terminal coiled-coil region. In addition MTMR1 (Myotubularin related 1 protein) and MTMR2 (Myotubularin related 2 protein) contain a C-terminal PDZ domain. Mutations in MTMR2 are a cause of Charcot-Marie-Tooth disease type 4B, an autosomal recessive demyelinating neuropathy. The protein can self-associate and form heteromers with MTMR5 and MTMR12. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. The PH domain family possesses multiple functions including the ability to bind phosphoinositides via its beta1/beta2, beta3/beta4, and beta6/beta7 connecting loops and to other proteins. However, no phosphoinositide binding sites have been found for the MTMRs to date.	100
270044	cd13224	PH_Net1	Neuroepithelial cell transforming 1 Pleckstrin homology (PH) domain. Net1 (also called ArhGEF8) is part of the family of Rho guanine nucleotide exchange factors. Members of this family activate Rho proteins by catalyzing the exchange of GDP for GTP. The protein encoded by this gene interacts with RhoA within the cell nucleus and may play a role in repairing DNA damage after ionizing radiation. Net1 binds to caspase activation and recruitment domain (CARD)- and membrane-associated guanylate kinase-like domain-containing (CARMA) proteins and regulates nuclear factor kB activation. Net1 contains a RhoGEF domain N-terminal to a single PH domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	135
270045	cd13225	PH-like_bacteria	Pleckstrin homology (PH)-like domains in bacteria (PHb). Pleckstrin homology (PH) domains were first identified in eukaryotic proteins. Recently PH-like domains have been identified in bacteria as well. These PHb form dome-shaped oligomeric rings with a conserved hydrophilic surface at the intersection of the beta-strands of adjacent protomers that likely mediates protein-protein interactions. It is now thought that the PH domain superfamily is more widespread than previously thought and appears to have existed before prokaryotes and eukaryotes diverged. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	95
275408	cd13226	PH-GRAM-like_Eap45	Pleckstrin homology-like domain or GLUE (GRAM-like ubiquitin-binding in Eap45) domain of Eap45. ESCRT complexes form the main machinery driving protein sorting from endosomes to lysosomes. Human/yeast ESCRT-I consists of Tsg101/Vps23, Vps28/Vps28, and a Vps37 homolog/Vps37. Human/yeast ESCRT-II is composed of EAP20/Vps25, EAP30/Vps22, and EAP45/Vps36. Yeast ESCRT-III consists of the Vps2, Vps20, Vps24, and Snf7 subunits. In contrast, there are three human paralogs of Snf7 (hSnf7-1/CHMP4A, hSnf7-2/CHMP4B, and hSnf7-3/CHMP4C) and two paralogs of Vps2 (CHMP2A and CHMP2B). Yeast ESCRT-I links directly to ESCRT-II, through a tight interaction of Vps28 (ESCRT-I) with the yeast-specific zinc-finger insertion within the GLUE domain of Vps36. The Vps36 subunit (ESCRT-II) binds ubiquitin using one of its two NZF zinc fingers in its N-terminal region. Human Vps36, EAP45, also binds ubiquitin despite having no NZF domain. Instead, mammalian ESCRT-II interacts with Ub through the Eap45 GLUE domain directly. Yeast Vps36 GLUE shows a preference for the singly phosphorylated PI(3)P, while Eap45 GLUE preferentially binds the triply phosphorylated phosphatidylinositol PI(3,4,5)P3. Structurally, Eap45 GLUE only has a PH-like fold since it lacks the secondary structure element corresponding to the 4 strand, unlike that of yeast Vps36 GLUE. ESCRT-II also interacts with ESCRT-III via an EAP20(Vps25)/CHMP6(Vps20) interaction. The interactions of ESCRT-II GLUE domain with membranes, ESCRT-I, and ubiquitin are critical for ubiquitinated cargo progression from early to late endosomes. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. 
PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	129
275409	cd13227	PH-GRAM-like_Vps36	Pleckstrin homology-like domain or GLUE (GRAM-like ubiquitin-binding in Eap45) domain of Vps36. ESCRT complexes form the main machinery driving protein sorting from endosomes to lysosomes. Yeast/human ESCRT-I consists of Vps23/Tsg101, Vps28/Vps28, and Vps37/Vps37 homolog. Yeast/human ESCRT-II is composed of Vps25/EAP20, Vps22/EAP30, and Vps36/EAP45. Yeast ESCRT-III consists of the Vps2, Vps20, Vps24, and Snf7 subunits. In contrast, there are three human paralogs of Snf7 (hSnf7-1/CHMP4A, hSnf7-2/CHMP4B, and hSnf7-3/CHMP4C) and two paralogs of Vps2 (CHMP2A and CHMP2B). Yeast ESCRT-I links directly to ESCRT-II, through a tight interaction of Vps28 (ESCRT-I) with the yeast-specific zinc-finger insertion within the GLUE domain of Vps36. The Vps36 subunit (ESCRT-II) binds ubiquitin using one of its two NZF zinc fingers in its N-terminal region. Human Vps36, EAP45, also binds ubiquitin despite having no NZF domain. Instead, mammalian ESCRT-II interacts with Ub through the Eap45 GLUE domain itself. The yeast Vps36 GLUE has a complete PH domain, whereas Eap45 GLUE only has a PH-like fold since it lacks the secondary structure element corresponding to the 4 strand. ESCRT-II also interacts with ESCRT-III via a Vps25(EAP20)/Vps20(CHMP6) interaction. Structure 2CAY is missing this insertion that contains 2 NZF zinc fingers. It is a split PH domain, with a noncanonical lipid binding pocket that binds PI(3)P. The interactions of ESCRT-II GLUE domain with membranes, ESCRT-I, and ubiquitin are critical for ubiquitinated cargo progression from early to late endosomes. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. 
PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	119
270048	cd13228	PHear_NECAP	NECAP (adaptin-ear-binding coat-associated protein) Plextrin Homology (PH) fold with ear-like function (PHear) domain. NECAPs are alpha-ear-binding proteins that enrich on clathrin-coated vesicles (CCVs). NECAP 1 is expressed in brain and non-neuronal tissues and cells while NECAP 2 is ubiquitously expressed. The PH-like domain of NECAPs is a protein-binding interface that mimics the FxDxF motif binding properties of the alpha-ear and is called PHear (PH fold with ear-like function) domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	120
270049	cd13229	PH_TFIIH	Transcription Factor II H (TFIIH) Pleckstrin homology (PH) domain. The transcription factor II H (TFIIH) is one of the general transcription factors (GTFs) known to be a target of the transactivation domain (TAD) of p53. Human TFIIH and its homologous yeast counterpart (factor b) are composed of ten subunits that can be divided into two groups, the core TFIIH (XPB/Ssl2, p62/Tfb1, p52/Tfb2, p44/Ssl1, p34/Tfb4, and TTDA/Tfb5 in human/yeast) and the CAK complex (cdk7/Kin28, cyclin H/Ccl1, and MAT1/Tfb3). These two complexes are linked by the XPD/Rad3 subunit. The helicase activities of XPB and XPD are essential to the formation of the open complex during transcription initiation and the kinase activity of cdk7 phosphorylates the C-terminal domain (CTD) of the RNA Pol II largest subunit, enabling RNA Pol II to progress from the initiation phase to the elongation phase of transcription. The PH domain of p62/Tfb1 has been shown to interact with herpes simplex virus protein 16 (VP16) TAD and the binding of p53 TAD is mediated by the TAD2 subdomain. TFIIE recruits TFIIH to complete the preinitiation complex (PIC) formation and regulates enzymatic activities of TFIIH. The PH domain of the human TFIIH p62 subunit binds to the C-terminal acidic (AC) domain of the human TFIIEalpha subunit. This interaction could be a switch to replace p53 with TFIIE on TFIIH in transcription. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. 
PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	93
270050	cd13230	PH1_SSRP1-like	Structure Specific Recognition protein 1 (SSRP1) Pleckstrin homology (PH) domain, repeat 1. SSRP1 is a component of FACT (facilitator of chromatin transcription), an essential chromatin reorganizing factor. In yeast FACT (yFACT) is composed of three proteins: Spt16/Cdc68, Pob3, and Nhp6. In metazoans the Pob3 and Nhp6 orthologs are fused to form SSRP1/T160 in human and mouse, respectively. The middle domain of the Pob3 subunit (Pob3-M) has an unusual double pleckstrin homology (PH) architecture. yFACT interacts in a physiologically important way with the central single-strand DNA binding factor RPA to promote a step in DNA replication. Coordinated function by yFACT and RPA is important during nucleosome deposition. These results support the model that the FACT family has an essential role in constructing nucleosomes during DNA replication, and suggest that RPA contributes to this process. Members of this cd are composed of the first PH-like repeat. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	137
270051	cd13231	PH2_SSRP1-like	Structure Specific Recognition protein 1 (SSRP1) Pleckstrin homology (PH) domain, repeat 2. SSRP1 is a component of FACT (facilitator of chromatin transcription), an essential chromatin reorganizing factor. In yeast FACT (yFACT) is composed of three proteins: Spt16/Cdc68, Pob3, and Nhp6. In metazoans the Pob3 and Nhp6 orthologs are fused to form SSRP1/T160 in human and mouse, respectively. The middle domain of the Pob3 subunit (Pob3-M) has an unusual double pleckstrin homology (PH) architecture. yFACT interacts in a physiologically important way with the central single-strand DNA binding factor RPA to promote a step in DNA replication. Coordinated function by yFACT and RPA is important during nucleosome deposition. These results support the model that the FACT family has an essential role in constructing nucleosomes during DNA replication, and suggest that RPA contributes to this process. Members of this cd are composed of the second PH-like repeat. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	100
270052	cd13232	Ig-PH_SCAB1	Stomatal Closure Related Actin-Binding Protein 1 Pleckstrin homology-like domain. SCAB1 is an actin-binding protein that interacts with actin filaments and regulates stomatal movement. SCAB1 is composed of an actin-binding domain, two coiled-coil (CC) domains, and a fused immunoglobulin (Ig) and PH (Ig-PH) domain. SCAB1 homologs are widely present, often in multiple copies (three in Arabidopsis), in plants including eudicots, monocots, ferns and mosses, but are not found in algae and non-plant species. The C-terminal PH domain binds weakly with inositol phosphates via an atypical basic surface patch. SCAB1 forms a dimeric structure via its coiled-coil domains. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	119
270053	cd13233	PH_ARHGAP9-like	Beta-spectrin pleckstrin homology (PH) domain. ARHGAP family genes encode Rho/Rac/Cdc42-like GTPase activating proteins with RhoGAP domain. The ARHGAP members here all have a PH domain upstream of their C-terminal RhoGAP domain. Some have additional N-terminal SH3 and WW domains. The members here include: ARHGAP9, ARHGAP12, ARHGAP15, and ARHGAP27. ARHGAP27 and ARHGAP12 shared the common-domain structure, consisting of SH3, WW, PH, and RhoGAP domains. The PH domain of ArhGAP9 employs a non-canonical phosphoinositide binding mechanism, a variation of the spectrin- Ins(4,5)P2-binding mode, that gives rise to a unique PI binding profile, namely a preference for both PI(4,5)P2 and the PI 3-kinase products PI(3,4,5)P3 and PI(3,4)P2. This lipid binding mechanism is also employed by the PH domain of Tiam1 and Slm1. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	110
270054	cd13234	PHsplit_PLC_gamma	Phospholipase C-gamma Split pleckstrin homology (PH) domain. PLC-gamma (PLCgamma) is activated by receptor and non-receptor tyrosine kinases due to the presence of its SH2 and SH3 domains. There are two main isoforms of PLC-gamma expressed in human specimens, PLC-gamma1 and PLC-gamma2. PLC-gamma consists of an N-terminal PH domain, a EF hand domain, a catalytic domain split into X and Y halves internal to which is a PH domain split by two SH2 domains and a single SH3 domain, and a C-terminal C2 domain. The split PH domain is present in this hierarchy. PLCs (EC 3.1.4.3) play a role in the initiation of cellular activation, proliferation, differentiation and apoptosis. They are central to inositol lipid signalling pathways, facilitating intracellular Ca2+ release and protein kinase C (PKC) activation. Specificaly, PLCs catalyze the cleavage of phosphatidylinositol-4,5-bisphosphate (PIP2) and result in the release of 1,2-diacylglycerol (DAG) and inositol 1,4,5-triphosphate (IP3). These products trigger the activation of protein kinase C (PKC) and the release of Ca2+ from intracellular stores. There are fourteen kinds of mammalian phospholipase C proteins which are are classified into six isotypes (beta, gamma, delta, epsilon, zeta, eta). PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. 
Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	105
270055	cd13235	PH2_FARP1-like	FERM, RhoGEF and pleckstrin domain-containing protein 1 and related proteins Pleckstrin Homology (PH) domain, repeat 2. Members here include FARP1 (also called Chondrocyte-derived ezrin-like protein; PH domain-containing family C member 2), FARP2 (also called FIR/FERM domain including RhoGEF; FGD1-related Cdc42-GEF/FRG), and FARP6 (also called Zinc finger FYVE domain-containing protein 24). They are members of the Dbl family guanine nucleotide exchange factors (GEFs) which are upstream positive regulators of Rho GTPases. Little is known about FARP1 and FARP6, though FARP1 has increased expression in differentiated chondrocytes. FARP2 is thought to regulate neurite remodeling by mediating the signaling pathways from membrane proteins to Rac. It is found in brain, lung, and testis, as well as embryonic hippocampal and cortical neurons. FARP1 and FARP2 are composed of an N-terminal FERM domain, a proline-rich (PR) domain, Dbl-homology (DH), and two C-terminal PH domains. FARP6 is composed of Dbl-homology (DH), and two C-terminal PH domains separated by a FYVE domain. This hierarchy contains the second PH repeat. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	98
270056	cd13236	PH2_FGD1-4	FYVE, RhoGEF and PH domain containing/faciogenital dysplasia proteins pleckstrin homology (PH) domain, C-terminus. In general, FGDs have a RhoGEF (DH) domain, followed by an N-terminal PH domain, a FYVE domain and a C-terminal PH domain. All FGDs are guanine nucleotide exchange factors that activate the Rho GTPase Cdc42, an important regulator of membrane trafficking. The RhoGEF domain is responsible for GEF catalytic activity, while the N-terminal PH domain is involved in intracellular targeting of the DH domain. Not much is known about FGD2. FGD1 is the best characterized member of the group with mutations here leading to the X-linked disorder known as faciogenital dysplasia (FGDY). Both FGD1 and FGD3 are targeted by the ubiquitin ligase SCF(FWD1/beta-TrCP) upon phosphorylation of two serine residues in their DSGIDS motif and subsequently degraded by the proteasome. However, FGD1 and FGD3 induced significantly different morphological changes in HeLa Tet-Off cells: while FGD1 induced long finger-like protrusions, FGD3 induced broad sheet-like protrusions when the level of GTP-bound Cdc42 was significantly increased by the inducible expression of FGD3. They also reciprocally regulated cell motility when inducibly expressed in HeLa Tet-Off cells: FGD1 stimulated cell migration while FGD3 inhibited it. FGD1 and FGD3 therefore play different roles to regulate cellular functions, even though their intracellular levels are tightly controlled by the same destruction pathway through SCF(FWD1/beta-TrCP). FGD4 is one of the genes associated with Charcot-Marie-Tooth neuropathy type 4 (CMT4), a group of progressive motor and sensory axonal and demyelinating neuropathies that are distinguished from other forms of CMT by autosomal recessive inheritance. Those affected have distal muscle weakness and atrophy associated with sensory loss and, frequently, pes cavus foot deformity. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	105
270057	cd13237	PH2_FGD5_FGD6	FYVE, RhoGEF and PH domain containing/faciogenital dysplasia proteins 5 and 6 pleckstrin homology (PH) domain, C-terminus. FGD5 regulates the pro-angiogenic action of vascular endothelial growth factor (VEGF) in vascular endothelial cells, including network formation, permeability, directional movement, and proliferation. The specific function of FGD6 is unknown. In general, FGDs have a RhoGEF (DH) domain, followed by a PH domain, a FYVE domain and a C-terminal PH domain. All FGDs are guanine nucleotide exchange factors that activate the Rho GTPase Cdc42, an important regulator of membrane trafficking. The RhoGEF domain is responsible for GEF catalytic activity, while the PH domain is involved in intracellular targeting of the DH domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	91
270058	cd13238	PH2_FGD4_insect-like	FYVE, RhoGEF and PH domain containing/faciogenital dysplasia protein 4 pleckstrin homology (PH) domain, C-terminus, in insects and related arthropods. In general, FGDs have a RhoGEF (DH) domain, followed by an N-terminal PH domain, a FYVE domain and a C-terminal PH domain. All FGDs are guanine nucleotide exchange factors that activate the Rho GTPase Cdc42, an important regulator of membrane trafficking. The RhoGEF domain is responsible for GEF catalytic activity, while the N-terminal PH domain is involved in intracellular targeting of the DH domain. FGD4 is one of the genes associated with Charcot-Marie-Tooth neuropathy type 4 (CMT4), a group of progressive motor and sensory axonal and demyelinating neuropathies that are distinguished from other forms of CMT by autosomal recessive inheritance. Those affected have distal muscle weakness and atrophy associated with sensory loss and, frequently, pes cavus foot deformity. This cd contains insects, crustaceans, and chelicerates. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	97
270059	cd13239	PH_Obscurin	Obscurin pleckstrin homology (PH) domain. Obscurin (also called Obscurin-RhoGEF; Obscurin-myosin light chain kinase/Obscurin-MLCK) is a giant muscle protein that is concentrated at the peripheries of Z-disks and M-lines. It binds small ankyrin I, a component of the sarcoplasmic reticulum (SR) membrane. It is associated with the contractile apparatus through binding with titin and sarcomeric myosin. It plays important roles in the organization and assembly of the myofibril and the SR. Obscurin has been observed as alternatively-spliced isoforms. The major isoform in skeletal muscle, approximately 800 kDa in size, is composed of many adhesion modules and signaling domains. It harbors 49 Ig and 2 FNIII repeats at the N-terminus, a complex middle region with additional Ig domains, an IQ motif, and a conserved SH3 domain near RhoGEF and PH domains, and a non-modular C-terminus with phosphorylation motifs. The obscurin gene also encodes two kinase domains, which are not part of the 800 kDa form of the protein, but are part of smaller spliced products that are present in heart muscle. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	125
270060	cd13240	PH1_Kalirin_Trio_like	Triple functional domain pleckstrin homology (PH) domain, repeat 1. The RhoGEFs Kalirin and Trio, the mammalian homologs of Drosophila Trio and Caenorhabditis elegans UNC-73, regulate a novel step in secretory granule maturation. Their signaling modulates the extent to which regulated cargo enter and remain in the regulated secretory pathway. This allows for fine tuning of peptides released by a single secretory cell type with impaired signaling leading to pathological states. Trio plays an essential role in regulating the actin cytoskeleton during axonal guidance and branching. Kalirin and Trio are encoded by separate genes in mammals and by a single one in invertebrates. Kalirin and Trio share the same complex multidomain structure and display several splice variants. The longest Kalirin and Trio proteins have a Sec14 domain, a stretch of spectrin repeats, a RhoGEF(DH)/PH cassette (also called GEF1), an SH3 domain, a second RhoGEF(DH)/PH cassette (also called GEF2), a second SH3 domain, Ig/FNIII domains, and a kinase domain. The first RhoGEF(DH)/PH cassette catalyzes exchange on Rac1 and RhoG while the second RhoGEF(DH)/PH cassette is specific for RhoA. Kalirin and Trio are closely related to p63RhoGEF and have PH domains of similar function. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains.	123
270061	cd13241	PH2_Kalirin_Trio_p63RhoGEF	p63RhoGEF pleckstrin homology (PH) domain, repeat 2. The guanine nucleotide exchange factor p63RhoGEF is an effector of the heterotrimeric G protein Galphaq, linking Galphaq-coupled receptors (GPCRs) to the activation of RhoA. The Dbl(DH) and PH domains of p63RhoGEF interact with the effector-binding site and the C-terminal region of Galphaq and appear to relieve autoinhibition of the catalytic DH domain by the PH domain. Trio, Duet, and p63RhoGEF are shown to constitute a family of Galphaq effectors that appear to activate RhoA both in vitro and in intact cells. Dbs is a guanine nucleotide exchange factor (GEF), which contains spectrin repeats, a RhoGEF (DH) domain and a PH domain. The Dbs PH domain participates in binding to both the Cdc42 and RhoA GTPases. Trio plays an essential role in regulating the actin cytoskeleton during axonal guidance and branching. Trio is a multidomain signaling protein that contains two RhoGEF(DH)-PH domains in tandem. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	140
270062	cd13242	PH_puratrophin-1	Puratrophin-1 pleckstrin homology (PH) domain. Puratrophin-1 (also called Purkinje cell atrophy-associated protein 1 or PLEKHG4/Pleckstrin homology domain-containing family G member 4) contains a spectrin repeat, a RhoGEF (DH) domain, and a PH domain. It is thought to function in intracellular signaling and cytoskeleton dynamics at the Golgi. Puratrophin-1 is expressed in kidney, Leydig cells in the testis, epithelial cells in the prostate gland and Langerhans islet in the pancreas. A single nucleotide substitution in the puratrophin-1 gene was once thought to result in autosomal dominant cerebellar ataxia (ADCA), but now it has been demonstrated that this ataxia is a result of defects in the BEAN gene. Puratrophin contains a domain architecture similar to that of Dbl family members Dbs and Trio. Dbs is a guanine nucleotide exchange factor (GEF), which contains spectrin repeats, a RhoGEF (DH) domain and a PH domain. The Dbs PH domain participates in binding to both the Cdc42 and RhoA GTPases. Trio plays an essential role in regulating the actin cytoskeleton during axonal guidance and branching. Trio is a multidomain signaling protein that contains two RhoGEF(DH)-PH domains in tandem. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	136
270063	cd13243	PH_PLEKHG1_G2_G3	Pleckstrin homology domain-containing family G members 1, 2, and 3 pleckstrin homology (PH) domain. PLEKHG1 (also called ARHGEF41), PLEKHG2 (also called ARHGEF42 or CLG/common-site lymphoma/leukemia guanine nucleotide exchange factor2), and PLEKHG3 (also called ARHGEF43) have RhoGEF DH/double-homology domains in tandem with a PH domain which is involved in phospholipid binding. They function as guanine nucleotide exchange factors (GEFs) and are involved in the regulation of Rho protein signal transduction. Mutations in PLEKHG1 have been associated with panic disorder (PD), an anxiety disorder characterized by panic attacks and anticipatory anxiety. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	147
270064	cd13244	PH_PLEKHG5_G6	Pleckstrin homology domain-containing family G members 5 and 6 pleckstrin homology (PH) domain. PLEKHG5 has a RhoGEF DH/double-homology domain in tandem with a PH domain which is involved in phospholipid binding. PLEKHG5 activates the nuclear factor kappa B (NFKB1) signaling pathway. Mutations in PLEKHG5 are associated with autosomal recessive distal spinal muscular atrophy. PLEKHG6 (also called MyoGEF) has no known function to date. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	100
270065	cd13245	PH_PLEKHG7	Pleckstrin homology domain-containing family G member 7 pleckstrin homology (PH) domain. PLEKHG7 has a RhoGEF DH/double-homology domain in tandem with a PH domain which is involved in phospholipid binding. PLEKHG7 is proposed to function as a guanine nucleotide exchange factor (GEF) and is involved in the regulation of Rho protein signal transduction. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	128
270066	cd13246	PH_Scd1	Shape and Conjugation Deficiency 1 Pleckstrin homology (PH) domain. Fission yeast Scd1 is an exchange factor for Cdc42 and an effector of Ras1, the homolog of the human H-Ras. Scd2/Bem1 mediates Cdc42 activation by binding to Scd1/Cdc24 and to Cdc42. Ras1 regulates Scd1/Cdc24/Ral1, which is a putative guanine nucleotide exchange factor for Cdc42, a member of the Rho family of Ras-like proteins. Cdc42 then activates the Shk1/Orb2 protein kinase. Scd1 interacts with Klp5 and Klp6 kinesins to mediate cytokinesis. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	148
270067	cd13247	BAR-PH_APPL	Adaptor protein containing PH domain, PTB domain, and Leucine zipper motif Bin1/amphiphysin/Rvs167 (BAR)-Pleckstrin homology (PH) domain. APPL (also called DCC-interacting protein (DIP)-13alpha) interacts with oncoprotein serine/threonine kinase AKT2, tumor suppressor protein DCC (deleted in colorectal cancer), Rab5, GIPC (GAIP-interacting protein, C terminus), human follicle-stimulating hormone receptor (FSHR), and the adiponectin receptors AdipoR1 and AdipoR2. There are two isoforms of human APPL: APPL1 and APPL2, which share about 50% sequence identity. APPL has a BAR and a PH domain near its N terminus, and the two domains are thought to function as a unit (BAR-PH domain). C-terminal to this is a PTB domain. Lipid binding assays show that the BAR, PH, and PTB domains can bind phospholipids. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	125
270068	cd13248	PH_PEPP1_2_3	Phosphoinositol 3-phosphate binding proteins 1, 2, and 3 pleckstrin homology (PH) domain. PEPP1 (also called PLEKHA4/PH domain-containing family A member 4 and RHOXF1/Rhox homeobox family member 1), and related homologs PEPP2 (also called PLEKHA5/PH domain-containing family A member 5) and PEPP3 (also called PLEKHA6/PH domain-containing family A member 6), have PH domains that interact specifically with PtdIns(3,4)P3. Other proteins that bind PtdIns(3,4)P3 specifically are: TAPP1 (tandem PH-domain-containing protein-1) and TAPP2, PtdIns3P AtPH1, and PtdIns(3,5)P2 (centaurin-beta2). All of these proteins contain at least 5 of the 6 conserved amino acids that make up the putative phosphatidylinositol 3,4,5-trisphosphate-binding motif (PPBM) located at their N-terminus. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	104
270069	cd13249	PH_rhotekin2	Anillin Pleckstrin homology (PH) domain. Anillin (Rhotekin/RTKN; also called PLEKHK/Pleckstrin homology domain-containing family K) is an actin binding protein involved in cytokinesis. It interacts with GTP-bound Rho proteins and results in the inhibition of their GTPase activity. Dysregulation of the Rho signal transduction pathway has been implicated in many forms of cancer. Anillin proteins have an N-terminal HRI domain/ACC (anti-parallel coiled-coil) finger domain or Rho-binding domain that binds small GTPases from the Rho family. The C-terminal PH domain helps target anillin to ectopic septin containing foci. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	111
270070	cd13250	PH_ACAP	ArfGAP with coiled-coil, ankyrin repeat and PH domains Pleckstrin homology (PH) domain. ACAP (also called centaurin beta) functions both as a Rab35 effector and as an Arf6-GTPase-activating protein (GAP) by which it controls actin remodeling and membrane trafficking. ACAPs contain an NH2-terminal bin/amphiphysin/Rvs (BAR) domain, a phospholipid-binding domain, a PH domain, a GAP domain, and four ankyrin repeats. The AZAPs constitute a family of Arf GAPs that are characterized by an NH2-terminal pleckstrin homology (PH) domain and a central Arf GAP domain followed by two or more ankyrin repeats. On the basis of sequence and domain organization, the AZAP family is further subdivided into four subfamilies: 1) the ACAPs contain an NH2-terminal bin/amphiphysin/Rvs (BAR) domain (a phospholipid-binding domain that is thought to sense membrane curvature), a single PH domain followed by the GAP domain, and four ankyrin repeats; 2) the ASAPs also contain an NH2-terminal BAR domain, the tandem PH domain/GAP domain, three ankyrin repeats, two proline-rich regions, and a COOH-terminal Src homology 3 domain; 3) the AGAPs contain an NH2-terminal GTPase-like domain (GLD), a split PH domain, and the GAP domain followed by four ankyrin repeats; and 4) the ARAPs contain both an Arf GAP domain and a Rho GAP domain, as well as an NH2-terminal sterile-alpha motif (SAM), a proline-rich region, a GTPase-binding domain, and five PH domains (PMID 18003747 and 19055940). Centaurin can bind to phosphatidylinositol (3,4,5)P3. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	98
270071	cd13251	PH_ASAP	ArfGAP with SH3 domain, ankyrin repeat and PH domain Pleckstrin homology (PH) domain. ASAPs (ASAP1, ASAP2, and ASAP3) function as Arf-specific GAPs, participate in rhodopsin trafficking, are associated with tumor cell metastasis, modulate phagocytosis, promote cell proliferation, facilitate vesicle budding and Golgi exocytosis, and regulate vesicle coat assembly via a Bin/Amphiphysin/Rvs domain. ASAPs contain an NH2-terminal BAR domain, a tandem PH domain/GAP domain, three ankyrin repeats, two proline-rich regions, and a COOH-terminal Src homology 3 (SH3) domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	108
270072	cd13252	PH1_ADAP	ArfGAP with dual PH domains Pleckstrin homology (PH) domain, repeat 1. ADAP (also called centaurin alpha) is a phosphatidylinositide-binding protein consisting of an N-terminal ArfGAP domain and two PH domains. In response to growth factor activation, PI3K phosphorylates phosphatidylinositol 4,5-bisphosphate to phosphatidylinositol 3,4,5-trisphosphate. Centaurin alpha 1 is recruited to the plasma membrane following growth factor stimulation by specific binding of its PH domain to phosphatidylinositol 3,4,5-trisphosphate. Centaurin alpha 2 is constitutively bound to the plasma membrane since it binds phosphatidylinositol 4,5-bisphosphate and phosphatidylinositol 3,4,5-trisphosphate with equal affinity. This cd contains the first PH domain repeat. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	109
270073	cd13253	PH1_ARAP	ArfGAP with RhoGAP domain, ankyrin repeat and PH domain Pleckstrin homology (PH) domain, repeat 1. ARAP proteins (also called centaurin delta) are phosphatidylinositol 3,4,5-trisphosphate-dependent GTPase-activating proteins that modulate actin cytoskeleton remodeling by regulating ARF and RHO family members. They bind phosphatidylinositol 3,4,5-trisphosphate (PtdIns(3,4,5)P3) and phosphatidylinositol 3,4-bisphosphate (PtdIns(3,4)P2). There are 3 mammalian ARAP proteins: ARAP1, ARAP2, and ARAP3. All ARAP proteins contain an N-terminal SAM (sterile alpha motif) domain, 5 PH domains, an ArfGAP domain, 2 ankyrin domains, a RhoGAP domain, and a Ras-associating domain. This hierarchy contains the first PH domain in ARAP. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	94
270074	cd13254	PH2_ARAP	ArfGAP with RhoGAP domain, ankyrin repeat and PH domain Pleckstrin homology (PH) domain, repeat 2. ARAP proteins (also called centaurin delta) are phosphatidylinositol 3,4,5-trisphosphate-dependent GTPase-activating proteins that modulate actin cytoskeleton remodeling by regulating ARF and RHO family members. They bind phosphatidylinositol 3,4,5-trisphosphate (PtdIns(3,4,5)P3) and phosphatidylinositol 3,4-bisphosphate (PtdIns(3,4)P2). There are 3 mammalian ARAP proteins: ARAP1, ARAP2, and ARAP3. All ARAP proteins contain an N-terminal SAM (sterile alpha motif) domain, 5 PH domains, an ArfGAP domain, 2 ankyrin domains, a RhoGAP domain, and a Ras-associating domain. This hierarchy contains the second PH domain in ARAP. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	90
270075	cd13255	PH_TAAP2-like	Tandem PH-domain-containing protein 2 Pleckstrin homology (PH) domain. TAPP2 (also called PLEKHA2) adaptors bind PtdIns(3,4)P2, but not PI(3,4,5)P3, and function as negative regulators of insulin and PI3K signalling pathways (i.e. TAPP/utrophin/syntrophin complex). TAPP2 contains two sequential PH domains in which the C-terminal PH domain specifically binds PtdIns(3,4)P2 with high affinity. The N-terminal PH domain does not interact with any phosphoinositide tested. They also contain a C-terminal PDZ-binding motif that interacts with several PDZ-binding proteins, including PTPN13 (known previously as PTPL1 or FAP-1) as well as the scaffolding proteins MUPP1 (multiple PDZ-domain-containing protein 1), syntrophin and utrophin. The members here are most sequence similar to TAPP2 proteins, but may not be actual TAPP2 proteins. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	110
270076	cd13256	PH3_ARAP	ArfGAP with RhoGAP domain, ankyrin repeat and PH domain Pleckstrin homology (PH) domain, repeat 3. ARAP proteins (also called centaurin delta) are phosphatidylinositol 3,4,5-trisphosphate-dependent GTPase-activating proteins that modulate actin cytoskeleton remodeling by regulating ARF and RHO family members. They bind phosphatidylinositol 3,4,5-trisphosphate (PtdIns(3,4,5)P3) and phosphatidylinositol 3,4-bisphosphate (PtdIns(3,4)P2). There are 3 mammalian ARAP proteins: ARAP1, ARAP2, and ARAP3. All ARAP proteins contain an N-terminal SAM (sterile alpha motif) domain, 5 PH domains, an ArfGAP domain, 2 ankyrin domains, a RhoGAP domain, and a Ras-associating domain. This hierarchy contains the third PH domain in ARAP. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	110
270077	cd13257	PH4_ARAP	ArfGAP with RhoGAP domain, ankyrin repeat and PH domain Pleckstrin homology (PH) domain, repeat 4. ARAP proteins (also called centaurin delta) are phosphatidylinositol 3,4,5-trisphosphate-dependent GTPase-activating proteins that modulate actin cytoskeleton remodeling by regulating ARF and RHO family members. They bind phosphatidylinositol 3,4,5-trisphosphate (PtdIns(3,4,5)P3) and phosphatidylinositol 3,4-bisphosphate (PtdIns(3,4)P2). There are 3 mammalian ARAP proteins: ARAP1, ARAP2, and ARAP3. All ARAP proteins contain an N-terminal SAM (sterile alpha motif) domain, 5 PH domains, an ArfGAP domain, 2 ankyrin domains, a RhoGAP domain, and a Ras-associating domain. This hierarchy contains the fourth PH domain in ARAP. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	91
270078	cd13258	PH_PLEKHJ1	Pleckstrin homology domain containing, family J member 1 Pleckstrin homology (PH) domain. PLEKHJ1 (also called GNRPX2/Guanine nucleotide-releasing protein x) contains a single PH domain. Very little is known about PLEKHJ1. PLEKHJ1 has been shown to interact with IKBKG (inhibitor of kappa light polypeptide gene enhancer in B-cells, kinase gamma) and KRT33B (keratin 33B). PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	123
270079	cd13259	PH5_ARAP	ArfGAP with RhoGAP domain, ankyrin repeat and PH domain Pleckstrin homology (PH) domain, repeat 5. ARAP proteins (also called centaurin delta) are phosphatidylinositol 3,4,5-trisphosphate-dependent GTPase-activating proteins that modulate actin cytoskeleton remodeling by regulating ARF and RHO family members. They bind phosphatidylinositol 3,4,5-trisphosphate (PtdIns(3,4,5)P3) and phosphatidylinositol 3,4-bisphosphate (PtdIns(3,4)P2). There are 3 mammalian ARAP proteins: ARAP1, ARAP2, and ARAP3. All ARAP proteins contain an N-terminal SAM (sterile alpha motif) domain, 5 PH domains, an ArfGAP domain, 2 ankyrin domains, a RhoGAP domain, and a Ras-associating domain. This hierarchy contains the fifth PH domain in ARAP. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	121
270080	cd13260	PH_RASA1	RAS p21 protein activator (GTPase activating protein) 1 Pleckstrin homology (PH) domain. RASA1 (also called RasGap1 or p120) is a member of the RasGAP family of GTPase-activating proteins. RASA1 contains N-terminal SH2-SH3-SH2 domains, followed by two C2 domains, a PH domain, a RasGAP domain, and a BTK domain. Splice variants lack the N-terminal domains. It is a cytosolic vertebrate protein that acts as a suppressor of RAS via its C-terminal GAP domain function, enhancing the weak intrinsic GTPase activity of RAS proteins resulting in the inactive GDP-bound form of RAS, allowing control of cellular proliferation and differentiation. Additionally, it is involved in mitogenic signal transmission towards downstream interacting partners through its N-terminal SH2-SH3-SH2 domains. RASA1 interacts with a number of proteins including: G3BP1, SOCS3, ANXA6, Huntingtin, KHDRBS1, Src, EPHB3, EPH receptor B2, Insulin-like growth factor 1 receptor, PTK2B, DOK1, PDGFRB, HCK, Caveolin 2, DNAJA3, HRAS, GNB2L1 and NCK1. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	103
270081	cd13261	PH_RasGRF1_2	Ras-specific guanine nucleotide-releasing factors 1 and 2 Pleckstrin homology (PH) domain. RasGRF1 (also called GRF1; CDC25Mm/Ras-specific nucleotide exchange factor CDC25; GNRP/Guanine nucleotide-releasing protein) and RasGRF2 (also called GRF2; Ras guanine nucleotide exchange factor 2) are a family of guanine nucleotide exchange factors (GEFs). They both promote the exchange of Ras-bound GDP by GTP, thereby regulating the RAS signaling pathway. RasGRF1 and RasGRF2 form homooligomers and heterooligomers. GRF1 has 3 isoforms and GRF2 has 2 isoforms. The longest isoforms of RasGRF1 and RasGRF2 contain the following domains: a Rho-GEF domain sandwiched between 2 PH domains, IQ domains, a REM (Ras exchanger motif) domain, and a Ras-GEF domain, which gives them the capacity to activate both Ras and Rac GTPases in response to signals from a variety of neurotransmitter receptors. Their IQ domains allow them to act as calcium sensors to mediate the actions of NMDA-type and calcium-permeable AMPA-type glutamate receptors. GRF1 also mediates the action of dopamine receptors that signal through cAMP. GRF1 and GRF2 play strikingly different roles in regulating MAP kinase family members, neuronal synaptic plasticity, specific forms of learning and memory, and behavioral responses to psychoactive drugs. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	136
270082	cd13262	PH_RasSynGAP-like	Synaptic Ras-GTPase activating protein family Pleckstrin homology (PH) domain. The RasSynGAP family is composed of members: DAB2IP, nGAP, and SynGAP. Neuronal growth-associated proteins (nGAPs) are growth cone markers found in multiple types of neurons. There are many nGAPs including Cap1 (Adenylate cyclase-associated protein 1), Capzb (Capping protein (actin filament) muscle Z-line, beta), Clptm1 (Cleft lip and palate associated transmembrane protein 1), Cotl1 (Coactosin-like 1), Crmp1 (Collapsin response mediator protein 1), Cyfip1 (Cytoplasmic FMR1 interacting protein 1), Fabp7 (Fatty acid binding protein 7, brain), Farp2 (FERM, RhoGEF and pleckstrin domain protein 2), Gap43 (Growth associated protein 43), Gnao1 (Guanine nucleotide binding protein (G protein), alpha activating activity polypeptide O), Gnai2 (Guanine nucleotide binding protein (G protein), alpha inhibiting 2), Pacs1 (Phosphofurin acidic cluster sorting protein 1), Rtn1 (Reticulon 1), Sept2 (Septin 2), Snap25 (Synaptosomal-associated protein 25), Strap (Serine/threonine kinase receptor associated protein), Stx7 (Syntaxin 7), and Tmod2 (Tropomodulin 2). SynGAP, a neuronal Ras-GAP, has been shown to display both Ras-GAP activity and Ras-related protein (Rap)-GAP activity. Saccharomyces cerevisiae Bud2 and GAP1 members CAPRI (Ca2+-promoted Ras inactivator) and RASAL (Ras-GTPase-activating-like protein) also possess this dual activity. Human DOC-2/DAB2-interacting protein (DAB2IP) is encoded by a tumor suppressor gene and is a newly recognized member of the Ras-GTPase-activating family. DAB2IP is a critical component of many signal transduction pathways mediated by Ras and tumor necrosis factors including apoptosis pathways, and it is involved in the formation of many types of tumors. DAB2IP participates in regulation of gene expression and pluripotency of cells. It has been reported that DAB2IP is expressed in different tumor tissues. Little information is available concerning the expression levels of DAB2IP in normal tissues and cells, however, and no studies of its expression patterns during the development of human embryos have been reported. DAB2IP was expressed primarily in cell cytoplasm throughout fetal development. The expression levels varied among tissues and different gestational ages. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	125
270083	cd13263	PH_RhoGap25-like	Rho GTPase activating protein 25 and related proteins Pleckstrin homology (PH) domain. RhoGAP25 (also called ArhGap25), like other RhoGAPs, is involved in cell polarity, cell morphology and cytoskeletal organization. They act as GTPase activators for the Rac-type GTPases by converting them to an inactive GDP-bound state and control actin remodeling by inactivating Rac downstream of Rho, which suppresses leading edge protrusion and promotes cell retraction to achieve cellular polarity; they are also able to suppress RAC1 and CDC42 activity in vitro. Overexpression of these proteins induces cell rounding with partial or complete disruption of actin stress fibers and formation of membrane ruffles, lamellipodia, and filopodia. This hierarchy contains RhoGAP22, RhoGAP24, and RhoGAP25. Members here contain an N-terminal PH domain followed by a RhoGAP domain and either a BAR or TATA Binding Protein (TBP) Associated Factor 4 (TAF4) domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	114
270084	cd13264	PH_ITSN	Intersectin Pleckstrin homology (PH) domain. ITSNs, an adaptor protein family, play a role in endo- and exocytosis, actin cytoskeleton rearrangement and signal transduction. There are two human ITSN genes: ITSN1 and ITSN2. They share significant sequence identity and a similar domain structure having both short and long isoforms produced by alternative splicing. The short isoform (ITSN-S) consists of two Eps15 homology domains (EH1 and EH2), a coiled-coil region (CCR) and five Src homology 3 domains (SH3A-E). The EH domains bind to Asn-Pro-Phe motifs and are implicated in endocytosis and vesicle transport. The SH3 domains bind to proline-rich sequences and are commonly found in proteins implicated in cell signalling pathways, cytoskeletal organization and membrane traffic. The long isoform (ITSN-L) contains three additional C-terminal domains, a Dbl homology domain (DH), a Pleckstrin homology domain (PH) and a C2 domain. The tandem DH-PH domains are present in all Dbl family of GEFs. ITSN acts specifically on Cdc42 through its DH domain with no portion of the PH domain making contact with Cdc42. This is in contrast to Dbs which requires the PH domain for full catalytic activity. The ITSN PH domain binds phosphoinositides. C2 domains are usually involved in Ca2+-dependent and Ca2+-independent phospholipid binding. There are more than 30 proteins that interact with ITSNs. ITSN-S is present in mammals, frogs, flies and nematodes, while ITSN-L is present only in vertebrates. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	132
270085	cd13265	PH_evt	Evectin Pleckstrin homology (PH) domain. There are 2 members of the evectin family (also called pleckstrin homology domain containing, family B): evt-1 (also called PLEKHB1) and evt-2 (also called PLEKHB2). evt-1 is specific to the nervous system, where it is expressed in photoreceptors and myelinating glia. evt-2 is widely expressed in both neural and nonneural tissues. Evectins possess a single N-terminal PH domain and a C-terminal hydrophobic region. evt-1 is thought to function as a mediator of post-Golgi trafficking in cells that produce large membrane-rich organelles. It is a candidate gene for the inherited human retinopathy autosomal dominant familial exudative vitreoretinopathy and a susceptibility gene for multiple sclerosis. evt-2 is essential for retrograde endosomal membrane transport from the plasma membrane (PM) to the Golgi. Two membrane trafficking pathways pass through recycling endosomes: a recycling pathway and a retrograde pathway that links the PM to the Golgi/ER. Its PH domain is unique in that it specifically recognizes phosphatidylserine (PS), but not polyphosphoinositides. PS is an anionic phospholipid class in eukaryotic biomembranes, is highly enriched in the PM, and plays key roles in various physiological processes such as the coagulation cascade, recruitment and activation of signaling molecules, and clearance of apoptotic cells. PH domains are only found in eukaryotes. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	108
270086	cd13266	PH_Skap_family	Src kinase-associated phosphoprotein family Pleckstrin homology (PH) domain. Skap adaptor proteins couple receptors to cytoskeletal rearrangements. Src kinase-associated phosphoprotein of 55 kDa (Skap55)/Src kinase-associated phosphoprotein 1 (Skap1), Skap2, and Skap-homology (Skap-hom) have an N-terminal coiled-coil conformation, a central PH domain and a C-terminal SH3 domain. Their PH domains bind 3'-phosphoinositides and also act directly on targets: in Skap55 the PH domain directly affects integrin regulation by ADAP and NF-kappaB activation, and in Skap-hom the dimerization and PH domains comprise a 3'-phosphoinositide-gated molecular switch that controls ruffle formation. PH domains are only found in eukaryotes. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	106
270087	cd13267	PH_DOCK-D	Dedicator of cytokinesis-D subfamily Pleckstrin homology (PH) domain. DOCK-D subfamily (also called Zizimin subfamily) consists of Dock9/Zizimin1, Dock10/Zizimin3, and Dock11/Zizimin2. DOCK-D has a N-terminal DUF3398 domain, a PH-like domain, a Dock Homology Region 1, DHR1 (also called CZH1), a C2 domain, and a C-terminal DHR2 domain (also called CZH2). Zizimin1 is enriched in the brain, lung, and kidney; zizimin2 is found in B and T lymphocytes, and zizimin3 is enriched in brain, lung, spleen and thymus. Zizimin1 functions in autoinhibition and membrane targeting. Zizimin2 is an immune-related and age-regulated guanine nucleotide exchange factor, which facilitates filopodial formation through activation of Cdc42, which results in activation of cell migration. No function has been determined for Zizimin3 to date. The N-terminal half of zizimin1 binds to the GEF domain through three distinct areas, including CZH1, to inhibit the interaction with Cdc42. In addition its PH domain binds phosphoinositides and mediates zizimin1 membrane targeting. DOCK is a family of proteins involved in intracellular signalling networks. They act as guanine nucleotide exchange factors for small G proteins of the Rho family, such as Rac and Cdc42. There are 4 subfamilies of DOCK family proteins based on their sequence homology: A-D. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. 
PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	126
270088	cd13268	PH_Brdg1	BCR downstream signaling 1 Pleckstrin homology (PH) domain. Brdg1 is thought to function as a docking protein acting downstream of Tec, a protein tyrosine kinase (PTK), in B-cell antigen receptor (BCR) signaling. BRDG1 contains a proline-rich (PR) motif which is thought to bind SH3 or WW domains, a PH domain, and multiple tyrosine residues which are potential target sites for SH2 domains. Since PH domains bind phospholipids, the PH domain is thought to be involved in the tethering of Tec and BRDG1 to the cell membrane. Tec and Pyk2, but not Btk, Bmx, Lyn, Syk, or c-Abl, induce phosphorylation of BRDG1 on tyrosine residues. Efficient phosphorylation requires both the PH and SH2 domains of BRDG1 and the kinase domain of Tec. The overexpression of BRDG1 increases the BCR-mediated activation of cAMP-response element binding protein (CREB). Phosphorylated BRDG1 is hypothesized to recruit CREB either directly or through its recruitment of downstream effectors which then recruit CREB. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	127
241423	cd13269	PH_alsin	Alsin Pleckstrin homology (PH) domain. The ALS2 gene encodes alsin, a GEF, that has dual specificity for Rac1 and Rab5 GTPases. Alsin mutations in the form of truncated proteins are responsible for motor function disorders including juvenile-onset amyotrophic lateral sclerosis, familial juvenile primary lateral sclerosis, and infantile-onset ascending hereditary spastic paralysis. The alsin protein is widely expressed in the developing CNS including neurons of the cerebral cortex, brain stem, spinal cord, and cerebellum. Alsin contains a regulator of chromosome condensation 1 (RCC1) domain, a Rho guanine nucleotide exchanging factor (RhoGEF) domain, a PH domain, a Membrane Occupation and Recognition Nexus (MORN), a vacuolar protein sorting 9 (Vps9) domain, and a Dbl homology (DH) domain. Alsin interacts with Rab5 through its Vps9 domain and through this interaction modulates early endosome fusion and trafficking. The GEF activity of alsin towards Rab5 is regulated by Rac1 function. The GEF activity of alsin for Rac1 occurs via its DH domain and this interaction plays a role in promoting spinal motor neuron survival via multiple Rac-dependent signaling pathways. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. 
Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	106
270089	cd13270	PH1_TAPP1_2	Tandem PH-domain-containing proteins 1 and 2 Pleckstrin homology (PH) domain, N-terminal repeat. The adaptors TAPP1 (also called PLEKHA1/pleckstrin homology domain containing, family A (phosphoinositide binding specific) member 1) and TAPP2 (also called PLEKHA2) bind PtdIns(3,4)P2, but not PtdIns(3,4,5)P3, and function as negative regulators of insulin and PI3K signalling pathways (i.e. TAPP/utrophin/syntrophin complex). TAPP1 and TAPP2 contain two sequential PH domains in which the C-terminal PH domain binds PtdIns(3,4)P2. They also contain a C-terminal PDZ-binding motif that interacts with several PDZ-binding proteins, including PTPN13 (known previously as PTPL1 or FAP-1) as well as the scaffolding proteins MUPP1 (multiple PDZ-domain-containing protein 1), syntrophin and utrophin. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	118
270090	cd13271	PH2_TAPP1_2	Tandem PH-domain-containing proteins 1 and 2 Pleckstrin homology (PH) domain, C-terminal repeat. The adaptors TAPP1 (also called PLEKHA1/pleckstrin homology domain containing, family A (phosphoinositide binding specific) member 1) and TAPP2 (also called PLEKHA2) bind PtdIns(3,4)P2, but not PtdIns(3,4,5)P3, and function as negative regulators of insulin and PI3K signalling pathways (i.e. TAPP/utrophin/syntrophin complex). TAPP1 and TAPP2 contain two sequential PH domains in which the C-terminal PH domain specifically binds PtdIns(3,4)P2 with high affinity. The N-terminal PH domain does not interact with any phosphoinositide tested. They also contain a C-terminal PDZ-binding motif that interacts with several PDZ-binding proteins, including PTPN13 (known previously as PTPL1 or FAP-1) as well as the scaffolding proteins MUPP1 (multiple PDZ-domain-containing protein 1), syntrophin and utrophin. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	114
270091	cd13272	PH_INPP4A_INPP4B	Type I inositol 3,4-bisphosphate 4-phosphatase and Type II inositol 3,4-bisphosphate 4-phosphatase Pleckstrin homology (PH) domain. INPP4A (also called Inositol polyphosphate 4-phosphatase type I) and INPP4B (also called Inositol polyphosphate 4-phosphatase type II) both catalyze the hydrolysis of the 4-position phosphate of phosphatidylinositol 3,4-bisphosphate and inositol 1,3,4-trisphosphate. They differ in that INPP4A additionally catalyzes the hydrolysis of the 4-position phosphate of inositol 3,4-bisphosphate, while INPP4B catalyzes the hydrolysis of the 4-position phosphate of inositol 1,4-bisphosphate. They both have a single PH domain followed by a C2 domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	144
270092	cd13273	PH_SWAP-70	Switch-associated protein-70 Pleckstrin homology (PH) domain. SWAP-70 (also called Differentially expressed in FDCP 6/DEF-6 or IRF4-binding protein) functions in cellular signal transduction pathways (in conjunction with Rac), regulates cell motility through actin rearrangement, and contributes to the transformation and invasion activity of mouse embryo fibroblasts. Metazoan SWAP-70 is found in B lymphocytes, mast cells, and in a variety of organs. Metazoan SWAP-70 contains an N-terminal EF-hand motif, a centrally located PH domain, and a C-terminal coiled-coil domain. The PH domain of Metazoan SWAP-70 contains a phosphoinositide-binding site and a nuclear localization signal (NLS), which localize SWAP-70 to the plasma membrane and nucleus, respectively. The NLS is a sequence of four Lys residues located at the N-terminus of the C-terminal a-helix; this is a unique characteristic of the Metazoan SWAP-70 PH domain. The SWAP-70 PH domain binds PtdIns(3,4,5)P3 and PtdIns(4,5)P2 embedded in lipid bilayer vesicles. There are additional plant SWAP70 proteins, but these are not included in this hierarchy. Rice SWAP70 (OsSWAP70) exhibits GEF activity toward its Rho GTPase, OsRac1, and regulates chitin-induced production of reactive oxygen species and defense gene expression in rice. Arabidopsis SWAP70 (AtSWAP70) plays a role in both PAMP- and effector-triggered immunity. Plant SWAP70 contains both DH and PH domains, but their arrangement is the reverse of that in typical DH-PH-type Rho GEFs, wherein the DH domain is flanked by a C-terminal PH domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. 
PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	110
270093	cd13274	PH_DGK_type2	Type 2 Diacylglycerol kinase Pleckstrin homology (PH) domain. DGK (also called DAGK) catalyzes the conversion of diacylglycerol (DAG) to phosphatidic acid (PA) utilizing ATP as a source of the phosphate. In non-stimulated cells, DGK activity is low and DAG is used for glycerophospholipid biosynthesis. Upon receptor activation of the phosphoinositide pathway, DGK activity increases which drives the conversion of DAG to PA. DGK acts as a switch by terminating the signalling of one lipid while simultaneously activating signalling by another. There are 9 mammalian DGK isoforms all with conserved catalytic domains and two cysteine rich domains. These are further classified into 5 groups according to the presence of additional functional domains and substrate specificity: Type 1 - DGK-alpha, DGK-beta, DGK-gamma - contain EF-hand motifs and a recoverin homology domain; Type 2 - DGK-delta, DGK-eta, and DGK-kappa- contain a pleckstrin homology domain, two cysteine-rich zinc finger-like structures, and a separated catalytic region; Type 3 - DGK-epsilon - has specificity for arachidonate-containing DAG; Type 4 - DGK-zeta, DGK-iota- contain a MARCKS homology domain, ankyrin repeats, a C-terminal nuclear localization signal, and a PDZ-binding motif; Type 5 - DGK-theta - contains a third cysteine-rich domain, a pleckstrin homology domain and a proline rich region. The type 2 DGKs are present as part of this Metazoan DGK hierarchy. They have a N-terminal PH domain, two cysteine rich domains, followed by bipartite catalytic domains, and a C-terminal SAM domain. Their catalytic domains and perhaps other DGK catalytic domains may function as two independent units in a coordinated fashion. They may also require other motifs for maximal activity because several DGK catalytic domains have very little DAG kinase activity when expressed as isolated subunits. 
PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	97
270094	cd13275	PH_M-RIP	Myosin phosphatase-RhoA Interacting Protein Pleckstrin homology (PH) domain. M-RIP is proposed to play a role in myosin phosphatase regulation by RhoA. M-RIP contains 2 PH domains followed by a Rho binding domain (Rho-BD), and a C-terminal myosin binding subunit (MBS) binding domain (MBS-BD). The amino terminus of M-RIP with its adjacent PH domains and polyproline motifs mediates binding to both actin and Galpha. M-RIP brings RhoA and MBS into close proximity where M-RIP can target RhoA to the myosin phosphatase complex to regulate the myosin phosphorylation state. M-RIP does this via its C-terminal coiled-coil domain which interacts with the MBS leucine zipper domain of myosin phosphatase, while its Rho-BD, directly binds RhoA in a nucleotide-independent manner. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	104
270095	cd13276	PH_AtPH1	Arabidopsis thaliana Pleckstrin homolog (PH) 1 (AtPH1) PH domain. AtPH1 is expressed in all plant tissue and is proposed to be the plant homolog of human pleckstrin. Pleckstrin consists of two PH domains separated by a linker region, while AtPH has a single PH domain with a short N-terminal extension. AtPH1 binds PtdIns3P specifically and is thought to be an adaptor molecule since it has no obvious catalytic functions. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	106
270096	cd13277	PH_Bem3	Bud emergence protein 3 (Bem3) Pleckstrin homology (PH) domain. Bud emergence in Saccharomyces cerevisiae involves cell cycle-regulated reorganizations of cortical cytoskeletal elements and requires the action of the Rho-type GTPase Cdc42. Bem3 contains a RhoGAP domain and a PH domain. Though Bem3 and Bem2 both contain a RhoGAP domain, only Bem3 is able to stimulate the hydrolysis of GTP on Cdc42. Bem3 is thought to be the GAP for Cdc42. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	111
241432	cd13278	PH_Bud4	Bud4 Pleckstrin homology (PH) domain. Bud4 is an anillin-like yeast protein involved in the formation and the disassembly of the double ring structure formed by the septins during cytokinesis. Bud4 acts with Bud3 and in parallel with septin phosphorylation by the p21-activated kinase Cla4 and the septin-dependent kinase Gin4. Bud4 is regulated by the cyclin-dependent protein kinase Cdk1, the master regulator of cell cycle progression. Bud4 contains an anillin-like domain followed by a PH domain. In addition there are two consensus Cdk phosphorylation sites: one at the N-terminus and one right before the C-terminal PH domain. Anillins also have C-terminal PH domains. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	139
270097	cd13279	PH_Cla4_Ste20	Pleckstrin homology (PH) domain. Budding yeast contain two main p21-activated kinases (PAKs), Cla4 and Ste20. The yeast Ste20 protein kinase is involved in pheromone response, though the function of Ste20 mammalian homologs is unknown. Cla4 is involved in budding and cytokinesis and interacts with Cdc42, a GTPase required for polarized cell growth as is Pak. Cla4 and Ste20 kinases share a function in localizing cell growth with respect to the septin ring. They both contain a PH domain, a Cdc42/Rac interactive binding (CRIB) domain, and a C-terminal Protein Kinase catalytic (PKc) domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	92
270098	cd13280	PH_SIP3	Snf1p-interacting protein 3 Pleckstrin homology (PH) domain. SIP3 interacts with SNF1 protein kinase and activates transcription when anchored to DNA. It may function in the SNF1 pathway. SIP3 contain an N-terminal Bin/Amphiphysin/Rvs (BAR) domain followed by a PH domain. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	105
270099	cd13281	PH_PLEKHD1	Pleckstrin homology (PH) domain containing, family D (with coiled-coil domains) member 1 PH domain. Human PLEKHD1 (also called UPF0639, pleckstrin homology domain containing, family D (with M protein repeats) member 1) is a single transcript and contains a single PH domain. PLEKHD1 is conserved in human, chimpanzee, dog, cow, mouse, chicken, zebrafish, and Caenorhabditis elegans. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	139
241436	cd13282	PH1_PLEKHH1_PLEKHH2	Pleckstrin homology (PH) domain containing, family H (with MyTH4 domain) members 1 and 2 (PLEKHH1) PH domain, repeat 1. PLEKHH1 and PLEKHH2 (also called PLEKHH1L) are thought to function in phospholipid binding and signal transduction. There are 3 Human PLEKHH genes: PLEKHH1, PLEKHH2, and PLEKHH3. There are many isoforms, the longest of which contain a FERM domain, a MyTH4 domain, two PH domains, a peroximal domain, a vacuolar domain, and a coiled coil stretch. The FERM domain has a cloverleaf tripart structure (FERM_N, FERM_M, FERM_C/N, alpha-, and C-lobe/A-lobe, B-lobe, C-lobe/F1, F2, F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	96
270100	cd13283	PH_GPBP	Goodpasture antigen binding protein Pleckstrin homology (PH) domain. The GPBP (also called Collagen type IV alpha-3-binding protein/hCERT; START domain-containing protein 11/StARD11; StAR-related lipid transfer protein 11) is a kinase that phosphorylates an N-terminal region of the alpha 3 chain of type IV collagen, which is commonly known as the Goodpasture antigen. Its splice variant the ceramide transporter (CERT) mediates the cytosolic transport of ceramide. There have been additional splice variants identified, but all of them function as ceramide transport proteins. GPBP and CERT both contain an N-terminal PH domain, followed by a serine rich domain, and a C-terminal START domain. However, GPBP has an additional serine rich domain just upstream of its START domain. They are members of the oxysterol binding protein (OSBP) family which includes OSBP, OSBP-related proteins (ORP), Goodpasture antigen binding protein (GPBP), and Four phosphate adaptor protein 1 (FAPP1). They have a wide range of purported functions including sterol transport, cell cycle control, pollen development and vesicle transport from Golgi recognize both PI lipids and ARF proteins. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	100
270101	cd13284	PH_OSBP_ORP4	Human Oxysterol binding protein and OSBP-related protein 4 Pleckstrin homology (PH) domain. Human OSBP is proposed to function in sterol-dependent regulation of ERK dephosphorylation and sphingomyelin synthesis as well as modulation of insulin signaling and hepatic lipogenesis. It contains a N-terminal PH domain, a FFAT motif (two phenylalanines in an acidic tract), and a C-terminal OSBP-related domain. OSBPs and Osh1p PH domains specifically localize to the Golgi apparatus in a PtdIns4P-dependent manner. ORP4 is proposed to function in Vimentin-dependent sterol transport and/or signaling. Human ORP4 has 2 forms, a long (ORP4L) and a short (ORP4S). ORP4L contains a N-terminal PH domain, a FFAT motif (two phenylalanines in an acidic tract), and a C-terminal OSBP-related domain. ORP4S is truncated and contains only an OSBP-related domain. Oxysterol binding proteins are a multigene family that is conserved in yeast, flies, worms, mammals and plants. They all contain a C-terminal oxysterol binding domain, and most contain an N-terminal PH domain. OSBP PH domains bind to membrane phosphoinositides and thus likely play an important role in intracellular targeting. They are members of the oxysterol binding protein (OSBP) family which includes OSBP, OSBP-related proteins (ORP), Goodpasture antigen binding protein (GPBP), and Four phosphate adaptor protein 1 (FAPP1). They have a wide range of purported functions including sterol transport, cell cycle control, pollen development and vesicle transport from Golgi recognize both PI lipids and ARF proteins. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	99
270102	cd13285	PH_ORP1	Human Oxysterol binding protein related protein 1 Pleckstrin homology (PH) domain. Human ORP1 has 2 forms, a long (ORP1L) and a short (ORP1S). ORP1L contains 3 N-terminal ankyrin repeats, followed by a PH domain, a FFAT motif (two phenylalanines in an acidic tract), and a C-terminal OSBP-related domain. ORP1S is truncated and contains only an OSBP-related domain. ORP1L is proposed to function in motility and distribution of late endosomes, autophagy, and macrophage lipid metabolism. ORP1S is proposed to function in vesicle transport from Golgi. Oxysterol binding proteins are a multigene family that is conserved in yeast, flies, worms, mammals and plants. In general OSBPs and ORPs have been found to be involved in the transport and metabolism of cholesterol and related lipids in eukaryotes. They all contain a C-terminal oxysterol binding domain, and most contain an N-terminal PH domain. OSBP PH domains bind to membrane phosphoinositides and thus likely play an important role in intracellular targeting. They are members of the oxysterol binding protein (OSBP) family which includes OSBP, OSBP-related proteins (ORP), Goodpasture antigen binding protein (GPBP), and Four phosphate adaptor protein 1 (FAPP1). They have a wide range of purported functions including sterol transport, cell cycle control, pollen development and vesicle transport from Golgi recognize both PI lipids and ARF proteins. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	125
270103	cd13286	PH_OPR5_ORP8	Human Oxysterol binding protein related proteins 5 and 8 Pleckstrin homology (PH) domain. Human ORP5 is proposed to function in efficient nonvesicular transfer of low-density lipoproteins-derived cholesterol (LDL-C) from late endosomes/lysosomes to the endoplasmic reticulum (ER). Human ORP8 is proposed to modulate lipid homeostasis and sterol regulatory element binding proteins (SREBP) activity. Both ORP5 and ORP8 contain a N-terminal PH domain, a C-terminal OSBP-related domain, followed by a transmembrane domain that localizes ORP5 to the ER. Unlike all the other human OSBP/ORPs they lack a FFAT motif (two phenylalanines in an acidic tract). Oxysterol binding proteins are a multigene family that is conserved in yeast, flies, worms, mammals and plants. In general OSBPs and ORPs have been found to be involved in the transport and metabolism of cholesterol and related lipids in eukaryotes. They all contain a C-terminal oxysterol binding domain, and most contain an N-terminal PH domain. OSBP PH domains bind to membrane phosphoinositides and thus likely play an important role in intracellular targeting. They are members of the oxysterol binding protein (OSBP) family which includes OSBP, OSBP-related proteins (ORP), Goodpasture antigen binding protein (GPBP), and Four phosphate adaptor protein 1 (FAPP1). They have a wide range of purported functions including sterol transport, cell cycle control, pollen development and vesicle transport from Golgi recognize both PI lipids and ARF proteins. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	130
270104	cd13287	PH_ORP3_ORP6_ORP7	Human Oxysterol binding protein related proteins 3, 6, and 7 Pleckstrin homology (PH) domain. Human ORP3 is proposed to function in regulating the cell-matrix and cell-cell adhesion. A proposed specific function for Human ORP6 was not found at present. Human ORP7 is proposed to function in negatively regulating the Golgi soluble NSF attachment protein receptor (SNARE) of 28 kDa (GS28) protein stability via sequestration of Golgi-associated ATPase enhancer of 16 kDa (GATE-16). ORP3 has 2 isoforms: the longer ORP3(1) and the shorter ORP3(2). ORP3(1), ORP6, and ORP7 all contain a N-terminal PH domain, a FFAT motif (two phenylalanines in an acidic tract), and a C-terminal OSBP-related domain. The shorter ORP3(2) is missing the C-terminal portion of its OSBP-related domain. Oxysterol binding proteins are a multigene family that is conserved in yeast, flies, worms, mammals and plants. In general OSBPs and ORPs have been found to be involved in the transport and metabolism of cholesterol and related lipids in eukaryotes. They all contain a C-terminal oxysterol binding domain, and most contain an N-terminal PH domain. OSBP PH domains bind to membrane phosphoinositides and thus likely play an important role in intracellular targeting. They are members of the oxysterol binding protein (OSBP) family which includes OSBP, OSBP-related proteins (ORP), Goodpasture antigen binding protein (GPBP), and Four phosphate adaptor protein 1 (FAPP1). They have a wide range of purported functions including sterol transport, cell cycle control, pollen development and vesicle transport from Golgi recognize both PI lipids and ARF proteins. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	123
270105	cd13288	PH_Ses	Sesquipedalian family Pleckstrin homology (PH) domain. The sesquipedalian family has 2 mammalian members: Ses1 and Ses2, which are also called 7 kDa inositol polyphosphate phosphatase-interacting protein 1 and 2. They play a role in endocytic trafficking and are required for receptor recycling from endosomes, both to the trans-Golgi network and the plasma membrane. Members of this family form homodimers and heterodimers. Sesquipedalian interacts with inositol polyphosphate 5-phosphatase OCRL-1 (INPP5F) also known as Lowe oculocerebrorenal syndrome protein, a phosphatase enzyme that is involved in actin polymerization and is found in the trans-Golgi network and INPP5B. Sesquipedalian contains a single PH domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	120
241443	cd13289	PH_Osh3p_yeast	Yeast oxysterol binding protein homolog 3 Pleckstrin homology (PH) domain. Yeast Osh3p is proposed to function in sterol transport and regulation of nuclear fusion during mating and of pseudohyphal growth as well as sphingolipid metabolism. Osh3 contains a N-GOLD (Golgi dynamics) domain, a PH domain, a FFAT motif (two phenylalanines in an acidic tract), and a C-terminal OSBP-related domain. GOLD domains are thought to mediate protein-protein interactions, but their role in ORPs is unknown. Oxysterol binding proteins are a multigene family that is conserved in yeast, flies, worms, mammals and plants. In general OSBPs and ORPs have been found to be involved in the transport and metabolism of cholesterol and related lipids in eukaryotes. They all contain a C-terminal oxysterol binding domain, and most contain an N-terminal PH domain. OSBP PH domains bind to membrane phosphoinositides and thus likely play an important role in intracellular targeting. They are members of the oxysterol binding protein (OSBP) family which includes OSBP, OSBP-related proteins (ORP), Goodpasture antigen binding protein (GPBP), and Four phosphate adaptor protein 1 (FAPP1). They have a wide range of purported functions including sterol transport, cell cycle control, pollen development and vesicle transport from Golgi recognize both PI lipids and ARF proteins. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	90
241444	cd13290	PH_ORP9	Human Oxysterol binding protein related protein 9 Pleckstrin homology (PH) domain. Human ORP9 is proposed to function in regulation of Akt phosphorylation. ORP9 has 2 forms, a long (ORP9L) and a short (ORP9S). ORP9L contains an N-terminal PH domain, a FFAT motif (two phenylalanines in an acidic tract), and a C-terminal OSBP-related domain. ORP9S is truncated and contains a FFAT motif and an OSBP-related domain. Oxysterol binding proteins are a multigene family that is conserved in yeast, flies, worms, mammals and plants. In general OSBPs and ORPs have been found to be involved in the transport and metabolism of cholesterol and related lipids in eukaryotes. They all contain a C-terminal oxysterol binding domain, and most contain an N-terminal PH domain. OSBP PH domains bind to membrane phosphoinositides and thus likely play an important role in intracellular targeting. They are members of the oxysterol binding protein (OSBP) family which includes OSBP, OSBP-related proteins (ORP), Goodpasture antigen binding protein (GPBP), and Four phosphate adaptor protein 1 (FAPP1). They have a wide range of purported functions including sterol transport, cell cycle control, pollen development and vesicle transport from Golgi recognize both PI lipids and ARF proteins. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	102
270106	cd13291	PH_ORP10_ORP11	Human Oxysterol binding protein (OSBP) related proteins 10 and 11 (ORP10 and ORP11) Pleckstrin homology (PH) domain. Human ORP10 is involved in intracellular transport or organelle positioning and is proposed to function as a regulator of cellular lipid metabolism. Human ORP11 localizes at the Golgi-late endosome interface and is thought to form a dimer with ORP9 functioning as an intracellular lipid sensor or transporter. Both ORP10 and ORP11 contain a N-terminal PH domain, a FFAT motif (two phenylalanines in an acidic tract), and a C-terminal OSBP-related domain. Oxysterol binding proteins are a multigene family that is conserved in yeast, flies, worms, mammals and plants. In general OSBPs and ORPs have been found to be involved in the transport and metabolism of cholesterol and related lipids in eukaryotes. They all contain a C-terminal oxysterol binding domain, and most contain an N-terminal PH domain. OSBP PH domains bind to membrane phosphoinositides and thus likely play an important role in intracellular targeting. They are members of the oxysterol binding protein (OSBP) family which includes OSBP, OSBP-related proteins (ORP), Goodpasture antigen binding protein (GPBP), and Four phosphate adaptor protein 1 (FAPP1). They have a wide range of purported functions including sterol transport, cell cycle control, pollen development and vesicle transport from Golgi recognize both PI lipids and ARF proteins. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	107
241446	cd13292	PH_Osh1p_Osh2p_yeast	Yeast oxysterol binding protein homologs 1 and 2 Pleckstrin homology (PH) domain. Yeast Osh1p is proposed to function in postsynthetic sterol regulation, piecemeal microautophagy of the nucleus, and cell polarity establishment. Yeast Osh2p is proposed to function in sterol metabolism and cell polarity establishment. Both Osh1p and Osh2p contain 3 N-terminal ankyrin repeats, a PH domain, a FFAT motif (two phenylalanines in an acidic tract), and a C-terminal OSBP-related domain. OSBP and Osh1p PH domains specifically localize to the Golgi apparatus in a PtdIns4P-dependent manner. Oxysterol binding proteins are a multigene family that is conserved in yeast, flies, worms, mammals and plants. In general OSBPs and ORPs have been found to be involved in the transport and metabolism of cholesterol and related lipids in eukaryotes. They all contain a C-terminal oxysterol binding domain, and most contain an N-terminal PH domain. OSBP PH domains bind to membrane phosphoinositides and thus likely play an important role in intracellular targeting. They are members of the oxysterol binding protein (OSBP) family which includes OSBP, OSBP-related proteins (ORP), Goodpasture antigen binding protein (GPBP), and Four phosphate adaptor protein 1 (FAPP1). They have a wide range of purported functions including sterol transport, cell cycle control, pollen development and vesicle transport from Golgi recognize both PI lipids and ARF proteins. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	103
241447	cd13293	PH_CpORP2-like	Cryptosporidium-like Oxysterol binding protein related protein 2 Pleckstrin homology (PH) domain. There are 2 types of ORPs found in Cryptosporidium: CpORP1 and CpORP2. Cryptosporidium differs from other apicomplexans like Plasmodium, Toxoplasma, and Eimeria which possess only a single long-type ORP consisting of an N-terminal PH domain followed by a C-terminal ligand binding (LB) domain. CpORP2 is like this, but CpORP1 differs and has a truncated N-terminus resulting in only having a LB domain present. The exact functions of these proteins are largely unknown though CpORP1 is thought to be involved in lipid transport across the parasitophorous vacuole membrane. Oxysterol binding proteins are a multigene family that is conserved in yeast, flies, worms, mammals and plants. In general OSBPs and ORPs have been found to be involved in the transport and metabolism of cholesterol and related lipids in eukaryotes. They all contain a C-terminal oxysterol binding domain, and most contain an N-terminal PH domain. OSBP PH domains bind to membrane phosphoinositides and thus likely play an important role in intracellular targeting. They are members of the oxysterol binding protein (OSBP) family which includes OSBP, OSBP-related proteins (ORP), Goodpasture antigen binding protein (GPBP), and Four phosphate adaptor protein 1 (FAPP1). They have a wide range of purported functions including sterol transport, cell cycle control, pollen development and vesicle transport from Golgi recognize both PI lipids and ARF proteins. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	88
241448	cd13294	PH_ORP_plant	Plant Oxysterol binding protein related protein Pleckstrin homology (PH) domain. Plant ORPs contain a N-terminal PH domain and a C-terminal OSBP-related domain. Not much is known about their specific function in plants to date. Members here include: Arabidopsis, spruce, and petunia. Oxysterol binding proteins are a multigene family that is conserved in yeast, flies, worms, mammals and plants. In general OSBPs and ORPs have been found to be involved in the transport and metabolism of cholesterol and related lipids in eukaryotes. They all contain a C-terminal oxysterol binding domain, and most contain an N-terminal PH domain. OSBP PH domains bind to membrane phosphoinositides and thus likely play an important role in intracellular targeting. They are members of the oxysterol binding protein (OSBP) family which includes OSBP, OSBP-related proteins (ORP), Goodpasture antigen binding protein (GPBP), and Four phosphate adaptor protein 1 (FAPP1). They have a wide range of purported functions including sterol transport, cell cycle control, pollen development and vesicle transport from Golgi recognize both PI lipids and ARF proteins. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	100
270107	cd13295	PH_EFA6	Exchange Factor for ARF6 Pleckstrin homology (PH) domain. EFA6 (also called PSD/pleckstrin and Sec7 domain containing) is a guanine nucleotide exchange factor for ADP ribosylation factor 6 (ARF6), which is involved in membrane recycling. EFA6 has four structurally related polypeptides: EFA6A, EFA6B, EFA6C and EFA6D. It consists of a N-terminal proline rich region (PR), a SEC7 domain, a PH domain, a PR, a coiled-coil region, and a C-terminal PR. The EFA6 PH domain regulates its association with the plasma membrane. EFA6 activates Arf6 through its Sec7 catalytic domain and modulates this activity through its C-terminal domain, which rearranges the actin cytoskeleton in fibroblastic cell lines. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	126
270108	cd13296	PH2_MyoX	Myosin X Pleckstrin homology (PH) domain, repeat 2. MyoX, a MyTH-FERM myosin, is a molecular motor that has crucial functions in the transport and/or tethering of integrins in the actin-based extensions known as filopodia, microtubule binding, and in netrin-mediated axon guidance. It functions as a dimer. MyoX walks on bundles of actin, rather than single filaments, unlike the other unconventional myosins. MyoX is present in organisms ranging from humans to choanoflagellates, but not in Drosophila and Caenorhabditis elegans. MyoX consists of a N-terminal motor/head region, a neck made of 3 IQ motifs, and a tail consisting of a coiled-coil domain, a PEST region, 3 PH domains, a myosin tail homology 4 (MyTH4), and a FERM domain at its very C-terminus. The first PH domain in the MyoX tail is a split-PH domain, interrupted by the second PH domain such that PH 1a and PH 1b flank PH 2. The third PH domain (PH 3) follows the PH 1b domain. This cd contains the second PH repeat. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	103
270109	cd13297	PH3_MyoX-like	Myosin X-like Pleckstrin homology (PH) domain, repeat 3. MyoX, a MyTH-FERM myosin, is a molecular motor that has crucial functions in the transport and/or tethering of integrins in the actin-based extensions known as filopodia, microtubule binding, and in netrin-mediated axon guidance. It functions as a dimer. MyoX walks on bundles of actin, rather than single filaments, unlike the other unconventional myosins. MyoX is present in organisms ranging from humans to choanoflagellates, but not in Drosophila and Caenorhabditis elegans. MyoX consists of an N-terminal motor/head region, a neck made of 3 IQ motifs, and a tail consisting of a coiled-coil domain, a PEST region, 3 PH domains, a myosin tail homology 4 (MyTH4), and a FERM domain at its very C-terminus. The first PH domain in the MyoX tail is a split-PH domain, interrupted by the second PH domain such that PH 1a and PH 1b flank PH 2. The third PH domain (PH 3) follows the PH 1b domain. This cd contains the third MyoX PH repeat. PLEKHH3/Pleckstrin homology (PH) domain containing, family H (with MyTH4 domain) member 3 is also part of this CD and like MyoX contains a FERM domain, a MyTH4 domain, and a single PH domain. Not much is known about the function of PLEKHH3. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	126
270110	cd13298	PH1_PH_fungal	Fungal proteins Pleckstrin homology (PH) domain, repeat 1. The functions of these fungal proteins are unknown, but they all contain 2 PH domains. This cd represents the first PH repeat. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	106
270111	cd13299	PH2_PH_fungal	Fungal proteins Pleckstrin homology (PH) domain, repeat 2. The functions of these fungal proteins are unknown, but they all contain 2 PH domains. This cd represents the second PH repeat. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	102
270112	cd13300	PH1_TECPR1	Tectonin beta-propeller repeat-containing protein 1 Pleckstrin homology (PH) domain, repeat 1. TECPR1 is a tethering factor involved in autophagy. It promotes the autophagosome fusion with lysosomes by associating with both the ATG5-ATG12 conjugate and phosphatidylinositol-3-phosphate (PtdIns3P) present at the surface of autophagosomes. TECPR1 is also involved in selective autophagy against bacterial pathogens, by being required for phagophore/preautophagosomal structure biogenesis and maturation. It contains 2 DysFN (Dysferlin domains of unknown function, N-terminal), 2 Hyd_WA domains that are probably beta-propellers, a PH-like domain, a TECPR domain, and a DysFC (C-terminal). The PH domain mediates the binding to phosphatidylinositol-3-phosphate (PtdIns3P). Binding to the ATG5-ATG12 conjugate exposes the PH domain, allowing the association with PtdIns3P. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	122
270113	cd13301	PH1_Pleckstrin_2	Pleckstrin 2 Pleckstrin homology (PH) domain, repeat 1. Pleckstrin is a protein found in platelets. This name is derived from platelet and leukocyte C kinase substrate and the KSTR string of amino acids. Pleckstrin 2 contains two PH domains and a DEP (dishevelled, egl-10, and pleckstrin) domain. Unlike pleckstrin 1, pleckstrin 2 does not contain obvious sites of PKC phosphorylation. Pleckstrin 2 plays a role in actin rearrangement, large lamellipodia and peripheral ruffle formation, and may help orchestrate cytoskeletal arrangement. The PH domains of pleckstrin 2 are thought to contribute to lamellipodia formation. This cd contains the first PH domain repeat. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	108
270114	cd13302	PH2_Pleckstrin_2	Pleckstrin 2 Pleckstrin homology (PH) domain, repeat 2. Pleckstrin is a protein found in platelets. This name is derived from platelet and leukocyte C kinase substrate and the KSTR string of amino acids. Pleckstrin 2 contains two PH domains and a DEP (dishevelled, egl-10, and pleckstrin) domain. Unlike pleckstrin 1, pleckstrin 2 does not contain obvious sites of PKC phosphorylation. Pleckstrin 2 plays a role in actin rearrangement, large lamellipodia and peripheral ruffle formation, and may help orchestrate cytoskeletal arrangement. The PH domains of pleckstrin 2 are thought to contribute to lamellipodia formation. This cd contains the second PH domain repeat. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	109
241457	cd13303	PH1-like_Rtt106	Pleckstrin homology-like domain, repeat 1, of Histone chaperone RTT106 (regulator of Ty1 transposition protein 106). Rtt106 is a histone chaperone. The binding of Rtt106 to H3K56-acetylated (H3-H4)2 tetramers contributes to nucleosome assembly in terms of DNA replication, gene silencing and maintenance of genomic stability. Rtt106 contains an N-terminal homodimerization domain and two C-terminal pleckstrin-homology (PH) domains (PH1 and PH2). The N-terminal domain homodimerizes and interacts with H3-H4 independently of acetylation while the double PH domain binds the K56-containing region of H3. Rtt106 also interacts with both the SWI/SNF and RSC chromatin remodeling complexes and is involved in their cell-cycle dependent recruitment to histone gene pairs regulated by the HIR co-repressor complex (HTA1-HTB1, HHT1-HHF1, and HHT2-HHF2). This model contains the first PH-like domain repeat. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	139
241458	cd13304	PH2-like_Rtt106	Pleckstrin homology-like domain, repeat 2, of Histone chaperone RTT106 (regulator of Ty1 transposition protein 106). Rtt106 is a histone chaperone. Rtt106 contains an N-terminal homodimerization domain and two C-terminal pleckstrin-homology (PH) domains (PH1 and PH2). The binding of Rtt106 to H3K56-acetylated (H3-H4)2 tetramers contributes to nucleosome assembly in terms of DNA replication, gene silencing and maintenance of genomic stability. The N-terminal domain homodimerizes and interacts with H3-H4 independently of acetylation while the double PH domain binds the K56-containing region of H3. Rtt106 also interacts with both the SWI/SNF and RSC chromatin remodeling complexes and is involved in their cell-cycle dependent recruitment to histone gene pairs regulated by the HIR co-repressor complex (HTA1-HTB1, HHT1-HHF1, and HHT2-HHF2). This model contains the second PH-like domain repeat. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	89
270115	cd13305	PH_SHARPIN	SHANK-associated RH domain interacting protein Pleckstrin homology (PH) domain. SHARPIN has a variety of roles including: a role as a scaffolding partner of anchoring/scaffold proteins Shank1, a role in carcinogenesis through the interaction with FYN binding protein (FYB), which binds to oncogene FYN, a role in apoptosis by interacting with AIFM1, a mitochondrial regulator of cell death, CAPN13, and NSD1, as well as a role in immune disease and inflammation. SHARPIN has at its N-terminus a PH domain, followed by an E3 ubiquitin ligase domain, and a C-terminal RanBP-type and C3HC4-type zinc finger containing 1 domain (RBCK1, also known as HOIP which functions as a protein kinase C (PKC) binding protein as well as a transcriptional activator). SHARPIN's PH domain functions as a dimerization module rather than a ligand recognition domain, extending the functional applications of this superfold. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	114
270116	cd13306	PH1_AFAP	Actin filament associated protein family Pleckstrin homology (PH) domain, repeat 1. There are 3 members of the AFAP family of adaptor proteins: AFAP1, AFAP1L1, and AFAP1L2/XB130. AFAP1 is a cSrc binding partner and actin cross-linking protein. AFAP1L1 is thought to play a similar role to AFAP1 in terms of being an actin cross-linking protein, but it preferentially binds to cortactin and not cSrc, thereby playing a role in invadosome formation. AFAP1L2 is a cSrc binding protein, but does not bind to actin filaments. AFAP1L2 acts as an intermediary between the RET/PTC kinase and PI-3kinase pathway in the thyroid. The AFAPs share a similar structure of a SH3 binding motif, 3 SH2 binding motifs, 2 PH domains, a coiled-coil region corresponding to the AFAP1 leucine zipper, and an actin binding domain. The amino terminal PH1 domain of AFAP1 has been known to function in intra-molecular regulation of AFAP1. In addition, the PH1 domain is a binding partner for PKCa and phospholipids. This cd is the first PH domain of AFAP. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	107
270117	cd13307	PH2_AFAP	Actin filament associated protein family Pleckstrin homology (PH) domain, repeat 2. There are 3 members of the AFAP family of adaptor proteins: AFAP1, AFAP1L1, and AFAP1L2/XB130. AFAP1 is a cSrc binding partner and actin cross-linking protein. AFAP1L1 is thought to play a similar role to AFAP1 in terms of being an actin cross-linking protein, but it preferentially binds to cortactin and not cSrc, thereby playing a role in invadosome formation. AFAP1L2 is a cSrc binding protein, but does not bind to actin filaments. AFAP1L2 acts as an intermediary between the RET/PTC kinase and PI-3kinase pathway in the thyroid. The AFAPs share a similar structure of a SH3 binding motif, 3 SH2 binding motifs, 2 PH domains, a coiled-coil region corresponding to the AFAP1 leucine zipper, and an actin binding domain. This cd is the second PH domain of AFAP. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	101
270118	cd13308	PH_3BP2	SH3 domain-binding protein 2 Pleckstrin homology (PH) domain. SH3BP2 (the gene that encodes the adaptor protein 3BP2), HD, ITU, IT10C3, and ADD1 are located near the Huntington's Disease Gene on Human Chromosome 4p16.3. SH3BP2 lies in a region that is often missing in individuals with Wolf-Hirschhorn syndrome (WHS). Gain of function mutations in SH3BP2 cause enhanced B-cell antigen receptor (BCR)-mediated activation of nuclear factor of activated T cells (NFAT), resulting in a rare, genetic disorder called cherubism. This results in an increase in the signaling complex formation with Syk, phospholipase C-gamma2 (PLC-gamma2), and Vav1. It was recently discovered that Tankyrase regulates 3BP2 stability through ADP-ribosylation and ubiquitylation by the E3-ubiquitin ligase. Cherubism mutations uncouple 3BP2 from Tankyrase-mediated protein destruction, which results in its stabilization and subsequent hyperactivation of the Src, Syk, and Vav signaling pathways. SH3BP2 is also a potential negative regulator of the abl oncogene. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	113
270119	cd13309	PH_SKIP	SifA and kinesin-interacting protein Pleckstrin homology (PH) domain. SKIP (also called PLEKHM2/Pleckstrin homology domain-containing family M member 2) is a soluble cytosolic protein that contains a RUN domain and a PH domain separated by an unstructured linker region. SKIP is a target of the Salmonella effector protein SifA and the SifA-SKIP complex regulates kinesin-1 on the bacterial vacuole. The PH domain of SKIP binds to the N-terminal region of SifA while the N-terminus of SKIP is proposed to bind the TPR domain of the kinesin light chain. The opposite side of the SKIP PH domain is proposed to bind phosphoinositides. SifA, SKIP, SseJ, and RhoA family GTPases are also thought to promote host membrane tubulation. Recently, it was shown that the lysosomal GTPase Arl8 binds to the kinesin-1 linker SKIP and that both are required for the normal intracellular distribution of lysosomes. Interestingly, two kinesin light chain binding motifs (WD) in SKIP have now been identified to match a consensus sequence for a kinesin light chain binding site found in several proteins including calsyntenin-1/alcadein, caytaxin, and vaccinia virus A36. SKIP has also been shown to interact with Rab1A. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	103
270120	cd13310	PH_RalGPS1_2	Ral GEF with PH domain and SH3 binding motif 1 and 2 Pleckstrin homology (PH) domain. RalGPS1 (also called Ral GEF with PH domain and SH3 binding motif 1; RALGEF2/Ral guanine nucleotide exchange factor 2; RalA exchange factor RalGPS1; Ral guanine nucleotide exchange factor RalGPS1A2; ras-specific guanine nucleotide-releasing factor RalGPS1) and RalGPS2 (also called Ral GEF with PH domain and SH3 binding motif 2; Ral-A exchange factor RalGPS2; ras-specific guanine nucleotide-releasing factor RalGPS22) activate small GTPase Ral proteins such as RalA and RalB by stimulating the exchange of Ral bound GDP to GTP, thereby regulating various downstream cellular processes. Structurally they contain an N-terminal Cdc25-like catalytic domain, followed by a PXXP motif and a C-terminal PH domain. The Cdc25-like catalytic domain interacts with Ral and its PH domain ensures the correct membrane localization. Its PXXP motif is thought to interact with the SH3 domain of Grb2. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	116
270121	cd13311	PH_Slm1	Slm1 Pleckstrin homology (PH) domain. Slm1 is a component of the target of rapamycin complex 2 (TORC2) signaling pathway. It plays a role in the regulation of actin organization and is a target of sphingolipid signaling during the heat shock response. Slm1 contains a single PH domain that binds PtdIns(4,5)P2, PtdIns(4)P, and dihydrosphingosine 1-phosphate (DHS-1P). Slm1 possesses two binding sites for anionic lipids. The non-canonical binding site of the PH domain of Slm1 is used for ligand binding, and it is proposed that beta-spectrin, Tiam1 and ArhGAP9 also have this type of phosphoinositide binding site. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	110
270122	cd13312	PH_USP37_like	Pleckstrin homology-like domain of Ubiquitin carboxyl-terminal hydrolase 37. Members here include USP37, USP29, and USP26. All of these contain a single PH-like domain. USP37 (also called ubiquitin carboxyl-terminal hydrolase 37, ubiquitin thiolesterase 37, deubiquitinating enzyme 37, and tmp_locus_50) is a deubiquitinase that antagonizes the anaphase-promoting complex (APC/C) during G1/S transition by mediating deubiquitination of cyclin-A (CCNA1 and CCNA2), resulting in promoting S phase entry. USP37 mediates deubiquitination of 'Lys-11'-linked polyubiquitin chains, a specific ubiquitin-linkage type mediated by the APC/C complex and 'Lys-48'-linked polyubiquitin chains in vitro. Phosphorylation at Ser-628 during G1/S phase maximizes the deubiquitinase activity, leading to prevent degradation of cyclin-A (CCNA1 and CCNA2). USP29 (also called ubiquitin carboxyl-terminal hydrolase 29, ubiquitin thiolesterase 29, deubiquitinating enzyme 29, and HOM-TES-84/86) plays a role in apoptosis and oxidative stress. In response to oxidative stress, JTV1 dissociates from the ARS complex, translocates to the nucleus, associates with far upstream element binding protein (FBP) and co-activates the transcription of USP29 which binds to, cleaves poly-ubiquitin chains from, and stabilizes p53 leading to apoptosis. The X-linked deubiquitination enzyme USP26 (also called ubiquitin carboxyl-terminal hydrolase 26, ubiquitin thiolesterase 26, and deubiquitinating enzyme 26) is a regulator of androgen receptor (AR) signaling. It binds to AR using three nuclear receptor interaction motifs (LXXLL, FXXLF and FXXFF) and modulates AR ubiquitination. Polymorphism of Usp26 correlates with idiopathic male infertility. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	103
270123	cd13313	PH_NF1	Neurofibromin-1 Pleckstrin homology-like domain. Neurofibromin (NF1) contains an N-terminal RasGAP domain, followed by a Sec14-like domain, and a PH domain. Surprisingly, in neurofibromin the PH domain alone is not sufficient for phospholipid binding and instead requires the presence of the Sec-14 domain. The Sec-14 domain has been shown to bind 1-(3-sn-phosphatidyl)-sn-glycerol (PtdGro), (3-sn-phosphatidyl)-ethanolamine (PtdEtn) and -choline (PtdCho) and to a minor extent to (3-sn-phosphatidyl)-l-serine (PtdSer) and 1-(3-sn-phosphatidyl)-d-myo-inositol (PtdIns). Neurofibromatosis type 1 (also known as von Recklinghausen neurofibromatosis or NF1) is a genetic disorder caused by alterations in the tumor suppressor gene NF1. Hallmark symptoms include neural crest derived tumors, pigmentation anomalies, bone deformations, and learning disabilities. Mutations of the tumor suppressor gene NF1 are responsible for disease pathogenesis, with 90% of the alterations being nonsense codons. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	110
270124	cd13314	PH_Rpn13	Pleckstrin homology-like domain of Regulatory Particle Non-ATPase 13. Targeted protein degradation is performed to a great extent by the ubiquitin-proteasome pathway, in which substrate proteins are marked by covalently attached ubiquitin chains that mediate recognition by the proteasome. Rpn13 (also called ADRM1/ARM1) is one of the two major ubiquitin receptors of the proteasome, the other being S5a/Rpn10 which is not essential for ubiquitin-mediated protein degradation in budding yeast. S5a has two ubiquitin interacting motifs (UIMs) that bind simultaneously to ubiquitin moieties to increase affinity while Rpn13 binds ubiquitin with a single, high affinity surface within its N-terminal PH domain. Rpn13 also binds and activates deubiquitinating enzyme Uch37, one of the proteasome's three deubiquitinating enzymes. Recently it was discovered that the ubiquitin-binding domain (BD) and Uch37 BD of human (h) Rpn13 pack against each other when it is not incorporated into the proteasome, reducing hRpn13's affinity for ubiquitin. However, when hRpn13 binds to hRpn2/S1, this abrogates its interdomain interactions, thus activating hRpn13 for ubiquitin binding. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. 
Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	105
270125	cd13315	PH_Sec3	Sec 3 Pleckstrin homology-like domain. The Sec3 subunit of the exocyst, a complex involved in polarized exocytosis, binds phospholipids and the GTPase Cdc42 and therefore functions as a coincidence detector at the plasma membrane. Unlike most PH domains, Sec3 contains an additional alpha-helix at its N-terminus and two beta-strands at its C-terminus that mediate dimerization through domain swapping. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	141
270126	cd13316	PH_Boi	Boi family Pleckstrin homology domain. Yeast Boi proteins Boi1 and Boi2 are functionally redundant and important for cell growth with Boi mutants displaying defects in bud formation and in the maintenance of cell polarity. They appear to be linked to Rho-type GTPase, Cdc42 and Rho3. Boi1 and Boi2 display two-hybrid interactions with the GTP-bound ("active") form of Cdc42, while Rho3 can suppress the lethality caused by deletion of Boi1 and Boi2. These findings suggest that Boi1 and Boi2 are targets of Cdc42 that promote cell growth in a manner that is regulated by Rho3. Boi proteins contain an N-terminal SH3 domain, followed by a SAM (sterile alpha motif) domain, a proline-rich region, which mediates binding to the second SH3 domain of Bem1, and a C-terminal PH domain. The PH domain is essential for its function in cell growth and is important for localization to the bud, while the SH3 domain is needed for localization to the neck. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	97
270127	cd13317	PH_PLEKHO1_PLEKHO2	Pleckstrin homology domain-containing family O Pleckstrin homology domain. The PLEKHO family members are PLEKHO1 (also called CKIP-1/Casein kinase 2-interacting protein 1/CK2-interacting protein 1) and PLEKHO2 (PLEKHQ1/PH domain-containing family Q member 1). They both contain a single PH domain. PLEKHO1 acts as a scaffold protein that functions in plasma membrane recruitment, transcriptional activity modulation, and posttranscriptional modification regulation. As an adaptor protein it is involved in signaling pathways, apoptosis, differentiation, cytoskeleton, and bone formation. Not much is known about PLEKHO2. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	102
270128	cd13318	PH_IQSEC	IQ motif and SEC7 domain-containing protein family Pleckstrin homology domain. The IQSEC (also called BRAG/Brefeldin A-resistant Arf-guanine nucleotide exchange factor) family are a subset of Arf GEFs that have been shown to activate Arf6, which acts in the endocytic pathway to control the trafficking of a subset of cargo proteins including integrins and have key roles in the function and organization of distinct excitatory and inhibitory synapses in the retina. The family consists of 3 members: IQSEC1 (also called BRAG2/GEP100), IQSEC2 (also called BRAG1), and IQSEC3 (also called SynArfGEF, BRAG3, or KIAA1110). IQSEC1 interacts with clathrin and modulates cell adhesion by regulating integrin surface expression and in addition to Arf6, it also activates the class II Arfs, Arf4 and Arf5. Mutations in IQSEC2 cause non-syndromic X-linked intellectual disability as well as reduced activation of Arf substrates (Arf1, Arf6). IQSEC3 regulates Arf6 at inhibitory synapses and associates with the dystrophin-associated glycoprotein complex and S-SCAM. These members contain an IQ domain that may bind calmodulin, a PH domain that is thought to mediate membrane localization by binding of phosphoinositides, and a SEC7 domain that can promote GEF activity on ARF. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. 
Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	128
270129	cd13319	PH_RARhoGAP	RA and RhoGAP domain-containing protein Pleckstrin homology PH domain. RARhoGAP (also called Rho GTPase-activating protein 20 and ARHGAP20) is thought to function in rearrangements of the cytoskeleton and cell signaling events that occur during spermatogenesis. RARhoGAP was also shown to be activated by Rap1 and to induce inactivation of Rho, resulting in neurite outgrowth. Recent findings show that ARHGAP20, even though it is located in the middle of the MDR on 11q22-23, is expressed at higher levels in chronic lymphocytic leukemia patients with 11q22-23 and/or 13q14 deletions and its expression pattern suggests a functional link between cases with 11q22-23 and 13q14 deletions. The mechanism needs to be further studied. RARhoGAP contains a PH domain, a Ras-associating domain, a Rho-GAP domain, and ANXL repeats. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	97
270130	cd13320	PH_OCRL-like	oculocerebrorenal syndrome of Lowe family Pleckstrin homology-like domain. The OCRL family has two members: OCRL1 (also called INPP5F, LOCR, NPHL2, or phosphatidylinositol polyphosphate 5-phosphatase) and OCRL2 (also called IPNNB5, inositol polyphosphate-5-phosphatase, phosphoinositide 5-phosphatase, 5PTase, or type II inositol-1,4,5-trisphosphate 5-phosphatase). The OCRL proteins hydrolyze phosphatidylinositol 4,5-bisphosphate (PtIns(4,5)P2) and the signaling molecule phosphatidylinositol 1,4,5-trisphosphate (PtIns(1,4,5)P3), and thereby modulate cellular signaling events. They interact with APPL1, FAM109A and FAM109B and several Rab GTPases which might both target them to specific membranes and stimulate the phosphatase activity. All OCRL family members contain a PH domain and a Rho-GAP domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	105
241475	cd13321	PH_PLEKHM1	Pleckstrin homology domain-containing family M member 1 Pleckstrin homology (PH) domain. PLEKHM1 is thought to function in vesicular transport in osteoclasts. Mutations in the PLEKHM1 gene are associated with osteopetrosis OPTB6. PLEKHM1 contains an N-terminal RUN domain (RPIP8/RaP2 interacting protein 8, UNC-14 and NESCA/new molecule containing SH3 at the carboxyl-terminus), followed by a PH domain, and either a C1 domain or a DUF4206 domain at its C-terminus. The RUN domain is thought to be involved in Rab-mediated membrane trafficking, possibly as a Rab-binding site. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	132
270131	cd13322	PH_PHLPP-like	PH domain leucine-rich repeat protein phosphatase family Pleckstrin homology-like domain. The PHLPP family has members PHLPP1 (also called hSCOP/Suprachiasmatic nucleus circadian oscillatory protein; PLEKHE1/Pleckstrin homology domain-containing family E member 1) and PHLPP2 (PHLPP-like/PHLPPL). The PHLPP family of novel Ser/Thr phosphatases serve as important regulators of cell survival and apoptosis. PHLPP isozymes catalyze the dephosphorylation of a conserved regulatory motif, the hydrophobic motif, on the AGC kinases Akt, PKC, and S6 kinase, as well as an inhibitory site on the kinase Mst1, to inhibit cellular proliferation and induce apoptosis and negatively regulates ERK1/2 activation. Reductions in their expression have been detected in several cancers and linked to cancer progression. PHLPP1 and PHLPP2 both contain an N-terminal PH domain, followed by 21 LRR (leucine-rich) repeats, and a C-terminal PP2C-like domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	95
270132	cd13323	PH_PLEKHN1	Pleckstrin homology domain containing family N member 1 Pleckstrin homology-like domain. Not much is known about PLEKHN1. It is found in a wide range of animals including humans, green anole, frog, and zebrafish. It contains a single PH domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	121
270133	cd13324	PH_Gab-like	Grb2-associated binding protein family Pleckstrin homology (PH) domain. Gab proteins are scaffolding adaptor proteins, which possess N-terminal PH domains and a C-terminus with proline-rich regions and multiple phosphorylation sites. Following activation of growth factor receptors, Gab proteins are tyrosine phosphorylated and activate PI3K, which generates 3-phosphoinositide lipids. By binding to these lipids via the PH domain, Gab proteins remain in proximity to the receptor, leading to further signaling. While not all Gab proteins depend on the PH domain for recruitment, it is required for Gab activity. There are 3 families: Gab1, Gab2, and Gab3. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	112
270134	cd13325	PH_unc89	unc89 pleckstrin homology (PH) domain. unc89 is a myofibrillar protein. unc89-B, the largest isoform, is composed of 53 immunoglobulin (Ig) domains, 2 Fn3 domains, a triplet of SH3, DH and PH domains at its N-terminus, and 2 protein kinase domains (PK1 and PK2) at its C-terminus. unc-89 mutants display disorganization of muscle A-bands, and usually lack M-lines. The COOH-terminal region of obscurin, the human homolog of unc89, interacts via two specific Ig-like domains with the NH(2)-terminal Z-disk region of titin, a protein that connects the Z line to the M line in the sarcomere and contributes to the contraction of striated muscle. Obscurin is also thought to be involved in Ca2+/calmodulin via its IQ domains, as well as G protein-coupled signal transduction in the sarcomere via its RhoGEF/DH domain. The DH-PH region of OBSCN and unc89, the C. elegans homolog, has exchange activity for RhoA and Rho-1 respectively, but not for the small GTPases homologous to Cdc42 or Rac. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	114
270135	cd13326	PH_CNK_insect-like	Connector enhancer of KSR (Kinase suppressor of ras) (CNK) pleckstrin homology (PH) domain. CNK family members function as protein scaffolds, regulating the activity and the subcellular localization of RAS activated RAF. There is a single CNK protein present in Drosophila and Caenorhabditis elegans in contrast to mammals which have 3 CNK proteins (CNK1, CNK2, and CNK3). All of the CNK members contain a sterile alpha motif (SAM), a conserved region in CNK (CRIC) domain, a PSD-95/DLG-1/ZO-1 (PDZ) domain, and a PH domain. A CNK2 splice variant CNK2A also has a PDZ domain-binding motif at its C terminus and Drosophila CNK (D-CNK) also has a domain known as the Raf-interacting region (RIR) that mediates binding of the Drosophila Raf kinase. This cd contains CNKs from insects, spiders, mollusks, and nematodes. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	91
270136	cd13327	PH_PLEKHM3_2	Pleckstrin homology domain-containing family M member 3 Pleckstrin homology domain 2. PLEKHM3 (also called differentiation associated protein/DAPR) exists as three alternatively spliced isoforms that participate in metal ion binding. It contains 2 PH domains and 1 phorbol-ester/DAG-type zinc finger domain. PLEKHM3 is found in humans, canines, bovine, mouse, rat, chicken and zebrafish. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	88
275410	cd13328	PH1_FDG_family	FYVE, RhoGEF and PH domain containing/faciogenital dysplasia family proteins, N-terminal Pleckstrin homology (PH) domain. In general, FGDs have a RhoGEF (DH) domain, followed by an N-terminal PH domain, a FYVE domain and a C-terminal PH domain. All FGDs are guanine nucleotide exchange factors that activate the Rho GTPase Cdc42, an important regulator of membrane trafficking. The RhoGEF domain is responsible for GEF catalytic activity, while the N-terminal PH domain is involved in intracellular targeting of the DH domain. Mutations in the FGD1 gene are responsible for the X-linked disorder known as faciogenital dysplasia (FGDY). PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	92
275411	cd13329	PH_RhoGEF	Rho guanine nucleotide exchange factor Pleckstrin homology domain. These RhoGEFs belong to the regulator of G-protein signaling (RGS) domain-containing RhoGEFs that are RhoA-selective and directly activated by the Galpha12/13 family of heterotrimeric G proteins. The members here all contain Dbl homology (DH)-PH domains. In addition some members contain N-terminal C1 (Protein kinase C conserved region 1) domains, PDZ (also called DHR/Dlg homologous regions) domains, ANK (ankyrin) domains, and RGS (Regulator of G-protein signalling) domains or C-terminal ATP-synthase B subunit. The DH-PH domains bind and catalyze the exchange of GDP for GTP on RhoA. RhoGEF2/Rho guanine nucleotide exchange factor 2, p114RhoGEF/p114 Rho guanine nucleotide exchange factor, p115RhoGEF, p190RhoGEF, PRG/PDZ Rho guanine nucleotide exchange factor, RhoGEF 11, RhoGEF 12, RhoGEF 18, AKAP13/A-kinase anchoring protein 13, and LARG/Leukemia-associated Rho guanine nucleotide exchange factor are included in this CD. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	109
241484	cd13330	PH_CARM1	Coactivator-Associated Methyltransferase 1 Pleckstrin homology (PH) domain. CARM1 (also known as protein arginine methyltransferase 4/PRMT4) is a protein arginine methyltransferase recruited by several transcription factors. It methylates a variety of proteins and plays a role in gene expression. The N-terminal domain of CARM1 contains a N-terminal PH domain, a catalytic core module composed of two parts (a Rossmann fold topology (RF) and a beta-barrel), and a C-terminal domain. The N-terminal and the C-terminal end of CARM1 catalytic module contain molecular switches that may explain how CARM1 regulates its biological activities by protein-protein interactions. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	107
270139	cd13331	PH_Avo1	Avo1 Pleckstrin homology (PH) domain. Target of rapamycin (TOR) is a highly conserved serine/threonine protein kinase and a central controller of the growth, metabolism and ageing of eukaryotic cells. TOR assembles into two protein complexes termed TOR complex 1 (TORC1) and TOR complex 2 (TORC2) which function as central nodes in a complex network of signal transduction pathways that are involved in normal physiological as well as pathogenic events. TORC1 mediates the rapamycin-sensitive signalling branch, which positively regulates anabolic processes and negatively regulates catabolic processes. TORC2 signalling is rapamycin-insensitive and is involved in the spatial aspects of cell growth by controlling the actin cytoskeleton and cell polarity. In Saccharomyces cerevisiae, TORC2 is involved in the regulation of ceramide metabolism. In S. cerevisiae, TORC1 consists of the proteins Kog1, Lst8, Tco89 and either Tor1 or Tor2, while TORC2 consists of the proteins Avo1, Avo2, Avo3, Bit61, Lst8 and Tor2. The C-terminal domain of the Saccharomyces cerevisiae TORC2 component Avo1 is required for plasma-membrane localization of TORC2 and is essential for yeast viability. The C-termini of Avo1 and Sin1, its human ortholog, both have the pleckstrin homology (PH) domain fold. Comparison with known PH-domain structures suggests a putative binding site for phosphoinositides. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	108
275412	cd13332	FERM_C_JAK1	FERM domain C-lobe of Janus kinase 1. JAK1 is a tyrosine kinase protein essential in signaling type I and type II cytokines. It interacts with the gamma chain of type I cytokine receptors to elicit signals from the IL-2 receptor family, the IL-4 receptor family, the gp130 receptor family, ciliary neurotrophic factor receptor (CNTF-R), neurotrophin-1 receptor (NNT-1R) and leptin receptor (Leptin-R). It also is involved in transducing a signal by type I (IFN-alpha/beta) and type II (IFN-gamma) interferons, and members of the IL-10 family via type II cytokine receptors. JAK (also called Just Another Kinase) is a family of intracellular, non-receptor tyrosine kinases that transduce cytokine-mediated signals via the JAK-STAT pathway. The JAK family in mammals consists of 4 members: JAK1, JAK2, JAK3 and TYK2. JAKs are composed of seven JAK homology (JH) domains (JH1-JH7). The C-terminal JH1 domain is the main catalytic domain, followed by JH2, which is often referred to as a pseudokinase domain, followed by JH3-JH4 which is homologous to the SH2 domain, and lastly JH5-JH7 which is a FERM domain. Named after Janus, the two-faced Roman god of doorways, JAKs possess two near-identical phosphate-transferring domains; one which displays the kinase activity (JH1), while the other negatively regulates the kinase activity of the first (JH2). The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites.	144
270141	cd13333	FERM_C_JAK2	FERM domain C-lobe of Janus kinase (JAK) 2. JAK2 has been implicated in signaling by members of the type II cytokine receptor family, the GM-CSF receptor family, the gp130 receptor family, and the single chain receptors. JAK2 orthologs have been identified in all mammals. Mutations in JAK2 have been implicated in polycythemia vera, essential thrombocythemia, myelofibrosis as well as other myeloproliferative disorders. JAK2 gene fusions with the PCM1 and TEL(ETV6) (TEL-JAK2) genes have been found in leukemia patients. JAK2 inhibitors are being investigated for the treatment of patients with prostate cancer. JAK2 has been shown to interact with a variety of proteins including growth hormone receptor, STAT5A, STAT5B, interleukin 5 receptor alpha subunit, interleukin 12 receptor, SOCS3, PTPN6, PTPN11, Grb2, VAV1, and YES1. JAK (also called Just Another Kinase) is a family of intracellular, non-receptor tyrosine kinases that transduce cytokine-mediated signals via the JAK-STAT pathway. The JAK family in mammals consists of 4 members: JAK1, JAK2, JAK3 and TYK2. JAKs are composed of seven JAK homology (JH) domains (JH1-JH7). The C-terminal JH1 domain is the main catalytic domain, followed by JH2, which is often referred to as a pseudokinase domain, followed by JH3-JH4 which is homologous to the SH2 domain, and lastly JH5-JH7 which is a FERM domain. Named after Janus, the two-faced Roman god of doorways, JAKs possess two near-identical phosphate-transferring domains; one which displays the kinase activity (JH1), while the other negatively regulates the kinase activity of the first (JH2). The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites.	113
275413	cd13334	FERM_C_JAK3	FERM domain C-lobe of Janus kinase (JAK) 3. JAK3 functions in signal transduction and interacts with members of the STAT (signal transduction and activators of transcription) family. It is required for signaling of the type I receptors that use the common gamma chain: IL-2, IL-4, IL-7, IL-9, IL-15 and IL-21. Cytokine binding induces the association of separate cytokine receptor subunits and the activation of the receptor-associated JAKs. In the absence of cytokine, JAKs lack protein tyrosine kinase activity. Once activated, the JAKs create docking sites for the STAT transcription factors by phosphorylation of specific tyrosine residues on the cytokine receptor subunits. Unlike the ubiquitous expression of JAK1, JAK2 and Tyk2, JAK3 is predominantly expressed in hematopoietic cells, such as NK cells, T cells and B cells. Mutations of JAK3 result in severe combined immunodeficiency (SCID). In addition to its well-known roles in T cells and NK cells, JAK3 has recently been found to inhibit IL-8-mediated chemotaxis. JAK3 interacts with CD247, TIAF1, and IL2RG. JAK (also called Just Another Kinase) is a family of intracellular, non-receptor tyrosine kinases that transduce cytokine-mediated signals via the JAK-STAT pathway. The JAK family in mammals consists of 4 members: JAK1, JAK2, JAK3 and TYK2. JAKs are composed of seven JAK homology (JH) domains (JH1-JH7). The C-terminal JH1 domain is the main catalytic domain, followed by JH2, which is often referred to as a pseudokinase domain, followed by JH3-JH4 which is homologous to the SH2 domain, and lastly JH5-JH7 which is a FERM domain. Named after Janus, the two-faced Roman god of doorways, JAKs possess two near-identical phosphate-transferring domains; one which displays the kinase activity (JH1), while the other negatively regulates the kinase activity of the first (JH2). The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites.	110
275414	cd13335	FERM_C_TYK2	FERM domain C-lobe of Non-receptor tyrosine-protein kinase TYK2. Tyk2 functions primarily in IL-12 and type I-IFN signaling as well as transduction of IL-23, IL-10, and IL-6 signals. A mutation in the Tyk2 gene has been associated with hyperimmunoglobulin E syndrome (HIES), a primary immunodeficiency characterized by elevated serum immunoglobulin E. Tyk2 has been shown to interact with FYN, PTPN6, IFNAR1, Ku80 and GNB2L1. JAK (also called Just Another Kinase) is a family of intracellular, non-receptor tyrosine kinases that transduce cytokine-mediated signals via the JAK-STAT pathway. The JAK family in mammals consists of 4 members: JAK1, JAK2, JAK3 and TYK2. JAKs are composed of seven JAK homology (JH) domains (JH1-JH7) . The C-terminal JH1 domain is the main catalytic domain, followed by JH2, which is often referred to as a pseudokinase domain, followed by JH3-JH4 which is homologous to the SH2 domain, and lastly JH5-JH7 which is a FERM domain. Named after Janus, the two-faced Roman god of doorways, JAKs possess two near-identical phosphate-transferring domains; one which displays the kinase activity (JH1), while the other negatively regulates the kinase activity of the first (JH2). The FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites.	158
275415	cd13336	FERM-like_C_SNX31	Atypical FERM-like domain C-lobe of Sorting nexin 31. SNX31 functions in regulating recycling from endosomes to the cell surface. SNX31 contains an N-terminal PX domain, a FERM-like domain, and a unique C-terminal region. It binds Ras GTPase through its FERM-like domains. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. These interactions place the PX-FERM-like proteins at a hub of endosomal sorting and signaling processes. These proteins participate in a network of interactions that will impact on both endosomal protein trafficking and compartment specific Ras signaling cascades. The typical FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. FERM domains are found in cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites.	113
270145	cd13337	FERM-like_C_SNX17	Atypical FERM-like domain C-lobe of Sorting nexin 17. SNX17 is a beta1-integrin-tail-binding protein that interacts with the free kindlin-binding site in endosomes to stabilize beta1 integrins, resulting in their recycling to the cell surface where they can be reused. SNX17 contains a N-terminal PX domain, a FERM-like domain, and a unique C-terminal region. SNX17 binds Ras GTPase through its FERM-like domains. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. These interactions place the PX-FERM-like proteins at a hub of endosomal sorting and signaling processes. These proteins participate in a network of interactions that will impact on both endosomal protein trafficking and compartment specific Ras signaling cascades. The typical FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. FERM domains are found in cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites.	113
270146	cd13338	FERM-like_C_SNX27	Atypical FERM-like domain C-lobe of Sorting nexin 27. SNX27 is localized to early endosomes and known to regulate the intracellular trafficking of ion channels and receptors. SNX27 contains an N-terminal PDZ domain, a PX domain, and a FERM-like domain. SNX27 regulates trafficking of a PAK interacting exchange factor-G protein-coupled receptor kinase interacting protein complex via its PDZ domain interaction. Sorting nexin 27 interacts with multidrug resistance-associated protein 4 (MRP4). SNX27 binds Ras GTPase through its FERM-like domains. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. These interactions place the PX-FERM-like proteins at a hub of endosomal sorting and signaling processes. These proteins participate in a network of interactions that will impact on both endosomal protein trafficking and compartment specific Ras signaling cascades. The typical FERM domain has a cloverleaf tripart structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe, or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. FERM domains are found in cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites.	102
275416	cd13339	PH-GRAM_MTMR13	Myotubularin (MTM) related 13 protein Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. MTMR13 (also called SBF2/SET binding factor 2) is a catalytically inactive phosphatase that plays a role as an adapter for the phosphatase myotubularin to regulate myotubularin intracellular location. It contains a Leu residue instead of a conserved Cys residue in the dsPTPase catalytic loop which renders it catalytically inactive as a phosphatase. MTMR13 has high sequence similarity to MTMR5 and has recently been shown to be a second gene mutated in type 4B Charcot-Marie-Tooth syndrome. Both MTMR5 and MTMR13 contain an N-terminal DENN domain, a PH-GRAM domain, an inactive PTP domain, a SET interaction domain, a coiled-coil domain, and a C-terminal PH domain. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. Mutations in this family cause the human neuromuscular disorders myotubular myopathy and type 4B Charcot-Marie-Tooth syndrome. 6 of the 13 MTMRs (MTMRs 5, 9-13) contain naturally occurring substitutions of residues required for catalysis by PTP family enzymes. Although these proteins are predicted to be enzymatically inactive, they are thought to function as antagonists of endogenous phosphatase activity or interaction modules. Most MTMRs contain a N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, a PTP domain (which may be active or inactive), a SET-interaction domain, and a C-terminal coiled-coil region. In addition some members contain DENN domain N-terminal to the PH-GRAM domain and FYVE, PDZ, and PH domains C-terminal to the coiled-coil region. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. The PH domain family possesses multiple functions including the ability to bind phosphoinositides via its beta1/beta2, beta3/beta4, and beta6/beta7 connecting loops and to other proteins. However, no phosphoinositide binding sites have been found for the MTMRs to date.	119
275417	cd13340	PH-GRAM_MTMR5	Myotubularin (MTM) related 5 protein (MTMR5) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. MTMR5 (also called SBF1/SET binding factor 1) is a catalytically inactive phosphatase that plays a role as an adapter for the phosphatase myotubularin to regulate myotubularin intracellular location. It lacks several amino acids in the dsPTPase catalytic pocket which renders it catalytically inactive as a phosphatase. MTMR5 is the most well-studied inactive member of this family and has been implicated in cellular growth control and oncogenic transformation. MTMR5 and MTMR13 contain an N-terminal DENN domain, a PH-GRAM domain, an inactive PTP domain, a SET interaction domain, a coiled-coil domain, and a C-terminal PH domain. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. Mutations in this family cause the human neuromuscular disorders myotubular myopathy and type 4B Charcot-Marie-Tooth syndrome. 6 of the 13 MTMRs (MTMRs 5, 9-13) contain naturally occurring substitutions of residues required for catalysis by PTP family enzymes. Although these proteins are predicted to be enzymatically inactive, they are thought to function as antagonists of endogenous phosphatase activity or interaction modules. Most MTMRs contain a N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, a PTP domain (which may be active or inactive), a SET-interaction domain, and a C-terminal coiled-coil region. In addition some members contain DENN domain N-terminal to the PH-GRAM domain and FYVE, PDZ, and PH domains C-terminal to the coiled-coil region. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. The PH domain family possesses multiple functions including the ability to bind phosphoinositides via its beta1/beta2, beta3/beta4, and beta6/beta7 connecting loops and to other proteins. However, no phosphoinositide binding sites have been found for the MTMRs to date.	119
270149	cd13341	PH-GRAM_MTMR3	Myotubularin (MTM) related 3 protein (MTMR3) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. MTMR3 is a member of the myotubularin dual specificity protein phosphatase gene family. MTMR3 binds to phosphoinositide lipids through its PH-GRAM domain, and can hydrolyze phosphatidylinositol(3)-phosphate and phosphatidylinositol(3,5)-bisphosphate in vitro. The protein can self-associate and also form heteromers with MTMR4. Both MTMR3 and MTMR4 contain a N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, an active PTP domain, a SET-interaction domain, a coiled-coil region, and a C-terminal lipid-binding FYVE domain which binds phosphatidylinositol-3-phosphate. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. Mutations in this family cause the human neuromuscular disorders myotubular myopathy and type 4B Charcot-Marie-Tooth syndrome. 6 of the 13 MTMRs (MTMRs 5, 9-13) contain naturally occurring substitutions of residues required for catalysis by PTP family enzymes. Although these proteins are predicted to be enzymatically inactive, they are thought to function as antagonists of endogenous phosphatase activity or interaction modules. Most MTMRs contain a N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, a PTP domain (which may be active or inactive), a SET-interaction domain, and a C-terminal coiled-coil region. In addition some members contain DENN domain N-terminal to the PH-GRAM domain and FYVE, PDZ, and PH domains C-terminal to the coiled-coil region. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. The PH domain family possesses multiple functions including the ability to bind phosphoinositides via its beta1/beta2, beta3/beta4, and beta6/beta7 connecting loops and to other proteins. However, no phosphoinositide binding sites have been found for the MTMRs to date.	94
270150	cd13342	PH-GRAM_MTMR4	Myotubularin (MTM) related 4 protein (MTMR4) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. MTMR4 is a member of the myotubularin dual specificity protein phosphatase gene family. MTMR4 binds to phosphoinositide lipids through its PH-GRAM domain, and can hydrolyze phosphatidylinositol(3)-phosphate and phosphatidylinositol(3,5)-bisphosphate in vitro. The protein forms heteromers with MTMR3. Both MTMR3 and MTMR4 contain a N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, an active PTP domain, a SET-interaction domain, a coiled-coil region, and a C-terminal lipid-binding FYVE domain which binds phosphatidylinositol-3-phosphate. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. Mutations in this family cause the human neuromuscular disorders myotubular myopathy and type 4B Charcot-Marie-Tooth syndrome. 6 of the 13 MTMRs (MTMRs 5, 9-13) contain naturally occurring substitutions of residues required for catalysis by PTP family enzymes. Although these proteins are predicted to be enzymatically inactive, they are thought to function as antagonists of endogenous phosphatase activity or interaction modules. Most MTMRs contain a N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, a PTP domain (which may be active or inactive), a SET-interaction domain, and a C-terminal coiled-coil region. In addition some members contain DENN domain N-terminal to the PH-GRAM domain and FYVE, PDZ, and PH domains C-terminal to the coiled-coil region. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. The PH domain family possesses multiple functions including the ability to bind phosphoinositides via its beta1/beta2, beta3/beta4, and beta6/beta7 connecting loops and to other proteins. However, no phosphoinositide binding sites have been found for the MTMRs to date.	114
270151	cd13343	PH-GRAM_MTMR6	Myotubularin (MTM) related (MTMR) 6 protein Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. MTMR6 is a member of the myotubularin dual specificity protein phosphatase gene family. MTMR6 binds to phosphoinositide lipids through its PH-GRAM domain. It acts as a negative regulator of KCNN4/KCa3.1 channel activity in CD4+ T-cells, possibly by decreasing intracellular levels of phosphatidylinositol-3-phosphate, and negatively regulates proliferation of reactivated CD4+ T-cells. MTMR6 interacts with MTMR7, MTMR8 and MTMR9. MTMR6 contains a N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, an active PTP domain, a SET-interaction domain, and a C-terminal coiled-coil region. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. Mutations in this family cause the human neuromuscular disorders myotubular myopathy and type 4B Charcot-Marie-Tooth syndrome. 6 of the 13 MTMRs (MTMRs 5, 9-13) contain naturally occurring substitutions of residues required for catalysis by PTP family enzymes. Although these proteins are predicted to be enzymatically inactive, they are thought to function as antagonists of endogenous phosphatase activity or interaction modules. Most MTMRs contain a N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, a PTP domain (which may be active or inactive), a SET-interaction domain, and a C-terminal coiled-coil region. In addition some members contain DENN domain N-terminal to the PH-GRAM domain and FYVE, PDZ, and PH domains C-terminal to the coiled-coil region. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. The PH domain family possesses multiple functions including the ability to bind phosphoinositides via its beta1/beta2, beta3/beta4, and beta6/beta7 connecting loops and to other proteins. However, no phosphoinositide binding sites have been found for the MTMRs to date.	101
270152	cd13344	PH-GRAM_MTMR7	Myotubularin (MTM) related 7 protein (MTMR7) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. MTMR7 is a member of the myotubularin dual specificity protein phosphatase gene family. MTMR7 binds to phosphoinositide lipids through its PH-GRAM domain and can hydrolyze phosphatidylinositol(3)-phosphate and phosphatidylinositol(3,5)-bisphosphate. MTMR7 interacts with MTMR6, MTMR8 and MTMR9. MTMR7 contains a N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, an active PTP domain, a SET-interaction domain, and a C-terminal coiled-coil region. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. Mutations in this family cause the human neuromuscular disorders myotubular myopathy and type 4B Charcot-Marie-Tooth syndrome. 6 of the 13 MTMRs (MTMRs 5, 9-13) contain naturally occurring substitutions of residues required for catalysis by PTP family enzymes. Although these proteins are predicted to be enzymatically inactive, they are thought to function as antagonists of endogenous phosphatase activity or interaction modules. Most MTMRs contain a N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, a PTP domain (which may be active or inactive), a SET-interaction domain, and a C-terminal coiled-coil region. In addition some members contain DENN domain N-terminal to the PH-GRAM domain and FYVE, PDZ, and PH domains C-terminal to the coiled-coil region. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. The PH domain family possesses multiple functions including the ability to bind phosphoinositides via its beta1/beta2, beta3/beta4, and beta6/beta7 connecting loops and to other proteins. However, no phosphoinositide binding sites have been found for the MTMRs to date.	103
270153	cd13345	PH-GRAM_MTMR8	Myotubularin (MTM) related 8 protein (MTMR8) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. MTMR8 is a member of the myotubularin dual specificity protein phosphatase gene family. MTMR8 binds to phosphoinositide lipids through its PH-GRAM domain. MTMR8 can self-associate and interacts with MTMR6, MTMR7 and MTMR9. MTMR8 contains a N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, an active PTP domain, a SET-interaction domain, and a C-terminal coiled-coil region. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. Mutations in this family cause the human neuromuscular disorders myotubular myopathy and type 4B Charcot-Marie-Tooth syndrome. 6 of the 13 MTMRs (MTMRs 5, 9-13) contain naturally occurring substitutions of residues required for catalysis by PTP family enzymes. Although these proteins are predicted to be enzymatically inactive, they are thought to function as antagonists of endogenous phosphatase activity or interaction modules. Most MTMRs contain a N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, a PTP domain (which may be active or inactive), a SET-interaction domain, and a C-terminal coiled-coil region. In addition some members contain DENN domain N-terminal to the PH-GRAM domain and FYVE, PDZ, and PH domains C-terminal to the coiled-coil region. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. The PH domain family possesses multiple functions including the ability to bind phosphoinositides via its beta1/beta2, beta3/beta4, and beta6/beta7 connecting loops and to other proteins. However, no phosphoinositide binding sites have been found for the MTMRs to date.	103
270154	cd13346	PH-GRAM_MTMR10	Myotubularian (MTM) related 10 protein (MTMR10) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. MTMR10 is a catalytically inactive phosphatase that plays a role as an adapter for the phosphatase myotubularin to regulate myotubularin intracellular location. It contains a Glu residue instead of a conserved Cys residue in the dsPTPase catalytic loop which renders it catalytically inactive as a phosphatase. MTMR10 contains an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, an inactive PTP domain, and a SET interaction domain. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. Mutations in this family cause the human neuromuscular disorders myotubular myopathy and type 4B Charcot-Marie-Tooth syndrome. 6 of the 13 MTMRs (MTMRs 5, 9-13) contain naturally occurring substitutions of residues required for catalysis by PTP family enzymes. Although these proteins are predicted to be enzymatically inactive, they are thought to function as antagonists of endogenous phosphatase activity or interaction modules. Most MTMRs contain an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, a PTP domain (which may be active or inactive), a SET-interaction domain, and a C-terminal coiled-coil region. In addition, some members contain a DENN domain N-terminal to the PH-GRAM domain and FYVE, PDZ, and PH domains C-terminal to the coiled-coil region. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. The PH domain family possesses multiple functions including the ability to bind phosphoinositides via its beta1/beta2, beta3/beta4, and beta6/beta7 connecting loops and to other proteins. 
However, no phosphoinositide binding sites have been found for the MTMRs to date.	177
275418	cd13348	PH-GRAM_MTMR12	Myotubularian (MTM) related 12 protein (MTMR12) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. MTMR12 is a catalytically inactive phosphatase that plays a role as an adapter for the phosphatase myotubularin to regulate myotubularin intracellular location. It contains a Glu residue instead of a conserved Cys residue in the dsPTPase catalytic loop which renders it catalytically inactive as a phosphatase. MTMR12 contains an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, an inactive PTP domain, a SET interaction domain, and a C-terminal coiled-coil domain. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. Mutations in this family cause the human neuromuscular disorders myotubular myopathy and type 4B Charcot-Marie-Tooth syndrome. 6 of the 13 MTMRs (MTMRs 5, 9-13) contain naturally occurring substitutions of residues required for catalysis by PTP family enzymes. Although these proteins are predicted to be enzymatically inactive, they are thought to function as antagonists of endogenous phosphatase activity or interaction modules. Most MTMRs contain an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, a PTP domain (which may be active or inactive), a SET-interaction domain, and a C-terminal coiled-coil region. In addition, some members contain a DENN domain N-terminal to the PH-GRAM domain and FYVE, PDZ, and PH domains C-terminal to the coiled-coil region. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. The PH domain family possesses multiple functions including the ability to bind phosphoinositides via its beta1/beta2, beta3/beta4, and beta6/beta7 connecting loops and to other proteins. 
However, no phosphoinositide binding sites have been found for the MTMRs to date.	178
270156	cd13349	PH-GRAM1_TBC1D8	TBC1 domain family member 8 (TBC1D8; also called Vascular Rab-GAP/TBC-containing protein) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain, repeat 1. TBC1D8 may act as a GTPase-activating protein for Rab family protein(s). TBC1D8 contains an N-terminal PH-GRAM domain and a C-terminal Rab-GTPase-TBC (Tre-2, BUB2p, and Cdc16p) domain. This cd contains the first repeat of the PH-GRAM domain. The GRAM domain is found in glucosyltransferases, myotubularins and other putative membrane-associated proteins. The GRAM domain is part of a larger motif with a pleckstrin homology (PH) domain fold.	99
275419	cd13350	PH-GRAM1_TBC1D8B	TBC1 domain family member 8B (TBC1D8B) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain, repeat 1. TBC1D8B may act as a GTPase-activating protein for Rab family protein(s). TBC1D8B contains an N-terminal PH-GRAM domain and a C-terminal Rab-GTPase-TBC (Tre-2, BUB2p, and Cdc16p) domain. This cd contains the first repeat of the PH-GRAM domain. The GRAM domain is found in glucosyltransferases, myotubularins and other putative membrane-associated proteins. The GRAM domain is part of a larger motif with a pleckstrin homology (PH) domain fold.	99
275420	cd13351	PH-GRAM1_TCB1D9_TCB1D9B	TBC1 domain family members 9 and 9B (TBC1D9 and TBC1D9B) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain, repeat 1. TBC1D9 and TBC1D9B may act as GTPase-activating proteins for Rab family protein(s). TBC1D9 and TBC1D9B contain two N-terminal PH-GRAM domains and a C-terminal Rab-GTPase-TBC (Tre-2, BUB2p, and Cdc16p) domain. This cd contains the first repeat of the PH-GRAM domain. The GRAM domain is found in glucosyltransferases, myotubularins and other putative membrane-associated proteins. The GRAM domain is part of a larger motif with a pleckstrin homology (PH) domain fold.	99
270159	cd13352	PH-GRAM2_TBC1D8B	TBC1 domain family member 8B (TBC1D8B) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain, repeat 2. TBC1D8B may act as a GTPase-activating protein for Rab family protein(s). TBC1D8B contains an N-terminal PH-GRAM domain and a C-terminal Rab-GTPase-TBC (Tre-2, BUB2p, and Cdc16p) domain. This cd contains the second repeat of the PH-GRAM domain. The GRAM domain is found in glucosyltransferases, myotubularins and other putative membrane-associated proteins. The GRAM domain is part of a larger motif with a pleckstrin homology (PH) domain fold.	93
270160	cd13353	PH-GRAM2_TBC1D8	TBC1 domain family member 8 (TBC1D8; also called Vascular Rab-GAP/TBC-containing protein) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain, repeat 2. TBC1D8 may act as a GTPase-activating protein for Rab family protein(s). TBC1D8 contains two N-terminal PH-GRAM domains and a C-terminal Rab-GTPase-TBC (Tre-2, BUB2p, and Cdc16p) domain. This cd contains the second repeat of the PH-GRAM domain. The GRAM domain is found in glucosyltransferases, myotubularins and other putative membrane-associated proteins. The GRAM domain is part of a larger motif with a pleckstrin homology (PH) domain fold.	96
270161	cd13354	PH-GRAM2_TCB1D9_TCB1D9B	TBC1 domain family members 9 and 9B (TBC1D9 and TBC1D9B) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain, repeat 2. TBC1D9 and TBC1D9B may act as GTPase-activating proteins for Rab family protein(s). TBC1D9 and TBC1D9B contain two N-terminal PH-GRAM domains and a C-terminal Rab-GTPase-TBC (Tre-2, BUB2p, and Cdc16p) domain. This cd contains the second repeat of the PH-GRAM domain. The GRAM domain is found in glucosyltransferases, myotubularins and other putative membrane-associated proteins. The GRAM domain is part of a larger motif with a pleckstrin homology (PH) domain fold.	97
270162	cd13355	PH-GRAM_MTM1	Myotubularian 1 protein (MTM1) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. MTM1 is a member of the myotubularin protein phosphatase gene family. It is required for muscle cell differentiation and mutations in this gene have been identified as being responsible for X-linked myotubular myopathy, a severe congenital muscle disorder characterized by defective muscle cell development. Since its initial discovery, there have been an additional 14 myotubularin-related proteins identified. MTM1 binds to phosphoinositide lipids through its PH-GRAM domain, and can hydrolyze phosphatidylinositol(3)-phosphate and phosphatidylinositol(3,5)-biphosphate in vitro. The protein can self-associate and form heteromers with MTMR12. MTM1 contains a N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, an active PTP domain, a SET-interaction domain, and a C-terminal coiled-coil region. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. Mutations in this family cause the human neuromuscular disorders myotubular myopathy and type 4B Charcot-Marie-Tooth syndrome. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. All MTMRs contain a N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, a PTP domain (which may be active or inactive), a SET-interaction domain, and a C-terminal coiled-coil region. In addition some members contain DENN domain N-terminal to the PH-GRAM domain and FYVE and PH domains C-terminal to the coiled-coil region.	100
270163	cd13356	PH-GRAM_MTMR2_mammal-like	Myotubularian related 2 protein (MTMR2) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. MTMR2 is a member of the myotubularin protein phosphatase gene family. MTMR2 binds to phosphoinositide lipids through its PH-GRAM domain, and can hydrolyze phosphatidylinositol(3)-phosphate and phosphatidylinositol(3,5)-biphosphate in vitro. Mutations in MTMR2 are a cause of Charcot-Marie-Tooth disease type 4B, an autosomal recessive demyelinating neuropathy. The protein can self-associate and form heteromers with MTMR5 and MTMR12. MTMR2 contains a N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, an active PTP domain, a SET-interaction domain, a coiled-coil region, and a C-terminal PDZ domain. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. Mutations in this family cause the human neuromuscular disorders myotubular myopathy and type 4B Charcot-Marie-Tooth syndrome. 6 of the 13 MTMRs (MTMRs 5, 9-13) contain naturally occurring substitutions of residues required for catalysis by PTP family enzymes. Although these proteins are predicted to be enzymatically inactive, they are thought to function as antagonists of endogenous phosphatase activity or interaction modules. Most MTMRs contain a N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, a PTP domain (which may be active or inactive), a SET-interaction domain, and a C-terminal coiled-coil region. In addition some members contain DENN domain N-terminal to the PH-GRAM domain and FYVE, PDZ, and PH domains C-terminal to the coiled-coil region. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. 
The PH domain family possesses multiple functions including the ability to bind phosphoinositides via its beta1/beta2, beta3/beta4, and beta6/beta7 connecting loops and to other proteins. However, no phosphoinositide binding sites have been found for the MTMRs to date. Members in this cd include mammals, chickens, anoles, human body lice, and aphids.	115
270164	cd13357	PH-GRAM_MTMR2_insect-like	Myotubularian related 2 protein (MTMR2) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. MTMR2 is a member of the myotubularin protein phosphatase gene family. MTMR2 binds to phosphoinositide lipids through its PH-GRAM domain, and can hydrolyze phosphatidylinositol(3)-phosphate and phosphatidylinositol(3,5)-biphosphate in vitro. Mutations in MTMR2 are a cause of Charcot-Marie-Tooth disease type 4B, an autosomal recessive demyelinating neuropathy. The protein can self-associate and form heteromers with MTMR5 and MTMR12. MTMR2 contains a N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, an active PTP domain, a SET-interaction domain, a coiled-coil region, and a C-terminal PDZ domain. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. Mutations in this family cause the human neuromuscular disorders myotubular myopathy and type 4B Charcot-Marie-Tooth syndrome. 6 of the 13 MTMRs (MTMRs 5, 9-13) contain naturally occurring substitutions of residues required for catalysis by PTP family enzymes. Although these proteins are predicted to be enzymatically inactive, they are thought to function as antagonists of endogenous phosphatase activity or interaction modules. Most MTMRs contain a N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, a PTP domain (which may be active or inactive), a SET-interaction domain, and a C-terminal coiled-coil region. In addition some members contain DENN domain N-terminal to the PH-GRAM domain and FYVE, PDZ, and PH domains C-terminal to the coiled-coil region. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. 
The PH domain family possesses multiple functions including the ability to bind phosphoinositides via its beta1/beta2, beta3/beta4, and beta6/beta7 connecting loops and to other proteins. However, no phosphoinositide binding sites have been found for the MTMRs to date. Members in this cd include Drosophila, sea urchins, mosquitos, bees, ticks, and anemones.	100
270165	cd13358	PH-GRAM_MTMR1	Myotubularian related 1 protein (MTMR1) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. MTMR1 is a member of the myotubularin protein phosphatase gene family. MTMR1 binds to phosphoinositide lipids through its PH-GRAM domain, and can hydrolyze phosphatidylinositol(3)-phosphate and phosphatidylinositol(3,5)-biphosphate in vitro. MTMR1 contains an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, an active PTP domain, a SET-interaction domain, a coiled-coil region, and a C-terminal PDZ domain. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. Mutations in this family cause the human neuromuscular disorders myotubular myopathy and type 4B Charcot-Marie-Tooth syndrome. 6 of the 13 MTMRs (MTMRs 5, 9-13) contain naturally occurring substitutions of residues required for catalysis by PTP family enzymes. Although these proteins are predicted to be enzymatically inactive, they are thought to function as antagonists of endogenous phosphatase activity or interaction modules. Most MTMRs contain an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, a PTP domain (which may be active or inactive), a SET-interaction domain, and a C-terminal coiled-coil region. In addition, some members contain a DENN domain N-terminal to the PH-GRAM domain and FYVE, PDZ, and PH domains C-terminal to the coiled-coil region. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. The PH domain family possesses multiple functions including the ability to bind phosphoinositides via its beta1/beta2, beta3/beta4, and beta6/beta7 connecting loops and to other proteins. However, no phosphoinositide binding sites have been found for the MTMRs to date.	100
270166	cd13359	PH_ELMO1_CED-12	Engulfment and cell motility protein 1 pleckstrin homology (PH) domain. DOCK2 (Dedicator of cytokinesis 2), a hematopoietic cell-specific, atypical GEF, controls lymphocyte migration through Rac activation. A DOCK2-ELMO1 complex is necessary for DOCK2-mediated Rac signaling. DOCK2 contains an SH3 domain at its N-terminus, followed by a lipid binding DHR1 domain, and a Rac-binding DHR2 domain at its C-terminus. ELMO1, a mammalian homolog of C. elegans CED-12, contains the N-terminal RhoG-binding region, the ELMO domain, the PH domain, and the C-terminal sequence with three PxxP motifs. The C-terminal region of ELMO1, including the Pro-rich sequence, binds the SH3-containing region of DOCK2 forming an intermolecular five-helix bundle along with the PH domain of ELMO1. Autoinhibition of ELMO1 and DOCK2 is accomplished by the interactions of the EID and EAD domains and SH3 and DHR2 domains, respectively. The interaction of DOCK2 and ELMO1 mutually relieves their autoinhibition and results in the activation of Rac1. The PH domain of ELMO1 does not bind phosphoinositides due to the absence of key binding residues. It more closely resembles the FERM domain than other PH domains. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. 
Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	126
241514	cd13360	PH_PLC_fungal	Fungal Phospholipase C (PLC) pleckstrin homology (PH) domain. Fungal PLC have mostly been characterized in the yeast Saccharomyces cerevisiae via deletion studies which resulted in a pleiotropic phenotype, with defects in growth, carbon source utilization, and sensitivity to osmotic stress and high temperature. Unlike Saccharomyces, several other fungi including Neurospora crassa, Cryphonectria parasitica, and Magnaporthe oryzae (Mo) have several PLC proteins, some of which lack a PH domain, with varied functions. MoPLC1-mediated regulation of Ca2+ level is important for conidiogenesis and appressorium formation while both MoPLC2 and MoPLC3 are required for asexual reproduction, cell wall integrity, appressorium development, and pathogenicity. The fungal PLCs in this hierarchy contain an N-terminal PH domain, an EF hand domain, a catalytic domain split into X and Y halves, and a C-terminal C2 domain. PLCs (EC 3.1.4.3) play a role in the initiation of cellular activation, proliferation, differentiation and apoptosis. They are central to inositol lipid signalling pathways, facilitating intracellular Ca2+ release and protein kinase C (PKC) activation. Specifically, PLCs catalyze the cleavage of phosphatidylinositol-4,5-bisphosphate (PIP2) and result in the release of 1,2-diacylglycerol (DAG) and inositol 1,4,5-triphosphate (IP3). These products trigger the activation of protein kinase C (PKC) and the release of Ca2+ from intracellular stores. There are fourteen kinds of mammalian phospholipase C proteins which are classified into six isotypes (beta, gamma, delta, epsilon, zeta, eta). PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. 
Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	118
270167	cd13361	PH_PLC_beta	Phospholipase C-beta (PLC-beta) pleckstrin homology (PH) domain. PLC-beta (PLCbeta) is regulated by heterotrimeric G protein-coupled receptors through their C2 domain and long C-terminal extension which forms an autoinhibitory helix. There are four isoforms: PLC-beta1-4. The PH domain of PLC-beta2 and PLC-beta3 plays a dual role, much like PLC-delta1, by binding to the plasma membrane, as well as the interaction site for the catalytic activator. However, PLC-beta binds to the lipid surface independent of PIP2. PLC-beta1 seems to play unspecified roles in cellular proliferation and differentiation. PLC-beta consists of an N-terminal PH domain, a EF hand domain, a catalytic domain split into X and Y halves, a C2 domain and a C-terminal PDZ. Members of the Rho GTPase family (e.g., Rac1, Rac2, Rac3, and cdc42) have been implicated in their activation by binding to an alternate site on the N-terminal PH domain. A basic amino acid region within the enzyme's long C-terminal tail appears to function as a Nuclear Localization Signal for import into the nucleus. PLCs (EC 3.1.4.3) play a role in the initiation of cellular activation, proliferation, differentiation and apoptosis. They are central to inositol lipid signalling pathways, facilitating intracellular Ca2+ release and protein kinase C (PKC) activation. Specificaly, PLCs catalyze the cleavage of phosphatidylinositol-4,5-bisphosphate (PIP2) and result in the release of 1,2-diacylglycerol (DAG) and inositol 1,4,5-triphosphate (IP3). These products trigger the activation of protein kinase C (PKC) and the release of Ca2+ from intracellular stores. There are fourteen kinds of mammalian phospholipase C proteins which are are classified into six isotypes (beta, gamma, delta, epsilon, zeta, eta). PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. 
They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinases, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	127
270168	cd13362	PH_PLC_gamma	Phospholipase C-gamma (PLC-gamma) pleckstrin homology (PH) domain. PLC-gamma (PLCgamma) is activated by receptor and non-receptor tyrosine kinases due to the presence of its SH2 and SH3 domains. There are two main isoforms of PLC-gamma expressed in human specimens, PLC-gamma1 and PLC-gamma2. PLC-gamma consists of an N-terminal PH domain, an EF hand domain, a catalytic domain split into X and Y halves internal to which is a PH domain split by two SH2 domains and a single SH3 domain, and a C-terminal C2 domain. Only the first PH domain is present in this hierarchy. PLCs (EC 3.1.4.3) play a role in the initiation of cellular activation, proliferation, differentiation and apoptosis. They are central to inositol lipid signalling pathways, facilitating intracellular Ca2+ release and protein kinase C (PKC) activation. Specifically, PLCs catalyze the cleavage of phosphatidylinositol-4,5-bisphosphate (PIP2) and result in the release of 1,2-diacylglycerol (DAG) and inositol 1,4,5-triphosphate (IP3). These products trigger the activation of protein kinase C (PKC) and the release of Ca2+ from intracellular stores. There are fourteen kinds of mammalian phospholipase C proteins which are classified into six isotypes (beta, gamma, delta, epsilon, zeta, eta). PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. 
Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	121
270169	cd13363	PH_PLC_delta	Phospholipase C-delta (PLC-delta) pleckstrin homology (PH) domain. The PLC-delta (PLCdelta) consists of three family members, delta 1, 2, and 3. PLC-delta1 is the most well studied. PLC-delta is activated by high calcium levels generated by other PLC family members, and functions as a calcium amplifier within the cell. PLC-delta consists of an N-terminal PH domain, a EF hand domain, a catalytic domain split into X and Y halves, and a C-terminal C2 domain. The PH domain binds PIP2 and promotes activation of the catalytic core as well as tethering the enzyme to the plasma membrane. The C2 domain has been shown to mediate calcium-dependent phospholipid binding as well. The PH and C2 domains operate in concert as a "tether and fix" apparatus necessary for processive catalysis by the enzyme. Its leucine-rich nuclear export signal (NES) in its EF hand motif, as well as a Nuclear localization signal within its linker region allow PLC-delta 1 to actively translocate into and out of the nucleus. PLCs (EC 3.1.4.3) play a role in the initiation of cellular activation, proliferation, differentiation and apoptosis. They are central to inositol lipid signalling pathways, facilitating intracellular Ca2+ release and protein kinase C (PKC) activation. Specificaly, PLCs catalyze the cleavage of phosphatidylinositol-4,5-bisphosphate (PIP2) and result in the release of 1,2-diacylglycerol (DAG) and inositol 1,4,5-triphosphate (IP3). These products trigger the activation of protein kinase C (PKC) and the release of Ca2+ from intracellular stores. There are fourteen kinds of mammalian phospholipase C proteins which are are classified into six isotypes (beta, gamma, delta, epsilon, zeta, eta). PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. 
They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	117
270170	cd13364	PH_PLC_eta	Phospholipase C-eta (PLC-eta) pleckstrin homology (PH) domain. PLC-eta (PLCeta) consists of two enzymes, PLCeta1 and PLCeta2. They hydrolyze phosphatidylinositol 4,5-bisphosphate, are more sensitive to Ca2+ than other PLC isozymes, and are involved in PKC activation in the brain and neuroendocrine systems. PLC-eta consists of an N-terminal PH domain, an EF hand domain, a catalytic domain split into X and Y halves by a variable linker, a C2 domain, and a C-terminal PDZ domain. PLCs (EC 3.1.4.3) play a role in the initiation of cellular activation, proliferation, differentiation and apoptosis. They are central to inositol lipid signalling pathways, facilitating intracellular Ca2+ release and protein kinase C (PKC) activation. Specifically, PLCs catalyze the cleavage of phosphatidylinositol-4,5-bisphosphate (PIP2) and result in the release of 1,2-diacylglycerol (DAG) and inositol 1,4,5-triphosphate (IP3). These products trigger the activation of protein kinase C (PKC) and the release of Ca2+ from intracellular stores. There are fourteen kinds of mammalian phospholipase C proteins which are classified into six isotypes (beta, gamma, delta, epsilon, zeta, eta). PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. 
Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinases, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	109
270171	cd13365	PH_PLC_plant-like	Plant-like Phospholipase C (PLC) pleckstrin homology (PH) domain. PLC-gamma (PLCgamma) was the second class of PLC discovered. PLC-gamma consists of an N-terminal PH domain, an EF hand domain, a catalytic domain split into X and Y halves internal to which is a PH domain split by two SH2 domains and a single SH3 domain, and a C-terminal C2 domain. PLCs (EC 3.1.4.3) play a role in the initiation of cellular activation, proliferation, differentiation and apoptosis. They are central to inositol lipid signalling pathways, facilitating intracellular Ca2+ release and protein kinase C (PKC) activation. Specifically, PLCs catalyze the cleavage of phosphatidylinositol-4,5-bisphosphate (PIP2) and result in the release of 1,2-diacylglycerol (DAG) and inositol 1,4,5-trisphosphate (IP3). These products trigger the activation of protein kinase C (PKC) and the release of Ca2+ from intracellular stores. There are fourteen kinds of mammalian phospholipase C proteins which are classified into six isotypes (beta, gamma, delta, epsilon, zeta, eta). This cd contains PLC members from fungi and plants. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	115
270172	cd13366	PH_ABR	Active breakpoint cluster region-related protein pleckstrin homology (PH) domain. The ABR protein contains multiple domains including a RhoGEF domain, a PH domain, a C1 domain, a C2 domain, and a C-terminal RhoGAP domain. It is related to a slightly larger protein, BCR, which is structurally similar, but has an additional N-terminal kinase domain. ABR has GAP activity for both Rac and Cdc42. It promotes the exchange of RAC or CDC42-bound GDP by GTP, thereby activating them. It is highly enriched in the brain and found to a lesser extent in heart, lung and muscle. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	185
270173	cd13367	PH_BCR_vertebrate	Breakpoint Cluster Region-related pleckstrin homology (PH) domain. The BCR gene is one of the two genes in the BCR-ABL complex, which is associated with the Philadelphia chromosome, a product of a reciprocal translocation between chromosomes 22 and 9. BCR is a GTPase-activating protein (GAP) for RAC1 (primarily) and CDC42. The Dbl region of BCR has the most RhoGEF activity for Cdc42, and less activity towards Rac and Rho. Since BCR possesses both GAP and GEF activities, it may function to temporally regulate the activity of these GTPases. It also displays serine/threonine kinase activity. The BCR protein contains multiple domains including an N-terminal kinase domain, a RhoGEF domain, a PH domain, a C1 domain, a C2 domain, and a C-terminal RhoGAP domain. This hierarchy is composed of vertebrate BCRs. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	194
270174	cd13368	PH_BCR_arthropod	Breakpoint Cluster Region-related pleckstrin homology (PH) domain. The BCR gene is one of the two genes in the BCR-ABL complex, which is associated with the Philadelphia chromosome, a product of a reciprocal translocation between chromosomes 22 and 9. BCR is a GTPase-activating protein (GAP) for RAC1 (primarily) and CDC42. The Dbl region of BCR has the most RhoGEF activity for Cdc42, and less activity towards Rac and Rho. Since BCR possesses both GAP and GEF activities, it may function to temporally regulate the activity of these GTPases. It also displays serine/threonine kinase activity. The BCR protein contains multiple domains including an N-terminal kinase domain, a RhoGEF domain, a PH domain, a C1 domain, a C2 domain, and a C-terminal RhoGAP domain. This hierarchy is composed of arthropod BCRs. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	180
270175	cd13369	PH_RASAL1	Ras-GTPase-activating-like protein pleckstrin homology (PH) domain. RASAL1 is a member of the GAP1 family of GTPase-activating proteins, along with GAP1(m), GAP1(IP4BP) and CAPRI. RASAL1 contains two fully conserved C2 domains, a PH domain, a RasGAP domain, and a BTK domain. Its catalytic GAP domain has dual RasGAP and RapGAP activities, while its C2 domains bind phospholipids in the presence of Ca2+. Both CAPRI and RASAL1 are calcium-activated RasGAPs that inactivate Ras at the plasma membrane, thereby enhancing the weak intrinsic GTPase activity of RAS proteins, resulting in the inactive GDP-bound form of RAS and allowing control of cellular proliferation and differentiation. CAPRI and RASAL1 differ in that CAPRI is an amplitude sensor while RASAL1 senses calcium oscillations. This difference between them resides not in their C2 domains, but in their PH domains, leading to speculation that this might reflect an association with either phosphoinositides and/or proteins. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	138
241521	cd13370	PH_GAP1m_mammal-like	GTPase activating protein 1 m pleckstrin homology (PH) domain. GAP1(m) (also called RASA2/RAS p21 protein activator (GTPase activating protein) 2) is a member of the GAP1 family of GTPase-activating proteins, along with RASAL1, GAP1(IP4BP), and CAPRI. With the notable exception of GAP1(m), they all possess an arginine finger-dependent GAP activity on the Ras-related protein Rap1. GAP1(m) contains two C2 domains, a PH domain, a RasGAP domain, and a BTK domain. Its C2 domains, like those of GAP1IP4BP, do not contain the C2 motif that is known to be required for calcium-dependent phospholipid binding. GAP1(m) is regulated by the binding of its PH domains to phosphoinositides, PIP3 (phosphatidylinositol 3,4,5-trisphosphate). It suppresses RAS, enhancing the weak intrinsic GTPase activity of RAS proteins resulting in the inactive GDP-bound form of RAS, allowing control of cellular proliferation and differentiation. GAP1(m) binds inositol tetrakisphosphate (IP4). PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	133
241522	cd13371	PH_GAP1_mammal-like	GAP1(IP4BP) pleckstrin homology (PH) domain. GAP1 (also called IP4BP, RASA3/Ras GTPase-activating protein 3, and RAS p21 protein activator (GTPase activating protein) 3/GAPIII/MGC46517/MGC47588) is a member of the GAP1 family of GTPase-activating proteins, along with RASAL1, GAP1(m), and CAPRI. With the notable exception of GAP1(m), they all possess an arginine finger-dependent GAP activity on the Ras-related protein Rap1. GAP1(IP4BP) contains two C2 domains, a PH domain, a RasGAP domain, and a BTK domain. Its C2 domains, like those of GAP1M, do not contain the C2 motif that is known to be required for calcium-dependent phospholipid binding. GAP1(IP4BP) is regulated by the binding of its PH domains to phosphoinositides, PIP3 (phosphatidylinositol 3,4,5-trisphosphate) and PIP2 (phosphatidylinositol 4,5-bisphosphate). It suppresses RAS, enhancing the weak intrinsic GTPase activity of RAS proteins resulting in the inactive GDP-bound form of RAS, allowing control of cellular proliferation and differentiation. GAP1(IP4BP) binds tyrosine-protein kinase, HCK. Members here include humans, chickens, frogs, and fish. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. 
Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	125
241523	cd13372	PH_CAPRI	Ca2+ promoted Ras inactivator pleckstrin homology (PH) domain. CAPRI (also called RASA4/RAS p21 protein activator (GTPase activating protein) 4/GAPL/FLJ59070/KIAA0538/MGC131890) is a member of the GAP1 family of GTPase-activating proteins. CAPRI contains two fully conserved C2 domains, a PH domain, a RasGAP domain, and a BTK domain. Its catalytic GAP domain has dual RasGAP and RapGAP activities, while its C2 domains bind phospholipids in the presence of Ca2+. Both CAPRI and RASAL are calcium-activated RasGAPs that inactivate Ras at the plasma membrane, thereby enhancing the weak intrinsic GTPase activity of RAS proteins, resulting in the inactive GDP-bound form of RAS and allowing control of cellular proliferation and differentiation. CAPRI and RASAL differ in that CAPRI is an amplitude sensor while RASAL senses calcium oscillations. This difference between them resides not in their C2 domains, but in their PH domains, leading to speculation that this might reflect an association with either phosphoinositides and/or proteins. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	140
270176	cd13373	PH_nGAP	Neuronal growth-associated proteins Pleckstrin homology (PH) domain. nGAP (also called RASAL2/RAS protein activator like-2) is a member of the RasSynGAP family along with DOC-2/DAB2-interacting protein (DAB2IP) and synaptic RasGAP (SynGAP). nGAPs are growth cone markers found in multiple types of neurons. There are many nGAPs including Cap1 (Adenylate cyclase-associated protein 1), Capzb (Capping protein (actin filament) muscle Z-line, beta), Clptm1 (Cleft lip and palate associated transmembrane protein 1), Cotl1 (Coactosin-like 1), Crmp1 (Collapsin response mediator protein 1), Cyfip1 (Cytoplasmic FMR1 interacting protein 1), Fabp7 (Fatty acid binding protein 7, brain), Farp2 (FERM, RhoGEF and pleckstrin domain protein 2), Gap43 (Growth associated protein 43), Gnao1 (Guanine nucleotide binding protein (G protein), alpha activating activity polypeptide O), Gnai2 (Guanine nucleotide binding protein (G protein), alpha inhibiting 2), Pacs1 (Phosphofurin acidic cluster sorting protein 1), Rtn1 (Reticulon 1), Sept2 (Septin 2), Snap25 (Synaptosomal-associated protein 25), Strap (Serine/threonine kinase receptor associated protein), Stx7 (Syntaxin 7), and Tmod2 (Tropomodulin 2). PH domains are only found in eukaryotes. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. 
PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	138
270177	cd13374	PH_RASAL3	RAS protein activator like-3 Pleckstrin homology (PH) domain. RASAL3 is thought to be a Ras GTPase-activating protein. It is involved in positive regulation of Ras GTPase activity and of small GTPase mediated signal transduction as well as negative regulation of Ras protein signal transduction. It contains a PH domain, a C2 domain, and a Ras-GAP domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	146
270178	cd13375	PH_SynGAP	Synaptic Ras-GTPase activating protein Pleckstrin homology (PH) domain. SynGAP is a member of the RasSynGAP family along with DOC-2/DAB2-interacting protein (DAB2IP) and neuronal growth-associated protein (nGAP/RASAL2). SynGAP, a neuronal Ras-GAP, has been shown to display both Ras-GAP activity and Ras-related protein (Rap)-GAP activity. Saccharomyces cerevisiae Bud2 and GAP1 members CAPRI (Ca2+-promoted Ras inactivator) and RASAL (Ras-GTPase-activating-like protein) also possess this dual activity. Human DOC-2/DAB2-interacting protein (DAB2IP) is encoded by a tumor suppressor gene and is a newly recognized member of the Ras-GTPase-activating family. Members here include mammals, amphibians, and bony fish. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	189
270179	cd13376	PH_DAB2IP	DOC-2/Disabled homolog 2-interacting protein Pleckstrin homology (PH) domain. DAB2IP (also called AIP1/ASK1-interacting protein-1 and DIP1/2) is a member of the RasSynGAP family along with Synaptic Ras-GTPase activating protein (SynGAP) and neuronal growth-associated protein (nGAP/RASAL2). DAB2IP is a critical component of many signal transduction pathways mediated by Ras and tumor necrosis factors including apoptosis pathways, and it is involved in the formation of many types of tumors. DAB2IP participates in regulation of gene expression and pluripotency of cells. Human DAB2IP is expressed in the adrenal gland, pancreas, endocardium, stomach, kidney, testis, small intestine, liver, trachea, skin, ovary, endometrium, lung, esophagus and bladder. No expression was observed in the cerebrum, parotid gland, thymus, thyroid gland and spleen. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	182
241529	cd13378	PH_RhoGAP2	Rho GTPase activating protein 2 Pleckstrin homology (PH) domain. RhoGAP2 (also called RhoGap22 or ArhGap22) are involved in cell polarity, cell morphology and cytoskeletal organization. They activate a GTPase belonging to the RAS superfamily of small GTP-binding proteins. The encoded protein is insulin-responsive, is dependent on the kinase Akt, and requires the Akt-dependent 14-3-3 binding protein which binds sequentially to two serine residues resulting in regulation of cell motility. Members here contain an N-terminal PH domain followed by a RhoGAP domain and either a BAR or TATA Binding Protein (TBP) Associated Factor 4 (TAF4) domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	116
241530	cd13379	PH_RhoGap24	Rho GTPase activating protein 24 Pleckstrin homology (PH) domain. RhoGap24 (also called ARHGAP24, p73RhoGAp, and Filamin-A-associated RhoGAP), like other RhoGAPs, is involved in cell polarity, cell morphology and cytoskeletal organization. These RhoGAPs act as GTPase activators for the Rac-type GTPases by converting them to an inactive GDP-bound state. They control actin remodeling by inactivating Rac downstream of Rho, which suppresses leading edge protrusion and promotes cell retraction to achieve cellular polarity, and they are able to suppress RAC1 and CDC42 activity in vitro. Overexpression of these proteins induces cell rounding with partial or complete disruption of actin stress fibers and formation of membrane ruffles, lamellipodia, and filopodia. Members here contain an N-terminal PH domain followed by a RhoGAP domain and either a BAR or TATA Binding Protein (TBP) Associated Factor 4 (TAF4) domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. 
PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	114
270180	cd13380	PH_Skap1	Src kinase-associated phosphoprotein 1 Pleckstrin homology (PH) domain. Adaptor protein Skap1 (also called Skap55/Src kinase-associated phosphoprotein of 55 kDa) and its partner, ADAP (adhesion and degranulation promoting adapter protein) help reorganize the cytoskeleton and/or promote integrin-mediated adhesion upon immunoreceptor activation. Skap1 is also involved in T Cell Receptor (TCR)-induced RapL-Rap1 complex formation and LFA-1 activation. Skap1 has an N-terminal coiled-coil conformation which is proposed to be involved in homodimer formation, a central PH domain and a C-terminal SH3 domain that associates with ADAP. The Skap1 PH domain plays a role in controlling integrin function via recruitment of ADAP-SKAP complexes to integrins as well as in controlling the ability of ADAP to interact with the CBM signalosome and regulate NF-kappaB. SKAP1 is necessary for RapL binding to membranes in a PH domain-dependent manner and the PI3K pathway. Skap adaptor proteins couple receptors to cytoskeletal rearrangements. Skap55/Skap1, Skap2, and Skap-homology (Skap-hom) have an N-terminal coiled-coil conformation, a central PH domain and a C-terminal SH3 domain. Their PH domains bind 3'-phosphoinositides and also act directly on targets, as in Skap55, where the PH domain directly affects integrin regulation by ADAP and NF-kappaB activation, or in Skap-hom, where the dimerization and PH domains comprise a 3'-phosphoinositide-gated molecular switch that controls ruffle formation. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. 
PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	106
270181	cd13381	PH_Skap-hom_Skap2	Src kinase-associated phosphoprotein homolog and Skap 2 Pleckstrin homology (PH) domain. Adaptor protein Skap-hom, a homolog of Skap55, which interacts with actin and with ADAP (adhesion and degranulation promoting adapter protein), undergoes tyrosine phosphorylation in response to plating of bone marrow-derived macrophages on fibronectin. Skap-hom has an N-terminal coiled-coil conformation that is involved in homodimer formation, a central PH domain and a C-terminal SH3 domain that associates with ADAP. The Skap-hom PH domain regulates intracellular targeting; its interaction with the DM domain inhibits Skap-hom actin-based ruffles in macrophages and its binding to 3'-phosphoinositides reverses this autoinhibition. The Skap-hom PH domain binds PI[3,4]P2 and PI[3,4,5]P3, but not PI[3]P, PI[5]P, or PI[4,5]P2. Skap2 is a downstream target of Heat shock transcription factor 4 (HSF4) and functions in the regulation of actin reorganization during lens differentiation. It is thought that SKAP2 anchors the complex of tyrosine kinase adaptor protein 2 (NCK2)/focal adhesion to fibroblast growth factor receptors at the lamellipodium in lens epithelial cells. Skap2 has an N-terminal coiled-coil conformation which interacts with the SH2 domain of NCK2, a central PH domain and a C-terminal SH3 domain that associates with ADAP (adhesion and degranulation promoting adapter protein)/FYB (the Fyn binding protein). Skap2 PH domain binds to membrane lipids. Skap adaptor proteins couple receptors to cytoskeletal rearrangements. Src kinase-associated phosphoprotein of 55 kDa (Skap55)/Src kinase-associated phosphoprotein 1 (Skap1), Skap2, and Skap-hom have an N-terminal coiled-coil conformation, a central PH domain and a C-terminal SH3 domain. 
Their PH domains bind 3'-phosphoinositides and also directly affect targets: in Skap55 the PH domain directly affects integrin regulation by ADAP and NF-kappaB activation, and in Skap-hom the dimerization and PH domains comprise a 3'-phosphoinositide-gated molecular switch that controls ruffle formation. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	106
270182	cd13382	PH_OCRL1	oculocerebrorenal syndrome of Lowe 1 Pleckstrin homology-like domain. OCRL1 (also called INPP5F, LOCR, NPHL2, or phosphatidylinositol polyphosphate 5-phosphatase) hydrolyzes phosphatidylinositol 4,5-bisphosphate (PtdIns(4,5)P2) and the signaling molecule phosphatidylinositol 1,4,5-trisphosphate (PtdIns(1,4,5)P3), and thereby modulates cellular signaling events. It interacts with APPL1, FAM109A and FAM109B, and several Rab GTPases, which might both target it to specific membranes and stimulate its phosphatase activity. OCRL1 contains a PH domain and a Rho-GAP domain. Patients with Lowe syndrome suffer primarily from congenital cataracts, neonatal hypotonia, intellectual disability and Fanconi syndrome. Mutations in OCRL are also found in a subset of patients with type 2 Dent disease, who selectively suffer from renal proximal tubular dysfunction. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	105
270183	cd13383	PH_OCRL2	oculocerebrorenal syndrome of Lowe 2 Pleckstrin homology-like domain. OCRL2 (also called IPNNB5, inositol polyphosphate-5-phosphatase, phosphoinositide 5-phosphatase, 5PTase, or type II inositol-1,4,5-trisphosphate 5-phosphatase) hydrolyzes phosphatidylinositol 4,5-bisphosphate (PtdIns(4,5)P2) and the signaling molecule phosphatidylinositol 1,4,5-trisphosphate (PtdIns(1,4,5)P3), and thereby modulates cellular signaling events. It interacts with APPL1, FAM109A and FAM109B, and several Rab GTPases, which might both target it to specific membranes and stimulate its phosphatase activity. OCRL2 contains a PH domain and a Rho-GAP domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	108
241535	cd13384	PH_Gab2_2	Grb2-associated binding protein family pleckstrin homology (PH) domain. The Gab subfamily includes several Gab proteins, Drosophila DOS and C. elegans SOC-1. They are scaffolding adaptor proteins, which possess N-terminal PH domains and a C-terminus with proline-rich regions and multiple phosphorylation sites. Following activation of growth factor receptors, Gab proteins are tyrosine phosphorylated and activate PI3K, which generates 3-phosphoinositide lipids. By binding to these lipids via the PH domain, Gab proteins remain in proximity to the receptor, leading to further signaling. While not all Gab proteins depend on the PH domain for recruitment, it is required for Gab activity. Members here include insect, nematode, and crustacean Gab2s. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	115
270184	cd13385	PH_Gab3	Grb2-associated binding protein 3 pleckstrin homology (PH) domain. The Gab subfamily includes several Gab proteins, Drosophila DOS and C. elegans SOC-1. They are scaffolding adaptor proteins, which possess N-terminal PH domains and a C-terminus with proline-rich regions and multiple phosphorylation sites. Following activation of growth factor receptors, Gab proteins are tyrosine phosphorylated and activate PI3K, which generates 3-phosphoinositide lipids. By binding to these lipids via the PH domain, Gab proteins remain in proximity to the receptor, leading to further signaling. While not all Gab proteins depend on the PH domain for recruitment, it is required for Gab activity. The members in this cd include the Gab1, Gab2, and Gab3 proteins. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	125
275421	cd13386	PH1_FGD2	FYVE, RhoGEF and PH domain containing/faciogenital dysplasia protein 2, N-terminal Pleckstrin homology (PH) domain. In general, FGDs have a RhoGEF (DH) domain, followed by an N-terminal PH domain, a FYVE domain and a C-terminal PH domain. All FGDs are guanine nucleotide exchange factors that activate the Rho GTPase Cdc42, an important regulator of membrane trafficking. The RhoGEF domain is responsible for GEF catalytic activity, while the N-terminal PH domain is involved in intracellular targeting of the DH domain. Not much is known about FGD2. FGD1 is the best characterized member of the group with mutations here leading to the X-linked disorder known as faciogenital dysplasia (FGDY). PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	108
275422	cd13387	PH1_FGD3	FYVE, RhoGEF and PH domain containing/faciogenital dysplasia protein 3, N-terminal Pleckstrin homology (PH) domain. In general, FGDs have a RhoGEF (DH) domain, followed by an N-terminal PH domain, a FYVE domain and a C-terminal PH domain. All FGDs are guanine nucleotide exchange factors that activate the Rho GTPase Cdc42, an important regulator of membrane trafficking. The RhoGEF domain is responsible for GEF catalytic activity, while the N-terminal PH domain is involved in intracellular targeting of the DH domain. Both FGD1 and FGD3 are targeted by the ubiquitin ligase SCF(FWD1/beta-TrCP) upon phosphorylation of two serine residues in their DSGIDS motif and subsequently degraded by the proteasome. However, FGD1 and FGD3 induced significantly different morphological changes in HeLa Tet-Off cells: FGD1 induced long finger-like protrusions, whereas FGD3 induced broad sheet-like protrusions when the level of GTP-bound Cdc42 was significantly increased by the inducible expression of FGD3. They also reciprocally regulated cell motility when inducibly expressed in HeLa Tet-Off cells: FGD1 stimulated cell migration while FGD3 inhibited it. FGD1 and FGD3 therefore play different roles to regulate cellular functions, even though their intracellular levels are tightly controlled by the same destruction pathway through SCF(FWD1/beta-TrCP). PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. 
PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	108
275423	cd13388	PH1_FGD1-4_like	FYVE, RhoGEF and PH domain containing/faciogenital dysplasia proteins 1-4 and similar proteins, N-terminal Pleckstrin homology (PH) domain. In general, FGDs have a RhoGEF (DH) domain, followed by an N-terminal PH domain, a FYVE domain and a C-terminal PH domain. All FGDs are guanine nucleotide exchange factors that activate the Rho GTPase Cdc42, an important regulator of membrane trafficking. The RhoGEF domain is responsible for GEF catalytic activity, while the N-terminal PH domain is involved in intracellular targeting of the DH domain. Mutations in the FGD1 gene are responsible for the X-linked disorder known as faciogenital dysplasia (FGDY). Both FGD1 and FGD3 are targeted by the ubiquitin ligase SCF(FWD1/beta-TrCP) upon phosphorylation of two serine residues in their DSGIDS motif and subsequently degraded by the proteasome. They play different roles to regulate cellular functions, even though their intracellular levels are tightly controlled by the same destruction pathway. FGD4 is one of the genes associated with Charcot-Marie-Tooth neuropathy type 4 (CMT4), a group of progressive motor and sensory axonal and demyelinating neuropathies that are distinguished from other forms of CMT by autosomal recessive inheritance. Those affected have distal muscle weakness and atrophy associated with sensory loss and, frequently, pes cavus foot deformity. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. 
PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	94
275424	cd13389	PH1_FGD5_FGD6	FYVE, RhoGEF and PH domain containing/faciogenital dysplasia proteins 5 and 6, N-terminal Pleckstrin Homology (PH) domain. FGD5 regulates the pro-angiogenic action of vascular endothelial growth factor (VEGF) in vascular endothelial cells, including network formation, permeability, directional movement, and proliferation. The specific function of FGD6 is unknown. In general, FGDs have a RhoGEF (DH) domain, followed by a PH domain, a FYVE domain and a C-terminal PH domain. All FGDs are guanine nucleotide exchange factors that activate the Rho GTPase Cdc42, an important regulator of membrane trafficking. The RhoGEF domain is responsible for GEF catalytic activity, while the PH domain is involved in intracellular targeting of the DH domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	124
275425	cd13390	PH_LARG	Leukemia-associated Rho guanine nucleotide exchange factor Pleckstrin homology (PH) domain. LARG (also called RhoGEF12) belongs to regulator of G-protein signaling (RGS) domain-containing RhoGEFs that are RhoA-selective and directly activated by the Galpha12/13 family of heterotrimeric G proteins. RhoGEFs activate Rho GTPases regulating cytoskeletal structure, gene transcription, and cell migration. LARG contains an N-terminal extension, followed by Dbl homology (DH)-PH domains which bind and catalyze the exchange of GDP for GTP on RhoA in addition to an RGS domain. The active site of RhoA adopts two distinct GDP-excluding conformations among the four unique complexes in the asymmetric unit. The LARG PH domain also contains a potential protein-docking site. LARG forms a homotetramer via its DH domains. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	138
275426	cd13391	PH_PRG	PDZ Rho guanine nucleotide exchange factor Pleckstrin homology (PH) domain. PRG (also called RhoGEF11) belongs to regulator of G-protein signaling (RGS) domain-containing RhoGEFs that are RhoA-selective and directly activated by the Galpha12/13 family of heterotrimeric G proteins. RhoGEFs activate Rho GTPases regulating cytoskeletal structure, gene transcription, and cell migration. PRG contains an N-terminal PDZ domain, a regulator of G-protein signaling-like (RGSL) domain, a linker region, and C-terminal Dbl-homology (DH) and pleckstrin-homology (PH) domains which bind and catalyze the exchange of GDP for GTP on RhoA. As is the case in p115-RhoGEF, it is thought that PRG is activated by relieving autoinhibition caused by the linker region. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	142
275427	cd13392	PH_AKAP13	A-kinase anchoring protein 13 Pleckstrin homology (PH) domain. The Rho-specific GEF activity of AKAP13 (also called Brx-1, AKAP-Lbc, and proto-Lbc) mediates signaling downstream of G-protein coupled receptors and Toll-like receptor 2. It plays a role in cell growth, cell development and actin fiber formation. Protein kinase A (PKA) binds and phosphorylates AKAP13, regulating its Rho-GEF activity. Alternative splicing of this gene in humans has at least 3 transcript variants encoding different isoforms (i.e. proto-/onco-Lymphoid blast crisis, Lbc and breast cancer nuclear receptor-binding auxiliary protein, Brx) containing a dbl oncogene homology (DH) domain and PH domain which are required for full transforming activity. The DH domain is associated with guanine nucleotide exchange activation, while PH domains have multiple functions: some determine protein sub-cellular localisation via phosphoinositide interactions, while others bind protein partners. Other ligands include protein kinase C which is bound by the PH domain of AKAP13, serving to activate protein kinase D and mobilize a cardiac hypertrophy signaling pathway. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. 
Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	103
275428	cd13393	PH_ARHGEF2	Rho guanine nucleotide exchange factor 2 Pleckstrin homology (PH) domain. ARHGEF2, also called GEF-H1, acts as a guanine nucleotide exchange factor (GEF) for RhoA GTPases. It is thought to play a role in actin cytoskeleton reorganization in different tissues since its activation induces formation of actin stress fibers. ARHGEF2 contains a C1 domain followed by Dbl-homology (DH) and pleckstrin-homology (PH) domains which bind and catalyze the exchange of GDP for GTP on RhoA. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	116
240521	cd13394	Syo1_like	Fungal symportin 1 (syo1) and similar proteins. This family of eukaryotic proteins includes Saccharomyces cerevisiae Ydl063c and Chaetomium thermophilum Syo1, which mediate the co-import of two ribosomal proteins, Rpl5 and Rpl11 (which both interact with 5S rRNA) into the nucleus. Import precedes their association with rRNA and subsequent ribosome assembly in the nucleolus. The primary structure of syo1 is a mixture of Armadillo- (ARM, N-terminal part of syo1) and HEAT-repeats (C-terminal part of syo1).	597
381602	cd13399	Slt35-like	Slt35-like lytic transglycosylase. Lytic transglycosylase similar to Escherichia coli lytic transglycosylase Slt35 and Pseudomonas aeruginosa Sltb1. Lytic transglycosylase (LT) catalyzes the cleavage of the beta-1,4-glycosidic bond between N-acetylmuramic acid (MurNAc) and N-acetyl-D-glucosamine (GlcNAc) as do "goose-type" lysozymes. However, in addition to this, they also make a new glycosidic bond with the C6 hydroxyl group of the same muramic acid residue. Proteins similar to this family include the soluble and insoluble membrane-bound LTs in bacteria, the LTs in bacteriophage lambda, as well as the eukaryotic "goose-type" lysozymes (goose egg-white lysozyme; GEWL).	108
381603	cd13400	LT_IagB-like	Escherichia coli invasion protein IagB and similar proteins. Lytic transglycosylase-like protein, similar to Escherichia coli invasion protein IagB. IagB is encoded within a pathogenicity island in Salmonella enterica and has been shown to degrade polymeric peptidoglycan. IagB-like invasion proteins are implicated in the invasion of eukaryotic host cells by bacteria. Lytic transglycosylase (LT) catalyzes the cleavage of the beta-1,4-glycosidic bond between N-acetylmuramic acid (MurNAc) and N-acetyl-D-glucosamine (GlcNAc), as do "goose-type" lysozymes. However, in addition to this, they also make a new glycosidic bond with the C6 hydroxyl group of the same muramic acid residue. Members of this family resemble the soluble and insoluble membrane-bound LTs in bacteria and the LTs in bacteriophage lambda.	109
381604	cd13401	Slt70-like	70kDa soluble lytic transglycosylase (Slt70) and similar proteins. Catalytic domain of the 70kDa soluble lytic transglycosylase (LT)-like proteins, which also have an N-terminal U-shaped U-domain and a linker L-domain. LTs catalyze the cleavage of the beta-1,4-glycosidic bond between N-acetylmuramic acid (MurNAc) and N-acetyl-D-glucosamine (GlcNAc), as do "goose-type" lysozymes. However, in addition to this, they also make a new glycosidic bond with the C6 hydroxyl group of the same muramic acid residue. Proteins similar to this family include the soluble and insoluble membrane-bound LTs in bacteria and the LTs in bacteriophage lambda.	152
381605	cd13402	LT_TF-like	lytic transglycosylase-like domain of tail fiber-like proteins and similar domains. These tail fiber-like proteins are multi-domain proteins that include a lytic transglycosylase (LT) domain. Members of the LT family include the soluble and insoluble membrane-bound LTs in bacteria, the LTs in bacteriophage lambda, and the eukaryotic "goose-type" lysozymes (goose egg-white lysozyme; GEWL). LTs catalyze the cleavage of the beta-1,4-glycosidic bond between N-acetylmuramic acid (MurNAc) and N-acetyl-D-glucosamine (GlcNAc), as do "goose-type" lysozymes. However, in addition to this, they also make a new glycosidic bond with the C6 hydroxyl group of the same muramic acid residue.	117
381606	cd13403	MLTF-like	membrane-bound lytic murein transglycosylase F (MLTF) and similar proteins. This subfamily includes membrane-bound lytic murein transglycosylase F (MltF, murein lyase F) that degrades murein glycan strands. It is responsible for catalyzing the release of 1,6-anhydromuropeptides from peptidoglycan. Lytic transglycosylase catalyzes the cleavage of the beta-1,4-glycosidic bond between N-acetylmuramic acid (MurNAc) and N-acetyl-D-glucosamine (GlcNAc) as do goose-type lysozymes. However, in addition, it also makes a new glycosidic bond with the C6 hydroxyl group of the same muramic acid residue.	161
259831	cd13404	UreI_AmiS_like	UreI/AmiS family, proton-gated urea channel and putative amide transporters. This family includes UreI proton-gated urea channels as well as putative amide transporters (AmiS of the amidase gene cluster). Helicobacter pylori UreI (HpUreI), a proton-gated inner membrane urea channel, opens in acidic pH to allow urea influx to the cytoplasm. There urea is metabolized, producing NH3 and CO2, leading to buffering of the periplasm. This action is essential for the survival of H. pylori in the stomach, and has been identified as a mechanism that could be clinically targeted to prevent various illnesses associated with infection by H. pylori. UreI and the related amide channels (AmiS) appear to function as hexamers, and have 6 predicted transmembrane segments. UreI has also been shown to have a lipid "plug" in the center of the hexamer. Urea enters at the periplasmic opening of UreI and must pass 2 constriction sites, one on each side of a conserved Glu (Glu 177, H. pylori numbering), to reach the cytoplasm. Urea/thiourea selectivity is diminished by mutation of a conserved Trp to Ala or Phe in constriction site 2 (cytoplasmic). Channel functionality is greatly diminished by mutation of a conserved Trp in constriction site 1 (periplasmic) and a conserved Tyr in constriction site 2, and to a lesser extent a conserved Phe in site 1. In the cytoplasm, urease hydrolyzes urea to form ammonia and carbamate, which decomposes to carbonic acid. UreI is fully open at pH 5.0 to facilitate urea influx, but closes at neutral pH, preventing over-alkalization. Glu 177 (H. pylori numbering) is present in urea channel proteins, but absent in the related amide channels, suggesting that it plays a role in urea specificity.	167
276910	cd13405	TNFRSF14_teleost	Tumor necrosis factor receptor superfamily member 14 (TNFRSF14) in teleost; also known as herpes virus entry mediator (HVEM). This subfamily of TNFRSF14 (also known as herpes virus entry mediator or HVEM, ATAR, CD270, HVEA, LIGHTR, TR2) is found in teleosts, many of which are as yet uncharacterized. It regulates T-cell immune responses by activating inflammatory as well as inhibitory signaling pathways. HVEM acts as a receptor for the canonical TNF-related ligand LIGHT (lymphotoxin-like), which exhibits inducible expression, and competes with herpes simplex virus glycoprotein D for HVEM. It also acts as a ligand for the immunoglobulin superfamily proteins BTLA (B and T lymphocyte attenuator) and CD160, a feature distinguishing HVEM from other immune regulatory molecules, thus, creating a functionally diverse set of intrinsic and bidirectional signaling pathways. HVEM is highly expressed in the gut epithelium. Genome-wide association studies have shown that HVEM is an inflammatory bowel disease (IBD) risk gene, suggesting that HVEM could have a regulatory role influencing the regulation of epithelial barrier, host defense, and the microbiota. Mouse models have revealed that HVEM is involved in colitis pathogenesis, mucosal host defense, and epithelial immunity, thus acting as a mucosal gatekeeper with multiple regulatory functions in the mucosa. HVEM plays a critical role in both tumor progression and resistance to antitumor immune responses, possibly through direct and indirect mechanisms. It is known to be expressed in several human malignancies, including esophageal squamous cell carcinoma, follicular lymphoma, and melanoma. HVEM network may therefore be an attractive target for drug intervention. In Asian seabass, the up-regulation of differentially expressed TNFRSF14 gene has been observed.	111
276911	cd13406	TNFRSF4	Tumor necrosis factor receptor superfamily member 4 (TNFRSF4), also known as CD134 or OX40. TNFRSF4 (also known as OX40, ACT35, CD134, IMD16, TXGP1L) activates NF-kappaB through its interaction with adaptor proteins TRAF2 and TRAF5. It also promotes the expression of apoptosis inhibitors BCL2 and BCL2L1/BCL2-XL, and thus suppresses apoptosis. It is primarily expressed on activated CD4+ and CD8+ T cells, where it is transiently expressed and upregulated on the most recently antigen-activated T cells within inflammatory lesions. This makes it an attractive target to modulate immune responses, i.e. TNFRSF4 (OX40) blocking agents to inhibit adverse inflammation or agonists to enhance immune responses. An artificially created biologic fusion protein, OX40-immunoglobulin (OX40-Ig), prevents OX40 from reaching the T-cell receptors, thus reducing the T-cell response. Some single nucleotide polymorphisms (SNPs) of its natural ligand OX40 ligand (OX40L, CD252), which is also found on activated T cells, have been associated with systemic lupus erythematosus.	142
276912	cd13407	TNFRSF5	Tumor necrosis factor receptor superfamily member 5 (TNFRSF5), also known as CD40. TNFRSF5 (commonly known as CD40 and also as CDW40, p50, Bp50) is widely expressed in diverse cell types including B lymphocytes, dendritic cells, platelets, monocytes, endothelial cells, and fibroblasts. It is essential in mediating a wide variety of immune and inflammatory responses, including T cell-dependent immunoglobulin class switching, memory B cell development, and germinal center formation. Its natural immunomodulating ligand is CD40L, and a primary defect in the CD40/CD40L system is associated with X-linked hyper-IgM (XHIM) syndrome.  It is also involved in tumorigenesis; CD40 expression is significantly higher in gastric carcinomas and it is associated with the lymphatic metastasis of cancer cells and their tumor node metastasis (TNM) classification. Upregulated levels of CD40/CD40L on B cells and T cells may play an important role in the immune pathogenesis of breast cancer. Consequently, the CD40/CD40L system serves as a link between tumorigenesis, atherosclerosis, and the immune system, and offers a potential target for drug therapy for related diseases, such as cancer, atherosclerosis, diabetes mellitus, and immunological rejection.	161
276913	cd13408	TNFRSF7	Tumor necrosis factor receptor superfamily member 7 (TNFRSF7), also known as CD27. TNFRSF7 (also known as CD27, T14, S152, Tp55, S152, LPFS2) has a key role in the generation of immunological memory via effects on T-cell expansion and survival, and B cell development. It binds to ligand CD70, and plays a key role in regulating B-cell activation and immunoglobulin synthesis. CD27 transduces signals that lead to the activation of NF-kappaB and MAPK8/JNK, and mediates the signaling process through adaptor proteins TRAF2 and TRAF5. CD27-binding protein (SIVA), a pro-apoptotic protein, can bind to CD27 and may play an important role in the apoptosis induced by this receptor. The potential role of the CD27/CD70 pathway in the course of inflammatory diseases, such as arthritis, and inflammatory bowel disease, suggests that CD70 may be a target for immune intervention. The expression of CD27 and CD44 molecules correlates with the differentiation stage of B cell precursors and has been shown to have a biological significance in acute lymphoblastic leukemia.	121
276914	cd13409	TNFRSF8	Tumor necrosis factor receptor superfamily member 8 (TNFRSF8), also known as CD30. TNFRSF8 (also known as CD30, Ki-1, D1S166E) is expressed by activated T and B cells. It transduces signals that lead to the activation of NF-kappaB, mediated by the adaptor proteins TRAF2 and TRAF5. This receptor is a positive regulator of apoptosis, and has been shown to limit the proliferative potential of auto-reactive CD8 effector T cells and protect the body against autoimmunity. Two alternatively spliced transcript variants of this gene encoding distinct isoforms have been reported.  CD30 is expressed in malignant Hodgkin and Reed-Sternberg cells on the surface of extracellular vesicles, facilitating CD30-CD30L interaction between cell types. This receptor is also associated with anaplastic large cell lymphoma. It is expressed in embryonal carcinoma, but not in seminoma, making it a useful marker in distinguishing between these germ cell tumors. Since CD30 has restricted expression in normal tissues, it is an optimal target for selectively eliminating CD30-expressing neoplastic cells by specific toxin-conjugated monoclonal antibodies (mAbs).	130
276915	cd13410	TNFRSF9	Tumor necrosis factor receptor superfamily member 9 (TNFRSF9), also known as CD137. TNFRSF9 (also known as CD137, ILA, 4-1BB) plays a role in the immunobiology of human cancer where it is preferentially expressed on tumor-reactive subset of tumor-infiltrating lymphocytes. It can be expressed by activated T cells, but to a larger extent on CD8 than on CD4 T cells. In addition, CD137 expression is found on dendritic cells, follicular dendritic cells, natural killer cells, granulocytes and cells of blood vessel walls at sites of inflammation. It transduces signals that lead to the activation of NF-kappaB, mediated by the TRAF adaptor proteins. CD137 contributes to the clonal expansion, survival, and development of T cells. It can also induce proliferation in peripheral monocytes, enhance T cell apoptosis induced by TCR/CD3 triggered activation, and regulate CD28 co-stimulation to promote Th1 cell responses. CD137 is modulated by SAHA treatment in breast cancer cells, suggesting that the combination of SAHA with this receptor could be a new therapeutic approach for the treatment of tumors.	138
276916	cd13411	TNFRSF11A	Tumor necrosis factor receptor superfamily member 11A (TNFRSF11A), also known as receptor activator of nuclear factor-kappaB (RANK). TNFRSF11A (also known as RANK, FEO, OFE, ODFR, OSTS, PDB2, CD26, OPTB7, TRANCER, LOH18CR1) induces the activation of NF-kappa B and MAPK8/JNK through interactions with various TRAF adaptor proteins. This receptor and its ligand are important regulators of the interaction between T cells and dendritic cells. The receptor is also an essential mediator for osteoclast and lymph node development. Mutations at this locus have been associated with familial expansile osteolysis, autosomal recessive osteopetrosis, and Juvenile Paget's disease (JPD) of bone. Alternatively spliced transcript variants have been described for this locus. Mutation analysis may improve diagnosis, prognostication, recurrence risk assessment, and perhaps treatment selection among the monogenic disorders of RANKL/OPG/RANK activation.	163
276917	cd13412	TNFRSF11B_teleost	Tumor necrosis factor receptor superfamily 11B (TNFRSF11B) in teleost; also known as Osteoprotegerin (OPG). This subfamily of TNFRSF11B (also known as Osteoprotegerin, OPG, TR1, OCIF) is found in teleosts. It is a secreted glycoprotein that regulates bone resorption. It binds to two ligands, RANKL (receptor activator of nuclear factor kappaB ligand, also known as osteoprotegerin ligand, OPGL, TRANCE, TNF-related activation induced cytokine), a critical cytokine for osteoclast differentiation, and TRAIL (TNF-related apoptosis-inducing ligand), involved in immune surveillance. Therefore, acting as a decoy receptor for RANKL and TRAIL, OPG inhibits the regulatory effects of nuclear factor-kappaB on inflammation, skeletal, and vascular systems, and prevents TRAIL-induced apoptosis. Studies in mice counterparts suggest that this protein and its ligand also play a role in lymph-node organogenesis and vascular calcification. Circulating OPG levels have emerged as independent biomarkers of cardiovascular disease (CVD) in patients with acute or chronic heart disease. OPG has also been implicated in various inflammations and linked to diabetes and poor glycemic control. Alternatively spliced transcript variants of this gene have been reported, although their full length nature has not been determined. Genetic analysis of the Japanese rice fish medaka (Oryzias latipes) has shown that entire networks for bone formation are conserved between teleosts and mammals; enabling medaka to be used as a genetic model to monitor bone homeostasis in vivo.	129
276918	cd13413	TNFRSF12A	Tumor necrosis factor receptor superfamily member 12A (TNFRSF12A), also known as receptor fibroblast growth factor inducible 14 (FN14). TNFRSF12A (also known as receptor fibroblast growth factor inducible 14, FN14, CD266, TWEAKR) is induced by a large variety of growth factors including Fibroblast Growth Factor 1 (FGF1), FGF2, Platelet-Derived Growth Factor (PDGF), Epidermal Growth Factor (EGF) and Vascular Endothelial Growth Factor (VEGF), as well as cytokines such as tumor necrosis factor alpha (TNFalpha), Interleukin-1beta (IL-1beta), Interferon gamma (IFNgamma), and transforming growth factor-beta (TGF-beta). FN14 is expressed on a wide variety of different cell types and binds the ligand TWEAK (tumor necrosis factor-like weak inducer of apoptosis) to activate several signaling cascades through activation of NF-kappaB signaling mediated by adaptor TRAF proteins. The FN14/TWEAK pathway controls a range of cellular activities such as proliferation, differentiation, and apoptosis, and has diverse biological functions in pathological mechanisms like inflammation and fibrosis that are associated with cardiovascular diseases (CVDs). The complex is a positive regulator of cardiac hypertrophy and it has been shown that deletion of FN14 receptor protects from right heart fibrosis and dysfunction; the TWEAK/Fn14 axis could be a potential new therapeutic target for achieving cardiac protection in patients with CVDs. FN14 expression is also stimulated under specific atrophic conditions, such as denervation, immobilization, and starvation, leading to activation of TWEAK/Fn14 signaling and eventually skeletal muscle atrophy. FN14 is also a factor that promotes prostate cancer bone metastasis.	117
276919	cd13414	TNFRSF17	Tumor necrosis factor receptor superfamily member 17 (TNFRSF17), also known as B cell maturation antigen (BCMA), as well as TNFRSF13A. TNFRSF17 (also known as TNFRSF13A, B cell maturation antigen or BCMA, CD269) is predominantly expressed on terminally differentiated B cells, including multiple myeloma cells, and is important for B cell development and autoimmune response. Upon binding to its ligands, B cell activator of the TNF family (BAFF, also known as TNFSF13B, TALL-1, BLyS, zTNF4), and a proliferation inducing ligand (APRIL), BCMA activates NF-kappaB and MAPK8/JNK; it has a higher affinity for APRIL than for BAFF. This receptor may transduce signals for cell survival and proliferation by binding to TRAF1, TRAF2, and TRAF3. BCMA expression has also been linked to a number of cancers, autoimmune disorders, and infectious diseases. It has been shown that although BCMA does not play a role in normal B cell homeostasis, it is critical for the long-term survival of bone marrow plasma cells. BCMA is expressed in a number of hematologic malignancies, including both Hodgkin's and non-Hodgkin's lymphomas, as well as primary tumor cells and cell lines of multiple myeloma, playing a critical role in protecting myeloma cells from apoptosis. BCMA has been identified as a promising chimeric antigen receptor (CAR) target for multiple myeloma; CARs are synthetic transmembrane proteins used to redirect autologous T cells with a new specificity for antigens on the surface of cancer cells. BCMA may also be implicated in the context of both viral and fungal infections; peripheral blood B cells isolated from HIV+ viremic patients have increased expression levels of BCMA, and significant decreased levels are found during fungal infection with C. neoformans. BCMA has been linked to mucosal immunity; its signaling in B cells and non-B cells is important for driving protective IgA responses. Also, abnormal expression or signaling of BCMA in the gut may be relevant to diseases, such as irritable bowel disease and ulcerative colitis.	165
276920	cd13415	TNFRSF13B	Tumor necrosis factor receptor superfamily member 13B (TNFRSF13B), also known as transmembrane activator and calcium modulator and cyclophilin ligand interactor (TACI). TNFRSF13B (also known as transmembrane activator and calcium modulator and cyclophilin ligand interactor (TACI), CVID, RYZN, CD267, CVID2, TNFRSF14B) is mainly expressed on B cells and binds strongly to B cell activating factor (BAFF) and weakly to a proliferation-inducing ligand (APRIL). TACI-APRIL interactions induce B-cell differentiation, whereas TACI-BAFF ligation negatively regulates B-cell functions. In humans, TACI is expressed on memory B cells and TACI mutations are detected in 8-10% of common variable immunodeficiency (CVID) patients, making it the most frequently mutated gene for the disease. Coexisting morbidities in CVID include bronchiectasis, autoimmunity, and malignancies. However, TNFRSF13B/TACI defects alone do not result in CVID but may also be found frequently in distinct clinical phenotypes, including benign lymphoproliferation and IgG subclass deficiencies. Over-expression of TACI has been detected in multiple myeloma and thyroid carcinoma; correlative analyses suggest that TACI expression is a useful prognostic marker for lymphoma.	212
276921	cd13416	TNFRSF16	Tumor necrosis factor receptor superfamily member 16 (TNFRSF16), also known as p75 neurotrophin receptor (p75NTR) or CD271. TNFRSF16 (also known as nerve growth factor receptor (NGFR) or p75 neurotrophin receptor (p75NTR or p75(NTR)), CD271, Gp80-LNGFR) is a common receptor for both neurotrophins and proneurotrophins, and plays a diverse role in many tissues, including the nervous system. It has been shown to be expressed in various types of stem cells and has been used to prospectively isolate stem cells with different degrees of potency. p75NTR owes its signaling to the recruitment of intracellular binding proteins, leading to the activation of different signaling pathways. It binds nerve growth factor (NGF) and the complex can initiate a signaling cascade which has been associated with both neuronal apoptosis and neuronal survival of discrete populations of neurons, depending on the presence or absence of intracellular signaling molecules downstream of p75NTR (e.g. NF-kB, JNK, or p75NTR intracellular death domain). p75NTR can also bind NGF in concert with the neurotrophic tyrosine kinase receptor type 1 (TrkA) protein where it is thought to modulate the formation of the high-affinity neurotrophin binding complex. On melanoma cells, p75NTR is an immunosuppressive factor, induced by interferon (IFN)-gamma, and mediates down-regulation of melanoma antigens. It can interact with the aggregated form of amyloid beta (Abeta) peptides, and plays an important role in the etiopathogenesis of Alzheimer's disease by influencing protein tau hyper-phosphorylation. p75(NTR) is involved in the formation and progression of retina diseases; its expression is induced in retinal pigment epithelium (RPE) cells and its knockdown rescues RPE cell proliferation activity and inhibits RPE apoptosis induced by hypoxia. It can therefore be a potential therapeutic target for RPE hypoxia or oxidative stress diseases.	159
276922	cd13417	TNFRSF18	Tumor necrosis factor receptor superfamily member 18 (TNFRSF18), also known as glucocorticoid-induced tumor necrosis factor receptor family-related protein (GITR). TNFRSF18 (also known as activation-inducible TNF receptor (AITR), glucocorticoid-induced tumor necrosis factor receptor family-related protein (GITR), CD357, GITR-D) has increased expression upon T-cell activation, and is thought to play a key role in dominant immunological self-tolerance maintained by CD25(+)CD4(+) regulatory T cells. In inflammatory cells, GITR expression indicates a possible molecular link between steroid use and complicated acute sigmoid diverticulitis; increased MMP-9 expression by GITR signaling might explain morphological changes in the colonic wall in diverticulitis. Its ligand, GITRL, activates GITR which could then influence the activity of effector and regulatory T cells, participating in the development of several autoimmune and inflammatory diseases, including autoimmune thyroid disease and rheumatoid arthritis. In systemic lupus erythematosus (SLE) patients, serum GITRL levels are increased compared with healthy controls. GITR and its ligand, GITRL, are possibly involved in the pathogenesis of primary Sjogren's syndrome (pSS). GITR is inactivated during tumor progression in Multiple Myeloma (MM); restoration of GITR expression in GITR deficient MM cells leads to inhibition of MM proliferation and induction of apoptosis, thus playing a pivotal role in MM pathogenesis and disease progression. Regulatory T-cells (Tregs) in liver tumor up-regulate the expression of GITR compared with Tregs in tumor-free liver tissue and blood. Regulatory single nucleotide polymorphisms (SNPs) in the promoter regions of the TNFRSF18 gene have been identified in a group of male Gabonese individuals exposed to a wide array of parasitic diseases such as malaria, filariasis and schistosomiasis, and may serve as a basis to study parasite susceptibility in association studies.	130
276923	cd13418	TNFRSF19	Tumor necrosis factor receptor superfamily member 19 (TNFRSF19), also known as TROY. TNFRSF19 (also known as TAJ; TROY; TRADE; TAJ-alpha) is expressed in progenitor cells of the hippocampus, thalamus, and cerebral cortex and highly expressed during embryonic development. It has been shown to interact with TRAF family members, and to activate JNK signaling pathway when overexpressed in cells. It is frequently overexpressed in colorectal cancer cell lines and primary colorectal carcinomas. TNFRSF19 is a beta-catenin target gene, in mesenchymal stem cells, and also activates NF-kappaB signaling, showing that beta-catenin regulates NF-kappaB activity via TNFRSF19.  Since Wnt/beta-catenin signaling plays a crucial role in the regulation of colon tissue regeneration and the development of colon tumors, TNFRSF19 may contribute to the development of colorectal tumors. These findings define a role for death receptors DR6 and TROY in CNS-specific vascular development. TNFRSF19 has been shown to promote glioblastoma (GBM) survival signaling and therefore targeting it may increase tumor vulnerability and improve therapeutic response in glioblastoma. It may play an important role in myelin-associated inhibitory factors (MAIFs)-induced inhibition of neurite outgrowth in the postnatal central nervous system (CNS) or on axon regeneration following CNS injury.	117
276924	cd13419	TNFRSF19L	Tumor necrosis factor receptor superfamily member 19-like (TNFRSF19L), also known as receptor expressed in lymphoid tissues (RELT). TNFRSF19L (also known as receptor expressed in lymphoid tissues (RELT)) is especially abundant in hematologic tissues and can stimulate the proliferation of T-cells. It serves as a substrate for the closely related kinases, oxidative stress responsive 1 (OSR1) and STE20/SPS1-related proline/alanine-rich kinase (SPAK); RELT binds SPAK and uses it to mediate p38 and JNK activation, rather than rely on the canonical TRAF pathways for its function. RELT is capable of stimulating T-cell proliferation in the presence of CD3 signaling, which suggests its regulatory role in immune response. It interacts with phospholipid scramblase 1 (PLSCR1), an interferon-inducible protein that mediates antiviral activity against DNA and RNA viruses; PLSCR1 is a regulator of hepatitis B virus X (HBV X) protein. RELT and PLSCR1 co-localize in intracellular regions of human embryonic kidney-293 cells, with RELT over-expression appearing to alter the localization of PLSCR1.	91
276925	cd13420	TNFRSF25	Tumor necrosis factor receptor superfamily member 25 (TNFRSF25), also known as death receptor 3 (DR3). TNFRSF25 (also known as death receptor 3 (DR3), death domain receptor 3 (DDR3), apoptosis-mediating receptor, lymphocyte associated receptor of death (LARD), apoptosis inducing receptor (AIR), APO-3, translocating chain-association membrane protein (TRAMP), WSL-1, WSL-LR or TNFRSF12) is preferentially expressed in thymocytes and lymphocytes, and may play a role in regulating lymphocyte homeostasis. It has been detected in lymphocyte-rich tissues such as colon, intestine, thymus and spleen, as well as in the prostate. Various death domain containing adaptor proteins mediate the signal transduction of this receptor; it activates nuclear factor kappa-B (NFkB) and induces cell apoptosis by associating with TNFRSF1A-associated via death domain (TRADD), which is known to mediate signal transduction of tumor necrosis factor receptors. DR3 associates with tumor necrosis factor (TNF)-like cytokine 1A (TL1A also known as TNFSF15) on activated lymphocytes and induces pro-inflammatory signals; TL1A also binds decoy receptor DcR3 (also known as TNFRSF6B). DR3/DcR3/TL1A expression is increased in both serum and inflamed tissues in several autoimmune diseases, including inflammatory bowel disease (IBD), rheumatoid arthritis (RA), allergic asthma, experimental autoimmune encephalomyelitis, type 1 diabetes, ankylosing spondylitis (AS), and primary biliary cirrhosis (PBC), making modulation of TL1A-DR3 interaction a potential therapeutic target.	114
276926	cd13421	TNFRSF_EDAR	Tumor necrosis factor receptor superfamily member ectodysplasin A receptor (EDAR). Ectodysplasin A receptor (EDAR, also known as DL, ED3, ED5, ED1R, EDA3, HRM1, EDA1R, ECTD10A, ECTD10B, EDA-A1R) binds the soluble ligand ectodysplasin A and can activate the nuclear factor-kappaB, JNK, and caspase-independent cell death pathways. It is required for the development of hair, teeth, and other ectodermal derivatives. Mutations in this gene result in autosomal dominant and recessive forms of hypohidrotic ectodermal dysplasia. Patients present defects in the development of ectoderm-derived structures resulting in sparse hair, too few teeth (oligodontia), the absence or reduction in the ability to sweat as well as problems with mucous and saliva and the production and formation of pigment cells.	136
276927	cd13422	TNFRSF5_teleost	Tumor necrosis factor receptor superfamily member 5 (TNFRSF5) in teleosts; also known as CD40. TNFRSF5 (commonly known as CD40 and also as CDW40, p50, Bp50) is widely expressed in diverse cell types including B lymphocytes, dendritic cells, platelets, monocytes, endothelial cells, and fibroblasts. It is essential in mediating a wide variety of immune and inflammatory responses, including T cell-dependent immunoglobulin class switching, memory B cell development, and germinal center formation. Its natural immunomodulating ligand is CD40L, and a primary defect in the CD40/CD40L system is associated with X-linked hyper-IgM (XHIM) syndrome.  It is also involved in tumorigenesis; CD40 expression is significantly higher in gastric carcinomas and it is associated with the lymphatic metastasis of cancer cells and their tumor node metastasis (TNM) classification. Upregulated levels of CD40/CD40L on B cells and T cells may play an important role in the immune pathogenesis of breast cancer. Consequently, the CD40/CD40L system serves as a link between tumorigenesis, atherosclerosis, and the immune system, and offers a potential target for drug therapy for related diseases, such as cancer, atherosclerosis, diabetes mellitus, and immunological rejection. Salmon CD40 and CD40L are widely expressed, particularly in immune tissues, and their importance for the immune response is indicated by their relatively high expression in salmon lymphoid organs and gills.	161
276928	cd13423	TNFRSF6_teleost	Tumor necrosis factor receptor superfamily member 6 (TNFRSF6) in teleosts; also known as fas cell surface death receptor (FasR). This subfamily of TNFRSF6 (also known as fas cell surface death receptor (FasR) or Fas; APT1; CD95; FAS1; APO-1; FASTM; ALPS1A) is found in teleosts. It contains a death domain and plays a central role in the physiological regulation of programmed cell death. In humans, it has been implicated in the pathogenesis of various malignancies and diseases of the immune system. The receptor interacts with the Fas ligand (FasL), allowing the formation of a death-inducing signaling complex that includes Fas-associated death domain protein (FADD), caspase 8, and caspase 10; autoproteolytic processing of the caspases in the complex triggers a downstream caspase cascade, leading to apoptosis. This receptor has also been shown to activate NF-kappaB, MAPK3/ERK1, and MAPK8/JNK, and is involved in transducing the proliferating signals in normal diploid fibroblast and T cells. In channel catfish and the Japanese rice fish, medaka, homologs of Fas receptor (FasR), as well as FADD and caspase 8, have been identified and characterized, and likely constitute the teleost equivalent of the death-inducing signaling complex (DISC). FasL/FasR are involved in the initiation of apoptosis and suggest that mechanisms of cell-mediated cytotoxicity in teleosts are similar to those used by mammals; presumably, the mechanism of apoptosis induction via death receptors was evolutionarily established during the appearance of vertebrates.	103
276929	cd13424	TNFRSF9_teleost	Tumor necrosis factor receptor superfamily member 9 (TNFRSF9) in teleosts; also known as CD137. This subfamily of TNFRSF9 (also known as CD137, ILA, 4-1BB) is found in teleosts. CD137 plays a role in the immunobiology of human cancer where it is preferentially expressed on tumor-reactive subset of tumor-infiltrating lymphocytes. It can be expressed by activated T cells, but to a larger extent on CD8 than on CD4 T cells. In addition, CD137 expression is found on dendritic cells, follicular dendritic cells, natural killer cells, granulocytes and cells of blood vessel walls at sites of inflammation. It transduces signals that lead to the activation of NF-kappaB, mediated by the TRAF adaptor proteins. CD137 contributes to the clonal expansion, survival, and development of T cells. It can also induce proliferation in peripheral monocytes, enhance T cell apoptosis induced by TCR/CD3 triggered activation, and regulate CD28 co-stimulation to promote Th1 cell responses. CD137 is modulated by SAHA treatment in breast cancer cells, suggesting that the combination of SAHA with this receptor could be a new therapeutic approach for the treatment of tumors. Mostly, CD137 in teleosts have not been characterized.	150
240448	cd13425	Peptidase_G1_like	Peptidases of the G1 family and homologs that might lack peptidase activity. Some members of this family had been classified earlier as carboxyl peptidases insensitive to pepstatin, and the family has also been called the eqolisin family, due to the fact that the conserved catalytic dyad of the family consists of a glutamate (E) and glutamine (Q) residue. The family is found in fungi and bacteria. This family also includes homologous uncharacterized proteins that might lack peptidase activity.	195
240449	cd13426	Peptidase_G1	Peptidases of the G1 family, including scytalidoglutamic peptidase and aspergillopepsin. Some members of this family had been classified earlier as carboxyl peptidases insensitive to pepstatin, and the family has also been called the eqolisin family, due to the fact that the conserved catalytic dyad of the family consists of a glutamate (E) and glutamine (Q) residue. The family is found in fungi and bacteria.	206
240450	cd13427	YncM_like	Uncharacterized proteins similar to Bacillus subtilis YncM. Members of this family share close structural similarity with peptidases of the Peptidase_G1 family and may be homologous. They do not appear to share the peptidases' active site, though a bound sulfate ion in the single available structure suggests a functional site at a matching location.	204
259832	cd13428	UreI_AmiS	UreI/AmiS family, proton-gated urea channel and putative amide transporters. This subfamily includes UreI proton-gated urea channels as well as putative amide transporters (AmiS of the amidase gene cluster). Helicobacter pylori UreI (HpUreI), a proton-gated inner membrane urea channel, opens in acidic pH to allow urea influx to the cytoplasm. There urea is metabolized, producing NH3 and CO2, leading to buffering of the periplasm. This action is essential for the survival of H. pylori in the stomach, and has been identified as a mechanism that could be clinically targeted to prevent various illnesses associated with infection by H. pylori. UreI and the related amide channels (AmiS) appear to function as hexamers, and have 6 predicted transmembrane segments. UreI has also been shown to have a lipid "plug" in the center of the hexamer. Urea enters at the periplasmic opening of UreI and must pass 2 constriction sites, one on each side of a conserved Glu (Glu 177, H. pylori numbering), to reach the cytoplasm. Urea/thiourea selectivity is diminished by mutation of a conserved Trp to Ala or Phe in constriction site 2 (cytoplasmic). Channel functionality is greatly diminished by mutation of a conserved Trp in constriction site 1 (periplasmic) and a conserved Tyr in constriction site 2, and to a lesser extent a conserved Phe in site 1. In the cytoplasm, urease hydrolyzes urea to form ammonia and carbamate, which decomposes to carbonic acid. UreI is fully open at pH 5.0 to facilitate urea influx, but closes at neutral pH, preventing over-alkalization. Glu 177 (H. pylori numbering) is present in urea channel proteins, but absent in the related amide channels, suggesting that it plays a role in urea specificity.	162
259833	cd13429	UreI_AmiS_like_2	UreI/AmiS family, subgroup 2. Putative transporters related to proton-gated urea channel and putative amide transporters. This subfamily includes putative UreI proton-gated urea channels and putative amide transporters (AmiS of the amidase gene cluster). Helicobacter pylori UreI (HpUreI), a proton-gated inner membrane urea channel, opens in acidic pH to allow urea influx to the cytoplasm. There urea is metabolized, producing NH3 and CO2, leading to buffering of the periplasm. This action is essential for the survival of H. pylori in the stomach, and has been identified as a mechanism that could be clinically targeted to prevent various illnesses associated with infection by H. pylori. UreI and the related amide channels (AmiS) appear to function as hexamers, and have 6 predicted transmembrane segments. UreI has also been shown to have a lipid "plug" in the center of the hexamer. Urea enters at the periplasmic opening of UreI and must pass 2 constriction sites, one on each side of a conserved Glu (Glu 177, H. pylori numbering), to reach the cytoplasm. Urea/thiourea selectivity is diminished by mutation of a conserved Trp to Ala or Phe in constriction site 2 (cytoplasmic). Channel functionality is greatly diminished by mutation of a conserved Trp in constriction site 1 (periplasmic) and a conserved Tyr in constriction site 2, and to a lesser extent a conserved Phe in site 1. In the cytoplasm, urease hydrolyzes urea to form ammonia and carbamate, which decomposes to carbonic acid. UreI is fully open at pH 5.0 to facilitate urea influx, but closes at neutral pH, preventing over-alkalization. Glu 177 (H. pylori numbering) is present in urea channel proteins, but absent in the related amide channels, suggesting that it plays a role in urea specificity.	165
240445	cd13430	LDT_IgD_like	IgD-like repeat domain of mycobacterial L,D-transpeptidases. Immunoglobulin-like domain found in actinobacterial L,D-transpeptidases, including Mycobacterium tuberculosis LdtMt2, which is a non-classical transpeptidase that generates 3->3 transpeptide linkages. LdtMt2 is associated with virulence and resistance to amoxicillin. This domain may occur in a tandem-repeat arrangement and is found N-terminal to the catalytic L,D-transpeptidase domain.	98
240446	cd13431	LDT_IgD_like_1	IgD-like repeat domain of mycobacterial L,D-transpeptidases. Immunoglobulin-like domain found in actinobacterial L,D-transpeptidases, including Mycobacterium tuberculosis LdtMt2, which is a non-classical transpeptidase that generates 3->3 transpeptide linkages. LdtMt2 is associated with virulence and resistance to amoxicillin. This domain may occur in a tandem-repeat arrangement and is found N-terminal to the catalytic L,D-transpeptidase domain; this model represents the first (N-terminal) repeat in LdtMt2 and related proteins.	95
240447	cd13432	LDT_IgD_like_2	IgD-like repeat domain of mycobacterial L,D-transpeptidases. Immunoglobulin-like domain found in actinobacterial L,D-transpeptidases, including Mycobacterium tuberculosis LdtMt2, which is a non-classical transpeptidase that generates 3->3 transpeptide linkages. LdtMt2 is associated with virulence and resistance to amoxicillin. This domain may occur in a tandem-repeat arrangement and is found N-terminal to the catalytic L,D-transpeptidase domain; this model represents the  repeat adjacent to the catalytic domain.	99
240441	cd13433	Na_channel_gate	Inactivation gate of the voltage-gated sodium channel alpha subunits. This region is part of the intracellular linker between domains III and IV of the alpha subunits of voltage-gated sodium channels. It is responsible for fast inactivation of the channel and essential for proper physiological function.	54
259812	cd13434	SPFH_SLPs	Stomatin-like proteins (slipins) family; SPFH (stomatin, prohibitin, flotillin, and HflK/C) superfamily. This model summarizes proteins similar to stomatin, podocin, and other members of the stomatin-like protein family (SLPs or slipins). The conserved domain common to the SPFH superfamily has also been referred to as the Band 7 domain. Individual proteins of the SPFH superfamily may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Stomatin interacts with and regulates members of the degenerin/epithelial Na+ channel family in mechanosensory cells of Caenorhabditis elegans and vertebrate neurons and participates in trafficking of Glut1 glucose transporters. Mutations in the podocin gene give rise to autosomal recessive steroid-resistant nephrotic syndrome. Bacterial and archaebacterial SLPs and many of the eukaryotic family members remain uncharacterized.	108
259813	cd13435	SPFH_SLP-4	Slipin-4 (SLP-4), an uncharacterized subgroup of the stomatin-like proteins (slipins) family; belonging to the SPFH (stomatin, prohibitin, flotillin, and HflK/C) superfamily. This model summarizes a subgroup of the stomatin-like protein family (SLPs or slipins) that is found in arthropods. The conserved domain common to the SPFH superfamily has also been referred to as the Band 7 domain. Individual proteins of the SPFH superfamily may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Members of this divergent slipin subgroup remain largely uncharacterized. It contains Drosophila Mec2, the gene for which was identified in a screen for genes required for nephrocyte function; it may function together with Sns in maintaining nephrocyte diaphragm.	208
259814	cd13436	SPFH_SLP-1	Stomatin-like protein 1 (SLP-1), a subgroup of the stomatin-like proteins (slipins) family; belonging to the SPFH (stomatin, prohibitin, flotillin, and HflK/C) superfamily. This model summarizes a subgroup of the stomatin-like protein family (SLPs or slipins) that is found in animals. The conserved domain common to the SPFH superfamily has also been referred to as the Band 7 domain. Individual proteins of the SPFH superfamily may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. The family contains human SLP-1, which has been found to be expressed in the brain, and Caenorhabditis elegans UNC-24, which is a lipid raft-associated protein required for normal locomotion. It may mediate the correct localization of UNC-1. Mutations in the unc-24 gene result in abnormal motion and altered patterns of sensitivity to volatile anesthetics.	131
259815	cd13437	SPFH_alloslipin	Alloslipin, a subgroup of the stomatin-like proteins (slipins) family; belonging to the SPFH (stomatin, prohibitin, flotillin, and HflK/C) superfamily. This model summarizes a subgroup of the stomatin-like protein family (SLPs or slipins) that is found in some eukaryotes and viruses. The conserved domain common to the SPFH superfamily has also been referred to as the Band 7 domain. Individual proteins of the SPFH superfamily may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. This diverse subgroup of the SLPs remains largely uncharacterized.	222
259816	cd13438	SPFH_eoslipins_u2	Uncharacterized prokaryotic subgroup of the stomatin-like proteins (slipins) family; belonging to the SPFH (stomatin, prohibitin, flotillin, and HflK/C) superfamily. This model summarizes a subgroup of the stomatin-like protein family (SLPs or slipins) that is found in bacteria. The conserved domain common to the SPFH superfamily has also been referred to as the Band 7 domain. Individual proteins of the SPFH superfamily may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Bacterial SLPs remain uncharacterized.	215
240442	cd13439	CamS_repeat	Repeat domain of CamS sex pheromone cAM373 precursor and related proteins. This  family includes CamS, from which Staphylococcus aureus sex pheromone staph-cAM373 is processed. The protein contains two structurally similar repeats in a tandem arrangement. The heptapeptide cAM373 is a Streptococcus faecalis pheromone, secreted by recipient cells, which induces a mating response in donor cells that contain particular conjugative plasmids. cAM373 is also excreted by Staphylococcus aureus. The family also contains sex hormone precursors from other bacteria and an uncharacterized protein with a single repeat from Desulfovibrio piger, which is structurally similar and might be homologous.	106
240443	cd13440	CamS_repeat_2	C-terminal repeat domain of CamS sex pheromone cAM373 precursor. This  family includes CamS, from which Staphylococcus aureus sex pheromone staph-cAM373 is processed. The protein contains two structurally similar repeats in a tandem arrangement. The heptapeptide cAM373 is a Streptococcus faecalis pheromone, secreted by recipient cells, which induces a mating response in donor cells that contain particular conjugative plasmids. cAM373 is also excreted by Staphylococcus aureus. The family also contains sex hormone precursors from other bacteria.	115
240444	cd13441	CamS_repeat_1	N-terminal repeat domain of CamS sex pheromone cAM373 precursor. This  family includes CamS, from which Staphylococcus aureus sex pheromone staph-cAM373 is processed. The protein contains two structurally similar repeats in a tandem arrangement. The heptapeptide cAM373 is a Streptococcus faecalis pheromone, secreted by recipient cells, which induces a mating response in donor cells that contain particular conjugative plasmids. cAM373 is also excreted by Staphylococcus aureus. The family also contains sex hormone precursors from other bacteria.	204
412041	cd13442	CDI_toxin_Bp1026b-like	C-terminal (CT) toxin domain of the contact-dependent growth inhibition (CDI) system of Burkholderia pseudomallei 1026b, and related proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. This model represents the C-terminal (CT) toxin domain of CdiA effector proteins. CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered via the C-terminal domain. A wide variety of C-terminal toxin domains appear to exist; this particular example from Burkholderia pseudomallei 1026b and other bacteria appears to function as a Mg2+-dependent RNAse cleaving tRNA, most likely in the aminoacyl acceptor stem. This CdiA-Ct is structurally similar to another CDI toxin domain from B. pseudomallei E479 which is unrelated in sequence but has a similar nuclease domain, and shares similar fold and active-site architecture; it contains a core alpha/beta-fold that is characteristic of PD(D/E)XK superfamily nucleases.	129
259836	cd13443	CDI_inhibitor_Bp1026b_like	Inhibitor of the contact-dependent growth inhibition (CDI) system of Burkholderia pseudomallei 1026b, and related proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. This model represents the inhibitor of the CdiA effector protein from Burkholderia pseudomallei 1026b (which is a tRNAse). CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered. The inhibitors are intracellular proteins that inactivate the toxin/effector protein.	100
259837	cd13444	CDI_toxin_EC869_like	Zn-dependent DNAse of the contact-dependent growth inhibition (CDI) system of Escherichia coli EC869, and related proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring bacteria. This model represents the C-terminal toxin domain of CdiA effector proteins. CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered. A wide variety of C-terminal toxin domains appear to exist; this particular example from Escherichia coli EC869 and other bacteria appears to function as a Zn2+-dependent DNAse degrading the genome of target cells.	143
259838	cd13445	CDI_inhibitor_EC869_like	Inhibitor of the contact-dependent growth inhibition (CDI) system of Escherichia coli EC869, and related proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring bacteria. This model represents the inhibitor of the CdiA effector protein from Escherichia coli EC869 (which is a DNAse). CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered. The inhibitors are intracellular proteins that inactivate the toxin/effector protein. This domain is also known as DUF1436.	157
259825	cd13516	HHD_CCM2	harmonin-homology domain (harmonin_N_like domain) of malcavernin (CCM2). CCM2 (also called malcavernin; C7orf22/chromosome 7 open reading frame 22; OSM) along with CCM1 and CCM3 constitutes a set of proteins which when mutated are responsible for cerebral cavernous malformations, an autosomal dominant neurovascular disease characterized by cerebral hemorrhages and vascular malformations in the central nervous system. CCM2 plays many functional roles. CCM2 functions as a scaffold involved in small GTPase Rac-dependent p38 mitogen-activated protein kinase (MAPK) activation when the cell is under hyperosmotic stress. It associates with CCM1 in the signaling cascades that regulate vascular integrity and participates in HEG1 (the transmembrane receptor heart of glass 1) mediated endothelial cell junctions. CCM proteins also inhibit the activation of small GTPase RhoA and its downstream effector Rho kinase (ROCK) to limit vascular permeability. CCM2 mediates TrkA-dependent cell death via its N-terminal PTB domain in pediatric neuroblastic tumours. CCM2 possesses an N-terminal PTB domain. The C-terminal domain of malcavernin, which is represented here, appears similar to the N-terminal domain of the scaffolding protein harmonin. It has also been referred to as the Karet domain.	97
270235	cd13517	PBP2_ModA3_like	Substrate binding domain of molybdate binding protein-like (ModA3), a member of the type 2 periplasmic binding fold superfamily. This subfamily contains molybdate binding protein-like (ModA3) domain of an ABC-type transporter. The molybdate transport system is comprised of a periplasmic binding protein, an integral membrane protein, and an energizer protein. These three proteins are coded by modA, modB, and modC genes, respectively. ModA proteins serve as initial receptors in the ABC transport of molybdate mostly in eubacteria and archaea. ModA transporters import molybdenum and tungsten from the environment in the form of the oxyanions molybdate (MoO(4) (2-)) and tungstate (WO(4) (2-)). After binding molybdate with high affinity, they interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. In contrast to the structure of the two ModA homologs from Escherichia coli and Azotobacter vinelandii, where the oxygen atoms are tetrahedrally arranged around the metal center, the structure of Pyrococcus furiosus ModA/WtpA (PfModA) has shown a binding site for molybdate and tungstate in which the central metal atom is in a hexacoordinate configuration. The ModA proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge.  They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	223
270236	cd13518	PBP2_Fe3_thiamine_like	Substrate binding domain of iron and thiamine transporters-like, a member of the type 2 periplasmic binding fold superfamily. The periplasmic iron binding protein plays an essential role in the iron uptake pathway of Gram-negative pathogenic bacteria from the Pasteurellaceae and Neisseriaceae families and is critical for survival of these pathogens within the host. On the other hand, thiamin is an essential cofactor in all living systems. Thiamin diphosphate (ThDP)-dependent enzymes play an important role in carbohydrate and branched-chain amino acid metabolism. Most prokaryotes, plants, and fungi can synthesize thiamin, but it is not synthesized in vertebrates. These periplasmic domains have high affinities for their respective substrates and serve as the primary receptor for transport. After binding iron and thiamine with high affinity, they interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The iron- and thiamine-binding proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	260
270237	cd13519	PBP2_PEB3_AcfC	Ligand-binding domain of a glycoprotein adhesin and an accessory colonization factor, a member of the type 2 periplasmic binding fold superfamily. PEB3 is a glycoprotein adhesin from Campylobacter jejuni whose structure suggests a functional role in transport, and resembles PEB1a, an Asp/Glu transporter and an adhesin. The overall structure of PEB3 is a dimer and is similar to that of other type 2 periplasmic transport proteins such as the molybdate/tungstate, sulfate, and ferric iron transporters.  PEB3 has high sequence identity to Paa, an Escherichia coli adhesin, and to AcfC, an accessory colonization factor from Vibrio cholerae.	227
270238	cd13520	PBP2_TAXI_TRAP	Substrate binding domain of TAXI proteins of the tripartite ATP-independent periplasmic transporters; the type 2 periplasmic binding protein fold. This group includes Thermus thermophilus GluBP (TtGluBP) of TAXI-TRAP family and closely related proteins. TRAP transporters are ubiquitous in prokaryotes, but absent from eukaryotes. They are comprised of an SBP (substrate-binding protein) of the DctP or TAXI families and two unequally sized integral membrane components. Although TtGluBP is predicted to be an L-glutamate and/or an L-glutamine-binding protein, the substrate spectrum of TAXI proteins remains to be defined. A sequence-homology search also shows that TtGluBP shares low sequence homology with putative immunogenic proteins of uncharacterized function. The substrate-binding domain of TAXI proteins belongs to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	285
270239	cd13521	PBP2_AlgQ_like	Periplasmic-binding component of alginate-specific ABC uptake system-like; contains the type 2 periplasmic binding fold. This family represents the periplasmic-binding component of high molecular weight (HMW) alginate uptake system found in gram-negative soil bacteria and related proteins. The HMW alginate uptake system is composed of a novel pit formed on the cell surface and a pit-dependent ATP-binding cassette (ABC) transporter in the inner membrane. In Sphingomonas sp. A1, the transportation of HMW alginate from the pit to the ABC transporter is mediated by periplasmic HMW alginate-binding proteins AlgQ1 and AlgQ2. Alginate is an anionic polysaccharide that is made up of beta-D-mannuronate and its C5 epimer, alpha-L-guluronate. Alginate is present in the cell walls of brown seaweeds, where it forms a viscous gum by binding water. Alginate is also produced by two bacterial genera, Pseudomonas and Azotobacter. AlgQ1 and AlgQ2 belong to the type 2 periplasmic-binding fold superfamily. PBP2 proteins are comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. However, unlike other bacterial periplasmic-binding proteins that deliver small solutes to ABC transporters, AlgQ1/2 can bind a macromolecule and may have specificity for either sugar or a certain type of polysaccharide.	483
270240	cd13522	PBP2_ABC_oligosaccharides	The periplasmic-binding component of ABC transport systems specific for maltose and related oligosaccharides; possess type 2 periplasmic binding fold. This family represents the periplasmic binding component of ABC transport systems involved in uptake of oligosaccharides including maltose, trehalose, maltodextrin, and cyclodextrin. Members of this family belong to the type 2 periplasmic-binding fold superfamily. PBP2 proteins are comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	368
270241	cd13523	PBP2_polyamines	The periplasmic-binding component of ABC transporters involved in uptake of polyamines; possess the type 2 periplasmic binding fold. This family represents the periplasmic substrate-binding proteins that function as the primary high-affinity receptors of ABC-type polyamine transport systems. Polyamine transport plays an essential role in the regulation of intracellular polyamine levels which are known to be elevated in rapidly proliferating cells and tumors. Natural polyamines are putrescine, spermidine, and spermine. They are polycations that play multiple roles in cell growth, survival and proliferation, as well as plant stress and disease resistance. They can interact with negatively charged molecules, such as nucleic acids, to modulate their functions. Members of this family belong to the type 2 periplasmic-binding fold superfamily. PBP2 proteins are comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	268
270242	cd13524	PBP2_Thiaminase_I	Thiaminase-I has high structural homology to the type 2 periplasmic binding proteins of active transport systems. Thiaminase-I, a thiamin-(vitamin B1) degrading enzyme, is a monomer in its biologically active form, with two distinct globular domains (N- and C-domains) separated by a deep groove. It has a structural topology similar to the periplasmic substrate-domains of ABC-type transport systems, such as thiamin-binding protein (TbpA), that possess the type 2 periplasmic binding protein fold.  The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.  The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea.	363
270243	cd13525	PBP2_ATP-Prtase_HisG	The catalytic domain of ATP phosphoribosyltransferase contains the type 2 periplasmic substrate-binding fold. Encoded by the hisG gene, the ATP phosphoribosyltransferase (ATP-PRT, EC 2.4.2.17) is the first enzyme in the histidine biosynthetic pathway that catalyzes the condensation of ATP and PRPP (5'-phosphoribosyl 1'-pyrophosphate), and is regulated by a feedback inhibition from the product histidine. ATP-PRT has two distinct forms: a hexameric long form, HisGL, containing two catalytic domains and a C-terminal regulatory domain; and a hetero-octameric short form, HisGs, without the regulatory domain.  HisGL is catalytically competent, but the hetero-octameric HisGs requires the second subunit HisZ, a paralog to the catalytic domain of functional histidyl-tRNA synthetases (HisRSs), for the enzyme activity. This catalytic domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea.	208
270244	cd13526	PBP2_lipoprotein_MetQ_like	The periplasmic-binding component of ABC-type methionine uptake transporter system and its related lipoproteins; the type 2 periplasmic-binding protein fold. This family represents the periplasmic substrate-binding domain of ATP-binding cassette (ABC) transporter involved in uptake of methionine (MetQ) and its related homologs. Members of the MetQ-like family include the 32-kilodalton lipoprotein (Tp32) from Treponema pallidum, the membrane-associated lipoprotein-9 GmpC from Staphylococcus aureus, and Toll-like receptor 2-activating lipoprotein IlpA from Vibrio vulnificus. They all function as a receptor for methionine. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea.	228
270245	cd13527	PBP2_TRAP	Substrate-binding component of Tripartite ATP-independent Periplasmic transporters and related proteins; contains the type 2 periplasmic-binding protein fold. This family represents the TRAP Transporters that are specific to various ligands, including sialic acid (N-acetyl neuraminic acid), glutamate, ectoine, xylulose, C4-dicarboxylates such as succinate, malate and fumarate, and keto acids such as pyruvate and alpha-ketobutyrate. TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. This family also includes some eukaryotic homologs that have not been functionally characterized. TRAP transporters are comprised of a periplasmic substrate-binding protein (SBP; often called the P subunit) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process (the M subunit) and a smaller membrane subunit of unknown function (the Q subunit). The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	301
270246	cd13528	PBP2_osmoprotectants	Substrate-binding domain of osmoregulatory ABC-type transporters; the type 2 periplasmic-binding protein fold. This family represents the periplasmic substrate-binding component of ABC transport systems that are involved in uptake of osmoprotectants (also termed compatible solutes) such as betaine, choline, proline betaine, carnitine, and L-proline. To counteract the efflux of water, bacteria and archaea accumulate the compatible solutes for a sustained adjustment to high osmolarity surroundings. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	264
270247	cd13529	PBP2_transferrin	Transferrin family of the type 2 periplasmic-binding protein superfamily. Transferrins are iron-binding blood plasma glycoproteins that regulate the level of free iron in biological fluids. Vertebrate transferrins are made of a single polypeptide chain with a molecular weight of about 80 kDa. The polypeptide is folded into two homologous lobes (the N-lobe and C-lobe), and each lobe is further subdivided into two similar alpha helical and beta sheet domains separated by a deep cleft that forms the binding site for ferric iron. Thus, the transferrin protein contains two homologous metal-binding sites with high affinities for ferric iron. The modern transferrin proteins are thought to be evolved from an ancestral gene coding for a protein of 40 kDa containing a single binding site by means of a gene duplication event. Vertebrate transferrins are found in a variety of bodily fluids, including serum transferrins, ovotransferrins, lactoferrins, and melanotransferrins. Transferrin-like proteins are also found in the circulatory fluid of certain invertebrates. The transferrins have the same structural fold as the type 2 periplasmic-binding proteins, many of which are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor.	298
270248	cd13530	PBP2_peptides_like	Peptide-binding protein and related homologs; type 2 periplasmic binding protein fold. This domain is found in solute binding proteins that serve as initial receptors in the ABC transport, signal transduction and channel gating.  The PBP2 proteins share the same architecture as periplasmic binding proteins type 1, but have a different topology. They are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.  The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea.  After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.  Besides transport proteins, the family includes ionotropic glutamate receptors and unorthodox sensor proteins involved in signal transduction.	217
270249	cd13531	PBP2_MxaJ	Methanol oxidation system protein MoxJ; the type 2 periplasmic binding fold. This predicted periplasmic protein, called MoxJ or MxaJ, is required for methanol oxidation in Methylobacterium extorquens. Homology suggests it is the substrate-binding protein of an ABC transporter associated with methanol oxidation. Other evidence also suggests that MoxJ is an accessory factor or additional subunit of methanol dehydrogenase itself. Mutational studies show a dependence on this protein for expression of the PQQ-dependent, two-subunit methanol dehydrogenase (MxaF and MxaI) in Methylobacterium extorquens, as if it is a chaperone for enzyme assembly or a third subunit. A homologous N-terminal sequence was found in Paracoccus denitrificans as a 32 kDa third subunit. MoxJ may be both a component of a periplasmic enzyme that converts methanol to formaldehyde and a component of an ABC transporter that delivers the resulting formaldehyde to the cell's interior.	242
270250	cd13532	PBP2_PDT_like	Catalytic domain of prephenate dehydratase and similar proteins; the type 2 periplasmic binding protein fold. Prephenate dehydratase (PDT, EC:4.2.1.51) converts prephenate to phenylpyruvate through dehydration and decarboxylation reactions. PDT plays a key role in the biosynthesis of L-Phe in organisms that utilize the shikimate pathway. PDT is allosterically regulated by L-Phe and other amino acids. The catalytic PDT domain consists of two similar subdomains with a cleft in between, which hosts the highly conserved active site. In gram-positive bacteria and archaea, PDT is a monofunctional enzyme, consisting of a catalytic domain (PDT domain) and a regulatory domain (ACT) (aspartokinase, chorismate mutase domain). In gram-negative bacteria, PDT exists as fusion protein with chorismate mutase (CM), forming a bifunctional enzyme, P-protein (PheA). The CM in the P-protein catalyzes the pericyclic isomerization of chorismate to prephenate that serves as a substrate for PDT. The CM and PDT are essential enzymes for the biosynthesis of aromatic amino acids in microorganisms but are not found in humans. Thus, both CM and PDT can potentially serve as drug targets against microbial pathogens. The PDT domain has the same structural fold as the type 2 periplasmic binding proteins (PBP2), many of which are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	184
270251	cd13533	PBP2_Yhfz	Substrate-binding domain of uncharacterized protein Yhfz from Shigella flexneri; the type 2 periplasmic-binding protein fold. This subfamily contains periplasmic binding protein type II (PBPII).  This domain is found in solute binding proteins that serve as initial receptors in the ABC transport, signal transduction and channel gating.  The PBPII proteins share the same architecture as periplasmic binding proteins type I (PBPI), but have a different topology.  They are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.  The majority of PBPII proteins function in the uptake of small soluble substrates in eubacteria and archaea.  After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.  Besides transport proteins, the family includes ionotropic glutamate receptors and unorthodox sensor proteins involved in signal transduction.	222
270252	cd13534	PBP2_MqnD_like	Menaquinone biosynthetic enzyme and related hypothetical proteins; the type 2 periplasmic-binding protein fold. This family represents MqnD, an enzyme within the alternative menaquinone biosynthetic pathway, and related conserved hypothetical proteins. Menaquinone (MK; vitamin K) is an essential lipid-soluble carrier that shuttles electrons between membrane-bound protein complexes in the electron transport chain. The members include Ttha1568, MqnD from Thermus thermophilus HB8, and the conserved hypothetical proteins SCO4506 from Streptomyces coelicolor, Af1704 from Archaeoglobus DSM 4304, Dr0370 from Deinococcus radiodurans, and Ca3427 from Candida albicans. They all have significant structural homology with the members of type 2 periplasmic-binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	261
270253	cd13535	PBP2_Osm_BCP_like	Substrate binding domain of osmoregulatory ABC-type glycine betaine/choline/L-proline transport system and related proteins; the type 2 periplasmic binding protein fold. This family is part of a high affinity multicomponent binding-protein-dependent ATP-binding cassette transport system specific to certain quaternary ammonium compounds for osmoregulation. The periplasmic substrate-binding domain, which is often fused to the permease component of the ATP-binding cassette transporter complex, is involved in uptake of osmoprotectants (also termed compatible solutes) such as betaines, choline, and L-proline. Many microorganisms accumulate these compatible solutes in response to high osmolarity to offset the loss of cell water. This domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	277
270254	cd13536	PBP2_EcModA	Substrate binding domain of ModA from Escherichia coli and its closest homologs;the type 2 periplasmic binding protein fold. This subfamily contains domains found in ModA proteins that serve as initial receptors in the ABC transport of molybdate in eubacteria and archaea. After binding molybdate with high affinity, they interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The ModA proteins belong to the PBPII superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	227
270255	cd13537	PBP2_YvgL_like	Substrate binding domain of putative molybdate-binding protein YvgL and similar proteins;the type 2 periplasmic binding protein fold. This subfamily contains domains found in ModA proteins of putative ABC-type transporter. ModA proteins serve as initial receptors in the ABC transport of molybdate in eubacteria and archaea. Bacteria and archaea import molybdenum and tungsten from the environment in the form of the oxyanions molybdate (MoO(4) (2-)) and tungstate (WO(4) (2-)). After binding molybdate and tungstate with high affinity, they interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The ModA proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	225
270256	cd13538	PBP2_ModA_like_1	Substrate binding domain of putative molybdate-binding protein;the type 2 periplasmic binding protein fold. This subfamily contains domains found in ModA proteins of putative ABC-type transporter. Molybdate transport system is comprised of a periplasmic binding protein, an integral membrane protein, and an energizer protein. These three proteins are coded by modA, modB, and modC genes, respectively. ModA proteins serve as initial receptors in the ABC transport of molybdate mostly in eubacteria and archaea.  After binding molybdate with high affinity, they interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The ModA proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	230
270257	cd13539	PBP2_AvModA	Substrate binding domain of ModA/WtpA from Azotobacter vinelandii and its closest homologs; the type 2 periplasmic binding protein fold. This subfamily contains domains found in ModA proteins that serve as initial receptors in the ABC transport of molybdate in eubacteria and archaea. Bacteria and archaea import molybdenum and tungsten from the environment in the form of the oxyanions molybdate (MoO(4) (2-)) and tungstate (WO(4) (2-)). After binding molybdate with high affinity, they interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. In contrast to the structure of the two ModA homologs from Escherichia coli and Azotobacter vinelandii, where the oxygen atoms are tetrahedrally arranged around the metal center, the structure of Pyrococcus furiosus ModA/WtpA (PfModA) has revealed a binding site for molybdate and tungstate in which the central metal atom is in a hexacoordinate configuration. This octahedral geometry was rather unexpected. The ModA proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge.  They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	226
270258	cd13540	PBP2_ModA_WtpA	Substrate binding domain of ModA/WtpA from Pyrococcus furiosus and its closest homologs; the type 2 periplasmic binding protein fold. This subfamily contains domains found in ModA proteins that serve as initial receptors in the ABC transport of molybdate in eubacteria and archaea. Bacteria and archaea import molybdenum and tungsten from the environment in the form of the oxyanions molybdate (MoO(4) (2-)) and tungstate (WO(4) (2-)). After binding molybdate with high affinity, they interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains.  This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. In contrast to the structure of the two ModA homologs from Escherichia coli and Azotobacter vinelandii, where the oxygen atoms are tetrahedrally arranged around the metal center, the structure of Pyrococcus furiosus ModA/WtpA (PfModA) has revealed a binding site for molybdate and tungstate in which the central metal atom is in a hexacoordinate configuration. The ModA proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	263
270259	cd13541	PBP2_ModA_like_2	Substrate binding domain of molybdate-binding proteins;the type 2 periplasmic binding protein fold. This subfamily contains domains found in ModA proteins of putative ABC-type transporter. ModA proteins serve as initial receptors in the ABC transport of molybdate in eubacteria and archaea. Bacteria and archaea import molybdenum and tungsten from the environment in the form of the oxyanions molybdate (MoO(4) (2-)) and tungstate (WO(4) (2-)). After binding molybdate and tungstate with high affinity, they interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The ModA proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	238
270260	cd13542	PBP2_FutA1_ilke	Substrate binding domain of ferric iron-binding protein, a member of the type 2 periplasmic binding fold superfamily. FutA1 is the periplasmic component of an ABC-type iron transporter and serves as the primary receptor in Synechocystis species. The periplasmic iron binding protein plays an essential role in the iron uptake pathway of Gram-negative pathogenic bacteria and is critical for survival of these pathogens within the host. After binding iron with high affinity, FutA1 interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The iron- and thiamine-binding proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	314
270261	cd13543	PBP2_Fbp	Substrate binding domain of ferric iron transporter, a member of the type 2 periplasmic binding fold superfamily. The periplasmic iron binding protein plays an essential role in the iron uptake pathway of Gram-negative pathogenic bacteria from the Pasteurellaceae and Neisseriaceae families and is critical for survival of these pathogens within the host. This periplasmic protein (Fbp) has high affinities for ferric iron and serves as the primary receptor for transport. After binding iron with high affinity, Fbp interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The ferric iron-binding proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	306
270262	cd13544	PBP2_Fbp_like_1	Substrate binding domain of a putative ferric iron transporter, a member of the type 2 periplasmic binding fold superfamily. The substrate domain of this group shows a high homology to the periplasmic component of ferric iron transporter (Fbp), but its biochemical characterization has not been performed. The periplasmic iron binding protein plays an essential role in the iron uptake pathway of Gram-negative pathogenic bacteria from the Pasteurellaceae and Neisseriaceae families and is critical for survival of these pathogens within the host. After binding iron with high affinity, Fbp interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The ferric iron-binding proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	292
270263	cd13545	PBP2_TbpA	Substrate binding domain of thiamin transporter, a member of the type 2 periplasmic binding fold superfamily. Thiamin-binding protein TbpA is the periplasmic component of ABC-type transporter in E. coli, while the transmembrane permease and ATPase are ThiP and ThiQ, respectively. Thiamin (vitamin B1) is an essential cofactor in all living systems. Most prokaryotes, plants, and fungi can synthesize thiamin. However, in vertebrates, thiamin cannot be synthesized and must therefore be obtained through dietary absorption.  In addition to thiamin biosynthesis, most organisms can import thiamin using specific transporters. After binding thiamin with high affinity, TbpA interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The thiamin-binding proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	269
270264	cd13546	PBP2_BitB	Substrate binding domain of a putative iron transporter BitB, a member of the type 2 periplasmic binding fold superfamily. The substrate domain of this group shows a high homology to the periplasmic component of ferric iron transporter (Fbp), but its biochemical characterization has not been performed. The periplasmic iron binding protein plays an essential role in the iron uptake pathway of Gram-negative pathogenic bacteria from the Pasteurellaceae and Neisseriaceae families and is critical for survival of these pathogens within the host. After binding iron with high affinity, Fbp interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The ferric iron-binding proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	258
270265	cd13547	PBP2_Fbp_like_2	Substrate binding domain of an uncharacterized ferric iron transporter, a member of the type 2 periplasmic binding fold superfamily. The periplasmic iron binding protein plays an essential role in the iron uptake pathway of Gram-negative pathogenic bacteria from the Pasteurellaceae and Neisseriaceae families and is critical for survival of these pathogens within the host. This periplasmic domain (Fbp) has high affinity for ferric iron and serves as the primary receptor for transport. After binding iron with high affinity, Fbp interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The ferric iron-binding proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	259
270266	cd13548	PBP2_AEPn_like	Substrate binding domain of a putative 2-aminoethylphosphonate-binding transporter, a member of the type 2 periplasmic binding fold superfamily. The substrate domain of this group shows a high homology to the periplasmic component of ferric iron transporter (Fbp), but its biochemical characterization has not been performed. The periplasmic iron binding protein plays an essential role in the iron uptake pathway of Gram-negative pathogenic bacteria from the Pasteurellaceae and Neisseriaceae families and is critical for survival of these pathogens within the host. After binding iron with high affinity, Fbp interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The ferric iron-binding proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	310
270267	cd13549	PBP2_Fbp_like_3	Substrate binding domain of an uncharacterized ferric iron transporter, a member of the type 2 periplasmic binding fold superfamily. The periplasmic iron binding protein plays an essential role in the iron uptake pathway of Gram-negative pathogenic bacteria from the Pasteurellaceae and Neisseriaceae families and is critical for survival of these pathogens within the host. This periplasmic domain (Fbp) has high affinity for ferric iron and serves as the primary receptor for transport. After binding iron with high affinity, Fbp interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The ferric iron-binding proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	263
270268	cd13550	PBP2_Fbp_like_4	Substrate binding domain of an uncharacterized ferric iron transporter, a member of the type 2 periplasmic binding fold superfamily. The periplasmic iron binding protein plays an essential role in the iron uptake pathway of Gram-negative pathogenic bacteria from the Pasteurellaceae and Neisseriaceae families and is critical for survival of these pathogens within the host. This periplasmic domain (Fbp) has high affinity for ferric iron and serves as the primary receptor for transport. After binding iron with high affinity, Fbp interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The ferric iron-binding proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	265
270269	cd13551	PBP2_Fbp_like_5	Substrate binding domain of an uncharacterized ferric iron transporter, a member of the type 2 periplasmic binding fold superfamily. The periplasmic iron binding protein plays an essential role in the iron uptake pathway of Gram-negative pathogenic bacteria from the Pasteurellaceae and Neisseriaceae families and is critical for survival of these pathogens within the host. This periplasmic domain (Fbp) has high affinity for ferric iron and serves as the primary receptor for transport. After binding iron with high affinity, Fbp interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The ferric iron-binding proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	267
270270	cd13552	PBP2_Fbp_like_6	Substrate binding domain of an uncharacterized ferric iron transporter, a member of the type 2 periplasmic binding fold superfamily. The periplasmic iron binding protein plays an essential role in the iron uptake pathway of Gram-negative pathogenic bacteria from the Pasteurellaceae and Neisseriaceae families and is critical for survival of these pathogens within the host. This periplasmic domain (Fbp) has high affinity for ferric iron and serves as the primary receptor for transport. After binding iron with high affinity, Fbp interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The ferric iron-binding proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	266
270271	cd13553	PBP2_NrtA_CpmA_like	Substrate binding domain of ABC-type nitrate/bicarbonate transporters, a member of the type 2 periplasmic binding fold superfamily. This subfamily includes nitrate (NrtA) and bicarbonate (CmpA) receptors. These domains are found in eubacterial periplasmic-binding proteins that serve as initial receptors in the ABC transport of bicarbonate, nitrate, taurine, or a wide range of aliphatic sulfonates, while other closest homologs are involved in thiamine (vitamin B1) biosynthetic pathway and desulfurization (DszB).  After binding their ligand with high affinity, they interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. These binding proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	212
270272	cd13554	PBP2_DszB	Substrate binding domain of 2'-hydroxybiphenyl-2-sulfinate desulfinase, a member of the type 2 periplasmic binding fold superfamily. This subfamily includes DszB, which converts 2'-hydroxybiphenyl-2-sulfinate to 2-hydroxybiphenyl and sulfinate at the rate-limiting step of the microbial dibenzothiophene desulfurization pathway. The overall fold of DszB is highly similar to those of periplasmic substrate-binding proteins that serve as initial receptors in the ABC transport of bicarbonate, nitrate, taurine, or a wide range of aliphatic sulfonates. After binding their ligand with high affinity, they interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The DszB protein belongs to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	246
270273	cd13555	PBP2_sulfate_ester_like	Sulfate ester binding protein-like, the type 2 periplasmic binding protein fold.  This subfamily includes the periplasmic component of putative ABC-type sulfonate transport system similar to SsuA. These domains are found in eubacterial SsuA proteins that serve as initial receptors in the ABC transport of bicarbonate, nitrate, taurine, or a wide range of aliphatic sulfonates, while other closest homologs are involved in thiamine (vitamin B1) biosynthetic pathway and desulfurization (DszB). After binding the ligand, SsuA interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The SsuA proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	268
270274	cd13556	PBP2_SsuA_like_1	Substrate binding domain of putative sulfonate binding protein, a member of the type 2 periplasmic binding fold superfamily. This subfamily includes the periplasmic component of putative ABC-type sulfonate transport system similar to SsuA. These domains are found in eubacterial SsuA proteins that serve as initial receptors in the ABC transport of bicarbonate, nitrate, taurine, or a wide range of aliphatic sulfonates, while other closest homologs are involved in thiamine (vitamin B1) biosynthetic pathway and desulfurization (DszB). After binding the ligand, SsuA interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The SsuA proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	265
270275	cd13557	PBP2_SsuA	Substrate binding domain of sulfonate binding protein, a member of the type 2 periplasmic binding fold superfamily. This subfamily includes the sulfonate binding domains SsuA found in eubacterial SsuA proteins that serve as initial receptors in the ABC transport of bicarbonate, nitrate, taurine, or a wide range of aliphatic sulfonates, while other closest homologs are involved in thiamine (vitamin B1) biosynthetic pathway and desulfurization (DszB). After binding the ligand, SsuA interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The SsuA proteins belong to the PBPII superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	275
270276	cd13558	PBP2_SsuA_like_2	Putative substrate binding domain of sulfonate binding protein, the type 2 periplasmic binding protein fold. This subfamily includes the periplasmic component of putative ABC-type sulfonate transport system similar to SsuA. These domains are found in eubacterial SsuA proteins that serve as initial receptors in the ABC transport of bicarbonate, nitrate, taurine, or a wide range of aliphatic sulfonates, while other closest homologs are involved in thiamine (vitamin B1) biosynthetic pathway and desulfurization (DszB). After binding the ligand, SsuA interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The SsuA proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	267
270277	cd13559	PBP2_SsuA_like_3	Putative substrate binding domain of sulfonate binding protein-like, the type 2 periplasmic binding protein fold. This subfamily includes the periplasmic component of putative ABC-type sulfonate transport system similar to SsuA. These domains are found in eubacterial SsuA proteins that serve as initial receptors in the ABC transport of bicarbonate, nitrate, taurine, or a wide range of aliphatic sulfonates, while other closest homologs are involved in thiamine (vitamin B1) biosynthetic pathway and desulfurization (DszB). After binding the ligand, SsuA interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The SsuA proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	258
270278	cd13560	PBP2_taurine	Taurine-binding periplasmic protein; the type 2 periplasmic binding protein fold. This subfamily includes the periplasmic component of putative ABC-type sulfonate transport system similar to SsuA. These domains are found in eubacterial SsuA proteins that serve as initial receptors in the ABC transport of bicarbonate, nitrate, taurine, or a wide range of aliphatic sulfonates, while other closest homologs are involved in thiamine (vitamin B1) biosynthetic pathway and desulfurization (DszB). After binding the ligand, SsuA interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The SsuA proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	218
270279	cd13561	PBP2_SsuA_like_4	Putative substrate binding domain of sulfonate binding protein-like, the type 2 periplasmic binding protein fold. This subfamily includes the periplasmic component of putative ABC-type sulfonate transport system similar to SsuA. These domains are found in eubacterial SsuA proteins that serve as initial receptors in the ABC transport of bicarbonate, nitrate, taurine, or a wide range of aliphatic sulfonates, while other closest homologs are involved in thiamine (vitamin B1) biosynthetic pathway and desulfurization (DszB). After binding the ligand, SsuA interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The SsuA proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	212
270280	cd13562	PBP2_SsuA_like_5	Putative substrate binding domain of sulfonate binding protein-like, the type 2 periplasmic binding protein fold. This subfamily includes sulfonate binding domains found in eubacterial SsuA proteins that serve as initial receptors in the ABC transport of bicarbonate, nitrate, taurine, or a wide range of aliphatic sulfonates, while other closest homologs are involved in thiamine (vitamin B1) biosynthetic pathway and desulfurization (DszB). After binding the ligand, SsuA interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The SsuA proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	215
270281	cd13563	PBP2_SsuA_like_6	Putative substrate binding domain of sulfonate binding protein-like, a member of the type 2 periplasmic binding protein fold. This subfamily includes the periplasmic component of putative ABC-type sulfonate transport system similar to SsuA. These domains are found in eubacterial SsuA proteins that serve as initial receptors in the ABC transport of bicarbonate, nitrate, taurine, or a wide range of aliphatic sulfonates, while other closest homologs are involved in thiamine (vitamin B1) biosynthetic pathway and desulfurization (DszB). After binding the ligand, SsuA interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The SsuA proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	208
270282	cd13564	PBP2_ThiY_THI5_like	Substrate binding domain of ABC-type transporter for thiamin biosynthetic pathway intermediates and similar proteins; the type 2 periplasmic binding protein fold. ThiY is the periplasmic N-formyl-4-amino-5-(aminomethyl)-2-methylpyrimidine (FAMP) binding component of the ABC transport system (ThiXYZ). FAMP is imported into the cell by the transporter, where it is then incorporated into the thiamin biosynthetic pathway. The closest structural homologs of ThiY are THI5, which is responsible for the synthesis of 4-amino-5-(hydroxymethyl)-2-methylpyrimidine phosphate (HMP-P) in the thiamin biosynthetic pathway of eukaryotes, and periplasmic binding proteins involved in alkanesulfonate/nitrate and bicarbonate transport. After binding the ligand, they interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The ThiY/THI5 proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	214
270283	cd13565	PBP2_PstS	Substrate binding domain of ABC-type phosphate transporter, a member of the type 2 periplasmic-binding fold superfamily. This subfamily contains the phosphate binding domain found in PstS proteins that serve as initial receptors in the ABC transport of phosphate in eubacteria and archaea. After binding the ligand, PstS interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The PstS proteins belong to the PBPII superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	254
270284	cd13566	PBP2_phosphate	Substrate binding domain of putative ABC-type phosphate transporter, a member of the type 2 periplasmic binding fold superfamily. This subfamily contains uncharacterized phosphate binding domains found in PstS proteins that serve as initial receptors in the ABC transport of phosphate in eubacteria and archaea. After binding the ligand, PstS interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The PstS proteins belong to the PBPII superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	245
270285	cd13567	PBP2_TtGluBP	Substrate binding domain of Thermus thermophilus GluBP (TtGluBP) of TAXI family of the tripartite ATP-independent periplasmic transporters; contains the type 2 periplasmic binding protein fold. This subgroup includes TtGluBP of TAXI-TRAP family and closely related proteins. TRAP transporters are comprised of an SBP (substrate-binding protein) and two unequally sized integral membrane components. Although TtGluBP is predicted to be an L-glutamate and/or an L-glutamine-binding protein, the substrate spectrum of TAXI proteins remains to be defined. A sequence-homology search also shows that TtGluBP shares low sequence homology with putative immunogenic proteins of uncharacterized function.	284
270286	cd13568	PBP2_TAXI_TRAP_like_3	Substrate binding domain of putative TAXI proteins of the tripartite ATP-independent periplasmic transporters; the type 2 periplasmic binding protein fold. This subgroup includes uncharacterized periplasmic binding proteins that are related to Thermus thermophilus GluBP (TtGluBP) of TAXI-TRAP family. TRAP transporters are comprised of an SBP (substrate-binding protein) and two unequally sized integral membrane components. Although TtGluBP is predicted to be an L-glutamate and/or an L-glutamine-binding protein, the substrate spectrum of TAXI proteins remains to be defined. A sequence-homology search also shows that TtGluBP shares low sequence homology with putative immunogenic proteins of uncharacterized function.	289
270287	cd13569	PBP2_TAXI_TRAP_like_1	Substrate binding domain of putative TAXI proteins of the tripartite ATP-independent periplasmic transporters; the type 2 periplasmic binding protein fold. This subgroup includes uncharacterized periplasmic binding proteins that are related to Thermus thermophilus GluBP (TtGluBP) of TAXI-TRAP family. TRAP transporters are comprised of an SBP (substrate-binding protein) and two unequally sized integral membrane components. Although TtGluBP is predicted to be an L-glutamate and/or an L-glutamine-binding protein, the substrate spectrum of TAXI proteins remains to be defined. A sequence-homology search also shows that TtGluBP shares low sequence homology with putative immunogenic proteins of uncharacterized function.	283
270288	cd13570	PBP2_TAXI_TRAP_like_2	Substrate binding domain of putative TAXI proteins of the tripartite ATP-independent periplasmic transporters; the type 2 periplasmic binding protein fold. This subgroup includes uncharacterized periplasmic binding proteins that are related to Thermus thermophilus GluBP (TtGluBP) of TAXI-TRAP family. TRAP transporters are comprised of an SBP (substrate-binding protein) and two unequally sized integral membrane components. Although TtGluBP is predicted to be an L-glutamate and/or an L-glutamine-binding protein, the substrate spectrum of TAXI proteins remains to be defined. A sequence-homology search also shows that TtGluBP shares low sequence homology with putative immunogenic proteins of uncharacterized function.	281
270289	cd13571	PBP2_PnhD_1	Substrate binding domain of uncharacterized ABC-type phosphonate-like transporter; contains the type 2 periplasmic binding fold. This subfamily includes putative periplasmic binding components of an ABC transport system similar to alkylphosphonate binding domain PnhD. These domains are found in PnhD-like proteins that are predicted to function as initial receptors in hypophosphite, phosphonate, or phosphate ABC transport in archaea and eubacteria. They belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	253
270290	cd13572	PBP2_PnhD_2	Substrate binding domain of uncharacterized ABC-type phosphonate-like transporter; contains the type 2 periplasmic binding fold. This subfamily includes putative periplasmic binding component of an ABC transport system similar to alkylphosphonate binding domain PnhD. These domains are found in PnhD-like proteins that are predicted to function as initial receptors in hypophosphite, phosphonate, or phosphate ABC transport in archaea and eubacteria. They belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	249
270291	cd13573	PBP2_PnhD_3	Substrate binding domain of uncharacterized ABC-type phosphonate-like transporter; contains the type 2 periplasmic binding fold. This subfamily includes putative periplasmic binding component of an ABC transport system similar to alkylphosphonate binding domain PnhD. These domains are found in PnhD-like proteins that are predicted to function as initial receptors in hypophosphite, phosphonate, or phosphate ABC transport in archaea and eubacteria. They belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	253
270292	cd13574	PBP2_PnhD_4	Substrate binding domain of uncharacterized ABC-type phosphonate-like transporter; contains the type 2 periplasmic binding fold. This subfamily includes putative periplasmic binding component of an ABC transport system similar to alkylphosphonate binding domain PnhD. These domains are found in PnhD-like proteins that are predicted to function as initial receptors in hypophosphite, phosphonate, or phosphate ABC transport in archaea and eubacteria. They belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	250
270293	cd13575	PBP2_PnhD	Substrate binding domain of ABC-type phosphonate uptake system; contains the type 2 periplasmic binding fold. This subfamily includes the Escherichia coli PhnD (EcPhnD) which exhibits high affinity for the environmentally abundant 2-aminoethylphosphonate (2-AEP), a precursor in the biosynthesis of phosphonolipids, phosphonoproteins, and phosphonoglycans. The Escherichia coli phn operon encodes 14 genes involved in binding, uptake and metabolism of phosphonate, and is activated under phosphate-limiting conditions. PhnD belongs to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. PBP2 proteins have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. PhnD is the periplasmic binding component of an ABC-type phosphonate uptake system (PhnCDE) that recognizes and binds phosphonate.	259
270294	cd13576	PBP2_BugD_Asp	Aspartic acid transporter of Bug (Bordetella uptake gene) protein family; contains the type 2 periplasmic binding fold. The Bug (Bordetella uptake gene) protein family is a large family of periplasmic solute-binding (PBP) receptors present in a number of bacterial species, but mainly in proteobacteria. Bug proteins are the PBP components of the tripartite carboxylate transporters (TTT). Their extensive expansion in proteobacteria indicates a large functional diversity. The best studied examples are Bordetella pertussis BugD, which is an aspartic acid transporter, and BugE, which is a glutamate transporter.	294
270295	cd13577	PBP2_BugE_Glu	Glutamate transporter of Bug (Bordetella uptake gene) protein family; contains the type 2 periplasmic binding fold. The Bug (Bordetella uptake gene) protein family is a large family of periplasmic solute-binding (PBP) receptors present in a number of bacterial species, but mainly in proteobacteria. Bug proteins are the PBP components of the tripartite carboxylate transporters (TTT). Their extensive expansion in proteobacteria indicates a large functional diversity. The best studied examples are Bordetella pertussis BugD, which is an aspartic acid transporter, and BugE, which is a glutamate transporter.	292
270296	cd13578	PBP2_Bug27	Aromatic solutes transporter of Bug (Bordetella uptake gene) protein family; contains the type 2 periplasmic binding fold. Bug27 binds the non-carboxylated solute nicotinamide, in contrast to BugD (aspartic acid transporter) and BugE (glutamate transporter) which both bind aliphatic carboxylated ligands. The Bug (Bordetella uptake gene) protein family is a large family of periplasmic solute-binding (PBP) receptors present in a number of bacterial species, but mainly in proteobacteria. Bug proteins are the PBP components of the tripartite carboxylate transporters (TTT). Their extensive expansion in proteobacteria indicates a large functional diversity.	291
270297	cd13579	PBP2_Bug_NagM	Uncharacterized NagM-like protein of Bug (Bordetella uptake gene) protein family; contains the type 2 periplasmic binding fold. The Bug (Bordetella uptake gene) protein family is a large family of periplasmic solute-binding (PBP) receptors present in a number of bacterial species, but mainly in proteobacteria. Bug proteins are the PBP components of the tripartite carboxylate transporters (TTT). Their extensive expansion in proteobacteria indicates a large functional diversity. The best studied examples are Bordetella pertussis BugD, which is an aspartic acid transporter, and BugE, which is a glutamate transporter.	292
270298	cd13580	PBP2_AlgQ_like_1	Periplasmic-binding component of alginate-specific ABC uptake system-like; contains the type 2 periplasmic binding fold. This subgroup includes uncharacterized periplasmic-binding proteins that are closely related to high molecular weight (HMW) alginate binding proteins (AlgQ1 and AlgQ2) found in gram-negative soil bacteria. The HMW alginate uptake system is composed of a novel pit formed on the cell surface and a pit-dependent ATP-binding cassette (ABC) transporter in the inner membrane. The transportation of HMW alginate from the pit to the ABC transporter is mediated by periplasmic HMW alginate-binding proteins (AlgQ1 and AlgQ2). Alginate is an anionic polysaccharide that is made up of beta-D-mannuronate and its C5 epimer, alpha-L-guluronate. Alginate is present in the cell walls of brown seaweeds, where it forms a viscous gum by binding water. Alginate is also produced by two bacterial genera, Pseudomonas and Azotobacter. AlgQ1 and AlgQ2 belong to the type 2 periplasmic-binding fold superfamily. PBP2 proteins are comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. However, unlike other bacterial periplasmic-binding proteins that deliver small solutes to ABC transporters, AlgQ1/2 can bind a macromolecule and may have specificity for either sugar or a certain type of polysaccharide.	471
270299	cd13581	PBP2_AlgQ_like_2	Periplasmic-binding component of alginate-specific ABC uptake system-like; contains the type 2 periplasmic binding fold. This subgroup includes uncharacterized periplasmic-binding proteins that are closely related to high molecular weight (HMW) alginate binding proteins (AlgQ1 and AlgQ2) found in gram-negative soil bacteria. The HMW alginate uptake system is composed of a novel pit formed on the cell surface and a pit-dependent ATP-binding cassette (ABC) transporter in the inner membrane. The transportation of HMW alginate from the pit to the ABC transporter is mediated by periplasmic HMW alginate-binding proteins (AlgQ1 and AlgQ2). Alginate is an anionic polysaccharide that is made up of beta-D-mannuronate and its C5 epimer, alpha-L-guluronate. Alginate is present in the cell walls of brown seaweeds, where it forms a viscous gum by binding water. Alginate is also produced by two bacterial genera, Pseudomonas and Azotobacter. AlgQ1 and AlgQ2 belong to the type 2 periplasmic-binding fold superfamily. PBP2 proteins are comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. However, unlike other bacterial periplasmic-binding proteins that deliver small solutes to ABC transporters, AlgQ1/2 can bind a macromolecule and may have specificity for either sugar or a certain type of polysaccharide.	490
270300	cd13582	PBP2_AlgQ_like_3	Periplasmic-binding component of alginate-specific ABC uptake system-like; contains the type 2 periplasmic binding fold. This subgroup includes uncharacterized periplasmic-binding proteins that are closely related to high molecular weight (HMW) alginate binding proteins (AlgQ1 and AlgQ2) found in gram-negative soil bacteria. The HMW alginate uptake system is composed of a novel pit formed on the cell surface and a pit-dependent ATP-binding cassette (ABC) transporter in the inner membrane. The transportation of HMW alginate from the pit to the ABC transporter is mediated by periplasmic HMW alginate-binding proteins (AlgQ1 and AlgQ2). Alginate is an anionic polysaccharide that is made up of beta-D-mannuronate and its C5 epimer, alpha-L-guluronate. Alginate is present in the cell walls of brown seaweeds, where it forms a viscous gum by binding water. Alginate is also produced by two bacterial genera, Pseudomonas and Azotobacter. AlgQ1 and AlgQ2 belong to the type 2 periplasmic-binding fold superfamily. PBP2 proteins are comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. However, unlike other bacterial periplasmic-binding proteins that deliver small solutes to ABC transporters, AlgQ1/2 can bind a macromolecule and may have specificity for either sugar or a certain type of polysaccharide.	504
270301	cd13583	PBP2_AlgQ_like_4	Periplasmic-binding component of alginate-specific ABC uptake system-like; contains the type 2 periplasmic binding fold. This subgroup includes uncharacterized periplasmic-binding proteins that are closely related to high molecular weight (HMW) alginate binding proteins (AlgQ1 and AlgQ2) found in gram-negative soil bacteria. The HMW alginate uptake system is composed of a novel pit formed on the cell surface and a pit-dependent ATP-binding cassette (ABC) transporter in the inner membrane. The transportation of HMW alginate from the pit to the ABC transporter is mediated by periplasmic HMW alginate-binding proteins (AlgQ1 and AlgQ2). Alginate is an anionic polysaccharide that is made up of beta-D-mannuronate and its C5 epimer, alpha-L-guluronate. Alginate is present in the cell walls of brown seaweeds, where it forms a viscous gum by binding water. Alginate is also produced by two bacterial genera, Pseudomonas and Azotobacter. AlgQ1 and AlgQ2 belong to the type 2 periplasmic-binding fold superfamily. PBP2 proteins are comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. However, unlike other bacterial periplasmic-binding proteins that deliver small solutes to ABC transporters, AlgQ1/2 can bind a macromolecule and may have specificity for either sugar or a certain type of polysaccharide.	478
270302	cd13584	PBP2_AlgQ1_2	Periplasmic-binding component of alginate-specific ABC uptake system; contains the type 2 periplasmic binding fold. This group represents the periplasmic-binding component of high molecular weight (HMW) alginate uptake system found in gram-negative soil bacteria such as Sphingomonas sp. A1. The HMW alginate uptake system is composed of a novel pit formed on the cell surface and a pit-dependent ATP-binding cassette (ABC) transporter in the inner membrane. The transportation of HMW alginate from the pit to the ABC transporter is mediated by periplasmic HMW alginate-binding proteins (AlgQ1 and AlgQ2). Alginate is an anionic polysaccharide that includes beta-D-mannuronate and its C5 epimer, alpha-L-guluronate. Alginate is present in the cell walls of brown seaweeds, where it forms a viscous gum by binding water. Alginate is also produced by two bacterial genera, Pseudomonas and Azotobacter. AlgQ1 and AlgQ2 belong to the type 2 periplasmic-binding fold superfamily. PBP2 proteins are comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. However, unlike other bacterial periplasmic-binding proteins that deliver small solutes to ABC transporters, AlgQ1/2 can bind a macromolecule and may have specificity for either sugar or a certain type of polysaccharide.	481
270303	cd13585	PBP2_TMBP_like	The periplasmic-binding component of ABC transport systems specific for trehalose/maltose and similar oligosaccharides; possess type 2 periplasmic binding fold. This family includes the periplasmic trehalose/maltose-binding component of an ABC transport system and related proteins from archaea and bacteria. Members of this group belong to the type 2 periplasmic-binding fold superfamily. PBP2 proteins are comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	383
270304	cd13586	PBP2_Maltose_binding_like	The periplasmic-binding component of ABC transport systems specific for maltose and related polysaccharides; possess type 2 periplasmic binding fold. This subfamily represents the periplasmic binding component of ABC transport systems involved in uptake of polysaccharides including maltose, maltodextrin, and cyclodextrin. Members of this family belong to the type 2 periplasmic-binding fold superfamily. PBP2 proteins are comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	367
270305	cd13587	PBP2_polyamine_2	The periplasmic-binding component of an uncharacterized ABC transporter involved in uptake of polyamines; contains the type 2 periplasmic binding fold. This family represents the periplasmic binding domain that functions as the primary polyamine receptor of an uncharacterized ABC-type transport system. Polyamine transport plays an essential role in the regulation of intracellular polyamine levels which are known to be elevated in rapidly proliferating cells and tumors. Natural polyamines are putrescine, spermidine, and spermine. They are polycations that play multiple roles in cell growth, survival and proliferation, and plant stress and disease resistance. They can interact with negatively charged molecules, such as nucleic acids, to modulate their functions. Members of this family belong to the type 2 periplasmic-binding fold superfamily. PBP2 proteins are comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	292
270306	cd13588	PBP2_polyamine_1	The periplasmic-binding component of an uncharacterized ABC transporter involved in uptake of polyamines; contains the type 2 periplasmic binding fold. This group represents the periplasmic binding domain that functions as the primary high-affinity receptor of an uncharacterized ABC-type polyamine transport system. Polyamine transport plays an essential role in the regulation of intracellular polyamine levels which are known to be elevated in rapidly proliferating cells and tumors. Natural polyamines are putrescine, spermidine, and spermine. They are polycations that play multiple roles in cell growth, survival and proliferation, and plant stress and disease resistance. They can interact with negatively charged molecules, such as nucleic acids, to modulate their functions. Members of this family belong to the type 2 periplasmic-binding fold superfamily. PBP2 proteins are comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	279
270307	cd13589	PBP2_polyamine_RpCGA009	The periplasmic-binding component of an uncharacterized ABC transport system from Rhodopseudomonas palustris CGA009 and related proteins; contains the type 2 periplasmic-binding fold. This group represents the periplasmic binding domain that serves as the primary high-affinity receptor of an uncharacterized ABC-type polyamine transporter from Rhodopseudomonas palustris CGA009 and related proteins from other bacteria. Polyamine transport plays an essential role in the regulation of intracellular polyamine levels which are known to be elevated in rapidly proliferating cells and tumors. Natural polyamines are putrescine, spermidine, and spermine. They are polycations that play multiple roles in cell growth, survival and proliferation, and plant stress and disease resistance. They can interact with negatively charged molecules, such as nucleic acids, to modulate their functions. Members of this family belong to the type 2 periplasmic-binding fold superfamily. PBP2 proteins are comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	268
270308	cd13590	PBP2_PotD_PotF_like	The periplasmic-binding component of ABC transporters involved in uptake of polyamines; possess the type 2 periplasmic binding fold. This family represents the periplasmic substrate-binding domain that functions as the primary high-affinity receptor of ABC-type polyamine transport systems. Polyamine transport plays an essential role in the regulation of intracellular polyamine levels which are known to be elevated in rapidly proliferating cells and tumors. Natural polyamines are putrescine, spermidine, and spermine. They are polycations that play multiple roles in cell growth, survival and proliferation, and plant stress and disease resistance. They can interact with negatively charged molecules, such as nucleic acids, to modulate their functions. Members of this family belong to the type 2 periplasmic-binding fold superfamily. PBP2 proteins are comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	315
270309	cd13591	PBP2_HisGL1	The catalytic domain of hexameric long form HisGL1; contains the type 2 periplasmic binding protein fold. Encoded by the hisG gene, the ATP phosphoribosyltransferase (ATP-PRT, EC 2.4.2.17) is the first enzyme in histidine biosynthetic pathway that catalyzes the condensation of ATP and PRPP (5'-phosphoribosyl  1'-pyrophosphate), and is regulated by a feedback inhibition from the product histidine. ATP-PRT has two distinct forms: a hexameric long form, HisGL, containing two catalytic domains and a C-terminal regulatory domain; and a hetero-octomeric short form, HisGs, without the regulatory domain.  HisGL is catalytically competent, but the hetero-octameric HisGs requires the second subunit HisZ, a paralog to the catalytic domain of functional histidyl-tRNA synthetases (HisRSs), for the enzyme activity. This catalytic domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea.	204
270310	cd13592	PBP2_HisGL2	The catalytic domain of hexameric long form HisGL2; contains the type 2 periplasmic binding protein fold. Encoded by the hisG gene, the ATP phosphoribosyltransferase (ATP-PRT, EC 2.4.2.17) is the first enzyme in histidine biosynthetic pathway that catalyzes the condensation of ATP and PRPP (5'-phosphoribosyl  1'-pyrophosphate), and is regulated by a feedback inhibition from the product histidine. ATP-PRT has two distinct forms: a hexameric long form, HisGL, containing two catalytic domains and a C-terminal regulatory domain; and a hetero-octomeric short form, HisGs, without the regulatory domain.  HisGL is catalytically competent, but the hetero-octameric HisGs requires the second subunit HisZ, a paralog to the catalytic domain of functional histidyl-tRNA synthetases (HisRSs), for the enzyme activity. This catalytic domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea.	208
270311	cd13593	PBP2_HisGL3	The catalytic domain of hexameric long form HisGL3; contains the type 2 periplasmic binding protein fold. Encoded by the hisG gene, the ATP phosphoribosyltransferase (ATP-PRT, EC 2.4.2.17) is the first enzyme in the histidine biosynthetic pathway that catalyzes the condensation of ATP and PRPP (5'-phosphoribosyl 1'-pyrophosphate), and is regulated by feedback inhibition from the product histidine. ATP-PRT has two distinct forms: a hexameric long form, HisGL, containing two catalytic domains and a C-terminal regulatory domain; and a hetero-octameric short form, HisGs, without the regulatory domain. HisGL is catalytically competent, but the hetero-octameric HisGs requires the second subunit HisZ, a paralog to the catalytic domain of functional histidyl-tRNA synthetases (HisRSs), for the enzyme activity. This catalytic domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea.	220
270312	cd13594	PBP2_HisGL4	The catalytic domain of hexameric long form HisGL4; contains the type 2 periplasmic binding protein fold. Encoded by the hisG gene, the ATP phosphoribosyltransferase (ATP-PRT, EC 2.4.2.17) is the first enzyme in the histidine biosynthetic pathway that catalyzes the condensation of ATP and PRPP (5'-phosphoribosyl 1'-pyrophosphate), and is regulated by feedback inhibition from the product histidine. ATP-PRT has two distinct forms: a hexameric long form, HisGL, containing two catalytic domains and a C-terminal regulatory domain; and a hetero-octameric short form, HisGs, without the regulatory domain. HisGL is catalytically competent, but the hetero-octameric HisGs requires the second subunit HisZ, a paralog to the catalytic domain of functional histidyl-tRNA synthetases (HisRSs), for the enzyme activity. This catalytic domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea.	207
270313	cd13595	PBP2_HisGs	The catalytic domain of hetero-octameric short form HisGs; contains the type 2 periplasmic binding protein fold. Encoded by the hisG gene, the ATP phosphoribosyltransferase (ATP-PRT, EC 2.4.2.17) is the first enzyme in the histidine biosynthetic pathway that catalyzes the condensation of ATP and PRPP (5'-phosphoribosyl 1'-pyrophosphate), and is regulated by feedback inhibition from the product histidine. ATP-PRT has two distinct forms: a hexameric long form, HisGL, containing two catalytic domains and a C-terminal regulatory domain; and a hetero-octameric short form, HisGs, without the regulatory domain. HisGL is catalytically competent, but the hetero-octameric HisGs requires the second subunit HisZ, a paralog to the catalytic domain of functional histidyl-tRNA synthetases (HisRSs), for the enzyme activity. This catalytic domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea.	205
270314	cd13596	PBP2_lipoprotein_GmpC	The periplasmic substrate-binding domain of the membrane-associated lipoprotein-9 GmpC; contains the type 2 periplasmic-binding protein fold. This group includes the membrane-associated lipoprotein-9 from Staphylococcus aureus that binds the dipeptide glycylmethionine (GlyMet). The lipoprotein-9 has both structural and sequential homology to the MetQ family of substrate-binding protein. The GlyMet binding protein belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea.	230
270315	cd13597	PBP2_lipoprotein_Tp32	The substrate-binding domain of the 32-kilodalton lipoprotein (Tp32) from Treponema pallidum binds L-methionine; the type 2 periplasmic-binding protein fold. This group includes the lipoprotein Tp32, a periplasmic component of a methionine uptake transporter system, and its closely related homologs. The Tp32 has both structural and sequential homology to the MetQ family of substrate-binding protein, and thus it belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea.	236
270316	cd13598	PBP2_lipoprotein_IlpA_like	Toll-like receptor 2-activating lipoprotein IlpA from Vibrio vulnificus and similar lipoproteins; the type 2 periplasmic binding protein fold. This group includes the IlpA protein which has both structural and sequential homology to the MetQ family of substrate-binding protein, and thus belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea.	227
270317	cd13599	PBP2_lipoprotein_Gna1946	The membrane-associated lipoprotein Gna1946 from Neisseria meningitidis; the type 2 periplasmic binding protein fold. Gna1946 shares significant structural and sequence homology with the periplasmic substrate-binding domain of ATP-binding cassette (ABC) transporter involved in uptake of methionine (MetQ). The members of the MetQ-like family include the 32-kilodalton lipoprotein (Tp32) from Treponema pallidum, the membrane-associated lipoprotein-9 GmpC from Staphylococcus aureus, and Toll-like receptor 2-activating lipoprotein IlpA from Vibrio vulnificus. They all function as a receptor for methionine. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea.	228
270318	cd13600	PBP2_lipoprotein_like_1	Putative periplasmic-binding component of ABC-type methionine uptake transporter system-like; the type 2 periplasmic binding protein fold. This subgroup shares significant sequence homology with the periplasmic substrate-binding domain of ATP-binding cassette (ABC) transporter involved in uptake of methionine (MetQ). The members of the MetQ-like family include the 32-kilodalton lipoprotein (Tp32) from Treponema pallidum, the membrane-associated lipoprotein-9 GmpC from Staphylococcus aureus, and Toll-like receptor 2-activating lipoprotein IlpA from Vibrio vulnificus. They all function as a receptor for methionine. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea.	228
270319	cd13601	PBP2_TRAP_DctP1_3_4_like	Periplasmic substrate-binding component of uncharacterized TRAP-type C4-dicarboxylate transporter subfamilies; the type 2 periplasmic-binding protein fold. This model includes uncharacterized DctP subfamilies of the TRAP Transporters. TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP; often called the P subunit) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process (the M subunit) and a smaller membrane of unknown function (the Q subunit). The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	302
270320	cd13602	PBP2_TRAP_BpDctp6_7	Substrate-binding domain of a pyroglutamic acid binding DctP subfamily of the tripartite ATP-independent periplasmic transporters; contains the type 2 periplasmic binding protein fold. This model includes the DctP6 and DctP7 groups of the TRAP transporters that are involved in pyroglutamic acid transport. TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP; often called the P subunit) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process (the M subunit) and a smaller membrane of unknown function (the Q subunit). The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	300
270321	cd13603	PBP2_TRAP_Siap_TeaA_like	Substrate-binding domain of a sialic acid binding Tripartite ATP-independent  Periplasmic transport system (SiaP) and related proteins; the type 2 periplasmic-binding protein fold. This subfamily includes the periplasmic-binding component of TRAP transport systems such as SiaP (a sialic acid binding virulence factor), TeaA (an ectoine binding protein), and an uncharacterized TM0322 from hyperthermophilic bacterium Thermotoga maritima. TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process and a smaller membrane of unknown function. The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	297
270322	cd13604	PBP2_TRAP_ketoacid_lactate_like	Substrate-binding domain of an alpha-keto acid binding Tripartite ATP-independent Periplasmic transporter and related proteins; the type 2 periplasmic-binding protein fold. This family constitutes TRAP transporters that bind to ketoacids such as pyruvate and alpha-ketobutyrate, xylulose, and other unknown ligands. TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP; often called the P subunit) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process (the M subunit) and a smaller membrane of unknown function (the Q subunit). The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	306
270323	cd13605	PBP2_TRAP_DctP_like_2	Substrate-binding component of uncharacterized Tripartite ATP-independent  Periplasmic transporter; the type 2 periplasmic-binding protein fold. This family represents the TRAP Transporters that are specific to various ligands, including sialic acid (N-acetyl neuraminic acid), glutamate, ectoine, xylulose, C4-dicarboxylates such as succinate, malate and fumarate, and keto acids such as pyruvate and alpha-ketobutyrate. TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. This CD also included some eukaryotic homologs that have not been functionally characterized. TRAP transporters are comprised of a periplasmic substrate-binding protein (SBP; often called the P subunit) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process (the M subunit) and a smaller membrane of unknown function (the Q subunit). The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	303
270324	cd13606	PBP2_ProX_like	Bacterial substrate-binding protein ProX of ABC-type osmoregulated transporter and its related proteins; the type 2 periplasmic-binding protein fold. This group includes periplasmic substrate-binding component of ABC transport systems from gram-negative and -positive bacteria that are involved in uptake of osmoprotectants (also termed compatible solutes) such as betaine, choline, proline betaine, carnitine, and L-proline. To counteract the efflux of water, many microorganisms accumulate the compatible solutes for a sustained adjustment to high osmolarity surroundings. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	260
270325	cd13607	PBP2_AfProX_like	Substrate-binding protein ProX of ABC-type osmoregulatory transporter from Archaeoglobus fulgidus and its related proteins; the type 2 periplasmic-binding protein fold. This subfamily includes the periplasmic substrate-binding protein ProX from the hyperthermophilic archaeon Archaeoglobus fulgidus and its related proteins. AfProX is involved in uptake of compatible solutes such as the trimethylammonium compound glycine betaine and the dimethylammonium compound proline betaine, but the relative substrate preference is not known. To counteract the efflux of water, many microorganisms accumulate the compatible solutes for a sustained adjustment to high osmolarity surroundings. AfProX belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	261
270326	cd13608	PBP2_OpuCC_like	Substrate-binding protein OpuCC of ABC-type osmoregulatory transporter and related proteins; the type 2 periplasmic-binding protein fold. This subfamily includes the periplasmic substrate-binding protein OpuCC of the ABC transporter OpuC (where Opu is osmoprotectant uptake), which can recognize a broad spectrum of compatible solutes, and its paralog OpuBC that can solely bind choline. To counteract the efflux of water, many microorganisms accumulate the compatible solutes for a sustained adjustment to high osmolarity surroundings. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	265
270327	cd13609	PBP2_Opu_like_1	Substrate-binding domain of putative ABC-type osmoprotectant uptake system; the type 2 periplasmic-binding protein fold. This group includes the periplasmic substrate-binding component of a putative ABC transport system that is predicted to be involved in uptake of osmoprotectants (also termed compatible solutes) such as betaine, choline, proline betaine, carnitine, and L-proline.  The relative substrate preference of this group is not known. To counteract the efflux of water, many microorganisms accumulate the compatible solutes for a sustained adjustment to high osmolarity surroundings. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	263
270328	cd13610	PBP2_ChoS	Substrate-binding domain ChoS of an osmoregulated ABC-type transporter and related proteins; type 2 periplasmic-binding protein fold. Osmoprotectant binding lipoprotein ChoS of Lactococcus lactis is predicted to be involved in uptake of compatible solutes such as choline and glycine betaine, but the relative substrate preference is not known. To counteract the efflux of water, microorganisms accumulate the compatible solutes for a sustained adjustment to high osmolarity surroundings. ChoS belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	264
270329	cd13611	PBP2_YehZ	Substrate-binding domain YehZ of an osmoregulated ABC-type transporter; the type 2 periplasmic-binding protein fold. Osmoprotectant binding lipoprotein YehZ of Clostridium sticklandii is predicted to be involved in uptake of compatible solutes such as choline, L-proline and glycine betaine, but the relative substrate preference is not known. To counteract the efflux of water, microorganisms accumulate the compatible solutes for a sustained adjustment to high osmolarity surroundings. YehZ belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	267
270330	cd13612	PBP2_ProWX	Substrate-binding protein ProWX of ABC-type osmoregulated transporter and its related proteins; the type 2 periplasmic-binding protein fold. Osmoprotectant binding lipoprotein ProWX of Helicobacter pylori is predicted to be involved in uptake of compatible solutes such as choline, L-proline and glycine betaine, but the relative substrate preference is not known. To counteract the efflux of water, microorganisms accumulate the compatible solutes for a sustained adjustment to high osmolarity surroundings. ProWX belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	267
270331	cd13613	PBP2_Opu_like_2	Substrate-binding domain of putative ABC-type osmoprotectant uptake system; the type 2 periplasmic-binding protein fold. This group includes the periplasmic substrate-binding component of a putative ABC transport system that is predicted to be involved in uptake of osmoprotectants (also termed compatible solutes) such as betaine, choline, proline betaine, carnitine, and L-proline. The relative substrate preference of this group is not known. To counteract the efflux of water, many microorganisms accumulate the compatible solutes for a sustained adjustment to high osmolarity surroundings. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	264
270332	cd13614	PBP2_QAT_like	Substrate-binding domain of quaternary amine ABC-type transporter; the type 2 periplasmic-binding protein fold. This group includes the periplasmic substrate-binding component of a putative quaternary amine ABC transport system that is predicted to be involved in uptake of osmoprotectants (also termed compatible solutes) such as betaine, choline, proline betaine, carnitine, and L-proline. The relative substrate preference of this group is not known. To counteract the efflux of water, many microorganisms accumulate the compatible solutes for a sustained adjustment to high osmolarity surroundings. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	264
270333	cd13615	PBP2_ProWY	Substrate-binding domain of ABC-type osmoregulated transporter; the type 2 periplasmic-binding protein fold. Osmoprotectant binding lipoprotein ProWY of Streptococcus thermophilus is predicted to be involved in uptake of compatible solutes such as choline, L-proline and glycine betaine, but the relative substrate preference is not known. To counteract the efflux of water, microorganisms accumulate the compatible solutes for a sustained adjustment to high osmolarity surroundings. ProWY belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	262
270334	cd13616	PBP2_OsmF	Substrate-binding domain OsmF of an osmoregulated ABC-type transporter; the type 2 periplasmic-binding protein fold. Osmoprotectant binding lipoprotein OsmF of an ABC transporter (YehZYXW) from Escherichia coli is predicted to be involved in uptake of compatible solutes such as choline, L-proline and glycine betaine, but the relative substrate preference is not known. To counteract the efflux of water, microorganisms accumulate the compatible solutes for a sustained adjustment to high osmolarity surroundings. OsmF belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	274
270335	cd13617	PBP2_transferrin_C	The C-lobe of transferrin, a member of the type 2 periplasmic binding protein fold superfamily. Transferrins are iron-binding blood plasma glycoproteins that regulate the level of free iron in biological fluids. Vertebrate transferrins are made of a single polypeptide chain with a molecular weight of about 80 kDa. The polypeptide is folded into two homologous lobes (the N-lobe and C-lobe), and each lobe is further subdivided into two similar alpha helices and beta sheets domains separated by a deep cleft that forms the binding site for ferric iron. Thus, the transferrin protein contains two homologous metal-binding sites with high affinities for ferric iron. The modern transferrin proteins are thought to be evolved from an ancestral gene coding for a protein of 40 kDa containing a single binding site by means of a gene duplication event. Vertebrate transferrins are found in a variety of bodily fluids, including serum transferrins, ovotransferrins, lactoferrins, and melanotransferrins. Transferrin-like proteins are also found in the circulatory fluid of certain invertebrates. The transferrins have the same structural fold as the type 2 periplasmic-binding proteins, many of which are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor.	331
270336	cd13618	PBP2_transferrin_N	The N-lobe of transferrin, a member of the type 2 periplasmic binding protein fold superfamily. Transferrins are iron-binding blood plasma glycoproteins that regulate the level of free iron in biological fluids. Vertebrate transferrins are made of a single polypeptide chain with a molecular weight of about 80 kDa. The polypeptide is folded into two homologous lobes (the N-lobe and C-lobe), and each lobe is further subdivided into two similar alpha helices and beta sheets domains separated by a deep cleft that forms the binding site for ferric iron. Thus, the transferrin protein contains two homologous metal-binding sites with high affinities for ferric iron. The modern transferrin proteins are thought to be evolved from an ancestral gene coding for a protein of 40 kDa containing a single binding site by means of a gene duplication event. Vertebrate transferrins are found in a variety of bodily fluids, including serum transferrins, ovotransferrins, lactoferrins, and melanotransferrins. Transferrin-like proteins are also found in the circulatory fluid of certain invertebrates. The transferrins have the same structural fold as the type 2 periplasmic-binding proteins, many of which are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor.	324
270337	cd13619	PBP2_GlnP	Glutamine-binding domain of ABC transporter, a member of the type 2 periplasmic binding fold protein superfamily. Periplasmic glutamine binding domain GlnP serves as an initial receptor in the ABC transport of glutamine in eubacteria. GlnP belongs to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. PBP2 proteins typically comprise two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	220
270338	cd13620	PBP2_GltS	Substrate binding domain of glutamate or arginine ABC transporter, a member of the type 2 periplasmic binding fold protein superfamily. This family comprises the periplasmic-binding protein component (GltS) of an ABC transporter specific for glutamate or arginine from Lactococcus lactis, as well as its closely related proteins. The GltS domain belongs to the type 2 periplasmic binding protein fold superfamily (PBP2), whose many members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	227
270339	cd13621	PBP2_AA_binding_like_3	Substrate-binding domain of putative amino acid-binding protein; the type 2 periplasmic-binding protein fold. This putative amino acid-binding protein belongs to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. PBP2 proteins typically comprise two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	229
270340	cd13622	PBP2_Arg_3	Substrate binding domain of an arginine 3rd transport system; the type 2 periplasmic binding fold. This subgroup is similar to the HisJ-like family that comprises the periplasmic substrate-binding proteins, including the lysine-, arginine-, ornithine-binding protein (LAO) and the histidine-binding protein (HisJ), which serve as initial receptors for active transport. HisJ and LAO proteins belong to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. PBP2 proteins typically comprise two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	222
270341	cd13623	PBP2_AA_hypothetical	Substrate-binding domain of putative amino-acid transport system; the type 2 periplasmic binding protein fold. This putative amino acid-binding protein belongs to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. PBP2 proteins typically comprise two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	220
270342	cd13624	PBP2_Arg_Lys_His	Substrate binding domain of the arginine-, lysine-, histidine-binding protein ArtJ; the type 2 periplasmic binding protein fold. This group includes the periplasmic substrate-binding protein ArtJ of the ATP-binding cassette (ABC) transport system from the thermophilic bacterium Geobacillus stearothermophilus, which is specific for arginine, lysine, and histidine. ArtJ belongs to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. PBP2 typically comprises of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	219
270343	cd13625	PBP2_AA_binding_like_1	Substrate-binding domain of putative amino acid-binding protein; the type 2 periplasmic-binding protein fold. This putative amino acid-binding protein belongs to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. PBP2 typically comprises of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	230
270344	cd13626	PBP2_Cystine_like	Substrate binding domain of cystine ABC transporters; the type 2 periplasmic binding protein fold. Cystine-binding domain of periplasmic receptor-dependent ATP-binding cassette (ABC) transporters.  Cystine is an oxidized dimeric form of cysteine that is required for optimal bacterial growth. In Bacillus subtilis, three ABC transporters, TcyJKLMN (YtmJKLMN), TcyABC (YckKJI), and YxeMNO are involved in uptake of cystine. Also, three uptake systems were identified in Salmonella enterica serovar Typhimurium, while in Escherichia coli, two transport systems seem to be involved in cystine uptake. Moreover, L-cystine limitation was shown to prevent virulence of Neisseria gonorrhoeae; thus, its L-cystine solute receptor (Ngo0372) may be suited as target for an antimicrobial vaccine. The cystine receptor belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	219
270345	cd13627	PBP2_AA_binding_like_2	Substrate-binding domain of putative amino acid-binding protein; the type 2 periplasmic-binding protein fold. This putative amino acid-binding protein belongs to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. PBP2 typically comprises of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	243
270346	cd13628	PBP2_Ala	Periplasmic substrate binding domain of ABC-type transporter specific to alanine; the type 2 periplasmic binding protein. This periplasmic substrate component serves as an initial receptor in the ABC transport of glutamine in eubacteria and archaea. After binding alanine with high affinity, this domain interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically-located ATPase domains.  This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.  This alanine-specific domain belongs to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	219
270347	cd13629	PBP2_Dsm1740	Amino acid-binding domain of the type 2 periplasmic binding fold superfamily. This subfamily includes the periplasmic binding protein type II (PBPII). This domain is found in solute binding proteins that serve as initial receptors in the ABC transport, signal transduction and channel gating.  The PBPII proteins share the same architecture as periplasmic binding proteins type I (PBPI), but have a different topology. They are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.  The majority of PBPII proteins function in the uptake of small soluble substrates in eubacteria and archaea.  After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.  Besides transport proteins, the family includes ionotropic glutamate receptors and unorthodox sensor proteins involved in signal transduction.	221
270348	cd13630	PBP2_PDT_1	Catalytic domain of prephenate dehydratase and similar proteins, subgroup 1; the type 2 periplasmic binding protein fold. Prephenate dehydratase (PDT, EC:4.2.1.51) converts prephenate to phenylpyruvate through dehydration and decarboxylation reactions. PDT plays a key role in the biosynthesis of L-Phe in organisms that utilize the shikimate pathway. PDT is allosterically regulated by L-Phe and other amino acids. The catalytic PDT domain consists of two similar subdomains with a cleft in between, which hosts the highly conserved active site. In gram-positive bacteria and archaea, PDT is a monofunctional enzyme, consisting of a catalytic domain (PDT domain) and a regulatory domain (ACT) (aspartokinase, chorismate mutase domain). In gram-negative bacteria, PDT exists as a fusion protein with chorismate mutase (CM), forming a bifunctional enzyme, P-protein (PheA). The CM in the P-protein catalyzes the pericyclic isomerization of chorismate to prephenate that serves as a substrate for PDT. The CM and PDT are essential enzymes for the biosynthesis of aromatic amino acids in microorganisms but are not found in humans. Thus, both CM and PDT can potentially serve as drug targets against microbial pathogens. The PDT domain has the same structural fold as the type 2 periplasmic binding proteins (PBP2), many of which are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	180
270349	cd13631	PBP2_Ct-PDT_like	Catalytic domain of prephenate dehydratase from Chlorobium tepidum and similar proteins, subgroup 2; the type 2 periplasmic binding protein fold. Prephenate dehydratase (PDT, EC:4.2.1.51) converts prephenate to phenylpyruvate through dehydration and decarboxylation reactions. PDT plays a key role in the biosynthesis of L-Phe in organisms that utilize the shikimate pathway. PDT is allosterically regulated by L-Phe and other amino acids. The catalytic PDT domain consists of two similar subdomains with a cleft in between, which hosts the highly conserved active site. In gram-positive bacteria and archaea, PDT is a monofunctional enzyme, consisting of a catalytic domain (PDT domain) and a regulatory domain (ACT) (aspartokinase, chorismate mutase domain). In gram-negative bacteria, PDT exists as a fusion protein with chorismate mutase (CM), forming a bifunctional enzyme, P-protein (PheA). The CM in the P-protein catalyzes the pericyclic isomerization of chorismate to prephenate that serves as a substrate for PDT. The CM and PDT are essential enzymes for the biosynthesis of aromatic amino acids in microorganisms but are not found in humans. Thus, both CM and PDT can potentially serve as drug targets against microbial pathogens. The PDT domain has the same structural fold as the type 2 periplasmic binding proteins (PBP2), many of which are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	182
270350	cd13632	PBP2_Aa-PDT_like	Catalytic domain of prephenate dehydratase from Arthrobacter aurescens and similar proteins, subgroup 3; the type 2 periplasmic binding protein fold. Prephenate dehydratase (PDT, EC:4.2.1.51) converts prephenate to phenylpyruvate through dehydration and decarboxylation reactions. PDT plays a key role in the biosynthesis of L-Phe in organisms that utilize the shikimate pathway. PDT is allosterically regulated by L-Phe and other amino acids. The catalytic PDT domain consists of two similar subdomains with a cleft in between, which hosts the highly conserved active site. In gram-positive bacteria and archaea, PDT is a monofunctional enzyme, consisting of a catalytic domain (PDT domain) and a regulatory domain (ACT) (aspartokinase, chorismate mutase domain). In gram-negative bacteria, PDT exists as a fusion protein with chorismate mutase (CM), forming a bifunctional enzyme, P-protein (PheA). The CM in the P-protein catalyzes the pericyclic isomerization of chorismate to prephenate that serves as a substrate for PDT. The CM and PDT are essential enzymes for the biosynthesis of aromatic amino acids in microorganisms but are not found in humans. Thus, both CM and PDT can potentially serve as drug targets against microbial pathogens. The PDT domain has the same structural fold as the type 2 periplasmic binding proteins (PBP2), many of which are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	183
270351	cd13633	PBP2_Sa-PDT_like	Catalytic domain of prephenate dehydratase from Staphylococcus aureus and similar proteins, subgroup 4; the type 2 periplasmic binding protein fold. Prephenate dehydratase (PDT, EC:4.2.1.51) converts prephenate to phenylpyruvate through dehydration and decarboxylation reactions. PDT plays a key role in the biosynthesis of L-Phe in organisms that utilize the shikimate pathway. PDT is allosterically regulated by L-Phe and other amino acids. The catalytic PDT domain consists of two similar subdomains with a cleft in between, which hosts the highly conserved active site. In gram-positive bacteria and archaea, PDT is a monofunctional enzyme, consisting of a catalytic domain (PDT domain) and a regulatory domain (ACT) (aspartokinase, chorismate mutase domain). In gram-negative bacteria, PDT exists as a fusion protein with chorismate mutase (CM), forming a bifunctional enzyme, P-protein (PheA). The CM in the P-protein catalyzes the pericyclic isomerization of chorismate to prephenate that serves as a substrate for PDT. The CM and PDT are essential enzymes for the biosynthesis of aromatic amino acids in microorganisms but are not found in humans. Thus, both CM and PDT can potentially serve as drug targets against microbial pathogens. The PDT domain has the same structural fold as the type 2 periplasmic binding proteins (PBP2), many of which are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	184
270352	cd13634	PBP2_Sco4506	The conserved hypothetical protein SCO4506 exhibits the type 2 periplasmic-binding protein fold. This group includes the SCO4506 protein from Streptomyces coelicolor and related hypothetical proteins. SCO4506 is an ortholog of Ttha1568 (MqnD) from Thermus thermophilus HB8. MqnD is an enzyme within an alternative menaquinone biosynthetic pathway that catalyzes the conversion of cyclic de-hypoxanthine futalosine to 1,4-dihydroxy-6-naphthoate. Menaquinone (MK; vitamin K) is an essential lipid-soluble carrier that shuttles electrons between membrane-bound protein complexes in the electron transport chain. SCO4506 has significant structural homology with the members of type 2 periplasmic-binding fold protein superfamily. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	256
270353	cd13635	PBP2_Ttha1568_Mqnd	A menaquinone biosynthetic enzyme exhibits the type 2 periplasmic-binding protein fold. This group includes Ttha1568 (MqnD) from Thermus thermophilus HB8, an enzyme within an alternative menaquinone biosynthetic pathway that catalyzes the conversion of cyclic de-hypoxanthine futalosine to 1,4-dihydroxy-6-naphthoate. Menaquinone (MK; vitamin K) is an essential lipid-soluble carrier that shuttles electrons between membrane-bound protein complexes in the electron transport chain. Ttha1568 has significant structural homology with the members of type 2 periplasmic-binding fold protein superfamily. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	260
270354	cd13636	PBP2_Af1704	The conserved hypothetical protein Af1704 exhibits the type 2 periplasmic-binding protein fold. This group includes the Af1704 protein from Archaeoglobus fulgidus DSM 4304, which is an ortholog of Ttha1568 (MqnD) from Thermus thermophilus HB8. MqnD is an enzyme within an alternative menaquinone biosynthetic pathway that catalyzes the conversion of cyclic de-hypoxanthine futalosine to 1,4-dihydroxy-6-naphthoate. Menaquinone (MK; vitamin K) is an essential lipid-soluble carrier that shuttles electrons between membrane-bound protein complexes in the electron transport chain. Af1704 has significant structural homology with the members of type 2 periplasmic-binding fold protein superfamily. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	259
270355	cd13637	PBP2_Ca3427_like	The conserved hypothetical protein Ca3427 exhibits the type 2 periplasmic-binding protein fold. This group includes the Ca3427 protein from Candida albicans, which is an ortholog of Ttha1568 (MqnD) from Thermus thermophilus HB8, and other related hypothetical proteins. MqnD is an enzyme within an alternative menaquinone biosynthetic pathway that catalyzes the conversion of cyclic de-hypoxanthine futalosine to 1,4-dihydroxy-6-naphthoate. Menaquinone (MK; vitamin K) is an essential lipid-soluble carrier that shuttles electrons between membrane-bound protein complexes in the electron transport chain. Ca3427 has significant structural homology with the members of type 2 periplasmic-binding fold protein superfamily. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	273
270356	cd13638	PBP2_EcProx_like	Substrate binding domain of Escherichia coli betaine transport system-like; the type 2 periplasmic binding protein fold. This group includes the periplasmic substrate-binding protein ProX. ProX from the Escherichia coli ATP-binding cassette transport system ProU binds the compatible solutes glycine betaine and proline betaine with high affinity and specificity. Many microorganisms accumulate these compatible solutes in response to high osmolarity to offset the loss of cell water. The ProX belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	299
270357	cd13639	PBP2_OpuAC_like	Substrate binding domain of Lactococcus lactis ABC-type transporter OpuA and related proteins; the type 2 periplasmic binding protein fold. This subfamily is part of a high affinity multicomponent binding-protein-dependent transport system specific to betaine compounds for osmoregulation. The periplasmic substrate-binding domain, which is often fused to the permease component of the ATP-binding cassette transporter complex, is involved in uptake of osmoprotectants (also termed compatible solutes) such as glycine betaine and proline betaine. Many microorganisms accumulate these compatible solutes in response to high osmolarity to offset the loss of cell water. This domain belongs to the type 2 periplasmic binding  fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	254
270358	cd13640	PBP2_ChoX	Substrate binding domain of ABC-type choline transport system; the type 2 periplasmic binding protein fold. This subfamily is part of a high affinity multicomponent binding-protein-dependent transport system specific to choline and acetylcholine for osmoregulation. The periplasmic substrate-binding domain, which is often fused to the permease component of the ATP-binding cassette transporter complex, is involved in uptake of osmoprotectants (also termed compatible solutes) such as choline and betaines. Choline is necessary for the biosynthesis of glycine betaine. Many microorganisms accumulate these compatible solutes in response to high osmolarity to offset the loss of cell water. In the case of the Sinorhizobium meliloti choline uptake system ChoVWX, ChoV is the nucleotide-binding domain that provides energy for the transport process via ATP hydrolysis, ChoW is the integral transmembrane protein that forms the substrate translocation pathway, and ChoX is the substrate-binding domain. ChoX belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	266
270359	cd13641	PBP2_HisX_like	Substrate-binding domain of ABC-type histidine transporter involved in betaine and proline uptake; the type 2 periplasmic-binding protein fold. This subfamily is part of a high affinity multicomponent binding-protein-dependent transport system specific to certain quaternary ammonium compounds for osmoregulation. The periplasmic substrate-binding domain, which is often fused to the permease component of the ATP-binding cassette transporter complex, is involved in uptake of osmoprotectants (also termed compatible solutes) such as glycine betaine, proline betaine, choline, and carnitine. Many microorganisms accumulate these compatible solutes in response to high osmolarity to offset the loss of cell water. This domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	261
270360	cd13642	PBP2_BCP_1	Substrate-binding domain of osmoregulatory ABC-type glycine betaine/choline/L-proline transport system-like; the type 2 periplasmic-binding protein fold. This subfamily is part of a high affinity multicomponent binding-protein-dependent transport system specific to certain quaternary ammonium compounds for osmoregulation. The periplasmic substrate-binding domain, which is often fused to the permease component of the ATP-binding cassette transporter complex, is involved in uptake of osmoprotectants (also termed compatible solutes) such as glycine betaine, proline betaine, choline, and carnitine. Many microorganisms accumulate these compatible solutes in response to high osmolarity to offset the loss of cell water. This domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	292
270361	cd13643	PBP2_BCP_2	Substrate-binding domain of osmoregulatory ABC-type glycine betaine/choline/L-proline transport system-like; the type 2 periplasmic-binding protein fold. This subfamily is part of a high affinity multicomponent binding-protein-dependent transport system specific to certain quaternary ammonium compounds for osmoregulation. The periplasmic substrate-binding domain, which is often fused to the permease component of the ATP-binding cassette transporter complex, is involved in uptake of osmoprotectants (also termed compatible solutes) such as glycine betaine, proline betaine, choline, and carnitine. Many microorganisms accumulate these compatible solutes in response to high osmolarity to offset the loss of cell water. This domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	283
270362	cd13644	PBP2_HemC_archaea	Archaeal HemC of hydroxymethylbilane synthase family; the type 2 periplasmic binding protein fold. Hydroxymethylbilane synthase (HMBS), also known as porphobilinogen deaminase (PBGD), is an intermediate enzyme in the biosynthetic pathway of tetrapyrrolic ring systems, such as heme, chlorophyll, and vitamin B12. HMBS catalyzes the conversion of porphobilinogen (PBG) into hydroxymethylbilane (HMB). This subfamily includes the three domains of HMBS. The enzyme is believed to bind substrate through a hinge-bending motion of domains 1 and 2. The C-terminal domain 3 contains an invariant cysteine that forms the covalent attachment site for the DPM (dipyrromethane) cofactor. HMBS is found in all organisms except viruses. The domains 1 and 2 have the same overall topology as found in the type 2 periplasmic-binding proteins (PBP2), many of which are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor.	273
270363	cd13645	PBP2_HuPBGD_like	Human porphobilinogen deaminase possesses the type 2 periplasmic binding protein fold. Hydroxymethylbilane synthase (HMBS), also known as porphobilinogen deaminase (PBGD), is an intermediate enzyme in the biosynthetic pathway of tetrapyrrolic ring systems, such as heme, chlorophyll, and vitamin B12. HMBS catalyzes the conversion of porphobilinogen (PBG) into hydroxymethylbilane (HMB).  This subfamily includes the three domains of human PBGD and its closely related proteins. Mutations in human PBGD cause AIP (acute intermittent porphyria), an inherited autosomal dominant disorder. The enzyme is believed to bind substrate through a hinge-bending motion of domains 1 and 2. The C-terminal domain 3 contains an invariant cysteine that forms the covalent attachment site for the DPM (dipyrromethane) cofactor. HMBS is found in all organisms except viruses. The domains 1 and 2 have the same overall topology as found in the type 2 periplasmic-binding proteins (PBP2), many of which are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor.	282
270364	cd13646	PBP2_EcHMBS_like	cd00494. Hydroxymethylbilane synthase (HMBS), also known as porphobilinogen deaminase (PBGD), is an intermediate enzyme in the biosynthetic pathway of tetrapyrrolic ring systems, such as heme, chlorophyll, and vitamin B12. HMBS catalyzes the conversion of porphobilinogen (PBG) into hydroxymethylbilane (HMB).  This subfamily includes the three domains of Escherichia coli HMBS and its closely related proteins. The enzyme is believed to bind substrate through a hinge-bending motion of domains 1 and 2. The C-terminal domain 3 contains an invariant cysteine that forms the covalent attachment site for the DPM (dipyrromethane) cofactor. HMBS is found in all organisms except viruses. The domains 1 and 2 have the same overall topology as found in the type 2 periplasmic-binding proteins (PBP2), many of which are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor.	274
270365	cd13647	PBP2_PBGD_2	An uncharacterized subgroup of the PBGD family; the type 2 periplasmic binding protein fold. Hydroxymethylbilane synthase (HMBS), also known as porphobilinogen deaminase (PBGD), is an intermediate enzyme in the biosynthetic pathway of tetrapyrrolic ring systems, such as heme, chlorophyll, and vitamin B12. HMBS catalyzes the conversion of porphobilinogen (PBG) into hydroxymethylbilane (HMB).  This subfamily includes the three domains of HMBS. The enzyme is believed to bind substrate through a hinge-bending motion of domains 1 and 2. The C-terminal domain 3 contains an invariant cysteine that forms the covalent attachment site for the DPM (dipyrromethane) cofactor. HMBS is found in all organisms except viruses. The domains 1 and 2 have the same overall topology as found in the type 2 periplasmic-binding proteins (PBP2), many of which are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor.	282
270366	cd13648	PBP2_PBGD_1	An uncharacterized subgroup of the PBGD family; the type 2 periplasmic binding protein fold. Hydroxymethylbilane synthase (HMBS), also known as porphobilinogen deaminase (PBGD), is an intermediate enzyme in the biosynthetic pathway of tetrapyrrolic ring systems, such as heme, chlorophyll, and vitamin B12. HMBS catalyzes the conversion of porphobilinogen (PBG) into hydroxymethylbilane (HMB).  This subfamily includes the three domains of HMBS. The enzyme is believed to bind substrate through a hinge-bending motion of domains 1 and 2. The C-terminal domain 3 contains an invariant cysteine that forms the covalent attachment site for the DPM (dipyrromethane) cofactor. HMBS is found in all organisms except viruses. The domains 1 and 2 have the same overall topology as found in the type 2 periplasmic-binding proteins (PBP2), many of which are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor.	278
270367	cd13649	PBP2_Cae31940	Substrate binding domain of an uncharacterized protein similar to ABC-type transporter for thiamin biosynthetic pathway intermediates; a member of the type 2 periplasmic binding fold superfamily. This subfamily includes the periplasmic-binding protein Cae31940 which is phylogenetically similar to the ThiY/THI5 family. ThiY is the periplasmic N-formyl-4-amino-5-(aminomethyl)-2-methylpyrimidine (FAMP) binding component of the ABC transport system (ThiXYZ). FAMP is imported into the cell by the transporter, where it is then incorporated into the thiamin biosynthetic pathway. The closest structural homologs of ThiY are THI5, which is responsible for the synthesis of 4-amino-5-(hydroxymethyl)-2-methylpyrimidine phosphate (HMP-P) in the thiamin biosynthetic pathway of eukaryotes, and periplasmic binding proteins involved in alkanesulfonate/nitrate and bicarbonate transport. After binding the ligand, they interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The ThiY/THI5 proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	223
270368	cd13650	PBP2_THI5	Substrate binding domain of ABC-type transporters for thiamin biosynthetic pathway intermediates; a member of the type 2 periplasmic binding fold superfamily. ThiY is the periplasmic N-formyl-4-amino-5-(aminomethyl)-2-methylpyrimidine (FAMP) binding component of the ABC transport system (ThiXYZ). FAMP is imported into the cell by the transporter, where it is then incorporated into the thiamin biosynthetic pathway. The closest structural homologs of ThiY are periplasmic binding proteins involved in alkanesulfonate/nitrate and bicarbonate transport, as well as THI5, which is responsible for the synthesis of 4-amino-5-(hydroxymethyl)-2-methylpyrimidine phosphate (HMP-P) in the thiamin biosynthetic pathway of eukaryotes.  After binding the ligand, they interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The ThiY/THI5 proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	251
270369	cd13651	PBP2_ThiY	Substrate binding domain of ABC-type transporters for thiamin biosynthetic pathway intermediates; a member of the type 2 periplasmic binding fold superfamily. ThiY is the periplasmic N-formyl-4-amino-5-(aminomethyl)-2-methylpyrimidine (FAMP) binding component of the ABC transport system (ThiXYZ). FAMP is imported into the cell by the transporter, where it is then incorporated into the thiamin biosynthetic pathway. The closest structural homologs of ThiY are periplasmic binding proteins involved in alkanesulfonate/nitrate and bicarbonate transport, as well as THI5, which is responsible for the synthesis of 4-amino-5-(hydroxymethyl)-2-methylpyrimidine phosphate (HMP-P) in the thiamin biosynthetic pathway of eukaryotes.  After binding the ligand, they interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The ThiY/THI5 proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	214
270370	cd13652	PBP2_ThiY_THI5_like_1	Putative substrate binding domain of an ABC-type transporter similar to ThiY/THI5; the type 2 periplasmic binding protein fold. This subfamily is phylogenetically similar to ThiY, which is the periplasmic N-formyl-4-amino-5-(aminomethyl)-2-methylpyrimidine (FAMP) binding component of the ABC transport system (ThiXYZ). FAMP is imported into cell by the transporter, where it is then incorporated into the thiamin biosynthetic pathway. The closest structural homologs of ThiY are THI5, which is responsible for the synthesis of 4-amino-5-(hydroxymethyl)-2-methylpyrimidine phosphate (HMP-P) in the thiamin biosynthetic pathway of eukaryotes, and periplasmic binding proteins involved in alkanesulfonate/nitrate and bicarbonate transport. After binding the ligand, they interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The ThiY/THI5 proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	217
270371	cd13653	PBP2_phosphate_like_1	Substrate binding domain of putative ABC-type phosphate transporter, a member of the type 2 periplasmic binding fold superfamily. This subfamily contains uncharacterized phosphate binding domains found in PstS proteins that serve as initial receptors in the ABC transport of phosphate in eubacteria and archaea. After binding the ligand, PstS interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The PstS proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	240
270372	cd13654	PBP2_phosphate_like_2	Substrate binding domain of putative ABC-type phosphate transporter, a member of the type 2 periplasmic binding fold superfamily. This subfamily contains uncharacterized phosphate binding domains found in PstS proteins that serve as initial receptors in the ABC transport of phosphate in eubacteria and archaea. After binding the ligand, PstS interacts with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. The PstS proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	259
270373	cd13655	PBP2_oligosaccharide_1	The periplasmic binding component of ABC transport system specific for an unknown oligosaccharide; possesses the type 2 periplasmic binding fold. This group represents an uncharacterized periplasmic-binding protein of an ATP-binding cassette transporter predicted to be involved in uptake of an unknown oligosaccharide molecule. Members of this group belong to the type 2 periplasmic-binding fold superfamily. PBP2 is comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	363
270374	cd13656	PBP2_MBP	The periplasmic binding component of ABC transport system specific for maltose; possesses the type 2 periplasmic binding fold. This group includes the periplasmic maltose-binding protein of an ATP-binding cassette transporter. Maltose is a disaccharide formed from two units of glucose. Members of this group belong to the type 2 periplasmic-binding fold superfamily. PBP2 is comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	364
270375	cd13657	PBP2_Maltodextrin	The periplasmic binding component of ABC transport system specific for maltodextrin. This group includes the periplasmic maltodextrin-binding protein of a binding protein-dependent ATP-binding cassette transporter. Maltodextrin is a polysaccharide that is used as a food additive and can be enzymatically produced from any starch. Members of this group belong to the type 2 periplasmic-binding fold superfamily. PBP2 is comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	368
270376	cd13658	PBP2_CMBP	The periplasmic binding component of ABC transport systems specific for cyclo/maltodextrin; possesses the type 2 periplasmic binding fold. This group includes the periplasmic cyclo/maltodextrin-binding protein of Thermoactinomyces vulgaris ATP-binding cassette transporter and related proteins. Cyclodextrins are a family of compounds composed of glucose units connected by 1,4-glycosidic linkages to form a series of oligosaccharide rings, and their cavity is hydrophobic, which allows cyclodextrins to accommodate hydrophobic molecules/moieties in the cavity. Members of this group belong to the type 2 periplasmic-binding fold superfamily. PBP2 is comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	372
270377	cd13659	PBP2_PotF	The periplasmic substrate-binding component of an ABC putrescine transport system and related proteins; contains the type 2 periplasmic-binding fold. This group represents the periplasmic substrate-binding domain that serves as the primary polyamine receptor of ABC-type putrescine-preferential transporter from gram-negative bacteria. Polyamine transport plays an essential role in the regulation of intracellular polyamine levels which are known to be elevated in rapidly proliferating cells and tumors. Natural polyamines are putrescine, spermidine, and spermine. They are polycations that play multiple roles in cell growth, survival and proliferation, and plant stress and disease resistance. They can interact with negatively charged molecules, such as nucleic acids, to modulate their functions. Members of this family belong to the type 2 periplasmic-binding fold superfamily. PBP2 is comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	331
270378	cd13660	PBP2_PotD	The periplasmic substrate-binding component of an active spermidine-preferential transport system; contains the type 2 periplasmic binding fold. This group represents the periplasmic binding domain that serves as the primary polyamine receptor of ABC-type spermidine-preferential transport system from gram-negative bacteria. Polyamine transport plays an essential role in the regulation of intracellular polyamine levels which are known to be elevated in rapidly proliferating cells and tumors. Natural polyamines are putrescine, spermidine, and spermine. They are polycations that play multiple roles in cell growth, survival and proliferation, and plant stress and disease resistance. They can interact with negatively charged molecules, such as nucleic acids, to modulate their functions. Members of this family belong to the type 2 periplasmic-binding fold superfamily. PBP2 is comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	315
270379	cd13661	PBP2_PotD_PotF_like_1	The periplasmic substrate-binding component of an uncharacterized active transport system closely related to spermidine and putrescine transporters; contains the type 2 periplasmic binding fold. This group represents the periplasmic binding domain that serves as a primary polyamine receptor of an uncharacterized ABC-type transport system from plants and plant-symbiotic cyanobacteria. Polyamine transport plays an essential role in the regulation of intracellular polyamine levels which are known to be elevated in rapidly proliferating cells and tumors. Natural polyamines are putrescine, spermidine, and spermine. They are polycations that play multiple roles in cell growth, survival and proliferation, as well as plant stress and disease resistance. They can interact with negatively charged molecules, such as nucleic acids, to modulate their functions. Members of this family belong to the type 2 periplasmic-binding fold superfamily. PBP2 is comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	319
270380	cd13662	PBP2_TpPotD_like	The periplasmic substrate-binding component of an ABC-type polyamine transport system from Treponema pallidum and related proteins; contains the type 2 periplasmic binding fold. This group includes the polyamine-binding component of an ABC-type polyamine transport system from Treponema pallidum and closely related proteins, which is homologous to the spermidine-preferring periplasmic substrate-binding protein component (PotD) of ABC transport system. Polyamine transport plays an essential role in the regulation of intracellular polyamine levels which are known to be elevated in rapidly proliferating cells and tumors. Natural polyamines are putrescine, spermidine, and spermine. They are polycations that play multiple roles in cell growth, survival and proliferation, as well as plant stress and disease resistance. They can interact with negatively charged molecules, such as nucleic acids, to modulate their functions. Members of this family belong to the type 2 periplasmic-binding fold superfamily. PBP2 is comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	312
270381	cd13663	PBP2_PotD_PotF_like_2	The periplasmic substrate-binding component of an uncharacterized active transport system closely related to spermidine and putrescine transporters; contains the type 2 periplasmic binding fold. This group represents the periplasmic substrate-binding domain that serves as a primary polyamine receptor of an uncharacterized ABC-type transport system from gram-negative bacteria. Polyamine transport plays an essential role in the regulation of intracellular polyamine levels which are known to be elevated in rapidly proliferating cells and tumors. Natural polyamines are putrescine, spermidine, and spermine. They are polycations that play multiple roles in cell growth, survival and proliferation, as well as plant stress and disease resistance. They can interact with negatively charged molecules, such as nucleic acids, to modulate their functions. Members of this family belong to the type 2 periplasmic-binding fold superfamily. PBP2 is comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	323
270382	cd13664	PBP2_PotD_PotF_like_3	The periplasmic substrate-binding component of an uncharacterized active transport system closely related to spermidine and putrescine transporters; contains the type 2 periplasmic binding fold. This family represents the periplasmic substrate-binding domain that functions as the primary high-affinity receptor of ABC-type polyamine transport systems. Polyamine transport plays an essential role in the regulation of intracellular polyamine levels which are known to be elevated in rapidly proliferating cells and tumors. Natural polyamines are putrescine, spermidine, and spermine. They are polycations that play multiple roles in cell growth, survival and proliferation, and plant stress and disease resistance. They can interact with negatively charged molecules, such as nucleic acids, to modulate their functions. Members of this family belong to the type 2 periplasmic-binding fold superfamily. PBP2 is comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	315
270383	cd13665	PBP2_TRAP_Dctp3_4	Periplasmic substrate-binding component of TRAP-type C4-dicarboxylate transport system DctP3 and DctP4; the type 2 periplasmic-binding protein fold. This group includes uncharacterized DctP3 and DctP4 subfamilies of TRAP Transporters specific to C4-dicarboxylates such as succinate, malate and fumarate. TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. This CD also included some eukaryotic homologs that have not been functionally characterized. TRAP transporters are comprised of a periplasmic substrate-binding protein (SBP; often called the P subunit) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process (the M subunit) and a smaller membrane of unknown function (the Q subunit). The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	302
270384	cd13666	PBP2_TRAP_DctP_like_1	Substrate-binding component of an uncharacterized TRAP-type C4-dicarboxylate transport system; the type 2 periplasmic-binding protein fold. This group includes a DctP subfamily of TRAP Transporters specific to C4-dicarboxylates such as succinate, malate and fumarate. TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. This CD also included some eukaryotic homologs that have not been functionally characterized. TRAP transporters are comprised of a periplasmic substrate-binding protein (SBP; often called the P subunit) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process (the M subunit) and a smaller membrane of unknown function (the Q subunit). The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	303
270385	cd13667	PBP2_TRAP_DctP1	Periplasmic substrate-binding component of an uncharacterized TRAP-type C4-dicarboxylate transport system DctP1; contains the type 2 periplasmic-binding protein fold. This group includes an uncharacterized DctP1 subfamily of the TRAP Transporters. TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP; often called the P subunit) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process (the M subunit) and a smaller membrane of unknown function (the Q subunit). The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	295
270386	cd13668	PBP2_TRAP_UehA_TeaA	Periplasmic substrate-binding component of osmoregulatory TRAP transporters TeaA and UehA; the type 2 periplasmic-binding protein fold. This subfamily includes the periplasmic-binding component of the ectoine-specific TRAP transporters TeaA from Halomonas elongata and UehA from Ruegeria pomeroyi. TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process and a smaller membrane of unknown function. The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	305
270387	cd13669	PBP2_TRAP_TM0322_like	Periplasmic component of TRAP-type C4-dicarboxylate transport system TM0322 from Thermotoga maritima and similar proteins; the type 2 periplasmic binding protein fold. This subgroup includes the hyperthermophilic bacterium Thermotoga maritima TRAP-type C4-dicarboxylate transport system TM0322 and its closely related proteins. TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP; often called the P subunit) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process (the M subunit) and a smaller membrane of unknown function (the Q subunit). The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	296
270388	cd13670	PBP2_TRAP_Tp0957_like	Uncharacterized substrate-binding protein of the Tripartite ATP-independent  Periplasmic transporter family; the type 2 periplasmic-binding protein fold. This subfamily includes the putative periplasmic substrate-binding protein Tp0957 from Treponema pallidum, which is similar to TRAP transport systems such as SiaP (a sialic acid binding virulence factor) and TeaA (an ectoine binding protein). TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process and a smaller membrane of unknown function. The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	298
270389	cd13671	PBP2_TRAP_SBP_like_3	Uncharacterized substrate-binding protein of the Tripartite ATP-independent  Periplasmic transporter family; the type 2 periplasmic-binding protein fold. This subfamily includes uncharacterized periplasmic substrate-binding proteins similar to TRAP transport systems such as SiaP (a sialic acid binding virulence factor) and TeaA (an ectoine binding protein). TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process and a smaller membrane of unknown function. The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	296
270390	cd13672	PBP2_TRAP_Siap	Substrate-binding domain of a sialic acid binding Tripartite ATP-independent  Periplasmic transport system (SiaP); the type 2 periplasmic-binding protein fold. This subfamily represents the periplasmic-binding component of TRAP transport system SiaP, a sialic acid binding virulence factor. TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process and a smaller membrane of unknown function. The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	295
270391	cd13673	PBP2_TRAP_SBP_like_2	Uncharacterized substrate-binding protein of the Tripartite ATP-independent  Periplasmic transporter family; the type 2 periplasmic-binding protein fold. This subfamily includes uncharacterized periplasmic substrate-binding proteins similar to TRAP transport systems such as SiaP (a sialic acid binding virulence factor) and TeaA (an ectoine binding protein). TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process and a smaller membrane of unknown function. The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	301
270392	cd13674	PBP2_TRAP_SBP_like_1	Uncharacterized substrate-binding protein of the Tripartite ATP-independent  Periplasmic transporter family; the type 2 periplasmic-binding protein fold. This subfamily includes uncharacterized periplasmic substrate-binding proteins similar to TRAP transport systems such as SiaP (a sialic acid binding virulence factor) and TeaA (an ectoine binding protein). TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process and a smaller membrane of unknown function. The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	299
270393	cd13675	PBP2_TRAP_SBP_like_5	Uncharacterized substrate-binding protein of the Tripartite ATP-independent  Periplasmic transporter family; the type 2 periplasmic-binding protein fold. This subfamily includes uncharacterized periplasmic substrate-binding proteins similar to TRAP transport systems such as SiaP (a sialic acid binding virulence factor) and TeaA (an ectoine binding protein). TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process and a smaller membrane of unknown function. The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	296
270394	cd13676	PBP2_TRAP_DctP2_like	Substrate-binding component of Tripartite ATP-independent Periplasmic transporter DctP2 and related proteins;  the type 2 periplasmic-binding protein fold. This subgroup includes TRAP transporter DctP2 and its similar proteins. TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP; often called the P subunit) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process (the M subunit) and a smaller membrane of unknown function (the Q subunit). The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	297
270395	cd13677	PBP2_TRAP_SBP_like_6	Uncharacterized substrate-binding protein of the Tripartite ATP-independent  Periplasmic transporter family; the type 2 periplasmic-binding protein fold. This subfamily includes uncharacterized periplasmic substrate-binding proteins similar to TRAP transport systems such as SiaP (a sialic acid binding virulence factor) and TeaA (an ectoine binding protein). TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process and a smaller membrane of unknown function. The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	304
270396	cd13678	PBP2_TRAP_DctP10	Substrate-binding component of Tripartite ATP-independent Periplasmic transporter DctP10;  the type 2 periplasmic-binding protein fold. This subgroup includes TRAP transporter DctP10 and its similar proteins. TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP; often called the P subunit) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process (the M subunit) and a smaller membrane of unknown function (the Q subunit). The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	300
270397	cd13679	PBP2_TRAP_YiaO_like	Substrate-binding domain of 2,3-diketo-L-gulonate-binding Tripartite ATP-independent  Periplasmic transport system and related proteins; the type 2 periplasmic-binding protein fold. This subfamily includes the solute receptor protein YiaO of TRAP transport system. TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process and a smaller membrane of unknown function. The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	298
270398	cd13680	PBP2_TRAP_SBP_like_4	Uncharacterized substrate-binding protein of the Tripartite ATP-independent  Periplasmic transporter family; the type 2 periplasmic-binding protein fold. This subfamily includes uncharacterized periplasmic substrate-binding proteins similar to TRAP transport systems such as SiaP (a sialic acid binding virulence factor) and TeaA (an ectoine binding protein). TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process and a smaller membrane of unknown function. The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	300
270399	cd13681	PBP2_TRAP_lactate	Substrate-binding component of a lactate binding Tripartite ATP-independent Periplasmic transporter and related proteins; the type 2 periplasmic-binding protein fold. This subgroup includes a lactate binding TRAP transporter and its similar proteins. TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP; often called the P subunit) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process (the M subunit) and a smaller membrane of unknown function (the Q subunit). The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	311
270400	cd13682	PBP2_TRAP_alpha-ketoacid	Substrate-binding component of an alpha-keto acid binding Tripartite ATP-independent Periplasmic transporter and related proteins; contains the type 2 periplasmic-binding protein fold. This subgroup includes TRAP transporters that bind to ketoacids such as pyruvate and alpha-ketobutyrate, xylulose, and other unknown ligands. TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP; often called the P subunit) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process (the M subunit) and a smaller membrane of unknown function (the Q subunit). The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	323
270401	cd13683	PBP2_TRAP_DctP6_7	Substrate-binding domain of Tripartite ATP-independent Periplasmic transporter DctP6 and DctP7; type 2 periplasmic-binding protein fold. This subgroup includes TRAP-type mannitol/chloroaromatic compound transport system (Dctp6) and similar proteins. TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process and a smaller membrane of unknown function. The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the PBP2 superfamily. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	304
270402	cd13684	PBP2_TRAP_Dctp5_like	Substrate-binding component of Tripartite ATP-independent Periplasmic transporter DctP5 and related proteins;  the type 2 periplasmic-binding protein fold. This subgroup includes TRAP transporter DctP5 and its similar proteins. TRAP transporters are a large family of solute transporters ubiquitously found in bacteria and archaea. They are comprised of a periplasmic substrate-binding protein (SBP; often called the P subunit) and two unequally sized integral membrane components: a large transmembrane subunit involved in the translocation process (the M subunit) and a smaller membrane of unknown function (the Q subunit). The driving force of TRAP transporters is provided by electrochemical ion gradients (either protons or sodium ions) across the cytoplasmic membrane, rather than ATP hydrolysis. This substrate-binding domain belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	314
270403	cd13685	PBP2_iGluR_non_NMDA_like	The ligand-binding domain of non-NMDA (N-methyl-D-aspartate) type ionotropic glutamate receptors, a member of the type 2 periplasmic-binding fold protein superfamily. This subfamily represents the ligand-binding domain of non-NMDA (N-methyl-D-aspartate) type ionotropic glutamate receptors including AMPA (alpha-amino-3-hydroxyl-5-methyl-4-isoxazolepropionic acid) receptors (GluR1-4), kainate receptors (GluR5-7 and KA1/2), and orphan receptors delta 1/2. iGluRs form tetrameric ligand-gated ion channels, which are concentrated at postsynaptic sites in excitatory synapses where they fulfill a variety of different functions. While this ligand-binding domain of iGluRs is structurally homologous to the periplasmic binding fold type II superfamily, the N-terminal leucine/isoleucine/valine-binding protein (LIVBP)-like domain belongs to the periplasmic-binding fold type I.	252
270404	cd13686	GluR_Plant	Plant glutamate receptor domain; the type 2 periplasmic binding protein fold. This subfamily contains the glutamate receptor domain GluR. These domains are found in the GluR proteins that have been shown to function as L-glutamate activated potassium channels, also known as ionotropic glutamate receptors or iGluRs. In addition to two ligand binding core domains, iGluRs typically have a channel-like domain inserted in the middle of the GluR-like domain. Animal iGluRs mediate the ion flux in the synapses of the CNS and can be subdivided into several classes depending on the neurotransmitter specificity and ion conductance properties. Their plant homologs have been shown to function in light signal transduction and calcium homeostasis. The GluR proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	232
270405	cd13687	PBP2_iGluR_NMDA	The ligand-binding domain of the NMDA (N-methyl-D-aspartate) subtype of ionotropic glutamate receptors, a member of the type 2 periplasmic binding fold protein superfamily. The ligand-binding domain of the ionotropic NMDA subtype is structurally homologous to the periplasmic-binding fold type II superfamily, while the N-terminal domain belongs to the periplasmic-binding fold type I. The NMDA subtype receptor serves critical functions in neuronal development, functioning, and degeneration in the mammalian central nervous system. The functional NMDA receptor is a heterotetramer comprising two NR1 and two NR2 (A, B, C, and D) or NR3 (A and B) subunits. The receptor controls a cation channel that is highly permeable to monovalent ions and calcium and exhibits voltage-dependent inhibition by magnesium. Dual agonists, glutamate and glycine, are required for efficient activation of the NMDA receptor. Among NMDA receptor subtypes, the NR2B subunit-containing receptors appear particularly important for pain perception; thus NR2B-selective antagonists may be useful in the treatment of chronic pain.	239
270406	cd13688	PBP2_GltI_DEBP	Substrate-binding domain of ABC aspartate-glutamate transporter; the type 2 periplasmic binding protein fold. This subfamily represents the periplasmic-binding protein component of ABC transporter specific for carboxylic amino acids, including GltI from Escherichia coli. The aspartate-glutamate binding domain belongs to the type 2 periplasmic binding protein fold superfamily (PBP2), whose many members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	238
270407	cd13689	PBP2_BsGlnH	Substrate binding domain of ABC glutamine transporter from Bacillus subtilis; the type 2 periplasmic-binding protein fold. This group includes periplasmic glutamine-binding domain GlnP from Bacillus subtilis and its related proteins. The GlnP domain belongs to the type 2 periplasmic binding protein fold superfamily (PBP2), whose many members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	229
270408	cd13690	PBP2_GluB	Substrate binding domain of ABC glutamate transporter; the type 2 periplasmic binding protein fold. This group includes periplasmic glutamate-binding domain GluB from Corynebacterium efficiens and its related proteins. The GluB domain belongs to the type 2 periplasmic binding protein fold superfamily (PBP2), whose many members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	231
270409	cd13691	PBP2_Peb1a_like	Substrate binding domain of an ABC aspartate/glutamate transporter; the type 2 periplasmic-binding protein fold. This group includes periplasmic aspartate/glutamate binding domain Peb1a and its closely related protein. The Peb1a is an important virulence factor in the food-borne human pathogen Campylobacter jejuni, which has a major role in adherence and host colonization. The Peb1a domain belongs to the type 2 periplasmic binding protein fold superfamily (PBP2), whose many members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	228
270410	cd13692	PBP2_BztA	Substrate binding domain of ABC glutamate/glutamine/aspartate/asparagine transporter; the type 2 periplasmic binding protein fold. BztA is the periplasmic-binding protein component of ABC transporter specific for carboxylic amino acids, glutamine and asparagine. The BztA domain belongs to the type 2 periplasmic binding protein fold superfamily (PBP2), whose many members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	236
270411	cd13693	PBP2_polar_AA	Substrate binding domain of polar amino-acid uptake ABC transporter; the type 2 periplasmic binding protein fold. This group includes the periplasmic-binding protein component of putative polar amino acid ABC transporter. The polar amino-acid binding domain belongs to the type 2 periplasmic binding protein fold superfamily (PBP2), whose many members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	228
270412	cd13694	PBP2_Cysteine	Substrate binding domain of ABC cysteine transporter; the type 2 periplasmic binding protein fold. This subfamily comprises the periplasmic-binding protein component of ABC transporter specific for cysteine and its closely related proteins. The cysteine-binding domains belong to the type 2 periplasmic binding protein fold superfamily (PBP2), whose many members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	229
270413	cd13695	PBP2_Mlr3796_like	The substrate-binding domain of putative amino acid transporter; the type 2 periplasmic binding protein fold. This group includes the periplasmic-binding protein component of a putative amino acid ABC transporter from Mesorhizobium loti and its related proteins. The putative Mlr3796-like domain belongs to the type 2 periplasmic binding protein fold superfamily (PBP2), whose many members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	232
270414	cd13696	PBP2_Atu4678_like	The substrate binding domain of putative amino acid transporter; the type 2 periplasmic binding protein fold. This group includes the periplasmic-binding protein component of a putative amino acid ABC transporter from Agrobacterium tumefaciens and its related proteins. The putative Atu4678-like domain belongs to the type 2 periplasmic binding protein fold superfamily (PBP2), whose many members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	227
270415	cd13697	PBP2_ArtJ_like	Putative substrate-binding domain of ABC arginine transporter; the type 2 periplasmic-binding protein fold. The ArtJ domain belongs to the type 2 periplasmic binding protein fold superfamily (PBP2), whose many members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	228
270416	cd13698	PBP2_HisGluGlnArgOpine	Substrate binding domain of ABC-type histidine/glutamate/glutamine/arginine/opine transporter; the type 2 periplasmic-binding protein fold. This group includes periplasmic-binding component of His/Glu/Gln/Arg/Opine ATP-binding cassette transport system. This substrate-binding domain belongs to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	214
270417	cd13699	PBP2_OccT_like	Substrate binding domain of ABC-type octopine transporter-like; the type 2 periplasmic-binding protein fold. This group includes periplasmic octopine-binding protein and related proteins. This group belongs to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	211
270418	cd13700	PBP2_Arg_STM4351	Substrate binding domain of arginine-specific ABC transporter; type 2 periplasmic-binding protein fold. This group includes domains similar to Escherichia coli arginine third transport system. STM4351 is the high arginine specific periplasmic-binding protein of ABC transport system. STM4351 belongs to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. PBP2 typically comprises of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	222
270419	cd13701	PBP2_ml15202_like	Substrate binding domain of ABC-type histidine/lysine/arginine/ornithine transporter-like; the type 2 periplasmic-binding protein fold. This group includes uncharacterized periplasmic substrate-binding protein similar to HisJ and LAO proteins which are involved in the ABC transport of histidine-, arginine, and lysine-arginine-ornithine amino acids. This group belongs to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. PBP2 typically comprises of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	227
270420	cd13702	PBP2_mlr5654_like	Substrate binding domain of ABC-type histidine/lysine/arginine/ornithine transporter-like; the type 2 periplasmic-binding protein fold. This group includes uncharacterized periplasmic substrate-binding protein similar to HisJ and LAO proteins which  serve as initial receptors in the ABC transport of histidine-, arginine, and lysine-arginine-ornithine amino acids.  This group belongs to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. PBP2 typically comprises of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	223
270421	cd13703	PBP2_HisJ_LAO	Substrate binding domain of ABC-type histidine- and lysine/arginine/ornithine transporters; the type 2 periplasmic-binding protein fold. This subgroup includes the periplasmic-binding proteins, HisJ and LAO, that serve as initial receptors in the ABC transport of histidine and lysine-arginine-ornithine amino acids. They belong to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. PBP2 typically comprises of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	229
270422	cd13704	PBP2_HisK	The periplasmic sensor domain of histidine kinase receptors; the type 2 periplasmic binding fold protein. This subfamily includes the periplasmic sensor domain of the histidine kinase receptors (HisK) which are elements of the two-component signal transduction systems commonly found in bacteria and lower eukaryotes. Typically, the two-component system consists of a membrane-spanning histidine kinase sensor and a cytoplasmic response regulator. The two-component systems serve as a stimulus-response coupling mechanism to enable microorganisms to sense and respond to changes in environmental conditions. Extracellular stimuli such as small molecule ligands and ions are detected by the N-terminal periplasmic sensing domain of the sensor kinase receptor, which regulate the catalytic activity of the cytoplasmic kinase domain and promote ATP-dependent autophosphorylation of a conserved histidine residue. The phosphate is then transferred to a conserved aspartate in the response regulator through a phospho-transfer mechanism, and the activity of the response regulator is in turn regulated. The sensor domain belongs to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space through their function as an initial high-affinity binding component. PBP2 typically comprises of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	220
270423	cd13705	PBP2_BvgS_D1	The first of the two tandem periplasmic domains of sensor-kinase BvgS; the type 2 periplasmic-binding fold protein. This group contains the first domain of the periplasmic solute-binding domains of BvgS and related proteins. BvgS is composed of two periplasmic domains homologous to bacterial periplasmic-binding proteins (PBPs), a transmembrane region followed successively by a cytoplasmic PAS (Per/ARNT/SIM), a histidine-kinase (HK), a receiver and a histidine phosphotransfer (Hpt) domains. The sensor protein BvgS can autophosphorylate and phosphorylate the response regulator BvgA. The BvgAS phosphorelay controls the expression of virulence factors in response to certain environmental stimuli in Bordetella pertussis. Its close homologs, Escherichia coli EvgS and Klebsiella pneumoniae KvgS, appear to be involved in the transcriptional regulation of drug efflux pumps and in countering free radical stresses and sensing iron limiting conditions, respectively. The periplasmic sensor domain of BvgS belongs to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. PBP2 typically comprises of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	221
270424	cd13706	PBP2_HisK_like_1	Putative sensor domain similar to HisK; the type 2 periplasmic binding fold protein. This group includes periplasmic sensor domain of the histidine kinase receptors (HisK) which are elements of the two-component signal transduction systems commonly found in bacteria and lower eukaryotes. Typically, the two-component system consists of a membrane-spanning histidine kinase sensor and a cytoplasmic response regulator. The two-component systems serve as a stimulus-response coupling mechanism to enable microorganisms to sense and respond to changes in environmental conditions. Extracellular stimuli such as small molecule ligands and ions are detected by the N-terminal periplasmic sensing domain of the sensor kinase receptor, which regulate the catalytic activity of the cytoplasmic kinase domain and promote ATP-dependent autophosphorylation of a conserved histidine residue. The phosphate is then transferred to a conserved aspartate in the response regulator through a phospho-transfer mechanism, and the activity of the response regulator is in turn regulated. The sensor domain belongs to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space through their function as an initial high-affinity binding component. PBP2 typically comprises of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	219
270425	cd13707	PBP2_BvgS_D2	The second of the two tandem periplasmic domains of sensor-kinase BvgS; the type 2 periplasmic-binding fold protein. This group contains the second domain of the periplasmic solute-binding domains of BvgS and related proteins. BvgS is composed of two periplasmic domains homologous to bacterial periplasmic-binding proteins (PBPs), a transmembrane region followed successively by a cytoplasmic PAS (Per/ARNT/SIM), a Histidine-kinase (HK), a receiver and a Histidine phosphotransfer (Hpt) domains. The sensor protein BvgS can autophosphorylate and phosphorylate the response regulator BvgA. The BvgAS phosphorelay controls the expression of virulence factors in response to certain environmental stimuli in Bordetella pertussis. Its close homologs, Escherichia coli EvgS and Klebsiella pneumoniae KvgS, appear to be involved in the transcriptional regulation of drug efflux pumps and in countering free radical stresses and sensing iron limiting conditions, respectively. The periplasmic sensor domain of BvgS belongs to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. PBP2 typically comprises of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	221
270426	cd13708	PBP2_BvgS_like_1	Putative sensor domain similar to BvgS; the type 2 periplasmic binding protein domain. BvgS is composed of two periplasmic domains homologous to bacterial periplasmic-binding proteins (PBPs), a transmembrane region followed successively by a cytoplasmic PAS (Per/ARNT/SIM), a Histidine-kinase (HK), a receiver and a Histidine phosphotransfer (Hpt) domains. The sensor protein BvgS can autophosphorylate and phosphorylate the response regulator BvgA. The BvgAS phosphorelay controls the expression of virulence factors in response to certain environmental stimuli in Bordetella pertussis. Its close homologs, Escherichia coli EvgS and Klebsiella pneumoniae KvgS, appear to be involved in the transcriptional regulation of drug efflux pumps and in countering free radical stresses and sensing iron limiting conditions, respectively. The periplasmic sensor domain of BvgS belongs to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. PBP2 proteins typically comprise two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	220
270427	cd13709	PBP2_YxeM	Substrate binding domain of an ABC transporter YxeMNO; the type 2 periplasmic binding protein fold. This group contains cystine-binding domain (YxeM) of a periplasmic receptor-dependent ATP-binding cassette transporter and its closely related proteins. Cystine is an oxidized dimeric form of cysteine that is required for optimal bacterial growth. In Bacillus subtilis, three ABC transporters, TcyJKLMN (YtmJKLMN), TcyABC (YckKJI), and YxeMNO are involved in uptake of cystine. Likewise, three uptake systems were identified in Salmonella enterica serovar Typhimurium, while in Escherichia coli, two transport systems seem to be involved in cystine uptake. Moreover, L-cystine limitation was shown to prevent virulence of Neisseria gonorrhoeae; thus, its L-cystine solute receptor (Ngo0372) may be suited as target for an antimicrobial vaccine. The cystine receptor belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	227
270428	cd13710	PBP2_TcyK	Substrate binding domain of an ABC transporter TcyJKLMN; the type 2 periplasmic binding protein fold. This group contains periplasmic cystine-binding domain (TcyK) of an ATP-binding cassette transporter from Bacillus subtilis and its closely related proteins. Cystine is an oxidized dimeric form of cysteine that is required for optimal bacterial growth. In Bacillus subtilis, three ABC transporters, TcyJKLMN (YtmJKLMN), TcyABC (YckKJI), and YxeMNO are involved in uptake of cystine. Likewise, three uptake systems were identified in Salmonella enterica serovar Typhimurium, while in Escherichia coli, two transport systems seem to be involved in cystine uptake. Moreover, L-cystine limitation was shown to prevent virulence of Neisseria gonorrhoeae; thus, its L-cystine solute receptor (Ngo0372) may be suited as target for an antimicrobial vaccine. The cystine receptor belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	233
270429	cd13711	PBP2_Ngo0372_TcyA	Substrate binding domain of ABC transporters involved in cystine import; the type 2 periplasmic binding protein fold. This subgroup includes cystine-binding domain of periplasmic receptor-dependent ATP-binding cassette transporters from Neisseria gonorrhoeae and Bacillus subtilis and their related proteins. Cystine is an oxidized dimeric form of cysteine that is required for optimal bacterial growth. In Bacillus subtilis, three ABC transporters, TcyJKLMN (YtmJKLMN), TcyABC (YckKJI), and YxeMNO are involved in uptake of cystine. Likewise, three uptake systems were identified in Salmonella enterica serovar Typhimurium, while in Escherichia coli, two transport systems seem to be involved in cystine uptake. Moreover, L-cystine limitation was shown to prevent virulence of Neisseria gonorrhoeae; thus, its L-cystine solute receptor (Ngo0372) may be suited as target for an antimicrobial vaccine. The cystine receptor belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	222
270430	cd13712	PBP2_FliY	Substrate binding domain of an Escherichia coli ABC transporter; the type 2 periplasmic binding protein fold. This group contains cystine binding domain FliY and its related proteins. Cystine is an oxidized dimeric form of cysteine that is required for optimal bacterial growth. In Bacillus subtilis, three ABC transporters, TcyJKLMN (YtmJKLMN), TcyABC (YckKJI), and YxeMNO are involved in uptake of cystine. Likewise, three uptake systems were identified in Salmonella enterica serovar Typhimurium, while in Escherichia coli, two transport systems seem to be involved in cystine uptake.  Moreover, L-cystine limitation was shown to prevent virulence of Neisseria gonorrhoeae; thus, its L-cystine solute receptor (Ngo0372) may be suited as target for an antimicrobial vaccine. The cystine receptor belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	219
270431	cd13713	PBP2_Cystine_like_1	Substrate binding domain of putative ABC transporters involved in cystine import; the type 2 periplasmic binding protein fold. This group contains uncharacterized periplasmic cystine-binding domain of ATP-binding cassette (ABC) transporters. Cystine is an oxidized dimeric form of cysteine that is required for optimal bacterial growth. In Bacillus subtilis, three ABC transporters, TcyJKLMN (YtmJKLMN), TcyABC (YckKJI), and YxeMNO are involved in uptake of cystine. Likewise, three uptake systems were identified in Salmonella enterica serovar Typhimurium, while in Escherichia coli, two transport systems seem to be involved in cystine uptake.  Moreover, L-cystine limitation was shown to prevent virulence of Neisseria gonorrhoeae; thus, its L-cystine solute receptor (Ngo0372) may be suited as target for an antimicrobial vaccine. The cystine receptor belongs to the type 2 periplasmic binding fold protein superfamily (PBP2). The PBP2 proteins are typically comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two receptor cytoplasmically-located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	218
270432	cd13714	PBP2_iGluR_Kainate	Kainate receptor of the type 2 periplasmic-binding fold superfamily. This group contains glutamate receptor domain GluR. These domains are found in the GluR proteins that have been shown to function as L-glutamate activated potassium channels, also known as ionotropic glutamate receptors or iGluRs. In addition to two ligand binding core domains, iGluRs typically have a channel-like domain inserted in the middle of the GluR-like domain. Animal iGluRs mediate the ion flux in the synapses of the CNS and can be subdivided into several classes depending on the neurotransmitter specificity and ion conductance properties. Their plant homologs have been shown to function in light signal transduction and calcium homeostasis. The GluR proteins belong to the PBPII superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	251
270433	cd13715	PBP2_iGluR_AMPA	The ligand-binding domain of the AMPA (alpha-amino-3-hydroxyl-5-methyl-4-isoxazolepropionic acid) subtypes of ionotropic glutamate receptors, a member of the type 2 periplasmic binding fold protein superfamily. This family represents the ligand-binding domain of the AMPA receptor subunits, a member of non-NMDA (N-methyl-D-aspartate) type iGluRs which are ligand-gated ion channels that mediate excitatory synaptic transmission in the central nervous system. While this ligand-binding domain is structurally homologous to the periplasmic-binding fold type II superfamily, the N-terminal domain of AMPA receptors belongs to the periplasmic-binding fold type I. They consist of four types of subunits (GluR1, GluR2, GluR3, and GluR4) which combine to form a tetramer and play an important role in mediating the rapid excitatory synaptic current.	261
270434	cd13716	PBP2_iGluR_delta_like	The ligand-binding domain of the delta family of ionotropic glutamate receptors, a member of the type 2 periplasmic-binding fold protein superfamily. This subfamily represents the ligand-binding domain of an orphan family of delta receptors, GluRdelta1 and GluRdelta2. While this ligand-binding domain is structurally homologous to the periplasmic-binding fold type II superfamily, the N-terminal domain of iGluRs belongs to the periplasmic-binding fold type I. Although the delta receptors are members of the ionotropic glutamate receptor family, they cannot be activated by AMPA, kainate, NMDA, glutamate, or any other ligands. Phylogenetic analysis shows that both GluRdelta1 and GluRdelta2 are more homologous to non-NMDA receptors. GluRdelta2 was shown to function as an AMPA-like receptor by mutation analysis. Moreover, targeted disruption of GluRdelta2 gene caused motor coordination impairment, Purkinje cell maturation, and long-term depression of synaptic transmission. It has been suggested that GluRdelta2 is the receptor for cerebellin 1, a glycoprotein of the C1q and tumor necrosis factor family that is secreted from cerebellar granule cells. Furthermore, recent studies have shown that the orphan GluRdelta1 plays an essential role in high-frequency hearing and ionic homeostasis in the basal cochlea and that the locus encoding GluRdelta1 may be involved in congenital or acquired high-frequency hearing loss in humans.	257
270435	cd13717	PBP2_iGluR_putative	The ligand-binding domain of putative ionotropic glutamate receptors, a member of the type 2 periplasmic binding fold protein superfamily. This group contains glutamate receptor domain GluR. These domains are found in the GluR proteins that have been shown to function as L-glutamate activated potassium channels, also known as ionotropic glutamate receptors or iGluRs. In addition to two ligand binding core domains, iGluRs typically have a channel-like domain inserted in the middle of the GluR-like domain. Animal iGluRs mediate the ion flux in the synapses of the CNS and can be subdivided into several classes depending on the neurotransmitter specificity and ion conductance properties. Their plant homologs have been shown to function in light signal transduction and calcium homeostasis. The GluR proteins belong to the PBPII superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	360
270436	cd13718	PBP2_iGluR_NMDA_Nr2	The ligand-binding domain of the NR2 subunit of ionotropic NMDA (N-methyl-D-aspartate) glutamate receptors, a member of the type 2 periplasmic binding fold protein superfamily. This group contains the ligand-binding domain of the NR2 subunit of NMDA receptor family. The ionotropic N-methyl-D-aspartate (NMDA) subtype of glutamate receptors serves critical functions in neuronal development, functioning, and degeneration in the mammalian central nervous system. The functional NMDA receptor is a heterotetramer composed of two NR1 and two NR2 (A, B, C, and D) or of NR3 (A and B) subunits. The receptor controls a cation channel that is highly permeable to monovalent ions and calcium and exhibits voltage-dependent inhibition by magnesium. Dual agonists, glutamate and glycine, are required for efficient activation of the NMDA receptor. Among NMDA receptor subtypes, the NR2B subunit containing receptors appear particularly important for pain perception; thus NR2B-selective antagonists may be useful in the treatment of chronic pain.	283
270437	cd13719	PBP2_iGluR_NMDA_Nr1	The ligand-binding domain of the NR1 subunit of ionotropic NMDA (N-methyl-D-aspartate) glutamate receptors, a member of the type 2 periplasmic binding fold protein superfamily. This group contains the ligand binding domain of the NR1, an essential channel-forming subunit of the NMDA receptor. The ionotropic N-methyl-D-aspartate (NMDA) subtype of glutamate receptors serves critical functions in neuronal development, functioning, and degeneration in the mammalian central nervous system. The functional NMDA receptor is a heterotetramer composed of two NR1 and two NR2 (A, B, C, and D) or of NR3 (A and B) subunits. The receptor controls a cation channel that is highly permeable to monovalent ions and calcium and exhibits voltage-dependent inhibition by magnesium. Dual agonists, glutamate and glycine, are required for efficient activation of the NMDA receptor. When co-expressed with NR1, the NR3 subunits form receptors that are activated by glycine alone and therefore can be classified as excitatory glycine receptors. NR1/NR3 receptors are calcium-impermeable and unaffected by ligands acting at the NR2 glutamate-binding site.	277
270438	cd13720	PBP2_iGluR_NMDA_Nr3	The ligand-binding domain of the NR3 subunit of ionotropic NMDA (N-methyl-D-aspartate) glutamate receptors, a member of the type 2 periplasmic binding fold protein superfamily. This group contains the ligand-binding domain of the NR3 subunit of NMDA receptor family. The ionotropic N-methyl-D-aspartate (NMDA) subtype of glutamate receptors serves critical functions in neuronal development, functioning, and degeneration in the mammalian central nervous system. The functional NMDA receptor is a heterotetramer composed of two NR1 and two NR2 (A, B, C, and D) or of NR3 (A and B) subunits. The receptor controls a cation channel that is highly permeable to monovalent ions and calcium and exhibits voltage-dependent inhibition by magnesium. Dual agonists, glutamate and glycine, are required for efficient activation of the NMDA receptor. Among NMDA receptor subtypes, the NR2B subunit containing receptors appear particularly important for pain perception; thus NR2B-selective antagonists may be useful in the treatment of chronic pain.	283
270439	cd13721	PBP2_iGluR_Kainate_GluR6	GluR6 subtype of kainate receptor, type 2 periplasmic-binding fold superfamily. This group contains glutamate receptor domain GluR. These domains are found in the GluR proteins that have been shown to function as L-glutamate activated potassium channels, also known as ionotropic glutamate receptors or iGluRs. In addition to two ligand binding core domains, iGluRs typically have a channel-like domain inserted in the middle of the GluR-like domain. Animal iGluRs mediate the ion flux in the synapses of the CNS and can be subdivided into several classes depending on the neurotransmitter specificity and ion conductance properties. Their plant homologs have been shown to function in light signal transduction and calcium homeostasis. The GluR proteins belong to the PBPII superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	251
270440	cd13722	PBP2_iGluR_Kainate_GluR5	GluR5 subtype of kainate receptor, type 2 periplasmic-binding fold superfamily. This group contains glutamate receptor domain GluR. These domains are found in the GluR proteins that have been shown to function as L-glutamate activated potassium channels, also known as ionotropic glutamate receptors or iGluRs. In addition to two ligand binding core domains, iGluRs typically have a channel-like domain inserted in the middle of the GluR-like domain. Animal iGluRs mediate the ion flux in the synapses of the CNS and can be subdivided into several classes depending on the neurotransmitter specificity and ion conductance properties. Their plant homologs have been shown to function in light signal transduction and calcium homeostasis. The GluR proteins belong to the PBPII superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	250
270441	cd13723	PBP2_iGluR_Kainate_GluR7	GluR7 subtype of kainate receptor, type 2 periplasmic-binding fold superfamily. This group contains glutamate receptor domain GluR. These domains are found in the GluR proteins that have been shown to function as L-glutamate activated potassium channels, also known as ionotropic glutamate receptors or iGluRs. In addition to two ligand binding core domains, iGluRs typically have a channel-like domain inserted in the middle of the GluR-like domain. Animal iGluRs mediate the ion flux in the synapses of the CNS and can be subdivided into several classes depending on the neurotransmitter specificity and ion conductance properties. Their plant homologs have been shown to function in light signal transduction and calcium homeostasis. The GluR proteins belong to the PBPII superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap.	369
270442	cd13724	PBP2_iGluR_kainate_KA1	The ligand-binding domain of the kainate subtype KA1 of ionotropic glutamate receptors, a member of the type 2 periplasmic-binding fold protein superfamily. This group contains the ligand-binding domain of the KA1 subunit of kainate receptor. While this ligand-binding domain is structurally homologous to the periplasmic binding fold type II superfamily, the N-terminal domain of kainate receptors belongs to the periplasmic-binding fold type I. There are five types of kainate receptors, GluR5, GluR6, GluR7, KA1, and KA2, which are structurally similar to AMPA and NMDA subunits of ionotropic glutamate receptors. KA1 and KA2 subunits can only form functional receptors with one of the GluR5-7 subunits. Moreover, GluR5-7 can also form functional homomeric receptor channels activated by kainate and glutamate when expressed in heterologous systems. Kainate receptors are involved in excitatory neurotransmission by activating postsynaptic receptors and in inhibitory neurotransmission by modulating release of the inhibitory neurotransmitter GABA through a presynaptic mechanism. Kainate receptors are closely related to AMPA receptors. In contrast to AMPA receptors, kainate receptors play only a minor role in signaling at synapses and their function is not well defined.	333
270443	cd13725	PBP2_iGluR_kainate_KA2	The ligand-binding domain of the kainate subtype KA2 of ionotropic glutamate receptors, a member of the type 2 periplasmic-binding fold protein superfamily. This group contains the ligand-binding domain of the KA2 subunit of kainate receptor. While this ligand-binding domain is structurally homologous to the periplasmic binding fold type II superfamily, the N-terminal domain of kainate receptors belongs to the periplasmic-binding fold type I. There are five types of kainate receptors, GluR5, GluR6, GluR7, KA1, and KA2, which are structurally similar to AMPA and NMDA subunits of ionotropic glutamate receptors. KA1 and KA2 subunits can only form functional receptors with one of the GluR5-7 subunits. Moreover, GluR5-7 can also form functional homomeric receptor channels activated by kainate and glutamate when expressed in heterologous systems. Kainate receptors are involved in excitatory neurotransmission by activating postsynaptic receptors and in inhibitory neurotransmission by modulating release of the inhibitory neurotransmitter GABA through a presynaptic mechanism. Kainate receptors are closely related to AMPA receptors. In contrast to AMPA receptors, kainate receptors play only a minor role in signaling at synapses and their function is not well defined.	250
270444	cd13726	PBP2_iGluR_AMPA_GluR2	The ligand-binding domain of the AMPA (alpha-amino-3-hydroxyl-5-methyl-4-isoxazolepropionic acid) subtype GluR2 of ionotropic glutamate receptors, a member of the type 2 periplasmic binding fold protein superfamily. This group contains the ligand-binding domain of the AMPA receptor subunit GluR2, a member of non-NMDA (N-methyl-D-aspartate) type iGluRs which are ligand-gated ion channels that mediate excitatory synaptic transmission in the central nervous system. While this ligand-binding domain is structurally homologous to the periplasmic-binding fold type II superfamily, the N-terminal domain of AMPA receptors belongs to the periplasmic-binding fold type I. The AMPA receptors are the most commonly found receptor in the nervous system and sensitive to the artificial glutamate analog, AMPA. They consist of four types of subunits (GluR1, GluR2, GluR3, and GluR4) which combine to form a tetramer and play an important role in mediating the rapid excitatory synaptic current.	259
270445	cd13727	PBP2_iGluR_AMPA_GluR4	The ligand-binding domain of the AMPA (alpha-amino-3-hydroxyl-5-methyl-4-isoxazolepropionic acid) subtype GluR4 of ionotropic glutamate receptors, a member of the type 2 periplasmic binding fold protein superfamily. This group contains the ligand-binding domain of the AMPA receptor subunit GluR4, a member of non-NMDA (N-methyl-D-aspartate) type iGluRs which are ligand-gated ion channels that mediate excitatory synaptic transmission in the central nervous system. While this ligand-binding domain is structurally homologous to the periplasmic-binding fold type II superfamily, the N-terminal domain of AMPA receptors belongs to the periplasmic-binding fold type I. The AMPA receptors are the most commonly found receptor in the nervous system and sensitive to the artificial glutamate analog, AMPA. They consist of four types of subunits (GluR1, GluR2, GluR3, and GluR4) which combine to form a tetramer and play an important role in mediating the rapid excitatory synaptic current.	259
270446	cd13728	PBP2_iGluR_AMPA_GluR3	The ligand-binding domain of the AMPA (alpha-amino-3-hydroxyl-5-methyl-4-isoxazolepropionic acid) subtype GluR3 of ionotropic glutamate receptors, a member of the type 2 periplasmic binding fold protein superfamily. This group contains the ligand-binding domain of the AMPA receptor subunit GluR3, a member of non-NMDA (N-methyl-D-aspartate) type iGluRs which are ligand-gated ion channels that mediate excitatory synaptic transmission in the central nervous system. While this ligand-binding domain is structurally homologous to the periplasmic-binding fold type II superfamily, the N-terminal domain of AMPA receptors belongs to the periplasmic-binding fold type I. The AMPA receptors are the most commonly found receptor in the nervous system and sensitive to the artificial glutamate analog, AMPA. They consist of four types of subunits (GluR1, GluR2, GluR3, and GluR4) which combine to form a tetramer and play an important role in mediating the rapid excitatory synaptic current.	259
270447	cd13729	PBP2_iGluR_AMPA_GluR1	The ligand-binding domain of the AMPA (alpha-amino-3-hydroxyl-5-methyl-4-isoxazolepropionic acid) subtype GluR1 of ionotropic glutamate receptors, a member of the type 2 periplasmic binding fold protein superfamily. This group contains the ligand-binding domain of the AMPA receptor subunit GluR1, a member of non-NMDA (N-methyl-D-aspartate) type iGluRs which are ligand-gated ion channels that mediate excitatory synaptic transmission in the central nervous system. While this ligand-binding domain is structurally homologous to the periplasmic-binding fold type II superfamily, the N-terminal domain of AMPA receptors belongs to the periplasmic-binding fold type I. The AMPA receptors are the most commonly found receptor in the nervous system and sensitive to the artificial glutamate analog, AMPA. They consist of four types of subunits (GluR1, GluR2, GluR3, and GluR4) which combine to form a tetramer and play an important role in mediating the rapid excitatory synaptic current.	260
270448	cd13730	PBP2_iGluR_delta_1	The ligand-binding domain of an orphan ionotropic glutamate receptor delta-1, a member of the type 2 periplasmic-binding fold protein superfamily. This group contains the ligand-binding domain of the delta1 receptor of an orphan glutamate receptor family. While this ligand-binding domain is structurally homologous to the periplasmic-binding fold type II superfamily, the N-terminal domain of delta receptors belongs to the periplasmic-binding fold type I. Although the delta receptors are members of the ionotropic glutamate receptor family, they cannot be activated by AMPA, kainate, NMDA, glutamate, or any other ligands. Phylogenetic analysis shows that both GluRdelta1 and GluRdelta2 are more homologous to non-NMDA receptors. GluRdelta2 was shown to function as an AMPA-like receptor by mutation analysis. Moreover, targeted disruption of GluRdelta2 gene caused motor coordination impairment, Purkinje cell maturation, and long-term depression of synaptic transmission. It has been suggested that GluRdelta2 is the receptor for cerebellin 1, a glycoprotein of the C1q and tumor necrosis factor family that is secreted from cerebellar granule cells. Furthermore, recent studies have shown that the orphan GluRdelta1 plays an essential role in high-frequency hearing and ionic homeostasis in the basal cochlea and that the locus encoding GluRdelta1 may be involved in congenital or acquired high-frequency hearing loss in humans.	257
270449	cd13731	PBP2_iGluR_delta_2	The ligand-binding domain of an orphan ionotropic glutamate receptor delta-2, a member of the type 2 periplasmic-binding fold protein superfamily. This group contains the ligand-binding domain of the delta-2 receptor of an orphan glutamate receptor family. While this ligand-binding domain is structurally homologous to the periplasmic-binding fold type II superfamily, the N-terminal domain of delta receptors belongs to the periplasmic-binding fold type I. Although the delta receptors are members of the ionotropic glutamate receptor family, they cannot be activated by AMPA, kainate, NMDA, glutamate, or any other ligands. Phylogenetic analysis shows that both GluRdelta1 and GluRdelta2 are more homologous to non-NMDA receptors. GluRdelta2 was shown to function as an AMPA-like receptor by mutation analysis. Moreover, targeted disruption of GluRdelta2 gene caused motor coordination impairment, Purkinje cell maturation, and long-term depression of synaptic transmission. It has been suggested that GluRdelta2 is the receptor for cerebellin 1, a glycoprotein of the C1q and tumor necrosis factor family that is secreted from cerebellar granule cells. Furthermore, recent studies have shown that the orphan GluRdelta1 plays an essential role in high-frequency hearing and ionic homeostasis in the basal cochlea and that the locus encoding GluRdelta1 may be involved in congenital or acquired high-frequency hearing loss in humans.	257
293968	cd13733	SPRY_PRY_C-I_1	PRY/SPRY domain in tripartite motif-containing (TRIM) proteins, including TRIM5, TRIM6, TRIM7, TRIM10, TRIM11, TRIM17, TRIM20, TRIM21, TRIM27, TRIM35, TRIM38, TRIM41, TRIM50, TRIM58, TRIM60, TRIM62, TRIM69, TRIM72, NF7 and bloodthirsty. This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of several Class IV TRIM proteins, including TRIM7, TRIM35, TRIM41, TRIM50, TRIM62, TRIM69, TRIM72, TRIM protein NF7 and bloodthirsty (bty). TRIM7 interacts with glycogenin and stimulates its self-glucosylating activity via its SPRY domain. TRIM35 may play a role as a tumor suppressor and is implicated in the cell death mechanism. TRIM41 is localized to speckles in the cytoplasm and nucleus, and functions as an E3 ligase that catalyzes the ubiquitin-mediated degradation of protein kinase C. TRIM50, an E3 ubiquitin ligase, is deleted in Williams-Beuren (WBS) syndrome, a multi-system neurodevelopmental disorder caused by the deletion of contiguous genes at chromosome region 7q11.23. TRIM62 is involved in the morphogenesis of the mammary gland; loss of TRIM62 gene expression in breast is associated with increased risk of recurrence in early-onset breast cancer. TRIM69 is a novel testis E3 ubiquitin ligase that may function to ubiquitinate its particular substrates during spermatogenesis. In humans, TRIM69 localizes in the cytoplasm and nucleus, and requires an intact RING finger domain to function. TRIM protein NF7, which also contains a chromodomain (CHD) at the N-terminus and an RFP (Ret finger protein)-like domain at the C-terminus, is required for its association with transcriptional units of RNA polymerase II which is mediated by a trimeric B box. In Xenopus oocyte, xNF7 has been identified as a nuclear microtubule-associated protein (MAP) whose microtubule-bundling activity, but not E3-ligase activity, contributes to microtubule organization and spindle integrity. 
Bloodthirsty (bty) is a novel gene identified in zebrafish and has been shown to likely play a role in regulation of the terminal steps of erythropoiesis. TRIM72 has been shown to perform a critical function in membrane repair following acute muscle injury by nucleating the assembly of the repair machinery at injury sites. The PRY-SPRY domain in these TRIM families is suggested to serve as the target binding site.	174
293969	cd13734	SPRY_PRY_C-II	PRY/SPRY domain in tripartite motif-containing proteins 1, 9, 18, 36, 46, 67,76 (TRIM1, TRIM9, TRIM18, TRIM36, TRIM46, TRIM67, TRIM76). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of several Class I TRIM proteins, including TRIM1, TRIM9, TRIM18, TRIM36, TRIM46, TRIM67 and TRIM76.  TRIM1 (also known as MID2) and its close homolog, TRIM18 (also known as MID1), both contain a B30.2-like domain at their C-terminus and a single fibronectin type III (FN3) motif between it and their N-terminal RBCC domain. Their coiled-coil motifs mediate both homo- and heterodimerization, a prerequisite for association of the rapamycin-sensitive PP2A regulatory subunit Alpha 4 with microtubules. Mutations in TRIM18 have shown to cause Opitz syndrome, a disorder causing congenital anomalies such as cleft lip and palate as well as heart defects. TRIM9 is expressed mainly in the cerebral cortex, and functions as an E3 ubiquitin ligase. Its immunoreactivity is severely decreased in affected brain areas in Parkinson's disease and dementia with Lewy bodies, possibly playing an important role in the regulation of neuronal function and participating in pathological process of Lewy body disease through its ligase. TRIM36 interacts with centromere protein-H, one of the kinetochore proteins and possibly associates with chromosome segregation; an excess of TRIM36 may cause chromosomal instability. TRIM46 has not yet been characterized.  TRIM67 negatively regulates Ras activity via degradation of 80K-H, leading to neural differentiation, including neuritogenesis.  TRIM76 (also known as cardiomyopathy-associated protein 5 or CMYA5) is a muscle-specific member of the TRIM superfamily, but lacks the RING domain. It is possibly involved in protein kinase A signaling as well as vesicular trafficking. It has also been implicated in Duchenne muscular dystrophy and cardiac disease. 
The PRY-SPRY domain in these TRIM families is suggested to serve as the target binding site.	166
293970	cd13735	SPRY_HECT_like	SPRY domain in HECT E3. This domain consists of the SPRY subdomain similar to those found at the N-terminus of the HECT (homologous to the E6AP carboxyl terminus) protein, a C-terminal catalytic domain of a subclass of ubiquitin-protein ligase (E3). HECT E3 binds specific ubiquitin-conjugating enzymes (E2), accepts ubiquitin from E2, transfers ubiquitin to substrate lysine side chains, and transfers additional ubiquitin molecules to the end of growing ubiquitin chains. It has a prominent role in protein trafficking and immune response, and is involved in crucial signaling pathways implicated in tumorigenesis.	150
293971	cd13736	SPRY_PRY_TRIM25	PRY/SPRY domain in tripartite motif-containing domain 25 (TRIM25). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM25 proteins (composed of RING/B-box/coiled-coil core and also known as RBCC proteins). TRIM25 (also called Efp) ubiquitinates the N terminus of the viral RNA receptor retinoic acid-inducible gene-I (RIG-I) in response to viral infection, leading to activation of the RIG-I signaling pathway, thus resulting in type I interferon production to limit viral replication. It has been shown that the influenza A virus targets TRIM25 and disables its antiviral function.	169
293972	cd13737	SPRY_PRY_TRIM25-like	PRY/SPRY domain in tripartite motif-containing domain 25 (TRIM25)-like. This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of proteins similar to TRIM25 (composed of RING/B-box/coiled-coil core and also known as RBCC proteins). TRIM25 (also called Efp) ubiquitinates the N terminus of the viral RNA receptor retinoic acid-inducible gene-I (RIG-I) in response to viral infection, leading to activation of the RIG-I signaling pathway, thus resulting in type I interferon production to limit viral replication. It has been shown that the influenza A virus targets TRIM25 and disables its antiviral function.	172
293973	cd13738	SPRY_PRY_TRIM14	PRY/SPRY domain of tripartite motif-binding protein 14 (TRIM14). This TRIM14 domain family contains residues in the N-terminus that form a distinct PRY domain structure such that the B30.2 domain consists of PRY and SPRY subdomains. TRIM14 domains have yet to be characterized. These B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. It belongs to the Class IV TRIM protein family, which has members involved in antiviral immunity at various levels of the interferon signaling cascade.	173
293974	cd13739	SPRY_PRY_TRIM1	PRY/SPRY domain of tripartite motif-binding protein 1 (TRIM1) or MID2. This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM1 (also known as MID2 or midline 2). MID2 and its close homolog, TRIM18 (also known as MID1), both contain a B30.2-like domain at their C-terminus and a single fibronectin type III (FN3) motif between it and their N-terminal RBCC domain. MID2 and MID1 coiled-coil motifs mediate both homo- and heterodimerization, a prerequisite for association of the rapamycin-sensitive PP2A regulatory subunit Alpha 4 with microtubules. Mutations in MID1 have been shown to cause Opitz syndrome, a disorder causing congenital anomalies such as cleft lip and palate as well as heart defects.	170
293975	cd13740	SPRY_PRY_TRIM7	PRY/SPRY domain in tripartite motif-binding protein 7 (TRIM7). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of tripartite motif-containing protein 7 (TRIM7), also referred to as glycogenin-interacting protein (GNIP) or RING finger protein 90 (RNF90). TRIM7 or GNIP interacts with glycogenin and stimulates its self-glucosylating activity via its SPRY domain. The GNIP gene encodes at least four distinct isoforms of GNIP, of which three (GNIP1, GNIP2, and GNIP3) have the B30.2 domain.	169
240499	cd13741	SPRY_PRY_TRIM41	PRY/SPRY domain in tripartite motif-binding protein 41 (TRIM41). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of tripartite motif-containing protein 41 (TRIM41). TRIM41 (also known as RING finger-interacting protein with C kinase or RINCK) is localized to speckles in the cytoplasm and nucleus, and functions as an E3 ligase that catalyzes the ubiquitin-mediated degradation of protein kinase C.	199
293976	cd13742	SPRY_PRY_TRIM72	PRY/SPRY domain in tripartite motif-binding protein 72 (TRIM72). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM72. Muscle-specific TRIM72 (also known as Mitsugumin 53 or MG53) has been shown to perform a critical function in membrane repair following acute muscle injury by nucleating the assembly of the repair machinery at injury sites. It is expressed specifically in skeletal muscle and heart, and tethered to the plasma membrane and cytoplasmic vesicles via its interaction with phosphatidylserine. TRIM72 interacts with dysferlin, a sarcolemmal protein whose deficiency causes Miyoshi myopathy (MM) and limb girdle muscular dystrophy type 2B (LGMD2B); this coordination plays an important role in the repair of sarcolemma damage.	192
293977	cd13743	SPRY_PRY_TRIM50	PRY/SPRY domain in tripartite motif-binding protein 50 (TRIM50). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM50. TRIM50, an E3 ubiquitin ligase, is deleted in Williams-Beuren (WBS) syndrome, a multi-system neurodevelopmental disorder caused by the deletion of contiguous genes at chromosome region 7q11.23. It is specifically expressed in gastric parietal cells and may play an essential role in tubulovesicular dynamics. It also interacts with and increases the level of p62, a multifunctional adaptor protein that is implicated in various cellular processes such as the autophagy clearance of polyubiquitinated protein aggregates.	189
293978	cd13744	SPRY_PRY_TRIM62	PRY/SPRY domain in tripartite motif-binding protein 62 (TRIM62). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM62. It is also called DEAR1 (ductal epithelium-associated RING chromosome 1) and is involved in the morphogenesis of the mammary gland; loss of TRIM62 gene expression in breast is associated with increased risk of recurrence in early-onset breast cancer, thus making TRIM62 a predictive biomarker. Non-small cell lung cancer lesions show a step-wise loss of TRIM62 levels during disease progression, indicating that it may play a role in the evolution of lung cancer. Decreased levels of TRIM62 also represent an independent adverse prognostic factor in AML.	188
293979	cd13745	SPRY_PRY_TRIM39	PRY/SPRY domain in tripartite motif-binding protein 39 (TRIM39) and TRIM39-like. This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of pyrin, several tripartite motif-containing proteins (TRIMs), including E3 ubiquitin-protein ligase (TRIM21), RET finger protein (RFP)/tripartite motif protein 27 (TRIM27), as well as butyrophilin (Btns) and butyrophilin-like (Btnl) family members, with the exception of Btnl2. Btn and Btnl family members are novel regulators of immune responses, with many of the genes located within the MHC. They are implicated in T-cell inhibition and modulation of epithelial cell-T cell interactions. TRIM21 (also known as RO52, SSA1 or RNF81) is a major autoantigen in autoimmune diseases such as rheumatoid arthritis, systemic lupus erythematosus, and Sjogren's syndrome. TRIM27 (also known as Ret finger protein, RFP or RNF76) negatively regulates CD4 T-cells by ubiquitinating and inhibiting the class II phosphatidylinositol 3 kinase C2beta (PI3K-C2beta), a kinase critical for KCa3.1 channel activation. The PRY/SPRY domain of Pyrin, which is mutated in familial Mediterranean fever patients, interacts with inflammasome components and inhibits proIL-1beta processing.	177
259839	cd13746	Sir4p-SID_like	The SID domain of Saccharomyces cerevisiae silent information regulator 4, a Sir2p interaction domain; and related domains. Saccharomyces cerevisiae Sir2p, Sir3p, and Sir4p form a heterotrimeric complex which binds chromatin and represses transcription at the homothallic mating type (HM) loci and at subtelomeric regions. This domain model spans residues 742-893 of Sir4p. Sir4p forms a stable heterodimer with Sir2p, mediated by Sir4p residues included in this domain, and a pocket between Sir2p's catalytic domain and its non-conserved N-terminus. Sir4p also interacts with an array of additional factors, including Yku80p, a subunit of the telomeric Ku complex (Yku70p-Yku80p), which binds two sites within Sir4p, one at the N-terminus and one in the C-terminal residues, 731-1358. Other interaction factors include Esc1p (Establishes silent chromatin 1) which binds the Sir4p PAD domain (partitioning and anchoring domain, residues 950-1262), and Sir3p, Yku70p, and Rap1p (Repressor Activator Protein) which bind in its C-terminal coiled-coil (residues 1257-1358). Other Sir4p interacting factors include the Ty5 retrotransposon. Additional roles for Sir4p include roles in DNA repair, and in aging. A SIR4 mutant having a truncated Sir4p lacking a C-terminal coiled-coil domain, has an extended mean life span; deletion of the SIR4 gene leads to a decreased mean life span.	115
259834	cd13747	UreI_AmiS_like_1	UreI/AmiS family, subgroup 1. Putative proton-gated urea channel and putative amide transporters. This subfamily includes putative UreI proton-gated urea channels and putative amide transporters (AmiS of the amidase gene cluster). Helicobacter pylori UreI (HpUreI), a proton-gated inner membrane urea channel, opens in acidic pH to allow urea influx to the cytoplasm. There urea is metabolized, producing NH3 and CO2, leading to buffering of the periplasm. This action is essential for the survival of H. pylori in the stomach, and has been identified as a mechanism that could be clinically targeted to prevent various illnesses associated with infection by H. pylori. UreI and the related amide channels (AmiS) appear to function as hexamers, and have 6 predicted transmembrane segments. UreI has also been shown to have a lipid "plug" in the center of the hexamer. Urea enters at the periplasmic opening of UreI and must pass 2 constriction sites, one on each side of a conserved Glu (Glu 177, H. pylori numbering), to reach the cytoplasm. Urea/thiourea selectivity is diminished by mutation of a conserved Trp to Ala or Phe in constriction site 2 (cytoplasmic). Channel functionality is greatly diminished by mutation of a conserved Trp in constriction site 1 (periplasmic) and a conserved Tyr in constriction site 2, and to a lesser extent a conserved Phe in site 1. In the cytoplasm, urease hydrolyzes urea to form ammonia and carbamate, which decomposes to carbonic acid. UreI is fully open at pH 5.0 to facilitate urea influx, but closes at neutral pH, preventing over-alkalization. Glu 177 (H. pylori numbering) is present in urea channel proteins, but absent in the related amide channels, suggesting that it plays a role in urea specificity.	167
259840	cd13748	CBM29_CBM65	family 29 and family 65 carbohydrate binding modules. Members of this family bind to polysaccharides that are components of plant cell walls. CBM29 is present in cell-wall degrading multi-enzyme complexes from the anaerobic fungus Piromyces equi; CBM65 can be found in endoglucanases expressed by Eubacterium cellulosolvens and has a preference for xyloglucans.	106
259796	cd13749	Zn-ribbon_TFIIS	domain III/zinc ribbon domain of Transcription Factor IIS. TFIIS is a zinc-containing transcription factor. It has been shown in vitro to have distinct biochemical activities, including binding to RNA polymerases, stimulation of transcript elongation, and activation of a nascent RNA cleavage activity in the RNA polymerase II (Pol II) elongation complex. TFIIS consists of three domains. Domains II and III are sufficient for all known TFIIS activities. Domain III is a zinc ribbon that is separated from domain II by a long linker and is indispensable for TFIIS function. The TFIIS homologs, subunits A12.2, B9, and C11, of Pol I, II, and III respectively, are required for RNA cleavage by the polymerases. In a single organism, there are tissue-specific TFIIS related proteins.	47
381628	cd13750	TGF_beta_GDNF_like	transforming growth factor beta (TGF-beta) like domain found in the glial cell-line-derived neurotrophic factor (GDNF) family of ligands. The GDNF family of ligands includes GDNF, Artemin, Neurturin, and Persephin. They play an important role in the development and maintenance of the central and peripheral nervous system, renal morphogenesis, and spermatogenesis.	95
381629	cd13751	TGF_beta_GDF8_like	transforming growth factor beta (TGF-beta) like domain found in growth/differentiation factors, GDF8 and GDF11, and similar proteins. The family includes GDF8 and GDF11. GDF8, also termed myostatin, acts specifically as a negative regulator of skeletal muscle growth. GDF11, also termed bone morphogenetic protein 11 (BMP-11), is a secreted signal that acts globally to specify positional identity along the anterior/posterior axis during development.	96
381630	cd13752	TGF_beta_INHB	transforming growth factor beta (TGF-beta) like domain found in inhibin beta A chain (INHBA), B chain (INHBB), C chain (INHBC), E chain (INHBE) and similar proteins. The family includes inhibin beta A chain (INHBA), B chain (INHBB), C chain (INHBC), and E chain (INHBE). INHBA, also termed activin beta-A chain, or erythroid differentiation protein (EDF), is a component of inhibin A, activin A, or activin AB. Inhibins and activins inhibit and activate, respectively, the secretion of follitropin by the pituitary gland. INHBB, also termed activin beta-B chain, is a component of inhibin B, activin A, or activin AB. Inhibins and activins inhibit and activate, respectively, the secretion of follitropin by the pituitary gland. INHBC, also termed activin beta-C chain, might play important roles in carcinogenesis. It may function as a negative regulator of liver growth. INHBE, also termed activin beta-E chain, is a possible insulin resistance-associated hepatokine with hepatic gene expression that positively correlated with insulin resistance and body mass index in humans. It also acts as a possible new marker for drug-induced endoplasmic reticulum stress.	100
381631	cd13753	TGF_beta_TGFbeta1_2_3	transforming growth factor beta (TGF-beta) like domain found in transforming growth factor beta-1 (TGF-beta-1), beta-2 (TGF-beta-2), beta-3 (TGF-beta-3) and similar proteins. The family includes TGF-beta-1, TGF-beta-2 and TGF-beta-3, which are polypeptide members of the transforming growth factor beta superfamily of cytokines. TGF-beta-1 is a secreted protein that performs many cellular functions, including the control of cell growth, cell proliferation, cell differentiation, and apoptosis. TGF-beta-2 is a secreted protein that performs many cellular functions and has a vital role during embryonic development. It can suppress the effects of interleukin-2 dependent T-cell growth. TGF-beta-3 is involved in embryogenesis and cell differentiation. It regulates molecules involved in cellular adhesion and extracellular matrix (ECM) formation during the process of palate development.	97
381632	cd13754	TGF_beta_INHA	transforming growth factor beta (TGF-beta) like domain found in inhibin alpha chain (INHA) and similar proteins. INHA is a component of inhibins (inhibin A or inhibin B) that inhibit the secretion of follitropin by the pituitary gland.	89
381633	cd13755	TGF_beta_maverick	transforming growth factor beta (TGF-beta) like domain found in Drosophila melanogaster maverick and similar proteins. Maverick, also termed MAV, is a novel member of the TGF-beta superfamily in Drosophila. It's a bone morphogenetic protein (BMP)/TGF-beta related ligand.	102
381634	cd13756	TGF_beta_BMPs_GDFs	transforming growth factor beta (TGF-beta) like domain found in the BMP/GDF family. The BMP/GDF family consists of bone morphogenetic proteins (BMPs), growth and differentiation factors (GDFs) and similar proteins. BMPs are a group of growth factors also known as cytokines and as metabologens. They induce the formation of bone and cartilage and function as pivotal morphogenetic signals, orchestrating tissue architecture throughout the body. GDFs have functions predominantly in development.	102
381635	cd13757	TGF_beta_AMH	transforming growth factor beta (TGF-beta) like domain found in anti-Muellerian hormone (AMH) and similar proteins. AMH, also termed Muellerian-inhibiting factor, or Muellerian-inhibiting substance (MIS), is a glycoprotein that causes regression of the Muellerian duct. It can also inhibit the growth of tumors derived from tissues of Muellerian duct origin.	99
381636	cd13758	TGF_beta_LEFTY1_2	transforming growth factor beta (TGF-beta) like domain found in left-right determination factor 1 (lefty-1), factor 2 (lefty-2) and similar proteins. Lefty-1, also termed left-right determination factor B, or protein lefty-B, is required for left-right axis determination as a regulator of Lefty-2 and NODAL. Lefty-2, also termed endometrial bleeding-associated factor, or left-right determination factor A, or protein lefty-A, or transforming growth factor beta-4 (TGF-beta-4), is required for left-right (L-R) asymmetry determination of organ systems in mammals. It may play a role in endometrial bleeding.	90
381637	cd13759	TGF_beta_NODAL	transforming growth factor beta (TGF-beta) like domain found in Nodal (NODAL)-related proteins. NODAL is essential for mesoderm formation and axial patterning during embryonic development.	103
381638	cd13760	TGF_beta_BMP2_like	transforming growth factor beta (TGF-beta) like domain found in bone morphogenetic protein 2 (BMP-2), 4 (BMP-4) and similar proteins. The family includes BMP2 and BMP4 (also known as BMP2B), both of which induce cartilage and bone formation. BMP-2 stimulates the differentiation of myoblasts into osteoblasts via the EIF2AK3-EIF2A-ATF4 pathway. BMP-4 acts in mesoderm induction, tooth development, limb formation and fracture repair.	102
381639	cd13761	TGF_beta_BMP5_like	transforming growth factor beta (TGF-beta) like domain found in bone morphogenetic proteins BMP-5, BMP-6, BMP-7, BMP-8A/B and similar proteins. The family includes BMP-5, BMP-6, BMP-7 and BMP-8A/B, which may induce cartilage and bone formation.	103
381640	cd13762	TGF_beta_GDP9_9B_like	transforming growth factor beta (TGF-beta) like domain found in growth/differentiation factor 9 (GDF-9), growth/differentiation factor 9B (GDF-9B) and similar proteins. The family includes GDF-9B (also known as BMP15) and GDF9. GDF-9B acts as oocyte-specific growth/differentiation factor that stimulates folliculogenesis and granulosa cell (GC) growth. GDF-9 is required for ovarian folliculogenesis. It promotes primordial follicle development and stimulates granulosa cell proliferation.	104
381641	cd13763	TGF_beta_BMP3_like	transforming growth factor beta (TGF-beta) like domain found in bone morphogenetic protein 3 (BMP-3), growth/differentiation factor 10 (GDF10) and similar proteins. The family includes BMP-3 (also known as BMP-3A or osteogenin) and GDF10 (also known as BMP-3B). BMP-3 negatively regulates bone density. It antagonizes the ability of certain osteogenic BMPs to induce osteoprogenitor differentiation and ossification. GDF10 is a growth factor involved in osteogenesis and adipogenesis.	103
381642	cd13764	TGF_beta_GDF1_3_like	transforming growth factor beta (TGF-beta) like domain found in embryonic growth/differentiation factor 1 (GDF1), factor 3 (GDF3) and similar proteins. The family includes GDF-1 and GDF-3. GDF1 may mediate cell differentiation events during embryonic development. GDF3 is a growth factor involved in early embryonic development and adipose-tissue homeostasis. The family also contains protein DVR-1, also termed vegetal hemisphere VG1 protein (VG-1), which serves to facilitate the differentiation of either mesoderm or endoderm either as a cofactor in an instructive signal or by providing permissive environment.	102
381643	cd13765	TGF_beta_ADMP	transforming growth factor beta (TGF-beta) like domain found in anti-dorsalizing morphogenetic protein (ADMP) and similar proteins. ADMP is a bone morphogenetic protein (BMP)-like transforming growth factor beta ligand that functions in the trunk organizer to antagonize head formation, thereby regulating organizer patterning. It negatively affects the formation of the organizer, although it is robustly expressed within the organizer itself. The organizer-promoting signal of ADMP is mediated by the activin A type I receptor, ACVR1 (also known as activin receptor-like kinase-2, ALK2).	105
381644	cd13766	TGF_beta_GDF5_6_7	transforming growth factor beta (TGF-beta) like domain found in growth/differentiation factor 5 (GDF5), factor 6 (GDF6), factor 7 (GDF7) and similar proteins. The family includes GDF5, GDF6 and GDF7. GDF5, also termed bone morphogenetic protein 14 (BMP-14), or cartilage-derived morphogenetic protein 1 (CDMP-1), or lipopolysaccharide-associated protein 4 (LAP-4), or LPS-associated protein 4, or radotermin, is a growth factor involved in bone and cartilage formation. GDF6, also termed bone morphogenetic protein 13 (BMP-13), or growth/differentiation factor 16, is a growth factor that controls proliferation and cellular differentiation in the retina and bone formation. GDF7, also termed bone morphogenetic protein 12 (BMP-12), may play an active role in the motor area of the primate neocortex.	102
381645	cd13767	TGF_beta_BMP9_like	transforming growth factor beta (TGF-beta) like domain found in bone morphogenetic proteins, BMP-9, BMP-10 and similar proteins. The family includes BMP9 (also known as GDF2) and BMP10. BMP-9 is a potent circulating inhibitor of angiogenesis. It signals through the type I activin receptor ACVRL1 but not other activin receptor-like kinases (ALKs). BMP-10 is required for maintaining the proliferative activity of embryonic cardiomyocytes by preventing premature activation of the negative cell cycle regulator CDKN1C/p57KIP and maintaining the required expression levels of cardiogenic factors such as MEF2C and NKX2-5. It inhibits endothelial cell migration and growth. It may reduce cell migration and cell matrix adhesion in breast cancer cell lines.	105
259841	cd13768	DSS1_Sem1	proteasome complex subunit DSS1/Sem1. The evolutionarily conserved deleted in split hand/split foot protein 1 (DSS1)/Sem1 is a subunit of the regulatory particle (RP) of the proteasome. It is implicated in ubiquitin-mediated proteolysis, is required for the maintenance of genomic stability, and functions in DNA damage response. DSS1/Sem1 also displays RP-independent functions; it serves as a functional component of the nuclear pore associated TREX-2 transcription-export complex and is required for proper nuclear export of mRNA. In mammalian cells, DSS1 binds and stabilizes the tumor suppressor BRCA2, and contributes to its function in mediating homologous recombinational repair. In yeast, Sem1 also complexes with the COP9 signalosome, which is involved in de-neddylation. DSS1/Sem1 may be a versatile protein which contributes to the functional integrity of multiple protein complexes involved in various biological processes.	61
259842	cd13769	ApoLp-III_like	Apolipophorin-III and similar insect proteins. Exchangeable apolipoproteins play vital roles in the transport of lipids and lipoprotein metabolism. Apolipophorin III (apoLp-III) assists in the loading of diacylglycerol, generated from triacylglycerol stores in the fat body through the action of adipokinetic hormone, into lipophorin, the hemolymph lipoprotein. ApoLp-III increases the lipid carrying capacity of lipophorin by covering the expanding hydrophobic surface resulting from diacylglycerol uptake. It plays a critical role in the transport of lipids during insect flight, and may also play a role in defense mechanisms and innate immunity.	158
259817	cd13775	SPFH_eoslipins_u3	Uncharacterized prokaryotic subfamily of the stomatin-like proteins (slipins), a subgroup of the SPFH family (stomatin, prohibitin, flotillin, and HflK/C). This model summarizes a subgroup of the stomatin-like protein family (SLPs or slipins) that is found in bacteria and archaebacteria. The conserved domain common to the SPFH superfamily has also been referred to as the Band 7 domain. Individual proteins of the SPFH superfamily may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Bacterial and archaebacterial SLPs remain uncharacterized.	177
260099	cd13777	Aar2_N	N-terminal domain of Aar2, a U5 small nuclear ribonucleoprotein particle assembly factor. This family consists of the N-terminal domain of eukaryotic Aar2 and Aar2-like proteins. Aar2 is a U5 small nuclear ribonucleoprotein (snRNP) particle assembly factor and part of Prp8, which forms a large complex containing U5 snRNA, Snu114, and seven Sm proteins (B, D1, D2, D3, E, F and G). Upon import of the complex into the nucleus, Aar2 phosphorylation leads to its release from Prp8 and replacement by Brr2p, thus playing an important role in Brr2p regulation and possibly safeguarding against non-specific RNA binding to Prp8. Aar2p binds directly with the RNaseH-like domain in the C-terminal region of Prp8p. In yeast, Aar2 protein is involved in splicing pre-mRNA of the a1 cistron and other genes important for cell growth.	126
260100	cd13778	Aar2_C	C-terminal domain of Aar2, a U5 small nuclear ribonucleoprotein particle assembly factor. This family consists of the C-terminal domain of eukaryotic Aar2 and Aar2-like proteins. Aar2 is a U5 small nuclear ribonucleoprotein (snRNP) particle assembly factor and part of Prp8, which forms a large complex containing U5 snRNA, Snu114, and seven Sm proteins (B, D1, D2, D3, E, F and G). Upon import of the complex into the nucleus, Aar2 phosphorylation leads to its release from Prp8 and replacement by Brr2p, thus playing an important role in Brr2p regulation and possibly safeguarding against non-specific RNA binding to Prp8. Aar2p binds directly with the RNaseH-like domain in the C-terminal region of Prp8p. In yeast, Aar2 protein is involved in splicing pre-mRNA of the a1 cistron and other genes important for cell growth.	155
260101	cd13783	SPACA1	Sperm acrosome membrane-associated protein 1. SPACA1 (aka SAMP32, due to its 32kDa M.W.) is localized to the acrosome of spermatozoa. The acrosome is an organelle transformed from the Golgi apparatus to form a cap over the anterior portion of the spermatozoa head, which contains the sperm nucleus. Mammalian acrosomes contain digestive enzymes that degrade the ovum outer membrane (zona pellucida) to allow fusion of the sperm and ovum nuclei via the acrosomal reaction. In mammals, the acrosome releases hyaluronidase and acrosin. Antibodies generated against recombinant SPACA1 have been shown to inhibit human sperm binding and membrane fusion in vitro vs. zona-free hamster ova. Male mice lacking SPACA1 are infertile, and exhibit globozoospermia-like misformed sperm heads. SPACA1 content has been reported to be diminished in a comparison of round-headed vs normal spermatozoa.	248
260102	cd13784	SP_1775_like	Uncharacterized protein conserved in Streptococci. Streptococcus pneumoniae SP_1775 and related proteins from other Streptococci; may form homooctamers that may bind hydrophobic ligands.	67
260079	cd13785	CARD_BinCARD_like	BinCARD (Bcl10-interacting protein with CARD). BinCARD is a CARD (caspase activation and recruitment domain) protein that is ubiquitously expressed in all tissues. CARD proteins play an important role in apoptosis by functioning as direct regulators of death-inducing caspases. BinCARD interacts with the apoptosis inducer CARD protein Bcl10 through CARD. It inhibits Bcl10-mediated activation of NF-kappa B and suppresses Bcl10 phosphorylation. Caspase activation and recruitment domains (CARDs) are death domains (DDs) found associated with caspases. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including PYRIN and DED (Death Effector Domain). They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.	86
259853	cd13831	HU	histone-like DNA-binding protein HU. This subfamily includes HU and HU-like domains. HU is a conserved nucleoid-associated protein (NAP) which binds non-specifically to duplex DNA with a particular preference for targeting nicked and bent DNA. It is highly basic and contributes to chromosomal compaction and maintenance of negative supercoiling, thus often referred to as histone-like protein. HU can induce DNA bends, condense DNA in a fiber and also interact with single stranded DNA. It contains two homologous subunits, alpha and beta, typically forming homodimers (alpha-alpha and beta-beta), except in E. coli and other enterobacteria, which form heterodimers (alpha-beta). In E. coli, HU binds uniformly to the chromosome, with a preference for damaged or distorted DNA structures and can introduce negative supercoils into closed circular DNA in the presence of topoisomerase I. Anabaena HU (AHU) shows preference for A/T-rich region in the center of its DNA binding site.	86
259854	cd13832	IHF	Integration host factor (IHF) and similar proteins. This subfamily includes integration host factor (IHF) and IHF-like domains. IHF is a nucleoid-associated protein (NAP) that binds and sharply bends many DNA targets in a sequence specific manner. It is a heterodimeric protein composed of two highly homologous subunits IHFA (IHF-alpha) and IHFB (IHF-beta). It is known to act as a transcription factor at many gene regulatory regions in E. coli. IHF is an essential cofactor in phage lambda site-specific recombination, having an architectural role during assembly of specialized nucleoprotein structures (snups). IHF is also involved in formation as well as maintenance of bacterial biofilms since it is found in complex with extracellular DNA (eDNA) within the extracellular polymeric substances (EPS) matrix of many biofilms. This subfamily also includes the protein Hbb from tick-borne spirochete Borrelia burgdorferi, responsible for causing Lyme disease in humans. Hbb, a homodimer, shows DNA sequence preferences that are related, yet distinct from those of IHF.	85
259855	cd13833	HU_IHF_like	Uncharacterized proteins similar to DNA sequence specific (IHF) and non-specific (HU) domains. This subfamily consists of uncharacterized proteins similar to integration host factor (IHF) and HU domains, including hypothetical protein Bvu_2165 from Bacteroides vulgatus. IHF is a nucleoid-associated protein (NAP) that binds and sharply bends many DNA targets in a sequence specific manner. It is a heterodimeric protein composed of two highly homologous subunits IHFA (IHF-alpha) and IHFB (IHF-beta). It is known to act as a transcription factor at many gene regulatory regions in E. coli. IHF is an essential cofactor in phage lambda site-specific recombination, having an architectural role during assembly of specialized nucleoprotein structures (snups). IHF is also involved in formation as well as maintenance of bacterial biofilms since it is found in complex with extracellular DNA (eDNA) within the extracellular polymeric substances (EPS) matrix of many biofilms.	97
259856	cd13834	HU_like	DNA-binding proteins similar to HU domains. This subfamily consists of DNA-binding proteins similar to HU domains. HU is a conserved nucleoid-associated protein (NAP) which binds non-specifically to duplex DNA with a particular preference for targeting nicked and bent DNA. It is highly basic and contributes to chromosomal compaction and maintenance of negative supercoiling, thus often referred to as histone-like protein. HU can induce DNA bends, condense DNA in a fiber and also interact with single stranded DNA. It contains two homologous subunits, alpha and beta, typically forming homodimers (alpha-alpha and beta-beta), except in E. coli and other enterobacteria, which form heterodimers (alpha-beta).	94
259857	cd13835	IHF_A	Alpha subunit of integration host factor (IHFA). This subfamily consists of the alpha subunit of integration host factor (IHF) and IHF-like domains. IHF is a nucleoid-associated protein (NAP) that binds and sharply bends many DNA targets in a sequence specific manner. It is a heterodimeric protein composed of two highly homologous subunits IHFA (IHF-alpha) and IHFB (IHF-beta). It is known to act as a transcription factor at many gene regulatory regions in E. coli. IHF is an essential cofactor in phage lambda site-specific recombination, having an architectural role during assembly of specialized nucleoprotein structures (snups). IHF is also involved in formation as well as maintenance of bacterial biofilms since it is found in complex with extracellular DNA (eDNA) within the extracellular polymeric substances (EPS) matrix of many biofilms.	88
259858	cd13836	IHF_B	Beta subunit of integration host factor (IHFB). This subfamily consists of the beta subunit of integration host factor (IHF) and IHF-like domains. IHF is a nucleoid-associated protein (NAP) that binds and sharply bends many DNA targets in a sequence specific manner. It is a heterodimeric protein composed of two highly homologous subunits IHFA (IHF-alpha) and IHFB (IHF-beta). It is known to act as a transcription factor at many gene regulatory regions in E. coli. IHF is an essential cofactor in phage lambda site-specific recombination, having an architectural role during assembly of specialized nucleoprotein structures (snups). IHF is also involved in formation as well as maintenance of bacterial biofilms since it is found in complex with extracellular DNA (eDNA) within the extracellular polymeric substances (EPS) matrix of many biofilms.	89
260013	cd13838	RNase_H_like_Prp8_IV	Ribonuclease-like Prp8 domain IV core. This family contains Prp8 domain IV, which adopts a RNase H like fold within its core structure but with little sequence similarity. Prp8, a spliceosome protein, interacts directly with the splice sites and branch regions of precursor-mRNAs and spliceosomal RNAs associated with catalysis of the two steps of splicing. Catalysis of RNA cleavage by RNase H-like proteins involves a two-metal mechanism in which adjacently-bound divalent magnesium ions promote hydrolysis by activation of a water nucleophile and stabilization of the transition-state. However, the Prp8 domain IV contains only one of the canonical metal-binding sites and the coordinating side chains are spatially conserved with respect to Mg2+-coordinating residues within the RNase H fold.	251
260103	cd13839	MEF2_binding	Myocyte enhancer factor-2 (MEF2) binding domain of the calcineurin-binding protein cabin-1. The myocyte enhancer factor-2 (MEF2) binding domain, as found in the calcineurin-binding protein cabin-1, adopts an amphipathic alpha-helical structure, which allows it to bind to a hydrophobic groove on the MEF2S domain, forming a triple-helical interaction. Interaction of this domain with MEF2 causes repression of transcription. Cabin-1 inhibits calcineurin-mediated signal transduction in T-cell receptor-mediated signalling pathways, by binding to the activated form of calcineurin. Cabin-1 acts as a co-repressor of MEF2, the myocyte enhancer factor-2, which regulates transcription in a calcium-dependent manner and plays vital roles in T-cell development and function.	35
260104	cd13840	SMBP_like	Small metal-binding protein conserved in proteobacteria. This periplasmic protein appears capable of binding multiple equivalents of a variety of divalent and trivalent metals, including Cu(2+) and Fe(3+) but also Mn(2+), Ni(2+), Mg(2+), and Zn(2+). It has been suggested that SMBP is a metal scavenging protein that plays a role in cellular copper management in Nitrosomonas europaea.	89
260105	cd13841	ABBA-PTs	ABBA-type aromatic prenyltransferases (PTases). ABBA-type aromatic prenyltransferases (PTases) are a subgroup of prenyltransferases that are characterized by an unusual type of beta/alpha fold with antiparallel beta strands. They lack the (N/D)DxxD motif which is characteristic for many other prenyltransferases. Generally, aromatic prenyltransferases (PTs) catalyze the regioselective transfer of prenyl moieties onto aromatic substrates, forming C-C bonds between C-1 or C-3 of the isoprenoid substrate and one of the aromatic carbons of the acceptor substrate by an electrophilic alkylation, or Friedel-Crafts alkylation mechanism.	294
259911	cd13842	CuRO_HCO_II_like	Cupredoxin domain of Heme-copper oxidase subunit II. Heme-copper oxidases are transmembrane protein complexes in the respiratory chains of prokaryotes and mitochondria which catalyze the reduction of O2 and simultaneously pump protons across the membrane. The superfamily is diverse in terms of electron donors, subunit composition, and heme types. The number of subunits varies from two to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian cytochrome c oxidase (CcO) are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. It has been proposed that archaea acquired heme-copper oxidases through gene transfer from gram-positive bacteria. Subunit II is found in CcO, ubiquinol oxidase, and the ba3-like oxidases, while the cbb3 oxidases contain alternative additional subunits.  Additionally, nitrous oxide reductase contains the globular portion of subunit II as a domain within its structure. In some families, subunit II contains a copper-copper binuclear center that is involved in the transfer of electrons from the substrate to the binuclear center (active site) in subunit I.	95
259912	cd13843	Azurin_like	Azurin and similar redox proteins. Azurin is a bacterial blue copper-binding protein. It serves as a redox partner to enzymes such as nitrite reductase or arsenite oxidase. The copper of Azurin is tetrahedrally coordinated by a cysteine, 2 histidines, and a methionine residue. The electron transfer reactions are carried out with the Cu center transitioning between the oxidized Cu(II) form and the reduced Cu(I) form. Azurin can function as tumor suppressor; it forms a complex with p53 that triggers apoptosis in various human cancer cells. Auracyanins A and B are from photosynthetic bacteria. They are very  similar blue copper proteins with 38% sequence identity and they are homologous to the bacterial redox protein Azurin. However, auracyanin A is expressed only when C. aurantiacus cells are grown in light, whereas auracyanin B is expressed under dark and in light. Thus, auracyanin A may function as a redox partner in photosynthesis, while auracyanin B may function in aerobic respiration.	124
259913	cd13844	CuRO_1_BOD_CotA_like	The first Cupredoxin domain of Bilirubin oxidase (BOD), the bacterial endospore coat component CotA, and similar proteins. Bilirubin oxidase (BOD) catalyzes the oxidation of bilirubin to biliverdin and the four-electron reduction of molecular oxygen to water. CotA protein is an abundant component of the outer coat layer in bacterial endospore coat and it is required for spore resistance against hydrogen peroxide and UV light. Also included in this subfamily are phenoxazinone synthase (PHS), which catalyzes the oxidative coupling of substituted o-aminophenols to produce phenoxazinones. PHS has been shown to participate in diverse biological functions such as spore pigmentation and biosynthesis of the antibiotic grixazone. These are Laccase-like multicopper oxidases (MCOs) that are able to couple oxidation of substrates with reduction of dioxygen to water. MCOs are capable of oxidizing a vast range of substrates, varying from aromatic compounds to inorganic compounds such as metals. Although the members of this family have diverse functions, majority of them have three cupredoxin domain repeats. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 1 of 3-domain MCOs contains part the trinuclear copper binding site, which is located at the interface of domains 1 and 3.	162
259914	cd13845	CuRO_1_AAO	The first cupredoxin domain of plant Ascorbate oxidase. Ascorbate oxidase catalyzes the oxidation of ascorbic acid to dehydroascorbic acid. This multicopper oxidase (MCO) is found in cucurbitaceous plants such as pumpkin, cucumber, and melon. It can detect levels of ascorbic acid and eliminate it. The biological function of ascorbate oxidase is still not clear. Ascorbate oxidase belongs to MCO family which couple oxidation of substrates with reduction of dioxygen to water. MCOs are capable of oxidizing a vast range of substrates, varying from aromatic compounds to inorganic compounds such as metals. Although the members of this family have diverse functions, majority of them have three cupredoxin domain repeats. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 1 of 3-domain MCOs contains part the trinuclear copper binding site, which is located at the interface of domains 1 and 3.	120
259915	cd13846	CuRO_1_AAO_like_1	The first cupredoxin domain of plant Ascorbate oxidase homologs. This subfamily is composed of plant pollen multicopper oxidase homologous to ascorbate oxidase. Ascorbate oxidase catalyzes the oxidation of ascorbic acid to dehydroascorbic acid. This multicopper oxidase (MCO) is found in cucurbitaceous plants such as pumpkin, cucumber, and melon. It can detect levels of ascorbic acid and eliminate it. The biological function of ascorbate oxidase is still not clear. Ascorbate oxidase belongs to MCO family which couple oxidation of substrates with reduction of dioxygen to water. MCOs are capable of oxidizing a vast range of substrates, varying from aromatic compounds to inorganic compounds such as metals. Although the members of this family have diverse functions, majority of them have three cupredoxin domain repeats. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 1 of 3-domain MCOs contains part the trinuclear copper binding site, which is located at the interface of domains 1 and 3. This subfamily does not harbor trinuclear copper binding histidines.	118
259916	cd13847	CuRO_1_AAO_like_2	The first cupredoxin domain of Ascorbate oxidase homologs. This family includes fungal proteins with similarity to ascorbate oxidase. Ascorbate oxidase catalyzes the oxidation of ascorbic acid to dehydroascorbic acid. It can detect levels of ascorbic acid and eliminate it. The biological function of ascorbate oxidase is still not clear. Ascorbate oxidase belongs to multicopper oxidase (MCO) family which couple oxidation of substrates with reduction of dioxygen to water. MCOs are capable of oxidizing a vast range of substrates, varying from aromatic compounds to inorganic compounds such as metals. Although the members of this family have diverse functions, majority of them have three cupredoxin domain repeats. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 1 of 3-domain MCOs contains part the trinuclear copper binding site, which is located at the interface of domains 1 and 3.	117
259917	cd13848	CuRO_1_CopA	The first cupredoxin domain of CopA copper resistance protein family. CopA is a multicopper oxidase (MCO) related to laccase and L-ascorbate oxidase, both copper-containing enzymes. It is part of the copper-regulatory cue operon, which employs a cytosolic metalloregulatory protein CueR that induces expression of CopA and CueO under copper stress conditions. CopA is a copper efflux P-type ATPase that is located in the inner cell membrane and is involved in copper resistance in bacteria. CopA mutant causes a loss of function including copper tolerance and oxidase activity, and copA transcription is inducible in the presence of copper. Although MCOs have diverse functions, majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 1 of 3-domain MCOs contains part the trinuclear copper binding site, which is located at the interface of domains 1 and 3.	116
259918	cd13849	CuRO_1_LCC_plant	The first cupredoxin domain of plant laccases. Laccase is a blue multicopper oxidase (MCO) which catalyzes the oxidation of a variety aromatic - notably phenolic and inorganic substances coupled to the reduction of molecular oxygen to water. Laccase has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism. Plants usually express multiple laccase genes, but their precise physiological/biochemical roles remain largely unclear. MCOs are capable of oxidizing a vast range of substrates, varying from aromatic compounds to inorganic compounds such as metals. Although the members of this family have diverse functions, majority of them have three cupredoxin domain repeats. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 1 of 3-domain MCOs contains part the trinuclear copper binding site, which is located at the interface of domains 1 and 3.	117
259919	cd13850	CuRO_1_Abr2_like	The first cupredoxin domain of a group of fungal Laccases similar to Abr2 from Aspergillus fumigatus. Abr2 is involved in conidial pigment biosynthesis in Aspergillus fumigatus. Laccase is a blue multi-copper enzyme that catalyzes the oxidation of a variety aromatic - notably phenolic and inorganic substances coupled to the reduction of molecular oxygen to water. Laccase has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism in fungi and plants. Like other related multicopper oxidases (MCOs), laccase is composed of three cupredoxin domains that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 1 of 3-domain MCOs contains part the trinuclear copper binding site, which is located at the interface of domains 1 and 3.	117
259920	cd13851	CuRO_1_Fet3p	The first Cupredoxin domain of multicopper oxidase Fet3p. Fet3p catalyzes the ferroxidase reaction, which couples the oxidation of Fe(II) to Fe(III) and a four-electron reduction of molecular oxygen to water. Fet3p is a type I membrane protein with the amino-terminal oxidase domain in the exocellular space and the carboxyl terminus in the cytoplasm. The periplasmically produced Fe(III) is transferred to the permease Ftr1p for import into the cytosol. The four copper ions are inserted post-translationally and are essential for catalytic activity, thus linking copper and iron homeostasis. Like other related multicopper oxidases (MCOs), Fet3p is composed of three cupredoxin domains that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 1 of 3-domain MCOs contains part of the trinuclear copper binding site, which is located at the interface of domains 1 and 3.	121
259921	cd13852	CuRO_1_McoP_like	The first cupredoxin domain of multicopper oxidase McoP and similar proteins. This family includes archaeal and bacterial multicopper oxidases (MCOs), represented by the extremely thermostable McoP from the hyperthermophilic archaeon Pyrobaculum aerophilum. McoP is an efficient metallo-oxidase that catalyzes the oxidation of cuprous and ferrous ions. It is noteworthy that McoP has three-fold higher catalytic efficiency when using nitrous oxide as the electron acceptor than when using dioxygen, the typical oxidizing substrate of MCOs. McoP may function as a novel archaeal nitrous oxide reductase that is probably involved in the denitrification pathway in archaea. Although MCOs have diverse functions, majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 1 of 3-domain MCOs contains part the trinuclear copper binding site, which is located at the interface of domains 1 and 3.	114
259922	cd13853	CuRO_1_Tth-MCO_like	The first cupredoxin domain of the bacterial laccases similar to Tth-MCO from Thermus thermophilus. The subfamily of bacterial laccases includes Tth-MCO and similar proteins. Tth-MCO is a hyperthermophilic multicopper oxidase (MCO) from Thermus thermophilus HB27. Laccase is a blue multi-copper enzyme that catalyzes the oxidation of a variety of aromatic (notably phenolic) and inorganic substances coupled to the reduction of molecular oxygen to water. It has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism in fungi and plants. Although MCOs have diverse functions, the majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 1 of 3-domain MCOs contains part of the trinuclear copper binding site, which is located at the interface of domains 1 and 3.	139
259923	cd13854	CuRO_1_MaLCC_like	The first cupredoxin domain of the fungal laccases similar to Ma-LCC  from Melanocarpus albomyces. The subfamily of fungal laccases includes Ma-LCC and similar proteins. Ma-LCC is a multicopper oxidase (MCO) from Melanocarpus albomyces. Its crystal structure contains all four coppers at the mono- and trinuclear copper centers.  Laccase is a blue multi-copper enzyme that catalyzes the oxidation of a variety aromatic - notably phenolic and inorganic substances coupled to the reduction of molecular oxygen to water. It has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism in fungi and plants. Although MCOs have diverse functions, majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 1 of 3-domain MCOs contains part the trinuclear copper binding site, which is located at the interface of domains 1 and 3.	122
259924	cd13855	CuRO_1_McoC_like	The first cupredoxin domain of a multicopper oxidase McoC and similar proteins. This family includes bacterial multicopper oxidases (MCOs) represented by McoC from the pathogenic bacterium Campylobacter jejuni. McoC is a periplasmic multicopper oxidase, which has been characterized to be associated with copper homeostasis. McoC may also function to protect against oxidative stress as it may convert metallic ions into their less toxic form. MCOs are multi-domain enzymes that are able to couple oxidation of substrates with reduction of dioxygen to water. They are capable of oxidizing a vast range of substrates, varying from aromatic compounds to inorganic compounds such as metals. Most MCOs have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 1 of 3-domain MCOs contains part of the trinuclear copper binding site, which is located at the interface of domains 1 and 3.	121
259925	cd13856	CuRO_1_Tv-LCC_like	The first cupredoxin domain of fungal laccases similar to Tv-LCC from Trametes versicolor. This subfamily of fungal laccases includes Tv-LCC from Trametes versicolor and Rs-LCC2 from plant pathogenic fungus Rhizoctonia solani. Laccase is a blue multi-copper enzyme that catalyzes the oxidation of a variety aromatic - notably phenolic and inorganic substances coupled to the reduction of molecular oxygen to water. It has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism. Although MCOs have diverse functions, majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 1 of 3-domain MCOs contains part the trinuclear copper binding site, which is located at the interface of domains 1 and 3.	125
259926	cd13857	CuRO_1_Diphenol_Ox	The first cupredoxin domain of fungal laccase, diphenol oxidase. Diphenol oxidase belongs to the laccase family. It catalyzes the initial steps in melanin biosynthesis from diphenols. Melanin is one of the virulence factors of infectious fungi. In the pathogenesis of C. neoformans, melanin pigments have been shown to protect the fungal cells from oxidative and microbicidal activities of host defense systems. Laccase is a blue multicopper oxidase (MCO) which catalyzes the oxidation of a variety aromatic - notably phenolic and inorganic substances coupled to the reduction of molecular oxygen to water. It has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism. Although MCOs have diverse functions, majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 1 of 3-domain MCOs contains part the trinuclear copper binding site, which is located at the interface of domains 1 and 3.	119
259927	cd13858	CuRO_1_tcLCC2_insect_like	The first cupredoxin domain of insect laccases similar to laccase 2 in Tribolium castaneum. This multicopper oxidase (MCO) family includes the majority of insect laccases. One member of the family is laccase 2 from Tribolium castaneum. Laccase 2 is required for beetle cuticle tanning. Laccase (polyphenol oxidase EC 1.10.3.2) is a blue multi-copper enzyme that catalyzes the oxidation of a variety of organic substrates coupled to the reduction of molecular oxygen to water. It displays broad substrate specificity, catalyzing the oxidation of a wide variety of aromatic - notably phenolic and inorganic substances. Laccase has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism in fungi, plants and insects. Although MCOs have diverse functions, majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 1 of 3-domain MCOs contains part the trinuclear copper binding site, which is located at the interface of domains 1 and 3.	105
259928	cd13859	CuRO_D1_2dMcoN_like	The first cupredoxin domain of bacterial two domain multicopper oxidase McoN and similar proteins. This family includes bacterial two domain multicopper oxidases (2dMCOs) represented by the McoN from Nitrosomonas europaea. McoN is a trimeric type C blue copper oxidase. Each subunit houses a type 1 copper site in domain 1 and a type 2/type 3 trinuclear copper cluster at the subunit-subunit interface. The 2dMCO is proposed to be a key intermediate in the evolution of three domain MCOs. Its biological function has not been characterized. Multicopper oxidases couple oxidation of substrates with reduction of dioxygen to water. These MCOs are capable of oxidizing a vast range of substrates, varying from aromatic to inorganic compounds such as metals.	122
259929	cd13860	CuRO_1_2dMco_1	The first cupredoxin domain of bacteria two domain multicopper oxidase. This subfamily includes bacterial two domain multicopper oxidases (2dMCOs) with similarity to McoN from Nitrosomonas europaea. 2dMCO is a trimeric type C blue copper oxidase. Each subunit houses a type 1 copper site in domain 1 and a type 2/type 3 trinuclear copper cluster at the subunit-subunit interface. The 2dMCO is proposed to be a key intermediate in the evolution of three domain MCOs. Multicopper oxidases couple oxidation of substrates with reduction of dioxygen to water. These MCOs are capable of oxidizing a vast range of substrates, varying from aromatic to inorganic compounds such as metals.	119
259930	cd13861	CuRO_1_CumA_like	The first cupredoxin domain of CumA like multicopper oxidase. This multicopper oxidase (MCO) subfamily includes CumA from Pseudomonas putida, which is involved in the oxidation of Mn(II). However, the cumA gene has been identified in a variety of bacterial species, including both Mn(II)-oxidizing and non-Mn(II)-oxidizing strains. Thus, the proteins in this family may catalyze the oxidation of other substrates. MCO catalyzes the oxidation of a variety aromatic - notably phenolic and inorganic substances coupled to the reduction of molecular oxygen to water and  has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism. Although MCOs have diverse functions, majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 1 of 3-domain MCOs contains part the trinuclear copper binding site, which is located at the interface of domains 1 and 3.	119
259931	cd13862	CuRO_1_MCO_like_1	The first cupredoxin domain of uncharacterized multicopper oxidase. Multicopper Oxidases (MCOs) are multi-domain enzymes that are able to couple oxidation of substrates with reduction of dioxygen to water. MCOs oxidize their substrate by accepting electrons at a mononuclear copper centre and transferring them to a trinuclear copper centre which binds a dioxygen. The dioxygen, following the transfer of four electrons, is reduced to two molecules of water. These MCOs are capable of oxidizing a vast range of substrates, varying from aromatic to inorganic compounds such as metals. This subfamily of MCOs is composed of three cupredoxin domains. The cupredoxin domain 1 of 3-domain MCOs contains part the trinuclear copper binding site, which is located at the interface of domains 1 and 3.	123
259932	cd13864	CuRO_1_MCO_like_2	The first cupredoxin domain of uncharacterized multicopper oxidase. Multicopper Oxidases (MCOs) are multi-domain enzymes that are able to couple oxidation of substrates with reduction of dioxygen to water. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to a trinuclear copper center, which binds dioxygen. The dioxygen, following the transfer of four electrons, is reduced to two molecules of water. These MCOs are capable of oxidizing a vast range of substrates, varying from aromatic to inorganic compounds such as metals. This subfamily of MCOs is composed of three cupredoxin domains. The cupredoxin domain 1 of 3-domain MCOs contains part of the trinuclear copper binding site, which is located at the interface of domains 1 and 3.	139
259933	cd13865	CuRO_1_LCC_like_3	The first cupredoxin domain of uncharacterized multicopper oxidase. Multicopper Oxidases (MCOs) are multi-domain enzymes that are able to couple oxidation of substrates with reduction of dioxygen to water. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to a trinuclear copper center, which binds dioxygen. The dioxygen, following the transfer of four electrons, is reduced to two molecules of water. These MCOs are capable of oxidizing a vast range of substrates, varying from aromatic to inorganic compounds such as metals. This subfamily of MCOs is composed of three cupredoxin domains. The cupredoxin domain 1 of 3-domain MCOs contains part of the trinuclear copper binding site, which is located at the interface of domains 1 and 3.	115
259934	cd13866	CuRO_2_BOD	The second cupredoxin domain of Bilirubin oxidase (BOD). Bilirubin oxidase (BOD) catalyzes the oxidation of bilirubin to biliverdin and the four-electron reduction of molecular oxygen to water. It is used in diagnosing jaundice through the determination of bilirubin in serum. BOD is a member of the multicopper oxidase (MCO) family that also includes laccase, ascorbate oxidase and ceruloplasmin. MCOs are capable of oxidizing a vast range of substrates, varying from aromatic compounds to inorganic compounds such as metals. Although the members of this family have diverse functions, the majority of them have three cupredoxin domain repeats. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper.	152
259935	cd13867	CuRO_2_CueO_FtsP	The second Cupredoxin domain of the multicopper oxidase CueO, the cell division protein FtsP, and similar proteins. CueO is a multicopper oxidase (MCO) that is part of the copper-regulatory cue operon, which employs a cytosolic metalloregulatory protein CueR that induces expression of CopA and CueO under copper stress conditions. CueO is a periplasmic multicopper oxidase that is stimulated by exogenous copper(II). FtsP (also named SufI) is a component of the cell division apparatus. It is involved in protecting or stabilizing the assembly of divisomes under stress conditions. FtsP belongs to the multicopper oxidase superfamily but lacks metal cofactors. The protein is localized at septal rings and may serve a scaffolding function. Members of this subfamily contain three cupredoxin domains and this model represents the second domain. Although MCOs have diverse functions, the majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper.	146
259936	cd13868	CuRO_2_CotA_like	The second Cupredoxin domain of bacterial laccases including CotA, a bacterial endospore coat component. CotA protein is an abundant component of the outer layer of the bacterial endospore coat and is required for spore resistance against hydrogen peroxide and UV light. Laccase is composed of three cupredoxin-like domains and includes one mononuclear and one trinuclear copper center. It is a member of the multicopper oxidase (MCO) family, which couples the oxidation of a substrate with a four-electron reduction of molecular oxygen to water. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper.	155
259937	cd13869	CuRO_2_PHS	The second Cupredoxin domain of phenoxazinone synthase (PHS). Phenoxazinone synthase (PHS, 2-aminophenol:oxygen oxidoreductase) catalyzes the oxidative coupling of substituted o-aminophenols to produce phenoxazinones. PHS participates in diverse biological functions such as spore pigmentation and biosynthesis of the antibiotic grixazone. It is a member of the multicopper oxidase (MCO) family, which couples the oxidation of a substrate with a four-electron reduction of molecular oxygen to water. Although MCOs have diverse functions, the majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper.	166
259938	cd13870	CuRO_2_CopA_like_1	The second cupredoxin domain of the CopA copper resistance protein-like family. The members of this family are copper resistance protein (CopA) homologs. CopA is a multicopper oxidase (MCO) related to laccase and L-ascorbate oxidase, both copper-containing enzymes. CopA is involved in copper resistance in bacteria. Mutation of CopA causes a loss of function, including copper tolerance and oxidase activity, and copA transcription is inducible in the presence of copper. Although MCOs have diverse functions, the majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper.	117
259939	cd13871	CuRO_2_AAO	The second cupredoxin domain of plant Ascorbate oxidase. Ascorbate oxidase catalyzes the oxidation of ascorbic acid to dehydroascorbic acid. This multicopper oxidase (MCO) is found in cucurbitaceous plants such as pumpkin, cucumber, and melon. It can detect levels of ascorbic acid and eliminate it. The biological function of ascorbate oxidase is still not clear. MCOs couple oxidation of substrates with reduction of dioxygen to water. Although MCOs have diverse functions, the majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper.	166
259940	cd13872	CuRO_2_AAO_like_1	The second cupredoxin domain of plant pollen multicopper oxidase homologous to ascorbate oxidase. The proteins in this subfamily are expressed in plant pollen. They share homology to ascorbate oxidase and other members of the blue copper oxidase family. The expression of the protein is detected during germination and pollen tube growth. Ascorbate oxidase catalyzes the oxidation of ascorbic acid to dehydroascorbic acid. It is a member of the multicopper oxidase (MCO) family that couples oxidation of substrates with reduction of dioxygen to water. Although MCOs have diverse functions, the majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper.	141
259941	cd13873	CuRO_2_AAO_like_2	The second cupredoxin domain of plant Ascorbate oxidase homologs. This family includes plant laccases similar to ascorbate oxidase. Ascorbate oxidase catalyzes the oxidation of ascorbic acid to dehydroascorbic acid. It can detect levels of ascorbic acid and eliminate it. The biological function of ascorbate oxidase is still not clear. Ascorbate oxidase belongs to the multicopper oxidase (MCO) family, which couples oxidation of substrates with reduction of dioxygen to water. Although MCOs have diverse functions, the majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper.	161
259942	cd13874	CuRO_2_CopA	The second cupredoxin domain of the CopA copper resistance protein family. CopA is a multicopper oxidase (MCO) related to laccase and L-ascorbate oxidase, both copper-containing enzymes. It is part of the copper-regulatory cue operon, which employs a cytosolic metalloregulatory protein CueR that induces expression of CopA and CueO under copper stress conditions. CopA is a copper efflux P-type ATPase that is located in the inner cell membrane and is involved in copper resistance in bacteria. Mutation of CopA causes a loss of function, including copper tolerance and oxidase activity, and copA transcription is inducible in the presence of copper. Although MCOs have diverse functions, the majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper.	112
259943	cd13875	CuRO_2_LCC_plant	The second cupredoxin domain of the plant laccases. Laccase is a blue multi-copper enzyme that catalyzes the oxidation of a variety of aromatic (notably phenolic) and inorganic substances coupled to the reduction of molecular oxygen to water. Laccase has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism. Plants usually express multiple laccase genes, but their precise physiological/biochemical roles remain largely unclear. Like other related multicopper oxidases (MCOs), laccase is composed of three cupredoxin domains that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper.	148
259944	cd13876	CuRO_2_Abr2_like	The second cupredoxin domain of a group of fungal Laccases similar to Abr2 from Aspergillus fumigatus. Abr2 is involved in conidial pigment biosynthesis in Aspergillus fumigatus. Laccase is a blue multi-copper enzyme that catalyzes the oxidation of a variety of aromatic (notably phenolic) and inorganic substances coupled to the reduction of molecular oxygen to water. Laccase has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism in fungi and plants. Like other related multicopper oxidases (MCOs), laccase is composed of three cupredoxin domains that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper.	138
259945	cd13877	CuRO_2_Fet3p_like	The second Cupredoxin domain of multicopper oxidase Fet3p. Fet3p catalyzes the ferroxidase reaction, which couples the oxidation of Fe(II) to Fe(III) with the four-electron reduction of molecular oxygen to water. Fet3p is a type I membrane protein with the amino-terminal oxidase domain in the extracellular space and the carboxyl terminus in the cytoplasm. The periplasmically produced Fe(III) is transferred to the permease Ftr1p for import into the cytosol. The four copper ions are inserted post-translationally and are essential for catalytic activity, thus linking copper and iron homeostasis. Like other related multicopper oxidases (MCOs), Fet3p is composed of three cupredoxin domains that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper.	148
259946	cd13879	CuRO_2_McoP_like	The second cupredoxin domain of multicopper oxidase McoP and similar proteins. This family includes archaeal and bacterial multicopper oxidases (MCOs), represented by the extremely thermostable McoP from the hyperthermophilic archaeon Pyrobaculum aerophilum. McoP is an efficient metallo-oxidase that catalyzes the oxidation of cuprous and ferrous ions. It is noteworthy that McoP has three-fold higher catalytic efficiency when using nitrous oxide as electron acceptor than when using dioxygen, the typical oxidizing substrate of multicopper oxidases. McoP may function as a novel archaeal nitrous oxide reductase that is probably involved in the denitrification pathway in archaea. Although MCOs have diverse functions, the majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper.	162
259947	cd13880	CuRO_2_MaLCC_like	The second cupredoxin domain of the fungal laccases similar to Ma-LCC from Melanocarpus albomyces. The subfamily of fungal laccases includes Ma-LCC and similar proteins. Ma-LCC is a multicopper oxidase (MCO) from Melanocarpus albomyces. Its crystal structure contains all four coppers at the mono- and trinuclear copper centers. Laccase is a blue multi-copper enzyme that catalyzes the oxidation of a variety of aromatic (notably phenolic) and inorganic substances coupled to the reduction of molecular oxygen to water. It has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism in fungi and plants. Laccase is composed of three cupredoxin domains that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper.	167
259948	cd13881	CuRO_2_McoC_like	The second cupredoxin domain of a multicopper oxidase McoC and similar proteins. This family includes bacterial multicopper oxidases (MCOs) represented by McoC from the pathogenic bacterium Campylobacter jejuni. McoC is a periplasmic MCO, which has been characterized to be associated with copper homeostasis. McoC may also function to protect against oxidative stress as it may convert metallic ions into their less toxic form. MCOs are multi-domain enzymes that are able to couple oxidation of substrates with the reduction of dioxygen to water. These MCOs are capable of oxidizing a vast range of substrates, varying from aromatic to inorganic compounds such as metals. They are composed of three cupredoxin domains that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper.	142
259949	cd13882	CuRO_2_Tv-LCC_like	The second cupredoxin domain of the fungal laccases similar to Tv-LCC from Trametes versicolor. This subfamily of fungal laccases includes Tv-LCC from Trametes versicolor and Rs-LCC2 from the plant pathogenic fungus Rhizoctonia solani. Laccase is a blue multi-copper enzyme that catalyzes the oxidation of a variety of aromatic (notably phenolic) and inorganic substances coupled to the reduction of molecular oxygen to water. It has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism. Laccase is a multicopper oxidase (MCO) composed of three cupredoxin domains that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper.	159
259950	cd13883	CuRO_2_Diphenol_Ox	The second cupredoxin domain of fungal laccase, diphenol oxidase. Diphenol oxidase belongs to the laccase family. It catalyzes the initial steps in melanin biosynthesis from diphenols. Melanin is one of the virulence factors of infectious fungi. In the pathogenesis of C. neoformans, melanin pigments have been shown to protect the fungal cells from oxidative and microbicidal activities of host defense systems. Laccase is a blue multi-copper enzyme that catalyzes the oxidation of a variety of aromatic (notably phenolic) and inorganic substances coupled to the reduction of molecular oxygen to water. It has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism. Laccase is a multicopper oxidase (MCO) composed of three cupredoxin domains that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper.	164
259951	cd13884	CuRO_2_tcLCC_insect_like	The second cupredoxin domain of the insect laccases similar to laccase 2 in Tribolium castaneum. This multicopper oxidase (MCO) subfamily includes the majority of insect laccases. One member is laccase 2 from Tribolium castaneum, which is required for beetle cuticle tanning. Laccase (polyphenol oxidase EC 1.10.3.2) is a blue multi-copper enzyme that catalyzes the oxidation of a variety of organic substrates coupled to the reduction of molecular oxygen to water. It displays broad substrate specificity, catalyzing the oxidation of a wide variety of aromatic (notably phenolic) and inorganic substances. Laccase has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism in fungi, plants and insects. Laccase is composed of three cupredoxin domains that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper.	150
259952	cd13885	CuRO_2_CumA_like	The second cupredoxin domain of CumA-like multicopper oxidase. This multicopper oxidase (MCO) subfamily includes CumA from Pseudomonas putida. CumA is involved in the oxidation of Mn(II) in Pseudomonas putida; however, the cumA gene has been identified in a variety of bacterial species, including both Mn(II)-oxidizing and non-Mn(II)-oxidizing strains. Thus, the proteins in this family may catalyze the oxidation of other substrates. MCOs catalyze the oxidation of a variety of aromatic (notably phenolic) and inorganic substances coupled to the reduction of molecular oxygen to water, and have been implicated in a wide spectrum of biological activities; in particular, they play a key role in morphogenesis, development and lignin metabolism. The MCOs in this subfamily are composed of three cupredoxin domains that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper.	132
259953	cd13886	CuRO_2_MCO_like_1	The second cupredoxin domain of uncharacterized multicopper oxidase. Multicopper Oxidases (MCOs) are multi-domain enzymes that are able to couple oxidation of substrates with reduction of dioxygen to water. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to a trinuclear copper center, which binds dioxygen. The dioxygen, following the transfer of four electrons, is reduced to two molecules of water. These MCOs are capable of oxidizing a vast range of substrates, varying from aromatic to inorganic compounds such as metals. This family of MCOs is composed of three cupredoxin domains that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper.	163
259954	cd13887	CuRO_2_MCO_like_2	The second cupredoxin domain of uncharacterized multicopper oxidase. Multicopper Oxidases (MCOs) are multi-domain enzymes that are able to couple oxidation of substrates with reduction of dioxygen to water. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to a trinuclear copper center, which binds dioxygen. The dioxygen, following the transfer of four electrons, is reduced to two molecules of water. These MCOs are capable of oxidizing a vast range of substrates, varying from aromatic to inorganic compounds such as metals. This family of MCOs is composed of three cupredoxin domains that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper.	114
259955	cd13888	CuRO_3_McoP_like	The third cupredoxin domain of multicopper oxidase McoP and similar proteins. This subfamily includes archaeal and bacterial multicopper oxidases (MCOs), represented by the extremely thermostable McoP from the hyperthermophilic archaeon Pyrobaculum aerophilum. McoP is an efficient metallo-oxidase that catalyzes the oxidation of cuprous and ferrous ions. It is noteworthy that McoP has three-fold higher catalytic efficiency when using nitrous oxide as electron acceptor than when using dioxygen, the typical oxidizing substrate of multicopper oxidases. McoP may function as a novel archaeal nitrous oxide reductase that is probably involved in the denitrification pathway in archaea. Members of this subfamily contain three cupredoxin domain repeats. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part of the trinuclear copper binding site, which is located at the interface of domains 1 and 3.	139
259956	cd13889	CuRO_3_BOD	The third cupredoxin domain of Bilirubin oxidase (BOD). Bilirubin oxidase (BOD) catalyzes the oxidation of bilirubin to biliverdin and the four-electron reduction of molecular oxygen to water. It is used in diagnosing jaundice through the determination of bilirubin in serum. BOD is a member of the multicopper oxidase (MCO) family that also includes laccase, ascorbate oxidase and ceruloplasmin. MCOs are capable of oxidizing a vast range of substrates, varying from aromatic compounds to inorganic compounds such as metals. Although the members of this family have diverse functions, the majority of them have three cupredoxin domain repeats. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part of the trinuclear copper binding site, which is located at the interface of domains 1 and 3.	124
259957	cd13890	CuRO_3_CueO_FtsP	The third Cupredoxin domain of the multicopper oxidase CueO, the cell division protein FtsP, and similar proteins. CueO is a multicopper oxidase (MCO) that is part of the copper-regulatory cue operon, which employs a cytosolic metalloregulatory protein CueR that induces expression of CopA and CueO under copper stress conditions. CueO is a periplasmic multicopper oxidase that is stimulated by exogenous copper(II). FtsP (also named SufI) is a component of the cell division apparatus. It is involved in protecting or stabilizing the assembly of divisomes under stress conditions. FtsP belongs to the multicopper oxidase superfamily but lacks metal cofactors. The protein is localized at septal rings and may serve a scaffolding function. Members of this subfamily contain three cupredoxin domains and this model represents the third domain. Although MCOs have diverse functions, the majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part of the trinuclear copper binding site, which is located at the interface of domains 1 and 3. FtsP does not contain any copper binding sites.	124
259958	cd13891	CuRO_3_CotA_like	The third Cupredoxin domain of bacterial laccases including CotA, a bacterial endospore coat component. CotA protein is an abundant component of the outer layer of the bacterial endospore coat and is required for spore resistance against hydrogen peroxide and UV light. CotA belongs to the laccase-like multicopper oxidase (MCO) family, whose members are able to couple oxidation of substrates with reduction of dioxygen to water. MCOs are capable of oxidizing a vast range of substrates, varying from aromatic compounds to inorganic compounds such as metals. Although the members of this family have diverse functions, the majority of them have three cupredoxin domain repeats. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part of the trinuclear copper binding site, which is located at the interface of domains 1 and 3.	143
259959	cd13892	CuRO_3_PHS	The third Cupredoxin domain of phenoxazinone synthase (PHS). Phenoxazinone synthase (PHS, 2-aminophenol:oxygen oxidoreductase) catalyzes the oxidative coupling of substituted o-aminophenols to produce phenoxazinones. PHS has been shown to participate in diverse biological functions such as spore pigmentation and biosynthesis of the antibiotic grixazone. PHS is a member of the laccase-like multicopper oxidase (MCO) family, whose members are able to couple oxidation of substrates with reduction of dioxygen to water. MCOs are capable of oxidizing a vast range of substrates, varying from aromatic compounds to inorganic compounds such as metals. Although the members of this family have diverse functions, the majority of them have three cupredoxin domain repeats. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part of the trinuclear copper binding site, which is located at the interface of domains 1 and 3.	184
259960	cd13893	CuRO_3_AAO	The third cupredoxin domain of plant Ascorbate oxidase. Ascorbate oxidase catalyzes the oxidation of ascorbic acid to dehydroascorbic acid. This multicopper oxidase (MCO) is found in cucurbitaceous plants such as pumpkin, cucumber, and melon. It can detect levels of ascorbic acid and eliminate it. The biological function of ascorbate oxidase is still not clear. Ascorbate oxidase belongs to MCO family which couple oxidation of substrates with reduction of dioxygen to water. MCOs are capable of oxidizing a vast range of substrates, varying from aromatic compounds to inorganic compounds such as metals. Although the members of this family have diverse functions, majority of them have three cupredoxin domain repeats. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part the trinuclear copper binding site, which is located at the interface of domains 1 and 3.	155
259961	cd13894	CuRO_3_AAO_like_1	The third cupredoxin domain of plant Ascorbate oxidase homologs. This subfamily is composed of plant pollen multicopper oxidase homologous to ascorbate oxidase. Ascorbate oxidase catalyzes the oxidation of ascorbic acid to dehydroascorbic acid. This multicopper oxidase (MCO) is found in cucurbitaceous plants such as pumpkin, cucumber, and melon. It can detect levels of ascorbic acid and eliminate it. The biological function of ascorbate oxidase is still not clear. Ascorbate oxidase belongs to MCO family which couple oxidation of substrates with reduction of dioxygen to water. MCOs are capable of oxidizing a vast range of substrates, varying from aromatic compounds to inorganic compounds such as metals. Although the members of this family have diverse functions, majority of them have three cupredoxin domain repeats. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part the trinuclear copper binding site, which is located at the interface of domains 1 and 3. This subfamily does not harbor T1 copper or trinuclear copper binding sites.	123
259962	cd13895	CuRO_3_AAO_like_2	The third cupredoxin domain of Ascorbate oxidase homologs. This family includes fungal proteins with similarity to ascorbate oxidase. Ascorbate oxidase catalyzes the oxidation of ascorbic acid to dehydroascorbic acid. It can detect levels of ascorbic acid and eliminate it. The biological function of ascorbate oxidase is still not clear. Ascorbate oxidase belongs to multicopper oxidase (MCO) family which couple oxidation of substrates with reduction of dioxygen to water. MCOs are capable of oxidizing a vast range of substrates, varying from aromatic compounds to inorganic compounds such as metals. Although the members of this family have diverse functions, majority of them have three cupredoxin domain repeats. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part the trinuclear copper binding site, which is located at the interface of domains 1 and 3.	188
259963	cd13896	CuRO_3_CopA	The third cupredoxin domain of CopA copper resistance protein family. CopA is a multicopper oxidase (MCO) related to laccase and L-ascorbate oxidase, both copper-containing enzymes. It is part of the copper-regulatory cue operon, which employs a cytosolic metalloregulatory protein CueR that induces expression of CopA and CueO under copper stress conditions. CopA is a copper efflux P-type ATPase that is located in the inner cell membrane and is involved in copper resistance in bacteria. CopA mutant causes a loss of function including copper tolerance and oxidase activity and copA transcription is inducible in the presence of copper. Although MCOs have diverse functions, majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part the trinuclear copper binding site, which is located at the interface of domains 1 and 3.	115
259964	cd13897	CuRO_3_LCC_plant	The third cupredoxin domain of the plant laccases. Laccase is a blue multicopper oxidase (MCO) which catalyzes the oxidation of a variety aromatic - notably phenolic and inorganic substances coupled to the reduction of molecular oxygen to water. Laccase has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism. Plants usually express multiple laccase genes, but their precise physiological/biochemical roles remain largely unclear. MCOs are capable of oxidizing a vast range of substrates, varying from aromatic compounds to inorganic compounds such as metals. Although the members of this family have diverse functions, majority of them have three cupredoxin domain repeats. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part the trinuclear copper binding site, which is located at the interface of domains 1 and 3.	139
259965	cd13898	CuRO_3_Abr2_like	The third cupredoxin domain of a group of fungal Laccases similar to Abr2 from Aspergillus fumigatus. Abr2 is involved in conidial pigment biosynthesis in Aspergillus fumigatus. Laccase is a blue multi-copper enzyme that catalyzes the oxidation of a variety aromatic - notably phenolic and inorganic substances coupled to the reduction of molecular oxygen to water. Laccase has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism in fungi and plants. Like other related multicopper oxidases (MCOs), laccase is composed of three cupredoxin domains that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part the trinuclear copper binding site, which is located at the interface of domains 1 and 3.	164
259966	cd13899	CuRO_3_Fet3p	The third Cupredoxin domain of multicopper oxidase Fet3p. Fet3p catalyzes the ferroxidase reaction, which couples the oxidation of Fe(II) to Fe(III) with the four-electron reduction of molecular oxygen to water. Fet3p is a type I membrane protein with the amino-terminal oxidase domain in the extracellular space and the carboxyl terminus in the cytoplasm. The periplasmic produced Fe(III) is transferred to the permease Ftr1p for import into the cytosol. The four copper ions are inserted post-translationally and are essential for catalytic activity, thus linking copper and iron homeostasis. Like other related multicopper oxidases (MCOs), Fet3p is composed of three cupredoxin domains that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part the trinuclear copper binding site, which is located at the interface of domains 1 and 3.	160
259967	cd13900	CuRO_3_Tth-MCO_like	The third cupredoxin domain of the bacterial laccases similar to Tth-MCO from Thermus thermophilus. The subfamily of bacterial laccases includes Tth-MCO and similar proteins. Tth-MCO is a hyperthermophilic multicopper oxidase (MCO) from Thermus thermophilus HB27. Laccase is a blue multi-copper enzyme that catalyzes the oxidation of a variety aromatic - notably phenolic and inorganic substances coupled to the reduction of molecular oxygen to water. It has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism in fungi and plants. Although MCOs have diverse functions, majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part the trinuclear copper binding site, which is located at the interface of domains 1 and 3.	123
259968	cd13901	CuRO_3_MaLCC_like	The third cupredoxin domain of the fungal laccases similar to Ma-LCC  from Melanocarpus albomyces. The subfamily of fungal laccases includes Ma-LCC and similar proteins. Ma-LCC is a multicopper oxidase (MCO) from Melanocarpus albomyces. Its crystal structure contains all four coppers at the mono- and trinuclear copper centers.  Laccase is a blue multi-copper enzyme that catalyzes the oxidation of a variety aromatic - notably phenolic and inorganic substances coupled to the reduction of molecular oxygen to water. It has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism in fungi and plants. Although MCOs have diverse functions, majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part the trinuclear copper binding site, which is located at the interface of domains 1 and 3.	157
259969	cd13902	CuRO_3_McoC_like	The third cupredoxin domain of a multicopper oxidase McoC and similar proteins. This family includes bacterial multicopper oxidases (MCOs) represented by McoC from pathogenic bacterium Campylobacter jejuni. McoC is a periplasmic multicopper oxidase, which has been characterized to be associated with copper homeostasis. McoC may also function to protect against oxidative stress as it may convert metallic ions into their less toxic form. MCOs are multi-domain enzymes that are able to couple oxidation of substrates with reduction of dioxygen to water. They are capable of oxidizing a vast range of substrates, varying from aromatic compounds to inorganic compounds such as metals. Most MCOs have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part the trinuclear copper binding site, which is located at the interface of domains 1 and 3.	125
259970	cd13903	CuRO_3_Tv-LCC_like	The third cupredoxin domain of the fungal laccases similar to Tv-LCC from Trametes versicolor. This subfamily of fungal laccases includes Tv-LCC from Trametes versicolor and Rs-LCC2 from plant pathogenic fungus Rhizoctonia solani. Laccase is a blue multi-copper enzyme that catalyzes the oxidation of a variety aromatic - notably phenolic and inorganic substances coupled to the reduction of molecular oxygen to water. It has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism. Although MCOs have diverse functions, majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part the trinuclear copper binding site, which is located at the interface of domains 1 and 3.	147
259971	cd13904	CuRO_3_Diphenol_Ox	The third cupredoxin domain of fungal laccase, diphenol oxidase. Diphenol oxidase belongs to the laccase family. It catalyzes the initial steps in melanin biosynthesis from diphenols. Melanin is one of the virulence factors of infectious fungi. In the pathogenesis of C. neoformans, melanin pigments have been shown to protect the fungal cells from oxidative and microbicidal activities of host defense systems. Laccase is a blue multicopper oxidase (MCO) which catalyzes the oxidation of a variety aromatic - notably phenolic and inorganic substances coupled to the reduction of molecular oxygen to water. It has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism. Although MCOs have diverse functions, majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part the trinuclear copper binding site, which is located at the interface of domains 1 and 3.	158
259972	cd13905	CuRO_3_tcLLC2_insect_like	The third cupredoxin domain of the insect laccases similar to laccase 2 in Tribolium castaneum. This multicopper oxidase (MCO) family includes the majority of insect laccases. One member of the family is laccase 2 from Tribolium castaneum. Laccase 2 is required for beetle cuticle tanning. Laccase (polyphenol oxidase EC 1.10.3.2) is a blue multi-copper enzyme that catalyzes the oxidation of a variety of organic substrates coupled to the reduction of molecular oxygen to water. It displays broad substrate specificity, catalyzing the oxidation of a wide variety of aromatic - notably phenolic and inorganic substances. Laccase has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism in fungi, plants and insects. Although MCOs have diverse functions, majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part the trinuclear copper binding site, which is located at the interface of domains 1 and 3.	174
259973	cd13906	CuRO_3_CumA_like	The third cupredoxin domain of CumA like multicopper oxidase. This multicopper oxidase (MCO) subfamily includes CumA from Pseudomonas putida, which is involved in the oxidation of Mn(II). However, the cumA gene has been identified in a variety of bacterial species, including both Mn(II)-oxidizing and non-Mn(II)-oxidizing strains. Thus, the proteins in this family may catalyze the oxidation of other substrates. MCO catalyzes the oxidation of a variety aromatic - notably phenolic and inorganic substances coupled to the reduction of molecular oxygen to water and  has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism. Although MCOs have diverse functions, majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper center. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part the trinuclear copper binding site, which is located at the interface of domains 1 and 3.	138
259974	cd13907	CuRO_3_MCO_like_1	The third cupredoxin domain of uncharacterized multicopper oxidase. Multicopper Oxidases (MCOs) are multi-domain enzymes that are able to couple oxidation of substrates with reduction of dioxygen to water. MCOs oxidize their substrate by accepting electrons at a mononuclear copper centre and transferring them to a trinuclear copper centre which binds a dioxygen. The dioxygen, following the transfer of four electrons, is reduced to two molecules of water. These MCOs are capable of oxidizing a vast range of substrates, varying from aromatic to inorganic compounds such as metals. This subfamily of MCOs is composed of three cupredoxin domains. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part the trinuclear copper binding site, which is located at the interface of domains 1 and 3.	154
259975	cd13908	CuRO_3_MCO_like_2	The third cupredoxin domain of uncharacterized multicopper oxidase. Multicopper Oxidases (MCOs) are multi-domain enzymes that are able to couple oxidation of substrates with reduction of dioxygen to water. MCOs oxidize their substrate by accepting electrons at a mononuclear copper centre and transferring them to a trinuclear copper centre which binds a dioxygen. The dioxygen, following the transfer of four electrons, is reduced to two molecules of water. These MCOs are capable of oxidizing a vast range of substrates, varying from aromatic to inorganic compounds such as metals. This subfamily of MCOs is composed of three cupredoxin domains. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part the trinuclear copper binding site, which is located at the interface of domains 1 and 3.	122
259976	cd13909	CuRO_3_MCO_like_3	The third cupredoxin domain of uncharacterized multicopper oxidase. Multicopper Oxidases (MCOs) are multi-domain enzymes that are able to couple oxidation of substrates with reduction of dioxygen to water. MCOs oxidize their substrate by accepting electrons at a mononuclear copper centre and transferring them to a trinuclear copper centre which binds a dioxygen. The dioxygen, following the transfer of four electrons, is reduced to two molecules of water. These MCOs are capable of oxidizing a vast range of substrates, varying from aromatic to inorganic compounds such as metals. This subfamily of MCOs is composed of three cupredoxin domains. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part the trinuclear copper binding site, which is located at the interface of domains 1 and 3.	137
259977	cd13910	CuRO_3_MCO_like_4	The third cupredoxin domain of uncharacterized multicopper oxidase. Multicopper Oxidases (MCOs) are multi-domain enzymes that are able to couple oxidation of substrates with reduction of dioxygen to water. MCOs oxidize their substrate by accepting electrons at a mononuclear copper centre and transferring them to a trinuclear copper centre which binds a dioxygen. The dioxygen, following the transfer of four electrons, is reduced to two molecules of water. These MCOs are capable of oxidizing a vast range of substrates, varying from aromatic to inorganic compounds such as metals. This subfamily of MCOs is composed of three cupredoxin domains. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part the trinuclear copper binding site, which is located at the interface of domains 1 and 3.	166
259978	cd13911	CuRO_3_MCO_like_5	The third cupredoxin domain of uncharacterized multicopper oxidase. Multicopper Oxidases (MCOs) are multi-domain enzymes that are able to couple oxidation of substrates with reduction of dioxygen to water. MCOs oxidize their substrate by accepting electrons at a mononuclear copper centre and transferring them to a trinuclear copper centre which binds a dioxygen. The dioxygen, following the transfer of four electrons, is reduced to two molecules of water. These MCOs are capable of oxidizing a vast range of substrates, varying from aromatic to inorganic compounds such as metals. This subfamily of MCOs is composed of three cupredoxin domains. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part the trinuclear copper binding site, which is located at the interface of domains 1 and 3.	119
259979	cd13912	CcO_II_C	C-terminal domain of Cytochrome c Oxidase subunit II. Cytochrome c Oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes.  It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Only subunits I and II are essential for function. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Subunit II contains a copper-copper binuclear site called CuA, which is believed to be involved in electron transfer from cytochrome c to the binuclear center (active site) in subunit I.	130
259980	cd13913	ba3_CcO_II_C	C-terminal cupredoxin domain of Ba3-like heme-copper oxidase subunit II. The ba3 family of heme-copper oxidases are transmembrane protein complexes in the respiratory chains of prokaryotes and some archaea, which catalyze the reduction of O2 and simultaneously pump protons across the membrane. It has been proposed that archaea acquired heme-copper oxidases through gene transfer from gram-positive bacteria. The ba3 family contains oxidases that lack the conserved residues that form the D- and K-pathways in CcO and ubiquinol oxidase. Instead, they contain a potential alternative K-pathway. Additional proton channels have been proposed for this family of oxidases but none have been identified definitively.	99
259981	cd13914	CuRO_HCO_II_like_3	Uncharacterized subfamily with similarity to Heme-copper oxidase subunit II cupredoxin domain. Heme-copper oxidases are transmembrane protein complexes in the respiratory chains of prokaryotes and mitochondria which catalyze the reduction of O2 and simultaneously pump protons across the membrane. The superfamily is diverse in terms of electron donors, subunit composition, and heme types. The number of subunits varies from two to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian cytochrome c oxidase (CcO) are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. It has been proposed that archaea acquired heme-copper oxidases through gene transfer from gram-positive bacteria. Subunit II is found in CcO, ubiquinol oxidase, and the ba3-like oxidases, while the cbb3 oxidases contain alternative additional subunits.  Additionally, nitrous oxide reductase contains the globular portion of subunit II as a domain within its structure. In some families, subunit II contains a copper-copper binuclear center that is involved in the transfer of electrons from the substrate to the binuclear center (active site) in subunit I.	108
259982	cd13915	CuRO_HCO_II_like_2	Uncharacterized subfamily with similarity to Heme-copper oxidase subunit II cupredoxin domain. Heme-copper oxidases are transmembrane protein complexes in the respiratory chains of prokaryotes and mitochondria which catalyze the reduction of O2 and simultaneously pump protons across the membrane. The superfamily is diverse in terms of electron donors, subunit composition, and heme types. The number of subunits varies from two to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian cytochrome c oxidase (CcO) are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. It has been proposed that archaea acquired heme-copper oxidases through gene transfer from gram-positive bacteria. Subunit II is found in CcO, ubiquinol oxidase, and the ba3-like oxidases, while the cbb3 oxidases contain alternative additional subunits.  Additionally, nitrous oxide reductase contains the globular portion of subunit II as a domain within its structure. In some families, subunit II contains a copper-copper binuclear center that is involved in the transfer of electrons from the substrate to the binuclear center (active site) in subunit I.	98
259983	cd13916	CuRO_HCO_II_like_1	Uncharacterized subfamily with similarity to Heme-copper oxidase subunit II cupredoxin domain. Heme-copper oxidases are transmembrane protein complexes in the respiratory chains of prokaryotes and mitochondria which catalyze the reduction of O2 and simultaneously pump protons across the membrane. The superfamily is diverse in terms of electron donors, subunit composition, and heme types. The number of subunits varies from two to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian cytochrome c oxidase (CcO) are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. It has been proposed that archaea acquired heme-copper oxidases through gene transfer from gram-positive bacteria. Subunit II is found in CcO, ubiquinol oxidase, and the ba3-like oxidases, while the cbb3 oxidases contain alternative additional subunits.  Additionally, nitrous oxide reductase contains the globular portion of subunit II as a domain within its structure. In some families, subunit II contains a copper-copper binuclear center that is involved in the transfer of electrons from the substrate to the binuclear center (active site) in subunit I.	93
259984	cd13917	CuRO_HCO_II_like_4	Uncharacterized subfamily with similarity to Heme-copper oxidase subunit II cupredoxin domain. Heme-copper oxidases are transmembrane protein complexes in the respiratory chains of prokaryotes and mitochondria which catalyze the reduction of O2 and simultaneously pump protons across the membrane. The superfamily is diverse in terms of electron donors, subunit composition, and heme types. The number of subunits varies from two to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian cytochrome c oxidase (CcO) are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. It has been proposed that archaea acquired heme-copper oxidases through gene transfer from gram-positive bacteria. Subunit II is found in CcO, ubiquinol oxidase, and the ba3-like oxidases, while the cbb3 oxidases contain alternative additional subunits.  Additionally, nitrous oxide reductase contains the globular portion of subunit II as a domain within its structure. In some families, subunit II contains a copper-copper binuclear center that is involved in the transfer of electrons from the substrate to the binuclear center (active site) in subunit I.	88
259985	cd13918	CuRO_HCO_II_like_6	Uncharacterized subfamily with similarity to Heme-copper oxidase subunit II cupredoxin domain. Heme-copper oxidases are transmembrane protein complexes in the respiratory chains of prokaryotes and mitochondria which catalyze the reduction of O2 and simultaneously pump protons across the membrane. The superfamily is diverse in terms of electron donors, subunit composition, and heme types. The number of subunits varies from two to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian cytochrome c oxidase (CcO) are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. It has been proposed that archaea acquired heme-copper oxidases through gene transfer from gram-positive bacteria. Subunit II is found in CcO, ubiquinol oxidase, and the ba3-like oxidases, while the cbb3 oxidases contain alternative additional subunits.  Additionally, nitrous oxide reductase contains the globular portion of subunit II as a domain within its structure. In some families, subunit II contains a copper-copper binuclear center that is involved in the transfer of electrons from the substrate to the binuclear center (active site) in subunit I.	139
259986	cd13919	CuRO_HCO_II_like_5	Uncharacterized subfamily with similarity to Heme-copper oxidase subunit II cupredoxin domain. Heme-copper oxidases are transmembrane protein complexes in the respiratory chains of prokaryotes and mitochondria which catalyze the reduction of O2 and simultaneously pump protons across the membrane. The superfamily is diverse in terms of electron donors, subunit composition, and heme types. The number of subunits varies from two to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian cytochrome c oxidase (CcO) are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. It has been proposed that archaea acquired heme-copper oxidases through gene transfer from gram-positive bacteria. Subunit II is found in CcO, ubiquinol oxidase, and the ba3-like oxidases, while the cbb3 oxidases contain alternative additional subunits.  Additionally, nitrous oxide reductase contains the globular portion of subunit II as a domain within its structure. In some families, subunit II contains a copper-copper binuclear center that is involved in the transfer of electrons from the substrate to the binuclear center (active site) in subunit I.	107
259987	cd13920	Stellacyanin	Stellacyanin is a subclass of phytocyanins, a plant type I copper protein. Stellacyanin is a subclass of the phytocyanins, a ubiquitous family of plant cupredoxins. Stellacyanin is involved in electron transfer reactions with the Cu center transitioning between the oxidized Cu(II) form and the reduced Cu(I) form. The copper is tetrahedrally coordinated by a cysteine, 2 histidines, and a glutamine residue. The glutamine residue substitutes for a methionine ligand typically found in other blue copper proteins. The exact function of stellacyanin is unknown. However, stellacyanin appears to be associated with the plant cell wall; it may be involved in oxidative reactions to build polymeric material making up the cell wall.	101
259988	cd13921	Amicyanin	Amicyanin is a type I blue copper protein that plays an essential role in electron transfer. In Paracoccus denitrificans bacteria, amicyanin acts as an intermediary of a three-member redox complex along with methylamine dehydrogenase (MADH) and cytochrome c-551i. The electron is transferred from the active site of MADH via the amicyanin copper ion to the cytochrome heme iron. The electron transfer from MADH to cytochrome c-551i does not involve a ternary complex but occurs via a ping-pong mechanism in which amicyanin uses the same interface for the reactions with MADH and cytochrome c-551i.	81
259989	cd13922	Azurin	Azurin is a redox partner for enzymes such as nitrite reductase or arsenite oxidase. Azurin is a bacterial blue copper-binding protein. It serves as a redox partner to enzymes such as nitrite reductase or arsenite oxidase. The copper of Azurin is tetrahedrally coordinated by a cysteine, 2 histidines, and a methionine residue. The electron transfer reactions are carried out with the Cu center transitioning between the oxidized Cu(II) form and the reduced Cu(I) form. Azurin can function as a tumor suppressor; it forms a complex with p53 that triggers apoptosis in various human cancer cells.	125
381607	cd13925	RPF	core lysozyme-like domain of resuscitation-promoting factor proteins. Resuscitation-promoting factor (RPF) proteins, found in various (G+C)-rich Gram-positive bacteria, act to reactivate cultures from stationary phase. This protein shares elements of the structural core of lysozyme and related proteins. Furthermore, it shares a conserved active site glutamate which is required for activity, and has a polysaccharide binding cleft that corresponds to the peptidoglycan binding cleft of lysozyme. Muralytic activity of Rpf in Micrococcus luteus correlates with resuscitation, supporting a mechanism dependent on cleavage of peptidoglycan by RPF.	71
381608	cd13926	N-acetylmuramidase_GH108	N-acetylmuramidase domain of the glycosyl hydrolase 108 family. This domain acts as a lysozyme (N-acetylmuramidase), EC:3.2.1.17. It contains a conserved EGGY motif near the N-terminus; the glutamic acid within this motif is essential for catalytic activity. In bacteria, it may activate the secretion of large proteins via the breaking and rearrangement of the peptidoglycan layer during secretion. It is frequently found at the N-terminus of proteins containing a peptidoglycan binding domain.	91
260106	cd13929	PT-DMATS_CymD	aromatic prenyltransferases (PTases) of the DMATS/CymD family. Members of the DMATS/CymD family of ABBA prenyltransferases prenylate indole, tyrosine, and xanthone derivatives. This family of fungal proteins includes cyclic dipeptide N-prenyltransferase (CdpNPT), Brevianamide F prenyltransferase (ftmPT1), fumigaclavine C synthase (FgaPT1), dimethylallyltryptophan synthase (DMATS) and related proteins. CdpNPT accepts a variety of tryptophan-containing cyclic dipeptides, including L-tryptophan itself, and prenylates these substrates inverse at the N-1 position of the indole group. FtmPT1 catalyzes the prenylation of brevianamide F in the biosynthesis of fumitremorgin-type alkaloids. FgaPT1 catalyses the prenylation of fumigaclavine A. Dimethylallyltryptophan synthases (DMATS) catalyze the prenylation of L-tryptophan at C-4 of the indole ring during the biosynthesis of ergot alkaloids.	392
260107	cd13930	PT-Tnase	Aromatic Prenyltransferases (PTases) associated with tryptophanase. This group of bacterial and fungal proteins shows homology to the DMATS/CymD family of ABBA prenyltransferases, which prenylates indole, tyrosine, and xanthone derivatives. Some of the members, mostly fungal proteins, are associated with tryptophanase-like domains (Tnase) which catalyze the degradation of L-tryptophan to yield indole, pyruvate and ammonia, or the degradation of L-tyrosine to yield phenol, pyruvate and ammonia. This suggests that these otherwise uncharacterized proteins may exhibit multiple functions.	348
260108	cd13931	PT-CloQ_NphB	Aromatic Prenyltransferases (PTases) of the CloQ/NphB family. Members of the CloQ/NphB family of ABBA prenyltransferases catalyze the prenylation of phenols, naphthalenes, and phenazines. This family of fungal and bacterial proteins includes dihydrophenazine-1-carboxylate dimethylallyltransferase PpzP, the aromatic prenyltransferase from the clorobiocin biosynthetic pathway CloQ, and related proteins. CloQ catalyzes the attachment of a dimethylallyl moiety to 4-hydroxyphenylpyruvate, part of the biosynthetic pathway of the Streptomyces roseochromogenes antibiotic clorobiocin. PpzP, as well as EpzP, are important for the biosynthesis of endophenazines; they catalyze the prenylation of 5,10-dihydrophenazine-1-carboxylic acid (dhPCA). Streptomyces NphB catalyzes the addition of a 10-carbon geranyl group to  small organic aromatic substrates and is involved in the biosynthesis of the antioxidant naphterpin. Prenyltransferases (PTs) catalyze the regioselective transfer of prenyl moieties onto aromatic substrates in biosynthetic pathways of microbial secondary metabolites.	274
259826	cd13932	HN_RTEL1	harmonin_N_like domain of regulator of telomere elongation helicase 1 (also known as RTEL). Mouse Rtel is an essential protein required for the maintenance of both telomeric and genomic stability. RTEL1 appears to maintain genome stability by suppressing homologous recombination (HR). In vitro, purified human and insect RTEL1 have been shown to promote the disassembly of D loop recombination intermediates, in a reaction dependent upon ATP hydrolysis. Human RTEL1 is implicated in the etiology of dyskeratosis congenita (DC), an inherited bone marrow failure and cancer predisposition syndrome. Point mutations in its helicase domains, and truncations which result in loss of its C-terminus have been discovered in DC families. RTEL1 is also a candidate gene influencing glioma susceptibility. The C-terminal domain of RTEL1, represented here, appears similar to the N-terminal domain of the scaffolding protein harmonin.	99
259827	cd13933	harmonin_N_like_u1	domain similar to the N-terminal protein-binding module of harmonin; uncharacterized subgroup. This domain is a putative protein-binding module based on its sequence similarity to the N-terminal domain of harmonin. Harmonin (not belonging to this group) is a postsynaptic density-95/discs-large/ZO-1 (PDZ) domain-containing scaffold protein, which organizes the Usher protein network of the inner ear and the retina. This domain is also related to domains found in several other scaffold proteins which organize supramolecular complexes.	78
260014	cd13934	RNase_H_Dikarya_like	Fungal (dikarya) Ribonuclease H, uncharacterized. This family contains dikarya RNase H, many of which are uncharacterized. Ribonuclease H (RNase H) enzymes are divided into two major families, Type 1 and Type 2, based on amino acid sequence similarities and biochemical properties. RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner in the presence of divalent cations. It is widely present in various organisms, including bacteria, archaea and eukaryotes. Most prokaryotic and eukaryotic genomes contain multiple RNase H genes. Despite the lack of amino acid sequence homology, type 1 and type 2 RNase H share a main-chain fold and steric configurations of the four acidic active-site residues and have the same catalytic mechanism and functions in cells. RNase H is involved in DNA replication, repair and transcription. An important RNase H function is to remove Okazaki fragments during DNA replication.	153
260015	cd13935	RNase_H_bacteria_like	RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner. This family includes bacterial ribonuclease H (RNase H) enzymes. RNases are divided into two major families, Type 1 and Type 2, based on amino acid sequence similarities and biochemical properties. RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner in the presence of divalent cations. RNase H is widely present in various organisms, including bacteria, archaea and eukaryotes. Most prokaryotic and eukaryotic genomes contain multiple RNase H genes. Despite the lack of amino acid sequence homology, type 1 and type 2 RNase H share a main-chain fold and steric configurations of the four acidic active-site residues and have the same catalytic mechanism and functions in cells. RNase H is involved in DNA replication, repair and transcription. One of the important functions of RNase H is to remove Okazaki fragments during DNA replication. RNase H inhibitors have been explored as an anti-HIV drug target because RNase H inactivation inhibits reverse transcription.	133
260110	cd13936	PANDER_like	Domains similar to the Pancreatic-derived factor. FAM3B or PANDER (PANcreatic DERived factor) has been identified as a regulator of glucose homeostasis and beta cell function. The protein is expressed in the endocrine pancreas and co-secreted with insulin in response to glucose, particularly under conditions of insulin resistance. The protein had initially been predicted to be a member of the four-helical cytokine family, hence the FAM3B designation. This wider family contains FAM3B and FAM3C, N-terminal domains of N-acetylglucosaminyltransferases, and domains in poorly characterized proteins that have been associated with deafness and the progression of cancer.	149
260111	cd13937	PANDER_GnT-1_2_like	PANDER-like domain of N-acetylglucosaminyltransferases. O-linked-mannose beta-1,2-N-acetylglucosaminyltransferase 1 participates in O-mannosyl glycosylation and may be responsible for creating GlcNAc(beta1-2)Man(alpha1-)O-Ser/Thr moieties on alpha dystroglycan and other O-mannosylated proteins. The domain characterized by this model lies N-terminal to the catalytic domain. Its function has not been determined.	148
260112	cd13938	PANDER_like_TMEM2	PANDER-like domain of the transmembrane protein TMEM2. TMEM2 has been characterized as a transmembrane protein that maps to the DFNB7-DFNB11 deafness locus on human chromosome 9. It contains a domain similar to the Pancreatic-derived factor PANDER, C-terminal to a glycine rich G8-domain. The function of the PANDER-like domain in TMEM2 has not been characterized.	168
260113	cd13939	PANDER_FAM3B	Pancreatic derived factor. FAM3B or PANDER (PANcreatic DERived factor) has been identified as a regulator of glucose homeostasis and beta cell function. The protein is expressed in the endocrine pancreas and co-secreted with insulin in response to glucose, particularly under conditions of insulin resistance. The protein had initially been predicted to be a member of the four-helical cytokine family, hence the FAM3B designation. PANDER induces apoptosis of insulin-secreting beta-cells when over-expressed in vitro. It has been associated with the progression of type 2 diabetes by downregulating beta cell function as well as insulin sensitivity in the liver.	175
260114	cd13940	ILEI_FAM3C	Interleukin-like EMT inducer. The secreted factor FAM3C or ILEI (InterLeukin-like Emt Inducer) has been identified as a protein involved in the epithelial-mesenchymal transition (EMT) and in processes associated with metastasis formation and the progression of cancer. The protein had initially been predicted to be a member of the four-helical cytokine family, hence the FAM3C designation. ILEI has been found to be widely expressed, and to be involved in retinal development.	171
260115	cd13941	PANDER_like_KIAA1199	PANDER-like domain of KIAA1199 and similar proteins. KIAA1199 has been characterized as a protein associated with poor survival when upregulated in human cancer, as well as with nonsyndromic loss of hearing when mutated. It contains a C-terminal domain similar to the Pancreatic-derived factor PANDER; the function of this PANDER-like domain has not been characterized.	157
260116	cd13944	lytB_ispH	4-hydroxy-3-methylbut-2-enyl diphosphate reductase. The 4-hydroxy-3-methylbut-2-enyl diphosphate (HMBPP) reductase (called lytB or ispH) is the terminal enzyme of the mevalonate-independent 2-C-methyl-D-erythritol 4-phosphate (MEP) pathway, one of the two metabolic routes for isoprenoid biosynthesis. The MEP pathway is essential in many eubacteria, plants, and the malaria parasite. LytB converts HMBPP into isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP).	275
260117	cd13945	Chs5_N	N-terminal dimerization domain of Chs5 and similar proteins. Chs5/6 is a multi-protein complex conserved in fungi that interacts with chitin synthase III (Chs3p) and is involved in its transport to the cell surface from the trans-Golgi network, functioning as an exomer cargo adapter. Chs5p appears to form a complex with Chs6p and its paralogs Bch1p, Bud7p, and Bch2p. In this complex, Chs5p may act as a central scaffold. The N-terminal domain characterized by this model forms a homodimer and has been shown to interact with Chs6p and Bch1p. It may function as a flexible hinge domain that allows the exomer to interact with both proteins and the Golgi membrane as the latter undergoes changes in curvature during the formation of transport vesicles. The dimerization domain sits N-terminally to a conserved FBE (FN3-BRCT) unit, which binds Arf1 and is involved in the recruitment of the exomer to the membrane.	73
260118	cd13946	LysW	Lysine biosynthesis protein LysW. LysW functions as a carrier protein in the biosynthesis pathway of lysine. The C-terminal glutamate sidechain of LysW attaches to the amino group of alpha-aminoadipate (AAA); this peptide bond formation is catalyzed by the ligase LysX. AAA remains associated with LysW throughout its biosynthetic conversion to lysine. LysW also acts to protect the amino group of glutamate in arginine biosynthesis.	54
320087	cd13949	7tm_V1R_pheromone	vomeronasal organ pheromone receptor type-1 family, member of the seven-transmembrane G protein-coupled receptor superfamily. This family represents vomeronasal type-1 receptors (V1Rs) that are specifically expressed in the vomeronasal organ (VNO), which is the sensory organ of the accessory olfactory system present in amphibians, reptiles, and non-primate mammals such as mice and rodents, but it is non-functional or absent in humans, apes and monkeys. The VNO detects pheromones, chemicals released from animals that can influence social and reproductive behaviors, such as male-male aggression or sexual mating, in other members of the same species. On the other hand, the olfactory epithelium, which contains olfactory receptor neurons inside the nasal cavity, is responsible for detecting odor molecules (smells). There are two types of vertebrate pheromones: (1) small volatile molecules such as 2-heptanone, a substance in the urine of both male and female that extends estrous cycle length in female mice; and (2) water-soluble molecules such as the major histocompatibility complex (MHC) class-I peptide, which can induce the pregnancy block effect, the tendency for female rodents to abort their pregnancies upon exposure to the scent of an unknown male. While V1Rs and G-alpha(i2) protein are co-expressed in the apical neurons of the VNO, V2Rs (type-2 vomeronasal receptors) and G-alpha(o) protein are coexpressed in the basal layer of the VNO. Activation of V1R or V2R causes stimulation of phospholipase pathway, generating diacylglycerol (DAG) and inositol 1,4,5-triphosphate (IP3). V1Rs have a short N-terminal extracellular domain, whereas V2Rs contain a long N-terminal extracellular domain, which is believed to bind pheromones. Although V1Rs share the seven-transmembrane domain structure with V2Rs and olfactory receptors, they share little sequence similarity with each other.	295
320088	cd13950	7tm_TAS2R	mammalian taste receptors type 2, member of the seven-transmembrane G protein-coupled receptor superfamily. This group represents a family of mammalian taste receptors (TAS2Rs), which function as bitter taste receptors. The human TAS2R family contains about 25 functional members, which are glycoproteins and have the ability to form both homomeric and heteromeric receptor complexes. Five basic tastes are perceived by animals: bitter, sweet, sour, salty, and umami (the taste of glutamate, MSG). Among these, sour and salty are mediated by ion channels, while the perception of umami and sweet tastes is mediated by the TAS1R taste receptors, which belong to the class C GPCR family. The TAS2Rs in humans have a short extracellular N-terminus and the ligand binds within the transmembrane domain, whereas the TAS1Rs have a large N-terminal extracellular domain composed of the Venus flytrap module that forms the orthosteric (primary) ligand binding site. Signal transduction of bitter taste involves binding of bitter compounds to TAS2Rs linked to the alpha-subunit of gustducin, a heterotrimeric G protein expressed in taste receptor cells. This G-alpha subunit stimulates phosphodiesterase and decreases cAMP and cGMP levels. Further steps in the signaling cascade are still unknown. The beta-gamma-subunit of gustducin also mediates bitter taste transduction by activating phospholipase C, which leads to an increased formation of IP3 (inositol triphosphate) and DAG (diacylglycerol), thereby causing release of Ca2+ from intracellular stores and enhanced neurotransmitter release.	288
320089	cd13951	7tmF_Frizzled_SMO	class F frizzled/smoothened family, member of the 7-transmembrane G protein-coupled receptor superfamily. The class F G protein-coupled receptors include the frizzled (FZD) family of seven-transmembrane proteins consisting of 10 isoforms (FZD1-10) in mammals. The FZDs are activated by the wingless/int-1 (WNT) family of secreted lipoglycoproteins and preferentially couple to stimulatory G proteins of the Gs family, which activate adenylate cyclase, but can also couple to G proteins of the Gi/Gq families. In the WNT/beta-catenin signaling pathway, the WNT ligand binds to FZD and a lipoprotein receptor-related protein (LRP) co-receptor. This leads to the stabilization and translocation of beta-catenin to the nucleus, where it induces the activation of TCF/LEF family transcription factors. The conserved cytoplasmic motif of FZD, Lys-Thr-X-X-X-Trp, is required for activation of the WNT/beta-catenin pathway, and for membrane localization and phosphorylation of Dsh (dishevelled) protein, a key component of the WNT pathway that relays the WNT signals from the activated receptor to downstream effector proteins. Also included in the class F family is the closely related smoothened (SMO), which is a transmembrane G protein-coupled receptor that acts as the transducer of the hedgehog (HH) signaling pathway. SMO is activated by the hedgehog (HH) family of proteins acting on the 12-transmembrane domain receptor patched (PTCH), which constitutively inhibits SMO. Thus, in the absence of HH proteins, PTCH inhibits SMO signaling. On the other hand, binding of HH to the PTCH receptor activates its internalization and degradation, thereby releasing the PTCH inhibition of SMO. This allows SMO to trigger intracellular signaling and the subsequent activation of the Gli family of zinc finger transcriptional factors and induction of HH target gene expression (PTCH, Gli1, cyclin, Bcl-2, etc). The WNT and HH signaling pathways play critical roles in many developmental processes, such as cell-fate determination, cell proliferation, neural patterning, stem cell renewal, tissue homeostasis and repair, and tumorigenesis, among many others.	314
410627	cd13952	7tm_classB	class B family of seven-transmembrane G protein-coupled receptors. The class B of seven-transmembrane GPCRs is classified into three major subfamilies: subfamily B1 (secretin-like receptor family), B2 (adhesion family), and B3 (Methuselah-like family). The class B receptors have been identified in all the vertebrates, from fishes to mammals, as well as invertebrates including Caenorhabditis elegans and Drosophila melanogaster, but are not present in plants, fungi or prokaryotes. The B1 subfamily comprises receptors for polypeptide hormones of 27-141 amino-acid residues such as secretin, glucagon, glucagon-like peptide (GLP), calcitonin gene-related peptide, parathyroid hormone (PTH), and corticotropin-releasing factor. These receptors contain the large N-terminal extracellular domain (ECD), which plays a critical role in hormone recognition by binding to the C-terminal portion of the peptide. On the other hand, the N-terminal segment of the hormone induces receptor activation by interacting with the receptor transmembrane domains and connecting extracellular loops, triggering intracellular signaling pathways. All members of the subfamily B1 receptors preferentially couple to G proteins of G(s) family, which positively stimulate adenylate cyclase, leading to increased intracellular cAMP formation and calcium influx. The subfamily B2 consists of cell-adhesion receptors with 33 members in humans and vertebrates. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing a variety of structural motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, linked to a class B seven-transmembrane domain. These include, for example, EGF (epidermal growth factor)-like domains in CD97, Celsr1 (cadherin family member), Celsr2, Celsr3, EMR1 (EGF-module-containing mucin-like hormone receptor-like 1), EMR2, EMR3, and Flamingo; two laminin A G-type repeats and nine cadherin domains in Flamingo and its human orthologs Celsr1, Celsr2 and Celsr3; olfactomedin-like domains in the latrotoxin receptors; and five or four thrombospondin type 1 repeats in BAI1 (brain-specific angiogenesis inhibitor 1), BAI2 and BAI3. Almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. Furthermore, the subfamily B3 includes Methuselah (Mth) protein, which was originally identified in Drosophila as a GPCR affecting stress resistance and aging, and its closely related proteins.	260
320091	cd13953	7tm_classC_mGluR-like	metabotropic glutamate receptor-like class C family of seven-transmembrane G protein-coupled receptors superfamily. The class C GPCRs consist of glutamate receptors (mGluR1-8), the extracellular calcium-sensing receptors (CaSR), the gamma-amino-butyric acid type B receptors (GABA-B), the vomeronasal type-2 pheromone receptors (V2R), the type 1 taste receptors (TAS1R), and the promiscuous L-alpha-amino acid receptor (GPRC6A), as well as several orphan receptors. Structurally, these receptors are typically composed of a large extracellular domain containing a Venus flytrap module which possesses the orthosteric agonist-binding site, a cysteine-rich domain (CRD) with the exception of GABA-B receptors, and the seven-transmembrane domains responsible for G protein activation. Moreover, the Venus flytrap module shows high structural homology with bacterial periplasmic amino acid-binding proteins, which serve as primary receptors in transport of a variety of soluble substrates such as amino acids and polysaccharides, among many others. The class C GPCRs exist as either homo- or heterodimers, which are essential for their function. The GABA-B1 and GABA-B2 receptors form a heterodimer via interactions between the N-terminal Venus flytrap modules and the C-terminal coiled-coil domains. On the other hand, heterodimeric CaSRs and Tas1Rs and homodimeric mGluRs utilize Venus flytrap interactions and intermolecular disulphide bonds between cysteine residues located in the cysteine-rich domain (CRD), which can also act as a molecular link to mediate the signal between the Venus flytrap and the 7TMs. Furthermore, members of the class C GPCRs bind a variety of endogenous ligands, ranging from amino acids, ions, to pheromones and sugar molecules, and play important roles in many physiological processes such as synaptic transmission, calcium homeostasis, and the sensation of sweet and umami tastes.	251
320092	cd13954	7tmA_OR	olfactory receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	270
260119	cd13956	PT_UbiA	UbiA family of prenyltransferases (PTases). Many characterized members of the UbiA prenyltransferase family are aromatic prenyltransferases and play an important role in the biosynthesis of heme, chlorophyll, vitamin E, and vitamin K. They contain two copies of a motif similar to the active site DxxD motif of trans-prenyltransferases and are potentially related. Prenyltransferases (PTs) catalyze the regioselective transfer of prenyl moieties onto a wide variety of substrates and play an important role in many biosynthetic pathways.	271
260120	cd13957	PT_UbiA_Cox10	Protoheme IX farnesyltransferase. Protoheme IX farnesyltransferase (also called heme O synthase, heme A:farnesyltransferase, cytochrome c oxidase subunit X [Cox10]) converts heme B (protoheme IX) to heme O by substitution of the vinyl group on carbon 2 of the heme B porphyrin ring with a hydroxyethyl farnesyl side group. It is localized at the mitochondrial inner membrane. Eukaryotic Cox10 is important for the maturation of the heme A prosthetic group of cytochrome c oxidase (COX), the terminal component of the mitochondrial respiratory chain, that catalyzes the electron transfer from reduced cytochrome c to oxygen. Prenyltransferases (PTs) catalyze the regioselective transfer of prenyl moieties onto a wide variety of substrates and play an important role in many biosynthetic pathways.	271
260121	cd13958	PT_UbiA_chlorophyll	Bacteriochlorophyll/chlorophyll synthetase. Chlorophyll synthase catalyzes the last step of chlorophyll (Chl) biosynthesis, the addition of the tetraprenyl (phytyl or geranylgeranyl) side chain. In plant chloroplast, the chlorophyll synthase is located in thylakoid membrane and has been shown to also have a regulatory or channeling function. Prenyltransferases (PTs) catalyze the regioselective transfer of prenyl moieties onto a wide variety of substrates and play an important role in many biosynthetic pathways.	277
260122	cd13959	PT_UbiA_COQ2	4-Hydroxybenzoate polyprenyltransferase. 4-Hydroxybenzoate polyprenyltransferase, also known as Coq2, catalyzes the prenylation of p-hydroxybenzoate with an all-trans polyprenyl group, an important step in ubiquinone (CoQ) biosynthesis. Prenyltransferases (PTs) catalyze the regioselective transfer of prenyl moieties onto a wide variety of substrates and play an important role in many biosynthetic pathways.	272
260123	cd13960	PT_UbiA_HPT1	Tocopherol phytyltransferase. Tocopherol polyprenyltransferase (TPT1), also known as homogentisate phytyltransferase 1 (HPT1), tocopherol phytyltransferase, or VTE2, catalyzes the first step in the biosynthesis of the tocopherol forms of vitamin E, which involves the prenylation of homogentisate using phytyl diphosphate (PDP) as the prenyl donor. Prenyltransferases (PTs) catalyze the regioselective transfer of prenyl moieties onto a wide variety of substrates and play an important role in many biosynthetic pathways.	289
260124	cd13961	PT_UbiA_DGGGPS	Geranylgeranylglycerol-phosphate geranylgeranyltransferase. Digeranylgeranylglyceryl phosphate synthase (DGGGPS) transfers a geranylgeranyl group from geranylgeranyl diphosphate to (S)-3-O-geranylgeranylglyceryl phosphate to form (S)-2,3-di-O-geranylgeranylglyceryl phosphate, as part of the isoprenoid ether lipid biosynthesis. Prenyltransferases (PTs) catalyze the regioselective transfer of prenyl moieties onto a wide variety of substrates and play an important role in many biosynthetic pathways.	270
260125	cd13962	PT_UbiA_UBIAD1	1,4-Dihydroxy-2-naphthoate octaprenyltransferase. Human UBIAD1 is an enzyme involved in the synthesis of MK-4. Menaquinones (MKs, also called bacterial forms) are one of the two forms of natural vitamin K, the other being the plant form, phylloquinone (PK). All forms of vitamin K have a 2-methyl-1,4-naphthoquinone (menadione; K3) ring structure in common. At the 3-position of the ring, PK has a phytyl side chain while MKs have several repeating prenyl units. Prenyltransferases (PTs) catalyze the regioselective transfer of prenyl moieties onto a wide variety of substrates and play an important role in many biosynthetic pathways.	283
260126	cd13963	PT_UbiA_2	UbiA family of prenyltransferases (PTases), Unknown subgroup. Many characterized members of the UbiA prenyltransferase family are aromatic prenyltransferases and play an important role in the biosynthesis of heme, chlorophyll, vitamin E, and vitamin K. They contain two copies of a motif similar to the active site DxxD motif of trans-prenyltransferases and are potentially related. Prenyltransferases (PTs) catalyze the regioselective transfer of prenyl moieties onto a wide variety of substrates and play an important role in many biosynthetic pathways. The function of this subgroup is unknown.	278
260127	cd13964	PT_UbiA_1	UbiA family of prenyltransferases (PTases), Unknown subgroup. Many characterized members of the UbiA prenyltransferase family are aromatic prenyltransferases and play an important role in the biosynthesis of heme, chlorophyll, vitamin E, and vitamin K. They contain two copies of a motif similar to the active site DxxD motif of trans-prenyltransferases and are potentially related. Prenyltransferases (PTs) catalyze the regioselective transfer of prenyl moieties onto a wide variety of substrates and play an important role in many biosynthetic pathways. The function of this subgroup is unknown.	282
260128	cd13965	PT_UbiA_3	UbiA family of prenyltransferases (PTases), Unknown subgroup. Many characterized members of the UbiA prenyltransferase family are aromatic prenyltransferases and play an important role in the biosynthesis of heme, chlorophyll, vitamin E, and vitamin K. They contain two copies of a motif similar to the active site DxxD motif of trans-prenyltransferases and are potentially related. Prenyltransferases (PTs) catalyze the regioselective transfer of prenyl moieties onto a wide variety of substrates and play an important role in many biosynthetic pathways. The function of this subgroup is unknown.	273
260129	cd13966	PT_UbiA_4	UbiA family of prenyltransferases (PTases), Unknown subgroup. Many characterized members of the UbiA prenyltransferase family are aromatic prenyltransferases and play an important role in the biosynthesis of heme, chlorophyll, vitamin E, and vitamin K. They contain two copies of a motif similar to the active site DxxD motif of trans-prenyltransferases and are potentially related. Prenyltransferases (PTs) catalyze the regioselective transfer of prenyl moieties onto a wide variety of substrates and play an important role in many biosynthetic pathways. The function of this subgroup is unknown.	272
260130	cd13967	PT_UbiA_5	UbiA family of prenyltransferases (PTases), Unknown subgroup. Many characterized members of the UbiA prenyltransferase family are aromatic prenyltransferases and play an important role in the biosynthesis of heme, chlorophyll, vitamin E, and vitamin K. They contain two copies of a motif similar to the active site DxxD motif of trans-prenyltransferases and are potentially related. Prenyltransferases (PTs) catalyze the regioselective transfer of prenyl moieties onto a wide variety of substrates and play an important role in many biosynthetic pathways. The function of this subgroup is unknown.	277
270870	cd13968	PKc_like	Catalytic domain of the Protein Kinase superfamily. The PK superfamily contains the large family of typical PKs that includes serine/threonine kinases (STKs), protein tyrosine kinases (PTKs), and dual-specificity PKs that phosphorylate both serine/threonine and tyrosine residues of target proteins, as well as pseudokinases that lack crucial residues for catalytic activity and/or ATP binding. It also includes phosphoinositide 3-kinases (PI3Ks), aminoglycoside 3'-phosphotransferases (APHs), choline kinase (ChoK), Actin-Fragmin Kinase (AFK), and the atypical RIO and Abc1p-like protein kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to their target substrates; these include serine/threonine/tyrosine residues in proteins for typical or atypical PKs, the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives for PI3Ks, the 4-hydroxyl of PtdIns for PI4Ks, and other small molecule substrates for APH/ChoK and similar proteins such as aminoglycosides, macrolides, choline, ethanolamine, and homoserine.	136
270871	cd13969	ADCK1-like	aarF domain containing kinase 1 and similar proteins. This subfamily is composed of uncharacterized ABC1 kinase-like proteins including the human protein called aarF domain containing kinase 1 (ADCK1). Eukaryotes contain at least three ABC1-like proteins: in humans, these are ADCK3 and the putative protein kinases named ADCK1 and ADCK2. Yeast Abc1p and its human homolog ADCK3 are atypical protein kinases required for the biosynthesis of Coenzyme Q (ubiquinone or Q), which is an essential lipid component in respiratory electron and proton transport. In algae and higher plants, ABC1 kinases have proliferated to more than 15 subfamilies, most of which are located in plastids or mitochondria. Plant subfamilies 14 and 15 (ABC1K14-15) belong to the same group of ABC1 kinases as human ADCK1. ABC1 kinases are not related to the ATP-binding cassette (ABC) membrane transporter family.	253
270872	cd13970	ABC1_ADCK3	Activator of bc1 complex (ABC1) kinases, also called aarF domain containing kinase 3. This subfamily is composed of the atypical yeast protein kinase Abc1p, its human homolog ADCK3 (also called CABC1), and similar proteins. Abc1p (also called Coq8p) is required for the biosynthesis of Coenzyme Q (ubiquinone or Q), which is an essential lipid component in respiratory electron and proton transport. It is necessary for the formation of a multi-subunit Q-biosynthetic complex and may also function in the regulation of Q synthesis. Human ADCK3 is able to rescue defects in Q synthesis and the phosphorylation state of Coq proteins in yeast Abc1 (or Coq8) mutants. Mutations in ADCK3 cause progressive cerebellar ataxia and atrophy due to Q10 deficiency. In algae and higher plants, ABC1 kinases have proliferated to more than 15 subfamilies, most of which are located in plastids or mitochondria. Subfamily 13 (ABC1K13) of plant ABC1 kinases belongs in this subfamily with yeast Abc1p and human ADCK3. ABC1 kinases are not related to the ATP-binding cassette (ABC) membrane transporter family.	251
270873	cd13971	ADCK2-like	aarF domain containing kinase 2 and similar proteins. This subfamily is composed of uncharacterized ABC1 kinase-like proteins including the human protein called aarF domain containing kinase 2 (ADCK2). Eukaryotes contain at least three ABC1-like proteins; in humans, these are ADCK3 and the putative protein kinases named ADCK1 and ADCK2. Yeast Abc1p and its human homolog ADCK3 are atypical protein kinases required for the biosynthesis of Coenzyme Q (ubiquinone or Q), which is an essential lipid component in respiratory electron and proton transport. In algae and higher plants, ABC1 kinases have proliferated to more than 15 subfamilies, most of which are located in plastids or mitochondria. Plant subfamily 10 (ABC1K10) belongs to the same group of ABC1 kinases as human ADCK2. ABC1 kinases are not related to the ATP-binding cassette (ABC) membrane transporter family.	298
270874	cd13972	UbiB	Ubiquinone biosynthetic protein UbiB. UbiB is the prokaryotic homolog of yeast Abc1p and human ADCK3 (aarF domain containing kinase 3). It is required for the biosynthesis of Coenzyme Q (ubiquinone or Q), which is an essential lipid component in respiratory electron and proton transport. It is required in the first monooxygenase step in Q biosynthesis. Mutant strains with disrupted ubiB genes lack Q and accumulate octaprenylphenol, a Q biosynthetic intermediate.	247
270875	cd13973	PK_MviN-like	Pseudokinase domain of the peptidoglycan biosynthetic protein MviN. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity. This family is composed of the mycobacterial protein MviN and similar proteins. MviN is an integral membrane protein that is essential for growth and is required for cell wall integrity and peptidoglycan (PG) biosynthesis. It comprises 14 predicted transmembrane (TM) helices at the N-terminus, followed by an intracellular pseudokinase domain linked through a single TM helix to a carbohydrate binding extracellular domain. Phosphorylation of the MviN pseudokinase domain by the PG-sensitive serine/threonine protein kinase PknB recruits a forkhead associated (FHA) domain protein FhaA, which modulates local PG synthesis at cell poles and the septum. The MviN pseudokinase forms a canonical receptor kinase dimer.	236
270876	cd13974	STKc_SHIK	Catalytic domain of the Serine/Threonine kinase, SINK-homologous inhibitory kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. SHIK, also referred to as STK40 or LYK4, is a cytoplasmic and nuclear protein that is involved in the negative regulation of NF-kappaB- and p53-mediated transcription. It was identified as a protein related to SINK, a p65-interacting protein that inhibits p65 phosphorylation by the catalytic subunit of PKA, thereby inhibiting transcriptional competence of NF-kappaB. The SHIK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	290
270877	cd13975	PKc_Dusty	Catalytic domain of the Dual-specificity Protein Kinase, Dusty. Dual-specificity PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine as well as tyrosine residues on protein substrates. Dusty protein kinase is also called Receptor-interacting protein kinase 5 (RIPK5 or RIP5) or RIP-homologous kinase. It is widely distributed in the central nervous system, and may be involved in inducing both caspase-dependent and caspase-independent cell death. The Dusty subfamily is part of a larger superfamily that includes the catalytic domains of other protein serine/threonine PKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	262
270878	cd13976	PK_TRB	Pseudokinase domain of Tribbles Homolog proteins. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity. Tribbles Homolog (TRB) proteins interact with many proteins involved in signaling pathways. They play scaffold-like regulatory functions and affect many cellular processes such as mitosis, apoptosis, differentiation, and gene expression. TRB proteins bind to the middle kinase in mitogen activated protein kinase (MAPK) signaling cascades, MAPK kinases. They regulate the activity of MAPK kinases, and thus, affect MAPK signaling. In Drosophila, Tribbles regulates String, the ortholog of mammalian Cdc25, during morphogenesis. String is implicated in the progression of mitosis during embryonic development. Vertebrates contain three TRB proteins encoded by three separate genes: Tribbles-1 (TRB1 or TRIB1), Tribbles-2 (TRB2 or TRIB2), and Tribbles-3 (TRB3 or TRIB3). The TRB subfamily is part of a larger superfamily that includes the catalytic domains of serine/threonine kinases, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	242
270879	cd13977	STKc_PDIK1L	Catalytic domain of the Serine/Threonine kinase, PDLIM1 interacting kinase 1 like. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PDIK1L is also called STK35 or CLIK-1. It is predominantly a nuclear protein which is capable of autophosphorylation. Through its interaction with the PDZ-LIM protein CLP-36, it is localized to actin stress fibers. The PDIK1L subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase (PI3K).	322
270880	cd13978	STKc_RIP	Catalytic domain of the Serine/Threonine kinase, Receptor Interacting Protein. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. RIP kinases serve as essential sensors of cellular stress. They are involved in regulating NF-kappaB and MAPK signaling, and are implicated in mediating cellular processes such as apoptosis, necroptosis, differentiation, and survival. RIP kinases contain a homologous N-terminal kinase domain and varying C-terminal domains. Higher vertebrates contain multiple RIP kinases, with mammals harboring at least five members. RIP1 and RIP2 harbor C-terminal domains from the Death domain (DD) superfamily while RIP4 contains ankyrin (ANK) repeats. RIP3 contains a RIP homotypic interaction motif (RHIM) that facilitates binding to RIP1. RIP1 and RIP3 are important in apoptosis and necroptosis, while RIP2 and RIP4 play roles in keratinocyte differentiation and inflammatory immune responses. The RIP subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	263
270881	cd13979	STKc_Mos	Catalytic domain of the Serine/Threonine kinase, Oocyte maturation factor Mos. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Mos (or c-Mos) is a germ-cell specific kinase that plays roles in both the release of primary arrest and the induction of secondary arrest in oocytes. It is expressed towards the end of meiosis I and is quickly degraded upon fertilization. It is a component of the cytostatic factor (CSF), which is responsible for metaphase II arrest. In addition, Mos activates a phosphorylation cascade that leads to the activation of the p34 subunit of MPF (mitosis-promoting factor or maturation promoting factor), a cyclin-dependent kinase that is responsible for the release of primary arrest in meiosis I. The Mos subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	265
270882	cd13980	STKc_Vps15	Catalytic domain of the Serine/Threonine kinase, Vacuolar protein sorting-associated protein 15. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Vps15 is a large protein consisting of an N-terminal kinase domain, a C-terminal WD-repeat containing domain, and an intermediate bridge domain that contains HEAT repeats. The kinase domain is necessary for the signaling functions of Vps15. Human Vps15 was previously called p150. It associates with and regulates Vps34, also called Class III phosphoinositide 3-kinase (PI3K), which catalyzes the phosphorylation of D-myo-phosphatidylinositol (PtdIns). Vps34 is the only PI3K present in yeast. It plays an important role in the regulation of protein and vesicular trafficking and sorting, autophagy, trimeric G-protein signaling, and phagocytosis. The Vps15 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and PI3K.	278
270883	cd13981	STKc_Bub1_BubR1	Catalytic domain of the Serine/Threonine kinases, Spindle assembly checkpoint proteins Bub1 and BubR1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of Bub1 (Budding uninhibited by benzimidazoles 1), BubR1, and similar proteins. They contain an N-terminal Bub1/Mad3 homology domain essential for Cdc20 binding and a C-terminal kinase domain. Bub1 and BubR1 are involved in SAC, a surveillance system that delays metaphase to anaphase transition by blocking the activity of APC/C (the anaphase promoting complex) until all chromosomes achieve proper attachments to the mitotic spindle, to avoid chromosome missegregation. Impaired SAC leads to genomic instabilities and tumor development. Bub1 and BubR1 facilitate the localization of SAC proteins to kinetochores and regulate kinetochore-microtubule (K-MT) attachments. Repression studies of Bub1 and BubR1 show that they exert an additive effect in misalignment phenotypes and may function cooperatively or in parallel pathways in regulating K-MT attachments. The Bub1/BubR1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	298
270884	cd13982	STKc_IRE1	Catalytic domain of the Serine/Threonine kinase, Inositol-requiring protein 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. IRE1, also called Endoplasmic reticulum (ER)-to-nucleus signaling protein (or ERN), is an ER-localized type I transmembrane protein with kinase and endoribonuclease domains in the cytoplasmic side. It acts as an ER stress sensor and is the oldest and most conserved component of the unfolded protein response (UPR) in eukaryotes. The UPR is activated when protein misfolding is detected in the ER in order to decrease the synthesis of new proteins and increase the capacity of the ER to cope with the stress. During ER stress, IRE1 dimerizes and forms oligomers, allowing the kinase domain to undergo trans-autophosphorylation. This leads to a conformational change that stimulates its endoribonuclease activity and results in the cleavage of its mRNA substrate, HAC1 in yeast and XBP1 in metazoans, promoting a splicing event that enables translation into a transcription factor which activates the UPR. Mammals contain two IRE1 proteins, IRE1alpha (or ERN1) and IRE1beta (or ERN2). The Ire1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	269
270885	cd13983	STKc_WNK	Catalytic domain of the Serine/Threonine kinase, With No Lysine (WNK) kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. WNKs comprise a subfamily of STKs with an unusual placement of a catalytic lysine relative to all other protein kinases. They are critical in regulating ion balance and are thus important components in the control of blood pressure. They are also involved in cell signaling, survival, proliferation, and organ development. WNKs are activated by hyperosmotic or low-chloride hypotonic stress and they function upstream of SPAK and OSR1 kinases, which regulate the activity of cation-chloride cotransporters through direct interaction and phosphorylation. There are four vertebrate WNKs which show varying expression patterns. WNK1 and WNK2 are widely expressed while WNK3 and WNK4 show a more restricted expression pattern. Because mutations in human WNK1 and WNK4 cause PseudoHypoAldosteronism type II (PHAII), characterized by hypertension (due to increased sodium reabsorption) and hyperkalemia (due to impaired renal potassium secretion), there are more studies conducted on these two proteins, compared to WNK2 and WNK3. The WNK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	258
270886	cd13984	PK_NRBP1_like	Pseudokinase domain of Nuclear Receptor Binding Protein 1 and similar proteins. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity and/or ATP binding. This subfamily is composed of NRBP1, also called MLF1-adaptor molecule (MADM), and MADML. NRBP1 was originally named based on the presence of nuclear binding and localization motifs prior to functional analyses. It is expressed ubiquitously and is found to localize in the cytoplasm, not the nucleus. NRBP1 is an adaptor protein that interacts with myeloid leukemia factor 1 (MLF1), an oncogene that enhances myeloid development of hematopoietic cells. It also interacts with the small GTPase Rac3. NRBP1 may also be involved in Golgi to ER trafficking. MADML (for MADM-Like) has been shown to be expressed throughout development in Xenopus laevis with highest expression found in the developing lens and retina. The NRBP1-like subfamily is part of a larger superfamily that includes the catalytic domains of serine/threonine kinases, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	256
270887	cd13985	STKc_GAK_like	Catalytic domain of cyclin G-Associated Kinase-like proteins. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily includes cyclin G-Associated Kinase (GAK), Drosophila melanogaster Numb-Associated Kinase (NAK)-like proteins, and similar protein kinases. GAK plays regulatory roles in clathrin-mediated membrane trafficking, the maintenance of centrosome integrity and chromosome congression, neural patterning, survival of neurons, and immune responses. NAK plays a role in asymmetric cell division through its association with Numb. It also regulates the localization of Dlg, a protein essential for septate junction formation. The GAK-like subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	272
270888	cd13986	STKc_16	Catalytic domain of Serine/Threonine Kinase 16. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. STK16 is associated with many names including Myristylated and Palmitylated Serine/threonine Kinase 1 (MPSK1), Kinase related to cerevisiae and thaliana (Krct), and Protein Kinase expressed in day 12 fetal liver (PKL12). It is widely expressed in mammals with highest levels found in liver, testis, and kidney. It is localized in the Golgi but is translocated to the nucleus upon disorganization of the Golgi. STK16 is constitutively active and is capable of phosphorylating itself and other substrates. It may be involved in regulating stromal-epithelial interactions during mammary gland ductal morphogenesis. It may also function as a transcriptional co-activator of type-C natriuretic peptide and VEGF. The STK16 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	282
270889	cd13987	STKc_SBK1	Catalytic domain of the Serine/Threonine kinase, SH3 Binding Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. SBK1, also called BSK146, is predominantly expressed in the brain. Its expression is increased in the developing brain during the late embryonic stage, coinciding with dramatic neuronal proliferation, migration, and maturation. SBK1 may play an important role in regulating brain development. The SBK1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	259
270890	cd13988	STKc_TBK1	Catalytic domain of the Serine/Threonine kinase, TANK Binding Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. TBK1 is also called T2K and NF-kB-activating kinase. It is widely expressed in most cell types and acts as an IkappaB kinase (IKK)-activating kinase responsible for NF-kB activation in response to growth factors. It plays a role in modulating inflammatory responses through the NF-kB pathway. TBK1 is also a major player in innate immune responses since it functions as a virus-activated kinase necessary for establishing an antiviral state. It phosphorylates IRF-3 and IRF-7, which are important transcription factors for inducing type I interferon during viral infection. In addition, TBK1 may also play roles in cell transformation and oncogenesis. The TBK1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	316
270891	cd13989	STKc_IKK	Catalytic domain of the Serine/Threonine kinase, Inhibitor of Nuclear Factor-KappaB Kinase (IKK). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The IKK complex functions as a master regulator of Nuclear Factor-KappaB (NF-kB) proteins, a family of transcription factors which are critical in many cellular functions including inflammatory responses, immune development, cell survival, and cell proliferation, among others. It is composed of two kinases, IKKalpha and IKKbeta, and the regulatory subunit IKKgamma or NEMO (NF-kB Essential MOdulator). IKKs facilitate the release of NF-kB dimers from an inactive state, allowing them to migrate to the nucleus where they regulate gene transcription. There are two IKK pathways that regulate NF-kB signaling, called the classical (involving IKKbeta and NEMO) and non-canonical (involving IKKalpha) pathways. The classical pathway regulates the majority of genes activated by NF-kB. The IKK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase (PI3K).	289
270892	cd13990	STKc_TLK	Catalytic domain of the Serine/Threonine kinase, Tousled-Like Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. TLKs play important functions during the cell cycle and are implicated in chromatin remodeling, DNA replication and repair, and mitosis. They phosphorylate and regulate Anti-silencing function 1 protein (Asf1), a histone H3/H4 chaperone that helps facilitate the assembly of chromatin following DNA replication during S phase. TLKs also phosphorylate the H3 histone tail and are essential in transcription. Vertebrates contain two subfamily members, TLK1 and TLK2. The TLK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	279
270893	cd13991	STKc_NIK	Catalytic domain of the Serine/Threonine kinase, NF-kappaB Inducing Kinase (NIK). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. NIK, also called mitogen activated protein kinase kinase kinase 14 (MAP3K14), phosphorylates and activates Inhibitor of NF-KappaB Kinase (IKK) alpha, which is a regulator of NF-kB proteins, a family of transcription factors which are critical in many cellular functions including inflammatory responses, immune development, cell survival, and cell proliferation, among others. NIK is essential in the IKKalpha-mediated non-canonical NF-kB signaling pathway, in which IKKalpha processes the IkB-like C-terminus of NF-kB2/p100 to produce p52, allowing the p52/RelB dimer to migrate to the nucleus where it regulates gene transcription. NIK also plays an important role in Toll-like receptor 7/9 signaling cascades. The NIK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	268
270894	cd13992	PK_GC	Pseudokinase domain of membrane Guanylate Cyclase receptors. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity. Membrane (or particulate) GCs consist of an extracellular ligand-binding domain, a single transmembrane region, and an intracellular tail that contains a PK-like domain, an amphipathic region and a catalytic GC domain that catalyzes the conversion of GTP into cGMP and pyrophosphate. Membrane GCs act as receptors that transduce an extracellular signal to the intracellular production of cGMP, which has been implicated in many processes including cell proliferation, phototransduction, and muscle contractility, through its downstream effectors such as PKG. The PK-like domain of GCs lacks a critical aspartate involved in ATP binding and does not exhibit kinase activity. It functions as a negative regulator of the catalytic GC domain and may also act as a docking site for interacting proteins such as GC-activating proteins. The GC subfamily is part of a larger superfamily that includes the catalytic domains of protein serine/threonine kinases, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	268
270895	cd13993	STKc_Pat1_like	Catalytic domain of Fungal Pat1-like Serine/Threonine kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of Schizosaccharomyces pombe Pat1 (also called Ran1), Saccharomyces cerevisiae VHS1 and KSP1, and similar fungal STKs. Pat1 blocks Mei2, an RNA-binding protein which is indispensable in the initiation of meiosis. Pat1 is inactivated and Mei2 activated, which initiates meiosis, under nutrient-deprived conditions through a signaling cascade involving Ste11. Meiosis induced by Pat1 inactivation may show different characteristics than normal meiosis including aberrant positioning of centromeres. VHS1 was identified in a screen for suppressors of cell cycle arrest at the G1/S transition, while KSP1 may be involved in regulating PRP20, which is required for mRNA export and maintenance of nuclear structure. The Pat1-like subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	267
270896	cd13994	STKc_HAL4_like	Catalytic domain of Fungal Halotolerance protein 4-like Serine/Threonine kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of HAL4, Saccharomyces cerevisiae Ptk2/Stk2, and similar fungal proteins. Proteins in this subfamily are involved in regulating ion transporters. In budding and fission yeast, HAL4 promotes potassium ion uptake, which increases cellular resistance to other cations such as sodium, lithium, and calcium ions. HAL4 stabilizes the major high-affinity K+ transporter Trk1 at the plasma membrane under low K+ conditions, which prevents endocytosis and vacuolar degradation. Budding yeast Ptk2 phosphorylates and regulates the plasma membrane H+ ATPase, Pma1. The HAL4-like subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	265
270897	cd13995	STKc_MAP3K8	Catalytic domain of the Serine/Threonine kinase, Mitogen-Activated Protein Kinase (MAPK) Kinase Kinase 8. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MAP3K8 is also called Tumor progression locus 2 (Tpl2) or Cancer Osaka thyroid (Cot), and was first identified as a proto-oncogene in T-cell lymphoma induced by MoMuL virus and in breast carcinoma induced by MMTV. Activated MAP3K8 induces various MAPK pathways including Extracellular Regulated Kinase (ERK) 1/2, c-Jun N-terminal kinase (JNK), and p38. It plays a pivotal role in innate immunity, linking Toll-like receptors to the production of TNF and the activation of ERK in macrophages. It is also required in interleukin-1beta production and is critical in host defense against Gram-positive bacteria. MAP3Ks (MKKKs or MAPKKKs) phosphorylate and activate MAPK kinases (MAPKKs or MKKs or MAP2Ks), which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. The MAP3K8 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	256
270898	cd13996	STKc_EIF2AK	Catalytic domain of the Serine/Threonine kinase, eukaryotic translation Initiation Factor 2-Alpha Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. EIF2AKs phosphorylate the alpha subunit of eIF-2, resulting in the downregulation of protein synthesis. eIF-2 phosphorylation is induced in response to cellular stresses including virus infection, heat shock, nutrient deficiency, and the accumulation of unfolded proteins, among others. There are four distinct kinases that phosphorylate eIF-2 and control protein synthesis under different stress conditions: General Control Non-derepressible-2 (GCN2) which is activated during amino acid or serum starvation; protein kinase regulated by RNA (PKR) which is activated by double stranded RNA; heme-regulated inhibitor kinase (HRI) which is activated under heme-deficient conditions; and PKR-like endoplasmic reticulum kinase (PERK) which is activated when misfolded proteins accumulate in the ER. The EIF2AK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	273
270899	cd13997	PKc_Wee1_like	Catalytic domain of the Wee1-like Protein Kinases. PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine or tyrosine residues on protein substrates. This subfamily is composed of the dual-specificity kinase Myt1, the protein tyrosine kinase Wee1, and similar proteins. These proteins are cell cycle checkpoint kinases that are involved in the regulation of cyclin-dependent kinase CDK1, the master engine for mitosis. CDK1 is kept inactivated through phosphorylation of N-terminal thr (T14 by Myt1) and tyr (Y15 by Myt1 and Wee1) residues. Mitosis progression is ensured through activation of CDK1 by dephoshorylation and inactivation of Myt1/Wee1. The Wee1-like subfamily is part of a larger superfamily that includes the catalytic domains of other protein serine/threonine PKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	252
270900	cd13998	STKc_TGFbR-like	Catalytic domain of Transforming Growth Factor beta Receptor-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of receptors for the TGFbeta family of secreted signaling molecules including TGFbeta, bone morphogenetic proteins (BMPs), activins, growth and differentiation factors (GDFs), and anti-Mullerian hormone, among others. These receptors contain an extracellular domain that binds ligands, a single transmembrane (TM) region, and a cytoplasmic catalytic kinase domain. There are two types of TGFbeta receptors included in this subfamily, I and II, that play different roles in signaling. For signaling to occur, the ligand first binds to the high-affinity type II receptor, which is followed by the recruitment of the low-affinity type I receptor to the complex and its activation through trans-phosphorylation by the type II receptor. The active type I receptor kinase starts intracellular signaling to the nucleus by phosphorylating SMAD proteins. Type I receptors contain an additional domain located between the TM and kinase domains called the the GS domain, which contains the activating phosphorylation site and confers preference for specific SMAD proteins. Different ligands interact with various combinations of types I and II receptors to elicit a specific signaling pathway. Activins primarily signal through combinations of ACVR1b/ALK7 and ACVR2a/b; myostatin and GDF11 through TGFbR1/ALK4 and ACVR2a/b; BMPs through ACVR1/ALK1 and BMPR2; and TGFbeta through TGFbR1 and TGFbR2. The TGFbR-like subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	289
270901	cd13999	STKc_MAP3K-like	Catalytic domain of Mitogen-Activated Protein Kinase (MAPK) Kinase Kinase-like Serine/Threonine kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed mainly of MAP3Ks and similar proteins, including TGF-beta Activated Kinase-1 (TAK1, also called MAP3K7), MAP3K12, MAP3K13, Mixed lineage kinase (MLK), MLK-Like mitogen-activated protein Triple Kinase (MLTK), and Raf (Rapidly Accelerated Fibrosarcoma) kinases. MAP3Ks (MKKKs or MAPKKKs) phosphorylate and activate MAPK kinases (MAPKKs or MKKs or MAP2Ks), which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. Also included in this subfamily is the pseudokinase Kinase Suppressor of Ras (KSR), which is a scaffold protein that functions downstream of Ras and upstream of Raf in the Extracellular signal-Regulated Kinase (ERK) pathway.	245
270902	cd14000	STKc_LRRK	Catalytic domain of the Serine/Threonine kinase, Leucine-Rich Repeat Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. LRRKs are also classified as ROCO proteins because they contain a ROC (Ras of complex proteins)/GTPase domain followed by a COR (C-terminal of ROC) domain of unknown function. In addition, LRRKs contain a catalytic kinase domain and protein-protein interaction motifs including a WD40 domain, LRRs and ankyrin (ANK) repeats. LRRKs possess both GTPase and kinase activities, with the ROC domain acting as a molecular switch for the kinase domain, cycling between a GTP-bound state which drives kinase activity and a GDP-bound state which decreases the activity. Vertebrates contain two members, LRRK1 and LRRK2, which show complementary expression in the brain. Mutations in LRRK2 are linked to both familial and sporadic forms of Parkinson's disease. The normal roles of LRRKs are not clearly defined. They may be involved in mitogen-activated protein kinase (MAPK) pathways, protein translation control, programmed cell death pathways, and cytoskeletal dynamics. The LRRK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	275
270903	cd14001	PKc_TOPK	Catalytic domain of the Dual-specificity protein kinase, Lymphokine-activated killer T-cell-originated protein kinase. Dual-specificity PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine as well as tyrosine residues on protein substrates. TOPK, also called PDZ-binding kinase (PBK), is activated at the early stage of mitosis and plays a critical role in cytokinesis. It partly functions as a mitogen-activated protein kinase (MAPK) kinase and is capable of phosphorylating p38, JNK1, and ERK2. TOPK also plays a role in DNA damage sensing and repair through its phosphorylation of histone H2AX. It contributes to cancer development and progression by downregulating the function of tumor suppressor p53 and reducing cell-cycle regulatory proteins. TOPK is found highly expressed in breast and skin cancer cells. The TOPK subfamily is part of a larger superfamily that includes the catalytic domains of other protein kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	292
270904	cd14002	STKc_STK36	Catalytic domain of Serine/Threonine Kinase 36. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. STK36, also called Fused (or Fu) kinase, is involved in the Hedgehog signaling pathway. It is activated by the Smoothened (SMO) signal transducer, resulting in the stabilization of GLI transcription factors and the phosphorylation of SUFU to facilitate the nuclear accumulation of GLI. In Drosophila, Fused kinase is maternally required for proper segmentation during embryonic development and for the development of legs and wings during the larval stage. In mice, STK36 is not necessary for embryonic development, although mice deficient in STK36 display growth retardation postnatally. The STK36 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	253
270905	cd14003	STKc_AMPK-like	Catalytic domain of AMP-activated protein kinase-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The AMPK-like subfamily is composed of AMPK, MARK, BRSK, NUAK, MELK, SNRK, TSSK, and SIK, among others. LKB1 serves as a master upstream kinase that activates AMPK and most AMPK-like kinases. AMPK, also called SNF1 (sucrose non-fermenting1) in yeasts and SnRK1 (SNF1-related kinase1) in plants, is a heterotrimeric enzyme composed of a catalytic alpha subunit and two regulatory subunits, beta and gamma. It is a stress-activated kinase that serves as master regulator of glucose and lipid metabolism by monitoring carbon and energy supplies, via sensing the cell's AMP:ATP ratio. MARKs phosphorylate tau and related microtubule-associated proteins (MAPs), and regulate microtubule-based intracellular transport. They are involved in embryogenesis, epithelial cell polarization, cell signaling, and neuronal differentiation. BRSKs play important roles in establishing neuronal polarity. TSSK proteins are almost exclusively expressed postmeiotically in the testis and play important roles in spermatogenesis and/or spermiogenesis. The AMPK-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	252
270906	cd14004	STKc_PASK	Catalytic domain of the Serine/Threonine kinase, Per-ARNT-Sim (PAS) domain Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PASK (or PASKIN) is a nutrient and energy sensor and thus, plays an important role in maintaining cellular energy homeostasis. It coordinates the utilization of glucose in response to metabolic demand. It contains an N-terminal PAS domain which directly interacts with and inhibits a C-terminal catalytic kinase domain. The PAS domain serves as a sensory module for different environmental signals such as light, redox state, and various metabolites. Binding of ligands to the PAS domain causes structural changes which lead to kinase activation and the phosphorylation of substrates to trigger the appropriate cellular response. The PASK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	256
270907	cd14005	STKc_PIM	Catalytic domain of the Serine/Threonine kinase, Proviral Integration Moloney virus (PIM) kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The PIM gene locus was discovered as a result of the cloning of retroviral integration sites in murine Moloney leukemia virus, leading to the identification of PIM kinases. They are constitutively active STKs with a broad range of cellular targets and are overexpressed in many haematopoietic malignancies and solid cancers. Vertebrates contain three distinct PIM kinase genes (PIM1-3); each gene may result in multiple protein isoforms. There are two PIM1 and three PIM2 isoforms as a result of alternative translation initiation sites, while there is only one PIM3 protein. Compound knockout mice deficient of all three PIM kinases that survive the perinatal period show a profound reduction in body size, indicating that PIMs are important for body growth. The PIM subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	255
270908	cd14006	STKc_MLCK-like	Catalytic kinase domain of Myosin Light Chain Kinase-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This family is composed of MLCKs and related MLCK-like kinase domains from giant STKs such as titin, obscurin, SPEG, Unc-89, Trio, kalirin, and Twitchin. Also included in this family are Death-Associated Protein Kinases (DAPKs) and Death-associated protein kinase-Related Apoptosis-inducing protein Kinase (DRAKs). MLCK phosphorylates myosin regulatory light chain and controls the contraction of all muscle types. Titin, obscurin, Twitchin, and SPEG are muscle proteins involved in the contractile apparatus. The giant STKs are multidomain proteins containing immunoglobulin (Ig), fibronectin type III (FN3), SH3, RhoGEF, PH and kinase domains. Titin, obscurin, Twitchin, and SPEG contain many Ig domain repeats at the N-terminus, while Trio and Kalirin contain spectrin-like repeats. The MLCK-like family is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	247
270909	cd14007	STKc_Aurora	Catalytic domain of the Serine/Threonine kinase, Aurora kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Aurora kinases are key regulators of mitosis and are essential for the accurate and equal division of genomic material from parent to daughter cells. Yeast contains only one Aurora kinase while most higher eukaryotes have two. Vertebrates contain at least 2 Aurora kinases (A and B); mammals contain a third Aurora kinase gene (C). Aurora-A regulates cell cycle events from the late S-phase through the M-phase including centrosome maturation, mitotic entry, centrosome separation, spindle assembly, chromosome alignment, cytokinesis, and mitotic exit. Aurora-A activation depends on its autophosphorylation and binding to the microtubule-associated protein TPX2. Aurora-B is most active at the transition during metaphase to the end of mitosis. It is critical for accurate chromosomal segregation, cytokinesis, protein localization to the centrosome and kinetochore, correct microtubule-kinetochore attachments, and regulation of the mitotic checkpoint. Aurora-C is mainly expressed in meiotically dividing cells; it was originally discovered in mice as a testis-specific STK called Aie1. Both Aurora-B and -C are chromosomal passenger proteins that can form complexes with INCENP and survivin, and they may have redundant cellular functions. The Aurora subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	253
270910	cd14008	STKc_LKB1_CaMKK	Catalytic domain of the Serine/Threonine kinases, Liver Kinase B1, Calmodulin Dependent Protein Kinase Kinase, and similar proteins. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Both LKB1 and CaMKKs can phosphorylate and activate AMP-activated protein kinase (AMPK). LKB1, also called STK11, serves as a master upstream kinase that activates AMPK and most AMPK-like kinases. LKB1 and AMPK are part of an energy-sensing pathway that links cell energy to metabolism and cell growth. They play critical roles in the establishment and maintenance of cell polarity, cell proliferation, cytoskeletal organization, as well as T-cell metabolism, including T-cell development, homeostasis, and effector function. CaMKKs are upstream kinases of the CaM kinase cascade that phosphorylate and activate CaMKI and CamKIV. They may also phosphorylate other substrates including PKB and AMPK. Vertebrates contain two CaMKKs, CaMKK1 (or alpha) and CaMKK2 (or beta). CaMKK1 is involved in the regulation of glucose uptake in skeletal muscles. CaMKK2 is involved in regulating energy balance, glucose metabolism, adiposity, hematopoiesis, inflammation, and cancer. The LKB1/CaMKK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	267
270911	cd14009	STKc_ATG1_ULK_like	Catalytic domain of the Serine/Threonine kinases, Autophagy-related protein 1 and Unc-51-like kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily includes yeast ATG1 and metazoan homologs including vertebrate ULK1-3. The ATG1/ULK complex is conserved from yeast to humans and it plays a critical role in the initiation of autophagy, the intracellular system that leads to the lysosomal degradation of cellular components and their recycling into basic metabolic units. It is involved in nutrient sensing and signaling, the assembly of autophagy factors and the execution of autophagy. In metazoans, ATG1 homologs display additional functions. Unc-51 and ULKs have been implicated in neuronal and axonal development. The ATG1/ULK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	251
270912	cd14010	STKc_ULK4	Catalytic domain of the Serine/Threonine kinase, Unc-51-like kinase 4. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. ULK4 is a functionally uncharacterized kinase that shows similarity to ATG1/ULKs. The ATG1/ULK complex is conserved from yeast to humans and it plays a critical role in the initiation of autophagy, the intracellular system that leads to the lysosomal degradation of cellular components and their recycling into basic metabolic units. The ULK4 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	269
270913	cd14011	PK_SCY1_like	Pseudokinase domain of Scy1-like proteins. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity. This subfamily is composed of the catalytically inactive kinases with similarity to yeast Scy1. It includes four mammalian proteins called SCY1-like protein 1 (SCYL1), SCYL2, SCYL3, as well as Testis-EXpressed protein 14 (TEX14). SCYL1 binds to and co-localizes with the membrane trafficking coatomer I (COPI) complex, and regulates COPI-mediated vesicle trafficking. Null mutations in the SCYL1 gene are responsible for the pathology in mdf (muscle-deficient) mice which display progressive motor neuropathy. SCYL2, also called coated vesicle-associated kinase of 104 kDa (CVAK104), is involved in the trafficking of clathrin-coated vesicles. It also binds the HIV-1 accessory protein Vpu and acts as a regulatory factor that promotes the dephosphorylation of Vpu, facilitating the restriction of HIV-1 release. SCYL3, also called ezrin-binding protein PACE-1, may be involved in regulating cell adhesion and migration. TEX14 is required for spermatogenesis and male fertility. It localizes to kinetochores (KT) during mitosis and is a target of the mitotic kinase PLK1. It regulates the maturation of the outer KT and the KT-microtubule attachment. The SCY1-like subfamily is part of a larger superfamily that includes the catalytic domains of other protein serine/threonine kinases, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	287
270914	cd14012	PK_eIF2AK_GCN2_rpt1	Pseudokinase domain, repeat 1, of eukaryotic translation Initiation Factor 2-Alpha Kinase 4 or General Control Non-derepressible-2. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity. EIF2AKs phosphorylate the alpha subunit of eIF-2, resulting in the overall downregulation of protein synthesis. eIF-2 phosphorylation is induced in response to cellular stresses including virus infection, heat shock, nutrient deficiency, and the accumulation of unfolded proteins, among others. There are four distinct kinases that phosphorylate eIF-2 and control protein synthesis under different stress conditions: GCN2, protein kinase regulated by RNA (PKR), heme-regulated inhibitor kinase (HRI), and PKR-like endoplasmic reticulum kinase (PERK). GCN2 is activated by amino acid or serum starvation and UV irradiation. It induces GCN4, a transcriptional activator of amino acid biosynthetic genes, leading to increased production of amino acids under amino acid-deficient conditions. In serum-starved cells, GCN2 activation induces translation of the stress-responsive transcription factor ATF4, while under UV stress, GCN2 triggers transcriptional rescue via NF-kappaB signaling. GCN2 contains an N-terminal RWD, a degenerate kinase-like (repeat 1), the catalytic kinase (repeat 2), a histidyl-tRNA synthetase (HisRS)-like, and a C-terminal ribosome-binding and dimerization (RB/DD) domains. The degenerate pseudokinase domain of GCN2 may function as a regulatory domain. The GCN2 subfamily is part of a larger superfamily that includes the catalytic domains of serine/threonine kinases, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	254
270915	cd14013	STKc_SNT7_plant	Catalytic domain of the Serine/Threonine kinase, Plant SNT7. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. SNT7 is a plant thylakoid-associated kinase that is essential in short- and long-term acclimation responses to cope with various light conditions in order to maintain photosynthetic redox poise for optimal photosynthetic performance. Short-term response involves state transitions over periods of minutes while the long-term response (LTR) occurs over hours to days and involves changing the relative amounts of photosystems I and II. SNT7 acts as a redox sensor and a signal transducer for both responses, which are triggered by the redox state of the plastoquinone (PQ) pool. It is positioned at the top of a phosphorylation cascade that induces state transitions by phosphorylating light-harvesting complex II (LHCII), and triggers the LTR through the phosphorylation of chloroplast proteins. The SNT7 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	318
270916	cd14014	STKc_PknB_like	Catalytic domain of bacterial Serine/Threonine kinases, PknB and similar proteins. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily includes many bacterial eukaryotic-type STKs including Staphylococcus aureus PknB (also called PrkC or Stk1), Bacillus subtilis PrkC, and Mycobacterium tuberculosis Pkn proteins (PknB, PknD, PknE, PknF, PknL, and PknH), among others. S. aureus PknB is the only eukaryotic-type STK present in this species, although many microorganisms encode for several such proteins. It is important for the survival and pathogenesis of S. aureus as it is involved in the regulation of purine and pyrimidine biosynthesis, cell wall metabolism, autolysis, virulence, and antibiotic resistance. M. tuberculosis PknB is essential for growth and it acts on diverse substrates including proteins involved in peptidoglycan synthesis, cell division, transcription, stress responses, and metabolic regulation. B. subtilis PrkC is located at the inner membrane of endospores and functions to trigger spore germination. Bacterial STKs in this subfamily show varied domain architectures. The well-characterized members such as S. aureus and M. tuberculosis PknB, and B. subtilis PrkC, contain an N-terminal cytosolic kinase domain, a transmembrane (TM) segment, and multiple C-terminal extracellular PASTA domains. The PknB subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	260
270917	cd14015	STKc_VRK	Catalytic domain of the Serine/Threonine protein kinase, Vaccinia Related Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. VRKs were initially discovered due to their similarity to vaccinia virus B1R STK, which is important for viral replication. They play important roles in cell signaling, nuclear envelope dynamics, apoptosis, and stress responses. Vertebrates contain three VRK proteins (VRK1, VRK2, and VRK3) while invertebrates, specifically fruit flies and nematodes, seem to carry only a single ortholog. Mutations of VRK in Drosophila and Caenorhabditis elegans showed varying phenotypes ranging from embryonic lethality to mitotic and meiotic defects resulting in sterility. In vertebrates, VRK1 is implicated in cell cycle progression and proliferation, nuclear envelope assembly, and chromatin condensation. VRK2 is involved in modulating JNK signaling. VRK3 is an inactive pseudokinase that inhibits ERK signaling. The VRK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	300
270918	cd14016	STKc_CK1	Catalytic domain of the Serine/Threonine protein kinase, Casein Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CK1 phosphorylates a variety of substrates including enzymes, transcription and splice factors, cytoskeletal proteins, viral oncogenes, receptors, and membrane-associated proteins. There are multiple isoforms of CK1 and in mammals, seven isoforms (alpha, beta, gamma1-3, delta, and epsilon) have been characterized. These isoforms differ mainly in the length and structure of their C-terminal non-catalytic region. Some isoforms have several splice variants such as the long (L) and short (S) variants of CK1alpha. CK1 proteins are involved in the regulation of many cellular processes including membrane transport processes, circadian rhythm, cell division, apoptosis, and the development of cancer and neurodegenerative diseases. The CK1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	266
270919	cd14017	STKc_TTBK	Catalytic domain of the Serine/Threonine protein kinase, Tau-Tubulin Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. TTBK is a neuron-specific kinase that phosphorylates the microtubule-associated protein tau and promotes its aggregation. Higher vertebrates contain two TTBK proteins, TTBK1 and TTBK2, both of which have been implicated in neurodegeneration. TTBK1 has been linked to Alzheimer's disease (AD) while TTBK2 is associated with spinocerebellar ataxia type 11 (SCA11). Both AD and SCA11 patients show the presence of neurofibrillary tangles in the brain. The Drosophila TTBK homolog, Asator, is an essential protein that localizes to the mitotic spindle during mitosis and may be involved in regulating microtubule dynamics and function. The TTBK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	263
270920	cd14018	STKc_PINK1	Catalytic domain of the Serine/Threonine protein kinase, Pten INduced Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PINK1 contains an N-terminal mitochondrial targeting sequence, a catalytic domain, and a C-terminal regulatory region. It plays an important role in maintaining mitochondrial homeostasis. It protects cells against oxidative stress-induced apoptosis by phosphorylating the chaperone TNFR-associated protein 1 (TRAP1), also called Hsp75. Phosphorylated TRAP1 prevents cytochrome c release and peroxide-induced apoptosis. PINK1 interacts with Omi/HtrA2, a serine protease, and Parkin, an E3 ubiquitin ligase, in different pathways to promote mitochondrial health. The parkin gene is the most commonly mutated gene in autosomal recessive familial parkinsonism. Mutations within the catalytic domain of PINK1 are also associated with Parkinson's disease. The PINK1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	313
270921	cd14019	STKc_Cdc7	Catalytic domain of the Serine/Threonine Kinase, Cell Division Cycle 7 kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Cdc7 kinase (or Hsk1 in fission yeast) is a critical regulator in the initiation of DNA replication. It forms a complex with a Dbf4-related regulatory subunit, a cyclin-like molecule that activates the kinase in late G1 phase, and is also referred to as Dbf4-dependent kinase (DDK). Its main targets are mini-chromosome maintenance (MCM) proteins. Cdc7 kinase may also have additional roles in meiosis, checkpoint responses, the maintenance and repair of chromosome structures, and cancer progression. The Cdc7 kinase subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	252
270922	cd14020	STKc_KIS	Catalytic domain of the Serine/Threonine Kinase, Kinase Interacting with Stathmin (also called U2AF homology motif (UHM) kinase 1). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. KIS (or UHMK1) contains an N-terminal kinase domain and a C-terminal domain with a UHM motif, a protein interaction motif initially found in the pre-mRNA splicing factor U2AF. It phosphorylates the splicing factor SF1, which enhances binding to the splice site to promote spliceosome assembly. KIS was first identified as a kinase that interacts with stathmin, a phosphoprotein that plays a role in axon development and microtubule dynamics. It localizes in RNA granules in neurons and is important in neurite outgrowth. The KIS/UHMK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	285
270923	cd14021	ChoK-like_euk	Eukaryotic Choline Kinase and similar proteins. This group is composed of eukaryotic choline kinase, ethanolamine kinase, and similar proteins. ChoK catalyzes the transfer of the gamma-phosphoryl group from ATP (or CTP) to its substrate, choline, producing phosphorylcholine (PCho), a precursor to the biosynthesis of two major membrane phospholipids, phosphatidylcholine (PC), and sphingomyelin (SM). Although choline is the preferred substrate, ChoK also shows substantial activity towards ethanolamine and its N-methylated derivatives. ETNK catalyzes the transfer of the gamma-phosphoryl group from CTP to ethanolamine (Etn), the first step in the CDP-Etn pathway for the formation of the major phospholipid, phosphatidylethanolamine (PtdEtn). Unlike ChoK, ETNK shows specific activity for its substrate and displays negligible activity towards N-methylated derivatives of Etn. ChoK plays an important role in cell signaling pathways and the regulation of cell growth. The ChoK subfamily is part of a larger superfamily that includes the catalytic domains of other kinases, such as the typical serine/threonine/tyrosine protein kinases (PKs), RIO kinases, actin-fragmin kinase (AFK), and phosphoinositide 3-kinase (PI3K).	229
270924	cd14022	PK_TRB2	Pseudokinase domain of Tribbles Homolog 2. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity. TRB2 binds and negatively regulates the mitogen activated protein kinase (MAPK) kinases, MKK7 and MEK1, which are activators of the MAPKs, ERK and JNK. It controls the activation of inflammatory monocytes, which is essential in innate immune responses and the pathogenesis of inflammatory diseases such as atherosclerosis. TRB2 expression is down-regulated in human acute myeloid leukaemia (AML), which may lead to enhanced cell survival and pathogenesis of the disease. TRB2 is one of three Tribbles Homolog (TRB) proteins present in vertebrates that are encoded by three separate genes. TRB proteins interact with many proteins involved in signalling pathways. They play scaffold-like regulatory functions and affect many cellular processes such as mitosis, apoptosis, and gene expression. The TRB2 subfamily is part of a larger superfamily that includes the catalytic domains of serine/threonine kinases, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	242
270925	cd14023	PK_TRB1	Pseudokinase domain of Tribbles Homolog 1. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity. TRB1 interacts directly with the mitogen activated protein kinase (MAPK) kinase MKK4, an activator of JNK. It regulates vascular smooth muscle cell proliferation and chemotaxis through the JNK signaling pathway. It is found to be down-regulated in human acute myeloid leukaemia (AML) and may play a role in the pathogenesis of the disease. It has also been identified as a potential biomarker for antibody-mediated allograft failure. TRB1 is one of three Tribbles Homolog (TRB) proteins present in vertebrates that are encoded by three separate genes. TRB proteins interact with many proteins involved in signalling pathways. They play scaffold-like regulatory functions and affect many cellular processes such as mitosis, apoptosis, and gene expression. The TRB1 subfamily is part of a larger superfamily that includes the catalytic domains of serine/threonine kinases, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	242
270926	cd14024	PK_TRB3	Pseudokinase domain of Tribbles Homolog 3. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity. TRB3 binds and regulates ATF4, p65/RelA, and PKB (or Akt). It negatively regulates ATF4-mediated gene expression including that of CHOP (C/EBP homologous protein) and HO-1, which are both involved in modulating apoptosis. It also inhibits insulin-mediated phosphorylation of PKB and is a possible determinant of insulin resistance and related disorders. In osteoarthritic chondrocytes where it inhibits insulin-like growth factor 1-mediated cell survival, TRB3 is overexpressed, resulting in increased cell death. TRB3 is one of three Tribbles Homolog (TRB) proteins present in vertebrates that are encoded by three separate genes. TRB proteins interact with many proteins involved in signalling pathways. They play scaffold-like regulatory functions and affect many cellular processes such as mitosis, apoptosis, and gene expression. The TRB3 subfamily is part of a larger superfamily that includes the catalytic domains of serine/threonine kinases, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	242
270927	cd14025	STKc_RIP4_like	Catalytic domain of the Serine/Threonine kinases, Receptor Interacting Protein 4 and similar proteins. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of RIP4, ankyrin (ANK) repeat and kinase domain containing 1 (ANKK1), and similar proteins, all of which harbor C-terminal ANK repeats. RIP4, also called Protein Kinase C-associated kinase (PKK), regulates keratinocyte differentiation and cutaneous inflammation. It activates NF-kappaB and is important in the survival of diffuse large B-cell lymphoma cells. The ANKK1 protein, also called PKK2, has not been studied extensively. The ANKK1 gene, located less than 10kb downstream of the D2 dopamine receptor (DRD2) locus, is altered in the Taq1 A1 polymorphism, which is related to a reduced DRD2 binding affinity and consequently, to mental disorders. The RIP4-like subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	267
270928	cd14026	STKc_RIP2	Catalytic domain of the Serine/Threonine kinase, Receptor Interacting Protein 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. RIP2, also called RICK or CARDIAK, harbors a C-terminal Caspase Activation and Recruitment domain (CARD) belonging to the Death domain (DD) superfamily. It functions as an effector kinase downstream of the pattern recognition receptors from the Nod-like (NLR) family, Nod1 and Nod2, which recognize bacterial peptidoglycans released upon infection. RIP2 may also be involved in regulating wound healing and keratinocyte proliferation. RIP kinases serve as essential sensors of cellular stress. The RIP2 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	284
270929	cd14027	STKc_RIP1	Catalytic domain of the Serine/Threonine kinase, Receptor Interacting Protein 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. RIP1 harbors a C-terminal Death domain (DD), which binds death receptors (DRs) including TNF receptor 1, Fas, TNF-related apoptosis-inducing ligand receptor 1 (TRAILR1), and TRAILR2. It also interacts with other DD-containing adaptor proteins such as TRADD and FADD. RIP1 can also recruit other kinases including MEKK1, MEKK3, and RIP3 through an intermediate domain (ID) that bears a RIP homotypic interaction motif (RHIM). RIP1 plays a crucial role in determining a cell's fate, between survival or death, following exposure to stress signals. It is important in the signaling of NF-kappaB and MAPKs, and it links DR-associated signaling to reactive oxygen species (ROS) production. Abnormal RIP1 function may result in ROS accumulation affecting inflammatory responses, innate immunity, stress responses, and cell survival. RIP kinases serve as essential sensors of cellular stress. The RIP1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	267
270930	cd14028	STKc_Bub1_vert	Catalytic domain of the Serine/Threonine kinase, Vertebrate Spindle assembly checkpoint protein Bub1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Bub1 (Budding uninhibited by benzimidazoles 1) contains an N-terminal Bub1/Mad3 homology domain essential for Cdc20 binding, a GLEBS motif for Bub3/kinetochore binding, and a C-terminal kinase domain. It is involved in SAC, a surveillance system that delays metaphase to anaphase transition by blocking the activity of APC/C (the anaphase promoting complex) until all chromosomes achieve proper attachments to the mitotic spindle, to avoid chromosome missegregation. Bub1 contributes to the inhibition of APC/C by phosphorylating its crucial cofactor, Cdc20, rendering it unable to activate APC/C. In addition, Bub1 facilitates the localization to kinetochores of other SAC and motor proteins including Mad1, Mad2, BubR1, and Plk1. It acts as the master organizer of the functional inner centromere. Bub1 also plays roles in protecting sister chromatid cohesion and normal metaphase congression. The Bub1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	290
270931	cd14029	STKc_BubR1_vert	Catalytic domain of the Serine/Threonine kinase, Vertebrate Spindle assembly checkpoint protein BubR1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. BubR1 (Budding uninhibited by benzimidazoles R1) is also called Bub1 beta (Bub1b). It contains an N-terminal Bub1/Mad3 homology domain essential for Cdc20 binding and a C-terminal kinase domain. It is involved in SAC, a surveillance system that delays metaphase to anaphase transition by blocking the activity of APC/C (the anaphase promoting complex) until all chromosomes achieve proper attachments to the mitotic spindle, to avoid chromosome missegregation. BubR1 inhibits APC/C through direct binding. It also plays an important role in stabilizing kinetochore-microtubule attachments. Mutant mice expressing only 10% normal BubR1 protein are viable and develop into adult mice, but display many early aging-associated phenotypes including reduced lifespan, muscle atrophy, cataracts, impaired wound healing, and infertility. The BubR1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	304
270932	cd14030	STKc_WNK1	Catalytic domain of the Serine/Threonine protein kinase, With No Lysine (WNK) 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. WNK1 is widely expressed and is most abundant in the testis. In hyperosmotic or hypotonic low-chloride stress conditions, WNK1 is activated and it phosphorylates its substrates including SPAK and OSR1 kinases, which regulate the activity of cation-chloride cotransporters through direct interaction and phosphorylation. Mutations in WNK1 cause PseudoHypoAldosteronism type II (PHAII), characterized by hypertension and hyperkalemia. WNK1 negates WNK4-mediated inhibition of the sodium-chloride cotransporter NCC and activates the epithelial sodium channel ENaC by activating SGK1. WNK1 also decreases the surface expression of renal outer medullary potassium channel (ROMK) by stimulating their endocytosis. Hypertension and hyperkalemia in PHAII patients with WNK1 mutations may be due partly to increased activity of NCC and ENaC, and impaired renal potassium secretion by ROMK, respectively. In addition, WNK1 interacts with MEKK2/3 and acts as an activator of extracellular signal-regulated kinase (ERK) 5. It also negatively regulates TGFbeta signaling. WNKs comprise a subfamily of STKs with an unusual placement of the catalytic lysine relative to all other protein kinases. The WNK1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	289
270933	cd14031	STKc_WNK3	Catalytic domain of the Serine/Threonine protein kinase, With No Lysine (WNK) 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. WNK3 shows a restricted expression pattern; it is found at high levels in the pituitary glands and is also expressed in the kidney and brain. It has been shown to regulate many ion transporters including members of the SLC12A family of cation-chloride cotransporters such as NCC and NKCC2, the renal potassium channel ROMK, and the epithelial calcium channels TRPV5 and TRPV6. WNK3 appears to sense low-chloride hypotonic stress and under these conditions, it activates SPAK, which directly interacts and phosphorylates cation-chloride cotransporters. WNK3 has also been shown to promote cell survival, possibly through interaction with procaspase-3 and HSP70. WNKs comprise a subfamily of STKs with an unusual placement of the catalytic lysine relative to all other protein kinases. The WNK3 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	275
270934	cd14032	STKc_WNK2_like	Catalytic domain of With No Lysine (WNK) 2-like Serine/Threonine kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. WNK2 is widely expressed and has been shown to be epigenetically silenced in gliomas. It inhibits cell growth by acting as a negative regulator of MEK1-ERK1/2 signaling. WNK2 modulates growth factor-induced cancer cell proliferation, suggesting that it may be a tumor suppressor gene. WNKs comprise a subfamily of STKs with an unusual placement of the catalytic lysine relative to all other protein kinases. They are critical in regulating ion balance and are thus important components in the control of blood pressure. The WNK2-like subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	266
270935	cd14033	STKc_WNK4	Catalytic domain of the Serine/Threonine protein kinase, With No Lysine (WNK) 4. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. WNK4 shows a restricted expression pattern and is usually found in epithelial cells. It is expressed in nephrons and in extrarenal tissues including intestine, eye, mammary glands, and prostate. WNK4 regulates a variety of ion transport proteins including apical or basolateral ion transporters, ion channels in the transcellular pathway, and claudins in the paracellular pathway. Mutations in WNK4 cause PseudoHypoAldosteronism type II (PHAII), characterized by hypertension and hyperkalemia. WNK4 inhibits the activity of the thiazide-sensitive Na-Cl cotransporter (NCC), which is responsible for about 15% of NaCl reabsorption in the kidney. It also inhibits the renal outer medullary potassium channel (ROMK) and decreases its surface expression. Hypertension and hyperkalemia in PHAII patients with WNK4 mutations may be partly due to increased NaCl reabsorption through NCC and impaired renal potassium secretion by ROMK, respectively. The WNK4 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	261
270936	cd14034	PK_NRBP1	Pseudokinase domain of Nuclear Receptor Binding Protein 1. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity and/or ATP binding. NRBP1, also called MLF1-adaptor molecule (MADM), was originally named based on the presence of nuclear binding and localization motifs prior to functional analyses. It is expressed ubiquitously and is found to localize in the cytoplasm, not the nucleus. NRBP1 is an adaptor protein that interacts with myeloid leukemia factor 1 (MLF1), an oncogene that enhances myeloid development of hematopoietic cells. It also interacts with the small GTPase Rac3. NRBP1 may also be involved in Golgi to ER trafficking and actin dynamics. The NRBP1-like subfamily is part of a larger superfamily that includes the catalytic domains of serine/threonine kinases, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	277
270937	cd14035	PK_MADML	Pseudokinase domain of MLF1-ADaptor Molecule-Like. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity and/or ATP binding. MADML has been shown to be expressed throughout development in Xenopus laevis with highest expression found in the developing lens and retina. It may play an important role in embryonic eye development. The MADML subfamily is part of a larger superfamily that includes the catalytic domains of serine/threonine kinases, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	263
270938	cd14036	STKc_GAK	Catalytic domain of the Serine/Threonine protein kinase, cyclin G-Associated Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. GAK, also called auxilin-2, contains an N-terminal kinase domain that phosphorylates the mu subunits of adaptor protein (AP) 1 and AP2. In addition, it contains an auxilin-1-like domain structure consisting of PTEN-like, clathrin-binding, and J domains. Like auxilin-1, GAK facilitates Hsc70-mediated dissociation of clathrin from clathrin-coated vesicles. GAK is expressed ubiquitously and is enriched in the Golgi, unlike auxilin-1 which is nerve-specific. GAK also plays regulatory roles outside of clathrin-mediated membrane traffic including the maintenance of centrosome integrity and chromosome congression, neural patterning, survival of neurons, and immune responses through interaction with the interleukin 12 receptor. It also interacts with the androgen receptor, acting as a transcriptional coactivator, and its expression is significantly increased with the progression of prostate cancer. The GAK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	282
270939	cd14037	STKc_NAK_like	Catalytic domain of Numb-Associated Kinase (NAK)-like Serine/Threonine kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of Drosophila melanogaster NAK, human BMP-2-inducible protein kinase (BMP2K or BIKe) and similar vertebrate proteins, as well as the Saccharomyces cerevisiae proteins Prk1, Actin-regulating kinase 1 (Ark1), and Akl1. NAK was the first characterized member of this subfamily. It plays a role in asymmetric cell division through its association with Numb. It also regulates the localization of Dlg, a protein essential for septate junction formation. BMP2K contains a nuclear localization signal and a kinase domain that is capable of phosphorylating itself and myelin basic protein. The expression of the BMP2K gene is increased during BMP-2-induced osteoblast differentiation. It may function to control the rate of differentiation. Prk1, Ark1, and Akl1 comprise a subfamily of yeast proteins that are important regulators of the actin cytoskeleton and endocytosis. They share an N-terminal kinase domain but no significant homology in other regions of their sequences. The NAK-like subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	277
270940	cd14038	STKc_IKK_beta	Catalytic domain of the Serine/Threonine kinase, Inhibitor of Nuclear Factor-KappaB Kinase (IKK) beta. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. IKKbeta is involved in the classical pathway of regulating Nuclear Factor-KappaB (NF-kB) proteins, a family of transcription factors which are critical in many cellular functions including inflammatory responses, immune development, cell survival, and cell proliferation, among others. The classical pathway regulates the majority of genes activated by NF-kB including those encoding cytokines, chemokines, leukocyte adhesion molecules, and anti-apoptotic factors. It involves NEMO (NF-kB Essential MOdulator)- and IKKbeta-dependent phosphorylation and degradation of the Inhibitor of NF-kB (IkB), which liberates NF-kB dimers (typified by the p50-p65 heterodimer) from an inactive IkB/dimeric NF-kB complex, enabling them to migrate to the nucleus where they regulate gene transcription. The IKKbeta subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	290
270941	cd14039	STKc_IKK_alpha	Catalytic domain of the Serine/Threonine kinase, Inhibitor of Nuclear Factor-KappaB Kinase (IKK) alpha. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. IKKalpha is involved in the non-canonical or alternative pathway of regulating Nuclear Factor-KappaB (NF-kB) proteins, a family of transcription factors which are critical in many cellular functions including inflammatory responses, immune development, cell survival, and cell proliferation, among others. The non-canonical pathway functions in cells lacking NEMO (NF-kB Essential MOdulator) and IKKbeta. It is induced by a subset of TNFR family members including CD40, RANK, and B cell-activating factor receptor. IKKalpha processes the Inhibitor of NF-kB (IkB)-like C-terminus of NF-kB2/p100 to produce p52, allowing the p52/RelB dimer to migrate to the nucleus. This pathway is dependent on NIK (NF-kB Inducing Kinase) which phosphorylates and activates IKKalpha. The IKKalpha subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	289
270942	cd14040	STKc_TLK1	Catalytic domain of the Serine/Threonine kinase, Tousled-Like Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. A splice variant of TLK1, called TLK1B, is expressed in the presence of double strand breaks (DSBs). It lacks the N-terminal part of TLK1, but is expected to phosphorylate the same substrates. TLK1/1B interacts with Rad9, which is critical in DNA damage-activated checkpoint response, and plays a role in the repair of linearized DNA with incompatible ends. TLKs play important functions during the cell cycle and are implicated in chromatin remodeling, DNA replication and repair, and mitosis. The TLK1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	299
270943	cd14041	STKc_TLK2	Catalytic domain of the Serine/Threonine kinase, Tousled-Like Kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. TLKs play important functions during the cell cycle and are implicated in chromatin remodeling, DNA replication and repair, and mitosis. They phosphorylate and regulate Anti-silencing function 1 protein (Asf1), a histone H3/H4 chaperone that helps facilitate the assembly of chromatin following DNA replication during S phase. TLKs also phosphorylate the H3 histone tail and are essential in transcription. Vertebrates contain two subfamily members, TLK1 and TLK2. The TLK2 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase (PI3K).	309
270944	cd14042	PK_GC-A_B	Pseudokinase domain of the membrane Guanylate Cyclase receptors, GC-A and GC-B. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity and/or ATP binding. GC-A binds and is activated by the atrial and B-type natriuretic peptides, ANP and BNP, which are important in blood pressure regulation and cardiac pathophysiology. GC-B binds the C-type natriuretic peptide, CNP, which is a potent vasorelaxant and functions in vascular remodeling and bone growth regulation. Membrane (or particulate) GCs consist of an extracellular ligand-binding domain, a single transmembrane region, and an intracellular tail that contains a PK-like domain, an amphipathic region and a catalytic GC domain that catalyzes the conversion of GTP into cGMP and pyrophosphate. Membrane GCs act as receptors that transduce an extracellular signal to the intracellular production of cGMP, which has been implicated in many processes including cell proliferation, phototransduction, and muscle contractility, through its downstream effectors such as PKG. The PK-like domain of GCs functions as a negative regulator of the catalytic GC domain and may also act as a docking site for interacting proteins such as GC-activating proteins. The GC-A/B subfamily is part of a larger superfamily that includes the catalytic domains of protein serine/threonine kinases, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	279
270945	cd14043	PK_GC-2D	Pseudokinase domain of the membrane Guanylate Cyclase receptor, GC-2D. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity and/or ATP binding. GC-2D is also called Retinal Guanylyl Cyclase 1 (RETGC-1) or Rod Outer Segment membrane Guanylate Cyclase (ROS-GC). It is found in the photoreceptors of the retina where it anchors the reciprocal feedback loop between calcium and cGMP, which regulates the dark, light, and recovery phases in phototransduction. It is also found in other sensory neurons and may be a universal transduction component that plays a role in the perception of all senses. Membrane (or particulate) GCs consist of an extracellular ligand-binding domain, a single transmembrane region, and an intracellular tail that contains a PK-like domain, an amphipathic region and a catalytic GC domain that catalyzes the conversion of GTP into cGMP and pyrophosphate. Membrane GCs act as receptors that transduce an extracellular signal to the intracellular production of cGMP, which has been implicated in many processes including cell proliferation, phototransduction, and muscle contractility, through its downstream effectors such as PKG. The PK-like domain of GCs functions as a negative regulator of the catalytic GC domain and may also act as a docking site for interacting proteins such as GC-activating proteins. The GC-2D subfamily is part of a larger superfamily that includes the catalytic domains of protein serine/threonine kinases, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	267
270946	cd14044	PK_GC-C	Pseudokinase domain of the membrane Guanylate Cyclase receptor, GC-C. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity and/or ATP binding. GC-C binds and is activated by the intestinal hormones, guanylin (GN) and uroguanylin (UGN), which are secreted after salty meals to inhibit sodium absorption and induce the secretion of chloride, bicarbonate, and water. GN and UGN are also present in the kidney, where they induce increased salt and water secretion. This prevents the development of hypernatremia and hypervolemia after ingestion of high amounts of salt. Membrane (or particulate) GCs consist of an extracellular ligand-binding domain, a single transmembrane region, and an intracellular tail that contains a PK-like domain, an amphipathic region and a catalytic GC domain that catalyzes the conversion of GTP into cGMP and pyrophosphate. Membrane GCs act as receptors that transduce an extracellular signal to the intracellular production of cGMP, which has been implicated in many processes including cell proliferation, phototransduction, and muscle contractility, through its downstream effectors such as PKG. The PK-like domain of GCs functions as a negative regulator of the catalytic GC domain and may also act as a docking site for interacting proteins such as GC-activating proteins. The GC-C subfamily is part of a larger superfamily that includes the catalytic domains of protein serine/threonine kinases, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	271
270947	cd14045	PK_GC_unk	Pseudokinase domain of the unknown subfamily of membrane Guanylate Cyclase receptors. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity. Membrane (or particulate) GCs consist of an extracellular ligand-binding domain, a single transmembrane region, and an intracellular tail that contains a PK-like domain, an amphipathic region and a catalytic GC domain that catalyzes the conversion of GTP into cGMP and pyrophosphate. Membrane GCs act as receptors that transduce an extracellular signal to the intracellular production of cGMP, which has been implicated in many processes including cell proliferation, phototransduction, and muscle contractility, through its downstream effectors such as PKG. The PK-like domain of GCs lacks a critical aspartate involved in ATP binding and does not exhibit kinase activity. It functions as a negative regulator of the catalytic GC domain and may also act as a docking site for interacting proteins such as GC-activating proteins. The GC subfamily is part of a larger superfamily that includes the catalytic domains of protein serine/threonine kinases, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	269
270948	cd14046	STKc_EIF2AK4_GCN2_rpt2	Catalytic domain, repeat 2, of the Serine/Threonine kinase, eukaryotic translation Initiation Factor 2-Alpha Kinase 4 or General Control Non-derepressible-2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. GCN2 (or EIF2AK4) is activated by amino acid or serum starvation and UV irradiation. It induces GCN4, a transcriptional activator of amino acid biosynthetic genes, leading to increased production of amino acids under amino acid-deficient conditions. In serum-starved cells, GCN2 activation induces translation of the stress-responsive transcription factor ATF4, while under UV stress, GCN2 triggers transcriptional rescue via NF-kB signaling. GCN2 contains an N-terminal RWD, a degenerate kinase-like (repeat 1), the catalytic kinase (repeat 2), a histidyl-tRNA synthetase (HisRS)-like, and a C-terminal ribosome-binding and dimerization (RB/DD) domains. Its kinase domain is activated via conformational changes as a result of the binding of uncharged tRNA to the HisRS-like domain. EIF2AKs phosphorylate the alpha subunit of eIF-2, resulting in the overall downregulation of protein synthesis. The GCN2 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	278
270949	cd14047	STKc_EIF2AK2_PKR	Catalytic domain of the Serine/Threonine kinase, eukaryotic translation Initiation Factor 2-Alpha Kinase 2 or Protein Kinase regulated by RNA. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PKR (or EIF2AK2) contains an N-terminal double-stranded RNA (dsRNA) binding domain and a C-terminal catalytic kinase domain. It is activated by dsRNA, which is produced as a replication intermediate in virally infected cells. It plays a key role in mediating innate immune responses to viral infection. PKR is also directly activated by PACT (protein activator of PKR) and heparin, and is inhibited by viral proteins and RNAs. PKR also regulates transcription and signal transduction in diseased cells, playing roles in tumorigenesis and neurodegenerative diseases. EIF2AKs phosphorylate the alpha subunit of eIF-2, resulting in the downregulation of protein synthesis. The PKR subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	267
270950	cd14048	STKc_EIF2AK3_PERK	Catalytic domain of the Serine/Threonine kinase, eukaryotic translation Initiation Factor 2-Alpha Kinase 3 or PKR-like Endoplasmic Reticulum Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PERK (or EIF2AK3) is a type-I ER transmembrane protein containing a luminal domain bound with the chaperone BiP under unstressed conditions and a cytoplasmic catalytic kinase domain. In response to the accumulation of misfolded or unfolded proteins in the ER, PERK is activated through the release of BiP, allowing it to dimerize and autophosphorylate. It functions as the central regulator of translational control during the Unfolded Protein Response (UPR) pathway. In addition to the eIF-2 alpha subunit, PERK also phosphorylates Nrf2, a leucine zipper transcription factor which regulates cellular redox status and promotes cell survival during the UPR. EIF2AKs phosphorylate the alpha subunit of eIF-2, resulting in the downregulation of protein synthesis. The PERK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	281
270951	cd14049	STKc_EIF2AK1_HRI	Catalytic domain of the Serine/Threonine kinase, eukaryotic translation Initiation Factor 2-Alpha Kinase 1 or Heme-Regulated Inhibitor kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. HRI (or EIF2AK1) contains an N-terminal regulatory heme-binding domain and a C-terminal catalytic kinase domain. It is suppressed under normal conditions by binding of the heme iron, and is activated during heme deficiency. It functions as a critical regulator that ensures balanced synthesis of globins and heme, in order to form stable hemoglobin during erythroid differentiation and maturation. HRI also protects cells and enhances survival under iron-deficient conditions. EIF2AKs phosphorylate the alpha subunit of eIF-2, resulting in the downregulation of protein synthesis. The HRI subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	284
270952	cd14050	PKc_Myt1	Catalytic domain of the Dual-specificity protein kinase, Myt1. Dual-specificity PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine as well as tyrosine residues on protein substrates. Myt1 is a cytoplasmic cell cycle checkpoint kinase that can keep the cyclin-dependent kinase CDK1 in an inactive state through phosphorylation of N-terminal thr (T14) and tyr (Y15) residues, leading to the delay of meiosis I entry. Meiotic progression is ensured by a two-step inhibition and downregulation of Myt1 by CDK1/XRINGO and p90Rsk during oocyte maturation. In addition, Myt1 targets cyclin B1/B2 and is essential for Golgi and ER assembly during telophase. In Drosophila, Myt1 may be a downstream target of Notch during eye development. The Myt1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein serine/threonine PKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	249
270953	cd14051	PTKc_Wee1	Catalytic domain of the Protein Tyrosine Kinase, Wee1. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Wee1 is a nuclear cell cycle checkpoint kinase that helps keep the cyclin-dependent kinase CDK1 in an inactive state through phosphorylation of an N-terminal tyr (Y15) residue. During the late G2 phase, CDK1 is activated and mitotic entry is promoted by the removal of this inhibitory phosphorylation by the phosphatase Cdc25. Although Wee1 is functionally a tyr kinase, it is more closely related to serine/threonine kinases (STKs). It contains a catalytic kinase domain sandwiched in between N- and C-terminal regulatory domains. It is regulated by phosphorylation and degradation, and its expression levels are also controlled by circadian clock proteins. There are two distinct Wee1 proteins in vertebrates showing different expression patterns, called Wee1a and Wee1b. They are functionally distinct and are implicated in different steps of egg maturation and embryo development. The Wee1 subfamily is part of a larger superfamily that includes the catalytic domains of STKs, other PTKs, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	275
270954	cd14052	PTKc_Wee1_fungi	Catalytic domain of the Protein Tyrosine Kinases, Fungal Wee1 proteins. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. This subfamily is composed of fungal Wee1 proteins, also called Swe1 in budding yeast and Mik1 in fission yeast. Yeast Wee1 is required to control cell size. Wee1 is a cell cycle checkpoint kinase that helps keep the cyclin-dependent kinase CDK1 in an inactive state through phosphorylation of an N-terminal tyr (Y15) residue. During the late G2 phase, CDK1 is activated and mitotic entry is promoted by the removal of this inhibitory phosphorylation by the phosphatase Cdc25. Although Wee1 is functionally a tyr kinase, it is more closely related to serine/threonine kinases (STKs). It contains a catalytic kinase domain sandwiched in between N- and C-terminal regulatory domains. It is regulated by phosphorylation and degradation, and its expression levels are also controlled by circadian clock proteins. The fungal Wee1 subfamily is part of a larger superfamily that includes the catalytic domains of STKs, other PTKs, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	278
270955	cd14053	STKc_ACVR2	Catalytic domain of the Serine/Threonine Kinase, Activin Type II Receptor. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. ACVR2 belongs to a group of receptors for the TGFbeta family of secreted signaling molecules that includes TGFbeta, bone morphogenetic proteins (BMPs), activins, growth and differentiation factors (GDFs), and anti-Mullerian hormone, among others. These receptors contain an extracellular domain that binds ligands, a single transmembrane region, and a cytoplasmic catalytic kinase domain. Type II receptors, such as ACVR2, are high-affinity receptors which bind ligands, autophosphorylate, as well as trans-phosphorylate and activate low-affinity type I receptors. ACVR2 acts primarily as the receptor for activins, nodal, myostatin, GDF11, and a subset of BMPs. ACVR2 signaling impacts many cellular and physiological processes including reproductive and gonadal functions, myogenesis, bone remodeling and tooth development, kidney organogenesis, apoptosis, fibrosis, inflammation, and neurogenesis. Vertebrates contain two ACVR2 proteins, ACVR2a (or ActRIIA) and ACVR2b (or ActRIIB). The ACVR2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	290
270956	cd14054	STKc_BMPR2_AMHR2	Catalytic domain of the Serine/Threonine Kinases, Bone Morphogenetic Protein and Anti-Muellerian Hormone Type II Receptors. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. BMPR2 and AMHR2 belong to a group of receptors for the TGFbeta family of secreted signaling molecules that includes TGFbeta, BMPs, activins, growth and differentiation factors (GDFs), and AMH, among others. These receptors contain an extracellular domain that binds ligands, a single transmembrane region, and a cytoplasmic catalytic kinase domain. Type II receptors are high-affinity receptors which bind ligands, autophosphorylate, as well as trans-phosphorylate and activate low-affinity type I receptors. BMPR2 and AMHR2 act primarily as receptors for BMPs and AMH, respectively. BMPs induce bone and cartilage formation, as well as regulate tooth, kidney, skin, hair, haematopoietic, and neuronal development. Mutations in BMPR2A are associated with familial pulmonary arterial hypertension. AMH is mainly responsible for the regression of Mullerian ducts during male sex differentiation. It is expressed exclusively by somatic cells of the gonads. Mutations in either AMH or AMHR2 cause persistent Mullerian duct syndrome (PMDS), a rare form of male pseudohermaphroditism characterized by the presence of Mullerian derivatives (ovary and tubes) in otherwise normally masculine males. The BMPR2/AMHR2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	300
270957	cd14055	STKc_TGFbR2_like	Catalytic domain of the Serine/Threonine Kinase, Transforming Growth Factor beta Type II Receptor. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. TGFbR2 belongs to a group of receptors for the TGFbeta family of secreted signaling molecules that includes TGFbeta, bone morphogenetic proteins, activins, growth and differentiation factors, and anti-Mullerian hormone, among others. These receptors contain an extracellular domain that binds ligands, a single transmembrane region, and a cytoplasmic catalytic kinase domain. Type II receptors, such as TGFbR2, are high-affinity receptors which bind ligands, autophosphorylate, as well as trans-phosphorylate and activate low-affinity type I receptors. TGFbR2 acts as the receptor for TGFbeta, which is crucial in growth control and homeostasis in many different tissues. It plays roles in regulating apoptosis and in maintaining the balance between self renewal and cell loss. It also plays a key role in maintaining vascular integrity and in regulating responses to genotoxic stress. Mutations in TGFbR2 can cause aortic aneurysm disorders such as Loeys-Dietz and Marfan syndromes. The TGFbR2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	295
270958	cd14056	STKc_TGFbR_I	Catalytic domain of the Serine/Threonine Kinases, Transforming Growth Factor beta family Type I Receptors. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of type I receptors for the TGFbeta family of secreted signaling molecules including TGFbeta, bone morphogenetic proteins, activins, growth and differentiation factors, and anti-Mullerian hormone, among others. These receptors contain an extracellular domain that binds ligands, a single transmembrane (TM) region, and a cytoplasmic catalytic kinase domain. Type I receptors are low-affinity receptors that bind ligands only after they are recruited by the ligand/type II high-affinity receptor complex. Following activation through trans-phosphorylation by type II receptors, they start intracellular signaling to the nucleus by phosphorylating SMAD proteins. Type I receptors contain an additional domain located between the TM and kinase domains called the GS domain, which contains the activating phosphorylation site and confers preference for specific SMAD proteins. They are inhibited by the immunophilin FKBP12, which is thought to control leaky signaling caused by receptor oligomerization in the absence of ligand. The TGFbR-I subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	287
270959	cd14057	PK_ILK	Pseudokinase domain of Integrin Linked Kinase. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity. ILK contains N-terminal ankyrin repeats, a Pleckstrin Homology (PH) domain, and a C-terminal pseudokinase domain. It is a component of the IPP (ILK/PINCH/Parvin) complex that couples beta integrins to the actin cytoskeleton, and plays important roles in cell adhesion, spreading, invasion, and migration. ILK was initially thought to be an active kinase despite the lack of key conserved residues because of in vitro studies showing that it can phosphorylate certain protein substrates. However, in vivo experiments in Caenorhabditis elegans, Drosophila melanogaster, and mice (ILK-null and knock-in) proved that ILK is not an active kinase. In addition to actin cytoskeleton regulation, ILK also influences the microtubule network and mitotic spindle orientation. The pseudokinase domain of ILK binds several adaptor proteins including the parvins and paxillin. The ILK subfamily is part of a larger superfamily that includes the catalytic domains of protein serine/threonine kinases, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	251
270960	cd14058	STKc_TAK1	Catalytic domain of the Serine/Threonine Kinase, Transforming Growth Factor beta Activated Kinase-1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. TAK1 is also known as mitogen-activated protein kinase kinase kinase 7 (MAPKKK7 or MAP3K7), TAK, or MEKK7. As a MAPKKK, it is an important mediator of cellular responses to extracellular signals. It regulates both the c-Jun N-terminal kinase and p38 MAPK cascades by activating the MAPK kinases, MKK4 and MKK3/6. In addition, TAK1 plays diverse roles in immunity and development, in different biological contexts, through many signaling pathways including TGFbeta/BMP, Wnt/Fz, and NF-kB. It is also implicated in the activation of the tumor suppressor kinase, LKB1. The TAK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	253
270961	cd14059	STKc_MAP3K12_13	Catalytic domain of the Serine/Threonine Kinases, Mitogen-Activated Protein Kinase Kinase Kinases 12 and 13. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MAP3K12 is also called MAPK upstream kinase (MUK), dual leucine zipper-bearing kinase (DLK) or leucine-zipper protein kinase (ZPK). It is involved in the c-Jun N-terminal kinase (JNK) pathway that directly regulates axonal regulation through the phosphorylation of microtubule-associated protein 1B (MAP1B). It also regulates the differentiation of many cell types including adipocytes and may play a role in adipogenesis. MAP3K13, also called leucine zipper-bearing kinase (LZK), directly phosphorylates and activates MKK7, which in turn activates the JNK pathway. It also activates NF-kB through IKK activation and this activity is enhanced by antioxidant protein-1 (AOP-1). MAP3Ks (MKKKs or MAPKKKs) phosphorylate and activate MAP2Ks (MAPKKs or MKKs), which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. The MAP3K12/13 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	237
270962	cd14060	STKc_MLTK	Catalytic domain of the Serine/Threonine Kinase, Mixed lineage kinase-Like mitogen-activated protein Triple Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MLTK, also called zipper sterile-alpha-motif kinase (ZAK), contains a catalytic kinase domain and a leucine zipper. There are two alternatively-spliced variants, MLTK-alpha and MLTK-beta. MLTK-alpha contains a sterile-alpha-motif (SAM) at the C-terminus. MLTK regulates the c-Jun N-terminal kinase, extracellular signal-regulated kinase, p38 MAPK, and NF-kB pathways. ZAK is the MAP3K involved in the signaling cascade that leads to the ribotoxic stress response initiated by cellular damage due to Shiga toxins and ricin. It may also play a role in cell transformation and cancer development. MAP3Ks (MKKKs or MAPKKKs) phosphorylate and activate MAPK kinases (MAPKKs or MKKs or MAP2Ks), which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. The MLTK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	242
270963	cd14061	STKc_MLK	Catalytic domain of the Serine/Threonine Kinases, Mixed Lineage Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MLKs act as mitogen-activated protein kinase kinase kinases (MAP3Ks, MKKKs, MAPKKKs), which phosphorylate and activate MAPK kinases (MAPKKs or MKKs or MAP2Ks), which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. Mammals have four MLKs (MLK1-4), mostly conserved in vertebrates, which contain an SH3 domain, a catalytic kinase domain, a leucine zipper, a proline-rich region, and a CRIB domain that mediates binding to GTP-bound Cdc42 and Rac. MLKs play roles in immunity and inflammation, as well as in cell death, proliferation, and cell cycle regulation. The MLK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	258
270964	cd14062	STKc_Raf	Catalytic domain of the Serine/Threonine Kinases, Raf (Rapidly Accelerated Fibrosarcoma) kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Raf kinases act as mitogen-activated protein kinase kinase kinases (MAP3Ks, MKKKs, MAPKKKs), which phosphorylate and activate MAPK kinases (MAPKKs or MKKs or MAP2Ks), which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. They function in the linear Ras-Raf-MEK-ERK pathway that regulates many cellular processes including cell cycle regulation, proliferation, differentiation, survival, and apoptosis. Aberrant expression or activation of components in this pathway is associated with tumor initiation, progression, and metastasis. Raf proteins contain a Ras binding domain, a zinc finger cysteine-rich domain, and a catalytic kinase domain. Vertebrates have three Raf isoforms (A-, B-, and C-Raf) with different expression profiles, modes of regulation, and abilities to function in the ERK cascade, depending on cellular context and stimuli. They have essential and non-overlapping roles during embryo- and organogenesis. Knockout of each isoform results in a lethal phenotype or abnormality in most mouse strains. The Raf subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	253
270965	cd14063	PK_KSR	Pseudokinase domain of Kinase Suppressor of Ras. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity. KSR is a scaffold protein that functions downstream of Ras and upstream of Raf in the Extracellular signal-Regulated Kinase (ERK) pathway that regulates many cellular processes including cell cycle regulation, proliferation, differentiation, survival, and apoptosis. KSR proteins regulate the assembly and activation of the Raf/MEK/ERK module upon Ras activation at the membrane by direct association of its components. They are widely regarded as pseudokinases, but there is some debate about this designation as a few groups have reported detecting kinase catalytic activity for KSRs, specifically KSR1. Vertebrates contain two KSR proteins, KSR1 and KSR2. The KSR subfamily is part of a larger superfamily that includes the catalytic domains of other protein kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	271
270966	cd14064	PKc_TNNI3K	Catalytic domain of the Dual-specificity protein kinase, TNNI3-interacting kinase. Dual-specificity PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine as well as tyrosine residues on protein substrates. TNNI3K, also called cardiac ankyrin repeat kinase (CARK), is a cardiac-specific troponin I-interacting kinase that promotes cardiac myogenesis, improves cardiac performance, and protects the myocardium from ischemic injury. It contains N-terminal ankyrin repeats, a catalytic kinase domain, and a C-terminal serine-rich domain. TNNI3K exerts a disease-accelerating effect on cardiac dysfunction and reduced survival in mouse models of cardiomyopathy. The TNNI3K subfamily is part of a larger superfamily that includes the catalytic domains of other protein serine/threonine PKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	254
270967	cd14065	PKc_LIMK_like	Catalytic domain of the LIM domain kinase-like protein kinases. PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine or tyrosine residues on protein substrates. Members of this subfamily include LIMK, Testicular or testis-specific protein kinase (TESK), and similar proteins. LIMKs are characterized as serine/threonine kinases (STKs) while TESKs are dual-specificity protein kinases. Both LIMK and TESK phosphorylate and inactivate cofilin, an actin depolymerizing factor, to induce the reorganization of the actin cytoskeleton. They are implicated in many cellular functions including cell spreading, motility, morphogenesis, meiosis, mitosis, and spermatogenesis. The LIMK-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	252
270968	cd14066	STKc_IRAK	Catalytic domain of the Serine/Threonine kinases, Interleukin-1 Receptor Associated Kinases and related STKs. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. IRAKs are involved in Toll-like receptor (TLR) and interleukin-1 (IL-1) signalling pathways, and are thus critical in regulating innate immune responses and inflammation. Some IRAKs may also play roles in T- and B-cell signaling, and adaptive immunity. Vertebrates contain four IRAKs (IRAK-1, -2, -3 (or -M), and -4) that display distinct functions and patterns of expression and subcellular distribution, and can differentially mediate TLR signaling. IRAK-1, -2, and -4 are ubiquitously expressed and are active kinases, while IRAK-M is only induced in monocytes and macrophages and is an inactive kinase. Variations in IRAK genes are linked to diverse diseases including infection, sepsis, cancer, and autoimmune diseases. IRAKs contain an N-terminal Death domain (DD), a proST region (rich in serines, prolines, and threonines), a central kinase domain (a pseudokinase domain in the case of IRAK3), and a C-terminal domain; IRAK-4 lacks the C-terminal domain. This subfamily includes plant receptor-like kinases (RLKs) including Arabidopsis thaliana BAK1 and CLAVATA1 (CLV1). BAK1 functions in BR (brassinosteroid)-regulated plant development and in pathways involved in plant resistance to pathogen infection and herbivore attack. CLV1 directly binds small signaling peptides, CLAVATA3 (CLV3) and CLAVATA3/EMBRYO SURROUNDING REGION (CLE), to restrict stem cell proliferation: the CLV3-CLV1-WUS (WUSCHEL) module influences stem cell maintenance in the shoot apical meristem, and the CLE40 (CLAVATA3/EMBRYO SURROUNDING REGION40)-ACR4 (CRINKLY4)-CLV1-WOX5 (WUSCHEL-RELATED HOMEOBOX5) module at the root apical meristem. The IRAK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	272
270969	cd14067	STKc_LRRK1	Catalytic domain of the Serine/Threonine Kinase, Leucine-Rich Repeat Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. LRRK1 is one of two vertebrate LRRKs which show complementary expression in the brain. It can form heterodimers with LRRK2, and may influence the age of onset of LRRK2-associated Parkinson's disease. LRRKs are also classified as ROCO proteins because they contain a ROC (Ras of complex proteins)/GTPase domain followed by a COR (C-terminal of ROC) domain of unknown function. In addition, LRRKs contain a catalytic kinase domain and protein-protein interaction motifs including a WD40 domain, LRRs and ankyrin (ANK) repeats. LRRKs possess both GTPase and kinase activities, with the ROC domain acting as a molecular switch for the kinase domain, cycling between a GTP-bound state which drives kinase activity and a GDP-bound state which decreases the activity. The LRRK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	276
270970	cd14068	STKc_LRRK2	Catalytic domain of the Serine/Threonine Kinase, Leucine-Rich Repeat Kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. LRRK2 is one of two vertebrate LRRKs which show complementary expression in the brain. Mutations in LRRK2, found in the kinase, ROC-COR, and WD40 domains, are linked to both familial and sporadic forms of Parkinson's disease. The most prevalent mutation, G2019S located in the activation loop of the kinase domain, increases kinase activity. The R1441C/G mutations in the GTPase domain have also been reported to influence kinase activity. LRRKs are also classified as ROCO proteins because they contain a ROC (Ras of complex proteins)/GTPase domain followed by a COR (C-terminal of ROC) domain of unknown function. In addition, LRRKs contain a catalytic kinase domain and protein-protein interaction motifs including a WD40 domain, LRRs and ankyrin (ANK) repeats. LRRKs possess both GTPase and kinase activities, with the ROC domain acting as a molecular switch for the kinase domain, cycling between a GTP-bound state which drives kinase activity and a GDP-bound state which decreases the activity. The LRRK2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	252
270971	cd14069	STKc_Chk1	Catalytic domain of the Serine/Threonine kinase, Checkpoint kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Chk1 is implicated in many major checkpoints of the cell cycle, providing a link between upstream sensors and the cell cycle engine. It plays an important role in DNA damage response and maintaining genomic stability. Chk1 acts as an effector of the sensor kinase, ATR (ATM and Rad3-related), a member of the PI3K family, which is activated upon DNA replication stress. Chk1 delays mitotic entry in response to replication blocks by inhibiting cyclin dependent kinase (Cdk) activity. In addition, Chk1 contributes to the function of centrosome and spindle-based checkpoints, inhibits firing of origins of DNA replication (Ori), and represses transcription of cell cycle proteins including cyclin B and Cdk1. The Chk1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	261
270972	cd14070	STKc_HUNK	Catalytic domain of the Serine/Threonine Kinase, Hormonally up-regulated Neu-associated kinase (also called MAK-V). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. HUNK/MAK-V was identified from a mammary tumor in an MMTV-neu transgenic mouse. It is required for the metastasis of c-myc-induced mammary tumors, but is not necessary for c-myc-induced primary tumor formation or normal development. It is required for HER2/neu-induced tumor formation and maintenance of the cells' tumorigenic phenotype. It is over-expressed in aggressive subsets of ovary, colon, and breast carcinomas. HUNK interacts with synaptopodin, and may also play a role in synaptic plasticity. The HUNK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	262
270973	cd14071	STKc_SIK	Catalytic domain of the Serine/Threonine Kinases, Salt-Inducible kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. SIKs are part of a complex network that regulates Na,K-ATPase to maintain sodium homeostasis and blood pressure. Vertebrates contain three forms of SIKs (SIK1-3) from three distinct genes, which display tissue-specific effects. SIK1, also called SNF1LK, controls steroidogenic enzyme production in adrenocortical cells. In the brain, both SIK1 and SIK2 regulate energy metabolism. SIK2, also called QIK or SNF1LK2, is involved in the regulation of gluconeogenesis in the liver and lipogenesis in adipose tissues, where it phosphorylates the insulin receptor substrate-1. In the liver, SIK3 (also called QSK) regulates cholesterol and bile acid metabolism. In addition, SIK2 plays an important role in the initiation of mitosis and regulates the localization of C-Nap1, a centrosome linker protein. The SIK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	253
270974	cd14072	STKc_MARK	Catalytic domain of the Serine/Threonine Kinases, MAP/microtubule affinity-regulating kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MARKs, also called Partitioning-defective 1 (Par1) proteins, function as regulators of diverse cellular processes in nematodes, Drosophila, yeast, and vertebrates. They are involved in embryogenesis, epithelial cell polarization, cell signaling, and neuronal differentiation. MARKs phosphorylate tau and related microtubule-associated proteins (MAPs), and regulate microtubule-based intracellular transport. Vertebrates contain four isoforms, namely MARK1 (or Par1c), MARK2 (or Par1b), MARK3 (Par1a), and MARK4 (or MARKL1). Known substrates of MARKs include the cell cycle-regulating phosphatase Cdc25, tyrosine phosphatase PTPH1, MAPK scaffolding protein KSR1, class IIa histone deacetylases, and plakophilin 2. The MARK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	253
270975	cd14073	STKc_NUAK	Catalytic domain of the Serine/Threonine Kinase, novel (nua) kinase family NUAK. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. NUAK proteins are classified as AMP-activated protein kinase (AMPK)-related kinases, which like AMPK are activated by the major tumor suppressor LKB1. Vertebrates contain two NUAK proteins, called NUAK1 and NUAK2. NUAK1, also called ARK5 (AMPK-related protein kinase 5), regulates cell proliferation and displays tumor suppression through direct interaction and phosphorylation of p53. It is also involved in cell senescence and motility. High NUAK1 expression is associated with invasiveness of nonsmall cell lung cancer (NSCLC) and breast cancer cells. NUAK2, also called SNARK (Sucrose, non-fermenting 1/AMP-activated protein kinase-related kinase), is involved in energy metabolism. It is activated by hyperosmotic stress, DNA damage, and nutrients such as glucose and glutamine. NUAK2-knockout mice develop obesity, altered serum lipid profiles, hyperinsulinaemia, hyperglycaemia, and impaired glucose tolerance. The NUAK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	254
270976	cd14074	STKc_SNRK	Catalytic domain of the Serine/Threonine Kinase, SNF1-related kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. SNRK is a kinase highly expressed in testis and brain that is found inactive in cells that lack the LKB1 tumour suppressor protein kinase. The regulatory subunits STRAD and MO25 are required for LKB1 to activate SNRK. The SNRK mRNA is increased 3-fold when granule neurons are cultured in low potassium, and may thus play a role in the survival responses in these cells. In some vertebrates, a second SNRK gene (snrkb or snrk-1) has been sequenced and/or identified. Snrk-1 is expressed specifically in embryonic zebrafish vasculature; it plays an essential role in angioblast differentiation, maintenance, and migration. The SNRK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	258
270977	cd14075	STKc_NIM1	Catalytic domain of the Serine/Threonine Kinase, NIM1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. NIM1 is a widely-expressed kinase belonging to the AMP-activated protein kinase (AMPK) subfamily. Although present in most tissues, NIM1 kinase activity is only observed in the brain and testis. NIM1 is capable of autophosphorylating and activating itself, but may be present in other tissues in the inactive form. The physiological function of NIM1 has yet to be elucidated. The NIM1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	255
270978	cd14076	STKc_Kin4	Catalytic domain of the yeast Serine/Threonine Kinase, Kin4. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Kin4 is a central component of the spindle position checkpoint (SPOC), which monitors spindle position and regulates the mitotic exit network (MEN). Kin4 associates with spindle pole bodies in mother cells to inhibit MEN signaling and delay mitosis until the anaphase nucleus is properly positioned along the mother-bud axis. Kin4 activity is regulated by both the bud neck-associated kinase Elm1 and protein phosphatase 2A. The Kin4 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	270
270979	cd14077	STKc_Kin1_2	Catalytic domain of Kin1, Kin2, and similar Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of yeast Kin1, Kin2, and similar proteins. Fission yeast Kin1 is a membrane-associated kinase that is involved in regulating cell surface cohesiveness during interphase. It also plays a role during mitosis, linking actomyosin ring assembly with septum synthesis and membrane closure to ensure separation of daughter cells. Budding yeast Kin1 and Kin2 act downstream of the Rab-GTPase Sec4 and are associated with the exocytic apparatus; they play roles in the secretory pathway. The Kin1/2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	267
270980	cd14078	STKc_MELK	Catalytic domain of the Serine/Threonine Kinase, Maternal Embryonic Leucine zipper Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MELK is a cell cycle dependent protein which functions in cytokinesis, cell cycle, apoptosis, cell proliferation, and mRNA processing. It is found upregulated in many types of cancer cells, playing an indispensable role in cancer cell survival. It makes an attractive target in the design of inhibitors for use in the treatment of a wide range of human cancer. The MELK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	257
270981	cd14079	STKc_AMPK_alpha	Catalytic domain of the Alpha subunit of the Serine/Threonine Kinase, AMP-activated protein kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. AMPK, also called SNF1 (sucrose non-fermenting1) in yeasts and SnRK1 (SNF1-related kinase1) in plants, is a heterotrimeric enzyme composed of a catalytic alpha subunit and two regulatory subunits, beta and gamma. It is a stress-activated kinase that serves as master regulator of glucose and lipid metabolism by monitoring carbon and energy supplies, via sensing the cell's AMP:ATP ratio. In response to decreased ATP levels, it enhances energy-producing processes and inhibits energy-consuming pathways. Once activated, AMPK phosphorylates a broad range of downstream targets, with effects in carbohydrate metabolism and uptake, lipid and fatty acid biosynthesis, carbon energy storage, and inflammation, among others. Defects in energy homeostasis underlie many human diseases including Type 2 diabetes, obesity, heart disease, and cancer. As a result, AMPK has emerged as a therapeutic target in the treatment of these diseases. The AMPK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	256
270982	cd14080	STKc_TSSK-like	Catalytic domain of testis-specific serine/threonine kinases and similar proteins. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. TSSK proteins are almost exclusively expressed postmeiotically in the testis and play important roles in spermatogenesis and/or spermiogenesis. There are five mammalian TSSK proteins which show differences in their localization and timing of expression. TSSK1 and TSSK2 are expressed specifically in meiotic and postmeiotic spermatogenic cells, respectively. TSSK3 has been reported to be expressed in the interstitial Leydig cells of adult testis. TSSK4, also called TSSK5, is expressed in testis from haploid round spermatids to mature spermatozoa. TSSK6, also called SSTK, is expressed at the head of elongated sperm. TSSK1/TSSK2 double knock-out and TSSK6 null mice are sterile without manifesting other defects, making these kinases viable targets for male contraception. The TSSK-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	262
270983	cd14081	STKc_BRSK1_2	Catalytic domain of Brain-specific serine/threonine-protein kinases 1 and 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. BRSK1, also called SAD-B or SAD1 (Synapses of Amphids Defective homolog 1), and BRSK2, also called SAD-A, are highly expressed in mammalian forebrain. They play important roles in establishing neuronal polarity. BRSK1/2 double knock-out mice die soon after birth, showing thin cerebral cortices due to disordered subplate layers and neurons that lack distinct axons and dendrites. BRSK1 regulates presynaptic neurotransmitter release. Its activity fluctuates during cell cycle progression and it acts as a regulator of centrosome duplication. BRSK2 is also abundant in pancreatic islets, where it is involved in the regulation of glucose-stimulated insulin secretion. The BRSK1/2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	255
270984	cd14082	STKc_PKD	Catalytic domain of the Serine/Threonine kinase, Protein Kinase D. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PKDs are important regulators of many intracellular signaling pathways such as ERK and JNK, and cellular processes including the organization of the trans-Golgi network, membrane trafficking, cell proliferation, migration, and apoptosis. They contain N-terminal cysteine-rich zinc binding C1 (PKC conserved region 1), central PH (Pleckstrin Homology), and C-terminal catalytic kinase domains. Mammals harbor three types of PKDs: PKD1 (or PKCmu), PKD2, and PKD3 (or PKCnu). PKDs are activated in a PKC-dependent manner by many agents including diacylglycerol (DAG), PDGF, neuropeptides, oxidative stress, and tumor-promoting phorbol esters, among others. The PKD subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	260
270985	cd14083	STKc_CaMKI	Catalytic domain of the Serine/Threonine kinase, Calcium/calmodulin-dependent protein kinase Type I. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CaMKs are multifunctional calcium and calmodulin (CaM) stimulated STKs involved in cell cycle regulation. There are several types of CaMKs including CaMKI, CaMKII, and CaMKIV. In vertebrates, there are four CaMKI proteins encoded by different genes (alpha, beta, gamma, and delta), each producing at least one variant. CaMKs contain an N-terminal catalytic domain and a C-terminal regulatory domain that harbors a CaM binding site. CaMKI proteins are monomeric and they play pivotal roles in the nervous system, including long-term potentiation, dendritic arborization, neurite outgrowth, and the formation of spines, synapses, and axons. In addition, they may be involved in osteoclast differentiation and bone resorption. The CaMKI subfamily is part of a larger superfamily that includes the catalytic domains of other protein kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	259
270986	cd14084	STKc_Chk2	Catalytic domain of the Serine/Threonine kinase, Cell cycle Checkpoint Kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Checkpoint Kinase 2 (Chk2) plays an important role in cellular responses to DNA double-strand breaks and related lesions. It is phosphorylated and activated by ATM kinase, resulting in its dissociation from sites of damage to phosphorylate downstream targets such as BRCA1, p53, cell cycle transcription factor E2F1, the promyelocytic leukemia protein (PML) involved in apoptosis, and CDC25 phosphatases, among others. Mutations in Chk2 are linked to a variety of cancers including familial breast cancer, myelodysplastic syndromes, prostate cancer, lung cancer, and osteosarcomas. Chk2 contains an N-terminal SQ/TQ cluster domain (SCD), a central forkhead-associated (FHA) domain, and a C-terminal catalytic kinase domain. The Chk2 subfamily is part of a larger superfamily that includes the catalytic domains of other protein kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	275
270987	cd14085	STKc_CaMKIV	Catalytic domain of the Serine/Threonine kinase, Calcium/calmodulin-dependent protein kinase Type IV. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CaMKs are multifunctional calcium and calmodulin (CaM) stimulated STKs involved in cell cycle regulation. There are several types of CaMKs including CaMKI, CaMKII, and CaMKIV. CaMKs contain an N-terminal catalytic domain and a C-terminal regulatory domain that harbors a CaM binding site. CaMKIV is found predominantly in neurons and immune cells. It is activated by the binding of calcium/CaM and phosphorylation by CaMKK (alpha or beta). The CaMKK-CaMKIV cascade participates in regulating several transcription factors like CREB, MEF2, and retinoid orphan receptors. It is also implicated in T-cell development and signaling, cytokine secretion, and signaling through Toll-like receptors, and is thus pivotal in immune response and inflammation. The CaMKIV subfamily is part of a larger superfamily that includes the catalytic domains of other protein kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	294
270988	cd14086	STKc_CaMKII	Catalytic domain of the Serine/Threonine kinase, Calcium/calmodulin-dependent protein kinase Type II. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CaMKs are multifunctional calcium and calmodulin (CaM) stimulated STKs involved in cell cycle regulation. There are several types of CaMKs including CaMKI, CaMKII, and CaMKIV. CaMKs contain an N-terminal catalytic domain followed by a regulatory domain that harbors a CaM binding site. In addition, CaMKII contains a C-terminal association domain that facilitates oligomerization. There are four CaMKII proteins (alpha, beta, gamma, delta) encoded by different genes; each gene undergoes alternative splicing to produce more than 30 isoforms. CaMKII-alpha and -beta are enriched in neurons while CaMKII-gamma and -delta are predominant in myocardium. CaMKII is a signaling molecule that translates upstream calcium and reactive oxygen species (ROS) signals into downstream responses that play important roles in synaptic function and cardiovascular physiology. It is a major component of the postsynaptic density and is critical in regulating synaptic plasticity including long-term potentiation. It is critical in regulating ion channels and proteins involved in myocardial excitation-contraction and excitation-transcription coupling. Excessive CaMKII activity promotes processes that contribute to heart failure and arrhythmias. The CaMKII subfamily is part of a larger superfamily that includes the catalytic domains of other protein kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	292
270989	cd14087	STKc_PSKH1	Catalytic domain of the Protein Serine/Threonine kinase H1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PSKH1 is an autophosphorylating STK that is expressed ubiquitously and exhibits multiple intracellular localizations including the centrosome, Golgi apparatus, and splice factor compartments. It contains a catalytic kinase domain and an N-terminal SH4-like motif that is acylated to facilitate membrane attachment. PSKH1 plays a role in the maintenance of the Golgi apparatus, an important organelle within the secretory pathway. It may also function as a novel splice factor and a regulator of prostate cancer cell growth. The PSKH1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	259
270990	cd14088	STKc_CaMK_like	Catalytic domain of an Uncharacterized group of Serine/Threonine kinases with similarity to Calcium/calmodulin-dependent protein kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of uncharacterized STKs with similarity to CaMKs, which are multifunctional calcium and calmodulin (CaM) stimulated STKs involved in cell cycle regulation. The CaMK family includes CaMKI, CaMKII, CaMKIV, and CaMK kinase (CaMKK). CaMKs contain an N-terminal catalytic domain followed by a regulatory domain that harbors a CaM binding site. This uncharacterized subfamily is part of a larger superfamily that includes the catalytic domains of other protein kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	265
270991	cd14089	STKc_MAPKAPK	Catalytic domain of the Serine/Threonine kinases, Mitogen-activated protein kinase-activated protein kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of the MAPK-activated protein kinases MK2, MK3, MK5 (also called PRAK for p38-regulated/activated protein kinase), and related proteins. These proteins contain a catalytic kinase domain followed by a C-terminal autoinhibitory region that contains nuclear localization (NLS) and nuclear export (NES) signals with a p38 MAPK docking motif that overlaps the NLS. In addition, MK2 and MK3 contain an N-terminal proline-rich region that can bind to SH3 domains. MK2 and MK3 are bona fide substrates for the MAPK p38, while MK5 plays a functional role in the p38 MAPK pathway although their direct interaction has been difficult to detect. MK2 and MK3 are closely related and show, thus far, indistinguishable substrate specificity, while MK5 shows a distinct spectrum of substrates. MK2 and MK3 are mainly involved in the regulation of gene expression and they participate in diverse cellular processes such as endocytosis, cytokine production, cytoskeletal reorganization, cell migration, cell cycle control and chromatin remodeling. They are implicated in inflammation and cancer and their substrates include mRNA-AU-rich-element (ARE)-binding proteins (TTP and hnRNP A0), Hsp proteins (Hsp27 and Hsp25) and RSK, among others. MK2/3 are both expressed ubiquitously but MK2 is expressed at significantly higher levels. MK5 is a ubiquitous protein that is implicated in neuronal morphogenesis, cell migration, and tumor angiogenesis. It interacts with PKA, which induces cytoplasmic translocation of MK5. Its substrates include p53, ERK3/4, Hsp27, and cytosolic phospholipase A2 (cPLA2). The MAPKAPK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	263
270992	cd14090	STKc_Mnk	Catalytic domain of the Serine/Threonine kinases, Mitogen-activated protein kinase signal-integrating kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MAPK signal-integrating kinases (Mnks) are MAPK-activated protein kinases and comprise a group of four proteins, produced by alternative splicing from two genes (Mnk1 and Mnk2). The isoforms of Mnk1 (1a/1b) and Mnk2 (2a/2b) differ at their C-termini, with the a-form having a longer C-terminus containing a MAPK-binding region. All Mnks contain a catalytic kinase domain and a polybasic region at the N-terminus which binds importin and the eukaryotic initiation factor eIF4G. The best characterized Mnk substrate is eIF4G, whose phosphorylation may promote the export of certain mRNAs from the nucleus. Mnks also phosphorylate substrates that bind to AU-rich elements that regulate mRNA stability and translation. Mnks have also been implicated in tyrosine kinase receptor signaling, inflammation, and cell proliferation or survival. The Mnk subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	289
270993	cd14091	STKc_RSK_C	C-terminal catalytic domain of the Serine/Threonine Kinases, Ribosomal S6 kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. RSKs contain an N-terminal kinase domain (NTD) from the AGC family and a C-terminal kinase domain (CTD) from the CAMK family. They are activated by signaling inputs from extracellular regulated kinase (ERK) and phosphoinositide dependent kinase 1 (PDK1). ERK phosphorylates and activates the CTD of RSK, serving as a docking site for PDK1, which phosphorylates and activates the NTD, which in turn phosphorylates all known RSK substrates. RSKs act as downstream effectors of mitogen-activated protein kinase (MAPK) and play key roles in mitogen-activated cell growth, differentiation, and survival. Mammals possess four RSK isoforms (RSK1-4) from distinct genes. RSK proteins are also referred to as MAP kinase-activated protein kinases (MAPKAPKs), 90 kDa ribosomal protein S6 kinases (p90-RSKs), or p90S6Ks. The RSK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	291
270994	cd14092	STKc_MSK_C	C-terminal catalytic domain of the Serine/Threonine Kinase, Mitogen and stress-activated kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MSKs contain an N-terminal kinase domain (NTD) from the AGC family and a C-terminal kinase domain (CTD) from the CAMK family. MSKs are activated by two major signaling cascades, the Ras-MAPK and p38 stress kinase pathways, in response to various stimuli such as growth factors, hormones, neurotransmitters, cellular stress, and pro-inflammatory cytokines. This triggers phosphorylation in the activation loop (A-loop) of the CTD of MSK. The active CTD phosphorylates the hydrophobic motif (HM) in the C-terminal extension of NTD, which facilitates the phosphorylation of the A-loop and activates the NTD, which in turn phosphorylates downstream targets. MSKs are predominantly nuclear proteins. They are widely expressed in many tissues including heart, brain, lung, liver, kidney, and pancreas. There are two isoforms of MSK, called MSK1 and MSK2. The MSK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	311
270995	cd14093	STKc_PhKG	Catalytic domain of the Serine/Threonine Kinase, Phosphorylase kinase Gamma subunit. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Phosphorylase kinase (PhK) catalyzes the phosphorylation of inactive phosphorylase b to form the active phosphorylase a. It coordinates hormonal, metabolic, and neuronal signals to initiate the breakdown of glycogen stores, which enables the maintenance of blood-glucose homeostasis during fasting, and is also used as a source of energy for muscle contraction. PhK is one of the largest and most complex protein kinases, composed of a heterotetramer containing four molecules each of four subunit types: one catalytic (gamma) and three regulatory (alpha, beta, and delta). Each subunit has tissue-specific isoforms or splice variants. Vertebrates contain two isoforms of the gamma subunit (gamma 1 and gamma 2). The gamma subunit, when isolated, is constitutively active and does not require phosphorylation of the A-loop for activity. The regulatory subunits restrain this kinase activity until signals are received to relieve this inhibition. For example, the kinase is activated in response to hormonal stimulation, after autophosphorylation or phosphorylation by cAMP-dependent kinase of the alpha and beta subunits. The high-affinity binding of ADP to the beta subunit also stimulates kinase activity, whereas calcium relieves inhibition by binding to the delta (calmodulin) subunit. The PhKG subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	272
270996	cd14094	STKc_CASK	Catalytic domain of the Serine/Threonine Kinase, Calcium/calmodulin-dependent serine protein kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CASK belongs to the MAGUK (membrane-associated guanylate kinase) protein family, which functions as multiple domain adaptor proteins and is characterized by the presence of a core of three domains: PDZ, SH3, and guanylate kinase (GuK). The enzymatically inactive GuK domain in MAGUK proteins mediates protein-protein interactions and associates intramolecularly with the SH3 domain. In addition, CASK contains a catalytic kinase and two L27 domains. It is highly expressed in the nervous system and plays roles in synaptic protein targeting, neural development, and regulation of gene expression. Binding partners include parkin (a Parkinson's disease molecule), neurexin (adhesion molecule), syndecans, calcium channel proteins, CINAP (nucleosome assembly protein), transcription factor Tbr-1, and the cytoplasmic adaptor proteins Mint1, Veli/mLIN-7/MALS, SAP97, caskin, and CIP98. Deletion or mutations in the CASK gene have been implicated in X-linked mental retardation. The CASK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	300
270997	cd14095	STKc_DCKL	Catalytic domain of the Serine/Threonine Kinase, Doublecortin-like kinase (also called Doublecortin-like and CAM kinase-like). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. DCKL (or DCAMKL) proteins belong to the doublecortin (DCX) family of proteins which are involved in neuronal migration, neurogenesis, and eye receptor development, among others. Family members typically contain tandem doublecortin (DCX) domains at the N-terminus; DCX domains can bind microtubules and serve as protein-interaction platforms. In addition, DCKL proteins contain a C-terminal kinase domain with similarity to CAMKs. They are involved in the regulation of cAMP signaling. Vertebrates contain three DCKL proteins (DCKL1-3); DCKL1 and 2 also contain a serine, threonine, and proline rich domain (SP), while DCKL3 contains only a single DCX domain instead of tandem domains. The DCKL subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	258
270998	cd14096	STKc_RCK1-like	Catalytic domain of RCK1-like Serine/Threonine Kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of fungal STKs including Saccharomyces cerevisiae RCK1 and RCK2, Schizosaccharomyces pombe Sty1-regulated kinase 1 (Srk1), and similar proteins. RCK1, RCK2 (or Rck2p), and Srk1 are MAPK-activated protein kinases. RCK1 and RCK2 are involved in oxidative and metal stress resistance in budding yeast. RCK2 also regulates rapamycin sensitivity in both S. cerevisiae and Candida albicans. Srk1 is activated by Sty1/Spc1 and is involved in negatively regulating cell cycle progression by inhibiting Cdc25. The RCK1-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	295
270999	cd14097	STKc_STK33	Catalytic domain of Serine/Threonine Kinase 33. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. STK33 is highly expressed in the testis and is present in low levels in most tissues. It may be involved in spermatogenesis and organ ontogenesis. It interacts with and phosphorylates vimentin and may be involved in regulating intermediate filament cytoskeletal dynamics. Its role in promoting the cell viability of KRAS-dependent cancer cells is under debate; some studies have found STK33 to promote cancer cell viability, while other studies have found it to be non-essential. KRAS is the most commonly mutated human oncogene; thus, studies on the role of STK33 in KRAS mutant cancer cells are important. The STK33 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	266
271000	cd14098	STKc_Rad53_Cds1	Catalytic domain of the yeast Serine/Threonine Kinases, Rad53 and Cds1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Rad53 and Cds1 are the checkpoint kinase 2 (Chk2) homologs found in budding and fission yeast, respectively. They play a central role in the cell's response to DNA lesions to prevent genome rearrangements and maintain genome integrity. They are phosphorylated in response to DNA damage and incomplete replication, and are essential for checkpoint control. They help promote DNA repair by stalling the cell cycle prior to mitosis in the presence of DNA damage. The Rad53/Cds1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	265
271001	cd14099	STKc_PLK	Catalytic domain of the Serine/Threonine Kinases, Polo-like kinases. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PLKs play important roles in cell cycle progression and in DNA damage responses. They regulate mitotic entry, mitotic exit, and cytokinesis. In general, PLKs contain an N-terminal catalytic kinase domain and a C-terminal regulatory polo box domain (PBD), which comprises two bipartite polo-box motifs (or polo boxes) and is involved in protein interactions. PLKs derive their names from homology to polo, a kinase first identified in Drosophila. There are five mammalian PLKs (PLK1-5) from distinct genes. There is good evidence that PLK1 may function as an oncogene while PLK2-5 have tumor suppressive properties. PLK1 functions as a positive regulator of mitosis, meiosis, and cytokinesis. PLK2 functions in G1 progression, S-phase arrest, and centriole duplication. PLK3 regulates angiogenesis and responses to DNA damage. PLK4 is required for late mitotic progression, cell survival, and embryonic development. PLK5 was first identified as a pseudogene containing a stop codon within the kinase domain; however, both murine and human genes encode expressed proteins. PLK5 functions in cell cycle arrest.	258
271002	cd14100	STKc_PIM1	Catalytic domain of the Serine/Threonine kinase, Proviral Integration Moloney virus (PIM) kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The PIM gene locus was discovered as a result of the cloning of retroviral integration sites in murine Moloney leukemia virus, leading to the identification of PIM kinases. They are constitutively active STKs with a broad range of cellular targets and are overexpressed in many haematopoietic malignancies and solid cancers. Vertebrates contain three distinct PIM kinase genes (PIM1-3); each gene may result in multiple protein isoforms. There are two PIM1 isoforms resulting from alternative translation initiation sites. PIM1 is the founding member of the PIM subfamily. It is involved in regulating cell growth, differentiation, and apoptosis. It promotes cancer development when overexpressed by inhibiting apoptosis, promoting cell proliferation, and promoting genomic instability. The PIM1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	254
271003	cd14101	STKc_PIM2	Catalytic domain of the Serine/Threonine kinase, Proviral Integration Moloney virus (PIM) kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The PIM gene locus was discovered as a result of the cloning of retroviral integration sites in murine Moloney leukemia virus, leading to the identification of PIM kinases. They are constitutively active STKs with a broad range of cellular targets and are overexpressed in many haematopoietic malignancies and solid cancers. Vertebrates contain three distinct PIM kinase genes (PIM1-3); each gene may result in multiple protein isoforms. There are three PIM2 isoforms resulting from alternative translation initiation sites. PIM2 is highly expressed in leukemia and lymphomas and has been shown to promote the survival and proliferation of tumor cells. The PIM2 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	257
271004	cd14102	STKc_PIM3	Catalytic domain of the Serine/Threonine kinase, Proviral Integration Moloney virus (PIM) kinase 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The PIM gene locus was discovered as a result of the cloning of retroviral integration sites in murine Moloney leukemia virus, leading to the identification of PIM kinases. They are constitutively active STKs with a broad range of cellular targets and are overexpressed in many haematopoietic malignancies and solid cancers. Vertebrates contain three distinct PIM kinase genes (PIM1-3). PIM3 can inhibit apoptosis and promote cell survival and protein translation; therefore, it can enhance the proliferation of normal and cancer cells. Mice deficient in PIM3 show minimal effects, suggesting that PIM3 may not be essential. Since its expression is enhanced in several cancers, it may make a good molecular target for cancer drugs. The PIM3 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	253
271005	cd14103	STKc_MLCK	Catalytic domain of the Serine/Threonine Kinase, Myosin Light Chain Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MLCK phosphorylates myosin regulatory light chain and controls the contraction of all muscle types. In vertebrates, different MLCKs function in smooth (MLCK1), skeletal (MLCK2), and cardiac (MLCK3) muscles. A fourth protein, MLCK4, has also been identified through comprehensive genome analysis although it has not been biochemically characterized. The MLCK1 gene expresses three transcripts in a cell-specific manner: a short MLCK1 which contains three immunoglobulin (Ig)-like and one fibronectin type III (FN3) domains, PEVK and actin-binding regions, and a kinase domain near the C-terminus; a long MLCK1 containing six additional Ig-like domains at the N-terminus compared to the short MLCK1; and the C-terminal Ig module. MLCK2, MLCK3, and MLCK4 share a simpler domain architecture of a single kinase domain near the C-terminus and the absence of Ig-like or FN3 domains. The MLCK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	250
271006	cd14104	STKc_Titin	Catalytic domain of the Giant Serine/Threonine Kinase Titin. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Titin, also called connectin, is a muscle-specific elastic protein and is the largest known protein to date. It contains multiple immunoglobulin (Ig)-like and fibronectin type III (FN3) domains, and a single kinase domain near the C-terminus. It spans half of the sarcomere, the repeating contractile unit of striated muscle, and performs mechanical and catalytic functions. Titin contributes to the passive force generated when muscle is stretched during relaxation. Its kinase domain phosphorylates and regulates the muscle protein telethonin, which is required for sarcomere formation in differentiating myocytes. In addition, titin binds many sarcomere proteins and acts as a molecular scaffold for filament formation during myofibrillogenesis. The Titin subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	277
271007	cd14105	STKc_DAPK	Catalytic domain of the Serine/Threonine Kinase, Death-Associated Protein Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. DAPKs mediate cell death and act as tumor suppressors. They are necessary to induce cell death and their overexpression leads to death-associated changes including membrane blebbing, cell rounding, and formation of autophagic vesicles. Vertebrates contain three subfamily members with different domain architecture, localization, and function. DAPK1 is the prototypical member of the subfamily and is also simply referred to as DAPK. DAPK2 is also called DAPK-related protein 1 (DRP-1), while DAPK3 has also been named DAP-like kinase (DLK) and zipper-interacting protein kinase (ZIPk). These proteins are ubiquitously expressed in adult tissues, are capable of cross talk with each other, and may act synergistically in regulating cell death. The DAPK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	269
271008	cd14106	STKc_DRAK	Catalytic domain of the Serine/Threonine Kinase, Death-associated protein kinase-Related Apoptosis-inducing protein Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. DRAKs, also called STK17, were named based on their similarity (around 50% identity) to the kinase domain of DAPKs. They contain an N-terminal kinase domain and a C-terminal regulatory domain. Vertebrates contain two subfamily members, DRAK1 and DRAK2. Both DRAKs are localized to the nucleus, autophosphorylate themselves, and phosphorylate myosin light chain as a substrate. They may play a role in apoptotic signaling. The DRAK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	268
271009	cd14107	STKc_obscurin_rpt1	Catalytic kinase domain, first repeat, of the Giant Serine/Threonine Kinase Obscurin. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Obscurin, approximately 800 kDa in size, is one of three giant proteins expressed in vertebrate striated muscle, together with titin and nebulin. It is a multidomain protein composed of tandem adhesion and signaling domains, including 49 immunoglobulin (Ig) and 2 fibronectin type III (FN3) domains at the N-terminus followed by a more complex region containing more Ig domains, a conserved SH3 domain near a RhoGEF and PH domains, non-modular regions, as well as IQ and phosphorylation motifs. The obscurin gene also encodes two kinase domains, which are not expressed as part of the 800 kDa protein, but as a smaller, alternatively spliced product present mainly in the heart muscle, also called obscurin-MLCK. Obscurin is localized at the peripheries of Z-disks and M-lines, where it is able to communicate with the surrounding myoplasm. It interacts with diverse proteins including sAnk1, myosin, titin, and MyBP-C. It may act as a scaffold for the assembly of elements of the contractile apparatus. The obscurin subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	257
271010	cd14108	STKc_SPEG_rpt1	Catalytic kinase domain, first repeat, of Giant Serine/Threonine Kinase Striated muscle preferentially expressed protein kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The Striated muscle preferentially expressed gene (SPEG) generates 4 different isoforms through alternative promoter use and splicing in a tissue-specific manner: SPEGalpha and SPEGbeta are expressed in cardiac and skeletal striated muscle; Aortic Preferentially Expressed Protein-1 (APEG-1) is expressed in vascular smooth muscle; and Brain preferentially expressed gene (BPEG) is found in the brain and aorta. SPEG proteins have multiple immunoglobulin (Ig), 2 fibronectin type III (FN3), and two kinase domains. They are necessary for cardiac development and survival. The SPEG subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	255
271011	cd14109	PK_Unc-89_rpt1	Pseudokinase domain, first repeat, of the Giant Serine/Threonine Kinase Uncoordinated protein 89. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity. The nematode Unc-89 gene, through alternative promoter use and splicing, encodes at least six major isoforms (Unc-89A to Unc-89F) of giant muscle proteins that are homologs of the vertebrate obscurin. In flies, five isoforms of Unc-89 have been detected: four in the muscles of adult flies (two in the indirect flight muscle and two in other muscles) and another isoform in the larva. Unc-89 in nematodes is required for normal muscle cell architecture. In flies, it is necessary for the development of a symmetrical sarcomere in the flight muscles. Unc-89 proteins contain several adhesion and signaling domains including multiple copies of the immunoglobulin (Ig) domain, as well as fibronectin type III (FN3), SH3, RhoGEF, and PH domains. The nematode Unc-89 isoforms B, C, D, and F contain two kinase domains, with B and F having two complete kinase domains while the first repeat of C and D are partial domains. Homology modeling suggests that the first kinase repeat of Unc-89 may be catalytically inactive, a pseudokinase, while the second kinase repeat may be active. The pseudokinase domain may function as a regulatory domain or a protein interaction domain. The Unc-89 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	255
271012	cd14110	STKc_obscurin_rpt2	Catalytic kinase domain, second repeat, of the Giant Serine/Threonine Kinase Obscurin. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Obscurin, approximately 800 kDa in size, is one of three giant proteins expressed in vertebrate striated muscle, together with titin and nebulin. It is a multidomain protein composed of tandem adhesion and signaling domains, including 49 immunoglobulin (Ig) and 2 fibronectin type III (FN3) domains at the N-terminus followed by a more complex region containing more Ig domains, a conserved SH3 domain near a RhoGEF and PH domains, non-modular regions, as well as IQ and phosphorylation motifs. The obscurin gene also encodes two kinase domains, which are not expressed as part of the 800 kDa protein, but as a smaller, alternatively spliced product present mainly in the heart muscle, also called obscurin-MLCK. Obscurin is localized at the peripheries of Z-disks and M-lines, where it is able to communicate with the surrounding myoplasm. It interacts with diverse proteins including sAnk1, myosin, titin, and MyBP-C. It may act as a scaffold for the assembly of elements of the contractile apparatus. The obscurin subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	257
271013	cd14111	STKc_SPEG_rpt2	Catalytic kinase domain, second repeat, of Giant Serine/Threonine Kinase Striated muscle preferentially expressed protein kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The Striated muscle preferentially expressed gene (SPEG) generates 4 different isoforms through alternative promoter use and splicing in a tissue-specific manner: SPEGalpha and SPEGbeta are expressed in cardiac and skeletal striated muscle; Aortic Preferentially Expressed Protein-1 (APEG-1) is expressed in vascular smooth muscle; and Brain preferentially expressed gene (BPEG) is found in the brain and aorta. SPEG proteins have multiple immunoglobulin (Ig), 2 fibronectin type III (FN3), and two kinase domains. They are necessary for cardiac development and survival. The SPEG subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	257
271014	cd14112	STKc_Unc-89_rpt2	Catalytic kinase domain, second repeat, of the Giant Serine/Threonine Kinase Uncoordinated protein 89. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The nematode Unc-89 gene, through alternative promoter use and splicing, encodes at least six major isoforms (Unc-89A to Unc-89F) of giant muscle proteins that are homologs of the vertebrate obscurin. In flies, five isoforms of Unc-89 have been detected: four in the muscles of adult flies (two in the indirect flight muscle and two in other muscles) and another isoform in the larva. Unc-89 in nematodes is required for normal muscle cell architecture. In flies, it is necessary for the development of a symmetrical sarcomere in the flight muscles. Unc-89 proteins contain several adhesion and signaling domains including multiple copies of the immunoglobulin (Ig) domain, as well as fibronectin type III (FN3), SH3, RhoGEF, and PH domains. The nematode Unc-89 isoforms B, C, D, and F contain two kinase domains, with B and F having two complete kinase domains while the first repeat of C and D are partial domains. Homology modeling suggests that the first kinase repeat of Unc-89 may be catalytically inactive, a pseudokinase, while the second kinase repeat may be active. The Unc-89 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	259
271015	cd14113	STKc_Trio_C	C-terminal kinase domain of the Large Serine/Threonine Kinase and Rho Guanine Nucleotide Exchange Factor, Triple functional domain protein. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Triple functional domain protein (Trio), also called PTPRF-interacting protein, is a large multidomain protein containing a series of spectrin-like repeats, two each of RhoGEF and SH3 domains, an immunoglobulin-like (Ig) domain and a C-terminal kinase. Trio plays important roles in neuronal cell migration and axon guidance. It was originally identified as an interacting partner of the receptor-like tyrosine phosphatase (RPTP) LAR (leukocyte-antigen-related protein), a family of receptors that function in the signaling to the actin cytoskeleton during development. Trio functions as a GEF for Rac1, RhoG, and RhoA, and is involved in the regulation of lamellipodia formation, mediating Rac1-dependent cell spreading and migration. The Trio subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	263
271016	cd14114	STKc_Twitchin_like	The catalytic domain of the Giant Serine/Threonine Kinases, Twitchin and Projectin. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily is composed of Caenorhabditis elegans and Aplysia californica Twitchin, Drosophila melanogaster Projectin, and similar proteins. These are very large muscle proteins containing multiple immunoglobulin (Ig)-like and fibronectin type III (FN3) domains and a single kinase domain near the C-terminus. Twitchin and Projectin are both associated with thick filaments. Twitchin is localized in the outer parts of A-bands and is involved in regulating muscle contraction. It interacts with the myofibrillar proteins myosin and actin in a phosphorylation-dependent manner, and may be involved in regulating the myosin cross-bridge cycle. The kinase activity of Twitchin is activated by Ca2+ and the Ca2+ binding protein S100A1. Projectin is associated with the end of thick filaments and is a component of flight muscle connecting filaments. The kinase domain of Projectin may play roles in autophosphorylation and transphosphorylation, which impact the formation of myosin filaments. The Twitchin-like subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	259
271017	cd14115	STKc_Kalirin_C	C-terminal kinase domain of the Large Serine/Threonine Kinase and Rho Guanine Nucleotide Exchange Factor, Kalirin. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Kalirin, also called Duo or Duet, is a large multidomain protein containing a series of spectrin-like repeats, two each of RhoGEF and SH3 domains, an immunoglobulin-like (Ig) domain and a C-terminal kinase. As a GEF, it activates Rac1, RhoA, and RhoG. It is highly expressed in neurons and is required for spine formation. The kalirin gene produces at least 10 isoforms from alternative promoter use and splicing. Of the major isoforms (Kalirin-7, -9, and -12), only kalirin-12 contains the C-terminal kinase domain. Kalirin-12 is highly expressed during embryonic development and it plays an important role in axon outgrowth. The Kalirin subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	248
271018	cd14116	STKc_Aurora-A	Catalytic domain of the Serine/Threonine kinase, Aurora-A kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Aurora kinases are key regulators of mitosis and are essential for the accurate and equal division of genomic material from parent to daughter cells. Vertebrates contain at least 2 Aurora kinases (A and B); mammals contain a third Aurora kinase gene (C). Aurora-A regulates cell cycle events from the late S-phase through the M-phase including centrosome maturation, mitotic entry, centrosome separation, spindle assembly, chromosome alignment, cytokinesis, and mitotic exit. Aurora-A activation depends on its autophosphorylation and binding to the microtubule-associated protein TPX2, which also localizes the kinase to spindle microtubules. Aurora-A is overexpressed in many cancer types such as prostate, ovarian, breast, bladder, gastric, and pancreatic. The Aurora subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	258
271019	cd14117	STKc_Aurora-B_like	Catalytic domain of the Serine/Threonine kinase, Aurora-B kinase and similar proteins. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Aurora kinases are key regulators of mitosis and are essential for the accurate and equal division of genomic material from parent to daughter cells. Vertebrates contain at least 2 Aurora kinases (A and B); mammals contain a third Aurora kinase gene (C). This subfamily includes Aurora-B and Aurora-C. Aurora-B is most active at the transition during metaphase to the end of mitosis. It associates with centromeres, relocates to the midzone of the central spindle, and concentrates at the midbody during cell division. It is critical for accurate chromosomal segregation, cytokinesis, protein localization to the centrosome and kinetochore, correct microtubule-kinetochore attachments, and regulation of the mitotic checkpoint. Aurora-C is mainly expressed in meiotically dividing cells; it was originally discovered in mice as a testis-specific STK called Aie1. Both Aurora-B and -C are chromosomal passenger proteins that can form complexes with INCENP and survivin, and they may have redundant cellular functions. INCENP participates in the activation of Aurora-B in a two-step process: first by binding to form an intermediate state of activation and the phosphorylation of its C-terminal TSS motif to generate the fully active kinase. The Aurora-B subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	270
271020	cd14118	STKc_CAMKK	Catalytic domain of the Serine/Threonine kinase, Calmodulin Dependent Protein Kinase Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CaMKKs are upstream kinases of the CaM kinase cascade that phosphorylate and activate CaMKI and CaMKIV. They may also phosphorylate other substrates including PKB and AMP-activated protein kinase (AMPK). Vertebrates contain two CaMKKs, CaMKK1 (or alpha) and CaMKK2 (or beta). CaMKK1 is involved in the regulation of glucose uptake in skeletal muscles. CaMKK2 is involved in regulating energy balance, glucose metabolism, adiposity, hematopoiesis, inflammation, and cancer. The CaMKK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	275
271021	cd14119	STKc_LKB1	Catalytic domain of the Serine/Threonine kinase, Liver Kinase B1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. LKB1, also called STK11, was first identified as a tumor suppressor responsible for Peutz-Jeghers syndrome, a disorder that leads to an increased risk of spontaneous epithelial cancer. It serves as a master upstream kinase that activates AMP-activated protein kinase (AMPK) and most AMPK-like kinases. LKB1 and AMPK are part of an energy-sensing pathway that links cell energy to metabolism and cell growth. They play critical roles in the establishment and maintenance of cell polarity, cell proliferation, cytoskeletal organization, as well as T-cell metabolism, including T-cell development, homeostasis, and effector function. To be activated, LKB1 requires the adaptor proteins STe20-Related ADaptor (STRAD) and mouse protein 25 (MO25). The LKB1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	255
271022	cd14120	STKc_ULK1_2-like	Catalytic domain of the Serine/Threonine kinases, Unc-51-like kinases 1 and 2, and similar proteins. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The ATG1/ULK complex is conserved from yeast to humans and it plays a critical role in the initiation of autophagy, the intracellular system that leads to the lysosomal degradation of cellular components and their recycling into basic metabolic units. ULK1 is required for efficient amino acid starvation-induced autophagy and mitochondrial clearance. ULK2 is ubiquitously expressed and is essential in autophagy induction. ULK1 and ULK2 have unique and cell-type specific roles, but also display partially redundant roles in starvation-induced autophagy. They both display neuron-specific functions: ULK1 is involved in non-clathrin-coated endocytosis in growth cones, filopodia extension, and axon branching; ULK2 plays a role in axon development. The ULK1/2 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	256
271023	cd14121	STKc_ULK3	Catalytic domain of the Serine/Threonine kinase, Unc-51-like kinase 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The ATG1/ULK complex is conserved from yeast to humans and it plays a critical role in the initiation of autophagy, the intracellular system that leads to the lysosomal degradation of cellular components and their recycling into basic metabolic units. ULK3 mRNA is up-regulated in fibroblasts after Ras-induced senescence, and its overexpression induces both autophagy and senescence in a fibroblast cell line. ULK3, through its kinase activity, positively regulates Gli proteins, mediators of the Sonic hedgehog (Shh) signaling pathway that is implicated in tissue homeostasis maintenance and neurogenesis. It is inhibited by binding to Suppressor of Fused (Sufu). The ULK3 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	252
271024	cd14122	STKc_VRK1	Catalytic domain of the Serine/Threonine protein kinase, Vaccinia Related Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. VRKs were initially discovered due to their similarity to vaccinia virus B1R STK, which is important for viral replication. Vertebrates contain three VRK proteins. Human VRK1 is implicated in the regulation of many cellular processes including cell cycle progression and proliferation, stress responses, nuclear envelope assembly and chromatin condensation. It regulates cell cycle progression during the DNA replication period by inducing cyclin D1 expression. VRK1 also phosphorylates and regulates some transcription factors including p53, c-Jun, ATF2, and nuclear factor BAF. VRK1 stabilizes p53 by interfering with its mdm2-mediated degradation. Accumulation of p53, which blocks cell growth and division, is modulated by an autoregulatory loop between p53 and VRK1 (accumulated p53 downregulates VRK1). This autoregulatory loop has been found to be nonfunctional in some lung carcinomas. The VRK1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	301
271025	cd14123	STKc_VRK2	Catalytic domain of the Serine/Threonine protein kinase, Vaccinia Related Kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. VRKs were initially discovered due to their similarity to vaccinia virus B1R STK, which is important for viral replication. They play important roles in cell signaling, nuclear envelope dynamics, apoptosis, and stress responses. Vertebrates contain three VRK proteins. VRK2 exists as two alternative splice forms, A and B, which differ in their C-terminal regions. VRK2A, the predominant isoform, contains a hydrophobic tail and is anchored to the ER and mitochondria. It is expressed in all cell types. VRK2B lacks a membrane-anchor tail and is detected in the cytosol and the nucleus. Like VRK1, it can stabilize p53. VRK2B functionally replaces VRK1 in the nucleus of cell types where VRK1 is absent. VRK2 modulates hypoxia-induced stress responses by interacting with TAK1, an atypical MAPK kinase kinase which triggers cascades that activate JNK following oxidative stress. VRK2 also interacts with JIP1, a scaffold protein that assembles three consecutive members of a MAPK pathway. This interaction prevents the association of JNK with the signaling complex, leading to reduced phosphorylation and AP1-dependent transcription. The VRK2 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	302
271026	cd14124	PK_VRK3	Pseudokinase domain of Vaccinia Related Kinase 3. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity. VRKs were initially discovered due to their similarity to vaccinia virus B1R STK, which is important for viral replication. They play important roles in cell signaling, nuclear envelope dynamics, apoptosis, and stress responses. Vertebrates contain three VRK proteins. VRK3 is an inactive pseudokinase that is unable to bind ATP. It achieves its regulatory function through protein-protein interactions. It negatively regulates ERK signaling by binding directly and enhancing the activity of the MAPK phosphatase VHR (vaccinia H1-related), which dephosphorylates and inactivates ERK. The VRK3 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	298
271027	cd14125	STKc_CK1_delta_epsilon	Catalytic domain of the Serine/Threonine protein kinases, Casein Kinase 1 delta and epsilon. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CK1 phosphorylates a variety of substrates including enzymes, transcription and splice factors, cytoskeletal proteins, viral oncogenes, receptors, and membrane-associated proteins. There are multiple isoforms of CK1 and in mammals, seven isoforms (alpha, beta, gamma1-3, delta, and epsilon) have been characterized. These isoforms differ mainly in the length and structure of their C-terminal non-catalytic region. The delta and epsilon isoforms of CK1 play important roles in circadian rhythm and cell growth. They phosphorylate PERIOD proteins (PER1-3), which are circadian clock proteins that fulfill negative regulatory functions. PER phosphorylation leads to its degradation. However, CRY proteins form a complex with PER and CK1delta/epsilon that protects PER from degradation and leads to nuclear accumulation of the complex, which inhibits BMAL1-CLOCK dependent transcription activation. CK1delta/epsilon also phosphorylate the tumor suppressor p53 and the cellular oncogene Mdm2, which are key regulators of cell growth, genome integrity, and the development of cancer. This subfamily also includes the CK1 fungal proteins Saccharomyces cerevisiae HRR25 and Schizosaccharomyces pombe HHP1. These fungal proteins are involved in DNA repair. The CK1 delta/epsilon subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	275
271028	cd14126	STKc_CK1_gamma	Catalytic domain of the Serine/Threonine protein kinase, Casein Kinase 1 gamma. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CK1 phosphorylates a variety of substrates including enzymes, transcription and splice factors, cytoskeletal proteins, viral oncogenes, receptors, and membrane-associated proteins. There are multiple isoforms of CK1 and in mammals, seven isoforms (alpha, beta, gamma1-3, delta, and epsilon) have been characterized. These isoforms differ mainly in the length and structure of their C-terminal non-catalytic region. CK1gamma proteins are unique within the CK1 subfamily in that they are palmitoylated at the C-termini and are anchored to the plasma membrane. CK1gamma is involved in transducing the signaling of LDL-receptor-related protein 6 (LRP6) through direct phosphorylation following Wnt stimulation, resulting in the recruitment of the scaffold protein Axin. In Xenopus embryos, CK1gamma is required during anterior-posterior patterning. In higher vertebrates, three CK1gamma (gamma1-3) isoforms exist. In mammalian cells, CK1gamma2 has been implicated in regulating the synthesis of sphingomyelin, a phospholipid that is found in the outer leaflet of the plasma membrane, by hyperphosphorylating and inactivating the ceramide transfer protein CERT. CK1gamma2 also phosphorylates the transcription factor Smad-3 resulting in its ubiquitination and degradation. It inhibits Smad-3 mediated responses of Transforming Growth Factor-beta (TGF-beta) including cell growth arrest. The CK1 gamma subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	288
271029	cd14127	STKc_CK1_fungal	Catalytic domain of the Serine/Threonine protein kinase, Fungal Casein Kinase 1 homolog 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CK1 phosphorylates a variety of substrates including enzymes, transcription and splice factors, cytoskeletal proteins, viral oncogenes, receptors, and membrane-associated proteins. There are multiple isoforms of CK1 and in mammals, seven isoforms (alpha, beta, gamma1-3, delta, and epsilon) have been characterized. These isoforms differ mainly in the length and structure of their C-terminal non-catalytic region. This subfamily is composed of fungal CK1 homolog 1 proteins, also called Yck1 in Saccharomyces cerevisiae and Cki1 in Schizosaccharomyces pombe. Yck1 (or Yck1p) and Cki1 are plasma membrane-anchored proteins. Yck1 phosphorylates and regulates Khd1p, an RNA-binding protein that represses translation of bud-localized mRNA. Cki1 phosphorylates and regulates phosphatidylinositol (PI)-(4)P-5-kinase, which catalyzes the last step in the synthesis of PI(4,5)P2, which is involved in actin cytoskeleton remodeling and membrane traffic. The fungal CK1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	277
271030	cd14128	STKc_CK1_alpha	Catalytic domain of the Serine/Threonine protein kinases, Casein Kinase 1 alpha. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CK1 phosphorylates a variety of substrates including enzymes, transcription and splice factors, cytoskeletal proteins, viral oncogenes, receptors, and membrane-associated proteins. There are multiple isoforms of CK1 and in mammals, seven isoforms (alpha, beta, gamma1-3, delta, and epsilon) have been characterized. These isoforms differ mainly in the length and structure of their C-terminal non-catalytic region. CK1alpha plays a role in cell cycle progression, spindle dynamics, and chromosome segregation. It is also involved in regulating apoptosis mediated by Fas or the retinoid X receptor (RXR), and is a positive regulator of Wnt signaling. CK1alpha phosphorylates the NS5A protein of flaviviruses such as the Hepatitis C virus (HCV) and yellow fever virus (YFV), and influences flaviviral replication. The CK1 alpha subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	266
271031	cd14129	STKc_TTBK2	Catalytic domain of the Serine/Threonine protein kinase, Tau-Tubulin Kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. TTBK is a neuron-specific kinase that phosphorylates the microtubule-associated protein tau and promotes its aggregation. Higher vertebrates contain two TTBK proteins, TTBK1 and TTBK2, both of which have been implicated in neurodegeneration. Mutations in TTBK2 are associated with the development of spinocerebellar ataxia type 11, belonging to a group of neurodegenerative disorders characterized by progressive incoordination, dysarthria and impairment of eye movements. Brain tissues of SCA11 patients show the presence of neurofibrillary tangles and tau deposition in the brain, similar to Alzheimer's disease (AD) patients. The TTBK2 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	262
271032	cd14130	STKc_TTBK1	Catalytic domain of the Serine/Threonine protein kinase, Tau-Tubulin Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. TTBK is a neuron-specific kinase that phosphorylates the microtubule-associated protein tau and promotes its aggregation. Higher vertebrates contain two TTBK proteins, TTBK1 and TTBK2, both of which have been implicated in neurodegeneration. Genetic variations in TTBK1 are linked to Alzheimer's disease (AD). Hyperphosphorylated tau is a major component of paired helical filaments that accumulate in the brain of AD patients. Studies in transgenic mice show that TTBK1 is involved in the phosphorylation-dependent pathogenic aggregation of tau. The TTBK1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	262
271033	cd14131	PKc_Mps1	Catalytic domain of the Dual-specificity Mitotic checkpoint protein kinase, Monopolar spindle 1 (also called TTK). Dual-specificity PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine as well as tyrosine residues on protein substrates. TTK/Mps1 is a spindle checkpoint kinase that was first discovered due to its necessity in centrosome duplication in budding yeast. It was later found to function in the spindle assembly checkpoint, which monitors the proper attachment of chromosomes to the mitotic spindle. In yeast, substrates of Mps1 include the spindle pole body components Spc98p, Spc110p, and Spc42p. The TTK/Mps1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein serine/threonine PKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	271
271034	cd14132	STKc_CK2_alpha	Catalytic subunit (alpha) of the Serine/Threonine Kinase, Casein Kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CK2 is a tetrameric protein with two catalytic (alpha) and two regulatory (beta) subunits. It is constitutively active and ubiquitously expressed, and is found in the cytoplasm, nucleus, as well as in the plasma membrane. It phosphorylates a wide variety of substrates including glycogen synthase, cell cycle proteins, nuclear proteins (e.g. DNA topoisomerase II), and ion channels (e.g. ENaC), among others. It may be considered a master kinase controlling the activity or lifespan of many other kinases and exerting its effect over cell fate, gene expression, protein synthesis and degradation, and viral infection. CK2 is implicated in every stage of the cell cycle and is required for cell cycle progression. It plays crucial roles in cell differentiation, proliferation, and survival, and is thus implicated in cancer. CK2 is not an oncogene by itself but elevated CK2 levels create an environment that enhances the survival of tumor cells. The CK2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	306
271035	cd14133	PKc_DYRK_like	Catalytic domain of Dual-specificity tYrosine-phosphorylated and -Regulated Kinase-like protein kinases. Dual-specificity PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine (S/T) as well as tyrosine residues on protein substrates. This subfamily is composed of the dual-specificity DYRKs and YAK1, as well as the S/T kinases (STKs), HIPKs. DYRKs and YAK1 autophosphorylate themselves on tyrosine residues and phosphorylate their substrates exclusively on S/T residues. Proteins in this subfamily play important roles in cell proliferation, differentiation, survival, growth, and development. The DYRK-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	262
271036	cd14134	PKc_CLK	Catalytic domain of the Dual-specificity protein kinases, CDC-like kinases. Dual-specificity PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine (S/T) as well as tyrosine residues on protein substrates. CLKs are involved in the phosphorylation and regulation of serine/arginine-rich (SR) proteins, which play a crucial role in pre-mRNA splicing by directing splice site selection. SR proteins are phosphorylated first by SR protein kinases (SRPKs) at the N-terminus, which leads to their assembly into nuclear speckles where splicing factors are stored. CLKs phosphorylate the C-terminal part of SR proteins, causing the nuclear speckles to dissolve and splicing factors to be recruited at sites of active transcription. Based on a conserved "EHLAMMERILG" signature motif which may be crucial for substrate specificity, CLKs are also referred to as LAMMER kinases. CLKs autophosphorylate at tyrosine residues and phosphorylate their substrates exclusively on S/T residues. In Drosophila, the CLK homolog DOA (Darkener of apricot) is essential for embryogenesis and its mutation leads to defects in sexual differentiation, eye formation, and neuronal development. In fission yeast, the CLK homolog Lkh1 is a negative regulator of filamentous growth and asexual flocculation, and is also involved in oxidative stress response. Vertebrates contain multiple CLK proteins and mammals have four (CLK1-4). The CLK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	332
271037	cd14135	STKc_PRP4	Catalytic domain of the Serine/Threonine Kinase, Pre-mRNA-Processing factor 4. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PRP4 phosphorylates a number of factors involved in the formation of active spliceosomes, which catalyze pre-mRNA splicing. It phosphorylates PRP6 and PRP31, components of the U4/U6-U5 tri-small nuclear ribonucleoprotein (snRNP), during spliceosomal complex formation. In fission yeast, PRP4 phosphorylates the splicing factor PRP1 (U5-102 kD in mammals). Thus, PRP4 plays a key role in regulating spliceosome assembly and pre-mRNA splicing. It also plays an important role in mitosis by acting as a spindle assembly checkpoint kinase that is required for chromosome alignment and the recruitment of the checkpoint proteins MPS1, MAD1, and MAD2 at kinetochores. The PRP4 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	318
271038	cd14136	STKc_SRPK	Catalytic domain of the Serine/Threonine Kinase, Serine-aRginine Protein Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. SRPKs phosphorylate and regulate splicing factors from the SR protein family by specifically phosphorylating multiple serine residues residing in SR/RS dipeptide motifs (also known as RS domains). Phosphorylation of the RS domains enhances interaction with transportin SR and facilitates entry of the SR proteins into the nucleus. SRPKs contain a nonconserved insert domain, within the well-conserved catalytic kinase domain, that regulates their subcellular localization. They play important roles in mediating pre-mRNA processing and mRNA maturation, as well as other cellular functions such as chromatin reorganization, cell cycle and p53 regulation, and metabolic signaling. Vertebrates contain three distinct SRPKs, called SRPK1-3. The SRPK homolog in budding yeast, Sky1p, recognizes and phosphorylates its substrate Npl3p, which lacks a classic RS domain but contains a single RS dipeptide at the C-terminus of its RGG domain. Npl3p is a shuttling heterogeneous nuclear ribonucleoprotein (hnRNP) that exports a distinct class of mRNA from the nucleus to the cytoplasm. The SRPK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	320
271039	cd14137	STKc_GSK3	The catalytic domain of the Serine/Threonine Kinase, Glycogen Synthase Kinase 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. GSK3 is a multifunctional kinase involved in many cellular processes including cell division, proliferation, differentiation, adhesion, and apoptosis. In plants, GSK3 plays a role in the response to osmotic stress. In Caenorhabditis elegans, it plays a role in regulating normal oocyte-to-embryo transition and response to oxidative stress. In Chlamydomonas reinhardtii, GSK3 regulates flagellar length and assembly. In mammals, there are two isoforms, GSK3alpha and GSK3beta, which show both distinct and redundant functions. The two isoforms differ mainly in their N-termini. They are both involved in axon formation and in Wnt signaling. They play distinct roles in cardiogenesis, with GSKalpha being essential in cardiomyocyte survival, and GSKbeta regulating heart positioning and left-right symmetry. GSK3beta was first identified as a regulator of glycogen synthesis, but has since been determined to play other roles. It regulates the degradation of beta-catenin and IkB. Beta-catenin is the main effector of Wnt, which is involved in normal haematopoiesis and stem cell function. IkB is a central inhibitor of NF-kB, which is critical in maintaining leukemic cell growth. GSK3beta is enriched in the brain and is involved in regulating neuronal signaling pathways. It is implicated in the pathogenesis of many diseases including Type II diabetes, obesity, mood disorders, Alzheimer's disease, osteoporosis, and some types of cancer, among others. The GSK3 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	293
271040	cd14138	PTKc_Wee1a	Catalytic domain of the Protein Tyrosine Kinase, Wee1a. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. This subfamily is composed of human Wee1a, Xenopus laevis Wee1b (XeWee1b) and similar vertebrate proteins. Members of this subfamily show a wide expression pattern. XeWee1b functions after the first zygotic cell divisions. It is expressed in all tissues and is also present after the gastrulation stage of embryos. Wee1 is a cell cycle checkpoint kinase that helps keep the cyclin-dependent kinase CDK1 in an inactive state through phosphorylation of an N-terminal tyr (Y15) residue. During the late G2 phase, CDK1 is activated and mitotic entry is promoted by the removal of this inhibitory phosphorylation by the phosphatase Cdc25. Although Wee1 is functionally a tyr kinase, it is more closely related to serine/threonine kinases (STKs). It contains a catalytic kinase domain sandwiched in between N- and C-terminal regulatory domains. It is regulated by phosphorylation and degradation, and its expression levels are also controlled by circadian clock proteins. The Wee1a subfamily is part of a larger superfamily that includes the catalytic domains of STKs, other PTKs, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	276
271041	cd14139	PTKc_Wee1b	Catalytic domain of the Protein Tyrosine Kinase, Wee1b. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. This subfamily is composed of human Wee1b (also called Wee2), Xenopus laevis Wee1a (XeWee1a) and similar vertebrate proteins. XeWee1a accumulates after exiting the metaphase II stage in oocytes and in early mitotic cells. It functions during the first zygotic cell division and not during subsequent divisions. Mammalian Wee2/Wee1b is an oocyte-specific inhibitor of meiosis that functions downstream of cAMP. Wee1 is a cell cycle checkpoint kinase that helps keep the cyclin-dependent kinase CDK1 in an inactive state through phosphorylation of an N-terminal tyr (Y15) residue. During the late G2 phase, CDK1 is activated and mitotic entry is promoted by the removal of this inhibitory phosphorylation by the phosphatase Cdc25. Although Wee1 is functionally a tyr kinase, it is more closely related to serine/threonine kinases (STKs). It contains a catalytic kinase domain sandwiched in between N- and C-terminal regulatory domains. It is regulated by phosphorylation and degradation, and its expression levels are also controlled by circadian clock proteins. The Wee1b subfamily is part of a larger superfamily that includes the catalytic domains of STKs, other PTKs, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	274
271042	cd14140	STKc_ACVR2b	Catalytic domain of the Serine/Threonine Kinase, Activin Type IIB Receptor. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. ACVR2b (or ActRIIB) belongs to a group of receptors for the TGFbeta family of secreted signaling molecules that includes TGFbeta, bone morphogenetic proteins (BMPs), activins, growth and differentiation factors (GDFs), and anti-Mullerian hormone, among others. These receptors contain an extracellular domain that binds ligands, a single transmembrane region, and a cytoplasmic catalytic kinase domain. ACVR2b is one of two ACVR2 receptors found in vertebrates. Type II receptors are high-affinity receptors which bind ligands, autophosphorylate, as well as trans-phosphorylate and activate low-affinity type I receptors. ACVR2 acts primarily as the receptors for activins, nodal, myostatin, GDF11, and a subset of BMPs. ACVR2 signaling impacts many cellular and physiological processes including reproductive and gonadal functions, myogenesis, bone remodeling and tooth development, kidney organogenesis, apoptosis, fibrosis, inflammation, and neurogenesis. The ACVR2b subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	291
271043	cd14141	STKc_ACVR2a	Catalytic domain of the Serine/Threonine Kinase, Activin Type IIA Receptor. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. ACVR2a (or ActRIIA) belongs to a group of receptors for the TGFbeta family of secreted signaling molecules that includes TGFbeta, bone morphogenetic proteins (BMPs), activins, growth and differentiation factors (GDFs), and anti-Mullerian hormone, among others. These receptors contain an extracellular domain that binds ligands, a single transmembrane region, and a cytoplasmic catalytic kinase domain. ACVR2a is one of two ACVR2 receptors found in vertebrates. Type II receptors are high-affinity receptors which bind ligands, autophosphorylate, as well as trans-phosphorylate and activate low-affinity type I receptors. ACVR2 acts primarily as the receptors for activins, nodal, myostatin, GDF11, and a subset of BMPs. ACVR2 signaling impacts many cellular and physiological processes including reproductive and gonadal functions, myogenesis, bone remodeling and tooth development, kidney organogenesis, apoptosis, fibrosis, inflammation, and neurogenesis. The ACVR2a subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	290
271044	cd14142	STKc_ACVR1_ALK1	Catalytic domain of the Serine/Threonine Kinases, Activin Type I Receptor and Activin receptor-Like Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. ACVR1, also called Activin receptor-Like Kinase 2 (ALK2), and ALK1 act as receptors for bone morphogenetic proteins (BMPs) and they activate SMAD1/5/8. ACVR1 is widely expressed while ALK1 is limited mainly to endothelial cells. The specificity of BMP binding to type I receptors is affected by type II receptors. ACVR1 binds BMP6/7/9/10 and can also bind anti-Mullerian hormone (AMH) in the presence of AMHR2. ALK1 binds BMP9/10 as well as TGFbeta in endothelial cells. A missense mutation in the GS domain of ACVR1 causes fibrodysplasia ossificans progressiva, a complex and disabling disease characterized by congenital skeletal malformations and extraskeletal bone formation. ACVR1 belongs to a group of receptors for the TGFbeta family of secreted signaling molecules that includes TGFbeta, BMPs, activins, growth and differentiation factors, and AMH, among others. These receptors contain an extracellular domain that binds ligands, a single transmembrane (TM) region, and a cytoplasmic catalytic kinase domain. Type I receptors, like ACVR1 and ALK1, are low-affinity receptors that bind ligands only after they are recruited by the ligand/type II high-affinity receptor complex. Following activation, they start intracellular signaling to the nucleus by phosphorylating SMAD proteins. Type I receptors contain an additional domain located between the TM and kinase domains called the GS domain, which contains the activating phosphorylation site and confers preference for specific SMAD proteins. The ACVR1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	298
271045	cd14143	STKc_TGFbR1_ACVR1b_ACVR1c	Catalytic domain of the Serine/Threonine Kinases, Transforming Growth Factor beta Type I Receptor and Activin Type IB/IC Receptors. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. TGFbR1, also called Activin receptor-Like Kinase 5 (ALK5), functions as a receptor for TGFbeta and phosphorylates SMAD2/3. TGFbeta proteins are cytokines that regulate cell growth, differentiation, and survival, and are critical in the development and progression of many human cancers. Mutations in TGFbR1 (and TGFbR2) can cause aortic aneurysm disorders such as Loeys-Dietz and Marfan syndromes. ACVR1b (also called ALK4) and ACVR1c (also called ALK7) act as receptors for activin A and B, respectively. TGFbR1, ACVR1b, and ACVR1c belong to a group of receptors for the TGFbeta family of secreted signaling molecules that includes TGFbeta, bone morphogenetic proteins, activins, growth and differentiation factors, and anti-Mullerian hormone, among others. These receptors contain an extracellular domain that binds ligands, a single transmembrane (TM) region, and a cytoplasmic catalytic kinase domain. Type I receptors, like TGFbR1, ACVR1b, and ACVR1c, are low-affinity receptors that bind ligands only after they are recruited by the ligand/type II high-affinity receptor complex. Following activation, they start intracellular signaling to the nucleus by phosphorylating SMAD proteins. Type I receptors contain an additional domain located between the TM and kinase domains called the GS domain, which contains the activating phosphorylation site and confers preference for specific SMAD proteins. The TGFbR1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	288
271046	cd14144	STKc_BMPR1	Catalytic domain of the Serine/Threonine Kinase, Bone Morphogenetic Protein Type I Receptor. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. BMPR1 functions as a receptor for bone morphogenetic proteins (BMPs), which are involved in the regulation of cell proliferation, survival, differentiation, and apoptosis. BMPs are able to induce bone, cartilage, ligament, and tendon formation, and may play roles in bone diseases and tumors. Vertebrates contain two type I BMP receptors, BMPR1a and BMPR1b. BMPR1 belongs to a group of receptors for the TGFbeta family of secreted signaling molecules that also includes TGFbeta, activins, growth and differentiation factors, and anti-Mullerian hormone, among others. These receptors contain an extracellular domain that binds ligands, a single transmembrane (TM) region, and a cytoplasmic catalytic kinase domain. Type I receptors, like BMPR1, are low-affinity receptors that bind ligands only after they are recruited by the ligand/type II high-affinity receptor complex. Following activation, they start intracellular signaling to the nucleus by phosphorylating SMAD proteins. Type I receptors contain an additional domain located between the TM and kinase domains called the GS domain, which contains the activating phosphorylation site and confers preference for specific SMAD proteins. The BMPR1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	287
271047	cd14145	STKc_MLK1	Catalytic domain of the Serine/Threonine Kinase, Mixed Lineage Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MLK1 is a mitogen-activated protein kinase kinase kinase (MAP3K, MKKK, MAPKKK) and is also called MAP3K9. MAP3Ks phosphorylate and activate MAPK kinases (MAPKKs or MKKs or MAP2Ks), which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. Little is known about the specific function of MLK1. It is capable of activating the c-Jun N-terminal kinase pathway. Mice lacking both MLK1 and MLK2 are viable, fertile, and have normal life spans. There could be redundancy in the function of MLKs. Mammals have four MLKs, mostly conserved in vertebrates, which contain an SH3 domain, a catalytic kinase domain, a leucine zipper, a proline-rich region, and a CRIB domain that mediates binding to GTP-bound Cdc42 and Rac. MLKs play roles in immunity and inflammation, as well as in cell death, proliferation, and cell cycle regulation. The MLK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	270
271048	cd14146	STKc_MLK4	Catalytic domain of the Serine/Threonine Kinase, Mixed Lineage Kinase 4. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MLK4 is a mitogen-activated protein kinase kinase kinase (MAP3K, MKKK, MAPKKK), which phosphorylates and activates MAPK kinases (MAPKKs or MKKs or MAP2Ks), which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. The specific function of MLK4 is yet to be determined. Mutations in the kinase domain of MLK4 have been detected in colorectal cancers. Mammals have four MLKs, mostly conserved in vertebrates, which contain an SH3 domain, a catalytic kinase domain, a leucine zipper, a proline-rich region, and a CRIB domain that mediates binding to GTP-bound Cdc42 and Rac. MLKs play roles in immunity and inflammation, as well as in cell death, proliferation, and cell cycle regulation. The MLK4 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	268
271049	cd14147	STKc_MLK3	Catalytic domain of the Serine/Threonine Kinase, Mixed Lineage Kinase 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MLK3 is a mitogen-activated protein kinase kinase kinase (MAP3K, MKKK, MAPKKK), which phosphorylates and activates MAPK kinases (MAPKKs or MKKs or MAP2Ks), which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. MLK3 activates multiple MAPK pathways and plays a role in apoptosis, proliferation, migration, and differentiation, depending on the cellular context. It is highly expressed in breast cancer cells and its signaling through c-Jun N-terminal kinase has been implicated in the migration, invasion, and malignancy of cancer cells. MLK3 also functions as a negative regulator of Inhibitor of Nuclear Factor-KappaB Kinase (IKK) and consequently, it also impacts inflammation and immunity. Mammals have four MLKs, mostly conserved in vertebrates, which contain an SH3 domain, a catalytic kinase domain, a leucine zipper, a proline-rich region, and a CRIB domain that mediates binding to GTP-bound Cdc42 and Rac. MLKs play roles in immunity and inflammation, as well as in cell death, proliferation, and cell cycle regulation. The MLK3 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	267
271050	cd14148	STKc_MLK2	Catalytic domain of the Serine/Threonine Kinase, Mixed Lineage Kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MLK2 is a mitogen-activated protein kinase kinase kinase (MAP3K, MKKK, MAPKKK) and is also called MAP3K10. MAP3Ks phosphorylate and activate MAPK kinases (MAPKKs or MKKs or MAP2Ks), which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. MLK2 is abundant in brain, skeletal muscle, and testis. It functions upstream of the MAPK, c-Jun N-terminal kinase. It binds hippocalcin, a calcium-sensor protein that protects neurons against calcium-induced cell death. Both MLK2 and hippocalcin may be associated with the pathogenesis of Parkinson's disease. MLK2 also binds to normal huntingtin (Htt), which is important in neuronal transcription, development, and survival. MLK2 does not bind to the polyglutamine-expanded Htt, which is implicated in the pathogenesis of Huntington's disease, leading to neuronal toxicity. Mammals have four MLKs, mostly conserved in vertebrates, which contain an SH3 domain, a catalytic kinase domain, a leucine zipper, a proline-rich region, and a CRIB domain that mediates binding to GTP-bound Cdc42 and Rac. MLKs play roles in immunity and inflammation, as well as in cell death, proliferation, and cell cycle regulation. The MLK2 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase (PI3K).	258
271051	cd14149	STKc_C-Raf	Catalytic domain of the Serine/Threonine Kinase, C-Raf (Rapidly Accelerated Fibrosarcoma) kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. C-Raf, also known as Raf-1 or c-Raf-1, is ubiquitously expressed and was the first Raf identified. It was characterized as the acquired oncogene from an acutely transforming murine sarcoma virus (3611-MSV) and the transforming agent from the avian retrovirus MH2. C-Raf-deficient mice embryos die around midgestation with increased apoptosis of embryonic tissues, especially in the fetal liver. One of the main functions of C-Raf is restricting caspase activation to promote survival in response to specific stimuli such as Fas stimulation, macrophage apoptosis, and erythroid differentiation. C-Raf is a mitogen-activated protein kinase kinase kinase (MAP3K, MKKK, MAPKKK), which phosphorylates and activates MAPK kinases (MAPKKs or MKKs or MAP2Ks), which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. It functions in the linear Ras-Raf-MEK-ERK pathway that regulates many cellular processes including cycle regulation, proliferation, differentiation, survival, and apoptosis. The C-Raf subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	283
271052	cd14150	STKc_A-Raf	Catalytic domain of the Serine/Threonine Kinase, A-Raf (Rapidly Accelerated Fibrosarcoma) kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. A-Raf cooperates with C-Raf in regulating ERK transient phosphorylation that is associated with cyclin D expression and cell cycle progression. Mice deficient in A-Raf are born alive but show neurological and intestinal defects. A-Raf demonstrates low kinase activity to MEK, compared with B- and C-Raf, and may also have alternative functions other than in the ERK signaling cascade. It regulates the M2 type pyruvate kinase, a key glycolytic enzyme. It also plays a role in endocytic membrane trafficking. A-Raf is a mitogen-activated protein kinase kinase kinase (MAP3K, MKKK, MAPKKK), which phosphorylates and activates MAPK kinases (MAPKKs or MKKs or MAP2Ks), which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. It functions in the linear Ras-Raf-MEK-ERK pathway that regulates many cellular processes including cycle regulation, proliferation, differentiation, survival, and apoptosis. The A-Raf subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	265
271053	cd14151	STKc_B-Raf	Catalytic domain of the Serine/Threonine Kinase, B-Raf (Rapidly Accelerated Fibrosarcoma) kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. B-Raf activates ERK with the strongest magnitude, compared with other Raf kinases. Mice embryos deficient in B-Raf die around midgestation due to vascular hemorrhage caused by apoptotic endothelial cells. Mutations in B-Raf have been implicated in initiating tumorigenesis and tumor progression, and are found in malignant cutaneous melanoma, papillary thyroid cancer, as well as in ovarian and colorectal carcinomas. Most oncogenic B-Raf mutations are located at the activation loop of the kinase and surrounding regions; the V600E mutation accounts for around 90% of oncogenic mutations. The V600E mutant constitutively activates MEK, resulting in sustained activation of ERK. B-Raf is a mitogen-activated protein kinase kinase kinase (MAP3K, MKKK, MAPKKK), which phosphorylates and activates MAPK kinases (MAPKKs or MKKs or MAP2Ks), which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. It functions in the linear Ras-Raf-MEK-ERK pathway that regulates many cellular processes including cycle regulation, proliferation, differentiation, survival, and apoptosis. The B-Raf subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	274
271054	cd14152	STKc_KSR1	Catalytic domain of the Serine/Threonine Kinase, Kinase Suppressor of Ras 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. KSR1 functions as a transducer of TNFalpha-stimulated C-Raf activation of ERK1/2 and NF-kB. Detected activity of KSR1 is cell type specific and context dependent. It is inactive in normal colon epithelial cells and becomes activated at the onset of inflammatory bowel disease (IBD). Similarly, KSR1 activity is undetectable prior to stimulation by EGF or ceramide in COS-7 or YAMC cells, respectively. KSR proteins are widely regarded as pseudokinases; however, this is debated, as catalytic activity has been detected for KSR1 in some systems. The KSR1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	279
271055	cd14153	PK_KSR2	Pseudokinase domain of Kinase Suppressor of Ras 2. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity. KSR2 interacts with the protein phosphatase calcineurin and functions in calcium-mediated ERK signaling. It also functions in energy metabolism by regulating AMP kinase and AMPK-dependent processes such as glucose uptake and fatty acid oxidation. KSR proteins act as scaffold proteins that function downstream of Ras and upstream of Raf in the Extracellular signal-Regulated Kinase (ERK) pathway that regulates many cellular processes including cycle regulation, proliferation, differentiation, survival, and apoptosis. KSR proteins regulate the assembly and activation of the Raf/MEK/ERK module upon Ras activation at the membrane by direct association of its components. They are widely regarded as pseudokinases. The KSR2 subfamily is part of a larger superfamily that includes the catalytic domains of other protein kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	270
271056	cd14154	STKc_LIMK	Catalytic domain of the Serine/Threonine Kinase, LIM domain kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. LIMKs phosphorylate and inactivate cofilin, an actin depolymerizing factor, to induce the reorganization of the actin cytoskeleton. They act downstream of Rho GTPases and are expressed ubiquitously. As regulators of actin dynamics, they contribute to diverse cellular functions such as cell motility, morphogenesis, differentiation, apoptosis, meiosis, mitosis, and neurite extension. LIMKs contain the LIM (two repeats), PDZ, and catalytic kinase domains. Vertebrates have two members, LIMK1 and LIMK2. The LIMK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	272
271057	cd14155	PKc_TESK	Catalytic domain of the Dual-specificity protein kinase, Testicular protein kinase. Dual-specificity PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine as well as tyrosine residues on protein substrates. TESK proteins phosphorylate cofilin and induce actin cytoskeletal reorganization. In the Drosophila eye, TESK is required for epithelial cell organization. Mammals contain two TESK proteins, TESK1 and TESK2, which are highly expressed in testis and play roles in spermatogenesis. TESK1 is found in testicular germ cells while TESK2 is expressed mainly in nongerminal Sertoli cells. TESK1 is stimulated by integrin-mediated signaling pathways. It regulates cell spreading and focal adhesion formation. The TESK subfamily is part of a larger superfamily that includes the catalytic domains of other protein serine/threonine PKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	253
271058	cd14156	PKc_LIMK_like_unk	Catalytic domain of an unknown subfamily of LIM domain kinase-like protein kinases. PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine or tyrosine residues on protein substrates. This group is composed of uncharacterized proteins with similarity to LIMK and Testicular or testis-specific protein kinase (TESK). LIMKs are characterized as serine/threonine kinases (STKs) while TESKs are dual-specificity protein kinases. Both LIMK and TESK phosphorylate and inactivate cofilin, an actin depolymerizing factor, to induce the reorganization of the actin cytoskeleton. They are implicated in many cellular functions including cell spreading, motility, morphogenesis, meiosis, mitosis, and spermatogenesis. The LIMK-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	256
271059	cd14157	STKc_IRAK2	Catalytic domain of the Serine/Threonine kinase, Interleukin-1 Receptor Associated Kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. IRAKs are involved in Toll-like receptor (TLR) and interleukin-1 (IL-1) signalling pathways, and are thus critical in regulating innate immune responses and inflammation. IRAKs contain an N-terminal Death domain (DD), a proST region (rich in serines, prolines, and threonines), a central kinase domain, and a C-terminal domain; IRAK-4 lacks the C-terminal domain. Vertebrates contain four IRAKs (IRAK-1, -2, -3 (or -M), and -4) that display distinct functions and patterns of expression and subcellular distribution, and can differentially mediate TLR signaling. IRAK2 plays a role in mediating NFkB activation by TLR3, TLR4, and TLR8. It is specifically targeted by the viral protein A52, which is important for virulence, to inhibit all IL-1/TLR pathways, indicating that IRAK2 has a predominant role in NFkB activation. It is redundant with IRAK1 in early signaling but is critical for late and sustained activation. The IRAK2 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	289
271060	cd14158	STKc_IRAK4	Catalytic domain of the Serine/Threonine kinase, Interleukin-1 Receptor Associated Kinase 4. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. IRAKs are involved in Toll-like receptor (TLR) and interleukin-1 (IL-1) signalling pathways, and are thus critical in regulating innate immune responses and inflammation. IRAKs contain an N-terminal Death domain (DD), a proST region (rich in serines, prolines, and threonines), a central kinase domain, and a C-terminal domain; IRAK-4 lacks the C-terminal domain. Vertebrates contain four IRAKs (IRAK-1, -2, -3 (or -M), and -4) that display distinct functions and patterns of expression and subcellular distribution, and can differentially mediate TLR signaling. IRAK4 plays a critical role in NFkB activation by its interaction with MyD88, which acts as a scaffold that enables IRAK4 to phosphorylate and activate IRAK1 and/or IRAK2. It also plays an important role in type I IFN production induced by TLR7/8/9. The IRAK4 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	288
271061	cd14159	STKc_IRAK1	Catalytic domain of the Serine/Threonine kinase, Interleukin-1 Receptor Associated Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. IRAKs are involved in Toll-like receptor (TLR) and interleukin-1 (IL-1) signalling pathways, and are thus critical in regulating innate immune responses and inflammation. IRAKs contain an N-terminal Death domain (DD), a proST region (rich in serines, prolines, and threonines), a central kinase domain, and a C-terminal domain; IRAK-4 lacks the C-terminal domain. Vertebrates contain four IRAKs (IRAK-1, -2, -3 (or -M), and -4) that display distinct functions and patterns of expression and subcellular distribution, and can differentially mediate TLR signaling. IRAK1 plays a role in the activation of IRF3/7, STAT, and NFkB. It mediates IL-6 and IFN-gamma responses following IL-1 and IL-18 stimulation, respectively. It also plays an essential role in IFN-alpha induction downstream of TLR7 and TLR9. The IRAK1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	296
271062	cd14160	PK_IRAK3	Pseudokinase domain of Interleukin-1 Receptor Associated Kinase 3. The pseudokinase domain shows similarity to protein kinases but lacks crucial residues for catalytic activity. IRAKs are involved in Toll-like receptor (TLR) and interleukin-1 (IL-1) signalling pathways, and are thus critical in regulating innate immune responses and inflammation. IRAKs contain an N-terminal Death domain (DD), a proST region (rich in serines, prolines, and threonines), a central kinase domain (a pseudokinase in the case of IRAK3), and a C-terminal domain; IRAK-4 lacks the C-terminal domain. Vertebrates contain four IRAKs (IRAK-1, -2, -3 (or -M), and -4) that display distinct functions and patterns of expression and subcellular distribution, and can differentially mediate TLR signaling. IRAK3 (or IRAK-M) is the only IRAK that does not show kinase activity. It is found only in monocytes and macrophages in humans, and functions as a negative regulator of TLR signaling including TLR-2 induced p38 activation. It also negatively regulates the alternative NFkB pathway in a TLR-2 specific manner. IRAK3 is downregulated in the monocytes of obese people, and is associated with high SOD2, a marker of mitochondrial oxidative stress. It is an important inhibitor of inflammation in association with obesity and metabolic syndrome. The IRAK3 subfamily is part of a larger superfamily that includes the catalytic domains of other protein serine/threonine kinases, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	276
271063	cd14161	STKc_NUAK2	Catalytic domain of the Serine/Threonine Kinase, novel (nua) kinase family NUAK 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. NUAK proteins are classified as AMP-activated protein kinase (AMPK)-related kinases, which like AMPK are activated by the major tumor suppressor LKB1. Vertebrates contain two NUAK proteins, called NUAK1 and NUAK2. NUAK2, also called SNARK (Sucrose, non-fermenting 1/AMP-activated protein kinase-related kinase), is involved in energy metabolism. It is activated by hyperosmotic stress, DNA damage, and nutrients such as glucose and glutamine. NUAK2-knockout mice develop obesity, altered serum lipid profiles, hyperinsulinaemia, hyperglycaemia, and impaired glucose tolerance. NUAK2 is implicated in regulating actin stress fiber assembly through its association with myosin phosphatase Rho-interacting protein (MRIP), which leads to an increase in myosin regulatory light chain (MLC) phosphorylation. It is also associated with tumor growth, migration, and oncogenicity of melanoma cells. The NUAK2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	255
271064	cd14162	STKc_TSSK4-like	Catalytic domain of testis-specific serine/threonine kinase 4 and similar proteins. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. TSSK proteins are almost exclusively expressed postmeiotically in the testis and play important roles in spermatogenesis and/or spermiogenesis. There are five mammalian TSSK proteins which show differences in their localization and timing of expression. TSSK4, also called TSSK5, is expressed in testis from haploid round spermatids to mature spermatozoa. It phosphorylates Cre-Responsive Element Binding protein (CREB), facilitating the binding of CREB to the specific cis cAMP responsive element (CRE), which is important in activating genes related to germ cell differentiation. Mutations in the human TSSK4 gene are associated with impaired spermatogenesis in infertile Chinese men. The TSSK4-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	259
271065	cd14163	STKc_TSSK3-like	Catalytic domain of testis-specific serine/threonine kinase 3 and similar proteins. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. TSSK proteins are almost exclusively expressed postmeiotically in the testis and play important roles in spermatogenesis and/or spermiogenesis. There are five mammalian TSSK proteins which show differences in their localization and timing of expression. TSSK3 has been reported to be expressed in the interstitial Leydig cells of adult testis. Its mRNA level is low at birth, increases at puberty, and remains high throughout adulthood. The TSSK3-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	257
271066	cd14164	STKc_TSSK6-like	Catalytic domain of testis-specific serine/threonine kinase 6 and similar proteins. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. TSSK proteins are almost exclusively expressed postmeiotically in the testis and play important roles in spermatogenesis and/or spermiogenesis. There are five mammalian TSSK proteins which show differences in their localization and timing of expression. TSSK6, also called SSTK, is expressed at the head of elongated sperm. It can phosphorylate histones and associate with heat shock proteins HSP90 and HSC70. Male mice deficient in TSSK6 are infertile, showing spermatogenic impairment including reduced sperm counts, impaired DNA condensation, abnormal morphology and decreased motility rates. The TSSK6-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	256
271067	cd14165	STKc_TSSK1_2-like	Catalytic domain of testis-specific serine/threonine kinase 1, TSSK2, and similar proteins. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. TSSK proteins are almost exclusively expressed postmeiotically in the testis and play important roles in spermatogenesis and/or spermiogenesis. There are five mammalian TSSK proteins which show differences in their localization and timing of expression. TSSK1 and TSSK2 are expressed specifically in meiotic and postmeiotic spermatogenic cells, respectively. TSSK2 is localized in the sperm neck, equatorial segment, and mid-piece of the sperm tail. Both TSSK1 and TSSK2 phosphorylate their common substrate TSKS (testis-specific-kinase-substrate). TSSK1/TSSK2 double knock-out mice are sterile without manifesting other defects, making these kinases viable targets for male contraception. The TSSK1/2-like subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	263
271068	cd14166	STKc_CaMKI_gamma	Catalytic domain of the Serine/Threonine kinase, Calcium/calmodulin-dependent protein kinase Type I gamma. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CaMKs are multifunctional calcium and calmodulin (CaM) stimulated STKs involved in cell cycle regulation. The CaMK family includes CaMKI, CaMKII, CaMKIV, and CaMK kinase (CaMKK). In vertebrates, there are four CaMKI proteins encoded by different genes (alpha, beta, gamma, and delta), each producing at least one variant. CaMKs contain an N-terminal catalytic domain and a C-terminal regulatory domain that harbors a CaM binding site. CaMKI proteins are monomeric and they play pivotal roles in the nervous system, including long-term potentiation, dendritic arborization, neurite outgrowth, and the formation of spines, synapses, and axons. In addition, they may be involved in osteoclast differentiation and bone resorption. The CaMKI-gamma subfamily is part of a larger superfamily that includes the catalytic domains of other protein kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	285
271069	cd14167	STKc_CaMKI_alpha	Catalytic domain of the Serine/Threonine kinase, Calcium/calmodulin-dependent protein kinase Type I alpha. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CaMKs are multifunctional calcium and calmodulin (CaM) stimulated STKs involved in cell cycle regulation. The CaMK family includes CaMKI, CaMKII, CaMKIV, and CaMK kinase (CaMKK). In vertebrates, there are four CaMKI proteins encoded by different genes (alpha, beta, gamma, and delta), each producing at least one variant. CaMKs contain an N-terminal catalytic domain and a C-terminal regulatory domain that harbors a CaM binding site. CaMKI proteins are monomeric and they play pivotal roles in the nervous system, including long-term potentiation, dendritic arborization, neurite outgrowth, and the formation of spines, synapses, and axons. In addition, they may be involved in osteoclast differentiation and bone resorption. The CaMKI-alpha subfamily is part of a larger superfamily that includes the catalytic domains of other protein kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	263
271070	cd14168	STKc_CaMKI_delta	Catalytic domain of the Serine/Threonine kinase, Calcium/calmodulin-dependent protein kinase Type I delta. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CaMKs are multifunctional calcium and calmodulin (CaM) stimulated STKs involved in cell cycle regulation. The CaMK family includes CaMKI, CaMKII, CaMKIV, and CaMK kinase (CaMKK). In vertebrates, there are four CaMKI proteins encoded by different genes (alpha, beta, gamma, and delta), each producing at least one variant. CaMKs contain an N-terminal catalytic domain and a C-terminal regulatory domain that harbors a CaM binding site. CaMKI proteins are monomeric and they play pivotal roles in the nervous system, including long-term potentiation, dendritic arborization, neurite outgrowth, and the formation of spines, synapses, and axons. In addition, they may be involved in osteoclast differentiation and bone resorption. The CaMKI-delta subfamily is part of a larger superfamily that includes the catalytic domains of other protein kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	301
271071	cd14169	STKc_CaMKI_beta	Catalytic domain of the Serine/Threonine kinase, Calcium/calmodulin-dependent protein kinase Type I beta. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CaMKs are multifunctional calcium and calmodulin (CaM) stimulated STKs involved in cell cycle regulation. The CaMK family includes CaMKI, CaMKII, CaMKIV, and CaMK kinase (CaMKK). In vertebrates, there are four CaMKI proteins encoded by different genes (alpha, beta, gamma, and delta), each producing at least one variant. CaMKs contain an N-terminal catalytic domain and a C-terminal regulatory domain that harbors a CaM binding site. CaMKI proteins are monomeric and they play pivotal roles in the nervous system, including long-term potentiation, dendritic arborization, neurite outgrowth, and the formation of spines, synapses, and axons. In addition, they may be involved in osteoclast differentiation and bone resorption. The CaMKI-beta subfamily is part of a larger superfamily that includes the catalytic domains of other protein kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	277
271072	cd14170	STKc_MAPKAPK2	Catalytic domain of the Serine/Threonine kinase, Mitogen-activated protein kinase-activated protein kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MAPK-activated protein kinase 2 (MAPKAP2 or MK2) contains an N-terminal proline-rich region that can bind to SH3 domains, a catalytic kinase domain followed by a C-terminal autoinhibitory region that contains nuclear localization (NLS) and nuclear export (NES) signals with a p38 MAPK docking motif that overlaps the NLS. MK2 is a bona fide substrate for the MAPK p38. It is closely related to MK3 and thus far, MK2/3 show indistinguishable substrate specificity. They are mainly involved in the regulation of gene expression and they participate in diverse cellular processes such as endocytosis, cytokine production, cytoskeletal reorganization, cell migration, cell cycle control and chromatin remodeling. They are implicated in inflammation and cancer and their substrates include mRNA-AU-rich-element (ARE)-binding proteins (TTP and hnRNP A0), Hsp proteins (Hsp27 and Hsp25) and RSK, among others. MK2/3 are both expressed ubiquitously but MK2 is expressed at significantly higher levels. The MK2 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	303
271073	cd14171	STKc_MAPKAPK5	Catalytic domain of the Serine/Threonine kinase, Mitogen-activated protein kinase-activated protein kinase 5. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MAPK-activated protein kinase 5 (MAPKAP5 or MK5) is also called PRAK (p38-regulated/activated protein kinase). It contains a catalytic kinase domain followed by a C-terminal autoinhibitory region that contains nuclear localization (NLS) and nuclear export (NES) signals with a p38 MAPK docking motif that overlaps the NLS. MK5 is a ubiquitous protein that is implicated in neuronal morphogenesis, cell migration, and tumor angiogenesis. It interacts with PKA, which induces cytoplasmic translocation of MK5. Its substrates include p53, ERK3/4, Hsp27, and cytosolic phospholipase A2 (cPLA2). The MAPKAPK subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	289
271074	cd14172	STKc_MAPKAPK3	Catalytic domain of the Serine/Threonine kinase, Mitogen-activated protein kinase-activated protein kinase 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MAPK-activated protein kinase 3 (MAPKAP3 or MK3) contains an N-terminal proline-rich region that can bind to SH3 domains, a catalytic kinase domain followed by a C-terminal autoinhibitory region that contains nuclear localization (NLS) and nuclear export (NES) signals with a p38 MAPK docking motif that overlaps the NLS. MK3 is a bona fide substrate for the MAPK p38. It is closely related to MK2 and thus far, MK2/3 show indistinguishable substrate specificity. They are mainly involved in the regulation of gene expression and they participate in diverse cellular processes such as endocytosis, cytokine production, cytoskeletal reorganization, cell migration, cell cycle control and chromatin remodeling. They are implicated in inflammation and cancer and their substrates include mRNA-AU-rich-element (ARE)-binding proteins (TTP and hnRNP A0), Hsp proteins (Hsp27 and Hsp25) and RSK, among others. MK2/3 are both expressed ubiquitously but MK2 is expressed at significantly higher levels. MK3 activity is only significant when MK2 is absent. The MK3 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	267
271075	cd14173	STKc_Mnk2	Catalytic domain of the Serine/Threonine kinase, Mitogen-activated protein kinase signal-integrating kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MAPK signal-integrating kinases (Mnks) are MAPK-activated protein kinases comprising a group of four proteins produced by alternative splicing from two genes (Mnk1 and Mnk2). The isoforms of Mnk1 (1a/1b) and Mnk2 (2a/2b) differ at their C-termini, with the a-form having a longer C-terminus containing a MAPK-binding region. All Mnks contain a catalytic kinase domain and a polybasic region at the N-terminus which binds importin and the eukaryotic initiation factor eIF4G. The best characterized Mnk substrate is eIF4G, whose phosphorylation may promote the export of certain mRNAs from the nucleus. Mnks also phosphorylate substrates that bind to AU-rich elements that regulate mRNA stability and translation. Mnks have also been implicated in tyrosine kinase receptor signaling, inflammation, and cell proliferation or survival. The Mnk subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	288
271076	cd14174	STKc_Mnk1	Catalytic domain of the Serine/Threonine kinase, Mitogen-activated protein kinase signal-integrating kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MAPK signal-integrating kinases (Mnks) are MAPK-activated protein kinases comprising a group of four proteins produced by alternative splicing from two genes (Mnk1 and Mnk2). The isoforms of Mnk1 (1a/1b) and Mnk2 (2a/2b) differ at their C-termini, with the a-form having a longer C-terminus containing a MAPK-binding region. All Mnks contain a catalytic kinase domain and a polybasic region at the N-terminus which binds importin and the eukaryotic initiation factor eIF4G. The best characterized Mnk substrate is eIF4G, whose phosphorylation may promote the export of certain mRNAs from the nucleus. Mnks also phosphorylate substrates that bind to AU-rich elements that regulate mRNA stability and translation. Mnks have also been implicated in tyrosine kinase receptor signaling, inflammation, and cell proliferation or survival. The Mnk subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	289
271077	cd14175	STKc_RSK1_C	C-terminal catalytic domain of the Serine/Threonine Kinase, Ribosomal S6 kinase 1 (also called Ribosomal protein S6 kinase alpha-1 or 90kDa ribosomal protein S6 kinase 1). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. RSK1 is also called S6K-alpha-1, RPS6KA1, p90RSK1 or MAPK-activated protein kinase 1a (MAPKAPK-1a). It is a component of the insulin transduction pathway, regulating the function of IRS1. It also interacts with PKA and promotes its inactivation. RSK1 is one of four RSK isoforms (RSK1-4) from distinct genes present in vertebrates. RSKs contain an N-terminal kinase domain (NTD) from the AGC family and a C-terminal kinase domain (CTD) from the CAMK family. They are activated by signaling inputs from extracellular regulated kinase (ERK) and phosphoinositide dependent kinase 1 (PDK1). ERK phosphorylates and activates the CTD of RSK, serving as a docking site for PDK1, which phosphorylates and activates the NTD, which in turn phosphorylates all known RSK substrates. RSKs act as downstream effectors of mitogen-activated protein kinase (MAPK) and play key roles in mitogen-activated cell growth, differentiation, and survival. The RSK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	291
271078	cd14176	STKc_RSK2_C	C-terminal catalytic domain of the Serine/Threonine Kinase, Ribosomal S6 kinase 2 (also called 90kDa ribosomal protein S6 kinase 3 or Ribosomal protein S6 kinase alpha-3). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. RSK2 is also called p90RSK3, RPS6KA3, S6K-alpha-3, or MAPK-activated protein kinase 1b (MAPKAPK-1b). RSK2 is expressed highly in the regions of the brain with high synaptic activity. It plays a role in the maintenance and consolidation of excitatory synapses. It is a specific modulator of phospholipase D in calcium-regulated exocytosis. Mutations in the RSK2 gene, RPS6KA3, cause Coffin-Lowry syndrome (CLS), a rare syndromic form of X-linked mental retardation characterized by growth and psychomotor retardation and skeletal abnormalities. RSK2 is one of four RSK isoforms (RSK1-4) from distinct genes present in vertebrates. RSKs contain an N-terminal kinase domain (NTD) from the AGC family and a C-terminal kinase domain (CTD) from the CAMK family. They are activated by signaling inputs from extracellular regulated kinase (ERK) and phosphoinositide dependent kinase 1 (PDK1). ERK phosphorylates and activates the CTD of RSK, serving as a docking site for PDK1, which phosphorylates and activates the NTD, which in turn phosphorylates all known RSK substrates. RSKs act as downstream effectors of mitogen-activated protein kinase (MAPK) and play key roles in mitogen-activated cell growth, differentiation, and survival. The RSK2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	339
271079	cd14177	STKc_RSK4_C	C-terminal catalytic domain of the Serine/Threonine Kinase, Ribosomal S6 kinase 4 (also called Ribosomal protein S6 kinase alpha-6 or 90kDa ribosomal protein S6 kinase 6). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. RSK4 is also called S6K-alpha-6, RPS6KA6, p90RSK6 or pp90RSK4. RSK4 is a substrate of ERK and is a modulator of p53-dependent proliferation arrest in human cells. Deletion of the RSK4 gene, RPS6KA6, frequently occurs in patients of X-linked deafness type 3, mental retardation and choroideremia. Studies of RSK4 in cancer cells and tissues suggest that it may be oncogenic or tumor suppressive depending on many factors. RSK4 is one of four RSK isoforms (RSK1-4) from distinct genes present in vertebrates. RSKs contain an N-terminal kinase domain (NTD) from the AGC family and a C-terminal kinase domain (CTD) from the CAMK family. They are activated by signaling inputs from extracellular regulated kinase (ERK) and phosphoinositide dependent kinase 1 (PDK1). ERK phosphorylates and activates the CTD of RSK, serving as a docking site for PDK1, which phosphorylates and activates the NTD, which in turn phosphorylates all known RSK substrates. RSKs act as downstream effectors of mitogen-activated protein kinase (MAPK) and play key roles in mitogen-activated cell growth, differentiation, and survival. The RSK4 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	295
271080	cd14178	STKc_RSK3_C	C-terminal catalytic domain of the Serine/Threonine Kinase, Ribosomal S6 kinase 3 (also called Ribosomal protein S6 kinase alpha-2 or 90kDa ribosomal protein S6 kinase 2). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. RSK3 is also called S6K-alpha-2, RPS6KA2, p90RSK2 or MAPK-activated protein kinase 1c (MAPKAPK-1c). RSK3 binds muscle A-kinase anchoring protein (mAKAP)-b  directly and regulates concentric cardiac myocyte growth. The RSK3 gene, RPS6KA2, is a putative tumor suppressor gene in sporadic epithelial ovarian cancer and variations to the gene may be associated with rectal cancer risk. RSK3 is one of four RSK isoforms (RSK1-4) from distinct genes present in vertebrates. RSKs contain an N-terminal kinase domain (NTD) from the AGC family and a C-terminal kinase domain (CTD) from the CAMK family. They are activated by signaling inputs from extracellular regulated kinase (ERK) and phosphoinositide dependent kinase 1 (PDK1). ERK phosphorylates and activates the CTD of RSK, serving as a docking site for PDK1, which phosphorylates and activates the NTD, which in turn phosphorylates all known RSK substrates. RSKs act as downstream effectors of mitogen-activated protein kinase (MAPK) and play key roles in mitogen-activated cell growth, differentiation, and survival. The RSK3 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	293
271081	cd14179	STKc_MSK1_C	C-terminal catalytic domain of the Serine/Threonine Kinase, Mitogen and stress-activated kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MSK1 plays a role in the regulation of translational control and transcriptional activation. It phosphorylates the transcription factors, CREB and NFkB. It also phosphorylates the nucleosomal proteins H3 and HMG-14. Increased phosphorylation of MSK1 is associated with the development of cerebral ischemic/hypoxic preconditioning. MSKs contain an N-terminal kinase domain (NTD) from the AGC family and a C-terminal kinase domain (CTD) from the CAMK family. MSKs are activated by two major signaling cascades, the Ras-MAPK and p38 stress kinase pathways, which trigger phosphorylation in the activation loop (A-loop) of the CTD of MSK. The active CTD phosphorylates the hydrophobic motif (HM) of NTD, which facilitates the phosphorylation of the A-loop and activates the NTD, which in turn phosphorylates downstream targets. The MSK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	310
271082	cd14180	STKc_MSK2_C	C-terminal catalytic domain of the Serine/Threonine Kinase, Mitogen and stress-activated kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MSK2 and MSK1 play nonredundant roles in activating histone H3 kinases, which play pivotal roles in compaction of the chromatin fiber. MSK2 is the required H3 kinase in response to stress stimuli and activation of the p38 MAPK pathway. MSK2 also plays a role in the pathogenesis of psoriasis. MSKs contain an N-terminal kinase domain (NTD) from the AGC family and a C-terminal kinase domain (CTD) from the CAMK family, similar to 90 kDa ribosomal protein S6 kinases (RSKs). MSKs are activated by two major signaling cascades, the Ras-MAPK and p38 stress kinase pathways, which trigger phosphorylation in the activation loop (A-loop) of the CTD of MSK. The active CTD phosphorylates the hydrophobic motif (HM) of NTD, which facilitates the phosphorylation of the A-loop and activates the NTD, which in turn phosphorylates downstream targets. The MSK2 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	309
271083	cd14181	STKc_PhKG2	Catalytic domain of the Serine/Threonine Kinase, Phosphorylase kinase Gamma 2 subunit. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Phosphorylase kinase (PhK) catalyzes the phosphorylation of inactive phosphorylase b to form the active phosphorylase a. It coordinates hormonal, metabolic, and neuronal signals to initiate the breakdown of glycogen stores, which enables the maintenance of blood-glucose homeostasis during fasting, and is also used as a source of energy for muscle contraction. PhK is one of the largest and most complex protein kinases, composed of a heterotetramer containing four molecules each of four subunit types: one catalytic (gamma) and three regulatory (alpha, beta, and delta). The gamma 2 subunit (PhKG2) is also referred to as the testis/liver gamma isoform. Mutations in its gene cause autosomal-recessive glycogenosis of the liver. The gamma subunit, when isolated, is constitutively active and does not require phosphorylation of the A-loop for activity. The regulatory subunits restrain this kinase activity until signals are received to relieve this inhibition. For example, the kinase is activated in response to hormonal stimulation, after autophosphorylation or phosphorylation by cAMP-dependent kinase of the alpha and beta subunits. The high-affinity binding of ADP to the beta subunit also stimulates kinase activity, whereas calcium relieves inhibition by binding to the delta (calmodulin) subunit. The PhKG2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	279
271084	cd14182	STKc_PhKG1	Catalytic domain of the Serine/Threonine Kinase, Phosphorylase kinase Gamma 1 subunit. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. Phosphorylase kinase (PhK) catalyzes the phosphorylation of inactive phosphorylase b to form the active phosphorylase a. It coordinates hormonal, metabolic, and neuronal signals to initiate the breakdown of glycogen stores, which enables the maintenance of blood-glucose homeostasis during fasting, and is also used as a source of energy for muscle contraction. PhK is one of the largest and most complex protein kinases, composed of a heterotetramer containing four molecules each of four subunit types: one catalytic (gamma) and three regulatory (alpha, beta, and delta). The gamma 1 subunit (PhKG1) is also referred to as the muscle gamma isoform. The gamma subunit, when isolated, is constitutively active and does not require phosphorylation of the A-loop for activity. The regulatory subunits restrain this kinase activity until signals are received to relieve this inhibition. For example, the kinase is activated in response to hormonal stimulation, after autophosphorylation or phosphorylation by cAMP-dependent kinase of the alpha and beta subunits. The high-affinity binding of ADP to the beta subunit also stimulates kinase activity, whereas calcium relieves inhibition by binding to the delta (calmodulin) subunit. The PhKG1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	276
271085	cd14183	STKc_DCKL1	Catalytic domain of the Serine/Threonine Kinase, Doublecortin-like kinase 1 (also called Doublecortin-like and CAM kinase-like 1). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. DCKL1 (or DCAMKL1) belongs to the doublecortin (DCX) family of proteins which are involved in neuronal migration, neurogenesis, and eye receptor development, among others. Family members typically contain tandem doublecortin (DCX) domains at the N-terminus; DCX domains can bind microtubules and serve as protein-interaction platforms. In addition, DCKL1 contains a serine, threonine, and proline rich domain (SP) and a C-terminal kinase domain with similarity to CAMKs. DCKL1 interacts with tubulin, glucocorticoid receptor, dynein, JIP1/2, caspases (3 and 8), and calpain, among others. It plays roles in neurogenesis, neuronal migration, retrograde transport, and neuronal apoptosis. The DCKL1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	268
271086	cd14184	STKc_DCKL2	Catalytic domain of the Serine/Threonine Kinase, Doublecortin-like kinase 2 (also called Doublecortin-like and CAM kinase-like 2). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. DCKL2 (or DCAMKL2) belongs to the doublecortin (DCX) family of proteins which are involved in neuronal migration, neurogenesis, and eye receptor development, among others. Family members typically contain tandem doublecortin (DCX) domains at the N-terminus; DCX domains can bind microtubules and serve as protein-interaction platforms. In addition, DCKL2 contains a serine, threonine, and proline rich domain (SP) and a C-terminal kinase domain with similarity to CAMKs. DCKL2 has been shown to interact with tubulin, JIP1/2, JNK, neurabin 2, and actin. It is associated with the terminal segments of axons and dendrites, and may function as a phosphorylation-dependent switch to control microtubule dynamics in neuronal growth cones. The DCKL2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	259
271087	cd14185	STKc_DCKL3	Catalytic domain of the Serine/Threonine Kinase, Doublecortin-like kinase 3 (also called Doublecortin-like and CAM kinase-like 3). STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. DCKL3 (or DCAMKL3) belongs to the doublecortin (DCX) family of proteins which are involved in neuronal migration, neurogenesis, and eye receptor development, among others. Family members typically contain tandem doublecortin (DCX) domains at the N-terminus; DCX domains can bind microtubules and serve as protein-interaction platforms. DCKL3 contains a single DCX domain (instead of a tandem) and a C-terminal kinase domain with similarity to CAMKs. It has been shown to interact with tubulin and JIP1/2. The DCKL3 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	258
271088	cd14186	STKc_PLK4	Catalytic domain of the Serine/Threonine Kinase, Polo-like kinase 4. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PLKs play important roles in cell cycle progression and in DNA damage responses. They regulate mitotic entry, mitotic exit, and cytokinesis. In general PLKs contain an N-terminal catalytic kinase domain and a C-terminal regulatory polo box domain (PBD), which is comprised by two bipartite polo-box motifs (or polo boxes) and is involved in protein interactions. There are five mammalian PLKs (PLK1-5) from distinct genes. PLK4, also called SAK or STK18, is structurally different from other PLKs in that it contains only one polo box that can form two adjacent polo boxes and a functional PBD by homodimerization. It is required for late mitotic progression, cell survival, and embryonic development. It localizes to centrosomes and is required for centriole duplication and chromosomal stability. Overexpression of PLK4 may be associated with colon tumors. The PLK4 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	256
271089	cd14187	STKc_PLK1	Catalytic domain of the Serine/Threonine Kinase, Polo-like kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PLKs play important roles in cell cycle progression and in DNA damage responses. They regulate mitotic entry, mitotic exit, and cytokinesis. In general PLKs contain an N-terminal catalytic kinase domain and a C-terminal regulatory polo box domain (PBD), which is comprised by two bipartite polo-box motifs (or polo boxes) and is involved in protein interactions. There are five mammalian PLKs (PLK1-5) from distinct genes. PLK1 functions as a positive regulator of mitosis, meiosis, and cytokinesis. Its localization changes during mitotic progression, associating first with centrosomes in prophase, with kinetochores in prometaphase and metaphase, at the central spindle in anaphase, and in the midbody during telophase. It carries out multiple functions throughout the cell cycle through interactions with different substrates at these specific subcellular locations. PLK1 is overexpressed in many human cancers and is associated with poor prognosis. The PLK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	265
271090	cd14188	STKc_PLK2	Catalytic domain of the Serine/Threonine Kinase, Polo-like kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PLKs play important roles in cell cycle progression and in DNA damage responses. They regulate mitotic entry, mitotic exit, and cytokinesis. In general PLKs contain an N-terminal catalytic kinase domain and a C-terminal regulatory polo box domain (PBD), which is comprised by two bipartite polo-box motifs (or polo boxes) and is involved in protein interactions. There are five mammalian PLKs (PLK1-5) from distinct genes. PLK2, also called Snk (serum-inducible kinase), functions in G1 progression, S-phase arrest, and centriole duplication. Its gene is responsive to both growth factors and cellular stress, is a transcriptional target of p53, and activates a G2-M checkpoint. The PLK2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	255
271091	cd14189	STKc_PLK3	Catalytic domain of the Serine/Threonine Kinase, Polo-like kinase 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PLKs play important roles in cell cycle progression and in DNA damage responses. They regulate mitotic entry, mitotic exit, and cytokinesis. In general PLKs contain an N-terminal catalytic kinase domain and a C-terminal regulatory polo box domain (PBD), which is comprised by two bipartite polo-box motifs (or polo boxes) and is involved in protein interactions. There are five mammalian PLKs (PLK1-5) from distinct genes. PLK3, also called Prk or Fnk (FGF-inducible kinase), regulates angiogenesis and responses to DNA damage. Activated PLK3 mediates Chk2 phosphorylation by ATM and the resulting checkpoint activation. PLK3 phosphorylates DNA polymerase delta and may be involved in DNA repair. It also inhibits Cdc25c, thereby regulating the onset of mitosis. The PLK3 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	255
271092	cd14190	STKc_MLCK2	Catalytic domain of the Serine/Threonine Kinase, Myosin Light Chain Kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MLCK2 (or MYLK2) phosphorylates myosin regulatory light chain and controls the contraction of skeletal muscles. MLCK2 contains a single kinase domain near the C-terminus followed by a regulatory segment containing an autoinhibitory Ca2+/calmodulin binding site. The MLCK2 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	261
271093	cd14191	STKc_MLCK1	Catalytic domain of the Serine/Threonine Kinase, Myosin Light Chain Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MLCK1 (or MYLK1) phosphorylates myosin regulatory light chain and controls the contraction of smooth muscles. The MLCK1 gene expresses three transcripts in a cell-specific manner: a short MLCK1 which contains three immunoglobulin (Ig)-like and one fibronectin type III (FN3) domains, PEVK and actin-binding regions, and a kinase domain near the C-terminus followed by a regulatory segment containing an autoinhibitory Ca2+/calmodulin binding site; a long MLCK1 containing six additional Ig-like domains at the N-terminus compared to the short MLCK1; and the C-terminal Ig module which results in the expression of telokin in phasic smooth muscles, leading to Ca2+ desensitization by cyclic nucleotides of smooth muscle force. MLCK1 is also responsible for myosin regulatory light chain phosphorylation in nonmuscle cells and may play a role in regulating myosin II ATPase activity. The MLCK1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	259
271094	cd14192	STKc_MLCK3	Catalytic domain of the Serine/Threonine Kinase, Myosin Light Chain Kinase 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MLCK3 (or MYLK3) phosphorylates myosin regulatory light chain 2 and controls the contraction of cardiac muscles. It is expressed specifically in both the atrium and ventricle of the heart and its expression is regulated by the cardiac protein Nkx2-5. MLCK3 plays an important role in cardiogenesis by regulating the assembly of cardiac sarcomeres, the repeating contractile unit of striated muscle. MLCK3 contains a single kinase domain near the C-terminus and a unique N-terminal half, and unlike MLCK1/2, it does not appear to be regulated by Ca2+/calmodulin. The MLCK3 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	261
271095	cd14193	STKc_MLCK4	Catalytic domain of the Serine/Threonine Kinase, Myosin Light Chain Kinase 4. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. MLCK phosphorylates myosin regulatory light chain and controls the contraction of all muscle types. In vertebrates, different MLCKs function in smooth (MLCK1), skeletal (MLCK2), and cardiac (MLCK3) muscles. A fourth protein, MLCK4, has also been identified through comprehensive genome analysis although it has not been biochemically characterized. MLCK4 (or MYLK4 or SgK085) contains a single kinase domain near the C-terminus. The MLCK4 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	261
271096	cd14194	STKc_DAPK1	Catalytic domain of the Serine/Threonine Kinase, Death-Associated Protein Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. DAPKs mediate cell death and act as tumor suppressors. They are necessary to induce cell death and their overexpression leads to death-associated changes including membrane blebbing, cell rounding, and formation of autophagic vesicles. Vertebrates contain three subfamily members with different domain architecture, localization, and function. DAPK1 is the prototypical member of the subfamily and is also simply referred to as DAPK. It is a Ca2+/calmodulin (CaM)-regulated and actin-associated protein that contains an N-terminal kinase domain followed by an autoinhibitory CaM binding region and a large C-terminal extension with multiple functional domains including ankyrin (ANK) repeats, a cytoskeletal binding domain, a Death domain, and a serine-rich tail. Loss of DAPK1 expression, usually because of DNA methylation, is implicated in many tumor types. DAPK1 is highly abundant in the brain and has also been associated with neurodegeneration. The DAPK1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	269
271097	cd14195	STKc_DAPK3	Catalytic domain of the Serine/Threonine Kinase, Death-Associated Protein Kinase 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. DAPKs mediate cell death and act as tumor suppressors. They are necessary to induce cell death and their overexpression leads to death-associated changes including membrane blebbing, cell rounding, and formation of autophagic vesicles. Vertebrates contain three subfamily members with different domain architecture, localization, and function. DAPK3, also called DAP-like kinase (DLK) and zipper-interacting protein kinase (ZIPk), contains an N-terminal kinase domain and a C-terminal region with nuclear localization signals (NLS) and a leucine zipper motif that mediates homodimerization and interaction with other leucine zipper proteins. It interacts with Par-4, a protein that contains a death domain and interacts with actin filaments. DAPK3 is present in both the cytoplasm and nucleus. Its co-expression with Par-4 results in the co-localization of the two proteins to actin filaments. In addition to cell death, DAPK3 is also implicated in mediating cell motility and the contraction of smooth muscles. The DAPK3 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	271
271098	cd14196	STKc_DAPK2	Catalytic domain of the Serine/Threonine Kinase, Death-Associated Protein Kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. DAPKs mediate cell death and act as tumor suppressors. They are necessary to induce cell death and their overexpression leads to death-associated changes including membrane blebbing, cell rounding, and formation of autophagic vesicles. Vertebrates contain three subfamily members with different domain architecture, localization, and function. DAPK2, also called DAPK-related protein 1 (DRP-1), is a Ca2+/calmodulin (CaM)-regulated protein containing an N-terminal kinase domain, a CaM autoinhibitory site and a dimerization module. It lacks the cytoskeletal binding regions of DAPK1 and the exogenous protein has been shown to be soluble and cytoplasmic. FLAG-tagged DAPK2, however, accumulated within membrane-enclosed autophagic vesicles. It is unclear where endogenous DAPK2 is localized. DAPK2 participates in TNF-alpha and FAS-receptor induced cell death and enhances neutrophilic maturation in myeloid leukemic cells. It contributes to the induction of anoikis and its down-regulation is implicated in the beta-catenin induced resistance of malignant epithelial cells to anoikis. The DAPK2 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	269
271099	cd14197	STKc_DRAK1	Catalytic domain of the Serine/Threonine Kinase, Death-associated protein kinase-Related Apoptosis-inducing protein Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. DRAKs were named based on their similarity (around 50% identity) to the kinase domain of DAPKs. They contain an N-terminal kinase domain and a C-terminal regulatory domain. Vertebrates contain two subfamily members, DRAK1 (also called STK17A) and DRAK2. Both DRAKs are localized to the nucleus, autophosphorylate themselves, and phosphorylate myosin light chain as a substrate. Rabbit DRAK1 has been shown to induce apoptosis in osteoclasts and overexpression of human DRAK1 induces apoptosis in cultured fibroblast cells. DRAK1 may be involved in apoptotic signaling. The DRAK1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	271
271100	cd14198	STKc_DRAK2	The catalytic domain of the Serine/Threonine Kinase, Death-associated protein kinase-Related Apoptosis-inducing protein Kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. DRAKs were named based on their similarity (around 50% identity) to the kinase domain of DAPKs. They contain an N-terminal kinase domain and a C-terminal regulatory domain. Vertebrates contain two subfamily members, DRAK1 and DRAK2 (also called STK17B). Both DRAKs are localized to the nucleus, autophosphorylate themselves, and phosphorylate myosin light chain as a substrate. DRAK2 has been implicated in inducing or enhancing apoptosis in beta cells, fibroblasts, and lymphoid cells, where it is highly expressed. It is involved in regulating many immune processes including the germinal center (GC) reaction, responses to thymus-dependent antigens, activated T cell survival, and memory T cell responses. It may be involved in the development of autoimmunity. The DRAK2 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	270
271101	cd14199	STKc_CaMKK2	Catalytic domain of the Serine/Threonine kinase, Calmodulin Dependent Protein Kinase Kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates.  CaMKKs are upstream kinases of the CaM kinase cascade that phosphorylate and activate CaMKI and CamKIV. They may also phosphorylate other substrates including PKB and AMP-activated protein kinase (AMPK). CaMKK2, also called CaMKK beta, is one of the most versatile CaMKs. It is involved in regulating energy balance, glucose metabolism, adiposity, hematopoiesis, inflammation, and cancer. CaMKK2 contains unique N- and C-terminal domains and a central catalytic kinase domain that is followed by a regulatory domain that bears overlapping autoinhibitory and CaM-binding regions. It can be activated by signaling through G-coupled receptors, IP3 receptors, plasma membrane ion channels, and Toll-like receptors. Thus, CaMKK2 acts as a molecular hub that is capable of receiving and decoding signals from diverse pathways. The CaMKK2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	286
271102	cd14200	STKc_CaMKK1	Catalytic domain of the Serine/Threonine kinase, Calmodulin Dependent Protein Kinase Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. CaMKKs are upstream kinases of the CaM kinase cascade that phosphorylate and activate CaMKI and CaMKIV. They may also phosphorylate other substrates including PKB and AMP-activated protein kinase (AMPK). CaMKK1, also called CaMKK alpha, is involved in the regulation of glucose uptake in skeletal muscles, independently of AMPK and PKB activation. It also plays roles in learning and memory. Studies on CaMKK1 knockout mice reveal deficits in fear conditioning. The CaMKK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	284
271103	cd14201	STKc_ULK2	Catalytic domain of the Serine/Threonine kinase, Unc-51-like kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The ATG1/ULK complex is conserved from yeast to humans and it plays a critical role in the initiation of autophagy, the intracellular system that leads to the lysosomal degradation of cellular components and their recycling into basic metabolic units. ULK2 is ubiquitously expressed and is essential in autophagy induction. It displays partially redundant functions with ULK1 and is able to compensate for the loss of ULK1 in non-selective autophagy. It also displays neuron-specific functions and is important in axon development. The ULK2 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	271
271104	cd14202	STKc_ULK1	Catalytic domain of the Serine/Threonine kinase, Unc-51-like kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The ATG1/ULK complex is conserved from yeast to humans and it plays a critical role in the initiation of autophagy, the intracellular system that leads to the lysosomal degradation of cellular components and their recycling into basic metabolic units. ULK1 is required for efficient amino acid starvation-induced autophagy and mitochondrial clearance. It associates with three autophagy-related proteins (Atg13, FIP200 and Atg101) to form the ULK1 complex. All four proteins are essential for autophagosome formation. ULK1 is regulated by both mammalian target of rapamycin complex 1 (mTORC1) and AMP-activated protein kinase (AMPK). mTORC1 negatively regulates the ULK1 complex in a nutrient-dependent manner while AMPK stimulates autophagy by inhibiting mTORC1. ULK1 also plays neuron-specific roles and is involved in non-clathrin-coated endocytosis in growth cones, filopodia extension, neurite extension, and axon branching. The ULK1 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	267
271105	cd14203	PTKc_Src_Fyn_like	Catalytic domain of a subset of Src kinase-like Protein Tyrosine Kinases. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. This subfamily includes a subset of Src-like PTKs including Src, Fyn, Yrk, and Yes, which are all widely expressed. Yrk has been detected only in chickens. It is primarily found in neuronal and epithelial cells and in macrophages. It may play a role in inflammation and in response to injury. Src (or c-Src) proteins are cytoplasmic (or non-receptor) PTKs which are anchored to the plasma membrane. They contain an N-terminal SH4 domain with a myristoylation site, followed by SH3 and SH2 domains, a tyr kinase domain, and a regulatory C-terminal region containing a conserved tyr. They are activated by autophosphorylation at the tyr kinase domain, but are negatively regulated by phosphorylation at the C-terminal tyr by Csk (C-terminal Src Kinase). Src proteins are involved in signaling pathways that regulate cytokine and growth factor responses, cytoskeleton dynamics, cell proliferation, survival, and differentiation. They were identified as the first proto-oncogene products, and they regulate cell adhesion, invasion, and motility in cancer cells and tumor vasculature, contributing to cancer progression and metastasis. They are also implicated in acute inflammatory responses and osteoclast function. The Src/Fyn-like subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	248
271106	cd14204	PTKc_Mer	Catalytic Domain of the Protein Tyrosine Kinase, Mer. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Mer (or Mertk) is named after its original reported expression pattern (monocytes, epithelial, and reproductive tissues). It is required for the ingestion of apoptotic cells by phagocytes such as macrophages, retinal pigment epithelial cells, and dendritic cells. Mer is also important in maintaining immune homeostasis. Mer is a member of the TAM subfamily, composed of receptor PTKs (RTKs) containing an extracellular ligand-binding region with two immunoglobulin-like domains followed by two fibronectin type III repeats, a transmembrane segment, and an intracellular catalytic domain. Binding to their ligands, Gas6 and protein S, leads to receptor dimerization, autophosphorylation, activation, and intracellular signaling. The Mer subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	284
271107	cd14205	PTKc_Jak2_rpt2	Catalytic (repeat 2) domain of the Protein Tyrosine Kinase, Janus kinase 2. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Jak2 is widely expressed in many tissues and is essential for the signaling of hormone-like cytokines such as growth hormone, erythropoietin, thrombopoietin, and prolactin, as well as some IFNs and cytokines that signal through the IL-3 and gp130 receptors. Disruption of Jak2 in mice results in an embryonic lethal phenotype with multiple defects including erythropoietic and cardiac abnormalities. It is the only Jak gene that results in a lethal phenotype when disrupted in mice. A mutation in the pseudokinase domain of Jak2, V617F, is present in many myeloproliferative diseases, including almost all patients with polycythemia vera, and 50% of patients with essential thrombocytosis and myelofibrosis. Jak2 is a member of the Janus kinase (Jak) subfamily of proteins, which are cytoplasmic (or nonreceptor) PTKs containing an N-terminal FERM domain, followed by a Src homology 2 (SH2) domain, a pseudokinase domain, and a C-terminal catalytic tyr kinase domain. Jaks are crucial for cytokine receptor signaling. They are activated by autophosphorylation upon cytokine-induced receptor aggregation, and subsequently trigger downstream signaling events such as the phosphorylation of signal transducers and activators of transcription (STATs). The PTKc family is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	284
271108	cd14206	PTKc_Aatyk3	Catalytic domain of the Protein Tyrosine Kinases, Apoptosis-associated tyrosine kinase 3. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. Aatyk3, also called lemur tyrosine kinase 3 (Lmtk3), is a receptor kinase containing a transmembrane segment and a long C-terminal cytoplasmic tail with a catalytic domain. The function of Aatyk3 is still unknown. The Aatyk3 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, and phosphoinositide 3-kinase (PI3K).	276
271109	cd14207	PTKc_VEGFR1	Catalytic domain of the Protein Tyrosine Kinases, Vascular Endothelial Growth Factor Receptors. PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. VEGFR1 (or Flt1) binds VEGFA, VEGFB, and placenta growth factor (PLGF). It regulates monocyte and macrophage migration, vascular permeability, haematopoiesis, and the recruitment of haematopoietic progenitor cells from the bone marrow. VEGFR1 is a member of the VEGFR subfamily of proteins, which are receptor PTKs (RTKs) containing an extracellular ligand-binding region with seven immunoglobulin (Ig)-like domains, a transmembrane segment, and an intracellular catalytic domain. The binding of VEGFRs to their ligands, the VEGFs, leads to receptor dimerization, activation, and intracellular signaling. The VEGFR1 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as protein serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	340
271110	cd14208	PTK_Jak3_rpt1	Pseudokinase (repeat 1) domain of the Protein Tyrosine Kinase, Janus kinase 3. Jak3 is expressed only in hematopoietic cells. It binds the shared receptor subunit, common gamma chain, and thus is essential in the signaling of cytokines that use it such as IL-2, IL-4, IL-7, IL-9, IL-15, and IL-21. Jak3 is important in lymphoid development and myeloid cell differentiation. Inactivating mutations in Jak3 have been reported in humans with severe combined immunodeficiency (SCID). Jak3 is a cytoplasmic (or nonreceptor) PTK containing an N-terminal FERM domain, followed by a Src homology 2 (SH2) domain, a pseudokinase domain, and a C-terminal tyr kinase domain. The pseudokinase domain shows similarity to tyr kinases but lacks crucial residues for catalytic activity and ATP binding. It modulates the kinase activity of the C-terminal catalytic domain. Jaks are activated by autophosphorylation upon cytokine-induced receptor aggregation, and subsequently trigger downstream signaling events such as the phosphorylation of signal transducers and activators of transcription (STATs). The Jak3 subfamily is part of a larger superfamily that includes the catalytic domains of other kinases such as serine/threonine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	260
271111	cd14209	STKc_PKA	Catalytic subunit of the Serine/Threonine Kinase, cAMP-dependent protein kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The inactive PKA holoenzyme is a heterotetramer composed of two phosphorylated and active catalytic subunits with a dimer of regulatory (R) subunits. Activation is achieved through the binding of the important second messenger cAMP to the R subunits, which leads to the dissociation of PKA into the R dimer and two active subunits. PKA is present ubiquitously in cells and interacts with many different downstream targets. It plays a role in the regulation of diverse processes such as growth, development, memory, metabolism, gene expression, immunity, and lipolysis. The PKA subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	290
271112	cd14210	PKc_DYRK	Catalytic domain of the protein kinase, Dual-specificity tYrosine-phosphorylated and -Regulated Kinase. Protein Kinases (PKs), Dual-specificity tYrosine-phosphorylated and -Regulated Kinase (DYRK) subfamily, catalytic (c) domain. Dual-specificity PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine (S/T) as well as tyrosine residues on protein substrates. The DYRK subfamily is part of a larger superfamily that includes the catalytic domains of other protein S/T PKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase (PI3K). DYRKs autophosphorylate themselves on tyrosine residues and phosphorylate their substrates exclusively on S/T residues. They play important roles in cell proliferation, differentiation, survival, and development. Vertebrates contain multiple DYRKs (DYRK1-4) and mammals contain two types of DYRK1 proteins, DYRK1A and DYRK1B. DYRK1A is involved in neuronal differentiation and is implicated in the pathogenesis of DS (Down syndrome). DYRK1B plays a critical role in muscle differentiation by regulating transcription, cell motility, survival, and cell cycle progression. It is overexpressed in many solid tumors where it acts as a tumor survival factor. DYRK2 promotes apoptosis in response to DNA damage by phosphorylating the tumor suppressor p53, while DYRK3 promotes cell survival by phosphorylating SIRT1 and promoting p53 deacetylation. DYRK4 is a testis-specific kinase that may function during spermiogenesis.	311
271113	cd14211	STKc_HIPK	Catalytic domain of the Serine/Threonine Kinase, Homeodomain-Interacting Protein Kinase. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. HIPKs, originally identified by their ability to bind homeobox factors, are nuclear proteins containing catalytic kinase and homeobox-interacting domains as well as a PEST region overlapping with the speckle-retention signal (SRS). They show speckled localization in the nucleus, apart from the nucleoli. They play roles in the regulation of many nuclear pathways including gene transcription, cell survival, proliferation, differentiation, development, and DNA damage response. Vertebrates contain three HIPKs (HIPK1-3) and mammals harbor an additional family member HIPK4, which does not contain a homeobox-interacting domain and is localized in the cytoplasm. HIPK2, the most studied HIPK, is a coregulator of many transcription factors and cofactors and it regulates gene transcription during development and in DNA damage response. The HIPK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	329
271114	cd14212	PKc_YAK1	Catalytic domain of the Dual-specificity protein kinase, YAK1. Dual-specificity PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine (S/T) as well as tyrosine residues on protein substrates. This subfamily is composed of proteins with similarity to Saccharomyces cerevisiae YAK1 (or Yak1p), a dual-specificity kinase that autophosphorylates at tyrosine residues and phosphorylates substrates on S/T residues. YAK1 phosphorylates and activates the transcription factors Hsf1 and Msn2, which play important roles in cellular homeostasis during stress conditions including heat shock, oxidative stress, and nutrient deficiency. It also phosphorylates the protein POP2, a component of a complex that regulates transcription, under glucose-deprived conditions. It functions as a part of a glucose-sensing system that is involved in controlling growth in yeast. The YAK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	330
271115	cd14213	PKc_CLK1_4	Catalytic domain of the Dual-specificity protein kinases, CDC-like kinases 1 and 4. Dual-specificity PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine as well as tyrosine residues on protein substrates. CLK1 plays a role in neuronal differentiation. CLKs are involved in the phosphorylation and regulation of serine/arginine-rich (SR) proteins, which play a crucial role in pre-mRNA splicing by directing splice site selection. SR proteins are phosphorylated first by SR protein kinases (SRPKs) at the N-terminus, which leads to their assembly into nuclear speckles where splicing factors are stored. CLKs phosphorylate the C-terminal part of SR proteins, causing the nuclear speckles to dissolve and splicing factors to be recruited at sites of active transcription. Based on a conserved "EHLAMMERILG" signature motif which may be crucial for substrate specificity, CLKs are also referred to as LAMMER kinases. CLKs autophosphorylate at tyrosine residues and phosphorylate their substrates exclusively on serine/threonine residues. The CLK1/4 subfamily is part of a larger superfamily that includes the catalytic domains of other protein serine/threonine PKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	330
271116	cd14214	PKc_CLK3	Catalytic domain of the Dual-specificity protein kinase, CDC-like kinase 3. Dual-specificity PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine as well as tyrosine residues on protein substrates. CLK3 is predominantly expressed in mature spermatozoa, and might play a role in the fertilization process. CLKs are involved in the phosphorylation and regulation of serine/arginine-rich (SR) proteins, which play a crucial role in pre-mRNA splicing by directing splice site selection. SR proteins are phosphorylated first by SR protein kinases (SRPKs) at the N-terminus, which leads to their assembly into nuclear speckles where splicing factors are stored. CLKs phosphorylate the C-terminal part of SR proteins, causing the nuclear speckles to dissolve and splicing factors to be recruited at sites of active transcription. Based on a conserved "EHLAMMERILG" signature motif which may be crucial for substrate specificity, CLKs are also referred to as LAMMER kinases. CLKs autophosphorylate at tyrosine residues and phosphorylate their substrates exclusively on serine/threonine residues. The CLK3 subfamily is part of a larger superfamily that includes the catalytic domains of other protein serine/threonine PKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	331
271117	cd14215	PKc_CLK2	Catalytic domain of the Dual-specificity protein kinase, CDC-like kinase 2. Dual-specificity PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine as well as tyrosine residues on protein substrates. CLK2 plays a role in hepatic insulin signaling and glucose metabolism. It is induced by the insulin/Akt pathway as part of the hepatic refeeding response, and it directly phosphorylates the SR domain of PGC-1alpha, which results in decreased gluconeogenic gene expression and glucose output. CLKs are involved in the phosphorylation and regulation of serine/arginine-rich (SR) proteins, which play a crucial role in pre-mRNA splicing by directing splice site selection. SR proteins are phosphorylated first by SR protein kinases (SRPKs) at the N-terminus, which leads to their assembly into nuclear speckles where splicing factors are stored. CLKs phosphorylate the C-terminal part of SR proteins, causing the nuclear speckles to dissolve and splicing factors to be recruited at sites of active transcription. Based on a conserved "EHLAMMERILG" signature motif which may be crucial for substrate specificity, CLKs are also referred to as LAMMER kinases. CLKs autophosphorylate at tyrosine residues and phosphorylate their substrates exclusively on serine/threonine residues. The CLK2 subfamily is part of a larger superfamily that includes the catalytic domains of other protein serine/threonine PKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	330
271118	cd14216	STKc_SRPK1	Catalytic domain of the Serine/Threonine Kinase, Serine-aRginine Protein Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. SRPK1 binds with high affinity the alternative splicing factor, SRSF1 (serine/arginine-rich splicing factor 1), and regiospecifically phosphorylates 10-12 serines in its RS domain. It plays a role in the regulation of pre-mRNA splicing, chromatin structure, and germ cell development. SRPKs phosphorylate and regulate splicing factors from the SR protein family by specifically phosphorylating multiple serine residues residing in SR/RS dipeptide motifs (also known as RS domains). Phosphorylation of the RS domains enhances interaction with transportin SR and facilitates entry of the SR proteins into the nucleus. SRPKs contain a nonconserved insert domain, within the well-conserved catalytic kinase domain, that regulates their subcellular localization. The SRPK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	349
271119	cd14217	STKc_SRPK2	Catalytic domain of the Serine/Threonine Kinase, Serine-aRginine Protein Kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. SRPK2 mediates neuronal cell cycle and cell death through regulation of nuclear cyclin D1. It has also been found to promote leukemia cell proliferation by regulating cyclin A1. SRPK2 also plays a role in regulating pre-mRNA splicing and is required for spliceosomal B complex formation. SRPKs phosphorylate and regulate splicing factors from the SR protein family by specifically phosphorylating multiple serine residues residing in SR/RS dipeptide motifs (also known as RS domains). Phosphorylation of the RS domains enhances interaction with transportin SR and facilitates entry of the SR proteins into the nucleus. SRPKs contain a nonconserved insert domain, within the well-conserved catalytic kinase domain, that regulates their subcellular localization. The SRPK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	366
271120	cd14218	STKc_SRPK3	Catalytic domain of the Serine/Threonine Kinase, Serine-aRginine Protein Kinase 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. SRPK3 is highly expressed in the heart and skeletal muscles, and is controlled by a muscle-specific enhancer that is regulated by MEF2. It may play an important role in muscle development. SRPKs phosphorylate and regulate splicing factors from the SR protein family by specifically phosphorylating multiple serine residues residing in SR/RS dipeptide motifs (also known as RS domains). Phosphorylation of the RS domains enhances interaction with transportin SR and facilitates entry of the SR proteins into the nucleus. SRPKs contain a nonconserved insert domain, within the well-conserved catalytic kinase domain, that regulates their subcellular localization. The SRPK subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	365
271121	cd14219	STKc_BMPR1b	Catalytic domain of the Serine/Threonine Kinase, Bone Morphogenetic Protein Type IB Receptor. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. BMPR1b, also called Activin receptor-Like Kinase 6 (ALK6), functions as a receptor for bone morphogenetic proteins (BMPs), which are involved in the regulation of cell proliferation, survival, differentiation, and apoptosis. BMPs are able to induce bone, cartilage, ligament, and tendon formation, and may play roles in bone diseases and tumors. Mutations in BMPR1b that lead to inhibition of chondrogenesis can cause Brachydactyly (BD) type A2, a dominant hand malformation characterized by shortening and lateral deviation of the index fingers. A point mutation in the BMPR1b kinase domain is also associated with the Booroola phenotype, characterized by precocious differentiation of ovarian follicles. BMPR1b belongs to a group of receptors for the TGFbeta family of secreted signaling molecules that includes TGFbeta, BMPs, activins, growth and differentiation factors, and anti-Mullerian hormone, among others. These receptors contain an extracellular domain that binds ligands, a single transmembrane (TM) region, and a cytoplasmic catalytic kinase domain. Type I receptors, like BMPR1b, are low-affinity receptors that bind ligands only after they are recruited by the ligand/type II high-affinity receptor complex. Following activation, they start intracellular signaling to the nucleus by phosphorylating SMAD proteins. Type I receptors contain an additional domain located between the TM and kinase domains called the GS domain, which contains the activating phosphorylation site and confers preference for specific SMAD proteins. The BMPR1b subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	305
271122	cd14220	STKc_BMPR1a	Catalytic domain of the Serine/Threonine Kinase, Bone Morphogenetic Protein Type IA Receptor. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. BMPR1a, also called Activin receptor-Like Kinase 3 (ALK3), functions as a receptor for bone morphogenetic proteins (BMPs), which are involved in the regulation of cell proliferation, survival, differentiation, and apoptosis. BMPs are able to induce bone, cartilage, ligament, and tendon formation, and may play roles in bone diseases and tumors. Germline mutations in BMPR1a are associated with an increased risk of Juvenile Polyposis Syndrome, a hamartomatous disorder that may lead to gastrointestinal cancer. BMPR1a may also play an indirect role in the development of hematopoietic stem cells (HSCs) as osteoblasts are a major component of the HSC niche within the bone marrow. BMPR1a belongs to a group of receptors for the TGFbeta family of secreted signaling molecules that includes TGFbeta, BMPs, activins, growth and differentiation factors, and anti-Mullerian hormone, among others. These receptors contain an extracellular domain that binds ligands, a single transmembrane (TM) region, and a cytoplasmic catalytic kinase domain. Type I receptors, like BMPR1a, are low-affinity receptors that bind ligands only after they are recruited by the ligand/type II high-affinity receptor complex. Following activation, they start intracellular signaling to the nucleus by phosphorylating SMAD proteins. Type I receptors contain an additional domain located between the TM and kinase domains called the GS domain, which contains the activating phosphorylation site and confers preference for specific SMAD proteins. The BMPR1a subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	287
271123	cd14221	STKc_LIMK1	Catalytic domain of the Serine/Threonine Kinase, LIM domain kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. LIMK1 activation is induced by bone morphogenetic protein, vascular endothelial growth factor, and thrombin. It plays roles in microtubule disassembly and cell cycle progression, and is critical in the regulation of neurite outgrowth. LIMK1 knockout mice show abnormalities in dendritic spine morphology and synaptic function. LIMK1 is one of the genes deleted in patients with Williams Syndrome, which is characterized by distinct craniofacial features, cardiovascular problems, as well as behavioral and neurological abnormalities. LIMKs phosphorylate and inactivate cofilin, an actin depolymerizing factor, to induce the reorganization of the actin cytoskeleton. They act downstream of Rho GTPases and are expressed ubiquitously. As regulators of actin dynamics, they contribute to diverse cellular functions such as cell motility, morphogenesis, differentiation, apoptosis, meiosis, mitosis, and neurite extension. LIMKs contain the LIM (two repeats), PDZ, and catalytic kinase domains. The LIMK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	267
271124	cd14222	STKc_LIMK2	Catalytic domain of the Serine/Threonine Kinase, LIM domain kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. LIMK2 activation is induced by transforming growth factor-beta 1 (TGFb-1) and shares the same subcellular location as the cofilin family member twinfilin, which may be its biological substrate. LIMK2 plays a role in spermatogenesis, and may contribute to tumor progression and metastasis formation in some cancer cells. LIMKs phosphorylate and inactivate cofilin, an actin depolymerizing factor, to induce the reorganization of the actin cytoskeleton. They act downstream of Rho GTPases and are expressed ubiquitously. As regulators of actin dynamics, they contribute to diverse cellular functions such as cell motility, morphogenesis, differentiation, apoptosis, meiosis, mitosis, and neurite extension. LIMKs contain the LIM (two repeats), PDZ, and catalytic kinase domains. The LIMK2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	272
271125	cd14223	STKc_GRK2	Catalytic domain of the Serine/Threonine Kinase, G protein-coupled Receptor Kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. GRK2, also called beta-adrenergic receptor kinase (beta-ARK) or beta-ARK1, is important in regulating several cardiac receptor responses. It plays a role in cardiac development and in hypertension. Deletion of GRK2 in mice results in embryonic lethality, caused by hypoplasia of the ventricular myocardium. GRK2 also plays important roles in the liver (as a regulator of portal blood pressure), in immune cells, and in the nervous system. Altered GRK2 expression has been reported in several disorders including major depression, schizophrenia, bipolar disorder, and Parkinsonism. GRK2 contains an N-terminal RGS homology (RH) domain, a central catalytic domain, and a C-terminal pleckstrin homology (PH) domain that mediates PIP2 and G protein betagamma-subunit translocation to the membrane. GRKs phosphorylate and regulate G protein-coupled receptors (GPCRs), the largest superfamily of cell surface receptors which regulate some part of nearly all physiological functions. Phosphorylated GPCRs bind to arrestins, which prevents further G protein signaling despite the presence of activating ligand. The GRK2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	321
271126	cd14224	PKc_DYRK2_3	Catalytic domain of the protein kinases, Dual-specificity tYrosine-phosphorylated and -Regulated Kinases 2 and 3. Dual-specificity PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine (S/T) as well as tyrosine residues on protein substrates. This subfamily is composed of DYRK2 and DYRK3, and similar proteins. Drosophila DYRK2 interacts and phosphorylates the chromatin remodelling factor, SNR1 (Snf5-related 1), and also interacts with the essential chromatin component, trithorax. It may play a role in chromatin remodelling. Vertebrate DYRK2 phosphorylates and regulates the tumor suppressor p53 to induce apoptosis in response to DNA damage. It can also phosphorylate the transcription factor, nuclear factor of activated T cells (NFAT). DYRK2 is overexpressed in lung adenocarcinoma and esophageal carcinomas, and is a predictor for favorable prognosis in lung adenocarcinoma. DYRK3, also called regulatory erythroid kinase (REDK), is highly expressed in erythroid cells and the testis, and is also present in adult kidney and liver. It promotes cell survival by phosphorylating and activating SIRT1, an NAD(+)-dependent protein deacetylase, which promotes p53 deacetylation, resulting in the inhibition of apoptosis. DYRKs autophosphorylate themselves on tyrosine residues and phosphorylate their substrates exclusively on S/T residues. The DYRK2/3 subfamily is part of a larger superfamily that includes the catalytic domains of other S/T kinases, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	380
271127	cd14225	PKc_DYRK4	Catalytic domain of the protein kinase, Dual-specificity tYrosine-phosphorylated and -Regulated Kinase 4. Dual-specificity PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine (S/T) as well as tyrosine residues on protein substrates. DYRK4 is a testis-specific kinase with restricted expression to postmeiotic spermatids. It may function during spermiogenesis, however, it is not required for male fertility. DYRK4 has also been detected in a human teratocarcinoma cell line induced to produce postmitotic neurons. It may have a role in neuronal differentiation. DYRKs autophosphorylate themselves on tyrosine residues and phosphorylate their substrates exclusively on S/T residues. They play important roles in cell proliferation, differentiation, survival, and development. The DYRK4 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	341
271128	cd14226	PKc_DYRK1	Catalytic domain of the protein kinase, Dual-specificity tYrosine-phosphorylated and -Regulated Kinase 1. Dual-specificity PKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine (S/T) as well as tyrosine residues on protein substrates. Mammals contain two types of DYRK1 proteins, DYRK1A and DYRK1B. DYRK1A was previously called minibrain kinase homolog (MNBH) or dual-specificity YAK1-related kinase. It phosphorylates various substrates and is involved in many cellular events. It phosphorylates and inhibits the transcription factors, nuclear factor of activated T cells (NFAT) and forkhead in rhabdomyosarcoma (FKHR). It regulates neuronal differentiation by targeting CREB (cAMP response element-binding protein). It also targets many endocytic proteins including dynamin and amphiphysin and may play a role in the endocytic pathway. The gene encoding DYRK1A is located in the DSCR (Down syndrome critical region) of human chromosome 21 and DYRK1A has been implicated in the pathogenesis of DS. DYRK1B, also called minibrain-related kinase (MIRK), is highly expressed in muscle and plays a critical role in muscle differentiation by regulating transcription, cell motility, survival, and cell cycle progression. It is overexpressed in many solid tumors where it acts as a tumor survival factor. DYRKs autophosphorylate themselves on tyrosine residues and phosphorylate their substrates exclusively on S/T residues. The DYRK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	339
271129	cd14227	STKc_HIPK2	Catalytic domain of the Serine/Threonine Kinase, Homeodomain-Interacting Protein Kinase 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. HIPK2, the most studied HIPK, is a coregulator of many transcription factors and cofactors including homeodomain proteins (Nkx and HOX families), Smad1-4, Pax6, c-Myb, AML1, the histone acetyltransferase p300, and the tumor repressor p53, among others. It regulates gene transcription during development and in DNA damage response (DDR), and mediates cell processes such as apoptosis, survival, differentiation, and proliferation. HIPK2 mediates apoptosis by phosphorylating and activating p53 during DDR, resulting in the activation of apoptotic genes. In the absence of p53, HIPK2 targets the anti-apoptotic corepressor C-terminal binding protein (CtBP), leading to CtBP's degradation and the promotion of apoptosis. HIPKs, originally identified by their ability to bind homeobox factors, are nuclear proteins containing catalytic kinase and homeobox-interacting domains as well as a PEST region overlapping with the speckle-retention signal (SRS). The HIPK2 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	355
271130	cd14228	STKc_HIPK1	Catalytic domain of the Serine/Threonine Kinase, Homeodomain-Interacting Protein Kinase 1. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. HIPK1 has been implicated in regulating eye size, lens formation, and retinal morphogenesis during late embryogenesis. It also contributes to the regulation of haematopoiesis and leukaemogenesis by phosphorylating and repressing the transcription factor c-Myb, which is crucial in T- and B-cell development. In glucose-deprived conditions, HIPK1 phosphorylates Daxx, leading to its relocalization from the nucleus to the cytoplasm, where it binds and stabilizes ASK1 (apoptosis signal-regulating kinase 1), a mitogen-activated protein kinase (MAPK) kinase kinase that activates the JNK and p38 MAPK pathways. HIPKs, originally identified by their ability to bind homeobox factors, are nuclear proteins containing catalytic kinase and homeobox-interacting domains as well as a PEST region overlapping with the speckle-retention signal (SRS). The HIPK1 subfamily is part of a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	355
271131	cd14229	STKc_HIPK3	Catalytic domain of the Serine/Threonine Kinase, Homeodomain-Interacting Protein Kinase 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. HIPK3 is a Fas-interacting protein that induces FADD (Fas-associated death domain) phosphorylation and mediates FasL-induced JNK activation. Overexpression of HIPK3 does not affect cell death, however its expression in prostate cancer cells contributes to increased resistance to Fas receptor-mediated apoptosis. HIPK3 also plays a role in regulating steroidogenic gene expression. In response to cAMP, HIPK3 activates the phosphorylation of JNK and c-Jun, leading to increased activity of the transcription factor SF-1 (Steroidogenic factor 1), a key regulator for steroid biosynthesis in the gonad and adrenal gland. HIPKs, originally identified by their ability to bind homeobox factors, are nuclear proteins containing catalytic kinase and homeobox-interacting domains as well as a PEST region overlapping with the speckle-retention signal (SRS). The HIPK3 subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase (PI3K).	330
410578	cd14230	GAT_GGA	canonical GAT domain found in metazoan and fungal ADP-ribosylation factor (Arf)-binding proteins (GGAs). GGAs, also called Golgi-localized gamma-ear-containing Arf-binding proteins, belong to a family of ubiquitously expressed, monomeric, motif-binding cargo/clathrin adaptor proteins that regulate clathrin-mediated trafficking of cargo proteins from the trans-Golgi network (TGN) to endosomes. GGAs also play important roles in ubiquitin-dependent sorting of cargo proteins both in biosynthetic and endocytic pathways. The family includes three GGAs (GGA1, GGA2, and GGA3) identified in mammals and two GGAs (Gga1p and Gga2p) identified in the budding yeast Saccharomyces cerevisiae. All these GGAs have a multidomain structure consisting of: an N-terminal VHS (Vps27/Hrs/Stam) domain that binds acidic-cluster dileucine (DxxLL)-type sorting signals (where x is any amino acid) found in the cytoplasmic tail of TGN sorting receptors; a GAT (GGA and TOM) domain that interacts with class I GTP-bound form of Arf proteins, Rabaptin-5, ubiquitin, and the tumor susceptibility gene 101 product (TSG101); a largely unstructured hinge region that contains clathrin-binding motifs; and a C-terminal GAE (gamma-adaptin ear homology) domain that binds accessory proteins. In contrast to other GGAs-like proteins, members of this family contain a GAT N-terminal region, a helix-loop-helix in the complex with Arf1-GTP.	80
410579	cd14231	GAT_GGA-like_plant	canonical GAT domain found in uncharacterized Golgi-localized gamma ear-containing Arf-binding protein (GGA)-like proteins mainly found in plants. The family includes a group of uncharacterized plant proteins containing an N-terminal VHS (Vps27p/Hrs/STAM)-domain and a GAT (GGA and TOM1) domain. Both domains are also present in Golgi-localized gamma ear-containing Arf-binding proteins (GGAs), which belong to a family of ubiquitously expressed, monomeric, motif-binding cargo/clathrin adaptor proteins that regulate clathrin-mediated trafficking of cargo proteins from the trans-Golgi network (TGN) to endosomes. In contrast to GGA proteins, members in this family do not have either a GAE (gamma-adaptin ear homology) domain or a clathrin-binding motif. The canonical GAT domain is a monomeric three-helix bundle that binds ubiquitin.	79
410580	cd14232	GAT_LSB5	canonical GAT domain found in yeast LAS seventeen-binding protein 5 (Lsb5p) and similar proteins. Lsb5p, also called LAS17-binding protein 5, is a Golgi-localized gamma ear-containing Arf-binding protein (GGA)-like protein located to the plasma membrane in an actin-independent manner. It plays important roles in membrane-trafficking events through association with the actin regulators, the yeast Wiskott-Aldrich syndrome protein (WASP) homologue Las17p and the cortical protein Sla1p, the yeast Arf3p (orthologous with mammalian Arf6), and ubiquitin. Lsb5p contains an N-terminal VHS (Vps27p/Hrs/STAM)-domain and a GAT (GGA and TOM1) domain. In contrast to GGA proteins, Lsb5p harbors a C-terminal NPF (Asn-Pro-Phe) motif, but does not have either a GAE (gamma-adaptin ear homology) domain or a clathrin-binding motif. The canonical GAT domain is a monomeric three-helix bundle that binds ubiquitin.	78
410581	cd14233	GAT_TOM1_like	canonical GAT domain found in target of myb protein 1 (Tom1) protein family. Tom1 and its related proteins, Tom1L1 and Tom1L2, form a protein family sharing an N-terminal VHS (Vps27p/Hrs/STAM)-domain followed by a GAT (GGA and TOM1) domain, both of which are also conserved in Golgi-localized gamma ear-containing Arf-binding proteins (GGAs). In contrast to GGAs, the Tom1 family proteins bind to ubiquitin, ubiquitinated proteins, and Toll-interacting protein (Tollip) through its GAT domain, but do not associate with Arf GTPases through its GAT domain nor with acidic cluster-dileucine sequences through its VHS domain. In addition, the Tom1 family proteins recruit clathrin onto endosomes through their C-terminal region. In their C-terminal clathrin-binding regions, Tom1 and Tom1L2 are similar to each other, but distinguishable from Tom1L1. The yeast S. cerevisiae does not contain homologous proteins of the Tom1 family. The canonical GAT domain is a monomeric three-helix bundle that binds ubiquitin.	87
410582	cd14234	GAT_GGA_meta	canonical GAT domain found in metazoan ADP-ribosylation factor (Arf)-binding proteins (GGAs). GGAs, also called Golgi-localized gamma-ear-containing Arf-binding proteins, belong to a family of ubiquitously expressed, monomeric, motif-binding cargo/clathrin adaptor proteins that regulate clathrin-mediated trafficking of cargo proteins from the trans-Golgi network (TGN) to endosomes. Moreover, GGAs play important roles in ubiquitin-dependent sorting of cargo proteins both in biosynthetic and endocytic pathways. Three GGAs (GGA1, GGA2, and GGA3) have been identified in mammals. They may appear to behave similarly, since all of them have a multidomain structure consisting of: an N-terminal VHS (Vps27/Hrs/Stam) domain that binds the acidic-cluster dileucine (DxxLL)-type sorting signals (where x is any amino acid) found in the cytoplasmic tail of TGN sorting receptors; a GAT (GGA and TOM) domain that interacts with class I GTP-bound form of Arf proteins, Rabaptin-5, ubiquitin, and the tumor susceptibility gene 101 product (TSG101); a largely unstructured hinge region that contains clathrin-binding motifs; and a C-terminal GAE (gamma-adaptin ear homology) domain that binds accessory proteins. However, the three GGAs have some differences, which suggest they may possess their own distinct roles. For instance, both GGA1 and GGA3, but not GGA2, contain an internal DxxLL motif that binds to its own VHS domain. Only a portion of the VHS domain of GGA2 possesses distant structural homology to that of GGA1 or GGA3. Moreover, the binding affinity of GGA2 to ubiquitin is much lower than that of GGA1 or GGA3. In addition, GGA3 has a short splicing variant that is predominantly expressed in human cell lines and tissues except the brain. It does have a VHS domain, but it is unable to bind to the DxxLL motif. GGA2 and GGA3 undergo epidermal growth factor (EGF)-induced phosphorylation.	84
410583	cd14235	GAT_GGA_fungi	canonical GAT domain found in fungal ADP-ribosylation factor (Arf)-binding proteins (GGAs). GGAs, also called Golgi-localized gamma-ear-containing Arf-binding proteins, belong to a family of ubiquitously expressed, monomeric, motif-binding cargo/clathrin adaptor proteins that regulate clathrin-mediated trafficking of cargo proteins from the trans-Golgi network (TGN) to endosomes. Two GGAs (Gga1p and Gga2p) have been identified in the budding yeast Saccharomyces cerevisiae. Yeast GGAs play important roles in the carboxypeptidase Y (CPY) pathway, vacuole biogenesis, alpha-factor maturation, and interactions with clathrin. They have a multidomain structure consisting of VHS (Vps27/Hrs/STAM), GAT (GGA and TOM), hinge, and GAE (gamma-adaptin ear) domains. Both Gga1p and Gga2p function as effectors of Arf in yeast. They interact with Arf1p and Arf2p in a GTP-dependent manner. Moreover, Gga2p mediates sequential ubiquitin-independent and ubiquitin-dependent steps in the trafficking of ARN1, a ferrichrome transporter in S. cerevisiae, from the TGN to the vacuole. It also acts as a phosphatidylinositol 4-phosphate effector at the Golgi exit, which binds directly to the TGN PtdIns(4)-kinase Pik1p and contributes to Pik1p recruitment. In addition, Gga2p is required for sorting of the yeast siderophore iron transporter 1 (Sit1) to the vacuolar pathway. The GAT domain of GGAs interacts with class I GTP-bound form of Arf proteins, Rabaptin-5, ubiquitin, and/or the tumor susceptibility gene 101 product (TSG101).	92
260094	cd14236	GAT_TOM1	canonical GAT domain found in target of Myb protein 1 (Tom1). Tom1 was originally identified by its induced expression by the v-Myb oncogene. It is predominantly present in the cytosol and can interact with clathrin, endofin, Toll-interacting protein (Tollip), and ubiquitinated proteins. It acts as a linker protein to regulate the ability of endofin to recruit clathrin onto the sorting endosome. Moreover, Tom1 functions as a negative regulator of IL-1beta and tumor necrosis factor (TNF)-alpha-induced signaling pathways. It also plays a role in the TLR2/4 signaling pathways. Tom1 contains an N-terminal VHS (Vps27p/Hrs/STAM)-domain, a GAT (GGA and TOM1) domain and a C-terminal clathrin-binding region, both of which are conserved in Golgi-localized gamma ear-containing Arf-binding proteins (GGAs). In contrast to GGAs, Tom1 binds to ubiquitin, ubiquitinated proteins, and Tollip through its GAT domain, but does not associate with Arf GTPases through its GAT domain nor with acidic cluster-dileucine sequences through its VHS domain. The canonical GAT domain is a monomeric three-helix bundle that binds ubiquitin.	95
410584	cd14237	GAT_TM1L1	canonical GAT domain found in target of Myb-like protein 1 (Tom1L1). Tom1L1, also called Src-activating and signaling molecule protein (Srcasm), was identified as a substrate of the Src family of protein kinases. It is tyrosine-phosphorylated by Src family kinases and modulates growth factor and Src-mediated signaling pathways. It also plays a potential role in endosomal sorting and ligand-stimulated endocytosis of EGF receptors (EGFR). Tom1L1 is predominantly present in the cytosol and can interact with Toll-interacting protein (Tollip), Hrs or TSG101, clathrin, and ubiquitinated proteins. It contains an N-terminal VHS (Vps27p/Hrs/STAM)-domain, a GAT (GGA and TOM1) domain, and a C-terminal clathrin-binding region, both of which are conserved in Golgi-localized gamma ear-containing Arf-binding proteins (GGAs). It interacts with Tollip through its GAT domain and recruits clathrin onto endosomes through its C-terminal region. However, in the C-terminal clathrin-binding region, Tom1 and Tom1L2 are similar to each other, but distinguishable from Tom1L1. The canonical GAT domain is a monomeric three-helix bundle that binds ubiquitin.	92
410585	cd14238	GAT_TM1L2	canonical GAT domain found in target of Myb-like protein 2 (Tom1L2). Tom1L2, together with Myb protein 1 (Tom1) and target of Myb-like protein 1 (Tom1L1), constitute the Tom1 family. Tom1L2 can interact with Toll-interacting protein (Tollip), clathrin, and ubiquitin. It may play a potential role in endosomal sorting, as well as in the regulation of membrane trafficking that is linked to immunity and cell proliferation. Tom1L2 contains an N-terminal VHS (Vps27p/Hrs/STAM)-domain, a GAT (GGA and TOM1) domain, and a C-terminal clathrin-binding region, both of which are conserved in Golgi-localized gamma ear-containing Arf-binding proteins (GGAs). It interacts with Tollip through its GAT domain and recruits clathrin onto endosomes through its C-terminal region. However, in the C-terminal clathrin-binding region, Tom1 and Tom1L2 are similar to each other, but distinguishable from Tom1L1. The canonical GAT domain is a monomeric three-helix bundle that binds ubiquitin.	92
410586	cd14239	GAT_GGA1_GGA2	canonical GAT domain found in ADP-ribosylation factor (Arf)-binding proteins GGA1 and GGA2. This subfamily includes GGA1 and GGA2, both of which belong to a family of ubiquitously expressed, monomeric, motif-binding cargo/clathrin adaptor proteins that regulate clathrin-mediated trafficking of cargo proteins from the trans-Golgi network (TGN) to endosomes. GGA1, also called gamma-adaptin-related protein 1, or Golgi-localized gamma ear-containing Arf-binding protein 1, regulates the low-density lipoprotein and sorting receptor LR11/SorLA endocytic traffic and further alters amyloid-beta precursor protein (APP) intracellular distribution and amyloid-beta production. It is also critical for the effects of beta-site APP-cleaving enzyme-1 (BACE1) on amyloid-beta generation. It interacts with BACE1 and promotes its traffic from early endosomes to late endocytic compartments or the TGN. Moreover, GGA1 acts as a clathrin assembly protein with the ability to polymerize clathrin into tubules. GGA2, also called gamma-adaptin-related protein 2, Golgi-localized gamma ear-containing Arf-binding protein 2, or VHS domain and ear domain of gamma-adaptin (Vear), interacts with the acidic cluster-dileucine motif in the cytoplasmic tail of the cation-independent mannose 6-phosphate receptor (CI-MPR) and further plays a major role in the sorting of lysosomal enzymes. It also mediates a vital function that cannot be compensated for by GGA1 and/or GGA3. Both GGA1 and GGA2 have a multidomain structure consisting of an N-terminal VHS (Vps27/Hrs/Stam) domain, a GAT (GGA and TOM) domain, a largely unstructured hinge region that contains clathrin-binding motifs, and a C-terminal GAE (gamma-adaptin ear homology) domain. The GAT domain of GGAs interacts with class I GTP-bound form of Arf proteins, Rabaptin-5, ubiquitin, and/or the tumor susceptibility gene 101 product (TSG101).	88
410587	cd14240	GAT_GGA3	canonical GAT domain found in ADP-ribosylation factor-binding protein GGA3. GGA3, also called Golgi-localized gamma ear-containing Arf-binding protein 3, belongs to a family of ubiquitously expressed, monomeric, motif-binding cargo/clathrin adaptor proteins that regulate clathrin-mediated trafficking of cargo proteins from the trans-Golgi network (TGN) to endosomes. GGA3 interacts selectively with the Met/Hepatocyte Growth Factor receptor tyrosine kinase (RTK) when stimulated. It functions as a specific cargo adaptor to target the Met RTK into recycling tubules, and further coordinates the recycling, signaling and degradative fates of the Met RTK. Moreover, GGA3, together with PACS-1 and the protein kinase CK2, forms a complex that regulates cation-independent mannose-6-phosphate receptor (CI-MPR) trafficking. Furthermore, GGA3 has been identified as an interacting protein of the beta-site APP-cleaving enzyme-1 (BACE1), a stress-related protease that is involved in Alzheimer's disease (AD) pathology. GGA3 has a multidomain structure consisting of an N-terminal VHS (Vps27/Hrs/Stam) domain, a GAT (GGA and TOM) domain, a largely unstructured hinge region that contains clathrin-binding motifs, and a C-terminal GAE (gamma-adaptin ear homology) domain. The GAT domain of GGAs interacts with class I GTP-bound form of Arf proteins, Rabaptin-5, ubiquitin, and/or the tumor susceptibility gene 101 product (TSG101).	87
260131	cd14241	PAD	Phenolic Acid Decarboxylase. This family of bacterial and fungal phenolic acid decarboxylases catalyzes the non-oxidative decarboxylation of phenolic acids to produce 4-vinyl derivatives. Phenolic acids, such as ferulic, p-coumaric, and caffeic acids, are important lignin-related aromatic acids and are natural constituents of plant cell walls. They act as crosslinkers between lignin polymers and hemicellulose/cellulose in plants. Their degradation is important from a biotechnological viewpoint.	144
260109	cd14243	PT-AcyF_like	Putative ABBA-type prenyltransferases acting on cyanobactins. Members of this family are found in gene clusters responsible for the production and posttranslational modification of cyanobactins, small ribosomal cyclic peptides produced by cyanobacteria. The AcyF_like proteins are structurally similar to the ABBA-type aromatic prenyltransferases, and may be responsible for the reverse- and forward-O-prenylation of tyrosine, serine, and threonine in cyanobactins. ABBA-type aromatic prenyltransferases (PTases) are a subgroup of prenyltransferases that are characterized by an unusual type of beta/alpha fold with antiparallel beta strands. They lack the (N/D)DxxD motif which is characteristic for many other prenyltransferases.	294
271203	cd14244	GH_101_like	Endo-a-N-acetylgalactosaminidase and related glycosyl hydrolases. This family contains the enzymatically active domain of cell surface proteins that specifically cleave Gal-beta-1,3-GalNAc-alpha-Ser/Thr (T-antigen, galacto-N-biose), the core 1 type O-linked glycan common to mucin glycoproteins (EC:3.2.1.97). It has been classified as glycosyl hydrolase family 101 in the CAZy resource. Virulence of pathogenic organisms such as the Gram-positive Streptococcus pneumoniae and other commensal human bacteria is largely determined by their ability to degrade host glycoproteins and to metabolize the resultant carbohydrates.	298
271204	cd14245	DMP12	Putative DNA mimic protein DMP12. The Neisseria sp. protein DMP12 has been shown to interact with the bacterial histone-like protein HU and may do so by acting as a DNA mimic. It is likely to play a regulatory role, but not via direct competition for binding of HU, which is involved in maintenance of the bacterial nucleoid structure.	114
271205	cd14246	ADAM17_MPD	Membrane-proximal domain of a disintegrin and metalloprotease 17 (ADAM17). ADAM17 is a multi-domain protein that acts as a sheddase; is involved in the cleavage and release of the soluble ectodomain of tumor necrosis factor alpha from the cell surface and in the trans-Golgi network, as well as in the release of various other targets such as cytokines and cell adhesion molecules. This links ADAM17 to a variety of biological processes, including cellular differentiation and the progression of cancer. It was shown that the enzymatic activity of ADAM17 is regulated via a protein-disulfide isomerase (PDI). Specifically, the disulfide bridges within a CxxC motif of the membrane-proximal domain (MPD) are isomerized by PDI; the conversion triggers a conformational change between a closed and an opened form of the MPD, which may constitute a molecular switch that triggers the shedding activity of ADAM17.	60
271206	cd14247	Lmo2686_like	Uncharacterized hexameric protein conserved in Bacilli. This family conserved in bacilli contains proteins that form an unusual hexameric arrangement via circular domain-swapping of a beta-hairpin-beta unit.	138
271207	cd14248	ESP	Exocrine gland-secreting peptide 1 (ESP1) and similar pheromones. ESP1 is a peptide pheromone found in male mouse tear fluid, which is recognized by a specific G-protein-coupled receptor in the vomeronasal sensory neurons and affects female mouse sexual receptive behaviour. This small family appears restricted to rodents. ESP36 is expressed only in the female mouse extraorbital lacrimal gland. The juvenile pheromone ESP22 is secreted from the lacrimal gland and released into the tears of 2-3 week old mice; it activates the vomeronasal response pathway, and inhibits male sexual behavior. Information regarding other members of this family is not yet available.	52
271208	cd14249	ESP1_like	Exocrine gland-secreting peptide 1 (ESP1) and similar pheromones. ESP1 is a peptide pheromone found in male mouse tear fluid, which is recognized by a specific G-protein-coupled receptor in the vomeronasal sensory neurons and affects female mouse sexual receptive behaviour. This small family appears restricted to rodents; the functions of members other than mouse ESP1 have not yet been determined.	46
271209	cd14250	ESP36_like	Exocrine gland-secreting peptide 36 (ESP36) and similar pheromones. ESP36 is a peptide pheromone expressed only in the female mouse extraorbital lacrimal gland. This family also includes the juvenile pheromone ESP22 which is secreted from the lacrimal gland and released into the tears of 2-3 week old mice. ESP22 activates the vomeronasal response pathway, and inhibits male sexual behavior. This small family appears restricted to rodents; the functions of other members have not yet been determined.	55
271210	cd14251	PL-6	Polysaccharide Lyase Family 6. Polysaccharide Lyase Family 6 is a family of beta-helical polysaccharide lyases. Members include alginate lyase (EC 4.2.2.3) and chondroitinase B (EC 4.2.2.19). Chondroitinase B is an enzyme that only cleaves the beta-(1,4)-linkage of dermatan sulfate (DS), leading to 4,5-unsaturated dermatan sulfate disaccharides as the product. DS is a highly sulfated, unbranched polysaccharide belonging to a family of glycosaminoglycans (GAGs) composed of alternating hexosamine (gluco- or galactosamine) and uronic acid (D-glucuronic or L-iduronic acid) moieties. DS contains alternating 1,4-beta-D-galactosamine (GalNac) and 1,3-alpha-L-iduronic acid units. The related chondroitin sulfate (CS) contains alternating GalNac and 1,3-beta-D-glucuronic acid units. Alginate lyases (known as either mannuronate (EC 4.2.2.3) or guluronate lyases (EC 4.2.2.11)) catalyze the degradation of alginate, a copolymer of alpha-L-guluronate and its C5 epimer beta-D-mannuronate.	369
271211	cd14252	Dockerin_like	Dockerin repeat domains and domains resembling dockerin repeats. Dockerins are modules in the cellulosome complex that often anchor catalytic subunits by binding to cohesin domains of scaffolding proteins. Three types of dockerins and their corresponding cohesin have been described in the literature. This alignment models two consecutive dockerin repeats, the functional unit.	57
271212	cd14253	Dockerin	Dockerin repeat domain. Dockerins are modules in the cellulosome complex that often anchor catalytic subunits by binding to cohesin domains of scaffolding proteins. Three types of dockerins and their corresponding cohesin have been described in the literature. This alignment models two consecutive dockerin repeats, the functional unit.	56
271213	cd14254	Dockerin_II	Type II dockerin repeat domain. Bacterial cohesin domains bind to a complementary protein domain named dockerin, and this interaction is required for the formation of the cellulosome, a cellulose-degrading complex. The cellulosome consists of scaffoldin, a noncatalytic scaffolding polypeptide, that comprises repeating cohesion modules and a single carbohydrate-binding module (CBM). Specific calcium-dependent interactions between cohesins and dockerins appear to be essential for cellulosome assembly. This subfamily represents type II dockerins, which are responsible for mediating attachment of the cellulosome complex to the bacterial cell wall.	54
271214	cd14255	Dockerin_III	Type III dockerin repeat domain. Bacterial cohesin domains bind to a complementary protein domain named dockerin, and this interaction is required for the formation of the cellulosome, a cellulose-degrading complex. Two specific calcium-dependent interactions between cohesin and dockerin appear to be essential for cellulosome assembly, type I and type II. This subfamily represents the atypical type III dockerins and related domains.	65
271215	cd14256	Dockerin_I	Type I dockerin repeat domain. Bacterial cohesin domains bind to a complementary protein domain named dockerin, and this interaction is required for the formation of the cellulosome, a cellulose-degrading complex. The cellulosome consists of scaffoldin, a noncatalytic scaffolding polypeptide, that comprises repeating cohesion modules and a single carbohydrate-binding module (CBM). Specific calcium-dependent interactions between cohesins and dockerins appear to be essential for cellulosome assembly. This subfamily represents type I dockerins, which are responsible for anchoring a variety of enzymatic domains to the complex.	57
271221	cd14257	CttA_X	X module of the carbohydrate-binding protein CttA and similar proteins. This model represents a putative carbohydrate-binding domain conserved in Ruminococcus, which sits N-terminal to a dockerin repeat; the protein may be a component of the Ruminococcus cellulosome system. This X module does not share similarities with other known X modules from cellulolytic bacteria and may have a structural role.	116
271222	cd14259	PUFD_like	PCGF Ub-like fold discriminator and related domains. The PUFD domain binds the RAWUL (RING finger and WD40-associated ubiquitin-like) domain of the polycomb-group RING finger homologs PCGF1 and PCGF3. PUFD was characterized as a domain of the BCL6 corepressor BCOR. It does not appear to bind to PCGF2 and PCGF4. PCGF1 is a component of the Polycomb group (PcG) multi-protein BCOR complex, which is involved in repressing the transcription of BCL6 and CDKN1A. The BCL-6 corepressor (BCOR) is a transcriptional repressor required for germinal center formation and is possibly involved in apoptosis.	106
271223	cd14260	PUFD_like_1	PCGF Ub-like fold discriminator of BCOR-like 1. The PUFD domain binds the RAWUL (RING finger and WD40-associated ubiquitin-like) domain of the polycomb-group RING finger homologs PCGF1 and PCGF3. PUFD was characterized as a domain of the BCL6 corepressor BCOR. It does not appear to bind to PCGF2 and PCGF4. PCGF1 is a component of the Polycomb group (PcG) multi-protein BCOR complex, which is involved in repressing the transcription of BCL6 and CDKN1A. The BCL-6 corepressor-like protein 1 (BCoR-L1) is largely uncharacterized; it contains ankyrin repeats.	115
271224	cd14261	PUFD	PCGF Ub-like fold discriminator of BCOR. The PUFD domain binds the RAWUL (RING finger and WD40-associated ubiquitin-like) domain of the polycomb-group RING finger homologs PCGF1 and PCGF3. PUFD was characterized as a domain of the BCL6 corepressor BCOR. It does not appear to bind to PCGF2 and PCGF4. PCGF1 is a component of the Polycomb group (PcG) multi-protein BCOR complex, which is involved in repressing the transcription of BCL6 and CDKN1A. The BCL-6 corepressor (BCOR) is a transcriptional repressor required for germinal center formation and is possibly involved in apoptosis.	117
271354	cd14262	VirB5_like	VirB5 protein family. This family contains VirB5 domains, including TraC, a VirB5 homolog encoded by the pKM101 plasmid, and similar proteins. VirB5 is one of 11 conserved proteins (VirB1-VirB11) in Agrobacterium tumefaciens, the causative agent of crown gall disease, that span the inner and the outer membrane, and is involved in type IV DNA secretion systems (T4SS) which mediate the translocation of virulence factors (proteins and/or DNA) from Gram-negative bacteria into eukaryotic cells. VirB5 assembles extracellular pili by interacting with several essential proteins. VirB2-VirB5 complex formation precedes incorporation into pili; it depends on the inner membrane protein VirB4 to interact directly with and stabilize VirB8 in order for VirB5 to bind to VirB8 and VirB10. Mutagenesis studies show that VirB5 proteins participate in protein-protein interactions important for pilus assembly and function.	173
260132	cd14263	DAGK_IM_like	Integral membrane diacylglycerol kinase and similar enzymes. This mostly bacterial family of homo-trimeric integral membrane enzymes, the products of the dgkA gene, catalyzes the ATP-dependent phosphorylation of substrates such as diacylglycerol to phosphatidic acid or of undecaprenol to undecaprenyl phosphate. They are not related to other cytosolic or membrane-associated kinases, including the eukaryotic diacylglycerol kinases.	106
260133	cd14264	DAGK_IM	Integral membrane diacylglycerol kinase. This mostly bacterial family of homo-trimeric integral membrane enzymes, the products of the dgkA gene, catalyzes the ATP-dependent phosphorylation of diacylglycerol to phosphatidic acid. Escherichia coli DAGK participates in the membrane-derived oligosaccharide cycle (MDO cycle) by recycling lipids to restore phosphatidylglycerols that were used up in the biosynthesis of MDOs. DAGK also recycles diacylglycerols that are produced during the biosynthesis of lipopolysaccharides (LPS) back to phospholipids. DAGK is not the main source of phosphatidic acid in de-novo biosynthesis of glycerophospholipids. Escherichia coli DAGK has low activity as an undecaprenol kinase.	109
260134	cd14265	UDPK_IM_like	Integral membrane undecaprenol kinase and similar enzymes. This mostly bacterial family of homo-trimeric integral membrane enzymes, the products of the dgkA gene, catalyzes the ATP-dependent phosphorylation of undecaprenol to undecaprenyl phosphate. C55-isoprenyl (undecaprenyl) pyrophosphate acts as a scaffold for the assembly of peptidoglycan components; undecaprenol kinase (UDPK) is involved in recycling undecaprenyl units for re-use in the peptidoglycan biosynthesis. UDPK does not participate in the de-novo biosynthesis of undecaprenyl phosphate. Gram-positive bacteria have a large pool of free undecaprenol, in contrast to gram-negative bacteria. UDPK may also play a role in a stress-induced pathway that affects the function of ribosomes. In Streptococcus mutans, UDPK has been shown to be required for biofilm formation, such as in the case of smooth surface dental caries. Members of the UDPK family have low activity as diacylglycerol kinases (DAGK), and many of them are annotated as DAGKs.	106
260135	cd14266	UDPK_IM_PAP2_like	Integral membrane undecaprenol kinase domain co-occurring with type 2 phosphatidic acid phosphatase-like domains. This bacterial family of homo-trimeric integral membrane enzyme domains catalyzes the ATP-dependent phosphorylation of undecaprenol to undecaprenyl phosphate. They sit N-terminally to phosphatase domains that are members of the type 2 phosphatidic acid phosphatase superfamily, and the function of members of this domain architecture was determined to be undecaprenyl pyrophosphate phosphatases. The bi-functional enzymes might generate undecaprenyl phosphate via two mechanisms - the phosphorylation of undecaprenol or the cleavage of the terminal phosphate group of undecaprenyl pyrophosphate.	106
341312	cd14267	Rif1_CTD_C-II_like	Saccharomyces cerevisiae Rap1-interacting factor 1 CTD domain, metazoan Rif1 C-II domains and related domains. This model includes Saccharomyces cerevisiae Rif1_CTD (carboxy-terminal domain) and metazoan Rif1 C-II (C-terminal subdomain II). Rif1 was originally identified in S. cerevisiae where it negatively regulates telomere length homeostasis via interaction with the C-terminal domain of Rap1. A protective capping structure (telosome) comprised of Rap1, Rif1, and Rif2, inhibits telomerase, counteracts SIR-mediated transcriptional silencing, and prevents inadvertent recognition of telomeres as DNA double-strand breaks (DSBs). S. cerevisiae Rif1 has two Rap1 binding sites: the Rap1-binding module (RBM), and the CTD domain. The latter, represented here, has a lower Rap1 affinity, and provides trans binding through tetramerization. In mammals, Rif1 has been implicated in various cellular processes including pluripotency of stem cells, breast cancer development, and DSB repair pathway choice. A mutual antagonism between the nonhomologous end joining factors (53BP1-RIF1) and the homologous recombination factors (BRCA1 -CtIP) ensures correct repair pathway choice.	46
270456	cd14270	UBA	UBA domain found in proteins involved in ubiquitin-mediated proteolysis. The ubiquitin-associated (UBA) domains are commonly occurring sequence motifs found in proteins involved in ubiquitin-mediated proteolysis. They contribute to ubiquitin (Ub) binding or ubiquitin-like (UbL) domain binding. However, some kinds of UBA domains can bind only the UbL domain, but not the Ub domain. UBA domains are normally comprised of compact three-helix bundles which contain a conserved GF/Y-loop. They can bind polyubiquitin with high affinity. They also bind monoubiquitin and other proteins. Most UBA domain-containing proteins have one UBA domain, but some harbor two or three UBA domains.	30
270457	cd14271	UBA_YLR419W_like	UBA domain found in Saccharomyces cerevisiae putative ATP-dependent RNA helicase YLR419W and similar proteins. The group includes some uncharacterized hypothetical proteins which show a high level of sequence similarity with Saccharomyces cerevisiae putative ATP-dependent RNA helicase YLR419W. All family members contain a ubiquitin-associated (UBA) domain, RWD domain, DEAD-box (DEXDc), helicase superfamily c-terminal domain (HELICc), Helicase associated domain (HA2), and a C-terminal oligonucleotide/oligosaccharide-binding (OB)-fold. 	41
270458	cd14272	UBA_AMPK-RKs	UBA domain of AMPK related kinases. The AMPK-RK family comprises AMP-activated protein kinases (AMPKs), MAP/microtubule affinity-regulating kinases (MARKs), Brain-specific kinases (BRSKs), Salt inducible kinases (SIKs), maternal embryonic leucine zipper kinase (MELK), and SNF-related serine/threonine-protein kinase (SNRK). It is the only kinase family in the human genome containing a ubiquitin-associated (UBA) or UBA-like domain which is located immediately C-terminal to their N-terminal protein kinase catalytic domain. In addition, most family members contain a C-terminal regulatory domain of 5'-AMP-activated protein kinase (AMPK), but some lack this region. AMPK-RKs play central roles in metabolic control, energy homeostasis and stress responses in eukaryotes. They require phosphorylation by liver kinase B1 (LKB1) for full activity. Normally, AMPK-RKs appear to exist as heterotrimeric complexes consisting of a catalytic alpha-subunit and regulatory beta- and gamma-subunits. This model corresponds to the catalytic subunit. The UBA domain, previously called SNF1 homology (SNH) domain, regulates the conformation, LKB1-mediated phosphorylation and activation, and localization of the AMPK-RKs, but does not interact with ubiquitin-like molecules. In AMPKalpha subunits, the UBA-like autoinhibitory domain (AID) is responsible for AMPKalpha subunit autoinhibition. Due to the lack of UBA domain, NUAK1 kinase, also called ARK5 (AMPK-related kinase 5), and NUAK2 kinase, also called SNARK (SNF1/AMPK-related kinase), are not included in this family.	38
270459	cd14273	UBA_TAP-C_like	UBA-like domain found in the NXF family of mRNA nuclear export factors and similar proteins. This family includes nuclear RNA export factors (NXF1/NXF2), FAS-associated factors (FAF1/2), tyrosyl-DNA phosphodiesterase 2 (TDP2), OTU domain-containing proteins (OTU7A/OTU7B), NSFL1 cofactor p47, defective in cullin neddylation protein 1 (DCN1)-like protein (DCNL1/DCNL2), yeast defective in cullin neddylation protein 1 (DCN1) and similar proteins. NXF proteins can stimulate nuclear export of mRNAs and facilitate the export of unspliced viral mRNA containing the constitutive transport element. FAF1 is an apoptotic signaling molecule that acts downstream in the Fas signal transduction pathway. It interacts with the cytoplasmic domain of Fas, but not to a Fas mutant that is deficient in signal transduction. FAF2 is the translation product of a highly expressed gene in the T-cells and eosinophils of atopic dermatitis patients compared with those of normal individuals. Its biological function remains unclear. TDP2 is a 5'-Tyr-DNA phosphodiesterase required for the efficient repair of topoisomerase II-induced DNA double strand breaks. OTU7A and OTU7B are zinc finger proteins that function as deubiquitinating enzymes. p47 is a major cofactor of the cytosolic AAA ATPase p97. It is required for the p97-regulated membrane reassembly of the endoplasmic reticulum (ER), the nuclear envelope and the Golgi apparatus. DCNL1 plays an essential role in the neddylation E3 complex and participates in the release of inhibitory effects of CAND1 on cullin-RING ligase E3 complex assembly and activity. The biological function of DCNL2 remains unclear. Yeast DCN1 is a scaffold-type E3 ligase for cullin neddylation. It can bind directly to cullins and the ubiquitin-like protein Nedd8-specific E2 (Ubc12), and regulate cullin neddylation and thus display ubiquitin ligase activity.	31
270460	cd14274	UBA_ACK1	UBA domain found in activated Cdc42 kinase 1 (ACK1) and similar proteins. ACK1, also called tyrosine kinase non-receptor protein 2, is an intracellular non-receptor tyrosine kinase that specifically interacts with Cdc42 and acts as a Cdc42 effector. It forms a signaling complex with Cdc42, p130(Cas), and Crk, and mediates Cdc42-dependent cell migration and signaling to p130(Cas). Ack1 also stimulates prostate tumorigenesis in part by inhibiting the proapoptotic tumor suppressor WW domain containing oxidoreductase (Wwox). Moreover, ACK1 associates directly with the heavy chain of clathrin and further participates in trafficking, underlying an ability to increase receptor-mediated transferrin uptake. It may function as a regulator of the guanine nucleotide exchange factor Dbl that can activate Rho family proteins. ACK1 consists of an N-terminal tyrosine kinase catalytic domain followed by an SH3 domain, a Cdc42/Rac interactive binding (CRIB) domain, a proline-rich region, and a C-terminal ubiquitin-association (UBA) domain. The proline-rich region of ACK1 is responsible for the binding to the adaptor proteins Nck, Grb2, sorting nexin protein 9 (SH3PX1), and Hck.	45
270461	cd14275	UBA_EF-Ts	UBA domain found in elongation factor Ts (EF-Ts) from bacteria, chloroplasts and mitochondria of eukaryotes. EF-Ts functions as a nucleotide exchange factor in the functional cycle of EF-Tu, another translation elongation factor that facilitates the binding of aminoacylated transfer RNAs (aminoacyl-tRNA) to the ribosomal A site as a ternary complex with guanosine triphosphate during the elongation cycle of protein biosynthesis, and then catalyzes the hydrolysis of GTP and releases itself in GDP-bound form. EF-Ts forms a complex with EF-Tu and catalyzes the nucleotide exchange reaction promoting the formation of EF-Tu in GTP-bound form from EF-Tu in GDP-bound form. EF-Ts from Thermus thermophilus is shorter than EF-Ts from Escherichia coli, but it has higher thermostability. The mitochondrial translational EF-Ts from chloroplasts and mitochondria display high similarity to the bacterial EF-Ts. The majority of family members contain one ubiquitin-associated (UBA) domain, but some family members from plants harbor two tandem UBA domains.	37
270462	cd14276	UBA_UBP25_like	UBA domain found in ubiquitin carboxyl-terminal hydrolase UBP25, UBP28, and similar proteins. UBP25, also called deubiquitinating enzyme 25, USP on chromosome 21, ubiquitin thioesterase 25, or ubiquitin-specific-processing protease 25, belongs to the deubiquitinating enzyme (DUB) family that specifically hydrolyzes ubiquitin chains on ubiquitin-conjugated proteins. USP25 has one muscular isoform and two ubiquitous isoforms. The longer muscular isoform can bind to muscle-restricted cytoskeletal and sarcomeric proteins, such as myosin binding protein C1 (MyBPC1), actin alpha-1 (ACTA1) and filamin C (FLNC), and further prevent their degradation. USP25 harbors three potential ubiquitin-binding domains (UBDs), one ubiquitin-associated (UBA) domain and two ubiquitin-interacting motifs (UIMs) in the N-terminal region. Its C-terminal tyrosine-rich region is responsible for the binding of the second SH2 domain of SYK, a non-receptor tyrosine kinase that specifically phosphorylates USP25 and alters its cellular levels. UBP28, also called deubiquitinating enzyme 28, ubiquitin thioesterase 28, or ubiquitin-specific-processing protease 28, is also an ubiquitin-specific protease belonging to the DUB family. UBP28 can form a ternary complex with nucleoplasmic Fbw7alpha, an F-box protein that is part of an SCF-type ubiquitin ligase, and MYC, a transcription factor encoded by MYC proto-oncogene. UBP28 is required for the stability of MYC, and this stabilization is necessary for tumour-cell proliferation. Besides, UBP28 plays a critical role in the regulation of the Chk2-p53-PUMA pathway. It specifically interacts with 53BP1 and is essential to stabilize Chk2 and 53BP1 in response to DNA damage.	38
270463	cd14277	UBA_UBP2_like	UBA domain found in ubiquitin-associated protein 2 (UBAP-2) like proteins. The family contains some uncharacterized ubiquitin-associated proteins, including UBAP-2 and its homolog, UBAP2-like [UBP2L, also called protein NICE-4 (for newly identified cDNA from the epidermal differentiation complex EDC)], both of which contain an N-terminal ubiquitin-associated (UBA) domain along with a highly conserved, but function unknown domain (DUF3697).	38
270464	cd14278	UBA_NAC_like	UBA-like domain found in nascent polypeptide-associated complex subunit alpha (NACA) and similar proteins. The family contains nascent polypeptide-associated complex subunit alpha (NACA), putative NACA-like protein (NACP1), nascent polypeptide-associated complex subunit alpha domain-containing protein 1 (NACAD), and similar proteins found in archaea and bacteria. NACA, also called NAC-alpha or Alpha-NAC, together with BTF3, also called Beta-NAC, form the nascent polypeptide-associated complex (NAC) which is a cytosolic protein chaperone that contacts the nascent polypeptide chains as they emerge from the ribosome. Besides, NACA has a high affinity for nucleic acids and exists as part of several protein complexes playing a role in proliferation, apoptosis, or degradation. It is a cytokine-modulated specific transcript in the human TF-1 erythroleukemic cell line. It also acts as a transcriptional co-activator in osteoblasts by binding to phosphorylated c-Jun, a member of the activator-protein-1 (AP-1) family. Moreover, NACA binds to and regulates the adaptor protein Fas-associated death domain (FADD). In addition, NACA functions as a novel factor participating in the positive regulation of human erythroid-cell differentiation. The biological function of NACP1 (also called Alpha-NAC pseudogene 1 or NAC-alpha pseudogene 1) and NACAD remain unclear. The family also includes huntingtin-interacting protein K (HYPK), also called Huntingtin yeast partner K or Huntingtin yeast two-hybrid protein K. It is an intrinsically unstructured Huntingtin (HTT)-interacting protein with chaperone-like activity. It may be involved in regulating cell growth, cell cycle, unfolded protein response and cell death. All members in this family contain an ubiquitin-associated (UBA) domain.	37
270465	cd14279	CUE	CUE domain found in ubiquitin-binding CUE proteins. This family includes many coupling of ubiquitin conjugation to endoplasmic reticulum degradation (CUE) domain containing proteins that are characterized by an FP and a di-leucine-like sequence and bind to monoubiquitin with varying affinities. Some higher eukaryotic CUE domain proteins do not bind monoubiquitin efficiently, since they carry LP, rather than FP among CUE domains. CUE domains form three-helix bundle structures and are distantly related to the ubiquitin-associated (UBA) domains which are widely occurring ubiquitin-binding motifs found in a broad range of cellular proteins in species ranging from yeast to human. The majority of family members contain one CUE domain, but some family members from fungi harbor two CUE domains.	38
270466	cd14280	UBA1_Rad23_like	UBA1 domain of Rad23 proteins found in eukaryotes. The Rad23 family includes the yeast nucleotide excision repair (NER) proteins, Rad23p (in Saccharomyces cerevisiae) and Rhp23p (in Schizosaccharomyces pombe), their mammalian orthologs HR23A and HR23B, and putative DNA repair proteins from plants. Rad23 proteins play dual roles in DNA repair as well as in proteasomal degradation. They have affinity for both the proteasome and ubiquitinylated proteins and participate in translocating polyubiquitinated proteins to the proteasome. Rad23 proteins carry a ubiquitin-like (UBL) and two ubiquitin-associated (UBA) domains, as well as a xeroderma pigmentosum group C (XPC) protein-binding domain. UBL domain is responsible for the binding to proteasome. UBA domains are important for binding of ubiquitin (Ub) or multi-ubiquitinated substrates which suggests Rad23 proteins might be involved in certain pathways of ubiquitin metabolism. Both UBL domain and XPC-binding domain are necessary for efficient NER function of Rad23 proteins. This model corresponds to the UBA1 domain.	39
270467	cd14281	UBA2_Rad23_like	UBA2 domain of Rad23 proteins found in eukaryotes. The Rad23 family includes the yeast nucleotide excision repair (NER) proteins, Rad23p (in Saccharomyces cerevisiae) and Rhp23p (in Schizosaccharomyces pombe), their mammalian orthologs HR23A and HR23B, and putative DNA repair proteins from plants. Rad23 proteins play dual roles in DNA repair as well as in proteasomal degradation. They have affinity for both the proteasome and ubiquitinylated proteins and participate in translocating polyubiquitinated proteins to the proteasome. Rad23 proteins carry a ubiquitin-like (UBL) and two ubiquitin-associated (UBA) domains, as well as a xeroderma pigmentosum group C (XPC) protein-binding domain. UBL domain is responsible for the binding to proteasome. UBA domains are important for binding of ubiquitin (Ub) or multi-ubiquitinated substrates which suggests Rad23 proteins might be involved in certain pathways of ubiquitin metabolism. Both UBL domain and XPC-binding domain are necessary for efficient NER function of Rad23 proteins. This model corresponds to the UBA2 domain.	38
270468	cd14282	UBA_TDRD3	UBA domain of Tudor domain-containing protein 3 (TDRD3) and similar proteins. TDRD3 is a modular protein containing Tudor domain, a DUF/OB-fold motif and a ubiquitin-associated (UBA) domain. It shows both nucleic acid- and methyl-binding properties and can interact with methylated RNA-binding proteins, such as fragile X mental retardation protein (FMRP) and DEAD/H box-3 (also known as DDX3X/Y, DBX/Y, HLP2 and DDX14) which is implicated in human genetic diseases. At this point, TDRD3 may play a central role in RNA processing regulatory pathways involving arginine methylation. TDRD3 localizes predominantly to the cytoplasm stress granules (SGs). The Tudor domain is essential and sufficient for its recruitment to SGs.	39
270469	cd14283	UBA_TNR6C	UBA domain found in trinucleotide repeat-containing gene 6C protein (TNRC6C) and similar proteins. TNRC6C is one of three GW182 paralogs in mammalian genomes. It is enriched in P-bodies and important for efficient miRNA-mediated repression. TNRC6C is composed of an N-terminal glycine/tryptophan (G/W)-rich region containing an Ago hook responsible for Ago protein-binding; a ubiquitin-associated (UBA) domain and a glutamine (Q)-rich region in the middle region; a middle G/W-rich region, a RNA recognition motif (RRM), also called RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal G/W-rich region, at the C-terminus. A bipartite C-terminal region including the middle and C-terminal G/W-rich regions is referred as silencing domain that triggers silencing of bound transcripts by inhibiting protein expression and promoting mRNA decay via deadenylation. The C-terminal half containing the RRM domain functions as a key effector domain mediating protein synthesis repression by TNRC6C.	38
270470	cd14284	UBA_GAWKY	UBA domain found in Drosophila melanogaster protein Gawky (GW) and similar proteins. GW is the D. melanogaster GW182 homolog (dGW182) which belongs to the GW182 protein family. The GW182 proteins directly interact with Argonaute (Ago) proteins, and thus function as downstream effectors in the miRNA pathway, responsible for inhibition of translation and acceleration of mRNA decay. They are characterized by an abnormally high content of glycine/tryptophan (G/W) repeats, one or more glutamine (Q)-rich motifs, and a C-terminal RNA recognition motif (RRM), also called RBD (RNA binding domain) or RNP (ribonucleoprotein domain). The GW182 proteins are recruited to miRNA targets through an interaction between their N-terminal domain and an Argonaute protein. Then they promote translational repression and/or degradation of miRNA targets through their C-terminal silencing domain. In addition to a G/W repeats region, a Q-rich region, and a RRM domain, GW also contains a ubiquitin-associated domain (UBA).	35
270471	cd14285	UBA_scEDE1_like	UBA domain found in Saccharomyces cerevisiae EH domain-containing and endocytosis protein 1 (Ede1) and similar proteins. Ede1, also called bud site selection protein 15, is the yeast homolog of the mammalian protein Eps15 and functions at the internalization step of endocytosis. Both Ede1 and Eps15 are endocytic scaffold proteins that may be involved in stabilization of the adaptor-cargo complex. They both contain three N-terminal Eps15 homology (EH) domains and C-terminal ubiquitin-binding motifs. Whereas Eps15 has two ubiquitin interacting motifs (UIM), Ede1 harbors a single ubiquitin-associated (UBA) domain. This model corresponds to Ede1 UBA domain that is responsible for the binding of monoubiquitinated proteins and negatively regulates EH domain-mediated protein-protein interactions.	35
270472	cd14286	UBA_UBP24	UBA domain found in ubiquitin carboxyl-terminal hydrolase 24 (UBP24) and similar proteins. UBP24, also called deubiquitinating enzyme 24, ubiquitin thioesterase 24, or ubiquitin-specific-processing protease 24, is a deubiquitinating protein that interacts with damage-specific DNA-binding protein 2 (DDB2) and regulates DDB2 stability. It may also play a role in the pathogenesis of Parkinson's disease (PD). UBP24 proteins contain an N-terminal ubiquitin-associated (UBA) domain and a C-terminal peptidase C19 domain.	37
270473	cd14287	UBA_At3g58460_like	UBA domain found in uncharacterized protein At3g58460 from Arabidopsis thaliana and its homologs from other plants. The uncharacterized protein At3g58460 from Arabidopsis thaliana is also known as rhomboid-like protein 15 which is encoded by RBL15 gene. Although the biological function of the family members remains unclear, they all contain an N-terminal rhomboid-like domain and a C-terminal ubiquitin-associated (UBA) domain.	36
270474	cd14288	UBA_HUWE1	UBA domain found in eukaryotic E3 ubiquitin-protein ligase HUWE1 and similar proteins. HUWE1, also called ARF-binding protein 1 (ARF-BP1), HECT, UBA and WWE domain-containing protein 1, homologous to E6AP carboxyl terminus homologs protein 9 (HectH9), large structure of UREB1 (LASU1), Mcl-1 ubiquitin ligase E3 (Mule), upstream regulatory element-binding protein 1 (URE-B1), or URE-binding protein 1, may function as a ubiquitin-protein ligase that involves in the ubiquitination cascade that targets specific substrate proteins in proteolysis. It can ubiquitylate DNA polymerase beta (Pol beta), the major BER DNA polymerase and modulates base excision repair (BER). HUWE1 also acts as a critical mediator of both the p53-independent and p53-dependent tumor suppressor functions of ARF tumor suppressor in p53 regulation. Moreover, HUWE1 is both required and sufficient for the polyubiquitination of Mcl-1, an anti-apoptotic Bcl-2 family member involving in DNA damage-induced apoptosis. Furthermore, HUWE1 plays an important role in the regulation of Cdc6 stability after DNA damage. In addition, HUWE1 works as a partner of N-Myc oncoprotein in neural cells. It ubiquitinates N-Myc and primes it for proteasomal-mediated degradation. HUWE1 contains a ubiquitin-associated (UBA) domain, a WWE domain, and a Bcl-2 homology region 3 (BH3) domain at the N-terminus and a HECT domain at the C-terminus. WWE domain plays a role in the regulation of specific protein-protein interactions in a ubiquitin conjugation system. BH3 domain is responsible for the specific binding to Mcl-1. HECT domain involves in the inhibition of the transcriptional activity of p53 via a ubiquitin-dependent degradation pathway. It also controls neural differentiation and proliferation by destabilizing the N-Myc oncoprotein.	40
270475	cd14289	UBA_RHBD3	UBA domain found in vertebrate rhomboid domain-containing protein 3 (RHBD3). RHBD3 is encoded by a novel chromosome 22 CpG island-associated gene (C22orf3) that is not expressed in a significant proportion of pituitary tumors. C22orf3 is also called pituitary tumor apoptosis gene (PTAG) or RHBDD3 which is located directly upstream of EWSR1. Although its biological function remains unclear, RHBD3 contains an N-terminal rhomboid domain and a C-terminal ubiquitin-associated (UBA) domain.	44
270476	cd14290	UBA_PUB_plant	UBA domain found in plant PNGase/UBA or UBX (PUB) domain-containing proteins. This family includes some uncharacterized hypothetical proteins found in plants. Although their biological function remains unclear, all family members contain an N-terminal ubiquitin-associated (UBA) domain and a C-terminal PUB domain. UBA domain, along with UBL (ubiquitin-like) domain, has been implicated in proteasomal degradation by associating with substrates destined for degradation as well as with subunits of the proteasome, thus regulating protein turnover. PUB domain functions as a p97 (also known as valosin-containing protein or VCP) adaptor by interacting with the D1 and/or D2 ATPase domains. The type II AAA+ ATPase p97 is involved in a variety of cellular processes such as the deglycosylation of ERAD substrates, membrane fusion, transcription factor activation and cell cycle regulation through differential binding to specific adaptor proteins.	49
270477	cd14291	UBA1_NUB1_like	UBA1 domain found in NEDD8 ultimate buster 1 (NUB1) and similar proteins. NUB1, also called negative regulator of ubiquitin-like proteins 1, renal carcinoma antigen NY-REN-18, or protein BS4, is a NEDD8-interacting protein that can be induced by interferon. It functions as a strong post-transcriptional down-regulator of the NEDD8 expression and plays critical roles in regulating many biological events, such as cell growth, NF-kappaB signaling, and biological responses to hypoxia. NUB1 can also interact with aryl hydrocarbon receptor-interacting protein-like 1 (AIPL1) which may function in the regulation of cell cycle progression. NUB1 contains three ubiquitin-associated domains (UBA), a bipartite nuclear localization signal (NLS) and a PEST motif. This model corresponds to UBA1 domain.	36
270478	cd14292	UBA2_NUB1	UBA2 domain found in NEDD8 ultimate buster 1 (NUB1) and similar proteins. NUB1, also called negative regulator of ubiquitin-like proteins 1, renal carcinoma antigen NY-REN-18, or protein BS4, is a NEDD8-interacting protein that can be induced by interferon. It functions as a strong post-transcriptional down-regulator of the NEDD8 expression and plays critical roles in regulating many biological events, such as cell growth, NF-kappaB signaling, and biological responses to hypoxia. NUB1 can also interact with aryl hydrocarbon receptor-interacting protein-like 1 (AIPL1) which may function in the regulation of cell cycle progression. NUB1 contains three ubiquitin-associated domains (UBA), a bipartite nuclear localization signal (NLS) and a PEST motif. This model corresponds to UBA2 domain.	35
270479	cd14293	UBA3_NUB1	UBA3 domain found in NEDD8 ultimate buster 1 (NUB1) and similar proteins. NUB1, also called negative regulator of ubiquitin-like proteins 1, renal carcinoma antigen NY-REN-18, or protein BS4, is a NEDD8-interacting protein that can be induced by interferon. It functions as a strong post-transcriptional down-regulator of the NEDD8 expression and plays critical roles in regulating many biological events, such as cell growth, NF-kappaB signaling, and biological responses to hypoxia. NUB1 can also interact with aryl hydrocarbon receptor-interacting protein-like 1 (AIPL1) which may function in the regulation of cell cycle progression. NUB1 contains three ubiquitin-associated domains (UBA), a bipartite nuclear localization signal (NLS) and a PEST motif. This model corresponds to UBA3 domain.	36
270480	cd14294	UBA1_UBP5_like	UBA1 domain found in ubiquitin carboxyl-terminal hydrolase UBP5, UBP13 and similar proteins. UBP5, also called deubiquitinating enzyme 5, Isopeptidase T (IsoT), ubiquitin thioesterase 5, or ubiquitin-specific-processing protease 5, is a deubiquitinating enzyme largely responsible for the disassembly of the majority of unanchored polyubiquitin in the cell. Zinc is required for its catalytic activity. UBP5 contains four ubiquitin (Ub)-binding sites including an N-terminal zinc finger (ZnF) domain, a catalytic ubiquitin-specific processing protease (UBP) domain (catalytic C-box and H-box), and two ubiquitin-associated (UBA) domains. ZnF domain binds the proximal ubiquitin. UBP domain forms the active site. UBA domains are involved in binding linear or K48-linked polyubiquitin. UBP13, also called deubiquitinating enzyme 13, Isopeptidase T-3 (isoT3), ubiquitin thioesterase 13, or ubiquitin-specific-processing protease 13, is an ortholog of UBP5. It has similar domain architecture, but functions differently from USP5 in cellular deubiquitination processes. It exhibits a weak deubiquitinating activity, preferring Lys63-linked polyubiquitin in a non-activation manner. Moreover, the zinc finger (ZnF) domain of USP13 cannot bind to Ub. Its tandem UBA domains can bind with different types of diUb but preferentially with K63-linked. USP13 can also regulate the protein level of CD3delta in cells via its UBA domains. This model corresponds to the UBA1 domain.	44
270481	cd14295	UBA1_atUBP14	UBA1 domain found in Arabidopsis thaliana ubiquitin carboxyl-terminal hydrolase 14 (atUBP14) and similar proteins. atUBP14, also called deubiquitinating enzyme 14, TITAN-6 protein, ubiquitin thioesterase 14, or ubiquitin-specific-processing protease 14, is related to the isopeptidase T class of deubiquitinating enzymes that recycle polyubiquitin chains following protein degradation. atUBP14 is essential for early plant development. It can disassemble multi-ubiquitin chains linked internally via epsilon-amino isopeptide bonds using Lys48 and can process some, but not all, translational fusions of ubiquitin linked via alpha-amino peptide bonds. atUBP14 contains two ubiquitin-association (UBA) domains. This model corresponds to the UBA1 domain.	45
270482	cd14296	UBA1_scUBP14_like	UBA1 domain found in Saccharomyces cerevisiae ubiquitin carboxyl-terminal hydrolase 14 (scUBP14) and similar proteins. scUBP14, also called deubiquitinating enzyme 14, glucose-induced degradation protein 6, ubiquitin thioesterase 14, or ubiquitin-specific-processing protease 14, is the yeast ortholog of human Isopeptidase T (USP5), a deubiquitinating enzyme known to bind the 29-linked polyubiquitin chains. scUBP14 has been identified as a K29-linked polyubiquitin binding protein as well. It is involved in K29-linked polyubiquitin metabolism by binding to the 29-linked Ub4 resin and serving as an internal positive control in budding yeast. Members in this family contain two tandem ubiquitin-association (UBA) domains. This model corresponds to the UBA1 domain.	39
270483	cd14297	UBA2_spUBP14_like	UBA2 domain found in Schizosaccharomyces pombe ubiquitin carboxyl-terminal hydrolase 14 (spUBP14) and similar proteins. spUBP14, also called deubiquitinating enzyme 14, UBA domain-containing protein 2, ubiquitin thioesterase 14, or ubiquitin-specific-processing protease 14, functions as a deubiquitinating enzyme that is involved in protein degradation in fission yeast. Members in this family contain two tandem ubiquitin-association (UBA) domains. This model corresponds to the UBA2 domain.	39
270484	cd14298	UBA2_scUBP14_like	UBA2 domain found in Saccharomyces cerevisiae ubiquitin carboxyl-terminal hydrolase 14 (scUBP14) and similar proteins. scUBP14, also called deubiquitinating enzyme 14, glucose-induced degradation protein 6, ubiquitin thioesterase 14, or ubiquitin-specific-processing protease 14, is the yeast ortholog of human Isopeptidase T (USP5), a deubiquitinating enzyme known to bind the 29-linked polyubiquitin chains. scUBP14 has been identified as a K29-linked polyubiquitin binding protein as well. It is involved in K29-linked polyubiquitin metabolism by binding to the 29-linked Ub4 resin and serving as an internal positive control in budding yeast. Members in this family contain two tandem ubiquitin-association (UBA) domains. This model corresponds to the UBA2 domain.	38
270485	cd14300	UBA_UBS3A_like	UBA domain found in ubiquitin-associated and SH3 domain-containing protein A (UBS3A) and similar proteins. UBS3A, also called Cbl-interacting protein 4 (CLIP4), suppressor of T-cell receptor signaling 2 (Sts-2), or T-cell ubiquitin ligand 1 (TULA-1), is a lymphoid protein only detected in thymus, spleen, and bone marrow. UBS3A exhibits extremely low phosphatase activity, but is capable of promoting T-cell apoptosis independent of either T cell receptor (TCR)/CD3-mediated signaling or caspase activity. It functions as a negative regulator of TCR signaling. UBS3A can also inhibit HIV-1 biogenesis through the binding of ATP-binding cassette protein family E member 1 (ABCE-1), a host factor of HIV-1 assembly. Moreover, UBS3A acts as the Cbl- and ubiquitin-interacting protein that can inhibit endocytosis and downregulation of ligand-activated epidermal growth factor receptor (EGFR) by impairing Cbl-induced ubiquitination, as well as inhibit clathrin-dependent endocytosis in general. This family also includes Arabidopsis thaliana ubiquitin carboxyl-terminal hydrolase 14 (atUBP14) and some uncharacterized AAA-type ATPase-like proteins found in plants.	37
270486	cd14301	UBA_UBS3B	UBA domain found in ubiquitin-associated and SH3 domain-containing protein B (UBS3B) and similar proteins. UBS3B, or Cbl-interacting protein p70, suppressor of T-cell receptor signaling 1 (Sts-1), T-cell ubiquitin ligand 2 (TULA-2), or tyrosine-protein phosphatase STS1/TULA2, is ubiquitously expressed in mammalian tissues in a variety of cell types. It exhibits high phosphatase activity, but demonstrates no proapoptotic activity. It negatively regulates the tyrosine kinase Zap-70 activation and T cell receptor (TCR) signaling pathways that modulate T cell activation. Moreover, UBS3B acts as a Cbl- and ubiquitin-interacting protein that inhibits endocytosis of epidermal growth factor receptor (EGFR) and platelet-derived growth factor receptor.	38
270487	cd14302	UBA_UBXN1	UBA domain found in UBX domain-containing protein 1 (UBXN1) and similar proteins. UBXN1, also called SAPK substrate protein 1 (SAKS1) or UBA/UBX 33.3 kDa protein, is a widely expressed protein containing an N-terminal ubiquitin-associated (UBA) domain, a coiled-coil region, and a C-terminal ubiquitin-like (UBX) domain. It binds polyubiquitin and valosin-containing protein (VCP), and has been identified as a substrate for stress-activated protein kinases (SAPKs). Moreover, UBXN1 specifically binds to Homer2b. It may also interact with ubiquitin (Ub) and may be involved in the Ub-proteasome proteolytic pathways. In addition, UBXN1 can associate with autoubiquitinated BRCA1 tumor suppressor and inhibit its enzymatic function through its UBA domain.	41
270488	cd14303	UBA1_KPC2	UBA1 domain found in Kip1 ubiquitination-promoting complex protein 2 (KPC2) and similar proteins. KPC2, also called ubiquitin-associated domain-containing protein 1 (UBAC1), or glialblastoma cell differentiation-related protein 1, is one of two subunits of Kip1 ubiquitination-promoting complex (KPC), a novel E3 ubiquitin-protein ligase that also contains KPC1 subunit and regulates the ubiquitin-dependent degradation of the cyclin-dependent kinase (CDK) inhibitor p27 at G1 phase. KPC2 contains a ubiquitin-like (UBL) domain and two ubiquitin-associated (UBA) domains. This model corresponds to the UBA1 domain.	41
270489	cd14304	UBA2_KPC2	UBA2 domain found in Kip1 ubiquitination-promoting complex protein 2 (KPC2) and similar proteins. KPC2, also called ubiquitin-associated domain-containing protein 1 (UBAC1), or glialblastoma cell differentiation-related protein 1, is one of two subunits of Kip1 ubiquitination-promoting complex (KPC), a novel E3 ubiquitin-protein ligase that also contains KPC1 subunit and regulates the ubiquitin-dependent degradation of the cyclin-dependent kinase (CDK) inhibitor p27 at G1 phase. KPC2 contains a ubiquitin-like (UBL) domain and two ubiquitin-associated (UBA) domains. This model corresponds to the UBA2 domain.	39
270490	cd14305	UBA_UBAC2	UBA domain found in ubiquitin-associated domain-containing protein 2 (UBAC2) and similar proteins. UBAC2, also called phosphoglycerate dehydrogenase-like protein 1, is a ubiquitin-associated domain (UBA)-domain containing protein encoded by gene UBAC2 (or PHGDHL1), a risk gene for Behcet's disease (BD). It may play an important role in the development of BD through its transcriptional modulation. Members in this family contain an N-terminal rhomboid-like domain and a C-terminal UBA domain.	38
270491	cd14306	UBA_VP13D	UBA domain found in vacuolar protein sorting-associated protein 13D (VP13D) and similar proteins. VP13D is a chorea-acanthocytosis (CHAC)-similar protein encoded by gene VPS13D. It contains two putative domains, ubiquitin-associated (UBA) domain and lectin domain of ricin B chain profile (ricin-B-lectin), suggesting it may interact with, and be involved in the trafficking of, proteins modified with ubiquitin and/or carbohydrate molecules. Further investigation is required.	36
270492	cd14307	UBA_RUP1p	UBA domain found in yeast UBA domain-containing protein RUP1p and similar proteins. RUP1p is a ubiquitin-associated (UBA) domain-containing protein encoded by a nonessential yeast gene RUP1. It can mediate the association of Rsp5 and Ubp2. The N-terminal UBA domain is responsible for antagonizing Rsp5 function, as well as bridging the Rsp5-Ubp2 interaction. No other characterized functional domains or motifs are found in RUP1p.	38
270493	cd14308	UBA_Mud1_like	UBA domain found in Schizosaccharomyces pombe UBA domain-containing protein mud1 and similar proteins. Schizosaccharomyces pombe mud1 is an ortholog of the Saccharomyces cerevisiae DNA-damage response protein Ddi1. S. cerevisiae Ddi1, also called v-SNARE-master 1 (Vsm1), belongs to a family of proteins known as the ubiquitin receptors which can bind ubiquitinated substrates and the proteasome. It is involved in the degradation of the F-box protein Ufo1, involved in the G1/S transition. It also participates in Mec1-mediated degradation of Ho endonuclease. Both S. pombe mud1 and S. cerevisiae Ddi1 contain an N-terminal ubiquitin-like (UBL) domain, an aspartyl protease-like domain, and a C-terminal ubiquitin-associated (UBA) domain. S. pombe mud1 binds to K48-linked polyubiquitin (polyUb) through UBA domain.	36
270494	cd14309	UBA_scDdi1_like	UBA domain found in Saccharomyces cerevisiae DNA-damage response protein Ddi1 and similar proteins. Ddi1, also called v-SNARE-master 1 (Vsm1), is a ubiquitin receptor involved in regulation of the cell cycle and late secretory pathway in Saccharomyces cerevisiae. It functions as a ubiquitin association domain (UBA)- ubiquitin-like-domain (UBL) shuttle protein that is required for the proteasome to enable ubiquitin-dependent degradation of its ligands. For instance, Ddi1 plays an essential role in the final stages of proteasomal degradation of Ho endonuclease and of its cognate FBP, Ufo1. Moreover, Ddi1 and its associated protein Rad23p play a cooperative role as negative regulators in yeast PHO pathway. Ddi1 contains an N-terminal UBL domain and a C-terminal UBA domain. It also harbors a central retroviral aspartyl-protease-like domain (RVP) which may be important in cell-cycle control. At this point, Ddi1 may function proteolytically during regulated protein turnover in the cell. This family also includes mammalian regulatory solute carrier protein family 1 member 1 (RSC1A1), also called transporter regulator RS1 (RS1) which mediates transcriptional and post-transcriptional regulation of Na(+)-D-glucose cotransporter SGLT1.	36
270495	cd14310	UBA_cnDdi1_like	UBA domain found in Cryptococcus neoformans DNA-damage response protein Ddi1 and similar proteins. The family includes some uncharacterized Ddi and similar proteins which show a high level of sequence similarity with yeast Ddi1. Ddi1, also called v-SNARE-master 1 (Vsm1), is a ubiquitin receptor involved in regulation of the cell cycle and late secretory pathway in yeast. It functions as a ubiquitin association domain (UBA)- ubiquitin-like-domain (UBL) shuttle protein that is required for the proteasome to enable ubiquitin-dependent degradation of its ligands. Ddi1 contains an N-terminal UBL domain and a C-terminal UBA domain. It also harbors a central retroviral aspartyl-protease-like domain (RVP) which may be important in cell-cycle control. At this point, Ddi1 may function proteolytically during regulated protein turnover in the cell.	30
270496	cd14311	UBA_II_E2_UBC1	UBA domain of yeast ubiquitin-conjugating enzyme E2 1 (UBC1) and similar proteins. UBC1, also called ubiquitin-conjugating enzyme E2-24 kDa, or ubiquitin-protein ligase, is the yeast homolog of mammalian ubiquitin-conjugating enzyme E2 K (UBE2K or E2-25K). UBC1 and UBE2K are unique class II E2 conjugating enzymes, both of which contain a C-terminal ubiquitin-associated (UBA) domain in addition to an N-terminal catalytic ubiquitin-conjugating enzyme E2 (UBCc) domain. The yeast UBC1 plays an important role in the degradation of short-lived proteins especially during the G0-G1 transition accompanying spore germination.	48
270497	cd14312	UBA_II_E2_UBC27_like	UBA domain found in plant ubiquitin-conjugating enzyme E2 27 and similar proteins. UBC27, also called ubiquitin carrier protein 27, functions as a class II ubiquitin-conjugating (UBC) enzyme (E2). E2, together with E1 (ubiquitin-activating enzyme UBA) and E3 (ubiquitin ligase), is required in the multi-step reaction of ubiquitin conjugation. Unlike other Arabidopsis UBCs, in addition to an N-terminal ubiquitin-conjugating enzyme E2 catalytic domain (UBCc), UBC27 has an additional C-terminal ubiquitin-associated domain (UBA).	36
270498	cd14313	UBA_II_E2_UBE2K_like	UBA domain found in vertebrate ubiquitin-conjugating enzyme E2 K (UBE2K), Drosophila melanogaster ubiquitin-conjugating enzyme E2-22 kDa (UbcD4) and similar proteins. UBE2K, also called Huntingtin-interacting protein 2 (HIP-2), ubiquitin carrier protein, ubiquitin-conjugating enzyme E2-25 kDa (E2-25K), or ubiquitin-protein ligase, is a multi-ubiquitinating enzyme with the ability to synthesize Lys48-linked polyubiquitin chains which is involved in the ubiquitin (Ub)-dependent proteolytic pathway. It interacts with the frameshift mutant of ubiquitin B and functions as a crucial factor regulating amyloid-beta neurotoxicity. It has also been characterized as Huntingtin-interacting protein that modulates the neurotoxicity of Amyloid-beta (Abeta), the principal protein involved in Alzheimer's disease pathogenesis. Moreover, E2-25K increases aggregate formation of expanded polyglutamine proteins and polyglutamine-induced cell death in the pathology of polyglutamine diseases. UbcD4, also called ubiquitin carrier protein, or ubiquitin-protein ligase, is encoded by Drosophila E2 gene which is only expressed in pole cells in embryos. It is a putative E2 enzyme homologous to the Huntingtin interacting protein-2 (HIP2) of human. UbcD4 specifically interacts with the polyubiquitin-binding subunit of the proteasome. This family also includes a putative ubiquitin conjugating enzyme from Plasmodium yoelii (pyUCE). It shows a high level of sequence similarity with UBE2K and may also play a role in the ubiquitin-mediated protein degradation pathway. All family members are class II E2 conjugating enzymes which contain a C-terminal ubiquitin-associated (UBA) domain in addition to an N-terminal catalytic ubiquitin-conjugating enzyme E2 (UBCc) domain.	36
270499	cd14314	UBA_II_E2_pyUCE_like	UBA domain found in a putative ubiquitin conjugating enzyme from Plasmodium yoelii (pyUCE) and similar proteins. P. yoelii ubiquitin-conjugating enzyme and other uncharacterized family members show high sequence similarity to the human Huntingtin interacting protein-2 (HIP2) which belongs to a class II E2 ubiquitin-conjugating enzyme family. These proteins may play roles in the ubiquitin-mediated protein degradation pathway. They all contain a C-terminal ubiquitin-associated (UBA) domain in addition to an N-terminal catalytic ubiquitin-conjugating enzyme E2 (UBCc) domain.	37
270500	cd14315	UBA1_UBAP1	UBA1 domain found in vertebrate ubiquitin-associated protein 1 (UBAP-1). UBAP-1, also called nasopharyngeal carcinoma-associated gene 20 protein, is a ubiquitously expressed protein that may play an important role in the ubiquitin pathway and cell progression. It co-localizes with TDP-43 proteins in neuronal cytoplasmic inclusions and acts as a genetic risk factor for frontotemporal lobar degeneration (FTLD). Moreover, UBAP-1, together with VPS37A, forms an endosome-specific endosomal sorting complexes I required for transport (ESCRT-I) complex that displays a restricted cellular function, ubiquitin-dependent endosomal sorting and multivesicular body (MVB) biogenesis. UBAP-1 contains an N-terminal UBAP-1-MVB12-associated (UMA) domain, and two tandem ubiquitin-associated (UBA) domains that may be responsible for the binding of ubiquitin-conjugating enzymes. This model corresponds to UBA1 domain.	43
270501	cd14316	UBA2_UBAP1_like	UBA2 domain found in ubiquitin-associated protein 1 (UBAP-1) and similar proteins. UBAP-1, also called nasopharyngeal carcinoma-associated gene 20 protein, is a ubiquitously expressed protein that may play an important role in the ubiquitin pathway and cell progression. It co-localizes with TDP-43 proteins in neuronal cytoplasmic inclusions and acts as a genetic risk factor for frontotemporal lobar degeneration (FTLD). Moreover, UBAP-1, together with VPS37A, forms an endosome-specific endosomal sorting complexes I required for transport (ESCRT-I) complex that displays a restricted cellular function, ubiquitin-dependent endosomal sorting and multivesicular body (MVB) biogenesis. UBAP-1 contains an N-terminal UBAP-1-MVB12-associated (UMA) domain, and two tandem ubiquitin-associated (UBA) domains that may be responsible for the binding of ubiquitin-conjugating enzymes. This model corresponds to UBA2 domain.	37
270502	cd14317	UBA_DHX57	UBA domain found in putative ATP-dependent RNA helicase DHX57 and similar proteins. DHX57, also called DEAH box protein 57, is a multi-domain protein with an N-terminal ubiquitin-association (UBA) domain, a Zinc finger domain, a RWD domain, a DEAD-like helicase domain and two C-terminal helicase associated domains. Although the precise biological function of DHX57 remains unclear, it may function as a putative ATP-dependent RNA helicase.	38
270503	cd14318	UBA_Cbl_like	UBA domain found in casitas B-lineage lymphoma (Cbl) proteins. The Cbl adaptor proteins family contains a small class of RING-type E3 ubiquitin ligases with oncogenic activity which is represented by three mammalian members, c-Cbl, Cbl-b and Cbl-3. Cbl proteins function as potent negative regulators of various signaling cascades in a wide range of cell types. They play roles in ubiquitinating the activated tyrosine kinases and targeting them for degradation. Cbl proteins in this family consist of a highly conserved N-terminal half that includes a tyrosine-kinase-binding (TKB) domain and a RING finger domain, both of which are required for Cbl-mediated downregulation of RTKs, and a C-terminal half that includes a ubiquitin-associated (UBA) domain and other protein interaction motifs. The UBA domain contains leucine/isoleucine repeats and may play a role in dimerization of Cbl proteins. In addition, although both c-Cbl and Cbl-b have the C-terminal UBA domain, only the UBA domain from Cbl-b can bind ubiquitin.	40
270504	cd14319	UBA_NBR1	UBA domain of next to BRCA1 gene 1 protein (NBR1) and similar proteins. NBR1, also called cell migration-inducing gene 19 protein, membrane component chromosome 17 surface marker 2, neighbor of BRCA1 gene 1 protein, or protein 1A1-3B, is a scaffold protein that may be involved in signal transmission downstream of the serine/protein kinase from the giant muscle protein titin. Moreover, NBR1 functions as an autophagic receptor for ubiquitinated cargo. It interacts with ATG8-family proteins for its degradation by autophagy. NBR1 contains an N-terminal Phox and Bem1p (PB1) domain that plays a critical role in mediating protein-protein interactions with both titin kinase and with another scaffold protein, p62. NBR1 also has a LC3-interaction region (LIR) and a ubiquitin-associated (UBA) domain. The LIR is required for the autophagic clearance of NBR1. UBA domain is responsible for the ubiquitin binding which is necessary for the puromycin-induced formation of ubiquitinated protein aggregates.	39
270505	cd14320	UBA_SQSTM	UBA domain of sequestosome-1 (SQSTM) and similar proteins. SQSTM, also called EBI3-associated protein of 60 kDa (EBIAP /p60), phosphotyrosine-independent ligand for the Lck SH2 domain of 62 kDa, or ubiquitin-binding protein p62, is a widely expressed multifunctional cytoplasmic protein that is able to noncovalently bind ubiquitin and several signaling proteins, suggesting a regulatory role connected to the ubiquitin-proteasome pathway. It functions as a scaffolding protein that regulates a diverse range of signaling pathways leading to activation of the nuclear factor kappa B (NF-kappaB) family of transcription factors. It also plays a novel role in connecting receptor signals with the endosomal signaling network required for mediating TrkA-induced differentiation. SQSTM contains a PB1 dimerization domain, a tumor necrosis factor receptor-associated factor 6 (TRAF6) binding site, and a C-terminal ubiquitin-associated (UBA) domain that mediates the recognition of polyubiquitin chains and ubiquitylated substrates.	40
270506	cd14321	UBA_IAPs	UBA domain found in inhibitor of apoptosis proteins (IAPs). IAPs are frequently overexpressed in cancer and associated with tumor cell survival, chemoresistance, disease progression and poor prognosis. They function primarily as negative regulators of cell death. They regulate caspases and apoptosis through the inhibition of specific members of the caspase family of cysteine proteases. In addition, IAPs have been implicated in a multitude of other cellular processes, including inflammatory signalling and immunity, mitogenic kinase signalling, proliferation and mitosis, as well as cell invasion and metastasis. IAPs in this family include cellular inhibitor of apoptosis protein c-IAP1 and c-IAP2, XIAP, and BIRC8, all of which contain three N-terminal baculoviral IAP repeat (BIR) domains that enable interactions with proteins, a ubiquitin-association (UBA) domain that is responsible for binding polyubiquitin (polyUb), and a RING domain at the carboxyl terminus that is required for ubiquitin ligase activity. c-IAPs contain an additional caspase activation and recruitment domain (CARD) between UBA and RING domains. CARD domain may serve as a protein interaction surface.	44
270507	cd14322	UBA_LATS	UBA domain found in serine/threonine-protein kinase LATS and similar proteins. The LATS proteins family consists of two isoforms, LATS1 and LATS2, both of which are mammalian homologs of the Drosophila tumor suppressor gene lats/warts. LATS1, also called large tumor suppressor homolog 1, or WARTS protein kinase (warts), is a serine/threonine-protein kinase that is highly conserved from fly to human. LATS2, also called kinase phosphorylated during mitosis protein, or large tumor suppressor homolog 2, or serine/threonine-protein kinase KPM, or Warts-like kinase, inhibits the G1/S transition and is essential for embryonic development, proliferation control and genomic integrity. LATS proteins contain an N-terminal ubiquitin-associated (UBA) domain and a C-terminal protein kinase domain.	39
270508	cd14323	UBA_PLCs_like	UBA domain of eukaryotic protein linking integrin-associated protein with cytoskeleton (PLIC) proteins, Saccharomyces cerevisiae proteins Dsk2p and Gts1p, and similar proteins. The PLIC proteins (or ubiquilins) family contains human homologs of the yeast ubiquitin-like Dsk2 protein, PLIC-1 (also called ubiquilin-1), PLIC-2 (also called ubiquilin-2 or Chap1), PLIC-3 (also called ubiquilin-3) and PLIC-4 (also called ubiquilin-4, Ataxin-1 interacting ubiquitin-like protein, A1Up, Connexin43-interacting protein of 75 kDa, or CIP75), and mouse PLIC proteins. They are ubiquitin-binding adaptor proteins involved in all protein degradation pathways through delivering ubiquitinated substrates to proteasomes. They also promote autophagy-dependent cell survival during nutrient starvation. Saccharomyces cerevisiae Dsk2p is a nuclear-enriched protein that may be involved in the ubiquitin-proteasome proteolytic pathway through interacting with K48-linked polyubiquitin and the proteasome. Gts1p, also called protein LSR1, is encoded by a pleiotropic gene GTS1 in budding yeast. The formation of Gts1p-mediated protein aggregates may induce reactive oxygen species (ROS) production and apoptosis. Gts1p also plays an important role in the regulation of heat and other stress responses under glucose-limited or -depleted conditions in either batch or continuous culture.	39
270509	cd14324	UBA_Dsk2p_like	UBA domain of Saccharomyces cerevisiae proteasome interacting protein Dsk2p and its homologs found in fungi. The family contains several fungal multi-ubiquitin receptors, including Saccharomyces cerevisiae Dsk2p and Schizosaccharomyces pombe Dph1p, both of which have been characterized as shuttle proteins transporting ubiquitinated substrates destined for degradation from the E3 ligase to the 26S proteasome. They interact with the proteasome through their N-terminal ubiquitin-like domain (UBL) and with ubiquitin (Ub) through their C-terminal ubiquitin-associated domain (UBA). S. cerevisiae Dsk2p is a nuclear-enriched protein that may be involved in the ubiquitin-proteasome proteolytic pathway through interacting with K48-linked polyubiquitin and the proteasome. Moreover, it has been implicated in spindle pole duplication through assisting in Cdc31 assembly into the new spindle pole body (SPB). S. pombe Dph1p is a ubiquitin receptor working in concert with the class V myosin, Myo52, to target the degradation of the S. pombe CLIP-170 homolog, Tip1. It also can protect ubiquitin chains against disassembly by deubiquitinating enzymes.	42
270510	cd14325	UBA_RNF31	UBA domain found in E3 ubiquitin-protein ligase RING finger protein 31 and similar proteins. RNF31, also called HOIL-1-interacting protein (HOIP), or zinc in-between-RING-finger ubiquitin-associated domain protein, together with HOIL-1 and SHARPIN, forms the E3-ligase complex (also known as linear-ubiquitin-chain assembly complex LUBAC) that regulates NF-kappaB activity and apoptosis. RNF31 contains a central ubiquitin-associated (UBA) domain that is responsible for the interaction with the N-terminal ubiquitin-like domain (UBL) of HOIL-1L. In addition, RNF31 can interact with the atypical mammalian orphan receptor DAX-1, trigger DAX-1 ubiquitination and stabilization, and participate in repressing steroidogenic gene expression.	55
270511	cd14326	UBA_UBL7	UBA domain found in ubiquitin-like protein 7 (UBL7) and similar proteins. UBL7, also called bone marrow stromal cell ubiquitin-like protein (BMSC-UbP), or ubiquitin-like protein SB132, is a novel ubiquitin-like protein that may play roles in regulation of bone marrow stromal cell (BMSC) function or cell differentiation via an evocator-associated and cell-specific pattern. UBL7 contains an N-terminal ubiquitin domain (UBQ) and a C-terminal ubiquitin-associated (UBA) domain. UBQ domain interacts with 26S proteasome-dependent degradation, and UBA domain links cellular processes and the ubiquitin system.	38
270512	cd14327	UBA_atUPL1_2_like	UBA domain found in Arabidopsis thaliana E3 ubiquitin-protein ligase UPL1 (atUPL1), UPL2 (atUPL2) and similar proteins. The family includes two highly similar 405-kDa HECT E3 ubiquitin-protein ligases (UPLs), UPL1 and UPL2, from Arabidopsis thaliana. The HECT E3 UPL family plays a prominent role in the ubiquitination of plant proteins. The biological functions of UPL1 and UPL2 remain unclear. Both of them contain a ubiquitin-associated (UBA) domain and a C-terminal HECT domain. UBA domain may be involved in ubiquitin metabolism. HECT domain is necessary and sufficient for their E3 catalytic activity, but requires ATP, E1 and an E2 of the Arabidopsis UBC8 family to ubiquitinate proteins.	38
270513	cd14328	UBA_TNK1	UBA domain found in non-receptor tyrosine-protein kinase TNK1 and similar proteins. TNK1, also called CD38 negative kinase 1, is a non-receptor protein tyrosine kinase (NRPTK) that has been implicated in the regulation of apoptosis, cell growth, nuclear factor-kappaB, and Ras. It associates with phospholipase C (PLC)-gamma1 and may play a role in phospholipid signal transduction. TNK1 contains an NH2-terminal kinase, a Src Homology 3 (SH3) domain, a proline-rich (PR) region, and a C-terminal ubiquitin-association (UBA) domain.	40
270514	cd14329	UBA_SWA2p_like	UBA domain found in yeast auxilin-like clathrin uncoating factor SWA2 (Swa2p) and similar proteins. The lineage specific group includes Swa2p and other uncharacterized hypothetical proteins from Saccharomyces. Swa2p, also called bud site selection protein 24, DnaJ-related protein SWA2, or synthetic lethal with ARF1 protein 2, is the yeast auxilin ortholog that is a multifunctional protein with three N-terminal clathrin-binding (CB) motifs, a ubiquitin-association (UBA) domain, a tetratricopeptide repeat (TPR) domain, and a C-terminal J-domain. It is required for disassembly of clathrin-coated vesicles (CCVs) in an ATP-dependent manner, as well as for cortical endoplasmic reticulum (ER) inheritance.	36
270515	cd14330	UBA_atDRM2_like	UBA domain found in Arabidopsis thaliana DNA (cytosine-5)-methyltransferase DRM2 (atDRM2) and similar proteins. atDRM2, also called protein domains rearranged methylase 2, is a homolog of the mammalian de novo methyltransferase DNMT3. It is the major de novo methyltransferase targeted to DNA by small interfering RNAs (siRNAs) in the RNA-directed DNA methylation (RdDM) pathway in Arabidopsis thaliana. atDRM2 is a part of the RdDM effector complex and plays a catalytic role in RdDM. It contains N-terminal UBA domains and a C-terminal methyltransferase domain, both of which are required for normal RdDM.	37
270516	cd14331	UBA_HERC1_2	UBA domain found in probable E3 ubiquitin-protein ligase HERC1, HERC2 and similar proteins. HERC1, also called HECT domain and RCC1-like domain-containing protein 1, p532, or p619, is an ubiquitously expressed giant protein involved in ubiquitin-dependent intracellular membrane trafficking through its interaction with vesicle coat proteins such as clathrin and ARF. Moreover, it has been identified as a tuberous sclerosis complex TSC2-interacting protein that may play a role in TSC-mTOR (mammalian target of rapamycin) pathway. HERC2, also called HECT domain and RCC1-like domain-containing protein 2, is a SUMO-regulated E3 ubiquitin ligase that plays an important role in the SUMO-dependent pathway which orchestrates the DNA double-strand break (DSB) response. Moreover, HERC2 functions as a RNF8 auxiliary factor that regulates ubiquitin-dependent retention of repair proteins on damaged chromosomes. HERC1 and HERC2 are multi-domain proteins with different domain organizations. Both of them contain a ubiquitin-association (UBA) domain, more than one RCC1-like domains (RLDs) and a C-terminal HECT E3 ubiquitin ligase domain.	40
270517	cd14332	UBA_RuvA_C	C-terminal UBA-like domain of Holliday junction ATP-dependent DNA helicase RuvA. RuvA, along with RuvB and RuvC proteins, is involved in branch migration of heteroduplex DNA in homologous recombination that is a crucial process for maintaining genomic integrity and generating biological diversity in all living organisms. RuvA has a tetrameric architecture in which each subunit is comprised of three distinct domains. This model corresponds to the C-terminal domain of RuvA which is distantly related to the ubiquitin-associated (UBA) domain. It plays a significant role in the ATP-dependent branch migration of the hetero-duplex through direct contact with RuvB. Within the Holliday junction, the C-terminal domain makes no interaction with DNA.	45
270518	cd14333	UBA_unchar_Eumetazoa	UBA domain found in some hypothetical proteins from Eumetazoa. The family includes some uncharacterized Eumetazoan proteins. Although their biological functions remain unclear, they all contain a very conserved ubiquitin-associated (UBA) domain which is a commonly occurring sequence motif found in proteins involved in ubiquitin-mediated proteolysis.	38
270519	cd14334	UBA_SNF1_fungi	UBA domain of yeast carbon catabolite-derepressing protein kinases (Snf1) and similar proteins found in fungi. Snf1, also called yeast adenosine monophosphate (AMP)-activated protein kinase (AMPK), is a global regulator of carbon metabolism in the yeast Saccharomyces cerevisiae. Its phosphorylation is essential for the regulation by carbon catabolite repression in eukaryotic cells. Snf1 is involved in the cellular responses to nutrient stress, as well as other environmental stresses, including sodium ion stress, heat shock, alkaline pH, oxidative stress, and genotoxic stress. It plays roles in various nutrient-responsive, cellular developmental processes, including meiosis and sporulation, aging, haploid invasive growth, and diploid pseudohyphal growth. It is required for transcription of glucose-repressed genes, glycogen storage, thermotolerance, and peroxisome biogenesis. The catalytic activity of Snf1 can be regulated by upstream kinases, Sak1, Elm1, and Tos3, by the Reg1-Glc7 protein phosphatase 1, and by autoinhibition. In addition to an N-terminal protein kinase domain and a C-terminal regulatory domain of 5'-AMP-activated protein kinase (AMPK), Snf1 contains an ubiquitin-associated (UBA) domain, previously called SNF1 homology (SNH) domain, in the middle region.	48
270520	cd14335	UBA_SnRK1_plant	UBA domain found in the plant sucrose nonfermenting-1-related kinase (SnRK1) proteins. The plant SnRK1 proteins (also known as AKIN10/11) family contains plant orthologs of the yeast sucrose non-fermenting (Snf1) kinase and mammalian AMP-activated protein kinase (AMPK), including two catalytic alpha-subunits of plant Snf1-related kinases (SnRKs): SNF1-related protein kinase catalytic subunit alpha KIN10 (also called AKIN10 or AKIN alpha2) and SNF1-related protein kinase catalytic subunit alpha KIN11 (also called AKIN11 or AKIN alpha1). AKIN10 and AKIN11 function as central integrators of sugar, metabolic, stress, and developmental signals in plants. They form different complexes with the regulatory AKINbeta2, a plant ortholog of conserved Snf1/AMPK beta-subunits. In addition to an N-terminal protein kinase domain and a C-terminal regulatory domain of 5'-AMP-activated protein kinase (AMPK), Snf1 contains an ubiquitin-associated (UBA) domain, previously called SNF1 homology (SNH) domain, in the middle region.	41
270521	cd14336	UBA_AID_AMPKalpha	UBA-like autoinhibitory domain (AID) found in vertebrate 5'-AMP-activated protein kinase catalytic alpha (AMPKalpha) subunits. The family corresponds to the catalytic subunits of adenosine monophosphate (AMP)-activated protein kinase (AMPK) which includes two isoforms encoded by two distinct genes, AMPKalpha-1 (PRKAA1) and AMPKalpha-2 (PRKAA2). Skeletal muscle predominantly expresses the AMPKalpha-2, whereas the liver expresses approximately equal amounts of both AMPKalpha subunits. One AMPKalpha subunit and two regulatory subunits, beta (beta1, beta2, beta3) and gamma (gamma1, gamma2, gamma3) form a heterotrimeric AMPK complex that plays a central role in the regulation of cellular energy metabolism, activates energy-producing pathways and inhibits energy-consuming processes through responding to a fall in intracellular ATP levels. It is activated in beta-cells at low glucose concentrations, but inhibited as glucose levels increase. AMPKalpha subunits show significant similarity in the catalytic core region, but have divergent COOH-terminal tails, suggesting they may interact with different proteins within this region. Both of AMPKalpha subunits have an N-terminal Ser/Thr kinase domain followed by an ubiquitin-associated (UBA)-like AID, and a C-terminal AMPK regulatory domain. The Ser/Thr kinase domain contains a conserved Thr residue that must be phosphorylated for activity in the activation loop. The AID is responsible for AMPKalpha subunits autoinhibition. The C-terminal regulatory domain of the alpha-subunit is essential for binding the beta- and gamma-subunits.	65
270522	cd14337	UBA_MARK_Par1	UBA domain found in microtubule-associated protein (MAP)/microtubule affinity-regulating kinase (MARK)/partitioning-defective 1 (Par-1) and similar proteins. The MARK/Par-1 subfamily contains serine/threonine-protein kinases including mammalian MARKs, and polarity kinases Par-1 found in Caenorhabditis elegans and Drosophila melanogaster. Those proteins are frequently found associated with membrane structures and participate in diverse processes from control of the cell cycle and polarity to intracellular signaling and microtubule stability. They are involved in nematode embryogenesis, cell cycle control, epithelial cell polarization, cell signaling, and neuronal migration and differentiation. Mammalian MARKs have been implicated in carcinomas, Alzheimer's disease (through tau hyperphosphorylation), and autism. Four MARK isoforms exist in humans. Members in this subfamily contain an N-terminal protein kinase catalytic domain, followed by an ubiquitin-associated (UBA) domain and a C-terminal regulatory domain of 5'-AMP-activated protein kinase (AMPK).	40
270523	cd14338	UBA_SIK	UBA domain found in salt-inducible kinase SIK1, SIK2, SIK3 and similar proteins. Salt-inducible kinases SIK1, SIK2, and SIK3 are serine/threonine kinases that belong to the AMP-activated protein kinases (AMPK) family involved in the regulation of metabolism during energy stress. SIK1, also called serine/threonine-protein kinase SNF1-like kinase 1 (SNF1LK), is required for myogenic differentiation. It is degraded by the proteasome in myoblasts which is regulated by cAMP signaling. Moreover, SIK1 acts as a class II histone deacetylase (HDAC) kinase, triggering the cytoplasmic export of the HDACs and activation of myocyte enhancer factor 2 (MEF2)-dependent transcription. It also regulates transcription through inhibitory phosphorylation of a family of cAMP responsive element binding protein (CREB) coactivators, called TORCs/CRTCs. In addition, SIK1 links LKB1 to p53-dependent anoikis and suppresses metastasis. It is also involved in a cell sodium-sensing network that regulates active sodium transport through a calcium-dependent process. SIK2, also called Qin-induced kinase or serine/threonine-protein kinase SNF1-like kinase 2 (SNF1LK2), plays an important role in the insulin-signaling pathway during adipocyte differentiation, as well as in autophagy progression. Moreover, SIK2 plays a critical role in neuronal survival and modulates cAMP responsive element binding protein (CREB)-mediated gene expression in response to hormones and nutrients. SIK2 acts as a critical determinant in autophagy progression. In addition, SIK2 localizes at the centrosome and functions as a centrosome kinase required for bipolar mitotic spindle formation. It is involved in the initiation of mitosis, and regulates the localization of the centrosome linker protein, C-Nap1, through S2392 phosphorylation. SIK3, also called salt-inducible kinase 3 or serine/threonine-protein kinase QSK, acts as a novel energy regulator that modulates cholesterol and bile acid metabolism by coupling with retinoid metabolism. It also plays an essential role in facilitating chondrocyte hypertrophy during skeletogenesis and growth plate maintenance. Members in this family contain an N-terminal protein kinase catalytic domain followed by an ubiquitin-associated (UBA) domain.	45
270524	cd14339	UBA_SNRK	UBA domain of SNF-related serine/threonine-protein kinase (SNRK) and similar proteins mainly found in metazoa. SNRK, also called Sucrose nonfermenting 1 (Snf1)-related kinase, is a serine/threonine kinase highly expressed in the testis. It is a distant member of the largely adenosine monophosphate (AMP)-activated protein kinase (AMPK) family. SNRK can be phosphorylated and activated by LKB1 and may mediate cellular effects regulated by LKB1. It is also involved in the regulation of colon cancer cell proliferation and beta-catenin signaling. It inhibits colon cancer cell proliferation through calcyclin-binding protein (CacyBP)-dependent reduction of beta-catenin. In addition to an N-terminal protein kinase domain, it harbors an ubiquitin-associated (UBA) domain, previously called SNF1 homology (SNH) domain which is conserved in other Snf1-related kinases, but not in any other protein kinase.	48
270525	cd14340	UBA_BRSK	UBA domain found in serine/threonine-protein kinase BRSK1, BRSK2 and similar proteins. The family includes brain-specific kinases BRSK1 and BRSK2. They are AMP-activated protein kinase (AMPK)-related kinases that are highly expressed in mammalian forebrain and crucial for establishing neuronal polarity. BRSK1, also called brain-selective kinase 1, brain-specific serine/threonine-protein kinase 1, BR serine/threonine-protein kinase 1, serine/threonine-protein kinase SAD-B, or synapses of Amphids Defective homolog 1 (SAD1 homolog), is associated with synaptic vesicles and is tightly associated with the presynaptic cytomatrix in nerve terminals. It can regulate neurotransmitter release presynaptically. BRSK2, also called brain-selective kinase 2, brain-specific serine/threonine-protein kinase 2, BR serine/threonine-protein kinase 2, serine/threonine-protein kinase 29, or serine/threonine-protein kinase SAD-A, is an AMP-activated protein kinase (AMPK)-related kinase exclusively expressed in brain and pancreas. It plays an essential role in neuronal polarization. It interacts with CDK-related protein kinase PCTAIRE1, a kinase involved in neurite outgrowth and neurotransmitter release, and further negatively regulates glucose-stimulated insulin secretion (GSIS) in pancreatic beta-cells through activation of p21-activated kinase-1 (PAK1). BRSK2 also regulates cell-cycle progression controlled by APC/C(Cdh1) through the ubiquitin-proteasome pathway. Moreover, BRSK2 is regulated by endoplasmic reticulum (ER) stress at the protein level and is involved in ER stress-induced apoptosis. Both BRSK1 and BRSK2 contain an N-terminal protein kinase catalytic domain followed by an ubiquitin-associated (UBA) domain.	54
270526	cd14341	UBA_MELK	UBA domain found in maternal embryonic leucine zipper kinase (MELK) and similar proteins. MELK, also called protein kinase Eg3 (pEg3 kinase), protein kinase PK38 (PK38), or tyrosine-protein kinase MELK, is a cell cycle dependent protein kinase involved in diverse cell processes including stem cell renewal, cell cycle progression, cell proliferation, apoptosis and mRNA processing. It is expressed in normal tissues and especially in cancer cells. It is upregulated in cancer tissues and thus may act as potential anticancer target in diverse tumor entities. MELK comprises an N-terminal protein kinase catalytic domain, followed by an ubiquitin-associated (UBA) domain, and a C-terminal autoinhibitory domain of 5'-AMP-activated protein kinase (AMPK).	52
270527	cd14342	UBA_TAP-C	UBA-like domain found in nuclear RNA export factor NXF1, NXF2 and similar proteins. The NXF family of mRNA nuclear export factors includes vertebrate NXF1 (also called tip-associated protein or mRNA export factor TAP), NXF2 (also called cancer/testis antigen CT39 or TAP-like protein TAPL-2), Caenorhabditis elegans NXF1 (ceNXF1), Saccharomyces cerevisiae mRNA nuclear export factor Mex67p and similar proteins. NXF proteins can stimulate nuclear export of mRNAs and facilitate the export of unspliced viral mRNA containing the constitutive transport element. It is a multi-domain protein with a nuclear localization sequence (NLS), a non-canonical mRNA-binding domain, and four leucine-rich repeats (LRR) at the N-terminal region. Its C-terminal part contains a NTF2-like domain and a ubiquitin-associated (UBA)-like domain, joined by a flexible Pro-rich linker. Caenorhabditis elegans NXF1 is essential for the nuclear export of poly(A)+mRNA. In budding yeast, Mex67p binds mRNAs through its adaptor Yra1/REF. It also interacts directly with Nab2, an essential shuttling mRNA-binding protein required for export. Moreover, Mex67p associates with both nuclear pore protein (nucleoporin) FG repeats and Hpr1, a component of the TREX/THO complex linking transcription and export.	51
270528	cd14343	UBA_F100B_like	UBA-like domain found in protein FAM100B. The family corresponds to the uncharacterized protein FAM100B and its homologs mainly found in Metazoa. Although their biological roles remain unclear, all family members contain a ubiquitin-associated (UBA)-like domain that may be involved in the binding of ubiquitin.	39
270529	cd14344	UBA_TYDP2	UBA-like domain found in tyrosyl-DNA phosphodiesterase 2 (TDP2) and similar proteins. TDP2, also called ETS1-associated protein II (EAPII) or TRAF and TNF receptor-associated protein (Ttrap), is a 5'-Tyr-DNA phosphodiesterase, a member of the Mg(2+)/Mn(2+)-dependent family of phosphodiesterases which contains an N-terminal ubiquitin-associated (UBA)-like domain and a C-terminal phosphodiesterase domain. TDP2 is required for the efficient repair of topoisomerase II-induced DNA double strand breaks. The topoisomerase is covalently linked by a phosphotyrosyl bond to the 5'-terminus of the break. TDP2 cleaves the DNA 5'-phosphodiester bond and restores 5'-phosphate termini needed for subsequent DNA ligation and hence repair of the break. Tyrosyl-DNA phosphodiesterase 1 (TDP1), an enzyme that cleaves 3'-phosphotyrosyl bonds, and TDP2 are complementary activities; together, they allow cells to remove trapped topoisomerase from both 3'- and 5'-DNA termini. TDP2 has been reported as being involved in apoptosis, embryonic development, and transcriptional regulation. It can associate with CD40, tumor necrosis factor receptor-75 (TNF-R75) and TNF receptor-associated factors (TRAFs) and may inhibit the activation of nuclear factor-kappa B (NF-kappaB).	37
270530	cd14345	UBA_UBXD7	UBA-like domain found in UBX domain-containing protein 7 (UBXD7) and similar proteins. UBXD7, also known as UBXN7, functions as a ubiquitin-binding adaptor that mediates the interaction between the AAA+ ATPase p97 (also known as VCP or Cdc48) and the transcription factor HIF1alpha. It binds only to the active, NEDD8- or Rub1-modified form of cullins. UBXD7 contains the ubiquitin-associated (UBA), ubiquitin-associating (UAS), ubiquitin regulatory X (UBX), and ubiquitin-interacting motif (UIM) domains. Either UBA or UIM could serve as a docking site for neddylated-cullins. Moreover, UBA-like domain is required for binding ubiquitylated-protein substrates, UIM motif is responsible for the binding to cullin RING ligases (CRLs), and UBX domain is essential for p97 binding.	37
270531	cd14346	UBA_Ubx5_like	UBA-like domain found in Saccharomyces cerevisiae UBX domain-containing protein 5 (Ubx5) and similar proteins. Ubx5 is a ubiquitin regulatory X (UBX) domain-containing protein encoded by the open reading frame (ORF) YDR330W in yeast. As the yeast ortholog of mammalian UBXD7, Ubx5 functions as the cofactor of AAA+ ATPase p97, also known as VCP or Cdc48. It binds only to the active, NEDD8- or Rub1-modified form of cullins. Ubx5 contains the ubiquitin-associated (UBA), ubiquitin-associating (UAS), ubiquitin regulatory X (UBX) and ubiquitin-interacting motif (UIM) domains and its UIM domain is required to promote UV-dependent degradation of polyubiquitinated Rpb1.	39
270532	cd14347	UBA_Cezanne_like	UBA-like domain found in OTU domain-containing proteins OTU7A, OTU7B and similar proteins. OTU7A, also called zinc finger protein Cezanne 2, belongs to a family of proteins that have been characterized as highly specific ubiquitin iso-peptidases removing ubiquitin from proteins. OTU7B, also called cellular zinc finger anti-NF-kappaB protein, zinc finger A20 domain-containing protein 1, or zinc finger protein Cezanne, is a novel deubiquitinating enzyme that acts as a negative regulator of NF-kappaB and may play a role in the control of the inflammatory process. Both OTU7A and OTU7B contain an N-terminal ubiquitin-associated (UBA)-like domain, followed by an ovarian tumor (OTU) domain and a ubiquitin binding domain, A20-like zinc finger. In addition, they both display proteolytic activity.	43
270533	cd14348	UBA_p47	UBA-like domain found in NSFL1 cofactor p47 and similar proteins. p47, also called UBX domain-containing protein 2C, is a major cofactor of the cytosolic AAA ATPase p97. It is required for the p97-regulated membrane reassembly of the endoplasmic reticulum (ER), the nuclear envelope and the Golgi apparatus. p47, together with p97, forms the p97-p47 complex that plays an important role in regulation of membrane fusion events. p47 contains an N-terminal ubiquitin-associated (UBA)-like domain, a central SEP (named after shp1, eyc and p47) domain, and a ubiquitin-like (UBX) domain. The UBA-like domain is responsible for forming a highly stable complex with ubiquitin. The SEP domain and UBX domain may be involved in p47 trimerization or in forming a stable complex with the p97 N-terminal domain.	40
270534	cd14349	UBA_CF106	UBA-like domain found in uncharacterized protein C6orf106 and similar proteins. The family corresponds to a group of uncharacterized protein C6orf106 and its homologs mainly found in Metazoa. All family members contain a ubiquitin-associated (UBA)-like domain.	41
270535	cd14350	UBA_DCNL	UBA-like domain found in DCN1-like protein DCNL1, DCNL2 and similar proteins. DCNL1 (defective in cullin neddylation protein 1-like protein 1), also called DCUN1 domain-containing protein 1, is encoded by squamous cell carcinoma-related oncogene SCCRO (DCUN1D1). It interacts with known cullin isoforms as well as ROC1, Ubc12 and CAND1, the components of the neddylation pathway. It plays an essential role in the neddylation E3 complex and participates in the release of inhibitory effects of CAND1 on cullin-RING ligase E3 complex assembly and activity. DCNL1 contains an N-terminal ubiquitin-associated (UBA)-like domain and a C-terminal cullin binding domain that binds to cullins and Rbx-1, components of an E3 ubiquitin ligase complex for neddylation. DCNL2 (defective in cullin neddylation protein 1-like protein 2), also called DCUN1 domain-containing protein 2, is encoded by gene DCUN1D2. Although its biological function remains unclear, DCNL2 shows high sequence similarity with DCNL1 and may also contribute to neddylation of cullin components of SCF-type E3 ubiquitin ligase complexes. Like DCNL1, DCNL2 contains an N-terminal UBA-like domain and a C-terminal cullin binding domain.	42
270536	cd14351	UBA_Ubx1_like	UBA-like domain found in yeast UBX domain-containing protein 1 (Ubx1) and similar proteins. Ubx1, also called suppressor of high-copy PP1 protein (Shp1), is the substrate-recruiting cofactor of AAA-adenosine triphosphatase Cdc48 in Saccharomyces cerevisiae. In concert with ubiquitin-like Atg8, Cdc48 and Ubx1 are involved in the regulation of autophagosome biogenesis. Ubx1 also functions as a regulator of phosphoprotein phosphatase 1 (PP1) with differential effects on glycogen metabolism, meiotic differentiation, and mitotic cell cycle progression. All family members contain an N-terminal ubiquitin-associated (UBA)-like domain.	37
270537	cd14352	UBA_DCN1	UBA-like domain found in yeast defective in cullin neddylation protein 1 (DCN1) and similar proteins. DCN1 is a scaffold-type E3 ligase for cullin neddylation. It can bind directly to cullins and the ubiquitin-like protein Nedd8-specific E2 (Ubc12), and regulate cullin neddylation and thus display ubiquitin ligase activity. It contains an N-terminal ubiquitin-associated (UBA)-like domain and a unique C-terminal PONY domain that is essential for the neddylation function of DCN1.	36
270538	cd14353	UBA_FAF	UBA-like domain found in FAS-associated factor FAF1, FAF2 and similar proteins. FAF1, also called UBX domain-containing protein 12 or UBX domain-containing protein 3A, is an apoptotic signaling molecule that acts downstream in the Fas signal transduction pathway. It interacts with the cytoplasmic domain of Fas, but not with a Fas mutant that is deficient in signal transduction. FAF1 is widely expressed in adult and embryonic tissues, and in tumor cell lines, and is localized not only in the cytoplasm where it interacts with Fas, but also in the nucleus. FAF1 contains phosphorylation sites for protein kinase CK2 within the nuclear targeting domain. Phosphorylation influences nuclear localization of FAF1 but does not affect its potentiation of Fas-induced apoptosis. Other functions have also been attributed to FAF1. It inhibits nuclear factor-kappaB (NF-kappaB) by interfering with the nuclear translocation of the p65 subunit. Although the precise role of FAF1 in the ubiquitination pathway remains unclear, FAF1 interacts with valosin-containing protein (VCP) which is involved in the ubiquitin-proteasome pathway. FAF2, also called protein ETEA, UBX domain-containing protein 3B, or UBX domain-containing protein 8, is the translation product of a gene that is more highly expressed in the T-cells and eosinophils of atopic dermatitis patients than in those of normal individuals. FAF2 shows homology to Fas-associated factor 1 (FAF1). Both of them contain an N-terminal ubiquitin-associated (UBA)-like domain, UAS and ubiquitin-like (UBX) domains. Compared to FAF1, however, FAF2 lacks the nuclear targeting domain. The function of FAF2 remains unclear. A yeast two-hybrid assay showed that it can interact with Fas. Because of its homology to FAF1, it is postulated that FAF2 could be involved in modulating Fas-mediated apoptosis of T-cells and eosinophils of atopic dermatitis patients, making them more resistant to apoptosis.	32
270539	cd14354	UBA_UBP25	UBA domain found in ubiquitin carboxyl-terminal hydrolase 25 (UBP25) and similar proteins. UBP25, also called deubiquitinating enzyme 25, USP on chromosome 21, ubiquitin thioesterase 25, or ubiquitin-specific-processing protease 25, belongs to the deubiquitinating enzyme (DUB) family that specifically hydrolyzes ubiquitin chains on ubiquitin-conjugated proteins. USP25 has one muscular isoform and two ubiquitous isoforms. The longer muscular isoform can bind to muscle-restricted cytoskeletal and sarcomeric proteins, such as myosin binding protein C1 (MyBPC1), actin alpha-1 (ACTA1) and filamin C (FLNC), and further prevent their degradation. USP25 harbors three potential ubiquitin-binding domains (UBDs), one ubiquitin-associated (UBA) domain and two ubiquitin-interacting motifs (UIMs) in the N-terminal region. Its C-terminal tyrosine-rich region is responsible for the binding of the second SH2 domain of SYK, a non-receptor tyrosine kinase that specifically phosphorylates USP25 and alters its cellular levels.	46
270540	cd14355	UBA_UBP28	UBA domain found in ubiquitin carboxyl-terminal hydrolase 28 (UBP28) and similar proteins. UBP28, also called deubiquitinating enzyme 28, ubiquitin thioesterase 28, or ubiquitin-specific-processing protease 28, is an ubiquitin-specific protease that belongs to the deubiquitinating enzyme (DUB) family which specifically hydrolyzes ubiquitin chains on ubiquitin-conjugated proteins. UBP28 can form a ternary complex with nucleoplasmic Fbw7alpha, an F-box protein that is part of an SCF-type ubiquitin ligase, and MYC, a transcription factor encoded by MYC proto-oncogene. UBP28 is required for the stability of MYC, and this stabilization is necessary for tumour-cell proliferation. Besides, UBP28 plays a critical role in the regulation of the Chk2-p53-PUMA pathway. It specifically interacts with 53BP1 and is essential to stabilize Chk2 and 53BP1 in response to DNA damage.	42
270541	cd14358	UBA_NAC_euk	UBA-like domain found in nascent polypeptide-associated complex subunit alpha (NACA) and its homologs mainly found in eukaryotes. The subfamily contains nascent polypeptide-associated complex subunit alpha (NACA), putative NACA-like protein (NACP1), nascent polypeptide-associated complex subunit alpha domain-containing protein 1 (NACAD), and similar proteins. NACA, also called NAC-alpha or Alpha-NAC, together with BTF3, also called Beta-NAC, forms the nascent polypeptide-associated complex (NAC), a cytosolic protein chaperone that contacts the nascent polypeptide chains as they emerge from the ribosome. Besides, NACA has a high affinity for nucleic acids and exists as part of several protein complexes playing a role in proliferation, apoptosis, or degradation. It is a cytokine-modulated specific transcript in the human TF-1 erythroleukemic cell line. It also acts as a transcriptional co-activator in osteoblasts by binding to phosphorylated c-Jun, a member of the activator-protein-1 (AP-1) family. Moreover, NACA binds to and regulates the adaptor protein Fas-associated death domain (FADD). In addition, NACA functions as a novel factor participating in the positive regulation of human erythroid-cell differentiation. The biological functions of NACP1 (also called Alpha-NAC pseudogene 1 or NAC-alpha pseudogene 1) and NACAD remain unclear. All family members contain an NAC domain and a C-terminal ubiquitin-associated (UBA) domain.	37
270542	cd14359	UBA_AeNAC	UBA-like domain found in archaeal nascent polypeptide-associated complex homolog from Methanothermobacter marburgensis (AeNAC) and similar proteins. AeNAC is a functional archaeal homolog of eukaryotic nascent polypeptide-associated complex (NAC). Both AeNAC and eukaryotic NAC function as the cytosolic chaperone that can bind to ribosomal RNA, interact with the nascent polypeptide chains as they emerge from the ribosome, and assist in post-translational processes. They all contain a NAC domain and an ubiquitin-associated (UBA) domain in the C-terminus. However, unlike eukaryotic NAC, AeNAC forms a ribosome associated homodimer, but not heterodimer. The NAC domain of AeNAC is responsible for the homodimer formation.	40
270543	cd14360	UBA_NAC_like_bac	UBA-like domain found in uncharacterized bacteria proteins similar to eukaryotic nascent polypeptide-associated complex proteins (NAC). This subfamily contains a group of uncharacterized proteins found in bacteria. They all contain an N-terminal ubiquitin-associated (UBA) that shows high sequence similarity with that of eukaryotic nascent polypeptide-associated complex proteins (NAC) which is one of the cytosolic chaperones that contact the nascent polypeptide chains as they emerge from the ribosome and assist in post-translational processes.	38
270544	cd14361	UBA_HYPK	UBA-like domain found in Huntingtin-interacting protein K (HYPK) and similar proteins. HYPK, also called Huntingtin yeast partner K or Huntingtin yeast two-hybrid protein K, is an intrinsically unstructured Huntingtin (HTT)-interacting protein with chaperone-like activity. It is involved in regulating cell growth, cell cycle, unfolded protein response, and cell death. All members in this subfamily contain an N-terminal ubiquitin-associated (UBA) that shows high sequence similarity with that of eukaryotic nascent polypeptide-associated complex proteins (NAC) which is one of the cytosolic chaperones that contact the nascent polypeptide chains as they emerge from the ribosome and assist in post-translational processes.	41
270545	cd14362	CUE_TAB2_TAB3	CUE domain found in the N-terminal of TGF-beta-activated kinase 1 and MAP3K7-binding proteins TAB2, TAB3 and similar proteins. TAB2, also called mitogen-activated protein kinase kinase kinase 7-interacting protein 2, TAK1-binding protein 2, or TGF-beta-activated kinase 1-binding protein 2, is an adaptor protein that regulates activation of TAK1, a MAP kinase kinase kinase (MAPKKK), through linking TAK1 to TRAF6 in the Interleukin-1 (IL-1) induced NF-kappaB activation pathway. TAB3, also called mitogen-activated protein kinase kinase kinase 7-interacting protein 3, NF-kappa-B-activating protein 1, TAK1-binding protein 3, or TGF-beta-activated kinase 1-binding protein 3, is a TAB2-like TAK1-binding protein that activates NF-kappaB similar to TAB2. It activates TAK1 and regulates its association with TRAF2 and TRAF6. Moreover, TAB3 interacts with TRAF6 and TRAF2 in an IL-1- and a TNF-dependent manner, respectively. In summary, TAB2 and TAB3 function redundantly as mediators of TAK1 activation in IL-1 and TNF signal transduction. Both of them contain an N-terminal CUE domain, a coiled-coil (CC) region, a TAK1-binding domain and a C-terminal Npl4 zinc finger (NZF) ubiquitin-binding domain (UBD).	42
270546	cd14363	CUE_TOLIP	CUE domain found in the C-terminal of toll-interacting protein (Tollip) and similar proteins. Tollip is a new component of the IL-1RI pathway which contains an N-terminal C2 domain and a C-terminal CUE domain. Tollip binds to the cytoplasmic TIR domain of IL-1Rs after IL-1 stimulation. It is sufficient for recruitment of IRAK to IL-1Rs and negatively regulates IL-1-induced signaling by inhibiting IRAK phosphorylation. In addition, Tollip directly interacts with toll-like receptors TLR2 and TLR4, and plays an inhibitory role in TLR-mediated cell activation through suppressing phosphorylation and kinase activity of IRAK. Moreover, Tollip can associate with GAT domains of Tom1 and its related proteins Tom1L1 and Tom1L2, and facilitate the recruitment of clathrin onto endosomes.	41
270547	cd14364	CUE_ASCC2	CUE domain found in activating signal cointegrator 1 complex subunit 2 (ASCC2) and similar proteins. ASCC2, also called ASC-1 complex subunit p100 or Trip4 complex subunit p100, together with ASCC1 (also called p50) and ASCC3 (also called p300), form the activating signal cointegrator complex (ASCC). ASCC plays an essential role in activating protein 1 (AP-1), serum response factor (SRF), and nuclear factor kappaB (NF-kappaB) transactivation. It acts as a transcriptional coactivator of nuclear receptors and regulates the transrepression between nuclear receptors and either AP-1 or NF-kappaB in vivo. Members in this family all contain a CUE domain.	40
270548	cd14365	CUE_N4BP2	CUE domain found in NEDD4-binding protein 2 (N4BP2) and similar proteins. N4BP2 has been identified as an oncogene bcl-3 coding protein BCL-3-binding protein (B3BP) that participates in connecting transcriptional activation and genetic recombination of the Ig gene. In addition to BCL-3, it also interacts with p300/CBP histone acetyltransferases. N4BP2 shows intrinsic ATP binding and hydrolyzing activity. It contains an N-terminal ATP-binding region that is responsible for the interaction with BCL-3 and p300/CBP. N4BP2 also functions as a 5'-polynucleotide kinase that can transfer a phosphate group to the 5' end of DNA and RNA substrates. Moreover, N4BP2 contains a C-terminal MutS-related domain that possesses nicking endonuclease activity and may play a role in DNA mismatch repair (MMR). This model corresponds to CUE domain in the N-terminus of N4BP2.	42
270549	cd14366	CUE_CUED1	CUE domain found in CUE domain-containing protein 1 (CUED1) and similar proteins. The subfamily includes a group of uncharacterized CUE domain-containing protein termed CUED1. Their biological function remains unknown.	42
270550	cd14367	CUE_CUED2	CUE domain found in CUE domain-containing protein 2 (CUED2) and similar proteins. CUEDC2 is a novel negative regulator of progesterone receptor (PR) and functions to promote the progesterone-induced PR degradation by the ubiquitin-proteasome pathway. It also acts as the regulator of JAK1/STAT3 signaling through inhibiting cytokine-induced phosphorylation of JAK1 and STAT3 and the subsequent STAT3 transcriptional activity. All members in this subfamily contain a CUE domain.	42
270551	cd14368	CUE_DEF1_like	CUE domain found in fungal RNA polymerase II degradation factor 1 (DEF1) and similar proteins. DEF1, also called RRM3-interacting protein 1, is a RNA Polymerase II (RNAPII) degradation factor that may be required to couple arrested RNAPII to the proteasome to facilitate its degradation. It contains a CUE domain that is responsible for the binding of ubiquitin. The family also includes many uncharacterized hypothetical proteins. They show a high level of sequence similarity with DEF1.	41
270552	cd14369	CUE_VPS9_like	CUE domain found in vacuolar protein sorting-associated protein 9 (VPS9) and similar proteins. VPS9, also called vacuolar protein-targeting protein 9, is a cytosolic yeast protein required for localization of vacuolar proteins, such as the soluble vacuolar hydrolases CPY and PrA. It may bind and act as an effector of a rab GTPase and plays a role in vacuolar protein sorting (VPS) pathway. VPS9 contains a region called GBH domain that is related to mammalian Ras-binding proteins, Rin1 and JC265, and may negatively regulate Ras-mediated signaling in yeast Saccharomyces cerevisiae. This model corresponds to the N-terminal CUE domain that interacts specifically with monoubiquitin and regulates intramolecular monoubiquitylation.	42
270553	cd14370	CUE_DMA	CUE-like DMA domain found in the putative transcription factors DMRTA1, DMRTA2 and DMRTA3, which are encoded by the DM domain gene family. The DM domain proteins are related to the sexual regulators doublesex from Drosophila melanogaster and MAB-3 from Caenorhabditis elegans. Thus, they have been named doublesex- and mab-3-related transcription factors and may be involved in sexual development or in somite development. All DM domain proteins contain a DM domain which is an unusual zinc finger motif. In addition to an N-terminal DM domain, members in this family, including DMRTA1, DMRTA2 and DMRTA3, also harbor an additional CUE-like DMA domain. DMRTA1 is encoded by gene DMRT1, a vertebrate equivalent of the D. melanogaster master sex regulator gene, doublesex. In D. melanogaster, doublesex controls the terminal switch of the pathway leading to sex fate choice. DMRT1 may function as a regulator of sex differentiation in vertebrates. In particular, it is required for testis differentiation, but is not involved in the gonadal sex fate choice. DMRTA2, also called Doublesex- and mab-3-related transcription factor 5 (DMRT5), is encoded by gene DMRT2. In the zebrafish, DMRT2 is involved in somite development. DMRTA2 may act as an activator of cyclin-dependent kinase inhibitor 2C (cdkn2c) during spermatogenesis. It may also play significant roles in embryonic neurogenesis. DMRTA3 is encoded by tumor suppressor gene DMRT3 which serves as a novel potential target for homozygous deletion in squamous cell carcinoma of the lung.	40
270554	cd14371	CUE_CID7_like	CUE domain found in CTC-interacting domain proteins CID5, CID6, CID7 and similar proteins. CID7 is encoded by the ubiquitously expressed gene CID7. It contains an N-terminal PABC-interacting domain (PAM2 or PABP-interacting motif 2) which is also found in the human Paip1 and Paip2. Through this motif, it functions as an interaction partner of the PABC domain of Arabidopsis thaliana Poly(A)-binding proteins. It also harbors an ubiquitin-associated (UBA)-like CUE domain and a C-terminal small MutS-related (SMR) domain. CID5 and CID6 are encoded by the genes CID5 and CID6, respectively. CID5 is only expressed in immature siliques. The biological functions of CID5 and CID6 remain unclear.	43
270555	cd14372	CUE_Cue5p_like	CUE domain found in yeast ubiquitin-binding protein CUE5 (Cue5p), donuts protein 1 (DON1p) and similar proteins. Cue5p, also called coupling of ubiquitin conjugation to ER degradation protein 5, is encoded by the open reading frame (ORF) Yor042. It contains a CUE domain which exhibits weak ubiquitin binding properties. Donuts protein 1 (DON1p) is encoded by the ORF YDR273w. It localizes specifically to the prospore membrane and is expressed exclusively during meiosis. DON1p may function as a unique marker to investigate the defects associated with the impaired function of the meiotic plaque in the mpc- mutants. 	45
270556	cd14373	CUE_Cue3p_like	CUE domain found in yeast ubiquitin-binding protein CUE3 (Cue3p) and similar proteins. Cue3p, also called coupling of ubiquitin conjugation to ER degradation protein 3, is encoded by the open reading frame (ORF) YGL110C. It is involved in the intramolecular monoubiquitination that serves as a regulatory signal in a variety of cellular processes in yeast. Cue3p contains a CUE domain.	41
270557	cd14374	CUE1_Cue2p_like	CUE1 domain found in yeast ubiquitin-binding protein CUE2 (Cue2p) and similar proteins. Cue2p, also called coupling of ubiquitin conjugation to ER degradation protein 2, is encoded by the open reading frame (ORF) YKL090W. It is involved in the intramolecular monoubiquitination that serves as a regulatory signal in a variety of cellular processes in yeast. Cue2p contains two tandem CUE domains at the N-terminus. Both of them can bind monoubiquitin independently. This model corresponds to the first CUE domain.	42
270558	cd14375	CUE2_Cue2p_like	CUE2 domain found in yeast ubiquitin-binding protein CUE2 (Cue2p) and similar proteins. Cue2p, also called coupling of ubiquitin conjugation to ER degradation protein 2, is encoded by the open reading frame (ORF) YKL090W. It is involved in the intramolecular monoubiquitination that serves as a regulatory signal in a variety of cellular processes in yeast. Cue2p contains two tandem CUE domains at the N-terminus. Both of them can bind monoubiquitin independently. This model corresponds to the second CUE domain.	38
270559	cd14376	CUE_AUP1_AMFR_like	CUE domain found in ancient ubiquitous protein 1 (AUP1), autocrine motility factor receptor (AMFR) and similar proteins. AUP1 is a component of the HRD1-SEL1L endoplasmic reticulum (ER) quality control complex and is essential for US11-mediated dislocation of class I MHC heavy chains. AMFR is an internalizing cell surface glycoprotein that is localized in both plasma membrane caveolae and the ER, and is involved in the regulation of cellular adhesion, proliferation, motility and apoptosis, as well as in the process of learning and memory. Cue1p is an N-terminally membrane-anchored endoplasmic reticulum (ER) protein essential for the activity of the two major yeast RING finger ubiquitin ligases (E3s) implicated in ER-associated degradation (ERAD). This family also includes plant E3 ubiquitin protein ligases RIN2, RIN3, and similar proteins. Compared with other CUE domain-containing proteins, some family members from higher eukaryotes do not bind monoubiquitin efficiently, since they carry LP, rather than FP among CUE domains.	37
270560	cd14377	UBA1_Rad23	UBA1 domain of Rad23 proteins found in metazoa. The family includes mammalian orthologs of yeast nucleotide excision repair (NER) proteins, Rad23p (in Saccharomyces cerevisiae) and Rhp23p (in Schizosaccharomyces pombe). Rad23 proteins play dual roles in DNA repair as well as in proteasomal degradation. They have affinity for both the proteasome and ubiquitinylated proteins and participate in translocating polyubiquitinated proteins to the proteasome. Rad23 proteins carry a ubiquitin-like (UBL) and two ubiquitin-associated (UBA) domains, as well as a xeroderma pigmentosum group C (XPC) protein-binding domain. UBL domain is responsible for the binding to proteasome. UBA domains are important for binding of ubiquitin (Ub) or multi-ubiquitinated substrates which suggests Rad23 proteins might be involved in certain pathways of ubiquitin metabolism. Both UBL domain and XPC-binding domain are necessary for efficient NER function of Rad23 proteins. This model corresponds to the UBA1 domain.	40
270561	cd14378	UBA1_Rhp23p_like	UBA1 domain of Schizosaccharomyces pombe UV excision repair protein Rhp23p and its homologs. The subfamily contains several fungal multi-ubiquitin receptors, including Schizosaccharomyces pombe Rhp23p and Saccharomyces cerevisiae Rad23p, both of which are orthologs of human HR23A. They play roles in nucleotide excision repair (NER) and in cell cycle regulation. They also function as shuttle proteins transporting ubiquitinated substrates destined for degradation from the E3 ligase to the 26S proteasome. For instance, S. pombe Rhp23p forms a complex with Rhp41p to recognize photolesions and help initiate DNA repair, and it also protects ubiquitin chains against disassembly by deubiquitinating enzymes. Like human HR23A, members in this subfamily interact with the proteasome through their N-terminal ubiquitin-like domain (UBL), and with ubiquitin (Ub), or multi-ubiquitinated substrates, through their two ubiquitin-associated domains (UBA), termed internal UBA1 and C-terminal UBA2. In addition, they contain a xeroderma pigmentosum group C (XPC) protein-binding domain that might be necessary for its efficient NER function. This model corresponds to the UBA1 domain.	47
270562	cd14379	UBA1_Rad23_plant	UBA1 domain of putative DNA repair proteins Rad23 found in plant. The radiation sensitive 23 (Rad23) subfamily consists of four isoforms of putative DNA repair proteins from Arabidopsis thaliana and similar proteins from other plants. The nuclear-enriched Rad23 proteins function in the cell cycle, morphology, and fertility of plants through their delivery of ubiquitin (Ub)/26S proteasome system (UPS) substrates to the 26S proteasome. Rad23 proteins contain an N-terminal ubiquitin-like (UBL) domain that associates with the 26S proteasome Ub receptor RPN10, and two C-terminal ubiquitin-associated (UBA) domains that bind Ub conjugates. This model corresponds to the UBA1 domain.	50
270563	cd14380	UBA2_Rad23	UBA2 domain of Rad23 proteins found in metazoa. The family includes mammalian orthologs of yeast nucleotide excision repair (NER) proteins, Rad23p (in Saccharomyces cerevisiae) and Rhp23p (in Schizosaccharomyces pombe). Rad23 proteins play dual roles in DNA repair as well as in proteasomal degradation. They have affinity for both the proteasome and ubiquitinylated proteins and participate in translocating polyubiquitinated proteins to the proteasome. Rad23 proteins carry a ubiquitin-like (UBL) and two ubiquitin-associated (UBA) domains, as well as a xeroderma pigmentosum group C (XPC) protein-binding domain. UBL domain is responsible for the binding to proteasome. UBA domains are important for binding of ubiquitin (Ub) or multi-ubiquitinated substrates which suggests Rad23 proteins might be involved in certain pathways of ubiquitin metabolism. Both UBL domain and XPC-binding domain are necessary for efficient NER function of Rad23 proteins. This model corresponds to the UBA2 domain.	39
270564	cd14381	UBA2_Rhp23p_like	UBA2 domain of Schizosaccharomyces pombe UV excision repair protein Rhp23p and its fungal homologs. The subfamily contains several fungal multiubiquitin receptors, including Schizosaccharomyces pombe Rhp23p and Saccharomyces cerevisiae Rad23p, both of which are orthologs of human HR23A. They play roles in nucleotide excision repair (NER) and in cell cycle regulation. They also function as shuttle proteins transporting ubiquitinated substrates destined for degradation from the E3 ligase to the 26S proteasome. For instance, S. pombe Rhp23p forms a complex with Rhp41p to recognize photolesions and help initiate DNA repair, and it also protects ubiquitin chains against disassembly by deubiquitinating enzymes. Like human HR23A, members in this subfamily interact with the proteasome through their N-terminal ubiquitin-like domain (UBL), and with ubiquitin (Ub), or multi-ubiquitinated substrates, through their two ubiquitin-associated domains (UBA), termed internal UBA1 and C-terminal UBA2. In addition, they contain a xeroderma pigmentosum group C (XPC) protein-binding domain that might be necessary for its efficient NER function. This model corresponds to the UBA2 domain.	40
270565	cd14382	UBA2_RAD23_plant	UBA2 domain of putative DNA repair proteins RAD23 found in plant. The radiation sensitive 23 (RAD23) subfamily consists of four isoforms of putative DNA repair proteins from Arabidopsis thaliana and similar proteins from other plants. The nuclear-enriched RAD23 proteins function in the cell cycle, morphology, and fertility of plants through their delivery of ubiquitin (Ub)/26S proteasome system (UPS) substrates to the 26S proteasome. RAD23 proteins contain an N-terminal ubiquitin-like (UBL) domain that associates with the 26S proteasome Ub receptor RPN10, and two C-terminal ubiquitin-associated (UBA) domains that bind Ub conjugates. This model corresponds to the UBA2 domain.	43
270566	cd14383	UBA1_UBP5	UBA1 domain found in ubiquitin carboxyl-terminal hydrolase 5 (UBP5). UBP5, also called deubiquitinating enzyme 5, Isopeptidase T (IsoT), ubiquitin thioesterase 5, or ubiquitin-specific-processing protease 5, is a deubiquitinating enzyme largely responsible for the disassembly of the majority of unanchored polyubiquitin in the cell. Zinc is required for its catalytic activity. UBP5 contains four ubiquitin (Ub)-binding sites including an N-terminal zinc finger (ZnF) domain, a catalytic ubiquitin-specific processing protease (UBP) domain (catalytic C-box and H-box), and two ubiquitin-associated (UBA) domains. ZnF domain binds the proximal ubiquitin. UBP domain forms the active site. UBA domains are involved in binding linear or K48-linked polyubiquitin. This model corresponds to the UBA1 domain.	49
270567	cd14384	UBA1_UBP13	UBA1 domain found in ubiquitin carboxyl-terminal hydrolase 13 (UBP13). UBP13, also called deubiquitinating enzyme 13, Isopeptidase T-3 (isoT3), ubiquitin thioesterase 13, or ubiquitin-specific-processing protease 13, is an ortholog of UBP5 implicated in catalyzing hydrolysis of various ubiquitin (Ub)-chains. It contains a zinc finger (ZnF) domain, a catalytic ubiquitin-specific processing protease (UBP) domain (catalytic C-box and H-box), and two ubiquitin-associated (UBA) domains. Due to the non-activating catalysis for K63-polyubiquitin chains, UBP13 may function differently from USP5 in cellular deubiquitination processes. Moreover, the zinc finger (ZnF) domain of USP13 cannot bind to Ub. Its tandem UBA domains can bind with different types of diUb but preferentially with K63-linked. USP13 can also regulate the protein level of CD3delta in cells via its UBA domains. This model corresponds to the UBA1 domain.	49
270568	cd14385	UBA1_spUBP14_like	UBA1 domain found in Schizosaccharomyces pombe ubiquitin carboxyl-terminal hydrolase 14 (spUBP14) and similar proteins. spUBP14, also called deubiquitinating enzyme 14, UBA domain-containing protein 2, ubiquitin thioesterase 14, or ubiquitin-specific-processing protease 14, functions as a deubiquitinating enzyme that is involved in protein degradation in fission yeast. Members in this family contain two tandem ubiquitin-association (UBA) domains. This model corresponds to the UBA1 domain.	47
270569	cd14386	UBA2_UBP5	UBA2 domain found in ubiquitin carboxyl-terminal hydrolase 5 (UBP5). UBP5, also called deubiquitinating enzyme 5, Isopeptidase T (IsoT), ubiquitin thioesterase 5, or ubiquitin-specific-processing protease 5, is a deubiquitinating enzyme largely responsible for the disassembly of the majority of unanchored polyubiquitin in the cell. Zinc is required for its catalytic activity. UBP5 contains four ubiquitin (Ub)-binding sites including an N-terminal zinc finger (ZnF) domain, a catalytic ubiquitin-specific processing protease (UBP) domain (catalytic C-box and H-box), and two ubiquitin-associated (UBA) domains. ZnF domain binds the proximal ubiquitin. UBP domain forms the active site. UBA domains are involved in binding linear or K48-linked polyubiquitin. This model corresponds to the UBA2 domain.	43
270570	cd14387	UBA2_UBP13	UBA2 domain found in ubiquitin carboxyl-terminal hydrolase 13 (UBP13). UBP13, also called deubiquitinating enzyme 13, Isopeptidase T-3 (isoT3), ubiquitin thioesterase 13, or ubiquitin-specific-processing protease 13, is an ortholog of UBP5 implicated in catalyzing hydrolysis of various ubiquitin (Ub)-chains. It contains a zinc finger (ZnF) domain, a catalytic ubiquitin-specific processing protease (UBP) domain (catalytic C-box and H-box), and two ubiquitin-associated (UBA) domains. Due to the non-activating catalysis for K63-polyubiquitin chains, UBP13 may function differently from USP5 in cellular deubiquitination processes. Moreover, the zinc finger (ZnF) domain of USP13 cannot bind to Ub. Its tandem UBA domains can bind with different types of diUb but preferentially with K63-linked. USP13 can also regulate the protein level of CD3delta in cells via its UBA domains. This model corresponds to the UBA2 domain.	35
270571	cd14388	UBA2_atUBP14	UBA2 domain found in Arabidopsis thaliana ubiquitin carboxyl-terminal hydrolase 14 (atUBP14) and similar proteins. atUBP14, also called deubiquitinating enzyme 14, TITAN-6 protein, ubiquitin thioesterase 14, or ubiquitin-specific-processing protease 14, is related to the isopeptidase T class of deubiquitinating enzymes that recycle polyubiquitin chains following protein degradation. atUBP14 is essential for early plant development. It can disassemble multi-ubiquitin chains linked internally via epsilon-amino isopeptide bonds using Lys48 and can process some, but not all, translational fusions of ubiquitin linked via alpha-amino peptide bonds. atUBP14 contains two ubiquitin-association (UBA) domains. This model corresponds to the UBA2 domain which show a high level of sequence similarity with mammalian ubiquitin-associated and SH3 domain-containing protein A (UBS3A).	38
270572	cd14389	UBA_AAA_plant	UBA domain found in plant AAA-type ATPase-like proteins. This family includes some uncharacterized AAA-type ATPase-like proteins found in plants. The AAA+ ATPases function as molecular chaperones, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. Members in this family contain an N-terminal ubiquitin-association (UBA) domain, an AAA-type ATPase domain and a C-terminal MgsA AAA+ ATPase domain. This model corresponds to the UBA domain which shows a high level of sequence similarity with mammalian ubiquitin-associated and SH3 domain-containing protein A (UBS3A).	37
270573	cd14390	UBA_II_E2_UBE2K	UBA domain of vertebrate ubiquitin-conjugating enzyme E2 K (UBE2K). UBE2K, also called Huntingtin-interacting protein 2 (HIP-2), ubiquitin carrier protein, ubiquitin-conjugating enzyme E2-25 kDa (E2-25K), or ubiquitin-protein ligase, is a multi-ubiquitinating enzyme with the ability to synthesize Lys48-linked polyubiquitin chains which is involved in the ubiquitin (Ub)-dependent proteolytic pathway. It interacts with the frameshift mutant of ubiquitin B and functions as a crucial factor regulating amyloid-beta neurotoxicity. It has also been characterized as Huntingtin-interacting protein that modulates the neurotoxicity of Amyloid-beta (Abeta), the principal protein involved in Alzheimer's disease pathogenesis. Moreover, E2-25K increases the aggregate formation of expanded polyglutamine proteins and polyglutamine-induced cell death in the pathology of polyglutamine diseases. UBE2K and its yeast homolog UBC1 are unique class II E2 conjugating enzymes, both of which contain a C-terminal ubiquitin-associated (UBA) domain in addition to an N-terminal catalytic ubiquitin-conjugating enzyme E2 (UBCc) domain.	38
270574	cd14391	UBA_II_E2_UBCD4	UBA domain found in Drosophila melanogaster ubiquitin-conjugating enzyme E2-22 kDa (UbcD4) and similar proteins. UbcD4, also called ubiquitin carrier protein or ubiquitin-protein ligase, is a class II E2 ubiquitin-conjugating enzyme encoded by Drosophila E2 gene which is only expressed in pole cells in embryos. It is a putative E2 enzyme homologous to the Huntingtin interacting protein-2 (HIP2) of human. UbcD4 specifically interacts with the polyubiquitin-binding subunit of the proteasome. It contains a C-terminal ubiquitin-associated (UBA) domain in addition to an N-terminal catalytic ubiquitin-conjugating enzyme E2 (UBCc) domain.	36
270575	cd14392	UBA_Cbl-b	UBA domain found in E3 ubiquitin-protein ligase Cbl-b and similar proteins. Cbl-b, also called casitas B-lineage lymphoma proto-oncogene b, RING finger protein 56, SH3-binding protein Cbl-b, or signal transduction protein Cbl-b, has been identified as a regulator of antigen-specific, T cell-intrinsic, peripheral immune tolerance (a state also called clonal anergy). It may inhibit activation of the p85 subunit of phosphoinositide 3-kinase (PI3K), protein kinase C-theta (PKC-theta), and phospholipase C-gamma1 (PLC-gamma1) and negatively regulates T-cell receptor-induced transcription factor nuclear factor-kappaB (NF-kappaB) activation. In addition, Cbl-b may target multiple signaling molecules involved in transforming growth factor (TGF)-beta-mediated transactivation pathways. Cbl-b contains a proline rich domain, a nuclear localization signal, a C3HC4 zinc finger and a ubiquitin-associated (UBA) domain.	41
270576	cd14393	UBA_c-Cbl	UBA domain found in E3 ubiquitin-protein ligase Cbl and similar proteins. Cbl, also called casitas B-lineage lymphoma proto-oncogene, proto-oncogene c-Cbl, RING finger protein 55, or signal transduction protein Cbl, is a multi-domain protein that acts as a key negative regulator of various receptor and non-receptor tyrosine kinases signaling. It contains a tyrosine kinase-binding domain (TKB), a proline-rich domain, a RING domain, and a ubiquitin-associated (UBA) domain. The TKB is responsible for the interactions with many tyrosine kinases, such as the colony-stimulating factor-1 (CSF-1) receptor, Syk/ZAP-70 and Src-family of protein tyrosine kinases. The proline-rich domain can recruit proteins with SH3 domain. Moreover, Cbl functions as an E3 ubiquitin ligase that can bind ubiquitin-conjugating enzymes (E2s) through RING domain. 	40
270577	cd14394	UBA_BIRC2_3	UBA domain found in baculoviral IAP repeat-containing protein BIRC2, BIRC3 and similar proteins. The subfamily includes cellular inhibitor of apoptosis protein 1 (c-IAP1) and c-IAP2. c-IAPs function as ubiquitin E3 ligases that mediate the ubiquitination of the substrates involved in apoptosis, nuclear factor-kappaB (NF-kappaB) signaling, and oncogenesis. Unlike other inhibitor of apoptosis proteins (IAPs), such as XIAP, c-IAPs exhibit minimal binding to caspases and may not play an important role in the inhibition of these proteases. c-IAP1, also called baculoviral IAP repeat-containing protein BIRC2, IAP-2, RING finger protein 48, or TNFR2-TRAF-signaling complex protein 2, is a potent regulator of the tumor necrosis factor (TNF) receptor family and NF-kappaB signaling pathways in the cytoplasm. It can also regulate E2F1 transcription factor-mediated control of cyclin transcription in the nucleus. c-IAP2, also called BIRC3, IAP-1, apoptosis inhibitor 2 (API2), or IAP homolog C, also influences ubiquitin-dependent pathways that modulate innate immune signalling by activation of NF-kappaB. c-IAPs contain three N-terminal baculoviral IAP repeat (BIR) domains that enable interactions with proteins, a ubiquitin-association (UBA) domain that is responsible for binding polyubiquitin (polyUb), a caspase activation and recruitment domain (CARD) that serves as a protein interaction surface, and a RING domain at the carboxyl terminus that is required for ubiquitin ligase activity.	50
270578	cd14395	UBA_BIRC4_8	UBA domain found in E3 ubiquitin-protein ligase XIAP, baculoviral IAP repeat-containing protein 8 (BIRC8) and similar proteins. XIAP, also called baculoviral IAP repeat-containing protein 4 (BIRC4), IAP-like protein (ILP), inhibitor of apoptosis protein 3 (IAP-3), or X-linked inhibitor of apoptosis protein (X-linked IAP), is a potent suppressor of apoptosis that directly inhibits specific members of the caspase family of cysteine proteases, including caspase-3, -7, and -9. It promotes proteasomal degradation of caspase-3 and enhances its anti-apoptotic effect in Fas-induced cell death. The ubiquitin-protein ligase (E3) activity of XIAP also exhibits in the ubiquitination of second mitochondria-derived activator of caspases (Smac). The mitochondrial proteins, Smac/DIABLO and Omi/HtrA2, can inhibit the antiapoptotic activity of XIAP. XIAP has also been implicated in several intracellular signaling cascades involved in the cellular response to stress, such as the c-Jun N-terminal kinase (JNK) pathway, the nuclear factor-kappaB (NF-kappaB) pathway, and the transforming growth factor-beta (TGF-beta) pathway. Moreover, XIAP can regulate copper homeostasis through interacting with MURR1. BIRC8, also called inhibitor of apoptosis-like protein 2 (IAP-like protein 2 or ILP-2), or testis-specific inhibitor of apoptosis, is a tissue-specific homolog of E3 ubiquitin-protein ligase XIAP. It has been implicated in the control of apoptosis in the testis by direct inhibition of caspase 9. Both XIAP and BIRC8 contain three N-terminal baculoviral IAP repeat (BIR) domains, a ubiquitin-association (UBA) domain and a RING domain at the carboxyl terminus.	50
270579	cd14396	UBA_XtBIRC7_like	UBA domain found in Xenopus tropicalis baculoviral IAP repeat-containing protein BIRC7, BIRC71A and similar proteins. X. tropicalis BIRC7, also called E3 ubiquitin-protein ligase EIAP, embryonic/Egg IAP (xEIAP/XLX), inhibitor of apoptosis (IAP)-like protein, XIAP homolog XLX, is a weak apoptosis inhibitor that exhibits caspase inhibition and autoubiquitylation. It is uniquely modified by MAPK- and Cdc2/Cyclin B-dependent phosphorylation during oocyte maturation. Its caspase-dependent cleavage is altered when it is phosphorylated. X. tropicalis BIRC7 contains two N-terminal baculoviral IAP repeats (BIRs) and a C-terminal RING domain. Based on sequence homology, it also harbors a ubiquitin-associated (UBA) domain which is not detected in human BIRC7.	44
270580	cd14397	UBA_LATS1	UBA domain found in vertebrate serine/threonine-protein kinase LATS1. LATS1, also called large tumor suppressor homolog 1 or WARTS protein kinase (warts), is a serine/threonine-protein kinase that is highly conserved from fly to human. It plays a crucial role in the prevention of tumor formation by controlling mitosis progression. Human LATS1 is the mammalian homolog of the Drosophila lats/warts gene that could suppress tumor growth and rescue all developmental defects in flies, including embryonic lethality. It forms a regulatory complex with zyxin, a regulator of actin filament assembly. The LATS1/zyxin complex plays a role in controlling mitosis progression on mitotic apparatus. LATS1 is phosphorylated in a cell-cycle-dependent manner and complexes with CDC2 in early mitosis. It can negatively modulate tumor cell growth by inducing G(2)/M cell cycle transition or apoptosis. It also functions as a mitotic exit network kinase interacting with MOB1A, a protein whose homolog in budding yeast associates with kinases involved in mitotic exit. Moreover, LATS1 acts as a novel cytoskeleton regulator that affects cytokinesis by regulating actin polymerization through inhibiting LIMK1. LATS1 can also inhibit transcription regulation and transformation functions of oncogene YAP by inhibiting its nuclear translocation through phosphorylation. In addition, LATS1 can regulate the transcriptional activity of forkhead L2 (FOXL2) via phosphorylation. It also acts as an actin-binding protein that can negatively regulate the actin polymerization. LATS1 contains an N-terminal ubiquitin-associated (UBA) domain and a C-terminal protein kinase domain.	41
270581	cd14398	UBA_LATS2	UBA domain found in vertebrate serine/threonine-protein kinase LATS2. LATS2, also called kinase phosphorylated during mitosis protein, or large tumor suppressor homolog 2, or serine/threonine-protein kinase KPM, or Warts-like kinase, is a novel mammalian homolog of the Drosophila tumor suppressor gene lats/warts. It inhibits the G1/S transition and is essential for embryonic development, proliferation control, and genomic integrity. LATS2 is a serine/threonine kinase that negatively regulates CyclinE/CDK2 and plays a role in tumor suppression. It also acts as the negative regulator of androgen receptor (AR) through inhibiting androgen-regulated gene expression and thus plays an important role in AR -regulated transcription and in the development of prostate cancer. Moreover, LATS2 induces apoptosis via down-regulation of anti-apoptotic proteins, BCL-2 and BCL-x(L), in human lung cancer cells. It is a centrosomal protein and forms a complex with Ajuba, a LIM protein, to regulate organization of the spindle apparatus through recruitment of gamma-tubulin to the centrosome during mitosis. Furthermore, LATS2 interacts with Mdm2 to inhibit p53 ubiquitination and promote p53 activation. It stabilizes the cellular protein level of Snail1, a central regulator of epithelial cell adhesion and movement in epithelial-to-mesenchymal transitions (EMTs) during embryo development, and enhances its EMT activity. LATS2 contains an N-terminal ubiquitin-associated (UBA) domain and a C-terminal protein kinase domain.	41
270582	cd14399	UBA_PLICs	UBA domain of eukaryotic protein linking integrin-associated protein (IAP, also known as CD47) with cytoskeleton (PLIC) proteins. The PLIC proteins (or ubiquilins) family contains human homologs of the yeast ubiquitin-like Dsk2 protein, PLIC-1 (also called ubiquilin-1), PLIC-2 (also called ubiquilin-2 or Chap1), PLIC-3 (also called ubiquilin-3) and PLIC-4 (also called ubiquilin-4, Ataxin-1 interacting ubiquitin-like protein, A1Up, Connexin43-interacting protein of 75 kDa, or CIP75), and mouse PLIC proteins. They are ubiquitin-binding adaptor proteins involved in all protein degradation pathways through delivering ubiquitinated substrates to proteasomes. They also promote autophagy-dependent cell survival during nutrient starvation. PLIC-1 regulates the function of the thrombospondin receptor CD47 and G protein signaling. It plays a role in TLR4-mediated signaling through interacting with the Toll/interleukin-1 receptor (TIR) domain of TLR4. It also inhibits the TLR3-Trif antiviral pathway by reducing the abundance of Trif. Moreover, PLIC-1 binds to gamma-aminobutyric acid receptors (GABAARs) and modulates the ubiquitin-dependent, proteasomal degradation of GABAARs. Furthermore, PLIC-1 acts as a molecular chaperone regulating amyloid precursor protein (APP) biosynthesis, trafficking, and degradation by stimulating K63-linked polyubiquitination of lysine 688 in the APP intracellular domain. In addition, PLIC-1 is involved in the protein aggregation-stress pathway via associating with the ubiquitin-interacting motif (UIM) proteins ataxin 3, HSJ1a, and epidermal growth factor substrate 15 (EPS15). PLIC-2 is a protein that binds the ATPase domain of the HSP70-like Stch protein. It functions as a negative regulator of G protein-coupled receptor (GPCR) endocytosis. It is also involved in amyotrophic lateral sclerosis (ALS)-related dementia. PLIC-3 is encoded by UBQLN3, a testis-specific gene. It shows high sequence similarity with the Xenopus protein XDRP1, a nuclear phosphoprotein that binds to the N-terminus of cyclin A and inhibits Ca2+-induced degradation of cyclin A, but not cyclin B. PLIC-4 is a ubiquitin-like nuclear protein that interacts with ataxin-1 and further links ataxin-1 with the chaperone and ubiquitin-proteasome pathways. It also binds to the non-ubiquitinated gap junction protein connexin43 (Cx43) and regulates the turnover of Cx43 through the proteasomal pathway. PLIC proteins contain an N-terminal ubiquitin-like (UBL) domain that is responsible for the binding of ubiquitin-interacting motifs (UIMs) expressed by proteasomes and endocytic adaptors, and a C-terminal ubiquitin-associated (UBA) domain that interacts with ubiquitin chains present on proteins destined for proteasomal degradation. In addition, mammalian PLIC2 proteins have an extra collagen-like motif region which is absent in other PLIC proteins and the yeast Dsk2 protein.	40
270583	cd14400	UBA_Gts1p_like	UBA domain found in Saccharomyces cerevisiae protein GTS1 (Gts1p) and similar proteins. Gts1p, also called protein LSR1, is encoded by a pleiotropic gene GTS1 in budding yeast. The formation of Gts1p-mediated protein aggregates may induce reactive oxygen species (ROS) production and apoptosis. Gts1p also plays an important role in the regulation of heat and other stress responses under glucose-limited or -depleted conditions in either batch or continuous culture. Gts1p contains an N-terminal zinc finger motif similar to that of GATA-transcription factors, a ubiquitin-associated (UBA) domain and a C-terminal glutamine-rich strand. The zinc finger is responsible for the binding to the glycolytic enzyme glyceraldehyde-3-phosphate dehydrogenase (GAPDH) which is required for the maintenance of the metabolic oscillations of budding yeast. The polyglutamine sequence is indispensable for the pleiotropy and nuclear localization of Gts1p. It is essential for the transcriptional activation, whereas Gts1p lacks DNA binding activity.	39
270584	cd14401	UBA_HERC1	UBA domain found in probable E3 ubiquitin-protein ligase HERC1 and similar proteins. HERC1, also called HECT domain and RCC1-like domain-containing protein 1, or p532, or p619, is an ubiquitously expressed multi-domain protein involved in ubiquitin-dependent intracellular membrane trafficking through its interaction with vesicle coat proteins such as clathrin and ARF. Moreover, it has been identified as a tuberous sclerosis complex TSC2-interacting protein that may play a role in TSC-mTOR (mammalian target of rapamycin) pathway. In addition to a ubiquitin-association (UBA) domain, HERC1 contains more than one RCC1-like domains (RLDs) and a C-terminal HECT E3 ubiquitin ligase domain. At this point, it may function as both E3 ubiquitin ligases and guanine nucleotide exchange factors (GEFs).	44
270585	cd14402	UBA_HERC2	UBA domain found in probable E3 ubiquitin-protein ligase HERC2 and similar proteins. HERC2, also called HECT domain and RCC1-like domain-containing protein 2, is a SUMO-regulated E3 ubiquitin ligase that plays an important role in the SUMO-dependent pathway which orchestrates the DNA double-strand break (DSB) response. Moreover, HERC2 functions as a RNF8 auxiliary factor that regulates ubiquitin-dependent retention of repair proteins on damaged chromosomes. In addition to a ubiquitin-association (UBA) domain, HERC2 contains more than one RCC1-like domains (RLDs) and a C-terminal HECT E3 ubiquitin ligase domain.	45
270586	cd14403	UBA_AID_AAPK1	UBA-like autoinhibitory domain (AID) found in vertebrate 5'-AMP-activated protein kinase catalytic subunit alpha-1 (AMPKalpha-1). AMPKalpha-1, also called acetyl-CoA carboxylase kinase (ACACA kinase), hydroxymethylglutaryl-CoA reductase kinase (HMGCR kinase), or Tau-protein kinase PRKAA1, is one of the catalytic subunits of adenosine monophosphate (AMP)-activated protein kinase (AMPK). It has been implicated in a number of important cellular processes. For instance, it functions as a glucose sensor controlling CD8 T-cell memory, as well as a new kinase for RhoA and a new mediator of the vasoprotective effects of estrogen. It also plays a significant role in cervical malignant growth, in regulating oxidative stress and life span in erythrocytes, in modulating the antioxidant status of vascular endothelial cells, in limiting skeletal muscle overgrowth during hypertrophy through inhibition of the mammalian target of rapamycin (mTOR)-signaling pathway. AMPKalpha-1 has an N-terminal Ser/Thr kinase domain followed by an ubiquitin-associated (UBA)-like AID and a C-terminal AMPK regulatory domain. The Ser/Thr kinase domain contains a conserved Thr residue that must be phosphorylated for activity in the activation loop. The AID is responsible for AMPKalpha subunit autoinhibition. The C-terminal regulatory domain of the alpha1-subunit is essential for binding the beta1- and gamma1-subunits.	65
270587	cd14404	UBA_AID_AAPK2	UBA-like autoinhibitory domain (AID) found in vertebrate 5'-AMP-activated protein kinase catalytic subunit alpha-2 (AMPKalpha-2). AMPKalpha-2, also called acetyl-CoA carboxylase kinase (ACACA kinase) or hydroxymethylglutaryl-CoA reductase kinase (HMGCR kinase), is one of the catalytic subunits of adenosine monophosphate (AMP)-activated protein kinase (AMPK). It shows a wide expression pattern and is highly expressed in skeletal muscle, heart, and liver. It may be involved in the regulation of glucose and lipid metabolism and protein synthesis in peripheral tissues, as well as in regulation of energy intake and body weight. AMPKalpha-2 has an N-terminal Ser/Thr kinase domain followed by an ubiquitin-associated (UBA)-like AID, and a C-terminal AMPK regulatory domain. The Ser/Thr kinase domain contains a conserved Thr residue that must be phosphorylated for activity in the activation loop. The AID is responsible for AMPKalpha subunit autoinhibition. The C-terminal regulatory domain is essential for binding the beta- and gamma-subunits.	65
270588	cd14405	UBA_MARK1	UBA domain found in serine/threonine-protein kinase MARK1 and similar proteins. MARK1, also called MAP/microtubule affinity-regulating kinase 1 or PAR1 homolog c (Par-1c), is a kinase-regulating microtubule-dependent transport in axons and dendrites. It is involved in the specification of neuronal polarity, in axon-dendrite specification, and in the synaptic plasticity in adult neurons. It has been implicated in Alzheimer's disease, cancer, and autism.	41
270589	cd14406	UBA_MARK2	UBA domain found in serine/threonine-protein kinase MARK2 and similar proteins. MARK2, also called ELKL motif kinase 1 (EMK-1), MAP/microtubule affinity-regulating kinase 2, PAR1 homolog, or PAR1 homolog b (Par-1b), is enriched in brain. It belongs to the AMPK family of Ser/Thr kinases. MARK2 has been implicated in regulating fertility, immune homeostasis, learning, and memory as well as adiposity, insulin hypersensitivity, and glucose metabolism. The activity of MARK2 is necessary for the outgrowth of cell processes, neurites, and dendritic spines. It is a TORC2 (also known as Crtc2) Ser-275 kinase that blocks TORC2-induced cAMP response element binding protein (CREB) activity. It regulates axon formation via phosphorylation of a kinesin-like motor protein GAKIN/KIF13B. It also acts as a positive regulator of Wnt-beta-catenin signaling.	42
270590	cd14407	UBA_MARK3_4	UBA domain found in MAP/microtubule affinity-regulating kinase MARK3, MARK4, and similar proteins. MARK3, also called C-TAK1, Cdc25C-associated protein kinase 1, ELKL motif kinase 2 (EMK-2), protein kinase STK10, Ser/Thr protein kinase PAR-1 (Par-1a), or serine/threonine-protein kinase p78, is a known regulator of KSR1, a molecular scaffold of the Raf/MEK/ERK MAP kinase cascade that regulates the intensity and duration of ERK activation. It binds plakophilin 2 (PKP2), phosphorylates human Cdc25C on serine 216, and promotes 14-3-3 protein binding and protein localization. It also interacts with microphthalmia-associated transcription factor, Mitf which is necessary for regulating genes involved in osteoclast differentiation. Moreover, MARK3 is involved in regulating localization and activity of class IIa histone deacetylases. The lack of MARK3 leads to reduced adiposity, resistance to hepatic steatosis, and defective gluconeogenesis. MARK4, also called MAP/microtubule affinity-regulating kinase-like 1 (MARKL1), or Par-1d, is a member of the AMP-activated protein kinase (AMPK)-related family of kinases. It plays a key role in energy metabolism and may act as a novel drug target for the treatment of obesity and type 2 diabetes. MARK4 also functions as the substrate of ubiquitin specific protease-9 (USP9X) and can be regulated by unusual Lys(29)/Lys(33)-linked polyubiquitin chains. Furthermore, MARK4 may play some role in hepatocellular carcinogenesis.	43
270591	cd14408	UBA_SIK1	UBA domain found in salt-inducible kinase 1 (SIK1). SIK1, also called serine/threonine-protein kinase SNF1-like kinase 1 (SNF1LK), is a serine/threonine kinase abundant in adrenal glands. It belongs to the AMP-activated protein kinases (AMPK) family involved in the regulation of metabolism during energy stress. SIK1 is required for myogenic differentiation. It is degraded by the proteasome in myoblasts which is regulated by cAMP signaling. Moreover, SIK1 acts as a class II histone deacetylase (HDAC) kinase, triggering the cytoplasmic export of the HDACs and activation of myocyte enhancer factor 2 (MEF2)-dependent transcription. It also regulates transcription through inhibitory phosphorylation of a family of cAMP responsive element binding protein (CREB) coactivators, called TORCs/CRTCs. In addition, SIK1 links LKB1 to p53-dependent anoikis and suppresses metastasis. It is also involved in a cell sodium-sensing network that regulates active sodium transport through a calcium-dependent process. SIK1 contains an N-terminal protein kinase catalytic domain followed by an ubiquitin-associated (UBA) domain and a putative PEST domain.	50
270592	cd14409	UBA_SIK2	UBA domain found in salt-inducible kinase 2 (SIK2). SIK2, also called Qin-induced kinase or serine/threonine-protein kinase SNF1-like kinase 2 (SNF1LK2), is a serine/threonine kinase highly expressed in adipocytes. It belongs to the AMP-activated protein kinases (AMPK) family involved in the regulation of metabolism during energy stress. It plays an important role in the insulin-signaling pathway during adipocyte differentiation, as well as in autophagy progression. Moreover, SIK2 plays a critical role in neuronal survival and modulates cAMP responsive element binding protein (CREB)-mediated gene expression in response to hormones and nutrients. SIK2 acts as a critical determinant in autophagy progression. In addition, SIK2 localizes at the centrosome and functions as a centrosome kinase required for bipolar mitotic spindle formation. It is involved in the initiation of mitosis, and regulates the localization of the centrosome linker protein, C-Nap1, through S2392 phosphorylation. SIK2 contains an N-terminal protein kinase catalytic domain followed by an ubiquitin-associated (UBA) domain.	45
270593	cd14410	UBA_SIK3	UBA domain found in salt-inducible kinase 3 (SIK3). SIK3, also called salt-inducible kinase 3 or serine/threonine-protein kinase QSK, is a serine/threonine kinase ubiquitously expressed. It belongs to the AMP-activated protein kinases (AMPK) family involved in the regulation of metabolism during energy stress. It acts as a novel energy regulator that modulates cholesterol and bile acid metabolism by coupling with retinoid metabolism. It also plays an essential role in facilitating chondrocyte hypertrophy during skeletogenesis and growth plate maintenance. SIK3 contains an N-terminal protein kinase catalytic domain followed by an ubiquitin-associated (UBA) domain.	45
270594	cd14411	UBA_DCNL1	UBA-like domain found in DCN1-like protein 1 (DCNL1) and similar proteins. DCNL1 (defective in cullin neddylation protein 1-like protein 1), also called DCUN1 domain-containing protein 1, is encoded by squamous cell carcinoma-related oncogene SCCRO (DCUN1D1). It interacts with known cullin isoforms as well as ROC1, Ubc12 and CAND1, the components of the neddylation pathway. It plays an essential role in the neddylation E3 complex and participates in the release of inhibitory effects of CAND1 on cullin-RING ligase E3 complex assembly and activity. DCNL1 contains an N-terminal ubiquitin-associated (UBA)-like domain and a C-terminal cullin binding domain that binds to cullins and Rbx-1, components of an E3 ubiquitin ligase complex for neddylation. 	51
270595	cd14412	UBA_DCNL2	UBA-like domain found in DCN1-like protein 2 (DCNL2) and similar proteins. DCNL2 (defective in cullin neddylation protein 1-like protein 2), also called DCUN1 domain-containing protein 2, is encoded by gene DCUN1D2. Although its biological function remains unclear, DCNL2 shows high sequence similarity with DCNL1, a protein that plays an essential role in the neddylation E3 complex and participates in the release of inhibitory effects of CAND1 on cullin-RING ligase E3 complex assembly and activity. At this point, DCNL2 may also contribute to neddylation of cullin components of SCF-type E3 ubiquitin ligase complexes. Like DCNL1, DCNL2 contains an N-terminal ubiquitin-associated (UBA)-like domain and a C-terminal cullin binding domain that is responsible for the binding to cullins and Rbx-1, components of an E3 ubiquitin ligase complex for neddylation. 	47
270596	cd14413	UBA_FAF1	UBA-like domain found in FAS-associated factor 1 (FAF1) and similar proteins. FAF1, also called UBX domain-containing protein 12 or UBX domain-containing protein 3A, is a multi-functional Fas associating protein that contains an N-terminal ubiquitin-associated (UBA)-like domain, UAS and ubiquitin-like (UBX) domains, p150 subunit of a chromatin assembly factor like domain (CAF) and a novel nuclear localization signal (NLS). FAF1 is an apoptotic signaling molecule that acts downstream in the Fas signal transduction pathway. It interacts with the cytoplasmic domain of Fas, but not with a Fas mutant that is deficient in signal transduction. FAF1 is widely expressed in adult and embryonic tissues, and in tumor cell lines, and is localized not only in the cytoplasm where it interacts with Fas, but also in the nucleus. FAF1 contains phosphorylation sites for protein kinase CK2 within the nuclear targeting domain. Phosphorylation influences nuclear localization of FAF1 but does not affect its potentiation of Fas-induced apoptosis. Other functions have also been attributed to FAF1. It inhibits nuclear factor-kappaB (NF-kappaB) by interfering with the nuclear translocation of the p65 subunit. Although the precise role of FAF1 in the ubiquitination pathway remains unclear, FAF1 interacts with valosin-containing protein (VCP) which is involved in the ubiquitin-proteasome pathway.	33
270597	cd14414	UBA_FAF2	UBA-like TAP-C domain found in FAS-associated factor 2 (FAF2) and similar proteins. FAF2, also called protein ETEA, UBX domain-containing protein 3B, or UBX domain-containing protein 8, is the translation product of a highly expressed gene in the T-cells and eosinophils of atopic dermatitis patients compared with those of normal individuals. FAF2 shows homology to Fas-associated factor 1 (FAF1). Both of them contain N-terminal ubiquitin-associated (UBA)-like domain, UAS and ubiquitin-like (UBX) domains. Compared to FAF1, however, FAF2 lacks the nuclear targeting domain. The function of FAF2 remains unclear. A yeast two-hybrid assay showed that it can interact with Fas. Because of its homology to FAF1, it is postulated that FAF2 could be involved in modulating Fas-mediated apoptosis of T-cells and eosinophils of atopic dermatitis patients, making them more resistant to apoptosis.	38
270598	cd14415	UBA_NACA_NACP1	UBA-like domain found in nascent polypeptide-associated complex subunit alpha (NACA) and putative NACA-like protein (NACP1). NACA, also called NAC-alpha or alpha-NAC, together with BTF3, also called Beta-NAC, form the nascent polypeptide-associated complex (NAC) which is a cytosolic protein chaperone that contacts the nascent polypeptide chains as they emerge from the ribosome. Besides, NACA has a high affinity for nucleic acids and exists as part of several protein complexes playing a role in proliferation, apoptosis, or degradation. It is a cytokine-modulated specific transcript in the human TF-1 erythroleukemic cell line. It also acts as a transcriptional co-activator in osteoblasts by binding to phosphorylated c-Jun, a member of the activator-protein-1 (AP-1) family. Moreover, NACA binds to and regulates the adaptor protein Fas-associated death domain (FADD). In addition, NACA functions as a novel factor participating in the positive regulation of human erythroid-cell differentiation. Both NACA and BTF3 harbor an NAC domain that mediates the dimerization of the two subunits. By contrast, NACA has an extra ubiquitin-associated (UBA) domain in the C-terminus. In addition to NACA, the family includes NACP1, also called Alpha-NAC pseudogene 1 or NAC-alpha pseudogene 1. The biological function of NACP1 remains unclear.	46
270599	cd14416	UBA_NACAD	UBA-like domain found in nascent polypeptide-associated complex subunit alpha domain-containing protein 1 (NACAD). The subfamily includes a group of uncharacterized proteins mainly found in vertebrates. Their biological function remains unknown, but they show high sequence similarity to the nascent polypeptide-associated complex (NAC) subunit alpha (NACA) that exists as part of several protein complexes playing a role in proliferation, apoptosis, or degradation. Like NACA, NACAD contains an NAC domain and a C-terminal ubiquitin-associated (UBA) domain.	44
270600	cd14417	CUE_DMA_DMRTA1	CUE-like DMA domain found in doublesex- and mab-3-related transcription factor A1 (DMRTA1) and similar proteins. DMRTA1 is encoded by gene DMRT1, a vertebrate equivalent of the Drosophila melanogaster master sex regulator gene, doublesex. In D. melanogaster, doublesex controls the terminal switch of the pathway leading to sex fate choice. DMRT1 may function as a regulator of sex differentiation in vertebrates. In particular, it is required for testis differentiation, but is not involved in the gonadal sex fate choice.	40
270601	cd14418	CUE_DMA_DMRTA2	CUE-like DMA domain found in doublesex- and mab-3-related transcription factor A2 (DMRTA2). DMRTA2, also called Doublesex- and mab-3-related transcription factor 5 (DMRT5), is encoded by gene DMRT2. In the zebrafish, DMRT2 is involved in somite development. DMRTA2 may act as an activator of cyclin-dependent kinase inhibitor 2C (cdkn2c) during spermatogenesis. It may also play significant roles in embryonic neurogenesis.	42
270602	cd14419	CUE_DMA_DMRTA3	CUE-like DMA domain found in doublesex- and mab-3-related transcription factor 3 (DMRTA3). DMRTA3 is encoded by tumor suppressor gene DMRT3 which serves as a novel potential target for homozygous deletion in squamous cell carcinoma of the lung.	43
270603	cd14420	CUE_AUP1	CUE domain found in ancient ubiquitous protein 1 (AUP1) and similar proteins. AUP1 is a component of the HRD1-SEL1L endoplasmic reticulum (ER) quality control complex and is essential for US11-mediated dislocation of class I MHC heavy chains. It also binds to the membrane-proximal KVGFFKR motif of the cytoplasmic tail of the integrin alphaCTs that plays a crucial role in the inside-out signaling of alpha(IIb)beta(3). AUP1 is found in both the ER and in lipid droplets. It contains two conserved cytoplasmic domains, an acyltransferase domain, a CUE domain and an E2 ubiquitin conjugase G2 (Ube2g2)-binding domain (G2BR). The acyltransferase domain transfers fatty acids onto phospholipids and CUE domain participates in ubiquitin binding or in recruitment of ubiquitin-conjugating enzymes to the site of dislocation.	45
270604	cd14421	CUE_AMFR	CUE domain found in autocrine motility factor receptor (AMFR) and similar proteins. AMFR is an internalizing cell surface glycoprotein that is localized in both plasma membrane caveolae and the endoplasmic reticulum (ER), and is involved in the regulation of cellular adhesion, proliferation, motility and apoptosis, as well as in the process of learning and memory. It is also called ER-protein gp78 that has been identified as a RING finger-dependent ubiquitin protein ligase (E3) implicated in degradation from the ER. AMFR contains an N-terminal RING-finger domain and a C-terminal CUE domain.	41
270605	cd14422	CUE_RIN3_plant	CUE domain found in plant E3 ubiquitin protein ligases RIN2, RIN3 and similar proteins. RIN2 and RIN3 are two closely related RPM1-interacting proteins conserved in higher eukaryotes. They are orthologs of the mammalian autocrine motility factor receptor (AMFR), a cytokine receptor localized in both plasma membrane caveolae and the endoplasmic reticulum (ER). RIN2 and RIN3 have been identified as membrane-bound RING-finger type ubiquitin ligases with six apparent transmembrane domains, a RING-finger domain and a CUE domain. They act as positive regulators of RPM1- and RPS2-dependent hypersensitive response (HR).	38
270606	cd14423	CUE_UBR5	CUE domain found in E3 ubiquitin-protein ligase UBR5 and similar proteins. UBR5, also called E3 ubiquitin-protein ligase, HECT domain-containing 1, hyperplastic discs protein homolog (HYD), progestin-induced protein, EDD, or Rat100, belongs to the E3 protein family of HECT (homologous to E6-AP C-terminus) ligases. It is frequently overexpressed in breast and ovarian cancer, suggesting a role in cancer development. UBR5 is involved in DNA-damage signaling. It can ubiquitinate DNA topoisomerase II-binding protein 1 (TopBP1) in the presence of the E2 enzyme UBCH4. It also activates the DNA-damage checkpoint kinase CHK2. Moreover, UBR5 interacts with the calcium and integrin-binding protein (CIB) in a DNA-damage-dependent manner. It functions as the substrate of the extracellular signal-regulated kinases (ERKs) 1 and 2. It also acts as a ubiquitin ligase that controls the levels of poly(A)-binding protein-interacting protein 2. In addition, UBR5 ubiquitinates and up-regulates beta-catenin, regulates transcription, and activates smooth-muscle differentiation through its ability to stabilize myocardin. UBR5 contains an N-terminal CUE domain, a zinc-finger-like domain termed the ubiquitin-recognin (UBR) box, a MLLE (mademoiselle) domain, and a C-terminal catalytic HECT domain.	47
270607	cd14424	CUE_Cue1p_like	CUE domain found in yeast ubiquitin-binding protein CUE1 (Cue1p), CUE4 (Cue4p) and similar proteins. Cue1p, also called coupling of ubiquitin conjugation to ER degradation protein 1 or kinetochore-defect suppressor 4, is encoded by the open reading frame (ORF) YMR264W in yeast. It is an N-terminally membrane-anchored endoplasmic reticulum (ER) protein essential for the activity of the two major yeast RING finger ubiquitin ligases (E3s) implicated in ER-associated degradation (ERAD). It interacts with the ERAD ubiquitin-conjugating enzyme (E2) Ubc7p in vivo, stimulates Ubc7p E2 activity, and further activates ER-associated protein degradation. Cue1p contains a CUE domain which binds ubiquitin much more weakly than those of other CUE domain containing proteins. It also has an Ubc7p binding-domain at the C-terminal region which is required for Ubc7p-dependent ubiquitylation and for degradation of substrates in the ER. This family also includes Cue4p, also called coupling of ubiquitin conjugation to ER degradation protein 4. It is encoded by the open reading frame (ORF) YML101C in yeast. Cue4p contains a CUE domain which shows high level of similarity with that of Cue1p.	37
270608	cd14425	UBA1_HR23A	UBA1 domain of UV excision repair protein RAD23 homolog A (HR23A) found in vertebrates. HR23A, also called Rad23A, is a DNA repair protein that binds to 19S subunit of the 26S proteasome and shuttles ubiquitinated proteins to the proteasome for degradation which is required for efficient nucleotide excision repair (NER), a primary mechanism for removing UV-induced DNA lesions. HR23A also plays a critical role in the interaction of HIV-1 viral protein R (Vpr) with proteasome, especially facilitating Vpr to promote protein poly-ubiquitination. HR23A contains an N-terminal ubiquitin-like (UBL) domain that binds proteasomes and two C-terminal ubiquitin-associated (UBA) domains that bind ubiquitin or multi-ubiquitinated substrates. In addition, it has a XPC protein-binding domain that might be necessary for its efficient NER function. This model corresponds to the UBA1 domain.	40
270609	cd14426	UBA1_HR23B	UBA1 domain of UV excision repair protein RAD23 homolog B (HR23B) found in vertebrates. HR23B, also called xeroderma pigmentosum group C (XPC) repair-complementing complex 58 kDa protein (p58), is tightly complexed with XPC protein to form the XPC-HR23B complex. Although it displays a high affinity for both single- and double-stranded DNA, the XPC-HR23B complex functions as a global genome repair (GGR)-specific repair factor that is specifically involved in global genome but not transcription-coupled nucleotide excision repair (NER). HR23B also interacts specifically with S5a subunit of the human 26 S proteasome, and plays an important role in shuttling ubiquitinated cargo proteins to the proteasome. HR23B contains an N-terminal ubiquitin-like (UBL) domain that binds proteasomes and two C-terminal ubiquitin-associated (UBA) domains that bind ubiquitin or multi-ubiquitinated substrates. In addition, it has a XPC protein-binding domain that might be necessary for its efficient NER function. This model corresponds to the UBA1 domain.	46
270610	cd14427	UBA2_HR23A	UBA2 domain of UV excision repair protein RAD23 homolog A (HR23A) found in vertebrates. HR23A, also called Rad23A, is a DNA repair protein that binds to 19S subunit of the 26S proteasome and shuttles ubiquitinated proteins to the proteasome for degradation which is required for efficient nucleotide excision repair (NER), a primary mechanism for removing UV-induced DNA lesions. HR23A also plays a critical role in the interaction of HIV-1 viral protein R (Vpr) with proteasome, especially facilitating Vpr to promote protein poly-ubiquitination. HR23A contains an N-terminal ubiquitin-like (UBL) domain that binds proteasomes and two C-terminal ubiquitin-associated (UBA) domains that bind ubiquitin or multi-ubiquitinated substrates. In addition, it has a XPC protein-binding domain that might be necessary for its efficient NER function. This model corresponds to the UBA2 domain.	41
270611	cd14428	UBA2_HR23B	UBA2 domain of UV excision repair protein RAD23 homolog B (HR23B) found in vertebrates. HR23B, also called xeroderma pigmentosum group C (XPC) repair-complementing complex 58 kDa protein (p58), is tightly complexed with XPC protein to form the XPC-HR23B complex. Although it displays a high affinity for both single- and double-stranded DNA, the XPC-HR23B complex functions as a global genome repair (GGR)-specific repair factor that is specifically involved in global genome but not transcription-coupled nucleotide excision repair (NER). HR23B also interacts specifically with S5a subunit of the human 26 S proteasome, and plays an important role in shuttling ubiquitinated cargo proteins to the proteasome. HR23B contains an N-terminal ubiquitin-like (UBL) domain that binds proteasomes and two C-terminal ubiquitin-associated (UBA) domains that bind ubiquitin or multi-ubiquitinated substrates. In addition, it has a XPC protein-binding domain that might be necessary for its efficient NER function. This model corresponds to the UBA2 domain.	45
259859	cd14435	SPO1_TF1_like	Bacteriophage SPO1-encoded TF1 binds and bends DNA. This group contains proteins related to Bacillus phage SPO1-encoded transcription factor 1 (TF1), a type II DNA-binding protein related to the DNA sequence specific (IHF) and non-specific (HU) domains. Type II DNA-binding proteins bind and bend DNA as dimers. Like IHF, TF1 binds DNA specifically and bends DNA sharply. Bacteriophage SPO1-encoded TF1 recognizes SPO1 phage DNA containing 5-(hydroxymethyl)-2'-deoxyuridine as opposed to thymine. Related family members include integration host factor (IHF) and HU, also called type II DNA-binding proteins (DNABII), which are small dimeric proteins that specifically bind the DNA minor groove, inducing large bends in the DNA and serving as architectural factors in a variety of cellular processes such as recombination, initiation of replication/transcription and gene regulation. IHF binds DNA in a sequence specific manner while HU displays little or no sequence preference. IHF homologs are usually heterodimers, while HU homologs are typically homodimers (except HU heterodimers from E. coli and other enterobacteria). HU is highly basic and contributes to chromosomal compaction and maintenance of negative supercoiling, thus often referred to as histone-like protein. IHF is an essential cofactor in phage lambda site-specific recombination, having an architectural role during assembly of specialized nucleoprotein structures (snups).	87
271226	cd14436	LepB	Legionella Rab1-specific GAP LepB. LepB of Legionella, a human pathogen, is a specific RabGAP for Rab1, a member of the largest subfamily of small GTPases. RabGTPases play a role in the control of vesicular trafficking and are switched off by GTPase-activating enzymes (GAPs) that stimulate the intrinsic GTP hydrolysis activity. Legionella LepB is unrelated to the TBC family of human Rab1 RabGAPs.	272
271227	cd14437	nt01cx_1156_like	Uncharacterized proteins conserved in Clostridia. Some members of this uncharacterized protein family have been annotated as putative lipoproteins. The structure resembles that of a partial beta-propeller (3 out of 6 blades), suggesting that family members might form dimers.	199
271228	cd14438	Hip_N	N-terminal dimerization domain of the Hsp70-interacting protein (Hip) and similar proteins. The Hsc70/Hsp70-interacting protein (Hip, also p48 or suppressor of tumorigenicity ST13) functions as a regulator of the cyclic action of Hsp70. Hip forms homodimers, and this model characterizes the N-terminal dimerization domain, which may not be directly involved in its regulatory function. A central domain of Hip that contains TPR repeats binds the ATPase domain of Hsp70 and slows the release of ADP.	41
270205	cd14439	AlgX_N_like	N-terminal catalytic domain of putative alginate O-acetyltransferase and similar proteins. The alginate biosynthesis protein AlgX appears to be directly involved in the O-acetylation of alginate, an exopolysaccharide that is associated with the formation of persistent biofilms, such as those by mucoid strains of Pseudomonas aeruginosa that affect patients suffering from cystic fibrosis. Its N-terminal catalytic domain resembles SGNH hydrolases, though with a permuted topology. The active site matches that of the SGNH hydrolases, is well conserved, and has been verified experimentally. This wider family includes AlgX, AlgJ, AlgV, and a number of uncharacterized families, some of which may have been mis-annotated in sequence databases.	316
270206	cd14440	AlgX_N_like_3	Uncharacterized proteins similar to putative alginate O-acetyltransferase. The alginate biosynthesis protein AlgX appears to be directly involved in the O-acetylation of alginate, an exopolysaccharide that is associated with the formation of persistent biofilms, such as those by mucoid strains of Pseudomonas aeruginosa that affect patients suffering from cystic fibrosis. Its N-terminal catalytic domain resembles SGNH hydrolases, though with a permuted topology. The active site matches that of the SGNH hydrolases, is well conserved, and has been verified experimentally. Members of this uncharacterized protein family resemble AlgX_N.	315
270207	cd14441	AlgX_N	N-terminal catalytic domain of the putative alginate O-acetyltransferase AlgX. The alginate biosynthesis protein AlgX appears to be directly involved in the O-acetylation of alginate, an exopolysaccharide that is associated with the formation of persistent biofilms, such as those by mucoid strains of Pseudomonas aeruginosa that affect patients suffering from cystic fibrosis. This N-terminal catalytic domain resembles SGNH hydrolases, though with a permuted topology. The active site matches that of the SGNH hydrolases, is well conserved, and has been verified experimentally. AlgX contains a C-terminal carbohydrate binding domain that belongs to the wider family of CBM6-CBM35-CBM36_like domains.	310
270208	cd14442	AlgJ_like	Putative alginate O-acetyltransferases AlgJ, AlgV, and similar proteins. The alginate biosynthesis protein AlgX appears to be directly involved in the O-acetylation of alginate, an exopolysaccharide that is associated with the formation of persistent biofilms, such as those by mucoid strains of Pseudomonas aeruginosa that affect patients suffering from cystic fibrosis. Its N-terminal catalytic domain resembles SGNH hydrolases, though with a permuted topology. The active site matches that of the SGNH hydrolases, is well conserved, and has been verified experimentally. Members of this family have been annotated as AlgJ or AlgV, and they closely resemble AlgX in sequence and function, although they lack the C-terminal carbohydrate binding domain of AlgX.	321
270209	cd14443	AlgX_N_like_2	Uncharacterized proteins similar to putative alginate O-acetyltransferase. The alginate biosynthesis protein AlgX appears to be directly involved in the O-acetylation of alginate, an exopolysaccharide that is associated with the formation of persistent biofilms, such as those by mucoid strains of Pseudomonas aeruginosa that affect patients suffering from cystic fibrosis. Its N-terminal catalytic domain resembles SGNH hydrolases, though with a permuted topology. The active site matches that of the SGNH hydrolases, is well conserved, and has been verified experimentally. Some members of this uncharacterized family, which resembles AlgX_N, have been annotated as twin-arginine translocation signal, although they share little or no similarity with experimentally characterized proteins that bear the same name.	313
270210	cd14444	AlgX_N_like_1	Uncharacterized proteins similar to putative alginate O-acetyltransferase. The alginate biosynthesis protein AlgX appears to be directly involved in the O-acetylation of alginate, an exopolysaccharide that is associated with the formation of persistent biofilms, such as those by mucoid strains of Pseudomonas aeruginosa that affect patients suffering from cystic fibrosis. Its N-terminal catalytic domain resembles SGNH hydrolases, though with a permuted topology. The active site matches that of the SGNH hydrolases, is well conserved, and has been verified experimentally. Members of this uncharacterized family similar to AlgX_N have been annotated as cell division proteins FtsQ, although they share little or no similarity with experimentally characterized members of the FtsQ family.	298
271220	cd14445	RILP-like	Rab interacting lysosomal protein-like 1 and 2 (Rilpl1 and Rilpl2). This domain is found in Rab interacting lysosomal protein-like 1 and 2, and appears to be conserved in Bilateria. The Rilp-like proteins regulate the concentration of ciliary membrane proteins in the primary cilium. Rilpl2 interacts with myosin-Va and has been linked to the regulation of cellular morphology in neurons; it forms a complex with Rac1 and activates Rac1-Pak signaling, dependent on myosin-Va.	89
271219	cd14446	bt3222_like	Uncharacterized proteins similar to Bacteroides thetaiotaomicron bt3222. This family appears to be specific to Bacteroidetes; the two-domain protein forms a homodimer.	266
269894	cd14447	SPX	Domain found in Syg1, Pho81, XPR1, and related proteins. This region has been named the SPX domain after (Syg1, Pho81 and XPR1). This domain is found at the amino terminus of a variety of proteins. In the yeast protein Syg1, the N-terminus directly binds to the G-protein beta subunit and inhibits transduction of the mating pheromone signal. Similarly, the N-terminus of the human XPR1 protein binds directly to the beta subunit of the G-protein heterotrimer leading to increased production of cAMP. These findings suggest that members of this family are involved in G-protein associated signal transduction. The N-termini of several proteins involved in the regulation of phosphate transport, including the putative phosphate level sensors Pho81 from Saccharomyces cerevisiae and NUC-2 from Neurospora crassa, are also members of this family. The SPX domain of S. cerevisiae low-affinity phosphate transporters Pho87 and Pho90 auto-regulates uptake and prevents efflux. This SPX dependent inhibition is mediated by the physical interaction with Spl2. NUC-2 contains several ankyrin repeats. Several members of this family are annotated as XPR1 proteins: the xenotropic and polytropic retrovirus receptor confers susceptibility to infection with xenotropic and polytropic murine leukaemia viruses (MLV). Infection by these retroviruses can inhibit XPR1-mediated cAMP signaling and result in cell toxicity and death. The similarity between Syg1, phosphate regulators and XPR1 sequences has been previously noted, as has the additional similarity to several predicted proteins, of unknown function, from Drosophila melanogaster, Arabidopsis thaliana, Caenorhabditis elegans, Schizosaccharomyces pombe, S. cerevisiae, and many other diverse organisms.	143
259990	cd14448	CuRO_2_BOD_CotA_like	Cupredoxin domain 2 of Bilirubin oxidase (BOD), the bacterial endospore coat component CotA, and similar proteins. Bilirubin oxidase (BOD) catalyzes the oxidation of bilirubin to biliverdin and the four-electron reduction of molecular oxygen to water. CotA protein is an abundant component of the outer coat layer in bacterial endospore coat and is required for spore resistance against hydrogen peroxide and UV light. Also included in this subfamily are phenoxazinone synthase (PHS), which catalyzes the oxidative coupling of substituted o-aminophenols to produce phenoxazinones, and FtsP (also named SufI), which is a component of the cell division apparatus. These proteins are laccase-like multicopper oxidases (MCOs) that are able to couple oxidation of substrates with reduction of dioxygen to water. MCOs are capable of oxidizing a vast range of substrates, varying from aromatic compounds to inorganic compounds such as metals. Although the members of this family have diverse functions, the majority of them have three cupredoxin domain repeats. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper center and transferring them to the active site trinuclear copper center. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper.	144
259991	cd14449	CuRO_1_2DMCO_NIR_like_2	The cupredoxin domain 1 of a two-domain laccase related to nitrite reductase. The two-domain laccase (small laccase) in this family differs significantly from all laccases. It resembles the two domain nitrite reductase in both sequence and structure. It consists of two cupredoxin domains and forms trimers, and hence resembles the quaternary structure of nitrite reductases more than that of large laccases. There are three trinuclear copper clusters in the enzyme localized between domains 1 and 2 of each pair of neighbor chains. Laccase is a blue multi-copper enzyme that catalyzes the oxidation of a variety of organic substrates coupled to the reduction of molecular oxygen to water. It displays broad substrate specificity, catalyzing the oxidation of a wide variety of aromatic, notably phenolic, and inorganic substances. Laccase has been implicated in a wide spectrum of biological activities. This subfamily has lost the type 1 (T1) copper binding site in domain 1 that is present in other two-domain laccases.	135
259992	cd14450	CuRO_3_FV_like	The third cupredoxin domain of coagulation factor V and similar proteins. Factor V is an essential coagulation protein with both pro- and anti-coagulant functions. Aberrant expression of human factor V can lead to bleeding or thromboembolic disease, which may be life-threatening. Bovine factor Va serves as the cofactor in the prothrombinase complex that results in a 300,000-fold increase in the rate of thrombin generation. Factor V is synthesized as a single polypeptide with six cupredoxin domains and a domain structure of 1-2-3-4-B-5-6-C1-C2, where 1-6 are cupredoxin domains, B is a domain with no known structural homologs and is dispensable for coagulant activity, and C are domains distantly related to discoidin protein-fold family members. Factor V has little activity prior to proteolytic cleavage by thrombin or FXa upon secretion. The resulting Factor Va is a heterodimer consisting of a heavy chain (1-2-3-4) and a light chain (5-6-C1-C2). This model represents the cupredoxin domain 3 of unprocessed Factor V or the heavy chain of Factor Va, and similar proteins including pseutarin C non-catalytic subunit. Pseutarin C is a prothrombin activator from Pseudonaja textilis venom.	181
259993	cd14451	CuRO_5_FV_like	The fifth cupredoxin domain of coagulation factor V and similar proteins. Factor V is an essential coagulation protein with both pro- and anti-coagulant functions. Aberrant expression of human factor V can lead to bleeding or thromboembolic disease, which may be life-threatening. Bovine factor Va serves as the cofactor in the prothrombinase complex that results in a 300,000-fold increase in the rate of thrombin generation. Factor V is synthesized as a single polypeptide with six cupredoxin domains and a domain structure of 1-2-3-4-B-5-6-C1-C2, where 1-6 are cupredoxin domains, B is a domain with no known structural homologs and is dispensable for coagulant activity, and C are domains distantly related to discoidin protein-fold family members. Factor V has little activity prior to proteolytic cleavage by thrombin or FXa upon secretion. The resulting Factor Va is a heterodimer consisting of a heavy chain (1-2-3-4) and a light chain (5-6-C1-C2). This model represents the cupredoxin domain 5 of unprocessed Factor V or the first cupredoxin domain of the light chain of coagulation factor Va, and similar proteins including pseutarin C non-catalytic subunit. Pseutarin C is a prothrombin activator from Pseudonaja textilis venom.	173
259994	cd14452	CuRO_1_FVIII_like	The first cupredoxin domain of coagulation factor VIII and similar proteins. Factor VIII functions in the factor X-activating complex of the intrinsic coagulation pathway. It facilitates blood clotting by acting as a cofactor for factor IXa. In the presence of Ca2+ and phospholipids, Factor VIII and IXa form a complex that converts factor X to the activated form Xa. A variety of mutations in the Factor VIII gene can cause hemophilia A, which typically requires replacement therapy with purified protein. Factor VIII is synthesized as a single polypeptide with six cupredoxin domains and a domain structure of 1-2-3-4-B-5-6-C1-C2, where 1-6 are cupredoxin domains, B is a domain with no known structural homologs and is dispensable for coagulant activity, and C are domains distantly related to discoidin protein-fold family members. Factor VIII is initially processed through proteolysis to generate a heterodimer consisting of a heavy chain (1-2-3-4) and a light chain (5-6-C1-C2), which circulates in a tight complex with von Willebrand factor (VWF). Further processing of the heavy chain produces activated factor VIIIa, a heterotrimer composed of polypeptides (1-2), (3-4), and the light chain. This model represents the cupredoxin domain 1 of unprocessed Factor VIII or the heavy chain of circulating Factor VIII, and similar proteins.	173
259995	cd14453	CuRO_2_FV_like	The second cupredoxin domain of coagulation factor V and similar proteins. Factor V is an essential coagulation protein with both pro- and anti-coagulant functions. Aberrant expression of human factor V can lead to bleeding or thromboembolic disease, which may be life-threatening. Bovine factor Va serves as the cofactor in the prothrombinase complex that results in a 300,000-fold increase in the rate of thrombin generation. Factor V is synthesized as a single polypeptide with six cupredoxin domains and a domain structure of 1-2-3-4-B-5-6-C1-C2, where 1-6 are cupredoxin domains, B is a domain with no known structural homologs and is dispensable for coagulant activity, and C are domains distantly related to discoidin protein-fold family members. Factor V has little activity prior to proteolytic cleavage by thrombin or FXa upon secretion. The resulting Factor Va is a heterodimer consisting of a heavy chain (1-2-3-4) and a light chain (5-6-C1-C2). This model represents the cupredoxin domain 2 of unprocessed Factor V or the heavy chain of Factor Va, and similar proteins including pseutarin C non-catalytic subunit. Pseutarin C is a prothrombin activator from Pseudonaja textilis venom.	123
259996	cd14454	CuRO_4_FV_like	The fourth cupredoxin domain of coagulation factor V and similar proteins. Factor V is an essential coagulation protein with both pro- and anti-coagulant functions. Aberrant expression of human factor V can lead to bleeding or thromboembolic disease, which may be life-threatening. Bovine factor Va serves as the cofactor in the prothrombinase complex that results in a 300,000-fold increase in the rate of thrombin generation. Factor V is synthesized as a single polypeptide with six cupredoxin domains and a domain structure of 1-2-3-4-B-5-6-C1-C2, where 1-6 are cupredoxin domains, B is a domain with no known structural homologs and is dispensable for coagulant activity, and C are domains distantly related to discoidin protein-fold family members. Factor V has little activity prior to proteolytic cleavage by thrombin or FXa upon secretion. The resulting Factor Va is a heterodimer consisting of a heavy chain (1-2-3-4) and a light chain (5-6-C1-C2). This model represents the cupredoxin domain 4 of unprocessed Factor V or the heavy chain of Factor Va, and similar proteins including pseutarin C non-catalytic subunit. Pseutarin C is a prothrombin activator from Pseudonaja textilis venom.	144
259997	cd14455	CuRO_6_FV_like	The sixth cupredoxin domain of coagulation factor V and similar proteins. Factor V is an essential coagulation protein with both pro- and anti-coagulant functions. Aberrant expression of human factor V can lead to bleeding or thromboembolic disease, which may be life-threatening. Bovine factor Va serves as the cofactor in the prothrombinase complex that results in a 300,000-fold increase in the rate of thrombin generation. Factor V is synthesized as a single polypeptide with six cupredoxin domains and a domain structure of 1-2-3-4-B-5-6-C1-C2, where 1-6 are cupredoxin domains, B is a domain with no known structural homologs and is dispensable for coagulant activity, and C are domains distantly related to discoidin protein-fold family members. Factor V has little activity prior to proteolytic cleavage by thrombin or FXa upon secretion. The resulting Factor Va is a heterodimer consisting of a heavy chain (1-2-3-4) and a light chain (5-6-C1-C2). This model represents the cupredoxin domain 6 of unprocessed Factor V or the second cupredoxin domain of the light chain of coagulation factor Va, and similar proteins including pseutarin C non-catalytic subunit. Pseutarin C is a prothrombin activator from Pseudonaja textilis venom.	140
271218	cd14456	Menin	Scaffolding protein menin encoded by the MEN1 gene. MEN1 is the gene responsible for multiple endocrine neoplasia type 1, and it has been characterized as a tumor suppressor gene that encodes a protein called menin. Menin is mostly found in the nucleus and can regulate gene expression in a positive and in a negative way, and it has been shown to interact with transcription activators, transcription repressors, cell signaling proteins, and various other proteins. It plays major roles in DNA repair, the regulation of the cell cycle, and chromatin remodeling.	437
271217	cd14458	DP_DD	Dimerization domain of DP. DP functions as a binding partner for E2F transcription factors. DP and E2F form heterodimers and play important roles in regulating genes involved in DNA synthesis, cell cycle progression, proliferation and apoptosis. The transcriptional activity of E2F is inhibited by the retinoblastoma protein (Rb) which binds to the E2F-DP heterodimer, blocks the transactivation domain, and negatively regulates the G1-S transition. DP is distantly related to E2F. In humans, there are at least six closely related E2F and two DP family members, all containing a DNA binding domain, a coiled-coil (CC) region, and a marked-box domain. E2F1 to E2F5 also contain a C-terminal transactivation domain.	105
270615	cd14472	mltA_B_like	Domain B insert of mltA_like lytic transglycosylases. Escherichia coli MltA is a membrane-bound lytic transglycosylase comprised of two domains separated by a large groove, where the peptidoglycan strand binds. Domain A is made up of an N-terminal and a C-terminal portion, which correspond to the 3D domain, named for 3 conserved aspartate residues. Domain B is inserted within the linear sequence of domain A. MltA is distinct from other bacterial lytic transglycosylases (LTs), which are similar to each other. Escherichia coli peptidoglycan lytic transglycosylase (LT) initiates cell wall recycling in response to damage, during bacterial fission, and cleaves peptidoglycan (PG) to create functional spaces in its wall. PG chains (also known as murein), the major components of the bacterial cell wall, are comprised of alternating beta-1-4-linked N-acetylmuramic acid (MurNAc) and N-acetyl-D-glucosamine (GlcNAc), and lytic transglycosylases cleave this beta-1-4 bond. Typically, peptidoglycan lytic transglycosylases (LT) are exolytic, releasing Metabolite 1 (GlcNAc-anhMurNAc-L-Ala-D-Glu-m-Dap-D-Ala-D-Ala) from the ends of the PG strands. In contrast, MltE is endolytic, cleaving in the middle of PG strands, with further processing to Metabolite 1 accomplished by other LTs. In E. coli, there are six membrane-bound LTs: MltA-MltF and soluble Slt70. Slt35 is a soluble fragment cleaved from MltB. Bacterial LTs are classified in 4 families: Family 1 includes slt70 MltC-MltF, Family 2 includes MltA, Family 3 includes MltB, and Family 4 of bacteriophage origin. While most of the LT family members are similar in structure and sequence with a lysozyme-like fold, Family 2 (including mltA) is distinct.	134
271216	cd14473	FERM_B-lobe	FERM domain B-lobe. The FERM domain has a cloverleaf tripart structure (FERM_N, FERM_M, FERM_C/N, alpha-, and C-lobe/A-lobe, B-lobe, C-lobe/F1, F2, F3). The FERM domain is found in the cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases, the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the pleckstrin homology (PH) and phosphotyrosine binding (PTB) domains and consequently is capable of binding to both peptides and phospholipids at different sites.	99
269895	cd14474	SPX_YDR089W	SPX domain of the yeast protein YDR089W and related proteins. This region has been named the SPX domain after (Syg1, Pho81 and XPR1). The domain is found at the amino terminus of a variety of proteins. The uncharacterized yeast protein YDR089W has not been shown to be involved in phosphate homeostasis, in contrast to most of the other SPX-domain containing proteins.	144
269896	cd14475	SPX_SYG1_like	SPX domain of the yeast plasma protein Syg1 and related proteins. This region has been named the SPX domain after (Syg1, Pho81 and XPR1). The domain is found at the amino terminus of a variety of proteins. In the yeast protein Syg1, the N-terminus binds directly to the G-protein beta subunit and inhibits transduction of the mating pheromone signal, and it co-occurs with a C-terminal domain from the EXS family.	139
269897	cd14476	SPX_PHO1_like	SPX domain of the plant protein PHOSPHATE1 (PHO1). This region has been named the SPX domain after (Syg1, Pho81 and XPR1). The domain is found at the amino terminus of a variety of proteins. The PHO1 gene family conserved in plants is involved in a variety of processes, most notably the transport of inorganic phosphate from the root to the shoot of the plant and mediating the response to low levels of inorganic phosphate. More recently it has become evident that PHO1 gene families have diverged in various plants and may play roles in stress response as well as the stomatal response to abscisic acid.	139
269898	cd14477	SPX_XPR1_like	SPX domain of the xenotropic and polytropic retrovirus receptor 1 (XPR1) and related proteins. This region has been named the SPX domain after (Syg1, Pho81 and XPR1). The domain is found at the amino terminus of a variety of proteins. The N-terminus of the human XPR1 protein (xenotropic and polytropic retrovirus receptor 1) binds directly to the beta subunit of the G-protein heterotrimer leading to increased production of cAMP. These findings suggest that all members of this family are involved in G-protein associated signal transduction. Several members of this family are annotated as XPR1 proteins: the xenotropic and polytropic retrovirus receptor confers susceptibility to infection with xenotropic and polytropic murine leukaemia viruses (MLV). Infection by these retroviruses can inhibit XPR1-mediated cAMP signaling and result in cell toxicity and death. Similarity between Syg1, phosphate regulators and XPR1 sequences has been previously noted, as has the additional similarity to several predicted proteins, of unknown function, from Drosophila melanogaster, Arabidopsis thaliana, Caenorhabditis elegans, Schizosaccharomyces pombe, Saccharomyces cerevisiae, and many other diverse organisms.	161
269899	cd14478	SPX_PHO87_PHO90_like	SPX domain of the phosphate transporters Pho87, Pho90, Pho91, and related proteins. This region has been named the SPX domain after (Syg1, Pho81 and XPR1). The domain is found at the amino terminus of a variety of proteins. The SPX domain of the Saccharomyces cerevisiae membrane-localized low-affinity phosphate transporters Pho87 and Pho90 auto-regulates uptake and prevents efflux. This SPX dependent inhibition is mediated by the physical interaction with Spl2. Pho91 is involved in the export of inorganic phosphate from the vacuole to the cytosol. While both Pho87 and Pho90 transport phosphate into the cell, only Pho87 appears to also function as a sensor for high extracellular phosphate concentrations.	148
269900	cd14479	SPX-MFS_plant	SPX domain of proteins found in plants and stramenopiles; most have a C-terminal MFS domain. This region has been named the SPX domain after (Syg1, Pho81 and XPR1). The SPX domain is found at the amino terminus of a variety of proteins. This family, mostly found in plants, contains a C-terminal MFS domain (major facilitator superfamily), suggesting a function as a secondary transporter. The function of this N-terminal region is unclear, although it might be involved in regulating transport.	140
269901	cd14480	SPX_VTC2_like	SPX domain of the vacuolar transport chaperone Vtc2 and similar proteins. This region has been named the SPX domain after (Syg1, Pho81 and XPR1). The domain is found at the amino terminus of a variety of proteins. Vtc2 is part of the Saccharomyces cerevisiae membrane-integral VTC complex, together with Vtc1, Vtc3, and Vtc4. It contains an N-terminal SPX domain next to a central polyphosphate polymerase domain and a C-terminal domain of unknown function.	135
269902	cd14481	SPX_AtSPX1_like	SPX domain of the plant protein SPX1 and similar proteins. This region has been named the SPX domain after (Syg1, Pho81 and XPR1). The domain is found at the amino terminus of a variety of proteins. This family of plant proteins contains a single SPX domain. Arabidopsis thaliana SPX1 and SPX3 have been reported to play roles in the adaptation to low-phosphate conditions; SPX3 may be involved in the regulation of SPX1 activity. Oryza sativa SPX1 suppresses the regulation of expression of OsPT2, a low-affinity phosphate transporter, by the MYB-like OsPHR2.	149
269903	cd14482	SPX_BAH1-like	SPX domain of the E3 ubiquitin-protein ligase BAH1/NLA and similar proteins. This region has been named the SPX domain after (Syg1, Pho81 and XPR1). The domain is found at the amino terminus of a variety of proteins. BAH1 (benzoic acid hypersensitive 1) appears to function as an E3 ubiquitin ligase; the protein contains an SPX and a RING finger domain. It has been suggested that BAH1/NLA is involved in the regulation of plant immune responses, probably via a pathway of salicylic acid biosynthesis that includes benzoic acid as an intermediate.	156
269904	cd14483	SPX_PHO81_NUC-2_like	SPX domain of Pho81, NUC-2, and similar proteins. This region has been named the SPX domain after (Syg1, Pho81 and XPR1). The domain is found at the amino terminus of a variety of proteins. The N-termini of several proteins involved in the regulation of phosphate transport, including the putative phosphate level sensors Pho81 from Saccharomyces cerevisiae and NUC-2 from Neurospora crassa, are also members of this family. NUC-2 plays an important role in the phosphate-regulated signal transduction pathway in N. crassa. It shows high similarity to a cyclin-dependent kinase inhibitory protein Pho81, which is part of the phosphate regulatory cascade in S. cerevisiae. Both NUC-2 and Pho81 have multi-domain architecture, including the SPX N-terminal domain followed by several ankyrin repeats and a putative C-terminal glycerophosphodiester phosphodiesterase domain (GDPD) with unknown function.	162
269905	cd14484	SPX_GDE1_like	SPX domain of Gde1 and similar proteins. This region has been named the SPX domain after (Syg1, Pho81 and XPR1). The domain is found at the amino terminus of a variety of proteins. The N-termini of several proteins involved in the regulation of phosphate transport, including the putative phosphate level sensors Pho81 from Saccharomyces cerevisiae and NUC-2 from Neurospora crassa, are also members of this family. The yeast protein Gde1/Ypl110c is similar to both NUC-2 and Pho81 in sharing their multi-domain architecture, which includes the SPX N-terminal domain followed by several ankyrin repeats and a C-terminal glycerophosphodiester phosphodiesterase domain (GDPD). Gde1 hydrolyzes intracellular glycerophosphocholine into glycerolphosphate and choline, and plays a role in the utilization of glycerophosphocholine as a source for phosphate.	134
270618	cd14485	mltA_like_LT_A	Domain A of MltA and related lytic transglycosylase; domain A is interrupted by domain B. Escherichia coli MltA is a membrane-bound lytic transglycosylase comprised of two domains separated by a large groove, where the peptidoglycan strand binds. Domain A is made up of an N-terminal and a C-terminal portion, which correspond to the 3D domain, named for 3 conserved aspartate residues. Domain B is inserted within the linear sequence of domain A. MltA is distinct from other bacterial lytic transglycosylases (LTs), which are similar to each other. Escherichia coli peptidoglycan lytic transglycosylase (LT) initiates cell wall recycling in response to damage, during bacterial fission, and cleaves peptidoglycan (PG) to create functional spaces in its wall. PG chains (also known as murein), the major components of the bacterial cell wall, are comprised of alternating beta-1-4-linked N-acetylmuramic acid (MurNAc) and N-acetyl-D-glucosamine (GlcNAc), and lytic transglycosylases cleave this beta-1-4 bond. Typically, peptidoglycan lytic transglycosylases (LT) are exolytic, releasing Metabolite 1 (GlcNAc-anhMurNAc-L-Ala-D-Glu-m-Dap-D-Ala-D-Ala) from the ends of the PG strands. In contrast, MltE is endolytic, cleaving in the middle of PG strands, with further processing to Metabolite 1 accomplished by other LTs. In E. coli, there are six membrane-bound LTs: MltA-MltF and soluble Slt70. Slt35 is a soluble fragment cleaved from MltB. Bacterial LTs are classified in 4 families: Family 1 includes slt70 MltC-MltF, Family 2 includes MltA, Family 3 includes MltB, and Family 4 of bacteriophage origin. While most of the LT family members are similar in structure and sequence with a lysozyme-like fold, Family 2 (including mltA) is distinct.	159
270619	cd14486	3D_domain	3D domain, named for 3 conserved aspartate residues, is found in mltA-like lytic transglycosylases and numerous other contexts. This family contains the 3D domain, named for its 3 conserved aspartates. It is found in conjunction with numerous other domains such as MltA (membrane-bound lytic murein transglycosylase A). These aspartates are critical active site residues of mltA-like lytic transglycosylases. Escherichia coli peptidoglycan lytic transglycosylase (LT) initiates cell wall recycling in response to damage, during bacterial fission, and cleaves peptidoglycan (PG) to create functional spaces in its wall. MltA has 2 domains, separated by a large groove, where the peptidoglycan strand binds. The C-terminus has a double-psi beta barrel fold within the 3D domain, which forms the larger A domain along with the N-terminal region of Mlts, but is also found in various other domain architectures. Peptidoglycan (also known as murein) chains, the primary structural component of bacterial cell walls, are comprised of alternating beta-1-4-linked N-acetylmuramic acid (MurNAc) and N-acetyl-D-glucosamine (GlcNAc); lytic transglycosylases (LTs) cleave this beta-1-4 bond. Typically, LTs are exolytic, releasing Metabolite 1 (GlcNAc-anhMurNAc-L-Ala-D-Glu-m-Dap-D-Ala-D-Ala) from the ends of the PG strands. In contrast, membrane-bound lytic murein transglycosylase E (MltE) is endolytic, cleaving in the middle of PG strands, with further processing to Metabolite 1 accomplished by other LTs. In E. coli, there are six membrane-bound LTs: MltA-MltF and soluble Slt70. Slt35 is a soluble fragment cleaved from MltB. Bacterial LTs are classified in 4 families: Family 1 includes slt70 MltC-MltF, Family 2 includes MltA, Family 3 includes MltB, and family 4 of bacteriophage origin. While most LTs are related members of the lysozyme-like lytic transglycosylase family, MltA represents a distinct fold and sequence conservation.	104
271153	cd14487	AlgX_C	C-terminal carbohydrate-binding domain of the alginate O-acetyltransferase AlgX. The alginate biosynthesis protein AlgX appears to be directly involved in the O-acetylation of alginate, an exopolysaccharide that is associated with the formation of persistent biofilms, such as those by mucoid strains of Pseudomonas aeruginosa that affect patients suffering from cystic fibrosis. The N-terminal catalytic domain resembles SGNH hydrolases, though with a permuted topology. The active site matches that of the SGNH hydrolases, is well conserved, and has been verified experimentally. AlgX contains a C-terminal carbohydrate binding domain that belongs to the wider family of CBM6-CBM35-CBM36_like domains.	128
271154	cd14488	CBM6-CBM35-CBM36_like_2	uncharacterized members of the carbohydrate binding module 6 (CBM6) and CBM35_like superfamily. Carbohydrate binding module family 6 (CBM6, family 6 CBM), also known as cellulose binding domain family VI (CBD VI), and related CBMs (CBM35 and CBM36). These are non-catalytic carbohydrate binding domains found in a range of enzymes that display activities against a diverse range of carbohydrate targets, including mannan, xylan, beta-glucans, cellulose, agarose, and arabinans. These domains facilitate the strong binding of the appended catalytic modules to their dedicated, insoluble substrates. Many of these CBMs are associated with glycoside hydrolase (GH) domains. CBM6 is an unusual CBM as it represents a chimera of two distinct binding sites with different modes of binding: binding site I within the loop regions and binding site II on the concave face of the beta-sandwich fold. CBM36s are calcium-dependent xylan binding domains. CBM35s display conserved specificity through extensive sequence similarity, but divergent function through their appended catalytic modules.	132
271155	cd14489	CBM_SBP_bac_1_like	Putative Carbohydrate Binding Module (CBM) of extracellular solute-binding protein family 1. Domains in this family co-occur with extracellular solute-binding domains which are periplasmic components of ABC-type sugar transport systems involved in carbohydrate transport and metabolism. Carbohydrate binding modules of family 6 (CBM6), also known as cellulose binding domain family VI (CBD VI), and related CBMs (CBM35 and CBM36) are non-catalytic carbohydrate binding domains found in a range of enzymes that display activities against a diverse range of carbohydrate targets, including mannan, xylan, beta-glucans, cellulose, agarose, and arabinans. These domains facilitate the strong binding of co-occurring (catalytic) modules to their insoluble substrates.	150
271156	cd14490	CBM6-CBM35-CBM36_like_1	uncharacterized members of the carbohydrate binding module 6 (CBM6) and CBM35_like superfamily. Carbohydrate binding module family 6 (CBM6, family 6 CBM), also known as cellulose binding domain family VI (CBD VI), and related CBMs (CBM35 and CBM36). These are non-catalytic carbohydrate binding domains found in a range of enzymes that display activities against a diverse range of carbohydrate targets, including mannan, xylan, beta-glucans, cellulose, agarose, and arabinans. These domains facilitate the strong binding of the appended catalytic modules to their dedicated, insoluble substrates. Many of these CBMs are associated with glycoside hydrolase (GH) domains. CBM6 is an unusual CBM as it represents a chimera of two distinct binding sites with different modes of binding: binding site I within the loop regions and binding site II on the concave face of the beta-sandwich fold. CBM36s are calcium-dependent xylan binding domains. CBM35s display conserved specificity through extensive sequence similarity, but divergent function through their appended catalytic modules.	156
381186	cd14491	lipocalin_MxiM-like	Shigella pilot protein MxiM and similar proteins. Shigella flexneri MxiM is a pilot protein for S. flexneri MxiD, an outer membrane (OM)-associated ring-forming secretin and component of the type-III secretion system. MxiM, also an OM protein, binds lipids and MxiD. MxiM binds and affects several features of the secretin MxiD, including its stability in the periplasm, OM association, as well as assembly into multimeric structure. This group belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	134
381187	cd14492	lipocalin_MxiM-like	MxiM-like lipocalin found in Bacteroidetes. Uncharacterized proteins in this family conserved in Bacteroidetes are similar to Shigella flexneri MxiM, a pilot protein for S. flexneri MxiD, an outer membrane (OM)-associated ring-forming secretin and component of the type-III secretion system. MxiM, also an OM protein, binds lipids and MxiD. This subgroup belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	142
381188	cd14493	lipocalin_MxiM	Shigella pilot protein MxiM. Shigella flexneri MxiM is a pilot protein for S. flexneri MxiD, an outer membrane (OM)-associated ring-forming secretin. MxiM, also an OM protein, binds lipids and MxiD. MxiD is a component of the type-III secretion system that translocates proteins through both membranes of gram-negative bacterial pathogens into host cells and requires the formation of an integral OM secretin ring. MxiM binds and affects several features of the secretin MxiD, including its stability in the periplasm, OM association, as well as assembly into multimeric structure. MxiM belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	115
350344	cd14494	PTP_DSP_cys	cys-based protein tyrosine phosphatase and dual-specificity phosphatase superfamily. This superfamily is composed of cys-based phosphatases, which includes classical protein tyrosine phosphatases (PTPs) as well as dual-specificity phosphatases (DUSPs or DSPs). They are characterized by a CxxxxxR conserved catalytic loop (where C is the catalytic cysteine, x is any amino acid, and R is an arginine). PTPs are part of the tyrosine phosphorylation/dephosphorylation regulatory mechanism, and are important in the response of the cells to physiologic and pathologic changes in their environment. DUSPs show more substrate diversity (including RNA and lipids) and include pTyr, pSer, and pThr phosphatases.	113
350345	cd14495	PTPLP-like	Protein tyrosine phosphatase-like domains of phytases and similar domains. This subfamily contains the tandem protein tyrosine phosphatase (PTP)-like domains of protein tyrosine phosphatase-like phytases (PTPLPs) and similar domains including the PTP domain of Pseudomonas syringae tyrosine-protein phosphatase hopPtoD2. PTPLPs, also known as cysteine phytases, are one of four known classes of phytases, enzymes that degrade phytate (inositol hexakisphosphate [InsP(6)]) to less-phosphorylated myo-inositol derivatives. Phytate is the most abundant cellular inositol phosphate and plays important roles in a broad scope of cellular processes, including DNA repair, RNA processing and export, development, apoptosis, and pathogenicity. PTPLPs adopt a PTP fold, including the active-site signature sequence (CX5R(S/T)) and utilize a classical PTP reaction mechanism. However, these enzymes display no catalytic activity against classical PTP substrates due to several unique structural features that confer specificity for myo-inositol polyphosphates.	278
350346	cd14496	PTP_paladin	protein tyrosine phosphatase-like domains of paladin. Paladin is a putative phosphatase, which in mouse is expressed in endothelial cells during embryonic development and in arterial smooth muscle cells in adults. It has been suggested to be an antiphosphatase that regulates the activity of specific neural crest regulatory factors and thus, modulates neural crest cell formation and migration. Paladin contains two protein tyrosine phosphatase (PTP)-like domains. This model represents both repeats.	185
350347	cd14497	PTP_PTEN-like	protein tyrosine phosphatase-like domain of phosphatase and tensin homolog and similar proteins. Phosphatase and tensin homolog (PTEN) is a tumor suppressor that acts as a dual-specificity protein phosphatase and as a lipid phosphatase. It dephosphorylates phosphoinositide trisphosphate. In addition to PTEN, this family includes tensins, voltage-sensitive phosphatases (VSPs), and auxilins. They all contain a protein tyrosine phosphatase-like domain although not all are active phosphatases. Tensins are intracellular proteins that act as links between the extracellular matrix and the cytoskeleton, and thereby mediate signaling for cell shape and motility, and they may or may not have phosphatase activity. VSPs are phosphoinositide phosphatases with substrates that include phosphatidylinositol-4,5-diphosphate and phosphatidylinositol-3,4,5-trisphosphate. Auxilins are J domain-containing proteins that facilitate Hsc70-mediated dissociation of clathrin from clathrin-coated vesicles, and they do not exhibit phosphatase activity.	160
350348	cd14498	DSP	dual-specificity phosphatase domain. The dual-specificity phosphatase domain is found in typical and atypical dual-specificity phosphatases (DUSPs), which function as protein-serine/threonine phosphatases (EC 3.1.3.16) and protein-tyrosine-phosphatases (EC 3.1.3.48). Typical DUSPs, also called mitogen-activated protein kinase (MAPK) phosphatases (MKPs), deactivate MAPKs by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. All MKPs contain an N-terminal Cdc25/rhodanese-like domain, which is responsible for MAPK-binding, and a C-terminal catalytic dual specificity phosphatase domain. Atypical DUSPs contain the catalytic dual specificity phosphatase domain but lack the N-terminal Cdc25/rhodanese-like domain that is present in typical DUSPs or MKPs. Also included in this family are dual specificity phosphatase-like domains of catalytically inactive members such as serine/threonine/tyrosine-interacting protein (STYX) and serine/threonine/tyrosine interacting like 1 (STYXL1), as well as active phosphatases with substrates that are not phosphoproteins such as PTP localized to the mitochondrion 1 (PTPMT1), which is a lipid phosphatase, and laforin, which is a glycogen phosphatase.	135
350349	cd14499	CDC14_C	C-terminal dual-specificity phosphatase domain of CDC14 family proteins. The cell division control protein 14 (CDC14) family is highly conserved in all eukaryotes, although the roles of its members seem to have diverged during evolution. Yeast Cdc14, the best characterized member of this family, is a dual-specificity phosphatase that plays key roles in cell cycle control. It preferentially dephosphorylates cyclin-dependent kinase (CDK) targets, which makes it the main antagonist of CDK in the cell. Cdc14 functions at the end of mitosis and it triggers the events that completely eliminate the activity of CDK and other mitotic kinases. It is also involved in coordinating the nuclear division cycle with cytokinesis through the cytokinesis checkpoint, and in chromosome segregation. Cdc14 phosphatases also function in DNA replication, DNA damage checkpoint, and DNA repair. Vertebrates may contain more than one Cdc14 homolog; humans have three (CDC14A, CDC14B, and CDC14C). CDC14 family proteins contain a highly conserved N-terminal pseudophosphatase domain that contributes to substrate specificity and a C-terminal catalytic dual-specificity phosphatase domain with the PTP signature motif.	174
350350	cd14500	PTP-IVa	protein tyrosine phosphatase type IVA family. Protein tyrosine phosphatases type IVA (PTP-IVa), also known as protein-tyrosine phosphatases of regenerating liver (PRLs) constitute a family of small, prenylated phosphatases that are the most oncogenic of all PTPs. They stimulate progression from G1 into S phase during mitosis, enhance cell proliferation, cell motility and invasive activity, and promote cancer metastasis. They associate with magnesium transporters of the cyclin M (CNNM) family, which results in increased intracellular magnesium levels that promote oncogenic transformation. Vertebrates contain three members: PRL-1, PRL-2, and PRL-3.	156
350351	cd14501	PFA-DSP	plant and fungi atypical dual-specificity phosphatase. Plant and fungi atypical dual-specificity phosphatases (PFA-DSPs) are a group of atypical DSPs present in plants, fungi, kinetoplastids, and slime molds. They share structural similarity with atypical- and lipid phosphatase DSPs from mammals. The PFA-DSP group is composed of active as well as inactive phosphatases. The best characterized member is Saccharomyces Siw14, also known as Oca3, which plays a role in actin filament organization and endocytosis. Siw14 has been shown to be an inositol pyrophosphate phosphatase, hydrolyzing the beta-phosphate from 5-diphosphoinositol pentakisphosphate (5PP-IP5 or IP7).	149
350352	cd14502	RNA_5'-triphosphatase	RNA 5'-triphosphatase domain. This family of RNA-specific cysteine phosphatases includes baculovirus RNA 5'-triphosphatase, dual specificity protein phosphatase 11 (DUSP11), and the RNA triphosphatase domains of metazoan and plant mRNA capping enzymes. RNA/polynucleotide 5'-triphosphatase (EC 3.1.3.33) catalyzes the removal of the gamma-phosphate from the 5'-triphosphate end of nascent mRNA to yield a diphosphate end. mRNA capping enzyme is a bifunctional enzyme that catalyzes the first two steps of cap formation. DUSP11 has RNA 5'-triphosphatase and diphosphatase activity, but only poor protein-tyrosine phosphatase activity.	167
350353	cd14503	PTP-bact	bacterial tyrosine-protein phosphatases similar to Neisseria NMA1982. This subfamily is composed of bacterial tyrosine-protein phosphatases similar to Neisseria meningitidis NMA1982, which displays phosphatase activity but whose biological function is still unknown.	136
350354	cd14504	DUSP23	dual specificity phosphatase 23. Dual specificity phosphatase 23 (DUSP23), also known as VH1-like phosphatase Z (VHZ) or low molecular mass dual specificity phosphatase 3 (LDP-3), functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). It deactivates its MAPK substrates by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. DUSP23 is an atypical DUSP; it contains the catalytic dual specificity phosphatase domain but lacks the N-terminal Cdc25/rhodanese-like domain that is present in typical DUSPs or MKPs. It is able to enhance activation of JNK and p38 MAPK, and has been shown to dephosphorylate p44-ERK1 (MAPK3) in vitro. It has been associated with cell growth and human primary cancers. It has also been identified as a cell-cell adhesion regulatory protein; it promotes the dephosphorylation of beta-catenin at Tyr 142 and enhances the interaction between alpha- and beta-catenin.	142
350355	cd14505	CDKN3-like	cyclin-dependent kinase inhibitor 3 and similar proteins. This family is composed of eukaryotic cyclin-dependent kinase inhibitor 3 (CDKN3) and related archaeal and bacterial proteins. CDKN3 is also known as kinase-associated phosphatase (KAP), CDK2-associated dual-specificity phosphatase, cyclin-dependent kinase interactor 1 (CDI1), or cyclin-dependent kinase-interacting protein 2 (CIP2). It has been characterized as a dual-specificity phosphatase, which functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and protein-tyrosine-phosphatase (EC 3.1.3.48). It dephosphorylates CDK2 at a threonine residue in a cyclin-dependent manner, resulting in the inhibition of G1/S cell cycle progression. It also interacts with CDK1 and controls progression through mitosis by dephosphorylating CDC2. CDKN3 may also function as a tumor suppressor; its loss of function was found in a variety of cancers including glioblastoma and hepatocellular carcinoma. However, it has also been found over-expressed in many cancers such as breast, cervical, lung and prostate cancers, and may also have an oncogenic function.	163
350356	cd14506	PTP_PTPDC1	protein tyrosine phosphatase domain of PTP domain-containing protein 1. Protein tyrosine phosphatase domain-containing protein 1 (PTPDC1) is an uncharacterized non-receptor class protein-tyrosine phosphatase (PTP). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. Small interfering RNA (siRNA) knockdown of the ptpdc1 gene is associated with elongated cilia.	206
350357	cd14507	PTP-MTM-like	protein tyrosine phosphatase-like domain of myotubularins. Myotubularins are a unique subgroup of protein tyrosine phosphatases that use inositol phospholipids, rather than phosphoproteins, as substrates. They dephosphorylate the D-3 position of phosphatidylinositol 3-phosphate [PI(3)P] and phosphatidylinositol 3,5-bisphosphate [PI(3,5)P2], generating phosphatidylinositol and phosphatidylinositol 5-phosphate [PI(5)P], respectively. Not all members are catalytically active proteins, some function as adaptors for the active members.	226
350358	cd14508	PTP_tensin	protein tyrosine phosphatase-like domain of tensins. The tensin family of intracellular proteins (tensin-1, -2, -3 and -4) act as links between the extracellular matrix and the cytoskeleton, and thereby mediate signaling for cell shape and motility. Dysregulation of tensin expression has been implicated in human cancer. Tensin-1, -2, and -3 contain an N-terminal region with a protein tyrosine phosphatase (PTP)-like domain followed by a protein kinase 2 (C2) domain, and a C-terminal region with SH2 and pTyr binding (PTB) domains. In addition, tensin-2 contains a zinc finger N-terminal to its PTP domain. Tensin-4 is not included in this model as it does not contain a PTP-like domain.	159
350359	cd14509	PTP_PTEN	protein tyrosine phosphatase-like catalytic domain of phosphatase and tensin homolog. Phosphatase and tensin homolog (PTEN), also phosphatidylinositol 3,4,5-trisphosphate 3-phosphatase and dual-specificity protein phosphatase PTEN or mutated in multiple advanced cancers 1 (MMAC1), is a tumor suppressor that acts as a dual-specificity protein phosphatase and as a lipid phosphatase. It is a critical endogenous inhibitor of phosphoinositide signaling. It dephosphorylates phosphoinositide trisphosphate, and therefore, has the function of negatively regulating Akt. The PTEN/PI3K/AKT pathway regulates the signaling of multiple biological processes such as apoptosis, metabolism, cell proliferation, and cell growth. PTEN contains an N-terminal PIP-binding domain, a protein tyrosine phosphatase (PTP)-like catalytic domain, a regulatory C2 domain responsible for its cellular location, a C-tail containing phosphorylation sites, and a C-terminal PDZ domain.	158
350360	cd14510	PTP_VSP_TPTE	protein tyrosine phosphatase-like catalytic domain of voltage-sensitive phosphatase/transmembrane phosphatase with tensin homology. Voltage-sensitive phosphatase (VSP) proteins comprise a family of phosphoinositide phosphatases with substrates that include phosphatidylinositol-4,5-diphosphate and phosphatidylinositol-3,4,5-trisphosphate. This family is conserved in deuterostomes; VSP was first identified as a sperm flagellar plasma membrane protein in Ciona intestinalis. Gene duplication events in primates resulted in the presence of paralogs, transmembrane phosphatase with tensin homology (TPTE) and TPTE2, that retain protein domain architecture but, in the case of TPTE, have lost catalytic activity. TPTE, also called cancer/testis antigen 44 (CT44), may play a role in the signal transduction pathways of the endocrine or spermatogenic function of the testis. TPTE2, also called TPTE and PTEN homologous inositol lipid phosphatase (TPIP), occurs in several differentially spliced forms; TPIP alpha displays phosphoinositide 3-phosphatase activity and is  localized on the endoplasmic reticulum, while TPIP beta is cytosolic and lacks detectable phosphatase activity. VSP/TPTE proteins contain an N-terminal voltage sensor consisting of four transmembrane segments, a protein tyrosine phosphatase (PTP)-like phosphoinositide phosphatase catalytic domain, followed by a regulatory C2 domain.	177
350361	cd14511	PTP_auxilin-like	protein tyrosine phosphatase-like domain of auxilin and similar proteins. This subfamily contains proteins similar to auxilin, characterized by also containing a J domain. It includes auxilin, also called auxilin-1, and cyclin-G-associated kinase (GAK), also called auxilin-2. Auxilin-1 and -2 facilitate Hsc70-mediated dissociation of clathrin from clathrin-coated vesicles. GAK is expressed ubiquitously and is enriched in the Golgi, while auxilin-1 is nerve-specific. Both proteins contain a protein tyrosine phosphatase (PTP)-like domain similar to the PTP-like domain of PTEN (a phosphoinositide 3-phosphatase), and a C-terminal region with clathrin-binding and J domains. In addition, GAK contains an N-terminal protein kinase domain that phosphorylates the mu subunits of adaptor protein (AP) 1 and AP2.	164
350362	cd14512	DSP_MKP	dual specificity phosphatase domain of mitogen-activated protein kinase phosphatase. Mitogen-activated protein kinase (MAPK) phosphatases (MKPs) are eukaryotic dual-specificity phosphatases (DUSPs) that act on MAPKs, which are involved in gene regulation, cell proliferation, programmed cell death and stress responses, as an important feedback control mechanism that limits MAPK cascades. MKPs, also referred to as typical DUSPs, function as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). They deactivate MAPKs by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. All MKPs contain an N-terminal Cdc25/rhodanese-like domain, which is responsible for MAPK-binding, and a C-terminal catalytic dual specificity phosphatase domain. Based on sequence homology, subcellular localization and substrate specificity, 10 MKPs can be subdivided into three subfamilies (class I-III).	136
350363	cd14513	DSP_slingshot	dual specificity phosphatase domain of slingshot family phosphatases. The slingshot (SSH) family of dual specificity protein phosphatases is composed of Drosophila slingshot phosphatase and its vertebrate homologs: SSH1, SSH2 and SSH3. Its members specifically dephosphorylate and reactivate Ser-3-phosphorylated cofilin (P-cofilin), an actin-binding protein that plays an essential role in actin filament dynamics. In Drosophila, loss of ssh gene function causes prominent elevation in the levels of P-cofilin and filamentous actin and disorganized epidermal cell morphogenesis, including bifurcation phenotypes of bristles and wing hairs. SSH family phosphatases contain an N-terminal, SSH family-specific non-catalytic (SSH-N) domain, followed by a short domain with similarity to the C-terminal domain of the chromatin-associated protein DEK, and a dual specificity phosphatase catalytic domain. In addition, many members contain a C-terminal tail. The SSH-N domain plays critical roles in P-cofilin recognition, F-actin-mediated activation, and subcellular localization of SSHs.	139
350364	cd14514	DUSP14-like	dual specificity protein phosphatases 14, 18, 21, 28 and similar proteins. This family is composed of dual specificity protein phosphatase 14 (DUSP14, also known as MKP-6), 18 (DUSP18), 21 (DUSP21), 28 (DUSP28), and similar proteins. They function as protein-serine/threonine phosphatases (EC 3.1.3.16) and protein-tyrosine-phosphatases (EC 3.1.3.48), and are atypical DUSPs. They contain the catalytic dual specificity phosphatase domain but lack the N-terminal Cdc25/rhodanese-like domain that is present in typical DUSPs or MKPs. DUSP14 directly interacts and dephosphorylates TGF-beta-activated kinase 1 (TAK1)-binding protein 1 (TAB1) in T cells, and negatively regulates TCR signaling and immune responses. DUSP18 has been shown to interact and dephosphorylate SAPK/JNK, and may play a role in regulating the SAPK/JNK pathway. DUSP18 and DUSP21 target to opposing sides of the mitochondrial inner membrane. DUSP28 has been implicated in hepatocellular carcinoma progression and in migratory activity and drug resistance of pancreatic cancer cells.	133
350365	cd14515	DUSP3-like	dual specificity protein phosphatases 3, 13, 26, 27, and similar domains. This family is composed of dual specificity protein phosphatase 3 (DUSP3, also known as VHR), 13B (DUSP13B, also known as TMDP), 26 (DUSP26, also known as MPK8), 13A (DUSP13A, also known as MDSP), dual specificity phosphatase and pro isomerase domain containing 1 (DUPD1), and inactive DUSP27. In general, DUSPs function as protein-serine/threonine phosphatases (EC 3.1.3.16) and protein-tyrosine-phosphatases (EC 3.1.3.48). Members of this family are atypical DUSPs; they contain the catalytic dual specificity phosphatase domain but lack the N-terminal Cdc25/rhodanese-like domain that is present in typical DUSPs or MKPs. Inactive DUSP27 contains a dual specificity phosphatase-like domain with the active site cysteine substituted to serine.	148
350366	cd14516	DSP_fungal_PPS1	dual specificity phosphatase domain of fungal dual specificity protein phosphatase PPS1-like. This subfamily contains fungal proteins with similarity to dual specificity protein phosphatase PPS1 from Saccharomyces cerevisiae, which has a role in the DNA synthesis phase of the cell cycle. As a dual specificity protein phosphatase, PPS1 functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). It contains a C-terminal catalytic dual specificity phosphatase domain.	177
350367	cd14517	DSP_STYXL1	dual specificity phosphatase-like domain of serine/threonine/tyrosine interacting like 1. Serine/threonine/tyrosine interacting like 1 (STYXL1), also known as DUSP24 and MK-STYX, is a catalytically inactive phosphatase with homology to the mitogen-activated protein kinase (MAPK) phosphatases (MKPs). STYXL1 plays a role in regulating pathways by competing with active phosphatases for binding to MAPKs. Similar to MKPs, STYXL1 contains an N-terminal Cdc25/rhodanese-like domain, which is responsible for MAPK-binding, however its C-terminal dual specificity phosphatase-like domain is a pseudophosphatase missing the catalytic cysteine.	155
350368	cd14518	DSP_fungal_YVH1	dual specificity phosphatase domain of fungal YVH1-like dual specificity protein phosphatase. This family is composed of Saccharomyces cerevisiae dual specificity protein phosphatase Yvh1 and similar fungal proteins. Yvh1 could function as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). It regulates cell growth, sporulation, and glycogen accumulation. It plays an important role in ribosome assembly. Yvh1 associates transiently with late pre-60S particles and is required for the release of the nucleolar/nuclear pre-60S factor Mrt4, which is necessary to construct a translation-competent 60S subunit and mature ribosome stalk. Yvh1 contains an N-terminal catalytic dual specificity phosphatase domain and a C-terminal tail.	153
350369	cd14519	DSP_DUSP22_15	dual specificity phosphatase domain of dual specificity protein phosphatase 22, 15, and similar proteins. Dual specificity protein phosphatase 22 (DUSP22, also known as VHX) and 15 (DUSP15, also known as VHY) function as protein-serine/threonine phosphatases (EC 3.1.3.16) and protein-tyrosine-phosphatases (EC 3.1.3.48). They are atypical DUSPs; they contain the catalytic dual specificity phosphatase domain but lack the N-terminal Cdc25/rhodanese-like domain that is present in typical DUSPs or MKPs. They both contain N-terminal myristoylation recognition sequences and myristoylation regulates their subcellular location. DUSP22 negatively regulates the estrogen receptor-alpha-mediated signaling pathway and the IL6-leukemia inhibitory factor (LIF)-STAT3-mediated signaling pathway. DUSP15 has been identified as a regulator of oligodendrocyte differentiation. DUSP22 is a single domain protein containing only the catalytic dual specificity phosphatase domain while DUSP15 contains a short C-terminal tail.	136
350370	cd14520	DSP_DUSP12	dual specificity phosphatase domain of dual specificity protein phosphatase 12 and similar proteins. Dual specificity protein phosphatase 12 (DUSP12), also called YVH1, functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). It deactivates its MAPK substrates by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. DUSP12 is an atypical DUSP; it contains the catalytic dual specificity phosphatase domain but lacks the N-terminal Cdc25/rhodanese-like domain that is present in typical DUSPs or MKPs. It targets p38 MAPK to regulate macrophage response to bacterial infection. It also ameliorates cardiac hypertrophy in response to pressure overload through c-Jun N-terminal kinase (JNK) inhibition. DUSP12 has been identified as a modulator of cell cycle progression, a function independent of phosphatase activity and mediated by its C-terminal zinc-binding domain.	144
350371	cd14521	DSP_fungal_SDP1-like	dual specificity phosphatase domain of fungal dual specificity protein phosphatase SDP1, MSG5, and similar proteins. This family is composed of fungal dual specificity protein phosphatases (DUSPs) including Saccharomyces cerevisiae SDP1 and MSG5, and Schizosaccharomyces pombe Pmp1. They function as protein-serine/threonine phosphatases (EC 3.1.3.16) and protein-tyrosine-phosphatases (EC 3.1.3.48). They deactivate MAPKs by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. SDP1 is oxidative stress-induced and dephosphorylates MAPK substrates such as SLT2. MSG5 dephosphorylates the Fus3 and Slt2 MAPKs operating in the mating and cell wall integrity (CWI) pathways, respectively. Pmp1 is responsible for dephosphorylating the CWI MAPK Pmk1. These phosphatases bind to their target MAPKs through a conserved IYT motif located outside of the dual specificity phosphatase domain.	155
350372	cd14522	DSP_STYX	dual specificity phosphatase-like domain of serine/threonine/tyrosine-interacting protein. Serine/threonine/tyrosine-interacting protein (STYX), also called protein tyrosine phosphatase-like protein, is a catalytically inactive member of the protein tyrosine phosphatase family that plays an integral role in regulating pathways by competing with active phosphatases for binding to MAPKs. It acts as a nuclear anchor for MAPKs, affecting their nucleocytoplasmic shuttling.	151
350373	cd14523	DSP_DUSP19	dual specificity phosphatase domain of dual specificity protein phosphatase 19. Dual specificity protein phosphatase 19 (DUSP19), also called low molecular weight dual specificity phosphatase 3 (LMW-DSP3) or stress-activated protein kinase (SAPK) pathway-regulating phosphatase 1 (SKRP1), functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). It is an atypical DUSP; it contains the catalytic dual specificity phosphatase domain but lacks the N-terminal Cdc25/rhodanese-like domain that is present in typical DUSPs or MKPs. DUSP19  interacts with the MAPK kinase MKK7, a JNK activator, and inactivates the JNK MAPK pathway.	137
350374	cd14524	PTPMT1	protein-tyrosine phosphatase mitochondrial 1. Protein-tyrosine phosphatase mitochondrial 1 or PTP localized to the mitochondrion 1 (PTPMT1), also called phosphoinositide lipid phosphatase (PLIP), phosphatidylglycerophosphatase and protein-tyrosine phosphatase 1, or PTEN-like phosphatase, is a lipid phosphatase or phosphatidylglycerophosphatase (EC 3.1.3.27) which dephosphorylates phosphatidylglycerophosphate (PGP) to phosphatidylglycerol (PG). It is targeted to the mitochondrion by an N-terminal signal sequence and is found anchored to the matrix face of the inner membrane. It is essential for the biosynthesis of cardiolipin, a mitochondrial-specific phospholipid regulating the membrane integrity and activities of the organelle. PTPMT1 also plays a crucial role in hematopoietic stem cell (HSC) function, and has been shown to display activity  toward phosphoprotein substrates.	149
350375	cd14526	DSP_laforin-like	dual specificity phosphatase domain of laforin and similar domains. This family is composed of glucan phosphatases including vertebrate dual specificity protein phosphatase laforin, also called lafora PTPase (LAFPTPase), and plant starch excess4 (SEX4). Laforin is a glycogen phosphatase; its gene is mutated in Lafora progressive myoclonus epilepsy or Lafora disease (LD), a fatal autosomal recessive neurodegenerative disorder characterized by the presence of progressive neurological deterioration, myoclonus, and epilepsy. One characteristic of LD is the accumulation of insoluble glucans. Laforin prevents LD by at least two mechanisms: by preventing hyperphosphorylation of glycogen by dephosphorylating it, allowing proper glycogen formation, and by promoting the ubiquitination of proteins involved in glycogen metabolism via its interaction with malin. Laforin contains an N-terminal CBM20 (carbohydrate-binding module, family 20) domain and a C-terminal catalytic dual specificity phosphatase (DSP) domain. Plant SEX4 regulates starch metabolism by selectively dephosphorylating glucose moieties within starch glucan chains. It contains an N-terminal catalytic DSP domain and a C-terminal Early (E) set domain.	146
350376	cd14527	DSP_bac	unknown subfamily of bacterial and plant dual specificity protein phosphatases. This subfamily is composed of uncharacterized bacterial and plant dual-specificity protein phosphatases. DUSPs function as protein-serine/threonine phosphatases (EC 3.1.3.16) and protein-tyrosine-phosphatases (EC 3.1.3.48).	136
350377	cd14528	PFA-DSP_Siw14	atypical dual specificity phosphatases similar to yeast Siw14. This subfamily contains Saccharomyces Siw14 and a novel phosphatase from the Arabidopsis thaliana gene locus At1g05000. Siw14, also known as Oca3, plays a role in actin filament organization and endocytosis. Siw14 has been shown to be an inositol pyrophosphate phosphatase, hydrolyzing the beta-phosphate from 5-diphosphoinositol pentakisphosphate (5PP-IP5 or IP7). The At1g05000 protein, also called AtPFA-DSP1, has been shown to have highest activity toward polyphosphate (poly-P(12-13)) and deoxyribo- and ribonucleoside triphosphates, and less activity toward phosphoenolpyruvate, phosphotyrosine, phosphotyrosine-containing peptides, and phosphatidylinositols. This subfamily belongs to a group of atypical DSPs present in plants, fungi, kinetoplastids, and slime molds called plant and fungi atypical dual-specificity phosphatases (PFA-DSPs).	148
350378	cd14529	TpbA-like	bacterial protein tyrosine and dual-specificity phosphatases related to Pseudomonas aeruginosa TpbA. This subfamily contains bacterial protein tyrosine phosphatases (PTPs) and dual-specificity phosphatases (DUSPs) related to Pseudomonas aeruginosa TpbA, a DUSP that negatively regulates biofilm formation by converting extracellular quorum sensing signals and to Mycobacterium tuberculosis PtpB, a PTP virulence factor that attenuates host immune defenses by interfering with signal transduction pathways in macrophages. PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides, while DUSPs function as  protein-serine/threonine phosphatases (EC 3.1.3.16) and PTPs.	158
350379	cd14531	PFA-DSP_Oca1	atypical dual specificity phosphatases similar to oxidant-induced cell-cycle arrest protein 1. Oxidant-induced cell-cycle arrest protein 1 (Oca1) is an atypical dual specificity phosphatase whose gene is required for G1 arrest in response to the lipid oxidation product linoleic acid hydroperoxide. It may function in linking growth, stress responses, and the cell cycle. Oca1 belongs to a group of atypical DSPs present in plants, fungi, kinetoplastids, and slime molds called plant and fungi atypical dual-specificity phosphatases (PFA-DSPs).	149
350380	cd14532	PTP-MTMR6-like	protein tyrosine phosphatase-like domain of myotubularin related phosphoinositide phosphatases 6, 7, and 8. This subgroup of enzymatically active phosphatase domains of myotubularins consists of MTMR6, MTMR7 and MTMR8, and related domains.  Beside the phosphatase domain, they contain a C-terminal coiled-coil domain and an N-terminal PH-GRAM domain. In general, myotubularins are a unique subgroup of protein tyrosine phosphatases that use inositol phospholipids, rather than phosphoproteins, as substrates. They dephosphorylate the D-3 position of phosphatidylinositol 3-phosphate [PI(3)P] and phosphatidylinositol 3,5-bisphosphate [PI(3,5)P2], generating phosphatidylinositol and phosphatidylinositol 5-phosphate [PI(5)P], respectively. MTMR6, MTMR7 and MTMR8 form complexes with catalytically inactive MTMR9, and display differential substrate preferences. In cells, the MTMR6/R9 complex significantly increases the cellular levels of PtdIns(5)P, the product of PI(3,5)P(2) dephosphorylation, whereas the MTMR8/R9 complex reduces cellular PtdIns(3)P levels. The MTMR6/R9 complex serves to inhibit stress-induced apoptosis while the MTMR8/R9 complex inhibits autophagy.	301
350381	cd14533	PTP-MTMR3-like	protein tyrosine phosphatase-like domain of myotubularin related phosphoinositide phosphatases 3 and 4. This subgroup of enzymatically active phosphatase domains of myotubularins consists of MTMR3, also known as ZFYVE10, and MTMR4, also known as ZFYVE11, and related domains. Beside the phosphatase domain, they contain a C-terminal FYVE domain and an N-terminal PH-GRAM domain. In general, myotubularins are a unique subgroup of protein tyrosine phosphatases that use inositol phospholipids, rather than phosphoproteins, as substrates. They dephosphorylate the D-3 position of phosphatidylinositol 3-phosphate [PI(3)P] and phosphatidylinositol 3,5-bisphosphate [PI(3,5)P2], generating phosphatidylinositol and phosphatidylinositol 5-phosphate [PI(5)P], respectively.	229
350382	cd14534	PTP-MTMR5-like	protein tyrosine phosphatase-like pseudophosphatase domain of myotubularin related phosphoinositide phosphatases 5 and 13. This subgroup of enzymatically inactive phosphatase domains of myotubularins consists of MTMR5, also known as SET binding factor 1 (SBF1) and MTMR13, also known as SET binding factor 2 (SBF2), and similar domains. Besides the pseudophosphatase domain, they contain a variety of other domains, including a DENN and a PH-like domain. In general, myotubularins are a unique subgroup of protein tyrosine phosphatases that use inositol phospholipids, rather than phosphoproteins, as substrates. MTMR5 and MTMR13 are pseudophosphatases that lack the catalytic cysteine in their catalytic pocket. Mutations in MTMR13 cause Charcot-Marie-Tooth type 4B2, a severe childhood-onset neuromuscular disorder, characterized by demyelination and redundant loops of myelin known as myelin outfoldings, a similar phenotype as mutations in MTMR2. Mutations in the MTMR5 gene cause Charcot-Marie-Tooth disease type 4B3. MTMR5 and MTMR13 interact with MTMR2 and stimulate its phosphatase activity.	274
350383	cd14535	PTP-MTM1-like	protein tyrosine phosphatase-like domain of myotubularin, and myotubularin related phosphoinositide phosphatases 1 and 2. This subgroup of enzymatically active phosphatase domains of myotubularins consists of MTM1, MTMR1 and MTMR2. All contain an additional N-terminal PH-GRAM domain and C-terminal coiled-coil domain and PDZ binding site. In general, myotubularins are a unique subgroup of protein tyrosine phosphatases that use inositol phospholipids, rather than phosphoproteins, as substrates. They dephosphorylate the D-3 position of phosphatidylinositol 3-phosphate [PI(3)P] and phosphatidylinositol 3,5-bisphosphate [PI(3,5)P2], generating phosphatidylinositol and phosphatidylinositol 5-phosphate [PI(5)P], respectively.	249
350384	cd14536	PTP-MTMR9	protein tyrosine phosphatase-like pseudophosphatase domain of myotubularin related phosphoinositide phosphatase 9. Myotubularin related phosphoinositide phosphatase 9 (MTMR9) is enzymatically inactive and contains a C-terminal coiled-coil domain and an N-terminal PH-GRAM domain. Mutations have been associated with obesity and metabolic syndrome. In general, myotubularins are a unique subgroup of protein tyrosine phosphatases that use inositol phospholipids, rather than phosphoproteins, as substrates. MTMR9 is a pseudophosphatase that lacks the catalytic cysteine in its catalytic pocket. It forms complexes with catalytically active MTMR6, MTMR7 and MTMR8, and regulates their activities; the complexes display differential substrate preferences. The MTMR6/R9 complex serves to inhibit stress-induced apoptosis while the MTMR8/R9 complex inhibits autophagy.	224
350385	cd14537	PTP-MTMR10-like	protein tyrosine phosphatase-like pseudophosphatase domain of myotubularin related phosphoinositide phosphatases 10, 11, and 12. This subgroup of enzymatically inactive phosphatase domains of myotubularins consists of MTMR10, MTMR11, MTMR12, and similar proteins. Besides the phosphatase domain, they contain an N-terminal PH-GRAM domain. In general, myotubularins are a unique subgroup of protein tyrosine phosphatases that use inositol phospholipids, rather than phosphoproteins, as substrates. MTMR10, MTMR11, and MTMR12 are pseudophosphatases that lack the catalytic cysteine in their catalytic pocket. MTMR12 functions as an adapter for the catalytically active myotubularin to regulate its intracellular location.	200
350386	cd14538	PTPc-N20_13	catalytic domain of tyrosine-protein phosphatase non-receptor type 20 and type 13. Tyrosine-protein phosphatase non-receptor type 20 (PTPN20) and type 13 (PTPN13, also known as PTPL1) belong to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. Human PTPN20 is a widely expressed phosphatase with a dynamic subcellular distribution that is targeted to sites of actin polymerization. Human PTPN13 is an important regulator of tumor aggressiveness.	207
350387	cd14539	PTP-N23	PTP-like domain of tyrosine-protein phosphatase non-receptor type 23. Tyrosine-protein phosphatase non-receptor type 23 (PTPN23), also called His domain-containing protein tyrosine phosphatase (HD-PTP) or protein tyrosine phosphatase TD14 (PTP-TD14), is a catalytically inactive member of the tyrosine-specific protein tyrosine phosphatase (PTP) family. Human PTPN23 may be involved in the regulation of small nuclear ribonucleoprotein assembly and pre-mRNA splicing by modifying the survival motor neuron (SMN) complex. It plays a role in ciliogenesis and is part of endosomal sorting complex required for transport (ESCRT) pathways. PTPN23 contains five domains: a BRO1-like domain that plays a role in endosomal sorting; a V-domain that interacts with Lys63-linked polyubiquitinated substrates; a central proline-rich region that might recruit SH3-containing proteins; a PTP-like domain; and a proteolytic degradation-targeting motif, also known as a PEST sequence.	205
350388	cd14540	PTPc-N21_14	catalytic domain of tyrosine-protein phosphatase non-receptor type 21 and type 14. Tyrosine-protein phosphatase non-receptor type 21 (PTPN21) and type 14 (PTPN14) belong to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. Both PTPN21 and PTPN14 contain an N-terminal FERM domain and a C-terminal catalytic PTP domain, separated by a long intervening sequence.	219
350389	cd14541	PTPc-N3_4	catalytic domain of tyrosine-protein phosphatase non-receptor type 3 and type 4. Tyrosine-protein phosphatase non-receptor type 3 (PTPN3) and type 4 (PTPN4) belong to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPN3 and PTPN4 are large modular proteins containing an N-terminal FERM domain, a PDZ domain and a C-terminal catalytic PTP domain. PTPN3 interacts with mitogen-activated protein kinase p38gamma and serves as its specific phosphatase. PTPN4 functions in TCR cell signaling, apoptosis, cerebellar synaptic plasticity, and innate immune responses.	212
350390	cd14542	PTPc-N22_18_12	catalytic domain of tyrosine-protein phosphatase non-receptor type 22, type 18 and type 12. Tyrosine-protein phosphatase non-receptor type 22 (PTPN22), type 18 (PTPN18) and type 12 (PTPN12) belong to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPN22 is expressed in hematopoietic cells and it functions as a key regulator of immune homeostasis by inhibiting T-cell receptor signaling through the direct dephosphorylation of Src family kinases (Lck and Fyn), ITAMs of the TCRz/CD3 complex, and other signaling molecules. PTPN18 regulates HER2-mediated cellular functions through defining both its phosphorylation and ubiquitination states. PTPN12 is characterized as a tumor suppressor and a pivotal regulator of EGFR/HER2 signaling.	202
350391	cd14543	PTPc-N9	catalytic domain of tyrosine-protein phosphatase non-receptor type 9. Tyrosine-protein phosphatase non-receptor type 9 (PTPN9), also called protein-tyrosine phosphatase MEG2, belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPN9 plays an important role in promoting intracellular secretory vesicle fusion in hematopoietic cells and promotes the dephosphorylation of ErbB2 and EGFR in breast cancer cells, leading to impaired activation of STAT5 and STAT3. It also directly dephosphorylates STAT3 at the Tyr705 residue, resulting in its inactivation. PTPN9 has been found to be dysregulated in various human cancers, including breast, colorectal, and gastric cancer.	271
350392	cd14544	PTPc-N11_6	catalytic domain of tyrosine-protein phosphatase non-receptor type 11 and type 6. Tyrosine-protein phosphatase non-receptor type 11 (PTPN11) and type 6 (PTPN6) belong to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPN11 and PTPN6 are also called SH2 domain-containing tyrosine phosphatase 2 (SHP2) and 1 (SHP1), respectively. They contain two tandem SH2 domains, a catalytic PTP domain, and a C-terminal tail with regulatory properties. Although structurally similar, they have different localization and different roles in signal transduction. PTPN11/SHP2 is expressed ubiquitously and plays a positive role in cell signaling, leading to cell activation, while PTPN6/SHP1 expression is restricted mainly to hematopoietic and epithelial cells and functions as a negative regulator of signaling events.	251
350393	cd14545	PTPc-N1_2	catalytic domain of tyrosine-protein phosphatase non-receptor type 1 and type 2. Tyrosine-protein phosphatase non-receptor type 1 (PTPN1) and type 2 (PTPN2) belong to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPN1 (or PTP-1B) is the first PTP to be purified and characterized and is the prototypical intracellular PTP found in a wide variety of human tissues. It dephosphorylates and regulates the activity of a number of receptor tyrosine kinases, including the insulin receptor, the EGF receptor, and the PDGF receptor. PTPN2 (or TCPTP), a tumor suppressor, dephosphorylates and inactivates EGFRs, Src family kinases, Janus-activated kinases (JAKs)-1 and -3, and signal transducer and activators of transcription (STATs)-1, -3 and -5, in a cell type and context-dependent manner.	231
350394	cd14546	R-PTP-N-N2	PTP-like domain of receptor-type tyrosine-protein phosphatase-like N and N2. Receptor-type tyrosine-protein phosphatase-like N (PTPRN) and N2 (PTPRN2) belong to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). They consist of a large ectodomain that contains a RESP18HD (regulated endocrine-specific protein 18 homology domain), followed by a transmembrane segment, and a single, catalytically-impaired, PTP domain. They are mainly expressed in neuropeptidergic neurons and peptide-secreting endocrine cells, including insulin-producing pancreatic beta-cells, and are involved in the generation, cargo storage, traffic, exocytosis and recycling of insulin secretory granules, as well as in beta-cell proliferation. They also are major autoantigens in type 1 diabetes and are involved in the regulation of insulin secretion.	208
350395	cd14547	PTPc-KIM	catalytic domain of the kinase interaction motif (KIM) family of protein-tyrosine phosphatases. The kinase interaction motif (KIM) family of protein-tyrosine phosphatases (PTPs) includes tyrosine-protein phosphatases non-receptor type 7 (PTPN7) and non-receptor type 5 (PTPN5), and protein-tyrosine phosphatase receptor type R (PTPRR). PTPN7 is also called hematopoietic protein-tyrosine phosphatase (HePTP) while PTPN5 is also called striatal-enriched protein-tyrosine phosphatase (STEP). They belong to the family of classical tyrosine-specific PTPs (EC 3.1.3.48) that catalyze the dephosphorylation of phosphotyrosine peptides. KIM-PTPs are characterized by the presence of a 16-amino-acid KIM that binds specifically to members of the MAPK (mitogen-activated protein kinase) family. They are highly specific to the MAPKs ERK1/2 (extracellular-signal-regulated kinase 1/2) and p38, over JNK (c-Jun N-terminal kinase); they dephosphorylate these kinases and thereby critically modulate cell proliferation and differentiation.	224
350396	cd14548	R3-PTPc	catalytic domain of R3 subfamily receptor-type tyrosine-protein phosphatases and similar proteins. R3 subfamily receptor-type phosphotyrosine phosphatases (RPTP) are characterized by a unique modular composition consisting of multiple extracellular fibronectin type III (FN3) repeats and a single (most RPTP subtypes have two) cytoplasmic catalytic PTP domain. Vertebrate members include receptor-type tyrosine-protein phosphatase-like O (PTPRO), J (PTPRJ), Q (PTPRQ), B (PTPRB), V (PTPRV) and H (PTPRH). They belong to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. Most members are PTPs, except for PTPRQ, which dephosphorylates phosphatidylinositide substrates. PTPRV is characterized only in rodents; its function has been lost in humans. Both vertebrate and invertebrate R3 subfamily RPTPs are involved in the control of a variety of cellular processes, including cell growth, differentiation, mitotic cycle and oncogenic transformation.	222
350397	cd14549	R5-PTPc-1	catalytic domain of R5 subfamily receptor-type tyrosine-protein phosphatases, repeat 1. The R5 subfamily of receptor-type phosphotyrosine phosphatases (RPTP) is composed of receptor-type tyrosine-protein phosphatase Z (PTPRZ) and G (PTPRG). They belong to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. They are type 1 integral membrane proteins consisting of an extracellular region with a carbonic anhydrase-like (CAH) and a fibronectin type III (FN3) domains, and an intracellular region with a catalytic PTP domain (repeat 1) proximal to the membrane, and a catalytically inactive PTP-fold domain (repeat 2) distal to the membrane. This model represents the catalytic PTP domain (repeat 1).	204
350398	cd14550	R5-PTP-2	PTP-like domain of R5 subfamily receptor-type tyrosine-protein phosphatases, repeat 2. The R5 subfamily of receptor-type phosphotyrosine phosphatases (RPTP) is composed of receptor-type tyrosine-protein phosphatase Z (PTPRZ) and G (PTPRG). They belong to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. They are type 1 integral membrane proteins consisting of an extracellular region with a carbonic anhydrase-like (CAH) and a fibronectin type III (FN3) domains, and an intracellular region with a catalytic PTP domain (repeat 1) proximal to the membrane, and a catalytically inactive PTP-fold domain (repeat 2) distal to the membrane. This model represents the inactive PTP-like domain (repeat 2).	200
350399	cd14551	R-PTPc-A-E-1	catalytic domain of receptor-type tyrosine-protein phosphatase A and E, repeat 1. Receptor-type tyrosine-protein phosphatase A (PTPRA) and E (PTPRE) belong to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRA and PTPRE share several functions including regulation of Src family kinases and voltage-gated potassium (Kv) channels. They both contain a small extracellular domain, a transmembrane segment, and an intracellular region containing two tandem catalytic PTP domains. This model represents the first catalytic PTP domain (repeat 1).	202
350400	cd14552	R-PTPc-A-E-2	catalytic domain of receptor-type tyrosine-protein phosphatase A and E, repeat 2. Receptor-type tyrosine-protein phosphatase A (PTPRA) and E (PTPRE) belong to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRA and PTPRE share several functions including regulation of Src family kinases and voltage-gated potassium (Kv) channels. They both contain a small extracellular domain, a transmembrane segment, and an intracellular region containing two tandem catalytic PTP domains. This model represents the second PTP domain (repeat 2).	202
350401	cd14553	R-PTPc-LAR-1	catalytic domain of LAR family receptor-type tyrosine-protein phosphatases, repeat 1. The LAR (leukocyte common antigen-related) family of receptor-type tyrosine-protein phosphatases (RPTPs) includes three vertebrate members: LAR (or PTPRF), R-PTP-delta (or PTPRD), and R-PTP-sigma (or PTPRS). They belong to the larger family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. LAR-RPTPs are synaptic adhesion molecules; they bind to distinct synaptic membrane proteins and are physiologically responsible for mediating presynaptic development by shaping various synaptic adhesion pathways. They play roles in various aspects of neuronal development, including axon guidance, neurite extension, and synapse formation and function. LAR-RPTPs contain an extracellular region with three immunoglobulin-like (Ig) domains and four to eight fibronectin type III (FN3) repeats (determined by alternative splicing), a single transmembrane domain, followed by an intracellular region with a membrane-proximal catalytic PTP domain (repeat 1, also called D1) and a membrane-distal non-catalytic PTP-like domain (repeat 2, also called D2). This model represents the catalytic PTP domain (repeat 1).	238
350402	cd14554	R-PTP-LAR-2	PTP-like domain of the LAR family receptor-type tyrosine-protein phosphatases, repeat 2. The LAR (leukocyte common antigen-related) family of receptor-type tyrosine-protein phosphatases (RPTPs) includes three vertebrate members: LAR (or PTPRF), R-PTP-delta (or PTPRD), and R-PTP-sigma (or PTPRS). They belong to the larger family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. LAR-RPTPs are synaptic adhesion molecules; they bind to distinct synaptic membrane proteins and are physiologically responsible for mediating presynaptic development by shaping various synaptic adhesion pathways. They play roles in various aspects of neuronal development, including axon guidance, neurite extension, and synapse formation and function. LAR-RPTPs contain an extracellular region with three immunoglobulin-like (Ig) domains and four to eight fibronectin type III (FN3) repeats (determined by alternative splicing), a single transmembrane domain, followed by an intracellular region with a membrane-proximal catalytic PTP domain (repeat 1, also called D1) and a membrane-distal non-catalytic PTP-like domain (repeat 2, also called D2). This model represents the non-catalytic PTP-like domain (repeat 2).	238
350403	cd14555	R-PTPc-typeIIb-1	catalytic domain of type IIb (or R2B) subfamily receptor-type tyrosine-protein phosphatases, repeat 1. The type IIb (or R2B) subfamily of receptor protein tyrosine phosphatases (RPTPs) includes the prototypical member PTPmu (or PTPRM), PCP-2 (or PTPRU), PTPrho (or PTPRT), and PTPkappa (or PTPRK). They belong to the larger family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. Type IIb RPTPs mediate cell-cell adhesion through homophilic interactions; their ligand is an identical molecule on an adjacent cell. No heterophilic interactions between the subfamily members have been observed. They also commonly function as tumor suppressors. They contain an extracellular region with a Meprin-A5 (neuropilin)-mu (MAM) domain, an immunoglobulin (Ig) domain, and four fibronectin type III (FN3) repeats, a transmembrane domain, and an intracellular segment with a juxtamembrane domain similar to the cytoplasmic domain of classical cadherins and two tandem PTP domains. This model represents the first (repeat 1) PTP domain.	204
350404	cd14556	R-PTPc-typeIIb-2	PTP domain of type IIb (or R2B) subfamily receptor-type tyrosine-protein phosphatases, repeat 2. The type IIb (or R2B) subfamily of receptor protein tyrosine phosphatases (RPTPs) includes the prototypical member PTPmu (or PTPRM), PCP-2 (or PTPRU), PTPrho (or PTPRT), and PTPkappa (or PTPRK). They belong to the larger family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. Type IIb RPTPs mediate cell-cell adhesion through homophilic interactions; their ligand is an identical molecule on an adjacent cell. No heterophilic interactions between the subfamily members have been observed. They also commonly function as tumor suppressors. They contain an extracellular region with a Meprin-A5 (neuropilin)-mu (MAM) domain, an immunoglobulin (Ig) domain, and four fibronectin type III (FN3) repeats, a transmembrane domain, and an intracellular segment with a juxtamembrane domain similar to the cytoplasmic domain of classical cadherins and two tandem PTP domains. This model represents the second (repeat 2) PTP domain.	201
350405	cd14557	R-PTPc-C-1	catalytic domain of receptor-type tyrosine-protein phosphatase C, repeat 1. Receptor-type tyrosine-protein phosphatase C (PTPRC), also known as CD45, leukocyte common antigen (LCA) or GP180, belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRC/CD45 is found in all nucleated hematopoietic cells and is an essential regulator of T- and B-cell antigen receptor signaling. It controls immune response, both positively and negatively, by dephosphorylating a number of signaling molecules such as the Src family kinases, the CD3zeta chain of TCR, and ZAP-70 kinase. Mutations in the human PTPRC/CD45 gene are associated with severe combined immunodeficiency (SCID) and multiple sclerosis. PTPRC/CD45 contains an extracellular receptor-like region with fibronectin type III (FN3) repeats, a short transmembrane segment, and a cytoplasmic region comprising a membrane proximal catalytically active PTP domain (repeat 1 or D1) and a membrane distal catalytically impaired PTP-like domain (repeat 2, or D2). This model represents repeat 1.	201
350406	cd14558	R-PTP-C-2	PTP-like domain of receptor-type tyrosine-protein phosphatase C, repeat 2. Receptor-type tyrosine-protein phosphatase C (PTPRC), also known as CD45, leukocyte common antigen (LCA) or GP180, belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRC/CD45 is found in all nucleated hematopoietic cells and is an essential regulator of T- and B-cell antigen receptor signaling. It controls immune response, both positively and negatively, by dephosphorylating a number of signaling molecules such as the Src family kinases, the CD3zeta chain of TCR, and ZAP-70 kinase. Mutations in the human PTPRC/CD45 gene are associated with severe combined immunodeficiency (SCID) and multiple sclerosis. PTPRC/CD45 contains an extracellular receptor-like region with fibronectin type III (FN3) repeats, a short transmembrane segment, and a cytoplasmic region comprising a membrane proximal catalytically active PTP domain (repeat 1 or D1) and a membrane distal catalytically impaired PTP-like domain (repeat 2, or D2). This model represents repeat 2.	203
350407	cd14559	PTP_YopH-like	YopH and related bacterial protein tyrosine phosphatases. Yersinia outer protein H (YopH) belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. YopH is an essential virulence determinant of the pathogenic bacterium; it dephosphorylates several focal adhesion proteins, including p130Cas, in human epithelial cells, resulting in the disruption of focal adhesions and cell detachment from the extracellular matrix. It contains an N-terminal domain that contains signals required for TTSS-mediated delivery of YopH into host cells and a C-terminal catalytic PTP domain.	227
350408	cd14560	PTP_tensin-1	protein tyrosine phosphatase-like domain of tensin-1. Tensin-1 (TNS1) is part of the tensin family of intracellular proteins (tensin-1, -2, -3 and -4), which act as links between the extracellular matrix and the cytoskeleton, and thereby mediate signaling for cell shape and motility. It plays an essential role in TGF-beta-induced myofibroblast differentiation and myofibroblast-mediated formation of extracellular fibronectin and collagen matrix. It also positively regulates RhoA activity through its interaction with DLC1, a RhoGAP-containing tumor suppressor; the tensin-1-DLC1-RhoA signaling axis is critical in regulating cellular functions that lead to angiogenesis. Tensin-1 contains an N-terminal region with a protein tyrosine phosphatase (PTP)-like domain followed by a protein kinase C conserved region 2 (C2) domain, and a C-terminal region with SH2 and pTyr binding (PTB) domains.	159
350409	cd14561	PTP_tensin-3	protein tyrosine phosphatase-like domain of tensin-3. Tensin-3 (TNS3) is also called tensin-like SH2 domain-containing protein 1 (TENS1) or tumor endothelial marker (TEM6). It is part of the tensin family of intracellular proteins (tensin-1, -2, -3 and -4), which act as links between the extracellular matrix and the cytoskeleton, and thereby mediate signaling for cell shape and motility. Tensin-3 contributes to cell migration, anchorage-independent growth, tumorigenesis, and metastasis of cancer cells. It cooperates with Dock5, an exchange factor for the small GTPase Rac, for osteoclast activity to ensure the correct organization of podosomes. Tensin-3 contains an N-terminal region with a protein tyrosine phosphatase (PTP)-like domain followed by a protein kinase C conserved region 2 (C2) domain, and a C-terminal region with SH2 and pTyr binding (PTB) domains.	159
350410	cd14562	PTP_tensin-2	protein tyrosine phosphatase-like domain of tensin-2. Tensin-2 (TNS2) is also called tensin-like C1 domain-containing phosphatase (TENC1) or C1 domain-containing phosphatase and tensin homolog (C1-TEN). It is part of the tensin family of intracellular proteins (tensin-1, -2, -3 and -4), which act as links between the extracellular matrix and the cytoskeleton, and thereby mediate signaling for cell shape and motility. Tensin-2 is an essential component for the maintenance of glomerular basement membrane (GBM) structures. It also modulates cell contractility and remodeling of collagen fibers through DLC1, a RhoGAP that binds to tensins in focal adhesions. Tensin-2 may have phosphatase activity; it reduces AKT1 phosphorylation. It contains an N-terminal region with a zinc finger, a protein tyrosine phosphatase (PTP)-like domain and a protein kinase C conserved region 2 (C2) domain, and a C-terminal region with SH2 and pTyr binding (PTB) domains.	159
350411	cd14563	PTP_auxilin_N	N-terminal protein tyrosine phosphatase-like domain of auxilin. Auxilin, also called auxilin-1 or DnaJ homolog subfamily C member 6 (DNAJC6), is a J-domain containing protein that recruits the ATP-dependent chaperone Hsc70 to newly budded clathrin-coated vesicles and promotes uncoating of clathrin-coated vesicles, driving the clathrin assembly/disassembly cycle. Mutations in the DNAJC6 gene, encoding auxilin, are associated with early-onset Parkinson's disease. Auxilin contains an N-terminal protein tyrosine phosphatase (PTP)-like domain similar to the PTP-like domain of PTEN, a phosphoinositide 3-phosphatase, and a C-terminal region with clathrin-binding and J domains.	163
350412	cd14564	PTP_GAK	protein tyrosine phosphatase-like domain of cyclin-G-associated kinase. Cyclin-G-associated kinase (GAK), also called auxilin-2, contains an N-terminal protein kinase domain that phosphorylates the mu subunits of adaptor protein (AP) 1 and AP2. In addition, it contains an auxilin-1-like domain structure consisting of a protein tyrosine phosphatase (PTP)-like domain similar to the PTP-like domain of PTEN (a phosphoinositide 3-phosphatase), and a C-terminal region with clathrin-binding and J domains. Like auxilin-1, GAK facilitates Hsc70-mediated dissociation of clathrin from clathrin-coated vesicles. GAK is expressed ubiquitously and is enriched in the Golgi, unlike auxilin-1 which is nerve-specific. GAK also plays regulatory roles outside of clathrin-mediated membrane traffic including the maintenance of centrosome integrity and chromosome congression, neural patterning, survival of neurons, and immune responses through interaction with the interleukin 12 receptor.	163
350413	cd14565	DSP_MKP_classI	dual specificity phosphatase domain of class I mitogen-activated protein kinase phosphatase. Mitogen-activated protein kinase (MAPK) phosphatases (MKPs) are eukaryotic dual-specificity phosphatases (DUSPs) that act on MAPKs and function as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). They deactivate MAPKs by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. Based on sequence homology, subcellular localization and substrate specificity, 10 MKPs can be subdivided into three subfamilies (class I-III). Class I MKPs consist of DUSP1/MKP-1, DUSP2 (PAC1), DUSP4/MKP-2 and DUSP5. They are all mitogen- and stress-inducible nuclear MKPs. All MKPs contain an N-terminal Cdc25/rhodanese-like domain, which is responsible for MAPK-binding, and a C-terminal catalytic dual specificity phosphatase domain.	138
350414	cd14566	DSP_MKP_classII	dual specificity phosphatase domain of class II mitogen-activated protein kinase phosphatase. Mitogen-activated protein kinase (MAPK) phosphatases (MKPs) are eukaryotic dual-specificity phosphatases (DUSPs) that act on MAPKs and function as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). They deactivate MAPKs by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. Based on sequence homology, subcellular localization and substrate specificity, 10 MKPs can be subdivided into three subfamilies (class I-III). Class II MKPs consist of DUSP6/MKP-3, DUSP7/MKP-X and DUSP9/MKP-4, and are ERK-selective cytoplasmic MKPs. All MKPs contain an N-terminal Cdc25/rhodanese-like domain, which is responsible for MAPK-binding, and a C-terminal catalytic dual specificity phosphatase domain.	137
350415	cd14567	DSP_DUSP10	dual specificity phosphatase domain of dual specificity protein phosphatase 10. Dual specificity protein phosphatase 10 (DUSP10), also called mitogen-activated protein kinase (MAPK) phosphatase 5 (MKP-5), functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). Like other MKPs, it deactivates its MAPK substrates by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. It belongs to the class III subfamily and is a JNK/p38-selective cytoplasmic MKP. DUSP10/MKP-5 coordinates skeletal muscle regeneration by negatively regulating mitochondria-mediated apoptosis. It is also an important regulator of intestinal epithelial barrier function and a suppressor of colon tumorigenesis. DUSP10/MKP-5 contains an N-terminal Cdc25/rhodanese-like domain, which is responsible for MAPK-binding, and a C-terminal catalytic dual specificity phosphatase domain.	152
350416	cd14568	DSP_MKP_classIII	dual specificity phosphatase domain of class III mitogen-activated protein kinase phosphatase. Mitogen-activated protein kinase (MAPK) phosphatases (MKPs) are eukaryotic dual-specificity phosphatases (DUSPs) that act on MAPKs and function as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). They deactivate MAPKs by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. Based on sequence homology, subcellular localization and substrate specificity, 10 MKPs can be subdivided into three subfamilies (class I-III). Class III MKPs consist of DUSP8, DUSP10/MKP-5 and DUSP16/MKP-7, and are JNK/p38-selective phosphatases, which are found in both the cell nucleus and cytoplasm. All MKPs contain an N-terminal Cdc25/rhodanese-like domain, which is responsible for MAPK-binding, and a C-terminal catalytic dual specificity phosphatase domain.	140
350417	cd14569	DSP_slingshot_2	dual specificity phosphatase domain of slingshot homolog 2. Dual specificity protein phosphatase slingshot homolog 2 (SSH2), also called SSH-like protein 2, is part of the slingshot (SSH) family, whose members specifically dephosphorylate and reactivate Ser-3-phosphorylated cofilin (P-cofilin), an actin-binding protein that plays an essential role in actin filament dynamics. SSH2 has been identified as a target of protein kinase D1 that regulates cofilin phosphorylation and remodeling of the actin cytoskeleton during neutrophil chemotaxis. There are at least two human SSH2 isoforms reported: hSSH-2L (long) and hSSH-2. As SSH family phosphatases, they contain an N-terminal, SSH family-specific non-catalytic (SSH-N) domain, followed by a short domain with similarity to the C-terminal domain of the chromatin-associated protein DEK, and a dual specificity phosphatase catalytic domain. In addition, hSSH-2L contains a long C-terminal tail while hSSH-2 does not.	144
350418	cd14570	DSP_slingshot_1	dual specificity phosphatase domain of slingshot homolog 1. Dual specificity protein phosphatase slingshot homolog 1 (SSH1), also called SSH-like protein 1, is part of the slingshot (SSH) family, whose members specifically dephosphorylate and reactivate Ser-3-phosphorylated cofilin (P-cofilin), an actin-binding protein that plays an essential role in actin filament dynamics. SSH1 links NOD1 signaling to actin remodeling, facilitating the changes that lead to NF-kappaB activation and innate immune responses. There are at least two human SSH1 isoforms reported: hSSH-1L (long) and hSSH-1S (short). As SSH family phosphatases, they contain an N-terminal, SSH family-specific non-catalytic (SSH-N) domain, followed by a short domain with similarity to the C-terminal domain of the chromatin-associated protein DEK, and a dual specificity phosphatase catalytic domain. They also contain C-terminal tails, differing in the lengths of the tail.	144
350419	cd14571	DSP_slingshot_3	dual specificity phosphatase domain of slingshot homolog 3. Dual specificity protein phosphatase slingshot homolog 3 (SSH3), also called SSH-like protein 3, is part of the slingshot (SSH) family, whose members specifically dephosphorylate and reactivate Ser-3-phosphorylated cofilin (P-cofilin), an actin-binding protein that plays an essential role in actin filament dynamics. The Xenopus homolog (xSSH) is involved in the gastrulation movement. Mouse SSH3 dephosphorylates actin-depolymerizing factor (ADF) and cofilin but is dispensable for development. There are at least two human SSH3 isoforms reported: hSSH-3L (long) and hSSH-3. As SSH family phosphatases, they contain an N-terminal, SSH family-specific non-catalytic (SSH-N) domain, followed by a short domain with similarity to the C-terminal domain of the chromatin-associated protein DEK, and a dual specificity phosphatase catalytic domain. In addition, hSSH-3L contains a C-terminal tail while hSSH-3 does not.	144
350420	cd14572	DUSP14	dual specificity protein phosphatase 14. Dual specificity protein phosphatase 14 (DUSP14), also called mitogen-activated protein kinase (MAPK) phosphatase 6 (MKP-6) or MKP-1-like protein tyrosine phosphatase (MKP-L), functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). It deactivates its MAPK substrates by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. DUSP14 is an atypical DUSP; it contains the catalytic dual specificity phosphatase domain but lacks the N-terminal Cdc25/rhodanese-like domain that is present in typical DUSPs or MKPs. DUSP14 dephosphorylates JNK, ERK, and p38 in vitro. It also directly interacts with and dephosphorylates TGF-beta-activated kinase 1 (TAK1)-binding protein 1 (TAB1) in T cells, and negatively regulates TCR signaling and immune responses.	150
350421	cd14573	DUSP18_21	dual specificity protein phosphatases 18 and 21. This subfamily contains dual specificity protein phosphatase 18 (DUSP18), dual specificity protein phosphatase 21 (DUSP21), and similar proteins. They function as protein-serine/threonine phosphatases (EC 3.1.3.16) and protein-tyrosine-phosphatases (EC 3.1.3.48), and are atypical DUSPs. They contain the catalytic dual specificity phosphatase domain but lack the N-terminal Cdc25/rhodanese-like domain that is present in typical DUSPs or MKPs. DUSP18, also called low molecular weight dual specificity phosphatase 20 (LMW-DSP20), is a catalytically active phosphatase with a preference for phosphotyrosine over phosphoserine/threonine oligopeptides in vitro. In vivo, it has been shown to interact with and dephosphorylate SAPK/JNK, and may play a role in regulating the SAPK/JNK pathway. DUSP21 is also called low molecular weight dual specificity phosphatase 21 (LMW-DSP21). Its gene has been identified as a potential therapeutic target in human hepatocellular carcinoma. DUSP18 and DUSP21 target to opposing sides of the mitochondrial inner membrane.	158
350422	cd14574	DUSP28	dual specificity protein phosphatase 28. Dual specificity protein phosphatase 28 (DUSP28), also called VHP, functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). It is an atypical DUSP that contains the catalytic dual specificity phosphatase domain but lacks the N-terminal Cdc25/rhodanese-like domain that is present in typical DUSPs or MKPs. It has been implicated in hepatocellular carcinoma progression and in migratory activity and drug resistance of pancreatic cancer cells. DUSP28 has an exceptionally low phosphatase activity due to the presence of bulky residues in the active site pocket resulting in low accessibility.	140
350423	cd14575	DUPD1	dual specificity phosphatase and pro isomerase domain containing 1. Dual specificity phosphatase and pro isomerase domain containing 1 (DUPD1) was initially named as such because computational prediction appeared to encode a protein of 446 amino acids in length that included two catalytic domains: a proline isomerase and a dual specificity phosphatase (DUSP). However, it was subsequently shown that the true open reading frame only encompassed the DUSP domain and the gene product was therefore renamed DUSP27. This is distinct from inactive DUSP27. DUSPs function as protein-serine/threonine phosphatases (EC 3.1.3.16) and protein-tyrosine-phosphatases (EC 3.1.3.48). DUPD1/DUSP27 has been shown to have catalytic activity with preference for phosphotyrosine over phosphothreonine and phosphoserine residues. It associates with the short form of the prolactin (PRL) receptor and plays a role in PRL-mediated MAPK inhibition in ovarian cells.	160
350424	cd14576	DSP_iDUSP27	dual specificity phosphatase-like domain of inactive dual specificity protein phosphatase 27. Inactive dual specificity protein phosphatase 27 (DUSP27) may play a role in myofiber maturation. It is a pseudophosphatase containing a substitution of the active site cysteine into a serine. It is a large protein of more than 1000 amino acids in length with an N-terminal dual specificity phosphatase-like domain.	159
350425	cd14577	DUSP13B	dual specificity protein phosphatase 13 isoform B. Dual specificity protein phosphatase 13 isoform B (DUSP13B), also called testis- and skeletal-muscle-specific DSP (TMDP) or dual specificity phosphatase SKRP4, functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). It deactivates its MAPK substrates by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. DUSP13B is an atypical DUSP; it contains the catalytic dual specificity phosphatase domain but lacks the N-terminal Cdc25/rhodanese-like domain that is present in typical DUSPs or MKPs. DUSP13B inactivates MAPK activation in the order of selectivity, JNK = p38 > ERK in cells. It may play a role in protection from external stress during spermatogenesis.	163
350426	cd14578	DUSP26	dual specificity protein phosphatase 26. Dual specificity protein phosphatase 26 (DUSP26), also called mitogen-activated protein kinase (MAPK) phosphatase 8 (MKP-8) or low-molecular-mass dual-specificity phosphatase 4 (LDP-4), functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). It deactivates its MAPK substrates by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. DUSP26 is an atypical DUSP; it contains the catalytic dual specificity phosphatase domain but lacks the N-terminal Cdc25/rhodanese-like domain that is present in typical DUSPs or MKPs. It is a brain phosphatase highly overexpressed in neuroblastoma and has also been identified as a p53 phosphatase, dephosphorylating phospho-Ser20 and phospho-Ser37 in the p53 transactivation domain.	144
350427	cd14579	DUSP3	dual specificity protein phosphatase 3. Dual specificity protein phosphatase 3 (DUSP3), also called vaccinia H1-related phosphatase (VHR), functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). It deactivates its MAPK substrates by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. DUSP3 is an atypical DUSP; it contains the catalytic dual specificity phosphatase domain but lacks the N-terminal Cdc25/rhodanese-like domain that is present in typical DUSPs or MKPs. It favors bisphosphorylated substrates over monophosphorylated ones, and prefers pTyr peptides over pSer/pThr peptides. Reported physiological substrates include MAPKs ERK1/2, JNK, and p38, as well as STAT5, EGFR, and ErbB2. DUSP3 has been linked to breast and prostate cancer, and may also play a role in thrombosis.	168
350428	cd14580	DUSP13A	dual specificity protein phosphatase 13 isoform A. Dual specificity protein phosphatase 13 isoform A (DUSP13A), also called branching-enzyme interacting DSP or muscle-restricted DSP (MDSP), functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). It deactivates its MAPK substrates by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. DUSP13A is an atypical DUSP; it contains the catalytic dual specificity phosphatase domain but lacks the N-terminal Cdc25/rhodanese-like domain that is present in typical DUSPs or MKPs. DUSP13A also functions as a regulator of apoptosis signal-regulating kinase 1 (ASK1), a MAPK kinase kinase, by interacting with its N-terminal domain and inducing ASK1-mediated apoptosis through the activation of caspase-3. This function is independent of phosphatase activity.	145
350429	cd14581	DUSP22	dual specificity protein phosphatase 22. Dual specificity protein phosphatase 22 (DUSP22), also called JNK-stimulatory phosphatase-1 (JSP-1), low molecular weight dual specificity phosphatase 2 (LMW-DSP2), mitogen-activated protein kinase phosphatase x (MKP-x) or VHR-related MKPx (VHX), functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). It deactivates its MAPK substrates by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. DUSP22 is an atypical DUSP; it contains the catalytic dual specificity phosphatase domain but lacks the N-terminal Cdc25/rhodanese-like domain that is present in typical DUSPs or MKPs. DUSP22 negatively regulates the estrogen receptor-alpha-mediated signaling pathway and the IL6-leukemia inhibitory factor (LIF)-STAT3-mediated signaling pathway. It also regulates cell death by acting as a scaffold protein for the ASK1-MKK7-JNK signal transduction pathway independently of its phosphatase activity.	149
350430	cd14582	DSP_DUSP15	dual specificity phosphatase domain of dual specificity protein phosphatase 15. Dual specificity protein phosphatase 15 (DUSP15), also called Vaccinia virus VH1-related dual-specific protein phosphatase Y (VHY) or VH1-related member Y, functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). DUSP15 is an atypical DUSP; it contains the catalytic dual specificity phosphatase domain but lacks the N-terminal Cdc25/rhodanese-like domain that is present in typical DUSPs or MKPs. It is highly expressed in the testis and is located in the plasma membrane in a myristoylation-dependent manner. It may be involved in the regulation of meiotic signal transduction in testis cells. It is also expressed in the brain and has been identified as a regulator of oligodendrocyte differentiation. DUSP15 contains an N-terminal catalytic dual specificity phosphatase domain and a short C-terminal tail.	146
350431	cd14583	PTP-MTMR7	protein tyrosine phosphatase-like domain of myotubularin related phosphoinositide phosphatase 7. Myotubularin related phosphoinositide phosphatase 7 (MTMR7) is enzymatically active and contains a C-terminal coiled-coil domain and an N-terminal PH-GRAM domain. In general, myotubularins are a unique subgroup of protein tyrosine phosphatases that use inositol phospholipids, rather than phosphoproteins, as substrates. They dephosphorylate the D-3 position of phosphatidylinositol 3-phosphate [PI(3)P] and phosphatidylinositol 3,5-bisphosphate [PI(3,5)P2], generating phosphatidylinositol and phosphatidylinositol 5-phosphate [PI(5)P], respectively. In neuronal cells, MTMR7 forms a complex with catalytically inactive MTMR9 and dephosphorylates phosphatidylinositol 3-phosphate and Ins(1,3)P2.	302
350432	cd14584	PTP-MTMR8	protein tyrosine phosphatase-like domain of myotubularin related phosphoinositide phosphatase 8. Myotubularin related phosphoinositide phosphatase 8 (MTMR8) is enzymatically active and contains a C-terminal coiled-coil domain and an N-terminal PH-GRAM domain. In general, myotubularins are a unique subgroup of protein tyrosine phosphatases that use inositol phospholipids, rather than phosphoproteins, as substrates. They dephosphorylate the D-3 position of phosphatidylinositol 3-phosphate [PI(3)P] and phosphatidylinositol 3,5-bisphosphate [PI(3,5)P2], generating phosphatidylinositol and phosphatidylinositol 5-phosphate [PI(5)P], respectively. MTMR8 forms a complex with catalytically inactive MTMR9 and preferentially dephosphorylates PtdIns(3)P; the MTMR8/R9 complex inhibits autophagy. In zebrafish, it cooperates with PI3K to regulate actin filament modeling and muscle development.	308
350433	cd14585	PTP-MTMR6	protein tyrosine phosphatase-like domain of myotubularin related phosphoinositide phosphatase 6. Myotubularin related phosphoinositide phosphatase 6 is enzymatically active and contains a C-terminal coiled-coil domain and an N-terminal PH-GRAM domain. In general, myotubularins are a unique subgroup of protein tyrosine phosphatases that use inositol phospholipids, rather than phosphoproteins, as substrates. They dephosphorylate the D-3 position of phosphatidylinositol 3-phosphate [PI(3)P] and phosphatidylinositol 3,5-bisphosphate [PI(3,5)P2], generating phosphatidylinositol and phosphatidylinositol 5-phosphate [PI(5)P], respectively. MTMR6 forms a complex with catalytically inactive MTMR9 and preferentially dephosphorylates PtdIns(3,5)P(2); the MTMR6/R9 complex serves to inhibit stress-induced apoptosis.	302
350434	cd14586	PTP-MTMR3	protein tyrosine phosphatase-like domain of myotubularin related phosphoinositide phosphatase 3. Myotubularin related phosphoinositide phosphatase 3 (MTMR3), also known as FYVE domain-containing dual specificity protein phosphatase 1 (FYVE-DSP1) or Zinc finger FYVE domain-containing protein 10 (ZFYVE10), is enzymatically active and contains a C-terminal FYVE domain and an N-terminal PH-GRAM domain. In general, myotubularins are a unique subgroup of protein tyrosine phosphatases that use inositol phospholipids, rather than phosphoproteins, as substrates. They dephosphorylate the D-3 position of phosphatidylinositol 3-phosphate [PI(3)P] and phosphatidylinositol 3,5-bisphosphate [PI(3,5)P2], generating phosphatidylinositol and phosphatidylinositol 5-phosphate [PI(5)P], respectively. Together with phosphoinositide 5-kinase PIKfyve, phosphoinositide 3-phosphatase MTMR3 constitutes a phosphoinositide loop that produces PI(5)P via PI(3,5)P2 and regulates cell migration.	317
350435	cd14587	PTP-MTMR4	protein tyrosine phosphatase-like domain of myotubularin related phosphoinositide phosphatase 4. Myotubularin related phosphoinositide phosphatase 4 (MTMR4), also known as FYVE domain-containing dual specificity protein phosphatase 2 (FYVE-DSP2) or zinc finger FYVE domain-containing protein 11 (ZFYVE11), is enzymatically active and contains a C-terminal FYVE domain and an N-terminal PH-GRAM domain. In general, myotubularins are a unique subgroup of protein tyrosine phosphatases that use inositol phospholipids, rather than phosphoproteins, as substrates. They dephosphorylate the D-3 position of phosphatidylinositol 3-phosphate [PI(3)P] and phosphatidylinositol 3,5-bisphosphate [PI(3,5)P2], generating phosphatidylinositol and phosphatidylinositol 5-phosphate [PI(5)P], respectively. MTMR4 localizes at the interface of early and recycling endosomes to regulate trafficking through this pathway. It plays a role in bacterial pathogenesis by stabilizing the integrity of bacteria-containing vacuoles.	308
350436	cd14588	PTP-MTMR5	protein tyrosine phosphatase-like pseudophosphatase domain of myotubularin related phosphoinositide phosphatase 5. Myotubularin related phosphoinositide phosphatase 5 (MTMR5), also known as SET binding factor 1 (SBF1), is enzymatically inactive and contains a variety of other domains, including a DENN and a PH-like domain. Mutations in the MTMR5 gene cause Charcot-Marie-Tooth disease type 4B3. In general, myotubularins are a unique subgroup of protein tyrosine phosphatases that use inositol phospholipids, rather than phosphoproteins, as substrates. MTMR5 is a pseudophosphatase that lacks the catalytic cysteine in its catalytic pocket. It interacts with MTMR2, an active myotubularin related phosphatidylinositol phosphatase, and regulates its enzymatic activity and subcellular location.	291
350437	cd14589	PTP-MTMR13	protein tyrosine phosphatase-like pseudophosphatase domain of myotubularin related phosphoinositide phosphatase 13. Myotubularin related phosphoinositide phosphatase 13 (MTMR13), also known as SET binding factor 2 (SBF2), is enzymatically inactive and contains a variety of other domains, including a DENN and a PH-like domain. Mutations in MTMR13 cause Charcot-Marie-Tooth type 4B2, a severe childhood-onset neuromuscular disorder, characterized by demyelination and redundant loops of myelin known as myelin outfoldings, a similar phenotype as mutations in MTMR2. In general, myotubularins are a unique subgroup of protein tyrosine phosphatases that use inositol phospholipids, rather than phosphoproteins, as substrates. MTMR13 is a pseudophosphatase that lacks the catalytic cysteine in its catalytic pocket. It is believed to interact with MTMR2 and stimulate its phosphatase activity. It is also a guanine nucleotide exchange factor (GEF) which may activate RAB28, promoting the exchange of GDP to GTP and converting inactive GDP-bound Rab proteins into their active GTP-bound form.	297
350438	cd14590	PTP-MTMR2	protein tyrosine phosphatase-like domain of myotubularin related phosphoinositide phosphatase 2. Myotubularin related phosphoinositide phosphatase 2 (MTMR2) is enzymatically active and contains an additional N-terminal PH-GRAM domain and C-terminal coiled-coil domain and PDZ binding site. Mutations in MTMR2 cause Charcot-Marie-Tooth type 4B1, a severe childhood-onset neuromuscular disorder, characterized by demyelination and redundant loops of myelin known as myelin outfoldings, a similar phenotype as mutations in MTMR13. MTMR13, an inactive phosphatase, is believed to interact with MTMR2 and stimulate its phosphatase activity. In general, myotubularins are a unique subgroup of protein tyrosine phosphatases that use inositol phospholipids, rather than phosphoproteins, as substrates. They dephosphorylate the D-3 position of phosphatidylinositol 3-phosphate [PI(3)P] and phosphatidylinositol 3,5-bisphosphate [PI(3,5)P2], generating phosphatidylinositol and phosphatidylinositol 5-phosphate [PI(5)P], respectively.	262
350439	cd14591	PTP-MTM1	protein tyrosine phosphatase-like domain of myotubularin phosphoinositide phosphatase 1. Myotubularin phosphoinositide phosphatase 1 (MTM1), also called myotubularin, is enzymatically active and contains an N-terminal PH-GRAM domain and C-terminal coiled-coil domain and PDZ binding site. Mutations in MTM1 cause X-linked myotubular myopathy. In general, myotubularins are a unique subgroup of protein tyrosine phosphatases that use inositol phospholipids, rather than phosphoproteins, as substrates. They dephosphorylate the D-3 position of phosphatidylinositol 3-phosphate [PI(3)P] and phosphatidylinositol 3,5-bisphosphate [PI(3,5)P2], generating phosphatidylinositol and phosphatidylinositol 5-phosphate [PI(5)P], respectively.	249
350440	cd14592	PTP-MTMR1	protein tyrosine phosphatase-like domain of myotubularin related phosphoinositide phosphatase 1. Myotubularin-related phosphoinositide phosphatase 1 (MTMR1) is enzymatically active and contains an N-terminal PH-GRAM domain, a C-terminal coiled-coil domain and a PDZ binding site. MTMR1 is associated with myotonic dystrophy. In general, myotubularins are a unique subgroup of protein tyrosine phosphatases that use inositol phospholipids, rather than phosphoproteins, as substrates. They dephosphorylate the D-3 position of phosphatidylinositol 3-phosphate [PI(3)P] and phosphatidylinositol 3,5-bisphosphate [PI(3,5)P2], generating phosphatidylinositol and phosphatidylinositol 5-phosphate [PI(5)P], respectively.	249
350441	cd14593	PTP-MTMR10	protein tyrosine phosphatase-like pseudophosphatase domain of myotubularin related phosphoinositide phosphatase 10. Myotubularin related phosphoinositide phosphatase 10 (MTMR10) is enzymatically inactive and contains an N-terminal PH-GRAM domain. In general, myotubularins are a unique subgroup of protein tyrosine phosphatases that use inositol phospholipids, rather than phosphoproteins, as substrates. MTMR10 is a pseudophosphatase that lacks the catalytic cysteine in its catalytic pocket.	195
350442	cd14594	PTP-MTMR12	protein tyrosine phosphatase-like pseudophosphatase domain of myotubularin related phosphoinositide phosphatase 12. Myotubularin related phosphoinositide phosphatase 12 (MTMR12), also called phosphatidylinositol 3 phosphate 3-phosphatase adapter subunit (3-PAP), is enzymatically inactive and contains a C-terminal coiled-coil domain and an N-terminal PH-GRAM domain. In general, myotubularins are a unique subgroup of protein tyrosine phosphatases that use inositol phospholipids, rather than phosphoproteins, as substrates. MTMR12 is a pseudophosphatase that lacks the catalytic cysteine in its catalytic pocket. It functions as an adapter for the catalytically active myotubularin to regulate its intracellular location.	203
350443	cd14595	PTP-MTMR11	protein tyrosine phosphatase-like pseudophosphatase domain of myotubularin related phosphoinositide phosphatase 11. Myotubularin related phosphoinositide phosphatase 11 (MTMR11), also called cisplatin resistance-associated protein (hCRA) in humans, is enzymatically inactive and contains a C-terminal coiled-coil domain and an N-terminal PH-GRAM domain. In general, myotubularins are a unique subgroup of protein tyrosine phosphatases that use inositol phospholipids, rather than phosphoproteins, as substrates. MTMR11 is a pseudophosphatase that lacks the catalytic cysteine in its catalytic pocket.	195
350444	cd14596	PTPc-N20	catalytic domain of tyrosine-protein phosphatase non-receptor type 20. Tyrosine-protein phosphatase non-receptor type 20 (PTPN20) belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. Human PTPN20 is a widely expressed phosphatase with a dynamic subcellular distribution that is targeted to sites of actin polymerization.	207
350445	cd14597	PTPc-N13	catalytic domain of tyrosine-protein phosphatase non-receptor type 13. Tyrosine-protein phosphatase non-receptor type 13 (PTPN13, also known as PTPL1) belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. Human PTPN13 is an important regulator of tumor aggressiveness. It regulates breast cancer cell aggressiveness through direct inactivation of Src kinase. In hepatocellular carcinoma, PTPN13 is a tumor suppressor. PTPN13 contains a FERM domain, five PDZ domains, and a C-terminal catalytic PTP domain. With its PDZ domains, PTPN13 has numerous interacting partners that can actively participate in the regulation of its phosphatase activity or can permit direct or indirect recruitment of tyrosine phosphorylated substrates. Its FERM domain is necessary for localization to the membrane.	234
350446	cd14598	PTPc-N21	catalytic domain of tyrosine-protein phosphatase non-receptor type 21. Tyrosine-protein phosphatase non-receptor type 21 (PTPN21), also called protein-tyrosine phosphatase D1, belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPN21 is a component of a multivalent scaffold complex nucleated by focal adhesion kinase (FAK) at specific intracellular sites. It promotes cytoskeleton events that induce cell adhesion and migration by modulating Src-FAK signaling. It can also selectively associate with and stimulate Tec family kinases and modulate Stat3 activation. Human PTPN21 may also play a pathologic role in gastrointestinal tract tumorigenesis. PTPN21 contains an N-terminal FERM domain and a C-terminal catalytic PTP domain, separated by a long intervening sequence.	220
350447	cd14599	PTPc-N14	catalytic domain of tyrosine-protein phosphatase non-receptor type 14. Tyrosine-protein phosphatase non-receptor type 14 (PTPN14), also called protein-tyrosine phosphatase pez, belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPN14 is a potential tumor suppressor and plays a regulatory role in the Hippo and Wnt/beta-catenin signaling pathways. It contains an N-terminal FERM domain and a C-terminal catalytic PTP domain, separated by a long intervening sequence.	287
350448	cd14600	PTPc-N3	catalytic domain of tyrosine-protein phosphatase non-receptor type 3. Tyrosine-protein phosphatase non-receptor type 3 (PTPN3), also called protein-tyrosine phosphatase H1 (PTP-H1), belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPN3 interacts with mitogen-activated protein kinase p38gamma and serves as its specific phosphatase. PTPN3 and p38gamma cooperate to promote Ras-induced oncogenesis. PTPN3 is a large modular protein containing an N-terminal FERM domain, a PDZ domain and a C-terminal catalytic PTP domain. Its PDZ domain binds with the PDZ-binding motif of p38gamma and enables efficient tyrosine dephosphorylation.	274
350449	cd14601	PTPc-N4	catalytic domain of tyrosine-protein phosphatase non-receptor type 4. Tyrosine-protein phosphatase non-receptor type 4 (PTPN4), also called protein-tyrosine phosphatase MEG1, belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPN4 functions in TCR cell signaling, apoptosis, cerebellar synaptic plasticity, and innate immune responses. It specifically inhibits the TRIF-dependent TLR4 pathway by suppressing tyrosine phosphorylation of TRAM. It is a large modular protein containing an N-terminal FERM domain, a PDZ domain and a C-terminal catalytic PTP domain; the PDZ domain regulates the catalytic activity of PTPN4.	212
350450	cd14602	PTPc-N22	catalytic domain of tyrosine-protein phosphatase non-receptor type 22. Tyrosine-protein phosphatase non-receptor type 22 (PTPN22), also called lymphoid phosphatase (LyP), PEST-domain phosphatase (PEP), or hematopoietic cell protein-tyrosine phosphatase 70Z-PEP, belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPN22 is expressed in hematopoietic cells and it functions as a key regulator of immune homeostasis by inhibiting T-cell receptor signaling through the direct dephosphorylation of Src family kinases (Lck and Fyn), ITAMs of the TCRz/CD3 complex, and other signaling molecules. Mutations in the PTPN22 gene are associated with multiple connective tissue and autoimmune diseases including type 1 diabetes mellitus, rheumatoid arthritis, and systemic lupus erythematosus. PTPN22 contains an N-terminal catalytic PTP domain and four proline-rich regions at the C-terminus.	234
350451	cd14603	PTPc-N18	catalytic domain of tyrosine-protein phosphatase non-receptor type 18. Tyrosine-protein phosphatase non-receptor type 18 (PTPN18), also called brain-derived phosphatase, belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPN18 regulates HER2-mediated cellular functions through defining both its phosphorylation and ubiquitination states. The N-terminal catalytic PTP domain of PTPN18 blocks lysosomal routing and delays the degradation of HER2 by dephosphorylation, and its C-terminal PEST domain promotes K48-linked HER2 ubiquitination and its destruction via the proteasome pathway.	266
350452	cd14604	PTPc-N12	catalytic domain of tyrosine-protein phosphatase non-receptor type 12. Tyrosine-protein phosphatase non-receptor type 12 (PTPN12), also called PTP-PEST or protein-tyrosine phosphatase G1 (PTPG1), belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPN12 is characterized as a tumor suppressor and a pivotal regulator of EGFR/HER2 signaling. It regulates various physiological processes, including cell migration, immune response, and neuronal activity, by dephosphorylating multiple substrates including HER2, FAK, PYK2, PSTPIP, WASP, p130Cas, paxillin, Shc, catenin, c-Abl, ArgBP2, p190RhoGAP, RhoGDI, cell adhesion kinase beta, and Rho GTPase.	297
350453	cd14605	PTPc-N11	catalytic domain of tyrosine-protein phosphatase non-receptor type 11. Tyrosine-protein phosphatase non-receptor type 11 (PTPN11), also called SH2 domain-containing tyrosine phosphatase 2 (SHP-2 or SHP2), belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPN11 promotes the activation of the RAS/Mitogen-Activated Protein Kinases (MAPK) Extracellular-Regulated Kinases 1/2 (ERK1/2) pathway, a canonical signaling cascade that plays key roles in various cellular processes, including proliferation, survival, differentiation, migration, or metabolism. It also regulates the phosphoinositide 3-kinase (PI3K)/AKT pathway, a fundamental cascade that functions in cell survival, proliferation, migration, morphogenesis, and metabolism. PTPN11 dysregulation is associated with several developmental diseases and malignancies, such as Noonan syndrome and juvenile myelomonocytic leukemia. It contains two tandem SH2 domains, a catalytic PTP domain, and a C-terminal tail with regulatory properties.	253
350454	cd14606	PTPc-N6	catalytic domain of tyrosine-protein phosphatase non-receptor type 6. Tyrosine-protein phosphatase non-receptor type 6 (PTPN6), also called SH2 domain-containing protein-tyrosine phosphatase 1 (SHP1 or SHP-1), belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPN6 expression is restricted mainly to hematopoietic and epithelial cells. It is an important regulator of hematopoietic cells, downregulating pathways that promote cell growth, survival, adhesion, and activation. It regulates glucose homeostasis by modulating insulin signalling in the liver and muscle, and it also negatively regulates bone resorption, affecting both the formation and the function of osteoclasts. PTPN6 contains two tandem SH2 domains, a catalytic PTP domain, and a C-terminal tail with regulatory properties.	266
350455	cd14607	PTPc-N2	catalytic domain of tyrosine-protein phosphatase non-receptor type 2. Tyrosine-protein phosphatase non-receptor type 2 (PTPN2), also called T-cell protein-tyrosine phosphatase (TCPTP), belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPN2, a tumor suppressor, dephosphorylates and inactivates EGFRs, Src family kinases, Janus-activated kinases (JAKs)-1 and -3, and signal transducers and activators of transcription (STATs)-1, -3 and -5, in a cell type- and context-dependent manner. It is deleted in 6% of all T-cell acute lymphoblastic leukemias and is associated with constitutive JAK1/STAT5 signaling and tumorigenesis.	257
350456	cd14608	PTPc-N1	catalytic domain of tyrosine-protein phosphatase non-receptor type 1. Tyrosine-protein phosphatase non-receptor type 1 (PTPN1), also called protein-tyrosine phosphatase 1B (PTP-1B), belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPN1/PTP-1B is the first PTP to be purified and characterized and is the prototypical intracellular PTP found in a wide variety of human tissues. It contains an N-terminal catalytic PTP domain, followed by two tandem proline-rich motifs that mediate interaction with SH3-domain-containing proteins, and a small hydrophobic stretch that localizes the enzyme to the endoplasmic reticulum (ER). It dephosphorylates and regulates the activity of a number of receptor tyrosine kinases, including the insulin receptor, the EGF receptor, and the PDGF receptor.	277
350457	cd14609	R-PTP-N	PTP-like domain of receptor-type tyrosine-protein phosphatase N. Receptor-type tyrosine-protein phosphatase-like N (PTPRN or R-PTP-N), also called islet cell antigen 512 (ICA512) or PTP IA-2, belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). It consists of a large ectodomain that contains a RESP18HD (regulated endocrine-specific protein 18 homology domain), followed by a transmembrane segment, and a single, catalytically-impaired, PTP domain. PTPRN is located in secretory granules of neuroendocrine cells and is involved in the generation, cargo storage, traffic, exocytosis and recycling of insulin secretory granules, as well as in beta-cell proliferation. It is a major autoantigen in type 1 diabetes and is involved in the regulation of insulin secretion.	281
350458	cd14610	R-PTP-N2	PTP-like domain of receptor-type tyrosine-protein phosphatase N2. Receptor-type tyrosine-protein phosphatase N2 (PTPRN2 or R-PTP-N2), also called islet cell autoantigen-related protein (IAR), ICAAR, phogrin, or IA-2beta, belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). It consists of a large ectodomain that contains a RESP18HD (regulated endocrine-specific protein 18 homology domain), followed by a transmembrane segment, and a single, catalytically-impaired, PTP domain. It is mainly expressed in neuropeptidergic neurons and peptide-secreting endocrine cells, including insulin-producing pancreatic beta-cells. It may function as a phosphatidylinositol phosphatase to regulate insulin secretion. It is also required for normal accumulation of the neurotransmitters norepinephrine, dopamine and serotonin in the brain.	283
350459	cd14611	R-PTPc-R	catalytic domain of receptor-type tyrosine-protein phosphatase R. Receptor-type tyrosine-protein phosphatase-like R (PTPRR or R-PTP-R), also called protein-tyrosine phosphatase PCPTP1, belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRR is a kinase interaction motif (KIM)-PTP, characterized by the presence of a 16-amino-acid KIM that binds specifically to members of the MAPK (mitogen-activated protein kinase) family. The human and mouse PTPRR gene produces multiple neuronal protein isoforms of varying sizes (in human, PTPPBS-alpha, beta, gamma and delta). All isoforms contain the KIM motif and the catalytic PTP domain. PTPRR-deficient mice show significant defects in fine motor coordination and balance skills that are reminiscent of a mild ataxia.	226
350460	cd14612	PTPc-N7	catalytic domain of tyrosine-protein phosphatase non-receptor type 7. Tyrosine-protein phosphatase non-receptor type 7 (PTPN7), also called hematopoietic protein-tyrosine phosphatase (HePTP) or LC-PTP, belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPN7/HePTP is a kinase interaction motif (KIM)-PTP, characterized by the presence of a 16-amino-acid KIM that binds specifically to members of the MAPK (mitogen-activated protein kinase) family. PTPN7/HePTP is found exclusively in the white blood cells in bone marrow, thymus, spleen, lymph nodes and all myeloid and lymphoid cell lines. It negatively regulates T-cell activation and proliferation, and is often dysregulated in the preleukemic disorder myelodysplastic syndrome, as well as in acute myelogenous leukemia.	247
350461	cd14613	PTPc-N5	catalytic domain of tyrosine-protein phosphatase non-receptor type 5. Tyrosine-protein phosphatase non-receptor type 5 (PTPN5), also called striatum-enriched protein-tyrosine phosphatase (STEP) or neural-specific PTP, belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPN5/STEP is a kinase interaction motif (KIM)-PTP, characterized by the presence of a 16-amino-acid KIM that binds specifically to members of the MAPK (mitogen-activated protein kinase) family. It is a CNS-enriched protein that regulates key signaling proteins required for synaptic strengthening, as well as NMDA and AMPA receptor trafficking. PTPN5 is implicated in multiple neurologic and neuropsychiatric disorders, such as Alzheimer's disease, Parkinson's disease, schizophrenia, and fragile X syndrome.	258
350462	cd14614	R-PTPc-O	catalytic domain of receptor-type tyrosine-protein phosphatase O. Receptor-type tyrosine-protein phosphatase O (PTPRO or R-PTP-O), also known as glomerular epithelial protein 1 or protein tyrosine phosphatase U2 (PTP-U2), belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRO is a member of the R3 subfamily of receptor-type phosphotyrosine phosphatases (RPTP), characterized by a unique modular composition consisting of multiple extracellular fibronectin type III (FN3) repeats and a single (most RPTP subtypes have two) cytoplasmic catalytic PTP domain. It is essential for sustaining the structure and function of foot processes by regulating tyrosine phosphorylation of podocyte proteins. It has been identified as a synaptic cell adhesion molecule (CAM) that serves as a potent initiator of synapse formation. It is also a tumor suppressor in several types of cancer, such as hepatocellular carcinoma, lung cancer, and breast cancer.	245
350463	cd14615	R-PTPc-J	catalytic domain of receptor-type tyrosine-protein phosphatase J. Receptor-type tyrosine-protein phosphatase J (PTPRJ or R-PTP-J), also known as receptor-type tyrosine-protein phosphatase eta (R-PTP-eta) or density-enhanced phosphatase 1 (DEP-1) or CD148, belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRJ is a member of the R3 subfamily of receptor-type phosphotyrosine phosphatases (RPTP), characterized by a unique modular composition consisting of multiple extracellular fibronectin type III (FN3) repeats (eight in PTPRJ) and a single (most RPTP subtypes have two) cytoplasmic catalytic PTP domain. It is expressed in various cell types including epithelial, hematopoietic, and endothelial cells. It plays a role in cell adhesion, migration, proliferation and differentiation. It dephosphorylates or contributes to the dephosphorylation of various substrates including protein kinases such as FLT3, PDGFRB, MET, RET (variant MEN2A), VEGFR-2, LYN, SRC, MAPK1, MAPK3, and EGFR, as well as PIK3R1 and PIK3R2.	229
350464	cd14616	R-PTPc-Q	catalytic domain of receptor-type tyrosine-protein phosphatase Q. Receptor-type tyrosine-protein phosphatase Q (PTPRQ or R-PTP-Q), also called phosphatidylinositol phosphatase PTPRQ, belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRQ is a member of the R3 subfamily of receptor-type phosphotyrosine phosphatases (RPTP), characterized by a unique modular composition consisting of multiple extracellular fibronectin type III (FN3) repeats (18 in PTPRQ) and a single (most RPTP subtypes have two) cytoplasmic catalytic PTP domain. It displays low tyrosine-protein phosphatase activity; rather, it functions as a phosphatidylinositol phosphatase required for auditory processes. It regulates the levels of phosphatidylinositol 4,5-bisphosphate (PIP2) in the basal region of hair bundles. It can dephosphorylate a broad range of phosphatidylinositol phosphates, including phosphatidylinositol 3,4,5-trisphosphate and most phosphatidylinositol monophosphates and diphosphates.	224
350465	cd14617	R-PTPc-B	catalytic domain of receptor-type tyrosine-protein phosphatase B. Receptor-type tyrosine-protein phosphatase B (PTPRB), also known as receptor-type tyrosine-protein phosphatase beta (R-PTP-beta) or vascular endothelial protein tyrosine phosphatase (VE-PTP), belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRB/VE-PTP is a member of the R3 subfamily of receptor-type phosphotyrosine phosphatases (RPTP), characterized by a unique modular composition consisting of multiple extracellular fibronectin type III (FN3) repeats and a single (most RPTP subtypes have two) cytoplasmic catalytic PTP domain. It is expressed specifically in vascular endothelial cells and it plays an important role in blood vessel remodeling and angiogenesis.	228
350466	cd14618	R-PTPc-V	catalytic domain of receptor-type tyrosine-protein phosphatase V. Receptor-type tyrosine-protein phosphatase V (PTPRV or R-PTP-V), also known as embryonic stem cell protein-tyrosine phosphatase (ES cell phosphatase) or osteotesticular protein-tyrosine phosphatase (OST-PTP), belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRV is a member of the R3 subfamily of receptor-type phosphotyrosine phosphatases (RPTP), characterized by a unique modular composition consisting of multiple extracellular fibronectin type III (FN3) repeats and a single (most RPTP subtypes have two) cytoplasmic catalytic PTP domain. In rodents, it may play a role in the maintenance of pluripotency and may function in signaling pathways during bone remodeling. It is the only PTP whose function has been lost between rodent and human. The human OST-PTP gene is a pseudogene.	230
350467	cd14619	R-PTPc-H	catalytic domain of receptor-type tyrosine-protein phosphatase H. Receptor-type tyrosine-protein phosphatase H (PTPRH or R-PTP-H), also known as stomach cancer-associated protein tyrosine phosphatase 1 (SAP-1), belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRH is a member of the R3 subfamily of receptor-type phosphotyrosine phosphatases (RPTP), characterized by a unique modular composition consisting of multiple extracellular fibronectin type III (FN3) repeats and a single (most RPTP subtypes have two) cytoplasmic catalytic PTP domain. It is localized specifically at microvilli of the brush border in gastrointestinal epithelial cells. It plays a role in intestinal immunity by regulating CEACAM20 through tyrosine dephosphorylation. It is also a negative regulator of integrin-mediated signaling and may contribute to contact inhibition of cell growth and motility.	233
350468	cd14620	R-PTPc-E-1	catalytic domain of receptor-type tyrosine-protein phosphatase E, repeat 1. Receptor-type tyrosine-protein phosphatase E (PTPRE), also known as receptor-type tyrosine-protein phosphatase epsilon (R-PTP-epsilon), belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. The PTPRE gene contains two distinct promoters that generate the two major isoforms: transmembrane (receptor type RPTPe or PTPeM) and cytoplasmic (cyt-PTPe or PTPeC). Receptor type RPTPe plays a critical role in signal transduction pathways and phosphoprotein network topology in red blood cells, and may also play a role in osteoclast formation and function. It also negatively regulates PDGFRbeta-mediated signaling pathways that are crucial for the pathogenesis of atherosclerosis. cyt-PTPe acts as a negative regulator of insulin receptor signaling in skeletal muscle. It regulates insulin-induced phosphorylation of proteins downstream of the insulin receptor. Receptor type RPTPe contains a small extracellular region, a single transmembrane segment, and an intracellular region with two tandem catalytic PTP domains. This model represents the first PTP domain (repeat 1).	229
350469	cd14621	R-PTPc-A-1	catalytic domain of receptor-type tyrosine-protein phosphatase A, repeat 1. Receptor-type tyrosine-protein phosphatase A (PTPRA), also known as receptor-type tyrosine-protein phosphatase alpha (R-PTP-alpha), belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRA is a positive regulator of Src and Src family kinases via dephosphorylation of the Src-inhibitory tyrosine 527. Thus, it affects transformation and tumorigenesis, inhibition of proliferation, cell cycle arrest, integrin signaling, neuronal differentiation and outgrowth, and ion channel activity. It is also involved in interleukin-1 signaling in fibroblasts through its interaction with the focal adhesion targeting domain of focal adhesion kinase. PTPRA comprises a small extracellular domain, a transmembrane segment, and an intracellular region containing two tandem catalytic PTP domains. This model represents the first catalytic PTP domain (repeat 1).	296
350470	cd14622	R-PTPc-E-2	catalytic domain of receptor-type tyrosine-protein phosphatase E, repeat 2. Receptor-type tyrosine-protein phosphatase E (PTPRE), also known as receptor-type tyrosine-protein phosphatase epsilon (R-PTP-epsilon), belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. The PTPRE gene contains two distinct promoters that generate the two major isoforms: transmembrane (receptor type RPTPe or PTPeM) and cytoplasmic (cyt-PTPe or PTPeC). Receptor type RPTPe plays a critical role in signal transduction pathways and phosphoprotein network topology in red blood cells, and may also play a role in osteoclast formation and function. It also negatively regulates PDGFRbeta-mediated signaling pathways that are crucial for the pathogenesis of atherosclerosis. cyt-PTPe acts as a negative regulator of insulin receptor signaling in skeletal muscle. It regulates insulin-induced phosphorylation of proteins downstream of the insulin receptor. Receptor type RPTPe contains a small extracellular region, a single transmembrane segment, and an intracellular region with two tandem catalytic PTP domains. This model represents the second PTP domain (repeat 2).	205
350471	cd14623	R-PTPc-A-2	catalytic domain of receptor-type tyrosine-protein phosphatase A, repeat 2. Receptor-type tyrosine-protein phosphatase A (PTPRA), also known as receptor-type tyrosine-protein phosphatase alpha (R-PTP-alpha), belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRA is a positive regulator of Src and Src family kinases via dephosphorylation of the Src-inhibitory tyrosine 527. Thus, it affects transformation and tumorigenesis, inhibition of proliferation, cell cycle arrest, integrin signaling, neuronal differentiation and outgrowth, and ion channel activity. It is also involved in interleukin-1 signaling in fibroblasts through its interaction with the focal adhesion targeting domain of focal adhesion kinase. PTPRA comprises a small extracellular domain, a transmembrane segment, and an intracellular region containing two tandem catalytic PTP domains. This model represents the second PTP domain (repeat 2).	228
350472	cd14624	R-PTPc-D-1	catalytic domain of receptor-type tyrosine-protein phosphatase D, repeat 1. Receptor-type tyrosine-protein phosphatase D (PTPRD), also known as receptor-type tyrosine-protein phosphatase delta (R-PTP-delta), belongs to the LAR (leukocyte common antigen-related) family of receptor-type tyrosine-protein phosphatases (RPTPs), which belong to the larger family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. LAR-RPTPs are synaptic adhesion molecules that play roles in various aspects of neuronal development, including axon guidance, neurite extension, and synapse formation and function. PTPRD is involved in pre-synaptic differentiation through interaction with SLITRK2. It contains an extracellular region with three immunoglobulin-like (Ig) domains and four to eight fibronectin type III (FN3) repeats (determined by alternative splicing), a single transmembrane domain, followed by an intracellular region with a membrane-proximal catalytic PTP domain (repeat 1, also called D1) and a membrane-distal non-catalytic PTP-like domain (repeat 2, also called D2). This model represents the catalytic PTP domain (repeat 1).	284
350473	cd14625	R-PTPc-S-1	catalytic domain of receptor-type tyrosine-protein phosphatase S, repeat 1. Receptor-type tyrosine-protein phosphatase S (PTPRS), also known as receptor-type tyrosine-protein phosphatase sigma (R-PTP-sigma), belongs to the LAR (leukocyte common antigen-related) family of receptor-type tyrosine-protein phosphatases (RPTPs), which belong to the larger family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRS is a receptor for glycosaminoglycans, including heparan sulfate proteoglycan and neural chondroitin sulfate proteoglycans (CSPGs), which present a barrier to axon regeneration. It also plays a role in stimulating neurite outgrowth in response to the heparan sulfate proteoglycan GPC2. PTPRS contains an extracellular region with three immunoglobulin-like (Ig) domains and four to eight fibronectin type III (FN3) repeats (determined by alternative splicing), a single transmembrane domain, followed by an intracellular region with a membrane-proximal catalytic PTP domain (repeat 1, also called D1) and a membrane-distal non-catalytic PTP-like domain (repeat 2, also called D2). This model represents the catalytic PTP domain (repeat 1).	282
350474	cd14626	R-PTPc-F-1	catalytic domain of receptor-type tyrosine-protein phosphatase F, repeat 1. Receptor-type tyrosine-protein phosphatase F (PTPRF), also known as leukocyte common antigen related (LAR), is the prototypical member of the LAR family of receptor-type tyrosine-protein phosphatases (RPTPs), which belong to the larger family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRF/LAR plays a role in cadherin complexes where it associates with and dephosphorylates beta-catenin, a pathway which may be critical for cadherin complex stability and cell-cell association. It also regulates focal adhesions through cyclin-dependent kinase-1 and is involved in axon guidance in the developing nervous system. It also functions in regulating insulin signaling. PTPRF contains an extracellular region with three immunoglobulin-like (Ig) domains and four to eight fibronectin type III (FN3) repeats (determined by alternative splicing), a single transmembrane domain, followed by an intracellular region with a membrane-proximal catalytic PTP domain (repeat 1, also called D1) and a membrane-distal non-catalytic PTP-like domain (repeat 2, also called D2). This model represents the catalytic PTP domain (repeat 1).	276
350475	cd14627	R-PTP-S-2	PTP-like domain of receptor-type tyrosine-protein phosphatase S, repeat 2. Receptor-type tyrosine-protein phosphatase S (PTPRS), also known as receptor-type tyrosine-protein phosphatase sigma (R-PTP-sigma), belongs to the LAR (leukocyte common antigen-related) family of receptor-type tyrosine-protein phosphatases (RPTPs), which belong to the larger family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRS is a receptor for glycosaminoglycans, including heparan sulfate proteoglycan and neural chondroitin sulfate proteoglycans (CSPGs), which present a barrier to axon regeneration. It also plays a role in stimulating neurite outgrowth in response to the heparan sulfate proteoglycan GPC2. PTPRS contains an extracellular region with three immunoglobulin-like (Ig) domains and four to eight fibronectin type III (FN3) repeats (determined by alternative splicing), a single transmembrane domain, followed by an intracellular region with a membrane-proximal catalytic PTP domain (repeat 1, also called D1) and a membrane-distal non-catalytic PTP-like domain (repeat 2, also called D2). This model represents the non-catalytic PTP-like domain (repeat 2). Although described as non-catalytic, this domain contains the catalytic cysteine and the active site signature motif, HCSAGxGRxG.	290
350476	cd14628	R-PTP-D-2	PTP-like domain of receptor-type tyrosine-protein phosphatase D, repeat 2. Receptor-type tyrosine-protein phosphatase-like D (PTPRD), also known as receptor-type tyrosine-protein phosphatase delta (R-PTP-delta), belongs to the LAR (leukocyte common antigen-related) family of receptor-type tyrosine-protein phosphatases (RPTPs), which belong to the larger family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. LAR-RPTPs are synaptic adhesion molecules that play roles in various aspects of neuronal development, including axon guidance, neurite extension, and synapse formation and function. PTPRD is involved in pre-synaptic differentiation through interaction with SLITRK2. It contains an extracellular region with three immunoglobulin-like (Ig) domains and four to eight fibronectin type III (FN3) repeats (determined by alternative splicing), a single transmembrane domain, followed by an intracellular region with a membrane-proximal catalytic PTP domain (repeat 1, also called D1) and a membrane-distal non-catalytic PTP-like domain (repeat 2, also called D2). This model represents the non-catalytic PTP-like domain (repeat 2). Although described as non-catalytic, this domain contains the catalytic cysteine and the active site signature motif, HCSAGxGRxG.	292
350477	cd14629	R-PTP-F-2	PTP-like domain of receptor-type tyrosine-protein phosphatase F, repeat 2. Receptor-type tyrosine-protein phosphatase F (PTPRF), also known as leukocyte common antigen related (LAR), is the prototypical member of the LAR family of receptor-type tyrosine-protein phosphatases (RPTPs), which belong to the larger family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRF/LAR plays a role in cadherin complexes where it associates with and dephosphorylates beta-catenin, a pathway which may be critical for cadherin complex stability and cell-cell association. It also regulates focal adhesions through cyclin-dependent kinase-1 and is involved in axon guidance in the developing nervous system. It also functions in regulating insulin signaling. PTPRF contains an extracellular region with three immunoglobulin-like (Ig) domains and four to eight fibronectin type III (FN3) repeats (determined by alternative splicing), a single transmembrane domain, followed by an intracellular region with a membrane-proximal catalytic PTP domain (repeat 1, also called D1) and a membrane-distal non-catalytic PTP-like domain (repeat 2, also called D2). This model represents the non-catalytic PTP-like domain (repeat 2). Although described as non-catalytic, this domain contains the catalytic cysteine and the active site signature motif, HCSAGxGRxG.	291
350478	cd14630	R-PTPc-T-1	catalytic domain of receptor-type tyrosine-protein phosphatase T, repeat 1. Receptor-type tyrosine-protein phosphatase T (PTPRT), also known as receptor-type tyrosine-protein phosphatase rho (RPTP-rho or PTPrho), belongs to the type IIb subfamily of receptor protein tyrosine phosphatases (RPTPs), which belong to the larger family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRT is highly expressed in the nervous system and it plays a critical role in regulation of synaptic formation and neuronal development. It dephosphorylates a specific tyrosine residue in syntaxin-binding protein 1, a key component of synaptic vesicle fusion machinery, and regulates its binding to syntaxin 1. PTPRT has been identified as a potential candidate gene for autism spectrum disorder (ASD) susceptibility. It contains an extracellular region with a Meprin-A5 (neuropilin)-mu (MAM) domain, an immunoglobulin (Ig) domain, and four fibronectin type III (FN3) repeats, a transmembrane domain, and an intracellular segment with a juxtamembrane domain similar to the cytoplasmic domain of classical cadherins and two tandem PTP domains. This model represents the first (repeat 1) PTP domain.	237
350479	cd14631	R-PTPc-K-1	catalytic domain of receptor-type tyrosine-protein phosphatase K, repeat 1. Receptor-type tyrosine-protein phosphatase K (PTPRK), also known as receptor-type tyrosine-protein phosphatase kappa (RPTP-kappa or PTPkappa), belongs to the type IIb subfamily of receptor protein tyrosine phosphatases (RPTPs), which belong to the larger family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRK is widely expressed and has been shown to stimulate cell motility and neurite outgrowth. It is required for anti-proliferative and pro-migratory effects of TGF-beta, suggesting a role in regulation, maintenance, and restoration of cell adhesion. It is a potential tumour suppressor in primary central nervous system lymphomas, colorectal cancer, and breast cancer. It contains an extracellular region with a Meprin-A5 (neuropilin)-mu (MAM) domain, an immunoglobulin (Ig) domain, and four fibronectin type III (FN3) repeats, a transmembrane domain, and an intracellular segment with a juxtamembrane domain similar to the cytoplasmic domain of classical cadherins and two tandem PTP domains. This model represents the first (repeat 1) PTP domain.	218
350480	cd14632	R-PTPc-U-1	catalytic domain of receptor-type tyrosine-protein phosphatase U, repeat 1. Receptor-type tyrosine-protein phosphatase U (PTPRU), also known as pancreatic carcinoma phosphatase 2 (PCP-2), belongs to the type IIb subfamily of receptor protein tyrosine phosphatases (RPTPs), which belong to the larger family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRU/PCP-2 is the most distant member of the type IIb subfamily and may have a distinct biological function other than cell-cell aggregation. It localizes to the adherens junctions and directly binds and dephosphorylates beta-catenin, and regulates the balance between signaling and adhesive beta-catenin. It plays an important role in the maintenance of epithelial integrity. PTPRU contains an extracellular region with a Meprin-A5 (neuropilin)-mu (MAM) domain, an immunoglobulin (Ig) domain, and four fibronectin type III (FN3) repeats, a transmembrane domain, and an intracellular segment with a juxtamembrane domain similar to the cytoplasmic domain of classical cadherins and two tandem PTP domains. This model represents the first (repeat 1) PTP domain.	205
350481	cd14633	R-PTPc-M-1	catalytic domain of receptor-type tyrosine-protein phosphatase M, repeat 1. Receptor-type tyrosine-protein phosphatase M (PTPRM), also known as protein-tyrosine phosphatase mu (R-PTP-mu or PTPmu), belongs to the type IIb subfamily of receptor protein tyrosine phosphatases (RPTPs), which belong to the larger family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRM/PTPmu is a homophilic cell adhesion molecule expressed in CNS neurons and glia. It is required for E-, N-, and R-cadherin-dependent neurite outgrowth. Loss of PTPmu contributes to tumor cell migration and dispersal of human glioblastomas. PTPRM contains an extracellular region with a Meprin-A5 (neuropilin)-mu (MAM) domain, an immunoglobulin (Ig) domain, and four fibronectin type III (FN3) repeats, a transmembrane domain, and an intracellular segment with a juxtamembrane domain similar to the cytoplasmic domain of classical cadherins and two tandem PTP domains. This model represents the first (repeat 1) PTP domain.	273
350482	cd14634	R-PTPc-T-2	PTP domain of receptor-type tyrosine-protein phosphatase T, repeat 2. Receptor-type tyrosine-protein phosphatase T (PTPRT), also known as receptor-type tyrosine-protein phosphatase rho (RPTP-rho or PTPrho), belongs to the type IIb subfamily of receptor protein tyrosine phosphatases (RPTPs), which belong to the larger family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRT is highly expressed in the nervous system and it plays a critical role in regulation of synaptic formation and neuronal development. It dephosphorylates a specific tyrosine residue in syntaxin-binding protein 1, a key component of synaptic vesicle fusion machinery, and regulates its binding to syntaxin 1. PTPRT has been identified as a potential candidate gene for autism spectrum disorder (ASD) susceptibility. It contains an extracellular region with a Meprin-A5 (neuropilin)-mu (MAM) domain, an immunoglobulin (Ig) domain, and four fibronectin type III (FN3) repeats, a transmembrane domain, and an intracellular segment with a juxtamembrane domain similar to the cytoplasmic domain of classical cadherins and two tandem PTP domains. This model represents the second (repeat 2) PTP domain.	206
350483	cd14635	R-PTPc-M-2	PTP domain of receptor-type tyrosine-protein phosphatase M, repeat 2. Receptor-type tyrosine-protein phosphatase M (PTPRM), also known as protein-tyrosine phosphatase mu (R-PTP-mu or PTPmu), belongs to the type IIb subfamily of receptor protein tyrosine phosphatases (RPTPs), which belong to the larger family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRM/PTPmu is a homophilic cell adhesion molecule expressed in CNS neurons and glia. It is required for E-, N-, and R-cadherin-dependent neurite outgrowth. Loss of PTPmu contributes to tumor cell migration and dispersal of human glioblastomas. PTPRM contains an extracellular region with a Meprin-A5 (neuropilin)-mu (MAM) domain, an immunoglobulin (Ig) domain, and four fibronectin type III (FN3) repeats, a transmembrane domain, and an intracellular segment with a juxtamembrane domain similar to the cytoplasmic domain of classical cadherins and two tandem PTP domains. This model represents the second (repeat 2) PTP domain.	206
350484	cd14636	R-PTPc-K-2	PTP domain of receptor-type tyrosine-protein phosphatase K, repeat 2. Receptor-type tyrosine-protein phosphatase K (PTPRK), also known as receptor-type tyrosine-protein phosphatase kappa (RPTP-kappa or PTPkappa), belongs to the type IIb subfamily of receptor protein tyrosine phosphatases (RPTPs), which belong to the larger family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRK is widely expressed and has been shown to stimulate cell motility and neurite outgrowth. It is required for anti-proliferative and pro-migratory effects of TGF-beta, suggesting a role in regulation, maintenance, and restoration of cell adhesion. It is a potential tumour suppressor in primary central nervous system lymphomas, colorectal cancer, and breast cancer. It contains an extracellular region with a Meprin-A5 (neuropilin)-mu (MAM) domain, an immunoglobulin (Ig) domain, and four fibronectin type III (FN3) repeats, a transmembrane domain, and an intracellular segment with a juxtamembrane domain similar to the cytoplasmic domain of classical cadherins and two tandem PTP domains. This model represents the second (repeat 2) PTP domain.	206
350485	cd14637	R-PTPc-U-2	PTP domain of receptor-type tyrosine-protein phosphatase U, repeat 2. Receptor-type tyrosine-protein phosphatase U (PTPRU), also known as pancreatic carcinoma phosphatase 2 (PCP-2), belongs to the type IIb subfamily of receptor protein tyrosine phosphatases (RPTPs), which belong to the larger family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRU/PCP-2 is the most distant member of the type IIb subfamily and may have a distinct biological function other than cell-cell aggregation. It localizes to the adherens junctions and directly binds and dephosphorylates beta-catenin, and regulates the balance between signaling and adhesive beta-catenin. It plays an important role in the maintenance of epithelial integrity. PTPRU contains an extracellular region with an Meprin-A5 (neuropilin)-mu (MAM) domain, an immunoglobulin (Ig) domain, and four fibronectin type III (FN3) repeats, a transmembrane domain, and an intracellular segment  with a juxtamembrane domain similar to the cytoplasmic domain of classical cadherins and two tandem PTP domains. This model represents the second (repeat 2) PTP domain.	207
350486	cd14638	DSP_DUSP1	dual specificity phosphatase domain of dual specificity protein phosphatase 1. Dual specificity protein phosphatase 1 (DUSP1), also called mitogen-activated protein kinase (MAPK) phosphatase 1 (MKP-1), functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). Like other MKPs, it deactivates its MAPK substrates by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. It belongs to the class I subfamily and is a mitogen- and stress-inducible nuclear MKP. Human MKP-1 dephosphorylates MAPK1/ERK2, regulating its activity during the meiotic cell cycle. Although initially MKP-1 was considered to be ERK-specific, it has been shown that MKP-1 also dephosphorylates both JNK and p38 MAPKs. DUSP1/MKP-1 is involved in various functions, including proliferation, differentiation, and apoptosis in normal cells. It is a central regulator of a variety of functions in the immune, metabolic, cardiovascular, and nervous systems. It contains an N-terminal Cdc25/rhodanese-like domain, which is responsible for MAPK-binding, and a C-terminal catalytic dual specificity phosphatase domain.	151
350487	cd14639	DSP_DUSP5	dual specificity phosphatase domain of dual specificity protein phosphatase 5. Dual specificity protein phosphatase 5 (DUSP5) functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). Like other mitogen-activated protein kinase (MAPK) phosphatases (MKPs), it deactivates its MAPK substrates by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. It belongs to the class I subfamily and is a mitogen- and stress-inducible nuclear MKP. DUSP5 preferentially dephosphorylates extracellular signal-regulated kinase (ERK), and is involved in ERK signaling and ERK-dependent inflammatory gene expression in adipocytes. It also plays a role in regulating pressure-dependent myogenic cerebral arterial constriction, which is crucial for the maintenance of constant cerebral blood flow to the brain. DUSP5 contains an N-terminal Cdc25/rhodanese-like domain, which is responsible for MAPK-binding, and a C-terminal catalytic dual specificity phosphatase domain.	138
350488	cd14640	DSP_DUSP4	dual specificity phosphatase domain of dual specificity protein phosphatase 4. Dual specificity protein phosphatase 4 (DUSP4), also called mitogen-activated protein kinase (MAPK) phosphatase 2 (MKP-2), functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). Like other MKPs, it deactivates its MAPK substrates by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. It belongs to the class I subfamily and is a mitogen- and stress-inducible nuclear MKP. DUSP4 regulates either ERK or c-JUN N-terminal kinase (JNK), depending on the cell type. It dephosphorylates nuclear JNK and induces apoptosis in diffuse large B cell lymphoma (DLBCL) cells. It acts as a negative regulator of macrophage M1 activation and inhibits inflammation during macrophage-adipocyte interaction. It has been linked to different aspects of cancer: it may have a role in the development of ovarian cancers, oesophagogastric rib metastasis, and pancreatic tumours; it may also be a candidate tumor suppressor gene, with its deletion implicated in breast cancer, prostate cancer, and gliomas. DUSP4/MKP-2 contains an N-terminal Cdc25/rhodanese-like domain, which is responsible for MAPK-binding, and a C-terminal catalytic dual specificity phosphatase domain.	141
350489	cd14641	DSP_DUSP2	dual specificity phosphatase domain of dual specificity protein phosphatase 2. Dual specificity protein phosphatase 2 (DUSP2), also called dual specificity protein phosphatase PAC-1, functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). Like other mitogen-activated protein kinase (MAPK) phosphatases (MKPs), it deactivates its MAPK substrates by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. It belongs to the class I subfamily and is a mitogen- and stress-inducible nuclear MKP. DUSP2 can preferentially dephosphorylate ERK1/2 and p38, but not JNK in vitro. It is predominantly expressed in hematopoietic tissues with high T-cell content, such as thymus, spleen, lymph nodes, peripheral blood and other organs such as the brain and liver. It has a critical and positive role in inflammatory responses. DUSP2 mRNA and protein are significantly reduced in most solid cancers including breast, colon, lung, ovary, kidney and prostate, and the suppression of DUSP2 is associated with tumorigenesis and malignancy. DUSP2 contains an N-terminal Cdc25/rhodanese-like domain, which is responsible for MAPK-binding, and a C-terminal catalytic dual specificity phosphatase domain.	144
350490	cd14642	DSP_DUSP6	dual specificity phosphatase domain of dual specificity protein phosphatase 6. Dual specificity protein phosphatase 6 (DUSP6), also called mitogen-activated protein kinase (MAPK) phosphatase 3 (MKP-3) or dual specificity protein phosphatase PYST1, functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). Like other MKPs, it deactivates its MAPK substrates by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. It belongs to the class II subfamily and is an ERK-selective cytoplasmic MKP. DUSP6/MKP-3 plays an important role in obesity-related hyperglycemia by promoting hepatic glucose output. MKP-3 deficiency attenuates body weight gain induced by a high-fat diet, protects mice from developing obesity-related hepatosteatosis, and reduces adiposity, possibly by repressing adipocyte differentiation. It also contributes to p53-controlled cellular senescence. It contains an N-terminal Cdc25/rhodanese-like domain, which is responsible for MAPK-binding, and a C-terminal catalytic dual specificity phosphatase domain.	143
350491	cd14643	DSP_DUSP7	dual specificity phosphatase domain of dual specificity protein phosphatase 7. Dual specificity protein phosphatase 7 (DUSP7), also called mitogen-activated protein kinase (MAPK) phosphatase X (MKP-X) or dual specificity protein phosphatase PYST2, functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). Like other MKPs, it deactivates its MAPK substrates by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. It belongs to the class II subfamily and is an ERK-selective cytoplasmic MKP. DUSP7 has been shown as an essential regulator of multiple steps in oocyte meiosis. Due to alternative promoter usage, the PYST2 gene gives rise to two isoforms, PYST2-S and PYST2-L. PYST2-L is over-expressed in leukocytes derived from AML and ALL patients as well as in some solid tumors and lymphoblastoid cell lines; it plays a role in cell-crowding. It contains an N-terminal Cdc25/rhodanese-like domain, which is responsible for MAPK-binding, and a C-terminal catalytic dual specificity phosphatase domain.	149
350492	cd14644	DSP_DUSP9	dual specificity phosphatase domain of dual specificity protein phosphatase 9. Dual specificity protein phosphatase 9 (DUSP9),  also called mitogen-activated protein kinase (MAPK) phosphatase 4 (MKP-4), functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). Like other MKPs, it deactivates its MAPK substrates by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. It belongs to the class II subfamily and is an ERK-selective cytoplasmic MKP. DUSP9 is a mediator of bone morphogenetic protein (BMP) signaling to control the appropriate ERK activity critical for the determination of embryonic stem cell fate. Down-regulation of DUSP9 expression has been linked to severe pre-eclamptic placenta as well as cancers such as hepatocellular carcinoma. DUSP9 contains an N-terminal Cdc25/rhodanese-like domain, which is responsible for MAPK-binding, and a C-terminal catalytic dual specificity phosphatase domain.	145
350493	cd14645	DSP_DUSP8	dual specificity phosphatase domain of dual specificity protein phosphatase 8. Dual specificity protein phosphatase 8 (DUSP8), also called DUSP hVH-5 or M3/6, functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). Like other MKPs, it deactivates its MAPK substrates by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. It belongs to the class III subfamily and is a JNK/p38-selective cytoplasmic MKP. DUSP8 controls basal and acute stress-induced ERK1/2 signaling in adult cardiac myocytes, which impacts contractility, ventricular remodeling, and disease susceptibility. It also plays a role in decreasing ureteric branching morphogenesis by inhibiting p38MAPK. DUSP8 contains an N-terminal Cdc25/rhodanese-like domain, which is responsible for MAPK-binding, and a C-terminal catalytic dual specificity phosphatase domain.	151
350494	cd14646	DSP_DUSP16	dual specificity phosphatase domain of dual specificity protein phosphatase 16. Dual specificity protein phosphatase 16 (DUSP16), also called mitogen-activated protein kinase (MAPK) phosphatase 7 (MKP-7), functions as a protein-serine/threonine phosphatase (EC 3.1.3.16) and a protein-tyrosine-phosphatase (EC 3.1.3.48). Like other MKPs, it deactivates its MAPK substrates by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. It belongs to the class III subfamily and is a JNK/p38-selective cytoplasmic MKP. DUSP16/MKP-7 plays an essential role in perinatal survival and selectively controls the differentiation and cytokine production of myeloid cells. It is acetylated by Mycobacterium tuberculosis Eis protein, which leads to the inhibition of JNK-dependent autophagy, phagosome maturation, and ROS generation, and thus, initiating suppression of host immune responses. DUSP16/MKP-7 contains an N-terminal Cdc25/rhodanese-like domain, which is responsible for MAPK-binding, and a C-terminal catalytic dual specificity phosphatase domain.	145
271236	cd14651	ZIP_Put3	Leucine zipper Dimerization domain of transcription factor Put3. Put3p activates the transcription of PUT1 and PUT2 genes in the presence of proline, allowing yeast cells to use proline as a nitrogen source. These genes encode for proteins that convert proline to glutamate, which is a metabolically more useful form of nitrogen. Put3p is a member of the Gal4p family of transcriptional activators which contain an N-terminal DNA-binding domain with a Zn2Cys6 binuclear cluster that interact with CCG triplets and a leucine zipper-like heptad repeat that dimerizes. Dimerization allows binding of targets which contain two CCG motifs oriented in an inverted (CGG-CCG), direct (CCG-CCG), or everted (CCG-CGG) manner.	31
271237	cd14653	ZIP_Gal4p-like	Leucine zipper Dimerization domain of Gal4p-like transcription factors. The Gal4p family of transcriptional activators contain an N-terminal DNA-binding domain with a Zn2Cys6 binuclear cluster that interact with CCG triplets and a leucine zipper-like heptad repeat that dimerizes. Dimerization allows binding of targets which contain two CCG motifs oriented in an inverted (CGG-CCG), direct (CCG-CCG), or everted (CCG-CGG) manner. Included in this family are Saccharomyces cerevisiae Gal4p, Hap1p, Put3p, Ppr1p and Sip4p, Neurospora crassa acu-15, and Colletotrichum acutatum Nir1, among others. Gal4p functions in the induction of GAL genes in the presence of galactose; GAL proteins are responsible for the transport of galactose into the cell and for its metabolism through the glycolytic pathway. Hap1p promotes transcription of genes required for respiration and controlling oxidative damage in response to heme. Put3p activates the transcription of the PUT1 and PUT2 genes in the presence of proline, allowing yeast cells to use proline as a nitrogen source. Ppr1p activates transcription of the URA1, URA3, and URA4 genes, which encode enzymes involved in the regulation of pyrimidine levels. Sip4p activates target genes under conditions of glucose deprivation. Acu-15 is involved in regulating acetate utilization while Nir1 plays a role during nitrogen-starvation conditions.	24
271238	cd14654	ZIP_Gal4	Leucine zipper Dimerization domain of transcription factor Gal4 and similar fungal proteins. Gal4p is one of several GAL proteins required for the growth of yeast on galactose; GAL proteins are responsible for the transport of galactose into the cell and for its metabolism through the glycolytic pathway. Gal4p functions in the induction of GAL genes in the presence of galactose through an upstream activating sequence (UAS) present in their promoters. The Gal4p family of transcriptional activators contain an N-terminal DNA binding domain with a Zn2Cys6 binuclear cluster that interact with CCG triplets and a leucine zipper-like heptad repeat that dimerizes. Dimerization allows binding of targets which contain two CCG motifs oriented in an inverted (CGG-CCG), direct (CCG-CCG), or everted (CCG-CGG) manner.	47
271239	cd14655	ZIP_Hap1	Leucine zipper Dimerization domain of transcription factor Hap1 and similar fungal proteins. Hap1p mediates oxygen sensing and heme signaling in yeast. In response to heme, it promotes transcription of genes required for respiration and controlling oxidative damage. It is a member of the Gal4/Gal4p family of transcriptional activators which contain an N-terminal DNA binding domain with a Zn2Cys6 binuclear cluster that interact with CCG triplets and a leucine zipper-like heptad repeat that dimerizes. Dimerization allows binding of targets which contain two CCG motifs oriented in an inverted (CGG-CCG), direct (CCG-CCG), or everted (CCG-CGG) manner. Hap1p binds to DNA containing a direct repeat of two CGG triplets. It is a large protein that contains repression modules (RPMs) and heme-responsive motifs (HRMs) in addition to the DNA-binding and dimerization domains.	32
271139	cd14656	Imelysin-like_EfeO	EfeO is a component of the EfeUOB operon. This family includes the EfeO domain, an essential component of the EfeUOB operon which is highly conserved in bacteria. However, its biochemical function is unknown. EfeO contains an N-terminal cupredoxin (CUP)-like domain and C-terminal imelysin-like domain that may bind iron. Algp7, a member of EfeO family protein from Sphingomonas sp. A1, is found to bind alginate at neutral pH, but does not contain the CUP domain, thus having a role that does not seem to be related to iron uptake. Some members of this family are fused to an N-terminal putative EfeU ion permease domain. The imelysin-like domain of this family also contains the GxHxxE sequence motif and a highly conserved functional site, suggesting a similar role to other imelysin family proteins containing the same motif.	239
271140	cd14657	Imelysin_IrpA-like	Imelysin-like iron-regulated protein A-like. This family includes putative iron-regulated protein A (IrpA) mainly from Bacteroides, proteobacteria and cyanobacteria, as well as uncharacterized proteins with domains similar to insulin-cleaving membrane protease (imelysin, ICMP) protein. IrpA has been shown to be essential for growth under iron-deficient conditions in the cyanobacterium Synechococcus sp. The conserved GxHxxE motif is similar to other known imelysin-like proteins that are regulated by iron, such as ICMP, IrpA and EfeO. Imelysin is a membrane protein with the active site outside the cell envelope. The tertiary structure shows a fold consisting of two domains, each of which consists of a bundle of four helices that are similar to each other, implying an ancient gene duplication and fusion event.	345
271141	cd14658	Imelysin-like_IrpA	Imelysin-like domain in iron-regulated protein A. This family includes putative iron-regulated protein A (IrpA) mainly from Bacteroides, proteobacteria and cyanobacteria, with a domain similar to insulin-cleaving membrane protease (imelysin, ICMP) protein. It has been shown to be essential for growth under iron-deficient conditions in the cyanobacterium Synechococcus sp. The conserved GxHxxE motif is similar to other known imelysin-like proteins that are regulated by iron, such as ICMP, IrpA and EfeO. Imelysin is a membrane protein with the active site outside the cell envelope. The tertiary structure shows a fold consisting of two domains, each of which consists of a bundle of four helices that are similar to each other, implying an ancient gene duplication and fusion event.	282
271142	cd14659	Imelysin-like_IPPA	Imelysin-like protein. This family includes insulin-cleaving membrane protease (imelysin, ICMP)-like protein (IPPA from Psychrobacter arcticus), the Pseudomonas aeruginosa PA4372 and Vibrio cholerae VC1266 Fur-regulated imelysin-like protein. They share the overall fold and a similar functional site as the insulin-cleaving membrane protease (ICMP). However, IPPA adopts a structure distinctive from the known HxxE metallopeptidases or iron-binding proteins, suggesting this protein may not be a peptidase; the histidine in the GxHxxE motif region is no longer conserved (GxxxxE), indicating a possible loss of enzymatic function or a change in substrate preference (compared to imelysin and IrpA families). A putative functional site for this non-peptidase homolog is located at the domain interface. The tertiary structure shows a fold consisting of two domains, each of which consists of a bundle of four helices that are similar to each other, implying an ancient gene duplication and fusion event.	331
271137	cd14660	E2F_DD	Dimerization domain of E2F transcription factors. E2F transcription factors are involved in the regulation of DNA synthesis, cell cycle progression, proliferation and apoptosis. It associates with the retinoblastoma (Rb) protein, negatively regulating the G1-S transition until cyclin-dependent kinases phosphorylate Rb, which causes E2F release. E2F forms heterodimers with DP, a distantly related protein. Heterodimerization enhances the Rb binding, DNA binding, and transactivation activities of E2Fs. In humans, there are at least six closely related E2F and two DP family members, all containing a DNA-binding domain, a coiled-coil (CC) region, and a marked-box domain. E2F1 to E2F5 also contain a C-terminal transactivation domain.	104
271136	cd14661	Imelysin_like_PIBO	Permuted imelysin-like protein from Bacteroides ovatus (PIBO) and similar proteins. This family includes imelysin-like proteins such as imelysin-like protein from gut bacteria Bacteroides ovatus (PIBO) that have a circularly permuted topology compared with the canonical imelysin fold, such that the N-terminal and C-terminal regions are swapped in the primary sequence. PIBO is highly similar to imelysin-like protein from Psychrobacter arcticus (IPPA) despite low sequence similarity and circular permutation. PIBO is functionally equivalent to insulin-cleaving membrane protease (ICMP or imelysin), although the permutation results in the conserved GxHxxE motif to be at the C-terminus. It may have a conserved role in iron uptake although it adopts a structure distinctive from known metallopeptidases or iron-binding proteins.	347
271132	cd14662	STKc_SnRK2	Catalytic domain of the Serine/Threonine Kinases, Sucrose nonfermenting 1-related protein kinase subfamily 2. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The SnRKs form three different subfamilies designated SnRK1-3. SnRK2 is represented in this cd. SnRK2s are involved in plant response to abiotic stresses and abscisic acid (ABA)-dependent plant development. The SnRK2s subfamily is in turn classed into three subgroups, all 3 of which are represented in this CD. Group 1 comprises kinases not activated by ABA, group 2 - kinases not activated or activated very weakly by ABA (depending on plant species), and group 3 - kinases strongly activated by ABA. The SnRKs belong to a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	257
271133	cd14663	STKc_SnRK3	Catalytic domain of the Serine/Threonine Kinases, Sucrose nonfermenting 1-related protein kinase subfamily 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The SnRKs form three different subfamilies designated SnRK1-3. SnRK3 is represented in this cd. The SnRK3 group contains members also known as CBL-interacting protein kinase, salt overly sensitive 2, SOS3-interacting proteins and protein kinase S. These kinases interact with calcium-binding proteins such as SOS3, SCaBPs, and CBL proteins, and are involved in responses to salt stress and in sugar and ABA signaling. The SnRKs belong to a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	256
271134	cd14664	STK_BAK1_like	Catalytic domain of the Serine/Threonine Kinase, BRI1 associated kinase 1 and related STKs. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. This subfamily includes three leucine-rich repeat receptor-like kinases (LRR-RLKs): Arabidopsis thaliana BAK1 and CLAVATA1 (CLV1), and Physcomitrella patens CLL1B clavata1-like receptor S/T protein kinase. BAK1 functions in various signaling pathways. It plays a role in BR (brassinosteroid)-regulated plant development as a co-receptor of BRASSINOSTEROID (BR) INSENSITIVE 1 (BRI1), the receptor for BRs, and is required for full activation of BR signaling. It also modulates pathways involved in plant resistance to pathogen infection (pattern-triggered immunity, PTI) and herbivore attack (wound- or herbivore feeding-induced accumulation of jasmonic acid (JA) and JA-isoleucine). CLV1 directly binds small signaling peptides, CLAVATA3 (CLV3) and CLAVATA3/EMBRYO SURROUNDING REGION (CLE), to restrict stem cell proliferation: the CLV3-CLV1-WUS (WUSCHEL) module influences stem cell maintenance in the shoot apical meristem, and the CLE40 (CLAVATA3/EMBRYO SURROUNDING REGION40)-ACR4 (CRINKLY4)-CLV1-WOX5 (WUSCHEL-RELATED HOMEOBOX5) module at the root apical meristem. The STK_BAK1-like subfamily is part of a larger superfamily that includes the catalytic domains of other protein STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	270
271135	cd14665	STKc_SnRK2-3	Catalytic domain of the Serine/Threonine Kinases, Sucrose nonfermenting 1-related protein kinase subfamily 2, group 3. STKs catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The SnRKs form three different subfamilies designated SnRK1-3. SnRK2 is represented in this cd. SnRK2s are involved in plant response to abiotic stresses and abscisic acid (ABA)-dependent plant development. The SnRK2s subfamily is in turn classed into three subgroups, all 3 of which are represented in this CD. Group 1 comprises kinases not activated by ABA, group 2 - kinases not activated or activated very weakly by ABA (depending on plant species), and group 3 - kinases strongly activated by ABA. The SnRKs belong to a larger superfamily that includes the catalytic domains of other STKs, protein tyrosine kinases, RIO kinases, aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase.	257
270620	cd14667	3D_containing_proteins	Non-mltA associated 3D domain containing proteins, named for 3 conserved aspartate residues. This family contains the 3D domain, named for its 3 conserved aspartates, including similar uncharacterized proteins. These proteins contain the critical active site aspartate of mltA-like lytic transglycosylases where the 3D domain forms a larger domain with the N-terminal region. This domain is also found in conjunction with numerous other domains such as the Escherichia coli MltA, a membrane-bound lytic transglycosylase comprised of 2 domains separated by a large groove, where the peptidoglycan strand binds. Domain A is made up of an N-terminal and a C-terminal portion, corresponding to the 3D domain and Domain B is inserted within the linear sequence of domain A.  MltA is distinct from other bacterial LTs, which are similar to each other. Escherichia coli peptidoglycan lytic transglycosylase (LT) initiates cell wall recycling in response to damage, during bacterial fission, and cleaves peptidoglycan (PG) to create functional spaces in its wall. PG chains (also known as murein), the major components of the bacterial cell wall, are comprised of alternating beta-1-4-linked N-acetylmuramic acid (MurNAc) and N-acetyl-D-glucosamine (GlcNAc), and lytic transglycosylases cleave this beta-1-4 bond.	90
270616	cd14668	mlta_B	Domain B insert of mltA_like lytic transglycosylases. Escherichia coli MltA is a membrane-bound lytic transglycosylase comprised of two domains separated by a large groove, where the peptidoglycan strand binds. Domain A is made up of an N-terminal and a C-terminal portion, which correspond to the 3D domain, named for 3 conserved aspartate residues. Domain B is inserted within the linear sequence of domain A. MltA is distinct from other bacterial lytic transglycosylases (LTs), which are similar to each other. Escherichia coli peptidoglycan lytic transglycosylase (LT) initiates cell wall recycling in response to damage, during bacterial fission, and cleaves peptidoglycan (PG) to create functional spaces in its wall. PG chains (also known as murein), the major components of the bacterial cell wall, are comprised of alternating beta-1-4-linked N-acetylmuramic acid (MurNAc) and N-acetyl-D-glucosamine (GlcNAc), and lytic transglycosylases cleave this beta-1-4 bond. Typically, peptidoglycan lytic transglycosylases (LT) are exolytic, releasing Metabolite 1 (GlcNAc-anhMurNAc-L-Ala-D-Glu-m-Dap-D-Ala-D-Ala) from the ends of the PG strands. In contrast, MltE is endolytic, cleaving in the middle of PG strands, with further processing to Metabolite 1 accomplished by other LTs. In E. coli, there are six membrane-bound LTs: MltA-MltF and soluble Slt70. Slt35 is a soluble fragment cleaved from MltB. Bacterial LTs are classified in 4 families: Family 1 includes slt70 MltC-MltF, Family 2 includes MltA, Family 3 includes MltB, and Family 4 of bacteriophage origin. While most of the LT family members are similar in structure and sequence with a lysozyme-like fold, Family 2 (including mltA) is distinct.	159
270617	cd14669	mlta_related_B	putative domain B insert of mltA_type lytic transglycosylases. Escherichia coli MltA is a membrane-bound lytic transglycosylase comprised of two domains separated by a large groove, where the peptidoglycan strand binds. Domain A is made up of an N-terminal and a C-terminal portion, which correspond to the 3D domain, named for 3 conserved aspartate residues. Domain B is inserted within the linear sequence of domain A. MltA is distinct from other bacterial lytic transglycosylases (LTs), which are similar to each other. Escherichia coli peptidoglycan lytic transglycosylase (LT) initiates cell wall recycling in response to damage, during bacterial fission, and cleaves peptidoglycan (PG) to create functional spaces in its wall. PG chains (also known as murein), the major components of the bacterial cell wall, are comprised of alternating beta-1-4-linked N-acetylmuramic acid (MurNAc) and N-acetyl-D-glucosamine (GlcNAc), and lytic transglycosylases cleave this beta-1-4 bond. Typically, peptidoglycan lytic transglycosylases (LT) are exolytic, releasing Metabolite 1 (GlcNAc-anhMurNAc-L-Ala-D-Glu-m-Dap-D-Ala-D-Ala) from the ends of the PG strands. In contrast, MltE is endolytic, cleaving in the middle of PG strands, with further processing to Metabolite 1 accomplished by other LTs. In E. coli, there are six membrane-bound LTs: MltA-MltF and soluble Slt70. Slt35 is a soluble fragment cleaved from MltB. Bacterial LTs are classified in 4 families: Family 1 includes slt70 MltC-MltF, Family 2 includes MltA, Family 3 includes MltB, and Family 4 of bacteriophage origin. While most of the LT family members are similar in structure and sequence with a lysozyme-like fold, Family 2 (including mltA) is distinct.	128
270614	cd14670	BslA_like	Bacterial immunoglobulin-like hydrophobin BslA and similar proteins. BslA (YuaB) is a protein from Bacillus subtilis acting as a hydrophobin, which forms surface layers around biofilms and participates in biofilm assembly. BslA contains an unusually hydrophobic "cap structure", which is essential for its activity and for the ability of bacteria to form hydrophobic, non-wetting biofilms. A number of domains in various proteins from Bacilli and other bacterial lineages appear related to BslA, but do not conserve the hydrophobic cap.	128
269821	cd14671	PAAR_like	proline-alanine-alanine-arginine (PAAR) repeat superfamily. This domain is found in the PAAR (proline-alanine-alanine-arginine) repeat superfamily, where it forms a sharp conical extension on the VgrG spike, a trimeric protein complex of the bacterial type VI secretion system (T6SS). The T6SS is responsible for translocation of a wide variety of toxic effector molecules, allowing predatory cells to kill prokaryotic as well as eukaryotic prey cells. The pointed tip of the PAAR domain is stabilized by a zinc atom positioned close to the cone's vertex and is likely to be important for its integrity during penetration of the target cell envelope. The PAAR-repeat proteins form a diverse superfamily with several subgroups extended both N- and C-terminally by domains with various predicted functions; the termini are exposed to solution, and do not distort the VgrG binding site. VgrG proteins are orthologous to the central baseplate spikes of bacteriophages with contractile tails, and genes encoding proteins with PAAR motifs have been frequently found immediately downstream from vgrG-like genes. It has been shown that PAAR proteins are essential for T6SS-mediated secretion and target cell killing by Vibrio cholerae (encodes two PAAR proteins) and Acinetobacter baylyi (encodes three PAAR proteins); inactivation of all these PAAR genes results in inactivation of Hcp secretion as well as T6SS-dependent killing of E. coli.	77
270612	cd14672	UBA_ceTYDP2_like	UBA-like domain found in Caenorhabditis elegans tyrosyl-DNA phosphodiesterase 2 (TDP2) and similar proteins. The family includes C. elegans TDP2 and its homologs found in bilateria. TDP2 (also known as TTRAP or EAPII) belongs to the Mg(2+)/Mn(2+)-dependent family of phosphodiesterases which contains an N-terminal ubiquitin-associated (UBA)-like domain and a C-terminal phosphodiesterase domain. It is required for the efficient repair of topoisomerase II-induced DNA double strand breaks. The topoisomerase is covalently linked by a phosphotyrosyl bond to the 5'-terminus of the break. TDP2 cleaves the DNA 5'-phosphodiester bond and restores 5'-phosphate termini needed for subsequent DNA ligation and hence repair of the break. Tyrosyl-DNA phosphodiesterase 1 (TDP1), an enzyme that cleaves 3'-phosphotyrosyl bonds, and TDP2 are complementary activities; together, they allow cells to remove trapped topoisomerase from both 3'- and 5'-DNA termini. TDP2 has been reported as being involved in apoptosis, embryonic development, and transcriptional regulation.	37
270192	cd14673	PH_PHLDB1_2	Pleckstrin homology-like domain-containing family B member 2 pleckstrin homology (PH) domain. PHLDB2 (also called LL5beta) and PHLDB1 (also called LL5alpha) are cytoskeleton- and membrane-associated proteins. PHLDB2 has been identified as a key component of the synaptic podosomes that play an important role in postsynaptic maturation. Both are large proteins containing an N-terminal pleckstrin (PH) domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	105
270193	cd14674	PH_PLEKHM3_1	Pleckstrin homology domain-containing family M member 3 Pleckstrin homology domain 1. PLEKHM3 (also called differentiation associated protein/DAPR) exists as three alternatively spliced isoforms that participate in metal ion binding. It contains 2 PH domains and 1 phorbol-ester/DAG-type zinc finger domain. PLEKHM3 is found in Humans, canines, bovine, mouse, rat, chicken and zebrafish. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2, or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	90
270194	cd14675	PH-SEC3_like	PH-like domain of Sec3-like protein. Fungal Sec3, as well as its homolog in higher eukaryotes, Exocyst complex component 1 (EXOC1), are part of the exocyst, a conserved octameric complex involved in the docking of post-Golgi transport vesicles to sites of membrane remodeling during cellular processes such as polarization, migration, and division. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	115
270195	cd14676	PH_DOK1,2,3	Pleckstrin homology (PH) domain of Downstream of tyrosine kinase 1, 2, and 3. The Dok family adapters are phosphorylated by different protein tyrosine kinases. Dok proteins are involved in processes such as modulation of cell differentiation and proliferation, as well as in control of the cell spreading and migration. The Dok protein contains an N-terminal pleckstrin homology (PH) domain followed by a central phosphotyrosine binding (PTB) domain, which has a PH-like fold, and a proline- and tyrosine-rich C-terminal tail. The PH domain binds to acidic phospholipids and localizes proteins to the plasma membrane. There are 7 mammalian Dok members: Dok-1 to Dok-7. Dok-1 and Dok-2 act as negative regulators of the Ras-Erk pathway downstream of many immunoreceptor-mediated signaling systems, and it is believed that recruitment of p120 rasGAP by Dok-1 and Dok-2 is critical to their negative regulation. Dok-3 is a negative regulator of the activation of JNK and mobilization of Ca2+ in B-cell receptor-mediated signaling, interacting with SHIP-1 and Grb2. In general, PH domains have diverse functions, but are generally involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	113
270196	cd14677	PH_DOK7	Pleckstrin homology (PH) domain of Downstream of tyrosine kinase 7. The Dok family adapters are phosphorylated by different protein tyrosine kinases. Dok proteins are involved in processes such as modulation of cell differentiation and proliferation, as well as in control of the cell spreading and migration. The Dok protein contains an N-terminal pleckstrin homology (PH) domain followed by a central phosphotyrosine binding (PTB) domain, which has a PH-like fold, and a proline- and tyrosine-rich C-terminal tail. The PH domain binds to acidic phospholipids and localizes proteins to the plasma membrane. There are 7 mammalian Dok members: Dok-1 to Dok-7. Dok-7 is the key cytoplasmic activator of MuSK (Muscle-Specific Protein Tyrosine Kinase). In general, PH domains have diverse functions, but are generally involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	102
270197	cd14678	PH_DOK4_DOK5_DOK6	Pleckstrin homology (PH) domain of Downstream of tyrosine kinase 4, 5, and 6 proteins. The Dok family adapters are phosphorylated by different protein tyrosine kinases. Dok proteins are involved in processes such as modulation of cell differentiation and proliferation, as well as in control of the cell spreading and migration. The Dok protein contains an N-terminal pleckstrin homology (PH) domain followed by a central phosphotyrosine binding (PTB) domain, which has a PH-like fold, and a proline- and tyrosine-rich C-terminal tail. The PH domain binds to acidic phospholipids and localizes proteins to the plasma membrane, while the PTB domain mediates protein-protein interactions by binding to phosphotyrosine-containing motifs. The C-terminal part of Dok contains multiple tyrosine phosphorylation sites that serve as potential docking sites for Src homology 2-containing proteins such as ras GTPase-activating protein and Nck, leading to inhibition of ras signaling pathway activation and the c-Jun N-terminal kinase (JNK) and c-Jun activation, respectively. There are 7 mammalian Dok members: Dok-1 to Dok-7. Dok-1 and Dok-2 act as negative regulators of the Ras-Erk pathway downstream of many immunoreceptor-mediated signaling systems, and it is believed that recruitment of p120 rasGAP by Dok-1 and Dok-2 is critical to their negative regulation. Dok-3 is a negative regulator of the activation of JNK and mobilization of Ca2+ in B-cell receptor-mediated signaling, interacting with SHIP-1 and Grb2. Dok-4 to Dok-6 play roles in protein tyrosine kinase (PTK)-mediated signaling in neural cells and Dok-7 is the key cytoplasmic activator of MuSK (Muscle-Specific Protein Tyrosine Kinase). In general, PH domains have diverse functions, but are generally involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	105
275429	cd14679	PH_p115RhoGEF	Rho guanine nucleotide exchange factor Pleckstrin homology domain. p115RhoGEF (also called LSC, GEF1 or LBCL2) belongs to regulator of G-protein signaling (RGS) domain-containing RhoGEFs that are RhoA-selective and directly activated by the Galpha12/13 family of heterotrimeric G proteins. In addition to the Dbl homology (DH)-PH domain, p115RhoGEF contains an N-terminal RGS (Regulator of G-protein signalling) domain. The DH-PH domains bind and catalyze the exchange of GDP for GTP on RhoA. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	125
275430	cd14680	PH_p190RhoGEF	Rho guanine nucleotide exchange factor Pleckstrin homology domain. p190RhoGEF (also called RIP2 or ARHGEF28) belongs to regulator of G-protein signaling (RGS) domain-containing RhoGEFs that are RhoA-selective and directly activated by the Galpha12/13 family of heterotrimeric G proteins. In addition to the Dbl homology (DH)-PH domain, p190RhoGEF contains an N-terminal C1 (Protein kinase C conserved region 1) domain. The DH-PH domains bind and catalyze the exchange of GDP for GTP on RhoA. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	101
270200	cd14681	PH-STXBP6	PH-like domain of Syntaxin binding protein 6. Syntaxin binding protein 6 (STXBP6, also called Amisyn) contains, beside the N-terminal PH-like domain, a C-terminal R-SNARE-like domain, which allows it to assemble into SNARE complexes, which in turn makes the complexes inactive and inhibits exocytosis. SNARE complexes mediate membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, with STXBP6 being a R-SNARE. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	130
270201	cd14682	PH-EXOC1_like	PH-like domain of Exocyst complex component 1-like. Exocyst complex component 1-like proteins are short, higher eukaryotic proteins that show homology to the PH-domain of higher eukaryotic EXOC1 and yeast SEC3 which are part of the exocyst complex involved in the docking of post-Golgi transport vesicles to sites of membrane remodeling during cellular processes such as polarization, migration, and division. Their function is unknown. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	118
270202	cd14683	PH-EXOC1	PH-like domain of Exocyst complex component 1. Exocyst complex component 1 (EXOC1, also known as SEC3) is the higher eukaryotes homolog of yeast Sec3. The Exocyst is a conserved octameric complex involved in the docking of post-Golgi transport vesicles to sites of membrane remodeling during cellular processes such as polarization, migration, and division. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	117
270203	cd14684	RanBD1_RanBP2-like	Ran-binding protein 2, Ran binding domain repeat 1. RanBP2 (also called E3 SUMO-protein ligase RanBP2, 358 kDa nucleoporin, and nuclear pore complex (NPC) protein Nup358) is a giant nucleoporin that localizes to the cytosolic face of the NPC. RanBP2 contains a leucine-rich region, 8 zinc-finger motifs, a cyclophilin A homologous domain, and 4 RanBDs. Ran is a Ras-like nuclear small GTPase, which regulates receptor-mediated transport between the nucleus and the cytoplasm. RanGTP hydrolysis is stimulated by RanGAP together with the Ran-binding domain containing accessory proteins RanBP1 and RanBP2. These accessory proteins stabilize the active GTP-bound form of Ran. All eukaryotic cells contain RanBP1, but in vertebrates the main RanBP seems to be RanBP2. There is no RanBP2 ortholog in yeast. Transport complex disassembly is accomplished by a small ubiquitin-related modifier-1 (SUMO-1)-modified version of RanGAP that is bound to RanBP2. RanBP1 acts as a second line of defense against exported RanGTP-importin complexes which have escaped from dissociation by RanBP2. RanBP2 also interacts with the importin subunit beta-1. RanBD shares structural similarity to the PH domain, but lacks detectable sequence similarity. The members here include human, chicken, frog, tunicates, sea urchins, ticks, sea anemones, and sponges. RanBD repeat 1 is present in this hierarchy.	117
270204	cd14685	RanBD3_RanBP2-like	Ran-binding protein 2, Ran binding domain repeat 3. RanBP2 (also called E3 SUMO-protein ligase RanBP2, 358 kDa nucleoporin, and nuclear pore complex (NPC) protein Nup358) is a giant nucleoporin that localizes to the cytosolic face of the NPC. RanBP2 contains a leucine-rich region, 8 zinc-finger motifs, a cyclophilin A homologous domain, and 4 RanBDs. Ran is a Ras-like nuclear small GTPase, which regulates receptor-mediated transport between the nucleus and the cytoplasm. RanGTP hydrolysis is stimulated by RanGAP together with the Ran-binding domain containing accessory proteins RanBP1 and RanBP2. These accessory proteins stabilize the active GTP-bound form of Ran. All eukaryotic cells contain RanBP1, but in vertebrates the main RanBP seems to be RanBP2. There is no RanBP2 ortholog in yeast. Transport complex disassembly is accomplished by a small ubiquitin-related modifier-1 (SUMO-1)-modified version of RanGAP that is bound to RanBP2. RanBP1 acts as a second line of defense against exported RanGTP-importin complexes which have escaped from dissociation by RanBP2. RanBP2 also interacts with the importin subunit beta-1. RanBD shares structural similarity to the PH domain, but lacks detectable sequence similarity. The members here include human, chicken, frog, tunicates, sea urchins, ticks, sea anemones, and sponges. RanBD repeat 3 is present in this hierarchy.	117
269834	cd14686	bZIP	Basic leucine zipper (bZIP) domain of bZIP transcription factors: a DNA-binding and dimerization domain. Basic leucine zipper (bZIP) factors comprise one of the most important classes of enhancer-type transcription factors. They act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes including cell survival, learning and memory, lipid metabolism, and cancer progression, among others. They also play important roles in responses to stimuli or stress signals such as cytokines, genotoxic agents, or physiological stresses. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription.	52
269835	cd14687	bZIP_ATF2	Basic leucine zipper (bZIP) domain of Activating Transcription Factor-2 (ATF-2) and similar proteins: a DNA-binding and dimerization domain. ATF-2 is a sequence-specific DNA-binding protein that belongs to the Basic leucine zipper (bZIP) family of transcription factors. In response to stress, it activates a variety of genes including cyclin A, cyclin D, and c-Jun. ATF-2 also plays a role in the DNA damage response that is independent of its transcriptional activity. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription.	61
269836	cd14688	bZIP_YAP	Basic leucine zipper (bZIP) domain of Yeast Activator Protein (YAP) and similar proteins: a DNA-binding and dimerization domain. This subfamily is composed predominantly of AP-1-like transcription factors including Saccharomyces cerevisiae YAPs, Schizosaccharomyces pombe PAP1, and similar proteins. Members of this subfamily belong to the Basic leucine zipper (bZIP) family of transcription factors. The YAP subfamily is composed of eight members (YAP1-8) which may all be involved in stress responses. YAP1 is the major oxidative stress regulator and is also involved in iron metabolism (like YAP5) and detoxification of arsenic (like YAP8). YAP2 is involved in cadmium stress responses while YAP4 and YAP6 play roles in osmotic stress. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription.	63
269837	cd14689	bZIP_CREB3	Basic leucine zipper (bZIP) domain of Cyclic AMP-responsive element-binding protein 3 (CREB3) and similar proteins: a DNA-binding and dimerization domain. This subfamily is composed of CREB3 (also called LZIP or Luman), and the CREB3-like proteins CREB3L1 (or OASIS), CREB3L2, CREB3L3 (or CREBH), and CREB3L4 (or AIbZIP). They are type II membrane-associated members of the Basic leucine zipper (bZIP) family of transcription factors, with their N-termini facing the cytoplasm and their C-termini penetrating through the ER membrane. They contain an N-terminal transcriptional activation domain followed by bZIP and transmembrane domains, and a C-terminal tail. They play important roles in ER stress and the unfolded protein response (UPR), as well as in many other biological processes such as cell secretion, bone and cartilage formation, and carcinogenesis. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription.	61
269838	cd14690	bZIP_CREB1	Basic leucine zipper (bZIP) domain of Cyclic AMP-responsive element-binding protein 1 (CREB1) and similar proteins: a DNA-binding and dimerization domain. CREB1 is a Basic leucine zipper (bZIP) transcription factor that plays a role in propagating signals initiated by receptor activation through the induction of cAMP-responsive genes. Because it responds to many signal transduction pathways, CREB1 is implicated to function in many processes including learning, memory, circadian rhythm, immune response, and reproduction, among others. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription.	55
269839	cd14691	bZIP_XBP1	Basic leucine zipper (bZIP) domain of X-box binding protein 1 (XBP1) and similar proteins: a DNA-binding and dimerization domain. XBP1, a member of the Basic leucine zipper (bZIP) family, is the key transcription factor that orchestrates the unfolded protein response (UPR). It is the most conserved component of the UPR and is critical for cell fate determination in response to ER stress. The inositol-requiring enzyme 1 (IRE1)-XBP1 pathway is one of the three major sensors at the ER membrane that initiates the UPR upon activation. IRE1, a type I transmembrane protein kinase and endoribonuclease, oligomerizes upon ER stress leading to its increased activity. It splices the XBP1 mRNA, producing a variant that translocates to the nucleus and activates its target genes, which are involved in protein folding, degradation, and trafficking. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription.	58
269840	cd14692	bZIP_ATF4	Basic leucine zipper (bZIP) domain of Activating Transcription Factor-4 (ATF-4) and similar proteins: a DNA-binding and dimerization domain. ATF-4 was also isolated and characterized as the cAMP-response element binding protein 2 (CREB2). It is a Basic leucine zipper (bZIP) transcription factor that has been reported to act as both an activator or repressor. It is a critical component in both the unfolded protein response (UPR) and amino acid response (AAR) pathways. Under certain stress conditions, ATF-4 transcription is increased; accumulation of ATF-4 induces the expression of genes involved in amino acid metabolism and transport, mitochondrial function, redox chemistry, and others that ensure protein synthesis and recovery from stress. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription.	63
269841	cd14693	bZIP_CEBP	Basic leucine zipper (bZIP) domain of CCAAT/enhancer-binding protein (CEBP) and similar proteins: a DNA-binding and dimerization domain. CEBPs (or C/EBPs) are Basic leucine zipper (bZIP) transcription factors that regulate the cell cycle, differentiation, growth, survival, energy metabolism, innate and adaptive immunity, and inflammation, among others. They are also associated with cancer and viral disease. There are six CEBP proteins in mammalian cells including CEBPA (alpha), CEBPB (beta), CEBPG (gamma), CEBPD (delta), and CEBPE (epsilon), which all contain highly conserved bZIP domains at their C-termini and variations at their N-terminal regions. Each possesses unique properties to regulate cell type-specific growth and differentiation. The sixth isoform, CEBPZ (zeta), lacks an intact DNA-binding domain and is excluded from this subfamily. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription.	60
269842	cd14694	bZIP_NFIL3	Basic leucine zipper (bZIP) domain of Nuclear factor interleukin-3-regulated protein (NFIL3): a DNA-binding and dimerization domain. NFIL3, also called E4 promoter-binding protein 4 (E4BP4), is a Basic leucine zipper (bZIP) transcription factor that was independently identified as a transactivator of the IL3 promoter in T-cells and as a transcriptional repressor that binds to a DNA sequence site in the adenovirus E4 promoter. Its expression levels are regulated by cytokines and it plays crucial functions in the immune system. It is required for the development of natural killer cells and CD8+ conventional dendritic cells. In B-cells, NFIL3 mediates immunoglobulin heavy chain class switching that is required for IgE production, thereby influencing allergic and pathogenic immune responses. It is also involved in the polarization of T helper responses. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription.	60
269843	cd14695	bZIP_HLF	Basic leucine zipper (bZIP) domain of Hepatic leukemia factor (HLF) and similar proteins: a DNA-binding and dimerization domain. HLF, also called vitellogenin gene-binding protein (VBP) in birds, is a circadian clock-controlled Basic leucine zipper (bZIP) transcription factor which is a direct transcriptional target of CLOCK/BMAL1. It is implicated, together with bZIPs DBP and TEF, in the regulation of genes involved in the metabolism of endobiotic and xenobiotic agents. Triple knockout mice display signs of early aging and suffer premature death, likely due to impaired defense against xenobiotic stress. A leukemogenic translocation results in the chimeric fusion protein E2A-HLF that results in a rare form of pro-B-cell acute lymphoblastic leukemia (ALL). bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription.	60
269844	cd14696	bZIP_Jun	Basic leucine zipper (bZIP) domain of Jun proteins and similar proteins: a DNA-binding and dimerization domain. Jun is a member of the activator protein-1 (AP-1) complex, which is mainly composed of Basic leucine zipper (bZIP) dimers of the Jun and Fos families, and to a lesser extent, the activating transcription factor (ATF) and musculoaponeurotic fibrosarcoma (Maf) families. The broad combinatorial possibilities for various dimers determine binding specificity, affinity, and the spectrum of regulated genes. The AP-1 complex is implicated in many cell functions including proliferation, apoptosis, survival, migration, tumorigenesis, and morphogenesis, among others. There are three Jun proteins: c-Jun, JunB, and JunD. c-Jun is the most potent transcriptional activator of the AP-1 proteins. Both c-Jun and JunB are essential during development; deletion of either results in embryonic lethality in mice. c-Jun is essential in hepatogenesis and liver erythropoiesis, while JunB is required in vasculogenesis and angiogenesis in extraembryonic tissues. While JunD is dispensable in embryonic development, it is involved in transcription regulation of target genes that help cells to cope with environmental signals. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription.	61
269845	cd14697	bZIP_Maf	Basic leucine zipper (bZIP) domain of musculoaponeurotic fibrosarcoma (Maf) proteins: a DNA-binding and dimerization domain. Maf proteins are Basic leucine zipper (bZIP) transcription factors that may participate in the activator protein-1 (AP-1) complex, which is implicated in many cell functions including proliferation, apoptosis, survival, migration, tumorigenesis, and morphogenesis, among others. Maf proteins fall into two groups: small and large. The large Mafs (c-Maf, MafA, MafB, NRL) contain an N-terminal transactivation domain, a linker region of varying size, an ancillary DNA-binding domain, and a C-terminal bZIP domain. They function as critical regulators of terminal differentiation in the blood and in many tissues such as bone, brain, kidney, pancreas, and retina. The small Mafs (MafF, MafK, MafG) do not contain a transactivation domain. They form dimers with cap'n'collar (CNC) proteins that harbor transactivation domains, and they act either as activators or repressors depending on their dimerization partner. They play roles in stress response and detoxification pathways. They have been implicated in various diseases such as diabetes, neurological diseases, thrombocytopenia and cancer. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription.	70
269846	cd14698	bZIP_CNC	Basic leucine zipper (bZIP) domain of Cap'n'Collar (CNC) transcription factors: a DNA-binding and dimerization domain. CNC proteins form a subfamily of Basic leucine zipper (bZIP) transcription factors that are defined by a conserved 43-amino acid region (called the CNC domain) located N-terminal to the bZIP DNA-binding domain. This subfamily includes Drosophila Cnc and four vertebrate counterparts, NFE2 (nuclear factor, erythroid-derived 2), NFE2-like 1 or NFE2-related factor 1 (NFE2L1 or Nrf1), NFE2L2 (or Nrf2), and NFE2L3 (or Nrf3). It also includes BACH1 and BACH2, which contain an additional BTB domain (Broad complex/Tramtrack/Bric-a-brac domain, also known as the POZ [poxvirus and zinc finger] domain). CNC proteins function during development and/or contribute in maintaining homeostasis during stress responses. In flies, Cnc functions both in development and in stress responses. In vertebrates, several CNC proteins encoded by distinct genes show varying functions and expression patterns. NFE2 is required for the proper development of platelets while the three Nrfs function in stress responses. Nrf2, the most extensively studied member of this subfamily, acts as a xenobiotic-activated receptor that regulates the adaptive response to oxidants and electrophiles. BACH1 forms heterodimers with small Mafs such as MafK to function as a repressor of heme oxygenase-1 (HO-1) gene (Hmox-1) enhancers. BACH2 is a B-cell specific transcription factor that plays a critical role in oxidative stress-mediated apoptosis. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription.	68
269847	cd14699	bZIP_Fos_like	Basic leucine zipper (bZIP) domain of the oncogene Fos (Fos)-like transcription factors: a DNA-binding and dimerization domain. This subfamily is composed of Fos proteins (c-Fos, FosB, Fos-related antigen 1 (Fra-1), and Fra-2), Activating Transcription Factor-3 (ATF-3), and similar proteins. Fos proteins are members of the activator protein-1 (AP-1) complex, which is mainly composed of bZIP dimers of the Jun and Fos families, and to a lesser extent, ATF and musculoaponeurotic fibrosarcoma (Maf) families. The broad combinatorial possibilities for various dimers determine binding specificity, affinity, and the spectrum of regulated genes. The AP-1 complex is implicated in many cell functions including proliferation, apoptosis, survival, migration, tumorigenesis, and morphogenesis, among others. ATF3 is induced by various stress signals such as cytokines, genotoxic agents, or physiological stresses. It is implicated in cancer and host defense against pathogens. It negatively regulates the transcription of pro-inflammatory cytokines and is critical in preventing acute inflammatory syndromes. ATF3 dimerizes with Jun and other ATF proteins; the heterodimers function either as activators or repressors depending on the promoter context. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription.	59
269848	cd14700	bZIP_ATF6	Basic leucine zipper (bZIP) domain of Activating Transcription Factor-6 (ATF-6) and similar proteins: a DNA-binding and dimerization domain. ATF-6 is a type I membrane-bound Basic leucine zipper (bZIP) transcription factor that binds to the consensus ER stress response element (ERSE) and enhances the transcription of genes encoding glucose-regulated proteins Grp78, Grp94, and calreticulin. ATF-6 is one of three sensors of the unfolded protein response (UPR) in metazoans; the others being the kinases Ire1 and PERK. It contains an ER-lumenal domain that detects unfolded proteins. In response to ER stress, ATF-6 translocates from the ER to the Golgi with simultaneous cleavage in a process called regulated intramembrane proteolysis (Rip) to its transcriptionally competent form, which enters the nucleus and upregulates target UPR genes. The three UPR sensor branches cross-communicate to form a signaling network. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription.	52
269849	cd14701	bZIP_BATF	Basic leucine zipper (bZIP) domain of BATF proteins: a DNA-binding and dimerization domain. Basic leucine zipper (bZIP) transcription factor ATF-like (BATF or SFA2), BATF2 (or SARI) and BATF3 form heterodimers with Jun proteins. They function as inhibitors of AP-1-driven transcription. Unlike most bZIP transcription factors that contain additional domains, BATF and BATF3 contain only the bZIP DNA-binding and dimerization domain. BATF2 contains an additional C-terminal domain of unknown function. BATF:Jun heterodimers preferentially bind to TPA response elements (TREs) with the consensus sequence TGA(C/G)TCA, and can also bind to a TGACGTCA cyclic AMP response element (CRE). In addition to negative regulation, BATF proteins also show positive transcriptional activities in the development of classical dendritic cells and T helper cell subsets, and in antibody production. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription.	58
269850	cd14702	bZIP_plant_GBF1	Basic leucine zipper (bZIP) domain of Plant G-box binding factor 1 (GBF1)-like transcription factors: a DNA-binding and dimerization domain. This subfamily is composed of plant bZIP transcription factors including Arabidopsis thaliana G-box binding factor 1 (GBF1), Zea mays Opaque-2 and Ocs element-binding factor 1 (OCSBF-1), Triticum aestivum Histone-specific transcription factor HBP1 (or HBP-1a), Petroselinum crispum Light-inducible protein CPRF3 and CPRF6, and Nicotiana tabacum BZI-3, among many others. bZIP G-box binding factors (GBFs) contain an N-terminal proline-rich domain in addition to the bZIP domain. GBFs are involved in developmental and physiological processes in response to stimuli such as light or hormones. Opaque-2 plays a role in affecting lysine content and carbohydrate metabolism, acting indirectly on starch/amino acid ratio. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription.	52
269851	cd14703	bZIP_plant_RF2	Basic leucine zipper (bZIP) domain of Plant RF2-like transcription factors: a DNA-binding and dimerization domain. This subfamily is composed of plant bZIP transcription factors with similarity to Oryza sativa RF2a and RF2b, which are important for plant development. As homodimers or heterodimers with each other, they interact with and activate transcription from the RTBV (rice tungro bacilliform virus) promoter, which is regulated by sequence-specific DNA-binding proteins that bind to the essential cis element BoxII. RF2a and RF2b show differences in binding affinities to BoxII, expression patterns in different rice organs, and subcellular localization. Transgenic rice with increased RF2a and RF2b display increased resistance to rice tungro disease (RTD) with no impact on plant development. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription.	52
269852	cd14704	bZIP_HY5-like	Basic leucine zipper (bZIP) domain of Plant Elongated/Long Hypocotyl5 (HY5)-like transcription factors and similar proteins: a DNA-binding and dimerization domain. This subfamily is predominantly composed of plant Basic leucine zipper (bZIP) transcription factors with similarity to Solanum lycopersicum and Arabidopsis thaliana HY5. Also included are the Dictyostelium discoideum bZIP transcription factors E and F. HY5 plays an important role in seedling development and is a positive regulator of photomorphogenesis. Plants with decreased levels of HY5 show defects in light responses including inhibited photomorphogenesis, loss of thylakoid organization, and reduced carotenoid accumulation. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription.	52
269853	cd14705	bZIP_Zip1	Basic leucine zipper (bZIP) domain of Fungal Zip1-like transcription factors: a DNA-binding and dimerization domain. This subfamily is composed of fungal bZIP transcription factors including Schizosaccharomyces pombe Zip1, Saccharomyces cerevisiae Methionine-requiring protein 28 (Met28p), and Neurospora crassa cys-3, among others. Zip1 is required for the production of key proteins involved in sulfur metabolism and also plays a role in cadmium response. Met28p acts as a cofactor of Met4p, a transcriptional activator of the sulfur metabolic network; it stabilizes DNA:Met4 complexes. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription.	55
269854	cd14706	bZIP_CREBZF	Basic leucine zipper (bZIP) domain of CREBZF/Zhangfei transcription factor and similar proteins: a DNA-binding and dimerization domain. CREBZF (also called Zhangfei, ZF, LAZip, or SMILE) is a neuronal bZIP transcription factor that is involved in the infection cycle of herpes simplex virus (HSV) and related cellular processes. It suppresses the ability of the HSV transactivator VP16 to initiate the viral replicative cycle. CREBZF has also been implicated in the regulation of the human nerve growth factor receptor trkA and the tumor suppressor p53. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription.	54
269855	cd14707	bZIP_plant_BZIP46	Basic leucine zipper (bZIP) domain of uncharacterized Plant BZIP transcription factors: a DNA-binding and dimerization domain. This subfamily is composed of uncharacterized plant bZIP transcription factors with similarity to Glycine max BZIP46, which may be a drought-responsive gene. Plant bZIPs are involved in developmental and physiological processes in response to stimuli/stresses such as light, hormones, and temperature changes. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription.	55
269856	cd14708	bZIP_HBP1b-like	Basic leucine zipper (bZIP) domain of uncharacterized BZIP transcription factors with similarity to Triticum aestivum HBP-1b: a DNA-binding and dimerization domain. This subfamily is composed primarily of uncharacterized bZIP transcription factors from flowering plants, mosses, clubmosses, and algae. Included in this subfamily is wheat HBP-1b, which contains a C-terminal DOG1 domain, which is a specific plant regulator for seed dormancy. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription.	53
269857	cd14709	bZIP_CREBL2	Basic leucine zipper (bZIP) domain of Cyclic AMP-responsive element-binding protein-like 2 (CREBL2): a DNA-binding and dimerization domain. CREBL2 is a bZIP transcription factor that interacts with CREB and plays a critical role in adipogenesis and lipogenesis. Its overexpression upregulates the expression of PPARgamma and CEBPalpha to promote adipogenesis as well as accelerate lipogenesis by increasing GLUT1 and GLUT4. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription.	56
269858	cd14710	bZIP_HAC1-like	Basic leucine zipper (bZIP) domain of Fungal HAC1-like transcription factors: a DNA-binding and dimerization domain. HAC1 (also called Hac1p or HacA) is a bZIP transcription factor that plays a critical role in the unfolded protein response (UPR). The UPR is initiated by the ER-resident protein kinase and endonuclease IRE1, which promotes non-conventional splicing of the HAC1 mRNA, facilitating its translation. HAC1 binds to and activates promoters of genes that encode chaperones and other targets of the UPR. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription.	53
269859	cd14711	bZIP_CEBPA	Basic leucine zipper (bZIP) domain of CCAAT/enhancer-binding protein alpha (CEBPA): a DNA-binding and dimerization domain. CEBPA is a critical regulator of myeloid development; it directs granulocyte and monocyte differentiation. It is highly expressed in early myeloid progenitors and is found mutated in over half of patients with acute myeloid leukemia (AML). It is also a key regulator in energy homeostasis; mice deficient of CEBPA show abnormalities in glycogen/lipid synthesis and storage. CEBPA is the longest CEBP protein containing two transactivation domains at the N-terminus followed by a regulatory domain, a bZIP domain, and C-terminal tail. CEBPs (or C/EBPs) are Basic leucine zipper (bZIP) transcription factors that regulate many cellular processes. There are six CEBP proteins in mammalian cells including CEBPA (alpha), CEBPB (beta), CEBPG (gamma), CEBPD (delta), and CEBPE (epsilon), which all contain highly conserved bZIP domains at their C-termini and variations at their N-terminal regions. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription.	61
269860	cd14712	bZIP_CEBPB	Basic leucine zipper (bZIP) domain of CCAAT/enhancer-binding protein beta (CEBPB): a DNA-binding and dimerization domain. CEBPB is a key regulator of metabolism, adipocyte differentiation, myogenesis, and macrophage activation. It is expressed as three distinct isoforms from an intronless gene through alternative translation initiation: CEBPB1 (or liver-enriched activator protein 1, LAP1); CEBPB2 (or LAP2); and CEBPB3 (or liver-enriched inhibitory protein, LIP). LAP1/2 function as transcriptional activators while LIP is a repressor due to its lack of a transactivation domain. The relative expression of LAP and LIP has effects on inflammation, ER stress, and insulin resistance. CEBPs (or C/EBPs) are Basic leucine zipper (bZIP) transcription factors that regulate many cellular processes. There are six CEBP proteins in mammalian cells including CEBPA (alpha), CEBPB (beta), CEBPG (gamma), CEBPD (delta), and CEBPE (epsilon), which all contain highly conserved bZIP domains at their C-termini and variations at their N-terminal regions. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription.	71
269861	cd14713	bZIP_CEBPG	Basic leucine zipper (bZIP) domain of CCAAT/enhancer-binding protein gamma (CEBPG): a DNA-binding and dimerization domain. CEBPG is an important regulator of cellular senescence; mouse embryonic fibroblasts deficient of CEBPG proliferated poorly, entered senescence prematurely, and expressed elevated levels of proinflammatory genes. It is also the primary transcription factor that regulates antioxidant and DNA repair transcripts in normal bronchial epithelial cells. In a subset of AML patients with CEBPA hypermethylation, CEBPG is significantly overexpressed. CEBPG is the shortest CEBP protein and it lacks a transactivation domain. It acts as a regulator and buffering reservoir against the transcriptional activities of other CEBP proteins. CEBPs (or C/EBPs) are Basic leucine zipper (bZIP) transcription factors that regulate many cellular processes. There are six CEBP proteins in mammalian cells including CEBPA (alpha), CEBPB (beta), CEBPG (gamma), CEBPD (delta), and CEBPE (epsilon), which all contain highly conserved bZIP domains at their C-termini and variations at their N-terminal regions. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription.	61
269862	cd14714	bZIP_CEBPD	Basic leucine zipper (bZIP) domain of CCAAT/enhancer-binding protein delta (CEBPD): a DNA-binding and dimerization domain. CEBPD is an inflammatory response gene that is induced by Toll-like receptor 4 (TLR4) and is essential in the expression of many lipopolysaccharide (LPS)-induced genes and the clearance of bacterial infection. Its expression is increased in response to various extracellular stimuli and it induces growth arrest and apoptosis in cancer cells. It is thought to function as a tumor suppressor and its expression is found reduced by site-specific methylation in many cancers including breast, cervical, and hepatocellular carcinoma. CEBPs (or C/EBPs) are Basic leucine zipper (bZIP) transcription factors that regulate many cellular processes. There are six CEBP proteins in mammalian cells including CEBPA (alpha), CEBPB (beta), CEBPG (gamma), CEBPD (delta), and CEBPE (epsilon), which all contain highly conserved bZIP domains at their C-termini and variations at their N-terminal regions. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription.	65
269863	cd14715	bZIP_CEBPE	Basic leucine zipper (bZIP) domain of CCAAT/enhancer-binding protein epsilon (CEBPE): a DNA-binding and dimerization domain. CEBPE is a critical regulator of terminal granulocyte differentiation or granulopoiesis. It is expressed only in myeloid cells. Mice deficient in CEBPE are normal at birth and fertile, but they do not produce normal neutrophils or eosinophils, and show impaired inflammatory and bactericidal responses. Functional loss of CEBPE causes the rare congenital disorder, Neutrophil-specific granule deficiency (SGD), which is characterized by patients' neutrophils with atypical nuclear morphology, abnormal migration and bactericidal activity, and the lack of specific granules. Patients with SGD suffer from severe and frequent bacterial infections. CEBPs (or C/EBPs) are Basic leucine zipper (bZIP) transcription factors that regulate many cellular processes. There are six CEBP proteins in mammalian cells including CEBPA (alpha), CEBPB (beta), CEBPG (gamma), CEBPD (delta), and CEBPE (epsilon), which all contain highly conserved bZIP domains at their C-termini and variations at their N-terminal regions. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription.	61
269864	cd14716	bZIP_CEBP-like_1	Basic leucine zipper (bZIP) domain of CCAAT/enhancer-binding protein (CEBP)-like proteins: a DNA-binding and dimerization domain. This group is an uncharacterized subfamily of CEBP-like proteins. CEBPs (or C/EBPs) are Basic leucine zipper (bZIP) transcription factors that regulate the cell cycle, differentiation, growth, survival, energy metabolism, innate and adaptive immunity, and inflammation, among others. They are also associated with cancer and viral disease. There are six CEBP proteins in mammalian cells including CEBPA (alpha), CEBPB (beta), CEBPG (gamma), CEBPD (delta), and CEBPE (epsilon), which all contain highly conserved bZIP domains at their C-termini and variations at their N-terminal regions. Each possesses unique properties to regulate cell type-specific growth and differentiation. The sixth isoform, CEBPZ (zeta), lacks an intact DNA-binding domain and is excluded from this subfamily. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription.	60
269865	cd14717	bZIP_Maf_small	Basic leucine zipper (bZIP) domain of small musculoaponeurotic fibrosarcoma (Maf) proteins: a DNA-binding and dimerization domain. Maf proteins are Basic leucine zipper (bZIP) transcription factors that may participate in the activator protein-1 (AP-1) complex, which is implicated in many cell functions including proliferation, apoptosis, survival, migration, tumorigenesis, and morphogenesis, among others. Maf proteins fall into two groups: small and large. The small Mafs (MafF, MafK, and MafG) do not contain a transactivation domain but do harbor the ancillary DNA-binding domain and a C-terminal bZIP domain. They form dimers with cap'n'collar (CNC) proteins that harbor transactivation domains, and they act either as activators or repressors depending on their dimerization partner. CNC transcription factors include NFE2 (nuclear factor, erythroid-derived 2) and similar proteins NFE2L1 (NFE2-like 1), NFE2L2, and NFE2L3, as well as BACH1 and BACH2. Small Mafs play roles in stress response and detoxification pathways. They also regulate the expression of betaA-globin and other genes activated during erythropoiesis. They have been implicated in various diseases such as diabetes, neurological diseases, thrombocytopenia and cancer. Triple deletion of the three small Mafs is embryonically lethal. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription.	70
269866	cd14718	bZIP_Maf_large	Basic leucine zipper (bZIP) domain of large musculoaponeurotic fibrosarcoma (Maf) proteins: a DNA-binding and dimerization domain. Maf proteins are Basic leucine zipper (bZIP) transcription factors that may participate in the activator protein-1 (AP-1) complex, which is implicated in many cell functions including proliferation, apoptosis, survival, migration, tumorigenesis, and morphogenesis, among others. Maf proteins fall into two groups: small and large. The large Mafs (c-Maf, MafA, MafB, and neural retina leucine zipper or NRL) contain an N-terminal transactivation domain, a linker region of varying size, an ancillary DNA-binding domain, and a C-terminal bZIP domain. They function as critical regulators of terminal differentiation in the blood and in many tissues such as bone, brain, kidney, pancreas, and retina. MafA and MafB also play crucial roles in islet beta cells; they regulate genes essential for glucose sensing and insulin secretion cooperatively and sequentially. Large Mafs are also implicated in oncogenesis; MafB and c-Maf chromosomal translocations result in multiple myelomas. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription.	70
269867	cd14719	bZIP_BACH	Basic leucine zipper (bZIP) domain of BTB and CNC homolog (BACH) proteins: a DNA-binding and dimerization domain. BACH proteins are Cap'n'Collar (CNC) Basic leucine zipper (bZIP) transcription factors that are defined by a conserved 43-amino acid region (called the CNC domain) located N-terminal to the bZIP DNA-binding domain. In addition, they contain a BTB domain (Broad complex-Tramtrack-Bric-a-brac domain, also known as the POZ [poxvirus and zinc finger] domain) that is absent in other CNC proteins. Vertebrates contain two members, BACH1 and BACH2. BACH1 forms heterodimers with small Mafs such as MafK to function as a repressor of heme oxygenase-1 (HO-1) gene (Hmox-1) enhancers. It has also been implicated as the master regulator of breast cancer bone metastasis. The BACH1 bZIP transcription factor should not be confused with the protein originally named as BRCA1-Associated C-terminal Helicase1 (BACH1), which has been renamed BRIP1 (BRCA1 Interacting Protein C-terminal Helicase1) and also called FANCJ. BACH2 is a B-cell specific transcription factor that plays a critical role in oxidative stress-mediated apoptosis. It plays an important role in class switching and somatic hypermutation of immunoglobulin genes. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription.	71
269868	cd14720	bZIP_NFE2-like	Basic leucine zipper (bZIP) domain of Nuclear Factor, Erythroid-derived 2 (NFE2) and similar proteins: a DNA-binding and dimerization domain. This subfamily is composed of NFE2 and NFE2-like proteins including NFE2-like 1 or NFE2-related factor 1 (NFE2L1 or Nrf1), NFE2L2 (or Nrf2), and NFE2L3 (or Nrf3). These are Cap'n'Collar (CNC) Basic leucine zipper (bZIP) transcription factors that are defined by a conserved 43-amino acid region (called the CNC domain) located N-terminal to the bZIP DNA-binding domain. NFE2 functions in development; it is required for the proper development of platelets. The three Nrfs function in stress responses. Nrf2, the most extensively studied member of this subfamily, acts as a xenobiotic-activated receptor that regulates the adaptive response to oxidants and electrophiles. As the master regulator of the antioxidant defense pathway, it plays roles in the biology of inflammation, obesity, and cancer. Nrf1 is an essential protein that binds to the antioxidant response element (ARE) and is also involved in regulating oxidative stress. In addition, it also regulates genes involved in cell and tissue differentiation, inflammation, and hepatocyte homeostasis. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription.	68
269869	cd14721	bZIP_Fos	Basic leucine zipper (bZIP) domain of the oncogene Fos (Fos): a DNA-binding and dimerization domain. Fos proteins are members of the activator protein-1 (AP-1) complex, which is mainly composed of Basic leucine zipper (bZIP) dimers of the Jun and Fos families, and to a lesser extent, the activating transcription factor (ATF) and musculoaponeurotic fibrosarcoma (Maf) families. The broad combinatorial possibilities for various dimers determine binding specificity, affinity, and the spectrum of regulated genes. The AP-1 complex is implicated in many cell functions including proliferation, apoptosis, survival, migration, tumorigenesis, and morphogenesis, among others. There are four Fos proteins: c-Fos, FosB, Fos-related antigen 1 (Fra-1), and Fra-2. In addition, FosB also exists as smaller splice variants FosB2 and deltaFosB2. They all contain an N-terminal region and a bZIP domain. c-Fos and FosB also contain a C-terminal transactivation domain which is absent in Fra-1/2 and the smaller FosB variants. Fos proteins can only heterodimerize with Jun and other AP-1 proteins, but cannot homodimerize. Fos:Jun heterodimers are more stable and can bind DNA with higher affinity than Jun:Jun homodimers. Fos proteins can enhance the trans-activating and transforming properties of Jun proteins. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription.	62
269870	cd14722	bZIP_ATF3	Basic leucine zipper (bZIP) domain of Activating Transcription Factor-3 (ATF-3) and similar proteins: a DNA-binding and dimerization domain. ATF-3 is a Basic leucine zipper (bZIP) transcription factor that is induced by various stress signals such as cytokines, genotoxic agents, or physiological stresses. It is implicated in cancer and host defense against pathogens. It negatively regulates the transcription of pro-inflammatory cytokines and is critical in preventing acute inflammatory syndromes. Mice deficient in ATF3 display increased susceptibility to endotoxic shock-induced death. ATF3 dimerizes with Jun and other ATF proteins; the heterodimers function either as activators or repressors depending on the promoter context. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription.	62
271240	cd14723	ZIP_Ppr1	Leucine zipper Dimerization coil of Ppr1-like transcription factors. Ppr1/Ppr1p activates transcription of the URA1, URA3, and URA4 genes, which encode enzymes involved in the regulation of pyrimidine levels. Also included in this subfamily is Colletotrichum acutatum Nir1 which plays a role during nitrogen-starvation conditions. Proteins in this subfamily are members of the Gal4p family of transcriptional activators which contain an N-terminal DNA-binding domain with a Zn2Cys6 binuclear cluster that interacts with CCG triplets and a leucine zipper-like heptad repeat that dimerizes. Dimerization allows binding of targets which contain two CCG motifs oriented in an inverted (CGG-CCG), direct (CCG-CCG), or everted (CCG-CGG) manner.	25
271241	cd14724	ZIP_Gal4-like_1	Leucine zipper Dimerization domain of Gal4-like transcription factors. The Gal4p family of transcriptional activators contain an N-terminal DNA-binding domain with a Zn2Cys6 binuclear cluster that interacts with CCG triplets and a leucine zipper-like heptad repeat that dimerizes. Dimerization allows binding of targets which contain two CCG motifs oriented in an inverted (CGG-CCG), direct (CCG-CCG), or everted (CCG-CGG) manner. Gal4p family members are involved in the activation of genes in response to a specific signal. Gal4p functions in the induction of GAL genes in the presence of galactose; GAL proteins are responsible for the transport of galactose into the cell and for its metabolism through the glycolytic pathway. Hap1p promotes transcription of genes required for respiration and controlling oxidative damage in response to heme. Put3p activates the transcription of the PUT1 and PUT2 genes in the presence of proline, allowing yeast cells to use proline as a nitrogen source. Sip4p activates target genes under conditions of glucose deprivation while Nir1 plays a role during nitrogen-starvation conditions. This subfamily is composed of uncharacterized members of the Gal4p family.	24
271242	cd14725	ZIP_Gal4-like_2	Leucine zipper Dimerization coil of Gal4-like transcription factors. The Gal4p family of transcriptional activators contain an N-terminal DNA-binding domain with a Zn2Cys6 binuclear cluster that interacts with CCG triplets and a leucine zipper-like heptad repeat that dimerizes. Dimerization allows binding of targets which contain two CCG motifs oriented in an inverted (CGG-CCG), direct (CCG-CCG), or everted (CCG-CGG) manner. Gal4p family members are involved in the activation of genes in response to a specific signal. Gal4p functions in the induction of GAL genes in the presence of galactose; GAL proteins are responsible for the transport of galactose into the cell and for its metabolism through the glycolytic pathway. Hap1p promotes transcription of genes required for respiration and controlling oxidative damage in response to heme. Put3p activates the transcription of the PUT1 and PUT2 genes in the presence of proline, allowing yeast cells to use proline as a nitrogen source. Sip4p activates target genes under conditions of glucose deprivation while Nir1 plays a role during nitrogen-starvation conditions. This subfamily is composed of uncharacterized members of the Gal4p family.	24
350608	cd14726	TraB_PrgY-like	TraB/PrgY confer plasmid-borne pheromone resistance. TraB/PrgY proteins, identified in gut bacterium Enterococcus faecalis, are plasmid-borne homologs that are induced by pheromones. Induction renders the host bacterium insensitive to self-induction by its own pheromones, and prevents the transfer of the pheromone-inducible conjugative plasmids to bacteria that already contain it. Based on homology to Tiki activity, it has been proposed that TraB acts as a protease in the inactivation of mating pheromone, cleaving at the amino-terminus. The pheromones are small peptides (7-8 residues) encoded by the bacterial genome, and are specific for particular plasmids, or class of plasmids, which may contain several virulence factors and disseminate rapidly. Plasmid-borne antibiotic resistance and virulence determinants make these elements important contributors to medical problems. TraB/PrgY is a member of a Tiki-like superfamily. Tiki is a membrane-associated metalloprotease (MEROPS family M96) that inhibits Wnt via the cleavage of its amino terminus. Wnt is essential in animal development and homeostasis. In Xenopus, Tiki is critical in head development. In human cells, Tiki inhibits Wnt-signaling. Tiki proteins are also related to erythromycin esterase, gumN plant pathogens, RtxA-containing toxins, and Campylobacter jejuni ChaN heme-binding protein.	177
350609	cd14727	ChanN-like	ChaN is an iron-regulated, heme-binding protein. This family represents a domain found in ChaN, a heme-binding/iron-regulated lipoprotein from Campylobacter jejuni. ChaN, possibly involved in the uptake of heme-iron, contains a pair of cofacial heme groups situated between two ChaN monomers. A single tyrosine residue contacts the heme-bound iron atom while the heme-binding regions of each monomer also have contacts to the heme in the complementary monomer. ChaN presumably associates with an outer membrane-associated receptor, ChaR. Campylobacter jejuni is an important cause of food-borne illness, and is dependent on iron uptake from the host. ChaN-like proteins are related to the Tiki/TraB-like family of proteases. Proteins containing this domain also include protein reticulata-related from Arabidopsis which may play a role in leaf development.	211
350610	cd14728	Ere-like	Erythromycin esterase and succinoglycan biosynthesis related proteins. This group contains erythromycin esterase, which shares conserved active site residues of the Tiki/TraB family. Erythromycin esterases (EreA and EreB) disrupt erythromycin via the hydrolysis of the macrolactone ring. A critical catalytic histidine acts as a general base in the activation of a water molecule. Macrolides act by inhibiting bacterial protein synthesis by binding at the exit tunnel of the 50S ribosomal subunit, blocking the translation of the polypeptide. Erythromycin esterase, typically found in integrons and transposons, confers antibiotic resistance through the disruption of the drug ring structure. The EreB substrate profile is substantially broader than that of EreA, being able to also metabolize semisynthetic derivatives such as the azalide azithromycin.	367
350611	cd14729	RtxA-like	C2-2 like domain of various multidomain toxins, including RTX-containing like proteins. This group contains mostly poorly-characterized C2-2 like domains of multidomain bacterial toxins, including Pasteurella multocida toxin PMT (also known as dermonecrotic toxin), MARTX (multifunctional-autoprocessing repeats-in-toxin holotoxin RtxA) proteins from Vibrio vulnificus, as well as bacterial effector protein from Pseudomonas syringae (type III effector HopAC1). MARTX domains at the N- and C- termini act in the translocation of the central domain across the eukaryotic plasma membrane, where it is proteolytically released. These are related to Pasteurella multocida toxin, which has structural and sequence similarity to the TIKI/TraB family of proteases. However, while this group of multidomain proteins shows fairly strong conservation of the active site residues of this family, the Pasteurella multocida toxin does not.	170
269830	cd14730	LodA_like	L-lysine epsilon-oxidase from Marinomonas mediterranea and similar proteins. L-lysine epsilon-oxidase is responsible for oxidative deamination of L-lysine, producing L-2-aminoadipate-6-semialdehyde. Hydrogen peroxide is a side-product of this enzymatic reaction, which requires the cofactor CTQ (cysteine tryptophylquinone). CTQ most likely forms a Schiff base with the free amino acid substrate. The protein is also called marinocine, for its broad-spectrum antibacterial activity; the latter is most likely caused by hydrogen peroxide synthesis. Homologs of LodA have been detected in various gram-negative bacteria, and they appear to be associated with the formation of biofilms.	509
269831	cd14731	LodA_like_1	Uncharacterized proteins similar to L-lysine epsilon-oxidase from Marinomonas mediterranea. L-lysine epsilon-oxidase is responsible for oxidative deamination of L-lysine, producing L-2-aminoadipate-6-semialdehyde. Hydrogen peroxide is a side-product of this enzymatic reaction, which requires the cofactor CTQ (cysteine tryptophylquinone). CTQ most likely forms a Schiff base with the free amino acid substrate. The protein is known for its broad-spectrum antibacterial activity; the latter is most likely caused by hydrogen peroxide synthesis. Although members of this related family share features of the active site, their functions are not known. Homologs of LodA have been detected in various gram-negative bacteria, and they appear to be associated with the formation of biofilms.	587
269832	cd14732	LodA	L-lysine epsilon-oxidase from Marinomonas mediterranea and similar proteins. L-lysine epsilon-oxidase is responsible for oxidative deamination of L-lysine, producing L-2-aminoadipate-6-semialdehyde. Hydrogen peroxide is a side-product of this enzymatic reaction, which requires the cofactor CTQ (cysteine tryptophylquinone). CTQ most likely forms a Schiff base with the free amino acid substrate. The protein is also called marinocine, for its broad-spectrum antibacterial activity; the latter is most likely caused by hydrogen peroxide synthesis. The dimerization interface observed in the available 3D structure does not seem to be conserved. Homologs of LodA have been detected in various gram-negative bacteria, and they appear to be associated with the formation of biofilms.	639
350515	cd14733	BACK	BACK (BTB and C-terminal Kelch) domain. The BACK domain is found in architectures C-terminal to a BTB domain, in a diverse set of architectures together with Kelch, MATH, and/or TAZ domains. It is involved in interactions with the Cullin3 (Cul3) ubiquitin ligase complex, as well as in homo-oligomerization. Most proteins containing the BACK domain are understood to function as adaptor proteins that play a role in ubiquitination of various substrates.	55
350516	cd14736	BACK_AtBPM-like	BACK (BTB and C-terminal Kelch) domain found in plant BTB/POZ-MATH (BPM) protein family. The BPM protein family includes Arabidopsis thaliana BTB/POZ and MATH domain-containing proteins, AtBPM1-6, and similar proteins. BPM protein, also termed protein BTB-POZ and MATH domain, may act as a substrate-specific adaptor of an E3 ubiquitin-protein ligase complex (CUL3-RBX1-BTB) which mediates the ubiquitination and subsequent proteasomal degradation of target proteins.	62
269822	cd14737	PAAR_1	proline-alanine-alanine-arginine (PAAR) domain. This domain is found in the PAAR (proline-alanine-alanine-arginine) repeat family, where it forms a sharp conical extension on the VgrG spike, a trimeric protein complex of the bacterial type VI secretion system (T6SS). The T6SS is responsible for translocation of a wide variety of toxic effector molecules, allowing predatory cells to kill prokaryotic as well as eukaryotic prey cells. The pointed tip of the PAAR domain is stabilized by a zinc atom positioned close to the cone's vertex and is likely to be important for its integrity during penetration of the target cell envelope. VgrG proteins are orthologous to the central baseplate spikes of bacteriophages with contractile tails, and genes encoding proteins with PAAR motifs have been frequently found immediately downstream from vgrG-like genes. It has been shown that PAAR proteins are essential for T6SS-mediated secretion and target cell killing by Vibrio cholerae (encodes two PAAR proteins) and Acinetobacter baylyi (encodes three PAAR proteins); inactivation of all these PAAR genes results in inactivation of Hcp secretion as well as T6SS-dependent killing of E. coli.	94
269823	cd14738	PAAR_2	proline-alanine-alanine-arginine (PAAR) domain. This domain is found in the PAAR (proline-alanine-alanine-arginine) repeat family, where it forms a sharp conical extension on the VgrG spike, a trimeric protein complex of the bacterial type VI secretion system (T6SS). The T6SS is responsible for translocation of a wide variety of toxic effector molecules, allowing predatory cells to kill prokaryotic as well as eukaryotic prey cells. The pointed tip of the PAAR domain is stabilized by a zinc atom positioned close to the cone's vertex and is likely to be important for its integrity during penetration of the target cell envelope. VgrG proteins are orthologous to the central baseplate spikes of bacteriophages with contractile tails, and genes encoding proteins with PAAR motifs have been frequently found immediately downstream from vgrG-like genes. It has been shown that PAAR proteins are essential for T6SS-mediated secretion and target cell killing by Vibrio cholerae (encodes two PAAR proteins) and Acinetobacter baylyi (encodes three PAAR proteins); inactivation of all these PAAR genes results in inactivation of Hcp secretion as well as T6SS-dependent killing of E. coli.	94
269824	cd14739	PAAR_3	proline-alanine-alanine-arginine (PAAR) domain. This domain is found in the PAAR (proline-alanine-alanine-arginine) repeat family, where it forms a sharp conical extension on the VgrG spike, a trimeric protein complex of the bacterial type VI secretion system (T6SS). The T6SS is responsible for translocation of a wide variety of toxic effector molecules, allowing predatory cells to kill prokaryotic as well as eukaryotic prey cells. The pointed tip of the PAAR domain is stabilized by a zinc atom positioned close to the cone's vertex and is likely to be important for its integrity during penetration of the target cell envelope. VgrG proteins are orthologous to the central baseplate spikes of bacteriophages with contractile tails, and genes encoding proteins with PAAR motifs have been frequently found immediately downstream from vgrG-like genes. It has been shown that PAAR proteins are essential for T6SS-mediated secretion and target cell killing by Vibrio cholerae (encodes two PAAR proteins) and Acinetobacter baylyi (encodes three PAAR proteins); inactivation of all these PAAR genes results in inactivation of Hcp secretion as well as T6SS-dependent killing of E. coli.	90
269825	cd14740	PAAR_4	proline-alanine-alanine-arginine (PAAR) domain. This domain is found in the PAAR (proline-alanine-alanine-arginine) repeat family of bacteria, and forms a sharp conical extension on the VgrG spike, a trimeric protein complex of the bacterial type VI secretion system (T6SS). A few members contain C-terminal domain extensions corresponding to Rearrangement hotspot (Rhs) protein repeats and conserved Rhs repeat-associated unique core sequences as well as uncharacterized domains such as DUF4150. However, these terminal domains are exposed to solution, and do not distort the binding site of VgrG. Rhs and related YD-peptide repeat proteins are widely distributed in bacteria. Rhs shares similar architecture with distantly related WapA proteins of Bacillus and Listeria species, suggesting intercellular growth inhibition as its primary function. Additionally, a plasmid-encoded Rhs protein has been implicated in bacteriocin production in Pseudomonas savastanoi. The pointed tip of the PAAR domain is stabilized by a zinc atom positioned close to the cone's vertex and is likely to be important for its integrity during penetration of the target cell envelope. VgrG proteins are orthologous to the central baseplate spikes of bacteriophages with contractile tails, and genes encoding proteins with PAAR motifs have been frequently found immediately downstream from vgrG-like genes.	121
269826	cd14741	PAAR_5	proline-alanine-alanine-arginine (PAAR) domain. This domain is found in the PAAR (proline-alanine-alanine-arginine) repeat family in bacteria as well as some archaea, where it forms a sharp conical extension on the VgrG spike, a trimeric protein complex of the bacterial type VI secretion system (T6SS). The T6SS is responsible for translocation of a wide variety of toxic effector molecules, allowing predatory cells to kill prokaryotic as well as eukaryotic prey cells. The pointed tip of the PAAR domain is stabilized by a zinc atom positioned close to the cone's vertex and is likely to be important for its integrity during penetration of the target cell envelope. VgrG proteins are orthologous to the central baseplate spikes of bacteriophages with contractile tails, and genes encoding proteins with PAAR motifs have been frequently found immediately downstream from vgrG-like genes. It has been shown that PAAR proteins are essential for T6SS-mediated secretion and target cell killing by Vibrio cholerae (encodes two PAAR proteins) and Acinetobacter baylyi (encodes three PAAR proteins); inactivation of all these PAAR genes results in inactivation of Hcp secretion as well as T6SS-dependent killing of E. coli.	95
269827	cd14742	PAAR_RHS	proline-alanine-alanine-arginine (PAAR) domain, also containing C-terminal Rearrangement hotspot (Rhs) extensions. This PAAR (proline-alanine-alanine-arginine) repeat subfamily, which forms a sharp conical extension on the VgrG spike, a trimeric protein complex of the bacterial type VI secretion system (T6SS), contains C- and N-terminal domain extensions. These include Rearrangement hotspot (Rhs) protein repeats and conserved Rhs repeat-associated unique core sequences at the C-terminal, and various predicted functions at N- and C-terminal extensions. However, these terminal domains are exposed to solution, and do not distort the binding site of VgrG. Rhs and related YD-peptide repeat proteins are widely distributed in bacteria. Rhs shares similar architecture with distantly related WapA proteins of Bacillus and Listeria species, suggesting intercellular growth inhibition as its primary function. Additionally, a plasmid-encoded Rhs protein has been implicated in bacteriocin production in Pseudomonas savastanoi. The pointed tip of the PAAR domain is stabilized by a zinc atom positioned close to the cone's vertex and is likely to be important for its integrity during penetration of the target cell envelope. VgrG proteins are orthologous to the central baseplate spikes of bacteriophages with contractile tails, and genes encoding proteins with PAAR motifs have been frequently found immediately downstream from vgrG-like genes.	86
269828	cd14743	PAAR_CT_1	proline-alanine-alanine-arginine (PAAR) domain with C-terminal extension. This domain is found in the PAAR (proline-alanine-alanine-arginine) repeat family of mostly gamma-proteobacteria, and forms a sharp conical extension on the VgrG spike, a trimeric protein complex of the bacterial type VI secretion system (T6SS). Some members contain C-terminal domain extensions corresponding to Rearrangement hotspot (Rhs) protein repeats and conserved Rhs repeat-associated unique core sequences as well as uncharacterized domains. However, these terminal domains are exposed to solution, and do not distort the binding site of VgrG. Rhs and related YD-peptide repeat proteins are widely distributed in bacteria. Rhs shares similar architecture with distantly related WapA proteins of Bacillus and Listeria species, suggesting intercellular growth inhibition as its primary function. Additionally, a plasmid-encoded Rhs protein has been implicated in bacteriocin production in Pseudomonas savastanoi. The pointed tip of the PAAR domain is stabilized by a zinc atom positioned close to the cone's vertex and is likely to be important for its integrity during penetration of the target cell envelope. VgrG proteins are orthologous to the central baseplate spikes of bacteriophages with contractile tails, and genes encoding proteins with PAAR motifs have been frequently found immediately downstream from vgrG-like genes.	78
269829	cd14744	PAAR_CT_2	proline-alanine-alanine-arginine (PAAR) domain with uncharacterized C-terminal extension. This domain is found in the PAAR (proline-alanine-alanine-arginine) repeat family of mostly beta- and gamma-proteobacteria, and forms a sharp conical extension on the VgrG spike, a trimeric protein complex of the bacterial type VI secretion system (T6SS). Most members contain C-terminal domain extensions corresponding to several uncharacterized domains such as S-type pyocin, DUF2235, DUF2345 and cytotoxic proteins. However, these terminal domains are exposed to solution, and do not distort the binding site of VgrG. The pointed tip of the PAAR domain is stabilized by a zinc atom positioned close to the cone's vertex and is likely to be important for its integrity during penetration of the target cell envelope. VgrG proteins are orthologous to the central baseplate spikes of bacteriophages with contractile tails, and genes encoding proteins with PAAR motifs have been frequently found immediately downstream from vgrG-like genes.	78
270613	cd14745	GH66	Glycoside Hydrolase Family 66. Glycoside Hydrolase Family 66 contains proteins characterized as cycloisomaltooligosaccharide glucanotransferase (CITase) and dextranases from a variety of bacteria. CITase cyclizes part of a (1-6)-alpha-D-glucan (dextrans) chain by formation of a (1-6)-alpha-D-glucosidic bond. Dextranases catalyze the endohydrolysis of (1-6)-alpha-D-glucosidic linkages in dextran. Some members contain Carbohydrate Binding Module 35 (CBM35) domains, either C-terminal or inserted in the domain or both.	331
270450	cd14747	PBP2_MalE	Maltose-binding protein MalE; possesses type 2 periplasmic binding fold. This group includes the periplasmic maltose-binding component of an ABC transport system from the phytopathogen Xanthomonas citri and its related bacterial proteins. Members of this group belong to the type 2 periplasmic-binding fold superfamily. PBP2 proteins are comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	386
270451	cd14748	PBP2_UgpB	The periplasmic-binding component of ABC transport system specific for sn-glycerol-3-phosphate; possesses type 2 periplasmic binding fold. This group includes the periplasmic component of an ABC transport system specific for sn-glycerol-3-phosphate (G3P) and closely related proteins from archaea and bacteria. Under phosphate starvation conditions, Escherichia coli can utilize G3P as phosphate source when exclusively imported by an ATP-binding cassette (ABC) transporter composed of the periplasmic binding protein, UgpB, the transmembrane subunits, UgpA and UgpE, and a homodimer of the nucleotide binding subunit, UgpC. Members of this group belong to the type 2 periplasmic-binding fold superfamily. PBP2 proteins are comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	385
270452	cd14749	PBP2_XBP1_like	The periplasmic-binding component of ABC transport systems specific for xylo-oligosaccharides; possesses type 2 periplasmic binding fold. This group represents the periplasmic component of an ABC transport system XBP1 that shows preference for xylo-oligosaccharides in the order of xylotriose > xylobiose > xylotetraose. Members of this group belong to the type 2 periplasmic-binding fold superfamily. PBP2 proteins are comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	388
270453	cd14750	PBP2_TMBP	The periplasmic-binding component of ABC transport systems specific for trehalose/maltose; possesses type 2 periplasmic binding fold. This group represents the periplasmic trehalose/maltose-binding component of an ABC transport system and related proteins from archaea and bacteria. Members of this group belong to the type 2 periplasmic-binding fold superfamily. PBP2 proteins are comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	385
270454	cd14751	PBP2_GacH	The periplasmic-binding component of the putative oligosaccharide ABC transporter GacHFG; possesses type 2 periplasmic binding fold. This group represents the periplasmic component GacH of an ABC import system. GacH is identified as a maltose/maltodextrin-binding protein with a low affinity for acarbose. Members of this group belong to the type 2 periplasmic-binding fold superfamily. PBP2 proteins are comprised of two globular subdomains connected by a flexible hinge and bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. The majority of PBP2 proteins function in the uptake of small soluble substrates in eubacteria and archaea. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.	376
270212	cd14752	GH31_N	N-terminal domain of glycosyl hydrolase family 31 (GH31). This family is found N-terminal to the glycosyl-hydrolase domain of Glycoside hydrolase family 31 (GH31). GH31 includes the glycoside hydrolases alpha-glucosidase (EC 3.2.1.20), alpha-1,3-glucosidase (EC 3.2.1.84), alpha-xylosidase (EC 3.2.1.177), sucrase-isomaltase (EC 3.2.1.48 and EC 3.2.1.10), as well as alpha-glucan lyase (EC 4.2.2.13). All GH31 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein. In most cases, the pyranose moiety recognized in subsite-1 of the substrate binding site is an alpha-D-glucose, though some GH31 family members show a preference for alpha-D-xylose. Several GH31 enzymes can accommodate both glucose and xylose and different levels of discrimination between the two have been observed. Most characterized GH31 enzymes are alpha-glucosidases. In mammals, GH31 members with alpha-glucosidase activity are implicated in at least three distinct biological processes. The lysosomal acid alpha-glucosidase (GAA) is essential for glycogen degradation and a deficiency or malfunction of this enzyme causes glycogen storage disease II, also known as Pompe disease. In the endoplasmic reticulum, alpha-glucosidase II catalyzes the second step in the N-linked oligosaccharide processing pathway that constitutes part of the quality control system for glycoprotein folding and maturation. The intestinal enzymes sucrase-isomaltase (SI) and maltase-glucoamylase (MGAM) play key roles in the final stage of carbohydrate digestion, making alpha-glucosidase inhibitors useful in the treatment of type 2 diabetes. GH31 alpha-glycosidases are retaining enzymes that cleave their substrates via an acid/base-catalyzed, double-displacement mechanism involving a covalent glycosyl-enzyme intermediate. Two aspartic acid residues of the catalytic domain have been identified as the catalytic nucleophile and the acid/base, respectively. A loop of the N-terminal beta-sandwich domain is part of the active site pocket.	122
271288	cd14755	GS_BA2291-HK-like	Non-heme globin sensor domain of BA2291 histidine kinase and related domains. This subfamily includes the sensor domain of Bacillus anthracis BA2291 histidine kinase. BA2291 is one of the most active kinases in promoting sporulation, and is found in most members of the Bacillus cereus subfamily of the genus Bacillus, which includes B. anthracis and Bacillus thuringiensis, but not Bacillus subtilis. This subfamily also includes two sensor-only plasmid encoded sporulation inhibitors pXO1-118 and pXO2-61 found only in B. anthracis and various strains of Bacillus cereus having similar plasmids. The pXO1-118 and pXO2-61 sensor domains form homodimers, and in vitro bind fatty acid and halide, and not heme; there may be roles for fatty acid (or similar molecule), chloride ion, and possibly pH, as signaling cues. It has been proposed that BA2291 senses the same environmental cue in vivo, and that pXO1-118 and pXO2-61 act by titrating out an environmental signal that might cause an ill-timed sporulation.	132
381270	cd14756	TrHb	Truncated Mb-fold globins, T family. The M- and S families exhibit the canonical secondary structure of hemoglobins, a 3-over-3 alpha-helical sandwich structure (3/3 Mb-fold), built by eight alpha-helical segments. Truncated hemoglobins (TrHbs, 2/2Hb, or 2/2 globins) or the T family globins adopt a 2-on-2 alpha-helical sandwich structure, resulting from extensive and complex modifications of the canonical 3-on-3 alpha-helical sandwich that are distributed throughout the whole protein molecule. They are classified into three main groups based on their structural properties and named after Mycobacterium sp. genes glbN, glbO, and glbP: TrHb1s (N), TrHb2s (O) and TrHb3s (P). Typical of the TrHb1s (N) group is a protein matrix tunnel. An example of a TrHb1 is Mycobacterium tuberculosis TrHb1/Mt-trHbN which is expressed during the Mycobacterium stationary phase, and plays a specific defense role against nitrosative stress. TrHb2s include the dimeric Arabidopsis thaliana TrHb2 AtGLB3. GLB3 is likely to have a function distinct from other plant globins: it exhibits a low O2 affinity, an unusual concentration-independent binding of O2 and CO, and does not respond to any of the treatments that induce plant 3-on-3 globins. TrHb3s include Campylobacter jejuni Ctb, encoded by Cj0465c, which may play a role in moderating O2 flux within C. jejuni.	111
381271	cd14757	GS_EcDosC-like_GGDEF	Globin sensor domain of Escherichia coli Direct Oxygen Sensing Cyclase and related proteins; coupled to a C-terminal GGDEF domain. Globin-coupled-sensors belonging to this subfamily have a C-terminal diguanylate cyclase (DGC/GGDEF) domain coupled to the globin sensor domain. DGC/GGDEF likely functions as a c-di-GMP cyclase in the synthesis of the second messenger cyclic-di-GMP (c-di-GMP). Members include Escherichia coli DosC (also known as YddV), the gene for which is found in a two-gene operon, dosCP. In DosC, the sensory globin domain is coupled to a GGDEF-class diguanylate cyclase, while in DosP, a heme-containing PAS domain is coupled to an EAL-class c-di-GMP phosphodiesterase. DosP and DosC associate in a di-GMP-responsive Escherichia coli RNA processing complex along with polynucleotide phosphorylase (PNPase), enolase, RNase E, and RNA.	149
381272	cd14758	GS_GGDEF_1	Globin sensor domain, coupled to DGC/GGDEF domains; uncharacterized subgroup. Globin-coupled-sensors belonging to this subfamily have a sensor domain coupled to a C-terminal diguanylate cyclase (DGC/GGDEF) domain. DGC/GGDEF likely functions as a c-di-GMP cyclase in the synthesis of the second messenger cyclic-di-GMP (c-di-GMP).	148
381273	cd14759	GS_GGDEF_2	Globin sensor domain, coupled to DGC/GGDEF domains; uncharacterized subgroup. The majority of globin-coupled-sensors in this subfamily have diguanylate cyclase (DGC/GGDEF) domains N-terminal to the globin sensor domain and/or C-terminal EAL domains. DGC/GGDEF and EAL domains are involved in the synthesis and degradation of the second messenger c-di-GMP, respectively. Some members have GAF small-molecule-binding domains in addition.	150
271293	cd14760	GS_PAS-GGDEF-EAL	Globin sensor domain; coupled to PAS, DGC/GGDEF and EAL domains. In addition to the N-terminal sensing domain, globin-coupled-sensors in this bacterial subfamily have a signal-sensing PAS domain, and diguanylate cyclase (DGC/GGDEF) and EAL domains. The latter two domains are involved in the synthesis and degradation of c-di-GMP, respectively, and may be involved in regulating cell surface adhesiveness, and in the transition between planktonic and biofilm growth modes.	148
381274	cd14761	GS_GsGCS-like	Globin sensor domain of Geobacter sulfurreducens globin-coupled-sensor and related proteins. GsGCS is a GCS of unknown function, comprised of an N-terminal globin sensor domain and a C-terminal transmembrane signal-transduction domain. For GCSs in general, the first signal O2 binds to/dissociates from the heme iron complex inducing a structural change in the globin domain, which is then transduced to the functional domain, switching on (or off) the function of the latter. Ferric GsGCS is bis-histidyl hexa-coordinated (provided by a His residue located at the E11 topological site, as distinct from the E7 site). Ferrous GsGCS is a penta- and hexa-coordinated mixture. The C-terminal domains of other members of this subfamily include histidine kinase and PsiE domains.	149
381275	cd14762	GS_STAS	Globin sensor domain; coupled to a STAS domain. Globin-coupled-sensors in this subfamily have a C-terminal sulphate transporter and anti-sigma factor antagonist (STAS) domain coupled to the globin sensor domain.	143
381276	cd14763	SSDgbs_1	Sensor single-domain globins; uncharacterized bacterial subgroup. This subfamily of sensor single-domain globins, belongs to a family that includes GCSs (globin-coupled-sensors) and single-domain protoglobins (Pgbs). For GCSs, an N-terminal heme-bound oxygen-sensing/binding globin domain is coupled to a C-terminal functional/signaling domain. The first signal O2 binds to/dissociates from the heme in its sensor domain inducing a conformational change in that domain and ultimately in the signaling domain. It has been demonstrated that the Pgbs and other single domain globins can function as sensors, when coupled to an appropriate regulatory domain.	144
381277	cd14764	SSDgbs_2	Sensor single-domain globins; uncharacterized subgroup. This subfamily of sensor single-domain globins, belongs to a family that includes GCSs (globin-coupled-sensors) and single-domain protoglobins (Pgbs). For GCSs, an N-terminal heme-bound oxygen-sensing/binding globin domain is coupled to a C-terminal functional/signaling domain. The first signal O2 binds to/dissociates from the heme in its sensor domain inducing a conformational change in that domain and ultimately in the signaling domain. It has been demonstrated that the Pgbs and other single domain globins can function as sensors, when coupled to an appropriate regulatory domain.	145
381278	cd14765	Hb	Hemoglobins. Hb is the oxygen transport protein of erythrocytes. It is an allosterically modulated heterotetramer. Hemoglobin A (HbA) is the most common Hb in adult humans, and is formed from two alpha-chains and two beta-chains (alpha2beta2). An equilibrium exists between deoxygenated/unliganded/T (tense state) Hb having low oxygen affinity, and oxygenated/liganded/R (relaxed state) Hb having a high oxygen affinity. Various endogenous heterotropic effectors bind Hb to modulate its oxygen affinity and cooperative behavior, e.g. hydrogen ions, chloride ions, carbon dioxide and 2,3-bisphosphoglycerate. Hb is also an allosterically regulated nitrite reductase; the plasma nitrite anion may be activated by hemoglobin in areas of hypoxia to bring about vasodilation. Other Hb types are: HbA2 (alpha2delta2) which, in normal individuals, is naturally expressed at a low level; Hb Portland-1 (zeta2gamma2), Hb Gower-1 (zeta2epsilon2), and Hb Gower-2 (alpha2epsilon2), which are Hbs present during the embryonic period; and fetal hemoglobin (HbF, alpha2gamma2), the primary Hb throughout most of gestation. These Hb types have differences in O2 affinity and in their interactions with allosteric effectors.	131
381279	cd14766	CeGLB25-like	Caenorhabditis elegans globin GLB-25, and related globins. The C. elegans genome contains 33 genes encoding globins that are all transcribed. These are very diverse in gene and protein structure and are localized in a variety of cells. The C. elegans globin GLB-25 (locus tag T06A1.3), like the majority of them, was expressed in neuronal cells in the head and tail portions of the body and in the nerve cord.	137
271300	cd14767	PE_beta-like	Phycoerythrin beta subunit, a component of the phycobilisome rod; and related proteins. Phycobilisomes (PBSs) are the main light-harvesting complex in cyanobacteria and red algae. In general, they consist of a central core and surrounding rods and function to harvest and channel light energy toward the photosynthetic reaction centers within the membrane. They are comprised of phycobiliproteins/chromophorylated proteins (PBPs) maintained together by linker polypeptides. PBPs have different numbers of chromophores, and the basic monomer component (alpha/beta heterodimers) can further oligomerize to ring-shaped trimers (heterohexamers) and hexamers (heterododecamers). Stacked PBP hexamers form both the core and the rods of the PBS; the core is mainly made up by allophycocyanin (APC) while the rods can be composed of the PBPs phycoerythrin (PE), phycocyanin (PC) and phycoerythrocyanin (PEC). This subfamily also includes the beta subunits of cryptophyte phycobiliproteins, which represent another type of biliprotein antenna with different structure and organization. The beta subunits of cryptophyte PBPs share a high degree of sequence identity with both the alpha and beta subunits of the cyanobacterial and red algal PBPs; however, the alpha cryptophyte subunits are shorter, and unrelated. There is only one type of PBP present in a single species, either phycocyanin or phycoerythrin, but not allophycocyanin. Structurally, phycoerythrin in cryptophytes is an alpha1alpha2betabeta dimer and not a trimer as in the PBS.	176
271301	cd14768	PC_PEC_beta	Beta subunits of phycoerythrin and phycoerythrocyanin; phycobilisome rod components. Phycobilisomes (PBSs) are the main light-harvesting complex in cyanobacteria and red algae. In general, they consist of a central core and surrounding rods and function to harvest and channel light energy toward the photosynthetic reaction centers within the membrane. They are comprised of phycobiliproteins/chromophorylated proteins (PBPs) maintained together by linker polypeptides. PBPs have different numbers of chromophores, and the basic monomer component (alpha/beta heterodimers) can further oligomerize to ring-shaped trimers (heterohexamers) and hexamers (heterododecamers). Stacked PBP hexamers form both the core and the rods of the PBS; the core is mainly made up by allophycocyanin (APC) while the rods can be composed of the PBPs phycoerythrin (PE), phycocyanin (PC) and phycoerythrocyanin (PEC).	171
271302	cd14769	PE_alpha	Phycoerythrin alpha subunit, a phycobilisome rod component. Phycobilisomes (PBSs) are the main light-harvesting complex in cyanobacteria and red algae. In general, they consist of a central core and surrounding rods and function to harvest and channel light energy toward the photosynthetic reaction centers within the membrane. They are comprised of phycobiliproteins/chromophorylated proteins (PBPs) maintained together by linker polypeptides. PBPs have different numbers of chromophores, and the basic monomer component (alpha/beta heterodimers) can further oligomerize to ring-shaped trimers (heterohexamers) and hexamers (heterododecamers). Stacked PBP hexamers form both the core and the rods of the PBS; the core is mainly made up by allophycocyanin (APC) while the rods can be composed of the PBPs phycoerythrin (PE), phycocyanin (PC) and phycoerythrocyanin (PEC).	164
271303	cd14770	PC-PEC_alpha	Alpha subunits of phycoerythrin and phycoerythrocyanin; phycobilisome rod components. Phycobilisomes (PBSs) are the main light-harvesting complex in cyanobacteria and red algae. In general, they consist of a central core and surrounding rods and function to harvest and channel light energy toward the photosynthetic reaction centers within the membrane. They are comprised of phycobiliproteins/chromophorylated proteins (PBPs) maintained together by linker polypeptides. PBPs have different numbers of chromophores, and the basic monomer component (alpha/beta heterodimers) can further oligomerize to ring-shaped trimers (heterohexamers) and hexamers (heterododecamers). Stacked PBP hexamers form both the core and the rods of the PBS; the core is mainly made up by allophycocyanin (APC) while the rods can be composed of the PBPs phycoerythrin (PE), phycocyanin (PC) and phycoerythrocyanin (PEC).	162
381280	cd14771	TrHb2_Mt-trHbO-like_O	Truncated hemoglobins, group 2 (O); Mycobacterium tuberculosis hemoglobin O like. The M- and S families exhibit the canonical secondary structure of hemoglobins, a 3-over-3 alpha-helical sandwich structure (3/3 Mb-fold), built by eight alpha-helical segments. Truncated hemoglobins (TrHbs, 2/2Hb, or 2/2 globins) or the T family globins adopt a 2-on-2 alpha-helical sandwich structure, resulting from extensive and complex modifications of the canonical 3-on-3 alpha-helical sandwich that are distributed throughout the whole protein molecule. TrHbs are classified into three main groups based on their structural properties and named after Mycobacterium sp. genes glbN, glbO, and glbP: TrHb1s (N), TrHb2s (O) and TrHb3s (P). This group includes a Mycobacterium tuberculosis TrHb2, Mt-trHbO, encoded by the Mycobacterium tuberculosis glbO gene, which is expressed throughout the Mycobacterium growth phase. It also includes a TrHb2 from the thermophilic Thermobifida fusca (Tf-trHb), which has a high thermostability; at the optimal growth temperature for Thermobifida fusca (between 55 and 60 degrees C), it is capable of efficient O2 binding and release. Tf-trHb shares a relatively slow rate of oxygen binding with Mt-trHbO.	119
381281	cd14772	TrHb2_Bs-trHb-like_O	Truncated hemoglobins, group 2 (O); Bacillus subtilis TrHb like. The M- and S families exhibit the canonical secondary structure of hemoglobins, a 3-over-3 alpha-helical sandwich structure (3/3 Mb-fold), built by eight alpha-helical segments. Truncated hemoglobins (TrHbs, 2/2Hb, or 2/2 globins) or the T family globins adopt a 2-on-2 alpha-helical sandwich structure, resulting from extensive and complex modifications of the canonical 3-on-3 alpha-helical sandwich that are distributed throughout the whole protein molecule. TrHbs are classified into three main groups based on their structural properties and named after Mycobacterium sp. genes glbN, glbO, and glbP: TrHb1s (N), TrHb2s (O) and TrHb3s (P). TrHb2s belonging to this group include monomeric Bacillus subtilis trHb (Bs-trHb), which exhibits an extremely high oxygen affinity, and a dimeric TrHb2 from the thermophilic aerobic spore-forming bacterium Geobacillus stearothermophilus (Gs-trHb).	116
381282	cd14773	TrHb2_PhHbO-like_O	Truncated hemoglobins, group 2 (O); Pseudoalteromonas haloplanktis PhHbO like. The M- and S families exhibit the canonical secondary structure of hemoglobins, a 3-over-3 alpha-helical sandwich structure (3/3 Mb-fold), built by eight alpha-helical segments. Truncated hemoglobins (TrHbs, 2/2Hb, or 2/2 globins) or the T family globins adopt a 2-on-2 alpha-helical sandwich structure, resulting from extensive and complex modifications of the canonical 3-on-3 alpha-helical sandwich that are distributed throughout the whole protein molecule. TrHbs are classified into three main groups based on their structural properties and named after Mycobacterium sp. genes glbN, glbO, and glbP: TrHb1s (N), TrHb2s (O) and TrHb3s (P). TrHb2s belonging to this group include Pseudoalteromonas haloplanktis PhHbO (encoded by the PSHAa0030 gene), which appears to be involved in oxidative and nitrosative stress resistance.	119
381283	cd14774	TrHb2_HGbIV-like_O	Hell's Gate globin IV and similar truncated hemoglobins, group 2 (O). The M- and S families exhibit the canonical secondary structure of hemoglobins, a 3-over-3 alpha-helical sandwich structure (3/3 Mb-fold), built by eight alpha-helical segments. Truncated hemoglobins (TrHbs, 2/2Hb, or 2/2 globins) or the T family globins adopt a 2-on-2 alpha-helical sandwich structure, resulting from extensive and complex modifications of the canonical 3-on-3 alpha-helical sandwich that are distributed throughout the whole protein molecule. TrHbs are classified into three main groups based on their structural properties and named after Mycobacterium sp. genes glbN, glbO, and glbP: TrHb1s (N), TrHb2s (O) and TrHb3s (P).	131
381284	cd14775	TrHb2_O-like	Truncated hemoglobins, group 2 (O); uncharacterized subgroup. The M- and S families exhibit the canonical secondary structure of hemoglobins, a 3-over-3 alpha-helical sandwich structure (3/3 Mb-fold), built by eight alpha-helical segments. Truncated hemoglobins (TrHbs, 2/2Hb, or 2/2 globins) or the T family globins adopt a 2-on-2 alpha-helical sandwich structure, resulting from extensive and complex modifications of the canonical 3-on-3 alpha-helical sandwich that are distributed throughout the whole protein molecule. TrHbs are classified into three main groups based on their structural properties and named after Mycobacterium sp. genes glbN, glbO, and glbP: TrHb1s (N), TrHb2s (O) and TrHb3s (P).	119
271309	cd14776	HmpEc-globin-like	Globin domain of Escherichia coli flavohemoglobin (Hmp) and related proteins. Flavohemoglobins (flavoHbs) function primarily as nitric oxide dioxygenases (NODs, EC 1.14.12.17), converting NO and O2 to inert NO3- (nitrate). They have an N-terminal globin domain and a C-terminal ferredoxin reductase-like NAD- and FAD-binding domain, and use the reducing power of cellular NAD(P)H to drive regeneration of the ferrous heme. They protect from nitrosative stress (the broad range of cellular toxicities caused by NO), and modulate NO signaling pathways. This subfamily includes Vibrio fischeri Hmp and E. coli Hmp. NO scavenging by flavoHb affects the swarming behavior of Escherichia coli, and protects against NO during initiation of the squid-Vibrio symbiosis. E. coli Hmp can catalyze the reduction of several alkylhydroperoxide substrates into their corresponding alcohols using NADH as an electron donor, and it has been suggested that it participates in the repair of the lipid membrane oxidative damage generated during oxidative/nitrosative stress.	138
381285	cd14777	Yhb1-globin-like	Globin domain of Saccharomyces cerevisiae flavohemoglobin (Yhb1p) and related domains. FlavoHbs function primarily as nitric oxide dioxygenases (NODs, EC 1.14.12.17), converting NO and O2 to inert NO3- (nitrate). They have an N-terminal globin domain and a C-terminal ferredoxin reductase-like NAD- and FAD-binding domain, and use the reducing power of cellular NAD(P)H to drive regeneration of the ferrous heme. They protect from nitrosative stress (the broad range of cellular toxicities caused by NO), and modulate NO signaling pathways. S. cerevisiae Yhb1p has been shown to protect against nitrosative stress and to control ferric reductase activity; it may participate in regulating the activity of plasma membrane ferric reductase(s). Also included in this subfamily is Dictyostelium discoideum FlavoHb, the expression of which affects D. discoideum development.	140
381286	cd14778	VtHb-like_SDgb	Vitreoscilla stercoraria hemoglobin and related proteins; single-domain globins. VtHb is homodimeric, and may both transport oxygen to terminal respiratory oxidases, and provide resistance to nitrosative stress. It has medium oxygen affinity and displays cooperative ligand-binding properties. VtHb has biotechnological applications: its expression in heterologous hosts (bacteria and plants) has improved growth and productivity under microaerobic conditions. Another member of this subfamily, Campylobacter jejuni hemoglobin (Cgb), is monomeric and plays a role in detoxifying NO. Along with a truncated globin Ctb, it is up-regulated by the transcription factor NssR in response to nitrosative stress.	140
381287	cd14779	FHP_Ae-globin-like	Globin domain of Alcaligenes eutrophus flavohemoglobin (FHP) and related proteins. Flavohemoglobins (flavoHbs) function primarily as nitric oxide dioxygenases (NODs, EC 1.14.12.17), converting NO and O2 to inert NO3- (nitrate). They have an N-terminal globin domain and a C-terminal ferredoxin reductase-like NAD- and FAD-binding domain, and use the reducing power of cellular NAD(P)H to drive regeneration of the ferrous heme. They protect from nitrosative stress (the broad range of cellular toxicities caused by NO), and modulate NO signaling pathways. NO scavenging by flavoHb maintains Medicago truncatula-Sinorhizobium meliloti symbiosis. Alcaligenes eutrophus FHP contains a phospholipid-binding site.	140
381288	cd14780	HmpPa-globin-like	Globin domain of Pseudomonas aeruginosa flavohemoglobin (HmpPa) and related proteins. Flavohemoglobins (flavoHbs) function primarily as nitric oxide dioxygenases (NODs, EC 1.14.12.17), converting NO and O2 to inert NO3- (nitrate). They have an N-terminal globin domain and a C-terminal ferredoxin reductase-like NAD- and FAD-binding domain, and use the reducing power of cellular NAD(P)H to drive regeneration of the ferrous heme. They protect from nitrosative stress (the broad range of cellular toxicities caused by NO), and modulate NO signaling pathways. The physiological role of HmpPa is thought to be detoxification of NO under aerobic conditions.	140
381289	cd14781	FHb-globin_1	Globin domain of flavohemoglobins (flavoHbs); uncharacterized subgroup. FlavoHbs function primarily as nitric oxide dioxygenases (NODs, EC 1.14.12.17), converting NO and O2 to inert NO3- (nitrate). They have an N-terminal globin domain and a C-terminal ferredoxin reductase-like NAD- and FAD-binding domain, and use the reducing power of cellular NAD(P)H to drive regeneration of the ferrous heme. They protect from nitrosative stress (the broad range of cellular toxicities caused by NO), and modulate NO signaling pathways. This subfamily may contain some single-domain globins (SDgbs).	139
381290	cd14782	FHb-globin_2	Globin domain of flavohemoglobins (flavoHbs); uncharacterized subgroup. FlavoHbs function primarily as nitric oxide dioxygenases (NODs, EC 1.14.12.17), converting NO and O2 to inert NO3- (nitrate). They have an N-terminal globin domain and a C-terminal ferredoxin reductase-like NAD- and FAD-binding domain, and use the reducing power of cellular NAD(P)H to drive regeneration of the ferrous heme. They protect from nitrosative stress (the broad range of cellular toxicities caused by NO), and modulate NO signaling pathways.	143
271316	cd14783	FHb-globin_3	Globin domain of flavohemoglobins (flavoHbs); uncharacterized subgroup. FlavoHbs function primarily as nitric oxide dioxygenases (NODs, EC 1.14.12.17), converting NO and O2 to inert NO3- (nitrate). They have an N-terminal globin domain and a C-terminal ferredoxin reductase-like NAD- and FAD-binding domain, and use the reducing power of cellular NAD(P)H to drive regeneration of the ferrous heme. They protect from nitrosative stress (the broad range of cellular toxicities caused by NO), and modulate NO signaling pathways.	140
381291	cd14784	class1_nsHb-like	Class 1 nonsymbiotic hemoglobins and related proteins. Class1 nsHbs include the dimeric hexacoordinate Trema tomentosa nsHb and the dimeric hexacoordinate nsHb from monocot barley. This subfamily also includes ParaHb, a dimeric pentacoordinate Hb from the root nodules of Parasponia andersonii, a non-legume capable of symbiotic nitrogen fixation. ParaHb is unusual in that it has different heme redox potentials for each subunit. 	149
270211	cd14785	V-ATPase_C	Subunit C of vacuolar H+-ATPase (V-ATPase). This family contains subunit C of vacuolar H+-ATPase (V-ATPase), a protein that plays a crucial role in the vacuolar system of eukaryotic cells. The main function of V-ATPase is to generate a proton-motive force at the expense of ATP and to cause limited acidification in the internal space (lumen) of several organelles of the vacuolar system. V-ATPases are multi-subunit protein complexes made up of two distinct structures: a peripheral catalytic sector (V1) and a hydrophobic membrane sector (V0) responsible for driving protons; subunit C is one of five polypeptides composing V1. The key function of the C subunit is intimately involved in the reversible dissociation of the V1 and V0 structures. It has also been identified as a mediator of the acidic microenvironment of tumors which it controls by proton extrusion to the extracellular medium. The acidic environment causes tissue damage, activates destructive enzymes in the extracellular matrix, and acquires metastatic cell phenotypes.	368
341075	cd14786	STAT_CCD	Coiled-coil domain of Signal Transducer and Activator of Transcription (STAT), also called alpha domain. This family consists of the coiled-coil (alpha) domain of the STAT proteins (Signal Transducer and Activator of Transcription, or Signal Transduction And Transcription), which are latent cytoplasmic transcriptional factors that play an important role in cytokine and growth factor signaling. STAT proteins regulate several aspects of growth, survival and differentiation in cells. The transcription factors of this family are activated by JAK (Janus kinase) and dysregulation of this pathway is frequently observed in primary tumors and leads to immunosuppression, increased angiogenesis and enhanced survival of tumors. There are seven mammalian STAT family members that have been identified: STAT1, STAT2, STAT3, STAT4, STAT5A, STAT5B and STAT6. STAT proteins consist of six structural regions: N-domain (ND)/protein interaction domain, coiled-coil domain (CCD)/STAT all alpha domain, DNA-binding domain (DBD), linker domain (LK), a Src homology 2 (SH2) domain, and C-terminal transcriptional activation domain (TA) that includes two conserved phosphorylation sites (tyrosine and serine residues). The coiled-coil or alpha domain is an interacting region with other proteins, including IRF-9/p48 for STAT1, c-Jun, StIP1, and GRIM-19 for STAT3, and SMRT with STAT5A and STAT5B. A functional STAT1 mutant (phenylalanine to serine) in this domain region shows significantly decreased protein expression caused by translational/post-translational mechanisms independent of proteasome machinery. The phenylalanine is not conserved in STAT4 and STAT6 that have tight specificity, suggesting a novel potential mechanism of specific activation of STAT proteins. Specifically, STAT3, STAT5, and STAT6, which are continually imported to the nucleus independent of tyrosine phosphorylation, require the conformational structure of their coiled-coil domains.	125
350612	cd14787	Tiki_TraB-like	diverse proteins related to the Tiki and TraB protease domains. The extracellular domain of Tiki family proteins shares homology with bacterial TraB/PrgY proteins which are known for their roles in the inhibition of mating pheromones. Tiki and TraB/PrgY proteins share limited sequence identity, but their predicted secondary structures reveal that several catalytic residues are anchored in a similar manner, consistent with a common evolutionary origin. Tiki domains are related to the erythromycin esterase, gumN plant pathogens, RtxA toxins, and Campylobacter jejuni heme-binding, ChaN-like proteins. Tiki is a membrane-associated metalloprotease (MEROPS family M96) that inhibits Wnt via the cleavage of its amino terminus, diminishing Wnt's binding to receptors. Wnt is essential in animal development and homeostasis. In Xenopus, Tiki is critical in head development. In human cells, Tiki inhibits Wnt-signaling, which is important in embryogenesis, homeostasis, and regeneration. Deregulation of Wnt contributes to birth defects, cancer and various diseases. TraB/PrgY protein has been identified in gut bacterium Enterococcus faecalis, but its function has not been well characterized. Plasmid-borne TraB has been implicated in the regulation of pheromone sensitivity and specificity. Based on homology to Tiki activity, it has been proposed that TraB acts as a metalloprotease in the inactivation of mating pheromone. Pasteurella multocida toxin has structural and sequence similarity to the Tiki/TraB family of proteases. However, unlike related multidomain toxins in this family, they do not exhibit conservation of the typical active site residues.	127
350613	cd14788	GumN	poorly characterized family of proteins related to gumN pathogenicity factor of Xanthomonas. GumN, a poorly characterized protein, is part of the large gum cluster of pathogenicity factors of the plant pathogen Xanthomonas. Except for GumN, the gum cluster is conserved, and proteins of this operon are involved in the production of xanthan, an extracellular polysaccharide that promotes plant disease. Xanthomonas campestris is responsible for 'black rot' disease in certain crop plants. GumN has sequence similarity to the Tiki/TraB protease family, but lacks the typical conserved residues of the active site.	286
350614	cd14789	Tiki	Tiki homology domain antagonizes Wnt function via cleavage of amino-terminal residues. Tiki is a membrane-associated metalloprotease that inhibits Wnt via the cleavage of its amino terminus, diminishing Wnt's binding to receptors. Wnt is essential in animal development and homeostasis. In Xenopus, Tiki is critical in head development. In human cells, TIKI inhibits Wnt-signaling, which is important in embryogenesis, homeostasis, and regeneration. Deregulation of WNT contributes to birth defects, cancer and various diseases. TIKI homology domains are part of the TraB family and are related to the erythromycin esterase, GumN plant pathogens, RtxA toxins, and Campylobacter jejuni heme-binding, ChaN-like proteins. TraB/PrgY protein has been identified in gut bacterium Enterococcus faecalis, but its function has not been well characterized. Plasmid-borne TraB has been implicated in the regulation of pheromone sensitivity and specificity. Based on homology to TIKI activity, it has been proposed that TraB acts as a metalloprotease in the inactivation of mating pheromone. The TIKI/TraB family has 2 conserved GxxH motifs and conserved glutamate and arginine residues that may be catalytic.	259
269891	cd14790	GH_D	Glycoside hydrolases, clan D. This group of glycosyl hydrolase families is comprised of glycosyl hydrolase family 31 (GH31), family 36 (GH36), and family 27 (GH27). These structurally and mechanistically related protein families are retaining enzymes that cleave their substrates via an acid/base-catalyzed, double-displacement mechanism involving a covalent glycosyl-enzyme intermediate. Two aspartic acid residues have been identified as the catalytic nucleophile and the acid/base, respectively. They have a wide range of functions including alpha-glucosidase, alpha-xylosidase, 6-alpha-glucosyltransferase, 3-alpha-isomaltosyltransferase, alpha-N-acetylgalactosaminidase, stachyose synthase, raffinose synthase, and alpha-1,4-glucan lyase.	253
269892	cd14791	GH36	glycosyl hydrolase family 36 (GH36). GH36 enzymes occur in prokaryotes, eukaryotes, and archaea with a wide range of hydrolytic activities, including alpha-galactosidase, alpha-N-acetylgalactosaminidase, stachyose synthase, and raffinose synthase. All GH36 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein. GH36 members are retaining enzymes that cleave their substrates via an acid/base-catalyzed, double-displacement mechanism involving a covalent glycosyl-enzyme intermediate. Two aspartic acid residues have been identified as the catalytic nucleophile and the acid/base, respectively.	299
269893	cd14792	GH27	glycosyl hydrolase family 27 (GH27). GH27 enzymes occur in eukaryotes, prokaryotes, and archaea with a wide range of hydrolytic activities, including alpha-glucosidase (glucoamylase and sucrase-isomaltase), alpha-N-acetylgalactosaminidase, and 3-alpha-isomalto-dextranase. All GH27 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein. GH27 members are retaining enzymes that cleave their substrates via an acid/base-catalyzed, double-displacement mechanism involving a covalent glycosyl-enzyme intermediate. Two aspartic acid residues have been identified as the catalytic nucleophile and the acid/base, respectively.	271
269816	cd14793	DUF302_like	Domains similar to DUF302 and the N-terminal domains found in some bacterial RNAses. DUF302 is an uncharacterized domain with widespread phylogenetic distribution. It appears homologous to the N-terminal domains of RNAse H3 and the Escherichia coli toxin RnlA.	81
269817	cd14794	RNLA_N_1	N-terminal repeat domain of toxin RnlA; first out of two repeats. The Escherichia coli toxin RnlA functions as an mRNA endoribonuclease and is part of a two-component toxin-antitoxin system that promotes resistance to phage infections, together with the antitoxin RnlB. RNAse activity is located in the C-terminal domain. This N-terminal domain appears to participate in homodimerization and is the first out of two repeats.	90
269818	cd14795	RNLA_N_2	N-terminal repeat domain of toxin RnlA; second out of two repeats. The Escherichia coli toxin RnlA functions as an mRNA endoribonuclease and is part of a two-component toxin-antitoxin system that promotes resistance to phage infections, together with the antitoxin RnlB. RNAse activity is located in the C-terminal domain. This N-terminal domain appears to participate in homodimerization and is the second out of two repeats.	87
269819	cd14796	RNAse_HIII_N	N-terminal domain of ribonuclease H3. RNAse H3 (HIII) is a bacterial type 2 ribonuclease, which endonucleolytically hydrolyzes an RNA strand when it is annealed to a complementary DNA strand in the presence of divalent cations, and plays a role in DNA replication and repair. The N-terminal domain characterized by this model has been shown to be important in substrate binding; it might form initial contacts with the substrate and not be part of the active complex that involves the C-terminal ribonuclease domain. This domain has also been characterized as DUF3378.	66
269820	cd14797	DUF302	Uncharacterized domain family DUF302. These domains are mostly found in bacterial single-domain proteins and have been shown to form homodimers; they may also bind zinc. Also characterized as COG3439.	124
271353	cd14798	RX-CC_like	Coiled-coil domain of the potato virus X resistance protein and similar proteins. The potato virus X resistance protein (RX) confers resistance against potato virus X. It is a member of a family of resistance proteins with a domain architecture that includes an N-terminal coiled-coil domain (modeled here), a nucleotide-binding domain, and leucine-rich repeats (CC-NB-LRR). These intracellular resistance proteins recognize pathogen effector proteins and will subsequently trigger a response that may be as severe as localized cell death. The N-terminal coiled-coil domain of RX has been shown to interact with RanGAP2, which is a necessary co-factor in the resistance response.	124
341082	cd14801	STAT_DBD	DNA-binding domain of Signal Transducer and Activator of Transcription (STAT). This family consists of the DNA binding domain (DBD) of the STAT proteins (Signal Transducer and Activator of Transcription, or Signal Transduction And Transcription), which are latent cytoplasmic transcriptional factors that play an important role in cytokine and growth factor signaling. STAT proteins regulate several aspects of growth, survival and differentiation in cells. The transcription factors of this family are activated by JAK (Janus kinase) and dysregulation of this pathway is frequently observed in primary tumors and leads to immunosuppression, increased angiogenesis and enhanced survival of tumors. There are seven mammalian STAT family members that have been identified: STAT1, STAT2, STAT3, STAT4, STAT5A, STAT5B and STAT6. STAT proteins consist of six structural regions: N-terminal domain (ND)/protein interaction domain, coiled-coil domain (CCD)/STAT all alpha domain, DNA-binding domain (DBD), linker domain (LK), a Src homology 2 (SH2) domain, and C-terminal transcriptional activation domain (TA) that includes two conserved phosphorylation sites (tyrosine and serine residues). STAT1 and STAT3 have the greatest diversity of biological functions among the 7 known members of the STAT family. The DNA binding domain of STAT has an Ig-like fold. DNA binding specificity experiments of different STAT proteins show that STAT5A specificity is more similar to that of STAT6 than that of STAT1, as also seen from the evolutionary relationships.	157
269812	cd14803	RAP	Receptor-associated protein (RAP). Receptor-associated protein, RAP, is an antagonist and a specialized chaperone in the endoplasmic reticulum that binds tightly to members of the low-density lipoprotein (LDL) receptor family and prevents them from associating with other ligands. RAP associates with (LDL) receptor-related protein (LRP) early in the secretory pathway, reducing its ligand binding capacity, and then dissociates from LRP in the low-pH environment of the Golgi; studies have shown that histidine residues in RAP D3 serve as a switch that facilitates its uncoupling from the receptor. RAP is a modular protein identified as having an internal triplication, with domains, D1, D2, and D3, each thought to have distinct functions; these domains are independent and do not interact. The carboxyl-terminal domain (D3) of RAP is required for folding and trafficking of LRP, while the amino-terminal tandem D1D2 domains of RAP are essential for blocking LRP from binding of certain ligands, such as activated forms of alpha2-macroglobulin.	97
271351	cd14804	Tra_M	TraM mediates signalling between transferosome and relaxosome. TraM is a plasmid encoded DNA-binding protein that is essential for conjugative transfer of F-like plasmids (e.g. F, R1, R100 and pED208) between bacterial cells. Bacterial conjugation, a form of horizontal gene transfer between cells, is an important contributor to bacterial genetic diversity, enabling virulence and antibiotics resistance factors to rapidly spread in medically important human pathogens. Mutation studies have shown that TraM is required for normal levels of transfer gene expression as well as for efficient site-specific single-stranded DNA cleavage at the origin of transfer (oriT). TraM tetramers bridge oriT to a key component of the conjugative pore, the coupling protein TraD. The N-terminal ribbon-helix-helix (RHH) domain of TraM is able to cooperatively bind DNA in a staggered arrangement without interaction between tetramers. This allows the C-terminal TraM tetramerization domains to be free to make multiple interactions with TraD, thus driving plasmid recruitment to the conjugative pore.	122
271348	cd14805	Translin-like	Translin and translin-associated factor-X (TRAX). Translin (also known as TB-RBP), and its binding partner protein TRAX (translin-associated factor-X) are a paralogous pair of conserved proteins, and oligomeric complexes of TRAX and translin are known as C3PO proteins (for component 3 promoter of RNA-induced silencing complex or RISC). The Translin-Trax complex enhances the removal of the passenger strand in RNAi and the formation of active RISC. Translin and Trax participate in a variety of nucleic acid metabolism pathways in addition to RNAi and have been implicated in a wide range of biological activities, including mRNA processing, cell growth regulation, spermatogenesis, neuronal development/function, genome stability regulation and carcinogenesis; however, their precise role in some of the processes remains unclear. It has been shown that Trax subunit, but not Translin, possesses a Glu-Glu-Asp catalytic center with the capacity to digest RNA as well as DNA; this catalytic activity is required for passenger-strand removal and RISC activation in RNAi.  In Archaeoglobus fulgidus, Trax-like-subunits assemble into an octameric structure, highly similar to human C3PO; its complex with duplex RNA reveals that the octamer entirely encapsulates a single 13-base-pair RNA duplex inside a large inner cavity.	197
269813	cd14806	RAP_D1	Receptor-associated protein (RAP), Domain 1. This subfamily is the N-terminal domain (D1) of receptor-associated protein, RAP, an antagonist and a specialized chaperone in the endoplasmic reticulum that binds tightly to members of the low-density lipoprotein (LDL) receptor family and prevents them from associating with other ligands. D1 as well as domain 2 (D2) are essential for blocking low-density lipoprotein receptor-related protein (LRP) from binding of certain ligands, such as activated forms of alpha2-macroglobulin; D1 and D2 each bind LRP weakly but the tandem D1D2 binds much more tightly, suggesting the avidity effects arising from amino acid residues contributed from each domain. The double module of complement type repeats, CR56, of LRP binds many ligands including alpha2-macroglobulin, which promotes the catabolism of the Abeta-peptide implicated in Alzheimer's disease.	71
269814	cd14807	RAP_D2	Domain 2 of receptor-associated protein (RAP). This subfamily is the N-terminal domain (D2) of receptor-associated protein, RAP, an antagonist and a specialized chaperone in the endoplasmic reticulum that binds tightly to members of the low-density lipoprotein (LDL) receptor family and prevents them from associating with other ligands. D2, along with RAP domain 1 (D1), is essential for blocking low-density lipoprotein receptor-related protein (LRP) from binding of certain ligands, such as alpha2-macroglobulin; D1 and D2 each bind LRP weakly but the tandem D1D2 binds much more tightly to the second and the fourth ligand-binding clusters present on LRP, suggesting the avidity effects arising from amino acid residues contributed from each domain. Also, RAP has regions that interact weakly with heparin, one located in D2 and two located in D3. The double module of complement type repeats, CR56, of LRP binds many ligands including alpha2-macroglobulin, which promotes the catabolism of the Abeta-peptide implicated in Alzheimer's disease.	98
269815	cd14808	RAP_D3	C-terminal receptor-associated protein (RAP), Domain 3. This subfamily is the C-terminal domain (D3) of receptor-associated protein, RAP, an antagonist and a specialized chaperone in the endoplasmic reticulum that binds tightly to members of the low-density lipoprotein (LDL) receptor family and prevents them from associating with other ligands. D3 is required for folding and trafficking of low-density lipoprotein receptor-related protein (LRP). In the mildly acidic pH of the Golgi, unfolding of RAP-D3 helical bundle facilitates dissociation of RAP from the LDL receptor type A (LA) repeats of LDLR family proteins. Also, RAP has 3 regions that interact weakly with heparin, two regions located in D3 and one in RAP domain 2 (D2). The double module of complement type repeats, CR56, of LRP binds many ligands including alpha2-macroglobulin, which promotes the catabolism of the Abeta-peptide implicated in Alzheimer's disease.	100
269871	cd14809	bZIP_AUREO-like	Basic leucine zipper (bZIP) domain of blue light (BL) receptor aureochrome (AUREO) and similar bZIP domains. AUREO is a BL-activated transcription factor specific to phototrophic stramenopiles. It has a bZIP and a BL-sensing light-oxygen voltage (LOV) domain. It has been shown to mediate BL-induced branching and regulate the development of the sex organ in Vaucheria frigida. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription. This subgroup also includes the Epstein-Barr virus (EBV) immediate-early transcription factor ZEBRA (BZLF1, Zta, Z, EB1). ZEBRA exhibits a variant of the bZIP fold, it has a unique dimer interface and a substantial hydrophobic pocket; it has a C-terminal moiety which stabilizes the coiled coil involved in dimer formation. ZEBRA functions to trigger the switch of EBV's biphasic infection cycle from latent to lytic infection. It activates the promoters of EBV lytic genes by binding ZEBRA response elements (ZREs) and inducing a cascade of expression of over 50 viral genes. It also down regulates latency-associated promoters, is an essential replication factor, induces host cell cycle arrest, and alters cellular immune responses and transcription factor activity.	52
269872	cd14810	bZIP_u1	Basic leucine zipper (bZIP) domain of bZIP transcription factors: a DNA-binding and dimerization domain; uncharacterized subfamily. Basic leucine zipper (bZIP) factors comprise one of the most important classes of enhancer-type transcription factors. They act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes including cell survival, learning and memory, lipid metabolism, and cancer progression, among others. They also play important roles in responses to stimuli or stress signals such as cytokines, genotoxic agents, or physiological stresses. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription.	52
269873	cd14811	bZIP_u2	Basic leucine zipper (bZIP) domain of bZIP transcription factors: a DNA-binding and dimerization domain; uncharacterized subfamily. Basic leucine zipper (bZIP) factors comprise one of the most important classes of enhancer-type transcription factors. They act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes including cell survival, learning and memory, lipid metabolism, and cancer progression, among others. They also play important roles in responses to stimuli or stress signals such as cytokines, genotoxic agents, or physiological stresses. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription.	52
269874	cd14812	bZIP_u3	Basic leucine zipper (bZIP) domain of bZIP transcription factors: a DNA-binding and dimerization domain; uncharacterized subfamily. Basic leucine zipper (bZIP) factors comprise one of the most important classes of enhancer-type transcription factors. They act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes including cell survival, learning and memory, lipid metabolism, and cancer progression, among others. They also play important roles in responses to stimuli or stress signals such as cytokines, genotoxic agents, or physiological stresses. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription.	52
269875	cd14813	bZIP_BmCbz-like	Basic leucine zipper (bZIP) domain of Bombyx mori chorion b-ZIP transcription factor and similar bZIP domains. Bombyx mori chorion b-ZIP transcription factor, is encoded by the Cbz gene. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of the adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription.	52
350615	cd14814	Peptidase_M15	Metalloproteases including zinc D-Ala-D-Ala carboxypeptidase, L-Ala-D-Glu peptidase, L,D-carboxypeptidase, bacteriophage endolysins, and related proteins. This family summarizes zinc-binding metallopeptidases which are mostly carboxypeptidases and dipeptidases, and includes zinc-dependent D-Ala-D-Ala carboxypeptidases, VanX, L-Ala-D-Glu peptidase, L,D-carboxypeptidase and bacteriophage endolysins, amongst other family members. These peptidases belong to MEROPS family M15 which are involved in bacterial cell wall biosynthesis and metabolism.	111
271352	cd14815	BA_2398_like	Putative Bacillus anthracis lipoprotein and related proteins. Uncharacterized protein family found in Bacilli and Gammaproteobacteria	145
350616	cd14817	D-Ala-D-Ala_dipeptidase_VanX	D-Ala-D-Ala dipeptidase VanX. D-Ala-D-Ala dipeptidase (also known as D-alanyl-D-alanine dipeptidase vanX; VanX; EC 3.4.13.22) is a Zn2+-dependent enzyme that mediates resistance to the antibiotic vancomycin in Enterococci and other bacteria (both Gram-positive and Gram-negative). It is part of a gene cluster that affects cell-wall biosynthesis. The operon triggers the termination of peptidoglycan precursors by D-Ala-(R)-lactate instead of D-Ala-D-Ala dipeptides. The enzyme is stereospecific, as L-Ala-L-Ala, D-Ala-L-Ala and L-Ala-D-Ala are not substrates. It belongs in the MEROPS peptidase family M15, subfamily D.	199
341426	cd14818	longin-like	Longin-like domains. Longin-like domains are small protein domains present in a variety of proteins and members of protein complexes involved in or required for different steps during the transport of proteins from the ribosome to the ER to the plasma membrane, via the Golgi apparatus. Examples are mu and sigma subunits of the heterotetrameric adaptor protein (AP) complex, zeta and delta subunits of the heterotetrameric F-COPI complex, a subgroup of R-SNARE proteins, a subfamily of the transport protein particle (TRAPP), and the signal recognition particle receptor subunit alpha (SR-alpha).	117
271349	cd14819	Translin	Translin, also known as TB-RBP (testis brain RNA-binding protein). Translin (also known as TB-RBP for Testis Brain RNA-binding protein, a mouse ortholog), is a paralog of its binding partner protein TRAX (translin-associated factor-X) and together they form oligomeric complexes known as C3PO proteins (for component 3 promoter of  RNA-induced silencing complex or RISC).  DNA damage has been proposed to stimulate transport of Translin into nuclei. It binds to RNA and single-stranded DNA, and its selectivity is modulated by interactions with GTP and TRAX. Translin may also regulate dendritic trafficking of BDNF RNAs as well as function as a key activator of siRNA-mediated silencing in drosophila. Translin and Trax participate in a variety of nucleic acid metabolism pathways in addition to RNAi and have been implicated in a wide range of biological activities, including mRNA processing, cell growth regulation, spermatogenesis, neuronal development/function, genome stability regulation and carcinogenesis; however, their precise role in some of the processes remains unclear.	206
271350	cd14820	TRAX	Translin-associated factor-X (TRAX). TRAX (translin-associated factor-X) is a paralog of its binding partner protein Translin and together they form oligomeric complexes known as C3PO proteins (for component 3 promoter of  RNA-induced silencing complex or RISC). TRAX complexed with Translin is possibly involved in dendritic RNA processing and in DNA double-strand break repair as an interacting partner with C1D, an activator of the DNA-dependent protein kinase involved in the repair of DNA-double strand breaks. It has been shown that Trax subunit, but not Translin, possesses a Glu-Glu-Asp catalytic center with the capacity to digest RNA; this catalytic activity is required for passenger-strand removal and RISC activation in RNAi. In Archaeoglobus fulgidus, Trax-like-subunits assemble into an octameric structure, highly similar to human C3PO; its complex with duplex RNA reveals that the octamer entirely encapsulates a single 13-base-pair RNA duplex inside a large inner cavity. Translin and Trax participate in a variety of nucleic acid metabolism pathways in addition to RNAi and have been implicated in a wide range of biological activities, including mRNA processing, cell growth regulation, spermatogenesis, neuronal development/function, genome stability regulation and carcinogenesis; however, their precise role in some of the processes remains unclear.	182
350517	cd14821	BACK_SPOP_like	BACK (BTB and C-terminal Kelch) domain found in speckle-type POZ protein (SPOP) and similar proteins. This family includes speckle-type POZ protein (SPOP), speckle-type POZ protein-like (SPOPL), TD and POZ domain-containing proteins (TDPOZ), Drosophila melanogaster protein roadkill, and similar proteins. Both SPOP and SPOPL serve as adaptors of cullin-RING-based BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complex that mediates the ubiquitination and proteasomal degradation of target proteins. TDPOZ is a family of bipartite animal and plant proteins that contain a tumor necrosis factor receptor-associated factor (TRAF) domain (TD) and a POZ/BTB domain. TDPOZ proteins may be nuclear scaffold proteins involved in transcription regulation in early development and other cellular processes. Drosophila melanogaster protein roadkill, also termed Hh-induced MATH and BTB domain-containing protein (HIB), is a hedgehog-induced BTB protein that modulates hedgehog signaling by degrading Ci/Gli transcription factor.	59
350518	cd14822	BACK_BTBD9	BACK (BTB and C-terminal Kelch) domain found in BTB/POZ domain-containing protein 9 (BTBD9). BTBD9 is a risk factor for Restless Legs Syndrome (RLS) encoding a Cullin-3 substrate adaptor. The BTBD9 gene may be associated with antipsychotic-induced RLS in schizophrenia. Mutations in BTBD9 lead to reduced dopamine, increased locomotion and sleep fragmentation.	101
341427	cd14823	AP_longin-like	Longin-like domains of AP complex subunits. AP complex sigma subunits are part of the heterotetrameric adaptor protein (AP) complex which consists of one large subunit (alpha-, gamma-, delta- or epsilon), one beta-, one mu-, and one sigma-subunit. In general, AP complexes link the cytosolic domains of the cargo proteins to the protein coat that induces vesicle budding in the donor compartment during vesicle transport. In most cases the coat protein is clathrin (AP1 and AP2 complex), but some of the other members of the AP complex family are associated with nonclathrin coats. The sigma subunit is comprised of a single longin domain and plays a role in binding dileucine-based sorting signals.	131
341428	cd14824	Longin	longin domain. The longin domain is the N-terminal domain of a subgroup of R-SNARE proteins, including VAMP7, Ykt6, and Sec22. Longin is one of the approximately 26 components required for transporting proteins from the ER to the plasma membrane, via the Golgi apparatus. It is necessary for the steps of the transfer from the ER to the Golgi complex. Longins are the only R-SNAREs that are common to all eukaryotes, and they are characterized by a conserved N-terminal domain with a profilin-like fold called a longin domain.	122
341429	cd14825	TRAPPC2_sedlin	Trafficking protein particle complex subunit 2. Trafficking protein particle complex subunit 2 (TRAPPC2), also known as Sedlin (SEDL) or TRS20, has been identified as a component of the transport protein particle (TRAPP), required for tethering endoplasmic reticulum (ER)-derived vesicles to Golgi membranes and for Golgi traffic. In humans, deletions or point mutations in the SEDL gene cause the genetic disease spondyloepiphyseal dysplasia tarda (SEDT), an X-linked skeletal disorder.	135
341430	cd14826	SR_alpha_SRX	SRX domain of signal recognition particle receptor subunit alpha. Signal recognition particle receptor subunit alpha (SR-alpha) is part of the membrane-associated heterodimeric receptor for the signal recognition particle (SRP). The signal recognition particle (SRP) pathway is highly conserved and plays an important role in the translocation of proteins across and insertion into membranes by targeting the translating ribosome to the endoplasmic reticulum. The N-terminal SRX domain of SR-alpha has a profilin-like fold and has been shown to be the interaction site with the second subunit, SR-beta.	118
341431	cd14827	AP_sigma	AP complex subunit sigma. AP complex sigma subunits are part of the heterotetrameric adaptor protein (AP) complex which consists of one large subunit (alpha-, gamma-, delta- or epsilon), one beta-, one mu-, and one sigma-subunit. In general, AP complexes link the cytosolic domains of the cargo proteins to the protein coat that induces vesicle budding in the donor compartment during vesicle transport. In most cases the coat protein is clathrin (AP1 and AP2 complex), but some of the other members of the AP complex family are associated with nonclathrin coats. The sigma subunit is comprised of a single longin domain and plays a role in binding dileucine-based sorting signals.	138
341432	cd14828	AP_Mu_N	AP complex subunit mu N-terminal domain. AP complex mu subunits are part of the heterotetrameric adaptor protein (AP) complex which consists of one large subunit (alpha-, gamma-, delta- or epsilon), one beta-, one mu-, and one sigma-subunit. In general, AP complexes link the cytosolic domains of the cargo proteins to the protein coat that induces vesicle budding in the donor compartment during vesicle transport. In most cases the coat protein is clathrin (AP1 and AP2 complex), but some of the other members of the AP complex family are associated with nonclathrin coats. The mu subunit is comprised of an N-terminal longin domain followed by a C-terminal domain which is involved in the binding of the Y-X-X-Phi sorting signal.	136
341433	cd14829	Zeta-COP	zeta subunit of the F-COPI complex. Zeta subunit of the heterotetrameric F-COPI complex, which consists of one beta-, one gamma-, one delta-, and one zeta subunit, where beta- and gamma- subunits are related to the large adaptor protein (AP) complex subunits, and delta- and zeta- subunits are related to the medium and small AP subunits, respectively. F-COPI forms a coatomer together with the B-COPI subcomplex, which assembles with a small GTPase, ADP-ribosylation factor 1 (ARF1), playing an important role in the formation of COPI complex-coated vesicles. COPI complex-coated vesicles function in the early secretory pathway mediating the retrograde transport from the Golgi to the ER, and intra-Golgi transport.	132
341434	cd14830	Delta_COP_N	delta subunit of the F-COPI complex, N-terminal domain. Delta subunit of the heterotetrameric F-COPI complex, which consists of one beta-, one gamma-, one delta-, and one zeta subunit, where beta- and gamma- subunits are related to the large adaptor protein (AP) complex subunits, and delta- and zeta- subunits are related to the medium and small AP subunits, respectively. F-COPI forms a coatomer together with the B-COPI subcomplex, which assembles with a small GTPase, ADP-ribosylation factor 1 (ARF1), playing an important role in the formation of COPI complex-coated vesicles. COPI complex-coated vesicles function in the early secretory pathway mediating the retrograde transport from the Golgi to the ER, and intra-Golgi transport.	130
341435	cd14831	AP1_sigma	AP-1 complex subunit sigma. AP-1 complex sigma subunit is part of the heterotetrameric adaptor protein (AP)-1 complex which consists of one large gamma-, one beta-, one mu-, and one sigma-subunit. AP complexes link the cytosolic domains of the cargo proteins to the protein coat that induces vesicle budding in the donor compartment during vesicle transport. In the case of AP-1 the coat protein is clathrin. AP-1 binds the phospholipid  PI(4)P which plays a role in its localisation to the trans-Golgi network (TGN)/endosome. The sigma subunit is comprised of a single longin domain and plays a role in binding dileucine-based sorting signals.	143
341436	cd14832	AP4_sigma	AP-4 complex subunit sigma. AP-4 complex sigma subunit is part of the heterotetrameric adaptor protein (AP)-4 complex which consists of one large epsilon-, one beta-, one mu-, and one sigma-subunit. AP complexes link the cytosolic domains of the cargo proteins to the protein coat that induces vesicle budding in the donor compartment during vesicle transport. AP-4 does not bind the coat protein clathrin; it is associated with nonclathrin coats. Its phospholipid binding partner is unknown and it is localized in the trans-Golgi network (TGN). The sigma subunit is comprised of a single longin domain and plays a role in binding dileucine-based sorting signals.	138
341437	cd14833	AP2_sigma	AP-2 complex subunit sigma. AP-2 complex sigma subunit is part of the heterotetrameric adaptor protein (AP)-2 complex which consists of one large alpha-, one beta-, one mu-, and one sigma-subunit. AP complexes link the cytosolic domains of the cargo proteins to the protein coat that induces vesicle budding in the donor compartment during vesicle transport. In the case of AP-2 the coat protein is clathrin. AP-2 binds the phospholipid PI(4,5)P2 which is important for its localisation to the plasma membrane. The sigma subunit is comprised of a single longin domain and plays a role in binding dileucine-based sorting signals.	141
341438	cd14834	AP3_sigma	AP-3 complex subunit sigma. AP-3 complex sigma subunit is part of the heterotetrameric adaptor protein (AP)-3 complex which consists of one large delta-, one beta-, one mu-, and one sigma-subunit. AP complexes link the cytosolic domains of the cargo proteins to the protein coat that induces vesicle budding in the donor compartment during vesicle transport. AP-3 binds the coat protein clathrin and the phospholipid PI(3)P and it is localized in the endosome. The sigma subunit is comprised of a single longin domain and plays a role in binding dileucine-based sorting signals.	146
341439	cd14835	AP1_Mu_N	AP-1 complex subunit mu N-terminal domain. AP-1 complex mu subunit is part of the heterotetrameric adaptor protein (AP)-1 complex which consists of one large gamma-, one beta-, one mu-, and one sigma-subunit. AP complexes link the cytosolic domains of the cargo proteins to the protein coat that induces vesicle budding in the donor compartment during vesicle transport. In the case of AP-1 the coat protein is clathrin. AP-1 binds the phospholipid  PI(4)P which plays a role in its localisation to the trans-Golgi network (TGN)/endosome. The mu subunit is comprised of an N-terminal longin domain followed by a C-terminal domain which is involved in the binding of the Y-X-X-Phi sorting signal.	139
341440	cd14836	AP2_Mu_N	AP-2 complex subunit mu N-terminal domain. AP-2 complex mu subunit is part of the heterotetrameric adaptor protein (AP)-2 complex which consists of one large alpha-, one beta-, one mu-, and one sigma-subunit. AP complexes link the cytosolic domains of the cargo proteins to the protein coat that induces vesicle budding in the donor compartment during vesicle transport. In the case of AP-2 the coat protein is clathrin. AP-2 binds the phospholipid PI(4,5)P2 which is important for its localisation to the plasma membrane. The mu subunit is comprised of an N-terminal longin domain followed by a C-terminal domain which is involved in the binding of the Y-X-X-Phi sorting signal.	140
341441	cd14837	AP3_Mu_N	AP-3 complex subunit mu N-terminal domain. AP-3 complex mu subunit is part of the heterotetrameric adaptor protein (AP)-3 complex which consists of one large delta-, one beta-, one mu-, and one sigma-subunit. AP complexes link the cytosolic domains of the cargo proteins to the protein coat that induces vesicle budding in the donor compartment during vesicle transport. AP-3 binds the coat protein clathrin and the phospholipid PI(3)P and it is localized in the endosome. The mu subunit is comprised of an N-terminal longin domain followed by a C-terminal domain which is involved in the binding of the Y-X-X-Phi sorting signal.	139
341442	cd14838	AP4_Mu_N	AP-4 complex subunit mu N-terminal domain. AP-4 complex mu subunit is part of the heterotetrameric adaptor protein (AP)-4 complex which consists of one large epsilon-, one beta-, one mu-, and one sigma-subunit. AP complexes link the cytosolic domains of the cargo proteins to the protein coat that induces vesicle budding in the donor compartment during vesicle transport. AP-4 does not bind the coat protein clathrin; it is associated with nonclathrin coats. Its phospholipid binding partner is unknown and it is localized in the trans-Golgi network (TGN). The mu subunit is comprised of an N-terminal longin domain followed by a C-terminal domain which is involved in the binding of the Y-X-X-Phi sorting signal.	137
350617	cd14840	D-Ala-D-Ala_dipeptidase_Aad	D-Ala-D-Ala dipeptidase (includes Lactobacillus plantarum Aad peptidase). D-Ala-D-Ala dipeptidase (also known as D-alanyl-D-alanine dipeptidase vanX; VanX; EC 3.4.13.22) is a Zn2+-dependent enzyme that mediates resistance to the antibiotic vancomycin in Enterococci and other bacteria (both Gram-positive and Gram-negative). It is part of a gene cluster that affects cell-wall biosynthesis. The operon triggers the termination of peptidoglycan precursors by D-Ala-(R)-lactate instead of D-Ala-D-Ala dipeptides. The enzyme is stereospecific, as L-Ala-L-Ala, D-Ala-L-Ala and L-Ala-D-Ala are not substrates. This subfamily includes Lactobacillus Aad peptidase and belongs in the MEROPS peptidase family M15, subfamily D.	158
350618	cd14843	D-Ala-D-Ala_dipeptidase_like	D-Ala-D-Ala dipeptidase, includes uncharacterized enzymes. This subfamily of D-Ala-D-Ala dipeptidase (also known as D-alanyl-D-alanine dipeptidase vanX; VanX; EC 3.4.13.22) also includes several uncharacterized proteins. This is a Zn2+-dependent enzyme that mediates resistance to the antibiotic vancomycin in Enterococci and other bacteria (both Gram-positive and Gram-negative). It is part of a gene cluster that affects cell-wall biosynthesis. The operon triggers the termination of peptidoglycan precursors by D-Ala-(R)-lactate instead of D-Ala-D-Ala dipeptides. The enzyme is stereospecific, as L-Ala-L-Ala, D-Ala-L-Ala and L-Ala-D-Ala are not substrates. It belongs in the MEROPS peptidase family M15, subfamily D.	160
350619	cd14844	Zn-DD-carboxypeptidase_like	Proteins similar to the zinc-containing D-Ala-D-Ala dipeptidase. The zinc D-Ala-D-Ala carboxypeptidase (Streptomyces-type) (also known as D-alanyl-D-alanine hydrolase; D-alanyl-D-alanine-cleaving carboxypeptidase; DD-carboxypeptidase; DD-carboxypeptidase-transpeptidase; Zn2+ G peptidase; G enzyme; EC 3.4.17.14) is a zinc enzyme that belongs to the peptidase M15 subfamily A. The enzyme catalyzes carboxypeptidation but not transpeptidation reactions involved in bacterial cell wall metabolism. Its specificity with substrates of the type Xaa-Yaa-Zaa shows that the enzyme requires the substrate N-terminus to be blocked and C-terminus to be free, and Yaa and Zaa should be in the D-configuration. It is weakly inhibited by beta-lactams most likely caused by the enzyme active site geometry.	108
350620	cd14845	L-Ala-D-Glu_peptidase_like	L-Ala-D-Glu peptidase, also known as L-alanyl-D-glutamate endopeptidase. This L-Ala-D-Glu peptidase family includes L-alanyl-D-glutamate peptidase (bacteriophage T5) (also known as L-alanoyl-D-glutamate endopeptidase), and Ply118 and Ply500 L-Ala-D-Glu peptidase. Bacteriophage endolysin degrades the peptidoglycan of the bacterial host from within, leading to cell lysis and release of progeny virions. The bacteriophage endolysin Ply118 cleaves between L-Ala and D-Glu residues of Listeria cell wall peptidoglycan. This family belongs to the MEROPS peptidase M15 subfamily C.	126
350621	cd14846	Peptidase_M15_like	Uncharacterized family of the peptidase family M15, subfamily B. This family of uncharacterized proteins, similar to endolysin lys (Clavibacter phage CMP1) and VanYn peptidase, are zinc-binding enzymes that belong to the peptidase M15 subfamily B, involved in bacterial cell wall metabolism.	104
350622	cd14847	DD-carboxypeptidase_like	Uncharacterized proteins of the MEROPS peptidase family M15, subfamily B. This family of uncharacterized proteins similar to D-Ala-D-Ala carboxypeptidase pdcA (Myxococcus-type) are zinc-binding enzymes that belong to the peptidase M15 subfamily B. The enzyme D-Ala-D-Ala carboxypeptidase catalyzes carboxypeptidation reactions involved in bacterial cell wall metabolism.	162
350623	cd14849	DD-dipeptidase_VanXYc	D-Ala-D-Ala dipeptidase/D-Ala-D-Ala carboxypeptidase (VanXYc) and related proteins. VanXYc peptidase (also known as vanXY(C) peptidase, D-alanyl-D-alanine carboxypeptidase D,D-dipeptidase/D,D-carboxypeptidase, vancomycin resistance D,D-dipeptidase) is a Zn2+-dependent enzyme that mediates resistance to the antibiotic vancomycin in Enterococci. Some of the vancomycin resistance operons encode VanXY D,D-carboxypeptidase which hydrolyzes both, dipeptide (D-Ala-D-Ala) or pentapeptide (UDP-MurNac-L-Ala-D-Glu-L-Lys-D-Ala-D-Ala). It is a bifunctional enzyme that catalyzes D,D-peptidase and D,D-carboxypeptidase activities. VanXY has higher sequence similarity to VanY than with VanX and hydrolyzes D,D-dipeptides such as D-Ala-D-Ala, whereas VanY is inactive against this substrate; thus having a less restrictive active site to accommodate larger substrates such as UDP-MurNAc-pentapeptide[Ala]. This family belongs to the MEROPS family M15, subfamily B, and includes the D,D-dipeptidases VanXYg and VanXYe.	127
350624	cd14852	LD-carboxypeptidase	L,D-carboxypeptidase DacB and LdcB, and related proteins. This L,D-carboxypeptidase family includes LdcB LD-Carboxypeptidase from Streptococcus pneumoniae, Bacillus anthracis, and Bacillus subtilis, and L,D-carboxypeptidase DacB from Streptococcus pneumoniae and Lactococcus lactis. These enzymes are active against cell-wall-derived tetrapeptides and synthetic tetrapeptides lacking the sugar moiety but are inactive against tetrapeptides terminating in L-alanine. L,D-carboxypeptidase DacB plays a key role in the remodeling of S. pneumoniae peptidoglycan during cell division. It adopts a zinc-dependent carboxypeptidase fold and acts as an L,D-carboxypeptidase towards the tetrapeptide L-Ala-D-iGln-L-Lys-D-Ala of the peptidoglycan stem. This family also includes vanY D-Ala-D-Ala carboxypeptidase which is vancomycin-inducible and penicillin-resistant. VanY hydrolyzes depsipeptide- and D-alanyl-D-alanine-containing peptidoglycan precursors; it is insensitive to beta-lactams. All these enzymes belong to the MEROPS family M15 subfamily B.	162
341443	cd14853	TRAPPC_longin-like	Longin-like domains of Trafficking protein particle complex. Longin-like domains of a subfamily of core components of the trafficking protein particle complex (TRAPP), including TRAPPC2, TRAPPC4, TRAPPC1 and a TRAPPC2L, whose function is not known.  TRAPP complexes are required for tethering endoplasmic reticulum (ER)-derived vesicles to Golgi membranes and for Golgi traffic.	132
341444	cd14854	TRAPPC2L	Trafficking protein particle complex subunit 2-like. Trafficking protein particle complex subunit 2-like (TRAPPC2L) is related to TRAPPC2. Its function is not known, but there are indications that it is part of the TRAPP II complex, which is required for distinct tethering events at Golgi membranes. TRAPPC2 has been identified as a general component of transport protein particle (TRAPP), required for tethering endoplasmic reticulum (ER)-derived vesicles to Golgi membranes and for Golgi traffic.	135
341445	cd14855	TRAPPC1_MUM2	Trafficking protein particle complex subunit 1. Trafficking protein particle complex subunit 1 (TRAPPC1), also known as MUM2 and BET5, has been identified as a component of the transport protein particle (TRAPP), required for tethering endoplasmic reticulum (ER)-derived vesicles to Golgi membranes and for Golgi traffic.	132
341446	cd14856	TRAPPC4_synbindin	Trafficking protein particle complex subunit 4. Trafficking protein particle complex subunit 4 (TRAPPC4), also known as synbindin or TRS23, has been identified as a component of the transport protein particle (TRAPP), required for tethering endoplasmic reticulum (ER)-derived vesicles to Golgi membranes and for Golgi traffic.	127
410986	cd14858	TrmE_N	N-terminal domain of TrmE, a tRNA modification GTPase. This family contains the N-terminal domain of TrmE (also known as MnmE, ThdF, MSS1), a guanine nucleotide-binding protein conserved in all three kingdoms of life. It is involved in the modification of uridine bases (U34) at the first anticodon (wobble) position of tRNAs decoding two-family box triplets. TrmE is a three-domain protein comprising an N-terminal alpha/beta domain, a helical domain, and the GTPase domain which is nested within the helical domain. The N-terminal domain induces dimerization for self-assembly and is topologically homologous to the tetrahydrofolate (THF)-binding domain of N,N-dimethylglycine oxidase (DMGO). However, the THF-binding site in DMGO is encoded on a single polypeptide, while homodimerization would be required to create a similar THF-binding site in TrmE. Dimerization also creates a second, symmetry-related THF-binding site. Biochemical and structural studies show that TrmE indeed binds formyl-THF. A cysteine residue, necessary for modification of U34, is located close to the C1-group donor 5-formyl-tetrahydrofolate, suggesting a direct role of TrmE in the modification analogous to DNA modification enzymes.	117
275438	cd14859	PMEI_like	pectin methylesterase inhibitor and related proteins. Pectin methylesterase (PME; Pectinesterase; EC 3.1.1.11; CAZy class 8 of carbohydrate esterases) catalyzes the demethylesterification of homogalacturonans in the cell wall. Its activity is regulated by the proteinaceous PME inhibitor (PMEI) which inhibits PME and invertase through formation of a non-covalent 1:1 complex. Depending on the mode of demethylesterification, PMEI activity results in either loosening or rigidification of the cell wall. PMEI has been implicated in the regulation of fruit development, carbohydrate metabolism and cell wall extension. It may also be involved in inhibiting microbial pathogen PMEs. Thus, PMEI probably plays an important physiological role in PME regulation in plants, possessing several potential applications in a food-technological context. CIF (cell-wall inhibitor of beta-fructosidase from tobacco) is structurally similar to PMEI and these members are also included in this model. Comparison of the CIF/INV1 structure with the complex between PMEI/PME suggests a common targeting mechanism in PMEI and CIF. However, CIF and PMEI use distinct surface areas to selectively inhibit very different enzymatic scaffolds.	140
341482	cd14860	4HBD_NAD	4-hydroxybutyrate dehydrogenase, also called gamma-hydroxybutyrate dehydrogenase, catalyzes the reduction of succinic semialdehyde to 4-hydroxybutyrate in the succinate degradation pathway. 4-hydroxybutyrate dehydrogenase (4HBD) is an iron-containing (type III) NAD-dependent alcohol dehydrogenase. It plays a role in the succinate metabolism biochemical pathway. It catalyzes the reduction of succinic semialdehyde to 4-hydroxybutyrate in the succinate degradation pathway. This succinate degradation pathway is present in some bacteria which can use succinate as sole carbon source.	371
341483	cd14861	Fe-ADH-like	Iron-containing alcohol dehydrogenases-like. This family contains proteins similar to iron-containing alcohol dehydrogenase (Fe-ADH), most of which have not been characterized. Their specific function is unknown. The protein structure represents a dehydroquinate synthase-like fold and belongs to the alcohol dehydrogenase-like superfamily. It is distinct from other alcohol dehydrogenases which contain different protein domains.  Alcohol dehydrogenase catalyzes the reduction of acetaldehyde to alcohol with NADP as cofactor. Its activity requires iron ions.	374
341484	cd14862	Fe-ADH-like	iron-containing alcohol dehydrogenases (Fe-ADH)-like. This family contains iron-containing alcohol dehydrogenase (Fe-ADH) which catalyzes the reduction of acetaldehyde to alcohol with NADP as cofactor. Its activity requires iron ions. The protein structure represents a dehydroquinate synthase-like fold and is a member of the iron-activated alcohol dehydrogenase-like family. It is distinct from other alcohol dehydrogenases which contain different protein domains. Proteins of this family have not been characterized.	375
341485	cd14863	Fe-ADH-like	iron-containing alcohol dehydrogenases (Fe-ADH)-like. This family contains iron-containing alcohol dehydrogenase (Fe-ADH) which catalyzes the reduction of acetaldehyde to alcohol with NADP as cofactor. Its activity requires iron ions. The protein structure represents a dehydroquinate synthase-like fold and is a member of the iron-activated alcohol dehydrogenase-like family. It is distinct from other alcohol dehydrogenases which contain different protein domains. Proteins of this family have not been characterized.	380
341486	cd14864	Fe-ADH-like	iron-containing alcohol dehydrogenases (Fe-ADH)-like. This family contains iron-containing alcohol dehydrogenase (Fe-ADH) which catalyzes the reduction of acetaldehyde to alcohol with NADP as cofactor. Its activity requires iron ions. The protein structure represents a dehydroquinate synthase-like fold and is a member of the iron-activated alcohol dehydrogenase-like family. It is distinct from other alcohol dehydrogenases which contain different protein domains. Proteins of this family have not been characterized.	376
341487	cd14865	Fe-ADH-like	iron-containing alcohol dehydrogenases (Fe-ADH)-like. This family contains iron-containing alcohol dehydrogenase (Fe-ADH) which catalyzes the reduction of acetaldehyde to alcohol with NADP as cofactor. Its activity requires iron ions. The protein structure represents a dehydroquinate synthase-like fold and is a member of the iron-activated alcohol dehydrogenase-like family. It is distinct from other alcohol dehydrogenases which contain different protein domains. Proteins of this family have not been characterized.	383
341488	cd14866	Fe-ADH-like	iron-containing alcohol dehydrogenases (Fe-ADH)-like. This family contains iron-containing alcohol dehydrogenase (Fe-ADH) which catalyzes the reduction of acetaldehyde to alcohol with NADP as cofactor. Its activity requires iron ions. The protein structure represents a dehydroquinate synthase-like fold and is a member of the iron-activated alcohol dehydrogenase-like family. It is distinct from other alcohol dehydrogenases which contain different protein domains. Proteins of this family have not been characterized.	384
271246	cd14867	uS7_Eukaryote	Eukaryota homolog of Ribosomal Protein S7. uS7, also known as Ribosomal protein (RP)S7, is an important part of the translation process which is universally present in the small subunit of prokaryotic and eukaryotic ribosomes. Eukaryotic RPS7 (also named RPS5) have variable N-terminal regions that affect the efficiency of initiation translation process by impacting small ribosomal subunit to function. The ribosome small subunit is one of the two subunits of ribosome organelles that use mRNA as a template for protein synthesis in a process called translation. The small subunits of bacteria and eukaryotes have the same shape of head, body, platform, beak, and shoulder. RPS7 is located at the head of the small subunit which is a primary ribosomal RNA (rRNA) binding protein that assists in rRNA folding and the binding of other proteins during small subunit assembly in all species. RPS7 is also involved in the formation of the mRNA exit channel at the interface of the large and small subunits. Some ribosomal proteins have extra ribosomal functions in cell differentiation and apoptosis.	185
271247	cd14868	uS7_Mitochondria_Fungi	Fungal Mitochondrial homolog of Ribosomal Protein S7. uS7, also known as Ribosomal protein (RP)S7, is an important part of the translation process which is universally present in the small subunit of prokaryotic and eukaryotic ribosomes. Fungal and plants mitochondrial RPS7 shows less homology to the mammalian than to bacterial RPS7. The ribosome small subunit is one of the two subunits of ribosome organelles that use mRNA as a template for protein synthesis in a process called translation. The small subunits of bacteria and eukaryotes have the same shape of head, body, platform, beak, and shoulder. RPS7 is located at the head of the small subunit. RPS7 is a primary ribosomal RNA (rRNA) binding protein that assists in rRNA folding and the binding of other proteins during small subunit assembly in all species. RPS7 is also involved in the formation of the mRNA exit channel at the interface of the large and small subunits. Some ribosomal proteins have extra ribosomal functions in cell differentiation and apoptosis.	151
271248	cd14869	uS7_Bacteria	Bacterial homolog of Ribosomal Protein S7. uS7, also known as Ribosomal protein (RP)S7, is an important part of the translation process which is universally present in the small subunit of prokaryotic and eukaryotic ribosomes. Prokaryotic RPS7 is lacking the variable N-terminal region of eukaryotic RPS7. The ribosome small subunit is one of the two subunits of ribosome organelles that use mRNA as a template for protein synthesis in a process called translation. The small subunits of bacteria and eukaryotes have the same shape of head, body, platform, beak, and shoulder. RPS7 is located at the head of the small subunit which is a primary ribosomal RNA (rRNA) binding protein that assists in rRNA folding and the binding of other proteins during small subunit assembly in all species. RPS7 is also involved in the formation of the mRNA exit channel at the interface of the large and small subunits. Some ribosomal proteins have extra ribosomal functions in cell differentiation and apoptosis.	138
271249	cd14870	uS7_Mitochondria_Mammalian	Mammalian Mitochondrial homolog of Ribosomal Protein S7. uS7, also known as Ribosomal protein (RP)S7, is an important part of the translation process which is universally present in the small subunit of prokaryotic and eukaryotic ribosomes. MRPS7 shows more homology to bacterial RPS7 than mitochondrial proteins from plants and fungi. The ribosome small subunit is one of the two subunits of ribosome organelles that use mRNA as a template for protein synthesis in a process called translation. The ribosomes present in mammalian mitochondria have more proteins and low percentage of ribosomal RNA than bacterial ribosomes. The small subunits of bacteria and eukaryotes have the same shape of head, body, platform, beak, and shoulder. RPS7 is located at the head of the small subunit. RPS7 is a primary ribosomal RNA (rRNA) binding protein that assists in rRNA folding and the binding of other proteins during small subunit assembly in all species. RPS7 is also involved in the formation of the mRNA exit channel at the interface of the large and small subunits. Some ribosomal proteins have extra ribosomal functions in cell differentiation and apoptosis.	199
271250	cd14871	uS7_Chloroplast	Chloroplast homolog of Ribosomal Protein S7. Chloroplast RPS7 has both general and specific regulatory roles in chloroplast translation process. uS7, also known as Ribosomal protein (RP)S7, is universally present in the small subunit of prokaryotic and eukaryotic ribosomes. The chloroplasts of plants and algae have bacterial ancestry, but it has adopted novel mechanisms in order to execute its roles within a eukaryotic cell. Chloroplast RPS7 is more homologous to bacterial RPS7 than other eukaryotic mitochondrial proteins. The ribosome small subunit is one of the two subunits of ribosome organelles that use mRNA as a template for protein synthesis in a process called translation. The chloroplast translation regulation is more complex than in bacteria with additional RNA and chloroplast-unique proteins. The small subunits of bacteria and eukaryotes have the same shape of head, body, platform, beak, and shoulder. RPS7 is located at the head of the small subunit. RPS7 is a primary ribosomal RNA (rRNA) binding protein that assists in rRNA folding and the binding of other proteins during small subunit assembly in all species. RPS7 is also involved in the formation of the mRNA exit channel at the interface of the large and small subunits. Some ribosomal proteins have extra ribosomal functions in cell differentiation and apoptosis.	146
276839	cd14872	MYSc_Myo4	class IV myosin, motor domain. These myosins all possess a WW domain either N-terminal or C-terminal to their motor domain and a tail with a MyTH4 domain followed by a SH3 domain in some instances. The monomeric Acanthamoebas were the first identified members of this group and have been joined by Stramenopiles. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 	644
276840	cd14873	MYSc_Myo10	class X myosin, motor domain. Myosin X is an unconventional myosin motor that functions as a monomer.  In mammalian cells, the motor is found to localize to filopodia. Myosin X walks towards the barbed ends of filaments and is thought to walk on bundles of actin, rather than single filaments, a unique behavior. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. C-terminal to the head domain are a variable number of IQ domains, 2 PH domains, a MyTH4 domain, and a FERM domain. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 	651
276841	cd14874	MYSc_Myo12	class XII myosin, motor domain. Little is known about the XII class of myosins. They are found predominantly in nematodes. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 	628
276842	cd14875	MYSc_Myo13	class XIII myosin, motor domain. These myosins have an N-terminal motor domain, a light-chain binding domain, and a C-terminal GPA/Q-rich domain. There is little known about the function of this myosin class. Two of the earliest members identified in this class are green alga Acetabularia cliftonii, Aclmyo1 and Aclmyo2. They are striking with their short tail of Aclmyo1 of 18 residues and the maximum of 7 IQ motifs in Aclmyo2. It is thought that these myosins are involved in organelle transport and tip growth. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 	664
276843	cd14876	MYSc_Myo14	class XIV myosin, motor domain. These myosins localize to plasma membranes of the intracellular parasites and may be involved in the cell invasion process. Their known functions include: transporting phagosomes to the nucleus and perturbing the developmentally regulated elimination of the macronucleus during conjugation. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases.  C-terminal to their motor domain these myosins have a MyTH4-FERM protein domain combination. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 	649
276844	cd14878	MYSc_Myo16	class XVI myosin, motor domain. These XVI type myosins are also known as Neuronal tyrosine-phosphorylated phosphoinositide-3-kinase adapter 3/NYAP3. Myo16 is thought to play a regulatory role in cell cycle progression and has been recently implicated in Schizophrenia. Class XVI myosins are characterized by an N-terminal ankyrin repeat domain and some with chitin synthase domains that arose independently from the ones in the class XVII fungal myosins. They bind protein phosphatase 1 catalytic subunits 1alpha/PPP1CA and 1gamma/PPP1CC. Human Myo16 interacts with ACOT9, ARHGAP26 and PIK3R2 and with components of the WAVE1 complex, CYFIP1 and NCKAP1. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 	656
276845	cd14879	MYSc_Myo17	class XVII myosin, motor domain. This fungal myosin, which is also known as chitin synthase, uses its motor domain to tether its vesicular cargo to peripheral actin. It works in opposition to dynein, contributing to the retention of Mcs1 vesicles at the site of cell growth and increasing vesicle fusion necessary for polarized growth. Class 17 myosins consist of an N-terminal myosin motor domain with Cyt-b5, chitin synthase 2, and DEK_C domains at its C-terminus. The chitin synthase region contains several transmembrane domains by which myosin 17 is thought to bind secretory vesicles. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 	647
276846	cd14880	MYSc_Myo19	class XIX myosin, motor domain. Monomeric myosin-XIX (Myo19) functions as an actin-based motor for mitochondrial movement in vertebrate cells.  It contains a variable number of IQ domains. Human myo19 contains a motor domain, three IQ motifs, and a short tail. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 	658
276847	cd14881	MYSc_Myo20	class XX myosin, motor domain. These class 20 myosins are primarily insect myosins with such members as Drosophila, Daphnia, and mosquitoes. These myosins contain a single IQ motif in the neck region. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy.	633
276848	cd14882	MYSc_Myo21	class XXI myosin, motor domain. The myosins here are from insects. Leishmania class XXI myosins do not group with them. Myo21, unlike other myosin proteins, contains UBA-like protein domains and has no structural or functional relationship with the myosins present in other organisms possessing cilia or flagella. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. They have diverse tails with IQ, WW, PX, and Tub domains. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 	642
276849	cd14883	MYSc_Myo22	class XXII myosin, motor domain. These myosins possess an extended neck with multiple IQ motifs such as found in class V, VIII, XI, and XIII myosins. These myosins are defined by two tandem MyTH4 and FERM domains. The apicomplexan, but not diatom myosins contain 4-6 WD40 repeats near the end of the C-terminal tail which suggests a possible function of these myosins in signal transduction and transcriptional regulation. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 	661
276850	cd14884	MYSc_Myo23	class XXIII myosin, motor domain. These myosins are predicted to have a neck region with 1-2 IQ motifs and a single MyTH4 domain in its C-terminal tail. The lack of a FERM domain here is odd since MyTH4 domains are usually found alongside FERM domains where they bind to microtubules. At any rate these Class XXIII myosins are still proposed to function in the apicomplexan microtubule cytoskeleton. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 	685
276851	cd14886	MYSc_Myo25	class XXV myosin, motor domain. These myosins are MyTH-FERM myosins that play a role in cell adhesion and filopodia formation. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 	650
276852	cd14887	MYSc_Myo26	class XXVI myosin, motor domain. These MyTH-FERM myosins are thought to be related to the other myosins that have a MyTH4 domain such as class III, VII, IX, X, XV, XVI, XVII, XX, XXII, XXV, and XXXIV. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 	725
276853	cd14888	MYSc_Myo27	class XXVII myosin, motor domain. Not much is known about this myosin class. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 	667
276854	cd14889	MYSc_Myo28	class XXVIII myosin, motor domain. These myosins are found in fish, chicken, and mollusks. The tail regions of these class-XXVIII myosins consist of an IQ motif, a short coiled-coil region, and an SH2 domain. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 	659
276855	cd14890	MYSc_Myo29	class XXIX myosin, motor domain. Class XXIX myosins are comprised of Stramenopiles and have very long tail domains consisting of three IQ motifs, short coiled-coil regions, up to 18 CBS domains, a PB1 domain, and a carboxy-terminal transmembrane domain. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 	662
276856	cd14891	MYSc_Myo30	class XXX myosin, motor domain. Myosins of class XXX are composed of an amino-terminal SH3-like domain, two IQ motifs, a coiled-coil region and a PX domain. The myosin classes XXX to XXXIV contain members from Phytophthora species and Hyaloperonospora parasitica. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 	645
276857	cd14892	MYSc_Myo31	class XXXI myosin, motor domain. Class XXXI myosins have a very long neck region consisting of 17 IQ motifs and 2 tandem ANK repeats that are separated by a PH domain. The myosin classes XXX to XXXIV contain members from Phytophthora species and Hyaloperonospora parasitica. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 	656
276858	cd14893	MYSc_Myo32	class XXXII myosin, motor domain. Class XXXII myosins do not contain any IQ motifs, but possess tandem MyTH4 and FERM domains. The myosin classes XXX to XXXIV contain members from Phytophthora species and Hyaloperonospora parasitica. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 	741
276859	cd14894	MYSc_Myo33	class XXXIII myosin, motor domain. Class XXXIII myosins have a variable number of IQ domains and 2 tandem ANK repeats that are separated by a PH domain. The myosin classes XXX to XXXIV contain members from Phytophthora species and Hyaloperonospora parasitica. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 	871
276860	cd14895	MYSc_Myo34	class XXXIV myosin, motor domain. Class XXXIV myosins are composed of an IQ motif, a short coiled-coil region, 5 tandem ANK repeats, and a carboxy-terminal FYVE domain. The myosin classes XXX to XXXIV contain members from Phytophthora species and Hyaloperonospora parasitica. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 	704
276861	cd14896	MYSc_Myo35	class XXXV myosin, motor domain. This class of metazoan myosins contains 2 IQ motifs, 2 MyTH4 domains, a single FERM domain, and an SH3 domain. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 	644
276862	cd14897	MYSc_Myo36	class XXXVI myosin, motor domain. This class of molluscan myosins contains a motor domain followed by a GlcAT-I (Beta1,3-glucuronyltransferase I) domain. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 	635
276863	cd14898	MYSc_Myo37	class XXXVII myosin, motor domain. The class XXXVII myosins are comprised of fungi. Not much is known about this myosin class. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 	578
276864	cd14899	MYSc_Myo38	class XXXVIII myosin. The class XXXVIII myosins are comprised of Stramenopiles. Not much is known about this myosin class. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 	717
276865	cd14900	MYSc_Myo39	class XXXIX myosin, motor domain. The class XXXIX myosins are found in Stramenopiles. Not much is known about this myosin class. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy.	627
276866	cd14901	MYSc_Myo40	class XL myosin, motor domain. The class XL myosins are comprised of Stramenopiles. Not much is known about this myosin class. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 	655
276867	cd14902	MYSc_Myo41	class XLI myosin, motor domain. The class XLI myosins are comprised of Stramenopiles. Not much is known about this myosin class. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 	716
276868	cd14903	MYSc_Myo42	class XLII myosin, motor domain. The class XLII myosins are comprised of Stramenopiles. Not much is known about this myosin class. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 	658
276869	cd14904	MYSc_Myo43	class XLIII myosin, motor domain. The class XLIII myosins are comprised of Stramenopiles. Not much is known about this myosin class. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 	653
276870	cd14905	MYSc_Myo44	class XLIV myosin, motor domain. There is little known about the function of the myosin XLIV class. Members here include cellular slime mold Polysphondylium and soil-living amoeba Dictyostelium. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 	673
276871	cd14906	MYSc_Myo45	class XLV myosin, motor domain. The class XLV myosins are comprised of slime molds Dictyostelium and Polysphondylium. Not much is known about this myosin class. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 	715
276872	cd14907	MYSc_Myo46	class XLVI myosin, motor domain. The class XLVI myosins are comprised of Alveolata. Not much is known about this myosin class. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 	669
276873	cd14908	MYSc_Myo47	class XLVII myosin, motor domain. The class XLVII myosins are comprised of Stramenopiles. Not much is known about this myosin class. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy. 	682
276874	cd14909	MYSc_Myh1_insects_crustaceans	class II myosin heavy chain 1, motor domain. Myosin motor domain of type IIx skeletal muscle myosin heavy chain 1 (also called MYHSA1, MYHa, MyHC-2X/D, MGC133384) in insects and crustaceans. Myh1 is a type I skeletal muscle myosin that in humans is encoded by the MYH1 gene. Class II myosins, also called conventional myosins, are the myosin type responsible for producing actomyosin contraction in metazoan muscle and non-muscle cells. Myosin II contains two heavy chains made up of the head (N-terminal) and tail (C-terminal) domains with a coiled-coil morphology that holds the two heavy chains together. The intermediate neck domain is the region creating the angle between the head and tail. It also contains 4 light chains which bind the heavy chains in the "neck" region between the head and tail. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. Class-II myosins are regulated by phosphorylation of the myosin light chain or by binding of Ca2+. A cyclical interaction between myosin and actin provides the driving force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy.	666
276875	cd14910	MYSc_Myh1_mammals	class II myosin heavy chain 1, motor domain. Myosin motor domain of type IIx skeletal muscle myosin heavy chain 1 (also called MYHSA1, MYHa, MyHC-2X/D, MGC133384) in mammals. Class II myosins, also called conventional myosins, are the myosin type responsible for producing actomyosin contraction in metazoan muscle and non-muscle cells. Myosin II contains two heavy chains made up of the head (N-terminal) and tail (C-terminal) domains with a coiled-coil morphology that holds the two heavy chains together. The intermediate neck domain is the region creating the angle between the head and tail. It also contains 4 light chains which bind the heavy chains in the "neck" region between the head and tail. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. Class-II myosins are regulated by phosphorylation of the myosin light chain or by binding of Ca2+. A cyclical interaction between myosin and actin provides the driving force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy.	671
276876	cd14911	MYSc_Myh2_insects_mollusks	class II myosin heavy chain 2, motor domain. Myosin motor domain of type IIa skeletal muscle myosin heavy chain 2 (also called MYH2A, MYHSA2, MyHC-IIa, MYHas8, MyHC-2A) in insects and mollusks. This gene encodes a member of the class II or conventional myosin heavy chains, and functions in skeletal muscle contraction. Mutations in this gene result in inclusion body myopathy-3 and familial congenital myopathy. Class II myosins, also called conventional myosins, are the myosin type responsible for producing actomyosin contraction in metazoan muscle and non-muscle cells. Myosin II contains two heavy chains made up of the head (N-terminal) and tail (C-terminal) domains with a coiled-coil morphology that holds the two heavy chains together. The intermediate neck domain is the region creating the angle between the head and tail. It also contains 4 light chains which bind the heavy chains in the "neck" region between the head and tail. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. Class-II myosins are regulated by phosphorylation of the myosin light chain or by binding of Ca2+. A cyclical interaction between myosin and actin provides the driving force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy.	674
276877	cd14912	MYSc_Myh2_mammals	class II myosin heavy chain 2, motor domain. Myosin motor domain of type IIa skeletal muscle myosin heavy chain 2 (also called MYH2A, MYHSA2, MyHC-IIa, MYHas8, MyHC-2A) in mammals. Mutations in this gene result in inclusion body myopathy-3 and familial congenital myopathy. Class II myosins, also called conventional myosins, are the myosin type responsible for producing actomyosin contraction in metazoan muscle and non-muscle cells. Myosin II contains two heavy chains made up of the head (N-terminal) and tail (C-terminal) domains with a coiled-coil morphology that holds the two heavy chains together. The intermediate neck domain is the region creating the angle between the head and tail. It also contains 4 light chains which bind the heavy chains in the "neck" region between the head and tail. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. Class-II myosins are regulated by phosphorylation of the myosin light chain or by binding of Ca2+. A cyclical interaction between myosin and actin provides the driving force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy.	673
276878	cd14913	MYSc_Myh3	class II myosin heavy chain 3, motor domain. Myosin motor domain of fetal skeletal muscle myosin heavy chain 3 (MYHC-EMB, MYHSE1, HEMHC, SMHCE) in tetrapods including mammals, lizards, and frogs. This gene is a member of the MYH family and encodes a protein with an IQ domain and a myosin head-like domain. Mutations in this gene have been associated with two congenital contracture (arthrogryposis) syndromes, Freeman-Sheldon syndrome and Sheldon-Hall syndrome. Class II myosins, also called conventional myosins, are the myosin type responsible for producing actomyosin contraction in metazoan muscle and non-muscle cells. Myosin II contains two heavy chains made up of the head (N-terminal) and tail (C-terminal) domains with a coiled-coil morphology that holds the two heavy chains together. The intermediate neck domain is the region creating the angle between the head and tail. It also contains 4 light chains which bind the heavy chains in the "neck" region between the head and tail. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. Class-II myosins are regulated by phosphorylation of the myosin light chain or by binding of Ca2+. A cyclical interaction between myosin and actin provides the driving force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy.	668
276879	cd14915	MYSc_Myh4	class II myosin heavy chain 4, motor domain. Myosin motor domain of skeletal muscle myosin heavy chain 4 (also called MYH2B, MyHC-2B, MyHC-IIb). Class II myosins, also called conventional myosins, are the myosin type responsible for producing actomyosin contraction in metazoan muscle and non-muscle cells. Myosin II contains two heavy chains made up of the head (N-terminal) and tail (C-terminal) domains with a coiled-coil morphology that holds the two heavy chains together. The intermediate neck domain is the region creating the angle between the head and tail. It also contains 4 light chains which bind the heavy chains in the "neck" region between the head and tail. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. Class-II myosins are regulated by phosphorylation of the myosin light chain or by binding of Ca2+. A cyclical interaction between myosin and actin provides the driving force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy.	671
276880	cd14916	MYSc_Myh6	class II myosin heavy chain 6, motor domain. Myosin motor domain of alpha (or fast) cardiac muscle myosin heavy chain 6. Cardiac muscle myosin is a hexamer consisting of two heavy chain subunits, two light chain subunits, and two regulatory subunits. This gene encodes the alpha heavy chain subunit of cardiac myosin. Mutations in this gene cause familial hypertrophic cardiomyopathy and atrial septal defect. Class II myosins, also called conventional myosins, are the myosin type responsible for producing actomyosin contraction in metazoan muscle and non-muscle cells. Myosin II contains two heavy chains made up of the head (N-terminal) and tail (C-terminal) domains with a coiled-coil morphology that holds the two heavy chains together. The intermediate neck domain is the region creating the angle between the head and tail. It also contains 4 light chains which bind the heavy chains in the "neck" region between the head and tail. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. Class-II myosins are regulated by phosphorylation of the myosin light chain or by binding of Ca2+. A cyclical interaction between myosin and actin provides the driving force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy.	670
276881	cd14917	MYSc_Myh7	class II myosin heavy chain 7, motor domain. Myosin motor domain of beta (or slow) type I cardiac muscle myosin heavy chain 7 (also called CMH1, MPD1, and CMD1S). Muscle myosin is a hexameric protein containing 2 heavy chain subunits, 2 alkali light chain subunits, and 2 regulatory light chain subunits. It is expressed predominantly in normal human ventricle and in skeletal muscle tissues rich in slow-twitch type I muscle fibers. Changes in the relative abundance of this protein and the alpha (or fast) heavy subunit of cardiac myosin correlate with the contractile velocity of cardiac muscle. Its expression is also altered during thyroid hormone depletion and hemodynamic overloading. Mutations in this gene are associated with familial hypertrophic cardiomyopathy, myosin storage myopathy, dilated cardiomyopathy, and Laing early-onset distal myopathy. Class II myosins, also called conventional myosins, are the myosin type responsible for producing actomyosin contraction in metazoan muscle and non-muscle cells. Myosin II contains two heavy chains made up of the head (N-terminal) and tail (C-terminal) domains with a coiled-coil morphology that holds the two heavy chains together. The intermediate neck domain is the region creating the angle between the head and tail. It also contains 4 light chains which bind the heavy chains in the "neck" region between the head and tail. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. Class-II myosins are regulated by phosphorylation of the myosin light chain or by binding of Ca2+. A cyclical interaction between myosin and actin provides the driving force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy.	668
276882	cd14918	MYSc_Myh8	class II myosin heavy chain 8, motor domain. Myosin motor domain of perinatal skeletal muscle myosin heavy chain 8 (also called MyHC-peri, MyHC-pn). Myosin is a hexameric protein composed of a pair of myosin heavy chains (MYH) and two pairs of nonidentical light chains. A mutation in this gene results in trismus-pseudocamptodactyly syndrome. Class II myosins, also called conventional myosins, are the myosin type responsible for producing actomyosin contraction in metazoan muscle and non-muscle cells. Myosin II contains two heavy chains made up of the head (N-terminal) and tail (C-terminal) domains with a coiled-coil morphology that holds the two heavy chains together. The intermediate neck domain is the region creating the angle between the head and tail. It also contains 4 light chains which bind the heavy chains in the "neck" region between the head and tail. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. Class-II myosins are regulated by phosphorylation of the myosin light chain or by binding of Ca2+. A cyclical interaction between myosin and actin provides the driving force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy.	668
276883	cd14919	MYSc_Myh9	class II myosin heavy chain 9, motor domain. Myosin motor domain of non-muscle myosin heavy chain 9 (also called NMMHCA, NMHC-II-A, MHA, FTNS, EPSTS, and DFNA17). Myosin is a hexameric protein composed of a pair of myosin heavy chains (MYH) and two pairs of nonidentical light chains. The encoded protein is a myosin IIA heavy chain that contains an IQ domain and a myosin head-like domain which is involved in several important functions, including cytokinesis, cell motility and maintenance of cell shape. Defects in this gene have been associated with non-syndromic sensorineural deafness autosomal dominant type 17, Epstein syndrome, Alport syndrome with macrothrombocytopenia, Sebastian syndrome, Fechtner syndrome and macrothrombocytopenia with progressive sensorineural deafness. Class II myosins, also called conventional myosins, are the myosin type responsible for producing actomyosin contraction in metazoan muscle and non-muscle cells. Myosin II contains two heavy chains made up of the head (N-terminal) and tail (C-terminal) domains with a coiled-coil morphology that holds the two heavy chains together. The intermediate neck domain is the region creating the angle between the head and tail. It also contains 4 light chains which bind the heavy chains in the "neck" region between the head and tail. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. Class-II myosins are regulated by phosphorylation of the myosin light chain or by binding of Ca2+. A cyclical interaction between myosin and actin provides the driving force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy.	670
276952	cd14920	MYSc_Myh10	class II myosin heavy chain 10, motor domain. Myosin motor domain of non-muscle myosin heavy chain 10 (also called NMMHCB). Mutations in this gene have been associated with May-Hegglin anomaly and developmental defects in brain and heart. Multiple transcript variants encoding different isoforms have been found for this gene. Class II myosins, also called conventional myosins, are the myosin type responsible for producing actomyosin contraction in metazoan muscle and non-muscle cells. Myosin II contains two heavy chains made up of the head (N-terminal) and tail (C-terminal) domains with a coiled-coil morphology that holds the two heavy chains together. The intermediate neck domain is the region creating the angle between the head and tail. It also contains 4 light chains which bind the heavy chains in the "neck" region between the head and tail. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. Class-II myosins are regulated by phosphorylation of the myosin light chain or by binding of Ca2+. A cyclical interaction between myosin and actin provides the driving force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy.	673
276885	cd14921	MYSc_Myh11	class II myosin heavy chain 11, motor domain. Myosin motor domain of smooth muscle myosin heavy chain 11 (also called SMMHC, SMHC). The gene product is a subunit of a hexameric protein that consists of two heavy chain subunits and two pairs of non-identical light chain subunits. It functions as a major contractile protein, converting chemical energy into mechanical energy through the hydrolysis of ATP. The gene encoding a human ortholog of rat NUDE1 is transcribed from the reverse strand of this gene, and its 3' end overlaps with that of the latter. Inversion of the MYH11 locus is one of the most frequent chromosomal aberrations found in acute myeloid leukemia. Alternative splicing generates isoforms that are differentially expressed, with ratios changing during muscle cell maturation. Mutations in MYH11 have been described in individuals with thoracic aortic aneurysms leading to acute aortic dissections with patent ductus arteriosus. MYH11 mutations are also thought to contribute to human colorectal cancer and are also associated with Peutz-Jeghers syndrome. The mutations found in human intestinal neoplasia result in unregulated proteins with constitutive motor activity, similar to the mutant myh11 zebrafish. Class II myosins, also called conventional myosins, are the myosin type responsible for producing actomyosin contraction in metazoan muscle and non-muscle cells. Myosin II contains two heavy chains made up of the head (N-terminal) and tail (C-terminal) domains with a coiled-coil morphology that holds the two heavy chains together. The intermediate neck domain is the region creating the angle between the head and tail. It also contains 4 light chains which bind the heavy chains in the "neck" region between the head and tail. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. Class-II myosins are regulated by phosphorylation of the myosin light chain or by binding of Ca2+. A cyclical interaction between myosin and actin provides the driving force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy.	673
276887	cd14923	MYSc_Myh13	class II myosin heavy chain 13, motor domain. Myosin motor domain of skeletal muscle myosin heavy chain 13 (also called MyHC-eo) in mammals, chicken, and green anole. Myh13 is a myosin whose expression is restricted primarily to the extrinsic eye muscles which are specialized for function in eye movement. Class II myosins, also called conventional myosins, are the myosin type responsible for producing muscle contraction in muscle cells. Myosin II contains two heavy chains made up of the head (N-terminal) and tail (C-terminal) domains with a coiled-coil morphology that holds the two heavy chains together. The intermediate neck domain is the region creating the angle between the head and tail. It also contains 4 light chains which bind the heavy chains in the "neck" region between the head and tail. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. Class-II myosins are regulated by phosphorylation of the myosin light chain or by binding of Ca2+. A cyclical interaction between myosin and actin provides the driving force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy.	671
276953	cd14927	MYSc_Myh7b	class II myosin heavy chain 7b, motor domain. Myosin motor domain of cardiac muscle, beta myosin heavy chain 7b (also called KIAA1512, dJ756N5.1, MYH14, MHC14). MYH7B is a slow-twitch myosin. Mutations in this gene result in one form of autosomal dominant hearing impairment. Multiple transcript variants encoding different isoforms have been found for this gene. Class II myosins, also called conventional myosins, are the myosin type responsible for producing actomyosin contraction in metazoan muscle and non-muscle cells. Myosin II contains two heavy chains made up of the head (N-terminal) and tail (C-terminal) domains with a coiled-coil morphology that holds the two heavy chains together. The intermediate neck domain is the region creating the angle between the head and tail. It also contains 4 light chains which bind the heavy chains in the "neck" region between the head and tail. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. Class-II myosins are regulated by phosphorylation of the myosin light chain or by binding of Ca2+. A cyclical interaction between myosin and actin provides the driving force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy.	676
276892	cd14929	MYSc_Myh15_mammals	class II myosin heavy chain 15, motor domain. Myosin motor domain of sarcomeric myosin heavy chain 15 in mammals (also called KIAA1000). MYH15 is a slow-twitch myosin. Myh15 is a ventricular myosin heavy chain. Myh15 is absent in embryonic and fetal muscles and is found in orbital layer of extraocular muscles at birth. Class II myosins, also called conventional myosins, are the myosin type responsible for producing actomyosin contraction in metazoan muscle and non-muscle cells. Myosin II contains two heavy chains made up of the head (N-terminal) and tail (C-terminal) domains with a coiled-coil morphology that holds the two heavy chains together. The intermediate neck domain is the region creating the angle between the head and tail. It also contains 4 light chains which bind the heavy chains in the "neck" region between the head and tail. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. Class-II myosins are regulated by phosphorylation of the myosin light chain or by binding of Ca2+. A cyclical interaction between myosin and actin provides the driving force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy.	662
276893	cd14930	MYSc_Myh14_mammals	class II myosin heavy chain 14 motor domain. Myosin motor domain of non-muscle myosin heavy chain 14 (also called FLJ13881, KIAA2034, MHC16, MYH17).  Its members include mammals, chickens, and turtles.  Class II myosins, also called conventional myosins, are the myosin type responsible for producing actomyosin contraction in metazoan muscle and non-muscle cells. Myosin II contains two heavy chains made up of the head (N-terminal) and tail (C-terminal) domains with a coiled-coil morphology that holds the two heavy chains together. The intermediate neck domain is the region creating the angle between the head and tail. It also contains 4 light chains which bind the heavy chains in the "neck" region between the head and tail. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. Class-II myosins are regulated by phosphorylation of the myosin light chain or by binding of Ca2+. A cyclical interaction between myosin and actin provides the driving force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. Some of the data used for this classification were produced by the CyMoBase team at the Max-Planck-Institute for Biophysical Chemistry. The sequence names are composed of the species abbreviation followed by the protein abbreviation and optional protein classifier and variant designations.	670
276895	cd14932	MYSc_Myh18	class II myosin heavy chain 18, motor domain. Myosin motor domain of muscle myosin heavy chain 18. Class II myosins, also called conventional myosins, are the myosin type responsible for producing actomyosin contraction in metazoan muscle and non-muscle cells. Myosin II contains two heavy chains made up of the head (N-terminal) and tail (C-terminal) domains with a coiled-coil morphology that holds the two heavy chains together. The intermediate neck domain is the region creating the angle between the head and tail. It also contains 4 light chains which bind the heavy chains in the "neck" region between the head and tail. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. Class-II myosins are regulated by phosphorylation of the myosin light chain or by binding of Ca2+. A cyclical interaction between myosin and actin provides the driving force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy.	676
276896	cd14934	MYSc_Myh16	class II myosin heavy chain 16, motor domain. Myosin motor domain of myosin heavy chain 16 (also called MHC20, MYH16, and myh5). The MYH16 gene, which encodes a sarcomeric myosin heavy chain expressed in nonhuman primate masticatory muscles, is an inactivated pseudogene in humans. This cd contains Myh16 in mammals. MYH16 fibres are intermediate between slow type 1 and fast type 2B fibres, but exert more force than any other fibre type examined. Class II myosins, also called conventional myosins, are the myosin type responsible for producing actomyosin contraction in metazoan muscle and non-muscle cells. Myosin II contains two heavy chains made up of the head (N-terminal) and tail (C-terminal) domains with a coiled-coil morphology that holds the two heavy chains together. The intermediate neck domain is the region creating the angle between the head and tail. It also contains 4 light chains which bind the heavy chains in the "neck" region between the head and tail. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. Class-II myosins are regulated by phosphorylation of the myosin light chain or by binding of Ca2+. A cyclical interaction between myosin and actin provides the driving force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. Some of the data used for this classification were produced by the CyMoBase team at the Max-Planck-Institute for Biophysical Chemistry. The sequence names are composed of the species abbreviation followed by the protein abbreviation and optional protein classifier and variant designations.	659
276897	cd14937	MYSc_Myo24A	class XXIV A myosin, motor domain. These myosins have 1-2 IQ motifs in their neck and a coiled-coil region in their C-terminal tail. The function of the class XXIV myosins remains elusive. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy.	637
276898	cd14938	MYSc_Myo24B	class XXIV B myosin, motor domain. These myosins have 1-2 IQ motifs in their neck and a coiled-coil region in their C-terminal tail. The functions of these myosins remain elusive. The catalytic (head) domain has ATPase activity and belongs to the larger group of P-loop NTPases. Myosins are actin-dependent molecular motors that play important roles in muscle contraction, cell motility, and organelle transport. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. A cyclical interaction between myosin and actin provides the driving force. Rates of ATP hydrolysis and consequently the speed of movement along actin filaments vary widely, from about 0.04 micrometer per second for myosin I to 4.5 micrometer per second for myosin II in skeletal muscle. Myosin II moves in discrete steps about 5-10 nm long and generates 1-5 piconewtons of force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy.	713
320093	cd14939	7tmD_STE2	fungal alpha-factor pheromone receptor STE2, member of the class D family of seven-transmembrane G protein-coupled receptors. This subfamily represents the alpha-factor pheromone receptor encoded by the STE2 gene, which is required for pheromone sensing and mating in haploid cells of the yeast Saccharomyces cerevisiae. The STE2-encoded seven-transmembrane domain receptor is a member of the class D GPCRs. Class D receptors are composed of two major subfamilies: Ste2 and Ste3. These two GPCRs (Ste2 and Ste3) sense the polypeptide mating pheromones, alpha-factor and a-factor, which activate G protein-coupled receptors on the surface of the opposite yeast-mating haploid-types (MATa and MAT-alpha), respectively. Activation of these receptors by pheromones leads to activation of the mitogen-activated protein kinase (MAPK) signal transduction cascades, G1 cell cycle arrest, and polarized cell growth in the direction of the partner cell (a process called shmooing), which ultimately induces cell-cell fusion and the formation of a diploid zygote. Like all GPCRs, these pheromone mating factor receptors possess the same basic architecture of seven-transmembrane (7TM) domains and share common signaling mechanisms; however, there is no significant sequence similarity either between Ste2 and Ste3, or between these two receptors and the other 7TM GPCRs. Thus, STE2 and STE3 represent phylogenetically distinct groups.	265
320094	cd14940	7tmE_cAMP_R_Slime_mold	slime mold cyclic AMP receptor, member of the class E family of seven-transmembrane G protein-coupled receptors. This family represents the class E of seven-transmembrane G-protein coupled receptors found in soil-living amoebas, commonly referred to as slime molds. The class E family includes cAMP receptors (cAR1-4) and cAMP receptor-like proteins (CrlA-C) from Dictyostelium discoideum, and their highly homologous cAMP receptors (TasA and TasB) from Polysphondylium pallidum. So far, four subtypes of cAMP receptors (cAR1-4) have been identified that play an essential role in the detection and transmission of the periodic extracellular cAMP waves that regulate chemotactic cell movement during Dictyostelium development, as unicellular amoebae aggregate into multicellular slugs and then differentiate into a sporocarp, a fruiting body with cells specialized for different functions. These four subtypes differ in their expression levels and patterns during development. cAR1 is a high-affinity receptor that is the first one to be expressed highly during early aggregation and continues to be expressed at low levels during later developmental stages. cAR1 detects extracellular cAMP and is coupled to G-alpha2 protein. Cells lacking cAR1 fail to aggregate, demonstrating that cAR1 is responsible for aggregation. During later aggregation the high-affinity cAR3 receptor is expressed at low levels. Nonetheless, cells lacking cAR3 do not show an obviously altered pattern of development and are still able to aggregate into fruiting bodies. In contrast, cAR2 and cAR4 are low-affinity receptors expressed predominantly after aggregation in pre-stalk cells. cAR2 is essential for normal tip formation and deletion of the receptor arrests development at the mound stage. On the other hand, cAR4 regulates axial patterning and cellular differentiation, and deletion of the receptor results in defects during culmination. Furthermore, three cAMP receptor-like proteins (CrlA-C) were identified in Dictyostelium that show limited sequence similarity to the cAMP receptors. Of these, CrlA is thought to be required for normal cell growth and tip formation in developing aggregates.	256
271344	cd14941	TRAPPC_bet3-like	Bet3-like domains of TRAPP. Bet3-like domains of a subfamily of core components of the trafficking protein particle complex (TRAPP) include TRAPPC3, TRAPPC5, and TRAPPC6A. TRAPP complexes play a key role in the regulation of ER-to-Golgi  and intra-Golgi transport by tethering the vesicle membrane to the target membrane. TRAPPs are large multimeric protein complexes which contain six core subunits that belong to two distinct structural families, the bet3-like family and the sedlin-like family.	152
271345	cd14942	TRAPPC3_bet3	Bet3-TRAPPC3 subunit of the TRAPP complex. Bet3 (also known as TRAPPC3) subunit of the trafficking protein particle complex (TRAPP). Bet3 is one of the six core subunits of TRAPP complexes which play a key role in the regulation of ER-to-Golgi  and intra-Golgi transport by tethering the vesicle membrane to the target membrane. TRAPPC3 has also been shown to be additionally important for membrane fusion during the formation of vesicular tubular clusters (VTC). In its core, Bet3 forms a hydrophobic channel that also contains a conserved acylation site.	155
271346	cd14943	TRAPPC5_Trs31	Trs31 subunit of the TRAPP complex. TRS31 (also known as TRAPPC5) subunit of the trafficking protein particle complex (TRAPP). TRS31 is one of the six core subunits of TRAPP complexes which play a key role in the regulation of ER-to-Golgi  and intra-Golgi transport by tethering the vesicle membrane to the target membrane.	158
271347	cd14944	TRAPPC6A_Trs33	Trs33 subunit of the TRAPP complex. TRS33 (also known as TRAPPC6A) subunit of the trafficking protein particle complex (TRAPP). TRS33 is one of the six core subunits of TRAPP complexes which play a key role in the regulation of ER-to-Golgi  and intra-Golgi transport by tethering the vesicle membrane to the target membrane. In mammals, mutations in TRAPPC6a cause mosaic loss of coat pigment.	167
271253	cd14945	Myo5-like_CBD	Cargo binding domain of myosin 5 and similar proteins. Class V myosins are well-studied unconventional myosins, represented by three paralogs (Myo5 a,b,c) in vertebrates and two (myo2 and myo4) in fungi, and are related to plant class XI myosins. Their C-terminal cargo binding domain is important for the binding of a diverse set of cargos, including membrane vesicles, organelles, proteins and mRNA. MyoV-CBDs interact with several adaptor proteins that in turn interact with the cargo.	288
380670	cd14946	Tet_JBP	oxygenase domain of ten-eleven translocation (TET) enzymes, J-binding proteins (JBPs), and similar proteins. TET proteins are involved in DNA demethylation through iteratively oxidizing 5-methylcytosine (5mC) into 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC). TET proteins contain a C-terminal catalytic domain which consists of a cysteine-rich region and a double-stranded beta-helix (DSBH) fold. Alterations in TET protein function have been linked to cancer, and TETs influence many cell differentiation processes. J binding protein (JBP) 1 and JBP2 are thymidine hydroxylases that catalyze the first step of base J biosynthesis: the hydroxylation of thymine in DNA to form 5-hydroxymethyluracil (hmU). Base J (beta-d-glucopyranosyloxymethyluracil) is a hyper-modified DNA base found in the DNA of kinetoplastids (Trypanosoma brucei, Trypanosoma cruzi, and Leishmania). JBP1 and JBP2 each contain a J-DNA binding domain and a thymidine hydroxylase domain. Members of this TET/JBP family of dioxygenases require Fe2+ and alpha-ketoglutarate (also known as 2-oxoglutarate) for activity.	264
271343	cd14947	NBR1_like	Functionally uncharacterized domain in neighbor of Brca1 Gene 1 and related proteins. NBR1 has been characterized as a specific late endosomal protein, which might play a role in receptor (RTK) trafficking. Specifically, NBR1 was shown to inhibit ligand-mediated receptor internalization from the cell surface. The region covered by this domain model may be involved in that function, as the C-terminus (which contains a UBA domain) was shown to be essential but not sufficient by itself. In an earlier yeast two-hybrid study, the region in mouse NBR1 covered by this domain has been shown to interact with CIB (calcium and integrin-binding protein) and FEZ1 (fasciculation and elongation protein zeta-1). Thus, NBR1 may play a role in cellular signalling pathways and possibly in neural development.	112
271342	cd14948	BACON	Bacteroidetes-Associated Carbohydrate-binding (putative) Often N-terminal (BACON) domain. The BACON domain is found in diverse domain architectures and associated with a wide variety of domains, including carbohydrate-active enzymes and proteases. It was named for its suggested function of carbohydrate binding; the latter was inferred from domain architectures, sequence conservation, and phyletic distribution. However, recent experimental data suggest that its primary function in Bacteroides ovatus endo-xyloglucanase BoGH5A is to distance the catalytic module from the cell surface and confer additional mobility to the catalytic domain for attack of the polysaccharide. No evidence for a direct role in carbohydrate binding could be found in that case. The large majority of BACON domains are found in Bacteroidetes.	83
271340	cd14949	Asparaginase_2_like_3	Uncharacterized bacterial subfamily of the L-Asparaginase type 2-like enzymes, an Ntn-hydrolase family. The wider family of Asparaginase 2-like enzymes includes Glycosylasparaginase, Taspase 1, and  L-Asparaginase type 2. Glycosylasparaginase catalyzes the hydrolysis of the glycosylamide bond of asparagine-linked glycoprotein. Taspase1 catalyzes the cleavage of the Mix Lineage Leukemia (MLL) nuclear protein and transcription factor TFIIA. L-Asparaginase type 2 hydrolyzes L-asparagine to L-aspartate and ammonia. The proenzymes of this family undergo autoproteolytic cleavage before a threonine to generate alpha and beta subunits. The threonine becomes the N-terminal residue of the beta subunit and is the catalytic residue.	280
271341	cd14950	Asparaginase_2_like_2	Uncharacterized archaebacterial subfamily of the L-Asparaginase type 2-like enzymes, an Ntn-hydrolase family. The wider family of Asparaginase 2-like enzymes includes Glycosylasparaginase, Taspase 1, and  L-Asparaginase type 2. Glycosylasparaginase catalyzes the hydrolysis of the glycosylamide bond of asparagine-linked glycoprotein. Taspase1 catalyzes the cleavage of the Mix Lineage Leukemia (MLL) nuclear protein and transcription factor TFIIA. L-Asparaginase type 2 hydrolyzes L-asparagine to L-aspartate and ammonia. The proenzymes of this family undergo autoproteolytic cleavage before a threonine to generate alpha and beta subunits. The threonine becomes the N-terminal residue of the beta subunit and is the catalytic residue.	251
271321	cd14951	NHL-2_like	NHL repeat domain of NHL repeat-containing protein 2 and similar proteins. NHL repeat-containing protein 2 (NHLRC2) and related bacterial proteins; members of this eukaryotic and bacterial family are uncharacterized. The NHL repeat domain is found C-terminal to a thioredoxin domain. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.	334
271322	cd14952	NHL_PKND_like	NHL repeat domain of the protein kinase PknD. PknD is a mycobacterial transmembrane protein with a cytosolic kinase domain and an extracellular sensor domain that contains NHL repeats. It plays a key role in the development of central nervous system tuberculosis, by mediating the invasion of host brain endothelia. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.	247
271323	cd14953	NHL_like_1	Uncharacterized NHL-repeat domain in bacterial proteins. This bacterial family of NHL-repeat domains is found in a variety of domain architectures. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.	323
271324	cd14954	NHL_TRIM71_like	NHL repeat domain of the tripartite motif-containing protein 71 (TRIM71) and related proteins. The E3 ubiquitin-protein ligase TRIM71 (LIN-41) is a RING-finger domain containing protein that has been associated with a variety of activities. The NHL repeat domain appears responsible for targeting TRIM71 to mRNAs, and TRIM71 appears responsible for translational repression and mRNA decay. Together with BRAT, TRIM71 may be part of a family of mRNA repressors that regulate proliferation and differentiation. TRIM71 has been shown to negatively regulate stability of Lin28B, which inhibits the pre-let-7 miRNA precursor from maturing by recruiting the terminal uridylyltransferase TUT4. This family also contains the Caenorhabditis elegans NHL repeat containing 1 (NHL-1), a RING-finger-containing protein that was shown to interact with E2 ubiquitin conjugating enzymes in two-hybrid screens. Its domain architecture resembles that of the E3 ubiquitin protein ligases TRIM2, TRIM32, and TRIM71. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.	285
271325	cd14955	NHL_like_4	Uncharacterized NHL-repeat domain in bacterial and archaeal proteins. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.	279
271326	cd14956	NHL_like_3	Uncharacterized NHL-repeat domain in bacterial proteins. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.	274
271327	cd14957	NHL_like_2	Uncharacterized NHL-repeat domain in bacterial and archaeal proteins. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.	280
271328	cd14958	NHL_PAL_like	Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL, EC 4.3.2.5). PAL catalyzes the N-dealkylation of peptidyl-alpha-hydroxyglycine, which results in an alpha-amidated peptide and glyoxylate. Amidation of the C-terminus is required for the activity of many peptide hormones and neuropeptides. The catalytic residues of PAL are located on several NHL-repeats. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.	300
271329	cd14959	NHL_brat_like	NHL repeat domain of the Drosophila brain-tumor protein (brat) and similar proteins. Drosophila brain-tumor (brat) has been identified as a tumor suppressor that negatively regulates cell proliferation during development of the Drosophila larval brain. It appears to be recruited to the 3'-untranslated region of hunchback RNA and regulates its translation by forming a complex with Pumilio (Pum) and Nanos (Nos). The NHL domain of brat appears to be involved by interacting with the RNA-binding Puf repeats of Pumilio, a sequence-specific RNA binding protein. This family also contains the Caenorhabditis elegans homolog NCL-1. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.	274
271330	cd14960	NHL_TRIM2_like	NHL repeat domain of the tripartite motif-containing protein 2 (TRIM2) and related proteins. The E3 ubiquitin-protein ligase TRIM2 is responsible for ubiquitinating the apoptosis-inducing Bcl-2-interacting mediator of cell death (Bim), when the latter is phosphorylated by p42/p44 MAPK. TRIM2 regulates the ubiquitination of neurofilament light subunit (NF-L); deficiencies in TRIM2 result in increased NF-L levels in axons and subsequent axonopathy. TRIM2 is also involved in regulating axon outgrowth during development; it contains RING and BBOX domains, and the NHL repeat domain is located at its C-terminus. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.	274
271331	cd14961	NHL_TRIM32_like	NHL repeat domain of the tripartite motif-containing protein 32 (TRIM32) and related proteins. The E3 ubiquitin-protein ligase TRIM32 (HT2A) is widely expressed and is responsible for ubiquitinating a large variety of targets, including dysbindin (DTNBP1), NPHP7/Glis2, TAp73, and others. TRIM32 promotes disassociation of the plakoglobin-PI3K complex and reduces PI3K-Akt-FoxO signaling. Mutations in TRIM32 have been implicated in two diverse diseases: limb-girdle muscular dystrophy type 2H (LGMD2H) or sarcotubular myopathy (STM), and Bardet-Biedl syndrome type 11 (BBS11). The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.	273
271332	cd14962	NHL_like_6	Uncharacterized NHL-repeat domain in bacterial proteins. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.	271
271333	cd14963	NHL_like_5	Uncharacterized NHL-repeat domain in bacterial proteins. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.	268
410628	cd14964	7tm_GPCRs	seven-transmembrane G protein-coupled receptor superfamily. This hierarchical evolutionary model represents the seven-transmembrane (7TM) receptors, often referred to as G protein-coupled receptors (GPCRs), which transmit physiological signals from the outside of the cell to the inside via G proteins. GPCRs constitute the largest known superfamily of transmembrane receptors across the three kingdoms of life that respond to a wide variety of extracellular stimuli including peptides, lipids, neurotransmitters, amino acids, hormones, and sensory stimuli such as light, smell and taste. All GPCRs share a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes. However, some 7TM receptors, such as the type 1 microbial rhodopsins, do not activate G proteins. Based on sequence similarity, GPCRs can be divided into six major classes: class A (the rhodopsin-like family), class B (the Methuselah-like, adhesion and secretin-like receptor family), class C (the metabotropic glutamate receptor family), class D (the fungal mating pheromone receptors), class E (the cAMP receptor family), and class F (the frizzled/smoothened receptor family). Nearly 800 human GPCR genes have been identified and are involved essentially in all major physiological processes. Approximately 40% of clinically marketed drugs mediate their effects through modulation of GPCR function for the treatment of a variety of human diseases including bacterial infections.	267
410629	cd14965	7tm_Opsins_type1	type 1 opsins, member of the seven-transmembrane GPCR superfamily. This group represents the microbial rhodopsin family, also known as type 1 rhodopsins, which can function as light-dependent ion pumps, cation channels, and sensors. They have been found in various single-celled microorganisms from all three domains of life, including halophile archaea, gamma-proteobacteria, cyanobacteria, fungi, and green algae. Members of the type I rhodopsin family include: light-driven inward chloride pump halorhodopsin (HR); light-driven outward proton pump bacteriorhodopsin (BR); light-gated cation channel channelrhodopsin (ChR); light-sensor activating transmembrane transducer proteins, sensory rhodopsin I and II (SRI and II); light-sensor activating soluble transducer protein Anabaena sensory rhodopsin (ASR); and other light-driven proton pumps such as blue-light-absorbing and green-light absorbing proteorhodopsins, among others. While microbial (type 1) and animal (type 2) rhodopsins have no sequence similarity with each other, they share a common architecture consisting of seven-transmembrane alpha-helices (TM) connected by extracellular loops and intracellular loops. Both types of rhodopsins consist of opsin and a covalently attached retinal (the aldehyde of vitamin A), a photoreactive chromophore, via a protonated Schiff base linkage to an amino group of lysine in the middle of the seventh transmembrane helix (TM7). Upon the absorption of light, microbial rhodopsins undergo light-induced photoisomerization of all-trans retinal into the 13-cis isomer, whereas the photoisomerization of 11-cis retinal to all-trans isomer occurs in the animal rhodopsins. While animal visual rhodopsins are activated by light to catalyze GDP/GTP exchange in the alpha subunit of the retinal G protein transducin (Gt), microbial rhodopsins do not activate G proteins.	214
320097	cd14966	7tmD_STE3	fungal a-factor pheromone receptor STE3, member of the class D family of seven-transmembrane G protein-coupled receptors. This subfamily represents the a-factor pheromone receptor encoded by the STE3 gene, which is required for pheromone sensing and mating in haploid cells of the yeast Saccharomyces cerevisiae. The STE3-encoded seven-transmembrane domain receptor is a member of the class D GPCRs. Class D receptors are composed of two major subfamilies: Ste2 and Ste3. These two GPCRs (Ste2 and Ste3) sense the polypeptide mating pheromones, alpha-factor and a-factor, which activate G protein-coupled receptors on the surface of the opposite yeast-mating haploid-types (MATa and MAT-alpha), respectively. Activation of these receptors by pheromones leads to activation of the mitogen-activated protein kinase (MAPK) signal transduction cascades, G1 cell cycle arrest, and polarized cell growth in the direction of the partner cell (a process called shmooing), which ultimately induces cell-cell fusion and the formation of a diploid zygote. Like all GPCRs, these pheromone mating factor receptors possess the same basic architecture of seven-transmembrane (7TM) domains and share common signaling mechanisms; however, there is no significant sequence similarity either between Ste2 and Ste3, or between these two receptors and the other 7TM GPCRs. Thus, STE2 and STE3 represent phylogenetically distinct groups.	259
320098	cd14967	7tmA_amine_R-like	amine receptors and similar proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. Amine receptors of the class A family of GPCRs include adrenoceptors, 5-HT (serotonin) receptors, muscarinic cholinergic receptors, dopamine receptors, histamine receptors, and trace amine receptors. The receptors of the amine subfamily are major therapeutic targets for the treatment of neurological disorders and psychiatric diseases. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	259
341316	cd14968	7tmA_Adenosine_R	adenosine receptor subfamily, member of the class A family of seven-transmembrane G protein-coupled receptors. The adenosine receptors (or P1 receptors), a family of G protein-coupled purinergic receptors, bind adenosine as their endogenous ligand. There are four types of adenosine receptors in humans, designated as A1, A2A, A2B, and A3. Each type is encoded by a different gene and has distinct functions with some overlap. For example, both A1 and A2A receptors are involved in regulating myocardial oxygen consumption and coronary blood flow in the heart, while the A2A receptor also has a broad spectrum of anti-inflammatory effects in the body. These two receptors are also expressed in the brain, where they have important roles in the release of other neurotransmitters such as dopamine and glutamate, while the A2B and A3 receptors are found primarily in the periphery and play important roles in inflammation and immune responses. The A1 and A3 receptors preferentially interact with G proteins of the G(i/o) family, thereby lowering the intracellular cAMP levels, whereas the A2A and A2B receptors interact with G proteins of the G(s) family, activating adenylate cyclase to elevate cAMP levels.	285
381741	cd14969	7tmA_Opsins_type2_animals	type 2 opsins in animals, member of the class A family of seven-transmembrane G protein-coupled receptors. This rhodopsin family represents the type 2 opsins found in vertebrates and invertebrates except sponge. Type 2 opsins primarily function as G protein coupled receptors and are responsible for vision as well as for circadian rhythm and pigment regulation. On the contrary, type 1 opsins such as bacteriorhodopsin and proteorhodopsin are found in both prokaryotic and eukaryotic microbes, functioning as light-gated ion channels, proton pumps, sensory receptors and in other unknown functions. Although these two opsin types share seven-transmembrane domain topology and a conserved lysine residue in the seventh helix, type 1 opsins do not activate G-proteins and are not evolutionarily related to type 2. Type 2 opsins can be classified into six distinct subfamilies including the vertebrate opsins/encephalopsins, the G(o) opsins, the G(s) opsins, the invertebrate G(q) opsins, the photoisomerases, and the neuropsins.	284
320101	cd14970	7tmA_Opioid_R-like	opioid receptors and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes opioid receptors, somatostatin receptors, melanin-concentrating hormone receptors (MCHRs), and neuropeptides B/W receptors. Together they constitute the opioid receptor-like family, members of the class A G-protein coupled receptors. Opioid receptors are coupled to inhibitory G proteins of the G(i/o) family and are involved in regulating a variety of physiological functions such as pain, addiction, mood, stress, epileptic seizure, and obesity, among many others. G protein-coupled somatostatin receptors (SSTRs), which display strong sequence similarity with opioid receptors, bind somatostatin (somatotropin release inhibiting factor), a polypeptide hormone that regulates a wide variety of physiological functions such as neurotransmission, cell proliferation, contractility of smooth muscle cells, and endocrine signaling as well as inhibition of the release of many secondary hormones. MCHR binds melanin concentrating hormone and is presumably involved in the neuronal regulation of food intake. Despite strong homology with somatostatin receptors, MCHR does not appear to bind somatostatin. Neuropeptides B/W receptors are primarily expressed in the CNS and stimulate cortisol secretion by activating the adenylate cyclase- and the phospholipase C-dependent signaling pathways.	282
320102	cd14971	7tmA_Galanin_R-like	galanin receptor and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This subfamily includes G-protein coupled galanin receptors, kisspeptin receptor and allatostatin-A receptor (AstA-R) in insects. These receptors, which are members of the class A of seven transmembrane GPCRs, share a high degree of sequence homology among themselves. The galanin receptors bind galanin, a neuropeptide that is widely expressed in the brain, peripheral tissues, and endocrine glands. Galanin is implicated in numerous neurological and psychiatric diseases including Alzheimer's disease, eating disorders, and epilepsy, among many others. KiSS1-derived peptide receptor (also known as GPR54 or kisspeptin receptor) binds the peptide hormone kisspeptin (metastin), which is encoded by the metastasis suppressor gene (KISS1) expressed in various endocrine and reproductive tissues. AstA-R is a G-protein coupled receptor that binds allatostatin A. Three distinct types of allatostatin have been identified in the insects and crustaceans: AstA, AstB, and AstC. They both inhibit the biosynthesis of juvenile hormone and exert an inhibitory influence on food intake. Therefore, allatostatins are considered as potential targets for insect control.	281
341317	cd14972	7tmA_EDG-like	endothelial differentiation gene family, member of the class A family of seven-transmembrane G protein-coupled receptors. This group represents the endothelial differentiation gene (Edg) family of G-protein coupled receptors, melanocortin/ACTH receptors, and cannabinoid receptors as well as their closely related receptors. The Edg GPCRs bind blood borne lysophospholipids including sphingosine-1-phosphate (S1P) and lysophosphatidic acid (LPA), which are involved in the regulation of cell proliferation, survival, migration, invasion, endothelial cell shape change and cytoskeletal remodeling. The Edg receptors are classified into two subfamilies: the lysophosphatidic acid subfamily that includes LPA1 (Edg2), LPA2 (Edg4), and LPA3 (Edg7); and the S1P subfamily that includes S1P1 (Edg1), S1P2 (Edg5), S1P3 (Edg3), S1P4 (Edg6), and S1P5 (Edg8).  Melanocortin receptors bind a group of pituitary peptide hormones known as melanocortins, which include adrenocorticotropic hormone (ACTH) and the different isoforms of melanocyte-stimulating hormones. Two types of cannabinoid receptors, CB1 and CB2, are activated by naturally occurring endocannabinoids, cannabis plant-derived cannabinoids such as tetrahydrocannabinol, or synthetic cannabinoids. The CB receptors are involved in the various physiological processes such as appetite, mood, memory, and pain sensation. CB1 receptor is expressed predominantly in central and peripheral neurons, while CB2 receptor is found mainly in the immune system.	275
320104	cd14973	7tmA_Mrgpr	mas-related G protein-coupled receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. The Mas-related G-protein coupled receptor (Mrgpr) family constitutes a group of orphan receptors exclusively expressed in nociceptive primary sensory neurons and mast cells in the skin. Members of the Mrgpr family have been implicated in the modulation of nociception, pruritus (itching), and mast cell degranulation. The Mrgpr family in rodents and humans contains more than 50 members that can be grouped into 9 distinct subfamilies: MrgprA, B, C (MrgprX1), D, E, F, G, H (GPR90), and the primate-specific MrgprX subfamily. Some Mrgprs can be activated by endogenous ligands such as beta-alanine, adenine (a cell metabolite and potential transmitter), RF-amide related peptides, or salusin-beta (a bioactive peptide). However, the effects of these agonists are not clearly understood, and the physiological role of the individual receptor family members remains to be determined. Also included in this family is Mas-related G-protein coupled receptor 1-like (MAS1L) which is only found in primates. The angiotensin-II metabolite angiotensin is an endogenous ligand for MAS1L.	272
320105	cd14974	7tmA_Anaphylatoxin_R-like	anaphylatoxin receptors and related G protein-coupled chemokine receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. This subfamily of G-protein coupled receptors includes anaphylatoxin receptors, formyl peptide receptors (FPR), prostaglandin D2 receptor 2, GPR1, and related chemokine receptors. The anaphylatoxin receptors are a group of G-protein coupled receptors that bind anaphylatoxins. The members of this group include C3a and C5a receptors. The formyl peptide receptors (FPRs) are chemoattractant GPCRs that are involved in mediating immune responses to infection. They are expressed mainly on polymorphonuclear and mononuclear phagocytes and bind N-formyl-methionyl peptides (FMLP), which are derived from the mitochondrial proteins of ruptured host cells or invading pathogens. Chemokine receptor-like 1 (also known as chemerin receptor 23) is a GPCR for the chemoattractant adipokine chemerin, also known as retinoic acid receptor responder protein 2 (RARRES2), and for the omega-3 fatty acid derived molecule resolvin E1. Interaction with chemerin induces activation of the MAPK and PI3K signaling pathways leading to downstream functional effects, such as a decrease in immune responses, stimulation of adipogenesis, and angiogenesis. On the other hand, resolvin E1 negatively regulates the cytokine production in macrophages by reducing the activation of MAPK1/3 and NF-kB pathways. Prostaglandin D2 receptor, also known as CRTH2, is a chemoattractant G-protein coupled receptor expressed on T helper type 2 cells that binds prostaglandin D2 (PGD2). PGD2 functions as a mast cell-derived mediator to trigger asthmatic responses and also causes vasodilation. PGD2 exerts its inflammatory effects by binding to two G-protein coupled receptors, the D-type prostanoid receptor (DP) and PD2R2 (CRTH2). PD2R2 couples to the G protein G(i/o) type which leads to a reduction in intracellular cAMP levels and an increase in intracellular calcium. GPR1 is an orphan receptor that can be activated by the leukocyte chemoattractant chemerin, thereby suggesting that some of the anti-inflammatory actions of chemerin may be mediated through GPR1.	274
320106	cd14975	7tmA_LTB4R	leukotriene B4 receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Leukotriene B4 (LTB4), a metabolite of arachidonic acid, is a powerful chemotactic activator for granulocytes and macrophages. Two receptors for LTB4 have been identified: a high-affinity receptor (LTB4R1 or BLT1) and a low-affinity receptor (LTB4R2 or BLT2). Both BLT1 and BLT2 receptors belong to the rhodopsin-like G-protein coupled receptor superfamily and primarily couple to G(i) proteins, which lead to chemotaxis, calcium mobilization, and inhibition of adenylate cyclase. In some cells, they can also couple to the G(q)-like protein, G16, and activate phospholipase C. LTB4 is involved in mediating inflammatory processes, immune responses, and host defense against infection. Studies have shown that LTB4 stimulates leukocyte extravasation, neutrophil degranulation, lysozyme release, and reactive oxygen species generation.	278
320107	cd14976	7tmA_RNL3R	relaxin-3 like peptide receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. This G protein-coupled receptor subfamily is composed of the relaxin-3 like peptide receptors, RNL3R1 and RNL3R2, and similar proteins. The relaxin-3 like peptide family includes relaxin-1, -2, -3, as well as insulin-like (INSL) peptides 3 to 6. RNL3/relaxin-3 and INSL5 are the endogenous ligands for RNL3R1 and RNL3R2, respectively. RNL3R1, also called GPCR135 or RXFP3, is predominantly expressed in the brain and is implicated in stress, anxiety, feeding, and metabolism. Insulin-like peptide 5 (INSL5), the endogenous ligand for RNL3R2 (also called GPCR142 or RXFP4), plays a role in fat and glucose metabolism. INSL5 is highly expressed in human rectal and colon tissues. Both RNL3R1 and RNL3R2 signal through G(i) protein and inhibit adenylate cyclase, thereby inhibiting cAMP accumulation. RNL3R1 has been shown to activate the Erk1/2 signaling pathway.	290
320108	cd14977	7tmA_ET_R-like	endothelin receptors and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This subfamily of G-protein coupled receptors includes endothelin receptors, bombesin receptor subtype 3 (BRS-3), gastrin-releasing peptide receptor (GRPR), neuromedin B receptor (NMB-R), endothelin B receptor-like 2 (ETBR-LP-2), and GPR37. The endothelin receptors and related proteins are members of the seven transmembrane rhodopsin-like G-protein coupled receptor family (class A GPCRs) which activate multiple effectors via different types of G protein.	292
410630	cd14978	7tmA_FMRFamide_R-like	FMRFamide (Phe-Met-Arg-Phe) receptors and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes Drosophila melanogaster G-protein coupled FMRFamide (Phe-Met-Arg-Phe-NH2) receptor DrmFMRFa-R and related invertebrate receptors, as well as the vertebrate proteins GPR139 and GPR142. DrmFMRFa-R binds with high affinity to FMRFamide and intrinsic FMRFamide-related peptides. FMRFamide is a neuropeptide from the family of FMRFamide-related peptides (FaRPs), which all contain a C-terminal RFamide (Arg-Phe-NH2) motif and have diverse functions in the central and peripheral nervous systems. FMRFamide is an important neuropeptide in many types of invertebrates such as insects, nematodes, molluscs, and worms. In invertebrates, the FMRFamide-related peptides are involved in the regulation of heart rate, blood pressure, gut motility, feeding behavior, and reproduction. On the other hand, in vertebrates such as mice, they play a role in the modulation of morphine-induced antinociception. Orphan receptors GPR139 and GPR142 are very closely related G protein-coupled receptors, but they have different expression patterns in the brain and in other tissues. These receptors couple to inhibitory G proteins and activate phospholipase C. Studies suggested that dimer formation may be required for their proper function. GPR142 is predominantly expressed in pancreatic beta-cells and mediates enhancement of glucose-stimulated insulin secretion, whereas GPR139 is mostly expressed in the brain and is suggested to play a role in the control of locomotor activity. Tryptophan and phenylalanine have been identified as putative endogenous ligands of GPR139.	299
320110	cd14979	7tmA_NTSR-like	neurotensin receptors and related G protein-coupled receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. This subfamily includes the neurotensin receptors and related G-protein coupled receptors, including neuromedin U receptors, growth hormone secretagogue receptor, motilin receptor, the putative GPR39 and the capa receptors from insects. These receptors all bind peptide hormones with diverse physiological effects. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	300
320111	cd14980	7tmA_Glycoprotein_LRR_R-like	glycoprotein hormone receptors and leucine-rich repeats containing G protein-coupled receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. This subfamily includes the glycoprotein hormone receptors (GPHRs), vertebrate receptors containing 17 leucine-rich repeats (LGR4-6), and the relaxin family peptide receptors (also known as LGR7 and LGR8). They are seven transmembrane domain receptors with a very large extracellular N-terminal domain containing many leucine-rich repeats responsible for hormone recognition and binding. The glycoprotein hormone receptor family contains receptors for the pituitary hormones, thyrotropin (thyroid-stimulating hormone receptor), follitropin (follicle-stimulating hormone receptor), and lutropin (luteinizing hormone receptor). Glycoprotein hormone receptors couple primarily to the G(s)-protein and promote cAMP production, but also to the G(i)- or G(q)-protein. Two orphan GPCRs, LGR7 and LGR8, have been recently identified as receptors for the relaxin peptide hormones.	286
320112	cd14981	7tmA_Prostanoid_R	G protein-coupled receptors for prostanoids, member of the class A family of seven-transmembrane G protein-coupled receptors. Prostanoids are the cyclooxygenase (COX) metabolites of arachidonic acid, which include the prostaglandins (PGD2, PGE2, PGF2alpha), prostacyclin (PGI2), and thromboxane A2 (TxA2). These five major bioactive prostanoids act as mediators or modulators in a wide range of physiological and pathophysiological processes within the kidney and play important roles in inflammation, platelet aggregation, and vasoconstriction/relaxation, among many others. They act locally by preferentially interacting with G protein-coupled receptors designated DP, EP, FP, IP, and TP, respectively. The phylogenetic tree suggests that the prostanoid receptors can be grouped into two major branches: G(s)-coupled (DP1, EP2, EP4, and IP) and G(i)- (EP3) or G(q)-coupled (EP1, FP, and TP), forming three clusters.	288
341318	cd14982	7tmA_purinoceptor-like	purinoceptor and its related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. Members of this subfamily include lysophosphatidic acid receptor, P2 purinoceptor, protease-activated receptor, platelet-activating factor receptor, Epstein-Barr virus induced gene 2, proton-sensing G protein-coupled receptors, GPR35, and GPR55, among others. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	283
320114	cd14983	7tmA_FFAR	free fatty acid receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. This subfamily includes the free fatty acid receptors (FFARs) which bind free fatty acids (FFAs). They belong to the class A G-protein coupled receptors and are composed of three members, each encoded by a separate gene (FFAR1, FFAR2, and FFAR3). These genes and a fourth pseudogene, GPR42, are localized together on chromosome 19. FFAR1 is a receptor for medium- and long-chain FFAs, whereas FFAR2 and FFAR3 are receptors for short chain FFAs (SCFAs), which have different ligand affinities. FFAR1 directly mediates FFA stimulation of glucose-stimulated insulin secretion and also indirectly increases insulin secretion by enhancing the release of incretin. FFAR2 activation by SCFA suppresses adipose insulin signaling, which leads to the inhibition of fat accumulation in adipose tissue. FFAR3 is expressed in intestinal L cells, which produce glucagon-like peptide 1 (GLP-1) and peptide YY (PYY), suggesting that this receptor may be involved in energy homeostasis. FFARs are considered important components of the body's nutrient sensing mechanism, and therefore, these receptors are potential therapeutic targets for the treatment of metabolic disorders, such as type 2 diabetes and obesity.	278
341319	cd14984	7tmA_Chemokine_R	classical and atypical chemokine receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Chemokines are principal regulators for leukocyte trafficking, recruitment, and activation. Chemokine family membership is defined on the basis of sequence homology and on the presence of variations on a conserved cysteine motif, which allows the family to further divide into four subfamilies (CC, CXC, XC, and CX3C). Chemokines interact with seven-transmembrane receptors which are typically coupled to G protein for signaling. Currently, there are ten known receptors for CC chemokines, seven for CXC chemokines, and single receptors for the XC and CX3C chemokines. In addition to these classical chemokine receptors, there exists a subfamily of atypical chemokine receptors (ACKRs) that are unable to couple to G-proteins and, instead, they preferentially mediate beta-arrestin dependent processes, such as receptor internalization, after ligand binding. The classical chemokine receptors contain a conserved DRYLAIV motif in the second intracellular loop, which is required for G-protein coupling. However, the ACKRs lack this conserved motif and fail to couple to G-proteins and induce classical GPCR signaling. Five receptors have been identified for the ACKR family, including CC-chemokine receptors like 1 and 2 (CCRL1 and CCRL2), CXCR7, Duffy antigen receptor for chemokine (DARC), and D6. Both ACKR1 (DARC) and ACKR3 (CXCR7) show low sequence homology to the classic chemokine receptors.	278
341320	cd14985	7tmA_Angiotensin_R-like	angiotensin receptor family and its related G protein-coupled receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes the angiotensin receptors, the bradykinin receptors, apelin receptor as well as putative G-protein coupled receptors (GPR15 and GPR25). Angiotensin II (Ang II), the main effector in the renin-angiotensin system, plays a crucial role in the regulation of cardiovascular homeostasis through its type 1 (AT1) and type 2 (AT2) receptors. Ang II contributes to cardiovascular diseases such as hypertension and atherosclerosis via AT1R activation. Ang II increases blood pressure through Gq-mediated activation of phospholipase C, resulting in phosphoinositide (PI) hydrolysis and increased intracellular calcium levels. Through the AT2 receptor, Ang II counteracts the vasoconstrictor action of AT1R and thereby induces vasodilation, sodium excretion, and reduction of blood pressure. Bradykinins (BK) are pro-inflammatory peptides that mediate various vascular and pain responses to tissue injury through their B1 and B2 receptors. Apelin (APJ) receptor binds the endogenous peptide ligands, apelin and Toddler/Elabela. Apelin is an adipocyte-derived hormone that is ubiquitously expressed throughout the human body, and Toddler/Elabela is a short secretory peptide that is required for normal cardiac development in zebrafish. Activation of APJ receptor plays key roles in diverse physiological processes including vasoconstriction and vasodilation, cardiac muscle contractility, angiogenesis, and regulation of water balance and food intake. Orphan receptors, GPR15 and GPR25, share strong sequence homology to the angiotensin II type AT1 and AT2 receptors.	284
320117	cd14986	7tmA_Vasopressin-like	vasopressin receptors and their related G protein-coupled receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Members of this group form a subfamily within the class A G-protein coupled receptors (GPCRs), which includes the vasopressin and oxytocin receptors, the gonadotropin-releasing hormone receptors (GnRHRs), the neuropeptide S receptor (NPSR), and orphan GPR150. These receptors share significant sequence homology with each other, suggesting that they have a common evolutionary origin. Vasopressin, also known as arginine vasopressin or anti-diuretic hormone, is a neuropeptide synthesized in the hypothalamus. The actions of vasopressin are mediated by the interaction of this hormone with three tissue-specific subtypes: V1AR, V1BR, and V2R. Although vasopressin differs from oxytocin by only two amino acids, they have divergent physiological functions. Vasopressin is involved in regulating osmotic and cardiovascular homeostasis, whereas oxytocin plays an important role in the uterus during childbirth and in lactation. GnRHR, also known as luteinizing hormone releasing hormone receptor (LHRHR), plays a central role in vertebrate reproductive function; its activation by binding to GnRH leads to the release of follicle stimulating hormone (FSH) and luteinizing hormone (LH) from the pituitary gland. Neuropeptide S (NPS) promotes arousal and anxiolytic-like effects by activating its cognate receptor NPSR. NPSR has also been associated with asthma and allergy. GPR150 is an orphan receptor closely related to the oxytocin and vasopressin receptors.	295
320118	cd14987	7tmA_ACKR3_CXCR7	CXC chemokine receptor 7, member of the class A family of seven-transmembrane G protein-coupled receptors. ACKR3, also known as CXCR7, is an atypical chemokine receptor for CXCL12 and CXCL11. Unlike the classical chemokine receptors, ACKR3 contains a DRYLSIT-sequence instead of the conserved DRYLAIV motif in the second intracellular loop, which is required for G-protein coupling. Thus, ACKR3 does not activate classical GPCR signaling but instead induces beta-arrestin recruitment, which leads to ligand internalization and MAP-kinase activation. It acts as a scavenger for CXCL12 and, to a lesser degree, for CXCL11. ACKR3 is highly expressed by blood vascular endothelial cells in brain, in numerous embryonic and neonatal tissues, in inflamed tissues and in a variety of cancers such as lymphomas, sarcomas, prostate and breast cancers, and gliomas. Five receptors have been identified for the ACKR family, including CC-Chemokine Receptors like 1 and 2 (CCRL1 and CCRL2), CXCR7, DARC, and D6. Both ACKR1 (DARC) and ACKR3 (CXCR7) show low sequence homology to the classic chemokine receptors.	282
320119	cd14988	7tmA_GPR182	G protein-coupled receptor 182, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR182 is an orphan G-protein coupled receptor that belongs to the class A of seven-transmembrane GPCR superfamily. When GPR182 gene was first cloned, it was proposed to encode an adrenomedullin receptor. However when the corresponding protein was expressed, it was found not to respond to adrenomedullin (ADM). All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	278
320120	cd14989	7tmA_GPER1	G protein-coupled estrogen receptor 1, member of the class A family of seven-transmembrane G protein-coupled receptors. G-protein coupled estrogen receptor 1 (GPER1), also known as the G-protein coupled receptor 30 (GPR30), is a high affinity receptor for estrogen. This receptor is a member of the class A of seven-transmembrane GPCRs. Estrogen binding results in intracellular calcium mobilization and synthesis of phosphatidylinositol (3,4,5)-trisphosphate in the nucleus. GPR30 plays an important role in development of tamoxifen resistance in breast cancer cells. The distribution of GPR30 is well established in the rodent, with high expression observed in the hypothalamus, pituitary gland, adrenal medulla, kidney medulla and developing follicles of the ovary. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	276
320121	cd14990	7tmA_GPR146	G protein-coupled receptor 146, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR146 is an orphan G-protein coupled receptor that belongs to the class A of seven-transmembrane GPCR superfamily. The endogenous ligand for GPR146 is not known. It has been suggested that GPR146 may be a part of the C-peptide signaling complex. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	280
320122	cd14991	7tmA_HCAR-like	hydroxycarboxylic acid receptors and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes the hydroxycarboxylic acid receptors (HCARs) as well as their closely related receptors, GPR31 and oxoeicosanoid receptor 1 (OXER1). HCARs are members of the class A family of G-protein coupled receptors (GPCRs). The HCAR subfamily contains three receptor subtypes: HCAR1, HCAR2, and HCAR3. The endogenous ligand of HCAR1 (also known as lactate receptor 1, GPR104, or GPR81) is L-lactic acid. The endogenous ligands of HCAR2 (also known as niacin receptor 1, GPR109A, or nicotinic acid receptor) and HCAR3 (also known as niacin receptor 2, or GPR109B) are 3-hydroxybutyric acid and 3-hydroxyoctanoic acid, respectively. All three HCA receptors are expressed in adipocytes, and are coupled to G(i)-proteins mediating anti-lipolytic effects in fat cells. OXER1 is a receptor for eicosanoids and polyunsaturated fatty acids such as 5-oxo-6E,8Z,11Z,14Z-eicosatetraenoic acid (5-OXO-ETE), 5(S)-hydroperoxy-6E,8Z,11Z,14Z-eicosatetraenoic acid (5(S)-HPETE) and arachidonic acid, whereas GPR31 is a high-affinity receptor for 12-(S)-hydroxy-5,8,10,14-eicosatetraenoic acid (12-S-HETE).	280
320123	cd14992	7tmA_TACR_family	tachykinin receptor and closely related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This subfamily includes G-protein coupled receptors for a variety of neuropeptides of the tachykinin (TK) family as well as closely related receptors. The tachykinins are widely distributed throughout the mammalian central and peripheral nervous systems and act as excitatory transmitters on neurons and cells in the gastrointestinal tract. The TKs are characterized by a common five-amino acid C-terminal sequence, Phe-X-Gly-Leu-Met-NH2, where X is a hydrophobic residue. The three major mammalian tachykinins are substance P (SP), neurokinin A (NKA), and neurokinin B (NKB). The physiological actions of tachykinins are mediated through three types of receptors: neurokinin receptor type 1 (NK1R), NK2R, and NK3R.  SP is a high-affinity endogenous ligand for NK1R, which interacts with the Gq protein and activates phospholipase C, leading to elevation of intracellular calcium. NK2R is a high-affinity receptor for NKA, the tachykinin neuropeptide substance K. SP and NKA are found in the enteric nervous system and mediate in the regulation of gastrointestinal motility, secretion, vascular permeability, and pain perception. NK3R is activated by its high-affinity ligand, NKB, which is primarily involved in the central nervous system and plays a critical role in the regulation of gonadotropin hormone release and the onset of puberty.	291
320124	cd14993	7tmA_CCKR-like	cholecystokinin receptors and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group represents four G-protein coupled receptors that are members of the RFamide receptor family, including cholecystokinin receptors (CCK-AR and CCK-BR), orexin receptors (OXR), neuropeptide FF receptors (NPFFR), and pyroglutamylated RFamide peptide receptor (QRFPR). These RFamide receptors are activated by their endogenous peptide ligands that share a common C-terminal arginine (R) and an amidated phenylalanine (F) motif. CCK-AR (type A, alimentary; also known as CCK1R) is found abundantly on pancreatic acinar cells and binds only sulfated CCK-peptides with very high affinity, whereas CCK-BR (type B, brain; also known as CCK2R), the predominant form in the brain and stomach, binds CCK or gastrin and discriminates poorly between sulfated and non-sulfated peptides. CCK is implicated in regulation of digestion, appetite control, and body weight, and is involved in neurogenesis via CCK-AR. There is some evidence to support that CCK and gastrin, via their receptors, are involved in promoting cancer development and progression, acting as growth and invasion factors. Orexins (OXs; also referred to as hypocretins) are neuropeptide hormones that regulate the sleep-wake cycle and potently influence homeostatic systems regulating appetite and feeding behavior or modulating emotional responses such as anxiety or panic. OXs are synthesized as prepro-orexin (PPO) in the hypothalamus and then proteolytically cleaved into two isoforms: orexin-A (OX-A) and orexin-B (OX-B). OX-A is a 33 amino-acid peptide with N-terminal pyroglutamyl residue and two intramolecular disulfide bonds, whereas OX-B is a 28 amino-acid linear peptide with no disulfide bonds. OX-A binds orexin receptor 1 (OX1R) with high-affinity, but also binds with somewhat low-affinity to OX2R, and signals primarily through Gq coupling, whereas OX-B shows a strong preference for the orexin receptor 2 (OX2R) and signals through Gq or Gi/o coupling. The 26RFa, also known as QRFP (Pyroglutamylated RFamide peptide), is a 26-amino acid residue peptide that exerts similar orexigenic activity including the regulation of feeding behavior in mammals. It is the ligand for G-protein coupled receptor 103 (GPR103), which is predominantly expressed in paraventricular (PVN) and ventromedial (VMH) nuclei of the hypothalamus. GPR103 shares significant protein sequence homology with orexin receptors (OX1R and OX2R), which have recently been shown to produce a neuroprotective effect in Alzheimer's disease by forming a functional heterodimer with GPR103. Neuropeptide FF (NPFF) is a mammalian octapeptide that has been implicated in a wide range of physiological functions in the brain including pain sensitivity, insulin release, food intake, memory, blood pressure, and opioid-induced tolerance and hyperalgesia. The effects of NPFF are mediated through neuropeptide FF1 and FF2 receptors (NPFF1-R and NPFF2-R) which are predominantly expressed in the brain. NPFF induces pro-nociceptive effects, mainly through the NPFF1-R, and anti-nociceptive effects, mainly through the NPFF2-R.	296
320125	cd14994	7tmA_GPR141	orphan G protein-coupled receptor 141, member of the class A family of seven-transmembrane G protein-coupled receptors. This subgroup represents the G-protein coupled receptor 141 of unknown function. Several ESTs for GPR141 were found in marrow and cancer cells. GPR141 is a member of the rhodopsin-like, class A GPCRs, which is a widespread protein family that includes the light-sensitive rhodopsin as well as receptors for biogenic amines, lipids, nucleotides, odorants, peptide hormones, and a variety of other ligands. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	275
320126	cd14995	7tmA_TRH-R	thyrotropin-releasing hormone receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. TRH-R is a member of the class A rhodopsin-like G protein-coupled receptors, which binds the tripeptide thyrotropin releasing hormone. The TRH-R activates phosphoinositide metabolism through a pertussis-toxin-insensitive G-protein, the G(q)/G(11) class. TRH stimulates the synthesis and release of thyroid-stimulating hormone in the anterior pituitary. TRH is produced in many other tissues, especially within the nervous system, where it appears to act as a neurotransmitter/neuromodulator. It also stimulates the synthesis and release of prolactin. In the CNS, TRH stimulates a number of behavioral and pharmacological actions, including increased turnover of catecholamines in the nucleus accumbens. There are two thyrotropin-releasing hormone receptors in some mammals: thyrotropin-releasing hormone receptor 1 (TRH1), which has been found in a number of species including rat, mouse, and human, and thyrotropin-releasing hormone receptor 2 (TRH2), which has only been found in rodents. These TRH receptors are found in high levels in the anterior pituitary, and are also found in the retina and in certain areas of the brain.	269
320127	cd14996	7tmA_GPR82	orphan G protein-coupled receptor 82, member of the class A family of seven-transmembrane G protein-coupled receptors. This subgroup represents the G-protein coupled receptor 82 of unknown function. GPR82 is a member of the rhodopsin-like, class A GPCRs, which is a widespread protein family that includes the light-sensitive rhodopsin as well as receptors for biogenic amines, lipids, nucleotides, odorants, peptide hormones, and a variety of other ligands. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	305
320128	cd14997	7tmA_ETH-R	ecdysis-triggering hormone receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. This subgroup represents the ecdysis-triggering hormone receptors found in insects, which are members of the class A family of seven-transmembrane G-protein coupled receptors. Ecdysis-triggering hormones are vital regulatory signals that govern the stereotypic physiological sequence leading to cuticle shedding in insects. Thus, the ETH signaling system has been a target for the design of more sophisticated insect-selective pest control strategies. Two subtypes of ecdysis-triggering hormone receptor were identified in Drosophila melanogaster. Blood-borne ecdysis-triggering hormone (ETH) activates the behavioral sequence through direct actions on the central nervous system. In insects, ecdysis is thought to be controlled by the interaction between peptide hormones; in particular between ecdysis-triggering hormone (ETH) from the periphery and eclosion hormone (EH) and crustacean cardioactive peptide (CCAP) from the central nervous system. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	294
320129	cd14998	7tmA_GPR153_GPR162-like	orphan G protein-coupled receptors 153 and 162, member of the class A family of seven-transmembrane G protein-coupled receptors. This group contains the G-protein coupled receptor 153 (GPR153), GPR162, and similar proteins. These are orphan GPCRs with unknown endogenous ligand and function. GPR153 and GPR162 are widely expressed in the central nervous system (CNS) and share a common evolutionary ancestor due to a gene duplication event. Although categorized as members of the rhodopsin-like class A GPCRs, both GPR162 and GPR153 contain an HRM-motif instead of the highly conserved Asp-Arg-Tyr (DRY) motif found in the third transmembrane helix (TM3) of class A receptors which is important for efficient G protein-coupled signal transduction. Moreover, the LPxF motif, a variant of NPxxY motif that plays a crucial role during receptor activation, is found at the end of TM7 in both GPR162 and GPR153. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	301
320130	cd14999	7tmA_UII-R	urotensin-II receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. The urotensin-II receptor (UII-R, also known as the hypocretin receptor) is a member of the class A rhodopsin-like G-protein coupled receptors, which binds the peptide hormone urotensin-II. Urotensin II (UII) is a vasoactive somatostatin-like or cortistatin-like peptide hormone. However, despite the apparent structural similarity to these peptide hormones, they are not homologous to UII. Urotensin II was first identified in fish spinal cord, but later found in humans and other mammals. In fish, UII is secreted at the back part of the spinal cord, in a neurosecretory centre called uroneurapophysa, and is involved in the regulation of the renal and cardiovascular systems. In mammals, urotensin II is the most potent mammalian vasoconstrictor identified to date and causes contraction of arterial blood vessels, including the thoracic aorta. The urotensin II receptor is a rhodopsin-like G-protein coupled receptor, which binds urotensin-II. The receptor was previously known as GPR14, or sensory epithelial neuropeptide-like receptor (SENR). The UII receptor is expressed in the CNS (cerebellum and spinal cord), skeletal muscle, pancreas, heart, endothelium and vascular smooth muscle. It is involved in the pathophysiological control of cardiovascular function and may also influence CNS and endocrine functions. Binding of urotensin II to the receptor leads to activation of phospholipase C, through coupling to G(q/11) family proteins. The resulting increase in intracellular calcium may cause the contraction of vascular smooth muscle.	282
320131	cd15000	7tmA_BNGR-A34-like	putative neuropeptide receptor BNGR-A34 and similar proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This subgroup includes putative neuropeptide receptor BNGR-A34 found in silkworm and its closely related proteins from invertebrates. They are members of the class A rhodopsin-like GPCRs, which represent a widespread protein family that includes the light-sensitive rhodopsin as well as receptors for biogenic amines, lipids, nucleotides, odorants, peptide hormones, and a variety of other ligands. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	285
320132	cd15001	7tmA_GPRnna14-like	GPRnna14 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes the orphan G-protein coupled receptor GPRnna14 found in body louse (Pediculus humanus humanus) as well as its closely related proteins of unknown function. These receptors are members of the class A rhodopsin-like G-protein coupled receptors. As an obligatory parasite of humans, the body louse is an important vector for human diseases, including epidemic typhus, relapsing fever, and trench fever. GPRnna14 shares significant sequence similarity with the members of the neurotensin receptor family. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	266
320133	cd15002	7tmA_GPR151	G protein-coupled receptor 151, member of the class A family of seven-transmembrane G protein-coupled receptors. G-protein coupled receptor 151 (GPR151) is an orphan receptor of unknown function. Its expression is conserved in habenular axonal projections of vertebrates, and it may be a promising novel target for psychiatric drug development. GPR151 shows high sequence similarity with galanin receptors (GALR). GPR151 is a member of the class A rhodopsin-like GPCRs, which represent a widespread protein family that includes the light-sensitive rhodopsin as well as receptors for biogenic amines, lipids, nucleotides, odorants, peptide hormones, and a variety of other ligands. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	280
320134	cd15005	7tmA_SREB-like	super conserved receptor expressed in brain and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. The SREB (super conserved receptor expressed in brain) subfamily consists of at least three members, named SREB1 (GPR27), SREB2 (GPR85), and SREB3 (GPR173). They are very highly conserved G protein-coupled receptors throughout vertebrate evolution, however no endogenous ligands have yet been identified. SREB2 is greatly expressed in brain regions involved in psychiatric disorders and cognition, such as the hippocampal dentate gyrus. Genetic studies in both humans and mice have shown that SREB2 influences brain size and negatively regulates hippocampal adult neurogenesis and neurogenesis-dependent cognitive function, all of which are suggesting a potential link between SREB2 and schizophrenia.  All three SREB genes are highly expressed in differentiated hippocampal neural stem cells. Furthermore, all GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	329
320135	cd15006	7tmA_GPR176	orphan G protein-coupled receptor 176, member of the rhodopsin-like class A GPCR family. GPR176 is a putative G protein-coupled receptor that belongs to the class A GPCR superfamily; no endogenous ligand for GPR176 has yet been identified. The class A rhodopsin-like GPCRs represent a widespread protein family that includes the light-sensitive rhodopsin as well as receptors for biogenic amines, lipids, nucleotides, odorants, peptide hormones, and a variety of other ligands. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	289
320136	cd15007	7tmA_GPR75	G protein-coupled receptor 75, member of the class A family of seven-transmembrane G protein-coupled receptors. G-protein coupled receptor 75 (GPR75) is an atypical chemokine receptor that is expressed by mouse and human islets. Although GPR75 shows low sequence homology to C-C chemokine receptors, chemokine (C-C motif) ligand 5 (CCL5) has been shown to act as an endogenous ligand for GPR75. CCL5 plays a key role in recruiting lymphocytes to sites of inflammation and infection through promiscuous binding to the C-C chemokine G-protein-coupled receptors. Although categorized as a member of the rhodopsin-like class A GPCRs, GPR75 contains an HRL-motif instead of the highly conserved Asp-Arg-Tyr (DRY) motif found in the third transmembrane helix (TM3) of class A receptors, which is important for efficient G protein-coupled signal transduction. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, which then activate the heterotrimeric G proteins. GPR75 is coupled to the G-protein G(q), which elevates intracellular calcium.	261
320137	cd15008	7tmA_GPR19	G protein-coupled receptor 19, member of the class A family of seven-transmembrane G protein-coupled receptors. G-protein coupled receptor 19 is an orphan receptor that is expressed predominantly in neuronal cells during mouse embryogenesis. Its mRNA is found frequently over-expressed in patients with small cell lung cancer. GPR19 shares a significant amino acid sequence identity with the D2 dopamine and neuropeptide Y families of receptors. Human GPR19 gene, intronless in the coding region, also has a distribution in brain overlapping that of the D2 dopamine receptor gene, and is located on chromosome 12. GPR19 is a member of the class A family of GPCRs, which represents a widespread protein family that includes the light-sensitive rhodopsin as well as receptors for biogenic amines, lipids, nucleotides, odorants, peptide hormones, and a variety of other ligands. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	275
410631	cd15010	7tmA_ACKR1_DARC	Duffy antigen receptor for chemokines, member of the class A family of seven-transmembrane G protein-coupled receptors. Atypical chemokine receptor 1 (ACKR1), also known as DARC (Duffy antigen receptor for chemokines) or Fy glycoprotein (GpFy), was originally identified on erythrocytes. ACKR1 is also ubiquitously expressed by endothelial cells of venules and is highly promiscuous among all chemokine receptors. It binds many proinflammatory chemokines from both the CC and CXC subfamilies, including CCL2, CCL5, CCL7, CCL11, CXCL1, CXCL2, CXCL3, and CXCL5. Erythrocyte ACKR1 is thought to act as a chemokine sink, limiting the levels of circulating chemokines, thereby controlling leukocyte activation. ACKR1-deficient erythrocytes are shown to confer resistance to the malarial parasite, Plasmodium vivax. On the other hand, ACKR1-expressing endothelial cells can internalize chemokines. ACKR1-internalized chemokines can be moved intact across the endothelium and promote neutrophil transmigration. Unlike the classical chemokine receptors that contain a conserved DRYLAIV motif in the second intracellular loop, which is required for G-protein coupling, the ACKRs lack this conserved motif and fail to couple to G-proteins and induce classical GPCR signaling. Five receptors have been identified for the ACKR family, including CC-Chemokine Receptors like 1 and 2 (CCRL1 and CCRL2), CXCR7, DARC, and D6. Both ACKR1 (DARC) and ACKR3 (CXCR7) show low sequence homology to the classic chemokine receptors.	257
320139	cd15011	7tmA_GPR149	G protein-coupled receptor 149, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR149 is predominantly expressed in the ovary and is present at low levels in the brain and the digestive tract (stomach and small intestine). GPR149-null mice are viable and have normal maturation of the ovarian follicle, but show enhanced fertility and ovulation. Additionally, the null mice showed increased expression levels of growth differentiation factor 9 (Gdf9) in oocytes, and upregulated expression of cyclin D2, a downstream target of FSH (follicle-stimulating hormone) receptor signaling pathways that promotes granulosa cell proliferation. GPR149 is an orphan receptor with no known endogenous ligand as yet identified. Although categorized as a member of the class A GPCRs, GPR149 lacks the first two charged amino acids of the highly conserved Asp-Arg-Tyr (DRY) motif found in the third transmembrane helix (TM3) of class A receptors which is important for efficient G protein-coupled signal transduction. Moreover, the transmembrane domains and carboxyl terminus of GPR149 show low similarities to other GPCRs. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	256
320140	cd15012	7tmA_Trissin_R	trissin receptor and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This subgroup represents the Drosophila melanogaster trissin receptor and closely related invertebrate proteins which are a member of the class A family of seven-transmembrane G-protein coupled receptors. The cysteine-rich trissin has been shown to be an endogenous ligand for the orphan CG34381 in Drosophila melanogaster. Trissin is a peptide composed of 28 amino acids with three intrachain disulfide bonds with no significant structural similarities to known endogenous peptides. Cysteine-rich peptides are known to have antimicrobial or toxicant activities, although frequently their mechanism of action is poorly understood. Since the expression of trissin and its receptor is reported to predominantly localize to the brain and thoracicoabdominal ganglion, trissin is predicted to behave as a neuropeptide. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	277
320141	cd15013	7tm_TAS2R4	mammalian taste receptor 2, subtype 4, member of the seven-transmembrane G protein-coupled receptor superfamily. This group includes the mammalian taste receptor 2 (TAS2R) subtype 4, which functions as a bitter taste receptor. The human TAS2R family contains about 25 functional members, which are glycoproteins and have the ability to form both homomeric and heteromeric receptor complexes. Five basic tastes are perceived by animals: bitter, sweet, sour, salty, and umami (the taste of glutamate, MSG). Among these, sour and salty are mediated by ion channels, while the perception of umami and sweet tastes is mediated by the TAS1R taste receptors, which belong to the class C GPCR family. The TAS2Rs in humans have a short extracellular N-terminus and the ligand binds within the transmembrane domain, whereas the TAS1Rs have a large N-terminal extracellular domain composed of the Venus flytrap module that forms the orthosteric (primary) ligand binding site. Signal transduction of bitter taste involves binding of bitter compounds to TAS2Rs linked to the alpha-subunit of gustducin, a heterotrimeric G protein expressed in taste receptor cells. This G-alpha subunit stimulates phosphodiesterase and decreases cAMP and cGMP levels. Further steps in the signaling cascade is still unknown. The beta-gamma-subunit of gustducin also mediates bitter taste transduction by activating phospholipase C, which leads to an increased formation of IP3 (inositol triphosphate) and DAG (diacylglycerol), thereby causing release of Ca2+ from intracellular stores and enhanced neurotransmitter release.	286
320142	cd15014	7tm_TAS2R40	mammalian taste receptor 2, subtype 40, member of the seven-transmembrane G-protein coupled receptor superfamily. This group includes the mammalian taste receptor 2 (T2R) subtype 40, which functions as a bitter taste receptor. The human TAS2R family contains about 25 functional members, which are glycoproteins and have the ability to form both homomeric and heteromeric receptor complexes. Five basic tastes are perceived by animals: bitter, sweet, sour, salty, and umami (taste of glutamate MSG). Among these, sour and salty are mediated by ion channels, while the perception of umami and sweet tastes is mediated by the TAS1R taste receptors, which belong to the class C GPCR family. The TAS2Rs in humans have a short extracellular N-terminus and the ligand binds within the transmembrane domain, whereas the TAS1Rs have a large N-terminal extracellular domain composed of the Venus flytrap module that forms the orthosteric (primary) ligand binding site. Signal transduction of bitter taste involves binding of bitter compounds to TAS2Rs linked to the alpha-subunit of gustducin, a heterotrimeric G protein expressed in taste receptor cells. This G-alpha subunit stimulates phosphodiesterase and decreases cAMP and cGMP levels. Further steps in the signaling cascade is still unknown. The beta-gamma-subunit of gustducin also mediates bitter taste transduction by activating phospholipase C, which leads to an increased formation of IP3 (inositol triphosphate) and DAG (diacylglycerol), thereby causing release of Ca2+ from intracellular stores and enhanced neurotransmitter release.	290
320143	cd15015	7tm_TAS2R39	mammalian taste receptor 2, subtype 39, member of the seven-transmembrane G protein-coupled receptor superfamily. This group includes the mammalian taste receptor 2 (T2R) subtype 39, which functions as a bitter taste receptor. The human TAS2R family contains about 25 functional members, which are glycoproteins and have the ability to form both homomeric and heteromeric receptor complexes. Five basic tastes are perceived by animals: bitter, sweet, sour, salty, and umami (taste of glutamate MSG). Among these, sour and salty are mediated by ion channels, while the perception of umami and sweet tastes is mediated by the TAS1R taste receptors, which belong to the class C GPCR family. The TAS2Rs in humans have a short extracellular N-terminus and the ligand binds within the transmembrane domain, whereas the TAS1Rs have a large N-terminal extracellular domain composed of the Venus flytrap module that forms the orthosteric (primary) ligand binding site. Signal transduction of bitter taste involves binding of bitter compounds to TAS2Rs linked to the alpha-subunit of gustducin, a heterotrimeric G protein expressed in taste receptor cells. This G-alpha subunit stimulates phosphodiesterase and decreases cAMP and cGMP levels. Further steps in the signaling cascade is still unknown. The beta-gamma-subunit of gustducin also mediates bitter taste transduction by activating phospholipase C, which leads to an increased formation of IP3 (inositol triphosphate) and DAG (diacylglycerol), thereby causing release of Ca2+ from intracellular stores and enhanced neurotransmitter release.	289
320144	cd15016	7tm_TAS2R1	mammalian taste receptor 2, subtype 1, member of the seven-transmembrane G protein-coupled receptor superfamily. This group includes the mammalian taste receptor 2 (TAS2R) subtype 1, which functions as a bitter taste receptor. The human TAS2R family contains about 25 functional members, which are glycoproteins and have the ability to form both homomeric and heteromeric receptor complexes. Five basic tastes are perceived by animals: bitter, sweet, sour, salty, and umami (the taste of glutamate, MSG). Among these, sour and salty are mediated by ion channels, while the perception of umami and sweet tastes is mediated by the TAS1R taste receptors, which belong to the class C GPCR family. The TAS2Rs in humans have a short extracellular N-terminus and the ligand binds within the transmembrane domain, whereas the TAS1Rs have a large N-terminal extracellular domain composed of the Venus flytrap module that forms the orthosteric (primary) ligand binding site. Signal transduction of bitter taste involves binding of bitter compounds to TAS2Rs linked to the alpha-subunit of gustducin, a heterotrimeric G protein expressed in taste receptor cells. This G-alpha subunit stimulates phosphodiesterase and decreases cAMP and cGMP levels. Further steps in the signaling cascade is still unknown. The beta-gamma-subunit of gustducin also mediates bitter taste transduction by activating phospholipase C, which leads to an increased formation of IP3 (inositol triphosphate) and DAG (diacylglycerol), thereby causing release of Ca2+ from intracellular stores and enhanced neurotransmitter release.	283
320145	cd15017	7tm_TAS2R16	mammalian taste receptor 2, subtype 16, member of the seven-transmembrane G protein-coupled receptor superfamily. This group includes the mammalian taste receptor 2 (TAS2R) subtype 16, which functions as a bitter taste receptor. The human TAS2R family contains about 25 functional members, which are glycoproteins and have the ability to form both homomeric and heteromeric receptor complexes. Five basic tastes are perceived by animals: bitter, sweet, sour, salty, and umami (the taste of glutamate, MSG). Among these, sour and salty are mediated by ion channels, while the perception of umami and sweet tastes is mediated by the TAS1R taste receptors, which belong to the class C GPCR family. The TAS2Rs in humans have a short extracellular N-terminus and the ligand binds within the transmembrane domain, whereas the TAS1Rs have a large N-terminal extracellular domain composed of the Venus flytrap module that forms the orthosteric (primary) ligand binding site. Signal transduction of bitter taste involves binding of bitter compounds to TAS2Rs linked to the alpha-subunit of gustducin, a heterotrimeric G protein expressed in taste receptor cells. This G-alpha subunit stimulates phosphodiesterase and decreases cAMP and cGMP levels. Further steps in the signaling cascade is still unknown. The beta-gamma-subunit of gustducin also mediates bitter taste transduction by activating phospholipase C, which leads to an increased formation of IP3 (inositol triphosphate) and DAG (diacylglycerol), thereby causing release of Ca2+ from intracellular stores and enhanced neurotransmitter release.	285
320146	cd15018	7tm_TAS2R41-like	mammalian taste receptor 2, subtype 41, member of the seven-transmembrane G protein-coupled receptor superfamily. This group includes the mammalian taste receptor 2 (TAS2R) subtype 41, which functions as a bitter taste receptor. Also included is the closely related TAS2R60. The human TAS2R family contains about 25 functional members, which are glycoproteins and have the ability to form both homomeric and heteromeric receptor complexes. Five basic tastes are perceived by animals: bitter, sweet, sour, salty, and umami (the taste of glutamate, MSG). Among these, sour and salty are mediated by ion channels, while the perception of umami and sweet tastes is mediated by the TAS1R taste receptors, which belong to the class C GPCR family. The TAS2Rs in humans have a short extracellular N-terminus and the ligand binds within the transmembrane domain, whereas the TAS1Rs have a large N-terminal extracellular domain composed of the Venus flytrap module that forms the orthosteric (primary) ligand binding site. Signal transduction of bitter taste involves binding of bitter compounds to TAS2Rs linked to the alpha-subunit of gustducin, a heterotrimeric G protein expressed in taste receptor cells. This G-alpha subunit stimulates phosphodiesterase and decreases cAMP and cGMP levels. Further steps in the signaling cascade is still unknown. The beta-gamma-subunit of gustducin also mediates bitter taste transduction by activating phospholipase C, which leads to an increased formation of IP3 (inositol triphosphate) and DAG (diacylglycerol), thereby causing release of Ca2+ from intracellular stores and enhanced neurotransmitter release.	290
320147	cd15019	7tm_TAS2R14-like	mammalian taste receptor 2, subtype 14, member of the seven-transmembrane G protein-coupled receptor superfamily. This group includes the mammalian taste receptor 2 (TAS2R) subtype 14, which functions as a bitter taste receptor. The human TAS2R family contains about 25 functional members, which are glycoproteins and have the ability to form both homomeric and heteromeric receptor complexes. Five basic tastes are perceived by animals: bitter, sweet, sour, salty, and umami (the taste of glutamate, MSG). Among these, sour and salty are mediated by ion channels, while the perception of umami and sweet tastes is mediated by the TAS1R taste receptors, which belong to the class C GPCR family. The TAS2Rs in humans have a short extracellular N-terminus and the ligand binds within the transmembrane domain, whereas the TAS1Rs have a large N-terminal extracellular domain composed of the Venus flytrap module that forms the orthosteric (primary) ligand binding site. Signal transduction of bitter taste involves binding of bitter compounds to TAS2Rs linked to the alpha-subunit of gustducin, a heterotrimeric G protein expressed in taste receptor cells. This G-alpha subunit stimulates phosphodiesterase and decreases cAMP and cGMP levels. Further steps in the signaling cascade is still unknown. The beta-gamma-subunit of gustducin also mediates bitter taste transduction by activating phospholipase C, which leads to an increased formation of IP3 (inositol triphosphate) and DAG (diacylglycerol), thereby causing release of Ca2+ from intracellular stores and enhanced neurotransmitter release.	290
320148	cd15020	7tm_TAS2R3	mammalian taste receptor 2, subtype 3, member of the seven-transmembrane G protein-coupled receptor superfamily. This group includes the mammalian taste receptor 2 (TAS2R) subtype 3, which functions as a bitter taste receptor. The human TAS2R family contains about 25 functional members, which are glycoproteins and have the ability to form both homomeric and heteromeric receptor complexes. Five basic tastes are perceived by animals: bitter, sweet, sour, salty, and umami (the taste of glutamate, MSG). Among these, sour and salty are mediated by ion channels, while the perception of umami and sweet tastes is mediated by the TAS1R taste receptors, which belong to the class C GPCR family. The TAS2Rs in humans have a short extracellular N-terminus and the ligand binds within the transmembrane domain, whereas the TAS1Rs have a large N-terminal extracellular domain composed of the Venus flytrap module that forms the orthosteric (primary) ligand binding site. Signal transduction of bitter taste involves binding of bitter compounds to TAS2Rs linked to the alpha-subunit of gustducin, a heterotrimeric G protein expressed in taste receptor cells. This G-alpha subunit stimulates phosphodiesterase and decreases cAMP and cGMP levels. Further steps in the signaling cascade is still unknown. The beta-gamma-subunit of gustducin also mediates bitter taste transduction by activating phospholipase C, which leads to an increased formation of IP3 (inositol triphosphate) and DAG (diacylglycerol), thereby causing release of Ca2+ from intracellular stores and enhanced neurotransmitter release.	290
320149	cd15021	7tm_TAS2R10	mammalian taste receptor 2, subtype 10, member of the seven-transmembrane G protein-coupled receptor superfamily. This group includes the mammalian taste receptor 2 (TAS2R) subtype 10, which functions as a bitter taste receptor. The human TAS2R family contains about 25 functional members, which are glycoproteins and have the ability to form both homomeric and heteromeric receptor complexes. Five basic tastes are perceived by animals: bitter, sweet, sour, salty, and umami (the taste of glutamate, MSG). Among these, sour and salty are mediated by ion channels, while the perception of umami and sweet tastes is mediated by the TAS1R taste receptors, which belong to the class C GPCR family. The TAS2Rs in humans have a short extracellular N-terminus and the ligand binds within the transmembrane domain, whereas the TAS1Rs have a large N-terminal extracellular domain composed of the Venus flytrap module that forms the orthosteric (primary) ligand binding site. Signal transduction of bitter taste involves binding of bitter compounds to TAS2Rs linked to the alpha-subunit of gustducin, a heterotrimeric G protein expressed in taste receptor cells. This G-alpha subunit stimulates phosphodiesterase and decreases cAMP and cGMP levels. Further steps in the signaling cascade is still unknown. The beta-gamma-subunit of gustducin also mediates bitter taste transduction by activating phospholipase C, which leads to an increased formation of IP3 (inositol triphosphate) and DAG (diacylglycerol), thereby causing release of Ca2+ from intracellular stores and enhanced neurotransmitter release.	285
320150	cd15022	7tm_TAS2R8	mammalian taste receptor 2, subtype 8, member of the seven-transmembrane G protein-coupled receptor superfamily. This group includes the mammalian taste receptor 2 (TAS2R) subtype 8, which functions as a bitter taste receptor. The human TAS2R family contains about 25 functional members, which are glycoproteins and have the ability to form both homomeric and heteromeric receptor complexes. Five basic tastes are perceived by animals: bitter, sweet, sour, salty, and umami (the taste of glutamate, MSG). Among these, sour and salty are mediated by ion channels, while the perception of umami and sweet tastes is mediated by the TAS1R taste receptors, which belong to the class C GPCR family. The TAS2Rs in humans have a short extracellular N-terminus and the ligand binds within the transmembrane domain, whereas the TAS1Rs have a large N-terminal extracellular domain composed of the Venus flytrap module that forms the orthosteric (primary) ligand binding site. Signal transduction of bitter taste involves binding of bitter compounds to TAS2Rs linked to the alpha-subunit of gustducin, a heterotrimeric G protein expressed in taste receptor cells. This G-alpha subunit stimulates phosphodiesterase and decreases cAMP and cGMP levels. Further steps in the signaling cascade is still unknown. The beta-gamma-subunit of gustducin also mediates bitter taste transduction by activating phospholipase C, which leads to an increased formation of IP3 (inositol triphosphate) and DAG (diacylglycerol), thereby causing release of Ca2+ from intracellular stores and enhanced neurotransmitter release.	292
320151	cd15023	7tm_TAS2R7-like	mammalian taste receptor 2, subtypes 7 and 9, member of the seven-transmembrane G protein-coupled receptor superfamily. This group includes the mammalian taste receptor 2 (TAS2R) subtypes 7 and 9, which function as bitter taste receptors. The human TAS2R family contains about 25 functional members, which are glycoproteins and have the ability to form both homomeric and heteromeric receptor complexes. Five basic tastes are perceived by animals: bitter, sweet, sour, salty, and umami (the taste of glutamate, MSG). Among these, sour and salty are mediated by ion channels, while the perception of umami and sweet tastes is mediated by the TAS1R taste receptors, which belong to the class C GPCR family. The TAS2Rs in humans have a short extracellular N-terminus and the ligand binds within the transmembrane domain, whereas the TAS1Rs have a large N-terminal extracellular domain composed of the Venus flytrap module that forms the orthosteric (primary) ligand binding site. Signal transduction of bitter taste involves binding of bitter compounds to TAS2Rs linked to the alpha-subunit of gustducin, a heterotrimeric G protein expressed in taste receptor cells. This G-alpha subunit stimulates phosphodiesterase and decreases cAMP and cGMP levels. Further steps in the signaling cascade is still unknown. The beta-gamma-subunit of gustducin also mediates bitter taste transduction by activating phospholipase C, which leads to an increased formation of IP3 (inositol triphosphate) and DAG (diacylglycerol), thereby causing release of Ca2+ from intracellular stores and enhanced neurotransmitter release.	291
320152	cd15024	7tm_TAS2R42	mammalian taste receptor 2, subtype 42, member of the seven-transmembrane G protein-coupled receptor superfamily. This group includes the mammalian taste receptor 2 (TAS2R) subtype 42, which functions as a bitter taste receptor. The human TAS2R family contains about 25 functional members, which are glycoproteins and have the ability to form both homomeric and heteromeric receptor complexes. Five basic tastes are perceived by animals: bitter, sweet, sour, salty, and umami (the taste of glutamate, MSG). Among these, sour and salty are mediated by ion channels, while the perception of umami and sweet tastes is mediated by the TAS1R taste receptors, which belong to the class C GPCR family. The TAS2Rs in humans have a short extracellular N-terminus and the ligand binds within the transmembrane domain, whereas the TAS1Rs have a large N-terminal extracellular domain composed of the Venus flytrap module that forms the orthosteric (primary) ligand binding site. Signal transduction of bitter taste involves binding of bitter compounds to TAS2Rs linked to the alpha-subunit of gustducin, a heterotrimeric G protein expressed in taste receptor cells. This G-alpha subunit stimulates phosphodiesterase and decreases cAMP and cGMP levels. Further steps in the signaling cascade is still unknown. The beta-gamma-subunit of gustducin also mediates bitter taste transduction by activating phospholipase C, which leads to an increased formation of IP3 (inositol triphosphate) and DAG (diacylglycerol), thereby causing release of Ca2+ from intracellular stores and enhanced neurotransmitter release.	288
320153	cd15025	7tm_TAS2R38	mammalian taste receptor 2, subtype 38, member of the seven-transmembrane G protein-coupled receptor superfamily. This group includes the mammalian taste receptor 2 (T2R) subtype 38, which functions as a bitter taste receptor. Genetic variants of human TAS2R38 influence the ability to taste synthetic compounds 6-n-propylthiouracil (PROP) and phenylthiocarbamide (PTC). Thus, sensitivity to the bitter taste of PROP is often used as a marker for individual differences in taste perception that can affect food preferences and intake. The human TAS2R family contains about 25 functional members, which are glycoproteins and have the ability to form both homomeric and heteromeric receptor complexes. Five basic tastes are perceived by animals: bitter, sweet, sour, salty, and umami (the taste of glutamate, MSG). Among these, sour and salty are mediated by ion channels, while the perception of umami and sweet tastes is mediated by the TAS1R taste receptors, which belong to the class C GPCR family. The TAS2Rs in humans have a short extracellular N-terminus and the ligand binds within the transmembrane domain, whereas the TAS1Rs have a large N-terminal extracellular domain composed of the Venus flytrap module that forms the orthosteric (primary) ligand binding site. Signal transduction of bitter taste involves binding of bitter compounds to TAS2Rs linked to the alpha-subunit of gustducin, a heterotrimeric G protein expressed in taste receptor cells. This G-alpha subunit stimulates phosphodiesterase and decreases cAMP and cGMP levels. Further steps in the signaling cascade is still unknown. The beta-gamma-subunit of gustducin also mediates bitter taste transduction by activating phospholipase C, which leads to an increased formation of IP3 (inositol triphosphate) and DAG (diacylglycerol), thereby causing release of Ca2+ from intracellular stores and enhanced neurotransmitter release.	293
320154	cd15026	7tm_TAS2R13	mammalian taste receptor 2, subtype 13, member of the seven-transmembrane G protein-coupled receptor superfamily. This group includes the mammalian taste receptor 2 (TAS2R) subtype 13, which functions as a bitter taste receptor. The human TAS2R family contains about 25 functional members, which are glycoproteins and have the ability to form both homomeric and heteromeric receptor complexes. Five basic tastes are perceived by animals: bitter, sweet, sour, salty, and umami (the taste of glutamate, MSG). Among these, sour and salty are mediated by ion channels, while the perception of umami and sweet tastes is mediated by the TAS1R taste receptors, which belong to the class C GPCR family. The TAS2Rs in humans have a short extracellular N-terminus and the ligand binds within the transmembrane domain, whereas the TAS1Rs have a large N-terminal extracellular domain composed of the Venus flytrap module that forms the orthosteric (primary) ligand binding site. Signal transduction of bitter taste involves binding of bitter compounds to TAS2Rs linked to the alpha-subunit of gustducin, a heterotrimeric G protein expressed in taste receptor cells. This G-alpha subunit stimulates phosphodiesterase and decreases cAMP and cGMP levels. Further steps in the signaling cascade is still unknown. The beta-gamma-subunit of gustducin also mediates bitter taste transduction by activating phospholipase C, which leads to an increased formation of IP3 (inositol triphosphate) and DAG (diacylglycerol), thereby causing release of Ca2+ from intracellular stores and enhanced neurotransmitter release.	287
320155	cd15027	7tm_TAS2R43-like	mammalian taste receptor 2, subtype 43, and related proteins, member of the seven-transmembrane G protein-coupled receptor superfamily. This group includes the mammalian taste receptor 2 (T2R) subtype 43, which functions as a bitter taste receptor. Also included are the closely related taste receptors TAS2R19, TAS2R20, TAS2R30, TAS2R31, TAS2R45 and TAS2R50. The human TAS2R family contains about 25 functional members, which are glycoproteins and have the ability to form both homomeric and heteromeric receptor complexes. Five basic tastes are perceived by animals: bitter, sweet, sour, salty, and umami (the taste of glutamate, MSG). Among these, sour and salty are mediated by ion channels, while the perception of umami and sweet tastes is mediated by the TAS1R taste receptors, which belong to the class C GPCR family. The TAS2Rs in humans have a short extracellular N-terminus and the ligand binds within the transmembrane domain, whereas the TAS1Rs have a large N-terminal extracellular domain composed of the Venus flytrap module that forms the orthosteric (primary) ligand binding site. Signal transduction of bitter taste involves binding of bitter compounds to TAS2Rs linked to the alpha-subunit of gustducin, a heterotrimeric G protein expressed in taste receptor cells. This G-alpha subunit stimulates phosphodiesterase and decreases cAMP and cGMP levels. Further steps in the signaling cascade is still unknown. The beta-gamma-subunit of gustducin also mediates bitter taste transduction by activating phospholipase C, which leads to an increased formation of IP3 (inositol triphosphate) and DAG (diacylglycerol), thereby causing release of Ca2+ from intracellular stores and enhanced neurotransmitter release.	292
320156	cd15028	7tm_Opsin-1_euk	proton pumping rhodopsins in fungi and algae, member of the seven-transmembrane GPCR superfamily. This subgroup represents uncharacterized proton pumping rhodopsins found in fungi and algae. They belong to the microbial rhodopsin family, also known as type I rhodopsins, consisting of the light-driven inward chloride pump halorhodopsin (HR), the outward proton pump bacteriorhodopsin (BR), the light-gated cation channel channelrhodopsin (ChR), the light-sensor activating transmembrane transducer protein sensory rhodopsin II (SRII), and the other light-driven proton pumps such as blue-light absorbing and green-light absorbing proteorhodopsins, among others. Microbial rhodopsins have been found in various single-celled microorganisms from all three domains of life, including halophile archaea, gamma-proteobacteria, cyanobacteria, fungi, and green algae. While microbial (type 1) and animal (type 2) rhodopsins have no sequence similarity with each other, they share a common architecture consisting of seven-transmembrane alpha-helices (TM) connected by extracellular loops and intracellular loops. Both types of rhodopsins consist of opsin and a covalently attached retinal (the aldehyde of vitamin A), a photoreactive chromophore, via a protonated Schiff base linkage to an amino group of lysine in the middle of the seventh transmembrane helix (TM7). Upon the absorption of light, microbial rhodopsins undergo light-induced photoisomerization of all-trans retinal into the 13-cis isomer, whereas the photoisomerization of 11-cis retinal to all-trans isomer occurs in the animal rhodopsins. While animal visual rhodopsins are activated by light to catalyze GDP/GTP exchange in the alpha subunit of the retinal G protein transducin (Gt), microbial rhodopsins do not activate G proteins, but instead can function as light-dependent ion pumps, cation channels, and sensors.	231
320157	cd15029	7tm_SRI_SRII	light-sensor activating transmembrane transducer protein sensory rhodopsin I and II; member of the seven-transmembrane GPCR superfamily. This subgroup includes the light-sensor activating transmembrane transducer proteins, sensory rhodopsin I (SRI) and II (SRII, also called phoborhodopsin). SRI and SRII are responsible for positive (attractive) and negative (repellent) phototaxis in halobacteria, respectively, thereby controlling the cell's directional movement in response to changes in light intensity by swimming either towards or away from the light. Both sensory rhodopsins belong to the family of microbial rhodopsins, also known as type I rhodopsins, consisting of the light-driven inward chloride pump halorhodopsin (HR), the outward proton pump bacteriorhodopsin (BR), the light-gated cation channel channelrhodopsin (ChR), and the other light-driven proton pumps such as blue-light absorbing and green-light absorbing proteorhodopsins, among others. Microbial rhodopsins have been found in various single-celled microorganisms from all three domains of life, including halophile archaea, gamma-proteobacteria, cyanobacteria, fungi, and green algae. While microbial (type 1) and animal (type 2) rhodopsins have no sequence similarity with each other, they share a common architecture consisting of seven-transmembrane alpha-helices (TM) connected by extracellular loops and intracellular loops. Both types of rhodopsins consist of opsin and a covalently attached retinal (the aldehyde of vitamin A), a photoreactive chromophore, via a protonated Schiff base linkage to an amino group of lysine in the middle of the seventh transmembrane helix (TM7). Upon the absorption of light, microbial rhodopsins undergo light-induced photoisomerization of all-trans retinal into the 13-cis isomer, whereas the photoisomerization of 11-cis retinal to all-trans isomer occurs in the animal rhodopsins. While animal visual rhodopsins are activated by light to catalyze GDP/GTP exchange in the alpha subunit of the retinal G protein transducin (Gt), microbial rhodopsins do not activate G proteins, but instead can function as light-dependent ion pumps, cation channels, and sensors.	214
320158	cd15030	7tmF_SMO_homolog	class F smoothened family membrane region, a homolog of frizzled receptors. This group represents smoothened (SMO), a transmembrane G protein-coupled receptor that acts as the transducer of the hedgehog (HH) signaling pathway. SMO is activated by the hedgehog (HH) family of proteins acting on the 12-transmembrane domain receptor patched (PTCH), which constitutively inhibits SMO. Thus, in the absence of HH proteins, PTCH inhibits SMO signaling. On the other hand, binding of HH to the PTCH receptor activates its internalization and degradation, thereby releasing the PTCH inhibition of SMO. This allows SMO to trigger intracellular signaling and the subsequent activation of the Gli family of zinc finger transcriptional factors and induction of HH target gene expression (PTCH, Gli1, cyclin, Bcl-2, etc). SMO is closely related to the frizzled (FZD) family of seven transmembrane-spanning proteins, which constitute a novel and separate family of G-protein coupled receptors. The FZDs are activated by the wingless/int-1 (WNT) family of secreted lipoglycoproteins and preferentially couple to stimulatory G proteins of the Gs family, which activate adenylate cyclase, but can also couple to G proteins of the Gi/Gq families. In the WNT/beta-catenin signaling pathway, the WNT ligand binds to FZD and a lipoprotein receptor-related protein (LRP) co-receptor. This leads to the stabilization and translocation of beta-catenin to the nucleus, where it induces the activation of TCF/LEF family transcription factors. The WNT and HH signaling pathways play critical roles in many developmental processes, such as cell-fate determination, cell proliferation, neural patterning, stem cell renewal, tissue homeostasis and repair, and tumorigenesis, among many others.	331
320159	cd15031	7tmF_FZD3_insect	class F insect frizzled subfamily 3, member of 7-transmembrane G protein-coupled receptors. This group represents subfamily 3 of the frizzled (FZD) family of seven transmembrane-spanning G protein-coupled proteins that is found in insects such as Drosophila melanogaster. This class F protein family consists of 10 isoforms (FZD1-10) in mammals. The FZDs are activated by the wingless/int-1 (WNT) family of secreted lipoglycoproteins and preferentially couple to stimulatory G proteins of the Gs family, which activate adenylate cyclase, but can also couple to G proteins of the Gi/Gq families. In the WNT/beta-catenin signaling pathway, the WNT ligand binds to FZD and a lipoprotein receptor-related protein (LRP) co-receptor. This leads to the stabilization and translocation of beta-catenin to the nucleus, where it induces the activation of TCF/LEF family transcription factors. The conserved cytoplasmic motif of FZD, Lys-Thr-X-X-X-Trp, is required for activation of the WNT/beta-catenin pathway, and for membrane localization and phosphorylation of Dsh (dishevelled) protein, a key component of the WNT pathway that relays the WNT signals from the activated receptor to downstream effector proteins. The WNT pathway plays a critical role in many developmental processes, such as cell-fate determination, cell proliferation, neural patterning, stem cell renewal, tissue homeostasis and repair, and tumorigenesis, among many others.	311
320160	cd15032	7tmF_FZD6	class F frizzled subfamily 6, member of 7-transmembrane G protein-coupled receptors. This group includes subfamily 6 of the frizzled (FZD) family of seven transmembrane-spanning proteins, which constitute a novel and separate class of GPCRs, and its closely related proteins. This class F protein family consists of 10 isoforms (FZD1-10) in mammals. The FZDs are activated by the wingless/int-1 (WNT) family of secreted lipoglycoproteins and preferentially couple to stimulatory G proteins of the Gs family, which activate adenylate cyclase, but can also couple to G proteins of the Gi/Gq families. In the WNT/beta-catenin signaling pathway, the WNT ligand binds to FZD and a lipoprotein receptor-related protein (LRP) co-receptor. This leads to the stabilization and translocation of beta-catenin to the nucleus, where it induces the activation of TCF/LEF family transcription factors. The conserved cytoplasmic motif of FZD, Lys-Thr-X-X-X-Trp, is required for activation of the WNT/beta-catenin pathway, and for membrane localization and phosphorylation of Dsh (dishevelled) protein, a key component of the WNT pathway that relays the WNT signals from the activated receptor to downstream effector proteins. The WNT pathway plays a critical role in many developmental processes, such as cell-fate determination, cell proliferation, neural patterning, stem cell renewal, tissue homeostasis and repair, and tumorigenesis, among many others.	321
320161	cd15033	7tmF_FZD3	class F frizzled subfamily 3, member of 7-transmembrane G protein-coupled receptors. This group includes subfamily 3 of the frizzled (FZD) family of seven transmembrane-spanning proteins, which constitute a novel and separate class of GPCRs, and its closely related proteins. This class F protein family consists of 10 isoforms (FZD1-10) in mammals. The FZDs are activated by the wingless/int-1 (WNT) family of secreted lipoglycoproteins and preferentially couple to stimulatory G proteins of the Gs family, which activate adenylate cyclase, but can also couple to G proteins of the Gi/Gq families. In the WNT/beta-catenin signaling pathway, the WNT ligand binds to FZD and a lipoprotein receptor-related protein (LRP) co-receptor. This leads to the stabilization and translocation of beta-catenin to the nucleus, where it induces the activation of TCF/LEF family transcription factors. The conserved cytoplasmic motif of FZD, Lys-Thr-X-X-X-Trp, is required for activation of the WNT/beta-catenin pathway, and for membrane localization and phosphorylation of Dsh (dishevelled) protein, a key component of the WNT pathway that relays the WNT signals from the activated receptor to downstream effector proteins. The WNT pathway plays a critical role in many developmental processes, such as cell-fate determination, cell proliferation, neural patterning, stem cell renewal, tissue homeostasis and repair, and tumorigenesis, among many others.	321
320162	cd15034	7tmF_FZD1_2_7-like	class F frizzled subfamilies 1, 2 and 7; member of 7-transmembrane G protein-coupled receptors. This group includes subfamilies 1, 2 and 7 of the frizzled (FZD) family of seven transmembrane-spanning proteins, which constitute a novel and separate class of G-protein coupled receptors, as well as their closely related proteins. This class F protein family consists of 10 isoforms (FZD1-10) in mammals. The FZDs are activated by the wingless/int-1 (WNT) family of secreted lipoglycoproteins and preferentially couple to stimulatory G proteins of the Gs family, which activate adenylate cyclase, but can also couple to G proteins of the Gi/Gq families. In the WNT/beta-catenin signaling pathway, the WNT ligand binds to FZD and a lipoprotein receptor-related protein (LRP) co-receptor. This leads to the stabilization and translocation of beta-catenin to the nucleus, where it induces the activation of TCF/LEF family transcription factors. The conserved cytoplasmic motif of FZD, Lys-Thr-X-X-X-Trp, is required for activation of the WNT/beta-catenin pathway, and for membrane localization and phosphorylation of Dsh (dishevelled) protein, a key component of the WNT pathway that relays the WNT signals from the activated receptor to downstream effector proteins. The WNT pathway plays a critical role in many developmental processes, such as cell-fate determination, cell proliferation, neural patterning, stem cell renewal, tissue homeostasis and repair, and tumorigenesis, among many others.	322
320163	cd15035	7tmF_FZD5_FZD8-like	class F frizzled subfamilies 5, 8 and related proteins; member of 7-transmembrane G protein-coupled receptors. This group includes subfamilies 5 and 8 of the frizzled (FZD) family of seven transmembrane-spanning proteins, which constitute a novel and separate class of GPCRs, as well as their closely related proteins. This class F protein family consists of 10 isoforms (FZD1-10) in mammals. The FZDs are activated by the wingless/int-1 (WNT) family of secreted lipoglycoproteins and preferentially couple to stimulatory G proteins of the Gs family, which activate adenylate cyclase, but can also couple to G proteins of the Gi/Gq families. In the WNT/beta-catenin signaling pathway, the WNT ligand binds to FZD and a lipoprotein receptor-related protein (LRP) co-receptor. This leads to the stabilization and translocation of beta-catenin to the nucleus, where it induces the activation of TCF/LEF family transcription factors. The conserved cytoplasmic motif of FZD, Lys-Thr-X-X-X-Trp, is required for activation of the WNT/beta-catenin pathway, and for membrane localization and phosphorylation of Dsh (dishevelled) protein, a key component of the WNT pathway that relays the WNT signals from the activated receptor to downstream effector proteins. The WNT pathway plays a critical role in many developmental processes, such as cell-fate determination, cell proliferation, neural patterning, stem cell renewal, tissue homeostasis and repair, and tumorigenesis, among many others.	307
320164	cd15036	7tmF_FZD9	class F frizzled subfamily 9, member of 7-transmembrane G protein-coupled receptors. This group includes subfamily 9 of the frizzled (FZD) family of seven transmembrane-spanning proteins, which constitute a novel and separate class of GPCRs, and its closely related proteins. This class F protein family consists of 10 isoforms (FZD1-10) in mammals. The FZDs are activated by the wingless/int-1 (WNT) family of secreted lipoglycoproteins and preferentially couple to stimulatory G proteins of the Gs family, which activate adenylate cyclase, but can also couple to G proteins of the Gi/Gq families. In the WNT/beta-catenin signaling pathway, the WNT ligand binds to FZD and a lipoprotein receptor-related protein (LRP) co-receptor. This leads to the stabilization and translocation of beta-catenin to the nucleus, where it induces the activation of TCF/LEF family transcription factors. The conserved cytoplasmic motif of FZD, Lys-Thr-X-X-X-Trp, is required for activation of the WNT/beta-catenin pathway, and for membrane localization and phosphorylation of Dsh (dishevelled) protein, a key component of the WNT pathway that relays the WNT signals from the activated receptor to downstream effector proteins. The WNT pathway plays a critical role in many developmental processes, such as cell-fate determination, cell proliferation, neural patterning, stem cell renewal, tissue homeostasis and repair, and tumorigenesis, among many others.	320
320165	cd15037	7tmF_FZD10	class F frizzled subfamily 10, member of 7-transmembrane G protein-coupled receptors. This group includes subfamily 10 of the frizzled (FZD) family of seven transmembrane-spanning proteins, which constitute a novel and separate class of GPCRs, and its closely related proteins. This class F protein family consists of 10 isoforms (FZD1-10) in mammals. The FZDs are activated by the wingless/int-1 (WNT) family of secreted lipoglycoproteins and preferentially couple to stimulatory G proteins of the Gs family, which activate adenylate cyclase, but can also couple to G proteins of the Gi/Gq families. In the WNT/beta-catenin signaling pathway, the WNT ligand binds to FZD and a lipoprotein receptor-related protein (LRP) co-receptor. This leads to the stabilization and translocation of beta-catenin to the nucleus, where it induces the activation of TCF/LEF family transcription factors. The conserved cytoplasmic motif of FZD, Lys-Thr-X-X-X-Trp, is required for activation of the WNT/beta-catenin pathway, and for membrane localization and phosphorylation of Dsh (dishevelled) protein, a key component of the WNT pathway that relays the WNT signals from the activated receptor to downstream effector proteins. The WNT pathway plays a critical role in many developmental processes, such as cell-fate determination, cell proliferation, neural patterning, stem cell renewal, tissue homeostasis and repair, and tumorigenesis, among many others.	320
320166	cd15038	7tmF_FZD4	class F frizzled subfamily 4, member of 7-transmembrane G protein-coupled receptors. This group includes subfamily 4 of the frizzled (FZD) family of seven transmembrane-spanning proteins, which constitute a novel and separate class of GPCRs, and its closely related proteins. This class F protein family consists of 10 isoforms (FZD1-10) in mammals. The FZDs are activated by the wingless/int-1 (WNT) family of secreted lipoglycoproteins and preferentially couple to stimulatory G proteins of the Gs family, which activate adenylate cyclase, but can also couple to G proteins of the Gi/Gq families. In the WNT/beta-catenin signaling pathway, the WNT ligand binds to FZD and a lipoprotein receptor-related protein (LRP) co-receptor. This leads to the stabilization and translocation of beta-catenin to the nucleus, where it induces the activation of TCF/LEF family transcription factors. The conserved cytoplasmic motif of FZD, Lys-Thr-X-X-X-Trp, is required for activation of the WNT/beta-catenin pathway, and for membrane localization and phosphorylation of Dsh (dishevelled) protein, a key component of the WNT pathway that relays the WNT signals from the activated receptor to downstream effector proteins. The WNT pathway plays a critical role in many developmental processes, such as cell-fate determination, cell proliferation, neural patterning, stem cell renewal, tissue homeostasis and repair, and tumorigenesis, among many others.	304
410632	cd15039	7tmB3_Methuselah-like	Methuselah-like subfamily B3, member of the class B family of seven-transmembrane G protein-coupled receptors. The subfamily B3 of class B GPCRs consists of Methuselah (Mth) and its closely related proteins found in bilateria. Mth was originally identified in Drosophila as a GPCR affecting stress resistance and aging. In addition to the seven transmembrane helices, Mth contains an N-terminal extracellular domain involved in ligand binding, and a third intracellular loop (IC3) required for the specificity of G-protein coupling. Drosophila Mth mutants showed an increase in average lifespan by 35% and greater resistance to a variety of stress factors, including starvation, high temperature, and paraquat-induced oxidative toxicity. Moreover, mutations in two endogenous peptide ligands of Methuselah, Stunted A and B, showed an increase in lifespan and resistance to oxidative stress induced by dietary paraquat. These results strongly suggest that the Stunted-Methuselah system plays important roles in stress response and aging.	270
320168	cd15040	7tmB2_Adhesion	adhesion receptors, subfamily B2 of the class B family of seven-transmembrane G protein-coupled receptors. The B2 subfamily of class B GPCRs consists of cell-adhesion receptors with 33 members in humans and vertebrates. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing a variety of structural motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, linked to a class B seven-transmembrane domain. These include, for example, EGF (epidermal growth factor)-like domains in CD97, Celsr1 (cadherin family member), Celsr2, Celsr3, EMR1 (EGF-module-containing mucin-like hormone receptor-like 1), EMR2, EMR3, and Flamingo; two laminin A G-type repeats and nine cadherin domains in Flamingo and its human orthologs Celsr1, Celsr2 and Celsr3; olfactomedin-like domains in the latrotoxin receptors; and four or five thrombospondin type 1 repeats in BAI1 (brain-specific angiogenesis inhibitor 1), BAI2 and BAI3. Furthermore, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions.	253
341321	cd15041	7tmB1_hormone_R	The subfamily B1 of hormone receptors (secretin-like), member of the class B family seven-transmembrane G protein-coupled receptors. The B1 subfamily of class B GPCRs, also referred to as secretin-like receptor family, includes receptors for polypeptide hormones of 27-141 amino-acid residues such as secretin, glucagon, glucagon-like peptide (GLP), calcitonin gene-related peptide, parathyroid hormone (PTH), and corticotropin-releasing factor. These receptors contain the large N-terminal extracellular domain (ECD), which plays a critical role in hormone recognition by binding to the C-terminal portion of the peptide. On the other hand, the N-terminal segment of the hormone induces receptor activation by interacting with the receptor transmembrane domains and connecting extracellular loops, triggering intracellular signaling pathways. All members of this subfamily preferentially couple to G proteins of G(s) family, which positively stimulate adenylate cyclase, leading to increased intracellular cAMP formation and calcium influx. Moreover, the B1 subfamily receptors play key roles in hormone homeostasis and are promising drug targets in various human diseases including diabetes, osteoporosis, obesity, neurodegenerative conditions (Alzheimer's and Parkinson's), cardiovascular disease, migraine, and psychiatric disorders (anxiety, depression). Furthermore, the subfamilies B2 and B3 consist of receptors that are capable of interacting with epidermal growth factors (EGF) and the Drosophila melanogaster Methuselah gene product (Mth), respectively. The class B GPCRs have been identified in all the vertebrates, from fishes to mammals, as well as invertebrates including Caenorhabditis elegans and Drosophila melanogaster, but are not present in plants, fungi, or prokaryotes.	273
320170	cd15042	7tmC_Boss	Bride of sevenless, member of the class C family of seven-transmembrane G protein-coupled receptors. Bride of Sevenless (Boss) is a putative Drosophila melanogaster G protein-coupled receptor that functions as a glucose-responding receptor to regulate energy metabolism. Boss is expressed predominantly in the fly's fat body, a nutrient-sensing tissue functionally analogous to the mammalian liver and adipose tissues, and in photoreceptor cells. Boss, which is expressed on the surface of the R8 photoreceptor cell, binds and activates the Sevenless receptor tyrosine kinase on the neighboring R7 precursor cell. Activation of Sevenless results in its phosphorylation, triggering a signal transduction cascade through the Ras pathway that ultimately leads to the differentiation of the R7 precursor into a fully functional R7 photoreceptor, the last of eight photoreceptors to differentiate in each ommatidium of the developing Drosophila eye. In the absence of either Sevenless or Boss, the R7 precursor fails to differentiate as a photoreceptor and instead develops into a non-neuronal cone cell. Moreover, Boss mutants in Drosophila showed elevated food intake, but reduced stored triglyceride levels, suggesting that Boss may play a role in regulating energy homeostasis in nutrient-sensing tissues. Furthermore, GPRC5B, a mammalian Boss homolog, activates obesity-associated inflammatory signaling in adipocytes, and GPRC5B knockout mice show resistance to high-fat diet-induced obesity and insulin resistance.	238
320171	cd15043	7tmC_RAIG_GPRC5	retinoic acid-inducible orphan G-protein-coupled receptors; class C family of seven-transmembrane G protein-coupled receptors, group 5. Retinoic acid-inducible G-protein-coupled receptors (RAIGs), also referred to as GPCR class C group 5, are a group consisting of four orphan receptors RAIG1 (GPRC5A), RAIG2 (GPRC5B), RAIG3 (GPRC5C), and RAIG4 (GPRC5D). Unlike other members of the class C GPCRs which contain a large N-terminal extracellular domain, RAIGs have a shorter N-terminus. Thus, it is unlikely that RAIGs bind an agonist at their N-terminal domain. Instead, agonists may bind to the seven-transmembrane domain of these receptors. In addition, RAIG2 and RAIG3 contain a cleavable signal peptide whereas RAIG1 and RAIG4 do not. Although their expression is induced by retinoic acid (vitamin A analog), their biological function is not clearly understood. To date, no ligand is known for the members of RAIG family. Three receptor types (RAIG1-3) are found in vertebrates, while RAIG4 is only present in mammals. They show distinct tissue distribution with RAIG1 being primarily expressed in the lung, RAIG2 in the brain and placenta, RAIG3 in the brain, kidney and liver, and RAIG4 in the skin. RAIG1 is evolutionarily conserved from mammals to fish. RAIG1 has been shown to act as a tumor suppressor in non-small cell lung carcinoma as well as oral squamous cell carcinoma, but it could also act as an oncogene in breast cancer, colorectal cancer, and pancreatic cancer. Studies have shown that overexpression of RAIG1 decreases intracellular cAMP levels. Moreover, knocking out RAIG1 induces the activation of the NF-kB and STAT3 signaling pathways leading to cell proliferation and resistance to apoptosis. RAIG2 (GPRC5B), a mammalian Boss (Bride of sevenless) homolog, activates obesity-associated inflammatory signaling in adipocytes, and GPRC5B knockout mice show resistance to high-fat diet-induced obesity and insulin resistance. The specific functions of RAIG3 and RAIG4 are unknown; however, they may play roles in mediating the effects of retinoic acid on embryogenesis, differentiation, and tumorigenesis through interactions with G-protein signaling pathways.	248
320172	cd15044	7tmC_V2R_AA_sensing_receptor-like	vomeronasal type-2 pheromone receptors, amino acid-sensing receptors and closely related proteins; member of the class C family of seven-transmembrane G protein-coupled receptors. This group is composed of vomeronasal type-2 pheromone receptors (V2Rs), a subgroup of broad-spectrum amino-acid sensing receptors including calcium-sensing receptor (CaSR) and GPRC6A, as well as their closely related proteins. Members of the V2R family of vomeronasal GPCRs are involved in detecting protein pheromones for social and sexual cues between the same species. V2Rs and G-alpha(o) protein are co-expressed in the basal layer of the vomeronasal organ (VNO), which is the sensory organ of the accessory olfactory system present in amphibians, reptiles, and non-primate mammals such as mice and rodents, but it is non-functional or absent in humans, apes, and monkeys. On the other hand, members of the V1R receptor family and G-alpha(i2) protein are co-expressed in the apical neurons of the VNO. Activation of V1R or V2R causes activation of the phospholipase pathway, producing the second messengers diacylglycerol (DAG) and IP3. However, in contrast to V1Rs, V2Rs contain the long N-terminal extracellular domain, which is believed to bind pheromones. CaSR is a widely expressed GPCR that is involved in sensing small changes in extracellular levels of calcium ion to maintain a constant level of the extracellular calcium via modulating the synthesis and secretion of calcium regulating hormones, such as parathyroid hormone (PTH), in order to regulate Ca(2+) transport into or out of the extracellular fluid via kidney, intestine, and/or bone. For instance, when Ca2+ is high, CaSR downregulates PTH synthesis and secretion, leading to an increase in renal Ca2+ excretion, a decrease in intestinal Ca2+ absorption, and a reduction in release of skeletal Ca2+. GPRC6A (GPCR, class C, group 6, subtype A) is a widely expressed amino acid-sensing GPCR that is most closely related to CaSR. GPRC6A is most potently activated by the basic amino acids L-arginine, L-lysine, and L-ornithine and less potently by small aliphatic amino acids. Moreover, the receptor can be either activated or modulated by divalent cations such as Ca2+. GPRC6A is expressed in the testis, but not the ovary and specifically also binds to the osteoblast-derived hormone osteocalcin (OCN), which regulates testosterone production by the testis and male fertility independently of the hypothalamic-pituitary axis. Furthermore, GPRC6A knockout studies suggest that GPRC6A is involved in regulation of bone metabolism, male reproduction, energy homeostasis, glucose metabolism, and in activation of inflammation response, as well as prostate cancer growth and progression, among others.	251
320173	cd15045	7tmC_mGluRs	metabotropic glutamate receptors, member of the class C family of seven-transmembrane G protein-coupled receptors. The metabotropic glutamate receptors (mGluRs) are homodimeric class C G-protein coupled receptors which are activated by glutamate, the major excitatory neurotransmitter of the CNS. mGluRs are involved in regulating neuronal excitability and synaptic transmission via intracellular activation of second messenger signaling pathways. While the ionotropic glutamate receptor subtypes (AMPA, NMDA, and kainate) mediate fast excitatory postsynaptic transmission, mGluRs are known to mediate slower excitatory postsynaptic responses and to be involved in synaptic plasticity in the mammalian brain. In addition to seven-transmembrane helices, the class C GPCRs are characterized by a large N-terminal extracellular Venus flytrap-like domain, which is composed of two adjacent lobes separated by a cleft which binds an endogenous ligand. Moreover, they exist as either homo- or heterodimers, which are essential for their function. For instance, mGluRs form homodimers via interactions between the N-terminal Venus flytrap domains and the intermolecular disulphide bonds between cysteine residues located in the cysteine-rich domain (CRD). At least eight different subtypes of metabotropic receptors (mGluR1-8) have been identified and further classified into three groups based on their sequence homology, pharmacological properties, and signaling pathways. Group 1 (mGluR1 and mGluR5) receptors are predominantly located postsynaptically on neurons and are involved in long-term synaptic plasticity in the brain, including long-term potentiation (LTP) in the hippocampus and long-term depression (LTD) in the cerebellum. They are coupled to G(q/11) proteins, thereby activating phospholipase C to generate inositol-1,4,5-trisphosphate (IP3) and diacylglycerol (DAG), which in turn lead to Ca2+ release and protein kinase C activation, respectively. Group 1 mGluR expression is shown to be strongly upregulated in animal models of epilepsy, brain injury, inflammatory and neuropathic pain, as well as in patients with amyotrophic lateral sclerosis or multiple sclerosis. Group 2 (mGluR2 and mGluR3) and 3 (mGluR4, mGluR6, mGluR7, and mGluR8) receptors are predominantly localized presynaptically in the active region of neurotransmitter release. They are coupled to G(i/o) proteins, which leads to inhibition of adenylate cyclase activity and cAMP formation, and consequently to a decrease in protein kinase A (PKA) activity. Ultimately, activation of these receptors leads to inhibition of neurotransmitter release such as glutamate and GABA via inhibition of Ca2+ channels and activation of K+ channels. Furthermore, while activation of Group 1 mGluRs increases NMDA (N-methyl-D-aspartate) receptor activity and risk of neurotoxicity, Group 2 and 3 mGluRs decrease NMDA receptor activity and prevent neurotoxicity.	253
320174	cd15046	7tmC_TAS1R	type 1 taste receptors, member of the class C of seven-transmembrane G protein-coupled receptors. This subfamily represents the type I taste receptors (TAS1Rs) that belong to the class C family of G protein-coupled receptors. The functional TAS1Rs are obligatory heterodimers built from three known members, TAS1R1-3. TAS1R1 combines with TAS1R3 to form an umami taste receptor, which is responsible for the perception of savory taste, such as the food additive mono-sodium glutamate (MSG); whereas the combination of TAS1R2-TAS1R3 forms a sweet-taste receptor for sugars and D-amino acids. On the other hand, the type II taste receptors (TAS2Rs), which belong to the class A family of GPCRs, recognize bitter tasting compounds. In the case of sweet, for example, the TAS1R2-TAS1R3 heterodimer activates phospholipase C (PLC) via alpha-gustducin, a heterodimeric G protein that is involved in perception of sweet and bitter tastes. This activation leads to generation of inositol (1, 4, 5)-trisphosphate (IP3) and diacylglycerol (DAG), and consequently increases intracellular Ca2+ mobilization and activates a cation channel, TRPM5. In contrast to the TAS1R2-TAS1R3 heterodimer, TAS1R3 alone could activate adenylate cyclase leading to cAMP formation in the absence of alpha-gustducin. Each TAS1R contains a large extracellular Venus flytrap-like domain in the N-terminus, cysteine-rich domain (CRD) and seven-transmembrane (7TM) domain, which are characteristics of the class C GPCRs. The Venus flytrap-like domain shares strong sequence homology to bacterial periplasmic binding proteins and possesses the orthosteric amino acid and calcium binding sites for members of the class C, including CaSR, GABA-B1, GPRC6A, mGlu, and TAS1R receptors.	253
320175	cd15047	7tmC_GABA-B-like	gamma-aminobutyric acid type B receptor and related proteins, member of the class C family of seven-transmembrane G protein-coupled receptors. The type B receptor for gamma-aminobutyric acid, GABA-B, is activated by its endogenous ligand GABA, the principal inhibitory neurotransmitter. The functional GABA-B receptor is an obligatory heterodimer composed of two related subunits, GABA-B1, which is primarily involved in GABA ligand binding, and GABA-B2, which is responsible for both G-protein coupling and trafficking of the heterodimer to the plasma membrane. Activation of GABA-B couples to G(i/o)-type G proteins, which in turn modulate three major downstream effectors: adenylate cyclase, voltage-sensitive Ca2+ channels, and inwardly-rectifying K+ channels. Consequently, GABA-B receptor produces slow and sustained inhibitory responses by decreased neurotransmitter release via inhibition of Ca2+ channels and by postsynaptic hyperpolarization via the activation of K+ channels through the G-protein beta-gamma dimer. The GABA-B is expressed in both pre- and postsynaptic sites of glutamatergic and GABAergic neurons in the brain where it regulates synaptic activity. Thus, the GABA-B receptor agonist, baclofen, is used to treat muscle tightness and cramping caused by spasticity in multiple sclerosis patients. Moreover, GABA-B antagonists improve cognitive performance in mammals, while GABA-B agonists suppress cognitive behavior. In most of the class C family members, the extracellular Venus-flytrap domain in the N-terminus is connected to the seven-transmembrane (7TM) via a cysteine-rich domain (CRD). However, in the GABA-B receptor, the CRD is absent in both subunits and the Venus-flytrap ligand-binding domain is directly connected to the 7TM via a 10-15 amino acid linker, suggesting that GABA-B receptor may utilize a different activation mechanism. Also included in this group are orphan receptors, GPR156 and GPR158, which are closely related to the GABA-B receptor family.	263
320176	cd15048	7tmA_Histamine_H3R_H4R	histamine receptor subtypes H3R and H4R, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes histamine subtypes H3R and H4R, members of the histamine receptor family, which belong to the class A of GPCRs. Histamine plays a key role as chemical mediator and neurotransmitter in various physiological and pathophysiological processes in the central and peripheral nervous system. Histamine exerts its functions by binding to four different G protein-coupled receptors (H1-H4). The H3 and H4 receptors couple to the G(i)-proteins, leading to the inhibition of cAMP formation. The H3R receptor functions as a presynaptic autoreceptor controlling histamine release and synthesis. The H4R plays an important role in histamine-mediated chemotaxis in mast cells and eosinophils. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	296
341322	cd15049	7tmA_mAChR	muscarinic acetylcholine receptor subfamily, member of the class A family of seven-transmembrane G protein-coupled receptors. Muscarinic acetylcholine receptors (mAChRs) regulate the activity of many fundamental central and peripheral functions. The mAChR family consists of 5 subtypes M1-M5, which can be further divided into two major groups according to their G-protein coupling preference. The M1, M3 and M5 receptors selectively interact with G proteins of the G(q/11) family, whereas the M2 and M4 receptors preferentially link to the G(i/o) types of G proteins. Activation of mAChRs by agonist (acetylcholine) leads to a variety of biochemical and electrophysiological responses. In general, the exact nature of these responses and the subsequent physiological effects mainly depend on the molecular and pharmacological identity of the activated receptor subtype(s). All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	262
320178	cd15050	7tmA_Histamine_H1R	histamine subtype H1 receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes histamine receptor subtype H1R, a member of histamine receptor family, which belongs to the class A of GPCRs. Histamine plays a key role as chemical mediator and neurotransmitter in various physiological and pathophysiological processes in the central and peripheral nervous system. Histamine exerts its functions by binding to four different G protein-coupled receptors (H1-H4). H1R selectively interacts with the G(q)-type G protein that activates phospholipase C and the phosphatidylinositol pathway. Antihistamines, a widely used anti-allergy medication, act on the H1 subtype and produce drowsiness as a side effect. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	263
320179	cd15051	7tmA_Histamine_H2R	histamine subtype H2 receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes histamine receptor subtype H2R, a member of histamine receptor family, which belongs to the class A of GPCRs. Histamine plays a key role as chemical mediator and neurotransmitter in various physiological and pathophysiological processes in the central and peripheral nervous system. Histamine exerts its functions by binding to four different G protein-coupled receptors (H1-H4). The H2R subtype selectively interacts with the G(s)-type G protein that activates adenylate cyclase, leading to increased cAMP production and activation of Protein Kinase A. H2R is found in various tissues such as the brain, stomach, and  heart. Its most prominent role is in histamine-induced gastric acid secretion. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	287
320180	cd15052	7tmA_5-HT2	serotonin receptor subtype 2, member of the class A family of seven-transmembrane G protein-coupled receptors. The 5-HT2 receptors are a subfamily of serotonin receptors that bind the neurotransmitter serotonin (5HT; 5-hydroxytryptamine) in the central nervous system (CNS). The 5-HT2 subfamily is composed of three subtypes that mediate excitatory neurotransmission: 5-HT2A, 5-HT2B, and 5-HT2C. They are selectively linked to G proteins of the G(q/11) family and activate phospholipase C, which leads to activation of protein kinase C and calcium release. In the CNS, serotonin is involved in the regulation of appetite, mood, sleep, cognition, learning and memory, as well as implicated in diseases such as migraine, schizophrenia, and depression. Indeed, 5-HT2 receptors are attractive targets for a variety of psychoactive drugs, ranging from atypical antipsychotic drugs, antidepressants, and anxiolytics, which have an antagonistic action on 5-HT2 receptors, to hallucinogens, which act as agonists at postsynaptic 5-HT2 receptors. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	262
320181	cd15053	7tmA_D2-like_dopamine_R	D2-like dopamine receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Dopamine receptors are members of the class A G protein-coupled receptors that are involved in many neurological processes in the central nervous system (CNS). The neurotransmitter dopamine is the primary endogenous agonist for dopamine receptors. Dopamine receptors consist of at least five subtypes: D1, D2, D3, D4, and D5.  The D1 and D5 subtypes are members of the D1-like family of dopamine receptors, whereas the D2, D3 and D4 subtypes are members of the D2-like family. The D1-like family receptors are coupled to G proteins of the G(s) family, which activate adenylate cyclase, causing cAMP formation and activation of protein kinase A. In contrast, activation of D2-like family receptors is linked to G proteins of the G(i) family, which inhibit adenylate cyclase. Dopamine receptors are major therapeutic targets for neurological and psychiatric disorders such as drug abuse, depression, schizophrenia, or Parkinson's disease.	263
320182	cd15054	7tmA_5-HT6	serotonin receptor subtype 6, member of the class A family of seven-transmembrane G protein-coupled receptors. The 5-HT6 receptors are a subfamily of serotonin receptors that bind the neurotransmitter serotonin (5HT; 5-hydroxytryptamine) in the mammalian central nervous system (CNS). 5-HT6 receptors are selectively linked to G proteins of the G(s) family, which positively stimulate adenylate cyclase, causing cAMP formation and activation of protein kinase A. The 5-HT6 receptors mediate excitatory neurotransmission and are involved in learning and memory; thus they are promising targets for the treatment of cognitive impairment. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	267
320183	cd15055	7tmA_TAARs	trace amine-associated receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. The trace amine-associated receptors (TAARs) are a distinct subfamily within the class A G protein-coupled receptor family. Trace amines are endogenous amines of unknown function that have strong structural and metabolic similarity to classical monoamine neurotransmitters (serotonin, noradrenaline, adrenaline, dopamine, and histamine), which play critical roles in human and animal physiological activities such as cognition, consciousness, mood, motivation, perception, and autonomic responses. However, trace amines are found in the mammalian brain at very low concentrations compared to classical monoamines. Trace amines, including p-tyramine, beta-phenylethylamine, and tryptamine, are also thought to act as chemical messengers to exert their biological effects in vertebrates. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	285
320184	cd15056	7tmA_5-HT4	serotonin receptor subtype 4, member of the class A family of seven-transmembrane G protein-coupled receptors. The 5-HT4 subtype is a member of the serotonin receptor family that belongs to the class A G protein-coupled receptors, and binds the neurotransmitter serotonin (5HT; 5-hydroxytryptamine) in the mammalian central nervous system (CNS). 5-HT4 receptors are selectively linked to G proteins of the G(s) family, which positively stimulate adenylate cyclase, causing cAMP formation and activation of protein kinase A. 5-HT4 receptor-specific agonists have been shown to enhance learning and memory in animal studies. Moreover, hippocampal 5-HT4 receptor expression has been reported to be inversely correlated with memory performance in humans. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	294
320185	cd15057	7tmA_D1-like_dopamine_R	D1-like family of dopamine receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Dopamine receptors are members of the class A G protein-coupled receptors that are involved in many neurological processes in the central nervous system (CNS). The neurotransmitter dopamine is the primary endogenous agonist for dopamine receptors. Dopamine receptors consist of at least five subtypes: D1, D2, D3, D4, and D5.  The D1 and D5 subtypes are members of the D1-like family of dopamine receptors, whereas the D2, D3 and D4 subtypes are members of the D2-like family. The D1-like family receptors are coupled to G proteins of the G(s) family, which activate adenylate cyclase, causing cAMP formation and activation of protein kinase A. In contrast, activation of D2-like family receptors is linked to G proteins of the G(i) family, which inhibit adenylate cyclase. Dopamine receptors are major therapeutic targets for neurological and psychiatric disorders such as drug abuse, depression, schizophrenia, or Parkinson's disease. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	299
320186	cd15058	7tmA_Beta_AR	beta adrenergic receptors (adrenoceptors), member of the class A family of seven-transmembrane G protein-coupled receptors. The beta adrenergic receptor (beta adrenoceptor), also known as beta AR, is activated by hormone adrenaline (epinephrine) and plays important roles in regulating cardiac function and heart rate, as well as pulmonary physiology. The human heart contains three subtypes of the beta AR: beta-1 AR, beta-2 AR, and beta-3 AR. Beta-1 AR and beta-2 AR, which are expressed at a ratio of about 70:30, are the major subtypes involved in modulating cardiac contractility and heart rate by positively stimulating the G(s) protein-adenylate cyclase-cAMP-PKA signaling pathway. In contrast, beta-3 AR produces negative inotropic effects by activating inhibitory G(i) proteins. The aberrant expression of beta-ARs can lead to cardiac dysfunction such as arrhythmias or heart failure. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	305
320187	cd15059	7tmA_alpha2_AR	alpha-2 adrenergic receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. The alpha-2 adrenergic receptors (or adrenoceptors) are a subfamily of the class A rhodopsin-like GPCRs that share a common architecture of seven transmembrane helices. This subfamily consists of three highly homologous receptor subtypes that have a key role in neurotransmitter release: alpha-2A, alpha-2B, and alpha-2C. In addition, a fourth subtype, alpha-2D is present in ray-finned fishes and amphibians, but is not found in humans. The alpha-2 receptors are found in both central and peripheral nervous system and serve to produce inhibitory functions through the G(i) proteins. Thus, the alpha-2 receptors inhibit adenylate cyclase, which decreases cAMP production and thereby decreases calcium influx during the action potential. Consequently, lowered levels of calcium will lead to a decrease in neurotransmitter release by negative feedback. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	261
320188	cd15060	7tmA_tyramine_octopamine_R-like	tyramine/octopamine receptor-like, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes tyramine/octopamine receptors and similar proteins found in insects and other invertebrates. Both octopamine and tyramine mediate their actions via G protein-coupled receptors (GPCRs) and are the invertebrate equivalent of vertebrate adrenergic neurotransmitters. In Drosophila, octopamine is involved in ovulation by mediating an egg release from the ovary, while a physiological role for tyramine in this process is not fully understood. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	260
320189	cd15061	7tmA_tyramine_R-like	tyramine receptors and similar proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes tyramine-specific receptors and similar proteins found in insects and other invertebrates. These tyramine receptors form a distinct receptor family that is phylogenetically different from the other tyramine/octopamine receptors, which are also found in invertebrates. Both octopamine and tyramine mediate their actions via G protein-coupled receptors (GPCRs) and are the invertebrate equivalent of vertebrate adrenergic neurotransmitters. In Drosophila, octopamine is involved in ovulation by mediating an egg release from the ovary, while a physiological role for tyramine in this process is not fully understood. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	256
320190	cd15062	7tmA_alpha1_AR	alpha-1 adrenergic receptors,  member of the class A family of seven-transmembrane G protein-coupled receptors. The alpha-1 adrenergic receptors (or adrenoceptors) are a subfamily of the class A rhodopsin-like GPCRs that share a common architecture of seven transmembrane helices. This subfamily consists of three highly homologous receptor subtypes that primarily mediate smooth muscle contraction: alpha-1A, alpha-1B, and alpha-1D. Activation of alpha-1 receptors by catecholamines such as norepinephrine and epinephrine couples to the G(q) protein, which then activates the phospholipase C pathway, leading to an increase in IP3 and calcium. Consequently, the elevation of intracellular calcium concentration leads to vasoconstriction in smooth muscle of blood vessels. In addition, activation of alpha-1 receptors by phenylpropanolamine (PPA) produces anorexia and may induce appetite suppression in rats.	261
320191	cd15063	7tmA_Octopamine_R	octopamine receptors in invertebrates, member of the class A family of seven-transmembrane G protein-coupled receptors. G-protein coupled receptor for octopamine (OA), which functions as a neurotransmitter, neurohormone, and neuromodulator in invertebrate nervous system. Octopamine (also known as beta, 4-dihydroxyphenethylamine) is an endogenous trace amine that is highly similar to norepinephrine, but lacks a hydroxyl group, and has effects on the adrenergic and dopaminergic nervous systems. Based on the pharmacological and signaling profiles, the octopamine receptors can be classified into at least two groups:  OA1 receptors elevate intracellular calcium levels in muscle, whereas OA2 receptors activate adenylate cyclase and increase cAMP production.	266
320192	cd15064	7tmA_5-HT1_5_7	serotonin receptor subtypes 1, 5 and 7, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes serotonin receptor subtypes 1, 5, and 7 that are activated by the neurotransmitter serotonin. The 5-HT1 and 5-HT5 receptors mediate inhibitory neurotransmission by coupling to G proteins of the G(i/o) family. The 5-HT1 receptor subfamily includes 5-HT1A, 5-HT1B, 5-HT1D, 5-HT1E, and 5-HT1F. There is no 5-HT1C receptor subtype, as it has been reclassified as 5-HT2C receptor. The 5-HT5A and 5-HT5B receptors have been cloned from rat and mouse, but only the 5-HT5A isoform has been identified in human because of the presence of premature stop codons in the human 5-HT5B gene, which prevents a functional receptor from being expressed. The 5-HT7 receptor is coupled to Gs, which positively stimulates adenylate cyclase activity, leading to increased intracellular cAMP formation and calcium influx. In the CNS, serotonin is involved in the regulation of appetite, mood, sleep, cognition, learning and memory, as well as implicated in neurologic disorders such as migraine, schizophrenia, and depression.	258
320193	cd15065	7tmA_Ap5-HTB1-like	serotonin receptor subtypes B1 and B2 from Aplysia californica and similar proteins; member of the class A family of seven-transmembrane G protein-coupled receptors. This subfamily includes Aplysia californica serotonin receptors Ap5-HTB1 and Ap5-HTB2, and similar proteins from bilateria including insects, mollusks, annelids, and worms. Ap5-HTB1 is one of the several different receptors for 5-hydroxytryptamine (5HT, serotonin). In Aplysia, serotonin plays important roles in a variety of behavioral and physiological processes mediated by the central nervous system. These include circadian clock, feeding, locomotor movement, cognition and memory, synaptic growth and synaptic plasticity. Both Ap5-HTB1 and Ap5-HTB2 receptors are coupled to G-proteins that stimulate phospholipase C, leading to the activation of phosphoinositide metabolism. Ap5-HTB1 is expressed in the reproductive system, whereas Ap5-HTB2 is expressed in the central nervous system.	300
320194	cd15066	7tmA_DmOct-betaAR-like	Drosophila melanogaster beta-adrenergic receptor-like octopamine receptors and similar receptors in bilateria; member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes Drosophila beta-adrenergic-like octopamine receptors and similar proteins. The biogenic amine octopamine is the invertebrate equivalent of vertebrate adrenergic neurotransmitters and exerts its effects through different G protein-coupled receptor types. Insect octopamine receptors are involved in the modulation of carbohydrate metabolism, muscular tension, cognition and memory. The activation of octopamine receptors mediating these actions leads to an increase in adenylate cyclase activity, thereby increasing cAMP levels. In Drosophila melanogaster, three subgroups have been classified on the basis of their structural homology and functional equivalents with vertebrate beta-adrenergic receptors: DmOctBeta1R, DmOctBeta2R, and DmOctBeta3R.	265
320195	cd15067	7tmA_Dop1R2-like	dopamine 1-like receptor 2 from Drosophila melanogaster and similar proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. G protein-coupled dopamine 1-like receptor 2 is expressed in Drosophila heads and it shows significant sequence similarity with vertebrate and invertebrate dopamine receptors. Although the Drosophila Dop1R2 receptor does not cluster into the D1-like structural group, it does show pharmacological properties similar to D1-like receptors. As shown in vertebrate D1-like receptors, agonist stimulation of Dop1R2 activates adenylyl cyclase to increase cAMP levels and also generates a calcium signal through stimulation of phospholipase C.	262
320196	cd15068	7tmA_Adenosine_R_A2A	adenosine receptor subtype A2A, member of the class A family of seven-transmembrane G protein-coupled receptors. The A2A receptor, a member of the adenosine receptor family of G protein-coupled receptors, binds adenosine as its endogenous ligand and is involved in regulating myocardial oxygen consumption and coronary blood flow. High-affinity A2A and low-affinity A2B receptors are preferentially coupled to G proteins of the stimulatory (Gs) family, which lead to activation of adenylate cyclase and thereby increasing the intracellular cAMP levels. The A2A receptor activation protects against tissue injury and acts as anti-inflammatory agent. In human skin endothelial cells, activation of A2B receptor, but not the A2A receptor, promotes angiogenesis. Alternatively, activated A2A receptor, but not the A2B receptor, promotes angiogenesis in human umbilical vein and lung microvascular endothelial cells. The A2A receptor alters cardiac contractility indirectly by modulating the anti-adrenergic effect of A1 receptor, while the A2B receptor exerts direct effects on cardiac contractile function, but does not modulate beta-adrenergic or A1 anti-adrenergic effects.	293
320197	cd15069	7tmA_Adenosine_R_A2B	adenosine receptor subtype A2B, member of the class A family of seven-transmembrane G protein-coupled receptors. The A2B receptor, a member of the adenosine receptor family of G protein-coupled receptors, binds adenosine as its endogenous ligand and is involved in regulating myocardial oxygen consumption and coronary blood flow. High-affinity A2A and low-affinity A2B receptors are preferentially coupled to G proteins of the stimulatory (Gs) family, which lead to activation of adenylate cyclase and thereby increasing the intracellular cAMP levels. The A2A receptor activation protects against tissue injury and acts as anti-inflammatory agent. In human skin endothelial cells, activation of A2B receptor, but not the A2A receptor, promotes angiogenesis. Alternatively, activated A2A receptor, but not the A2B receptor, promotes angiogenesis in human umbilical vein and lung microvascular endothelial cells. The A2A receptor alters cardiac contractility indirectly by modulating the anti-adrenergic effect of A1 receptor, while the A2B receptor exerts direct effects on cardiac contractile function, but does not modulate beta-adrenergic or A1 anti-adrenergic effects.	294
320198	cd15070	7tmA_Adenosine_R_A3	adenosine receptor subtype A3, member of the class A family of seven-transmembrane G protein-coupled receptors. The A3 receptor, a member of the adenosine receptor family of G protein-coupled receptors, is coupled to G proteins of the inhibitory G(i) family, which lead to inhibition of adenylate cyclase and thereby lowering the intracellular cAMP levels.  The A3 receptor has a sustained protective function in the heart during cardiac ischemia and contributes to inhibition of neutrophil degranulation in neutrophil-mediated tissue injury. Moreover, activation of A3 receptor by adenosine protects astrocytes from cell death induced by hypoxia.	280
341323	cd15071	7tmA_Adenosine_R_A1	adenosine receptor subtype A1, member of the class A family of seven-transmembrane G protein-coupled receptors. The adenosine A1 receptor, a member of the adenosine receptor family of G protein-coupled receptors, binds adenosine as its endogenous ligand. The A1 receptor has primarily inhibitory function on the tissues in which it is located. The A1 receptor slows metabolic activity in the brain and has a strong anti-adrenergic effects in the heart. Thus, it antagonizes beta1-adrenergic receptor-induced stimulation and thereby reduces cardiac contractility. The A1 receptor preferentially couples to G proteins of the G(i/o) family, which lead to inhibition of adenylate cyclase and thereby lowering the intracellular cAMP levels.	290
320200	cd15072	7tmA_Retinal_GPR	retinal G protein coupled receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. This group represents the retinal G-protein coupled receptor (RGR) found exclusively in retinal pigment epithelium (RPE) and Muller cells. RGR is a member of the class A rhodopsin-like receptor family. As with other opsins, RGR binds all-trans retinal and contains a conserved lysine residue on the seventh helix. RGR functions as a photoisomerase to catalyze the conversion of all-trans-retinal to 11-cis-retinal. Two mutations in RGR gene are found in patients with retinitis pigmentosa, indicating that RGR is essential to the visual process.	260
320201	cd15073	7tmA_Peropsin	retinal pigment epithelium-derived rhodopsin homolog, member of the class A family of seven-transmembrane G protein-coupled receptors. Peropsin, also known as a retinal pigment epithelium-derived rhodopsin homolog (RRH), is a visual pigment-like protein found exclusively in the apical microvilli of the retinal pigment epithelium. Peropsin belongs to the type 2 opsin family of the class A G-protein coupled receptors. Peropsin presumably plays a physiological role in the retinal pigment epithelium either by detecting light directly or monitoring the levels of retinoids, the primary light absorber in visual perception, or other pigment-related compounds in the eye.	280
320202	cd15074	7tmA_Opsin5_neuropsin	neuropsin (Opsin-5), member of the class A family of seven-transmembrane G protein-coupled receptors. Neuropsin, also known as Opsin-5, is a photoreceptor protein expressed in the retina, brain, testes, and spinal cord. Neuropsin belongs to the type 2 opsin family of the class A G-protein coupled receptors. Mammalian neuropsin activates Gi protein-mediated photo-transduction pathway in a UV-dependent manner, whereas, in non-mammalian vertebrates, neuropsin is involved in regulating the photoperiodic control of seasonal reproduction in birds such as quail. As with other opsins, it may also act as a retinal photoisomerase.	284
320203	cd15075	7tmA_Parapinopsin	non-visual parapinopsin, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes the non-visual pineal pigment, parapinopsin, which is a member of the class A of the seven transmembrane G protein-coupled receptors. Parapinopsin serves as a UV-sensitive pigment for the wavelength discrimination in the pineal-related organs of lower vertebrates such as reptiles, amphibians, and fish. Although parapinopsin is phylogenetically related to vertebrate visual pigments such as rhodopsin, which releases its retinal chromophore and bleaches, the parapinopsin photoproduct is stable and does not bleach. The vertebrate non-visual opsin family includes pinopsins, parapinopsin, VA (vertebrate ancient) opsins, and parietopsins. These non-visual opsins are expressed in various extra-retinal tissues and/or in non-rod, non-cone retinal cells.	279
320204	cd15076	7tmA_SWS1_opsin	short wave-sensitive 1 opsins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes Short Wave-Sensitive opsin 1 (SWS1), which mediates visual transduction in response to light at short wavelengths (ultraviolet to blue). Vertebrate cone opsins are expressed in cone photoreceptor cells of the retina and involved in mediating photopic vision, which allows color perception. The cone opsins can be classified into four classes according to their peak absorption wavelengths: SWS1 (ultraviolet sensitive), SWS2 (short wave-sensitive), MWS/LWS (medium/long wave-sensitive), and RH2 (medium wave-sensitive, rhodopsin-like opsins). Members of this group belong to the class A of the G protein-coupled receptors and possess seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops.	280
320205	cd15077	7tmA_SWS2_opsin	short wave-sensitive 2 opsins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes Short Wave-Sensitive opsin 2 (SWS2), which mediates visual transduction in response to light at short wavelengths (violet to blue). Vertebrate cone opsins are expressed in cone photoreceptor cells of the retina and involved in mediating photopic vision, which allows color perception. The cone opsins can be classified into four classes according to their peak absorption wavelengths: SWS1 (ultraviolet sensitive), SWS2 (short wave-sensitive), MWS/LWS (medium/long wave-sensitive), and RH2 (medium wave-sensitive, rhodopsin-like opsins). Members of this group belong to the class A of the G protein-coupled receptors and possess seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops.	280
320206	cd15078	7tmA_Encephalopsin	encephalopsins (opsin-3), member of the class A family of seven-transmembrane G protein-coupled receptors. Encephalopsin, also called Opsin-3 or Panopsin, is a mammalian extra-retinal opsin that is highly localized in the brain. It is thought to play a role in encephalic photoreception. Encephalopsin belongs to the class A of the G protein-coupled receptors and shows strong homology to the vertebrate visual opsins.	279
320207	cd15079	7tmA_photoreceptors_insect	insect photoreceptors R1-R6 and similar proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes the insect photoreceptors and their closely related proteins. The Drosophila eye is composed of about 800 unit eyes called ommatidia, each of which contains eight photoreceptor cells (R1-R8). The six outer photoreceptors (R1-R6) function like the vertebrate rods and are responsible for motion detection in dim light and image formation. The R1-R6 photoreceptors express a blue-absorbing pigment, Rhodopsin 1 (Rh1). The inner photoreceptors (R7 and R8) are considered the equivalent of the color-sensitive vertebrate cone cells, which express a range of different pigments. The R7 photoreceptors express one of two different UV absorbing pigments, either Rh3 or Rh4. Likewise, the R8 photoreceptors express either the blue absorbing pigment Rh5 or green absorbing pigment Rh6. These photoreceptors belong to the class A of the G protein-coupled receptors and possess seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops.	292
381742	cd15080	7tmA_MWS_opsin	medium wave-sensitive opsins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes Medium Wave-Sensitive opsin, which mediates visual transduction in response to light at medium wavelengths (green). Vertebrate cone opsins are expressed in cone photoreceptor cells of the retina and involved in mediating photopic vision, which allows color perception. The cone opsins can be classified into four classes according to their peak absorption wavelengths: SWS1 (ultraviolet sensitive), SWS2 (short wave-sensitive), MWS/LWS (medium/long wave-sensitive), and RH2 (medium wave-sensitive, rhodopsin-like opsins). Members of this group belong to the class A of the G protein-coupled receptors and possess seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops.	280
320209	cd15081	7tmA_LWS_opsin	long wave-sensitive opsins, member of the class A family of seven-transmembrane G protein-coupled receptors. Long Wave-Sensitive opsin is also called red-sensitive opsin or red cone photoreceptor pigment, which mediates visual transduction in response to light at long wavelengths. Vertebrate cone opsins are expressed in cone photoreceptor cells of the retina and involved in mediating photopic vision, which allows color perception. The cone opsins can be classified into four classes according to their peak absorption wavelengths: SWS1 (ultraviolet sensitive), SWS2 (short wave-sensitive), MWS/LWS (medium/long wave-sensitive), and RH2 (medium wave-sensitive, rhodopsin-like opsins). Members of this group belong to the class A of the G protein-coupled receptors and possess seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops.	292
320210	cd15082	7tmA_VA_opsin	non-visual VA (vertebrate ancient) opsins, member of the class A family of seven-transmembrane G protein-coupled receptors. The vertebrate ancient (VA) opsin photopigments were originally identified in salmon and they appear to have diverged early in the evolution of vertebrate opsins. VA opsins are localized in the inner retina and the brain in teleosts. The vertebrate non-visual opsin family includes pinopsins, parapinopsin, VA (vertebrate ancient) opsins, and parietopsins. These non-visual opsins are expressed in various extraretinal tissues and/or in non-rod, non-cone retinal cells. They are thought to be involved in light-dependent physiological functions such as photo-entrainment of circadian rhythm, photoperiodicity, and body color change. The VA opsins belong to the class A of the G protein-coupled receptors and possess seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops.	291
320211	cd15083	7tmA_Melanopsin-like	vertebrate melanopsins and related opsins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group represents the Gq-coupled rhodopsin subfamily, which consists of melanopsins, insect photoreceptors R1-R6, invertebrate Gq opsins as well as their closely related opsins. Melanopsins (also called Opsin-4) are the primary photoreceptor molecules for non-visual functions such as the photo-entrainment of the circadian rhythm and pupillary constriction in mammals. Mammalian melanopsins are expressed only in the inner retina, whereas non-mammalian vertebrate melanopsins are localized in various extra-retinal tissues such as iris, brain, pineal gland, and skin. The outer photoreceptors (R1-R6) are the insect Drosophila equivalent to the vertebrate rods and are responsible for image formation and motion detection. The invertebrate G(q) opsins include the arthropod and mollusk visual opsins as well as invertebrate melanopsins, which are also found in vertebrates. Arthropods possess color vision by the use of multiple opsins sensitive to different light wavelengths. Members of this subfamily belong to the class A of the G protein-coupled receptors and have seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops.	291
320212	cd15084	7tmA_Pinopsin	non-visual pinopsins, member of the class A family of seven-transmembrane G protein-coupled receptors. Pinopsins are found in the pineal organ of birds, reptiles and amphibians, but are absent from teleosts and mammals. The vertebrate non-visual opsin family includes pinopsins, parapinopsin, VA (vertebrate ancient) opsins, and parietopsins. These non-visual opsins are expressed in various extra-retinal tissues and/or in non-rod, non-cone retinal cells. They are thought to be involved in light-dependent physiological functions such as photo-entrainment of circadian rhythm, photoperiodicity and body color change. Pinopsins belong to the class A of the G protein-coupled receptors and possess seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops.	295
320213	cd15085	7tmA_Parietopsin	non-visual parietopsins, member of the class A family of seven-transmembrane G protein-coupled receptors. Parietopsin is a non-visual green light-sensitive opsin that was initially identified in the parietal eye of lizards. The vertebrate non-visual opsin family includes pinopsins, parapinopsin, VA (vertebrate ancient) opsins, and parietopsins. These non-visual opsins are expressed in various extra-retinal tissues and/or in non-rod, non-cone retinal cells. They are thought to be involved in light-dependent physiological functions such as photo-entrainment of circadian rhythm, photoperiodicity and body color change. Parietopsin belongs to the class A of the G protein-coupled receptors and shows strong homology to the vertebrate visual opsins.	280
320214	cd15086	7tmA_tmt_opsin	teleost multiple tissue (tmt) opsin, member of the class A family of seven-transmembrane G protein-coupled receptors. Teleost multiple tissue (tmt) opsins are homologs of encephalopsin. Mouse encephalopsin (or panopsin) is highly expressed in the brain and testes, whereas the teleost homologs are localized to multiple tissues. The exact functions of the encephalopsins and tmt-opsins are unknown. The vertebrate non-visual opsin family includes pinopsins, parapinopsin, VA (vertebrate ancient) opsins, and parietopsins. These non-visual opsins are expressed in various extra-retinal tissues and/or in non-rod, non-cone retinal cells. They are thought to be involved in light-dependent physiological functions such as photo-entrainment of circadian rhythm, photoperiodicity and body color change. Tmt opsins belong to the class A of the G protein-coupled receptors and show strong homology to the vertebrate visual opsins.	276
320215	cd15087	7tmA_NPBWR	neuropeptide B/W receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Neuropeptide B/W receptor 1 and 2 are members of the class A G-protein coupled receptors that bind the neuropeptides B and W, respectively. NPBWR1 (previously known as GPR7) is expressed predominantly in cerebellum and frontal cortex, while NPBWR2 (previously known as GPR8) is located mostly in the frontal cortex and is present in human, but not in rat and mice. These receptors are suggested to be involved in the regulation of food intake, neuroendocrine function, and modulation of inflammatory pain, among many others. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	282
320216	cd15088	7tmA_MCHR-like	melanin concentrating hormone receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. Melanin-concentrating hormone receptor (MCHR) binds melanin concentrating hormone and is presumably involved in the neuronal regulation of food intake and energy homeostasis. Despite strong homology with somatostatin receptors, MCHR does not appear to bind somatostatin. Two MCHRs have been characterized in vertebrates, MCHR1 and MCHR2. MCHR1 is expressed in all mammals, whereas MCHR2 is only expressed in the higher order mammals, such as humans, primates, and dogs, and is not found in rodents. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	278
320217	cd15089	7tmA_Delta_opioid_R	opioid receptor subtype delta, member of the class A family of seven-transmembrane G protein-coupled receptors. The delta-opioid receptor binds the endogenous pentapeptide ligands such as enkephalins and produces antidepressant-like effects. The opioid receptor family is composed of four major subtypes: mu (MOP), delta (DOP), kappa (KOP) opioid receptors, and the nociceptin/orphanin FQ peptide receptor (NOP). They are distributed widely in the central nervous system and respond to classic alkaloid opiates, such as morphine and heroin, as well as to endogenous peptide ligands, which include dynorphins, enkephalins, endorphins, endomorphins, and nociceptin. Opioid receptors are coupled to inhibitory G proteins of the G(i/o) family and involved in regulating a variety of physiological functions such as pain, addiction, mood, stress, epileptic seizure, and obesity, among many others.	281
320218	cd15090	7tmA_Mu_opioid_R	opioid receptor subtype mu, member of the class A family of seven-transmembrane G protein-coupled receptors. The mu-opioid receptor binds endogenous opioids such as beta-endorphin and endomorphin. The opioid receptor family is composed of four major subtypes: mu (MOP), delta (DOP), kappa (KOP) opioid receptors, and the nociceptin/orphanin FQ peptide receptor (NOP). They are distributed widely in the central nervous system and respond to classic alkaloid opiates, such as morphine and heroin, as well as to endogenous peptide ligands, which include dynorphins, enkephalins, endorphins, endomorphins, and nociceptin. Opioid receptors are coupled to inhibitory G proteins of the G(i/o) family and involved in regulating a variety of physiological functions such as pain, addiction, mood, stress, epileptic seizure, and obesity, among many others.	279
320219	cd15091	7tmA_Kappa_opioid_R	opioid receptor subtype kappa, member of the class A family of seven-transmembrane G protein-coupled receptors. The kappa-opioid receptor binds the opioid peptide dynorphin as the primary endogenous ligand. The opioid receptor family is composed of four major subtypes: mu (MOP), delta (DOP), kappa (KOP) opioid receptors, and the nociceptin/orphanin FQ peptide receptor (NOP). They are distributed widely in the central nervous system and respond to classic alkaloid opiates, such as morphine and heroin, as well as to endogenous peptide ligands, which include dynorphins, enkephalins, endorphins, endomorphins, and nociceptin. Opioid receptors are coupled to inhibitory G proteins of the G(i/o) family and involved in regulating a variety of physiological functions such as pain, addiction, mood, stress, epileptic seizure, and obesity, among many others.	282
320220	cd15092	7tmA_NOFQ_opioid_R	nociceptin/orphanin FQ peptide receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. The nociceptin (NOP) receptor binds nociceptin or orphanin FQ, a 17 amino acid endogenous neuropeptide. The NOP receptor is involved in the modulation of various brain activities including instinctive and emotional behaviors. The opioid receptor family is composed of four major subtypes: mu (MOP), delta (DOP), kappa (KOP) opioid receptors, and the nociceptin/orphanin FQ peptide receptor (NOP). They are distributed widely in the central nervous system and respond to classic alkaloid opiates, such as morphine and heroin, as well as to endogenous peptide ligands, which include dynorphins, enkephalins, endorphins, endomorphins, and nociceptin. Opioid receptors are coupled to inhibitory G proteins of the G(i/o) family and involved in regulating a variety of physiological functions such as pain, addiction, mood, stress, epileptic seizure, and obesity, among many others.	279
320221	cd15093	7tmA_SSTR	somatostatin receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. G protein-coupled somatostatin receptors (SSTRs) are composed of five distinct subtypes (SSTR1-5) that display strong sequence similarity with opioid receptors. All five receptor subtypes bind the natural somatostatin (somatotropin release inhibiting factor), a polypeptide hormone that regulates a wide variety of physiological functions such as neurotransmission, cell proliferation, contractility of smooth muscle cells, and endocrine signaling as well as inhibition of the release of many secondary hormones. They share common signaling cascades such as inhibition of adenylyl cyclase, activation of phosphotyrosine phosphatase activity, and G-protein-dependent regulation of MAPKs.	280
320222	cd15094	7tmA_AstC_insect	somatostatin-like receptor for allatostatin C, member of the class A family of seven-transmembrane G protein-coupled receptors. G protein-coupled somatostatin receptors (SSTRs) are composed of five distinct subtypes (SSTR1-5) that display strong sequence similarity with opioid receptors. All five receptor subtypes bind the natural somatostatin (somatotropin release inhibiting factor), a polypeptide hormone that regulates a wide variety of physiological functions such as neurotransmission, cell proliferation, contractility of smooth muscle cells, and endocrine signaling as well as inhibition of the release of many secondary hormones. In Drosophila melanogaster and other insects, a 15-amino-acid peptide named allatostatin C (AstC) binds the somatostatin-like receptors. Two AstC receptors have been identified in Drosophila with strong sequence homology to human somatostatin and opioid receptors.	282
320223	cd15095	7tmA_KiSS1R	KiSS1-derived peptide (kisspeptin) receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. The G protein-coupled KiSS1-derived peptide receptor (GPR54 or kisspeptin receptor) binds the peptide hormone kisspeptin (previously known as metastin), which is encoded by the metastasis suppressor gene (KISS1) expressed in various endocrine and reproductive tissues. The KiSS1 receptor is coupled to G proteins of the G(q/11) family, which lead to activation of phospholipase C and increase of intracellular calcium. This signaling cascade plays an important role in reproduction by regulating the secretion of gonadotropin-releasing hormone.	288
320224	cd15096	7tmA_AstA_R_insect	allatostatin-A receptor in insects, member of the class A family of seven-transmembrane G protein-coupled receptors. The G protein-coupled AstA receptor binds allatostatin A. Three distinct types of allatostatin have been identified in the insects and crustaceans: AstA, AstB, and AstC. They both inhibit the biosynthesis of juvenile hormone and exert an inhibitory influence on food intake. Therefore, allatostatins are considered as potential targets for insect control.	284
320225	cd15097	7tmA_Gal2_Gal3_R	galanin receptor subtypes 2 and 3, member of the class A family of seven-transmembrane G protein-coupled receptors. The G protein-coupled galanin receptors bind galanin, a neuropeptide that is widely expressed in the brain, peripheral tissues, and endocrine glands. Three receptor subtypes have been identified so far: GAL1, GAL2, and GAL3. The specific functions of each subtype remain mostly unknown, although galanin is thought to be involved in a variety of neuronal functions such as hormone release and food intake. Galanin is implicated in numerous neurological and psychiatric diseases including Alzheimer's disease, depression, eating disorders, epilepsy and stroke, among many others.	279
320226	cd15098	7tmA_Gal1_R	galanin receptor subtype 1, member of the class A family of seven-transmembrane G protein-coupled receptors. The G protein-coupled galanin receptors bind galanin, a neuropeptide that is widely expressed in the brain, peripheral tissues, and endocrine glands. Three receptor subtypes have been identified so far: GAL1, GAL2, and GAL3. The specific functions of each subtype remain mostly unknown, although galanin is thought to be involved in a variety of neuronal functions such as hormone release and food intake. Galanin is implicated in numerous neurological and psychiatric diseases including Alzheimer's disease, depression, eating disorders, epilepsy and stroke, among many others.	282
320227	cd15099	7tmA_Cannabinoid_R	cannabinoid receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Cannabinoid receptors belong to the class A G-protein coupled receptor superfamily. Two types of cannabinoid receptors, CB1 and CB2, have been identified so far. They are activated by naturally occurring endocannabinoids, cannabis plant-derived cannabinoids such as tetrahydrocannabinol, or synthetic cannabinoids. The CB receptors are involved in the various physiological processes such as appetite, mood, memory, and pain sensation. CB1 receptor is expressed predominantly in central and peripheral neurons, while CB2 receptor is found mainly in the immune system.	281
320228	cd15100	7tmA_GPR3_GPR6_GPR12-like	G protein-coupled receptors 3, 6, 12, and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR3, GPR6, and GPR12 form a subfamily of constitutively active G-protein coupled receptors with dual coupling to G(s) and G(i) proteins. These three orphan receptors are involved in the regulation of cell proliferation and survival, neurite outgrowth, cell clustering, and maintenance of meiotic prophase arrest. They constitutively activate adenylate cyclase to a similar degree as that seen with fully activated G(s)-coupled receptors, and are also able to constitutively activate inhibitory G(i/o) proteins. Lysophospholipids such as sphingosine 1-phosphate (S1P) and sphingosylphosphorylcholine have been detected as the high-affinity ligands for Gpr6 and Gpr12, respectively, which show high sequence homology with GPR3. Also included in this subfamily is GPRx, also known as GPR185, which is involved in the maintenance of meiotic arrest in frog oocytes.	268
341325	cd15101	7tmA_LPAR	lysophosphatidic acid receptor subfamily, member of the class A family of seven-transmembrane G protein-coupled receptors. The endothelial differentiation gene (Edg) family of G-protein coupled receptors binds blood borne lysophospholipids including sphingosine-1-phosphate (S1P) and lysophosphatidic acid (LPA), which are involved in the regulation of cell proliferation, survival, migration, invasion, endothelial cell shape change and cytoskeletal remodeling. The Edg receptors are classified into two subfamilies: the lysophosphatidic acid subfamily that includes LPA1 (Edg2), LPA2 (Edg4), and LPA3 (Edg7); and the S1P subfamily that includes S1P1 (Edg1), S1P2 (Edg5), S1P3 (Edg3), S1P4 (Edg6), and S1P5 (Edg8).  The Edg receptors couple and activate at least three different G protein subtypes including G(i/o), G(q/11), and G(12/13).	274
320230	cd15102	7tmA_S1PR	sphingosine-1-phosphate receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. The endothelial differentiation gene (Edg) family of G-protein coupled receptors binds blood borne lysophospholipids including sphingosine-1-phosphate (S1P) and lysophosphatidic acid (LPA), which are involved in the regulation of cell proliferation, survival, migration, invasion, endothelial cell shape change and cytoskeletal remodeling. The Edg receptors are classified into two subfamilies: the lysophosphatidic acid subfamily that includes LPA1 (Edg2), LPA2 (Edg4), and LPA3 (Edg7); and the S1P subfamily that includes S1P1 (Edg1), S1P2 (Edg5), S1P3 (Edg3), S1P4 (Edg6), and S1P5 (Edg8).  The Edg receptors couple and activate at least three different G protein subtypes including G(i/o), G(q/11), and G(12/13).	270
320231	cd15103	7tmA_MCR	melanocortin receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. The melanocortin receptor (MCR) subfamily is a member of the class A family of seven-transmembrane G-protein coupled receptors. MCRs bind a group of pituitary peptide hormones known as melanocortins, which include adrenocorticotropic hormone (ACTH) and the different isoforms of melanocyte-stimulating hormones. There are five known subtypes of the MCR subfamily. MC1R is involved in regulating skin pigmentation and hair color. ACTH (adrenocorticotropic hormone) is the only endogenous ligand for MC2R, which shows low sequence similarity with other melanocortin receptors. Mutations in MC2R cause familial glucocorticoid deficiency type 1, in which patients have elevated plasma ACTH and low cortisol levels. MC3R is expressed in many parts of the brain and peripheral tissues and involved in the regulation of energy homeostasis. MC4R is expressed primarily in the central nervous system and involved in both eating behavior and sexual function. MC5R is widely expressed in peripheral tissues and is mainly involved in the regulation of exocrine gland function.	270
320232	cd15104	7tmA_GPR119_R_insulinotropic_receptor	G protein-coupled receptor 119, also called glucose-dependent insulinotropic receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR119 is activated by oleoylethanolamide (OEA), a naturally occurring bioactive lipid with hypophagic and anti-obesity effects. Immunohistochemistry and double-immunofluorescence studies revealed the predominant GPR119 localization in pancreatic polypeptide (PP)-cells of islets. In addition, GPR119 expression is elevated in islets of obese hyperglycemic mice as compared to control islets, suggesting a possible involvement of this receptor in the development of obesity and diabetes. GPR119 has a significant sequence similarity with the members of the endothelial differentiation gene family. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	283
320233	cd15105	7tmA_MrgprA	mas-related G protein-coupled receptor subtype A, member of the class A family of seven-transmembrane G protein-coupled receptors. The Mas-related G-protein coupled receptor (Mrgpr) family constitutes a group of orphan receptors exclusively expressed in nociceptive primary sensory neurons and mast cells in the skin. Members of the Mrgpr family have been implicated in the modulation of nociception, pruritus (itching), and mast cell degranulation. The Mrgpr family in rodents and humans contains more than 50 members that can be grouped into 9 distinct subfamilies: MrgprA, B, C (MrgprX1), D, E, F, G, H (GPR90), and the primate-specific MrgprX subfamily. Some Mrgprs can be activated by endogenous ligands such as beta-alanine, adenine (a cell metabolite and potential transmitter), RF-amide related peptides, or salusin-beta (a bioactive peptide). However, the effects of these agonists are not clearly understood, and the physiological role of the individual receptor family members remains to be determined.	276
320234	cd15106	7tmA_MrgprX-like	primate-specific mas-related G protein-coupled receptor subtype X-like, member of the class A family of seven-transmembrane G protein-coupled receptors. The Mas-related G-protein coupled receptor (Mrgpr) family constitutes a group of orphan receptors exclusively expressed in nociceptive primary sensory neurons and mast cells in the skin. Members of the Mrgpr family have been implicated in the modulation of nociception, pruritus (itching), and mast cell degranulation. The Mrgpr family in rodents and humans contains more than 50 members that can be grouped into 9 distinct subfamilies: MrgprA, B, C (MrgprX1), D, E, F, G, H (GPR90), and the primate-specific MrgprX subfamily. Some Mrgprs can be activated by endogenous ligands such as beta-alanine, adenine (a cell metabolite and potential transmitter), RF-amide related peptides, or salusin-beta (a bioactive peptide). However, the effects of these agonists are not clearly understood, and the physiological role of the individual receptor family members remains to be determined.	274
320235	cd15107	7tmA_MrgprB	mas-related G protein-coupled receptor subtype B, member of the class A family of seven-transmembrane G protein-coupled receptors. The Mas-related G-protein coupled receptor (Mrgpr) family constitutes a group of orphan receptors exclusively expressed in nociceptive primary sensory neurons and mast cells in the skin. Members of the Mrgpr family have been implicated in the modulation of nociception, pruritus (itching), and mast cell degranulation. The Mrgpr family in rodents and humans contains more than 50 members that can be grouped into 9 distinct subfamilies: MrgprA, B, C (MrgprX1), D, E, F, G, H (GPR90), and the primate-specific MrgprX subfamily. Some Mrgprs can be activated by endogenous ligands such as beta-alanine, adenine (a cell metabolite and potential transmitter), RF-amide related peptides, or salusin-beta (a bioactive peptide). However, the effects of these agonists are not clearly understood, and the physiological role of the individual receptor family members remains to be determined.	276
320236	cd15108	7tmA_MrgprD	mas-related G protein-coupled receptor subtype D, member of the class A family of seven-transmembrane G protein-coupled receptors. The Mas-related G-protein coupled receptor (Mrgpr) family constitutes a group of orphan receptors exclusively expressed in nociceptive primary sensory neurons and mast cells in the skin. Members of the Mrgpr family have been implicated in the modulation of nociception, pruritus (itching), and mast cell degranulation. The Mrgpr family in rodents and humans contains more than 50 members that can be grouped into 9 distinct subfamilies: MrgprA, B, C (MrgprX1), D, E, F, G, H (GPR90), and the primate-specific MrgprX subfamily. Some Mrgprs can be activated by endogenous ligands such as beta-alanine, adenine (a cell metabolite and potential transmitter), RF-amide related peptides, or salusin-beta (a bioactive peptide). However, the effects of these agonists are not clearly understood, and the physiological role of the individual receptor family members remains to be determined.	276
320237	cd15109	7tmA_MrgprF	mas-related G protein-coupled receptor subtype F, member of the class A family of seven-transmembrane G protein-coupled receptors. The Mas-related G-protein coupled receptor (Mrgpr) family constitutes a group of orphan receptors exclusively expressed in nociceptive primary sensory neurons and mast cells in the skin. Members of the Mrgpr family have been implicated in the modulation of nociception, pruritus (itching), and mast cell degranulation. The Mrgpr family in rodents and humans contains more than 50 members that can be grouped into 9 distinct subfamilies: MrgprA, B, C (MrgprX1), D, E, F, G, H (GPR90), and the primate-specific MrgprX subfamily. Some Mrgprs can be activated by endogenous ligands such as beta-alanine, adenine (a cell metabolite and potential transmitter), RF-amide related peptides, or salusin-beta (a bioactive peptide). However, the effects of these agonists are not clearly understood, and the physiological role of the individual receptor family members remains to be determined.	274
320238	cd15110	7tmA_MrgprH	mas-related G protein-coupled receptor subtype H, member of the class A family of seven-transmembrane G protein-coupled receptors. The Mas-related G-protein coupled receptor (Mrgpr) family constitutes a group of orphan receptors exclusively expressed in nociceptive primary sensory neurons and mast cells in the skin. Members of the Mrgpr family have been implicated in the modulation of nociception, pruritus (itching), and mast cell degranulation. The Mrgpr family in rodents and humans contains more than 50 members that can be grouped into 9 distinct subfamilies: MrgprA, B, C (MrgprX1), D, E, F, G, H (GPR90), and the primate-specific MrgprX subfamily. Some Mrgprs can be activated by endogenous ligands such as beta-alanine, adenine (a cell metabolite and potential transmitter), RF-amide related peptides, or salusin-beta (a bioactive peptide). However, the effects of these agonists are not clearly understood, and the physiological role of the individual receptor family members remains to be determined.	274
320239	cd15111	7tmA_MrgprG	mas-related G protein-coupled receptor subtype G, member of the class A family of seven-transmembrane G protein-coupled receptors. The Mas-related G-protein coupled receptor (Mrgpr) family constitutes a group of orphan receptors exclusively expressed in nociceptive primary sensory neurons and mast cells in the skin. Members of the Mrgpr family have been implicated in the modulation of nociception, pruritus (itching), and mast cell degranulation. The Mrgpr family in rodents and humans contains more than 50 members that can be grouped into 9 distinct subfamilies: MrgprA, B, C (MrgprX1), D, E, F, G, H (GPR90), and the primate-specific MrgprX subfamily. Some Mrgprs can be activated by endogenous ligands such as beta-alanine, adenine (a cell metabolite and potential transmitter), RF-amide related peptides, or salusin-beta (a bioactive peptide). However, the effects of these agonists are not clearly understood, and the physiological role of the individual receptor family members remains to be determined.	263
320240	cd15112	7tmA_MrgprE	mas-related G protein-coupled receptor subtype E, member of the class A family of seven-transmembrane G protein-coupled receptors. The Mas-related G-protein coupled receptor (Mrgpr) family constitutes a group of orphan receptors exclusively expressed in nociceptive primary sensory neurons and mast cells in the skin. Members of the Mrgpr family have been implicated in the modulation of nociception, pruritus (itching), and mast cell degranulation. The Mrgpr family in rodents and humans contains more than 50 members that can be grouped into 9 distinct subfamilies: MrgprA, B, C (MrgprX1), D, E, F, G, H (GPR90), and the primate-specific MrgprX subfamily. Some Mrgprs can be activated by endogenous ligands such as beta-alanine, adenine (a cell metabolite and potential transmitter), RF-amide related peptides, or salusin-beta (a bioactive peptide). However, the effects of these agonists are not clearly understood, and the physiological role of the individual receptor family members remains to be determined.	272
320241	cd15113	7tmA_MAS1L	mas-related G protein-coupled receptor 1-like (MAS1L), member of the class A family of seven-transmembrane G protein-coupled receptors. MAS1L is also called MAS1 oncogene-like (MAS1-like) or mas-related G-protein coupled receptor MRG. MAS1L is a G protein-coupled receptor that is only found in primates. The angiotensin-II metabolite angiotensin is an endogenous ligand for MAS1L. The Mas-related G-protein coupled receptor (Mrgpr) family constitutes a group of orphan receptors exclusively expressed in nociceptive primary sensory neurons and mast cells in the skin. Members of the Mrgpr family have been implicated in the modulation of nociception, pruritus (itching), and mast cell degranulation. The Mrgpr family in rodents and humans contains more than 50 members that can be grouped into 9 distinct subfamilies: MrgprA, B, C (MrgprX1), D, E, F, G, H (GPR90), and the primate-specific MrgprX subfamily. Some Mrgprs can be activated by endogenous ligands such as beta-alanine, adenine (a cell metabolite and potential transmitter), RF-amide related peptides, or salusin-beta (a bioactive peptide). However, the effects of these agonists are not clearly understood, and the physiological role of the individual receptor family members remains to be determined.	265
320242	cd15114	7tmA_C5aR	complement component 5a anaphylatoxin chemotactic receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. The anaphylatoxin receptors are a group of G-protein coupled receptors which bind anaphylatoxins; members of this group include C3a receptors and C5a receptors. Anaphylatoxins are also known as complement peptides (C3a, C4a and C5a) that are produced from the activation of the complement system cascade. These complement anaphylatoxins can trigger degranulation of endothelial cells, mast cells, or phagocytes, which induce a local inflammatory response and stimulate smooth muscle cell contraction, histamine release, and increased vascular permeability. They are potent mediators involved in chemotaxis, inflammation, and generation of cytotoxic oxygen-derived free radicals. In humans, a single receptor for C3a (C3AR1) and two receptors for C5a (C5AR1 and C5AR2, also known as C5L2 or GPR77) have been identified, but there is no known receptor for C4a.	274
320243	cd15115	7tmA_C3aR	complement component 3a anaphylatoxin chemotactic receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. The anaphylatoxin receptors are a group of G-protein coupled receptors which bind anaphylatoxins; members of this group include C3a receptors and C5a receptors. Anaphylatoxins are also known as complement peptides (C3a, C4a and C5a) that are produced from the activation of the complement system cascade. These complement anaphylatoxins can trigger degranulation of endothelial cells, mast cells, or phagocytes, which induce a local inflammatory response and stimulate smooth muscle cell contraction, histamine release, and increased vascular permeability. They are potent mediators involved in chemotaxis, inflammation, and generation of cytotoxic oxygen-derived free radicals. In humans, a single receptor for C3a (C3AR1) and two receptors for C5a (C5AR1 and C5AR2, also known as C5L2 or GPR77) have been identified, but there is no known receptor for C4a.	265
320244	cd15116	7tmA_CMKLR1	chemokine-like receptor 1, member of the class A family of seven-transmembrane G protein-coupled receptors. Chemokine receptor-like 1 (also known as Chemerin receptor 23) is a GPCR for the chemoattractant adipokine chemerin, also known as retinoic acid receptor responder protein 2 (RARRES2), and for the omega-3 fatty acid derived molecule resolvin E1. Interaction with chemerin induces activation of the MAPK and PI3K signaling pathways leading to downstream functional effects, such as a decrease in immune responses, stimulation of adipogenesis, and angiogenesis. On the other hand, resolvin E1 negatively regulates the cytokine production in macrophages by reducing the activation of MAPK1/3 and NF-kB pathways. CMKLR1 is prominently expressed in dendritic cells and macrophages.	284
320245	cd15117	7tmA_FPR-like	N-formyl peptide receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. The formyl peptide receptors (FPRs) are chemoattractant GPCRs that are involved in mediating immune responses to infection. They are expressed at elevated levels on polymorphonuclear and mononuclear phagocytes. FPRs bind N-formyl peptides, which are derived from the mitochondrial proteins of ruptured host cells or invading pathogens. Activation of FPRs by N-formyl peptides such as N-formyl-Met-Leu-Phe (FMLP) triggers a signaling cascade that stimulates neutrophil accumulation, phagocytosis and superoxide production. These responses are mediated through a pertussis toxin-sensitive G(i) protein that activates a PLC-IP3-calcium signaling pathway. While FPRs are involved in host defense responses to bacterial infection, they can also suppress the immune system under certain conditions. Yet, the physiological role of the FPR family is not fully understood.	288
320246	cd15118	7tmA_PD2R2_CRTH2	prostaglandin D2 receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. Prostaglandin D2 receptor, also known as CRTH2, is a chemoattractant G-protein coupled receptor expressed on T helper type 2 cells that binds prostaglandin D2 (PGD2). PGD2 functions as a mast cell-derived mediator to trigger asthmatic responses and also causes vasodilation. PGD2 exerts its inflammatory effects by binding to two G-protein coupled receptors, the D-type prostanoid receptor (DP) and PD2R2 (CRTH2). PD2R2 couples to the G protein G(i/o) type which leads to a reduction in intracellular cAMP levels and an increase in intracellular calcium. PD2R2 is involved in mediating chemotaxis of Th2 cells, eosinophils, and basophils generated during allergic inflammatory processes. CRTH2 (PD2R2), but not DP receptor, undergoes agonist-induced internalization, which is one of the key processes that regulate the signaling of the GPCR.	284
320247	cd15119	7tmA_GPR1	G protein-coupled receptor 1 for chemerin, member of the class A family of seven-transmembrane G protein-coupled receptors. G-protein coupled receptor 1 (GPR1) belongs to the class A of the seven transmembrane domain receptors.  This is an orphan receptor that can be activated by the leukocyte chemoattractant chemerin, thereby suggesting that some of the anti-inflammatory actions of chemerin may be mediated through GPR1. GPR1 is most closely related to another chemerin receptor CMKLR1. In an in-vitro study, GPR1 has been shown to act as a co-receptor to allow replication of HIV viruses.	278
320248	cd15120	7tmA_GPR33	orphan receptor GPR33, member of the class A family of seven-transmembrane G protein-coupled receptors. G-protein coupled receptor GPR33, an orphan member of the chemokine-like receptor family, was originally identified as a pseudogene in humans as well as in several apes and rodent species. Although the intact GPR33 allele is still present in a small fraction of the human population, the human GPR33 contains a premature stop codon. The amino acid sequence of GPR33 shares a high degree of sequence identity with the members of the chemokine and chemoattractant receptors that control leukocyte chemotaxis. The human GPR33 is expressed in spleen, lung, heart, kidney, pancreas, thymus, gonads, and leukocytes.	282
320249	cd15121	7tmA_LTB4R1	leukotriene B4 receptor subtype 1 (LTB4R1 or BLT1), member of the class A family of seven-transmembrane G protein-coupled receptors. Leukotriene B4 (LTB4), a metabolite of arachidonic acid, is a powerful chemotactic activator for granulocytes and macrophages. Two receptors for LTB4 have been identified: a high-affinity receptor (LTB4R1 or BLT1) and a low-affinity receptor (LTB4R2 or BLT2). Both BLT1 and BLT2 receptors belong to the rhodopsin-like G-protein coupled receptor superfamily and primarily couple to G(i) proteins, which lead to chemotaxis, calcium mobilization, and inhibition of adenylate cyclase. In some cells, they can also couple to the Gq-like protein, G16, and activate phospholipase C. LTB4 is involved in mediating inflammatory processes, immune responses, and host defense against infection. Studies have shown that LTB4 stimulates leukocyte extravasation, neutrophil degranulation, lysozyme release, and reactive oxygen species generation.	278
320250	cd15122	7tmA_LTB4R2	leukotriene B4 receptor subtype 2 (LTB4R2 or BLT2), member of the class A family of seven-transmembrane G protein-coupled receptors. Leukotriene B4 (LTB4), a metabolite of arachidonic acid, is a powerful chemotactic activator for granulocytes and macrophages. Two receptors for LTB4 have been identified: a high-affinity receptor (LTB4R1 or BLT1) and a low-affinity receptor (LTB4R2 or BLT2). Both BLT1 and BLT2 receptors belong to the rhodopsin-like G-protein coupled receptor superfamily and primarily couple to G(i) proteins, which lead to chemotaxis, calcium mobilization, and inhibition of adenylate cyclase. In some cells, they can also couple to the Gq-like protein, G16, and activate phospholipase C. LTB4 is involved in mediating inflammatory processes, immune responses, and host defense against infection. Studies have shown that LTB4 stimulates leukocyte extravasation, neutrophil degranulation, lysozyme release, and reactive oxygen species generation.	281
320251	cd15123	7tmA_BRS-3	bombesin receptor subtype 3, member of the class A family of seven-transmembrane G protein-coupled receptors. BRS-3 is classified as an orphan receptor and belongs to the bombesin subfamily of G-protein coupled receptors, whose members also include neuromedin B receptor (NMBR) and gastrin-releasing peptide receptor (GRPR). Bombesin is a tetradecapeptide, originally isolated from frog skin. Mammalian bombesin-related peptides are widely distributed in the gastrointestinal and central nervous systems. The bombesin family receptors couple primarily to the G proteins of G(q/11) family. BRS-3 interacts with known naturally-occurring bombesin-related peptides with low affinity; however, no endogenous high-affinity ligand to the receptor has been identified. BRS-3 is suggested to play a role in sperm cell division and maturation.	294
320252	cd15124	7tmA_GRPR	gastrin-releasing peptide receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. The gastrin-releasing peptide receptor (GRPR) is a G-protein coupled receptor whose endogenous ligand is gastrin releasing peptide. GRP shares high sequence homology with the neuropeptide neuromedin B in the C-terminal region. This receptor is highly glycosylated and couples to a pertussis-toxin-insensitive G protein of the family of Gq/11, which leads to the activation of phospholipase C. Gastrin-releasing peptide (GRP) is a potent mitogen for neoplastic tissues and is involved in regulating multiple functions of the gastrointestinal and central nervous systems. These include the release of gastrointestinal hormones, the contraction of smooth muscle cells, and the proliferation of epithelial cells. GRPR belongs to the bombesin subfamily of G-protein coupled receptors, whose members also include neuromedin B receptor (NMBR) and bombesin receptor subtype 3 (BRS-3). Bombesin is a tetradecapeptide, originally isolated from frog skin.	293
320253	cd15125	7tmA_NMBR	neuromedin B receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. The neuromedin B receptor (NMBR), also known as BB1, is a G-protein coupled receptor whose endogenous ligand is the neuropeptide neuromedin B. Neuromedin B is a potent mitogen and growth factor for normal and cancerous lung and for gastrointestinal epithelial tissues. NMBR is widely distributed in the CNS, with especially high levels in olfactory nucleus and thalamic regions. The receptor couples primarily to a pertussis-toxin-insensitive G protein of the Gq/11 family, which leads to the activation of phospholipase C. NMBR belongs to the bombesin subfamily of G-protein coupled receptors, whose members also include gastrin-releasing peptide receptor (GRPR) and bombesin receptor subtype 3 (BRS-3). Bombesin is a tetradecapeptide, originally isolated from frog skin.	292
320254	cd15126	7tmA_ETBR-LP2	endothelin B receptor-like protein 2, member of the class A family of seven-transmembrane G protein-coupled receptors. Endothelin B receptor-like protein 2, also called GPR37L1, is almost exclusively expressed in the nervous system. It has recently been shown to act as a receptor for the neuropeptide prosaptide, the active fragment of the secreted neuroprotective and glioprotective factor prosaposin (also called sulfated glycoprotein-1). Both prosaptide and prosaposin protect primary astrocytes against oxidative stress. GPR37L1 is part of the class A family of GPCRs that includes receptors for hormones, neurotransmitters, sensory stimuli, and a variety of other ligands. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	298
320255	cd15127	7tmA_GPR37	G protein-coupled receptor 37, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR37, also called parkin-associated endothelin-like receptor (Pael-R), was isolated from a set of human brain frontal lobe expressed sequence tags. It is highly expressed in the mammalian CNS. It is a substrate of parkin and is involved in the pathogenesis of Parkinson's disease. GPR37 has recently been shown to act as a receptor for the neuropeptide prosaptide, the active fragment of the secreted neuroprotective and glioprotective factor prosaposin (also called sulfated glycoprotein-1). Both prosaptide and prosaposin protect primary astrocytes against oxidative stress. GPR37 is part of the class A family of GPCRs that includes receptors for hormones, neurotransmitters, sensory stimuli, and a variety of other ligands. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	298
320256	cd15128	7tmA_ET_R	endothelin receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Endothelins are 21-amino acid peptides which are able to activate a number of signal transduction processes including phospholipase A2, phospholipase C, and phospholipase D, as well as cytosolic protein kinase activation. They play an important role in the regulation of the cardiovascular system and are the most potent vasoconstrictors identified, stimulating cardiac contraction, regulating the release of vasoactive substances, and stimulating mitogenesis in blood vessels. Two endothelin receptor subtypes have been isolated and identified in vertebrates, endothelin A receptor (ET-A) and endothelin B receptor (ET-B), and are members of the seven transmembrane class A G-protein coupled receptor family which activate multiple effectors via different types of G protein. Some vertebrates contain a third subtype, endothelin C receptor (ET-C). ET-A receptors are mainly located on vascular smooth muscle cells, whereas ET-B receptors are present on endothelial cells lining the vessel wall. Endothelin receptors have also been found in the brain.	300
320257	cd15129	7tmA_GPR142	G-protein-coupled receptor GPR142, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR142, a vertebrate orphan receptor, is very closely related to GPR139, but they have different expression patterns in the brain and in other tissues. These receptors couple to inhibitory G proteins and activate phospholipase C. Studies suggested that dimer formation may be required for their proper function. GPR142 is predominantly expressed in pancreatic beta-cells and plays an important role in mediating enhancement of glucose-stimulated insulin secretion and maintaining glucose homeostasis, whereas GPR139 is expressed almost exclusively in the brain and is suggested to play a role in the control of locomotor activity. These orphan receptors are phylogenetically clustered with invertebrate FMRFamide receptors such as Drosophila melanogaster DrmFMRFa-R.	270
320258	cd15130	7tmA_NTSR	neurotensin receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Neurotensin (NTS) is a 13 amino-acid neuropeptide that functions as both a neurotransmitter and a hormone in the nervous system and peripheral tissues, respectively. NTS exerts various biological activities through activation of the G protein-coupled neurotensin receptors, NTSR1 and NTSR2. In the brain, NTS is involved in the modulation of dopamine neurotransmission, opioid-independent analgesia, hypothermia, and the inhibition of food intake, while in the periphery NTS promotes the growth of various normal and cancer cells and acts as a paracrine and endocrine modulator of the digestive tract. The third neurotensin receptor, NTSR3, also called sortilin, is not a G protein-coupled receptor.	281
320259	cd15131	7tmA_GHSR	growth hormone secretagogue receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. Growth hormone secretagogue receptor, GHSR, is also known as GH-releasing peptide receptor (GHRP) or Ghrelin receptor. Ghrelin, the endogenous ligand for GHSR, is an acylated 28-amino acid peptide hormone produced by ghrelin cells in the gastrointestinal tract. Ghrelin, also called hunger hormone, is involved in the regulation of growth hormone release, appetite and feeding, gut motility, lipid and glucose metabolism, and energy balance. It also plays a role in the cardiovascular, immune, and reproductive systems. GHSR couples to G-alpha-11 proteins. Both ghrelin and GHSR are expressed in a wide range of cancer tissues. Recent studies suggested that ghrelin may play a role in processes associated with cancer progression, including cell proliferation, metastasis, apoptosis, and angiogenesis.	291
320260	cd15132	7tmA_motilin_R	motilin receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. Motilin receptor, also known as GPR38, is a G-protein coupled receptor that binds the endogenous ligand motilin. Motilin is a 22 amino acid peptide hormone expressed throughout the gastrointestinal tract and stimulates contraction of gut smooth muscle. Motilin is also called the housekeeper of the gut because it is responsible for the proper filling and emptying of the gastrointestinal tract in response to food intake, and for stimulating the production of pepsin. Motilin receptor shares significant amino acid sequence identity with the growth hormone secretagogue receptor (GHSR) and neurotensin receptors (NTS-R1 and 2).	289
320261	cd15133	7tmA_NMU-R	neuromedin U receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Neuromedin U (NMU) is a highly conserved neuropeptide with a common C-terminal heptapeptide sequence (FLFRPRN-amide) found at the highest levels in the gastrointestinal tract and pituitary gland of mammals. Disruption or replacement of residues in the conserved heptapeptide region can result in the reduced ability of NMU to stimulate smooth-muscle contraction. Two G-protein coupled receptor subtypes, NMU-R1 and NMU-R2, with a distinct expression pattern, have been identified to bind NMU. NMU-R1 is expressed primarily in the peripheral nervous system, while NMU-R2 is mainly found in the central nervous system. Neuromedin S, a 36 amino-acid neuropeptide that shares a conserved C-terminal heptapeptide sequence with NMU, is a highly potent and selective NMU-R2 agonist. Pharmacological studies have shown that both NMU and NMS inhibit food intake and reduce body weight, and that NMU increases energy expenditure.	298
320262	cd15134	7tmA_capaR	neuropeptide capa receptor and similar invertebrate proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. CapaR is a G-protein coupled receptor for the Drosophila melanogaster capa neuropeptides (Drm-capa-1 and -2), which act on the Malpighian tubules to increase fluid transport. The capa peptides are evolutionarily related to vertebrate Neuromedin U neuropeptide and contain a C-terminal FPRXamide motif. CapaR regulates fluid homeostasis through its ligands, thereby acting as a desiccation stress-responsive receptor. CapaR undergoes desensitization, with internalization mediated by beta-arrestin-2.	298
320263	cd15135	7tmA_GPR39	G protein-coupled receptor 39, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR39 is an orphan G protein-coupled receptor that belongs to the growth hormone secretagogue and neurotensin receptor subfamily. GPR39 is expressed in peripheral tissues such as pancreas, gut, gastrointestinal tract, liver, kidney as well as certain regions of the brain. The divalent metal ion Zn(2+) has been shown to be a ligand capable of activating GPR39. Thus, it has been suggested that GPR39 functions as a G(q)-coupled Zn(2+)-sensing receptor which is involved in the regulation of endocrine pancreatic function, body weight, gastrointestinal mobility, and cell death.	320
320264	cd15136	7tmA_Glyco_hormone_R	glycoprotein hormone receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. The glycoprotein hormone receptors (GPHRs) are seven transmembrane domain receptors with a very large extracellular N-terminal domain containing many leucine-rich repeats responsible for hormone recognition and binding. The glycoprotein hormone family includes three gonadotropins, luteinizing hormone (LH), follicle-stimulating hormone (FSH), and chorionic gonadotropin (CG), as well as a pituitary thyroid-stimulating hormone (TSH). The glycoprotein hormones exert their biological functions by interacting with their cognate GPCRs. Both LH and CG bind to the same receptor, the luteinizing hormone-choriogonadotropin receptor (LHCGR); FSH binds to FSH-R and TSH to TSH-R. GPHRs couple primarily to the G(s)-protein and promote cAMP production, but also to the G(i)- or G(q)-protein.	275
320265	cd15137	7tmA_Relaxin_R	relaxin family peptide receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes relaxin/insulin-like family peptide receptor 1 (RXFP1 or LGR7) and 2 (RXFP2 or LGR8), which contain a very large extracellular N-terminal domain with numerous leucine-rich repeats responsible for hormone recognition and binding. Relaxin is a member of the insulin superfamily that has diverse actions in both reproductive and non-reproductive tissues. The relaxin-like peptide family includes relaxin-1, relaxin-2, and the insulin-like (INSL) peptides such as INSL3, INSL4, INSL5 and INSL6. The relaxin family peptides share high structural but low sequence similarity, and exert their physiological functions by activating a group of four GPCRs, RXFP1-4. Relaxin and INSL3 are the endogenous ligands for RXFP1 and RXFP2, respectively. Upon receptor binding, relaxin activates a variety of signaling pathways to produce second messengers such as cAMP.	284
320266	cd15138	7tmA_LRR_GPR	orphan leucine-rich repeat-containing G protein-coupled receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes leucine-rich repeat-containing G-protein coupled receptor 4 (LGR4), 5 (LGR5), and 6 (LGR6). These receptors constitute a subfamily of receptors related to the glycoprotein hormone receptor family, which includes the luteinizing hormone (LH) receptor, the follicle-stimulating hormone (FSH) receptor, and the pituitary thyroid-stimulating hormone (TSH) receptor. LGR4-6 are receptors for the R-spondin (Rspo) family of secreted proteins containing two N-terminal furin-like repeats and a thrombospondin domain. The RSPO proteins are involved in regulating proliferation and differentiation of adult stem cells by potently enhancing the WNT-stimulated beta-catenin signaling. LGR4 is broadly expressed in proliferating cells, and LGR4-deficient mice display developmental defects in multiple organs. LGR5 acts as a marker for resident stem cells in numerous epithelial cell layers, including small intestine, colon, stomach, and kidney. LGR6 also serves as a marker of multipotent stem cells in the hair follicle that generate all skin cell lineages. Members of this group are characterized by a very large extracellular N-terminal domain containing 17 leucine-rich repeats (LRRs), flanked by cysteine-rich N- and C-terminal capping domains, and the extracellular domain is responsible for high-affinity binding with the Rspo proteins.	274
320267	cd15139	7tmA_PGE2_EP2	prostaglandin E2 receptor EP2 subtype, member of the class A family of seven-transmembrane G protein-coupled receptors. Prostaglandin E2 receptor EP2, also called prostanoid EP2 receptor, is one of four receptor subtypes whose endogenous physiological ligand is prostaglandin E2 (PGE2). Each of these subtypes (EP1-EP4) has a unique but overlapping tissue distribution and activates different intracellular signaling pathways. Stimulation of the EP2 receptor by PGE2 causes cAMP accumulation through G(s) protein activation, which subsequently produces smooth muscle relaxation and mediates the systemic vasodepressor response to PGE2. Prostanoids are the cyclooxygenase (COX) metabolites of arachidonic acid, which include the prostaglandins (PGD2, PGE2, PGF2alpha), prostacyclin (PGI2), and thromboxane A2 (TxA2). These five major bioactive prostanoids act as mediators or modulators in a wide range of physiological and pathophysiological processes within the kidney and play important roles in inflammation, platelet aggregation, and vasoconstriction/relaxation, among many others. They act locally by preferentially interacting with G protein-coupled receptors designated DP, EP, FP, IP, and TP, respectively. The phylogenetic tree suggests that the prostanoid receptors can be grouped into two major branches: G(s)-coupled (DP1, EP2, EP4, and IP) and G(i)- (EP3) or G(q)-coupled (EP1, FP, and TP), forming three clusters.	299
320268	cd15140	7tmA_PGD2	prostaglandin D2 receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. Prostaglandin D2 receptor (also called prostanoid DP receptor, DP1, or PGD2R1) is a G-protein coupled receptor whose endogenous ligand is prostaglandin D2 (PGD2). PGD2, the major cyclooxygenase metabolite of arachidonic acid produced by mast cells, mediates inflammatory reactions in response to allergen challenge and causes peripheral vasodilation. PGD2 exerts its biological effects by binding to two types of cell surface receptors: a DP1 receptor that belongs to the prostanoid receptor family and a chemoattractant receptor-homologous molecule expressed on the T-helper type 2 cells (CRTH2 or PD2R2).	312
320269	cd15141	7tmA_PGI2	prostaglandin I2 receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. Prostaglandin I2 receptor (also called prostacyclin receptor or prostanoid IP receptor) is a class A, G protein-coupled receptor whose endogenous ligand is prostacyclin, which is the major product of cyclooxygenase metabolite of arachidonic acid that is found predominantly in platelets and vascular smooth muscle cells (VSMCs). The PGI2 receptor is coupled to both G(s) and G(q) protein subtypes, resulting in increased cAMP formation, phosphoinositide turnover, and Ca2+ signaling. PGI2 receptor activation by prostacyclin induces VSMC differentiation and produces a potent vasodilation and inhibition of platelet aggregation.	301
320270	cd15142	7tmA_PGE2_EP4	prostaglandin E2 receptor EP4 subtype, member of the class A family of seven-transmembrane G protein-coupled receptors. Prostaglandin E2 receptor EP4, also called prostanoid EP4 receptor, is one of four receptor subtypes whose endogenous physiological ligand is prostaglandin E2 (PGE2). Each of these subtypes (EP1-EP4) has a unique but overlapping tissue distribution and activates different intracellular signaling pathways. Like the EP2 receptor, stimulation of the EP4 receptor by PGE2 causes cAMP accumulation through G(s) protein activation. Knockout studies in mice suggest that EP4 receptor may be involved in the maintenance of bone mass and fracture healing. Prostanoids are the cyclooxygenase (COX) metabolites of arachidonic acid, which include the prostaglandins (PGD2, PGE2, PGF2alpha), prostacyclin (PGI2), and thromboxane A2 (TxA2). These five major bioactive prostanoids act as mediators or modulators in a wide range of physiological and pathophysiological processes within the kidney and play important roles in inflammation, platelet aggregation, and vasoconstriction/relaxation, among many others. They act locally by preferentially interacting with G protein-coupled receptors designated DP, EP, FP, IP, and TP, respectively. The phylogenetic tree suggests that the prostanoid receptors can be grouped into two major branches: G(s)-coupled (DP1, EP2, EP4, and IP) and G(i)- (EP3) or G(q)-coupled (EP1, FP, and TP), forming three clusters.	302
320271	cd15143	7tmA_TXA2_R	thromboxane A2 receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. The thromboxane receptor, also known as the prostanoid TP receptor, is a class A G-protein coupled receptor whose endogenous ligand is thromboxane A2 (TXA2). TXA2 is the major product of cyclooxygenase metabolite of arachidonic acid that is found predominantly in platelets and stimulates platelet aggregation, Ca2+ influx into platelets, and also causes vasoconstriction. TXA2 has been shown to be involved in immune regulation, angiogenesis and metastasis, among many others. Activation of TXA2 receptor is coupled to G(q) and G(13), resulting in the activations of phospholipase C and RhoGEF, respectively. TXA2 receptor is widely distributed in the body and is abundantly expressed in thymus and spleen.	296
320272	cd15144	7tmA_PGE2_EP1	prostaglandin E2 receptor EP1 subtype, member of the class A family of seven-transmembrane G protein-coupled receptors. Prostaglandin E2 receptor EP1, also called prostanoid EP1 receptor, is one of four receptor subtypes whose endogenous physiological ligand is prostaglandin E2 (PGE2). Each of these subtypes (EP1-EP4) has a unique but overlapping tissue distribution and activates different intracellular signaling pathways. It has been shown that stimulation of the EP1 receptor by PGE2 causes smooth muscle contraction and increased intracellular Ca2+ levels; however, it is still unclear whether EP1 receptor is exclusively coupled to G(q/11), which leads to activation of phospholipase C and phosphatidylinositol hydrolysis. Prostanoids are the cyclooxygenase (COX) metabolites of arachidonic acid, which include the prostaglandins (PGD2, PGE2, PGF2alpha), prostacyclin (PGI2), and thromboxane A2 (TxA2). These five major bioactive prostanoids act as mediators or modulators in a wide range of physiological and pathophysiological processes within the kidney and play important roles in inflammation, platelet aggregation, and vasoconstriction/relaxation, among many others. They act locally by preferentially interacting with G protein-coupled receptors designated DP, EP, FP, IP, and TP, respectively. The phylogenetic tree suggests that the prostanoid receptors can be grouped into two major branches: G(s)-coupled (DP1, EP2, EP4, and IP) and G(i)- (EP3) or G(q)-coupled (EP1, FP, and TP), forming three clusters.	294
320273	cd15145	7tmA_FP	prostaglandin F2-alpha receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. The PGF2-alpha receptor, also called prostanoid FP receptor, is a class A G-protein coupled receptor whose endogenous ligand is prostaglandin F2-alpha.  PGF2-alpha binding to this receptor is coupled to the stimulation of phospholipase C (PLC) pathway via G-protein subunit G(q).  This leads to the release of inositol trisphosphate (IP3) and diacylglycerol (DAG) which results in increased intracellular Ca2+ levels and activation of PKC.  The receptor activation primarily induces uterine contraction and bronchoconstriction, and stimulates luteolysis. Like most prostanoid receptors, the PGF2-alpha receptor has also been implicated in tumor angiogenesis and metastasis.	290
320274	cd15146	7tmA_PGE2_EP3	prostaglandin E2 receptor EP3 subtype, member of the class A family of seven-transmembrane G protein-coupled receptors. Prostaglandin E2 receptor EP3, also called prostanoid EP3 receptor, is one of four receptor subtypes whose endogenous physiological ligand is prostaglandin E2 (PGE2). Each of these subtypes (EP1-EP4) has a unique but overlapping tissue distribution and activates different intracellular signaling pathways. Stimulation of the EP3 receptor by PGE2 preferentially couples to G(i) protein. This leads to a decrease in adenylate cyclase activity, thereby decreasing cAMP levels, which subsequently produces smooth muscle contraction. Knockout mice studies suggest that the EP3 receptor may act as a systemic vasopressor. Prostanoids are the cyclooxygenase (COX) metabolites of arachidonic acid, which include the prostaglandins (PGD2, PGE2, PGF2alpha), prostacyclin (PGI2), and thromboxane A2 (TxA2). These five major bioactive prostanoids act as mediators or modulators in a wide range of physiological and pathophysiological processes within the kidney and play important roles in inflammation, platelet aggregation, and vasoconstriction/relaxation, among many others. They act locally by preferentially interacting with G protein-coupled receptors designated DP, EP, FP, IP, and TP, respectively. The phylogenetic tree suggests that the prostanoid receptors can be grouped into two major branches: G(s)-coupled (DP1, EP2, EP4, and IP) and G(i)- (EP3) or G(q)-coupled (EP1, FP, and TP), forming three clusters.	308
320275	cd15147	7tmA_PAFR	platelet-activating factor receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. The platelet-activating factor receptor is a G(q/11)-protein coupled receptor, which is linked to p38 MAPK and PI3K signaling pathways. PAF is a phospholipid (1-O-alkyl-2-acetyl-sn-glycero-3-phosphorylcholine) which is synthesized by cells especially involved in host defense such as platelets, macrophages, neutrophils, and monocytes. PAF is well-known for its ability to induce platelet aggregation and anaphylaxis, and also plays important roles in allergy, asthma, and inflammatory responses, among many others.	291
320276	cd15148	7tmA_GPR34-like	putative G protein-coupled receptor 34, member of the class A family of seven-transmembrane G protein-coupled receptors. This subgroup represents the G-protein coupled receptor 34 of unknown function. Orphan GPR34 is a member of the rhodopsin-like, class A GPCRs, which is a widespread protein family that includes the light-sensitive rhodopsin as well as receptors for biogenic amines, lipids, nucleotides, odorants, peptide hormones, and a variety of other ligands. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	282
320277	cd15149	7tmA_P2Y14	P2Y purinoceptor 14, member of the class A family of seven-transmembrane G protein-coupled receptors. The P2Y14 receptor is activated by UDP-sugars and belongs to the G(i) class of the P2Y family of purinergic G-protein coupled receptors. The P2Y receptor family is composed of eight subtypes, which are activated by naturally occurring extracellular nucleotides such as ATP, ADP, UTP, UDP, and UDP-sugars. These eight receptors are ubiquitous in human tissues and can be further classified into two subfamilies based on sequence homology and second messenger coupling: a subfamily of five P2Y1-like receptors (P2Y1, P2Y2, P2Y4, P2Y6, and P2Y11Rs) that are coupled to G(q) protein to activate phospholipase C (PLC) and a second subfamily of three P2Y12-like receptors (P2Y12, P2Y13, and P2Y14Rs) that are coupled to G(i) protein to inhibit adenylate cyclase. Several cloned subtypes, such as P2Y3, P2Y5 and P2Y7-10, are not functional mammalian nucleotide receptors. The native agonists for P2Y receptors are: ATP (P2Y2, P2Y12), ADP (P2Y1, P2Y12 and P2Y13), UTP (P2Y2, P2Y4), UDP (P2Y6, P2Y14), and UDP-sugars (P2Y14).  The P2Y14 receptor has been reported to be involved in a diverse set of physiological responses in many epithelia as well as in immune and inflammatory cells.	284
341326	cd15150	7tmA_P2Y12	P2Y purinoceptor 12, member of the class A family of seven-transmembrane G protein-coupled receptors. The P2Y12 receptor (P2Y12R) is found predominantly on the surface of blood platelets and is activated by adenosine diphosphate (ADP). P2Y12R plays an important role in the regulation of blood clotting and belongs to the G(i) class of the P2Y family of purinergic G protein-coupled receptors. The P2Y receptor family is composed of eight subtypes, which are activated by naturally occurring extracellular nucleotides such as ATP, ADP, UTP, UDP, and UDP-sugars. These eight receptors are ubiquitous in human tissues and can be further classified into two subfamilies based on sequence homology and second messenger coupling: a subfamily of five P2Y1-like receptors (P2Y1, P2Y2, P2Y4, P2Y6, and P2Y11Rs) that are coupled to G(q) protein to activate phospholipase C (PLC) and a second subfamily of three P2Y12-like receptors (P2Y12, P2Y13, and P2Y14Rs) that are coupled to G(i) protein to inhibit adenylate cyclase. Several cloned subtypes, such as P2Y3, P2Y5 and P2Y7-10, are not functional mammalian nucleotide receptors. The native agonists for P2Y receptors are: ATP (P2Y2, P2Y12), ADP (P2Y1, P2Y12 and P2Y13), UTP (P2Y2, P2Y4), UDP (P2Y6, P2Y14), and UDP-sugars (P2Y14).	285
341327	cd15151	7tmA_P2Y13	P2Y purinoceptor 13, member of the class A family of seven-transmembrane G protein-coupled receptors. The P2Y13 receptor (P2Y13R) is activated by adenosine diphosphate (ADP) and belongs to the G(i) class of the P2Y family of purinergic G protein-coupled receptors. The P2Y receptor family is composed of eight subtypes, which are activated by naturally occurring extracellular nucleotides such as ATP, ADP, UTP, UDP, and UDP-sugars. These eight receptors are ubiquitous in human tissues and can be further classified into two subfamilies based on sequence homology and second messenger coupling: a subfamily of five P2Y1-like receptors (P2Y1, P2Y2, P2Y4, P2Y6, and P2Y11Rs) that are coupled to G(q) protein to activate phospholipase C (PLC) and a second subfamily of three P2Y12-like receptors (P2Y12, P2Y13, and P2Y14Rs) that are coupled to G(i) protein to inhibit adenylate cyclase. Several cloned subtypes, such as P2Y3, P2Y5 and P2Y7-10, are not functional mammalian nucleotide receptors. The native agonists for P2Y receptors are: ATP (P2Y2, P2Y12), ADP (P2Y1, P2Y12 and P2Y13), UTP (P2Y2, P2Y4), UDP (P2Y6, P2Y14), and UDP-sugars (P2Y14).	284
320280	cd15152	7tmA_GPR174-like	putative purinergic receptor GPR174, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR174 has been recently identified as a lysophosphatidylserine receptor that enhances intracellular cAMP formation by coupling to a G(s) protein. GPR174 is a member of the rhodopsin-like, class A GPCRs, which is a widespread protein family that includes the light-sensitive rhodopsin as well as receptors for biogenic amines, lipids, nucleotides, odorants, peptide hormones, and a variety of other ligands. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	282
320281	cd15153	7tmA_P2Y10	P2Y purinoceptor 10, member of the class A family of seven-transmembrane G protein-coupled receptors. P2Y10 receptor is a G-protein coupled receptor that is activated by both sphingosine-1-phosphate (S1P) and lysophosphatidic acid (LPA). Phylogenetic analysis of the class A GPCRs shows that P2Y10 is grouped into the cluster comprising nucleotide and lipid receptors. Although the mouse P2Y10 was found to be expressed in brain, lung, reproductive organs, and skeletal muscle, the physiological function of this receptor is not yet known. S1P and LPA are bioactive lipid molecules that induce a variety of cellular responses through G proteins: adhesion, invasion, cell migration and proliferation, among many others.	283
320282	cd15154	7tmA_LPAR5	lysophosphatidic acid receptor 5, member of the class A family of seven-transmembrane G protein-coupled receptors. Lysophosphatidic acid receptor 5 (LPAR5) is a G protein-coupled receptor that binds the bioactive lipid lysophosphatidic acid (LPA) and is involved in maintenance of human hair growth. Phylogenetic analysis of the class A GPCRs shows that LPAR5 is classified into the cluster consisting of receptors that are preferentially activated by adenosine and uridine nucleotides. Although LPA6 (P2Y5) is expressed in human hair follicle cells, LPA4 and LPA5 are not. These three receptors are highly homologous and mediate an increase in intracellular cAMP production. Activation of LPAR5 is coupled to G(q) and G(12/13) proteins.	285
320283	cd15155	7tmA_LPAR4	lysophosphatidic acid receptor 4, member of the class A family of seven-transmembrane G protein-coupled receptors. Lysophosphatidic acid receptor 4 (LPAR4) is a G protein-coupled receptor that binds and is activated by the bioactive lipid lysophosphatidic acid (LPA), which is released by activated platelets and constitutively found in serum. Phylogenetic analysis of the class A GPCRs shows that LPAR4 is classified into the cluster consisting of receptors that are preferentially activated by adenosine and uridine nucleotides. Although LPA6 (P2Y5) is expressed in human hair follicle cells, LPA4 and LPA5 are not. These three receptors are highly homologous and mediate an increase in intracellular cAMP production. Activation of LPAR5 is coupled to G(12/13) proteins, leading to neurite retraction and stress fiber formation, whereas coupling to G(q) protein leads to increases in calcium levels.	283
320284	cd15156	7tmA_LPAR6_P2Y5	lysophosphatidic acid receptor 6, member of the class A family of seven-transmembrane G protein-coupled receptors. Lysophosphatidic acid receptor 6 (LPAR6), also known as P2Y5, is a G(i)- and G(12/13)-coupled receptor that is activated by the bioactive lipid lysophosphatidic acid (LPA), which is released by activated platelets and constitutively present in serum. LPAR6 plays an important role in maintenance of human hair growth. Thus, mutations in the receptor are responsible for both autosomal recessive woolly hair and hypotrichosis. Phylogenetic analysis of the class A GPCRs shows that LPAR6 (P2Y5) is classified into the cluster consisting of receptors that are preferentially activated by adenosine and uridine nucleotides. Although LPA6 (P2Y5) is expressed in human hair follicle cells, LPA4 and LPA5 are not. These three receptors are highly homologous and mediate an increase in intracellular cAMP production.	285
320285	cd15157	7tmA_CysLTR2	cysteinyl leukotriene receptor 2, member of the class A family of seven-transmembrane G protein-coupled receptors. Cysteinyl leukotrienes (LTC4, LTD4, and LTE4) are the most potent inflammatory lipid mediators that play an important role in human asthma. They are synthesized in the leucocytes (cells of immune system) from arachidonic acid by the actions of 5-lipoxygenase and induce bronchial constriction through G protein-coupled receptors, CysLTR1 and CysLTR2. Activation of CysLTR1 by LTD4 induces airway smooth muscle contraction and proliferation, eosinophil migration, and damage to the lung tissue. They belong to the class A GPCR superfamily, which all have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	278
320286	cd15158	7tmA_CysLTR1	cysteinyl leukotriene receptor 1, member of the class A family of seven-transmembrane G protein-coupled receptors. Cysteinyl leukotrienes (LTC4, LTD4, and LTE4) are the most potent inflammatory lipid mediators that play an important role in human asthma. They are synthesized in the leucocytes (cells of immune system) from arachidonic acid by the actions of 5-lipoxygenase and induce bronchial constriction through G protein-coupled receptors, CysLTR1 and CysLTR2. Activation of CysLTR1 by LTD4 induces airway smooth muscle contraction and proliferation, eosinophil migration, and damage to the lung tissue. They belong to the class A GPCR superfamily, which all have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	285
320287	cd15159	7tmA_EBI2	Epstein-Barr virus (EBV)-induced gene 2, member of the class A family of seven-transmembrane G protein-coupled receptors. Epstein-Barr virus-induced G-protein coupled receptor 2 (EBI2), also called GPR183, is activated by 7alpha,25-dihydroxycholesterol (7alpha,25-OHC), an oxysterol. EBI2 was originally identified as one of the major genes induced in the Burkitt's lymphoma cell line BL41 by EBV infection. EBI2 is involved in regulating B cell migration and responses, and is also implicated in human diseases such as type I diabetes, multiple sclerosis, and cancers.	286
320288	cd15160	7tmA_Proton-sensing_R	proton-sensing G protein-coupled receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Proton/pH-sensing G-protein coupled receptors sense pH of 7.6 to 6.0.  They mediate a variety of biological activities in neutral and mildly acidic pH conditions, whereas the acid-sensing ionotropic ion channels typically sense strong acidic pH. The proton/pH-sensing receptor family includes the G2 accumulation receptor (G2A, also known as GPR132), the T cell death associated gene-8 (TDAG8, GPR65) receptor, ovarian cancer G-protein receptor 1 (OGR-1, GPR68), and G-protein-coupled receptor 4 (GPR4).	280
320289	cd15161	7tmA_GPR17	G protein-coupled receptor 17, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR17 is a Forkhead box protein O1 (FOXO1) target and abundantly expressed in agouti-related peptide (AGRP) neurons. FOXO1 is a transcription factor that plays key roles in regulation of gluconeogenesis and glycogenolysis by insulin signaling.  For instance, food intake and body weight increase when hypothalamic FOXO1 is activated, whereas they both decrease when FOXO1 is inhibited. However, a recent study reported that GPR17 deficiency in mice did not affect food intake or glucose homeostasis. Thus, GPR17 may not play a role in the control of food intake, body weight, or glycemic control. GPR17 is phylogenetically closely related to purinergic P2Y and cysteinyl-leukotriene receptors.	277
341328	cd15162	7tmA_PAR	protease-activated receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. This subfamily includes purinergic receptor P2Y8 and protease-activated receptors. P2Y8 (or P2RY8) expression is often increased in leukemia patients, and it plays a role in the pathogenesis of acute leukemia. P2Y8 is phylogenetically closely related to the protease-activated receptors (PARs), which are activated by serine proteases such as thrombin, trypsin, and tryptase. These proteases cleave the extracellular domain of the receptor to form a new N-terminus, which in turn functions as a tethered ligand. The newly-formed tethered ligand binds intramolecularly to activate the receptor and triggers G-protein binding and intracellular signaling. Four different types of the protease-activated receptors have been identified (PAR1-4) and are predominantly expressed in platelets. PAR1, PAR3, and PAR4 are activated by thrombin, whereas PAR2 is activated by trypsin.  The PARs are known to couple with several G-proteins including Gi (cAMP inhibitory), G12/13 (Rho and Ras activation), and Gq (calcium signaling) to activate downstream signaling messengers which induces numerous cellular and physiological effects.	280
320291	cd15163	7tmA_GPR20	G protein-coupled receptor 20, member of the class A family of seven-transmembrane G protein-coupled receptors. Orphan GPR20 is phylogenetically related to the P2Y family of purinergic G protein-coupled receptors. The P2Y receptor family is composed of eight subtypes, which are activated by naturally occurring extracellular nucleotides such as ATP, ADP, UTP, UDP, and UDP-glucose. GPR20 has been shown to constitutively activate G(i) proteins in the absence of a ligand; however its functional role is not known. GPR20 is a member of the class A G protein-coupled receptor superfamily, which all have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A common feature of GPCR signaling is agonist-induced conformational changes in the receptors, which then activate the heterotrimeric G proteins. G-proteins regulate a variety of cellular functions including metabolic enzymes, ion channels, and transporters, among many others.	258
320292	cd15164	7tmA_GPR35-like	G protein-coupled receptor 35 and similar proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR35 shares closest homology with GPR55, and they belong to the class A G protein-coupled receptor superfamily, which all have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A number of studies have suggested that GPR35 may play important physiological roles in hypertension, atherosclerosis, nociception, asthma, glucose homeostasis and diabetes, and inflammatory bowel disease. GPR35 is thought to be responsible for brachydactyly mental retardation syndrome, which is associated with a deletion comprising chromosome 2q37 in human, and is also implicated as a potential oncogene in stomach cancer. Several endogenous ligands for GPR35 have been identified including kynurenic acid, 2-oleoyl lysophosphatidic acid, and zaprinast.  GPR35 couples to G(13) and G(i/o) proteins.	272
320293	cd15165	7tmA_GPR55-like	G protein-coupled receptor 55 and similar proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR55 shares closest homology with GPR35, and they belong to the class A G protein-coupled receptor superfamily, which all have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. GPR55 has been reported to couple to G(13), G(12), or G(q) proteins. Activation of GPR55 leads to activation of phospholipase C, RhoA, ROCK, ERK, p38MAPK, and calcium release. Lysophosphatidylinositol (LPI) is currently considered as the endogenous ligand for GPR55, although the receptor was initially de-orphanized as a cannabinoid receptor and binds many cannabinoid ligands.	277
320294	cd15166	7tmA_NAGly_R_GPR18	N-arachidonyl glycine receptor, GPR18, member of the class A family of seven-transmembrane G protein-coupled receptors. N-arachidonyl glycine (NAGly), an endogenous metabolite of the endocannabinoid anandamide, has been identified as an endogenous ligand of the G(i/o) protein-coupled receptor 18 (GPR18). NAGly is involved in directing microglial migration in the CNS through activation of GPR18. NAGly-GPR18 signaling is thought to play an important role in microglial-neuronal communication. Recent studies also show that GPR18 functions as the abnormal cannabidiol (Abn-CBD) receptor. Abn-CBD is a synthetic isomer of cannabidiol and is inactive at cannabinoid receptors (CB1 or CB2), but acts as a selective agonist at GPR18. The NAGly receptor is a member of the class A G protein-coupled receptor superfamily, which all have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, which then activate the heterotrimeric G proteins. G-proteins regulate a variety of cellular functions including metabolic enzymes, ion channels, and transporters, among many others.	275
320295	cd15167	7tmA_GPR171	orphan G protein-coupled receptor 171, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR171 is phylogenetically related to the P2Y family of purinergic G protein-coupled receptors. The P2Y receptor family is composed of eight subtypes, which are activated by naturally occurring extracellular nucleotides such as ATP, ADP, UTP, UDP, and UDP-glucose. A recent study reported that the peptide LENSSPQAPARRLLPP (BigLEN) activates GPR171 to regulate body weight in mice; however the biological role of the receptor remains unknown. GPR171 is a member of the class A G protein-coupled receptor superfamily, which all have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A common feature of GPCR signaling is agonist-induced conformational changes in the receptors, which then activate the heterotrimeric G proteins. G-proteins regulate a variety of cellular functions including metabolic enzymes, ion channels, and transporters, among many others.	282
341329	cd15168	7tmA_P2Y1-like	P2Y purinoceptors 1, 2, 4, 6, 11 and similar proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. The P2Y receptor family is composed of eight subtypes, which are activated by naturally occurring extracellular nucleotides such as ATP, ADP, UTP, UDP, and UDP-glucose. These eight receptors are ubiquitous in human tissues and can be further classified into two subfamilies based on sequence homology and second messenger coupling: a subfamily of five P2Y1-like receptors (P2Y1, P2Y2, P2Y4, P2Y6, and P2Y11Rs) that are coupled to G(q) protein to activate phospholipase C (PLC) and a second subfamily of three P2Y12-like receptors (P2Y12, P2Y13, and P2Y14Rs) that are coupled to G(i) protein to inhibit adenylate cyclase. Several cloned subtypes, such as P2Y3, P2Y5, and P2Y7-10, are not functional mammalian nucleotide receptors. The native agonists for P2Y receptors are: ATP (P2Y2, P2Y12), ADP (P2Y1, P2Y12, and P2Y13), UTP (P2Y2, P2Y4), UDP (P2Y6, P2Y14), and UDP-glucose (P2Y14). This cluster only includes P2Y1-like receptors as well as other closely related orphan receptors, such as GPR91 (a succinate receptor) and GPR80/GPR99 (an alpha-ketoglutarate receptor).	284
320297	cd15169	7tmA_FFAR1	free fatty acid receptor 1, member of the class A family of seven-transmembrane G protein-coupled receptors. This subgroup includes the mammalian free fatty acid receptor 1 (FFAR1), also called GPR40. FFAR1 is a cell-surface receptor for medium- and long-chain free fatty acids (FFAs). The receptor is most potently activated by eicosatrienoic acid (C20:3), but can also be activated at micromolar concentrations of various fatty acids. FFAR1 directly mediates FFA stimulation of glucose-stimulated insulin secretion and indirectly increases insulin secretion by enhancing the release of incretin. Free fatty acid receptors (FFARs) belong to the class A G-protein coupled receptors and are comprised of three members, each encoded by a separate gene (FFAR1, FFAR2, and FFAR3). These genes and a fourth pseudogene, GPR42, are localized together on chromosome 19. FFARs are considered important components of the body's nutrient sensing mechanism, and therefore, these receptors are potential therapeutic targets for the treatment of metabolic disorders, such as type 2 diabetes and obesity.	284
320298	cd15170	7tmA_FFAR2_FFAR3	free fatty acid receptors 2, 3, and similar proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This subgroup includes free fatty acid receptor 2 (FFAR2), FFAR3, and similar proteins. They are members of the class A G-protein coupled receptors that bind free fatty acids. The FFAR subfamily is composed of three receptors, each encoded by a separate gene (FFAR1, FFAR2, and FFAR3). These genes and a fourth pseudogene, GPR42, are localized together on chromosome 19. FFAR2 and FFAR3 are cell-surface receptors for short chain FFAs (SCFAs) with different ligand affinities, whereas FFAR1 is a receptor for medium- and long-chain FFAs. FFAR2 activation by SCFA suppresses adipose insulin signaling, which leads to inhibition of fat accumulation in adipose tissue. FFAR3 is expressed in intestinal L cells, which produce glucagon-like peptide 1 (GLP-1) and peptide YY (PYY), thus suggesting that this receptor may be involved in energy homeostasis. FFARs are considered important components of the body's nutrient sensing mechanism, and therefore, these receptors are potential therapeutic targets for the treatment of metabolic disorders, such as type 2 diabetes and obesity.	278
320299	cd15171	7tmA_CCRL2	CC chemokine receptor-like 2, member of the class A family of seven-transmembrane G protein-coupled receptors. Chemokine (CC-motif) receptor-like 2 (CCRL2) is a member of the atypical chemokine receptor family. CCRL2, like other atypical receptors, has an alteration in the conserved DRYLAIV motif in the third intracellular loop, which is essential for GPCR coupling and signaling. CCRL2 is expressed in most hematopoietic cells and many lymphoid organs as well as in heart and lung. CCRL2 was initially reported to promote chemotaxis and calcium fluxes in response to chemokines (CCL2, CCL5, CCL7, and CCL8); however, these results are still controversial. More recently, chemerin, a chemotactic agonist of CMKLR1 (chemokine-like receptor-1) and GPR1, was identified as a novel non-signaling ligand for both human and mouse CCRL2. Chemokines are principal regulators for leukocyte trafficking, recruitment, and activation. Chemokine family membership is defined on the basis of sequence homology and on the presence of variations on a conserved cysteine motif, which allows the family to further divide into four subfamilies (CC, CXC, XC, and CX3C).	279
341330	cd15172	7tmA_CCR6	CC chemokine receptor type 6, member of the class A family of seven-transmembrane G protein-coupled receptors. CCR6 is the only known receptor identified for the chemokine CCL20 (also known as macrophage inflammatory protein-3alpha, MIP-3alpha). CCR6 is expressed by all mature human B cells, effector memory T-cells, and dendritic cells found in the gut mucosal immune system. CCL20 contributes to recruitment of CCR6-expressing cells to Peyer's patches and isolated lymphoid follicles in the intestine, thereby promoting the assembly and maintenance of organized lymphoid structures. Also, CCL20 expression is highly inducible in response to inflammatory signals. Thus, CCL20 is involved in both inflammatory and homeostatic functions in the immune system. Chemokines are principal regulators for leukocyte trafficking, recruitment, and activation. Chemokine family membership is defined on the basis of sequence homology and on the presence of variations on a conserved cysteine motif, which allows the family to further divide into four subfamilies (CC, CXC, XC, and CX3C). Chemokines interact with seven-transmembrane receptors which are typically coupled to G protein for signaling. Currently, there are ten known receptors for CC chemokines, seven for CXC chemokines, and single receptors for the XC and CX3C chemokines. The CC chemokine receptors all activate the G protein Gi.	281
320301	cd15173	7tmA_CXCR6	CXC chemokine receptor type 6, member of the class A family of seven-transmembrane G protein-coupled receptors. CXCR6 binds specifically to the chemokine CXCL16, which is expressed on dendritic cells, monocyte/macrophages, activated T cells, fibroblastic reticular cells, and cancer cells. CXCR6 is phylogenetically more closely related to CC-type chemokine receptors (CCR6 and CCR9) than other CXC receptors. Chemokines are principal regulators for leukocyte trafficking, recruitment, and activation. Chemokine family membership is defined on the basis of sequence homology and on the presence of variations on a conserved cysteine motif, which allows the family to further divide into four subfamilies (CC, CXC, XC, and CX3C). Chemokines interact with seven-transmembrane receptors which are typically coupled to G protein for signaling. Currently, there are ten known receptors for CC chemokines, seven for CXC chemokines, and single receptors for the XC and CX3C chemokines.	270
320302	cd15174	7tmA_CCR9	CC chemokine receptor type 9, member of the class A family of seven-transmembrane G protein-coupled receptors. CCR9 is a homeostatic receptor specific for CCL25 (formerly known as thymus expressed chemokine) and is highly expressed on both immature and mature thymocytes as well as on intestinal homing T lymphocytes and mucosal lymphocytes. In cutaneous melanoma, activation of CCR9-CCL25 has been shown to stimulate metastasis to the small intestine. Chemokines are principal regulators for leukocyte trafficking, recruitment, and activation. Chemokine family membership is defined on the basis of sequence homology and on the presence of variations on a conserved cysteine motif, which allows the family to further divide into four subfamilies (CC, CXC, XC, and CX3C). Chemokines interact with seven-transmembrane receptors which are typically coupled to G protein for signaling. Currently, there are ten known receptors for CC chemokines, seven for CXC chemokines, and single receptors for the XC and CX3C chemokines. The CC chemokine receptors all activate the G protein Gi.	280
341331	cd15175	7tmA_CCR7	CC chemokine receptor type 7, member of the class A family of seven-transmembrane G protein-coupled receptors. CCR7 is a major homeostatic receptor responsible for lymph node development and effective adaptive immune responses and plays a critical role in trafficking of dendritic cells and B and T lymphocytes. Its only two ligands, CCL19 and CCL21, are primarily produced by stromal cells in the T cell zones of lymph nodes and spleen. Chemokines are principal regulators for leukocyte trafficking, recruitment, and activation. Chemokine family membership is defined on the basis of sequence homology and on the presence of variations on a conserved cysteine motif, which allows the family to further divide into four subfamilies (CC, CXC, XC, and CX3C). Chemokines interact with seven-transmembrane receptors which are typically coupled to G protein for signaling. Currently, there are ten known receptors for CC chemokines, seven for CXC chemokines, and single receptors for the XC and CX3C chemokines. The CC chemokine receptors all activate the G protein Gi.	278
320304	cd15176	7tmA_ACKR4_CCR11	atypical chemokine receptor 4, member of the class A family of seven-transmembrane G protein-coupled receptors. ACKR4 was first reported to bind several CC chemokines including CCL19, CCL21, and CCL25 and was originally designated CCR11. AKCR4 is unable to couple to G-protein and, instead, it preferentially mediates beta-arrestin dependent processes, such as receptor internalization, after ligand binding. Thus, ACKR4 may act as a scavenger receptor to suppress the effects of proinflammatory chemokines. Unlike the classical chemokine receptors that contain a conserved DRYLAIV motif in the second intracellular loop, which is required for G-protein coupling, the ACKRs lack this conserved motif and fail to couple to G-proteins and induce classical GPCR signaling. Five receptors have been identified for the ACKR family, including CC-chemokine receptors like 1 and 2 (CCRL1 and CCRL2), CXCR7, Duffy antigen receptor for chemokine (DARC), and D6. Both ACKR1 (DARC) and ACKR3 (CXCR7) show low sequence homology to the classic chemokine receptors.	276
341332	cd15177	7tmA_CCR10	CC chemokine receptor type 10, member of the class A family of seven-transmembrane G protein-coupled receptors. CCR10 is a homeostatic receptor specific for two C-C motif chemokines, CCL27 and CCL28.  Activation of CCR10 by its two ligands mediates diverse activities, ranging from leukocyte trafficking to skin cancer. Chemokines are principal regulators for leukocyte trafficking, recruitment, and activation. Chemokine family membership is defined on the basis of sequence homology and on the presence of variations on a conserved cysteine motif, which allows the family to further divide into four subfamilies (CC, CXC, XC, and CX3C). Chemokines interact with seven-transmembrane receptors which are typically coupled to G protein for signaling. Currently, there are ten known receptors for CC chemokines, seven for CXC chemokines, and single receptors for the XC and CX3C chemokines. The CC chemokine receptors are all activating the G protein Gi.	280
341333	cd15178	7tmA_CXCR1_2	CXC chemokine receptor types 1 and 2, member of the class A family of seven-transmembrane G protein-coupled receptors. CXCR1 and CXCR2 are closely related chemotactic receptors for a group of CXC chemokines distinguished by the presence of the amino acid motif ELR immediately adjacent to their CXC motif. Expression of CXCR1 and CXCR2 is strictly controlled in neutrophils by external stimuli such as lipopolysaccharide (LPS), tumor necrosis factor (TNF)-alpha, Toll-like receptor agonists, and nitric oxide. CXCL8 (formerly known as interleukin-8) binds with high-affinity and activates both receptors. CXCR1 also binds CXCL7 (neutrophil-activating protein-2), whereas CXCR2 non-selectively binds to all seven ELR-positive chemokines (CXCL1-7). Chemokines are principal regulators for leukocyte trafficking, recruitment, and activation. Chemokine family membership is defined on the basis of sequence homology and on the presence of variations on a conserved cysteine motif, which allows the family to further divide into four subfamilies (CC, CXC, XC, and CX3C). Chemokines interact with seven-transmembrane receptors which are typically coupled to G protein for signaling. Currently, there are ten known receptors for CC chemokines, seven for CXC chemokines, and single receptors for the XC and CX3C chemokines.	279
341334	cd15179	7tmA_CXCR4	CXC chemokine receptor type 4, member of the class A family of seven-transmembrane G protein-coupled receptors. CXCR4 is the only known G protein-coupled chemokine receptor for the key homeostatic ligand CXCL12, which is constitutively secreted by bone marrow stromal cells. Atypical chemokine receptor CXCR7 (ACKR3) also binds CXCL12, but activates signaling in a G protein-independent manner. CXCR4 is also a co-receptor for HIV infection and plays critical roles in the development of immune system during both lymphopoiesis and myelopoiesis. Chemokines are principal regulators for leukocyte trafficking, recruitment, and activation. Chemokine family membership is defined on the basis of sequence homology and on the presence of variations on a conserved cysteine motif, which allows the family to further divide into four subfamilies (CC, CXC, XC, and CX3C). Chemokines interact with seven-transmembrane receptors which are typically coupled to G protein for signaling. Currently, there are ten known receptors for CC chemokines, seven for CXC chemokines, and single receptors for the XC and CX3C chemokines.	278
341335	cd15180	7tmA_CXCR3	CXC chemokine receptor type 3, member of the class A family of seven-transmembrane G protein-coupled receptors. CXCR3 is an inflammatory chemotactic receptor for a group of CXC chemokines distinguished by the presence of the amino acid motif ELR immediately adjacent to their CXC motif. CXCR3 specifically binds three chemokines CXCL9 (monokine induced by gamma-interferon), CXCL10 (interferon induced protein of 10 kDa), and CXCL11 (interferon inducible T-cell alpha-chemoattractant, I-TAC). CXCR3 is expressed on CD4+ Th1 and CD8+ cytotoxic T lymphocytes as well as highly on innate lymphocytes, such as NK cells and NK T cells, where it may mediate the recruitment of these cells to the sites of infection and inflammation. Chemokines are principal regulators for leukocyte trafficking, recruitment, and activation. Chemokine family membership is defined on the basis of sequence homology and on the presence of variations on a conserved cysteine motif, which allows the family to further divide into four subfamilies (CC, CXC, XC, and CX3C). Chemokines interact with seven-transmembrane receptors which are typically coupled to G protein for signaling. Currently, there are ten known receptors for CC chemokines, seven for CXC chemokines, and single receptors for the XC and CX3C chemokines.	280
341336	cd15181	7tmA_CXCR5	CXC chemokine receptor type 5, member of the class A family of seven-transmembrane G protein-coupled receptors. CXCR5 is a B-cell selective receptor that binds specifically to the homeostatic chemokine CXCL13 and regulates adaptive immunity. The receptor is found on all peripheral blood and tonsillar B cells and is involved in lymphocyte migration (homing) to specific tissues and development of normal lymphoid tissue. Chemokines are principal regulators for leukocyte trafficking, recruitment, and activation. Chemokine family membership is defined on the basis of sequence homology and on the presence of variations on a conserved cysteine motif, which allows the family to further divide into four subfamilies (CC, CXC, XC, and CX3C). Chemokines interact with seven-transmembrane receptors which are typically coupled to G protein for signaling. Currently, there are ten known receptors for CC chemokines, seven for CXC chemokines, and single receptors for the XC and CX3C chemokines.	281
341337	cd15182	7tmA_XCR1	XC chemokine receptor 1, member of the class A family of seven-transmembrane G protein-coupled receptors. XCR1 is a chemokine receptor specific for XCL1 and XCL2 (previously called lymphotactin alpha/beta), which differ in only two amino acids. XCL1/2 is the only member of the C chemokine subfamily, which is unique as containing only two of the four cysteines that are found in other chemokine families. Human XCL1/2 has been shown to be secreted by activated CD8+ T cells and upon activation of the innate immune system. Chemokines are principal regulators for leukocyte trafficking, recruitment, and activation. Chemokine family membership is defined on the basis of sequence homology and on the presence of variations on a conserved cysteine motif, which allows the family to further divide into four subfamilies (CC, CXC, XC, and CX3C). Chemokines interact with seven-transmembrane receptors which are typically coupled to G protein for signaling.	271
320311	cd15183	7tmA_CCR1	CC chemokine receptor type 1, member of the class A family of seven-transmembrane G protein-coupled receptors. CCR1 is widely expressed on both hematopoietic and non-hematopoietic cells and binds to the inflammatory CC chemokines CCL3, CCL5, CCL6, CCL9, CCL15, and CCL23. CCR1 activates the typical chemokine signaling pathway through the G(i/o) type of G proteins, causing inhibition of adenylate cyclase and stimulation of phospholipase C, PKC, calcium flux, and PLA2. Chemokines are principal regulators for leukocyte trafficking, recruitment, and activation. Chemokine family membership is defined on the basis of sequence homology and on the presence of variations on a conserved cysteine motif, which allows the family to further divide into four subfamilies (CC, CXC, XC, and CX3C). Chemokines interact with seven-transmembrane receptors which are typically coupled to G protein for signaling. Currently, there are ten known receptors for CC chemokines, seven for CXC chemokines, and single receptors for the XC and CX3C chemokines.	278
341338	cd15184	7tmA_CCR5_CCR2	CC chemokine receptor types 5 and 2, member of the class A family of seven-transmembrane G protein-coupled receptors. CCR2 and CCR5 share very high amino acid sequence identity. Both receptors play important roles in the trafficking of monocytes/macrophages and are implicated in the pathogenesis of immunologic diseases (rheumatoid arthritis, celiac disease, and transplant rejection) and cardiovascular diseases (atherosclerosis and autoimmune hepatitis). CCR2 is a receptor specific for members of the monocyte chemotactic protein family, including CCL2, CCL7, and CCL13. Conversely, CCR5 is a major co-receptor for HIV infection and binds many CC chemokine ligands, including CCL2, CCL3, CCL4, CCL5, CCL11, CCL13, CCL14, and CCL16. CCR2 is expressed primarily on blood monocytes and memory T cells, whereas CCR5 is expressed on antigen-presenting cells (macrophages and dendritic cells) and activated T effector cells. Chemokines are principal regulators for leukocyte trafficking, recruitment, and activation. Chemokine family membership is defined on the basis of sequence homology and on the presence of variations on a conserved cysteine motif, which allows the family to further divide into four subfamilies (CC, CXC, XC, and CX3C). Chemokines interact with seven-transmembrane receptors which are typically coupled to G protein for signaling. Currently, there are ten known receptors for CC chemokines, seven for CXC chemokines, and single receptors for the XC and CX3C chemokines.	278
341339	cd15185	7tmA_CCR3	CC chemokine receptor type 3, member of the class A family of seven-transmembrane G protein-coupled receptors. CCR3 is a highly promiscuous receptor that binds a variety of inflammatory CC-type chemokines, including CCL11 (eotaxin-1), CCL3L1, CCL5 (regulated on activation, normal T cell expressed and secreted; RANTES), CCL7 (monocyte-specific chemokine 3 or MCP-3), CCL8 (MCP-2), CCL11, CCL13 (MCP-4), CCL15, CCL24 (eotaxin-2), CCL26 (eotaxin-3), and CCL28. Among these, the eosinophil chemotactic chemokines (CCL11, CCL24, and CCL26) are the most potent and specific ligands. In addition to eosinophil, CCR3 is expressed on cells involved in allergic responses, such as basophils, Th2 lymphocytes, and mast cells. Chemokines are principal regulators for leukocyte trafficking, recruitment, and activation. Chemokine family membership is defined on the basis of sequence homology and on the presence of variations on a conserved cysteine motif, which allows the family to further divide into four subfamilies (CC, CXC, XC, and CX3C). Chemokines interact with seven-transmembrane receptors which are typically coupled to G protein for signaling. Currently, there are ten known receptors for CC chemokines, seven for CXC chemokines, and single receptors for the XC and CX3C chemokines.	278
320314	cd15186	7tmA_CX3CR1	CX3C chemokine receptor 1, member of the class A family of seven-transmembrane G protein-coupled receptors. CX3CR1 is an inflammatory receptor specific for CX3CL1 (also known as fractalkine in human), which is involved in the adhesion and migration of leukocytes. The CX3C chemokine subfamily is only represented by CX3CL1, which exists in both soluble and membrane-anchored forms. Membrane-anchored form promotes strong adhesion of receptor-bearing leukocytes to CX3CL1-expressing endothelial cells. On the other hand, soluble CX3CL1, which is released by the proteolytic cleavage of membrane-anchored CX3CL1, is a potent chemoattractant for CX3CR1-expressing T cells and monocytes. Chemokine family membership is defined on the basis of sequence homology and on the presence of variations on a conserved cysteine motif, which allows the family to further divide into four subfamilies (CC, CXC, XC, and CX3C). Chemokines interact with seven-transmembrane receptors which are typically coupled to G protein for signaling.	273
320315	cd15187	7tmA_CCR8	CC chemokine receptor type 8, member of the class A family of seven-transmembrane G protein-coupled receptors. CCR8, the receptor for the CC chemokines CCL1 and CCL16, is highly expressed on allergen-specific T-helper type 2 cells, and is implicated in the pathogenesis of human asthma. CCL1- and CCR8-expressing CD4+ effector T lymphocytes are shown to have a critical role in lung mucosal inflammatory responses. CCR8 is also a functional receptor for CCL16, a liver-expressed CC chemokine that is involved in attracting lymphocytes, dendritic cells, and monocytes. Chemokines are principal regulators for leukocyte trafficking, recruitment, and activation. Chemokine family membership is defined on the basis of sequence homology and on the presence of variations on a conserved cysteine motif, which allows the family to further divide into four subfamilies (CC, CXC, XC, and CX3C). Chemokines interact with seven-transmembrane receptors which are typically coupled to G protein for signaling. Currently, there are ten known receptors for CC chemokines, seven for CXC chemokines, and single receptors for the XC and CX3C chemokines.	276
320316	cd15188	7tmA_ACKR2_D6	atypical chemokine receptor 2, member of the class A family of seven-transmembrane G protein-coupled receptors. ACKR2 (also known as D6) binds non-selectively to all inflammatory CC-chemokines, but not to homeostatic CC-chemokines involved in controlling the migration of cells. Unlike the classical chemokine receptors that contain a conserved DRYLAIV motif in the second intracellular loop, which is required for G-protein coupling, the ACKRs lack this conserved motif and fail to couple to G-proteins and induce classical GPCR signaling. Five receptors have been identified for the ACKR family, including CC-chemokine receptors like 1 and 2 (CCRL1 and CCRL2), CXCR7, Duffy antigen receptor for chemokine (DARC), and D6. Both ACKR1 (DARC) and ACKR3 (CXCR7) show low sequence homology to the classic chemokine receptors.	278
320317	cd15189	7tmA_Bradykinin_R	bradykinin receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. The bradykinin receptor family is a group of the seven transmembrane G-protein coupled receptors, whose endogenous ligand is the pro-inflammatory nonapeptide bradykinin that mediates various vascular and pain responses. Two major bradykinin receptor subtypes, B1 and B2, have been identified based on their pharmacological properties. The B1 receptor is rapidly induced by tissue injury and inflammation, whereas the B2 receptor is ubiquitously expressed on many tissue types. Both receptors contain three consensus sites for N-linked glycosylation in extracellular domains and couple to G(q) protein to activate phospholipase C, leading to phosphoinositide hydrolysis and intracellular calcium mobilization. They can also interact with G(i) protein to inhibit adenylate cyclase and activate the MAPK (mitogen-activated protein kinase) pathways.	284
341340	cd15190	7tmA_Apelin_R	apelin receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. Apelin (APJ) receptor is a G protein-coupled receptor that binds the endogenous peptide ligands, apelin and Toddler/Elabela. APJ is an adipocyte-derived hormone that is ubiquitously expressed throughout the human body and Toddler/Elabela is a short secretory peptide that is required for normal cardiac development in zebrafish. Activation of APJ receptor plays key roles in diverse physiological processes including vasoconstriction and vasodilation, cardiac muscle contractility, angiogenesis, and regulation of water balance and food intake.	304
341341	cd15191	7tmA_AT2R	type 2 angiotensin II receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. Angiotensin II (Ang II), the main effector in the renin-angiotensin system, plays a crucial role in the regulation of cardiovascular homeostasis through its type 1 (AT1) and type 2 (AT2) receptors.  Ang II contributes to cardiovascular diseases such as hypertension and atherosclerosis via AT1R activation. Ang II increases blood pressure through Gq-mediated activation of phospholipase C, resulting in phosphoinositide (PI) hydrolysis and increased intracellular calcium levels. Through the AT2R, Ang II counteracts the vasoconstrictor action of AT1R and thereby induces vasodilation, sodium excretion, and reduction of blood pressure. Moreover, AT1R promotes cell proliferation, whereas AT2R inhibits proliferation and stimulates cell differentiation. The AT2R is highly expressed during fetal development, however it is scarcely present in adult tissues and is induced in pathological conditions.  Generally, the AT1R mediates many actions of Ang II, while the AT2R is involved in the regulation of blood pressure and renal function.	285
320320	cd15192	7tmA_AT1R	type 1 angiotensin II receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. Angiotensin II (Ang II), the main effector in the renin-angiotensin system, plays a crucial role in the regulation of cardiovascular homeostasis through its type 1 (AT1) and type 2 (AT2) receptors.  Ang II contributes to cardiovascular diseases such as hypertension and atherosclerosis via AT1R activation. Ang II increases blood pressure through Gq-mediated activation of phospholipase C, resulting in phosphoinositide (PI) hydrolysis and increased intracellular calcium levels. Through the AT2R, Ang II counteracts the vasoconstrictor action of AT1R and thereby induces vasodilation, sodium excretion, and reduction of blood pressure. Moreover, AT1R promotes cell proliferation, whereas AT2R inhibits proliferation and stimulates cell differentiation. The AT2R is highly expressed during fetal development, however it is scarcely present in adult tissues and is induced in pathological conditions.  Generally, the AT1R mediates many actions of Ang II, while the AT2R is involved in the regulation of blood pressure and renal function.	285
320321	cd15193	7tmA_GPR25	G protein-coupled receptor 25, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR25 is an orphan G-protein coupled receptor that shares strong sequence homology to GPR15 and the angiotensin II receptors. These closely related receptors form a group within the class A G-protein coupled receptors (GPCRs). GPR15 controls homing of T cells, especially FOXP3(+) regulatory T cells, to the large intestine mucosa and thereby mediates local immune homeostasis. Moreover, GPR15-deficient mice were shown to be prone to develop more severe large intestine inflammation. Angiotensin II (Ang II), the main effector in the renin-angiotensin system, plays a crucial role in the regulation of cardiovascular homeostasis through its type 1 (AT1) and type 2 (AT2) receptors.	279
320322	cd15194	7tmA_GPR15	G protein-coupled receptor 15, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR15, also called Brother of Bonzo (BOB), is an orphan G-protein coupled receptor that was originally identified as a co-receptor for human immunodeficiency virus. GPR15 is upregulated in patients with rheumatoid arthritis and shares high sequence homology with angiotensin II type AT1 and AT2 receptors; however, its endogenous ligand is unknown. GPR15 controls homing of T cells, especially FOXP3(+) regulatory T cells, to the large intestine mucosa and thereby mediates local immune homeostasis. Moreover, GPR15-deficient mice were shown to be prone to develop more severe large intestine inflammation.	281
320323	cd15195	7tmA_GnRHR-like	gonadotropin-releasing hormone and adipokinetic hormone receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Gonadotropin-releasing hormone (GnRH) and adipokinetic hormone (AKH) receptors share strong sequence homology to each other, suggesting that they have a common evolutionary origin. GnRHR, also known as luteinizing hormone releasing hormone receptor (LHRHR), plays a central role in vertebrate reproductive function; its activation by binding to GnRH leads to the release of follicle stimulating hormone (FSH) and luteinizing hormone (LH) from the pituitary gland. Adipokinetic hormone (AKH) is a lipid-mobilizing hormone that is involved in control of insect metabolism. Generally, AKH behaves as a typical stress hormone by mobilizing lipids, carbohydrates and/or certain amino acids such as proline. Thus, it utilizes the body's energy reserves to fight the immediate stress problems and subdue processes that are less important. Although AKH is known to be responsible for regulating the energy metabolism during insect flying, it is also found in insects that have lost their functional wings and predominantly walk for their locomotion. Both GnRH and AKH receptors are members of the class A of the seven-transmembrane, G-protein coupled receptor (GPCR) superfamily.	293
320324	cd15196	7tmA_Vasopressin_Oxytocin	vasopressin and oxytocin receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Vasopressin (also known as arginine vasopressin or anti-diuretic hormone) and oxytocin are synthesized in the hypothalamus and are released from the posterior pituitary gland. The actions of vasopressin are mediated by the interaction of this hormone with three receptor subtypes: V1aR, V1bR, and V2R. These subtypes differ in localization, function, and signaling pathways. Activation of V1aR and V1bR stimulates phospholipase C, while activation of V2R stimulates adenylate cyclase. Although vasopressin and oxytocin differ only by two amino acids and stimulate the same cAMP/PKA pathway, they have divergent physiological functions. Vasopressin is involved in regulating blood pressure and the balance of water and sodium ions, whereas oxytocin plays an important role in the uterus during childbirth and in lactation.	264
320325	cd15197	7tmA_NPSR	neuropeptide S receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. Neuropeptide S (NPS) promotes arousal and anxiolytic-like effects by activating its cognate receptor NPSR. NPSR is widely expressed in the brain, and its activation induces an elevation of intracellular calcium and cAMP concentrations, presumably by coupling to G(s) and G(q) proteins. Mutations in NPSR have been associated with an increased susceptibility to asthma. NPSR was originally identified as an orphan receptor GPR154 and is also known as G protein receptor for asthma susceptibility (GPRA) or vasopressin receptor-related receptor 1 (VRR1).	294
320326	cd15198	7tmA_GPR150	G protein-coupled receptor 150, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR150 is an orphan receptor closely related to the oxytocin and vasopressin receptors. Its endogenous ligand is not known. These receptors share a significant amino acid sequence similarity, suggesting that they have a common evolutionary origin.	299
320327	cd15199	7tmA_GPR31	G protein-coupled receptor 31, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR31, also known as 12-(S)-HETE receptor, is a high-affinity receptor for 12-(S)-hydroxy-5,8,10,14-eicosatetraenoic acid. Phylogenetic analysis showed that GPR31 and oxoeicosanoid receptor 1 (OXER1, GPR170) are the most closely related receptors to the hydroxycarboxylic acid receptor family (HCARs). GPR31, like OXER1, activates the ERK1/2 (MAPK3/MAPK1) pathway of intracellular signaling, but unlike the OXER1, does not cause increase in the cytosolic calcium level. GPR31 is also shown to activate NFkB. 12-(S)-HETE is a 12-lipoxygenase metabolite of arachidonic acid produced by mammalian platelets and tumor cells. It promotes tumor cells adhesion to endothelial cells and sub-endothelial matrix, which is a critical step for metastasis.	278
320328	cd15200	7tmA_OXER1	oxoeicosanoid receptor 1, member of the class A family of seven-transmembrane G protein-coupled receptors. OXER1, also called GPR170, is a receptor for eicosanoids and polyunsaturated fatty acids such as 5-oxo-6E,8Z,11Z,14Z-eicosatetraenoic acid (5-OXO-ETE), 5(S)-hydroperoxy-6E,8Z,11Z,14Z-eicosatetraenoic acid (5(S)-HPETE) and arachidonic acid. OXER1 is a member of the class A family of seven-transmembrane G-protein coupled receptors and appears to be coupled to the G(i/o) protein. The receptor is expressed in various tissues except brain. Phylogenetic analysis showed that GPR31 and OXER1 are the most closely related receptors to the hydroxycarboxylic acid receptor family (HCARs). OXER1, like GPR31, activates the ERK1/2 (MAPK3/MAPK1) pathway of intracellular signaling, but unlike GPR31, does cause increase in the cytosolic calcium level.	276
320329	cd15201	7tmA_HCAR1-3	hydroxycarboxylic acid receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Hydroxycarboxylic acid receptor (HCAR) subfamily, a member of the class A G-protein coupled receptors (GPCRs), contains three receptor subtypes: HCAR1, HCAR2, and HCAR3. The endogenous ligand of HCAR1 (also known as lactate receptor 1, GPR104, or GPR81) is L-lactic acid. The endogenous ligands of HCAR2 (also known as niacin receptor 1, GPR109A, or nicotinic acid receptor) and HCAR3 (also known as niacin receptor 2 or GPR109B) are 3-hydroxybutyric acid and 3-hydroxyoctanoic acid, respectively. Because nicotinic acid is capable of stimulating HCAR2 at higher concentrations only (in the range of sub-micromolar concentration), it is unlikely that nicotinic acid acts as a physiological ligand of HCAR2. All three receptors are expressed in adipocytes and mediate anti-lipolytic effects in fat cells through G(i) type G protein-dependent inhibition of adenylate cyclase.	281
320330	cd15202	7tmA_TACR-like	tachykinin receptors and related receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes the neurokinin/tachykinin receptors and its closely related receptors such as orphan GPR83 and leucokinin-like peptide receptor. The tachykinins are widely distributed throughout the mammalian central and peripheral nervous systems and act as excitatory transmitters on neurons and cells in the gastrointestinal tract. The TKs are characterized by a common five-amino acid C-terminal sequence, Phe-X-Gly-Leu-Met-NH2, where X is a hydrophobic residue. The three major mammalian tachykinins are substance P (SP), neurokinin A (NKA), and neurokinin B (NKB). The physiological actions of tachykinins are mediated through three types of receptors: neurokinin receptor type 1 (NK1R), NK2R, and NK3R.  SP is a high-affinity endogenous ligand for NK1R, which interacts with the Gq protein and activates phospholipase C, leading to elevation of intracellular calcium. NK2R is a high-affinity receptor for NKA, the tachykinin neuropeptide substance K. SP and NKA are found in the enteric nervous system and mediate in the regulation of gastrointestinal motility, secretion, vascular permeability, and pain perception. NK3R is activated by its high-affinity ligand, NKB, which is primarily involved in the central nervous system and plays a critical role in the regulation of gonadotropin hormone release and the onset of puberty.	288
320331	cd15203	7tmA_NPYR-like	neuropeptide Y receptors and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. NPY is a 36-amino acid peptide neurotransmitter with a C-terminal tyrosine amide residue that is widely distributed in the brain and the autonomic nervous system of many mammalian species. NPY exerts its functions through five G-protein coupled receptor subtypes including NPY1R, NPY2R, NPY4R, NPY5R, and NPY6R; however, NPY6R is not functional in humans. NPY receptors are also activated by two other members of the NPY family, peptide YY (PYY) and pancreatic polypeptide (PP). They typically couple to Gi or Go proteins, which leads to a decrease in adenylate cyclase activity, thereby decreasing intracellular cAMP levels, and are involved in diverse physiological roles including appetite regulation, circadian rhythm, and anxiety. Also included in this subgroup is prolactin-releasing peptide (PrRP) receptor (previously known as GPR10), which is activated by its endogenous ligand PrRP, a neuropeptide possessing C-terminal Arg-Phe-amide motif. There are two active isoforms of PrRP in mammals: one consists of 20 amino acid residues (PrRP-20) and the other consists of 31 amino acid residues (PrRP-31). PrRP receptor shows significant sequence homology to the NPY receptors, and a micromolar level of NPY can bind and completely inhibit the PrRP-evoked intracellular calcium response in PrRP receptor-expressing cells, suggesting that the PrRP receptor shares a common ancestor with the NPY receptors.	293
320332	cd15204	7tmA_prokineticin-R	prokineticin receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Prokineticins 1 (PROK1) and 2 (PROK2), also known as endocrine gland vascular endothelial factor and Bombina variegata 8, respectively, are multifunctional chemokine-like peptides that are highly conserved across species. Prokineticins can bind with similar affinities to two closely homologous 7-transmembrane G protein coupled receptors, PROKR1 and PROKR2, which are phylogenetically related to the tachykinin receptors. Prokineticins and their GPCRs are widely distributed in human tissues and are involved in numerous physiological roles, including gastrointestinal motility, generation of circadian rhythms, neuron migration and survival, pain sensation, angiogenesis, inflammation, and reproduction. Moreover, different point mutations in genes encoding PROK2 or its receptor (PROKR2) can lead to Kallmann syndrome, a disease characterized by delayed or absent puberty and impaired olfactory function.	288
320333	cd15205	7tmA_QRFPR	pyroglutamylated RFamide peptide receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. 26RFa, also known as QRFP (Pyroglutamylated RFamide peptide), is a 26-amino acid residue peptide that belongs to a family of neuropeptides containing an Arg-Phe-NH2 (RFamide) motif at its C-terminus. 26RFa/QRFP exerts similar orexigenic activity including the regulation of feeding behavior in mammals. It is the ligand for G-protein coupled receptor 103 (GPR103), which is predominantly expressed in paraventricular (PVN) and ventromedial (VMH) nuclei of the hypothalamus. GPR103 shares significant protein sequence homology with orexin receptors (OX1R and OX2R), which have recently been shown to produce a neuroprotective effect in Alzheimer's disease by forming a functional heterodimer with GPR103.	298
320334	cd15206	7tmA_CCK_R	cholecystokinin receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Cholecystokinin receptors (CCK-AR and CCK-BR) are a group of G-protein coupled receptors which bind the peptide hormones cholecystokinin (CCK) or gastrin. CCK, which facilitates digestion in the small intestine, and gastrin, a major regulator of gastric acid secretion, are highly similar peptides. Like gastrin, CCK is a naturally-occurring linear peptide that is synthesized as a preprohormone, then proteolytically cleaved to form a family of peptides with the common C-terminal sequence (Gly-Trp-Met-Asp-Phe-NH2), which is required for full biological activity. CCK-AR (type A, alimentary; also known as CCK1R) is found abundantly on pancreatic acinar cells and binds only sulfated CCK-peptides with very high affinity, whereas CCK-BR (type B, brain; also known as CCK2R), the predominant form in the brain and stomach, binds CCK or gastrin and discriminates poorly between sulfated and non-sulfated peptides. CCK is implicated in regulation of digestion, appetite control, and body weight, and is involved in neurogenesis via CCK-AR. There is some evidence to support that CCK and gastrin, via their receptors, are involved in promoting cancer development and progression, acting as growth and invasion factors.	269
320335	cd15207	7tmA_NPFFR	neuropeptide FF receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Neuropeptide FF (NPFF) is a mammalian octapeptide that belongs to a family of neuropeptides containing an RF-amide motif at their C-terminus that have been implicated in a wide range of physiological functions in the brain including pain sensitivity, insulin release, food intake, memory, blood pressure, and opioid-induced tolerance and hyperalgesia. The effects of these peptides are mediated through neuropeptide FF1 and FF2 receptors (NPFF1-R and NPFF2-R) which are predominantly expressed in the brain. NPFF induces pro-nociceptive effects, mainly through the NPFF1-R, and anti-nociceptive effects, mainly through the NPFF2-R. NPFF has been shown to inhibit adenylate cyclase via the Gi protein coupled to NPFF1-R.	291
320336	cd15208	7tmA_OXR	orexin receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Orexins (OXs, also referred to as hypocretins) are neuropeptide hormones that regulate the sleep-wake cycle and potently influence homeostatic systems regulating appetite and feeding behavior or modulating emotional responses such as anxiety or panic. OXs are synthesized as prepro-orexin (PPO) in the hypothalamus and then proteolytically cleaved into two isoforms: orexin-A (OX-A) and orexin-B (OX-B). OX-A is a 33 amino-acid peptide with N-terminal pyroglutamyl residue and two intramolecular disulfide bonds, whereas OX-B is a 28 amino-acid linear peptide with no disulfide bonds. OX-A binds orexin receptor 1 (OX1R) with high-affinity, but also binds with somewhat low-affinity to OX2R, and signals primarily through Gq coupling, whereas OX-B shows a strong preference for the orexin receptor 2 (OX2R) and signals through Gq or Gi/o coupling. Thus, activation of OX1R or OX2R will activate phospholipase activity and the phosphatidylinositol and calcium signaling pathways. Additionally, OX2R activation can also lead to inhibition of adenylate cyclase.	303
320337	cd15209	7tmA_Mel1	melatonin receptor 1, member of the class A family of seven-transmembrane G protein-coupled receptors. Melatonin (N-acetyl-5-methoxytryptamine) is a naturally occurring sleep-promoting chemical found in vertebrates, invertebrates, bacteria, fungi, and plants. In mammals, melatonin is secreted by the pineal gland and is involved in regulation of circadian rhythms. Its production peaks during the nighttime, and is suppressed by light. Melatonin is shown to be synthesized in other organs and cells of many vertebrates, including the Harderian gland, leukocytes, skin, and the gastrointestinal (GI) tract, which contains several hundred times more melatonin than the pineal gland and is involved in the regulation of GI motility, inflammation, and sensation. Melatonin exerts its pleiotropic physiological effects through specific membrane receptors, named MT1A, MT1B, and MT1C, which belong to the class A rhodopsin-like G-protein coupled receptor family. MT1A and MT1B subtypes are present in mammals, whereas MT1C subtype has been found in amphibians and birds. The melatonin receptors couple to G proteins of the G(i/o) class, leading to the inhibition of adenylate cyclase.	279
320338	cd15210	7tmA_GPR84-like	G protein-coupled receptor 84 and similar proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR84, also known as the inflammation-related G-Protein coupled receptor EX33, is a receptor for medium-chain free fatty acids (FFAs) with carbon chain lengths of C9 to C14. Among these medium-chain FFAs, capric acid (C10:0), undecanoic acid (C11:0), and lauric acid (C12:0) are the most potent endogenous agonists of GPR84, whereas short-chain and long-chain saturated and unsaturated FFAs do not activate this receptor. GPR84 contains a [G/N]RY-motif instead of the highly conserved Asp-Arg-Tyr (DRY) motif found in the third transmembrane helix (TM3) of the rhodopsin-like class A receptors and important for efficient G protein-coupled signal transduction. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, which then activate the heterotrimeric G proteins. In the case of GPR84, activation of the receptor couples to a pertussis toxin sensitive G(i/o)-protein pathway. GPR84 knockout mice showed increased Th2 cytokine production including IL-4, IL-5, and IL-13 compared to wild-type mice. It has also been shown that activation of GPR84 augments lipopolysaccharide-stimulated IL-8 production in polymorphonuclear leukocytes and TNF-alpha production in macrophages, suggesting that GPR84 may function as a proinflammatory receptor.	254
320339	cd15211	7tmA_GPR88-like	G protein-coupled receptor 88, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR88, an orphan G protein-coupled receptor, is predominantly and almost exclusively expressed within medium spiny neurons (MSNs) of the brain's striatum in both humans and rodents; thus it is also called Striatum-specific GPCR (STRG). The striatum is known to be involved in motor coordination, reward-based decision making, and response learning. GPR88 is shown to co-localize with both dopamine D1 and D2 receptors and displays the highest sequence similarity to receptors for biogenic amines such as dopamine and serotonin. GPR88 knockout mice showed abnormal behaviors observed in schizophrenia, such as disrupted sensorimotor gating, increased stereotypic behavior and locomotor activity in response to treatment with dopaminergic compounds such as apomorphine and amphetamine, respectively, suggesting a role for GPR88 in dopaminergic signaling. Furthermore, transcriptional profiling studies showed that GPR88 expression is altered in a number of psychiatric disorders such as depression, drug addiction, bipolar disorder, and schizophrenia, providing further evidence that GPR88 plays an important role in CNS signaling pathways related to psychiatric disorders. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	283
320340	cd15212	7tmA_GPR135	G protein-coupled receptor 135, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR135, also known as the somatostatin- and angiotensin-like peptide receptor (SALPR), is found in various tissues including eye, brain, cervix, stomach, and testis. Pharmacological studies have shown that relaxin-3 (R3) is a high-affinity endogenous ligand for GPR135. R3 has recently been identified as a new member of the insulin/relaxin family of peptide hormones and is exclusively expressed in the brain neurons. In addition to GPR135, R3 also acts as an agonist for GPR142, a pseudogene in the rat, and can activate LGR7 (leucine repeat-containing G-protein receptor-7), which is the main receptor for relaxin-1 (R1) and relaxin-2 (R2).  While R1 and R2 are hormones primarily associated with reproduction and pregnancy, R3 is involved in neuroendocrine and sensory processing. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	285
320341	cd15213	7tmA_PSP24-like	G protein-coupled receptor PSP24 and similar proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes two human orphan receptors, GPR45 and GPR63, and their closely related proteins found in vertebrates and invertebrates. GPR45 and GPR63 are also called PSP24-alpha (or PSP24-1) and PSP24-beta (or PSP24-2) in other vertebrates, respectively. These receptors exhibit the highest sequence homology to each other. PSP24 was originally identified as a novel, high-affinity lysophosphatidic acid (LPA) receptor in Xenopus laevis oocytes; however, PSP24 receptors (GPR45 and GPR63) have not been shown to be activated by LPA. Instead, sphingosine 1-phosphate and dioleoylphosphatidic acid have been shown to act as low affinity agonists for GPR63.  PSP24 receptors are highly expressed in neuronal cells of cerebellum and their expression level remains constant from the early embryonic stages to adulthood, suggesting the important role of PSP24s in brain neuronal functions. Members of this subgroup contain the highly conserved Asp-Arg-Tyr/Phe (DRY/F) motif found in the third transmembrane helix (TM3) of the rhodopsin-like class A receptors which is important for efficient G protein-coupled signal transduction. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	262
320342	cd15214	7tmA_GPR161	orphan G protein-coupled receptor 161, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR161, an orphan GPCR, is a negative regulator of Sonic hedgehog (Shh) signaling, which promotes the processing of zinc finger protein GLI3 into its transcriptional repressor form (GLI3R) during neural tube development. In the absence of Shh, this proteolytic processing is normally mediated by cAMP-dependent protein kinase A (PKA). GPR161 is recruited to primary cilia by a mechanism that depends on TULP3 (tubby-related protein 3) and the intraflagellar complex A (IFT-A). Moreover, Gpr161 knockout mice show phenotypes observed in Tulp3/IFT-A mutants, and exhibit increased Shh signaling in the neural tube. Taken together, GPR161 negatively regulates the PKA-dependent GLI3 processing in the absence of Shh signal by coupling to G(s) protein, which causes activation of adenylate cyclase, elevated cAMP levels, and activation of PKA. Conversely, in the presence of Shh, GPR161 is removed from the cilia by internalization into the endosomal recycling compartment, leading to downregulation of its activity and thereby allowing Shh signaling to proceed. In addition, GPR161 is over-expressed in triple-negative breast cancer (lacking estrogen receptor, progesterone receptor, and human epidermal growth factor receptor 2 (HER2) expression) and correlates with poor prognosis. Mutations of GPR161 have also been implicated as a novel cause for pituitary stalk interruption syndrome (PSIS), a rare congenital disease of the pituitary gland. GPR161 is a member of the class A family of GPCRs, which contains receptors for hormones, neurotransmitters, sensory stimuli, and a variety of other ligands. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	261
320343	cd15215	7tmA_GPR101	orphan G protein-coupled receptor 101, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR101, an orphan GPCR, is predominantly expressed in the brain within discrete nuclei and is predicted to couple to the stimulatory G(s) protein, a potent activator of adenylate cyclase. GPR101 has been implicated in mediating the actions of GnRH-(1-5), a pentapeptide formed by metallopeptidase cleavage of the decapeptide gonadotropin-releasing hormone (GnRH), which plays a critical role in the regulation of the hypothalamic-pituitary-gonadal axis. GnRH-(1-5) acts on GPR101 to stimulate epidermal growth factor (EGF) release and EGF-receptor (EGFR) phosphorylation, leading to enhanced cell migration and invasion in the Ishikawa endometrial cancer cell line. Furthermore, these effects of GnRH-(1-5) are also dependent on enzymatic activation of matrix metallopeptidase-9 (MMP-9). GPR101 is a member of the class A family of GPCRs, which includes receptors for hormones, neurotransmitters, sensory stimuli, and a variety of other ligands. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	261
320344	cd15216	7tmA_SREB1_GPR27	super conserved receptor expressed in brain 1 (or GPR27), member of the class A family of seven-transmembrane G protein-coupled receptors. The SREB (super conserved receptor expressed in brain) subfamily consists of at least three members, named SREB1 (GPR27), SREB2 (GPR85), and SREB3 (GPR173). They are G protein-coupled receptors that are very highly conserved throughout vertebrate evolution; however, no endogenous ligands have yet been identified. SREB2 is highly expressed in brain regions involved in psychiatric disorders and cognition, such as the hippocampal dentate gyrus. Genetic studies in both humans and mice have shown that SREB2 influences brain size and negatively regulates hippocampal adult neurogenesis and neurogenesis-dependent cognitive function, all of which suggest a potential link between SREB2 and schizophrenia.  All three SREB genes are highly expressed in differentiated hippocampal neural stem cells. Furthermore, all GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	332
320345	cd15217	7tmA_SREB3_GPR173	super conserved receptor expressed in brain 3 (or GPR173), member of the class A family of seven-transmembrane G protein-coupled receptors. The SREB (super conserved receptor expressed in brain) subfamily consists of at least three members, named SREB1 (GPR27), SREB2 (GPR85), and SREB3 (GPR173). They are G protein-coupled receptors that are very highly conserved throughout vertebrate evolution; however, no endogenous ligands have yet been identified. SREB2 is highly expressed in brain regions involved in psychiatric disorders and cognition, such as the hippocampal dentate gyrus. Genetic studies in both humans and mice have shown that SREB2 influences brain size and negatively regulates hippocampal adult neurogenesis and neurogenesis-dependent cognitive function, all of which suggest a potential link between SREB2 and schizophrenia.  All three SREB genes are highly expressed in differentiated hippocampal neural stem cells. Furthermore, all GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	329
320346	cd15218	7tmA_SREB2_GPR85	super conserved receptor expressed in brain 2 (or GPR85), member of the class A family of seven-transmembrane G protein-coupled receptors. The SREB (super conserved receptor expressed in brain) subfamily consists of at least three members, named SREB1 (GPR27), SREB2 (GPR85), and SREB3 (GPR173). They are G protein-coupled receptors that are very highly conserved throughout vertebrate evolution; however, no endogenous ligands have yet been identified. SREB2 is highly expressed in brain regions involved in psychiatric disorders and cognition, such as the hippocampal dentate gyrus. Genetic studies in both humans and mice have shown that SREB2 influences brain size and negatively regulates hippocampal adult neurogenesis and neurogenesis-dependent cognitive function, all of which suggest a potential link between SREB2 and schizophrenia.  All three SREB genes are highly expressed in differentiated hippocampal neural stem cells. Furthermore, all GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	330
320347	cd15219	7tmA_GPR26_GPR78-like	G protein-coupled receptors 26 and 78, member of the class A family of seven-transmembrane G protein-coupled receptors. Orphan G-protein coupled receptor 26 (GPR26) and GPR78 are constitutively active and coupled to increased cAMP formation. They are closely related based on sequence homology and comprise a conserved subgroup within the class A G-protein coupled receptor (GPCR) superfamily. Both receptors are widely expressed in selected tissues of the brain but their endogenous ligands are unknown. GPR26 knockout mice showed increased levels of anxiety- and depression-like behaviors, whereas GPR78 has been implicated in susceptibility to bipolar affective disorder and schizophrenia. Members of this subgroup contain the highly conserved Asp-Arg-Tyr/Phe (DRY/F) motif found in the third transmembrane helix (TM3) of the rhodopsin-like class A receptors which is important for efficient G protein-coupled signal transduction. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	264
410633	cd15220	7tmA_GPR61_GPR62-like	G protein-coupled receptors 61 and 62, member of the class A family of seven-transmembrane G protein-coupled receptors. This subgroup includes the orphan receptors GPR61 and GPR62, which are both constitutively active and predominantly expressed in the brain. While GPR61 couples to the G(s) subtype of G proteins, the signaling pathway and function of GPR62 are unknown. GPR61-deficient mice displayed significant hyperphagia and heavier body weight compared to wild-type mice, suggesting that GPR61 is involved in the regulation of food intake and body weight. GPR61 transcript expression was found in the caudate, putamen, and thalamus of human brain, whereas GPR62 transcript expression was found in the basal forebrain, frontal cortex, caudate, putamen, thalamus, and hippocampus. Both receptors share the highest sequence homology with each other and comprise a conserved subgroup within the class A family of GPCRs, which includes receptors for hormones, neurotransmitters, sensory stimuli, and a variety of other ligands. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, which then activate the heterotrimeric G proteins. Members of this subgroup contain an [A/E]RY motif, a variant of the highly conserved Asp-Arg-Tyr (DRY) motif found in the third transmembrane helix (TM3) of the class A GPCRs and important for efficient G protein-coupled signal transduction.	264
320349	cd15221	7tmA_OR52B-like	olfactory receptor subfamily 52B and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor (OR) subfamilies 52B, 52D, 52H and related proteins in other mammals, sauropsids, and amphibians. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	275
320350	cd15222	7tmA_OR51-like	olfactory receptor family 51 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor family 51 and related proteins in other mammals, sauropsids, and amphibians. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	275
320351	cd15223	7tmA_OR56-like	olfactory receptor family 56 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor family 56 and related proteins in other mammals, sauropsids, and fishes. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	279
320352	cd15224	7tmA_OR6B-like	olfactory receptor subfamily 6B and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor 6B, 6A, 6Y, 6P, and related proteins in other mammals, sauropsids, and amphibians. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	270
320353	cd15225	7tmA_OR10A-like	olfactory receptor subfamily 10A and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor 10A, 10C, 10H, 10J, 10V, 10R, 10W, among others, and related proteins in other mammals, sauropsids, and amphibians. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	277
320354	cd15226	7tmA_OR4-like	olfactory receptor family 4 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor family 4 and related proteins in other mammals, sauropsids, and amphibians. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	267
320355	cd15227	7tmA_OR14-like	olfactory receptor family 14 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor family 14 and related proteins in other mammals, sauropsids, and amphibians. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	270
320356	cd15228	7tmA_OR10D-like	olfactory receptor subfamily 10D and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 10D and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	275
320357	cd15229	7tmA_OR8S1-like	olfactory receptor subfamily 8S1 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor 8S1 and related proteins in other mammals, sauropsids, and amphibians. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	277
320358	cd15230	7tmA_OR5-like	olfactory receptor family 5 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor family 5, some subfamilies from families 8 and 9, and related proteins in other mammals, sauropsids, and amphibians. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	270
320359	cd15231	7tmA_OR5V1-like	olfactory receptor subfamily 5V1 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 5V1 and related proteins in other mammals, sauropsids, and amphibians. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	277
320360	cd15232	7tmA_OR13-like	olfactory receptor family 13 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor family 13 (subfamilies 13A1 and 13G1) and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	270
320361	cd15233	7tmA_OR3A-like	olfactory receptor subfamily 3A3 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 3A3 and 3A4, and related proteins in other mammals. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	277
320362	cd15234	7tmA_OR7-like	olfactory receptor family 7 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor family 7 and related proteins in other mammals. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	277
320363	cd15235	7tmA_OR1A-like	olfactory receptor subfamily 1A and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 1A, 1B, 1K, 1L, 1Q and related proteins in other mammals, sauropsids, and amphibians. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	278
320364	cd15236	7tmA_OR1E-like	olfactory receptor subfamily 1E and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 1E, 1J, and related proteins in other mammals. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	277
320365	cd15237	7tmA_OR2-like	olfactory receptor family 2 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor families 2 and 13, and related proteins in other mammals, sauropsids, and amphibians. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	270
320366	cd15238	7tm_ARII-like	Acetabularia rhodopsin II and similar proteins, member of the seven-transmembrane GPCR superfamily. This subgroup includes the eukaryotic light-driven proton-pumping Acetabularia rhodopsin II from the giant unicellular marine alga Acetabularia acetabulum, as well as its closely related proteins. They belong to the microbial rhodopsin family, also known as type I rhodopsins, comprising the light-driven inward chloride pump halorhodopsin (HR), the outward proton pump bacteriorhodopsin (BR), the light-gated cation channel channelrhodopsin (ChR), the light-sensor activating transmembrane transducer protein sensory rhodopsin II (SRII), and the other light-driven proton pumps such as blue-light absorbing and green-light absorbing proteorhodopsins, among others. Microbial rhodopsins have been found in various single-celled microorganisms from all three domains of life, including halophile archaea, gamma-proteobacteria, cyanobacteria, fungi, and green algae. While microbial (type 1) and animal (type 2) rhodopsins have no sequence similarity with each other, they share a common architecture consisting of seven-transmembrane alpha-helices (TM) connected by extracellular loops and intracellular loops. Both types of rhodopsins consist of opsin and a covalently attached retinal (the aldehyde of vitamin A), a photoreactive chromophore, via a protonated Schiff base linkage to an amino group of lysine in the middle of the seventh transmembrane helix (TM7). Upon the absorption of light, microbial rhodopsins undergo light-induced photoisomerization of all-trans retinal into the 13-cis isomer, whereas the photoisomerization of 11-cis retinal to all-trans isomer occurs in the animal rhodopsins. While animal visual rhodopsins are activated by light to catalyze GDP/GTP exchange in the alpha subunit of the retinal G protein transducin (Gt), microbial rhodopsins do not activate G proteins, but instead can function as light-dependent ion pumps, cation channels, and sensors.	219
320367	cd15239	7tm_YRO2_fungal-like	fungal YRO2 and related proteins, member of the seven-transmembrane GPCR superfamily. This subgroup includes the yeast YRO2 protein and its closely related proteins. Although the exact function of these proteins is unknown, they show strong sequence homology to the family of microbial rhodopsins, also known as type I rhodopsins, comprising the light-driven inward chloride pump halorhodopsin (HR), the outward proton pump bacteriorhodopsin (BR), the light-gated cation channel channelrhodopsin (ChR), the light-sensor activating transmembrane transducer protein sensory rhodopsin II (SRII), and the other light-driven proton pumps such as blue-light absorbing and green-light absorbing proteorhodopsins, among others. Microbial rhodopsins have been found in various single-celled microorganisms from all three domains of life, including halophile archaea, gamma-proteobacteria, cyanobacteria, fungi, and green algae. While microbial (type 1) and animal (type 2) rhodopsins have no sequence similarity with each other, they share a common architecture consisting of seven-transmembrane alpha-helices (TM) connected by extracellular loops and intracellular loops. Both types of rhodopsins consist of opsin and a covalently attached retinal (the aldehyde of vitamin A), a photoreactive chromophore, via a protonated Schiff base linkage to an amino group of lysine in the middle of the seventh transmembrane helix (TM7). Upon the absorption of light, microbial rhodopsins undergo light-induced photoisomerization of all-trans retinal into the 13-cis isomer, whereas the photoisomerization of 11-cis retinal to all-trans isomer occurs in the animal rhodopsins. While animal visual rhodopsins are activated by light to catalyze GDP/GTP exchange in the alpha subunit of the retinal G protein transducin (Gt), microbial rhodopsins do not activate G proteins, but instead can function as light-dependent ion pumps, cation channels, and sensors.	227
320368	cd15240	7tm_ASR-like	Anabaena sensory rhodopsin and similar proteins, member of the seven-transmembrane GPCR superfamily. This subgroup includes eubacterial sensory rhodopsin from the freshwater cyanobacterium Anabaena and its closely related proteins. Unlike other sensory rhodopsins (SRI and SRII), the Anabaena sensory rhodopsin (ASR) activates a soluble transducer protein (ASRT), which may lead to transcriptional control of several genes. Although ASRT was shown to interact with DNA in vitro, the exact mechanism of photosensory transduction is not clearly understood. Moreover, the regulation of CRP (cAMP receptor protein) expression by ASR has been reported, demonstrating a direct interaction of the C-terminal region of ASR with DNA, suggesting that ASR itself may also work as a transcription factor. ASR belongs to the microbial rhodopsin family, also known as type I rhodopsins, comprising the light-driven inward chloride pump halorhodopsin (HR), the outward proton pump bacteriorhodopsin (BR), the light-gated cation channel channelrhodopsin (ChR), the light-sensor activating transmembrane transducer protein sensory rhodopsin II (SRII), and the other light-driven proton pumps such as blue-light absorbing and green-light absorbing proteorhodopsins, among others. Microbial rhodopsins have been found in various single-celled microorganisms from all three domains of life, including halophile archaea, gamma-proteobacteria, cyanobacteria, fungi, and green algae. While microbial (type 1) and animal (type 2) rhodopsins have no sequence similarity with each other, they share a common architecture consisting of seven-transmembrane alpha-helices (TM) connected by extracellular loops and intracellular loops. Both types of rhodopsins consist of opsin and a covalently attached retinal (the aldehyde of vitamin A), a photoreactive chromophore, via a protonated Schiff base linkage to an amino group of lysine in the middle of the seventh transmembrane helix (TM7). Upon the absorption of light, microbial rhodopsins undergo light-induced photoisomerization of all-trans retinal into the 13-cis isomer, whereas the photoisomerization of 11-cis retinal to all-trans isomer occurs in the animal rhodopsins. While animal visual rhodopsins are activated by light to catalyze GDP/GTP exchange in the alpha subunit of the retinal G protein transducin (Gt), microbial rhodopsins do not activate G proteins, but instead can function as light-dependent ion pumps, cation channels, and sensors.	221
320369	cd15241	7tm_ChRs	channelrhodopsins, member of the seven-transmembrane GPCR superfamily. Channelrhodopsins (ChRs) are light-gated ion channels acting as sensory photoreceptors in unicellular green algae, controlling phototaxis (directional movement toward or away from light). ChRs are large seven-transmembrane proteins with large C-terminal extensions, which have been implicated in localizing the channel to the algal eyespot, a single layer of pigmented granules overlaying part of the plasma membrane, but are not required for ion channel function. ChRs belong to the microbial rhodopsin family, also known as type I rhodopsins, comprising the light-driven inward chloride pump halorhodopsin (HR), the outward proton pump bacteriorhodopsin (BR), the light-sensor activating transmembrane transducer protein sensory rhodopsin II (SRII), the light-sensor activating soluble transducer protein Anabaena sensory rhodopsin (ASR), and the other light-driven proton pumps such as blue-light absorbing and green-light absorbing proteorhodopsins, among others. Microbial rhodopsins have been found in various single-celled microorganisms from all three domains of life, including halophile archaea, gamma-proteobacteria, cyanobacteria, fungi, and green algae. While microbial (type 1) and animal (type 2) rhodopsins have no sequence similarity with each other, they share a common architecture consisting of seven-transmembrane alpha-helices (TM) connected by extracellular loops and intracellular loops. Both types of rhodopsins consist of opsin and a covalently attached retinal (the aldehyde of vitamin A), a photoreactive chromophore, via a protonated Schiff base linkage to an amino group of lysine in the middle of the seventh transmembrane helix (TM7). Upon the absorption of light, microbial rhodopsins undergo light-induced photoisomerization of all-trans retinal into the 13-cis isomer, whereas the photoisomerization of 11-cis retinal to all-trans isomer occurs in the animal rhodopsins. While animal visual rhodopsins are activated by light to catalyze GDP/GTP exchange in the alpha subunit of the retinal G protein transducin (Gt), microbial rhodopsins do not activate G proteins, but instead can function as light-dependent ion pumps, cation channels, and sensors.	219
320370	cd15242	7tm_Proteorhodopsin	green- and blue-light absorbing proteorhodopsins, member of the seven-transmembrane GPCR superfamily. This subgroup represents blue-light absorbing and green-light absorbing proteorhodopsins (PRs), which act as light-driven proton pumps that play a major role in supplying light energy for phototrophic marine microorganisms, by a mechanism similar to that of bacteriorhodopsin. PRs are found in most marine bacteria in surface waters, as well as in archaea and eukaryotes. They belong to the microbial rhodopsin family, also known as type 1 rhodopsins, comprising the light-driven inward chloride pump halorhodopsin (HR), the light-gated cation channel channelrhodopsin (ChR), the light-sensor activating transmembrane transducer protein sensory rhodopsin II (SRII), the light-sensor activating soluble transducer protein Anabaena sensory rhodopsin (ASR), and the other light-driven proton pumps such as bacteriorhodopsin (BR). They have been found in various single-celled microorganisms from all three domains of life, including halophile archaea, gamma-proteobacteria, cyanobacteria, fungi, and green algae. While microbial (type 1) and animal (type 2) rhodopsins have no sequence similarity with each other, they share a common architecture consisting of seven-transmembrane alpha-helices (TM) connected by extracellular loops and intracellular loops. Both types of rhodopsins consist of opsin and a covalently attached retinal (the aldehyde of vitamin A), a photoreactive chromophore, via a protonated Schiff base linkage to an amino group of lysine in the middle of the seventh transmembrane helix (TM7). Upon the absorption of light, microbial rhodopsins undergo light-induced photoisomerization of all-trans retinal into the 13-cis isomer, whereas the photoisomerization of 11-cis retinal to all-trans isomer occurs in the animal rhodopsins. While animal visual rhodopsins are activated by light to catalyze GDP/GTP exchange in the alpha subunit of the retinal G protein transducin (Gt), microbial rhodopsins do not activate G proteins, but instead can function as light-dependent ion pumps, cation channels, and sensors.	229
320371	cd15243	7tm_Halorhodopsin	light-driven inward chloride pump halorhodopsin, member of the seven-transmembrane GPCR superfamily. Halorhodopsin (HR) acts as a light-driven inward-directed chloride pump. When activated by yellow light, HR pumps chloride ions into the cell cytoplasm, generating a negative-inside membrane potential which drives proton uptake. The resulting electrochemical ion gradient provides an energy source to the cell and contributes to pH homeostasis. HR is found in phylogenetically ancient archaea, known as halobacteria, which live in highly saline environments. HR belongs to the microbial rhodopsin family, also known as type I rhodopsins, comprising light-driven retinal-binding outward pump bacteriorhodopsin (BR), light-gated cation channel channelrhodopsin (ChR), light-sensor activating transmembrane transducer protein sensory rhodopsin II (SRII), light-sensor activating soluble transducer protein Anabaena sensory rhodopsin (ASR), and other light-driven proton pumps such as blue-light absorbing and green-light absorbing proteorhodopsins, among others. They have been found in various single-celled microorganisms from all three domains of life, including halophile archaea, gamma-proteobacteria, cyanobacteria, fungi, and green algae. While microbial (type 1) and animal (type 2) rhodopsins have no sequence similarity with each other, they share a common architecture consisting of seven-transmembrane alpha-helices (TM) connected by extracellular loops and intracellular loops. Both types of rhodopsins consist of opsin and a covalently attached retinal (the aldehyde of vitamin A), a photoreactive chromophore, via a protonated Schiff base linkage to an amino group of lysine in the middle of the seventh transmembrane helix (TM7). Upon the absorption of light, microbial rhodopsins undergo light-induced photoisomerization of all-trans retinal into the 13-cis isomer, whereas the photoisomerization of 11-cis retinal to all-trans isomer occurs in the animal rhodopsins. While animal visual rhodopsins are activated by light to catalyze GDP/GTP exchange in the alpha subunit of the retinal G protein transducin (Gt), microbial rhodopsins do not activate G proteins, but instead can function as light-dependent ion pumps, cation channels, and sensors.	226
320372	cd15244	7tm_bacteriorhodopsin	light-driven outward proton pump bacteriorhodopsin, member of the seven-transmembrane GPCR superfamily. Bacteriorhodopsin (BR) serves as a light-driven retinal-binding outward proton pump, generating an outside positive membrane potential and thus creating an inwardly directed proton motive force (PMF) necessary for ATP synthesis. BR belongs to the microbial rhodopsin family, also known as type I rhodopsins, comprising light-driven inward chloride pump halorhodopsin (HR), light-gated cation channel channelrhodopsin (ChR), light-sensor activating transmembrane transducer protein sensory rhodopsin II (SRII), light-sensor activating soluble transducer protein Anabaena sensory rhodopsin (ASR), and other light-driven proton pumps such as blue-light absorbing and green-light absorbing proteorhodopsins, among others. They have been found in various single-celled microorganisms from all three domains of life, including halophile archaea, gamma-proteobacteria, cyanobacteria, fungi, and green algae. While microbial (type 1) and animal (type 2) rhodopsins have no sequence similarity with each other, they share a common architecture consisting of seven-transmembrane alpha-helices (TM) connected by extracellular loops and intracellular loops. Both types of rhodopsins consist of opsin and a covalently attached retinal (the aldehyde of vitamin A), a photoreactive chromophore, via a protonated Schiff base linkage to an amino group of lysine in the middle of the seventh transmembrane helix (TM7). Upon the absorption of light, microbial rhodopsins undergo light-induced photoisomerization of all-trans retinal into the 13-cis isomer, whereas the photoisomerization of 11-cis retinal to all-trans isomer occurs in the animal rhodopsins. While animal visual rhodopsins are activated by light to catalyze GDP/GTP exchange in the alpha subunit of the retinal G protein transducin (Gt), microbial rhodopsins do not activate G proteins, but instead can function as light-dependent ion pumps, cation channels, and sensors.	221
320373	cd15245	7tmF_FZD2	class F frizzled subfamily 2, member of 7-transmembrane G protein-coupled receptors. This group includes subfamily 2 of the frizzled (FZD) family of seven transmembrane-spanning proteins, which constitute a novel and separate class of G-protein coupled receptors. This class F protein family consists of 10 isoforms (FZD1-10) in mammals. The FZDs are activated by the wingless/int-1 (WNT) family of secreted lipoglycoproteins and preferentially couple to stimulatory G proteins of the Gs family, which activate adenylate cyclase, but can also couple to G proteins of the Gi/Gq families. In the WNT/beta-catenin signaling pathway, the WNT ligand binds to FZD and a lipoprotein receptor-related protein (LRP) co-receptor. This leads to the stabilization and translocation of beta-catenin to the nucleus, where it induces the activation of TCF/LEF family transcription factors. The conserved cytoplasmic motif of FZD, Lys-Thr-X-X-X-Trp, is required for activation of the WNT/beta-catenin pathway, and for membrane localization and phosphorylation of Dsh (dishevelled) protein, a key component of the WNT pathway that relays the WNT signals from the activated receptor to downstream effector proteins. The WNT pathway plays a critical role in many developmental processes, such as cell-fate determination, cell proliferation, neural patterning, stem cell renewal, tissue homeostasis and repair, and tumorigenesis, among many others.	330
320374	cd15246	7tmF_FZD7	class F frizzled subfamily 7, member of 7-transmembrane G protein-coupled receptors. This group includes subfamily 7 of the frizzled (FZD) family of seven transmembrane-spanning proteins, which constitute a novel and separate class of G-protein coupled receptors. This class F protein family consists of 10 isoforms (FZD1-10) in mammals. The FZDs are activated by the wingless/int-1 (WNT) family of secreted lipoglycoproteins and preferentially couple to stimulatory G proteins of the Gs family, which activate adenylate cyclase, but can also couple to G proteins of the Gi/Gq families. In the WNT/beta-catenin signaling pathway, the WNT ligand binds to FZD and a lipoprotein receptor-related protein (LRP) co-receptor. This leads to the stabilization and translocation of beta-catenin to the nucleus, where it induces the activation of TCF/LEF family transcription factors. The conserved cytoplasmic motif of FZD, Lys-Thr-X-X-X-Trp, is required for activation of the WNT/beta-catenin pathway, and for membrane localization and phosphorylation of Dsh (dishevelled) protein, a key component of the WNT pathway that relays the WNT signals from the activated receptor to downstream effector proteins. The WNT pathway plays a critical role in many developmental processes, such as cell-fate determination, cell proliferation, neural patterning, stem cell renewal, tissue homeostasis and repair, and tumorigenesis, among many others.	331
320375	cd15247	7tmF_FZD1	class F mammalian frizzled subfamily 1, member of 7-transmembrane G protein-coupled receptors. This group includes subfamily 1 of the frizzled (FZD) family of seven transmembrane-spanning proteins, which constitute a novel and separate class of G-protein coupled receptors.  This class F protein family consists of 10 isoforms (FZD1-10) in mammals. The FZDs are activated by the wingless/int-1 (WNT) family of secreted lipoglycoproteins and preferentially couple to stimulatory G proteins of the Gs family, which activate adenylate cyclase, but can also couple to G proteins of the Gi/Gq families. In the WNT/beta-catenin signaling pathway, the WNT ligand binds to FZD and a lipoprotein receptor-related protein (LRP) co-receptor. This leads to the stabilization and translocation of beta-catenin to the nucleus, where it induces the activation of TCF/LEF family transcription factors. The conserved cytoplasmic motif of FZD, Lys-Thr-X-X-X-Trp, is required for activation of the WNT/beta-catenin pathway, and for membrane localization and phosphorylation of Dsh (dishevelled) protein, a key component of the WNT pathway that relays the WNT signals from the activated receptor to downstream effector proteins. The WNT pathway plays a critical role in many developmental processes, such as cell-fate determination, cell proliferation, neural patterning, stem cell renewal, tissue homeostasis and repair, and tumorigenesis, among many others.	341
320376	cd15248	7tmF_FZD1_insect	class F insect frizzled subfamily 1, member of 7-transmembrane G protein-coupled receptors. This group includes subfamily 1 of the frizzled (FZD) family of seven transmembrane-spanning proteins, which constitute a novel and separate class of G-protein coupled receptors, found in insects such as Drosophila melanogaster. This class F protein family consists of 10 isoforms (FZD1-10) in mammals. The FZDs are activated by the wingless/int-1 (WNT) family of secreted lipoglycoproteins and preferentially couple to stimulatory G proteins of the Gs family, which activate adenylate cyclase, but can also couple to G proteins of the Gi/Gq families. In the WNT/beta-catenin signaling pathway, the WNT ligand binds to FZD and a lipoprotein receptor-related protein (LRP) co-receptor. This leads to the stabilization and translocation of beta-catenin to the nucleus, where it induces the activation of TCF/LEF family transcription factors. The conserved cytoplasmic motif of FZD, Lys-Thr-X-X-X-Trp, is required for activation of the WNT/beta-catenin pathway, and for membrane localization and phosphorylation of Dsh (dishevelled) protein, a key component of the WNT pathway that relays the WNT signals from the activated receptor to downstream effector proteins. The WNT pathway plays a critical role in many developmental processes, such as cell-fate determination, cell proliferation, neural patterning, stem cell renewal, tissue homeostasis and repair, and tumorigenesis, among many others.	332
320377	cd15249	7tmF_FZD5	class F frizzled subfamily 5, member of 7-transmembrane G protein-coupled receptors. This group includes subfamily 5 of the frizzled (FZD) family of seven transmembrane-spanning proteins, which constitute a novel and separate class of GPCRs, and its closely related proteins. This class F protein family consists of 10 isoforms (FZD1-10) in mammals. The FZDs are activated by the wingless/int-1 (WNT) family of secreted lipoglycoproteins and preferentially couple to stimulatory G proteins of the Gs family, which activate adenylate cyclase, but can also couple to G proteins of the Gi/Gq families. In the WNT/beta-catenin signaling pathway, the WNT ligand binds to FZD and a lipoprotein receptor-related protein (LRP) co-receptor. This leads to the stabilization and translocation of beta-catenin to the nucleus, where it induces the activation of TCF/LEF family transcription factors. The conserved cytoplasmic motif of FZD, Lys-Thr-X-X-X-Trp, is required for activation of the WNT/beta-catenin pathway, and for membrane localization and phosphorylation of Dsh (dishevelled) protein, a key component of the WNT pathway that relays the WNT signals from the activated receptor to downstream effector proteins. The WNT pathway plays a critical role in many developmental processes, such as cell-fate determination, cell proliferation, neural patterning, stem cell renewal, tissue homeostasis and repair, and tumorigenesis, among many others.	310
320378	cd15250	7tmF_FZD8	class F frizzled subfamily 8, member of 7-transmembrane G protein-coupled receptors. This group includes subfamily 8 of the frizzled (FZD) family of seven transmembrane-spanning proteins, which constitute a novel and separate class of GPCRs, and its closely related proteins. This class F protein family consists of 10 isoforms (FZD1-10) in mammals. The FZDs are activated by the wingless/int-1 (WNT) family of secreted lipoglycoproteins and preferentially couple to stimulatory G proteins of the Gs family, which activate adenylate cyclase, but can also couple to G proteins of the Gi/Gq families. In the WNT/beta-catenin signaling pathway, the WNT ligand binds to FZD and a lipoprotein receptor-related protein (LRP) co-receptor. This leads to the stabilization and translocation of beta-catenin to the nucleus, where it induces the activation of TCF/LEF family transcription factors. The conserved cytoplasmic motif of FZD, Lys-Thr-X-X-X-Trp, is required for activation of the WNT/beta-catenin pathway, and for membrane localization and phosphorylation of Dsh (dishevelled) protein, a key component of the WNT pathway that relays the WNT signals from the activated receptor to downstream effector proteins. The WNT pathway plays a critical role in many developmental processes, such as cell-fate determination, cell proliferation, neural patterning, stem cell renewal, tissue homeostasis and repair, and tumorigenesis, among many others.	314
320379	cd15251	7tmB2_BAI_Adhesion_VII	brain-specific angiogenesis inhibitors, group VII adhesion GPCRs, member of the class B2 family of seven-transmembrane G protein-coupled receptors. Brain-specific angiogenesis inhibitors (BAI1-3) constitute the group VII of cell-adhesion receptors that have been implicated in vascularization of glioblastomas. They belong to the B2 subfamily of class B GPCRs, are predominantly expressed in the brain, and are only present in vertebrates. The three BAIs, like all adhesion receptors, are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, that are coupled to a class B seven-transmembrane domain. For example, the BAI1 N-terminus contains an integrin-binding RGD (Arg-Gly-Asp) motif in addition to five thrombospondin type 1 repeats (TSRs), which are known to regulate the anti-angiogenic activity of thrombospondin-1, whereas BAI2 and BAI3 have four TSRs, but do not possess RGD motifs. The TSRs are functionally involved in cell attachment, activation of latent TGF-beta, inhibition of angiogenesis and endothelial cell migration. The TSRs of BAI1 mediate direct binding to phosphatidylserine, which enables both recognition and internalization of apoptotic cells by phagocytes. Thus, BAI1 functions as a phosphatidylserine receptor that forms a trimeric complex with ELMO and Dock180, leading to activation of Rac-GTPase which promotes the binding and phagocytosis of apoptotic cells. BAI3 can also interact with the ELMO-Dock180 complex to activate the Rac pathway and can also bind to secreted C1ql proteins of the C1Q complement family via its N-terminal TSRs. BAI3 and its ligand C1QL1 are highly expressed during synaptogenesis and are involved in synapse specificity. 
Moreover, BAI2 acts as a transcription repressor to regulate vascular endothelial growth factor (VEGF) expression through interaction with GA-binding protein gamma (GABP). The N-terminal extracellular domains of all three BAIs also contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain, which undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif to generate N- and C-terminal fragments (NTF and CTF), a putative hormone-binding domain (HBD), and multiple N-glycosylation sites. The C-terminus of each BAI subtype ends with a conserved Gln-Thr-Glu-Val (QTEV) motif known to interact with PDZ domain-containing proteins, but only BAI1 possesses a proline-rich region, which may be involved in protein-protein interactions.	253
320380	cd15252	7tmB2_Latrophilin_Adhesion_I	Latrophilins and similar receptors, group I adhesion GPCRs, member of class B2 family of seven-transmembrane G protein-coupled receptors. Group I adhesion GPCRs consist of latrophilins (also called lectomedins or latrotoxin receptors) and ETL (EGF-TM7-latrophilin-related protein). These receptors are members of the adhesion family (subclass B2) that belongs to the class B GPCRs. Three subtypes of latrophilins have been identified: LPH1 (latrophilin-1), LPH2, and LPH3. Latrophilin-1 is a brain-specific calcium-independent receptor of alpha-latrotoxin, a potent presynaptic neurotoxin from the venom of the black widow spider that induces massive neurotransmitter release from sensory and motor neurons as well as endocrine cells, leading to nerve-terminal degeneration. Latrophilin-2 and -3, although sharing strong sequence homology to latrophilin-1, do not bind alpha-latrotoxin. While latrophilin-3 is also brain specific, latrophilin-2 is ubiquitously distributed. The endogenous ligands for these two receptors are unknown. ETL, a seven-transmembrane receptor containing EGF-like repeats, is highly expressed in heart, where it is developmentally regulated, as well as in normal smooth muscle cells. The function of ETL is unknown. All adhesion GPCRs possess large N-terminal extracellular domains containing multiple structural motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, coupled to a seven-transmembrane domain. In addition, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions.	257
320381	cd15253	7tmB2_GPR113	orphan adhesion receptor GPR113, member of the class B2 family of seven-transmembrane G protein-coupled receptors. GPR113 is an orphan receptor that belongs to group VI adhesion-GPCRs along with GPR110, GPR111, GPR115, and GPR116. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in ligand recognition as well as cell-cell adhesion and cell-matrix interactions, linked by a stalk region to a class B seven-transmembrane domain. GPR113 contains a hormone binding domain and one EGF (epidermal growth factor) domain, and is primarily expressed in a subset of taste receptor cells. In addition, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. However, several adhesion GPCRs, including GPR111, GPR115, and CELSR1, are predicted to be non-cleavable at the GAIN domain because of the lack of a consensus catalytic triad sequence (His-Leu-Ser/Thr) within their GPS.	271
320382	cd15254	7tmB2_GPR116_Ig-Hepta	The immunoglobulin-repeat-containing receptor Ig-hepta/GPR116, member of the class B2 family of seven-transmembrane G protein-coupled receptors. GPR116 (also known as Ig-hepta) is an orphan receptor that belongs to group VI adhesion-GPCRs along with GPR110, GPR111, GPR113, and GPR115. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in ligand recognition as well as cell-cell adhesion and cell-matrix interactions, linked by a stalk region to a class B seven-transmembrane domain. GPR116 has two C2-set immunoglobulin-like repeats, which are found in the members of the immunoglobulin superfamily of cell surface proteins, and a SEA (sea urchin sperm protein, enterokinase, and agrin)-box, which is present in the extracellular domain of the transmembrane mucin (MUC) family and known to enhance O-glycosylation. GPR116 is highly expressed in fetal and adult lung, and it has been shown to regulate lung surfactant levels as well as to stimulate breast cancer metastasis through a G(q)-p63-RhoGEF-Rho GTPase signaling pathway. In addition, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. However, several adhesion GPCRs, including GPR111, GPR115, and CELSR1, are predicted to be non-cleavable at the GAIN domain because of the lack of a consensus catalytic triad sequence (His-Leu-Ser/Thr) within their GPS.	275
320383	cd15255	7tmB2_GPR144	orphan adhesion receptor GPR144, member of the class B2 family of seven-transmembrane G protein-coupled receptors. GPR144 is an orphan receptor that belongs to the group V adhesion-GPCRs together with GPR133. The function of GPR144 has not yet been characterized, whereas GPR133 is highly expressed in the pituitary gland and is coupled to the Gs protein, leading to activation of the adenylyl cyclase pathway. Moreover, genetic variations in GPR133 have been reported to be associated with adult height and heart rate. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in ligand recognition as well as cell-cell adhesion and cell-matrix interactions, linked by a stalk region to a class B seven-transmembrane domain. In addition, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. However, several adhesion GPCRs, including GPR111, GPR115, and CELSR1, are predicted to be non-cleavable at the GAIN domain because of the lack of a consensus catalytic triad sequence (His-Leu-Ser/Thr) within their GPS.	263
320384	cd15256	7tmB2_GPR133	orphan adhesion receptor GPR133, member of the class B2 family of seven-transmembrane G protein-coupled receptors. GPR133 is an orphan receptor that belongs to the group V adhesion-GPCRs together with GPR144. The function of GPR144 has not yet been characterized, whereas GPR133 is highly expressed in the pituitary gland and is coupled to the Gs protein, leading to activation of the adenylyl cyclase pathway. Moreover, genetic variations in GPR133 have been reported to be associated with adult height and heart rate. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in ligand recognition as well as cell-cell adhesion and cell-matrix interactions, linked by a stalk region to a class B seven-transmembrane domain. In addition, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. However, several adhesion GPCRs, including GPR111, GPR115, and CELSR1, are predicted to be non-cleavable at the GAIN domain because of the lack of a consensus catalytic triad sequence (His-Leu-Ser/Thr) within their GPS.	260
320385	cd15257	7tmB2_GPR128	orphan adhesion receptor GPR128, member of the class B2 family of seven-transmembrane G protein-coupled receptors. GPR128 is an orphan receptor of the adhesion family (subclass B2) that belongs to the class B GPCRs. GPR128 is expressed in the mouse intestinal mucosa and is thought to be involved in energy balance, as knockout mice showed a decrease in body weight gain and an increase in intestinal contraction frequency compared to wild-type controls. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, that are coupled to a class B seven-transmembrane domain. These include, for example, EGF (epidermal growth factor)-like domains in CD97, Celsr1 (cadherin family member), Celsr2, Celsr3, EMR1 (EGF-module-containing mucin-like hormone receptor-like 1), EMR2, EMR3, and Flamingo; two laminin A G-type repeats and nine cadherin domains in Flamingo and its human orthologs Celsr1, Celsr2 and Celsr3; olfactomedin-like domains in the latrotoxin receptors; and five or four thrombospondin type 1 repeats in BAI1 (brain-specific angiogenesis inhibitor 1), BAI2 and BAI3. Furthermore, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions.	303
320386	cd15258	7tmB2_GPR126-like_Adhesion_VIII	orphan GPR126 and related proteins, group VIII adhesion GPCRs, member of the class B2 family of seven-transmembrane G protein-coupled receptors. Group VIII adhesion GPCRs include orphan GPCRs such as GPR56, GPR64, GPR97, GPR112, GPR114, and GPR126. GPR56 is involved in the regulation of oligodendrocyte development and myelination in the central nervous system via coupling to G(12/13) proteins, which leads to the activation of RhoA GTPase. GPR126, on the other hand, is required for Schwann cell, but not oligodendrocyte, myelination in the peripheral nervous system. GPR64 is mainly expressed in the epididymis of the male reproductive tract, and targeted deletion of GPR64 causes sperm stasis and efferent duct blockage due to abnormal fluid reabsorption, resulting in male infertility. GPR64 is also over-expressed in Ewing's sarcoma (ES), as well as upregulated in other carcinomas from kidney, prostate or lung, and promotes invasiveness and metastasis in ES via the upregulation of placental growth factor (PGF) and matrix metalloproteinase (MMP) 1. GPR97 is identified as a lymphatic adhesion receptor that is specifically expressed in lymphatic endothelium, but not in blood vascular endothelium, and is shown to regulate migration of lymphatic endothelial cells via the small GTPases RhoA and cdc42. GPR112 is specifically expressed in normal enterochromaffin cells and gastrointestinal neuroendocrine carcinoma cells, but its biological function is unknown. GPR114 is mainly found in granulocytes (polymorphonuclear leukocytes), and GPR114-transfected cells induced an increase in cAMP levels via coupling to G(s) protein. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, that are coupled to a class B seven-transmembrane domain. 
Furthermore, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions.	267
320387	cd15259	7tmB2_GPR124-like_Adhesion_III	orphan GPR124 and related proteins, group III adhesion GPCRs, member of class B2 family of seven-transmembrane G protein-coupled receptors. Group III adhesion GPCRs include orphan GPR123, GPR124, GPR125, and their closely related proteins. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, that are coupled to a class B seven-transmembrane domain. Furthermore, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. GPR123 is predominantly expressed in the CNS including thalamus, brain stem and regions containing large pyramidal cells. GPR124, also known as tumor endothelial marker 5 (TEM5), is highly expressed in tumor vessels and in the vasculature of the developing embryo. GPR124 is essentially required for proper angiogenic sprouting into neural tissue, CNS-specific vascularization, and formation of the blood-brain barrier. GPR124 also interacts with the PDZ domain of DLG1 (discs large homolog 1) through its PDZ-binding motif. Recently, studies of double-knockout mice showed that GPR124 functions as a co-activator of Wnt7a/Wnt7b-dependent beta-catenin signaling in brain endothelium. Furthermore, WNT7-stimulated beta-catenin signaling is regulated by GPR124's intracellular PDZ binding motif and leucine-rich repeats (LRR) in its N-terminal extracellular domain. 
GPR125 directly interacts with dishevelled (Dvl) via its intracellular C-terminus, and together, GPR125 and Dvl recruit a subset of planar cell polarity (PCP) components into membrane subdomains, a prerequisite for activation of Wnt/PCP signaling.  Thus, GPR125 influences the noncanonical WNT/PCP pathway, which does not involve beta-catenin, through interacting with and modulating the distribution of Dvl.	260
320388	cd15260	7tmB1_NPR_B4_insect-like	insect neuropeptide receptor subgroup B4 and related proteins, member of the class B family of seven-transmembrane G protein-coupled receptors. This subgroup includes a neuropeptide receptor found in Nilaparvata lugens (brown planthopper) and its closely related proteins from mollusks and annelid worms. They belong to the B1 subfamily of class B GPCRs, also referred to as secretin-like receptor family, which includes receptors for polypeptide hormones of 27-141 amino-acid residues such as secretin, glucagon, glucagon-like peptide (GLP), calcitonin gene-related peptide, parathyroid hormone (PTH), and corticotropin-releasing factor. These receptors contain the large N-terminal extracellular domain (ECD), which plays a critical role in hormone recognition by binding to the C-terminal portion of the peptide. On the other hand, the N-terminal segment of the hormone induces receptor activation by interacting with the receptor transmembrane domains and connecting extracellular loops, triggering intracellular signaling pathways. All members of the B1 subfamily preferentially couple to G proteins of G(s) family, which positively stimulate adenylate cyclase, leading to increased intracellular cAMP formation and calcium influx. The class B GPCRs have been identified in all the vertebrates, from fishes to mammals, as well as invertebrates including Caenorhabditis elegans and Drosophila melanogaster, but are not present in plants, fungi, or prokaryotes.	267
320389	cd15261	7tmB1_PDFR	The pigment dispersing factor receptor, member of the class B seven-transmembrane G protein-coupled receptors. The pigment dispersing factor receptor (PDFR) is a G protein-coupled receptor that binds the circadian clock neuropeptide PDF, a functional ortholog of the mammalian vasoactive intestinal peptide (VIP), on the pacemaker neurons. The PDFR is implicated in regulating flight circuit development and in modulating acute flight in Drosophila melanogaster. PDFR activation stimulates adenylate cyclase, thereby increasing cAMP levels in many different pacemakers, and the receptor signaling has been shown to regulate behavioral circadian rhythms and geotaxis in Drosophila. The PDFR belongs to the B1 subfamily of class B GPCRs, also referred to as secretin-like receptor family, which includes receptors for polypeptide hormones of 27-141 amino-acid residues such as secretin, glucagon, glucagon-like peptide (GLP), calcitonin gene-related peptide, parathyroid hormone (PTH), and corticotropin-releasing factor. These receptors contain the large N-terminal extracellular domain (ECD), which plays a critical role in hormone recognition by binding to the C-terminal portion of the peptide. On the other hand, the N-terminal segment of the hormone induces receptor activation by interacting with the receptor transmembrane domains and connecting extracellular loops, triggering intracellular signaling pathways. All members of the B1 subfamily preferentially couple to G proteins of G(s) family, which positively stimulate adenylate cyclase, leading to increased intracellular cAMP formation and calcium influx. They play key roles in hormone homeostasis in mammals and are promising drug targets in various human diseases including diabetes, osteoporosis, obesity, neurodegenerative conditions (Alzheimer's and Parkinson's), cardiovascular disease, migraine, and psychiatric disorders (anxiety, depression).	282
320390	cd15262	7tmB1_NPR_B3_insect-like	insect neuropeptide receptor subgroup B3 and related proteins belong to subfamily B1 of hormone receptors; member of the class B secretin-like seven-transmembrane G protein-coupled receptors. This subgroup includes a neuropeptide receptor found in Bombyx mori (silk worm) and its closely related proteins from arthropods. They belong to the B1 subfamily of class B GPCRs, also referred to as secretin-like receptor family, which includes receptors for polypeptide hormones of 27-141 amino-acid residues such as secretin, glucagon, glucagon-like peptide (GLP), calcitonin gene-related peptide, parathyroid hormone (PTH), and corticotropin-releasing factor. These receptors contain the large N-terminal extracellular domain (ECD), which plays a critical role in hormone recognition by binding to the C-terminal portion of the peptide. On the other hand, the N-terminal segment of the hormone induces receptor activation by interacting with the receptor transmembrane domains and connecting extracellular loops, triggering intracellular signaling pathways. All members of the B1 subfamily preferentially couple to G proteins of G(s) family, which positively stimulate adenylate cyclase, leading to increased intracellular cAMP formation and calcium influx. The class B GPCRs have been identified in all the vertebrates, from fishes to mammals, as well as invertebrates including Caenorhabditis elegans and Drosophila melanogaster, but are not present in plants, fungi, or prokaryotes.	270
320391	cd15263	7tmB1_DH_R	insect diuretic hormone receptors, member of the class B family of seven-transmembrane G protein-coupled receptors. This group includes G protein-coupled receptors that specifically bind to insect diuretic hormones found in Manduca sexta (moth) and Acheta domesticus (the house cricket), among others. Insect diuretic hormones and their GPCRs play critical roles in the regulation of water and ion balance. Thus they are attractive targets for developing new insecticides. Activation of the diuretic hormone receptors stimulates adenylate cyclase, thereby increasing cAMP levels in Malpighian tubules. They belong to the B1 subfamily of class B GPCRs, also referred to as secretin-like receptor family, which includes receptors for polypeptide hormones of 27-141 amino-acid residues such as secretin, glucagon, glucagon-like peptide (GLP), calcitonin gene-related peptide, parathyroid hormone (PTH), and corticotropin-releasing factor. These receptors contain the large N-terminal extracellular domain (ECD), which plays a critical role in hormone recognition by binding to the C-terminal portion of the peptide. On the other hand, the N-terminal segment of the hormone induces receptor activation by interacting with the receptor transmembrane domains and connecting extracellular loops, triggering intracellular signaling pathways. All members of the B1 subfamily preferentially couple to G proteins of Gs family, which positively stimulate adenylate cyclase, leading to increased intracellular cAMP formation and calcium influx.	272
320392	cd15264	7tmB1_CRF-R	corticotropin-releasing factor receptors, member of the class B family of seven-transmembrane G protein-coupled receptors. The vertebrate corticotropin-releasing factor (CRF) receptors are predominantly expressed in the central nervous system with high levels in cortex tissue, brain stem, and pituitary. They have two isoforms as a result of alternative splicing of the same receptor gene: CRF-R1 and CRF-R2, which differ in tissue distribution and ligand binding affinities. Recently, a third CRF receptor (CRF-R3) has been identified in catfish pituitary. The catfish CRF-R1 is highly homologous to CRF-R3. CRF is a 41-amino acid neuropeptide that plays a central role in coordinating neuroendocrine, behavioral, and autonomic responses to stress by acting as the primary neuroregulator of the hypothalamic-pituitary-adrenal axis, which controls the levels of cortisol and other stress related hormones. In addition, the CRF family of neuropeptides also includes structurally related peptides such as mammalian urocortin, fish urotensin I, and frog sauvagine. The actions of CRF and CRF-related peptides are mediated through specific binding to CRF-R1 and CRF-R2. CRF and urocortin 1 bind and activate mammalian CRF-R1 with similar high affinities. By contrast, urocortin 2 and urocortin 3 do not bind to CRF-R1 or stimulate CRF-R1-mediated cAMP formation. Urocortin 1 also shows high affinity for mammalian CRF-R2, whereas CRF has significantly lower affinity for this receptor. This evidence suggests that urocortin 1 is an endogenous ligand for CRF-R1 and CRF-R2. The CRF receptors are members of the B1 subfamily of class B GPCRs, also referred to as secretin-like receptor family, which includes receptors for polypeptide hormones of 27-141 amino-acid residues such as secretin, glucagon, glucagon-like peptide (GLP), calcitonin gene-related peptide, and parathyroid hormone (PTH). 
These receptors contain the large N-terminal extracellular domain (ECD), which plays a critical role in hormone recognition by binding to the C-terminal portion of the peptide. On the other hand, the N-terminal segment of the hormone induces receptor activation by interacting with the receptor transmembrane domains and connecting extracellular loops, triggering intracellular signaling pathways. All members of the B1 subfamily preferentially couple to G proteins of G(s) family, which positively stimulate adenylate cyclase, leading to increased intracellular cAMP formation and calcium influx. However, depending on its cellular location and function, CRF receptors can activate multiple G proteins, which can in turn stimulate different second messenger pathways.	265
320393	cd15265	7tmB1_PTHR	parathyroid hormone receptors, member of the class B family of seven-transmembrane G protein-coupled receptors. The parathyroid hormone (PTH) receptor family has three subtypes: PTH1R, PTH2R and PTH3R. PTH1R is expressed in bone and kidney and is activated by two polypeptide ligands: PTH, an endocrine hormone that regulates calcium homoeostasis and bone maintenance, and PTH-related peptide (PTHrP), a paracrine factor that regulates endochondral bone development. PTH1R couples predominantly to a G(s)-protein that in turn activates adenylate cyclase thereby producing cAMP, but it can also couple to several G protein subtypes, including G(q/11), G(i/o), and G(12/13), resulting in activation of multiple intracellular signaling pathways. PTH2R is potently activated by tuberoinfundibular peptide-39 (TIP-39), but not by PTHrP. PTH also strongly activates human PTH2R, but only weakly activates rat and zebrafish PTH2Rs, suggesting that TIP-39 is a natural ligand for PTH2R. On the other hand, PTH3R binds and responds to both PTH and PTHrP, but not the TIP-39. Moreover, the PTH3R is more closely related to the PTH1R than PTH2R. PTH1R is found in all vertebrate species, whereas PTH2R is found in mammals and fish, but not in chicken or frog. The PTH3R is found in chicken and fish, but it is absent in mammals. The PTH receptors are members of the B1 (or secretin-like) subfamily of class B GPCRs, which include receptors for polypeptide hormones of 27-141 amino-acid residues such as secretin, glucagon, glucagon-like peptide (GLP), and calcitonin gene-related peptide. These receptors contain the large N-terminal extracellular domain (ECD), which plays a critical role in hormone recognition by binding to the C-terminal portion of the peptide. On the other hand, the N-terminal segment of the hormone induces receptor activation by interacting with the receptor transmembrane domains and connecting extracellular loops, triggering intracellular signaling pathways.	289
320394	cd15266	7tmB1_GLP2R	glucagon-like peptide-2 receptor, member of the class B family of seven-transmembrane G protein-coupled receptors. Glucagon-like peptide-2 receptor (GLP2R) is a member of the glucagon receptor family of G protein-coupled receptors, which also includes glucagon receptor (GCGR) and GLP1R. GLP2R is activated by glucagon-like peptide 2, which is derived from the large proglucagon precursor. Activation of GLP1R stimulates glucose-dependent insulin secretion from pancreatic beta cells, whereas activation of GLP2R stimulates intestinal epithelial proliferation and increases villus height in the small intestine. GCGR regulates blood glucose levels by control of hepatic glycogenolysis and gluconeogenesis and by regulation of insulin secretion from the pancreatic beta-cells. GLP2R belongs to the B1 (or secretin-like) subfamily of class B GPCRs, which includes receptors for polypeptide hormones of 27-141 amino-acid residues such as secretin, calcitonin gene-related peptide, parathyroid hormone (PTH), and corticotropin-releasing factor. These receptors contain the large N-terminal extracellular domain (ECD), which plays a critical role in hormone recognition by binding to the C-terminal portion of the peptide. On the other hand, the N-terminal segment of the hormone induces receptor activation by interacting with the receptor transmembrane domains and connecting extracellular loops, triggering intracellular signaling pathways. All members of the B1 subfamily preferentially couple to G proteins of G(s) family, which positively stimulate adenylate cyclase, leading to increased intracellular cAMP formation and calcium influx. However, depending on their cellular location, GCGR and GLP receptors can activate multiple G proteins, which can in turn stimulate different second messenger pathways.	280
320395	cd15267	7tmB1_GCGR	glucagon receptor, member of the class B family of seven-transmembrane G protein-coupled receptors. Glucagon receptor (GCGR) is a member of the glucagon receptor family of G protein-coupled receptors, which also includes glucagon-like peptide-1 receptor (GLP1R) and GLP2R. GCGR is activated by glucagon, which is derived from the large proglucagon precursor. GCGR regulates blood glucose levels by control of hepatic glycogenolysis and gluconeogenesis and by regulation of insulin secretion from the pancreatic beta-cells. Activation of GLP1R stimulates glucose-dependent insulin secretion from pancreatic beta cells, whereas activation of GLP2R stimulates intestinal epithelial proliferation and increases villus height in the small intestine. GCGR belongs to the B1 (or secretin-like) subfamily of class B GPCRs, which includes receptors for polypeptide hormones of 27-141 amino-acid residues such as secretin, calcitonin gene-related peptide, parathyroid hormone (PTH), and corticotropin-releasing factor. These receptors contain the large N-terminal extracellular domain (ECD), which plays a critical role in hormone recognition by binding to the C-terminal portion of the peptide. On the other hand, the N-terminal segment of the hormone induces receptor activation by interacting with the receptor transmembrane domains and connecting extracellular loops, triggering intracellular signaling pathways. All members of the B1 subfamily preferentially couple to G proteins of G(s) family, which positively stimulate adenylate cyclase, leading to increased intracellular cAMP formation and calcium influx. However, depending on their cellular location, GCGR and GLP receptors can activate multiple G proteins, which can in turn stimulate different second messenger pathways.	281
341342	cd15268	7tmB1_GLP1R	glucagon-like peptide-1 receptor, member of the class B family of seven-transmembrane G protein-coupled receptors. Glucagon-like peptide-1 receptor (GLP1R) is a member of the glucagon receptor family of G protein-coupled receptors, which also includes glucagon receptor and GLP2R. GLP1R is activated by glucagon-like peptide 1 (GLP1), which is derived from the large proglucagon precursor. Activation of GLP1R stimulates glucose-dependent insulin secretion from pancreatic beta cells, whereas activation of GLP2R stimulates intestinal epithelial proliferation and increases villus height in the small intestine. GCGR regulates blood glucose levels by control of hepatic glycogenolysis and gluconeogenesis and by regulation of insulin secretion from the pancreatic beta-cells. Receptors in this group belong to the B1 (or secretin-like) subfamily of class B GPCRs, which includes receptors for polypeptide hormones of 27-141 amino-acid residues such as secretin, calcitonin gene-related peptide, parathyroid hormone (PTH), and corticotropin-releasing factor. These receptors contain the large N-terminal extracellular domain (ECD), which plays a critical role in hormone recognition by binding to the C-terminal portion of the peptide. On the other hand, the N-terminal segment of the hormone induces receptor activation by interacting with the receptor transmembrane domains and connecting extracellular loops, triggering intracellular signaling pathways. All members of the B1 subfamily preferentially couple to G proteins of G(s) family, which positively stimulate adenylate cyclase, leading to increased intracellular cAMP formation and calcium influx. However, depending on their cellular location, GCGR and GLP receptors can activate multiple G proteins, which can in turn stimulate different second messenger pathways.	279
320397	cd15269	7tmB1_VIP-R1	vasoactive intestinal polypeptide (VIP) receptor 1, member of the class B family of seven-transmembrane G protein-coupled receptors. Vasoactive intestinal peptide (VIP) receptor 1 is a member of the group of G protein-coupled receptors for structurally similar peptide hormones that also include secretin, growth-hormone-releasing hormone (GHRH), and pituitary adenylate cyclase activating polypeptide (PACAP). These receptors are classified into the subfamily B1 of class B GPCRs that consists of the classical hormone receptors and have been identified in all the vertebrates, from fishes to mammals, but are not present in plants, fungi, or prokaryotes. For all class B receptors, the large N-terminal extracellular domain plays a critical role in peptide hormone recognition. VIP and PACAP exert their effects through three G protein-coupled receptors, PACAP-R1, VIP-R1 (vasoactive intestinal receptor type 1, also known as VPAC1) and VIP-R2 (or VPAC2). PACAP-R1 binds only PACAP with high affinity, whereas VIP-R1 and -R2 specifically bind and respond to both VIP and PACAP.  VIP and PACAP and their receptors are widely expressed in the brain and periphery. They are upregulated in neurons and immune cells in response to CNS injury and/or inflammation and exert potent anti-inflammatory effects, as well as play important roles in the control of circadian rhythms and stress responses, among many others. VIP-R1 is preferentially coupled to a stimulatory G(s) protein, which leads to the activation of adenylate cyclase and thereby increases in intracellular cAMP level. However, depending on its cellular location, VIP-R1 is also capable of coupling to additional G proteins such as G(q) protein, thus leading to the activation of phospholipase C and intracellular calcium influx.	268
320398	cd15270	7tmB1_GHRHR	growth-hormone-releasing hormone receptor, member of the class B family of seven-transmembrane G protein-coupled receptors. Growth hormone-releasing hormone receptor (GHRHR) is a member of the group of G protein-coupled receptors for structurally similar peptide hormones that also include secretin, pituitary adenylate cyclase activating polypeptide (PACAP), and vasoactive intestinal peptide. These receptors are classified into the subfamily B1 of class B GPCRs that consists of the classical hormone receptors and have been identified in all the vertebrates, from fishes to mammals, but are not present in plants, fungi, or prokaryotes. For all class B receptors, the large N-terminal extracellular domain plays a critical role in peptide hormone recognition. GHRHR is a specific receptor for the growth hormone-releasing hormone (GHRH) that controls the synthesis and release of growth hormone (GH) from the anterior pituitary somatotrophs. Mutations in the gene encoding GHRHR have been connected to isolated growth hormone deficiency (IGHD), a short-stature condition caused by deficient production of GH or lack of GH action.  GHRHR is preferentially coupled to a stimulatory G(s) protein, which leads to the activation of adenylate cyclase and thereby increases in intracellular cAMP level. GHRHR is found in mammals as well as zebrafish and chicken, whereas the GHRHR type 2, an ortholog of the GHRHR, has only been identified in ray-finned fish, chicken and Xenopus. Xenopus laevis GHRHR2 has been shown to interact with both endogenous GHRH and PACAP-related peptide (PRP).	268
320399	cd15271	7tmB1_GHRHR2	growth-hormone-releasing hormone receptor type 2, member of the class B family of seven-transmembrane G protein-coupled receptors. Growth hormone-releasing hormone receptor type 2 (GHRHR2) is found in non-mammalian vertebrates such as chicken and frog. It is a member of the group of G protein-coupled receptors for structurally similar peptide hormones that also include secretin, pituitary adenylate cyclase activating polypeptide (PACAP), vasoactive intestinal peptide, and mammalian growth hormone-releasing hormone. These receptors are classified into the subfamily B1 of class B GPCRs that consists of the classical hormone receptors and have been identified in all the vertebrates, from fishes to mammals, but are not present in plants, fungi, or prokaryotes. For all class B receptors, the large N-terminal extracellular domain plays a critical role in peptide hormone recognition. Mammalian GHRHR is a specific receptor for the growth hormone-releasing hormone (GHRH) that controls the synthesis and release of growth hormone (GH) from the anterior pituitary somatotrophs. Mutations in the gene encoding GHRHR have been connected to isolated growth hormone deficiency (IGHD), a short-stature condition caused by deficient production of GH or lack of GH action.  Mammalian GHRHR is preferentially coupled to a stimulatory G(s) protein, which leads to the activation of adenylate cyclase and thereby increases in intracellular cAMP level. GHRHR is found in mammals as well as zebrafish and chicken, whereas the GHRHR type 2, an ortholog of the GHRHR, has only been identified in ray-finned fish, chicken and Xenopus. Xenopus laevis GHRHR2 has been shown to interact with both endogenous GHRH and PACAP-related peptide (PRP).	267
320400	cd15272	7tmB1_PTH-R_related	invertebrate parathyroid hormone-related receptors, member of the class B family of seven-transmembrane G protein-coupled receptors. This group includes parathyroid hormone (PTH)-related receptors found in invertebrates such as mollusks and annelid worms. The PTH family receptors are members of the B1 subfamily of class B GPCRs, which includes receptors for polypeptide hormones of 27-141 amino-acid residues such as secretin, glucagon, glucagon-like peptide (GLP), and calcitonin gene-related peptide. These receptors contain the large N-terminal extracellular domain (ECD), which plays a critical role in hormone recognition by binding to the C-terminal portion of the peptide. On the other hand, the N-terminal segment of the hormone induces receptor activation by interacting with the receptor transmembrane domains and connecting extracellular loops, triggering intracellular signaling pathways. The parathyroid hormone type 1 receptor (PTH1R) is found in all vertebrate species and is activated by two polypeptide ligands: parathyroid hormone (PTH), an endocrine hormone that regulates calcium homoeostasis and bone maintenance, and PTH-related peptide (PTHrP), a paracrine factor that regulates endochondral bone development. PTH1R couples predominantly to a G(s)-protein that in turn activates adenylyl cyclase thereby producing cAMP, but it can also couple to several G protein subtypes, including G(q/11), G(i/o), and G(12/13), resulting in activation of multiple signaling pathways.	285
320401	cd15273	7tmB1_NPR_B7_insect-like	insect neuropeptide receptor subgroup B7 and related proteins, member of the class B family of seven-transmembrane G protein-coupled receptors. This subgroup includes a neuropeptide receptor found in Nilaparvata lugens (brown planthopper) and its closely related proteins from invertebrates. They belong to the B1 subfamily of class B GPCRs, also referred to as secretin-like receptor family, which includes receptors for polypeptide hormones of 27-141 amino-acid residues such as secretin, glucagon, glucagon-like peptide (GLP), calcitonin gene-related peptide, parathyroid hormone (PTH), and corticotropin-releasing factor. These receptors contain the large N-terminal extracellular domain (ECD), which plays a critical role in hormone recognition by binding to the C-terminal portion of the peptide. On the other hand, the N-terminal segment of the hormone induces receptor activation by interacting with the receptor transmembrane domains and connecting extracellular loops, triggering intracellular signaling pathways. All members of the B1 subfamily preferentially couple to G proteins of G(s) family, which positively stimulate adenylate cyclase, leading to increased intracellular cAMP formation and calcium influx. The class B GPCRs have been identified in all the vertebrates, from fishes to mammals, as well as invertebrates including Caenorhabditis elegans and Drosophila melanogaster, but are not present in plants, fungi, or prokaryotes.	285
341343	cd15274	7tmB1_calcitonin_R	calcitonin receptor, member of the class B family of seven-transmembrane G protein-coupled receptors. This group includes G protein-coupled receptors for calcitonin (CT) and calcitonin gene-related peptides (CGRPs). Calcitonin, a 32-amino acid peptide hormone, is involved in calcium metabolism in many mammalian species and acts to reduce blood calcium levels and directly inhibits bone resorption by acting on osteoclasts. Thus, CT acts as an antagonist to parathyroid hormone and is commonly used in the treatment of bone disorders. The CT receptor is predominantly found in osteoclasts, kidney, and brain, and is primarily coupled to stimulatory G(s) protein, which leads to activation of adenylate cyclase, thereby increasing cAMP production. CGRP, a member of the calcitonin family of peptides, is a potent vasodilator and may contribute to migraine. It is expressed in the peripheral and central nervous system and exists in two forms in humans (alpha-CGRP and beta-CGRP). CGRP mediates its physiological effects through calcitonin receptor-like receptor (CRLR) and receptor activity-modifying protein 1 (RAMP1), a single transmembrane domain protein. Thus, the CRLR/RAMP1 complex serves as a functional CGRP receptor.  On the other hand, the CRLR/RAMP2 and CRLR/RAMP3 complexes function as adrenomedullin-specific receptors. The CT and CGRP receptors belong to the B1 subfamily of class B GPCRs, also referred to as secretin-like receptor family, which includes receptors for polypeptide hormones of 27-141 amino-acid residues such as secretin, glucagon, glucagon-like peptide (GLP), parathyroid hormone (PTH), and corticotropin-releasing factor. These receptors contain the large N-terminal extracellular domain (ECD), which plays a critical role in hormone recognition by binding to the C-terminal portion of the peptide.	274
320403	cd15275	7tmB1_secretin	secretin receptor, member of the class B family of seven-transmembrane G protein-coupled receptors. Secretin receptor is a member of the group of G protein-coupled receptors for structurally similar peptide hormones that also include vasoactive intestinal peptide (VIP), growth-hormone-releasing hormone (GHRH), and pituitary adenylate cyclase activating polypeptide (PACAP). These receptors are classified into the subfamily B1 of class B GPCRs that consists of the classical hormone receptors, and have been identified in all the vertebrates, from fishes to mammals, but are not present in plants, fungi, or prokaryotes. For all class B receptors, the large N-terminal extracellular domain plays a critical role in peptide hormone recognition. Secretin, a polypeptide secreted by entero-endocrine S cells in the small intestine, is involved in maintaining body fluid balance. This polypeptide regulates the secretion of bile and bicarbonate into the duodenum from the pancreatic and biliary ducts, as well as regulates the duodenal pH by the control of gastric acid secretion. Studies with secretin receptor-null mice indicate that secretin plays a role in regulating renal water reabsorption. Secretin mediates its biological actions by elevating intracellular cAMP via G protein-coupled secretin receptor, which is expressed in the brain, pancreas, stomach, kidney, and liver.	271
320404	cd15277	7tmC_RAIG3_GPRC5C	retinoic acid-inducible orphan G-protein-coupled receptor 3; class C family of seven-transmembrane G protein-coupled receptors, group 5, member C. Retinoic acid-inducible G-protein-coupled receptors (RAIGs), also referred to as GPCR class C group 5, are a group consisting of four orphan receptors RAIG1 (GPRC5A), RAIG2 (GPRC5B), RAIG3 (GPRC5C), and RAIG4 (GPRC5D). Unlike other members of the class C GPCRs which contain a large N-terminal extracellular domain, RAIGs have a shorter N-terminus. Thus, it is unlikely that RAIGs bind an agonist at its N-terminus domain. Instead, the agonists may bind to the seven-transmembrane domain of these receptors. In addition, RAIG2 and RAIG3 contain a cleavable signal peptide whereas RAIG1 and RAIG4 do not. Although their expression is induced by retinoic acid (vitamin A analog), their biological function is not clearly understood. To date, no ligand is known for the members of RAIG family. Three receptor types (RAIG1-3) are found in vertebrates, while RAIG4 is only present in mammals. They show distinct tissue distribution with RAIG1 being primarily expressed in the lung, RAIG2 in the brain and placenta, RAIG3 in the brain, kidney and liver, and RAIG4 in the skin. The specific function of RAIG3 is unknown; however, this protein may play a role in mediating the effects of retinoic acid on embryogenesis, differentiation, and tumorigenesis through interaction with a G-protein signaling cascade.	250
320405	cd15278	7tmC_RAIG2_GPRC5B	retinoic acid-inducible orphan G-protein-coupled receptor 2; class C family of seven-transmembrane G protein-coupled receptors, group 5, member B. Retinoic acid-inducible G-protein-coupled receptors (RAIGs), also referred to as GPCR class C group 5, are a group consisting of four orphan receptors RAIG1 (GPRC5A), RAIG2 (GPRC5B), RAIG3 (GPRC5C), and RAIG4 (GPRC5D). Unlike other members of the class C GPCRs which contain a large N-terminal extracellular domain, RAIGs have a shorter N-terminus. Thus, it is unlikely that RAIGs bind an agonist at its N-terminus domain. Instead, the agonists may bind to the seven-transmembrane domain of these receptors. In addition, RAIG2 and RAIG3 contain a cleavable signal peptide whereas RAIG1 and RAIG4 do not. Although their expression is induced by retinoic acid (vitamin A analog), their biological function is not clearly understood. To date, no ligand is known for the members of RAIG family. Three receptor types (RAIG1-3) are found in vertebrates, while RAIG4 is only present in mammals. They show distinct tissue distribution with RAIG1 being primarily expressed in the lung, RAIG2 in the brain and placenta, RAIG3 in the brain, kidney and liver, and RAIG4 in the skin. RAIG2 (GPRC5B), a mammalian Boss (Bride of sevenless) homolog, has been shown to activate obesity-associated inflammatory signaling in adipocytes, and GPRC5B knockout mice have been shown to be resistant to high-fat diet-induced obesity and insulin resistance.	244
320406	cd15279	7tmC_RAIG1_4_GPRC5A_D	retinoic acid-inducible orphan G-protein-coupled receptors 1 and 4; class C family of seven-transmembrane G protein-coupled receptors, group 5, member A and D. Retinoic acid-inducible G-protein-coupled receptors (RAIGs), also referred to as GPCR class C group 5, are a group consisting of four orphan receptors RAIG1 (GPRC5A), RAIG2 (GPRC5B), RAIG3 (GPRC5C), and RAIG4 (GPRC5D). Unlike other members of the class C GPCRs which contain a large N-terminal extracellular domain, RAIGs have a shorter N-terminus. Thus, it is unlikely that RAIGs bind an agonist at its N-terminus domain. Instead, the agonists may bind to the seven-transmembrane domain of these receptors. In addition, RAIG2 and RAIG3 contain a cleavable signal peptide whereas RAIG1 and RAIG4 do not. Although their expression is induced by retinoic acid (vitamin A analog), their biological function is not clearly understood. To date, no ligand is known for the members of RAIG family. Three receptor types (RAIG1-3) are found in vertebrates, while RAIG4 is only present in mammals. They show distinct tissue distribution with RAIG1 being primarily expressed in the lung, RAIG2 in the brain and placenta, RAIG3 in the brain, kidney and liver, and RAIG4 in the skin. RAIG1 is evolutionarily conserved from mammals to fish. RAIG1 has been shown to act as a tumor suppressor in non-small cell lung carcinoma as well as oral squamous cell carcinoma, but it could also act as an oncogene in breast cancer, colorectal cancer, and pancreatic cancer. Studies have shown that overexpression of RAIG1 decreases intracellular cAMP levels.  Moreover, knocking out RAIG1 induces the activation of the NF-kB and STAT3 signaling pathways leading to cell proliferation and resistance to apoptosis.  The specific function of RAIG4 is unknown; however, this protein may play a role in mediating the effects of retinoic acid on embryogenesis, differentiation, and tumorigenesis through interaction with a G-protein signaling cascade.	248
320407	cd15280	7tmC_V2R-like	vomeronasal type-2 receptor-like proteins, member of the class C family of seven-transmembrane G protein-coupled receptors. This group represents vomeronasal type-2 receptor-like proteins that are closely related to the V2R family of vomeronasal GPCRs. Members of the V2R family of vomeronasal GPCRs are involved in detecting protein pheromones for social and sexual cues between the same species. V2Rs and G-alpha(o) protein are coexpressed in the basal layer of the vomeronasal organ (VNO), which is the sensory organ of the accessory olfactory system present in amphibians, reptiles, and non-primate mammals such as mice and rodents, but it is non-functional or absent in humans, apes, and monkeys. On the other hand, members of the V1R receptor family and G-alpha(i2) protein are co-expressed in the apical neurons of the VNO. Activation of V1R or V2R causes activation of phospholipase pathway, generating the secondary messengers diacylglycerol (DAG) and IP3. However, in contrast to V1Rs, V2Rs contain the long N-terminal extracellular domain, which is believed to bind pheromones. Human V2R1-like protein, also known as putative calcium-sensing receptor-like 1 (CASRL1), is not included here because it is a nonfunctional pseudogene.	253
320408	cd15281	7tmC_GPRC6A	class C of seven-transmembrane G protein-coupled receptors, subtype 6A. GPRC6A (GPCR, class C, group 6, subtype A) is a widely expressed amino acid-sensing GPCR that is most closely related to CaSR.  GPRC6A is most potently activated by the basic amino acids L-arginine, L-lysine, and L-ornithine and less potently by small aliphatic amino acids. Moreover, the receptor can be either activated or modulated by divalent cations such as Ca2+ and Mg2+. GPRC6A is expressed in the testis, but not the ovary and specifically also binds to the osteoblast-derived hormone osteocalcin (OCN), which regulates testosterone production by the testis and male fertility independently of the hypothalamic-pituitary axis. Furthermore, GPRC6A knockout studies suggest that GPRC6A is involved in regulation of bone metabolism, male reproduction, energy homeostasis, glucose metabolism, and in activation of inflammation response, as well as prostate cancer growth and progression, among others. GPRC6A has been suggested to couple to the Gq subtype of G proteins, leading to IP3 production and intracellular calcium mobilization. GPRC6A contains a large extracellular Venus flytrap-like domain in the N-terminus, cysteine-rich domain (CRD), and seven-transmembrane (7TM) domain, which are characteristics of the class C GPCRs.  The Venus flytrap-like domain shares strong sequence homology to bacterial periplasmic binding proteins and possesses the orthosteric amino acid and calcium binding sites for members of the class C, including CaSR, GABA-B, GPRC6A, mGlu, and TAS1R receptors.	249
320409	cd15282	7tmC_CaSR	calcium-sensing receptor, member of the class C of seven-transmembrane G protein-coupled receptors. CaSR is a widely expressed GPCR that is involved in sensing small changes in extracellular levels of calcium ion to maintain a constant level of the extracellular calcium via modulating the synthesis and secretion of calcium regulating hormones, such as parathyroid hormone (PTH), in order to regulate Ca(2+) transport into or out of the extracellular fluid via kidney, intestine, and/or bone. For instance, when Ca2+ is high, CaSR downregulates PTH synthesis and secretion, leading to an increase in renal Ca2+ excretion, a decrease in intestinal Ca2+ absorption, and a reduction in release of skeletal Ca2+. CaSR is coupled to both G(q/11)-dependent activation of phospholipase and, subsequently, intracellular calcium mobilization and protein kinase C activation as well as G(i/o)-dependent inhibition of adenylate cyclase leading to inhibition of cAMP formation.  CaSR is closely related to GPRC6A (GPCR, class C, group 6, subtype A), which is an amino acid-sensing GPCR that is most potently activated by the basic amino acids L-arginine, L-lysine, and L-ornithine. These receptors contain a large extracellular Venus flytrap-like domain in the N-terminus, cysteine-rich domain (CRD), and seven-transmembrane (7TM) domain, which are characteristics of the class C GPCRs.  The Venus flytrap-like domain shares strong sequence homology to bacterial periplasmic binding proteins and possesses the orthosteric amino acid and calcium binding sites for members of the class C, including CaSR, GABA-B1, GPRC6A, mGlu, and TAS1R receptors.	252
320410	cd15283	7tmC_V2R_pheromone	vomeronasal type-2 pheromone receptors, member of the class C family of seven-transmembrane G protein-coupled receptors. This group represents vomeronasal type-2 pheromone receptors (V2Rs). Members of the V2R family of vomeronasal GPCRs are involved in detecting protein pheromones for social and sexual cues between the same species. V2Rs and G-alpha(o) protein are coexpressed in the basal layer of the vomeronasal organ (VNO), which is the sensory organ of the accessory olfactory system present in amphibians, reptiles, and non-primate mammals such as mice and rodents, but it is non-functional or absent in humans, apes, and monkeys. On the other hand, members of the V1R receptor family and G-alpha(i2) protein are coexpressed in the apical neurons of the VNO. Activation of V1R or V2R causes activation of phospholipase pathway, producing the second messengers diacylglycerol (DAG) and IP3. However, in contrast to V1Rs, V2Rs contain the long N-terminal extracellular domain, which is believed to bind pheromones.	252
320411	cd15284	7tmC_mGluR_group2	metabotropic glutamate receptors in group 2, member of the class C family of seven-transmembrane G protein-coupled receptors. The metabotropic glutamate receptors (mGluRs) in group 2 include mGluR 2 and 3. They are homodimeric class C G-protein coupled receptors which are activated by glutamate, the major excitatory neurotransmitter of the CNS. mGluRs are involved in regulating neuronal excitability and synaptic transmission via intracellular activation of second messenger signaling pathways. While the ionotropic glutamate receptor subtypes (AMPA, NMDA, and kainate) mediate fast excitatory postsynaptic transmission, mGluRs are known to mediate slower excitatory postsynaptic responses and to be involved in synaptic plasticity in the mammalian brain. In addition to seven-transmembrane helices, the class C GPCRs are characterized by a large N-terminal extracellular Venus flytrap-like domain, which is composed of two adjacent lobes separated by a cleft which binds an endogenous ligand. Moreover, they exist as either homo- or heterodimers, which are essential for their function. For instance, mGluRs form homodimers via interactions between the N-terminal Venus flytrap domains and the intermolecular disulphide bonds between cysteine residues located in the cysteine-rich domain (CRD). At least eight different subtypes of metabotropic receptors (mGluR1-8) have been identified and further classified into three groups based on their sequence homology, pharmacological properties, and signaling pathways. Group 1 (mGluR1 and mGluR5) receptors are predominantly located postsynaptically on neurons and are involved in long-term synaptic plasticity in the brain, including long-term potentiation (LTP) in the hippocampus and long-term depression (LTD) in the cerebellum. They are coupled to G(q/11) proteins, thereby activating phospholipase C to generate inositol-1,4,5-triphosphate (IP3) and diacylglycerol (DAG), which in turn lead to Ca2+ release and protein kinase C activation, respectively. Group 1 mGluR expression is shown to be strongly upregulated in animal models of epilepsy, brain injury, inflammatory, and neuropathic pain, as well as in patients with amyotrophic lateral sclerosis or multiple sclerosis. Group 2 (mGluR2 and mGluR3) and 3 (mGluR4, mGluR6, mGluR7, and mGluR8) receptors are predominantly localized presynaptically in the active region of neurotransmitter release. They are coupled to G(i/o) proteins, which leads to inhibition of adenylate cyclase activity and cAMP formation, and consequently to a decrease in protein kinase A (PKA) activity. Ultimately, activation of these receptors leads to inhibition of neurotransmitter release such as glutamate and GABA via inhibition of Ca2+ channels and activation of K+ channels. Furthermore, while activation of Group 1 mGluRs increases NMDA (N-methyl-D-aspartate) receptor activity and risk of neurotoxicity, Group 2 and 3 mGluRs decrease NMDA receptor activity and prevent neurotoxicity.	254
320412	cd15285	7tmC_mGluR_group1	metabotropic glutamate receptors in group 1, member of the class C family of seven-transmembrane G protein-coupled receptors. Group 1 mGluRs include mGluR1 and mGluR5, as well as their closely related invertebrate receptors. They are homodimeric class C G-protein coupled receptors which are activated by glutamate, the major excitatory neurotransmitter of the CNS. mGluRs are involved in regulating neuronal excitability and synaptic transmission via intracellular activation of second messenger signaling pathways. While the ionotropic glutamate receptor subtypes (AMPA, NMDA, and kainate) mediate fast excitatory postsynaptic transmission, mGluRs are known to mediate slower excitatory postsynaptic responses and to be involved in synaptic plasticity in the mammalian brain. In addition to seven-transmembrane helices, the class C GPCRs are characterized by a large N-terminal extracellular Venus flytrap-like domain, which is composed of two adjacent lobes separated by a cleft which binds an endogenous ligand. Moreover, they exist as either homo- or heterodimers, which are essential for their function. For instance, mGluRs form homodimers via interactions between the N-terminal Venus flytrap domains and the intermolecular disulphide bonds between cysteine residues located in the cysteine-rich domain (CRD). At least eight different subtypes of metabotropic receptors (mGluR1-8) have been identified and further classified into three groups based on their sequence homology, pharmacological properties, and signaling pathways. Group 1 (mGluR1 and mGluR5) receptors are predominantly located postsynaptically on neurons and are involved in long-term synaptic plasticity in the brain, including long-term potentiation (LTP) in the hippocampus and long-term depression (LTD) in the cerebellum. They are coupled to G(q/11) proteins, thereby activating phospholipase C to generate inositol-1,4,5-trisphosphate (IP3) and diacylglycerol (DAG), which in turn lead to Ca2+ release and protein kinase C activation, respectively. Group 1 mGluR expression is shown to be strongly upregulated in animal models of epilepsy, brain injury, and inflammatory and neuropathic pain, as well as in patients with amyotrophic lateral sclerosis or multiple sclerosis. Group 2 (mGluR2 and mGluR3) and 3 (mGluR4, mGluR6, mGluR7, and mGluR8) receptors are predominantly localized presynaptically in the active region of neurotransmitter release. They are coupled to G(i/o) proteins, which leads to inhibition of adenylate cyclase activity and cAMP formation, and consequently to a decrease in protein kinase A (PKA) activity. Ultimately, activation of these receptors inhibits the release of neurotransmitters such as glutamate and GABA via inhibition of Ca2+ channels and activation of K+ channels. Furthermore, while activation of Group 1 mGluRs increases NMDA (N-methyl-D-aspartate) receptor activity and risk of neurotoxicity, Group 2 and 3 mGluRs decrease NMDA receptor activity and prevent neurotoxicity.	250
320413	cd15286	7tmC_mGluR_group3	metabotropic glutamate receptors in group 3, member of the class C family of seven-transmembrane G protein-coupled receptors. The metabotropic glutamate receptors (mGluRs) in group 3 include mGluRs 4, 6, 7, and 8. They are homodimeric class C G-protein coupled receptors which are activated by glutamate, the major excitatory neurotransmitter of the CNS. mGluRs are involved in regulating neuronal excitability and synaptic transmission via intracellular activation of second messenger signaling pathways. While the ionotropic glutamate receptor subtypes (AMPA, NMDA, and kainate) mediate fast excitatory postsynaptic transmission, mGluRs are known to mediate slower excitatory postsynaptic responses and to be involved in synaptic plasticity in the mammalian brain. In addition to seven-transmembrane helices, the class C GPCRs are characterized by a large N-terminal extracellular Venus flytrap-like domain, which is composed of two adjacent lobes separated by a cleft which binds an endogenous ligand. Moreover, they exist as either homo- or heterodimers, which are essential for their function. For instance, mGluRs form homodimers via interactions between the N-terminal Venus flytrap domains and the intermolecular disulphide bonds between cysteine residues located in the cysteine-rich domain (CRD). At least eight different subtypes of metabotropic receptors (mGluR1-8) have been identified and further classified into three groups based on their sequence homology, pharmacological properties, and signaling pathways. Group 1 (mGluR1 and mGluR5) receptors are predominantly located postsynaptically on neurons and are involved in long-term synaptic plasticity in the brain, including long-term potentiation (LTP) in the hippocampus and long-term depression (LTD) in the cerebellum. They are coupled to G(q/11) proteins, thereby activating phospholipase C to generate inositol-1,4,5-trisphosphate (IP3) and diacylglycerol (DAG), which in turn lead to Ca2+ release and protein kinase C activation, respectively. Group 1 mGluR expression is shown to be strongly upregulated in animal models of epilepsy, brain injury, and inflammatory and neuropathic pain, as well as in patients with amyotrophic lateral sclerosis or multiple sclerosis. Group 2 (mGluR2 and mGluR3) and 3 (mGluR4, mGluR6, mGluR7, and mGluR8) receptors are predominantly localized presynaptically in the active region of neurotransmitter release. They are coupled to G(i/o) proteins, which leads to inhibition of adenylate cyclase activity and cAMP formation, and consequently to a decrease in protein kinase A (PKA) activity. Ultimately, activation of these receptors inhibits the release of neurotransmitters such as glutamate and GABA via inhibition of Ca2+ channels and activation of K+ channels. Furthermore, while activation of Group 1 mGluRs increases NMDA (N-methyl-D-aspartate) receptor activity and risk of neurotoxicity, Group 2 and 3 mGluRs decrease NMDA receptor activity and prevent neurotoxicity.	271
320414	cd15287	7tmC_TAS1R2a-like	type 1 taste receptor subtype 2a and similar proteins, member of the class C of seven-transmembrane G protein-coupled receptors. This group includes TAS1R2a and its similar proteins found in fish. They are members of the type I taste receptor (TAS1R) family that belongs to the class C of G protein-coupled receptors. The functional TAS1Rs are obligatory heterodimers built from three known members, TAS1R1-3. TAS1R1 combines with TAS1R3 to form an umami taste receptor, which is responsible for the perception of savory taste, such as the food additive mono-sodium glutamate (MSG); whereas the combination of TAS1R2-TAS1R3 forms a sweet-taste receptor for sugars and D-amino acids.  On the other hand, the type II taste receptors (TAS2Rs), which belong to the class A family of GPCRs, recognize bitter tasting compounds. In the case of sweet, for example, the TAS1R2-TAS1R3 heterodimer activates phospholipase C (PLC) via alpha-gustducin, a heterodimeric G protein that is involved in perception of sweet and bitter tastes. This activation leads to generation of inositol (1, 4, 5)-trisphosphate (IP3) and diacylglycerol (DAG), and consequently increases intracellular Ca2+ mobilization and activates a cation channel, TRPM5. In contrast to the TAS1R2-TAS1R3 heterodimer, TAS1R3 alone could activate adenylate cyclase leading to cAMP formation in the absence of alpha-gustducin. Each TAS1R contains a large extracellular Venus flytrap-like domain in the N-terminus, cysteine-rich domain (CRD) and seven-transmembrane (7TM) domain, which are characteristics of the class C GPCRs.  The Venus flytrap-like domain shares strong sequence homology to bacterial periplasmic binding proteins and possess the orthosteric amino acid and calcium binding sites for members of the class C, including CaSR, GABA-B1, GPRC6A, mGlu, and TAS1R receptors.	252
320415	cd15288	7tmC_TAS1R2	type 1 taste receptor subtype 2, member of the class C of seven-transmembrane G protein-coupled receptors. This group represents TAS1R2, which is a member of the type I taste receptor (TAS1R) family that belongs to the class C of G protein-coupled receptors. The functional TAS1Rs are obligatory heterodimers built from three known members, TAS1R1-3. TAS1R1 combines with TAS1R3 to form an umami taste receptor, which is responsible for the perception of savory taste, such as the food additive mono-sodium glutamate (MSG); whereas the combination of TAS1R2-TAS1R3 forms a sweet-taste receptor for sugars and D-amino acids.  On the other hand, the type II taste receptors (TAS2Rs), which belong to the class A family of GPCRs, recognize bitter tasting compounds. In the case of sweet, for example, the TAS1R2-TAS1R3 heterodimer activates phospholipase C (PLC) via alpha-gustducin, a heterodimeric G protein that is involved in perception of sweet and bitter tastes. This activation leads to generation of inositol (1, 4, 5)-trisphosphate (IP3) and diacylglycerol (DAG), and consequently increases intracellular Ca2+ mobilization and activates a cation channel, TRPM5. In contrast to the TAS1R2-TAS1R3 heterodimer, TAS1R3 alone could activate adenylate cyclase leading to cAMP formation in the absence of alpha-gustducin. Each TAS1R contains a large extracellular Venus flytrap-like domain in the N-terminus, cysteine-rich domain (CRD) and seven-transmembrane (7TM) domain, which are characteristics of the class C GPCRs.  The Venus flytrap-like domain shares strong sequence homology to bacterial periplasmic binding proteins and possess the orthosteric amino acid and calcium binding sites for members of the class C, including CaSR, GABA-B1, GPRC6A, mGlu, and TAS1R receptors.	254
320416	cd15289	7tmC_TAS1R1	type 1 taste receptor subtype 1, member of the class C of seven-transmembrane G protein-coupled receptors. This group represents TAS1R1, which is a member of the type I taste receptor (TAS1R) family that belongs to the class C of G protein-coupled receptors. The functional TAS1Rs are obligatory heterodimers built from three known members, TAS1R1-3. TAS1R1 combines with TAS1R3 to form an umami taste receptor, which is responsible for the perception of savory taste, such as the food additive mono-sodium glutamate (MSG); whereas the combination of TAS1R2-TAS1R3 forms a sweet-taste receptor for sugars and D-amino acids.  On the other hand, the type II taste receptors (TAS2Rs), which belong to the class A family of GPCRs, recognize bitter tasting compounds. In the case of sweet, for example, the TAS1R2-TAS1R3 heterodimer activates phospholipase C (PLC) via alpha-gustducin, a heterodimeric G protein that is involved in perception of sweet and bitter tastes. This activation leads to generation of inositol (1, 4, 5)-trisphosphate (IP3) and diacylglycerol (DAG), and consequently increases intracellular Ca2+ mobilization and activates a cation channel, TRPM5. In contrast to the TAS1R2-TAS1R3 heterodimer, TAS1R3 alone could activate adenylate cyclase leading to cAMP formation in the absence of alpha-gustducin. Each TAS1R contains a large extracellular Venus flytrap-like domain in the N-terminus, cysteine-rich domain (CRD) and seven-transmembrane (7TM) domain, which are characteristics of the class C GPCRs.  The Venus flytrap-like domain shares strong sequence homology to bacterial periplasmic binding proteins and possess the orthosteric amino acid and calcium binding sites for members of the class C, including CaSR, GABA-B1, GPRC6A, mGlu, and TAS1R receptors.	253
320417	cd15290	7tmC_TAS1R3	type 1 taste receptor subtype 3, member of the class C of seven-transmembrane G protein-coupled receptors. This group represents TAS1R3, which is a member of the type I taste receptor (TAS1R) family that belongs to the class C of G protein-coupled receptors. The functional TAS1Rs are obligatory heterodimers built from three known members, TAS1R1-3. TAS1R1 combines with TAS1R3 to form an umami taste receptor, which is responsible for the perception of savory taste, such as the food additive mono-sodium glutamate (MSG); whereas the combination of TAS1R2-TAS1R3 forms a sweet-taste receptor for sugars and D-amino acids.  On the other hand, the type II taste receptors (TAS2Rs), which belong to the class A family of GPCRs, recognize bitter tasting compounds. In the case of sweet, for example, the TAS1R2-TAS1R3 heterodimer activates phospholipase C (PLC) via alpha-gustducin, a heterodimeric G protein that is involved in perception of sweet and bitter tastes. This activation leads to generation of inositol (1, 4, 5)-trisphosphate (IP3) and diacylglycerol (DAG), and consequently increases intracellular Ca2+ mobilization and activates a cation channel, TRPM5. In contrast to the TAS1R2-TAS1R3 heterodimer, TAS1R3 alone could activate adenylate cyclase leading to cAMP formation in the absence of alpha-gustducin. Each TAS1R contains a large extracellular Venus flytrap-like domain in the N-terminus, cysteine-rich domain (CRD) and seven-transmembrane (7TM) domain, which are characteristics of the class C GPCRs.  The Venus flytrap-like domain shares strong sequence homology to bacterial periplasmic binding proteins and possess the orthosteric amino acid and calcium binding sites for members of the class C, including CaSR, GABA-B1, GPRC6A, mGlu, and TAS1R receptors.	253
320418	cd15291	7tmC_GABA-B-R1	gamma-aminobutyric acid type B receptor subunit 1, member of the class C family of seven-transmembrane G protein-coupled receptors. The type B receptor for gamma-aminobutyric acid, GABA-B, is activated by its endogenous ligand GABA, the principal inhibitory neurotransmitter. The functional GABA-B receptor is an obligatory heterodimer composed of two related subunits, GABA-B1, which is primarily involved in GABA ligand binding, and GABA-B2, which is responsible for both G-protein coupling and trafficking of the heterodimer to the plasma membrane. Activation of GABA-B couples to G(i/o)-type G proteins, which in turn modulate three major downstream effectors: adenylate cyclase, voltage-sensitive Ca2+ channels, and inwardly-rectifying K+ channels. Consequently, the GABA-B receptor produces slow and sustained inhibitory responses by decreasing neurotransmitter release via inhibition of Ca2+ channels and by postsynaptic hyperpolarization via the activation of K+ channels through the G-protein beta-gamma dimer. GABA-B is expressed in both pre- and postsynaptic sites of glutamatergic and GABAergic neurons in the brain where it regulates synaptic activity. Thus, the GABA-B receptor agonist, baclofen, is used to treat muscle tightness and cramping caused by spasticity in multiple sclerosis patients. Moreover, GABA-B antagonists improve cognitive performance in mammals, while GABA-B agonists suppress cognitive behavior. In most of the class C family members, the extracellular Venus-flytrap domain in the N-terminus is connected to the seven-transmembrane (7TM) domain via a cysteine-rich domain (CRD). However, in the GABA-B receptor, the CRD is absent in both subunits and the Venus-flytrap ligand-binding domain is directly connected to the 7TM domain via a 10-15 amino acid linker, suggesting that the GABA-B receptor may utilize a different activation mechanism.	274
320419	cd15292	7tmC_GPR156	orphan GPR156, member of the class C family of seven-transmembrane G protein-coupled receptors. This subgroup represents orphan GPR156 that is closely related to the type B receptor for gamma-aminobutyric acid (GABA-B), which is activated by its endogenous ligand GABA, the principal inhibitory neurotransmitter. The functional GABA-B receptor is an obligatory heterodimer composed of two related subunits, GABA-B1, which is primarily involved in GABA ligand binding, and GABA-B2, which is responsible for both G-protein coupling and trafficking of the heterodimer to the plasma membrane. Activation of GABA-B couples to G(i/o)-type G proteins, which in turn modulate three major downstream effectors: adenylate cyclase, voltage-sensitive Ca2+ channels, and inwardly-rectifying K+ channels. Consequently, the GABA-B receptor produces slow and sustained inhibitory responses by decreasing neurotransmitter release via inhibition of Ca2+ channels and by postsynaptic hyperpolarization via the activation of K+ channels through the G-protein beta-gamma dimer. GABA-B is expressed in both pre- and postsynaptic sites of glutamatergic and GABAergic neurons in the brain where it regulates synaptic activity. Thus, the GABA-B receptor agonist, baclofen, is used to treat muscle tightness and cramping caused by spasticity in multiple sclerosis patients. Moreover, GABA-B antagonists improve cognitive performance in mammals, while GABA-B agonists suppress cognitive behavior. In most of the class C family members, the extracellular Venus-flytrap domain in the N-terminus is connected to the seven-transmembrane (7TM) domain via a cysteine-rich domain (CRD). However, in the GABA-B receptor, the CRD is absent in both subunits and the Venus-flytrap ligand-binding domain is directly connected to the 7TM domain via a 10-15 amino acid linker, suggesting that the GABA-B receptor may utilize a different activation mechanism.	268
320420	cd15293	7tmC_GPR158-like	orphan GPR158 and similar proteins, member of the class C family of seven-transmembrane G protein-coupled receptors. This group includes orphan receptors GPR158, GPR158-like (also called GPR179) and similar proteins. These orphan receptors are closely related to the type B receptor for gamma-aminobutyric acid (GABA-B), which is activated by its endogenous ligand GABA, the principal inhibitory neurotransmitter. The functional GABA-B receptor is an obligatory heterodimer composed of two related subunits, GABA-B1, which is primarily involved in GABA ligand binding, and GABA-B2, which is responsible for both G-protein coupling and trafficking of the heterodimer to the plasma membrane. Activation of GABA-B couples to G(i/o)-type G proteins, which in turn modulate three major downstream effectors: adenylate cyclase, voltage-sensitive Ca2+ channels, and inwardly-rectifying K+ channels. Consequently, the GABA-B receptor produces slow and sustained inhibitory responses by decreasing neurotransmitter release via inhibition of Ca2+ channels and by postsynaptic hyperpolarization via the activation of K+ channels through the G-protein beta-gamma dimer. GABA-B is expressed in both pre- and postsynaptic sites of glutamatergic and GABAergic neurons in the brain where it regulates synaptic activity. Thus, the GABA-B receptor agonist, baclofen, is used to treat muscle tightness and cramping caused by spasticity in multiple sclerosis patients. Moreover, GABA-B antagonists improve cognitive performance in mammals, while GABA-B agonists suppress cognitive behavior. In most of the class C family members, the extracellular Venus-flytrap domain in the N-terminus is connected to the seven-transmembrane (7TM) domain via a cysteine-rich domain (CRD). However, in the GABA-B receptor, the CRD is absent in both subunits and the Venus-flytrap ligand-binding domain is directly connected to the 7TM domain via a 10-15 amino acid linker, suggesting that the GABA-B receptor may utilize a different activation mechanism.	252
320421	cd15294	7tmC_GABA-B-R2	gamma-aminobutyric acid type B receptor subunit 2, member of the class C family of seven-transmembrane G protein-coupled receptors. The type B receptor for gamma-aminobutyric acid, GABA-B, is activated by its endogenous ligand GABA, the principal inhibitory neurotransmitter. The functional GABA-B receptor is an obligatory heterodimer composed of two related subunits, GABA-B1, which is primarily involved in GABA ligand binding, and GABA-B2, which is responsible for both G-protein coupling and trafficking of the heterodimer to the plasma membrane. Activation of GABA-B couples to G(i/o)-type G proteins, which in turn modulate three major downstream effectors: adenylate cyclase, voltage-sensitive Ca2+ channels, and inwardly-rectifying K+ channels. Consequently, the GABA-B receptor produces slow and sustained inhibitory responses by decreasing neurotransmitter release via inhibition of Ca2+ channels and by postsynaptic hyperpolarization via the activation of K+ channels through the G-protein beta-gamma dimer. GABA-B is expressed in both pre- and postsynaptic sites of glutamatergic and GABAergic neurons in the brain where it regulates synaptic activity. Thus, the GABA-B receptor agonist, baclofen, is used to treat muscle tightness and cramping caused by spasticity in multiple sclerosis patients. Moreover, GABA-B antagonists improve cognitive performance in mammals, while GABA-B agonists suppress cognitive behavior. In most of the class C family members, the extracellular Venus-flytrap domain in the N-terminus is connected to the seven-transmembrane (7TM) domain via a cysteine-rich domain (CRD). However, in the GABA-B receptor, the CRD is absent in both subunits and the Venus-flytrap ligand-binding domain is directly connected to the 7TM domain via a 10-15 amino acid linker, suggesting that the GABA-B receptor may utilize a different activation mechanism.	270
320422	cd15295	7tmA_Histamine_H4R	histamine receptor subtype H4R, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes histamine subtype H4R, a member of the histamine receptor family, which belongs to the class A of GPCRs. Histamine plays a key role as chemical mediator and neurotransmitter in various physiological and pathophysiological processes in the central and peripheral nervous system. Histamine exerts its functions by binding to four different G protein-coupled receptors (H1-H4). The H3 and H4 receptors couple to G(i) proteins, leading to the inhibition of cAMP formation. The H3R functions as a presynaptic autoreceptor controlling histamine release and synthesis. The H4R plays an important role in histamine-mediated chemotaxis in mast cells and eosinophils. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	267
320423	cd15296	7tmA_Histamine_H3R	histamine receptor subtypes H3R and H3R-like, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes histamine subtypes H3R and H3R-like, members of the histamine receptor family, which belongs to the class A of GPCRs. Histamine plays a key role as chemical mediator and neurotransmitter in various physiological and pathophysiological processes in the central and peripheral nervous system. Histamine exerts its functions by binding to four different G protein-coupled receptors (H1-H4). The H3 and H4 receptors couple to G(i) proteins, leading to the inhibition of cAMP formation. The H3R functions as a presynaptic autoreceptor controlling histamine release and synthesis. The H4R plays an important role in histamine-mediated chemotaxis in mast cells and eosinophils. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	271
320424	cd15297	7tmA_mAChR_M2	muscarinic acetylcholine receptor subtype M2, member of the class A family of seven-transmembrane G protein-coupled receptors. Muscarinic acetylcholine receptors (mAChRs) regulate the activity of many fundamental central and peripheral functions. The mAChR family consists of 5 subtypes M1-M5, which can be further divided into two major groups according to their G-protein coupling preference. The M1, M3 and M5 receptors selectively interact with G proteins of the G(q/11) family, whereas the M2 and M4 receptors preferentially link to the G(i/o) types of  G proteins. Activation of M2 receptor causes a decrease in cAMP production, generally leading to inhibitory-type effects. This causes an outward current of potassium in the heart, resulting in a decreased heart rate. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	262
341344	cd15298	7tmA_mAChR_M4	muscarinic acetylcholine receptor subtype M4, member of the class A family of seven-transmembrane G protein-coupled receptors. Muscarinic acetylcholine receptors (mAChRs) regulate the activity of many fundamental central and peripheral functions. The mAChR family consists of 5 subtypes M1-M5, which can be further divided into two major groups according to their G-protein coupling preference. The M1, M3 and M5 receptors selectively interact with G proteins of the G(q/11) family, whereas the M2 and M4 receptors preferentially link to the G(i/o) types of G proteins. The M4 receptor is mainly found in the CNS and functions as an inhibitory autoreceptor regulating acetylcholine release. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	262
320426	cd15299	7tmA_mAChR_M3	muscarinic acetylcholine receptor subtype M3, member of the class A family of seven-transmembrane G protein-coupled receptors. Muscarinic acetylcholine receptors (mAChRs) regulate the activity of many fundamental central and peripheral functions. The mAChR family consists of 5 subtypes M1-M5, which can be further divided into two major groups according to their G-protein coupling preference. The M1, M3 and M5 receptors selectively interact with G proteins of the G(q/11) family, whereas the M2 and M4 receptors preferentially link to the G(i/o) types of  G proteins. The M3 receptor is mainly located in smooth muscle, exocrine glands and vascular endothelium. It induces vomiting in the central nervous system and is a critical regulator of glucose homeostasis by modulating insulin secretion. Generally, M3 receptor causes contraction of smooth muscle resulting in vasoconstriction and increased glandular secretion. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	274
320427	cd15300	7tmA_mAChR_M5	muscarinic acetylcholine receptor subtype M5, member of the class A family of seven-transmembrane G protein-coupled receptors. Muscarinic acetylcholine receptors (mAChRs) regulate the activity of many fundamental central and peripheral functions. The mAChR family consists of 5 subtypes M1-M5, which can be further divided into two major groups according to their G-protein coupling preference. The M1, M3 and M5 receptors selectively interact with G proteins of the G(q/11) family, whereas the M2 and M4 receptors preferentially link to the G(i/o) types of  G proteins. M5 mAChR is primarily found in the central nervous system and mediates acetylcholine-induced dilation of cerebral blood vessels. Activation of M5 receptor triggers a variety of cellular responses, including inhibition of adenylate cyclase, breakdown of phosphoinositides, and modulation of potassium channels. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	262
320428	cd15301	7tmA_mAChR_DM1-like	muscarinic acetylcholine receptor DM1, member of the class A family of seven-transmembrane G protein-coupled receptors. This subgroup includes muscarinic acetylcholine receptor DM1-like from invertebrates. Muscarinic acetylcholine receptors (mAChRs) regulate the activity of many fundamental central and peripheral functions. The mAChR family consists of 5 subtypes M1-M5, which can be further divided into two major groups according to their G-protein coupling preference. The M1, M3 and M5 receptors selectively interact with G proteins of the G(q/11) family, whereas the M2 and M4 receptors preferentially link to the G(i/o) types of  G proteins. Activation of mAChRs by agonist (acetylcholine) leads to a variety of biochemical and electrophysiological responses. In general, the exact nature of these responses and the subsequent physiological effects mainly depend on the molecular and pharmacological identity of the activated receptor subtype(s). All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	270
320429	cd15302	7tmA_mAChR_GAR-2-like	muscarinic acetylcholine receptor GAR-2 and similar proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. Muscarinic acetylcholine receptors (mAChRs) regulate the activity of many fundamental central and peripheral functions. The mAChR family consists of 5 subtypes M1-M5, which can be further divided into two major groups according to their G-protein coupling preference. The M1, M3 and M5 receptors selectively interact with G proteins of the G(q/11) family, whereas the M2 and M4 receptors preferentially link to the G(i/o) types of  G proteins. Activation of mAChRs by agonist (acetylcholine) leads to a variety of biochemical and electrophysiological responses. In general, the exact nature of these responses and the subsequent physiological effects mainly depend on the molecular and pharmacological identity of the activated receptor subtype(s). All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	266
341345	cd15304	7tmA_5-HT2A	serotonin receptor subtype 2A, member of the class A family of seven-transmembrane G protein-coupled receptors. The 5-HT2 receptors are a subfamily of serotonin receptors that bind the neurotransmitter serotonin (5HT; 5-hydroxytryptamine) in the central nervous system (CNS). The 5-HT2 subfamily is composed of three subtypes that mediate excitatory neurotransmission: 5-HT2A, 5-HT2B, and 5-HT2C. They are selectively linked to G proteins of the G(q/11) family and activate phospholipase C, which leads to activation of protein kinase C and calcium release. In the CNS, serotonin is involved in the regulation of appetite, mood, sleep, cognition, learning and memory, as well as implicated in diseases such as migraine, schizophrenia, and depression. Indeed, 5-HT2 receptors are attractive targets for a variety of psychoactive drugs, ranging from atypical antipsychotic drugs, antidepressants, and anxiolytics, which have an antagonistic action on 5-HT2 receptors, to hallucinogens, which act as agonists at postsynaptic 5-HT2 receptors. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	267
341346	cd15305	7tmA_5-HT2C	serotonin receptor subtype 2C, member of the class A family of seven-transmembrane G protein-coupled receptors. The 5-HT2 receptors are a subfamily of serotonin receptors that bind the neurotransmitter serotonin (5HT; 5-hydroxytryptamine) in the central nervous system (CNS). The 5-HT2 subfamily is composed of three subtypes that mediate excitatory neurotransmission: 5-HT2A, 5-HT2B, and 5-HT2C. They are selectively linked to G proteins of the G(q/11) family and activate phospholipase C, which leads to activation of protein kinase C and calcium release. In the CNS, serotonin is involved in the regulation of appetite, mood, sleep, cognition, learning and memory, as well as implicated in diseases such as migraine, schizophrenia, and depression. Indeed, 5-HT2 receptors are attractive targets for a variety of psychoactive drugs, ranging from atypical antipsychotic drugs, antidepressants, and anxiolytics, which have an antagonistic action on 5-HT2 receptors, to hallucinogens, which act as agonists at postsynaptic 5-HT2 receptors. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	275
341347	cd15306	7tmA_5-HT2B	serotonin receptor subtype 2B, member of the class A family of seven-transmembrane G protein-coupled receptors. The 5-HT2 receptors are a subfamily of serotonin receptors that bind the neurotransmitter serotonin (5HT; 5-hydroxytryptamine) in the central nervous system (CNS). The 5-HT2 subfamily is composed of three subtypes that mediate excitatory neurotransmission: 5-HT2A, 5-HT2B, and 5-HT2C. They are selectively linked to G proteins of the G(q/11) family and activate phospholipase C, which leads to activation of protein kinase C and calcium release. In the CNS, serotonin is involved in the regulation of appetite, mood, sleep, cognition, learning and memory, as well as implicated in diseases such as migraine, schizophrenia, and depression. Indeed, 5-HT2 receptors are attractive targets for a variety of psychoactive drugs, ranging from atypical antipsychotic drugs, antidepressants, and anxiolytics, which have an antagonistic action on 5-HT2 receptors, to hallucinogens, which act as agonists at postsynaptic 5-HT2 receptors. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	277
320433	cd15307	7tmA_5-HT2_insect-like	serotonin receptor subtype 2 from insects, member of the class A family of seven-transmembrane G protein-coupled receptors. The 5-HT2 receptors are a subfamily of serotonin receptors that bind the neurotransmitter serotonin (5HT; 5-hydroxytryptamine) in the central nervous system (CNS). The 5-HT2 subfamily is composed of three subtypes that mediate excitatory neurotransmission: 5-HT2A, 5-HT2B, and 5-HT2C. They are selectively linked to G proteins of the G(q/11) family and activate phospholipase C, which leads to activation of protein kinase C and calcium release. In the CNS, serotonin is involved in the regulation of appetite, mood, sleep, cognition, learning and memory, as well as implicated in diseases such as migraine, schizophrenia, and depression. Indeed, 5-HT2 receptors are attractive targets for a variety of psychoactive drugs, ranging from atypical antipsychotic drugs, antidepressants, and anxiolytics, which have an antagonistic action on 5-HT2 receptors, to hallucinogens, which act as agonists at postsynaptic 5-HT2 receptors. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	279
320434	cd15308	7tmA_D4_dopamine_R	D4 dopamine receptor of the D2-like family, member of the class A family of seven-transmembrane G protein-coupled receptors. Dopamine receptors are members of the class A G protein-coupled receptors that are involved in many neurological processes in the central nervous system (CNS). The neurotransmitter dopamine is the primary endogenous agonist for dopamine receptors. Dopamine receptors consist of at least five subtypes: D1, D2, D3, D4, and D5.  The D1 and D5 subtypes are members of the D1-like family of dopamine receptors, whereas the D2, D3 and D4 subtypes are members of the D2-like family. Activation of D2-like family receptors is linked to G proteins of the G(i) family. This leads to a decrease in adenylate cyclase activity, thereby decreasing cAMP levels. Dopamine receptors are major therapeutic targets for neurological and psychiatric disorders such as drug abuse, depression, schizophrenia, or Parkinson's disease.	258
320435	cd15309	7tmA_D2_dopamine_R	D2 subtype of the D2-like family of dopamine receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Dopamine receptors are members of the class A G protein-coupled receptors that are involved in many neurological processes in the central nervous system (CNS). The neurotransmitter dopamine is the primary endogenous agonist for dopamine receptors. Dopamine receptors consist of at least five subtypes: D1, D2, D3, D4, and D5.  The D1 and D5 subtypes are members of the D1-like family of dopamine receptors, whereas the D2, D3 and D4 subtypes are members of the D2-like family. Activation of D2-like family receptors is linked to G proteins of the G(i) family. This leads to a decrease in adenylate cyclase activity, thereby decreasing cAMP levels. Dopamine receptors are major therapeutic targets for neurological and psychiatric disorders such as drug abuse, depression, schizophrenia, or Parkinson's disease.	254
320436	cd15310	7tmA_D3_dopamine_R	D3 subtype of the D2-like family of dopamine receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Dopamine receptors are members of the class A G protein-coupled receptors that are involved in many neurological processes in the central nervous system (CNS). The neurotransmitter dopamine is the primary endogenous agonist for dopamine receptors. Dopamine receptors consist of at least five subtypes: D1, D2, D3, D4, and D5.  The D1 and D5 subtypes are members of the D1-like family of dopamine receptors, whereas the D2, D3 and D4 subtypes are members of the D2-like family. Activation of D2-like family receptors is linked to G proteins of the G(i) family. This leads to a decrease in adenylate cyclase activity, thereby decreasing cAMP levels. Dopamine receptors are major therapeutic targets for neurological and psychiatric disorders such as drug abuse, depression, schizophrenia, or Parkinson's disease.	259
320437	cd15312	7tmA_TAAR2_3_4	trace amine-associated receptors 2, 3, 4, and similar receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. TAAR2, TAAR3, and TAAR4 are among the 15 identified trace amine-associated receptor subtypes, which form a distinct subfamily within the class A G protein-coupled receptor family. Trace amines are endogenous amines of unknown function that have strong structural and metabolic similarity to classical monoamine neurotransmitters (serotonin, noradrenaline, adrenaline, dopamine, and histamine), which play critical roles in human and animal physiological activities such as cognition, consciousness, mood, motivation, perception, and autonomic responses. However, trace amines are found in the mammalian brain at very low concentrations compared to classical monoamines. Trace amines, including p-tyramine, beta-phenylethylamine, and tryptamine, are also thought to act as chemical messengers to exert their biological effects in vertebrates. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	289
320438	cd15314	7tmA_TAAR1	trace amine-associated receptor 1 and similar receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. The trace amine-associated receptor 1 (TAAR1) is one of the 15 identified trace amine-associated receptor subtypes, which form a distinct subfamily within the class A G protein-coupled receptor family. Trace amines are endogenous amines of unknown function that have strong structural and metabolic similarity to classical monoamine neurotransmitters (serotonin, noradrenaline, adrenaline, dopamine, and histamine), which play critical roles in human and animal physiological activities such as cognition, consciousness, mood, motivation, perception, and autonomic responses. However, trace amines are found in the mammalian brain at very low concentrations compared to classical monoamines. TAAR1 is coupled to the Gs protein, which leads to activation of adenylate cyclase, and is thought to play a functional role in the regulation of brain monoamines. TAAR1 is also shown to be activated by psychoactive compounds such as Ecstasy (MDMA), amphetamine and LSD. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	282
320439	cd15316	7tmA_TAAR6_8_9	trace amine-associated receptors 6, 8, and 9, member of the class A family of seven-transmembrane G protein-coupled receptors. Included in this group are mammalian TAAR6, TAAR8, TAAR9, and similar proteins. They are among the 15 identified amine-associated receptors (TAARs), a distinct subfamily within the class A G protein-coupled receptors. Trace amines are endogenous amines of unknown function that have strong structural and metabolic similarity to classical monoamine neurotransmitters (serotonin, noradrenaline, adrenaline, dopamine, and histamine), which play critical roles in human and animal physiological activities such as cognition, consciousness, mood, motivation, perception, and autonomic responses. However, trace amines are found in the mammalian brain at very low concentrations compared to classical monoamines. Trace amines, including p-tyramine, beta-phenylethylamine, and tryptamine, are also thought to act as chemical messengers to exert their biological effects in vertebrates. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	290
320440	cd15317	7tmA_TAAR5-like	trace amine-associated receptor 5 and similar receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Included in this group are mammalian TAAR5, TAAR6, TAAR8, TAAR9, and similar proteins. They are among the 15 identified trace amine-associated receptors (TAARs), a distinct subfamily within the class A G protein-coupled receptors. Trace amines are endogenous amines of unknown function that have strong structural and metabolic similarity to classical monoamine neurotransmitters (serotonin, noradrenaline, adrenaline, dopamine, and histamine), which play critical roles in human and animal physiological activities such as cognition, consciousness, mood, motivation, perception, and autonomic responses. However, trace amines are found in the mammalian brain at very low concentrations compared to classical monoamines. Trace amines, including p-tyramine, beta-phenylethylamine, and tryptamine, are also thought to act as chemical messengers to exert their biological effects in vertebrates. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	290
320441	cd15318	7tmA_TAAR5	trace amine-associated receptor 5, member of the class A family of seven-transmembrane G protein-coupled receptors. The trace amine-associated receptor 5 is one of the 15 identified amine-activated G protein-coupled receptors (TAARs), a distinct subfamily within the class A G protein-coupled receptors. Trace amines are endogenous amines of unknown function that have strong structural and metabolic similarity to classical monoamine neurotransmitters (serotonin, noradrenaline, adrenaline, dopamine, and histamine), which play critical roles in human and animal physiological activities such as cognition, consciousness, mood, motivation, perception, and autonomic responses. However, trace amines are found in the mammalian brain at very low concentrations compared to classical monoamines. Trace amines, including p-tyramine, beta-phenylethylamine, and tryptamine, are also thought to act as chemical messengers to exert their biological effects in vertebrates. All GPCRs have a common structural architecture comprising of seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	282
320442	cd15319	7tmA_D1B_dopamine_R	D1B (or D5) subtype dopamine receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. Dopamine receptors are members of the class A G protein-coupled receptors that are involved in many neurological processes in the central nervous system (CNS). The neurotransmitter dopamine is the primary endogenous agonist for dopamine receptors. Dopamine receptors consist of at least five subtypes: D1, D2, D3, D4, and D5.  The D1 and D5 subtypes are members of the D1-like family of dopamine receptors, whereas the D2, D3 and D4 subtypes are members of the D2-like family. The D1-like family receptors are coupled to G proteins of the G(s) family, which activate adenylate cyclase, causing cAMP formation and activation of protein kinase A. Dopamine receptors are major therapeutic targets for neurological and psychiatric disorders such as drug abuse, depression, schizophrenia, or Parkinson's disease.	317
320443	cd15320	7tmA_D1A_dopamine_R	D1A (or D1) subtype dopamine receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. Dopamine receptors are members of the class A G protein-coupled receptors that are involved in many neurological processes in the central nervous system (CNS). The neurotransmitter dopamine is the primary endogenous agonist for dopamine receptors. Dopamine receptors consist of at least five subtypes: D1, D2, D3, D4, and D5.  The D1 and D5 subtypes are members of the D1-like family of dopamine receptors, whereas the D2, D3 and D4 subtypes are members of the D2-like family. The D1-like family receptors are coupled to G proteins of the G(s) family, which activate adenylate cyclase, causing cAMP formation and activation of protein kinase A. Dopamine receptors are major therapeutic targets for neurological and psychiatric disorders such as drug abuse, depression, schizophrenia, or Parkinson's disease.	319
320444	cd15321	7tmA_alpha2B_AR	alpha-2 adrenergic receptors subtype B, member of the class A family of seven-transmembrane G protein-coupled receptors. The alpha-2 adrenergic receptors (or adrenoceptors) are a subfamily of the class A rhodopsin-like GPCRs that share a common architecture of seven transmembrane helices. This subfamily consists of three highly homologous receptor subtypes that have a key role in neurotransmitter release: alpha-2A, alpha-2B, and alpha-2C. In addition, a fourth subtype, alpha-2D is present in ray-finned fishes and amphibians, but is not found in humans. The alpha-2 receptors are found in both central and peripheral nervous system and serve to produce inhibitory functions through the G(i) proteins. Thus, the alpha-2 receptors inhibit adenylate cyclase, which decreases cAMP production and thereby decreases calcium influx during the action potential. Consequently, lowered levels of calcium will lead to a decrease in neurotransmitter release by negative feedback.	268
320445	cd15322	7tmA_alpha2A_AR	alpha-2 adrenergic receptors subtype A, member of the class A family of seven-transmembrane G protein-coupled receptors. The alpha-2 adrenergic receptors (or adrenoceptors) are a subfamily of the class A rhodopsin-like GPCRs that share a common architecture of seven transmembrane helices. This subfamily consists of three highly homologous receptor subtypes that have a key role in neurotransmitter release: alpha-2A, alpha-2B, and alpha-2C. In addition, a fourth subtype, alpha-2D is present in ray-finned fishes and amphibians, but is not found in humans. The alpha-2 receptors are found in both central and peripheral nervous system and serve to produce inhibitory functions through the G(i) proteins. Thus, the alpha-2 receptors inhibit adenylate cyclase, which decreases cAMP production and thereby decreases calcium influx during the action potential. Consequently, lowered levels of calcium will lead to a decrease in neurotransmitter release by negative feedback.	259
320446	cd15323	7tmA_alpha2C_AR	alpha-2 adrenergic receptors subtype C, member of the class A family of seven-transmembrane G protein-coupled receptors. The alpha-2 adrenergic receptors (or adrenoceptors) are a subfamily of the class A rhodopsin-like GPCRs that share a common architecture of seven transmembrane helices. This subfamily consists of three highly homologous receptor subtypes that have a key role in neurotransmitter release: alpha-2A, alpha-2B, and alpha-2C. In addition, a fourth subtype, alpha-2D is present in ray-finned fishes and amphibians, but is not found in humans. The alpha-2 receptors are found in both central and peripheral nervous system and serve to produce inhibitory functions through the G(i) proteins. Thus, the alpha-2 receptors inhibit adenylate cyclase, which decreases cAMP production and thereby decreases calcium influx during the action potential. Consequently, lowered levels of calcium will lead to a decrease in neurotransmitter release by negative feedback.	261
320447	cd15324	7tmA_alpha-2D_AR	alpha-2 adrenergic receptors subtype D, member of the class A family of seven-transmembrane G protein-coupled receptors. The alpha-2 adrenergic receptors (or adrenoceptors) are a subfamily of the class A rhodopsin-like GPCRs that share a common architecture of seven transmembrane helices. This subfamily consists of three highly homologous receptor subtypes that have a key role in neurotransmitter release: alpha-2A, alpha-2B, and alpha-2C. In addition, a fourth subtype, alpha-2D is present in ray-finned fishes and amphibians, but is not found in humans. The alpha-2 receptors are found in both central and peripheral nervous system and serve to produce inhibitory functions through the G(i) proteins. Thus, the alpha-2 receptors inhibit adenylate cyclase, which decreases cAMP production and thereby decreases calcium influx during the action potential. Consequently, lowered levels of calcium will lead to a decrease in neurotransmitter release by negative feedback.	256
320448	cd15325	7tmA_alpha1A_AR	alpha-1 adrenergic receptors subtype A, member of the class A family of seven-transmembrane G protein-coupled receptors. The alpha-1 adrenergic receptors (or adrenoceptors) are a subfamily of the class A rhodopsin-like GPCRs that share a common architecture of seven transmembrane helices. This subfamily consists of three highly homologous receptor subtypes that primarily mediate smooth muscle contraction: alpha-1A, alpha-1B, and alpha-1D. Activation of alpha-1 receptors by catecholamines such as norepinephrine and epinephrine couples to the G(q) protein, which then activates the phospholipase C pathway, leading to an increase in IP3 and calcium. Consequently, the elevation of intracellular calcium concentration leads to vasoconstriction in smooth muscle of blood vessels. In addition, activation of alpha-1 receptors by phenylpropanolamine (PPA) produces anorexia and may induce appetite suppression in rats.	261
320449	cd15326	7tmA_alpha1B_AR	alpha-1 adrenergic receptors subtype B, member of the class A family of seven-transmembrane G protein-coupled receptors. The alpha-1 adrenergic receptors (or adrenoceptors) are a subfamily of the class A rhodopsin-like GPCRs that share a common architecture of seven transmembrane helices. This subfamily consists of three highly homologous receptor subtypes that primarily mediate smooth muscle contraction: alpha-1A, alpha-1B, and alpha-1D. Activation of alpha-1 receptors by catecholamines such as norepinephrine and epinephrine couples to the G(q) protein, which then activates the phospholipase C pathway, leading to an increase in IP3 and calcium. Consequently, the elevation of intracellular calcium concentration leads to vasoconstriction in smooth muscle of blood vessels. In addition, activation of alpha-1 receptors by phenylpropanolamine (PPA) produces anorexia and may induce appetite suppression in rats.	261
320450	cd15327	7tmA_alpha1D_AR	alpha-1 adrenergic receptors subtype D, member of the class A family of seven-transmembrane G protein-coupled receptors. The alpha-1 adrenergic receptors (or adrenoceptors) are a subfamily of the class A rhodopsin-like GPCRs that share a common architecture of seven transmembrane helices. This subfamily consists of three highly homologous receptor subtypes that primarily mediate smooth muscle contraction: alpha-1A, alpha-1B, and alpha-1D. Activation of alpha-1 receptors by catecholamines such as norepinephrine and epinephrine couples to the G(q) protein, which then activates the phospholipase C pathway, leading to an increase in IP3 and calcium. Consequently, the elevation of intracellular calcium concentration leads to vasoconstriction in smooth muscle of blood vessels. In addition, activation of alpha-1 receptors by phenylpropanolamine (PPA) produces anorexia and may induce appetite suppression in rats.	261
320451	cd15328	7tmA_5-HT5	serotonin receptor subtype 5, member of the class A family of seven-transmembrane G protein-coupled receptors. 5-HT5 receptor, one of 14 mammalian 5-HT receptors, is activated by the neurotransmitter and peripheral signal mediator serotonin (also known as 5-hydroxytryptamine or 5-HT). The 5-HT5A and 5-HT5B receptors have been cloned from rat and mouse, but only the 5-HT5A isoform has been identified in human because of the presence of premature stop codons in the human 5-HT5B gene, which prevents a functional receptor from being expressed.  5-HT5 receptors mediate inhibitory neurotransmission by coupling to G proteins of the G(i/o) family, which lead to a decrease in adenylate cyclase activity, thereby decreasing intracellular cAMP levels and calcium influx. In the CNS, serotonin is involved in the regulation of appetite, mood, sleep, cognition, learning and memory, as well as implicated in neurologic disorders such as migraine, schizophrenia, and depression.	259
320452	cd15329	7tmA_5-HT7	serotonin receptor subtype 7, member of the class A family of seven-transmembrane G protein-coupled receptors. The 5-HT7 receptor, one of 14 mammalian serotonin receptors, is a member of the class A of GPCRs and is activated by the neurotransmitter serotonin (5-hydroxytryptamine, 5-HT).  5-HT7 receptor mainly couples to Gs protein, which positively stimulates adenylate cyclase, leading to increased intracellular cAMP formation and calcium influx. 5-HT7 receptor is expressed in various human tissues, mainly in the brain, the lower gastrointestinal tract and in vital blood vessels including the coronary artery. In the CNS, serotonin is involved in the regulation of appetite, mood, sleep, cognition, learning and memory, as well as implicated in neurologic disorders such as migraine, schizophrenia, and depression.	260
320453	cd15330	7tmA_5-HT1A_vertebrates	serotonin receptor subtype 1A from vertebrates, member of the class A family of seven-transmembrane G protein-coupled receptors. The 5-HT1 receptors, one of 14 mammalian 5-HT receptors, is a member of the class A of GPCRs and is activated by the endogenous neurotransmitter and peripheral signal mediator serotonin (5-hydroxytryptamine, 5-HT). The 5-HT1 receptors mediate inhibitory neurotransmission by coupling to G proteins of the G(i/o) family, which lead to a decrease in adenylate cyclase activity, thereby decreasing intracellular cAMP levels and calcium influx. The 5-HT1 receptor subfamily includes 5 subtypes: 5-HT1A, 5-HT1B, 5-HT1D, 5-HT1E, and 5-HT1F.  There is no 5-HT1C receptor subtype, as it has been reclassified as the 5-HT2C receptor. In the CNS, serotonin is involved in the regulation of appetite, mood, sleep, cognition, learning and memory, as well as implicated in neurologic disorders such as migraine, schizophrenia, and depression.	260
320454	cd15331	7tmA_5-HT1A_invertebrates	serotonin receptor subtype 1A from invertebrates, member of the class A family of seven-transmembrane G protein-coupled receptors. The 5-HT1 receptors, one of 14 mammalian 5-HT receptors, is a member of the class A of GPCRs and is activated by the endogenous neurotransmitter and peripheral signal mediator serotonin (5-hydroxytryptamine, 5-HT). The 5-HT1 receptors mediate inhibitory neurotransmission by coupling to G proteins of the G(i/o) family, which lead to a decrease in adenylate cyclase activity, thereby decreasing intracellular cAMP levels and calcium influx. The 5-HT1 receptor subfamily includes 5 subtypes: 5-HT1A, 5-HT1B, 5-HT1D, 5-HT1E, and 5-HT1F.  There is no 5-HT1C receptor subtype, as it has been reclassified as the 5-HT2C receptor. In the CNS, serotonin is involved in the regulation of appetite, mood, sleep, cognition, learning and memory, as well as implicated in neurologic disorders such as migraine, schizophrenia, and depression.	261
320455	cd15333	7tmA_5-HT1B_1D	serotonin receptor subtypes 1B and 1D, member of the class A family of seven-transmembrane G protein-coupled receptors. The 5-HT1 receptors, one of 14 mammalian 5-HT receptors, is a member of the class A of GPCRs and is activated by the endogenous neurotransmitter and peripheral signal mediator serotonin (5-hydroxytryptamine, 5-HT). The 5-HT1 receptors mediate inhibitory neurotransmission by coupling to G proteins of the G(i/o) family, which lead to a decrease in adenylate cyclase activity, thereby decreasing intracellular cAMP levels and calcium influx. The 5-HT1 receptor subfamily includes 5 subtypes: 5-HT1A, 5-HT1B, 5-HT1D, 5-HT1E, and 5-HT1F.  There is no 5-HT1C receptor subtype, as it has been reclassified as the 5-HT2C receptor. In the CNS, serotonin is involved in the regulation of appetite, mood, sleep, cognition, learning and memory, as well as implicated in neurologic disorders such as migraine, schizophrenia, and depression.	265
320456	cd15334	7tmA_5-HT1F	serotonin receptor subtype 1F, member of the class A family of seven-transmembrane G protein-coupled receptors. The 5-HT1 receptors, one of 14 mammalian 5-HT receptors, is a member of the class A of GPCRs and is activated by the endogenous neurotransmitter and peripheral signal mediator serotonin (5-hydroxytryptamine, 5-HT). The 5-HT1 receptors mediate inhibitory neurotransmission by coupling to G proteins of the G(i/o) family, which lead to a decrease in adenylate cyclase activity, thereby decreasing intracellular cAMP levels and calcium influx. The 5-HT1 receptor subfamily includes 5 subtypes: 5-HT1A, 5-HT1B, 5-HT1D, 5-HT1E, and 5-HT1F.  There is no 5-HT1C receptor subtype, as it has been reclassified as the 5-HT2C receptor. In the CNS, serotonin is involved in the regulation of appetite, mood, sleep, cognition, learning and memory, as well as implicated in neurologic disorders such as migraine, schizophrenia, and depression.	259
320457	cd15335	7tmA_5-HT1E	serotonin receptor subtype 1E, member of the class A family of seven-transmembrane G protein-coupled receptors. The 5-HT1 receptors, one of 14 mammalian 5-HT receptors, is a member of the class A of GPCRs and is activated by the endogenous neurotransmitter and peripheral signal mediator serotonin (5-hydroxytryptamine, 5-HT). The 5-HT1 receptors mediate inhibitory neurotransmission by coupling to G proteins of the G(i/o) family, which lead to a decrease in adenylate cyclase activity, thereby decreasing intracellular cAMP levels and calcium influx. The 5-HT1 receptor subfamily includes 5 subtypes: 5-HT1A, 5-HT1B, 5-HT1D, 5-HT1E, and 5-HT1F.  There is no 5-HT1C receptor subtype, as it has been reclassified as the 5-HT2C receptor. In the CNS, serotonin is involved in the regulation of appetite, mood, sleep, cognition, learning and memory, as well as implicated in neurologic disorders such as migraine, schizophrenia, and depression.	258
320458	cd15336	7tmA_Melanopsin	vertebrate melanopsins (Opsin-4), member of the class A family of seven-transmembrane G protein-coupled receptors. Melanopsin (also called Opsin-4) is the G protein-coupled photopigment that mediates non-visual responses to light. In mammals, these photoresponses include the photo-entrainment of circadian rhythm, pupillary constriction, and acute nocturnal melatonin suppression. Mammalian melanopsins are expressed only in the inner retina, whereas non-mammalian vertebrate melanopsins are localized in various extra-retinal tissues such as iris, brain, pineal gland, and skin. Melanopsins belong to the class A of the G protein-coupled receptors and possess seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops.	290
320459	cd15337	7tmA_Opsin_Gq_invertebrates	invertebrate Gq opsins, member of the class A family of seven-transmembrane G protein-coupled receptors. The invertebrate Gq-coupled opsin subfamily includes the arthropod and mollusc visual opsins. Like the vertebrate visual opsins, arthropods possess color vision by the use of multiple opsins sensitive to different light wavelengths. The invertebrate Gq opsins are closely related to the vertebrate melanopsins, the primary photoreceptor molecules for non-visual responses to light, and the R1-R6 photoreceptors, which are the fly equivalent to the vertebrate rods. The Gq opsins belong to the class A of the G protein-coupled receptors and possess seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops.	292
320460	cd15338	7tmA_MCHR1	melanin concentrating hormone receptor 1, member of the class A family of seven-transmembrane G protein-coupled receptors. Melanin-concentrating hormone receptor (MCHR) binds melanin concentrating hormone and is presumably involved in the neuronal regulation of food intake and energy homeostasis. Despite strong homology with somatostatin receptors, MCHR does not appear to bind somatostatin. Two MCHRs have been characterized in vertebrates, MCHR1 and MCHR2. MCHR1 is expressed in all mammals, whereas MCHR2 is only expressed in the higher order mammals, such as humans, primates, and dogs, and is not found in rodents. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	282
320461	cd15339	7tmA_MCHR2	melanin concentrating hormone receptor 2, member of the class A family of seven-transmembrane G protein-coupled receptors. Melanin-concentrating hormone receptor (MCHR) binds melanin concentrating hormone and is presumably involved in the neuronal regulation of food intake and energy homeostasis. Despite strong homology with somatostatin receptors, MCHR does not appear to bind somatostatin. Two MCHRs have been characterized in vertebrates, MCHR1 and MCHR2. MCHR1 is expressed in all mammals, whereas MCHR2 is only expressed in the higher order mammals, such as humans, primates, and dogs, and is not found in rodents. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	283
320462	cd15340	7tmA_CB1	cannabinoid receptor subtype 1, member of the class A family of seven-transmembrane G protein-coupled receptors. Cannabinoid receptors belong to the class A G-protein coupled receptor superfamily. Two types of cannabinoid receptors, CB1 and CB2, have been identified so far. They are activated by naturally occurring endocannabinoids, cannabis plant-derived cannabinoids such as tetrahydrocannabinol, or synthetic cannabinoids. The CB receptors are involved in the various physiological processes such as appetite, mood, memory, and pain sensation. CB1 receptor is expressed predominantly in central and peripheral neurons, while CB2 receptor is found mainly in the immune system.	292
320463	cd15341	7tmA_CB2	cannabinoid receptor subtype 2, member of the class A family of seven-transmembrane G protein-coupled receptors. Cannabinoid receptors belong to the class A G-protein coupled receptor superfamily. Two types of cannabinoid receptors, CB1 and CB2, have been identified so far. They are activated by naturally occurring endocannabinoids, cannabis plant-derived cannabinoids such as tetrahydrocannabinol, or synthetic cannabinoids. The CB receptors are involved in the various physiological processes such as appetite, mood, memory, and pain sensation. CB1 receptor is expressed predominantly in central and peripheral neurons, while CB2 receptor is found mainly in the immune system.	279
320464	cd15342	7tmA_LPAR2_Edg4	lysophosphatidic acid receptor subtype 2 (LPAR2 or LPA2), also called Endothelial differentiation gene 4 (Edg4), member of the class A family of seven-transmembrane G protein-coupled receptors. The endothelial differentiation gene (Edg) family of G-protein coupled receptors binds blood borne lysophospholipids including sphingosine-1-phosphate (S1P) and lysophosphatidic acid (LPA), which are involved in the regulation of cell proliferation, survival, migration, invasion, endothelial cell shape change and cytoskeletal remodeling. The Edg receptors are classified into two subfamilies: the lysophosphatidic acid subfamily that includes LPA1 (Edg2), LPA2 (Edg4), and LPA3 (Edg7); and the S1P subfamily that includes S1P1 (Edg1), S1P2 (Edg5), S1P3 (Edg3), S1P4 (Edg6), and S1P5 (Edg8).  The Edg receptors couple and activate at least three different G protein subtypes including G(i/o), G(q/11), and G(12/13).	274
320465	cd15343	7tmA_LPAR3_Edg7	lysophosphatidic acid receptor subtype 3 (LPAR3 or LPA3), also called endothelial differentiation gene 7 (Edg7), member of the class A family of seven-transmembrane G protein-coupled receptors. The endothelial differentiation gene (Edg) family of G-protein coupled receptors binds blood borne lysophospholipids including sphingosine-1-phosphate (S1P) and lysophosphatidic acid (LPA), which are involved in the regulation of cell proliferation, survival, migration, invasion, endothelial cell shape change and cytoskeletal remodeling. The Edg receptors are classified into two subfamilies: the lysophosphatidic acid subfamily that includes LPA1 (Edg2), LPA2 (Edg4), and LPA3 (Edg7); and the S1P subfamily that includes S1P1 (Edg1), S1P2 (Edg5), S1P3 (Edg3), S1P4 (Edg6), and S1P5 (Edg8).  The Edg receptors couple and activate at least three different G protein subtypes including G(i/o), G(q/11), and G(12/13).	274
341348	cd15344	7tmA_LPAR1_Edg2	lysophosphatidic acid receptor subtype 1 (LPAR1 or LPA1), also called endothelial differentiation gene 2 (Edg2), member of the class A family of seven-transmembrane G protein-coupled receptors. The endothelial differentiation gene (Edg) family of G-protein coupled receptors binds blood borne lysophospholipids including sphingosine-1-phosphate (S1P) and lysophosphatidic acid (LPA), which are involved in the regulation of cell proliferation, survival, migration, invasion, endothelial cell shape change and cytoskeletal remodeling. The Edg receptors are classified into two subfamilies: the lysophosphatidic acid subfamily that includes LPA1 (Edg2), LPA2 (Edg4), and LPA3 (Edg7); and the S1P subfamily that includes S1P1 (Edg1), S1P2 (Edg5), S1P3 (Edg3), S1P4 (Edg6), and S1P5 (Edg8).  The Edg receptors couple and activate at least three different G protein subtypes including G(i/o), G(q/11), and G(12/13).	273
320467	cd15345	7tmA_S1PR3_Edg3	sphingosine-1-phosphate receptor subtype 3 (S1PR3 or S1P3), also called endothelial differentiation gene 3 (Edg3), member of the class A family of seven-transmembrane G protein-coupled receptors. The endothelial differentiation gene (Edg) family of G-protein coupled receptors binds blood borne lysophospholipids including sphingosine-1-phosphate (S1P) and lysophosphatidic acid (LPA), which are involved in the regulation of cell proliferation, survival, migration, invasion, endothelial cell shape change and cytoskeletal remodeling. The Edg receptors are classified into two subfamilies: the lysophosphatidic acid subfamily that includes LPA1 (Edg2), LPA2 (Edg4), and LPA3 (Edg7); and the S1P subfamily that includes S1P1 (Edg1), S1P2 (Edg5), S1P3 (Edg3), S1P4 (Edg6), and S1P5 (Edg8).  The Edg receptors couple and activate at least three different G protein subtypes including G(i/o), G(q/11), and G(12/13).	270
320468	cd15346	7tmA_S1PR1_Edg1	sphingosine-1-phosphate receptor subtype 1 (S1PR1 or S1P1), also called endothelial differentiation gene 1 (Edg1), member of the class A family of seven-transmembrane G protein-coupled receptors. The endothelial differentiation gene (Edg) family of G-protein coupled receptors binds blood borne lysophospholipids including sphingosine-1-phosphate (S1P) and lysophosphatidic acid (LPA), which are involved in the regulation of cell proliferation, survival, migration, invasion, endothelial cell shape change and cytoskeletal remodeling. The Edg receptors are classified into two subfamilies: the lysophosphatidic acid subfamily that includes LPA1 (Edg2), LPA2 (Edg4), and LPA3 (Edg7); and the S1P subfamily that includes S1P1 (Edg1), S1P2 (Edg5), S1P3 (Edg3), S1P4 (Edg6), and S1P5 (Edg8).  The Edg receptors couple and activate at least three different G protein subtypes including G(i/o), G(q/11), and G(12/13).	277
320469	cd15347	7tmA_S1PR2_Edg5	sphingosine-1-phosphate receptor subtype 2 (S1PR2 or S1P2), also called endothelial differentiation gene 5 (Edg5), member of the class A family of seven-transmembrane G protein-coupled receptors. The endothelial differentiation gene (Edg) family of G-protein coupled receptors binds blood borne lysophospholipids including sphingosine-1-phosphate (S1P) and lysophosphatidic acid (LPA), which are involved in the regulation of cell proliferation, survival, migration, invasion, endothelial cell shape change and cytoskeletal remodeling. The Edg receptors are classified into two subfamilies: the lysophosphatidic acid subfamily that includes LPA1 (Edg2), LPA2 (Edg4), and LPA3 (Edg7); and the S1P subfamily that includes S1P1 (Edg1), S1P2 (Edg5), S1P3 (Edg3), S1P4 (Edg6), and S1P5 (Edg8).  The Edg receptors couple and activate at least three different G protein subtypes including G(i/o), G(q/11), and G(12/13).	266
320470	cd15348	7tmA_S1PR5_Edg8	sphingosine-1-phosphate receptor subtype 5 (S1PR5 or S1P5), also called endothelial differentiation gene 8 (Edg8), member of the class A family of seven-transmembrane G protein-coupled receptors. The endothelial differentiation gene (Edg) family of G-protein coupled receptors binds blood borne lysophospholipids including sphingosine-1-phosphate (S1P) and lysophosphatidic acid (LPA), which are involved in the regulation of cell proliferation, survival, migration, invasion, endothelial cell shape change and cytoskeletal remodeling. The Edg receptors are classified into two subfamilies: the lysophosphatidic acid subfamily that includes LPA1 (Edg2), LPA2 (Edg4), and LPA3 (Edg7); and the S1P subfamily that includes S1P1 (Edg1), S1P2 (Edg5), S1P3 (Edg3), S1P4 (Edg6), and S1P5 (Edg8).  The Edg receptors couple and activate at least three different G protein subtypes including G(i/o), G(q/11), and G(12/13).	277
320471	cd15349	7tmA_S1PR4_Edg6	sphingosine-1-phosphate receptor subtype 4 (S1PR4 or S1P4), also called endothelial differentiation gene 6 (Edg6), member of the class A family of seven-transmembrane G protein-coupled receptors. The endothelial differentiation gene (Edg) family of G-protein coupled receptors binds blood borne lysophospholipids including sphingosine-1-phosphate (S1P) and lysophosphatidic acid (LPA), which are involved in the regulation of cell proliferation, survival, migration, invasion, endothelial cell shape change and cytoskeletal remodeling. The Edg receptors are classified into two subfamilies: the lysophosphatidic acid subfamily that includes LPA1 (Edg2), LPA2 (Edg4), and LPA3 (Edg7); and the S1P subfamily that includes S1P1 (Edg1), S1P2 (Edg5), S1P3 (Edg3), S1P4 (Edg6), and S1P5 (Edg8).  The Edg receptors couple and activate at least three different G protein subtypes including G(i/o), G(q/11), and G(12/13).	271
320472	cd15350	7tmA_MC2R_ACTH_R	melanocortin receptor subtype 2, also called adrenocorticotropic hormone receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. The melanocortin receptor (MCR) subfamily is a member of the class A family of seven-transmembrane G-protein coupled receptors. MCRs bind a group of pituitary peptide hormones known as melanocortins, which include adrenocorticotropic hormone (ACTH) and the different isoforms of melanocyte-stimulating hormones. There are five known subtypes of the MCR subfamily. MC1R is involved in regulating skin pigmentation and hair color. ACTH (adrenocorticotropic hormone) is the only endogenous ligand for MC2R, which shows low sequence similarity with other melanocortin receptors. Mutations in MC2R cause familial glucocorticoid deficiency type 1, in which patients have elevated plasma ACTH and low cortisol levels. MC3R is expressed in many parts of the brain and peripheral tissues and involved in the regulation of energy homeostasis. MC4R is expressed primarily in the central nervous system and involved in both eating behavior and sexual function. MC5R is widely expressed in peripheral tissues and is mainly involved in the regulation of exocrine gland function.	270
320473	cd15351	7tmA_MC1R	melanocortin receptor subtype 1, member of the class A family of seven-transmembrane G protein-coupled receptors. The melanocortin receptor (MCR) subfamily is a member of the class A family of seven-transmembrane G-protein coupled receptors. MCRs bind a group of pituitary peptide hormones known as melanocortins, which include adrenocorticotropic hormone (ACTH) and the different isoforms of melanocyte-stimulating hormones. There are five known subtypes of the MCR subfamily. MC1R is involved in regulating skin pigmentation and hair color. ACTH (adrenocorticotropic hormone) is the only endogenous ligand for MC2R, which shows low sequence similarity with other melanocortin receptors. Mutations in MC2R cause familial glucocorticoid deficiency type 1, in which patients have elevated plasma ACTH and low cortisol levels. MC3R is expressed in many parts of the brain and peripheral tissues and involved in the regulation of energy homeostasis. MC4R is expressed primarily in the central nervous system and involved in both eating behavior and sexual function. MC5R is widely expressed in peripheral tissues and is mainly involved in the regulation of exocrine gland function.	271
320474	cd15352	7tmA_MC3R	melanocortin receptor subtype 3, member of the class A family of seven-transmembrane G protein-coupled receptors. The melanocortin receptor (MCR) subfamily is a member of the class A family of seven-transmembrane G-protein coupled receptors. MCRs bind a group of pituitary peptide hormones known as melanocortins, which include adrenocorticotropic hormone (ACTH) and the different isoforms of melanocyte-stimulating hormones. There are five known subtypes of the MCR subfamily. MC1R is involved in regulating skin pigmentation and hair color. ACTH (adrenocorticotropic hormone) is the only endogenous ligand for MC2R, which shows low sequence similarity with other melanocortin receptors. Mutations in MC2R cause familial glucocorticoid deficiency type 1, in which patients have elevated plasma ACTH and low cortisol levels. MC3R is expressed in many parts of the brain and peripheral tissues and involved in the regulation of energy homeostasis. MC4R is expressed primarily in the central nervous system and involved in both eating behavior and sexual function. MC5R is widely expressed in peripheral tissues and is mainly involved in the regulation of exocrine gland function.	272
320475	cd15353	7tmA_MC4R	melanocortin receptor subtype 4, member of the class A family of seven-transmembrane G protein-coupled receptors. The melanocortin receptor (MCR) subfamily is a member of the class A family of seven-transmembrane G-protein coupled receptors. MCRs bind a group of pituitary peptide hormones known as melanocortins, which include adrenocorticotropic hormone (ACTH) and the different isoforms of melanocyte-stimulating hormones. There are five known subtypes of the MCR subfamily. MC1R is involved in regulating skin pigmentation and hair color. ACTH (adrenocorticotropic hormone) is the only endogenous ligand for MC2R, which shows low sequence similarity with other melanocortin receptors. Mutations in MC2R cause familial glucocorticoid deficiency type 1, in which patients have elevated plasma ACTH and low cortisol levels. MC3R is expressed in many parts of the brain and peripheral tissues and involved in the regulation of energy homeostasis. MC4R is expressed primarily in the central nervous system and involved in both eating behavior and sexual function. MC5R is widely expressed in peripheral tissues and is mainly involved in the regulation of exocrine gland function.	269
320476	cd15354	7tmA_MC5R	melanocortin receptor subtype 5, member of the class A family of seven-transmembrane G protein-coupled receptors. The melanocortin receptor (MCR) subfamily is a member of the class A family of seven-transmembrane G-protein coupled receptors. MCRs bind a group of pituitary peptide hormones known as melanocortins, which include adrenocorticotropic hormone (ACTH) and the different isoforms of melanocyte-stimulating hormones. There are five known subtypes of the MCR subfamily. MC1R is involved in regulating skin pigmentation and hair color. ACTH (adrenocorticotropic hormone) is the only endogenous ligand for MC2R, which shows low sequence similarity with other melanocortin receptors. Mutations in MC2R cause familial glucocorticoid deficiency type 1, in which patients have elevated plasma ACTH and low cortisol levels. MC3R is expressed in many parts of the brain and peripheral tissues and involved in the regulation of energy homeostasis. MC4R is expressed primarily in the central nervous system and involved in both eating behavior and sexual function. MC5R is widely expressed in peripheral tissues and is mainly involved in the regulation of exocrine gland function.	270
320477	cd15355	7tmA_NTSR1	neurotensin receptor subtype 1, member of the class A family of seven-transmembrane G protein-coupled receptors. Neurotensin (NTS) is a 13 amino-acid neuropeptide that functions as both a neurotransmitter and a hormone in the nervous system and peripheral tissues, respectively. NTS exerts various biological activities through activation of the G protein-coupled neurotensin receptors, NTSR1 and NTSR2. In the brain, NTS is involved in the modulation of dopamine neurotransmission, opioid-independent analgesia, hypothermia, and the inhibition of food intake, while in the periphery NTS promotes the growth of various normal and cancer cells and acts as a paracrine and endocrine modulator of the digestive tract. The third neurotensin receptor, NTSR3 or also called sortilin, is not a G protein-coupled receptor.	310
320478	cd15356	7tmA_NTSR2	neurotensin receptor subtype 2, member of the class A family of seven-transmembrane G protein-coupled receptors. Neurotensin (NTS) is a 13 amino-acid neuropeptide that functions as both a neurotransmitter and a hormone in the nervous system and peripheral tissues, respectively. NTS exerts various biological activities through activation of the G protein-coupled neurotensin receptors, NTSR1 and NTSR2. In the brain, NTS is involved in the modulation of dopamine neurotransmission, opioid-independent analgesia, hypothermia, and the inhibition of food intake, while in the periphery NTS promotes the growth of various normal and cancer cells and acts as a paracrine and endocrine modulator of the digestive tract. The third neurotensin receptor, NTSR3 or also called sortilin, is not a G protein-coupled receptor.	285
320479	cd15357	7tmA_NMU-R2	neuromedin U receptor subtype 2, member of the class A family of seven-transmembrane G protein-coupled receptors. Neuromedin U (NMU) is a highly conserved neuropeptide with a common C-terminal heptapeptide sequence (FLFRPRN-amide) found at the highest levels in the gastrointestinal tract and pituitary gland of mammals. Disruption or replacement of residues in the conserved heptapeptide region can result in the reduced ability of NMU to stimulate smooth-muscle contraction. Two G-protein coupled receptor subtypes, NMU-R1 and NMU-R2, with a distinct expression pattern, have been identified to bind NMU. NMU-R1 is expressed primarily in the peripheral nervous system, while NMU-R2 is mainly found in the central nervous system. Neuromedin S, a 36 amino-acid neuropeptide that shares a conserved C-terminal heptapeptide sequence with NMU, is a highly potent and selective NMU-R2 agonist. Pharmacological studies have shown that both NMU and NMS inhibit food intake and reduce body weight, and that NMU increases energy expenditure.	293
320480	cd15358	7tmA_NMU-R1	neuromedin U receptor subtype 1, member of the class A family of seven-transmembrane G protein-coupled receptors. Neuromedin U (NMU) is a highly conserved neuropeptide with a common C-terminal heptapeptide sequence (FLFRPRN-amide) found at the highest levels in the gastrointestinal tract and pituitary gland of mammals. Disruption or replacement of residues in the conserved heptapeptide region can result in the reduced ability of NMU to stimulate smooth-muscle contraction. Two G-protein coupled receptor subtypes, NMU-R1 and NMU-R2, with a distinct expression pattern, have been identified to bind NMU. NMU-R1 is expressed primarily in the peripheral nervous system, while NMU-R2 is mainly found in the central nervous system. Neuromedin S, a 36 amino-acid neuropeptide that shares a conserved C-terminal heptapeptide sequence with NMU, is a highly potent and selective NMU-R2 agonist. Pharmacological studies have shown that both NMU and NMS inhibit food intake and reduce body weight, and that NMU increases energy expenditure.	305
320481	cd15359	7tmA_LHCGR	luteinizing hormone-choriogonadotropin receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. The glycoprotein hormone receptors are seven transmembrane domain receptors with a very large extracellular N-terminal domain containing many leucine-rich repeats responsible for hormone recognition and binding. The glycoprotein hormone family includes the three gonadotropins: luteinizing hormone (LH), follicle-stimulating hormone (FSH), chorionic gonadotropin (CG), and a pituitary thyroid-stimulating hormone (TSH). The glycoprotein hormones exert their biological functions by interacting with their cognate GPCRs. Both LH and CG bind to the same receptor, the luteinizing hormone-choriogonadotropin receptor (LHCGR); FSH binds to FSH-R and TSH to TSH-R. LHCGR is expressed predominantly in the ovary and testis, and plays an essential role in sexual development and reproductive processes. LHCGR couples primarily to the G(s)-protein and activates adenylate cyclase, thereby promoting cAMP production.	275
320482	cd15360	7tmA_FSH-R	follicle-stimulating hormone receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. The glycoprotein hormone receptors are seven transmembrane domain receptors with a very large extracellular N-terminal domain containing many leucine-rich repeats responsible for hormone recognition and binding. The glycoprotein hormone family includes the three gonadotropins: luteinizing hormone (LH), follicle-stimulating hormone (FSH), chorionic gonadotropin (CG), and a pituitary thyroid-stimulating hormone (TSH). The glycoprotein hormones exert their biological functions by interacting with their cognate GPCRs. Both LH and CG bind to the same receptor, the luteinizing hormone-choriogonadotropin receptor (LHCGR); FSH binds to FSH-R and TSH to TSH-R. FSH-R functions in gonad development and is found in the ovary, testis, and uterus. Defects in this receptor cause ovarian dysgenesis type 1, and also ovarian hyperstimulation syndrome. The FSH-R activation couples to the G(s)-protein and stimulates adenylate cyclase, thereby promoting cAMP production.	275
320483	cd15361	7tmA_LGR4	leucine-rich repeats-containing G protein-coupled receptor 4, member of the class A family of seven-transmembrane G protein-coupled receptors. The leucine-rich repeat containing G-protein coupled receptor Lgr4 (formerly known as Gpr48), together with its close family members LGR5 and LGR6, is structurally related to the glycoprotein hormone receptor family, which includes the luteinizing hormone (LH) receptor, the follicle-stimulating hormone (FSH) receptor, and the pituitary thyroid-stimulating hormone (TSH) receptor. LGR4-6 are receptors for the R-spondin (Rspo) family of secreted proteins containing two N-terminal furin-like repeats and a thrombospondin domain. The Rspo proteins are involved in regulating proliferation and differentiation of adult stem cells by potently enhancing the WNT-stimulated beta-catenin signaling. LGR4 is broadly expressed in proliferating cells, and its deficient mice display development defects in multiple organs. LGR5 acts as a marker for resident stem cell in numerous epithelial cell layers, including small intestine, colon, stomach, and kidney. LGR6 also serves as a marker of multipotent stem cells in the hair follicle that generate all skin cell lineages. Members of this group are characterized by a very large extracellular N-terminal domain containing 17 leucine-rich repeats (LRRs), flanked by cysteine-rich N- and C-terminal capping domains, and the extracellular domain is responsible for high-affinity binding with the Rspo proteins.	274
320484	cd15362	7tmA_LGR6	leucine-rich repeats-containing G protein-coupled receptor 6, the class A of 7-transmembrane GPCRs. The leucine-rich repeat containing G-protein coupled receptor LGR5, together with its family members LGR4 and LGR6, is structurally related to the glycoprotein hormone receptor family, which includes the luteinizing hormone (LH) receptor, the follicle-stimulating hormone (FSH) receptor, and the pituitary thyroid-stimulating hormone (TSH) receptor. LGR4-6 are receptors for the R-spondin (Rspo) family of secreted proteins containing two N-terminal furin-like repeats and a thrombospondin domain. The Rspo proteins are involved in regulating proliferation and differentiation of adult stem cells by potently enhancing the WNT-stimulated beta-catenin signaling. LGR5 serves as a marker for resident stem cell in numerous epithelial cell layers, including small intestine, colon, stomach, and kidney. LGR6 is a marker for multipotent stem cells in the hair follicle that generate all skin cell lineages. In addition, LGR4 is broadly expressed in proliferating cells, and its deficient mice display development defects in multiple organs. Members of this group are characterized by a very large extracellular N-terminal domain containing 17 leucine-rich repeats (LRRs), flanked by cysteine-rich N- and C-terminal capping domains, and the extracellular domain is responsible for high-affinity binding with the Rspo proteins.	276
320485	cd15363	7tmA_LGR5	leucine-rich repeats-containing G protein-coupled receptor 5, member of the class A family of seven-transmembrane G protein-coupled receptors. The leucine-rich repeat containing G-protein coupled receptor LGR6, together with its family members LGR4 and LGR5, is structurally related to the glycoprotein hormone receptor family, which includes the luteinizing hormone (LH) receptor, the follicle-stimulating hormone (FSH) receptor, and the pituitary thyroid-stimulating hormone (TSH) receptor. LGR4-6 are receptors for the R-spondin (Rspo) family of secreted proteins containing two N-terminal furin-like repeats and a thrombospondin domain. The Rspo proteins are involved in regulating proliferation and differentiation of adult stem cells by potently enhancing the WNT-stimulated beta-catenin signaling. LGR6 serves as a marker of multipotent stem cells in the hair follicle that generate all skin cell lineages, whereas LGR5 is a marker for resident stem cell in numerous epithelial cell layers, including small intestine, colon, stomach, and kidney. In addition, LGR4 is broadly expressed in proliferating cells, and its deficient mice display development defects in multiple organs. Members of this group are characterized by a very large extracellular N-terminal domain containing 17 leucine-rich repeats (LRRs), flanked by cysteine-rich N- and C-terminal capping domains, and the extracellular domain is responsible for high-affinity binding with the Rspo proteins.	274
320486	cd15364	7tmA_GPR132_G2A	proton-sensing G protein-coupled receptor 132, member of the class A family of seven-transmembrane G protein-coupled receptors. The G2 accumulation receptor (G2A, also known as GPR132) is a member of the proton-sensing G-protein-coupled receptor (GPCR) family which also includes the T cell death associated gene-8 (TDAG8, GPR65) receptor, ovarian cancer G-protein receptor 1 (OGR-1, GPR68), and G-protein-coupled receptor 4 (GPR4). Proton-sensing G-protein coupled receptors sense pH of 7.6 to 6.0 and mediate a variety of biological activities in neutral and mildly acidic pH conditions, whereas the acid-sensing ionotropic ion channels typically sense strongly acidic pH. G2A was originally identified as a stress-inducible receptor that causes cell cycle arrest at the G2/M phase upon serum deprivation. Lysophosphatidylcholine was identified as a ligand for G2A, whose overexpression was shown to induce cell proliferation, oncogenic transformation, and apoptosis.	279
320487	cd15365	7tmA_GPR65_TDAG8	proton-sensing G protein-coupled receptor 65, member of the class A family of seven-transmembrane G protein-coupled receptors. The T cell death associated gene-8 receptor (TDAG8, also known as GPR65) is a member of the proton-sensing G-protein-coupled receptor (GPCR) family which also includes the G2 accumulation receptor (G2A, also known as GPR132), ovarian cancer G-protein receptor 1 (OGR-1, GPR68), and G-protein-coupled receptor 4 (GPR4). Proton-sensing G-protein coupled receptors sense pH of 7.6 to 6.0 and mediate a variety of biological activities in neutral and mildly acidic pH conditions, whereas the acid-sensing ionotropic ion channels typically sense strongly acidic pH. Activation of TDAG8 by extracellular acidosis increases cAMP production, stimulates Rho, and induces stress fiber formation. TDAG8 has also been shown to regulate the extracellular acidosis-induced inhibition of pro-inflammatory cytokine production in peritoneal macrophages.	277
320488	cd15366	7tmA_GPR4	proton-sensing G protein-coupled receptor 4, member of the class A family of seven-transmembrane G protein-coupled receptors. G-protein-coupled receptor 4 (GPR4) is a member of the proton-sensing G-protein-coupled receptor (GPCR) family which also includes the G2 accumulation receptor (G2A, also known as GPR132), the T cell death associated gene-8 receptor (TDAG8, GPR65), and ovarian cancer G-protein receptor 1 (OGR-1, GPR68). Proton-sensing G-protein coupled receptors sense pH of 7.6 to 6.0 and mediate a variety of biological activities in neutral and mildly acidic pH conditions, whereas the acid-sensing ionotropic ion channels typically sense strongly acidic pH. GPR4 overexpression in melanoma cells was shown to reduce cell migration, membrane ruffling, and cell spreading under acidic pH conditions. Activation of GPR4 via extracellular acidosis is coupled to the G(s), G(q), and G(12/13) pathways.	280
320489	cd15367	7tmA_GPR68_OGR1	G protein-coupled receptor 68, member of the class A family of seven-transmembrane G protein-coupled receptors. The ovarian cancer G-protein receptor 1 (OGR1, also known as GPR68) is a member of the proton-sensing G-protein-coupled receptor (GPCR) family which also includes the G2 accumulation receptor (G2A, also known as GPR132), the T cell death associated gene-8 receptor (TDAG8, GPR65), and the G-protein-coupled receptor 4 (GPR4). Proton-sensing G-protein coupled receptors sense pH of 7.6 to 6.0 and mediate a variety of biological activities in neutral and mildly acidic pH conditions, whereas the acid-sensing ionotropic ion channels typically sense strongly acidic pH. Knockout mouse studies have suggested that OGR1 plays a role in the regulation of insulin secretion and glucose metabolism. OGR1 couples to G(q/11) proteins and activates phospholipase C and Ca2+ signaling pathways.	276
320490	cd15368	7tmA_P2Y8	purinergic receptor P2Y8, member of the class A family of seven-transmembrane G protein-coupled receptors. P2Y8 (or P2RY8) expression is often increased in leukemia patients, and it plays a role in the pathogenesis of acute leukemia. P2Y8 is phylogenetically closely related to the protease-activated receptors (PARs), which are activated by serine proteases such as thrombin, trypsin, and tryptase. These proteases cleave the extracellular domain of the receptor to form a new N-terminus, which in turn functions as a tethered ligand. The newly-formed tethered ligand binds intramolecularly to activate the receptor and triggers G-protein binding and intracellular signaling. Four different types of the protease-activated receptors have been identified (PAR1-4) and are predominantly expressed in platelets. PAR1, PAR3, and PAR4 are activated by thrombin, whereas PAR2 is activated by trypsin. The PARs are known to couple with several G-proteins including Gi (cAMP inhibitory), G12/13 (Rho and Ras activation), and Gq (calcium signaling) to activate downstream signaling messengers, which induce numerous cellular and physiological effects.	281
320491	cd15369	7tmA_PAR1	protease-activated receptor 1, member of the class A family of seven-transmembrane G protein-coupled receptors. Protease-activated receptors (PARs) are seven-transmembrane proteins that belong to the class A G-protein coupled receptor (GPCR) family. Four different types of the protease-activated receptors have been identified: PAR1, PAR2, PAR3, and PAR4. PARs are predominantly expressed in platelets and are activated by serine proteases such as thrombin, trypsin, and tryptase. These proteases cleave the extracellular domain of the receptor to form a new N-terminus, which in turn functions as a tethered ligand. The newly-formed tethered ligand binds intramolecularly to activate the receptor and triggers G-protein binding and intracellular signaling. PAR1, PAR3, and PAR4 are activated by thrombin, whereas PAR2 is activated by trypsin. The PARs are known to couple with several G-proteins including Gi (cAMP inhibitory), G12/13 (Rho and Ras activation), and Gq (calcium signaling) to activate downstream signaling messengers, which induce numerous cellular and physiological effects.	281
341349	cd15370	7tmA_PAR2	protease-activated receptor 2, member of the class A family of seven-transmembrane G protein-coupled receptors. Protease-activated receptors (PARs) are seven-transmembrane proteins that belong to the class A G-protein coupled receptor (GPCR) family. Four different types of the protease-activated receptors have been identified: PAR1, PAR2, PAR3, and PAR4. PARs are predominantly expressed in platelets and are activated by serine proteases such as thrombin, trypsin, and tryptase. These proteases cleave the extracellular domain of the receptor to form a new N-terminus, which in turn functions as a tethered ligand. The newly-formed tethered ligand binds intramolecularly to activate the receptor and triggers G-protein binding and intracellular signaling. PAR1, PAR3, and PAR4 are activated by thrombin, whereas PAR2 is activated by trypsin. The PARs are known to couple with several G-proteins including Gi (cAMP inhibitory), G12/13 (Rho and Ras activation), and Gq (calcium signaling) to activate downstream signaling messengers, which induce numerous cellular and physiological effects.	280
320493	cd15371	7tmA_PAR3	protease-activated receptor 3, member of the class A family of seven-transmembrane G protein-coupled receptors. Protease-activated receptors (PARs) are seven-transmembrane proteins that belong to the class A G-protein coupled receptor (GPCR) family. Four different types of the protease-activated receptors have been identified: PAR1, PAR2, PAR3, and PAR4. PARs are predominantly expressed in platelets and are activated by serine proteases such as thrombin, trypsin, and tryptase. These proteases cleave the extracellular domain of the receptor to form a new N-terminus, which in turn functions as a tethered ligand. The newly-formed tethered ligand binds intramolecularly to activate the receptor and triggers G-protein binding and intracellular signaling. PAR1, PAR3, and PAR4 are activated by thrombin, whereas PAR2 is activated by trypsin. The PARs are known to couple with several G-proteins including Gi (cAMP inhibitory), G12/13 (Rho and Ras activation), and Gq (calcium signaling) to activate downstream signaling messengers, which induce numerous cellular and physiological effects.	274
320494	cd15372	7tmA_PAR4	protease-activated receptor 4, member of the class A family of seven-transmembrane G protein-coupled receptors. Protease-activated receptors (PARs) are seven-transmembrane proteins that belong to the class A G-protein coupled receptor (GPCR) family. Four different types of the protease-activated receptors have been identified: PAR1, PAR2, PAR3, and PAR4. PARs are predominantly expressed in platelets and are activated by serine proteases such as thrombin, trypsin, and tryptase. These proteases cleave the extracellular domain of the receptor to form a new N-terminus, which in turn functions as a tethered ligand. The newly-formed tethered ligand binds intramolecularly to activate the receptor and triggers G-protein binding and intracellular signaling. PAR1, PAR3, and PAR4 are activated by thrombin, whereas PAR2 is activated by trypsin. The PARs are known to couple with several G-proteins including Gi (cAMP inhibitory), G12/13 (Rho and Ras activation), and Gq (calcium signaling) to activate downstream signaling messengers, which induce numerous cellular and physiological effects.	274
320495	cd15373	7tmA_P2Y2	P2Y purinoceptor 2, member of the class A family of seven-transmembrane G protein-coupled receptors. P2Y2 belongs to the P2Y receptor family of purinergic G-protein coupled receptors and is implicated in the control of the cell cycle of endometrial carcinoma cells. The P2Y receptor family is composed of eight subtypes, which are activated by naturally occurring extracellular nucleotides such as ATP, ADP, UTP, UDP, and UDP-glucose. These eight receptors are ubiquitous in human tissues and can be further classified into two subfamilies based on sequence homology and second messenger coupling: a subfamily of five P2Y1-like receptors (P2Y1, P2Y2, P2Y4, P2Y6, and P2Y11Rs) that are coupled to G(q) protein to activate phospholipase C (PLC) and a second subfamily of three P2Y12-like receptors (P2Y12, P2Y13, and P2Y14Rs) that are coupled to G(i) protein to inhibit adenylate cyclase. Several cloned subtypes, such as P2Y3, P2Y5, and P2Y7-10, are not functional mammalian nucleotide receptors. The native agonists for P2Y receptors are: ATP (P2Y2, P2Y12), ADP (P2Y1, P2Y12, and P2Y13), UTP (P2Y2, P2Y4), UDP (P2Y6, P2Y14), and UDP-glucose (P2Y14).	283
320496	cd15374	7tmA_P2Y4	P2Y purinoceptor 4, member of the class A family of seven-transmembrane G protein-coupled receptors. P2Y4 belongs to the P2Y receptor family of purinergic G-protein coupled receptors. This family is composed of eight subtypes, which are activated by naturally occurring extracellular nucleotides such as ATP, ADP, UTP, UDP, and UDP-glucose. These eight receptors are ubiquitous in human tissues and can be further classified into two subfamilies based on sequence homology and second messenger coupling: a subfamily of five P2Y1-like receptors (P2Y1, P2Y2, P2Y4, P2Y6, and P2Y11Rs) that are coupled to G(q) protein to activate phospholipase C (PLC) and a second subfamily of three P2Y12-like receptors (P2Y12, P2Y13, and P2Y14Rs) that are coupled to G(i) protein to inhibit adenylate cyclase. Several cloned subtypes, such as P2Y3, P2Y5, and P2Y7-10, are not functional mammalian nucleotide receptors. The native agonists for P2Y receptors are: ATP (P2Y2, P2Y12), ADP (P2Y1, P2Y12, and P2Y13), UTP (P2Y2, P2Y4), UDP (P2Y6, P2Y14), and UDP-glucose (P2Y14).	285
320497	cd15375	7tmA_OXGR1	2-oxoglutarate receptor 1, member of the class A family of seven-transmembrane G protein-coupled receptors. 2-oxoglutarate receptor 1 (OXGR1) is also known as GPR80, GPR99, or P2Y15. OXGR1 functions as a receptor for alpha-ketoglutarate, a citric acid cycle intermediate, and acts exclusively through a G(q)-dependent pathway. OXGR1 belongs to the class A GPCR superfamily and is phylogenetically related to the purinergic P2Y1-like receptor subfamily, whose members are coupled to G(q) protein to activate phospholipase C (PLC). OXGR1 has also been reported as a potential third cysteinyl leukotriene receptor with specificity for leukotriene E4.	280
320498	cd15376	7tmA_P2Y11	P2Y purinoceptor 11, member of the class A family of seven-transmembrane G protein-coupled receptors. P2Y11 belongs to the P2Y receptor family of purinergic G-protein coupled receptors. The activation of P2Y11 is a major pathway of macrophage activation that leads to the release of cytokines. The P2Y receptor family is composed of eight subtypes, which are activated by naturally occurring extracellular nucleotides such as ATP, ADP, UTP, UDP, and UDP-glucose. These eight receptors are ubiquitous in human tissues and can be further classified into two subfamilies based on sequence homology and second messenger coupling: a subfamily of five P2Y1-like receptors (P2Y1, P2Y2, P2Y4, P2Y6, and P2Y11Rs) that are coupled to G(q) protein to activate phospholipase C (PLC) and a second subfamily of three P2Y12-like receptors (P2Y12, P2Y13, and P2Y14Rs) that are coupled to G(i) protein to inhibit adenylate cyclase. Several cloned subtypes, such as P2Y3, P2Y5, and P2Y7-10, are not functional mammalian nucleotide receptors. The native agonists for P2Y receptors are: ATP (P2Y2, P2Y12), ADP (P2Y1, P2Y12, and P2Y13), UTP (P2Y2, P2Y4), UDP (P2Y6, P2Y14), and UDP-glucose (P2Y14).	284
341350	cd15377	7tmA_P2Y1	P2Y purinoceptor 1, member of the class A family of seven-transmembrane G protein-coupled receptors. P2Y1 belongs to the P2Y receptor family of purinergic G-protein coupled receptors. This family is composed of eight subtypes, which are activated by naturally occurring extracellular nucleotides such as ATP, ADP, UTP, UDP, and UDP-glucose. These eight receptors are ubiquitous in human tissues and can be further classified into two subfamilies based on sequence homology and second messenger coupling: a subfamily of five P2Y1-like receptors (P2Y1, P2Y2, P2Y4, P2Y6, and P2Y11Rs) that are coupled to G(q) protein to activate phospholipase C (PLC) and a second subfamily of three P2Y12-like receptors (P2Y12, P2Y13, and P2Y14Rs) that are coupled to G(i) protein to inhibit adenylate cyclase. Several cloned subtypes, such as P2Y3, P2Y5, and P2Y7-10, are not functional mammalian nucleotide receptors. The native agonists for P2Y receptors are: ATP (P2Y2, P2Y12), ADP (P2Y1, P2Y12, and P2Y13), UTP (P2Y2, P2Y4), UDP (P2Y6, P2Y14), and UDP-glucose (P2Y14).	289
320500	cd15378	7tmA_SUCNR1_GPR91	succinate receptor 1, member of the class A family of seven-transmembrane G protein-coupled receptors. Succinate receptor 1 (SUCNR1, also known as GPR91) exclusively couples to G(i) protein to inhibit cAMP production and also activates PLC-beta to increase intracellular calcium concentrations in an inositol phosphate dependent mechanism. Succinate, an intermediate of the citric acid cycle, has been shown to cause cardiac hypertrophy via GPR91 activation. Furthermore, succinate-induced GPR91 activation is involved in the regulation of the renin-angiotensin system and is suggested to play an important role in the development of renovascular hypertension and diabetic nephropathy. SUCNR1 belongs to the class A GPCR superfamily and is phylogenetically related to the purinergic P2Y1-like receptor subfamily, whose members are coupled to G(q) protein to activate phospholipase C (PLC).	283
320501	cd15379	7tmA_P2Y6	P2Y purinoceptor 6, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes mammalian P2Y6, avian P2Y3, and similar proteins. P2Y3 is the avian homolog of mammalian P2Y6. They belong to the G(i) class of a family of purinergic G-protein coupled receptors. The P2Y receptor family is composed of eight subtypes, which are activated by naturally occurring extracellular nucleotides such as ATP, ADP, UTP, UDP, and UDP-glucose. These eight receptors are ubiquitous in human tissues and can be further classified into two subfamilies based on sequence homology and second messenger coupling: a subfamily of five P2Y1-like receptors (P2Y1, P2Y2, P2Y4, P2Y6, and P2Y11Rs) that are coupled to G(q) protein to activate phospholipase C (PLC) and a second subfamily of three P2Y12-like receptors (P2Y12, P2Y13, and P2Y14Rs) that are coupled to G(i) protein to inhibit adenylate cyclase. Several cloned subtypes, such as P2Y3, P2Y5, and P2Y7-10, are not functional mammalian nucleotide receptors. The native agonists for P2Y receptors are: ATP (P2Y2, P2Y12), ADP (P2Y1, P2Y12, and P2Y13), UTP (P2Y2, P2Y4), UDP (P2Y6, P2Y14), and UDP-glucose (P2Y14).	288
320502	cd15380	7tmA_BK-1	bradykinin receptor B1, member of the class A family of seven-transmembrane G protein-coupled receptors. The bradykinin receptor family is a group of the seven transmembrane G-protein coupled receptors, whose endogenous ligand is the pro-inflammatory nonapeptide bradykinin that mediates various vascular and pain responses. Two major bradykinin receptor subtypes, B1 and B2, have been identified based on their pharmacological properties. The B1 receptor is rapidly induced by tissue injury and inflammation, whereas the B2 receptor is ubiquitously expressed on many tissue types. Both receptors contain three consensus sites for N-linked glycosylation in extracellular domains and couple to G(q) protein to activate phospholipase C, leading to phosphoinositide hydrolysis and intracellular calcium mobilization. They can also interact with G(i) protein to inhibit adenylate cyclase and activate the MAPK (mitogen-activated protein kinase) pathways.	286
320503	cd15381	7tmA_BK-2	bradykinin receptor B2, member of the class A family of seven-transmembrane G protein-coupled receptors. The bradykinin receptor family is a group of the seven transmembrane G-protein coupled receptors, whose endogenous ligand is the pro-inflammatory nonapeptide bradykinin that mediates various vascular and pain responses. Two major bradykinin receptor subtypes, B1 and B2, have been identified based on their pharmacological properties. The B1 receptor is rapidly induced by tissue injury and inflammation, whereas the B2 receptor is ubiquitously expressed on many tissue types. Both receptors contain three consensus sites for N-linked glycosylation in extracellular domains and couple to G(q) protein to activate phospholipase C, leading to phosphoinositide hydrolysis and intracellular calcium mobilization. They can also interact with G(i) protein to inhibit adenylate cyclase and activate the MAPK (mitogen-activated protein kinase) pathways.	284
320504	cd15382	7tmA_AKHR	adipokinetic hormone receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. Adipokinetic hormone (AKH) is a lipid-mobilizing hormone that is involved in control of insect metabolism. Generally, AKH behaves as a typical stress hormone by mobilizing lipids, carbohydrates and/or certain amino acids such as proline. Thus, it utilizes the body's energy reserves to fight the immediate stress problems and subdue processes that are less important. Although AKH is known to be responsible for regulating the energy metabolism during insect flight, it is also found in insects that have lost their functional wings and predominantly walk for their locomotion. AKH is structurally related to the mammalian gonadotropin-releasing hormone (GnRH) and they share a common ancestor. Both GnRH and AKH receptors are members of the class A of the seven-transmembrane, G-protein coupled receptor (GPCR) superfamily.	298
320505	cd15383	7tmA_GnRHR_vertebrate	vertebrate gonadotropin-releasing hormone receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. GnRHR, also known as luteinizing hormone releasing hormone receptor (LHRHR), plays a central role in vertebrate reproductive function; its activation by binding to GnRH leads to the release of follicle stimulating hormone (FSH) and luteinizing hormone (LH) from the pituitary gland. GnRHR is expressed predominantly in the gonadotrope membrane of the anterior pituitary and is also found in numerous extrapituitary tissues including lymphocytes, breast, ovary, prostate, and cancer cell lines. There are at least two types of GnRH receptors, GnRHR1 and GnRHR2, which couple primarily to G proteins of the Gq/11 family. GnRHR is closely related to the adipokinetic hormone receptor (AKHR), which binds to a lipid-mobilizing hormone that is involved in control of insect metabolism. They share a common ancestor and are members of the class A of the seven-transmembrane, G-protein coupled receptor (GPCR) superfamily.	295
320506	cd15384	7tmA_GnRHR_invertebrate	invertebrate gonadotropin-releasing hormone receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. GnRHR, also known as luteinizing hormone releasing hormone receptor (LHRHR), plays a central role in vertebrate reproductive function; its activation by binding to GnRH leads to the release of follicle stimulating hormone (FSH) and luteinizing hormone (LH) from the pituitary gland. GnRHR is expressed predominantly in the gonadotrope membrane of the anterior pituitary and is also found in numerous extrapituitary tissues including lymphocytes, breast, ovary, prostate, and cancer cell lines. There are at least two types of GnRH receptors, GnRHR1 and GnRHR2, which couple primarily to G proteins of the Gq/11 family. GnRHR is closely related to the adipokinetic hormone receptor (AKHR), which binds to a lipid-mobilizing hormone that is involved in control of insect metabolism. They share a common ancestor and are members of the class A of the seven-transmembrane, G-protein coupled receptor (GPCR) superfamily.	293
320507	cd15385	7tmA_V1aR	vasopressin receptor subtype 1A, member of the class A family of seven-transmembrane G protein-coupled receptors. V1a-type receptor is a G(q/11)-coupled receptor that mediates blood vessel constriction. Vasopressin (also known as arginine vasopressin or anti-diuretic hormone) is synthesized in the hypothalamus and is released from the posterior pituitary gland. The actions of vasopressin are mediated by the interaction of this hormone with three receptor subtypes: V1aR, V1bR, and V2R. These subtypes differ in localization, function, and signaling pathways. Activation of V1aR and V1bR stimulates phospholipase C, while activation of V2R stimulates adenylate cyclase. Although vasopressin and oxytocin differ only by two amino acids and stimulate the same cAMP/PKA pathway, they have divergent physiological functions. Vasopressin is involved in regulating blood pressure and the balance of water and sodium ions, whereas oxytocin plays an important role in the uterus during childbirth and in lactation.	301
320508	cd15386	7tmA_V1bR	vasopressin receptor subtype 1B, member of the class A family of seven-transmembrane G protein-coupled receptors. The V1b receptor is specifically expressed in corticotropes of the anterior pituitary and plays a critical role in regulating the activity of the hypothalamic-pituitary-adrenal axis, a key part of the neuroendocrine system that controls reactions to stress, by maintaining adrenocorticotropic hormone (ACTH) and corticosterone levels. Vasopressin (also known as arginine vasopressin or anti-diuretic hormone) is synthesized in the hypothalamus and is released from the posterior pituitary gland. The actions of vasopressin are mediated by the interaction of this hormone with three receptor subtypes: V1aR, V1bR, and V2R. These subtypes differ in localization, function, and signaling pathways. Activation of V1aR and V1bR stimulates phospholipase C, while activation of V2R stimulates adenylate cyclase. Although vasopressin and oxytocin differ only by two amino acids and stimulate the same cAMP/PKA pathway, they have divergent physiological functions. Vasopressin is involved in regulating blood pressure and the balance of water and sodium ions, whereas oxytocin plays an important role in the uterus during childbirth and in lactation.	302
320509	cd15387	7tmA_OT_R	oxytocin receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. Oxytocin is a peptide of nine amino acids synthesized in the hypothalamus and is released from the posterior pituitary gland. Oxytocin plays an important role in sexual reproduction of both sexes and is structurally very similar to vasopressin. Although vasopressin and oxytocin differ only by two amino acids and stimulate the same cAMP/PKA pathway, they have divergent physiological functions. Vasopressin is involved in regulating blood pressure and the balance of water and sodium ions, whereas oxytocin plays an important role in the uterus during childbirth and in lactation.	297
320510	cd15388	7tmA_V2R	vasopressin receptor 2, member of the class A family of seven-transmembrane G protein-coupled receptors. The vasopressin type 2 receptor (V2R) is a G(s)-coupled receptor that controls the balance of water and sodium ions by regulating their reabsorption in the renal collecting duct. Mutations of V2R are responsible for nephrogenic diabetes insipidus. Vasopressin (also known as arginine vasopressin or anti-diuretic hormone) is synthesized in the hypothalamus and is released from the posterior pituitary gland. The actions of vasopressin are mediated by the interaction of this hormone with three receptor subtypes: V1aR, V1bR, and V2R. These subtypes differ in localization, function, and signaling pathways. Activation of V1aR and V1bR stimulates phospholipase C, while activation of V2R stimulates adenylate cyclase. Although vasopressin and oxytocin differ only by two amino acids and stimulate the same cAMP/PKA pathway, they have divergent physiological functions. Vasopressin is involved in regulating blood pressure and the balance of water and sodium ions, whereas oxytocin plays an important role in the uterus during childbirth and in lactation.	295
320511	cd15389	7tmA_GPR83	G protein-coupled receptor 83, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR83, also known as GPR72, is widely expressed in the brain, including hypothalamic nuclei that are involved in regulating energy balance and food intake. The hypothalamic expression of GPR83 is tightly regulated in response to nutrient availability and is decreased in obese mice. A recent study suggests that GPR83 has a critical role in the regulation of systemic energy metabolism via ghrelin-dependent and ghrelin-independent mechanisms. GPR83 shares significant amino acid sequence identity with the tachykinin receptors; however, its endogenous ligand is unknown.	285
320512	cd15390	7tmA_TACR	neurokinin receptors (or tachykinin receptors), member of the class A family of seven-transmembrane G protein-coupled receptors. This group represents G-protein coupled receptors for a variety of neuropeptides of the tachykinin (TK) family. The tachykinins are widely distributed throughout the mammalian central and peripheral nervous systems and act as excitatory transmitters on neurons and cells in the gastrointestinal tract. The TKs are characterized by a common five-amino acid C-terminal sequence, Phe-X-Gly-Leu-Met-NH2, where X is a hydrophobic residue. The three major mammalian tachykinins are substance P (SP), neurokinin A (NKA), and neurokinin B (NKB). The physiological actions of tachykinins are mediated through three types of receptors: neurokinin receptor type 1 (NK1R), NK2R, and NK3R. SP is a high-affinity endogenous ligand for NK1R, which interacts with the Gq protein and activates phospholipase C, leading to elevation of intracellular calcium. NK2R is a high-affinity receptor for NKA, the tachykinin neuropeptide also known as substance K. SP and NKA are found in the enteric nervous system and participate in the regulation of gastrointestinal motility, secretion, vascular permeability, and pain perception. NK3R is activated by its high-affinity ligand, NKB, which is primarily involved in the central nervous system and plays a critical role in the regulation of gonadotropin hormone release and the onset of puberty.	289
320513	cd15391	7tmA_NPR-like_invertebrate	invertebrate neuropeptide receptor-like, member of the class A family of seven-transmembrane G protein-coupled receptors. This subgroup includes a putative neuropeptide receptor found in invertebrates, which is a member of class A of 7-transmembrane G protein-coupled receptors. This orphan receptor shares significant amino acid sequence identity with the neurokinin 1 receptor (NK1R). The endogenous ligand for NK1R is substance P, an 11-amino acid peptide that functions as a vasodilator and neurotransmitter and is released from the autonomic sensory nerve fibers.	289
320514	cd15392	7tmA_PR4-like	neuropeptide Y receptor-like found in insect and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This subgroup includes a novel G protein-coupled receptor (also known as PR4 receptor) from Drosophila melanogaster, which can be activated by the members of the neuropeptide Y (NPY) family, including NPY, peptide YY (PYY) and pancreatic polypeptide (PP), when expressed in Xenopus oocytes. These homologous peptides of 36 amino acids in length contain a hairpin-like structural motif, which is referred to as the pancreatic polypeptide fold, and function as gastrointestinal hormones and neurotransmitters. The PR4 receptor also shares strong sequence homology with the mammalian tachykinin receptors (NK1R, NK2R, and NK3R), whose endogenous ligands are substance P (SP), neurokinin A (NKA), and neurokinin B (NKB), respectively. The tachykinins function as excitatory transmitters on neurons and cells in the gastrointestinal tract.	287
320515	cd15393	7tmA_leucokinin-like	leucokinin-like peptide receptor from tick and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This subgroup includes a leucokinin-like peptide receptor from the Southern cattle tick, Boophilus microplus, a pest of cattle world-wide. Leucokinins are invertebrate neuropeptides that exhibit myotropic and diuretic activity. This receptor is the first neuropeptide receptor known from the Acari and the second known in the subfamily of leucokinin-like peptide G-protein-coupled receptors. The other known leucokinin-like peptide receptor is a lymnokinin receptor from the mollusc Lymnaea stagnalis.	288
320516	cd15394	7tmA_PrRP_R	prolactin-releasing peptide receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. Prolactin-releasing peptide (PrRP) receptor (previously known as GPR10) is expressed in the central nervous system with the highest levels located in the anterior pituitary and is activated by its endogenous ligand PrRP, a neuropeptide possessing a C-terminal Arg-Phe-amide motif. There are two active isoforms of PrRP in mammals: one consists of 20 amino acids (PrRP-20) and the other consists of 31 amino acids (PrRP-31), where PrRP-20 is a C-terminal fragment of PrRP-31. Binding of PrRP to the receptor coupled to G(i/o) proteins activates the extracellular signal-regulated kinase (ERK), and the receptor can also couple to G(q) protein, leading to an increase in intracellular calcium and activation of c-Jun N-terminal protein kinase (JNK). The PrRP receptor shares significant sequence homology with the neuropeptide Y (NPY) receptor, and micromolar levels of NPY can bind and completely inhibit the PrRP-evoked intracellular calcium response in PrRP receptor-expressing cells, suggesting that the PrRP receptor shares a common ancestor with the NPY receptors. PrRP has been shown to reduce food intake and body weight and modify body temperature when administered in rats. It also has been shown to decrease circulating growth hormone levels by activating somatostatin-secreting neurons in the hypothalamic periventricular nucleus.	286
320517	cd15395	7tmA_NPY1R	neuropeptide Y receptor type 1, member of the class A family of seven-transmembrane G protein-coupled receptors. NPY is a 36-amino acid peptide neurotransmitter with a C-terminal tyrosine amide residue that is widely distributed in the brain and the autonomic nervous system of many mammalian species. NPY exerts its functions through five G-protein coupled receptor subtypes including NPY1R, NPY2R, NPY4R, NPY5R, and NPY6R; however, NPY6R is not functional in humans. NPY receptors are also activated by two other family members, peptide YY (PYY) and pancreatic polypeptide (PP). They typically couple to G(i) or G(o) proteins, which leads to a decrease in adenylate cyclase activity, thereby decreasing intracellular cAMP levels, and are involved in diverse physiological roles including appetite regulation, circadian rhythm, and anxiety. When NPY signals through NPY2R in concert with NPY5R, it induces angiogenesis and consequently plays an important role in revascularization and wound healing. On the other hand, when NPY acts through NPY1R and NPY5R, it acts as a vascular mitogen, leading to restenosis and atherosclerosis.	293
320518	cd15396	7tmA_NPY6R	neuropeptide Y receptor type 6, member of the class A family of seven-transmembrane G protein-coupled receptors. NPY is a 36-amino acid peptide neurotransmitter with a C-terminal tyrosine amide residue that is widely distributed in the brain and the autonomic nervous system of many mammalian species. NPY exerts its functions through five G-protein coupled receptor subtypes including NPY1R, NPY2R, NPY4R, NPY5R, and NPY6R; however, NPY6R is not functional in humans. NPY receptors are also activated by two other family members, peptide YY (PYY) and pancreatic polypeptide (PP). They typically couple to G(i) or G(o) proteins, which leads to a decrease in adenylate cyclase activity, thereby decreasing intracellular cAMP levels, and are involved in diverse physiological roles including appetite regulation, circadian rhythm, and anxiety.	293
320519	cd15397	7tmA_NPY4R	neuropeptide Y receptor type 4, member of the class A family of seven-transmembrane G protein-coupled receptors. NPY is a 36-amino acid peptide neurotransmitter with a C-terminal tyrosine amide residue that is widely distributed in the brain and the autonomic nervous system of many mammalian species. NPY exerts its functions through five G-protein coupled receptor subtypes including NPY1R, NPY2R, NPY4R, NPY5R, and NPY6R; however, NPY6R is not functional in humans. NPY receptors are also activated by two other family members, peptide YY (PYY) and pancreatic polypeptide (PP). They typically couple to G(i) or G(o) proteins, which leads to a decrease in adenylate cyclase activity, thereby decreasing intracellular cAMP levels, and are involved in diverse physiological roles including appetite regulation, circadian rhythm, and anxiety.	293
320520	cd15398	7tmA_NPY5R	neuropeptide Y receptor type 5, member of the class A family of seven-transmembrane G protein-coupled receptors. NPY is a 36-amino acid peptide neurotransmitter with a C-terminal tyrosine amide residue that is widely distributed in the brain and the autonomic nervous system of many mammalian species. NPY exerts its functions through five G-protein coupled receptor subtypes including NPY1R, NPY2R, NPY4R, NPY5R, and NPY6R; however, NPY6R is not functional in humans. NPY receptors are also activated by two other family members, peptide YY (PYY) and pancreatic polypeptide (PP). They typically couple to G(i) or G(o) proteins, which leads to a decrease in adenylate cyclase activity, thereby decreasing intracellular cAMP levels, and are involved in diverse physiological roles including appetite regulation, circadian rhythm, and anxiety. When NPY signals through NPY2R in concert with NPY5R, it induces angiogenesis and consequently plays an important role in revascularization and wound healing. On the other hand, when NPY acts through NPY1R and NPY5R, it acts as a vascular mitogen, leading to restenosis and atherosclerosis.	273
320521	cd15399	7tmA_NPY2R	neuropeptide Y receptor type 2, member of the class A family of seven-transmembrane G protein-coupled receptors. NPY is a 36-amino acid peptide neurotransmitter with a C-terminal tyrosine amide residue that is widely distributed in the brain and the autonomic nervous system of many mammalian species. NPY exerts its functions through five G-protein coupled receptor subtypes including NPY1R, NPY2R, NPY4R, NPY5R, and NPY6R; however, NPY6R is not functional in humans. NPY receptors are also activated by two other family members, peptide YY (PYY) and pancreatic polypeptide (PP). They typically couple to G(i) or G(o) proteins, which leads to a decrease in adenylate cyclase activity, thereby decreasing intracellular cAMP levels, and are involved in diverse physiological roles including appetite regulation, circadian rhythm, and anxiety. When NPY signals through NPY2R in concert with NPY5R, it induces angiogenesis and consequently plays an important role in revascularization and wound healing. On the other hand, when NPY acts through NPY1R and NPY5R, it acts as a vascular mitogen, leading to restenosis and atherosclerosis.	285
320522	cd15400	7tmA_Mel1B	melatonin receptor subtype 1B, member of the class A family of seven-transmembrane G protein-coupled receptors. Melatonin (N-acetyl-5-methoxytryptamine) is a naturally occurring sleep-promoting chemical found in vertebrates, invertebrates, bacteria, fungi, and plants. In mammals, melatonin is secreted by the pineal gland and is involved in regulation of circadian rhythms. Its production peaks during the nighttime, and is suppressed by light. Melatonin is shown to be synthesized in other organs and cells of many vertebrates, including the Harderian gland, leukocytes, skin, and the gastrointestinal (GI) tract, which contains several hundred times more melatonin than the pineal gland and is involved in the regulation of GI motility, inflammation, and sensation. Melatonin exerts its pleiotropic physiological effects through specific membrane receptors, named MT1A, MT1B, and MT1C, which belong to the class A rhodopsin-like G-protein coupled receptor family. MT1A and MT1B subtypes are present in mammals, whereas MT1C subtype has been found in amphibians and birds. The melatonin receptors couple to G proteins of the G(i/o) class, leading to the inhibition of adenylate cyclase.	279
320523	cd15401	7tmA_Mel1C	melatonin receptor subtype 1C, member of the class A family of seven-transmembrane G protein-coupled receptors. Melatonin (N-acetyl-5-methoxytryptamine) is a naturally occurring sleep-promoting chemical found in vertebrates, invertebrates, bacteria, fungi, and plants. In mammals, melatonin is secreted by the pineal gland and is involved in regulation of circadian rhythms. Its production peaks during the nighttime, and is suppressed by light. Melatonin is shown to be synthesized in other organs and cells of many vertebrates, including the Harderian gland, leukocytes, skin, and the gastrointestinal (GI) tract, which contains several hundred times more melatonin than the pineal gland and is involved in the regulation of GI motility, inflammation, and sensation. Melatonin exerts its pleiotropic physiological effects through specific membrane receptors, named MT1A, MT1B, and MT1C, which belong to the class A rhodopsin-like G-protein coupled receptor family. MT1A and MT1B subtypes are present in mammals, whereas MT1C subtype has been found in amphibians and birds. The melatonin receptors couple to G proteins of the G(i/o) class, leading to the inhibition of adenylate cyclase.	279
320524	cd15402	7tmA_Mel1A	melatonin receptor subtype 1A, member of the class A family of seven-transmembrane G protein-coupled receptors. Melatonin (N-acetyl-5-methoxytryptamine) is a naturally occurring sleep-promoting chemical found in vertebrates, invertebrates, bacteria, fungi, and plants. In mammals, melatonin is secreted by the pineal gland and is involved in regulation of circadian rhythms. Its production peaks during the nighttime, and is suppressed by light. Melatonin is shown to be synthesized in other organs and cells of many vertebrates, including the Harderian gland, leukocytes, skin, and the gastrointestinal (GI) tract, which contains several hundred times more melatonin than the pineal gland and is involved in the regulation of GI motility, inflammation, and sensation. Melatonin exerts its pleiotropic physiological effects through specific membrane receptors, named MT1A, MT1B, and MT1C, which belong to the class A rhodopsin-like G-protein coupled receptor family. MT1A and MT1B subtypes are present in mammals, whereas MT1C subtype has been found in amphibians and birds. The melatonin receptors couple to G proteins of the G(i/o) class, leading to the inhibition of adenylate cyclase.	279
320525	cd15403	7tmA_GPR45	G protein-coupled receptor 45, member of the class A family of seven-transmembrane G protein-coupled receptors. This subgroup includes the human orphan receptor GPR45 and closely related proteins found in vertebrates. GPR45 is also called PSP24 in Xenopus and PSP24-alpha (or PSP24-1) in mammals. GPR45 shows the highest sequence homology with GPR63 (PSP24-beta, or PSP24-2). PSP24 was originally identified as a novel, high-affinity lysophosphatidic acid (LPA) receptor in Xenopus laevis oocytes; however, PSP24 receptors (GPR45 and GPR63) have not been shown to be activated by LPA. Mammalian PSP24 receptors are highly expressed in neuronal cells of the cerebellum and their expression level remains constant from the early embryonic stages to adulthood, suggesting an important role of PSP24s in brain neuronal functions. Members of this subgroup contain the highly conserved Asp-Arg-Tyr/Phe (DRY/F) motif found in the third transmembrane helix (TM3) of the rhodopsin-like class A receptors, which is important for efficient G protein-coupled signal transduction. All GPCRs have a common structural architecture comprising seven transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	301
320526	cd15404	7tmA_GPR63	G protein-coupled receptor 63, member of the class A family of seven-transmembrane G protein-coupled receptors. This subgroup includes the human orphan receptor GPR63, which is also called PSP24-beta or PSP24-2, and its closely related proteins found in vertebrates. GPR63 shares the highest sequence homology with GPR45 (Xenopus PSP24, mammalian PSP24-alpha or PSP24-1). PSP24 was originally identified as a novel, high-affinity lysophosphatidic acid (LPA) receptor in Xenopus laevis oocytes; however, PSP24 receptors (GPR45 and GPR63) have not been shown to be activated by LPA. Mammalian PSP24 receptors are highly expressed in neuronal cells of the cerebellum and their expression level remains constant from the early embryonic stages to adulthood, suggesting an important role of PSP24s in brain neuronal functions. Members of this subgroup contain the highly conserved Asp-Arg-Tyr/Phe (DRY/F) motif found in the third transmembrane helix (TM3) of the rhodopsin-like class A receptors, which is important for efficient G protein-coupled signal transduction. All GPCRs have a common structural architecture comprising seven transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	265
320527	cd15405	7tmA_OR8B-like	olfactory receptor subfamily 8B and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 8B and related proteins in other mammals. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	277
320528	cd15406	7tmA_OR8D-like	olfactory receptor subfamily 8D and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 8D and related proteins in other mammals. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	290
320529	cd15407	7tmA_OR5B-like	olfactory receptor subfamily 5B and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 5B and related proteins in other mammals. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	279
320530	cd15408	7tmA_OR5AK3-like	olfactory receptor subfamily 5AK3, 5AU1, and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 5AK3, 5AU1, and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	287
320531	cd15409	7tmA_OR5H-like	olfactory receptor subfamily 5H and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 5H, 5K, 5AC, 5T and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	279
320532	cd15410	7tmA_OR5D-like	olfactory receptor subfamily 5D and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 5D, 5L, 5W, and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	294
320533	cd15411	7tmA_OR8H-like	olfactory receptor subfamily 8H and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 8H, 8I, 5F and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	279
320534	cd15412	7tmA_OR5M-like	olfactory receptor subfamily 5M and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 5M and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	279
320535	cd15413	7tmA_OR8K-like	olfactory receptor subfamily 8K and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 8K, 8U, 8J, 5R, 5AL and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	279
320536	cd15414	7tmA_OR5G-like	olfactory receptor subfamily 5G and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 5G and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	285
320537	cd15415	7tmA_OR5J-like	olfactory receptor subfamily 5J and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 5J and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	279
320538	cd15416	7tmA_OR5P-like	olfactory receptor subfamily 5P and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 5P and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	279
320539	cd15417	7tmA_OR5A1-like	olfactory receptor subfamily 5A1 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 5A1, 5A2, 5AN1, and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	279
320540	cd15418	7tmA_OR9G-like	olfactory receptor subfamily 9G and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 9G and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	281
320541	cd15419	7tmA_OR9K2-like	olfactory receptor subfamily 9K2 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes transmembrane olfactory receptor subfamily 9K2 and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	279
320542	cd15420	7tmA_OR2A-like	olfactory receptor subfamily 2A and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 2A and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	277
320543	cd15421	7tmA_OR2T-like	olfactory receptor subfamily 2T and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamilies 2T, 2M, 2L, 2V, 2Z, 2AE, 2AG, 2AK, 2AJ, and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	277
320544	cd15424	7tmA_OR2_unk	olfactory receptor family 2, unknown subfamily, member of the class A family of seven-transmembrane G protein-coupled receptors. This group represents an unknown subfamily, conserved in some mammals and sauropsids, in family 2 of olfactory receptors. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	277
320545	cd15428	7tmA_OR2D-like	olfactory receptor subfamily 2D and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 2D and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	277
320546	cd15429	7tmA_OR2F-like	olfactory receptor subfamily 2F and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 2F and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	277
320547	cd15430	7tmA_OR13-like	olfactory receptor family 13 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor family 13 (subfamilies 13C, 13D, 13F, and 13J), some subfamilies from OR family 2 (2K and 2S), and related proteins in other mammals. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	270
320548	cd15431	7tmA_OR13H-like	olfactory receptor subfamily 13H and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 13H and related proteins in other mammals, sauropsids, and amphibians. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	269
320549	cd15432	7tmA_OR2B2-like	olfactory receptor subfamily 2B2 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes transmembrane olfactory receptor subfamily 2B2 and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	277
320550	cd15433	7tmA_OR2Y-like	olfactory receptor subfamily 2Y and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamilies 2Y and 2I, and related proteins in other mammals. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain. A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	277
320551	cd15434	7tmA_OR2W-like	olfactory receptor subfamily 2W and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 2W and related proteins in other mammals. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	277
320552	cd15436	7tmB2_Latrophilin	Latrophilins, member of the class B2 family of seven-transmembrane G protein-coupled receptors. Latrophilins (also called lectomedins or latrotoxin receptors) belong to Group I adhesion GPCRs, which also include ETL (EGF-TM7-latrophilin-related protein). These receptors are members of the adhesion family (subclass B2) that belongs to the class B GPCRs. Three subtypes of latrophilins have been identified: LPH1 (latrophilin-1), LPH2, and LPH3. Latrophilin-1 is a brain-specific calcium-independent receptor of alpha-latrotoxin, a potent presynaptic neurotoxin from the venom of the black widow spider that induces massive neurotransmitter release from sensory and motor neurons as well as endocrine cells, leading to nerve-terminal degeneration. Latrophilin-2 and -3, although sharing strong sequence homology to latrophilin-1, do not bind alpha-latrotoxin. While latrophilin-3 is also brain specific, latrophilin-2 is ubiquitously distributed. The endogenous ligands for these two receptors are unknown. ETL, a seven-transmembrane receptor containing EGF-like repeats, is highly expressed in heart, where it is developmentally regulated, as well as in normal smooth muscle cells. The function of ETL is unknown. All adhesion GPCRs possess large N-terminal extracellular domains containing multiple structural motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, coupled to a seven-transmembrane domain. In addition, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions.	258
320553	cd15437	7tmB2_ETL	Epidermal Growth Factor, latrophilin and seven transmembrane domain-containing protein 1; member of the class B2 family of seven-transmembrane G protein-coupled receptors. ETL (EGF-TM7-latrophilin-related protein) belongs to Group I adhesion GPCRs, which also include latrophilins (also called lectomedins or latrotoxin receptors). All adhesion GPCRs possess large N-terminal extracellular domains containing multiple structural motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, coupled to a seven-transmembrane domain. ETL, for instance, contains EGF-like repeats, which are also present in other EGF-TM7 adhesion GPCRs, such as Cadherin EGF LAG seven-pass G-type receptors (CELSR1-3), EGF-like module receptors (EMR1-3), CD97, and Flamingo. ETL is highly expressed in heart, where it is developmentally regulated, as well as in normal smooth muscle cells. Furthermore, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions.	258
320554	cd15438	7tmB2_CD97	CD97 antigen, member of the class B2 family of seven-transmembrane G protein-coupled receptors. Group II adhesion GPCRs, including the leukocyte cell-surface antigen CD97 and the epidermal growth factor (EGF)-module-containing, mucin-like hormone receptors (EMR1-4), are primarily expressed in cells of the immune system. All EGF-TM7 receptors, which belong to the B2 subfamily of adhesion GPCRs, are members of group II, except for ETL (EGF-TM7-latrophilin related protein), which is classified into group I. Members of the EGF-TM7 receptors are characterized by the presence of varying numbers of N-terminal EGF-like domains, which play critical roles in ligand recognition and cell adhesion, linked by a stalk region to a class B seven-transmembrane domain. In the case of CD97, alternative splicing results in three isoforms possessing either three (EGF1,2,5), four (EGF1,2,3,5) or five (EGF1,2,3,4,5) EGF-like domains. Moreover, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. For example, CD97, which is involved in angiogenesis and the migration and invasion of tumor cells, has been shown to promote cell aggregation in a GPS proteolysis-dependent manner. CD97 is widely expressed on lymphocytes, monocytes, macrophages, dendritic cells, granulocytes and smooth muscle cells as well as in a variety of human tumors including colorectal, gastric, esophageal, pancreatic, and thyroid carcinoma. EMR2 shares strong sequence homology with CD97, differing by only six amino acids. However, unlike CD97, EMR2 is not found in CD97-positive tumor cells and is not expressed on lymphocytes but instead on monocytes, macrophages and granulocytes. CD97 has three known ligands: CD55, the decay-accelerating factor for regulation of the complement system; chondroitin sulfate, a glycosaminoglycan found in the extracellular matrix; and the integrin alpha5beta1, which plays a role in angiogenesis. Although EMR2 does not effectively interact with CD55, the fourth EGF-like domain of this receptor binds to chondroitin sulfate to mediate cell attachment.	261
320555	cd15439	7tmB2_EMR	epidermal growth factor-like module-containing mucin-like hormone receptors, member of the class B2 family of seven-transmembrane G protein-coupled receptors. Group II adhesion GPCRs, including the epidermal growth factor (EGF)-module-containing, mucin-like hormone receptors (EMR1-4) and the leukocyte cell-surface antigen CD97, are primarily expressed in cells of the immune system. All EGF-TM7 receptors, which belong to the B2 subfamily of adhesion GPCRs, are members of group II, except for ETL (EGF-TM7-latrophilin related protein), which is classified into group I. Members of the EGF-TM7 receptors are characterized by the presence of varying numbers of N-terminal EGF-like domains, which play critical roles in ligand recognition and cell adhesion, linked by a stalk region to a class B seven-transmembrane domain. In the case of EMR2, alternative splicing results in four isoforms possessing either two (EGF1,2), three (EGF1,2,5), four (EGF1,2,3,5) or five (EGF1,2,3,4,5) EGF-like domains. Moreover, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. EMR2 shares strong sequence homology with CD97, differing by only six amino acids. CD97 is widely expressed on lymphocytes, monocytes, macrophages, dendritic cells, granulocytes and smooth muscle cells as well as in a variety of human tumors including colorectal, gastric, esophageal, pancreatic, and thyroid carcinoma. However, unlike CD97, EMR2 is not found in CD97-positive tumor cells and is not expressed on lymphocytes but instead on monocytes, macrophages and granulocytes. CD97 has three known ligands: CD55, the decay-accelerating factor for regulation of the complement system; chondroitin sulfate, a glycosaminoglycan found in the extracellular matrix; and the integrin alpha5beta1, which plays a role in angiogenesis. Although EMR2 does not effectively interact with CD55, the fourth EGF-like domain of this receptor binds to chondroitin sulfate to mediate cell attachment.	263
320556	cd15440	7tmB2_latrophilin-like_invertebrate	invertebrate latrophilin-like receptors, member of the class B2 family of seven-transmembrane G protein-coupled receptors. This subgroup includes latrophilin-like proteins that are found in invertebrates such as insects and worms. Latrophilins (also called lectomedins or latrotoxin receptors) belong to Group I adhesion GPCRs, which also include ETL (EGF-TM7-latrophilin-related protein). These receptors are members of the adhesion family (subclass B2) that belongs to the class B GPCRs. Three subtypes of vertebrate latrophilins have been identified: LPH1 (latrophilin-1), LPH2, and LPH3. Latrophilin-1 is a brain-specific calcium-independent receptor of alpha-latrotoxin, a potent presynaptic neurotoxin from the venom of the black widow spider that induces massive neurotransmitter release from sensory and motor neurons as well as endocrine cells, leading to nerve-terminal degeneration. Latrophilin-2 and -3, although sharing strong sequence homology to latrophilin-1, do not bind alpha-latrotoxin. While latrophilin-3 is also brain specific, latrophilin-2 is ubiquitously distributed. The endogenous ligands for these two receptors are unknown. ETL, a seven-transmembrane receptor containing EGF-like repeats, is highly expressed in heart, where it is developmentally regulated, as well as in normal smooth muscle cells. The function of ETL is unknown. All adhesion GPCRs possess large N-terminal extracellular domains containing multiple structural motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, coupled to a seven-transmembrane domain. In addition, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions.	259
320557	cd15441	7tmB2_CELSR_Adhesion_IV	cadherin EGF LAG seven-pass G-type receptors, group IV adhesion GPCRs, member of the class B2 family of seven-transmembrane G protein-coupled receptors. The group IV adhesion GPCRs include the cadherin EGF LAG seven-pass G-type receptors (CELSRs) and their Drosophila homolog Flamingo (also known as Starry night). These receptors are also classified as belonging to the EGF-TM7 group of subfamily B2 adhesion GPCRs, because they contain EGF-like domains. Functionally, the group IV receptors act as key regulators of many physiological processes such as endocrine cell differentiation, neuronal migration, dendrite growth, axon guidance, lymphatic vessel and valve formation, and planar cell polarity (PCP) during embryonic development. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, that are coupled to a class B seven-transmembrane domain. In the case of CELSR/Flamingo/Starry night, their extracellular domains comprise nine cadherin repeats linked to a series of epidermal growth factor (EGF)-like and laminin globular (G)-like domains. The cadherin repeats contain sequence motifs that mediate calcium-dependent cell-cell adhesion by homophilic interactions. Moreover, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. Three mammalian orthologs of Flamingo, Celsr1-3, are widely expressed in the nervous system from embryonic development until the adult stage. Each Celsr exhibits different expression patterns in the developing brain, suggesting that they serve distinct functions. Mutations of CELSR1 cause neural tube defects in the nervous system, while mutations of CELSR2 are associated with coronary heart disease. Moreover, CELSR1 and several other PCP signaling molecules, such as dishevelled, prickle, and frizzled, have been shown to be upregulated in B lymphocytes of chronic lymphocytic leukemia patients. Celsr3 is expressed in both the developing and adult mouse brain. It has been functionally implicated in proper neuron migration and axon guidance in the CNS.	254
320558	cd15442	7tmB2_GPR97	orphan adhesion receptor GPR97, member of the class B2 family of seven-transmembrane G protein-coupled receptors. GPR97 is an orphan receptor that has been classified into Group VIII of adhesion GPCRs. Other members of Group VIII include GPR56, GPR64, GPR112, GPR114, and GPR126. GPR97 is identified as a lymphatic adhesion receptor that is specifically expressed in lymphatic endothelium, but not in blood vascular endothelium, and is shown to regulate migration of lymphatic endothelial cells via the small GTPases RhoA and cdc42. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, that are coupled to a class B seven-transmembrane domain. Furthermore, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions.	277
320559	cd15443	7tmB2_GPR114	orphan adhesion receptor GPR114, member of the class B2 family of seven-transmembrane G protein-coupled receptors. GPR114 is an orphan receptor that has been classified as belonging to Group VIII of adhesion GPCRs. Other members of Group VIII include GPR56, GPR64, GPR97, GPR112, and GPR126. GPR114 is mainly found in granulocytes (polymorphonuclear leukocytes), and GPR114-transfected cells induced an increase in cAMP levels via coupling to G(s) protein. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, that are coupled to a class B seven-transmembrane domain. Furthermore, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR autoproteolysis-inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions.	268
320560	cd15444	7tmB2_GPR64	orphan adhesion receptor GPR64 and related proteins, member of subfamily B2 of the class B secretin-like receptors of seven-transmembrane G protein-coupled receptors. GPR64 is an orphan receptor that has been classified as belonging to Group VIII of adhesion GPCRs. Other members of Group VIII include orphan GPCRs such as GPR56, GPR97, GPR112, GPR114, and GPR126. GPR64 is mainly expressed in the epididymis of the male reproductive tract, and targeted deletion of GPR64 causes sperm stasis and efferent duct blockage due to abnormal fluid reabsorption, resulting in male infertility. GPR64 is also over-expressed in Ewing's sarcoma (ES), as well as upregulated in other carcinomas from kidney, prostate or lung, and promotes invasiveness and metastasis in ES via the upregulation of placental growth factor (PGF) and matrix metalloproteinase (MMP) 1. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, that are coupled to a class B seven-transmembrane domain. Furthermore, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR autoproteolysis-inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions.	271
320561	cd15445	7tmB1_CRF-R1	corticotropin-releasing factor receptor 1, member of the class B family of seven-transmembrane G protein-coupled receptors. The vertebrate corticotropin-releasing factor (CRF) receptors are predominantly expressed in the central nervous system with high levels in cortex tissue, brain stem, and pituitary. They have two isoforms as a result of alternative splicing of the same receptor gene: CRF-R1 and CRF-R2, which differ in tissue distribution and ligand binding affinities. Recently, a third CRF receptor (CRF-R3) has been identified in catfish pituitary. The catfish CRF-R1 is highly homologous to CRF-R3. CRF is a 41-amino acid neuropeptide that plays a central role in coordinating neuroendocrine, behavioral, and autonomic responses to stress by acting as the primary neuroregulator of the hypothalamic-pituitary-adrenal axis, which controls the levels of cortisol and other stress-related hormones. In addition, the CRF family of neuropeptides also includes structurally related peptides such as mammalian urocortin, fish urotensin I, and frog sauvagine. The actions of CRF and CRF-related peptides are mediated through specific binding to CRF-R1 and CRF-R2. CRF and urocortin 1 bind and activate mammalian CRF-R1 with similar high affinities. By contrast, urocortin 2 and urocortin 3 do not bind to CRF-R1 or stimulate CRF-R1-mediated cAMP formation. Urocortin 1 also shows high affinity for mammalian CRF-R2, whereas CRF has significantly lower affinity for this receptor. This evidence suggests that urocortin 1 is an endogenous ligand for CRF-R1 and CRF-R2. The CRF receptors are members of the B1 subfamily of class B GPCRs, also referred to as the secretin-like receptor family, which includes receptors for polypeptide hormones of 27-141 amino-acid residues such as secretin, glucagon, glucagon-like peptide (GLP), calcitonin gene-related peptide, and parathyroid hormone (PTH). 
These receptors contain a large N-terminal extracellular domain (ECD), which plays a critical role in hormone recognition by binding to the C-terminal portion of the peptide. On the other hand, the N-terminal segment of the hormone induces receptor activation by interacting with the receptor transmembrane domains and connecting extracellular loops, triggering intracellular signaling pathways. All members of the B1 subfamily preferentially couple to G proteins of the G(s) family, which positively stimulate adenylate cyclase, leading to increased intracellular cAMP formation and calcium influx. However, depending on their cellular location and function, CRF receptors can activate multiple G proteins, which can in turn stimulate different second messenger pathways.	265
320562	cd15446	7tmB1_CRF-R2	corticotropin-releasing factor receptor 2, member of the class B family of seven-transmembrane G protein-coupled receptors. The vertebrate corticotropin-releasing factor (CRF) receptors are predominantly expressed in the central nervous system with high levels in cortex tissue, brain stem, and pituitary. They have two isoforms as a result of alternative splicing of the same receptor gene: CRF-R1 and CRF-R2, which differ in tissue distribution and ligand binding affinities. Recently, a third CRF receptor (CRF-R3) has been identified in catfish pituitary. The catfish CRF-R1 is highly homologous to CRF-R3. CRF is a 41-amino acid neuropeptide that plays a central role in coordinating neuroendocrine, behavioral, and autonomic responses to stress by acting as the primary neuroregulator of the hypothalamic-pituitary-adrenal axis, which controls the levels of cortisol and other stress-related hormones. In addition, the CRF family of neuropeptides also includes structurally related peptides such as mammalian urocortin, fish urotensin I, and frog sauvagine. The actions of CRF and CRF-related peptides are mediated through specific binding to CRF-R1 and CRF-R2. CRF and urocortin 1 bind and activate mammalian CRF-R1 with similar high affinities. By contrast, urocortin 2 and urocortin 3 do not bind to CRF-R1 or stimulate CRF-R1-mediated cAMP formation. Urocortin 1 also shows high affinity for mammalian CRF-R2, whereas CRF has significantly lower affinity for this receptor. This evidence suggests that urocortin 1 is an endogenous ligand for CRF-R1 and CRF-R2. The CRF receptors are members of the B1 subfamily of class B GPCRs, also referred to as the secretin-like receptor family, which includes receptors for polypeptide hormones of 27-141 amino-acid residues such as secretin, glucagon, glucagon-like peptide (GLP), calcitonin gene-related peptide, and parathyroid hormone (PTH). 
These receptors contain a large N-terminal extracellular domain (ECD), which plays a critical role in hormone recognition by binding to the C-terminal portion of the peptide. On the other hand, the N-terminal segment of the hormone induces receptor activation by interacting with the receptor transmembrane domains and connecting extracellular loops, triggering intracellular signaling pathways. All members of the B1 subfamily preferentially couple to G proteins of the G(s) family, which positively stimulate adenylate cyclase, leading to increased intracellular cAMP formation and calcium influx. However, depending on their cellular location and function, CRF receptors can activate multiple G proteins, which can in turn stimulate different second messenger pathways.	264
320563	cd15447	7tmC_mGluR2	metabotropic glutamate receptor 2 in group 2, member of the class C family of seven-transmembrane G protein-coupled receptors. The metabotropic glutamate receptors (mGluRs) in group 2 include mGluR2 and mGluR3. They are homodimeric class C G protein-coupled receptors which are activated by glutamate, the major excitatory neurotransmitter of the CNS. mGluRs are involved in regulating neuronal excitability and synaptic transmission via intracellular activation of second messenger signaling pathways. While the ionotropic glutamate receptor subtypes (AMPA, NMDA, and kainate) mediate fast excitatory postsynaptic transmission, mGluRs are known to mediate slower excitatory postsynaptic responses and to be involved in synaptic plasticity in the mammalian brain. In addition to seven-transmembrane helices, the class C GPCRs are characterized by a large N-terminal extracellular Venus flytrap-like domain, which is composed of two adjacent lobes separated by a cleft which binds an endogenous ligand. Moreover, they exist as either homo- or heterodimers, which are essential for their function. For instance, mGluRs form homodimers via interactions between the N-terminal Venus flytrap domains and the intermolecular disulphide bonds between cysteine residues located in the cysteine-rich domain (CRD). At least eight different subtypes of metabotropic receptors (mGluR1-8) have been identified and further classified into three groups based on their sequence homology, pharmacological properties, and signaling pathways. Group 1 (mGluR1 and mGluR5) receptors are predominantly located postsynaptically on neurons and are involved in long-term synaptic plasticity in the brain, including long-term potentiation (LTP) in the hippocampus and long-term depression (LTD) in the cerebellum. 
They are coupled to G(q/11) proteins, thereby activating phospholipase C to generate inositol-1,4,5-trisphosphate (IP3) and diacylglycerol (DAG), which in turn lead to Ca2+ release and protein kinase C activation, respectively. Group 1 mGluR expression is shown to be strongly upregulated in animal models of epilepsy, brain injury, and inflammatory and neuropathic pain, as well as in patients with amyotrophic lateral sclerosis or multiple sclerosis. Group 2 (mGluR2 and mGluR3) and 3 (mGluR4, mGluR6, mGluR7, and mGluR8) receptors are predominantly localized presynaptically in the active region of neurotransmitter release. They are coupled to G(i/o) proteins, which leads to inhibition of adenylate cyclase activity and cAMP formation, and consequently to a decrease in protein kinase A (PKA) activity. Ultimately, activation of these receptors leads to inhibition of the release of neurotransmitters such as glutamate and GABA via inhibition of Ca2+ channels and activation of K+ channels. Furthermore, while activation of Group 1 mGluRs increases NMDA (N-methyl-D-aspartate) receptor activity and risk of neurotoxicity, Group 2 and 3 mGluRs decrease NMDA receptor activity and prevent neurotoxicity.	254
320564	cd15448	7tmC_mGluR3	metabotropic glutamate receptor 3 in group 2, member of the class C family of seven-transmembrane G protein-coupled receptors. The metabotropic glutamate receptors (mGluRs) in group 2 include mGluR2 and mGluR3. They are homodimeric class C G protein-coupled receptors which are activated by glutamate, the major excitatory neurotransmitter of the CNS. mGluRs are involved in regulating neuronal excitability and synaptic transmission via intracellular activation of second messenger signaling pathways. While the ionotropic glutamate receptor subtypes (AMPA, NMDA, and kainate) mediate fast excitatory postsynaptic transmission, mGluRs are known to mediate slower excitatory postsynaptic responses and to be involved in synaptic plasticity in the mammalian brain. In addition to seven-transmembrane helices, the class C GPCRs are characterized by a large N-terminal extracellular Venus flytrap-like domain, which is composed of two adjacent lobes separated by a cleft which binds an endogenous ligand. Moreover, they exist as either homo- or heterodimers, which are essential for their function. For instance, mGluRs form homodimers via interactions between the N-terminal Venus flytrap domains and the intermolecular disulphide bonds between cysteine residues located in the cysteine-rich domain (CRD). At least eight different subtypes of metabotropic receptors (mGluR1-8) have been identified and further classified into three groups based on their sequence homology, pharmacological properties, and signaling pathways. Group 1 (mGluR1 and mGluR5) receptors are predominantly located postsynaptically on neurons and are involved in long-term synaptic plasticity in the brain, including long-term potentiation (LTP) in the hippocampus and long-term depression (LTD) in the cerebellum. 
They are coupled to G(q/11) proteins, thereby activating phospholipase C to generate inositol-1,4,5-trisphosphate (IP3) and diacylglycerol (DAG), which in turn lead to Ca2+ release and protein kinase C activation, respectively. Group 1 mGluR expression is shown to be strongly upregulated in animal models of epilepsy, brain injury, and inflammatory and neuropathic pain, as well as in patients with amyotrophic lateral sclerosis or multiple sclerosis. Group 2 (mGluR2 and mGluR3) and 3 (mGluR4, mGluR6, mGluR7, and mGluR8) receptors are predominantly localized presynaptically in the active region of neurotransmitter release. They are coupled to G(i/o) proteins, which leads to inhibition of adenylate cyclase activity and cAMP formation, and consequently to a decrease in protein kinase A (PKA) activity. Ultimately, activation of these receptors leads to inhibition of the release of neurotransmitters such as glutamate and GABA via inhibition of Ca2+ channels and activation of K+ channels. Furthermore, while activation of Group 1 mGluRs increases NMDA (N-methyl-D-aspartate) receptor activity and risk of neurotoxicity, Group 2 and 3 mGluRs decrease NMDA receptor activity and prevent neurotoxicity.	254
320565	cd15449	7tmC_mGluR1	metabotropic glutamate receptor 1 in group 1, member of the class C family of seven-transmembrane G protein-coupled receptors. Group 1 mGluRs include mGluR1 and mGluR5, as well as their closely related invertebrate receptors. They are homodimeric class C G protein-coupled receptors which are activated by glutamate, the major excitatory neurotransmitter of the CNS. mGluRs are involved in regulating neuronal excitability and synaptic transmission via intracellular activation of second messenger signaling pathways. While the ionotropic glutamate receptor subtypes (AMPA, NMDA, and kainate) mediate fast excitatory postsynaptic transmission, mGluRs are known to mediate slower excitatory postsynaptic responses and to be involved in synaptic plasticity in the mammalian brain. In addition to seven-transmembrane helices, the class C GPCRs are characterized by a large N-terminal extracellular Venus flytrap-like domain, which is composed of two adjacent lobes separated by a cleft which binds an endogenous ligand. Moreover, they exist as either homo- or heterodimers, which are essential for their function. For instance, mGluRs form homodimers via interactions between the N-terminal Venus flytrap domains and the intermolecular disulphide bonds between cysteine residues located in the cysteine-rich domain (CRD). At least eight different subtypes of metabotropic receptors (mGluR1-8) have been identified and further classified into three groups based on their sequence homology, pharmacological properties, and signaling pathways. Group 1 (mGluR1 and mGluR5) receptors are predominantly located postsynaptically on neurons and are involved in long-term synaptic plasticity in the brain, including long-term potentiation (LTP) in the hippocampus and long-term depression (LTD) in the cerebellum. 
They are coupled to G(q/11) proteins, thereby activating phospholipase C to generate inositol-1,4,5-trisphosphate (IP3) and diacylglycerol (DAG), which in turn lead to Ca2+ release and protein kinase C activation, respectively. Group 1 mGluR expression is shown to be strongly upregulated in animal models of epilepsy, brain injury, and inflammatory and neuropathic pain, as well as in patients with amyotrophic lateral sclerosis or multiple sclerosis. Group 2 (mGluR2 and mGluR3) and 3 (mGluR4, mGluR6, mGluR7, and mGluR8) receptors are predominantly localized presynaptically in the active region of neurotransmitter release. They are coupled to G(i/o) proteins, which leads to inhibition of adenylate cyclase activity and cAMP formation, and consequently to a decrease in protein kinase A (PKA) activity. Ultimately, activation of these receptors leads to inhibition of the release of neurotransmitters such as glutamate and GABA via inhibition of Ca2+ channels and activation of K+ channels. Furthermore, while activation of Group 1 mGluRs increases NMDA (N-methyl-D-aspartate) receptor activity and risk of neurotoxicity, Group 2 and 3 mGluRs decrease NMDA receptor activity and prevent neurotoxicity.	250
320566	cd15450	7tmC_mGluR5	metabotropic glutamate receptor 5 in group 1, member of the class C family of seven-transmembrane G protein-coupled receptors. Group 1 mGluRs include mGluR1 and mGluR5, as well as their closely related invertebrate receptors. They are homodimeric class C G protein-coupled receptors which are activated by glutamate, the major excitatory neurotransmitter of the CNS. mGluRs are involved in regulating neuronal excitability and synaptic transmission via intracellular activation of second messenger signaling pathways. While the ionotropic glutamate receptor subtypes (AMPA, NMDA, and kainate) mediate fast excitatory postsynaptic transmission, mGluRs are known to mediate slower excitatory postsynaptic responses and to be involved in synaptic plasticity in the mammalian brain. In addition to seven-transmembrane helices, the class C GPCRs are characterized by a large N-terminal extracellular Venus flytrap-like domain, which is composed of two adjacent lobes separated by a cleft which binds an endogenous ligand. Moreover, they exist as either homo- or heterodimers, which are essential for their function. For instance, mGluRs form homodimers via interactions between the N-terminal Venus flytrap domains and the intermolecular disulphide bonds between cysteine residues located in the cysteine-rich domain (CRD). At least eight different subtypes of metabotropic receptors (mGluR1-8) have been identified and further classified into three groups based on their sequence homology, pharmacological properties, and signaling pathways. Group 1 (mGluR1 and mGluR5) receptors are predominantly located postsynaptically on neurons and are involved in long-term synaptic plasticity in the brain, including long-term potentiation (LTP) in the hippocampus and long-term depression (LTD) in the cerebellum. 
They are coupled to G(q/11) proteins, thereby activating phospholipase C to generate inositol-1,4,5-trisphosphate (IP3) and diacylglycerol (DAG), which in turn lead to Ca2+ release and protein kinase C activation, respectively. Group 1 mGluR expression is shown to be strongly upregulated in animal models of epilepsy, brain injury, and inflammatory and neuropathic pain, as well as in patients with amyotrophic lateral sclerosis or multiple sclerosis. Group 2 (mGluR2 and mGluR3) and 3 (mGluR4, mGluR6, mGluR7, and mGluR8) receptors are predominantly localized presynaptically in the active region of neurotransmitter release. They are coupled to G(i/o) proteins, which leads to inhibition of adenylate cyclase activity and cAMP formation, and consequently to a decrease in protein kinase A (PKA) activity. Ultimately, activation of these receptors leads to inhibition of the release of neurotransmitters such as glutamate and GABA via inhibition of Ca2+ channels and activation of K+ channels. Furthermore, while activation of Group 1 mGluRs increases NMDA (N-methyl-D-aspartate) receptor activity and risk of neurotoxicity, Group 2 and 3 mGluRs decrease NMDA receptor activity and prevent neurotoxicity.	250
320567	cd15451	7tmC_mGluR7	metabotropic glutamate receptor 7 in group 3, member of the class C family of seven-transmembrane G protein-coupled receptors. The receptors in group 3 include mGluRs 4, 6, 7, and 8. They are homodimeric class C G protein-coupled receptors which are activated by glutamate, the major excitatory neurotransmitter of the CNS. mGluRs are involved in regulating neuronal excitability and synaptic transmission via intracellular activation of second messenger signaling pathways. While the ionotropic glutamate receptor subtypes (AMPA, NMDA, and kainate) mediate fast excitatory postsynaptic transmission, mGluRs are known to mediate slower excitatory postsynaptic responses and to be involved in synaptic plasticity in the mammalian brain. In addition to seven-transmembrane helices, the class C GPCRs are characterized by a large N-terminal extracellular Venus flytrap-like domain, which is composed of two adjacent lobes separated by a cleft which binds an endogenous ligand. Moreover, they exist as either homo- or heterodimers, which are essential for their function. For instance, mGluRs form homodimers via interactions between the N-terminal Venus flytrap domains and the intermolecular disulphide bonds between cysteine residues located in the cysteine-rich domain (CRD). At least eight different subtypes of metabotropic receptors (mGluR1-8) have been identified and further classified into three groups based on their sequence homology, pharmacological properties, and signaling pathways. Group 1 (mGluR1 and mGluR5) receptors are predominantly located postsynaptically on neurons and are involved in long-term synaptic plasticity in the brain, including long-term potentiation (LTP) in the hippocampus and long-term depression (LTD) in the cerebellum. 
They are coupled to G(q/11) proteins, thereby activating phospholipase C to generate inositol-1,4,5-trisphosphate (IP3) and diacylglycerol (DAG), which in turn lead to Ca2+ release and protein kinase C activation, respectively. Group 1 mGluR expression is shown to be strongly upregulated in animal models of epilepsy, brain injury, and inflammatory and neuropathic pain, as well as in patients with amyotrophic lateral sclerosis or multiple sclerosis. Group 2 (mGluR2 and mGluR3) and 3 (mGluR4, mGluR6, mGluR7, and mGluR8) receptors are predominantly localized presynaptically in the active region of neurotransmitter release. They are coupled to G(i/o) proteins, which leads to inhibition of adenylate cyclase activity and cAMP formation, and consequently to a decrease in protein kinase A (PKA) activity. Ultimately, activation of these receptors leads to inhibition of the release of neurotransmitters such as glutamate and GABA via inhibition of Ca2+ channels and activation of K+ channels. Furthermore, while activation of Group 1 mGluRs increases NMDA (N-methyl-D-aspartate) receptor activity and risk of neurotoxicity, Group 2 and 3 mGluRs decrease NMDA receptor activity and prevent neurotoxicity.	307
320568	cd15452	7tmC_mGluR4	metabotropic glutamate receptor 4 in group 3, member of the class C family of seven-transmembrane G protein-coupled receptors. The receptors in group 3 include mGluRs 4, 6, 7, and 8. They are homodimeric class C G protein-coupled receptors which are activated by glutamate, the major excitatory neurotransmitter of the CNS. mGluRs are involved in regulating neuronal excitability and synaptic transmission via intracellular activation of second messenger signaling pathways. While the ionotropic glutamate receptor subtypes (AMPA, NMDA, and kainate) mediate fast excitatory postsynaptic transmission, mGluRs are known to mediate slower excitatory postsynaptic responses and to be involved in synaptic plasticity in the mammalian brain. In addition to seven-transmembrane helices, the class C GPCRs are characterized by a large N-terminal extracellular Venus flytrap-like domain, which is composed of two adjacent lobes separated by a cleft which binds an endogenous ligand. Moreover, they exist as either homo- or heterodimers, which are essential for their function. For instance, mGluRs form homodimers via interactions between the N-terminal Venus flytrap domains and the intermolecular disulphide bonds between cysteine residues located in the cysteine-rich domain (CRD). At least eight different subtypes of metabotropic receptors (mGluR1-8) have been identified and further classified into three groups based on their sequence homology, pharmacological properties, and signaling pathways. Group 1 (mGluR1 and mGluR5) receptors are predominantly located postsynaptically on neurons and are involved in long-term synaptic plasticity in the brain, including long-term potentiation (LTP) in the hippocampus and long-term depression (LTD) in the cerebellum. 
They are coupled to G(q/11) proteins, thereby activating phospholipase C to generate inositol-1,4,5-trisphosphate (IP3) and diacylglycerol (DAG), which in turn lead to Ca2+ release and protein kinase C activation, respectively. Group 1 mGluR expression is shown to be strongly upregulated in animal models of epilepsy, brain injury, and inflammatory and neuropathic pain, as well as in patients with amyotrophic lateral sclerosis or multiple sclerosis. Group 2 (mGluR2 and mGluR3) and 3 (mGluR4, mGluR6, mGluR7, and mGluR8) receptors are predominantly localized presynaptically in the active region of neurotransmitter release. They are coupled to G(i/o) proteins, which leads to inhibition of adenylate cyclase activity and cAMP formation, and consequently to a decrease in protein kinase A (PKA) activity. Ultimately, activation of these receptors leads to inhibition of the release of neurotransmitters such as glutamate and GABA via inhibition of Ca2+ channels and activation of K+ channels. Furthermore, while activation of Group 1 mGluRs increases NMDA (N-methyl-D-aspartate) receptor activity and risk of neurotoxicity, Group 2 and 3 mGluRs decrease NMDA receptor activity and prevent neurotoxicity.	327
320569	cd15453	7tmC_mGluR6	metabotropic glutamate receptor 6 in group 3, member of the class C family of seven-transmembrane G protein-coupled receptors. The receptors in group 3 include mGluRs 4, 6, 7, and 8. They are homodimeric class C G protein-coupled receptors which are activated by glutamate, the major excitatory neurotransmitter of the CNS. mGluRs are involved in regulating neuronal excitability and synaptic transmission via intracellular activation of second messenger signaling pathways. While the ionotropic glutamate receptor subtypes (AMPA, NMDA, and kainate) mediate fast excitatory postsynaptic transmission, mGluRs are known to mediate slower excitatory postsynaptic responses and to be involved in synaptic plasticity in the mammalian brain. In addition to seven-transmembrane helices, the class C GPCRs are characterized by a large N-terminal extracellular Venus flytrap-like domain, which is composed of two adjacent lobes separated by a cleft which binds an endogenous ligand. Moreover, they exist as either homo- or heterodimers, which are essential for their function. For instance, mGluRs form homodimers via interactions between the N-terminal Venus flytrap domains and the intermolecular disulphide bonds between cysteine residues located in the cysteine-rich domain (CRD). At least eight different subtypes of metabotropic receptors (mGluR1-8) have been identified and further classified into three groups based on their sequence homology, pharmacological properties, and signaling pathways. Group 1 (mGluR1 and mGluR5) receptors are predominantly located postsynaptically on neurons and are involved in long-term synaptic plasticity in the brain, including long-term potentiation (LTP) in the hippocampus and long-term depression (LTD) in the cerebellum. 
They are coupled to G(q/11) proteins, thereby activating phospholipase C to generate inositol-1,4,5-trisphosphate (IP3) and diacylglycerol (DAG), which in turn lead to Ca2+ release and protein kinase C activation, respectively. Group 1 mGluR expression is shown to be strongly upregulated in animal models of epilepsy, brain injury, and inflammatory and neuropathic pain, as well as in patients with amyotrophic lateral sclerosis or multiple sclerosis. Group 2 (mGluR2 and mGluR3) and 3 (mGluR4, mGluR6, mGluR7, and mGluR8) receptors are predominantly localized presynaptically in the active region of neurotransmitter release. They are coupled to G(i/o) proteins, which leads to inhibition of adenylate cyclase activity and cAMP formation, and consequently to a decrease in protein kinase A (PKA) activity. Ultimately, activation of these receptors leads to inhibition of the release of neurotransmitters such as glutamate and GABA via inhibition of Ca2+ channels and activation of K+ channels. Furthermore, while activation of Group 1 mGluRs increases NMDA (N-methyl-D-aspartate) receptor activity and risk of neurotoxicity, Group 2 and 3 mGluRs decrease NMDA receptor activity and prevent neurotoxicity.	273
320570	cd15454	7tmC_mGluR8	metabotropic glutamate receptor 8 in group 3, member of the class C family of seven-transmembrane G protein-coupled receptors. The receptors in group 3 include mGluRs 4, 6, 7, and 8. They are homodimeric class C G protein-coupled receptors which are activated by glutamate, the major excitatory neurotransmitter of the CNS. mGluRs are involved in regulating neuronal excitability and synaptic transmission via intracellular activation of second messenger signaling pathways. While the ionotropic glutamate receptor subtypes (AMPA, NMDA, and kainate) mediate fast excitatory postsynaptic transmission, mGluRs are known to mediate slower excitatory postsynaptic responses and to be involved in synaptic plasticity in the mammalian brain. In addition to seven-transmembrane helices, the class C GPCRs are characterized by a large N-terminal extracellular Venus flytrap-like domain, which is composed of two adjacent lobes separated by a cleft which binds an endogenous ligand. Moreover, they exist as either homo- or heterodimers, which are essential for their function. For instance, mGluRs form homodimers via interactions between the N-terminal Venus flytrap domains and the intermolecular disulphide bonds between cysteine residues located in the cysteine-rich domain (CRD). At least eight different subtypes of metabotropic receptors (mGluR1-8) have been identified and further classified into three groups based on their sequence homology, pharmacological properties, and signaling pathways. Group 1 (mGluR1 and mGluR5) receptors are predominantly located postsynaptically on neurons and are involved in long-term synaptic plasticity in the brain, including long-term potentiation (LTP) in the hippocampus and long-term depression (LTD) in the cerebellum. 
They are coupled to G(q/11) proteins, thereby activating phospholipase C to generate inositol-1,4,5-trisphosphate (IP3) and diacylglycerol (DAG), which in turn lead to Ca2+ release and protein kinase C activation, respectively. Group 1 mGluR expression is shown to be strongly upregulated in animal models of epilepsy, brain injury, and inflammatory and neuropathic pain, as well as in patients with amyotrophic lateral sclerosis or multiple sclerosis. Group 2 (mGluR2 and mGluR3) and 3 (mGluR4, mGluR6, mGluR7, and mGluR8) receptors are predominantly localized presynaptically in the active region of neurotransmitter release. They are coupled to G(i/o) proteins, which leads to inhibition of adenylate cyclase activity and cAMP formation, and consequently to a decrease in protein kinase A (PKA) activity. Ultimately, activation of these receptors leads to inhibition of the release of neurotransmitters such as glutamate and GABA via inhibition of Ca2+ channels and activation of K+ channels. Furthermore, while activation of Group 1 mGluRs increases NMDA (N-methyl-D-aspartate) receptor activity and risk of neurotoxicity, Group 2 and 3 mGluRs decrease NMDA receptor activity and prevent neurotoxicity.	311
271319	cd15457	NADAR	Escherichia coli swarming motility protein YbiA and related proteins. This family of uncharacterized domains was initially classified as Domain of Unknown Function (DUF1768). It contains members such as the E. coli swarming motility protein YbiA. Mutations in YbiA cause defects in Escherichia coli swarming, but not necessarily in motility. More recently, this family has been predicted to be involved in NAD-utilizing pathways, likely to act on ADP-ribose derivatives, and has been named the NADAR (NAD and ADP-ribose) superfamily.	148
271230	cd15464	HN_like	Haemagglutinin-neuraminidase (HN) of paramyxoviridae and similar proteins. Most paramyxoviridae have two membrane-anchored glycoproteins that mediate entry of the virus into the host cell. The protein characterized by this model is called hemagglutinin-neuraminidase (HN), hemagglutinin glycoprotein (H), or glycoprotein (G). Typically it has a variety of functions during viral infection; it participates in virus attachment to host cells, may cleave sialic acid off host oligosaccharides, and has a stimulating effect on membrane fusion during the entry of the virus into the host cell.	391
275386	cd15465	bS6_mito	Mitochondrial Ribosomal Protein (MRP) S6. bS6_MRPS6 is one of the proteins of the small subunit of the mitochondrial ribosome. Mitochondrial and chloroplastic ribosomes are similar to bacterial ribosomes. The ribosome is a ribonucleoprotein organelle that decodes the genetic information in messenger RNA and forms peptide bonds to synthesize the corresponding polypeptide. Ribosomes consist of a large and a small subunit, which assemble during the initiation stage of protein synthesis. Prokaryotic ribosomes consist of three molecules of RNA and more than 50 proteins. The small subunits of bacterial and eukaryotic ribosomes have the same overall shapes (with structural elements described as head, body, platform, beak and shoulder). Mitochondrial and chloroplastic ribosomes synthesize proteins that are involved in oxidative phosphorylation (ATP generation) and photosynthesis. bS6_MRPS6 is one of the fourteen mitochondrial ribosomal proteins that are known to have significant sequence homology with their bacterial counterparts.	95
271318	cd15466	CLU-central	An uncharacterized central domain of CLU mitochondrial proteins. Mutations in the mitochondrial CLU proteins have been shown to result in clustered mitochondria. CLU proteins include Saccharomyces cerevisiae clustered mitochondria protein (Clu1p, alias translation initiation factor 31/TIF31p), Dictyostelium discoideum clustered mitochondria protein homolog (CluA), Caenorhabditis elegans clustered mitochondria protein homolog (CLUH/ Protein KIAA0664), Drosophila clueless (alias clustered mitochondria protein homolog), Arabidopsis clustered mitochondria protein (CLU, alias friendly mitochondria protein/FMT), and human clustered mitochondria protein homolog (CLUH). Dictyostelium CluA is involved in mitochondrial dynamics and is necessary for both mitochondrial fission and fusion. Drosophila clueless is essential for cytoplasmic localization and function of cellular mitochondria. The Drosophila clu gene interacts genetically with parkin (park, the Drosophila ortholog of a human gene responsible for many familial cases of Parkinson's disease). Arabidopsis CLU/FMT is required for correct mitochondrial distribution and morphology. The specific role CLU proteins play in mitochondrial processes is not yet known. In an early study, S. cerevisiae Clu1/TIF31p was reported as sometimes being associated with the eIF3 translation initiation factor. The authors noted, however, that its tentative assignment as a subunit of eIF3 was uncertain, and to date there has been no direct evidence for a role of this protein in translation.	159
271231	cd15467	MV-h	Measles virus hemagglutinin. The hemagglutinin (H) of measles virus plays several roles during viral infection; it participates in virus attachment to host cells via binding to a proteinaceous receptor and it has a stimulating effect on membrane fusion during the entry of the virus into the host cell by interacting with the fusion protein F. This model characterizes the globular ectodomain of measles hemagglutinin, minus the stalk region. Receptors for measles virus have been identified as the signaling lymphocyte activation molecule (SLAM, in particular its distal ectodomain), CD46, and nectin-4 in epithelial cells.	415
271232	cd15468	HeV-G	Glycoprotein G, or hemagglutinin-neuraminidase of Hendravirus and Nipah virus. The glycoprotein (G) of Hendravirus and Nipah virus has a variety of functions during viral infection; it participates in virus attachment to host cells and has a stimulating effect on membrane fusion during the entry of the virus into the host cell. This model characterizes the globular ectodomain of glycoprotein G. The receptors for Hendravirus and Nipah virus have been identified as ephrin B2 (EFNB2) and ephrin B3 (EFNB3).	413
271233	cd15469	HN	Hemagglutinin-neuraminidase (HN) of parainfluenza virus 5, Newcastle disease virus, and related paramyxoviridae. The hemagglutinin-neuraminidase (HN) found in this family of viruses has a variety of functions during viral infection; it participates in virus attachment to host cells, cleaves sialic acid off host oligosaccharides, and has a stimulating effect on membrane fusion during the entry of the virus into the host cell. This model characterizes the globular ectodomain of HN. Hemagglutinin-neuraminidase ectodomains of these viruses attach the virion to sialic acid receptors on host cells; the neuraminidase cleaves sialic acid moieties from host cell molecules as well as virus particles, and this removal may happen in the trans Golgi network.	408
271254	cd15470	Myo5_CBD	Cargo binding domain of myosin 5. Class V myosins are well studied unconventional myosins, represented by three paralogs (Myo5a,b,c) in vertebrates.  Their C-terminal cargo binding domains (CBDs) are important for the binding of a diverse set of cargos, including membrane vesicles, organelles, proteins and mRNA. The MyoV-CBDs directly interact with several adaptor proteins, in case of Myo5a, melanophilin (MLPH), Rab interacting lysosomal protein-like 2 (RILPL2), and granuphilin, and in case of Myo5b, Rab11-family interacting protein 2.	332
271255	cd15471	Myo5p-like_CBD_afadin	cargo binding domain of myosin 5-like of afadin. Afadin is an actin filament (F-actin) and Rap1 small G protein-binding protein, found in cadherin-based adherens junctions in epithelial cells, endothelial cells, and fibroblasts. It interacts with cell adhesion molecules and signaling molecules and plays a role in the formation of cell junctions, cell polarization, migration, survival, proliferation, and differentiation. Afadin is a multi-domain protein that contains, besides a myosin5-like CBD, two Ras-associated domains, a forkhead-associated domain, a PDZ domain, three proline-rich domains, and an F-actin-binding domain.	322
271256	cd15472	Myo5p-like_CBD_Rasip1	cargo binding domain of myosin 5-like of Ras-interacting protein 1. Ras-interacting protein 1 (Rasip1 or RAIN) is an effector of the small G protein Rap1 and plays an important role in endothelial junction stabilization. Rasip1, like afadin, is a multi-domain protein that contains, besides a myosin5-like CBD, a Ras-associated domain and a PDZ domain.	366
271257	cd15473	Myo5p-like_CBD_DIL_ANK	cargo binding domain of myosin 5-like of dil and ankyrin domain containing protein. DIL and ankyrin domain-containing protein are a group of fungal proteins that contain a domain homologous to the cargo binding domain of class V myosins and ankyrin repeats. Their function is unknown.	316
271258	cd15474	Myo5p-like_CBD_fungal	cargo binding domain of fungal myosin V-like proteins. Yeast myosin V travels along actin cables, actin filaments that are bundled by fimbrin, in the presence of tropomyosin. This is in contrast to the other vertebrate class V myosins. Like other class V myosins, fungal myosin 2 and 4 contain a C-terminal cargo binding domain. In case of Myo4 it has been shown to bind to the adapter protein She3p (Swi5p-dependent HO expression 3), which in turn anchors myosin 4 to its cargos, zip-coded mRNP (messenger ribonucleoprotein particles) and tER (tubular endoplasmic reticulum). Myo2 binds to Vac17, a vacuole-specific cargo adaptor, and Mmr1, a mitochondria-specific cargo adaptor. Both adaptors bind competitively at the same site.	352
271259	cd15475	MyosinXI_CBD	cargo binding domain of myosin XI. Class XI myosins are a plant-specific group, homologous to class V myosins. The C-terminal domain of Arabidopsis myosin XI has been shown to be homologous to the cargo-binding domain of yeast myosin V myo2p, which targets myosin to vacuoles and mitochondria, as well as secretory vesicles.	326
271260	cd15476	Myo5c_CBD	Cargo binding domain of myosin 5C. Class V myosins are well studied unconventional myosins, represented by three paralogs (Myo5a,b,c) in vertebrates. Their C-terminal cargo binding domains (CBDs) are important for the binding of a diverse set of cargos, including membrane vesicles, organelles, proteins and mRNA. The MyoV-CBDs directly interact with several adaptor proteins. MyoVb and MyoVc are primarily expressed in epithelial cells, and have been implicated as motors involved in recycling endosomes.	332
271261	cd15477	Myo5b_CBD	Cargo binding domain of myosin 5b. Class V myosins are well studied unconventional myosins, represented by three paralogs (Myo5a,b,c) in vertebrates.  Their C-terminal cargo binding domains (CBDs) are important for the binding of a diverse set of cargos, including membrane vesicles, organelles, proteins and mRNA. They interact with several adaptor proteins, in case of Myo5b-CBD, Rab11-family interacting protein 2.	372
271262	cd15478	Myo5a_CBD	Cargo binding domain of myosin 5a. Class V myosins are well studied unconventional myosins, represented by three paralogs (Myo5a,b,c) in vertebrates.  Their C-terminal cargo binding domains (CBDs) are important for the binding of a diverse set of cargos, including membrane vesicles, organelles, proteins and mRNA. They interact with several adaptor proteins, in case of Myo5a-CBD, melanophilin (MLPH), Rab interacting lysosomal protein-like 2 (RILPL2), and granuphilin. Mutations in human Myo5a (many of which map to the cargo binding domain) lead to Griscelli syndrome, a severe neurological disease.	375
271263	cd15479	fMyo4p_CBD	cargo binding domain of fungal myosin 4. Yeast myosin V travels along actin cables, actin filaments that are bundled by fimbrin, in the presence of tropomyosin. This is in contrast to the other vertebrate class V  myosins. Like other class V myosins, fungal myosin 2 and 4 contain a C-terminal cargo binding domain. In case of Myo4 it has been shown to bind to the adapter protein She3p (Swi5p-dependent HO expression 3), which in turn anchors myosin 4 to its cargos, zip-coded mRNP (messenger ribonucleoprotein particles) and tER (tubular endoplasmic reticulum).	329
271264	cd15480	fMyo2p_CBD	cargo binding domain of fungal myosin 2. Yeast myosin V travels along actin cables, actin filaments that are bundled by fimbrin, in the presence of tropomyosin. This is in contrast to the other vertebrate class V myosins. Like other class V myosins, fungal myosin 2 and 4 contain a C-terminal cargo binding domain. Myo2 binds to Vac17, a vacuole-specific cargo adaptor, and Mmr1, a mitochondria-specific cargo adaptor. Both adaptors bind competitively at the same site.	363
271252	cd15481	SRP68-RBD	RNA-binding domain of signal recognition particle subunit 68. Signal recognition particles (SRPs) are ribonucleoprotein complexes that target particular nascent pre-secretory proteins to the endoplasmic reticulum. SRP68 is one of the two largest proteins found in SRPs (the other being SRP72), and it forms a heterodimer with SRP72. Heterodimer formation is essential for SRP function. This model characterizes the N-terminal RNA-binding domain SRP68-RBD, a tetratricopeptide-like module. Interactions between SRP68-RBD and SRP RNA (7SL RNA) are thought to facilitate a conformation of SRP RNA that is required for interactions with ribosomal RNA.	195
271234	cd15482	Sialidase_non-viral	Non-viral sialidases. Sialidases or neuraminidases function to bind and hydrolyze terminal sialic acid residues from various glycoconjugates; they play vital roles in pathogenesis, bacterial nutrition and cellular interactions. They have a six-bladed, beta-propeller fold with the non-viral sialidases containing 2-5 Asp-box motifs (most commonly Ser/Thr-X-Asp-[X]-Gly-X-Thr-Trp/Phe). This CD includes eubacterial and eukaryotic sialidases.	339
271235	cd15483	Influenza_NA	Sialidase or neuraminidase (EC 3.2.1.18) of Influenza viruses A and B. Sialidases or neuraminidases function to bind and hydrolyze terminal sialic acid residues from various glycoconjugates. Viral neuraminidases, such as this family from Influenza viruses A and B, play a vital role in pathogenesis. Influenza neuraminidase cleaves an alpha-ketosidic linkage between sialic acid and a neighboring sugar residue. During budding of virus particles from the infected cell, the sialidase helps to prevent the newly formed viral particles from aggregating. The viral sialidase cleaves terminal sialic acid from glycan structures on the infected cell surface, promoting virus release and the spread of virus to neighboring cells that are not yet infected.  Also, sialidase modifies mucins in the respiratory tract and may improve access of the viral particle to its target cells. Sialidases have a six-bladed beta-propeller fold.	386
271251	cd15484	uS7_plant	plant ribosomal protein S7. uS7, also known as Ribosomal protein (RP)S7, is an important part of the translation process which is universally present in the small subunit of prokaryotic and eukaryotic ribosomes. The ribosome small subunit is one of the two subunits of ribosome organelles that use mRNA as a template for protein synthesis in a process called translation. The small subunits of bacteria and eukaryotes have the same shape of head, body, platform, beak, and shoulder. RPS7 is located at the head of the small subunit. RPS7 is a primary ribosomal RNA (rRNA) binding protein that assists in rRNA folding and the binding of other proteins during small subunit assembly in all species. RPS7 is also involved in the formation of the mRNA exit channel at the interface of the large and small subunits. Some ribosomal proteins have extra ribosomal functions in cell differentiation and apoptosis.	147
271243	cd15485	ZIP_Cat8	Leucine zipper Dimerization domain of transcription factor Cat8 and similar proteins. Cat8p binds to carbon source-responsive element (CSRE) motifs and activates target genes under conditions of glucose deprivation. It mediates the transcriptional control of at least nine genes (ACS1, FBP1, ICL1, IDP2, JEN1, MLS1, PCK1, SFC1, and SIP4) under non-fermentative growth conditions in yeast. Studies show another 25 genes or open reading frames whose expression at the transition between the fermentative and the oxidative metabolism (diauxic shift) is altered in the absence of Cat8p. This Cat8p-dependent control results in a parallel alteration in mRNA and protein synthesis. The biochemical functions of proteins encoded by Cat8p-dependent genes are essentially related to the first steps of ethanol utilization, the glyoxylate cycle, and gluconeogenesis. Cat8p is a member of the Gal4p family of transcriptional activators which contain an N-terminal DNA-binding domain with a Zn2Cys6 binuclear cluster that interacts with CCG triplets and a leucine zipper-like heptad repeat that dimerizes. Dimerization allows binding of targets which contain two CCG motifs oriented in an inverted (CGG-CCG), direct (CCG-CCG), or everted (CCG-CGG) manner.	27
271244	cd15486	ZIP_Sip4	Leucine zipper Dimerization domain of transcription factor Sip4p and similar fungal proteins. Sip4p binds to carbon source-responsive element (CSRE) motifs and activates transcription of target genes under conditions of glucose deprivation. Its function is modulated through phosphorylation by SNF1 protein kinase, a protein essential for expression of glucose-repressed genes in response to glucose deprivation. Sip4p is a member of the Gal4p family of transcriptional activators which contain an N-terminal DNA-binding domain with a Zn2Cys6 binuclear cluster that interacts with CCG triplets and a leucine zipper-like heptad repeat that dimerizes. Dimerization allows binding of targets which contain two CCG motifs oriented in an inverted (CGG-CCG), direct (CCG-CCG), or everted (CCG-CGG) manner.	27
275387	cd15487	bS6_chloro_cyano	30S ribosomal protein S6 of chloroplasts and cyanobacteria. bS6 is one of the components of the small subunit of the prokaryotic ribosome, a ribonucleoprotein organelle that decodes the genetic information in messenger RNA and forms peptide bonds to synthesize the corresponding polypeptides. Mitochondrial and chloroplastic ribosomes are similar to bacterial ribosomes. Ribosomes consist of a large and a small subunit, which assemble during the initiation stage of protein synthesis. Prokaryotic ribosomes consist of three molecules of RNA and more than 50 proteins. The small subunits of bacterial and eukaryotic ribosomes have the same overall shapes (with structural elements described as head, body, platform, beak and shoulder). The bacterial ribosomal protein S6 is important for the assembly of the central domain of the small subunit via heterodimerization with ribosomal protein S18.	94
350626	cd15488	Tm-1-like	ATP-binding domain found in plant Tm-1-like (Tm-1L) and similar proteins. Members of this family have been annotated as UPF0261-domain containing proteins. They are found in plants, fungi, bacteria, and archaea. A three-dimensional structure of a complex between the tomato resistance gene product Tm-1 and tomato mosaic virus helicase reveals an organization encompassing two distinct structurally similar domains and an ATP-binding site present in the N-terminal subdomain. The Tm-1-like domain is found co-occurring with a C-terminal TIM-barrel signal transduction (TBST) domain in some plant proteins like Tm-1, and with an N-terminal ABC-transporter ATP-binding domain in a few bacterial proteins.	399
276966	cd15489	PHD_SF	PHD finger superfamily. The PHD finger superfamily includes a canonical plant homeodomain (PHD) finger typically characterized as Cys4HisCys3, and a non-canonical extended PHD finger, characterized as Cys2HisCys5HisCys2His. Variations include the RAG2 PHD finger characterized by Cys3His2Cys2His and the PHD finger 5 found in nuclear receptor-binding SET domain-containing proteins characterized by Cys4HisCys2His. The PHD finger is also termed LAP (leukemia-associated protein) motif or TTC (trithorax consensus) domain. Single or multiple copies of PHD fingers have been found in a variety of eukaryotic proteins involved in the control of gene transcription and chromatin dynamics. PHD fingers can recognize the unmodified and modified histone H3 tail, and some have been found to interact with non-histone proteins. They also function as epigenome readers controlling gene expression through molecular recruitment of multi-protein complexes of chromatin regulators and transcription factors. The PHD finger domain SF is structurally similar to the RING and FYVE_like superfamilies.	48
294011	cd15490	eIF2_gamma_III	Domain III of eukaryotic initiation factor eIF2 gamma. This family represents the C-terminal domain of the gamma subunit of eukaryotic translation initiation factor 2 (eIF2-gamma) found in eukaryotes and archaea. eIF2 is a G protein that delivers the methionyl initiator tRNA to the small ribosomal subunit and releases it upon GTP hydrolysis after the recognition of the initiation codon. eIF2 is composed of three subunits, alpha, beta and gamma. Subunit gamma shows the strongest conservation, and it confers both tRNA binding and GTP/GDP binding.	90
294012	cd15491	selB_III	Domain III of selenocysteine-specific translation elongation factor. This family represents domain III of bacterial selenocysteine (Sec)-specific elongation factor (EFSec), homologous to domain III of EF-Tu. SelB is a specialized translation elongation factor responsible for the co-translational incorporation of selenocysteine into proteins by recoding of a UGA stop codon in the presence of a downstream mRNA hairpin loop, called Sec insertion sequence (SECIS) element.	87
276967	cd15492	PHD_BRPF_JADE_like	PHD finger found in BRPF proteins, Jade proteins, protein AF-10, AF-17, and similar proteins. The family includes BRPF proteins, Jade proteins, protein AF-10 and AF-17. BRPF proteins are scaffold proteins that form monocytic leukemic zinc-finger protein (MOZ)/MOZ-related factor (MORF) H3 histone acetyltransferase (HAT) complexes with other regulatory subunits, such as inhibitor of growth 5 (ING5) and Esa1-associated factor 6 ortholog (EAF6). BRPF proteins have multiple domains, including a canonical Cys4HisCys3 plant homeodomain (PHD) zinc finger followed by a non-canonical extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, a bromodomain and a proline-tryptophan-tryptophan-proline (PWWP) domain. PHD and ePHD fingers both bind to lysine 4 of histone H3 (K4H3), bromodomains interact with acetylated lysines on N-terminal tails of histones and other proteins, and PWWP domains show histone-binding and chromatin association properties. Jade proteins are required for ING4 and ING5 to associate with histone acetyltransferase (HAT) HBO1 and EAF6, to form a HBO1 complex that has a histone H4-specific acetyltransferase activity, a reduced activity toward histone H3, and is responsible for the bulk of histone H4 acetylation in vivo. AF-10, also termed ALL1 (acute lymphoblastic leukemia)-fused gene from chromosome 10 protein, is a transcription factor that has been implicated in the development of leukemia following chromosomal rearrangements between the AF10 gene and one of at least two other genes, MLL and CALM. AF-17, also termed ALL1-fused gene from chromosome 17 protein, is a putative transcription factor that may play a role in multiple signaling pathways. All Jade proteins, AF-10, and AF-17 contain a canonical PHD finger followed by a non-canonical ePHD finger. This model corresponds to the canonical PHD finger.	46
276968	cd15493	PHD_JMJD2	PHD finger found in Jumonji domain-containing protein 2 (JMJD2) family of histone demethylases. JMJD2 proteins, also termed lysine-specific demethylase 4 histone demethylases (KDM4), have been implicated in various cellular processes including DNA damage response, transcription, cell cycle regulation, cellular differentiation, senescence, and carcinogenesis. They selectively catalyze the demethylation of di- and trimethylated H3K9 and H3K36. This model contains only three JMJD2 proteins, JMJD2A-C, which all contain jmjN and jmjC domains in the N-terminal region, followed by a Cys4HisCys3 canonical PHD finger, a non-canonical extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, and a Tudor domain. JMJD2D is not included in this family, since it lacks both PHD and Tudor domains and has a different substrate specificity. JMJD2A-C are required for efficient cancer cell growth. This model corresponds to the Cys4HisCys3 canonical PHD finger.	42
276969	cd15494	PHD_ATX1_2_like	PHD finger found in Arabidopsis thaliana histone-lysine N-methyltransferase arabidopsis trithorax-like protein ATX1, ATX2, and similar proteins. The family includes A. thaliana ATX1 and ATX2, both of which are sister paralogs originating from a segmental chromosomal duplication. They are plant counterparts of the Drosophila melanogaster trithorax (TRX) and mammalian mixed-lineage leukemia (MLL1) proteins. ATX1, also termed protein SET domain group 27, or trithorax-homolog protein 1 (TRX-homolog protein 1), is a methyltransferase that trimethylates histone H3 at lysine 4 (H3K4me3). It also acts as a histone modifier and as a positive effector of gene expression. ATX1 regulates transcription from diverse classes of genes implicated in biotic and abiotic stress responses. It is involved in dehydration stress signaling in both abscisic acid (ABA)-dependent and ABA-independent pathways. ATX2, also termed protein SET domain group 30, or trithorax-homolog protein 2 (TRX-homolog protein 2), is involved in dimethylating histone H3 at lysine 4 (H3K4me2). Both ATX1 and ATX2 are multi-domain containing proteins that consist of an N-terminal PWWP domain, FYRN- and FYRC (DAST, domain associated with SET in trithorax) domains, a canonical Cys4HisCys3 plant homeodomain (PHD) finger, a non-canonical extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, and a C-terminal SET domain; this model corresponds to the Cys4HisCys3 canonical PHD finger.	47
276970	cd15495	PHD_ATX3_4_5_like	PHD finger found in Arabidopsis thaliana histone-lysine N-methyltransferase arabidopsis trithorax-like protein ATX3, ATX4, ATX5, and similar proteins. The family includes A. thaliana ATX3 (also termed protein SET domain group 14, or trithorax-homolog protein 3), ATX4 (also termed protein SET domain group 16, or trithorax-homolog protein 4) and ATX5 (also termed protein SET domain group 29, or trithorax-homolog protein 5), which belong to the histone-lysine methyltransferase family. They show distinct phylogenetic origins from the ATX1 and ATX2 family. They are multi-domain containing proteins that consist of an N-terminal PWWP domain, a canonical Cys4HisCys3 plant homeodomain (PHD) finger, a non-canonical extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, and a C-terminal SET domain; this model corresponds to the Cys4HisCys3 canonical PHD finger.	47
276971	cd15496	PHD_PHF7_G2E3_like	PHD finger found in PHD finger protein 7 (PHF7) and G2/M phase-specific E3 ubiquitin-protein ligase (G2E3). PHF7, also termed testis development protein NYD-SP6, is a testis-specific plant homeodomain (PHD) finger-containing protein that associates with chromatin and binds histone H3 N-terminal tails with a preference for dimethyl lysine 4 (H3K4me2). It may play an important role in stimulating transcription involved in testicular development and/or spermatogenesis. PHF7 contains a canonical Cys4HisCys3 PHD finger and a non-canonical extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, both of which may be involved in activating transcriptional regulation. G2E3 is a dual function ubiquitin ligase (E3) that may play a possible role in cell cycle regulation and the cellular response to DNA damage. It is essential for prevention of apoptosis in early embryogenesis. It is also a nucleo-cytoplasmic shuttling protein with DNA damage responsive localization. G2E3 contains two distinct RING-like ubiquitin ligase domains that catalyze lysine 48-linked polyubiquitination, and a C-terminal catalytic HECT domain that plays an important role in ubiquitin ligase activity and in the dynamic subcellular localization of the protein. The RING-like ubiquitin ligase domains consist of a PHD finger and an ePHD finger. This model corresponds to the Cys4HisCys3 canonical PHD finger.	54
276972	cd15497	PHD1_Snt2p_like	PHD finger 1 found in Saccharomyces cerevisiae SANT domain-containing protein 2 (Snt2p) and similar proteins. Snt2p is a yeast protein that may function in multiple stress pathways. It coordinates the transcriptional response to hydrogen peroxide-mediated oxidative stress through interaction with Ecm5 and the Rpd3 deacetylase. Snt2p contains a bromo adjacent homology (BAH) domain, two canonical Cys4HisCys3 plant homeodomain (PHD) fingers, a non-canonical extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, and a SANT (SWI3, ADA2, N-CoR and TFIIIB) DNA-binding domain; this model corresponds to the first canonical Cys4HisCys3 PHD finger.	48
276973	cd15498	PHD2_Snt2p_like	PHD finger 2 found in Saccharomyces cerevisiae SANT domain-containing protein 2 (Snt2p) and similar proteins. This group corresponds to Snt2p and similar proteins. Snt2p is a yeast protein that may function in multiple stress pathways. It coordinates the transcriptional response to hydrogen peroxide-mediated oxidative stress through interaction with Ecm5 and the Rpd3 deacetylase. Snt2p contains a bromo adjacent homology (BAH) domain, two canonical Cys4HisCys3 plant homeodomain (PHD) fingers, a non-canonical extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, and a SANT (SWI3, ADA2, N-CoR and TFIIIB) DNA-binding domain; this model corresponds to the second canonical Cys4HisCys3 PHD finger.	55
276974	cd15499	PHD1_MTF2_PHF19_like	PHD finger 1 found in polycomb repressive complex 2 (PRC2)-associated polycomb-like (PCL) family proteins MTF2, PHF19, and similar proteins. The family includes two PCL family proteins, metal-response element-binding transcription factor 2 (MTF2/PCL2) and PHF19/PCL3, which are homologs of PHD finger protein1 (PHF1). PCL family proteins are accessory components of the polycomb repressive complex 2 (PRC2) core complex and all contain an N-terminal Tudor domain followed by two PHD fingers, and a C-terminal MTF2 domain. They specifically recognize tri-methylated H3K36 (H3K36me3) through their N-terminal Tudor domains. The interaction between their Tudor domains and H3K36me3 is critical for both the targeting and spreading of PRC2 into active chromatin regions and for the maintenance of optimal repression of poised developmental genes where PCL proteins, H3K36me3, and H3K27me3 coexist. Moreover, unlike other PHD finger-containing proteins, the first PHD fingers of PCL proteins do not display histone H3K4 binding affinity and they do not affect the Tudor domain binding to histones. This model corresponds to the first PHD finger.	53
276975	cd15500	PHD1_PHF1	PHD finger 1 found in PHD finger protein1 (PHF1). PHF1, also termed polycomb-like protein 1 (PCL1), together with JARID2 and AEBP2, associates with the polycomb repressive complex 2 (PRC2), which is the major H3K27 methyltransferase that regulates pluripotency, differentiation, and tumorigenesis through catalysis of histone H3 lysine 27 trimethylation (H3K27me3) on chromatin. PHF1 is essential in epigenetic regulation and genome maintenance. It acts as a dual reader of Lysine trimethylation at Lysine 36 of Histone H3 and Lysine 27 of Histone variant H3t. PHF1 consists of an N-terminal Tudor domain followed by two PHD fingers, and a C-terminal MTF2 domain. Its Tudor domain selectively binds to histone H3K36me3. Moreover, PHF1 is required for efficient H3K27me3 and Hox gene silencing. It can mediate deposition of the repressive H3K27me3 mark and acts as a cofactor in early DNA-damage response. This model corresponds to the first PHD finger.	51
276976	cd15501	PHD_Int12	PHD finger found in integrator complex subunit 12 (Int12) and similar proteins. Int12, also termed IntS12, or PHD finger protein 22, is a component of integrator, a multi-protein mediator of small nuclear RNA processing. The integrator complex directly interacts with the C-terminal domain of RNA polymerase II (RNAPII) largest subunit and mediates the 3' end processing of small nuclear RNAs (snRNAs) U1 and U2. Different from other components of integrator, Int12 contains a PHD finger, which is not required for snRNA 3' end cleavage. Instead, Int12 harbors a small microdomain at its N-terminus which is necessary and sufficient for Int12 function; this microdomain facilitates Int12 binding to Int1 and promotes snRNA 3' end formation.	52
276977	cd15502	PHD_Phf1p_Phf2p_like	PHD finger found in Schizosaccharomyces pombe SWM histone demethylase complex subunits Phf1 (Phf1p) and Phf2 (Phf2p). Phf1p and Phf2p are components of the SWM histone demethylase complex that specifically demethylates histone H3 at lysine 9 (H3K9me2), a specific tag for epigenetic transcriptional activation. They function as corepressors and play roles in regulating heterochromatin propagation and euchromatic transcription. Both Phf1p and Phf2p contain a plant homeodomain (PHD) finger.	52
276978	cd15503	PHD2_MTF2_PHF19_like	PHD finger 2 found in polycomb repressive complex 2 (PRC2)-associated polycomb-like (PCL) family proteins MTF2, PHF19, and similar proteins. The PCL family includes PHD finger protein 1 (PHF1) and its homologs metal-response element-binding transcription factor 2 (MTF2/PCL2) and PHF19/PCL3, which are accessory components of the Polycomb repressive complex 2 (PRC2) core complex and all contain an N-terminal Tudor domain followed by two plant homeodomain (PHD) fingers, and a C-terminal MTF2 domain. PCL proteins specifically recognize tri-methylated H3K36 (H3K36me3) through their N-terminal Tudor domains. The interaction between their Tudor domains and H3K36me3 is critical for both the targeting and spreading of PRC2 into active chromatin regions and for the maintenance of optimal repression of poised developmental genes where PCL proteins, H3K36me3, and H3K27me3 coexist. Moreover, unlike other PHD finger-containing proteins, the first PHD fingers of PCL proteins do not display histone H3K4 binding affinity and do not affect the Tudor domain binding to histones. This model corresponds to the second PHD finger.	52
276979	cd15504	PHD_PRHA_like	PHD finger found in Arabidopsis thaliana pathogenesis-related homeodomain protein (PRHA) and similar proteins. PRHA is a homeodomain protein encoded by a single-copy Arabidopsis thaliana homeobox gene, prha. It shows the capacity to bind to TAATTG core sequence elements but requires additional adjacent bases for high-affinity binding. PRHA contains a plant homeodomain (PHD) finger, a homeodomain, peptide repeats and a putative leucine zipper dimerization domain.	53
276980	cd15505	PHD_ING	PHD finger found in the inhibitor of growth (ING) protein family. The ING family includes a group of tumor suppressors, ING1-5, which act as readers and writers of the histone epigenetic code, affecting DNA damage response, chromatin remodeling, cellular senescence, differentiation, cell cycle regulation and apoptosis. They may have a general role in mediating the cellular response to genotoxic stress through binding to and regulating the activities of histone acetyltransferase (HAT) and histone deacetylase (HDAC) chromatin remodeling complexes. All ING proteins contain an N-terminal ING domain and a C-terminal plant homeodomain (PHD) finger.	45
276981	cd15506	PHD1_KMT2A_like	PHD finger 1 found in histone-lysine N-methyltransferase 2A (KMT2A) and 2B (KMT2B). This family includes histone-lysine N-methyltransferase trithorax (Trx) like proteins, KMT2A (MLL1) and KMT2B (MLL2), which comprise the mammalian Trx branch of the COMPASS family, and are both essential for mammalian embryonic development. KMT2A regulates chromatin-mediated transcription through the catalysis of methylation of histone 3 lysine 4 (H3K4), and is frequently rearranged in acute leukemia. KMT2A functions as the catalytic subunit in the MLL1 complex. KMT2B is a second human homolog of Drosophila trithorax, located on chromosome 19 and functions as the catalytic subunit in the MLL2 complex. It plays a critical role in memory formation through mediating hippocampal H3K4 di- and trimethylation. It is also required for RNA polymerase II association and protection from DNA methylation at the MagohB CpG island promoter. Both KMT2A and KMT2B contain a CxxC (x for any residue) zinc finger domain, three plant homeodomain (PHD) fingers, an extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, two FY (phenylalanine tyrosine)-rich domains, and a SET (Suppressor of variegation, Enhancer of zeste, Trithorax) domain. This model corresponds to the first PHD finger.	47
276982	cd15507	PHD2_KMT2A_like	PHD finger 2 found in histone-lysine N-methyltransferase 2A (KMT2A) and 2B (KMT2B). This family includes histone-lysine N-methyltransferase trithorax (Trx) like proteins, KMT2A (MLL1) and KMT2B (MLL2), which comprise the mammalian Trx branch of the COMPASS family, and are both essential for mammalian embryonic development. KMT2A regulates chromatin-mediated transcription through the catalysis of methylation of histone 3 lysine 4 (H3K4), and is frequently rearranged in acute leukemia. KMT2A functions as the catalytic subunit in the MLL1 complex. KMT2B is a second human homolog of Drosophila trithorax, located on chromosome 19 and functions as the catalytic subunit in the MLL2 complex. It plays a critical role in memory formation through mediating hippocampal H3K4 di- and trimethylation. It is also required for RNA polymerase II association and protection from DNA methylation at the MagohB CpG island promoter. Both KMT2A and KMT2B contain a CxxC (x for any residue) zinc finger domain, three plant homeodomain (PHD) fingers, an extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, two FY (phenylalanine tyrosine)-rich domains, and a SET (Suppressor of variegation, Enhancer of zeste, Trithorax) domain. This model corresponds to the second PHD finger.	50
276983	cd15508	PHD3_KMT2A_like	PHD finger 3 found in histone-lysine N-methyltransferase 2A (KMT2A) and 2B (KMT2B). This family includes histone-lysine N-methyltransferase trithorax (Trx) like proteins, KMT2A (MLL1) and KMT2B (MLL2), which comprise the mammalian Trx branch of the COMPASS family, and are both essential for mammalian embryonic development. KMT2A regulates chromatin-mediated transcription through the catalysis of methylation of histone 3 lysine 4 (H3K4), and is frequently rearranged in acute leukemia. KMT2A functions as the catalytic subunit in the MLL1 complex. KMT2B is a second human homolog of Drosophila trithorax, located on chromosome 19 and functions as the catalytic subunit in the MLL2 complex. It plays a critical role in memory formation through mediating hippocampal H3K4 di- and trimethylation. It is also required for RNA polymerase II association and protection from DNA methylation at the MagohB CpG island promoter. Both KMT2A and KMT2B contain a CxxC (x for any residue) zinc finger domain, three plant homeodomain (PHD) fingers, an extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, two FY (phenylalanine tyrosine)-rich domains, and a SET (Suppressor of variegation, Enhancer of zeste, Trithorax) domain. This model corresponds to the third PHD finger.	57
276984	cd15509	PHD1_KMT2C_like	PHD finger 1 found in Histone-lysine N-methyltransferase 2C (KMT2C) and 2D (KMT2D). KMT2C, also termed myeloid/lymphoid or mixed-lineage leukemia protein 3 (MLL3) or homologous to ALR protein, is a histone H3 lysine 4 (H3K4) lysine methyltransferase that functions as a circadian factor contributing to genome-scale circadian transcription. It is a component of a large complex that acts as a coactivator of multiple transcription factors, including the bile acid (BA)-activated nuclear receptor, farnesoid X receptor (FXR), a critical player in BA homeostasis. The MLL3 complex is essential for p53 transactivation of small heterodimer partner (SHP). KMT2C is also a part of activating signal cointegrator-2 (ASC-2)-containing complex (ASCOM) that contains the transcriptional coactivator nuclear receptor coactivator 6 (NCOA6), KMT2C and its paralog MLL4. The ASCOM complex is critical for nuclear receptor (NR) activation of bile acid transporter genes and is down regulated in cholestasis. KMT2D, also termed ALL1-related protein (ALR), is encoded by the gene that was named MLL4, a fourth human homolog of Drosophila trithorax, located on chromosome 12. It enzymatically generates trimethylated histone H3 Lysine 4 (H3K4me3). It plays an essential role in differentiating the human pluripotent embryonal carcinoma cell line NTERA-2 clone D1 (NT2/D1) stem cells by activating differentiation-specific genes, such as HOXA1-3 and NESTIN. KMT2D is also a part of ASCOM. Both KMT2C and KMT2D contain the catalytic domain SET, several plant homeodomain (PHD) fingers, two extended PHD (ePHD) fingers, Cys2HisCys5HisCys2His, a RING finger, an HMG (high-mobility group)-binding motif, and two FY-rich regions. This model corresponds to the first PHD finger.	48
276985	cd15510	PHD2_KMT2C_like	PHD finger 2 found in Histone-lysine N-methyltransferase 2C (KMT2C) and 2D (KMT2D). KMT2C, also termed myeloid/lymphoid or mixed-lineage leukemia protein 3 (MLL3) or homologous to ALR protein, is a histone H3 lysine 4 (H3K4) lysine methyltransferase that functions as a circadian factor contributing to genome-scale circadian transcription. It is a component of a large complex that acts as a coactivator of multiple transcription factors, including the bile acid (BA)-activated nuclear receptor, farnesoid X receptor (FXR), a critical player in BA homeostasis. The MLL3 complex is essential for p53 transactivation of small heterodimer partner (SHP). KMT2C is also a part of activating signal cointegrator-2 (ASC-2)-containing complex (ASCOM) that contains the transcriptional coactivator nuclear receptor coactivator 6 (NCOA6), KMT2C and its paralog MLL4. The ASCOM complex is critical for nuclear receptor (NR) activation of bile acid transporter genes and is down regulated in cholestasis. KMT2D, also termed ALL1-related protein (ALR), is encoded by the gene that was named MLL4, a fourth human homolog of Drosophila trithorax, located on chromosome 12. It enzymatically generates trimethylated histone H3 Lysine 4 (H3K4me3). It plays an essential role in differentiating the human pluripotent embryonal carcinoma cell line NTERA-2 clone D1 (NT2/D1) stem cells by activating differentiation-specific genes, such as HOXA1-3 and NESTIN. KMT2D is also a part of ASCOM. Both KMT2C and KMT2D contain the catalytic domain SET, five plant homeodomain (PHD) fingers, two extended PHD (ePHD) fingers, Cys2HisCys5HisCys2His, a RING finger, an HMG (high-mobility group)-binding motif, and two FY-rich regions. This model corresponds to the second PHD finger.	46
276986	cd15511	PHD3_KMT2C	PHD finger 3 found in Histone-lysine N-methyltransferase 2C (KMT2C). KMT2C, also termed myeloid/lymphoid or mixed-lineage leukemia protein 3 (MLL3) or homologous to ALR protein, is a histone H3 lysine 4 (H3K4) lysine methyltransferase that functions as a circadian factor contributing to genome-scale circadian transcription. It is a component of a large complex that acts as a coactivator of multiple transcription factors, including the bile acid (BA)-activated nuclear receptor, farnesoid X receptor (FXR), a critical player in BA homeostasis. The MLL3 complex is essential for p53 transactivation of small heterodimer partner (SHP). KMT2C is also a part of activating signal cointegrator-2 (ASC-2)-containing complex (ASCOM) that contains the transcriptional coactivator nuclear receptor coactivator 6 (NCOA6), KMT2C and its paralog MLL4. The ASCOM complex is critical for nuclear receptor (NR) activation of bile acid transporter genes and is down regulated in cholestasis. KMT2C contains several plant homeodomain (PHD) fingers, two extended PHD (ePHD) fingers, Cys2HisCys5HisCys2His, an ATPase alpha beta signature, a high mobility group (HMG)-1 box, a SET (Suppressor of variegation, Enhancer of zeste, Trithorax) domain and two FY (phenylalanine tyrosine)-rich domains. This model corresponds to the third PHD finger.	52
276987	cd15512	PHD4_KMT2C_like	PHD finger 4 found in Histone-lysine N-methyltransferase 2C (KMT2C) and PHD finger 3 found in KMT2D. KMT2C, also termed myeloid/lymphoid or mixed-lineage leukemia protein 3 (MLL3) or homologous to ALR protein, is a histone H3 lysine 4 (H3K4) lysine methyltransferase that functions as a circadian factor contributing to genome-scale circadian transcription. It is a component of a large complex that acts as a coactivator of multiple transcription factors, including the bile acid (BA)-activated nuclear receptor, farnesoid X receptor (FXR), a critical player in BA homeostasis. The MLL3 complex is essential for p53 transactivation of small heterodimer partner (SHP). KMT2C is also a part of activating signal cointegrator-2 (ASC-2)-containing complex (ASCOM) that contains the transcriptional coactivator nuclear receptor coactivator 6 (NCOA6), KMT2C and its paralog MLL4. The ASCOM complex is critical for nuclear receptor (NR) activation of bile acid transporter genes and is down regulated in cholestasis. KMT2D, also termed ALL1-related protein (ALR), is encoded by the gene that was named MLL4, a fourth human homolog of Drosophila trithorax, located on chromosome 12. It enzymatically generates trimethylated histone H3 Lysine 4 (H3K4me3). It plays an essential role in differentiating the human pluripotent embryonal carcinoma cell line NTERA-2 clone D1 (NT2/D1) stem cells by activating differentiation-specific genes, such as HOXA1-3 and NESTIN. KMT2D is also a part of ASCOM. Both KMT2C and KMT2D contain the catalytic domain SET, several plant homeodomain (PHD) fingers, two extended PHD (ePHD) fingers, Cys2HisCys5HisCys2His, a RING finger, an HMG (high-mobility group)-binding motif, and two FY-rich regions. This model corresponds to the fourth PHD finger of KMT2C and the third PHD finger of KMT2D.	49
276988	cd15513	PHD5_KMT2C_like	PHD finger 5 found in Histone-lysine N-methyltransferase 2C (KMT2C) and PHD finger 4 found in KMT2D. KMT2C, also termed myeloid/lymphoid or mixed-lineage leukemia protein 3 (MLL3), or homologous to ALR protein, is a histone H3 lysine 4 (H3K4) lysine methyltransferase that functions as a circadian factor contributing to genome-scale circadian transcription. It is a component of a large complex that acts as a coactivator of multiple transcription factors, including the bile acid (BA)-activated nuclear receptor, farnesoid X receptor (FXR), a critical player in BA homeostasis. The MLL3 complex is essential for p53 transactivation of small heterodimer partner (SHP). KMT2C is also a part of activating signal cointegrator-2 (ASC-2)-containing complex (ASCOM) that contains the transcriptional coactivator nuclear receptor coactivator 6 (NCOA6), KMT2C and its paralog MLL4. The ASCOM complex is critical for nuclear receptor (NR) activation of bile acid transporter genes and is down regulated in cholestasis. KMT2D, also termed ALL1-related protein (ALR), is encoded by the gene that was named MLL4, a fourth human homolog of Drosophila trithorax, located on chromosome 12. It enzymatically generates trimethylated histone H3 Lysine 4 (H3K4me3). It plays an essential role in differentiating the human pluripotent embryonal carcinoma cell line NTERA-2 clone D1 (NT2/D1) stem cells by activating differentiation-specific genes, such as HOXA1-3 and NESTIN. KMT2D is also a part of ASCOM. Both KMT2C and KMT2D contain the catalytic domain SET, several plant homeodomain (PHD) fingers, extended PHD (ePHD) fingers, Cys2HisCys5HisCys2His, a RING finger, an HMG (high-mobility group)-binding motif, and two FY-rich regions. This model corresponds to the fifth PHD finger of KMT2C and the fourth PHD finger of KMT2D.	47
276989	cd15514	PHD6_KMT2C_like	PHD finger 6 found in Histone-lysine N-methyltransferase 2C (KMT2C) and PHD finger 5 found in KMT2D. KMT2C, also termed myeloid/lymphoid or mixed-lineage leukemia protein 3 (MLL3), or homologous to ALR protein, is a histone H3 lysine 4 (H3K4) lysine methyltransferase that functions as a circadian factor contributing to genome-scale circadian transcription. It is a component of a large complex that acts as a coactivator of multiple transcription factors, including the bile acid (BA)-activated nuclear receptor, farnesoid X receptor (FXR), a critical player in BA homeostasis. The MLL3 complex is essential for p53 transactivation of small heterodimer partner (SHP). KMT2C is also a part of activating signal cointegrator-2 (ASC-2)-containing complex (ASCOM) that contains the transcriptional coactivator nuclear receptor coactivator 6 (NCOA6), KMT2C and its paralog MLL4. The ASCOM complex is critical for nuclear receptor (NR) activation of bile acid transporter genes and is down regulated in cholestasis. KMT2D, also termed ALL1-related protein (ALR), is encoded by the gene that was named MLL4, a fourth human homolog of Drosophila trithorax, located on chromosome 12. It enzymatically generates trimethylated histone H3 Lysine 4 (H3K4me3). It plays an essential role in differentiating the human pluripotent embryonal carcinoma cell line NTERA-2 clone D1 (NT2/D1) stem cells by activating differentiation-specific genes, such as HOXA1-3 and NESTIN. KMT2D is also a part of ASCOM. Both KMT2C and KMT2D contain the catalytic domain SET, several plant homeodomain (PHD) fingers, two extended PHD (ePHD) fingers, Cys2HisCys5HisCys2His, a RING finger, an HMG (high-mobility group)-binding motif, and two FY-rich regions. This model corresponds to the sixth PHD finger of KMT2C and the fifth PHD finger of KMT2D.	51
276990	cd15515	PHD1_KDM5A_like	PHD finger 1 found in Lysine-specific demethylase KDM5A, KDM5B, KDM5C, KDM5D and similar proteins. The JARID subfamily within the JmjC proteins includes Lysine-specific demethylase KDM5A, KDM5B, KDM5C, KDM5D and a Drosophila homolog, protein little imaginal discs (Lid). KDM5A was originally identified as a retinoblastoma protein (Rb)-binding partner and its inactivation may be important for Rb to promote differentiation. It is involved in transcription through interacting with TBP, p107, nuclear receptors, Myc, Sin3/HDAC, Mad1, RBP-J, CLOCK and BMAL1. KDM5B has a restricted expression pattern in the testis, ovary, and transiently in the mammary gland of the pregnant female and has been shown to be upregulated in breast cancer, prostate cancer, and lung cancer, suggesting a potential role in tumorigenesis. Both KDM5A and KDM5B function as trimethylated histone H3 lysine 4 (H3K4me3) demethylases. KDM5C is a H3K4 trimethyl-histone demethylase that catalyzes demethylation of H3K4me3 and H3K4me2 to H3K4me1. It plays a role in neuronal survival and dendrite development. KDM5C defects are associated with X-linked mental retardation (XLMR). KDM5D is a male-specific antigen that shows a demethylase activity specific for di- and tri-methylated histone H3K4 (H3K4me2 and H3K4me3), and has a male-specific function as a histone H3K4 demethylase by recruiting a meiosis-regulatory protein, MSH5, to condensed DNA. KDM5D directly interacts with a polycomb-like protein Ring6a/MBLR, and plays a role in regulation of transcriptional initiation through H3K4 demethylation. This family also includes Drosophila melanogaster protein little imaginal discs (Lid) that functions as a JmjC-dependent H3K4me3 demethylase, which is required for dMyc-induced cell growth. It positively regulates Hox gene expression in S2 cells. Members in this family contain the catalytic JmjC domain, JmjN, the BRIGHT domain, which is an AT-rich interacting domain (ARID), and a Cys5HisCys2 zinc finger, as well as two or three plant homeodomain (PHD) fingers. This model corresponds to the first PHD finger.	46
276991	cd15516	PHD2_KDM5A_like	PHD finger 2 found in Lysine-specific demethylase KDM5A, KDM5B, KDM5C, KDM5D, and similar proteins. The JARID subfamily within the JmjC proteins includes Lysine-specific demethylase KDM5A, KDM5B, KDM5C, KDM5D and a Drosophila homolog protein, little imaginal discs (Lid). KDM5A was originally identified as a retinoblastoma protein (Rb)-binding partner and its inactivation may be important for Rb to promote differentiation. It is involved in transcription through interacting with TBP, p107, nuclear receptors, Myc, Sin3/HDAC, Mad1, RBP-J, CLOCK, and BMAL1. KDM5B has a restricted expression pattern in the testis, ovary, and transiently in the mammary gland of the pregnant female and has been shown to be upregulated in breast cancer, prostate cancer, and lung cancer, suggesting a potential role in tumorigenesis. Both KDM5A and KDM5B function as trimethylated histone H3 lysine 4 (H3K4me3) demethylases. KDM5C is a H3K4 trimethyl-histone demethylase that catalyzes demethylation of H3K4me3 and H3K4me2 to H3K4me1. It plays a role in neuronal survival and dendrite development. KDM5C defects are associated with X-linked mental retardation (XLMR). KDM5D is a male-specific antigen that shows a demethylase activity specific for di- and tri-methylated histone H3K4 (H3K4me3 and H3K4me2), and has a male-specific function as a histone H3K4 demethylase by recruiting a meiosis-regulatory protein, MSH5, to condensed DNA. KDM5D directly interacts with a polycomb-like protein Ring6a/MBLR, and plays a role in regulation of transcriptional initiation through H3K4 demethylation. The family also includes Drosophila melanogaster protein little imaginal discs (Lid) that functions as a JmjC-dependent trimethyl histone H3K4 (H3K4me3) demethylase, which is required for dMyc-induced cell growth. It positively regulates Hox gene expression in S2 cells. Members in this family contain the catalytic JmjC domain, JmjN, the BRIGHT domain, which is an AT-rich interacting domain (ARID), and a Cys5HisCys2 zinc finger, as well as two or three plant homeodomain (PHD) fingers. This model corresponds to the second PHD finger.	53
276992	cd15517	PHD_TCF19_like	PHD finger found in Transcription factor 19 (TCF-19), Lysine-specific demethylase KDM5A and KDM5B, and other similar proteins. TCF-19 was identified as a putative trans-activating factor with expression beginning at the late G1-S boundary in dividing cells. It functions as a novel islet factor necessary for proliferation and survival in the INS-1 beta cell line. It plays an important role in susceptibility to both Type 1 Diabetes Mellitus (T1DM) and Type 2 Diabetes Mellitus (T2DM); it has been suggested that it may positively impact beta cell mass under conditions of beta cell stress and increased insulin demand. KDM5A was originally identified as a retinoblastoma protein (Rb)-binding partner and its inactivation may be important for Rb to promote differentiation. It is involved in transcription through interaction with TBP, p107, nuclear receptors, Myc, Sin3/HDAC, Mad1, RBP-J, CLOCK, and BMAL1. KDM5B has a restricted expression pattern in the testis, ovary, and transiently in the mammary gland of the pregnant female and has been shown to be upregulated in breast cancer, prostate cancer, and lung cancer, suggesting a potential role in tumorigenesis. Both KDM5A and KDM5B function as trimethylated histone H3 lysine 4 (H3K4me3) demethylases. This family also includes Caenorhabditis elegans Lysine-specific demethylase 7 homolog (ceKDM7A). ceKDM7A (also termed JmjC domain-containing protein 1.2, PHD finger protein 8 homolog, or PHF8 homolog) is a plant homeodomain (PHD)- and JmjC domain-containing protein that functions as a histone demethylase specific for H3K9me2 and H3K27me2. The binding of the PHD finger to H3K4me3 guides H3K9me2- and H3K27me2-specific demethylation by its catalytic JmjC domain in a trans-histone regulation mechanism. In addition, this family includes plant protein OBERON 1 and OBERON 2, Alfin1-like (AL) proteins, histone acetyltransferases (HATs) HAC, and AT-rich interactive domain-containing protein 4 (ARID4).	49
276993	cd15518	PHD_Ecm5p_Lid2p_like	PHD finger found in Saccharomyces cerevisiae extracellular matrix protein 5 (Ecm5p), Schizosaccharomyces pombe Lid2 complex component Lid2p, and similar proteins. The family includes Saccharomyces cerevisiae Ecm5p, Schizosaccharomyces pombe Lid2 complex component Lid2p, and similar proteins. Ecm5p is a JmjC domain-containing protein that directly removes histone lysine methylation via a hydroxylation reaction. It associates with the yeast Snt2p and Rpd3 deacetylase, which may play a role in regulating transcription in response to oxidative stress. Ecm5p promotes oxidative stress tolerance, while Snt2p ultimately decreases tolerance. Ecm5p contains an N-terminal ARID domain, a JmjC domain, and a C-terminal plant homeodomain (PHD) finger. Lid2p is a trimethyl H3K4 (H3K4me3) demethylase responsible for H3K4 hypomethylation in heterochromatin. It interacts with the histone lysine-9 methyltransferase, Clr4, through the Dos1/Clr8-Rik1 complex, and mediates H3K9 methylation and small RNA production. It also acts cooperatively with the histone modification enzymes Set1 and Lsd1 and plays an essential role in cross-talk between H3K4 and H3K9 methylation in euchromatin. Lid2p contains a JmjC domain, three PHD fingers and a JmjN domain. This model includes the second PHD finger of Lid2p.	45
276994	cd15519	PHD1_Lid2p_like	PHD finger 1 found in Schizosaccharomyces pombe Lid2 complex component Lid2p and similar proteins. Lid2p is a trimethyl H3K4 (H3K4me3) demethylase responsible for H3K4 hypomethylation in heterochromatin. It interacts with the histone lysine-9 methyltransferase, Clr4, through the Dos1/Clr8-Rik1 complex, and mediates H3K9 methylation and small RNA production. It also acts cooperatively with the histone modification enzymes Set1 and Lsd1 and plays an essential role in cross-talk between H3K4 and H3K9 methylation in euchromatin. Lid2p contains a JmjC domain, three PHD fingers and a JmjN domain. This model corresponds to the first PHD finger.	46
276995	cd15520	PHD3_Lid2p_like	PHD finger 3 found in Schizosaccharomyces pombe Lid2 complex component Lid2p and similar proteins. Lid2p is a trimethyl H3K4 (H3K4me3) demethylase responsible for H3K4 hypomethylation in heterochromatin. It interacts with the histone lysine-9 methyltransferase, Clr4, through the Dos1/Clr8-Rik1 complex, and mediates H3K9 methylation and small RNA production. It also acts cooperatively with the histone modification enzymes Set1 and Lsd1, and plays an essential role in cross-talk between H3K4 and H3K9 methylation in euchromatin. Lid2p contains a JmjC domain, three PHD fingers and a JmjN domain. The family corresponds to the third PHD finger.	47
276996	cd15521	PHD_VIN3_plant	PHD finger found in Arabidopsis thaliana protein Vernalization Insensitive 3 (VIN3) and similar proteins. The lineage specific VIN3 family of proteins includes VIN3, VIN3-like1 (VIL1, or Vernalization5 (VRN5)), VIN3-like2 (VIL2, or Vernalization5/VIN3-like protein 1 (VEL1)), VIN3-like3 (VIL3 or Vernalization5/VIN3-like protein 2 (VEL2)), and similar proteins. They contain a plant homeodomain (PHD) finger, and collectively repress different sets of members of the Flowering LOCUS C (FLC) gene family during the course of vernalization. Both VIN3 and VIL1 are required for modifying the histone architecture of the MADS box floral repressor FLC in response to prolonged cold exposure in Arabidopsis. VIN3 is required for both Histone H3 Lys 9 (H3K9) and Histone H3 Lys 27 (H3K27) methylation at FLC chromatin, ultimately leading to its repression. It is regulated by the components of Polycomb Response Complex2 (PRC2), which trimethylates histone H3 Lys 27 (H3K27me3). VIL1 appears to play a prominent role in regulating FLC by vernalization. VIL2 acts together with PRC2 to repress the floral repressor MAF5, an FLC clade member, in a photoperiod-dependent manner to accelerate flowering under non-inductive photoperiods.	64
276997	cd15522	PHD_TAF3	PHD finger found in transcription initiation factor TFIID subunit 3 (TAF3). TAF3 (also termed 140 kDa TATA box-binding protein-associated factor, TBP-associated factor 3, transcription initiation factor TFIID 140 kDa subunit (TAF140), or TAFII-140) is an integral component of TFIID, a general initiation factor (GTF) that plays a key role in preinitiation complex (PIC) assembly through core promoter recognition. The interaction of H3K4me3 with TAF3 directs global TFIID recruitment to active genes, which regulates gene-selective functions of p53 in response to genotoxic stress. TAF3 is highly enriched in embryonic stem cells and is required for endoderm lineage differentiation and prevents premature specification of neuroectoderm and mesoderm. Moreover, TAF3, along with TRF3, forms a complex that is essential for myogenic differentiation. TAF3 contains a plant homeodomain (PHD) finger. This family also includes Drosophila melanogaster BIP2 (Bric-a-brac interacting protein 2) protein, which functions as an interacting partner of D. melanogaster p53 (Dmp53).	46
276998	cd15523	PHD_PHF21A	PHD finger found in PHD finger protein 21A (PHF21A). PHF21A (also termed BHC80a or BRAF35-HDAC complex protein BHC80) along with HDAC1/2, CtBP1, CoREST, and BRAF35, is associated with LSD1, a lysine (K)-specific histone demethylase. It inhibits LSD1-mediated histone demethylation in vitro. PHF21A is predominantly present in the central nervous system and spermatogenic cells and is one of the six components of BRAF-HDAC complex (BHC) involved in REST-dependent transcriptional repression of neuron-specific genes in non-neuronal cells. It acts as a scaffold protein in BHC in neuronal as well as non-neuronal cells and also plays a role in spermatogenesis. PHF21A contains a C-terminal plant homeodomain (PHD) finger that is responsible for the binding directly to each of five other components of BHC, and of organizing BHC mediating transcriptional repression.	43
276999	cd15524	PHD_PHF21B	PHD finger found in PHD finger protein 21B (PHF21B). PHF21B is a plant homeodomain (PHD) finger-containing protein whose biological function remains unclear. It shows high sequence similarity with PHF21A, which is associated with LSD1, a lysine (K)-specific histone demethylase and inhibits LSD1-mediated histone demethylation in vitro. PHD fingers can recognize the unmodified and modified histone H3 tail, and some have been found to interact with non-histone proteins.	43
277000	cd15525	PHD_UHRF1_2	PHD finger found in ubiquitin-like PHD and RING finger domain-containing protein UHRF1 and UHRF2. UHRF1 is a unique chromatin effector protein that integrates the recognition of both histone PTMs and DNA methylation. It is essential for cell proliferation and plays a critical role in the development and progression of many human carcinomas, such as laryngeal squamous cell carcinoma (LSCC), gastric cancer (GC), esophageal squamous cell carcinoma (ESCC), colorectal cancer, prostate cancer, and breast cancer. UHRF1 acts as a transcriptional repressor through its binding to histone H3 when it is unmodified at Arg2. Its overexpression in human lung fibroblasts results in downregulation of expression of the tumour suppressor pRB. It also plays a role in transcriptional repression of the cell cycle regulator p21. Moreover, UHRF1-dependent repression of transcription factors can facilitate the G1-S transition. It interacts with Tat-interacting protein of 60 kDa (TIP60) and induces degradation-independent ubiquitination of TIP60. It is also an N-methylpurine DNA glycosylase (MPG)-interacting protein that binds MPG in a p53 status-independent manner in the DNA base excision repair (BER) pathway. In addition, UHRF1 functions as an epigenetic regulator that is important for multiple aspects of epigenetic regulation, including maintenance of DNA methylation patterns and recognition of various histone modifications. UHRF2 was originally identified as a ubiquitin ligase acting as a small ubiquitin-like modifier (SUMO) E3 ligase that enhances zinc finger protein 131 (ZNF131) SUMOylation but does not enhance ZNF131 ubiquitination. It also ubiquitinates PCNP, a PEST-containing nuclear protein. Moreover, UHRF2 functions as a nuclear protein involved in cell-cycle regulation and has been implicated in tumorigenesis. It interacts with cyclins, CDKs, p53, pRB, PCNA, HDAC1, DNMTs, G9a, methylated histone H3 lysine 9, and methylated DNA. It interacts with the cyclin E-CDK2 complex, ubiquitinates cyclins D1 and E1, induces G1 arrest, and is involved in the G1/S transition regulation. Furthermore, UHRF2 is a direct transcriptional target of the transcription factor E2F-1 in the induction of apoptosis. It recruits HDAC1 and binds to methyl-CpG. UHRF2 also participates in the maturation of Hepatitis B virus (HBV) by interacting with the HBV core protein and promoting its degradation. Both UHRF1 and UHRF2 contain an N-terminal ubiquitin-like domain (UBL), a tandem Tudor domain (TTD), a plant homeodomain (PHD) finger, a SET- and RING-associated (SRA) domain, and a C-terminal RING finger.	47
277001	cd15526	PHD1_MOZ_d4	PHD finger 1 found in monocytic leukemia zinc-finger protein (MOZ), MOZ-related factor (MORF), and d4 gene family proteins. MOZ is a MYST-type histone acetyltransferase (HAT) that functions as a coactivator for acute myeloid leukemia 1 protein (AML1)- and p53-dependent transcription. It possesses intrinsic HAT activity to acetylate both itself and lysine (K) residues on histone H2B, histone H3 (K14) and histone H4 (K5, K8, K12 and K16) in vitro and H3K9 in vivo. MOZ-related factor (MORF) is a ubiquitously expressed transcriptional regulator with intrinsic HAT activity. It can interact with the Runt-domain transcription factor Runx2 and form a tetrameric complex with BRPFs, ING5, and EAF6. Both MOZ and MORF are catalytic subunits of HAT complexes that are required for normal developmental programs, such as hematopoiesis, neurogenesis, and skeletogenesis, and are implicated in human leukemias. MOZ is also the catalytic subunit of a tetrameric inhibitor of growth 5 (ING5) complex, which specifically acetylates nucleosomal histone H3K14. Moreover, MOZ and MORF are involved in regulating transcriptional activation mediated by Runx2 (or Cbfa1), a Runt-domain transcription factor known to play important roles in T cell lymphomagenesis and bone development, and its homologs. This family also includes three members of the d4 gene family, DPF1 (neuro-d4), DPF2 (ubi-d4/Requiem), and DPF3 (cer-d4), which function as transcription factors and are involved in transcriptional regulation of genes by changing the condensed/decondensed state of chromatin in the nucleus. DPF2 is ubiquitously expressed and it acts as a transcription factor that may participate in developmentally programmed cell death. DPF1 and DPF3 are expressed predominantly in neural tissues, and they may be involved in the transcription regulation of neuro-specific gene clusters. All family members contain two plant homeodomain (PHD) fingers. This model corresponds to the first PHD finger.	56
277002	cd15527	PHD2_KAT6A_6B	PHD finger 2 found in monocytic leukemia zinc-finger protein (MOZ) and MOZ-related factor (MORF). MOZ, also termed histone acetyltransferase KAT6A, YBF2/SAS3, SAS2 and TIP60 protein 3 (MYST-3), or runt-related transcription factor-binding protein 2, or zinc finger protein 220, is a MYST-type histone acetyltransferase (HAT) that functions as a coactivator for acute myeloid leukemia 1 protein (AML1)- and p53-dependent transcription. It possesses intrinsic HAT activity to acetylate both itself and lysine (K) residues on histone H2B, histone H3 (K14) and histone H4 (K5, K8, K12 and K16) in vitro and H3K9 in vivo. MOZ-related factor (MORF), also termed MOZ2, or histone acetyltransferase KAT6B, or MOZ, YBF2/SAS3, SAS2 and TIP60 protein 4 (MYST4), is a ubiquitously expressed transcriptional regulator with intrinsic HAT activity. It can interact with the Runt-domain transcription factor Runx2 and form a tetrameric complex with BRPFs, ING5, and EAF6. Both MOZ and MORF are catalytic subunits of HAT complexes that are required for normal developmental programs, such as hematopoiesis, neurogenesis, and skeletogenesis, and are also implicated in human leukemias. MOZ is also the catalytic subunit of a tetrameric inhibitor of growth 5 (ING5) complex, which specifically acetylates nucleosomal histone H3K14. Moreover, MOZ and MORF are involved in regulating transcriptional activation mediated by Runx2 (or Cbfa1), a Runt-domain transcription factor known to play important roles in T cell lymphomagenesis and bone development, and its homologs. MOZ contains a linker histone H1 and H5 domain and two plant homeodomain (PHD) fingers. In contrast, MORF contains an N-terminal region containing two PHD fingers, a putative HAT domain, an acidic region, and a C-terminal Ser/Met-rich domain. This model corresponds to the second PHD finger.	46
277003	cd15528	PHD1_PHF10	PHD finger 1 found in PHD finger protein 10 (PHF10) and similar proteins. PHF10, also termed BRG1-associated factor 45a (BAF45a), or XAP135, is a ubiquitously expressed transcriptional regulator that is required for maintaining the undifferentiated status of neuroblasts. It contains a SAY (supporter of activation of yellow) domain and two adjacent plant homeodomain (PHD) fingers. This model corresponds to the first PHD finger.	54
277004	cd15529	PHD2_PHF10	PHD finger 2 found in PHD finger protein 10 (PHF10) and similar proteins. PHF10, also termed BRG1-associated factor 45a (BAF45a), or XAP135, is a ubiquitously expressed transcriptional regulator that is required for maintaining the undifferentiated status of neuroblasts. It contains a SAY (supporter of activation of yellow) domain and two adjacent plant homeodomain (PHD) fingers. This model corresponds to the second PHD finger.	44
277005	cd15530	PHD2_d4	PHD finger 2 found in d4 gene family proteins. The family includes proteins coded by three members of the d4 gene family, DPF1 (neuro-d4), DPF2 (ubi-d4/Requiem), and DPF3 (cer-d4), which function as transcription factors and are involved in transcriptional regulation of genes by changing the condensed/decondensed state of chromatin in the nucleus. DPF2 is ubiquitously expressed and it acts as a transcription factor that may participate in developmentally programmed cell death. DPF1 and DPF3 are expressed predominantly in neural tissues, and they may be involved in the transcription regulation of neuro-specific gene clusters. The d4 family proteins show distinct domain organization with domain 2/3 in the N-terminal region, a Cys2His2 (C2H2) zinc finger or Kruppel-type zinc finger in the central part and two adjacent plant homeodomain (PHD) fingers (d4-domain) in the C-terminal part of the molecule. This model corresponds to the second PHD finger.	46
277006	cd15531	PHD1_CHD_II	PHD finger 1 found in class II Chromodomain-Helicase-DNA binding (CHD) proteins. Class II CHD proteins include chromodomain-helicase-DNA-binding proteins CHD3, CHD4, and CHD5, which are nuclear and ubiquitously expressed chromatin remodelling ATPases generally associated with histone deacetylases (HDACs). They are involved in DNA Double Strand Break (DSB) signaling, DSB repair and/or p53-dependent pathways such as apoptosis and senescence, as well as in the maintenance of genomic stability, and/or cancer prevention. They function as subunits of the Nucleosome Remodelling and Deacetylase (NuRD) complex, which is generally associated with gene repression, heterochromatin formation, and overall chromatin compaction. In contrast to the class I CHD enzymes (CHD1 and CHD2), class II CHD proteins lack identifiable DNA-binding domains, but possess a C-terminal coiled-coil region. Moreover, in addition to the tandem chromodomains and a helicase domain, they all harbor tandem plant homeodomain (PHD) zinc fingers involved in the recognition of methylated histone tails. This model corresponds to the first PHD finger.	43
277007	cd15532	PHD2_CHD_II	PHD finger 2 found in class II Chromodomain-Helicase-DNA binding (CHD) proteins. Class II CHD proteins include chromodomain-helicase-DNA-binding proteins CHD3, CHD4, and CHD5, which are nuclear and ubiquitously expressed chromatin remodelling ATPases generally associated with histone deacetylases (HDACs). They are involved in DNA Double Strand Break (DSB) signaling, DSB repair and/or p53-dependent pathways such as apoptosis and senescence, as well as in the maintenance of genomic stability, and/or cancer prevention. They function as subunits of the Nucleosome Remodelling and Deacetylase (NuRD) complex, which is generally associated with gene repression, heterochromatin formation, and overall chromatin compaction. In contrast to the class I CHD enzymes (CHD1 and CHD2), class II CHD proteins lack identifiable DNA-binding domains, but possess a C-terminal coiled-coil region. Moreover, in addition to the tandem chromodomains and a helicase domain, they all harbor tandem plant homeodomain (PHD) zinc fingers involved in the recognition of methylated histone tails. This model corresponds to the second PHD finger.	43
277008	cd15533	PHD1_PHF12	PHD finger 1 found in PHD finger protein 12 (PHF12). PHF12, also termed PHD factor 1 (Pf1), is a plant homeodomain (PHD) zinc finger-containing protein that bridges the transducin-like enhancer of split (TLE) corepressor to the mSin3A-histone deacetylase (HDAC)-complex, and further represses transcription at targeted genes. PHF12 also interacts with MRG15 (mortality factor-related genes on chromosome 15), a member of the mortality factor (MORF) family of proteins implicated in regulating cellular senescence. PHF12 contains two plant-homeodomain (PHD) zinc fingers followed by a polybasic region. The PHD fingers function downstream of phosphoinositide signaling triggered by the interaction between polybasic regions and phosphoinositides. This model corresponds to the first PHD finger.	45
277009	cd15534	PHD2_PHF12_Rco1	PHD finger 2 found in PHD finger protein 12 (PHF12), yeast Rco1, and similar proteins. PHF12, also termed PHD factor 1 (Pf1), is a plant homeodomain (PHD) zinc finger-containing protein that bridges the transducin-like enhancer of split (TLE) corepressor to the mSin3A-histone deacetylase (HDAC)-complex, and further represses transcription at targeted genes. PHF12 also interacts with MRG15 (mortality factor-related genes on chromosome 15), a member of the mortality factor (MORF) family of proteins implicated in regulating cellular senescence. PHF12 contains two plant homeodomain (PHD) zinc fingers followed by a polybasic region. The PHD fingers function downstream of phosphoinositide signaling triggered by the interaction between polybasic regions and phosphoinositides. This subfamily also includes yeast transcriptional regulatory protein Rco1 and similar proteins. Rco1 is a component of the Rpd3S histone deacetylase complex that plays an important role at actively transcribed genes. Rco1 contains two PHD fingers, which are required for recognition of nucleosomes methylated at histone H3 lysine 36 (H3K36) by Rpd3S. This model corresponds to the second PHD finger.	47
277010	cd15535	PHD1_Rco1	PHD finger 1 found in Saccharomyces cerevisiae transcriptional regulatory protein Rco1 and similar proteins. Rco1 is a component of the Rpd3S histone deacetylase complex that plays an important role at actively transcribed genes. Rco1 contains two plant homeodomain (PHD) fingers, which are required for recognition of nucleosomes methylated at histone H3 lysine 36 (H3K36) by Rpd3S. This model corresponds to the first PHD finger.	45
277011	cd15536	PHD_PHRF1	PHD finger found in PHD and RING finger domain-containing protein 1 (PHRF1). PHRF1, also termed KIAA1542, or CTD-binding SR-like protein rA9, is a ubiquitin ligase that induces the ubiquitination of TGIF (TG-interacting factor) at lysine 130. It acts as a tumor suppressor that promotes the transforming growth factor (TGF)-beta cytostatic program through selective release of TGIF-driven promyelocytic leukemia protein (PML) inactivation. PHRF1 contains a plant homeodomain (PHD) finger and a RING finger.	46
277012	cd15537	PHD_BS69	PHD finger found in protein BS69. Protein BS69, also termed zinc finger MYND domain-containing protein 11 (ZMYND11 or ZMY11), is a ubiquitously expressed nuclear protein acting as a transcriptional co-repressor in association with various transcription factors. It was originally identified as an adenovirus 5 E1A-binding protein that inhibits E1A transactivation, as well as c-Myb transcription. It also mediates repression, at least in part, through interaction with the co-repressor N-CoR. Moreover, it interacts with Toll-interleukin 1 receptor domain (TIR)-containing adaptor molecule-1 (TICAM-1, also named TRIF) to facilitate NF-kappaB activation and type I IFN induction. It associates with PIAS1, a SUMO E3 enzyme, and Ubc9, a SUMO E2 enzyme, and plays an inhibitory role in muscle and neuronal differentiation. Moreover, BS69 regulates Epstein-Barr virus (EBV) latent membrane protein 1 (LMP1)/C-terminal activation region 2 (CTAR2)-mediated NF-kappaB activation by interfering with the complex formation between TNFR-associated death domain protein (TRADD) and LMP1/CTAR2. It also cooperates with tumor necrosis factor receptor (TNFR)-associated factor 3 (TRAF3) in the regulation of EBV-derived LMP1/CTAR1-induced NF-kappaB activation. Furthermore, BS69 is involved in the p53-p21Cip1-mediated senescence pathway. BS69 contains a plant homeodomain (PHD) finger, a bromodomain, a proline-tryptophan-tryptophan-proline (PWWP) domain, and a Myeloid translocation protein 8, Nervy and DEAF-1 (MYND) domain.	43
277013	cd15538	PHD_PRKCBP1	PHD finger found in protein kinase C-binding protein 1 (PRKCBP1). PRKCBP1, also termed cutaneous T-cell lymphoma-associated antigen se14-3 (CTCL-associated antigen se14-3), or Rack7, or zinc finger MYND domain-containing protein 8 (ZMYND8), is a novel receptor for activated C-kinase (RACK)-like protein that may play an important role in the activation and regulation of PKC-beta I, and the PKC signaling cascade. It also has been identified as a formin homology-2-domain containing protein 1 (FHOD1)-binding protein that may be involved in FHOD1-regulated actin polymerization and transcription. Moreover, PRKCBP1 may function as a REST co-repressor 2 (RCOR2) interacting factor; the RCOR2/ZMYND8 complex might be involved in the regulation of neural differentiation. PRKCBP1 contains a plant homeodomain (PHD) finger, a bromodomain, and a proline-tryptophan-tryptophan-proline (PWWP) domain.	41
277014	cd15539	PHD1_AIRE	PHD finger 1 found in autoimmune regulator (AIRE). AIRE, also termed autoimmune polyendocrinopathy candidiasis ectodermal dystrophy (APECED) protein, functions as a regulator of gene transcription in the thymus. It is essential for prevention of autoimmunity. AIRE plays a critical role in the induction of central tolerance. It promotes self-tolerance through tissue-specific antigen (TSA) expression. It also acts as an active regulator of chondrocyte differentiation. AIRE contains a homogeneously-staining region (HSR) or caspase-recruitment domain (CARD), a nuclear localization signal (NLS), a SAND (for Sp100, AIRE, nuclear phosphoprotein 41/75 or NucP41/75, and deformed epidermal auto regulatory factor 1 or Deaf1) domain, two plant homeodomain (PHD) fingers, and four LXXLL (where L stands for leucine) motifs. This model corresponds to the first PHD finger that recognizes the unmethylated tail of histone H3 and targets AIRE-dependent genes.	43
277015	cd15540	PHD2_AIRE	PHD finger 2 found in autoimmune regulator (AIRE). AIRE, also termed autoimmune polyendocrinopathy candidiasis ectodermal dystrophy (APECED) protein, functions as a regulator of gene transcription in the thymus. It is essential for prevention of autoimmunity. AIRE plays a critical role in the induction of central tolerance. It promotes self-tolerance through tissue-specific antigen (TSA) expression. It also acts as an active regulator of chondrocyte differentiation. AIRE contains a homogeneously-staining region (HSR) or caspase-recruitment domain (CARD), a nuclear localization signal (NLS), a SAND (for Sp100, AIRE, nuclear phosphoprotein 41/75 or NucP41/75, and deformed epidermal auto regulatory factor 1 or Deaf1) domain, two plant homeodomain (PHD) fingers, and four LXXLL (where L stands for leucine) motifs. This model corresponds to the second PHD finger that may play a critical role in the activation of gene transcription.	42
277016	cd15541	PHD_TIF1_like	PHD finger found in the transcriptional intermediary factor 1 (TIF1) family and similar proteins. The TIF1 family of transcriptional cofactors includes TIF1alpha (TRIM24), TIF1beta (TRIM28), TIF1gamma (TRIM33), and TIF1delta (TRIM66), which are characterized by an N-terminal RING-finger B-box coiled-coil (RBCC/TRIM) motif and plant homeodomain (PHD) finger followed by a bromodomain in the C-terminal region. TIF1 proteins couple chromatin modifications to transcriptional regulation, signaling, and tumor suppression. They exert a deacetylase-dependent silencing effect when tethered to a promoter region. TIF1alpha, TIF1beta, and TIF1delta can homodimerize and contain a PXVXL motif necessary and sufficient for heterochromatin protein 1 (HP1) binding. TIF1alpha and TIF1beta bind nuclear receptors and Kruppel-associated boxes (KRAB) specifically and respectively. In contrast, TIF1delta appears to lack nuclear receptor- and KRAB-binding activity. Moreover, TIF1delta is specifically involved in heterochromatin-mediated gene silencing during postmeiotic phases of spermatogenesis. TIF1gamma is structurally closely related to TIF1alpha and TIF1beta, but has very few functional features in common with them. It does not interact with the KRAB silencing domain of KOX1 or the heterochromatin proteins HP1alpha, beta, and gamma. It cannot bind to nuclear receptors (NRs). This family also includes Sp100/Sp140 family proteins, the nuclear body SP100 and SP140. Sp110 is a leukocyte-specific component of the nuclear body. It may function as a nuclear hormone receptor transcriptional coactivator that may play a role in inducing differentiation of myeloid cells. It is also involved in resisting intracellular pathogens and functions as an important drug target for preventing intracellular pathogen diseases, such as tuberculosis, hepatic veno-occlusive disease, and intracellular cancers. SP140 is an interferon inducible nuclear leukocyte-specific protein involved in primary biliary cirrhosis and a risk factor in chronic lymphocytic leukemia. It is also implicated in innate immune response to human immunodeficiency virus type 1 (HIV-1) by binding to the viral infectivity factor (Vif) protein. Both Sp110 and Sp140 contain a SAND domain, a plant homeodomain (PHD) finger, and a bromodomain (BRD).	43
277017	cd15542	PHD_UBR7	PHD finger found in putative E3 ubiquitin-protein ligase UBR7. UBR7, also termed N-recognin-7, is a UBR box-containing protein that belongs to the E3 ubiquitin ligase family that recognizes N-degrons or structurally related molecules for ubiquitin-dependent proteolysis or related processes through the UBR box motif. In addition to the UBR box, UBR7 also harbors a plant homeodomain (PHD) finger. The biochemical properties of UBR7 remain unclear.	54
277018	cd15543	PHD_RSF1	PHD finger found in Remodeling and spacing factor 1 (Rsf-1). Rsf-1, also termed HBV pX-associated protein 8, or Hepatitis B virus X-associated protein alpha (HBxAPalpha), or p325 subunit of RSF chromatin-remodeling complex, is a novel nuclear protein with histone chaperone function. It is a subunit of an ISWI chromatin remodeling complex, remodeling and spacing factor (RSF), and plays a role in mediating ATPase-dependent chromatin remodeling and conferring tumor aggressiveness in common carcinomas. As an ataxia-telangiectasia mutated (ATM)-dependent chromatin remodeler, Rsf-1 facilitates DNA damage checkpoints and homologous recombination repair. It regulates the mitotic spindle checkpoint and chromosome instability through the association with serine/threonine kinase BubR1 (BubR1) and Hepatitis B virus (HBV) X protein (HBx) in the chromatin fraction during mitosis. It also interacts with cyclin E1 and promotes tumor development. Rsf-1 contains a plant homeodomain (PHD) finger.	46
277019	cd15544	PHD_BAZ1A_like	PHD finger found in bromodomain adjacent to zinc finger domain protein BAZ1A and BAZ1B. BAZ1A, also termed ATP-dependent chromatin-remodeling protein, or ATP-utilizing chromatin assembly and remodeling factor 1 (ACF1), or CHRAC subunit ACF1, or Williams syndrome transcription factor-related chromatin-remodeling factor 180 (WCRF180), or WALp1, is a subunit of the conserved imitation switch (ISWI)-family ATP-dependent chromatin assembly and remodeling factor (ACF)/chromatin accessibility complex (CHRAC) chromatin remodeling complex, which is required for DNA replication through heterochromatin. It alters the remodeling properties of the ATPase motor protein sucrose nonfermenting-2 homolog (SNF2H). Moreover, BAZ1A and its complexes play important roles in DNA double-strand break (DSB) repair. It is essential for averting improper gene expression during spermatogenesis. It also regulates transcriptional repression of vitamin D3 receptor-regulated genes. BAZ1B, also termed Tyrosine-protein kinase BAZ1B, or Williams syndrome transcription factor (WSTF), or Williams-Beuren syndrome chromosomal region 10 protein, or Williams-Beuren syndrome chromosomal region 9 protein, or WALp2, is a multifunctional protein implicated in several nuclear processes, including replication, transcription, and the DNA damage response. BAZ1B/WSTF, together with the imitation switch (ISWI) ATPase, forms a WSTF-ISWI chromatin remodeling complex (WICH), which transiently associates with the human inactive X chromosome (Xi) during late S-phase prior to BRCA1 and gamma-H2AX. Moreover, BAZ1B/WSTF, SNF2h, and nuclear myosin 1 (NM1) form the chromatin remodeling complex B-WICH that is involved in regulating rDNA transcription. Both BAZ1A and BAZ1B contain a WAC motif, a DDT domain, BAZ 1 and BAZ 2 motifs, a WAKZ (WSTF/Acf1/KIAA0314/ZK783.4) motif, a plant homeodomain (PHD) finger, and a bromodomain.	46
277020	cd15545	PHD_BAZ2A_like	PHD finger found in bromodomain adjacent to zinc finger domain protein 2A (BAZ2A) and 2B (BAZ2B). BAZ2A, also termed transcription termination factor I-interacting protein 5 (TTF-I-interacting protein 5, or Tip5), or WALp3, is an epigenetic regulator. It has been implicated in epigenetic rRNA gene silencing, as the large subunit of the SNF2h-containing chromatin-remodeling complex NoRC that induces nucleosome sliding in an ATP- and histone H4 tail-dependent fashion. BAZ2A has also been shown to be broadly overexpressed in prostate cancer, to regulate numerous protein-coding genes and to cooperate with EZH2 (enhancer of zeste homolog 2) to maintain epigenetic silencing at genes repressed in prostate cancer metastasis. Its overexpression is tightly associated with a prostate cancer subtype displaying CpG island methylator phenotype (CIMP) in tumors and with prostate cancer recurrence in patients. BAZ2B, also termed WALp4, is a bromodomain-containing protein whose biological role is still elusive. It shows high sequence similarity with BAZ2A. Both BAZ2A and BAZ2B contain a TAM (TIP5/ARBP/MBD) domain, a DDT domain, four AT-hooks, BAZ 1 and BAZ 2 motifs, a WAKZ (WSTF/Acf1/KIAA0314/ZK783.4) motif, a plant homeodomain (PHD) finger, and a bromodomain. BAZ2B also harbors an extra Apolipophorin-III like domain in its N-terminal region.	46
277021	cd15546	PHD_PHF13_like	PHD finger found in PHD finger proteins PHF13 and PHF23. PHF13, also termed survival time-associated PHD finger protein in ovarian cancer 1 (SPOC1), is a novel plant homeodomain (PHD) finger-containing protein that shows strong expression in spermatogonia and ovarian cancer cells, modulates chromatin structure and mitotic chromosome condensation, and is important for proper cell division. It is also required for spermatogonial stem cell differentiation and sustained spermatogenesis. The overexpression of PHF13 associates with unresectable carcinomas and shorter survival in ovarian cancer. PHF23, also termed PHD-containing protein JUNE-1, is a hypothetical protein with a PHD finger. It is encoded by gene PHF23 that acts as a candidate fusion partner for the nucleoporin gene NUP98. The NUP98-PHF23 fusion results from a cryptic translocation t(11;17)(p15;p13) in acute myeloid leukemia (AML).	44
277022	cd15547	PHD_SHPRH	PHD finger found in E3 ubiquitin-protein ligase SHPRH. SHPRH, also termed SNF2, histone-linker, PHD and RING finger domain-containing helicase, belongs to the SWI2/SNF2 family of ATP-dependent chromatin remodeling enzymes, containing the Cys3HisCys4 RING-finger characteristic of E3 ubiquitin ligases. It plays a key role in the error-free branch of DNA damage tolerance. As functional homologs of Saccharomyces cerevisiae Rad5, SHPRH and its closely-related protein, helicase like transcription factor (HLTF), act as ubiquitin ligases that cooperatively mediate Ubc13-Mms2-dependent polyubiquitination of proliferating cell nuclear antigen (PCNA) and maintain genomic stability. SHPRH contains a SNF2 domain, a H1.5 (linker histone H1 and H5) domain, a plant homeodomain (PHD) finger, a Cys3HisCys4 RING-finger, and a C-terminal helicase domain.	47
277023	cd15548	PHD_ASH1L	PHD finger found in histone-lysine N-methyltransferase ASH1L. ASH1L, also termed ASH1-like protein, or absent small and homeotic disks protein 1 homolog, or lysine N-methyltransferase 2H, is a protein belonging to the Trithorax family. It methylates Lys36 of histone H3 independently of transcriptional elongation to promote the establishment of Hox gene expression by counteracting Polycomb silencing. It can suppress interleukin-6 (IL-6), and tumor necrosis factor (TNF) production in Toll-like receptor (TLR)-triggered macrophages, and inflammatory autoimmune diseases by inducing the ubiquitin-editing enzyme A20. ASH1L contains an associated with SET domain (AWS), a SET domain, a post-SET domain, a bromodomain, a bromo-adjacent homology domain (BAH), and a plant homeodomain (PHD) finger.	43
277024	cd15549	PHD_PHF20_like	PHD finger found in PHD finger protein 20 (PHF20) and PHD finger protein 20-like protein 1 (P20L1). PHF20, also termed Glioma-expressed antigen 2, or hepatocellular carcinoma-associated antigen 58, or novel zinc finger protein, or transcription factor TZP (referring to Tudor and zinc finger domain containing protein), is a regulator of NF-kappaB activation by disrupting recruitment of PP2A to p65. It also functions as a transcription factor that binds Akt and plays a role in Akt cell survival/growth signaling. Moreover, it transcriptionally regulates p53. The phosphorylation of PHF20 on Ser291 mediated by protein kinase B (PKB) is essential in tumorigenesis via the regulation of p53 mediated signaling. P20L1 is an active malignant brain tumor (MBT) domain-containing protein that binds to monomethylated lysine 142 on DNA (Cytosine-5) Methyltransferase 1 (DNMT1) (DNMT1K142me1) and colocalizes at the perinucleolar space in a SET7-dependent manner. Its MBT domain reads and controls enzyme levels of methylated DNMT1 in cells, thus representing a novel antagonist of DNMT1 proteasomal degradation. Both PHF20 and PHF20L1 contain an N-terminal MBT domain, two Tudor domains, a plant homeodomain (PHD) finger and the putative DNA-binding domains, AT hook and Cys2His2-type zinc finger.	45
277025	cd15550	PHD_MLL5	PHD finger found in mixed lineage leukemia 5 (MLL5). MLL5 is a histone methyltransferase that plays a key role in hematopoiesis, spermatogenesis and cell cycle progression. It contains a single plant homeodomain (PHD) finger followed by a catalytic SET domain. MLL5 can be recruited to E2F1-responsive promoters to stimulate H3K4 trimethylation and transcriptional activation by binding to the cell cycle regulator host cell factor (HCF-1), thereby facilitating the cell cycle G1 to S phase transition. It is also involved in mitotic fidelity and genomic integrity by modulating the stability of the chromosomal passenger complex (CPC) via the interaction with Borealin. Moreover, MLL5 is a component of a complex associated with retinoic acid receptor that requires GlcNAcylation of its SET domain in order to activate its histone lysine methyltransferase activity. It also participates in the camptothecin (CPT)-induced p53 activation. Furthermore, MLL5 indirectly regulates H3K4 methylation, represses cyclin A2 (CycA) expression, and promotes myogenic differentiation.	44
277026	cd15551	PHD_PYGO	PHD finger found in PYGO proteins. The family includes Drosophila melanogaster protein pygopus (dPYGO) and its two homologs, PYGO1 and PYGO2. dPYGO is a fundamental Wnt signaling transcriptional component in Drosophila. PYGO1 is essential for the association with Legless (Lgs)/Bcl9 that acts as an adaptor between Pygopus (Pygo) and Arm/beta-catenin. dPYGO and PYGO2 function as context-dependent beta-catenin coactivators, and they bind di- and trimethylated lysine 4 of histone H3 (H3K4me2/3). Moreover, PYGO2 acts as a histone methylation reader, and a chromatin remodeler in a testis-specific and Wnt-unrelated manner. It also mediates chromatin regulation and links Wnt signaling and Notch signaling to suppress the luminal/alveolar differentiation competence of mammary stem and basal cells. PYGO2 also plays a new role in rRNA transcription during cancer cell growth. It regulates mammary tumor initiation and heterogeneity in MMTV-Wnt1 mice. All family members contain a plant homeodomain (PHD) finger.	54
277027	cd15552	PHD_PHF3_like	PHD finger found in PHD finger protein 3 (PHF3), and death-inducer obliterator variants Dido1, Dido2, and Dido3. PHF3 is a human homolog of yeast protein bypass of Ess1 (Bye1), a nuclear protein with a domain resembling the central domain in the transcription elongation factor TFIIS. It is ubiquitously expressed in normal tissues including brain, but its expression is significantly reduced or lost in glioblastomas. PHF3 contains an N-terminal plant homeodomain (PHD) finger, a central RNA polymerase II (Pol II)-binding TFIIS-like domain (TLD), and a C-terminal Spen paralogue and orthologue C-terminal (SPOC) domain. This family also includes the Dido gene, which encodes three alternative splicing variants (Dido1, 2, and 3) that have been implicated in a number of cellular processes such as apoptosis and chromosomal segregation, particularly in the hematopoietic system. Dido1 is important for maintaining embryonic stem (ES) cells and directly regulates the expression of pluripotency factors. It is the shortest isoform that contains only a highly conserved PHD finger responsible for the binding of histone H3 with a higher affinity for trimethylated lysine 4 (H3K4me3). Gene Dido1 is a Bone morphogenetic protein (BMP) target gene and promotes BMP-induced melanoma progression. It also triggers apoptosis after nuclear translocation and caspase upregulation. Dido3 is the largest isoform and is ubiquitously expressed in all human tissues. It is dispensable for ES cell self-renewal and pluripotency, but is involved in the maintenance of stem cell genomic stability and tumorigenesis. Dido3 contains a PHD finger, a transcription elongation factor S-II subunit M (TFSIIM) domain, a SPOC module, and a long C-terminal region (CT) of unknown homology.	50
277028	cd15553	PHD_Cfp1	PHD finger found in CXXC-type zinc finger protein 1 (Cfp1). Cfp1, also termed CpG-binding protein, or PHD finger and CXXC domain-containing protein 1 (PCCX1), is a specificity factor that binds to unmethylated CpGs and links H3K4me3 with CpG islands (CGIs). It integrates both promoter CpG content and gene activity for accurate trimethylation of histone H3 Lys 4 (H3K4me3) deposition in embryonic stem cells. Moreover, Cfp1 is an essential component of the SETD1 histone H3K4 methyltransferase complex and functions as a critical regulator of histone methylation, cytosine methylation, cellular differentiation, and vertebrate development. Cfp1 contains a plant homeodomain (PHD) finger, a CXXC domain, and a CpG binding protein zinc finger C-terminal domain. Its CXXC domain selectively binds to non-methylated CpG islands, with a preference for a guanosine nucleotide following the CpG motif.	46
277029	cd15554	PHD_PHF2_like	PHD finger found in PHF2, PHF8 and KDM7. This family includes PHF2, PHF8, KDM7, and similar proteins. PHF2, also termed GRC5, or PHD finger protein 2, is a histone lysine demethylase ubiquitously expressed in various tissues. PHF8, also termed PHD finger protein 8, or KDM7B, is a monomethylated histone H4 lysine 20 (H4K20me1) demethylase that transcriptionally regulates many cell cycle genes. It also preferentially acts on H3K9me2 and H3K9me1. PHF8 is modulated by CDC20-containing anaphase-promoting complex (APC (cdc20)) and plays an important role in the G2/M transition. It acts as a critical molecular sensor for mediating retinoic acid (RA) treatment response in RAR alpha-fusion-induced leukemia. Moreover, PHF8 is essential for cytoskeleton dynamics and is associated with X-linked mental retardation. KDM7, also termed JmjC domain-containing histone demethylation protein 1D (JHDM1D), or KIAA1718, is a dual histone demethylase that catalyzes demethylation of monomethylated and dimethylated H3K9 (H3K9me2/me1) and H3K27 (H3K27me2/me1), which functions as an eraser of silencing marks on chromatin during brain development. It also plays a tumor-suppressive role by regulating angiogenesis. All family members contain a plant homeodomain (PHD) finger and a JmjC domain.	47
277030	cd15555	PHD_KDM2A_2B	PHD finger found in Lysine-specific demethylase KDM2A, KDM2B, and similar proteins. This family includes KDM2A, KDM2B, and F-box and leucine-rich repeat protein 19 (FBXL19). KDM2A is a ubiquitously expressed histone H3 lysine 36 (H3K36) demethylase that has been implicated in gene silencing, cell cycle, cell growth, and cancer development. KDM2B is a ubiquitously expressed histone H3 lysine 4 (H3K4me2) or histone H3 lysine 36 (H3K36me2) demethylase that functions as a regulator of chemokine expression, cellular morphology, and the metabolome of fibroblasts. Both KDM2A and KDM2B belong to the JmjC-domain-containing histone demethylase family. They consist of two Jumonji C (JmjC) domains, and FBXHA and FBXHB domains. A CXXC zinc-finger domain, followed by a plant homeodomain (PHD) finger, is located within the FBXHA domain, and an F-box domain, followed by an antagonist of mitotic exit network protein 1 (AMN1) domain, is located within the FBXHB domain. FBXL19 belongs to the Skp1-Cullin-F-box (SCF) family of E3 ubiquitin ligases. It mediates ubiquitination and interleukin 33 (IL-33)-induced degradation of ST2L receptor in lung epithelia, blocks IL-33-mediated apoptosis, and prevents endotoxin-induced acute lung injury. FBXL19 consists of FBXHA and FBXHB domains, similar to KDM2A and KDM2B.	55
277031	cd15556	PHD_MMD1_like	PHD finger found in Arabidopsis thaliana PHD finger protein MALE MEIOCYTE DEATH 1 (MMD1), PHD finger protein MALE STERILITY 1 (MS1), and similar proteins. MMD1 is a plant homeodomain (PHD) finger protein expressed in male meiocytes. It is encoded by the gene DUET, which is required for male meiotic chromosome organization and progression. MMD1 has been implicated in the regulation of gene expression during meiosis. The mmd1 mutation triggers cell death in male meiocytes. MS1 is a nuclear transcriptional activator that is important for tapetal development and pollen wall biosynthesis. It contains a Leu zipper-like domain and a PHD finger motif, both of which are essential for its function.	46
277032	cd15557	PHD_CBP_p300	PHD finger found in CREB-binding protein (CBP) and histone acetyltransferase p300. This p300/CBP family includes two highly homologous histone acetyltransferases (HATs), CREB-binding protein (CBP) and p300. CBP is also known as KAT3A or CREBBP. It specifically interacts with the phosphorylated form of cyclic adenosine monophosphate-responsive element-binding protein (CREB). p300, also termed KAT3B, or E1A-associated protein p300 (EP300), is a paralog of CBP and is involved in E1A function in cell cycle progression and cellular differentiation. Both CBP and p300 are co-activator proteins that have been implicated in cell cycle regulation, apoptosis, embryonic development, cellular differentiation and cancer. They associate with a number of DNA-binding transcription activators as well as general transcription factors (GTFs), thus mediating recruitment of basal transcription machinery to the promoter. They contain a cysteine-histidine rich region, KIX (CREB interaction) domain, a plant homeodomain (PHD) finger, a HAT domain, followed by a SRC interaction domain.	37
277033	cd15558	PHD_Hop1p_like	PHD finger found in Schizosaccharomyces pombe meiosis-specific protein hop1 (Hop1p) and similar proteins. Fission yeast Hop1p, also termed linear element-associated protein hop1, is an S. pombe homolog of the synaptonemal complex (SC)-associated protein Hop1 in Saccharomyces cerevisiae. In contrast to S. cerevisiae, S. pombe forms thin threads, known as linear elements (LinEs), in meiotic nuclei, instead of a canonical synaptonemal complex. LinEs contain Rec10 protein and are evolutionary relics of SC axial elements. Fission yeast Hop1p is a linear element (LinE)-associated protein. It also associates with Rec10, which plays a role in recruiting the recombination machinery to chromatin. Hop1p contains an N-terminal HORMA (for Hop1p, Rev7p, and MAD2) domain and a C-terminal plant homeodomain (PHD) finger.	47
277034	cd15559	PHD1_BPTF	PHD finger 1 found in bromodomain and PHD finger-containing transcription factor (BPTF). BPTF, also termed nucleosome-remodeling factor subunit BPTF, or fetal Alz-50 clone 1 protein (FAC1), or fetal Alzheimer antigen, functions as a transcriptional regulator that exhibits altered expression and subcellular localization during neuronal development and neurodegenerative diseases such as Alzheimer's disease. It interacts with the human orthologue of the Kelch-like Ech-associated protein (Keap1). Its function and subcellular localization can be regulated by Keap1. Moreover, BPTF is a novel DNA-binding protein that recognizes the DNA sequence CACAACAC and represses transcription through this site in a phosphorylation-dependent manner. Furthermore, BPTF interacts with the Myc-associated zinc finger protein (ZF87/MAZ) and alters its transcriptional activity, which has been implicated in gene regulation in neurodegeneration. Some family members contain two or three plant homeodomain (PHD) fingers, which may be involved in complex formation with histone H3 trimethylated at K4 (H3K4me3). This family corresponds to the first PHD finger.	43
277035	cd15560	PHD2_3_BPTF	PHD finger 2 and 3 found in bromodomain and PHD finger-containing transcription factor (BPTF). BPTF, also termed nucleosome-remodeling factor subunit BPTF, or fetal Alz-50 clone 1 protein (FAC1), or fetal Alzheimer antigen, functions as a transcriptional regulator that exhibits altered expression and subcellular localization during neuronal development and neurodegenerative diseases such as Alzheimer's disease. It interacts with the human orthologue of the Kelch-like Ech-associated protein (Keap1). Its function and subcellular localization can be regulated by Keap1. Moreover, BPTF is a novel DNA-binding protein that recognizes the DNA sequence CACAACAC and represses transcription through this site in a phosphorylation-dependent manner. Furthermore, BPTF interacts with the Myc-associated zinc finger protein (ZF87/MAZ) and alters its transcriptional activity, which has been implicated in gene regulation in neurodegeneration. Some family members contain two or three plant homeodomain (PHD) fingers, which may be involved in complex formation with histone H3 trimethylated at K4 (H3K4me3). This family corresponds to the second and third PHD fingers.	47
277036	cd15561	PHD1_PHF14	PHD finger 1 found in PHD finger protein 14 (PHF14) and similar proteins. PHF14 is a novel nuclear transcription factor that controls the proliferation of mesenchymal cells by directly repressing platelet-derived growth factor receptor-alpha (PDGFRalpha) expression. It also acts as an epigenetic regulator and plays an important role in the development of multiple organs in mammals. PHF14 contains three canonical plant homeodomain (PHD) fingers and a non-canonical extended PHD (ePHD) finger, Cys2HisCys5HisCys2His. It can interact with histones through its PHD fingers. The model corresponds to the first PHD finger.	56
277037	cd15562	PHD2_PHF14	PHD finger 2 found in PHD finger protein 14 (PHF14) and similar proteins. PHF14 is a novel nuclear transcription factor that controls the proliferation of mesenchymal cells by directly repressing platelet-derived growth factor receptor-alpha (PDGFRalpha) expression. It also acts as an epigenetic regulator and plays an important role in the development of multiple organs in mammals. PHF14 contains three canonical plant homeodomain (PHD) fingers and a non-canonical extended PHD (ePHD) finger, Cys2HisCys5HisCys2His. It can interact with histones through its PHD fingers. The model corresponds to the second PHD finger.	50
277038	cd15563	PHD3_PHF14	PHD finger 3 found in PHD finger protein 14 (PHF14) and similar proteins. PHF14 is a novel nuclear transcription factor that controls the proliferation of mesenchymal cells by directly repressing platelet-derived growth factor receptor-alpha (PDGFRalpha) expression. It also acts as an epigenetic regulator and plays an important role in the development of multiple organs in mammals. PHF14 contains three canonical plant homeodomain (PHD) fingers and a non-canonical extended PHD (ePHD) finger, Cys2HisCys5HisCys2His. It can interact with histones through its PHD fingers. The model corresponds to the third PHD finger.	49
277039	cd15564	PHD1_NSD	PHD finger 1 found in nuclear receptor-binding SET domain-containing (NSD) proteins. The nuclear receptor binding SET domain (NSD) protein is a family of three HMTases, NSD1, NSD2/MMSET/WHSC1, and NSD3/WHSC1L1, that are critical in maintaining chromatin integrity. Reducing NSD activity through specific lysine-HMTase inhibitors appears promising to help suppress cancer growth. NSD proteins have specific mono- and dimethylase activities for H3K36, and they play non-redundant roles during development. NSD1 plays a role in several pathologies, including but not limited to Sotos and Weaver syndromes, acute myeloid leukemia, breast cancer, neuroblastoma, and glioblastoma formation. NSD2 is involved in cancer cell proliferation, survival, and tumor growth, by mediating constitutive NF-kappaB signaling via the cytokine autocrine loop. NSD3 is amplified in human breast cancer cell lines. Moreover, translocation resulting in NUP98 fusion to NSD3 leads to the development of acute myeloid leukemia. NSD proteins contain a catalytic suppressor of variegation, enhancer of zeste and trithorax (SET) domain, two proline-tryptophan-tryptophan-proline (PWWP) domains, five plant homeodomain (PHD) fingers, and an NSD-specific Cys-His rich domain (Cys5HisCysHis). This model corresponds to the first PHD finger.	43
277040	cd15565	PHD2_NSD	PHD finger 2 found in nuclear receptor-binding SET domain-containing (NSD) proteins. The nuclear receptor binding SET domain (NSD) protein is a family of three HMTases, NSD1, NSD2/MMSET/WHSC1, and NSD3/WHSC1L1, that are critical in maintaining chromatin integrity. Reducing NSD activity through specific lysine-HMTase inhibitors appears promising to help suppress cancer growth. NSD proteins have specific mono- and dimethylase activities for H3K36, and they play non-redundant roles during development. NSD1 plays a role in several pathologies, including but not limited to Sotos and Weaver syndromes, acute myeloid leukemia, breast cancer, neuroblastoma, and glioblastoma formation. NSD2 is involved in cancer cell proliferation, survival, and tumor growth, by mediating constitutive NF-kappaB signaling via the cytokine autocrine loop. NSD3 is amplified in human breast cancer cell lines. Moreover, translocation resulting in NUP98 fusion to NSD3 leads to the development of acute myeloid leukemia. NSD proteins contain a catalytic suppressor of variegation, enhancer of zeste and trithorax (SET) domain, two proline-tryptophan-tryptophan-proline (PWWP) domains, five plant homeodomain (PHD) fingers, and an NSD-specific Cys-His rich domain (Cys5HisCysHis). This model corresponds to the second PHD finger.	51
277041	cd15566	PHD3_NSD	PHD finger 3 found in nuclear receptor-binding SET domain-containing (NSD) proteins. The nuclear receptor binding SET domain (NSD) protein is a family of three HMTases, NSD1, NSD2/MMSET/WHSC1, and NSD3/WHSC1L1, that are critical in maintaining chromatin integrity. Reducing NSD activity through specific lysine-HMTase inhibitors appears promising to help suppress cancer growth. NSD proteins have specific mono- and dimethylase activities for H3K36, and they play non-redundant roles during development. NSD1 plays a role in several pathologies, including but not limited to Sotos and Weaver syndromes, acute myeloid leukemia, breast cancer, neuroblastoma, and glioblastoma formation. NSD2 is involved in cancer cell proliferation, survival, and tumor growth, by mediating constitutive NF-kappaB signaling via the cytokine autocrine loop. NSD3 is amplified in human breast cancer cell lines. Moreover, translocation resulting in NUP98 fusion to NSD3 leads to the development of acute myeloid leukemia. NSD proteins contain a catalytic suppressor of variegation, enhancer of zeste and trithorax (SET) domain, two proline-tryptophan-tryptophan-proline (PWWP) domains, five plant homeodomain (PHD) fingers, and an NSD-specific Cys-His rich domain (Cys5HisCysHis). This model corresponds to the third PHD finger.	48
277042	cd15567	PHD4_NSD	PHD finger 4 found in nuclear receptor-binding SET domain-containing (NSD) proteins. The nuclear receptor binding SET domain (NSD) protein is a family of three HMTases, NSD1, NSD2/MMSET/WHSC1, and NSD3/WHSC1L1, that are critical in maintaining chromatin integrity. Reducing NSD activity through specific lysine-HMTase inhibitors appears promising to help suppress cancer growth. NSD proteins have specific mono- and dimethylase activities for H3K36, and they play non-redundant roles during development. NSD1 plays a role in several pathologies, including but not limited to Sotos and Weaver syndromes, acute myeloid leukemia, breast cancer, neuroblastoma, and glioblastoma formation. NSD2 is involved in cancer cell proliferation, survival, and tumor growth, by mediating constitutive NF-kappaB signaling via the cytokine autocrine loop. NSD3 is amplified in human breast cancer cell lines. Moreover, translocation resulting in NUP98 fusion to NSD3 leads to development of acute myeloid leukemia. NSD proteins contain a catalytic suppressor of variegation, enhancer of zeste and trithorax (SET) domain, two proline-tryptophan-tryptophan-proline (PWWP) domains, five plant homeodomain (PHD) fingers, and an NSD-specific Cys-His rich domain (Cys5HisCysHis). This model corresponds to the fourth PHD finger.	41
277043	cd15568	PHD5_NSD	PHD finger 5 found in nuclear receptor-binding SET domain-containing (NSD) proteins. The nuclear receptor binding SET domain (NSD) protein is a family of three HMTases, NSD1, NSD2/MMSET/WHSC1, and NSD3/WHSC1L1, that are critical in maintaining chromatin integrity. Reducing NSD activity through specific lysine-HMTase inhibitors appears promising to help suppress cancer growth. NSD proteins have specific mono- and dimethylase activities for H3K36, and they play non-redundant roles during development. NSD1 plays a role in several pathologies, including but not limited to Sotos and Weaver syndromes, acute myeloid leukemia, breast cancer, neuroblastoma, and glioblastoma formation. NSD2 is involved in cancer cell proliferation, survival, and tumor growth, by mediating constitutive NF-kappaB signaling via the cytokine autocrine loop. NSD3 is amplified in human breast cancer cell lines. Moreover, translocation resulting in NUP98 fusion to NSD3 leads to the development of acute myeloid leukemia. NSD proteins contain a catalytic suppressor of variegation, enhancer of zeste and trithorax (SET) domain, two proline-tryptophan-tryptophan-proline (PWWP) domains, five plant homeodomain (PHD) fingers, and an NSD-specific Cys-His rich domain (Cys5HisCysHis). This model corresponds to the fifth PHD finger.	43
277044	cd15569	PHD_RAG2	PHD finger found in V(D)J recombination-activating protein 2 (RAG-2) and similar proteins. RAG-2 is an essential component of the lymphoid-specific recombination activating gene RAG1/2 V(D)J recombinase mediating antigen-receptor gene assembly. It contains an acidic hinge region implicated in histone-binding, a non-canonical plant homeodomain (PHD) finger followed by a C-terminal extension of 40 amino acids that is essential for phosphoinositide (PtdIns)-binding. The PHD finger is a chromatin-binding module that specifically recognizes histone H3 trimethylated at lysine 4 (H3K4me3) and influences V(D)J recombination.	67
277045	cd15570	PHD_Bye1p_SIZ1_like	PHD domain found in Saccharomyces cerevisiae bypass of ESS1 protein 1 (Bye1p), the E3 Sumo Ligase SIZ1, and similar proteins. Yeast Bye1p is a nuclear transcription factor with a domain resembling the central domain in the transcription elongation factor TFIIS and plays an inhibitory role during transcription elongation. It functions as a multicopy suppressor of Ess1, a peptidyl-prolyl cis-trans isomerase involved in proline isomerization of the C-terminal domain (CTD) of RNA polymerase II (Pol II). Bye1p contains an N-terminal plant homeodomain (PHD) finger, a central Pol II-binding TFIIS-like domain (TLD), and a C-terminal Spen paralogue and orthologue C-terminal (SPOC) domain. The PHD domain binds to a histone H3 tail peptide containing trimethylated lysine 4 (H3K4me3). The TLD domain is responsible for the association with chromatin. Plant SIZ1 protein is a SUMO (small ubiquitin-related modifier) E3 ligase that facilitates conjugation of SUMO to substrate target proteins (sumoylation) and belongs to the protein inhibitor of activated STAT (PIAS) protein family. It negatively regulates abscisic acid (ABA) signaling, which is dependent on the bZIP transcription factor ABI5. It also modulates plant growth and plays a role in drought stress response likely through the regulation of gene expression. SIZ1 functions as a floral repressor that not only represses the salicylic acid (SA)-dependent pathway, but also promotes FLOWERING LOCUS C (FLC) expression by repressing FLOWERING LOCUS D (FLD) activity through sumoylation. SIZ1 contains a PHD finger, which specifically binds methylated histone H3 at lysine 4 and arginine 2.	50
277046	cd15571	ePHD	Extended plant homeodomain (PHD) finger, characterized by Cys2HisCys5HisCys2His. PHD finger is also termed LAP (leukemia-associated protein) motif or TTC (trithorax consensus) domain. The extended PHD finger is characterized as Cys2HisCys5HisCys2His, which has been found in a variety of eukaryotic proteins involved in the control of gene transcription and chromatin dynamics. PHD fingers can recognize the unmodified and modified histone H3 tail, and some have been found to interact with non-histone proteins. They also function as epigenome readers controlling gene expression through molecular recruitment of multi-protein complexes of chromatin regulators and transcription factors.	112
277047	cd15572	PHD_BRPF	PHD finger found in bromodomain and PHD finger-containing (BRPF) proteins. The family of BRPF proteins includes BRPF1, BRD1/BRPF2, and BRPF3. They are scaffold proteins that form monocytic leukemic zinc-finger protein (MOZ)/MOZ-related factor (MORF) H3 histone acetyltransferase (HAT) complexes with other regulatory subunits, such as inhibitor of growth 5 (ING5) and Esa1-associated factor 6 ortholog (EAF6). BRPF proteins have multiple domains, including a canonical Cys4HisCys3 plant homeodomain (PHD) zinc finger followed by a non-canonical extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, a bromodomain and a proline-tryptophan-tryptophan-proline (PWWP) domain. PHD and ePHD fingers both bind to lysine 4 of histone H3 (K4H3), bromodomains interact with acetylated lysines on N-terminal tails of histones and other proteins, and PWWP domains show histone-binding and chromatin association properties. This model corresponds to the canonical Cys4HisCys3 PHD finger.	54
277048	cd15573	PHD_JADE	PHD finger found in proteins Jade-1, Jade-2, Jade-3, and similar proteins. This family includes proteins Jade-1 (PHF17), Jade-2 (PHF15), and Jade-3 (PHF16), each of which is required for ING4 and ING5 to associate with histone acetyltransferase (HAT) HBO1 and EAF6 to form a HBO1 complex that has a histone H4-specific acetyltransferase activity, a reduced activity toward histone H3, and is responsible for the bulk of histone H4 acetylation in vivo. This family also contains Drosophila melanogaster PHD finger protein rhinoceros (RNO). It is a novel plant homeodomain (PHD)-containing nuclear protein that may function as a transcription factor that antagonizes Ras signaling by regulating transcription of key EGFR/Ras pathway regulators in the Drosophila eye. All Jade proteins contain a canonical Cys4HisCys3 PHD finger followed by a non-canonical extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, both of which are zinc-binding motifs. This model corresponds to the canonical Cys4HisCys3 PHD finger.	46
277049	cd15574	PHD_AF10_AF17	PHD finger found in protein AF-10 and AF-17. This family includes protein AF-10 and AF-17. AF-10, also termed ALL1 (acute lymphoblastic leukemia)-fused gene from chromosome 10 protein, is a transcription factor encoded by gene AF10, a translocation partner of the MLL (mixed-lineage leukemia) oncogene in leukemia. AF-10 has been implicated in the development of leukemia following chromosomal rearrangements between the AF10 gene and one of at least two other genes, MLL and CALM. It plays a key role in the survival of uncommitted hematopoietic cells. Moreover, AF-10 functions as a follistatin-related gene (FLRG)-interacting protein. The interaction with FLRG enhances AF10-dependent transcription. It interacts with the human counterpart of the yeast Dot1, hDOT1L, and may act as a bridge for the recruitment of hDOT1L to the genes targeted by MLL-AF10. It also interacts with the synovial sarcoma associated SYT protein and may play a role in synovial sarcomas and acute leukemias. AF-17, also termed ALL1-fused gene from chromosome 17 protein, is encoded by gene AF17 that has been identified in hematological malignancies as translocation partners of the mixed lineage leukemia gene MLL. It is a putative transcription factor that may play a role in multiple signaling pathways. It is involved in chromatin-mediated gene regulation mechanisms. It functions as a component of the multi-subunit Dot1 complex (Dotcom) and plays a role in the Wnt/Wingless signaling pathway. It also seems to be a downstream target of the beta-catenin/T-cell factor pathway, and participates in G2-M progression. Moreover, it may function as an important regulator of ENaC-mediated Na+ transport and thus blood pressure. Both AF-10 and AF-17 contain an N-terminal canonical Cys4HisCys3 plant homeodomain (PHD) finger followed by a non-canonical extended PHD (ePHD) finger, Cys2HisCys5HisCys2His. The PHD finger is involved in their homo-oligomerization. In the C-terminal region, they possess a leucine zipper domain and a glutamine-rich region. This family also includes ZFP-1, the Caenorhabditis elegans AF10 homolog. It was originally identified as a factor promoting RNAi interference in C. elegans. It also acts as a Dot1-interacting protein that opposes H2B ubiquitination to reduce polymerase II (Pol II) transcription. This model corresponds to the canonical Cys4HisCys3 PHD finger.	48
277050	cd15575	PHD_JMJD2A	PHD finger found in Jumonji domain-containing protein 2A (JMJD2A). JMJD2A, also termed lysine-specific demethylase 4A (KDM4A), or JmjC domain-containing histone demethylation protein 3A (JHDM3A), catalyzes the demethylation of di- and trimethylated H3K9 and H3K36. It is involved in carcinogenesis and functions as a transcription regulator that may either stimulate or repress gene transcription. It associates with nuclear receptor co-repressor complex or histone deacetylases. Moreover, JMJD2A forms complexes with both the androgen and estrogen receptor (ER) and plays an essential role in growth of both ER-positive and -negative breast tumors. It is also involved in prostate, colon, and lung cancer progression. JMJD2A contains jmjN and jmjC domains in the N-terminal region, followed by a canonical Cys4HisCys3 PHD finger, a non-canonical extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, and a Tudor domain. This model corresponds to the canonical Cys4HisCys3 PHD finger.	100
277051	cd15576	PHD_JMJD2B	PHD finger found in Jumonji domain-containing protein 2B (JMJD2B). JMJD2B, also termed lysine-specific demethylase 4B (KDM4B), or JmjC domain-containing histone demethylation protein 3B (JHDM3B), specifically antagonizes the trimethyl group from H3K9 in pericentric heterochromatin and reduces H3K36 methylation in mammalian cells. It plays an essential role in the growth regulation of cancer cells by modulating the G1-S transition and promotes cell-cycle progression through the regulation of cyclin-dependent kinase 6 (CDK6). It interacts with heat shock protein 90 (Hsp90) and its stability can be regulated by Hsp90. JMJD2B also functions as a direct transcriptional target of p53, which induces its expression through promoter binding. Moreover, JMJD2B expression can be controlled by hypoxia-inducible factor 1alpha (HIF1alpha) in colorectal cancer and estrogen receptor alpha (ERalpha) in breast cancer. It is also involved in bladder, lung, and gastric cancer. JMJD2B contains jmjN and jmjC domains in the N-terminal region, followed by a canonical Cys4HisCys3 PHD finger, a non-canonical extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, and a Tudor domain. This model corresponds to the canonical Cys4HisCys3 PHD finger.	99
277052	cd15577	PHD_JMJD2C	PHD finger found in Jumonji domain-containing protein 2C (JMJD2C). JMJD2C, also termed lysine-specific demethylase 4C (KDM4C), or gene amplified in squamous cell carcinoma 1 protein (GASC-1 protein), or JmjC domain-containing histone demethylation protein 3C (JHDM3C), is an epigenetic factor that catalyzes the demethylation of di- and trimethylated H3K9 and H3K36, and may be involved in the development and/or progression of various types of cancer including esophageal squamous cell carcinoma (ESC) and breast cancer. It selectively interacts with hypoxia-inducible factor 1alpha (HIF1alpha) and plays a role in breast cancer progression. Moreover, JMJD2C may play an important role in the treatment of obesity and its complications through modulating the regulation of adipogenesis by nuclear receptor peroxisome proliferator-activated receptor gamma (PPARgamma). JMJD2C contains jmjN and jmjC domains in the N-terminal region, followed by a canonical Cys4HisCys3 plant homeodomain (PHD) finger, a non-canonical extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, and a Tudor domain. This model corresponds to the canonical Cys4HisCys3 PHD finger.	104
277053	cd15578	PHD1_MTF2	PHD finger 1 found in metal-response element-binding transcription factor 2 (MTF2). MTF2, also termed metal regulatory transcription factor 2, or metal-response element DNA-binding protein M96, or polycomb-like protein 2 (PCL2), complexes with the polycomb repressive complex-2 (PRC2) in embryonic stem cells and regulates the transcriptional networks during embryonic stem cell self-renewal and differentiation. It recruits the PRC2 complex to the inactive X chromosome and target loci in embryonic stem cells. Moreover, MTF2 is required for PRC2-mediated Hox cluster repression. It activates the Cdkn2a gene and promotes cellular senescence, thus suppressing the catalytic activity of PRC2 locally. MTF2 consists of an N-terminal Tudor domain followed by two PHD fingers, and a C-terminal MTF2 domain. This model corresponds to the first PHD finger.	53
277054	cd15579	PHD1_PHF19	PHD finger 1 found in PHD finger protein 19 (PHF19). PHF19, also termed Polycomb-like protein 3 (PCL3), is a component of the polycomb repressive complex 2 (PRC2), which is the major H3K27 methyltransferase that regulates pluripotency, differentiation, and tumorigenesis through catalysis of histone H3 lysine 27 trimethylation (H3K27me3) on chromatin. PHF19 consists of an N-terminal Tudor domain followed by two PHD fingers, and a C-terminal MTF2 domain. It binds trimethylated histone H3 Lys36 (H3K36me3) through its Tudor domain and recruits the PRC2 complex and the H3K36me3 demethylase NO66 to embryonic stem cell genes during differentiation. Moreover, PHF19 and its upstream regulator, Akt, play roles in the phenotype switch of melanoma cells from proliferative to invasive states. This model corresponds to the first PHD finger.	53
277055	cd15580	PHD2_MTF2	PHD finger 2 found in metal-response element-binding transcription factor 2 (MTF2). MTF2, also termed metal regulatory transcription factor 2, or metal-response element DNA-binding protein M96, or Polycomb-like protein 2 (PCL2), complexes with the Polycomb repressive complex-2 (PRC2) in embryonic stem cells and regulates the transcriptional networks during embryonic stem cell self-renewal and differentiation. It recruits the PRC2 complex to the inactive X chromosome and target loci in embryonic stem cells. Moreover, MTF2 is required for PRC2-mediated Hox cluster repression. It activates the Cdkn2a gene and promotes cellular senescence, thus suppressing the catalytic activity of PRC2 locally. MTF2 consists of an N-terminal Tudor domain followed by two plant homeodomain (PHD) fingers, and a C-terminal MTF2 domain. This model corresponds to the second PHD finger.	52
277056	cd15581	PHD2_PHF19	PHD finger 2 found in PHD finger protein 19 (PHF19). PHF19, also termed Polycomb-like protein 3 (PCL3), is a component of the Polycomb repressive complex 2 (PRC2), which is the major H3K27 methyltransferase that regulates pluripotency, differentiation, and tumorigenesis through catalysis of histone H3 lysine 27 trimethylation (H3K27me3) on chromatin. PHF19 consists of an N-terminal Tudor domain followed by two plant homeodomain (PHD) fingers, and a C-terminal MTF2 domain. It binds H3K36me3 through its Tudor domain and recruits the PRC2 complex and the H3K36me3 demethylase NO66 to embryonic stem cell genes during differentiation. Moreover, PHF19 and its upstream regulator, Akt, play roles in the phenotype switch of melanoma cells from proliferative to invasive states. This model corresponds to the second PHD finger.	52
277057	cd15582	PHD2_PHF1	PHD finger 2 found in PHD finger protein 1 (PHF1). PHF1, also termed Polycomb-like protein 1 (PCL1), together with JARID2 and AEBP2, associates with the Polycomb repressive complex 2 (PRC2), which is the major H3K27 methyltransferase that regulates pluripotency, differentiation, and tumorigenesis through catalysis of histone H3 lysine 27 trimethylation (H3K27me3) on chromatin. PHF1 is essential in epigenetic regulation and genome maintenance. It acts as a dual reader of Lysine trimethylation at Lysine 36 of Histone H3 and Lysine 27 of Histone variant H3t. PHF1 consists of an N-terminal Tudor domain followed by two plant homeodomain (PHD) fingers, and a C-terminal MTF2 domain. Its Tudor domain selectively binds to histone H3K36me3. Moreover, PHF1 is required for efficient H3K27me3 and Hox gene silencing. It can mediate deposition of the repressive H3K27me3 mark and acts as a cofactor in early DNA-damage response. This model corresponds to the second PHD finger.	52
277058	cd15583	PHD_ash2p_like	PHD finger found in Schizosaccharomyces pombe Set1 complex component ash2 (spAsh2p) and similar proteins. spAsh2p, also termed Set1C component ash2, or COMPASS component ash2, or complex proteins associated with set1 protein ash2, or Lid2 complex component ash2, or Lid2C component ash2, is orthologous to Drosophila melanogaster Ash2 protein. Both spAsh2p and D. melanogaster Ash2 contain a plant homeodomain (PHD) finger and a SPRY domain. In contrast, its counterpart in Saccharomyces cerevisiae, Bre2p, has no PHD finger and is not included in this family. spAsh2p shows histone H3 Lys4 (H3K4) methyltransferase activity through its PHD finger. It also interacts with Lid2p in S. pombe. Human Ash2L contains an atypical PHD finger that lacks part of the Cys4HisCys3 signature characteristic of PHD fingers; it binds to only one zinc ion through the second half of the motif and does not have histone tail binding activity.	50
277059	cd15584	PHD_ING1_2	PHD finger found in inhibitor of growth protein 1 (ING1) and 2 (ING2). ING1 is an epigenetic regulator and a type II tumor suppressor that impacts cell growth, aging, apoptosis, and DNA repair, by affecting chromatin conformation and gene expression. It acts as a reader of the active chromatin mark, the trimethylation of histone H3 lysine 4 (H3K4me3). It binds and directs Growth arrest and DNA damage inducible protein 45 a (Gadd45a) to target sites, thus linking the histone code with DNA demethylation. It interacts with the proliferating cell nuclear antigen (PCNA) via the PCNA-interacting protein (PIP) domain in a UV-inducible manner. It also interacts with a PCNA-interacting protein, p15 (PAF). Moreover, ING1 associates with members of the 14-3-3 family, which is necessary for cytoplasmic relocalization. Endogenous ING1 protein specifically interacts with the pro-apoptotic BCL2 family member BAX and colocalizes with BAX in a UV-inducible manner. It stabilizes the p53 tumor suppressor by inhibiting polyubiquitination of multi-monoubiquitinated forms via interaction with and colocalization of the herpesvirus-associated ubiquitin-specific protease (HAUSP)-deubiquitinase with p53. It is also involved in trichostatin A-induced apoptosis and caspase 3 signaling in p53-deficient glioblastoma cells. In addition, tyrosine kinase Src can bind and phosphorylate ING1 and further regulates its activity. ING2, also termed inhibitor of growth 1-like protein (ING1Lp), or p32, or p33ING2, belongs to the inhibitor of growth (ING) family of type II tumor suppressors. It is a core component of a multi-factor chromatin-modifying complex containing the transcriptional co-repressor SIN3A and histone deacetylase 1 (HDAC1). It has been implicated in the control of cell cycle, in genome stability, and in muscle differentiation. ING2 independently interacts with H3K4me3 (Histone H3 trimethylated on lysine 4) and PtdIns(5)P, and modulates crosstalk between lysine methylation and lysine acetylation on histone proteins through association with chromatin in the presence of DNA damage. It collaborates with SnoN to mediate transforming growth factor (TGF)-beta-induced Smad-dependent transcription and cellular responses. It is upregulated in colon cancer and increases invasion by enhanced MMP13 expression. It also acts as a cofactor of p300 for p53 acetylation and plays a positive regulatory role during p53-mediated replicative senescence. Both ING1 and ING2 contain an N-terminal ING domain and a C-terminal plant homeodomain (PHD) finger.	45
277060	cd15585	PHD_ING3	PHD finger found in inhibitor of growth protein 3 (ING3) and similar proteins. ING3, also termed p47ING3, is one member of the inhibitor of growth (ING) family of type II tumor suppressors. It is ubiquitously expressed and has been implicated in transcription modulation, cell cycle control, and the induction of apoptosis. It is an important subunit of human NuA4 histone acetyltransferase complex, which regulates the acetylation of histones H2A and H4. Moreover, ING3 promotes ultraviolet (UV)-induced apoptosis through the Fas/caspase-8-dependent pathway in melanoma cells. It physically interacts with subunits of E3 ligase Skp1-Cullin-F-box protein complex (SCF complex) and is degraded by the SCF (F-box protein S-phase kinase-associated protein 2, Skp2)-mediated ubiquitin-proteasome system. It also acts as a suppression factor during tumorigenesis and progression of hepatocellular carcinoma (HCC). ING3 contains an N-terminal ING domain and a C-terminal plant homeodomain (PHD) finger.	45
277061	cd15586	PHD_ING4_5	PHD finger found in inhibitor of growth protein 4 (ING4) and 5 (ING5). ING4, also termed p29ING4, and ING5, also termed p28ING5, belong to the inhibitor of growth (ING) family of type II tumor suppressors. ING4 acts as an E3 ubiquitin ligase to induce ubiquitination of the p65 subunit of NF-kappaB and inhibit the transactivation of NF-kappaB target genes. It also induces apoptosis through a p53 dependent pathway, including increasing p53 acetylation, inhibiting Mdm2-mediated degradation of p53 and enhancing the expression of p53 responsive genes both at the transcriptional and post-translational levels. Moreover, ING4 can inhibit the translation of proto-oncogene MYC by interacting with AUF1. It also regulates other transcription factors, such as hypoxia-inducible factor (HIF). ING5 is a Tip60 cofactor that acetylates p53 at K120 and subsequently activates the expression of p53-dependent apoptotic genes in response to DNA damage. Aberrant ING5 expression may contribute to pathogenesis, growth, and invasion of gastric carcinomas and colorectal cancer. ING5 can physically interact with p300 and p53 in vivo, and its overexpression induces apoptosis in colorectal cancer cells. It also associates with cyclin A1 (INCA1) and functions as a growth suppressor with suppressed expression in Acute Myeloid Leukemia (AML). Moreover, ING5 translocation from the nucleus to the cytoplasm might be a critical event for carcinogenesis and tumor progression in human head and neck squamous cell carcinoma. Both ING4 and ING5 contain an N-terminal ING histone-binding domain and a C-terminal plant homeodomain (PHD) finger. They associate with histone acetyltransferase (HAT) complexes containing MOZ (monocytic leukemia zinc finger protein)/MORF (MOZ-related factor) and HBO1, and further direct the MOZ/MORF and HBO1 complexes to chromatin.	45
277062	cd15587	PHD_Yng1p_like	PHD finger found in yeast orthologs of ING tumor suppressor family. The yeast orthologs of the plant homeodomain (PHD) finger-containing ING tumor suppressor family consist of chromatin modification-related protein YNG1 (Yng1p), YNG2 (Yng2p), and transcriptional regulatory protein PHO23 (Pho23p). Yng1p, also termed ING1 homolog 1, is one of the components of the NuA3 histone acetyltransferase (HAT) complex. Its PHD finger binding to H3 trimethylated at K4 (H3K4me3) promotes NuA3 H3 HAT activity at K14 of H3 on chromatin. Yng2p, also termed ESA1-associated factor 4, or ING1 homolog 2, is a subunit of the NuA4 HAT complex. It plays a critical role in intra-S-phase DNA damage response. Pho23p is part of the Rpd3/Sin3 histone deacetylase (HDAC) complex. It is required for the normal function of Rpd3 in the silencing of rDNA, telomeric, and mating-type loci. Yng1p and Pho23p inhibit p53-dependent transcription. In contrast, Yng2p has the opposite effect. All family members contain an N-terminal ING histone-binding domain and a C-terminal PHD finger.	47
277063	cd15588	PHD1_KMT2A	PHD finger 1 found in histone-lysine N-methyltransferase 2A (KMT2A). KMT2A (also termed ALL-1, CXXC-type zinc finger protein 7, myeloid/lymphoid or mixed-lineage leukemia protein 1 (MLL1), trithorax-like protein (Htrx), or zinc finger protein HRX) is a histone methyltransferase that belongs to the MLL subfamily of H3K4-specific histone lysine methyltransferases (KMT2). It regulates chromatin-mediated transcription through the catalysis of methylation of histone 3 lysine 4 (H3K4), and is frequently rearranged in acute leukemia. KMT2A functions as the catalytic subunit in the MLL1 complex, which also contains WDR5, RbBP5, ASH2L and DPY30 as integral core subunits required for the efficient methylation activity of the complex. The MLL1 complex is highly active and specific for H3K4 methylation. KMT2A contains a CxxC (x for any residue) zinc finger domain, three plant homeodomain (PHD) fingers, a Bromodomain domain, an extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, two FY (phenylalanine tyrosine)-rich domains, and a SET (Suppressor of variegation, Enhancer of zeste, Trithorax) domain. This model corresponds to the first PHD finger.	47
277064	cd15589	PHD1_KMT2B	PHD finger 1 found in Histone-lysine N-methyltransferase 2B (KMT2B). KMT2B, also termed trithorax homolog 2 or WW domain-binding protein 7 (WBP-7), is encoded by the gene that was first named myeloid/lymphoid or mixed-lineage leukemia 2 (MLL2), a second human homolog of Drosophila trithorax, located on chromosome 19. It belongs to the MLL subfamily of H3K4-specific histone lysine methyltransferases (KMT2) and is vital for normal mammalian embryonic development. KMT2B functions as the catalytic subunit in the MLL2 complex, which contains WDR5, RbBP5, ASH2L and DPY30 as integral core subunits required for the efficient methylation activity of the complex. The MLL2 complex is highly active and specific for histone 3 lysine 4 (H3K4) methylation, which stimulates chromatin transcription in a SAM- and H3K4-dependent manner. Moreover, KMT2B plays a critical role in memory formation through mediating hippocampal H3K4 di- and trimethylation. It is also required for RNA polymerase II association and protection from DNA methylation at the MagohB CpG island promoter. KMT2B contains a CxxC (x for any residue) zinc finger domain, three plant homeodomain (PHD) fingers, an extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, two FY (phenylalanine tyrosine)-rich domains, and a SET (Suppressor of variegation, Enhancer of zeste, Trithorax) domain. This model corresponds to the first PHD finger.	47
277065	cd15590	PHD2_KMT2A	PHD finger 2 found in histone-lysine N-methyltransferase 2A (KMT2A). KMT2A (also termed ALL-1, CXXC-type zinc finger protein 7, myeloid/lymphoid or mixed-lineage leukemia protein 1 (MLL1), trithorax-like protein (Htrx), or zinc finger protein HRX) is a histone methyltransferase that belongs to the MLL subfamily of H3K4-specific histone lysine methyltransferases (KMT2). It regulates chromatin-mediated transcription through the catalysis of methylation of histone 3 lysine 4 (H3K4), and is frequently rearranged in acute leukemia. KMT2A functions as the catalytic subunit in the MLL1 complex, which also contains WDR5, RbBP5, ASH2L and DPY30 as integral core subunits required for the efficient methylation activity of the complex. The MLL1 complex is highly active and specific for H3K4 methylation. KMT2A contains a CxxC (x for any residue) zinc finger domain, three plant homeodomain (PHD) fingers, a Bromodomain domain, an extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, two FY (phenylalanine tyrosine)-rich domains, and a SET (Suppressor of variegation, Enhancer of zeste, Trithorax) domain. This model corresponds to the second PHD finger.	50
277066	cd15591	PHD2_KMT2B	PHD finger 2 found in Histone-lysine N-methyltransferase 2B (KMT2B). KMT2B, also termed trithorax homolog 2 or WW domain-binding protein 7 (WBP-7), is encoded by the gene that was first named myeloid/lymphoid or mixed-lineage leukemia 2 (MLL2), a second human homolog of Drosophila trithorax, located on chromosome 19. It belongs to the MLL subfamily of H3K4-specific histone lysine methyltransferases (KMT2) and is vital for normal mammalian embryonic development. KMT2B functions as the catalytic subunit in the MLL2 complex, which contains WDR5, RbBP5, ASH2L and DPY30 as integral core subunits required for the efficient methylation activity of the complex. The MLL2 complex is highly active and specific for histone 3 lysine 4 (H3K4) methylation, which stimulates chromatin transcription in a SAM- and H3K4-dependent manner. Moreover, KMT2B plays a critical role in memory formation through mediating hippocampal H3K4 di- and trimethylation. It is also required for RNA polymerase II association and protection from DNA methylation at the MagohB CpG island promoter. KMT2B contains a CxxC (x for any residue) zinc finger domain, three plant homeodomain (PHD) fingers, an extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, two FY (phenylalanine tyrosine)-rich domains, and a SET (Suppressor of variegation, Enhancer of zeste, Trithorax) domain. This model corresponds to the second PHD finger.	50
277067	cd15592	PHD3_KMT2A	PHD finger 3 found in histone-lysine N-methyltransferase 2A (KMT2A). KMT2A (also termed ALL-1, CXXC-type zinc finger protein 7, myeloid/lymphoid or mixed-lineage leukemia protein 1 (MLL1), trithorax-like protein (Htrx), or zinc finger protein HRX) is a histone methyltransferase that belongs to the MLL subfamily of H3K4-specific histone lysine methyltransferases (KMT2). It regulates chromatin-mediated transcription through the catalysis of methylation of histone 3 lysine 4 (H3K4), and is frequently rearranged in acute leukemia. KMT2A functions as the catalytic subunit in the MLL1 complex, which also contains WDR5, RbBP5, ASH2L and DPY30 as integral core subunits required for the efficient methylation activity of the complex. The MLL1 complex is highly active and specific for H3K4 methylation. KMT2A contains a CxxC (x for any residue) zinc finger domain, three plant homeodomain (PHD) fingers, a Bromodomain domain, an extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, two FY (phenylalanine tyrosine)-rich domains, and a SET (Suppressor of variegation, Enhancer of zeste, Trithorax) domain. This model corresponds to the third PHD finger.	57
277068	cd15593	PHD3_KMT2B	PHD finger 3 found in Histone-lysine N-methyltransferase 2B (KMT2B). KMT2B, also termed trithorax homolog 2 or WW domain-binding protein 7 (WBP-7), is encoded by the gene that was first named myeloid/lymphoid or mixed-lineage leukemia 2 (MLL2), a second human homolog of Drosophila trithorax, located on chromosome 19. It belongs to the MLL subfamily of H3K4-specific histone lysine methyltransferases (KMT2) and is vital for normal mammalian embryonic development. KMT2B functions as the catalytic subunit in the MLL2 complex, which contains WDR5, RbBP5, ASH2L and DPY30 as integral core subunits required for the efficient methylation activity of the complex. The MLL2 complex is highly active and specific for histone 3 lysine 4 (H3K4) methylation, which stimulates chromatin transcription in a SAM- and H3K4-dependent manner. Moreover, KMT2B plays a critical role in memory formation through mediating hippocampal H3K4 di- and trimethylation. It is also required for RNA polymerase II association and protection from DNA methylation at the MagohB CpG island promoter. KMT2B contains a CxxC (x for any residue) zinc finger domain, three plant homeodomain (PHD) fingers, an extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, two FY (phenylalanine tyrosine)-rich domains, and a SET (Suppressor of variegation, Enhancer of zeste, Trithorax) domain. This model corresponds to the third PHD finger.	57
277069	cd15594	PHD2_KMT2C	PHD finger 2 found in Histone-lysine N-methyltransferase 2C (KMT2C). KMT2C, also termed myeloid/lymphoid or mixed-lineage leukemia protein 3 (MLL3) or homologous to ALR protein, is a histone H3 lysine 4 (H3K4) lysine methyltransferase that functions as a circadian factor contributing to genome-scale circadian transcription. It is a component of a large complex that acts as a coactivator of multiple transcription factors, including the bile acid (BA)-activated nuclear receptor, farnesoid X receptor (FXR), a critical player in BA homeostasis. The MLL3 complex is essential for p53 transactivation of small heterodimer partner (SHP). KMT2C is also a part of activating signal cointegrator-2 (ASC-2)-containing complex (ASCOM) that contains the transcriptional coactivator nuclear receptor coactivator 6 (NCOA6), KMT2C and its paralog MLL4. The ASCOM complex is critical for nuclear receptor (NR) activation of bile acid transporter genes and is down regulated in cholestasis. KMT2C contains several plant homeodomain (PHD) fingers, two extended PHD (ePHD) fingers, Cys2HisCys5HisCys2His, an ATPase alpha beta signature, a high mobility group (HMG)-1 box, a SET (Suppressor of variegation, Enhancer of zeste, Trithorax) domain and two FY (phenylalanine tyrosine)-rich domains. This model corresponds to the second PHD finger.	46
277070	cd15595	PHD2_KMT2D	PHD finger 2 found in Histone-lysine N-methyltransferase 2D (KMT2D). KMT2D, also termed ALL1-related protein (ALR), is encoded by the gene that was named myeloid/lymphoid or mixed-lineage leukemia 4 (MLL4), a fourth human homolog of Drosophila trithorax, located on chromosome 12. KMT2D enzymatically generates trimethylated histone H3 Lys 4 (H3K4me3). It plays an essential role in differentiating the human pluripotent embryonal carcinoma cell line NTERA-2 clone D1 (NT2/D1) stem cells by activating differentiation-specific genes, such as HOXA1-3 and NESTIN. It is also a part of activating signal cointegrator-2 (ASC-2)-containing complex (ASCOM) that contains the transcriptional coactivator nuclear receptor coactivator 6 (NCOA6), KMT2C and KMT2D. The ASCOM complex is critical for nuclear receptor (NR) activation of bile acid transporter genes and is down regulated in cholestasis. KMT2D contains the catalytic domain SET, five plant homeodomain (PHD) fingers, two extended PHD (ePHD) fingers, Cys2HisCys5HisCys2His, a RING finger, an HMG (high-mobility group)-binding motif, and two FY-rich regions. This model corresponds to the second PHD finger.	46
277071	cd15596	PHD4_KMT2C	PHD finger 4 found in Histone-lysine N-methyltransferase 2C (KMT2C). KMT2C, also termed myeloid/lymphoid or mixed-lineage leukemia protein 3 (MLL3) or homologous to ALR protein, is a histone H3 lysine 4 (H3K4) lysine methyltransferase that functions as a circadian factor contributing to genome-scale circadian transcription. It is a component of a large complex that acts as a coactivator of multiple transcription factors, including the bile acid (BA)-activated nuclear receptor, farnesoid X receptor (FXR), a critical player in BA homeostasis. The MLL3 complex is essential for p53 transactivation of small heterodimer partner (SHP). KMT2C is also a part of activating signal cointegrator-2 (ASC-2)-containing complex (ASCOM) that contains the transcriptional coactivator nuclear receptor coactivator 6 (NCOA6), KMT2C and its paralog MLL4. The ASCOM complex is critical for nuclear receptor (NR) activation of bile acid transporter genes and is down regulated in cholestasis. KMT2C contains several plant homeodomain (PHD) fingers, two extended PHD (ePHD) fingers, Cys2HisCys5HisCys2His, an ATPase alpha beta signature, a high mobility group (HMG)-1 box, a SET (Suppressor of variegation, Enhancer of zeste, Trithorax) domain and two FY (phenylalanine tyrosine)-rich domains. This model corresponds to the fourth PHD finger.	57
277072	cd15597	PHD3_KMT2D	PHD finger 3 found in Histone-lysine N-methyltransferase 2D (KMT2D). KMT2D, also termed ALL1-related protein (ALR), is encoded by the gene that was named myeloid/lymphoid or mixed-lineage leukemia 4 (MLL4), a fourth human homolog of Drosophila trithorax, located on chromosome 12. KMT2D enzymatically generates trimethylated histone H3 Lys 4 (H3K4me3). It plays an essential role in differentiating the human pluripotent embryonal carcinoma cell line NTERA-2 clone D1 (NT2/D1) stem cells by activating differentiation-specific genes, such as HOXA1-3 and NESTIN. It is also a part of activating signal cointegrator-2 (ASC-2)-containing complex (ASCOM) that contains the transcriptional coactivator nuclear receptor coactivator 6 (NCOA6), KMT2C and KMT2D. The ASCOM complex is critical for nuclear receptor (NR) activation of bile acid transporter genes and is down regulated in cholestasis. KMT2D contains the catalytic domain SET, five plant homeodomain (PHD) fingers, two extended PHD (ePHD) fingers, Cys2HisCys5HisCys2His, a RING finger, an HMG (high-mobility group)-binding motif, and two FY-rich regions. This model corresponds to the third PHD finger.	51
277073	cd15600	PHD6_KMT2C	PHD finger 6 found in Histone-lysine N-methyltransferase 2C (KMT2C). KMT2C, also termed myeloid/lymphoid or mixed-lineage leukemia protein 3 (MLL3), or homologous to ALR protein, is a histone H3 lysine 4 (H3K4) lysine methyltransferase that functions as a circadian factor contributing to genome-scale circadian transcription. It is a component of a large complex that acts as a coactivator of multiple transcription factors, including the bile acid (BA)-activated nuclear receptor, farnesoid X receptor (FXR), a critical player in BA homeostasis. The MLL3 complex is essential for p53 transactivation of small heterodimer partner (SHP). KMT2C is also a part of activating signal cointegrator-2 (ASC-2)-containing complex (ASCOM) that contains the transcriptional coactivator nuclear receptor coactivator 6 (NCOA6), KMT2C and its paralog MLL4. The ASCOM complex is critical for nuclear receptor (NR) activation of bile acid transporter genes and is down regulated in cholestasis. KMT2C contains several plant homeodomain (PHD) fingers, two extended PHD (ePHD) fingers, Cys2HisCys5HisCys2His, an ATPase alpha beta signature, a high mobility group (HMG)-1 box, a SET (Suppressor of variegation, Enhancer of zeste, Trithorax) domain and two FY (phenylalanine tyrosine)-rich domains. This model corresponds to the sixth PHD finger.	51
277074	cd15601	PHD5_KMT2D	PHD finger 5 found in Histone-lysine N-methyltransferase 2D (KMT2D). KMT2D, also termed ALL1-related protein (ALR), is encoded by the gene that was named myeloid/lymphoid or mixed-lineage leukemia 4 (MLL4), a fourth human homolog of Drosophila trithorax, located on chromosome 12. KMT2D enzymatically generates trimethylated histone H3 Lys 4 (H3K4me3). It plays an essential role in differentiating the human pluripotent embryonal carcinoma cell line NTERA-2 clone D1 (NT2/D1) stem cells by activating differentiation-specific genes, such as HOXA1-3 and NESTIN. It is also a part of activating signal cointegrator-2 (ASC-2)-containing complex (ASCOM) that contains the transcriptional coactivator nuclear receptor coactivator 6 (NCOA6), KMT2C and KMT2D. The ASCOM complex is critical for nuclear receptor (NR) activation of bile acid transporter genes and is downregulated in cholestasis. KMT2D contains the catalytic domain SET, five plant homeodomain (PHD) fingers, two extended PHD (ePHD) fingers, Cys2HisCys5HisCys2His, a RING finger, an HMG (high-mobility group)-binding motif, and two FY-rich regions. This model corresponds to the fifth PHD finger.	51
277075	cd15602	PHD1_KDM5A	PHD finger 1 found in Lysine-specific demethylase 5A (KDM5A). KDM5A (also termed Histone demethylase JARID1A, Jumonji/ARID domain-containing protein 1A, or Retinoblastoma-binding protein 2 (RBBP-2 or RBP2)) was originally identified as a retinoblastoma protein (Rb)-binding partner and its inactivation may be important for Rb to promote differentiation. It is involved in transcription through interacting with TBP, p107, nuclear receptors, Myc, Sin3/HDAC, Mad1, RBP-J, CLOCK and BMAL1. KDM5A functions as a trimethylated histone H3 lysine 4 (H3K4me3) demethylase that belongs to the JARID subfamily within the JmjC proteins. It also displays DNA-binding activities that can recognize the specific DNA sequence CCGCCC. KDM5A contains the catalytic JmjC domain, JmjN, the BRIGHT domain, which is an AT-rich interacting domain (ARID), and a Cys5HisCys2 zinc finger, as well as three plant homeodomain (PHD) fingers. This model corresponds to the first PHD finger.	49
277076	cd15603	PHD1_KDM5B	PHD finger 1 found in lysine-specific demethylase 5B (KDM5B). KDM5B (also termed Cancer/testis antigen 31 (CT31), Histone demethylase JARID1B, Jumonji/ARID domain-containing protein 1B (JARID1B), PLU-1, or retinoblastoma-binding protein 2 homolog 1 (RBP2-H1 or RBBP2H1A)) is a member of the JARID subfamily within the JmjC proteins. It has a restricted expression pattern in the testis, ovary, and transiently in the mammary gland of pregnant females and has been shown to be upregulated in breast cancer, prostate cancer, and lung cancer, suggesting a potential role in tumorigenesis. KDM5B acts as a histone demethylase that catalyzes the removal of trimethylation of lysine 4 on histone H3 (H3K4me3), induced by polychlorinated biphenyls (PCBs). It also mediates demethylation of H3K4me2 and H3K4me1. Moreover, KDM5B functions as a negative regulator of hematopoietic stem cell (HSC) self-renewal and progenitor cell activity. KDM5B has also been shown to interact with the DNA binding transcription factors BF-1 and PAX9, as well as TIEG1/KLF10 (transforming growth factor-beta inducible early gene-1/Kruppel-like transcription factor 10), and possibly function as a transcriptional corepressor. KDM5B contains the catalytic JmjC domain, JmjN, the BRIGHT domain, which is an AT-rich interacting domain (ARID), and a Cys5HisCys2 zinc finger, as well as three plant homeodomain (PHD) fingers. This model corresponds to the first PHD finger.	46
277077	cd15604	PHD1_KDM5C_5D	PHD finger 1 found in Lysine-specific demethylase 5C (KDM5C) and 5D (KDM5D). The family includes KDM5C and KDM5D, both of which belong to the JARID subfamily within the JmjC proteins. KDM5C (also termed Histone demethylase JARID1C, Jumonji/ARID domain-containing protein 1C, SmcX, or Xe169) is a H3K4 trimethyl-histone demethylase that catalyzes demethylation of H3K4me3 and H3K4me2 to H3K4me1. It plays a role in neuronal survival and dendrite development. KDM5C defects are associated with X-linked mental retardation (XLMR). KDM5D (also termed Histocompatibility Y antigen (H-Y), Histone demethylase JARID1D, Jumonji/ARID domain-containing protein 1D, or SmcY) is a male-specific antigen that shows a demethylase activity specific for di- and tri-methylated histone H3K4 (H3K4me3 and H3K4me2), and has a male-specific function as a histone H3K4 demethylase by recruiting a meiosis-regulatory protein, MSH5, to condensed DNA. KDM5D directly interacts with a polycomb-like protein Ring6a/MBLR, and plays a role in regulation of transcriptional initiation through H3K4 demethylation. Both KDM5C and KDM5D contain the catalytic JmjC domain, JmjN, the BRIGHT domain, which is an AT-rich interacting domain (ARID), and a Cys5HisCys2 zinc finger, as well as two plant homeodomain (PHD) fingers. This model corresponds to the first PHD finger.	46
277078	cd15605	PHD1_Lid_like	PHD finger 1 found in Drosophila melanogaster protein little imaginal discs (Lid) and similar proteins. Drosophila melanogaster Lid, also termed Retinoblastoma-binding protein 2 homolog, is identified genetically as a trithorax group (trxG) protein that is a Drosophila homolog of the human protein JARID1A/kdm5A, a member of the JARID subfamily within the JmjC proteins. Lid functions as a JmjC-dependent trimethyl histone H3K4 (H3K4me3) demethylase, which is required for dMyc-induced cell growth. It positively regulates Hox gene expression in S2 cells. Lid contains the catalytic JmjC domain, JmjN, the BRIGHT domain, which is an AT-rich interacting domain (ARID), and a Cys5HisCys2 zinc finger, as well as three plant homeodomain (PHD) fingers. This model corresponds to the first PHD finger of Lid.	46
277079	cd15606	PHD2_KDM5A	PHD finger 2 found in Lysine-specific demethylase 5A (KDM5A). KDM5A (also termed Histone demethylase JARID1A, Jumonji/ARID domain-containing protein 1A, or Retinoblastoma-binding protein 2 (RBBP-2 or RBP2)) was originally identified as a retinoblastoma protein (Rb)-binding partner and its inactivation may be important for Rb to promote differentiation. It is involved in transcription through interacting with TBP, p107, nuclear receptors, Myc, Sin3/HDAC, Mad1, RBP-J, CLOCK, and BMAL1. KDM5A functions as a trimethylated histone H3 lysine 4 (H3K4me3) demethylase that belongs to the JARID subfamily within the JmjC proteins. It also displays DNA-binding activities that can recognize the specific DNA sequence CCGCCC. KDM5A contains the catalytic JmjC domain, JmjN, the BRIGHT domain, which is an AT-rich interacting domain (ARID), and a Cys5HisCys2 zinc finger, as well as three plant homeodomain (PHD) fingers. This model corresponds to the second PHD finger.	56
277080	cd15607	PHD2_KDM5B	PHD finger 2 found in lysine-specific demethylase 5B (KDM5B). KDM5B (also termed Cancer/testis antigen 31 (CT31), Histone demethylase JARID1B, Jumonji/ARID domain-containing protein 1B (JARID1B), retinoblastoma-binding protein 2 homolog 1 (RBP2-H1 or RBBP2H1A), or PLU-1) is a member of the JARID subfamily within the JmjC proteins. It has a restricted expression pattern in the testis, ovary, and transiently in the mammary gland of the pregnant female and has been shown to be upregulated in breast cancer, prostate cancer, and lung cancer, suggesting a potential role in tumorigenesis. KDM5B acts as a histone demethylase that catalyzes the removal of trimethylation of lysine 4 on histone H3 (H3K4me3), induced by polychlorinated biphenyls (PCBs). It also mediates demethylation of H3K4me2 and H3K4me1. Moreover, KDM5B functions as a negative regulator of hematopoietic stem cell (HSC) self-renewal and progenitor cell activity. KDM5B has also been shown to interact with the DNA binding transcription factors BF-1 and PAX9, as well as TIEG1/KLF10 (transforming growth factor-beta inducible early gene-1/Kruppel-like transcription factor 10), and possibly function as a transcriptional corepressor. KDM5B contains the catalytic JmjC domain, JmjN, the BRIGHT domain, which is an AT-rich interacting domain (ARID), and a Cys5HisCys2 zinc finger, as well as three plant homeodomain (PHD) fingers. This model corresponds to the second PHD finger.	44
277081	cd15608	PHD2_KDM5C_5D	PHD finger 2 found in Lysine-specific demethylase 5C (KDM5C) and 5D (KDM5D). The family includes KDM5C and KDM5D, both of which belong to the JARID subfamily within the JmjC proteins. KDM5C (also termed Histone demethylase JARID1C, Jumonji/ARID domain-containing protein 1C, SmcX, or Xe169) is a H3K4 trimethyl-histone demethylase that catalyzes demethylation of H3K4me3 and H3K4me2 to H3K4me1. It plays a role in neuronal survival and dendrite development. KDM5C defects are associated with X-linked mental retardation (XLMR). KDM5D (also termed Histocompatibility Y antigen (H-Y), Histone demethylase JARID1D, Jumonji/ARID domain-containing protein 1D, or SmcY) is a male-specific antigen that shows a demethylase activity specific for di- and tri-methylated histone H3K4 (H3K4me3 and H3K4me2), and has a male-specific function as a histone H3K4 demethylase by recruiting a meiosis-regulatory protein, MSH5, to condensed DNA. KDM5D directly interacts with a polycomb-like protein Ring6a/MBLR, and plays a role in regulation of transcriptional initiation through H3K4 demethylation. Both KDM5C and KDM5D contain the catalytic JmjC domain, JmjN, the BRIGHT domain, which is an AT-rich interacting domain (ARID), and a Cys5HisCys2 zinc finger, as well as two plant homeodomain (PHD) fingers. This model corresponds to the second PHD finger.	58
277082	cd15609	PHD_TCF19	PHD finger found in Transcription factor 19 (TCF-19) and similar proteins. TCF-19, also termed transcription factor SC1, was identified as a putative trans-activating factor with expression beginning at the late G1-S boundary in dividing cells. It also functions as a novel islet factor necessary for proliferation and survival in the INS-1 beta cell line. It plays an important role in susceptibility to both Type 1 Diabetes Mellitus (T1DM) and Type 2 Diabetes Mellitus (T2DM); it has been suggested that it may positively impact beta cell mass under conditions of beta cell stress and increased insulin demand. TCF-19 contains an N-terminal fork head association domain (FHA), a proline rich region, and a C-terminal plant homeodomain (PHD) finger. The FHA domain may serve as a nuclear signaling domain or as a phosphoprotein binding domain. The proline rich region is a common characteristic of trans-activating factors. The PHD finger may allow TCF-19 to interact with chromatin via methylated histone H3.	50
277083	cd15610	PHD3_KDM5A_like	PHD finger 3 found in Lysine-specific demethylase 5A (KDM5A), 5B (KDM5B), and similar proteins. The family includes KDM5A and KDM5B, both of which belong to the JARID subfamily within the JmjC proteins. KDM5A, also termed Histone demethylase JARID1A, or Jumonji/ARID domain-containing protein 1A, or Retinoblastoma-binding protein 2 (RBBP-2 or RBP2), was originally identified as a retinoblastoma protein (Rb)-binding partner and its inactivation may be important for Rb to promote differentiation. It is involved in transcription through interacting with TBP, p107, nuclear receptors, Myc, Sin3/HDAC, Mad1, RBP-J, CLOCK and BMAL1. KDM5A functions as the trimethylated histone H3 lysine 4 (H3K4me3) demethylase. It also displays DNA-binding activities that can recognize the specific DNA sequence CCGCCC. KDM5B, also termed Cancer/testis antigen 31 (CT31), or Histone demethylase JARID1B, or Jumonji/ARID domain-containing protein 1B (JARID1B), or PLU-1, or retinoblastoma-binding protein 2 homolog 1 (RBP2-H1 or RBBP2H1A), has a restricted expression pattern in the testis, ovary, and transiently in the mammary gland of the pregnant female and has been shown to be upregulated in breast cancer, prostate cancer, and lung cancer, suggesting a potential role in tumorigenesis. KDM5B acts as a histone demethylase that catalyzes the removal of trimethylation of lysine 4 on histone H3 (H3K4me3), induced by polychlorinated biphenyls (PCBs). It also mediates demethylation of H3K4me2 and H3K4me1. Moreover, KDM5B functions as a negative regulator of hematopoietic stem cell (HSC) self-renewal and progenitor cell activity. KDM5B has also been shown to interact with the DNA binding transcription factors BF-1 and PAX9, as well as TIEG1/KLF10 (transforming growth factor-beta inducible early gene-1/Kruppel-like transcription factor 10), and possibly function as a transcriptional corepressor. The family also includes the Drosophila melanogaster protein little imaginal discs (Lid) that functions as a JmjC-dependent trimethyl histone H3K4 (H3K4me3) demethylase, which is required for dMyc-induced cell growth. It positively regulates Hox gene expression in S2 cells. Members in this family contain the catalytic JmjC domain, JmjN, the BRIGHT domain, which is an AT-rich interacting domain (ARID), and a Cys5HisCys2 zinc finger, as well as three plant homeodomain (PHD) fingers. This model corresponds to the third PHD finger.	50
277084	cd15612	PHD_OBE1_like	PHD finger found in Arabidopsis thaliana protein OBERON 1, OBERON 2, and similar proteins mainly found in plants. Included in this family are OBERON 1 (OBE1, or potyvirus VPg-interacting protein 2) and OBERON 2 (OBE2, or potyvirus VPg-interacting protein 1), which have been involved in the maintenance and/or establishment of the meristems in Arabidopsis. They interact with potyvirus VPg-interacting proteins (PVIP1 and 2) and act as central regulators in auxin-mediated control of development. Both OBE1 and OBE2 contain a plant homeodomain (PHD) finger. PHD fingers can recognize the unmodified and modified histone H3 tail, and some have been found to interact with non-histone proteins.	60
277085	cd15613	PHD_AL_plant	PHD finger found in plant Alfin1-like (AL) proteins. AL proteins are ubiquitously expressed nuclear proteins existing only in plants. They are involved in chromatin regulation by binding to tri- and dimethylated histone H3 at lysine 4 (H3K4me3/2), the active histone markers, through their plant homeodomain (PHD) fingers.	51
277086	cd15614	PHD_HAC_like	PHD finger found in Arabidopsis thaliana histone acetyltransferases (HATs) HAC and similar proteins. This family includes A. thaliana HACs (HAC1/2/4/5/12), which are histone acetyltransferases of the p300/CREB-binding protein (CBP) co-activator family. CBP-type HAT proteins are also found in animals, but absent in fungi. The domain architecture of CBP-type HAT proteins differs between plants and animals. Members in this family contain an N-terminal partially conserved KIX domain, a Zf-TAZ domain, a cysteine-rich CBP-type HAT domain that harbors a plant homeodomain (PHD) finger, a Zf-ZZ domain, and a Zf-TAZ domain. PHD fingers can recognize the unmodified and modified histone H3 tail, and some have been found to interact with non-histone proteins.	73
277087	cd15615	PHD_ARID4_like	PHD finger found in Arabidopsis thaliana AT-rich interactive domain-containing protein 4 (ARID4) and similar proteins. This family includes A. thaliana ARID4 (ARID domain-containing protein 4) and similar proteins. Their biological roles remain unclear, but they all contain an AT-rich interactive domain (ARID) and a plant homeodomain (PHD) finger at the C-terminus. ARID is a helix-turn-helix motif-based DNA-binding domain conserved in all eukaryotes. PHD fingers can recognize the unmodified and modified histone H3 tail, and some have been found to interact with non-histone proteins.	57
277088	cd15616	PHD_UHRF1	PHD finger found in ubiquitin-like PHD and RING finger domain-containing protein 1 (UHRF1). UHRF1 (also termed inverted CCAAT box-binding protein of 90 kDa, nuclear protein 95, nuclear zinc finger protein Np95 (Np95), RING finger protein 106, transcription factor ICBP90, or E3 ubiquitin-protein ligase UHRF1) is a unique chromatin effector protein that integrates the recognition of both histone PTMs and DNA methylation. It is essential for cell proliferation and plays a critical role in the development and progression of many human carcinomas, such as laryngeal squamous cell carcinoma (LSCC), gastric cancer (GC), esophageal squamous cell carcinoma (ESCC), colorectal cancer, prostate cancer, and breast cancer. UHRF1 acts as a transcriptional repressor through its binding to histone H3 when it is unmodified at Arg2. Its overexpression in human lung fibroblasts results in downregulation of expression of the tumour suppressor pRB. It also plays a role in transcriptional repression of the cell cycle regulator p21. Moreover, UHRF1-dependent repression of transcription factors can facilitate the G1-S transition. It interacts with Tat-interacting protein of 60 kDa (TIP60) and induces degradation-independent ubiquitination of TIP60. It is also an N-methylpurine DNA glycosylase (MPG)-interacting protein that binds MPG in a p53 status-independent manner in the DNA base excision repair (BER) pathway. In addition, UHRF1 functions as an epigenetic regulator that is important for multiple aspects of epigenetic regulation, including maintenance of DNA methylation patterns and recognition of various histone modifications. UHRF1 contains an N-terminal ubiquitin-like domain (UBL), a tandem Tudor domain (TTD), a plant homeodomain (PHD) finger, a SET and RING finger associated (SRA) domain, and a C-terminal RING-finger domain. 
It specifically binds to hemimethylated DNA, double-stranded CpG dinucleotides, and recruits the maintenance methyltransferase DNMT1 to its hemimethylated DNA substrate through its SRA domain. UHRF1-dependent H3K23 ubiquitylation has an essential role in maintaining DNA methylation and replication. The tandem Tudor domain directs UHRF1 binding to the heterochromatin mark histone H3K9me3 and the PHD finger targets UHRF1 to unmodified histone H3 in euchromatic regions. The RING-finger domain exhibits both autocatalytic E3 ubiquitin (Ub) ligase activity and activity against histone H3 and DNMT1.	47
277089	cd15617	PHD_UHRF2	PHD finger found in ubiquitin-like PHD and RING finger domain-containing protein 2 (UHRF2). UHRF2 (also termed Np95/ICBP90-like RING finger protein (NIRF), Np95-like RING finger protein, nuclear protein 97, nuclear zinc finger protein Np97, RING finger protein 107, or E3 ubiquitin-protein ligase UHRF2) was originally identified as a ubiquitin ligase acting as a small ubiquitin-like modifier (SUMO) E3 ligase that enhances zinc finger protein 131 (ZNF131) SUMOylation but does not enhance ZNF131 ubiquitination. It also ubiquitinates PCNP, a PEST-containing nuclear protein. Moreover, UHRF2 functions as a nuclear protein involved in cell-cycle regulation and has been implicated in tumorigenesis. It interacts with cyclins, CDKs, p53, pRB, PCNA, HDAC1, DNMTs, G9a, methylated histone H3 lysine 9, and methylated DNA. It interacts with the cyclin E-CDK2 complex, ubiquitinates cyclins D1 and E1, induces G1 arrest, and is involved in the G1/S transition regulation. Furthermore, UHRF2 is a direct transcriptional target of the transcription factor E2F-1 in the induction of apoptosis. It recruits HDAC1 and binds to methyl-CpG. UHRF2 also participates in the maturation of Hepatitis B virus (HBV) by interacting with the HBV core protein and promoting its degradation. UHRF2 contains an N-terminal ubiquitin-like domain (UBL), a tandem Tudor domain (TTD), a plant homeodomain (PHD) finger, a SET- and RING-associated (SRA) domain, and a C-terminal RING finger.	47
277090	cd15618	PHD1_MOZ_MORF	PHD finger 1 found in monocytic leukemia zinc-finger protein (MOZ) and its factor (MORF). MOZ (also termed histone acetyltransferase KAT6A, YBF2/SAS3, SAS2 and TIP60 protein 3 (MYST-3), runt-related transcription factor-binding protein 2, or zinc finger protein 220) is a MYST-type histone acetyltransferase (HAT) that functions as a coactivator for acute myeloid leukemia 1 protein (AML1)- and p53-dependent transcription. It possesses intrinsic HAT activity to acetylate both itself and lysine (K) residues on histone H2B, histone H3 (K14) and histone H4 (K5, K8, K12 and K16) in vitro and H3K9 in vivo. MOZ-related factor (MORF), also termed MOZ2, or histone acetyltransferase KAT6B, or MOZ, YBF2/SAS3, SAS2 and TIP60 protein 4 (MYST4), is a ubiquitously expressed transcriptional regulator with intrinsic HAT activity. It can interact with the Runt-domain transcription factor Runx2 and form a tetrameric complex with BRPFs, ING5, and EAF6. Both MOZ and MORF are catalytic subunits of HAT complexes that are required for normal developmental programs, such as hematopoiesis, neurogenesis, and skeletogenesis, and are also implicated in human leukemias. MOZ is also the catalytic subunit of a tetrameric inhibitor of growth 5 (ING5) complex, which specifically acetylates nucleosomal histone H3K14. Moreover, MOZ and MORF are involved in regulating transcriptional activation mediated by Runx2 (or Cbfa1), a Runt-domain transcription factor known to play important roles in T cell lymphomagenesis and bone development, and its homologs. MOZ contains a linker histone 1 and histone 5 domains and two plant homeodomain (PHD) fingers. In contrast, MORF contains an N-terminal region containing two PHD fingers, a putative HAT domain, an acidic region, and a C-terminal Ser/Met-rich domain. The model corresponds to the first PHD finger.	58
277091	cd15619	PHD1_d4	PHD finger 1 found in d4 gene family proteins. The family includes proteins coded by three members of the d4 gene family, DPF1 (neuro-d4), DPF2 (ubi-d4/Requiem), and DPF3 (cer-d4), which function as transcription factors and are involved in transcriptional regulation of genes by changing the condensed/decondensed state of chromatin in the nucleus. DPF2 is ubiquitously expressed and it acts as a transcription factor that may participate in developmentally programmed cell death. DPF1 and DPF3 are expressed predominantly in neural tissues, and they may be involved in the transcription regulation of neuro-specific gene clusters. The d4 family proteins show distinct domain organization with domain 2/3 in the N-terminal region, a Cys2His2 (C2H2) zinc finger or Kruppel-type zinc finger in the central part and two adjacent plant homeodomain (PHD) fingers (d4-domain) in the C-terminal part of the molecule. This model corresponds to the first PHD finger.	56
277092	cd15622	PHD_TIF1alpha	PHD finger found in transcription intermediary factor 1-alpha (TIF1-alpha). TIF1-alpha, also termed tripartite motif-containing protein 24 (TRIM24), or E3 ubiquitin-protein ligase TRIM24, or RING finger protein 82, belongs to the TRIM/RBCC protein family. It interacts specifically and in a ligand-dependent manner with the ligand binding domain (LBD) of several nuclear receptors (NRs), including retinoid X (RXR), retinoic acid (RAR), vitamin D3 (VDR), estrogen (ER), and progesterone (PR) receptors. It also associates with heterochromatin-associated factors HP1alpha, MOD1 (HP1beta) and MOD2 (HP1gamma), as well as vertebrate Kruppel-type (C2H2) zinc finger proteins that contain transcriptional silencing domain KRAB. TIF1-alpha is a ligand-dependent co-repressor of retinoic acid receptor (RAR) that interacts with multiple nuclear receptors in vitro via an LXXLL motif, and further acts as a gatekeeper of liver carcinogenesis. It also functions as an E3-ubiquitin ligase targeting p53 and is broadly associated with chromatin silencing. Moreover, it is a chromatin regulator that recognizes specific, combinatorial histone modifications through its C-terminal plant homeodomain (PHD)-Bromodomain (Bromo) region. In addition, it interacts with chromatin and estrogen receptor to activate estrogen-dependent genes associated with cellular proliferation and tumor development. TIF1-alpha contains an N-terminal RBCC (RING finger, B-box zinc-fingers, coiled-coil), a plant homeodomain (PHD) finger, followed by a bromodomain in the C-terminal region.	43
277093	cd15623	PHD_TIF1beta	PHD finger found in transcription intermediary factor 1-beta (TIF1-beta). TIF1-beta, also termed Kruppel-associated Box (KRAB)-associated protein 1 (KAP-1), or KRAB-interacting protein 1 (KRIP-1), or nuclear co-repressor KAP-1, or RING finger protein 96, or tripartite motif-containing protein 28 (TRIM28), or E3 SUMO-protein ligase TRIM28, acts as a nuclear co-repressor that plays a role in transcription and in DNA damage response. Upon DNA damage, the phosphorylation of KAP-1 on serine 824 by the ataxia telangiectasia-mutated (ATM) kinase enhances cell survival and facilitates chromatin relaxation and heterochromatic DNA repair. It also regulates CHD3 nucleosome remodeling during DNA double-strand break (DSB) response. Meanwhile, KAP-1 can be dephosphorylated by protein phosphatase PP4C in the DNA damage response. In addition, KAP-1 is a co-activator of the orphan nuclear receptor NGFI-B (or Nur77) and is involved in NGFI-B-dependent transcription. It is also a coiled-coil binding partner, substrate and activator of the c-Fes protein tyrosine kinase. TIF1-beta contains an N-terminal RBCC (RING finger, B-box zinc-fingers, coiled-coil), which can interact with KRAB zinc finger proteins (KRAB-ZFPs), MDM2, MM1, C/EBPbeta, and mediates homo- and heterodimerization, a plant homeodomain (PHD) finger followed by a bromodomain in the C-terminal region, which interact with SETDB1, Mi-2alpha and other proteins to form complexes with histone deacetylase or methyltransferase activity.	43
277094	cd15624	PHD_TIF1gamma	PHD finger found in transcriptional intermediary factor 1 gamma (TIF1gamma). TIF1gamma, also termed tripartite motif-containing 33 (trim33), or ectodermin, or RFG7, or PTC7, is an E3-ubiquitin ligase that functions as a regulator of transforming growth factor beta (TGFbeta) signaling; it inhibits the Smad4-mediated TGFbeta response by interaction with Smad2/3 or ubiquitylation of Smad4. Moreover, TIF1gamma is an important regulator of transcription during hematopoiesis, as well as a key factor of tumorigenesis. Like other TIF1 family members, TIF1gamma also contains an intrinsic transcriptional silencing function. It can control erythroid cell fate by regulating transcription elongation. It can bind to the anaphase-promoting complex/cyclosome (APC/C) and promotes mitosis. TIF1gamma contains an N-terminal RBCC (RING finger, B-box zinc-fingers, coiled-coil), a plant homeodomain (PHD) finger, followed by a bromodomain in the C-terminal region.	46
277095	cd15625	PHD_TIF1delta	PHD finger found in transcriptional intermediary factor 1 delta (TIF1delta). TIF1delta, also termed tripartite motif-containing protein 66 (TRIM66), is a novel heterochromatin protein 1 (HP1)-interacting member of the transcriptional intermediary factor 1 (TIF1) family expressed by elongating spermatids. Like other TIF1 proteins, TIF1delta displays a potent trichostatin A (TSA)-sensitive repression function; TSA is a specific inhibitor of histone deacetylases. Moreover, TIF1delta plays an important role in heterochromatin-mediated gene silencing during postmeiotic phases of spermatogenesis. It functions as a negative regulator of postmeiotic genes acting through HP1 isotype gamma (HP1gamma) complex formation and centromere association. TIF1delta contains an N-terminal RBCC (RING finger, B-box zinc-fingers, coiled-coil), a plant homeodomain (PHD) finger, followed by a bromodomain in the C-terminal region.	49
277096	cd15626	PHD_SP110_140	PHD finger found in the Sp100/Sp140 family of nuclear body components. The Sp100/Sp140 family includes nuclear body proteins SP100, SP140, and similar proteins. Sp110, also termed interferon-induced protein 41/75, or speckled 110 kDa, or transcriptional coactivator Sp110, is a leukocyte-specific component of the nuclear body. It may function as a nuclear hormone receptor transcriptional coactivator that may play a role in inducing differentiation of myeloid cells. It is also involved in resisting intracellular pathogens and functions as an important drug target for preventing intracellular pathogen diseases, such as tuberculosis, hepatic veno-occlusive disease, and intracellular cancers. Sp110 gene polymorphisms may be associated with susceptibility to tuberculosis in Chinese population. Sp110 contains a Sp100-like domain, a SAND domain, a plant homeodomain (PHD) finger, and a bromodomain (BRD). SP140, also termed lymphoid-restricted homolog of Sp100 (LYSp100), or nuclear autoantigen Sp-140, or speckled 140 kDa, is an interferon inducible nuclear leukocyte-specific protein involved in primary biliary cirrhosis and a risk factor in chronic lymphocytic leukemia. It is also implicated in innate immune response to human immunodeficiency virus type 1 (HIV-1) by binding to the virus's viral infectivity factor (Vif) protein. Sp140 contains a nuclear localization signal, a dimerization domain (HSR or CARD domain), a SAND domain, a PHD finger, and a BRD.	42
277097	cd15627	PHD_BAZ1A	PHD finger found in bromodomain adjacent to zinc finger domain protein 1A (BAZ1A). BAZ1A, also termed ATP-dependent chromatin-remodeling protein, or ATP-utilizing chromatin assembly and remodeling factor 1 (ACF1), or CHRAC subunit ACF1, or Williams syndrome transcription factor-related chromatin-remodeling factor 180 (WCRF180), or WALp1, is a subunit of the conserved imitation switch (ISWI)-family ATP-dependent chromatin assembly and remodeling factor (ACF)/chromatin accessibility complex (CHRAC) chromatin remodeling complex, which is required for DNA replication through heterochromatin. It alters the remodeling properties of the ATPase motor protein sucrose nonfermenting-2 homolog (SNF2H). Moreover, BAZ1A and its complexes play important roles in DNA double-strand break (DSB) repair. It is essential for averting improper gene expression during spermatogenesis. It also regulates transcriptional repression of vitamin D3 receptor-regulated genes. BAZ1A contains a WAC motif, a DDT domain, BAZ 1 and BAZ 2 motifs, a WAKZ (WSTF/Acf1/KIAA0314/ZK783.4) motif, a plant homeodomain (PHD) finger, and a bromodomain.	46
277098	cd15628	PHD_BAZ1B	PHD finger found in bromodomain adjacent to zinc finger domain protein 1B (BAZ1B). BAZ1B, also termed Tyrosine-protein kinase BAZ1B, or Williams syndrome transcription factor (WSTF), or Williams-Beuren syndrome chromosomal region 10 protein, or Williams-Beuren syndrome chromosomal region 9 protein, or WALp2, is a multifunctional protein implicated in several nuclear processes, including replication, transcription, and the DNA damage response. BAZ1B/WSTF, together with the imitation switch (ISWI) ATPase, forms a WSTF-ISWI chromatin remodeling complex (WICH), which transiently associates with the human inactive X chromosome (Xi) during late S-phase prior to BRCA1 and gamma-H2AX. Moreover, BAZ1B/WSTF, SNF2h, and nuclear myosin 1 (NM1) form the chromatin remodeling complex B-WICH that is involved in regulating rDNA transcription. BAZ1B contains a WAC motif, a DDT domain, BAZ 1 and BAZ 2 motifs, a WAKZ (WSTF/Acf1/KIAA0314/ZK783.4) motif, a plant homeodomain (PHD) finger, and a bromodomain.	46
277099	cd15629	PHD_BAZ2A	PHD finger found in bromodomain adjacent to zinc finger domain protein 2A (BAZ2A). BAZ2A, also termed transcription termination factor I-interacting protein 5 (TTF-I-interacting protein 5, or Tip5), or WALp3, is an epigenetic regulator. It has been implicated in epigenetic rRNA gene silencing, as the large subunit of the SNF2h-containing chromatin-remodeling complex NoRC that induces nucleosome sliding in an ATP- and histone H4 tail-dependent fashion. BAZ2A has also been shown to be broadly overexpressed in prostate cancer, to regulate numerous protein-coding genes and to cooperate with EZH2 (enhancer of zeste homolog 2) to maintain epigenetic silencing at genes repressed in prostate cancer metastasis. Its overexpression is tightly associated with a prostate cancer subtype displaying CpG island methylator phenotype (CIMP) in tumors and with prostate cancer recurrence in patients. It contains a TAM (TIP5/ARBP/MBD) domain, a DDT domain, four AT-hooks, BAZ 1 and BAZ 2 motifs, a WAKZ (WSTF/Acf1/KIAA0314/ZK783.4) motif, a plant homeodomain (PHD) finger, and a bromodomain.	47
277100	cd15630	PHD_BAZ2B	PHD finger found in bromodomain adjacent to zinc finger domain protein 2B (BAZ2B). BAZ2B, also termed WALp4, is a bromodomain-containing protein whose biological role is still elusive. It shows high sequence similarity with BAZ2A, which is the large subunit of the SNF2h-containing chromatin-remodeling complex NoRC that induces nucleosome sliding in an ATP- and histone H4 tail-dependent fashion. BAZ2B contains a TAM (TIP5/ARBP/MBD) domain, an Apolipophorin-III like domain, a DDT domain, four AT-hooks, BAZ 1 and BAZ 2 motifs, a WAKZ (WSTF/Acf1/KIAA0314/ZK783.4) motif, a plant homeodomain (PHD) finger, and a bromodomain.	49
277101	cd15631	PHD_PHF23	PHD finger found in PHD finger protein 23 (PHF23). PHF23, also termed PHD-containing protein JUNE-1, is a hypothetical protein with a plant homeodomain (PHD) finger. It is encoded by gene PHF23 that acts as a candidate fusion partner for the nucleoporin gene NUP98. The NUP98-PHF23 fusion results from a cryptic translocation t(11;17)(p15;p13) in acute myeloid leukemia (AML).	44
277102	cd15632	PHD_PHF13	PHD finger found in PHD finger protein 13 (PHF13). PHF13, also termed survival time-associated PHD finger protein in ovarian cancer 1 (SPOC1), is a novel plant homeodomain (PHD) finger-containing protein that shows strong expression in spermatogonia and ovarian cancer cells, modulates chromatin structure and mitotic chromosome condensation, and is important for proper cell division. It is also required for spermatogonial stem cell differentiation and sustained spermatogenesis. The overexpression of PHF13 associates with unresectable carcinomas and shorter survival in ovarian cancer.	47
277103	cd15633	PHD_PHF20L1	PHD finger found in PHD finger protein 20-like protein 1 (P20L1). P20L1 is an active malignant brain tumor (MBT) domain-containing protein that binds to monomethylated lysine 142 on DNA (Cytosine-5) Methyltransferase 1 (DNMT1) (DNMT1K142me1) and colocalizes at the perinucleolar space in a SET7-dependent manner. Its MBT domain reads and controls enzyme levels of methylated DNMT1 in cells, thus representing a novel antagonist of DNMT1 proteasomal degradation. In addition to the MBT domain, PHF20L1 also contains two Tudor domains, a plant homeodomain (PHD) finger and the putative DNA-binding domains, AT hook and Cys2His2-type zinc finger.	46
277104	cd15634	PHD_PHF20	PHD finger found in PHD finger protein 20 (PHF20). PHF20, also termed Glioma-expressed antigen 2, or hepatocellular carcinoma-associated antigen 58, or novel zinc finger protein, or transcription factor TZP (referring to Tudor and zinc finger domain containing protein), is a regulator of NF-kappaB activation by disrupting recruitment of PP2A to p65. It also functions as a transcription factor that binds Akt and plays a role in Akt cell survival/growth signaling. Moreover, it transcriptionally regulates p53. The phosphorylation of PHF20 on Ser291 mediated by protein kinase B (PKB) is essential in tumorigenesis via the regulation of p53 mediated signaling. PHF20 contains an N-terminal malignant brain tumor (MBT) domain, two Tudor domains, a plant homeodomain (PHD) finger and the putative DNA-binding domains, AT hook and Cys2His2-type zinc finger.	44
277105	cd15635	PHD_PYGO1	PHD finger found in pygopus homolog 1 (PYGO1). PYGO1 is a homolog of Drosophila melanogaster protein pygopus (dPYGO), which is a fundamental Wnt signaling transcriptional component in Drosophila. It functions as a context-dependent beta-catenin coactivator, and binds di- and trimethylated lysine 4 of histone H3 (H3K4me2/3). PYGO1 is essential for the association with Legless (Lgs)/Bcl9 that acts as an adaptor between Pygopus (Pygo) and Arm/beta-catenin. PYGO1 contains a plant homeodomain (PHD) finger, which is important for Lgs/Bcl9 recognition as well as for the regulation of the Wnt/beta-catenin signaling pathway.	57
277106	cd15636	PHD_PYGO2	PHD finger found in pygopus homolog 2 (PYGO2). PYGO2 is a homolog of Drosophila melanogaster protein pygopus (dPYGO), which is a fundamental Wnt signaling transcriptional component in Drosophila. It functions as a context-dependent beta-catenin coactivator, as well as a histone methylation reader that binds di- and trimethylated lysine 4 of histone H3 (H3K4me2/3). Moreover, PYGO2 acts as a chromatin remodeler in a testis-specific and Wnt-unrelated manner. It also mediates chromatin regulation and links Wnt signaling and Notch signaling to suppress the luminal/alveolar differentiation competence of mammary stem and basal cells. Furthermore, PYGO2 plays a new role in rRNA transcription during cancer cell growth. It regulates mammary tumor initiation and heterogeneity in MMTV-Wnt1 mice. PYGO2 contains a plant homeodomain (PHD) finger, which is important for Lgs/Bcl9 recognition as well as for the regulation of the Wnt/beta-catenin signaling pathway.	54
277107	cd15637	PHD_dPYGO	PHD finger found in Drosophila melanogaster protein pygopus (dPYGO) and similar proteins. dPYGO, also termed protein gammy legs, is a nuclear adapter protein encoded by pygopus (pygo). It is a fundamental Wnt signaling transcriptional component in Drosophila, and has both Wnt-related and Wnt-independent functions. It plays a critical role in aging-related cardiac dysfunction that is canonical Wnt signaling independent. dPYGO contains a plant homeodomain (PHD) finger, which is important for Lgs/Bcl9 recognition as well as for the regulation of the Wnt/beta-catenin signaling pathway.	54
277108	cd15638	PHD_PHF3	PHD finger found in PHD finger protein 3 (PHF3). PHF3 is a human homolog of yeast protein bypass of Ess1 (Bye1), a nuclear protein with a domain resembling the central domain in the transcription elongation factor TFIIS. It is ubiquitously expressed in normal tissues including brain, but its expression is significantly reduced or lost in glioblastomas. PHF3 contains an N-terminal plant homeodomain (PHD) finger, a central RNA polymerase II (Pol II)-binding TFIIS-like domain (TLD), and a C-terminal Spen paralogue and orthologue C-terminal (SPOC) domain.	51
277109	cd15639	PHD_DIDO1_like	PHD finger found in death-inducer obliterator variants Dido1, Dido2, and Dido3. This family includes three alternative splicing variants (Dido1, 2, and 3) encoded by the Dido gene, which have been implicated in a number of cellular processes such as apoptosis and chromosomal segregation, particularly in the hematopoietic system. Dido1, also termed DIO-1, or death-associated transcription factor 1 (DATF-1), is important for maintaining embryonic stem (ES) cells and directly regulates the expression of pluripotency factors. It is the shortest isoform that contains only a highly conserved plant homeodomain (PHD) finger responsible for the binding of histone H3 with a higher affinity for trimethylated lysine 4 (H3K4me3). Gene Dido is a bone morphogenetic protein (BMP) target gene, which promotes BMP-induced melanoma progression. It also triggers apoptosis after nuclear translocation and caspase upregulation. Dido3 is the largest isoform ubiquitously expressed in all human tissues. It is dispensable for ES cell self-renewal and pluripotency, but involved in the maintenance of stem cell genomic stability and tumorigenesis. Dido3 contains a PHD finger, a transcription elongation factor S-II subunit M (TFSIIM) domain, a Spen paralog and ortholog C-terminal (SPOC) module, and a long C-terminal region (CT) of unknown homology. Its PHD finger interacts with H3K4me3.	54
277110	cd15640	PHD_KDM7	PHD finger found in lysine-specific demethylase 7 (KDM7). KDM7, also termed JmjC domain-containing histone demethylation protein 1D (JHDM1D), or KIAA1718, is a dual histone demethylase that catalyzes demethylation of monomethylated and dimethylated H3K9 (H3K9me2/me1) and H3K27 (H3K27me2/me1), which functions as an eraser of silencing marks on chromatin during brain development. It also plays a tumor-suppressive role by regulating angiogenesis. KDM7 contains a plant homeodomain (PHD) that binds Lys4-trimethylated histone 3 (H3K4me3) and a jumonji domain that demethylates either H3K9me2 or H3K27me2.	50
277111	cd15641	PHD_PHF2	PHD finger found in lysine-specific demethylase PHF2. PHF2, also termed GRC5, or PHD finger protein 2, is a histone lysine demethylase ubiquitously expressed in various tissues. It contains a plant homeodomain (PHD) finger and a JmjC domain and plays an important role in adipogenesis. The PHD finger domain can recognize trimethylated histone H3 lysine 4 (H3K4me3). PHF2 also has dimethylated histone H3 lysine 9 (H3K9me2) demethylase activity and acts as a coactivator of several metabolism-related transcription factors. Moreover, it can demethylate ARID5B and further forms a complex with demethylated ARID5B to bind the promoter regions of target genes. The overexpression of PHF2 is involved in the progression of esophageal squamous cell carcinoma (ESCC).	50
277112	cd15642	PHD_PHF8	PHD finger found in histone lysine demethylase PHF8. PHF8, also termed PHD finger protein 8, or KDM7B, is a monomethylated histone H4 lysine 20 (H4K20me1) demethylase that transcriptionally regulates many cell cycle genes. It also preferentially acts on H3K9me2 and H3K9me1. PHF8 is modulated by CDC20-containing anaphase-promoting complex (APC (cdc20)) and plays an important role in the G2/M transition. It acts as a critical molecular sensor for mediating retinoic acid (RA) treatment response in RAR alpha-fusion-induced leukemia. Moreover, PHF8 is essential for cytoskeleton dynamics and is associated with X-linked mental retardation. PHF8 contains an N-terminal plant homeodomain (PHD) finger followed by a JmjC domain. The PHD finger mediates binding to nucleosomes at active gene promoters and the JmjC domain catalyzes the demethylation of mono- or dimethyl-lysines.	52
277113	cd15643	PHD_KDM2A	PHD finger found in Lysine-specific demethylase 2A (KDM2A). KDM2A, also termed CXXC-type zinc finger protein 8, or F-box and leucine-rich repeat protein 11 (FBXL11), or F-box protein FBL7, or F-box protein Lilina, or F-box/LRR-repeat protein 11, or JmjC domain-containing histone demethylation protein 1A (Jhdm1a), or [Histone-H3]-lysine-36 demethylase 1A, is a ubiquitously expressed histone H3 lysine 36 (H3K36) demethylase that has been implicated in gene silencing, cell cycle, cell growth, and cancer development. It acts as a key negative regulator of gluconeogenic gene expression and plays a critical role in the invasiveness, proliferation, and anchorage-independent growth of non-small cell lung cancer (NSCLC) cells, as well as in the osteo/dentinogenic differentiation of Mesenchymal stem cells (MSCs). It regulates rRNA transcription in response to starvation. Meanwhile, it is a negative regulator of NFkappaB. Moreover, KDM2A is a heterochromatin-associated and HP1-interacting protein that promotes HP1 localization to chromatin. It is specifically recruited to CpG islands to define a unique chromatin architecture, which requires direct and specific interaction with linker DNA. It also functions as a H3K4 demethylase that regulates cell proliferation through p15 (INK4B) and p27 (Kip1) in stem cells from apical papilla (SCAPs). KDM2A belongs to the JmjC-domain-containing histone demethylase family. KDM2A consists of two Jumonji C (JmjC) domains, and FBXHA and FBXHB domains. A CXXC zinc-finger domain, followed by a plant homeodomain (PHD) finger, is located within the FBXHA domain, and an F-box domain, followed by an antagonist of mitotic exit network protein 1 (AMN1) domain, is located within the FBXHB domain.	57
277114	cd15644	PHD_KDM2B	PHD finger found in Lysine-specific demethylase 2B (KDM2B). KDM2B, also termed Ndy1, or CXXC-type zinc finger protein 2, or F-box and leucine-rich (LRR) repeat protein 10 (FBXL10), or F-box protein FBL10, or JmjC domain-containing histone demethylation protein 1B (Jhdm1b), or Jumonji domain-containing EMSY-interactor methyltransferase motif protein (Protein JEMMA), or [Histone-H3]-lysine-36 demethylase 1B, is a ubiquitously expressed histone H3 lysine 4 (H3K4me2) or histone H3 lysine 36 (H3K36me2) demethylase that functions as a regulator of chemokine expression, cellular morphology, and the metabolome of fibroblasts. It regulates the differentiation of Mesenchymal Stem Cells (MSCs) and has been implicated in cell cycle regulation by de-repressing cyclin-dependent kinase inhibitor 2B (CDKN2B or p15INK4B). It also plays a role in recruiting polycomb repressive complex 1 (PRC1) to CpG islands (CGIs) of developmental genes and regulates lysine 119 monoubiquitylation on H2A (H2AK119ub1) in embryonic stem cells (ESCs). Moreover, it acts as an oncogene that plays a critical role in leukemia development and maintenance. KDM2B consists of two Jumonji C (JmjC) domains, and FBXHA and FBXHB domains. A CXXC zinc-finger domain, followed by a plant homeodomain (PHD) finger, is located within the FBXHA domain, and an F-box domain, followed by an antagonist of mitotic exit network protein 1 (AMN1) domain, is located within the FBXHB domain.	62
277115	cd15645	PHD_FXL19	PHD finger found in F-box and leucine-rich repeat protein 19 (FBXL19). FBXL19, also termed F-box/LRR-repeat protein 19, is a novel homolog of KDM2A and KDM2B. It belongs to the Skp1-Cullin-F-box (SCF) family of E3 ubiquitin ligases. FBXL19 mediates ubiquitination and interleukin 33 (IL-33)-induced degradation of ST2L receptor in lung epithelia, blocks IL-33-mediated apoptosis, and prevents endotoxin-induced acute lung injury. It also functions as a RhoA antagonist during cell proliferation and cytoskeleton rearrangement, and regulates RhoA ubiquitination and degradation in lung epithelial cells. Moreover, FBXL19 regulates cell migration by targeting Rac1 for its polyubiquitination and proteasomal degradation. It plays an essential role in regulating TGFbeta1-induced E-cadherin down-regulation by mediating Rac3 site-specific ubiquitination and stability. FBXL19 consists of FBXHA and FBXHB domains. A CXXC zinc-finger domain, followed by a plant homeodomain (PHD) finger, is located within the FBXHA domain, and an F-box domain, followed by an antagonist of mitotic exit network protein 1 (AMN1) domain, is located within the FBXHB domain.	62
277116	cd15646	PHD_p300	PHD finger found in histone acetyltransferase p300. p300, also termed KAT3B, or E1A-associated protein p300 (EP300), is a paralog of CREB-binding protein (CBP). It is involved in E1A function in cell cycle progression and cellular differentiation. It functions as an intrinsic HAT, as well as a factor acetyltransferase (FAT) for many transcription regulators. Thus, p300 serves as a scaffold or bridge for transcription factors and other components of the basal transcription machinery to facilitate chromatin remodeling and to activate gene transcription. p300 contains a cysteine-histidine rich region, a KIX (CREB interaction) domain, a plant homeodomain (PHD) finger, a HAT domain, followed by a SRC interaction domain.	40
277117	cd15647	PHD_CBP	PHD finger found in CREB-binding protein (CBP). CBP, also termed KAT3A, is an acetyltransferase acting on histone, which gives a specific tag for transcriptional activation and also acetylates non-histone proteins. CBP is also known as CREBBP, since it specifically interacts with the phosphorylated form of cyclic adenosine monophosphate-responsive element-binding protein (CREB). It augments the activity of phosphorylated CREB to activate transcription of cAMP-responsive genes. CBP contains a cysteine-histidine rich region, a KIX (CREB interaction) domain, a plant homeodomain (PHD) finger, a HAT domain, followed by a SRC interaction domain.	40
277118	cd15648	PHD1_NSD1_2	PHD finger 1 found in nuclear receptor-binding SET domain-containing protein NSD1 and NSD2. NSD1, also termed H3 Lysine-36 and H4 Lysine-20 specific histone-lysine N-methyltransferase, or androgen receptor coactivator 267 kDa protein, or androgen receptor-associated protein of 267 kDa, or H3-K36-HMTase H4-K20-HMTase, or Lysine N-methyltransferase 3B (KMT3B), or NR-binding SET domain-containing protein, is a lysine methyltransferase that preferentially methylates H3 on Lysine36 (H3-K36) and H4 on Lysine20 (H4-K20), which is primarily associated with active transcription. It plays a role in several pathologies, including but not limited to Sotos and Weaver syndromes, acute myeloid leukemia, breast cancer, neuroblastoma, and glioblastoma formation. It can alter transcription by interacting with the protein NSD1-interacting zinc finger protein 1 (NIZP1). It also mitigates caspase-1 activation by listeriolysin o (LLO) in macrophages, and requires functional LLO for the regulation of IL-1beta secretion. Moreover, NSD1 regulates RNA polymerase II (RNAP II) recruitment to bone morphogenetic protein 4 (BMP4). NSD2, also termed histone-lysine N-methyltransferase NSD2, or multiple myeloma SET domain-containing protein (MMSET), or protein trithorax-5 Wolf-Hirschhorn syndrome candidate 1 protein (WHSC1), is overexpressed frequently in the t(4;14) translocation in 15% to 20% of multiple myeloma. It plays important roles in cancer cell proliferation, survival, and tumor growth, by mediating constitutive NF-kappaB signaling via the cytokine autocrine loop. It also enhances androgen receptor (AR)-mediated transcription. The principal chromatin-regulatory activity of NSD2 is dimethylation of histone H3 at lysine 36 (H3K36me2). Both NSD1 and NSD2 contain a catalytic suppressor of variegation, enhancer of zeste and trithorax (SET) domain, two proline-tryptophan-tryptophan-proline (PWWP) domains, five plant homeodomain (PHD) fingers, and an NSD-specific Cys-His rich domain (Cys5HisCysHis). In addition, NSD2 harbors a high mobility group (HMG) box. The SET domain is responsible for histone methyltransferase activity. The PWWP, HMG, and PHD fingers mediate chromatin interaction and recognition of histone marks. This model corresponds to the first PHD finger.	43
277119	cd15649	PHD1_NSD3	PHD finger 1 found in nuclear SET domain-containing protein 3 (NSD3). NSD3, also termed histone-lysine N-methyltransferase NSD3, or protein whistle, or WHSC1-like 1 isoform 9 with methyltransferase activity to lysine, or Wolf-Hirschhorn syndrome candidate 1-like protein 1 (WHSC1-like protein 1, or WHSC1L1), is a lysine methyltransferase encoded by gene NSD3, which is amplified in human breast cancer cell lines. Moreover, translocation resulting in NUP98 fusion to NSD3 leads to the development of acute myeloid leukemia. NSD3 contains a catalytic suppressor of variegation, enhancer of zeste and trithorax (SET) domain, two proline-tryptophan-tryptophan-proline motif (PWWP) domains, five plant-homeodomain (PHD) zinc fingers, and an NSD-specific Cys-His rich domain (Cys5HisCysHis). The SET domain is responsible for histone methyltransferase activity. The PWWP and PHD fingers are involved in protein-protein interactions. This model corresponds to the first PHD finger.	44
277120	cd15650	PHD2_NSD1	PHD finger 2 found in nuclear receptor-binding SET domain-containing protein 1 (NSD1). NSD1, also termed H3 Lysine-36 and H4 Lysine-20 specific histone-lysine N-methyltransferase, or androgen receptor coactivator 267 kDa protein, or androgen receptor-associated protein of 267 kDa, or H3-K36-HMTase H4-K20-HMTase, or Lysine N-methyltransferase 3B (KMT3B), or NR-binding SET domain-containing protein, is a lysine methyltransferase that preferentially methylates H3 on Lysine36 (H3-K36) and H4 on Lysine20 (H4-K20), which is primarily associated with active transcription. It plays a role in several pathologies, including but not limited to Sotos and Weaver syndromes, acute myeloid leukemia, breast cancer, neuroblastoma, and glioblastoma formation. It can alter transcription by interacting with the protein NSD1-interacting zinc finger protein 1 (NIZP1). It also mitigates caspase-1 activation by listeriolysin o (LLO) in macrophages, and requires functional LLO for the regulation of IL-1beta secretion. Moreover, NSD1 regulates RNA polymerase II (RNAP II) recruitment to bone morphogenetic protein 4 (BMP4). NSD1 contains a catalytic suppressor of variegation, enhancer of zeste and trithorax (SET) domain, two proline-tryptophan-tryptophan-proline (PWWP) domains, five plant homeodomain (PHD) fingers, and an NSD-specific Cys-His rich domain (C5HCH). The SET domain is responsible for histone methyltransferase activity. The PWWP and PHD fingers are involved in protein-protein interactions. This model corresponds to the second PHD finger.	47
277121	cd15651	PHD2_NSD2	PHD finger 2 found in nuclear SET domain-containing protein 2 (NSD2). NSD2, also termed histone-lysine N-methyltransferase NSD2, or multiple myeloma SET domain-containing protein (MMSET), or protein trithorax-5 Wolf-Hirschhorn syndrome candidate 1 protein (WHSC1), is overexpressed frequently in the t(4;14) translocation in 15% to 20% of multiple myeloma. It plays important roles in cancer cell proliferation, survival, and tumor growth, by mediating constitutive NF-kappaB signaling via the cytokine autocrine loop. It also enhances androgen receptor (AR)-mediated transcription. The principal chromatin-regulatory activity of NSD2 is dimethylation of histone H3 at lysine 36 (H3K36me2). NSD2 contains a catalytic suppressor of variegation, enhancer of zeste and trithorax (SET) domain, two proline-tryptophan-tryptophan-proline motif (PWWP) domains, a high mobility group (HMG) box, five PHD (plant homeodomain) zinc fingers, and an NSD-specific Cys-His rich domain (Cys5HisCysHis). The SET domain is responsible for histone methyltransferase activity. The PWWP, HMG, and PHD fingers mediate chromatin interaction and recognition of histone marks. This model corresponds to the second PHD finger.	47
277122	cd15652	PHD2_NSD3	PHD finger 2 found in nuclear SET domain-containing protein 3 (NSD3). NSD3, also termed histone-lysine N-methyltransferase NSD3, or protein whistle, or WHSC1-like 1 isoform 9 with methyltransferase activity to lysine, or Wolf-Hirschhorn syndrome candidate 1-like protein 1 (WHSC1-like protein 1, or WHSC1L1), is a lysine methyltransferase encoded by gene NSD3, which is amplified in human breast cancer cell lines. Moreover, translocation resulting in NUP98 fusion to NSD3 leads to the development of acute myeloid leukemia. NSD3 contains a catalytic suppressor of variegation, enhancer of zeste and trithorax (SET) domain, two proline-tryptophan-tryptophan-proline motif (PWWP) domains, five plant homeodomain (PHD) zinc fingers, and an NSD-specific Cys-His rich domain (Cys5HisCysHis). The SET domain is responsible for histone methyltransferase activity. The PWWP and PHD fingers are involved in protein-protein interactions. This model corresponds to the second PHD finger.	47
277123	cd15653	PHD3_NSD1	PHD finger 3 found in nuclear receptor-binding SET domain-containing protein 1 (NSD1). NSD1, also termed H3 Lysine-36 and H4 Lysine-20 specific histone-lysine N-methyltransferase, or androgen receptor coactivator 267 kDa protein, or androgen receptor-associated protein of 267 kDa, or H3-K36-HMTase H4-K20-HMTase, or Lysine N-methyltransferase 3B (KMT3B), or NR-binding SET domain-containing protein, is a lysine methyltransferase that preferentially methylates H3 on Lysine36 (H3-K36) and H4 on Lysine20 (H4-K20), which is primarily associated with active transcription. It plays a role in several pathologies, including but not limited to Sotos and Weaver syndromes, acute myeloid leukemia, breast cancer, neuroblastoma, and glioblastoma formation. It can alter transcription by interacting with the protein NSD1-interacting zinc finger protein 1 (NIZP1). It also mitigates caspase-1 activation by listeriolysin o (LLO) in macrophages, and requires functional LLO for the regulation of IL-1beta secretion. Moreover, NSD1 regulates RNA polymerase II (RNAP II) recruitment to bone morphogenetic protein 4 (BMP4). NSD1 contains a catalytic suppressor of variegation, enhancer of zeste and trithorax (SET) domain, two proline-tryptophan-tryptophan-proline (PWWP) domains, five plant homeodomain (PHD) fingers, and an NSD-specific Cys-His rich domain (Cys5HisCysHis). The SET domain is responsible for histone methyltransferase activity. The PWWP and PHD fingers are involved in protein-protein interactions. This model corresponds to the third PHD finger.	54
277124	cd15654	PHD3_NSD2	PHD finger 3 found in nuclear SET domain-containing protein 2 (NSD2). NSD2, also termed histone-lysine N-methyltransferase NSD2, or multiple myeloma SET domain-containing protein (MMSET), or protein trithorax-5 Wolf-Hirschhorn syndrome candidate 1 protein (WHSC1), is overexpressed frequently in the t(4;14) translocation in 15% to 20% of multiple myeloma. It plays important roles in cancer cell proliferation, survival, and tumor growth, by mediating constitutive NF-kappaB signaling via the cytokine autocrine loop. It also enhances androgen receptor (AR)-mediated transcription. The principal chromatin-regulatory activity of NSD2 is dimethylation of histone H3 at lysine 36 (H3K36me2). NSD2 contains a catalytic suppressor of variegation, enhancer of zeste and trithorax (SET) domain, two proline-tryptophan-tryptophan-proline motif (PWWP) domains, a high mobility group (HMG) box, five PHD (plant homeodomain) zinc fingers, and an NSD-specific Cys-His rich domain (Cys5HisCysHis). The SET domain is responsible for histone methyltransferase activity. The PWWP, HMG, and PHD fingers mediate chromatin interaction and recognition of histone marks. This model corresponds to the third PHD finger.	54
277125	cd15655	PHD3_NSD3	PHD finger 3 found in nuclear SET domain-containing protein 3 (NSD3). NSD3, also termed histone-lysine N-methyltransferase NSD3, or protein whistle, or WHSC1-like 1 isoform 9 with methyltransferase activity to lysine, or Wolf-Hirschhorn syndrome candidate 1-like protein 1 (WHSC1-like protein 1, or WHSC1L1), is a lysine methyltransferase encoded by gene NSD3, which is amplified in human breast cancer cell lines. Moreover, translocation resulting in NUP98 fusion to NSD3 leads to the development of acute myeloid leukemia. NSD3 contains a catalytic suppressor of variegation, enhancer of zeste and trithorax (SET) domain, two proline-tryptophan-tryptophan-proline motif (PWWP) domains, five plant homeodomain (PHD) zinc fingers, and an NSD-specific Cys-His rich domain (Cys5HisCysHis). The SET domain is responsible for histone methyltransferase activity. The PWWP and PHD fingers are involved in protein-protein interactions. This model corresponds to the third PHD finger.	53
277126	cd15656	PHD4_NSD1	PHD finger 4 found in nuclear receptor-binding SET domain-containing protein 1 (NSD1). NSD1, also termed H3 Lysine-36 and H4 Lysine-20 specific histone-lysine N-methyltransferase, or androgen receptor coactivator 267 kDa protein, or androgen receptor-associated protein of 267 kDa, or H3-K36-HMTase H4-K20-HMTase, or Lysine N-methyltransferase 3B (KMT3B), or NR-binding SET domain-containing protein, is a lysine methyltransferase that preferentially methylates H3 on Lysine36 (H3-K36) and H4 on Lysine20 (H4-K20), which is primarily associated with active transcription. It plays a role in several pathologies, including but not limited to Sotos and Weaver syndromes, acute myeloid leukemia, breast cancer, neuroblastoma, and glioblastoma formation. It can alter transcription by interacting with the protein NSD1-interacting zinc finger protein 1 (NIZP1). It also mitigates caspase-1 activation by listeriolysin o (LLO) in macrophages, and requires functional LLO for the regulation of IL-1beta secretion. Moreover, NSD1 regulates RNA polymerase II (RNAP II) recruitment to bone morphogenetic protein 4 (BMP4). NSD1 contains a catalytic suppressor of variegation, enhancer of zeste and trithorax (SET) domain, two proline-tryptophan-tryptophan-proline (PWWP) domains, five plant homeodomain (PHD) fingers, and an NSD-specific Cys-His rich domain (Cys5HisCysHis). The SET domain is responsible for histone methyltransferase activity. The PWWP and PHD fingers are involved in protein-protein interactions. This model corresponds to the fourth PHD finger.	40
277127	cd15657	PHD4_NSD2	PHD finger 4 found in nuclear SET domain-containing protein 2 (NSD2). NSD2, also termed histone-lysine N-methyltransferase NSD2, or multiple myeloma SET domain-containing protein (MMSET), or protein trithorax-5 Wolf-Hirschhorn syndrome candidate 1 protein (WHSC1), is overexpressed frequently in the t(4;14) translocation in 15% to 20% of multiple myeloma. It plays important roles in cancer cell proliferation, survival, and tumor growth, by mediating constitutive NF-kappaB signaling via the cytokine autocrine loop. It also enhances androgen receptor (AR)-mediated transcription. The principal chromatin-regulatory activity of NSD2 is dimethylation of histone H3 at lysine 36 (H3K36me2). NSD2 contains a catalytic suppressor of variegation, enhancer of zeste and trithorax (SET) domain, two proline-tryptophan-tryptophan-proline motif (PWWP) domains, a high mobility group (HMG) box, five PHD (plant-homeodomain) zinc fingers, and an NSD-specific Cys-His rich domain (Cys5HisCysHis). The SET domain is responsible for histone methyltransferase activity. The PWWP, HMG, and PHD fingers mediate chromatin interaction and recognition of histone marks. This model corresponds to the fourth PHD finger.	41
277128	cd15658	PHD4_NSD3	PHD finger 4 found in nuclear SET domain-containing protein 3 (NSD3). NSD3, also termed histone-lysine N-methyltransferase NSD3, or protein whistle, or WHSC1-like 1 isoform 9 with methyltransferase activity to lysine, or Wolf-Hirschhorn syndrome candidate 1-like protein 1 (WHSC1-like protein 1, or WHSC1L1), is a lysine methyltransferase encoded by gene NSD3, which is amplified in human breast cancer cell lines. Moreover, translocation resulting in NUP98 fusion to NSD3 leads to the development of acute myeloid leukemia. NSD3 contains a catalytic suppressor of variegation, enhancer of zeste and trithorax (SET) domain, two proline-tryptophan-tryptophan-proline motif (PWWP) domains, five plant-homeodomain (PHD) zinc fingers, and an NSD-specific Cys-His rich domain (Cys5HisCysHis). The SET domain is responsible for histone methyltransferase activity. The PWWP and PHD fingers are involved in protein-protein interactions. This model corresponds to the fourth PHD finger.	40
277129	cd15659	PHD5_NSD1	PHD finger 5 found in nuclear receptor-binding SET domain-containing protein 1 (NSD1). NSD1, also termed H3 Lysine-36 and H4 Lysine-20 specific histone-lysine N-methyltransferase, or androgen receptor coactivator 267 kDa protein, or androgen receptor-associated protein of 267 kDa, or H3-K36-HMTase H4-K20-HMTase, or Lysine N-methyltransferase 3B (KMT3B), or NR-binding SET domain-containing protein, is a lysine methyltransferase that preferentially methylates H3 on Lysine36 (H3-K36) and H4 on Lysine20 (H4-K20), which is primarily associated with active transcription. It plays a role in several pathologies, including but not limited to Sotos and Weaver syndromes, acute myeloid leukemia, breast cancer, neuroblastoma and glioblastoma formation. It can alter transcription by interacting with the protein NSD1-interacting zinc finger protein 1 (NIZP1). It also mitigates caspase-1 activation by listeriolysin o (LLO) in macrophages, and requires functional LLO for the regulation of IL-1beta secretion. Moreover, NSD1 regulates RNA polymerase II (RNAP II) recruitment to bone morphogenetic protein 4 (BMP4). NSD1 contains a catalytic suppressor of variegation, enhancer of zeste and trithorax (SET) domain, two proline-tryptophan-tryptophan-proline (PWWP) domains, five plant homeodomain (PHD) fingers, and an NSD-specific Cys-His rich domain (Cys5HisCysHis). The SET domain is responsible for histone methyltransferase activity. The PWWP and PHD fingers are involved in protein-protein interactions. This model corresponds to the fifth PHD finger.	43
277130	cd15660	PHD5_NSD2	PHD finger 5 found in nuclear SET domain-containing protein 2 (NSD2). NSD2, also termed histone-lysine N-methyltransferase NSD2, or multiple myeloma SET domain-containing protein (MMSET), or protein trithorax-5 Wolf-Hirschhorn syndrome candidate 1 protein (WHSC1), is overexpressed frequently in the t(4;14) translocation in 15% to 20% of multiple myeloma. It plays important roles in cancer cell proliferation, survival, and tumor growth, by mediating constitutive NF-kappaB signaling via the cytokine autocrine loop. It also enhances androgen receptor (AR)-mediated transcription. The principal chromatin-regulatory activity of NSD2 is dimethylation of histone H3 at lysine 36 (H3K36me2). NSD2 contains a catalytic suppressor of variegation, enhancer of zeste and trithorax (SET) domain, two proline-tryptophan-tryptophan-proline motif (PWWP) domains, a high mobility group (HMG) box, five PHD (plant-homeodomain) zinc fingers, and an NSD-specific Cys-His rich domain (Cys5HisCysHis). The SET domain is responsible for histone methyltransferase activity. The PWWP, HMG, and PHD fingers mediate chromatin interaction and recognition of histone marks. This model corresponds to the fifth PHD finger.	43
277131	cd15661	PHD5_NSD3	PHD finger 5 found in nuclear SET domain-containing protein 3 (NSD3). NSD3, also termed histone-lysine N-methyltransferase NSD3, or protein whistle, or WHSC1-like 1 isoform 9 with methyltransferase activity to lysine, or Wolf-Hirschhorn syndrome candidate 1-like protein 1 (WHSC1-like protein 1, or WHSC1L1), is a lysine methyltransferase encoded by gene NSD3, which is amplified in human breast cancer cell lines. Moreover, translocation resulting in NUP98 fusion to NSD3 leads to the development of acute myeloid leukemia. NSD3 contains a catalytic suppressor of variegation, enhancer of zeste and trithorax (SET) domain, two proline-tryptophan-tryptophan-proline motif (PWWP) domains, five plant-homeodomain (PHD) zinc fingers, and an NSD-specific Cys-His rich domain (Cys5HisCysHis). The SET domain is responsible for histone methyltransferase activity. The PWWP and PHD fingers are involved in protein-protein interactions. This model corresponds to the fifth PHD finger.	43
277132	cd15662	ePHD_ATX1_2_like	Extended PHD finger found in Arabidopsis thaliana ATX1, -2, and similar proteins. The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This subfamily includes the ePHD finger of A. thaliana histone-lysine N-methyltransferase arabidopsis trithorax-like proteins ATX1, -2, and similar proteins. ATX1 and -2 are sister paralogs originating from a segmental chromosomal duplication; they are plant counterparts of the Drosophila melanogaster trithorax (TRX) and mammalian mixed-lineage leukemia (MLL1) proteins. ATX1 (also known as protein SET domain group 27, or trithorax-homolog protein 1/TRX-homolog protein 1), is a methyltransferase that trimethylates histone H3 at lysine 4 (H3K4me3). It also acts as a histone modifier and as a positive effector of gene expression. ATX1 regulates transcription from diverse classes of genes implicated in biotic and abiotic stress responses. It is involved in dehydration stress signaling in both abscisic acid (ABA)-dependent and ABA-independent pathways. ATX2 (also known as protein SET domain group 30, or trithorax-homolog protein 2/TRX-homolog protein 2), is involved in dimethylating histone H3 at lysine 4 (H3K4me2). ATX1 and ATX2 are multi-domain proteins that consist of an N-terminal PWWP domain, FYRN- and FYRC (DAST, domain associated with SET in trithorax) domains, a canonical PHD finger, this non-canonical ePHD finger, and a C-terminal SET domain.	115
277133	cd15663	ePHD_ATX3_4_5_like	Extended PHD finger found in Arabidopsis thaliana ATX3, -4, -5, and similar proteins. The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This subfamily includes the ePHD finger of A. thaliana histone-lysine N-methyltransferase arabidopsis trithorax-like proteins ATX3 (also termed protein SET domain group 14, or trithorax-homolog protein 3), ATX4 (also termed protein SET domain group 16, or trithorax-homolog protein 4) and ATX5 (also termed protein SET domain group 29, or trithorax-homolog protein 5), which belong to the histone-lysine methyltransferase family. These proteins show distinct phylogenetic origins from the family of ATX1 and ATX2. They are multi-domain proteins that consist of an N-terminal PWWP domain, a canonical PHD finger, this non-canonical extended PHD finger, and a C-terminal SET domain.	112
277134	cd15664	ePHD_KMT2A_like	Extended PHD finger found in histone-lysine N-methyltransferase 2A (KMT2A) and 2B (KMT2B). The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This subfamily includes the ePHD finger of histone-lysine N-methyltransferase trithorax (Trx) like proteins, KMT2A/MLL1 and KMT2B/MLL2. KMT2A and KMT2B comprise the mammalian Trx branch of COMPASS family, and are both essential for mammalian embryonic development. KMT2A regulates chromatin-mediated transcription through the catalysis of methylation of histone 3 lysine 4 (H3K4), and is frequently rearranged in acute leukemia. KMT2A functions as the catalytic subunit in the MLL1 complex. KMT2B is a second human homolog of Drosophila trithorax, located on chromosome 19 and functions as the catalytic subunit in the MLL2 complex. It plays a critical role in memory formation by mediating hippocampal H3K4 di- and trimethylation. It is also required for RNA polymerase II association and protection from DNA methylation at the MagohB CpG island promoter. Both KMT2A and KMT2B contain a CxxC (x for any residue) zinc finger domain, three PHD fingers, this extended PHD (ePHD) finger, two FY (phenylalanine tyrosine)-rich domains, and a SET (Suppressor of variegation, Enhancer of zeste, Trithorax) domain.	105
277135	cd15665	ePHD1_KMT2C_like	Extended PHD finger 1 found in histone-lysine N-methyltransferase 2C (KMT2C) and 2D (KMT2D). The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model corresponds to the first ePHD finger of KMT2C and KMT2D. KMT2C, also termed myeloid/lymphoid or mixed-lineage leukemia protein 3 (MLL3), or homologous to ALR protein, is a histone H3 lysine 4 (H3K4) lysine methyltransferase that functions as a circadian factor contributing to genome-scale circadian transcription. It is a component of a large complex that acts as a coactivator of multiple transcription factors, including the bile acid (BA)-activated nuclear receptor, farnesoid X receptor (FXR), a critical player in BA homeostasis. The MLL3 complex is essential for p53 transactivation of small heterodimer partner (SHP). KMT2C is also a part of activating signal cointegrator-2 (ASC-2)-containing complex (ASCOM) that contains the transcriptional coactivator nuclear receptor coactivator 6 (NCOA6), KMT2C and its paralog MLL4. The ASCOM complex is critical for nuclear receptor (NR) activation of bile acid transporter genes and is down regulated in cholestasis. KMT2D, also termed ALL1-related protein (ALR), is encoded by the gene that was named MLL4, a fourth human homolog of Drosophila trithorax, located on chromosome 12. It enzymatically generates trimethylated histone H3 Lysine 4 (H3K4me3). It plays an essential role in differentiating the human pluripotent embryonal carcinoma cell line NTERA-2 clone D1 (NT2/D1) stem cells by activating differentiation-specific genes, such as HOXA1-3 and NESTIN. KMT2D is also a part of ASCOM. Both KMT2C and KMT2D contain the catalytic domain SET, five PHD fingers, two ePHD fingers, a RING finger, an HMG (high-mobility group)-binding motif, and two FY-rich regions.	90
277136	cd15666	ePHD2_KMT2C_like	Extended PHD finger 2 found in histone-lysine N-methyltransferase 2C (KMT2C) and 2D (KMT2D). The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model corresponds to the second ePHD finger of KMT2C and KMT2D. KMT2C, also termed myeloid/lymphoid or mixed-lineage leukemia protein 3 (MLL3), or homologous to ALR protein, is a histone H3 lysine 4 (H3K4) lysine methyltransferase that functions as a circadian factor contributing to genome-scale circadian transcription. It is a component of a large complex that acts as a coactivator of multiple transcription factors, including the bile acid (BA)-activated nuclear receptor, farnesoid X receptor (FXR), a critical player in BA homeostasis. The MLL3 complex is essential for p53 transactivation of small heterodimer partner (SHP). KMT2C is also a part of activating signal cointegrator-2 (ASC-2)-containing complex (ASCOM) that contains the transcriptional coactivator nuclear receptor coactivator 6 (NCOA6), KMT2C and its paralog MLL4. The ASCOM complex is critical for nuclear receptor (NR) activation of bile acid transporter genes and is down regulated in cholestasis. KMT2D, also termed ALL1-related protein (ALR), is encoded by the gene that was named MLL4, a fourth human homolog of Drosophila trithorax, located on chromosome 12. It enzymatically generates trimethylated histone H3 Lysine 4 (H3K4me3). It plays an essential role in differentiating the human pluripotent embryonal carcinoma cell line NTERA-2 clone D1 (NT2/D1) stem cells by activating differentiation-specific genes, such as HOXA1-3 and NESTIN. KMT2D is also a part of ASCOM. Both KMT2C and KMT2D contain the catalytic domain SET, five PHD fingers, two ePHD fingers, a RING finger, an HMG (high-mobility group)-binding motif, and two FY-rich regions.	105
277137	cd15667	ePHD_Snt2p_like	Extended PHD finger found in Saccharomyces cerevisiae SANT domain-containing protein 2 (Snt2p) and similar proteins. The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model includes the ePHD finger of Snt2p. Snt2p is a yeast protein that may function in multiple stress pathways. It coordinates the transcriptional response to hydrogen peroxide-mediated oxidative stress through interaction with Ecm5 and the Rpd3 deacetylase. Snt2p contains a bromo adjacent homology (BAH) domain, two canonical PHD fingers, a non-canonical ePHD finger, and a SANT (SWI3, ADA2, N-CoR and TFIIIB) DNA-binding domain.	141
277138	cd15668	ePHD_RAI1_like	Extended PHD finger found in retinoic acid-induced protein 1 (RAI1), transcription factor 20 (TCF-20) and similar proteins. The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model includes the C-terminal ePHD/ADD (ATRX-DNMT3-DNMT3L) domain of RAI1 and TCF-20. RAI1, a homolog of stromelysin-1 PDGF (platelet-derived growth factor)-responsive element-binding protein (SPBP, also termed TCF-20), is a chromatin-binding protein implicated in the regulation of gene expression. TCF-20 is involved in transcriptional activation of the MMP3 (matrix metalloprotease 3) promoter. It also functions as a transcriptional co-regulator that enhances or represses the transcriptional activity of certain transcription factors/cofactors, such as specificity protein 1 (Sp1), E twenty-six 1 (Ets1), paired box protein 6 (Pax6), small nuclear RING-finger (SNURF)/RNF4, c-Jun, androgen receptor (AR) and estrogen receptor alpha (ERalpha). Both RAI1 and TCF-20 are strongly enriched in chromatin in interphase HeLa cells, and display low nuclear mobility, and have been implicated in Smith-Magenis syndrome and Potocki-Lupski syndrome.	103
277139	cd15669	ePHD_PHF7_G2E3_like	Extended PHD finger found in PHD finger protein 7 (PHF7) and G2/M phase-specific E3 ubiquitin-protein ligase (G2E3). The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model includes the ePHD finger of PHF7 and G2E3. PHF7, also termed testis development protein NYD-SP6, is a testis-specific PHD finger-containing protein that associates with chromatin and binds histone H3 N-terminal tails with a preference for dimethyl lysine 4 (H3K4me2). It may play an important role in stimulating transcription involved in testicular development and/or spermatogenesis. PHF7 contains a PHD finger and a non-canonical ePHD finger, both of which may be involved in activating transcriptional regulation. G2E3 is a dual function ubiquitin ligase (E3) that may play a possible role in cell cycle regulation and the cellular response to DNA damage. It is essential for prevention of apoptosis in early embryogenesis. It is also a nucleo-cytoplasmic shuttling protein with DNA damage responsive localization. G2E3 contains two distinct RING-like ubiquitin ligase domains that catalyze lysine 48-linked polyubiquitination, and a C-terminal catalytic HECT domain that plays an important role in ubiquitin ligase activity and in the dynamic subcellular localization of the protein. The RING-like ubiquitin ligase domains consist of a PHD finger and an ePHD finger.	112
277140	cd15670	ePHD_BRPF	Extended PHD finger found in BRPF proteins. The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model corresponds to the ePHD finger of the family of BRPF proteins, which includes BRPF1, BRD1/BRPF2, and BRPF3. These are scaffold proteins that form monocytic leukemic zinc-finger protein (MOZ)/MOZ-related factor (MORF) H3 histone acetyltransferase (HAT) complexes with other regulatory subunits, such as inhibitor of growth 5 (ING5) and Esa1-associated factor 6 ortholog (EAF6). BRPF proteins have multiple domains, including a plant homeodomain (PHD) zinc finger followed by a non-canonical ePHD finger, a bromodomain and a proline-tryptophan-tryptophan-proline (PWWP) domain. This PHD finger binds to lysine 4 of histone H3 (K4H3), the bromodomain interacts with acetylated lysines on N-terminal tails of histones and other proteins, and the PWWP domain shows histone-binding and chromatin association properties.	116
277141	cd15671	ePHD_JADE	Extended PHD finger found in protein Jade-1, Jade-2, Jade-3 and similar proteins. The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model includes the ePHD finger of Jade-1 (PHF17), Jade-2 (PHF15), and Jade-3 (PHF16); each of these proteins is required for ING4 and ING5 to associate with histone acetyltransferase (HAT) HBO1 and EAF6 to form a HBO1 complex that has a histone H4-specific acetyltransferase activity, has reduced activity toward histone H3, and is responsible for the bulk of histone H4 acetylation in vivo. This family also contains Drosophila melanogaster PHD finger protein rhinoceros (RNO). It is a novel plant homeodomain (PHD)-containing nuclear protein that may function as a transcription factor that antagonizes Ras signaling by regulating transcription of key EGFR/Ras pathway regulators in the Drosophila eye. All Jade proteins contain a canonical PHD finger followed by this non-canonical ePHD finger, both of which are zinc-binding motifs.	112
277142	cd15672	ePHD_AF10_like	Extended PHD finger found in protein AF-10 and AF-17. The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model includes the ePHD finger of AF-10 and AF-17. AF-10, also termed ALL1 (acute lymphoblastic leukaemia)-fused gene from chromosome 10 protein, is a transcription factor encoded by gene AF10, a translocation partner of the MLL (mixed-lineage leukaemia) oncogene in leukaemia. AF-10 has been implicated in the development of leukemia following chromosomal rearrangements between the AF10 gene and one of at least two other genes, MLL and CALM. It plays a key role in the survival of uncommitted hematopoietic cells. Moreover, AF-10 functions as a follistatin-related gene (FLRG)-interacting protein. The interaction with FLRG enhances AF10-dependent transcription. It interacts with the human counterpart of yeast Dot1, hDOT1L, and may act as a bridge for the recruitment of hDOT1L to the genes targeted by MLL-AF10. It also interacts with the synovial sarcoma associated protein SYT protein and may play a role in synovial sarcomas and acute leukemias. AF-17, also termed ALL1-fused gene from chromosome 17 protein, is encoded by gene AF17 that has been identified in hematological malignancies as translocation partners of the mixed lineage leukemia gene MLL. It is a putative transcription factor that may play a role in multiple signaling pathways. It is involved in chromatin-mediated gene regulation mechanisms. It functions as a component of the multi-subunit Dot1 complex (Dotcom) and plays a role in the Wnt/Wingless signaling pathway. It also seems to be a downstream target of the beta-catenin/T-cell factor pathway, and participates in G2-M progression. Moreover, it may function as an important regulator of ENaC-mediated Na+ transport and thus blood pressure. Both AF-10 and AF-17 contain an N-terminal plant homeodomain (PHD) finger followed by this non-canonical ePHD finger. The PHD finger is involved in their homo-oligomerization. In the C-terminal region, they possess a leucine zipper domain and a glutamine-rich region. This family also includes ZFP-1, the Caenorhabditis elegans AF10 homolog. It was originally identified as a factor promoting RNAi interference in C. elegans. It also acts as Dot1-interacting protein that opposes H2B ubiquitination to reduce polymerase II (Pol II) transcription.	116
277143	cd15673	ePHD_PHF6_like	Extended PHD finger found in PHD finger protein 6 (PHF6) and PHD finger protein 11 (PHF11). The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model includes the two ePHD fingers of PHF6 and the single ePHD finger of PHF11. PHF6, also termed the X-linked mental retardation disorder Borjeson-Forssman-Lehmann syndrome-associated protein, is a nucleolus, ribosomal RNA promoter-associated protein that regulates cell cycle progression by suppressing ribosomal RNA synthesis. It has been implicated in cell cycle control, genomic maintenance, and tumor suppression. PHF6 shows transcriptional repression activity through directly interacting with the nucleosome remodeling and deacetylation complex component RBBP4. PHF6 contains two non-canonical ePHD fingers. PHF11, also termed BRCA1 C-terminus-associated protein, or renal carcinoma antigen NY-REN-34, is a transcriptional co-activator of the Th1 effector cytokine genes, interleukin-2 (IL2) and interferon-gamma (IFNG), co-operating with nuclear factor kappa B (NF-kappaB). It is involved in T-cell activation and viability. Polymorphisms within PHF11 are associated with total IgE, allergic asthma and eczema.	116
277144	cd15674	ePHD_PHF14	Extended PHD finger found in PHD finger protein 14 (PHF14) and similar proteins. The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model includes the ePHD finger of PHF14. PHF14 is a novel nuclear transcription factor that controls the proliferation of mesenchymal cells by directly repressing platelet-derived growth factor receptor-alpha (PDGFRalpha) expression. It also acts as an epigenetic regulator and plays an important role in the development of multiple organs in mammals. PHF14 contains three canonical plant homeodomain (PHD) fingers and this non-canonical ePHD finger. It can interact with histones through its PHD fingers.	114
277145	cd15675	ePHD_JMJD2	Extended PHD finger found in Jumonji domain-containing protein 2 (JMJD2) family of histone demethylases. The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model includes the ePHD finger of JMJD2 proteins. JMJD2 proteins, also termed lysine-specific demethylase 4 histone demethylases (KDM4), have been implicated in various cellular processes including DNA damage response, transcription, cell cycle regulation, cellular differentiation, senescence, and carcinogenesis. They selectively catalyze the demethylation of di- and trimethylated H3K9 and H3K36. This model contains three JMJD2 proteins, JMJD2A-C, which all contain jmjN and jmjC domains in the N-terminal region, followed by a canonical PHD finger, this non-canonical ePHD finger, and a Tudor domain. JMJD2D is not included in this family, since it lacks both PHD and Tudor domains and has a different substrate specificity. JMJD2A-C are required for efficient cancer cell growth.	112
277146	cd15676	PHD_BRPF1	PHD finger found in bromodomain and PHD finger-containing protein 1 (BRPF1) and similar proteins. BRPF1, also termed peregrin or protein Br140, is a multi-domain protein that binds histones, mediates monocytic leukemic zinc-finger protein (MOZ)-dependent histone acetylation, and is required for Hox gene expression and segmental identity. It is a close partner of the MOZ histone acetyltransferase (HAT) complex and a novel Trithorax group (TrxG) member with a central role during development. BRPF1 is primarily a nuclear protein that has a broad tissue distribution and is abundant in testes and spermatogonia. It contains a canonical Cys4HisCys3 plant homeodomain (PHD) zinc finger followed by a non-canonical extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, a bromodomain and a proline-tryptophan-tryptophan-proline (PWWP) domain. PHD and ePHD fingers both bind to lysine 4 of histone H3 (K4H3), bromodomains interact with acetylated lysines on N-terminal tails of histones and other proteins, and PWWP domains show histone-binding and chromatin association properties. BRPF1 may be involved in chromatin remodeling. This model corresponds to the canonical Cys4HisCys3 PHD finger.	62
277147	cd15677	PHD_BRPF2	PHD finger found in bromodomain and PHD finger-containing protein 2 (BRPF2) and similar proteins. BRPF2, also termed bromodomain-containing protein 1 (BRD1), or BR140-like protein, is encoded by BRL (BR140 Like gene). It is responsible for the bulk of the acetylation of H3K14 and forms a novel monocytic leukemic zinc-finger protein (MOZ)/MOZ-related factor (MORF) H3 histone acetyltransferase (HAT) complex with HBO1 and ING4. The complex is required for full transcriptional activation of the erythroid-specific regulator genes essential for terminal differentiation and survival of erythroblasts in fetal liver. BRPF2 shows widespread expression and localizes to the nucleus within spermatocytes. It contains a cysteine rich region harboring a canonical Cys4HisCys3 plant homeodomain (PHD) finger followed by a non-canonical extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, a bromodomain, and a proline-tryptophan-tryptophan-proline (PWWP) domain. This model corresponds to the canonical Cys4HisCys3 PHD finger.	54
277148	cd15678	PHD_BRPF3	PHD finger found in bromodomain and PHD finger-containing protein 3 (BRPF3) and similar proteins. BRPF3 is a homolog of BRPF1 and BRPF2. It is a scaffold protein that forms a novel monocytic leukemic zinc finger protein (MOZ)/MOZ-related factor (MORF) H3 histone acetyltransferase (HAT) complex with other regulatory subunits. BRPF3 contains a canonical Cys4HisCys3 plant homeodomain (PHD) finger followed by a non-canonical extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, a bromodomain, and a proline-tryptophan-tryptophan-proline (PWWP) domain. This model corresponds to the canonical Cys4HisCys3 PHD finger.	55
277149	cd15679	PHD_JADE1	PHD finger found in protein Jade-1 and similar proteins. Jade-1, also termed PHD finger protein 17 (PHF17), is a novel binding partner of von Hippel-Lindau (VHL) tumor suppressor Pvhl, a key regulator of the cellular oxygen sensing pathway. It is highly expressed in renal proximal tubules. Jade-1 functions as an essential regulator of multiple cell signaling pathways. It may be involved in the serine/threonine kinase AKT/AKT1 pathway during renal cancer pathogenesis and normally prevents renal epithelial cell proliferation and transformation. It also acts as a pro-apoptotic and growth suppressive ubiquitin ligase to inhibit canonical Wnt downstream effector beta-catenin for proteasomal degradation, and as a transcription factor associated with histone acetyltransferase activity and with increased abundance of cyclin-dependent kinase inhibitor p21. Moreover, Jade-1 is required for ING4 and ING5 to associate with histone acetyltransferase (HAT) HBO1 and Eaf6 to form a HBO1 complex, and plays a role in epithelial cell regeneration. It has also been identified as a novel component of the nephrocystin protein (NPHP) complex and interacts with the ciliary protein nephrocystin-4 (NPHP4). Jade-1 contains a canonical Cys4HisCys3 plant homeodomain (PHD) finger followed by a non-canonical extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, both of which are zinc-binding motifs. This model corresponds to the canonical Cys4HisCys3 PHD finger.	46
277150	cd15680	PHD_JADE2	PHD finger found in protein Jade-2 and similar proteins. Jade-2, also termed PHD finger protein 15 (PHF15), is a plant homeodomain (PHD) zinc finger protein that is closely related to Jade-1, which functions as an essential regulator of multiple cell signaling pathways. Like Jade-1, Jade-2 is required for ING4 and ING5 to associate with histone acetyltransferase (HAT) HBO1 and Eaf6 to form a HBO1 complex that has a histone H4-specific acetyltransferase activity, a reduced activity toward histone H3, and is responsible for the bulk of histone H4 acetylation in vivo. Jade-2 contains a canonical Cys4HisCys3 PHD finger followed by a non-canonical extended PHD (ePHD) finger, Cys2HisCys5HisCys2His, both of which are zinc-binding motifs. This model corresponds to the canonical Cys4HisCys3 PHD finger.	46
277151	cd15681	PHD_JADE3	PHD finger found in protein Jade-3 and similar proteins. Jade-3, also termed PHD finger protein 16 (PHF16), is a plant homeodomain (PHD) zinc finger protein that is closely related to Jade-1, which functions as an essential regulator of multiple cell signaling pathways. Like Jade-1, Jade-3 is required for ING4 and ING5 to associate with histone acetyl transferase (HAT) HBO1 and Eaf6 to form a HBO1 complex that has a histone H4-specific acetyltransferase activity, a reduced activity toward histone H3, and is responsible for the bulk of histone H4 acetylation in vivo. Jade-3 contains a canonical Cys4HisCys3 PHD domain followed by a non-canonical extended PHD (ePHD) domain, Cys2HisCys5HisCys2His, both of which are zinc-binding motifs. This model corresponds to the canonical Cys4HisCys3 PHD finger.	50
277152	cd15682	PHD_ING1	PHD finger found in inhibitor of growth protein 1 (ING1). ING1 is an epigenetic regulator and a type II tumor suppressor that impacts cell growth, aging, apoptosis, and DNA repair, by affecting chromatin conformation and gene expression. It acts as a reader of the active chromatin mark, the trimethylation of histone H3 lysine 4 (H3K4me3). It binds and directs Growth arrest and DNA damage inducible protein 45 a (Gadd45a) to target sites, thus linking the histone code with DNA demethylation. It interacts with the proliferating cell nuclear antigen (PCNA) via the PCNA-interacting protein (PIP) domain in a UV-inducible manner. It also interacts with a PCNA-interacting protein, p15 (PAF). Moreover, ING1 associates with members of the 14-3-3 family, which is necessary for the cytoplasmic relocalization. Endogenous ING1 protein specifically interacts with the pro-apoptotic BCL2 family member BAX and colocalizes with BAX in a UV-inducible manner. It stabilizes the p53 tumor suppressor by inhibiting polyubiquitination of multi-monoubiquitinated forms via interaction with and colocalization of the herpesvirus-associated ubiquitin-specific protease (HAUSP)-deubiquitinase with p53. It is also involved in trichostatin A-induced apoptosis and caspase 3 signaling in p53-deficient glioblastoma cells. In addition, tyrosine kinase Src can bind and phosphorylate ING1 and further regulate its activity. ING1 contains an N-terminal ING domain and a C-terminal plant homeodomain (PHD) finger.	49
277153	cd15683	PHD_ING2	PHD finger found in inhibitor of growth protein 2 (ING2). ING2, also termed inhibitor of growth 1-like protein (ING1Lp), or p32, or p33ING2, is one member of the inhibitor of growth (ING) family of type II tumor suppressors. It is a core component of a multi-factor chromatin-modifying complex containing the transcriptional co-repressor SIN3A and histone deacetylase 1 (HDAC1). It has been implicated in the control of cell cycle, in genome stability, and in muscle differentiation. ING2 independently interacts with H3K4me3 (Histone H3 trimethylated on lysine 4) and PtdIns(5)P, and modulates crosstalk between lysine methylation and lysine acetylation on histone proteins through association with chromatin in the presence of DNA damage. It collaborates with SnoN to mediate transforming growth factor (TGF)-beta-induced Smad-dependent transcription and cellular responses. It is upregulated in colon cancer and increases invasion by enhanced MMP13 expression. It also acts as a cofactor of p300 for p53 acetylation and plays a positive regulatory role during p53-mediated replicative senescence. ING2 contains an N-terminal ING domain and a C-terminal plant homeodomain (PHD) finger.	49
277154	cd15684	PHD_ING4	PHD finger found in inhibitor of growth protein 4 (ING4). ING4, also termed p29ING4, is one member of the inhibitor of growth (ING) family of type II tumor suppressors. It acts as an E3 ubiquitin ligase to induce ubiquitination of the p65 subunit of NF-kappaB and inhibit the transactivation of NF-kappaB target genes. It also induces apoptosis through a p53 dependent pathway, including increasing p53 acetylation, inhibiting Mdm2-mediated degradation of p53 and enhancing the expression of p53 responsive genes both at the transcriptional and post-translational levels. Moreover, ING4 can inhibit the translation of proto-oncogene MYC by interacting with AUF1. It also regulates other transcription factors, such as hypoxia-inducible factor (HIF). In addition, ING4 associates with histone acetyltransferase (HAT) complexes containing MOZ (monocytic leukemia zinc finger protein)/MORF (MOZ-related factor) and HBO1, and further directs the MOZ/MORF and HBO1 complexes to chromatin. ING4 contains an N-terminal ING histone-binding domain and a C-terminal plant homeodomain (PHD) finger.	48
277155	cd15685	PHD_ING5	PHD finger found in inhibitor of growth protein 5 (ING5). ING5, also termed p28ING5, is one member of the inhibitor of growth (ING) family of type II tumor suppressors. It acts as a Tip60 cofactor that acetylates p53 at K120 and subsequently activates the expression of p53-dependent apoptotic genes in response to DNA damage. Aberrant ING5 expression may contribute to pathogenesis, growth, and invasion of gastric carcinomas and colorectal cancer. ING5 can physically interact with p300 and p53 in vivo, and its overexpression induces apoptosis in colorectal cancer cells. It also associates with cyclin A1 (INCA1) and functions as a growth suppressor with suppressed expression in Acute Myeloid Leukemia (AML). Moreover, ING5 translocation from the nucleus to the cytoplasm might be a critical event for carcinogenesis and tumor progression in human head and neck squamous cell carcinoma. In addition, ING5 associates with histone acetyltransferase (HAT) complexes containing MOZ (monocytic leukemia zinc finger protein)/MORF (MOZ-related factor) and HBO1, and further directs the MOZ/MORF and HBO1 complexes to chromatin. ING5 contains an N-terminal ING histone-binding domain and a C-terminal plant homeodomain (PHD) finger.	49
277156	cd15686	PHD3_KDM5A	PHD finger 3 found in Lysine-specific demethylase 5A (KDM5A). KDM5A, also termed Histone demethylase JARID1A, or Jumonji/ARID domain-containing protein 1A, or Retinoblastoma-binding protein 2 (RBBP-2 or RBP2), was originally identified as a retinoblastoma protein (Rb)-binding partner and its inactivation may be important for Rb to promote differentiation. It is involved in transcription through interacting with TBP, p107, nuclear receptors, Myc, Sin3/HDAC, Mad1, RBP-J, CLOCK and BMAL1. KDM5A functions as a trimethylated histone H3 lysine 4 (H3K4me3) demethylase that belongs to the JARID subfamily within the JmjC proteins. It also displays DNA-binding activities that can recognize the specific DNA sequence CCGCCC. KDM5A contains the catalytic JmjC domain, JmjN, the BRIGHT domain, which is an AT-rich interacting domain (ARID), and a Cys5HisCys2 zinc finger, as well as three plant homeodomain (PHD) fingers. This model corresponds to the third PHD finger.	52
277157	cd15687	PHD3_KDM5B	PHD finger 3 found in lysine-specific demethylase 5B (KDM5B). KDM5B, also termed Cancer/testis antigen 31 (CT31), or Histone demethylase JARID1B, or Jumonji/ARID domain-containing protein 1B (JARID1B), or PLU-1, or retinoblastoma-binding protein 2 homolog 1 (RBP2-H1 or RBBP2H1A), is a member of the JARID subfamily within the JmjC proteins. It has a restricted expression pattern in the testis, ovary, and transiently in the mammary gland of the pregnant female and has been shown to be upregulated in breast cancer, prostate cancer, and lung cancer, suggesting a potential role in tumorigenesis. KDM5B acts as a histone demethylase that catalyzes the removal of trimethylation of lysine 4 on histone H3 (H3K4me3), induced by polychlorinated biphenyls (PCBs). It also mediates demethylation of H3K4me2 and H3K4me1. Moreover, KDM5B functions as a negative regulator of hematopoietic stem cell (HSC) self-renewal and progenitor cell activity. KDM5B has also been shown to interact with the DNA binding transcription factors BF-1 and PAX9, as well as TIEG1/KLF10 (transforming growth factor-beta inducible early gene-1/Kruppel-like transcription factor 10), and possibly function as a transcriptional corepressor. KDM5B contains the catalytic JmjC domain, JmjN, the BRIGHT domain, which is an AT-rich interacting domain (ARID), and a Cys5HisCys2 zinc finger, as well as three plant homeodomain (PHD) fingers. This model corresponds to the third PHD finger.	50
277158	cd15688	PHD1_MOZ	PHD finger 1 found in monocytic leukemia zinc-finger protein (MOZ). MOZ, also termed histone acetyltransferase KAT6A, YBF2/SAS3, SAS2 and TIP60 protein 3 (MYST-3), or runt-related transcription factor-binding protein 2, or zinc finger protein 220, is a MYST-type histone acetyltransferase (HAT) that functions as a coactivator for acute myeloid leukemia 1 protein (AML1)- and p53-dependent transcription. It possesses intrinsic HAT activity to acetylate both itself and lysine (K) residues on histone H2B, histone H3 (K14) and histone H4 (K5, K8, K12 and K16) in vitro and H3K9 in vivo. MOZ and MOZ-related factor (MORF) are catalytic subunits of histone acetyltransferase (HAT) complexes that are required for normal developmental programs, such as hematopoiesis, neurogenesis, and skeletogenesis, and implicated in human leukemias. It is also the catalytic subunit of a tetrameric inhibitor of growth 5 (ING5) complex, which specifically acetylates nucleosomal histone H3K14. Moreover, MOZ and MORF are involved in regulating transcriptional activation mediated by Runx2 (or Cbfa1), a Runt-domain transcription factor known to play important roles in T cell lymphomagenesis and bone development, and its homologs. MOZ contains a linker histone 1 and histone 5 domains and two plant homeodomain (PHD) fingers. The model corresponds to the first PHD finger.	59
277159	cd15689	PHD1_MORF	PHD finger 1 found in monocytic leukemia zinc finger protein-related factor (MORF). MORF, also termed MOZ2, or histone acetyltransferase KAT6B, or MOZ, YBF2/SAS3, SAS2 and TIP60 protein 4 (MYST4), is a ubiquitously expressed transcriptional regulator with intrinsic histone acetyltransferase (HAT) activity. It can interact with the Runt-domain transcription factor Runx2 and form a tetrameric complex with BRPFs, ING5, and EAF6. MORF and monocytic leukemia zinc-finger protein (MOZ) are catalytic subunits of HAT complexes that are required for normal developmental programs, such as hematopoiesis, neurogenesis, and skeletogenesis, and are also implicated in human leukemias. MORF contains an N-terminal region containing two plant homeodomain (PHD) fingers, a putative HAT domain, an acidic region, and a C-terminal Ser/Met-rich domain. The model corresponds to the first PHD finger.	59
277160	cd15690	PHD1_DPF1	PHD finger 1 found in D4, zinc and double PHD fingers family 1 (DPF1). DPF1, also termed zinc finger protein neuro-d4, or BRG1-associated factor 45B (BAF45B), is encoded by a neuro specific gene, neuro-d4. It may be involved in the transcription regulation of neuro specific gene clusters. DPF1 contains a nuclear localization signal in the N-terminal region, a Cys2His2 (C2H2) zinc finger or Kruppel-type zinc-finger and a sequence of negatively charged amino acids in the central, and a cysteine/histidine-rich region that is composed of two adjacent plant homeodomain (PHD)-fingers (d4-domain) in the C-terminal part of the molecule. The family corresponds to the first PHD finger.	58
277161	cd15691	PHD1_DPF2_like	PHD finger 1 found in D4, zinc and double PHD fingers family 2 (DPF2). DPF2 (also termed zinc finger protein ubi-d4, apoptosis response zinc finger protein, BRG1-associated factor 45D (BAF45D), or protein requiem) is a transcription factor that is encoded by the ubiquitously expressed gene, ubi-d4, and may be involved in leukemia or other cancers with other genes connected with cancer. It recognizes acetylated histone H3 and suppresses the function of estrogen-related receptor alpha (ERRalpha) through histone deacetylase 1 (HDAC1). Moreover, DPF2 functions as a linker protein between the SWI/SNF complex and RelB/p52 NF-kappaB heterodimer and plays important roles in NF-kappaB transactivation via its non-canonical pathway. It is also required as a transcriptional coactivator in SWI/SNF complex-dependent activation of NF-kappaB RelA/p50 heterodimer. DPF2 contains a nuclear localization signal in the N-terminal region, a Cys2His2 (C2H2) zinc finger or Kruppel-type zinc-finger and a sequence of negatively charged amino acids in the central region, and a cysteine/histidine-rich region that is composed of two adjacent plant homeodomain (PHD) fingers (d4-domain) in the C-terminal part of the molecule. This subfamily also includes DPF3 from zebrafish. This model describes the first PHD finger.	56
277162	cd15692	PHD1_DPF3	PHD finger 1 found in D4, zinc and double PHD fingers family 3 (DPF3). DPF3, also termed BRG1-associated factor 45C (BAF45C), or zinc finger protein cer-d4, is encoded by a neuro-specific gene, cer-d4. It functions as a new epigenetic key factor for heart and muscle development and may be involved in the transcription regulation of neuro-specific gene clusters. It interacts with the BAF chromatin remodeling complex and binds methylated and acetylated lysine residues of histone 3 and 4. DPF3 contains a nuclear localization signal in the N-terminal region, a Cys2His2 (C2H2) zinc finger or Kruppel-type zinc-finger and a sequence of negatively charged amino acids in the central region, and a cysteine/histidine-rich region that is composed of two adjacent plant homeodomain (PHD) fingers (d4-domain) in the C-terminal part of the molecule. This model corresponds to the first PHD finger.	57
277163	cd15693	ePHD_KMT2A	Extended PHD finger found in histone-lysine N-methyltransferase 2A (KMT2A). The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This subfamily includes the ePHD finger of KMT2A. KMT2A, also termed ALL-1, or CXXC-type zinc finger protein 7, or myeloid/lymphoid or mixed-lineage leukemia protein 1 (MLL1), or trithorax-like protein (Htrx), or zinc finger protein HRX, is a histone methyltransferase that belongs to the MLL subfamily of H3K4-specific histone lysine methyltransferases (KMT2). It regulates chromatin-mediated transcription through the catalysis of methylation of histone 3 lysine 4 (H3K4), and is frequently rearranged in acute leukemia. KMT2A functions as the catalytic subunit in the MLL1 complex, which also contains WDR5, RbBP5, ASH2L and DPY30 as integral core subunits required for the efficient methylation activity of the complex. The MLL1 complex is highly active and specific for H3K4 methylation. KMT2A contains a CxxC (x for any residue) zinc finger domain, three PHD fingers, a bromodomain, this extended PHD finger, two FY (phenylalanine tyrosine)-rich domains, and a SET (Suppressor of variegation, Enhancer of zeste, Trithorax) domain.	113
277164	cd15694	ePHD_KMT2B	Extended PHD finger found in histone-lysine N-methyltransferase 2B (KMT2B). The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This subfamily includes the ePHD finger of KMT2B. KMT2B is also called trithorax homolog 2 or WW domain-binding protein 7 (WBP-7). KMT2B is encoded by the gene that was first named myeloid/lymphoid or mixed-lineage leukemia 2 (MLL2), a second human homolog of Drosophila trithorax, located on chromosome 19. It belongs to the MLL subfamily of H3K4-specific histone lysine methyltransferases (KMT2) and is vital for normal mammalian embryonic development. KMT2B functions as the catalytic subunit in the MLL2 complex, which contains WDR5, RbBP5, ASH2L and DPY30 as integral core subunits required for the efficient methylation activity of the complex. The MLL2 complex is highly active and specific for histone 3 lysine 4 (H3K4) methylation, which stimulates chromatin transcription in a SAM- and H3K4-dependent manner. Moreover, KMT2B plays a critical role in memory formation by mediating hippocampal H3K4 di- and trimethylation. It is also required for RNA polymerase II association and protection from DNA methylation at the MagohB CpG island promoter. KMT2B contains a CxxC (x for any residue) zinc finger domain, three PHD fingers, this ePHD finger, two FY (phenylalanine tyrosine)-rich domains, and a SET (Suppressor of variegation, Enhancer of zeste, Trithorax) domain.	105
277165	cd15695	ePHD1_KMT2D	Extended PHD finger 1 found in histone-lysine N-methyltransferase 2D (KMT2D). The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model corresponds to the first ePHD finger of KMT2D. KMT2D, also termed ALL1-related protein (ALR), is encoded by the gene that was named myeloid/lymphoid or mixed-lineage leukemia 4 (MLL4), a fourth human homolog of Drosophila trithorax, located on chromosome 12. KMT2D enzymatically generates trimethylated histone H3 at Lys 4 (H3K4me3). It plays an essential role in differentiating the human pluripotent embryonal carcinoma cell line NTERA-2 clone D1 (NT2/D1) stem cells by activating differentiation-specific genes, such as HOXA1-3 and NESTIN. It is also a part of activating signal cointegrator-2 (ASC-2)-containing complex (ASCOM) that contains the transcriptional coactivator nuclear receptor coactivator 6 (NCOA6), KMT2C and KMT2D. The ASCOM complex is critical for nuclear receptor (NR) activation of bile acid transporter genes and is down regulated in cholestasis. KMT2D contains the catalytic domain SET, five PHD fingers, two ePHD fingers, a RING finger, an HMG (high-mobility group)-binding motif, and two FY-rich regions.	90
277166	cd15696	ePHD1_KMT2C	Extended PHD finger 1 found in histone-lysine N-methyltransferase 2C (KMT2C). The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model corresponds to the first ePHD finger of KMT2C. KMT2C, also termed myeloid/lymphoid or mixed-lineage leukemia protein 3 (MLL3), or homologous to ALR protein, is a histone H3 lysine 4 (H3K4) lysine methyltransferase that functions as a circadian factor contributing to genome-scale circadian transcription. It is a component of a large complex that acts as a coactivator of multiple transcription factors, including the bile acid (BA)-activated nuclear receptor, farnesoid X receptor (FXR), a critical player in BA homeostasis. The MLL3 complex is essential for p53 transactivation of small heterodimer partner (SHP). KMT2C is also a part of activating signal cointegrator-2 (ASC-2)-containing complex (ASCOM) that contains the transcriptional coactivator nuclear receptor coactivator 6 (NCOA6), KMT2C and its paralog MLL4. The ASCOM complex is critical for nuclear receptor (NR) activation of bile acid transporter genes and is down regulated in cholestasis. KMT2C contains several PHD fingers, two ePHD fingers, an ATPase alpha beta signature, a high mobility group (HMG)-1 box, a SET (Suppressor of variegation, Enhancer of zeste, Trithorax) domain and two FY (phenylalanine tyrosine)-rich domains. 	90
277167	cd15697	ePHD2_KMT2C	Extended PHD finger 2 found in histone-lysine N-methyltransferase 2C (KMT2C). The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model corresponds to the second ePHD finger of KMT2C. KMT2C, also termed myeloid/lymphoid or mixed-lineage leukemia protein 3 (MLL3), or homologous to ALR protein, is a histone H3 lysine 4 (H3K4) lysine methyltransferase that functions as a circadian factor contributing to genome-scale circadian transcription. It is a component of a large complex that acts as a coactivator of multiple transcription factors, including the bile acid (BA)-activated nuclear receptor, farnesoid X receptor (FXR), a critical player in BA homeostasis. The MLL3 complex is essential for p53 transactivation of small heterodimer partner (SHP). KMT2C is also a part of activating signal cointegrator-2 (ASC-2)-containing complex (ASCOM) that contains the transcriptional coactivator nuclear receptor coactivator 6 (NCOA6), KMT2C and its paralog MLL4. The ASCOM complex is critical for nuclear receptor (NR) activation of bile acid transporter genes and is down regulated in cholestasis. KMT2C contains PHD fingers, two ePHD fingers, an ATPase alpha beta signature, a high mobility group (HMG)-1 box, a SET (Suppressor of variegation, Enhancer of zeste, Trithorax) domain and two FY (phenylalanine tyrosine)-rich domains. 	105
277168	cd15698	ePHD2_KMT2D	Extended PHD finger 2 found in histone-lysine N-methyltransferase 2D (KMT2D). The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model corresponds to the second ePHD finger of KMT2D. KMT2D, also termed ALL1-related protein (ALR), is encoded by the gene that was named myeloid/lymphoid or mixed-lineage leukemia 4 (MLL4), a fourth human homolog of Drosophila trithorax, located on chromosome 12. KMT2D enzymatically generates trimethylated histone H3 Lys 4 (H3K4me3). It plays an essential role in differentiating the human pluripotent embryonal carcinoma cell line NTERA-2 clone D1 (NT2/D1) stem cells by activating differentiation-specific genes, such as HOXA1-3 and NESTIN. It is also a part of activating signal cointegrator-2 (ASC-2)-containing complex (ASCOM) that contains the transcriptional coactivator nuclear receptor coactivator 6 (NCOA6), KMT2C and KMT2D. The ASCOM complex is critical for nuclear receptor (NR) activation of bile acid transporter genes and is down regulated in cholestasis. KMT2D contains the catalytic domain SET, five PHD fingers, two ePHD fingers, a RING finger, an HMG (high-mobility group)-binding motif, and two FY-rich regions.	107
277169	cd15699	ePHD_TCF20	Extended PHD finger (ePHD) found in transcription factor 20 (TCF-20). The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model corresponds to the C-terminal ePHD/ADD (ATRX-DNMT3-DNMT3L) domain of TCF-20. TCF-20, also termed nuclear factor SPBP, or protein AR1, or stromelysin-1 PDGF (platelet-derived growth factor)-responsive element-binding protein (SPRE-binding protein), is involved in transcriptional activation of the MMP3 (matrix metalloprotease 3) promoter. It is strongly enriched on chromatin in interphase HeLa cells, and displays low nuclear mobility, and has been implicated in Smith-Magenis syndrome and Potocki-Lupski syndrome. As a chromatin-binding protein, TCF-20 plays a role in the regulation of gene expression. It also functions as a transcriptional co-regulator that enhances or represses the transcriptional activity of certain transcription factors/cofactors, such as specificity protein 1 (Sp1), E twenty-six 1 (Ets1), paired box protein 6 (Pax6), small nuclear RING-finger (SNURF)/RNF4, c-Jun, androgen receptor (AR) and estrogen receptor alpha (ERalpha). TCF-20 contains an N-terminal transactivation domain, a novel DNA-binding domain with an AT-hook motif, three nuclear localization signals (NLSs) and a C-terminal ePHD/ADD domain.	103
277170	cd15700	ePHD_RAI1	Extended PHD finger (ePHD) found in retinoic acid-induced protein 1 (RAI1). The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model corresponds to the C-terminal ePHD/ADD (ATRX-DNMT3-DNMT3L) domain of RAI1. RAI1, a homolog of stromelysin-1 PDGF (platelet-derived growth factor)-responsive element-binding protein (SPBP, also termed TCF-20), is a chromatin-binding protein implicated in the regulation of gene expression. It is strongly enriched on chromatin in interphase HeLa cells, and displays low nuclear mobility, and has been implicated in Smith-Magenis syndrome, Potocki-Lupski syndrome, and non-syndromic autism. RAI1 contains a region with homology to the novel nucleosome-binding region SPBP and an ePHD/ADD domain with ability to bind nucleosomes.	104
277171	cd15701	ePHD_BRPF1	Extended PHD finger found in bromodomain and PHD finger-containing protein 1 (BRPF1) and similar proteins. The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model includes the ePHD finger of BRPF1. BRPF1, also termed peregrin, or protein Br140, is a multi-domain protein that binds histones, mediates monocytic leukemic zinc-finger protein (MOZ) -dependent histone acetylation, and is required for Hox gene expression and segmental identity. It is a close partner of the MOZ histone acetyltransferase (HAT) complex and a novel Trithorax group (TrxG) member with a central role during development. BRPF1 is primarily a nuclear protein that has a broad tissue distribution and is abundant in testes and spermatogonia. It contains a plant homeodomain (PHD) zinc finger followed by a non-canonical ePHD finger, a bromodomain and a proline-tryptophan-tryptophan-proline (PWWP) domain. This PHD finger binds to methylated lysine 4 of histone H3 (H3K4me), the bromodomain interacts with acetylated lysines on N-terminal tails of histones and other proteins, and the PWWP domain shows histone-binding and chromatin association properties. BRPF1 may be involved in chromatin remodeling.	121
277172	cd15702	ePHD_BRPF2	Extended PHD finger found in bromodomain and PHD finger-containing protein 2 (BRPF2) and similar proteins. The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model includes the ePHD finger of BRPF2. BRPF2, also termed bromodomain-containing protein 1 (BRD1), or BR140-like protein, is encoded by BRL (BR140 Like gene). It is responsible for the bulk of the acetylation of H3K14 and forms a novel monocytic leukemic zinc-finger protein (MOZ)/MOZ-related factor (MORF) H3 histone acetyltransferase (HAT) complex with HBO1 and ING4. The complex is required for full transcriptional activation of the erythroid-specific regulator genes essential for terminal differentiation and survival of erythroblasts in fetal liver. BRPF2 shows widespread expression and localizes to the nucleus within spermatocytes. It contains a cysteine rich region harboring a plant homeodomain (PHD) finger followed by a non-canonical ePHD finger, a bromodomain, and a proline-tryptophan-tryptophan-proline (PWWP) domain. 	118
277173	cd15703	ePHD_BRPF3	Extended PHD finger found in bromodomain and PHD finger-containing protein 3 (BRPF3) and similar proteins. The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model includes the ePHD finger of BRPF3. BRPF3 is a homolog of BRPF1 and BRPF2. It is a scaffold protein that forms a novel monocytic leukemic zinc finger protein (MOZ)/MOZ-related factor (MORF) H3 histone acetyltransferase (HAT) complex with other regulatory subunits. BRPF3 contains a plant homeodomain (PHD) finger followed by this non-canonical ePHD finger, a bromodomain, and a proline-tryptophan-tryptophan-proline (PWWP) domain. 	118
277174	cd15704	ePHD_JADE1	Extended PHD finger found in protein Jade-1 and similar proteins. The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model includes the ePHD finger of Jade-1. Jade-1, also termed PHD finger protein 17 (PHF17), is a novel binding partner of von Hippel-Lindau (VHL) tumor suppressor pVHL, a key regulator of cellular oxygen sensing pathway. It is highly expressed in renal proximal tubules. Jade-1 functions as an essential regulator of multiple cell signaling pathways. It may be involved in the Serine/threonine kinase AKT/AKT1 pathway during renal cancer pathogenesis and normally prevents renal epithelial cell proliferation and transformation. It also acts as a pro-apoptotic and growth suppressive ubiquitin ligase to inhibit canonical Wnt downstream effector beta-catenin for proteasomal degradation and as a transcription factor associated with histone acetyltransferase activity and with increased abundance of cyclin-dependent kinase inhibitor p21. Moreover, Jade-1 is required for ING4 and ING5 to associate with histone acetyltransferase (HAT) HBO1 and Eaf6 to form a HBO1 complex, and plays a role in epithelial cell regeneration. It has also been identified as a novel component of the nephrocystin protein (NPHP) complex and interacts with the ciliary protein nephrocystin-4 (NPHP4). Jade-1 contains a canonical plant homeodomain (PHD) finger followed by this non-canonical ePHD finger, both of which are zinc-binding motifs.	118
277175	cd15705	ePHD_JADE2	Extended PHD finger found in protein Jade-2 and similar proteins. The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model includes the ePHD finger of Jade-2. Jade-2, also termed PHD finger protein 15 (PHF15), is a plant homeodomain (PHD) zinc finger protein that is closely related to Jade-1, which functions as an essential regulator of multiple cell signaling pathways. Like Jade-1, Jade-2 is required for ING4 and ING5 to associate with histone acetyltransferase (HAT) HBO1 and Eaf6 to form a HBO1 complex that has a histone H4-specific acetyltransferase activity, a reduced activity toward histone H3, and is responsible for the bulk of histone H4 acetylation in vivo. Jade-2 contains a canonical PHD finger followed by this non-canonical ePHD finger, both of which are zinc-binding motifs.	111
277176	cd15706	ePHD_JADE3	Extended PHD finger found in protein Jade-3 and similar proteins. The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model includes the ePHD finger of Jade-3. Jade-3, also termed PHD finger protein 16 (PHF16), is a plant homeodomain (PHD) zinc finger protein that is closely related to Jade-1, which functions as an essential regulator of multiple cell signaling pathways. Like Jade-1, Jade-3 is required for ING4 and ING5 to associate with histone acetyltransferase (HAT) HBO1 and Eaf6 to form a HBO1 complex that has a histone H4-specific acetyltransferase activity, a reduced activity toward histone H3, and is responsible for the bulk of histone H4 acetylation in vivo. Jade-3 contains a canonical PHD domain followed by this non-canonical ePHD domain, both of which are zinc-binding motifs.	111
277177	cd15707	ePHD_RNO	Extended PHD finger found in Drosophila melanogaster PHD finger protein rhinoceros (RNO) and similar proteins. The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model includes the ePHD finger of Drosophila melanogaster RNO. RNO is a novel plant homeodomain (PHD)-containing nuclear protein that may function as a transcription factor that antagonizes Ras signaling by regulating the transcription of key EGFR/Ras pathway regulators in the Drosophila eye. RNO contains a canonical PHD domain followed by this non-canonical ePHD domain, both of which are zinc-binding motifs.	113
277178	cd15708	ePHD_AF10	Extended PHD finger found in protein AF-10 and similar proteins. The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model includes the ePHD finger of AF-10. AF-10, also termed ALL1 (acute lymphoblastic leukaemia)-fused gene from chromosome 10 protein, is a transcription factor encoded by gene AF10, a translocation partner of the MLL (mixed-lineage leukaemia) oncogene in leukaemia. AF-10 has been implicated in the development of leukemia following chromosomal rearrangements between the AF10 gene and one of at least two other genes, MLL and CALM. It plays a key role in the survival of uncommitted hematopoietic cells. Moreover, AF-10 functions as a follistatin-related gene (FLRG)-interacting protein. The interaction with FLRG enhances AF10-dependent transcription. It interacts with human counterpart of the yeast Dot1, hDOT1L, and may act as a bridge for the recruitment of hDOT1L to the genes targeted by MLL-AF10. It also interacts with the synovial sarcoma associated protein SYT protein and may play a role in synovial sarcomas and acute leukemias. AF-10 contains an N-terminal plant homeodomain (PHD) finger followed by this non-canonical ePHD finger.	129
277179	cd15709	ePHD_AF17	Extended PHD finger found in protein AF-17 and similar proteins. The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model includes the ePHD finger of AF-17. AF-17, also termed ALL1-fused gene from chromosome 17 protein, is encoded by gene AF17 that has been identified in hematological malignancies as a translocation partner of the mixed lineage leukemia gene MLL. It is a putative transcription factor that may play a role in multiple signaling pathways. It is involved in chromatin-mediated gene regulation mechanisms. It functions as a component of the multi-subunit Dot1 complex (Dotcom) and plays a role in the Wnt/Wingless signaling pathway. It also seems to be a downstream target of the beta-catenin/T-cell factor pathway, and participates in G2-M progression. Moreover, it may function as an important regulator of ENaC-mediated Na+ transport and thus blood pressure. AF-17 contains an N-terminal plant homeodomain (PHD) finger followed by a non-canonical ePHD finger.	125
277180	cd15710	ePHD1_PHF6	Extended PHD finger 1 found in PHD finger protein 6 (PHF6). The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. PHF6 contains two non-canonical ePHD fingers, this model corresponds to the first ePHD finger. PHF6, also termed the X-linked mental retardation disorder Borjeson-Forssman-Lehmann syndrome-associated protein, is a nucleolus, ribosomal RNA promoter-associated protein that regulates cell cycle progression by suppressing ribosomal RNA synthesis. It has been implicated in cell cycle control, genomic maintenance, and tumor suppression. PHF6 shows transcriptional repression activity through directly interacting with the nucleosome remodeling and deacetylation complex component RBBP4.	115
277181	cd15711	ePHD2_PHF6	Extended PHD finger 2 found in PHD finger protein 6 (PHF6). The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. PHF6 contains two non-canonical ePHD fingers, this model corresponds to the second ePHD finger. PHF6, also termed the X-linked mental retardation disorder Borjeson-Forssman-Lehmann syndrome-associated protein, is a nucleolus, ribosomal RNA promoter-associated protein that regulates cell cycle progression by suppressing ribosomal RNA synthesis. It has been implicated in cell cycle control, genomic maintenance, and tumor suppression. PHF6 shows transcriptional repression activity through directly interacting with the nucleosome remodeling and deacetylation complex component RBBP4.	118
277182	cd15712	ePHD_PHF11	Extended PHD finger found in PHD finger protein 11 (PHF11). The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model includes the ePHD finger of PHF11. PHF11, also termed BRCA1 C-terminus-associated protein, or renal carcinoma antigen NY-REN-34, is a transcriptional co-activator of the Th1 effector cytokine genes, interleukin-2 (IL2) and interferon-gamma (IFNG), co-operating with nuclear factor kappa B (NF-kappaB). It is involved in T-cell activation and viability. Polymorphisms within PHF11 are associated with total IgE, allergic asthma and eczema.	115
277183	cd15713	ePHD_JMJD2A	Extended PHD finger (ePHD) found in Jumonji domain-containing protein 2A (JMJD2A). The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model includes the ePHD finger of JMJD2A. JMJD2A, also termed lysine-specific demethylase 4A (KDM4A), or JmjC domain-containing histone demethylation protein 3A (JHDM3A), catalyzes the demethylation of di- and trimethylated H3K9 and H3K36. It is involved in carcinogenesis and functions as a transcription regulator that may either stimulate or repress gene transcription. It associates with nuclear receptor co-repressor complex or histone deacetylases. Moreover, JMJD2A forms complexes with both the androgen and estrogen receptor (ER) and plays an essential role in growth of both ER-positive and -negative breast tumors. It is also involved in prostate, colon, and lung cancer progression. JMJD2A contains jmjN and jmjC domains in the N-terminal region, followed by a canonical PHD finger, this non-canonical ePHD finger, and a Tudor domain.	110
277184	cd15714	ePHD_JMJD2B	Extended PHD finger found in Jumonji domain-containing protein 2B (JMJD2B). The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model includes the ePHD finger of JMJD2B. JMJD2B, also termed lysine-specific demethylase 4B (KDM4B), or JmjC domain-containing histone demethylation protein 3B (JHDM3B), specifically antagonizes the trimethyl group from H3K9 in pericentric heterochromatin and reduces H3K36 methylation in mammalian cells. It plays an essential role in the growth regulation of cancer cells by modulating the G1-S transition and promotes cell-cycle progression through the regulation of cyclin-dependent kinase 6 (CDK6). It interacts with heat shock protein 90 (Hsp90) and its stability can be regulated by Hsp90. JMJD2B also functions as a direct transcriptional target of p53, which induces its expression through promoter binding. Moreover, JMJD2B expression can be controlled by hypoxia-inducible factor 1alpha (HIF1alpha) in colorectal cancer and estrogen receptor alpha (ERalpha) in breast cancer. It is also involved in bladder, lung, and gastric cancer. JMJD2B contains jmjN and jmjC domains in the N-terminal region, followed by a canonical PHD finger, this non-canonical ePHD finger, and a Tudor domain.	110
277185	cd15715	ePHD_JMJD2C	Extended PHD finger found in Jumonji domain-containing protein 2C (JMJD2C). The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This model includes the ePHD finger of JMJD2C. JMJD2C, also termed lysine-specific demethylase 4C (KDM4C), or gene amplified in squamous cell carcinoma 1 protein (GASC-1 protein), or JmjC domain-containing histone demethylation protein 3C (JHDM3C), is an epigenetic factor that catalyzes the demethylation of di- and trimethylated H3K9 and H3K36, and may be involved in the development and/or progression of various types of cancer including esophageal squamous cell carcinoma (ESC) and breast cancer. It selectively interacts with hypoxia-inducible factor 1alpha (HIF1alpha) and plays a role in breast cancer progression. Moreover, JMJD2C may play an important role in the treatment of obesity and its complications by modulating the regulation of adipogenesis by nuclear receptor peroxisome proliferator-activated receptor gamma (PPARgamma). JMJD2C contains jmjN and jmjC domains in the N-terminal region, followed by a canonical plant homeodomain (PHD) finger, this non-canonical ePHD finger, and a Tudor domain.	110
277256	cd15716	FYVE_RBNS5	FYVE domain found in FYVE finger-containing Rab5 effector protein rabenosyn-5 (Rbsn-5) and similar proteins. Rbsn-5, also termed zinc finger FYVE domain-containing protein 20, is a novel Rab5 effector that is complexed to the Sec1-like protein VPS45 and recruited in a phosphatidylinositol-3-kinase-dependent fashion to early endosomes. It also binds to Rab4 and EHD1/RME-1, two regulators of the recycling route, and is involved in cargo recycling to the plasma membrane. Moreover, Rbsn-5 regulates endocytosis at the apical side of the wing epithelium and plays a role in the apical endocytic trafficking of Fmi in the establishment of planar cell polarity (PCP).	61
277257	cd15717	FYVE_PKHF	FYVE domain found in protein containing both PH and FYVE domains 1 (phafin-1), 2 (phafin-2), and similar proteins. This family includes protein containing both PH and FYVE domains 1 (phafin-1) and 2 (phafin-2). Phafin-1 is a representative of a novel family of PH and FYVE domain-containing proteins called phafins. It is a ubiquitously expressed pro-apoptotic protein via translocating to lysosomes, facilitating apoptosis induction through a lysosomal-mitochondrial apoptotic pathway. Phafin-2 is a ubiquitously expressed endoplasmic reticulum-associated protein that facilitates tumor necrosis factor alpha (TNF-alpha)-triggered cellular apoptosis through endoplasmic reticulum (ER)-mitochondrial apoptotic pathway. It is an endosomal phosphatidylinositol 3-phosphate (PtdIns3P or PI3P) effector, as well as an interactor of the endosomal-tethering protein EEA1. It regulates endosome fusion upstream of Rab5. Phafin-2 also functions as a novel regulator of endocytic epidermal growth factor receptor (EGFR) degradation through a role in endosomal fusion.	61
277258	cd15718	FYVE_WDFY1_like	FYVE domain found in WD40 repeat and FYVE domain-containing protein WDFY1 and WDFY2, and similar proteins. This family includes WD40 repeat and FYVE domain-containing protein WDFY1 and WDFY2. WDFY1, also termed FYVE domain containing protein localized to endosomes-1 (FENS-1), or phosphoinositide-binding protein 1, or zinc finger FYVE domain-containing protein 17, is a novel single FYVE domain containing protein that binds phosphatidylinositol 3-phosphate (PtdIns3P or PI3P) with high specificity over other phosphoinositides. Localization of WDFY1 to early endosomes requires an intact FYVE domain and is inhibited by wortmannin, a PI3-kinase inhibitor. WDFY2, also termed zinc finger FYVE domain-containing protein 22, or ProF (propeller-FYVE protein), is a phosphatidylinositol 3-phosphate (PtdIns3P or PI3P) binding protein that is localized to a distinct subset of early endosomes close to the plasma membrane. It interacts preferentially with endogenous serine/threonine kinase Akt2, but not Akt1, and plays a specific role in modulating signaling through Akt downstream of the interaction of this kinase with the endosomal proteins APPL (adaptor protein containing PH domain, PTB domain, and leucine zipper motif). In addition to Akt, WDFY2 serves as a binding partner for protein kinase C, zeta (PRKCZ), and its substrate vesicle-associated membrane protein 2 (VAMP2), and is involved in vesicle cycling in various secretory pathways. Moreover, silencing of WDFY2 by siRNA produces a strong inhibition of endocytosis. Both WDFY1 and WDFY2 contain a FYVE domain and multiple WD-40 repeats.	70
277259	cd15719	FYVE_WDFY3	FYVE domain found in WD40 repeat and FYVE domain-containing protein 3 (WDFY3) and similar proteins. WDFY3, also termed autophagy-linked FYVE protein (Alfy), is a ubiquitously expressed phosphatidylinositol 3-phosphate (PtdIns3P or PI3P) binding protein required for selective macroautophagic degradation of aggregated proteins. It regulates the protein degradation through the direct interaction with the autophagy protein Atg5. Moreover, WDFY3 acts as a scaffold that bridges its cargo to the macroautophagic machinery via the creation of a greater complex with Atg12, Atg16L, and LC3. It also functionally associates with sequestosome-1/p62 (SQSTM1) in osteoclasts. WDFY3 shuttles between the nucleus and cytoplasm. It predominantly localizes to the nucleus and nuclear membrane under basal conditions, but is recruited to cytoplasmic ubiquitin-positive protein aggregates under stress conditions. WDFY3 contains a PH-BEACH domain assemblage, five WD40 repeats and a PtdIns3P-binding FYVE domain.	65
277260	cd15720	FYVE_Hrs	FYVE domain found in hepatocyte growth factor (HGF)-regulated tyrosine kinase substrate (Hrs) and similar proteins. Hrs, also termed protein pp110, is a tyrosine phosphorylated protein that plays an important role in the signaling pathway of HGF. It is localized to early endosomes and is an essential component of the endosomal sorting and trafficking machinery. Hrs interacts with hypertonia-associated protein Trak1, a novel regulator of endosome-to-lysosome trafficking. It can also form an Hrs/actinin-4/BERP/myosin V protein complex that is required for efficient transferrin receptor (TfR) recycling but not for epidermal growth factor receptor (EGFR) degradation. Moreover, Hrs, together with STAM proteins, STAM1 and STAM2, and Eps15, forms a multivalent ubiquitin-binding complex that sorts ubiquitinated proteins into the multivesicular body pathway, and plays a regulatory role in endocytosis/exocytosis. Furthermore, Hrs functions as an interactor of the neurofibromatosis 2 tumor suppressor protein schwannomin/merlin. It is also involved in the inhibition of citron kinase-mediated HIV-1 budding. Hrs contains a single ubiquitin-interacting motif (UIM) that is crucial for its function in receptor sorting, and a FYVE domain that harbors double Zn2+ binding sites.	61
277261	cd15721	FYVE_RUFY1_like	FYVE domain found in RUN and FYVE domain-containing protein RUFY1, RUFY2 and similar proteins. This family includes RUN and FYVE domain-containing protein RUFY1 and RUFY2. RUFY1, also termed FYVE-finger protein EIP1, or La-binding protein 1, or Rab4-interacting protein (Rabip4), or Zinc finger FYVE domain-containing protein 12 (ZFY12), is a human homologue of mouse Rabip4, an effector of Rab4 GTPase that regulates recycling of endocytosed cargo. RUFY1 is an endosomal protein that functions as a dual effector of Rab4 and Rab14 and is involved in efficient recycling of transferrin (Tfn). It is a downstream effector of Etk, a downstream tyrosine kinase of PI3-kinase that is involved in regulation of vesicle trafficking. RUFY2, also termed Rab4-interacting protein related, is a novel embryonic factor that is present in the nucleus at early stages of embryonic development. It may have both endosomal functions in the cytoplasm and nuclear functions. Both RUFY1 and RUFY2 contain an N-terminal RUN domain and a C-terminal FYVE domain with two coiled-coil domains in-between.	58
277262	cd15723	FYVE_protrudin	FYVE-related domain found in protrudin and similar proteins. Protrudin, also termed zinc finger FYVE domain-containing protein 27 (ZFY27 or ZFYVE27), is a FYVE domain-containing protein involved in transport of neuronal cargoes and implicated in the onset of hereditary spastic paraplegia (HSP). It is involved in neurite outgrowth through binding to spastin. Moreover, it functions as a key regulator of the Rab11-dependent membrane trafficking during neurite extension. It serves as an adaptor molecule that links its associated proteins, such as Rab11-GDP, VAP-A and -B, Surf4, and RTN3, to KIF5, a motor protein that mediates anterograde vesicular transport in neurons, and thus plays a key role in the maintenance of neuronal function. The FYVE domain of protrudin resembles a FYVE-related domain that is structurally similar to the canonical FYVE domains but lacks the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif. In addition, unlike canonical FYVE domains that are located to early endosomes and specifically bind to phosphatidylinositol 3-phosphate (PtdIns3P or PI3P), the FYVE domain of protrudin is located to plasma membrane and preferentially binds phosphatidylinositol 4,5-bisphosphate (PtdIns(4,5)P2), phosphatidylinositol 3,4-bisphosphate (PtdIns(3,4)P2), and phosphatidylinositol 3,4,5-trisphosphate (PtdIns(3,4,5)P3). In addition to FYVE-related domain, protrudin also contains a Rab11-binding domain (RBD11), two hydrophobic domains, HP-1 and HP-2, an FFAT motif, and a coiled-coil domain.	62
277263	cd15724	FYVE_ZFY26	FYVE domain found in FYVE domain-containing protein 26 (ZFY26 or ZFYVE26). ZFY26, also termed FYVE domain-containing centrosomal protein (FYVE-CENT), or spastizin, is a phosphatidylinositol 3-phosphate (PtdIns3P or PI3P) binding protein that localizes to the centrosome and midbody. ZFY26 and its interacting partners TTC19 and KIF13A are required for cytokinesis. It also interacts with Beclin 1, a subunit of class III phosphatidylinositol 3-kinase complex, and may have potential implications for carcinogenesis. In addition, it has been considered as the causal agent of a rare form of hereditary spastic paraplegia. ZFY26 contains a FYVE domain that is important for targeting of FYVE-CENT to the midbody.	61
277264	cd15725	FYVE_PIKfyve_Fab1	FYVE domain found in metazoan PIKfyve, fungal and plant Fab1, and similar proteins. PIKfyve, also termed FYVE finger-containing phosphoinositide kinase, or 1-phosphatidylinositol 3-phosphate 5-kinase, or phosphatidylinositol 3-phosphate 5-kinase (PIP5K3), or phosphatidylinositol 3-phosphate 5-kinase type III (PIPkin-III or type III PIP kinase), is a phosphoinositide 5-kinase that forms a complex with its regulators, the scaffolding protein Vac14 and the lipid phosphatase Fig4. The complex is responsible for synthesizing phosphatidylinositol 3,5-bisphosphate [PtdIns(3,5)P2] from phosphatidylinositol 3-phosphate (PtdIns3P or PI3P). Then phosphatidylinositol-5-phosphate (PtdIns5P) is generated directly from PtdIns(3,5)P2. PtdIns(3,5)P2 and PtdIns5P regulate endosomal trafficking and responses to extracellular stimuli. At this point, PIKfyve is vital in early embryonic development. Moreover, PIKfyve forms a complex with ArPIKfyve (associated regulator of PIKfyve) and SAC3 at the endomembranes, which plays a role in receptor tyrosine kinase (RTK) degradation. The phosphorylation of PIKfyve by AKT can facilitate Epidermal growth factor receptor (EGFR) degradation. In addition, PIKfyve may participate in the regulation of the glutamate transporters EAAT2, EAAT3 and EAAT4, and the cystic fibrosis transmembrane conductance regulator (CFTR). It is also essential for systemic glucose homeostasis and insulin-regulated glucose uptake/GLUT4 translocation in skeletal muscle. It can be activated by protein kinase B (PKB/Akt) and further up-regulates human ether-a-go-go (hERG) channels. This family also includes the yeast and plant orthologs of human PIKfyve, Fab1. PIKfyve and its orthologs share a similar architecture. They contain an N-terminal FYVE domain, a middle region related to the CCT/TCP-1/Cpn60 chaperonins that are involved in productive folding of actin and tubulin, a second middle domain that contains a number of conserved cysteine residues (CCR) unique to this family, and a C-terminal lipid kinase domain related to PtdInsP kinases.	62
277265	cd15726	FYVE_FYCO1	FYVE domain found in FYVE and coiled-coil domain-containing protein 1 (FYCO1) and similar proteins. FYCO1, also termed zinc finger FYVE domain-containing protein 7, is a phosphatidylinositol 3-phosphate (PtdIns3P or PI3P)-binding protein that is associated with the exterior of autophagosomes and mediates microtubule plus-end-directed vesicle transport. It acts as an effector of GTP-bound Rab7, a GTPase that recruits FYCO1 to autophagosomes and has been implicated in autophagosome-lysosomal fusion. FYCO1 also interacts with two microtubule motor proteins, kinesin (KIF) 5B and KIF23, and thus functions as a platform for assembly of vesicle fusion and trafficking factors. FYCO1 contains an N-terminal alpha-helical RUN domain followed by a long central coiled-coil region, a FYVE domain and a GOLD (Golgi dynamics) domain in C-terminus.	58
277266	cd15727	FYVE_ZF21	FYVE domain found in zinc finger FYVE domain-containing protein 21 (ZF21) and similar proteins. ZF21 is a phosphoinositide-binding protein that functions as a regulator of focal adhesions and cell movement through interaction with focal adhesion kinase. It can also bind to the cytoplasmic tail of membrane type 1 matrix metalloproteinase, a potent invasion-promoting protease, and play a key role in regulating multiple aspects of cancer cell migration and invasion. ZF21 contains a FYVE domain, which corresponds to this model.	64
277267	cd15728	FYVE_ANFY1	FYVE domain found in ankyrin repeat and FYVE domain-containing protein 1 (ANFY1) and similar proteins. ANFY1, also termed ankyrin repeats hooked to a zinc finger motif (Ankhzn), is a novel cytoplasmic protein that belongs to a new group of double zinc finger proteins involved in vesicle or protein transport. It is ubiquitously expressed in a spatiotemporal-specific manner and is located on endosomes. ANFY1 contains an N-terminal coiled-coil region and a BTB/POZ domain, ankyrin repeats in the middle, and a C-terminal FYVE domain.	63
277268	cd15729	FYVE_endofin	FYVE domain found in endofin and similar proteins. Endofin, also termed zinc finger FYVE domain-containing protein 16 (ZFY16), or endosome-associated FYVE domain protein, is a FYVE domain-containing protein that is localized to EEA1-containing endosomes. It is regulated by phosphoinositol lipid and engaged in endosome-mediated receptor modulation. Endofin is involved in Bone morphogenetic protein (BMP) signaling through interacting with Smad1 preferentially and enhancing Smad1 phosphorylation and nuclear localization upon BMP stimulation. It also functions as a scaffold protein that brings Smad4 to the proximity of the receptor complex in Transforming growth factor (TGF)-beta signaling. Moreover, endofin is a novel tyrosine phosphorylation target downstream of epidermal growth factor receptor (EGFR) in EGF-signaling. In addition, endofin plays a role in endosomal trafficking by recruiting cytosolic TOM1, an important molecule for membrane recruitment of clathrin, onto endosomal membranes.	68
277269	cd15730	FYVE_EEA1	FYVE domain found in early endosome antigen 1 (EEA1) and similar proteins. EEA1, also termed endosome-associated protein p162, or zinc finger FYVE domain-containing protein 2, is an essential component of the endosomal fusion machinery and required for the fusion and maturation of early endosomes in endocytosis. It forms a parallel coiled-coil homodimer in cells. EEA1 serves as the p97 ATPase substrate and the p97 ATPase may regulate the size of early endosomes by governing the oligomeric state of EEA1. It can interact with the GTP-bound form of Rab22a and be involved in endosomal membrane trafficking. EEA1 also functions as an obligate scaffold for angiotensin II-induced Akt activation in early endosomes. It can be phosphorylated by p38 mitogen-activated protein kinase (MAPK) and further regulate mu opioid receptor endocytosis. EEA1 consists of an N-terminal C2H2 Zn2+ finger, four long heptad repeats, and a C-terminal region containing a calmodulin binding (IQ) motif, a Rab5 interaction site, and a FYVE domain. This model corresponds to the FYVE domain that is responsible for binding phosphatidyl inositol-3-phosphate (PtdIns3P or PI3P) on the membrane.	63
277270	cd15731	FYVE_LST2	FYVE domain found in lateral signaling target protein 2 homolog (Lst2) and similar proteins. Lst2, also termed zinc finger FYVE domain-containing protein 28, is a monoubiquitinylated phosphoprotein that functions as a negative regulator of epidermal growth factor receptor (EGFR) signaling. Unlike other FYVE domain-containing proteins, Lst2 displays primarily non-endosomal localization. Its endosomal localization is regulated by monoubiquitinylation. Lst2 physically binds Trim3, also known as BERP or RNF22, which is a coordinator of endosomal trafficking and interacts with Hrs and a complex that biases cargo recycling.	65
277271	cd15732	FYVE_MTMR3	FYVE domain found in myotubularin-related protein 3 (MTMR3) and similar proteins. MTMR3, also termed Myotubularin-related phosphatase 3, or FYVE domain-containing dual specificity protein phosphatase 1 (FYVE-DSP1), or zinc finger FYVE domain-containing protein 10, is a ubiquitously expressed phosphoinositide 3-phosphatase specific for phosphatidylinositol 3-phosphate (PtdIns3P or PI3P) and phosphatidylinositol 3,5-bisphosphate (PtdIns(3,5)P2) and PIKfyve, which produces PtdIns(3,5)P2 from PtdIns3P. It regulates cell migration through modulating phosphatidylinositol 5-phosphate (PtdIns5P) levels. MTMR3 contains an N-terminal PH-GRAM (PH-G) domain, a MTM phosphatase domain, a coiled-coil region, and a C-terminal FYVE domain. Unlike conventional FYVE domains, the FYVE domain of MTMR3 neither confers endosomal localization nor binds to PtdIns3P. It is also not required for the enzyme activity of MTMR3. In contrast, the PH-G domain binds phosphoinositides.	61
277272	cd15733	FYVE_MTMR4	FYVE domain found in myotubularin-related protein 4 (MTMR4) and similar proteins. MTMR4, also termed FYVE domain-containing dual specificity protein phosphatase 2 (FYVE-DSP2), or zinc finger FYVE domain-containing protein 11, is a dual specificity protein phosphatase that specifically dephosphorylates phosphatidylinositol 3-phosphate (PtdIns3P or PI3P). It localizes to early endosomes, as well as to Rab11- and Sec15-positive recycling endosomes, and regulates sorting from early endosomes. Moreover, MTMR4 is preferentially associated with and dephosphorylates the activated regulatory Smad proteins (R-Smads) in the cytoplasm to keep transforming growth factor (TGF) beta signaling in homeostasis. It also functions as an essential negative modulator for the homeostasis of bone morphogenetic protein (BMP)/decapentaplegic (Dpp) signaling. In addition, MTMR4 acts as a novel interactor of the ubiquitin ligase Nedd4 (neural-precursor-cell-expressed developmentally down-regulated 4) and may play a role in the biological process of muscle breakdown. MTMR4 contains an N-terminal PH-GRAM (PH-G) domain, a MTM phosphatase domain, a coiled-coil region, and a C-terminal FYVE domain.	60
277273	cd15734	FYVE_ZFYV1	FYVE domains found in zinc finger FYVE domain-containing protein 1 (ZFYV1) and similar proteins. ZFYV1, also termed double FYVE-containing protein 1 (DFCP1), or SR3, or tandem FYVE fingers-1, is a novel tandem FYVE domain containing protein that binds phosphatidylinositol 3-phosphate (PtdIns3P or PI3P) with high specificity over other phosphoinositides. The subcellular distribution of exogenously-expressed ZFYV1 to Golgi, endoplasmic reticulum (ER) and vesicular compartments is governed in part by its FYVE domains but unaffected by wortmannin, a PI3-kinase inhibitor. In addition to the C-terminal tandem FYVE domains, ZFYV1 contains an N-terminal putative C2H2 type zinc finger and a possible nucleotide binding P-loop.	61
277274	cd15735	FYVE_spVPS27p_like	FYVE domain found in Schizosaccharomyces pombe vacuolar protein sorting-associated protein 27 (spVps27p) and similar proteins. spVps27p, also termed suppressor of ste12 deletion protein 4 (Sst4p), is a conserved homolog of budding Saccharomyces cerevisiae Vps27 and of mammalian Hrs. It functions as a downstream factor for phosphatidylinositol 3-kinase (PtdIns 3-kinase) in forespore membrane formation with normal morphology. It colocalizes and interacts with Hse1p, a homolog of Saccharomyces cerevisiae Hse1p and of mammalian STAM, to form a complex whose ubiquitin-interacting motifs (UIMs) are important for sporulation. spVps27p contains a VHS (Vps27p/Hrs/Stam) domain, a FYVE domain, and two UIMs.	59
277275	cd15736	FYVE_scVPS27p_Vac1p_like	FYVE domain found in Saccharomyces cerevisiae vacuolar protein sorting-associated protein 27 (scVps27p) and FYVE-related domain 1 found in yeast protein VAC1 (Vac1p) and similar proteins. The family includes Saccharomyces cerevisiae vacuolar protein sorting-associated protein 27 (scVps27p) and protein VAC1 (Vac1p). scVps27p, also termed Golgi retention defective protein 11, is the putative yeast counterpart of the mammalian protein Hrs and is involved in endosome maturation. It is a mono-ubiquitin-binding protein that interacts with ubiquitinated cargoes, such as Hse1p, and is required for protein sorting into the multivesicular body. Vps27p forms a complex with Hse1p. The complex binds ubiquitin and mediates endosomal protein sorting. At the endosome, Vps27p and a trimeric protein complex, ESCRT-1, bind ubiquitin and are important for multivesicular body (MVB) sorting. Vps27p contains an N-terminal VHS (Vps27/Hrs/STAM) domain, a FYVE domain that binds PtdIns3P, followed by two ubiquitin-interacting motifs (UIMs), and a C-terminal clathrin-binding motif. Vac1p, also termed vacuolar segregation protein Pep7p, or carboxypeptidase Y-deficient protein 7, or vacuolar protein sorting-associated protein 19 (Vps19p), or vacuolar protein-targeting protein 19, is a phosphatidylinositol 3-phosphate (PtdIns3P or PI3P)-binding protein that interacts with a Rab GTPase, GTP-bound form of Vps21p, and a Sec1p homologue, Vps45p, to facilitate Vps45p-dependent vesicle-mediated vacuolar protein sorting. It also acts as a novel regulator of vesicle docking and/or fusion at the endosome and functions in vesicle-mediated transport of Golgi precursor carboxypeptidase Y (CPY), protease A (PrA), protease B (PrB), but not alkaline phosphatase (ALP) from the trans-Golgi network-like compartment (TGN) to the endosome. Vac1p contains an N-terminal classical TFIIIA-like zinc finger, two putative zinc-binding FYVE fingers, and a C-terminal coiled coil region. The FYVE domain in both Vps27p and Vac1p harbors a zinc-binding site composed of seven Cysteines and one Histidine, which is different from that of other FYVE domain containing proteins.	56
277276	cd15737	FYVE2_Vac1p_like	FYVE domain 2 found in yeast protein VAC1 (Vac1p) and similar proteins. Vac1p, also termed vacuolar segregation protein Pep7p, or carboxypeptidase Y-deficient protein 7, or vacuolar protein sorting-associated protein 19 (Vps19p), or vacuolar protein-targeting protein 19, is a phosphatidylinositol 3-phosphate (PtdIns3P or PI3P)-binding protein that interacts with a Rab GTPase, GTP-bound form of Vps21p, and a Sec1p homologue, Vps45p, to facilitate Vps45p-dependent vesicle-mediated vacuolar protein sorting. It also acts as a novel regulator of vesicle docking and/or fusion at the endosome and functions in vesicle-mediated transport of Golgi precursor carboxypeptidase Y (CPY), protease A (PrA), protease B (PrB), but not alkaline phosphatase (ALP) from the trans-Golgi network-like compartment (TGN) to the endosome. Vac1p contains an N-terminal classical TFIIIA-like zinc finger, two putative zinc-binding FYVE fingers, and a C-terminal coiled coil region. The family corresponds to the second FYVE domain that is responsible for the ability of Pep7p to efficiently interact with Vac1p and Vps45p.	83
277277	cd15738	FYVE_MTMR_unchar	FYVE-related domain found in uncharacterized myotubularin-related proteins mainly from eumetazoa. This family includes a group of uncharacterized myotubularin-related proteins mainly found in eumetazoa. Although their biological functions remain unclear, they share similar domain architecture that consists of an N-terminal pleckstrin homology (PH) domain, a highly conserved region related to myotubularin proteins, and a C-terminal FYVE domain. The model corresponds to the FYVE domain, which resembles the FYVE-related domain as it has an altered sequence in the basic ligand binding patch.	61
277278	cd15739	FYVE_RABE_unchar	FYVE domain found in uncharacterized rab GTPase-binding effector proteins from bilateria. This family includes a group of uncharacterized rab GTPase-binding effector proteins found in bilateria. Although their biological functions remain unclear, they all contain a FYVE domain that harbors a putative phosphatidylinositol 3-phosphate (PtdIns3P or PI3P) binding site.	73
277279	cd15740	FYVE_FGD3	FYVE-like domain found in FYVE, RhoGEF and PH domain-containing protein 3 (FGD3) and similar proteins. FGD3, also termed zinc finger FYVE domain-containing protein 5, is a putative Cdc42-specific guanine nucleotide exchange factor (GEF) that undergoes the ubiquitin ligase SCFFWD1/beta-TrCP-mediated proteasomal degradation. It is a homologue of FGD1 and contains a DBL homology (DH) domain and pleckstrin homology (PH) domain in the middle region, a FYVE domain, and another PH domain in the C-terminus, but lacks the N-terminal proline-rich domain (PRD) found in FGD1. Due to this difference, FGD3 may play different roles from that of FGD1 to regulate cell morphology or motility. The FYVE domain of FGD3 resembles a FYVE-like domain that is different from the canonical FYVE domains, since it lacks one of the three conserved signature motifs (the WxxD motif) that are involved in phosphatidylinositol 3-phosphate (PtdIns3P or PI3P) binding and exhibits altered lipid binding specificities.	54
277280	cd15741	FYVE_FGD1_2_4	FYVE domain found in FYVE, RhoGEF and PH domain-containing protein facio-genital dysplasia FGD1, FGD2, FGD4. This family represents a group of Rho GTPase cell division cycle 42 (Cdc42)-specific guanine nucleotide exchange factors (GEFs), including FYVE, RhoGEF and PH domain-containing protein FGD1, FGD2 and FGD4. FGD1, also termed faciogenital dysplasia 1 protein, or Rho/Rac guanine nucleotide exchange factor FGD1 (Rho/Rac GEF), or zinc finger FYVE domain-containing protein 3, is a central regulator of extracellular matrix remodeling and belongs to the DBL family of GEFs that regulate the activation of the Rho GTPases. FGD1 is encoded by gene FGD1. Disabling mutations in the FGD1 gene cause the human X-linked developmental disorder faciogenital dysplasia (FGDY, also known as Aarskog-Scott syndrome). FGD2, also termed zinc finger FYVE domain-containing protein 4, is expressed in antigen-presenting cells, including B lymphocytes, macrophages, and dendritic cells. It localizes to early endosomes and active membrane ruffles. It plays a role in leukocyte signaling and vesicle trafficking in cells specialized to present antigen in the immune system. FGD4, also termed actin filament-binding protein frabin, or FGD1-related F-actin-binding protein, or zinc finger FYVE domain-containing protein 6, functions as an F-actin-binding (FAB) protein showing significant homology to FGD1. It induces the formation of filopodia through the activation of Cdc42 in fibroblasts. Those FGD proteins possess a similar domain organization that contains a DBL homology (DH) domain, a pleckstrin homology (PH) domain, a FYVE domain, and another PH domain in the C-terminus. However, each FGD has a unique N-terminal region that may directly or indirectly interact with F-actin. FGD1 and FGD4 have an N-terminal proline-rich domain (PRD) and an N-terminal F-actin binding (FAB) domain, respectively. This model corresponds to the FYVE domain, which has been found in many proteins involved in membrane trafficking and phosphoinositide metabolism, and has been defined by three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCR patch, and a C-terminal RVC motif, which form a compact phosphatidylinositol 3-phosphate (PtdIns3P or PI3P)-binding site. FGD1 possesses a FYVE-like domain that lacks the N-terminal WxxD motif. Moreover, FGD2 is the only known RhoGEF family member shown to have a functional FYVE domain and endosomal binding activity.	65
277281	cd15742	FYVE_FGD5	FYVE-like domain found in FYVE, RhoGEF and PH domain-containing protein 5 (FGD5) and similar proteins. FGD5, also termed zinc finger FYVE domain-containing protein 23, is an endothelial cell (EC)-specific guanine nucleotide exchange factor (GEF) that regulates endothelial adhesion, survival, and angiogenesis by modulating phosphatidylinositol 3-kinase signaling. It functions as a novel genetic regulator of vascular pruning by activation of endothelial cell-targeted apoptosis. FGD5 is a homologue of FGD1 and contains a DBL homology (DH) domain, a pleckstrin homology (PH) domain, a FYVE domain, and another PH domain in the C-terminus, but lacks the N-terminal proline-rich domain (PRD) found in FGD1. The FYVE domain of FGD5 resembles a FYVE-like domain that is different from the canonical FYVE domains, since it lacks one of the three conserved signature motifs (the WxxD motif) that are involved in phosphatidylinositol 3-phosphate (PtdIns3P or PI3P) binding and exhibits altered lipid binding specificities.	67
277282	cd15743	FYVE_FGD6	FYVE domain found in FYVE, RhoGEF and PH domain-containing protein 6 (FGD6) and similar proteins. FGD6, also termed zinc finger FYVE domain-containing protein 24, is a putative Cdc42-specific guanine nucleotide exchange factor (GEF) whose biological function remains unclear. It is a homologue of FGD1 and contains a DBL homology (DH) domain and pleckstrin homology (PH) domain in the middle region, a FYVE domain, and another PH domain in the C-terminus, but lacks the N-terminal proline-rich domain (PRD) found in FGD1. Moreover, the FYVE domain of FGD6 is a canonical FYVE domain, which has been found in many proteins involved in membrane trafficking and phosphoinositide metabolism, and has been defined by three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCR patch, and a C-terminal RVC motif, which form a compact phosphatidylinositol 3-phosphate (PtdIns3P or PI3P)-binding site.	61
277283	cd15744	FYVE_RUFY3	FYVE-related domain found in RUN and FYVE domain-containing protein 3 (RUFY3) and similar proteins. RUFY3, also termed Rap2-interacting protein x (RIPx or RPIPx), or single axon-regulated protein (singar), is an N-terminal RUN domain and a C-terminal FYVE domain containing protein predominantly expressed in the brain. It suppresses formation of surplus axons for neuronal polarity. Unlike other RUFY proteins, RUFY3 can associate with the GTP-bound active form of Rab5. Moreover, the FYVE domain of RUFY3 resembles the FYVE-related domain as it lacks the WxxD motif (x for any residue).	52
277284	cd15745	FYVE_RUFY4	FYVE-related domain found in RUN and FYVE domain-containing protein 4 (RUFY4) and similar proteins. RUFY4 belongs to the RUFY protein family, which is characterized by the presence of an N-terminal RUN domain and a C-terminal FYVE domain. The FYVE domain of RUFY4 resembles the FYVE-related domain as it lacks the WxxD motif (x for any residue). The biological function of RUFY4 remains unclear.	52
277285	cd15746	FYVE_RP3A_like	FYVE-related domain found in rabphilin-3A, Rab effector Noc2, and similar proteins. This family includes rabphilin-3A and Rab effector Noc2. Rabphilin-3A, also termed exophilin-1, is an effector protein that binds to the GTP-bound form of Rab3A, which is one of the most abundant Rab proteins in neurons and predominantly localized to synaptic vesicles. Rabphilin-3A is homologous to alpha-Rab3-interacting molecules (RIMs). It is a multi-domain protein containing an N-terminal Rab3A effector domain, a proline-rich linker region, and two tandem C2 domains. The effector domain binds specifically to the activated GTP-bound state of Rab3A and harbors a conserved FYVE zinc finger. The C2 domains are responsible for the binding of phosphatidylinositol-4,5-bisphosphate (PIP2), a key player in the neurotransmitter release process. Thus, Rabphilin-3A has also been implicated in vesicle trafficking. Rab effector Noc2, also termed No C2 domains protein, or rabphilin-3A-like protein (RPH3AL), is a Rab3 effector that mediates the regulation of secretory vesicle exocytosis in neurons and certain endocrine cells. It also functions as a Rab27 effector and is involved in isoproterenol (IPR)-stimulated amylase release from acinar cells. Noc2 contains an N-terminal Rab3A effector domain which only harbors a conserved FYVE zinc finger. The FYVE domains of Rabphilin-3A and Noc2 resemble a FYVE-related domain that is structurally similar to the canonical FYVE domains but lacks the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif.	55
277286	cd15747	FYVE_Slp3_4_5	FYVE-related domain found in the synaptotagmin-like proteins 3, 4, 5. The synaptotagmin-like proteins 1-5 (Slp1-5) family belongs to the carboxyl-terminal-type (C-type) tandem C2 proteins superfamily, which also contains the synaptotagmin and the Doc2 families. Slp proteins are putative membrane trafficking proteins that are characterized by the presence of a unique N-terminal Slp homology domain (SHD), and C-terminal tandem C2 domains (known as the C2A domain and C2B domain). The SHD consists of two conserved regions, designated SHD1 (Slp homology domain 1) and SHD2. The SHD1 and SHD2 of Slp3, Slp4 and Slp5 are separated by a putative FYVE zinc finger. By contrast, Slp1 and Slp2 lack such zinc finger and their SHD1 and SHD2 are linked together. This model corresponds to the FYVE zinc finger. At this point, Slp1 and Slp2 are not included in this model. Moreover, the FYVE domains of Slp3, Slp4 and Slp5 resemble a FYVE-related domain that is structurally similar to the canonical FYVE domains but lacks the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif.	48
277287	cd15748	FYVE_SPIR	FYVE-related domain found in Spir proteins, Spire1 and Spire2. Spir proteins were originally discovered as the protein products of the Drosophila spire gene. They are Jun N-terminal kinase (JNK)-interacting proteins that have exclusively been identified in metazoans. They may play roles in membrane trafficking and cortical filament crosslinking. This family includes Spire1 and Spire2, which function as new essential factors in asymmetric division of oocytes. They mediate asymmetric spindle positioning by assembling a cytoplasmic actin network. They are also required for polar body extrusion by promoting assembly of the cleavage furrow. Moreover, they cooperate synergistically with Fmn2 to assemble F-actin in oocytes. Both Spire1 and Spire2 contain an N-terminal protein-interaction KIND domain, WH2 actin-binding domains, a Rab GTPase-interaction Spir-box, and a C-terminal FYVE membrane-binding domain. Their FYVE domains resemble FYVE-related domains that are structurally similar to the canonical FYVE domains but lack the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif, which form a binding pocket that specifically binds the phospholipid phosphatidylinositol 3-phosphate (PtdIns3P or PI3P).	42
277288	cd15749	FYVE_ZFY19	FYVE-related domain found in FYVE domain-containing protein 19 (ZFY19) and similar proteins. ZFY19, also termed mixed lineage leukemia (MLL) partner containing FYVE domain, is encoded by a novel gene, MLL partner containing FYVE domain (MPFYVE). The FYVE domain of ZFY19 resembles FYVE-related domains that are structurally similar to the canonical FYVE domains but lack the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif. The biological function of ZFY19 remains unclear.	51
277289	cd15750	FYVE_CARP	FYVE-like domain found in caspase-associated ring proteins, CARP1 and CARP2. CARP1 and CARP2 are a novel group of caspase regulators distinguished by the presence of a FYVE-type zinc finger domain. They do not localize to membranes in the cell and are involved in the negative regulation of apoptosis, specifically targeting two initiator caspases, caspase 8 and caspase 10, which distinguishes them from other FYVE-type proteins. Moreover, these proteins have an altered sequence in the basic ligand binding patch and lack the WxxD (x for any residue) motif that is conserved only in phosphoinositide binding FYVE domains. Thus they constitute a family of unique FYVE-type domains called FYVE-like domains.	47
277290	cd15751	FYVE_BSN_PCLO	FYVE-related domain found in protein bassoon and piccolo. This family includes protein bassoon and piccolo. Protein bassoon, also termed zinc finger protein 231, is a core component of the presynaptic cytomatrix. It is a vertebrate-specific active zone scaffolding protein that plays a key role in structural organization and functional regulation of presynaptic release sites. Bassoon may modulate synaptic transmission efficiency by binding to presynaptic P/Q-type voltage-dependent calcium channel (VDCC) complexes and modify the channel function. As one of the most highly phosphorylated synaptic proteins, bassoon can interact with the small ubiquitous adaptor protein 14-3-3 in a phosphorylation-dependent manner, which modulates its anchoring to the presynaptic cytomatrix. Protein piccolo, also termed aczonin, is a neuron-specific presynaptic active zone scaffolding protein that mainly interacts with a detergent-resistant cytoskeletal-like subcellular fraction and is involved in the organization of the interplay between neurotransmitter vesicles, the cytoskeleton, and the plasma membrane at synaptic active zones. It binds profilin, an actin-binding protein implicated in actin cytoskeletal dynamics. It also functions as a presynaptic low-affinity Ca2+ sensor and has been implicated in Ca2+ regulation of neurotransmitter release. Both bassoon and piccolo contain two N-terminal FYVE zinc fingers, a PDZ domain and two C-terminal C2 domains. Their FYVE domain resembles a FYVE-related domain that is structurally similar to the canonical FYVE domains but lacks the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif. 	62
277291	cd15752	FYVE_SlaC2-a	FYVE-related domain found in Slp homolog lacking C2 domains a (SlaC2-a) and similar proteins. SlaC2-a, also termed melanophilin, or exophilin-3, is a GTP-bound form of Rab27A-, myosin Va-, and actin-binding protein present on melanosomes. It is involved in the control of transferring of melanosomes from microtubules to actin filaments. It also functions as a melanocyte type myosin Va (McM5) binding partner and directly activates the actin-activated ATPase activity of McM5 through forming a tripartite protein complex with Rab27A and an actin-based motor myosin Va. SlaC2-a belongs to the Slp homolog lacking C2 domains (Slac2) family. It contains an N-terminal Slp homology domain (SHD), but lacks tandem C2 domains. The SHD consists of two conserved regions, designated SHD1 (Slp homology domain 1) and SHD2, which may function as protein interaction sites. The SHD1 and SHD2 of SlaC2-a are separated by a putative FYVE zinc finger, which resembles a FYVE-related domain that is structurally similar to the canonical FYVE domains but lacks the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif. Moreover, Slac2-a has a middle myosin-binding domain and a C-terminal actin-binding domain.	76
277292	cd15753	FYVE_SlaC2-c	FYVE-related domain found in Slp homolog lacking C2 domains c (SlaC2-c) and similar proteins. SlaC2-c, also termed Rab effector MyRIP, or exophilin-8, or myosin-VIIa- and Rab-interacting protein, or synaptotagmin-like protein lacking C2 domains c, is a GTP-bound form of Rab27A-, myosin Va/VIIa-, and actin-binding protein mainly present on retinal melanosomes and secretory granules. It may play a role in insulin granule exocytosis. It is also involved in the control of isoproterenol (IPR)-induced amylase release from parotid acinar cells. SlaC2-c belongs to the Slp homolog lacking C2 domains (Slac2) family. It contains an N-terminal Slp homology domain (SHD), but lacks tandem C2 domains. The SHD consists of two conserved regions, designated SHD1 (Slp homology domain 1) and SHD2, which may function as protein interaction sites. The SHD1 and SHD2 of SlaC2-c are separated by a putative FYVE zinc finger, which resembles a FYVE-related domain that is structurally similar to the canonical FYVE domains but lacks the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif. Moreover, Slac2-c has a middle myosin-binding domain and a C-terminal actin-binding domain. 	49
277293	cd15754	FYVE_PKHF1	FYVE domain found in protein containing both PH and FYVE domains 1 (phafin-1) and similar proteins. Phafin-1, also termed lysosome-associated apoptosis-inducing protein containing PH (pleckstrin homology) and FYVE domains (LAPF), or pleckstrin homology domain-containing family F member 1 (PKHF1), or PH domain-containing family F member 1, or apoptosis-inducing protein, or PH and FYVE domain-containing protein 1, or zinc finger FYVE domain-containing protein 15, is a representative of a novel family of PH and FYVE domain-containing proteins called phafins. It is a ubiquitously expressed pro-apoptotic protein that translocates to lysosomes and facilitates apoptosis induction through a lysosomal-mitochondrial apoptotic pathway.	64
277294	cd15755	FYVE_PKHF2	FYVE domain found in protein containing both PH and FYVE domains 2 (phafin-2) and similar proteins. Phafin-2, also termed endoplasmic reticulum-associated apoptosis-involved protein containing PH and FYVE domains (EAPF), or pleckstrin homology domain-containing family F member 2 (PKHF2), or PH domain-containing family F member 2, or PH and FYVE domain-containing protein 2, or zinc finger FYVE domain-containing protein 18, is a ubiquitously expressed endoplasmic reticulum-associated protein that facilitates tumor necrosis factor alpha (TNF-alpha)-triggered cellular apoptosis through endoplasmic reticulum (ER)-mitochondrial apoptotic pathway. It is an endosomal phosphatidylinositol 3-phosphate (PtdIns3P or PI3P) effector, as well as an interactor of the endosomal-tethering protein EEA1. It regulates endosome fusion upstream of Rab5. Phafin-2 also functions as a novel regulator of endocytic epidermal growth factor receptor (EGFR) degradation through a role in endosomal fusion.	64
277295	cd15756	FYVE_WDFY1	FYVE domain found in WD40 repeat and FYVE domain-containing protein 1 (WDFY1) and similar proteins. WDFY1, also termed FYVE domain containing protein localized to endosomes-1 (FENS-1), or phosphoinositide-binding protein 1, or zinc finger FYVE domain-containing protein 17, is a novel single FYVE domain containing protein that binds phosphatidylinositol 3-phosphate (PtdIns3P or PI3P) with high specificity over other phosphoinositides. Localization of WDFY1 to early endosomes requires an intact FYVE domain and is inhibited by wortmannin, a PI3-kinase inhibitor. In addition to the FYVE domain, WDFY1 harbors multiple WD-40 repeats.	76
277296	cd15757	FYVE_WDFY2	FYVE domain found in WD40 repeat and FYVE domain-containing protein 2 (WDFY2). WDFY2, also termed zinc finger FYVE domain-containing protein 22, or ProF (propeller-FYVE protein), is a phosphatidylinositol 3-phosphate (PtdIns3P or PI3P) binding protein that is localized to a distinct subset of early endosomes close to the plasma membrane. It interacts preferentially with endogenous serine/threonine kinase Akt2, but not Akt1, and plays a specific role in modulating signaling through Akt downstream of the interaction of this kinase with the endosomal proteins APPL (adaptor protein containing PH domain, PTB domain, and leucine zipper motif). In addition to Akt, WDFY2 serves as a binding partner for protein kinase C, zeta (PRKCZ), and its substrate vesicle-associated membrane protein 2 (VAMP2), and is involved in vesicle cycling in various secretory pathways. Moreover, silencing of WDFY2 by siRNA produces a strong inhibition of endocytosis. WDFY2 contains WD40 motifs and a FYVE domain.	70
277297	cd15758	FYVE_RUFY1	FYVE domain found in RUN and FYVE domain-containing protein 1 (RUFY1) and similar proteins. RUFY1, also termed FYVE-finger protein EIP1, or La-binding protein 1, or Rab4-interacting protein (Rabip4), or Zinc finger FYVE domain-containing protein 12 (ZFY12), is a human homologue of mouse Rabip4, an effector of Rab4 GTPase that regulates recycling of endocytosed cargo. RUFY1 is an endosomal protein that functions as a dual effector of Rab4 and Rab14 and is involved in efficient recycling of transferrin (Tfn). It is a downstream effector of Etk, a downstream tyrosine kinase of PI3-kinase that is involved in regulation of vesicle trafficking. RUFY1 contains an N-terminal RUN domain and a C-terminal FYVE domain with two coiled-coil domains in-between.	71
277298	cd15759	FYVE_RUFY2	FYVE domain found in RUN and FYVE domain-containing protein 2 (RUFY2) and similar proteins. RUFY2, also termed Rab4-interacting protein related, is a novel embryonic factor that contains an N-terminal RUN domain and a C-terminal FYVE domain with two coiled-coil domains in-between. It is present in the nucleus at early stages of embryonic development. It may have both endosomal functions in the cytoplasm and nuclear functions. 	71
277299	cd15760	FYVE_scVPS27p_like	FYVE domain found in Saccharomyces cerevisiae vacuolar protein sorting-associated protein 27 (scVps27p) and similar proteins. scVps27p, also termed Golgi retention defective protein 11, is the putative yeast counterpart of the mammalian protein Hrs and is involved in endosome maturation. It is a mono-ubiquitin-binding protein that interacts with ubiquitinated cargoes, such as Hse1p, and is required for protein sorting into the multivesicular body. Vps27p forms a complex with Hse1p. The complex binds ubiquitin and mediates endosomal protein sorting. At the endosome, Vps27p and a trimeric protein complex, ESCRT-1, bind ubiquitin and are important for multivesicular body (MVB) sorting. Vps27p contains an N-terminal VHS (Vps27/Hrs/STAM) domain, a FYVE domain that binds PtdIns3P, followed by two ubiquitin-interacting motifs (UIMs), and a C-terminal clathrin-binding motif.	59
277300	cd15761	FYVE1_Vac1p_like	FYVE-related domain 1 found in yeast protein VAC1 (Vac1p) and similar proteins. Vac1p, also termed vacuolar segregation protein Pep7p, or carboxypeptidase Y-deficient protein 7, or vacuolar protein sorting-associated protein 19 (Vps19p), or vacuolar protein-targeting protein 19, is a phosphatidylinositol 3-phosphate (PtdIns3P or PI3P)-binding protein that interacts with a Rab GTPase, GTP-bound form of Vps21p, and a Sec1p homologue, Vps45p, to facilitate Vps45p-dependent vesicle-mediated vacuolar protein sorting. It also acts as a novel regulator of vesicle docking and/or fusion at the endosome and functions in vesicle-mediated transport of Golgi precursor carboxypeptidase Y (CPY), protease A (PrA), protease B (PrB), but not alkaline phosphatase (ALP) from the trans-Golgi network-like compartment (TGN) to the endosome. Vac1p contains an N-terminal classical TFIIIA-like zinc finger, two putative zinc-binding FYVE fingers, and a C-terminal coiled coil region. The family corresponds to the first FYVE domain, which resembles the FYVE-related domain as it has an altered sequence in the basic ligand binding patch.	76
277301	cd15762	FYVE_RP3A	FYVE-related domain found in rabphilin-3A and similar proteins. Rabphilin-3A, also termed exophilin-1, is an effector protein that binds to the GTP-bound form of Rab3A, which is one of the most abundant Rab proteins in neurons and predominantly localized to synaptic vesicles. Rabphilin-3A is homologous to alpha-Rab3-interacting molecules (RIMs). It is a multi-domain protein containing an N-terminal Rab3A effector domain, a proline-rich linker region, and two tandem C2 domains. The effector domain binds specifically to the activated GTP-bound state of Rab3A and harbors a conserved FYVE zinc finger. The C2 domains are responsible for the binding of phosphatidylinositol-4,5-bisphosphate (PIP2), a key player in the neurotransmitter release process. Thus, Rabphilin-3A has also been implicated in vesicle trafficking. The FYVE domain of Rabphilin-3A resembles a FYVE-related domain that is structurally similar to the canonical FYVE domains but lacks the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif.	80
277302	cd15763	FYVE_RPH3L	FYVE-related domain found in Rab effector Noc2 and similar proteins. Rab effector Noc2, also termed No C2 domains protein, or rabphilin-3A-like protein (RPH3AL), is a Rab3 effector that mediates the regulation of secretory vesicle exocytosis in neurons and certain endocrine cells. It also functions as a Rab27 effector and is involved in isoproterenol (IPR)-stimulated amylase release from acinar cells. Noc2 contains an N-terminal Rab3A effector domain which harbors a conserved zinc finger, but lacks tandem C2 domains. The FYVE domain of Noc2 resembles a FYVE-related domain that is structurally similar to the canonical FYVE domains but lacks the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif.	64
277303	cd15764	FYVE_Slp4	FYVE-related domain found in synaptotagmin-like protein 4 (Slp4) and similar proteins. Slp4, also termed exophilin-2, or granuphilin, has been characterized as a regulator of the release of insulin granules from pancreatic beta-cells and dense core granules from PC12 neuronal cells by binding to Rab27A, and amylase granules from parotid gland acinar cells through interaction with syntaxin-2/3 in a Munc18-2-dependent manner on the apical plasma membrane. It can bind to syntaxin 2 in parotid acinar cells. It is also involved in granule transport by recruitment of the motor protein myosin Va. Moreover, it requires Rab8 to increase granule release in platelets. Slp4 contains an N-terminal Slp homology domain (SHD) and C-terminal tandem C2 domains. The Slp homology domain (SHD) consists of two conserved regions, designated SHD1 (Slp homology domain 1) and SHD2, which may function as protein interaction sites. The SHD1 and SHD2 of Slp4 are separated by a putative FYVE zinc finger, which resembles a FYVE-related domain that is structurally similar to the canonical FYVE domains but lacks the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif.	50
277304	cd15765	FYVE_Slp3	FYVE-related domain found in synaptotagmin-like protein 3 (Slp3) and similar proteins. Slp3, also termed exophilin-6, functions as a Rab27A-specific effector in cytotoxic T lymphocytes. It binds to kinesin-1 motor through interaction with the tetratricopeptide repeat of the kinesin-1 light chain (KLC1). The kinesin-1/Slp3/Rab27a complex plays a role in mediating the terminal transport of lytic granules to the immune synapse. Slp3 contains an N-terminal Slp homology domain (SHD) and C-terminal tandem C2 domains. The Slp homology domain (SHD) consists of two conserved regions, designated SHD1 (Slp homology domain 1) and SHD2, which may function as protein interaction sites. The SHD1 and SHD2 of Slp3 are separated by a putative FYVE zinc finger, which resembles a FYVE-related domain that is structurally similar to the canonical FYVE domains but lacks the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif. In addition, the Slp3 C2A domain showed Ca2+-dependent phospholipid binding activity. Slp3 is thus a Ca2+-dependent isoform in the Slp protein family.	48
277305	cd15766	FYVE_Slp5	FYVE-related domain found in synaptotagmin-like protein 5 (Slp5) and similar proteins. Slp5 is a novel Rab27A-specific effector that is highly expressed in placenta and liver. Slp5 specifically interacted with the GTP-bound form of Rab27A and is involved in Rab27A-dependent membrane trafficking in specific tissues. Slp5 contains an N-terminal Slp homology domain (SHD) and C-terminal tandem C2 domains. The Slp homology domain (SHD) consists of two conserved regions, designated SHD1 (Slp homology domain 1) and SHD2, which may function as protein interaction sites. The SHD1 and SHD2 of Slp5 are separated by a putative FYVE zinc finger, which resembles a FYVE-related domain that is structurally similar to the canonical FYVE domains but lacks the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif.	47
277306	cd15767	FYVE_SPIR1	FYVE-related domain found in protein spire homolog 1 (Spire1) and similar proteins. Spire1 is encoded by gene spir-1, which is primarily found to be expressed in the developing nervous system and in neuronal cells of the adult brain, as well as in the fetal liver and in the adult spleen. It functions as a new essential factor in asymmetric division of oocytes. It mediates asymmetric spindle positioning by assembling a cytoplasmic actin network. It is also required for polar body extrusion by promoting assembly of the cleavage furrow. Moreover, it cooperates synergistically with Fmn2 to assemble F-actin in oocytes. Spire1 contains an N-terminal protein-interaction KIND domain, WH2 actin-binding domains, a Rab GTPase-interaction Spir-box, and a C-terminal FYVE membrane-binding domain. The FYVE domain resembles a FYVE-related domain that is structurally similar to the canonical FYVE domains but lacks the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif, which form a binding pocket that specifically binds the phospholipid phosphatidylinositol 3-phosphate (PtdIns3P or PI3P).	79
277307	cd15768	FYVE_SPIR2	FYVE-related domain found in protein spire homolog 2 (Spire2) and similar proteins. Spire2 is encoded by gene spir-2, which is expressed in the nervous system and highly expressed in the colonic epithelium. It functions as a new essential factor in asymmetric division of oocytes. It mediates asymmetric spindle positioning by assembling a cytoplasmic actin network. It is also required for polar body extrusion by promoting assembly of the cleavage furrow. Moreover, it cooperates synergistically with Fmn2 to assemble F-actin in oocytes. Spire2 contains an N-terminal protein-interaction KIND domain, WH2 actin-binding domains, a Rab GTPase-interaction Spir-box, and a C-terminal FYVE membrane-binding domain. The FYVE domain resembles a FYVE-related domain that is structurally similar to the canonical FYVE domains but lacks the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif, which form a binding pocket that specifically binds the phospholipid phosphatidylinositol 3-phosphate (PtdIns3P or PI3P).	112
277308	cd15769	FYVE_CARP1	FYVE-like domain found in caspase regulator CARP1 and similar proteins. CARP1, also termed E3 ubiquitin-protein ligase RNF34, or caspases-8 and -10-associated RING finger protein 1, or FYVE-RING finger protein Momo, or RING finger homologous to inhibitor of apoptosis protein (RFI), or RING finger protein 34, or RING finger protein RIFF, is a nuclear protein that functions as a specific E3 ubiquitin ligase for the transcriptional coactivator PGC-1alpha, a master regulator of energy metabolism and adaptive thermogenesis in the brown fat cell, and negatively regulates brown fat cell metabolism. It is preferentially expressed in esophageal, gastric and colorectal cancers, suggesting a possible association with the development of the digestive tract cancers. It regulates the p53 signaling pathway through degrading 14-3-3 sigma and stabilizing MDM2. CARP1 does not localize to membranes in the cell and is involved in the negative regulation of apoptosis, specifically targeting two initiator caspases, caspase 8 and caspase 10, which are distinguished from other FYVE-type proteins. Moreover, CARP1 has an altered sequence in the basic ligand binding patch and lacks the WxxD (x for any residue) motif that is conserved only in phosphoinositide binding FYVE domains. Thus it belongs to a family of unique FYVE-type domains called FYVE-like domains. In addition to the N-terminal FYVE-like domain, CARP1 harbors a C-terminal RING domain.	47
277309	cd15770	FYVE_CARP2	FYVE-like domain found in caspase regulator CARP2 and similar proteins. CARP2, also termed E3 ubiquitin-protein ligase rififylin, or caspases-8 and -10-associated RING finger protein 2, or FYVE-RING finger protein Sakura (Fring), or RING finger and FYVE-like domain-containing protein 1, or RING finger protein 189, or RING finger protein 34-like, is a novel caspase regulator containing a FYVE-type zinc finger domain. It regulates the p53 signaling pathway through degrading 14-3-3 sigma and stabilizing MDM2. CARP2 does not localize to membranes in the cell and is involved in the negative regulation of apoptosis, specifically targeting two initiator caspases, caspase 8 and caspase 10, which are distinguished from other FYVE-type proteins. Moreover, CARP2 has an altered sequence in the basic ligand binding patch and lacks the WxxD (x for any residue) motif that is conserved only in phosphoinositide binding FYVE domains. Thus it belongs to a family of unique FYVE-type domains called FYVE-like domains. In addition to the N-terminal FYVE-like domain, CARP2 harbors a C-terminal RING domain.	49
277310	cd15771	FYVE1_BSN_PCLO	FYVE-related domain 1 found in protein bassoon and piccolo. This family includes protein bassoon and piccolo. Protein bassoon, also termed zinc finger protein 231, is a core component of the presynaptic cytomatrix. It is a vertebrate-specific active zone scaffolding protein that plays a key role in structural organization and functional regulation of presynaptic release sites. Bassoon may modulate synaptic transmission efficiency by binding to presynaptic P/Q-type voltage-dependent calcium channel (VDCC) complexes and modify the channel function. As one of the most highly phosphorylated synaptic proteins, bassoon can interact with the small ubiquitous adaptor protein 14-3-3 in a phosphorylation-dependent manner, which modulates its anchoring to the presynaptic cytomatrix. Protein piccolo, also termed aczonin, is a neuron-specific presynaptic active zone scaffolding protein that mainly interacts with a detergent-resistant cytoskeletal-like subcellular fraction and is involved in the organization of the interplay between neurotransmitter vesicles, the cytoskeleton, and the plasma membrane at synaptic active zones. It binds profilin, an actin-binding protein implicated in actin cytoskeletal dynamics. It also functions as a presynaptic low-affinity Ca2+ sensor and has been implicated in Ca2+ regulation of neurotransmitter release. Both bassoon and piccolo contain two N-terminal FYVE zinc fingers, a PDZ domain and two C-terminal C2 domains. Their FYVE domain resembles a FYVE-related domain that is structurally similar to the canonical FYVE domains but lacks the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif. This model corresponds to the first FYVE-related domain.	61
277311	cd15772	FYVE2_BSN_PCLO	FYVE-related domain 2 found in protein bassoon and piccolo. This family includes protein bassoon and piccolo. Protein bassoon, also termed zinc finger protein 231, is a core component of the presynaptic cytomatrix. It is a vertebrate-specific active zone scaffolding protein that plays a key role in structural organization and functional regulation of presynaptic release sites. Bassoon may modulate synaptic transmission efficiency by binding to presynaptic P/Q-type voltage-dependent calcium channel (VDCC) complexes and modify the channel function. As one of the most highly phosphorylated synaptic proteins, bassoon can interact with the small ubiquitous adaptor protein 14-3-3 in a phosphorylation-dependent manner, which modulates its anchoring to the presynaptic cytomatrix. Protein piccolo, also termed aczonin, is a neuron-specific presynaptic active zone scaffolding protein that mainly interacts with a detergent-resistant cytoskeletal-like subcellular fraction and is involved in the organization of the interplay between neurotransmitter vesicles, the cytoskeleton, and the plasma membrane at synaptic active zones. It binds profilin, an actin-binding protein implicated in actin cytoskeletal dynamics. It also functions as a presynaptic low-affinity Ca2+ sensor and has been implicated in Ca2+ regulation of neurotransmitter release. Both bassoon and piccolo contain two N-terminal FYVE zinc fingers, a PDZ domain and two C-terminal C2 domains. Their FYVE domain resembles a FYVE-related domain that is structurally similar to the canonical FYVE domains but lacks the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif. This model corresponds to the second FYVE-related domain.	64
277312	cd15773	FYVE1_BSN	FYVE-related domain 1 found in protein bassoon. Protein bassoon, also termed zinc finger protein 231, is a core component of the presynaptic cytomatrix. It is a vertebrate-specific active zone scaffolding protein that plays a key role in structural organization and functional regulation of presynaptic release sites. Bassoon may modulate synaptic transmission efficiency by binding to presynaptic P/Q-type voltage-dependent calcium channel (VDCC) complexes and modify the channel function. As one of the most highly phosphorylated synaptic proteins, bassoon can interact with the small ubiquitous adaptor protein 14-3-3 in a phosphorylation-dependent manner, which modulates its anchoring to the presynaptic cytomatrix. Bassoon contains two N-terminal FYVE zinc fingers, a PDZ domain and two C-terminal C2 domains. This family corresponds to the first FYVE domain, which resembles a FYVE-related domain that is structurally similar to the canonical FYVE domains but lacks the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif.	64
277313	cd15774	FYVE1_PCLO	FYVE-related domain 1 found in protein piccolo. Protein piccolo, also termed aczonin, is a neuron-specific presynaptic active zone scaffolding protein that mainly interacts with a detergent-resistant cytoskeletal-like subcellular fraction and is involved in the organization of the interplay between neurotransmitter vesicles, the cytoskeleton, and the plasma membrane at synaptic active zones. It binds profilin, an actin-binding protein implicated in actin cytoskeletal dynamics. It also functions as a presynaptic low-affinity Ca2+ sensor and has been implicated in Ca2+ regulation of neurotransmitter release. Piccolo is a multi-domain protein containing two N-terminal FYVE zinc fingers, a polyproline tract, and a PDZ domain and two C-terminal C2 domains. This family corresponds to the first FYVE domain, which resembles a FYVE-related domain that is structurally similar to the canonical FYVE domains but lacks the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif.	62
277314	cd15775	FYVE2_BSN	FYVE-related domain 2 found in protein bassoon. Protein bassoon, also termed zinc finger protein 231, is a core component of the presynaptic cytomatrix. It is a vertebrate-specific active zone scaffolding protein that plays a key role in structural organization and functional regulation of presynaptic release sites. Bassoon may modulate synaptic transmission efficiency by binding to presynaptic P/Q-type voltage-dependent calcium channel (VDCC) complexes and modify the channel function. As one of the most highly phosphorylated synaptic proteins, bassoon can interact with the small ubiquitous adaptor protein 14-3-3 in a phosphorylation-dependent manner, which modulates its anchoring to the presynaptic cytomatrix. Bassoon contains two N-terminal FYVE zinc fingers, a PDZ domain and two C-terminal C2 domains. This family corresponds to the second FYVE domain, which resembles a FYVE-related domain that is structurally similar to the canonical FYVE domains but lacks the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif.	65
277315	cd15776	FYVE2_PCLO	FYVE-related domain 2 found in protein piccolo. Protein piccolo, also termed aczonin, is a neuron-specific presynaptic active zone scaffolding protein that mainly interacts with a detergent-resistant cytoskeletal-like subcellular fraction and is involved in the organization of the interplay between neurotransmitter vesicles, the cytoskeleton, and the plasma membrane at synaptic active zones. It binds profilin, an actin-binding protein implicated in actin cytoskeletal dynamics. It also functions as a presynaptic low-affinity Ca2+ sensor and has been implicated in Ca2+ regulation of neurotransmitter release. Piccolo is a multi-domain protein containing two N-terminal FYVE zinc fingers, a polyproline tract, and a PDZ domain and two C-terminal C2 domains. This family corresponds to the second FYVE domain, which resembles a FYVE-related domain that is structurally similar to the canonical FYVE domains but lacks the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif.	64
276940	cd15777	CRBN_C_like	Thalidomide-binding C-terminal domain of cereblon (CRBN) and similar protein domains. Cereblon is part of an E3 ubiquitin ligase complex, together with damaged DNA-binding protein 1 (DDB1), CUL4A and ROC1. Cereblon interacts directly with DDB1, although the C-terminal domain characterized here does not contribute to that interaction. Ubiquitination of cellular targets by this complex increases levels of FGF8 and FGF10, which was shown to affect the development of limbs and auditory vesicles in embryogenesis. The C-terminal domain of Cereblon was shown to contain the binding site for thalidomide and its analogs, a class of teratogenic drugs that exhibit an antiproliferative effect on myelomas. Mutations in CRBN, some of which map onto the C-terminal domain, were associated with autosomal recessive mental retardation, which may have to do with interactions between CRBN and ion channels in the brain.	101
275446	cd15778	Lreu_0056_like	Proteins similar to Lactobacillus reuteri ORF 0056. This family of Lactobacillus proteins has not been characterized. The 3D structure has been solved for a hypothetical protein with a predicted signal peptide, as part of a wider examination of the structural biology of commensal human gut flora and pathogens.	112
294014	cd15783	SA1633_like	Uncharacterized protein family conserved in Staphylococci. Some proteins in this family have been described as putative beta-lactamases. They are structurally similar to the C-terminal beta-grasp domains of Staphylococcal and Streptococcal superantigens.	143
275431	cd15784	PH_RUTBC	Rab-binding Pleckstrin homology domain (PH) of small G-protein signaling modulator 1 and similar proteins. Small G-protein signaling modulator 1, or RUN and TBC1 domain containing 2 (RUTBC2), as well as RUTBC1, bind to Rab9A via their Pleckstrin homology (PH) domain. They do not seem to act as GAP proteins that stimulate GTP hydrolysis by Rab9A, and RUTBC2 has been shown to also interact with Rab9B, most likely in a similar manner. RUTBC1 does stimulate GTP hydrolysis by Rab32 and Rab33B, however, while RUTBC2 appears to be a GAP for Rab36. Rab9A and associated proteins control the recycling of mannose-6-phosphate receptors from late endosomes to the trans-Golgi.	176
276946	cd15785	YycH_N_like	N-terminal domain of YycH and structurally similar proteins conserved in Firmicutes. These protein domains appear to be members of a somewhat larger structural family conserved in Firmicutes, including the N-terminal domain of YycH. YycH plays a role in signal transduction and is found immediately downstream of the essential histidine kinase YycG. YycG forms a two-component system together with its cognate response regulator YycF. YycH functions as a modulator of YycG activity, possibly by interacting with YycI. All three molecules (YycG, YycH, and YycI) have been characterized as membrane proteins, and they may be able to form homodimers.	113
276947	cd15786	CPF_1278_like	Uncharacterized protein conserved in Clostridia. This protein appears to be a member of a somewhat larger structural family conserved in Firmicutes. The 3D structure is available for one protein, CPF_1278, which has been labelled a putative lipoprotein. CPF_1278 displays structural similarity to the N-terminal domain of YycH, which plays a role in signal transduction and is found immediately downstream of the essential histidine kinase YycG. YycG forms a two-component system together with its cognate response regulator YycF. YycH functions as a modulator of YycG activity, possibly by interacting with YycI. All three molecules (YycG, YycH, and YycI) have been characterized as membrane proteins, and they may be able to form homodimers.	124
276948	cd15787	YycH_N	N-terminal domain of YycH and similar proteins. This protein appears to be a member of a somewhat larger structural family conserved in Firmicutes. YycH plays a role in signal transduction and is found immediately downstream of the essential histidine kinase YycG. YycG forms a two-component system together with its cognate response regulator YycF. YycH functions as a modulator of YycG activity, possibly by interacting with YycI. All three molecules (YycG, YycH, and YycI) have been characterized as membrane proteins, and they may be able to form homodimers. This model describes the N-terminal domain of YycH.	143
276949	cd15788	Clospo_01618_like	Uncharacterized protein conserved in Clostridia. This protein appears to be a member of a somewhat larger structural family conserved in Firmicutes. It displays structural similarity to the N-terminal domain of YycH, which plays a role in signal transduction and is found immediately downstream of the essential histidine kinase YycG. YycG forms a two-component system together with its cognate response regulator YycF. YycH functions as a modulator of YycG activity, possibly by interacting with YycI. All three molecules (YycG, YycH, and YycI) have been characterized as membrane proteins, and they may be able to form homodimers.	129
275432	cd15789	PH_ARHGEF2_18_like	rho guanine nucleotide exchange factor. RhoGEFs belong to regulator of G-protein signaling (RGS) domain-containing RhoGEFs that are RhoA-selective and directly activated by the Galpha12/13 family of heterotrimeric G proteins. The members here all contain Dbl homology (DH)-PH domains. In addition some members contain N-terminal C1 (Protein kinase C conserved region 1) domains, PDZ (also called DHR/Dlg homologous regions) domains, ANK (ankyrin) domains, and RGS (Regulator of G-protein signalling) domains or C-terminal ATP-synthase B subunit. The DH-PH domains bind and catalyze the exchange of GDP for GTP on RhoA. RhoGEF2/Rho guanine nucleotide exchange factor 2, p114RhoGEF/p114 Rho guanine nucleotide exchange factor, p115RhoGEF, p190RhoGEF, PRG/PDZ Rho guanine nucleotide exchange factor, RhoGEF 11, RhoGEF 12, RhoGEF 18, AKAP13/A-kinase anchoring protein 13, and LARG/Leukemia-associated Rho guanine nucleotide exchange factor are included in this CD. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	102
275433	cd15790	PH-GRAM_MTMR11	Myotubularin (MTM) related 11 protein (MTMR11) Pleckstrin Homology-Glucosyltransferases, Rab-like GTPase activators and Myotubularins (PH-GRAM) domain. MTMR10, MTMR11, and MTMR12 are catalytically inactive phosphatases that play a role as an adapter for the phosphatase myotubularin to regulate myotubularin intracellular location. They contain a Glu residue instead of a conserved Cys residue in the dsPTPase catalytic loop, which renders them catalytically inactive as phosphatases. They contain an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, an inactive PTP domain, a SET interaction domain, and a C-terminal coiled-coil domain. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. Mutations in this family cause the human neuromuscular disorders myotubular myopathy and type 4B Charcot-Marie-Tooth syndrome. 6 of the 13 MTMRs (MTMRs 5, 9-13) contain naturally occurring substitutions of residues required for catalysis by PTP family enzymes. Although these proteins are predicted to be enzymatically inactive, they are thought to function as antagonists of endogenous phosphatase activity or interaction modules. Most MTMRs contain an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID) domain, a PTP domain (which may be active or inactive), a SET-interaction domain, and a C-terminal coiled-coil region. In addition some members contain a DENN domain N-terminal to the PH-GRAM domain and FYVE, PDZ, and PH domains C-terminal to the coiled-coil region. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold.	123
275434	cd15791	PH1_FDG4	FYVE, RhoGEF and PH domain containing/faciogenital dysplasia protein 4, N-terminal Pleckstrin homology (PH) domain. In general, FGDs have a RhoGEF (DH) domain, followed by an N-terminal PH domain, a FYVE domain and a C-terminal PH domain. All FGDs are guanine nucleotide exchange factors that activate the Rho GTPase Cdc42, an important regulator of membrane trafficking. The RhoGEF domain is responsible for GEF catalytic activity, while the N-terminal PH domain is involved in intracellular targeting of the DH domain. FGD4 is one of the genes associated with Charcot-Marie-Tooth neuropathy type 4 (CMT4), a group of progressive motor and sensory axonal and demyelinating neuropathies that are distinguished from other forms of CMT by autosomal recessive inheritance. Those affected have distal muscle weakness and atrophy associated with sensory loss and, frequently, pes cavus foot deformity. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	94
275435	cd15792	PH1_FGD5	FYVE, RhoGEF and PH domain containing/faciogenital dysplasia protein 5, N-terminal Pleckstrin Homology (PH) domain. FGD5 regulates the proangiogenic action of vascular endothelial growth factor (VEGF) in vascular endothelial cells, including network formation, permeability, directional movement, and proliferation. In general, FGDs have a RhoGEF (DH) domain, followed by a PH domain, a FYVE domain and a C-terminal PH domain. All FGDs are guanine nucleotide exchange factors that activate the Rho GTPase Cdc42, an important regulator of membrane trafficking. The RhoGEF domain is responsible for GEF catalytic activity, while the PH domain is involved in intracellular targeting of the DH domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	123
275436	cd15793	PH1_FGD6	FYVE, RhoGEF and PH domain containing/faciogenital dysplasia protein 6, N-terminal Pleckstrin Homology (PH) domain. FGD5 regulates the proangiogenic action of vascular endothelial growth factor (VEGF) in vascular endothelial cells, including network formation, permeability, directional movement, and proliferation. The specific function of FGD6 is unknown. In general, FGDs have a RhoGEF (DH) domain, followed by a PH domain, a FYVE domain and a C-terminal PH domain. All FGDs are guanine nucleotide exchange factors that activate the Rho GTPase Cdc42, an important regulator of membrane trafficking. The RhoGEF domain is responsible for GEF catalytic activity, while the PH domain is involved in intracellular targeting of the DH domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	123
275437	cd15794	PH_ARHGEF18	Rho guanine nucleotide exchange factor 18 Pleckstrin homology (PH) domain. ARHGEF18, also called p114RhoGEF, is a key regulator of RhoA-Rock2 signaling that is crucial for maintenance of polarity in the vertebrate retinal epithelium, and consequently is essential for cellular differentiation, morphology and eventually organ function. ARHGEF18 contains Dbl-homology (DH) and pleckstrin-homology (PH) domains which bind and catalyze the exchange of GDP for GTP on RhoA. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3 which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N-terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinase, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.	119
275439	cd15795	PMEI-Pla_a_1_like	Pollen allergen Pla a 1 and similar plant proteins. The major Platanus acerifolia pollen allergen Pla a 1 belongs to a class of allergens related to proteinaceous invertase and pectin methylesterase inhibitors. Platanus acerifolia is an important cause of pollinosis; Pla a 1 has a prevalence of about 80% among plane tree pollen-allergic patients. Recombinant Pla a 1 binds IgE in vitro, similar to its natural counterpart, rendering it suitable for specific diagnosis and structural studies. Invertase inhibitors are structurally similar to those of pectin methylesterase (PMEIs), an enzyme that is involved in the control of pectin metabolism and is structurally unrelated to invertases. All inhibitors share a size of about 18 kDa, two strictly conserved disulfide bridges and only moderate sequence homology (about 20% sequence identity).	148
275440	cd15796	CIF_like	Cell-wall inhibitor of beta fructosidase and similar proteins. Cell-wall invertases (CWIs) are secreted apoplastic enzymes belonging to the glycoside hydrolase family 32 (EC 3.2.1.26) that catalyze the hydrolytic cleavage of the disaccharide sucrose into glucose and fructose. Their activity is tightly regulated by compartment-specific inhibitor proteins at transcriptional and post-transcriptional levels. Invertase inhibitors are structurally similar to those of pectin methylesterase (PMEIs), an enzyme that is involved in the control of pectin metabolism and is structurally unrelated to invertases. All inhibitors share a size of about 18 kDa, two strictly conserved disulfide bridges and only moderate sequence homology (about 20% sequence identity). Interaction of invertase inhibitor Nt-CIF (Nicotiana tabacum cell-wall inhibitor of beta-fructosidase) with CWI is strictly pH-dependent, modulated between pH 4 and 6, with rapid dissociation at neutral pH mediated by structure rearrangement or surface charge pattern in the binding interface. Comparison of the CIF/INV1 structure with the complex between the structurally CIF-related pectin methylesterase inhibitor (PMEI) and pectin methylesterase indicates a common targeting mechanism in PMEI and CIF.	148
275441	cd15797	PMEI	Pectin methylesterase inhibitor. Pectin methylesterase (PME; Pectinesterase; EC 3.1.1.11; CAZy class 8 of carbohydrate esterases) catalyzes the demethylesterification of homogalacturonans in the cell wall. Its activity is regulated by the proteinaceous PME inhibitor (PMEI) which inhibits PME and invertase through formation of a non-covalent 1:1 complex. Depending on the mode of demethylesterification, PMEI activity results in either loosening or rigidification of the cell wall. PMEI has been implicated in the regulation of fruit development, carbohydrate metabolism and cell wall extension. It may also be involved in inhibiting microbial pathogen PMEs. Thus, PMEI probably plays an important physiological role in PME regulation in plants, possessing several potential applications in a food-technological context.	149
275442	cd15798	PMEI-like_3	Uncharacterized subfamily of plant invertase/pectin methylesterase inhibitor domains. This subfamily contains inhibitors similar to those of pectin methylesterase (PME; Pectinesterase; EC 3.1.1.11; CAZy class 8 of carbohydrate esterases) that catalyzes the demethylesterification of homogalacturonans in the cell wall. The proteinaceous PME inhibitor (PMEI) inhibits PME and invertase through formation of a non-covalent 1:1 complex. Depending on the mode of demethylesterification, PMEI activity results in either loosening or rigidification of the cell wall. PMEI has been implicated in the regulation of fruit development, carbohydrate metabolism and cell wall extension. It may also be involved in inhibiting microbial pathogen PMEs. Thus, PMEI probably plays an important physiological role in PME regulation in plants, possessing several potential applications in a food-technological context.	154
275443	cd15799	PMEI-like_4	plant invertase/pectin methylesterase inhibitor domain-containing protein. This subfamily contains inhibitors similar to those of pectin methylesterase (PME; Pectinesterase; EC 3.1.1.11; CAZy class 8 of carbohydrate esterases) that catalyzes the demethylesterification of homogalacturonans in the cell wall, and cell-wall invertases (CWIs) that catalyze the hydrolytic cleavage of the disaccharide sucrose into glucose and fructose. The proteinaceous PME inhibitor (PMEI) inhibits PME and invertase through formation of a non-covalent 1:1 complex. Cell-wall inhibitor of beta-fructosidase from tobacco (CIF) interacts with CWI in a strictly pH-dependent manner, modulated between pH 4 and 6, with rapid dissociation at neutral pH mediated by structure rearrangement or surface charge pattern in the binding interface. Comparison of the CIF/INV1 structure with the complex between the structurally CIF-related pectin methylesterase inhibitor (PMEI) and pectin methylesterase indicates a common targeting mechanism in PMEI and CIF.	151
275444	cd15800	PMEI-like_2	Uncharacterized subfamily of plant invertase/pectin methylesterase inhibitors. This subfamily contains inhibitors similar to those of pectin methylesterase (PME; Pectinesterase; EC 3.1.1.11; CAZy class 8 of carbohydrate esterases) that catalyzes the demethylesterification of homogalacturonans in the cell wall, and cell-wall invertases (CWIs) that catalyze the hydrolytic cleavage of the disaccharide sucrose into glucose and fructose. The proteinaceous PME inhibitor (PMEI) inhibits PME and invertase through formation of a non-covalent 1:1 complex. Cell-wall inhibitor of beta-fructosidase from tobacco (CIF) interacts with CWI in a strictly pH-dependent manner, modulated between pH 4 and 6, with rapid dissociation at neutral pH mediated by structure rearrangement or surface charge pattern in the binding interface. Comparison of the CIF/INV1 structure with the complex between the structurally CIF-related pectin methylesterase inhibitor (PMEI) and pectin methylesterase indicates a common targeting mechanism in PMEI and CIF.	148
275445	cd15801	PMEI-like_1	Uncharacterized subfamily of plant invertase/pectin methylesterase inhibitors. This subfamily contains inhibitors similar to those of pectin methylesterase (PME; Pectinesterase; EC 3.1.1.11; CAZy class 8 of carbohydrate esterases) that catalyzes the demethylesterification of homogalacturonans in the cell wall, and cell-wall invertases (CWIs) that catalyze the hydrolytic cleavage of the disaccharide sucrose into glucose and fructose. The proteinaceous PME inhibitor (PMEI) inhibits PME and invertase through formation of a non-covalent 1:1 complex. Cell-wall inhibitor of beta-fructosidase from tobacco (CIF) interacts with CWI in a strictly pH-dependent manner, modulated between pH 4 and 6, with rapid dissociation at neutral pH mediated by structure rearrangement or surface charge pattern in the binding interface. Comparison of the CIF/INV1 structure with the complex between the structurally CIF-related pectin methylesterase inhibitor (PMEI) and pectin methylesterase indicates a common targeting mechanism in PMEI and CIF.	146
276805	cd15802	RING_CBP-p300	atypical RING domain found in CREB-binding protein and p300 histone acetyltransferases. CBP and p300 (also known as CREBBP or KAT3A and EP300 or KAT3B, respectively) are two histone acetyltransferases (HATs) that associate with and acetylate transcriptional regulators and chromatin. The catalytic core of animal CBP-p300 contains a bromodomain, a CH2 region containing a discontinuous PHD domain interrupted by this RING domain, and a HAT domain. Bromodomain-RING-PHD forms a compact module in which the RING domain is juxtaposed with the HAT substrate-binding site. This RING domain contains only a single zinc ion-binding cluster instead of two; instead of a second zinc atom, a network of hydrophobic interactions stabilizes the domain. The RING domain has an inhibitory role. Disease mutations that disrupt RING attachment lead to upregulation of HAT activity. HAT regulation may require repositioning of the RING domain to facilitate access to an otherwise partially occluded HAT active site. Plant CBP-p300 type HATs lack a bromodomain, whose role in animal CBP-p300 proteins is to bind acetylated histones; it has been suggested that these plant proteins may utilize a different domain or another bromodomain protein to perform this function. This RING domain has also been referred to as DUF902.	73
276941	cd15803	RLR_C_like	C-terminal domain of Retinoic acid-inducible gene (RIG)-I-like Receptors, Cereblon (CRBN), and similar protein domains. Retinoic acid-inducible gene (RIG)-I-like Receptors (RLRs) are cytoplasmic RNA receptors that recognize non-self RNA and act as molecular sensors to detect viral pathogens. They play crucial roles in innate antiviral responses, including the production of proinflammatory cytokines and type I interferon. There are three RLRs in vertebrates, RIG-I, LGP2, and MDA5. They are characterized by a central DExD/H-box helicase domain and a C-terminal domain, both of which are responsible for binding viral RNA. Cereblon is part of an E3 ubiquitin ligase complex, together with damaged DNA binding protein 1 (DDB1), CUL4A and ROC1. Cereblon interacts directly with DDB1, although the C-terminal domain characterized here does not contribute to that interaction. The C-terminal domain of Cereblon was shown to contain the binding site for thalidomide and its analogs, a class of teratogenic drugs that exhibit an antiproliferative effect on myelomas. Mutations in CRBN, some of which map onto the C-terminal domain, were associated with autosomal recessive mental retardation, which may have to do with interactions between CRBN and ion channels in the brain. RLRs and Cereblon contain a common conserved zinc binding site in their C-terminal domains.	84
276942	cd15804	RLR_C	C-terminal domain of Retinoic acid-inducible gene (RIG)-I-like Receptors. Retinoic acid-inducible gene (RIG)-I-like Receptors (RLRs) are cytoplasmic RNA receptors that recognize non-self RNA and act as molecular sensors to detect viral pathogens. They play crucial roles in innate antiviral responses, including the production of proinflammatory cytokines and type I interferon. There are three RLRs in vertebrates, RIG-I, LGP2, and MDA5. They are characterized by a central DExD/H-box helicase domain and a C-terminal domain, both of which are responsible for binding viral RNA. The helicase domain catalyzes the unwinding of double stranded RNA in an ATP-dependent manner. RIG-I and MDA5 also contain two N-terminal caspase activation and recruitment domains (CARDs), which initiate downstream signaling upon viral RNA sensing. They may detect partially overlapping viral substrates, including dengue virus, West Nile virus (WNV), reoviruses, and several paramyxoviruses (such as measles virus and Sendai virus). LGP2 lacks CARD and may play a regulatory role in RLR signaling. It may cooperate with either RIG-I or MDA5 to sense viral RNA.	111
276943	cd15805	RIG-I_C	C-terminal domain of Retinoic acid-inducible gene (RIG)-I protein, a cytoplasmic viral RNA receptor. Retinoic acid-inducible gene (RIG)-I protein, also called DEAD box protein 58 (DDX58), is one of three members of the RIG-I-like Receptor (RLR) family. RLRs are cytoplasmic RNA receptors that recognize non-self RNA and act as molecular sensors to detect viral pathogens. RIG-I is activated by blunt-ended double-stranded RNA with or without a 5'-triphosphate (ppp), by single-stranded RNA marked by a 5'-ppp and by polyuridine sequences. It has been found to confer resistance to many negative-sense RNA viruses, including orthomyxoviruses, rhabdoviruses, bunyaviruses, and paramyxoviruses, as well as the positive-strand hepatitis C virus. RLRs are characterized by a central DExD/H-box helicase domain and a C-terminal domain, both of which are responsible for binding viral RNA. The helicase domain catalyzes the unwinding of double stranded RNA in an ATP-dependent manner. RIG-I and MDA5 also contain two N-terminal caspase activation and recruitment domains (CARDs), which initiate downstream signaling upon viral RNA sensing.	112
276944	cd15806	LGP2_C	C-terminal domain of Laboratory of Genetics and Physiology 2 (LGP2), a cytoplasmic viral RNA receptor. Laboratory of Genetics and Physiology 2 (LGP2) is one of three members of the RIG-I-like Receptor (RLR) family. RLRs are cytoplasmic RNA receptors that recognize non-self RNA and act as molecular sensors to detect viral pathogens. They are characterized by a central DExD/H-box helicase domain and a C-terminal domain, both of which are responsible for binding viral RNA. LGP2 lacks the caspase activation and recruitment domains (CARDs) that are present in other RLRs, which initiate downstream signaling upon viral RNA sensing. LGP2 may play a regulatory role in RLR signaling, and may cooperate with either RIG-I or MDA5 to sense viral RNA.	112
276945	cd15807	MDA5_C	C-terminal domain of Melanoma differentiation-associated protein 5, a cytoplasmic viral RNA receptor. Melanoma differentiation-associated protein 5 (MDA5) is also called Interferon-induced helicase C domain-containing protein 1 (IFIH1) or RIG-I-like receptor 2 (RLR-2). It is one of three members of the RLR family. RLRs are cytoplasmic RNA receptors that recognize non-self RNA and act as molecular sensors to detect viral pathogens. MDA5 has been shown to detect viruses from the Picornaviridae and Caliciviridae families. RLRs are characterized by a central DExD/H-box helicase domain and a C-terminal domain, both of which are responsible for binding viral RNA. The helicase domain catalyzes the unwinding of double stranded RNA in an ATP-dependent manner. RIG-I and MDA5 also contain two N-terminal caspase activation and recruitment domains (CARDs), which initiate downstream signaling upon viral RNA sensing.	117
293980	cd15808	SPRY_PRY_TRIM47	PRY/SPRY domain in tripartite motif-containing protein 47 (TRIM47), also known as RING finger protein 100 (RNF100) or Gene overexpressed in astrocytoma protein (GOA). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM47, also known as GOA (Gene overexpressed in astrocytoma protein) or RNF100 (RING finger protein 100). TRIM47 domains are composed of RING/B-box/coiled-coil core and also known as RBCC proteins. It is highly expressed in kidney tubular cells, but expressed at low levels in most tissues. It is overexpressed in astrocytoma tumor cells and plays an important role in the process of dedifferentiation that is associated with astrocytoma tumorigenesis; astrocytoma, also known as cerebral astrocytoma, is a malignant glioma that arises from astrocytes. Genome wide studies on white matter lesions have identified a novel locus on chromosome 17q25 harboring several genes such as TRIM47 and TRIM65, which points to possible novel mechanisms leading to these lesions.	206
293981	cd15809	SPRY_PRY_TRIM4	PRY/SPRY domain in tripartite motif-binding protein 4 (TRIM4), also known as RING finger protein 87 (RNF87). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM4 which is also known as RING finger protein 87 (RNF87). TRIM4 domain is composed of RING/B-box/coiled-coil core and also known as RBCC protein. It is a positive regulator of RIG-I-mediated interferon (IFN) induction. It regulates virus-induced IFN induction and cellular antiviral innate immunity by targeting RIG-I for K63-linked poly-ubiquitination. Over-expression of TRIM4 enhances virus-triggered activation of transcription factors IRF3 and NF-kappaB, as well as IFN-beta induction. Expression of TRIM4 differs significantly in Huntington's Disease (HD) neural cells when compared with wild-type controls, possibly impacting down-regulation of the Huntingtin (HTT) gene, which is involved in the regulation of diverse cellular activities that are impaired in Huntington's Disease (HD) cells.	191
293982	cd15810	SPRY_PRY_TRIM5_6_22_34	PRY/SPRY domain of tripartite motif-binding protein 5, 6, 22 and 34 (TRIM5, TRIM6, TRIM22 and TRIM34). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of very close paralogs, TRIM5, TRIM6, TRIM22 and TRIM34. These domains are composed of RING/B-box/coiled-coil core and are also known as RBCC proteins. They form a locus of four closely related TRIM genes within an olfactory receptor-rich region on chromosome 11 of the human genome. Genetic analysis of this locus indicates that these four genes have evolved by gene duplication from a common ancestral gene. All genes in the TRIM6/TRIM34/TRIM5/TRIM22 locus are type I interferon inducible, with TRIM5 and TRIM22 possessing antiviral properties. TRIM5 promotes innate immune signaling by activating the TAK1 kinase complex by cooperating with the heterodimeric E2, UBC13/UEV1A. It also stimulates NFkB and AP-1 signaling, and the transcription of inflammatory cytokines and chemokines, amplifying these activities upon retroviral infection. Interaction of its PRY-SPRY or cyclophilin domains with the retroviral capsid lattice stimulates the formation of a complementary lattice by TRIM5, with greatly increased TRIM5 E3 activity, and host cell signal transduction. TRIM6 is selectively expressed in embryonic stem (ES) cells and interacts with the proto-oncogene product Myc, maintaining the pluripotency of the ES cells. TRIM6, together with E2 Ubiquitin conjugase (UbE2K) and K48-linked poly-Ub chains, is critical for the IkappaB kinase epsilon (IKKepsilon) branch of type I interferon (IFN-I) signaling pathway and subsequent establishment of a protective antiviral response. TRIM22 plays an integral role in the host innate immune response to viruses; it has been shown to inhibit the replication of a number of viruses, including HIV-1, hepatitis B, and influenza A. Altered TRIM22 expression has also been associated with multiple sclerosis, cancer, and autoimmune disease. While the PRY-SPRY domain of TRIM5a provides specificity and the capsid recognition motif to retroviral restriction, TRIM34 binds HIV-1 capsid but does not restrict HIV-1 infection.	189
293983	cd15811	SPRY_PRY_TRIM11	PRY/SPRY domain of tripartite motif-binding protein 11 (TRIM11), also known as RING finger protein 92 (RNF92). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM11, also known as RING finger protein 92 (RNF92) or BIA1. TRIM11 domains are composed of RING/B-box/coiled-coil core and also known as RBCC proteins. It localizes to the nucleus and the cytoplasm; it is overexpressed in high-grade gliomas and promotes proliferation, invasion, migration and glial tumor growth. TRIM11 increases expression of dopamine beta-hydroxylase gene by interacting with the homeodomain transcription factor, PHOX2B, via the B30.2/SPRY domain, thus playing a potential role in the specification of noradrenergic (NA) neuron phenotype. It has also been shown that TRIM11 plays a critical role in the clearance of mutant PHOX2B, which causes congenital central hypoventilation syndrome, via the proteasome. TRIM11 binds a key component of the activator-mediated cofactor complex (ARC105), and destabilizes it, through the ubiquitin-proteasome system; ARC105 mediates chromatin-directed transcription activation and is a key regulatory factor for transforming growth factor beta (TGFbeta) signaling.	169
293984	cd15812	SPRY_PRY_TRIM17	PRY/SPRY domain of tripartite motif-binding protein 17 (TRIM17), also known as testis RING finger protein (terf). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM17, also known as RING finger protein 16 (RNF16) or testis RING finger protein (terf). TRIM17 domain is composed of RING/B-box/coiled-coil core and also known as RBCC protein, expressed almost exclusively in the testis. It exhibits E3 ligase activity, causing protein degradation of ZW10 interacting protein (ZWINT), a known component of the kinetochore complex required for the mitotic spindle checkpoint, and negatively regulates proliferation of breast cancer cells. TRIM17 undergoes ubiquitination in COS7 fibroblast-like cells but is inhibited and stabilized by TRIM44.	176
293985	cd15813	SPRY_PRY_TRIM20	PRY/SPRY domain in tripartite motif-binding protein 20 (TRIM20), also known as pyrin. This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM20, which is also known as pyrin or marenostrin. Unlike TRIM domains that are composed of RING/B-box/coiled-coil core, the N-terminal RING domain in TRIM20 is exchanged by a PYRIN domain (PYD), a prime mediator of protein interactions necessary for apoptosis, inflammation and innate immune signaling pathway, and it also harbors a C-terminal B30.2 domain. Mutations in pyrin (TRIM20) are associated with familial Mediterranean fever (FMF), a recessively hereditary periodic fever syndrome, characterized by episodes of inflammation and fever. These mutations cluster in the C-terminal B30.2 domain and therefore it is assumed that pyrin plays a role in the innate immune system by possibly effecting caspase-1-dependent IL-1beta maturation.	184
293986	cd15814	SPRY_PRY_TRIM27	PRY/SPRY domain in tripartite motif-containing protein 27 (TRIM27), also known as RING finger protein 76 (RNF76). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM27, also known as RING finger protein 76 (RNF76) or RET finger protein (RFP). TRIM27 domain is composed of RING/B-box/coiled-coil core and also known as RBCC proteins. It is highly expressed in the spleen, thymus and in cells of the hematopoietic compartment. TRIM27 exhibits either nuclear or cytosolic localization depending on the cell type. TRIM27 negatively regulates nucleotide-binding oligomerization domain containing 2 (NOD2)-mediated signaling by proteasomal degradation of NOD2, suggesting that TRIM27 could be a new target for therapeutic intervention in NOD2-associated diseases such as Crohn's. High expression of TRIM27 is observed in several human cancers, including breast and endometrial cancer, where elevated TRIM27 expression predicts poor prognosis. Also, TRIM27 forms an oncogenic fusion protein with Ret proto-oncogene. It is involved in different stages of spermatogenesis and its significant expression in male germ cells and seminomas, suggests that TRIM27 may be associated with the regulation of testicular germ cell proliferation and histological-type of germ cell tumors. TRIM27 could also be a predictive marker for chemoresistance in ovarian cancer patients. In the neurotoxin model of Parkinson's disease (PD), deficiency of TRIM27 decreases apoptosis and protects dopaminergic neurons, making TRIM27 an effective potential target during the treatment of PD.	177
293987	cd15815	SPRY_PRY_TRIM38	PRY/SPRY domain of tripartite motif-binding protein 38 (TRIM38), also known as Ring finger protein 15 (RNF15). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM38, which is also known as RING finger protein 15 (RNF15) or RORET. TRIM38 domains are composed of RING/B-box/coiled-coil core and also known as RBCC proteins. TRIM38 has been shown to act as a suppressor in TOLL-like receptor (TLR)-mediated interferon (IFN)-beta induction by promoting degradation of TRAF6 and NAP1 through the ubiquitin-proteasome system. Another study has shown that TRIM38 may act as a novel negative regulator for TLR3-mediated IFN-beta signaling by targeting TRIF for degradation. TRIM38 has been identified as a critical negative regulator in TNFalpha- and IL-1beta-triggered activation of NF-kappaB and MAP Kinases (MAPKs); it causes degradation of two essential cellular components, TGFbeta-associated kinase 1 (TAK1)-associating chaperones 2 and 3 (TAB2/3). The degradation is promoted through a lysosomal-dependent pathway, which requires the C-terminal PRY-SPRY of TRIM38. Enterovirus 71 infection induces degradation of TRIM38, suggesting that TRIM38 may play a role in viral infections.	182
293988	cd15816	SPRY_PRY_TRIM58	PRY/SPRY domain in tripartite motif-binding protein 58 (TRIM58), also known as BIA2. This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM58, also known as BIA2. TRIM58 domains are composed of RING/B-box/coiled-coil core and also known as RBCC proteins. It is implicated by genome-wide association studies (GWAS) to regulate erythrocyte traits, including cell size and number. Trim58 facilitates erythroblast enucleation by inducing proteolytic degradation of the microtubule motor dynein.	168
293989	cd15817	SPRY_PRY_TRIM60_75	PRY/SPRY domain of tripartite motif-binding protein 60 and 75 (TRIM60 and TRIM75). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM60 and TRIM75, both composed of RING/B-box/coiled-coil core and also known as RBCC proteins. TRIM60 domain is also known as RING finger protein 33 (RNF33) or 129 (RNF129). Based on its expression profile, RNF33 likely plays an important role in the spermatogenesis process, the development of the pre-implantation embryo, and in testicular functions; Rnf33 is temporally transcribed in the unfertilized egg and the pre-implantation embryo, and is permanently silenced before the blastocyst stage. Experiments in mice have shown that RNF33 associates with the cytoplasmic motor proteins, kinesin-2 family members 3A (KIF3A) and 3B (KIF3B), suggesting possible contribution to cargo movement along the microtubule in the expressed sites. TRIM75, also known as Gm794, has a single site of positive selection in its RING domain associated with E3 ubiquitin ligase activity. It has not been detectably expressed experimentally because of its constant turnover by the proteasome, and has therefore not been characterized.	168
293990	cd15818	SPRY_PRY_TRIM69	PRY/SPRY domain in tripartite motif-binding protein 69 (TRIM69), also known as RING finger protein 36 (RNF36). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM69, which is also known as RING finger protein 36 (RNF36) or testis-specific ring finger (Trif). TRIM69 domains are composed of RING/B-box/coiled-coil core and also known as RBCC proteins. It is a novel testis E3 ubiquitin ligase that may function to ubiquitinate its particular substrates during spermatogenesis. In humans, TRIM69 localizes in the cytoplasm and nucleus, and requires an intact RING finger domain to function. The mouse ortholog of this gene is specifically expressed in germ cells at the round spermatid stages during spermatogenesis and, when overexpressed, induces apoptosis. TRIM69 has been shown to be a novel regulator of mitotic spindle assembly in tumor cells; it associates with spindle poles and promotes centrosomal clustering, and is therefore essential for formation of a bipolar spindle.	187
293991	cd15819	SPRY_PRY_BTN1_2	butyrophilin subfamily member A1 and A2 (BTN1A and BTN2A). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of butyrophilin family 1A and 2A (BTN1A and BTN2A). BTNs belong to receptor glycoproteins of immunoglobulin (Ig) superfamily, characterized by the presence of extracellular Ig-like domains (IgV and/or IgC). BTN1A plays a role in the secretion, formation and stabilization of milk fat globules. The B30.2 domain of BTN1A1 binds the enzyme xanthine oxidoreductase (XOR) in order to participate in milk fat globule secretion; this interaction may lead to the production of reactive oxygen species, which have immunomodulatory and antimicrobial functions. Duplication events have led to three paralogs of BTN2A in primates: BTN2A1, BTN2A2, and BTN2A3. In humans, only BTN2A1 has been functionally characterized; it has been detected on epithelial cells and leukocytes, and identified as a novel ligand of dendritic cell-specific ICAM-3 grabbing nonintegrin (DCSIGN), a C-type lectin receptor that acts as an internalization receptor for HIV-1, HCV, and other pathogens. BTN2A2 mRNA has been shown to be expressed in circulating human immune cells.	172
293992	cd15820	SPRY_PRY_BTN3	PRY/SPRY domain of butyrophilin 3 (BTN3), includes BTN3A1, BTN3A2, BTN3A3 as well as BTN-like 3 (BTNL3); BTN3A also known as CD277. This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of butyrophilin family 3A (BTN3A); duplication events have led to three paralogs in primates: BTN3A1, BTN3A2, and BTN3A3. BTNs belong to receptor glycoproteins of immunoglobulin (Ig) superfamily, characterized by the presence of extracellular Ig-like domains (IgV and/or IgC). BTN3 transcripts are ubiquitously present in all immune cells (T cells, B cells, NK cells, monocytes, dendritic cells, and hematopoietic precursors) with different expression levels; BTN3A1 and BTN3A2 are expressed mainly by CD4+ and CD8+ T cells, BTN3A2 is the major form expressed in NK cells, and BTN3A3 is poorly expressed in these immune cells. The PRY/SPRY domain of the BTN3A1 isoform mediates phosphoantigen (pAg)-induced activation by binding directly to the pAg.	176
293993	cd15821	SPRY_PRY_RFPL	Ret finger protein-like (RFPL), includes RFP1, 2, 3, 4. This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of RFPL protein family, which includes RFPL1, RFPL2, RFPL3 and RFPL4. In humans, RFPL transcripts can be detected at the onset of neurogenesis in differentiating human embryonic stem cells, and in the developing human neocortex. The human RFPL1, 2, 3 genes have a role in neocortex development. RFPL1 is a primate-specific target gene of Pax6, a key transcription factor for pancreas, eye and neocortex development; human RFPL1 decreases cell number through its RFPL-defining motif (RDM) and SPRY domains. The RFPL4 (also known as RFPL4A) gene encodes a putative E3 ubiquitin-protein ligase expressed in adult germ cells and interacts with oocyte proteins of the ubiquitin-proteasome degradation pathway.	178
293994	cd15822	SPRY_PRY_TRIM5	PRY/SPRY domain in tripartite motif-binding protein 5 (TRIM5), also known as RING finger protein 88 (RNF88). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM5 which is also known as RING finger protein 88 (RNF88) or TRIM5alpha (TRIM5a), an antiretroviral restriction factor and a retrovirus capsid sensor in immune signaling. TRIM5 domain is composed of RING/B-box/coiled-coil core and also known as RBCC protein. It blocks retrovirus infection soon after the virion core enters the cell cytoplasm by recognizing the capsid protein lattice that encases the viral genomic RNA; the SPRY domain provides the capsid recognition motif that dictates specificity to retroviral restriction. TRIM5a, an E3 ubiquitin ligase, promotes innate immune signaling by activating the TAK1 kinase complex by cooperating with the heterodimeric E2, UBC13/UEV1A. It also stimulates NFkB and AP-1 signaling, and the transcription of inflammatory cytokines and chemokines, and amplifies these activities upon retroviral infection. Interaction of its PRY-SPRY or cyclophilin domains with the retroviral capsid lattice stimulates the formation of a complementary lattice by TRIM5, with greatly increased TRIM5 E3 activity, and host cell signal transduction.	200
293995	cd15823	SPRY_PRY_TRIM6	PRY/SPRY domain in tripartite motif-binding protein 6 (TRIM6), also known as RING finger protein 89 (RNF89). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM6, also known as RING finger protein 89 (RNF89). TRIM6 domain is composed of RING/B-box/coiled-coil core and also known as RBCC protein. It is selectively expressed in embryonic stem (ES) cells and interacts with the proto-oncogene product Myc, maintaining the pluripotency of the ES cells. TRIM6, together with E2 Ubiquitin conjugase (UbE2K) and K48-linked poly-Ub chains, is critical for the IkappaB kinase epsilon (IKKepsilon) branch of type I interferon (IFN-I) signaling pathway and subsequent establishment of a protective antiviral response.	188
293996	cd15824	SPRY_PRY_TRIM22	PRY/SPRY domain in tripartite motif-containing protein 22 (TRIM22), also known as RING finger protein 94 (RNF94) or Stimulated trans-acting factor of 50 kDa (STAF50). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM22, also known as RING finger protein 94 (RNF94) or STAF50 (Stimulated trans-acting factor of 50 kDa). TRIM22 domain is composed of RING/B-box/coiled-coil core and also known as RBCC protein. TRIM22 is an interferon-induced protein, predominantly expressed in peripheral blood leukocytes, in lymphoid tissue such as spleen and thymus, and in the ovary. TRIM22 plays an integral role in the host innate immune response to viruses; it has been shown to inhibit the replication of a number of viruses, including HIV-1, hepatitis B, and influenza A. TRIM22 inhibits influenza A virus (IAV) infection by targeting the viral nucleoprotein for degradation; it represents a novel restriction factor up-regulated upon IAV infection that curtails its replicative capacity in epithelial cells. Altered TRIM22 expression has also been associated with multiple sclerosis, cancer, and autoimmune disease. A large number of high-risk non-synonymous (ns)SNPs have been identified in the highly polymorphic TRIM22 gene, most of which are located in the SPRY domain and could possibly alter critical regions of the SPRY structural and functional residues, including several sites that undergo post-translational modification. TRIM22 is a direct p53 target gene and inhibits the clonogenic growth of leukemic cells. Its expression in Wilms tumors is negatively associated with disease relapse. It is greatly under-expressed in breast cancer cells as compared to non-malignant cell lines; p53 dysfunction may be one of the mechanisms for its down-regulation.	198
293997	cd15825	SPRY_PRY_TRIM34	PRY/SPRY domain in tripartite motif-containing protein 34 (TRIM34), also known as RING finger protein 21 (RNF21) or interferon-responsive finger protein (IFP1). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM34, also known as RING finger protein 21 (RNF21) or interferon-responsive finger protein (IFP1). TRIM34 domain is composed of RING/B-box/coiled-coil core and also known as RBCC protein. The TRIM34 cDNA possesses at least three kinds of isoforms, due to alternative splicing, of which only the long and medium forms contain the SPRY domain. It is an interferon-induced protein, predominantly expressed in the testis, kidney, and ovary. The SPRY domain provides the capsid recognition motif that dictates specificity to retroviral restriction. While the PRY-SPRY domain provides specificity and the capsid recognition motif to retroviral restriction, TRIM34 binds HIV-1 capsid but does not restrict HIV-1 infection.	185
293998	cd15826	SPRY_PRY_TRIM15	PRY/SPRY domain in tripartite motif-binding protein 15 (TRIM15). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of tripartite motif-containing protein 15 (TRIM15), also referred to as RING finger protein 93 (RNF93) or Zinc finger protein B7 or 178 (ZNFB7 or ZNF178). TRIM15 domains are composed of RING/B-box/coiled-coil core and also known as RBCC proteins. The PRY and SPRY/B30.2 domains can function as immune defense components and in pathogen sensing. TRIM15 has been shown to regulate inflammatory and innate immune signaling, in addition to displaying antiviral activities. Down-regulation of TRIM15, as well as TRIM11, enhances virus release, suggesting that these proteins contribute to the endogenous restriction of retroviruses in cells. TRIM15 is also a regulatory component of focal adhesion turnover and cell migration.	170
293999	cd15827	SPRY_PRY_TRIM10	PRY/SPRY domain of tripartite motif-binding protein 10 (TRIM10) also known as hematopoietic RING finger 1 (HERF1). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM10, also known as RING finger protein 9 (RNF9) or hematopoietic RING finger 1 (HERF1). TRIM10 domain is composed of RING/B-box/coiled-coil core and also known as RBCC protein. TRIM10/HERF1 is predominantly expressed during definitive erythropoiesis and in embryonic liver, and minimally expressed in adult liver, kidney, and colon. It is critical for erythroid cell differentiation and its down-regulation leads to cell death; inhibition of TRIM10 expression blocks terminal erythroid differentiation, while its over-expression in erythroid cells induces beta-major globin expression and erythroid differentiation.	172
294000	cd15828	SPRY_PRY_TRIM60	PRY/SPRY domain of tripartite motif-binding protein 60 (TRIM60) also known as RING finger protein 33 (RNF33). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM60, which is also known as RING finger protein 33 (RNF33) or 129 (RNF129). TRIM60 domains are composed of RING/B-box/coiled-coil core and also known as RBCC proteins. Based on its expression profile, RNF33 likely plays an important role in the spermatogenesis process, the development of the pre-implantation embryo, and in testicular functions; Rnf33 is temporally transcribed in the unfertilized egg and the pre-implantation embryo, and is permanently silenced before the blastocyst stage. Mice experiments have shown that RNF33 associates with the cytoplasmic motor proteins, kinesin-2 family members 3A (KIF3A) and 3B (KIF3B), suggesting possible contribution to cargo movement along the microtubule in the expressed sites.	180
294001	cd15829	SPRY_PRY_TRIM75	PRY/SPRY domain of tripartite motif-binding protein 75 (TRIM75). This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of TRIM75, also known as Gm794. TRIM75 domains are composed of RING/B-box/coiled-coil core and also known as RBCC proteins. TRIM75 has a single site of positive selection in its RING domain associated with E3 ubiquitin ligase activity. It has not been detectably expressed experimentally due to its constant turnover by the proteasome, and has therefore not been characterized.	187
276939	cd15830	BamD	BamD lipoprotein, a component of the beta-barrel assembly machinery. BamD, also called YfiO, is part of the beta-barrel assembly machinery (BAM), which is essential for the folding and insertion of outer membrane proteins (OMPs) in the OM of Gram-negative bacteria. Transmembrane OMPs carry out important functions including nutrient and waste management, cell adhesion, and structural roles. The BAM complex is composed of the beta-barrel OMP BamA (also called Omp85/YaeT) and four lipoproteins BamBCDE. BamD is the only BAM lipoprotein required for viability. Both BamA and BamD are broadly distributed in Gram-negative bacteria, and may constitute the core of the BAM complex. BamD contains five Tetratricopeptide repeats (TPRs). The three TPRs at the N-terminus may participate in interaction with substrates, while the two TPRs in the C-terminus may be involved in binding with other BAM components.	213
276938	cd15831	BTAD	Bacterial Transcriptional Activation (BTA) domain. The Bacterial Transcriptional Activation (BTA) domain is present in the putative transcriptional regulator Mycobacterium EmbR and the related Streptomyces antibiotic regulatory protein (SARP) family of transcription factors, which includes DnrI and AfsR, among others. Members of this family contain an N-terminal DNA-binding domain, followed by the BTA domain, and many have diverse domains at the C-terminus. EmbR contains a C-terminal forkhead-associated (FHA) domain, which mediates binding to threonine-phosphorylated sites in a sequence-specific manner. The BTA domain of EmbR contains three Tetratricopeptide repeats (TPRs) and two C-terminal helices. The TPR motif typically contains 34 amino acids, and 5 or 6 tandem repeats of the motif generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein.	146
276937	cd15832	SNAP	Soluble N-ethylmaleimide-sensitive factor (NSF) Attachment Protein family. Members of the soluble NSF attachment protein (SNAP) family are involved in intracellular membrane trafficking, including vesicular transport between the endoplasmic reticulum and Golgi apparatus. Higher eukaryotes contain three isoforms of SNAPs: alpha, beta, and gamma. Alpha-SNAP is universally present in eukaryotes and acts as an adaptor protein between SNARE (integral membrane SNAP receptor) and NSF for recruitment to the 20S complex. Beta-SNAP is brain-specific and shares high sequence identity (about 85%) with alpha-SNAP. Gamma-SNAP is weakly related (about 20-25% identity) to the two other isoforms, and is ubiquitous. It may help regulate the activity of the 20S complex. The X-ray structures of vertebrate gamma-SNAP and yeast Sec17, a SNAP family member, show similar all-helical structures consisting of an N-terminal extended twisted sheet of four Tetratricopeptide repeat (TPR)-like helical hairpins and a C-terminal helical bundle.	278
276930	cd15834	TNFRSF1A_teleost	Tumor necrosis factor receptor superfamily member 1A (TNFRSF1A) in teleosts; also known as TNFR1. This subfamily of TNFRSF1 (also known as type I TNFR, TNFR1, DR1, TNFRSF1A, CD120a, p55) is found in teleosts. It binds TNF-alpha, through the death domain (DD), and activates NF-kappaB, mediates apoptosis and activates signaling pathways controlling inflammatory, immune, and stress responses. It mediates signal transduction by interacting with antiapoptotic protein BCL2-associated athanogene 4 (BAG4/SODD) and adaptor proteins TRAF2 and TRADD that play regulatory roles. The human genetic disorder called tumor necrosis factor associated periodic syndrome (TRAPS), or periodic fever syndrome, is associated with germline mutations of the extracellular domains of this receptor, possibly due to impaired receptor clearance. Serum levels of TNFRSF1A are elevated in schizophrenia and bipolar disorder, and high levels are also associated with cognitive impairment and dementia. Knockout studies in zebrafish embryos have shown that a signaling balance between TNFRSF1A and TNFRSF1B is required for endothelial cell integrity. TNFRSF1A signals apoptosis through caspase-8, whereas TNFRSF1B signals survival via NF-kappaB in endothelial cells. Thus, this apoptotic pathway seems to be evolutionarily conserved, as TNFalpha promotes apoptosis of human endothelial cells and triggers caspase-2 and P53 activation in these cells via TNFRSF1A.	150
276931	cd15835	TNFRSF1B_teleost	Tumor necrosis factor receptor superfamily member 1B (TNFRSF1B) in teleost; also known as TNFR2. This subfamily of TNFRSF1B (also known as TNFR2, type 2 TNFR, TNFBR, TNFR80, TNF-R75, TNF-R-II, p75, CD120b) is found in teleosts. It binds TNF-alpha, but lacks the death domain (DD) that is associated with the cytoplasmic domain of TNFRSF1A (TNFR1). It is inducible and expressed exclusively by oligodendrocytes, astrocytes, T cells, thymocytes, myocytes, endothelial cells, and in human mesenchymal stem cells. TNFRSF1B protects oligodendrocyte progenitor cells (OLGs) against oxidative stress, and induces the up-regulation of cell survival genes. While pro-inflammatory and pathogen-clearing activities of TNF are mediated mainly through activation of TNFRSF1A, a strong activator of NF-kappaB, TNFRSF1B is more responsible for suppression of inflammation. Although the affinities of both receptors for soluble TNF are similar, TNFRSF1B is sometimes more abundantly expressed and thought to associate with TNF, thereby increasing its concentration near TNFRSF1A receptors, and making TNF available to activate TNFRSF1A (a ligand-passing mechanism). Knockout studies in zebrafish embryos have shown that a signaling balance between TNFRSF1A and TNFRSF1B is required for endothelial cell integrity. TNFRSF1A signals apoptosis through caspase-8, whereas TNFRSF1B signals survival via NF-kB in endothelial cells. In goldfish (Carassius auratus L.), TNFRSF1B expression is substantially higher than that of TNFRSF1 in tissues and various immune cell types. Both receptors are most robustly expressed in monocytes; mRNA levels of TNFRSF1B are lowest in peripheral blood leukocytes.	130
276932	cd15836	TNFRSF11A_teleost	Tumor necrosis factor receptor superfamily member 11A (TNFRSF11A) in teleost; also known as RANK. TNFRSF11A (also known as RANK, FEO, OFE, ODFR, OSTS, PDB2, CD26, OPTB7, TRANCER, LOH18CR1) induces the activation of NF-kappa B and MAPK8/JNK through interactions with various TRAF adaptor proteins. This receptor and its ligand are important regulators of the interaction between T cells and dendritic cells. This receptor is also an essential mediator for osteoclast and lymph node development. Mutations at this locus have been associated with familial expansile osteolysis, autosomal recessive osteopetrosis, and Juvenile Paget's disease (JPD) of bone. Alternatively spliced transcript variants have been described for this locus. Mutation analysis may improve diagnosis, prognostication, recurrence risk assessment, and perhaps treatment selection among the monogenic disorders of RANKL/OPG/RANK activation.	122
276933	cd15837	TNFRSF26	Tumor necrosis factor receptor superfamily member 26 (TNFRSF26), also known as tumor necrosis factor receptor homolog 3 (TNFRH3). TNFRSF26 (also known as tumor necrosis factor receptor homolog 3 (TNFRH3) or TNFRSF24) is predominantly expressed in embryos and lymphoid cell types, along with its closely related TNFRSF22 and TNFRSF23 orthologs, and is developmentally regulated. Unlike TNFRSF22/23, TNFRSF26 does not serve as a TRAIL decoy receptor; it remains an orphan receptor.	118
276934	cd15838	TNFRSF27	Tumor necrosis factor receptor superfamily member 27 (TNFRSF27), also known as ectodysplasin A2 receptor (EDA2R) or X-linked ectodermal dysplasia receptor (XEDAR). TNFRSF27 (also known as ectodysplasin A2 receptor (EDA2R), X-linked ectodermal dysplasia receptor (XEDAR), EDAA2R, EDA-A2R) has two isoforms, EDA-A1 and EDA-A2, that are encoded by the anhidrotic ectodermal dysplasia (EDA) gene. It is highly expressed during embryonic development and binds to ectodysplasin-A2 (EDA-A2), playing a crucial role in the p53-signaling pathway. EDA2R is a direct p53 target that is frequently down-regulated in colorectal cancer tissues due to its epigenetic alterations or through the p53 gene mutations. Mutations in the EDA-A2/XEDAR signaling give rise to ectodermal dysplasia, characterized by loss of hair, sweat glands, and teeth. A non-synonymous SNP on EDA2R, along with genetic variants in human androgen receptor is associated with androgenetic alopecia (AGA).	116
276935	cd15839	TNFRSF_viral	Tumor necrosis factor receptor superfamily members, virus-encoded. This family contains viral TNFR homologs that include vaccinia virus (VACV) cytokine response modifier E (CrmE), an encoded TNFR that shares significant sequence similarity with mammalian type 2 TNF receptors (TNFRSF1B, p75, TNFR type 2), a cowpox virus encoded cytokine-response modifier B (crmB), which is a secreted form of TNF receptor that can contribute to the modification of TNF-mediated antiviral processes, and a myxoma virus (MYXV) T2 (M-T2) protein that binds and inhibits rabbit TNF-alpha. The CrmE structure confirms that the canonical TNFR fold is adopted, but only one of the two "ligand-binding" loops of TNFRSF1A is conserved, suggesting a mechanism for the higher affinity of poxvirus TNFRs for TNFalpha over lymphotoxin-alpha. CrmB protein specifically binds TNF-alpha and TNF-beta indicating that cowpox virus seeks to evade antiviral processes mediated by TNF. Intracellular M-T2 blocks virus-induced lymphocyte apoptosis via a highly conserved viral preligand assembly domain (vPLAD), which controls receptor signaling competency prior to ligand binding.	125
277193	cd15840	SNARE_Qa	SNARE motif, subgroup Qa. SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins consist of coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. Examples for members of the Qa SNAREs are syntaxin 18, syntaxin 5, syntaxin 16, and syntaxin 1.	59
277194	cd15841	SNARE_Qc	SNARE motif, subgroup Qc. SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins consist of coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qc-, as well as Qa- and Qb-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. Examples for members of the Qc SNAREs are C-terminal domains of SNAP23 and SNAP25, syntaxin 8, syntaxin 6, and Bet1.	59
277195	cd15842	SNARE_Qb	SNARE motif, subgroup Qb. SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins consist of coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. Examples for members of the Qb SNAREs are N-terminal domains of SNAP23 and SNAP25, Vti1, Sec20 and GS27.	62
277196	cd15843	R-SNARE	SNARE motif, subgroup R-SNARE. SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins consist of coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). In contrast to Qa-, Qb- and Qc-SNAREs that are localized to target organelle membranes, R-SNAREs are localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. Examples for members of the Qa SNAREs are syntaxin 18, syntaxin 5, syntaxin 16, and syntaxin 1.	60
277197	cd15844	SNARE_syntaxin5	SNARE motif of syntaxin 5. Syntaxin 5 (Syn5) regulates the transport from the ER to the Golgi, as well as the early/recycling endosomes to the trans-Golgi network and participates in the assembly of transitional ER and the Golgi, lipid droplet fusion, and cytokinesis. Syn5 exists in 2 isoforms, long (42 kDa) and short (35 kDa). The short form is localized in the Golgi complex, whereas the long form is additionally found in the endoplasmic reticulum (ER). The syntaxin-5 SNARE complexes, which also contain Bet1 (Qc) and either GS27 (Qb) and Sec22B (R-SNARE) or GS28 (Qb) and Ykt6 (R-SNARE), regulate the early secretory pathway of eukaryotic cells at the level of endoplasmic reticulum (ER) to Golgi transport. The syntaxin-5 SNARE complex, which also contains GS15 (Qc), GS28 (Qb) and Ykt6 (R-SNAREs) is involved in the transport from the trans-Golgi network to the cis-Golgi. Syn5 is a member of the Qa subgroup of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins, which consist of coiled-coil helices (called SNARE motifs) that mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle.	86
277198	cd15845	SNARE_syntaxin16	SNARE motif of syntaxin 16. Syntaxin 16 is located in trans-Golgi network (TGN) and regulated by the SM protein Vps45p. It forms a complex with syntaxin 6 (Qc), Vti1a (Qb) and VAMP4 (R-SNARE) and is involved in the regulation of recycling of early endosomes to the trans-Golgi network (TGN). Syntaxin 16 is a member of the Qa subgroup of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins, which consist of coiled-coil helices (called SNARE motifs) that mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complexes mediate membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle.	59
277199	cd15846	SNARE_syntaxin17	SNARE motif of syntaxin 17. Syntaxin 17 (STX17) belongs to the Qa subgroup of SNAREs and interacts with SNAP29 (Qb/Qc) and the lysosomal R-SNARE VAMP8. The complex plays a role in autophagosome-lysosome fusion. Autophagosome transports cytoplasmic materials, including cytoplasmic proteins, glycogen, lipids, organelles, and invading bacteria to the lysosome for degradation. SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins consist of coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complexes mediate membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle.	62
277200	cd15847	SNARE_syntaxin7_like	SNARE motif of syntaxin 7, 12 and related sequences. SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins consist of coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. This subgroup of the Qa SNAREs includes syntaxin 7, syntaxin 12, TSNARE1 and related proteins.	60
277201	cd15848	SNARE_syntaxin1-like	SNARE motif of syntaxin 1 and related proteins. SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins consist of coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. This subgroup of the Qa SNAREs includes syntaxin 1, syntaxin 11, syntaxin 19, syntaxin 2, syntaxin 3, syntaxin 4 and related proteins.	63
277202	cd15849	SNARE_Sso1	SNARE motif of Sso1. Saccharomyces cerevisiae SNARE protein Sso1p forms a complex with synaptobrevin homolog Snc1p (R-SNARE) and the SNAP-25 homolog Sec9p (Qb/c) which is involved in exocytosis. Sso1 is a member of the Qa subgroup of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins, which consist of coiled-coil helices (called SNARE motifs) that mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complexes mediate membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle.	64
277203	cd15850	SNARE_syntaxin18	SNARE motif, subgroup Qa. Syntaxin18 (also known as Ufe1p) is involved in retrograde transport of CopI coatomer coated vesicles from the Golgi to the ER. It forms a complex with USE1 (SLT1, Qc), Bnip1 (Sec20p, Qb) and Sec22b (R-SNARE). Syntaxin18 is a member of the Qa subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) protein family. SNARE proteins consist of coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qc-, as well as Qa- and Qb-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles.	59
277204	cd15851	SNARE_Syntaxin6	SNARE motif of syntaxin 6. Syntaxin 6 forms a complex with syntaxin 16 (Qa), Vti1a (Qb) and VAMP4 (R-SNARE) and is involved in the regulation of recycling of early endosomes to the trans-Golgi network (TGN). Syntaxin 6 and its yeast homolog TLG1 are members of the Qc subgroup of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins, which consist of coiled-coil helices (called SNARE motifs) that mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complexes mediate membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle.	66
277205	cd15852	SNARE_Syntaxin8	SNARE motif of syntaxin 8. Syntaxin 8 forms a complex with syntaxin 7 (Qa), Vti1b (Qb) and either VAMP7 or VAMP8 (R-SNARE) and is involved in the transport from early endosomes to the lysosome. Syntaxin 8 is a member of the Qc subgroup of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins, which consist of coiled-coil helices (called SNARE motifs) that mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complexes mediate membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle.	59
277206	cd15853	SNARE_Bet1	SNARE motif of Bet1. Bet1 forms complexes with GS27 (Qb), syntaxin-5 (Qa) and Sec22B (R-SNARE) or GS28 (Qb), syntaxin-5 (Qa) and Ykt6 (R-SNARE). These complexes regulate the early secretory pathway of eukaryotic cells at the level of the transport from endoplasmic reticulum (ER) to the ER-Golgi intermediate compartment (ERGIC) and from ERGIC to the cis-Golgi, respectively. Bet1 is a member of the Qc subgroup of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins, which consist of coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complexes mediate membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle.	59
277207	cd15854	SNARE_SNAP47C	C-terminal SNARE motif of SNAP47. C-terminal SNARE motif of SNAP47, a member of the Qb/Qc subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins. The exact function of SNAP47 is unknown. Qb/Qc SNAREs consist of 2 coiled-coil helices (called SNARE motifs, one belonging to the Qb subgroup and one belonging to the Qc subgroup), which mediate the interactions with other SNARE proteins, and a transmembrane domain. In general, the SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. Other members of the Qb/Qc SNAREs are SNAP23, SNAP25, SNAP29 and SEC9.	59
277208	cd15855	SNARE_SNAP25C_23C	C-terminal SNARE motif of SNAP25 and SNAP23. C-terminal SNARE motifs of SNAP25 and SNAP23, members of the Qb/Qc subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins. SNAP23 interacts with STX4 (Qa) and the lysosomal R-SNARE VAMP8. The complex plays a role in transport of secretory granule from trans-Golgi network to the plasma membrane. SNAP25 interacts with Syntaxin-1 (Qa) and the R-SNARE VAMP2 (also called synaptobrevin-2). The complex plays a role in transport of secretory granule from trans-Golgi network to the plasma membrane. Qb/Qc SNAREs consist of 2 coiled-coil helices (called SNARE motifs, one belonging to the Qb subgroup and one belonging to the Qc subgroup), which mediate the interactions with other SNARE proteins, and a transmembrane domain. In general, the SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. Other members of the Qb/Qc SNAREs are SNAP29, SNAP47 and SEC9.	59
277209	cd15856	SNARE_SNAP29C	C-terminal SNARE motif of SNAP29. C-terminal SNARE motif of SNAP29, a member of the Qb/Qc subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins. SNAP29 interacts with STX17 (Qa) and the lysosomal R-SNARE VAMP8. The complex plays a role in autophagosome-lysosome fusion. Autophagosome transports cytoplasmic materials including cytoplasmic proteins, glycogen, lipids, organelles, and invading bacteria to the lysosome for degradation. Qb/Qc SNAREs consist of 2 coiled-coil helices (called SNARE motifs, one belonging to the Qb subgroup and one belonging to the Qc subgroup), which mediate the interactions with other SNARE proteins, and a transmembrane domain. In general, the SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. Other members of the Qb/Qc SNAREs are SNAP23, SNAP25, SNAP47 and SEC9.	59
277210	cd15857	SNARE_SEC9C	C-terminal SNARE motif of SEC9. C-terminal SNARE motif of fungal SEC9, a member of the Qb/Qc subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins. SEC9 interacts with Sso1 (Qa) and the lysosomal R-SNARE Snc1. The complex plays a role in post-Golgi transport. Qb/Qc SNAREs consist of 2 coiled-coil helices (called SNARE motifs, one belonging to the Qb subgroup and one belonging to the Qc subgroup), which mediate the interactions with other SNARE proteins, and a transmembrane domain. In general, the SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. Other members of the Qb/Qc SNAREs are SNAP23, SNAP25, SNAP47 and SNAP29.	59
277211	cd15858	SNARE_VAM7	SNARE motif of VAM7. Fungal VAM7 (vacuolar morphogenesis protein 7) is a member of the Qc subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) protein family involved in vacuolar protein transport and membrane fusion. SNARE proteins consist of coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qc-, as well as Qa- and Qb-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles.	59
277212	cd15859	SNARE_SYN8	SNARE motif of SYN8. Fungal SYN8 is a member of the Qc subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) protein family present in the endosomes. SNARE proteins consist of coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qc-, as well as Qa- and Qb-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles.	68
277213	cd15860	SNARE_USE1	SNARE motif of USE1. USE1 (unconventional SNARE in the ER 1 homolog, also known as SNARE-like tail-anchored protein 1 or SLT1) is involved in retrograde transport of CopI coatomer coated vesicles from the Golgi to the ER. It forms a complex with syntaxin18 (Ufe1p, Qa), Bnip1 (Sec20p, Qb) and Sec22b (R-SNARE). USE1 is a member of the Qc subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) protein family. SNARE proteins consist of coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qc-, as well as Qa- and Qb-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles.	60
277214	cd15861	SNARE_SNAP25N_23N_29N_SEC9N	N-terminal SNARE motif of SNAP25, SNAP23, SNAP29, and SEC9. N-terminal SNARE motif of members of the Qb/Qc subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins. Qb/Qc SNAREs consist of 2 coiled-coil helices (called SNARE motifs, one belonging to the Qb subgroup and one belonging to the Qc subgroup), which mediate the interactions with other SNARE proteins, and a transmembrane domain. In general, the SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. Examples for members of the Qb/Qc SNAREs are SNAP23, SNAP25, SNAP29, SNAP47 and SEC9.	65
277215	cd15862	SNARE_Vti1	SNARE motif of Vti1. Vti1 (vesicle transport through interaction with t-SNAREs homolog 1) belongs to the Qb subgroup of SNAREs (soluble N-ethylmaleimide-sensitive factor attachment protein receptor). Vti1b interacts with syntaxin 7 (Qa), syntaxin 8 (Qc), and the lysosomal R-SNARE VAMP8 or VAMP7 to form the endosomal SNARE core complex that mediates transport from the early endosomes to the MVBs (multivesicular bodies), and from the MVBs to the lysosomes, respectively. Vti1a interacts with syntaxin 16 (Qa), syntaxin 6 (Qc), and the lysosomal R-SNARE VAMP4 to form an endosomal SNARE core complex that mediates transport from the early endosomes to the TGN (trans-Golgi network). SNARE proteins consist of coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. Examples for members of the Qb SNAREs are N-terminal domains of SNAP23 and SNAP25, Vti1, Sec20 and GS27.	62
277216	cd15863	SNARE_GS27	SNARE motif of GS27. GS27 (also known as Bos1, EPM6, golgi SNAP receptor complex member 2 or GOSR2) is a member of the Qb subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins. GS27 forms a complex together with Bet1 (Qc), syntaxin-5 (Qa) and Sec22B (R-SNARE). This complex regulates the early secretory pathway of eukaryotic cells at the level of the transport from endoplasmic reticulum (ER) to the ER-Golgi intermediate compartment (ERGIC). SNARE proteins consist of coiled-coil helices (called SNARE motifs) that mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle.	66
277217	cd15864	SNARE_GS28	SNARE motif of GS28. GS28 (also known as golgi SNAP receptor complex member 1 or GOSR1) forms complexes with syntaxin-5 (Qa), Ykt6 (R-SNARE) and either Bet1 (Qc) or GS15 (Qc). These complexes regulate the early secretory pathway of eukaryotic cells at the level of the transport from the ER-Golgi intermediate compartment (ERGIC) to the cis-Golgi and transport from the trans-Golgi network to the cis-Golgi, respectively. GS28 is a member of the Qb subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins which contain coiled-coil helices (called SNARE motifs) that mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles.	66
277218	cd15865	SNARE_SEC20	SNARE motif of SEC20. SEC20 (also known as BNIP1, NIP1, or TRG-8) forms a complex with syntaxin 18 (Qa), SEC22 (R-SNARE) and USE1 (Qc), and is involved in the transport from cis-Golgi to the endoplasmic reticulum (ER). SEC20 is a member of the Qb subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins which contain coiled-coil helices (called SNARE motifs) that mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles.	93
277219	cd15866	R-SNARE_SEC22	SNARE motif of SEC22. SEC22 forms complexes with syntaxin 18 (Qa), Sec20 (Qb) and USE1 (Qc), and with syntaxin 5 (Qa), GS27 (Qb) and Bet1 (Qc). These complexes are involved in the transport from cis-Golgi to the endoplasmic reticulum (ER) and in the transport from endoplasmic reticulum (ER) to the ER-Golgi intermediate compartment (ERGIC), respectively. SEC22 is a member of the R-SNARE subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins which contain coiled-coil helices (called SNARE motifs) that mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles.	64
277220	cd15867	R-SNARE_YKT6	SNARE motif of YKT6. Ykt6 forms complexes with syntaxin-5 (Qa), GS28 (Qb) and either Bet1 (Qc) or GS15 (Qc). These complexes regulate the early secretory pathway of eukaryotic cells at the level of the transport from the ER-Golgi intermediate compartment (ERGIC) to the cis-Golgi and transport from the trans-Golgi network to the cis-Golgi, respectively. Ykt6 is a member of the R-SNARE subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins which contain coiled-coil helices (called SNARE motifs) that mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles.	61
277221	cd15868	R-SNARE_VAMP8	SNARE motif of VAMP8. The lysosomal VAMP8 (vesicle-associated membrane protein 8, also called endobrevin) protein belongs to the R-SNARE subgroup of SNAREs and interacts with STX17 (Qa) and SNAP29 (Qb/Qc). The complex plays a role in autophagosome-lysosome fusion via regulating the transport from early endosomes to multivesicular bodies. Autophagosome transports cytoplasmic materials including cytoplasmic proteins, glycogen, lipids, organelles, and invading bacteria to the lysosome for degradation. SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins contain coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles.	68
277222	cd15869	R-SNARE_VAMP4	SNARE motif of VAMP4. The VAMP-4 (vesicle-associated membrane protein 4) protein belongs to the R-SNARE subgroup of SNAREs and interacts with syntaxin 16 (Qa), Vti1a (Qb) and syntaxin 6 (Qc). This complex plays a role in maintenance of Golgi ribbon structure and normal retrograde trafficking from the early endosome to the trans-Golgi network (TGN). SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins contain coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles.	67
277223	cd15870	R-SNARE_VAMP2	SNARE motif of VAMP2. The VAMP-2 (vesicle-associated membrane protein 2, also called synaptobrevin-2) protein belongs to the R-SNARE subgroup of SNAREs and interacts with Syntaxin-1 (Qa) and SNAP-25 (Qb/Qc), as well as syntaxin 12 (Qa) and SNAP23 (Qb/Qc). The complexes play a role in transport of secretory granule from trans-Golgi network to the plasma membrane, and in the transport from early endosomes to and from the plasma membrane, respectively. SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins contain coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles.	63
277224	cd15871	R-SNARE_VAMP7	SNARE motif of VAMP7. The VAMP-7 (vesicle-associated membrane protein 7, also called synaptobrevin-like protein 1) protein belongs to the R-SNARE subgroup of SNAREs and interacts with syntaxin 7 (Qa), syntaxin 8 (Qc) and Vti1b (Qb). The complex is involved in the transport from early endosomes to the lysosome via regulating the transport from multivesicular bodies to the lysosomes. SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins contain coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles.	65
277225	cd15872	R-SNARE_VAMP5	SNARE motif of VAMP5. The VAMP-5 (vesicle-associated membrane protein 5) protein belongs to the R-SNARE subgroup of SNAREs. Its function is unknown. SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins contain coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles.	68
277226	cd15873	R-SNARE_STXBP5_6	SNARE domain of STXBP5, STXBP6 and related proteins. Syntaxin binding protein 5 (STXBP5, also called Tomosyn), as well as its relative Syntaxin binding protein 6 (STXBP6, also called Amisyn) contains a C-terminal R-SNARE-like domain, which allows it to assemble into SNARE complexes, which in turn makes the complexes inactive and inhibits exocytosis. In general, SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins consist of coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qc-, as well as Qa- and Qb-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle.	61
277227	cd15874	R-SNARE_Snc1	SNARE motif of Snc1. Saccharomyces cerevisiae SNARE protein Snc1p forms a complex with syntaxin homolog Sso1p (Qa) and the SNAP-25 homolog Sec9p (Qb/Qc) which is involved in exocytosis. Snc1 is a member of the R-SNARE subgroup of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins, which consist of coiled-coil helices (called SNARE motifs) that mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complexes mediate membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle.	60
277228	cd15875	SNARE_syntaxin7	SNARE motif of syntaxin 7. Syntaxin 7 forms a complex with syntaxin 8 (Qc), Vti1b (Qb) and either VAMP7 or VAMP8 (R-SNARE) and is involved in the transport from early endosomes to the lysosome. Syntaxin 7 is a member of the Qa subgroup of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins, which consist of coiled-coil helices (called SNARE motifs) that mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complexes mediate membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle.	60
277229	cd15876	SNARE_syntaxin12	SNARE motif of syntaxin 12. Syntaxin 12 (STX12, also known as STX13 and STX14) forms a complex with SNAP25 (Qb/Qc) or SNAP29 (Qb/Qc) and VAMP2 or VAMP3 (R-SNARE) and plays a role in plasma membrane to early endosome transport. Syntaxin 12 is a member of the Qa subgroup of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins, which consist of coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complexes mediate membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle.	67
277230	cd15877	SNARE_TSNARE1	SNARE motif of TSNARE1. TSNARE1 is a member of the Qa subgroup of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins, which consist of coiled-coil helices (called SNARE motifs) that mediate the interactions between SNARE proteins, and a transmembrane domain. Its function is unknown, but polymorphisms in human TSNARE1 have been associated with schizophrenia susceptibility. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. TSNARE1 is part of a subgroup of the Qa SNAREs that also includes syntaxin 7, syntaxin 12 and related proteins.	64
277231	cd15878	SNARE_syntaxin11	SNARE motif of syntaxin 11. Syntaxin 11 (also known as STX11, FHL4, HLH4, HPLH4) is present on endosomal membranes, including late endosomes and lysosomes in macrophages, and has been shown to bind Vti1b and regulate the availability of Vti1b to form other SNARE-complexes. Mutations in human STX11 have been linked to familial hemophagocytic lymphohistiocytosis type-4 (FHL-4), an autosomal recessive disorder of immune dysregulation. Syntaxin 11 is a member of the Qa subgroup of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins, which consist of coiled-coil helices (called SNARE motifs) that mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complexes mediate membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle.	63
277232	cd15879	SNARE_syntaxin19	SNARE motif of syntaxin 19. Syntaxin 19 has been shown to have the potential to form SNARE complexes with SNAP-23, 25 and 29 (Qb/Qc) and VAMP3 and VAMP8 (R-SNARE), indicating a role in post-Golgi trafficking or plasma membrane fusion. Syntaxin 19 is a member of the Qa subgroup of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins, which consist of coiled-coil helices (called SNARE motifs) that mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complexes mediate membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle.	63
277233	cd15880	SNARE_syntaxin1	SNARE motif of syntaxin 1. Syntaxin-1 belongs to the Qa subgroup of SNAREs and interacts with SNAP-25 (Qb/Qc) and the R-SNARE VAMP2 (also called synaptobrevin-2). The complex plays a role in exocytosis of synaptic vesicles. SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins consist of coiled-coil helices (called SNARE motifs) that mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complexes mediate membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle.	69
277234	cd15881	SNARE_syntaxin3	SNARE motif of syntaxin 3. Syntaxin 3 (STX3) has been shown to form a complex with VAMP8 (R-SNARE) and  SNAP-23  (Qb/c) in mast cells. Mutations have been implicated in human microvillus inclusion disease (MVID), a disorder of the differentiation of intestinal epithelium. Syntaxin 3 is a member of the Qa subgroup of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins, which consist of coiled-coil helices (called SNARE motifs) that mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complexes mediate membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle.	69
277235	cd15882	SNARE_syntaxin2	SNARE motif of syntaxin 2. Syntaxin 2 (STX2), also known as epimorphin (EPM or EPIM), may interact with SNAP-23 (Qb/c), and genetic variations are associated with type 1 von Willebrand disease (VWD). Syntaxin 2 is a member of the Qa subgroup of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins, which consist of coiled-coil helices (called SNARE motifs) that mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complexes mediate membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle.	69
277236	cd15883	SNARE_syntaxin4	SNARE motif of syntaxin 4. Syntaxin-4 forms a complex with SNAP-23 (Qb/Qc) and the R-SNAREs VAMP8, VAMP2 and VAMP7, which plays a role in exocytosis of secretory granule. Syntaxin 4 is a member of the Qa subgroup of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins, which consist of coiled-coil helices (called SNARE motifs) that mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complexes mediate membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle.	63
277237	cd15884	SNARE_SNAP23C	C-terminal SNARE motif of SNAP23. C-terminal SNARE motifs of SNAP23, a member of the Qb/Qc subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins. SNAP23 interacts with Syntaxin-4 (Qa) and the R-SNARE VAMP8. The complex plays a role in exocytosis of secretory granule. Qb/Qc SNAREs consist of 2 coiled-coil helices (called SNARE motifs, one belonging to the Qb subgroup and one belonging to the Qc subgroup), which mediate the interactions with other SNARE proteins, and a transmembrane domain. In general, the SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. Other members of the Qb/Qc SNAREs are SNAP25, SNAP29, SNAP47 and SEC9.	59
277238	cd15885	SNARE_SNAP25C	C-terminal SNARE motif of SNAP25. C-terminal SNARE motifs of SNAP25, a member of the Qb/Qc subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins. SNAP25 interacts with Syntaxin-1 (Qa) and the R-SNARE VAMP2 (also called synaptobrevin-2). The complex plays a role in transport of secretory granule from trans-Golgi network to the plasma membrane. Qb/Qc SNAREs consist of 2 coiled-coil helices (called SNARE motifs, one belonging to the Qb subgroup and one belonging to the Qc subgroup), which mediate the interactions with other SNARE proteins, and a transmembrane domain. In general, the SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. Other members of the Qb/Qc SNAREs are SNAP23, SNAP29, SNAP47 and SEC9.	59
277239	cd15886	SNARE_SEC9N	N-terminal SNARE motif of SEC9. N-terminal SNARE motif of fungal SEC9, a member of the Qb/Qc subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins. SEC9 interacts with Sso1(Qa) and the lysosomal R-SNARE Snc1. The complex plays a role in post-Golgi transport. Qb/Qc SNAREs consist of 2 coiled-coil helices (called SNARE motifs, one belonging to the Qb subgroup and one belonging to the Qc subgroup), which mediate the interactions with other SNARE proteins, and a transmembrane domain. In general, the SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. Other members of the Qb/Qc SNAREs are SNAP23, SNAP25, SNAP47 and SNAP29.	70
277240	cd15887	SNARE_SNAP29N	N-terminal SNARE motif of SNAP29. N-terminal SNARE motif of SNAP29, a member of the Qb/Qc subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins. SNAP29 interacts with STX17 (Qa) and the lysosomal R-SNARE VAMP8. The complex plays a role in autophagosome-lysosome fusion. Autophagosome transports cytoplasmic materials including cytoplasmic proteins, glycogen, lipids, organelles, and invading bacteria to the lysosome for degradation. Qb/Qc SNAREs consist of 2 coiled-coil helices (called SNARE motifs, one belonging to the Qb subgroup and one belonging to the Qc subgroup), which mediate the interactions with other SNARE proteins, and a transmembrane domain. In general, the SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. Other members of the Qb/Qc SNAREs are SNAP23, SNAP25, SNAP47 and SEC9.	65
277241	cd15888	SNARE_SNAP47N	N-terminal SNARE motif of SNAP47. N-terminal SNARE motif of SNAP47, a member of the Qb/Qc subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins. The exact function of SNAP47 is unknown. Qb/Qc SNAREs consist of 2 coiled-coil helices (called SNARE motifs, one belonging to the Qb subgroup and one belonging to the Qc subgroup), which mediate the interactions with other SNARE proteins, and a transmembrane domain. In general, the SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. Other members of the Qb/Qc SNAREs are SNAP23, SNAP25, SNAP29 and SEC9.	65
277242	cd15889	SNARE_SNAP25N_23N	N-terminal SNARE motif of SNAP25 and SNAP23. N-terminal SNARE motifs of SNAP25 and SNAP23, members of the Qb/Qc subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins. SNAP23 interacts with STX4 (Qa) and the lysosomal R-SNARE VAMP8. The complex plays a role in transport of secretory granule from trans-Golgi network to the plasma membrane. SNAP25 interacts with Syntaxin-1 (Qa) and the R-SNARE VAMP2 (also called synaptobrevin-2). The complex plays a role in transport of secretory granule from trans-Golgi network to the plasma membrane. Qb/Qc SNAREs consist of 2 coiled-coil helices (called SNARE motifs, one belonging to the Qb subgroup and one belonging to the Qc subgroup), which mediate the interactions with other SNARE proteins, and a transmembrane domain. In general, the SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. Other members of the Qb/Qc SNAREs are SNAP29, SNAP47 and SEC9.	65
277243	cd15890	SNARE_Vti1b	SNARE motif of Vti1b-like. Vti1b (vesicle transport through interaction with t-SNAREs homolog 1B) belongs to the Qb subgroup of SNAREs (soluble N-ethylmaleimide-sensitive factor attachment protein receptor). Vti1b interacts with syntaxin 7 (Qa), syntaxin 8 (Qc), and the lysosomal R-SNARE VAMP8 or VAMP7 to form the endosomal SNARE core complexes that mediate transport from the early endosomes and the MVBs (multivesicular bodies), and from the MVBs to the lysosomes, respectively. SNARE  proteins consist of coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. Examples for members of the Qb SNAREs are N-terminal domains of SNAP23 and SNAP25, Vti1, Sec20 and GS27.	62
277244	cd15891	SNARE_Vti1a	SNARE motif of Vti1a-like. Vti1a (vesicle transport through interaction with t-SNAREs homolog 1A) belongs to the Qb subgroup of SNAREs (soluble N-ethylmaleimide-sensitive factor attachment protein receptor). Vti1a interacts with syntaxin 16 (Qa), syntaxin 6 (Qc), and the lysosomal R-SNARE VAMP4 to form an endosomal SNARE core complex that mediates transport from the early endosomes to the TGN (trans-Golgi network). SNARE proteins consist of coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. Examples for members of the Qb SNAREs are N-terminal domains of SNAP23 and SNAP25, Vti1, Sec20 and GS27.	62
277245	cd15892	R-SNARE_STXBP6	SNARE domain of STXBP6. Syntaxin binding protein 6 (STXBP6, also called Amisyn), as well as its relative Syntaxin binding protein 5 (STXBP5, also called Tomosyn), contains a C-terminal R-SNARE-like domain, which allows it to assemble into SNARE complexes, which in turn makes the complexes inactive and inhibits exocytosis. In general, SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins consist of coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qc-, as well as Qa- and Qb-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle.	62
277246	cd15893	R-SNARE_STXBP5	SNARE domain of STXBP5. Syntaxin binding protein 5 (STXBP5, also called Tomosyn), as well as its relative Syntaxin binding protein 6 (STXBP6, also called Amisyn) contains a C-terminal R-SNARE-like domain, which allows it to assemble into SNARE complexes, which in turn makes the complexes inactive and inhibits exocytosis. Tomosyn contains an N-terminal WD40 repeat region and has been shown to form complexes with SNAP-25 and syntaxin 1a, as well as SNAP-23 and syntaxin 4. In general, SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins consist of coiled-coil helices (called SNARE motifs) which mediate the interactions between SNARE proteins, and a transmembrane domain. The SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qc-, as well as Qa- and Qb-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle.	61
277247	cd15894	SNARE_SNAP25N	N-terminal SNARE motif of SNAP25. N-terminal SNARE motifs of SNAP25, a member of the Qb/Qc subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins. SNAP25 interacts with Syntaxin-1 (Qa) and the R-SNARE VAMP2 (also called synaptobrevin-2). The complex plays a role in transport of secretory granule from trans-Golgi network to the plasma membrane. Qb/Qc SNAREs consist of 2 coiled-coil helices (called SNARE motifs, one belonging to the Qb subgroup and one belonging to the Qc subgroup), which mediate the interactions with other SNARE proteins, and a transmembrane domain. In general, the SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. Other members of the Qb/Qc SNAREs are SNAP23, SNAP29, SNAP47 and SEC9.	73
277248	cd15895	SNARE_SNAP23N	N-terminal SNARE motif of SNAP23. N-terminal SNARE motifs of SNAP23, a member of the Qb/Qc subfamily of SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) proteins. SNAP23 interacts with Syntaxin-4 (Qa) and the R-SNARE VAMP8. The complex plays a role in exocytosis of secretory granule. Qb/Qc SNAREs consist of 2 coiled-coil helices (called SNARE motifs, one belonging to the Qb subgroup and one belonging to the Qc subgroup), which mediate the interactions with other SNARE proteins, and a transmembrane domain. In general, the SNARE complex mediates membrane fusion, important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, depending on whether the residue in the hydrophilic center layer of the four-helical bundle is a glutamine (Q) or arginine (R). Qa-, as well as Qb- and Qc-SNAREs, are localized to target organelle membranes, while R-SNARE is localized to vesicle membranes. They form unique complexes consisting of one member of each subgroup, that mediate fusion between a specific type of vesicles and their target organelle. Their SNARE motifs form twisted and parallel heterotetrameric helix bundles. Other members of the Qb/Qc SNAREs are SNAP25, SNAP29, SNAP47 and SEC9.	67
276899	cd15896	MYSc_Myh19	class II myosin heavy chain19, motor domain. Myosin motor domain of muscle myosin heavy chain 19. Class II myosins, also called conventional myosins, are the myosin type responsible for producing actomyosin contraction in metazoan muscle and non-muscle cells. Myosin II contains two heavy chains made up of the head (N-terminal) and tail (C-terminal) domains with a coiled-coil morphology that holds the two heavy chains together. The intermediate neck domain is the region creating the angle between the head and tail. It also contains 4 light chains which bind the heavy chains in the "neck" region between the head and tail. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. Class-II myosins are regulated by phosphorylation of the myosin light chain or by binding of Ca2+. A cyclical interaction between myosin and actin provides the driving force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy.	675
320054	cd15897	EFh_PEF	The penta-EF hand (PEF) family. The penta-EF hand (PEF) family contains a group of five EF-hand calcium-binding proteins, including several classical calpain large catalytic subunits (CAPN1, 2, 3, 8, 9, 11, 12, 13, 14), two calpain small subunits (CAPNS1 and CAPNS2), as well as non-calpain PEF proteins, ALG-2 (apoptosis-linked gene 2, also termed programmed cell death protein 6, PDCD6), peflin, sorcin, and grancalcin. Based on the sequence similarity of EF1 hand, ALG-2 and peflin have been classified into group I PEF proteins. Calcium-dependent protease calpain subfamily members, sorcin and grancalcin, are group II PEF proteins. Calpains (EC 3.4.22.17) are calcium-activated intracellular cysteine proteases that play important roles in the degradation or functional modulation in a variety of substrates. They have been implicated in a number of physiological processes such as cell cycle progression, remodeling of cytoskeletal-cell membrane attachments, signal transduction, gene expression and apoptosis. ALG-2 is a pro-apoptotic factor that forms a homodimer in the cell or a heterodimer with its closest paralog peflin through their EF5s. Peflin is a 30-kD PEF protein with a longer N-terminal hydrophobic domain than any other member of the PEF family, and it contains nine nonapeptide (A/PPGGPYGGP) repeats. It exists only as a heterodimer with ALG-2. The dissociation of heterodimer occurs in the presence of Ca2+. ALG-2 interacts with various proteins in a Ca2+-dependent manner. Sorcin (for soluble resistance-related calcium binding protein) is a soluble resistance-related calcium-binding protein that participates in the regulation of calcium homeostasis in cells. Grancalcin is a cytosolic Ca2+-binding protein specifically expressed in neutrophils and monocytes/macrophages. It plays a key role in leukocyte-specific functions that are responsible for host defense. Grancalcin can form a heterodimer together with sorcin. Members in this family contain five EF-hand motifs attached to an N-terminal region of variable length containing one or more short Gly/Pro-rich sequences. These proteins form homodimers or heterodimers through pairing between the 5th EF-hands from the two molecules. Unlike calmodulin, the PEF domains do not undergo major conformational changes upon binding Ca2+.	165
320029	cd15898	EFh_PI-PLC	EF-hand motif found in eukaryotic phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11) isozymes. PI-PLC isozymes are signaling enzymes that hydrolyze the membrane phospholipids phosphatidylinositol-4,5-bisphosphate (PIP2) to generate two important second messengers in eukaryotic signal transduction cascades, inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which goes on to phosphorylate other molecules, leading to altered cellular activity. Calcium is required for the catalysis. This family corresponds to the four EF-hand motifs containing PI-PLC isozymes, including PI-PLC-beta (1-4), -gamma (1-2), -delta (1,3,4), -epsilon (1), -zeta (1), and -eta (1-2). Lower eukaryotes such as yeast and slime molds contain only delta-type isozymes. In contrast, other types of isoforms are present in higher eukaryotes. This family also includes 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase 1 (PLC1) from fungi. Some homologs from plants contain only two atypical EF-hand motifs and they are not included. All PI-PLC isozymes except sperm-specific PI-PLC-zeta share a core set of domains, including an N-terminal pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core, and a single C2 domain. PI-PLC-zeta lacks the PH domain. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. Most of the EF-hand motifs found in PI-PLCs consist of a helix-loop-helix structure, but lack residues critical to metal binding. Moreover, the EF-hand region of most PI-PLCs may have an important regulatory function, but it has yet to be identified. However, PI-PLC-zeta is a key exception. It is responsible for Ca2+ oscillations in fertilized oocytes and exhibits a high sensitivity to Ca2+ mediated through its EF-hand domain. In addition, PI-PLC-eta2 shows a canonical EF-loop directing Ca2+-sensitivity and thus can amplify transient Ca2+ signals. Also it appears that PI-PLC-delta1 can regulate the binding of PH domain to PIP2 in a Ca2+-dependent manner through its functionally important EF-hand domains. PI-PLCs can be activated by a variety of extracellular ligands, such as growth factors, hormones, cytokines and lipids. Their activation has been implicated in tumorigenesis and/or metastasis linked to migration, proliferation, growth, inflammation, angiogenesis and actin cytoskeleton reorganization. PI-PLC-beta isozymes are activated by G-protein coupled receptor (GPCR) through different mechanisms. However, PI-PLC-gamma isozymes are activated by receptor tyrosine kinase (RTK), such as Rho and Ras GTPases. In contrast, PI-PLC-epsilon is activated by both GPCR and RTK. PI-PLC-delta1 and PLC-eta 1 are activated by GPCR-mediated calcium mobilization. The activation mechanism for PI-PLC-zeta remains unclear.	137
320021	cd15899	EFh_CREC	EF-hand, calcium binding motif, found in CREC-EF hand family. The CREC (Cab45/reticulocalbin/ERC45/calumenin)-EF hand family contains a group of six EF-hand, low-affinity Ca2+-binding proteins, including reticulocalbin (RCN-1), ER Ca2+-binding protein of 55 kDa (ERC-55, also known as TCBP-49 or E6BP), reticulocalbin-3 (RCN-3), Ca2+-binding protein of 45 kDa (Cab45 and its splice variant Cab45b), and calumenin (also known as crocalbin or CBP-50). The proteins are not only localized in various parts of the secretory pathway, but also found in the cytosolic compartment and at the cell surface. They interact with different ligands or proteins and have been implicated in the secretory process, chaperone activity, signal transduction as well as in a large variety of disease processes.	267
320080	cd15900	EFh_MICU	EF-hand, calcium binding motif, found in mitochondrial calcium uptake proteins MICU1, MICU2, MICU3, and similar proteins. This family includes mitochondrial calcium uptake protein MICU1 and its two additional paralogs, MICU2 and MICU3. MICU1 localizes to the inner mitochondrial membrane (IMM). It functions as a gatekeeper of the mitochondrial calcium uniporter (MCU) and regulates MCU-mediated mitochondrial Ca2+ uptake, which is essential for maintaining mitochondrial homoeostasis. MICU1 and MICU2 are physically associated within the uniporter complex and are co-expressed across all tissues. They may play non-redundant roles in the regulation of the mitochondrial calcium uniporter. At present, the precise molecular functions of MICU2 and MICU3 remain unclear. MICU2 may play possible roles in Ca2+ sensing and regulation of MCU, calcium buffering with a secondary impact on transport or assembly and stabilization of MCU. MICU3 likely has a role in mitochondrial calcium handling. All members in this family contain an N-terminal mitochondrial targeting sequence (MTS) as well as two evolutionarily conserved canonical Ca2+-binding EF-hands separated by a long stretch of residues predicted to form alpha-helices.	152
319999	cd15901	EFh_DMD_DYTN_DTN	EF-hand-like motif found in the dystrophin/dystrobrevin/dystrotelin family. The dystrophin/dystrobrevin/dystrotelin family has been characterized by a compact cluster of domains comprising four EF-hand-like motifs and a ZZ-domain, followed by a looser region with two coiled-coils. Dystrophin is the founder member of this family. It is a sub-membrane cytoskeletal protein associated with the inner surface membrane. Dystrophin and its close paralog utrophin have a large N-terminal extension of actin-binding CH domains, up to 24 spectrin repeats, and a WW domain. Its further paralog, dystrophin-related protein 2 (DRP-2), retains only two of the spectrin repeats. Dystrophin, utrophin or DRP2 can form the core of a membrane-bound complex consisting of dystroglycan, sarcoglycans and syntrophins, known as the dystrophin-glycoprotein complex (DGC) that plays an important role in brain development and disease, as well as in the prevention of muscle damage. Dystrobrevins, including alpha- and beta-dystrobrevin, lack the large N-terminal extension found in dystrophin, but alpha-dystrobrevin has a characteristic C-terminal extension. Dystrobrevins are part of the DGC. They physically associate with members of the dystrophin family and with the syntrophins through their homologous C-terminal coiled coil motifs. In contrast, dystrotelins lack both the large N-terminal extension found in dystrophin and the obvious syntrophin-binding sites (SBSs). Dystrotelins are not critical for mammalian development. They may be involved in other forms of cytokinesis. Moreover, dystrotelin is unable to heterodimerize with members of the dystrophin or dystrobrevin families, or to homodimerize.	163
320075	cd15902	EFh_HEF	EF-hand, calcium binding motif, found in the hexa-EF hand proteins family. The hexa-EF hand proteins family, also named the calbindin sub-family, contains a group of six EF-hand Ca2+-binding proteins, including calretinin (CR, also termed 29 kDa calbindin), calbindin D28K (CB, also termed vitamin D-dependent calcium-binding protein, avian-type), and secretagogin (SCGN). CR is a cytosolic hexa-EF-hand calcium-binding protein predominantly expressed in a variety of normal and tumorigenic t-specific neurons of the central and peripheral nervous system. It is a multifunctional protein implicated in many biological processes, including cell proliferation, differentiation, and cell death. CB is highly expressed in brain tissue. It is a strong calcium-binding and buffering protein responsible for preventing neuronal death as well as maintaining and controlling calcium homeostasis. SCGN is a six EF-hand calcium-binding protein expressed in neuroendocrine, pancreatic endocrine and retinal cells. It plays a crucial role in cell apoptosis, receptor signaling and differentiation. It is also involved in vesicle secretion through binding to various proteins, including SNAP25, SNAP23, DOC2alpha, ARFGAP2, rootletin, KIF5B, beta-tubulin, DDAH-2, ATP-synthase and myeloid leukemia factor 2. SCGN functions as a Ca2+ sensor/coincidence detector modulating vesicular exocytosis of neurotransmitters, neuropeptides or hormones. Although the family members share a significant amount of secondary sequence homology, they display altered structural and biochemical characteristics, and operate in distinct fashions. CB contains six EF-hand motifs in a single globular domain, where EF-hands 1, 3, 4, 5 bind four calcium ions. CR contains six EF-hand motifs within two independent domains, CR I-II and CR III-VI. They harbor two and four EF-hand motifs, respectively. The first 5 EF-hand motifs are capable of binding calcium ions, while the EF-hand 6 is inactive. SCGN consists of three globular domains, each of which contains a pair of EF-hand motifs. Human SCGN simultaneously binds four calcium ions through its EF-hands 3, 4, 5 and 6 in one high affinity and three low affinity calcium-binding sites. In contrast, SCGNs in other lower eukaryotes, such as D. rerio, X. laevis, M. domestica, G. gallus, O. anatinus, are fully competent in terms of binding six calcium ions.	254
277191	cd15903	Dicer_PBD	Partner-binding domain of the endoribonuclease Dicer. The endoribonuclease Dicer plays a central role in RNA interference by breaking down RNA molecules into fragments of about 22 nucleotides (miRNAs and siRNAs). Loading of RNA onto Dicer and the enzymatic cleavage are supported by dsRNA-binding proteins, including trans-activation response (TAR) RNA-binding protein (TRBP) or protein activator of PKR (PACT). Together with Argonaute, this constitutes the RNA-induced silencing complex (RISC) which functions to load the small RNA fragments onto Argonaute. The Partner-binding domain of Dicer is responsible for interactions with the dsRNA-binding proteins. This helical domain can be found inserted in a subset of SF2-type DEAD-box related helicases.	104
320706	cd15904	TSPO_MBR	Translocator protein (TSPO)/peripheral-type benzodiazepine receptor (MBR) family. This family contains tryptophan-rich translocator protein (TSPO), an integral membrane protein that is highly conserved from bacteria to mammals. In eukaryotes, it is mainly found in the outer mitochondrial membranes of steroid-synthesizing cells of the nervous system where it transports cholesterol into mitochondria. It is known to be highly expressed in metastatic cancer, steroidogenic tissues, as well as inflammatory and neurological diseases such as Alzheimer's and Parkinson's. TSPO is also known as the peripheral benzodiazepine receptor (MBR) and its ligands include benzodiazepine drugs, implicated in regulating apoptosis. In humans, a single polymorphism A147T is associated with psychiatric disorders; the mutation causes structural changes in a region implicated in cholesterol binding. TSPO is homologous to bacterial tryptophan-rich sensory proteins, and their tryptophan residues are believed to be functionally important. In bacteria, TSPO acts as a negative regulator of expression of specific photosynthesis genes in response to oxygen/light; it catalyzes a photooxidative degradation of protoporphyrin IX (PpIX). R. sphaeroides TSPO (RsTSPO) is involved in porphyrin transport, similar to human, while Arabidopsis translocator protein (AtTSPO) is regulated at multiple levels in response to salt stress and perturbations in tetrapyrrole metabolism.	142
320571	cd15905	7tmA_GPBAR1	G protein-coupled bile acid receptor 1, member of the class A family of seven-transmembrane G protein-coupled receptors. The G-protein coupled bile acid receptor GPBAR1 is also known as BG37, TGR5 (Takeda G-protein-coupled receptor 5), M-BAR (membrane-type receptor for bile acids), and GPR131. GPBAR1 is highly expressed in the gastrointestinal tract, but also found in many other tissues including liver, colon, heart, skeletal muscle, and brown adipose tissue. GPBAR1 functions as a membrane-bound receptor specific for bile acids, which are the end products of cholesterol metabolism that facilitate digestion and absorption of lipids or fat-soluble vitamins. Bile acids act as liver-specific metabolic signaling molecules and stimulate liver regeneration by activating GPBAR1 and nuclear receptors such as the farnesoid X receptor (FXR). Upon bile acid binding, GPBAR1 activation causes release of the G-alpha(s) subunit and activation of adenylate cyclase. The increase in intracellular cAMP level then stimulates the expression of many genes via the PKA-mediated phosphorylation of cAMP-response element binding protein (CREB). Thus, GPBAR1 signaling exerts various biological effects in immune cells, liver, and metabolic tissues. For example, GPBAR1 activation leads to enhanced energy expenditure in brown adipose tissue and skeletal muscle; stimulation of glucagon-like peptide-1 (GLP-1) production in enteroendocrine L-cells; and inhibition of pro-inflammatory cytokine production in macrophages and attenuation of atherosclerosis development. GPBAR1 is a member of the class A rhodopsin-like family of GPCRs, which comprises receptors for hormones, neurotransmitters, sensory stimuli, and a variety of other ligands.	272
320572	cd15906	7tmA_GPR162	G protein-coupled receptor 162, member of the class A family of seven-transmembrane G protein-coupled receptors. This subgroup represents the orphan G-protein coupled receptor 162 (GPR162), also called A-2 or GRCA, with unknown endogenous ligand and function. Phylogenetic analysis indicates that GPR162 and GPR153 share a common evolutionary ancestor due to a gene duplication event. Although categorized as members of the rhodopsin-like class A GPCRs, both GPR162 and GPR153 contain an HRM motif instead of the highly conserved Asp-Arg-Tyr (DRY) motif found in the third transmembrane helix (TM3) of class A receptors and important for efficient G protein-coupled signal transduction. Moreover, the LPxF motif, a variant of the NPxxY motif that plays a crucial role during receptor activation, is found at the end of TM7 in GPR162 and GPR153. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	315
320573	cd15907	7tmA_GPR153	orphan G protein-coupled receptor 153, member of the class A family of seven-transmembrane G protein-coupled receptors. This subgroup represents the G-protein coupled receptor 153 (GPR153) with unknown endogenous ligand and function. GPR153 shares a common evolutionary origin with GPR162 and is highly expressed in the central nervous system (CNS) including the thalamus, cerebellum, and the arcuate nucleus. Although categorized as a member of the rhodopsin-like class A GPCRs, GPR153 contains an HRM motif instead of the highly conserved Asp-Arg-Tyr (DRY) motif found in the third transmembrane helix (TM3) of class A receptors and important for efficient G protein-coupled signal transduction. Moreover, the LPxFL motif, a variant of the NPxxY motif that plays a crucial role during receptor activation, is found at the end of TM7 in GPR153. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	301
320574	cd15908	7tm_TAS2R40-like	taste receptor 2, subtypes 39 and 40, and similar receptors, member of the seven-transmembrane G protein-coupled receptor superfamily. This group includes the mammalian taste receptor 2 (TAS2R) subtypes 39 and 40, which function as bitter taste receptors. The human TAS2R family contains about 25 functional members, which are glycoproteins and have the ability to form both homomeric and heteromeric receptor complexes. Five basic tastes are perceived by animals: bitter, sweet, sour, salty, and umami (taste of glutamate MSG). Among these, sour and salty are mediated by ion channels, while the perception of umami and sweet tastes is mediated by the TAS1R taste receptors, which belong to the class C GPCR family. The TAS2Rs in humans have a short extracellular N-terminus and the ligand binds within the transmembrane domain, whereas the TAS1Rs have a large N-terminal extracellular domain composed of the Venus flytrap module that forms the orthosteric (primary) ligand binding site. Signal transduction of bitter taste involves binding of bitter compounds to TAS2Rs linked to the alpha-subunit of gustducin, a heterotrimeric G protein expressed in taste receptor cells. This G-alpha subunit stimulates phosphodiesterase and decreases cAMP and cGMP levels. Further steps in the signaling cascade are still unknown. The beta-gamma-subunit of gustducin also mediates bitter taste transduction by activating phospholipase C, which leads to an increased formation of IP3 (inositol triphosphate) and DAG (diacylglycerol), thereby causing release of Ca2+ from intracellular stores and enhanced neurotransmitter release.	289
320575	cd15909	7tmF_FZD4_9_10-like	class F frizzled subfamilies 4, 9, 10, and related proteins; member of 7-transmembrane G protein-coupled receptors. This group includes subfamilies 4, 9 and 10 of the frizzled (FZD) family of seven transmembrane-spanning proteins, which constitute a novel and separate class of GPCRs, and their closely related proteins. This class F protein family consists of 10 isoforms (FZD1-10) in mammals. The FZDs are activated by the wingless/int-1 (WNT) family of secreted lipoglycoproteins and preferentially couple to stimulatory G proteins of the Gs family, which activate adenylate cyclase, but can also couple to G proteins of the Gi/Gq families. In the WNT/beta-catenin signaling pathway, the WNT ligand binds to FZD and a lipoprotein receptor-related protein (LRP) co-receptor. This leads to the stabilization and translocation of beta-catenin to the nucleus, where it induces the activation of TCF/LEF family transcription factors. The conserved cytoplasmic motif of FZD, Lys-Thr-X-X-X-Trp, is required for activation of the WNT/beta-catenin pathway, and for membrane localization and phosphorylation of Dsh (dishevelled) protein, a key component of the WNT pathway that relays the WNT signals from the activated receptor to downstream effector proteins. The WNT pathway plays a critical role in many developmental processes, such as cell-fate determination, cell proliferation, neural patterning, stem cell renewal, tissue homeostasis and repair, and tumorigenesis, among many others.	320
320576	cd15910	7tmF_FZD3_FZD6-like	class F frizzled subfamilies 3, 6 and related proteins; member of 7-transmembrane G protein-coupled receptors. This group includes subfamilies 3 and 6 of the frizzled (FZD) family of seven transmembrane-spanning proteins, which constitute a novel and separate class of GPCRs, and their closely related proteins. This class F protein family consists of 10 isoforms (FZD1-10) in mammals. The FZDs are activated by the wingless/int-1 (WNT) family of secreted lipoglycoproteins and preferentially couple to stimulatory G proteins of the Gs family, which activate adenylate cyclase, but can also couple to G proteins of the Gi/Gq families. In the WNT/beta-catenin signaling pathway, the WNT ligand binds to FZD and a lipoprotein receptor-related protein (LRP) co-receptor. This leads to the stabilization and translocation of beta-catenin to the nucleus, where it induces the activation of TCF/LEF family transcription factors. The conserved cytoplasmic motif of FZD, Lys-Thr-X-X-X-Trp, is required for activation of the WNT/beta-catenin pathway, and for membrane localization and phosphorylation of Dsh (dishevelled) protein, a key component of the WNT pathway that relays the WNT signals from the activated receptor to downstream effector proteins. The WNT pathway plays a critical role in many developmental processes, such as cell-fate determination, cell proliferation, neural patterning, stem cell renewal, tissue homeostasis and repair, and tumorigenesis, among many others.	321
320577	cd15911	7tmA_OR11A-like	olfactory receptor subfamily 11A and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor 11A and related proteins in other mammals, sauropsids, and amphibians. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	270
320578	cd15912	7tmA_OR6C-like	olfactory receptor subfamily 6C and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor 6C, 6X, 6J, 6T, 6V, 6M, 9A, and related proteins in other mammals, sauropsids, and amphibians. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	270
320579	cd15913	7tmA_OR11G-like	olfactory receptor OR11G and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor 11G, 11H, and related proteins in other mammals, and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	270
320580	cd15914	7tmA_OR6N-like	olfactory receptor OR6N and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor 6N, 6K, and related proteins in other mammals, sauropsids, and amphibians. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	270
320581	cd15915	7tmA_OR12D-like	olfactory receptor subfamily 12D and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 12D and related proteins in other mammals, sauropsids, and amphibians. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	271
320582	cd15916	7tmA_OR10G-like	olfactory receptor subfamily 10G and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 10G, 10S, and related proteins in other mammals, sauropsids, and amphibians. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	276
341351	cd15917	7tmA_OR51_52-like	olfactory receptor family 51, 52, 56 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor families 51, 52, 56, and related proteins in other mammals, sauropsids, amphibians, and fishes. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	275
320584	cd15918	7tmA_OR1_7-like	olfactory receptor families 1, 7, and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor families 1 and 7, and related proteins in other mammals, sauropsids, and amphibians. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	270
320585	cd15919	7tmA_GPR139	G-protein-coupled receptor GPR139, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR139, a vertebrate orphan receptor, is very closely related to GPR142, but they have different expression patterns in the brain and in other tissues. These receptors couple to inhibitory G proteins and activate phospholipase C. Studies suggested that dimer formation may be required for their proper function. GPR142 is predominantly expressed in pancreatic beta-cells and plays an important role in mediating insulin secretion and maintaining glucose homeostasis, whereas GPR139 is expressed almost exclusively in the brain and is suggested to play a role in the control of locomotor activity. Tryptophan and phenylalanine have been identified as putative endogenous ligands of GPR139. These orphan receptors are phylogenetically clustered with invertebrate FMRFamide receptors such as Drosophila melanogaster DrmFMRFa-R.	270
320586	cd15920	7tmA_GPR34-like	P2Y-like receptor and similar proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR34 is phylogenetically related to the P2Y family of purinergic G protein-coupled receptors. The P2Y receptor family is composed of eight subtypes, which are activated by naturally occurring extracellular nucleotides such as ATP, ADP, UTP, UDP, and UDP-glucose. GPR34 is shown to couple to G(i/o) protein and is highly expressed in microglia. Recently, lysophosphatidylserine has been identified as a ligand for GPR34. This group belongs to the class A G protein-coupled receptor superfamily, whose members all have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, which then activate the heterotrimeric G proteins. G-proteins regulate a variety of cellular functions including metabolic enzymes, ion channels, and transporters, among many others.	278
320587	cd15921	7tmA_CysLTR	cysteinyl leukotriene receptors, member of the class A family of seven-transmembrane G protein-coupled receptors. Cysteinyl leukotrienes (LTC4, LTD4, and LTE4) are the most potent inflammatory lipid mediators that play an important role in human asthma. They are synthesized in the leucocytes (cells of immune system) from arachidonic acid by the actions of 5-lipoxygenase and induce bronchial constriction through G protein-coupled receptors, CysLTR1 and CysLTR2. Activation of CysLTR1 by LTD4 induces airway smooth muscle contraction and proliferation, eosinophil migration, and damage to the lung tissue. They belong to the class A GPCR superfamily, whose members all have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	283
320588	cd15922	7tmA_P2Y-like	P2Y purinoceptor-like proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. P2Y-like proteins are an uncharacterized group that is phylogenetically related to a family of purinergic G protein-coupled receptors. The P2Y receptor family is composed of eight subtypes, which are activated by naturally occurring extracellular nucleotides such as ATP, ADP, UTP, UDP, and UDP-glucose. These eight receptors are ubiquitous in human tissues and can be further classified into two subfamilies based on sequence homology and second messenger coupling: a subfamily of five P2Y1-like receptors (P2Y1, P2Y2, P2Y4, P2Y6, and P2Y11Rs) that are coupled to G(q) protein to activate phospholipase C (PLC) and a second subfamily of three P2Y12-like receptors (P2Y12, P2Y13, and P2Y14Rs) that are coupled to G(i) protein to inhibit adenylate cyclase. Several cloned subtypes, such as P2Y3, P2Y5 and P2Y7-10, are not functional mammalian nucleotide receptors. The native agonists for P2Y receptors are: ATP (P2Y2, P2Y12), ADP (P2Y1, P2Y12 and P2Y13), UTP (P2Y2, P2Y4), UDP (P2Y6, P2Y14), and UDP-glucose (P2Y14).	284
320589	cd15923	7tmA_GPR35_55-like	G protein-coupled receptor 35, GPR55, and similar proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This subfamily is composed of GPR35, GPR55, and similar proteins. GPR35 shares closest homology with GPR55, and they belong to the class A G protein-coupled receptor superfamily, whose members all have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A number of studies have suggested that GPR35 may play important physiological roles in hypertension, atherosclerosis, nociception, asthma, glucose homeostasis and diabetes, and inflammatory bowel disease. GPR35 is thought to be responsible for brachydactyly mental retardation syndrome, which is associated with a deletion comprising chromosome 2q37 in human, and is also implicated as a potential oncogene in stomach cancer. GPR35 couples to G(13) and G(i/o) proteins, whereas GPR55 has been reported to couple to G(13), G(12), or G(q) proteins. Activation of GPR55 leads to activation of phospholipase C, RhoA, ROCK, ERK, p38MAPK, and calcium release. Recently, lysophosphatidylinositol (LPI) has been identified as an endogenous ligand for GPR55, while several endogenous ligands for GPR35 have been identified including kynurenic acid, 2-oleoyl lysophosphatidic acid, and zaprinast.	273
341352	cd15924	7tmA_P2Y12-like	P2Y purinoceptors 12, 13, 14, and similar proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. The P2Y receptor family is composed of eight subtypes, which are activated by naturally occurring extracellular nucleotides such as ATP, ADP, UTP, UDP, and UDP-glucose. These eight receptors are ubiquitous in human tissues and can be further classified into two subfamilies based on sequence homology and second messenger coupling: a subfamily of five P2Y1-like receptors (P2Y1, P2Y2, P2Y4, P2Y6, and P2Y11Rs) that are coupled to G(q) protein to activate phospholipase C (PLC) and a second subfamily of three P2Y12-like receptors (P2Y12, P2Y13, and P2Y14Rs) that are coupled to G(i) protein to inhibit adenylate cyclase. Several cloned subtypes, such as P2Y3, P2Y5 and P2Y7-10, are not functional mammalian nucleotide receptors. The native agonists for P2Y receptors are: ATP (P2Y2, P2Y12), ADP (P2Y1, P2Y12 and P2Y13), UTP (P2Y2, P2Y4), UDP (P2Y6, P2Y14), and UDP-glucose (P2Y14). This cluster includes only P2Y12-like receptors and the closely related orphan receptor GPR87.	284
320591	cd15925	7tmA_RNL3R2	relaxin-3 receptor 2 (RNL3R2), member of the class A family of seven-transmembrane G protein-coupled receptors. The G protein-coupled receptor RNL3R2 is also known as GPR100, GPR142, and relaxin family peptide receptor 4 (RXFP4). Insulin-like peptide 5 (INSL5) is an endogenous ligand for RNL3R2 and plays a role in fat and glucose metabolism. INSL5 is highly expressed in human rectal and colon tissues. RNL3R2 signals through G(i) protein and inhibits adenylate cyclase, thereby inhibiting cAMP accumulation.	283
320592	cd15926	7tmA_RNL3R1	relaxin 3 receptor 1 (RNL3R1), member of the class A family of seven-transmembrane G protein-coupled receptors. The G protein-coupled receptor RNL3R1 is also known as GPCR135, relaxin family peptide receptor 3 (RXFP3), and somatostatin- and angiotensin-like peptide receptor (SALPR). RNL3/relaxin-3, a member of the insulin superfamily, is an endogenous neuropeptide ligand for RNL3R1. RNL3R1 is predominantly expressed in brain regions and is implicated in stress, anxiety, feeding, and metabolism. RNL3R1 signals through G(i) protein and inhibits adenylate cyclase, thereby inhibiting cAMP accumulation, and also activates the Erk1/2 signaling pathway.	288
320593	cd15927	7tmA_Bombesin_R-like	bombesin receptor subfamily, member of the class A family of seven-transmembrane G protein-coupled receptors. This bombesin subfamily of G-protein coupled receptors consists of neuromedin B receptor (NMBR), gastrin-releasing peptide receptor (GRPR), and bombesin receptor subtype 3 (BRS-3). Bombesin is a tetradecapeptide, originally isolated from frog skin. Mammalian bombesin-related peptides are widely distributed in the gastrointestinal and central nervous systems. The bombesin family receptors couple mainly to the G proteins of G(q/11) family. NMBR functions as the receptor for the neuropeptide neuromedin B, a potent mitogen and growth factor for normal and cancerous lung and for gastrointestinal epithelial tissues. Gastrin-releasing peptide is an endogenous ligand for GRPR and shares high sequence homology with NMB in the C-terminal region. Both NMB and GRP possess bombesin-like biochemical properties. BRS-3 is classified as an orphan receptor and suggested to play a role in sperm cell division and maturation. BRS-3 interacts with known naturally-occurring bombesin-related peptides with low affinity; however, no endogenous high-affinity ligand to the receptor has been identified. The bombesin receptor family belongs to the seven transmembrane rhodopsin-like G-protein coupled receptors (class A GPCRs), which perceive extracellular signals and transduce them to guanine nucleotide-binding (G) proteins.	294
320594	cd15928	7tmA_GHSR-like	growth hormone secretagogue receptor, motilin receptor, and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This subfamily includes growth hormone secretagogue receptor (GHSR or ghrelin receptor), motilin receptor (also called GPR38), and related proteins. Both GHSR and GPR38 bind peptide hormones. Ghrelin, the endogenous ligand for GHSR, is an acylated 28-amino acid peptide hormone produced by ghrelin cells in the gastrointestinal tract. Ghrelin is also called the hunger hormone and is involved in the regulation of growth hormone release, appetite and feeding, gut motility, lipid and glucose metabolism, and energy balance. Motilin, the ligand for GPR38, is a 22 amino acid peptide hormone expressed throughout the gastrointestinal tract and stimulates contraction of gut smooth muscle. It is involved in the regulation of digestive tract motility.	288
341353	cd15929	7tmB1_GlucagonR-like	glucagon receptor-like subfamily, member of the class B family of seven-transmembrane G protein-coupled receptors. This group represents the glucagon receptor family of G protein-coupled receptors, which includes glucagon receptor (GCGR), glucagon-like peptide-1 receptor (GLP1R), GLP2R, and closely related receptors. These receptors are activated by the members of the glucagon (GCG) peptide family including GCG, glucagon-like peptide 1 (GLP1), and GLP2, which are derived from the large proglucagon precursor. GCGR regulates blood glucose levels by control of hepatic glycogenolysis and gluconeogenesis and by regulation of insulin secretion from the pancreatic beta-cells. Activation of GLP1R stimulates glucose-dependent insulin secretion from pancreatic beta cells, whereas activation of GLP2R stimulates intestinal epithelial proliferation and increases villus height in the small intestine. Receptors in this group belong to the B1 (or secretin-like) subfamily of class B GPCRs, which includes receptors for polypeptide hormones of 27-141 amino-acid residues such as secretin, calcitonin gene-related peptide, parathyroid hormone (PTH), and corticotropin-releasing factor. These receptors contain the large N-terminal extracellular domain (ECD), which plays a critical role in hormone recognition by binding to the C-terminal portion of the peptide. On the other hand, the N-terminal segment of the hormone induces receptor activation by interacting with the receptor transmembrane domains and connecting extracellular loops, triggering intracellular signaling pathways. All members of the B1 subfamily preferentially couple to G proteins of G(s) family, which positively stimulate adenylate cyclase, leading to increased intracellular cAMP formation and calcium influx. However, depending on their cellular location, GCGR and GLP receptors can activate multiple G proteins, which can in turn stimulate different second messenger pathways.	279
320596	cd15930	7tmB1_Secretin_R-like	secretin receptor-like group of hormone receptors, member of the class B family of seven-transmembrane G protein-coupled receptors. This group represents G protein-coupled receptors for structurally similar peptide hormones that include secretin, growth-hormone-releasing hormone (GHRH), pituitary adenylate cyclase activating polypeptide (PACAP), and vasoactive intestinal peptide (VIP). These receptors are classified into the subfamily B1 of class B GPCRs that consists of the classical hormone receptors and have been identified in all the vertebrates, from fishes to mammals, but are not present in plants, fungi, or prokaryotes. For all class B receptors, the large N-terminal extracellular domain plays a critical role in peptide hormone recognition. Secretin, a polypeptide secreted by entero-endocrine S cells in the small intestine, is involved in maintaining body fluid balance. This polypeptide regulates the secretion of bile and bicarbonate into the duodenum from the pancreatic and biliary ducts, as well as regulates the duodenal pH by the control of gastric acid secretion. Studies with secretin receptor-null mice indicate that secretin plays a role in regulating renal water reabsorption. Secretin mediates its biological actions by elevating intracellular cAMP via G protein-coupled secretin receptors, which are expressed in the brain, pancreas, stomach, kidney, and liver. GHRHR is a specific receptor for the growth hormone-releasing hormone (GHRH) that controls the synthesis and release of growth hormone (GH) from the anterior pituitary somatotrophs. Mutations in the gene encoding GHRHR have been connected to isolated growth hormone deficiency (IGHD), a short-stature condition caused by deficient production of GH or lack of GH action. VIP and PACAP exert their effects through three G protein-coupled receptors, PACAP-R1, VIP-R1 (vasoactive intestinal receptor type 1, also known as VPAC1) and VIP-R2 (or VPAC2). PACAP-R1 binds only PACAP with high affinity, whereas VIP-R1 and -R2 specifically bind and respond to both VIP and PACAP. VIP and PACAP and their receptors are widely expressed in the brain and periphery. They are upregulated in neurons and immune cells in response to CNS injury and/or inflammation and exert potent anti-inflammatory effects, as well as play important roles in the control of circadian rhythms and stress responses, among many others. All B1 subfamily GPCRs are able to increase intracellular cAMP levels by coupling to adenylate cyclase via a stimulatory Gs protein. However, depending on their cellular location, some members of subfamily B1 are also capable of coupling to additional G proteins such as G(i/o) and/or G(q) proteins, thereby leading to activation of phospholipase C and intracellular calcium influx.	268
320597	cd15931	7tmB2_EMR_Adhesion_II	EGF-like module receptors, group II adhesion GPCRs, member of the class B2 family of seven-transmembrane G protein-coupled receptors. Group II adhesion GPCRs, including the leukocyte cell-surface antigen CD97 and the epidermal growth factor (EGF)-module-containing, mucin-like hormone receptor (EMR1-4), are primarily expressed in cells of the immune system. All EGF-TM7 receptors, which belong to the B2 subfamily of adhesion GPCRs, are members of group II, except for ETL (EGF-TM7-latrophilin related protein), which is classified into group I. Members of the EGF-TM7 receptors are characterized by the presence of varying numbers of N-terminal EGF-like domains, which play critical roles in ligand recognition and cell adhesion, linked by a stalk region to a class B seven-transmembrane domain. In the case of CD97, alternative splicing results in three isoforms possessing either three (EGF1,2,5), four (EGF1,2,3,5) or five (EGF1,2,3,4,5) EGF-like domains. On the other hand, EMR2 generates four isoforms possessing either two (EGF1,2), three (EGF1,2,5), four (EGF1,2,3,5) or five (EGF1,2,3,4,5) EGF-like domains. Moreover, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. For example, CD97, which is involved in angiogenesis and the migration and invasion of tumor cells, has been shown to promote cell aggregation in a GPS proteolysis-dependent manner. CD97 is widely expressed on lymphocytes, monocytes, macrophages, dendritic cells, granulocytes and smooth muscle cells as well as in a variety of human tumors including colorectal, gastric, esophageal, pancreatic, and thyroid carcinoma. EMR2 shares strong sequence homology with CD97, differing by only six amino acids. However, unlike CD97, EMR2 is not found in those CD97-positive tumor cells and is not expressed on lymphocytes but instead on monocytes, macrophages and granulocytes. CD97 has three known ligands: CD55, decay-accelerating factor for regulation of complement system; chondroitin sulfate, a glycosaminoglycan found in the extracellular matrix; and the integrin alpha5beta1, which plays a role in angiogenesis. Although EMR2 does not effectively interact with CD55, the fourth EGF-like domain of this receptor binds to chondroitin sulfate to mediate cell attachment.	262
320598	cd15932	7tmB2_GPR116-like_Adhesion_VI	orphan GPR116 and related proteins, group VI adhesion GPCRs, member of the class B2 family of seven-transmembrane G protein-coupled receptors. Group VI adhesion GPCRs consist of orphan receptors GPR110, GPR111, GPR113, GPR115, GPR116, and closely related proteins. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in ligand recognition as well as cell-cell adhesion and cell-matrix interactions, linked by a stalk region to a class B seven-transmembrane domain. GPR110, which possesses a SEA box in the N-terminal region, has been identified as an oncogene over-expressed in lung and prostate cancer. GPR113 contains a hormone binding domain and one EGF (epidermal growth factor) domain. GPR112 has an extremely long N-terminus (about 2,400 amino acids) containing a number of Ser/Thr-rich glycosylation sites and a pentraxin (PTX) domain. GPR116 has two C2-set immunoglobulin-like repeats, which are found in the members of the immunoglobulin superfamily of cell surface proteins, and a SEA (sea urchin sperm protein, enterokinase, and agrin)-box, which is present in the extracellular domain of the transmembrane mucin (MUC) family and known to enhance O-glycosylation. In addition, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. However, several adhesion GPCRs, including GPR111, GPR115, and CELSR1, are predicted to be non-cleavable at the GAIN domain because of the lack of a consensus catalytic triad sequence (His-Leu-Ser/Thr) within their GPS.	268
320599	cd15933	7tmB2_GPR133-like_Adhesion_V	orphan GPR133 and related proteins, group V adhesion GPCRs, member of the class B2 family of seven-transmembrane G protein-coupled receptors. Group V adhesion GPCRs include orphan receptors GPR133, GPR144, and closely related proteins. The function of GPR144 has not yet been characterized, whereas GPR133 is highly expressed in the pituitary gland and is coupled to the G(s) protein, leading to activation of the adenylate cyclase pathway. Moreover, genetic variations in GPR133 have been reported to be associated with adult height and heart rate. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in ligand recognition as well as cell-cell adhesion and cell-matrix interactions, linked by a stalk region to a class B seven-transmembrane domain. In addition, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. However, several adhesion GPCRs, including GPR111, GPR115, and CELSR1, are predicted to be non-cleavable at the GAIN domain because of the lack of a consensus catalytic triad sequence (His-Leu-Ser/Thr) within their GPS.	252
320600	cd15934	7tmC_mGluRs_group2_3	metabotropic glutamate receptors in group 2 and 3, member of the class C family of seven-transmembrane G protein-coupled receptors. The metabotropic glutamate receptors (mGluRs) are homodimeric class C G-protein coupled receptors which are activated by glutamate, the major excitatory neurotransmitter of the CNS. The mGluRs are involved in regulating neuronal excitability and synaptic transmission via intracellular activation of second messenger signaling pathways. While the ionotropic glutamate receptor subtypes (AMPA, NMDA, and kainate) mediate fast excitatory postsynaptic transmission, mGluRs are known to mediate slower excitatory postsynaptic responses and to be involved in synaptic plasticity in the mammalian brain. In addition to seven-transmembrane helices, the class C GPCRs are characterized by a large N-terminal extracellular Venus flytrap-like domain, which is composed of two adjacent lobes separated by a cleft which binds an endogenous ligand. Moreover, they exist as either homo- or heterodimers, which are essential for their function. For instance, mGluRs form homodimers via interactions between the N-terminal Venus flytrap domains and the intermolecular disulphide bonds between cysteine residues located in the cysteine-rich domain (CRD). At least eight different subtypes of metabotropic receptors (mGluR1-8) have been identified and further classified into three groups based on their sequence homology, pharmacological properties, and signaling pathways. Group 1 (mGluR1 and mGluR5) receptors are predominantly located postsynaptically on neurons and are involved in long-term synaptic plasticity in the brain, including long-term potentiation (LTP) in the hippocampus and long-term depression (LTD) in the cerebellum. They are coupled to G(q/11) proteins, thereby activating phospholipase C to generate inositol-1,4,5-trisphosphate (IP3) and diacylglycerol (DAG), which in turn lead to Ca2+ release and protein kinase C activation, respectively. Group 1 mGluR expression is shown to be strongly upregulated in animal models of epilepsy, brain injury, and inflammatory and neuropathic pain, as well as in patients with amyotrophic lateral sclerosis or multiple sclerosis. Group 2 (mGluR2 and mGluR3) and 3 (mGluR4, mGluR6, mGluR7, and mGluR8) receptors are predominantly localized presynaptically in the active region of neurotransmitter release. They are coupled to G(i/o) proteins, which leads to inhibition of adenylate cyclase activity and cAMP formation, and consequently to a decrease in protein kinase A (PKA) activity. Ultimately, activation of these receptors leads to inhibition of the release of neurotransmitters such as glutamate and GABA via inhibition of Ca2+ channels and activation of K+ channels. Furthermore, while activation of Group 1 mGluRs increases NMDA (N-methyl-D-aspartate) receptor activity and risk of neurotoxicity, Group 2 and 3 mGluRs decrease NMDA receptor activity and prevent neurotoxicity.	252
320601	cd15935	7tmA_OR4Q3-like	olfactory receptor 4Q3 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor 4Q3 and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	268
320602	cd15936	7tmA_OR4D-like	olfactory receptor 4D and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 4D and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	267
320603	cd15937	7tmA_OR4N-like	olfactory receptor 4N, 4M, and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 4N, 4M, and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	267
320604	cd15938	7tmA_OR4Q2-like	olfactory receptor 4Q2 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor 4Q2 and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	265
320605	cd15939	7tmA_OR4A-like	olfactory receptor 4A and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 4A, 4C, 4P, 4S, 4X and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	267
320606	cd15940	7tmA_OR4E-like	olfactory receptor 4E and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 4E and related proteins in other mammals, sauropsids, and amphibians. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	267
320607	cd15941	7tmA_OR10S1-like	olfactory receptor subfamily 10S1 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor 10S1 and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	277
320608	cd15942	7tmA_OR10G6-like	olfactory receptor subfamily 10G6 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor 10G6 and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	275
320609	cd15943	7tmA_OR5AP2-like	olfactory receptor subfamily 5AP2 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 5AP2 and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	295
320610	cd15944	7tmA_OR5AR1-like	olfactory receptor subfamily 5AR1 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 5AR1 and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	294
320611	cd15945	7tmA_OR5C1-like	olfactory receptor subfamily 5C1 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 5C1 and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	292
320612	cd15946	7tmA_OR1330-like	olfactory receptor 1330 and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes olfactory receptors 1330 from mouse, Olr859 from rat, and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	270
320613	cd15947	7tmA_OR2B-like	olfactory receptor subfamily 2B and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor family 2 (subfamilies 2B, 2C, 2G, 2H, 2I, 2J, 2W, 2Y) and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	270
320614	cd15948	7tmA_OR52K-like	olfactory receptor subfamily 52K and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 52K and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	277
320615	cd15949	7tmA_OR52M-like	olfactory receptor subfamily 52M and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 52M and related proteins in other mammals, sauropsids, and amphibians. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	292
320616	cd15950	7tmA_OR52I-like	olfactory receptor subfamily 52I and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 52I and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	275
320617	cd15951	7tmA_OR52R_52L-like	olfactory receptor subfamily 52R, 52L, and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamilies 52R, 52L and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	275
320618	cd15952	7tmA_OR52E-like	olfactory receptor subfamily 52E and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 52E and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	274
341354	cd15953	7tmA_OR52P-like	olfactory receptor subfamily 52P and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 52P and related proteins in other mammals, sauropsids and amphibians. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	275
320620	cd15954	7tmA_OR52N-like	olfactory receptor subfamily 52N and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 52N and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	276
320621	cd15955	7tmA_OR52A-like	olfactory receptor subfamily 52A and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 52A and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	276
320622	cd15956	7tmA_OR52W-like	olfactory receptor subfamily 52W and related proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes human olfactory receptor subfamily 52W and related proteins in other mammals and sauropsids. Olfactory receptors (ORs) play a central role in olfaction, the sense of smell. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs. Binding of an odorant to the olfactory receptor induces a conformational change that leads to the activation of the olfactory-specific G protein (Golf). The G protein (Golf and/or Gs) in turn stimulates adenylate cyclase to make cAMP. The cAMP opens cyclic nucleotide-gated ion channels, which allow the influx of calcium and sodium ions, resulting in depolarization of the olfactory receptor neuron and triggering an action potential which transmits this information to the brain.  A consensus nomenclature system based on evolutionary divergence is used here to classify the olfactory receptor family. The nomenclature begins with the root name OR, followed by an integer representing a family, a letter denoting a subfamily, and an integer representing the individual gene within the subfamily.	275
341355	cd15957	7tmA_Beta2_AR	beta-2 adrenergic receptors (adrenoceptors), member of the class A family of seven-transmembrane G protein-coupled receptors. Beta-2 AR is activated by adrenaline and plays important roles in cardiac function and pulmonary physiology. While beta-1 AR and beta-2 AR are the major subtypes involved in modulating cardiac contractility and heart rate by positively stimulating the G(s) protein-adenylate cyclase-cAMP-PKA signaling pathway, beta-2 AR can couple to both G(s) and G(i) proteins in the heart. Moreover, beta-2 AR activation leads to smooth muscle relaxation and bronchodilation in the lung. The beta adrenergic receptors are a subfamily of the class A rhodopsin-like G protein-coupled receptors.	301
320624	cd15958	7tmA_Beta1_AR	beta-1 adrenergic receptors (adrenoceptors), member of the class A family of seven-transmembrane G protein-coupled receptors. The beta-1 adrenergic receptor (beta-1 adrenoceptor), also known as beta-1 AR, is activated by adrenaline (epinephrine) and plays important roles in regulating cardiac function and heart rate. The human heart contains three subtypes of the beta AR: beta-1 AR, beta-2 AR, and beta-3 AR. Beta-1 AR and beta-2 AR, which are expressed at a ratio of about 70:30, are the major subtypes involved in modulating cardiac contractility and heart rate by positively stimulating the G(s) protein-adenylate cyclase-cAMP-PKA signaling pathway. In contrast, beta-3 AR produces negative inotropic effects by activating inhibitory G(i) proteins. The aberrant expression of beta-ARs can lead to cardiac dysfunction such as arrhythmias or heart failure.	298
320625	cd15959	7tmA_Beta3_AR	beta-3 adrenergic receptors (adrenoceptors), member of the class A family of seven-transmembrane G protein-coupled receptors. The beta-3 adrenergic receptor (beta-3 adrenoceptor), also known as beta-3 AR, is activated by adrenaline and plays important roles in regulating cardiac function and heart rate. The human heart contains three subtypes of the beta AR: beta-1 AR, beta-2 AR, and beta-3 AR. Beta-1 AR and beta-2 AR, which are expressed at a ratio of about 70:30, are the major subtypes involved in modulating cardiac contractility and heart rate by positively stimulating the G(s) protein-adenylate cyclase-cAMP-PKA signaling pathway. In contrast, beta-3 AR produces negative inotropic effects by activating inhibitory G(i) proteins. The aberrant expression of beta-ARs can lead to cardiac dysfunction such as arrhythmias or heart failure.	302
320626	cd15960	7tmA_GPR185-like	G protein-coupled receptor 185 and similar proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR185, also called GPRx, is a member of the constitutively active GPR3/6/12 subfamily of G protein-coupled receptors. It plays a role in the maintenance of meiotic arrest in Xenopus laevis oocytes through G(s) protein, which leads to increased cAMP levels. In Xenopus laevis, GPR185 is primarily expressed in brain, ovary, and testis; however, its ortholog has not been identified in other vertebrate genomes. GPR3, GPR6, and GPR12 form a subfamily of constitutively active G-protein coupled receptors with dual coupling to G(s) and G(i) proteins. These three orphan receptors are involved in the regulation of cell proliferation and survival, neurite outgrowth, cell clustering, and maintenance of meiotic prophase arrest.	268
320627	cd15961	7tmA_GPR12	G protein-coupled receptor 12, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR3, GPR6, and GPR12 form a subfamily of constitutively active G-protein coupled receptors with dual coupling to G(s) and G(i) proteins. These three orphan receptors are involved in the regulation of cell proliferation and survival, neurite outgrowth, cell clustering, and maintenance of meiotic prophase arrest. They constitutively activate adenylate cyclase to a similar degree as that seen with fully activated G(s)-coupled receptors, and are also able to constitutively activate inhibitory G(i/o) proteins. Lysophospholipids such as sphingosine 1-phosphate (S1P) and sphingosylphosphorylcholine have been detected as the high-affinity ligands for Gpr6 and Gpr12, respectively, which show high sequence homology with GPR3.	268
320628	cd15962	7tmA_GPR6	G protein-coupled receptor 6, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR3, GPR6, and GPR12 form a subfamily of constitutively active G-protein coupled receptors with dual coupling to G(s) and G(i) proteins. These three orphan receptors are involved in the regulation of cell proliferation and survival, neurite outgrowth, cell clustering, and maintenance of meiotic prophase arrest. They constitutively activate adenylate cyclase to a similar degree as that seen with fully activated G(s)-coupled receptors, and are also able to constitutively activate inhibitory G(i/o) proteins. Lysophospholipids such as sphingosine 1-phosphate (S1P) and sphingosylphosphorylcholine have been detected as the high-affinity ligands for Gpr6 and Gpr12, respectively, which show high sequence homology with GPR3.	268
320629	cd15963	7tmA_GPR3	G protein-coupled receptor 3, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR3, GPR6, and GPR12 form a subfamily of constitutively active G-protein coupled receptors with dual coupling to G(s) and G(i) proteins. These three orphan receptors are involved in the regulation of cell proliferation and survival, neurite outgrowth, cell clustering, and maintenance of meiotic prophase arrest. They constitutively activate adenylate cyclase to a similar degree as that seen with fully activated G(s)-coupled receptors, and are also able to constitutively activate inhibitory G(i/o) proteins. Lysophospholipids such as sphingosine 1-phosphate (S1P) and sphingosylphosphorylcholine have been detected as the high-affinity ligands for Gpr6 and Gpr12, respectively, which show high sequence homology with GPR3.	268
320630	cd15964	7tmA_TSH-R	thyroid-stimulating hormone receptor (or thyrotropin receptor), member of the class A family of seven-transmembrane G protein-coupled receptors. The glycoprotein hormone receptors are seven transmembrane domain receptors with a very large extracellular N-terminal domain containing many leucine-rich repeats responsible for hormone recognition and binding. The glycoprotein hormone family includes the three gonadotropins: luteinizing hormone (LH), follicle-stimulating hormone (FSH), chorionic gonadotropin (CG), and a pituitary thyroid-stimulating hormone (TSH). The glycoprotein hormones exert their biological functions by interacting with their cognate GPCRs. Both LH and CG bind to the same receptor, the luteinizing hormone-choriogonadotropin receptor (LHCGR); FSH binds to FSH-R and TSH to TSH-R. TSH-R plays an important role in thyroid physiology, and its activation stimulates the production of thyroxine (T4) and triiodothyronine (T3). Defects in TSH-R are a cause of several types of hyperthyroidism. The receptor is predominantly found on the surface of the thyroid epithelial cells and couples to the G(s)-protein and activates adenylate cyclase, thereby promoting cAMP production. TSH and cAMP stimulate thyroid cell proliferation, differentiation, and function.	275
320631	cd15965	7tmA_RXFP1_LGR7	relaxin receptor 1 (or LGR7), member of the class A family of seven-transmembrane G protein-coupled receptors. Relaxin is a member of the insulin superfamily that has diverse actions in both reproductive and non-reproductive tissues. The relaxin-like peptide family includes relaxin-1, relaxin-2, and the insulin-like (INSL) peptides such as INSL3, INSL4, INSL5 and INSL6. The relaxin family peptides share high structural but low sequence similarity, and exert their physiological functions by activating a group of four G protein-coupled receptors, RXFP1-4. Relaxin is the endogenous ligand for RXFP1, which has a large extracellular N-terminal domain containing 10 leucine-rich repeats and a unique low-density lipoprotein type A (LDLa) module which is necessary for receptor activation. Upon receptor binding, relaxin activates a variety of signaling pathways to produce second messengers such as cAMP and nitric oxide. RXFP1 is expressed in various tissues including uterus, ovary, placenta, cerebral cortex, heart, lung and kidney, among others.	287
320632	cd15966	7tmA_RXFP2_LGR8	relaxin receptor 2 (or LGR8), member of the class A family of seven-transmembrane G protein-coupled receptors. Relaxin is a member of the insulin superfamily that has diverse actions in both reproductive and non-reproductive tissues. The relaxin-like peptide family includes relaxin-1, relaxin-2, and the insulin-like (INSL) peptides such as INSL3, INSL4, INSL5 and INSL6. The relaxin family peptides share high structural similarity, but low sequence similarity, and exert their physiological functions by activating a group of four G protein-coupled receptors, RXFP1-4. INSL3 is the endogenous ligand for RXFP2, which couples to the G(s) protein to increase intracellular cAMP levels, but also to the GoB protein to decrease cAMP formation. RXFP2 (or LGR8) is expressed in various tissues including the brain, kidney, muscle, testis, thyroid, uterus, and peripheral blood cells, among others.	287
320633	cd15967	7tmA_P2Y1-like	P2Y purinoceptor 1-like. P2Y1-like is an uncharacterized group that is phylogenetically related to a family of purinergic G protein-coupled receptors. The P2Y receptor family is composed of eight subtypes, which are activated by naturally occurring extracellular nucleotides such as ATP, ADP, UTP, UDP, and UDP-glucose. These eight receptors are ubiquitous in human tissues and can be further classified into two subfamilies based on sequence homology and second messenger coupling: a subfamily of five P2Y1-like receptors (P2Y1, P2Y2, P2Y4, P2Y6, and P2Y11Rs) that are coupled to G(q) protein to activate phospholipase C (PLC) and a second subfamily of three P2Y12-like receptors (P2Y12, P2YR13, and P2Y14Rs) that are coupled to G(i) protein to inhibit adenylate cyclase. Several cloned subtypes, such as P2Y3, P2Y5, and P2Y7-10, are not functional mammalian nucleotide receptors. The native agonists for P2Y receptors are: ATP (P2Y2, P2Y12), ADP (P2Y1, P2Y12, and P2Y13), UTP (P2Y2, P2Y4), UDP (P2Y6, P2Y14), and UDP-glucose (P2Y14).	281
320634	cd15968	7tmA_P2Y6_P2Y3-like	P2Y purinoceptors 6 and 3, and similar proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. This group includes P2Y receptor 6 (P2Y6), P2Y3, and P2Y3-like proteins. These receptors belong to the G(i) class of a family of purinergic G-protein coupled receptors. In the CNS, P2Y6 plays a role in microglia activation and phagocytosis, and is involved in the secretion of interleukin from monocytes and macrophages in the immune system. The P2Y receptor family is composed of eight subtypes, which are activated by naturally occurring extracellular nucleotides such as ATP, ADP, UTP, UDP, and UDP-glucose. These eight receptors are ubiquitous in human tissues and can be further classified into two subfamilies based on sequence homology and second messenger coupling: a subfamily of five P2Y1-like receptors (P2Y1, P2Y2, P2Y4, P2Y6, and P2Y11Rs) that are coupled to G(q) protein to activate phospholipase C (PLC) and a second subfamily of three P2Y12-like receptors (P2Y12, P2YR13, and P2Y14Rs) that are coupled to G(i) protein to inhibit adenylate cyclase. Several cloned subtypes, such as P2Y3, P2Y5, and P2Y7-10, are not functional mammalian nucleotide receptors. The native agonists for P2Y receptors are: ATP (P2Y2, P2Y12), ADP (P2Y1, P2Y12, and P2Y13), UTP (P2Y2, P2Y4), UDP (P2Y6, P2Y14), and UDP-glucose (P2Y14).	285
320635	cd15969	7tmA_GPR87	G protein-coupled receptor 87, member of the class A family of seven-transmembrane G protein-coupled receptors. GPR87 acts as one of multiple receptors for lysophosphatidic acid (LPA). This orphan receptor has been shown to be over-expressed in several malignant tumors including lung squamous cell carcinoma and regulated by p53. GPR87 is phylogenetically closely related to the G(i) class of the P2Y family of purinergic G protein-coupled receptors. P2Y receptor family is composed of eight subtypes, which are activated by naturally occurring extracellular nucleotides such as ATP, ADP, UTP, UDP, and UDP-sugars. These eight receptors are ubiquitous in human tissues and can be further classified into two subfamilies based on sequence homology and second messenger coupling: a subfamily of five P2Y1-like receptors (P2Y1, P2Y2, P2Y4, P2Y6, and P2Y11Rs) that are coupled to G(q) protein to activate phospholipase C (PLC) and a second subfamily of three P2Y12-like receptors (P2Y12, P2YR13, and P2Y14Rs) that are coupled to G(i) protein to inhibit adenylate cyclase.	283
320636	cd15970	7tmA_SSTR1	somatostatin receptor type 1, member of the class A family of seven-transmembrane G protein-coupled receptors. G protein-coupled somatostatin receptors (SSTRs) are composed of five distinct subtypes (SSTR1-5) that display strong sequence similarity with opioid receptors. All five receptor subtypes bind the natural somatostatin (somatotropin release inhibiting factor), a polypeptide hormone that regulates a wide variety of physiological functions such as neurotransmission, cell proliferation, contractility of smooth muscle cells, and endocrine signaling as well as inhibition of the release of many secondary hormones. SSTR1 is coupled to a Na/H exchanger, voltage-dependent calcium channels, and AMPA/kainate glutamate channels. SSTR1 is expressed in the normal human pituitary and in nearly half of all pituitary adenoma subtypes.	276
320637	cd15971	7tmA_SSTR2	somatostatin receptor type 2, member of the class A family of seven-transmembrane G protein-coupled receptors. G protein-coupled somatostatin receptors (SSTRs), which display strong sequence similarity with opioid receptors, bind somatostatin, a polypeptide hormone that regulates a wide variety of physiological functions such as neurotransmission, endocrine secretion, cell proliferation, and smooth muscle contractility. SSTRs are composed of five distinct subtypes (SSTR1-5) which are encoded by separate genes on different chromosomes. SSTR2 plays critical roles in growth hormone secretion, glucagon secretion, and immune responses. SSTR2 is expressed in the normal human pituitary and in nearly all pituitary growth hormone adenomas.	279
320638	cd15972	7tmA_SSTR3	somatostatin receptor type 3, member of the class A family of seven-transmembrane G protein-coupled receptors. G protein-coupled somatostatin receptors (SSTRs) are composed of five distinct subtypes (SSTR1-5) that display strong sequence similarity with opioid receptors. All five receptor subtypes bind the natural somatostatin (somatotropin release inhibiting factor), a polypeptide hormone that regulates a wide variety of physiological functions such as neurotransmission, cell proliferation, contractility of smooth muscle cells, and endocrine signaling as well as inhibition of the release of many secondary hormones. SSTR3 is coupled to inward rectifying potassium channels. SSTR3 plays critical roles in growth hormone secretion, endothelial cell cycle arrest and apoptosis. Furthermore, SSTR3 is expressed in the normal human pituitary and in nearly half of pituitary growth hormone adenomas.	279
320639	cd15973	7tmA_SSTR4	somatostatin receptor type 4, member of the class A family of seven-transmembrane G protein-coupled receptors. G protein-coupled somatostatin receptors (SSTRs) are composed of five distinct subtypes (SSTR1-5) that display strong sequence similarity with opioid receptors. All five receptor subtypes bind the natural somatostatin (somatotropin release inhibiting factor), a polypeptide hormone that regulates a wide variety of physiological functions such as neurotransmission, cell proliferation, contractility of smooth muscle cells, and endocrine signaling as well as inhibition of the release of many secondary hormones. SSTR4 plays a critical role in mediating inflammation. Unlike other SSTRs, SSTR4 subtype is not detected in all pituitary adenomas while it is expressed in the normal human pituitary.	274
320640	cd15974	7tmA_SSTR5	somatostatin receptor type 5, member of the class A family of seven-transmembrane G protein-coupled receptors. G protein-coupled somatostatin receptors (SSTRs) are composed of five distinct subtypes (SSTR1-5) that display strong sequence similarity with opioid receptors. All five receptor subtypes bind the natural somatostatin (somatotropin release inhibiting factor), a polypeptide hormone that regulates a wide variety of physiological functions such as neurotransmission, cell proliferation, contractility of smooth muscle cells, and endocrine signaling as well as inhibition of the release of many secondary hormones. SSTR5 is coupled to inward rectifying K channels and phospholipase C, and plays critical roles in growth hormone and insulin secretion. SSTR5 acts as a negative regulator of PDX-1 (pancreatic and duodenal homeobox-1) expression, which is a conserved homeodomain-containing beta cell-specific transcription factor essentially involved in pancreatic development, among many other functions.	277
320641	cd15975	7tmA_ET-AR	endothelin A (or endothelin-1) receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. Endothelins are able to activate a number of signal transduction processes including phospholipase A2, phospholipase C, and phospholipase D, as well as cytosolic protein kinase activation. They play an important role in the regulation of the cardiovascular system and are the most potent vasoconstrictors identified, stimulating cardiac contraction, regulating the release of vasoactive substances, and stimulating mitogenesis in blood vessels. Two endothelin receptor subtypes have been isolated and identified in vertebrates, endothelin A receptor (ET-A) and endothelin B receptor (ET-B), and are members of the seven transmembrane class A G-protein coupled receptor family which activate multiple effectors via different types of G protein. Some vertebrates contain a third subtype, endothelin C receptor (ET-C). ET-A receptors are mainly located on vascular smooth muscle cells, whereas ET-B receptors are present on endothelial cells lining the vessel wall. Endothelin receptors have also been found in the brain.	300
320642	cd15976	7tmA_ET-BR	endothelin B receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. Endothelins are able to activate a number of signal transduction processes including phospholipase A2, phospholipase C, and phospholipase D, as well as cytosolic protein kinase activation. They play an important role in the regulation of the cardiovascular system and are the most potent vasoconstrictors identified, stimulating cardiac contraction, regulating the release of vasoactive substances, and stimulating mitogenesis in blood vessels. Two endothelin receptor subtypes have been isolated and identified in vertebrates, endothelin A receptor (ET-A) and endothelin B receptor (ET-B), and are members of the seven transmembrane class A G-protein coupled receptor family which activate multiple effectors via different types of G protein. Some vertebrates contain a third subtype, endothelin C receptor (ET-C). ET-A receptors are mainly located on vascular smooth muscle cells, whereas ET-B receptors are present on endothelial cells lining the vessel wall. Endothelin receptors have also been found in the brain.	296
320643	cd15977	7tmA_ET-CR	endothelin C receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. Endothelins are able to activate a number of signal transduction processes including phospholipase A2, phospholipase C, and phospholipase D, as well as cytosolic protein kinase activation. They play an important role in the regulation of the cardiovascular system and are the most potent vasoconstrictors identified, stimulating cardiac contraction, regulating the release of vasoactive substances, and stimulating mitogenesis in blood vessels. Two endothelin receptor subtypes have been isolated and identified in vertebrates, endothelin A receptor (ET-A) and endothelin B receptor (ET-B), and are members of the seven transmembrane class A G-protein coupled receptor family which activate multiple effectors via different types of G protein. Some vertebrates contain a third subtype, endothelin C receptor (ET-C). ET-A receptors are mainly located on vascular smooth muscle cells, whereas ET-B receptors are present on endothelial cells lining the vessel wall. Endothelin receptors have also been found in the brain. The ET-C receptor is specific for endothelin-3 on frog dermal melanophores; its activation causes dispersion of pigment granules.	296
320644	cd15978	7tmA_CCK-AR	cholecystokinin receptor type A, member of the class A family of seven-transmembrane G protein-coupled receptors. Cholecystokinin receptors (CCK-AR and CCK-BR) are a group of G-protein coupled receptors which bind the peptide hormones cholecystokinin (CCK) or gastrin. CCK, which facilitates digestion in the small intestine, and gastrin, a major regulator of gastric acid secretion, are highly similar peptides. Like gastrin, CCK is a naturally-occurring linear peptide that is synthesized as a preprohormone, then proteolytically cleaved to form a family of peptides with the common C-terminal sequence (Gly-Trp-Met-Asp-Phe-NH2), which is required for full biological activity. CCK-AR (type A, alimentary; also known as CCK1R) is found abundantly on pancreatic acinar cells and binds only sulfated CCK-peptides with very high affinity, whereas CCK-BR (type B, brain; also known as CCK2R), the predominant form in the brain and stomach, binds CCK or gastrin and discriminates poorly between sulfated and non-sulfated peptides. CCK is implicated in regulation of digestion, appetite control, and body weight, and is involved in neurogenesis via CCK-AR. There is some evidence to support that CCK and gastrin, via their receptors, are involved in promoting cancer development and progression, acting as growth and invasion factors.	278
320645	cd15979	7tmA_CCK-BR	cholecystokinin receptor type B, member of the class A family of seven-transmembrane G protein-coupled receptors. Cholecystokinin receptors (CCK-AR and CCK-BR) are a group of G-protein coupled receptors which bind the peptide hormones cholecystokinin (CCK) or gastrin. CCK, which facilitates digestion in the small intestine, and gastrin, a major regulator of gastric acid secretion, are highly similar peptides. Like gastrin, CCK is a naturally-occurring linear peptide that is synthesized as a preprohormone, then proteolytically cleaved to form a family of peptides with the common C-terminal sequence (Gly-Trp-Met-Asp-Phe-NH2), which is required for full biological activity. CCK-AR (type A, alimentary; also known as CCK1R) is found abundantly on pancreatic acinar cells and binds only sulfated CCK-peptides with very high affinity, whereas CCK-BR (type B, brain; also known as CCK2R), the predominant form in the brain and stomach, binds CCK or gastrin and discriminates poorly between sulfated and non-sulfated peptides. CCK is implicated in regulation of digestion, appetite control, and body weight, and is involved in neurogenesis via CCK-AR. There is some evidence to support that CCK and gastrin, via their receptors, are involved in promoting cancer development and progression, acting as growth and invasion factors.	275
320646	cd15980	7tmA_NPFFR2	neuropeptide FF receptor 2, member of the class A family of seven-transmembrane G protein-coupled receptors. Neuropeptide FF (NPFF) is a mammalian octapeptide that belongs to a family of neuropeptides containing an RF-amide motif at their C-terminus that have been implicated in a wide range of physiological functions in the brain including pain sensitivity, insulin release, food intake, memory, blood pressure, and opioid-induced tolerance and hyperalgesia. The effects of these peptides are mediated through neuropeptide FF1 and FF2 receptors (NPFF1-R and NPFF2-R) which are predominantly expressed in the brain. NPFF induces pro-nociceptive effects, mainly through the NPFF1-R, and anti-nociceptive effects, mainly through the NPFF2-R. NPFF has been shown to inhibit adenylate cyclase via the Gi protein coupled to NPFF1-R.	299
320647	cd15981	7tmA_NPFFR1	neuropeptide FF receptor 1, member of the class A family of seven-transmembrane G protein-coupled receptors. Neuropeptide FF (NPFF) is a mammalian octapeptide that belongs to a family of neuropeptides containing an RF-amide motif at their C-terminus that have been implicated in a wide range of physiological functions in the brain including pain sensitivity, insulin release, food intake, memory, blood pressure, and opioid-induced tolerance and hyperalgesia. The effects of these peptides are mediated through neuropeptide FF1 and FF2 receptors (NPFF1-R and NPFF2-R) which are predominantly expressed in the brain. NPFF induces pro-nociceptive effects, mainly through the NPFF1-R, and anti-nociceptive effects, mainly through the NPFF2-R. NPFF has been shown to inhibit adenylate cyclase via the Gi protein coupled to NPFF1-R.	299
320648	cd15982	7tmB1_PTH2R	parathyroid hormone 2 receptor, member of the class B family of seven-transmembrane G protein-coupled receptors. The parathyroid hormone 2 receptor (PTH2R), one of the three subtypes of PTH receptor family, is found in mammals and fish, but not in chicken or frog. PTH2R is potently activated by tuberoinfundibular peptide-39 (TIP-39) but not by PTH-related peptide (PTHrP), a paracrine factor that regulates endochondral bone development. PTH, an endocrine hormone that regulates calcium homoeostasis and bone maintenance, strongly activates human PTH2R, but only weakly activates rat and zebrafish PTH2Rs. These results suggest that TIP-39 is a natural ligand for PTH2R. Conversely, PTH1R is activated by PTH and PTHrP, but not by TIP-39. The PTH family receptors are members of the B1 (or secretin-like) subfamily of class B GPCRs, which include receptors for polypeptide hormones of 27-141 amino-acid residues such as secretin, glucagon, glucagon-like peptide (GLP), and calcitonin gene-related peptide. These receptors contain the large N-terminal extracellular domain (ECD), which plays a critical role in hormone recognition by binding to the C-terminal portion of the peptide. On the other hand, the N-terminal segment of the hormone induces receptor activation by interacting with the receptor transmembrane domains and connecting extracellular loops, triggering intracellular signaling pathways.	289
320649	cd15983	7tmB1_PTH3R	parathyroid hormone 3 receptor, member of the class B family of seven-transmembrane G protein-coupled receptors. The parathyroid hormone 3 receptor (PTH3R), one of the three subtypes of PTH receptor family, is found in chicken and fish, but it is absent in mammals. On the other hand, the PTH1R is found in all vertebrate species, whereas PTH2R is found in mammals and fish, but not in chicken or frog. PTH1R is activated by two polypeptide ligands: PTH, an endocrine hormone that regulates calcium homoeostasis and bone maintenance, and PTH-related peptide (PTHrP), a paracrine factor that regulates endochondral bone development. PTH2R is potently activated by tuberoinfundibular peptide-39 (TIP-39), but not by PTHrP. PTH also strongly activates human PTH2R, but only weakly activates rat and zebrafish PTH2Rs, suggesting that TIP-39 is a natural ligand for PTH2R. Conversely, PTH3R binds and responds to both PTH and PTHrP, but not the TIP-39. The PTH family receptors are members of the B1 (or secretin-like) subfamily of class B GPCRs, which include receptors for polypeptide hormones of 27-141 amino-acid residues such as secretin, glucagon, glucagon-like peptide (GLP), and calcitonin gene-related peptide. These receptors contain the large N-terminal extracellular domain (ECD), which plays a critical role in hormone recognition by binding to the C-terminal portion of the peptide. On the other hand, the N-terminal segment of the hormone induces receptor activation by interacting with the receptor transmembrane domains and connecting extracellular loops, triggering intracellular signaling pathways.	285
320650	cd15984	7tmB1_PTH1R	parathyroid hormone 1 receptor, member of the class B family of seven-transmembrane G protein-coupled receptors. The parathyroid hormone (PTH) receptor family has three subtypes: PTH1R, PTH2R and PTH3R. PTH1R is expressed in bone and kidney and is activated by two polypeptide ligands: PTH, an endocrine hormone that regulates calcium homoeostasis and bone maintenance, and PTH-related peptide (PTHrP), a paracrine factor that regulates endochondral bone development. PTH1R couples predominantly to G(s)-protein that in turn activates adenylate cyclase thereby producing cAMP, but it can also couple to several G protein subtypes, including G(q/11), G(i/o), and G(12/13), resulting in activation of multiple intracellular signaling pathways. PTH1R is found in all vertebrate species, whereas PTH2R is found in mammals and fish, but not in chicken or frog. PTH3R is found in chicken and fish, but it is absent in mammals. The PTH receptors are members of the B1 (or secretin-like) subfamily of class B GPCRs, which include receptors for polypeptide hormones of 27-141 amino-acid residues such as secretin, glucagon, glucagon-like peptide (GLP), and calcitonin gene-related peptide. These receptors contain the large N-terminal extracellular domain (ECD), which plays a critical role in hormone recognition by binding to the C-terminal portion of the peptide. On the other hand, the N-terminal segment of the hormone induces receptor activation by interacting with the receptor transmembrane domains and connecting extracellular loops, triggering intracellular signaling pathways.	290
320651	cd15985	7tmB1_GlucagonR-like_1	uncharacterized group of glucagon receptor-like proteins, member of the class B family of seven-transmembrane G protein-coupled receptors. This group consists of uncharacterized proteins with similarity to members of the glucagon receptor family of G protein-coupled receptors, which include glucagon receptor (GCGR), and glucagon-like peptide-1 receptor (GLP1R), and GLP2R. The glucagon receptors are activated by the members of the glucagon (GCG) peptide family including GCG, glucagon-like peptide 1 (GLP1), and GLP2, which are derived from the large proglucagon precursor. GCGR regulates blood glucose levels by control of hepatic glycogenolysis and gluconeogenesis and by regulation of insulin secretion from the pancreatic beta-cells. Activation of GLP1R stimulates glucose-dependent insulin secretion from pancreatic beta cells, whereas activation of GLP2R stimulates intestinal epithelial proliferation and increases villus height in the small intestine. Receptors in this group belong to the B1 (or secretin-like) subfamily of class B GPCRs, which includes receptors for polypeptide hormones of 27-141 amino-acid residues such as secretin, calcitonin gene-related peptide, parathyroid hormone (PTH), and corticotropin-releasing factor. These receptors contain the large N-terminal extracellular domain (ECD), which plays a critical role in hormone recognition by binding to the C-terminal portion of the peptide. On the other hand, the N-terminal segment of the hormone induces receptor activation by interacting with the receptor transmembrane domains and connecting extracellular loops, triggering intracellular signaling pathways. All members of the B1 subfamily preferentially couple to G proteins of G(s) family, which positively stimulate adenylate cyclase, leading to increased intracellular cAMP formation and calcium influx. However, depending on their cellular location, GCGR and GLP receptors can activate multiple G proteins, which can in turn stimulate different second messenger pathways.	280
320652	cd15986	7tmB1_VIP-R2	vasoactive intestinal polypeptide (VIP) receptor 2, member of the class B family of seven-transmembrane G protein-coupled receptors. Vasoactive intestinal peptide (VIP) receptor 2 is a member of the group of G protein-coupled receptors for structurally similar peptide hormones that also include secretin, growth-hormone-releasing hormone (GHRH), and pituitary adenylate cyclase activating polypeptide (PACAP). These receptors are classified into the subfamily B1 of class B GPCRs that consists of the classical hormone receptors and have been identified in all the vertebrates, from fishes to mammals, but are not present in plants, fungi, or prokaryotes. For all class B receptors, the large N-terminal extracellular domain plays a critical role in peptide hormone recognition. VIP and PACAP exert their effects through three G protein-coupled receptors, PACAP-R1, VIP-R1 (vasoactive intestinal receptor type 1, also known as VPAC1) and VIP-R2 (or VPAC2). PACAP-R1 binds only PACAP with high affinity, whereas VIP-R1 and -R2 specifically bind and respond to both VIP and PACAP. VIP and PACAP and their receptors are widely expressed in the brain and periphery. They are upregulated in neurons and immune cells in response to CNS injury and/or inflammation and exert potent anti-inflammatory effects, as well as play important roles in the control of circadian rhythms and stress responses, among many others. VIP-R1 is preferentially coupled to a stimulatory G(s) protein, which leads to the activation of adenylate cyclase and thereby increases in intracellular cAMP level. However, depending on its cellular location, VIP-R1 is also capable of coupling to additional G proteins such as G(q) protein, thus leading to the activation of phospholipase C and intracellular calcium influx.	269
320653	cd15987	7tmB1_PACAP-R1	pituitary adenylate cyclase-activating polypeptide type 1 receptor, member of the class B family of seven-transmembrane G protein-coupled receptors. Pituitary adenylate cyclase-activating polypeptide type 1 receptor (PACAP-R1) is a member of the group of G protein-coupled receptors for structurally similar peptide hormones that also include secretin, growth-hormone-releasing hormone (GHRH), and vasoactive intestinal peptide (VIP). These receptors are classified into the subfamily B1 of class B GPCRs that consists of the classical hormone receptors and have been identified in all the vertebrates, from fishes to mammals, but are not present in plants, fungi, or prokaryotes. For all class B receptors, the large N-terminal extracellular domain plays a critical role in peptide hormone recognition. VIP and PACAP exert their effects through three G protein-coupled receptors, PACAP-R1, VIP-R1 (vasoactive intestinal receptor type 1, also known as VPAC1) and VIP-R2 (or VPAC2). PACAP-R1 binds only PACAP with high affinity, whereas VIP-R1 and -R2 specifically bind and respond to both VIP and PACAP. VIP and PACAP and their receptors are widely expressed in the brain and periphery. They are upregulated in neurons and immune cells in response to CNS injury and/or inflammation and exert potent anti-inflammatory effects, as well as play important roles in the control of circadian rhythms and stress responses, among many others. PACAP-R1 is preferentially coupled to a stimulatory G(s) protein, which leads to the activation of adenylate cyclase and thereby increases in intracellular cAMP level.	268
320654	cd15988	7tmB2_BAI2	brain-specific angiogenesis inhibitor 2, a group VII adhesion GPCR, member of the class B2 family of seven-transmembrane G protein-coupled receptors. Brain-specific angiogenesis inhibitors (BAI1-3) constitute the group VII of cell-adhesion receptors that have been implicated in vascularization of glioblastomas. They belong to the B2 subfamily of class B GPCRs, are predominantly expressed in the brain, and are only present in vertebrates. Three BAIs, like all adhesion receptors, are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, that are coupled to a class B seven-transmembrane domain. For example, the BAI1 N-terminus contains an integrin-binding RGD (Arg-Gly-Asp) motif in addition to five thrombospondin type 1 repeats (TSRs), which are known to regulate the anti-angiogenic activity of thrombospondin-1, whereas BAI2 and BAI3 have four TSRs, but do not possess RGD motifs. The TSRs are functionally involved in cell attachment, activation of latent TGF-beta, inhibition of angiogenesis and endothelial cell migration. The TSRs of BAI1 mediate direct binding to phosphatidylserine, which enables both recognition and internalization of apoptotic cells by phagocytes. Thus, BAI1 functions as a phosphatidylserine receptor that forms a trimeric complex with ELMO and Dock180, leading to activation of Rac-GTPase which promotes the binding and phagocytosis of apoptotic cells. BAI3 can also interact with the ELMO-Dock180 complex to activate the Rac pathway and can also bind to secreted C1ql proteins of the C1Q complement family via its N-terminal TSRs. BAI3 and its ligands C1QL1 are highly expressed during synaptogenesis and are involved in synapse specificity. Moreover, BAI2 acts as a transcription repressor to regulate vascular endothelial growth factor (VEGF) expression through interaction with GA-binding protein gamma (GABP). The N-terminal extracellular domains of all three BAIs also contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain, which undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif to generate N- and C-terminal fragments (NTF and CTF), a putative hormone-binding domain (HBD), and multiple N-glycosylation sites. The C-terminus of each BAI subtype ends with a conserved Gln-Thr-Glu-Val (QTEV) motif known to interact with PDZ domain-containing proteins, but only BAI1 possesses a proline-rich region, which may be involved in protein-protein interactions.	291
320655	cd15989	7tmB2_BAI3	brain-specific angiogenesis inhibitor 3, a group VII adhesion GPCR, member of the class B2 family of seven-transmembrane G protein-coupled receptors. Brain-specific angiogenesis inhibitors (BAI1-3) constitute the group VII of cell-adhesion receptors that have been implicated in vascularization of glioblastomas. They belong to the B2 subfamily of class B GPCRs, are predominantly expressed in the brain, and are only present in vertebrates. Three BAIs, like all adhesion receptors, are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, that are coupled to a class B seven-transmembrane domain. For example, the BAI1 N-terminus contains an integrin-binding RGD (Arg-Gly-Asp) motif in addition to five thrombospondin type 1 repeats (TSRs), which are known to regulate the anti-angiogenic activity of thrombospondin-1, whereas BAI2 and BAI3 have four TSRs, but do not possess RGD motifs. The TSRs are functionally involved in cell attachment, activation of latent TGF-beta, inhibition of angiogenesis and endothelial cell migration. The TSRs of BAI1 mediate direct binding to phosphatidylserine, which enables both recognition and internalization of apoptotic cells by phagocytes. Thus, BAI1 functions as a phosphatidylserine receptor that forms a trimeric complex with ELMO and Dock180, leading to activation of Rac-GTPase which promotes the binding and phagocytosis of apoptotic cells. BAI3 can also interact with the ELMO-Dock180 complex to activate the Rac pathway and can also bind to secreted C1ql proteins of the C1Q complement family via its N-terminal TSRs. BAI3 and its ligands C1QL1 are highly expressed during synaptogenesis and are involved in synapse specificity. Moreover, BAI2 acts as a transcription repressor to regulate vascular endothelial growth factor (VEGF) expression through interaction with GA-binding protein gamma (GABP). The N-terminal extracellular domains of all three BAIs also contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain, which undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif to generate N- and C-terminal fragments (NTF and CTF), a putative hormone-binding domain (HBD), and multiple N-glycosylation sites. The C-terminus of each BAI subtype ends with a conserved Gln-Thr-Glu-Val (QTEV) motif known to interact with PDZ domain-containing proteins, but only BAI1 possesses a proline-rich region, which may be involved in protein-protein interactions.	293
320656	cd15990	7tmB2_BAI1	brain-specific angiogenesis inhibitor 1, a group VII adhesion GPCR, member of the class B2 family of seven-transmembrane G protein-coupled receptors. Brain-specific angiogenesis inhibitors (BAI1-3) constitute group VII of cell-adhesion receptors that have been implicated in vascularization of glioblastomas. They belong to the B2 subfamily of class B GPCRs, are predominantly expressed in the brain, and are only present in vertebrates. The three BAIs, like all adhesion receptors, are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, that are coupled to a class B seven-transmembrane domain. For example, the BAI1 N-terminus contains an integrin-binding RGD (Arg-Gly-Asp) motif in addition to five thrombospondin type 1 repeats (TSRs), which are known to regulate the anti-angiogenic activity of thrombospondin-1, whereas BAI2 and BAI3 have four TSRs but do not possess RGD motifs. The TSRs are functionally involved in cell attachment, activation of latent TGF-beta, inhibition of angiogenesis and endothelial cell migration. The TSRs of BAI1 mediate direct binding to phosphatidylserine, which enables both recognition and internalization of apoptotic cells by phagocytes. Thus, BAI1 functions as a phosphatidylserine receptor that forms a trimeric complex with ELMO and Dock180, leading to activation of Rac-GTPase, which promotes the binding and phagocytosis of apoptotic cells. BAI3 can also interact with the ELMO-Dock180 complex to activate the Rac pathway and can also bind to secreted C1ql proteins of the C1Q complement family via its N-terminal TSRs. BAI3 and its ligand C1QL1 are highly expressed during synaptogenesis and are involved in synapse specificity. Moreover, BAI2 acts as a transcription repressor to regulate vascular endothelial growth factor (VEGF) expression through interaction with GA-binding protein gamma (GABP). The N-terminal extracellular domains of all three BAIs also contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain, which undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif to generate N- and C-terminal fragments (NTF and CTF), a putative hormone-binding domain (HBD), and multiple N-glycosylation sites. The C-terminus of each BAI subtype ends with a conserved Gln-Thr-Glu-Val (QTEV) motif known to interact with PDZ domain-containing proteins, but only BAI1 possesses a proline-rich region, which may be involved in protein-protein interactions.	267
320657	cd15991	7tmB2_CELSR1	Cadherin EGF LAG seven-pass G-type receptor 1, member of the class B2 family of seven-transmembrane G protein-coupled receptors. The group IV adhesion GPCRs include the cadherin EGF LAG seven-pass G-type receptors (CELSRs) and their Drosophila homolog Flamingo (also known as Starry night). These receptors are also classified as belonging to the EGF-TM7 group of subfamily B2 adhesion GPCRs, because they contain EGF-like domains. Functionally, the group IV receptors act as key regulators of many physiological processes such as endocrine cell differentiation, neuronal migration, dendrite growth, axon guidance, lymphatic vessel and valve formation, and planar cell polarity (PCP) during embryonic development. Three mammalian orthologs of Flamingo, Celsr1-3, are widely expressed in the nervous system from embryonic development until the adult stage. Each Celsr exhibits different expression patterns in the developing brain, suggesting that they serve distinct functions. Mutations of CELSR1 cause neural tube defects in the nervous system, while mutations of CELSR2 are associated with coronary heart disease. Moreover, CELSR1 and several other PCP signaling molecules, such as dishevelled, prickle, and frizzled, have been shown to be upregulated in B lymphocytes of chronic lymphocytic leukemia patients. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, that are coupled to a class B seven-transmembrane domain. In the case of CELSR/Flamingo/Starry night, their extracellular domains comprise nine cadherin repeats linked to a series of epidermal growth factor (EGF)-like and laminin globular (G)-like domains. The cadherin repeats contain sequence motifs that mediate calcium-dependent cell-cell adhesion by homophilic interactions. Moreover, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions.	254
320658	cd15992	7tmB2_CELSR2	Cadherin EGF LAG seven-pass G-type receptor 2, member of the class B2 family of seven-transmembrane G protein-coupled receptors. The group IV adhesion GPCRs include the cadherin EGF LAG seven-pass G-type receptors (CELSRs) and their Drosophila homolog Flamingo (also known as Starry night). These receptors are also classified as belonging to the EGF-TM7 group of subfamily B2 adhesion GPCRs, because they contain EGF-like domains. Functionally, the group IV receptors act as key regulators of many physiological processes such as endocrine cell differentiation, neuronal migration, dendrite growth, axon guidance, lymphatic vessel and valve formation, and planar cell polarity (PCP) during embryonic development. Three mammalian orthologs of Flamingo, Celsr1-3, are widely expressed in the nervous system from embryonic development until the adult stage. Each Celsr exhibits different expression patterns in the developing brain, suggesting that they serve distinct functions. Mutations of CELSR1 cause neural tube defects in the nervous system, while mutations of CELSR2 are associated with coronary heart disease. Moreover, CELSR1 and several other PCP signaling molecules, such as dishevelled, prickle, and frizzled, have been shown to be upregulated in B lymphocytes of chronic lymphocytic leukemia patients. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, that are coupled to a class B seven-transmembrane domain. In the case of CELSR/Flamingo/Starry night, their extracellular domains comprise nine cadherin repeats linked to a series of epidermal growth factor (EGF)-like and laminin globular (G)-like domains. The cadherin repeats contain sequence motifs that mediate calcium-dependent cell-cell adhesion by homophilic interactions. Moreover, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions.	255
320659	cd15993	7tmB2_CELSR3	Cadherin EGF LAG seven-pass G-type receptor 3, member of the class B2 family of seven-transmembrane G protein-coupled receptors. The group IV adhesion GPCRs include the cadherin EGF LAG seven-pass G-type receptors (CELSRs) and their Drosophila homolog Flamingo (also known as Starry night). These receptors are also classified as belonging to the EGF-TM7 group of subfamily B2 adhesion GPCRs, because they contain EGF-like domains. Functionally, the group IV receptors act as key regulators of many physiological processes such as endocrine cell differentiation, neuronal migration, dendrite growth, axon guidance, lymphatic vessel and valve formation, and planar cell polarity (PCP) during embryonic development. Three mammalian orthologs of Flamingo, Celsr1-3, are widely expressed in the nervous system from embryonic development until the adult stage. Each Celsr exhibits different expression patterns in the developing brain, suggesting that they serve distinct functions. Mutations of CELSR1 cause neural tube defects in the nervous system, while mutations of CELSR2 are associated with coronary heart disease. Moreover, CELSR1 and several other PCP signaling molecules, such as dishevelled, prickle, and frizzled, have been shown to be upregulated in B lymphocytes of chronic lymphocytic leukemia patients. Celsr3 is expressed in both the developing and adult mouse brain. It has been functionally implicated in proper neuronal migration and axon guidance in the CNS. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, that are coupled to a class B seven-transmembrane domain. In the case of CELSR/Flamingo/Starry night, their extracellular domains comprise nine cadherin repeats linked to a series of epidermal growth factor (EGF)-like and laminin globular (G)-like domains. The cadherin repeats contain sequence motifs that mediate calcium-dependent cell-cell adhesion by homophilic interactions. Moreover, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions.	254
320660	cd15994	7tmB2_GPR111_115	orphan adhesion receptors GPR111 and GPR115, members of the class B2 family of seven-transmembrane G protein-coupled receptors. GPR111 and GPR115 are highly homologous orphan receptors that belong to group VI adhesion-GPCRs along with GPR110, GPR113, and GPR116. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in ligand recognition as well as cell-cell adhesion and cell-matrix interactions, linked by a stalk region to a class B seven-transmembrane domain. In addition, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions. However, several adhesion GPCRs, including GPR111, GPR115, and CELSR1, are predicted to be non-cleavable at the GAIN domain because of the lack of a consensus catalytic triad sequence (His-Leu-Ser/Thr) within their GPS. Both GPR111 and GPR115 are present only in land-living animals and are predominantly expressed in the developing skin.	267
320661	cd15995	7tmB2_GPR56	orphan adhesion receptor GPR56, member of the class B2 family of seven-transmembrane G protein-coupled receptors. GPR56 is an orphan receptor that has been classified as belonging to group VIII of adhesion GPCRs. Other members of group VIII include orphan GPCRs such as GPR64, GPR97, GPR112, GPR114, and GPR126. GPR56 is involved in the regulation of oligodendrocyte development and myelination in the central nervous system via coupling to G(12/13) proteins, which leads to the activation of RhoA GTPase. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, that are coupled to a class B seven-transmembrane domain. Furthermore, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions.	269
320662	cd15996	7tmB2_GPR126	orphan adhesion receptor GPR126, member of the class B2 family of seven-transmembrane G protein-coupled receptors. GPR126 is an orphan receptor that has been classified as belonging to group VIII of adhesion GPCRs. Other members of group VIII include orphan GPCRs such as GPR56, GPR64, GPR97, GPR112, and GPR114. GPR126 is required in Schwann cells for proper differentiation and myelination via G-protein activation. GPR126 is believed to couple to G(s)-protein, which leads to activation of adenylate cyclase for cAMP production. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, that are coupled to a class B seven-transmembrane domain. Furthermore, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions.	271
320663	cd15997	7tmB2_GPR112	Probable G protein-coupled receptor 112, member of the class B2 family of seven-transmembrane G protein-coupled receptors. GPR112 is an orphan receptor that has been classified as belonging to group VIII of adhesion GPCRs. Other members of group VIII include orphan GPCRs such as GPR56, GPR64, GPR97, GPR114, and GPR126. GPR112 is specifically expressed in normal enterochromaffin cells and gastrointestinal neuroendocrine carcinoma cells, but its biological function is unknown. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, that are coupled to a class B seven-transmembrane domain. Furthermore, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions.	269
320664	cd15998	7tmB2_GPR124	G protein-coupled receptor 124, member of the class B2 family of seven-transmembrane G protein-coupled receptors. GPR124 is an orphan receptor that has been classified as belonging to group III of adhesion GPCRs, which also includes orphan receptors GPR123 and GPR125. GPR124, also known as tumor endothelial marker 5 (TEM5), is highly expressed in tumor vessels and in the vasculature of the developing embryo. GPR124 is required for proper angiogenic sprouting into neural tissue, CNS-specific vascularization, and formation of the blood-brain barrier. GPR124 interacts with the PDZ domain of DLG1 (discs large homolog 1) through its PDZ-binding motif. Recently, studies of double-knockout mice showed that GPR124 functions as a co-activator of Wnt7a/Wnt7b-dependent beta-catenin signaling in brain endothelium. Moreover, WNT7-stimulated beta-catenin signaling is regulated by GPR124's intracellular PDZ-binding motif and leucine-rich repeats (LRR) in its N-terminal extracellular domain. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, that are coupled to a class B seven-transmembrane domain. Furthermore, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions.	268
320665	cd15999	7tmB2_GPR125	G protein-coupled receptor 125, member of the class B2 family of seven-transmembrane G protein-coupled receptors. GPR125 is an orphan receptor that has been classified as belonging to group III of adhesion GPCRs, which also includes orphan receptors GPR123 and GPR124. GPR125 directly interacts with dishevelled (Dvl) via its intracellular C-terminus, and together, GPR125 and Dvl recruit a subset of planar cell polarity (PCP) components into membrane subdomains, a prerequisite for activation of Wnt/PCP signaling. Thus, GPR125 influences the noncanonical WNT/PCP pathway, which does not involve beta-catenin, by interacting with Dvl and modulating its distribution. The adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, that are coupled to a class B seven-transmembrane domain. Furthermore, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions.	312
320666	cd16000	7tmB2_GPR123	G protein-coupled receptor 123, member of the class B2 family of seven-transmembrane G protein-coupled receptors. GPR123 is an orphan receptor that has been classified as belonging to group III of adhesion GPCRs, which also includes orphan receptors GPR124 and GPR125. GPR123 is predominantly expressed in the CNS, including the thalamus, brain stem, and regions containing large pyramidal cells, yet its biological function remains to be determined. Adhesion receptors are characterized by the presence of large N-terminal extracellular domains containing multiple adhesion motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, that are coupled to a class B seven-transmembrane domain. Furthermore, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions.	275
320667	cd16001	7tmA_P2Y3-like	P2Y purinoceptor 3-like proteins, member of the class A family of seven-transmembrane G protein-coupled receptors. P2Y3-like proteins are an uncharacterized group that belongs to the G(i) class of a family of purinergic G-protein coupled receptors. The P2Y receptor family is composed of eight subtypes, which are activated by naturally occurring extracellular nucleotides such as ATP, ADP, UTP, UDP, and UDP-glucose. These eight receptors are ubiquitous in human tissues and can be further classified into two subfamilies based on sequence homology and second messenger coupling: a subfamily of five P2Y1-like receptors (P2Y1, P2Y2, P2Y4, P2Y6, and P2Y11Rs) that are coupled to G(q) protein to activate phospholipase C (PLC) and a second subfamily of three P2Y12-like receptors (P2Y12, P2Y13, and P2Y14Rs) that are coupled to G(i) protein to inhibit adenylate cyclase. Several cloned subtypes, such as P2Y3, P2Y5, and P2Y7-10, are not functional mammalian nucleotide receptors. The native agonists for P2Y receptors are: ATP (P2Y2, P2Y12), ADP (P2Y1, P2Y12, and P2Y13), UTP (P2Y2, P2Y4), UDP (P2Y6, P2Y14), and UDP-glucose (P2Y14).	284
320668	cd16002	7tmA_NK1R	neurokinin 1 receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. The neurokinin 1 receptor (NK1R), also known as tachykinin receptor 1 (TACR1) or substance P receptor (SPR), is a G-protein coupled receptor found in the mammalian central nervous and peripheral nervous systems. The tachykinins (TKs) act as excitatory transmitters on neurons and cells in the gastrointestinal tract. The TKs are characterized by a common five-amino acid C-terminal sequence, Phe-X-Gly-Leu-Met-NH2, where X is a hydrophobic residue. The three major mammalian tachykinins are substance P (SP), neurokinin A (NKA), and neurokinin B (NKB). The physiological actions of tachykinins are mediated through three types of receptors: neurokinin receptor type 1 (NK1R), NK2R, and NK3R. SP is a high-affinity endogenous ligand for NK1R, which interacts with the Gq protein and activates phospholipase C, leading to elevation of intracellular calcium. SP is an extremely potent vasodilator that acts through an endothelium-dependent mechanism and is released from the autonomic sensory nerves. NK2R is a high-affinity receptor for NKA, the tachykinin neuropeptide substance K. SP and NKA are found in the enteric nervous system and mediate the regulation of gastrointestinal motility, secretion, vascular permeability, and pain perception.	284
320669	cd16003	7tmA_NKR_NK3R	neuromedin-K receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. The neuromedin-K receptor (NKR), also known as tachykinin receptor 3 (TACR3) or neurokinin B receptor or NK3R, is a G-protein coupled receptor that specifically binds to neurokinin B. The tachykinins (TKs) act as excitatory transmitters on neurons and cells in the gastrointestinal tract. The TKs are characterized by a common five-amino acid C-terminal sequence, Phe-X-Gly-Leu-Met-NH2, where X is a hydrophobic residue. The three major mammalian tachykinins are substance P (SP), neurokinin A (NKA), and neurokinin B (NKB). The physiological actions of tachykinins are mediated through three types of receptors: neurokinin receptor type 1 (NK1R), NK2R, and NK3R.  NK3R is activated by its high-affinity ligand, NKB, which is primarily involved in the central nervous system and plays a critical role in the regulation of gonadotropin hormone release and the onset of puberty.	282
320670	cd16004	7tmA_SKR_NK2R	substance-K receptor, member of the class A family of seven-transmembrane G protein-coupled receptors. The substance-K receptor (SKR), also known as tachykinin receptor 2 (TACR2) or neurokinin A receptor or NK2R, is a G-protein coupled receptor that specifically binds to neurokinin A. The tachykinins are widely distributed throughout the mammalian central and peripheral nervous systems and act as excitatory transmitters on neurons and cells in the gastrointestinal tract. The TKs are characterized by a common five-amino acid C-terminal sequence, Phe-X-Gly-Leu-Met-NH2, where X is a hydrophobic residue. The three major mammalian tachykinins are substance P (SP), neurokinin A (NKA), and neurokinin B (NKB). The physiological actions of tachykinins are mediated through three types of receptors: neurokinin receptor type 1 (NK1R), NK2R, and NK3R. SP is a high-affinity endogenous ligand for NK1R, which interacts with the Gq protein and activates phospholipase C, leading to elevation of intracellular calcium. NK2R is a high-affinity receptor for NKA, the tachykinin neuropeptide substance K. SP and NKA are found in the enteric nervous system and mediate the regulation of gastrointestinal motility, secretion, vascular permeability, and pain perception.	285
320671	cd16005	7tmB2_Latrophilin-3	Latrophilin-3, member of the class B2 family of seven-transmembrane G protein-coupled receptors. Latrophilins (also called lectomedins or latrotoxin receptors) belong to Group I adhesion GPCRs, which also include ETL (EGF-TM7-latrophilin-related protein). These receptors are members of the adhesion family (subclass B2) that belongs to the class B GPCRs. Three subtypes of latrophilins have been identified: LPH1 (latrophilin-1), LPH2, and LPH3. Latrophilin-1 is a brain-specific calcium-independent receptor of alpha-latrotoxin, a potent presynaptic neurotoxin from the venom of the black widow spider that induces massive neurotransmitter release from sensory and motor neurons as well as endocrine cells, leading to nerve-terminal degeneration. Latrophilin-2 and -3, although sharing strong sequence homology to latrophilin-1, do not bind alpha-latrotoxin. While latrophilin-3 is also brain specific, latrophilin-2 is ubiquitously distributed. The endogenous ligands for these two receptors are unknown. ETL, a seven-transmembrane receptor containing EGF-like repeats, is highly expressed in the heart, where it is developmentally regulated, as well as in normal smooth muscle cells. The function of ETL is unknown. All adhesion GPCRs possess large N-terminal extracellular domains containing multiple structural motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, coupled to a seven-transmembrane domain. In addition, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions.	258
320672	cd16006	7tmB2_Latrophilin-2	Latrophilin-2, member of the class B2 family of seven-transmembrane G protein-coupled receptors. Latrophilins (also called lectomedins or latrotoxin receptors) belong to Group I adhesion GPCRs, which also include ETL (EGF-TM7-latrophilin-related protein). These receptors are members of the adhesion family (subclass B2) that belongs to the class B GPCRs. Three subtypes of latrophilins have been identified: LPH1 (latrophilin-1), LPH2, and LPH3. Latrophilin-1 is a brain-specific calcium-independent receptor of alpha-latrotoxin, a potent presynaptic neurotoxin from the venom of the black widow spider that induces massive neurotransmitter release from sensory and motor neurons as well as endocrine cells, leading to nerve-terminal degeneration. Latrophilin-2 and -3, although sharing strong sequence homology to latrophilin-1, do not bind alpha-latrotoxin. While latrophilin-3 is also brain specific, latrophilin-2 is ubiquitously distributed. The endogenous ligands for these two receptors are unknown. ETL, a seven-transmembrane receptor containing EGF-like repeats, is highly expressed in the heart, where it is developmentally regulated, as well as in normal smooth muscle cells. The function of ETL is unknown. All adhesion GPCRs possess large N-terminal extracellular domains containing multiple structural motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, coupled to a seven-transmembrane domain. In addition, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions.	258
320673	cd16007	7tmB2_Latrophilin-1	Latrophilin-1, member of the class B2 family of seven-transmembrane G protein-coupled receptors. Latrophilins (also called lectomedins or latrotoxin receptors) belong to Group I adhesion GPCRs, which also include ETL (EGF-TM7-latrophilin-related protein). These receptors are members of the adhesion family (subclass B2) that belongs to the class B GPCRs. Three subtypes of latrophilins have been identified: LPH1 (latrophilin-1), LPH2, and LPH3. Latrophilin-1 is a brain-specific calcium-independent receptor of alpha-latrotoxin, a potent presynaptic neurotoxin from the venom of the black widow spider that induces massive neurotransmitter release from sensory and motor neurons as well as endocrine cells, leading to nerve-terminal degeneration. Latrophilin-2 and -3, although sharing strong sequence homology to latrophilin-1, do not bind alpha-latrotoxin. While latrophilin-3 is also brain specific, latrophilin-2 is ubiquitously distributed. The endogenous ligands for these two receptors are unknown. ETL, a seven-transmembrane receptor containing EGF-like repeats, is highly expressed in the heart, where it is developmentally regulated, as well as in normal smooth muscle cells. The function of ETL is unknown. All adhesion GPCRs possess large N-terminal extracellular domains containing multiple structural motifs, which play critical roles in cell-cell adhesion and cell-matrix interactions, coupled to a seven-transmembrane domain. In addition, almost all adhesion receptors, except GPR123, contain an evolutionarily conserved GPCR-autoproteolysis inducing (GAIN) domain that undergoes autoproteolytic processing at the GPCR proteolysis site (GPS) motif located immediately N-terminal to the first transmembrane region, to generate N- and C-terminal fragments (NTF and CTF), which may serve important biological functions.	258
293733	cd16009	PPM	Bacterial phosphopentomutase. Bacterial phosphopentomutases (PPMs) are alkaline phosphatase superfamily members that interconvert alpha-D-ribose 5-phosphate (ribose 5-phosphate) and alpha-D-ribose 1-phosphate (ribose 1-phosphate). This reaction bridges glucose metabolism and RNA biosynthesis. PPM is a Mn(2+)-dependent enzyme and protein phosphorylation activates the enzyme.	382
293734	cd16010	iPGM	2 3 bisphosphoglycerate independent phosphoglycerate mutase iPGM. The 2,3-diphosphoglycerate-independent phosphoglycerate mutase (iPGM) catalyzes the interconversion of 3-phosphoglycerate (3PGA) and 2-phosphoglycerate (2PGA). It is the predominant PGM in plants and some bacteria, including endospore-forming Gram-positive bacteria and their close relatives. The two-step catalysis consists of a phosphatase reaction that removes the phosphate from 2- or 3-phosphoglycerate, generating an enzyme-bound phosphoserine intermediate, followed by a phosphotransferase reaction in which the phosphate is transferred from the enzyme back to the glycerate moiety. The iPGM exists as a dimer, each monomer binding 2 magnesium atoms, which are essential for enzymatic activity.	503
293735	cd16011	iPGM_like	uncharacterized subfamily of alkaline phosphatase, homologous to 2 3 bisphosphoglycerate independent phosphoglycerate mutase (iPGM) and bacterial phosphopentomutases. The proteins in this subfamily of alkaline phosphatase are not characterized. Their sequences show similarity to 2 3 bisphosphoglycerate independent phosphoglycerate mutase (iPGM) which catalyzes the interconversion of 3-phosphoglycerate to 2-phosphoglycerate, and to bacterial phosphopentomutases (PPMs) which interconvert alpha-D-ribose 5-phosphate (ribose 5-phosphate) and alpha-D-ribose 1-phosphate (ribose 1-phosphate).	368
293736	cd16012	ALP	Alkaline Phosphatase. Alkaline phosphatases are non-specific membrane-bound phosphomonoesterases that catalyze the hydrolysis reaction via a phosphoseryl intermediate to produce inorganic phosphate and the corresponding alcohol, optimally at high pH. Alkaline phosphatase exists as a dimer, each monomer binding 2 zinc atoms and one magnesium atom, which are essential for enzymatic activity. Mammalian alkaline phosphatase is divided into four isozymes depending upon the site of tissue expression. They are Intestinal ALP, Placental ALP, Germ cell ALP and tissue nonspecific alkaline phosphatase or liver/bone/kidney (L/B/K) ALP.	283
293737	cd16013	AcpA	acid phosphatase A. Acid phosphatase A catalyzes the hydrolysis reaction via a phosphoseryl intermediate to produce inorganic phosphate and the corresponding alcohol, optimally at low pH. AcpA hydrolyzes a variety of substrates, including p-nitrophenylphosphate (pNPP), p-nitrophenylphosphorylcholine (pNPPC), peptides containing phosphotyrosine, inositol phosphates, AMP, ATP, fructose 1,6-bisphosphate, glucose and fructose 6-phosphates, NADP, and ribose 5-phosphate. AcpA is distinct from histidine ACPs and purple ACPs, as well as class A, B, and C bacterial nonspecific ACPs.	370
293738	cd16014	PLC	non-hemolytic phospholipase C. Nonhemolytic phospholipase C is produced by pathogenic bacteria. The toxic phospholipases C can interact with eukaryotic cell membranes and hydrolyze phosphatidylcholine and sphingomyelin, leading to cell lysis.	287
293739	cd16015	LTA_synthase	Lipoteichoic acid synthase like. Lipoteichoic acid (LTA) is an important cell wall polymer found in Gram-positive bacteria. It may contain long chains of ribitol or glycerol phosphate. LTA synthase catalyzes the reaction to extend the polymer by the repeated addition of glycerolphosphate (GroP) subunits to the end of the growing chain.	283
293740	cd16016	AP-SPAP	SPAP is a subclass of alkaline phosphatase (AP). Alkaline phosphatases are non-specific phosphomonoesterases that catalyze the hydrolysis reaction via a phosphoseryl intermediate to produce inorganic phosphate and the corresponding alcohol, optimally at high pH. Alkaline phosphatase exists as a dimer, each monomer binding 2 zinc atoms and one magnesium atom, which are essential for enzymatic activity. Although SPAP is a subclass of alkaline phosphatase, SPAP has many differences from other APs: 1) the catalytic residue is a threonine instead of serine, 2) there is no binding pocket for the third metal ion, and 3) the arginine residue forming bidentate hydrogen bonding is deleted in SPAP. A lysine and an asparagine residue, recruited together for the first time into the active site, bind the substrate phosphoryl group in a manner not observed before in any other AP.	457
293741	cd16017	LptA	Lipooligosaccharide Phosphoethanolamine Transferase A (LptA) or Lipid A Phosphoethanolamine Transferase. Lipooligosaccharide Phosphoethanolamine Transferase A (LptA) or Lipid A Phosphoethanolamine Transferase catalyzes the modification of the lipid A headgroups by phosphoethanolamine (PEA) or 4-amino-arabinose residues. Lipopolysaccharides, also called endotoxins, protect bacterial pathogens from antimicrobial peptides and have roles in virulence. The PEA modified lipid A increases resistance to the cationic cyclic polypeptide antibiotic, polymyxin. Lipid A PEA transferases usually consist of a transmembrane domain anchoring the enzyme to the periplasmic face of the cytoplasmic membrane.	288
293742	cd16018	Enpp	Ectonucleotide pyrophosphatase/phosphodiesterase, also called autotaxin. Ecto-nucleotide pyrophosphatases/phosphodiesterases (ENPPs) hydrolyze 5'-phosphodiester bonds in nucleotides and their derivatives, resulting in the release of 5'-nucleotide monophosphates. ENPPs have multiple physiological roles, including nucleotide recycling, modulation of purinergic receptor signaling, regulation of extracellular pyrophosphate levels, stimulation of cell motility, and possible roles in regulation of insulin receptor (IR) signaling and activity of ecto-kinases. The eukaryotic ENPP family contains at least five members that have different tissue distribution and physiological roles.	267
293743	cd16019	GPI_EPT	GPI ethanolamine phosphate transferase. Ethanolamine phosphate transferase is involved in glycosylphosphatidylinositol-anchor biosynthesis. It catalyzes the transfer of ethanolamine phosphate to the first alpha-1,4-linked mannose of the glycosylphosphatidylinositol precursor of the GPI anchor. It may act as a suppressor of replication stress and chromosome missegregation.	292
293744	cd16020	GPI_EPT_1	GPI ethanolamine phosphate transferase 1; PIG-N. Ethanolamine phosphate transferase is involved in glycosylphosphatidylinositol-anchor biosynthesis. It catalyzes the transfer of ethanolamine phosphate to the first alpha-1,4-linked mannose of the glycosylphosphatidylinositol precursor of the GPI anchor. It may act as a suppressor of replication stress and chromosome missegregation.	294
293745	cd16021	ALP_like	uncharacterized Alkaline phosphatase subfamily. Alkaline phosphatases are non-specific phosphomonoesterases that catalyze the hydrolysis reaction via a phosphoseryl intermediate to produce inorganic phosphate and the corresponding alcohol, optimally at high pH. Alkaline phosphatase exists as a dimer, each monomer binding 2 zinc atoms and one magnesium atom, which are essential for enzymatic activity.	278
293746	cd16022	sulfatase_like	sulfatase. Sulfatases catalyze the hydrolysis of sulfate esters from a wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatases include the cycling of sulfur in the environment, the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and the remodeling of sulfated glycosaminoglycans in the extracellular space. The sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases.	236
293747	cd16023	GPI_EPT_3	GPI ethanolamine phosphate transferase 3, PIG-O. Ethanolamine phosphate transferase is involved in glycosylphosphatidylinositol-anchor biosynthesis. It catalyzes the transfer of ethanolamine phosphate to the first alpha-1,4-linked mannose of the glycosylphosphatidylinositol precursor of the GPI anchor. It may act as a suppressor of replication stress and chromosome missegregation.	289
293748	cd16024	GPI_EPT_2	GPI ethanolamine phosphate transferase 2; PIG-G. Ethanolamine phosphate transferase is involved in glycosylphosphatidylinositol-anchor biosynthesis. It catalyzes the transfer of ethanolamine phosphate to the first alpha-1,4-linked mannose of the glycosylphosphatidylinositol precursor of the GPI anchor. It may act as a suppressor of replication stress and chromosome missegregation.	274
293749	cd16025	PAS_like	Bacterial Arylsulfatase of Pseudomonas aeruginosa and related proteins. Sulfatases catalyze the hydrolysis of sulfate esters from a wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatases include the cycling of sulfur in the environment, the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and the remodeling of sulfated glycosaminoglycans in the extracellular space. The sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases.	402
293750	cd16026	GALNS_like	galactosamine-6-sulfatase; also known as N-acetylgalactosamine-6-sulfatase (GALNS). Lysosomal galactosamine-6-sulfatase removes sulfate groups from a terminal N-acetylgalactosamine-6-sulfate (or galactose-6-sulfate) in mucopolysaccharides such as keratan sulfate and chondroitin-6-sulfate. Defects in GALNS lead to accumulation of substrates, resulting in the development of the lysosomal storage disease mucopolysaccharidosis IV A.	399
293751	cd16027	SGSH	N-sulfoglucosamine sulfohydrolase (SGSH; sulfamidase). N-sulfoglucosamine sulfohydrolase (SGSH) belongs to the sulfatase family and catalyses the cleavage of N-linked sulfate groups from the GAGs heparan sulfate and heparin. The active site is characterized by the amino-acid sequence motif C(X)PSR that is highly conserved among most sulfatases. The cysteine residue is post-translationally converted to a formylglycine (FGly) residue, which is crucial for the catalytic process. Loss of function of SGSH results in a disease called mucopolysaccharidosis type IIIA (Sanfilippo A syndrome), a fatal childhood-onset neurodegenerative disease with mild facial, visceral and skeletal abnormalities.	373
293752	cd16028	PMH	Phosphonate monoester hydrolase/phosphodiesterase. Phosphonate monoester hydrolase/phosphodiesterase hydrolyses phosphonate monoesters or phosphate diesters using a posttranslationally formed formylglycine as the catalytic nucleophile. PMH is a member of the alkaline phosphatase superfamily. The structure of PMH is more homologous to arylsulfatase than to alkaline phosphatase. Sulfatases also use formylglycine as the catalytic nucleophile.	449
293753	cd16029	4-S	N-acetylgalactosamine 4-sulfatase, also called arylsulfatase B. Sulfatases catalyze the hydrolysis of sulfuric acid esters from a wide variety of substrates. N-acetylgalactosamine 4-sulfatase catalyzes the removal of the sulfate ester group from position 4 of an N-acetylgalactosamine sugar at the non-reducing terminus of the polysaccharide in the degradative pathways of the glycosaminoglycans dermatan sulfate and chondroitin-4-sulfate. N-acetylgalactosamine 4-sulfatase is a lysosomal enzyme.	393
293754	cd16030	iduronate-2-sulfatase	iduronate-2-sulfatase. Iduronate 2-sulfatase is a sulfatase enzyme that catalyzes the hydrolysis of sulfate ester bonds from a wide variety of substrates, including steroids, carbohydrates and proteins. Iduronate 2-sulfatase is required for the lysosomal degradation of heparan sulfate and dermatan sulfate. Mutations in the iduronate 2-sulfatase gene that result in enzymatic deficiency lead to the sex-linked mucopolysaccharidosis type II, also known as Hunter syndrome.	435
293755	cd16031	G6S_like	uncharacterized sulfatase homologous to glucosamine (N-acetyl)-6-sulfatase (G6S, GNS). N-acetylglucosamine-6-sulfatase, also known as glucosamine (N-acetyl)-6-sulfatase, hydrolyzes the 6-sulfate groups of the N-acetyl-D-glucosamine 6-sulfate units of heparan sulfate and keratan sulfate. Deficiency of N-acetylglucosamine-6-sulfatase results in Sanfilippo syndrome type IIID, or mucopolysaccharidosis III (MPS-III), a rare autosomal recessive lysosomal storage disease.	429
293756	cd16032	choline-sulfatase	choline-sulfatase. Choline-sulphatase is involved in the synthesis of glycine betaine from choline. The symbiotic soil bacterium Rhizobium meliloti can synthesize glycine betaine from choline-O-sulphate and choline to protect itself from osmotic stress. This biosynthetic pathway is encoded by the betICBA locus, which comprises a regulatory gene, betI, and three structural genes, betC (choline sulfatase), betB (betaine aldehyde dehydrogenase), and betA (choline dehydrogenase). betICBA genes constitute a single operon.	327
293757	cd16033	sulfatase_like	uncharacterized sulfatase subfamily. Sulfatases catalyze the hydrolysis of sulfate esters from a wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatases include the cycling of sulfur in the environment, the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and the remodeling of sulfated glycosaminoglycans in the extracellular space. The sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases.	411
293758	cd16034	sulfatase_like	uncharacterized sulfatase subfamily. Sulfatases catalyze the hydrolysis of sulfate esters from a wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatases include the cycling of sulfur in the environment, the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and the remodeling of sulfated glycosaminoglycans in the extracellular space. The sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases.	399
293759	cd16035	sulfatase_like	uncharacterized sulfatase subfamily. Sulfatases catalyze the hydrolysis of sulfate esters from a wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatases include the cycling of sulfur in the environment, the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and the remodeling of sulfated glycosaminoglycans in the extracellular space. The sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases.	311
293760	cd16037	sulfatase_like	uncharacterized sulfatase subfamily. Sulfatases catalyze the hydrolysis of sulfate esters from a wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatases include the cycling of sulfur in the environment, the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and the remodeling of sulfated glycosaminoglycans in the extracellular space. The sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases.	321
277186	cd16039	PHD_SPP1	PHD finger found in Set1 complex component SPP1. Set1C component SPP1, also called COMPASS component Spp1, or Complex proteins associated with set1 protein Spp1, or Suppressor of PRP protein 1, is a component of the COMPASS complex that links histone methylation to initiation of meiotic recombination. It induces double-strand break (DSB) formation by tethering to recombinationally cold regions. SPP1 interacts with H3K4me3 and Mer2, a protein required for DSB formation, to promote recruitment of potential meiotic DSB sites to the chromosomal axis. SPP1 contains a PHD finger, a zinc binding motif.	46
294002	cd16040	SPRY_PRY_SNTX	Stonustoxin subunit alpha or SNTX subunit alpha. This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C-terminus of Stonustoxin alpha proteins. Stonustoxin (SNTX) is a multifunctional lethal protein isolated from venom elaborated by the stonefish. It comprises two subunits, termed alpha and beta. SNTX elicits an array of biological responses, particularly a potent hypotension and respiratory difficulties.	180
293880	cd16074	OCRE	OCRE domain. The OCRE (OCtamer REpeat) domain contains 5 repeats of an 8-residue motif, which were shown to form beta-strands. Based on the architectures of proteins containing OCRE domains, a role in RNA metabolism and/or signalling has been proposed.	54
293922	cd16075	ORC6_CTD	C-terminal domain of the eukaryotic origin recognition complex subunit ORC6. In eukaryotes, a complex consisting of six subunits promotes the onset of DNA replication. The 6th subunit, ORC6, does not belong to the wider AAA+ family of nucleotide hydrolases, but contains a tandemly repeated domain resembling the transcription factor TFIIB, as well as this C-terminal domain which harbors a helical segment that is responsible for interactions with the complex by binding to ORC3. Mutations in this C-terminal helix interfere with the formation of the ORC and have been linked to Meier-Gorlin syndrome, a dwarfism disorder.	53
293923	cd16076	TSPcc	Coiled coil region of thrombospondin. This domain family contains the coiled coil region of subgroup B of thrombospondins, comprising TSP-3, TSP-4, and TSP-5, that assemble as pentamers. This region is located adjacent to the N-terminal domain (NTD) of thrombospondin (TSP), and mediates co-translational oligomerization via formation of a left-handed super-helix which binds hydrophobic signaling molecules such as vitamin D3 and vitamin A. Pentameric TSPs are stabilized by inter-subunit disulfide bonds formed between cysteine residues adjacent to the C-terminal end. TSP-5 is also known as cartilage oligomeric matrix protein (COMP). TSPs comprise a conserved family of extracellular, oligomeric, multidomain, calcium-binding glycoproteins. In mammals, they have several complex tissue-specific roles, including activities in wound healing and angiogenesis, connective tissue organization, vessel wall biology, and synaptogenesis, all mechanistically derived from interactions with cell surfaces, cytokines, growth factors, or components of the extracellular matrix (ECM) that together regulate many aspects of cell phenotype. In invertebrates, TSPs may have ancient functions such as bridging activities in cell-cell and cell-ECM interactions. Most protostomes and inferred basal metazoa encode a single TSP with the general domain organization of subgroup B TSPs and with a pentamerizing coiled coil.	40
293924	cd16077	TSP-5cc	Coiled coil region of thrombospondin-5 (TSP-5). This family contains the N-terminal coiled coil region of TSP-5, also known as cartilage oligomeric matrix protein (COMP). It forms a pentameric left-handed coiled coil (COMPcc) with a channel that is a unique carrier for lipophilic compounds. It is known to bind hydrophobic signaling molecules such as vitamin D3 and vitamin A, making it a possible targeted drug delivery system. TSP-5/COMP is expressed in all types of cartilage as well as in the vitreous of the eye, tendons, vascular smooth muscle cells, and heart. The pentamer is stabilized by inter-subunit disulfide bonds formed between cysteine residues adjacent to the C-terminal end of the coiled coil region. TSP-5 is essential for modulating the phenotypic transition of vascular smooth muscle cells and vascular remodeling. Mutations in TSP-5 result in two different inherited chondrodysplasias and osteoarthritic phenotypes: pseudoachondroplasia and multiple epiphyseal dysplasia. Deficiency of TSP-5 causes dilated cardiomyopathy (DCM), a common cause of congestive heart failure. Early increase in serum TSP-5 is associated with joint damage progression in patients with rheumatoid arthritis, thus representing a novel indicator of an activated destructive process in the joint.	43
293925	cd16079	TSP-3cc	Coiled coil region of thrombospondin-3 (TSP-3). This family contains the N-terminal coiled coil region of TSP-3, which is highly expressed in osteosarcomas and associated with metastasis. TSP-3, TSP-5, and type IX collagen are all expressed in the growth plate, where they operate in concert and participate in growth plate organization that directly modulates linear growth. It forms a pentameric left-handed coiled coil with a channel that is a unique carrier for lipophilic compounds. The pentamer is stabilized by inter-subunit disulfide bonds formed between cysteine residues adjacent to the C-terminal end of the coiled coil region. TSP-3 knockout mice have been shown to display accelerated endochondral ossification and increased trabecular bone in the femoral head.	43
293926	cd16080	TSP-4cc	Coiled coil region of thrombospondin-4 (TSP-4). This family contains the N-terminal coiled coil region of TSP-4, which is abundantly expressed in tendon and muscle, as well as in neural and osteogenic tissues, and has also been detected in brain capillaries. It forms a pentameric left-handed coiled coil with a channel that is a unique carrier for lipophilic compounds. The pentamer is stabilized by inter-subunit disulfide bonds formed between cysteine residues adjacent to the C-terminal end of the coiled coil region. TSP-4 regulates the composition of the deposition of extracellular matrix (ECM) in tendon and skeletal muscle. The absence of TSP-4 alters the organization, composition and physiological functions of these tissues. TSP-4 deficiency causes incorrect modification of heparan-sulfate (HS), resulting in decreased activity of lipoprotein lipase (LpL) and loss of beta-glycan; HS is involved in a wide variety of cellular functions, LpL is an endothelial enzyme responsible for the uptake and hydrolysis of lipoproteins, and beta-glycan has an inhibiting effect on TGF-beta signaling in skeletal muscle. The human gene THBS4 that encodes TSP-4 contains a single nucleotide polymorphism (SNP), which occurs at high frequency in Caucasians and is associated with a significantly increased risk of premature myocardial infarction. TSP-4 also binds stromal interaction molecule 1 (STIM1), a transmembrane protein that functions in the endoplasmic reticulum (ER), and regulates calcium channel activity. Studies show that TSP-4 may act as an organizer of adhesive and axon outgrowth-promoting molecules in the ECM to optimize retinal ganglion cell responses. TSP-4 is also involved in the post-translational modification of collagen and may assist in collagen fibril assembly.	44
293927	cd16081	TSPcc_insect	Coiled coil region of thrombospondin in protostomes. This family contains the N-terminal coiled coil region of thrombospondin (TSP) in some protostomes, which suggests ancient functions that include bridging activities in cell-cell and cell-ECM interactions. It appears that most protostomes and inferred basal metazoa encode a single TSP with the general domain organization of subgroup B TSPs and with a pentamerizing coiled coil. This region has heparin-binding activity and is a component of extracellular matrix (ECM), showing that the pentameric TSPs are of earlier origin and that the trimeric TSP subfamily A form is associated with higher chordates. The left-handed coiled coil pentamer forms a channel that is a unique carrier for lipophilic compounds, and is stabilized by inter-subunit disulfide bonds formed between cysteine residues adjacent to the C-terminal end of the coiled coil region. Several heparan sulphate (HS) proteoglycans are known in D. melanogaster, including both transmembrane and matrix forms, which could contribute to its retention in pericellular matrix.	42
409504	cd16082	IgC_CRIg	Immunoglobulin (Ig) constant domain of the complement receptor of the immunoglobulin superfamily (CRIg). The members here are composed of the Immunoglobulin (Ig) constant domain of the complement receptor of the immunoglobulin superfamily (CRIg). The N-terminal domain of CRIg (also referred to as Z39Ig and V-set and Ig domain-containing 4 (VSIG4)) belongs to the IgV family of immunoglobulin-like domains while the C-terminal domain of CRIg belongs to the IgC family of immunoglobulin-like domains. CRIg plays a role in the complement system as an inhibitor of the alternative pathway convertases, and is a negative regulator of T cell activation. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins such as T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins such as butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond.	86
409505	cd16083	IgC1_CD80	Immunoglobulin constant (IgC)-like domain of antigen receptor Cluster of Differentiation (CD) 80; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin constant (IgC)-like domain of the antigen receptor Cluster of Differentiation (CD) 80. CD80 (also known as glycoprotein B7-1) and CD86 (also known as glycoprotein B7-2) are expressed on antigen-presenting cells and deliver the co-stimulatory signal through CD28 and CTLA-4 (CD152) on T cells. Signaling through CD28 augments the T-cell response, whereas CTLA-4 signaling attenuates it. CD80 contains two Ig-like domains, an amino-terminal immunoglobulin variable (IgV)-like domain characteristic of adhesion molecules, and a membrane proximal immunoglobulin constant (IgC)-like domain similar to the constant domains of antigen receptors. Members of the Ig family are components of immunoglobulin, T-cell receptors, CD1 cell surface glycoproteins, secretory glycoproteins A/C, and major histocompatibility complex (MHC) class I/II molecules. In immunoglobulins, each chain is composed of one variable domain (IgV) and one or more IgC domains. These names reflect the fact that the variability in sequences is higher in the variable domain than in the constant domain. The IgV domain is responsible for antigen binding, and the IgC domain is involved in oligomerization and molecular interactions.	91
409506	cd16084	IgC1_CH2_IgD	CH2 domain (second constant Ig domain of the heavy chain) in immunoglobulin delta chain; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin constant domain (IgC) in delta heavy chains. The IgC family includes immunoglobulin, T-cell receptors, CD1 cell surface glycoproteins, secretory glycoproteins A/C, and major histocompatibility complex (MHC) class I/II molecules. In immunoglobulins, each chain is composed of one variable domain (IgV) and one or more IgC domains. These names reflect the fact that the variability in sequences is higher in the variable domain than in the constant domain. The IgV domain is responsible for antigen binding, and the IgC domain is involved in oligomerization and molecular interactions.	97
409507	cd16085	IgC1_SIRP_domain_3	Signal-regulatory protein (SIRP) immunoglobulin-like domain 3; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin (Ig)-like domain in Signal-Regulatory Protein (SIRP), domain 3 (C1 repeat 2). The SIRPs belong to the "paired receptors" class of membrane proteins that comprise several genes coding for proteins with similar extracellular regions but very different transmembrane/cytoplasmic regions with different (activating or inhibitory) signaling potentials. They are commonly found on NK cells, but are also on many myeloid cells. Their extracellular region contains three immunoglobulin superfamily domains: a single V-set and two C1-set IgSF domains. They have either cytoplasmic tails that contain ITIMs or transmembrane regions that have positively charged residues that allow an association with adaptor proteins, such as DAP12/KARAP, containing ITAMs. There are 3 distinct SIRP members: alpha, beta, and gamma. SIRP alpha (also known as CD172a or SRC homology 2 domain-containing protein tyrosine phosphatase substrate 1/Shps-1) is a membrane receptor that interacts with a ligand CD47 expressed on many cells and gives an inhibitory signal through immunoreceptor tyrosine-based inhibition motifs in the cytoplasmic region that interact with phosphatases SHP-1 and SHP-2. SIRP beta has a short cytoplasmic region and associates with a transmembrane adapter protein DAP12 containing immunoreceptor tyrosine-based activation motifs to give an activating signal. SIRP gamma contains a very short cytoplasmic region lacking obvious signaling motifs; it also binds CD47, but with much less affinity.	96
319335	cd16086	IgV_CD80	Immunoglobulin variable domain (IgV) in Cluster of Differentiation (CD) 80. The members here are composed of the immunoglobulin variable region (IgV) in the Cluster of Differentiation (CD) 80. Glycoproteins B7-1 (also known as cluster of differentiation (CD) 80) and B7-2 (also known as CD86) are expressed on antigen-presenting cells and deliver the co-stimulatory signal through CD28 and CTLA-4 (also known as cluster of differentiation 152/CD152) on T cells. Signaling through CD28 augments the T-cell response, whereas CTLA-4 signaling attenuates it. CD80 contains two Ig-like domains, an amino-terminal immunoglobulin variable (IgV)-like domain characteristic of adhesion molecules and a membrane proximal immunoglobulin constant (IgC)-like domain similar to the constant domains of antigen receptors. Members of the Ig family are components of immunoglobulin, T-cell receptors, CD1 cell surface glycoproteins, secretory glycoproteins A/C, and Major Histocompatibility Complex (MHC) class I/II molecules. In immunoglobulins, each chain is composed of one variable domain (IgV) and one or more IgC domains. These names reflect the fact that the variability in sequences is higher in the variable domain than in the constant domain. The IgV domain is responsible for antigen binding, and the IgC domain is involved in oligomerization and molecular interactions.	105
409508	cd16087	IgV_CD86	Immunoglobulin variable domain (IgV) in Cluster of Differentiation (CD) 86. The members here are composed of the immunoglobulin variable region (IgV) in the Cluster of Differentiation (CD) 86. Glycoproteins B7-1 (also known as cluster of differentiation (CD) 80) and B7-2 (also known as CD86) are expressed on antigen-presenting cells and deliver the co-stimulatory signal through CD28 and CTLA-4 (also known as CD152) on T cells. Signaling through CD28 augments the T-cell response, whereas CTLA-4 signaling attenuates it. The CTLA-4 and B7-2 monomers are both two-layer beta-sandwiches that display the chain topology characteristic of the immunoglobulin variable (V-type) domains present in antigen receptors. The front and back sheets of B7-2 are composed of AGFCC'C" and BED strands, respectively. Members of the IgV family are components of immunoglobulin (Ig) and T cell receptors. The basic structure of Ig molecules is a tetramer of two light chains and two heavy chains linked by disulfide bonds. In Ig, each chain is composed of one variable domain (IgV) and one or more constant domains (IgC); these names reflect the fact that the variability in sequences is higher in the variable domain than in the constant domain. Within the variable domain, there are regions of even more variability called the hypervariable or complementarity-determining regions (CDRs) which are responsible for antigen binding. A predominant feature of most Ig domains is the disulfide bridge connecting 2 beta-sheets with a tryptophan residue packed against the disulfide bond.	108
409509	cd16088	IgV_PD1	Immunoglobulin (Ig)-like domain of Programmed Cell Death 1 (PD1). The members here are composed of the immunoglobulin (Ig)-like domain of Programmed Cell Death 1 (PD1; also known as CD279/cluster of differentiation 279). PD1 is a cell surface receptor that is expressed on T cells and pro-B cells. The protein's structure includes an extracellular IgV domain followed by a transmembrane region and an intracellular tail. Activation of CD4+ T cells, CD8+ T cells, NKT cells, B cells, and monocytes induces PD-1 expression, immediately after which it binds two distinct ligands, PD-L1 (also known as B7-H1 or CD274/cluster of differentiation 274) and PD-L2, also known as B7-DC. PD-1 plays an important role in down-regulating the immune system by preventing the activation of T-cells, reducing autoimmunity and promoting self-tolerance. The inhibitory effect of PD-1 is accomplished by promoting apoptosis in antigen specific T-cells in lymph nodes while simultaneously reducing apoptosis in regulatory T cells. A class of drugs that target PD-1, known as the PD-1 inhibitors, activate the immune system to attack tumors and treat cancer. Comparisons between the mouse PD-1 (mPD-1) and human PD-1 (hPD-1) reveal that, unlike mPD-1, which has a conventional IgSF V-set domain, hPD-1 lacks a C" strand, and instead the C' and D strands are connected by a long and flexible loop. In addition, the BC loop is not stabilized by disulfide bonding to the F strand of the ligand binding beta sheet. These differences result in different binding affinities of human and mouse PD-1 for their ligands.	112
409510	cd16089	IgV_CRIg	Immunoglobulin variable (IgV)-like domain in complement receptor of the immunoglobulin superfamily (CRIg). The members here are composed of the immunoglobulin variable (IgV) region of the complement receptor of the immunoglobulin superfamily (CRIg). The N-terminal domain of CRIg (also known as Z39Ig and V-set and Ig domain-containing 4 (VSIG4)) belongs to the IgV family of immunoglobulin-like domains while the C-terminal domain of CRIg belongs to the IgC family of immunoglobulin-like domains. Like all members of this family, the CRIg domain contains two beta-sheets: one composed of strands A', G, F, C, C' and C", and the other of strands B, E and D. The complement system is an important part of the innate immune system and is required for removal of pathogens from the bloodstream. After exposure to pathogens, the third component of the complement system, C3, is cleaved to C3b which, after recruitment of factor B, initiates formation of the alternative pathway convertases. CRIg, a complement receptor expressed on macrophages, binds to C3b and iC3b, mediating phagocytosis of the particles. It is also a potent inhibitor of the alternative pathway convertases and a negative regulator of T cell activation. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins such as T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins such as butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond.	117
409511	cd16090	IgV_CD47	Immunoglobulin variable region (IgV) in Cluster of Differentiation (CD) 47. The members here are composed of immunoglobulin variable (IgV) region in the Cluster of Differentiation (CD) 47 (also known as integrin associated protein/IAP). CD47 partners with membrane integrins and binds thrombospondin-1 (TSP-1) and signal-regulatory protein alpha (SIRP alpha). It is involved in apoptosis, proliferation, adhesion, migration, and immune and angiogenic responses. Members of the IgV family are components of immunoglobulin (Ig) and T cell receptors. The basic structure of Ig molecules is a tetramer of two light chains and two heavy chains linked by disulfide bonds. In Ig, each chain is composed of one variable domain (IgV) and one or more constant domains (IgC); these names reflect the fact that the variability in sequences is higher in the variable domain than in the constant domain. Within the variable domain, there are regions of even more variability called the hypervariable or complementarity-determining regions (CDRs) which are responsible for antigen binding. A predominant feature of most Ig domains is the disulfide bridge connecting 2 beta-sheets with a tryptophan residue packed against the disulfide bond.	112
409512	cd16091	IgV_HHLA2	Immunoglobulin Variable (IgV) domain in HERV-H LTR-associating 2 (HHLA2). The members here are composed of the immunoglobulin variable (IgV) region in HERV-H LTR-associating 2 (HHLA2; also known as B7-H7/B7 homolog 7). HHLA2 is a member of the B7 family of immune regulatory proteins. Mature human HHLA2 consists of an extracellular domain (ECD) with three immunoglobulin-like domains, a transmembrane segment, and a cytoplasmic domain. HHLA2 is widely expressed in human cancers including non-small cell lung carcinoma (NSCLC), triple negative breast cancer (TNBC), and melanoma, but has limited expression on normal tissues. Interestingly, unlike other members of B7 family, HHLA2 is not expressed in mice or rats. HHLA2 functions as a T cell coinhibitory molecule as it inhibits the proliferation of activated CD4(+) and CD8(+) T cells and their cytokine production. Furthermore, HHLA2 is constitutively expressed on the surface of human monocytes and is induced on B cells after stimulation; however, it is not inducible on T cells.	107
319341	cd16092	IgC1_CH1_IgD	CH1 domain (first constant Ig domain of the heavy chain) in immunoglobulin delta chain; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the first immunoglobulin constant-1 set domain of delta chains. It belongs to a family composed of the first immunoglobulin constant-1 set domain of alpha, delta, epsilon, gamma, and mu heavy chains. This domain is found on the Fab antigen-binding fragment. The basic structure of Ig molecules is a tetramer of two light chains and two heavy chains linked by disulfide bonds. There are two types of light chains: kappa and lambda; each is composed of a constant domain and a variable domain. There are five types of heavy chains: alpha, delta, epsilon, gamma, and mu, all consisting of a variable domain (VH) with three (alpha, delta and gamma) or four (epsilon and mu) constant domains (CH1 to CH4). Ig molecules are modular proteins, in which the variable and constant domains have clear, conserved sequence patterns. This group belongs to the C1-set of IgSF domains, which are classical Ig-like domains resembling the antibody constant domain. C1-set domains are found almost exclusively in molecules involved in the immune system, such as in immunoglobulin light and heavy chains, in the major histocompatibility complex (MHC) class I and II complex molecules, and in various T-cell receptors.	96
409513	cd16093	IgC1_CH2_Mu	CH2 domain (second constant Ig domain of the heavy chain) in immunoglobulin mu chain; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the second immunoglobulin constant domain (IgC) of mu heavy chains. This domain is found on the Fc fragment. The basic structure of Ig molecules is a tetramer of two light chains and two heavy chains linked by disulfide bonds. There are two types of light chains: kappa and lambda; each is composed of a constant domain and a variable domain. There are five types of heavy chains: alpha, delta, epsilon, gamma, and mu, all consisting of a variable domain (VH) with three (alpha, delta and gamma) or four (epsilon and mu) constant domains (CH1 to CH4). Ig molecules are modular proteins, in which the variable and constant domains have clear, conserved sequence patterns.	99
319343	cd16094	IgC1_CH3_IgD	CH3 domain (third constant Ig domain of the heavy chain) in immunoglobulin delta chain; member of the C1-set of Ig superfamily domains. The members here are composed of the third immunoglobulin constant domain (IgC) of delta heavy chains. This domain is found on the Fc fragment. The basic structure of Ig molecules is a tetramer of two light chains and two heavy chains linked by disulfide bonds. There are two types of light chains: kappa and lambda; each is composed of a constant domain and a variable domain. There are five types of heavy chains: alpha, delta, epsilon, gamma, and mu, all consisting of a variable domain (VH) with three (alpha, delta and gamma) or four (epsilon and mu) constant domains (CH1 to CH4). Ig molecules are modular proteins, in which the variable and constant domains have clear, conserved sequence patterns. C1-set Ig domains have one beta sheet that is formed by strands A, B, E, and D and another formed by strands G, F, C, and C'.	100
409514	cd16095	IgV_H_TCR_mu	T-cell receptor Mu, Heavy chain, variable (V) domain. The members here are composed of the immunoglobulin (Ig) heavy chain (H), variable (V) domain of the T-cell receptor Mu. The basic structure of Ig molecules is a tetramer of two light chains and two heavy chains linked by disulfide bonds. In Ig, each chain is composed of one variable domain (IgV) and one or more constant domains (IgC); these names reflect the fact that the variability in sequences is higher in the variable domain than in the constant domain. There are five types of heavy chains (alpha, gamma, delta, epsilon, and mu), which determines the type of immunoglobulin formed: IgA, IgG, IgD, IgE, and IgM, respectively. In higher vertebrates, there are two types of light chain, designated kappa and lambda, which can associate with any of the heavy chains. This family includes alpha, gamma, delta, epsilon, and mu heavy chains. Members of this group contain standard Ig superfamily V-set AGFCC'C"/DEB domain topology.	115
409515	cd16096	IgV_CD79b_beta	Immunoglobulin variable domain (IgV) Cluster of Differentiation (CD) 79B. The members here are composed of the immunoglobulin variable domain (IgV) of the Cluster of Differentiation (CD) 79B (also known as CD79b molecule, immunoglobulin-associated beta (Ig-beta), and B29). The B lymphocyte antigen receptor is a multimeric complex that includes the antigen-specific component, surface immunoglobulin (Ig). Surface Ig non-covalently associates with two other proteins, Ig-alpha and Ig-beta, which are necessary for expression and function of the B-cell antigen receptor. This gene encodes the Ig-beta protein of the B-cell antigen component. Alternatively spliced transcript variants encoding different isoforms have been described. Members of the IgV family are components of immunoglobulin (Ig) and T cell receptors. The basic structure of Ig molecules is a tetramer of two light chains and two heavy chains linked by disulfide bonds. In Ig, each chain is composed of one variable domain (IgV) and one or more constant domains (IgC); these names reflect the fact that the variability in sequences is higher in the variable domain than in the constant domain. Within the variable domain, there are regions of even more variability called the hypervariable or complementarity-determining regions (CDRs) which are responsible for antigen binding. A predominant feature of most Ig domains is the disulfide bridge connecting 2 beta-sheets with a tryptophan residue packed against the disulfide bond. Members of this group contain standard Ig superfamily V-set AGFCC'C"/DEB domain topology.	96
409516	cd16097	IgV_SIRP	Immunoglobulin (Ig)-like variable (V) domain of the Signal-Regulatory Protein (SIRP). The members here are composed of the immunoglobulin (Ig)-like domain of the Signal-Regulatory Protein (SIRP). The SIRPs belong to the "paired receptors" class of membrane proteins that comprise several genes coding for proteins with similar extracellular regions, but very different transmembrane/cytoplasmic regions with different (activating or inhibitory) signaling potentials. They are commonly on NK cells, but are also on many myeloid cells. Their extracellular region contains three immunoglobulin superfamily domains, a single V-set, and two C1-set IgSF domains. Their cytoplasmic tails that contain either ITIMs or transmembrane regions have positively charged residues that allow an association with adaptor proteins, such as DAP12/KARAP, containing ITAMs. There are 3 distinct SIRP members: alpha, beta, and gamma.  SIRP alpha (also known as CD172a or SRC homology 2 domain-containing protein tyrosine phosphatase substrate 1/Shps-1) is a membrane receptor that interacts with a ligand CD47 expressed on many cells and gives an inhibitory signal through immunoreceptor tyrosine-based inhibition motifs in the cytoplasmic region that interact with phosphatases SHP-1 and SHP-2. SIRP beta has a short cytoplasmic region and associates with a transmembrane adapter protein DAP12 containing immunoreceptor tyrosine-based activation motifs to give an activating signal. SIRP gamma contains a very short cytoplasmic region lacking obvious signaling motifs, but also binds CD47 with much less affinity. Members of this group contain standard Ig superfamily V-set AGFCC'C"/DEB domain topology.	111
294015	cd16098	FliS	flagellar export chaperone FliS. This family contains flagellar export chaperone FliS, a protein critical for flagellar assembly and bacterial colonization. FliS prevents premature polymerization of flagellins, the major protein of the filament, by regulating interactions between structural components of the bacterial flagellum in the cytosol. It binds specifically to FliC (flagellin) which is sequentially secreted in large numbers through the central channel of the flagellum and polymerized to form the tail filament. FliS protects FliC from degradation and aggregation by binding to the FliC C-terminal helical domain, which contributes to stabilization of flagellin subunit interactions during polymerization. FliS has been shown to interact specifically with FlgM, whose role is to inhibit FliA, a flagellum-specific RNA polymerase responsible for flagellin transcription; FliA competes with FliS for FlgM binding.	102
381691	cd16099	TenA_PqqC-like	TenA-like proteins including TenA_C and TenA_E proteins, as well as pyrroloquinoline quinone (PQQ) synthesis protein C. TenA proteins participate in thiamin metabolism and can be classified into two classes: TenA_C which has an active site Cys, and TenA_E which does not; TenA_E proteins often have a pair of structurally conserved Glu residues in the active site. TenA_C proteins (EC 3.5.99.2) catalyze the hydrolysis of the thiamin breakdown product amino-HMP (4-amino-5-amino-methyl-2-methylpyrimidine) to 4-amino-5-hydroxymethyl-2-methylpyrimidine (HMP) in a thiamin salvage pathway; the role of TenA_E proteins is less clear. Arabidopsis thaliana TenA_E hydrolyzes amino-HMP to AMP, and the N-formyl derivative of amino-HMP to amino-HMP, but does not hydrolyze thiamin. Bacillus subtilis TenA_C can hydrolyze amino-HMP to AMP and can catalyze the hydrolysis of thiamin. Saccharomyces cerevisiae THI20 includes a C-terminal tetrameric TenA-like domain fused to an N-terminal ThiD domain, and participates in thiamin biosynthesis, degradation and salvage; the TenA-like domain catalyzes the production of HMP from thiamin degradation products (salvage). Bacillus halodurans TenA_C participates in a salvage pathway where the thiamine degradation product 2-methyl-4-formylamino-5-aminomethylpyrimidine (formylamino-HMP) is hydrolyzed first to amino-HMP by the YlmB protein, and the amino-HMP is then hydrolyzed by TenA to produce HMP. Helicobacter pylori TenA_C is also thought to catalyze a salvage reaction but the pyrimidine substrate has not yet been identified. It has also been suggested that TenA proteins act as transcriptional regulators based on changes in gene-expression patterns when TenA is overexpressed in Bacillus subtilis, however this effect may be indirect; Pyrococcus furiosus TenA_E lacks appropriate surface charges for DNA interactions. This family also includes bacterial coenzyme pyrroloquinoline quinone (PQQ) synthesis protein C (PQQC), an oxidase involved in the final step of PQQ biosynthesis, and CADD, a Chlamydia protein that interacts with death receptors.	196
350627	cd16100	ARID	ARID/BRIGHT DNA binding domain family. The AT-rich interaction domain (ARID) family of transcription factors, found in a broad array of organisms from fungi to mammals, is characterized by a highly conserved, helix-turn-helix DNA binding domain that binds to the major groove of DNA. The ARID domain, also called BRIGHT, was first identified in the mouse B-cell-specific transcription factor Bright and in the product of the dead ringer (dri) gene of Drosophila melanogaster. ARID family members are implicated in normal development, differentiation, cell cycle regulation, transcriptional activation and chromatin remodeling. Different family members exhibit different DNA-binding properties. Drosophila Dri, mammalian ARID3A/3B/3C and ARID5A/5B, selectively bind AT-rich sites. However, ARID1A/1B, Drosophila Osa, yeast SWI1, ARID2, ARID4A/4B, JARID1A/1B/1C/1D, and JARID2, bind DNA without sequence specificity.	87
341089	cd16101	ING	Inhibitor of growth (ING) domain family. The Inhibitor of growth (ING) family includes a group of tumor suppressors, ING1-5, which act as readers and writers of the histone epigenetic code, affecting DNA damage response, chromatin remodeling, cellular senescence, differentiation, cell cycle regulation, and apoptosis. They may have a general role in mediating the cellular response to genotoxic stress through binding to and regulating the activities of histone acetyltransferase (HAT) and histone deacetylase (HDAC) chromatin remodeling complexes. All ING proteins contain an N-terminal leucine zipper-like (LZL) motif-containing ING domain that binds unmodified H3 tails, and a well-characterized C-terminal plant homeodomain (PHD)-type zinc-finger domain, which binds lysine 4-tri-methylated histone H3 (H3K4me3). Although these two regions can bind histones independently, together they increase the apparent association of the ING domain for the H3 tail. The ING family also includes three yeast orthologs, chromatin modification-related protein YNG1 (Yng1p), YNG2 (Yng2p), and transcriptional regulatory protein PHO23 (Pho23p). Yng1p, also termed ING1 homolog 1, is one of the components of the NuA3 histone acetyltransferase (HAT) complex. Yng2p, also termed ESA1-associated factor 4, or ING1 homolog 2, is a subunit of the NuA4 HAT complex. It plays a critical role in intra-S-phase DNA damage response. Pho23p is part of Rpd3/Sin3 histone deacetylase (HDAC) complex. It is required for the normal function of Rpd3 in the silencing of rDNA, telomeric, and mating-type loci. Yng1p and Pho23p inhibit p53-dependent transcription. In contrast, Yng2p has the opposite effect.	88
340519	cd16102	RAWUL_PCGF_like	RING finger- and WD40-associated ubiquitin-like (RAWUL) domain found in PCGF1-6, RING1 and -2, DRIP and similar proteins; structurally similar to a beta-grasp ubiquitin-like fold. The family includes six Polycomb Group (PcG) RING finger homologs (PCGF1/NSPc1, PCGF2/Mel-18, PCGF3, PCGF4/BMI1, PCGF5, and PCGF6/MBLR) that use epigenetic mechanisms to maintain or repress expression of their target genes. They were first discovered in fruit flies that can remodel chromatin such that epigenetic silencing of genes takes place, and are well known for silencing Hox genes through modulation of chromatin structure during embryonic development in fruit flies. PCGF homologs play important roles in cell proliferation, differentiation, and tumorigenesis.  They all have been found to associate with ring finger protein 2 (RNF2). The RNF2-PCGF heterodimer is catalytically competent as an E3 ubiquitin transferase and is the scaffold for the assembly of additional components. Moreover, PCGF homologs are critical components in the assembly of distinct Polycomb Repression Complex 1 (PRC1) related complexes which are involved in the maintenance of gene repression and target different genes through distinct mechanisms. The Drosophila PRC1 core complex is formed by the Polycomb (Pc), Polyhomeotic (Ph), Posterior sex combs (Psc), and Sex combs extra (Sce, also known as Ring) subunits. In mammals, the composition of PRC1 is much more diverse and varies depending on the cellular context. All PRC1 complexes contain homologs of the Drosophila Ring protein. Ring1A/RNF1 and Ring1B/RNF2 are E3 ubiquitin ligases that mark lysine 119 of histone H2A with a single ubiquitin group (H2AK119ub). Mammalian homologs of the Drosophila Psc protein, such as PCGF2/Mel-18 or PCGF4/BMI1, regulate PRC1 enzymatic activity. PRC1 complexes can be divided into at least two classes according to the presence or absence of CBX proteins, which are homologs of Drosophila Pc. Canonical PRC1 complexes contain CBX proteins that recognize and bind H3K27me3, the mark deposited by PRC2. Therefore, canonical PRC1 complexes and PRC2 can act together to repress gene transcription and maintain this repression through cell division. Non-canonical PRC1 complexes, containing RYBP (together with additional proteins, such as L3mbtl2 or Kdm2b) rather than the CBX proteins, have recently been described in mammals. PCGF homologs contain a C3HC4-type RING-HC finger, and a RAWUL domain that might be responsible for interaction with Cbx members of the Polycomb repression complexes.	87
340520	cd16103	Ubl2_OASL	ubiquitin-like (Ubl) domain 2 found in 2'-5'-oligoadenylate synthase-like protein (OASL) and similar proteins. OASL, also termed 2'-5'-OAS-related protein (2'-5'-OAS-RP), or 59 kDa 2'-5'-oligoadenylate synthase-like protein, or thyroid receptor-interacting protein 14, or TR-interacting protein 14 (TRIP-14), or p59 OASL (p59OASL), is an interferon (IFN)-induced antiviral protein that plays an important role in the IFNs-mediated antiviral signaling pathway. It inhibits respiratory syncytial virus replication and is targeted by the viral nonstructural protein 1 (NS1). It also displays antiviral activity against encephalomyocarditis virus (EMCV) and hepatitis C virus (HCV) via an alternative antiviral pathway independent of RNase L. Moreover, OASL does not have 2'-5'-OAS activity, but can bind double-stranded RNA (dsRNA) to enhance RIG-I signaling. OASL belongs to the 2'-5' oligoadenylate synthase (OAS) family. While each member of this family has a conserved N-terminal OAS catalytic domain, only OASL has two tandem C-terminal ubiquitin-like (Ubl) repeats, which are required for its antiviral activity. This family corresponds to the second Ubl domain.	72
340521	cd16104	Ubl_USP14_like	ubiquitin-like (Ubl) domain found in ubiquitin carboxyl-terminal hydrolase 14 (USP14) and similar proteins. USP14 (EC 3.4.19.12), also termed deubiquitinating enzyme 14, or ubiquitin thioesterase 14, or ubiquitin-specific-processing protease 14, or ubiquitin carboxyl-terminal hydrolase 14, is a component of proteasome regulatory subunit 19S that regulates deubiquitinated proteins entering inside the proteasome core 20S, which plays an inhibitory role in protein degradation. USP14 is also associated with various signal transduction pathways and tumorigenesis, and thus plays an essential role in the development of various types of cancer. Moreover, USP14 mediates the development of cardiac hypertrophy by promoting GSK-3beta phosphorylation, suggesting a role in cardiac hypertrophy treatment. USP14 contains an N-terminal ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, and a C-terminal ubiquitin-specific protease (USP) domain.	75
340522	cd16105	Ubl_ASPSCR1_like	Ubiquitin-like (Ubl) domain found at the N-terminus of mammalian ASPSCR1 (alveolar soft part sarcoma chromosomal region candidate gene 1 protein), Saccharomyces cerevisiae Ubx4p, and similar proteins. ASPSCR1 (alveolar soft part sarcoma chromosomal region candidate gene 1 protein) is also known as alveolar soft part sarcoma locus protein (ASPL), tether containing UBX domain for GLUT4, TUG, UBX domain protein 9, UBXD9, UBXN9 or renal papillary cell carcinoma protein 17 (RCC17). The majority of members of this family contain two beta-grasp ubiquitin-like fold domains: the N-terminal UBL domain (described in this CD), and a C-terminal UBX domain. This UBL domain lacks the characteristic C-terminal double glycine motif. ASPSCR1 functions as a cofactor of the hexameric AAA (ATPase associated with various activities) ATPase complex, known as p97 or VCP in mammals and Cdc48p in yeast. In mammalian cells, ASPSCR1 is involved in insulin-stimulated redistribution of the glucose transporter GLUT4 and assembly of the Golgi apparatus; ASPSCR1 also plays a role in controlling vesicle translocation by interacting with insulin-regulated aminopeptidase (IRAP), a transmembrane aminopeptidase. Ubx4p and ASPSCR1 have only partially overlapping functions: both interact with p97/Cdc48p; however, Ubx4p is important for the ERAD (endoplasmic reticulum-associated protein degradation) pathway while ASPSCR1 appears not to be.	71
340523	cd16106	Ubl_Dsk2p_like	ubiquitin-like (Ubl) domain found in Saccharomyces cerevisiae proteasome interacting protein Dsk2p and similar proteins. The family contains several fungal multiubiquitin receptors, including Saccharomyces cerevisiae Dsk2p and Schizosaccharomyces pombe Dph1p, both of which have been characterized as shuttle proteins transporting ubiquitinated substrates destined for degradation from the E3 ligase to the 26S proteasome. They interact with the proteasome through their N-terminal ubiquitin-like domain (Ubl) and with ubiquitin (Ub) through their C-terminal Ub-associated domain (UBA). S. cerevisiae Dsk2p is a nuclear-enriched protein that may be involved in the ubiquitin-proteasome proteolytic pathway through interacting with K48-linked polyubiquitin and the proteasome. Moreover, it has been implicated in spindle pole duplication through assisting in Cdc31 assembly into the new spindle pole body (SPB). S. pombe Dph1p is a ubiquitin (Ub) receptor working in concert with the class V myosin, Myo52, to target the degradation of the S. pombe CLIP-170 homolog, Tip1. It also can protect Ub chains against disassembly by deubiquitinating enzymes.	73
340524	cd16107	Ubl_AtUPL5_like	ubiquitin-like (Ubl) domain found in Arabidopsis thaliana ubiquitin-protein ligase 5 (AtUPL5) and similar proteins. Arabidopsis thaliana AtUPL5, also termed HECT-type E3 ubiquitin transferase UPL5, is an E3 ubiquitin-protein ligase that contains a ubiquitin-like domain (Ubl), a C-type lectin-binding domain, a leucine zipper and a HECT domain. HECT domain containing-ubiquitin-protein ligases have more than one member in different genomes; these proteins have been classified into four sub-families (UPL1/2, UPL3/4, UPL5 and UPL6/7). AtUPL5 regulates leaf senescence in Arabidopsis through degradation of the transcription factor WRKY53.	70
340525	cd16108	Ubl_ATG8_like	ubiquitin-like (Ubl) domain found in autophagy-related 8 (ATG8) and similar proteins. The ATG8 family of proteins constitute a single member in Saccharomyces cerevisiae, Atg8p, and multiple homologs in higher eukaryotes; they are multifunctional ubiquitin-like (Ubl) key regulators of autophagy. The ATG8 system is a Ubl conjugation system that is essential for autophagosome formation. In the ATG8 system, a cysteine protease (ATG4) cleaves a C-terminal arginine from ATG8, and then the exposed C-terminal glycine is conjugated to phosphatidylethanolamine (PE) by ATG7, an E1-like enzyme, and ATG3, an E2-like enzyme. The mammalian ATG8 family is classified into three subfamilies: i) MAP1LC3 (microtubule associated protein 1 light chain 3) which includes MAP1LC3A, MAP1LC3B, MAP1LC3B2, and MAP1LC3C, ii) GABARAP (GABA type A receptor associated protein) which includes GABARAP, GABARAPL1, and GABARAPL3, and iii) GABARAPL2 (GABA type A receptor associated protein like 2), also known as GATE-16 (golgi-associated adenosine triphosphatase enhancer of 16 kDa).	85
340526	cd16109	DCX1	Doublecortin-like domain 1. Members of the doublecortin (DCX) gene family are microtubule-associated proteins (MAPs). Microtubules are key components of cytoskeleton that are involved in cell movement, shape determination, division and transport. The DCX gene family consists of eleven paralogs in human and mouse, and its protein domains can occur in double tandem or single repeats. The family represents the first repeat of the DCX domain which has a stable ubiquitin-like tertiary fold. Proteins with DCX double tandem domains in general have roles in microtubule (MT) regulation and signal transduction such as X-linked doublecortin (DCX), retinitis pigmentosa-1 (RP1) and doublecortin-like kinase (DCLK).	85
340527	cd16110	DCX1_RP_like	Doublecortin-like domain 1 found in retinitis pigmentosa (RP)-like protein. RP-like protein family is part of doublecortin (DCX) family. It has double tandem DCX repeats that are associated with retinitis pigmentosa. DCX is a microtubule-associated protein (MAP) with a stable ubiquitin-like tertiary fold. Microtubules are key components of cytoskeleton that are involved in cell movement, shape determination, division and transport.  RP-like proteins are colocalized to the photoreceptor and share a function in outer segment disc morphogenesis.	75
340528	cd16111	DCX_DCLK3	Doublecortin-like domain found in doublecortin-like kinase 3 (DCLK3). DCLK3 is a member of doublecortin (DCX) protein family. It functions as a microtubule-associated protein (MAP). DCLK3 contains only one N-terminal doublecortin domain (DCX), unlike DCLK1 and DCLK2 which each have two conserved DCX domains. The DCX domain has a stable ubiquitin-like tertiary fold. Ubiquitin (Ub) is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair. In addition to microtubule binding domains, DCLK3 has a serine/threonine kinase domain that is similar to Ca/calmodulin-dependent (Cam) protein kinases.	85
340529	cd16112	DCX1_DCX	Doublecortin-like domain 1 found in neuronal migration protein doublecortin (DCX). DCX, also termed doublin or lissencephalin-X (Lis-XDCX), is a microtubule-associated protein (MAP). It belongs to the doublecortin (DCX) family, has double tandem DCX repeats, and is expressed in migrating neurons. Structure studies show that the N-terminal DCX domain has a stable ubiquitin-like fold. DCX is not only a unique MAP in terms of structure, it also interacts with multiple additional proteins. Mutations in the human DCX genes are associated with abnormal neuronal migration, epilepsy, and mental retardation.	89
340530	cd16113	DCX2_DCDC2_like	Doublecortin-like domain 2 found in doublecortin domain-containing protein 2 (DCDC2). DCDC2 is a member of the doublecortin (DCX) family. It is a microtubule-associated protein (MAP) with stable double tandem DCX repeats of a ubiquitin-like tertiary fold. Ubiquitin (Ub) is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair. Microtubules are key components of cytoskeleton that are involved in cell movement, shape determination, division and transport. DCDC2 genetic variation in humans is associated with reading disability, attention deficit hyperactivity disorder (ADHD), and difficulties in mathematics. A genetic variant of DCDC2 associates with dyslexia, a common neurobehavioral disorder of reading. DCDC2 protein interacts with many of the same cytoskeleton related proteins that other members of the DCX family interact with.	74
340531	cd16114	Ubl_SUMO1	ubiquitin-like (Ubl) domain found in small ubiquitin-related modifier 1  (SUMO-1) and similar proteins. SUMO (also known as "Smt3" and "sentrin" in other organisms) resembles ubiquitin (Ub) in structure, ligation to other proteins and the mechanism of ligation. Ubiquitin is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair. Ubiquitination is comprised of a cascade of E1, E2 and E3 enzymes that results in a covalent bond between the C-terminus of Ub and the epsilon-amino group of a substrate lysine. SUMOs, like Ub, are covalently conjugated to lysine residues in a wide variety of target proteins in eukaryotic cells and regulate numerous cellular processes, such as transcription, epigenetic gene control, genomic instability, and protein degradation. Four SUMO paralogs exist in mammals, SUMO1 through SUMO4. SUMO2-SUMO4 are more closely related to each other than they are to SUMO1. SUMO1 is a binding partner of the RAD51/52 nucleoprotein filament proteins, which mediate DNA strand exchange. SUMO1 conjugation to cellular proteins has been implicated in multiple important cellular processes, such as nuclear transport, cell cycle control, oncogenesis, inflammation, and the response to virus infection.	76
340532	cd16115	Ubl_SUMO2_3_4	ubiquitin-like (Ubl) domain found in small ubiquitin-related modifier  SUMO-2, SUMO-3, SUMO-4, and similar proteins. SUMO (also known as "Smt3" and "sentrin" in other organisms) resembles ubiquitin (Ub) in structure, ligation to other proteins and the mechanism of ligation. Ubiquitin is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair. Ubiquitination is comprised of a cascade of E1, E2 and E3 enzymes that results in a covalent bond between the C-terminus of Ub and the epsilon-amino group of a substrate lysine. SUMOs, like Ub, are covalently conjugated to lysine residues in a wide variety of target proteins in eukaryotic cells and regulate numerous cellular processes, such as transcription, epigenetic gene control, genomic instability, and protein degradation. The mammalian SUMOs have four paralogs, SUMO1 through SUMO4. SUMO2 and SUMO3 are more closely related to each other than they are to SUMO1. SUMO2/3 are capable of forming chains on substrate proteins through internal lysine residues. The basic biology of SUMO4 remains unclear. A M55V polymorphism in SUMO4 has been associated with susceptibility to type I diabetes in some genetic studies.	72
340533	cd16116	Ubl_Smt3_like	ubiquitin-like (Ubl) domain found in Saccharomyces cerevisiae ubiquitin-like protein Smt3p and similar proteins. Smt3 (Suppressor of Mif Two 3) was originally isolated as a high-copy suppressor of a mutation in MIF2, the gene of a centromere binding protein in S. cerevisiae. Smt3p is the yeast homolog of small ubiquitin-related modifier (SUMO) proteins that are involved in post-translational protein modification called SUMOylation, covalently attaching to and detaching from other proteins in cells to modify their function. SUMO resembles ubiquitin (Ub) in its structure, its ability to be ligated to other proteins, as well as in the mechanism of ligation. Ubiquitin is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair. Ubiquitination is comprised of a cascade of E1, E2 and E3 enzymes that results in a covalent bond between the C-terminus of Ub and the epsilon-amino group of a substrate lysine. Smt3p plays essential roles in cell-cycle regulation and chromosome segregation in budding yeast. It interacts with different modification enzymes, and regulates their functions through linking covalently to its targets.	74
340534	cd16117	UBX_UBXN4	Ubiquitin regulatory domain X (UBX) found in UBX domain protein 4 (UBXN4) and similar proteins. UBXN4, also termed ERAD (endoplasmic-reticulum-associated protein degradation) substrate erasing protein (erasin), or UBX domain-containing protein 2 (UBXD2), or UBXDC1, belongs to the UBXD family of proteins that contains the ubiquitin regulatory domain X (UBX) with a beta-grasp ubiquitin-like fold, but without the C-terminal double glycine motif. UBX domain is typically located at the carboxyl terminus of proteins, and participates broadly in the regulation of protein degradation. UBXN4 is an endoplasmic reticulum (ER) localized protein that interacts with p97 (also known as VCP or Cdc48) via its UBX domain. Erasin exists in a complex with other p97/VCP-associated factors involved in endoplasmic-reticulum-associated protein degradation (ERAD). p97 is a homohexameric AAA ATPase (ATPase associated with a variety of activities) involved in a variety of functions ranging from cell-cycle regulation to membrane fusion and protein degradation. The overexpression of UBXN4 increases degradation of a classical ERAD substrate and UBXN4 levels are increased in ER stressed cells. Anti-UBXN4 staining is increased in neuropathological lesions in brains of patients with Alzheimer's disease.	77
340535	cd16118	UBX2_UBXN9	Ubiquitin regulatory domain X (UBX) 2 found in UBX domain protein 9 (UBXN9, UBXD9, or ASPSCR1) and similar proteins. UBXN9, also termed tether containing UBX domain for GLUT4 (TUG), or alveolar soft part sarcoma chromosomal region candidate gene 1 protein (ASPSCR1), or alveolar soft part sarcoma locus (ASPL), or renal papillary cell carcinoma protein 17 (RCC17), belongs to the UBXD family of proteins that contains two ubiquitin regulatory domains X (UBX) with a beta-grasp ubiquitin-like fold, but without the C-terminal double glycine motif. UBX domain is typically located at the carboxyl terminus of proteins, and participates broadly in the regulation of protein degradation. In addition, UBXN9 contains an N-terminal ubiquitin-like (Ubl) domain. UBXN9 functions as a cofactor of p97 (also known as VCP or Cdc48), which is a homohexameric AAA ATPase (ATPase associated with a variety of activities) involved in a variety of functions ranging from cell-cycle regulation to membrane fusion and protein degradation. However, high-affinity interacting protein ASPL efficiently promotes p97 hexamer disassembly, resulting in the formation of stable p97:ASPL heterotetramers; the extended UBX domain (eUBX) in ASPL is critical for p97 hexamer disassembly and facilitates the assembly of p97:ASPL heterotetramers. UBXN9 is involved in insulin-stimulated redistribution of the glucose transporter GLUT4 and in assembly of the Golgi apparatus. In addition to GLUT4, UBXN9 also controls vesicle translocation by interacting with insulin-regulated aminopeptidase (IRAP), a transmembrane aminopeptidase. UBXN9 and its budding yeast ortholog, Ubx4p, are multifunctional proteins that share some, but not all functions. Yeast Ubx4p is important for endoplasmic reticulum-associated protein degradation (ERAD) but UBXN9 appears not to share this function.	74
340536	cd16119	UBX_UBXN6	Ubiquitin regulatory domain X (UBX) found in UBX domain protein 6 (UBXN6) and similar proteins. UBXN6, also termed UBX domain-containing protein 1 (UBXD1), and UBXDC2, belongs to the UBXD family of proteins that contains the ubiquitin regulatory domain X (UBX) with a beta-grasp ubiquitin-like fold, but without the C-terminal double glycine motif. UBX domain is typically located at the carboxyl terminus of proteins, and participates broadly in the regulation of protein degradation. UBXN6 acts as a cofactor of p97 (also known as VCP or Cdc48), which is a homohexameric AAA ATPase (ATPase associated with a variety of activities) involved in a variety of functions ranging from cell-cycle regulation to membrane fusion and protein degradation. Unlike other p97 cofactors that bind the N-domain of p97 through their UBX domain, UBXN6 binds p97 in two regions, at the p97 C terminus via a PUB domain and at the p97 N-domain with a short linear interaction motif termed VIM. Its UBX domain is not functional for the binding of p97. The UBXN6-p97 complex regulates the endolysosomal sorting of ubiquitylated plasma membrane protein caveolin-1 (CAV1), as well as the trafficking of ERGIC-53-containing vesicles by controlling the interaction of transport factors with the cytoplasmic tail of ERGIC-53. In addition, UBXN6 is a regulatory component of endoplasmic reticulum-associated degradation (ERAD) that may modulate the adaptor binding to p97.	73
340537	cd16120	UBX_UBXN3B	Ubiquitin regulatory domain X (UBX) found in FAS associated factor 2 (FAF2, also known as UBXN3B) and similar proteins. UBX domain-containing protein 3B (UBXN3B), also termed protein ETEA, or FAF2, or UBX domain-containing protein 8 (UBXD8), belongs to the UBXD family of proteins that contains the ubiquitin regulatory domain X (UBX) with a beta-grasp ubiquitin-like fold, but without the C-terminal double glycine motif. UBX domain is typically located at the carboxyl terminus of proteins, and participates broadly in the regulation of protein degradation. FAF2 functions as a cofactor of p97 (also known as VCP or Cdc48), which is a homohexameric AAA ATPase (ATPase associated with a variety of activities) involved in a variety of functions ranging from cell-cycle regulation to membrane fusion and protein degradation. The p97-UBXD8 complex destabilizes mRNA by promoting release of the ubiquitinated RNA-binding protein HuR from messenger ribonucleoprotein (mRNP). Moreover, FAF2 is the translation product of a highly expressed gene in the T-cells and eosinophils of atopic dermatitis patients compared with those of normal individuals. A yeast two-hybrid assay showed that FAF2 can interact with Fas.	80
340538	cd16121	FERM_F1_SNX17	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in sorting nexin protein 17 (SNX17). SNX17 is a member of the family of cytoplasmic sorting nexin adaptor proteins that regulate endosomal trafficking of cell surface proteins. It localizes to early endosomes, and plays an important role in mediating endocytic internalization, recycling, and/or protection from lysosomal degradation of NPxY-motif containing cell surface proteins including amyloid precursor protein (APP), P-selectin, beta1-integrin, low density lipoprotein receptor (LDLR), LDLR related protein (Lrp1), ApoER2, and FEEL1. SNX17 also affects T cell activation by regulating T cell receptor and integrin recycling. SNX17 contains a PX (Phox homology) domain and a FERM (Band 4.1, ezrin, radixin, moesin) domain. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N).	93
340539	cd16122	FERM_F1_SNX31	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in sorting nexin protein 31 (SNX31). SNX31 is a member of the family of cytoplasmic sorting nexin adaptor proteins that regulate endosomal trafficking of cell surface proteins. It is a novel sorting nexin associated with the uroplakin-degrading multivesicular bodies in terminally differentiated urothelial cells. SNX31 binds multiple beta integrin cytoplasmic domains and regulates beta1 integrin surface levels and stability. SNX31 contains a PX (Phox homology) domain and a FERM (Band 4.1, ezrin, radixin, moesin) domain. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N).	98
340540	cd16123	RA_RASSF7_like	Ras-associating (RA) domain found in Ras-association domain family members, RASSF7, RASSF8, RASSF9, and RASSF10. The RASSF family of proteins shares a conserved RalGDS/AF6 Ras association (RA) domain either in the C-terminus (RASSF1-6) or N-terminus (RASSF7-10). RASSF7-10 lacks a conserved SARAH (Salvador/RASSF/Hpo) motif adjacent to the RA domain that is found in members of the RASSF1-6 family. The structural differences between the C-terminus and N-terminus RASSF subgroups have led to the suggestion that they are two distinct families. RA domain has the beta-grasp ubiquitin-like (Ubl) fold with low sequence similarity to ubiquitin (Ub). Ras proteins are small GTPases that are involved in cellular signal transduction. The N-terminus RASSF proteins are potential Ras effectors that have been linked to key biological processes, including cell death, proliferation, microtubule stability, promoter methylation, vesicle trafficking and response to hypoxia.	81
340541	cd16124	RA_GRB7_10_14	Ras-associating (RA) domain found in growth factor receptor-bound (Grb) protein 7/10/14. The RA domain is highly conserved among the members of the Grb proteins family which includes Grb7, Grb10 and Grb14. Grb7/10/14 are multi-domain cytoplasmic adaptor proteins that are recruited to activated receptor tyrosine kinases. RA domain has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin (Ub). Ub is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair in eukaryotes. Grb7 and its related family members Grb10 and Grb14 share a conserved domain architecture that includes an amino-terminal proline-rich region, a central segment termed the GM region (for Grb and Mig) which includes the RA, PIR, and pleckstrin homology (PH) domains, and a carboxyl-terminal SH2 domain. The tandem RA and PH domains of Grb7/10/14 are also found in a second adaptor family, Rap1-interacting adaptor molecule (RIAM) and lamellipodin, which is involved in actin-cytoskeleton rearrangement. Grb7/10/14 family proteins are phosphorylated on serine/threonine as well as tyrosine residues and are mainly localized to the cytoplasm.	85
340542	cd16125	RA_ASPP1_2	Ras-associating (RA) domain found in apoptosis-stimulating protein of p53 (ASPP) 1 and 2. The ASPP protein (apoptosis-stimulating protein of p53; also called ankyrin repeat-, Src homology 3 domain- and Pro-rich region-containing protein) plays a critical role in regulating apoptosis. The ASPP family consists of three members, ASPP1, ASPP2 and iASPP, all of which bind to p53 and regulate p53-mediated apoptosis. ASPP1 and ASPP2, have a RA domain at their N-terminus and have pro-apoptotic functions, while iASPP is involved in anti-apoptotic responses. RA domain-containing proteins function by interacting with Ras proteins directly or indirectly and are involved in several different functions ranging from tumor suppression to being oncoproteins. Ras proteins are small GTPases that are involved in cellular signal transduction. The RA domain has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin.	80
340543	cd16126	Ubl_HR23B	ubiquitin-like (Ubl) domain found in UV excision repair protein RAD23 homolog B (HR23B). HR23B, also termed xeroderma pigmentosum group C (XPC) repair-complementing complex 58 kDa protein (p58), is tightly complexed with XPC protein to form the XPC-HR23B complex. Although it displays a high affinity for both single- and double-stranded DNA, the XPC-HR23B complex functions as a global genome repair (GGR)-specific repair factor that is specifically involved in global genome but not transcription-coupled nucleotide excision repair (NER). HR23B also interacts specifically with S5a subunit of the human 26S proteasome, and plays an important role in shuttling ubiquitinated cargo proteins to the proteasome. HR23B contains an N-terminal ubiquitin-like (Ubl) domain that binds proteasomes and two C-terminal ubiquitin-associated (UBA) domains that bind ubiquitin or multi-ubiquitinated substrates. In addition, it has a XPC protein-binding domain that might be necessary for its efficient NER function.	78
340544	cd16127	Ubl_ATG8_GABARAP_like	ubiquitin-like (Ubl) domain found in gamma-aminobutyric acid receptor-associated protein (GABARAP) and similar proteins; sub-family of the autophagy-related 8 (ATG8) protein family. GABARAP (also termed GABA(A) receptor-associated protein, ATG8A, or MM46) has been implicated in intracellular protein trafficking. It is a cytosolic protein, localized to transport vesicles, the Golgi network and the endoplasmic reticulum. It interacts with the intracellular domain of the gamma2 subunit of GABA(A) receptors, and thus, functions as a trafficking modulator implicated in the intracellular trafficking of GABA(A) receptor. GABARAP also acts as a Ubl modifier belonging to the ATG8 (autophagy-related 8) protein family which is essential for autophagosome biogenesis and maturation. GABARAP recruits phosphatidylinositol 4-kinase II alpha (PI4KIIalpha) as a specific downstream effector, and regulates phosphatidylinositol 4-phosphate (PI4P)-dependent autophagosome lysosome fusion. This sub-family also includes GABARAPL1 (also termed GABA(A) receptor-associated protein-like, or GEC1), GABARAPL2/GATE-16, and GABARAPL3. GABARAPL1 has been involved in the intracellular transport of receptors via interactions with tubulin and GABA(A) or kappa opioid receptors. GABARAPL1 is also a Ubl modifier that functions as a mediator involved in androgen-regulated autophagy process. It is transcriptionally modulated by androgen receptor (AR) and has a repressive role in autophagy. In addition, GABARAPL1 is required for increased membrane expression of epidermal growth factor receptor (EGFR) during hypoxia, suggesting a possible role in the trafficking of these membrane proteins. GABARAPL1 may also play a key role in several important biological processes such as cancer or neurodegenerative diseases. Low expression of GABARAPL1 is associated with poor prognosis of patients with hepatocellular carcinoma.	107
340545	cd16128	Ubl_ATG8	ubiquitin-like (Ubl) domain found in Saccharomyces cerevisiae Atg8p and related proteins; sub-family of the autophagy-related 8 (ATG8) family. The ATG8 family of proteins constitutes a single member in Saccharomyces cerevisiae, Atg8p, and multiple homologs in higher eukaryotes. These proteins are multifunctional ubiquitin-like (Ubl) key regulators of autophagy. ATG8 is characterized by a C-terminal ubiquitin-like (Ubl) domain with a short N-terminal extension. The covalent attachment of ATG8 to phosphatidylethanolamine (PtdEth) at the autophagosomal membrane places it at a crucial juncture during autophagosome formation. ATG Ubl proteins such as Saccharomyces cerevisiae Atg8p undergo a unique Ubl conjugation, a process essential for autophagosome formation.	103
340546	cd16129	Ubl_ATG8_MAP1LC3	ubiquitin-like (Ubl) domain found in microtubule-associated protein 1 light chain 3 (MAP1LC3). Autophagy is an essential intracellular process that targets large protein complexes, bacterial pathogens, and organelles for degradation. MAP1LC3 (also known as LC3) has a ubiquitin-like (Ubl) fold and belongs to the ATG8 autophagy protein family. A Ubl conjugation of MAP1LC3 by the phospholipid phosphatidylethanolamine (PE) is an essential process for the formation of autophagosomes. MAP1LC3 is cleaved by the cysteine protease ATG4 and is then conjugated with PE by E1-like enzyme ATG7 and ATG3, an E2-like enzyme. The Ubl conversion of MAP1LC3 is known as a marker of autophagy-induction. This sub-family includes MAP1LC3A, MAP1LC3B, and MAP1LC3C, each encoded by a different MAP1LC3 gene.	105
340547	cd16130	RA_Rin3	Ras-associating (RA) domain found in Ras and Rab interactor 3 (Rin3). Rin3, also termed Ras interaction/interference protein 3, is a RAS effector and a RAB5-activating guanine nucleotide exchange factor (GEF) specifically for GTPase Rab31. It functions as a negative regulator of mast cell responses to Stem Cell Factor (SCF). Rin3 contains the Vps9p-like guanine nucleotide exchange factor and Ras-association (RA) domains.	88
340548	cd16131	RA_Rin2	Ras-associating (RA) domain found in Ras and Rab interactor 2 (Rin2). Rin2, also termed Ras association domain family 4, or Ras inhibitor JC265, or Ras interaction/interference protein 2, is a Rab5 GDP/GTP exchange factor with the Vps9p-like guanine nucleotide exchange factor and Ras-association (RA) domains. Rin2 connects three GTPases, R-Ras, Rab5 and Rac1, to promote endothelial cell adhesion through the regulation of integrin internalization and Rac1 activation. Rin2 is involved in the regulation of Rab5-mediated early endocytosis. The deficiency of Rin2 can cause the RIN2 syndrome, an autosomal recessive connective tissue disorder.	91
340549	cd16132	RA_RASSF10	Ras-associating (RA) domain found in N-terminal Ras-association domain family 10 (RASSF10). RASSF10 is a member of a family of N-terminus RASSF7-10 proteins. RASSF7-10 has an RA domain at the N-terminus and lacks a conserved SARAH (Salvador/RASSF/Hpo) motif adjacent to the RA domain that is found in members of the RASSF1-6 family. RA domain of N-terminal RASSF protein family has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin. RASSF10 is expressed in a wide variety of tissues and its expression in human thyroid, pancreas, placenta, heart, lung and kidney has been observed. RASSF10 is the most frequently methylated of the N-terminal RASSFs in some cancers such as in childhood acute lymphoblastic leukemia and both, thyroid cancer cell lines and primary thyroid carcinomas.	102
340550	cd16133	RA_RASSF9	Ras-associating (RA) domain of N-terminal Ras-association domain family 9 (RASSF9). RASSF9, also termed PAM COOH-terminal interactor protein 1 (P-CIP1), or peptidylglycine alpha-amidating monooxygenase COOH-terminal interactor, is a member of N-terminus RASSF7-10 protein family. RASSF7-10 has an RA domain at the N-terminus and lacks a conserved SARAH (Salvador/RASSF/Hpo) motif adjacent to the RA domain that is found in members of the RASSF1-6 family. The RA domain of the N-terminal RASSF proteins family has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin. RASSF9 was formerly known as PAM COOH-terminal interactor-1 (P-CIP1) because of its interaction with peptidylglycine alpha-amidating mono-oxygenase (PAM) and possibility of its role in regulating the trafficking of integral membrane PAM. RASSF9 is widely expressed in multiple organs such as testis, kidney, skeletal muscle, liver, lung, brain, and heart. Cloned RASSF9 showed preferential binding to N-Ras and K-Ras.	93
340551	cd16134	RA_RASSF8	Ras-associating (RA) domain found in N-terminal Ras-association domain family 8 (RASSF8). RASSF8, also termed carcinoma-associated protein HOJ-1, is a member of the N-terminus RASSF7-10 protein family. RASSF7-10 has an RA-domain at the N-terminus and lacks a conserved SARAH (Salvador/RASSF/Hpo) motif adjacent to the RA domain that is found in members of the RASSF1-6 family. The RA domain of N-terminal RASSF proteins family has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin. RASSF8 has been described as a potential tumor suppressor. RASSF8 might have a role in the regulation of cell-cell adhesion and cell growth.	82
340552	cd16135	RA_RASSF7	Ras-associating (RA) domain found in N-terminal Ras-association domain family 7 (RASSF7). RASSF7, also termed HRAS1-related cluster protein 1, is a member of the N-terminus RASSF7-10 protein family. RASSF7-10 has an RA-domain at the N-terminus and lacks a conserved SARAH (Salvador/RASSF/Hpo) motif adjacent to the RA domain that is found in members of the RASSF1-6 family. The RA domain of N-terminal RASSF protein family has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin. RASSF7 is a potential Ras effector as its function has been linked to some key biological processes including the regulation of cell death and proliferation; for example, RASSF7 is up-regulated in pancreatic cancer.	83
340553	cd16136	RA_MRL_Lpd	Ras-associating (RA) domain found in the adapter protein lamellipodin (Lpd). Lpd, also termed Ras-associated and pleckstrin homology domains-containing protein 1 (RAPH1), or amyotrophic lateral sclerosis 2 chromosomal region candidate gene 18 protein, or amyotrophic lateral sclerosis 2 chromosomal region candidate gene 9 protein, or proline-rich EVH1 ligand 2 (PREL-2), or protein RMO1, is a member of the MRL (Mig10/RIAM/Lpd) family of proteins that regulate cell migration and promote lamellipodia protrusion in fibroblasts by interacting with Ena/VASP proteins. MRL proteins share a common structural architecture, including a central structural unit consisting of an RA domain and a pleckstrin homology (PH) domain, an upstream coiled-coil region, and a number of polyproline motifs. Lpd also contains a helical region at the amino terminus for talin binding. RA domain-containing proteins function by interacting with Ras proteins directly or indirectly and are involved in several different functions ranging from tumor suppression to being oncoproteins. RA domain has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin (Ub). Ub is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair. RA and PH domain in Lpd form a tandem domain pair (RA-PH), and serve tightly coordinated functions in both Ras GTPase signaling via the RA domain and membrane translocalization via the PH domain. Lpd also exhibits other unique enzymatic functions including its catalytic activity of butyrylcholinesterase, a potent therapeutic treatment targeting cocaine abuse.	90
340554	cd16137	RA_MRL_RIAM	Ras-associating (RA) domain found in Rap1-GTP-interacting adapter molecule (RIAM). RIAM, also termed amyloid beta A4 precursor protein-binding family B member 1-interacting protein, or APBB1-interacting protein 1, or proline-rich EVH1 ligand 1 (PREL-1), or proline-rich protein 73, or retinoic acid-responsive proline-rich protein 1 (RARP-1), is a member of the MRL (Mig10/RIAM/Lpd) family of proteins that regulate cell migration and promote lamellipodia protrusion in fibroblasts by interacting with Ena/VASP proteins. RIAM regulates cell migration and mediates Rap1-induced cell adhesion. MRL proteins share a common structural architecture, including a central structural unit consisting of an RA domain and a pleckstrin homology (PH) domain, an upstream coiled-coil region, and a number of polyproline motifs. RA domain-containing proteins function by interacting with Ras proteins directly or indirectly and are involved in several different functions ranging from tumor suppression to being oncoproteins. RA domain has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin (Ub). Ub is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair. RIAM also contains a helical region at the amino terminus for talin binding. RA and PH form a tandem domain pair (RA-PH), and serve tightly coordinated functions in both Ras GTPase signaling via the RA domain and membrane translocalization via the PH domain.	89
340555	cd16138	RA_MRL_MIG10	Ras-associating (RA) domain found in Caenorhabditis elegans abnormal cell migration protein 10 (MIG-10) and similar proteins. MIG-10 is lamellipodin (Lpd) found in C. elegans. It stabilizes invading cell adhesion to basement membrane and is a negative transcriptional target of Evi-1 proto-oncogene, EGL-43, in C. elegans. It also shows netrin-independent functions and is a transcriptional target of FOS-1A, a transcription factor that promotes basement membrane breaching, during anchor cell invasion in C. elegans. MIG-10 is a member of MRL (Mig10/RIAM/Lpd) family of proteins that is involved in antero-posterior migration of embryonic neurons CAN (canal-associated neurons), ALM (anterior lateral microtubule cells) and HSN (hermaphrodite-specific neurons). MRL proteins share a common structural architecture, including a central structural unit consisting of an RA domain and a pleckstrin homology (PH) domain, an upstream coiled-coil region, and a number of polyproline motifs. RA domain-containing proteins function by interacting with Ras proteins directly or indirectly and are involved in several different functions ranging from tumor suppression to being oncoproteins. RA domain has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin (Ub). Ub is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair in eukaryotes. RA and PH form a tandem domain pair (RA-PH), and serve tightly coordinated functions in both Ras GTPase signaling via the RA domain and membrane translocalization via the PH domain.	86
340556	cd16139	RA_GRB14	Ras-associating (RA) domain found in growth factor receptor-bound (Grb) protein 14. Grb14, a member of cytoplasmic adaptor proteins, is a tissue-specific negative regulator of insulin signaling. RA domain has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin (Ub). Ub is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair. A novel function of Grb14 RA domain is to interact with the nucleotide binding pocket of a cyclic nucleotide gated channel alpha subunit (CNGA1) and inhibit its activity. Grb7 and its related family members Grb10 and Grb14 share a conserved domain architecture that includes an amino-terminal proline-rich region, a central segment termed the GM region (for Grb and Mig) which includes the RA, PIR, and PH domains, and a carboxyl-terminal SH2 domain. Grb7/10/14 family proteins are phosphorylated on serine/threonine as well as tyrosine residues and are mainly localized to the cytoplasm.	85
340557	cd16140	RA_GRB7	Ras-associating (RA) domain found in growth factor receptor-bound (Grb) protein 7. GRB7, also termed B47, or epidermal growth factor receptor GRB-7, or GRB7 adapter protein, is a signal-transducing cytoplasmic adaptor protein that is co-opted by numerous tyrosine kinases involved in various cellular signaling and functions. Grb7 and its related family members Grb10 and Grb14 share a conserved domain architecture that includes an amino-terminal proline-rich region, a central segment termed the GM region (for Grb and Mig) which includes the RA, PIR, and pleckstrin homology (PH) domains, and a carboxyl-terminal SH2 domain.  The tandem RA and PH domains of  Grb7/10/14 are also found in a second adaptor family, Rap1-interacting adaptor molecule (RIAM) and lamellipodin, which is involved in actin-cytoskeleton rearrangement. Grb7/10/14 family proteins are phosphorylated on serine/threonine as well as tyrosine residues and are mainly localized to the cytoplasm. Grb7 could interact with activated N-Ras in transfected cells.	88
340558	cd16141	RA_GRB10	Ras-associating (RA) domain found in growth factor receptor-bound (Grb) protein 10. GRB10, also termed insulin receptor-binding protein Grb-IR, is a multi-domain cytoplasmic adaptor protein that binds to the insulin-like growth factor 1 receptor (IGF-1R) and inhibits insulin signaling. Grb10 and its related family members Grb7 and Grb14 share a conserved domain architecture that includes an amino-terminal proline-rich region, a central segment termed the GM region (for Grb and Mig) which includes the RA, PIR, and pleckstrin homology (PH) domains, and a carboxyl-terminal SH2 domain. The tandem RA and PH domains of Grb7/10/14 are also found in a second adaptor family, Rap1-interacting adaptor molecule  (RIAM) and lamellipodin, which is involved in actin-cytoskeleton rearrangement. Grb7/10/14 family proteins are phosphorylated on serine/threonine as well as tyrosine residues and are mainly localized to the cytoplasm. Grb14 binds to both GTPase-defective mutant Rab5 as well as CNGA1, whereas Grb10 binds only to GTP-bound form of active Rab5.	92
293761	cd16142	ARS_like	uncharacterized arylsulfatase subfamily. Sulfatases catalyze the hydrolysis of sulfate esters from a wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatases include the cycling of sulfur in the environment, the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and the remodeling of sulfated glycosaminoglycans in the extracellular space. The sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases.	372
293762	cd16143	ARS_like	uncharacterized arylsulfatase subfamily. Sulfatases catalyze the hydrolysis of sulfate esters from a wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatases include the cycling of sulfur in the environment, the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and the remodeling of sulfated glycosaminoglycans in the extracellular space. The sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases.	395
293763	cd16144	ARS_like	uncharacterized arylsulfatase subfamily. Sulfatases catalyze the hydrolysis of sulfate esters from a wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatases include the cycling of sulfur in the environment, the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and the remodeling of sulfated glycosaminoglycans in the extracellular space. The sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases.	421
293764	cd16145	ARS_like	uncharacterized arylsulfatase subfamily. Sulfatases catalyze the hydrolysis of sulfate esters from a wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatases include the cycling of sulfur in the environment, the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and the remodeling of sulfated glycosaminoglycans in the extracellular space. The sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases.	415
293765	cd16146	ARS_like	uncharacterized arylsulfatase. Sulfatases catalyze the hydrolysis of sulfate esters from a wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatases include the cycling of sulfur in the environment, the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and the remodeling of sulfated glycosaminoglycans in the extracellular space. The sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases.	409
293766	cd16147	G6S	glucosamine (N-acetyl)-6-sulfatase (G6S, GNS) and sulfatase 1 (SULF1). N-acetylglucosamine-6-sulfatase, also known as glucosamine (N-acetyl)-6-sulfatase, hydrolyzes the 6-sulfate groups of the N-acetyl-D-glucosamine 6-sulfate units of heparan sulfate and keratan sulfate. Deficiency of N-acetylglucosamine-6-sulfatase results in Sanfilippo Syndrome type IIId or Mucopolysaccharidosis III (MPS-III), a rare autosomal recessive lysosomal storage disease. SULF1 encodes an extracellular heparan sulfate endosulfatase that removes 6-O-sulfate groups from heparan sulfate chains of heparan sulfate proteoglycans (HSPGs).	396
293767	cd16148	sulfatase_like	uncharacterized sulfatase subfamily. Sulfatases catalyze the hydrolysis of sulfate esters from a wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatases include the cycling of sulfur in the environment, the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and the remodeling of sulfated glycosaminoglycans in the extracellular space. Sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases.	271
293768	cd16149	sulfatase_like	uncharacterized sulfatase subfamily. Sulfatases catalyze the hydrolysis of sulfate esters from a wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatases include the cycling of sulfur in the environment, the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and the remodeling of sulfated glycosaminoglycans in the extracellular space. Sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases.	257
293769	cd16150	sulfatase_like	uncharacterized sulfatase subfamily. Sulfatases catalyze the hydrolysis of sulfate esters from a wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatases include the cycling of sulfur in the environment, the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and the remodeling of sulfated glycosaminoglycans in the extracellular space. Sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases.	423
293770	cd16151	sulfatase_like	uncharacterized sulfatase subfamily. Sulfatases catalyze the hydrolysis of sulfate esters from a wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatases include the cycling of sulfur in the environment, the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and the remodeling of sulfated glycosaminoglycans in the extracellular space. Sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases.	377
293771	cd16152	sulfatase_like	uncharacterized sulfatase subfamily. Sulfatases catalyze the hydrolysis of sulfate esters from a wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatases include the cycling of sulfur in the environment, the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and the remodeling of sulfated glycosaminoglycans in the extracellular space. Sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases.	373
293772	cd16153	sulfatase_like	uncharacterized sulfatase subfamily. Sulfatases catalyze the hydrolysis of sulfate esters from a wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatases include the cycling of sulfur in the environment, the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and the remodeling of sulfated glycosaminoglycans in the extracellular space. Sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases.	282
293773	cd16154	sulfatase_like	uncharacterized sulfatase subfamily. Sulfatases catalyze the hydrolysis of sulfate esters from a wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatases include the cycling of sulfur in the environment, the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and the remodeling of sulfated glycosaminoglycans in the extracellular space. Sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases.	372
293774	cd16155	sulfatase_like	uncharacterized sulfatase subfamily. Sulfatases catalyze the hydrolysis of sulfate esters from a wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatases include the cycling of sulfur in the environment, the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and the remodeling of sulfated glycosaminoglycans in the extracellular space. Sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases.	372
293775	cd16156	sulfatase_like	uncharacterized sulfatase subfamily; includes Escherichia coli YidJ. Sulfatases catalyze the hydrolysis of sulfate esters from a wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatases include the cycling of sulfur in the environment, the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and the remodeling of sulfated glycosaminoglycans in the extracellular space. Sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases.	468
293776	cd16157	GALNS	galactosamine-6-sulfatase; also known as N-acetylgalactosamine-6-sulfatase (GALNS). Lysosomal galactosamine-6-sulfatase removes sulfate groups from a terminal N-acetylgalactosamine-6-sulfate (or galactose-6-sulfate) in mucopolysaccharides such as keratan sulfate and chondroitin-6-sulfate. Defects in GALNS lead to accumulation of substrates, resulting in the development of the lysosomal storage disease mucopolysaccharidosis IV A.	466
293777	cd16158	ARSA	Arylsulfatase A or cerebroside-sulfatase. Arylsulfatase A breaks down sulfatides, namely cerebroside 3-sulfate into cerebroside and sulfate. It is a member of the sulfatase family. The arylsulfatase A was located in lysosome-like structures and transported to dense lysosomes in a mannose 6-phosphate receptor-dependent manner. Deficiency of arylsulfatase A leads to the accumulation of cerebroside sulfate, which causes a lethal progressive demyelination. Arylsulfatase A requires the posttranslational oxidation of the -CH2SH group of a conserved cysteine to an aldehyde, yielding a formylglycine to be in an active form.	479
293778	cd16159	ES	Estrone sulfatase. Human estrone sulfatase (ES) is responsible for maintaining high levels of the active estrogen in tumor cells. ES catalyzes the hydrolysis of E1 sulfate, which is a component of the three-enzyme system that has been implicated in intracrine biosynthesis of estradiol. It is associated with the membrane of the endoplasmic reticulum (ER). The structure of ES includes two antiparallel alpha helices that protrude from the roughly spherical molecule. These highly hydrophobic helices anchor the functional domain on the membrane surface facing the ER lumen.	521
293779	cd16160	spARS_like	sea urchin arylsulfatase-like. This family includes sea urchin arylsulfatase and its homologous proteins. Sulfatases catalyze the hydrolysis of sulfate esters from a wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatases include the cycling of sulfur in the environment, the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and the remodeling of sulfated glycosaminoglycans in the extracellular space. Sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases.	445
293780	cd16161	ARSG	arylsulfatase G. Arylsulfatase G is a subfamily of sulfatases which specifically hydrolyze sulfate esters in a wide variety of substrates such as glycosaminoglycans, steroid sulfates, or sulfolipids. ARSG has arylsulfatase activity toward different pseudosubstrates like p-nitrocatechol sulfate and 4-methylumbelliferyl sulfate. An active site Cys is post-translationally converted to the critical active site C(alpha)-formylglycine. ARSG mRNA expression was found to be tissue-specific with highest expression in liver, kidney, and pancreas, suggesting a metabolic role of ARSG that might be associated with a non-classified lysosomal storage disorder.	383
293881	cd16162	OCRE_RBM5_like	OCRE domain found in RNA-binding protein RBM5, RBM10, and similar proteins. RBM5 is a known modulator of apoptosis. It may also act as a tumor suppressor or an RNA splicing factor; it specifically binds poly(G) RNA. RBM10, a paralog of RBM5, may play an important role in mRNA generation, processing, and degradation in several cell types. The rat homolog of human RBM10 is protein S1-1, a hypothetical RNA binding protein with poly(G) and poly(U) binding capabilities. Both RBM5 and RBM10 contain two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), an OCtamer REpeat (OCRE) domain, two C2H2-type zinc fingers, and a G-patch/D111 domain.	56
293882	cd16163	OCRE_RBM6	OCRE domain found in RNA-binding protein 6 (RBM6) and similar proteins. RBM6, also called lung cancer antigen NY-LU-12, or protein G16, or RNA-binding protein DEF-3, has been predicted to be a nuclear factor based on its nuclear localization signal. It shows high sequence similarity to RNA-binding protein 5 (RBM5 or LUCA15 or NY-REN-9). Both specifically bind poly(G) RNA. RBM6 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), an OCtamer REpeat (OCRE) domain, two C2H2-type zinc fingers, a nuclear localization signal, and a G-patch/D111 domain. In contrast to RBM5, RBM6 has an additional unique domain, the POZ (poxvirus and zinc finger) domain, which may be involved in protein-protein interactions and inhibit binding of target sequences by zinc fingers.	57
293883	cd16164	OCRE_VG5Q	OCRE domain found in angiogenic factor VG5Q and similar proteins. VG5Q, also called angiogenic factor with G patch and FHA domains 1 (AGGF1), or G patch domain-containing protein 7, or vasculogenesis gene on 5q protein, functions as a potent angiogenic factor in promoting angiogenesis through interacting with TWEAK (also known as TNFSF12), which is a member of the tumor necrosis factor (TNF) superfamily that induces angiogenesis in vivo. VG5Q can bind to the surface of endothelial cells and promote cell proliferation, suggesting that it may act in an autocrine fashion. The chromosomal translocation t(5;11) and the E133K variant in VG5Q are associated with Klippel-Trenaunay syndrome (KTS), a disorder characterized by diverse effects in the vascular system. In addition to a forkhead-associated (FHA) domain and a G-patch motif, VG5Q contains an N-terminal OCtamer REpeat (OCRE) domain that is characterized by a 5-fold, imperfectly repeated octameric sequence.	54
293884	cd16165	OCRE_ZOP1_plant	OCRE domain found in Zinc-finger and OCRE domain-containing Protein 1 (ZOP1) and similar proteins found in plant. ZOP1 is a novel plant-specific nucleic acid-binding protein required for both RNA-directed DNA methylation (RdDM) and pre-mRNA splicing. It promotes RNA polymerase IV (Pol IV)-dependent siRNA accumulation, DNA methylation, and transcriptional silencing. As a pre-mRNA splicing factor, ZOP1 associates with several typical splicing proteins as well as with RNA polymerase II (RNAP II and Pol II). It also shows both RdDM-dependent and -independent roles in transcriptional silencing. ZOP1 contains an N-terminal C2H2-type ZnF domain and an OCtamer REpeat (OCRE) domain that is usually related to RNA processing.	55
293885	cd16166	OCRE_SUA_like	OCRE domain found in Suppressor of ABI3-5 (SUA) and similar proteins. SUA is an RNA-binding protein located in the nucleus and expressed in all plant tissues. It functions as a splicing factor that influences seed maturation by controlling alternative splicing of ABI3. The suppression of the cryptic ABI3 intron indicates a role of SUA in mRNA processing. SUA also interacts with the prespliceosomal component U2AF65, the larger subunit of the conserved pre-mRNA splicing factor U2AF. SUA contains two RNA recognition motifs surrounding a zinc finger domain, an OCtamer REpeat (OCRE) domain, and a Gly-rich domain close to the C-terminus.	54
293886	cd16167	OCRE_RBM10	OCRE domain found in RNA-binding protein 10 (RBM10) and similar proteins. RBM10, also called G patch domain-containing protein 9, or RNA-binding protein S1-1 (S1-1), is a paralogue of putative tumor suppressor RNA-binding protein 5 (RBM5 or LUCA15 or H37). It may play an important role in mRNA generation, processing and degradation in several cell types. The rat homolog of human RBM10 is protein S1-1, a hypothetical RNA binding protein with poly(G) and poly(U) binding capabilities. RBM10 contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), an OCtamer REpeat (OCRE) domain, two C2H2-type zinc fingers, and a G-patch/D111 domain.	64
293887	cd16168	OCRE_RBM5	OCRE domain found in RNA-binding protein 5 (RBM5) and similar proteins. RBM5 is also called protein G15, H37, putative tumor suppressor LUCA15, or renal carcinoma antigen NY-REN-9. It is a known modulator of apoptosis. It acts as a tumor suppressor or an RNA splicing factor. RBM5 shows high sequence similarity to RNA-binding protein 6 (RBM6 or NY-LU-12 or g16 or DEF-3). Both of them specifically bind poly(G) RNA. RBM5 contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), an OCtamer REpeat (OCRE) domain, two C2H2-type zinc fingers, a nuclear localization signal, and a G-patch/D111 domain.	56
320085	cd16169	Tau138_eWH	extended winged-helix domain of tau138 and related proteins. Tau138 is one of three subunits of the tauB subcomplex of yeast transcription factor IIIC. This extended winged-helix domain of tau138 appears to interact with the TPR (tetratricopeptide repeat) array of tauA subunit tau131, and may therefore play a role in linking tauA, tauB, and TFIIIB to regulate the formation of the RNA polymerase III pre-initiation complex.	97
320084	cd16170	MvaT_DBD	DNA-binding domain of the bacterial xenogeneic silencer MvaT. MvaT is a xenogeneic silencer conserved in Pseudomonas which assists in distinguishing foreign from self DNA. It prefers binding to flexible DNA segments with multiple TpA steps, and forms nucleoprotein filaments through cooperative polymerization.	42
293781	cd16171	ARSK	arylsulfatase family, member K (ARSK). ARSK is a lysosomal sulfatase which exhibits an acidic pH optimum for catalytic activity against arylsulfate substrates. Other names for ARSK include arylsulfatase K and TSULF. Sulfatases catalyze the hydrolysis of sulfate esters from a wide range of substrates, including steroids, carbohydrates and proteins. Sulfate esters may be formed from various alcohols and amines. The biological roles of sulfatases include the cycling of sulfur in the environment, the degradation of sulfated glycosaminoglycans and glycolipids in the lysosome, and the remodeling of sulfated glycosaminoglycans in the extracellular space. Sulfatases are essential for human metabolism. At least eight human monogenic diseases are caused by the deficiency of individual sulfatases.	366
293930	cd16172	TorS_sensor_domain	sensor domain of the sensor histidine kinase TorS. TorS is part of the trimethylamine-N-oxide (TMAO) reductase (Tor) pathway, which consists of TorT, a periplasmic binding protein that binds TMAO; TorS, a sensor histidine kinase that forms a complex with TorT; and TorR, the response regulator. The Tor pathway is involved in regulating a cellular response to TMAO, a terminal electron acceptor in anaerobic respiration. TorS consists of a periplasmic sensor domain, as well as a HAMP domain, a histidine kinase domain, and a receiver domain.	261
320081	cd16173	EFh_MICU1	EF-hand, calcium binding motif, found in calcium uptake protein 1, mitochondrial (MICU1) and similar proteins. MICU1, also termed atopy-related autoantigen CALC (ara CALC), or calcium-binding atopy-related autoantigen 1 (CBARA1), or Hom s 4, or EFHA3, localizes to the inner mitochondrial membrane (IMM). It functions as a gatekeeper of the mitochondrial calcium uniporter (MCU) and regulates MCU-mediated mitochondrial Ca2+ uptake, which is essential for maintaining mitochondrial homoeostasis. MICU1 and its paralog, MICU2, are physically associated within the uniporter complex and are co-expressed across all tissues. They may operate together with MCU to regulate the channel. The mutations in MICU1 are associated with neuromuscular abnormalities in children. MICU1 contains an N-terminal mitochondrial targeting sequence (MTS) as well as two evolutionarily conserved canonical Ca2+-binding EF-hands separated by a long stretch of residues predicted to form alpha-helices.	153
320082	cd16174	EFh_MICU2	EF-hand, calcium binding motif, found in calcium uptake protein 2, mitochondrial (MICU2) and similar proteins. MICU2, also termed EF-hand domain-containing family member A1 (EFHA1), is a mitochondrial-localized paralog of MICU1. MICU2 and its paralog, MICU1, are physically associated within the mitochondrial calcium uniporter (MCU) complex and are co-expressed across all tissues. They may operate together with MCU to regulate the channel. At present, the precise molecular function of MICU2 remains unclear. It may play possible roles in Ca2+ sensing and regulation of MCU, calcium buffering with a secondary impact on transport or assembly and stabilization of MCU. MICU2 contains an N-terminal mitochondrial targeting sequence (MTS) as well as two evolutionarily conserved canonical Ca2+-binding EF-hands separated by a long stretch of residues predicted to form alpha-helices.	154
320083	cd16175	EFh_MICU3	EF-hand, calcium binding motif, found in calcium uptake protein 3, mitochondrial (MICU3) and similar proteins. MICU3, also termed EF-hand domain-containing family member A2 (EFHA2), is a paralog of MICU1 and notably found in the central nervous system (CNS) and skeletal muscle. At present, the precise molecular function of MICU3 remains unclear. It likely has a role in mitochondrial calcium handling. MICU3 contains an N-terminal mitochondrial targeting sequence (MTS) as well as two evolutionarily conserved canonical Ca2+-binding EF-hands separated by a long stretch of residues predicted to form alpha-helices.	128
320076	cd16176	EFh_HEF_CB	EF-hand, calcium binding motif, found in calbindin (CB). CB, also termed calbindin D28, or D-28K, or avian-type vitamin D-dependent calcium-binding protein, is a unique intracellular calcium binding protein that functions as both a calcium sensor and buffer in eukaryotic cells. It undergoes a conformational change upon calcium binding and protects cells against insults of high intracellular calcium concentration. CB is highly expressed in brain and sensory neurons. It plays essential roles in neural functioning, altering synaptic interactions in the hippocampus, modulating calcium channel activity, calcium transients, and intrinsic neuronal firing activity. It prevents neuronal death, and maintains and controls calcium homeostasis. CB also modulates the activity of proteins participating in the development of neurodegenerative disorders such as Alzheimer's disease, Huntington's disease, and bipolar disorder. Moreover, CB interacts with Ran-binding protein M, a protein known to be involved in microtubule function. It also interacts with alkaline phosphatase and myo-inositol monophosphatase, as well as caspase 3, an enzyme that plays an important role in the regulation of apoptosis. CB contains six EF-hand motifs in a single globular domain, where EF-hands 1, 3, 4, 5 bind four calcium ions with high affinity.	243
320077	cd16177	EFh_HEF_CR	EF-hand, calcium binding motif, found in calretinin (CR). CR, also termed 29 kDa calbindin, is a cytosolic hexa-EF-hand calcium-binding protein predominantly expressed in a variety of normal and tumorigenic t specific neurons of the central and peripheral nervous system. It possibly functions as a calcium buffer, calcium sensor, and apoptosis regulator, which may be implicated in many biological processes, including cell proliferation, differentiation, and cell death. CR contains six EF-hand motifs within two independent domains, CR I-II and CR III-VI. CR I-II consists of EF-hand motifs 1 and 2, and CR III-VI consists of EF-hand motifs 3-6. The first 5 EF-hand motifs are capable of binding calcium ions, while the EF-hand 6 is inactive. Thus, CR has two pairs of cooperative binding sites (I-II and III-IV), which display high affinity calcium-binding sites, and one independent calcium ion-binding site (V), which displays lower affinity binding.	248
320078	cd16178	EFh_HEF_SCGN	EF-hand, calcium binding motif, found in secretagogin (SCGN). SCGN is a six EF-hand calcium-binding protein expressed in neuroendocrine, pancreatic endocrine and retinal cells. It plays a crucial role in cell apoptosis, receptor signaling and differentiation. It is also involved in vesicle secretion through binding to various proteins, including SNAP25, SNAP23, DOC2alpha, ARFGAP2, rootletin, KIF5B, beta-tubulin, DDAH-2, ATP-synthase and myeloid leukemia factor 2. SCGN functions as a calcium sensor/coincidence detector modulating vesicular exocytosis of neurotransmitters, neuropeptides or hormones. It also serves as a calcium buffer in neurons. Thus, SCGN may be linked to the pathogenesis of neurological diseases such as Alzheimer's, and also acts as a serum marker of neuronal damage, or as a tumor biomarker. SCGN consists of three globular domains, each of which contains a pair of EF-hand motifs. All six EF-hand motifs of SCGN in some eukaryotes, including D. rerio, X. laevis, M. domestica, G. gallus, O. anatinus, could potentially bind six calcium ions. In contrast, SCGNs from higher eukaryotes have at least one non-functional EF-hand motif due to mutations or deletions. For instance, the EF1 loop does not coordinate a calcium ion because the key asparagine residue is replaced by lysine in SCGNs of many mammalian species. Moreover, the EF2 loop seems to be competent for calcium-binding in most mammalian SCGNs except for human and chimpanzee orthologs.	257
320079	cd16179	EFh_HEF_CBN	EF-hand, calcium binding motif, found in Drosophila melanogaster calbindin-32 (CBN) and similar proteins. CBN, the product of the cbn gene, is a Drosophila homolog to vertebrate neuronal six EF-hand calcium binding proteins. It is expressed through most of ontogenesis with a selective distribution in the nervous system and in a few small adult thoracic muscles. Its precise biological role remains unclear. CBN contains six EF-hand motifs, but some of them may not bind calcium ions due to the lack of key residues.	261
320055	cd16180	EFh_PEF_Group_I	Penta-EF hand, calcium binding motifs, found in Group I PEF proteins. The family corresponds to Group I PEF proteins that have been found not only in higher animals but also in lower animals, plants, fungi and protists. Group I PEF proteins include apoptosis-linked gene 2 protein (ALG-2), peflin and similar proteins. ALG-2, also termed programmed cell death protein 6 (PDCD6), is a widely expressed calcium-binding modulator protein associated with cell proliferation and death, as well as cell survival. It forms a homodimer in the cell or a heterodimer with its closest paralog peflin. Among the PEF proteins, ALG-2 can bind three Ca2+ ions through its EF1, EF3, and EF5 hands, where it is unique in that its EF5 hand binds Ca2+ ion in a canonical coordination. Peflin is a ubiquitously expressed 30-kD PEF protein containing five EF-hand motifs in its C-terminal domain and a longer N-terminal hydrophobic domain (NHB domain) than any other member of the PEF family. The NHB domain harbors nine repeats of a nonapeptide (A/PPGGPYGGP). Peflin may modulate the function of ALG-2 in Ca2+ signaling. It exists only as a heterodimer with ALG-2, and binds two Ca2+ ions through its EF1 and EF3 hands. Its additional EF5 hand is unpaired and does not bind Ca2+ ion but mediates the heterodimerization with ALG-2. The dissociation of heterodimer occurs in the presence of Ca2+.	164
320056	cd16181	EFh_PEF_Group_II_sorcin_like	Penta-EF hand, calcium binding motifs, found in sorcin, grancalcin, and similar proteins. The family corresponds to the second group of penta-EF hand (PEF) proteins that includes sorcin, grancalcin, and similar proteins. Sorcin, also termed 22 kDa Ca2+-binding protein, CP-22, or V19, is a soluble resistance-related calcium-binding protein that is expressed in normal mammalian tissues, such as the liver, lungs and heart. It contains a flexible glycine and proline-rich N-terminal extension and five EF-hand motifs that associate with membranes in a calcium-dependent manner. It may harbor three potential Ca2+ binding sites through its EF1, EF2 and EF3 hands. However, binding of only two Ca2+/monomer suffices to trigger the conformational change that exposes hydrophobic regions and leads to interaction with the respective targets. Sorcin forms homodimers through the association of the unpaired EF5 hand. Among the PEF proteins, sorcin is unique in that it contains potential phosphorylation sites by cAMP-dependent protein kinase (PKA), and it can form a tetramer at slightly acid pH values although remaining a stable dimer at neutral pH.  Grancalcin (GCA) is a cytosolic Ca2+-binding protein specifically expressed in neutrophils and monocytes/macrophages. It can strongly interact with sorcin to form a heterodimer and further modulate the function of sorcin. GCA exists as homodimers in solution. It contains five EF-hand motifs attached to an N-terminal region of an approximately 50 residue-long segment rich in glycines and prolines. In contrast with sorcin, GCA binds two Ca2+ ions through its EF1 and EF3 hands.	165
320057	cd16182	EFh_PEF_Group_II_CAPN_like	Penta-EF hand, calcium binding motifs, found in PEF calpain family. The PEF calpain family belongs to the second group of penta-EF hand (PEF) proteins. It includes classical (also called conventional or typical) calpain (referring to calcium-dependent papain-like enzymes, EC 3.4.22.17) large catalytic subunits (CAPN1, 2, 3, 8, 9, 11, 12, 13, 14) and two calpain small subunits (CAPNS1 and CAPNS2), which are largely confined to animals (metazoans). These PEF-containing calpains are nonlysosomal, calcium-activated intracellular cysteine proteases that play important roles in the degradation or functional modulation of a variety of substrates in response to calcium signalling. The classical mu- and m-calpains are heterodimers consisting of a homologous but distinct (large) L-subunit/chain (CAPN1 or CAPN2) and a common (small) S-subunit/chain (CAPNS1 or CAPNS2). These L-subunits (CAPN1 and CAPN2) and S-subunit CAPNS1 are ubiquitously found in all tissues. Other calpains likely consist of an isolated L-subunit/chain alone. Many of them, such as CAPNS2, CAPN3 (in skeletal muscle, or lens), CAPN8 (in stomach), CAPN9 (in digestive tracts), CAPN11 (in testis), CAPN12 (in follicles), are tissue-specific and have specific functions in distinct organs. The L-subunits of similar structure (called CALPA and B) also have been found in Drosophila melanogaster. The S-subunit seems to have a chaperone-like function for proper folding of the L-subunit. The catalytic L-subunits contain a short N-terminal anchor helix, followed by a calpain cysteine protease (CysPc) domain, a C2-domain-like (C2L) domain, and a C-terminal Ca2+-binding penta-EF-hand (PEF) domain. The S-subunits only have the PEF domain following an N-terminal Gly-rich hydrophobic domain. The calpains undergo a rearrangement of the protein backbone upon Ca2+-binding.	167
320058	cd16183	EFh_PEF_ALG-2	EF-hand, calcium binding motif, found in apoptosis-linked gene 2 protein (ALG-2) and similar proteins. ALG-2, also termed programmed cell death protein 6 (PDCD6), or probable calcium-binding protein ALG-2, is one of the prototypic members of the penta EF-hand protein family. It is a widely expressed calcium-binding modulator protein associated with cell proliferation and death, as well as cell survival. ALG-2 acts as a pro-apoptotic factor participating in T cell receptor-, Fas-, and glucocorticoid-induced programmed cell death, and also serves as a useful molecular marker for the prognosis of cancers. Moreover, ALG-2 functions as a calcium ion sensor at endoplasmic reticulum (ER) exit sites, and modulates ER-stress-stimulated cell death and neuronal apoptosis during organ formation. Furthermore, ALG-2 can mediate the pro-apoptotic activity of cisplatin or tumor necrosis factor alpha (TNFalpha) through the down-regulation of nuclear factor-kappaB (NF-kappaB) expression. It also inhibits angiogenesis through PI3K/mTOR/p70S6K pathway by interacting of vascular endothelial growth factor receptor-2 (VEGFR-2). In addition, nuclear ALG-2 may participate in the post-transcriptional regulation of Inositol Trisphosphate Receptor Type 1 (IP3R1) pre-mRNA at least in part by interacting with CHERP (Ca2+ homeostasis endoplasmic reticulum protein) calcium-dependently. ALG-2 contains five serially repeated EF-hand motifs and interacts with various proteins, including ALG-2-interacting protein X (Alix), Fas, annexin XI, death-associated protein kinase 1 (DAPk1), Tumor susceptibility gene 101 (TSG101), Sec31A, phospholipid scramblase 3 (PLSCR3), the P-body component PATL1, and endosomal sorting complex required for transport (ESCRT)-III-related protein IST1, in a calcium-dependent manner. It forms a homodimer in the cell or a heterodimer with its closest paralog peflin. Among the PEF proteins, ALG-2 can bind three Ca2+ ions through its EF1, EF3, and EF5 hands, where it is unique in that its EF5 hand binds Ca2+ ion in a canonical coordination.	165
320059	cd16184	EFh_PEF_peflin	EF-hand, calcium binding motif, found in peflin and similar proteins. Peflin, also termed penta-EF hand (PEF) protein with a long N-terminal hydrophobic domain, or penta-EF hand domain-containing protein 1, is a ubiquitously expressed 30-kD PEF protein containing five EF-hand motifs in its C-terminal domain and a longer N-terminal hydrophobic domain (NHB domain) than any other member of the PEF family. The NHB domain harbors nine repeats of a nonapeptide (A/PPGGPYGGP). Peflin may modulate the function of ALG-2 in Ca2+ signaling. It exists only as a heterodimer with ALG-2, and binds two Ca2+ ions through its EF1 and EF3 hands. Its additional EF5 hand is unpaired and does not bind Ca2+ ion but mediates the heterodimerization with ALG-2. The dissociation of heterodimer occurs in the presence of Ca2+. In lower vertebrates, peflin may interact with transient receptor potential N (TRPN1), suggesting a potential role of peflin in fast transducer channel adaptation.	165
320060	cd16185	EFh_PEF_ALG-2_like	EF-hand, calcium binding motif, found in homologs of mammalian apoptosis-linked gene 2 protein (ALG-2). The family includes some homologs of mammalian apoptosis-linked gene 2 protein (ALG-2) mainly found in lower eukaryotes, such as the parasitic protist Leishmania major and the cellular slime mold Dictyostelium discoideum. These homologs contain five EF-hand motifs. Due to the presence of unfavorable residues at the Ca2+-coordinating positions, their non-canonical EF4 and EF5 hands may not bind Ca2+. Two Dictyostelium PEF proteins are the prototypes of this family. They may bind to cytoskeletal proteins and/or signal-transducing proteins localized to detergent-resistant membranes named lipid rafts, and occur as monomers or weak homo- or heterodimers like ALG-2. They can serve as a mediator for Ca2+ signaling-related Dictyostelium programmed cell death (PCD).	163
320061	cd16186	EFh_PEF_grancalcin	Penta-EF hand, calcium binding motifs, found in grancalcin. Grancalcin (GCA) is a cytosolic Ca2+-binding protein specifically expressed in neutrophils and monocytes/macrophages. It displays a Ca2+-dependent translocation to granules and plasma membrane upon neutrophil activation, suggesting roles in granule-membrane fusion and degranulation of neutrophils. It may also play a role in the regulation of vesicle/granule exocytosis through the reversible binding of secretory vesicles and plasma membranes upon the presence of calcium. Moreover, GCA is involved in inflammation, as well as in the process of adhesion of neutrophils to fibronectin. It plays a key role in leukocyte-specific functions that are responsible for host defense, and affects the function of integrin receptors on immune cells through binding to L-plastin in the absence of calcium. Furthermore, GCA can strongly interact with sorcin to form a heterodimer, and further modulate the function of sorcin. GCA exists as homodimers in solution. It contains five EF-hand motifs attached to an N-terminal region of an approximately 50 residue-long segment rich in glycines and prolines. GCA binds two Ca2+ ions through its EF1 and EF3 hands.	165
320062	cd16187	EFh_PEF_sorcin	Penta-EF hand, calcium binding motifs, found in sorcin. Sorcin, also termed 22 kDa Ca2+-binding protein, CP-22, or V19, is a soluble resistance-related calcium-binding protein that is expressed in normal mammalian tissues, such as the liver, lungs and heart. The up-regulation of sorcin is correlated with a number of cancer types, including colorectal, gastric and breast cancer. It may represent a therapeutic target for reversing tumor multidrug resistance (MDR). Sorcin participates in the regulation of calcium homeostasis in cells and is necessary for the activation of mitosis and cytokinesis. It enhances metastasis and promotes epithelial-to-mesenchymal transition of colorectal cancer. Moreover, sorcin has been implicated in the regulation of intracellular Ca2+ cycling and cardiac excitation-contraction coupling. It displays anti-apoptotic properties via the modulation of mitochondrial Ca2+ handling in cardiac myocytes. It can target and activate the sarcolemmal Na+/Ca2+ exchanger (NCX1) in cardiac muscle. Meanwhile, sorcin modulates cardiac L-type Ca2+ current by functional interaction with the alpha1C subunit. It also associates with calcium/calmodulin-dependent protein kinase IIdeltaC (CaMKIIdelta(C)) and further modulates ryanodine receptor (RyR) function in cardiac myocytes. Furthermore, sorcin may act as a Ca2+ sensor for glucose-induced nuclear translocation and the activation of carbohydrate-responsive element-binding protein (ChREBP)-dependent genes. As an interactor of the mitochondrial chaperone TRAP1, sorcin is involved in mitochondrial metabolism through the TRAP1 pathway. In addition, sorcin may regulate the inhibition of type I interferon response in cells through interacting with foot-and-mouth disease virus (FMDV) VP1. Sorcin contains a flexible glycine and proline-rich N-terminal extension and five EF-hand motifs that associate with membranes in a calcium-dependent manner. 
It may harbor three potential Ca2+ binding sites through its EF1, EF2 and EF3 hands. However, binding of only two Ca2+/monomer suffices to trigger the conformational change that exposes hydrophobic regions and leads to interaction with the respective targets. Sorcin forms homodimers through the association of the unpaired EF5 hand. Among the PEF proteins, sorcin is unique in that it contains potential phosphorylation sites by cAMP-dependent protein kinase (PKA), and it can form a tetramer at slightly acid pH values although remaining a stable dimer at neutral pH.	165
320063	cd16188	EFh_PEF_CPNS1_2	Penta-EF hand, calcium binding motifs, found in calcium-dependent protease small subunit CAPNS1 and CAPNS2. CAPNS1, also termed calpain small subunit 1 (CSS1), or calcium-activated neutral proteinase small subunit (CANP small subunit), or calcium-dependent protease small subunit (CDPS), or calpain regulatory subunit, is a common 28-kDa regulatory calpain subunit encoded by the calpain small 1 (Capns1, also known as Capn4) gene. It acts as a binding partner to form a heterodimer with the 80 kDa calpain large catalytic subunit and is required in maintaining the activity of calpain. CAPNS1 plays a significant role in tumor progression of human cancer, and functions as a potential therapeutic target in human hepatocellular carcinoma (HCC), nasopharyngeal carcinoma (NPC), glioma, and clear cell renal cell carcinoma (ccRCC).  It may be involved in regulating migration and cell survival through binding to the SH3 domain of Ras GTPase-activating protein (RasGAP). It may also modulate Akt/FoxO3A signaling and apoptosis through PP2A. CAPNS1 contains an N-terminal glycine rich domain and a C-terminal PEF-hand domain. CAPNS2, also termed calpain small subunit 2 (CSS2), is a novel tissue-specific 30 kDa calpain small subunit that lacks two oligo-Gly stretches characteristic of the N-terminal Gly-rich domain of CAPNS1. CAPNS2 acts as a chaperone for the calpain large subunit, and appears to be the functional equivalent of CAPNS1. However, CAPNS2 binds the large subunit much more weakly than CAPNS1 and it does not undergo the autolytic conversion typical of CAPNS1.	169
320064	cd16189	EFh_PEF_CAPN1_like	Penta-EF hand, calcium binding motifs, found in mu-type calpain (CAPN1), m-type calpain (CAPN2), and similar proteins. The family includes mu-type calpain (CAPN1) and m-type calpain (CAPN2), both of which are ubiquitously expressed 80-kDa Ca2+-dependent intracellular cysteine proteases that contain a short N-terminal anchor helix, followed by a calpain cysteine protease (CysPc) domain, a C2-domain-like (C2L) domain, and a C-terminal Ca2+-binding penta-EF-hand (PEF) domain. The catalytic subunit CAPN1 or CAPN2 in complex with a regulatory subunit encoded by CAPNS1 forms a mu- or m-calpain heterodimer, respectively.	168
320065	cd16190	EFh_PEF_CAPN3	Calcium-activated neutral. CAPN3, also termed calcium-activated neutral proteinase 3 (CANP 3), or calpain L3, or calpain p94, or muscle-specific calcium-activated neutral protease 3, or new calpain 1 (nCL-1), is a calpain large subunit that is mainly expressed in skeletal muscle or lens. The skeletal muscle-specific CAPN3 is pathologically associated with limb girdle muscular dystrophy type 2A (LGMD2A). Its autolytic activity can be positively regulated by calmodulin (CaM), a known transducer of the calcium signal. CAPN3 is also involved in human melanoma tumorigenesis and progression. It impairs cell proliferation and stimulates oxidative stress-mediated cell death in melanoma cells. Moreover, it plays an important role in sarcomere remodeling and mitochondrial protein turnover. Furthermore, the phosphorylated skeletal muscle-specific CAPN3 acts as a myofibril structural component and may participate in myofibril-based signaling pathways. In the eye, the lens-specific CAPN3, together with CAPN2, is responsible for proteolytic cleavages of alpha and beta-crystallin. Overactivated alpha and beta-crystallin can lead to cataract formation. CAPN3 exists as a homodimer, rather than a heterodimer with the calpain small subunit. It may also form heterodimers with other calpain large subunits. CAPN3 contains a long N-terminal region, followed by a calpain cysteine protease (CysPc) domain, a C2-domain-like (C2L) domain, and a C-terminal Ca2+-binding penta-EF-hand (PEF) domain. Ca2+ binding at EF5 of the CAPN3 PEF domain is a distinct feature not observed in other calpain isoforms.	169
320066	cd16191	EFh_PEF_CAPN8	Penta-EF hand, calcium binding motifs, found in calpain-8 (CAPN8). CAPN8, also termed new calpain 2 (nCL-2), or stomach-specific M-type calpain, is a calpain large subunit predominantly expressed in the stomach. It appears to be involved in membrane trafficking in the gastric surface mucus cells (pit cells), via its location at the Golgi and interaction with the beta-subunit of coatomer complex (beta-COP) of vesicles derived from the Golgi. Moreover, CAPN8, together with CAPN9, forms an active protease complex, G-calpain, in which both proteins are essential for stability and activity. The G-Calpain has been implicated in gastric mucosal defense. CAPN8 exists as both a monomer and homo-oligomer, but not as a heterodimer with the conventional calpain regulatory subunit (30K). The monomer and homodimer forms predominate. CAPN8 contains a short N-terminal anchor helix, followed by a calpain cysteine protease (CysPc) domain, a C2-domain-like (C2L) domain, and a C-terminal Ca2+-binding penta-EF-hand (PEF) domain.	168
320067	cd16192	EFh_PEF_CAPN9	Penta-EF hand, calcium binding motifs, found in calpain-9 (CAPN9). CAPN9, also termed digestive tract-specific calpain, or new calpain 4 (nCL-4), or protein CG36, is a calpain large subunit predominantly expressed in gastrointestinal tract. It plays a physiological role in the suppression of tumorigenesis. It acts as an important biomolecule link for the regression of colorectal cancer via intracellular calcium homeostasis. CAPN9 may also play a critical role in lumen formation. Moreover, CAPN9, together with CAPN8, forms an active protease complex, G-calpain, in which both proteins are essential for stability and activity. The G-Calpain has been implicated in gastric mucosal defense. Furthermore, down-regulation of calpain 9 has been linked to hypertensive heart and kidney disease in salt-sensitive Dahl rats. CAPN9 contains a short N-terminal anchor helix, followed by a calpain cysteine protease (CysPc) domain, a C2-domain-like (C2L) domain, and a C-terminal Ca2+-binding penta-EF-hand (PEF) domain.	169
320068	cd16193	EFh_PEF_CAPN11	Penta-EF hand, calcium binding motifs, found in calpain-11 (CAPN11). CAPN11, also termed calcium-activated neutral proteinase 11 (CANP 11), is a mammalian orthologue of micro/m calpain. It is one of the calpain large subunits that appears to be exclusively expressed in certain cells of the testis. It may be involved in regulating calcium-dependent signal transduction events during meiosis and sperm functional processes. CAPN11 contains a short N-terminal anchor helix, followed by a calpain cysteine protease (CysPc) domain, a C2-domain-like (C2L) domain, and a C-terminal Ca2+-binding penta-EF-hand (PEF) domain.	169
320069	cd16194	EFh_PEF_CAPN12	Penta-EF hand, calcium binding motifs, found in calpain-12 (CAPN12). CAPN12, also termed calcium-activated neutral proteinase 12 (CANP 12), is a calpain large subunit mainly expressed in the cortex of the hair follicle. It may affect apoptosis regulation. CAPN12 contains a short N-terminal anchor helix, followed by a calpain cysteine protease (CysPc) domain, a C2-domain-like (C2L) domain, and a C-terminal Ca2+-binding penta-EF-hand (PEF) domain.	169
320070	cd16195	EFh_PEF_CAPN13_14	Penta-EF hand, calcium binding motifs, found in calpain-13 (CAPN13), calpain-14 (CAPN14), and similar proteins. CAPN13, also termed calcium-activated neutral proteinase 13 (CANP 13), is a 63.6 kDa calpain large subunit that exhibits a restricted tissue distribution with low levels of expression detected only in human testis and lung. In the calpain family, CAPN13 is most closely related to calpain-14 (CAPN14). CAPN14, also termed calcium-activated neutral proteinase 14 (CANP 14), is a 76.7 kDa calpain large subunit that is most highly expressed in the oesophagus. Its expression and calpain activity can be induced by IL-13. Both CAPN13 and CAPN14 contain a calpain cysteine protease (CysPc) domain, a C2-domain-like (C2L) domain, and a C-terminal Ca2+-binding penta-EF-hand (PEF) domain.	168
320071	cd16196	EFh_PEF_CalpA_B	Penta-EF hand, calcium binding motifs, found in Drosophila melanogaster calpain-A (CalpA),  calpain-B (CalpB), and similar proteins. The family contains two calpains that have been found in Drosophila, CalpA and CalpB. CalpA, also termed calcium-activated neutral proteinase A (CANP A), or calpain-A catalytic subunit, is a Drosophila calpain homolog specifically expressed in a few neurons in the central nervous system, in scattered endocrine cells in the midgut, and in blood cells.  CalpB, also termed calcium-activated neutral proteinase B (CANP B), contains calpain-B catalytic subunit 1 and calpain-B catalytic subunit 2. Both CalpA and CalpB are closely related to that of vertebrate calpains, and they share similar domain architecture, which consists of four domains: the N-terminal domain I, the catalytic domain II carrying the three active site residues, Cys, His and Asn, the Ca2+-regulated phospholipid-binding domain III, and penta-EF-hand Ca2+-binding domain IV. Besides, CalpA and CalpB display some distinguishing structural features that are not found in mammalian typical calpains. CalpA harbors a 76 amino acid long hydrophobic stretch inserted in domain IV, which may be involved in membrane attachment of this enzyme. CalpB has an unusually long N-terminal tail of 224 amino acids, which belongs to the class of intrinsically unstructured proteins (IUP) and may become ordered upon binding to target protein(s). Moreover, they do not need small regulatory subunits for their catalytic activity, and their proteolytic function is not regulated by an intrinsic inhibitor as the Drosophila genome contains neither regulatory subunit nor calpastatin orthologs. As a result, they may exist as a monomer or perhaps as a homo- or heterodimer together with a second large subunit. Furthermore, both CalpA and CalpB are dispensable for viability and fertility and do not share vital functions during Drosophila development. 
Phosphatidylinositol 4,5-diphosphate, phosphatidylinositol 4-monophosphate, phosphatidylinositol, and phosphatidic acid can stimulate the activity and the rate of activation of CalpA, but not CalpB. Calpain A modulates Toll responses by limited Cactus/IkappaB proteolysis. CalpB directly interacts with talin, an important component of the focal adhesion complex, and functions as an important modulator in border cell migration within egg chambers, which may act via the digestion of talin. CalpB can be phosphorylated by cAMP-dependent protein kinase (protein kinase A, PKA; EC 2.7.11.11) at Ser240 and Ser845, as well as by mitogen-activated protein kinase (ERK1 and ERK2; EC 2.7.11.24) at Thr747.  The activation of the ERK pathway by extracellular signals results in the phosphorylation and activation of calpain B. In Schneider cells (S2), calpain B was mainly in the cytoplasm and upon a rise in Ca2+ the enzyme adhered to intracellular membranes.	167
320072	cd16197	EFh_PEF_CalpC	Penta-EF hand, calcium binding motifs, found in Drosophila melanogaster calpain-C (CalpC) and similar proteins. CalpC, also termed calcium-activated neutral proteinase homolog C (CANP C), is a catalytically inactive homolog of CalpA and CalpB found in Drosophila. It is strongly expressed in the salivary glands. CalpA and CalpB are both closely related to vertebrate calpains and share a similar domain architecture, which consists of four domains: the N-terminal domain I, the catalytic domain II carrying the three active site residues, Cys, His and Asn, the Ca2+-regulated phospholipid-binding domain III, and penta-EF-hand Ca2+-binding domain IV. In contrast, CalpC is a truncated calpain form missing domain I and about 20 residues from domain II. Moreover, because all three active site residues are mutated (Cys to Arg, His to Val and Asn to Ser), it may not have proteolytic activity.	166
320073	cd16198	EFh_PEF_CAPN1	Penta-EF hand, calcium binding motifs, found in mu-type calpain (CAPN1). CAPN1, also termed calpain-1 80-kDa catalytic subunit, or calpain-1 large subunit, or micromolar-calpain (muCANP), or calcium-activated neutral proteinase 1 (CANP 1), or cell proliferation-inducing gene 30 protein, is a ubiquitously expressed 80-kDa Ca2+-dependent intracellular cysteine protease that contains a short N-terminal anchor helix, followed by a calpain cysteine protease (CysPc) domain, a C2-domain-like (C2L) domain, and a C-terminal Ca2+-binding penta-EF-hand (PEF) domain. The catalytic subunit CAPN1 in complex with a regulatory subunit encoded by CAPNS1 forms a mu-calpain heterodimer. CAPN1 plays a central role in postmortem proteolysis and meat tenderization processes, as well as in regulation of proliferation and survival of skeletal satellite cells. It also acts as a novel regulator in IgE-mediated mast cell activation and could serve as a potential therapeutic target for the management of allergic inflammation. Moreover, CAPN1 is involved in neutrophil motility and functions as a potential target for intervention in inflammatory disease. It also facilitates age-associated aortic wall calcification and fibrosis through the regulation of matrix metalloproteinase 2 activity in vascular smooth muscle cells, and thus plays a role in hypertension and atherosclerosis. The proteolytic cleavage of beta-amyloid precursor protein and tau protein by CAPN1 may be involved in plaque formation. Furthermore, CAPN1 is activated in the brains of individuals with Alzheimer's disease. It is involved in the maintenance of a proliferative neural stem cell pool. The activation and macrophage inflammation of CAPN1 in hypercholesterolemic nephropathy is promoted by nicotinic acetylcholine receptor alpha1 (nAChRalpha1). In addition, CAPN1 displays a functional role in hemostasis, as well as in sickle cell disease.	169
320074	cd16199	EFh_PEF_CAPN2	Penta-EF hand, calcium binding motifs, found in m-type calpain (CAPN2). CAPN2, also termed millimolar-calpain (m-calpain), or calpain-2 catalytic subunit, or calcium-activated neutral proteinase 2 (CANP 2), or calpain large polypeptide L2, or calpain-2 large subunit, is a ubiquitously expressed 80-kDa Ca2+-dependent intracellular cysteine protease that contains a short N-terminal anchor helix, followed by a calpain cysteine protease (CysPc) domain, a C2-domain-like (C2L) domain, and a C-terminal Ca2+-binding penta-EF-hand (PEF) domain. The catalytic subunit CAPN2 in complex with a regulatory subunit encoded by CAPNS1 forms an m-calpain heterodimer. CAPN2 acts as the key protease responsible for N-methyl-d-aspartic acid (NMDA)-induced cytoplasmic polyadenylation element-binding protein 3 (CPEB3) degradation in neurons. It cleaves several components of the focal adhesion complex, such as FAK and talin, triggering disassembly of the complex at the rear of the cell. The stimulation of CAPN2 activity is required for Golgi antiapoptotic proteins (GAAPs) to promote cleavage of FA kinase (FAK), cell spreading, and enhanced migration. CAPN2 is also involved in the onset of glial differentiation. It regulates proliferation, survival, migration, and tumorigenesis of breast cancer cells through a PP2A-Akt-FoxO-p27(Kip1) signaling cascade. Its expression is associated with response to platinum-based chemotherapy and with progression-free and overall survival in ovarian cancer. Moreover, CAPN2 may play a role in fundamental mitotic functions, such as the maintenance of sister chromatid cohesion. The activation of CAPN2 plays an essential role in hippocampal synaptic plasticity and in learning and memory. In the eye, CAPN2, together with a lens-specific variant of CAPN3, is responsible for proteolytic cleavages of alpha and beta-crystallin. Overactivated alpha and beta-crystallin can lead to cataract formation. 
Sometimes, CAPN2 compensates for loss of CAPN1, and both calpain isoforms are involved in AngII-induced aortic aneurysm formation. The main phosphorylation sites in m-calpain are Ser50 and Ser369/Thr370.	168
320030	cd16200	EFh_PI-PLCbeta	EF-hand motif found in metazoan phosphoinositide-specific phospholipase C (PI-PLC)-beta isozymes. PI-PLC-beta isozymes represent a class of metazoan PI-PLCs that hydrolyze the membrane lipid phosphatidylinositol 4,5-bisphosphate (PIP2) to propagate diverse intracellular responses that underlie the physiological action of many hormones, neurotransmitters, and growth factors (EC 3.1.4.11). They have been implicated in numerous processes relevant to central nervous system (CNS), including chemotaxis, cardiovascular function, neuronal signaling, and opioid sensitivity. Like other PI-PLC isozymes, PI-PLC-beta isozymes contain a core set of domains, including an N-terminal pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core, and a single C2 domain. Besides, they have a unique C-terminal coiled-coil (CT) domain necessary for homodimerization. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. There are four PI-PLC-beta isozymes (1-4). PI-PLC-beta1 and PI-PLC-beta3 are expressed in a wide range of tissues and cell types, whereas PI-PLC-beta2 and PI-PLC-beta4 have been found only in hematopoietic and neuronal tissues, respectively. All PI-PLC-beta isozymes are activated by the heterotrimeric G protein alpha subunits of the Gq class through their C2 domain and long C-terminal extension. They are GTPase-activating proteins (GAPs) for these G alpha(q) proteins. PI-PLC-beta2 and PI-PLC-beta3 can also be activated by beta-gamma subunits of the G alpha(i/o) family of heterotrimeric G proteins and the small GTPases such as Rac and Cdc42. This family also includes two invertebrate homologs of PI-PLC-beta, PLC21 from cephalopod retina and No receptor potential A protein (NorpA) from Drosophila melanogaster.	153
320031	cd16201	EFh_PI-PLCgamma	EF-hand motif found in phosphoinositide phospholipase C gamma isozymes (PI-PLC-gamma). PI-PLC-gamma isozymes represent a class of metazoan PI-PLCs that hydrolyze the membrane lipid phosphatidylinositol 4,5-bisphosphate (PIP2) to propagate diverse intracellular responses that underlie the physiological action of many hormones, neurotransmitters, and growth factors. They can form a complex with the phosphorylated cytoplasmic domains of the immunoglobulin Ig-alpha and Ig-beta subunits of the B cell receptor (BCR), the membrane-tethered Src family kinase Lyn, phosphorylated spleen tyrosine kinase (Syk), the phosphorylated adaptor protein B-cell linker (BLNK), and activated Bruton's tyrosine kinase (Btk). Like other PI-PLC isozymes, PI-PLC-gamma isozymes contain a core set of domains, including an N-terminal pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core, and a single C2 domain. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence.  Unique to PI-PLC-gamma, a second PH domain, which is split by two SH2 (Src homology 2) domains, and one SH3 (Src homology 3) domain, are present within this linker. The SH2 and SH3 domains are responsible for the binding of phosphotyrosine-containing sequences and proline-rich sequences, respectively. There are two PI-PLC-gamma isozymes (1-2), both of which are activated by receptor and non-receptor tyrosine kinases due to the presence of SH2 and SH3 domains.	145
320032	cd16202	EFh_PI-PLCdelta	EF-hand motif found in phosphoinositide phospholipase C delta (PI-PLC-delta). PI-PLC-delta isozymes represent a class of metazoan PI-PLCs that are among the most calcium-sensitive of all PLCs. Their activation is modulated by intracellular calcium ion concentration, phospholipids, polyamines, and other proteins, such as RhoAGAP. Like other PI-PLC isozymes, PI-PLC-delta isozymes contain a core set of domains, including an N-terminal pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core, and a single C-terminal C2 domain. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. There are three PI-PLC-delta isozymes (1, 3 and 4). PI-PLC-delta1 is relatively well characterized. It is activated by high calcium levels generated by other PI-PLC family members, and therefore functions as a calcium amplifier within the cell. Different PI-PLC-delta isozymes have different tissue distributions and different subcellular locations. PI-PLC-delta1 is mostly a cytoplasmic protein, PI-PLC-delta3 is located in the membrane, and PI-PLC-delta4 is predominantly detected in the cell nucleus. PI-PLC-delta isozymes are evolutionarily conserved even in non-mammalian species, such as yeast, slime molds and plants.	140
320033	cd16203	EFh_PI-PLCepsilon	EF-hand motif found in phosphoinositide phospholipase C epsilon 1 (PI-PLC-epsilon1). PI-PLC-epsilon1, also termed 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase epsilon-1, or pancreas-enriched phospholipase C, or phospholipase C-epsilon-1 (PLC-epsilon-1), is dominant in connective tissues and brain. It has been implicated in carcinogenesis, such as in bladder and intestinal tumors, oesophageal squamous cell carcinoma, gastric adenocarcinoma, murine skin cancer, and head and neck cancer. PI-PLC-epsilon1 contains an N-terminal CDC25 homology domain with guanyl-nucleotide exchange factor (GEF) activity, a pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core domain, a C2 domain, and at least one and perhaps two C-terminal predicted RA (Ras association) domains that are implicated in the binding of small GTPases, such as Ras or Rap, from the Ras family. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. There is only one PI-PLC-epsilon isozyme. It is directly activated by G alpha(12/13), G beta gamma, and activated members of Ras and Rho small GTPases.	174
320034	cd16204	EFh_PI-PLCzeta	EF-hand motif found in phosphoinositide phospholipase C zeta 1 (PI-PLC-zeta1). PI-PLC-zeta1, also termed 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase zeta-1, or phospholipase C-zeta-1 (PLC-zeta-1), or testis-development protein NYD-SP27, is only found in the testis. The sperm-specific PI-PLC plays a fundamental role in vertebrate fertilization by initiating intracellular calcium oscillations that trigger embryo development. However, the mechanism of its activation remains unclear. PI-PLC-zeta1 contains four N-terminal atypical EF-hand motifs, a PLC catalytic core domain, and a C-terminal C2 domain. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. Unlike other PI-PLCs, PI-PLC-zeta is responsible for Ca2+ oscillations in fertilized oocytes and exhibits a high sensitivity to Ca2+ mediated through its EF-hand domain. There is only one PLC-zeta isozyme. Aside from PI-PLC-zeta identified in mammals, its eukaryotic homologs have been classified with this family.	142
320035	cd16205	EFh_PI-PLCeta	EF-hand motif found in phosphoinositide phospholipase C eta (PI-PLC-eta). PI-PLC-eta isozymes represent a class of neuron-specific metazoan PI-PLCs that are most abundant in the brain, particularly in the hippocampus, habenula, olfactory bulb, cerebellum, and throughout the cerebral cortex. They are phosphatidylinositol 4,5-bisphosphate-hydrolyzing enzymes that are more sensitive to Ca2+ than other PI-PLC isozymes. They function as calcium sensors activated by small increases in intracellular calcium concentrations. They are also activated through G-protein-coupled receptor (GPCR) stimulation, and further mediate GPCR signalling pathways. PI-PLC-eta isozymes contain an N-terminal pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core domain, a C2 domain, and a unique C-terminal tail that terminates with a PDZ-binding motif, a potential interaction site for other signaling proteins. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. The C-terminal tail harbors a number of proline-rich motifs which may interact with SH3 (Src homology 3) domain-containing proteins, as well as many serine/threonine residues, suggesting possible regulation of interactions by protein kinases/phosphatases. There are two PI-PLC-eta isozymes (1-2). Aside from the PI-PLC-eta isozymes identified in mammals, their eukaryotic homologs are also present in this family.	141
320036	cd16206	EFh_PRIP	EF-hand motif found in phospholipase C-related but catalytically inactive proteins (PRIP). This family represents a class of metazoan phospholipase C-related but catalytically inactive proteins (PRIP), which belong to a group of novel inositol 1,4,5-trisphosphate (InsP3) binding proteins. PRIP has a primary structure and domain architecture, incorporating a pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core domain with highly conserved X- and Y-regions split by a linker sequence, and a C-terminal C2 domain, similar to phosphoinositide-specific phospholipases C (PI-PLC, EC 3.1.4.11)-delta isoforms. Due to replacement of critical catalytic residues, PRIP does not have PLC enzymatic activity. PRIP consists of two subfamilies, PRIP-1 (also known as p130 or PLC-L1), which is predominantly expressed in the brain, and PRIP-2 (also known as PLC-L2), which exhibits a relatively ubiquitous expression. Experiments show that both PRIP-1 and PRIP-2 are involved in the InsP3-mediated calcium signaling pathway and the GABA(A) receptor-mediated signaling pathway. In addition, PRIP-2 acts as a negative regulator of B-cell receptor signaling and immune responses.	143
320037	cd16207	EFh_ScPlc1p_like	EF-hand motif found in Saccharomyces cerevisiae phospholipase C-1 (ScPlc1p) and similar proteins. This family represents a group of putative phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11) encoded by PLC1 genes from yeasts, which are homologs of the delta isoforms of mammalian PI-PLC in terms of overall sequence similarity and domain organization. Mammalian PI-PLC is a signaling enzyme that hydrolyzes the membrane phospholipid phosphatidylinositol-4,5-bisphosphate (PIP2) to generate two important second messengers in eukaryotic signal transduction cascades, inositol 1,4,5-trisphosphate (InsP3) and diacylglycerol (DAG). InsP3 triggers inflow of calcium from intracellular stores, while DAG, together with calcium, activates protein kinase C, which then phosphorylates other molecules, leading to altered cellular activity. Calcium is required for the catalysis. The prototype of this family is protein Plc1p (also termed 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase 1) encoded by PLC1 genes from Saccharomyces cerevisiae. ScPlc1p contains both highly conserved X- and Y- regions of PLC catalytic core domain, as well as a presumptive EF-hand like calcium binding motif. Experiments show that ScPlc1p displays calcium dependent catalytic properties with high similarity to those of the mammalian PLCs, and plays multiple roles in modulating the membrane/protein interactions in filamentation control. CaPlc1p encoded by CAPLC1 from the closely related yeast Candida albicans, an orthologue of S. cerevisiae Plc1p, is also included in this group. Like ScPlc1p, CaPlc1p has a conserved presumptive catalytic domain, shows PLC activity when expressed in E. coli, and is involved in multiple cellular processes. There are two other gene copies of CAPLC1 in C. albicans, CAPLC2 (also named as PIPLC) and CAPLC3. Experiments show CaPlc1p is the only enzyme in C. albicans which functions as PLC. 
The biological functions of CAPLC2 and CAPLC3 gene products must be clearly different from CaPlc1p, but their exact roles remain unclear. Moreover, CAPLC2 and CAPLC3 gene products are more similar to extracellular bacterial PI-PLC than to the eukaryotic PI-PLC, and they are not included in this subfamily.	142
320038	cd16208	EFh_PI-PLCbeta1	EF-hand motif found in phosphoinositide phospholipase C beta 1 (PI-PLC-beta1). PI-PLC-beta1, also termed 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase beta-1, or PLC-154, or phospholipase C-I (PLC-I), or phospholipase C-beta-1 (PLC-beta1), is expressed at highest levels in specific regions of the brain, as well as in the cardiovascular system. It has two splice variants, PI-PLC-beta1a and PI-PLC-beta1b, both of which are present within the nucleus. Nuclear PI-PLC-beta1 is a key molecule for nuclear inositide signaling, where it plays a role in cell cycle progression, proliferation and differentiation. It also contributes to the generation of cell-specific Ca2+ signals evoked by G protein-coupled receptor stimulation. PI-PLC-beta1 acts as an effector and a GTPase activating protein (GAP) specifically activated by the heterotrimeric G protein alpha q subunits through their C2 domain and long C-terminal extension. It regulates neuronal activity in the cerebral cortex and hippocampus, and has been implicated in diverse critical functions related to forebrain diseases such as schizophrenia. It may play an important role in maintenance of the status epilepticus, and in osteosarcoma-related signal transduction pathways. PI-PLC-beta1 also functions as a regulator of erythropoiesis in kinamycin F, a potent inducer of gamma-globin production in K562 cells. The G protein activation and the degradation of PI-PLC-beta1 can be regulated by the interaction of alpha-synuclein. As a result, it may reduce cell damage under oxidative stress. Moreover, PI-PLC-beta1 works as a new intermediate in the HIV-1 gp120-triggered phosphatidylcholine-specific phospholipase C (PC-PLC)-driven signal transduction pathway leading to cytoplasmic CCL2 secretion in macrophages. PI-PLC-beta1 contains a core set of domains, including an N-terminal pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core, and a single C2 domain. Besides, it has a unique C-terminal coiled-coil (CT) domain necessary for homodimerization. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence.	151
320039	cd16209	EFh_PI-PLCbeta2	EF-hand motif found in phosphoinositide phospholipase C beta 2 (PI-PLC-beta2). PI-PLC-beta2, also termed 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase beta-2, or phospholipase C-beta-2 (PLC-beta2), is expressed at highest levels in cells of hematopoietic origin. It is activated by the heterotrimeric G protein alpha q subunits (G alpha(q)) through their C2 domain and long C-terminal extension. It is also activated by the beta-gamma subunits of heterotrimeric G proteins. PI-PLC-beta2 has two cellular binding partners, alpha- and gamma-synuclein. The binding of either alpha- or gamma-synuclein inhibits PI-PLC-beta2 activity by preventing the binding of its activator G alpha(q). However, the binding of gamma-synuclein to PI-PLC-beta2 does not affect its binding to G beta(gamma) subunits or small G proteins, but enhances these signals. Meanwhile, gamma-synuclein may protect PI-PLC-beta2 from protease degradation and contribute to its over-expression in breast cancer. In leukocytes, the G beta(gamma)-mediated activation of PI-PLC-beta2 can be promoted by a scaffolding protein WDR26, which is also required for the translocation of PI-PLC-beta2 from the cytosol to the membrane in polarized leukocytes. PI-PLC-beta2 contains a core set of domains, including an N-terminal pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core, and a single C2 domain. Besides, it has a unique C-terminal coiled-coil (CT) domain necessary for homodimerization. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence.	151
320040	cd16210	EFh_PI-PLCbeta3	EF-hand motif found in phosphoinositide phospholipase C beta 3 (PI-PLC-beta3). PI-PLC-beta3, also termed 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase beta-3, or phospholipase C-beta-3 (PLC-beta3), is widely expressed at highest levels in brain, liver, and parotid gland. It is activated by the heterotrimeric G protein alpha q subunits through their C2 domain and long C-terminal extension. It is also activated by the beta-gamma subunits of heterotrimeric G proteins. PI-PLC-beta3 associates with CXC chemokine receptor 2 (CXCR2) and Na+/H+ exchanger regulatory factor-1 (NHERF1) to form macromolecular complexes at the plasma membrane of pancreatic cancer cells, which functionally couple chemokine signaling to PI-PLC-beta3-mediated signaling cascade. Moreover, PI-PLC-beta3 directly interacts with the M3 muscarinic receptor (M3R), a prototypical G alpha-q-coupled receptor that promotes PI-PLC-beta3 localization to the plasma membrane. This binding can alter G alpha-q-dependent PLC activation. Furthermore, PI-PLC-beta3 inhibits the proliferation of hematopoietic stem cells (HSCs) and myeloid cells through the interaction of SH2-domain-containing protein phosphatase 1 (SHP-1) and signal transducer and activator of transcription 5 (Stat5), and the augmentation of the dephosphorylating activity of SHP-1 toward Stat5, leading to the inactivation of Stat5. It is also involved in atopic dermatitis (AD) pathogenesis via regulating the expression of periostin in fibroblasts and thymic stromal lymphopoietin (TSLP) in keratinocytes. In addition, PI-PLC-beta3 mediates the thrombin-induced Ca2+ response in glial cells. PI-PLC-beta3 contains a core set of domains, including an N-terminal pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core, and a single C2 domain. Besides, it has a unique C-terminal coiled-coil (CT) domain necessary for homodimerization. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence.	151
320041	cd16211	EFh_PI-PLCbeta4	EF-hand motif found in phosphoinositide phospholipase C beta 4 (PI-PLC-beta4). PI-PLC-beta4, also termed 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase beta-4, or phospholipase C-beta-4 (PLC-beta4), is expressed in high concentrations in cerebellar Purkinje and granule cells, the median geniculate body, and the lateral geniculate nucleus. It may play a critical role in linking anxiety behaviors and theta rhythm heterogeneity. PI-PLC-beta4 is activated by the heterotrimeric G protein alpha q subunits through their C2 domain and long C-terminal extension. It contributes to the generation of cell-specific Ca2+ signals evoked by G protein-coupled receptor stimulation. PI-PLC-beta4 functions as a downstream signaling molecule of type 1 metabotropic glutamate receptors (mGluR1s). The thalamic mGluR1-PI-PLC-beta4 cascade is essential for formalin-induced inflammatory pain by regulating the response of ventral posterolateral thalamic nucleus (VPL) neurons. Moreover, PI-PLC-beta4 is essential for long-term depression (LTD) in the rostral cerebellum, which may be required for the acquisition of the conditioned eyeblink response. Besides, PI-PLC-beta4 may play an important role in maintenance of the status epilepticus. Mutations of PI-PLC-beta4 have been identified as the major cause of autosomal dominant auriculocondylar syndrome (ACS). PI-PLC-beta4 contains a core set of domains, including an N-terminal pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core, and a single C2 domain. Besides, it has a unique C-terminal coiled-coil (CT) domain necessary for homodimerization. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence.	153
320042	cd16212	EFh_NorpA_like	EF-hand motif found in Drosophila melanogaster No receptor potential A protein (NorpA) and similar proteins. NorpA, also termed 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase, is an eye-specific phosphoinositide phospholipase C (PI-PLC) encoded by norpA gene in Drosophila. It is expressed predominantly in photoreceptors and plays an essential role in the phototransduction pathway of Drosophila. A mutation within the norpA gene can render the fly blind without affecting any of the obvious structures of the eye. Like beta-class of vertebrate PI-PLCs, NorpA contains an N-terminal pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core, and a single C2 domain. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence.	153
320043	cd16213	EFh_PI-PLC21	EF-hand motif found in phosphoinositide phospholipase PLC21 and similar proteins. The family includes invertebrate homologs of phosphoinositide phospholipase C beta (PI-PLC-beta) named PLC21 from cephalopod retina. It also includes PLC21 encoded by plc-21 gene, which is expressed in the central nervous system of Drosophila. Like beta-class of vertebrate PI-PLCs, PLC21 contains an N-terminal pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core, and a single C2 domain. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence.	154
320044	cd16214	EFh_PI-PLCgamma1	EF-hand motif found in phosphoinositide phospholipase C gamma 1 (PI-PLC-gamma1). PI-PLC-gamma1, also termed 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase gamma-1, or PLC-148, or phospholipase C-II (PLC-II), or phospholipase C-gamma-1 (PLC-gamma-1), is abundantly expressed in embryonal cortical structures, neurons, oligodendrocytes and astrocytes, and is involved in various cellular events, including proliferation, differentiation, migration, survival, and cell death. It also associates with many diseases, including epilepsy, Huntington's disease (HD), depression, Alzheimer's disease (AD) and bipolar disorder. PI-PLC-gamma1 plays a critical role in cell migration and tumor cell invasiveness and metastasis. It can mediate the cell motility effects of growth factors, including platelet-derived growth factor (PDGF), epidermal growth factor (EGF), insulin-like growth factor (IGF) and hepatocyte growth factor (HGF), as well as adhesion receptors. Moreover, PI-PLC-gamma1 can modulate neurite outgrowth, neuronal cell migration and synaptic plasticity through the Trk receptor. PI-PLC-gamma1 contains an N-terminal pleckstrin homology (PH) domain, an array of EF hands, a PLC catalytic core domain, and a C2 domain. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. Besides, PI-PLC-gamma1 has a second PH domain, two SH2 (Src homology 2) regions, and one SH3 (Src homology 3) region, which are present within this linker. PI-PLC-gamma1 is activated by receptor and non-receptor tyrosine kinases via its two SH2 and a single SH3 domain.	146
320045	cd16215	EFh_PI-PLCgamma2	1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase gamma-2. PI-PLC-gamma2, also termed 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase gamma-2, or phospholipase C-IV (PLC-IV), or phospholipase C-gamma-2 (PLC-gamma-2), is highly expressed in cells of hematopoietic origin. It has been implicated in cell motility important to invasion and dissemination of tumor cells. As an important component of the B cell receptor (BCR) signaling pathway, PI-PLC-gamma2 is required for efficient formation of germinal center (GC) and memory B cells. It works as a critical effector stimulating the increase of intracellular Ca2+ and activates various signaling pathways downstream of the BCR. Moreover, PI-PLC-gamma2 has been implicated in Fc receptor-mediated degranulation of mast cells, integrin signaling in platelets, as well as integrin and Fc receptor-mediated neutrophil functions. It also acts as a crucial signaling mediator modifying DC gene expression program to activate DC responses to beta-glucan-containing pathogens. PI-PLC-gamma2 contains an N-terminal pleckstrin homology (PH) domain, an array of EF hands, a PLC catalytic core domain, and a C2 domain. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. Besides, PI-PLC-gamma2 has a second PH domain, two SH2 (Src homology 2) regions, and one SH3 (Src homology 3) region, which are present within this linker. PI-PLC-gamma2 is activated by receptor and non-receptor tyrosine kinases via its two SH2 and a single SH3 domain. Unlike PI-PLC-gamma1, the activation of PI-PLC-gamma2 may require concurrent stimulation of PI 3-kinase.	154
320046	cd16216	EFh_PI-PLCgamma1_like	EF-hand motif found in 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase gamma-1-like proteins. This family corresponds to a small group of uncharacterized 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase gamma-1-like (PI-PLC-gamma1-like) proteins. Although their biological function remains unclear, they show high sequence similarity with other phosphoinositide phospholipase C gamma proteins. They contain a core set of domains, including an N-terminal pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core, and a single C2 domain. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. A second PH domain, which is split by two SH2 (Src homology 2) domains, and one SH3 (Src homology 3) domain, are present within this linker.	150
320047	cd16217	EFh_PI-PLCdelta1	EF-hand motif found in phosphoinositide phospholipase C delta 1 (PI-PLC-delta1). PI-PLC-delta1, also termed 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase delta-1 (PLCD1), or phospholipase C-III (PLC-III), or phospholipase C-delta-1 (PLC-delta-1), is present in high abundance in the brain, heart, lung, skeletal muscle and testis. It is activated by high calcium levels generated by other PI-PLC family members, and therefore functions as a calcium amplifier within the cell. PI-PLC-delta1 is required for maintenance of homeostasis in skin and metabolic tissues. Moreover, it is essential in trophoblasts for placental development. Simultaneous loss of PI-PLC-delta1 may cause placental vascular defects, leading to embryonic lethality. PI-PLC-delta1 can be positively or negatively regulated by several binding partners, including p122/Rho GTPase activating protein (RhoGAP), Gha/Transglutaminase II, RalA, and calmodulin. It is involved in Alzheimer's disease and hypertension. Furthermore, PI-PLC-delta1 regulates cell proliferation and cell-cycle progression from G1- to S-phase by control of cyclin E-CDK2 activity and p27 levels. It can be activated by alpha1-adrenoreceptors (AR) in a calcium-dependent manner and may be important for G protein-coupled receptors (GPCR) responses in vascular smooth muscle (VSM). PI-PLC-delta1 may also be involved in noradrenaline (NA)-induced phosphatidylinositol-4,5-bisphosphate (PIP2) hydrolysis and modulate sustained contraction of mesenteric small arteries. In addition, it inhibits thermogenesis and induces lipid accumulation, and therefore contributes to the development of obesity. PI-PLC-delta1 contains a core set of domains, including an N-terminal pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core, and a single C-terminal C2 domain. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. PI-PLC-delta1 can regulate the binding of PH domain to PIP2 in a Ca2+-dependent manner through its functionally important EF-hand domains. In addition, PI-PLC-delta1 possesses a classical leucine-rich nuclear export sequence (NES) located in the EF hand motifs, as well as a nuclear localization signal within its linker region, both of which may be responsible for translocating PI-PLC-delta1 into and out of the cell nucleus.	139
320048	cd16218	EFh_PI-PLCdelta3	EF-hand motif found in phosphoinositide phospholipase C delta 3 (PI-PLC-delta3). PI-PLC-delta3, also termed 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase delta-3 (PLCD3), or phospholipase C-delta-3 (PLC-delta-3), is expressed abundantly in brain, skeletal muscle and heart. PI-PLC-delta3 gene expression is down-regulated by cAMP and calcium. PI-PLC-delta3 anchors myosin VI on the plasma membrane, and further modulates Myosin IV expression and microvilli formation in enterocytes. It negatively regulates RhoA expression, inhibits RhoA/Rho kinase signaling, and plays an essential role in normal neuronal migration by promoting neuronal outgrowth in the developing brain. Moreover, PI-PLC-delta3 is essential in trophoblasts for placental development. Simultaneous loss of PI-PLC-delta3 may cause placental vascular defects, leading to embryonic lethality. PI-PLC-delta3 contains a core set of domains, including an N-terminal pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core, and a single C-terminal C2 domain. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. In addition, PI-PLC-delta3 possesses a classical leucine-rich nuclear export sequence (NES) located in the EF hand motifs, which may be responsible for transporting PI-PLC-delta3 from the cell nucleus.	138
320049	cd16219	EFh_PI-PLCdelta4	EF-hand motif found in phosphoinositide phospholipase C delta 4 (PI-PLC-delta4). PI-PLC-delta4, also termed 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase delta-4 (PLCD4), or phospholipase C-delta-4 (PLC-delta-4), is expressed in various tissues with the highest levels detected selectively in the brain, skeletal muscle, testis and kidney. It plays a significant role in cell growth, cell proliferation, tumorigenesis, and in an early stage of fertilization. PI-PLC-delta4 may function as a key enzyme in the regulation of PtdIns(4,5)P2 levels and Ca2+ metabolism in nuclei in response to growth factors, and its expression may be partially regulated by an increase in cytoplasmic Ca2+. Moreover, PI-PLC-delta4 binds glutamate receptor-interacting protein1 (GRIP1) in testis and is required for calcium mobilization essential for the zona pellucida-induced acrosome reaction in sperm. Overexpression or dysregulated expression of PLCdelta4 may initiate oncogenesis in certain tissues through upregulating erbB1/2 expression, extracellular signal-regulated kinase (ERK) signaling pathway, and proliferation in MCF-7 cells. PI-PLC-delta4 contains an N-terminal pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core domain, and a C-terminal C2 domain. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. Unlike PI-PLC-delta 1 and 3, a putative nuclear export sequence (NES) located in the EF-hand domain, which may be responsible for transporting PI-PLC-delta1 and 3 from the cell nucleus, is not present in PI-PLC-delta4.	140
320050	cd16220	EFh_PI-PLCeta1	EF-hand motif found in phosphoinositide phospholipase C eta 1 (PI-PLC-eta1). PI-PLC-eta1, also termed 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase eta-1, or phospholipase C-eta-1 (PLC-eta-1), or phospholipase C-like protein 3 (PLC-L3), is a neuron-specific PI-PLC that is most abundant in the brain, particularly in the hippocampus, habenula, olfactory bulb, cerebellum, and throughout the cerebral cortex. It is also expressed in the zona incerta and in the spinal cord. PI-PLC-eta1 may perform a fundamental role in the brain. It may also act in synergy with other PLC subtypes. For instance, it is activated via intracellular Ca2+ mobilization and then plays a role in the amplification of GPCR (G-protein-coupled receptor)-mediated PLC-beta signals. In addition, its activity can be stimulated by ionomycin. PI-PLC-eta1 contains an N-terminal pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core domain, a C2 domain, and a unique C-terminal tail that terminates with a PDZ-binding motif, a potential interaction site for other signaling proteins. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. The C-terminal tail harbors a number of proline-rich motifs which may interact with SH3 (Src homology 3) domain-containing proteins, as well as many serine/threonine residues, suggesting possible regulation of interactions by protein kinases/phosphatases.	141
320051	cd16221	EFh_PI-PLCeta2	EF-hand motif found in phosphoinositide phospholipase C eta 2 (PI-PLC-eta2). PI-PLC-eta2, also termed 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase eta-2, or phosphoinositide phospholipase C-like 4, or phospholipase C-like protein 4 (PLC-L4), or phospholipase C-eta-2 (PLC-eta2), is a neuron-specific PI-PLC that is most abundant in the brain,  particularly in the hippocampus, habenula, olfactory bulb, cerebellum, and throughout the cerebral cortex. It is also expressed in the pituitary gland, pineal gland, retina, and lung, as well as in neuroendocrine cells. PI-PLC-eta2 has been implicated in the regulation of neuronal differentiation/maturation. It is required for retinoic acid-stimulated neurite growth. It may also in part function downstream of G-protein-coupled receptors and play an important role in the formation and maintenance of the neuronal network in the postnatal brain. Moreover, PI-PLC-eta2 acts as a Ca2+ sensor that shows a canonical EF-loop directing Ca2+-sensitivity and thus can amplify transient Ca2+ signals. Its activation can be triggered either by intracellular calcium mobilization or by G beta-gamma signaling. PI-PLC-eta2 contains an N-terminal pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core domain, a C2 domain, and a unique C-terminal tail that terminates with a PDZ-binding motif, a potential interaction site for other signaling proteins. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. The C-terminal tail harbors a number of proline-rich motifs which may interact with SH3 (Src homology 3) domain-containing proteins, as well as many serine/threonine residues, suggesting possible regulation of interactions by protein kinases/phosphatases.	141
320052	cd16222	EFh_PRIP1	EF-hand motif found in phospholipase C-related but catalytically inactive protein 1 (PRIP-1). PRIP-1, also termed phospholipase C-deleted in lung carcinoma, or inactive phospholipase C-like protein 1 (PLC-L1), or p130, is a novel inositol 1,4,5-trisphosphate (InsP3) binding protein that is predominantly expressed in the brain. It is involved in the InsP3-mediated calcium signaling pathway and the GABA(A) receptor-mediated signaling pathway. It interacts with the catalytic subunits of protein phosphatase 1 (PP1) and protein phosphatase 2A (PP2A), and functions as a scaffold to regulate the activities and subcellular localizations of both PP1 and PP2A in phospho-dependent cellular signaling. It also promotes the translocation of phosphatases to lipid droplets to trigger the dephosphorylation of hormone-sensitive lipase (HSL) and perilipin A, thus reducing protein kinase A (PKA)-mediated lipolysis. Moreover, PRIP-1 plays an important role in insulin granule exocytosis through the association with GABAA-receptor-associated protein (GABARAP) to form a complex to regulate KIF5B-mediated insulin secretion. It also inhibits regulated exocytosis through direct interactions with syntaxin 1 and synaptosomal-associated protein 25 (SNAP-25) via its C2 domain. Furthermore, PRIP-1 has been implicated in the negative regulation of bone formation. PRIP-1 has a primary structure and domain architecture, incorporating a pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core domain with highly conserved X- and Y-regions split by a linker sequence, and a C-terminal C2 domain, similar to phosphoinositide-specific phospholipases C (PI-PLC, EC 3.1.4.11)-delta isoforms. Due to replacement of critical catalytic residues, PRIP-1 does not have PLC enzymatic activity.	143
320053	cd16223	EFh_PRIP2	EF-hand motif found in phospholipase C-related but catalytically inactive protein 2 (PRIP-2). PRIP-2, also termed phospholipase C-L2, or phospholipase C-epsilon-2 (PLC-epsilon-2), or inactive phospholipase C-like protein 2 (PLC-L2), is a novel inositol 1,4,5-trisphosphate (InsP3) binding protein that exhibits a relatively ubiquitous expression. It functions as a novel negative regulator of B-cell receptor (BCR) signaling and immune responses. PRIP-2 has a primary structure and domain architecture, incorporating a pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core domain with highly conserved X- and Y-regions split by a linker sequence, and a C-terminal C2 domain, similar to phosphoinositide-specific phospholipases C (PI-PLC, EC 3.1.4.11)-delta isoforms. Due to replacement of critical catalytic residues, PRIP-2 does not have PLC enzymatic activity.	144
320022	cd16224	EFh_CREC_RCN2	EF-hand, calcium binding motif, found in reticulocalbin-2 (RCN2). RCN2, also termed calcium-binding protein ERC-55, or E6-binding protein (E6BP), or TCBP-49, is an endoplasmic reticulum resident low-affinity Ca2+-binding protein that has been implicated in immunity, redox homeostasis, cell cycle regulation and coagulation. It is associated with tumorigenesis, in particular with transformation of cells of the cervix induced by human papillomavirus (HPV), through binding to human papillomavirus (HPV) E6 oncogenic protein. It specifically interacts with vitamin D receptor among nuclear receptors. RCN2 contains an N-terminal signal sequence followed by six copies of the EF-hand Ca2+-binding motif, and a C-terminal His-Asp-Glu-Leu (HDEL) tetrapeptide that is required for retention of RCN2 in the endoplasmic reticulum (ER).	268
320023	cd16225	EFh_CREC_cab45	EF-hand, calcium binding motif, found in 45 kDa calcium-binding protein (Cab45). Cab45, also termed stromal cell-derived factor 4 (SDF-4), is a soluble, lumenal Golgi resident low-affinity Ca2+-binding protein that contains six copies of the EF-hand Ca2+-binding motif. It is required for secretory pathway calcium ATPase1 (SPCA1)-dependent Ca2+ import into the trans-Golgi network (TGN) and plays an essential role in Ca2+-dependent secretory cargo sorting at the TGN.	278
320024	cd16226	EFh_CREC_Calumenin_like	EF-hand, calcium binding motif, found in calumenin, reticulocalbin-1 (RCN-1), reticulocalbin-3 (RCN-3), and similar proteins. The family corresponds to a group of six EF-hand Ca2+-binding proteins, including calumenin (also known as crocalbin or CBP-50), reticulocalbin-1 (RCN-1), reticulocalbin-3 (RCN-3), and similar proteins.  Calumenin is an endo/sarcoplasmic reticulum (ER/SR) resident low-affinity Ca2+-binding protein that contains six EF-hand domains and a C-terminal SR retention signal His-Asp-Glu-Phe (HDEF) tetrapeptide. It functions as a novel regulator of SERCA2, and its expressional changes are tightly coupled with Ca2+-cycling of cardiomyocytes. It is also broadly involved in haemostasis and in the pathophysiology of thrombosis. Moreover, the extracellular calumenin acts as a suppressor of cell migration and tumor metastasis. RCN-1 is an endoplasmic reticulum resident Ca2+-binding protein with a carboxyl-terminal His-Asp-Glu-Leu (HDEL) tetrapeptide signal. It acts as a potential negative regulator of B-RAF activation and can negatively modulate cardiomyocyte hypertrophy by inhibition of the mitogen-activated protein kinase signalling cascade. It also plays a key role in the development of doxorubicin-associated resistance. RCN-3 is a putative six EF-hand Ca2+-binding protein that contains five RXXR (X is any amino acid) motifs and a C-terminal ER retrieval signal HDEL tetrapeptide. The RXXR motif represents the target sequence of subtilisin-like proprotein convertases (SPCs). RCN-3 is specifically bound to the paired basic amino-acid-cleaving enzyme-4 (PACE4) precursor protein and plays an important role in the biosynthesis of PACE4.	264
320025	cd16227	EFh_CREC_RCN2_like	EF-hand, calcium binding motif, found in reticulocalbin-2 (RCN2) mainly from protostomes. This family corresponds to a group of uncharacterized RCN2-like proteins, which are mainly found in protostomes. Although their biological function remains unclear, they show high sequence similarity with RCN2 (also known as E6BP or TCBP-49), which is an endoplasmic reticulum resident low-affinity Ca2+-binding protein that has been implicated in immunity, redox homeostasis, cell cycle regulation and coagulation. Members in this family contain six copies of the EF-hand Ca2+-binding motif, but may lack a C-terminal His-Asp-Glu-Leu (HDEL) tetrapeptide that is required for retention of RCN2 in the endoplasmic reticulum (ER).	263
320026	cd16228	EFh_CREC_Calumenin	EF-hand, calcium binding motif, found in calumenin. Calumenin, also termed crocalbin, or IEF SSP 9302, is an endo/sarcoplasmic reticulum (ER/SR) resident low-affinity Ca2+-binding protein that contains six EF-hand domains and a C-terminal SR retention signal His-Asp-Glu-Phe (HDEF) tetrapeptide. It is highly expressed in various brain regions. Thus it plays an important role in migration and differentiation of neurons, and/or in Ca2+ signaling between glial cells and neurons. Calumenin is involved in Ca2+ homeostasis through interacting with ryanodine receptor RyR2 and SERCA2. It acts as a novel regulator of SERCA2, and its expressional changes are tightly coupled with Ca2+-cycling of cardiomyocytes. Calumenin also forms a Ca2+-dependent complex with thrombospondin-1, which is broadly involved in haemostasis and thrombosis. Moreover, calumenin is a molecular chaperone that endogenously regulates the vitamin K-dependent gamma-carboxylation of several proteins, including blood coagulation factors (such as FII, FVII, FIX, FX, and proteins C, S and Z), cell survival factors (Gas6) and bone metabolism proteins (such as matrix Gla protein or MGP, osteocalcin and periostin), through targeting the gamma-glutamyl carboxylase. It also functions as a charged F508del-cystic fibrosis transmembrane regulator (CFTR) folding modulator, as well as a G551D-CFTR associated protein. Furthermore, the extracellular calumenin acts as a suppressor of cell migration and tumor metastasis. It binds to and stabilizes fibulin-1, and further inactivates extracellular signal-regulated kinases 1 and 2 (ERK1/2) signaling.	263
320027	cd16229	EFh_CREC_RCN1	EF-hand, calcium binding motif, found in reticulocalbin-1 (RCN-1). RCN-1 is an endoplasmic reticulum resident low-affinity Ca2+-binding protein with six EF-hand motifs and a carboxyl-terminal His-Asp-Glu-Leu (HDEL) tetrapeptide signal. It is expressed at the cell surface. RCN-1 acts as a potential negative regulator of B-RAF activation and can negatively modulate cardiomyocyte hypertrophy by inhibition of the mitogen-activated protein kinase signaling cascade. It also plays a key role in the development of doxorubicin-associated resistance.	267
320028	cd16230	EFh_CREC_RCN3	EF-hand, calcium binding motif, found in reticulocalbin-3 (RCN-3). RCN-3, also termed EF-hand calcium-binding protein RLP49, is a putative six EF-hand Ca2+-binding protein that contains five RXXR (X is any amino acid) motifs and a C-terminal ER retrieval signal His-Asp-Glu-Leu (HDEL) tetrapeptide. The RXXR motif represents the target sequence of subtilisin-like proprotein convertases (SPCs). RCN-3 is specifically bound to the paired basic amino-acid-cleaving enzyme-4 (PACE4) precursor protein and plays an important role in the biosynthesis of PACE4.	268
320010	cd16231	EFh_SPARC_like	EF-hand, extracellular calcium-binding (EC) motif, found in secreted protein acidic and rich in cysteine (SPARC) and similar proteins. This family includes secreted protein acidic and rich in cysteine (SPARC), secreted protein, acidic and rich in cysteine-like 1 (SPARCL1), and similar proteins. SPARC is a prototypic collagen-binding matricellular protein that is involved in extracellular matrix (ECM) assembly and fibrosis through binding both fibrillar collagen and basal lamina collagen IV. It regulates the activity of matrix metalloproteinases (MMPs), as well as the growth factor signaling mediated by cell surface receptors including vascular endothelial growth factor (VEGF) receptor, basic fibroblast growth factor (bFGF), and transforming growth factor (TGF) beta1. It also shows survival activity in tumor progression. SPARC contains an N-terminal acidic 52-residue segment followed by a follistatin-like (FS) domain, and an alpha-helical EC domain with 2 unusual calcium-binding EF-hands and the collagen-binding site. SPARCL1 is the closest family member to SPARC. It shares the three primary domains contained within SPARC with an expanded N-terminal domain. SPARCL1 may function as both a tumor suppressor and as a regulator of angiogenesis. It can bind to collagens and be counter-adhesive to wild-type dermal fibroblasts, but does not influence rates of cell proliferation. Moreover, SPARCL1 can influence central nervous system (CNS) development and synaptic rearrangement.	116
320011	cd16232	EFh_SPARC_TICN	EF-hand, extracellular calcium-binding (EC) motif, found in testicans. Testicans are nervous system-expressed proteoglycans that play important roles in the regulation of protease activity, as well as in the determination of age at menarche. Testican-1 (TICN1, also termed protein SPOCK) is a secreted chimeric proteoglycan that is highly expressed in brain and carries both chondroitin and heparan sulfate glycosaminoglycan side chains. It has been implicated in autoimmune disease. It also acts as a regulator of bone morphogenetic protein (BMP) signaling and shows critical functions in the nervous system. Testican-2 (TICN2, also termed protein SPOCK2) is an extracellular heparan sulphate proteoglycan highly expressed in brain. It may play regulatory roles in the development of the central nervous system. It also participates in diverse steps of neurogenesis. TICN1, but not TICN2, inhibits cathepsin L. TICN1 also inhibits attachment and neurite outgrowth in cultures of N2A neuroblastoma cells, while TICN2 is able to inhibit neurite outgrowth from primary cerebellar cells. Testicans contain an N-terminal signal peptide, a testican-specific domain followed by a follistatin-like (FS) domain, an extracellular calcium-binding (EC) domain including a pair of EF hands, a thyroglobulin-like domain (TY), and a C-terminal region with two putative glycosaminoglycan attachment sites. The substitution of a ligating Asp residue by Phe or Tyr in the +Y position of EF-hand 2 in testicans could prevent Ca2+ binding to this site and also cause EF-hand 1 to bind one Ca2+ ion with low affinity.	108
320012	cd16233	EFh_SPARC_FSTL1	EF-hand, extracellular calcium-binding (EC) motif, found in follistatin-related protein 1 (FRP-1). FRP-1, also termed follistatin-like protein 1 (fstl-1), TGF-beta-stimulated clone 36 (TSC-36/Flik), or TGF-beta inducible protein, is a secreted glycoprotein that is overexpressed in certain inflammatory diseases and has been implicated in many autoimmune diseases. FRP-1 functions as an important proinflammatory factor in the pathogenesis of osteoarthritis (OA) by activating the canonical NF-kappaB-mediated inflammatory cytokines, including tumor necrosis factor alpha (TNF-alpha), interleukin-1beta (IL-1beta) and interleukin-6 (IL-6), and enhancing fibroblast like synoviocytes proliferation. It also acts as a critical mediator of collagen-induced arthritis (CIA), juvenile rheumatoid arthritis (JRA), as well as Lyme arthritis observed after Borrelia burgdorferi infection. Meanwhile, it enhances nod-like receptor family, pyrin domain containing 3 (NLRP3) inflammasome-mediated IL-1beta secretion from monocytes and macrophages. Moreover, FRP-1 shows critical functions in the nervous system. It differentially regulates transforming growth factor beta (TGF-beta) and bone morphogenetic protein (BMP) signaling, leading to epithelial injury and fibroblast activation. Furthermore, FRP-1 functions as a cardiokine with cardioprotective properties. It may play a potential role in ischemic stroke through decreasing neuronal apoptosis and improving neurological deficits via disco-interacting protein 2 homolog A (DIP2A)/Akt pathway after middle cerebral artery occlusion (MCAO). Plasma FRP-1 is elevated in Kawasaki disease (KD) and thus may play a possible role in the formation of coronary artery aneurysm (CAA). FRP-1 contains a follistatin-like (FS) domain, an extracellular calcium-binding (EC) domain including a pair of EF hands, and a von Willebrand factor type C (VWC) domain. The EC domain does not undergo characteristic structural changes upon calcium addition or depletion and therefore is not a functional calcium binding domain.	114
320013	cd16234	EFh_SPARC_SMOC	EF-hand, extracellular calcium-binding (EC) motif, found in secreted modular calcium-binding protein SMOC-1, SMOC-2, and similar proteins. SMOC proteins correspond to a group of matricellular proteins that are involved in direct or indirect modulation of growth factor signaling pathways and play diverse roles in physiological processes involving extensive tissue remodeling, migration, proliferation, and angiogenesis. They may mediate intercellular signaling and cell type-specific differentiation during gonad and reproductive tract development. SMOC-1 is localized in basement membranes. Its mutations have been found to be associated with individuals with Waardenburg Anophthalmia Syndrome. SMOC-2 is ubiquitously expressed and is involved in angiogenesis and the regulation of cell cycle progression. It enhances the angiogenic effect of basic fibroblast growth factor (bFGF) and vascular endothelial growth factor (VEGF). It has also been implicated in generalized vitiligo. SMOC proteins consist of a follistatin-like (FS) domain, two thyroglobulin-like (TY) domains, a novel domain conserved only in SMOC proteins, and an extracellular calcium-binding (EC) domain with two EF-hand calcium-binding motifs.	104
320014	cd16235	EFh_SPARC_SPARC	EF-hand, extracellular calcium-binding (EC) motif, found in secreted protein acidic and rich in cysteine (SPARC). SPARC, also termed basement-membrane protein 40 (BM-40), or osteonectin (ON), is a prototypic collagen-binding matricellular protein that is essential for embryo development in invertebrates and highly expressed in bone. It participates in normal tissue remodeling as it regulates the deposition of extracellular matrix, as well as in neoplastic transformation. It is involved in extracellular matrix (ECM) assembly and fibrosis through binding both fibrillar collagen and basal lamina collagen IV. It regulates the activity of matrix metalloproteinases (MMPs), as well as the growth factor signaling mediated by cell surface receptors including vascular endothelial growth factor (VEGF) receptor, basic fibroblast growth factor (bFGF), and transforming growth factor (TGF) beta1. SPARC shows survival activity in tumor progression. It plays a role in the metastatic process to the lung during melanoma progression. It can suppress prostate cancer cell growth and survival. Moreover, SPARC is a bone-associated protein that has a major role in bone development and mineralisation. It is involved in the initiation and progression of vascular calcification and upregulated by adiponectin. Furthermore, SPARC may be one of the molecules that govern the uptake and delivery of proteins from blood to the cerebrospinal fluid (CSF) during brain development. SPARC contains an N-terminal acidic 52-residue segment followed by a follistatin-like (FS) domain, and an alpha-helical EC domain with 2 unusual calcium-binding EF-hands and the collagen-binding site. Platelet-derived growth factor (PDGF) also interacts with its EC domain, but in a calcium-independent manner, whereas collagen binding is calcium-dependent.	96
320015	cd16236	EFh_SPARC_SPARCL1	EF-hand, extracellular calcium-binding (EC) motif, found in secreted protein, acidic and rich in cysteine-like 1 (SPARCL1). SPARCL1, also termed SPARC-like protein 1, or high endothelial venule protein (Hevin), or MAST 9, or SC-1, or RAGS-1, or QR1, or ECM 2, is a diversely expressed and developmentally regulated extracellular matrix glycoprotein involved in tissue repair and remodeling via interaction with the surrounding extracellular matrix (ECM) proteins. It plays a pivotal role in corneal wound healing. SPARCL1 may function as both a tumor suppressor and as a regulator of angiogenesis. It regulates cell migration/invasion and suppresses metastasis in many cancers, including prostate cancer, colorectal cancer, gastric cancer, and breast cancer. It can bind to collagens and be counter-adhesive to wild-type dermal fibroblasts, but does not influence rates of cell proliferation. Moreover, SPARCL1 contributes to neural development and participates in remodeling events associated with neuronal degeneration following neural injury. It can influence central nervous system (CNS) development and synaptic rearrangement. SPARCL1 is the closest family member to secreted protein acidic and rich in cysteine (SPARC), but does not compensate for the absence of SPARC in the CNS. SPARC contains an N-terminal acidic 52-residue segment followed by a follistatin-like (FS) domain, and an alpha-helical EC domain with 2 unusual calcium-binding EF-hands and the collagen-binding site. SPARCL1 shares the three primary domains contained within SPARC with an expanded N-terminal domain.	93
320016	cd16237	EFh_SPARC_TICN1	EF-hand, extracellular calcium-binding (EC) motif, found in testican-1 (TICN1). TICN1, also termed protein SPOCK, or SPARC/osteonectin, CWCV, and Kazal-like domains proteoglycan 1 (Spock1), is a secreted chimeric proteoglycan that is highly expressed in brain and carries both chondroitin and heparan sulfate glycosaminoglycan side chains. It promotes resistance against Pseudomonas aeruginosa-induced keratitis through regulation of matrix metalloproteinase (MMP)-2 expression and activation. It also acts as a potential cancer prognostic marker that promotes the proliferation and metastasis of gallbladder cancer cells by activating the PI3K/Akt pathway. Moreover, TICN1 corresponding gene SPOCK1 is a novel transforming growth factor-beta target gene that regulates lung cancer cell epithelial-mesenchymal transition. It is also up-regulated by chromodomain helicase/adenosine triphosphatase DNA binding protein 1-like (CHD1L), and promotes human hepatocellular carcinoma (HCC) cell invasiveness and metastasis. Furthermore, TICN1 inhibits the lysosomal cysteine protease cathepsin L in intracellular vesicles and in the extracellular milieu. TICN1 contains an N-terminal signal sequence known to direct nascent polypeptides to the extracellular space, an unique region to the testicans, a follistatin (FS)-like domain generally involving five disulfide bridges, an extracellular calcium-binding (EC) domain including a pair of EF hands, and a thyroglobulin type-1 (TY) domain followed by a C-terminal acidic region with high density of negatively charged amino acids. The substitution of a ligating Asp residue by Phe291 in the +Y position of EF-hand 2 in TICN1 could prevent Ca2+ binding to this site and also cause  EF-hand 1 to bind one Ca2+ ion with low affinity.	112
320017	cd16238	EFh_SPARC_TICN2	EF-hand, extracellular calcium-binding (EC) motif, found in testican-2 (TICN2). TICN2, also termed SPARC/osteonectin, CWCV, and Kazal-like domains proteoglycan 2 (Spock2), is an extracellular heparan sulphate proteoglycan expressed in brain, lung, and testis. It inhibits neurite extension from cultured primary cerebellar neurons and may play regulatory roles in the development of the central nervous system. It also participates in diverse steps of neurogenesis. Moreover, TICN2 may contribute to ECM remodeling by regulating function(s) of other testican family members, which possess membrane-type matrix metalloproteinases (MT-MMPs) inhibitory function. Furthermore, TICN2 corresponding gene SPOCK2 acts as a susceptibility gene for bronchopulmonary dysplasia. TICN2 contains an N-terminal signal peptide, a testican-specific domain followed by a follistatin-like (FS) domain, an extracellular calcium-binding (EC) domain including a pair of EF hands, a thyroglobulin-like domain (TY), and a C-terminal region with two putative glycosaminoglycan attachment sites. The substitution of a ligating Asp residue by Tyr292 in the +Y position of EF-hand 2 in TICN2 could prevent Ca2+ binding to this site and also cause  EF-hand 1 to bind one Ca2+ ion with low affinity.	112
320018	cd16239	EFh_SPARC_TICN3	EF-hand, extracellular calcium-binding (EC) motif, found in testican-3 (TICN3). TICN3, also termed SPARC/osteonectin, CWCV, and Kazal-like domains proteoglycan 3 (Spock3), is a brain-specific heparan sulfate proteoglycan that shows a widespread distribution within the extracellular matrix of the brain. It plays an important role in the formation or maintenance of major neuronal structures in the brain. It also functions as a novel regulator to reduce the activity of matrix metalloproteinase (MMP) in adult T-cell leukemia (ATL). It suppresses membrane-type 1 MMP-mediated MMP-2 activation and tumor invasion. Moreover, TICN3 corresponding gene SPOCK3 acts as a risk gene for adult attention-deficit/hyperactivity disorder (ADHD) and personality disorders. TICN3 contains an N-terminal signal peptide, a testican-specific domain followed by the follistatin-like (FS) and extracellular calcium-binding (EC) domains characteristic of the BM-40 family. Towards the C-terminus they contain a thyroglobulin-like domain (TY) and a novel sequence (domain V), which includes two potential glycosaminoglycan attachment sites. The substitution of a ligating Asp residue by Tyr295 in the +Y position of EF-hand 2 in testican-3 could prevent Ca2+ binding to this site and also cause  EF-hand 1 to bind one Ca2+ ion with low affinity.	113
320019	cd16240	EFh_SPARC_SMOC1	EF-hand, extracellular calcium-binding (EC) motif, found in secreted modular calcium-binding protein 1 (SMOC-1). SMOC-1, also termed SPARC-related modular calcium-binding protein 1, or smooth muscle-associated protein 1 (SMAP-1), is an Arf6 GTPase-activating protein (GAP) that directly interacts with clathrin and regulates the clathrin-dependent endocytosis of transferrin receptors from the plasma membrane. It is predominantly localized in basement membranes. SMOC-1 acts as a regulator of osteoblast differentiation and is involved in inhibition of transforming growth factor-beta (TGF-beta) signaling through production of nitric oxide. It also plays an essential role in ocular and limb development and functions as a regulator of bone morphogenic protein (BMP) signaling. It interacts with a matricellular protein, tenascin C, in addition to the serum proteins fibulin-1 and C-reactive protein, but not collagens. Two point mutations in the SMOC1 gene may cause Waardenburg Anophthalmia Syndrome. Moreover, SMOC-1 is involved in direct or indirect modulation of growth factor signaling pathways and plays a role in physiological processes involving extensive tissue remodeling. SMOC-1 contains a follistatin-like (FS) domain, two thyroglobulin-like (TY) domains, a novel domain, which is found only in the homologous SMOC-2, and an extracellular calcium-binding (EC) domain with two EF-hand calcium-binding motifs.	115
320020	cd16241	EFh_SPARC_SMOC2	EF-hand, extracellular calcium-binding (EC) motif, found in secreted modular calcium-binding protein 2 (SMOC-2). SMOC-2, also termed SPARC-related modular calcium-binding protein 2, or smooth muscle-associated protein 2 (SMAP-2), is a ubiquitously expressed matricellular protein that enhances the response to angiogenic growth factors and mediates cell adhesion, keratinocyte migration, and metastasis. It is also associated with vitiligo and craniofacial and dental defects. Moreover, SMOC-2 acts as an Arf1 GTPase-activating protein (GAP) that interacts with clathrin heavy chain (CHC) and clathrin assembly protein CALM and functions in the retrograde, early endosome/trans-Golgi network (TGN) pathway in a clathrin- and AP-1-dependent manner. It also contributes to mitogenesis via activation of integrin-linked kinase (ILK). SMOC-2 contains a follistatin-like (FS) domain, two thyroglobulin-like (TY) domains, a novel domain, which is found only in the homologous SMOC-1, and an extracellular calcium-binding (EC) domain with two EF-hand calcium-binding motifs.	114
320000	cd16242	EFh_DMD_like	EF-hand-like motif found in the dystrophins subfamily. This dystrophins subfamily includes dystrophin and its two paralogs, utrophin and DRP-2. Dystrophin is a large, submembrane cytoskeletal protein that is the main component of the dystrophin-glycoprotein complex (DGC) in skeletal muscle. It links the transmembrane DGC to the actin cytoskeleton through binding strongly to the cytoplasmic tail of beta-dystroglycan, the transmembrane subunit of a highly O-glycosylated cell-surface protein. Dystrophin also involves in maintaining the structural integrity of cells, as well as in the formation of the blood-brain barrier (BBB). Utrophin, also termed dystrophin-related protein 1 (DRP-1), is an autosomal dystrophin homologue that increases dystrophic muscle function and reduces pathology. It is broadly expressed at both the mRNA and protein levels, and occurs in the cerebrovascular endothelium. Utrophin forms the utrophin-glycoprotein complex (UGC) by interacting with the dystroglycans (DGs) and the sarcoglycan-dystroglycans, sarcoglycans and sarcospan (SG-SSPN) subcomplex. It may act as a scaffolding protein that stabilizes lipid microdomains and clusters mechanosensitive channel subunits, and link the F-actin cytoskeleton to the cell membrane via the associated glycoprotein complex. DRP-2 is mainly expressed in the vertebrate central nervous system (CNS). It is associated with brain membrane fractions and highly enriched in the postsynaptic density. DRP-2 plays a role in the organization of central cholinergic synapses. It interacts with dystroglycan and L-Periaxin to form a transmembrane complex, which plays a role in Schwann cell-basal lamina interactions and in the regulation of the terminal stages of myelination. The dystrophins subfamily has been characterized by a compact cluster of domains comprising a WW domain, four EF-hand-like motifs and a ZZ-domain, followed by two syntrophin binding sites (SBSs) and a looser region with two coiled-coils.	163
320001	cd16243	EFh_DYTN	EF-hand-like motif found in dystrotelin and similar proteins. Dystrotelin is the vertebrate orthologue of Drosophila DAH, which is involved in the synchronised cellularization of thousands of nuclei in the syncytial early fly embryo (a specialised form of cytokinesis). Dystrotelin is mainly expressed in the developing central nervous system (CNS) and adult nervous and muscular tissues. Heterologously expressed dystrotelin protein localizes spontaneously to the cytoplasmic membrane, and possibly to the endoplasmic reticulum (ER). Dystrotelin is not critical for mammalian development. It may be involved in other forms of cytokinesis. Its N-terminal region contains a compact cluster of domains comprising four EF-hand-like motifs and a ZZ-domain, followed by a looser region with two coiled-coils. These domains are believed to be involved in protein-protein interactions. The C-terminal region is extremely divergent. Unlike other superfamily members, dystrophin or dystrobrevin, the residues directly involved in beta-dystroglycan binding are not conserved in dystrotelin, which makes it unlikely that dystrotelin interacts with this ligand. Moreover, dystrotelin is unable to heterodimerize with members of the dystrophin or dystrobrevin families, or to homodimerize.	163
320002	cd16244	EFh_DTN	EF-hand-like motif found in dystrobrevins and similar proteins. Dystrobrevins are part of the dystrophin-glycoprotein complex (DGC). They physically associate with members of the dystrophin family and with the syntrophins through their homologous C-terminal coiled coil motifs. The family includes two paralogs dystrobrevins, alpha- and beta-dystrobrevin, both of which are cytoplasmic components of the dystrophin-associated protein complex that function as scaffold proteins in signal transduction and intracellular transport. Absence of alpha- and beta-dystrobrevin causes cerebellar synaptic defects and abnormal motor behavior. The dystrobrevins subfamily has been characterized by a compact cluster of domains comprising four EF-hand-like motifs and a ZZ-domain, followed by a looser region with two coiled-coils. These domains are believed to be involved in protein-protein interactions. In addition, dystrobrevins contain one or two syntrophin binding sites (SBSs).	161
320003	cd16245	EFh_DAH	EF-hand-like motif found in Drosophila melanogaster discontinuous actin hexagon (DAH) and similar proteins. DAH, the product of the dah (discontinuous actin hexagon) gene, is a Drosophila homolog to vertebrate dystrotelin. It is tightly membrane-associated and highly phosphorylated in a time-dependent fashion. DAH plays an essential role in the process of cellularization, and is associated with vesicles that convene at the cleavage furrow. The absence of DAH leads to severe disruption of the cleavage furrows around the nuclei, and development stalls. DAH contains a compact cluster of domains comprising four EF-hand-like motifs and a ZZ-domain, followed by a looser region with two coiled-coils.	164
320004	cd16246	EFh_DMD	EF-hand-like motif found in dystrophin. Dystrophin is a large, submembrane cytoskeletal protein that is the main component of the dystrophin-glycoprotein complex (DGC) in skeletal muscle. It links the transmembrane DGC to the actin cytoskeleton through binding strongly to the cytoplasmic tail of beta-dystroglycan, the transmembrane subunit of a highly O-glycosylated cell-surface protein. It is involved in maintaining the structural integrity of cells, as well as in the formation of the blood-brain barrier (BBB). The dystrophin subfamily has been characterized by a compact cluster of domains comprising four EF-hand-like motifs and a ZZ-domain, followed by a looser region with two coiled-coils. These domains are believed to be involved in protein-protein interactions. In addition, dystrophin contains two syntrophin binding sites (SBSs) and a long N-terminal extension that comprises two actin-binding calponin homology (CH) domains, approximately 24 spectrin repeats (SRs) and a WW domain. Mutations in dystrophin lead to Duchenne muscular dystrophy (DMD). Moreover, dystrophin deficiency is associated with abnormal cerebral diffusion and perfusion, and acute Trypanosoma cruzi infection.	162
320005	cd16247	EFh_UTRO	EF-hand-like motif found in utrophin. Utrophin, also termed dystrophin-related protein 1 (DRP-1), is an autosomal dystrophin homologue that increases dystrophic muscle function and reduces pathology. It is broadly expressed at both the mRNA and protein levels, and occurs in the cerebrovascular endothelium. Utrophin forms the utrophin-glycoprotein complex (UGC) by interacting with the dystroglycans (DGs) and the sarcoglycan-dystroglycans, sarcoglycans and sarcospan (SG-SSPN) subcomplex. It may act as a scaffolding protein that stabilizes lipid microdomains and clusters mechanosensitive channel subunits, and link the F-actin cytoskeleton to the cell membrane via the associated glycoprotein complex. Like dystrophin, Utrophin has a compact cluster of domains comprising four EF-hand-like motifs and a ZZ-domain, followed by a looser region with two coiled-coils. These domains are believed to be involved in protein-protein interactions. In addition, it contains two syntrophin binding sites (SBSs) and a long N-terminal extension that comprises two actin-binding calponin homology (CH) domains, up to 24 spectrin repeats (SRs) and a WW domain. However, utrophin lacks the intrinsic microtubule binding activity of dystrophin SRs.	162
320006	cd16248	EFh_DRP-2	EF-hand-like motif found in dystrophin-related protein 2 (DRP-2). DRP-2 is a dystrophin homologue mainly expressed in the vertebrate central nervous system (CNS). It is associated with brain membrane fractions and highly enriched in the postsynaptic density. DRP-2 plays a role in the organization of central cholinergic synapses. It interacts with dystroglycan and L-Periaxin to form a transmembrane complex, which plays a role in Schwann cell-basal lamina interactions and in the regulation of the terminal stages of myelination. Like dystrophin, DRP-2 has a compact cluster of domains comprising four EF-hand-like motifs and a ZZ-domain, followed by a looser region with two coiled-coils. These domains are believed to be involved in protein-protein interactions. In addition, it contains two syntrophin binding sites (SBSs) and a long N-terminal extension that comprises only two spectrin repeats (SRs) and a WW domain.	162
320007	cd16249	EFh_DTNA	EF-hand-like motif found in alpha-dystrobrevin. Alpha-dystrobrevin, also termed dystrobrevin alpha (DTN-A), or dystrophin-related protein 3 (DRP-3), is the mammalian ortholog of the Torpedo 87 kDa postsynaptic protein that tightly associates with dystrophin. It is a cytoplasmic protein expressed predominantly in skeletal muscle, heart, lung, and brain. Alpha-dystrobrevin has been implicated in the regulation of acetylcholine receptor (AChR) aggregate density and patterning. It is also essential in the pathogenesis of dystrophin-dependent muscular dystrophies. It plays a critical role in the full functionality of dystrophin through increasing dystrophin's binding to the dystrophin-glycoprotein complex (DGC), and provides protection during cardiac stress. Alpha-dystrobrevin binds to the intermediate filament proteins syncoilin and beta-synemin, thereby linking the dystrophin-associated protein complex (DAPC) to the intermediate filament network. Moreover, alpha-dystrobrevin involves in cell signaling via interaction with other proteins such as syntrophin, a modular adaptor protein that coordinates the assembly of the signaling proteins nitric oxide synthase, stress-activated protein kinase-3, and Grb2 to the DAPC. Furthermore, alpha-dystrobrevin plays an important role in muscle function, as well as in nuclear morphology maintenance through specific interaction with the nuclear lamina component lamin B1. In addition, alpha-dystrobrevin is required in dystrophin-associated protein scaffolding in brain. Absence of glial alpha-dystrobrevin causes abnormalities of the blood-brain barrier and progressive brain edema. Alpha-dystrobrevin has a compact cluster of domains comprising four EF-hand-like motifs and a ZZ-domain, followed by a looser region with two coiled-coils. These domains are believed to be involved in protein-protein interactions. In addition, alpha-dystrobrevin contain two syntrophin binding sites (SBSs).	161
320008	cd16250	EFh_DTNB	EF-hand-like motif found in beta-dystrobrevin. Beta-dystrobrevin, also termed dystrobrevin beta (DTN-B), is a dystrophin-related protein that is restricted to non-muscle tissues and is abundantly expressed in brain, lung, kidney, and liver. It may be involved in regulating chromatin dynamics, possibly playing a role in neuronal differentiation, through the interactions with the high mobility group HMG20 proteins iBRAF/HMG20a and BRAF35 /HMG20b. It also binds to and represses the promoter of synapsin I, a neuronal differentiation gene. Moreover, beta-dystrobrevin functions as a kinesin-binding receptor involved in brain development via the association with the extracellular matrix components pancortins. Furthermore, beta-dystrobrevin binds directly to dystrophin and is a cytoplasmic component of the dystrophin-associated glycoprotein complex, a multimeric protein complex that links the extracellular matrix to the cortical actin cytoskeleton and acts as a scaffold for signaling proteins such as protein kinase A. Absence of alpha- and beta-dystrobrevin causes cerebellar synaptic defects and abnormal motor behavior. Beta-dystrobrevin has a compact cluster of domains comprising four EF-hand-like motifs and a ZZ-domain, followed by a looser region with two coiled-coils. These domains are believed to be involved in protein-protein interactions. In addition, beta-dystrobrevin contain two syntrophin binding sites (SBSs).	161
319994	cd16251	EFh_parvalbumin_like	EF-hand, calcium binding motif, found in parvalbumin-like EF-hand family. The family includes alpha- and beta-parvalbumins, and a group of uncharacterized calglandulin-like proteins. Parvalbumins are small, acidic, cytosolic EF-hand-containing Ca2+-buffer and Ca2+ transporter/shuttle proteins belonging to EF-hand superfamily. They are expressed by vertebrates in fast-twitch muscle cells, specific neurons of the central and peripheral nervous system, sensory cells of the mammalian auditory organ (Corti's cell), and some other cells, and characterized by the presence of three consecutive EF-hand motifs (helix-loop-helix) called AB, CD, and EF, but only CD and EF can chelate metal ions, such as Ca2+ and Mg2+. Thus, they may play an additional role in Mg2+ handling. Moreover, parvalbumins represent one of the major animal allergens. In metal-bound states, parvalbumins possess a rigid and stable tertiary structure and display strong allergenicity. In contrast, the metal-free parvalbumins are intrinsically disordered, and the loss of metal ions results in a conformational change that decreases their IgE binding capacity. Furthermore, parvalbumins have been widely used as a neuronal marker for a variety of functional brain systems. They also function as a Ca2+ shuttle transporting Ca2+ from troponin-C (TnC) to the sarcoplasmic reticulum (SR) Ca2+ pump during muscle relaxation. Thus they may facilitate myocardial relaxation and play important roles in cardiac diastolic dysfunction. Parvalbumins consists of alpha- and beta- sublineages, which can be distinguished on the basis of isoelectric point (pI > 5 for alpha; pI 	101
319995	cd16252	EFh_calglandulin_like	EF-hand, calcium binding motif, found in uncharacterized calglandulin-like proteins. The family corresponds to a group of uncharacterized calglandulin-like proteins. Although their biological function remains unclear, they show high sequence similarity with human calglandulin-like protein GAGLP, which is an ortholog of calglandulin from the venom glands of Bothrops insularis snake. Both GAGLP and calglandulin are putative Ca2+-binding proteins with four EF-hand motifs. However, members in this family contain only three EF-hand motifs. In this respect, they may belong to the parvalbumin-like EF-hand family, which is characterized by the presence of three consecutive EF-hand motifs (helix-loop-helix).	106
319996	cd16253	EFh_parvalbumins	EF-hand, calcium binding motif, found in parvalbumins. Parvalbumins are small, acidic, cytosolic EF-hand-containing Ca2+-buffer and Ca2+ transporter/shuttle proteins belonging to EF-hand superfamily. They are expressed by vertebrates in fast-twitch muscle cells, specific neurons of the central and peripheral nervous system, sensory cells of the mammalian auditory organ (Corti's cell), and some other cells, and characterized by the presence of three consecutive EF-hand motifs (helix-loop-helix) called AB, CD, and EF, but only CD and EF can chelate metal ions, such as Ca2+ and Mg2+. Thus, they may play an additional role in Mg2+ handling. Moreover, parvalbumins represent one of the major animal allergens. In metal-bound states, parvalbumins possess a rigid and stable tertiary structure and display strong allergenicity. In contrast, the metal-free parvalbumins are intrinsically disordered, and the loss of metal ions results in a conformational change that decreases their IgE binding capacity. Furthermore, parvalbumins have been widely used as a neuronal marker for a variety of functional brain systems. They also function as a Ca2+ shuttle transporting Ca2+ from troponin-C (TnC) to the sarcoplasmic reticulum (SR) Ca2+ pump during muscle relaxation. Thus they may facilitate myocardial relaxation and play important roles in cardiac diastolic dysfunction. Parvalbumins consists of alpha- and beta- sublineages, which can be distinguished on the basis of isoelectric point (pI > 5 for alpha; pI 	101
319997	cd16254	EFh_parvalbumin_alpha	EF-hand, calcium binding motif, found in alpha-parvalbumin. Alpha-parvalbumin is cytosolic Ca2+/Mg2+-binding protein expressed mainly in fast-twitch skeletal myofibrils, where it may act as a soluble relaxing factor facilitating the Ca2+-mediated relaxation phase. It is also expressed in rapidly firing neurons, particularly GABA-ergic neurons, and thus may confer protection against Ca2+ toxicity. The major role of alpha-parvalbumin is metal buffering and transport of Ca2+. It binds different metal cations, and exhibits very high affinity for Ca2+ and physiologically significant affinity for Mg2+. Alpha-parvalbumin is characterized by the presence of three consecutive EF-hand motifs (helix-loop-helix) called AB, CD, and EF, but only CD and EF can chelate metal ions, such as Ca2+ and Mg2+. Both metal ion-binding sites in alpha-parvalbumin are high-affinity sites. Additionally, in contrast to beta-parvalbumin, alpha-parvalbumin is less acidic and has an additional residue in the C-terminal helix.	101
319998	cd16255	EFh_parvalbumin_beta	EF-hand, calcium binding motif, found in beta-parvalbumin. Beta-parvalbumin, also termed Oncomodulin-1 (OM), is a small calcium-binding protein that is expressed in hepatomas, as well as in the blastocyst and the cytotrophoblasts of the placenta. It is also found to be expressed in the cochlear outer hair cells of the organ of Corti and frequently expressed in neoplasms. Mammalian beta-parvalbumin is secreted by activated macrophages and neutrophils. It may function as a tissue-specific Ca2+-dependent regulatory protein, and may also serve as a specialized cytosolic Ca2+ buffer. Beta-parvalbumin acts as a potent growth-promoting signal between the innate immune system and neurons in vivo. It has high and specific affinity for its receptor on retinal ganglion cells (RGC) and functions as the principal mediator of optic nerve regeneration. It exerts its effects in a cyclic adenosine monophosphate (cAMP)-dependent manner and can further elevate intracellular cAMP levels. Moreover, beta-parvalbumin is associated with efferent function and outer hair cell electromotility, and can identify different hair cell types in the mammalian inner ear. Beta-parvalbumin is characterized by the presence of three consecutive EF-hand motifs (helix-loop-helix) called AB, CD, and EF, but only CD and EF can chelate metal ions, such as Ca2+ and Mg2+. The EF site displays a high-affinity for Ca2+/Mg2+, and the CD site is a low-affinity Ca2+-specific site. In addition, beta-parvalbumin is distinguished from other parvalbumins by its unusually low isoelectric point (pI = 3.1) and sequence eccentricities (e.g., Y57-L58-D59 instead of F57-I58-E59).	101
293929	cd16256	LumP	lumazine protein. Lumazine protein (LumP) is involved in the bioluminescence of certain marine bacteria. It serves as an optical transponder in bioluminescence emission. The intense fluorescence of LumP is caused by non-covalently bound 6,7-dimethyl-8-ribityllumazine. Though its amino acid sequence is very similar to riboflavin synthase, it functions as a monomer, unlike the riboflavin synthases from eubacteria, yeasts and plants, which act as trimers.	186
293914	cd16257	EFG_III-like	Domain III of Elongation factor G (EF-G) and related proteins. Bacterial Elongation factor G (EF-G) and related proteins play a role in translation and share a similar domain architecture. Elongation factor EFG participates in the elongation phase during protein biosynthesis on the ribosome by stimulating translocation. Its functional cycles depend on GTP binding and its hydrolysis. Domain III is involved in the activation of GTP hydrolysis. This domain III, which is different from domain III in EF-TU and related elongation factors, is found in several translation factors, like bacterial release factors RF3, elongation factor 4, elongation factor 2, GTP-binding protein BipA and tetracycline resistance protein Tet.	71
293915	cd16258	Tet_III	Domain III of Tetracycline resistance protein Tet. Tetracycline resistance proteins, including TetM and TetO, catalyze the release of tetracycline (Tc) from the ribosome in a GTP-dependent manner thereby mediating Tc resistance. Tcs are broad-spectrum antibiotics. Typical Tcs bind to the ribosome and inhibit the elongation phase of protein synthesis, by inhibiting the occupation of site A by aminoacyl-tRNA.	71
293916	cd16259	RF3_III	Domain III of bacterial Release Factor 3 (RF3). The class II RF3 is a member of one of two release factor (RF) classes required for the termination of protein synthesis by the ribosome. RF3 is a GTPase that removes class I RFs (RF1 or RF2) from the ribosome after release of the nascent polypeptide. RF3 in the GDP state binds to the ribosomal class I RF complex, followed by an exchange of GDP for GTP and release of the class I RF. Sequence comparison of class II release factors with elongation factors shows that prokaryotic RF3 is more similar to EF-G whereas eukaryotic eRF3 is more similar to eEF1A, implying that their precise function may differ.	70
293917	cd16260	EF4_III	Domain III of Elongation Factor 4 (EF4). Elongation factor 4 (EF4 or LepA) is a highly conserved guanosine triphosphatase found in bacteria and eukaryotic mitochondria and chloroplasts. EF4 functions as a translation factor, which promotes back-translocation of tRNAs on posttranslocational ribosome complexes and competes with elongation factor G for interaction with pretranslocational ribosomes, inhibiting the elongation phase of protein synthesis.	76
293918	cd16261	EF2_snRNP_III	Domain III of Elongation Factor 2 (EF2). This model represents domain III of Elongation factor 2 (EF2) found in eukaryotes and archaea, and the spliceosomal human 116kD U5 small nuclear ribonucleoprotein (snRNP) protein (U5-116 kD) and its yeast counterpart Snu114p. During the process of peptide synthesis and tRNA site changes, the ribosome is moved along the mRNA a distance equal to one codon with the addition of each amino acid. This translocation step is catalyzed by EF-2_GTP, which is hydrolyzed to provide the required energy. Thus, this action releases the uncharged tRNA from the P site and transfers the newly formed peptidyl-tRNA from the A site to the P site. Yeast Snu114p is essential for cell viability and for splicing in vivo. U5-116 kD binds GTP. Experiments suggest that GTP binding and probably GTP hydrolysis are important for the function of the U5-116 kD/Snu114p.	72
293919	cd16262	EFG_III	Domain III of Elongation Factor G (EFG). This model represents domain III of bacterial Elongation factor G (EF-G), and mitochondrial Elongation factor G1 (mtEFG1) and G2 (mtEFG2), which play an important role during peptide synthesis and tRNA site changes. In bacteria, this translocation step is catalyzed by EF-G_GTP, which is hydrolyzed to provide the required energy. Thus, this action releases the uncharged tRNA from the P site and transfers the newly formed peptidyl-tRNA from the A site to the P site. Eukaryotic cells harbor 2 protein synthesis systems: one localized in the cytoplasm, the other in the mitochondria. Most factors regulating mitochondrial protein synthesis are encoded by nuclear genes, translated in the cytoplasm, and then transported to the mitochondria. The eukaryotic system of elongation factor (EF) components is more complex than that in prokaryotes, with both cytoplasmic and mitochondrial elongation factors and multiple isoforms being expressed in certain species. mtEFG1 and mtEFG2 show significant homology to bacterial EF-Gs. Mutants in yeast mtEFG1 have impaired mitochondrial protein synthesis, respiratory defects, and a tendency to lose mitochondrial DNA. No clear phenotype has been found for mutants of the yeast homolog of mtEFG2, MEF2.	76
293920	cd16263	BipA_III	Domain III of GTP-binding protein BipA (TypA). BipA (also called TypA) is a highly conserved protein with global regulatory properties in Escherichia coli. BipA is phosphorylated on a tyrosine residue under some cellular conditions. Mutants show altered regulation of some pathways. BipA functions as a translation factor that is required specifically for the expression of the transcriptional modulator Fis. BipA binds to ribosomes at a site that coincides with that of EF-G and has a GTPase activity that is sensitive to high GDP:GTP ratios. It is stimulated by 70S ribosomes programmed with mRNA and aminoacylated tRNAs. The growth rate-dependent induction of BipA allows the efficient expression of Fis, thereby modulating a range of downstream processes, including DNA metabolism and type III secretion.	79
293921	cd16264	snRNP_III	Domain III of the spliceosomal 116kD U5 small nuclear ribonucleoprotein (snRNP) component. Domain III of the spliceosomal human 116kD U5 small nuclear ribonucleoprotein (snRNP) protein (U5-116 kD) and its yeast counterpart Snu114p is homologous to domain III of the eukaryotic translational elongation factor EF-2. U5-116 kD is a GTPase component of the spliceosome complex which functions in the processing of precursor mRNAs to produce mature mRNAs.	72
293910	cd16265	Translation_Factor_II	Proteins related to domain II of EF-Tu and related translation factors. Elongation factor Tu consists of three structural domains; this family represents single domain proteins that are related to the second domain of EF-Tu. Domain II of EF-Tu adopts a beta barrel structure and is involved in binding to charged tRNA. Domain II is also found in other proteins such as elongation factor G and translation initiation factor IF-2.	80
293911	cd16266	IF2_aeIF5B_IV	Domain IV of prokaryotic Initiation Factor 2 and archaeal and eukaryotic Initiation Factor 5B. This family represents the domain IV of prokaryotic Initiation Factor 2 (IF2) and its archaeal and eukaryotic homologs IF5B. IF2, the largest initiation factor, is an essential GTP binding protein. In E. coli three natural forms of IF2 exist in the cell, IF2alpha, IF2beta1, and IF2beta2. Disruption of the eIF5B gene (FUN12) in yeast causes a severe slow-growth phenotype, associated with a defect in translation. eIF5B has a function analogous to prokaryotic IF2 in mediating the joining of the 60S ribosomal subunit. The eIF5B consists of three N-terminal domains (I, II, III) connected by a long helix to domain IV. Domain I is a G domain, domain II and IV are beta-barrels and domain III has a novel alpha-beta-alpha sandwich fold. The G domain and the beta-barrel domain II display a similar structure and arrangement to the homologous domains in EF1A, eEF1A and aeIF2gamma.	87
293912	cd16267	HBS1-like_II	Domain II of Hbs1-like proteins. S. cerevisiae Hbs1 is closely related to the eukaryotic class II release factor (eRF3). Hbs1, together with Dom34 (pelota), plays an important role in termination and recycling, but in contrast to eRF3/eRF1, Hbs1, together with Dom34 (pelota), functions on mRNA-bound ribosomes in a codon-independent manner and promotes subunit splitting on completely empty ribosomes.	84
293913	cd16268	EF2_II	Domain II of Elongation Factor 2. This subfamily represents domain II of elongation factor 2 (EF-2) found in eukaryotes and archaea. During the process of peptide synthesis and tRNA site changes, the ribosome is moved along the mRNA a distance equal to one codon with the addition of each amino acid. This translocation step is catalyzed by EF-2_GTP, which is hydrolyzed to provide the required energy. Thus, this action releases the uncharged tRNA from the P site and transfers the newly formed peptidyl-tRNA from the A site to the P site.	96
293879	cd16269	GBP_C	Guanylate-binding protein, C-terminal domain. Guanylate-binding protein (GBP), C-terminal domain. Guanylate-binding proteins (GBPs) are synthesized after activation of the cell by interferons. The biochemical properties of GBPs are clearly different from those of Ras-like and heterotrimeric GTP-binding proteins. They bind guanine nucleotides with low affinity (micromolar range), are stable in their absence, and have a high turnover GTPase. In addition to binding GDP/GTP, they have the unique ability to bind GMP with equal affinity and hydrolyze GTP not only to GDP, but also to GMP. This C-terminal domain has been shown to mediate inhibition of endothelial cell proliferation by inflammatory cytokines.	291
293878	cd16270	Apc5_N	N-terminal domain of the anaphase-promoting complex subunit Apc5 (or Anapc5). The N-terminal domain of Apc5 interacts with subunits Apc4, Apc15, and CDC23. Apc5 is a subunit of the eukaryotic anaphase-promoting complex/cyclosome (APC/C) which is a multi-subunit ubiquitin ligase that mediates the proteolysis of cell cycle proteins in mitosis and G1. Although Apc5 does not contain a classical RNA binding domain, it binds the poly(A) binding protein (PABP), which directly binds the internal ribosome entry site (IRES) of growth factor 2 mRNA. PABP was found to enhance IRES-mediated translation, whereas Apc5 over-expression counteracted this effect. In addition to its association with the APC/C complex, Apc5 binds much heavier complexes and co-sediments with the ribosomal fraction. The N-terminus of Afi1 serves to stabilize the union between Apc4 and Apc5, both of which lie towards the bottom-front of the APC.	143
293830	cd16272	RNaseZ_MBL-fold	Ribonuclease Z; MBL-fold metallo-hydrolase domain. The tRNA maturase RNase Z (also known as tRNase Z or 3' tRNase) catalyzes the endonucleolytic removal of the 3' extension of the majority of tRNA precursors. Two forms of RNase Z exist in eukaryotes, one long (ELAC2) and one short form (ELAC1); the former may have resulted from a duplication of the shorter enzyme. Only the short form exists in bacteria. This subgroup includes the C-terminus of human ELAC2 and Escherichia coli zinc phosphodiesterase (ZiPD, also known as ecoZ, tRNase Z, or RNase BN), a 3' tRNA-processing endonuclease encoded by the elaC gene. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions.	180
293831	cd16273	SNM1A-1C-like_MBL-fold	SNM1A, artemis/SNM1C, yeast Pso2p, and related proteins; MBL-fold metallo-hydrolase domain. Includes human SNM1A (SNM1 homolog A, also known as DNA cross-link repair 1A protein) and Saccharomyces cerevisiae Pso2 protein (PSOralen derivative sensitive 2, also known as SNM1, sensitive to nitrogen mustard 1); both proteins are 5'-exonucleases and function in interstrand cross-link (ICL) repair. Also includes the nuclease artemis (also known as SNM1C, SNM1 homolog C, SNM1-like protein, and DNA cross-link repair 1C protein) which plays a role in V(D)J recombination/DNA repair. Purified artemis protein possesses single-strand-specific 5' to 3' exonuclease activity. Upon complex formation with, and phosphorylation by, DNA-dependent protein kinase, artemis gains endonucleolytic activity on hairpins and 5' and 3' overhangs. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions.	160
293832	cd16274	PQQB-like_MBL-fold	Coenzyme pyrroloquinoline quinone (PQQ) synthesis protein B and related proteins; MBL-fold metallo-hydrolase domain. PqqB is essential for the synthesis of the cofactor pyrroloquinoline quinone (PQQ) in Klebsiella pneumoniae. PqqB is not directly involved in the PQQ biosynthesis but may serve as a carrier for PQQ when PQQ is released from PqqC. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions.	220
293833	cd16275	BaeB-like_MBL-fold	Bacillus amyloliquefaciens BaeB and related proteins; MBL-fold metallo hydrolase domain. Bacillus amyloliquefaciens BaeB may play a role in the synthesis of the antibiotic polyketide bacillaene. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. 	174
293834	cd16276	metallo-hydrolase-like_MBL-fold	uncharacterized subgroup of the MBL-fold_metallo-hydrolase superfamily; MBL-fold metallo hydrolase domain. Members of the MBL-fold metallohydrolase superfamily are mainly hydrolytic enzymes which carry out a variety of biological functions. The class B metal beta-lactamases (MBLs) for which this fold was named perform only a small fraction of the activities included in this superfamily. Activities carried out by superfamily members include class B beta-lactamases, hydroxyacylglutathione hydrolases, AHL (acyl homoserine lactone) lactonases, persulfide dioxygenases, flavodiiron proteins, cleavage and polyadenylation specificity factors such as the Int9 and Int11 subunits of Integrator, Sdsa1-like and AtsA-like arylsulfatases, 5'-exonucleases human SNM1A and yeast Pso2p, ribonuclease J and ribonuclease Z, cyclic nucleotide phosphodiesterases, insecticide hydrolases, and proteins required for natural transformation competence. Classical members of the superfamily are di-, or less commonly mono-, zinc-ion-dependent hydrolases, however the diversity of biological roles is reflected in variations in the active site metallo-chemistry.	188
293835	cd16277	metallo-hydrolase-like_MBL-fold	uncharacterized subgroup of the MBL-fold_metallo-hydrolase superfamily; MBL-fold metallo hydrolase domain. Members of the MBL-fold metallohydrolase superfamily are mainly hydrolytic enzymes which carry out a variety of biological functions. The class B metal beta-lactamases (MBLs) for which this fold was named perform only a small fraction of the activities included in this superfamily. Activities carried out by superfamily members include class B beta-lactamases, hydroxyacylglutathione hydrolases, AHL (acyl homoserine lactone) lactonases, persulfide dioxygenases, flavodiiron proteins, cleavage and polyadenylation specificity factors such as the Int9 and Int11 subunits of Integrator, Sdsa1-like and AtsA-like arylsulfatases, 5'-exonucleases human SNM1A and yeast Pso2p, ribonuclease J and ribonuclease Z, cyclic nucleotide phosphodiesterases, insecticide hydrolases, and proteins required for natural transformation competence. Classical members of the superfamily are di-, or less commonly mono-, zinc-ion-dependent hydrolases, however the diversity of biological roles is reflected in variations in the active site metallo-chemistry.	222
293836	cd16278	metallo-hydrolase-like_MBL-fold	uncharacterized subgroup of the MBL-fold_metallo-hydrolase superfamily; MBL-fold metallo hydrolase domain. Members of the MBL-fold metallohydrolase superfamily are mainly hydrolytic enzymes which carry out a variety of biological functions. The class B metal beta-lactamases (MBLs) for which this fold was named perform only a small fraction of the activities included in this superfamily. Activities carried out by superfamily members include class B beta-lactamases, hydroxyacylglutathione hydrolases, AHL (acyl homoserine lactone) lactonases, persulfide dioxygenases, flavodiiron proteins, cleavage and polyadenylation specificity factors such as the Int9 and Int11 subunits of Integrator, Sdsa1-like and AtsA-like arylsulfatases, 5'-exonucleases human SNM1A and yeast Pso2p, ribonuclease J and ribonuclease Z, cyclic nucleotide phosphodiesterases, insecticide hydrolases, and proteins required for natural transformation competence. Classical members of the superfamily are di-, or less commonly mono-, zinc-ion-dependent hydrolases, however the diversity of biological roles is reflected in variations in the active site metallo-chemistry.	185
293837	cd16279	metallo-hydrolase-like_MBL-fold	uncharacterized subgroup of the MBL-fold_metallo-hydrolase superfamily; MBL-fold metallo hydrolase domain. Members of the MBL-fold metallohydrolase superfamily are mainly hydrolytic enzymes which carry out a variety of biological functions. The class B metal beta-lactamases (MBLs) for which this fold was named perform only a small fraction of the activities included in this superfamily. Activities carried out by superfamily members include class B beta-lactamases, hydroxyacylglutathione hydrolases, AHL (acyl homoserine lactone) lactonases, persulfide dioxygenases, flavodiiron proteins, cleavage and polyadenylation specificity factors such as the Int9 and Int11 subunits of Integrator, Sdsa1-like and AtsA-like arylsulfatases, 5'-exonucleases human SNM1A and yeast Pso2p, ribonuclease J and ribonuclease Z, cyclic nucleotide phosphodiesterases, insecticide hydrolases, and proteins required for natural transformation competence. Classical members of the superfamily are di-, or less commonly mono-, zinc-ion-dependent hydrolases, however the diversity of biological roles is reflected in variations in the active site metallo-chemistry. Some members of this subgroup are named as octanoyltransferase (also known as lipoate-protein ligase B).	193
293838	cd16280	metallo-hydrolase-like_MBL-fold	uncharacterized subgroup of the MBL-fold_metallo-hydrolase superfamily; MBL-fold metallo hydrolase domain. Members of the MBL-fold metallohydrolase superfamily are mainly hydrolytic enzymes which carry out a variety of biological functions. The class B metal beta-lactamases (MBLs) for which this fold was named perform only a small fraction of the activities included in this superfamily. Activities carried out by superfamily members include class B beta-lactamases, hydroxyacylglutathione hydrolases, AHL (acyl homoserine lactone) lactonases, persulfide dioxygenases, flavodiiron proteins, cleavage and polyadenylation specificity factors such as the Int9 and Int11 subunits of Integrator, Sdsa1-like and AtsA-like arylsulfatases, 5'-exonucleases human SNM1A and yeast Pso2p, ribonuclease J and ribonuclease Z, cyclic nucleotide phosphodiesterases, insecticide hydrolases, and proteins required for natural transformation competence. Classical members of the superfamily are di-, or less commonly mono-, zinc-ion-dependent hydrolases, however the diversity of biological roles is reflected in variations in the active site metallo-chemistry.	251
293839	cd16281	metallo-hydrolase-like_MBL-fold	uncharacterized subgroup of the MBL-fold_metallo-hydrolase superfamily; MBL-fold metallo hydrolase domain. Members of the MBL-fold metallohydrolase superfamily are mainly hydrolytic enzymes which carry out a variety of biological functions. The class B metal beta-lactamases (MBLs) for which this fold was named perform only a small fraction of the activities included in this superfamily. Activities carried out by superfamily members include class B beta-lactamases, hydroxyacylglutathione hydrolases, AHL (acyl homoserine lactone) lactonases, persulfide dioxygenases, flavodiiron proteins, cleavage and polyadenylation specificity factors such as the Int9 and Int11 subunits of Integrator, Sdsa1-like and AtsA-like arylsulfatases, 5'-exonucleases human SNM1A and yeast Pso2p, ribonuclease J and ribonuclease Z, cyclic nucleotide phosphodiesterases, insecticide hydrolases, and proteins required for natural transformation competence. Classical members of the superfamily are di-, or less commonly mono-, zinc-ion-dependent hydrolases, however the diversity of biological roles is reflected in variations in the active site metallo-chemistry.	252
293840	cd16282	metallo-hydrolase-like_MBL-fold	uncharacterized subgroup of the MBL-fold_metallo-hydrolase superfamily; MBL-fold metallo hydrolase domain. Members of the MBL-fold metallohydrolase superfamily are mainly hydrolytic enzymes which carry out a variety of biological functions. The class B metal beta-lactamases (MBLs) for which this fold was named perform only a small fraction of the activities included in this superfamily. Activities carried out by superfamily members include class B beta-lactamases, hydroxyacylglutathione hydrolases, AHL (acyl homoserine lactone) lactonases, persulfide dioxygenases, flavodiiron proteins, cleavage and polyadenylation specificity factors such as the Int9 and Int11 subunits of Integrator, Sdsa1-like and AtsA-like arylsulfatases, 5'-exonucleases human SNM1A and yeast Pso2p, ribonuclease J and ribonuclease Z, cyclic nucleotide phosphodiesterases, insecticide hydrolases, and proteins required for natural transformation competence. Classical members of the superfamily are di-, or less commonly mono-, zinc-ion-dependent hydrolases, however the diversity of biological roles is reflected in variations in the active site metallo-chemistry.	209
293841	cd16283	RomA-like_MBL-fold	Enterobacter cloacae RomA and related proteins; MBL-fold metallo hydrolase domain. Derepression of the romA-ramA locus results in a multidrug-resistance phenotype. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. The class B metal beta-lactamases (MBLs) from which this fold was named are only a small fraction of the activities which are included in this superfamily. Activities carried out by superfamily members include class B beta-lactamases, hydroxyacylglutathione hydrolases, AHL (acyl homoserine lactone) lactonases, persulfide dioxygenases, flavodiiron proteins, cleavage and polyadenylation specificity factors such as the Int9 and Int11 subunits of Integrator, Sdsa1-like and AtsA-like arylsulfatases, 5'-exonucleases human SNM1A and yeast Pso2p, ribonuclease J and ribonuclease Z, cyclic nucleotide phosphodiesterases, insecticide hydrolases, and proteins required for natural transformation competence. Classical members of the superfamily are di-, or less commonly mono-, zinc-ion-dependent hydrolases, however the diversity of biological roles is reflected in variations in the active site metallo-chemistry.	181
293842	cd16284	UlaG-like_MBL-fold	UlaG, a putative L-ascorbate-6-P lactonase, and related proteins; MBL-fold metallo hydrolase domain. UlaG is essential for L-ascorbate utilization under anaerobic conditions; it is a putative L-ascorbate-6-P lactonase thought to catalyze the hydrolysis of L-ascorbate-6-phosphate to 3-keto-L-gulonate-6-phosphate. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions.	178
293843	cd16285	MBL-B1	metallo-beta-lactamases, subclass B1; MBL-fold metallo-hydrolase domain. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. Subclass B1 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile. Includes chromosomally-encoded MBLs such as Bacillus cereus BcII, Bacteroides fragilis CcrA, and Elizabethkingia meningoseptica (Chryseobacterium meningosepticum) BlaB and acquired MBLs including IMP-1, VIM-1, VIM-2, GIM-1, NDM-1 and FIM-1.	210
293844	cd16286	SPM-1-like_MBL-B1-B2-like	Pseudomonas aeruginosa SPM-1 and related metallo-beta-lactamases, subclasses B1 and B2 like; MBL-fold metallo-hydrolase domain. SPM-1 was first identified in a Pseudomonas aeruginosa strain from a paediatric leukaemia patient and is a major clinical problem. MBLs (class B of the Ambler beta-lactamase classification) have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. SPM-1 appears to be a hybrid B1/B2 MBL.	236
293845	cd16287	CphS_ImiS-like_MBL-B2	metallo-beta-lactamases, subclass B2; MBL-fold metallo-hydrolase domain. Includes Aeromonas hydrophila CphA, Aeromonas veronii ImiS, and Serratia fonticola Sfh-I. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. B2 MBLs have a narrow substrate profile relative to subclass B1 MBLs that includes carbapenems, and they are active with one zinc ion bound in the Asp-Cys-His site; binding of a second zinc ion in the modified 3H site (Asn-His-His) inhibits catalysis.	226
293846	cd16288	BJP-1_FEZ-1-like_MBL-B3	BJP-1, FEZ-1, GOB-1, Mbl1b and related metallo-beta-lactamases, subclass B3; MBL-fold metallo-hydrolase domain. This subgroup of B3 subclass MBLs includes Bradyrhizobium diazoefficiens BJP-1, Fluoribacter gormanii FEZ-1, Elizabethkingia meningoseptica (Chryseobacterium meningosepticum) GOB-1, Caulobacter crescentus Mbl1b. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. Subclass B3 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile. These B3 enzymes have a modified Zn2/DCH site (Asp-His-His).	254
293847	cd16289	L1_POM-1-like_MBL-B3	Stenotrophomonas maltophilia L1, Pseudomonas otitidis POM-1 and related metallo-beta-lactamases, subclass B3; MBL-fold metallo-hydrolase domain. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. This subgroup of L1- and Pom-1-like MBLs belongs to the B3 subclass. Subclass B3 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile. These B3 enzymes have a modified Zn2/DCH site (Asp-His-His).	239
293848	cd16290	AIM-1_SMB-1-like_MBL-B3	AIM-1, SMB-1, EVM-1, THIN-B and related metallo-beta-lactamases, subclass B3; MBL-fold metallo-hydrolase domain. This subgroup of B3 subclass MBLs includes Pseudomonas aeruginosa AIM-1, Serratia marcescens SMB-1, Erythrobacter vulgaris EVM-1, and Janthinobacterium lividum THIN-B. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. This subgroup of AIM-1-, SMB-1-, EVM-1-, THIN-B-like MBLs belongs to the B3 subclass. Subclass B3 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile. These B3 enzymes have a modified Zn2/DCH site (Asp-His-His).	256
293849	cd16291	INTS11-like_MBL-fold	Integrator complex subunit 11, and related proteins; MBL-fold metallo-hydrolase domain. Integrator is a metazoan-specific multisubunit, multifunctional protein complex composed of 14 subunits named Int1-Int14 (Integrator subunits). This subgroup includes Int11 (also known as cleavage and polyadenylation-specific factor (CPSF) 3-like protein, and protein related to CPSF subunits of 68 kDa (RC-68)). Integrator complex has been implicated in a variety of Pol II transcription events including 3' end processing of snRNA, transcription initiation, promoter-proximal pausing, termination of protein-coding transcripts, and in HVS pre-miRNA 3' end processing. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions.	199
293850	cd16292	CPSF3-like_MBL-fold	cleavage and polyadenylation specificity factor (CPSF) subunit 3 and related proteins; MBL-fold metallo-hydrolase domain. CPSF3 (also known as cleavage and polyadenylation specificity factor 73 kDa subunit/CPSF-73) functions as a 3' endonuclease in 3' end processing of pre-mRNAs during cleavage/polyadenylation, and in 3' end processing of metazoan histone pre-mRNAs. This subgroup also contains the yeast homolog of CPSF-73, Ysh1/Brr5 which has roles in mRNA and snoRNA synthesis. In addition to this MBL-fold metallo-hydrolase domain, members of this subgroup contain a beta-CASP (named for metallo-beta-lactamase, CPSF, Artemis, Snm1, Pso2) domain, and a RMMBL domain (RNA-metabolizing metallo-beta-lactamase). Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions.	194
293851	cd16293	CPSF2-like_MBL-fold	cleavage and polyadenylation specificity factor (CPSF) subunit 2 and related proteins; MBL-fold metallo-hydrolase domain. CPSF2, also known as cleavage and polyadenylation specificity factor 100 kDa subunit (CPSF-100), is a component of the CPSF complex, which plays a role in 3' end processing of pre-mRNAs during cleavage/polyadenylation, and during processing of metazoan histone pre-mRNAs. This subgroup includes Ydh1p, the yeast homolog of CPSF2. In addition to this MBL-fold metallo-hydrolase domain, members of this subgroup contain a beta-CASP (named for metallo-beta-lactamase, CPSF, Artemis, Snm1, Pso2) domain. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. 	199
293852	cd16294	Int9-like_MBL-fold	integrator subunit 9, and related proteins; MBL-fold metallo-hydrolase domain. Integrator is a metazoan-specific multisubunit, multifunctional protein complex composed of 14 subunits named Int1-Int14 (Integrator subunits). This subgroup includes Int9, also known as protein related to CPSF subunits of 74 kDa (RC-74). Integrator complex has been implicated in a variety of Pol II transcription events including 3' end processing of snRNA, transcription initiation, promoter-proximal pausing, termination of protein-coding transcripts, and in HVS pre-miRNA 3' end processing. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. 	166
293853	cd16295	TTHA0252-CPSF-like_MBL-fold	Thermus thermophilus TTHA0252 and related cleavage and polyadenylation specificity factors; MBL-fold metallo-hydrolase domain. Includes the archaeal cleavage and polyadenylation specificity factors (CPSFs) such as Methanothermobacter thermautotrophicus MTH1203, and Pyrococcus horikoshii PH1404. In addition to the MBL-fold metallo-hydrolase nuclease and the beta-CASP domains, members of this subgroup contain two contiguous KH domains. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. 	197
293854	cd16296	RNaseZ_ELAC2-N-term-like_MBL-fold	Ribonuclease Z, N-terminus of human ELAC2 and related proteins; MBL-fold metallo-hydrolase domain. The tRNA maturase RNase Z (also known as tRNase Z or 3' tRNase) catalyzes the endonucleolytic removal of the 3' extension of the majority of tRNA precursors. Two forms of RNase Z exist in eukaryotes, one long (ELAC2) and one short (ELAC1); the long form may have resulted from a duplication of the shorter enzyme. This eukaryotic subgroup includes the N-terminus of human ELAC2 and related proteins. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions.	175
293855	cd16297	artemis-SNM1C-like_MBL-fold	artemis-SNM1C and related proteins; MBL-fold metallo-hydrolase domain. Includes the nuclease artemis (also known as SNM1C, SNM1 homolog C, SNM1-like protein and DNA cross-link repair 1C protein) which plays a role in V(D)J recombination/DNA repair. Purified artemis protein possesses single-strand-specific 5' to 3' exonuclease activity. Upon complex formation with, and phosphorylation by, DNA-dependent protein kinase, artemis gains endonucleolytic activity on hairpins and 5' and 3' overhangs. Inactivation of Artemis causes severe combined immunodeficiency (SCID). Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions.	171
293856	cd16298	SNM1A-like_MBL-fold	5'-exonucleases human SNM1A and related proteins; MBL-fold metallo-hydrolase domain. Includes human SNM1A (SNM1 homolog A, also known as DNA cross-link repair 1A protein) which is a 5'-exonuclease and functions in interstrand cross-links (ICL) repair. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions.	157
293857	cd16299	IND_BlaB-like_MBL-B1	IND1, IND2, BlaB-1 and related metallo-beta-lactamases, subclass B1; MBL-fold metallo-hydrolase domain. Includes the chromosome-encoded metallo-beta-lactamases Chryseobacterium indologenes IND-1, IND-2, and IND-7, Elizabethkingia meningoseptica (Chryseobacterium meningosepticum) BlaB, Chryseobacterium gleum CGB-1, and Empedobacter brevis EBR-1. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. This subgroup of MBLs belongs to the B1 subclass. B1 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile.	212
293858	cd16300	NDM_FIM-like_MBL-B1	NDM-1, FIM-1 and related metallo-beta-lactamases, subclass B1; MBL-fold metallo-hydrolase domain. Includes the ISCR-mediated MBLs NDM-1 (New Delhi metallo-beta-lactamase) and FIM-1 (Florence imipenemase). MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. This subgroup of MBLs belongs to the B1 subclass. B1 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile.	214
293859	cd16301	IMP_DIM-like_MBL-B1	IMP-1, DIM-1, GIM-1, SIM-1, TMB-1 and related metallo-beta-lactamases, subclass B1; MBL-fold metallo-hydrolase domain. Includes the acquired MBLs IMP-1 (a beta-lactamase that is active on imipenem), DIM-1 (Dutch imipenemase), GIM-1 (German imipenemase), KHM-1 (Kyorin Health Science MBL 1), SIM-1 (Seoul imipenemase), and TMB-1 (Tripoli metallo-beta-lactamase). IMP-1, DIM-1, GIM-1, SIM-1, and TMB-1 are Class 1 integron-mediated MBLs; KHM-1 is plasmid-mediated. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. This subgroup of acquired MBLs belongs to the B1 subclass. B1 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile.	215
293860	cd16302	CcrA-like_MBL-B1	Bacteroides fragilis CcrA and related metallo-beta-lactamases, subclass B1; MBL-fold metallo-hydrolase domain. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. This subgroup of MBLs belongs to the B1 subclass. B1 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile.	212
293861	cd16303	VIM_type_MBL-B1	VIM-type metallo-beta-lactamases, subclass B1; MBL-fold metallo-hydrolase domain. VIM (Verona integron-encoded metallo-beta-lactamase)-type MBLs are integron-associated and are widely distributed acquired MBLs. MBLs (class B of the Ambler beta-lactamase classification) have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. This subgroup of VIM-type MBLs belongs to the B1 subclass. B1 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile.	218
293862	cd16304	BcII-like_MBL-B1	Bacillus cereus Beta-lactamase 2 and related metallo-beta-lactamases, subclass B1; MBL-fold metallo-hydrolase domain. Bacillus cereus Beta-lactamase 2, also called BcII. MBLs (class B of the Ambler beta-lactamase classification) have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. BcII is a chromosome-encoded B1 MBL. B1 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile.	212
293863	cd16305	Sfh-1-like_MBL-B2	Serratia fonticola Sfh-I and related metallo-beta-lactamases, subclass B2; MBL-fold metallo-hydrolase domain. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. B2 MBLs have a narrow substrate profile relative to subclass B1 MBLs that includes carbapenems, and they are active with one zinc ion bound in the Asp-Cys-His site; binding of a second zinc ion in the modified 3H site (Asn-His-His) inhibits catalysis.	226
293864	cd16306	CphA_ImiS-like_MBL-B2	Aeromonas hydrophila CphA, Aeromonas veronii ImiS, and related metallo-beta-lactamases, subclass B2; MBL-fold metallo-hydrolase domain. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. B2 MBLs have a narrow substrate profile relative to subclass B1 MBLs that includes carbapenems, and they are active with one zinc ion bound in the Asp-Cys-His site; binding of a second zinc ion in the modified 3H site (Asn-His-His) inhibits catalysis.	222
293865	cd16307	FEZ-1-like_MBL-B3	Fluoribacter gormanii FEZ-1 and related metallo-beta-lactamases, subclass B3; MBL-fold metallo-hydrolase domain. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. This subgroup of FEZ-1-like MBLs belongs to the B3 subclass. Subclass B3 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile. These B3 enzymes have a modified Zn2/DCH site (Asp-His-His).	255
293866	cd16308	GOB1-like_MBL-B3	Elizabethkingia meningoseptica GOB-1 and related metallo-beta-lactamases, subclass B3; MBL-fold metallo-hydrolase domain. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. This subgroup of GOB-1-like MBLs belongs to the B3 subclass. Subclass B3 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile. These B3 enzymes have a modified Zn2/DCH site (Asp-His-His).	254
293867	cd16309	BJP-1-like_MBL-B3	Bradyrhizobium diazoefficiens BJP-1 and related metallo-beta-lactamases, subclass B3; MBL-fold metallo-hydrolase domain. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. This subgroup of BJP-1-like MBLs belongs to the B3 subclass. Subclass B3 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile. These B3 enzymes have a modified Zn2/DCH site (Asp-His-His).	252
293868	cd16310	Mbl1b-like_MBL-B3	Caulobacter crescentus Mbl1b and related metallo-beta-lactamases, subclass B3; MBL-fold metallo-hydrolase domain. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. This subgroup of Mbl1b-like MBLs belongs to the B3 subclass. Subclass B3 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile. These B3 enzymes have a modified Zn2/DCH site (Asp-His-His).	252
293869	cd16311	THIN-B2-like_MBL-B3	Janthinobacterium lividum THIN-B2 and related metallo-beta-lactamases, subclass B3; MBL-fold metallo-hydrolase domain. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. This subgroup of THIN-B2-like MBLs belongs to the B3 subclass. Subclass B3 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile. These B3 enzymes have a modified Zn2/DCH site (Asp-His-His).	257
293870	cd16312	THIN-B-like_MBL-B3	Janthinobacterium lividum THIN-B and related metallo-beta-lactamases, subclass B3; MBL-fold metallo-hydrolase domain. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. This subgroup of THIN-B-like MBLs belongs to the B3 subclass. Subclass B3 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile. These B3 enzymes have a modified Zn2/DCH site (Asp-His-His).	258
293871	cd16313	SMB-1-like_MBL-B3	SMB-1, THIN-B and related metallo-beta-lactamases, subclass B3; MBL-fold metallo-hydrolase domain. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. This subgroup of SMB-1- and THIN-B-like MBLs belongs to the B3 subclass. Subclass B3 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile. These B3 enzymes have a modified Zn2/DCH site (Asp-His-His).	254
293872	cd16314	AIM-1-like_MBL-B3	Pseudomonas aeruginosa AIM-1 and related metallo-beta-lactamases, subclass B3; MBL-fold metallo-hydrolase domain. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. This subgroup of AIM-1-like MBLs belongs to the B3 subclass. Subclass B3 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile. These B3 enzymes have a modified Zn2/DCH site (Asp-His-His).	255
293873	cd16315	EVM-1-like_MBL-B3	Erythrobacter vulgaris EVM-1 and related metallo-beta-lactamases, subclass B3; MBL-fold metallo-hydrolase domain. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. This subgroup of EVM-1-like MBLs belongs to the B3 subclass. Subclass B3 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile. These B3 enzymes have a modified Zn2/DCH site (Asp-His-His).	248
293874	cd16316	BlaB-like_MBL-B1	Elizabethkingia meningoseptica (Chryseobacterium meningosepticum) BlaB and related metallo-beta-lactamases, subclass B1; MBL-fold metallo-hydrolase domain. Includes the chromosome-encoded MBL Elizabethkingia meningoseptica (Chryseobacterium meningosepticum) BlaB and related MBLs. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. This subgroup of MBLs belongs to the B1 subclass. B1 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile.	214
293875	cd16317	IND_MBL-B1	Chryseobacterium indologenes IND-1, IND-2, IND-7 and related metallo-beta-lactamases, subclass B1; MBL-fold metallo-hydrolase domain. Includes the chromosome-encoded MBLs Chryseobacterium indologenes IND-1, IND-2, and IND-7 and related MBLs. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. This subgroup of MBLs belongs to the B1 subclass. B1 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile.	215
293876	cd16318	MUS_TUS_MBL-B1	Myroides odoratimimus MUS-1, MUS-2, TUS-1 and related metallo-beta-lactamases, subclass B1; MBL-fold metallo-hydrolase domain. Includes the chromosome-encoded MBLs Myroides odoratimimus MUS-1 and related MBLs. MBLs (class B of the Ambler beta-lactamase classification) are a diverse group of metallo-enzymes that are capable of catalyzing the hydrolysis of a wide range of beta-lactam antibiotics. MBLs have been divided into three subclasses B1, B2 and B3, based on sequence/structural relationships and substrates, with the B1 and B2 MBLs being most closely related to each other. This subgroup of MBLs belongs to the B1 subclass. B1 enzymes are most active with two zinc ions bound in the active site, and have a broad-spectrum substrate profile.	214
293782	cd16319	MraZ	protein domain of unknown function (UPF0040) includes MraZ. This family contains proteins of unknown function (UPF0040), implicated in a cellular function of bacterial cell division. It includes protein MraZ which is present in almost all bacteria and appears to be essential for survival. It is found in gene clusters associated with the cellular function of cell division and cell wall biosynthesis. Members of this family contain two tandem copies of the domain;  the crystal structure of a member of this family (MPN314) reveals that the two subdomains are related by a pseudo two-fold axis, with each subdomain containing a highly conserved DXXXR sequence motif in close proximity to each other, suggested to form the functional site.	53
293783	cd16320	MraZ_N	N-terminal subdomain of transcriptional regulator MraZ. This family contains the N-terminal domain of proteins of unknown function (UPF0040), implicated in a cellular function of bacterial cell division. It includes protein MraZ which is present in almost all bacteria and appears to be essential for survival. It is found in gene clusters associated with the cellular function of cell division and cell wall biosynthesis, including mraW, ftsI, murE, murF, ftsW and murG. Members of this family contain two tandem copies of the domain; the crystal structure of a member of this family (MPN314) reveals that the two subdomains are related by a pseudo two-fold axis, with each subdomain containing a highly conserved DXXXR sequence motif in close proximity to each other, suggested to form the functional site.	60
293784	cd16321	MraZ_C	C-terminal subdomain of transcriptional regulator MraZ. This family contains the C-terminal domain of proteins of unknown function (UPF0040), implicated in a cellular function of bacterial cell division. It includes protein MraZ which is present in almost all bacteria and appears to be essential for survival. It is found in gene clusters associated with the cellular function of cell division and cell wall biosynthesis, including mraW, ftsI, murE, murF, ftsW and murG. Members of this family contain two tandem copies of the domain; the crystal structure of a member of this family (MPN314) reveals that the two subdomains are related by a pseudo two-fold axis, with each subdomain containing a highly conserved DXXXR sequence motif in close proximity to each other, suggested to form the functional site.	62
293877	cd16322	TTHA1623-like_MBL-fold	uncharacterized Thermus thermophilus TTHA1623 and related proteins; MBL-fold metallo-hydrolase domain. Includes the MBL-fold metallo-hydrolase domain of uncharacterized Thermus thermophilus TTHA1623 and related proteins. Members of this subgroup belong to the MBL-fold metallo-hydrolase superfamily which is comprised mainly of hydrolytic enzymes which carry out a variety of biological functions. This family includes homologs present in a wide range of bacteria and archaea and some eukaryota. Members of the MBL-fold metallo-hydrolase superfamily exhibit a variety of active site metallo-chemistry; TTHA1623 exhibits a uniquely shaped putative substrate-binding pocket with a glyoxalase II-type metal-coordination mode.	204
319993	cd16323	Syd	Syd, a SecY-interacting protein. This family contains the Syd protein that has been implicated in the Sec-dependent transport of polypeptides across the inner membrane in bacteria. Syd has been shown to bind the SecY subunit of membrane-embedded SecYEG heterotrimer (also known as core translocon or SecY complex) which is a conserved protein-conducting channel essential for the biogenesis of most of the secretory and integral membrane proteins. The SecY-binding site of Syd is a conserved concave and electronegative groove that forms interactions with the electropositive loops of the SecY subunit. Syd is also known to verify the proper assembly of the SecY complex in the membrane by interfering with protein translocation only when the channel displays abnormal SecY-SecE associations. Operon analysis has shown that Syd protein may function as immunity protein in bacterial toxin systems.	173
319982	cd16324	LolA_fold-like	family containing periplasmic molecular chaperone LolA, the outer membrane lipoprotein receptor LolB and the periplasmic protein RseB. This family contains the periplasmic molecular chaperone LolA, the outer membrane lipoprotein receptor LolB and the N-terminal domain of periplasmic protein RseB, all of which have similar unclosed beta-barrel structures that resemble a baseball glove-like scaffold consisting of an 11-stranded antiparallel sheet. There are five Lol proteins (LolA, LolB, LolC, LolD, and LolE) involved in the sorting and membrane localization of lipoprotein and are highly conserved in Gram-negative bacteria. LolA accepts outer membrane (OM)-specific lipoproteins that are released from the inner membrane by the LolCDE complex and transfers them to the OM receptor LolB. It is proposed that the LolA/LolB complex forms a tunnel-like structure, where the hydrophobic insides of LolA and LolB are connected, which enables lipoproteins to transfer from LolA to LolB. RseB exerts a crucial role in modulating the stability of RseA, the transmembrane anti-sigma-factor that is degraded during sigma-E-dependent transcription caused by bacterial envelope stress.  Its structural similarity to LolA and LolB suggests that RseB may act as a sensor of periplasmic stress with a dual functionality, detecting mislocalized lipoproteins as well as propagating the signal to induce the sigma-E-response.	162
319983	cd16325	LolA	LolA, a periplasmic chaperone. This family contains periplasmic molecular chaperone LolA which binds to outer-membrane specific lipoproteins and transports them from inner membrane to outer membrane (OM) through LolB, a lipoprotein anchored to outer membranes. There are five Lol proteins (LolA, LolB, LolC, LolD, and LolE) involved in the sorting and membrane localization of lipoprotein and are highly conserved in Gram-negative bacteria. LolA accepts OM-specific lipoproteins that are released from the inner membrane by the LolCDE complex and transfers them to the OM receptor LolB. Studies have shown that hydrophobic surface patches large enough to accommodate acyl chains of the OM lipoproteins and the structural flexibility of LolA are important factors for its role as a periplasmic chaperone.	166
319984	cd16326	LolB	LolB, an outer membrane lipoprotein receptor. This family contains the outer membrane lipoprotein receptor, LolB, which catalyzes the last step of lipoprotein transfer from the inner to the outer membrane. There are five Lol proteins (LolA, LolB, LolC, LolD, and LolE) involved in the sorting and membrane localization of lipoprotein and are highly conserved in Gram-negative bacteria. LolA transports lipoproteins through the periplasm to LolB, which then localizes them to outer membranes; the protruding loop of LolB has been shown to be essential for the localization of lipoproteins in the anchoring of bacterial triacylated proteins to the outer membrane.	163
319985	cd16327	RseB	RseB, a sensor in periplasmic stress. This family contains the periplasmic protein RseB (also known as MucB or MucB/RseB) which exerts a crucial role in modulating the stability of RseA, the transmembrane anti-sigma-factor that is degraded during sigma-E-dependent transcription caused by bacterial envelope stress. RseB binds to RseA and inhibits its sequential cleavage, thereby functioning as a negative modulator of this response. The protein is composed of two domains, the larger N-terminal domain resembling an unclosed beta-barrel that is remarkably similar structurally to LolA and LolB, proteins capable of binding the lipid anchor of lipoproteins, suggesting that RseB acts as a sensor of periplasmic stress with a dual functionality, detecting mislocalized lipoproteins as well as propagating the signal to induce the sigma-E-response.	166
319992	cd16328	RseA_N	N-terminal domain of RseA. This family contains the cytoplasmic (N-terminal) domain of RseA, the transmembrane anti-sigma-E factor. RseA is degraded during sigma-E-dependent transcription caused by bacterial envelope stress such as heat shock. It is an inner membrane protein with an N-terminal cytoplasmic domain that binds sigma-E and blocks its transcriptional activity, and a C-terminal periplasmic domain that binds RseB, an auxiliary negative regulator. Under inducing conditions, RseA is rapidly degraded and sigma-E is released into the cytoplasm, where it can bind core RNAP and induce its regulon. It has been shown that just the N-terminal domain is sufficient to bind and inhibit sigma-E. The C-terminal domain may interact with other proteins that signal periplasmic stress.	65
319986	cd16329	LolA_like	proteins similar to periplasmic molecular chaperone LolA, the outer membrane lipoprotein receptor LolB and the periplasmic protein RseB. This family contains uncharacterized proteins similar to the periplasmic molecular chaperone LolA, the outer membrane lipoprotein receptor LolB and the periplasmic protein RseB, all of which have similar unclosed beta-barrel structures that resemble a baseball glove-like scaffold consisting of an 11-stranded antiparallel sheet. There are five Lol proteins (LolA, LolB, LolC, LolD, and LolE) involved in the sorting and membrane localization of lipoprotein and are highly conserved in Gram-negative bacteria. LolA accepts outer membrane (OM)-specific lipoproteins that are released from the inner membrane by the LolCDE complex and transfers them to the OM receptor LolB. It is proposed that the LolA/LolB complex forms a tunnel-like structure, where the hydrophobic insides of LolA and LolB are connected, which enables lipoproteins to transfer from LolA to LolB. RseB exerts a crucial role in modulating the stability of RseA, the transmembrane anti-sigma-factor that is degraded during sigma-E-dependent transcription caused by bacterial envelope stress.  Its structural similarity to LolA and LolB suggests that RseB may act as a sensor of periplasmic stress with a dual functionality, detecting mislocalized lipoproteins as well as propagating the signal to induce the sigma-E-response.	225
319987	cd16330	LolA_VioE	Proteins similar to violacein biosynthetic enzyme VioE that shares fold with periplasmic molecular chaperone LolA. This family includes the violacein biosynthetic enzyme VioE which shares a core fold with lipoprotein transporter proteins that include lipoprotein transporter proteins LolA and LolB. VioE is an enzyme with no characterized homologs that plays a key role in the biosynthesis of violacein, a naturally occurring bisindole product with various biological activities, including antitumor activity as well as antibacterial and cytotoxic properties. In Chromobacterium violaceum, VioE catalyzes the third step in violacein biosynthesis from a pair of Trp residues (i.e. mediates a 1,2 shift of an indole ring and oxidative chemistry) to generate prodeoxyviolacein, a precursor to violacein. Structural and mutagenesis studies suggest that VioE acts as a catalytic chaperone, using this fold associated with lipoprotein transporters to catalyze the production of its prodeoxyviolacein product.	175
319991	cd16331	YjgA-like	uncharacterized proteins similar to Escherichia coli YjgA. Family of conserved uncharacterized proteins similar to Escherichia coli YjgA, which has been identified as comigrating with the 50S ribosome.	153
381747	cd16332	Prp-like	ribosomal-processing cysteine protease Prp and similar proteins. This model represents a family of cysteine proteases that include members found to cleave the N-terminus extension of ribosomal subunit L27 in eubacteria. Proteins in this family are distinguished by a pair of invariant histidine and cysteine residues with conserved spacing that form the classic catalytic dyad of a cysteine protease. Dependence of Staphylococcus aureus on L27 cleavage by Prp makes the enzyme a target for antibiotic development.	102
319989	cd16333	RELM	resistin-like molecule (RELM) hormone family. RELMs, secreted proteins with roles including insulin resistance and the activation of inflammatory processes, are also known as found in inflammatory zone (FIZZ), and include four members in mouse (RELM-alpha/FIZZ1/HIMF, RELM-beta/FIZZ2, Resistin/FIZZ3, and RELM-gamma/FIZZ4) and two members in human (resistin and RELM-beta). Little is yet known about the differences and similarities in function of the different isoforms. RELMs are potentially implicated in a wide range of physiological and pathological processes including obesity-associated diabetes, cardiovascular system function, cancer development and metastasis. There are significant differences between human and rodent RELMs with respect to gene and protein structure, differential gene regulation, different tissue distribution profiles, and insulin resistance induction. Resistin appears to convey insulin resistance in rodents, and to instigate inflammatory processes in humans.  In the pathophysiology of obesity-associated diabetes, mouse resistin is secreted by adipocytes and increases hepatic gluconeogenesis, thereby promoting insulin resistance, human resistin is secreted by macrophages and may play a role through inflammatory contributions. Elevated levels of human resistin have been reported in various cancers including colorectal, endometrial, and postmenopausal breast cancers, and may initiate the production of further inflammatory cytokines, to promote tumor cell progression; contrary to this, in vitro overexpression of human RELM-beta abolishes invasion, metastasis and angiogenesis of gastric cancer cells. Resistin circulates as hexamers and trimers; structural similarity has been noted between the resistin homotrimer and the proprotein convertase subtilisin/kexin type 9, C-terminal cysteine-rich domain.	86
319988	cd16334	LppX-like	family includes lipoproteins LppX, LprA, LprF and LprG from Mycobacterium tuberculosis. This family includes the homologous lipoproteins LppX, LprA, LprF and LprG from Mycobacterium tuberculosis (Mtb), all of which share a core fold with lipoprotein transporter proteins LolA and LolB. Mtb contains components such as glycolipids, lipoglycans and lipoproteins that play critical roles in regulating host responses and promoting survival of the pathogen. Mtb LprA is a lipoprotein agonist of Toll-like receptor 2 (TLR2) that regulates innate immunity and APC function. LprF, which is also found in Mycobacterium bovis but not in the nonpathogenic Mycobacterium smegmatis, has a central hydrophobic cavity that binds a diacylated glycolipid that it transfers from the plasma membrane to the cell wall, which might be related to the pathogenesis of the bacteria. Similarly, LprG functions as a carrier of glycolipids and lipoglycans, such as lipoarabinomannan (LAM), during their trafficking and delivery to the mycobacterial cell wall, contributing to virulence; LAM inhibits fusion of phagosomes with lysosomes as a means for mycobacteria to evade host defense. In addition, LprG has potent TLR2 agonist activity that modulates antigen processing of dendritic cells and macrophages. LppX is required for the translocation of the key virulence factors, the phthiocerol dimycocerosates (PDIMs), to the surface of Mtb.	196
319981	cd16335	MukF_N	bacterial condensin complex subunit MukF, N-terminal domain. MukF is part of the MukBEF condensin complex that is mainly found in proteobacteria and is involved in chromosome organization and condensation. The complex is believed to serve as a part of the chromosome scaffold rather than a bulk DNA packing protein. MukE and MukF form a stable complex with each other and dynamically associate with MukB, a member of the SMC protein family. MukEF does not bind DNA on its own but modulates MukB-DNA activity. The stoichiometry of the MukEF complex is MukE4F2.	315
319980	cd16336	MukE	bacterial condensin complex subunit MukE. MukE is part of the MukBEF condensin complex that is mainly found in proteobacteria and is involved in chromosome organization and condensation. The complex is believed to serve as a part of the chromosome scaffold rather than a bulk DNA packing protein. MukE and MukF form a stable complex with each other and dynamically associate with MukB, a member of the SMC protein family. MukEF does not bind DNA on its own but modulates MukB-DNA activity. The stoichiometry of the MukEF complex is MukE4F2.	204
319979	cd16337	MukF_C	bacterial condensin complex subunit MukF, C-terminal domain. MukF is part of the MukBEF condensin complex that is mainly found in proteobacteria and is involved in chromosome organization and condensation. The complex is believed to serve as a part of the chromosome scaffold rather than a bulk DNA packing protein. MukE and MukF form a stable complex with each other and dynamically associate with MukB, a member of the SMC protein family. MukEF does not bind DNA on its own but modulates MukB-DNA activity. The stoichiometry of the MukEF complex is MukE4F2.	97
319976	cd16338	CpcT	T-type phycobiliprotein (PBP) lyase. This family contains the T-type phycobiliprotein (PBP) lyase (includes CpcT/CpeT, also known as CpcT bilin lyase). PBP lyases are employed by cyanobacteria, red algae, cryptophytes and glaucophytes for light-harvesting. Pigmentation of light-harvesting phycobiliproteins of cyanobacteria and cryptophytes requires covalent attachment of open-chain tetrapyrrole chromophores, the phycobilins, to the apoproteins. PBP lyases mediate this covalent attachment of phycobilin chromophores to apo-PBPs and also ensure the correct binding of the chromophore with regard to the specific attachment site and stereospecificity. The T-type lyase is distantly related to CpcS and is responsible for covalent attachment of phycocyanobilin (PCB) or phycoerythrobilin to a specific cysteine residue in the beta-subunit of phycocyanin (CpcB) and the beta-subunit of phycoerythrocyanin (PecB), and with a different stereochemistry than CpcS. In CpcT (All5339) from Nostoc (Anabaena) sp. PCC7120, sequential binding studies indicate that beta-subunit chromophorylation with PCB at a specific C-terminal cysteine residue in cyanobacterial phycocyanin and phycoerythrocyanin is hindered by a preceding chromophorylation at a specific N-terminal cysteine residue by CpcS. T-type PBP lyases adopt a beta-barrel structure with a modified lipocalin fold, similar to S-type PBP lyases.	180
319977	cd16339	CpcS	S-type phycobiliprotein (PBP) lyase. This family contains the S-type phycobiliprotein (PBP) lyase (denoted CpcS/CpcU or CpeS/CpeU). PBP lyases are employed by cyanobacteria, red algae, cryptophytes and glaucophytes for light-harvesting. Pigmentation of light-harvesting phycobiliproteins of cyanobacteria and cryptophytes requires covalent attachment of open-chain tetrapyrrole chromophores, the phycobilins, to the apoproteins. PBP lyases mediate this covalent attachment of phycobilin chromophores to apo-PBPs and also ensure the correct binding of the chromophore with regard to the specific attachment site and stereospecificity. The S-type lyase is distantly related to CpcT and similarly adopts a beta-barrel structure with a modified lipocalin fold. Many members of the CpcS/CpcU family ligate phycocyanobilin (PCB) to a specific cysteine residue in the beta-subunits of phycocyanin (CpcB) or phycoerythrocyanin (PecB) and to a related cysteine residue in the alpha and beta subunits of allophycocyanin (AP); they are typically given the designation of "CpcS" or "CpcU". Other members which attach phycoerythrobilin (PEB) to the beta-subunit of phycoerythrin (PE) are given the designation "CpeS" or "CpeU". In Guillardia theta, a Cryptophyte, which has adopted phycoerythrobilin (PEB) biosynthesis from cyanobacteria, phycobiliprotein lyase has been shown to provide structural requirements for the transfer of this chromophore to the specific cysteine residue of the apophycobiliprotein.	166
319978	cd16340	CpcS_T	S- and T-type phycobiliprotein (PBP) lyases. This family contains the S- and T-type phycobiliprotein (PBP) lyases.  PBP lyases are employed by cyanobacteria, red algae, cryptophytes and glaucophytes for light-harvesting.  Pigmentation of light-harvesting phycobiliproteins of cyanobacteria and cryptophytes requires covalent attachment of open-chain tetrapyrrole chromophores, the phycobilins, to the apoproteins.  PBP lyases mediate this covalent attachment of phycobilin chromophores to apo-PBPs. These lyases are distinguishable in the clades of E/F-, S/U-, and T-type lyases; T-type lyases (which include CpcT) are distantly related to S-type lyases (which include CpcS and CpcU). S- and T-type PBP lyases differ in mechanistic details; the conformation and protonation state in which the chromophore is presented account for their differences in stereochemistry of the chromophore selectivity as well as corresponding binding sites. On the other hand, both lyases carry out the main functions of assisting site selectivity in the apo-PBP, protecting the chromophore and ensuring the regio- and stereoselectivity of the addition. The S- and T-type PBP lyases adopt a beta-barrel structure with a modified lipocalin fold.	166
319975	cd16341	FdhE	formate dehydrogenase accessory protein FdhE and similar proteins. This family contains formate dehydrogenase accessory protein FdhE and FdhE-like protein, found largely in gamma- and some beta-Proteobacteria, where the fdhE genes are almost always genetically linked to the structural genes for formate dehydrogenases. FdhE is required for the assembly of formate dehydrogenase although it is not present in the final complex. In E. coli, FdhE interacts with the catalytic subunits of the respiratory formate dehydrogenases. Purification of recombinant FdhE demonstrates the protein is an iron-binding rubredoxin that can adopt monomeric and homodimeric forms. E. coli FdhE interacts with the catalytic subunits, FdnG and FdoG, of the Tat-dependent respiratory formate dehydrogenases. Site-directed mutagenesis has shown that conserved cysteine motifs are essential for the physiological activity of the FdhE protein and are also involved in Fe(III) ligation. The iron likely is redox active, suggesting that the switch from aerobic to anaerobic conditions may be important in modulating FdhE function. Alternatively, FdhE may be involved in an electron transfer reaction, similar to other rubredoxins.	257
319974	cd16342	FusC_FusB	Fusidic acid resistance protein (FusC/FusB). The fusidic acid resistance protein FusC (FusB) mediates resistance to the antibiotic fusidic acid. Its C-terminal domain, which contains a zinc-binding site, interacts with EF-G with high affinity, promoting the dissociation of stalled ribosome·EF-G·GDP complexes that form in the presence of fusidic acid, thus allowing the ribosomes to resume translation.	204
319971	cd16343	LMWPTP	Low molecular weight protein tyrosine phosphatase. Low molecular weight protein tyrosine phosphatases (LMW-PTP) are a family of  small soluble single-domain enzymes that are characterized by a highly conserved active site motif (V/I)CXGNXCRS and share no sequence similarity with other types of protein tyrosine phosphatases (PTPs). LMW-PTPs play important roles in many biological processes and are widely distributed in prokaryotes and eukaryotes.	147
319972	cd16344	LMWPAP	low molecular weight protein arginine phosphatase. Low molecular weight protein arginine phosphatases are part of the low molecular weight phosphatase (LMWP) family. They share a highly conserved active site motif (V/I)CXGNTCRS. It has been shown that the conserved threonine, which in many LMWPTPs is an isoleucine, confers specificity to phosphoarginine over phosphotyrosine.	142
319973	cd16345	LMWP_ArsC	Arsenate reductase of the LMWP family. Arsenate reductase plays an important role in the reduction of intracellular arsenate to arsenite, an important step in arsenic detoxification. The reduction involves three different thiolate nucleophiles. In arsenate reductases of the LMWP family, reduction can be coupled with thioredoxin (Trx)/thioredoxin reductase (TrxR) or glutathione (GSH)/glutaredoxin (Grx).	132
319957	cd16347	VOC_like	uncharacterized subfamily of the vicinal oxygen chelate (VOC) family. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping.	221
319958	cd16348	VOC_YdcJ_like	uncharacterized metal-dependent enzyme similar to Shigella flexneri YdcJ. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping.	310
319959	cd16349	VOC_like	uncharacterized subfamily of the vicinal oxygen chelate (VOC) family. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping.	301
319960	cd16350	VOC_like	uncharacterized subfamily of the vicinal oxygen chelate (VOC) family. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping.	254
319750	cd16351	CheB_like	methylesterase CheB domain family. This family contains the methylesterase CheB (EC 3.1.1.61; also known as CheB methylesterase, chemotaxis-specific methylesterase, methyl-accepting chemotaxis protein methyl-esterase, or protein methyl-esterase) domain, a phosphorylation-activated response regulator involved in reversible modification of bacterial chemotaxis receptors. Signaling output of the chemotaxis receptors is modulated by CheB and methyltransferase CheR by controlling the level of receptor methylation. CheB family members may also contain an N-terminal regulatory (REC) domain, which blocks the active site of the C-terminal domain until it is phosphorylated, or a CheR domain; typically cheB and cheR occur in the same operon. Reversible methylation of transmembrane chemoreceptors plays an important role in ligand-dependent signaling and cellular adaptation in bacterial chemotaxis. Phosphorylated CheB catalyzes deamidation of specific glutamine residues in the cytoplasmic region of the chemoreceptors and demethylation of specific methyl glutamate residues introduced into the chemoreceptors by CheR.	184
319352	cd16352	CheD	chemotaxis protein CheD stimulates methylation of methyl-accepting chemotaxis proteins. This family contains bacterial chemotaxis protein CheD that stimulates methylation of methyl-accepting chemotaxis proteins (MCPs). The CheD chemotaxis gene is not found in the Escherichia coli genome, but is present in many other organisms, including B. subtilis, where CheD appears to have two separate roles; it binds to chemoreceptors to activate them as part of the CheC/CheD/CheYp adaptation system, and it deamidates selected residues to activate chemoreceptors, enabling them to mediate amino acid chemotaxis. CheD has been shown to catalyze amide hydrolysis of specific glutaminyl side chains of the B. subtilis chemoreceptors McpA, McpB and McpC; deamidation by CheD is required for the chemoreceptors to effectively transduce signals to the CheA kinase. CheD's ability to bind the receptors is controlled by CheC via a competitive binding mechanism; substituting Gln into the receptor motif of the signal-terminating phosphatase, CheC, turns the inhibitor into a receptor-modifying deamidase CheD substrate. Also, CheYp increases the affinity of CheD for CheC, controlling CheD binding to the receptors through its interactions with CheC. Thus, high levels of CheYp mean CheC is a better binding target for CheD than the receptors, resulting in decreased CheA activity. The CheD structure reveals a distant homology with a class of bacterial toxins represented by the cytotoxic necrotizing factor 1 (CNF1) as well as a class of proteins of unknown function represented by B. subtilis YfiH. An invariant Cys-His pair forming a catalytic dyad is observed, and is required by the toxins for deamidation activity.	146
381732	cd16353	CheC_CheX_FliY	CheC/CheX/FliY (CXY) family phosphatases. The CXY family includes CheY-P-hydrolyzing proteins that function in bacterial chemotaxis, which involves cellular processes that control the movement of organisms toward favorable environments via rotating flagella, whose sense of rotation is in turn determined by the intracellular response regulator CheY. When phosphorylated, CheY-P interacts directly with the flagellar motor, and this signal is terminated by the CXY family of phosphatases (Escherichia coli uses CheZ). CheC acts as a weak CheY-P phosphatase but increases activity in the presence of CheD. Bacillus subtilis has only CheC and FliY while many systems also have CheX. CheC and CheX appear to be primarily involved in restoring normal CheY-P levels, whereas FliY seems to act on CheY-P constitutively. Unlike CheC and CheX, FliY is localized in the flagellar switch complex, which also contains the stator-coupling protein FliG and the target of CheY-P, FliM. CheC, CheX, and FliY phosphatases share a consensus sequence ([DS]xxxExxNx(22)P) with four conserved residues thought to form the phosphatase active site. CheC class I and FliY each have two active sites, while CheC class II and III, and CheX have only one. This family also includes FliM, a component of the flagellar switch complex and a target of CheY, which lacks the phosphatase active site consensus sequence, and is not a CheY phosphatase.	162
319961	cd16354	BAT	Bleomycin N-Acetyltransferase and similar proteins. BlmB encodes a bleomycin N-acetyltransferase, designated BAT, which inactivates bleomycin (Bm) using acetyl-coenzyme A (AcCoA). BAT forms a dimer structure via interaction of its C-terminal domains in the monomers. The N-terminal domain of BAT has a tunnel with two entrances: a wide entrance that accommodates the metal-binding domain of Bm and a narrow entrance that accommodates acetyl-CoA (AcCoA). A groove formed on the dimer interface of two BAT C-terminal domains forms the DNA-binding domain of Bm. In a ternary complex of BAT, BmA(2), and CoA, a thiol group of CoA is positioned near the primary amine of Bm at the midpoint of the tunnel and ensures efficient transfer of an acetyl group from AcCoA to the primary amine of Bm. BAT belongs to the vicinal oxygen chelate (VOC) superfamily, which is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including thiocoraline, bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which show no domain swapping.	114
319962	cd16355	VOC_like	uncharacterized subfamily of vicinal oxygen chelate (VOC) superfamily. The vicinal oxygen chelate (VOC) superfamily  is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping.	121
319963	cd16356	PsjN_like	Burkholderia Phytofirmans glyoxalase/bleomycin resistance protein/dioxygenase family enzyme and similar proteins. Burkholderia Phytofirmans glyoxalase/bleomycin resistance protein/dioxygenase family enzyme and similar proteins. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping.	119
319964	cd16357	GLOD4_C	C-terminal domain of human glyoxalase domain-containing protein 4 and similar proteins. Uncharacterized subfamily of the vicinal oxygen chelate (VOC) superfamily contains human glyoxalase domain-containing protein 4 and similar proteins. VOC is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping.	114
319965	cd16358	GlxI_Ni	Glyoxalase I that uses Ni(++) as cofactor. This family includes Escherichia coli and other prokaryotic glyoxalase I that uses nickel as cofactor. Glyoxalase I (also known as lactoylglutathione lyase; EC 4.4.1.5) is part of the glyoxalase system, a two-step system for detoxifying methylglyoxal, a side product of glycolysis. This system is responsible for the conversion of reactive, acyclic alpha-oxoaldehydes into the corresponding alpha-hydroxyacids and involves 2 enzymes, glyoxalase I and II. Glyoxalase I catalyses an intramolecular redox reaction of the hemithioacetal (formed from methylglyoxal and glutathione) to form the thioester, S-D-lactoylglutathione. This reaction involves the transfer of two hydrogen atoms from C1 to C2 of the methylglyoxal, and proceeds via an ene-diol intermediate. Glyoxalase I has a requirement for bound metal ions for catalysis. Eukaryotic glyoxalase I prefers the divalent cation zinc as cofactor, whereas Escherichia coli and other prokaryotic glyoxalase I uses nickel. However, eukaryotic Trypanosomatid parasites also use nickel as a cofactor, which could possibly be explained by acquiring their GLOI gene by horizontal gene transfer. Human glyoxalase I is a two-domain enzyme and has the structure of a domain-swapped dimer with two active sites located at the dimer interface. In yeast, in various plants, insects and Plasmodia, glyoxalase I is four-domain, possibly the result of a further gene duplication and an additional gene fusing event.	122
319966	cd16359	VOC_BsCatE_like_C	C-terminal of Bacillus subtilis CatE like protein, a vicinal oxygen chelate subfamily. Uncharacterized subfamily of vicinal oxygen chelate (VOC) superfamily contains Bacillus subtilis CatE and similar proteins. CatE is proposed to function as Catechol-2,3-dioxygenase. VOC  is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping.	110
319967	cd16360	ED_TypeI_classII_N	N-terminal domain of type I, class II extradiol dioxygenases. This family contains the N-terminal non-catalytic domain of type I, class II extradiol dioxygenases. Dioxygenases catalyze the incorporation of both atoms of molecular oxygen into substrates using a variety of reaction mechanisms, resulting in the cleavage of aromatic rings. Two major groups of dioxygenases have been identified according to the cleavage site; extradiol enzymes cleave the aromatic ring between a hydroxylated carbon and an adjacent non-hydroxylated carbon, whereas intradiol enzymes cleave the aromatic ring between two hydroxyl groups. Extradiol dioxygenases are classified into type I and type II enzymes. Type I extradiol dioxygenases include class I and class II enzymes. These two classes of enzymes show sequence similarity; the two-domain class II enzymes evolved from a class I enzyme through gene duplication. The extradiol dioxygenases represented in this family are type I, class II enzymes, and are composed of N- and C-terminal domains with a similar structural fold, resulting from an ancient gene duplication. The active site is located in a funnel-shaped space of the C-terminal domain. A catalytically essential metal, Fe(II) or Mn(II), is present in all the enzymes in this family.	111
319968	cd16361	VOC_ShValD_like	vicinal oxygen chelate (VOC) family protein similar to Streptomyces hygroscopicus ValD protein. This subfamily of vicinal oxygen chelate (VOC) family protein includes  Streptomyces hygroscopicus ValD protein and similar proteins. ValD protein functions in  validamycin biosynthetic pathway. The vicinal oxygen chelate (VOC) superfamily is composed of structurally related proteins with paired beta.alpha.beta.beta.beta motifs that provide a metal coordination environment with two or three open or readily accessible coordination sites to promote direct electrophilic participation of the metal ion in catalysis. VOC domain is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping.	150
319969	cd16362	TflA	Toxoflavin Lyase. Toxoflavin lyase (TflA) is a metalloenzyme that degrades toxoflavin in the presence of oxygen, Mn(II), and dithiothreitol. TflA is structurally homologous to proteins of the vicinal oxygen chelate superfamily. The structure of TflA contains a fold that displays a rare rearrangement of the structural modules indicative of domain permutation. Moreover, unlike the 2-His-1-carboxylate facial triad commonly utilized by vicinal oxygen chelate dioxygenases and other dioxygen-activating non-heme Fe(II) enzymes, the metal center in TflA consists of a 1-His-2-carboxylate facial triad. Toxoflavin is an azapteridine that is toxic to various plants, fungi, animals, and bacteria.	110
319897	cd16363	Col_Im_like	inhibitory immunity (Im) protein of colicin (Col) deoxyribonuclease (DNase) and pyocins. This family contains inhibitory immunity (Im) proteins that bind to colicin endonucleases (DNases) or pyocins with very high affinity and specificity; this is critical for the neutralization of endogenous DNase catalytic activity and for protection against exogenous DNase bacteriocins. The DNase colicin family (ColE2, ColE7, ColE8 and ColE9) in E. coli, and pyocin family (S1, S2, S3 and AP41) in P. aeruginosa, are potent bacteriocins where the immunity proteins (Ims) protect the colicin/pyocin producing (i.e. colicinogenic) bacteria by binding and inactivating colicin nucleases. The binding affinities between cognate and non-cognate nucleases by Im proteins can vary up to 10 orders of magnitude.	81
409301	cd16364	T3SC_I-like	class I type III secretion system (T3SS) chaperones and similar proteins. This family contains class I type III secretion system (T3SS) chaperones mainly found in Gram-negative bacteria such as Pseudomonas, Yersinia, Salmonella, Escherichia and Erwinia, among others. A wide variety of these bacterial pathogens and symbionts require a T3SS to inject eukaryotic host cells with effector proteins important for suppressing host defenses and establishing infection. Many of these effector proteins interact with specific type III secretion chaperones prior to secretion. These T3SS chaperones have been classified as class I type III secretion chaperones (T3SC), which are small structurally conserved dimers that interact specifically with T3SS effector proteins. Class I T3SC consists of two subclasses: IA and IB. Class IA T3SC binds a single effector, whereas class IB T3SC binds to several effectors. Class IA and Class IB T3SCs typically exhibit little sequence similarities, but share a common overall heart-shaped structure fold (alpha-beta-beta-beta-alpha-beta-beta-alpha) and features, such as a small size, an acidic pI and an amphipathic C-terminal alpha-helix. Chaperone protein CesT serves a chaperone function for the enteropathogenic Escherichia coli (EPEC) translocated intimin receptor (Tir) protein, which confers upon EPEC the ability to alter host cell morphology following intimate bacterial attachment. In Pseudomonas aeruginosa, chaperone ExsC binds small secreted protein ExsE as well as the non-secreted anti-activator protein ExsD; it relieves repression of the transcriptional activator ExsA (which activates expression of T3SS genes) by ExsD. P. aeruginosa SpcU binds the cytotoxin ExoU, which is a broad-specificity phospholipase A2 (PLA2) and lysophospholipase, and maintains the N-terminus of ExoU in an unfolded state which is required for secretion. Salmonella enterica chaperone SicP forms a complex with effector protein SptP at an early stage of its secretion process in order to avoid premature degradation, while chaperone SigE binds to effector SigD, which, upon translocation into the host cell, preferentially dephosphorylates specific inositol phospholipids that are thought to be crucial for subsequent activation of the host cell Ser-Thr kinase Akt. This family also includes Yersinia chaperone/escortee pairs SycE/YopE, SycH/YopH, SycT/YopT and SycN+YscB/YopN, all of which bind to specific Yersinia outer proteins (Yops). Also included are several DspF and related sequences from several plant pathogenic bacteria. The "disease-specific" (dsp) region next to the hrp gene cluster of Erwinia amylovora is required for pathogenicity but not for elicitation of the hypersensitive reaction. In addition, a group of proteins including Escherichia coli YbjN, Erwinia amylovora AmyR, and their homologs, are included in this family. They share a class I T3SC-like fold with T3SS chaperone proteins but appear to function independently of the T3SS.	117
319887	cd16365	NarH_like	beta FeS subunits DMSOR NarH-like family. This subfamily contains beta FeS subunits of several DMSO reductase superfamily members, including nitrate reductase A, ethylbenzene dehydrogenase and selenate reductase.  DMSO Reductase (DMSOR) family members have a large, periplasmic molybdenum-containing alpha subunit as well as a small beta FeS subunit, and may also have a small gamma subunit. The beta subunits of DMSOR contain four Fe4/S4 and/or Fe3/S4 clusters which transfer the electrons from the alpha subunit to a hydrophobic integral membrane protein, presumably a cytochrome containing two b-type heme groups. The reducing equivalents are then transferred to menaquinone, which finally reduces the electron-accepting enzyme system. Nitrate reductase A contains three subunits (the catalytic subunit NarG, the electron-transfer subunit NarH with four [Fe-S] clusters, and integral membrane subunit NarI) and often forms a respiratory chain with the formate dehydrogenase via the lipid soluble quinol pool. Ethylbenzene dehydrogenase oxidizes the hydrocarbon ethylbenzene to (S)-1-phenylethanol. Selenate reductase catalyzes reduction of selenate to selenite in bacterial species that can obtain energy by respiring anaerobically with selenate as the terminal electron acceptor.	201
319888	cd16366	FDH_beta_like	beta FeS subunits of formate dehydrogenase N (FDH-N) and similar proteins. This family contains beta FeS subunits of several dehydrogenases in the DMSO reductase superfamily, including formate dehydrogenase N (FDH-N), tungsten-containing formate dehydrogenase (W-FDH) and other similar proteins. FDH-N is a major component of nitrate respiration of Escherichia coli; it catalyzes the oxidation of formate to carbon dioxide, donating the electrons to a second substrate to a cytochrome.  W-FDH contains a tungsten instead of molybdenum at the catalytic center and seems to be exclusively found in organisms such as hyperthermophilic archaea that live in extreme environments. It catalyzes the oxidation of formate to carbon dioxide, donating the electrons to a second substrate.	156
319889	cd16367	DMSOR_beta_like	uncharacterized subfamily of DMSO Reductase beta subunit family. This family consists of the small beta iron-sulfur (FeS) subunit of the DMSO Reductase (DMSOR) family. Members of this family also contain a large, periplasmic molybdenum-containing alpha subunit and may have a small gamma subunit as well.  Examples of heterodimeric members with alpha and beta subunits include arsenite oxidase, and tungsten-containing formate dehydrogenase (FDH-T) while   heterotrimeric members containing alpha, beta, and gamma subunits include formate dehydrogenase-N (FDH-N), and nitrate reductase (NarGHI).  The beta subunit contains four Fe4/S4 and/or Fe3/S4 clusters which transfer the electrons from the alpha subunit to a hydrophobic integral membrane protein, presumably a cytochrome containing two b-type heme groups. The reducing equivalents are then transferred to menaquinone, which finally reduces the electron-accepting enzyme system.	138
319890	cd16368	DMSOR_beta_like	uncharacterized subfamily of DMSO Reductase beta subunit family. This family consists of the small beta iron-sulfur (FeS) subunit of the DMSO Reductase (DMSOR) family. Members of this family also contain a large, periplasmic molybdenum-containing alpha subunit and may have a small gamma subunit as well.  Examples of heterodimeric members with alpha and beta subunits include arsenite oxidase, and tungsten-containing formate dehydrogenase (FDH-T) while   heterotrimeric members containing alpha, beta, and gamma subunits include formate dehydrogenase-N (FDH-N), and nitrate reductase (NarGHI).  The beta subunit contains four Fe4/S4 and/or Fe3/S4 clusters which transfer the electrons from the alpha subunit to a hydrophobic integral membrane protein, presumably a cytochrome containing two b-type heme groups. The reducing equivalents are then transferred to menaquinone, which finally reduces the electron-accepting enzyme system.	200
319891	cd16369	DMSOR_beta_like	uncharacterized subfamily of DMSO Reductase beta subunit family. This family consists of the small beta iron-sulfur (FeS) subunit of the DMSO Reductase (DMSOR) family. Members of this family also contain a large, periplasmic molybdenum-containing alpha subunit and may have a small gamma subunit as well.  Examples of heterodimeric members with alpha and beta subunits include arsenite oxidase, and tungsten-containing formate dehydrogenase (FDH-T) while   heterotrimeric members containing alpha, beta, and gamma subunits include formate dehydrogenase-N (FDH-N), and nitrate reductase (NarGHI).  The beta subunit contains four Fe4/S4 and/or Fe3/S4 clusters which transfer the electrons from the alpha subunit to a hydrophobic integral membrane protein, presumably a cytochrome containing two b-type heme groups. The reducing equivalents are then transferred to menaquinone, which finally reduces the electron-accepting enzyme system.	172
319892	cd16370	DMSOR_beta_like	uncharacterized subfamily of DMSO Reductase beta subunit family. This family consists of the small beta iron-sulfur (FeS) subunit of the DMSO Reductase (DMSOR) family. Members of this family also contain a large, periplasmic molybdenum-containing alpha subunit and may have a small gamma subunit as well.  Examples of heterodimeric members with alpha and beta subunits include arsenite oxidase, and tungsten-containing formate dehydrogenase (FDH-T) while   heterotrimeric members containing alpha, beta, and gamma subunits include formate dehydrogenase-N (FDH-N), and nitrate reductase (NarGHI).  The beta subunit contains four Fe4/S4 and/or Fe3/S4 clusters which transfer the electrons from the alpha subunit to a hydrophobic integral membrane protein, presumably a cytochrome containing two b-type heme groups. The reducing equivalents are then transferred to menaquinone, which finally reduces the electron-accepting enzyme system.	131
319893	cd16371	DMSOR_beta_like	uncharacterized subfamily of DMSO Reductase beta subunit family. This family consists of the small beta iron-sulfur (FeS) subunit of the DMSO Reductase (DMSOR) family. Members of this family also contain a large, periplasmic molybdenum-containing alpha subunit and may have a small gamma subunit as well.  Examples of heterodimeric members with alpha and beta subunits include arsenite oxidase, and tungsten-containing formate dehydrogenase (FDH-T) while   heterotrimeric members containing alpha, beta, and gamma subunits include formate dehydrogenase-N (FDH-N), and nitrate reductase (NarGHI).  The beta subunit contains four Fe4/S4 and/or Fe3/S4 clusters which transfer the electrons from the alpha subunit to a hydrophobic integral membrane protein, presumably a cytochrome containing two b-type heme groups. The reducing equivalents are then transferred to menaquinone, which finally reduces the electron-accepting enzyme system.	140
319894	cd16372	DMSOR_beta_like	uncharacterized subfamily of DMSO Reductase beta subunit family. This family consists of the small beta iron-sulfur (FeS) subunit of the DMSO Reductase (DMSOR) family. Members of this family also contain a large, periplasmic molybdenum-containing alpha subunit and may have a small gamma subunit as well.  Examples of heterodimeric members with alpha and beta subunits include arsenite oxidase, and tungsten-containing formate dehydrogenase (FDH-T) while   heterotrimeric members containing alpha, beta, and gamma subunits include formate dehydrogenase-N (FDH-N), and nitrate reductase (NarGHI).  The beta subunit contains four Fe4/S4 and/or Fe3/S4 clusters which transfer the electrons from the alpha subunit to a hydrophobic integral membrane protein, presumably a cytochrome containing two b-type heme groups. The reducing equivalents are then transferred to menaquinone, which finally reduces the electron-accepting enzyme system.	125
319895	cd16373	DMSOR_beta_like	uncharacterized subfamily of DMSO Reductase beta subunit family. This family consists of the small beta iron-sulfur (FeS) subunit of the DMSO Reductase (DMSOR) family. Members of this family also contain a large, periplasmic molybdenum-containing alpha subunit and may have a small gamma subunit as well.  Examples of heterodimeric members with alpha and beta subunits include arsenite oxidase, and tungsten-containing formate dehydrogenase (FDH-T) while   heterotrimeric members containing alpha, beta, and gamma subunits include formate dehydrogenase-N (FDH-N), and nitrate reductase (NarGHI).  The beta subunit contains four Fe4/S4 and/or Fe3/S4 clusters which transfer the electrons from the alpha subunit to a hydrophobic integral membrane protein, presumably a cytochrome containing two b-type heme groups. The reducing equivalents are then transferred to menaquinone, which finally reduces the electron-accepting enzyme system.	154
319896	cd16374	DMSOR_beta_like	uncharacterized subfamily of DMSO Reductase beta subunit family. This family consists of the small beta iron-sulfur (FeS) subunit of the DMSO Reductase (DMSOR) family. Members of this family also contain a large, periplasmic molybdenum-containing alpha subunit and may have a small gamma subunit as well.  Examples of heterodimeric members with alpha and beta subunits include arsenite oxidase, and tungsten-containing formate dehydrogenase (FDH-T) while   heterotrimeric members containing alpha, beta, and gamma subunits include formate dehydrogenase-N (FDH-N), and nitrate reductase (NarGHI).  The beta subunit contains four Fe4/S4 and/or Fe3/S4 clusters which transfer the electrons from the alpha subunit to a hydrophobic integral membrane protein, presumably a cytochrome containing two b-type heme groups. The reducing equivalents are then transferred to menaquinone, which finally reduces the electron-accepting enzyme system.	139
319867	cd16375	Avd_IVP_like	proteins similar to the diversity-generating retroelement protein bAvd. A superfamily of four-helix bundles that form homopentamers, including the bacterial accessory variability determinant (bAvd) protein and a family of functionally uncharacterized bacterial proteins, some of which are encoded by an atypically large intervening sequence present within some 23S rRNA genes.	103
319868	cd16376	Avd_like	diversity-generating retroelement protein bAvd and similar proteins. The bacterial accessory variability determinant (bAvd) protein, together with a reverse transcriptase, is involved in retrohoming processes as part of a diversity-generating retroelement (DGR), a type of system that is involved in generating sequence variation in bacterial proteins by inserting coding information from a template region into a variable region of a protein coding gene. bAvd forms homopentamers and interacts with the reverse transcriptase as well as with DNA and/or RNA.	106
319869	cd16377	23S_rRNA_IVP_like	23S rRNA-intervening sequence protein and similar proteins. A family of functionally uncharacterized bacterial proteins, some of which are encoded by an atypically large intervening sequence present within some 23S rRNA genes. The distantly related bAvd protein, which also forms a homopentamer of four-helix bundles, has been suggested to interact with nucleic acids and a reverse transcriptase.	108
319866	cd16378	CcmH_N	N-terminal domain of cytochrome c-type biogenesis protein CcmH and related proteins. Cytochrome c-type biogenesis protein CcmH is a membrane-anchored thiol-oxidoreductase that is essential in the maturation of c-type cytochromes. CcmH consists of an N-terminal catalytic domain with the active site CXXC motif and a C-terminal domain of unknown function which is predicted to contain TPR-like repeats. Other members of this family include NrfF, CycL, and Ccl2.	67
319863	cd16379	YitT_C_like	C-terminal domain of Bacillus subtilis YitT and similar protein domains. This domain, found in various bacterial proteins, has no known function. It has been given the designation DUF2179 and occurs at the C-terminus of the Bacillus subtilis membrane protein YitT as well as in single-domain proteins.	80
319864	cd16380	YitT_C	C-terminal domain of Bacillus subtilis YitT and similar protein domains. This domain, found in various bacterial proteins, has no known function. It has been given the designation DUF2179 and occurs at the C-terminus of the Bacillus subtilis membrane protein YitT.	85
319865	cd16381	YitT_C_like_1	Proteins similar to the C-terminal domain of Bacillus subtilis YitT. This domain, characteristic of various bacterial proteins, has no known function. It has been given the designation DUF2179 and is similar to the C-terminus of the Bacillus subtilis membrane protein.	80
341357	cd16382	XisI-like	XisI is FdxN element excision controlling factor protein. This family contains XisI proteins, also known as FdxN element excision controlling factors, and similar proteins. FdxN element is excised from the chromosome during heterocyst differentiation in cyanobacteria. This is accomplished by the large serine recombinase XisF (fdxN element site-specific recombinase). The xisH as well as the xisF and xisI genes are required. XisI may function as recombination directionality factor (RDF), and needs XisH which may function as an endonuclease.	107
319862	cd16383	GUN4	porphyrin-binding protein domain GUN4. GUN4 is a porphyrin-binding protein involved in chlorophyll biosynthesis regulation and intracellular signaling, found in aerobic photosynthetic organisms. It has been implicated in retrograde signalling between the chloroplast and nucleus. GUN4 can bind protoporphyrin IX (PIX) and magnesium protoporphyrin IX (MgPIX), substrate and product of the Mg-chelatase, as well as heme and cobalt protoporphyrin IX (CoPIX). It may play a role in protecting plants from reactive oxygen species (ROS)-mediated damage.	142
319760	cd16384	VirB8_like	virulence protein VirB8. This family contains bacterial virulence protein VirB8 and similar proteins which consist of cytoplasmic, transmembrane, and periplasmic regions. VirB8 is an essential assembly factor of type IV secretion system (T4SS) channel proteins that are highly diverse in function relative to other bacterial secretion systems. T4SS proteins are important virulence factors for many Gram-negative pathogens, such as Agrobacterium, Brucella, Legionella, and Helicobacter. This family also includes the conjugal transfer protein family TrbF, a family of proteins known to be involved in conjugal transfer. The TrbF protein is thought to compose part of the pilus required for transfer. This domain has a similar fold to the nuclear transport factor-2 (NTF2) protein.	133
319861	cd16385	IcmL	inner membrane protein IcmL/DotI. This family contains the periplasmic domain of the inner membrane protein DotI of the Dot/Icm (defect in organelle trafficking/intracellular multiplication) type IVB secretion system, including its ortholog in the conjugation system of plasmid R64, TraM, and similar proteins. These domains share striking structural similarity to the type IV secretion system (T4ASS) component VirB8, suggesting DotI/TraM to be its functional counterpart. DotI is essential for intracellular growth of Legionella pneumophila (causative agent of Legionnaires' disease) within mammalian and protozoan cells; it translocates numerous effector proteins into its eukaryotic host.	132
319860	cd16386	TcpC_N	N-terminal domain of conjugative transposon protein TcpC. This family contains the N-terminal domain of conjugation protein TcpC and similar proteins. TcpC is required for efficient conjugative transfer, localizing to the cell membrane independently of other conjugation proteins, where membrane localization is important for its function, oligomerization and interaction with the conjugation proteins TcpA, TcpH, and TcpG. N-terminal region (sometimes referred to as central domain) of TcpC is required for efficient conjugation, oligomerization and protein-protein interactions. TcpC has low level sequence identity to proteins encoded by the conjugative transposon Tn916, which is responsible for a large proportion of the tetracycline resistance in different pathogens.	123
319246	cd16387	ParB_N_Srx	ParB N-terminal domain and sulfiredoxin protein-related families. The ParB N-terminal domain/Sulfiredoxin (Srx) superfamily contains proteins with diverse activities. Many of the families are involved in segregation and competition between plasmids and chromosomes. Several families share similar activities with the N-terminal domain of ParB (Spo0J in Bacillus subtilis), a DNA-binding component of the prokaryotic parABS partitioning system. Also within this superfamily is sulfiredoxin (Srx; reactivator of oxidatively inactivated 2-cys peroxiredoxins), RepB N-terminal domain (plasmid segregation replication protein B like protein), nucleoid occlusion protein, KorB N-terminal domain partition protein of low copy number plasmid RK2, IbrB (immunoglobulin-binding regulator that activates eib genes), N-terminal domain of sopB protein (promotes proper partitioning of F1 plasmid), fertility inhibition factors OSA and FiwA, DNA sulfur modification protein DndB, and a ParB-like toxin domain. Other members include StrR (regulator in the streptomycin biosynthetic gene cluster), and a family containing a Pyrococcus furiosus nuclease and putative transcriptional regulators SbnI (Staphylococcus aureus siderophore biosynthetic gene cluster). Nuclease activity has also been reported in Arabidopsis Srx.	54
319247	cd16388	SbnI_like_N	N-terminal domain of transcriptional regulators similar to SbnI. Siderophore staphylobactin biosynthesis protein SbnI of Staphylococcus aureus is a ParB/Spo0J like protein required for the expression of genes in the sbn operon, which is responsible for staphyloferrin B (SB) biosynthesis. SbnI forms dimers and binds DNA upstream of sbnD. SbnI binds heme, which inhibits DNA binding of SbnI, leading to a suppression of sbn operon expression.	77
319248	cd16389	FIN	fertility inhibition factors, including OSA and FiwA, related to the ParB/Srx superfamily. Osa and FiwA are fertility inhibition factors (FIN), which are employed by plasmids to block import of rival plasmids. Osa (oncogenic suppressive activity) of IncW group plasmid pSa gene inhibits the oncogenic properties of Agrobacterium tumefaciens. Osa is structurally similar to the ParB N-terminal domain/Srx superfamily of proteins: ParB acts in the bacterial and plasmid parABS partitioning systems. Osa has been shown to have ATPase and DNase activities, and can block T-DNA transfer into plants. FiwA is encoded by plasmid RP1 and blocks the transfer of plasmid R388. The gene product of Haemophilus influenzae p1056.10c also blocks T-DNA transfer.	124
319249	cd16390	ParB_N_Srx_like	uncharacterized family distantly related to the N-terminal domain of the ParB/Srx superfamily. Uncharacterized proteins distantly related to the N-terminal domain of the ParB superfamily, primarily involved in bacterial and plasmid parABS-related partitioning systems. A small minority of proteins in this family include a C-terminal inorganic pyrophosphatase domain. Also within the ParB superfamily is sulfiredoxin (Srx), which is a reactivator of oxidatively inactivated 2-cys peroxiredoxins. Other families include a putative regulator in the biosynthetic gene cluster and a family containing a Pyrococcus furiosus nuclease and putative transcriptional regulators SbnI (Staphylococcus aureus siderophore biosynthetic gene cluster) and EdeB (Brevibacillus brevis antimicrobial peptide edeine biosynthetic cluster). Nuclease activity has also been reported in Arabidopsis Srx.	162
319250	cd16392	toxin-ParB	toxin domain of the ParB/Srx superfamily. Toxin domain with similarity to the N-terminal domain of ParB, a DNA-binding component of the prokaryotic parABS partitioning system and related proteins. The toxin is found, for example, at the C-terminus of polymorphic toxin system members.	72
319251	cd16393	SPO0J_N	Thermus thermophilus stage 0 sporulation protein J-like N-terminal domain, ParB family member. Spo0J (stage 0 sporulation protein J) is a ParB family member, a critical component of the ParABS-type bacterial chromosome segregation system. The Spo0J N-terminal region acts in protein-protein interaction and is adjacent to the DNA-binding domain that binds to parS sites. Two Spo0J bind per parS site, and Spo0J interacts with neighbors via the N-terminal domain to form oligomers via an arginine-rich patch (RRXR).  This superfamily represents the N-terminal domain of ParB, a DNA-binding component of the prokaryotic parABS partitioning system. parABS contributes to the efficient segregation of chromosomes and low-copy number plasmids to daughter cells during prokaryotic cell division. The process includes the parA (Walker box) ATPase, the ParB DNA-binding protein and parS cis-acting DNA sites. Binding of ParB to centromere-like parS sites is followed by non-specific binding to DNA ("spreading", which has been implicated in gene silencing in plasmid P1) and oligomerization of additional ParB molecules near the parS sites. It has been proposed that ParB-ParB cross-linking compacts the DNA, binds to parA via the N-terminal region, and leads to parA separating the ParB-parS complexes and the recruitment of the SMC (structural maintenance of chromosomes) complexes. The ParB N-terminal domain of Bacillus subtilis and other species contains an arginine-rich ParB Box II with residues essential for bridging of the ParB-parS complexes. The arginine-rich ParB Box II consensus (I[VIL]AGERR[FYW]RA[AS]) identified in several species is partially conserved with this family and related families. Mutations within the basic columns particularly debilitate spreading from the parS sites and impair SMC recruitment. The C-terminal domain contains a HTH DNA-binding motif and is the primary homo-dimerization domain, and binds to parS DNA sites. Additional homo-dimerization contacts are found along the N-terminal domain, but dimerization of the N-terminus may only occur after concentration at ParB-parS foci.	97
319252	cd16394	sopB_N	N-terminal domain of sopB protein, which promotes proper partitioning of F1 plasmid. Escherichia coli SopB acts in the equitable partitioning of the F plasmid in the SopABC system. SopA binds to the sopAB promoter, while SopB binds SopC and helps stimulate polymerization of SopA in the presence of ATP and Mg(II). Mutation of SopA inhibits proper plasmid segregation. This N-terminal domain is related to the ParB N-terminal domain of bacterial and plasmid parABS partitioning systems, which binds parA.	67
319253	cd16395	Srx	Sulfiredoxin reactivates peroxiredoxins after oxidative inactivation. Sulfiredoxin reduces and thereby re-activates 2-cys peroxiredoxins. Peroxiredoxins act as molecular switches, inactivating in response to hyperoxidation from hydrogen peroxide and other free radicals. Sulfiredoxin reactivates Prx-SO(2)(-) via ATP-Mg(2+)-dependent reduction. Arabidopsis sulfiredoxin has been described as a dual function enzyme, having nuclease activity in addition to the sulfiredoxin activity. This protein is similar to ParB N-terminal-like domain of bacterial and plasmid parABS partitioning systems.	90
319254	cd16396	Noc_N	nucleoid occlusion protein, N-terminal domain, and related domains of the ParB partitioning protein family. Nucleoid occlusion protein has been shown in Bacillus subtilis to bind to specific DNA sequences on the chromosome (Noc-binding DNA sequences, NBS), inhibiting cell division near the nucleoid and thereby protecting the chromosome. This N-terminal domain is related to the N-terminal domain of ParB/repB partitioning system proteins.	95
319255	cd16397	IbrB_like	immunoglobulin-binding regulator IbrB activates eib genes. IbrB (along with IbrA) activates immunoglobulin-binding eib genes in Escherichia coli. IbrB is related to the ParB N-terminal domain family, which includes DNA-binding proteins involved in chromosomal/plasmid segregation and transcriptional regulation, consistent with a possible mechanism for IbrB in DNA binding-related regulation of eib expression. The ParB like family is a diverse domain superfamily with structural and sequence similarity to ParB of bacterial chromosomes/plasmid parABS partitioning system and Sulfiredoxin (Srx), which is a reactivator of oxidatively inactivated 2-cys peroxiredoxins. Other families include proteins related to StrR, a putative regulator in the biosynthetic gene cluster and a family containing a Pyrococcus furiosus nuclease and putative transcriptional regulators SbnI (Staphylococcus aureus siderophore biosynthetic gene cluster) and EdeB (Brevibacillus brevis antimicrobial peptide edeine biosynthetic cluster). Nuclease activity has also been reported in Arabidopsis Srx.	100
319256	cd16398	KorB_N_like	ParB-like partition protein of low copy number plasmid RK2, N-terminal domain and related domains. KorB, a member of the ParB like family, is present on the low copy number, broad host range plasmid RK2. korB encodes a gene product involved in segregation of RK2 and acts as a transcriptional regulator, down-regulating at least 6 RK2 operons. KorB binds RNA polymerase and acts cooperatively with several co-repressors in modulating transcription. KorB is comprised of 3 domains, including a beta-strand C-terminal domain similar to SH3 domains and an alpha helical central domain that interacts with operator DNA. In ParB of P1 and SopB of F, the N-terminal region is responsible for interaction with the parA component. However, KorB interaction with the RK2 parA-equivalent IncC has been mapped to the central HTH motif.  This family is related to the N-terminal domain of ParB, a DNA-binding component of the prokaryotic parABS partitioning system. parABS contributes to the efficient segregation of chromosomes and low-copy number plasmids to daughter cells during prokaryotic cell division. The process includes the parA (Walker box) ATPase, the ParB DNA-binding protein and parS cis-acting DNA sites. Binding of ParB to centromere-like parS sites is followed by non-specific binding to DNA ("spreading", which has been implicated in gene silencing in plasmid P1) and oligomerization of additional ParB molecules near the parS sites. It has been proposed that ParB-ParB cross-linking compacts the DNA, binds to parA via the N-terminal region, and leads to parA separating the ParB-parS complexes and the recruitment of the SMC (structural maintenance of chromosomes) complexes. The ParB N-terminal domain of Bacillus subtilis and other species contains an arginine-rich ParB Box II with residues essential for bridging of the ParB-parS complexes. The arginine-rich ParB Box II consensus (I[VIL]AGERR[FYW]RA[AS]) identified in several species is partially conserved with this family and related families. Mutations within the basic columns particularly debilitate spreading from the parS sites and impair SMC recruitment. The C-terminal domain contains a HTH DNA-binding motif and is the primary homo-dimerization domain, and binds to parS DNA sites. Additional homo-dimerization contacts are found along the N-terminal domain, but dimerization of the N-terminus may only occur after concentration at ParB-parS foci.	91
319257	cd16400	ParB_Srx_like_nuclease	ParB/Srx_like nuclease and putative transcriptional regulators related to SbnI. This family contains a Pyrococcus furiosus enzyme that is reported to have DNA nuclease activity and resembles the N-terminal domain of ParB proteins of the parABS bacterial chromosome partitioning system. This sub-family also includes siderophore staphylobactin biosynthesis protein SbnI. 60% of the P. furiosus nuclease activity was retained at 90 degrees Celsius, suggesting a physiological role in the organism, which can grow in temperatures as high as 100 degrees Celsius. The protein has endo- and exo-nuclease activity vs. single- and double-stranded DNA, and nuclease activity was lost in methylated proteins prepared for structure solution. This family has a fairly well-conserved DGHHR motif that corresponds to the same structural position as the phosphorylation site (a portion of the ATP-Mg-binding site) of sulfiredoxin and the arginine-rich ParB BoxII of ParB.	72
319258	cd16401	ParB_N_like_MT	ParB N-terminal-like domain, some attached to C-terminal S-adenosylmethionine-dependent methyltransferase domain. This family represents domains related to the N-terminal domain of ParB, a DNA-binding component of the prokaryotic parABS partitioning system, fused to a variety of C-terminal domains, including S-adenosylmethionine-dependent methyltransferase-like domains. parABS contributes to the efficient segregation of chromosomes and low-copy number plasmids to daughter cells during prokaryotic cell division. The process includes the parA (Walker box) ATPase, the ParB DNA-binding protein and parS cis-acting DNA sites. Binding of ParB to centromere-like parS sites is followed by non-specific binding to DNA ("spreading", which has been implicated in gene silencing in plasmid P1) and oligomerization of additional ParB molecules near the parS sites. It has been proposed that ParB-ParB cross-linking compacts the DNA, binds to parA via the N-terminal region, and leads to parA separating the ParB-parS complexes and the recruitment of the SMC (structural maintenance of chromosomes) complexes. The ParB N-terminal domain of Bacillus subtilis and other species contains an arginine-rich ParB Box II with residues essential for bridging of the ParB-parS complexes. The arginine-rich ParB Box II consensus (I[VIL]AGERR[FYW]RA[AS]) identified in several species is partially conserved with this family and related families. Mutations within the basic columns particularly debilitate spreading from the parS sites and impair SMC recruitment. The C-terminal domain contains a HTH DNA-binding motif and is the primary homo-dimerization domain, and binds to parS DNA sites. Additional homo-dimerization contacts are found along the N-terminal domain, but dimerization of the N-terminus may only occur after concentration at ParB-parS foci.	85
319259	cd16402	ParB_N_like_MT	ParB N-terminal-like domain, some attached to C-terminal S-adenosylmethionine-dependent methyltransferase domain. This family represents domains related to the N-terminal domain of ParB, a DNA-binding component of the prokaryotic parABS partitioning system, fused to a variety of C-terminal domains, including S-adenosylmethionine-dependent methyltransferase-like domains and DUF4417. parABS contributes to the efficient segregation of chromosomes and low-copy number plasmids to daughter cells during prokaryotic cell division. The process includes the parA (Walker box) ATPase, the ParB DNA-binding protein and parS cis-acting DNA sites. Binding of ParB to centromere-like parS sites is followed by non-specific binding to DNA ("spreading", which has been implicated in gene silencing in plasmid P1) and oligomerization of additional ParB molecules near the parS sites. It has been proposed that ParB-ParB cross-linking compacts the DNA, binds to parA via the N-terminal region, and leads to parA separating the ParB-parS complexes and the recruitment of the SMC (structural maintenance of chromosomes) complexes. The ParB N-terminal domain of Bacillus subtilis and other species contains an arginine-rich ParB Box II with residues essential for bridging of the ParB-parS complexes. The arginine-rich ParB Box II consensus (I[VIL]AGERR[FYW]RA[AS]) identified in several species is partially conserved with this family and related families. Mutations within the basic columns particularly debilitate spreading from the parS sites and impair SMC recruitment. The C-terminal domain contains a HTH DNA-binding motif and is the primary homo-dimerization domain, and binds to parS DNA sites. Additional homo-dimerization contacts are found along the N-terminal domain, but dimerization of the N-terminus may only occur after concentration at ParB-parS foci.	87
319260	cd16403	ParB_N_like_MT	ParB N-terminal-like domain, some attached to C-terminal S-adenosylmethionine-dependent methyltransferase. This family represents domains related to the N-terminal domain of ParB, a DNA-binding component of the prokaryotic parABS partitioning system, fused to a variety of C-terminal domains, including S-adenosylmethionine-dependent methyltransferase-like domains. parABS contributes to the efficient segregation of chromosomes and low-copy number plasmids to daughter cells during prokaryotic cell division. The process includes the parA (Walker box) ATPase, the ParB DNA-binding protein and a parS cis-acting DNA sites. Binding of ParB to centromere-like parS sites is followed by non-specific binding to DNA ("spreading", which has been implicated in gene silencing in plasmid P1) and oligomerization of additional ParB molecules near the parS sites. It has been proposed that ParB-ParB cross-linking compacts the DNA, binds to parA via the N-terminal region, and leads to parA separating the ParB-parS complexes and the recruitment of the SMC (structural maintenance of chromosomes) complexes. The ParB N-terminal domain of Bacillus subtilis and other species contains a Arginine-rich ParB Box II with residues essential for bridging of the ParB-parS complexes. The arginine-rich ParB Box II consensus (I[VIL]AGERR[FYW]RA[AS] identified in several species is partially conserved with this family and related families. Mutations within the basic columns particularly debilitate spreading from the parS sites and impair SMC recruitment. The C-terminal domain contains a HTH DNA-binding motif and is the primary homo-dimerization domain, and binds to parS DNA sites. Additional homo-dimerization contacts are found along the N-terminal domain, but dimerization of the N-terminus may only occur after concentration at ParB-parS foci.	88
319261	cd16404	pNOB8_ParB_N_like	pNOB8 ParB-like N-terminal domain, plasmid partitioning system protein domain. Archaeal pNOB8 ParB acts in a plasmid partitioning system made up of 3 parts: AspA, ParA motor protein, and ParB, which links ParA to the protein-DNA superhelix. As demonstrated in Sulfolobus, AspA spreads along DNA, which allows ParB binding, and links to the Walker-motif containing ParA motor protein. The Sulfolobus ParB C-terminal domain resembles eukaryotic segregation protein CenpA, and other histones. This family is related to the N-terminal domain of ParB (Spo0J in Bacillus subtilis), a DNA-binding component of the prokaryotic parABS partitioning system and related proteins.	69
319262	cd16405	RepB_like_N	plasmid segregation replication protein B like protein, N-terminal domain. RepB, found on plasmids and secondary chromosomes, works along with repA in directing plasmid segregation, and has been shown in Rhizobium etli to require the parS centromere-like sequence for full transcriptional repression of the repABC operon, inducing plasmid incompatibility. RepA is a Walker-type ATPase that complexes with RepB to form DNA-protein complexes in the presence of ATP/ADP. RepC is an initiator protein for the plasmid. repA and repB are homologous to the parA and ParB genes of the parABS partitioning system found on primary chromosomes.	91
319263	cd16406	ParB_N_like	ParB N-terminal, parA-binding, -like domain of bacterial and plasmid parABS partitioning systems. This family represents the N-terminal domain of ParB, a DNA-binding component of the prokaryotic parABS partitioning system. parABS contributes to the efficient segregation of chromosomes and low-copy number plasmids to daughter cells during prokaryotic cell division. The process includes the parA (Walker box) ATPase, the ParB DNA-binding protein and parS cis-acting DNA sites. Binding of ParB to centromere-like parS sites is followed by non-specific binding to DNA ("spreading", which has been implicated in gene silencing in plasmid P1) and oligomerization of additional ParB molecules near the parS sites. It has been proposed that ParB-ParB cross-linking compacts the DNA, binds to parA via the N-terminal region, and leads to parA separating the ParB-parS complexes and the recruitment of the SMC (structural maintenance of chromosomes) complexes. The ParB N-terminal domain of Bacillus subtilis and other species contains an arginine-rich ParB Box II with residues essential for bridging of the ParB-parS complexes. The arginine-rich ParB Box II consensus (I[VIL]AGERR[FYW]RA[AS]) identified in several species is partially conserved in this family and related families. Mutations within the basic columns particularly debilitate spreading from the parS sites and impair SMC recruitment. The C-terminal domain contains a HTH DNA-binding motif and is the primary homo-dimerization domain, and binds to parS DNA sites. Additional homo-dimerization contacts are found along the N-terminal domain, but dimerization of the N-terminus may only occur after concentration at ParB-parS foci.	82
319264	cd16407	ParB_N_like	ParB N-terminal, parA-binding, -like domain of bacterial and plasmid parABS partitioning systems. This family represents the N-terminal domain of ParB, a DNA-binding component of the prokaryotic parABS partitioning system. parABS contributes to the efficient segregation of chromosomes and low-copy number plasmids to daughter cells during prokaryotic cell division. The process includes the parA (Walker box) ATPase, the ParB DNA-binding protein and parS cis-acting DNA sites. Binding of ParB to centromere-like parS sites is followed by non-specific binding to DNA ("spreading", which has been implicated in gene silencing in plasmid P1) and oligomerization of additional ParB molecules near the parS sites. It has been proposed that ParB-ParB cross-linking compacts the DNA, binds to parA via the N-terminal region, and leads to parA separating the ParB-parS complexes and the recruitment of the SMC (structural maintenance of chromosomes) complexes. The ParB N-terminal domain of Bacillus subtilis and other species contains an arginine-rich ParB Box II with residues essential for bridging of the ParB-parS complexes. The arginine-rich ParB Box II consensus (I[VIL]AGERR[FYW]RA[AS]) identified in several species is partially conserved in this family and related families. Mutations within the basic columns particularly debilitate spreading from the parS sites and impair SMC recruitment. The C-terminal domain contains a HTH DNA-binding motif and is the primary homo-dimerization domain, and binds to parS DNA sites. Additional homo-dimerization contacts are found along the N-terminal domain, but dimerization of the N-terminus may only occur after concentration at ParB-parS foci.	86
319265	cd16408	ParB_N_like	ParB N-terminal, parA-binding, -like domain of bacterial and plasmid parABS partitioning systems. This family represents the N-terminal domain of ParB, a DNA-binding component of the prokaryotic parABS partitioning system. parABS contributes to the efficient segregation of chromosomes and low-copy number plasmids to daughter cells during prokaryotic cell division. The process includes the parA (Walker box) ATPase, the ParB DNA-binding protein and parS cis-acting DNA sites. Binding of ParB to centromere-like parS sites is followed by non-specific binding to DNA ("spreading", which has been implicated in gene silencing in plasmid P1) and oligomerization of additional ParB molecules near the parS sites. It has been proposed that ParB-ParB cross-linking compacts the DNA, binds to parA via the N-terminal region, and leads to parA separating the ParB-parS complexes and the recruitment of the SMC (structural maintenance of chromosomes) complexes. The ParB N-terminal domain of Bacillus subtilis and other species contains an arginine-rich ParB Box II with residues essential for bridging of the ParB-parS complexes. The arginine-rich ParB Box II consensus (I[VIL]AGERR[FYW]RA[AS]) identified in several species is partially conserved in this family and related families. Mutations within the basic columns particularly debilitate spreading from the parS sites and impair SMC recruitment. The C-terminal domain contains a HTH DNA-binding motif and is the primary homo-dimerization domain, and binds to parS DNA sites. Additional homo-dimerization contacts are found along the N-terminal domain, but dimerization of the N-terminus may only occur after concentration at ParB-parS foci.	84
319266	cd16409	ParB_N_like	ParB N-terminal-like domain of bacterial and plasmid parABS partitioning systems. This family represents the N-terminal domain of ParB, a DNA-binding component of the prokaryotic parABS partitioning system. parABS contributes to the efficient segregation of chromosomes and low-copy number plasmids to daughter cells during prokaryotic cell division. The process includes the parA (Walker box) ATPase, the ParB DNA-binding protein and parS cis-acting DNA sites. Binding of ParB to centromere-like parS sites is followed by non-specific binding to DNA ("spreading", which has been implicated in gene silencing in plasmid P1) and oligomerization of additional ParB molecules near the parS sites. It has been proposed that ParB-ParB cross-linking compacts the DNA, binds to parA via the N-terminal region, and leads to parA separating the ParB-parS complexes and the recruitment of the SMC (structural maintenance of chromosomes) complexes. The ParB N-terminal domain of Bacillus subtilis and other species contains an arginine-rich ParB Box II with residues essential for bridging of the ParB-parS complexes. The arginine-rich ParB Box II consensus (I[VIL]AGERR[FYW]RA[AS]) identified in several species is partially conserved in this family and related families. Mutations within the basic columns particularly debilitate spreading from the parS sites and impair SMC recruitment. The C-terminal domain contains a HTH DNA-binding motif and is the primary homo-dimerization domain, and binds to parS DNA sites. Additional homo-dimerization contacts are found along the N-terminal domain, but dimerization of the N-terminus may only occur after concentration at ParB-parS foci.	74
319267	cd16410	ParB_N_like	ParB N-terminal, parA-binding, -like domain of bacterial and plasmid parABS partitioning systems. This family represents the N-terminal domain of ParB, a DNA-binding component of the prokaryotic parABS partitioning system. parABS contributes to the efficient segregation of chromosomes and low-copy number plasmids to daughter cells during prokaryotic cell division. The process includes the parA (Walker box) ATPase, the ParB DNA-binding protein and parS cis-acting DNA sites. Binding of ParB to centromere-like parS sites is followed by non-specific binding to DNA ("spreading", which has been implicated in gene silencing in plasmid P1) and oligomerization of additional ParB molecules near the parS sites. It has been proposed that ParB-ParB cross-linking compacts the DNA, binds to parA via the N-terminal region, and leads to parA separating the ParB-parS complexes and the recruitment of the SMC (structural maintenance of chromosomes) complexes. The ParB N-terminal domain of Bacillus subtilis and other species contains an arginine-rich ParB Box II with residues essential for bridging of the ParB-parS complexes. The arginine-rich ParB Box II consensus (I[VIL]AGERR[FYW]RA[AS]) identified in several species is partially conserved in this family and related families. Mutations within the basic columns particularly debilitate spreading from the parS sites and impair SMC recruitment. The C-terminal domain contains a HTH DNA-binding motif and is the primary homo-dimerization domain, and binds to parS DNA sites. Additional homo-dimerization contacts are found along the N-terminal domain, but dimerization of the N-terminus may only occur after concentration at ParB-parS foci.	80
319268	cd16411	ParB_N_like	ParB N-terminal, parA-binding, domain of bacterial and plasmid parABS partitioning systems. This family represents the N-terminal domain of ParB, a DNA-binding component of the prokaryotic parABS partitioning system. parABS contributes to the efficient segregation of chromosomes and low-copy number plasmids to daughter cells during prokaryotic cell division. The process includes the parA (Walker box) ATPase, the ParB DNA-binding protein and parS cis-acting DNA sites. Binding of ParB to centromere-like parS sites is followed by non-specific binding to DNA ("spreading", which has been implicated in gene silencing in plasmid P1) and oligomerization of additional ParB molecules near the parS sites. It has been proposed that ParB-ParB cross-linking compacts the DNA, binds to parA via the N-terminal region, and leads to parA separating the ParB-parS complexes and the recruitment of the SMC (structural maintenance of chromosomes) complexes. The ParB N-terminal domain of Bacillus subtilis and other species contains an arginine-rich ParB Box II with residues essential for bridging of the ParB-parS complexes. The arginine-rich ParB Box II consensus (I[VIL]AGERR[FYW]RA[AS]) identified in several species is partially conserved in this family and related families. Mutations within the basic columns particularly debilitate spreading from the parS sites and impair SMC recruitment. The C-terminal domain contains a HTH DNA-binding motif and is the primary homo-dimerization domain, and binds to parS DNA sites. Additional homo-dimerization contacts are found along the N-terminal domain, but dimerization of the N-terminus may only occur after concentration at ParB-parS foci.	90
319269	cd16412	dndB	DNA sulfur modification protein DndB. dndB acts in the regulation of DNA modifications, including DNA phosphorothioation. DndB may act by binding near the phosphorothioate modification site and regulating access of the Dnd modification machinery to DNA. These proteins show similarity to the N-terminal domain of ParB, a DNA-binding component of the prokaryotic parABS partitioning system, and other members of the ParB/Srx superfamily.	333
319270	cd16413	DGQHR_domain	DGQHR motif containing domain. Uncharacterized diverse domain family with conserved DGQHR motif, in addition to QR and FXXXN motifs. Some proteins have been identified as parts of DNA phosphorothioation systems. Related to dndB, which acts in the regulation of DNA modifications, including DNA phosphorothioation. These proteins show similarity to the N-terminal domain of ParB, a DNA-binding component of the prokaryotic parABS partitioning system, and other members of the ParB/Srx superfamily.	229
319271	cd16414	dndB_like	DNA-sulfur modification-associated domain. Family of proteins related to dndB. dndB acts in the regulation of DNA modifications, including DNA phosphorothioation. Both have a conserved DGQHR sequence motif. These proteins show similarity to the N-terminal domain of ParB, a DNA-binding component of the prokaryotic parABS partitioning system, and other members of the ParB/Srx superfamily.	238
319852	cd16415	HAD_dREG-2_like	uncharacterized family of the haloacid dehalogenase-like superfamily, similar to uncharacterized Drosophila melanogaster rhythmically expressed gene 2 protein and human haloacid dehalogenase-like hydrolase domain-containing protein 3. The haloacid dehalogenase-like (HAD) hydrolases are a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members include 2-L-haloalkanoic acid dehalogenase (C-Cl bond hydrolysis), azetidine hydrolase (C-N bond hydrolysis); phosphonoacetaldehyde hydrolase (C-P bond hydrolysis), phosphoserine phosphatase and phosphomannomutase (CO-P bond hydrolysis), P-type ATPases (PO-P bond hydrolysis) and many others. Members are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	128
319853	cd16416	HAD_BsYqeG-like	Uncharacterized family of the haloacid dehalogenase-like superfamily, similar to the uncharacterized protein Bacillus subtilis YqeG. The haloacid dehalogenase-like (HAD) hydrolases are a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members include 2-L-haloalkanoic acid dehalogenase (C-Cl bond hydrolysis), azetidine hydrolase (C-N bond hydrolysis); phosphonoacetaldehyde hydrolase (C-P bond hydrolysis), phosphoserine phosphatase and phosphomannomutase (CO-P bond hydrolysis), P-type ATPases (PO-P bond hydrolysis) and many others. Members are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	108
319854	cd16417	HAD_PGPase	Escherichia coli Gph phosphoglycolate phosphatase and related proteins; belongs to the haloacid dehalogenase-like superfamily. Phosphoglycolate phosphatase (PGP; EC 3.1.3.18) catalyzes the conversion of 2-phosphoglycolate into glycolate and phosphate. Members of this family belong to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase (C-Cl bond hydrolysis), azetidine hydrolase (C-N bond hydrolysis); phosphonoacetaldehyde hydrolase (C-P bond hydrolysis), phosphoserine phosphatase and phosphomannomutase (CO-P bond hydrolysis), P-type ATPases (PO-P bond hydrolysis) and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	212
319855	cd16418	HAD_Pase	phosphatases, similar to human PHOSPHO1 and PHOSPHO2 phosphatases; belongs to the haloacid dehalogenase-like superfamily. This family includes phosphatases with different substrate specificities. Human PHOSPHO1 is a phosphoethanolamine/phosphocholine phosphatase associated with high levels of expression at mineralizing regions of bone and cartilage and is thought to be involved in the generation of inorganic phosphate for bone mineralization. Human PHOSPHO2 is a putative phosphatase which shows high specific activity toward pyridoxal-5-phosphate; PHOSPHO2 is not specific to bone but is expressed in a wide range of soft tissues. These belong to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase (C-Cl bond hydrolysis), azetidine hydrolase (C-N bond hydrolysis); phosphonoacetaldehyde hydrolase (C-P bond hydrolysis), phosphoserine phosphatase and phosphomannomutase (CO-P bond hydrolysis), P-type ATPases (PO-P bond hydrolysis) and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	130
319856	cd16419	HAD_SPS	sucrose-phosphate synthase; belongs to the haloalkanoic acid dehalogenase (HAD) superfamily. Sucrose phosphate synthase (SPS; EC 2.4.1.14), also known as UDP-glucose-fructose-phosphate glucosyltransferase, catalyzes the transfer of a hexosyl group from UDP-glucose to D-fructose 6-phosphate to form UDP and D-sucrose-6-phosphate; this is the rate-limiting step of sucrose synthesis. Members of this family belong to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase (C-Cl bond hydrolysis), azetidine hydrolase (C-N bond hydrolysis); phosphonoacetaldehyde hydrolase (C-P bond hydrolysis), phosphoserine phosphatase and phosphomannomutase (CO-P bond hydrolysis), P-type ATPases (PO-P bond hydrolysis) and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	174
319857	cd16421	HAD_PGPase	Rhodobacter capsulatus Cbbz phosphoglycolate phosphatase and related proteins; belongs to the haloacid dehalogenase-like superfamily. Phosphoglycolate phosphatase (PGPase; EC 3.1.3.18) catalyzes the conversion of 2-phosphoglycolate into glycolate and phosphate. Members of this family belong to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase (C-Cl bond hydrolysis), azetidine hydrolase (C-N bond hydrolysis); phosphonoacetaldehyde hydrolase (C-P bond hydrolysis), phosphoserine phosphatase and phosphomannomutase (CO-P bond hydrolysis), P-type ATPases (PO-P bond hydrolysis) and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	105
319858	cd16422	HAD_Pase_UmpH-like	uncharacterized subfamily of the UmpH/NagD phosphatase family, belongs to the haloacid dehalogenase-like superfamily. This uncharacterized subfamily belongs to the UmpH/NagD phosphatase family and to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	247
319859	cd16423	HAD_BPGM-like	uncharacterized subfamily of beta-phosphoglucomutase-like family, similar to uncharacterized Bacillus subtilis YhcW. This subfamily includes the uncharacterized Bacillus subtilis YhcW. It belongs to the beta-phosphoglucomutase-like family whose other members include Lactococcus lactis beta-PGM, a mutase which catalyzes the interconversion of beta-D-glucose 1-phosphate (G1P) and D-glucose 6-phosphate (G6P), Saccharomyces cerevisiae phosphatases GPP1 and GPP2 that dephosphorylate DL-glycerol-3-phosphate and DOG1 and DOG2 that dephosphorylate 2-deoxyglucose-6-phosphate, and Escherichia coli 6-phosphogluconate phosphatase YieH. This family belongs to the haloacid dehalogenase-like (HAD) hydrolases, a large superfamily of diverse enzymes that catalyze carbon or phosphoryl group transfer reactions on a range of substrates, using an active site aspartate in nucleophilic catalysis. Members of this superfamily include 2-L-haloalkanoic acid dehalogenase, azetidine hydrolase, phosphonoacetaldehyde hydrolase, phosphoserine phosphatase, phosphomannomutase, P-type ATPases and many others. HAD hydrolases are found in all three kingdoms of life, and most genomes are predicted to contain multiple HAD-like proteins. Members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. HAD hydrolases are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases.	169
319761	cd16424	VirB8	periplasmic domain of VirB8 protein. This family contains the periplasmic domain of VirB8 protein which is an essential assembly factor of type IV secretion system (T4SS) channel proteins that are highly diverse in function relative to other bacterial secretion systems. T4SS proteins are important virulence factors for many Gram-negative pathogens, such as Agrobacterium, Brucella, Legionella, and Helicobacter. VirB8 is a bacterial virulence protein with cytoplasmic, transmembrane, and periplasmic regions. It is thought that VirB8 is a primary constituent of a DNA transporter. It is a crucial structural and functional component of the T4SS, with interactions between VirB8 and many other T4SS proteins, including VirB10, VirB9, VirB1, VirB4, and VirB11, as well as with itself.	137
319762	cd16425	TrbF	conjugal transfer protein TrbF. This family includes the conjugal transfer protein family TrbF, a family of proteins known to be involved in conjugal transfer. The TrbF protein is thought to compose part of the pilus required for transfer. This domain is similar to the type IV secretion system (T4ASS) component VirB8 and possibly has a similar fold to the nuclear transport factor-2 (NTF-2)-like superfamily.	133
319754	cd16426	VirB10_like	VirB10 and similar proteins form part of core complex in Type IV secretion system (T4SS). This family contains VirB10, a component of the type IV secretion system (T4SS) and its homologs, including TraB, TraF, IcmE, and similar proteins. The T4S system is employed by pathogenic bacteria to export virulence DNAs and/or proteins directly from the bacterial cytoplasm into the host cell. It forms a large multiprotein complex consisting of 12 proteins termed VirB1-11 and VirD4. VirB10 interacts with VirB7 and VirB9, forming the membrane-spanning 'core complex' (CC), around which all other components assemble. The CC is inserted in both the outer and inner membranes, playing a fundamental role as a scaffold for the rest of the T4SS components and actively participating in T4S substrate transfer through the bacterial envelope via conformational changes regulating channel opening and closing. In the Gram-negative bacterial pathogen Helicobacter pylori, an important aetiological agent of gastroduodenal disease in humans, the comB3 gene encodes a protein with best homologies to TraS/TraB from the Pseudomonas aeruginosa conjugative plasmid RP1 and TrbI of plasmid RP4 and VirB10 from the Ti plasmid of Agrobacterium tumefaciens, as well as DotG/IcmE of Legionella pneumophila.	151
319758	cd16427	TraM-like	C-terminal domain of transfer protein TraM. This family contains the C-terminal domain of transfer protein TraM of the G+ broad host range Enterococcus conjugative plasmid pIP501 and similar proteins. The protein localizes to the cell envelope and is part of the plasmid transfer system that is accessible from outside of the cell. TraM displays a fold similar to the type IV secretion system (T4SS) VirB8 proteins from A. tumefaciens and B. suis (G-) and to the transfer protein TcpC from C. perfringens plasmid pCW3 (G+), reinforcing the prediction that TraM performs a key role in the secretion process, which is underlined by its surface accessibility. It is also structurally related to members of the nuclear transport factor-2 (NTF-2)-like superfamily with a high similarity to the NTF-2 protein from Rattus norvegicus. TraM (categorized as T4SS VirB8-like class GAMMA) does not possess the binding pocket of classic VirB8 class ALPHA proteins, recognized by VirB8 interaction inhibitors.	108
319759	cd16428	TcpC_C	C-terminal domain of conjugative transposon protein TcpC. This family contains the C-terminal domain of conjugation protein TcpC and similar proteins. TcpC is required for efficient conjugative transfer, localizing to the cell membrane independently of other conjugation proteins, where membrane localization is important for its function, oligomerization and interaction with the conjugation proteins TcpA, TcpH, and TcpG. The C-terminal domain is critical for interactions with these other conjugation proteins. TcpC has low level sequence identity to proteins encoded by the conjugative transposon Tn916, which is responsible for a large proportion of the tetracycline resistance in different pathogens.	97
319755	cd16429	VirB10	VirB10 forms part of core complex in Type IV secretion system (T4SS). This family contains VirB10, a component of the type IV secretion system (T4SS), including homologs TrbI, TraF, TrwE and TraL. The T4S system is employed by pathogenic bacteria to export virulence DNAs and/or proteins directly from the bacterial cytoplasm into the host cell. It forms a large multiprotein complex consisting of 12 proteins termed VirB1-11 and VirD4. VirB10 interacts with VirB7 and VirB9, forming the membrane-spanning 'core complex' (CC), around which all other components assemble. The CC is inserted in both the outer and inner membranes, playing a fundamental role as a scaffold for the rest of the T4SS components and actively participating in T4S substrate transfer through the bacterial envelope via conformational changes regulating channel opening and closing. TrwE in R33 plasmid has been shown to be anchored to the inner membrane and its C-terminal is necessary for conjugation; the transmembrane domains of TrwB and TrwE are involved in TrwB-TrwE interactions. TraF protein of the RP4 plasmid is involved in circularization of pilin subunits of P-type pili. In gonococcal genetic island (GGI) of Neisseria gonorrhoeae, T4SS encodes TrbI and circularization occurs via a covalent intermediate between the C terminus of putative pilin protein TraA and TrbI.	180
319756	cd16430	TraB	TraB is a homolog of VirB10 which forms part of core complex in Type IV secretion system (T4SS). This family contains TraB (VirB10 homolog) and a component of the type IV secretion system (T4SS), and similar proteins. The T4S system is employed by pathogenic bacteria to export virulence DNAs and/or proteins directly from the bacterial cytoplasm into the host cell. It forms a large multiprotein complex consisting of 12 proteins termed VirB1-11 and VirD4. VirB10 interacts with VirB7 and VirB9, forming the membrane-spanning 'core complex' (CC), around which all other components assemble. The CC is inserted in both the outer and inner membranes, playing a fundamental role as a scaffold for the rest of the T4SS components and actively participating in T4S substrate transfer through the bacterial envelope via conformational changes regulating channel opening and closing. TraB is localized similarly to related proteins in other systems, but unlike in other systems, Neisseria gonorrhoeae TraB does not require the presence of other T4SS components for proper localization. It has been shown to be expressed with TraK (VirB9 homolog) at low levels in wild-type cells, suggesting that gonococcal T4SS may be present in single copy per cell and the small amounts of these proteins are sufficient for DNA secretion.	203
319757	cd16431	IcmE	DotG/IcmE is a homolog of VirB10 which forms part of core complex in Type IV secretion system. This family contains DotG/IcmE (VirB10 homolog) and a component of the type IV secretion system (T4SS), and similar proteins. The Dot/Icm system is a T4SS found in the pathogens Legionella and Coxiella and the conjugative apparatus of IncI plasmids; T4SS is employed by pathogenic bacteria to export virulence DNAs and/or proteins directly from the bacterial cytoplasm into the host cell. Similar to T4SS VirB/D components, the Legionella Dot/Icm secretion apparatus contains a critical five-protein sub-assembly that forms the membrane-spanning 'core-complex' (CC), around which all other components assemble. This transmembrane connection is mediated by protein dimer pairs consisting of two inner membrane proteins, DotF and DotG, each independently associating with DotH/DotC/DotD in the outer membrane.	179
319751	cd16432	CheB_Rec	Chemotaxis response regulator protein-glutamate methylesterase, CheB, with N-terminal REC domain. This family contains the methylesterase CheB (EC 3.1.1.61; also known as CheB methylesterase, chemotaxis-specific methylesterase, methyl-accepting chemotaxis protein methyl-esterase, or protein methyl-esterase) domain with a REC domain at the N-terminus. CheB is a phosphorylation-activated response regulator involved in reversible modification of bacterial chemotaxis receptors. Signaling output of the chemotaxis receptors is modulated by CheB and methyltransferase CheR by controlling the level of receptor methylation. The N-terminal regulatory (REC) domain blocks the active site of the C-terminal domain until it is phosphorylated. Reversible methylation of transmembrane chemoreceptors plays an important role in ligand-dependent signaling and cellular adaptation in bacterial chemotaxis. Phosphorylated CheB catalyzes deamidation of specific glutamine residues in the cytoplasmic region of the chemoreceptors and demethylation of specific methyl glutamate residues introduced into the chemoreceptors by CheR.	184
319752	cd16433	CheB	Chemotaxis response regulator protein-glutamate methylesterase, CheB. This family contains the methylesterase CheB (EC 3.1.1.61; also known as CheB methylesterase, chemotaxis-specific methylesterase, methyl-accepting chemotaxis protein methyl-esterase, or protein methyl-esterase) domain, a phosphorylation-activated response regulator involved in reversible modification of bacterial chemotaxis receptors. Signaling output of the chemotaxis receptors is modulated by CheB and methyltransferase CheR by controlling the level of receptor methylation. cheR and cheB have a strong preference to occur in the same operon, and a subgroup contains multidomain proteins with CheB-CheR fusions. Reversible methylation of transmembrane chemoreceptors plays an important role in ligand-dependent signaling and cellular adaptation in bacterial chemotaxis. Phosphorylated CheB catalyzes deamidation of specific glutamine residues in the cytoplasmic region of the chemoreceptors and demethylation of specific methyl glutamate residues introduced into the chemoreceptors by CheR.	181
319753	cd16434	CheB-CheR_fusion	Chemotaxis response regulator protein-glutamate methylesterase, CheB, fused with CheR domain. This family contains the methylesterase CheB (EC 3.1.1.61; also known as CheB methylesterase, chemotaxis-specific methylesterase, methyl-accepting chemotaxis protein methyl-esterase, or protein methyl-esterase) domain, a phosphorylation-activated response regulator involved in reversible modification of bacterial chemotaxis receptors, fused with a CheR domain as well as other domains. Signaling output of the chemotaxis receptors is modulated by CheB and methyltransferase CheR by controlling the level of receptor methylation. cheB and cheR are typically found in the same operon. However, CheB and CheR are fused in multi-domain proteins in this subgroup. The CheR protein/domain includes an all-alpha N-terminal domain and an S-adenosylmethionine-dependent methyltransferase C-terminal domain. Reversible methylation of transmembrane chemoreceptors plays an important role in ligand-dependent signaling and cellular adaptation in bacterial chemotaxis. Phosphorylated CheB catalyzes deamidation of specific glutamine residues in the cytoplasmic region of the chemoreceptors and demethylation of specific methyl glutamate residues introduced into the chemoreceptors by CheR.	180
319740	cd16435	BPL_LplA_LipB	biotin-lipoate ligase family. This family includes biotin protein ligase (BPL), lipoate-protein ligase A (LplA) and octanoyl-[acyl carrier protein]-protein acyltransferase (LipB). Biotin is covalently attached at the active site of certain enzymes that transfer carbon dioxide from bicarbonate to organic acids to form cellular metabolites. Biotin protein ligase (BPL) is the enzyme responsible for attaching biotin to a specific lysine at the active site of biotin enzymes. Biotin attachment is a two step reaction that results in the formation of an amide linkage between the carboxyl group of biotin and the epsilon-amino group of the modified lysine. Lipoate-protein ligase A (LplA) catalyses the formation of an amide linkage between lipoic acid and a specific lysine residue in lipoate dependent enzymes.	198
319744	cd16436	beta_Kdo_transferase	beta-3-deoxy-D-manno-oct-2-ulosonic acid (Kdo)-transferase. KpsC and KpsS are retaining beta-3-deoxy-D-manno-oct-2-ulosonic acid (Kdo)-transferases. They are part of the ATP-binding cassette transporter dependent capsular polysaccharides (CPSs) synthesis pathway, one of two CPS synthesis pathways present in Escherichia coli. The poly-Kdo linker is thought to be the common feature of CPSs synthesized via this pathway. CPSs are high-molecular-mass cell-surface polysaccharides that are important virulence factors for many pathogenic bacteria.	222
319745	cd16437	beta_Kdo_transferase_KpsC	beta-3-deoxy-D-manno-oct-2-ulosonic acid (Kdo)-transferase KpsC. KpsC is a beta-3-deoxy-D-manno-oct-2-ulosonic acid (Kdo)-transferase. It is part of the ATP-binding cassette transporter dependent capsular polysaccharides (CPSs) synthesis pathway, one of two CPS synthesis pathways present in Escherichia coli. The poly-Kdo linker is thought to be the common feature of CPSs synthesized via this pathway. CPSs are high-molecular-mass cell-surface polysaccharides, that are important virulence factors for many pathogenic bacteria. KpsC contains a domain duplication.	256
319746	cd16438	beta_Kdo_transferase_KpsS_like	beta-3-deoxy-D-manno-oct-2-ulosonic acid (Kdo)-transferase KpsS like. KpsS is a beta-3-deoxy-D-manno-oct-2-ulosonic acid (Kdo)-transferase. It is part of the ATP-binding cassette transporter dependent capsular polysaccharides (CPSs) synthesis pathway, one of two CPS synthesis pathways present in Escherichia coli. The poly-Kdo linker is thought to be the common feature of CPSs synthesized via this pathway. CPSs are high-molecular-mass cell-surface polysaccharides that are important virulence factors for many pathogenic bacteria.	254
319747	cd16439	beta_Kdo_transferase_KpsC_2	beta-3-deoxy-D-manno-oct-2-ulosonic acid (Kdo)-transferase KpsC, repeat 2. KpsC is a beta-3-deoxy-D-manno-oct-2-ulosonic acid (Kdo)-transferase. It is part of the ATP-binding cassette transporter dependent capsular polysaccharides (CPSs) synthesis pathway, one of two CPS synthesis pathways present in Escherichia coli. The poly-Kdo linker is thought to be the common feature of CPSs synthesized via this pathway. CPSs are high-molecular-mass cell-surface polysaccharides that are important virulence factors for many pathogenic bacteria. KpsC contains a domain duplication.	259
319748	cd16440	beta_Kdo_transferase_KpsC_1	beta-3-deoxy-D-manno-oct-2-ulosonic acid (Kdo)-transferase KpsC, repeat1. KpsC is a beta-3-deoxy-D-manno-oct-2-ulosonic acid (Kdo)-transferase. It is part of the ATP-binding cassette transporter dependent capsular polysaccharides (CPSs) synthesis pathway, one of two CPS synthesis pathways present in Escherichia coli. The poly-Kdo linker is thought to be the common feature of CPSs synthesized via this pathway. CPSs are high-molecular-mass cell-surface polysaccharides that are important virulence factors for many pathogenic bacteria. KpsC contains a domain duplication.	262
319749	cd16441	beta_Kdo_transferase_KpsS	beta-3-deoxy-D-manno-oct-2-ulosonic acid (Kdo)-transferase KpsS. KpsS is a beta-3-deoxy-D-manno-oct-2-ulosonic acid (Kdo)-transferase. It is part of the ATP-binding cassette transporter dependent capsular polysaccharides (CPSs) synthesis pathway, one of two CPS synthesis pathways present in Escherichia coli. The poly-Kdo linker is thought to be the common feature of CPSs synthesized via this pathway. CPSs are high-molecular-mass cell-surface polysaccharides that are important virulence factors for many pathogenic bacteria.	307
319741	cd16442	BPL	biotin protein ligase. Biotin protein ligase (EC 6.3.4.15) catalyzes the synthesis of an activated form of biotin, biotinyl-5'-AMP, from substrates biotin and ATP followed by biotinylation of the biotin carboxyl carrier protein subunit of acetyl-CoA carboxylase. Biotin protein ligase (BPL) is the enzyme responsible for attaching biotin to a specific lysine at the active site of biotin enzymes. Biotin attachment is a two step reaction that results in the formation of an amide linkage between the carboxyl group of biotin and the epsilon-amino group of the modified lysine.	173
319742	cd16443	LplA	lipoate-protein ligase. Lipoate-protein ligase A (LplA) catalyzes the formation of an amide linkage between free lipoic acid and a specific lysine residue of the lipoyl domain in lipoate dependent enzymes, similar to the biotinylation reaction mediated by biotinyl protein ligase (BPL). The two step reaction includes activation of exogenously supplied lipoic acid at the expense of ATP to lipoyl-AMP and then transfer to the epsilon-amino group of a specific lysine residue of the lipoyl domain of the target protein.	209
319743	cd16444	LipB	lipoyl/octanoyl transferase. Lipoate-protein ligase B is an octanoyl-[acyl carrier protein]-protein acyltransferase that catalyzes the first step of lipoic acid synthesis. It transfers endogenous octanoic acid attached via a thioester bond to acyl carrier protein (ACP) onto lipoyl domains, which is later converted by lipoate synthase LipA into lipoylated derivatives.	199
319362	cd16448	RING-H2	H2 subclass of RING (RING-H2) finger and its variants. RING finger is a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc. It is defined by the "cross-brace" motif that chelates zinc atoms by eight amino acid residues, typically Cys or His, arranged in a characteristic spacing. Canonical RING motifs have been categorized as two major subclasses, RING-HC (C3HC4-type) and RING-H2 (C3H2C3-type), according to their Cys/His content. There are also many variants of RING fingers. Some have different Cys/His pattern. Some lack a single Cys or His residues at typical Zn ligand positions. Especially, the fourth or eighth zinc ligand is prevalently exchanged for an Asp, which can indeed chelate Zn in a RING finger as well. This family corresponds to H2 subclass of RING (RING-H2) finger proteins that are characterized by containing C3H2C3-type canonical RING-H2 fingers or noncanonical RING-H2 finger variants, including C4HC3- (RING-CH alias RINGv), C3H3C2-, C3H2C2D-, C3DHC3-, and C4HC2H-type modified RING-H2 fingers. The canonical RING-H2 finger has been defined as C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-H-X2-C-X(4-48)-C-X2-C, X is any amino acid and the number of X residues varies in different fingers. It binds two Zn ions in a unique "cross-brace" arrangement, which distinguishes it from tandem zinc fingers and other similar motifs. RING-H2 finger can be found in a group of diverse proteins with a variety of cellular functions, including oncogenesis, development, viral replication, signal transduction, the cell cycle and apoptosis. Many of them are ubiquitin-protein ligases (E3s) that serves as a scaffold for binding to ubiquitin-conjugating enzymes (E2s, also referred to as ubiquitin carrier proteins or UBCs) in close proximity to substrate proteins, which enables efficient transfer of ubiquitin from E2 to the substrates.	44
319363	cd16449	RING-HC	HC subclass of RING (RING-HC) finger and its variants. RING finger is a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc. It is defined by the "cross-brace" motif that chelates zinc atoms by eight amino acid residues, typically Cys or His, arranged in a characteristic spacing. Canonical RING motifs have been categorized into two major subclasses, RING-HC (C3HC4-type) and RING-H2 (C3H2C3-type), according to their Cys/His content. There are also many variants of RING fingers. Some have a different Cys/His pattern. Some lack a single Cys or His residue at typical Zn ligand positions, especially, the fourth or eighth zinc ligand is prevalently exchanged for an Asp, which can chelate Zn in a RING finger as well. This family corresponds to HC subclass of RING (RING-HC) finger proteins that are characterized by containing C3HC4-type canonical RING-HC fingers or noncanonical RING-HC finger variants, including C4C4-, C3HC3D-, C2H2C4-, and C3HC5-type modified RING-HC fingers. The canonical RING-HC finger has been defined as C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-C-X2-C-X(4-48)-C-X2-C. It binds two Zn ions in a unique "cross-brace" arrangement, which distinguishes it from tandem zinc fingers and other similar motifs. RING-HC finger can be found in a group of diverse proteins with a variety of cellular functions, including oncogenesis, development, viral replication, signal transduction, the cell cycle, and apoptosis. Many of them are ubiquitin-protein ligases (E3s) that serve as scaffolds for binding to ubiquitin-conjugating enzymes (E2s, also referred to as ubiquitin carrier proteins or UBCs) in close proximity to substrate proteins, which enables efficient transfer of ubiquitin from E2 to the substrates.	39
319364	cd16450	mRING-C3HGC3_RFWD3	Modified RING finger, C3HGC3-type, found in RING finger and WD repeat domain-containing protein 3 (RFWD3) and similar proteins. RFWD3, also known as RING finger protein 201 (RNF201) or FLJ10520, is an E3 ubiquitin-protein ligase that forms a complex with Mdm2 and p53 to synergistically ubiquitinate p53 and acts as a positive regulator of p53 stability in response to DNA damage. It is phosphorylated by checkpoint kinase ATM/ATR and the phosphorylation mutant fails to stimulate p53 ubiquitination. RFWD3 also functions as a novel replication protein A (RPA)-associated protein involved in DNA replication checkpoint control. RFWD3 contains an N-terminal SQ-rich region followed by a RING finger domain that exhibits robust E3 ubiquitin ligase activity toward p53, a coiled-coil domain and three WD40 repeats in the C-terminus, the latter two of which may be responsible for protein-protein interaction. The RING finger in this family is a modified C3HGC3-type RING finger, but not a canonical C3H2C3-type RING-H2 finger or C3HC4-type RING-HC finger.	49
319365	cd16451	mRING_PEX12	Modified RING finger found in peroxin-12 (PEX12) and similar proteins. PEX12, also known as peroxisome assembly protein 12 or peroxisome assembly factor 3 (PAF-3), is a RING finger domain-containing integral membrane peroxin required for protein import into peroxisomes. Mutations in human PEX12 result in the peroxisome deficiency Zellweger syndrome of complementation group III (CG-III), a lethal neurological disorder. PEX12 also functions as an E3-ubiquitin ligase that facilitates the PEX4-dependent monoubiquitination of PEX5, a key player in peroxisomal matrix protein import, to control PEX5 receptor recycling or degradation. PEX12 contains a modified RING finger that lacks the third, fourth, and eighth zinc-binding residues of the consensus RING finger motif, suggesting PEX12 may only bind one zinc ion.	42
319366	cd16452	SP-RING_like	A group of variants of RING finger including SP-RING finger, SPL-RING finger, dRING finger, and RING-like Rtf2 domain. The family corresponds to a group of proteins with variants of RING fingers that are characterized by lacking the second, fifth, and sixth Zn2+ ion-coordinating residues compared with the classic C3H2C3-/C3HC4-type RING fingers. They include SP-RING finger found in the Siz/PIAS RING (SP-RING) family of SUMO E3 ligases, SPL-RING finger found in E3 SUMO-protein ligase NSE2, degenerated RING (dRING) finger found in Saccharomyces cerevisiae required for meiotic nuclear division protein 5 (Rmd5p) and homologs, and RING-like Rtf2 domain found in the replication termination factor 2 (Rtf2) protein family. The SP-RING family includes PIAS (protein inhibitor of activated STAT) proteins, Zmiz proteins, and Siz proteins from plants and fungi. The PIAS (protein inhibitor of activated STAT) protein family modulates the activity of several transcription factors and acts as an E3 ubiquitin ligase in the sumoylation pathway. NSE2, also known as MMS21 homolog (MMS21) or non-structural maintenance of chromosomes element 2 homolog (Non-SMC element 2 homolog, NSMCE2), is an autosumoylating small ubiquitin-like modifier (SUMO) ligase required for the response to DNA damage. It regulates sumoylation and nuclear-to-cytoplasmic translocation of skeletal and heart muscle-specific variant of the alpha subunit of nascent polypeptide associated complex (skNAC)-Smyd1 in myogenesis. It is also required for resisting extrinsically induced genotoxic stress. Rmd5p, also known as glucose-induced degradation protein 2 (Gid2) or sporulation protein RMD5, is an E3 ubiquitin ligase that forms the heterodimeric E3 ligase unit of the glucose induced degradation deficient (GID) complex with Gid9 (also known as Fyv10), which has a degenerated RING finger as well. 
The GID complex triggers polyubiquitylation and subsequent proteasomal degradation of the gluconeogenic enzymes fructose-1,6-bisphosphatase (FBPase), phosphoenolpyruvate carboxykinase (PEPCK), and cytoplasmic malate dehydrogenase (c-MDH). The Rtf2 protein family includes a group of conserved proteins found in eukaryotes ranging from fission yeast to humans. The defining member of the family is Schizosaccharomyces pombe Rtf2 (SpRtf2), which is a proliferating cell nuclear antigen-interacting protein that functions as a key requirement for efficient replication termination at the site-specific replication barrier RTS1. It promotes termination at RTS1 by preventing replication restart. The RING-like Rtf2 domain in fission yeast is required to stabilize a paused DNA replication fork during imprinting at the mating type locus, possibly by facilitating sumoylation of PCNA. The family also includes Arabidopsis RTF2 (AtRTF2), an essential nuclear protein required for both normal embryo development and for proper expression of the GFP reporter gene. It plays a critical role in splicing the GFP pre-mRNA, and may also have a more transient regulatory role during the spliceosome cycle. The biological function of Rtf2 homologs found in eumetazoa remains unclear.	42
319367	cd16453	RING-Ubox	U-box domain, a modified RING finger. The U-box protein family is a family of E3 enzymes that also includes the HECT family and the RING finger family. E3 enzyme is ubiquitin-protein ligase that cooperates with a ubiquitin-activating enzyme (E1) and a ubiquitin-conjugating enzyme (E2), and plays a central role in determining the specificity of the ubiquitination system. It removes the ubiquitin molecule from E2 enzyme and attaches it to the target substrate, forming a covalent bond between ubiquitin and the target. U-box proteins are characterized by the presence of a U-box domain of approximately 70 amino acids. U-box is a modified form of the RING finger domain that lacks metal chelating cysteines and histidines. It resembles the cross-brace RING structure consisting of three beta-sheets and a single alpha-helix, which would be stabilized by salt bridges instead of chelated metal ions. U-box proteins are widely distributed among eukaryotic organisms and show a higher prevalence in plants than in other organisms.	40
319368	cd16454	RING-H2_PA-TM-RING	RING finger, H2 subclass, found in the PA-TM-RING ubiquitin ligase family. The PA-TM-RING family represents a group of transmembrane-type E3 ubiquitin ligases, which has been characterized by an N-terminal transient signal peptide, a PA (protease-associated) domain, a TM (transmembrane) domain, as well as a C-terminal C3H2C3-type RING-H2 finger domain. It includes RNF13, RNF167, ZNRF4 (zinc and RING finger 4), GRAIL (gene related to anergy in lymphocytes)/RNF128, RNF130, RNF133, RNF148, RNF149 and RNF150 (which are more closely related), as well as RNF43 and ZNRF3 which have substantially longer C-terminal tail extensions compared with the others. PA-TM-RING proteins are expressed at low levels in all mammalian tissues and species, but they are not present in yeast. They play a common regulatory role in intracellular trafficking/sorting, suggesting that abrogation of their function may result in dysregulation of cellular signaling events in cancer.	43
319369	cd16455	RING-H2_AMFR	RING finger, H2 subclass, found in autocrine motility factor receptor (AMFR) and similar proteins. AMFR, also known as AMF receptor, or RING finger protein 45, or ER-protein gp78, is an internalizing cell surface glycoprotein localized in both plasma membrane caveolae and the endoplasmic reticulum (ER). It is involved in the regulation of cellular adhesion, proliferation, motility and apoptosis, as well as in the process of learning and memory. AMFR also functions as a RING finger-dependent ubiquitin protein ligase (E3) implicated in degradation from the ER. AMFR contains an N-terminal RING-H2 finger and a C-terminal ubiquitin-associated (UBA)-like CUE domain.	44
319370	cd16456	RING-H2_APC11	RING finger, H2 subclass, found in anaphase-promoting complex subunit 11 (APC11) and similar proteins. APC11, also known as cyclosome subunit 11, or hepatocellular carcinoma-associated RING finger protein, is a C3H2C3-type RING-H2 protein that facilitates ubiquitin chain formation by recruiting ubiquitin-charged ubiquitin conjugating enzymes (E2) through its RING-H2 domain. APC11 and its partner the cullin-like subunit APC2 form the dynamic catalytic core of the gigantic, multisubunit 1.2-MDa anaphase-promoting complex/cyclosome (APC), also known as the cyclosome, which is a ubiquitin-protein ligase (E3) composed of at least 12 subunits and controls cell division by ubiquitinating cell cycle regulators, such as cyclin B and securin, to drive their timely degradation. APC11 can be inhibited by hydrogen peroxide, which may contribute to the delay in cell cycle progression through mitosis that is characteristic of cells subjected to oxidative stress. APC11 contains a canonical RING-H2-finger domain, which includes one histidine and seven cysteine residues that coordinate two Zn2+ ions. In addition, it contains a third Zn2+-binding site and the third Zn2+ ion is not essential for its ligase activity.	63
319371	cd16457	RING-H2_BRAP2	RING finger, H2 subclass, found in BRCA1-associated protein (BRAP2) and similar proteins. BRAP2, also known as impedes mitogenic signal propagation (IMP), RING finger protein 52, or renal carcinoma antigen NY-REN-63, is a novel cytoplasmic protein interacting with the two functional nuclear localization signal (NLS) motifs of BRCA1, a nuclear protein linked to breast cancer. It also binds to the SV40 large T antigen NLS motif and the bipartite NLS motif found in mitosin. BRAP2 serves as a cytoplasmic retention protein and plays a role in the regulation of nuclear protein transport. It contains an N-terminal RNA recognition motif (RRM), also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), followed by a C3H2C3-type RING-H2 finger and a UBP-type zinc finger.	44
319372	cd16458	RING-H2_Dmap_like	RING finger, H2 subclass, found in defective in mitotic arrest proteins (Dmap) and similar proteins. The subfamily includes one Schizosaccharomyces pombe protein Dma1 (SpDma1p), two Saccharomyces cerevisiae proteins, Dma1 (ScDma1p) and Dma2 (ScDma2p), and their homologs from fungi. SpDma1p was originally isolated as multicopy suppressor of the temperature-sensitive growth phenotype caused by cdc16 mutations. It functions to prevent mitotic exit and cytokinesis during spindle checkpoint arrest by inhibiting septation initiation network (SIN) signaling. ScDma1p and ScDma2p, also known as checkpoint forkhead associated with RING domains-containing protein 1 and 2 respectively, seem to be functionally redundant. They are involved in proper septin ring positioning and cytokinesis. The simultaneous lack of Dma1 and Dma2 leads to spindle mispositioning and defects in the spindle position checkpoint. All members in this family contain a forkhead-associated domain (FHA) and a C3H2C3-type RING-H2 finger, the latter suggesting they may possibly possess E3 ubiquitin-ligase activities.	47
319373	cd16459	RING-H2_DTX1_like	RING finger, H2 subclass, found in E3 ubiquitin-protein ligase Deltex1 (DTX1), Deltex2 (DTX2), Deltex4 (DTX4), and similar proteins. This family includes Drosophila melanogaster Deltex, its vertebrate homologs, DTX1, DTX2, and DTX4, and other similar proteins mainly from eumetazoa. Deltex is a ubiquitously expressed cytoplasmic ubiquitin E3 ligase that mediates Notch activation in Drosophila. It selectively suppresses T-cell activation through degradation of a key signaling molecule, MAP kinase kinase kinase 1 (MEKK1). It also inhibits Jun-mediated transcription at the stage of Ras-dependent Jun N-terminal protein kinase (JNK) activation. Deltex contains N-terminal two Notch-binding WWE domains that physically interact with the Notch ankyrin domains, a proline-rich motif that shares homology with SH3-binding domains, and a RING finger at the C-terminus. The vertebrate homologs of Deltex have been involved in Notch signaling and neurogenesis. The mammalian DTX1 is most closely related to the Drosophila Deltex. Both of them bind to SH3-domain containing protein Grb2 and further inhibit E2A. DTX1 functions as a Notch downstream transcription regulator. It interacts with the transcription coactivator p300 and inhibits transcription activation mediated by the neural specific transcription factor MASH1. It is also a transcription target of nuclear factor of activated T cells (NFAT) and participated in T cell anergy and Foxp3 protein level maintenance in vivo. Moreover, DTX1 promotes protein kinase C theta degradation and sustains Casitas B-lineage lymphoma expression. DTX4, also known as RING finger protein 155, shares the highest degree of sequence similarity with DTX1. So it likely interacts with the intracellular domain of Notch as well. Like DTX1 and DTX4, DTX2 is expressed in thymocytes. It interacts with the intracellular domain of Notch receptors and acts as a negative regulator of Notch signals in T cells. 
However, the endogenous levels of DTX1 and DTX2 are not important for regulating Notch signals during thymocyte development. In contrast to other DTXs, DTX3 does not contain the two N-terminal Notch-binding WWE domains, but a short unique N-terminal domain. It does not interact with the intracellular domain of Notch. In addition, it has a different class of RING finger (C3HC4 type or RING-HC subclass) than do the other DTXs, which harbor a C3H2C3-type RING-H2 finger. Thus DTX3 is not included in this family.	64
319374	cd16460	RING-H2_DZIP3	RING finger, H2 subclass, found in DAZ (deleted in azoospermia)-interacting protein 3 (DZIP3) and similar proteins. DZIP3, also known as RNA-binding ubiquitin ligase of 138 kDa (RUL138) or 2A-HUB protein, is an RNA-binding E3 ubiquitin-protein ligase that interacts with coactivator-associated arginine methyltransferase 1 (CARM1) and acts as a transcriptional coactivator of estrogen receptor (ER) alpha. It is also a histone H2A ubiquitin ligase that catalyzes monoubiquitination of H2A at lysine 119, functioning as a combinatoric component of the repression machinery required for repressing a specific chemokine gene expression program, critically modulating migratory responses to Toll-like receptors (TLR) activation. DZIP3 contains a C3H2C3-type RING-H2 finger at the C-terminus.	43
319375	cd16461	RING-H2_EL5_like	RING finger, H2 subclass, found in rice E3 ubiquitin-protein ligase EL5 and similar proteins. EL5, also known as protein ELICITOR 5, is an E3 ubiquitin-protein ligase containing an N-terminal transmembrane domain and a C3H2C3-type RING-H2 finger that is a binding site for ubiquitin-conjugating enzyme (E2). It can be rapidly induced by N-acetylchitooligosaccharide elicitor. EL5 catalyzes polyubiquitination via the Lys48 residue of ubiquitin, and thus plays a crucial role as a membrane-anchored E3 in the maintenance of cell viability after the initiation of root primordial formation in rice. It also acts as an anti-cell death enzyme that might be responsible for mediating the degradation of cytotoxic proteins produced in root cells after the actions of phytohormones. Moreover, EL5 interacts with UBC5b, a rice ubiquitin carrier protein, through its RING-H2 finger. EL5 is an unstable protein, and its degradation is regulated by the C3H2C3-type RING-H2 finger in a proteasome-independent manner.	43
319376	cd16462	RING-H2_Pep3p_like	RING finger, H2 subclass, found in Saccharomyces cerevisiae vacuolar membrane protein PEP3 (Pep3p) and similar proteins. Pep3p, also known as carboxypeptidase Y-deficient protein 3, vacuolar morphogenesis protein 8, vacuolar protein sorting-associated protein 18 (Vps18p), or vacuolar protein-targeting protein 18, is a vacuolar membrane protein that affects late Golgi functions required for vacuolar protein sorting and efficient alpha-factor prohormone maturation. It is required for vacuolar biogenesis caused hypersensitivity to heat shock and ethanol stresses, probably due to disappearance of normal vacuoles. As a component of the homotypic fusion and vacuole protein sorting (HOPS) and class C core vacuole/endosome tethering (CORVET) complexes, its overexpression shortens lag phase but does not alter growth rate in Saccharomyces cerevisiae exposed to acetic acid stress. Moreover, Pep3p forms the Class C Vps protein complex (C-Vps complex) with Pep5p (also known as Vps11), Vps16, and Vps33, and is necessary for trafficking of hydrolase precursors to the vacuole by promoting vesicular docking reactions with SNARE proteins. Pep3p contains a C3H2C3-type RING-H2 finger at the C-terminus.	50
319377	cd16463	RING-H2_PHR	RING finger, H2 subclass, found in the PHR protein family. The PHR protein family represents an evolutionally conserved family of large proteins including human E3 ubiquitin ligase protein associated with Myc (Pam) and its homologs, Phr1 (for Pam/Highwire/RPM-1) in mouse, Highwire (HIW) in Drosophila, RPM-1 (regulator of presynaptic morphology 1) in Caenorhabditis elegans, and Esrom in zebrafish. Those proteins are large E3 ubiquitin ligases containing regulator of chromosome condensation (RCC) homology domains (RHD-1 and RHD-2) with inferred guanine exchange factor (GEF) activity, a Myc-binding domain, a B-box zinc finger, and a C-terminal C3H2C3-type RING-H2 finger with E3 ubiquitin (Ub) ligase activity. They play an important role in axon guidance and synaptogenesis. They regulate synapse formation and growth in mammals, zebrafish, Drosophila, and Caenorhabditis elegans, and may control a variety of signaling pathways, including cAMP signaling in mammalian cells, JNK/p38 MAPK signaling in Drosophila and C. elegans, and bone morphogenetic protein signaling in Drosophila. Pam also known as Myc-binding protein 2 (MYCBP2), or Pam/highwire/rpm-1 protein (PHR1), negatively regulates neuronal growth, synaptogenesis and synaptic plasticity by modulating several signaling pathways including the p38 MAPK signaling cascade. It also participates in receptor and ion channel internalization, such as regulating internalization of transient receptor potential vanilloid receptor 1 (TRPV1) in peripheral sensory neurons, as well as duration of thermal hyperalgesia through p38 MAPK. It interacts with neuron-specific electroneutral potassium (K+) and chloride (Cl-) cotransporter KCC2 and modulates its function. Moreover, Pam genetically interacts with Robo2 to modulate axon guidance in the olfactory system. 
It also associates with tuberous sclerosis complex (TSC) proteins, ubiquitinating TSC2 and regulating mammalian/mechanistic target of rapamycin (mTOR) signaling. Furthermore, Pam is the longest lasting nontranscriptional regulator of adenylyl cyclase activity, and can mediate sustained inhibition of cAMP signaling by sphingosine-1-phosphate. It is also involved in spinal nociceptive processing. Phr1 is an essential regulator of retinal ganglion cell projection during both dorsal lateral geniculate nucleus (dLGN) and superior colliculus (SC) topographic map development. RPM-1 positively regulates a Rab GTPase pathway to promote vesicular trafficking via late endosomes, thereby regulating synapse formation and axon termination. Esrom has E3 ligase activity and modulates the amount of phosphorylated Tuberin, a tumor suppressor, in growth cones. It is required in formation of the retinotectal projection.	55
319378	cd16464	RING-H2_Pirh2	RING finger, H2 subclass, found in p53-induced RING-H2 protein (Pirh2) and similar proteins. Pirh2, also known as RING finger and CHY zinc finger domain-containing protein 1 (Rchy1), androgen receptor N-terminal-interacting protein, CH-rich-interacting match with PLAG1, RING finger protein 199 (RNF199), or zinc finger protein 363 (ZNF363), is a p53 inducible E3 ubiquitin-protein ligase that functions as a negative regulator of p53. It ubiquitylates preferably the tetrameric form of p53 in vitro and in vivo, suggesting a role of Pirh2 in downregulating the transcriptionally active form of p53 in the cell. Moreover, Pirh2 inhibits p73, a homolog of the tumor suppressor p53, transcriptional activity by promoting its ubiquitination. It also monoubiquitinates DNA polymerase eta (PolH) to suppress translesion DNA synthesis. Furthermore, Pirh2 functions as a negative regulator of the cyclin-dependent kinase inhibitor p27(Kip1) function by promoting ubiquitin-dependent proteasomal degradation. In addition, Pirh2 enhances androgen receptor (AR) signaling through inhibition of histone deacetylase 1 (HDAC1) and is overexpressed in prostate cancer. Pirh2 also interacts with TIP60 and the TIP60-Pirh2 association may regulate Pirh2 stability. In addition, the oncoprotein pleomorphic adenoma gene like 2 (PLAGL2) can bind to Pirh2 dimer and therefore control the stability of Pirh2. Pirh2 contains a total of nine zinc-binding sites with six located at the N-terminal region, two in the C3H2C3-type RING-H2 domain, and one in the C-terminal region. Nine zinc binding sites comprise three different zinc coordination schemes, including RING type cross-brace zinc coordination, C4 zinc finger, and a novel left-handed beta-spiral zinc-binding motif formed by three recurrent CCHC sequence motifs.	45
319379	cd16465	RING-H2_PJA1_2	RING finger, H2 subclass, found in protein E3 ubiquitin-protein ligase Praja-1, Praja-2, and similar proteins. The family includes two highly similar E3 ubiquitin-protein ligases Praja-1 and Praja-2. Praja-1, also known as RING finger protein 70, is a RING-H2 finger ubiquitin ligase encoded by gene PJA1, a novel human X chromosome gene abundantly expressed in brain. It has been implicated in bone and liver development, as well as memory formation and X-linked mental retardation (MRX). Praja-1 interacts with and activates the ubiquitin-conjugating enzyme UbcH5B, and shows E2-dependent E3 ubiquitin ligase activity. It is a 3-deazaneplanocin A (DZNep)-induced ubiquitin ligase that directly ubiquitinates individual polycomb repressive complex 2 (PRC2) subunits in a cell free system, which leads to their proteasomal degradation. It also plays an important role in neuronal plasticity, which is the basis for learning and memory. Moreover, Praja-1 ubiquitinates embryonic liver fodrin (ELF) and Smad3, but not Smad4, in a transforming growth factor-beta (TGF-beta)-dependent manner. It controls ELF abundance through ubiquitin-mediated degradation, and further regulates TGF-beta signaling, which plays a key role in the suppression of gastric carcinoma. Furthermore, Praja-1 regulates the transcription function of the homeodomain protein Dlx5 by controlling the stability of the Dlx/Msx-interacting MAGE/Necdin family protein, Dlxin-1, via an ubiquitin-dependent degradation pathway. Praja-2, also known as RING finger protein 131, or NEURODAP1, or KIAA0438, is an E2-dependent E3 ubiquitin ligase that interacts with and activates the ubiquitin-conjugating enzyme UbcH5B. It functions as an A-kinase anchoring protein (AKAP)-like E3 ubiquitin ligase that plays a critical role in controlling cyclic AMP (cAMP) dependent PKA activity and pro-survival signaling, and further promotes cell proliferation and growth. 
Praja-2 is also involved in protein sorting at the postsynaptic density region of axosomatic synapses and possibly plays a role in synaptic communication and plasticity. Moreover, Praja-2, together with the AMPK-related kinase SIK2 and the CDK5 activator CDK5R1/p35, forms a SIK2-p35-PJA2 complex that plays an essential role for glucose homeostasis in pancreatic beta cell functional compensation. Furthermore, Praja-2 ubiquitylates and degrades Mob, a core component of NDR/LATS kinase and a positive regulator of the tumor-suppressor Hippo signaling. Both Praja-1 and Praja-2 contain a potential nuclear localization signal (NLS) and a C-terminal C3H2C3-type RING-H2 motif.	46
319380	cd16466	RING-H2_RBX2	RING finger, H2 subclass, found in RING-box protein 2 (RBX2) and similar proteins. RBX2, also known as CKII beta-binding protein 1 (CKBBP1), RING finger protein 7 (RNF7), regulator of cullins 2 (ROC2), or sensitive to apoptosis gene protein (SAG), is an E3 ubiquitin-protein ligase that protects cells from apoptosis, confers radioresistance, and plays an essential and non-redundant role in embryogenesis and vasculogenesis. It promotes ubiquitination and degradation of a number of protein substrates, including c-JUN, DEPTOR, HIF-1alpha, IkappaBalpha, NF1, NOXA, p27, and procaspase-3, thus regulating various signaling pathways and biological processes. RBX2 is necessary for ubiquitin ligation activity of the multimeric cullin (Cul)-RING E3 ligases (CRLs). RBX2-containing CRLs are involved in the NEDD8 pathway, and RBX2 specifically regulates NEDD8ylation of Cul5. It can bind and activate HIV-1 Vif-Cullin5 E3 ligase complex in vitro. It is also a substrate of NEDD4-1 E3 ubiquitin ligase and mediates NEDD4-1 induced chemosensitization. The inactivation of RBX2 E3 ubiquitin ligase activity triggers senescence and inhibits Kras-induced immortalization. Endothelial deletion of RBX2 causes embryonic lethality and blocks tumor angiogenesis, suggesting a way for anti-angiogenesis therapy of human cancer. Moreover, as a component of Cullin 5-RING E3 ubiquitin ligase (CRL5) complex, RBX2 regulates neuronal migration through different CRL5 adaptors, such as SOCS7. Furthermore, RBX2 functions as a redox inducible antioxidant protein that scavenges oxygen radicals by forming inter- and intra-molecular disulfide bonds when acting alone. RBX2 contains a C-terminal C3H2C3-type RING-H2 finger that is essential for its ligase activity.	60
319381	cd16467	RING-H2_RNF6_like	RING finger, H2 subclass, found in E3 ubiquitin-protein ligase RNF6, RNF12, and similar proteins. RNF6 is an androgen receptor (AR)-associated protein that induces AR ubiquitination and promotes AR transcriptional activity. RNF6-induced ubiquitination may regulate AR transcriptional activity and specificity through modulating cofactor recruitment. RNF6 is overexpressed in hormone-refractory human prostate cancer tissues and required for prostate cancer cell growth under androgen-depleted conditions. Moreover, RNF6 regulates local serine/threonine kinase LIM kinase 1 (LIMK1) levels in axonal growth cones. RNF6-induced LIMK1 polyubiquitination is mediated via K48 of ubiquitin and leads to proteasomal degradation of the kinase. RNF6 also binds and upregulates the Inha promoter, and functions as a transcription regulatory protein in the mouse Sertoli cell. Furthermore, RNF6 acts as a potential tumor suppressor involved in the pathogenesis of esophageal squamous cell carcinoma (ESCC). RNF12, also known as LIM domain-interacting RING finger protein, or RING finger LIM domain-binding protein (R-LIM), is an E3 ubiquitin-protein ligase encoded by the RLIM gene that is crucial for normal embryonic development in some species and for normal X inactivation in mice. It thus functions as a major sex-specific epigenetic regulator of female mouse nurturing tissues. RNF12 is widely expressed during embryogenesis, and mainly localizes to the cell nucleus, where it regulates the levels of many proteins, including CLIM, LMO, HDAC2, TRF1, SMAD7, and REX1, by proteasomal degradation. Both RNF6 and RNF12 contain a well conserved C3H2C3-type RING-H2 finger.	43
319382	cd16468	RING-H2_RNF11	RING finger, H2 subclass, found in RING finger protein 11 (RNF11) and similar proteins. RNF11 is an E3 ubiquitin-protein ligase that acts both as an adaptor and a modulator of itch-mediated control of ubiquitination events underlying membrane traffic. It is the downstream of an enzymatic cascade for the ubiquitination of specific substrates. It is also a molecular adaptor of homologous to E6-associated protein C-terminus (HECT)-type ligases. RNF11 has been implicated in the regulation of several signaling pathways. It enhances the transforming growth factor receptor (TGFR) signaling by both abrogating Smurf2-mediated receptor ubiquitination and by promoting the Smurf2-mediated degradation of AMSH (associated molecule with the SH3 domain of STAM), a de-ubiquitinating enzyme that enhances transforming growth factor-beta (TGF-beta) signaling and epidermal growth factor receptor (EGFR) endosomal recycling. It also acts directly on Smad4 to enhance Smad4 function, and plays a role in prolonged TGF-beta signaling. Moreover, RNF11 functions as a critical component of the A20 ubiquitin-editing protein complex that negatively regulates tumor necrosis factor (TNF)-mediated nuclear factor (NF)-kappaB activation. It also interacts with Smad anchor for receptor activation (SARA) and the endosomal sorting complex required for transport (ESCRT)-0 complex, thus participating in the regulation of lysosomal degradation of EGFR. Furthermore, RNF11 acts as a novel GGA cargo actively participating in regulating the ubiquitination of the GGA protein family. In addition, RNF11 functions together with TAX1BP1 to target TANK-binding kinase 1 (TBK1)/IkappaB kinase IKKi, and further restricts antiviral signaling and type I interferon (IFN)-beta production. 
RNF11 contains an N-terminal PPPY motif that binds WW domain-containing proteins such as AIP4/itch, Nedd4 and Smurf1/2 (SMAD-specific E3 ubiquitin-protein ligase 1/2), and a C-terminal C3H2C3-type RING-H2 finger that functions as a scaffold for the coordinated transfer of ubiquitin to substrate proteins together with the E2 enzymes UbcH527 and Ubc13.	43
319383	cd16469	RING-H2_RNF24_like	RING finger, H2 subclass, found in RING finger proteins RNF24, RNF122, and similar proteins. The family includes RNF24, RNF122, and similar proteins. RNF24 is an intrinsic membrane protein localized in the Golgi apparatus. It specifically interacts with the ankyrin-repeat domains (ARDs) of TRPC1, -3, -4, -5, -6, and -7, and affects TRPC intracellular trafficking without affecting their activity. RNF122 is a RING finger protein associated with HEK 293T cell viability. It is localized to the endoplasmic reticulum (ER) and the Golgi apparatus, and overexpressed in anaplastic thyroid cancer cells. RNF122 functions as an E3 ubiquitin ligase that can ubiquitinate itself and undergoes degradation through its RING finger in a proteasome-dependent manner. Both RNF24 and RNF122 contain an N-terminal transmembrane domain and a C-terminal C3H2C3-type RING-H2 finger.	47
319384	cd16470	RING-H2_RNF25	RING finger, H2 subclass, found in RING finger protein 25 (RNF25) and similar proteins. RNF25, also known as AO7, is a putative E3 ubiquitin-protein ligase that was initially identified as an interacting protein with a ubiquitin-conjugating enzyme, Ubc5B. It is ubiquitously expressed in various tissues and predominantly localized in the nucleus. RNF25 activates the nuclear factor (NF)-kappaB-dependent gene expression upon stimulation with Interleukin-1 beta (IL-1beta), or tumor necrosis factor (TNF), or overexpression of NF-kappaB-inducing kinase. It interacts with the p65 transactivation domain (TAD) and modulates its transcriptional activity. RNF25 contains an N-terminal RWD domain, a C3H2C3-type RING-H2 finger, and a C-terminal Pro-rich region. Both the RING-H2 finger and the C-terminal regions of RNF25 are necessary for the transcriptional activation.	68
319385	cd16471	RING-H2_RNF32	RING finger, H2 subclass, found in RING finger protein 32 (RNF32) and similar proteins. RNF32 is mainly expressed during testis spermatogenesis, most likely in spermatocytes and/or in spermatids, suggesting a possible role in sperm formation. RNF32 contains two C3H2C3-type RING-H2 fingers separated by an IQ domain of unknown function. Although the biological function of RNF32 remains unclear, the protein with double RING-H2 fingers may act as a scaffold for binding several proteins that function in the same pathway.	49
319386	cd16472	RING-H2_RNF38_like	RING finger, H2 subclass, found in RING finger proteins RNF38, RNF44, and similar proteins. The family includes RING finger proteins RNF38, RNF44, and similar proteins. RNF38 is a nuclear E3 ubiquitin protein ligase that plays a role in regulating p53. RNF44 is an uncharacterized RING finger protein that shows high sequence similarity with RNF38. Both RNF38 and RNF44 contain a coiled-coil motif, a KIL motif (Lys-X2-Ile/Leu-X2-Ile/Leu, X can be any amino acid), and a C3H2C3-type RING-H2 finger. In addition, RNF38 harbors two potential nuclear localization signals.	45
319387	cd16473	RING-H2_RNF103	RING finger, H2 subclass, found in RING finger protein 103 (RNF103) and similar proteins. RNF103, also known as KF-1, or zinc finger protein 103 homolog (Zfp-103), is an endoplasmic reticulum (ER)-resident E3 ubiquitin-protein ligase that is widely expressed in many different organs, including brain, heart, kidney, spleen, and lung. It is involved in the ER-associated degradation (ERAD) pathway through interacting with components of the ERAD pathway, including Derlin-1 and VCP. RNF103 contains several hydrophobic regions at its N-terminal and middle regions, as well as a C-terminal C3H2C3-type RING-H2 finger.	46
319388	cd16474	RING-H2_RNF111_like	RING finger, H2 subclass, found in RING finger proteins RNF111, RNF165, and similar proteins. The family includes RING finger proteins RNF111, RNF165, and similar proteins. RNF111, also known as Arkadia, is a nuclear E3 ubiquitin-protein ligase that targets intracellular effectors and modulators of transforming growth factor beta (TGF-beta)/Nodal-related signaling for polyubiquitination and proteasome-dependent degradation. It also interacts with the clathrin-adaptor 2 (AP2) complex and regulates endocytosis of certain cell surface receptors, leading to modulation of epidermal growth factor (EGF) and possibly other signaling pathways. The N-terminal half of RNF111 harbors three SUMO-interacting motifs (SIMs). It thus functions as a SUMO-targeted ubiquitin ligase (STUbL) that directly links nonproteolytic ubiquitylation and SUMOylation in the DNA damage response, as well as triggers degradation of signal-induced polysumoylated proteins, such as the promyelocytic leukemia protein (PML). RNF165, also known as Arkadia-like 2, or Arkadia2, or Ark2C, is an E3 ubiquitin ligase with homology to C-terminal half of RNF111. It is expressed specifically in the nervous system, and can serve to amplify neuronal responses to specific signals. It thus acts as a positive regulator of bone morphogenetic protein (BMP)-Smad signaling that is involved in motor neuron (MN) axon elongation. Both RNF165 and RNF111 contain a C-terminal C3H2C3-type RING-H2 finger.	46
319389	cd16475	RING-H2_RNF121_like	RING finger, H2 subclass, found in RING finger proteins RNF121, RNF175 and similar proteins. The family includes RNF121, RNF175 and similar proteins. RNF121 is an E3-ubiquitin ligase present in the endoplasmic reticulum (ER) and cis-Golgi compartments. It is a novel regulator of apoptosis. It also facilitates the degradation and membrane localization of voltage-gated sodium (NaV) channels, and thus plays a role in the quality control of NaV channels during their synthesis and subsequent transport to the membrane. Moreover, RNF121 acts as a broad regulator of nuclear factor kappaB (NF-kappaB) signaling since its silencing also dampens NF-kappaB activation following stimulation of toll-like receptors (TLRs), nod-like receptors (NLRs), RIG-I-like receptors (RLRs) or after DNA damage. RNF121 contains five conserved transmembrane (TM) domains and a C3H2C2-type RING-H2 finger. RNF175 is an uncharacterized RING finger protein that shows high sequence similarity with RNF121. This family also includes Arabidopsis RING finger E3 ligase RHA2A, RHA2B, and their homologs. RHA2A is a novel positive regulator of abscisic acid (ABA) signaling during seed germination and early seedling development. RHA2B may play a role in the ubiquitin-dependent proteolysis pathway that responds to proteasome inhibition. All family members contain a C3H2C3-type RING-H2 finger, which is responsible for E3-ubiquitin ligase activity.	55
319390	cd16476	RING-H2_RNF139_like	RING finger, H2 subclass, found in RING finger proteins RNF139, RNF145, and similar proteins. RNF139, also known as translocation in renal carcinoma on chromosome 8 protein (TRC8), is an endoplasmic reticulum (ER)-resident multi-transmembrane protein that functions as a potent growth suppressor in mammalian cells, inducing G2/M arrest, decreased DNA synthesis and increased apoptosis. It is a tumor suppressor that has been implicated in a novel regulatory relationship linking the cholesterol/lipid biosynthetic pathway with cellular growth control. The mutation of RNF139 has been identified in families with hereditary renal (RCC) and thyroid cancers. RNF145 is an uncharacterized RING finger protein encoded by RNF145 gene, which is expressed in T lymphocytes, and its expression is altered in acute myelomonocytic and acute promyelocytic leukemias. Although its biological function remains unclear, RNF145 shows high sequence similarity with RNF139. Both RNF139 and RNF145 contain a C3H2C3-type RING-H2 finger with possible E3-ubiquitin ligase activity.	41
319391	cd16477	RING-H2_RNF214	RING finger, H2 subclass, found in RING finger protein 214 (RNF214) and similar proteins. RNF214 is an uncharacterized RING finger protein containing a C3H2C3-type RING-H2 finger, suggesting possible E3-ubiquitin ligase activity.	45
319392	cd16478	RING-H2_Rapsyn	RING finger, H2 subclass, found in 43 kDa receptor-associated protein of the synapse (Rapsyn) and similar proteins. Rapsyn, also known as acetylcholine receptor-associated 43 kDa protein or RING finger protein 205 (RNF205), is a 43 kDa postsynaptic protein that plays an essential role in the clustering and maintenance of acetylcholine receptor (AChR) in the postsynaptic membrane of the motor endplate (EP). AChRs enable the transport of rapsyn from the Golgi complex to the plasma membrane through a molecule-specific interaction. Rapsyn also mediates subsynaptic anchoring of protein kinase A (PKA) type I in close proximity to the postsynaptic membrane, which is essential for synapse maintenance. Its mutations in humans cause endplate acetylcholine-receptor deficiency and myasthenic syndrome. Rapsyn contains an N-terminal myristoylation signal required for membrane association, seven tetratricopeptide repeats (TPRs) that subserve rapsyn self-association, a coiled-coil domain responsible for the binding of determinants within the long cytoplasmic loop of each AChR subunit, a C3H2C3-type RING-H2 finger that binds to the cytoplasmic domain of beta-dystroglycan and to S-NRAP and links rapsyn to the subsynaptic cytoskeleton, and a serine phosphorylation site.	47
319393	cd16479	RING-H2_synoviolin	RING finger, H2 subclass, found in synoviolin and similar proteins. Synoviolin, also known as synovial apoptosis inhibitor 1 (Syvn1), Hrd1, or Der3, is an endoplasmic reticulum (ER)-anchoring E3 ubiquitin ligase that functions as a suppressor of ER stress-induced apoptosis and plays a role in homeostasis maintenance. It also targets tumor suppressor gene p53 for proteasomal degradation, suggesting the crosstalk between ER associated degradation (ERAD) and p53 mediated apoptotic pathway under ER stress. Moreover, Synoviolin controls body weight and mitochondrial biogenesis through negative regulation of the thermogenic coactivator peroxisome proliferator-activated receptor coactivator (PGC)-1beta. It upregulates amyloid beta production by targeting a negative regulator of gamma-secretase, Retention in endoplasmic reticulum 1 (Rer1), for degradation. It is also involved in the degradation of endogenous immature nicastrin, and affects amyloid beta-protein generation. Moreover, Synoviolin is highly expressed in rheumatoid synovial cells and may be involved in the pathogenesis of rheumatoid arthritis (RA). It functions as an anti-apoptotic factor that is responsible for the outgrowth of synovial cells during the development of RA. It promotes inositol-requiring enzyme 1 (IRE1) ubiquitination and degradation in synovial fibroblasts with collagen-induced arthritis. Furthermore, the upregulation of Synoviolin may represent a protective response against neurodegeneration in Parkinson's disease (PD). In addition, Synoviolin is involved in liver fibrogenesis. Synoviolin contains a C3H2C2-type RING-H2 finger.	43
319394	cd16480	RING-H2_TRAIP	RING finger, H2 subclass, found in TRAF-interacting protein (TRAIP) and similar proteins. TRAIP, also known as RING finger protein 206 (RNF206) or TRIP, is a ubiquitously expressed nucleolar E3 ubiquitin ligase important for cellular proliferation and differentiation. It is found near mitotic chromosomes and functions as a regulator of the spindle assembly checkpoint. TRAIP interacts with tumor necrosis factor (TNF)-receptor-associated factor (TRAF) proteins and inhibits TNF-alpha-mediated nuclear factor (NF)-kappaB activation. It also interacts with two tumor suppressors CYLD and spleen tyrosine kinase (Syk), and DNA polymerase eta, which facilitates translesional synthesis after DNA damage. TRAIP contains an N-terminal C3H2C2-type RING-H2 finger and an extended coiled-coil domain.	45
319395	cd16481	RING-H2_TTC3	RING finger, H2 subclass, found in Tetratricopeptide repeat protein 3 (TTC3) and similar proteins. TTC3, also known as protein DCRR1, or TPR repeat protein D, or TPR repeat protein 3, or RING finger protein 105 (RNF105), is an E3 ubiquitin-protein ligase encoded by a gene within the Down syndrome (DS) critical region on chromosome 21. It affects differentiation and Golgi compactness in neurons through specific actin-regulating pathways. It inhibits the neuronal-like differentiation of pheochromocytoma cells by activating RhoA and by binding to Citron proteins. TTC3 is an Akt-specific E3 ligase that binds to phosphorylated Akt and facilitates its ubiquitination and degradation within the nucleus. TTC3 contains four N-terminal TPR motifs, a potential coiled-coil region and a Citron binding region in the central part, and a C-terminal C3H2C2-type RING-H2 finger.	42
319396	cd16482	RING-H2_UBR1_like	RING finger, H2 subclass, found in ubiquitin-protein ligase E3-alpha-1 (UBR1), E3-alpha-2 (UBR2), and similar proteins. Two UBR family members, UBR1 and UBR2, are major N-recognin ubiquitin ligases that both function in the N-end rule degradation pathway. They can recognize substrate proteins with type-1 (basic) and type-2 (bulky hydrophobic) N-terminal residues as part of N-degrons and an internal lysine residue for ubiquitin conjugation. They also function in a quality control pathway for degradation of unfolded cytosolic proteins. Their action is stimulated by Hsp70. Moreover, UBR1 and UBR2 are negative regulators of the leucine-mTOR signaling pathway. Leucine might activate this pathway in part through inhibition of their ubiquitin ligase activity. In yeast only one E3, encoded by UBR1, mediates the recognition of substrates by the N-end rule pathway. Saccharomyces cerevisiae UBR1 also functions as an additional E3 ligase in endoplasmic reticulum-associated protein degradation (ERAD) in yeast. It can provide ubiquitin ligation activity for the ERAD substrate mutated Ste6 (sterile). Schizosaccharomyces pombe UBR1 is a critical regulator that influences the oxidative stress response via degradation of active Pap1 basic leucine zipper (bZIP) transcription factor in the nucleus. Both UBR1 and UBR2 contain an N-terminal ubiquitin-recognin (UBR) box involved in binding type-1 (basic) N-end rule substrate, an N-domain (also known as ClpS domain) required for type-2 (bulky hydrophobic) N-end rule substrate recognition, a C3H2C3-type RING-H2 finger, and a C-terminal UBR-specific autoinhibitory (UAIN) domain.	67
319397	cd16483	RING-H2_UBR3	RING finger, H2 subclass, found in ubiquitin-protein ligase E3-alpha-3 (UBR3) and similar proteins. UBR3, also known as N-recognin-3, E3alpha-III, or zinc finger protein 650, is an E3 ubiquitin-protein ligase targeting the essential DNA repair protein APE1, also known as Ref-1, for ubiquitylation. It regulates cellular levels of APE1 and is required for genome stability. It also plays a regulatory role in sensory pathways, including olfaction. Moreover, in Drosophila, UBR3 regulates apoptosis by controlling the activity of Drosophila inhibitor of apoptosis protein 1 (DIAP1), which is required to prevent caspase activation. UBR3 contains an N-terminal ubiquitin-recognin (UBR) box, a C3H2C3-type RING-H2 finger, and a C-terminal UBR-specific autoinhibitory (UAIN) domain.	90
319398	cd16484	RING-H2_Vps	RING finger, H2 subclass, found in vacuolar protein sorting-associated proteins Vps8, Vps11, Vps18, Vps41, and similar proteins. This family corresponds to a group of vacuolar protein sorting-associated proteins containing a C-terminal C3H2C3-type RING-H2 finger, which includes Vps8, Vps11, Vps18, and Vps41. Vps11 and Vps18 associate with Vps16 and Vps33 to form a Class C Vps core complex that is required for soluble N-ethylmaleimide-sensitive factor attachment protein receptors (SNARE)-mediated membrane fusion at the lysosome-like yeast vacuole. The core complex, together with two additional compartment-specific subunits, forms the tethering complexes HOPS (homotypic vacuole fusion and protein sorting) and CORVET (class C core vacuole/endosome transport) protein complexes. CORVET contains the additional Vps3 and Vps8 subunits. It operates at endosomes, controls traffic into late endosomes and interacts with the Rab5/Vps21-GTP form. HOPS contains the additional Vps39 and Vps41 subunits. It operates at the lysosomal vacuole, controls all traffic from late endosomes into the vacuole and interacts with the Rab7/Ypt7-GTP form.	46
319399	cd16485	mRING-H2-C3H2C2D_RBX1	Modified RING finger, H2 subclass (C3H2C2D-type), found in RING-box protein 1 (RBX1) and similar proteins. RBX1, also known as Hrt1, protein ZYP, RING finger protein 75 (RNF75), or regulator of cullins 1 (ROC1), is an E3 ubiquitin-protein ligase necessary for ubiquitin ligation activity of the multimeric cullin (Cul)-RING E3 ligases (CRLs). RBX1-containing CRLs are involved in the NEDD8 pathway, and RBX1 specifically regulates NEDD8ylation of Cul1-4. It can also bind and activate HIV-1 Vif-Cullin5 E3 ligase complex in vitro. Moreover, RBX1 is an essential element of Skp1/Cullins/F-box (SCF) E3-ubiquitin ligase complex that targets diverse proteins for proteasome-mediated degradation. It is a direct functional target of miR-194 and plays an important role in proliferation and migration of gastric cancer (GC) cells. RBX1 is also an essential component of KEAP1/CUL3/RBX1 E3-ubiquitin ligase complex that functions as a regulator of NFE2-related factor 2 (NRF2) and plays a key role in NRF2 pathway deregulation in multiple tumor types, including ovarian carcinomas (OVCA) and papillary thyroid carcinoma (PTC). Furthermore, RBX1 associates with DDB1, Cul4A, and Fbxw5 to form the Fbxw5-DDB1-Cul4A-Rbx1 complex that may function as a dual SUMO/ubiquitin ligase suppressing c-Myb activity through sumoylation or ubiquitination. RBX1 contains a C-terminal modified RING-H2 finger that is C3H2C2D-type, rather than the canonical C3H2C3-type. The modified RING-H2 finger is essential for its ligase activity.	62
319400	cd16486	mRING-H2-C3H2C2D_ZSWM2	Modified RING finger, H2 subclass (C3H2C2D-type), found in zinc finger SWIM domain-containing protein 2 (ZSWIM2) and similar proteins. ZSWIM2, also known as MEKK1-related protein X (MEX) or ZZ-type zinc finger-containing protein 2, is a testis-specific E3 ubiquitin ligase that promotes death receptor-induced apoptosis through Fas, death receptor (DR) 3 and DR4 signaling. ZSWIM2 is self-ubiquitinated and targeted for degradation through the proteasome pathway. It also acts as an E3 ubiquitin ligase, through the E2, Ub-conjugating enzymes UbcH5a, UbcH5c, or UbcH6. ZSWIM2 contains four putative zinc-binding domains including an N-terminal SWIM (SWI2/SNF2 and MuDR) domain critical for its ubiquitination, and two modified RING-H2 fingers separated by a ZZ zinc finger domain, which was required for interaction with UbcH5a and its self-association. This family corresponds to the second RING-H2 finger, which is not a canonical C3H2C3-type, but a modified C3H2C2D-type.	44
319401	cd16487	mRING-H2-C3DHC3_ZFPL1	Modified RING finger, H2 subclass (C3DHC3-type), found in zinc finger protein-like 1 (ZFPL1) and similar proteins. ZFPL1, also known as zinc finger protein MCG4, is a novel mitotic Golgi phosphoprotein required for cis-Golgi integrity and efficient endoplasmic reticulum (ER)-to-Golgi transport via directly interacting with the cis-Golgi matrix protein GM130. ZFPL1 is a widely expressed integral membrane protein with two predicted zinc fingers at its N-terminus. One is a novel type of zinc finger, and the other is a modified RING-H2 finger that lacks the fourth zinc-binding residue of the consensus C3H2C3-type RING-H2 finger. It also contains a bipartite nuclear localization signal (NLS), and a leucine zipper at the C-terminus.	55
319402	cd16488	mRING-H2-C3H3C2_Mio_like	Modified RING finger, H2 subclass (C3H3C2-type), found in WD repeat-containing protein mio and its homologs. This family contains Mio, WDR24, WDR59, and their counterparts Sea4, Sea2, and Sea3 from yeast, respectively. Mio/Sea4, Sea2/WDR24, and Sea3/WDR59 are components of the GATOR2 complex, which also includes another two subunits, Seh1 and Sec13. GATOR2 and GATOR1, which is composed of three subunits, DEPDC5, Nprl2, and Nprl3, form the Rag-interacting complex GATOR (GAP Activity Towards Rags). Inhibition of GATOR1 subunits makes mTORC1 signaling resistant to amino acid deprivation. In contrast, inhibition of GATOR2 subunits suppresses mTORC1 signaling and GATOR2 negatively regulates DEPDC5. All family members contain an N-terminal WD40 domain and a C-terminal RING-H2 finger with an unusual arrangement of zinc-coordinating residues. The cysteines and histidines in the RING-H2 finger are arranged as a modified C3H3C2-type, rather than the canonical C3H2C3-type.	44
319403	cd16489	mRING-CH-C4HC2H_ZNRF	Modified RING-CH finger, H2 subclass (C4HC2H-type), found in the ZNRF family. This ZNRF family includes zinc/RING finger proteins ZNRF1, ZNRF2, and similar proteins. It has been characterized by containing a unique combination zinc finger-RING finger motif in the C-terminal region, which is evolutionarily conserved in a wide range of species, including Caenorhabditis elegans and Drosophila. The ZNRF family of proteins function as E3 ubiquitin ligases and are highly expressed in central nervous system (CNS) and peripheral nervous system (PNS) neurons, particularly during development and in adulthood. In neurons, ZNRF1 and ZNRF2 are differentially localized within the synaptic region. ZNRF1 is associated with synaptic vesicle membranes, whereas ZNRF2 is present in presynaptic plasma membranes. They are N-myristoylated and also located in the endosome-lysosome compartment in fibroblasts. ZNRF proteins may play a role in the establishment and maintenance of neuronal transmission and plasticity via their ubiquitin ligase activity, as well as in regulating Ca2+-dependent exocytosis. The RING fingers found in ZNRF proteins are modified as C4HC2H-type RING-CH finger, rather than the typical C4HC3-type RING-CH finger, which is a variant of RING-H2 finger.	43
319404	cd16490	RING-CH-C4HC3_FANCL	RING-CH finger, H2 subclass (C4HC3-type), found in Fanconi anemia group L protein (FANCL) and similar proteins. FANCL, also known as fanconi anemia-associated polypeptide of 43 kDa (FAAP43) or PHF9, is a monomeric RING E3 ubiquitin-protein ligase that monoubiquitinates FANCD2 and FANCI. The monoubiquitinated FANCD2-FANCI heterodimer complex in turn recruits key proteins involved in homologous recombination and DNA repair. FANCL is also one of seven components in Fanconi anemia (FA) nuclear core complex, which provides the essential E3 ligase function for spatially defined FANCD2 ubiquitination and FA pathway activation. In the FA core complex, FANCL associates with FANCB and FAAP100 to constitute a catalytic subcomplex that functions as the monoubiquitination module. FANCL specifically interacts with the E2 ubiquitin-conjugating (UBC) enzyme Ube2T to make an E3-E2 pair, which is the catalytic center of the Fanconi Anemia (FA) pathway required for DNA interstrand crosslink repair. Moreover, FANCL has a noncanonical function to regulate the Wnt/beta-catenin signaling, a pathway involved in hematopoietic stem cell self-renewal. It functionally enhances beta-catenin activity through ubiquitinating beta-catenin, with atypical ubiquitin chains (K11 linked). FANCL contains an N-terminal E2-like fold (ELF) domain, a novel double-RWD (DRWD) domain with a clear hydrophobic core, and a C-terminal C4HC3-type RING-CH finger. The DRWD domain is required for substrate binding. The RING-CH finger, also known as vRING or RINGv, is predicted to facilitate E2 binding. It has an unusual arrangement of zinc-coordinating residues. Its cysteines and histidines are arranged in the sequence as C4HC3-type, rather than the C3H2C3-type in canonical RING-H2 finger.	58
319405	cd16491	RING-CH-C4HC3_LTN1	RING-CH finger, H2 subclass (C4HC3-type), found in E3 ubiquitin-protein ligase listerin and similar proteins. Listerin, also known as RING finger protein 160 or zinc finger protein 294, is the mammalian homolog of yeast Ltn1. It is widely expressed in all tissues, but motor and sensory neurons and neuronal processes in the brainstem and spinal cord are primarily affected in the mutant. Listerin is required for embryonic development and plays an important role in neurodegeneration. It also functions as a critical E3 ligase involving quality control of nonstop proteins. It mediates ubiquitylation of aberrant proteins that become stalled on ribosomes during translation. Ltn1 works with several cofactors to form a large ribosomal subunit-associated quality control complex (RQC), which mediates the ubiquitylation and extraction of ribosome-stalled nascent polypeptide chains for proteasomal degradation. It appears to first associate with nascent chain-stalled 60S subunits together with two proteins of unknown function, Tae2 and Rqc1. Listerin contains a long stretch of HEAT (Huntingtin, Elongation factor 3, PR65/A subunit of protein phosphatase 2A, and TOR) or ARM (Armadillo) repeats in the N terminus and middle region, and a catalytic RING-CH finger, also known as vRING or RINGv, with an unusual arrangement of zinc-coordinating residues in the C-terminus. Its cysteines and histidines are arranged in the sequence as C4HC3-type, rather than the C3H2C3-type in canonical RING-H2 finger.	50
319406	cd16492	RING-CH-C4HC3_NFX1_like	RING-CH finger, H2 subclass (C4HC3-type), found in transcriptional repressor NF-X1, NF-X1-type zinc finger protein NFXL1, and similar proteins. NF-X1, also known as nuclear transcription factor, X box-binding protein 1, is a novel cysteine-rich sequence-specific DNA-binding protein that interacts with the conserved X-box motif of the human major histocompatibility complex (MHC) class II genes via a repeated Cys-His domain. It functions as a cytokine-inducible transcriptional repressor that plays an important role in regulating the duration of an inflammatory response by limiting the period in which class II MHC molecules are induced by interferon gamma (IFN-gamma). NFXL1, also known as NF-X1-type zinc finger protein NFXL1 or ovarian zinc finger protein (OZFP), is encoded by a novel human cytoplasm-distribution zinc finger protein (CDZFP) gene. This family also includes human transcription factor NF-X1 homologs from insects, plants, and fungi. Drosophila melanogaster shuttle craft (STC) is a DNA- or RNA-binding protein required for proper axon guidance in the central nervous system. It functions as a putative transcription factor and plays an essential role in the completion of embryonic development. In contrast to NF-X1, STC contains an RD domain. The Arabidopsis genome encodes two NF-X1 homologs, AtNFXL1 and AtNFXL2, both of which function as regulators of salt stress responses. The AtNFXL1 protein is a nuclear factor that positively affects adaptation to salt stress. It also functions as a negative regulator of the type A trichothecene phytotoxin-induced defense response. AtNFXL2 controls abscisic acid (ABA) levels and suppresses ABA responses. It may also prevent unnecessary and costly stress adaptation under favorable conditions. FKBP12-associated protein 1 (FAP1) is a dosage suppressor of rapamycin toxicity in budding yeast. It is localized in the cytoplasm, but upon rapamycin treatment translocates to the nucleus. FAP1 interacts with FKBP12 in a rapamycin-sensitive manner. It is a proline-rich protein containing a novel cysteine-rich DNA-binding motif. Unique structural features of the NFX1 and NFXL proteins are the Cys-rich region and a specific RING-CH finger motif with an unusual arrangement of zinc-coordinating residues. The Cys-rich region is required for binding to specific promoter elements. It frequently comprises more than 500 amino acids and harbors several NFX1-type zinc finger domains, characterized by the pattern C-X(1-6)-H-X-C-X3-C(H/C)-X(3-4)-(H/C)-X(1-10)-C. The RING-CH finger, also known as vRING or RINGv, may have E3 ligase activity. It is characterized by a C4HC3-type Zn ligand signature and additional conserved amino acids, rather than C3H2C3-type cysteines and histidines arrangement in canonical RING-H2 finger. In addition to the Cys-rich region and RING-CH finger, NFX1 contains a PAM2 motif in the N-terminus and a R3H domain in the C-terminus.	58
319407	cd16493	RING-CH-C4HC3_NSE1	RING-CH finger, H2 subclass (C4HC3-type), found in non-structural maintenance of chromosomes element 1 homolog (NSE1) and similar proteins. NSE1, also known as non-SMC element 1 homolog (NSMCE1), is an E3 ubiquitin ligase that contains a C4HC3-type RING-CH finger, also known as vRING or RINGv, a variant of C3H2C3-type RING-H2 finger. It together with its partner, proteins NSE3 and NSE4, form a tight subcomplex of the structural maintenance of chromosomes SMC5-6 complex, which includes another two subcomplexes, SMC6-SMC5-NSE2 and NSE5-NSE6. The vRING finger is essential for normal NSE1-NSE3-NSE4 trimer formation in vitro and for damage-induced recruitment of NSE4 and SMC5 to subnuclear foci in vivo. Thus it functions as a protein-protein interaction domain required for SMC5-6 holocomplex integrity and recruitment to, or retention at, DNA lesions. The C-terminal half of NSE1, including the vRING finger, is required for DNA damage resistance and mitotic fidelity of SMC5-6 complex in the fission yeast Schizosaccharomyces pombe. The RING-CH finger may play an important role in Rad52-dependent postreplication repair of UV-damaged DNA in Saccharomyces cerevisiae.	47
319408	cd16494	RING-CH-C4HC3_ZSWM2	RING-CH finger, H2 subclass (C4HC3-type), found in zinc finger SWIM domain-containing protein 2 (ZSWIM2) and similar proteins. ZSWIM2, also known as MEKK1-related protein X (MEX) or ZZ-type zinc finger-containing protein 2, is a testis-specific E3 ubiquitin ligase that promotes death receptor-induced apoptosis through Fas, death receptor (DR) 3, and DR4 signaling. ZSWIM2 is self-ubiquitinated and targeted for degradation through the proteasome pathway. It also acts as an E3 ubiquitin ligase, through the E2, Ub-conjugating enzymes UbcH5a, UbcH5c, or UbcH6. ZSWIM2 contains four putative zinc-binding domains including an N-terminal SWIM (SWI2/SNF2 and MuDR) domain critical for its ubiquitination and two RING fingers separated by a ZZ zinc finger domain, which was required for interaction with UbcH5a and its self-association. This family corresponds to the first RING finger, which is a C4HC3-type RING-CH finger, also known as vRING or RINGv, rather than the canonical C3H2C3-type RING-H2 finger.	58
319409	cd16495	RING_CH-C4HC3_MARCH	RING-CH finger, H2 subclass (C4HC3-type), found in membrane-associated RING-CH proteins (MARCH). The family corresponds to a novel family of membrane-associated E3 ubiquitin ligases, consisting of 11 members in mammals (MARCH1-11), which are characterized by containing an N-terminal C4HC3-type RING-CH finger, also known as vRING or RINGv, a variant of C3H2C3-type RING-H2 finger. Most family members have hydrophobic transmembrane spans and are localized to the plasma membrane and intracellular organelle membrane. Only MARCH7 and MARCH10 are predicted to have no transmembrane spanning region. MARCH proteins have been implicated in mediating the ubiquitination and subsequent down-regulation of cell-surface immune regulatory molecules, such as major histocompatibility complex class II and CD86, as well as in endoplasmic reticulum-associated degradation, endosomal protein trafficking, mitochondrial dynamics, and spermatogenesis.	52
319410	cd16496	RING-HC_BARD1	RING finger, HC subclass, found in BRCA1-associated RING domain protein 1 (BARD-1) and similar proteins. BARD-1 is a critical factor in BRCA1-mediated tumor suppression and may also serve as a target for tumorigenic lesions in some human cancers. It associates with BRCA1 (breast cancer-1) to form a heterodimeric BRCA1/BARD1 complex that is responsible for maintaining genomic stability through nuclear functions involving DNA damage signaling and repair, transcriptional regulation, and cell cycle control. The BRCA1/BARD1 complex catalyzes autoubiquitination of BRCA1 and trans ubiquitination of other protein substrates. Its E3 ligase activity is dramatically reduced in the presence of UBX domain protein 1 (UBXN1). BARD-1 contains a C3HC4-type RING-HC finger that binds BRCA1 at its N-terminus and three tandem ankyrin repeats and tandem BRCT repeat domains that bind CstF-50 (cleavage stimulation factor) to modulate mRNA processing and RNAP II stability in response to DNA damage at its C-terminus.	45
319411	cd16497	RING-HC_BAR	RING finger, HC subclass, found in bifunctional apoptosis regulator (BAR). BAR, also known as RING finger protein 47, was originally identified as an inhibitor of Bax-induced apoptosis. It participates in the block of apoptosis induced by TNF-family death receptors (extrinsic pathway) and mitochondria-dependent apoptosis (intrinsic pathway). BAR is predominantly expressed by neurons in the central nervous system and is involved in the regulation of neuronal survival. It is an endoplasmic reticulum (ER)-associated RING-type E3 ubiquitin ligase that interacts with BI-1 protein and post-translationally regulates its stability as well as functions in ER stress. BAR contains an N-terminal C3HC4-type RING-HC finger, a SAM domain, a coiled-coil domain, and a C-terminal transmembrane (TM) domain. This family corresponds to the RING-HC finger responsible for the binding of ubiquitin conjugating enzymes (E2s).	46
319412	cd16498	RING-HC_BRCA1	RING finger, HC subclass, found in breast cancer type 1 susceptibility protein (BRCA1) and similar proteins. BRCA1, also known as RING finger protein 53 (RNF53), is a RING finger protein encoded by tumor suppressor gene BRCA1 that regulates all DNA double-strand break (DSB) repair pathways. BRCA1 is frequently mutated in patients with hereditary breast and ovarian cancer (HBOC). Its mutation is also associated with an increased risk of pancreatic, stomach, laryngeal, fallopian tube, and prostate cancer. It plays an important role in the DNA damage response signaling and has been implicated in various cellular processes such as cell cycle regulation, transcriptional regulation, chromatin remodeling, DNA DSBs, and apoptosis. BRCA1 contains an N-terminal C3HC4-type RING-HC finger, and two BRCT (BRCA1 C-terminus domain) repeats at the C-terminus.	48
319413	cd16499	RING-HC_BRE1_like	RING finger, HC subclass, found in yeast Bre1 and its homologs from eukaryotes. Bre1 is an E3 ubiquitin-protein ligase that catalyzes monoubiquitination of histone H2B in concert with the E2 ubiquitin-conjugating enzyme, Rad6. The Rad6-Bre1-mediated histone H2B ubiquitylation modulates the formation of double-strand breaks (DSBs) during meiosis in yeast. It is also required, indirectly, for the methylation of histone 3 on lysine 4 (H3K4) and 79. RNF20, also known as BRE1A, and RNF40, also known as BRE1B, are the mammalian homologs of Bre1. They work together to form a heterodimeric Bre1 complex that facilitates the K120 monoubiquitination of histone H2B (H2Bub1), a DNA damage-induced histone modification that is crucial for recruitment of the chromatin remodeler SNF2h to DNA double-strand break (DSB) damage sites. Moreover, Bre1 complex acts as a tumor suppressor, augmenting expression of select tumor suppressor genes and suppressing select oncogenes. Deficiency in the mammalian histone H2B ubiquitin ligase Bre1 leads to replication stress and chromosomal instability. All family members contain a C3HC4-type RING-HC finger at the C-terminus.	42
319414	cd16500	RING-HC_CARP	RING finger, HC subclass, found in caspases-8 and -10-associated RING finger protein CARP-1, CARP-2 and similar proteins. The CARPs family includes CARP-1 and CARP-2 proteins, both of which are E3 ubiquitin ligases that ubiquitinate apical caspases and target them for proteasome-mediated degradation. As a novel group of caspase regulators with a FYVE-type zinc finger domain, they do not localize to membranes in the cell and are involved in the negative regulation of apoptosis, specifically targeting two initiator caspases, caspase 8, and caspase 10. Moreover, they stabilize MDM2 by inhibiting MDM2 self-ubiquitination, as well as by targeting 14-3-3sigma for degradation. They work together with MDM2 enhancing p53 degradation, thereby inhibiting p53-mediated cell death. CARPs contain an N-terminal FYVE-like domain that can serve as a membrane-targeting or endosome localizing signal and a C-terminal C3HC4-type RING-HC finger domain.	39
319415	cd16501	RING-HC_CblA_like	RING finger, HC subclass, found in Dictyostelium discoideum Cbl-like protein A (CblA) and similar proteins. CblA is a Dictyostelium homolog of the Cbl proteins which are multi-domain proteins acting as key negative regulators of various receptor and non-receptor tyrosine kinases signaling. CblA upregulates STATc tyrosine phosphorylation by downregulating PTP3, the protein tyrosine phosphatase responsible for dephosphorylating STATc. STATc is a signal transducer and activator of transcription protein. Like other Cbl proteins, CblA contains a tyrosine-kinase-binding domain (TKB), a proline-rich domain, a C3HC4-type RING-HC finger, and an ubiquitin-associated (UBA) domain. TKB, also known as a phosphotyrosine binding PTB domain, is composed of a four helix-bundle, a Ca2+ binding EF-hand and a highly variant SH2 domain. This family also includes Drosophila melanogaster defense repressor 1 (Dnr1) that was identified as an inhibitor of Dredd activity in the absence of a microbial insult in Drosophila S2 cells. It inhibits the Drosophila initiator caspases Dredd and Dronc. Moreover, Dnr1 acts as a negative regulator of the Imd (immune deficiency) innate immune-response pathway. Its mutations cause neurodegeneration in Drosophila by activating the innate immune response in the brain. Dnr1 contains a FERM N-terminal domain followed by a region rich in glutamine and serine residues, a central FERM domain, and a C-terminal C3HC4-type RING-HC finger.	37
319416	cd16502	RING-HC_Cbl_like	RING finger, HC subclass, found in Casitas  B-lineage lymphoma (Cbl) proteins. The Cbl adaptor proteins family contains a small class of RING-type E3 ubiquitin ligases with oncogenic activity, which is represented by three mammalian members, c-Cbl, Cbl-b and Cbl-c, as well as two invertebrate Cbl-family proteins, D-Cbl in Drosophila and Sli-1 in C. elegans. Cbl proteins function as potent negative regulators of various signaling cascades in a wide range of cell types. They play roles in ubiquitinating the activated tyrosine kinases and targeting them for degradation. D-Cbl associates with the Drosophila epidermal growth factor receptor (EGFR) and overexpression of D-Cbl in the eye of Drosophila embryos inhibits EGFR dependent photoreceptor cell development. Sli-1 is a negative regulator of the Let-23 receptor tyrosine kinase, an EGFR homolog, in vulva development. Cbl proteins in this family consist of a highly conserved N-terminal half that includes a tyrosine-kinase-binding domain (TKB, also known as the phosphotyrosine binding PTB domain, is composed of a four helix-bundle, a Ca2+ binding EF-hand and a highly variant SH2 domain) and a C3HC4-type RING-HC finger, both of which are required for Cbl-mediated downregulation of RTKs, and a divergent C-terminal region.	43
319417	cd16503	RING-HC_CHFR	RING finger, HC subclass, found in checkpoint with forkhead and RING finger domains protein (CHFR). CHFR, also known as RING finger protein 196 (RNF196), is a checkpoint protein that delays entry into mitosis in response to stress. It functions as an E3 ubiquitin ligase that ubiquitinates and degrades its target proteins, such as Aurora-A, Plk1, Kif22, and PARP-1, which are critical for proper mitotic transitions. It also plays an important role in cell cycle progression and tumor suppression, and is negatively regulated by SUMOylation-mediated proteasomal ubiquitylation. Moreover, CHFR is involved in the early stage of the DNA damage response, which mediates the crosstalk between ubiquitination and poly-ADP-ribosylation. CHFR contains a fork head associated- (FHA) and a C3HC4-type RING-HC finger.	44
319418	cd16504	RING-HC_COP1	RING finger, HC subclass, found in constitutive photomorphogenesis protein 1 (COP1) and similar proteins. COP1, also known as RING finger and WD repeat domain protein 2 (RFWD2) or RING finger protein 200 (RNF200), was defined as a central regulator of photomorphogenic development in plants, which targets key transcription factors for proteasome-dependent degradation. It is localized predominantly in the nucleus, but may also be present in the cytosol. Mammalian COP1 functions as an E3 ubiquitin-protein ligase that interacts with Jun transcription factors and modulates their transcriptional activity. It also interacts with and negatively regulates the tumor-suppressor protein p53. Moreover, COP1 associates with COP9 signalosome subunit 6 (CSN6), and is involved in 14-3-3 delta ubiquitin-mediated degradation. The CSN6-COP1 link enhances ubiquitin-mediated degradation of p27(Kip1), a critical CDK inhibitor involved in cell cycle regulation, to promote cancer cell growth. Furthermore, COP1 functions as the negative regulator of ETV1 and influences prognosis in triple-negative breast cancer. COP1 contains an N-terminal extension, a C3HC4-type RING-HC finger, a coiled coil domain, and seven WD40 repeats. In human COP1, a classic leucine-rich NES, and a novel bipartite NLS is bridged by the RING-HC finger.	46
319419	cd16505	RING-HC_CYHR1	RING finger, HC subclass, found in cysteine and histidine-rich protein 1 (CYHR1) and similar proteins. CYHR1, also known as cysteine/histidine-rich protein (Chrp), shows sequence similarity with the Drosophila RING finger protein Seven-in-Absentia (sina) and its murine and human siah homologs. It is a novel prognostic marker that may work as a therapeutic target in patients with esophageal squamous cell carcinoma. It is also a biomarker of the response to erythropoietin in hemodialysis patients. CYHR1 contains an N-terminal C3HC4-type RING-HC finger and a C-terminal tumor necrosis factor (TNF) receptor associated factor (TRAF)-like substrate-binding domain (SBD).	49
319420	cd16506	RING-HC_DTX3_like	RING finger, HC subclass, found in E3 ubiquitin-protein ligase Deltex3 (DTX3), Deltex-3-like (DTX3L) and similar proteins. The family contains Deltex3 (DTX3) and Deltex-3-like (DTX3L), both of which are E3 ubiquitin-protein ligases belonging to the Deltex (DTX) family. DTX3, also known as RING finger protein 154 (RNF154), has a biological function that remains unclear. DTX3L, also known as B-lymphoma- and BAL-associated protein (BBAP) or Rhysin-2 (Rhysin2), regulates endosomal sorting of the G protein-coupled receptor CXCR4 from endosomes to lysosomes. It also regulates subcellular localization of its partner protein, B aggressive lymphoma (BAL), by a dynamic nucleocytoplasmic trafficking mechanism. In contrast to other DTXs, both DTX3 and DTX3L contain a C3HC4-type RING-HC finger, and a previously unidentified C-terminal domain. DTX3L can associate with DTX1 through its unique N termini and further enhance self-ubiquitination.	41
319421	cd16507	RING-HC_GEFO_like	RING finger, HC subclass, found in Dictyostelium discoideum Ras guanine nucleotide exchange factor O (RasGEFO) and similar proteins. RasGEFO, also known as RasGEF domain-containing protein O, is one of the Ras guanine-nucleotide exchange factors (RasGEFs), which are the proteins that activate Ras through catalyzing the replacement of GDP with GTP. They are particularly important for signaling in development and chemotaxis in many organisms, including Dictyostelium. RasGEFO contains a C3HC4-type RING-HC finger that may be responsible for the E3 ubiquitin ligase activity.	40
319422	cd16508	RING-HC_HAKAI_like	RING finger, HC subclass, found in E3 ubiquitin-protein ligase Hakai, zinc finger protein 645 (ZNF645), and similar proteins. Hakai, also known as Casitas B-lineage lymphoma-transforming sequence-like protein 1, RING finger protein 188 (RNF188), or c-Cbl-like protein 1 (CBLL1), is an E3 ubiquitin ligase that disrupts cell-cell contacts in epithelial cells and is upregulated in human colon and gastric adenocarcinomas. It was identified to mediate the posttranslational downregulation of E-cadherin (CDH1), a major component of adherens junctions in epithelial cells and a potent tumor suppressor. It also promotes ubiquitination of several other tyrosine-phosphorylated Src substrates, including cortactin (CTTN) and DOK1. Hakai acts as a homodimer with a novel HYB (Hakai pTyr-binding) domain that forms a phosphotyrosine-binding pocket upon dimerization, and consists of a pair of monomers arranged in an anti-parallel configuration. Each monomer contains a C3HC4-type RING-HC finger and a short pTyr-B domain that incorporates a novel, atypical C2H2-type Zn-finger coordination motif. Both domains are important for dimerization. ZNF645 is a novel testis-specific E3 ubiquitin-protein ligase that plays a role in sperm production and quality control. It has a structure similar to that of the c-Cbl-like protein Hakai. In contrast to Hakai, its HYB domain demonstrates different target specificities. It interacts with v-Src-phosphorylated E-cadherin, but not with cortactin.	38
319423	cd16509	RING-HC_HLTF	RING finger, HC subclass, found in helicase-like transcription factor (HLTF) and similar proteins. HLTF, also known as DNA-binding protein/plasminogen activator inhibitor 1 regulator, or HIP116, or RING finger protein 80, or SWI/SNF-related matrix-associated actin-dependent regulator of chromatin subfamily A member 3, or sucrose nonfermenting protein 2-like 3, is a yeast RAD5 homolog found in mammals. It has both E3 ubiquitin ligase and DNA helicase activities, and plays a pivotal role in the template-switching pathway of DNA damage tolerance. It is involved in Lys-63-linked poly-ubiquitination of proliferating cell nuclear antigen (PCNA) at Lys-164 and in the regulation of DNA damage tolerance. It shows double-stranded DNA translocase activity with 3'-5' polarity, thereby facilitating regression of the replication fork. HLTF contains an N-terminal HIRAN (HIP116 and RAD5 N-terminal) domain, a SWI/SNF helicase domain that is divided into N- and C-terminal parts by an insertion of a C3HC4-type RING-HC finger involved in the poly-ubiquitination of PCNA.	43
319424	cd16510	RING-HC_IAPs	RING finger, HC subclass, found in inhibitor of apoptosis proteins (IAPs). IAPs are frequently overexpressed in cancer and associated with tumor cell survival, chemoresistance, disease progression, and poor prognosis. They function primarily as negative regulators of cell death. They regulate caspases and apoptosis through the inhibition of specific members of the caspase family of cysteine proteases. In addition, IAPs have been implicated in a multitude of other cellular processes, including inflammatory signalling and immunity, mitogenic kinase signalling, proliferation and mitosis, as well as cell invasion and metastasis. IAPs in this family include cellular inhibitor of apoptosis protein c-IAP1 (BIRC2) and c-IAP2 (BIRC3), XIAP (BIRC4), BIRC7, and BIRC8, all of which contain three N-terminal baculoviral IAP repeat (BIR) domains that enable interactions with proteins, a ubiquitin-association (UBA) domain that is responsible for binding polyubiquitin (polyUb), and a C3HC4-type RING-HC finger at the carboxyl terminus that is required for ubiquitin ligase activity. The UBA domain is only absent in mammalian homologs of BIRC7. Moreover, c-IAPs contain an additional caspase activation and recruitment domain (CARD) between UBA and C3HC4-type RING-HC domains. CARD domain may serve as a protein interaction surface.	39
319425	cd16511	vRING-HC_IRF2BP1_like	variant of RING finger, HC subclass, found in interferon regulatory factor 2-binding protein IRF-2BP1, IRF-2BP2, and similar proteins. The family includes IRF-2BP1, IRF-2BP2, and their homolog, IRF-2BP-like, also known as IRF-2BPL or C14orf4. IRF-2BP1 and IRF-2BP2 are nuclear proteins that bind to the C-terminal repression domain of IRF-2 and act as IRF-2-dependent transcriptional corepressors of both enhancer-activated and basal transcription. IRF-2BPL is expressed in the mediobasal hypothalamus and plays a critical function in regulating the female reproductive neuroendocrine axis. All family members contain a C-terminal C3HC4-type RING-HC finger with a partially new pattern.	56
319426	cd16512	RING-HC_LNX3_like	RING finger, HC subclass, found in ligand of Numb protein LNX3, LNX4, and similar proteins. The ligand of Numb protein X (LNX) family, also known as PDZ and RING (PDZRN) family, includes LNX1-5, which can interact with Numb, a key regulator of neurogenesis and neuronal differentiation. LNX5 (also known as PDZK4, or PDZRN4L) shows high sequence homology to LNX3 and LNX4, but it lacks the RING domain. LNX1-4 proteins function as E3 ubiquitin ligases and have a unique domain architecture consisting of an N-terminal RING-HC finger for E3 ubiquitin ligase activity and either two or four PDZ domains necessary for substrate binding. This family corresponds to LNX3/LNX4-like proteins, which contain a typical C3HC4-type RING-HC finger and two PDZ domains.	41
319427	cd16513	RING1-HC_LONFs	RING finger 1, HC subclass, found in the LON peptidase N-terminal domain and RING finger proteins family. The LON peptidase N-terminal domain and RING finger proteins family includes LONRF1 (also known as RING finger protein 191 or RNF191), LONRF2 (also known as RING finger protein 192 or RNF192, or neuroblastoma apoptosis-related protease), LONRF3 (also known as RING finger protein 127 or RNF127), which are characterized by containing two C3HC4-type RING-HC fingers, four tetratricopeptide (TPR) repeats, and one N-terminal domain of the ATP-dependent protease La (LON) domain at the C-terminus. Their biological functions remain unclear. This family corresponds to the first RING-HC finger.	42
319428	cd16514	RING2-HC_LONFs	RING finger 2, HC subclass, found in the LON peptidase N-terminal domain and RING finger proteins family. The LON peptidase N-terminal domain and RING finger proteins family includes LONRF1 (also known as RING finger protein 191 or RNF191), LONRF2 (also known as RING finger protein 192 or RNF192, or neuroblastoma apoptosis-related protease), LONRF3 (also known as RING finger protein 127 or RNF127), which are characterized by containing two C3HC4-type RING-HC fingers, four tetratricopeptide (TPR) repeats, and one N-terminal domain of the ATP-dependent protease La (LON) domain at the C-terminus. Their biological functions remain unclear. This family corresponds to the second RING-HC finger.	42
319429	cd16515	RING-HC_LRSAM1	RING finger, HC subclass, found in leucine-rich repeat and sterile alpha motif-containing protein 1 (LRSAM1) and similar proteins. LRSAM1, also known as Tsg101-associated ligase (TAL), or RIFLE, is an E3 ubiquitin-protein ligase that physically associates with, and selectively ubiquitylates, Tsg101, an E2-like molecule that regulates vesicular trafficking processes in yeast and mammals. It regulates a Tsg101-associated complex responsible for the sorting of cargo into cytoplasm-containing vesicles that bud at the multivesicular body and at the plasma membrane. LRSAM1 is a multidomain protein containing an N-terminal leucine-rich repeat (LRR), followed by several recognizable motifs, including an ezrin-radixin-moesin (ERM) domain, a coiled-coil (CC) region, a SAM domain, and a C-terminal C3HC4-type RING-HC finger domain.	40
319430	cd16516	RING-HC_malin	RING finger, HC subclass, found in malin and similar proteins. Malin ("mal" for seizure in French), also known as NHL repeat-containing protein 1 (NHLRC1), or EPM2B, is a nuclear E3 ubiquitin-protein ligase that ubiquitinates and promotes the degradation of laforin (EPM2A encoding protein phosphatase). Malin and laforin operate as a functional complex that plays key roles in regulating cellular functions such as glycogen metabolism, unfolded cellular stress response, and proteolytic processes. They act as pro-survival factors that negatively regulate the Hipk2-p53 cell death pathway. They also negatively regulate cellular glucose uptake by preventing plasma membrane targeting of glucose transporters. Moreover, they degrade polyglucosan bodies in concert with glycogen debranching enzyme and brain isoform glycogen phosphorylase. Furthermore, they, together with Hsp70, form a new functional complex that suppresses the cellular toxicity of misfolded proteins by promoting their degradation through the ubiquitin-proteasome system. Defects in either malin or laforin may cause Lafora disease (LD), a fatal form of teenage-onset autosomal recessive progressive myoclonus epilepsy. In addition, malin may have a function independent of laforin in lysosomal biogenesis and/or lysosomal glycogen disposal. Malin contains six NHL-repeat protein-protein interaction domains and a C3HC4-type RING-HC finger.	48
319431	cd16517	RING-HC_MAT1	RING finger, HC subclass, found in RING finger protein MAT1. MAT1, also known as CDK-activating kinase assembly factor MAT1, CDK7/cyclin-H assembly factor, cyclin-G1-interacting protein, menage a trois, RING finger protein 66 (RNF66), p35, or p36, is involved in cell cycle control and in RNA transcription by RNA polymerase II. It associates primarily with the catalytic subunit cyclin-dependent kinase 7 (CDK7) and the regulatory subunit cyclin H to form the CDK-activating kinase (CAK) complex that can further associate with the core-TFIIH to form the transcription factor IIH (TFIIH) basal transcription/DNA repair factor, which activates RNA polymerase II by serine phosphorylation of the repetitive C-terminal domain (CTD) of its large subunit (POLR2A), allowing its escape from the promoter and elongation of the transcripts. MAT1 contains an N-terminal C3HC4-type RING-HC finger, a central coiled coil domain, and a C-terminal domain rich in hydrophobic residues.	49
319432	cd16518	RING-HC_MEX3	RING finger, HC subclass, found in RNA-binding proteins of the evolutionarily-conserved MEX-3 family. The family includes MEX-3 phosphoproteins that have been found in vertebrates. They are mediators of post-transcriptional regulation in different organisms, and have been implicated in many core biological processes, including embryonic development, epithelial homeostasis, immune responses, metabolism, and cancer. They contain two K homology (KH) domains that provide RNA-binding capacity, and a C-terminal C3HC4-type RING-HC finger. They shuttle between the nucleus and the cytoplasm via the CRM1-dependent export pathway. The RNA-binding protein MEX-3 from nematode Caenorhabditis elegans is the founding member of the MEX-3 family. Due to the lack of RING-HC finger, it is not included here.	41
319433	cd16519	RING-HC_MIBs	RING finger, HC subclass, found in mind bomb MIB1, MIB2, and similar proteins. MIBs are large, multi-domain E3 ubiquitin-protein ligases that promote ubiquitination of the cytoplasmic tails of Notch ligands. They are also responsible for TBK1 K63-linked ubiquitination and activation, promoting interferon production and controlling antiviral immunity. Moreover, MIBs selectively control responses to cytosolic RNA and regulate type I interferon transcription. Both MIB1 and MIB2 have similar domain architectures, which consist of two Mib-Herc2 domains flanking a ZZ zinc finger, a REP region including two tandem Mib repeats, an ANK region that spans ankyrin repeats, and a RNG region, where MIB1 and MIB2 contain three and two C3HC4-type RING-HC fingers, respectively. This family corresponds to the first RING-HC finger of MIB1 and MIB2, as well as the second RING-HC finger of MIB1.	37
319434	cd16520	RING-HC_MIBs_like	RING finger, HC subclass, found in mind bomb MIB1, MIB2, RGLG1, RGLG2, and similar proteins. MIBs are large, multi-domain E3 ubiquitin-protein ligases that promote ubiquitination of the cytoplasmic tails of Notch ligands. They are also responsible for TBK1 K63-linked ubiquitination and activation, promoting interferon production and controlling antiviral immunity. Moreover, MIBs selectively control responses to cytosolic RNA and regulate type I interferon transcription. Both MIB1 and MIB2 have similar domain architectures, which consist of two Mib-Herc2 domains flanking a ZZ zinc finger, a REP region including two tandem Mib repeats, an ANK region that spans ankyrin repeats, and a RNG region, where MIB1 and MIB2 contain three and two C3HC4-type RING-HC fingers, respectively. This family corresponds to the third RING-HC finger of MIB1, as well as the second RING-HC finger of MIB2. In addition to MIB1 and MIB2, RING domain ligase RGLG1, RGLG2 and similar proteins from plant have also been included in this family. RGLG1 is a ubiquitously expressed E3 ubiquitin-protein ligase that interacts with UBC13 and, together with UBC13, catalyzes the formation of K63-linked polyubiquitin chains, which is involved in DNA damage repair. RGLG1 mediates the formation of canonical, K48-linked polyubiquitin chains that target proteins for degradation. It also regulates apical dominance by acting on the auxin transport proteins abundance. RGLG1 has overlapping functions with its closest sequelog, RGLG2. They both function as RING E3 ligases that interact with ethylene response factor 53 (ERF53) in the nucleus and negatively regulate the plant drought stress response. All RGLG proteins contain a Von Willebrand factor type A (vWA) domain and a C3HC4-type RING-HC finger.	38
319435	cd16521	RING-HC_MKRN	RING finger, HC subclass, found in the makorin (MKRN) protein family. The MKRN protein family includes the ribonucleoproteins that are characterized by a variety of zinc-finger motifs, including typical arrays of one to four C3H1-type zinc fingers and a C3HC4-type RING-HC finger. Another motif rich in Cys and His residues (CH), with so far unknown function, is also generally present in MKRN proteins. MKRN proteins may have E3 ubiquitin ligase activity.	51
319436	cd16522	RING-HC_MSL2	RING finger found in Drosophila melanogaster male-specific lethal-2 (MSL2) and similar proteins. MSL2, also known as RING finger protein 184 (RNF184), is a putative DNA-binding protein required for X chromosome dosage compensation in Drosophila males. Its expression is sex specifically regulated by Sex-lethal. Drosophila dosage compensation proteins MOF, MSL1, MSL2, and MSL3 are essential for elevating transcription of the single X chromosome in the male (X chromosome dosage compensation). MSL2 plays a critical role in translation and/or stability of MSL1 in males. In complex with MSL1, it acts as an E3 ubiquitin ligase that promotes ubiquitination of histone H2B. MSL2 contains a C3HC4-type RING-HC finger and a metallothionein-like domain with eight conserved and two non-conserved cysteines, as well as a positively and a negatively charged amino acid residue cluster and a coiled coil domain that may be involved in protein-protein interactions. This family also includes many male-specific lethal-2 homologs from bilaterians.	45
319437	cd16523	RING-HC_MYLIP	RING finger, HC subclass, found in myosin regulatory light chain interacting protein (MYLIP) and similar proteins. MYLIP, also known as inducible degrader of the low-density lipoprotein (LDL)-receptor (IDOL), or MIR, is an E3 ubiquitin-protein ligase that mediates ubiquitination and subsequent proteasomal degradation of myosin regulatory light chain (MRLC), LDLR, VLDLR, and LRP8. Its activity depends on E2 ubiquitin-conjugating enzymes of the UBE2D family. MYLIP stimulates clathrin-independent endocytosis and acts as a sterol-dependent inhibitor of cellular cholesterol uptake by binding directly to the cytoplasmic tail of the LDLR and promoting its ubiquitination via the UBE2D1/E1 complex. The ubiquitinated LDLR then enters the multivesicular body (MVB) protein-sorting pathway and is shuttled to the lysosome for degradation. Moreover, MYLIP has been identified as a novel ERM-like protein that affects cytoskeleton interactions regulating cell motility, such as neurite outgrowth. The ERM proteins include ezrin, radixin, and moesin, which are cytoskeletal effector proteins linking actin to membrane-bound proteins at the cell surface. MYLIP contains an ERM-homology domain and a C-terminal C3HC4-type RING-HC finger.	38
319438	cd16524	RING-HC_NHL-1_like	RING finger, HC subclass, found in Caenorhabditis elegans RING finger protein NHL-1 and similar proteins. NHL-1 functions as an E3 ubiquitin-protein ligase in the presence of both UBC-13 and UBC-1 within the ubiquitin pathway of Caenorhabditis elegans. It acts in chemosensory neurons to promote stress resistance in distal tissues by the transcription factor DAF-16 activation but is dispensable for the activation of heat shock factor 1 (HSF-1). NHL-1 belongs to the TRIM (tripartite motif)-NHL family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil domain, as well as a NHL (named after proteins NCL-1, HT2A and Lin-41 that contain repeats folded into a six-bladed beta propeller) repeat domain positioned C-terminal to the RBCC domain.	45
319439	cd16525	RING-HC_PCGF	RING finger, HC subclass, found in Polycomb Group RING finger homologs (PCGF1, 2, 3, 4, 5 and 6), and similar proteins. The family includes six Polycomb Group (PcG) RING finger homologs (PCGF1/NSPc1, PCGF2/Mel-18, PCGF3, PCGF4/BMI1, PCGF5, and PCGF6/MBLR) that use epigenetic mechanisms to maintain or repress expression of their target genes. They were first discovered in fruit flies that can remodel chromatin such that epigenetic silencing of genes takes place, and are well known for silencing Hox genes through modulation of chromatin structure during embryonic development in fruit flies. PCGF homologs play important roles in cell proliferation, differentiation, and tumorigenesis. They all have been found to associate with ring finger protein 2 (RNF2). The RNF2-PCGF heterodimer is catalytically competent as an E3 ubiquitin transferase and is the scaffold for the assembly of additional components. Moreover, PCGF homologs are critical components in the assembly of distinct Polycomb Repression Complex 1 (PRC1) related complexes, which are involved in the maintenance of gene repression and target different genes through distinct mechanisms. The Drosophila PRC1 core complex is formed by the Polycomb (Pc), Polyhomeotic (Ph), Posterior sex combs (Psc), and Sex combs extra (Sce, also known as Ring) subunits. In mammals, the composition of PRC1 is much more diverse and varies depending on the cellular context. All PRC1 complexes contain homologs of the Drosophila Ring protein. Ring1A/RNF1 and Ring1B/RNF2 are E3 ubiquitin ligases that mark lysine 119 of histone H2A with a single ubiquitin group (H2AK119ub). Mammalian homologs of the Drosophila Psc protein, such as PCGF2/Mel-18 or PCGF4/BMI1, regulate PRC1 enzymatic activity. PRC1 complexes can be divided into at least two classes according to the presence or absence of CBX proteins, which are homologs of Drosophila Pc. Canonical PRC1 complexes contain CBX proteins that recognize and bind H3K27me3, the mark deposited by PRC2. Therefore, canonical PRC1 complexes and PRC2 can act together to repress gene transcription and maintain this repression through cell division. Non-canonical PRC1 complexes, containing RYBP (together with additional proteins, such as L3mbtl2 or Kdm2b) rather than the CBX proteins have recently been described in mammals. PCGF homologs contain a C3HC4-type RING-HC finger.	42
319440	cd16526	RING-HC_PEX2	RING finger, HC subclass, found in peroxin-2 (PEX2) and similar proteins. PEX2, also known as peroxisome biogenesis factor 2, 35 kDa peroxisomal membrane protein, peroxisomal membrane protein 3, peroxisome assembly factor 1 (PAF-1), or RING finger protein 72 (RNF72), is an integral peroxisomal membrane protein with two transmembrane regions and a C3HC4-type RING-HC finger within its cytoplasmically exposed C-terminus. It may be involved in the biogenesis of peroxisomes, as well as in peroxisomal matrix protein import. Mutations in the PEX2 gene are the primary defect in a subset of patients with Zellweger syndrome and related peroxisome biogenesis disorders. Moreover, PEX2 functions as an E3-ubiquitin ligase that mediates the UBC4-dependent polyubiquitination of PEX5, a key player in peroxisomal matrix protein import, to control PEX5 receptor recycling or degradation.	43
319441	cd16527	RING-HC_PEX10	RING finger, HC subclass, found in peroxin-10 (PEX10) and similar proteins. PEX10, also known as peroxisome biogenesis factor 10, peroxisomal biogenesis factor 10, peroxisome assembly protein 10, or RING finger protein 69 (RNF69), is an integral peroxisomal membrane protein with two transmembrane regions and a C3HC4-type RING-HC finger within its cytoplasmically exposed C-terminus. It plays an essential role in peroxisome assembly, import of target substrates, and recycling or degradation of protein complexes and amino acids. It is an essential component of the spinal locomotor circuit, and thus its mutations may be involved in peroxisomal biogenesis disorders (PBD). Mutations in human PEX10 also result in autosomal recessive ataxia. Moreover, PEX10 functions as an E3-ubiquitin ligase with an E2, UBCH5C. It mono- or poly-ubiquitinates PEX5, a key player in peroxisomal matrix protein import, in a UBC4-dependent manner, to control PEX5 receptor recycling or degradation. It also links the E2 ubiquitin conjugating enzyme PEX4 to the protein import machinery of the peroxisome.	40
319442	cd16528	RING-HC_prokRING	RING finger, HC subclass, found in prokaryotic RING finger family proteins. The family corresponds to a group of uncharacterized prokaryotic C3HC4-type RING-HC finger containing proteins. The RING finger is fused to an N-terminal alpha-helical domain, ROT/Trove-like repeats, and a C-terminal TerD domain, suggesting a possible role in an RNA-processing complex.	39
319443	cd16529	RING-HC_RAD18	RING finger, HC subclass, found in postreplication repair protein RAD18 and similar proteins. RAD18, also known as HR18 or RING finger protein 73 (RNF73), is an E3 ubiquitin-protein ligase involved in post replication repair of UV-damaged DNA via its recruitment to stalled replication forks. It associates to the E2 ubiquitin conjugating enzyme UBE2B to form the UBE2B-RAD18 ubiquitin ligase complex involved in mono-ubiquitination of DNA-associated PCNA on K164. It also interacts with another E2 ubiquitin conjugating enzyme RAD6 to form a complex that monoubiquitinates proliferating cell nuclear antigen at stalled replication forks in DNA translesion synthesis. Moreover, Rad18 is a key factor in double-strand break DNA damage response (DDR) pathways via its association with K63-linked polyubiquitylated chromatin proteins. It can function as a mediator for DNA damage response signals to activate the G2/M checkpoint in order to maintain genome integrity and cell survival after ionizing radiation (IR) exposure. RAD18 contains a C3HC4-type RING-HC finger, a ubiquitin-binding zinc finger domain (UBZ), a SAP (SAF-A/B, Acinus and PIAS) domain, and a RAD6-binding domain (R6BD).	42
319444	cd16530	RING-HC_RAG1	RING finger, HC subclass, found in recombination activating gene-1 (RAG-1) and similar proteins. RAG-1, also known as V(D)J recombination-activating protein 1, RING finger protein 74 (RNF74), or endonuclease RAG1, is the catalytic component of the RAG complex, a multiprotein complex that mediates the DNA cleavage phase during V(D)J recombination. RAG1 is the lymphoid-specific factor that mediates the DNA-binding to the conserved recombination signal sequences (RSS) and catalyzes the DNA cleavage activities by introducing a double-strand break between the RSS and the adjacent coding segment. It also functions as an E3 ubiquitin-protein ligase that mediates monoubiquitination of histone H3, which is required for the joining step of V(D)J recombination. RAG-1 contains an N-terminal C3HC4-type RING-HC finger that mediates monoubiquitylation of Histone H3, an adjacent C2H2-type zinc finger, and a nonamer binding (NBD) DNA-binding domain.	46
319445	cd16531	RING-HC_RING1_like	RING finger, HC subclass, found in really interesting new gene proteins RING1, RING2 and similar proteins. RING1, also known as polycomb complex protein RING1, RING finger protein 1 (RNF1), or RING finger protein 1A (RING1A), was identified as a transcriptional repressor that is associated with the Polycomb group (PcG) protein complex involved in stable repression of gene activity. RING2, also known as huntingtin-interacting protein 2-interacting protein 3, HIP2-interacting protein 3, protein DinG, RING finger protein 1B (RING1B), RING finger protein 2 (RNF2), or RING finger protein BAP-1, is an E3 ubiquitin-protein ligase that interacts with both nucleosomal DNA and an acidic patch on histone H4 to achieve the specific monoubiquitination of K119 on histone H2A (H2AK119ub), thereby playing a central role in histone code and gene regulation. Both RING1 and RING2 are core components of polycomb repressive complex 1 (PRC1) that functions as an E3-ubiquitin ligase transferring the mono-ubiquitin mark to the C-terminal tail of Histone H2A at K118/K119. PRC1 is also capable of chromatin compaction, a function not requiring histone tails, and this activity appears important in gene silencing. RING2 acts as the main E3 ubiquitin ligase on histone H2A of the PRC1 complex, while RING1 may rather act as a modulator of RNF2/RING2 activity. Members in this family contain a C3HC4-type RING-HC finger.	41
319446	cd16532	RING-HC_RNFT1_like	RING finger, HC subclass, found in RING finger and transmembrane domain-containing protein RNFT1, RNFT2, and similar proteins. Both RNFT1 and RNFT2 are multi-pass membrane proteins containing a C3HC4-type RING-HC finger. Their biological roles remain unclear.	40
319447	cd16533	RING-HC_RNF4	RING finger, HC subclass, found in RING finger protein 4 (RNF4) and similar proteins. RNF4, also known as small nuclear ring finger protein (SNURF), is a SUMO-targeted E3 ubiquitin-protein ligase with a pivotal function in the DNA damage response (DDR) through interacting with the deubiquitinating enzyme ubiquitin-specific protease 11 (USP11), a known DDR-component, and further facilitating DNA repair. It plays a novel role in preventing the loss of intact chromosomes and ensures the maintenance of chromosome integrity. Moreover, RNF4 is responsible for the UbcH5A-catalyzed formation of K48 chains that target SUMO-modified promyelocytic leukemia (PML) protein for proteasomal degradation in response to arsenic treatment. It also interacts with telomeric repeat binding factor 2 (TRF2) in a small ubiquitin-like modifiers (SUMO)-dependent manner and preferentially targets SUMO-conjugated TRF2 for ubiquitination through SUMO-interacting motifs (SIMs). Furthermore, RNF4 can form a complex with a Ubc13-ubiquitin conjugate and Ube2V2. It catalyzes K63-linked polyubiquitination by the Ube2V2-Ubc13 (ubiquitin-loaded) complex. Meanwhile, RNF4 negatively regulates nuclear factor kappa B (NF-kappaB) signaling by down-regulating transforming growth factor beta (TGF-beta)-activated kinase 1 (TAK1)-TAK1-binding protein2 (TAB2). RNF4 contains four SIMs followed by a C3HC4-type RING-HC finger at the C-terminus.	54
319448	cd16534	RING-HC_RNF5_like	RING finger, HC subclass, found in RING finger protein RNF5, RNF185 and similar proteins. RNF5 and RNF185 are E3 ubiquitin-protein ligases that are anchored to the outer membrane of the endoplasmic reticulum (ER). RNF5 acts at early stages of cystic fibrosis (CF) transmembrane conductance regulator (CFTR) biosynthesis, and functions as a target for therapeutic modalities to antagonize mutant CFTR proteins in CF patients carrying the F508del allele. RNF185 controls the degradation of CFTR and CFTR F508del allele in a RING- and proteasome-dependent manner, but does not control that of other classical endoplasmic reticulum-associated degradation (ERAD) model substrates. Moreover, both RNF5 and RNF185 play important roles in cell adhesion and migration through the modulation of cell migration by ubiquitinating paxillin. Arabidopsis thaliana RING membrane-anchor proteins (AtRMAs) are also included in this family. They possess E3 ubiquitin-protein ligase activity and may play a role in the growth and development of Arabidopsis. All members in this family contain a C3HC4-type RING-HC finger.	43
319449	cd16535	RING-HC_RNF8	RING finger, HC subclass, found in RING finger protein 8 (RNF8) and similar proteins. RNF8 is a telomere-associated E3 ubiquitin-protein ligase that plays an important role in DNA double-strand break (DSB) repair via histone ubiquitination. It is localized in the nucleus and interacts with class III E2s (UBE2E2, UbcH6, and UBE2E3), but not with other E2s (UbcH5, UbcH7, UbcH10, hCdc34, and hBendless). It recruits UBC13 for lysine 63-based self polyubiquitylation. Its deficiency causes neuronal pathology and cognitive decline, and its loss results in neuron degeneration. RNF8, together with RNF168, catalyzes a series of ubiquitylation events on substrates such as H2A and H2AX, with the H2AK13/15 ubiquitylation being particularly important for recruitment of repair factors p53-binding protein 1 (53BP1) or the RAP80-BRCA1 complex to sites of DSBs. Specially, RNF8 mediates the ubiquitination of gammaH2AX, and recruits 53BP1 and BRCA1 to DNA damage sites which promotes DNA damage response (DDR) and inhibits chromosomal instability. Moreover, RNF8 interacts with retinoid X receptor alpha (RXR alpha) and enhances its transcription-stimulating activity. It also regulates the rate of exit from mitosis and cytokinesis. RNF8 contains an N-terminal forkhead-associated (FHA) domain and a C-terminal C3HC4-type RING-HC finger.	42
319450	cd16536	RING-HC_RNF10	RING finger, HC subclass, found in RING finger protein 10 (RNF10) and similar proteins. RNF10 is an E3 ubiquitin-protein ligase that interacts with mesenchyme Homeobox 2 (MEOX2) transcription factor, a regulator of the proliferation, differentiation and migration of vascular smooth muscle cells and cardiomyocytes, and enhances Meox2 activation of the p21 promoter. It also regulates the expression of myelin-associated glycoprotein (MAG) genes and is required for myelin production in Schwann cells of peripheral nervous system. Moreover, RNF10 regulates retinoic acid-induced neuronal differentiation and the cell cycle exit of P19 embryonic carcinoma cells. RNF10 contains a C3HC4-type RING-HC finger and three putative nuclear localization signals.	43
319451	cd16537	RING-HC_RNF37	RING finger, HC subclass, found in RING finger protein 37 (RNF37). RNF37, also known as KIAA0860, U-box domain-containing protein 5 (UBOX5), UbcM4-interacting protein 5 (UIP5), or ubiquitin-conjugating enzyme 7-interacting protein 5, is an E3 ubiquitin-protein ligase found exclusively in the nucleus as part of a nuclear dot-like structure. It interacts with the molecular chaperone VCP/p97 protein. RNF37 contains a U-box domain followed by a potential nuclear location signal (NLS) and a C-terminal C3HC4-type RING-HC finger. The U-box domain is a modified RING finger domain that lacks the hallmark metal-chelating cysteines and histidines of the latter, and is likely to adopt a RING finger-like conformation. The presence of the U-box, but not of the RING finger, is required for the E3 activity. The U-box domain can directly interact with several E2 enzymes, including UbcM2, UbcM3, UbcM4, UbcH5, and UbcH8, suggesting a similar function as the RING finger in the ubiquitination pathway. This family corresponds to the RING-HC finger.	47
319452	cd16538	RING-HC_RNF112	RING finger, HC subclass, found in RING finger protein 112 (RNF112) and similar proteins. RNF112, also known as brain finger protein (BFP), zinc finger protein 179 (ZNF179), or neurolastin, is a peripheral membrane protein that is predominantly expressed in the central nervous system and localizes to endosomes. It contains functional GTPase and C3HC4-type RING-HC finger domains and has been identified as a brain-specific dynamin family GTPase that affects endosome size and spine density. Moreover, RNF112 acts as a downstream target of sigma-1 receptor (Sig-1R) regulation and may play a novel role in neuroprotection by mediating the neuroprotective effects of dehydroepiandrosterone (DHEA) and its sulfated analog (DHEAS).	48
319453	cd16539	RING-HC_RNF113A_B	RING finger, HC subclass, found in RING finger proteins RNF113A, RNF113B, and similar proteins. RNF113A, also known as zinc finger protein 183 (ZNF183), is an E3 ubiquitin-protein ligase that physically interacts with the E2 protein, UBE2U. A nonsense mutation in RNF113A is associated with an X-linked trichothiodystrophy (TTD). Its yeast ortholog Cwc24p is predicted to have a spliceosome function and acts in a complex with Cef1p to participate in pre-U3 snoRNA splicing, indirectly affecting pre-rRNA processing. It is also important for the U2 snRNP binding to primary transcripts and co-migrates with spliceosomes. Moreover, the ortholog of RNF113A in fruit flies may also act as a spliceosome and is hypothesized to be involved in splicing, namely within the central nervous system. The ortholog in Caenorhabditis elegans is involved in DNA repair of inter-strand crosslinks. RNF113B, also known as zinc finger protein 183-like 1, shows high sequence similarity with RNF113A. Both RNF113A and RNF113B contain a CCCH-type zinc finger, which is commonly found in RNA-binding proteins involved in splicing, and a C3HC4-type RING-HC finger, which is frequently found in E3 ubiquitin ligases.	41
319454	cd16540	RING-HC_RNF114	RING finger, HC subclass, found in RING finger protein 114 (RNF114) and similar proteins. RNF114, also known as zinc finger protein 228 (ZNF228) or zinc finger protein 313 (ZNF313), is a p21(WAF1)-targeting ubiquitin E3 ligase that interacts with X-linked inhibitor of apoptosis (XIAP)-associated factor 1 (XAF1) and may play a role in p53-mediated cell-fate decisions. It is involved in immune response to double-stranded RNA in disease pathogenesis. Moreover, RNF114 interacts with A20 and modulates its ubiquitylation. It negatively regulates nuclear factor-kappaB (NF-kappaB)-dependent transcription and positively regulates T-cell activation. RNF114 may play a putative role in the regulation of immune responses, since it corresponds to a novel psoriasis susceptibility gene, ZNF313. RNF114, together with three closely related proteins: RNF125, RNF138 and RNF166, forms a novel family of ubiquitin ligases with a C3HC4-type RING-HC finger, a C2HC-, and two C2H2-type zinc fingers, as well as a ubiquitin interacting motif (UIM).	42
319455	cd16541	RING-HC_RNF123	RING finger, HC subclass, found in RING finger protein 123 (RNF123) and similar proteins. RNF123, also known as Kip1 ubiquitination-promoting complex protein 1 (KPC1), is an E3 ubiquitin-protein ligase that mediates ubiquitination and proteasomal processing of the nuclear factor-kappaB 1 (NF-kappaB1) precursor p105 to the p50 active subunit, which restricts tumor growth. It also regulates degradation of heterochromatin protein 1alpha (HP1alpha) and 1beta (HP1beta) in lamin A/C knock-down cells. Moreover, RNF123, together with Kip1 ubiquitylation-promoting complex 2 (KPC2), forms the Kip1 ubiquitination-promoting complex (KPC), acting as a cytoplasmic ubiquitin ligase that regulates degradation of the cyclin-dependent kinase inhibitor p27 (Kip1) at the G1 phase of the cell cycle. Furthermore, RNF123 may function as a clinically relevant, peripheral state marker of depression. RNF123 contains a C3HC4-type RING-HC finger at the C-terminus.	41
319456	cd16542	RING-HC_RNF125	RING finger, HC subclass, found in RING finger protein 125 (RNF125). RNF125, also known as T-cell RING activation protein 1 (TRAC-1), is an E3 ubiquitin-protein ligase that is predominantly expressed in lymphoid cells, and functions as a positive regulator of T cell activation. It also down-modulates HIV replication and inhibits pathogen-induced cytokine production. It negatively regulates type I interferon signaling, which conjugates Lys(48)-linked ubiquitination to retinoic acid-inducible gene-I (RIG-I) and subsequently leads to the proteasome-dependent degradation of RIG-I. Further, RNF125 conjugates ubiquitin to melanoma differentiation-associated gene 5 (MDA5), a family protein of RIG-I. It thus acts as a negative regulator of RIG-I signaling, and is a direct target of miR-15b in the context of Japanese encephalitis virus (JEV) infection. Moreover, RNF125 binds to and ubiquitinates JAK1, prompting its degradation and inhibition of receptor tyrosine kinase (RTK) expression. It also negatively regulates p53 function through physical interaction and ubiquitin-mediated proteasome degradation. Mutations in RNF125 may lead to overgrowth syndromes (OGS). RNF125, together with three closely related proteins: RNF114, RNF138 and RNF166, forms a novel family of ubiquitin ligases with a C3HC4-type RING-HC finger, a C2HC-, and two C2H2-type zinc fingers, as well as a ubiquitin interacting motif (UIM). The UIM of RNF125 binds K48-linked poly-ubiquitin chains and is, together with the RING domain, required for auto-ubiquitination.	42
319457	cd16543	RING-HC_RNF135_like	RING finger, HC subclass, found in RING finger protein 135 (RNF135), tripartite motif-containing protein 15 (TRIM15) and similar proteins. RNF135, also known as RIG-I E3 ubiquitin ligase (REUL) or Riplet, is a widely expressed E3 ubiquitin-protein ligase that consists of an N-terminal C3HC4-type RING-HC finger and C-terminal B30.2/SPRY and PRY motifs, but lacks the B-box and coiled-coil domains that are also typically present in TRIM proteins. RNF135 serves as a specific retinoic acid-inducible gene-I (RIG-I)-interacting protein that ubiquitinates RIG-I and specifically stimulates RIG-I-mediated innate antiviral activity to produce antiviral type-I interferon (IFN) during the early phase of viral infection. It also has been identified as a bio-marker and therapy target of glioblastoma. It associates with the ERK signal transduction pathway and plays a role in glioblastoma cell proliferation, migration and cell cycle. TRIM15, also known as RING finger protein 93 (RNF93), zinc finger protein 178 (ZNF178), or zinc finger protein B7 (ZNFB7), is a focal adhesion protein that regulates focal adhesion disassembly. It localizes to focal contacts in a myosin-II-independent manner by an interaction between its coiled-coil domain and the LD2 motif of paxillin. TRIM15 can also associate with coronin 1B, cortactin, filamin binding LIM protein1, and vasodilator-stimulated phosphoprotein, which are involved in actin cytoskeleton dynamics. As an additional component of the integrin adhesome, it regulates focal adhesion turnover and cell migration. TRIM15 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a SPRY/B30.2 domain positioned C-terminal to the RBCC domain.	37
319458	cd16544	RING-HC_RNF138	RING finger, HC subclass, found in RING finger protein 138 (RNF138) and similar proteins. RNF138, also known as Nemo-like kinase-associated RING finger protein (NARF) or NLK-associated RING finger protein, is an E3 ubiquitin-protein ligase that plays an important role in glioma cell proliferation, apoptosis, and cell cycle. It specifically cooperates with the E2 conjugating enzyme E2-25K (Hip-2/UbcH1), regulates the ubiquitylation and degradation of T cell factor/lymphoid enhancer factor (TCF/LEF), and further suppresses Wnt-beta-catenin signaling. RNF138, together with three closely related proteins: RNF114, RNF125 and RNF166, forms a novel family of ubiquitin ligases with a C3HC4-type RING-HC finger, a C2HC-, and two C2H2-type zinc fingers, as well as a ubiquitin interacting motif (UIM).	46
319459	cd16545	RING-HC_RNF141	RING finger, HC subclass, found in RING finger protein 141 (RNF141) and similar proteins. RNF141, also known as zinc finger protein 230 (ZNF230), is a RING finger protein present primarily in the nuclei of spermatogonia, the acrosome, and the tail of spermatozoa. It may have a broad function during early development of vertebrates. It plays an important role in spermatogenesis, including spermatogenic cell proliferation and sperm maturation, as well as motility and fertilization. It also exhibits DNA binding activity. RNF141 corresponding ZNF230 gene mutation may be associated with azoospermia. RNF141 contains a C3HC4-type RING finger domain that may function as an activator module in transcription.	39
319460	cd16546	RING-HC_RNF146	RING finger, HC subclass, found in RING finger protein 146 (RNF146) and similar proteins. RNF146, also known as dactylidin, or iduna, is a cytoplasmic E3 ubiquitin-protein ligase that is responsible for PARylation-dependent ubiquitination (PARdU). It displays neuroprotective properties due to its inhibition of Parthanatos, a PAR-dependent form of cell death, via binding with Poly(ADP-ribose) (PAR). It also modulates PAR polymerase-1 (PARP-1)-mediated oxidative cell injury in cardiac myocytes. Moreover, RNF146 mediates tankyrase-dependent degradation of axin, thereby positively regulating Wnt signaling. It also facilitates DNA repair and protects against cell death induced by DNA damaging agents or gamma-irradiation through translocating to the nucleus after cellular injury and promoting the ubiquitination and degradation of various nuclear proteins involved in DNA damage repair. Furthermore, RNF146 is implicated in neurodegenerative disease and cancer development. It regulates the development and progression of non-small cell lung cancer (NSCLC) by enhancing cell growth, invasion, and survival. RNF146 contains an N-terminal C3HC4-type RING-HC finger followed by a WWE domain with a poly(ADP-ribose) (PAR) binding motif at the tail.	40
319461	cd16547	RING-HC_RNF151	RING finger, HC subclass, found in RING finger protein 151 (RNF151) and similar proteins. RNF151 is a testis-specific RING finger protein that interacts with dysbindin, a synaptic and microtubular protein that binds brain snapin, a SNARE-binding protein that mediates intracellular membrane fusion in both neuronal and non-neuronal cells. Thus, it may be involved in acrosome formation of spermatids through interacting with multiple proteins participating in membrane biogenesis and microtubule organization. RNF151 contains a C3HC4-type RING finger domain, a putative nuclear localization signal (NLS), and a TNF receptor associated factor (TRAF)-type zinc finger domain.	39
319462	cd16548	RING-HC_RNF152	RING finger, HC subclass, found in RING finger protein 152 (RNF152) and similar proteins. RNF152 is a lysosome-anchored E3 ubiquitin-protein ligase involved in apoptosis. It is polyubiquitinated through K48 linkage. It negatively regulates the activation of the mTORC1 pathway by targeting RagA GTPase for K63-linked ubiquitination. It interacts with and ubiquitinates RagA in an amino-acid-sensitive manner. The ubiquitination of RagA recruits its inhibitor GATOR1, a GAP complex for Rag GTPases to the Rag complex, thereby inactivating mTORC1 signaling. RNF152 contains an N-terminal C3HC4-type RING-HC finger and a C-terminal transmembrane domain, both of which are responsible for its E3 ligase activity.	45
319463	cd16549	RING-HC_RNF166	RING finger, HC subclass, found in RING finger protein 166 (RNF166) and similar proteins. RNF166 is encoded by gene RNF166 targeted by thyroid hormone receptor alpha1 (TRalpha1), which is important in brain development. It plays an important role in RNA virus-induced interferon-beta production by enhancing the ubiquitination of TRAF3 and TRAF6. RNF166, together with three closely related proteins: RNF114, RNF125 and RNF138, forms a novel family of ubiquitin ligases with a C3HC4-type RING-HC finger, a C2HC-, and two C2H2-type zinc fingers, as well as a ubiquitin interacting motif (UIM).	47
319464	cd16550	RING-HC_RNF168	RING finger, HC subclass, found in RING finger protein 168 (RNF168) and similar proteins. RNF168 is an E3 ubiquitin-protein ligase that promotes noncanonical K27 ubiquitination to signal DNA damage. It, together with RNF8, functions as a DNA damage response (DDR) factor that promotes a series of ubiquitylation events on substrates, such as H2A and H2AX with H2AK13/15 ubiquitylation, facilitates recruitment of repair factors p53-binding protein 1 (53BP1) or the RAP80-BRCA1 complex to sites of double-strand breaks (DSBs), and inhibits homologous recombination (HR) in cells deficient in the tumor suppressor BRCA1. RNF168 also promotes H2A neddylation, which antagonizes ubiquitylation of H2A and regulates DNA damage repair. Moreover, RNF168 forms a functional complex with RAD6A or RAD6B during the DNA damage response. RNF168 contains an N-terminal C3HC4-type RING-HC finger that catalyzes H2A-K15ub and interacts with H2A, and two MIU (motif interacting with ubiquitin) domains responsible for the interaction with K63 linked poly-ubiquitin.	42
319465	cd16551	RING-HC_RNF169	RING finger, HC subclass, found in RING finger protein 169 (RNF169) and similar proteins. RNF169 is an uncharacterized E3 ubiquitin-protein ligase paralogous to RNF168. It functions as a negative regulator of the DNA damage signaling cascade. RNF169 recognizes polyubiquitin structures but does not itself contribute to double-strand break (DSB)-induced chromatin ubiquitylation. It contributes to regulation of the DSB repair pathway utilization via functionally competing with recruiting repair factors, 53BP1 and RAP80-BRCA1, for association with RNF168-modified chromatin independent of its catalytic activity, limiting the magnitude of the RNF8/RNF168-dependent signaling response to DSBs. RNF169 contains an N-terminal C3HC4-type RING-HC finger and a C-terminal MIU (motif interacting with ubiquitin) domain.	41
319466	cd16552	RING-HC_NEURL3	RING finger, HC subclass, found in neuralized-like protein 3 (NEURL3) and similar proteins. NEURL3, also known as lung-inducible neuralized-related C3HC4 RING domain protein (LINCR), is a novel inflammation-induced E3 ubiquitin-protein ligase encoded by LINCR, a glucocorticoid-attenuated response gene induced in the lung during endotoxemia. It is expressed in alveolar epithelial type II cells, preferentially interacts with the ubiquitin-conjugating enzyme UbcH6, and generates polyubiquitin chains linked via non-canonical lysine residues. Overexpression of NEURL3 in the developing lung epithelium inhibits distal differentiation and induces cystic changes in the Notch signaling pathway. NEURL3 contains an N-terminal neuralized homology repeat (NHR) domain similar to the SPRY (SPla and the RYanodine receptor) domain and a C-terminal C3HC4-type RING-HC finger.	42
319467	cd16553	RING-HC_RNF170	RING finger, HC subclass, found in RING finger protein 170 (RNF170) and similar proteins. RNF170, also known as putative LAG1-interacting protein, is an endoplasmic reticulum (ER) membrane-bound E3 ubiquitin-protein ligase that mediates ubiquitination-dependent degradation of type-I inositol 1,4,5-trisphosphate (IP3) receptors (ITPR1) via the endoplasmic-reticulum-associated protein degradation (ERAD) pathway. A point mutation (arginine to cysteine at position 199) of RNF170 gene is linked with autosomal-dominant sensory ataxia (ADSA), a disease characterized by neurodegeneration in the posterior columns of the spinal cord. RNF170 contains a C3HC4-type RING-HC finger.	44
319468	cd16554	RING-HC_RNF180	RING finger, HC subclass, found in RING finger protein 180 (RNF180) and similar proteins. RNF180, also known as Rines, is a membrane-bound E3 ubiquitin-protein ligase well conserved among vertebrates. It is a critical regulator of the monoaminergic system, as well as emotional and social behavior. It interacts with brain monoamine oxidase A (MAO-A) and targets it for ubiquitination and degradation. It also functions as a novel tumor suppressor in gastric carcinogenesis. The hypermethylated CpG site count of RNF180 DNA promoter can be used to predict the survival of gastric cancer. RNF180 contains a novel conserved dual specificity protein phosphatase Rines conserved (DSPRC) domain, a basic coiled-coil domain, a C3HC4-type RING-HC finger, and a C-terminal hydrophobic region that is predicted to be a transmembrane domain.	44
319469	cd16555	RING-HC_RNF182	RING finger, HC subclass, found in RING finger protein 182 (RNF182) and similar proteins. RNF182 is a brain-enriched E3 ubiquitin-protein ligase that stimulates E2-dependent polyubiquitination in vitro. It is upregulated in Alzheimer's disease (AD) brains and in neuronal cells exposed to injurious insults. It interacts with ATP6V0C and promotes its degradation by the ubiquitin-proteasome pathway, suggesting a very specific role in controlling the turnover of an essential component of neurotransmitter release machinery. RNF182 contains an N-terminal C3HC4-type RING-HC finger, and a C-terminal transmembrane domain.	51
319470	cd16556	RING-HC_RNF183_like	RING finger, HC subclass, found in RING finger protein RNF183, RNF223 and similar proteins. RNF183 is an E3 ubiquitin-protein ligase that is upregulated during intestinal inflammation and is negatively regulated by miR-7. It promotes intestinal inflammation by increasing the ubiquitination and degradation of inhibitor of kappa B, thereby resulting in secondary activation of the Nuclear factor-kappaB (NF-kB) pathway. The interaction between RNF183-mediated ubiquitination and miRNA may be an important novel epigenetic mechanism in the pathogenesis of inflammatory bowel disease (IBD). The biological function of RNF223 remains unclear. Both RNF183 and RNF223 contain an N-terminal C3HC4-type RING-HC finger and a C-terminal transmembrane domain.	54
319471	cd16557	RING-HC_RNF186	RING finger, HC subclass, found in RING finger protein 186 (RNF186) and similar proteins. RNF186 is an E3 ubiquitin-protein ligase with an N-terminal C3HC4-type RING-HC finger and two putative C-terminal transmembrane domains which enable it to localize in a certain organelle. It regulates RING-dependent self-ubiquitination, as well as endoplasmic reticulum (ER) stress-mediated apoptosis through interaction with the Bcl-2 family protein BNip1.	51
319472	cd16558	RING-HC_RNF207	RING finger, HC subclass, found in RING finger protein 207 (RNF207) and similar proteins. RNF207 is a cardiac-specific E3 ubiquitin-protein ligase that plays an important role in the regulation of cardiac repolarization. It regulates action potential duration, likely via effects on human ether-a-go-go-related gene (HERG) trafficking and localization in a heat shock protein-dependent manner. RNF207 contains a C3HC4-type RING-HC finger, Bbox 1 and Bbox C-terminal (BBC), as well as a C-terminal non-homologous region (CNHR).	43
319473	cd16559	RING-HC_RNF208	RING finger, HC subclass, found in RING finger protein 208 (RNF208) and similar proteins. RNF208 is an E3 ubiquitin-protein ligase whose activity can be modulated by S-nitrosylation. It contains a C3HC4-type RING-HC finger.	50
319474	cd16560	RING-HC_RNF212_like	RING finger, HC subclass, found in RING finger proteins RNF212, RNF212B and similar proteins. The family includes RING finger protein RNF212, RNF212B, and their homologs. RNF212 is a dosage-sensitive regulator of crossing-over during mammalian meiosis. It plays a central role in designating crossover sites and coupling chromosome synapsis to the formation of crossover-specific recombination complexes. It also functions as an E3 ligase for small ubiquitin-related modifier (SUMO) modification. RNF212B shows high sequence similarity with RNF212, but its biological function remains unclear. Members in this family contain an N-terminal C3HC4-type RING-HC finger. The family also includes two homologs of RNF212, meiotic procrossover factors Zip3 and ZHP-3, which have been identified in Saccharomyces cerevisiae and Caenorhabditis elegans, respectively. Budding yeast Zip3 is a small ubiquitin-related modifier (SUMO) E3 ligase implicated in the SUMO pathway of post-translational modification. It sumoylates chromosome axis proteins, thus promoting synaptonemal complex polymerization. It also acts as a Smt3 E3 ligase. Zip3 includes a SUMO Interacting Motif (SIM) and a modified C3HCHC2-type RING-HC finger that are important for Zip3 in vitro E3 ligase activity and necessary for SC polymerization and correct sporulation. ZHP-3 acts at crossovers to couple meiotic recombination with synaptonemal complex disassembly and chiasma formation in Caenorhabditis elegans. It possesses a C3HC4-type RING-HC finger.	41
319475	cd16561	RING-HC_RNF213	RING finger, HC subclass, found in RING finger protein 213 (RNF213) and similar proteins. RNF213, also known as ALK lymphoma oligomerization partner on chromosome 17 or Moyamoya steno-occlusive disease-associated AAA+ and RING finger protein (mysterin), is an intracellular soluble protein that functions as an E3 ubiquitin-protein ligase and AAA+ ATPase, which possibly contributes to vascular development through mechanical processes in the cell. It plays a unique role in endothelial cells for proper gene expression in response to inflammatory signals from the environment. Mutations in RNF213 may be associated with Moyamoya disease (MMD), an idiopathic cerebrovascular occlusive disorder prevalent in East Asia. It also acts as a nuclear marker for acanthomorph phylogeny. RNF213 contains two tandem enzymatically active AAA+ ATPase modules and a C3HC4-type RING-HC finger. It can form a huge ring-shaped oligomeric complex.	41
319476	cd16562	RING-HC_RNF219	RING finger, HC subclass, found in RING finger protein 219 (RNF219) and similar proteins. RNF219 may function as a modulator of late-onset Alzheimer's disease (LOAD) associated amyloid beta A4 precursor protein (APP) endocytosis and metabolism. It genetically interacts with apolipoprotein E epsilon4 allele (APOE4). Thus a genetic variant within RNF219 was found to affect amyloid deposition in human brain and LOAD age-of-onset. Moreover, common genetic variants at the RNF219 locus have been associated with alterations in lipid metabolism, cognitive performance and central nervous system ventricle volume. RNF219 contains a C3HC4-type RING-HC finger.	42
319477	cd16563	RING-HC_RNF220	RING finger, HC subclass, found in RING finger protein 220 (RNF220) and similar proteins. RNF220 is an E3 ubiquitin-protein ligase that promotes the ubiquitination and proteasomal degradation of Sin3B, a scaffold protein of the Sin3/HDAC (histone deacetylase) corepressor complex. It can also bind E2 and mediate auto-ubiquitination of itself. Moreover, RNF220 specifically interacts with beta-catenin, and enhances canonical Wnt signaling through ubiquitin-specific protease 7 (USP7)-mediated deubiquitination and stabilization of beta-catenin, which is independent of its E3 ligase activity. RNF220 contains a characteristic C3HC4-type RING-HC finger at its C-terminus.	41
319478	cd16564	RING-HC_RNF222	RING finger, HC subclass, found in RING finger protein 222 (RNF222) and similar proteins. RNF222 is an uncharacterized C3HC4-type RING-HC finger-containing protein. It may function as an E3 ubiquitin-protein ligase.	47
319479	cd16565	RING-HC_RNF224_like	RING finger, HC subclass, found in RING finger protein RNF224, RNF225 and similar proteins. Both RNF224 and RNF225 are uncharacterized C3HC4-type RING-HC finger-containing proteins. They may function as an E3 ubiquitin-protein ligase.	49
319480	cd16566	RING-HC_RSPRY1	RING finger, HC subclass, found in RING finger and SPRY domain-containing protein 1 (RSPRY1) and similar proteins. RSPRY1 is a hypothetical RING and SPRY domain-containing protein of unknown physiological function. Mutations in its corresponding gene RSPRY1 may associate with a distinct skeletal dysplasia syndrome. RSPRY1 contains a B30.2/SPRY domain and a C3HC4-type RING-HC finger.	41
319481	cd16567	RING-HC_RAD16_like	RING finger, HC subclass, found in Saccharomyces cerevisiae DNA repair protein RAD16, Schizosaccharomyces pombe rhp16, and similar proteins. Budding yeast RAD16, also known as ATP-dependent helicase RAD16, is encoded by a yeast excision repair gene homologous to the recombinational repair gene RAD54 and to the SNF2 gene involved in transcriptional activation. It is a component of the global genome repair (GGR) complex which promotes global genome nucleotide excision repair (GG-NER) that removes DNA damage from non-transcribing DNA. RAD16 is involved in differential repair of DNA after UV damage, and repairs preferentially the MAT-alpha locus compared with the HML-alpha locus. Fission yeast rhp16, also known as ATP-dependent helicase rhp16, is a RAD16 homolog. It is involved in GGR via nucleotide excision repair (NER), in conjunction with rhp7, after UV irradiation. Both RAD16 and rhp16 contain a C3HC4-type RING-HC finger, as well as a DEAD-like helicase domain and a helicase superfamily C-terminal domain.	47
319482	cd16568	RING-HC_ScPSH1_like	RING finger, HC subclass, found in Saccharomyces cerevisiae POB3/SPT16 histone-associated protein 1 (ScPSH1), Arabidopsis thaliana Protein KEEP ON GOING (AtKEG) and similar proteins. ScPSH1 is a Cse4-specific E3 ubiquitin ligase that interacts with the kinetochore protein Pat1 and targets the degradation of budding yeast centromeric histone H3 variant, CENP-ACse4, which is essential for faithful chromosome segregation. ScPSH1 contains a C3HC4-type RING-HC finger and a DNA directed RNA polymerase domain. AtKEG is an E3 ubiquitin ligase essential for Arabidopsis growth and development. It maintains low levels of ABSCISIC ACID-INSENSITIVE5 (ABI5) in the absence of stress and thus functions as a negative regulator of abscisic acid (ABA) signaling. AtKEG is a multidomain protein that includes a C3HC4-type RING-HC finger, a kinase domain, ankyrin repeats, and 12 HERC2-like (for HECT and RCC1-like) repeats.	45
319483	cd16569	RING-HC_SHPRH	RING finger, HC subclass, found in SNF2 histone-linker PHD finger RING finger helicase (SHPRH) and similar proteins. SHPRH is a yeast RAD5 homolog found in mammals. It functions as an E3 ubiquitin-protein ligase that associates with proliferating cell nuclear antigen (PCNA), RAD18, and the ubiquitin-conjugating enzyme UBC13 (E2) and suppresses genomic instability through proliferating methyl methanesulfonate (MMS)-induced PCNA polyubiquitination. SHPRH contains a SWI/SNF helicase domain that is divided into N- and C-terminal parts by an insertion of a linker histone domain (H15), a PHD-finger, and a C3HC4-type RING-HC finger involved in the poly-ubiquitination of PCNA.	51
319484	cd16570	RING-HC_SH3RFs	RING finger, HC subclass, found in SH3 domain-containing RING finger proteins SH3RF1, SH3RF2, SH3RF3, and similar proteins. SH3RF1, also known as plenty of SH3s (POSH), RING finger protein 142 (RNF142), or SH3 multiple domains protein 2 (SH3MD2), is a trans-Golgi network-associated pro-apoptotic scaffold protein with E3 ubiquitin-protein ligase activity. SH3RF2, also known as heart protein phosphatase 1-binding protein (HEPP1), plenty of SH3s (POSH)-eliminating RING protein (POSHER), protein phosphatase 1 regulatory subunit 39, or RING finger protein 158 (RNF158), is a putative E3 ubiquitin-protein ligase that acts as an anti-apoptotic regulator for the c-Jun N-terminal kinase (JNK) pathway by binding to and promoting the proteasomal degradation of SH3RF1 (or POSH), a scaffold protein that is required for pro-apoptotic JNK activation. SH3RF3, also known as plenty of SH3s 2 (POSH2) or SH3 multiple domains protein 4 (SH3MD4), is a scaffold protein with E3 ubiquitin-protein ligase activity. It was identified in the screen for interacting partners of p21-activated kinase 2 (PAK2) and may play a role in regulating c-Jun N-terminal kinase (JNK) mediated apoptosis in certain conditions. Members in this family contain an N-terminal C3HC4-type RING-HC finger responsible for the E3 ligase activity and four Src Homology 3 (SH3) domains that are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs.	46
319485	cd16571	RING-HC_SIAHs	RING finger, HC subclass, found in Drosophila melanogaster protein Seven-in-Absentia (sina) and its homologs. The family includes the Drosophila melanogaster protein Seven-in-Absentia (sina), its mammalian orthologs, SIAH1 and SIAH2, plant SINA-related proteins, and similar proteins. The Drosophila homolog sina plays an important role in the phyllopod-dependent degradation of the transcriptional repressor tramtrack, a step required for the formation of the R7 photoreceptor in the developing eye of Drosophila melanogaster. Both SIAH1 and SIAH2 are E3 ubiquitin-protein ligases, mediating the ubiquitinylation and subsequent proteasomal degradation of biologically important target proteins that regulate general functions, such as cell cycle control, apoptosis, and DNA repair. They are inducible by the tumor suppressor and transcription factor p53. SIAH2 can also be regulated by sex hormones and cytokine signaling. Moreover, they share high sequence similarity, but possess contrary roles in cancer, with Siah1 more often acting as a tumor suppressor while Siah2 functions as a proto-oncogene. Plant SINAT1-5 are putative E3 ubiquitin ligases involved in the regulation of stress responses. All family members possess two characteristic domains, an N-terminal C3HC4-type RING-HC finger and a C-terminal tumor necrosis factor (TNF) receptor associated factor (TRAF)-like substrate-binding domain (SBD).	38
319486	cd16572	RING-HC_SpRad8_like	RING finger, HC subclass, found in Schizosaccharomyces pombe DNA repair protein Rad8 (SpRad8) and similar proteins. SpRad8 is a conserved protein homologous to Saccharomyces cerevisiae DNA repair protein Rad5 and human helicase-like transcription factor (HLTF) that is required for error-free postreplication repair by contributing to polyubiquitylation of PCNA. SpRad8 contains a C3HC4-type RING-HC finger responsible for the E3 ubiquitin ligase activity, a SNF2-family helicase domain including an ATP binding site, and a family-specific HIRAN domain (HIP116, Rad5p N-terminal domain) that contributes to nuclear localization.	49
319487	cd16573	RING-HC_TFB3_like	RING finger, HC subclass, found in RNA polymerase II transcription factor B subunit 3 (TFB3) from fungi. TFB3, also known as RNA polymerase II transcription factor B 38 kDa subunit, RNA polymerase II transcription factor B p38 subunit, or Rig2, is a component of the general transcription and DNA repair factor IIH (TFIIH or factor B), which is essential for both basal and activated transcription and is involved in nucleotide excision repair (NER) of damaged DNA. TFIIH has CTD kinase and DNA-dependent ATPase activity, and is essential for polymerase II transcription in vitro. TFB3 is a homolog of MAT1 of higher eukaryotes which forms a ternary complex with MO15 (cdk7) and cyclin H. It physically interacts with Ubc4 and the Nedd8-conjugating enzyme Ubc12 as well as the Hrt1/Rtt101 complex. It targets the yeast Cul4-type cullin Rtt101 for its neddylation and ubiquitylation, and regulates neddylation and activity of cullin-3, but not Cdc53. TFB3 contains an N-terminal C3HC4-type RING-HC finger and a C-terminal MAT1 domain responsible for the interaction with the transcription factor TFIIH.	54
319488	cd16574	RING-HC_Topors	RING finger, HC subclass, found in topoisomerase I-binding arginine/serine-rich protein (Topors) and similar proteins. Topors, also known as topoisomerase I-binding RING finger protein, tumor suppressor p53-binding protein 3, or p53-binding protein 3 (p53BP3), is a ubiquitously expressed nuclear E3 ubiquitin-protein ligase that can ligate both ubiquitin and small ubiquitin-like modifier (SUMO) to substrate proteins in the nucleus. It contains an N-terminal C3HC4-type RING-HC finger which ligates ubiquitin to its target proteins including DNA topoisomerase I, p53, NKX3.1, H2AX, and the AAV-2 Rep78/68 proteins. As a RING-dependent E3 ubiquitin ligase, Topors works with the E2 enzymes UbcH5a, UbcH5c, and UbcH6, but not with UbcH7, CDC34, or UbcH2b. Topors acts as a tumor suppressor in various malignancies. It regulates p53 modification, suggesting it may be responsible for astrocyte elevated gene-1 (AEG-1, also known as metadherin, or LYRIC) ubiquitin modification. Plk1-mediated phosphorylation of Topors inhibits Topors-mediated sumoylation of p53, whereas p53 ubiquitination is enhanced, leading to p53 degradation. It also functions as a negative regulator of the prostate tumor suppressor NKX3.1. Moreover, Topors is associated with promyelocytic leukemia nuclear bodies, and may be involved in the cellular response to camptothecin. It also plays a key role in the turnover of H2AX protein, discriminating the type of DNA damaging stress. Furthermore, Topors is a cilia-centrosomal protein associated with autosomal dominant retinal degeneration. Mutations in TOPORS cause autosomal dominant retinitis pigmentosa (adRP).	40
319489	cd16575	RING-HC_MID_C-I	RING finger, HC subclass, found in midline-1 (MID1), midline-2 (MID2) and similar proteins. MID1, also known as midin, midline 1 RING finger protein, putative transcription factor XPRF, RING finger protein 59 (RNF59), or tripartite motif-containing protein 18 (TRIM18), is a microtubule-associated E3 ubiquitin-protein ligase implicated in epithelial-mesenchymal differentiation, cell migration and adhesion, and programmed cell death along specific regions of the ventral midline during embryogenesis. MID2, also known as midin-2, midline defect 2, RING finger protein 60 (RNF60), or tripartite motif-containing protein 1 (TRIM1), is highly related to MID1. It associates with the microtubule network and may at least partially compensate for the loss of MID1. Both MID1 and MID2 interact with Alpha 4, which is a regulatory subunit of PP2-type phosphatases, such as PP2A, and an integral component of the rapamycin-sensitive signaling pathway. They also play a central role in the regulation of granule exocytosis, and functional redundancy exists between MID1 and MID2 in cytotoxic lymphocytes (CTL). Both MID1 and MID2 belong to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, a fibronectin type III (FN3) domain, and a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain.	52
319490	cd16576	RING-HC_TRIM9_like_C-I	RING finger, HC subclass, found in tripartite motif-containing proteins TRIM9, TRIM36, TRIM46, TRIM67, and similar proteins. Tripartite motif-containing proteins TRIM9, TRIM36, TRIM46, and TRIM67 belong to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, consisting of three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, a fibronectin type III (FN3) domain, and a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. TRIM9 (the human ortholog of rat Spring), also known as RING finger protein 91 (RNF91), is a brain-specific E3 ubiquitin-protein ligase collaborating with an E2 ubiquitin conjugating enzyme UBCH5b. TRIM9 plays an important role in the regulation of neuronal functions and participates in the neurodegenerative disorders through its ligase activity. TRIM36 (the human ortholog of mouse Haprin), also known as RING finger protein 98 (RNF98), or zinc-binding protein Rbcc728, is an E3 ubiquitin-protein ligase expressed in the germ plasm. It has been implicated in acrosome reaction, fertilization, and embryogenesis, as well as in the carcinogenesis. TRIM46, also known as gene Y protein (GeneY) or tripartite, fibronectin type-III and C-terminal SPRY motif protein (TRIFIC), is a microtubule-associated protein that forms parallel microtubule bundles in the proximal axon and plays a crucial role for the establishment and maintenance of neuronal polarity. TRIM67, also known as TRIM9-like protein (TNL), is a protein selectively expressed in the cerebellum. It interacts with PRG-1, an important molecule in the control of hippocampal excitability dependent on presynaptic LPA2 receptor signaling, and 80K-H, also known as glucosidase II beta, a protein kinase C substrate.	42
319491	cd16577	RING-HC_MuRF_C-II	RING finger, HC subclass, found in muscle-specific RING finger proteins TRIM63/MuRF-1, TRIM55/MuRF-2 and TRIM54/MuRF-3. This family corresponds to a group of striated muscle-specific tripartite motif (TRIM) proteins, including TRIM63/MuRF-1, TRIM55/MuRF-2, and TRIM54/MuRF-3, which function as E3 ubiquitin ligases in ubiquitin-mediated muscle protein turnover. They are tightly developmentally regulated in skeletal muscle and associate with different cytoskeleton components, such as microtubules, Z-disks and M-bands, as well as with metabolic enzymes and nuclear proteins. They also cooperate with diverse proteins implicated in selective protein degradation by the proteasome and autophagosome, and target proteins of metabolic regulation, sarcomere assembly and transcriptional regulation. Moreover, MURFs display variable fibre-type preferences. TRIM63/MuRF-1 is predominantly fast (type II) fibre-associated in skeletal muscle. TRIM55/MuRF-2 is predominantly slow-fibre associated. TRIM54/MuRF-3 is ubiquitously present. They play an active role in microtubule-mediated sarcomere assembly. MuRFs belong to the C-II subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, and an acidic residue-rich (AR) domain positioned C-terminal to the RBCC domain. They also harbor a MURF family-specific conserved box (MFC) between its RING-HC finger and Bbox domains.	51
319492	cd16578	RING-HC_TRIM42_C-III	RING finger, HC subclass, found in tripartite motif-containing protein 42 (TRIM42) and similar proteins. TRIM42 belongs to the C-III subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil domain. It also has a novel cysteine-rich motif N-terminal to the RBCC domain, as well as a COS (carboxyl-terminal subgroup one signature) box and a fibronectin type-III (FN3) domain positioned C-terminal to the RBCC domain. TRIM42 can interact with TRIM27, a known cancer-associated protein. Its precise biological function remains unclear.	51
319493	cd16579	RING-HC_PML_C-V	RING finger, HC subclass, found in promyelocytic leukemia protein (PML) and similar proteins. Protein PML, also known as RING finger protein 71 (RNF71) or tripartite motif-containing protein 19 (TRIM19), is predominantly a nuclear protein with a broad intrinsic antiviral activity. It is the eponymous component of PML nuclear bodies (PML NBs) and has been implicated in a wide variety of cell processes, including DNA damage signaling, apoptosis, and transcription. PML interferes with the replication of many unrelated viruses, including human immunodeficiency virus 1 (HIV-1), human foamy virus (HFV), poliovirus, influenza virus, rabies virus, EMCV, adeno-associated virus (AAV), and vesicular stomatitis virus (VSV). It also selectively interacts with misfolded proteins through distinct substrate recognition sites and conjugates these proteins with the small ubiquitin-like modifiers (SUMOs) through its SUMO ligase activity. PML belongs to the C-V subclass of TRIM (tripartite motif) family of proteins that are defined by an N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as an uncharacterized region positioned C-terminal to the RBCC domain.	37
319494	cd16580	RING-HC_TRIM8_C-V	RING finger, HC subclass, found in tripartite motif-containing protein 8 (TRIM8) and similar proteins. TRIM8, also known as glioblastoma-expressed RING finger protein (GERP) or RING finger protein 27 (RNF27), is a probable E3 ubiquitin-protein ligase that may promote proteasomal degradation of suppressor of cytokine signaling 1 (SOCS1) and further regulate interferon-gamma signaling. It functions as a new p53 modulator that stabilizes p53 impairing its association with MDM2 and inducing the reduction of cell proliferation. TRIM8 deficit dramatically impairs p53 stabilization and activation in response to chemotherapeutic drugs. TRIM8 also modulates tumor necrosis factor-alpha (TNFalpha) and interleukin-1beta (IL-1beta)-triggered nuclear factor-kappaB (NF-kappaB) activation by targeting transforming growth factor beta (TGFbeta) activated kinase 1 (TAK1) for K63-linked polyubiquitination. Moreover, TRIM8 modulates translocation of phosphorylated STAT3 into the nucleus through interaction with Hsp90beta and consequently regulates transcription of Nanog in embryonic stem cells. It also interacts with protein inhibitor of activated STAT3 (PIAS3), which inhibits IL-6-dependent activation of STAT3. TRIM8 belongs to the C-V subclass of nuclear TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil domain, as well as an uncharacterized region positioned C-terminal to the RBCC domain. The coiled coil domain is required for homodimerization and the region immediately C-terminal to the RING motif is sufficient to mediate the interaction with SOCS1.	44
319495	cd16581	RING-HC_TRIM13_like_C-V	RING finger, HC subclass, found in tripartite motif-containing proteins TRIM13, TRIM59 and similar proteins. TRIM13 and TRIM59, two closely related tripartite motif-containing proteins, belong to the C-V subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, followed by a C-terminal transmembrane domain. TRIM13, also known as B-cell chronic lymphocytic leukemia tumor suppressor Leu5, leukemia-associated protein 5, putative tumor suppressor RFP2, RING finger protein 77 (RNF77), or Ret finger protein 2, is an endoplasmic reticulum (ER) membrane anchored E3 ubiquitin-protein ligase that interacts with proteins localized to the ER, including valosin-containing protein (VCP), a protein indispensable for ER-associated degradation (ERAD). TRIM59, also known as RING finger protein 104 (RNF104) or tumor suppressor TSBF-1, is a putative E3 ubiquitin-protein ligase that functions as a novel multiple cancer biomarker for immunohistochemical detection of early tumorigenesis.	45
319496	cd16582	RING-HC_TRIM31_C-V	RING finger, HC subclass, found in tripartite motif-containing protein 31 (TRIM31) and similar proteins. TRIM31 is an E3 ubiquitin-protein ligase that primarily localizes to the cytoplasm, but is also associated with the mitochondria. It can negatively regulate cell proliferation and may be a potential biomarker of gastric cancer as it is overexpressed from the early stage of gastric carcinogenesis. TRIM31 is downregulated in non-small cell lung cancer and serves as a potential tumor suppressor. It interacts with p52 (Shc) and inhibits Src-induced anchorage-independent growth. TRIM31 belongs to the C-V subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as an uncharacterized region positioned C-terminal to the RBCC domain.	41
319497	cd16583	RING-HC_TRIM40-C-V	RING finger, HC subclass, found in tripartite motif-containing protein 40 (TRIM40) and similar proteins. TRIM40, also known as probable E3 NEDD8-protein ligase or RING finger protein 35 (RNF35), is highly expressed in the gastrointestinal tract including the stomach, small intestine, and large intestine. It enhances neddylation of inhibitor of nuclear factor kappaB kinase subunit gamma (IKKgamma), inhibits the activity of nuclear factor-kappaB (NF-kappaB)-mediated transcription, and thus prevents inflammation-associated carcinogenesis in the gastrointestinal tract. TRIM40 belongs to the C-V subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as an uncharacterized region positioned C-terminal to the RBCC domain.	46
319498	cd16584	RING-HC_TRIM56_C-V	RING finger, HC subclass, found in tripartite motif-containing protein 56 (TRIM56) and similar proteins. TRIM56, also known as RING finger protein 109 (RNF109), is a virus-inducible E3 ubiquitin ligase that restricts pestivirus infection. It positively regulates the Toll-like receptor 3 (TLR3) antiviral signaling pathway, and possesses antiviral activity against bovine viral diarrhea virus (BVDV), a ruminant pestivirus classified within the family Flaviviridae shared by tick-borne encephalitis virus (TBEV). It also possesses antiviral activity against two classical flaviviruses, yellow fever virus (YFV) and dengue virus (DENV), as well as a human coronavirus, HCoV-OC43, which is responsible for a significant share of common cold cases. It may not act on positive-strand RNA viruses indiscriminately. Moreover, TRIM56 is an interferon-inducible E3 ubiquitin ligase that modulates STING to confer double-stranded DNA-mediated innate immune responses. TRIM56 belongs to the C-V subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as an uncharacterized region positioned C-terminal to the RBCC domain.	44
319499	cd16585	RING-HC_TIF1_C-VI	RING finger, HC subclass, found in the transcriptional intermediary factor 1 (TIF1) family and similar proteins. This family corresponds to the TIF1 family of transcriptional cofactors including TIF1alpha (TRIM24), TIF1beta (TRIM28), and TIF1gamma (TRIM33), which belongs to the C-VI subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a plant homeodomain (PHD), and a bromodomain (Bromo) positioned C-terminal to the RBCC domain. TIF1 proteins couple chromatin modifications to transcriptional regulation, signaling, and tumor suppression. They exert a deacetylase-dependent silencing effect when tethered to a promoter region. TIF1alpha and TIF1beta can homodimerize and contain a PXVXL motif necessary and sufficient for heterochromatin protein 1 (HP1) binding. They bind nuclear receptors and Kruppel-associated boxes (KRAB) specifically and respectively. TIF1gamma is structurally closely related to TIF1alpha and TIF1beta, but has very few functional features in common with them. It does not interact with the KRAB silencing domain of KOX1 or the heterochromatinic proteins HP1alpha, beta and gamma. It cannot bind to nuclear receptors (NRs). TIF1delta (TRIM66) does not have a RING-HC finger and is not included here.	61
319500	cd16586	RING-HC_TRIM2_like_C-VII	RING finger, HC subclass, found in tripartite motif-containing protein TRIM2, TRIM3, and similar proteins. TRIM2, also known as RING finger protein 86 (RNF86), is an E3 ubiquitin-protein ligase that ubiquitinates the neurofilament light chain, a component of the intermediate filament in axons. Loss of function of TRIM2 results in early-onset axonal neuropathy. TRIM3, also known as brain-expressed RING finger protein (BERP), RING finger protein 97 (RNF97), or RING finger protein 22 (RNF22), is an E3 ubiquitin-protein ligase involved in the pathogenesis of various cancers. It also plays an important role in the central nervous system (CNS). In addition, TRIM3 may be involved in vesicular trafficking via its association with the cytoskeleton-associated-recycling or transport (CART) complex that is necessary for efficient transferrin receptor recycling, but not for epidermal growth factor receptor (EGFR) degradation. Both TRIM2 and TRIM3 belong to the C-VII subclass of TRIM (tripartite motif)-NHL family that is defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil domain, as well as a NHL (named after proteins NCL-1, HT2A and Lin-41 that contain repeats folded into a six-bladed beta propeller) repeat domain positioned C-terminal to the RBCC domain.	45
319501	cd16587	RING-HC_TRIM32_C-VII	RING finger, HC subclass, found in tripartite motif-containing protein 32 (TRIM32) and similar proteins. TRIM32, also known as 72 kDa Tat-interacting protein, or zinc finger protein HT2A, or BBS11, is an E3 ubiquitin-protein ligase that promotes degradation of several targets, including actin, PIASgamma, Abl interactor 2, dysbindin, X-linked inhibitor of apoptosis (XIAP), p73 transcription factor, thin filaments and Z-bands during fasting. It plays important roles in neuronal differentiation of neural progenitor cells, as well as in controlling cell fate in skeletal muscle progenitor cells. It reduces PI3K-Akt-FoxO signaling in muscle atrophy by promoting plakoglobin-PI3K dissociation. It also functions as a pluripotency-reprogramming roadblock that facilitates cellular transition towards differentiation via modulating the levels of Oct4 and cMyc. Moreover, TRIM32 is an intrinsic influenza A virus (IAV) restriction factor which senses and targets the polymerase basic protein 1 (PB1) polymerase for ubiquitination and protein degradation. It also plays a significant role in mediating the biological activity of the HIV-1 Tat protein in vivo, binds specifically to the activation domain of HIV-1 Tat, and can also interact with the HIV-2 and EIAV Tat proteins in vivo. Furthermore, Trim32 regulates myoblast proliferation by controlling turnover of NDRG2 (N-myc downstream-regulated gene). It negatively regulates tumor suppressor p53 to promote tumorigenesis. It also facilitates degradation of MYCN on spindle poles and induces asymmetric cell division in human neuroblastoma cells. In addition, TRIM32 plays important roles in regulation of hyperactivities and positively regulates the development of anxiety and depression disorders induced by chronic stress. It also plays a role in regeneration by affecting satellite cell cycle progression via modulation of the SUMO ligase PIASy (PIAS4). 
Defects in TRIM32 lead to limb-girdle muscular dystrophy type 2H (LGMD2H), sarcotubular myopathies (STM) and Bardet-Biedl syndrome. TRIM32 belongs to the C-VII subclass of TRIM (tripartite motif)-NHL family that is defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil domain, as well as a NHL (named after proteins NCL-1, HT2A and Lin-41 that contain repeats folded into a six-bladed beta propeller) repeat domain positioned C-terminal to the RBCC domain. The NHL domain mediates the interaction with Argonaute proteins and consequently allows TRIM32 to modulate the activity of certain miRNAs.	47
319502	cd16588	RING-HC_TRIM45-C-VII	RING finger, HC subclass, found in tripartite motif-containing protein 45 (TRIM45) and similar proteins. TRIM45, also known as RING finger protein 99 (RNF99), is a novel receptor for activated C-kinase (RACK1)-interacting protein that suppresses transcriptional activities of Elk-1 and AP-1 and downregulates mitogen-activated protein kinase (MAPK) signal transduction through inhibiting RACK1/PKC (protein kinase C) complex formation. It also negatively regulates tumor necrosis factor alpha (TNFalpha)-induced nuclear factor-kappaB (NF-kappa B)-mediated transcription and suppresses cell proliferation. TRIM45 belongs to the C-VII subclass of TRIM (tripartite motif) family that is defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a filamin-type immunoglobulin (IG-FLMN) domain and NHL repeats positioned C-terminal to the RBCC domain.	64
319503	cd16589	RING-HC_TRIM71_C-VII	RING finger, HC subclass, found in tripartite motif-containing protein 71 (TRIM71) and similar proteins. TRIM71, also known as protein lineage variant 41 (lin-41), is an E3 ubiquitin-protein ligase that may play essential roles in embryonic stem cells, cellular reprogramming and the timing of embryonic neurogenesis. It was first identified in the nematode Caenorhabditis elegans as target of the differentiation-associated microRNA (miRNA) let-7 (lethal 7) and therefore part of a heterochronic gene network that controls larval development. In humans, it regulates let-7 microRNA biogenesis via modulation of Lin28B protein polyubiquitination. TRIM71 localizes to cytoplasmic P-bodies and directly interacts with the miRNA pathway proteins Argonaute 2 (AGO2) and DICER. It represses miRNA activity by promoting degradative ubiquitination of AGO2. Moreover, TRIM71 associates with SHCBP1, a novel component of the fibroblast growth factor (FGF) signaling pathway, and regulates its non-degradative polyubiquitination. It is also involved in the post-transcriptional regulation of the CDKN1A, RBL1 and RBL2 or EGR1 mRNAs through mediating RNA-binding in embryonic stem cells. TRIM71 belongs to the C-VII subclass of TRIM (tripartite motif)-NHL family that is defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil domain, as well as a NHL (named after proteins NCL-1, HT2A and Lin-41 that contain repeats folded into a six-bladed beta propeller) repeat domain positioned C-terminal to the RBCC domain.	77
319504	cd16590	RING-HC_TRIM4_C-IV_like	RING finger, HC subclass, found in tripartite motif-containing proteins, TRIM4, TRIM75, tripartite motif family-like protein 1 (TRIML1) and similar proteins. TRIM4 and TRIM75, two closely related tripartite motif-containing proteins, belong to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a SPRY/B30.2 domain positioned C-terminal to the RBCC domain. TRIM4, also known as RING finger protein 87 (RNF87), is a cytoplasmic E3 ubiquitin-protein ligase that evolved recently and is present only in higher mammals. It transiently interacts with mitochondria, induces mitochondrial aggregation and sensitizes the cells to hydrogen peroxide (H2O2) induced death. Its interaction with peroxiredoxin 1 (PRX1) is critical for the regulation of H2O2 induced cell death. Moreover, TRIM4 functions as a positive regulator of RIG-I-mediated type I interferon induction. It regulates the K63-linked ubiquitination of RIG-I and assembly of antiviral signaling complex at mitochondria. TRIM75 mainly localizes within spindles, suggesting it may function in spindle organization and thereby affect meiosis. The family also includes TRIML1 that is identical to TRIM11 and TRIM17 except for the absence of B-box domain. TRIML1, also known as RING finger protein 209 (RNF209), is a probable E3 ubiquitin-protein ligase expressed in embryo before implantation. It plays an important role in blastocyst development. By interacting with USP5 (also known as isoT), TRIML1 may exert its influence on debranching ubiquitin from multi-chains on the stability and activity of protein substrates in the preimplantation embryo.	45
319505	cd16591	RING-HC_TRIM5_like-C-IV	RING finger, HC subclass, found in tripartite motif-containing proteins TRIM5, TRIM6, TRIM22, TRIM34 and similar proteins. TRIM5, TRIM6, TRIM22, and TRIM34, four closely related tripartite motif-containing proteins, belong to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. TRIM5, also known as RING finger protein 88 (RNF88), is a capsid-specific restriction factor that prevents infection from non-host-adapted retroviruses in a species-specific manner by binding to and destabilizing the retroviral capsid lattice before reverse transcription is completed. Its retroviral restriction activity correlates with the ability to activate TAK1-dependent innate immune signaling. TRIM5 also acts as a pattern recognition receptor that activates innate immune signaling in response to the retroviral capsid lattice. Moreover, TRIM5 plays a role in regulating autophagy through activation of autophagy regulator BECN1 by causing its dissociation from its inhibitors BCL2 and TAB2. It also plays a role in autophagy by acting as a selective autophagy receptor which recognizes and targets HIV-1 capsid protein p24 for autophagic destruction. TRIM6, also known as RING finger protein 89 (RNF89), is an E3-ubiquitin ligase that cooperates with the E2-ubiquitin conjugase UbE2K to catalyze the synthesis of unanchored K48-linked polyubiquitin chains, and further stimulates the interferon-I kappa B kinase epsilon (IKKepsilon) kinase-mediated antiviral response. 
It also regulates the transcriptional activity of Myc during the maintenance of embryonic stem (ES) cell pluripotency, and may act as a novel regulator for Myc-mediated transcription in ES cells. TRIM22, also known as 50 kDa-stimulated trans-acting factor (Staf-50) or RING finger protein 94 (RNF94), is an E3 ubiquitin-protein ligase that plays an integral role in the host innate immune response to viruses. It has been shown to inhibit the replication of a number of viruses, including HIV-1, hepatitis B, and influenza A. TRIM22 acts as a suppressor of basal HIV-1 long terminal repeat (LTR)-driven transcription by preventing the transcription factor specificity protein 1 (Sp1) binding to the HIV-1 promoter. It also controls FoxO4 activity and cell survival by directing Toll-like receptor 3 (TLR3)-stimulated cells toward type I interferon (IFN) gene induction or apoptosis. Moreover, TRIM22 can activate the noncanonical nuclear factor-kappaB (NF-kappaB) pathway by activating I kappa B kinase alpha (IKKalpha). It also regulates nucleotide binding oligomerization domain containing 2 (NOD2)-dependent activation of interferon-beta signaling and nuclear factor-kappaB. TRIM34, also known as interferon-responsive finger protein 1 or RING finger protein 21 (RNF21), may function as an antiviral protein that contributes to the defense against retroviral infections.	49
319506	cd16592	RING-HC_TRIM7_C-IV	RING finger, HC subclass, found in tripartite motif-containing protein 7 (TRIM7) and similar proteins. TRIM7, also known as glycogenin-interacting protein (GNIP) or RING finger protein 90 (RNF90), is an E3 ubiquitin-protein ligase that mediates c-Jun/AP-1 activation by Ras signalling. Its phosphorylation and activation by MSK1 in response to direct activation by the Ras-Raf-MEK-ERK pathway can stimulate TRIM7 E3 ubiquitin ligase activity in mediating Lys63-linked ubiquitination of the AP-1 coactivator RACO-1, leading to RACO-1 protein stabilization. Moreover, TRIM7 binds and activates glycogenin, the self-glucosylating initiator of glycogen biosynthesis. TRIM7 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain.	45
319507	cd16593	RING-HC_TRIM10_C-IV	RING finger, HC subclass, found in tripartite motif-containing protein 10 (TRIM10) and similar proteins. TRIM10, also known as B30-RING finger protein (RFB30), RING finger protein 9 (RNF9), or hematopoietic RING finger 1 (HERF1), is a novel hematopoiesis-specific RING finger protein required for terminal differentiation of erythroid cells. TRIM10 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain.	49
319508	cd16594	RING-HC_TRIM11_like_C-IV	RING finger, HC subclass, found in tripartite motif-containing proteins, TRIM11 and TRIM27, and similar proteins. TRIM11 and TRIM27, two closely related tripartite motif-containing proteins, belong to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a SPRY/B30.2 domain positioned C-terminal to the RBCC domain. TRIM11, also known as protein BIA1, or RING finger protein 92 (RNF92), is an E3 ubiquitin-protein ligase involved in the development of the central nervous system. It is overexpressed in high-grade gliomas and promotes proliferation, invasion, migration and glial tumor growth. TRIM11 acts as a potential therapeutic target for congenital central hypoventilation syndrome (CCHS) through mediating the degradation of congenital central hypoventilation syndrome-associated polyalanine-expanded Phox2b. Trim11 modulates the function of neurogenic transcription factor Pax6 through the ubiquitin-proteasome system, and thus plays an essential role for the Pax6-dependent neurogenesis. It also binds to and destabilizes a key component of the activator-mediated cofactor complex (ARC105), humanin, a neuroprotective peptide against Alzheimer's disease-relevant insults, through the ubiquitin-proteasome system, and further regulates ARC105 function in transforming growth factor beta (TGFbeta) signaling. Moreover, TRIM11 negatively regulates retinoic acid-inducible gene-I (RIG-I)-mediated interferon-beta (IFNbeta) production and antiviral activity by targeting TANK-binding kinase-1 (TBK1). It may contribute to the endogenous restriction of retroviruses in cells. It enhances N-tropic murine leukemia virus (N-MLV) entry by interfering with Ref1 restriction. 
It also suppresses the early steps of human immunodeficiency virus HIV-1 transduction, resulting in decreased reverse transcripts. TRIM27, also known as RING finger protein 76 (RNF76), RET finger protein (RFP), or zinc finger protein RFP, is a nuclear E3 ubiquitin-protein ligase that is highly expressed in testis and in various tumor cell lines. Expression of TRIM27 is associated with prognosis of colon and endometrial cancers. TRIM27 was first identified as a fusion partner of the RET receptor tyrosine kinase. It functions as a transcriptional repressor and associates with several proteins involved in transcriptional activity, such as enhancer of polycomb 1 (Epc1), a member of the Polycomb group proteins, and Mi-2beta, a main component of the nucleosome remodeling and deacetylase (NuRD) complex, and the cell cycle regulator retinoblastoma protein (RB1). It also interacts with HDAC1, leading to downregulation of thioredoxin binding protein 2 (TBP-2), which inhibits the function of thioredoxin. Moreover, TRIM27 mediates Pax7-induced ubiquitination of MyoD in skeletal muscle atrophy. In addition, it inhibits muscle differentiation by modulating serum response factor (SRF) and Epc1. Furthermore, TRIM27 promotes non-canonical polyubiquitination of PTEN, a lipid phosphatase that catalyzes PtdIns(3,4,5)P3 (PIP3) to PtdIns(4,5)P2 (PIP2). It is an IKKepsilon-interacting protein that regulates IkappaB kinase (IKK) function and negatively regulates signaling involved in the antiviral response and inflammation. In addition, TRIM27 forms a protein complex with MBD4 or MBD2 or MBD3, and thus plays an important role in the enhancement of transcriptional repression through MBD proteins in tumorigenesis, spermatogenesis, and embryogenesis. It is also a component of an estrogen receptor 1 (ESR1) regulatory complex, and is involved in estrogen receptor-mediated transcription in MCF-7 cells. 
Meanwhile, TRIM27 interacts with the hinge region of chromosome 3 protein (SMC3), a component of the multimeric cohesin complex that holds sister chromatids together and prevents their premature separation during mitosis.	45
319509	cd16595	RING-HC_TRIM17_C-IV	RING finger, HC subclass, found in tripartite motif-containing protein TRIM17 and similar proteins. TRIM17, also known as RING finger protein 16 (RNF16) or testis RING finger protein (Terf), is a crucial E3 ubiquitin ligase that is necessary and sufficient for neuronal apoptosis and contributes to Mcl-1 ubiquitination in cerebellar granule neurons (CGNs). It interacts in a SUMO-dependent manner with nuclear factor of activated T cell NFATc3 transcription factor, and thus inhibits the activity of NFATc3 by preventing its nuclear localization. In contrast, it binds to and inhibits NFATc4 transcription factor in a SUMO-independent manner. Moreover, TRIM17 stimulates degradation of kinetochore protein ZW10 interacting protein (ZWINT), a known component of the kinetochore complex required for the mitotic spindle checkpoint, and negatively regulates cell proliferation. TRIM17 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain.	48
319510	cd16596	RING-HC_TRIM21_C-IV	RING finger, HC subclass, found in tripartite motif-containing protein TRIM21 and similar proteins. TRIM21, also known as 52 kDa Ro protein, 52 kDa ribonucleoprotein autoantigen Ro/SS-A, Ro(SS-A), RING finger protein 81 (RNF81), or Sjoegren syndrome type A antigen (SS-A), is a ubiquitously expressed E3 ubiquitin-protein ligase and a high affinity antibody receptor uniquely expressed in the cytosol of mammalian cells. As a cytosolic Fc receptor, TRIM21 binds the Fc of virus-associated antibodies and targets the complex in the cytosol for proteasomal degradation in a process known as antibody-dependent intracellular neutralization (ADIN), and provides an intracellular immune response to protect host defense against pathogen infection. It shows remarkably broad isotype specificity as it does not only bind IgG, but also IgM and IgA. Moreover, TRIM21 promotes the cytosolic DNA sensor cGAS and the cytosolic RNA sensor RIG-I sensing of viral genomes during infection by antibody-opsonized virus. It stimulates inflammatory signaling and activates innate transcription factors, such as nuclear factor-kappaB (NF-kappaB). TRIM21 also plays an essential role in p62-regulated redox homeostasis, suggesting a viable target for treating pathological conditions resulting from oxidative damage. Furthermore, TRIM21 may have implications for various autoimmune diseases associated with uncontrolled antiviral signaling through the regulation of Nmi-IFI35 complex-mediated inhibition of innate antiviral response. TRIM21 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain.	44
319511	cd16597	RING-HC_TRIM25_C-IV	RING finger, HC subclass, found in tripartite motif-containing protein TRIM25 and similar proteins. TRIM25, also known as estrogen-responsive finger protein (EFP), RING finger protein 147 (RNF147), or RING-type E3 ubiquitin transferase, is an E3 ubiquitin/ISG15 ligase that is induced by estrogen and is therefore particularly abundant in placenta and uterus. TRIM25 regulates various cellular processes through E3 ubiquitin ligase activity, transferring ubiquitin and ISG15 to target proteins. It mediates K63-linked polyubiquitination of retinoic acid inducible gene I (RIG-I) that is crucial for downstream antiviral interferon signaling. It is also required for melanoma differentiation-associated gene 5 (MDA5) and mitochondrial antiviral signaling (MAVS, also known as IPS-1, VISA, Cardiff) mediated activation of nuclear factor-kappaB (NF-kappaB) and interferon production. Upon UV irradiation, TRIM25 interacts with mono-ubiquitinated PCNA and promotes its ISG15 modification (ISGylation), suggesting a crucial role in termination of error-prone translesion DNA synthesis. TRIM25 also functions as a novel regulator of p53 and Mdm2. It enhances p53 and Mdm2 abundance by inhibiting their ubiquitination and degradation in 26S proteasomes. Meanwhile, it inhibits p53's transcriptional activity and dampens the response to DNA damage, and is essential for medaka development and this dependence is rescued by silencing of p53. Moreover, TRIM25 is involved in the host cellular innate immune response against retroviral infection. It interferes with the late stage of feline leukemia virus (FeLV) replication. Furthermore, TRIM25 acts as an oncogene in gastric cancer. Its blockade by RNA interference inhibits migration and invasion of gastric cancer cells through transforming growth factor-beta (TGF-beta) signaling, suggesting it presents a novel target for the detection and treatment of gastric cancer. 
In addition, TRIM25 acts as an RNA-specific activator for Lin28a/TuT4-mediated uridylation. TRIM25 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain.	44
319512	cd16598	RING-HC_TRIM26_C-IV	RING finger, HC subclass, found in tripartite motif-containing protein 26 (TRIM26) and similar proteins. TRIM26, also known as acid finger protein (AFP), RING finger protein 95 (RNF95), or zinc finger protein 173 (ZNF173), is an E3 ubiquitin-protein ligase that negatively regulates interferon-beta production and antiviral response through polyubiquitination and degradation of nuclear transcription factor IRF3. It functions as an important regulator for RNA virus-triggered innate immune response by bridging TBK1 to NEMO (NF-kappaB essential modulator, also known as IKKgamma) and mediating TBK1 activation. It also acts as a novel tumor suppressor of hepatocellular carcinoma by regulating cancer cell proliferation, colony forming ability, migration, and invasion. TRIM26 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, a B-box, and two coiled coil domains, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain.	45
319513	cd16599	RING-HC_TRIM35_C-IV	RING finger, HC subclass, found in tripartite motif-containing protein 35 (TRIM35) and similar proteins. TRIM35, also known as hemopoietic lineage switch protein 5 (HLS5), is a putative hepatocellular carcinoma (HCC) suppressor that inhibits phosphorylation of pyruvate kinase isoform M2 (PKM2), which is involved in aerobic glycolysis of cancer cells and further suppresses the Warburg effect and tumorigenicity in HCC. It also negatively regulates Toll-like receptor 7 (TLR7)- and TLR9-mediated type I interferon production by suppressing the stability of interferon regulatory factor 7 (IRF7). Moreover, TRIM35 regulates erythroid differentiation by modulating globin transcription factor 1 (GATA-1) activity. TRIM35 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain.	44
319514	cd16600	RING-HC_TRIM38_C-IV	RING finger, HC subclass, found in tripartite motif-containing protein 38 (TRIM38) and similar proteins. TRIM38, also known as RING finger protein 15 (RNF15) or zinc finger protein RoRet, is an E3 ubiquitin-protein ligase that promotes K63- and K48-linked ubiquitination of cellular proteins and also catalyzes self-ubiquitination. It negatively regulates tumor necrosis factor alpha (TNF-alpha)- and interleukin-1beta-triggered nuclear factor-kappaB (NF-kappaB) activation by mediating lysosomal-dependent degradation of transforming growth factor beta (TGFbeta)-activated kinase 1 (TAK1)-binding protein (TAB)2/3, two critical components of the TAK1 kinase complex. It also inhibits TLR3/4-mediated activation of NF-kappaB and interferon regulatory factor 3 (IRF3) by mediating ubiquitin-proteasomal degradation of TNF receptor-associated factor 6 (Traf6) and NAK-associated protein 1 (Nap1), respectively. Moreover, TRIM38 negatively regulates TLR3-mediated interferon beta (IFN-beta) signaling by targeting ubiquitin-proteasomal degradation of TIR domain-containing adaptor inducing IFN-beta (TRIF). It functions as a valid target for autoantibodies in primary Sjogren's syndrome. TRIM38 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, a B-box, and two coiled coil domains, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain.	51
319515	cd16601	RING-HC_TRIM39_C-IV	RING finger, HC subclass, found in tripartite motif-containing protein 39 (TRIM39) and similar proteins. TRIM39, also known as RING finger protein 23 (RNF23) or testis-abundant finger protein, is an E3 ubiquitin-protein ligase that plays a role in controlling DNA damage-induced apoptosis through inhibition of the anaphase promoting complex (APC/C), a multiprotein ubiquitin ligase that controls multiple cell cycle regulators, including cyclins, geminin, and others. TRIM39 also functions as a regulator of several key processes in the proliferative cycle. It directly regulates p53 stability. It modulates cell cycle progression and DNA damage responses via stabilizing p21. Moreover, TRIM39 negatively regulates the nuclear factor kappaB (NFkappaB)-mediated signaling pathway through stabilization of Cactin, an inhibitor of NFkappaB- and Toll-like receptor (TLR)-mediated transcriptions, which is induced by inflammatory stimulants such as tumor necrosis factor alpha (TNFalpha). Furthermore, TRIM39 is a MOAP-1-binding protein that can promote apoptosis signaling through stabilization of MOAP-1 via the inhibition of its poly-ubiquitination process. TRIM39 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain.	44
319516	cd16602	RING-HC_TRIM41_like_C-IV	RING finger, HC subclass, found in tripartite motif-containing proteins TRIM41, TRIM52 and similar proteins. TRIM41 and TRIM52, two closely related tripartite motif-containing proteins, have dramatically expanded RING domains compared with the rest of the TRIM family proteins. TRIM41 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. In contrast, TRIM52 lacks the putative viral recognition SPRY/B30.2 domain, and thus has been classified to the C-V subclass of TRIM family that contains only RBCC domains. TRIM41, also known as RING finger-interacting protein with C kinase (RINCK), is an E3 ubiquitin-protein ligase that promotes the ubiquitination of protein kinase C (PKC) isozymes in cells. It specifically recognizes the C1 domain of PKC isozymes. It controls the amplitude of PKC signaling by controlling the amount of PKC in the cell. TRIM52, also known as RING finger protein 102 (RNF102), is encoded by a novel, noncanonical antiviral TRIM52 gene in primate genomes with unique specificity determined by the rapidly evolving RING domain.	41
319517	cd16603	RING-HC_TRIM43_like_C-IV	RING finger, HC subclass, found in tripartite motif-containing proteins TRIM43, TRIM48, TRIM49, TRIM51, TRIM64, TRIM77 and similar proteins. The family includes a group of closely related uncharacterized tripartite motif-containing proteins, TRIM43, TRIM43B, TRIM48/RNF101, TRIM49/RNF18, TRIM49B, TRIM49C/TRIM49L2, TRIM49D/TRIM49L, TRIM51/SPRYD5, TRIM64, TRIM64B, TRIM64C, and TRIM77, whose biological functions remain unclear. TRIM49, also known as testis-specific RING-finger protein, has moderate similarity with SS-A/Ro52 antigen, suggesting it may be one of the target proteins of autoantibodies in the sera of patients with autoimmune disorders. All family members belong to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a SPRY/B30.2 domain positioned C-terminal to the RBCC domain.	46
319518	cd16604	RING-HC_TRIM47_C-IV	RING finger, HC subclass, found in tripartite motif-containing protein 47 (TRIM47) and similar proteins. TRIM47, also known as gene overexpressed in astrocytoma protein (GOA) or RING finger protein 100 (RNF100), belongs to a subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, a B-box, and two coiled coil domains, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. It plays an important role in the process of dedifferentiation that is associated with astrocytoma tumorigenesis.	47
319519	cd16605	RING-HC_TRIM50_like_C-IV	RING finger, HC subclass, found in tripartite motif-containing protein TRIM50, TRIM73, TRIM74 and similar proteins. TRIM50 is a stomach-specific E3 ubiquitin-protein ligase, encoded by the Williams-Beuren syndrome (WBS) TRIM50 gene, which regulates vesicular trafficking for acid secretion in gastric parietal cells. It colocalizes, interacts with, and increases the level of p62/SQSTM1, a multifunctional adaptor protein implicated in various cellular processes including the autophagy clearance of polyubiquitinated protein aggregates. It also promotes the formation and clearance of aggresome-associated polyubiquitinated proteins through the interaction with the histone deacetylase 6 (HDAC6), a tubulin specific deacetylase that regulates microtubule-dependent aggresome formation. TRIM50 can be acetylated by PCAF and p300. TRIM50 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The family also includes two paralogs of TRIM50, tripartite motif-containing protein 73 (TRIM73), also known as tripartite motif-containing protein 50B (TRIM50B), and tripartite motif-containing protein 74 (TRIM74), also known as tripartite motif-containing protein 50C (TRIM50C), both of which are WBS-related genes encoding proteins and may also act as E3 ligases. In contrast with TRIM50, TRIM73 and TRIM74 belong to the C-V subclass of TRIM family of proteins that are defined by an N-terminal RBCC domains only.	45
319520	cd16606	RING-HC_TRIM58_C-IV	RING finger, HC subclass, found in tripartite motif-containing protein TRIM58 and similar proteins. TRIM58, also known as protein BIA2, is an erythroid E3 ubiquitin-protein ligase induced during late erythropoiesis. It binds and ubiquitinates the intermediate chain of the microtubule motor dynein (DYNC1LI1/DYNC1LI2), stimulating the degradation of the dynein holoprotein complex. It may participate in the erythroblast enucleation process through regulation of nuclear polarization. TRIM58 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain.	51
319521	cd16607	RING-HC_TRIM60_like_C-IV	RING finger, HC subclass, found in tripartite motif-containing proteins TRIM60, TRIM61 and similar proteins. TRIM60 and TRIM61 are two closely related tripartite motif-containing proteins. TRIM60, also known as RING finger protein 129 (RNF129) or RING finger protein 33 (RNF33), is a cytoplasmic protein expressed in the testis. It may play an important role in the spermatogenesis process, the development of the preimplantation embryo, and in testicular functions. RNF33 interacts with the cytoplasmic kinesin motor proteins KIF3A and KIF3B, suggesting a possible contribution to cargo movement along the microtubule in the expressed sites. It is also involved in spermatogenesis in Sertoli cells under the regulation of nuclear factor-kappaB (NF-kappaB). TRIM60 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, a B-box, and two coiled coil domains, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. In contrast to TRIM60, TRIM61 belongs to the C-V subclass of TRIM family that contains RBCC domains only. Its biological function remains unclear.	47
319522	cd16608	RING-HC_TRIM62_C-IV	RING finger, HC subclass, found in tripartite motif-containing protein 62 (TRIM62) and similar proteins. TRIM62, also known as Ductal Epithelium Associated Ring Chromosome 1 (DEAR1), is a cytoplasmic E3 ubiquitin-protein ligase that was identified as a dominant regulator of acinar morphogenesis in the mammary gland. It is implicated in the inflammatory response of immune cells by regulating the Toll-like receptor 4 (TLR4) signaling pathway, leading to increased activity of the activator protein 1 (AP-1) transcription factor in primary macrophages. It is also involved in muscular protein homeostasis, especially during inflammation-induced atrophy, and may play a role in the pathogenesis of ICU-acquired weakness (ICUAW) by activating and maintaining inflammation in myocytes. Moreover, TRIM62 facilitates K27-linked poly-ubiquitination of CARD9 and also regulates CARD9-mediated anti-fungal immunity and intestinal inflammation. Furthermore, TRIM62 is involved in the regulation of apical-basal polarity and acinar morphogenesis. It also functions as a chromosome 1p35 tumor suppressor and negatively regulates transforming growth factor beta (TGFbeta)-driven epithelial-mesenchymal transition (EMT) through binding to and promoting the ubiquitination of SMAD3, a major effector of TGFbeta-mediated EMT. TRIM62 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain.	50
319523	cd16609	RING-HC_TRIM65_C-IV	RING finger, HC subclass, found in tripartite motif-containing protein TRIM65 and similar proteins. TRIM65 is an E3 ubiquitin-protein ligase that interacts with the innate immune receptor MDA5 enhancing its ability to stimulate interferon-beta signaling. It functions as a potential oncogenic protein that negatively regulates p53 through ubiquitination, providing insight into development of novel approaches targeting TRIM65 for non-small cell lung carcinoma (NSCLC) treatment, and also overcoming chemotherapy resistance. Moreover, TRIM65 negatively regulates microRNA-driven suppression of mRNA translation by targeting TNRC6 proteins for ubiquitination and degradation. TRIM65 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain.	47
319524	cd16610	RING-HC_TRIM68_C-IV	RING finger, HC subclass, found in tripartite motif-containing protein 68 (TRIM68) and similar proteins. TRIM68, also known as RING finger protein 137 (RNF137) or SSA protein SS-56 (SS-56), is an E3 ubiquitin-protein ligase that negatively regulates Toll-like receptor (TLR)- and RIG-I-like receptor (RLR)-driven type I interferon production by degrading TRK fused gene (TFG), a novel driver of IFN-beta downstream of anti-viral detection systems. It also functions as a cofactor for androgen receptor-mediated transcription through regulating ligand-dependent transcription of androgen receptor in prostate cancer cells. Moreover, TRIM68 is a cellular target of autoantibody responses in Sjogren's syndrome (SS), as well as systemic lupus erythematosus (SLE). It is also an auto-antigen for T cells in SS and SLE. TRIM68 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, a B-box, and two coiled coil domains, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain.	49
319525	cd16611	RING-HC_TRIM69_C-IV	RING finger, HC subclass, found in tripartite motif-containing protein 69 (TRIM69) and similar proteins. TRIM69, also known as RFP-like domain-containing protein trimless or RING finger protein 36 (RNF36), is a testis E3 ubiquitin-protein ligase that plays a specific role in apoptosis and may also play an important role in germ cell homeostasis during spermatogenesis. TRIM69 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain.	44
319526	cd16612	RING-HC_TRIM72_C-IV	RING finger, HC subclass, found in tripartite motif-containing protein 72 (TRIM72) and similar proteins. TRIM72, also known as Mitsugumin-53 (MG53), is a muscle-specific protein that plays a central role in cell membrane repair by nucleating the assembly of the repair machinery at muscle injury sites. It is required in repair of alveolar epithelial cells under plasma membrane stress failure. It interacts with dysferlin to regulate sarcolemmal repair. Upregulation of TRIM72 develops obesity, systemic insulin resistance, dyslipidemia, and hyperglycemia, as well as induces diabetic cardiomyopathy through transcriptional activation of peroxisome proliferation-activated receptor alpha (PPAR-alpha) signaling pathway. Compensation for the absence of AKT signaling by ERK signaling during TRIM72 overexpression leads to pathological hypertrophy. Moreover, TRIM72 functions as a novel negative feedback regulator of myogenesis via targeting insulin receptor substrate-1 (IRS-1). It is transcriptionally activated by the synergism of myogenin (MyoD) and myocyte enhancer factor 2 (MEF2). TRIM72 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain.	47
319527	cd16613	RING-HC_UHRF	RING finger, HC subclass, found in ubiquitin-like PHD and RING finger domain-containing proteins, UHRF1 and UHRF2, and similar proteins. UHRF1 is a unique chromatin effector protein that integrates the recognition of both histone PTMs and DNA methylation. It is essential for cell proliferation and plays a critical role in the development and progression of many human carcinomas, such as laryngeal squamous cell carcinoma (LSCC), gastric cancer (GC), esophageal squamous cell carcinoma (ESCC), colorectal cancer, prostate cancer, and breast cancer. UHRF1 acts as a transcriptional repressor through its binding to histone H3 when it is unmodified at Arg2. Its overexpression in human lung fibroblasts results in downregulation of expression of the tumor suppressor pRB. It also plays a role in transcriptional repression of the cell cycle regulator p21. Moreover, UHRF1-dependent repression of transcription factors can facilitate the G1-S transition. It interacts with Tat-interacting protein of 60 kDa (TIP60) and induces degradation-independent ubiquitination of TIP60. It is also an N-methylpurine DNA glycosylase (MPG)-interacting protein that binds MPG in a p53 status-independent manner in the DNA base excision repair (BER) pathway. In addition, UHRF1 functions as an epigenetic regulator that is important for multiple aspects of epigenetic regulation, including maintenance of DNA methylation patterns and recognition of various histone modifications. UHRF2 was originally identified as a ubiquitin ligase acting as a small ubiquitin-like modifier (SUMO) E3 ligase that enhances zinc finger protein 131 (ZNF131) SUMOylation, but does not enhance ZNF131 ubiquitination. It also ubiquitinates PCNP, a PEST-containing nuclear protein. Moreover, UHRF2 functions as a nuclear protein involved in cell-cycle regulation and has been implicated in tumorigenesis. 
It interacts with cyclins, CDKs, p53, pRB, PCNA, HDAC1, DNMTs, G9a, methylated histone H3 lysine 9, and methylated DNA. It interacts with the cyclin E-CDK2 complex, ubiquitinates cyclins D1 and E1, induces G1 arrest, and is involved in the G1/S transition regulation. Furthermore, UHRF2 is a direct transcriptional target of the transcription factor E2F-1 in the induction of apoptosis. It recruits HDAC1 and binds to methyl-CpG. UHRF2 also participates in the maturation of Hepatitis B virus (HBV) by interacting with the HBV core protein and promoting its degradation. Both UHRF1 and UHRF2 contain an N-terminal ubiquitin-like domain (UBL), a tandem Tudor domain (TTD), a plant homeodomain (PHD) finger, a SET- and RING-associated (SRA) domain, and a C-terminal C3HC4-type RING-HC finger.	46
319528	cd16614	RING-HC_UNK_like	RING finger, HC subclass, found in RING finger protein unkempt (UNK), unkempt-like (UNKL), and similar proteins. UNK, also known as zinc finger CCCH domain-containing protein 5, is a metazoan-specific zinc finger protein enriched in embryonic brains. It may play a broad regulatory role during the formation of the central nervous system (CNS). It is a sequence-specific RNA-binding protein required for the early neuronal morphology. UNK is a neurogenic component of the mTOR pathway, and functions as a negative regulator of the timing of photoreceptor differentiation. It also specifically binds to Brg/Brm-associated factor BAF60b and promotes its ubiquitination in a Rac1-dependent manner. UNKL, also known as zinc finger CCCH domain-containing protein 5-like, is a putative E3 ubiquitin-protein ligase that may participate in a protein complex showing an E3 ligase activity regulated by RAC1. Both UNK and UNKL contain several tandem CCCH-type zinc fingers at the N-terminus, and a C3HC4-type RING-HC finger at the C-terminus.	34
319529	cd16615	RING-HC_ZNF598	RING finger, HC subclass, found in zinc finger protein 598 (ZNF598) and similar proteins. ZNF598 associates with the mammalian eukaryotic initiation factor 4E (eIF4E) homologous protein (m4EHP) through binding to Grb10-interacting GYF protein 2 (GIGYF2). The m4EHP-GIGYF2 complex functions as a translational repressor and is essential for normal embryonic development in mammals. ZNF598 harbors a C3HC4-type RING-HC finger at its N-terminus.	41
319530	cd16616	mRING-HC-C4C4_Asi1p_like	Modified RING finger, HC subclass (C4C4-type), found in Saccharomyces cerevisiae amino acid sensor-independent protein Asi1p, Asi3p and similar proteins. Asi1p and Asi3p are inner nuclear membrane proteins that act as negative regulators of SPS (Ssy1-Ptr3-Ssy5)-sensor signaling in yeast. Together with Asi2p, they assemble into an Asi complex that functions in the SPS amino acid sensing pathway involved in degradation of Stp1 and Stp2 transcription factors. Both Asi1p and Asi3p contain five membrane-spanning domains, as well as highly conserved RING fingers at their extreme C termini, which are a C4C4-type RING finger motif whose overall folding is similar to that of the C3HC4-type RING-HC finger.	44
319531	cd16617	mRING-HC-C4C4_CesA_plant	Modified RING finger, HC subclass (C4C4-type), found in Arabidopsis thaliana cellulose synthase A (CesA) catalytic subunit 1-10, and similar proteins from plant. The family includes a group of plant catalytic subunits of cellulose synthase terminal complexes ("rosettes") required for beta-1,4-glucan microfibril crystallization, a major mechanism of the cell wall formation. CesA1, also known as protein RADIALLY SWOLLEN 1 (RSW1), is required during embryogenesis for cell elongation, orientation of cell expansion and complex cell wall formations, such as interdigitated pattern of epidermal pavement cells, stomatal guard cells, and trichomes. It plays a role in lateral roots formation, but seems unnecessary for the development of tip-growing cells such as root hairs. CesA2, also known as Ath-A, is involved in the primary cell wall formation. It forms a homodimer. CesA3, also known as constitutive expression of VSP1 protein 1, or isoxaben-resistant protein 1, or Ath-B, or protein ECTOPIC LIGNIN 1, or protein RADIALLY SWOLLEN 5 (RSW5), is involved in the primary cell wall formation, especially in roots. CesA4, also known as protein IRREGULAR XYLEM 5 (IRX5), is involved in the secondary cell wall formation, and required for the xylem cell wall thickening. CesA5 may be partially redundant with CesA6. CesA6, also known as AraxCelA, isoxaben-resistant protein 2, protein PROCUSTE 1, or protein QUILL, is involved in the primary cell wall formation. Like CesA1, CesA6 is critical for cell expansion. The CESA6-dependent cell elongation seems to be independent of gibberellic acid, auxin, and ethylene. CesA6 interacts with and moves along cortical microtubules for the process of cellulose deposition. CesA7, also known as protein FRAGILE FIBER 5, or protein IRREGULAR XYLEM 3 (IRX3) is involved in the secondary cell wall formation and required for the xylem cell wall thickening. 
CesA8, also known as protein IRREGULAR XYLEM 1 (IRX1) or protein LEAF WILTING 2, is involved in the secondary cell wall formation and required for the xylem cell wall thickening. The biological functions of CesA9 and CesA10 remain unclear. CesA1, CesA3, and CesA6 form a functional complex essential for primary cell wall cellulose synthesis, while CesA4, CesA7, and CesA8 form a functional complex located in secondary cell wall deposition sites. All family members contain an N-terminal C4C4-type RING-HC finger and a C-terminal glycosyltransferase family A (GT-A) domain.	51
319532	cd16618	mRING-HC-C4C4_CNOT4	Modified RING finger, HC subclass (C4C4-type), found in CCR4-NOT transcription complex subunit 4 (NOT4) and similar proteins. NOT4, also known as CCR4-associated factor 4, E3 ubiquitin-protein ligase CNOT4, or potential transcriptional repressor NOT4, is a component of the multifunctional CCR4-NOT complex, a global regulator of RNA polymerase II transcription. It associates with polysomes and contributes to the negative regulation of protein synthesis. NOT4 functions as an E3 ubiquitin-protein ligase that interacts with a specific E2, Ubc4/5 in yeast, and the ortholog UbcH5B in humans, and ubiquitylates a wide range of substrates, including ribosome-associated factors. Thus, it plays a role in cotranslational quality control (QC) through ribosome-associated ubiquitination and degradation of aberrant peptides. NOT4 contains a C4C4-type RING finger motif, whose overall folding is similar to that of the C3HC4-type RING-HC finger, a central RNA recognition motif (RRM), and a C-terminal domain predicted to be unstructured.	45
319533	cd16619	mRING-HC-C4C4_TRIM37_C-VIII	Modified RING finger, HC subclass (C4C4-type), found in tripartite motif-containing protein 37 (TRIM37) and similar proteins. TRIM37, also known as mulibrey nanism protein, or MUL, is a peroxisomal E3 ubiquitin-protein ligase that is involved in the tumorigenesis of several cancer types, including pancreatic ductal adenocarcinoma (PDAC), hepatocellular carcinoma (HCC), breast cancer, and sporadic fibrothecoma. It mono-ubiquitinates histone H2A, a chromatin modification associated with transcriptional repression. Moreover, TRIM37 possesses anti-HIV-1 activity, and interferes with viral DNA synthesis. Mutations in the human TRIM37 gene (also known as MUL) cause Mulibrey (muscle-liver-brain-eye) nanism, a rare growth disorder of prenatal onset characterized by dysmorphic features, pericardial constriction, and hepatomegaly. TRIM37 belongs to the C-VIII subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C4C4-type RING finger, whose overall folding is similar to that of the typical C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a MATH (meprin and TRAF-C homology) domain positioned C-terminal to the RBCC domain. Its MATH domain has been shown to interact with the TRAF (TNF-Receptor-Associated Factor) domain of six known TRAFs in vitro.	43
319534	cd16620	vRING-HC-C4C4_RBBP6	Variant RING finger, HC subclass (C4C4-type), found in retinoblastoma-binding protein 6 (RBBP6) and similar proteins. RBBP6, also known as proliferation potential-related protein, protein P2P-R, retinoblastoma-binding Q protein 1 (RBQ-1), or p53-associated cellular protein of testis (PACT), is a nuclear E3 ubiquitin-protein ligase involved in multiple processes, such as the control of gene expression, mitosis, cell differentiation, and cell apoptosis. It plays a role in both promoting and inhibiting apoptosis in many human cancers, including esophageal, lung, hepatocellular, and colon cancers, familial myeloproliferative neoplasms, as well as in human immunodeficiency virus-associated nephropathy (HIVAN). It functions as an Rb- and p53-binding protein that plays an important role in chaperone-mediated ubiquitination and possibly in protein quality control. It acts as a scaffold protein to promote the assembly of the p53/TP53-MDM2 complex, resulting in an increase of MDM2-mediated ubiquitination and degradation of p53/TP53, and leading to both apoptosis and cell growth. It is also a double-stranded RNA-binding protein that plays a role in mRNA processing by regulating the human polyadenylation machinery and modulating expression of mRNAs with AU-rich 3' untranslated regions (UTRs). Moreover, RBBP6 ubiquitinates and destabilizes the transcriptional repressor ZBTB38 that negatively regulates transcription and levels of the MCM10 replication factor on chromatin. Furthermore, RBBP6 is involved in tunicamycin-induced apoptosis through mediating protein kinase (PKR) activation. RBBP6 contains an N-terminal ubiquitin-like domain and a C4C4-type RING finger, whose overall folding is similar to that of the typical C3HC4-type RING-HC finger. RBBP6 interacts with chaperones Hsp70 and Hsp40 through its N-terminal ubiquitin-like domain. It promotes the ubiquitination of p53 by Hdm2 in an E4-like manner through its RING finger. 
It also interacts directly with the pro-proliferative transcription factor Y-box-binding protein-1 (YB-1) via its RING finger.	45
319535	cd16621	vRING-HC-C4C4_RFPL1_like	Variant RING finger, HC subclass (C4C4-type), found in Ret finger protein-like (RFPL) family. RFPL family, also known as RING-B30 family, represents a group of RFPL gene products, RFPL1, RFPL2, RFPL3, and RFPL4A, which are characterized by containing an N-terminal RFPL1, 2, 3-specifying helix (RSH), a C4C4- or a modified C4C4-type RING finger, whose overall folding is similar to that of the typical C3HC4-type RING-HC finger, an RFPL-defining motif (RDM), and C-terminal PRY/SPRY-forming B30.2 domain. RFPL1, also known as RING finger protein 78 (RNF78), is expressed during cell differentiation, its impact on cell-cycle lengthening therefore provides novel insights into primate-specific development. RFPL2, also known as RING finger protein 79 (RNF79), shows high sequence similarity with other RFPL gene products. Its biological role remains unclear. RFPL3 interacts directly with CREB binding protein (CBP) in the nucleus of lung cancer cells. RFPL3 and CBP synergistically upregulate TERT activity and promote lung cancer growth. Moreover, RFPL3 acts as a novel E3 ubiquitin ligase modulating the integration activity of human immunodeficiency virus, type 1 (HIV-1) preintegration complex. RFPL4A, also known as RING finger protein 210 (RNF210), is a novel factor that increases the G1 population and decreases sensitivity to chemotherapy in human colorectal cancer cells. This family corresponds to the C4C4-type RING finger. RFPL4A lacks the fourth conserved zinc-binding residue, cysteine, and the eighth zinc-binding residue, cysteine, in RFPL2 is replaced by serine.	44
319536	cd16622	vRING-HC-C4C4_RBR_RNF217	Variant RING finger, HC subclass (C4C4-type), found in RING finger protein 217 (RNF217) and similar proteins. RNF217, also known as IBR domain-containing protein 1 (IBRDC1), is a transmembrane (TM) domain-containing RBR-type E3 ubiquitin-protein ligase mainly expressed in testis and skeletal muscle with different splice variants. It interacts with the anti-apoptotic protein HAX1, and is adjacent to a translocation breakpoint involving ETV6 in childhood acute lymphoblastic leukemia (ALL). RNF217 contains an RBR domain followed by TMs. The RBR domain was previously known as RING-BetweenRING-RING domain or TRIAD [two RING fingers and a DRIL (double RING finger linked)] domain. Based on current understanding of the structural biology of RBR ligases, the nomenclature of RBR has recently been revised to RING-BRcat (benign-catalytic)-Rcat (required-for-catalysis). The RBR (RING1-BRcat-Rcat) domain uses an auto-inhibitory mechanism to modulate ubiquitination activity, as well as a hybrid mechanism that combines aspects from both RING and HECT E3 ligase function to facilitate the ubiquitination reaction. This family corresponds to the RING domain, a variant C4C4-type RING finger whose overall folding is similar to that of the C3HC4-type RING-HC finger. It is required for RBR-mediated ubiquitination.	47
319537	cd16623	RING-HC_RBR_TRIAD1_like	RING finger, HC subclass, found in two RING fingers and DRIL [double RING finger linked] 1 (TRIAD1), ankyrin repeat and IBR domain-containing protein 1 (ANKIB1) and similar proteins. TRIAD1, also known as ariadne-2 (ARI-2), protein ariadne-2 homolog, Ariadne RBR E3 ubiquitin protein ligase 2 (ARIH2), or UbcM4-interacting protein 48, is an RBR-type E3 ubiquitin-protein ligase that catalyzes the formation of polyubiquitin chains linked via lysine-48 as well as lysine-63 residues. Its auto-ubiquitylation can be catalyzed by the E2 conjugating enzyme UBCH7. TRIAD1 has been implicated in hematopoiesis, specifically in myelopoiesis, as well as in embryogenesis. ANKIB1 is an RBR-type E3 ubiquitin-protein ligase that may function as part of an E3 complex, which accepts ubiquitin from specific E2 ubiquitin-conjugating enzymes and then transfers it to substrates. Both TRIAD1 and ANKIB1 contain an RBR domain that was previously known as RING-BetweenRING-RING domain or TRIAD [two RING fingers and a DRIL (double RING finger linked)] domain. Based on current understanding of the structural biology of RBR ligases, the nomenclature of RBR has recently been revised to RING-BRcat (benign-catalytic)-Rcat (required-for-catalysis). The RBR (RING1-BRcat-Rcat) domain uses an auto-inhibitory mechanism to modulate ubiquitination activity, as well as a hybrid mechanism that combines aspects from both RING and HECT E3 ligase function to facilitate the ubiquitination reaction. This family corresponds to the RING domain, a C3HC4-type RING-HC finger required for RBR-mediated ubiquitination. In contrast to TRIAD1, ANKIB1 harbors an extra N-terminal ankyrin repeat domain.	50
319538	cd16624	RING-HC_RBR_CUL9	RING finger, HC subclass, found in cullin-9 (CUL-9) and similar proteins. CUL-9, also known as UbcH7-associated protein 1 (H7-AP1), p53-associated parkin-like cytoplasmic protein, or PARC, is a cytoplasmic RBR-type E3 ubiquitin-protein ligase that is a tumor suppressor and promotes p53-dependent apoptosis. It mediates the ubiquitination and degradation of cytosolic cytochrome c to promote survival in neurons and cancer cells. It is also a critical downstream effector of the 3M complex in the maintenance of microtubules and genome integrity. Moreover, CUL-9, together with CUL-7, forms homodimers and heterodimers, as well as some atypical cullin RING ligase complexes, which may exhibit ubiquitin ligase activity. CUL-9 contains a CPH domain (Cul7, PARC, and HERC2), a DOC (DOC1/APC10) domain, cullin homology (CH) domains linked with E3 ligase function, and a C-terminal RBR domain previously known as RING-BetweenRING-RING domain or TRIAD [two RING fingers and a DRIL (double RING finger linked)] domain. Based on current understanding of the structural biology of RBR ligases, the nomenclature of RBR has been corrected as RING-BRcat (benign-catalytic)-Rcat (required-for-catalysis) recently. The RBR (RING1-BRcat-Rcat) domain uses an auto-inhibitory mechanism to modulate ubiquitination activity, as well as a hybrid mechanism that combines aspects from both RING and HECT E3 ligase function to facilitate the ubiquitination reaction. This family corresponds to the RING domain, a C3HC4-type RING-HC finger required for RBR-mediated ubiquitination.	52
319539	cd16625	RING-HC_RBR_HEL2_like	RING finger, HC subclass, found in Saccharomyces cerevisiae histone E3 ligase 2 (HEL2) and similar proteins. HEL2 is an E3 ubiquitin-protein ligase that interacts with the E2 ubiquitin-conjugating enzyme UBC4 and histones H3 and H4. It plays an important role in regulating histone protein levels and is also likely to contribute to the maintenance of genomic stability in the budding yeast. HEL2 can be phosphorylated by the DNA damage checkpoint kinase and histone protein regulator Rad53. This family also includes Schizosaccharomyces pombe histone E3 ligase 1 (HEL1), also known as DNA-break-localizing protein 4 (dbl4), and Dictyostelium discoideum Ariadne-like ubiquitin ligase (RbrA). RbrA may act as an E3 ubiquitin-protein ligase that appears to be required for normal cell-type proportioning and cell sorting during multicellular development, and is also necessary for spore cell viability. Members in this family contain a RBR domain that was previously known as RING-BetweenRING-RING domain or TRIAD [two RING fingers and a DRIL (double RING finger linked)] domain. Based on current understanding of the structural biology of RBR ligases, the nomenclature of RBR has been corrected as RING-BRcat (benign-catalytic)-Rcat (required-for-catalysis) recently. The RBR (RING1-BRcat-Rcat) domain uses an auto-inhibitory mechanism to modulate ubiquitination activity, as well as a hybrid mechanism that combines aspects from both RING and HECT E3 ligase function to facilitate the ubiquitination reaction. This family corresponds to the RING domain, a C3HC4-type RING-HC finger required for RBR-mediated ubiquitination.	54
319540	cd16626	RING-HC_RBR_HHARI	RING finger, HC subclass, found in human homolog of Drosophila ariadne (HHARI) and similar proteins. The family includes Drosophila melanogaster protein ariadne-1 (ARI-1), and its eukaryotic homologs, such as HHARI. ARI-1 is a novel widely expressed Drosophila RING-finger protein that localizes mainly in the cytoplasm and is required for neural development. It interacts with a novel ubiquitin-conjugating enzyme, UbcD10. HHARI, also known as H7-AP2, monocyte protein 6 (MOP-6), protein ariadne-1 homolog, Ariadne RBR E3 ubiquitin protein ligase 1 (ARIH1), ariadne-1 (ARI-1), UbcH7-binding protein, UbcM4-interacting protein, or ubiquitin-conjugating enzyme E2-binding protein 1, is a RBR-type E3 ubiquitin-protein ligase highly expressed in nuclei, where it is co-localized with nuclear bodies including Cajal, PML, and Lewy bodies. It interacts with the E2 conjugating enzymes UbcH7, UbcH8, UbcM4, and UbcD10 in human, mouse, and fly, and modulates the ubiquitylation of substrate proteins including single-minded 2 (SIM2) and translation initiation factor 4E homologous protein (4EHP). It functions as a potent mediator of DNA damage-induced translation arrest, which protects stem and cancer cells against genotoxic stress by initiating a 4EHP-mediated mRNA translation arrest. HHARI contains a RBR domain that was previously known as RING-BetweenRING-RING domain or TRIAD [two RING fingers and a DRIL (double RING finger linked)] domain. Based on current understanding of the structural biology of RBR ligases, the nomenclature of RBR has been corrected as RING-BRcat (benign-catalytic)-Rcat (required-for-catalysis) recently. The RBR (RING1-BRcat-Rcat) domain uses an auto-inhibitory mechanism to modulate ubiquitination activity, as well as a hybrid mechanism that combines aspects from both RING and HECT E3 ligase function to facilitate the ubiquitination reaction. This family corresponds to the RING domain, a C3HC4-type RING-HC finger required for RBR-mediated ubiquitination.	58
319541	cd16627	RING-HC_RBR_parkin	RING finger, HC subclass, found in parkin and similar proteins. Parkin, also known as Parkinson juvenile disease protein 2, is a RBR-type E3 ubiquitin-protein ligase that is associated with recessive early onset Parkinson's disease (PD), and exerts a protective effect against dopamine-induced alpha-synuclein-dependent cell toxicity. Mutations in the parkin gene cause autosomal recessive juvenile parkinsonism. Parkin functions within a multiprotein E3 ubiquitin ligase complex, catalyzing the covalent attachment of ubiquitin moieties onto substrate proteins, such as BCL2, SYT11, CCNE1, GPR37, RHOT1/MIRO1, MFN1, MFN2, STUB1, SNCAIP, SEPT5, TOMM20, USP30, ZNF746, and AIMP2. It mediates monoubiquitination, as well as Lys-6-, Lys-11-, Lys-48- and Lys-63-linked polyubiquitination of substrates depending on the context. Parkin may enhance cell viability and protect dopaminergic neurons from oxidative stress-mediated death by regulating mitochondrial function. It also limits the production of reactive oxygen species (ROS) and regulates cyclin-E during neuronal apoptosis. Moreover, parkin displays a ubiquitin ligase-independent function in transcriptional repression of p53. Parkin contains an N-terminal ubiquitin-like domain and a C-terminal RBR domain that was previously known as RING-BetweenRING-RING domain or TRIAD [two RING fingers and a DRIL (double RING finger linked)] domain. Based on current understanding of the structural biology of RBR ligases, the nomenclature of RBR has been corrected as RING-BRcat (benign-catalytic)-Rcat (required-for-catalysis) recently. The RBR (RING1-BRcat-Rcat) domain uses an auto-inhibitory mechanism to modulate ubiquitination activity, as well as a hybrid mechanism that combines aspects from both RING and HECT E3 ligase function to facilitate the ubiquitination reaction. This family corresponds to the RING domain, a C3HC4-type RING-HC finger required for RBR-mediated ubiquitination.	59
319542	cd16628	RING-HC_RBR_RNF14	RING finger, HC subclass, found in RING finger protein 14 (RNF14) and similar proteins. RNF14, also known as androgen receptor-associated protein 54 (ARA54), HFB30, or Triad2 protein, is a RBR-type E3 ubiquitin-protein ligase that is highly expressed in the testis and interacts with class III E2s (UBE2E2, UbcH6, and UBE2E3). Its differential localization may play an important role in testicular development and spermatogenesis in humans. RNF14 functions as a transcriptional regulator of mitochondrial and immune function in muscle. It is a ligand-dependent androgen receptor (AR) co-activator that enhances AR-dependent transcriptional activation. It also may participate in enhancing cell cycle progression and cell proliferation via induction of cyclin D1. Moreover, RNF14 is crucial for colon cancer cell survival. It acts as a new enhancer of the Wnt-dependent transcriptional outputs that acts at the level of the T-cell factor/lymphoid enhancer factor (TCF/LEF)-beta-catenin complex. RNF14 contains an N-terminal RWD domain and a C-terminal RBR domain. The RBR domain was previously known as RING-BetweenRING-RING domain or TRIAD [two RING fingers and a DRIL (double RING finger linked)] domain. Based on current understanding of the structural biology of RBR ligases, the nomenclature of RBR has been corrected as RING-BRcat (benign-catalytic)-Rcat (required-for-catalysis) recently. The RBR (RING1-BRcat-Rcat) domain uses an auto-inhibitory mechanism to modulate ubiquitination activity, as well as a hybrid mechanism that combines aspects from both RING and HECT E3 ligase function to facilitate the ubiquitination reaction. This family corresponds to the RING domain, a C3HC4-type RING-HC finger required for RBR-mediated ubiquitination.	54
319543	cd16629	RING-HC_RBR_RNF19	RING finger, HC subclass, found in the family of RING finger proteins RNF19A, RNF19B and similar proteins. The family includes RING finger protein RNF19A and RNF19B, both of which are transmembrane (TM) domain-containing RBR-type E3 ubiquitin-protein ligases. RNF19A, also known as double ring-finger protein (Dorfin) or p38, localizes to the ubiquitylated inclusions in Parkinson's disease (PD), dementia with Lewy bodies, multiple system atrophy, and amyotrophic lateral sclerosis (ALS). It interacts with Psmc3, a protein component of the 19S regulatory cap of the 26S proteasome, and further participates in the ubiquitin-proteasome system in acrosome biogenesis, spermatid head shaping, and development of the head-tail coupling apparatus and tail. It modulates the ubiquitination and degradation of calcium-sensing receptor (CaR), which may contribute to a general mechanism for CaR quality control during biosynthesis. Moreover, RNF19A can also ubiquitylate mutant superoxide dismutase 1 (SOD1), the causative gene of familial ALS. It may associate with the endoplasmic reticulum-associated degradation (ERAD) pathway, which is related to the pathogenesis of neurodegenerative disorders, such as PD or Alzheimer's disease. It is also involved in the pathogenic process of PD and Lewy body (LB) formation by ubiquitylation of synphilin-1. RNF19B, also known as IBR domain-containing protein 3 or natural killer lytic-associated molecule (NKLAM), plays a role in controlling tumor dissemination and metastasis. It is involved in the cytolytic function of natural killer (NK) cells and cytotoxic T lymphocytes (CTLs). It interacts with ubiquitin conjugates UbcH7 and UbcH8, and ubiquitinates uridine kinase like-1 (URKL-1) protein, targeting it for degradation. Moreover, RNF19B is a novel component of macrophage phagosomes and plays a role in macrophage anti-bacterial activity. It functions as a novel modulator of macrophage inducible nitric oxide synthase (iNOS) expression. Both RNF19A and RNF19B contain a RBR domain followed by three TMs. The RBR domain was previously known as RING-BetweenRING-RING domain or TRIAD [two RING fingers and a DRIL (double RING finger linked)] domain. Based on current understanding of the structural biology of RBR ligases, the nomenclature of RBR has been corrected as RING-BRcat (benign-catalytic)-Rcat (required-for-catalysis) recently. The RBR (RING1-BRcat-Rcat) domain uses an auto-inhibitory mechanism to modulate ubiquitination activity, as well as a hybrid mechanism that combines aspects from both RING and HECT E3 ligase function to facilitate the ubiquitination reaction. This family corresponds to the RING domain, a C3HC4-type RING-HC finger required for RBR-mediated ubiquitination.	55
319544	cd16630	RING-HC_RBR_RNF216	RING finger, HC subclass, found in RING finger protein 216 (RNF216) and similar proteins. RNF216, also known as Triad domain-containing protein 3 (Triad3A), ubiquitin-conjugating enzyme 7-interacting protein 1, or zinc finger protein inhibiting NF-kappa-B (ZIN), is a RBR-type E3 ubiquitin-protein ligase that interacts with several components of Toll-like receptor (TLR) signaling and promotes their proteolytic degradation. It negatively regulates the RIG-I RNA sensing pathway through Lys48-linked, ubiquitin-mediated degradation of the tumor necrosis factor receptor-associated factor 3 (TRAF3) adapter following RNA virus infection. It also controls ubiquitination and proteasomal degradation of receptor-interacting protein 1 (RIP1), a serine/threonine protein kinase that is critically involved in tumor necrosis factor receptor-1 (TNF-R1)-induced NF-kappa B activation, following disruption of Hsp90 binding. Moreover, RNF216 is involved in inflammatory diseases through strongly inhibiting autophagy in macrophages. It interacts with and ubiquitinates BECN1, a key regulator in autophagy, thereby contributing to BECN1 degradation. It regulates synaptic strength by ubiquitination of Arc, resulting in its rapid proteasomal degradation. It is also a key negative regulator of sustained Killer cell Ig-like receptor (KIR) with two Ig-like domains and a long cytoplasmic domain 4 (2DL4)-mediated NF-kappaB signaling from internalized 2DL4, which functions by promoting ubiquitylation and degradation of endocytosed receptor from early endosomes. Furthermore, RNF216 interacts with human immunodeficiency virus type 1 (HIV-1) Virion infectivity factor (Vif) protein, which is essential for the productive infection of primary human CD4 T lymphocytes and macrophages. Mutations in RNF216 may result in Gordon Holmes syndrome, a condition defined by hypogonadotropic hypogonadism and cerebellar ataxia, as well as in autosomal recessive Huntington-like disorder. RNF216 contains a RBR domain that was previously known as RING-BetweenRING-RING domain or TRIAD [two RING fingers and a DRIL (double RING finger linked)] domain. Based on current understanding of the structural biology of RBR ligases, the nomenclature of RBR has been corrected as RING-BRcat (benign-catalytic)-Rcat (required-for-catalysis) recently. The RBR (RING1-BRcat-Rcat) domain uses an auto-inhibitory mechanism to modulate ubiquitination activity, as well as a hybrid mechanism that combines aspects from both RING and HECT E3 ligase function to facilitate the ubiquitination reaction. This family corresponds to the RING domain, a C3HC4-type RING-HC finger motif required for RBR-mediated ubiquitination.	58
319545	cd16631	mRING-HC-C4C4_RBR_HOIP	Modified RING finger, HC subclass (C4C4-type), found in HOIL-1-interacting protein (HOIP) and similar proteins. HOIP, also known as RING finger protein 31 (RNF31) or zinc in-between-RING-finger ubiquitin-associated domain protein, together with HOIL-1 and SHARPIN, forms the E3-ligase complex (also known as linear-ubiquitin-chain assembly complex LUBAC) that regulates NF-kappaB activity and apoptosis. It also interacts with the atypical mammalian orphan receptor DAX-1, triggers DAX-1 ubiquitination and stabilization, and participates in repressing steroidogenic gene expression. HOIP contains three Npl4 zinc fingers, a central ubiquitin-associated (UBA) domain responsible for the interaction with the N-terminal ubiquitin-like domain (UBL) of HOIL-1L, a RBR domain, and a C-terminal linear chain determining domain (LDD). The RBR domain was previously known as RING-BetweenRING-RING domain or TRIAD [two RING fingers and a DRIL (double RING finger linked)] domain. Based on current understanding of the structural biology of RBR ligases, the nomenclature of RBR has been corrected as RING-BRcat (benign-catalytic)-Rcat (required-for-catalysis) recently. The RBR (RING1-BRcat-Rcat) domain uses an auto-inhibitory mechanism to modulate ubiquitination activity, as well as a hybrid mechanism that combines aspects from both RING and HECT E3 ligase function to facilitate the ubiquitination reaction. This family corresponds to the RING domain, a C4C4-type RING finger motif whose overall folding is similar to that of the C3HC4-type RING-HC finger. It is required for RBR-mediated ubiquitination.	53
319546	cd16632	mRING-HC-C4C4_RBR_RNF144	Modified RING finger, HC subclass (C4C4-type), found in the RNF144 protein family. The RNF144 family includes RNF144A and RNF144B, both of which are transmembrane (TM) domain-containing RBR-type E3 ubiquitin-protein ligases. RNF144A, also known as UbcM4-interacting protein 4 (UIP4) or ubiquitin-conjugating enzyme 7-interacting protein 4, targets DNA-dependent protein kinase catalytic subunit (DNA-PKcs), and thus promotes DNA damage-induced cell apoptosis. It is transcriptionally repressed by metastasis-associated protein 1 (MTA1) and inhibits MTA1-driven cancer cell migration and invasion. RNF144B, also known as PIR2, IBR domain-containing protein 2 (IBRDC2), or p53-inducible RING finger protein (p53RFP), induces p53-dependent, but caspase-independent apoptosis. It interacts with E2 ubiquitin-conjugating enzymes UbcH7 and UbcH8, but not with UbcH5. It is involved in ubiquitination and degradation of p21, a p53 downstream protein promoting growth arrest and antagonizing apoptosis, suggesting a role in switching a cell from p53-mediated growth arrest to apoptosis. Moreover, RNF144B regulates the levels of Bax, a pro-apoptotic protein from the Bcl-2 family, and protects cells from unprompted Bax activation and cell death. It also regulates epithelial homeostasis by mediating degradation of p21WAF1 and p63. Both RNF144A and RNF144B contain a RBR domain followed by a potential single-TM domain. The RBR domain was previously known as RING-BetweenRING-RING domain or TRIAD [two RING fingers and a DRIL (double RING finger linked)] domain. Based on current understanding of the structural biology of RBR ligases, the nomenclature of RBR has been corrected as RING-BRcat (benign-catalytic)-Rcat (required-for-catalysis) recently. The RBR (RING1-BRcat-Rcat) domain uses an auto-inhibitory mechanism to modulate ubiquitination activity, as well as a hybrid mechanism that combines aspects from both RING and HECT E3 ligase function to facilitate the ubiquitination reaction. This family corresponds to the RING domain, a C4C4-type RING finger whose overall folding is similar to that of the C3HC4-type RING-HC finger. It is required for RBR-mediated ubiquitination.	51
319547	cd16633	mRING-HC-C3HC3D_RBR_HOIL1	Modified RING finger, HC subclass (C3HC3D-type), found in heme-oxidized IRP2 ubiquitin ligase 1 (HOIL-1) and similar proteins. HOIL-1, also known as RBCK1, HOIL-1L, RanBP-type and C3HC4-type zinc finger-containing protein 1, HBV-associated factor 4, Hepatitis B virus X-associated protein 4, RING finger protein 54 (RNF54), ubiquitin-conjugating enzyme 7-interacting protein 3, or UbcM4-interacting protein 28 (UIP28), together with E3 ubiquitin-protein ligase RNF31 (also known as HOIP) and SHANK-associated RH domain interacting protein (SHARPIN), forms the E3-ligase complex (also known as linear-ubiquitin-chain assembly complex LUBAC) that regulates NF-kappaB activity and apoptosis through conjugation of linear polyubiquitin chains to NF-kappaB essential modulator (also known as NEMO or IKBKG). HOIL-1 plays a crucial role in TNF-alpha-mediated NF-kappaB activation. It also functions as an ubiquitin-protein ligase E3 that interacts with not only PKCbeta, but also PKCzeta. It can recognize heme-oxidized IRP2 (iron regulatory protein 2) and is thought to affect the turnover of oxidatively damaged proteins. HOIL-1 contains an N-terminal ubiquitin-like (UBL) domain and an Npl4 zinc-finger (NZF) domain, which regulate the interaction with the LUBAC subunit RNF31 and ubiquitin, respectively. The NZF domain belongs to the RanBP2-type zinc finger (zf-RanBP2) domain superfamily. In addition, HOIL-1 has a RBR domain that was previously known as RING-BetweenRING-RING domain or TRIAD [two RING fingers and a DRIL (double RING finger linked)] domain. Based on current understanding of the structural biology of RBR ligases, the nomenclature of RBR has been corrected as RING-BRcat (benign-catalytic)-Rcat (required-for-catalysis) recently. The RBR (RING1-BRcat-Rcat) domain uses an auto-inhibitory mechanism to modulate ubiquitination activity, as well as a hybrid mechanism that combines aspects from both RING and HECT E3 ligase function to facilitate the ubiquitination reaction. This family corresponds to the RING domain, a modified C3HC3D-type RING-HC finger required for RBR-mediated ubiquitination.	55
319548	cd16634	mRING-HC-C3HC3D_Nrdp1	Modified RING finger, HC subclass (C3HC3D-type), found in neuregulin receptor degradation protein-1 (Nrdp1) and similar proteins. Nrdp1 (referred to as FLRF in mice), also known as RING finger protein 41 (RNF41), is an E3 ubiquitin-protein ligase that plays a critical role in the regulation of cell growth and apoptosis, inflammation and production of reactive oxygen species (ROS), as well as in doxorubicin (DOX)-induced cardiac injury. It promotes ubiquitination and degradation of the epidermal growth factor receptor (EGFR/ErbB) family member, ErbB3, which is independent of growth factor stimulation. It also promotes M2 macrophage polarization by ubiquitinating and activating transcription factor CCAAT/enhancer-binding Protein beta (C/EBPbeta) via Lys-63-linked ubiquitination. Moreover, Nrdp1 interacts with and modulates activity of Parkin, a causative protein for early onset recessive juvenile parkinsonism (AR-JP). It also interacts with ubiquitin-specific protease 8 (USP8), which is involved in trafficking of various transmembrane proteins. Furthermore, Nrdp1 inhibits basal lysosomal degradation and enhances ectodomain shedding of JAK2-associated cytokine receptors. Its phosphorylation by the kinase Par-1b (also known as MARK2) is required for epithelial cell polarity. Nrdp1 contains an N-terminal modified C3HC3D-type RING-HC finger required for enhancing ErbB3 degradation, a B-box, a coiled-coil domain responsible for Nrdp1 oligomerization, and a C-terminal ErbB3-binding domain.	43
319549	cd16635	mRING-HC-C3HC3D_PHRF1	Modified RING finger, HC subclass (C3HC3D-type), found in PHD and RING finger domain-containing protein 1 (PHRF1) and similar proteins. PHRF1, also known as KIAA1542, or CTD-binding SR-like protein rA9, is a ubiquitin ligase which induces the ubiquitination of TGIF (TG-interacting factor) at lysine 130. It acts as a tumor suppressor that promotes the transforming growth factor (TGF)-beta cytostatic program through selective release of TGIF-driven promyelocytic leukemia protein (PML) inactivation. PHRF1 contains a plant homeodomain (PHD) finger and a modified C3HC3D-type RING-HC finger.	44
319550	cd16636	mRING-HC-C3HC3D_SCAF11	Modified RING finger, HC subclass (C3HC3D-type), found in SR-related and CTD-associated factor 11 (SCAF11) and similar proteins. SCAF11, also known as CTD-associated SR protein 11 (CASP11), renal carcinoma antigen NY-REN-40, SC35-interacting protein 1 (Sip1), Serine/arginine-rich splicing factor 2 (SRSF2)-interacting protein, or splicing regulatory protein 129 (SRrp129), is a novel arginine-serine-rich (RS) domain-containing protein essential for pre-mRNA splicing. It functions as an auxiliary splice factor interacting with spliceosomal component SC35 promoting RNAPII elongation. In addition to SR proteins, such as SC35, ASF/SF2, SRp75, and SRp20, SCAF11 also associates with U1-70K and U2AF65, proteins associated with 5' and 3' splice sites, respectively. SCAF11 contains an N-terminal modified C3HC3D-type RING-HC finger, an internal serine-arginine rich domain (SR domain), and a C-terminal SRI domain.	43
319551	cd16637	mRING-HC-C3HC3D_LNX1_like	Modified RING finger, HC subclass (C3HC3D-type), found in ligand of Numb protein LNX1, LNX2, and similar proteins. The ligand of Numb protein X (LNX) family, also known as PDZ and RING (PDZRN) family, includes LNX1-5, which can interact with Numb, a key regulator of neurogenesis and neuronal differentiation. LNX5 (also known as PDZK4 or PDZRN4L) shows high sequence homology to LNX3 and LNX4, but it lacks the RING domain. LNX1-4 proteins function as E3 ubiquitin ligases and have a unique domain architecture consisting of an N-terminal RING-HC finger for E3 ubiquitin ligase activity and either two or four PDZ domains necessary for substrate binding. This family corresponds to LNX1/LNX2-like proteins, which contain a modified C3HC3D-type RING-HC finger and four PDZ domains.	42
319552	cd16638	mRING-HC-C3HC3D_Roquin	Modified RING finger, HC subclass (C3HC3D-type), found in Roquin-1, Roquin-2, and similar proteins. This family corresponds to the ROQUIN family of proteins, including Roquin-1, Roquin-2, and similar proteins, which localize to the cytoplasm and upon stress are concentrated in stress granules. They may play essential roles in preventing T-cell-mediated autoimmune disease and in microRNA-mediated repression of inducible costimulator (Icos) mRNA. They function as E3 ubiquitin ligases consisting of an N-terminal modified C3HC3D-type RING-HC finger with a potential E3 activity, a highly conserved ROQ domain required for RNA binding and localization to stress granules, and a CCCH-type zinc finger involved in RNA recognition.	44
319553	cd16639	RING-HC_TRAF2	RING finger, HC subclass, found in tumor necrosis factor (TNF) receptor-associated factor 2 (TRAF2) and similar proteins. TRAF2, also known as tumor necrosis factor type 2 receptor-associated protein 3, is an E3 ubiquitin-protein ligase that was identified as a 75 kDa tumor necrosis factor receptor (TNF-R2)-associated signaling protein. It interacts with members of the TNF receptor superfamily and connects the receptors to downstream signaling proteins. It also mediates K63-linked polyubiquitination of RIP1, a kinase pivotal in TNFalpha-induced NF-kappaB activation. Moreover, TRAF2 regulates peripheral CD8(+) T-cell and NKT-cell homeostasis by modulating sensitivity to IL-15. It also acts as an important biological suppressor of necroptosis. It inhibits TNF-related apoptosis inducing ligand (TRAIL)- and CD95L-induced apoptosis and necroptosis. TRAF2 contains an N-terminal domain with a typical C3HC4-type RING-HC finger and several zinc fingers, and a C-terminal TRAF domain that comprises a coiled coil domain and a conserved TRAF-C domain.	42
319554	cd16640	RING-HC_TRAF3	RING finger, HC subclass, found in tumor necrosis factor (TNF) receptor-associated factor 3 (TRAF3) and similar proteins. TRAF3, also known as CAP-1, CD40 receptor-associated factor 1 (CRAF1), CD40-binding protein (CD40BP), or LMP1-associated protein 1 (LAP1), is a member of TRAF protein family, which mainly functions in the immune system, where it mediates signaling through tumor necrosis factor receptors (TNFRs) and interleukin-1/Toll-like receptors (IL-1/TLRs). It also plays a unique cell type-specific and critical role in the restraint of B-cell homeostatic survival, a role with important implications for both B-cell differentiation and the pathogenesis of B-cell malignancies. Meanwhile, TRAF3 differentially regulates differentiation of specific T cell subsets. It is required for iNKT cell development, restrains Treg cell development in the thymus, and plays an essential role in the homeostasis of central memory CD8+ T cells. TRAF3 contains an N-terminal domain with a typical C3HC4-type RING-HC finger and several zinc fingers, and a C-terminal TRAF domain that comprises a coiled coil domain, and a conserved TRAF-C domain.	42
319555	cd16641	mRING-HC-C3HC3D_TRAF4	Modified RING finger, HC subclass (C3HC3D-type), found in tumor necrosis factor (TNF) receptor-associated factor 4 (TRAF4) and similar proteins. TRAF4, also known as cysteine-rich domain associated with RING and Traf domains protein 1, or metastatic lymph node gene 62 protein (MLN 62), or RING finger protein 83 (RNF83), is a member of TRAF protein family, which mainly function in the immune system, where they mediate signaling through tumor necrosis factor receptors (TNFRs) and interleukin-1/Toll-like receptors (IL-1/TLRs). It also plays a critical role in nervous system, as well as in carcinogenesis. TRAF4 promotes the growth and invasion of colon cancer through the Wnt/beta-catenin pathway. It contributes to the TNFalpha-induced activation of 70 kDa ribosomal protein S6 kinase (p70s6k) signaling pathway, and activation of transforming growth factor beta (TGF-beta)-induced SMAD-dependent signaling and non-SMAD signaling in breast cancer. It also enhances osteosarcoma cell proliferation and invasion by Akt signaling pathway. Moreover, TRAF4 is a novel phosphoinositide-binding protein modulating tight junctions and favoring cell migration. TRAF4 contains an N-terminal domain with a modified C3HC3D-type RING-HC finger and several zinc fingers, and a C-terminal TRAF domain that comprises a coiled coil domain and a conserved TRAF-C domain.	45
319556	cd16642	mRING-HC-C3HC3D_TRAF5	Modified RING finger, HC subclass (C3HC3D-type), found in tumor necrosis factor (TNF) receptor-associated factor 5 (TRAF5) and similar proteins. TRAF5, also known as RING finger protein 84 (RNF84), is an important signal transducer for a wide range of TNF receptor superfamily members, including tumor necrosis factor receptor 1 (TNFR1), tumor necrosis factor receptor 2 (TNFR2), CD40, and other lymphocyte costimulatory receptors, RANK/TRANCE-R, ectodysplasin-A Receptor (EDAR), lymphotoxin-beta receptor (LT-betaR), latent membrane protein 1 (LMP1), and IRE1. It functions as an activator of NF-kappaB, MAPK, and JNK, and is involved in both RANKL- and TNFalpha-induced osteoclastogenesis. It mediates CD40 signaling through associating with the cytoplasmic tail of CD40. It also negatively regulates Toll-like receptor (TLR) signaling and functions as a negative regulator of the interleukin 6 (IL-6) receptor signaling pathway that limits the differentiation of inflammatory CD4(+) T cells. TRAF5 contains an N-terminal domain with a modified C3HC3D-type RING-HC finger and several zinc fingers, and a C-terminal TRAF domain that comprises a coiled coil domain and a conserved TRAF-C domain.	43
319557	cd16643	mRING-HC-C3HC3D_TRAF6	Modified RING finger, HC subclass (C3HC3D-type), found in tumor necrosis factor (TNF) receptor-associated factor 6 (TRAF6) and similar proteins. TRAF6, also known as interleukin-1 signal transducer or RING finger protein 85 (RNF85), is a cytoplasmic adapter protein that mediates signals induced by the tumor necrosis factor receptor (TNFR) superfamily and Toll-like receptor (TLR)/interleukin-1 receptor (IL-1R) family. It functions as a mediator involved in the activation of mitogen-activated protein kinase (MAPK), phosphoinositide 3-kinase (PI3K), and interferon regulatory factor pathways, as well as in IL-1R-mediated activation of NF-kappaB. TRAF6 is also an oncogene that plays a vital role in K-RAS-mediated oncogenesis. TRAF6 contains an N-terminal domain with a modified C3HC3D-type RING-HC finger and several zinc fingers, and a C-terminal TRAF domain that comprises a coiled coil domain and a conserved TRAF-C domain.	58
319558	cd16644	mRING-HC-C3HC3D_TRAF7	Modified RING finger, HC subclass (C3HC3D-type), found in tumor necrosis factor (TNF) receptor-associated factor 7 (TRAF7) and similar proteins. TRAF7, also known as RING finger and WD repeat-containing protein 1 or RING finger protein 119 (RNF119), is an E3 ubiquitin-protein ligase involved in signal transduction pathways that lead either to activation or repression of NF-kappaB transcription factor through promoting K29-linked ubiquitination of several cellular targets, including the NF-kappaB essential modulator (NEMO) and the p65 subunit of NF-kappaB transcription factor. It is also involved in K29-linked polyubiquitination that has been implicated in lysosomal degradation of proteins. Moreover, TRAF7 is required for K48-linked ubiquitination of p53, a key tumor suppressor and a master regulator of various signaling pathways, such as those related to apoptosis, cell cycle and DNA repair. It is also required for tumor necrosis factor alpha (TNFalpha)-induced Jun N-terminal kinase activation and promotes cell death by regulating polyubiquitination and lysosomal degradation of c-FLIP protein. Furthermore, TRAF7 functions as small ubiquitin-like modifier (SUMO) E3 ligase involved in other post-translational modification, such as sumoylation. It binds to and stimulates sumoylation of the proto-oncogene product c-Myb, a transcription factor regulating proliferation and differentiation of hematopoietic cells. It potentiates MEKK3-induced AP1 and CHOP activation and induces apoptosis. Meanwhile, TRAF7 mediates MyoD1 regulation of the pathway and cell-cycle progression in myoblasts. It also plays a role in Toll-like receptors (TLR) signaling. TRAF7 contains an N-terminal domain with a modified C3HC3D-type RING-HC finger and an adjacent zinc finger, and a unique C-terminal domain that comprises a coiled coil domain and seven WD40 repeats.	39
319559	cd16645	mRING-HC-C3HC3D_TRIM23_C-IX	Modified RING finger, HC subclass (C3HC3D-type), found in tripartite motif-containing protein 23 (TRIM23) and similar proteins. TRIM23, also known as ADP-ribosylation factor domain-containing protein 1, GTP-binding protein ARD-1, or RING finger protein 46 (RNF46), is an E3 ubiquitin-protein ligase belonging to the C-IX subclass of TRIM (tripartite motif) family of proteins that are defined by an N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a modified C3HC3D-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as C-terminal ADP ribosylation factor (ARF) domains. TRIM23 is involved in nuclear factor (NF)-kappaB activation. It mediates atypical lysine 27 (K27)-linked polyubiquitin conjugation to NF-kappaB essential modulator NEMO, also known as IKKgamma, which plays an important role in the NF-kappaB pathway, and this conjugation is essential for TLR3- and RIG-I/MDA5-mediated antiviral innate and inflammatory responses. It also regulates adipocyte differentiation via stabilization of the adipogenic activator peroxisome proliferator-activated receptor gamma (PPARgamma) through atypical ubiquitin conjugation to PPARgamma. Moreover, TRIM23 interacts with and polyubiquitinates yellow fever virus (YFV) NS5 to promote its binding to STAT2 and trigger type I interferon (IFN-I) signaling inhibition.	50
319560	cd16646	mRING-HC-C2H2C4_MDM2_like	Modified RING finger, HC subclass (C2H2C4-type), found in E3 ubiquitin-protein ligase MDM2, protein MDM4 and similar proteins. MDM2 (also known as HDM2) and MDM4 (also known as MDMX or HDMX) are the primary p53 tumor suppressor negative regulators. They have non-redundant roles in the regulation of p53. MDM2 mainly functions to control p53 stability, while MDM4 controls p53 transcriptional activity. Both MDM2 and MDM4 contain an N-terminal p53-binding domain, a RanBP2-type zinc finger (zf-RanBP2) domain near the central acidic region, and a C-terminal modified C2H2C4-type RING-HC finger. Mdm2 can form homo-oligomers through its RING domain and displays E3 ubiquitin ligase activity that catalyzes the attachment of ubiquitin to p53 as an essential step in the regulation of its level in cells. Despite its RING domain and structural similarity with MDM2, MDM4 does not homo-oligomerize and lacks ubiquitin-ligase function, but inhibits the transcriptional activity of p53. In addition, both their RING domains are responsible for the hetero-oligomerization, which is crucial for the suppression of P53 activity during embryonic development and the recruitment of E2 ubiquitin-conjugating enzymes. Moreover, MDM2 and MDM4 can be phosphorylated and destabilized in response to DNA damage stress. In response to ribosomal stress, MDM2-mediated p53 ubiquitination and degradation can be inhibited through the interaction with ribosomal proteins L5, L11, and L23. However, MDM4 is not bound to ribosomal proteins, suggesting its different response to regulation by small basic proteins such as ribosomal proteins and ARF.	44
319561	cd16647	mRING-HC-C3HC5_NEU1	Modified RING finger, HC subclass (C3HC5-type), found in neuralized-like protein NEURL1A, NEURL1B, and similar proteins. The family includes Drosophila neuralized (D-neu) protein, and its two mammalian homologs, NEURL1A and NEURL1B. D-neu is a regulator of the developmentally important Notch signaling pathway. NEURL1A, also known as NEURL1, NEU, neuralized 1, or RING finger protein 67 (RNF67), is a mammalian homolog of D-neu. It functions as an E3 ubiquitin-protein ligase that directly interacts with and monoubiquitinates cytoplasmic polyadenylation element-binding protein 3 (CPEB3), an RNA binding protein and a translational regulator of local protein synthesis, which facilitates hippocampal plasticity and hippocampal-dependent memory storage. It also acts as a potential tumor suppressor that causes apoptosis and downregulates Notch target genes in medulloblastoma. NEURL1B, also known as neuralized-2 (NEUR2) or neuralized-like protein 3, is another mammalian homolog of D-neu protein. It functions as an E3 ubiquitin-protein ligase that interacts with and ubiquitinates Delta. Thus, it plays a role in the endocytic pathways for Notch signaling through working cooperatively with another E3 ligase, Mind bomb-1 (Mib1), in Delta endocytosis to hepatocyte growth factor-regulated tyrosine kinase substrate (Hrs)-positive vesicles. Members in this family contain two neuralized homology regions (NHRs) responsible for Neural-ligand interactions and a modified C3HC5-type RING-HC finger required for ubiquitin ligase activity. The C3HC5-type RING-HC finger is distinguished from typical C3HC4-type RING-HC finger due to the existence of the additional cysteine residue in the middle portion of the RING finger domain.	42
319562	cd16648	mRING-HC-C3HC5_MAPL	Modified RING finger, HC subclass (C3HC5-type), found in mitochondrial-anchored protein ligase (MAPL) and similar proteins. MAPL, also known as MULAN, mitochondrial ubiquitin ligase activator of NFKB 1, E3 SUMO-protein ligase MUL1, E3 ubiquitin-protein ligase MUL1, growth inhibition and death E3 ligase (GIDE), putative NF-kappa-B-activating protein 266, or RING finger protein 218 (RNF218), is a multifunctional mitochondrial outer membrane protein involved in several processes specific to metazoan (multicellular animal) cells, such as NF-kappaB activation, innate immunity and antiviral signaling, suppression of PINK1/parkin defects, mitophagy in skeletal muscle, and caspase-dependent apoptosis. MAPL contains a unique BAM (beside a membrane)/GIDE (growth inhibition death E3 ligase) domain and a C-terminal modified cytosolic C3HC5-type RING-HC finger which is distinguished from typical C3HC4-type RING-HC finger due to the existence of the additional cysteine residue in the middle portion of the RING finger domain.	40
319563	cd16649	mRING-HC-C3HC5_CGRF1_like	Modified RING finger, HC subclass (C3HC5-type), found in RING finger proteins, RNF26, RNF197 (CGRRF1), RNF156 (MGRN1), RNF157 and similar proteins. This family corresponds to a group of RING finger proteins containing a modified C3HC5-type RING-HC finger, which is distinguished from typical C3HC4 RING-HC finger due to the existence of the additional cysteine residue in the middle portion of the RING finger domain. Cell growth regulator with RING finger domain protein 1 (CGRRF1), also known as cell growth regulatory gene 19 protein (CGR19) or RING finger protein 197 (RNF197), functions as a novel tissue biomarker for monitoring endometrial sensitivity and response to insulin-sensitizing drugs, such as metformin, in the context of obesity. RNF26 is an E3 ubiquitin ligase that temporally regulates virus-triggered type I interferon induction by increasing the stability of Mediator of IRF3 activation, MITA, also known as STING, through K11-linked polyubiquitination of MITA after viral infection and promoting degradation of IRF3, another important component required for virus-triggered interferon induction. Mahogunin ring finger-1 (MGRN1), also known as RING finger protein 156 (RNF156), is a cytosolic E3 ubiquitin-protein ligase that inhibits signaling through the G protein-coupled melanocortin receptors-1 (MC1R), -2 (MC2R) and -4 (MC4R) via ubiquitylation-dependent and -independent processes. It suppresses chaperone-associated misfolded protein aggregation and toxicity. RNF157 is a cytoplasmic E3 ubiquitin ligase predominantly expressed in brain. It is a homolog of the E3 ligase MGRN1. In cultured neurons, it promotes neuronal survival in an E3 ligase-dependent manner. In contrast, it supports growth and maintenance of dendrites independent of its E3 ligase activity. RNF157 interacts with and ubiquitinates the adaptor protein APBB1 (amyloid beta precursor protein-binding, family B, member 1 or Fe65), which regulates neuronal survival, but not dendritic growth downstream of RNF157. The nuclear localization of APBB1 together with its interaction partner RNA-binding protein SART3 (squamous cell carcinoma antigen recognized by T cells 3 or Tip110) is crucial to trigger apoptosis.	41
319564	cd16650	SP-RING_PIAS_like	SP-RING finger found in the Siz/PIAS RING (SP-RING) family of SUMO E3 ligases. The SP-RING family includes PIAS (protein inhibitor of activated STAT) proteins, Zmiz proteins, and Siz proteins from plants and fungi. The PIAS (protein inhibitor of activated STAT) protein family modulates the activity of several transcription factors and acts as an E3 ubiquitin ligase in the sumoylation pathway. It consists of four members: PIAS1, PIAS2 (also known as PIASx), PIAS3, and PIAS4 (also known as PIASy). PIAS proteins were initially identified as inhibitors of activated STAT only, but are now known to interact with and modulate several other proteins, including androgen receptor (AR), tumor suppressor p53, and the transforming growth factor-beta (TGF-beta) signaling protein SMAD. They interact with STATs in a cytokine-dependent manner. PIAS proteins have SUMO E3-ligase activity and interaction of PIAS proteins with transcription factors often results in sumoylation of that protein. Zmiz1 (Zimp10) and its homolog Zmiz2 (Zimp7) were initially identified in humans as androgen receptor (AR) interacting proteins and function as transcriptional co-activators. They interact with BRG1, the catalytic subunit of the SWI-SNF remodeling complex. They also associate with other hormone nuclear receptors and transcription factors such as p53 and Smad3/Smad4, and regulate transcription of specific target genes by altering their chromatin structure. SIZ1 proteins from plants and fungi are also founding members of this family. SIZ1-mediated conjugation of SUMO1 and SUMO2 to other intracellular proteins is essential in Arabidopsis. Yeast SIZ proteins are SUMO E3 ligases involved in a novel pathway of chromosome maintenance. They enhance SUMO modification to many substrates in vivo, but also exhibit unique substrate specificity. All family members contain a specific RING finger known as Siz/PIAS (protein inhibitor of activated signal transducer and activator of transcription) RING (SP-RING) finger, which is essential for SUMO ligase activity. The SP-RING finger is a variant of RING finger, which lacks the second, fifth, and sixth zinc-binding residues of the consensus C3H2C3-/C3HC4-type RING fingers.	48
319565	cd16651	SPL-RING_NSE2	SPL-RING finger found in E3 SUMO-protein ligase NSE2 and similar proteins. NSE2, also known as MMS21 homolog (MMS21) or non-structural maintenance of chromosomes element 2 homolog (Non-SMC element 2 homolog, NSMCE2), is an autosumoylating small ubiquitin-like modifier (SUMO) ligase required for the response to DNA damage. It regulates sumoylation and nuclear-to-cytoplasmic translocation of skeletal and heart muscle-specific variant of the alpha subunit of nascent polypeptide associated complex (skNAC)-Smyd1 in myogenesis. It is also required for resisting extrinsically induced genotoxic stress. Moreover, NSE2 together with its partner proteins SMC6 and SMC5 form a tight subcomplex of the structural maintenance of chromosomes SMC5-6 complex, which includes another two subcomplexes, NSE1-NSE3-NSE4 and NSE5-NSE6. SMC6 and NSE3 are sumoylated in an NSE2-dependent manner, but SMC5 and NSE1 are not. NSE2-dependent E3 SUMO ligase activity is required for efficient DNA repair, but not for SMC5-6 complex stability. NSE2 contains a RING variant known as a Siz/PIAS (protein inhibitor of activated signal transducer and activator of transcription)-like RING (SPL-RING) finger that is likely shared by the SP-RING type SUMO E3 ligases, such as PIAS family proteins. The SPL-RING finger is a variant of RING finger, which lacks the second, fifth, and sixth zinc-binding residues of the consensus C3H2C3-/C3HC4-type RING fingers. It harbors only one Zn2+-binding site and is required for the sumoylating activity.	67
319566	cd16652	dRing_Rmd5p_like	Degenerated RING (dRING) finger found in Saccharomyces cerevisiae required for meiotic nuclear division protein 5 (Rmd5p) and similar proteins. Rmd5p, also known as glucose-induced degradation protein 2 (Gid2) or sporulation protein RMD5, is an E3 ubiquitin ligase containing a Lissencephaly type-1-like homology motif (LisH), a C-terminal to LisH motif (CTLH) domain, and a degenerated RING finger that is characterized by lacking the second, fifth, and sixth Zn2+ ion-coordinating residues compared with the classic C3H2C3-/C3HC4-type RING fingers. It forms the heterodimeric E3 ligase unit of the glucose induced degradation deficient (GID) complex with Gid9 (also known as Fyv10), which has a degenerated RING finger as well. The GID complex triggers polyubiquitylation and subsequent proteasomal degradation of the gluconeogenic enzymes fructose-1,6-bisphosphatase (FBPase), phosphoenolpyruvate carboxykinase (PEPCK), and cytoplasmic malate dehydrogenase (c-MDH). Moreover, Rmd5p can form the GID complex with the other six Gid proteins, including Gid1/Vid30, Gid4/Vid24, Gid5/Vid28, Gid7, Gid8, and Gid9/Fyv10. The GID complex in which the seven Gid proteins reside functions as a novel ubiquitin ligase (E3) involved in the regulation of carbohydrate metabolism.	49
319567	cd16653	RING-like_Rtf2	RING-like Rtf2 domain, C2HC2-type, found in the replication termination factor 2 (Rtf2) protein family. The Rtf2 protein family includes a group of conserved proteins found in eukaryotes ranging from fission yeast to humans. The defining member of the family is Schizosaccharomyces pombe Rtf2 (SpRtf2), which is a proliferating cell nuclear antigen-interacting protein that functions as a key requirement for efficient replication termination at the site-specific replication barrier RTS1. It promotes termination at RTS1 by preventing replication restart. SpRtf2 contains a RING-like Rtf2 domain that is characterized by a C2HC2 motif similar to C3HC4 RING-HC finger motif known to bind two Zn2+ ions and mediate protein-protein interactions. The C2HC2 motif lacks three of the seven conserved cysteines of the C3HC4 motif, and forms only one functional Zn2+ ion-binding site. The RING-like Rtf2 domain in fission yeast is required to stabilize a paused DNA replication fork during imprinting at the mating type locus, possibly by facilitating sumoylation of PCNA. The family also includes Arabidopsis RTF2 (AtRTF2), an essential nuclear protein required for both normal embryo development and for proper expression of the GFP reporter gene. It plays a critical role in splicing the GFP pre-mRNA, and may also have a more transient regulatory role during the spliceosome cycle. The biological function of Rtf2 homologs found in eumetazoa remains unclear. They contain a variant C2HC2 motif where the middle conserved histidine has been replaced by cysteine.	46
319568	cd16654	RING-Ubox_CHIP	U-box domain, a modified RING finger, found in carboxyl terminus of HSP70-interacting protein (CHIP) and similar proteins. CHIP, also known as STIP1 homology and U box-containing protein 1 (STUB1), CLL-associated antigen KW-8, or Antigen NY-CO-7, is a multifunctional protein that functions both as a co-chaperone and an E3 ubiquitin-protein ligase. It couples protein folding and proteasome mediated degradation by interacting with heat shock proteins (e.g. HSC70) and ubiquitinating their misfolded client proteins thereby targeting them for proteasomal degradation. It is also important for cellular differentiation and survival (apoptosis), as well as susceptibility to stress. It targets a wide range of proteins, such as expanded ataxin-1, ataxin-3, huntingtin, and androgen receptor, which play roles in glucocorticoid response, tau degradation, and both p53 and cAMP signaling. CHIP contains an N-terminal tetratricopeptide repeat (TPR) domain responsible for protein-protein interaction, a highly charged middle coiled-coil (CC), and a C-terminal RING-like U-box domain acting as an ubiquitin ligase.	67
319569	cd16655	RING-Ubox_WDSUB1_like	U-box domain, a modified RING finger, found in WD repeat, SAM and U-box domain-containing protein 1 (WDSUB1) and similar proteins. WDSUB1 is an uncharacterized protein containing seven WD40 repeats and a SAM domain in addition to the U-box. Its biological role remains unclear. The family also includes many uncharacterized kinase domain-containing U-box (AtPUB) proteins and several MIF4G motif-containing AtPUB proteins from Arabidopsis.	42
319570	cd16656	RING-Ubox_PRP19	U-box domain, a modified RING finger, found in pre-mRNA-processing factor 19 (Prp19) and similar proteins. Prp19, also known as nuclear matrix protein 200 (NMP200), senescence evasion factor (SNEV), or DNA repair protein Pso4 (psoralen-sensitive mutant 4), is a ubiquitously expressed multifunctional E3 ubiquitin ligase with pleiotropic activities in DNA damage signaling, repair, and replicative senescence. It functions as a critical component of DNA repair and DNA damage checkpoint complexes. It senses DNA damage, binds double-stranded DNA in a sequence-independent manner, facilitates processing of damaged DNA, promotes DNA end joining, regulates replication protein A (RPA2) phosphorylation and ubiquitination at damaged DNA, and regulates RNA splicing and mitotic spindle formation in its integral capacity as a scaffold for a multimeric core complex. Prp19 contains an N-terminal E3 ubiquitin ligase U-box domain with E2 recruitment function that facilitates dimerization and is essential for its auto-ubiquitination activity in vitro or when overexpressed, a coiled-coil Prp19 homology region that mediates its tetramerization and interaction with CDC5L and SPF27, and a C-terminal seven-bladed WD40 beta-propeller type of leucine-rich architectural repeats that form an asymmetrical barrel-shaped structure important for substrate recognition and recruitment.	53
319571	cd16657	RING-Ubox_UBE4A	U-box domain, a modified RING finger, found in ubiquitin conjugation factor E4 A (UBE4A) and similar proteins. The family includes yeast ubiquitin fusion degradation protein 2 (UFD2p) and its mammalian homolog, UBE4A. Yeast UFD2p, also known as ubiquitin conjugation factor E4 or UB fusion protein 2, is a polyubiquitin chain conjugation factor (E4) in the ubiquitin fusion degradation (UFD) pathway which catalyzes elongation of the ubiquitin chain through Lys48 linkage. It binds to substrates conjugated with one to three ubiquitin molecules and catalyzes the addition of further ubiquitin moieties in the presence of ubiquitin-activating enzyme (E1), ubiquitin-conjugating enzyme (E2) and ubiquitin ligase (E3), yielding multiubiquitylated substrates that are targets for the 26S proteasome. UFD2p is implicated in cell survival under stress conditions and is essential for homoeostasis of unsaturated fatty acids. It interacts with UBL-UBA proteins Rad23 and Dsk2, which are involved in the endoplasmic reticulum-associated degradation, ubiquitin fusion degradation, and OLE-1 gene induction pathway. UBE4A is a U-box-type ubiquitin-protein ligase that is located in common neuroblastoma deletion regions and may be subject to mutations in tumors. It may have a specific role in different biochemical processes other than ubiquitination, including growth or differentiation. Members in this family contain an N-terminal ubiquitin elongating factor core and a RING-like U-box domain at the C-terminus.	70
319572	cd16658	RING-Ubox_UBE4B	U-box domain, a modified RING finger, found in ubiquitin conjugation factor E4 B (UBE4B) and similar proteins. UBE4B, also known as UFD2a, is a U-box-type ubiquitin-protein ligase that functions as an E3 ubiquitin ligase and an E4 polyubiquitin chain elongation factor, which catalyzes formation of Lys27- and Lys33-linked polyubiquitin chains rather than the Lys48-linked chain. It is a mammalian homolog of yeast UFD2 ubiquitination factor and participates in the proteasomal degradation of misfolded or damaged proteins through association with chaperones. It is located in common neuroblastoma deletion regions and may be subject to mutations in tumors. UBE4B has contradictory functions upon tumorigenesis as an oncogene or tumor suppressor in different types of cancers. It is essential for Hdm2 (also known as Mdm2)-mediated p53 degradation. It mediates p53 polyubiquitination and degradation, as well as inhibits p53-dependent transactivation and apoptosis, and thus plays an important role in regulating phosphorylated p53 following DNA damage. UBE4B is also associated with other pathways independent of the p53 family, such as polyglutamine aggregation and Wallerian degeneration, both of which are critical in neurodegenerative diseases. Moreover, UBE4B acts as a regulator of epidermal growth factor receptor (EGFR) degradation. It is recruited to endosomes in response to EGFR activation by binding to Hrs, a key component of endosomal sorting complex required for transport (ESCRT) 0, and then regulates endosomal sorting, affecting cellular levels of the EGFR and its downstream signaling. UBE4B contains a ubiquitin elongating factor core and a RING-like U-box domain at the C-terminus.	75
319573	cd16659	RING-Ubox_Emp	U-box domain, a modified RING finger, found in erythroblast macrophage protein (Emp) and similar proteins. Emp, also known as cell proliferation-inducing gene 5 protein or macrophage erythroblast attacher (MAEA), is a key protein which functions in normal differentiation of erythroid cells and macrophages. It is a potential biomarker for hematopoietic evaluation of hematopoietic stem cell transplantation (HSCT) patients. Emp was initially identified as a heparin-binding protein involved in the association of erythroblasts with macrophages, which promotes erythroid proliferation and maturation. It also plays an important role in erythroblastic island formation. Absence of Emp leads to failure of erythroblast nuclear extrusion. It is required in definitive erythropoiesis and plays a cell intrinsic role in the erythroid lineage. Emp contains a Lissencephaly type-1-like homology (LisH) motif, a C-terminal to LisH (CTLH) domain, and a RING-like U-box domain at the C-terminus.	50
319574	cd16660	RING-Ubox_RNF37	U-box domain, a modified RING finger, found in RING finger protein 37 (RNF37). RNF37, also known as KIAA0860, U-box domain-containing protein 5 (UBOX5), UbcM4-interacting protein 5 (UIP5), or ubiquitin-conjugating enzyme 7-interacting protein 5, is an E3 ubiquitin-protein ligase found exclusively in the nucleus as part of a nuclear dot-like structure. It interacts with the molecular chaperone VCP/p97 protein. RNF37 contains a U-box domain followed by a potential nuclear location signal (NLS), and a C-terminal C3HC4-type RING-HC finger. The U-box domain is a modified RING finger domain that lacks the hallmark metal-chelating cysteines and histidines of the latter, but is likely to adopt a RING finger-like conformation. The presence of the U-box, but not of the RING finger, is required for the E3 activity. The U-box domain can directly interact with several E2 enzymes, including UbcM2, UbcM3, UbcM4, UbcH5, and UbcH8, suggesting a similar function as the RING finger in the ubiquitination pathway. This family corresponds to the U-box domain.	53
319575	cd16661	RING-Ubox1_NOSIP	U-box domain 1, a modified RING finger, found in nitric oxide synthase-interacting protein (NOSIP) and similar proteins. NOSIP, also known as endothelial NO synthase (eNOS)-interacting protein, p33RUL, is an E3 ubiquitin-protein ligase implicated in the control of airway and vascular diameter, mucosal secretion, NO synthesis in ciliated epithelium, and, therefore, of mucociliary and bronchial function. The loss of NOSIP may cause holoprosencephaly and facial anomalies, including cleft lip/palate, cyclopia, and facial midline clefting. NOSIP interacts with neuronal nitric oxide synthase (nNOS) and eNOS by inhibiting the nitric oxide (NO) production. It acts as a novel type of modulator that promotes translocation of eNOS from the plasma membrane to intracellular sites, thereby uncoupling eNOS from plasma membrane caveolae and inhibiting NO synthesis. NOSIP also interacts with protein phosphatase PP2A and mediates the monoubiquitination of the PP2A catalytic subunit. Thus, it is a critical modulator of brain and craniofacial development in mice and a candidate gene for holoprosencephaly in humans. Moreover, NOSIP associates with the erythropoietin (Epo) receptor (EpoR), mediates ubiquitination of EpoR, and plays an essential role in erythropoietin-induced proliferation. NOSIP contains an atypical N-terminal RING-like U-box domain that is split into two parts by an interjacent stretch of 104 amino acid residues, as well as a C-terminal RING-like U-box domain. This family corresponds to the first U-box domain.	43
319576	cd16662	RING-Ubox2_NOSIP	U-box domain 2, a modified RING finger, found in nitric oxide synthase-interacting protein (NOSIP) and similar proteins. NOSIP, also known as endothelial NO synthase (eNOS)-interacting protein, p33RUL, is an E3 ubiquitin-protein ligase implicated in the control of airway and vascular diameter, mucosal secretion, NO synthesis in ciliated epithelium, and, therefore, of mucociliary and bronchial function. The loss of NOSIP may cause holoprosencephaly and facial anomalies including cleft lip/palate, cyclopia and facial midline clefting. NOSIP interacts with neuronal nitric oxide synthase (nNOS) and eNOS by inhibiting nitric oxide (NO) production. It acts as a novel type of modulator that promotes translocation of eNOS from the plasma membrane to intracellular sites, thereby uncoupling eNOS from plasma membrane caveolae and inhibiting NO synthesis. NOSIP also interacts with protein phosphatase PP2A and mediates the monoubiquitination of the PP2A catalytic subunit. Thus, it is a critical modulator of brain and craniofacial development in mice and a candidate gene for holoprosencephaly in humans. Moreover, NOSIP associates with the erythropoietin (Epo) receptor (EpoR), mediates ubiquitination of EpoR, and plays an essential role in erythropoietin-induced proliferation. NOSIP contains an atypical N-terminal RING-like U-box domain that is split into two parts by an interjacent stretch of 104 amino acid residues, as well as a C-terminal RING-like U-box domain. This family corresponds to the second U-box domain.	65
319577	cd16663	RING-Ubox_PPIL2	U-box domain, a modified RING finger, found in peptidyl-prolyl cis-trans isomerase-like 2 (PPIL2) and similar proteins. PPIL2 (EC 5.2.1.8), also known as PPIase, CYC4, cyclophilin-60 (Cyp60), cyclophilin-like protein Cyp-60, or Rotamase PPIL2, is a nuclear-specific cyclophilin which interacts with the proteinase inhibitor eglin c and regulates gene expression. PPIL2 belongs to the cyclophilin family of peptidylprolyl isomerases and catalyzes cis-trans isomerization of proline-peptide bonds, which is often a rate-limiting step in protein folding. It positively regulates beta-site amyloid precursor protein cleaving enzyme (BACE1) expression and beta-secretase activity. Moreover, PPIL2 plays an important role in the translocation of CD147 to the cell surface, and thus may present a novel target for therapeutic interventions in diseases where CD147 functions as a pathogenic factor in cancer, human immunodeficiency virus infection, or rheumatoid arthritis. PPIL2 contains an N-terminal RING-like U-box domain and a C-terminal cyclophilin (Cyp)-like chaperone domain.	73
319578	cd16664	RING-Ubox_PUB	U-box domain, a modified RING finger, found in Arabidopsis plant U-box proteins (AtPUB) and similar proteins. The plant PUB proteins, also known as U-box domain-containing proteins, are much more numerous in Arabidopsis, which has 62, than in most animals, which typically have 6. The majority of AtPUB of this family are known as ARM domain-containing PUB proteins which contain a C-terminal located, tandem ARM (armadillo) repeat protein-interaction region in addition to the U-box domain. They have been implicated in the regulation of cell death and defense. They also play important roles in other plant-specific pathways, such as controlling both self-incompatibility and pseudo-self-incompatibility, as well as acting in abiotic stress. A subgroup of ARM domain-containing PUB proteins harbors a plant-specific U-box N-terminal domain.	43
319579	cd16665	RING-H2_RNF13_like	RING finger, H2 subclass, found in RING finger protein 13 (RNF13), RING finger protein 167 (RNF167), and similar proteins. This subfamily includes RING finger protein 13 (RNF13), RING finger protein 167 (RNF167), Zinc/RING finger protein 4 (ZNRF4), and similar proteins, which belong to a larger PA-TM-RING ubiquitin ligase family that has been characterized by containing an N-terminal signal peptide, a protease-associated (PA) domain, a transmembrane domain (TM), and a C-terminal C3H2C3-type RING-H2 finger domain followed by a putative PEST sequence. RNF13 is a widely expressed membrane-associated E3 ubiquitin-protein ligase that is functionally significant in the regulation of cancer development, muscle cell growth, and neuronal development. Its expression is developmentally regulated during myogenesis and is upregulated in various tumors. RNF13 negatively regulates cell proliferation through its E3 ligase activity. RNF167, also known as RING105, is an endosomal/lysosomal E3 ubiquitin-protein ligase involved in alpha-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid receptor (AMPAR) ubiquitination. It acts as an endosomal membrane protein which ubiquitylates vesicle-associated membrane protein 3 (VAMP3) and regulates endosomal trafficking. Moreover, RNF167 plays a role in the regulation of TSSC5 (tumor-suppressing subchromosomal transferable fragment cDNA; also known as ORCTL2/IMPT1/BWR1A/SLC22A1L), which can function in concert with the ubiquitin-conjugating enzyme UbcH6. ZNRF4, also known as RING finger protein 204 (RNF204), or Nixin, is an endoplasmic reticulum (ER) membrane-anchored ubiquitin ligase that physically interacts with the ER-localized chaperone calnexin in a glycosylation-independent manner and induces calnexin ubiquitination and p97-dependent degradation, indicating an ER-associated degradation-like mechanism of calnexin turnover. The murine protein sperizin (spermatid-specific ring zinc finger) is a homolog of human ZNRF4. It is specifically expressed in haploid germ cells and involved in spermatogenesis.	46
319580	cd16666	RING-H2_RNF43_like	RING finger, H2 subclass, found in RING finger proteins RNF43, ZNRF3, and similar proteins. RNF43 and ZNRF3 (also known as RNF203) are transmembrane E3 ubiquitin-protein ligases that belong to the PA-TM-RING ubiquitin ligases family, which has been characterized by containing an N-terminal signal peptide, a protease-associated (PA) domain, a transmembrane (TM) domain and a C3H2C3-type RING-H2 finger domain followed by a long C-terminal region. Both RNF43 and ZNRF3 function as tumor suppressors involved in the regulation of Wnt/beta-catenin signaling. They negatively regulate Wnt signaling through interacting with complexes of frizzled receptors (FZD) and low-density lipoprotein receptor-related protein (LRP) 5/6, which leads to ubiquitination of Frizzled receptors (FZD) and endocytosis of the Wnt receptor. Dishevelled (DVL), a positive Wnt regulator, is required for ZNRF3/RNF43-mediated ubiquitination and degradation of FZD. They also associate with R-spondin 1 (RSPO1). This interaction may block Frizzled ubiquitination and enhance Wnt signaling.	45
319581	cd16667	RING-H2_RNF126_like	RING finger, H2 subclass, found in RING finger proteins RNF126, RNF115, and similar proteins. The family includes RING finger proteins RNF126, RNF115, and similar proteins. RNF126 is a Bag6-dependent E3 ubiquitin ligase that is involved in the mislocalized protein (MLP) pathway of quality control. It regulates the retrograde sorting of the cation-independent mannose 6-phosphate receptor (CI-MPR). RNF126 promotes cancer cell proliferation by targeting the tumor suppressor p21 for ubiquitin-mediated degradation, and could be a novel therapeutic target in breast and prostate cancers. It is also able to ubiquitylate cytidine deaminase (AID), a poorly soluble protein that is essential for antibody diversification. RNF115, also known as Rab7-interacting ring finger protein (Rabring 7), or zinc finger protein 364 (ZNF364), or breast cancer-associated gene 2 (BCA2), is an E3 ubiquitin-protein ligase that is an endogenous inhibitor of adenosine monophosphate-activated protein kinase (AMPK) activation and its inhibition increases the efficacy of metformin in breast cancer cells. It also functions as a co-factor in the restriction imposed by tetherin on HIV-1, and targets HIV-1 Gag for lysosomal degradation, impairing virus assembly and release, in a tetherin-independent manner. Moreover, RNF115 is a Rab7-binding protein that stimulates c-Myc degradation through mono-ubiquitination of MM-1. It also plays crucial roles as a Rab7 target protein in vesicle traffic to late endosome/lysosome and lysosome biogenesis. RNF115 and RNF126 associate with the epidermal growth factor receptor (EGFR) and promote ubiquitylation of EGFR, suggesting they play a role in the ubiquitin-dependent sorting and downregulation of membrane receptors. Both of them contain an N-terminal BCA2 Zinc-finger domain (BZF), the AKT-phosphorylation sites, and the C-terminal C3H2C3-type RING-H2 finger.	43
319582	cd16668	RING-H2_GRAIL	RING finger, H2 subclass, found in the GRAIL transmembrane proteins family. The GRAIL transmembrane proteins family includes RING finger proteins RNF128 (also known as GRAIL), RNF130, RNF133, RNF148, RNF149, and RNF150, which belong to a larger PA-TM-RING ubiquitin ligase family that has been characterized by an N-terminal signal peptide, a protease-associated (PA) domain, a transmembrane (TM) domain and a C-terminal C3H2C3-type RING-H2 finger domain followed by a putative PEST sequence. RNF128 is a type 1 transmembrane E3 ubiquitin-protein ligase that is a critical regulator of adaptive immunity and development. RNF130, also known as Goliath homolog (H-Goliath), is a paralog of RNF128. It is a transmembrane E3 ubiquitin-protein ligase expressed in leukocytes. It has a self-ubiquitination property, and controls the development of T cell clonal anergy by ubiquitination. RNF133 is a testis-specific endoplasmic reticulum-associated E3 ubiquitin ligase that may play a role in sperm maturation through an ER-associated degradation (ERAD) pathway. RNF148 is a testis-specific E3 ubiquitin ligase that is abundantly expressed in testes and slightly expressed in pancreas. Its expression is regulated by histone deacetylases. RNF149, also known as DNA polymerase-transactivated protein 2, is an E3 ubiquitin-protein ligase that induces the ubiquitination of wild-type v-Raf murine sarcoma viral oncogene homolog B1 (BRAF) and promotes its proteasome-dependent degradation. RNF150 is a RING finger protein whose polymorphisms may be associated with chronic obstructive pulmonary disease (COPD) risk in the Chinese population. The family also includes Drosophila melanogaster protein goliath (d-goliath), also known as protein g1, which is one of the founding members of the family. It was originally identified as a transcription factor involved in the embryo mesoderm formation.	48
319583	cd16669	RING-H2_RNF181	RING finger, H2 subclass, found in RING finger protein 181 (RNF181) and similar proteins. RNF181, also known as HSPC238, is a platelet E3 ubiquitin-protein ligase containing a C3H2C3-type RING-H2 finger. It interacts with the KVGFFKR motif of platelet integrin alpha(IIb)beta3, suggesting a role for RNF181-mediated ubiquitination in integrin and platelet signaling. It also suppresses the tumorigenesis of hepatocellular carcinoma (HCC) through the inhibition of extracellular signal-regulated kinase/mitogen-activated protein kinase (ERK/MAPK) signaling in the liver.	46
319584	cd16670	RING-H2_RNF215	RING finger, H2 subclass, found in RING finger protein 215 (RNF215) and similar proteins. This family includes uncharacterized protein RNF215 and similar proteins. Although its biological function remains unclear, RNF215 shares high sequence similarity with PA-TM-RING ubiquitin ligases, which have been characterized by containing an N-terminal signal peptide, a protease-associated (PA) domain, a transmembrane (TM) domain and a C-terminal C3H2C3-type RING-H2 finger domain.	50
319585	cd16671	RING-H2_DTX1_4	RING finger, H2 subclass, found in E3 ubiquitin-protein ligase deltex1 (DTX1), deltex4 (DTX4), and similar proteins. DTX1 is a mammalian homolog of Drosophila Deltex that is a ubiquitously expressed cytoplasmic ubiquitin E3 ligase that mediates Notch activation in Drosophila. It functions as a Notch downstream transcription regulator that mediates a Notch signal to block differentiation of neural progenitor cells. DTX1 interacts with the transcription coactivator p300 and inhibits transcription activation mediated by the neural specific transcription factor MASH1. It is also a transcription target of nuclear factor of activated T cells (NFAT) and participates in T cell anergy and Foxp3 protein level maintenance in vivo. Moreover, Deltex1 appears to promote B-cell development at the expense of T-cell development. It also promotes protein kinase C theta degradation and sustains Casitas B-lineage lymphoma expression. DTX4, also known as RING finger protein 155, shares the highest degree of sequence similarity with DTX1 and likely interacts with the intracellular domain of Notch as well. Both DTX1 and DTX4 contain N-terminal two Notch-binding WWE domains that physically interact with the Notch ankyrin domains, a proline-rich motif that shares homology with SH3-binding domains, and a C3H2C3-type RING-H2 finger at the C-terminus. They also harbor two nuclear localization signals.	69
319586	cd16672	RING-H2_DTX2	RING finger, H2 subclass, found in E3 ubiquitin-protein ligase Deltex2 (DTX2) and similar proteins. DTX2, also known as RING finger protein 58, together with DTX1 and DTX4, forms a family of related proteins that are the mammalian homologs of Drosophila Deltex, a known regulator of Notch signals. Like DTX1 and DTX4, DTX2 is expressed in thymocytes. It interacts with the intracellular domain of Notch receptors and acts as a negative regulator of Notch signals in T cells. However, the endogenous levels of DTX1 and DTX2 are not important for regulating Notch signals during thymocyte development. DTX2 contains two N-terminal Notch-binding WWE domains that physically interact with the Notch ankyrin domains, a proline-rich motif that shares homology with SH3-binding domains, and a C3H2C3-type RING-H2 finger at the C-terminus. It also harbors two nuclear localization signals.	72
319587	cd16673	RING-H2_RNF6	RING finger, H2 subclass, found in E3 ubiquitin-protein ligase RNF6 and similar proteins. RNF6 is an androgen receptor (AR)-associated protein that induces AR ubiquitination and promotes AR transcriptional activity. RNF6-induced ubiquitination may regulate AR transcriptional activity and specificity through modulating cofactor recruitment. RNF6 is overexpressed in hormone-refractory human prostate cancer tissues and required for prostate cancer cell growth under androgen-depleted conditions. Moreover, RNF6 regulates local serine/threonine kinase LIM kinase 1 (LIMK1) levels in axonal growth cones. RNF6-induced LIMK1 polyubiquitination is mediated via K48 of ubiquitin and leads to proteasomal degradation of the kinase. RNF6 also binds and upregulates the Inha promoter, and functions as a transcription regulatory protein in the mouse sertoli cell. Furthermore, RNF6 acts as a potential tumor suppressor gene involved in the pathogenesis of esophageal squamous cell carcinoma (ESCC). RNF6 contains an N-terminal coiled-coil domain, a Lys-X-X-Leu/Ile-X-X-Leu/Ile (KIL) motif, and a C-terminal C3H2C3-type RING-H2 finger which is responsible for its ubiquitin ligase activity. The KIL motif is present in a subset of RING-H2 proteins from organisms as evolutionarily diverse as human, mouse, chicken, Drosophila, Caenorhabditis elegans, and Arabidopsis thaliana.	45
319588	cd16674	RING-H2_RNF12	RING finger, H2 subclass, found in RING finger protein 12 (RNF12) and similar proteins. RNF12, also known as LIM domain-interacting RING finger protein or RING finger LIM domain-binding protein (R-LIM), is an E3 ubiquitin-protein ligase encoded by gene RLIM that is crucial for normal embryonic development in some species and for normal X inactivation in mice. It thus functions as a major sex-specific epigenetic regulator of female mouse nurturing tissues. RNF12 is widely expressed during embryogenesis, and mainly localizes to the cell nucleus, where it regulates the levels of many proteins, including CLIM, LMO, HDAC2, TRF1, SMAD7, and REX1, by proteasomal degradation. Its functional activity is regulated by phosphorylation-dependent nucleocytoplasmic shuttling. It is negatively regulated by pluripotency factors in embryonic stem cells. p53 represses its transcription through Sp1. RNF12 is the primary factor responsible for X chromosome inactivation (XCI) in female placental mammals. It is an indispensable factor in up-regulation of Xist transcription, thereby leading to initiation of random XCI. It also targets REX1, an inhibitor of XCI, for proteasomal degradation. Moreover, RNF12 acts as a co-regulator of a range of transcription factors, particularly those containing a LIM homeodomain, and modulates the formation of transcriptional multiprotein complexes. It is a negative regulator of Smad7, which in turn negatively regulates the type I receptors in transforming growth factor beta (TGF-beta) superfamily signaling. In addition, paternal RNF12 is a critical survival factor for milk-producing alveolar cells. RNF12 contains a nuclear localization signal (NLS) and a C3H2C3-type RING-H2 finger.	45
319589	cd16675	RING-H2_RNF24	RING finger, H2 subclass, found in RING finger protein 24 (RNF24) and similar proteins. RNF24 is an intrinsic membrane protein localized in the Golgi apparatus. It specifically interacts with the ankyrin-repeats domains (ARDs) of TRPC1, -3, -4, -5, -6, and -7, and affects TRPC intracellular trafficking without affecting their activity. RNF24 contains an N-terminal transmembrane domain and a C-terminal C3H2C3-type RING-H2 finger.	47
319590	cd16676	RING-H2_RNF122	RING finger, H2 subclass, found in RING finger protein 122 (RNF122) and similar proteins. RNF122 is a RING finger protein associated with HEK 293T cell viability. It is localized to the endoplasmic reticulum (ER) and the Golgi apparatus, and overexpressed in anaplastic thyroid cancer cells. RNF122 functions as an E3 ubiquitin ligase that can ubiquitinate itself and undergoes degradation through its RING finger in a proteasome-dependent manner. It interacts with calcium-modulating cyclophilin ligand (CAML), which is not a substrate, but a stabilizer of RNF122. RNF122 contains an N-terminal transmembrane domain and a C-terminal C3H2C3-type RING-H2 finger.	47
319591	cd16677	RING1-H2_RNF32	RING finger 1, H2 subclass, found in RING finger protein 32 (RNF32) and similar proteins. RNF32 is mainly expressed in testis spermatogenesis, most likely in spermatocytes and/or in spermatids, suggesting a possible role in sperm formation. RNF32 contains two C3H2C3-type RING-H2 fingers separated by an IQ domain of unknown function. Although the biological function of RNF32 remains unclear, the protein with double RING-H2 fingers may act as a scaffold for binding several proteins that function in the same pathway. This family corresponds to the first RING-H2 finger.	44
319592	cd16678	RING2-H2_RNF32	RING finger 2, H2 subclass, found in RING finger protein 32 (RNF32) and similar proteins. RNF32 is mainly expressed in testis spermatogenesis, most likely in spermatocytes and/or in spermatids, suggesting a possible role in sperm formation. RNF32 contains two C3H2C3-type RING-H2 fingers separated by an IQ domain of unknown function. Although the biological function of RNF32 remains unclear, the protein with double RING-H2 fingers may act as a scaffold for binding several proteins that function in the same pathway. This family corresponds to the second RING-H2 finger.	60
319593	cd16679	RING-H2_RNF38	RING finger, H2 subclass, found in RING finger protein 38 (RNF38) and similar proteins. RNF38 is a nuclear E3 ubiquitin protein ligase that is widely expressed throughout the body in human, especially highly expressed in the heart, brain, placenta and the testis. It recognizes p53 as a substrate for ubiquitination, and thus plays a role in regulating p53. The overexpression of RNF38 increases p53 ubiquitination and alters p53 localization. It is also capable of autoubiquitination. Moreover, RNF38 expression is negatively regulated by the serotonergic system. Induction of RNF38 may be involved in the anxiety-like behavior or non-cell autonomous by the decline of serotonin (5-HT) levels. RNF38 contains a coiled-coil motif, a KIL motif (Lys-X2-Ile/Leu-X2-Ile/Leu, X can be any amino acid), and a C3H2C3-type RING-H2 finger, as well as two potential nuclear localization signals.	49
319594	cd16680	RING-H2_RNF44	RING finger, H2 subclass, found in RING finger protein 44 (RNF44) and similar proteins. RNF44 is an uncharacterized RING finger protein that shows high sequence similarity with RNF38, which is a nuclear E3 ubiquitin protein ligase that plays a role in regulating p53. RNF44 contains a coiled-coil motif, a KIL motif (Lys-X2-Ile/Leu-X2-Ile/Leu, X can be any amino acid), and a C3H2C3-type RING-H2 finger.	45
319595	cd16681	RING-H2_RNF111	RING finger, H2 subclass, found in RING finger protein 111 (RNF111) and similar proteins. RNF111, also known as Arkadia, is a nuclear E3 ubiquitin-protein ligase that targets intracellular effectors and modulators of transforming growth factor beta (TGF-beta)/Nodal-related signaling for polyubiquitination and proteasome-dependent degradation. It acts as an amplifier of Nodal signals, and enhances the dorsalizing activity of limiting amounts of Xnr1, a Nodal homolog, and requires Nodal signaling for its function. The loss of RNF111 results in early embryonic lethality, with defects attributed to compromised Nodal signaling. Moreover, RNF111 regulates tumor metastasis by modulation of the TGF-beta pathway. Its ubiquitination can be modulated by the four and a half LIM-only protein 2 (FHL2) that activates TGF-beta signal transduction. Furthermore, RNF111 interacts with the clathrin-adaptor 2 (AP2) complex and regulates endocytosis of certain cell surface receptors, leading to modulation of epidermal growth factor (EGF) and possibly other signaling pathways. In addition, RNF111 has been identified as a small ubiquitin-like modifier (SUMO)-binding protein with clustered SUMO-interacting motifs (SIMs) that together form a SUMO-binding domain (SBD). It thus functions as a SUMO-targeted ubiquitin ligase (STUbL) that directly links nonproteolytic ubiquitylation and SUMOylation in the DNA damage response, as well as triggers degradation of signal-induced polysumoylated proteins, such as the promyelocytic leukemia protein (PML). The N-terminal half of RNF111 harbors three SIMs. Its C-terminal half shows high sequence similarity with RING finger protein 165 (RNF165), where it contains two serine rich domains, two nuclear localization signals, a NRG-TIER domain, and a C-terminal C3H2C3-type RING-H2 finger that is required for polyubiquitination and proteasome-dependent degradation of phosphorylated forms of Smad2/3 and three major negative regulators of TGF-beta signaling, Smad7, SnoN and c-Ski.	46
319596	cd16682	RING-H2_RNF165	RING finger, H2 subclass, found in RING finger protein 165 (RNF165) and similar proteins. RNF165, also known as Arkadia-like 2, or Arkadia2, or Ark2C, is an E3 ubiquitin ligase with homology to C-terminal half of RNF111. It is expressed specifically in the nervous system, and can serve to amplify neuronal responses to specific signals. It thus acts as a positive regulator of bone morphogenetic protein (BMP)-Smad signaling that is involved in motor neuron (MN) axon elongation. RNF165 contains two serine rich domains, a nuclear localization signal, a NRG-TIER domain, and a C-terminal C3H2C3-type RING-H2 finger that is responsible for the enhancement of BMP-Smad1/5/8 signaling in the spinal cord.	51
319597	cd16683	RING-H2_RNF139	RING finger, H2 subclass, found in RING finger protein 139 (RNF139) and similar proteins. RNF139, also known as translocation in renal carcinoma on chromosome 8 protein (TRC8), is an endoplasmic reticulum (ER)-resident multi-transmembrane protein that functions as a potent growth suppressor in mammalian cells, inducing G2/M arrest, decreased DNA synthesis and increased apoptosis. It is a tumor suppressor that has been implicated in a novel regulatory relationship linking the cholesterol/lipid biosynthetic pathway with cellular growth control. The mutation of RNF139 has been identified in families with hereditary renal (RCC) and thyroid cancers. RNF139 physically and functionally interacts with von Hippel-Lindau (VHL), which is part of an SCF related E3-ubiquitin ligase complex with "gatekeeper" function in renal carcinoma and is defective in most sporadic clear-cell renal cell carcinomas (ccRCC). It suppresses growth and functions with VHL in a common pathway. RNF139 also suppresses tumorigenesis through targeting heme oxygenase-1 for ubiquitination and degradation. Moreover, RNF139 is a target of Translin (TSN), a posttranscriptional regulator of genes transcribed by the transcription factor CREM-tau in postmeiotic male germ cells, suggesting a role of RNF139 in dysgerminoma. In addition, RNF139 forms an integral part of a novel multi-protein ER complex, containing MHC I, US2, and signal peptide peptidase, which is associated with ER-associated degradation (ERAD) pathway. It is required for the ubiquitination of MHC class I molecules before dislocation from the ER. As a novel sterol-sensing ER membrane protein, RNF139 hinders sterol regulatory element-binding protein-2 (SREBP-2) processing through interaction with SREBP-2 and SREBP cleavage-activated protein (SCAP), regulating its own turnover rate via its E3 ubiquitin ligase activity. RNF139 shows two regions of similarity with the receptor for sonic hedgehog (SHH), Patched. The first region corresponds to the second extracellular domain of Patched, which is involved in binding SHH. The second region is a putative sterol-sensing domain (SSD). In addition, the C-terminal half of RNF139 contains a C3H2C3-type RING-H2 finger with E3-ubiquitin ligase activity in vitro.	42
319598	cd16684	RING-H2_RNF145	RING finger, H2 subclass, found in RING finger protein 145 (RNF145) and similar proteins. RNF145 is an uncharacterized RING finger protein encoded by RNF145 gene, which is expressed in T lymphocytes, and its expression is altered in acute myelomonocytic and acute promyelocytic leukemias. Although its biological function remains unclear, RNF145 shows high sequence similarity with RNF139, an endoplasmic reticulum (ER)-resident multi-transmembrane protein that functions as a potent growth suppressor in mammalian cells, inducing G2/M arrest, decreased DNA synthesis and increased apoptosis. Like RNF139, RNF145 contains a C3H2C3-type RING-H2 finger with possible E3-ubiquitin ligase activity.	43
319599	cd16685	RING-H2_UBR1	RING finger, H2 subclass, found in ubiquitin-protein ligase E3-alpha-1 (UBR1) and similar proteins. UBR1, also known as N-recognin-1 or E3alpha-I, is an E3 ubiquitin-protein ligase that is the E3 component of the N-end rule pathway. It also promotes degradation of proteins via a distinct mechanism that detects a misfolded conformation. UBR1 associates with the RAD6-encoded E2 enzyme to form an E2-E3 complex that catalyzes the synthesis of a substrate-linked multi-ubiquitin chain and may also mediate the delivery of substrates to the 26S proteasome. Moreover, UBR1 promotes the degradation of a misfolded protein in the cytosol. It promotes protein kinase quality control and sensitizes cells to heat shock protein 90 (Hsp90) inhibition. Furthermore, UBR1 functions as a polyubiquitylation-enhancing component of the UBR1-UFD4 complex in its targeting of ubiquitin-fusion degradation (UFD) substrates. UBR1 harbors at least three distinct substrate-binding sites and functions in association with Ubc2/Rad6 and also Ubc4. It contains an N-terminal ubiquitin-recognin (UBR) box involved in binding type-1 (basic) N-end rule substrate, an N-domain (also known as ClpS domain) required for type-2 (bulky hydrophobic) N-end rule substrate recognition, a C3H2C3-type RING-H2 finger, and a C-terminal UBR-specific autoinhibitory (UAIN) domain. A missense mutation in UBR1 that is responsible for Johanson-Blizzard syndrome leads to UBR box unfolding and loss of function.	120
319600	cd16686	RING-H2_UBR2	RING finger, H2 subclass, found in ubiquitin-protein ligase E3-alpha-2 (UBR2) and similar proteins. UBR2, also known as N-recognin-2 or E3alpha-II, is an E3 ubiquitin-protein ligase that plays an important role in maintaining genome integrity and in homologous recombination repair. It regulates the level of the transcription factor Rpn4 (also known as Son1 and Ufd5) through ubiquitylation. The ubiquitin-conjugating enzyme Rad6, another binding partner of UBR2, and an additional factor Mub1, are required for the ubiquitin-dependent degradation of Rpn4. UBR2 associates with Mub1 to form a Mub1/Ubr2 ubiquitin ligase complex that regulates the conserved Dsn1 kinetochore protein levels, which is a part of a quality control system that monitors kinetochore integrity, thus ensuring genomic stability. As the recognition component of a major cellular proteolytic system, UBR2 is associated with chromatin and controls chromatin dynamics and gene expression in both spermatocytes and somatic cells. Moreover, UBR2 mediates transcriptional silencing during spermatogenesis via histone ubiquitination. It functions as a scaffold E3 promoting HR6B/UbcH2-dependent ubiquitylation of H2A and H2B, but not H3 and H4. It also binds to Tex19.1, also known as Tex19, a germ cell-specific protein, and metabolically stabilizes it during spermatogenesis. Furthermore, UBR2 is involved in skeletal muscle (SKM) atrophy. Its expression can be modulated by the mouse ether-a-gogo-related gene 1a (MERG1a) potassium channel. In addition, UBR2 up-regulation in cachectic muscle is mediated by the p38beta-CCAAT/enhancer binding protein (C/EBP)-beta signaling pathway responsible for the bulk of tumor-induced muscle proteolysis. UBR2 contains an N-terminal ubiquitin-recognin (UBR) box involved in binding type-1 (basic) N-end rule substrate, an N-domain (also known as ClpS domain) required for type-2 (bulky hydrophobic) N-end rule substrate recognition, a C3H2C3-type RING-H2 finger, and a C-terminal UBR-specific autoinhibitory (UAIN) domain.	116
319601	cd16687	RING-H2_Vps8	RING finger, H2 subclass, found in vacuolar protein sorting-associated protein 8 (Vps8) and similar proteins. Vps8 is the Rab-specific subunit of the endosomal tethering complex CORVET (class C core vacuole/endosome transport) that also includes Vps3 and a Class C Vps core complex composed of Vps11, Vps16, Vps18, and Vps33. CORVET operates at endosomes, controls traffic into late endosomes, and interacts with the Rab5/Vps21-GTP form. The CORVET-specific Vps3 and Vps8 subunits belong to a class D Vps. They form a subcomplex that interacts with Rab5/Vps21, and are critical for localization and function of the CORVET tethering complex on endosomes. Vps8 contains an N-terminal WD40 repeat and a C-terminal C3H2C3-type RING-H2 finger.	55
319602	cd16688	RING-H2_Vps11	RING finger, H2 subclass, found in vacuolar protein sorting-associated protein 11 homolog (Vps11) and similar proteins. Vps11, also known as RING finger protein 108 (RNF108), is a soluble protein involved in regulation of glycolipid degradation and retrograde toxin transport. It is highly expressed in heart and pancreas. Vps11 associates with Vps16, Vps18, and Vps33 to form a Class C Vps core complex that is required for soluble N-ethylmaleimide-sensitive factor attachment protein receptors (SNARE)-mediated membrane fusion at the lysosome-like yeast vacuole. The core complex, together with two additional compartment-specific subunits, forms the tethering complexes HOPS (homotypic vacuole fusion and protein sorting) and CORVET (class C core vacuole/endosome transport) protein complexes. CORVET contains the additional Vps3 and Vps8 subunits. It operates at endosomes, controls traffic into late endosomes and interacts with the Rab5/Vps21-GTP form. HOPS contains the additional Vps39 and Vps41 subunits. It operates at the lysosomal vacuole, controls all traffic from late endosomes into the vacuole and interacts with the Rab7/Ypt7-GTP form. Vps11 is a central scaffold protein upon which both HOPS and CORVET assemble. The HOPS and CORVET complexes disassemble in the absence of Vps11, resulting in a massive fragmentation of vacuoles. Vps11 contains a clathrin repeat domain and a C-terminal C3H2C3-type RING-H2 finger. This subfamily also includes Vps11 homologs found in fungi, such as Saccharomyces cerevisiae vacuolar membrane protein Pep5p, also known as carboxypeptidase Y-deficient protein 5, vacuolar morphogenesis protein 1, or vacuolar biogenesis protein END1. Pep5p is essential for vacuolar biogenesis in Saccharomyces cerevisiae. It associates with Pep3p to form a core Pep3p/Pep5p complex that promotes vesicular docking/fusion reactions in conjunction with SNARE proteins at multiple steps in transport routes to the vacuole.	44
319603	cd16689	RING-H2_Vps18	RING finger, H2 subclass, found in vacuolar protein sorting-associated protein 18 (Vps18) and similar proteins. Vps18 is a ubiquitin ligase E3 that is highly expressed in heart. It induces the ubiquitylation and degradation of serum-inducible kinase (SNK). Vps18 associates with Vps11, Vps16, and Vps33 to form a Class C Vps core complex that is required for soluble N-ethylmaleimide-sensitive factor attachment protein receptors (SNARE)-mediated membrane fusion at the lysosome-like yeast vacuole. The core complex, together with two additional compartment-specific subunits, forms the tethering complexes HOPS (homotypic vacuole fusion and protein sorting) and CORVET (class C core vacuole/endosome transport) protein complexes. CORVET contains the additional Vps3 and Vps8 subunits. It operates at endosomes, controls traffic into late endosomes and interacts with the Rab5/Vps21-GTP form. HOPS contains the additional Vps39 and Vps41 subunits. It operates at the lysosomal vacuole, controls all traffic from late endosomes into the vacuole and interacts with the Rab7/Ypt7-GTP form. Vps18 deficiency inhibits dendritogenesis in Purkinje cells by blocking the lysosomal degradation of lysyl oxidase. Vps18 contains a clathrin heavy chain repeat, a coiled-coil domain, and a C3H2C3-type RING-H2 finger domain close to its C-terminus. This subfamily also includes Vps18 homologs found in insects and fungi, such as Drosophila melanogaster protein deep orange (dor) gene encoding protein Dor, and Saccharomyces cerevisiae vacuolar membrane protein Pep3p, also known as carboxypeptidase Y-deficient protein 3, or vacuolar morphogenesis protein 8. Drosophila Dor is part of a protein complex, which also includes the Sep1p homolog carnation (car), which localizes to endosomal compartments and is required not only for the biogenesis of pigment granules but also for the normal delivery of proteins to lysosomes. Pep3p is a vacuolar peripheral membrane protein that is required for vacuolar biogenesis in Saccharomyces cerevisiae. Pep3p associates with Pep5p to form a core Pep3p/Pep5p complex that promotes vesicular docking/fusion reactions in conjunction with SNARE proteins at multiple steps in transport routes to the vacuole.	38
319604	cd16690	RING-H2_Vps41	RING finger, H2 subclass, found in vacuolar protein sorting-associated protein 41 (Vps41) and similar proteins. Vps41, also known as S53, is a protein involved in trafficking of proteins from the late Golgi to the vacuole. It interacts with caspase-8, suggesting a potential role of Vps41 beyond lysosomal trafficking. It has been identified as a potential therapeutic target for human Parkinson's disease (PD). Vps41 and the soluble N-ethylmaleimide-sensitive factor attachment protein receptors protein VAMP7 are specifically involved in the fusion of the trans-Golgi network-derived lysosome-associated membrane protein carriers with late endosomes. Vps41 is a specific subunit of the lysosomal tethering complex HOPS (homotypic vacuole fusion and protein sorting) that also includes Vps39 and a Class C Vps core complex composed of Vps11, Vps16, Vps18, and Vps33. HOPS operates at the lysosomal vacuole, controls all traffic from late endosomes into the vacuole and interacts with the Rab7/Ypt7-GTP form. The HOPS-specific Vps39 and Vps41 subunits belong to a class B Vps. They form a subcomplex that interacts with Rab7/Ypt7 and is required for homotypic and heterotypic late endosome fusion. Vps41 contains an N-terminal WD40 repeat, one or two clathrin repeats and a C3H2C3-type RING-H2 finger domain close to its C-terminus. This subfamily also includes Vps41 homologs found in insects, such as Drosophila melanogaster eye color gene light encoding protein.	51
319605	cd16691	mRING-H2-C3H3C2_Mio	Modified RING finger, H2 subclass (C3H3C2-type), found in WD repeat-containing protein mio and similar proteins. This family contains Mio, its counterpart Sea4 from yeast, and other homologs. Mio/Sea4 is a component of GATOR2 complex, which also includes another four subunits, Seh1, Sec13, Sea2/WDR24, and Sea3/WDR59. GATOR2 and GATOR1, which is composed of three subunits, DEPDC5, Nprl2, and Nprl3, form the Rag-interacting complex GATOR (GAP Activity Towards Rags). Inhibition of GATOR1 subunits makes mTORC1 signaling resistant to amino acid deprivation. In contrast, inhibition of GATOR2 subunits suppresses mTORC1 signaling and GATOR2 negatively regulates DEPDC5. Mio interacts with endogenous RagA and RagC, and plays an essential role in the activation of mTOR Complex 1 (mTORC1) by amino acids. In GATOR2, Mio and Seh1 localize to lysosomes and autolysosomes, and form a heterodimer that is required to oppose the TORC1 inhibitory activity of the Iml1/GATOR1 complex to prevent the constitutive down-regulation of TORC1 activity in later stages of oogenesis. A tissue-specific requirement is necessary for Mio to be involved in cell growth in the female germ line. Mio contains an N-terminal WD40 domain and a C-terminal RING-H2 finger with an unusual arrangement of zinc-coordinating residues. The cysteines and histidines in RING-H2 finger are arranged as a modified C3H3C2-type, rather than the canonical C3H2C3-type.	73
319606	cd16692	mRING-H2-C3H3C2_WDR59	Modified RING finger, H2 subclass (C3H3C2-type), found in WD repeat-containing protein 59 (WDR59) and similar proteins. WDR59 is a component of GATOR2 complex, which also includes another four subunits, Seh1, Sec13, Sea2/WDR24, and Mio/Sea4. GATOR2 and GATOR1, which is composed of three subunits, DEPDC5, Nprl2, and Nprl3, form the Rag-interacting complex GATOR (GAP Activity Towards Rags). Inhibition of GATOR1 subunits makes mTORC1 signaling resistant to amino acid deprivation. In contrast, inhibition of GATOR2 subunits suppresses mTORC1 signaling and GATOR2 negatively regulates DEPDC5. WDR59 contains an N-terminal WD40 domain followed by a RWD domain, and a C-terminal RING-H2 finger with an unusual arrangement of zinc-coordinating residues. The cysteines and histidines in RING-H2 finger are arranged as a modified C3H3C2-type, rather than the canonical C3H2C3-type. Sea3 is the yeast counterpart of WDR59. It is not included in this subfamily.	47
319607	cd16693	mRING-H2-C3H3C2_WDR24	Modified RING finger, H2 subclass (C3H3C2-type), found in WD repeat-containing protein 24 (WDR24) and similar proteins. WDR24 is a component of GATOR2 complex, which also includes another four subunits, Seh1, Sec13, Sea3/WDR59, and Mio/Sea4. GATOR2 and GATOR1, which is composed of three subunits, DEPDC5, Nprl2, and Nprl3, form the Rag-interacting complex GATOR (GAP Activity Towards Rags). Inhibition of GATOR1 subunits makes mTORC1 signaling resistant to amino acid deprivation. In contrast, inhibition of GATOR2 subunits suppresses mTORC1 signaling and GATOR2 negatively regulates DEPDC5. WDR24 contains an N-terminal WD40 domain and a C-terminal RING-H2 finger with an unusual arrangement of zinc-coordinating residues. The cysteines and histidines in RING-H2 finger are arranged as a modified C3H3C2-type, rather than the canonical C3H2C3-type. Sea2 is the yeast counterpart of WDR24. It is not included in this subfamily.	46
319608	cd16694	mRING-CH-C4HC2H_ZNRF1	Modified RING-CH finger, H2 subclass (C4HC2H-type), found in zinc/RING finger protein 1 (ZNRF1) and similar proteins. ZNRF1, also known as Nerve injury-induced gene 283 protein (nin283), or peripheral nerve injury protein (PNIP), is an E3 ubiquitin-protein ligase that is highly expressed in the nervous system during development and is associated with synaptic vesicle membranes. It is N-myristoylated and also located in the endosome-lysosome compartment in fibroblasts, suggesting it may participate in ubiquitin-mediated protein modification. It contains an N-terminal MAGE domain, and a special C-terminal domain that combines a zinc finger and a modified C4HC2H-type RING-CH finger, rather than the typical C4HC3-type RING-CH finger, which is a variant of RING-H2 finger. Only the RING finger of the zinc finger-RING finger motif is required for its E3 ubiquitin ligase activity. ZNRF1 regulates Schwann cell differentiation by proteasomal degradation of glutamine synthetase (GS). It also mediates regulation of neuritogenesis via interaction with beta-tubulin type 2 (Tubb2). Moreover, ZNRF1 promotes Wallerian degeneration by degrading AKT to induce glycogen synthase kinase-3beta (GSK3B)-dependent CRMP2 phosphorylation. Furthermore, ZNRF1 and its sister protein ZNRF2 regulate the ubiquitous Na+/K+ pump (Na+/K+ATPase). In addition, ZNRF1 may be associated with leukemogenesis of acute lymphoblastic leukemia (ALL) with paired box domain gene 5 (PAX5) alteration.	46
319609	cd16695	mRING-CH-C4HC2H_ZNRF2	Modified RING-CH finger, H2 subclass (C4HC2H-type), found in zinc/RING finger protein 2 (ZNRF2) and similar proteins. ZNRF2, also known as protein Ells2 or RING finger protein 202 (RNF202), is an E3 ubiquitin-protein ligase that is highly expressed in the nervous system during development and is present in presynaptic plasma membranes. It is N-myristoylated and also located in the endosome-lysosome compartment in fibroblasts. It contains an N-terminal MAGE domain, and a special C-terminal domain that combines a zinc finger and a modified C4HC2H-type RING-CH finger, rather than the typical C4HC3-type RING-CH finger, which is a variant of RING-H2 finger. Only the RING finger of the zinc finger-RING finger motif is required for its E3 ubiquitin ligase activity. Together with its sister protein ZNRF1, ZNRF2 regulates the ubiquitous Na+/K+ pump (Na+/K+ATPase).	45
319610	cd16696	RING-CH-C4HC3_NFX1	RING-CH finger, H2 subclass (C4HC3-type), found in transcriptional repressor NF-X1 and similar proteins. NF-X1, also known as nuclear transcription factor, X box-binding protein 1, is a novel cysteine-rich sequence-specific DNA-binding protein that interacts with the conserved X-box motif of the human major histocompatibility complex (MHC) class II genes via a repeated Cys-His domain. It functions as a cytokine-inducible transcriptional repressor that plays an important role in regulating the duration of an inflammatory response by limiting the period in which class II MHC molecules are induced by interferon gamma (IFN-gamma). NFX1 contains an N-terminal PAM2 motif, a C4HC3-type RING-CH finger, a Cys-rich region that harbors several NFX1-type zinc fingers, and a C-terminal R3H domain.	55
319611	cd16697	RING-CH-C4HC3_NFXL1	RING-CH finger, H2 subclass (C4HC3-type), found in nuclear transcription factor, X-box binding-like 1 (NFXL1) and similar proteins. NFXL1, also known as NF-X1-type zinc finger protein NFXL1, or ovarian zinc finger protein (OZFP), is encoded by a novel human cytoplasm-distribution zinc finger protein (CDZFP) gene. It is a putative zinc finger protein with a C4HC3-type RING-CH finger and a Cys-rich region that harbors several NFX1-type zinc fingers.	63
319612	cd16698	RING_CH-C4HC3_MARCH1_like	RING-CH finger, H2 subclass (C4HC3-type), found in membrane-associated RING finger protein MARCH1, MARCH8, and similar proteins. This family includes the closely related MARCH1 and MARCH8, both of which are located on endosomes and the plasma membrane and are implicated in regulating cell surface expression of their substrates. They ubiquitylate and downregulate many targets, including major histocompatibility complex class II (MHCII), CD86, transferrin receptor, HLA-DM, and Fas from the cell surface. MARCH1 is mainly expressed in cells of the immune system, while MARCH8 is more broadly expressed. Both of them contain an N-terminal C4HC3-type RING-CH finger, also known as vRING or RINGv, a variant of C3H2C3-type RING-H2 finger, and two transmembrane domains. The cytoplasmic RING-CH domain participates in the ubiquitin transfer from the E2 to its substrate. The transmembrane domains are implicated in target recognition and dimer formation.	52
319613	cd16699	RING_CH-C4HC3_MARCH2_like	RING-CH finger, H2 subclass (C4HC3-type), found in membrane-associated RING finger protein MARCH2, MARCH3, and similar proteins. MARCH2 contains a C4HC3-type RING-CH finger, also known as vRING or RINGv, a variant of C3H2C3-type RING-H2 finger, in the N-terminal cytoplasmic region, two transmembrane domains in the middle region, and a PDZ-binding motif at the C-terminus. It is a Golgi-localized, membrane-associated E3 ubiquitin-protein ligase that is involved in endosomal trafficking through the binding of syntaxin 6 (STX6). It is involved in the cystic fibrosis transmembrane conductance regulator (CFTR)-associated ligand (CAL)-mediated ubiquitination and lysosomal degradation of mature CFTR through the association with adaptor proteins CAL and STX6. It also reduces the surface expression of CD86 and the transferrin receptor TFRC and regulates cell surface carvedilol-bound beta2-adrenergic receptor (beta2ARs) expression. Moreover, MARCH2 interacts with and ubiquitinates PDZ domains polarity determining scaffold protein DLG1 through its PDZ-binding motif, suggesting it may function as a molecular bridge with ubiquitin ligase activity connecting endocytic tumor suppressor proteins such as syntaxins to DLG1. MARCH3 is an E3 ubiquitin-protein ligase that is broadly expressed at relatively high levels in spleen, colon, and lung. It is localized to early endosomes, binds to MARCH2 and syntaxin 6, and is involved in the regulation of vesicular trafficking and fusion of the transport vesicles in endosomes. Its E2 specificity significantly overlaps that of MARCH2.	51
319614	cd16700	RING_CH-C4HC3_MARCH4_like	RING-CH finger, H2 subclass (C4HC3-type), found in membrane-associated RING finger protein MARCH4, MARCH9, MARCH11, and similar proteins. MARCH4 and MARCH9 are closely related to each other. They downregulate major histocompatibility complex-I (MHC-I). Moreover, MARCH4 and MARCH9, but not other MARCH proteins, can associate with Mult1 and prevent Mult1 expression at the cell surface in a lysine-dependent manner that can be reversed by heat shocking the cells. MARCH11 is a transmembrane RING-finger ubiquitin ligase that is predominantly expressed in developing spermatids in a stage-specific manner and is localized to the trans-Golgi network (TGN) vesicles and multivesicular bodies (MVBs). It mediates selective protein sorting via the TGN-MVB transport pathway through its ubiquitin ligase activity. SAMT family proteins have been identified as substrates of MARCH11 in mouse spermatids, suggesting that MARCH11 plays a role in mammalian spermiogenesis. Moreover, MARCH11 functions as an E3 ubiquitin ligase that targets CD4 for ubiquitination. It also forms complexes with the adaptor protein complex-1 and with fucose-containing glycoproteins including ubiquitinated forms. All family members contain an N-terminal C4HC3-type RING-CH finger, also known as vRING or RINGv, a variant of C3H2C3-type RING-H2 finger, followed by two transmembrane regions.	51
319615	cd16701	RING_CH-C4HC3_MARCH5	RING-CH finger, H2 subclass (C4HC3-type), found in membrane-associated RING-CH5 (MARCH5). MARCH5, also known as membrane-associated RING finger protein 5, membrane-associated RING-CH protein V (MARCH-V), RING finger protein 153 (RNF153), or mitochondrial ubiquitin ligase (MITOL), is a mitochondrial outer membrane-associated E3 ubiquitin-protein ligase that regulates mitochondrial dynamics including mitochondrial morphology, transport, and interaction with endoplasmic reticulum (ER), at least in part, through the ubiquitination of mitochondrial fission factor Drp1, microtubule-associated protein 1B (MAP1B) and mitofusin 2 (Mfn2), respectively. MARCH5 also mediates the cell cycle-dependent degradation of Mitofusin 1 (Mfn1) in G2/M phase, and thus serves as an upstream quality controller on Mitofusin 1 (Mfn1), preventing excessive accumulation of Mfn1 protein under stress conditions, which is crucial for mitochondrial homeostasis and cell viability. Moreover, MARCH5 is involved in maintaining mouse-embryonic stem cell (mESC) pluripotency via suppression of ERK signalling. It is also a positive regulator of Toll-like receptor 7 (TLR7)-mediated NF-kappaB activation in mammals. MARCH5 contains an N-terminal C4HC3-type RING-CH finger, also known as vRING or RINGv, a variant of C3H2C3-type RING-H2 finger, and four C-terminal transmembrane spans.	61
319616	cd16702	RING_CH-C4HC3_MARCH6	RING-CH finger, H2 subclass (C4HC3-type), found in membrane-associated RING-CH6 (MARCH6). MARCH6, also known as membrane-associated RING finger protein 6, membrane-associated RING-CH protein VI (MARCH-VI), RING finger protein 176 (RNF176), protein TEB-4, or Doa10 homolog, is an endoplasmic reticulum (ER)-localized E3 ubiquitin ligase that ubiquitinates ER-associated proteins with a cytoplasmic domain in a ubiquitin-conjugating enzyme 7 (UBC7)-dependent manner, such as Mps2, UBC6, and Ste6. It also regulates its own UBC7-mediated degradation. MARCH6 interacts with ubiquitin-specific protease USP19, which deubiquitinates and stabilizes MARCH6 and inhibits p97-dependent proteasomal degradation. It is also involved in the cholesterol synthesis pathway through controlling the degradation of squalene monooxygenase (SM), and affects 3-hydroxy-3-methyl-glutaryl coenzyme A reductase (HMGCR). Furthermore, it may be a key regulator of thyroid hormone activation in a number of tissues, since it mediates the proteasomal degradation of type 2 iodothyronine deiodinase (D2). MARCH6 contains 14 transmembrane helices and a conserved N-terminal C4HC3-type RING-CH finger, also known as vRING or RINGv, a variant of C3H2C3-type RING-H2 finger, that catalyzes ubiquitin Lys48-specific ligation.	50
319617	cd16703	RING_CH-C4HC3_MARCH7_like	RING-CH finger, H2 subclass (C4HC3-type), found in membrane-associated MARCH7, MARCH10, and similar proteins. The subfamily includes two closely related membrane-associated RING-CH proteins, MARCH7 and MARCH10, both of which are predicted to have no transmembrane spanning region, but harbor a C4HC3-type RING-CH finger, also known as vRING or RINGv, a variant of C3H2C3-type RING-H2 finger, that is responsible for E3 activity. MARCH7, also known as MARCH-VII, RNF177, or axotrophin, is a ubiquitin E3 ligase expressed in multiple types of cells and tissues, including stem cells and precursor cells, and predominantly localized on the plasma membrane and in the cytoplasm. MARCH7 is involved in T cell proliferation and neuronal development. It also participates in the regulation of cytoskeleton re-organization, cellular migration and invasion, cell proliferation, and tumorigenesis in ovarian carcinoma cells. Moreover, MARCH7 modulates nuclear factor kappaB (NF-kappaB) and Wnt/beta-catenin pathways. It has been identified as an authentic target of miR-101. Furthermore, it ubiquitinates tau protein in vitro, impairing microtubule binding. MARCH10, also known as MARCH-X or RNF190, is a microtubule-associated E3 ubiquitin ligase of developing spermatids. It is localized to the principal piece of elongating spermatids. MARCH10 is involved in spermiogenesis by regulating the formation and maintenance of the flagella in developing spermatids.	61
319618	cd16704	RING-HC_RNF20_like	RING finger, HC subclass, found in RING finger protein RNF20, RNF40, and similar proteins. RNF20, also known as BRE1A, and RNF40, also known as BRE1B, are E3 ubiquitin-protein ligases that work together to form a heterodimeric complex that facilitates the K120 monoubiquitination of histone H2B (H2Bub1), a DNA damage-induced histone modification that is crucial for recruitment of the chromatin remodeler SNF2h to DNA double-strand break (DSB) damage sites. RNF20 regulates the cell cycle and differentiation of neural precursor cells (NPCs) and links histone H2B ubiquitylation with inflammation and inflammation-associated cancer. RNF40, also known as 95 kDa retinoblastoma-associated protein (RBP95), was identified as a novel leucine zipper retinoblastoma protein (pRb)-associated protein that may function as a regulation factor in the process of RNA polymerase II-mediated transcription and/or transcriptional processing. All family members contain a C3HC4-type RING-HC finger at their C-terminus.	46
319619	cd16705	RING-HC_dBre1_like	RING finger, HC subclass, found in Drosophila melanogaster Bre1 (dBre1) and similar proteins. dBre1 is the functional homolog of yeast Bre1, an E3 ubiquitin ligase required for the monoubiquitination of histone H2B and, indirectly, for H3K4 methylation. dBre1 acts as a nuclear component required cell autonomously for the expression of Notch target genes in Drosophila development. dBre1 contains a C3HC4-type RING-HC finger at its C-terminus.	42
319620	cd16706	RING-HC_CARP1	RING finger, HC subclass, found in caspases-8 and -10-associated RING finger protein 1 (CARP1) and similar proteins. CARP1, also known as caspase regulator CARP1, FYVE-RING finger protein Momo, RING finger homologous to inhibitor of apoptosis protein (RFI), RING finger protein 34 (RNF34), or RING finger protein RIFF, is a nuclear protein that functions as a specific E3 ubiquitin ligase for the transcriptional coactivator PGC-1alpha, a master regulator of energy metabolism and adaptive thermogenesis in the brown fat cell which negatively regulates brown fat cell metabolism. It is preferentially expressed in esophageal, gastric, and colorectal cancers, suggesting a possible association with the development of the digestive tract cancers. It regulates the p53 signaling pathway by degrading 14-3-3 sigma and stabilizing MDM2. CARP1 does not localize to membranes in the cell and is involved in the negative regulation of apoptosis, specifically targeting two initiator caspases, caspase 8 and caspase 10. CARP1 contains an N-terminal FYVE-like domain and a C-terminal C3HC4-type RING-HC finger domain.	39
319621	cd16707	RING-HC_CARP2	RING finger, HC subclass, found in caspases-8 and -10-associated RING finger protein 2 (CARP-2) and similar proteins. CARP-2, also known as rififylin, caspase regulator CARP2, FYVE-RING finger protein Sakura (Fring), RING finger and FYVE-like domain-containing protein 1, RING finger protein 189 (RNF189), or RING finger protein 34-like, is an endosome-associated E3 ubiquitin-protein ligase that targets internalized receptor interacting kinase (RIP) for proteasome-mediated degradation. It acts as a negative regulator of tumor necrosis factor (TNF)-induced nuclear factor (NF)-kappaB activation. It also regulates the p53 signaling pathway through degrading 14-3-3 sigma and stabilizing MDM2. As a caspase regulator, CARP2 does not localize to membranes in the cell and is involved in the negative regulation of apoptosis, specifically targeting two initiator caspases, caspase 8 and caspase 10. CARP2 contains an N-terminal FYVE-like domain and a C-terminal C3HC4-type RING-HC finger domain.	40
319622	cd16708	RING-HC_Cbl	RING finger, HC subclass, found in E3 ubiquitin-protein ligase Cbl and similar proteins. Cbl, also known as Casitas B-lineage lymphoma proto-oncogene, proto-oncogene c-Cbl, RING finger protein 55 (RNF55), or signal transduction protein Cbl, is a multi-domain protein that acts as a key negative regulator of various receptor and non-receptor tyrosine kinases signaling. It contains a tyrosine kinase-binding domain (TKB, also known as the phosphotyrosine binding PTB domain, is composed of a four helix-bundle, a Ca2+ binding EF-hand and a highly variant SH2 domain), a proline-rich domain, a C3HC4-type RING-HC finger, and a ubiquitin-associated (UBA) domain. TKB is responsible for the interactions with many tyrosine kinases, such as the colony-stimulating factor-1 (CSF-1) receptor, Syk/ZAP-70, and Src-family of protein tyrosine kinases. The proline-rich domain can recruit proteins with a SH3 domain. Moreover, Cbl functions as an E3 ubiquitin ligase that can bind ubiquitin-conjugating enzymes (E2s) through the RING-HC finger.	57
319623	cd16709	RING-HC_Cbl-b	RING finger, HC subclass, found in E3 ubiquitin-protein ligase Cbl-b and similar proteins. Cbl-b, also known as Casitas B-lineage lymphoma proto-oncogene b, RING finger protein 56 (RNF56), SH3-binding protein Cbl-b, or signal transduction protein Cbl-b, has been identified as a regulator of antigen-specific, T cell-intrinsic, peripheral immune tolerance, a state also known as clonal anergy. It may inhibit activation of the p85 subunit of phosphoinositide 3-kinase (PI3K), protein kinase C-theta (PKC-theta), and phospholipase C-gamma1 (PLC-gamma1) and negatively regulates T-cell receptor-induced transcription factor nuclear factor kappaB (NF-kappaB) activation. In addition, Cbl-b may target multiple signaling molecules involved in transforming growth factor (TGF)-beta-mediated transactivation pathways. Cbl-b contains a tyrosine-kinase-binding domain (TKB, also known as the phosphotyrosine binding PTB domain, is composed of a four helix-bundle, a Ca2+ binding EF-hand and a highly variant SH2 domain), a proline rich domain, a nuclear localization signal, a C3HC4-type RING-HC finger and a ubiquitin-associated (UBA) domain.	66
319624	cd16710	RING-HC_Cbl-c	RING finger, HC subclass, found in E3 ubiquitin-protein ligase Cbl-c and similar proteins. Cbl-c, also known as RING finger protein 57 (RNF57), SH3-binding protein Cbl-3, SH3-binding protein Cbl-c, or signal transduction protein Cbl-c, is an E3 ubiquitin-protein ligase expressed exclusively in epithelial cells. It contains a tyrosine-kinase-binding domain (TKB, also known as the phosphotyrosine binding PTB domain, is composed of a four helix-bundle, a Ca2+ binding EF-hand and a highly variant SH2 domain), a C3HC4-type RING-HC finger, and a short proline-rich region, but lacks the ubiquitin-associated (UBA) leucine zipper motif that is present in Cbl and Cbl-b. Cbl-c acts as a regulator of epidermal growth factor receptor (EGFR) mediated signal transduction. It also suppresses v-Src-induced transformation through ubiquitin-dependent protein degradation. Moreover, Cbl-c ubiquitinates and downregulates RETMEN2A and implicates Enigma (PDLIM7) as a positive regulator of RETMEN2A through blocking of Cbl-mediated ubiquitination and degradation. The ubiquitin ligase activity of Cbl-c is increased via the interaction of its RING-HC finger domain with a LIM domain of the paxillin homolog, hydrogen peroxide Induced Construct 5 (Hic-5).	53
319625	cd16711	RING-HC_DTX3	RING finger, HC subclass, found in E3 ubiquitin-protein ligase Deltex3 (DTX3) and similar proteins. DTX3, also known as RING finger protein 154 (RNF154), is an E3 ubiquitin-protein ligase that belongs to the Deltex (DTX) family. In contrast to other DTXs, DTX3 does not contain the two N-terminal Notch-binding WWE domains, but a short unique N-terminal domain, suggesting it does not interact with the intracellular domain of Notch. Its C-terminal region includes a C3HC4-type RING-HC finger, and a previously unidentified C-terminal domain.	41
319626	cd16712	RING-HC_DTX3L	RING finger, HC subclass, found in protein Deltex-3-like (DTX3L) and similar proteins. DTX3L, also known as B-lymphoma- and BAL-associated protein (BBAP) or Rhysin-2 (Rhysin2), is a RING-domain E3 ubiquitin-protein ligase that regulates endosomal sorting of the G protein-coupled receptor CXCR4 from endosomes to lysosomes. It also regulates subcellular localization of its partner protein, B aggressive lymphoma (BAL), by a dynamic nucleocytoplasmic trafficking mechanism. DTX3L has a unique N-terminus, but lacks the highly basic N-terminal motif and the central proline-rich motif present in other Deltex (DTX) family members, such as DTX1, DTX2, and DTX4. Moreover, its C-terminal region is highly homologous to DTX3. It includes a C3HC4-type RING-HC finger, and a previously unidentified C-terminal domain. DTX3L can associate with DTX1 through its unique N-terminus and further enhance self-ubiquitination.	41
319627	cd16713	RING-HC_BIRC2_3_7	RING finger, HC subclass, found in apoptosis protein c-IAP1, c-IAP2, livin, and similar proteins. The cellular inhibitor of apoptosis protein c-IAPs function as ubiquitin E3 ligases that mediate the ubiquitination of the substrates involved in apoptosis, nuclear factor-kappaB (NF-kappaB) signaling, and oncogenesis. Unlike other apoptosis proteins (IAPs), such as XIAP, c-IAPs exhibit minimal binding to caspases and may not play an important role in the inhibition of these proteases. c-IAP1, also known as baculoviral IAP repeat-containing protein BIRC2, IAP-2, RING finger protein 48, or TNFR2-TRAF-signaling complex protein 2, is a potent regulator of the tumor necrosis factor (TNF) receptor family and NF-kappaB signaling pathways in the cytoplasm. It can also regulate E2F1 transcription factor-mediated control of cyclin transcription in the nucleus. c-IAP2, also known as BIRC3, IAP-1, apoptosis inhibitor 2 (API2), or IAP homolog C, also influences ubiquitin-dependent pathways that modulate innate immune signalling by activation of NF-kappaB. c-IAPs contain three N-terminal baculoviral IAP repeat (BIR) domains that enable interactions with proteins, a ubiquitin-association (UBA) domain that is responsible for the binding of polyubiquitin (polyUb), a caspase activation and recruitment domain (CARD) that serves as a protein interaction surface, and a C3HC4-type RING-HC finger at the carboxyl terminus that is required for ubiquitin ligase activity. Livin, also known as baculoviral IAP repeat-containing protein 7 (BIRC7), or kidney inhibitor of apoptosis protein (KIAP), or melanoma inhibitor of apoptosis protein (ML-IAP), or RING finger protein 50, was identified as the melanoma IAP. It plays crucial roles in apoptosis, cell proliferation, and cell cycle control. Its anti-apoptotic activity is regulated by the inhibition of caspase-3, -7, and -9. Its E3 ubiquitin-ligase-like activity promotes degradation of Smac/DIABLO, a critical endogenous regulator of all IAPs. Unlike other family members, mammalian livin contains a single BIR domain and a C3HC4-type RING-HC finger. The UBA domain can be detected in non-mammalian homologs of livin.	54
319628	cd16714	RING-HC_BIRC4_8	RING finger, HC subclass, found in E3 ubiquitin-protein ligase XIAP, baculoviral IAP repeat-containing protein 8 (BIRC8) and similar proteins. XIAP, also known as baculoviral IAP repeat-containing protein 4 (BIRC4), IAP-like protein (ILP), inhibitor of apoptosis protein 3 (IAP-3), or X-linked inhibitor of apoptosis protein (X-linked IAP), is a potent suppressor of apoptosis that directly inhibits specific members of the caspase family of cysteine proteases, including caspase-3, -7, and -9. It promotes proteasomal degradation of caspase-3 and enhances its anti-apoptotic effect in Fas-induced cell death. The ubiquitin-protein ligase (E3) activity of XIAP is also exhibited in the ubiquitination of second mitochondria-derived activator of caspases (Smac). The mitochondrial proteins, Smac/DIABLO and Omi/HtrA2, can inhibit the antiapoptotic activity of XIAP. XIAP has also been implicated in several intracellular signaling cascades involved in the cellular response to stress, such as the c-Jun N-terminal kinase (JNK) pathway, the nuclear factor-kappaB (NF-kappaB) pathway, and the transforming growth factor-beta (TGF-beta) pathway. Moreover, XIAP can regulate copper homeostasis through interacting with MURR1. BIRC8, also known as inhibitor of apoptosis-like protein 2, IAP-like protein 2, ILP-2, or testis-specific inhibitor of apoptosis, is a tissue-specific homolog of E3 ubiquitin-protein ligase XIAP. It has been implicated in the control of apoptosis in the testis by direct inhibition of caspase 9. Both XIAP and BIRC8 contain three N-terminal baculoviral IAP repeat (BIR) domains, a ubiquitin-association (UBA) domain and a C3HC4-type RING-HC finger at the carboxyl terminus.	62
319629	cd16715	vRING-HC_IRF2BP1	variant of RING finger, HC subclass, found in interferon regulatory factor 2-binding protein 1 (IRF-2BP1). IRF-2BP1, also known as IRF-2-binding protein 1, is a nuclear protein that binds to the C-terminal repression domain of IRF-2 and acts as an IRF-2-dependent transcriptional corepressor of both enhancer-activated and basal transcription. It binds to Jun-dimerization protein 2 (JDP2), a member of the activating protein-1 (AP-1) family of transcription factors, and enhances the polyubiquitination of JDP2. It also represses activating transcription factor-2 (ATF2)-mediated transcriptional activation from a cyclic AMP-responsive element (CRE)-containing promoter. IRF-2BP1 contains an N-terminal C4-type zinc finger and a C-terminal C3HC4-type RING-HC finger with a partially new pattern.	56
319630	cd16716	vRING-HC_IRF2BP2	variant of RING finger, HC subclass, found in interferon regulatory factor 2 (IRF2)-binding protein 2 (IRF-2BP2). IRF-2BP2, also known as IRF-2-binding protein 2 or DIF-1, is a nuclear protein that binds to the C-terminal repression domain of IRF-2 and acts as an IRF-2-dependent transcriptional corepressor of both enhancer-activated and basal transcription. IRF-2BP2 also specifically interacts with the C-terminal domain of the nuclear factor of activated T cells NFAT1 transcription factor, and negatively regulates the NFAT1-dependent transactivation of NFAT-responsive promoters. Moreover, IRF2BP2 suppresses the transactivation activity of p53 on both Bax and p21 promoters. It also shows anti-apoptotic activity through the modulation of a death domain in NRIF3. In addition, IRF2BP2 functions as a cofactor of VGLL4 and plays a critical role controlling gene expression in skeletal, cardiac, and smooth muscle cells. It is a muscle-enriched transcription factor required to activate vascular endothelial growth factor-A (VEGF-A) expression in muscle. IRF-2BP2 contains an N-terminal C4-type zinc finger and a C-terminal C3HC4-type RING-HC finger with a partially new pattern. The zinc finger is responsible for the homo- and hetero-dimerization between different members of the IRF-2BP2 family. The RING-HC finger interacts with IRF2 and also with nuclear receptor interacting factor 3 (NRIF3).	56
319631	cd16717	vRING-HC_IRF2BPL	variant of RING finger, HC subclass, found in interferon regulatory factor 2-binding protein-like (IRF-2BPL). IRF-2BPL, also known as C14orf4 or enhanced at puberty protein 1 (EAP1), is a homolog of interferon regulatory factor 2-binding proteins, IRF-2BP1 and IRF-2BP2. It is expressed in the mediobasal hypothalamus and plays a critical function in regulating the female reproductive neuroendocrine axis. IRF-2BPL is a proline-rich protein with polyglutamine and polyalanine tracts at the N-terminus and a C3HC4-type RING-HC finger domain with a partially new pattern at the C-terminus.	56
319632	cd16718	RING-HC_LNX3	RING finger, HC subclass, found in ligand of numb protein X 3 (LNX3). LNX3, also known as PDZ domain-containing RING finger protein 3 (PDZRN3), or Semaphorin cytoplasmic domain-associated protein 3 (SEMACAP3), is an E3 ubiquitin-protein ligase that was first identified as a Semaphorin-binding partner. It is also responsible for the ubiquitination and degradation of Numb, a component of the Notch signaling pathway that functions in the specification of cell fates during development and is known to control cell numbers during neurogenesis in vertebrates. LNX3 acts as a negative regulator of osteoblast differentiation by inhibiting Wnt-beta-catenin signaling. LNX3 also plays an important role in neuromuscular junction formation. It interacts with and ubiquitinates the muscle specific tyrosine kinase (MuSK), thus promoting its endocytosis and negatively regulating the cell surface expression of this key regulator of postsynaptic assembly. LNX3 contains an N-terminal typical C3HC4-type RING-HC finger, two PDZ domains, and a C-terminal LNX3 homology (LNX3H) domain.	42
319633	cd16719	RING-HC_LNX4	RING finger, HC subclass, found in ligand of numb protein X 4 (LNX4). LNX4, also known as PDZ domain-containing RING finger protein 4 (PDZRN4), or SEMACAP3-like protein (SEMCAP3L), is an E3 ubiquitin-protein ligase responsible for the ubiquitination and degradation of Numb, a component of the Notch signaling pathway that functions in the specification of cell fates during development and is known to control cell numbers during neurogenesis in vertebrates. LNX4 contains an N-terminal typical C3HC4-type RING-HC finger, two PDZ domains, and a C-terminal LNX3 homology (LNX3H) domain.	42
319634	cd16720	RING-HC_MEX3A	RING finger, HC subclass, found in RNA-binding protein MEX3A. MEX3A, also known as RING finger and KH domain-containing protein 4 (RKHD4), is an RNA-binding phosphoprotein that localizes in P-bodies and stress granules, which are two structures involved in the storage and turnover of mRNAs. It has been implicated in the regulation of tumorigenesis. It controls the polarity and stemness of intestinal epithelial cells through the post-transcriptional regulation of the homeobox transcription factor CDX2, which plays a crucial role in intestinal cell fate specification, both during normal development and in tumorigenic processes involving intestinal reprogramming. Moreover, it exhibits a transforming activity when overexpressed in gastric epithelial cells. MEX3A contains two K homology (KH) domains that provide RNA-binding capacity, and a C-terminal C3HC4-type RING-HC finger. Like other MEX-3 family proteins, MEX3A shuttles between the nucleus and the cytoplasm via the CRM1-dependent export pathway.	43
319635	cd16721	RING-HC_MEX3B	RING finger, HC subclass, found in RNA-binding protein MEX3B. MEX3B, also known as RING finger and KH domain-containing protein 3 (RKHD3), or RING finger protein 195 (RNF195), is an RNA-binding phosphoprotein that localizes in P-bodies and stress granules, which are two structures involved in the storage and turnover of mRNAs. It regulates the spatial organization of the Rap1 pathway that orchestrates Sertoli cell functions. It has a 3' long conserved untranslated region (3'LCU)-mediated fine-tuning system for mRNA regulation in early vertebrate development such as anteroposterior (AP) patterning and signal transduction. MEX3B contains two K homology (KH) domains that provide RNA-binding capacity, and a C-terminal C3HC4-type RING-HC finger. Like other MEX-3 family proteins, MEX3B shuttles between the nucleus and the cytoplasm via the CRM1-dependent export pathway.	43
319636	cd16722	RING-HC_MEX3C	RING finger, HC subclass, found in RNA-binding protein MEX3C. MEX3C, also known as RING finger and KH domain-containing protein 2 (RKHD2), or RING finger protein 194 (RNF194), is an RNA-binding phosphoprotein that acts as a suppressor of chromosomal instability. It functions as an RNA-binding ubiquitin E3 ligase responsible for the post-transcriptional, HLA-A allotype-specific regulation of MHC class I molecules (MHC-I). It also modifies retinoic acid inducible gene-1 (RIG-I) in stress granules and plays a critical role in eliciting antiviral immune responses. Moreover, MEX3C plays an essential role in normal postnatal growth via enhancing the local expression of insulin-like growth factor 1 (IGF1) in bone. It may also be involved in metabolic regulation of energy balance. MEX3C contains two K homology (KH) domains that provide RNA-binding capacity, and a C-terminal C3HC4-type RING-HC finger. Like other MEX-3 family proteins, MEX3C shuttles between the nucleus and the cytoplasm via the CRM1-dependent export pathway.	43
319637	cd16723	RING-HC_MEX3D	RING finger, HC subclass, found in RNA-binding protein MEX3D. MEX3D, also known as RING finger and KH domain-containing protein 1 (RKHD1), RING finger protein 193 (RNF193), or TINO, is an RNA-binding phosphoprotein that controls the stability of the transcripts coding for the anti-apoptotic protein BCL-2, and negatively regulates BCL-2 in HeLa cells. MEX3D contains two K homology (KH) domains that provide RNA-binding capacity, and a C-terminal C3HC4-type RING-HC finger. Like other MEX-3 family proteins, MEX3D shuttles between the nucleus and the cytoplasm via the CRM1-dependent export pathway.	45
319638	cd16724	RING1-HC_MIB1	RING finger 1, HC subclass, found in mind bomb 1 (MIB1) and similar proteins. MIB1, also known as DAPK-interacting protein 1 (DIP-1) or zinc finger ZZ type with ankyrin repeat domain protein 2, is a large, multi-domain E3 ubiquitin-protein ligase that promotes ubiquitination of the cytoplasmic tails of Notch ligands, and thus plays an essential role in controlling metazoan development by Notch signaling. It is also involved in Wnt/beta-catenin signaling and nuclear factor (NF)-kappaB signaling, and has been implicated in innate immunity, neuronal function, genomic stability, and cell death. MIB1 contains an MZM region with two Mib-Herc2 domains flanking a ZZ zinc finger, a REP region including two tandem Mib repeats, an ANK region that spans ankyrin repeats, and an RNG region consisting of three C3HC4-type RING-HC fingers. This family corresponds to the first RING-HC finger.	37
319639	cd16725	RING2-HC_MIB1	RING finger 2, HC subclass, found in mind bomb 1 (MIB1) and similar proteins. MIB1, also known as DAPK-interacting protein 1 (DIP-1) or zinc finger ZZ type with ankyrin repeat domain protein 2, is a large, multi-domain E3 ubiquitin-protein ligase that promotes ubiquitination of the cytoplasmic tails of Notch ligands, and thus plays an essential role in controlling metazoan development by Notch signaling. It is also involved in Wnt/beta-catenin signaling and nuclear factor (NF)-kappaB signaling, and has been implicated in innate immunity, neuronal function, genomic stability, and cell death. MIB1 contains an MZM region with two Mib-Herc2 domains flanking a ZZ zinc finger, a REP region including two tandem Mib repeats, an ANK region that spans ankyrin repeats, and a RNG region consisting of three C3HC4-type RING-HC fingers. This family corresponds to the second RING-HC finger.	37
319640	cd16726	RING1-HC_MIB2	RING finger 1, HC subclass, found in mind bomb 2 (MIB2) and similar proteins. MIB2, also known as novel zinc finger protein (Novelzin), putative NF-kappa-B-activating protein 002N, skeletrophin, or zinc finger ZZ type with ankyrin repeat domain protein 1, is a large, multi-domain E3 ubiquitin-protein ligase that promotes ubiquitination of the cytoplasmic tails of Notch ligands. It promotes Delta ubiquitylation and endocytosis in Notch activation. Overexpression of MIB2 activates NF-kappaB and interferon-stimulated response element (ISRE) reporter activity. Moreover, MIB2 acts as a novel component of the activated B-cell CLL/lymphoma 10 (BCL10) complex and controls BCL10-dependent NF-kappaB activation. It also functions as a founder myoblast-specific protein that regulates myoblast fusion and muscle stability. MIB2 contains an MZM region with two Mib-Herc2 domains flanking a ZZ zinc finger, a REP region including two tandem Mib repeats, an ANK region that spans ankyrin repeats, and a RNG region consisting of two C3HC4-type RING-HC fingers. This family corresponds to the first RING-HC finger.	37
319641	cd16727	RING3-HC_MIB1	RING finger 3, HC subclass, found in mind bomb 1 (MIB1) and similar proteins. MIB1, also known as DAPK-interacting protein 1 (DIP-1) or zinc finger ZZ type with ankyrin repeat domain protein 2, is a large, multi-domain E3 ubiquitin-protein ligase that promotes ubiquitination of the cytoplasmic tails of Notch ligands, and thus plays an essential role in controlling metazoan development by Notch signaling. It is also involved in Wnt/beta-catenin signaling and nuclear factor (NF)-kappaB signaling, and has been implicated in innate immunity, neuronal function, genomic stability, and cell death. MIB1 contains an MZM region with two Mib-Herc2 domains flanking a ZZ zinc finger, a REP region including two tandem Mib repeats, an ANK region that spans ankyrin repeats, and a RNG region consisting of three C3HC4-type RING-HC fingers. This family corresponds to the third RING-HC finger.	42
319642	cd16728	RING2-HC_MIB2	RING finger 2, HC subclass, found in mind bomb 2 (MIB2) and similar proteins. MIB2, also known as novel zinc finger protein (Novelzin), putative NF-kappa-B-activating protein 002N, skeletrophin, or zinc finger ZZ type with ankyrin repeat domain protein 1, is a large, multi-domain E3 ubiquitin-protein ligase that promotes ubiquitination of the cytoplasmic tails of Notch ligands. In particular, it promotes Delta ubiquitylation and endocytosis in Notch activation. Overexpression of MIB2 activates NF-kappaB and interferon-stimulated response element (ISRE) reporter activity. Moreover, MIB2 acts as a novel component of the activated B-cell CLL/lymphoma 10 (BCL10) complex and controls BCL10-dependent NF-kappaB activation. It also functions as a founder myoblast-specific protein that regulates myoblast fusion and muscle stability. MIB2 contains an MZM region with two Mib-Herc2 domains flanking a ZZ zinc finger, a REP region including two tandem Mib repeats, an ANK region that spans ankyrin repeats, and a RNG region consisting of two C3HC4-type RING-HC fingers. This family corresponds to the second RING-HC finger.	46
319643	cd16729	RING-HC_RGLG_plant	RING finger, HC subclass, found in RING domain ligase RGLG1, RGLG2 and similar proteins from plants. RGLG1 is a ubiquitously expressed E3 ubiquitin-protein ligase that interacts with UBC13 and, together with UBC13, catalyzes the formation of K63-linked polyubiquitin chains, which is involved in DNA damage repair. RGLG1 mediates the formation of canonical, K48-linked polyubiquitin chains that target proteins for degradation. It also regulates apical dominance by acting on the abundance of auxin transport proteins. RGLG1 has overlapping functions with its closest sequelog, RGLG2. They both function as RING E3 ligases that interact with ethylene response factor 53 (ERF53) in the nucleus and negatively regulate the plant drought stress response. All members in this family contain a Von Willebrand factor type A (vWA) domain and a C3HC4-type RING-HC finger.	45
319644	cd16730	RING-HC_MKRN1_3	RING finger, HC subclass, found in makorin-1 (MKRN1), makorin-3 (MKRN3), and similar proteins. MKRN1, also known as makorin RING finger protein 1 or RING finger protein 61 (RNF61), is an E3 ubiquitin-protein ligase targeting the telomerase catalytic subunit (TERT) for proteasome processing. It regulates the ubiquitination and degradation of peroxisome-proliferator-activated receptor gamma (PPARgamma), a nuclear receptor that is linked to obesity and metabolic diseases. It also mediates the posttranslational regulation of p14ARF, and thus potentially regulates cellular senescence and tumorigenesis in gastric cancer. Moreover, MKRN1 functions as a differentially negative regulator of p53 and p21, and controls cell cycle arrest and apoptosis. It induces degradation of West Nile virus (WNV) capsid protein to protect cells from WNV. Furthermore, MKRN1 may represent a nuclear protein with multiple nuclear functions, including regulating RNA polymerase II-catalyzed transcription. It is a RNA-binding protein involved in the modulation of cellular stress and apoptosis. It predominantly associates with proteins involved in mRNA metabolism including regulators of mRNA turnover, transport, and/or translation, and acts as a component of a ribonucleoprotein complex in embryonic stem cells (ESCs) that is recruited to stress granules upon exposure to environmental stress. Meanwhile, MKRN1 interacts with poly(A)-binding protein (PABP), a key component of different ribonucleoprotein complexes, in an RNA-independent manner, and stimulates translation in nerve cells. In addition, MKRN1 is a novel SEREX (serological identification of antigens by recombinant cDNA expression cloning) antigen of esophageal squamous cell carcinoma (SCC). It may be involved in carcinogenesis of the well-differentiated type of tumors possibly via ubiquitination of filamin A interacting protein 1 (L-FILIP). Human MKRN1 contains three N-terminal C3H1-type zinc fingers, a motif rich in Cys and His residues (CH), a C3HC4-type RING-HC finger, and another C3H1-type zinc finger at the C-terminus. MKRN3, also known as makorin RING finger protein 3, RING finger protein 63 (RNF63), or zinc finger protein 127 (ZNF127), is a therian mammal-specific retrocopy of MKRN1. It acts as a putative E3 ubiquitin-protein ligase involved in ubiquitination and cell signaling. MKRN3 shows a potential inhibitory effect on hypothalamic gonadotropin-releasing hormone (GnRH) secretion. Its defects represent the most frequent known genetic cause of familial central precocious puberty (CPP). In contrast to human MKRN1, human MKRN3 lacks the second C3H1-type zinc finger at the N-terminal region. The RING-HC finger of mammalian MKRN4 shows high sequence similarity with that of MKRN3, and is also included in this subfamily.	61
319645	cd16731	RING-HC_MKRN2	RING finger, HC subclass, found in makorin-2 (MKRN2) and similar proteins. MKRN2, also known as makorin RING finger protein 2, RING finger protein 62 (RNF62), or HSPC070, is a putative ribonucleoprotein that acts as a neurogenesis inhibitor upstream of glycogen synthase kinase-3beta (GSK-3beta) in the phosphatidylinositol 3-kinase (PI3K)/Akt pathway. It also promotes cell proliferation of primary CD34+ progenitor cells and K562 cells, indicating its possible involvement in normal and malignant hematopoiesis. Mammalian MKRN2 contains three N-terminal C3H1-type zinc fingers, a motif rich in Cys and His residues (CH), a C3HC4-type RING-HC finger, and another C3H1-type zinc finger at the C-terminus. The third C3H1-type zinc finger, the CH motif, as well as the RING zinc finger are necessary for its anti-neurogenic activity.	58
319646	cd16732	RING-HC_MKRN4	RING finger, HC subclass, found in makorin-4 (MKRN4) and similar proteins. MKRN4, also known as makorin RING finger protein pseudogene 4, makorin RING finger protein pseudogene 5, RING finger protein 64 (RNF64), zinc finger protein 127-Xp (ZNF127-Xp), or zinc finger protein 127-like 1, is a new divergent member of the makorin protein family in vertebrates. It may have an ancestral gonad-specific function and maternal embryonic expression before duplication in vertebrates. MKRN4 contains typical arrays of one to four C3H1-type zinc fingers, a motif rich in Cys and His residues (CH) and a C3HC4-type RING-HC finger. The RING-HC finger of mammalian MKRN4 shows high sequence similarity with that of MKRN3, and is not included in this subfamily.	61
319647	cd16733	RING-HC_PCGF1	RING finger, HC subclass, found in polycomb group RING finger protein 1 (PCGF1) and similar proteins. PCGF1, also known as nervous system Polycomb-1 (NSPc1) or RING finger protein 68 (RNF68), is one of six PcG RING finger (PCGF) homologs (PCGF1/NSPc1, PCGF2/Mel-18, PCGF3, PCGF4/BMI1, PCGF5, and PCGF6/MBLR) and serves as the core component of a noncanonical Polycomb repressive complex 1 (PRC1)-like BCOR complex that also contains RING1, RNF2, RYBP, SKP1, as well as the BCL6 co-repressor BCOR and the histone demethylase KDM2B, and is required to maintain the transcriptionally repressive state of some genes, such as Hox genes, BCL6 and the cyclin-dependent kinase inhibitor, CDKN1A. PCGF1 promotes cell cycle progression and enhances cell proliferation as well. It is a cell growth regulator that acts as a transcriptional repressor of p21Waf1/Cip1 via the retinoic acid response element (RARE element). Moreover, PCGF1 functions as an epigenetic regulator involved in hematopoietic cell differentiation. It cooperates with the transcription factor runt-related transcription factor 1 (Runx1) in regulating differentiation and self-renewal of hematopoietic cells. Furthermore, PCGF1 represents a physical and functional link between Polycomb function and pluripotency. PCGF1 contains a C3HC4-type RING-HC finger.	43
319648	cd16734	RING-HC_PCGF2	RING finger found in polycomb group RING finger protein 2 (PCGF2) and similar proteins. PCGF2, also known as DNA-binding protein Mel-18, RING finger protein 110 (RNF110), or zinc finger protein 144 (ZNF144), is one of six PcG RING finger (PCGF) homologs (PCGF1/NSPc1, PCGF2/Mel-18, PCGF3, PCGF4/BMI1, PCGF5, and PCGF6/MBLR) and serves as the core component of a canonical Polycomb repressive complex 1 (PRC1), which is composed of a chromodomain-containing protein (CBX2, CBX4, CBX6, CBX7 or CBX8) and a Polyhomeotic protein (PHC1, PHC2, or PHC3). Like other PCGF homologs, PCGF2 associates with ring finger protein 2 (RNF2) to form a RNF2-PCGF heterodimer, which is catalytically competent as an E3 ubiquitin transferase and is the scaffold for the assembly of additional components. Moreover, PCGF2 uniquely regulates PRC1 to specify mesoderm cell fate in embryonic stem cells. It is required for PRC1 stability and maintenance of gene repression in embryonic stem cells (ESCs) and essential for ESC differentiation into early cardiac-mesoderm precursors. PCGF2 also plays a significant role in the angiogenic function of endothelial cells (ECs) by regulating endothelial gene expression. Furthermore, PCGF2 is a SUMO-dependent regulator of hormone receptors. It facilitates the deSUMOylation process by inhibiting PCGF4/BMI1-mediated ubiquitin-proteasomal degradation of SUMO1/sentrin-specific protease 1 (SENP1). It is also a novel negative regulator of breast cancer stem cells (CSCs) that inhibits the stem cell population and in vitro and in vivo self-renewal through the inactivation of Wnt-mediated Notch signaling. PCGF2 contains a C3HC4-type RING-HC finger.	43
319649	cd16735	RING-HC_PCGF3	RING finger found in polycomb group RING finger protein 3 (PCGF3) and similar proteins. PCGF3, also known as RING finger protein 3A (RNF3A), is one of six PcG RING finger (PCGF) homologs (PCGF1, PCGF2/Mel-18, PCGF3, PCGF4/BMI1, PCGF5, and PCGF6) and serves as the core component of a Polycomb repressive complex 1 (PRC1). Like other PCGF homologs, PCGF3 associates with ring finger protein 2 (RNF2) to form a RNF2-PCGF heterodimer, which is catalytically competent as an E3 ubiquitin transferase and is the scaffold for the assembly of additional components. PCGF3 contains a C3HC4-type RING-HC finger.	47
319650	cd16736	RING-HC_PCGF4	RING finger found in polycomb group RING finger protein 4 (PCGF4) and similar proteins. PCGF4, also known as polycomb complex protein BMI-1 (B cell-specific Moloney murine leukemia virus integration site 1) or RING finger protein 51 (RNF51), is one of six PcG RING finger (PCGF) homologs (PCGF1/NSPc1, PCGF2/Mel-18, PCGF3, PCGF4/BMI1, PCGF5, and PCGF6/MBLR) and serves as the core component of a canonical Polycomb repressive complex 1 (PRC1), which is composed of a chromodomain-containing protein (CBX2, CBX4, CBX6, CBX7 or CBX8) and a Polyhomeotic protein (PHC1, PHC2, or PHC3), and plays important roles in chromatin compaction and H2AK119 monoubiquitination. PCGF4 associates with the Runx1/CBFbeta transcription factor complex to silence target genes in a PRC2-independent manner. Moreover, PCGF4 is expressed in the hair cells and supporting cells. It can regulate cell survival by controlling mitochondrial function and reactive oxygen species (ROS) level in thymocytes and neurons, thus having an important role in the survival of auditory hair cells and their sensitivity to ototoxic drugs. Furthermore, PCGF4 controls memory CD4 T-cell survival through direct repression of the Noxa gene in an Ink4a- and Arf-independent manner. It is required in neurons to suppress p53-induced apoptosis via regulating the antioxidant defensive response, and is also involved in the tumorigenesis of various cancer types. PCGF4 contains a C3HC4-type RING-HC finger.	54
319651	cd16737	RING-HC_PCGF5	RING finger found in polycomb group RING finger protein 5 (PCGF5) and similar proteins. PCGF5, also known as RING finger protein 159 (RNF159), is one of six PcG RING finger (PCGF) homologs (PCGF1/NSPc1, PCGF2/Mel-18, PCGF3, PCGF4/BMI1, PCGF5, and PCGF6/MBLR) and serves as the core component of a Polycomb repressive complex 1 (PRC1). Like other PCGF homologs, PCGF5 associates with ring finger protein 2 (RNF2) to form a RNF2-PCGF heterodimer, which is catalytically competent as an E3 ubiquitin transferase and is the scaffold for the assembly of additional components. PCGF5 contains a C3HC4-type RING-HC finger.	42
319652	cd16738	RING-HC_PCGF6	RING finger found in polycomb group RING finger protein 6 (PCGF6) and similar proteins. PCGF6, also known as Mel18 and Bmi1-like RING finger (MBLR), or RING finger protein 134 (RNF134), is one of six PcG RING finger (PCGF) homologs (PCGF1/NSPc1, PCGF2/Mel-18, PCGF3, PCGF4/BMI1, PCGF5, and PCGF6/MBLR) and serves as the core component of a noncanonical Polycomb repressive complex 1 (PRC1)-like L3MBTL2 complex, which is composed of some canonical components, such as RNF2, CBX3, CBX4, CBX6, CBX7, and CBX8, as well as some noncanonical components, such as L3MBTL2, E2F6, WDR5, HDAC1, and RYBP, and plays a critical role in epigenetic transcriptional silencing in higher eukaryotes. Like other PCGF homologs, PCGF6 possesses transcriptional repression activity, and also associates with ring finger protein 2 (RNF2) to form a RNF2-PCGF heterodimer, which is catalytically competent as an E3 ubiquitin transferase and is the scaffold for the assembly of additional components. Moreover, PCGF6 can regulate the enzymatic activity of JARID1d/KDM5D, a trimethyl H3K4 demethylase, through the direct interaction with it. Furthermore, PCGF6 is expressed predominantly in meiotic and post-meiotic male germ cells and may play important roles in mammalian male germ cell development. It also regulates mesodermal lineage differentiation in mammalian embryonic stem cells (ESCs) and functions in induced pluripotent stem (iPS) reprogramming. The activity of PCGF6 is found to be regulated by cell cycle dependent phosphorylation. PCGF6 contains a C3HC4-type RING-HC finger.	45
319653	cd16739	RING-HC_RING1	RING finger, HC subclass, found in really interesting new gene 1 protein (RING1) and similar proteins. RING1, also known as polycomb complex protein RING1, RING finger protein 1 (RNF1), or RING finger protein 1A (RING1A), was identified as a transcriptional repressor that is associated with the Polycomb group (PcG) protein complex involved in stable repression of gene activity. It is a core component of polycomb repressive complex 1 (PRC1) that functions as an E3 ubiquitin ligase transferring the monoubiquitin mark to the C-terminal tail of Histone H2A at K118/K119. PRC1 is also capable of chromatin compaction, a function not requiring histone tails, and this activity appears important in gene silencing. RING1 interacts with multiple PcG proteins and displays tumorigenic activity. It also shows zinc-dependent DNA binding activity. Moreover, RING1 inhibits transactivation of the DNA-binding protein recombination signal binding protein-Jkappa (RBP-J) by Notch through interaction with the LIM domains of KyoT2. RING1 contains a C3HC4-type RING-HC finger.	44
319654	cd16740	RING-HC_RING2	RING finger, HC subclass, found in really interesting new gene 2 protein (RING2) and similar proteins. RING2, also known as huntingtin-interacting protein 2-interacting protein 3, HIP2-interacting protein 3, protein DinG, RING finger protein 1B (RING1B), RING finger protein 2 (RNF2), or RING finger protein BAP-1, is an E3 ubiquitin-protein ligase that interacts with both nucleosomal DNA and an acidic patch on histone H4 to achieve the specific monoubiquitination of K119 on histone H2A (H2AK119ub), thereby playing a central role in histone code and gene regulation. RING2 is a core component of polycomb repressive complex 1 (PRC1) that functions as an E3 ubiquitin ligase transferring the monoubiquitin mark to the C-terminal tail of Histone H2A at K118/K119. PRC1 is also capable of chromatin compaction, a function not requiring histone tails, and this activity appears important in gene silencing. The enzymatic activity of RING2 is enhanced by the interaction with BMI1/PCGF4, and it is dispensable for early embryonic development and much of the gene repression activity of PRC1. Moreover, RING2 plays a key role in terminating neural precursor cell (NPC)-mediated production of subcerebral projection neurons (SCPNs) during neocortical development. It also plays a critical role in nonhomologous end-joining (NHEJ)-mediated end-to-end chromosome fusions. Furthermore, RING2 is essential for expansion of hepatic stem/progenitor cells. It promotes hepatic stem/progenitor cell expansion through simultaneous suppression of cyclin-dependent kinase inhibitors (CDKIs) Cdkn1a and Cdkn2a, known negative regulators of cell proliferation. RING2 also negatively regulates p53 expression through directly binding with both p53 and MDM2 and promoting MDM2-mediated p53 ubiquitination in selective cancer cell types to stimulate tumor development. RING2 contains a C3HC4-type RING-HC finger.	47
319655	cd16741	RING-HC_RNFT1	RING finger, HC subclass, found in RING finger and transmembrane domain-containing protein 1 (RNFT1). RNFT1, also known as protein PTD016, is a multi-pass membrane protein containing a C3HC4-type RING-HC finger. Its biological role remains unclear.	40
319656	cd16742	RING-HC_RNFT2	RING finger, HC subclass, found in RING finger and transmembrane domain-containing protein 2 (RNFT2). RNFT2, also known as transmembrane protein 118 (TMEM118), is a multi-pass membrane protein containing a C3HC4-type RING-HC finger. Its biological role remains unclear.	41
319657	cd16743	RING-HC_RNF5	RING finger, HC subclass, found in RING finger protein 5 (RNF5) and similar proteins. RNF5, also known as protein G16 or Ram1, is an E3 ubiquitin-protein ligase anchored to the outer membrane of the endoplasmic reticulum (ER). It acts at early stages of cystic fibrosis (CF) transmembrane conductance regulator (CFTR) biosynthesis and functions as a target for therapeutic modalities to antagonize mutant CFTR proteins in CF patients carrying the F508del allele. It also regulates the turnover of specific G protein-coupled receptors by ubiquitinating JNK-associated membrane protein (JAMP) and preventing proteasome recruitment. RNF5 limits basal levels of autophagy and influences susceptibility to bacterial infection through the regulation of ATG4B stability. It is also involved in the degradation of Pendrin, a transmembrane chloride/anion exchanger highly expressed in thyroid, kidney, and inner ear. RNF5 plays an important role in cell adhesion and migration. It can modulate cell migration by ubiquitinating paxillin. Furthermore, RNF5 interacts with virus-induced signaling adaptor (VISA) at mitochondria in a viral infection-dependent manner, and further targets VISA at K362 and K461 for K48-linked ubiquitination and degradation after viral infection. It also negatively regulates virus-triggered signaling by targeting MITA, also known as STING, for ubiquitination and degradation at the mitochondria. In addition, RNF5 determines breast cancer response to ER stress-inducing chemotherapies through the regulation of the L-glutamine carrier proteins SLC1A5 and SLC38A2 (SLC1A5/38A2). It also has been implicated in muscle organization and in recognition and processing of misfolded proteins. RNF5 contains a C3HC4-type RING-HC finger.	46
319658	cd16744	RING-HC_RNF185	RING finger, HC subclass, found in RING finger protein 185 (RNF185) and similar proteins. RNF185 is an E3 ubiquitin-protein ligase of endoplasmic reticulum-associated degradation (ERAD) that targets cystic fibrosis transmembrane conductance regulator (CFTR). It controls the degradation of CFTR and CFTR F508del allele in a RING- and proteasome-dependent manner, but does not control that of other classical ERAD model substrates. It also negatively regulates osteogenic differentiation by targeting dishevelled2 (Dvl2), a key mediator of Wnt signaling pathway, for degradation. Moreover, RNF185 regulates selective mitochondrial autophagy through interaction with the Bcl-2 family protein BNIP1. It also plays an important role in cell adhesion and migration through the modulation of cell migration by ubiquitinating paxillin. RNF185 contains a C3HC4-type RING-HC finger.	43
319659	cd16745	RING-HC_AtRMA_like	RING finger, HC subclass, found in Arabidopsis thaliana RING membrane-anchor proteins (AtRMAs) and similar proteins. AtRMAs, including AtRma1, AtRma2, and AtRma3, are endoplasmic reticulum (ER)-localized Arabidopsis homologs of human outer membrane of the ER-anchor E3 ubiquitin-protein ligase, RING finger protein 5 (RNF5). AtRMAs possess E3 ubiquitin ligase activity, and may play a role in the growth and development of Arabidopsis. The AtRMA1 and AtRMA3 genes are predominantly expressed in major tissues, such as cotyledons, leaves, shoot-root junction, roots, and anthers, while AtRMA2 expression is restricted to the root tips and leaf hydathodes. AtRma1 probably functions with the Ubc4/5 subfamily of E2. AtRma2 is likely involved in the cellular regulation of ABP1 expression levels through interacting with auxin binding protein 1 (ABP1). AtRMA proteins contain an N-terminal C3HC4-type RING-HC finger and a trans-membrane-anchoring domain in their extreme C-terminal region.	45
319660	cd16746	RING-HC_RNF212	RING finger, HC subclass, found in RING finger protein 212 (RNF212) and similar proteins. RNF212 is a dosage-sensitive regulator of crossing-over during mammalian meiosis. It plays a central role in designating crossover sites and coupling chromosome synapsis to the formation of crossover-specific recombination complexes. It also functions as an E3 ligase for SUMO modification. RNF212 contains an N-terminal C3HC4-type RING-HC finger.	48
319661	cd16747	RING-HC_RNF212B	RING finger, HC subclass, found in RING finger protein 212B (RNF212B) and similar proteins. RNF212B is an uncharacterized protein with high sequence similarity with RNF212, a dosage-sensitive regulator of crossing-over during mammalian meiosis. RNF212B contains an N-terminal C3HC4-type RING-HC finger.	41
319662	cd16748	RING-HC_SH3RF1	RING finger, HC subclass, found in SH3 domain-containing RING finger protein 1 (SH3RF1) and similar proteins. SH3RF1, also known as plenty of SH3s (POSH), RING finger protein 142 (RNF142), or SH3 multiple domains protein 2 (SH3MD2), is a trans-Golgi network-associated pro-apoptotic scaffold protein with E3 ubiquitin-protein ligase activity. It also plays a role in calcium homeostasis through the control of the ubiquitin domain protein Herp. It may also have a role in regulating death receptor mediated and c-Jun N-terminal kinase (JNK) mediated apoptosis, linking Rac1 to downstream components. SH3RF1 also enhances the ubiquitination of ROMK1 potassium channel resulting in its increased endocytosis. Moreover, SH3RF1 assembles an inhibitory complex with the actomyosin regulatory protein Shroom3, which links to the actin-myosin network to regulate neuronal process outgrowth. It also forms a complex with apoptosis-linked gene-2 (ALG-2) and ALG-2-interacting protein (ALIX/AIP1) in a calcium-dependent manner to play a role in the regulation of the JNK pathway. Furthermore, direct interaction of SH3RF1 and another molecular scaffold JNK-interacting protein (JIP) is required for apoptotic activation of JNKs. Interaction of SH3RF1 and E3 ubiquitin-protein isopeptide ligases, Siah proteins, is also required to promote JNK activation and apoptosis. In addition, SH3RF1 binds to and degrades TAK1, a crucial activator of both the JNK and the Relish signaling pathways. SH3RF1 contains an N-terminal C3HC4-type RING-HC finger responsible for the E3 ligase activity and four Src Homology 3 (SH3) domains that are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs.	48
319663	cd16749	RING-HC_SH3RF2	RING finger, HC subclass, found in SH3 domain-containing RING finger protein 2 (SH3RF2) and similar proteins. SH3RF2, also known as heart protein phosphatase 1-binding protein (HEPP1), plenty of SH3s (POSH)-eliminating RING protein (POSHER), protein phosphatase 1 regulatory subunit 39, or RING finger protein 158 (RNF158), is a putative E3 ubiquitin-protein ligase that acts as an anti-apoptotic regulator for the c-Jun N-terminal kinase (JNK) pathway by binding to and promoting the proteasomal degradation of SH3RF1 (or POSH), a scaffold protein that is required for pro-apoptotic JNK activation. It may also play a role in cardiac functions together with protein phosphatase 1. SH3RF2 contains an N-terminal C3HC4-type RING-HC finger responsible for the E3 ligase activity and four Src Homology 3 (SH3) domains that are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs.	48
319664	cd16750	RING-HC_SH3RF3	RING finger, HC subclass, found in SH3 domain-containing RING finger protein 3 (SH3RF3) and similar proteins. SH3RF3, also known as plenty of SH3s 2 (POSH2) or SH3 multiple domains protein 4 (SH3MD4), is a scaffold protein with E3 ubiquitin-protein ligase activity. It was identified in the screen for interacting partners of p21-activated kinase 2 (PAK2). It may play a role in regulating c-Jun N-terminal kinase (JNK) mediated apoptosis in certain conditions. It also interacts with GTP-loaded Rac1. SH3RF3 is highly homologous to SH3RF1. Both of them contain an N-terminal C3HC4-type RING-HC finger responsible for the E3 ligase activity and four Src Homology 3 (SH3) domains that are protein interaction domains that bind to proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs.	46
319665	cd16751	RING-HC_SIAH1	RING finger, HC subclass, found in seven in absentia homolog 1 (SIAH1) and similar proteins. SIAH1, also known as Siah-1a, is an inducible E3 ubiquitin-protein ligase that contributes to proteasome-mediated degradation of multiple targets in numerous cellular processes including apoptosis, tumor suppression, cell cycle, axon guidance, transcription regulation, and tumor necrosis factor signaling. SIAH1 functions as a scaffolding protein and interacts with a variety of different substrates for ubiquitination and subsequent degradation. It regulates the oncoprotein p34SEI-1 polyubiquitination and its subsequent degradation in a p53-dependent manner, which mediates p53 preferential vitamin C cytotoxicity. It targets the nonreceptor tyrosine kinase activated Cdc42-associated kinase 1 (ACK1), a valid target in cancer therapy, for ubiquitinylation and proteasomal degradation. It also interacts with KLF10 and targets it for degradation. CDK2 phosphorylation-mediated dissociation of KLF10 from SIAH1 is linked to cell cycle progression. Moreover, Siah1 is downregulated and associated with apoptosis and invasion in human breast cancer. It targets TAp73, a homolog of the tumor suppressor p53, for degradation. It is suppressed by hypoxia-inducible factor 1-alpha (HIF-1alpha) under hypoxic conditions to regulate TAp73 levels. It also promotes the migration and invasion of human glioma cells by regulating HIF-1alpha signaling under hypoxia. Furthermore, Siah1 forms a protein complex with glyceraldehyde-3-phosphate dehydrogenase (GAPDH). The apoptosis signal-regulating kinase 1 (ASK1) functions as an activator of the GAPDH-Siah1 stress-signaling cascade. It also plays an important role in ethanol-induced apoptosis in neural crest cells (NCCs). SIAH1 contains an N-terminal C3HC4-type RING-HC finger, two zinc-finger subdomains, and a C-terminal tumor necrosis factor (TNF) receptor associated factor (TRAF)-like substrate-binding domain (SBD) responsible for dimer formation.	40
319666	cd16752	RING-HC_SIAH2	RING finger, HC subclass, found in seven in absentia homolog 2 (SIAH2) and similar proteins. SIAH2 is an E3 ubiquitin-protein ligase that contributes to proteasome-mediated degradation of multiple targets in numerous cellular processes. It targets the ubiquitylation and degradation of tumor necrosis factor receptor-associated factor 2 (TRAF2) under stress conditions, which is required for the cell to commit to undergoing apoptosis. It is, therefore, a key regulator of TRAF2-dependent signaling in response to tumor necrosis factor-alpha (TNF-alpha) treatment and UV irradiation. SIAH2 modulates the polyubiquitination of G protein pathway suppressor 2 (GPS2), and targets it for proteasomal degradation. It is also a regulator of NF-E2-related factor 2 (Nrf2), a key regulator of cellular oxidative response and contributes to the degradation of Nrf2 irrespective of its phosphorylation status. Moreover, SIAH2 contributes to castration-resistant prostate cancer (CRPC) by regulation of androgen receptor (AR) transcriptional activity. It enhances AR transcriptional activity and prostate cancer cell growth. Its stability can be regulated by AKR1C3. SIAH2 also inhibits tyrosine kinase-2 (TYK2)-STAT3 signaling in lung carcinoma cells. Furthermore, SIAH2 regulates obesity-induced adipose tissue inflammation through altering peroxisome proliferator-activated receptor gamma (PPAR gamma) protein levels and selectively regulating PPAR gamma activity. It also functions as a regulator of the nuclear hormone receptor RevErbalpha (Nr1d1) stability and rhythmicity, and overall circadian oscillator function. In addition, Siah2 is an essential component of the hypoxia response Hippo signaling pathway and has been implicated in normal development and tumorigenesis. It modulates the hypoxia pathway upstream of hypoxia-induced transcription factor subunit HIF-1alpha, and therefore may play an important role in angiogenesis in response to hypoxic stress in endothelial cells. It also stimulates transcriptional coactivator YAP1 by destabilizing serine/threonine-protein kinase LATS2, a critical component of the Hippo pathway, in response to hypoxia. Meanwhile, Siah2 is involved in regulation of tight junction integrity and cell polarity under hypoxia, through its regulation of apoptosis-stimulating proteins of p53 subunit 2 (ASPP2) stability. SIAH2 contains an N-terminal C3HC4-type RING-HC finger, two zinc-finger subdomains, and a C-terminal tumor necrosis factor (TNF) receptor associated factor (TRAF)-like substrate-binding domain (SBD) responsible for dimer formation.	38
319667	cd16753	RING-HC_MID1	RING finger, HC subclass, found in midline-1 (MID1) and similar proteins. MID1, also known as midin, midline 1 RING finger protein, putative transcription factor XPRF, RING finger protein 59 (RNF59), or tripartite motif-containing protein 18 (TRIM18), is a microtubule-associated E3 ubiquitin-protein ligase implicated in epithelial-mesenchymal differentiation, cell migration and adhesion, and programmed cell death along specific regions of the ventral midline during embryogenesis. It monoubiquitinates the alpha4 subunit of protein phosphatase 2A (PP2A), promoting proteasomal degradation of the catalytic subunit of PP2A (PP2Ac) and preventing the A and B subunits from forming an active complex. It promotes allergen and rhinovirus-induced asthma through the inhibition of PP2A activity. It is strongly upregulated in cytotoxic lymphocytes (CTLs) and directs lytic granule exocytosis and cytotoxicity of killer T cells. Loss-of-function mutations in MID1 lead to the human X-linked Opitz G/BBB (XLOS) syndrome characterized by defective midline development during embryogenesis. MID1 belongs to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, a fibronectin type III (FN3) domain, and a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. MID1 hetero-dimerizes in vitro with its paralog MID2.	54
319668	cd16754	RING-HC_MID2	RING finger, HC subclass, found in midline-2 (MID2) and similar proteins. MID2, also known as midin-2, midline defect 2, RING finger protein 60 (RNF60), or tripartite motif-containing protein 1 (TRIM1), is a probable E3 ubiquitin-protein ligase that is highly related to MID1 and associates with cytoplasmic microtubules along their length and throughout the cell cycle. Like MID1, MID2 associates with the microtubule network and may at least partially compensate for the loss of MID1. Both MID1 and MID2 interact with Alpha 4, which is a regulatory subunit of PP2-type phosphatases, such as PP2A, and an integral component of the rapamycin-sensitive signaling pathway. MID2 can also substitute for MID1 to control exocytosis of lytic granules in cytotoxic T cells. Loss-of-function mutations in MID2 lead to the human X-linked intellectual disability (XLID). MID2 belongs to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxy-terminal subgroup one signature) box, a fibronectin type III (FN3) domain, and a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. MID2 hetero-dimerizes in vitro with its paralog MID1.	53
319669	cd16755	RING-HC_TRIM9	RING finger, HC subclass, found in tripartite motif-containing protein 9 (TRIM9) and similar proteins. TRIM9, human ortholog of rat Spring, also known as RING finger protein 91 (RNF91), is a brain-specific E3 ubiquitin-protein ligase collaborating with an E2 ubiquitin conjugating enzyme UBCH5b. TRIM9 plays an important role in the regulation of neuronal functions and participates in the neurodegenerative disorders through its ligase activity. It interacts with the WD repeat region of beta-transducin repeat-containing protein (beta-TrCP) through its N-terminal degron motif depending on the phosphorylation status, and thus negatively regulates nuclear factor-kappaB (NF-kappaB) activation in the NF-kappaB pro-inflammatory signaling pathway. Moreover, TRIM9 acts as a critical catalytic link between Netrin-1 and exocytic soluble NSF attachment receptor protein (SNARE) machinery in murine cortical neurons. It promotes SNARE-mediated vesicle fusion and axon branching in a Netrin-dependent manner. TRIM9 belongs to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, a fibronectin type III (FN3) domain, and a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain.	52
319670	cd16756	RING-HC_TRIM36	RING finger, HC subclass, found in tripartite motif-containing protein 36 (TRIM36) and similar proteins. TRIM36, human ortholog of mouse Haprin, also known as RING finger protein 98 (RNF98) or zinc-binding protein Rbcc728, is an E3 ubiquitin-protein ligase expressed in the germ plasm. It has been implicated in acrosome reaction, fertilization, and embryogenesis, as well as in the carcinogenesis. TRIM36 functions upstream of Wnt/beta-catenin activation, and plays a role in controlling the stability of proteins regulating microtubule polymerization during cortical rotation, and subsequently dorsal axis formation. It is also potentially associated with chromosome segregation through interacting with the kinetochore protein centromere protein-H (CENP-H), and colocalizing with the microtubule protein alpha-tubulin. Its overexpression may cause chromosomal instability and carcinogenesis. It is, thus, a novel regulator affecting cell cycle progression. Moreover, TRIM36 plays a critical role in the arrangement of somites during embryogenesis. TRIM36 belongs to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, a fibronectin type III (FN3) domain, and a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain.	49
319671	cd16757	RING-HC_TRIM46	RING finger, HC subclass, found in tripartite motif-containing protein 46 (TRIM46) and similar proteins. TRIM46, also known as gene Y protein (GeneY) or tripartite, fibronectin type-III and C-terminal SPRY motif protein (TRIFIC), is a microtubule-associated protein that specifically localizes to the proximal axon, partly overlaps with the axon initial segment (AIS) at later stages, and organizes uniform microtubule orientation in axons. It controls neuronal polarity and axon specification by driving the formation of parallel microtubule arrays. TRIM46 belongs to the C-I subclass of TRIM (tripartite motif) family of proteins, which are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, a fibronectin type III (FN3) domain, and a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain.	43
319672	cd16758	RING-HC_TRIM67	RING finger, HC subclass, found in tripartite motif-containing protein 67 (TRIM67) and similar proteins. TRIM67, also known as TRIM9-like protein (TNL), is a protein selectively expressed in the cerebellum. It interacts with PRG-1, an important molecule in the control of hippocampal excitability dependent on presynaptic LPA2 receptor signaling, and 80K-H (also known as glucosidase II beta), a protein kinase C substrate. It negatively regulates Ras signaling in cell proliferation via degradation of 80K-H, leading to neural differentiation including neuritogenesis. TRIM67 belongs to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, a fibronectin type III (FN3) domain, and a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain.	50
319673	cd16759	RING-HC_MuRF1	RING finger, HC subclass, found in muscle-specific RING finger protein 1 (MuRF-1) and similar proteins. MuRF-1, also known as tripartite motif-containing protein 63 (TRIM63), RING finger protein 28 (RNF28), iris RING finger protein, or striated muscle RING zinc finger, is an E3 ubiquitin-protein ligase in ubiquitin-mediated muscle protein turnover. It is predominantly fast (type II) fibre-associated in skeletal muscle and can bind to many myofibrillar proteins, including titin, nebulin, the nebulin-related protein NRAP, troponin-I (TnI), troponin-T (TnT), myosin light chain 2 (MLC-2), myotilin, and T-cap. The early and robust upregulation of MuRF-1 is triggered by disuse, denervation, starvation, sepsis, or steroid administration resulting in skeletal muscle atrophy. It also plays a role in maintaining titin M-line integrity. It associates with the periphery of the M-line lattice and may be involved in the regulation of the titin kinase domain. It also participates in muscle stress response pathways and gene expression. MuRF-1 belongs to the C-II subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, and an acidic residue-rich (AR) domain. It also harbors a MURF family-specific conserved box (MFC) between its RING-HC finger and Bbox domains.	63
319674	cd16760	RING-HC_MuRF2	RING finger, HC subclass, found in muscle-specific RING finger protein 2 (MuRF-2) and similar proteins. MuRF-2, also known as tripartite motif-containing protein 55 (TRIM55) or RING finger protein 29 (RNF29), is a muscle-specific E3 ubiquitin-protein ligase in ubiquitin-mediated muscle protein turnover and also a ligand of the transactivation domain of the serum response transcription factor (SRF). It is predominantly slow-fibre associated and highly expressed in embryonic skeletal muscle. MuRF-2 associates transiently with microtubules, myosin, and titin during sarcomere assembly. It has been implicated in microtubule, intermediate filament, and sarcomeric M-line maintenance in striated muscle development, as well as in signalling from the sarcomere to the nucleus. It plays an important role in the earliest stages of skeletal muscle differentiation and myofibrillogenesis. It is developmentally downregulated and is assembled at the M-line region of the sarcomere and with microtubules. MuRF-2 belongs to the C-II subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, and an acidic residue-rich (AR) domain. It also harbors a MURF family-specific conserved box (MFC) between its RING-HC finger and Bbox domains.	64
319675	cd16761	RING-HC_MuRF3	RING finger, HC subclass, found in muscle-specific RING finger protein 3 (MuRF-3) and similar proteins. MuRF-3, also known as tripartite motif-containing protein 54 (TRIM54), or RING finger protein 30 (RNF30), is an E3 ubiquitin-protein ligase in ubiquitin-mediated muscle protein turnover. It is ubiquitously detected in all fibre types, and is developmentally upregulated, associates with microtubules, the sarcomeric M-line and Z-line, and is required for microtubule stability and myogenesis. It associates with glutamylated microtubules during skeletal muscle development, and is required for skeletal myoblast differentiation and development of cellular microtubular networks. MuRF-3 controls the degradation of four-and-a-half LIM domain (FHL2) and gamma-filamin and is required for maintenance of ventricular integrity after myocardial infarction (MI). MuRF-3 belongs to the C-II subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, and an acidic residue-rich (AR) domain. It also harbors a MURF family-specific conserved box (MFC) between its RING-HC finger and Bbox domains.	59
319676	cd16762	RING-HC_TRIM13_C-V	RING finger, HC subclass, found in tripartite motif-containing protein 13 (TRIM13) and similar proteins. TRIM13, also known as B-cell chronic lymphocytic leukemia tumor suppressor Leu5, leukemia-associated protein 5, putative tumor suppressor RFP2, RING finger protein 77 (RNF77), or Ret finger protein 2, is an endoplasmic reticulum (ER) membrane anchored E3 ubiquitin-protein ligase that interacts with proteins localized to the ER, including valosin-containing protein (VCP), a protein indispensable for ER-associated degradation (ERAD). It also targets the known ER proteolytic substrate CD3-delta, but not the N-end rule substrate Ub-R-YFP (yellow fluorescent protein) for its degradation. Moreover, TRIM13 regulates ubiquitination and degradation of NEMO to suppress tumor necrosis factor (TNF) induced nuclear factor-kappaB (NF-kappaB) activation. It is also involved in NF-kappaB p65 activation and nuclear factor of activated T-cells (NFAT)-dependent activation of c-Rel upon T-cell receptor engagement. Furthermore, TRIM13 negatively regulates melanoma differentiation-associated gene 5 (MDA5)-mediated type I interferon production. It also regulates caspase-8 ubiquitination, translocation to autophagosomes, and activation during ER stress induced cell death. Meanwhile, TRIM13 enhances ionizing radiation-induced apoptosis by increasing p53 stability and decreasing AKT kinase activity through MDM2 and AKT degradation. TRIM13 belongs to the C-V subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region. In addition, TRIM13 contains a C-terminal transmembrane domain.	57
319677	cd16763	RING-HC_TRIM59_C-V	RING finger, HC subclass, found in tripartite motif-containing protein 59 (TRIM59) and similar proteins. TRIM59, also known as RING finger protein 104 (RNF104) or tumor suppressor TSBF-1, is a putative E3 ubiquitin-protein ligase that functions as a novel multiple cancer biomarker for immunohistochemical detection of early tumorigenesis. It is upregulated in gastric cancer and promotes gastric carcinogenesis by interacting with and targeting the P53 tumor suppressor for its ubiquitination and degradation. It also acts as a novel accessory molecule involved in cytotoxicity of BCG-activated macrophages (BAM). Moreover, TRIM59 may serve as a multifunctional regulator for innate immune signaling pathways. It interacts with ECSIT and negatively regulates nuclear factor-kappaB (NF-kappaB) and interferon regulatory factor (IRF)-3/7-mediated signal pathways. TRIM59 belongs to the C-V subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region. In addition, TRIM59 contains a C-terminal transmembrane domain.	56
319678	cd16764	RING-HC_TIF1alpha	RING finger, HC subclass, found in transcription intermediary factor 1-alpha (TIF1-alpha). TIF1-alpha, also known as tripartite motif-containing protein 24 (TRIM24), E3 ubiquitin-protein ligase TRIM24, or RING finger protein 82, belongs to the C-VI subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a plant homeodomain (PHD), and a bromodomain (Bromo) positioned C-terminal to the RBCC domain. It interacts specifically and in a ligand-dependent manner with the ligand binding domain (LBD) of several nuclear receptors (NRs), including retinoid X (RXR), retinoic acid (RAR), vitamin D3 (VDR), estrogen (ER), and progesterone (PR) receptors. It also associates with heterochromatin-associated factors HP1alpha, MOD1 (HP1beta), and MOD2 (HP1gamma), as well as the vertebrate Kruppel-type (C2H2) zinc finger proteins that contain the transcriptional silencing domain KRAB. TIF1-alpha is a ligand-dependent co-repressor of retinoic acid receptor (RAR) that interacts with multiple nuclear receptors in vitro via an LXXLL motif and further acts as a gatekeeper of liver carcinogenesis. It also functions as an E3-ubiquitin ligase targeting p53, and is broadly associated with chromatin silencing. Moreover, it is a chromatin regulator that recognizes specific, combinatorial histone modifications through its C-terminal PHD-Bromo region. In addition, it interacts with chromatin and estrogen receptor to activate estrogen-dependent genes associated with cellular proliferation and tumor development.	77
319679	cd16765	RING-HC_TIF1beta	RING finger, HC subclass, found in transcription intermediary factor 1-beta (TIF1-beta). TIF1-beta, also known as Kruppel-associated Box (KRAB)-associated protein 1 (KAP-1), KRAB-interacting protein 1 (KRIP-1), nuclear co-repressor KAP-1, RING finger protein 96, tripartite motif-containing protein 28 (TRIM28), or E3 SUMO-protein ligase TRIM28, belongs to the C-VI subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a plant homeodomain (PHD), and a bromodomain (Bromo) positioned C-terminal to the RBCC domain. It acts as a nuclear co-repressor that plays a role in transcription and in the DNA damage response. Upon DNA damage, the phosphorylation of KAP-1 on serine 824 by the ataxia telangiectasia-mutated (ATM) kinase enhances cell survival and facilitates chromatin relaxation and heterochromatic DNA repair. It also regulates CHD3 nucleosome remodeling during the DNA double-strand break (DSB) response. Meanwhile, KAP-1 can be dephosphorylated by protein phosphatase PP4C in the DNA damage response. Moreover, KAP-1 is a co-activator of the orphan nuclear receptor NGFI-B (or Nur77) and is involved in NGFI-B-dependent transcription. It is also a coiled-coil binding partner, substrate and activator of the c-Fes protein tyrosine kinase. The N-terminal RBCC domains of TIF1-beta are responsible for the interaction with KRAB zinc finger proteins (KRAB-ZFPs), MDM2, MM1, C/EBPbeta, and the regulation of homo- and heterodimerization. The C-terminal PHD/Bromo domains are involved in interacting with SETDB1, Mi-2alpha and other proteins to form complexes with histone deacetylase or methyltransferase activity.	61
319680	cd16766	RING-HC_TIF1gamma	RING finger, HC subclass, found in transcriptional intermediary factor 1 gamma (TIF1gamma). TIF1gamma, also known as tripartite motif-containing 33 (TRIM33), ectodermin, RFG7, or PTC7, belongs to the C-VI subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil region, as well as a plant homeodomain (PHD), and a bromodomain (Bromo) positioned C-terminal to the RBCC domain. It is an E3-ubiquitin ligase that functions as a regulator of transforming growth factor beta (TGFbeta) signaling and inhibits the Smad4-mediated TGFbeta response by interaction with Smad2/3 or ubiquitylation of Smad4. Moreover, TIF1gamma is an important regulator of transcription during hematopoiesis, as well as a key actor of tumorigenesis. Like other TIF1 family members, TIF1gamma also contains an intrinsic transcriptional silencing function. It can control erythroid cell fate by regulating transcription elongation. It can bind to the anaphase-promoting complex/cyclosome (APC/C) and promote mitosis.	67
319681	cd16767	RING-HC_TRIM2	RING finger, HC subclass, found in tripartite motif-containing protein 2 (TRIM2). TRIM2, also known as RING finger protein 86 (RNF86), is an E3 ubiquitin-protein ligase that ubiquitinates the neurofilament light chain, a component of the intermediate filament in axons. Loss of function of TRIM2 results in early-onset axonal neuropathy. TRIM2 also plays a role in mediating the p42/p44 MAPK-dependent ubiquitination of the cell death-promoting protein Bcl-2-interacting mediator of cell death (Bim) in rapid ischemic tolerance. TRIM2 belongs to the C-VII subclass of TRIM (tripartite motif)-NHL family that is defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil domain, as well as a NHL (named after proteins NCL-1, HT2A and Lin-41 that contain repeats folded into a six-bladed beta propeller) repeat domain positioned C-terminal to the RBCC domain.	46
319682	cd16768	RING-HC_TRIM3	RING finger, HC subclass, found in tripartite motif-containing protein 3 (TRIM3). TRIM3, also known as brain-expressed RING finger protein (BERP), RING finger protein 97 (RNF97), or RING finger protein 22 (RNF22), is an E3 ubiquitin-protein ligase involved in the pathogenesis of various cancers. It functions as a tumor suppressor that regulates asymmetric cell division in glioblastoma. It binds to the cdk inhibitor p21(WAF1/CIP1) and regulates its availability that promotes cyclin D1-cdk4 nuclear accumulation. Moreover, TRIM3 plays an important role in the central nervous system (CNS). It corresponds to gene BERP (brain-expressed RING finger protein), a unique p53-regulated gene that modulates seizure susceptibility and GABAAR cell surface expression. Furthermore, TRIM3 mediates activity-dependent turnover of postsynaptic density (PSD) scaffold proteins GKAP/SAPAP1 and is a negative regulator of dendritic spine morphology. In addition, TRIM3 may be involved in vesicular trafficking via its association with the cytoskeleton-associated-recycling or transport (CART) complex that is necessary for efficient transferrin receptor recycling, but not for epidermal growth factor receptor (EGFR) degradation. It also regulates the motility of the kinesin superfamily protein KIF21B. TRIM3 belongs to the C-VII subclass of TRIM (tripartite motif)-NHL family that is defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox1 and Bbox2, and a coiled coil domain, as well as a NHL (named after proteins NCL-1, HT2A and Lin-41 that contain repeats folded into a six-bladed beta propeller) repeat domain positioned C-terminal to the RBCC domain.	45
319683	cd16769	RING-HC_UHRF1	RING finger, HC subclass, found in ubiquitin-like PHD and RING finger domain-containing protein 1 (UHRF1). UHRF1, also known as inverted CCAAT box-binding protein of 90 kDa, nuclear protein 95, nuclear zinc finger protein Np95 (Np95), RING finger protein 106, transcription factor ICBP90, or E3 ubiquitin-protein ligase UHRF1, is a unique chromatin effector protein that integrates the recognition of both histone PTMs and DNA methylation. It is essential for cell proliferation and plays a critical role in the development and progression of many human carcinomas, such as laryngeal squamous cell carcinoma (LSCC), gastric cancer (GC), esophageal squamous cell carcinoma (ESCC), colorectal cancer, prostate cancer, and breast cancer. UHRF1 can act as a transcriptional repressor through its binding to histone H3 when it is unmodified at Arg2. Its overexpression in human lung fibroblasts results in downregulation of expression of the tumor suppressor pRB. It also plays a role in transcriptional repression of the cell cycle regulator p21. Moreover, UHRF1-dependent repression of factors can facilitate the G1-S transition. It interacts with Tat-interacting protein of 60 kDa (TIP60) and induces degradation-independent ubiquitination of TIP60. It is also an N-methylpurine DNA glycosylase (MPG)-interacting protein that binds MPG in a p53 status-independent manner in the DNA base excision repair (BER) pathway. In addition, UHRF1 functions as an epigenetic regulator that is important for multiple aspects of epigenetic regulation, including maintenance of DNA methylation patterns and recognition of various histone modifications. UHRF1 contains an N-terminal ubiquitin-like domain (UBL), a tandem Tudor domain (TTD), a plant homeodomain (PHD) domain, a SET and RING finger associated (SRA) domain, and a C-terminal C3HC4-type RING-HC finger. 
It specifically binds to hemimethylated DNA, double-stranded CpG dinucleotides, and recruits the maintenance methyltransferase DNMT1 to its hemimethylated DNA substrate through its SRA domain. UHRF1-dependent H3K23 ubiquitylation has an essential role in maintenance DNA methylation and replication. The tandem Tudor domain directs UHRF1 binding to the heterochromatin mark histone H3K9me3 and the PHD domain targets UHRF1 to unmodified histone H3 in euchromatic regions. The RING-HC finger exhibits both autocatalytic E3 ubiquitin (Ub) ligase activity and activity against histone H3 and DNMT1.	47
319684	cd16770	RING-HC_UHRF2	RING finger, HC subclass, found in ubiquitin-like PHD and RING finger domain-containing protein 2 (UHRF2). UHRF2, also known as Np95/ICBP90-like RING finger protein (NIRF), Np95-like RING finger protein, nuclear protein 97, nuclear zinc finger protein Np97, RING finger protein 107, or E3 ubiquitin-protein ligase UHRF2, was originally identified as a ubiquitin ligase acting as a small ubiquitin-like modifier (SUMO) E3 ligase that enhances zinc finger protein 131 (ZNF131) SUMOylation, but does not enhance ZNF131 ubiquitination. It also ubiquitinates PCNP, a PEST-containing nuclear protein. Moreover, UHRF2 functions as a nuclear protein involved in cell-cycle regulation and has been implicated in tumorigenesis. It interacts with cyclins, CDKs, p53, pRB, PCNA, HDAC1, DNMTs, G9a, methylated histone H3 lysine 9, and methylated DNA. It interacts with the cyclin E-CDK2 complex, ubiquitinates cyclins D1 and E1, induces G1 arrest, and is involved in the G1/S transition regulation. Furthermore, UHRF2 is a direct transcriptional target of the transcription factor E2F-1 in the induction of apoptosis. It recruits HDAC1 and binds to methyl-CpG. UHRF2 also participates in the maturation of Hepatitis B virus (HBV) through interacting with HBV core protein and promoting its degradation. UHRF2 contains an N-terminal ubiquitin-like domain (UBL), a tandem Tudor domain (TTD), a plant homeodomain (PHD) domain, a SET- and RING-associated (SRA) domain, and a C-terminal C3HC4-type RING-HC finger.	46
319685	cd16771	RING-HC_UNK	RING finger, HC subclass, found in RING finger protein unkempt (UNK) and similar proteins. UNK, also known as zinc finger CCCH domain-containing protein 5, is a metazoan-specific zinc finger protein enriched in embryonic brains. It may play a broad regulatory role during the formation of the central nervous system (CNS). It is a sequence-specific RNA-binding protein required for the early neuronal morphology. UNK is a neurogenic component of the mTOR pathway, and functions as a negative regulator of the timing of photoreceptor differentiation. It also specifically binds to Brg/Brm-associated factor BAF60b and promotes its ubiquitination in a Rac1-dependent manner. UNK contains six tandem CCCH-type zinc fingers at the N-terminus, and a C3HC4-type RING-HC finger at its C-terminus.	37
319686	cd16772	RING-HC_UNKL	RING finger, HC subclass, found in RING finger protein unkempt-like (UNKL) and similar proteins. UNKL, also known as zinc finger CCCH domain-containing protein 5-like, is a putative E3 ubiquitin-protein ligase that may participate in a protein complex showing an E3 ligase activity regulated by RAC1. It shows high sequence similarity with RING finger protein unkempt (UNK), which is a metazoan-specific zinc finger protein enriched in embryonic brains, and may play a broad regulatory role during the formation of the central nervous system (CNS). UNKL contains several CCCH-type zinc fingers at the N-terminus, and a C3HC4-type RING-HC finger at its C-terminus.	38
319687	cd16773	RING-HC_RBR_TRIAD1	RING finger, HC subclass, found in two RING fingers and DRIL [double RING finger linked] 1 (TRIAD1). TRIAD1, also known as ariadne-2 (ARI-2), protein ariadne-2 homolog, Ariadne RBR E3 ubiquitin protein ligase 2 (ARIH2), or UbcM4-interacting protein 48, is a RBR-type E3 ubiquitin-protein ligase that catalyzes the formation of polyubiquitin chains linked via lysine-48, as well as lysine-63 residues. Its auto-ubiquitylation can be catalyzed by the E2 conjugating enzyme UBCH7. TRIAD1 has been implicated in hematopoiesis, specifically in myelopoiesis, as well as in embryogenesis. It functions as a regulator of endosomal transport and is required for the proper function of multivesicular bodies. It also acts as a novel ubiquitination target for proteasome-dependent degradation by murine double minute 2 (MDM2). As a proapoptotic protein, TRIAD1 promotes p53 activation, and inhibits MDM2-mediated p53 ubiquitination and degradation. Furthermore, TRIAD1 can inhibit the ubiquitination and proteasomal degradation of growth factor independence 1 (Gfi1), a transcriptional repressor essential for the function and development of many different hematopoietic lineages. TRIAD1 contains a RBR domain that was previously known as RING-BetweenRING-RING domain or TRIAD [two RING fingers and a DRIL (double RING finger linked)] domain. Based on current understanding of the structural biology of RBR ligases, the nomenclature of RBR has been corrected as RING-BRcat (benign-catalytic)-Rcat (required-for-catalysis) recently. The RBR (RING1-BRcat-Rcat) domain uses an auto-inhibitory mechanism to modulate ubiquitination activity, as well as a hybrid mechanism that combines aspects from both RING and HECT E3 ligase function to facilitate the ubiquitination reaction. This family corresponds to the RING domain, a C3HC4-type RING-HC finger required for RBR-mediated ubiquitination.	54
319688	cd16774	RING-HC_RBR_ANKIB1	RING finger, HC subclass, found in ankyrin repeat and IBR domain-containing protein 1 (ANKIB1) and similar proteins. ANKIB1 is a RBR-type E3 ubiquitin-protein ligase that may function as part of E3 complex, which accepts ubiquitin from specific E2 ubiquitin-conjugating enzymes and then transfers it to substrates. It contains an N-terminal ankyrin repeats domain and a RBR domain that was previously known as RING-BetweenRING-RING domain or TRIAD [two RING fingers and a DRIL (double RING finger linked)] domain. Based on current understanding of the structural biology of RBR ligases, the nomenclature of RBR has been corrected as RING-BRcat (benign-catalytic)-Rcat (required-for-catalysis) recently. The RBR (RING1-BRcat-Rcat) domain uses an auto-inhibitory mechanism to modulate ubiquitination activity, as well as a hybrid mechanism that combines aspects from both RING and HECT E3 ligase function to facilitate the ubiquitination reaction. This family corresponds to the RING domain, a C3HC4-type RING-HC finger required for RBR-mediated ubiquitination.	57
319689	cd16775	RING-HC_RBR_RNF19A	RING finger, HC subclass, found in RING finger protein 19A (RNF19A) and similar proteins. RNF19A, also known as double ring-finger protein (Dorfin) or p38, is a transmembrane (TM) domain-containing RBR-type E3 ubiquitin-protein ligase that localizes to the ubiquitylated inclusions in Parkinson's disease (PD), dementia with Lewy bodies, multiple system atrophy, and amyotrophic lateral sclerosis (ALS). It interacts with Psmc3, a protein component of the 19S regulatory cap of the 26S proteasome, and further participates in the ubiquitin-proteasome system in acrosome biogenesis, spermatid head shaping, and development of the head-tail coupling apparatus and tail. It modulates the ubiquitination and degradation of calcium-sensing receptor (CaR), which may contribute to a general mechanism for CaR quality control during biosynthesis. Moreover, RNF19A can also ubiquitylate mutant superoxide dismutase 1 (SOD1), the causative gene of familial ALS. It may associate with endoplasmic reticulum-associated degradation (ERAD) pathway, which is related to the pathogenesis of neurodegenerative disorders, such as PD or Alzheimer's disease. It is also involved in the pathogenic process of PD and Lewy body (LB) formation by ubiquitylation of synphilin-1. RNF19A contains an RBR domain followed by three TMs. The RBR domain was previously known as RING-BetweenRING-RING domain or TRIAD [two RING fingers and a DRIL (double RING finger linked)] domain. Based on current understanding of the structural biology of RBR ligases, the nomenclature of RBR has been corrected as RING-BRcat (benign-catalytic)-Rcat (required-for-catalysis) recently. The RBR (RING1-BRcat-Rcat) domain uses an auto-inhibitory mechanism to modulate ubiquitination activity, as well as a hybrid mechanism that combines aspects from both RING and HECT E3 ligase function to facilitate the ubiquitination reaction. This family corresponds to the RING domain, a C3HC4-type RING-HC finger required for RBR-mediated ubiquitination.	55
319690	cd16776	RING-HC_RBR_RNF19B	RING finger, HC subclass, found in RING finger protein 19B (RNF19B) and similar proteins. RNF19B, also known as IBR domain-containing protein 3 or natural killer lytic-associated molecule (NKLAM), is a transmembrane (TM) domain-containing RBR-type E3 ubiquitin-protein ligase that plays a role in controlling tumor dissemination and metastasis. It is involved in the cytolytic function of natural killer (NK) cells and cytotoxic T lymphocytes (CTLs). It interacts with ubiquitin conjugates UbcH7 and UbcH8, and ubiquitinates uridine kinase like-1 (URKL-1) protein, targeting it for degradation. Moreover, RNF19B is a novel component of macrophage phagosomes and plays a role in macrophage anti-bacterial activity. It functions as a novel modulator of macrophage inducible nitric oxide synthase (iNOS) expression. RNF19B contains an RBR domain followed by three TMs. The RBR domain was previously known as RING-BetweenRING-RING domain or TRIAD [two RING fingers and a DRIL (double RING finger linked)] domain. Based on current understanding of the structural biology of RBR ligases, the nomenclature of RBR has been corrected as RING-BRcat (benign-catalytic)-Rcat (required-for-catalysis) recently. The RBR (RING1-BRcat-Rcat) domain uses an auto-inhibitory mechanism to modulate ubiquitination activity, as well as a hybrid mechanism that combines aspects from both RING and HECT E3 ligase function to facilitate the ubiquitination reaction. This family corresponds to the RING domain, a C3HC4-type RING-HC finger required for RBR-mediated ubiquitination.	55
319691	cd16777	mRING-HC-C4C4_RBR_RNF144A	Modified RING finger, HC subclass (C4C4-type), found in RING finger protein 144A (RNF144A). RNF144A, also known as UbcM4-interacting protein 4 (UIP4) or ubiquitin-conjugating enzyme 7-interacting protein 4, is a transmembrane (TM) domain-containing RBR-type E3 ubiquitin-protein ligase that targets DNA-dependent protein kinase catalytic subunit (DNA-PKcs) and thus promotes DNA damage-induced cell apoptosis. It is transcriptionally repressed by metastasis-associated protein 1 (MTA1) and inhibits MTA1-driven cancer cell migration and invasion. RNF144A contains an RBR domain followed by a potential single-TM domain. The RBR domain was previously known as RING-BetweenRING-RING domain or TRIAD [two RING fingers and a DRIL (double RING finger linked)] domain. Based on current understanding of the structural biology of RBR ligases, the nomenclature of RBR has been corrected as RING-BRcat (benign-catalytic)-Rcat (required-for-catalysis) recently. The RBR (RING1-BRcat-Rcat) domain uses an auto-inhibitory mechanism to modulate ubiquitination activity, as well as a hybrid mechanism that combines aspects from both RING and HECT E3 ligase function to facilitate the ubiquitination reaction. This family corresponds to the RING domain, a C4C4-type RING finger whose overall folding is similar to that of the C3HC4-type RING-HC finger. It is responsible for the interaction of E2-conjugating enzymes UbcH7 and UbcH8.	54
319692	cd16778	mRING-HC-C4C4_RBR_RNF144B	Modified RING finger, HC subclass (C4C4-type), found in RING finger protein 144B (RNF144B). RNF144B, also known as PIR2, IBR domain-containing protein 2 (IBRDC2), or p53-inducible RING finger protein (p53RFP), is a transmembrane (TM) domain-containing RBR (RING1-IBR-RING2) E3 ubiquitin-protein ligase that induces p53-dependent, but caspase-independent apoptosis. It interacts with E2 ubiquitin-conjugating enzymes UbcH7 and UbcH8, but not with UbcH5. It is involved in ubiquitination and degradation of p21, a p53 downstream protein promoting growth arrest and antagonizing apoptosis, suggesting a role in switching a cell from p53-mediated growth arrest to apoptosis. Moreover, RNF144B regulates the levels of Bax, a pro-apoptotic protein from the Bcl-2 family, and protects cells from unprompted Bax activation and cell death. It also regulates epithelial homeostasis by mediating degradation of p21WAF1 and p63. RNF144B contains an RBR domain followed by a potential single-TM domain. The RBR domain was previously known as RING-BetweenRING-RING domain or TRIAD [two RING fingers and a DRIL (double RING finger linked)] domain. Based on current understanding of the structural biology of RBR ligases, the nomenclature of RBR has been corrected as RING-BRcat (benign-catalytic)-Rcat (required-for-catalysis) recently. The RBR (RING1-BRcat-Rcat) domain uses an auto-inhibitory mechanism to modulate ubiquitination activity, as well as a hybrid mechanism that combines aspects from both RING and HECT E3 ligase function to facilitate the ubiquitination reaction. This family corresponds to the RING domain, a C4C4-type RING finger whose overall folding is similar to that of the C3HC4-type RING-HC finger. It is required for RBR-mediated ubiquitination.	57
319693	cd16779	mRING-HC-C3HC3D_LNX1	Modified RING finger, HC subclass (C3HC3D-type), found in ligand of numb protein X 1 (LNX1). LNX1, also known as numb-binding protein 1 or PDZ domain-containing RING finger protein 2, is a PDZ domain-containing RING-type E3 ubiquitin ligase responsible for the ubiquitination and degradation of Numb, a component of the Notch signaling pathway that functions in the specification of cell fates during development and is known to control cell numbers during neurogenesis in vertebrates. LNX1 contains an N-terminal modified C3HC3D-type RING-HC finger, a NPAY motif for Numb-LNX interaction, and four PDZ domains necessary for the binding of substrates, including CAR, ErbB2, SKIP, JAM4, CAST, c-Src, Claudins, RhoC, KCNA4, PAK6, PLEKHG5, PKC-alpha1, TYK2, PDZ-binding kinase (PBK), LNX2, and itself.	42
319694	cd16780	mRING-HC-C3HC3D_LNX2	Modified RING finger, HC subclass (C3HC3D-type), found in ligand of numb protein X 2 (LNX2). LNX2, also known as numb-binding protein 2, or PDZ domain-containing RING finger protein 1 (PDZRN1), is a PDZ domain-containing RING-type E3 ubiquitin ligase responsible for the ubiquitination and degradation of Numb, a component of the Notch signaling pathway that functions in the specification of cell fates during development and is known to control cell numbers during neurogenesis in vertebrates. It interacts with contactin-associated protein 4 (Caspr4, also known as CNTNAP4) in a PDZ domain-dependent manner, which modulates the proliferation and neuronal differentiation of neural progenitor cells (NPCs). LNX2 contains an N-terminal modified C3HC3D-type RING-HC finger, a NPAF motif for Numb/Numblike-LNX interaction, and four PDZ domains necessary for the binding of substrates, including ErbB2, RhoC, the presynaptic protein CAST, the melanoma/cancer-testis antigen MAGEB18 and several proteins associated with cell junctions, such as JAM4 and the Coxsackievirus and adenovirus receptor (CAR).	45
319695	cd16781	mRING-HC-C3HC3D_Roquin1	Modified RING finger, HC subclass (C3HC3D-type), found in Roquin-1. Roquin-1, also known as RING finger and C3H zinc finger protein 1 (RC3H1), or RING finger protein 198 (RNF198), is a ubiquitously expressed RNA-binding protein essential for degradation of inflammation-related mRNAs and maintenance of immune homeostasis. It is localized in cytoplasmic granules and binds to the 3' untranslated region (3'UTR) of inducible costimulator (ICOS) mRNA to post-transcriptionally repress its expression. Roquin-1 interacts with 3'UTR of tumor necrosis factor receptor superfamily member 4 (TNFRSF4) and tumor-necrosis factor-alpha (TNFalpha), and post-transcriptionally regulates A20 mRNA and modulates the activity of the IKK/NF-kappaB pathway. Moreover, Roquin-1 shares functions with its paralog Roquin-2 in the repression of mRNAs controlling T follicular helper cells and systemic inflammation. Roquin-1 contains an N-terminal modified C3HC3D-type RING-HC finger with a potential E3 ubiquitin-ligase function, a highly conserved ROQ domain required for RNA binding and localization to stress granules, and a CCCH-type zinc finger that is involved in RNA recognition, typically contacting AU-rich elements. In addition, the regions N- and C-terminal to the ROQ domain combine to form a HEPN (higher eukaryotes and prokaryotes nucleotide-binding) domain that is highly likely to function as an RNA-binding domain.	44
319696	cd16782	mRING-HC-C3HC3D_Roquin2	Modified RING finger, HC subclass (C3HC3D-type), found in Roquin-2. Roquin-2, also known as membrane-associated nucleic acid-binding protein (MNAB), RING finger and CCCH-type zinc finger domain-containing protein 2 (RC3H2), or RING finger protein 164 (RNF164), is an E3 ubiquitin ligase that is localized to the cytoplasm and upon stress is concentrated in stress granules. It is required for reactive oxygen species (ROS)-induced ubiquitination and degradation of apoptosis signal-regulating kinase 1 (ASK1, also known as MAP3K5). Roquin-2 interacts with 3'UTR of tumor necrosis factor receptor superfamily member 4 (TNFRSF4) and tumor-necrosis factor-alpha (TNFalpha), and modulates immune responses. Moreover, Roquin-2 shares functions with its paralog Roquin-1 in the repression of mRNAs controlling T follicular helper cells and systemic inflammation. Roquin-2 contains an N-terminal modified C3HC3D-type RING-HC finger with a potential E3 ubiquitin-ligase function, a highly conserved ROQ domain required for RNA binding and localization to stress granules, a coiled-coil (CC1), and a CCCH-type zinc finger that is involved in RNA recognition.	44
319697	cd16783	mRING-HC-C2H2C4_MDM2	Modified RING finger, HC subclass (C2H2C4-type), found in E3 ubiquitin-protein ligase MDM2 and similar proteins. MDM2, also known as double minute 2 protein (Hdm2), oncoprotein MDM2, or p53-binding protein, exerts its oncogenic activity predominantly by binding p53 tumor suppressor and blocking its transcriptional activity. It forms homo-oligomers and displays E3 ubiquitin ligase activity that catalyzes the attachment of ubiquitin to p53 as an essential step in the regulation of its level in cells. Moreover, in response to ribosomal stress, MDM2-mediated p53 ubiquitination and degradation can be inhibited through its interaction with ribosomal proteins L5, L11, and L23. MDM2 can be phosphorylated in response to DNA damage. Meanwhile, MDM2 has a p53-independent role in tumorigenesis and cell growth regulation. In addition, it binds interferon (IFN) regulatory factor-2 (IRF-2), an IFN-regulated transcription factor, and mediates its ubiquitination. MDM2 contains an N-terminal p53-binding domain, and a C-terminal modified C2H2C4-type RING-HC finger conferring E3 ligase activity that is required for ubiquitination and nuclear export of p53. It is also responsible for the hetero-oligomerization of MDM2, which is crucial for the suppression of p53 activity during embryonic development, and the recruitment of E2 ubiquitin-conjugating enzymes. MDM2 also harbors a RanBP2-type zinc finger (zf-RanBP2) domain, as well as a nuclear localization signal (NLS) and a nuclear export signal (NES), near the central acidic region. The zf-RanBP2 domain plays an important role in mediating MDM2 binding to ribosomal proteins and thus is involved in MDM2-mediated p53 suppression.	57
319698	cd16784	mRING-HC-C2H2C4_MDM4	Modified RING finger, HC subclass (C2H2C4-type), found in protein MDM4 and similar proteins. MDM4, also known as double minute 4 protein (Hdm4), or MDM2-like p53-binding protein, or protein MDMX, or HDMX, or p53-binding protein MDM4, exerts its oncogenic activity predominantly by binding p53 tumor suppressor and blocking its transcriptional activity. MDM4 is phosphorylated and destabilized in response to DNA damage stress. It can also be specifically dephosphorylated through directly interacting with protein phosphatase 1 (PP1), which may increase its stability and thus inhibit p53 activity. Meanwhile, MDM4 has a p53-independent role in tumorigenesis and cell growth regulation. MDM4 contains an N-terminal p53-binding domain and a C-terminal modified C2H2C4-type RING-HC finger responsible for its hetero-oligomerization, which is crucial for the suppression of p53 activity during embryonic development and the recruitment of E2 ubiquitin-conjugating enzymes. MDM4 also harbors a RanBP2-type zinc finger (zf-RanBP2) domain near the central acidic region.	59
319699	cd16785	mRING-HC-C3HC5_NEU1A	Modified RING finger, HC subclass (C3HC5-type), found in neuralized-like protein 1A (NEURL1A) and similar proteins. NEURL1A, also known as NEURL1, NEU, neuralized 1, or RING finger protein 67 (RNF67), is a mammalian homolog of the Drosophila neuralized (D-neu) protein. It functions as an E3 ubiquitin-protein ligase that directly interacts with and monoubiquitinates cytoplasmic polyadenylation element-binding protein 3 (CPEB3), an RNA binding protein and a translational regulator of local protein synthesis, which facilitates hippocampal plasticity and hippocampal-dependent memory storage. It also acts as a potential tumor suppressor that causes apoptosis and downregulates Notch target genes in the medulloblastoma. NEURL1A contains two neuralized homology regions (NHRs) responsible for Neural-ligand interactions and a modified C3HC5-type RING-HC finger required for ubiquitin ligase activity. The C3HC5-type RING-HC finger is distinguished from typical C3HC4-type RING-HC finger due to the existence of the additional cysteine residue in the middle portion of the RING finger domain.	44
319700	cd16786	mRING-HC-C3HC5_NEU1B	Modified RING finger, HC subclass (C3HC5-type), found in neuralized-like protein 1B (NEURL1B). NEURL1B, also known as neuralized-2 (NEUR2) or neuralized-like protein 3, is a mammalian homolog of the Drosophila neuralized (D-neu) protein. It functions as an E3 ubiquitin-protein ligase that interacts with and ubiquitinates Delta. Thus, it plays a role in the endocytic pathways for Notch signaling through working cooperatively with another E3 ligase, Mind bomb-1 (Mib1), in Delta endocytosis to hepatocyte growth factor-regulated tyrosine kinase substrate (Hrs)-positive vesicles. NEURL1B contains two neuralized homology regions (NHRs) responsible for Neural-ligand interactions and a modified C3HC5-type RING-HC finger required for ubiquitin ligase activity. The C3HC5-type RING-HC finger is distinguished from typical C3HC4-type RING-HC finger due to the existence of the additional cysteine residue in the middle portion of the RING finger domain.	43
319701	cd16787	mRING-HC-C3HC5_CGRF1	Modified RING finger, HC subclass (C3HC5-type), found in cell growth regulator with RING finger domain protein 1 (CGRRF1) and similar proteins. CGRRF1, also known as cell growth regulatory gene 19 protein (CGR19) or RING finger protein 197 (RNF197), functions as a novel tissue biomarker for monitoring endometrial sensitivity and response to insulin-sensitizing drugs, such as metformin, in the context of obesity. CGRRF1 contains a C-terminal modified C3HC5-type RING-HC finger, which is distinguished from typical C3HC4 RING-HC finger due to the existence of the additional cysteine residue in the middle portion of the RING finger domain.	37
319702	cd16788	mRING-HC-C3HC5_RNF26	Modified RING finger, HC subclass (C3HC5-type), found in RING finger protein 26 (RNF26) and similar proteins. RNF26 is an E3 ubiquitin ligase that temporally regulates virus-triggered type I interferon induction by increasing the stability of Mediator of IRF3 activation, MITA, also known as STING, through K11-linked polyubiquitination of MITA after viral infection and promoting degradation of IRF3, another important component required for virus-triggered interferon induction. Although RNF26 substrates of ubiquitination remain unclear at present, RNF26 upregulation in gastric cancer might be implicated in carcinogenesis through dysregulation of growth regulators. RNF26 contains an N-terminal leucine zipper domain and a C-terminal modified C3HC5-type RING-HC finger, which is distinguished from typical C3HC4 RING-HC finger due to the existence of the additional cysteine residue in the middle portion of the RING finger domain.	48
319703	cd16789	mRING-HC-C3HC5_MGRN1_like---blasttree	Modified RING finger, HC subclass (C3HC5-type), found in mahogunin RING finger protein 1 (MGRN1), RING finger protein 157 (RNF157) and similar proteins. MGRN1, also known as RING finger protein 156 (RNF156), is a cytosolic E3 ubiquitin-protein ligase that inhibits signaling through the G protein-coupled melanocortin receptors-1 (MC1R), -2 (MC2R) and -4 (MC4R) via ubiquitylation-dependent and -independent processes. It suppresses chaperone-associated misfolded protein aggregation and toxicity. MGRN1 interacts with cytosolic prion proteins (PrPs) that are linked with neurodegeneration. It also interacts with expanded polyglutamine proteins, and suppresses misfolded polyglutamine aggregation and cytotoxicity. Moreover, MGRN1 inhibits melanocortin receptor signaling by competition with Galphas, suggesting a novel pathway for melanocortin signaling from the cell surface to the nucleus. Furthermore, MGRN1 interacts with and ubiquitylates TSG101, a key component of the endosomal sorting complex required for transport (ESCRT)-I, and regulates endosomal trafficking. A null mutation in the gene encoding MGRN1 causes spongiform neurodegeneration, suggesting a link between dysregulation of endosomal trafficking and spongiform neurodegeneration. RNF157 is a cytoplasmic E3 ubiquitin ligase predominantly expressed in brain. It is a homolog of the E3 ligase mahogunin ring finger-1 (MGRN1). In cultured neurons, it promotes neuronal survival in an E3 ligase-dependent manner. In contrast, it supports growth and maintenance of dendrites independent of its E3 ligase activity. RNF157 interacts with and ubiquitinates the adaptor protein APBB1 (amyloid beta precursor protein-binding, family B, member 1 or Fe65), which regulates neuronal survival, but not dendritic growth downstream of RNF157. The nuclear localization of APBB1 together with its interaction partner RNA-binding protein SART3 (squamous cell carcinoma antigen recognized by T cells 3 or Tip110) is crucial to trigger apoptosis. Both MGRN1 and RNF157 contain a modified C3HC5-type RING-HC finger, and a functionally uncharacterized region, known as domain associated with RING2 (DAR2), N-terminal to the RING finger. The C3HC5-type RING-HC finger is distinguished from typical C3HC4 RING-HC finger due to the existence of the additional cysteine residue in the middle portion of the RING finger domain.	41
319704	cd16790	SP-RING_PIAS	SP-RING finger found in protein inhibitor of activated signal transducer and activator of transcription (PIAS) proteins. The PIAS (protein inhibitor of activated STAT) protein family modulates the activity of several transcription factors and acts as an E3 ubiquitin ligase in the sumoylation pathway. It consists of four members: PIAS1, PIAS2 (also known as PIASx), PIAS3, and PIAS4 (also known as PIASy). PIAS proteins were initially identified as inhibitors of activated STAT only, but are now known to interact with and modulate several other proteins, including androgen receptor (AR), tumor suppressor p53, and the transforming growth factor-beta (TGF-beta) signaling protein SMAD. They interact with STATs in a cytokine-dependent manner. PIAS1, PIAS2, and PIAS3 interact with STAT1, STAT3, and STAT4, respectively. In addition, PIAS4 is associated with STAT1. PIAS proteins have SUMO E3-ligase activity and interaction of PIAS proteins with transcription factors often results in sumoylation of that protein. PIAS proteins contain an N-terminal SAP (scaffold attachment factor A/B (SAF-A/B), acinus and PIAS) box with the LXXLL signature, which is required for the trans-repression of STAT1 activity by PIAS2, a PINT motif, which is essential for nuclear retention of PIAS3L (the long form of PIAS3), a specific RING finger known as Siz/PIAS (protein inhibitor of activated signal transducer and activator of transcription) RING (SP-RING) finger, which is essential for SUMO ligase activity, and the acidic C-terminal domain, which is involved in binding of PIAS3 to the nuclear coactivator TIF2. The SP-RING finger is a variant of RING finger, which lacks the second, fifth, and sixth zinc-binding residues of the consensus C3H2C3-/C3HC4-type RING fingers.	48
319705	cd16791	SP-RING_ZMIZ	SP-RING finger found in zinc finger MIZ domain-containing protein Zmiz1, Zmiz2, and similar proteins. This family includes Zmiz1 (Zimp10) and its homolog Zmiz2 (Zimp7), both of which were initially identified in humans as androgen receptor (AR) interacting proteins and function as transcriptional co-activators. They interact with BRG1, the catalytic subunit of the SWI-SNF remodeling complex. They also associate with other hormone nuclear receptors and transcription factors, such as p53 and Smad3/Smad4, and regulate transcription of specific target genes by altering their chromatin structure. The family also includes tonalli (Tna), an ortholog identified in Drosophila. It genetically interacts with the ATP-dependent SWI/SNF and Mediator complexes, suggesting a potential role for the Zmiz proteins in chromatin remodeling. Zmiz proteins contain a highly conserved Siz/PIAS (protein inhibitor of activated signal transducer and activator of transcription) RING (SP-RING) finger, also known as msx-interacting zinc finger (Miz domain), and a strong transactivation domain within the C-terminus. The SP-RING/Miz domain is highly conserved in members of the PIAS family and confers SUMO-conjugating activity. It is a variant of RING finger, which lacks the second, fifth, and sixth zinc-binding residues of the consensus C3H2C3-/C3HC4-type RING fingers. The strong intrinsic transactivation domain facilitates Zmiz proteins to augment the transcriptional activity of nuclear hormone receptors and other transcriptional factors. They may act as transcriptional co-regulators.	48
319706	cd16792	SP-RING_Siz_plant	SP-RING finger found in Arabidopsis thaliana E3 SUMO-protein ligase SIZ1 (AtSIZ1) and similar proteins. SIZ1-mediated conjugation of SUMO1 and SUMO2 to other intracellular proteins is essential in Arabidopsis. AtSIZ1 negatively regulates abscisic acid (ABA) signaling through the sumoylation of bZIP transcription factor ABI5. It also mediates sumoylation of bromodomain GTE proteins. Moreover, AtSIZ1 regulates flowering by controlling a salicylic acid-mediated floral promotion pathway and by affecting FLOWERING LOCUS C (FLC) chromatin structure. It also plays a role in drought stress response likely through the regulation of gene expression. Members in this family contain an N-terminal SAP (scaffold attachment factor A/B (SAF-A/B), acinus and PIAS) box, a plant homeodomain (PHD) finger, and a specific RING finger known as Siz/PIAS (protein inhibitor of activated signal transducer and activator of transcription) RING (SP-RING) finger. The SP-RING finger is a variant of RING finger, which lacks the second, fifth, and sixth zinc-binding residues of the consensus C3H2C3-/C3HC4-type RING fingers.	49
319707	cd16793	SP-RING_ScSiz_like	SP-RING finger found in Saccharomyces cerevisiae E3 SUMO-protein ligase SIZ1, SIZ2, and similar proteins. Saccharomyces cerevisiae SIZ proteins, also known as SAP and Miz-finger domain-containing proteins, are a Siz/PIAS RING (SP-RING) family of SUMO E3 ligases, and may be involved in a novel pathway of chromosome maintenance. They enhance SUMO modification with many substrates in vivo, but also exhibit unique substrate specificity. SIZ1, also known as ubiquitin-like protein ligase 1 (Ull1), modifies both cytoplasmic and nuclear proteins. It functions as an E3 factor specific for septin components. SIZ1-dependent substrates include Cdc3 and Cdc11 (septin subunits), Prp45 (a splicing factor), and the proliferating cell nuclear antigen (PCNA). SIZ2, also known as NFI1, interacts with Smt3, SUMO/Smt3 conjugating enzyme Ubc9, and a septin component Cdc3. Members in this family contain an N-terminal SAP (scaffold attachment factor A/B (SAF-A/B), acinus and PIAS) box with the LXXLL signature, a PINT motif, a specific RING finger known as Siz/PIAS (protein inhibitor of activated signal transducer and activator of transcription) RING (SP-RING) finger, and the acidic C-terminal domain. The SP-RING finger is a variant of RING finger, which lacks the second, fifth, and sixth zinc-binding residues of the consensus C3H2C3-/C3HC4-type RING fingers.	48
319708	cd16794	dRING_RMD5A	Degenerated RING finger found in protein RMD5 homolog A (RMD5A). RMD5A is one of the vertebrate homologs of yeast Rmd5p. The biological function of RMD5A remains unclear. RMD5A contains a Lissencephaly type-1-like homology motif (LisH), a C-terminal to LisH motif (CTLH) domain, and a degenerated RING finger that is characterized by lacking the second, fifth, and sixth Zn2+ ion-coordinating residues compared with the classic C3H2C3-/C3HC4-type RING fingers.	49
319709	cd16795	dRING_RMD5B	Degenerated RING finger found in protein RMD5 homolog B (RMD5B). RMD5B is one of the vertebrate homologs of yeast Rmd5p. The biological function of RMD5B remains unclear. RMD5B contains a Lissencephaly type-1-like homology motif (LisH), a C-terminal to LisH motif (CTLH) domain, and a degenerated RING finger that is characterized by lacking the second, fifth, and sixth Zn2+ ion-coordinating residues compared with the classic C3H2C3-/C3HC4-type RING fingers.	47
319710	cd16796	RING-H2_RNF13	RING finger, H2 subclass, found in RING finger protein 13 (RNF13) and similar proteins. RNF13 is a widely expressed membrane-associated E3 ubiquitin-protein ligase that is functionally significant in the regulation of cancer development, muscle cell growth, and neuronal development. Its expression is developmentally regulated during myogenesis and is upregulated in various tumors. RNF13 negatively regulates cell proliferation through its E3 ligase activity. It functions as an important regulator of Inositol-requiring transmembrane kinase/endonuclease IRE1alpha, mediating endoplasmic reticulum (ER) stress-induced apoptosis through the activation of the IRE1alpha-TRAF2-JNK signaling pathway. Moreover, RNF13 is involved in the regulation of the soluble N-ethylmaleimide-sensitive fusion protein attachment protein receptor (SNARE) complex via the ubiquitination of snapin, a SNAP25-interacting protein, which thereby controls synaptic function. In addition, RNF13 participates in regulating the function of satellite cells by modulating cytokine composition. RNF13 is evolutionarily conserved among many metazoans and contains an N-terminal signal peptide, a protease-associated (PA) domain, a transmembrane (TM) domain and a C-terminal C3H2C3-type RING-H2 finger domain followed by a putative PEST sequence.	49
319711	cd16797	RING-H2_RNF167	RING finger, H2 subclass, found in RING finger protein 167 (RNF167) and similar proteins. RNF167, also known as RING105, is an endosomal/lysosomal E3 ubiquitin-protein ligase involved in alpha-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid receptor (AMPAR) ubiquitination. It ubiquitinates GluA2 and regulates its surface expression, and thus acts as a selective regulator of AMPAR-mediated neurotransmission. It acts as an endosomal membrane protein which ubiquitylates vesicle-associated membrane protein 3 (VAMP3) and regulates endosomal trafficking. Moreover, RNF167 plays a role in the regulation of TSSC5 (tumor-suppressing subchromosomal transferable fragment cDNA, also known as ORCTL2/IMPT1/BWR1A/SLC22A1L), which can function in concert with the ubiquitin-conjugating enzyme UbcH6. RNF167 is widely conserved in metazoans and contains an N-terminal signal peptide, a protease-associated (PA) domain, two transmembrane domains (TM1 and TM2), and a C-terminal C3H2C3-type RING-H2 finger domain followed by a putative PEST sequence.	46
319712	cd16798	RING-H2_RNF43	RING finger, H2 subclass, found in RING finger protein 43 (RNF43) and similar proteins. RNF43 is a transmembrane E3 ubiquitin-protein ligase that plays an important role in frizzled-dependent regulation of the Wnt/beta-catenin pathway. It functions as a tumor suppressor that inhibits Wnt/beta-catenin signaling by ubiquitinating Frizzled receptor and targeting it to the lysosomal pathway for degradation. miR-550a-5p directly targets the 3'-UTR of the RNF43 gene and regulates its expression. Moreover, RNF43 interacts with NEDD-4-like ubiquitin-protein ligase-1 (NEDL1) and regulates p53-mediated transcription. It may also be involved in cell growth control potentially through the interaction with HAP95, a chromatin-associated protein interfacing the nuclear envelope. Mutations of RNF43 have been identified in various tumors, including colorectal cancer (CRC), endometrial cancer, mucinous ovarian tumors, gastric adenocarcinoma, pancreatic ductal adenocarcinoma, liver fluke-associated cholangiocarcinoma, hepatocellular carcinoma, and glioma. RNF43 contains an N-terminal signal peptide, a protease-associated (PA) domain, a transmembrane (TM) domain and a C3H2C3-type RING-H2 finger domain followed by a long C-terminal region.	47
319713	cd16799	RING-H2_ZNRF3	RING finger, H2 subclass, found in zinc/RING finger protein 3 (ZNRF3) and similar proteins. ZNRF3, also known as RING finger protein 203 (RNF203), is a homolog of Ring finger protein 43 (RNF43). It is a transmembrane E3 ubiquitin-protein ligase that is associated with the Wnt receptor complex, and negatively regulates Wnt signaling by promoting the turnover of frizzled and lipoprotein receptor-related protein LRP6 in an R-spondin-sensitive manner. It inhibits gastric cancer cell growth and promotes the cell apoptosis by affecting the Wnt/beta-catenin/TCF signaling pathway. ZNRF3 contains an N-terminal signal peptide, a protease-associated (PA) domain, a transmembrane (TM) domain and a C3H2C3-type RING-H2 finger domain followed by a long C-terminal region.	45
319714	cd16800	RING-H2_RNF115	RING finger, H2 subclass, found in RING finger protein 115 (RNF115) and similar proteins. RNF115, also known as Rab7-interacting ring finger protein (Rabring 7), or zinc finger protein 364 (ZNF364), or breast cancer-associated gene 2 (BCA2), is an E3 ubiquitin-protein ligase that is an endogenous inhibitor of adenosine monophosphate-activated protein kinase (AMPK) activation and its inhibition increases the efficacy of metformin in breast cancer cells. It also functions as a co-factor in the restriction imposed by tetherin on HIV-1, and targets HIV-1 Gag for lysosomal degradation, impairing virus assembly and release, in a tetherin-independent manner. Moreover, RNF115 is a Rab7-binding protein that stimulates c-Myc degradation through mono-ubiquitination of MM-1. It also plays crucial roles as a Rab7 target protein in vesicle traffic to late endosome/lysosome and lysosome biogenesis. Furthermore, RNF115 and the related protein, RNF126 associate with the epidermal growth factor receptor (EGFR) and promote ubiquitylation of EGFR, suggesting they play a role in the ubiquitin-dependent sorting and downregulation of membrane receptors. RNF115 contains an N-terminal BCA2 Zinc-finger domain (BZF), the AKT-phosphorylation sites, and the C-terminal C3H2C3-type RING-H2 finger.	47
319715	cd16801	RING-H2_RNF126	RING finger, H2 subclass, found in RING finger protein 126 (RNF126) and similar proteins. RNF126 is a Bag6-dependent E3 ubiquitin ligase that is involved in the mislocalized protein (MLP) pathway of quality control. It regulates the retrograde sorting of the cation-independent mannose 6-phosphate receptor (CI-MPR). Moreover, RNF126 promotes cancer cell proliferation by targeting the tumor suppressor p21 for ubiquitin-mediated degradation, and could be a novel therapeutic target in breast and prostate cancers. It is also able to ubiquitylate activation-induced cytidine deaminase (AID), a poorly soluble protein that is essential for antibody diversification. In addition, RNF126 and the related protein RNF115 associate with the epidermal growth factor receptor (EGFR) and promote ubiquitylation of EGFR, suggesting they play a role in the ubiquitin-dependent sorting and downregulation of membrane receptors. RNF126 contains an N-terminal BCA2 Zinc-finger domain (BZF), the AKT-phosphorylation sites, and the C-terminal C3H2C3-type RING-H2 finger.	44
319716	cd16802	RING-H2_RNF128_like	RING finger, H2 subclass, found in RING finger protein 128 (RNF128) and similar proteins. This subfamily includes RING finger proteins RNF128, RNF133, RNF148, and similar proteins, which belong to a larger PA-TM-RING ubiquitin ligase family that has been characterized by containing an N-terminal signal peptide, a protease-associated (PA) domain, a transmembrane (TM) domain and a C-terminal C3H2C3-type RING-H2 finger domain followed by a putative PEST sequence. RNF128, also known as gene related to anergy in lymphocytes protein (GRAIL), is a type 1 transmembrane E3 ubiquitin-protein ligase that is a critical regulator of adaptive immunity and development. It inhibits cytokine gene transcription, is expressed in anergic CD4+ T cells, and has been implicated in primary T cell activation, survival, and differentiation, as well as in T cell anergy and oral tolerance. It induces T cell anergy through the ubiquitination activity of its cytosolic RING finger. It regulates expression of the costimulatory molecule CD40L on CD4 T cells, and ubiquitinates the costimulatory molecule CD40 ligand (CD40L) during the induction of T cell anergy. Moreover, RNF128 interacts with the luminal/extracellular portion of both CD151 and the related tetraspanin CD81 via its PA domain, which promoted ubiquitination of cytosolic lysine residues. It also down-modulates the expression of CD83 (previously described as a cell surface marker for mature dendritic cells) on CD4 T cells. Furthermore, Rho guanine dissociation inhibitor (RhoGDI) has been identified as a potential substrate of RNF128, suggesting a role for Rho effector molecules in T cell anergy. In addition, RNF128 plays a role in environmental stress responses. It promotes environmental salinity tolerance in euryhaline tilapia. RNF133 is a testis-specific endoplasmic reticulum-associated E3 ubiquitin ligase that is mainly present in the cytoplasm of elongated spermatids. 
It may play a role in sperm maturation through an ER-associated degradation (ERAD) pathway. RNF148 is a testis-specific E3 ubiquitin ligase that is abundantly expressed in testes and slightly expressed in pancreas. Its expression is regulated by histone deacetylases.	49
319717	cd16803	RING-H2_RNF130	RING finger, H2 subclass, found in RING finger protein 130 (RNF130) and similar proteins. RNF130, also known as Goliath homolog (H-Goliath), is a paralog of RNF128, also known as gene related to anergy in lymphocytes protein (GRAIL). It is a transmembrane E3 ubiquitin-protein ligase expressed in leukocytes. It has a self-ubiquitination property, and controls the development of T cell clonal anergy by ubiquitination. RNF130 contains an N-terminal signal peptide, a protease-associated (PA) domain, a transmembrane (TM) domain and a C-terminal C3H2C3-type RING-H2 finger domain followed by a putative PEST sequence.	49
319718	cd16804	RING-H2_RNF149	RING finger, H2 subclass, found in RING finger protein 149 (RNF149) and similar proteins. RNF149, also known as DNA polymerase-transactivated protein 2, is an E3 ubiquitin-protein ligase that interacts with wild-type v-Raf murine sarcoma viral oncogene homolog B1 (BRAF), a RING domain-containing E3 ubiquitin ligase involved in control of gene transcription, translation, cytoskeletal organization, cell adhesion, and epithelial development. RNF149 induces the ubiquitination of wild-type BRAF and promotes its proteasome-dependent degradation. Mutated RNF149 has been found in some human breast, ovarian, and colorectal cancers. RNF149 contains an N-terminal signal peptide, a protease-associated (PA) domain, a transmembrane (TM) domain and a C-terminal C3H2C3-type RING-H2 finger domain followed by a putative PEST sequence.	48
319719	cd16805	RING-H2_RNF150	RING finger, H2 subclass, found in RING finger protein 150 (RNF150) and similar proteins. RNF150 is a RING finger protein whose polymorphisms may be associated with chronic obstructive pulmonary disease (COPD) risk in the Chinese population. Further studies with larger numbers of participants worldwide are needed for validation of the relationships between RNF150 genetic variants and the pathogenesis of COPD. RNF150 contains an N-terminal signal peptide, a protease-associated (PA) domain, a transmembrane (TM) domain and a C-terminal C3H2C3-type RING-H2 finger domain followed by a putative PEST sequence.	49
319720	cd16806	RING_CH-C4HC3_MARCH1	RING-CH finger, H2 subclass (C4HC3-type), found in membrane-associated RING-CH1 (MARCH1). MARCH1, also known as membrane-associated RING finger protein 1, membrane-associated RING-CH protein I (MARCH-I), or RING finger protein 171 (RNF171), is a membrane-anchored E3 ubiquitin ligase that is mainly expressed in cells of the immune system. It regulates antigen presentation and T-cell costimulatory functions of dendritic cells by down-regulating the cell surface expression of major histocompatibility complex class II (MHCII) and CD86 molecules. It mediates ubiquitination of MHCII and CD86 in dendritic cells (DCs). This ubiquitination induces MHCII and CD86 endocytosis, lysosomal transport, and degradation. MARCH1 also plays a regulatory role in T cell activation during immune responses, as well as in splenic DC homeostasis. Moreover, MARCH1 may regulate its own expression through dimerization and autoubiquitination. MARCH1 contains an N-terminal cytoplasmic C4HC3-type RING-CH finger, also known as vRING or RINGv, a variant of C3H2C3-type RING-H2 finger and two transmembrane domains.	53
319721	cd16807	RING_CH-C4HC3_MARCH8	RING-CH finger, H2 subclass (C4HC3-type), found in membrane-associated RING-CH8 (MARCH8). MARCH8, also known as membrane-associated RING finger protein 8, membrane-associated RING-CH protein VIII (MARCH-VIII), RING finger protein 178 (RNF178), or cellular modulator of immune recognition (c-MIR), is a membrane-anchored E3 ubiquitin ligase that is broadly expressed. It is a functional homolog of the Kaposi's sarcoma-associated herpesvirus-encoded proteins modulator of immune recognition (MIR) 1 and 2, which are involved in the evasion of host immunity. MARCH8 mediates the ubiquitination and down-regulation of immune regulatory cell surface molecules, including major histocompatibility complex class II (MHCII), CD86, transferrin receptor, HLA-DM, and Fas in immune cells. Moreover, MARCH8 controls cell surface expression of some additional proteins. It regulates the ubiquitination and lysosomal degradation of the transferrin receptor (TfR). Tumor necrosis factor-related apoptosis inducing ligand receptor 1 (TRAIL-R1) is also a physiological substrate of the endogenous MARCH8, which regulates the steady-state cell surface expression of TRAIL-R1. Meanwhile, it negatively regulates interleukin-1 (IL-1) beta-induced NF-kappaB activation by targeting the IL-1 receptor accessory protein (IL1RAP) coreceptor for ubiquitination and degradation. Furthermore, MARCH8 functions in the embryo to modulate the strength of cell adhesion by regulating the localization of E-cadherin. In addition, MARCH8 plays a role in the inhibition of inflammatory cytokine production, suggesting a new therapeutic approach to the treatment of rheumatoid arthritis (RA). MARCH8 contains an N-terminal cytoplasmic C4HC3-type RING-CH finger, also known as vRING or RINGv, a variant of C3H2C3-type RING-H2 finger and two transmembrane domains.	55
319722	cd16808	RING_CH-C4HC3_MARCH2	RING-CH finger, H2 subclass (C4HC3-type), found in membrane-associated RING-CH2 (MARCH2). MARCH2, also known as membrane-associated RING finger protein 2, membrane-associated RING-CH protein II (MARCH-II), or RING finger protein 172 (RNF172), is a Golgi-localized, membrane-associated E3 ubiquitin-protein ligase that is involved in endosomal trafficking through the binding of syntaxin 6 (STX6). It is involved in the cystic fibrosis transmembrane conductance regulator (CFTR)-associated ligand (CAL)-mediated ubiquitination and lysosomal degradation of mature CFTR through the association with adaptor proteins CAL and STX6. It also reduces the surface expression of CD86 and the transferrin receptor TFRC and regulates cell surface carvedilol-bound beta2-adrenergic receptor (beta2ARs) expression. Moreover, MARCH2 interacts with and ubiquitinates PDZ domains polarity determining scaffold protein DLG1 through its PDZ-binding motif, suggesting it may function as a molecular bridge with ubiquitin ligase activity connecting endocytic tumor suppressor proteins such as syntaxins to DLG1. MARCH2 contains a C4HC3-type RING-CH finger, also known as vRING or RINGv, a variant of C3H2C3-type RING-H2 finger, in the N-terminal cytoplasmic region, two transmembrane domains in the middle region, and a PDZ-binding motif at the C-terminus.	52
319723	cd16809	RING_CH-C4HC3_MARCH3	RING-CH finger, H2 subclass (C4HC3-type), found in membrane-associated RING-CH3 (MARCH3). MARCH3, also known as membrane-associated RING finger protein 3, or membrane-associated RING-CH protein III (MARCH-III), or RING finger protein 173 (RNF173), is an E3 ubiquitin-protein ligase that is broadly expressed at relatively high levels in spleen, colon, and lung. It is localized to early endosomes, binds to MARCH2 and syntaxin 6, and is involved in the regulation of vesicular trafficking and fusion of the transport vesicles in endosomes. MARCH3 is the closest homolog of MARCH2 and it is also a functional homolog of K3 and K5 viral ubiquitin E3 ligases related to immune-evasion strategies used by Kaposi's sarcoma-associated herpesvirus (KSHV). Its E2 specificity significantly overlaps that of MARCH2. MARCH3 contains a C4HC3-type RING-CH finger, also known as vRING or RINGv, a variant of C3H2C3-type RING-H2 finger, in the N-terminal cytoplasmic region, two transmembrane domains in the middle region, and a PDZ-binding motif at the C-terminus. The RING-CH finger and PDZ-binding motif are essential for the subcellular localization of MARCH3 and the inhibitory effect on transferrin uptake.	51
319724	cd16810	RING_CH-C4HC3_MARCH11	RING-CH finger, H2 subclass (C4HC3-type), found in membrane-associated RING-CH11 (MARCH11). MARCH11, also known as membrane-associated RING finger protein 11, or membrane-associated RING-CH protein XI (MARCH-XI), is a transmembrane RING-finger ubiquitin ligase that is predominantly expressed in developing spermatids in a stage-specific manner and is localized to the trans-Golgi network (TGN) vesicles and multivesicular bodies (MVBs). It mediates selective protein sorting via the TGN-MVB transport pathway through its ubiquitin ligase activity. SAMT family proteins have been identified as substrates of MARCH11 in mouse spermatids, suggesting that MARCH11 plays a role in mammalian spermiogenesis. Moreover, MARCH11 functions as an E3 ubiquitin ligase that targets CD4 for ubiquitination. It also forms complexes with the adaptor protein complex-1 and with fucose-containing glycoproteins including ubiquitinated forms. MARCH11 contains an N-terminal C4HC3-type RING-CH finger, also known as vRING or RINGv, a variant of C3H2C3-type RING-H2 finger, and two transmembrane domains. In addition, it harbors a proline-rich region, a tyrosine-based motif, and a PDZ binding motif.	52
319725	cd16811	RING_CH-C4HC3_MARCH4_9	RING-CH finger, H2 subclass (C4HC3-type), found in membrane-associated RING finger protein MARCH4, MARCH9, and similar proteins. This subfamily includes the closely related MARCH4 and MARCH9, both belonging to the family of MARCH E3 ligases. They downregulate major histocompatibility complex-I (MHC-I). In the presence of MARCH4 or MARCH9, MHC-I can be ubiquitinated and rapidly internalized by endocytosis, whereas MHC-I molecules lacking lysines in their cytoplasmic tail are resistant to downregulation. Moreover, MARCH4 and MARCH9, but not other MARCH proteins, can associate with Mult1 and prevent Mult1 expression at the cell surface in a lysine-dependent manner that can be reversed by heat shocking the cells. Both MARCH4 and MARCH9 contain an N-terminal C4HC3-type RING-CH finger, also known as vRING or RINGv, a variant of C3H2C3-type RING-H2 finger, followed by two transmembrane regions.	51
319726	cd16812	RING_CH-C4HC3_MARCH7	RING-CH finger, H2 subclass (C4HC3-type), found in membrane-associated RING-CH7 (MARCH7). MARCH7, also known as membrane-associated RING finger protein 7, membrane-associated RING-CH protein VII (MARCH-VII), RING finger protein 177 (RNF177), or axotrophin, is a ubiquitin E3 ligase expressed in multiple types of cells and tissues, including stem cells and precursor cells, and is predominantly localized on the plasma membrane and cytoplasm. MARCH7 is involved in T cell proliferation and neuronal development. It also participates in the regulation of cytoskeleton re-organization, cellular migration and invasion, cell proliferation, and tumorigenesis in ovarian carcinoma cells. Moreover, MARCH7 modulates nuclear factor kappaB (NF-kappaB) and Wnt/beta-catenin pathways. It has been identified as an authentic target of miR-101. Furthermore, it ubiquitinates tau protein in vitro, impairing microtubule binding. Unlike other MARCH proteins, MARCH7 is predicted to have no transmembrane spanning region. It harbors a C4HC3-type RING-CH finger, also known as vRING or RINGv, a variant of C3H2C3-type RING-H2 finger, that is responsible for its E3 activity.	65
319727	cd16813	RING_CH-C4HC3_MARCH10	RING-CH finger, H2 subclass (C4HC3-type), found in membrane-associated RING-CH10 (MARCH10). MARCH10, also known as membrane-associated RING finger protein 10, membrane-associated RING-CH protein X (MARCH-X), or RING finger protein 190 (RNF190), is a microtubule-associated E3 ubiquitin ligase of developing spermatids. It is localized to the principal piece of elongating spermatids. MARCH10 is involved in spermiogenesis by regulating the formation and maintenance of the flagella in developing spermatids. Unlike other MARCH proteins, MARCH10 is predicted to have no transmembrane spanning region. It harbors a C4HC3-type RING-CH finger, also known as vRING or RINGv, a variant of C3H2C3-type RING-H2 finger, that is responsible for its E3 activity.	64
319728	cd16814	RING-HC_RNF20	RING finger, HC subclass, found in RING finger protein 20 (RNF20). RNF20, also known as BRE1A or BRE1, is an E3 ubiquitin-protein ligase that forms a heterodimeric complex together with BRE1B, also known as RNF40, to facilitate the K120 monoubiquitination of histone H2B (H2Bub1), a DNA damage-induced histone modification that is crucial for recruitment of the chromatin remodeler SNF2h to DNA double-strand break (DSB) damage sites. It regulates the cell cycle and differentiation of neural precursor cells (NPCs), and links histone H2B ubiquitylation with inflammation and inflammation-associated cancer. Moreover, RNF20 promotes the polyubiquitination and proteasome-dependent degradation of transcription factor activator protein 2alpha (AP-2alpha), a negative regulator of adipogenesis by repressing the transcription of CCAAT/enhancer binding protein (C/EBPalpha) gene. Furthermore, RNF20 functions as an additional chromatin regulator that is necessary for mixed-lineage leukemia (MLL)-fusion-mediated leukemogenesis. It also inhibits TFIIS-facilitated transcriptional elongation to suppress pro-oncogenic gene expression. TFIIS is a factor capable of relieving stalled RNA polymerase II. RNF20 contains a C3HC4-type RING-HC finger at its C-terminus.	46
319729	cd16815	RING-HC_RNF40	RING finger, HC subclass, found in RING finger protein 40 (RNF40). RNF40, also known as BRE1B or 95 kDa retinoblastoma-associated protein (RBP95), was identified as a novel leucine zipper retinoblastoma protein (pRb)-associated protein that may function as a regulation factor in the process of RNA polymerase II-mediated transcription and/or transcriptional processing. RNF40 also functions as an E3 ubiquitin-protein ligase that forms a heterodimeric complex together with BRE1A, also known as RNF20, to facilitate the K120 monoubiquitination of histone H2B (H2Bub1), a DNA damage-induced histone modification that is crucial for recruitment of the chromatin remodeler SNF2h to DNA double-strand break (DSB) damage sites. It cooperates with SUPT16H to induce dynamic changes in chromatin structure during DSB repair. RNF40 contains a C3HC4-type RING-HC finger at the C-terminus.	55
319730	cd16816	mRING-HC-C3HC5_MGRN1	Modified RING finger, HC subclass (C3HC5-type), found in mahogunin RING finger protein 1 (MGRN1) and similar proteins. MGRN1, also known as RING finger protein 156 (RNF156), is a cytosolic E3 ubiquitin-protein ligase that inhibits signaling through the G protein-coupled melanocortin receptors-1 (MC1R), -2 (MC2R) and -4 (MC4R) via ubiquitylation-dependent and -independent processes. It suppresses chaperone-associated misfolded protein aggregation and toxicity. MGRN1 interacts with cytosolic prion proteins (PrPs) that are linked with neurodegeneration. It also interacts with expanded polyglutamine proteins, and suppresses misfolded polyglutamine aggregation and cytotoxicity. Moreover, MGRN1 inhibits melanocortin receptor signaling by competition with Galphas, suggesting a novel pathway for melanocortin signaling from the cell surface to the nucleus. Furthermore, MGRN1 interacts with and ubiquitylates TSG101, a key component of the endosomal sorting complex required for transport (ESCRT)-I, and regulates endosomal trafficking. A null mutation in the gene encoding MGRN1 causes spongiform neurodegeneration, suggesting a link between dysregulation of endosomal trafficking and spongiform neurodegeneration. MGRN1 contains a modified C3HC5-type RING-HC finger and a conserved PSAP motif necessary for interaction between MGRN1 and TSG101. In addition, MGRN1 harbors a functionally uncharacterized region, also known as the domain associated with RING2 (DAR2), N-terminal to the RING finger. The C3HC5-type RING-HC finger is distinguished from typical C3HC4 RING-HC finger due to the existence of the additional cysteine residue in the middle portion of the RING finger domain.	41
319731	cd16817	mRING-HC-C3HC5_RNF157	Modified RING finger, HC subclass (C3HC5-type), found in RING finger protein 157 (RNF157) and similar proteins. RNF157 is a cytoplasmic E3 ubiquitin ligase predominantly expressed in brain. It is a homolog of the E3 ligase mahogunin ring finger-1 (MGRN1). In cultured neurons, it promotes neuronal survival in an E3 ligase-dependent manner. In contrast, it supports growth and maintenance of dendrites independent of its E3 ligase activity. RNF157 interacts with and ubiquitinates the adaptor protein APBB1 (amyloid beta precursor protein-binding, family B, member 1 or Fe65), which regulates neuronal survival, but not dendritic growth downstream of RNF157. The nuclear localization of APBB1 together with its interaction partner RNA-binding protein SART3 (squamous cell carcinoma antigen recognized by T cells 3 or Tip110) is crucial to trigger apoptosis. RNF157 contains a modified C3HC5-type RING-HC finger, and a functionally uncharacterized region, known as domain associated with RING2 (DAR2), N-terminal to the RING finger. The C3HC5-type RING-HC finger is distinguished from typical C3HC4 RING-HC finger due to the existence of the additional cysteine residue in the middle portion of the RING finger domain.	41
319732	cd16818	SP-RING_PIAS1	SP-RING finger found in protein inhibitor of activated STAT protein 1 (PIAS1) and similar proteins. PIAS1, also known as DEAD/H box-binding protein 1, Gu-binding protein (GBP), or RNA helicase II-binding protein, was initially identified as an inhibitor of STAT1 that blocks the DNA-binding activity of STAT1 and specifically inhibits STAT1-mediated gene transcription in response to cytokine stimulation. It selectively inhibits interferon-inducible gene expression and plays an important role in the IFN-gamma- or IFN-beta-mediated innate immune response through negative regulation of STAT1. It also regulates the activity of other transcription factors to regulate immune response, such as NF-kappaB and Smad4. Moreover, PIAS1 functions as an E3 small ubiquitin-like modifier (SUMO)-protein ligase specifying target proteins for SUMO conjugation by Ubc9. The sumoylation activity of PIAS1 can suppress cytokine transforming growth factor beta (TGFbeta)-induced epithelial mesenchymal transition (EMT) in non-transformed epithelial cells to promote activation of the matrix metalloproteinase 2 (MMP2). It thus regulates TGFbeta-induced cancer cell invasion and metastasis. PIAS1 may also be involved in spatial learning and memory formation through its SUMOylation of cAMP-responsive element binding protein (CREB). In addition, PIAS1 is the E3 ligase responsible for SUMOylation of High mobility group nucleosomal binding domain 2 (HMGN2), which is a small and unique non-histone protein that has many functions in a variety of cellular processes, including regulation of chromatin structure, transcription, and DNA repair, as well as antimicrobial activity, cell homing, and regulating cytokine release. Furthermore, PIAS1 is a genuine chromatin-bound androgen receptor (AR) co-regulator that functions in a target gene selective fashion to regulate prostate cancer cell growth. 
It also mediates the SUMOylation of c-Myc, which is the most frequently overexpressed oncogene in tumours, including breast cancer, colon cancer, and lung cancer. Necdin, a pleiotropic protein that promotes differentiation and survival of mammalian neurons, can suppress PIAS1 both by inhibiting SUMO E3 ligase activity and by promoting ubiquitin-dependent degradation. PIAS1 contains an N-terminal SAP (scaffold attachment factor A/B (SAF-A/B), acinus and PIAS) box with the LXXLL signature, a PINT motif, a specific RING finger known as Siz/PIAS (protein inhibitor of activated signal transducer and activator of transcription) RING (SP-RING) finger, and the acidic C-terminal domain. The SP-RING finger mediates the interaction of PIAS1 with the SUMO E2 conjugating enzyme Ubc9.	51
319733	cd16819	SP-RING_PIAS2	SP-RING finger found in protein inhibitor of activated STAT protein 2 (PIAS2) and similar proteins. PIAS2, also known as androgen receptor-interacting protein 3 (ARIP3), DAB2-interacting protein (DIP), Msx-interacting zinc finger protein (Miz1), PIAS-NY protein, protein inhibitor of activated STAT x, protein inhibitor of activated STAT2, is an E3 SUMO-protein ligase highly expressed in the testis. It functions as a transcriptional activator of BCL2 and is essential for blocking c-MYC-induced apoptosis. It also acts as a negative regulator of cell proliferation, induces expression of the cell-cycle inhibitors p15(Ink4b) and p21(Cip1), and activates transcription of the p21(Cip1) gene in response to UV irradiation. Moreover, PIAS2 associates with topoisomerase II binding protein 1 (TopBP1), an essential activator of the Atr kinase. It thus affects the activity of the Atr checkpoint. Receptor of activated C kinase 1 (RACK1), glucocorticoid receptor (GR)-interacting protein 1 (GRIP1), friend leukemia integration-I (FLI-1), and ubiquitously expressed transcript (UXT) are binding partners of PIAS2. The interaction between UXT and PIAS2 may be important for the transcriptional activation of androgen receptor (AR). PIAS2 contains an N-terminal SAP (scaffold attachment factor A/B (SAF-A/B), acinus, and PIAS) box with the LXXLL signature, a PINT motif, a specific RING finger known as Siz/PIAS (protein inhibitor of activated signal transducer and activator of transcription) RING (SP-RING) finger, and the acidic C-terminal domain.	49
319734	cd16820	SP-RING_PIAS3	SP-RING finger found in protein inhibitor of activated STAT protein 3 (PIAS3) and similar proteins. PIAS3 is an E3 SUMO-protein ligase that was initially identified as an interleukin-6 (IL-6)-dependent repressor of signal transducer and activator of transcription 3 (STAT3) and has anti-proliferative properties. It binds specifically to phosphorylated STAT3 and inhibits its transcriptional activity by blocking its binding to DNA. It regulates STAT3-mediated induction of Snail expression, as well as suppresses acute graft-versus-host disease (GVHD) by modulating effector T and B cell subsets through inhibition of STAT3 activation. It activates the intrinsic apoptotic pathway in non-small cell lung cancer cells independent of p53 status. When overexpressed, it can interact with STAT5 to regulate prolactin-induced STAT5-mediated gene expression. Moreover, PIAS3 binds to and activates Smad3 transcriptional activity, resulting in the enhancement of transforming growth factor-beta (TGF-beta) signaling. It functions as a transcriptional corepressor of Erythroid Kruppel-like factor (EKLF or KLF1) and thus plays an important role in erythropoiesis. It also plays a significant role in DNA damage response (DDR) pathway through promoting homologous recombination (HR)- and non-homologous end joining (NHEJ)-mediated DNA double-strand break (DSB) repair. Furthermore, PIAS3 preferentially interacts with and enhances the SUMOylation of TAK1-binding protein 2 (TAB2), an upstream adaptor protein in the IL-1 signaling pathway. It also promotes SUMOylation and nuclear sequestration of ErbB4 receptor tyrosine kinase. In addition, PIAS3 may form a complex with microphthalmia-associated transcription factor, nuclear factor-kappaB, Smad, and estrogen receptor. Its other transcription factor binding partners include: ETS, EGR1, NR1I2, and GATA1. 
PIAS3 contains an N-terminal SAP (scaffold attachment factor A/B (SAF-A/B), acinus and PIAS) box with the LXXLL signature, a PINT motif, a specific RING finger known as Siz/PIAS (protein inhibitor of activated signal transducer and activator of transcription) RING (SP-RING) finger, and the acidic C-terminal domain.	51
319735	cd16821	SP-RING_PIAS4	SP-RING finger found in protein inhibitor of activated STAT protein 4 (PIAS4) and similar proteins. PIAS4, also known as PIASy or protein inhibitor of activated STAT protein gamma (PIAS-gamma), is an E3 SUMO-protein ligase that interacts with the androgen receptor (AR) and is involved in ubiquitin signaling pathways. It is associated with macro/microcephaly in the novel interstitial 19p13.3 microdeletion/microduplication syndrome. It also regulates the hypoxia signalling pathway through interacting with the tumor suppressor von Hippel-Lindau (VHL) and leads to VHL sumoylation, oligomerization, and impaired function during growth of pancreatic cancer cells. Moreover, PIAS4 acts as a direct binding partner for vitamin D receptor (VDR) and facilitates its modification with SUMO2. The process of SUMOylation modulates VDR-mediated signaling. As components of the DNA-damage response (DDR), PIAS4 together with PIAS1 promote responses to DNA double-strand breaks (DSBs). They are required for effective ubiquitin-adduct formation mediated by RNF8, RNF168, and BRCA1 at sites of DNA damage. PIAS4 contains an N-terminal SAP (scaffold attachment factor A/B (SAF-A/B), acinus and PIAS) box with the LXXLL signature, a PINT motif, a specific RING finger known as Siz/PIAS (protein inhibitor of activated signal transducer and activator of transcription) RING (SP-RING) finger, and the acidic C-terminal domain.	49
319736	cd16822	SP-RING_ZMIZ1	SP-RING finger found in zinc finger MIZ domain-containing protein 1 (Zmiz1) and similar proteins. Zmiz1, also known as PIAS-like protein Zimp10 (zinc finger-containing, Miz1, PIAS-like protein on chromosome 10) or retinoic acid-induced protein 17, is a novel PIAS-like protein that was initially identified as an androgen receptor (AR) interacting protein and functions as a transcriptional co-activator. It co-localizes with AR and small ubiquitin-like modifier SUMO-1, forms a protein complex at replication foci in the nucleus, and augments AR-mediated transcription. It also functions as a transcriptional co-activator of the p53 tumor suppressor that plays a critical role in the cell cycle progression, DNA repair, and apoptosis. Moreover, Zmiz1 associates with multiple autoimmune diseases and has been implicated in the development, function, and survival of melanocytes. Zmiz1 also interacts with Smad3/4 proteins and augments Smad-mediated transcription, suggesting it is important in the regulation of the transforming growth factor beta (TGF-beta)/Smad signaling pathway and may have an inhibitory effect on the immune system. Furthermore, Zmiz1 is overexpressed in a significant percentage of human cutaneous squamous cell carcinoma (SCC), breast, ovarian, and colon cancers, suggesting it may play a broader role in epithelial cancers. It functionally interacts with NOTCH1 to promote C-MYC transcription and activity, and thus is involved in a variety of C-MYC-driven cancers. Zmiz1 contains a PAT domain, a highly conserved Siz/PIAS (protein inhibitor of activated signal transducer and activator of transcription) RING (SP-RING) finger, also known as msx-interacting zinc finger (Miz domain), and a putative nuclear localization sequence (NLS), as well as a strong intrinsic transactivation domain within the C-terminus.	49
319737	cd16823	SP-RING_ZMIZ2	SP-RING finger found in zinc finger MIZ domain-containing protein 2 (Zmiz2) and similar proteins. Zmiz2, also known as PIAS-like protein Zimp7 (zinc finger-containing, Miz1, PIAS-like protein on chromosome 7), is a novel PIAS-like protein that was initially identified as an androgen receptor (AR) interacting protein and functions as a transcriptional co-activator. It interacts with beta-catenin and enhances Wnt/beta-catenin-mediated transcription. It also associates with BRG1 and BAF57, components of the ATP-dependent mammalian SWI/SNF-like BAF chromatin-remodeling complexes, and thus plays a potential role in modulation of AR and/or other nuclear receptor-mediated transcription. For instance, it can increase the effects of BRG1 on androgen receptor-mediated transcriptional activity. Moreover, Zmiz2 physically interacts with PIAS proteins, especially PIAS3. Through the interaction, PIAS3 augments Zmiz2-mediated transcription, suggesting PIAS proteins may play a regulatory role in Zmiz-mediated transcription. Furthermore, Zmiz2 is involved in transcriptional regulation of factors essential for patterning in the dorsoventral axis. It is required for the restriction of the zebrafish organizer and mesoderm development. Zmiz2 contains a PAT domain, a highly conserved Siz/PIAS (protein inhibitor of activated signal transducer and activator of transcription) RING (SP-RING) finger, also known as msx-interacting zinc finger (Miz domain), and a strong intrinsic transactivation domain within the C-terminus.	49
319738	cd16824	RING_CH-C4HC3_MARCH4	RING-CH finger, H2 subclass (C4HC3-type), found in membrane-associated RING-CH4 (MARCH4). MARCH4, also known as membrane-associated RING finger protein 4, membrane-associated RING-CH protein IV (MARCH-IV), or RING finger protein 174 (RNF174), is a transmembrane E3 ubiquitin-protein ligase that down-regulates the tetraspanin CD81 and major histocompatibility complex-I (MHC). It also associates with Mult1, suppressing Mult1 expression at the cell surface in a lysine-dependent manner that can be reversed by heat shocking the cells. MARCH4 contains an N-terminal C4HC3-type RING-CH finger, also known as vRING or RINGv, a variant of C3H2C3-type RING-H2 finger, followed by two transmembrane regions.	51
319739	cd16825	RING_CH-C4HC3_MARCH9	RING-CH finger, H2 subclass (C4HC3-type), found in membrane-associated RING-CH9 (MARCH9). MARCH9, also known as membrane-associated RING finger protein 9, membrane-associated RING-CH protein IX (MARCH-IX), or RING finger protein 179 (RNF179), is a transmembrane E3 ubiquitin-protein ligase that down-regulates Mult1, CD4, major histocompatibility complex-I (MHC), and intercellular adhesion molecule (ICAM-1). It may also interact with receptor-type protein-tyrosine phosphatases (e.g. PTPRJ/CD148) as well as Fc gamma receptor IIB (CD32B), HLA-DQ, signaling lymphocytic activation molecule (CD150), and polio virus receptor (CD155). MARCH9 contains an N-terminal C4HC3-type RING-CH finger, also known as vRING or RINGv, a variant of C3H2C3-type RING-H2 finger, followed by two transmembrane regions.	51
319356	cd16827	ChuX-like	heme utilization protein ChuX and similar proteins. This family contains ChuX, a member of the conserved heme utilization operon from pathogenic E. coli, and similar proteins, which include ChuS, HutX, HuvX, HugX, and ShuX in proteobacteria, among others. It forms a dimer which displays a very similar fold and organization to the monomeric structure of other heme utilization proteins such as HemS, ChuS, HmuS, PhuS; these latter occurring as duplicated domains. They all bind heme via a key conserved histidine. The genes encoded within these heme utilization operons enable the effective uptake and utilization of heme as an iron source in pathogenic microorganisms to enable multiplication and survival within hosts they invade.	141
319357	cd16828	HemS-like	N- and C-terminal domains of heme degrading enzyme HemS, and similar proteins. This family contains the N- and C-terminal domains of heme degrading enzyme HemS, and similar proteins, including PhuS, ChuS, ShuS, and HmuS in proteobacteria. Despite low sequence identity between the N- and C-terminal halves, these segments represent a structural duplication, with each terminal half having similar fold to single domains of ChuX. HemS shares homology with both heme degrading enzymes and heme trafficking enzymes. Heme is an iron source for pathogenic microorganisms to enable multiplication and survival within hosts they invade and therefore heme degrading enzyme activity is required for the release of iron from heme after its transportation into the cytoplasm. N- and C-terminal halves of ChuS are each a functional heme oxygenase (HO). The mode of heme coordination by ChuS has been shown to be distinct, whereby the heme is stabilized mostly by residues from the C-terminal domain, assisted by a distant arginine from the N-terminal domain. ChuS can use ascorbic acid or cytochrome P450 reductase-NADPH as electron sources for heme oxygenation. Shigella dysenteriae ShuS promotes utilization of heme as an iron source and protects against heme toxicity by physically sequestering DNA. PhuS in Pseudomonas aeruginosa has been reported as a heme chaperone and as a heme degrading enzyme, and is unique among this family since it contains three histidines in the heme-binding pocket, compared with only one in ChuX.	152
319358	cd16829	ChuX_HutX-like	heme iron utilization protein ChuX and similar proteins. This family contains proteins similar to ChuX, a member of the conserved heme utilization operon from pathogenic E. coli, and includes ChuS, HutX, HuvX, HugX, and ShuX in proteobacteria, among others. It forms a dimer which displays a very similar fold and organization to the monomeric structure of other heme utilization proteins such as HemS, ChuS, HmuS, PhuS; these latter occurring as duplicated domains. They all bind heme via a key conserved histidine. The genes encoded within these heme utilization operons enable the effective uptake and utilization of heme as an iron source in pathogenic microorganisms to enable multiplication and survival within hosts they invade. ChuX, a member of the conserved heme utilization operon from pathogenic E. coli O157:H7, forms a dimer with a very similar fold to the monomer structure of two other heme utilization proteins, ChuS and HemS, despite low sequence homology. ChuX has been shown to bind heme in a 1:1 manner, implying that the ChuX homodimer could coordinate two heme molecules in contrast to only one heme molecule bound in ChuS and HemS. Similarly, cytoplasmic heme-binding protein HutX in Vibrio cholerae, an intracellular heme transport protein for the heme-degrading enzyme HutZ, forms a dimer, each domain binding heme that is transferred from HutX to HutZ via a specific protein-protein interaction. This family also includes AGR_C_4470p from Agrobacterium tumefaciens, which was found to be a dimer, with each subunit having strong structural homology and similar organization to the heme utilization protein ChuS from Escherichia coli and HemS from Yersinia enterocolitica. However, the heme binding site is not conserved in AGR_C_4470p, suggesting a possible different function.	143
319359	cd16830	HemS-like_N	N-terminal domain of heme degrading enzyme HemS, and similar proteins. This family contains the N-terminal domain of heme degrading enzyme HemS, and similar proteins, including PhuS, ChuS, ShuS, and HmuS in proteobacteria.  Despite low sequence identity between the N- and C-terminal halves, these segments represent a structural duplication, with each terminal half having similar fold to single domains of ChuX. HemS shares homology with both heme degrading enzymes and heme trafficking enzymes. Heme is an iron source for pathogenic microorganisms to enable multiplication and survival within hosts they invade and therefore heme degrading enzyme activity is required for the release of iron from heme after its transportation into the cytoplasm. N- and C-terminal halves of ChuS are each a functional heme oxygenase (HO). The mode of heme coordination by ChuS has been shown to be distinct, whereby the heme is stabilized mostly by residues from the C-terminal domain, assisted by a distant arginine from the N-terminal domain. ChuS can use ascorbic acid or cytochrome P450 reductase-NADPH as electron sources for heme oxygenation. Shigella dysenteriae ShuS promotes utilization of heme as an iron source and protects against heme toxicity by physically sequestering DNA. Heme transporter protein PhuS in Pseudomonas aeruginosa is unique among this family since it contains three histidines in the heme-binding pocket, compared with only one in ChuX.	152
319360	cd16831	HemS-like_C	C-terminal domain of heme degrading enzyme HemS, and similar proteins. This family contains the C-terminal domain of heme degrading enzyme HemS, and similar proteins, including PhuS, ChuS, ShuS, and HmuS in proteobacteria. Despite low sequence identity between the N- and C-terminal halves, these segments represent a structural duplication, with each terminal half having similar fold to single domains of ChuX. HemS shares homology with both heme degrading enzymes and heme trafficking enzymes. Heme is an iron source for pathogenic microorganisms to enable multiplication and survival within hosts they invade and therefore heme degrading enzyme activity is required for the release of iron from heme after its transportation into the cytoplasm. N- and C-terminal halves of ChuS are each a functional heme oxygenase (HO). The mode of heme coordination by ChuS has been shown to be distinct, whereby the heme is stabilized mostly by residues from the C-terminal domain, assisted by a distant arginine from the N-terminal domain. ChuS can use ascorbic acid or cytochrome P450 reductase-NADPH as electron sources for heme oxygenation. Shigella dysenteriae ShuS promotes utilization of heme as an iron source and protects against heme toxicity by physically sequestering DNA. Heme transporter protein PhuS in Pseudomonas aeruginosa is unique among this family since it contains three histidines in the heme-binding pocket, compared with only one in ChuX.	155
319353	cd16832	CNF1_CheD_YfiH-like	cytotoxic necrotizing factor 1 (CNF1), chemotaxis protein CheD and YfiH (DUF152) are distant homologs. This family contains distant homologs that include cytotoxic necrotizing factor 1 (CNF1), chemotaxis protein CheD and a protein of unknown function YfiH. CNF-1 along with dermonecrotic toxin (DNT) from Bordetella species, and Burkholderia Lethal Factor 1 (BLF1, also known as BPSL1549) are Rho-activating toxins. The bacterial chemotaxis protein CheD stimulates methylation of methyl-accepting chemotaxis proteins (MCPs). The structure of YfiH, a protein of unknown function also included in this family, reveals a distant homology to CNF1 and CheD, all having an invariant Cys-His pair forming a catalytic dyad that is required by the CNF-1 toxins for deamidation activity.	145
319354	cd16833	YfiH	protein of unknown function YfiH. This subfamily contains YfiH, a protein of unknown function from Shigella flexneri, E. coli, and many similar proteins which collectively are often called DUF152. The structure of YfiH reveals a distant homology to Rho-activating toxins cytotoxic necrotizing factor 1 (CNF1) as well as chemotaxis protein CheD that stimulates methylation of methyl-accepting chemotaxis proteins (MCPs), all having an invariant Cys-His pair forming a catalytic dyad that is required by the CNF-1 toxins for deamidation activity.	185
319355	cd16834	CNF1-like	cytotoxic necrotizing factor 1 (CNF1) and similar proteins. This subfamily contains Rho-activating toxins cytotoxic necrotizing factor 1 (CNF1) and dermonecrotic toxin (DNT) from Bordetella species, as well as Burkholderia Lethal Factor 1 (BLF1, also known as BPSL1549), and similar proteins. CNF1 causes alteration of the host cell actin cytoskeleton and promotes bacterial invasion of blood-brain barrier endothelial cells. E. coli CNF1 constitutively activates host small G proteins such as RhoA and Cdc42 by deamidating a glutamine residue essential for GTP hydrolysis. DNT stimulates the assembly of actin stress fibers and focal adhesions by deamidation/polyamination of a specific glutamine of the small GTPase Rho. CNF1 and DNT are A-B toxins composed of an N-terminal receptor-binding (B) domain and a C-terminal enzymatically active (A) domain; their homology is restricted to the catalytic domains at the C termini of the toxins, suggesting that they share a similar molecular mechanism. BLF1, a toxin that inhibits helicase activity of translation factor eIF4A, is similar to the catalytic domain of Escherichia coli CNF1 (CNF1-C); although CNF1-C and BLF1 show little sequence identity, the active sites have the conserved LSGC (Leu, Ser, Gly, Cys) motif.	168
319351	cd16837	BldD_C_like	C-terminal domain of BldD and similar transcription factors. The Streptomyces transcription factor BldD dimerizes via an unusual mechanism that involves a tetrameric c-di-GMP assembly. BldD is involved in controlling multicellular differentiation in sporulating actinomycetes.	73
319350	cd16839	PCSK9_C-CRD	proprotein convertase subtilisin/kexin type 9, C-terminal cysteine-rich domain (CRD). PCSK9 post-translationally regulates hepatic low-density lipoprotein receptors (LDLRs) by binding to LDLRs on the cell surface, leading to their degradation. Other known PCSK9 targets include very-low-density lipoprotein receptor (VLDLR), apoE receptor2, lipoprotein receptor-related protein 1, etc. This PCSK9 C-terminal CRD may play an analogous role to the P (processing) domains of Furin and Kex2 (i.e. be required for the correct functioning/folding of the protein). Structural similarity has been noted between PCSK9 C-terminal CRD and the resistin homotrimer. This alignment model represents a three-fold repeat.	225
411037	cd16840	toxin_MLD	toxin effector region membrane localization domain. This MLD domain functions as a membrane-targeting domain for toxin effectors such as the Rho-inactivation domain of Vibrio MARTX, Pasteurella mitogenic toxin (PMT), where it has been termed PMT C1 domain, and clostridial glycosylating cytotoxins including Clostridium difficile toxins A (TcdA) and B (TcdB), Clostridium novyi alpha-toxin (TcnA), and Clostridium sordellii lethal toxin (TcsL). During infection, the C. difficile homologous exotoxins, TcdA and TcdB, target and disrupt the colonic epithelium, leading to diarrhea and colitis. They disrupt host cell function through a multistep process involving receptor binding, endocytosis, low pH-induced pore formation, and the translocation and delivery of a C-terminal glucosyltransferase domain (GTD) that inactivates host GTPases. Their N-terminal MLD domains confer membrane localization of adjacent effector domains via the 4-helix-bundle motif.	78
319245	cd16841	RraA_family	ribonuclease activity regulator RraA family. RraA protein family is named after the regulator of ribonuclease activity A (RraA), a protein that binds to RNase E and inhibits RNase E endonucleolytic cleavages. Members also include proteins with other functions, like a 4-hydroxy-4-methyl-2-oxoglutarate/4-carboxy-4-hydroxy-2-oxoadipate (HMG/CHA) aldolase from Pseudomonas putida, which catalyzes the last step of the bacterial protocatechuate 4,5-cleavage pathway, and the uncharacterized YER010Cp protein from yeast, an organism lacking RNase E.	150
409517	cd16842	Ig_SLAM-like_N	N-terminal immunoglobulin (Ig)-like domain of the signaling lymphocyte activation molecule (SLAM) family. The members here are composed of the N-terminal immunoglobulin (Ig)-like domain of the signaling lymphocyte activation molecule (SLAM) family and similar proteins. The SLAM family is a group of immune-cell specific receptors that can regulate both adaptive and innate immune responses. Members of this group include proteins such as CD84, SLAM (CD150), Ly-9 (CD229), NTB-A (ly-108, SLAM6), 19A (CRACC), and SLAMF9. The genes coding for the SLAM family are nested on chromosome 1, in humans at 1q23, and in mice at 1H2. The SLAM family is a subset of the CD2 family, which also includes CD2 and CD58 located on chromosome 1 at 1p13 in humans. In mice, CD2 is located on chromosome 3, and there is no CD58 homolog. The SLAM family proteins are organized as an extracellular domain with either two or four Ig-like domains, a single transmembrane segment, and a cytoplasmic region having Tyr-based motifs. The extracellular domain is organized as a membrane-distal Ig variable (IgV) domain that is responsible for ligand recognition and a membrane-proximal truncated Ig constant-2 (IgC2) domain.	102
409518	cd16843	IgC2_D1_D2_LILR_KIR_like	Immunoglobulin (Ig)-like domain found in Leukocyte Ig-like receptors, Natural killer inhibitory receptors (KIRs) and similar domains; member of Immunoglobulin Constant-2 set of IgSF domains. The members here are composed of the first and second immunoglobulin (Ig)-like domains found in Leukocyte Ig-like receptors (LILRs), Natural killer inhibitory receptors (KIRs, also known as cluster of differentiation (CD) 158), and similar proteins. This group includes LILRB1 (also known as LIR-1), LILRA5 (also known as LIR9), an activating natural cytotoxicity receptor NKp46, the immune-type receptor glycoprotein VI (GPVI), and the IgA-specific receptor Fc-alphaRI (also known as cluster of differentiation (CD) 89). LILRs are a family of immunoreceptors expressed on T and B cells, on monocytes, dendritic cells, and subgroups of natural killer (NK) cells. The human LILR family contains nine proteins (LILRA1-3, and 5, and LILRB1-5). From functional assays, and as the cytoplasmic domains of various LILRs, for example LILRB1, LILRB2 (also known as LIR-2), and LILRB3 (also known as LIR-3) contain immunoreceptor tyrosine-based inhibitory motifs (ITIMs), it is thought that LIR proteins are inhibitory receptors. Of the eight LIR family proteins, only LILRB1, and LILRB2, show detectable binding to class I MHC molecules; ligands for the other members have yet to be determined. The extracellular portions of the different LIR proteins contain different numbers of Ig-like domains, for example, four in the case of LILRB1, and LILRB2, and two in the case of LILRB4 (also known as LIR-5). The activating natural cytotoxicity receptor NKp46 is expressed in natural killer cells, and is organized as an extracellular portion having two Ig-like extracellular domains, a transmembrane domain, and a small cytoplasmic portion. 
GPVI, which also contains two Ig-like domains, participates in the processes of collagen-mediated platelet activation and arterial thrombus formation. Fc-alphaRI is expressed on monocytes, eosinophils, neutrophils, and macrophages; it mediates IgA-induced immune effector responses such as phagocytosis, antibody-dependent cell-mediated cytotoxicity and respiratory burst. Killer cell immunoglobulin-like receptors (KIRs; also known as CD158 for human KIR) are transmembrane glycoproteins expressed by natural killer cells and subsets of T cells. KIRs are a family of highly polymorphic activating and inhibitory receptors that serve as key regulators of human NK cell function. The KIR proteins are classified by the number of extracellular immunoglobulin domains (2D or 3D) and by whether they have a long (L) or short (S) cytoplasmic domain. KIR proteins with the long cytoplasmic domain transduce inhibitory signals upon ligand binding via an immune tyrosine-based inhibitory motif (ITIM), while KIR proteins with the short cytoplasmic domain lack the ITIM motif and instead associate with the TYRO protein tyrosine kinase binding protein to transduce activating signals. The major ligands for KIR are MHC class I (HLA-A, -B or -C) molecules.	90
319272	cd16844	ParB_N_like_MT	ParB N-terminal-like domain, some attached to C-terminal S-adenosylmethionine-dependent methyltransferase domain. This family represents domains related to the N-terminal domain of ParB, a DNA-binding component of the prokaryotic parABS partitioning system, fused to a variety of C-terminal domains, including S-adenosylmethionine-dependent methyltransferase-like domains and DUF4417. parABS contributes to the efficient segregation of chromosomes and low-copy number plasmids to daughter cells during prokaryotic cell division. The process includes the parA (Walker box) ATPase, the ParB DNA-binding protein and cis-acting parS DNA sites. Binding of ParB to centromere-like parS sites is followed by non-specific binding to DNA ("spreading", which has been implicated in gene silencing in plasmid P1) and oligomerization of additional ParB molecules near the parS sites. It has been proposed that ParB-ParB cross-linking compacts the DNA, binds to parA via the N-terminal region, and leads to parA separating the ParB-parS complexes and the recruitment of the SMC (structural maintenance of chromosomes) complexes. The ParB N-terminal domain of Bacillus subtilis and other species contains an arginine-rich ParB Box II with residues essential for bridging of the ParB-parS complexes. The arginine-rich ParB Box II consensus (I[VIL]AGERR[FYW]RA[AS]) identified in several species is partially conserved within this family and related families. Mutations within the basic columns particularly debilitate spreading from the parS sites and impair SMC recruitment. The C-terminal domain contains a HTH DNA-binding motif and is the primary homo-dimerization domain, and binds to parS DNA sites. Additional homo-dimerization contacts are found along the N-terminal domain, but dimerization of the N-terminus may only occur after concentration at ParB-parS foci.	54
341083	cd16845	STAT1_DBD	DNA-binding domain of Signal Transducer and Activator of Transcription 1 (STAT1). This family consists of the DNA-binding domain (DBD) of the STAT1 proteins (Signal Transducer and Activator of Transcription 1, or Signal Transduction And Transcription 1). The DNA binding domain has an Ig-like fold. STAT1 plays an essential role in mediating responses to all types of interferons (IFN), transducing signals from cytoplasmic domains of transmembrane receptors into the nucleus where it regulates gene expression. Thus STAT1 is involved in modulating diverse cellular processes, such as antimicrobial activities, cell proliferation and cell death. STAT1 function is crucial in the innate and adaptive arm of immunity and protects from pathogen infections; phosphorylation of a critical tyrosine by Janus kinases (JAKs) leads to its activation and nuclear translocation, while phosphorylation of a critical serine is required for full transcriptional activation upon IFN stimulation and in response to cellular stress. Transcription of protein-encoding genes (including Stat1 itself) as well as expression of microRNAs (miRNAs) is regulated by activated STAT1. Animal studies have shown that STAT1 is generally considered a tumor suppressor but it can also act as a tumor promoter; its functions are not restricted to tumor cells, but extend to parts of the tumor microenvironment such as immune cells, endothelial cells. STAT1 abundance is a reliable marker for good prognosis in selected tumor types, but it can also correlate with disease progression. In head and neck cancer (HNC) patients, upregulation of STAT1-induced HLA class I enhances immunogenicity and clinical response to anti-EGFR mAb cetuximab therapy. In systemic juvenile idiopathic arthritis (sJIA) characterized by systemic inflammation and arthritis, STAT1 phosphorylation downstream of IFNs is impaired. It exerts anti-oncogenic activities through interferon-gamma and interferon-alpha. 
STAT1 may inhibit hepatocellular carcinoma cell growth by regulating p53-related cell cycling and apoptosis. Studies also show a significant correlation of high STAT1 activity with longer colorectal cancer patient overall survival. Recent studies have shown that STAT1 suppresses mouse mammary gland tumorigenesis by immune regulatory as well as tumor cell-specific functions of STAT1.	161
341084	cd16846	STAT2_DBD	DNA-binding domain of Signal Transducer and Activator of Transcription 2 (STAT2). This family consists of the DNA-binding domain (DBD) of the STAT2 proteins (Signal Transducer and Activator of Transcription 2, or Signal Transduction And Transcription 2). The DNA binding domain has an Ig-like fold. STAT2 activation is driven predominantly by only two classes of cell surface receptors: Type I and III interferon receptors, making it a unique STAT family of transcription factors. Thus, STAT2 plays a critical role in host defenses against viral infections since type I interferon (IFN-I) response inhibits viral replication, and sets the stage for the development of adaptive immunity; viruses target STAT2 by either inhibiting its expression, blocking its activity, or by targeting it for degradation, thus triggering remarkable divergence in the STAT2 gene across species compared to other STAT family members. STAT2 function is regulated by tyrosine phosphorylation which enables STAT dimerization, and subsequent nuclear translocation and transcriptional activation of IFN stimulated genes. Dengue virus (DENV)-mediated degradation of STAT2 has emerged as an important determinant of DENV pathogenesis and host tropism. This vector-borne flavivirus suppresses IFN-I signaling to replicate and cause disease in vertebrates via proteasome-dependent STAT2 degradation mediated by the nonstructural protein NS5 and its interaction partner UBR4, an E3 ubiquitin ligase. Zika virus (ZIKV) NS5 likewise induces STAT2 degradation, but through a different mechanism: ZIKV does not require UBR4 to induce STAT2 degradation. It has also been shown that the STAT2 and STAT4 genes are direct targets for transcription factor Oct-1 protein which is involved in the regulation of expression of genes of the JAK-STAT signaling pathway in the Namalwa Burkitt's lymphoma cell line.	160
341085	cd16847	STAT3_DBD	DNA-binding domain of Signal Transducer and Activator of Transcription 3 (STAT3). This family consists of the DNA-binding domain (DBD) of the STAT3 proteins (Signal Transducer and Activator of Transcription 3, or Signal Transduction And Transcription 3). The DNA binding domain has an Ig-like fold. STAT3 plays key roles in vertebrate development and mature tissue function including control of inflammation and immunity. Mutations in human STAT3, especially in the DNA-binding and SH2 domains, are associated with diseases such as autoimmunity, immunodeficiency and cancer. STAT3 regulation is tightly controlled since either inactivation or hyperactivation results in disease. STAT3 activation is stimulated by several cytokines and growth factors, via diverse receptors. For example, IL-6 receptors depend on the tyrosine kinases JAK1 or JAK2, which associate with the cytoplasmic tail of gp130, and results in STAT3 phosphorylation, dimerization, and translocation to the nucleus; this leads to further IL-6 production and up-regulation of anti-apoptotic genes, thus promoting various cellular processes required for cancer progression. Other activators of STAT3 include IL-10, IL-23, and LPS activation of Toll-like receptors TLR4 and TLR9.  STAT3 is constitutively activated in numerous cancer types, including over 40% of breast cancers. It has been shown to play a significant role in promoting acute myeloid leukemia (AML) through three mechanisms: promoting proliferation and survival, preventing AML differentiation to functional dendritic cells (DCs), and blocking T-cell function through other pathways. STAT3 also regulates mitochondrion functions, as well as gene expression through epigenetic mechanisms; its activation is induced by overexpression of Bcl-2 via an increase in mitochondrial superoxide. Thus, many of the regulators and functions of JAK-STAT3 in tumors are important therapeutic targets for cancer treatment.	164
341086	cd16848	STAT4_DBD	DNA-binding domain of Signal Transducer and Activator of Transcription 4 (STAT4). This family consists of the DNA-binding domain (DBD) of the STAT4 proteins (Signal Transducer and Activator of Transcription 4, or Signal Transduction And Transcription 4). The DNA binding domain has an Ig-like fold. STAT4 acts as the major signal-transducing STAT in response to interleukin-12 (IL-12) by inducing interferon-gamma (IFNg), and is a central mediator in generating inflammation during protective immune responses and immune-mediated diseases. STAT4 is a critical regulator of Th1 differentiation and inflammatory disease. It is essential for the differentiation and function of many immune cells, including natural killer cells, dendritic cells, mast cells and T helper cells. STAT4-mediated signaling promotes the production of autoimmune-associated components, which are implicated in the pathogenesis of autoimmune diseases, such as rheumatoid arthritis, systemic lupus erythematosus, systemic sclerosis and psoriasis, making STAT4 a promising therapeutic target for autoimmune diseases. Variations in STAT4 gene are linked to the development of systemic lupus erythematosus (SLE) in humans. STAT4 activation is detected in chronic liver diseases; polymorphism in STAT4 gene has been shown to be associated with the antiviral response in primary biliary cirrhosis (PBC), HCV-associated liver fibrosis, hepatocellular carcinoma (HCC), chronic hepatitis C and in drug-induced liver injury (DILI). STAT4 may inhibit HCC development by modulating HCC cell proliferation. Studies show that increased expression of STAT4 is positively correlated with the depth of invasion in colorectal cancer (CRC) patients, and the growth and invasion of CRC cells are repressed by inhibition of STAT4 expression, making STAT4 a promising therapeutic target for the treatment of CRC.	152
341087	cd16849	STAT5_DBD	DNA-binding domain of Signal Transducer and Activator of Transcription 5 (STAT5). This family consists of the DNA-binding domain (DBD) of the STAT5 proteins (Signal Transducer and Activator of Transcription 5, or Signal Transduction And Transcription 5), which include STAT5A and STAT5B, both of which are >90% identical despite being encoded by separate genes. The DNA binding domain has an Ig-like fold. STAT5A and STAT5B regulate erythropoiesis, lymphopoiesis, and the maintenance of the hematopoietic stem cell population. STAT5A and STAT5B have overlapping and redundant functions; both isoforms can be activated by the same set of cytokines, but some cytokines preferentially activate either STAT5A or STAT5B, e.g. during pregnancy and lactation, STAT5A rather than STAT5B is required for the production of luminal progenitor cells from mammary stem cells and is essential for the differentiation of milk producing alveolar cells during pregnancy. STAT5 has been found to be constitutively phosphorylated in cancer cells, and therefore constantly activated, either by aberrant cell signaling expression or by mutations. It differentially regulates cellular behavior in human mammary carcinoma. Prolactin (PRL) in the prostate gland can induce growth and survival of prostate cancer cells and tissues through the activation of STAT5, its downstream target; PRL expression and STAT5 activation correlate with disease severity. STAT5A and STAT5B are central signaling molecules in leukemias driven by Abelson fusion tyrosine kinases, displaying unique nuclear shuttling mechanisms and having a key role in resistance of leukemic cells against treatment with tyrosine kinase inhibitors (TKI). In addition, STAT5A and STAT5B promote survival of leukemic stem cells. STAT5 is a key transcription factor for IL-3-mediated inhibition of RANKL-induced osteoclastogenesis via the induction of the expression of Id genes. 
Autosomal recessive STAT5B mutations are associated with severe growth failure, insulin-like growth factor (IGF) deficiency and growth hormone insensitivity (GHI) syndrome. STAT5B deficiency can lead to potentially fatal primary immunodeficiency.	159
341088	cd16850	STAT6_DBD	DNA-binding domain of Signal Transducer and Activator of Transcription 6 (STAT6). This family consists of the DNA-binding domain (DBD) of the STAT6 proteins (Signal Transducer and Activator of Transcription 6, or Signal Transduction And Transcription 6). The DNA binding domain has an Ig-like fold. STAT6 is essential for the functional responses of T helper 2 (Th2) lymphocytes mediated by interleukins IL-4 and IL-13. STAT6 almost exclusively mediates the expression of genes activated by these cytokines; IL-4 signaling regulates the expression of genes involved in immune and anti-inflammatory responses. Abnormal production of IL-4 and IL-13 plays an important role in the pathogenesis of asthma where upregulation of the Th2 response mediated by IL-4/IL-13 is a main characteristic. STAT6 has a unique extended transactivation domain, not found in other STATs, through which it recruits p300/CBP and NCoA-1, two coactivators needed for transcriptional activation by IL-4. STAT6 activation is linked to Kaposi's sarcoma-associated herpesvirus (KSHV)-associated cancers such as primary effusion lymphoma, a cancerous proliferation of B cells. Studies show that meningeal solitary fibrous tumor (SFT) and hemangiopericytoma (HPC) represent a histopathologic spectrum linked by STAT6 nuclear expression and recurrent somatic fusions of the two genes, NGFI-A-binding protein 2 (NAB2) and STAT6 (NAB2-STAT6), similar to their soft tissue counterparts. It is associated with local recurrence and late distant metastasis of brain tumors to extracranial sites.	160
341076	cd16851	STAT1_CCD	Coiled-coil domain of Signal Transducer and Activator of Transcription 1 (STAT1). This family consists of the coiled-coil (alpha) domain of the STAT1 proteins (Signal Transducer and Activator of Transcription 1, or Signal Transduction And Transcription 1). STAT1 plays an essential role in mediating responses to all types of interferons (IFN), transducing signals from cytoplasmic domains of transmembrane receptors into the nucleus where it regulates gene expression. Thus STAT1 is involved in modulating diverse cellular processes, such as antimicrobial activities, cell proliferation and cell death. STAT1 function is crucial in the innate and adaptive arm of immunity and protects from pathogen infections; phosphorylation of a critical tyrosine by Janus kinases (JAKs) leads to its activation and nuclear translocation, while phosphorylation of a critical serine is required for full transcriptional activation upon IFN stimulation and in response to cellular stress. Transcription of protein-encoding genes (including Stat1 itself) as well as expression of microRNAs (miRNAs) is regulated by activated STAT1. Animal studies have shown that STAT1 is generally considered a tumor suppressor but it can also act as a tumor promoter; its functions are not restricted to tumor cells, but extend to parts of the tumor microenvironment such as immune cells, endothelial cells. STAT1 abundance is a reliable marker for good prognosis in selected tumor types, but it can also correlate with disease progression. In head and neck cancer (HNC) patients, upregulation of STAT1-induced HLA class I enhances immunogenicity and clinical response to anti-EGFR mAb cetuximab therapy. In systemic juvenile idiopathic arthritis (sJIA) characterized by systemic inflammation and arthritis, STAT1 phosphorylation downstream of IFNs is impaired. It exerts anti-oncogenic activities through interferon-gamma and interferon-alpha. 
STAT1 may inhibit hepatocellular carcinoma cell growth by regulating p53-related cell cycling and apoptosis. Studies also show a significant correlation of high STAT1 activity with longer colorectal cancer patient overall survival. Recent studies have shown that STAT1 suppresses mouse mammary gland tumorigenesis by immune regulatory as well as tumor cell-specific functions of STAT1.	176
341077	cd16852	STAT2_CCD	Coiled-coil domain of Signal Transducer and Activator of Transcription 2 (STAT2). This family consists of the coiled-coil (alpha) domain of the STAT2 proteins (Signal Transducer and Activator of Transcription 2, or Signal Transduction And Transcription 2). STAT2 activation is driven predominantly by only two classes of cell surface receptors: Type I and III interferon receptors, making it a unique STAT family of transcription factors. It differs from other STAT family members in that it associates constitutively with a non-STAT protein, the interferon regulatory factor 9 (IRF9). The coiled-coil domain of STAT2 is necessary for binding the carboxyl terminus of IRF9, an association required for the constitutive nuclear import of unphosphorylated STAT2. STAT2 plays a critical role in host defenses against viral infections since type I interferon (IFN-I) response inhibits viral replication, and sets the stage for the development of adaptive immunity; viruses target STAT2 by either inhibiting its expression, blocking its activity, or by targeting it for degradation, thus triggering remarkable divergence in the STAT2 gene across species compared to other STAT family members. STAT2 function is regulated by tyrosine phosphorylation which enables STAT dimerization, and subsequent nuclear translocation and transcriptional activation of IFN stimulated genes. Dengue virus (DENV)-mediated degradation of STAT2 has emerged as an important determinant of DENV pathogenesis and host tropism. This vector-borne flavivirus suppresses IFN1 signaling to replicate and cause disease in vertebrates via proteasome-dependent STAT2 degradation mediated by the nonstructural protein NS5 and its interaction partner UBR4, an E3 ubiquitin ligase. The mechanism of Zika virus (ZIKV) NS5 resembles that of DENV NS5, but differs in that ZIKV does not require UBR4 to induce STAT2 degradation. 
It has also been shown that the STAT2 and STAT4 genes are direct targets for transcription factor Oct-1 protein which is involved in the regulation of expression of genes of the JAK-STAT signaling pathway in the Namalwa Burkitt's lymphoma cell line.	172
341078	cd16853	STAT3_CCD	Coiled-coil domain of Signal Transducer and Activator of Transcription 3 (STAT3). This family consists of the coiled-coil (alpha) domain of the STAT3 proteins (Signal Transducer and Activator of Transcription 3, or Signal Transduction And Transcription 3). STAT3 continuously shuttles between nuclear and cytoplasmic compartments. The coiled-coil domain (CCD) of STAT3 appears to be required for constitutive nuclear localization signals (NLS) function; small deletions within the STAT3 CCD can abrogate nuclear import. Studies show that the CCD binds to the importin-alpha3 in the testis, and importin-alpha6 NLS adapters in most cells. STAT3 plays key roles in vertebrate development and mature tissue function including control of inflammation and immunity. Mutations in human STAT3, especially in the DNA-binding and SH2 domains, are associated with diseases such as autoimmunity, immunodeficiency and cancer. STAT3 regulation is tightly controlled since either inactivation or hyperactivation results in disease. STAT3 activation is stimulated by several cytokines and growth factors, via diverse receptors. For example, IL-6 receptors depend on the tyrosine kinases JAK1 or JAK2, which associate with the cytoplasmic tail of gp130, and results in STAT3 phosphorylation, dimerization, and translocation to the nucleus; this leads to further IL-6 production and up-regulation of anti-apoptotic genes, thus promoting various cellular processes required for cancer progression. Other activators of STAT3 include IL-10, IL-23, and LPS activation of Toll-like receptors TLR4 and TLR9. STAT3 is constitutively activated in numerous cancer types, including over 40% of breast cancers. It has been shown to play a significant role in promoting acute myeloid leukemia (AML) through three mechanisms: promoting proliferation and survival, preventing AML differentiation to functional dendritic cells (DCs), and blocking T-cell function through other pathways. 
STAT3 also regulates mitochondrion functions, as well as gene expression through epigenetic mechanisms; its activation is induced by overexpression of Bcl-2 via an increase in mitochondrial superoxide. Thus, many of the regulators and functions of JAK-STAT3 in tumors are important therapeutic targets for cancer treatment.	180
341079	cd16854	STAT4_CCD	Coiled-coil domain of Signal Transducer and Activator of Transcription 4 (STAT4). This family consists of the coiled-coil (alpha) domain of the STAT4 proteins (Signal Transducer and Activator of Transcription 4, or Signal Transduction And Transcription 4). STAT4 expression is restricted to spermatozoa, myeloid cells, and T lymphocytes, making it distinct from other STATs. It acts as the major signaling transducing STATs in response to interleukin-12 (IL-12) by inducing interferon-gamma (IFNgamma), and is a central mediator in generating inflammation during protective immune responses and immune-mediated diseases. STAT4 is a critical regulator of Th1 differentiation and inflammatory disease. It is essential for the differentiation and function of many immune cells, including natural killer cells, dendritic cells, mast cells and T helper cells. STAT4-mediated signaling promotes the production of autoimmune-associated components, which are implicated in the pathogenesis of autoimmune diseases, such as rheumatoid arthritis, systemic lupus erythematosus, systemic sclerosis and psoriasis, making STAT4 a promising therapeutic target for autoimmune diseases. Variations in STAT4 gene are linked to the development of systemic lupus erythematosus (SLE) in humans. STAT4 activation is detected in chronic liver diseases; polymorphism in STAT4 gene has been shown to be associated with the antiviral response in primary biliary cirrhosis (PBC), HCV-associated liver fibrosis, hepatocellular carcinoma (HCC), chronic hepatitis C and in drug-induced liver injury (DILI). STAT4 may inhibit HCC development by modulating HCC cell proliferation. Studies show that increased expression of STAT4 is positively correlated with the depth of invasion in colorectal cancer (CRC) patients, and the growth and invasion of CRC cells are repressed by inhibition of STAT4 expression, making STAT4 a promising therapeutic target for the treatment of CRC.	173
341080	cd16855	STAT5_CCD	Coiled-coil domain of Signal Transducer and Activator of Transcription 5 (STAT5). This family consists of the coiled-coil (alpha) domain of the STAT5 proteins (Signal Transducer and Activator of Transcription 5, or Signal Transduction And Transcription 5) which include STAT5A and STAT5B, both of which are >90% identical despite being encoded by separate genes. The coiled-coil domain (CCD) of STAT5A and STAT5B appears to be required for constitutive nuclear localization signals (NLS) function; small deletions within the CCD can abrogate nuclear import. Studies show that the CCD binds to the importin-alpha3 NLS adapter in most cells. STAT5A and STAT5B regulate erythropoiesis, lymphopoiesis, and the maintenance of the hematopoietic stem cell population. STAT5A and STAT5B have overlapping and redundant functions; both isoforms can be activated by the same set of cytokines, but some cytokines preferentially activate either STAT5A or STAT5B, e.g. during pregnancy and lactation, STAT5A rather than STAT5B is required for the production of luminal progenitor cells from mammary stem cells and is essential for the differentiation of milk producing alveolar cells during pregnancy. STAT5 has been found to be constitutively phosphorylated in cancer cells, and therefore constantly activated, either by aberrant cell signaling expression or by mutations. It differentially regulates cellular behavior in human mammary carcinoma. Prolactin (PRL) in the prostate gland can induce growth and survival of prostate cancer cells and tissues through the activation of STAT5, its downstream target; PRL expression and STAT5 activation correlates with disease severity. STAT5A and STAT5B are central signaling molecules in leukemias driven by Abelson fusion tyrosine kinases, displaying unique nuclear shuttling mechanisms and having a key role in resistance of leukemic cells against treatment with tyrosine kinase inhibitors (TKI). 
In addition, STAT5A and STAT5B promote survival of leukemic stem cells. STAT5 is a key transcription factor for IL-3-mediated inhibition of RANKL-induced osteoclastogenesis via the induction of the expression of Id genes. Autosomal recessive STAT5B mutations are associated with severe growth failure, insulin-like growth factor (IGF) deficiency and growth hormone insensitivity (GHI) syndrome. STAT5B deficiency can lead to potentially fatal primary immunodeficiency.	194
341081	cd16856	STAT6_CCD	Coiled-coil domain of Signal Transducer and Activator of Transcription 6 (STAT6). This family consists of the coiled-coil (alpha) domain of the STAT6 proteins (Signal Transducer and Activator of Transcription 6, or Signal Transduction And Transcription 6). Similar to STAT3 and STAT5, the coiled-coil domain (CCD) of STAT6 is required for constitutive nuclear localization signals (NLS) function; small deletions within the CCD can abrogate nuclear import. Studies show that the CCD binds to the importin-alpha3 NLS adapter in most cells. STAT6 is essential for the functional responses of T helper 2 (Th2) lymphocyte mediated by interleukins IL-4 and IL-13. STAT6 almost exclusively mediates the expression of genes activated by these cytokines; IL-4 signaling regulates the expression of genes involved in immune and anti-inflammatory responses. Abnormal production of IL-4 and IL-13 plays an important role in the pathogenesis of asthma where upregulation of the Th2 response mediated by IL-4/IL-13 is a main characteristic. STAT6 has a unique extended transactivation domain, not found in other STATs, through which it recruits p300/CBP and NCoA-1, two coactivators needed for transcriptional activation by IL-4. STAT6 activation is linked to Kaposi's sarcoma-associated herpesvirus (KSHV)-associated cancers such as primary effusion lymphoma, a cancerous proliferation of B cells. Studies show that meningeal solitary fibrous tumor (SFT) and hemangiopericytoma (HPC) represent a histopathologic spectrum linked by STAT6 nuclear expression and recurrent somatic fusions of the two genes, NGFI-A-binding protein 2 (NAB2) and STAT6 (NAB2-STAT6), similar to their soft tissue counterparts. It is associated with local recurrence and late distant metastasis of brain tumors to extracranial sites.	167
341090	cd16857	ING_ING1_2	Inhibitor of growth (ING) domain of inhibitor of growth protein ING1, ING2, and similar proteins. ING1 is an epigenetic regulator and a type II tumor suppressor that impacts cell growth, aging, apoptosis, and DNA repair, by affecting chromatin conformation and gene expression. It acts as a reader of the active chromatin mark, the trimethylation of histone H3 lysine 4 (H3K4me3). It binds and directs growth arrest and DNA damage inducible protein 45 a (Gadd45a) to target sites, thus linking the histone code with DNA demethylation. It interacts with the proliferating cell nuclear antigen (PCNA) via the PCNA-interacting protein (PIP) domain in a UV-inducible manner. It also interacts with a PCNA-interacting protein, p15 (PAF). Moreover, ING1 associates with members of the 14-3-3 family, which is necessary for cytoplasmic relocalization. Endogenous ING1 protein specifically interacts with the pro-apoptotic BCL2 family member BAX and colocalizes with BAX in a UV-inducible manner. It stabilizes the p53 tumor suppressor by inhibiting polyubiquitination of multi-monoubiquitinated forms via interaction with and colocalization of the herpesvirus-associated ubiquitin-specific protease (HAUSP)-deubiquitinase with p53. It is also involved in trichostatin A-induced apoptosis and caspase 3 signaling in p53-deficient glioblastoma cells. In addition, tyrosine kinase Src can bind and phosphorylate ING1 and further regulates its activity. ING2, also termed inhibitor of growth 1-like protein (ING1Lp), or p32, or p33ING2, is a core component of a multi-factor chromatin-modifying complex containing the transcriptional co-repressor SIN3A and histone deacetylase 1 (HDAC1). It has been implicated in the control of cell cycle, in genome stability, and in muscle differentiation. 
ING2 independently interacts with H3K4me3 (Histone H3 trimethylated on lysine 4) and PtdIns(5)P, and modulates crosstalk between lysine methylation and lysine acetylation on histone proteins through association with chromatin in the presence of DNA damage. It collaborates with SnoN to mediate transforming growth factor (TGF)-beta-induced Smad-dependent transcription and cellular responses. It is upregulated in colon cancer and increases invasion by enhanced MMP13 expression. It also acts as a cofactor of p300 for p53 acetylation and plays a positive regulatory role during p53-mediated replicative senescence. Both ING1 and ING2 contain an N-terminal leucine zipper-like (LZL) motif-containing ING domain, and a well-characterized C-terminal plant homeodomain (PHD)-type zinc-finger domain.	89
341091	cd16858	ING_ING3_Yng2p	Inhibitor of growth (ING) domain of inhibitor of growth protein 3 (ING3), Yng2p and similar proteins. ING3, also termed p47ING3, is a member of the inhibitor of growth (ING) family of type II tumor suppressors. It is ubiquitously expressed and has been implicated in transcription modulation, cell cycle control, and the induction of apoptosis. It is an important subunit of human NuA4 histone acetyltransferase complex, which regulates the acetylation of histones H2A and H4. Moreover, ING3 promotes ultraviolet (UV)-induced apoptosis through the Fas/caspase-8-dependent pathway in melanoma cells. It physically interacts with subunits of E3 ligase Skp1-Cullin-F-box protein complex (SCF complex) and is degraded by the SCF (F-box protein S-phase kinase-associated protein 2, Skp2)-mediated ubiquitin-proteasome system. It also acts as a suppression factor during tumorigenesis and progression of hepatocellular carcinoma (HCC). Yeast chromatin modification-related protein Yng2p, also termed ESA1-associated factor 4 or ING1 homolog 2, is a subunit of the NuA4 histone acetyltransferase (HAT) complex. It plays a critical role in intra-S-phase DNA damage response. Members of this family contain an N-terminal leucine zipper-like (LZL) motif-containing ING domain, and a well-characterized C-terminal plant homeodomain (PHD)-type zinc-finger domain.	92
341092	cd16859	ING_ING4_5	Inhibitor of growth (ING) domain of inhibitor of growth protein ING4, ING5, and similar proteins. ING4, also termed p29ING4, and ING5, also termed p28ING5, belong to the inhibitor of growth (ING) family of type II tumor suppressors. ING4 acts as an E3 ubiquitin ligase to induce ubiquitination of the p65 subunit of NF-kappaB and inhibit the transactivation of NF-kappaB target genes. It also induces apoptosis through a p53 dependent pathway, including increasing p53 acetylation, inhibiting Mdm2-mediated degradation of p53, and enhancing the expression of p53 responsive genes both at the transcriptional and post-translational levels. Moreover, ING4 can inhibit the translation of proto-oncogene MYC by interacting with AUF1. It also regulates other transcription factors, such as hypoxia-inducible factor (HIF). ING5 is a Tip60 cofactor that acetylates p53 at K120 and subsequently activates the expression of p53-dependent apoptotic genes in response to DNA damage. Aberrant ING5 expression may contribute to pathogenesis, growth, and invasion of gastric carcinomas and colorectal cancer. ING5 can physically interact with p300 and p53 in vivo, and its overexpression induces apoptosis in colorectal cancer cells. It also associates with Inhibitor of cyclin A1 (INCA1) and functions as a growth suppressor with suppressed expression in Acute Myeloid Leukemia (AML). Moreover, ING5 translocation from the nucleus to the cytoplasm might be a critical event for carcinogenesis and tumor progression in human head and neck squamous cell carcinoma. Both ING4 and ING5 contain an N-terminal leucine zipper-like (LZL) motif-containing ING domain, and a well-characterized C-terminal plant homeodomain (PHD)-type zinc-finger domain. They associate with histone acetyltransferase (HAT) complexes containing MOZ (monocytic leukemia zinc finger protein)/MORF (MOZ-related factor) and HBO1, and further direct the MOZ/MORF and HBO1 complexes to chromatin.	91
341093	cd16860	ING_ING1	Inhibitor of growth (ING) domain of inhibitor of growth protein 1 (ING1). ING1 is an epigenetic regulator and a type II tumor suppressor that impacts cell growth, aging, apoptosis, and DNA repair, by affecting chromatin conformation and gene expression. It acts as a reader of the active chromatin mark, the trimethylation of histone H3 lysine 4 (H3K4me3). It binds and directs growth arrest and DNA damage inducible protein 45 a (Gadd45a) to target sites, thus linking the histone code with DNA demethylation. It interacts with the proliferating cell nuclear antigen (PCNA) via the PCNA-interacting protein (PIP) domain in a UV-inducible manner. It also interacts with a PCNA-interacting protein, p15 (PAF). Moreover, ING1 associates with members of the 14-3-3 family, which is necessary for cytoplasmic relocalization. Endogenous ING1 protein specifically interacts with the pro-apoptotic BCL2 family member BAX and colocalizes with BAX in a UV-inducible manner. It stabilizes the p53 tumor suppressor by inhibiting polyubiquitination of multi-monoubiquitinated forms via interaction with and colocalization of the herpesvirus-associated ubiquitin-specific protease (HAUSP)-deubiquitinase with p53. It is also involved in trichostatin A-induced apoptosis and caspase 3 signaling in p53-deficient glioblastoma cells. In addition, tyrosine kinase Src can bind and phosphorylate ING1 and further regulates its activity. ING1 contains an N-terminal leucine zipper-like (LZL) motif-containing ING domain, and a well-characterized C-terminal plant homeodomain (PHD)-type zinc-finger domain.	88
341094	cd16861	ING_ING2	Inhibitor of growth (ING) domain of inhibitor of growth protein 2 (ING2). ING2, also termed inhibitor of growth 1-like protein (ING1Lp), or p32, or p33ING2, is a core component of a multi-factor chromatin-modifying complex containing the transcriptional co-repressor SIN3A and histone deacetylase 1 (HDAC1). It has been implicated in the control of cell cycle, in genome stability, and in muscle differentiation. ING2 independently interacts with H3K4me3 (Histone H3 trimethylated on lysine 4) and PtdIns(5)P, and modulates crosstalk between lysine methylation and lysine acetylation on histone proteins through association with chromatin in the presence of DNA damage. It collaborates with SnoN to mediate transforming growth factor (TGF)-beta-induced Smad-dependent transcription and cellular responses. It is upregulated in colon cancer and increases invasion by enhanced MMP13 expression. It also acts as a cofactor of p300 for p53 acetylation and plays a positive regulatory role during p53-mediated replicative senescence. ING2 contains an N-terminal leucine zipper-like (LZL) motif-containing ING domain, and a well-characterized C-terminal plant homeodomain (PHD)-type zinc-finger domain.	88
341095	cd16862	ING_ING4	Inhibitor of growth (ING) domain of inhibitor of growth protein 4 (ING4). ING4, also termed p29ING4, is a member of the inhibitor of growth (ING) family of type II tumor suppressors. It acts as an E3 ubiquitin ligase to induce ubiquitination of the p65 subunit of NF-kappaB and inhibit the transactivation of NF-kappaB target genes. It also induces apoptosis through a p53 dependent pathway, including increasing p53 acetylation, inhibiting Mdm2-mediated degradation of p53 and enhancing the expression of p53 responsive genes both at the transcriptional and post-translational levels. Moreover, ING4 can inhibit the translation of proto-oncogene MYC by interacting with AUF1. It also regulates other transcription factors, such as hypoxia-inducible factor (HIF). In addition, ING4 associates with histone acetyltransferase (HAT) complexes containing MOZ (monocytic leukemia zinc finger protein)/MORF (MOZ-related factor) and HBO1, and further directs the MOZ/MORF and HBO1 complexes to chromatin. ING4 contains an N-terminal leucine zipper-like (LZL) motif-containing ING domain, and a well-characterized C-terminal plant homeodomain (PHD)-type zinc-finger domain.	94
341096	cd16863	ING_ING5	Inhibitor of growth (ING) domain of inhibitor of growth protein 5 (ING5). ING5, also termed p28ING5, is a member of the inhibitor of growth (ING) family of type II tumor suppressors. It acts as a Tip60 cofactor that acetylates p53 at K120 and subsequently activates the expression of p53-dependent apoptotic genes in response to DNA damage. Aberrant ING5 expression may contribute to pathogenesis, growth, and invasion of gastric carcinomas and colorectal cancer. ING5 can physically interact with p300 and p53 in vivo, and its overexpression induces apoptosis in colorectal cancer cells. It also associates with Inhibitor of cyclin A1 (INCA1) and functions as a growth suppressor with suppressed expression in Acute Myeloid Leukemia (AML). Moreover, ING5 translocation from the nucleus to the cytoplasm might be a critical event for carcinogenesis and tumor progression in human head and neck squamous cell carcinoma. In addition, ING5 associates with histone acetyltransferase (HAT) complexes containing MOZ (monocytic leukemia zinc finger protein)/MORF (MOZ-related factor) and HBO1, and further directs the MOZ/MORF and HBO1 complexes to chromatin. ING5 contains an N-terminal leucine zipper-like (LZL) motif-containing ING domain, and a well-characterized C-terminal plant homeodomain (PHD)-type zinc-finger domain.	93
350628	cd16864	ARID_JARID	ARID/BRIGHT DNA binding domain of JARID proteins. The JARID subfamily within the JmjC protein family includes lysine-specific demethylase KDM5A, KDM5B, KDM5C, KDM5D and a Drosophila homolog, protein little imaginal discs (Lid). KDM5A was originally identified as a retinoblastoma protein (Rb)-binding partner and its inactivation may be important for Rb to promote differentiation. It is involved in transcription through interacting with TBP, p107, nuclear receptors, Myc, Sin3/HDAC, Mad1, RBP-J, CLOCK and BMAL1. KDM5B has a restricted expression pattern in the testis, ovary, and transiently in the mammary gland of the pregnant female and has been shown to be upregulated in breast cancer, prostate cancer, and lung cancer, suggesting a potential role in tumorigenesis. Both KDM5A and KDM5B function as trimethylated histone H3 lysine 4 (H3K4me3) demethylases. KDM5C is a H3K4 trimethyl-histone demethylase that catalyzes demethylation of H3K4me3 and H3K4me2 to H3K4me1. It plays a role in neuronal survival and dendrite development. KDM5C defects are associated with X-linked mental retardation (XLMR). KDM5D is a male-specific antigen that shows a demethylase activity specific for di- and tri-methylated histone H3K4 (H3K4me3 and H3K4me2), and has a male-specific function as a histone H3K4 demethylase by recruiting a meiosis-regulatory protein, MSH5, to condensed DNA. KDM5D directly interacts with a polycomb-like protein Ring6a/MBLR, and plays a role in regulation of transcriptional initiation through H3K4 demethylation. The family also includes Drosophila melanogaster protein little imaginal discs (Lid) that functions as a JmjC-dependent trimethyl histone H3K4 (H3K4me3) demethylase, which is required for dMyc-induced cell growth. It positively regulates Hox gene expression in S2 cells. 
Members of this subfamily contain the catalytic JmjC domain, JmjN, the AT-rich domain interacting domain (ARID)/BRIGHT domain, a C5HC2 zinc finger, as well as two or three plant homeodomain (PHD) fingers.	87
350629	cd16865	ARID_ARID1A-like	ARID/BRIGHT DNA binding domain found in AT-rich interactive domain-containing proteins ARID1A, ARID1B and similar proteins. This subfamily contains ARID1A and its paralog ARID1B. They are mutually exclusive components of human SWItch/Sucrose NonFermentable (SWI/SNF) chromatin remodeling protein complexes, but display different functions in development and cell-cycle control. SWI/SNF complexes containing ARID1A have an antiproliferative function, whereas the one harboring ARID1B shows a pro-proliferative function. ARID1A functions as an important tumor suppressor in various tumor types. It has been implicated in cell-cycle arrest, as well as in the interactions with p53 and BRG1/BRM and with topoisomerase II alpha. ARID1B may be considered as a potential therapeutic target for ARID1A-mutant cancers. Moreover, mutations in the ARID1B gene cause Coffin-Siris syndrome, exhibiting developmental defects, and haplo-insufficiency of ARID1B is a frequent cause of intellectual disability. Mutations in the ARID1B gene also have been found in many cancers. Both ARID1A and ARID1B contain an AT-rich DNA-interacting domain (ARID, also known as BRIGHT), which binds DNA in a non-sequence-specific manner.	93
350630	cd16866	ARID_ARID2	ARID/BRIGHT DNA binding domain of AT-rich interactive domain-containing protein 2 (ARID2) and similar proteins. ARID2, also called BRG1-associated factor 200 (BAF200) or zinc finger protein with activation potential (Zipzap/p200), is a novel serum response factor (SRF)-binding protein with multiple conserved domains, including an AT-rich DNA-interacting domain (ARID, also known as BRIGHT), RFX DNA-binding domain, a glutamine-rich domain, and two C2H2 zinc fingers. It binds DNA without sequence specificity. ARID2 is an intrinsic subunit of PBAF (SWI/SNF-B) remodeling complex, which needs ARID2 to play an essential role in promoting osteoblast differentiation, maintaining cellular identity and activating tissue-specific gene expression. Moreover, ARID2 may function as a tumor suppressor in many cancers. It may also serve as a transcription co-activator for the regulation of cardiac gene expression, and is required for heart morphogenesis and coronary artery development.	88
350631	cd16867	ARID_ARID3	ARID/BRIGHT DNA binding domain of AT-rich interactive domain-containing proteins ARID3A, ARID3B, ARID3C, dead ringer (Dri) from Drosophila melanogaster, and similar proteins. The ARID3 subfamily includes AT-rich interactive domain (ARID, also known as BRIGHT)-containing proteins ARID3A, ARID3B and ARID3C, which are the most direct mammalian counterparts of the Drosophila "dead ringer" protein Dri. They consist of an acidic N-terminal region of unknown function, the central ARID matrix association (or attachment) region (MAR)-DNA binding domain, a SUMO-I conjugation (SUMO) motif, and a multifunctional homomerization/nuclear export REKLES domain in the C-terminal third of the molecule. The ARID domain in this subfamily has been described as the "extended" or e-ARID due to additional conserved sequences at both the N and C termini of the core ARID region. The REKLES domain is found only in the ARID3 subfamily. It has co-evolved with and regulates functional properties of the ARID DNA-binding domain.	118
350632	cd16868	ARID_ARID4	ARID/BRIGHT DNA binding domain of AT-rich interactive domain-containing proteins ARID4A, ARID4B and similar proteins. This subfamily contains ARID4A and its paralog ARID4B, both of which are retinoblastoma (Rb)-binding proteins that function as coactivators to enhance the androgen receptor (AR) and Rb transcriptional activity, and play important roles in the AR and Rb pathways to control male fertility. They also act as the leukemia and tumor suppressors involved in epigenetic regulation in leukemia and Prader-Willi/Angelman syndrome. Moreover, they associate with the mSIN3A histone deacetylase (HDAC) chromatin remodeling complex through the interaction with each other, as well as with the breast cancer associated tumor suppressor ING1 and the breast cancer metastasis suppressor BRMS1. Both ARID4A and ARID4B contain a Tudor domain, a PWWP domain (also known as HATH domain or RBB1NT domain), an AT-rich DNA-interacting domain (ARID, also known as BRIGHT), a chromobarrel domain, and a C-terminal R2 domain.	87
350633	cd16869	ARID_ARID5	ARID/BRIGHT DNA binding domain of AT-rich interactive domain-containing proteins ARID5A, ARID5B, and similar proteins. This subfamily contains ARID5A and its paralog ARID5B. ARID5A, also called modulator recognition factor 1 (MRF-1), is an estrogen receptor alpha (ER alpha)-interacting protein that is expressed abundantly in cardiovascular tissues and suppresses ER alpha-induced transactivation. It also plays an important role in the promotion of inflammatory processes and autoimmune diseases. ARID5B, also called MRF1-like protein or modulator recognition factor 2 (MRF-2), is a DNA-binding protein that directly interacts with plant homeodomain (PHD) finger 2 (PHF2) to form a protein kinase A (PKA)-dependent PHF2-ARID5B histone H3K9Me2 demethylase complex. It also functions as a transcriptional co-regulator for the transcription factor sex determining region Y (SRY)-box protein 9 (Sox9) and promotes chondrogenesis through histone modification. Moreover, ARID5B is highly expressed in the cardiovascular system and may play essential roles in the phenotypic change of smooth muscle cells (SMCs) through its regulation of SMC differentiation. Both ARID5A and ARID5B contain an AT-rich DNA-interacting domain (ARID, also known as BRIGHT).	87
350634	cd16870	ARID_JARD2	ARID/BRIGHT DNA binding domain of Jumonji/ARID domain-containing protein 2 (JARID2) and similar proteins. JARID2, also called protein Jumonji, is a DNA-binding protein that contains both the Jumonji C (JmjC) domain and AT-rich DNA-interacting domain (ARID, also known as BRIGHT). It is an interacting component of Polycomb repressive complex-2 (PRC2) that catalyzes methylation of lysine 27 of histone H3 (H3K27) and regulates important gene expression patterns during development. It exhibits nucleosome-binding activity that contributes to PRC2 stimulation. However, unlike other JmjC domain-containing proteins, JARID2 is catalytically inactive due to the lack of conserved residues essential for histone demethylase activity. JARID2 is also involved in transforming growth factor-beta (TGF-beta)-induced epithelial-mesenchymal transition (EMT) of lung and colon cancer cell lines through the modulation of histone H3K27 methylation. Moreover, JARID2 is a part of GLP- and G9a-containing protein complex that promotes lysine 9 on histone H3 (H3K9) methylation on the cyclin D1 promoter and silences the expression of cyclin D1 and other cell cycle genes. It functions as a transcriptional repressor that plays critical roles in embryonic development including heart development in mice, and regulates cardiomyocyte proliferation via interaction with retinoblastoma protein (Rb), one of the master regulatory genes of the cell cycle. Furthermore, JARID2 acts as a transcriptional repressor of target genes, including Notch1. It directly binds to SETDB1 (SET domain, bifurcated 1) to form a complex that plays an important role in a novel process involving the modification of H3K9 methylation during heart development. Meanwhile, JARID2 is a key transcriptional repressor that plays a role in invariant natural killer T (iNKT) cell maturation. It regulates promyelocytic leukemia zinc finger (PLZF) expression by linking T-cell receptor (TCR) signaling to H3K9me3. JARID2 polymorphisms are associated with non-syndromic orofacial clefts (NSOC) susceptibility.	112
350635	cd16871	ARID_Swi1p-like	ARID/BRIGHT DNA binding domain of yeast SWI/SNF chromatin-remodeling complex subunit Swi1p and similar proteins. Saccharomyces cerevisiae Swi1p, also called SWI/SNF chromatin-remodeling complex subunit SWI1, regulatory protein GAM3, or transcription regulatory protein ADR6, is a transcription regulatory protein that is a subunit of the SWI/SNF complex, which plays critical roles in the regulation of gene transcription and expression. It can exist as a prion, [SWI(+)], which demonstrates a link between prionogenesis and global transcriptional regulation. Swi1p contains an AT-rich DNA-interacting domain (ARID, also known as BRIGHT) that binds DNA nonspecifically. This subfamily also includes Schizosaccharomyces pombe SWI/SNF chromatin-remodeling complex subunit sol1 (sol1p, also known as switch one-like protein). sol1p is a homolog of S. cerevisiae Swi1p and is also a part of SWI/SNF chromatin-remodeling complex.	90
350636	cd16872	ARID_HMGB9-like	ARID/BRIGHT DNA binding domain of Arabidopsis thaliana high mobility group B proteins HMGB9, HMGB10, HMGB11, HMGB15 and similar proteins. This subfamily includes a group of conserved plant DNA-binding proteins, including HMGB9 (or ARID-HMG1), HMGB10 (or ARID-HMG2), HMGB11, and HMGB15. They have been termed ARID-HMG proteins, due to containing two DNA-binding domains, an N-terminal AT-rich DNA-interacting domain (ARID, also known as BRIGHT), and a C-terminal high mobility group (HMG)-box domain. They are widely expressed in Arabidopsis and localize primarily to the nucleus. HMGB9/ARID-HMG1 binds specifically to A/T-rich DNA. HMGB15 is a transcription factor predominantly expressed in mature pollen grains and pollen tubes. It may work in the form of a homodimer, or interact with HMGB9, HMGB10 and HMGB11 to form heteromultimers in plant cells. HMGB15 is required for pollen tube growth in Arabidopsis and is involved in transcriptional regulation through the interaction with AGL66 and AGL104.	86
350637	cd16873	ARID_KDM5A	ARID/BRIGHT DNA binding domain of lysine-specific demethylase 5A (KDM5A). KDM5A, also called histone demethylase JARID1A, Jumonji/ARID domain-containing protein 1A, or Retinoblastoma-binding protein 2 (RBBP-2 or RBP2), was originally identified as a retinoblastoma protein (Rb)-binding partner; its inactivation may be important for Rb to promote differentiation. It is involved in transcription through interacting with TBP, p107, nuclear receptors, Myc, Sin3/HDAC, Mad1, RBP-J, CLOCK and BMAL1. KDM5A functions as the trimethylated histone H3 lysine 4 (H3K4me3) demethylase that belongs to the JARID subfamily within the JmjC proteins. It also displays DNA-binding activities that can recognize the specific DNA sequence CCGCCC. KDM5A contains the catalytic JmjC domain, a JmjN domain, an AT-rich DNA-interacting domain (ARID, also known as BRIGHT), a C5HC2 zinc finger, as well as three plant homeodomain (PHD) fingers.	92
350638	cd16874	ARID_KDM5B	ARID/BRIGHT DNA binding domain of lysine-specific demethylase 5B (KDM5B). KDM5B, also called cancer/testis antigen 31 (CT31), histone demethylase JARID1B, Jumonji/ARID domain-containing protein 1B (JARID1B), PLU-1, or retinoblastoma-binding protein 2 homolog 1 (RBP2-H1 or RBBP2H1A), is a member of the JARID subfamily within the JmjC proteins. It has a restricted expression pattern in the testis, ovary, and transiently in the mammary gland of the pregnant female and has been shown to be upregulated in breast cancer, prostate cancer, and lung cancer, suggesting a potential role in tumorigenesis. KDM5B acts as a histone demethylase that catalyzes the removal of trimethylation of lysine 4 on histone H3 (H3K4me3), induced by polychlorinated biphenyls (PCBs). It also mediates demethylation of H3K4me2 and H3K4me1. Moreover, KDM5B functions as a negative regulator of hematopoietic stem cell (HSC) self-renewal and progenitor cell activity. KDM5B has also been shown to interact with the DNA binding transcription factors BF-1 and PAX9, as well as TIEG1/KLF10 (transforming growth factor-beta inducible early gene-1/Kruppel-like transcription factor 10), and possibly function as a transcriptional corepressor. KDM5B contains the catalytic JmjC domain, a JmjN domain, an AT-rich DNA-interacting domain (ARID, also known as BRIGHT), a C5HC2 zinc finger, as well as three plant homeodomain (PHD) fingers.	90
350639	cd16875	ARID_KDM5C_5D	ARID/BRIGHT DNA binding domain of lysine-specific demethylase KDM5C and KDM5D. This group includes KDM5C and KDM5D, both of which belong to the JARID subfamily within the JmjC proteins. KDM5C, also called histone demethylase JARID1C, Jumonji/ARID domain-containing protein 1C, protein SmcX, or protein Xe169, is a H3K4 trimethyl-histone demethylase that catalyzes demethylation of H3K4me3 and H3K4me2 to H3K4me1. It plays a role in neuronal survival and dendrite development. KDM5C defects are associated with X-linked mental retardation (XLMR). KDM5D, also called histocompatibility Y antigen (H-Y), histone demethylase JARID1D, Jumonji/ARID domain-containing protein 1D, or protein SmcY, is a male-specific antigen that shows a demethylase activity specific for di- and tri-methylated histone H3K4 (H3K4me3 and H3K4me2), and has a male-specific function as a histone H3K4 demethylase by recruiting a meiosis-regulatory protein, MSH5, to condensed DNA. KDM5D directly interacts with a polycomb-like protein Ring6a/MBLR, and plays a role in regulation of transcriptional initiation through H3K4 demethylation. Both KDM5C and KDM5D contain the catalytic JmjC domain, a JmjN domain, an AT-rich DNA-interacting domain (ARID, also known as BRIGHT), a C5HC2 zinc finger, as well as two plant homeodomain (PHD) fingers.	92
350640	cd16876	ARID_ARID1A	ARID/BRIGHT DNA binding domain of AT-rich interactive domain-containing protein 1A (ARID1A) and similar proteins. ARID1A, also called B120, BRG1-associated factor 250a (BAF250A), Osa homolog 1 (OSA1), SWI-like protein, SWI/SNF complex protein p270, or SWI/SNF-related, matrix-associated, actin-dependent regulator of chromatin subfamily F member 1 (SWI1), has been identified as a novel tumor suppressor in various tumor types. It interacts with BRG1 adenosine triphosphatase to form a SWItch/Sucrose NonFermentable (SWI/SNF) chromatin remodeling protein complex, which plays a critical role in transcriptional control and gene expression. ARID1A contains an AT-rich DNA-interacting domain (ARID, also known as BRIGHT), and Eld/Osa homology domains (EHD) 1 and 2 within the C-terminus. The ARID in ARID1A binds nonspecific DNA in general and plays an important role in targeting SWI/SNF to chromatin. The EHD1 may be capable of mediating an intramolecular association with EHD2, and/or an intermolecular association resulting in homo- or hetero-dimerization. The EHD2 binds Swi2/Brahma homologue Brahma-related gene 1 (BRG1, also known as Snf2b), a human homologue of yeast Swi2.	93
350641	cd16877	ARID_ARID1B	ARID/BRIGHT DNA binding domain of AT-rich interactive domain-containing protein 1B (ARID1B) and similar proteins. ARID1B, also called BRG1-associated factor 250b (BAF250B), BRG1-binding protein ELD/OSA1, Osa homolog 2 (Osa2), or p250R, is the largest subunit of ATP-dependent SWItch/sucrose nonfermentable (SWI/SNF) chromatin remodeling complex, which plays a critical role in transcriptional control and gene expression. ARID1B exhibits tumour-suppressor activities in pancreatic cancer cell lines. Mutations in the ARID1B gene cause Coffin-Siris syndrome, exhibiting developmental defects, and haplo-insufficiency of ARID1B is a frequent cause of intellectual disability. Moreover, mutations in the ARID1B gene have been found in many cancers. ARID1B contains an AT-rich DNA-interacting domain (ARID, also known as BRIGHT), which binds DNA in a non-sequence-specific manner similar to ARID1A.	93
350642	cd16878	ARID_ARID3A	ARID/BRIGHT DNA binding domain of AT-rich interactive domain-containing protein 3A (ARID3A) and similar proteins. ARID3A, also called B-cell regulator of IgH transcription (Bright), dead ringer-like protein 1 (Dril1), or E2F-binding protein 1 (E2FBP1), is a ubiquitously expressed DNA-binding protein that has been implicated in embryonic patterning, cell lineage gene regulation, cell cycle control, chromatin remodeling, and transcriptional regulation. It was originally identified as a B cell-specific trans-activator of immunoglobulin heavy-chain (IgH) transcription, which increases immunoglobulin transcription in antigen-activated B cells and plays regulatory roles in hematopoiesis. It also functions as an E2F transcription regulator, inducing promyelocytic leukemia protein (PML) reduction and suppressing the formation of PML-nuclear bodies. It antagonizes the p16(INK4A)-Rb tumor suppressor machinery by regulating PML stability. ARID3A transcriptional activity can be modulated by SUMO (Small Ubiquitin-related Modifier) modification through the interaction with the SUMO-conjugating enzyme Ubc9. ARID3A also plays an important role in marginal zone B lymphocyte development and autoantibody production. Furthermore, ARID3A is a direct p53 target gene. It controls cell growth in a p53-dependent manner. ARID3A contains an AT-rich DNA-interacting domain (ARID, also known as BRIGHT), a SUMO-I conjugation (SUMO) motif and a multifunctional homomerization/nuclear export REKLES domain, which consists of two subdomains: a modestly conserved N-terminal REKLES alpha and a highly conserved (among ARID3 orthologous proteins) C-terminal REKLES beta.	133
350643	cd16879	ARID_ARID3B	ARID/BRIGHT DNA binding domain of AT-rich interactive domain-containing protein 3B (ARID3B) and similar proteins. ARID3B, also called Bright and dead ringer protein, or Bright-Dri-like protein (Bdp), is a DNA binding protein involved in cellular immortalization, epithelial-mesenchymal transition (EMT), and tumorigenesis. Its expression is differentially regulated in normal and malignant tissues. It is required for heart development by regulating the motility and differentiation of heart progenitors. ARID3B is overexpressed in neuroblastoma and ovarian cancer. It acts as a novel target with roles in cell motility in breast cancer cells, promotes migration of mouse embryo fibroblasts (MEFs) and breast cancer cells, and induces tumor necrosis factor alpha (TNFalpha)-mediated apoptosis. ARID3B contains an AT-rich DNA-interacting domain (ARID, also known as BRIGHT), a SUMO-I conjugation (SUMO) motif and a multifunctional homomerization/nuclear export REKLES domain, which consists of two subdomains: a modestly conserved N-terminal REKLES alpha and a highly conserved (among ARID3 orthologous proteins) C-terminal REKLES beta.	126
350644	cd16880	ARID_ARID3C	ARID/BRIGHT DNA binding domain of AT-rich interactive domain-containing protein 3C (ARID3C) and similar proteins. ARID3C, also called Brightlike, is a new ARID3 family transcription factor that co-activates ARID3A-mediated immunoglobulin gene transcription. It also functions as a potential regulator of early events in B cell antigen receptor (BCR) signaling. ARID3C contains an AT-rich DNA-interacting domain (ARID, also known as BRIGHT), a SUMO-I conjugation (SUMO) motif and a multifunctional homomerization/nuclear export REKLES domain, which consists of two subdomains: a modestly conserved N-terminal REKLES alpha and a highly conserved (among ARID3 orthologous proteins) C-terminal REKLES beta.	127
350645	cd16881	ARID_Dri-like	ARID/BRIGHT DNA binding domain of dead ringer (Dri) from Drosophila melanogaster and similar proteins. Dri, also termed retained (retn), is a nuclear protein with a sequence-specific DNA-binding domain termed AT-rich DNA-interacting domain (ARID, also known as BRIGHT). It is a founding member of the ARID family. Sequence comparison shows that DRI belongs to the "extended" or e-ARID subfamily, which exhibits an extended region of similarity either side of the ARID. Dri plays an important role in embryogenesis. It functions as an essential transcription factor involved in aspects of dorsal/ventral and anterior/posterior axis patterning, as well as myogenesis and hindgut development.	125
350646	cd16882	ARID_ARID4A	ARID/BRIGHT DNA binding domain of AT-rich interactive domain-containing protein 4A (ARID4A) and similar proteins. ARID4A, also called retinoblastoma-binding protein 1 (RBBP1, or RBP1), is a leukemia and tumor suppressor involved in epigenetic regulation in leukemia and Prader-Willi/Angelman syndrome. It associates with the mSIN3A histone deacetylase (HDAC) chromatin remodeling complex through the interaction with the breast cancer associated tumor suppressor ING1, the breast cancer metastasis suppressor BRMS1, and the ARID4 family homolog ARID4B (also known as RBP1L1). ARID4A specifically interacts with retinoblastoma protein (pRb) and shows both HDAC-dependent and -independent repression activities. It is also involved in the pocket domain of pRb-mediated repression of E2F-dependent transcription and cellular proliferation. Moreover, it acts as a Runx2 coactivator and is involved in the regulation of osteoblastic differentiation in Runx2-osterix transcriptional cascade. ARID4A contains a Tudor domain, a PWWP domain (also known as HATH domain or RBB1NT domain), an AT-rich DNA-interacting domain (ARID, also known as BRIGHT), a chromobarrel domain, and a C-terminal R2 domain. The ARID and R2 domains are responsible for the repression activities. The Tudor, PWWP, and chromobarrel domains are all Royal Family domains, but only chromobarrel domain of ARID4A is responsible for recognizing both dsDNA and methylated histone tails, particularly H4K20me3, in chromatin remodeling and epigenetic regulation.	87
350647	cd16883	ARID_ARID4B	ARID/BRIGHT DNA binding domain of AT-rich interactive domain-containing protein 4B (ARID4B) and similar proteins. ARID4B, also called 180 kDa Sin3-associated polypeptide (p180), breast cancer-associated antigen BRCAA1, histone deacetylase complex subunit SAP180, or retinoblastoma-binding protein 1-like 1 (RBP1L1, or RBBP1L1), is a tumor suppressor involved in epigenetic regulation in leukemia and Prader-Willi/Angelman syndrome. It associates with the mSIN3A histone deacetylase (HDAC) chromatin remodeling complex through the interaction with the breast cancer associated tumor suppressor ING1, the breast cancer metastasis suppressor BRMS1, and the ARID4 family homolog ARID4A (also known as RBP1). ARID4B plays a causative role in metastatic progression of breast cancer. It may also be associated with regulating cell cycle. ARID4B contains a Tudor domain, a PWWP domain (also known as HATH domain or RBB1NT domain), an AT-rich DNA-interacting domain (ARID, also known as BRIGHT), a chromobarrel domain, and a C-terminal R2 domain.	92
350648	cd16884	ARID_ARID5A	ARID/BRIGHT DNA binding domain of AT-rich interactive domain-containing protein 5A (ARID5A) and similar proteins. ARID5A, also called modulator recognition factor 1 (MRF-1), is an estrogen receptor alpha (ER alpha)-interacting protein that is expressed abundantly in cardiovascular tissues and suppresses ER alpha-induced transactivation. It also associates with thyroid receptor alpha (TR alpha) and retinoid X receptor alpha (RXR alpha) in a ligand-dependent manner, and with ER beta, androgen receptor (AR), and the retinoic acid receptor (RAR) in a ligand-independent manner. ARID5A functions as a negative regulator of RORgamma-induced Th17 cell differentiation and may be involved in the pathogenesis of rheumatoid arthritis (RA). Moreover, it is an important transcriptional partner of the transcription factor sex determining region Y (SRY)-box protein 9 (Sox9) in stimulation of chondrocyte-specific transcription. Meanwhile, ARID5A plays an important role in promotion of inflammatory processes and autoimmune diseases. It works as a unique RNA binding protein, which stabilizes interleukin-6 (IL-6) but not tumor necrosis factor-alpha (TNF-alpha) mRNA through binding to the 3' untranslated region (UTR) of IL-6 mRNA, and inhibits the destabilizing effect of regnase-1 on IL-6 mRNA. ARID5A contains an AT-rich DNA-interacting domain (ARID, also known as BRIGHT).	87
350649	cd16885	ARID_ARID5B	ARID/BRIGHT DNA binding domain of AT-rich interactive domain-containing protein 5B (ARID5B) and similar proteins. ARID5B, also called MRF1-like protein or modulator recognition factor 2 (MRF-2), is a DNA-binding protein that directly interacts with plant homeodomain (PHD) finger 2 (PHF2) to form a protein kinase A (PKA)-dependent PHF2-ARID5B histone H3K9Me2 demethylase complex, which is a signal-sensing modulator of histone methylation and gene transcription. It also functions as a transcriptional co-regulator for the transcription factor sex determining region Y (SRY)-box protein 9 (Sox9) and promotes chondrogenesis through histone modification. Moreover, ARID5B is highly expressed in the cardiovascular system and may play essential roles in the phenotypic change of smooth muscle cells (SMCs) through its regulation of SMC differentiation. Its polymorphism has been associated with risk for pediatric acute lymphoblastic leukemia (ALL). ARID5B contains an AT-rich DNA-interacting domain (ARID, also known as BRIGHT), which can bind both the major and minor grooves of its target sequences.	95
341123	cd16887	YEATS	YEATS domain family, chromatin reader proteins. The YEATS family is named for several family members: 'YNK7', 'ENL', 'AF-9', and 'TFIIF small subunit', and also contains the GAS41 protein, YEATS2, Drosophila D12, and others. DNA regulation by chromatin through histone post-translational modification and other mechanisms involves complexes with writer, eraser, and reader functions. YEATS domains act as readers of the chromatin state and stimulate transcriptional activity through preferential interactions with crotonylated lysines on histones.	120
381609	cd16888	lyz_G-like_1	lysozyme G-like protein 1. Eukaryotic goose-type or G-type lysozymes (goose egg-white lysozyme; GEWL) catalyze the cleavage of the beta-1,4-glycosidic bond between N-acetylmuramic acid (MurNAc) and N-acetylglucosamine (GlcNAc). Mammals have two lysozyme G-like proteins, and this family corresponds to human and mouse lysozyme G-like protein 1. In humans and some other species, the canonical catalytic glutamate residue is absent, suggesting a loss of muramidase activity.	160
381610	cd16889	chitinase-like	chitinase-like domain. This family includes proteins such as chitinases, chitosanase, pesticin, and endolysin, which are involved in the degradation of 1,4-N-acetyl D-glucosamine linkages in chitin polymers and related activities. Chitinases are enzymes that catalyze the hydrolysis of the beta-1,4-N-acetyl-D-glucosamine linkages in chitin polymers. Chitosanase enzymes hydrolyze chitosan, a biopolymer of beta (1,4)-linked-D-glucosamine (GlcN) residues produced by partial or full deacetylation of chitin. Pesticin (Pst) is an anti-bacterial toxin produced by Yersinia pestis that acts through uptake by the target related bacteria and the hydrolysis of peptidoglycan in the periplasm. The dsDNA phages of eubacteria use endolysins or muralytic enzymes in conjunction with holin, a small membrane protein, to degrade the peptidoglycan found in bacterial cell walls. Similarly, bacteria produce autolysins to facilitate the biosynthesis of their cell wall heteropolymer peptidoglycan and cell division.	105
381611	cd16890	lyz_i	I-type lysozyme. Invertebrate type (I-type) lysozyme, initially identified in starfish and marine bivalves, are found in various invertebrate phyla and are apparently ubiquitous in insects. Lysozymes cleave the beta-(1,4)-glycosidic bond between N-acetylmuramic acid and N-acetylglucosamine in peptidoglycan, the major bacterial cell wall polymer. I-type enzymes share structural similarity and the conserved glutamate catalytic residue of the lysozyme family.	117
381612	cd16891	CwlT-like	CwlT-like N-terminal lysozyme domain and similar domains. CwlT is a bifunctional cell wall hydrolase containing an N-terminal lysozyme domain and a C-terminal NlpC/P60 endopeptidase domain (gamma-d-D-glutamyl-L-diamino acid endopeptidase), and has been implicated in the spread of transposons. Proteins similar to this family include the soluble and insoluble membrane-bound LTs in bacteria, the LTs in bacteriophage lambda, as well as the eukaryotic "goose-type" lysozymes (goose egg-white lysozyme; GEWL).	151
381613	cd16892	LT_VirB1-like	VirB1-like subfamily. This subfamily includes VirB1 protein, one of twelve proteins making up type IV secretion systems (T4SS). T4SS are macromolecular assemblies generally composed of VirB1-11 and VirD4 proteins, and are used by bacteria to transport material across their membranes. VirB1 acts as a lytic transglycosylase (LT), and is important with respect to piercing the peptidoglycan layer in the periplasm. LTs catalyze the cleavage of the beta-1,4-glycosidic bond between N-acetylmuramic acid (MurNAc) and N-acetyl-D-glucosamine (GlcNAc) as do "goose-type" lysozymes. However, in addition to this, they also make a new glycosidic bond with the C6 hydroxyl group of the same muramic acid residue. Proteins similar to this family include the soluble and insoluble membrane-bound LTs in bacteria, the LTs in bacteriophage lambda, as well as the eukaryotic "goose-type" lysozymes (goose egg-white lysozyme; GEWL).	143
381614	cd16893	LT_MltC_MltE	membrane-bound lytic murein transglycosylases MltC and MltE, and similar proteins. MltC and MltE are periplasmic, outer membrane attached lytic transglycosylases (LTs), which cleave beta-1,4-glycosidic bonds joining N-acetylmuramic acid and N-acetylglucosamine in the cell wall peptidoglycan, yielding 1,6-anhydromuropeptides. Proteins similar to this family include the soluble and insoluble membrane-bound LTs in bacteria and the LTs in bacteriophage lambda.	162
381615	cd16894	MltD-like	Membrane-bound lytic murein transglycosylase D and similar proteins. Lytic transglycosylases (LT) catalyze the cleavage of the beta-1,4-glycosidic bond between N-acetylmuramic acid (MurNAc) and N-acetyl-D-glucosamine (GlcNAc). Membrane-bound lytic murein transglycosylase D protein (MltD) family members may have one or more small LysM domains, which may contribute to peptidoglycan binding. Unlike the similar "goose-type" lysozymes, LTs also make a new glycosidic bond with the C6 hydroxyl group of the same muramic acid residue. Proteins similar to this family include the soluble and insoluble membrane-bound LTs in bacteria, the LTs in bacteriophage lambda, as well as the eukaryotic "goose-type" lysozymes (goose egg-white lysozyme; GEWL).	129
381616	cd16895	TraH-like	conjugal transfer protein H and similar proteins. This subfamily consists of several TraH proteins, putative conjugal transfer proteins of uncharacterized function, apparently found only in alphaproteobacteria. They have similarity to lysozyme and preserve the critical glutamate residue which has catalytic activity in lysozyme-like proteins.	198
381617	cd16896	LT_Slt70-like	uncharacterized lytic transglycosylase subfamily with similarity to Slt70. Uncharacterized lytic transglycosylase (LT) with a conserved sequence pattern suggesting similarity to Slt70, a 70 kDa soluble lytic transglycosylase which also has an N-terminal U-shaped U-domain and a linker L-domain. LTs catalyze the cleavage of the beta-1,4-glycosidic bond between N-acetylmuramic acid (MurNAc) and N-acetyl-D-glucosamine (GlcNAc), as do "goose-type" lysozymes. However, in addition to this, they also make a new glycosidic bond with the C6 hydroxyl group of the same muramic acid residue.	146
340383	cd16897	LYZ_C	C-type lysozyme. C-type lysozyme (chicken or conventional type; 1,4-beta-N-acetylmuramidase). In humans, lysozyme is found in a wide variety of tissue types and body fluids. It has bacteriolytic properties through the hydrolysis of beta-1,4-glycosidic linkages between N-acetylmuramic acid and N-acetyl-D-glucosamine residues in a peptidoglycan, as well as between N-acetyl-D-glucosamine residues in chitodextrins. This family also includes digestive stomach lysozyme, which allows ruminants to digest bacteria in the foregut. The mammalian enzyme is related to the lysozyme found in the egg whites of hens and related species.	126
340384	cd16898	LYZ_LA	alpha lactalbumin. alpha-lactalbumin (lactose synthase B protein, LA) is a calcium-binding metalloprotein that is expressed exclusively in the mammary gland during lactation. LA is the regulatory subunit of the enzyme lactose synthase. The association of LA with the catalytic component of lactose synthase, galactosyltransferase, alters the acceptor substrate specificity of this glycosyltransferase, facilitating biosynthesis of lactose.	123
381618	cd16899	LYZ_C_invert	C-type invertebrate lysozyme. C-type lysozyme proteins of invertebrates, including digestive lysozymes 1 and 2 from Musca domestica, which aid in the use of bacteria as a food source. These lysozymes have high expression in the gut and optimal lytic activity at a lower pH. Other lysozymes in this subfamily have immunological roles; for example, Anopheles gambiae has eight lysozymes, most of which seem to have immunological roles, though some may function as digestive enzymes in larvae. C-type lysozyme (chicken or conventional type; 1,4-beta-N-acetylmuramidase) has bacteriolytic properties through the hydrolysis of beta-1,4-glycosidic linkages between N-acetylmuramic acid and N-acetyl-D-glucosamine residues in a peptidoglycan, as well as between N-acetyl-D-glucosamine residues in chitodextrins.	121
381619	cd16900	endolysin_R21-like	endolysin R21-like proteins. Unlike T4 E phage lysozyme, the endolysin R21 from Enterobacteria phage P21 has an N-terminal SAR (signal-arrest-release) domain that anchors the endolysin to the membrane in an inactive form, which acts to prevent premature lysis of the infected bacterium. The dsDNA phages of eubacteria use endolysins or muralytic enzymes in conjunction with holin, a small membrane protein, to degrade the peptidoglycan found in bacterial cell walls. Similarly, bacteria produce autolysins to facilitate the biosynthesis of their cell wall heteropolymer peptidoglycan and cell division. Endolysins and autolysins are found in viruses and bacteria, respectively. Both endolysin and autolysin enzymes cleave the glycosidic beta 1,4-bonds between the N-acetylmuramic acid and the N-acetylglucosamine of the peptidoglycan.	142
381620	cd16901	lyz_P1	P1 lysozyme Lyz-like proteins. Enterobacteria phage P1 lysozyme Lyz is secreted to the Escherichia coli periplasm where it is membrane bound and inactive. Activation involves the release from the membrane, an intramolecular thiol-disulfide isomerization and extensive structural rearrangement of the N-terminal region. The dsDNA phages of eubacteria use endolysins or muralytic enzymes in conjunction with holin, a small membrane protein, to degrade the peptidoglycan found in bacterial cell walls. Similarly, bacteria produce autolysins to facilitate the biosynthesis of their cell wall heteropolymer peptidoglycan and cell division. Endolysins and autolysins are found in viruses and bacteria, respectively. Both endolysin and autolysin enzymes cleave the glycosidic beta 1,4-bonds between the N-acetylmuramic acid and the N-acetylglucosamine of the peptidoglycan.	140
381621	cd16902	pesticin_lyz	lysozyme-like C-terminal domain of pesticin. Pesticin (Pst) is an anti-bacterial toxin produced by Yersinia pestis that acts through uptake by the target related bacteria and the hydrolysis of peptidoglycan in the periplasm. Pst contains an N-terminal translocation domain, an intermediate receptor binding domain, and a phage-lysozyme like C-terminal activity domain. Bacteriocins such as pesticin are produced by gram-negative bacteria to attack related bacterial strains. Pst is transported to the periplasm via FyuA, an outer-membrane receptor of Y. pestis and E. coli, where it hydrolyzes peptidoglycan via the cleavage of N-acetylmuramic acid and C4 of N-acetylglucosamine. Disruption of the peptidoglycan layer renders the bacteria vulnerable to lysis via osmotic pressure. The pesticin C-terminal domain resembles the lysozyme-like family, which includes soluble lytic transglycosylases (SLT), goose egg-white lysozymes (GEWL), hen egg-white lysozymes (HEWL), chitinases, bacteriophage lambda lysozymes, endolysins, autolysins, and chitosanases. All the members are involved in the hydrolysis of beta-1,4-linked polysaccharides.	178
340389	cd16903	pesticin_lyz-like	pesticin C-terminal-like domain of uncharacterized proteins. This subfamily is composed of uncharacterized proteins containing a lysozyme-like domain similar to the C-terminal domain of pesticin. Some members also contain an EF hand domain. Pesticin (Pst) is an anti-bacterial toxin produced by Yersinia pestis that acts through uptake by the target related bacteria and the hydrolysis of peptidoglycan in the periplasm. Pst contains an N-terminal translocation domain, an intermediate receptor binding domain, and a phage-lysozyme like C-terminal activity domain. Bacteriocins such as pesticin are produced by gram-negative bacteria to attack related bacterial strains. Pst is transported to the periplasm via FyuA, an outer-membrane receptor of Y. pestis and E. coli, where it hydrolyzes peptidoglycan via the cleavage of N-acetylmuramic acid and C4 of N-acetylglucosamine. Disruption of the peptidoglycan layer renders the bacteria vulnerable to lysis via osmotic pressure. The pesticin C-terminal domain resembles the lysozyme-like family, which includes soluble lytic transglycosylases (SLT), goose egg-white lysozymes (GEWL), hen egg-white lysozymes (HEWL), chitinases, bacteriophage lambda lysozymes, endolysins, autolysins, and chitosanases. All the members are involved in the hydrolysis of beta-1,4-linked polysaccharides.	150
340390	cd16904	pesticin_lyz-like	pesticin C-terminal-like domain of uncharacterized proteins. This subfamily is composed of uncharacterized proteins containing a lysozyme-like domain similar to the C-terminal domain of pesticin. Pesticin (Pst) is an anti-bacterial toxin produced by Yersinia pestis that acts through uptake by the target related bacteria and the hydrolysis of peptidoglycan in the periplasm. Pst contains an N-terminal translocation domain, an intermediate receptor binding domain, and a phage-lysozyme like C-terminal activity domain. Bacteriocins such as pesticin are produced by gram-negative bacteria to attack related bacterial strains. Pst is transported to the periplasm via FyuA, an outer-membrane receptor of Y. pestis and E. coli, where it hydrolyzes peptidoglycan via the cleavage of N-acetylmuramic acid and C4 of N-acetylglucosamine. Disruption of the peptidoglycan layer renders the bacteria vulnerable to lysis via osmotic pressure. The pesticin C-terminal domain resembles the lysozyme-like family, which includes soluble lytic transglycosylases (SLT), goose egg-white lysozymes (GEWL), hen egg-white lysozymes (HEWL), chitinases, bacteriophage lambda lysozymes, endolysins, autolysins, and chitosanases. All the members are involved in the hydrolysis of beta-1,4-linked polysaccharides.	138
341124	cd16905	YEATS_Taf14_like	YEATS domain found in Taf14 and similar proteins. Taf14 has been identified as a component of TFIIF and TFIID transcription factor complexes, various chromatin-remodeling complexes (such as SWI/SNF, INO80, and RSC) and the NuA3 histone acetyltransferase complex. DNA regulation by chromatin through histone post-translational modification and other mechanisms involves complexes with writer, eraser and reader functions. YEATS domains act as readers of the chromatin state and stimulate transcriptional activity through preferential interactions with crotonylated lysines on histones. Specifically, Taf14 has been shown to be a reader of lysine crotonylation, associated with active gene promoters and enhancers and binding acetyllysine on histone H3. The YEATS family is named for several family members: 'YNK7', 'ENL', 'AF-9', and 'TFIIF small subunit', and also contains the GAS41 protein.	118
341125	cd16906	YEATS_AF-9_like	YEATS domain found in ENL and AF-9-like proteins. Yeast AFF9 is a YEATS domain protein that binds to modified histones, with a preference for crotonyllysine over acetyllysine. Histone crotonylation upregulates gene expression in an AF9-dependent manner. This subfamily also includes eleven-nineteen-leukemia protein ENL, which binds histones H3 and H1. DNA regulation by chromatin through histone post-translational modification and other mechanisms involves complexes with writer, eraser and reader functions. YEATS domains act as readers of the chromatin state and stimulate transcriptional activity through preferential interactions with crotonylated lysines on histones. The YEATS family is named for several family members: 'YNK7', 'ENL', 'AF-9', and 'TFIIF small subunit', and also contains the GAS41 protein.	127
341126	cd16907	YEATS_YEATS2_like	YEATS domain found in YEATS2 and Drosophila D12. YEATS2 is a YEATS domain reader protein with a preference for recognition of histone H3 crotonylation on lysine 27 (H3K27cr). Histone crotonylation upregulates gene expression in an AF9-dependent manner. DNA regulation by chromatin through histone post-translational modification and other mechanisms involves complexes with writer, eraser and reader functions. YEATS domains act as readers of the chromatin state and stimulate transcriptional activity through preferential interactions with crotonylated lysines on histones. The YEATS family is named for several family members: 'YNK7', 'ENL', 'AF-9', and 'TFIIF small subunit', and also contains the GAS41 protein.	123
341127	cd16908	YEATS_Yaf9_like	YEATS domain found in Yaf9 and similar proteins. Yaf9 is a YEATS domain family protein essential for the function of the NuA4 histone acetyltransferase complex and the SWR1-C ATP-dependent chromatin remodeling complex. Yaf9 shares structural similarity with the histone chaperone Asf1; both exhibit histone H3 and H4 binding in vitro, and evidence supports that both play a role in the same histone acetylation mechanism. DNA regulation by chromatin through histone post-translational modification and other mechanisms involves complexes with writer, eraser and reader functions. YEATS domains act as readers of the chromatin state and stimulate transcriptional activity through preferential interactions with crotonylated lysines on histones. The YEATS family is named for several family members: 'YNK7', 'ENL', 'AF-9', and 'TFIIF small subunit', and also contains the GAS41 protein.	145
341128	cd16909	YEATS_GAS41_like	YEATS domain found in YEATS domain-containing protein 4 and similar proteins. Glioma amplified sequence 41 (GAS41, also known as YEATS domain-containing protein 4 or NuMA-binding protein 1) is a YEATS domain family protein that is amplified and acts as an oncogene in human glioma. GAS41 has been shown to interact with the general transcription factor TFIIF via the YEATS domain. DNA regulation by chromatin through histone post-translational modification and other mechanisms involves complexes with writer, eraser and reader functions. YEATS domains act as readers of the chromatin state and stimulate transcriptional activity through preferential interactions with crotonylated lysines on histones. The YEATS family is named for several family members: 'YNK7', 'ENL', 'AF-9', and 'TFIIF small subunit', and also contains the GAS41 protein.	137
341129	cd16910	YEATS_TFIID14_like	YEATS domain found in transcription initiation factor TFIID subunit 14 and similar proteins. This subfamily comprises YEATS domain-containing proteins, including transcription initiation factor TFIID subunits 14 and 14b of Arabidopsis, which were shown to be part of the TFIID general transcriptional regulator complex in a two-hybrid screen. DNA regulation by chromatin through histone post-translational modification and other mechanisms involves complexes with writer, eraser and reader functions. YEATS domains act as readers of the chromatin state and stimulate transcriptional activity through preferential interactions with crotonylated lysines on histones. The YEATS family is named for several family members: 'YNK7', 'ENL', 'AF-9', and 'TFIIF small subunit', and also contains the GAS41 protein.	131
350650	cd16911	AfaD_SafA-like	AfaD-like family of invasins. This family consists of Escherichia coli AfaD, Salmonella SafA and SafD, Yersinia pestis PsaA, Yersinia enterocolitica MyfA, and similar proteins. The afa gene clusters encode an afimbrial adhesive sheath produced by Escherichia coli. The adhesive sheath is composed of two proteins, AfaD and AfaE, which are independently exposed at the bacterial cell surface. AfaE is required for bacterial adhesion to HeLa cells and AfaD for the uptake of adherent bacteria into these cells. Saf-pilin pilus formation proteins SafA and SafD are the major and minor subunits, respectively, of Saf pili, which are often found in clinical isolates of Salmonella and are assembled by the chaperone-usher secretion pathway. PsaA and MyfA are the major subunits of pH 6 antigen (Psa) and Myf fimbrial homopolymers. Also included is the enteroaggregative Escherichia coli AAF/IV pilus tip protein, which is implicated in adhesion as well. During fimbria/pili assembly, polymerization occurs when the N-terminal extension (NTE) of one AfaD-like family monomer is inserted into an adjacent monomer, providing the final beta strand or G-strand, to complete the Ig-like fold, in a mechanism called the donor-strand complementation (DSC) or donor-strand exchange (DSE).	120
341130	cd16913	YkuD_like	L,D-transpeptidases/carboxypeptidases similar to Bacillus YkuD. Members of the YkuD-like family of proteins are found in a range of bacteria. The best studied member Bacillus YkuD has been shown to act as an L,D-transpeptidase that gives rise to an alternative pathway for peptidoglycan cross-linking. Another member Helicobacter pylori Csd6 functions as an L,D-carboxypeptidase and regulates helical cell shape and motility. The conserved region contains a conserved histidine and cysteine, with the cysteine thought to be an active site residue.	121
410987	cd16914	EcfT	T component of ECF-type transporters. The transmembrane component (T component) of the energy coupling-factor (ECF)-type transporter is a transmembrane protein important for vitamin uptake in prokaryotes. In addition to the T component, energy-coupling factor (ECF) transporters contain an energy-coupling module that consists of two ATP-binding proteins (known as the A and A' components) and a substrate-binding (S) component. ECF transporters comprise a subgroup of ATP-binding cassette (ABC) transporters that do not make use of water-soluble substrate binding proteins or domains, but instead employ integral membrane proteins for substrate binding, the S component, in contrast to classical ABC importers. The T component links the S component to the ATP-binding subcomplex that is composed of the A and A' components.	233
340392	cd16915	HATPase_DpiB-CitA-like	Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Escherichia coli K-12 DpiB, DcuS, and Bacillus subtilis CitS, DctS, and YufL. This family includes histidine kinase-like ATPase domains of Escherichia coli K-12 DpiB and DcuS, and Bacillus subtilis CitS, DctS and MalK histidine kinases (HKs), all of which are components of two-component signal transduction systems (TCSs). E. coli K-12 DpiB (also known as CitA) is the histidine kinase (HK) of DpiA-DpiB, a two-component signal transduction system (TCS) required for the expression of citrate-specific fermentation genes and genes involved in plasmid inheritance. E. coli K-12 DcuS (also known as YjdH) is the HK of DcuS-DcuR, a TCS that, in the presence of extracellular C4-dicarboxylates, activates the expression of the genes of anaerobic fumarate respiration and of aerobic C4-dicarboxylate uptake. CitS is the HK of Bacillus subtilis CitS-CitT, a TCS which regulates expression of CitM, the Mg-citrate transporter. Bacillus subtilis DctS forms a tripartite sensor unit (DctS/DctA/DctB) for sensing C4-dicarboxylates. Bacillus subtilis MalK (also known as YufL) is the HK of MalK-MalR (YufL-YufM), a TCS which regulates the expression of the malate transporters MaeN (YufR) and YflS, and is essential for utilization of malate in minimal medium. Proteins having this DpiB-CitA-like HATPase domain generally have sensor domains such as Cache and PAS, and a histidine kinase A (HisKA)-like SpoOB-type, alpha-helical domain.	104
340393	cd16916	HATPase_CheA-like	Histidine kinase-like ATPase domain of the chemotaxis protein histidine kinase CheA, and some hybrid sensor histidine kinases. This family includes the cytoplasmic histidine kinase (HK) CheA which, together with transmembrane receptors and the cytoplasmic adaptor protein CheW, forms the lattice at the core of the chemosensory array that controls the cellular chemotaxis of motile bacteria and archaea. CheA forms a two-component signal transduction system (TCS) with the response regulator CheY. Proteins having this CheA-like HATPase domain generally also have a histidine-phosphotransfer domain, a histidine kinase homodimeric domain, and a regulatory domain; some are hybrid sensor histidine kinases as they contain a REC signal receiver domain.	178
340394	cd16917	HATPase_UhpB-NarQ-NarX-like	Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Escherichia coli UhpB, NarQ and NarX, and Bacillus subtilis YdfH, YhcY and YfiJ. This family includes the histidine kinase-like ATPase (HATPase) domains of various histidine kinases (HKs) of two-component signal transduction systems (TCSs) such as Escherichia coli UhpB, a HK of the UhpB-UhpA TCS, NarQ and NarX, HKs of the NarQ-NarP and NarX-NarL TCSs, respectively, and Bacillus YdfH, YhcY and YfiJ HKs, of the YdfH-YdfI, YhcY-YhcZ and  YfiJ-YfiK  TCSs, respectively. In addition, it includes Bacillus YxjM, ComP, LiaS and DesK, HKs of the YxjM-YxjML, ComP-ComA, LiaS-LiaR, DesR-DesK TCSs, respectively. Proteins having this HATPase domain have a histidine kinase dimerization and phosphoacceptor domain; some have accessory domains such as GAF, HAMP, PAS and MASE sensor domains.	87
340395	cd16918	HATPase_Glnl-NtrB-like	Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Escherichia coli GlnL (synonyms NtrB and NRII). This family includes the histidine kinase-like ATPase (HATPase) domains of various two-component sensor histidine kinase (HKs), similar to Escherichia coli GlnL/NtrB/NRII HK of the two-component regulatory system (TCS) GlnL/GlnG (NtrB-NtrC, or NRII-NRI), which regulates the transcription of genes encoding metabolic enzymes and permeases in response to carbon and nitrogen status in E. coli and related bacteria. Also included in this family are Rhodobacter capsulatus NtrB, Azospirillum brasilense NtrB, Vibrio alginolyticus NtrB, Rhizobium leguminosarum biovar phaseoli NtrB, and Herbaspirillum seropedicae NtrB.  Escherichia coli GlnL/NtrB/NRII is both a kinase and a phosphatase, catalyzing the phosphorylation and dephosphorylation of GlnG/NtrC/NRI. The kinase and phosphatase activities of GlnL/NtrB/NRII are regulated by the PII signal transduction protein, which on binding to GlnL/NtrB/NRII, inhibits the kinase activity of GlnL/NtrB/NRII and activates the GlnL/NtrB/NRII phosphatase activity. Proteins having this HATPase domain also have a histidine kinase dimerization and phosphoacceptor domain (HisKA); some also contain PAS sensor domain(s).	109
340396	cd16919	HATPase_CckA-like	Histidine kinase-like ATPase domain of two-component sensor hybrid histidine kinases, similar to Brucella abortus 2308 CckA. This family includes the histidine kinase-like ATPase (HATPase) domains of various two-component hybrid sensor histidine kinase (HKs) similar to Brucella abortus 2308 CckA, which is a component of an essential protein phosphorelay that regulates expression of genes required for growth, division, and intracellular survival; phosphoryl transfer initiates from the sensor kinase CckA and proceeds via the ChpT phosphotransferase to two regulatory substrates: the DNA-binding response regulator CtrA and the phospho-receiver protein CpdR. Proteins having this HATPase domain also contain a histidine kinase dimerization and phosphoacceptor domain (HisKA), a REC signal receiver domain, and some contain PAS or PAS and GAF sensor domain(s).	116
340397	cd16920	HATPase_TmoS-FixL-DctS-like	Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Rhizobium meliloti FixL, and Rhodobacter capsulatus DctS; includes hybrid sensor histidine kinase similar to Pseudomonas mendocina TmoS. This family includes the histidine kinase-like ATPase (HATPase) domains of various histidine kinases (HKs) of two-component signal transduction systems (TCSs), such as Pseudomonas mendocina TmoS HK of the TmoS-TmoT TCS, which controls the expression of the toluene-4-monooxygenase pathway, Rhizobium meliloti FixL HK of the FixL-FixJ TCS, which regulates the expression of the genes related to nitrogen fixation in the root nodule in response to O(2) levels, and Rhodobacter capsulatus DctS of the DctS-DctR TCS, which controls synthesis of the high-affinity C4-dicarboxylate transport system. Proteins having this HATPase domain also contain a histidine kinase dimerization and phosphoacceptor domain (HisKA) and PAS sensor domain(s); many are hybrid sensor histidine kinases as they also contain a REC signal receiver domain.	104
340398	cd16921	HATPase_FilI-like	Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Methanosaeta harundinacea FilI and some hybrid sensor histidine kinases. This family includes FilI, the histidine kinase (HK) component of FilI-FilRs, a two-component signal transduction system (TCS) of the methanogenic archaeon Methanosaeta harundinacea, which is involved in regulating methanogenesis. The cytoplasmic HK core consists of a C-terminal HK-like ATPase domain (represented here) and a histidine kinase dimerization and phosphoacceptor domain (HisKA), which, in FilI, are coupled to CHASE, HAMP, PAS, and GAF sensor domains. FilI-FilRs catalyzes the phosphotransfer between FilI (HK) and FilRs (FilR1 and FilR2, response regulators) of the TCS. TCSs are predicted to be of bacterial origin, and acquired by archaea by horizontal gene transfer. This model also includes related HATPase domains such as that of Synechocystis sp. PCC6803 phytochrome-like protein Cph1. Proteins having this HATPase domain and HisKA domain also have accessory sensor domains such as CHASE, GAF, HAMP and PAS; some are hybrid sensor histidine kinases as they also contain a REC signal receiver domain.	105
340399	cd16922	HATPase_EvgS-ArcB-TorS-like	Histidine kinase-like ATPase domain of two-component sensor histidine kinases, many are hybrid sensor histidine kinases, similar to Escherichia coli EvgS, ArcB, TorS, BarA, RcsC. This family contains the histidine kinase-like ATPase (HATPase) domains of various two-component hybrid sensor histidine kinases (HKs), including the following Escherichia coli HKs: EvgS, a HK of the EvgS-EvgA two-component system (TCS) that confers acid resistance; ArcB, a HK of the ArcB-ArcA TCS that modulates the expression of numerous genes in response to respiratory growth conditions; TorS, a HK of the TorS-TorR TCS which is involved in the anaerobic utilization of trimethylamine-N-oxide; BarA, a HK of the BarA-UvrY TCS involved in the regulation of carbon metabolism; and RcsC, a HK of the RcsB-RcsC TCS which regulates the expression of the capsule operon and of the cell division gene ftsZ. Proteins having this HATPase domain also contain a histidine kinase dimerization and phosphoacceptor domain (HisKA), with most having accessory sensor domain(s) such as GAF, PAS and CHASE; many are hybrid sensor histidine kinases as they also contain a REC signal receiver domain.	110
340400	cd16923	HATPase_VanS-like	Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Enterococcus faecium VanS. This family includes the histidine kinase-like ATPase (HATPase) domains of various two-component sensor histidine kinases (HKs) such as Enterococcus faecium VanS HK of the VanS-VanR two-component regulatory system (TCS) which activates the transcription of vanH, vanA and vanX vancomycin resistance genes. It also contains E. coli YedV and PcoS, probable members of YedW-YedV TCS and PcoS-PcoR TCS, respectively. Proteins having this HATPase domain also contain a histidine kinase dimerization and phosphoacceptor domain (HisKA); most also have a HAMP sensor domain.	102
340401	cd16924	HATPase_YpdA-YehU-LytS-like	Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Escherichia coli YpdA, YehU, Bacillus subtilis LytS, and some hybrid sensor histidine kinases. This family includes the histidine kinase-like ATPase (HATPase) domains of various two-component sensor histidine kinases (HKs) such as Bacillus subtilis LytS, a HK of the two-component system (TCS) LytS-LytR needed for growth on pyruvate, and Staphylococcus aureus LytS-LytR TCS involved in the adaptation of S. aureus to cationic antimicrobial peptides. It also includes the HATPase domains of Escherichia coli YpdA and YehU, HKs of the YpdA-YpdB and YehU-YehT TCSs, which are involved together in a nutrient sensing regulatory network. Proteins having this HATPase domain also contain a histidine kinase domain (His-kinase), some having accessory sensor domain(s) such as Cache, HAMP or GAF; some are hybrid sensor histidine kinases as they also contain a REC signal receiver domain.	103
340402	cd16925	HATPase_TutC-TodS-like	Histidine kinase-like ATPase domain of hybrid sensor histidine kinases similar to Pseudomonas putida TodS and Thauera aromatica TutC. This family includes the histidine kinase-like ATPase (HATPase) domains of various two-component hybrid sensor histidine kinases (HKs) such as Pseudomonas putida TodS HK of the TodS-TodT two-component regulatory system (TCS) which controls the expression of a toluene degradation pathway. Thauera aromatica TutC may be part of a TCS that is involved in anaerobic toluene metabolism. Proteins having this HATPase domain also contain a histidine kinase dimerization and phosphoacceptor domain (HisKA), PAS sensor domain(s) and a REC domain.	110
340403	cd16926	HATPase_MutL-MLH-PMS-like	Histidine kinase-like ATPase domain of DNA mismatch repair proteins Escherichia coli MutL, human MutL homologs (MLH/ PMS), and related domains. This family includes the histidine kinase-like ATPase (HATPase) domains of Escherichia coli MutL, human MLH1 (mutL homolog 1), human PMS1 (PMS1 homolog 1, mismatch repair system component), human MLH3 (mutL homolog 3), and human PMS2 (PMS1 homolog 2, mismatch repair system component). MutL homologs (MLH/PMS) participate in MMR (DNA mismatch repair), and in addition have role(s) in DNA damage signaling and suppression of homologous recombination (recombination between partially homologous parental DNAs). The primary role of MutL in MMR is to mediate protein-protein interactions during mismatch recognition and strand removal; a ternary complex is formed between MutS, MutL, and the mismatched DNA, which activates the MutH endonuclease.	188
340404	cd16927	HATPase_Hsp90-like	Histidine kinase-like ATPase domain of human cytosolic Hsp90 and its homologs including Escherichia coli HtpG, and related domains. This family includes the histidine kinase-like ATPase (HATPase) domains of 90 kilodalton heat-shock protein (Hsp90) eukaryotic homologs including cytosolic Hsp90, mitochondrial TRAP1 (tumor necrosis factor receptor-associated protein 1), GRP94 (94 kDa glucose-regulated protein) of the endoplasmic reticulum (ER), and chloroplast Hsp90C. It also includes the bacterial homologs of Hsp90, known as HtpG (High temperature protein G). The Hsp90 family of chaperones assists other proteins in folding correctly, stabilizes them against heat stress, and aids in protein degradation.	189
340405	cd16928	HATPase_GyrB-like	Histidine kinase-like ATPase domain of the B subunit of DNA gyrase. This family includes histidine kinase-like ATPase domain of the B subunit of DNA gyrase. Bacterial DNA gyrase is a type II topoisomerase (type II as it transiently cleaves both strands of DNA) which catalyzes the introduction of negative supercoils into DNA, possibly by a mechanism in which one segment of the double-stranded DNA substrate is passed through a transient break in a second segment. It consists of GyrA and GyrB subunits in an A2B2 stoichiometry; GyrA subunits catalyze strand-breakage and reunion reactions, and GyrB subunits hydrolyze ATP. DNA gyrase is found in bacteria, plants and archaea, but as it is absent in humans it is a possible drug target for the treatment of bacterial and parasite infections.	180
340406	cd16929	HATPase_PDK-like	Histidine kinase-like ATPase domain of pyruvate dehydrogenase kinase, branched-chain alpha-ketoacid dehydrogenase kinase and related domains. This family includes the histidine kinase-like ATPase (HATPase) domains of all four PDK isoforms (pyruvate dehydrogenase kinases 1-4) that have been described in mammals, and other PDKs including Saccharomyces Pkp1p and Pkp2p. PDKs and phosphatases tightly regulate the mitochondrial pyruvate dehydrogenase complex (PDC) by reversible phosphorylation. PDC catalyzes the oxidative decarboxylation of pyruvate to acetyl-CoA, connecting glycolysis and the TCA cycle. Also included in this family is mammalian branched-chain alpha-ketoacid dehydrogenase kinase (BDK), a mitochondrial protein kinase that phosphorylates a subunit of the branched-chain alpha-ketoacid dehydrogenase (BCKD) complex, which catalyzes the oxidative decarboxylation of branched-chain alpha-ketoacids derived from leucine, isoleucine, and valine, a rate-limiting step in the oxidative degradation of these branched-chain amino acids.	169
340407	cd16930	HATPase_TopII-like	Histidine kinase-like ATPase domain of eukaryotic topoisomerase II. This family includes the histidine kinase-like ATPase (HATPase) domains of human topoisomerase IIA (TopIIA) and TopIIB, Saccharomyces cerevisiae TOP2p, and related proteins. These proteins catalyze the passage of DNA double strands through a transient double-strand break in the presence of ATP.	147
340408	cd16931	HATPase_MORC-like	Histidine kinase-like ATPase domain of human microrchidia (MORC) family CW-type zinc finger proteins MORC1-4, and related domains. This family includes the histidine kinase-like ATPase (HATPase) domain of human microrchidia (MORC) family CW-type zinc finger proteins MORC1-4, and related domains. In addition to the HATPase domain, MORC family proteins have a CW-type zinc finger domain containing four conserved cysteines and two conserved tryptophans, and coiled-coil domains at the carboxy-terminus. MORC1 has cross-species differential methylation in association with early life stress, and genome-wide association with major depressive disorder (MDD). MORC2 is involved in several nuclear processes, including transcription modulation and DNA damage repair, and exhibits a cytosolic function in lipogenesis, adipogenic differentiation, and lipid homeostasis by increasing the activity of ACLY. MORC3 regulates p53, and is an antiviral factor which plays an important role during HSV-1 and HCMV infection, and is a positive regulator of influenza virus transcription. MORC4 is highly expressed in a subset of diffuse large B-cell lymphomas and has potential as a lymphoma biomarker.	118
340409	cd16932	HATPase_Phy-like	Histidine kinase-like ATPase domain of plant phytochromes similar to Arabidopsis thaliana Phytochrome A, B, C, D and E. This family includes the histidine kinase-like ATPase (HATPase) domains of plant red/far-red photoreceptors, the phytochromes, and includes the Arabidopsis thaliana phytochrome family phyA-phyE. Following red light absorption, biologically inactive forms of phytochromes convert to active forms, which rapidly convert back to inactive forms upon far-red light irradiation. Phytochromes can be considered as having an N-terminal photosensory region to which a bilin chromophore is bound, and a C-terminal output region, which includes the HATPase domain represented here, and is involved in dimerization and presumably contributes to relaying the light signal to downstream signaling events.	113
340410	cd16933	HATPase_TopVIB-like	Histidine kinase-like ATPase domain of type IIB topoisomerase, Topo VI, subunit B. This family includes the histidine kinase-like ATPase (HATPase) domain of the B subunit of topoisomerase VI (Topo VIB). Topo VI is a heterotetrameric complex composed of two TopVIA and two TopVIB subunits and is categorized as a type II B DNA topoisomerase. It is found in archaea and also in plants. Type II enzymes cleave both strands of a DNA duplex and pass a second duplex through the resulting break in an ATP-dependent mechanism. DNA cleavage by Topo VI generates two-nucleotide 5'-protruding ends.	203
340411	cd16934	HATPase_RsbT-like	Histidine kinase-like ATPase domain of the anti sigma-B factor Bacillus subtilis serine/threonine-protein kinase RsbT, and related domains. This family includes the histidine kinase-like ATPase (HATPase) domain of Bacillus subtilis serine/threonine-protein kinase RsbT, a component of the stressosome signaling complex of Bacillus subtilis. The stressosome is formed from multiple copies of three proteins, a sensor protein RsbR, a scaffold protein RsbS, and RsbT, and responds to environmental changes by initiating a protein partner switching cascade. Stress perception increases the phosphorylation of RsbR and RsbS, by RsbT. Subsequent dissociation of RsbT from the stressosome activates the sigma-B cascade, leading to the release of the alternative sigma factor, sigma-B.	117
340412	cd16935	HATPase_AgrC-ComD-like	Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Staphylococcus aureus AgrC and Streptococcus pneumoniae ComD which are involved in quorum sensing. This family includes the histidine kinase-like ATPase (HATPase) domains of various two-component sensor histidine kinase (HKs) including Staphylococcus aureus AgrC which is an HK of the accessory gene regulator (agr) quorum sensing two-component regulatory system (TCS) AgrC-AgrA. The agr system plays a part in the transition from persistent to virulent phenotype. This family also includes Streptococcus pneumoniae ComD HK of the ComD-ComE TCS, involved in quorum sensing and genetic competence.	134
340413	cd16936	HATPase_RsbW-like	Histidine kinase-like ATPase domain of RsbW, an anti sigma-B factor and serine-protein kinase involved in regulating sigma-B during stress in Bacilli, and related domains. This family includes histidine kinase-like ATPase (HATPase) domain of RsbW, an anti sigma-B factor as well as a serine-protein kinase involved in regulating sigma-B during stress in Bacilli. The alternative sigma factor sigma-B is an important regulator of the general stress response of Bacillus cereus and B. subtilis. RsbW is an anti-sigma factor while RsbV is an anti-sigma factor antagonist (anti-anti-sigma factor). RsbW can also act as a kinase on RsbV. In a partner-switching mechanism, RsbW, RsbV, and sigma-B participate as follows: in non-stressed cells, sigma-B is present in an inactive form complexed with RsbW; in this form, sigma-B is unable to bind to RNA polymerase. Under stress, RsbV binds to RsbW, forming an RsbV-RsbW complex, and sigma-B is released to bind to RNA polymerase. RsbW may then act as a kinase on RsbV, phosphorylating a serine residue; RsbW is then released to bind to sigma-B, hence blocking its ability to bind RNA polymerase. A phosphatase then dephosphorylates RsbV so that it can again form a complex with RsbW, leading to the release of sigma-B.	91
340414	cd16937	HATPase_SMCHD1-like	Histidine kinase-like ATPase domain of structural maintenance of chromosomes flexible hinge domain containing 1 (SMCHD1) protein. This family includes histidine kinase-like ATPase (HATPase) domain of structural maintenance of chromosomes flexible hinge domain containing 1 (SMCHD1) protein, which is involved in gene silencing and in DNA damage. It has critical roles in X-chromosome inactivation and is also an important regulator of autosomal gene clusters. Upon DNA damage, SMCHD1 promotes non-homologous end joining and inhibits homologous recombination repair. SMCHD1 is implicated in the pathogenesis of facioscapulohumeral muscular dystrophy.	119
340415	cd16938	HATPase_ETR2_ERS2-EIN4-like	Histidine kinase-like ATPase domain of Arabidopsis thaliana ETR2, ERS2, and EIN4, and related domains. This family includes the histidine kinase-like ATPase domains (HATPase) of three out of the five receptors that recognize the plant hormone ethylene in Arabidopsis thaliana. These three proteins have been classified as belonging to subfamily 2: ETR2, ERS2, and EIN4. They lack most of the motifs characteristic of histidine kinases, and EIN4 is the only one in this group containing the conserved histidine that is phosphorylated in two-component and phosphorelay systems. This family also includes the HATPase domains of Escherichia coli RcsD phosphotransferase which is a component of the Rcs-signaling system, a complex multistep phosphorelay involving five proteins, and is involved in many transcriptional networks such as cell division, biofilm formation, and virulence, among others. Also included is Schizosaccharomyces pombe Mak3 (Phk1) which participates in a multi-step two-component related system which regulates H2O2-induced activation of the Sty1 stress-activated protein kinase pathway. Most proteins having this HATPase domain also contain a histidine kinase dimerization and phosphoacceptor domain (HisKA), and a GAF sensor domain; most are hybrid sensor histidine kinases as they also contain a REC signal receiver domain.	133
340416	cd16939	HATPase_RstB-like	Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Salmonella typhimurium RstB. This family includes the histidine kinase-like ATPase (HATPase) domains of various two-component sensor histidine kinase (HKs) such as Salmonella typhimurium RstB HK of the RstA-RstB two-component regulatory system (TCS), which regulates expression of the constituents participating in pyrimidine metabolism and iron acquisition, and may be required for regulation of Salmonella motility and invasion. Proteins having this HATPase domain also contain a histidine kinase dimerization and phosphoacceptor domain (HisKA), and a HAMP sensor domain.	104
340417	cd16940	HATPase_BasS-like	Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Escherichia coli BasS. This family includes the histidine kinase-like ATPase (HATPase) domains of various two-component sensor histidine kinase (HKs) similar to Escherichia coli BasS HK of the BasS-BasR two-component regulatory system (TCS). Proteins having this HATPase domain also contain a histidine kinase dimerization and phosphoacceptor domain (HisKA); some contain a HAMP sensory domain, while some an N-terminal two-component sensor kinase domain.	113
340418	cd16942	HATPase_SpoIIAB-like	Histidine kinase-like ATPase domain of SpoIIAB, an anti sigma-F factor and serine-protein kinase involved in regulating sigma-F during sporulation in Bacilli, and related domains. This family includes histidine kinase-like ATPase (HATPase) domain of SpoIIAB, an anti sigma-F factor and a serine-protein kinase involved in regulating sigma-F during sporulation in Bacilli where, early in sporulation, the cell divides into two unequal compartments: a larger mother cell and a smaller forespore. Sigma-F transcription factor is activated in the forespore directly after the asymmetric septum forms, and its spatial and temporal activation is required for sporulation. Free sigma-F can associate with the RNA polymerase core and activate transcription of the sigma-F regulon; its regulation may comprise a partner-switching mechanism involving SpoIIAB, SpoIIAA, and sigma-F as follows: SpoIIAB can form alternative complexes with either: i) sigma-F, holding it in an inactive form and preventing its association with RNA polymerase, or ii) unphosphorylated SpoIIAA and a nucleotide, either ATP or ADP. In the presence of ATP, SpoIIAB acts as a kinase to specifically phosphorylate a serine residue of SpoIIAA; this phosphorylated form has low affinity for SpoIIAB and dissociates, making SpoIIAB available to capture sigma-F. SpoIIAA may then be dephosphorylated by a SpoIIE serine phosphatase and be free to attack the SpoIIAB sigma-F complex to induce the release of sigma-F.	135
340419	cd16943	HATPase_AtoS-like	Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Escherichia coli K-12 AtoS. This family includes the histidine kinase-like ATPase (HATPase) domains of various histidine kinases (HKs) of two-component signal transduction systems (TCSs) such as Escherichia coli AtoS, an HK of the AtoS-AtoC TCS. Proteins having this HATPase domain also contain a histidine kinase dimerization and phosphoacceptor domain (HisKA); some have accessory domains such as HAMP or PAS sensor domains or CBS-pair domains.	105
340420	cd16944	HATPase_NtrY-like	Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Azorhizobium caulinodans NtrY. This family includes the histidine kinase-like ATPase (HATPase) domains of various histidine kinases (HKs) of two-component signal transduction systems (TCSs) such as Azorhizobium caulinodans ORS571 NtrY of the NtrY-NtrX TCS, which is involved in nitrogen fixation and metabolism. Proteins having this HATPase domain also contain a histidine kinase dimerization and phosphoacceptor domain (HisKA) and a HAMP sensor domain; some also have PAS sensor domains.	108
340421	cd16945	HATPase_CreC-like	Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Escherichia coli CreC. This family includes the histidine kinase-like ATPase (HATPase) domains of various two-component sensor histidine kinase (HKs) such as Escherichia coli CreC of the CreC-CreB two-component regulatory system (TCS) involved in catabolic regulation. Proteins having this HATPase domain also contain a histidine kinase dimerization and phosphoacceptor domain (HisKA), and accessory sensory domain(s) such as HAMP, CACHE or PAS.	106
340422	cd16946	HATPase_BaeS-like	Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Escherichia coli BaeS. This family includes the histidine kinase-like ATPase (HATPase) domains of various two-component sensor histidine kinase (HKs) similar to Escherichia coli BaeS HK of the BaeS/BaeR two-component regulatory system (TCS), which responds to envelope stress. Proteins having this HATPase domain also contain a histidine kinase dimerization and phosphoacceptor domain (HisKA), and a HAMP sensory domain.	109
340423	cd16947	HATPase_YcbM-like	Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Bacillus subtilis YcbM. This family includes the histidine kinase-like ATPase (HATPase) domains of various two-component sensor histidine kinase (HKs) such as Bacillus subtilis YcbM, a HK of the two-component system YcbM-YcbL. Proteins having this HATPase domain also contain a histidine kinase dimerization and phosphoacceptor domain (HisKA).	125
340424	cd16948	HATPase_BceS-YxdK-YvcQ-like	Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Bacillus subtilis BceS, YxdK, and Bacillus thuringiensis YvcQ. This family includes the histidine kinase-like ATPase (HATPase) domains of various two-component sensor histidine kinase (HKs) such as Bacillus subtilis BceS and Bacillus thuringiensis YvcQ, the HKs of the two-component regulatory systems (TCSs) BceS-BceR and YvcQ-YvcP, respectively, which are both involved in regulating bacitracin resistance. It also includes the HATPase domain of YxdK, the HK of YxdK-YxdJ TCS involved in sensing antimicrobial compounds.	109
340425	cd16949	HATPase_CpxA-like	Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Escherichia coli CpxA. This family includes the histidine kinase-like ATPase (HATPase) domains of two-component sensor histidine kinase (HKs) similar to Escherichia coli CpxA, HK of the CpxA-CpxR two-component regulatory system (TCS) which may function in acid stress and in cell wall stability. Proteins having this HATPase domain also contain a histidine kinase dimerization and phosphoacceptor domain (HisKA) and a HAMP sensor domain; some also contain a CpxA family periplasmic domain.	104
340426	cd16950	HATPase_EnvZ-like	Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Escherichia coli EnvZ and Pseudomonas aeruginosa BfmS. This family includes the histidine kinase-like ATPase (HATPase) domains of various two-component sensor histidine kinase (HKs) such as Escherichia coli EnvZ of the EnvZ-OmpR two-component regulatory system (TCS), which functions in osmoregulation. It also contains the HATPase domain of Pseudomonas aeruginosa BfmS, the HK of the BfmSR TCS, which functions in the regulation of the rhl quorum-sensing system and bacterial virulence in P. aeruginosa. Proteins having this HATPase domain also contain a histidine kinase dimerization and phosphoacceptor domain (HisKA) and a HAMP sensor domain; some also contain a periplasmic domain.	101
340427	cd16951	HATPase_EL346-LOV-HK-like	Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Erythrobacter litoralis blue light-activated histidine kinase 2. This domain family includes the histidine kinase-like ATPase (HATPase) domain of blue light-activated histidine kinase 2 of Erythrobacter litoralis (EL346). Signaling commonly occurs within HK dimers; however, EL346 functions as a monomer. Also included in this family are the HATPase domains of ethanolamine utilization sensory transduction histidine kinase (EutW), whereby regulation of ethanolamine, a carbon and nitrogen source for gut bacteria, results in autophosphorylation and subsequent phosphoryl transfer to a response regulator (EutV) containing an RNA-binding domain. Proteins having this HATPase domain also contain a histidine kinase dimerization and phosphoacceptor domain (HisKA); some have an accessory PAS sensor domain, while some have an N-terminal histidine kinase domain.	131
340428	cd16952	HATPase_EcPhoR-like	Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Escherichia coli PhoR. This family includes histidine kinase-like ATPase (HATPase) domain of two-component sensor histidine kinases similar to Escherichia coli or Vibrio cholerae PhoR, the histidine kinase (HK) of PhoB-PhoR, a two-component signal transduction system (TCS) involved in phosphate regulation. PhoR monitors extracellular inorganic phosphate (Pi) availability and PhoB, the response regulator, regulates transcription of genes of the phosphate regulon. PhoR is a bifunctional histidine autokinase/phospho-PhoB phosphatase; in phosphate deficiency, it autophosphorylates and Pi is transferred to PhoB, and when environmental Pi is abundant, it removes the phosphoryl group from phosphorylated PhoB. Other roles of PhoB-PhoR TCS have been described, including motility, biofilm formation, intestinal colonization, and virulence in V. cholerae. E. coli PhoR and Bacillus subtilis PhoR (whose HATPase domain belongs to a different family) sense very different signals in each bacterium. In E. coli the PhoR signal comes from phosphate transport mediated by the PstSCAB2 phosphate transporter and the PhoU chaperone-like protein while in B. subtilis, the PhoR activation signal comes from wall teichoic acid (WTA) metabolism.	108
340429	cd16953	HATPase_BvrS-ChvG-like	Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Brucella abortus BvrS and Sinorhizobium meliloti ChvG. This family includes the histidine kinase-like ATPase (HATPase) domains of various two-component sensor histidine kinase (HKs) such as Brucella abortus BvrS of the BvrR-BvrS two-component regulatory system (TCS), which controls cell invasion and intracellular survival, as well as Sinorhizobium meliloti and Agrobacterium tumefaciens ChvG of the ChvI-ChvG TCS necessary for endosymbiosis and pathogenicity in plants. Proteins having this HATPase domain also contain a histidine kinase dimerization and phosphoacceptor domain (HisKA), an accessory HAMP sensor domain, a periplasmic stimulus-sensing domain, and some also have a sensor N-terminal transmembrane domain.	110
340430	cd16954	HATPase_PhoQ-like	Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Escherichia coli PhoQ and Providencia stuartii AarG. This family includes histidine kinase-like ATPase (HATPase) domain of two-component sensor histidine kinases similar to Escherichia coli PhoQ and Providencia stuartii AarG. PhoQ is the histidine kinase (HK) of the PhoP-PhoQ two-component regulatory system (TCS), which responds to the levels of Mg2+ and Ca2+, controls virulence, mediates the adaptation to Mg2+-limiting environments, and regulates numerous cellular activities. Providencia stuartii AarG is a putative sensor kinase which controls the expression of the 2'-N-acetyltransferase and an intrinsic multiple antibiotic resistance (Mar) response in Providencia stuartii. The AarG product is similar to PhoQ in that it is able to restore wild-type levels of resistance to a Salmonella typhimurium phoQ mutant. However, the expression of the 2'-N-acetyltransferase gene and of aarP (a gene encoding a transcriptional activator of 2'-N-acetyltransferase) are not significantly affected by the levels of Mg2+ or Ca2+. Most proteins in this group contain a histidine kinase dimerization and phosphoacceptor domain (HisKA); some have an accessory HAMP sensor domain, and some have an intracellular membrane-interaction PhoQ sensor domain.	135
340431	cd16955	HATPase_YpdA-like	Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Escherichia coli YpdA. This family includes the histidine kinase-like ATPase (HATPase) domains of various two-component sensor histidine kinase (HKs) such as Escherichia coli YpdA, a HK of the two-component system (TCS) YpdA-YpdB which is involved in a nutrient sensing regulatory network with YehU-YehT. Proteins having this HATPase domain also contain a histidine kinase domain (His-kinase), and some have a GAF sensor domain; some contain a DUF3816 domain; some are hybrid sensor histidine kinases as they also contain a REC signal receiver domain.	102
340432	cd16956	HATPase_YehU-like	Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Escherichia coli YehU. This family includes the histidine kinase-like ATPase (HATPase) domains of various two-component sensor histidine kinase (HKs) including Escherichia coli YehU, a HK of the two-component system (TCS) YehU-YehT which is involved in a nutrient sensing regulatory network. Proteins having this HATPase domain also contain a histidine kinase domain (His-kinase); some have a GAF sensor domain while some have a cupin domain.	101
340433	cd16957	HATPase_LytS-like	Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Bacillus subtilis LytS and Staphylococcus aureus LytS. This family includes the histidine kinase-like ATPase (HATPase) domains of various two-component sensor histidine kinase (HKs) such as Bacillus subtilis LytS, a HK of the two-component system (TCS) LytS-LytR needed for growth on pyruvate, and Staphylococcus aureus LytS-LytR TCS involved in the adaptation of S. aureus to cationic antimicrobial peptides. Proteins having this HATPase domain also contain a histidine kinase domain (His-kinase), and a GAF sensor domain; most contain a DUF3816 domain.	106
341131	cd16961	RMtype1_S_TRD-CR_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR) and similar domains. The restriction-modification (RM) system S subunit generally consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This superfamily represents a single TRD-CR unit; in addition to type I TRD-CR units, it includes RMtype1_S_TRD-CR_like domains of various putative Helicobacter type II restriction enzymes and methyltransferases, such as Hci611ORFHP and HfeORF12890P, as well as TRD-CR-like sequence-recognition domains of the M subunit of putative type I DNA methyltransferase such as M2.CinURNWORF2828P and M.Mae7806ORF3969P.	178
340813	cd16962	RuvC	Crossover junction endodeoxyribonuclease RuvC. Crossover junction endodeoxyribonuclease RuvC is also called Holliday junction resolvase RuvC. It is part of the RuvABC pathway in Escherichia coli and other Gram-negative bacteria that is involved in processing Holliday junctions, which are formed by the reciprocal exchange of strands between two DNA duplexes. Holliday junction resolvases (HJRs) are endonucleases that specifically resolve Holliday junction DNA intermediates during homologous recombination. RuvC is thought to bind either on the open, DNA exposed face of a single RuvA tetramer, or to replace one of the two tetramers. Binding is proposed to be mediated by an unstructured loop on RuvC, which becomes structured on binding RuvA. RuvC can be bound to the complex in either orientation, therefore resolving Holliday junctions in either a horizontal or vertical manner. HJRs occur in archaea, bacteria, and in the mitochondria of certain fungi. These are referred to as the RuvC family of Holliday junction resolvases, RuvC being the Escherichia coli HJR. RuvC and its orthologs are homodimers and display structural similarity to RNase H and Hsp70.	153
340814	cd16963	CCE1	fungal mitochondrial Holliday junction resolvases similar to Saccharomyces cerevisiae CCE1. Saccharomyces cerevisiae Cruciform cutting endonuclease 1 (CCE1) is a Holliday junction resolvase specific for 4-way junctions. CCE1 is involved in the maintenance of mitochondrial DNA. Holliday junction resolvases (HJRs) are endonucleases that specifically resolve Holliday junction DNA intermediates during homologous recombination. Holliday junctions are formed by the reciprocal exchange of strands between two DNA duplexes. HJRs occur in archaea, bacteria, and in the mitochondria of certain fungi; they may form homodimers and display structural similarity to RNase H and Hsp70.	290
340815	cd16964	YqgF	putative pre-16S rRNA nuclease YqgF and RuvX family. Escherichia coli YqgF has been shown to act as a pre-16S rRNA nuclease, presumably as a monomer. It is involved in the processing of pre-16S rRNA during ribosome maturation. The RuvX gene product from Mycobacterium tuberculosis was shown to act, in a dimeric form, as a Holliday junction resolvase (HJR). HJRs are endonucleases that specifically resolve Holliday junction DNA intermediates during homologous recombination. Holliday junctions are formed by the reciprocal exchange of strands between two DNA duplexes. HJRs occur in archaea, bacteria, and in the mitochondria of certain fungi; they may form homodimers and display structural similarity to RNase H and Hsp70.	132
341215	cd16965	Alpha_kinase_ChaK	Alpha-kinase domain of channel kinases. This group is composed of channel kinases 1 (ChaK1) and 2 (ChaK2), and similar proteins. ChaK1 and ChaK2 are also called transient receptor potential cation channel subfamily M members 7 (TRPM7) and 6 (TRPM6), respectively. They are fusion proteins containing a transmembrane ion pore or channel and a C-terminal alpha-kinase domain, both of which are functional. They are both cation-selective channels that preferentially permeate Zn2+, Mg2+, and Ca2+ ions. They are central regulators of Mg2+ and Ca2+ homeostasis. TRPM7 is ubiquitously expressed while TRPM6 is highly expressed in specific tissues such as the kidney and intestine. Alpha-kinase is an atypical protein kinase catalytic domain with no detectable similarity to conventional protein serine/threonine kinases. The alpha-kinase family was named after the unique mode of substrate recognition by its initial members, the Dictyostelium heavy chain kinases, which targeted protein sequences that adopt an alpha-helical conformation. More recently, alpha-kinases were found to also target residues in non-helical regions.	239
341216	cd16966	Alpha_kinase_ALPK2_3	Alpha-kinase domain of alpha-protein kinases 2 and 3. Alpha-protein kinases 2 (ALPK2) and 3 (ALPK3) are also called heart alpha-protein kinase (HAK) and muscle alpha-protein kinase (MAK), respectively. They both contain a C-terminal alpha-kinase domain and two immunoglobulin (Ig)-like domains. Loss of function mutations in ALPK3 can cause early-onset and familial cardiomyopathy in humans. The ALPK2 gene may also be a novel candidate gene for inherited hypertension in Dahl rats. Alpha-kinase is an atypical protein kinase catalytic domain with no detectable similarity to conventional protein serine/threonine kinases. The alpha-kinase family was named after the unique mode of substrate recognition by its initial members, the Dictyostelium heavy chain kinases, which targeted protein sequences that adopt an alpha-helical conformation. More recently, alpha-kinases were found to also target residues in non-helical regions.	239
341217	cd16967	Alpha_kinase_eEF2K	Alpha-kinase domain of eukaryotic elongation factor-2 kinase. Eukaryotic elongation factor-2 kinase (eEF2K) is also called calcium/calmodulin (CaM)-dependent eEF2K. It phosphorylates eukaryotic elongation factor-2 (EEF2) at a single site, leading to its inactivation and inability to bind ribosomes, and slowing down the elongation stage of protein synthesis. It has been linked to many human diseases including cardiovascular conditions (atherosclerosis) and pulmonary arterial hypertension, as well as solid tumors and neurological disorders. eEF2K is an atypical protein kinase containing a CaM binding region, an alpha-kinase catalytic domain, and TPR-like Sel1 repeats at the C-terminus. Alpha-kinase is an atypical protein kinase catalytic domain with no detectable similarity to conventional protein serine/threonine kinases. The alpha-kinase family was named after the unique mode of substrate recognition by its initial members, the Dictyostelium heavy chain kinases, which targeted protein sequences that adopt an alpha-helical conformation. More recently, alpha-kinases were found to also target residues in non-helical regions.	216
341218	cd16968	Alpha_kinase_MHCK_like	Alpha-kinase domain of myosin heavy chain kinase and similar domains. This group is composed of alpha-kinase domains of Dictyostelium discoideum myosin heavy chain kinases A-D (MHCKA, MHCKB, MHCKC, MHCKD), alpha-protein kinase 1 (AK1), and similar proteins. The myosin heavy chain kinases are involved in regulating myosin II filament assembly in Dictyostelium discoideum. They phosphorylate target threonine residues located in the carboxyl-terminal portion of the myosin II heavy chain (MHC) tail, resulting in filament disassembly. The different MHCK isoforms display different spatial regulation, indicating specific roles for each isoform in fine tuning the Dictyostelium actomyosin cytoskeleton. They all contain an alpha-kinase domain as well as WD40 repeats at the C-terminus. AK1 contains an N-terminal Arf-GAP domain and a central alpha-kinase domain. Alpha-kinase is an atypical protein kinase catalytic domain with no detectable similarity to conventional protein serine/threonine kinases. The alpha-kinase family was named after the unique mode of substrate recognition by its initial members, the Dictyostelium heavy chain kinases, which targeted protein sequences that adopt an alpha-helical conformation. More recently, alpha-kinases were found to also target residues in non-helical regions.	202
341219	cd16969	Alpha_kinase_ALPK1	Alpha-kinase domain of alpha-protein kinase 1. Alpha-protein kinase 1 is also called chromosome 4 kinase or lymphocyte alpha-protein kinase (LAK). ALPK1 is implicated in epithelial cell polarity and exocytic vesicular transport towards the apical plasma membrane. It resides on Golgi-derived vesicles where it phosphorylates myosin IA, a motor protein that regulates the delivery of vesicles to the plasma-membrane. It may be associated with inflammation-related diseases such as gout and type 2 diabetes mellitus. ALPK1 contains a C-terminal alpha-kinase domain, an atypical protein kinase catalytic domain with no detectable similarity to conventional protein serine/threonine kinases. The alpha-kinase family was named after the unique mode of substrate recognition by its initial members, the Dictyostelium heavy chain kinases, which targeted protein sequences that adopt an alpha-helical conformation. More recently, alpha-kinases were found to also target residues in non-helical regions.	227
341220	cd16970	Alpha_kinase_VwkA_like	Alpha-kinase domain of Dictyostelium discoideum VwkA and similar domains. Dictyostelium discoideum alpha-protein kinase VwkA is also called von Willebrand factor A alpha-kinase or vWF kinase. It influences myosin II abundance and assembly behavior as vWKA gene disruption leads to significant myosin II overassembly. VwkA also serves a critical conserved role in the periodic contractions of the contractile vacuole through its regulation of the myosin II cortical cytoskeleton. It contains a vWFa domain (named after its homology to von Willebrand factor A, a plasma glycoprotein essential for proper blood clotting) and a C-terminal alpha-kinase domain. Alpha-kinase is an atypical protein kinase catalytic domain with no detectable similarity to conventional protein serine/threonine kinases. The alpha-kinase family was named after the unique mode of substrate recognition by its initial members, the Dictyostelium heavy chain kinases, which targeted protein sequences that adopt an alpha-helical conformation. More recently, alpha-kinases were found to also target residues in non-helical regions.	227
341221	cd16971	Alpha_kinase_ChaK1_TRMP7	Alpha-kinase domain of channel kinase 1, also called transient receptor potential cation channel subfamily M member 7. Channel kinase 1 (ChaK1), also called transient receptor potential cation channel subfamily M member 7 (TRPM7) or long transient receptor potential channel 7 (LTrpC7), is a fusion protein containing a transmembrane ion pore or channel and a C-terminal alpha-kinase domain, both of which are functional. It is ubiquitously expressed and is a cation-selective channel that preferentially permeates Zn2+, Mg2+, and Ca2+ ions. It is a central regulator of Mg2+ and Ca2+ homeostasis. TRPM7 plays a role in cancer proliferation, stroke, hydrogen peroxide dependent neurodegeneration, and heavy metal toxicity. Alpha-kinase is an atypical protein kinase catalytic domain with no detectable similarity to conventional protein serine/threonine kinases. The alpha-kinase family was named after the unique mode of substrate recognition by its initial members, the Dictyostelium heavy chain kinases, which targeted protein sequences that adopt an alpha-helical conformation. More recently, alpha-kinases were found to also target residues in non-helical regions.	239
341222	cd16972	Alpha_kinase_ChaK2_TRPM6	Alpha-kinase domain of channel kinase 2, also called transient receptor potential cation channel subfamily M member 6. Channel kinase 2 (ChaK2), also called transient receptor potential cation channel subfamily M member 6 (TRPM6) or melastatin-related TRP cation channel 6, is a fusion protein containing a transmembrane ion pore or channel and a C-terminal alpha-kinase domain, both of which are functional. It is highly expressed in the kidney and intestine. It is a cation-selective channel that preferentially permeates Zn2+, Mg2+, and Ca2+ ions. It is a central regulator of Mg2+ and Ca2+ homeostasis. TRPM6 is considered to be the Mg2+ entry pathway in the distal convoluted tubule of the kidney, where it functions as a gatekeeper for controlling the body's Mg2+ balance. Mutations in the TRPM6 gene cause the autosomal recessive disorder hypomagnesemia with secondary hypocalcemia, which often results in severe muscular and neurologic complications from early infancy that can lead to neurologic damage or cardiac arrest if left untreated. Alpha-kinase is an atypical protein kinase catalytic domain with no detectable similarity to conventional protein serine/threonine kinases. The alpha-kinase family was named after the unique mode of substrate recognition by its initial members, the Dictyostelium heavy chain kinases, which targeted protein sequences that adopt an alpha-helical conformation. More recently, alpha-kinases were found to also target residues in non-helical regions.	239
341223	cd16973	Alpha_kinase_ALPK3	Alpha-kinase domain of alpha-protein kinase 3. Alpha-protein kinase 3 (ALPK3) is also called muscle alpha-protein kinase (MAK) or myocytic induction/differentiation originator (Midori). Its expression is restricted to fetal and adult heart and adult skeletal muscle, and is localized in the nucleus. It is thought to act as a transcriptional regulator implicated in early cardiac development. Loss of function mutations in ALPK3 can cause early-onset and familial cardiomyopathy in humans. ALPK3 contains a C-terminal alpha-kinase domain and two immunoglobulin (Ig)-like domains. Alpha-kinase is an atypical protein kinase catalytic domain with no detectable similarity to conventional protein serine/threonine kinases. The alpha-kinase family was named after the unique mode of substrate recognition by its initial members, the Dictyostelium heavy chain kinases, which targeted protein sequences that adopt an alpha-helical conformation. More recently, alpha-kinases were found to also target residues in non-helical regions.	239
341224	cd16974	Alpha_kinase_ALPK2	Alpha-kinase domain of alpha-protein kinase 2. Alpha-protein kinase 2 (ALPK2) is also called heart alpha-protein kinase (HAK). Little functional information is known about ALPK2. In a three-dimensional colonic-crypt model, it has been identified as crucial for luminal apoptosis and expression of DNA repair-related genes, possibly in the transition of normal colonic crypt to adenoma. The ALPK2 gene may also be a novel candidate gene for inherited hypertension in Dahl rats. ALPK2 contains a C-terminal alpha-kinase domain and two immunoglobulin (Ig)-like domains. Alpha-kinase is an atypical protein kinase catalytic domain with no detectable similarity to conventional protein serine/threonine kinases. The alpha-kinase family was named after the unique mode of substrate recognition by its initial members, the Dictyostelium heavy chain kinases, which targeted protein sequences that adopt an alpha-helical conformation. More recently, alpha-kinases were found to also target residues in non-helical regions.	239
340434	cd16975	HATPase_SpaK_NisK-like	Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Bacillus subtilis SpaK and Lactococcus lactis NisK. This family includes histidine kinase-like ATPase (HATPase) domain of two-component sensor histidine kinases similar to Bacillus subtilis SpaK and Lactococcus lactis NisK. SpaK is the histidine kinase (HK) of the SpaK-SpaR two-component regulatory system (TCS), which is involved in the regulation of the biosynthesis of lantibiotic subtilin. NisK is the HK of the NisK-NisR TCS, which is involved in the regulation of the biosynthesis of lantibiotic nisin. SpaK and NisK may function as membrane-associated protein kinases that phosphorylate SpaR and NisR, respectively, in response to environmental signals.	107
340435	cd16976	HATPase_HupT_MifS-like	Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Rhodobacter capsulatus HupT and Pseudomonas aeruginosa MifS. This family includes the histidine kinase-like ATPase (HATPase) domains of various two-component sensor histidine kinase (HKs) such as Rhodobacter capsulatus HupT of the HupT-HupR two-component regulatory system (TCS), which regulates the synthesis of HupSL, a membrane bound [NiFe]hydrogenase. It also contains the HATPase domain of Pseudomonas aeruginosa MifS, the HK of the MifS-MifR TCS, which may be involved in sensing alpha-ketoglutarate and regulating its transport and subsequent metabolism. Proteins having this HATPase domain also contain a histidine kinase dimerization and phosphoacceptor domain (HisKA); some also have a C-terminal PAS sensor domain.	102
340774	cd16977	VHS_GGA	VHS (Vps27/Hrs/STAM) domain of GGA (Golgi-localized, Gamma-ear-containing, Arf-binding) subfamily. GGA (Golgi-localized, Gamma-ear-containing, Arf-binding) comprises a subfamily of ubiquitously expressed, monomeric, motif-binding cargo/clathrin adaptor proteins involved in membrane trafficking between the Trans-Golgi Network (TGN) and endosomes. The VHS domain has a superhelical structure similar to the structure of the ARM (Armadillo) repeats and is present at the N-termini of proteins. GGA proteins have a multidomain structure consisting of an N-terminal VHS domain linked by a short proline-rich linker to a GAT (GGA and TOM) domain, which is followed by a long flexible linker to the C-terminal appendage, GAE (Gamma-Adaptin Ear) domain. The VHS domain of GGA proteins binds to the acidic-cluster dileucine (DxxLL) motif found on the cytoplasmic tails of cargo proteins trafficked between the Trans-Golgi Network and the endosomal system.	133
340775	cd16978	VHS_HSE1	VHS (Vps27/Hrs/STAM) domain of Class E vacuolar protein-sorting machinery protein HSE1. Class E vacuolar protein-sorting machinery protein HSE1, together with Vps27, comprises the ESCRT-0 complex, the sorting receptor for ubiquitinated cargo proteins at the multivesicular body (MVB). The complex directly binds to ubiquitinated transmembrane proteins and recruits both ubiquitin ligases and deubiquitinating enzymes. It is also required for the efficient recycling of late Golgi proteins including the carboxypeptidase Y (CPY) sorting receptor, Vps10. Similar to metazoan STAMs, HSE1 contains: an N-terminal VHS domain, which is involved in cytokine-mediated intracellular signal transduction and has a superhelical structure similar to the structure of ARM (Armadillo) repeats; a Ubiquitin-Interacting Motif (UIM); a SH3 (Src Homology 3) domain, a well-established protein-protein interaction domain; and a GAT (GGA and TOM) domain, which is essential for the normal sorting function of HSE1.	134
340776	cd16979	VHS_Vps27	VHS (Vps27/Hrs/STAM) domain of Vacuolar protein sorting-associated protein 27. Vacuolar protein sorting-associated protein 27 (Vps27 or Vps27p) is also called Golgi retention defective protein 11, and is the yeast homolog of Hrs (Hepatocyte growth factor-regulated tyrosine kinase substrate). Together with class E vacuolar protein-sorting machinery protein HSE1, it comprises the ESCRT-0 complex, the sorting receptor for ubiquitinated cargo proteins at the multivesicular body (MVB). The complex directly binds to ubiquitinated transmembrane proteins and recruits both ubiquitin ligases and deubiquitinating enzymes. It is also required for the efficient recycling of late Golgi proteins including the carboxypeptidase Y (CPY) sorting receptor, Vps10. Vps27 contains similar domains and motifs to Hrs; it contains an N-terminal VHS domain, which has a superhelical structure similar to the structure of ARM (Armadillo) repeats, a FYVE (Fab1p, YOTB, Vac1p, and EEA1) zinc finger domain, two Ubiquitin-Interacting Motifs (UIMs), a GAT (GGA and TOM) domain, two P(S/T)XP motifs that recruit ESCRT-I, and a short peptide motif near the C-terminus that recruits clathrin.	141
340777	cd16980	VHS_Lsb5	VHS (Vps27/Hrs/STAM) domain of LAS seventeen-binding protein 5. LAS seventeen-binding protein 5 (LAS17-binding protein 5, Lsb5, or Lsb5p) localizes to the plasma membrane and plays a role in endocytosis in yeast. It interacts with actin regulators Sla1p and Las17p, ubiquitin, and Arf3p, coupling actin dynamics to membrane trafficking processes. Lsb5p contains an N-terminal VHS domain and a GAT (GGA and TOM) domain. The VHS domain has a superhelical structure similar to the structure of ARM (Armadillo) repeats. It is a right-handed superhelix of eight alpha helices. The VHS domain has been found in a number of proteins, some of which have been implicated in intracellular trafficking and sorting.	132
340778	cd16981	CID_RPRD_like	CID (CTD-Interacting Domain) of Regulation of nuclear pre-mRNA domain-containing proteins. This family is composed of Regulation of nuclear pre-mRNA domain-containing proteins 1A (RPRD1A), 1B (RPRD1B), 2 (RPRD2), yeast Rtt103, and similar proteins. RPRD1A, RPRD1B, and RPRD2 are CID (CTD-Interacting Domain) containing proteins that co-purify with RNA polymerase (Pol) II (RNAP II) and three other RNAP II-associated proteins, RPAP2, GRINL1A and RECQL5, but not with the Mediator complex. Yeast transcription termination factor Rtt103 is a CID containing protein that functions in DNA damage response. CID binds tightly to the carboxy-terminal domain (CTD) of RNAP II. During transcription, RNAP II synthesizes eukaryotic messenger RNA. Transcription is coupled to RNA processing through the CTD, which consists of up to 52 repeats of the sequence Tyr1-Ser2-Pro3-Thr4-Ser5-Pro6-Ser7. CID contains eight alpha-helices in a right-handed superhelical arrangement, which closely resembles that of the VHS domains and ARM (Armadillo) repeat proteins, except for its two amino-terminal helices.	125
340779	cd16982	CID_Pcf11	CID (CTD-Interacting Domain) of Pcf11. Pcf11 is conserved across eukaryotes. The best studied protein is Saccharomyces cerevisiae Pcf11, also called protein 1 of CF I, an essential subunit of the cleavage factor IA (CFIA) complex which is required for polyadenylation-dependent pre-mRNA 3'-end processing and RNA polymerase (Pol) II (RNAP II) transcription termination. Human Pcf11, also referred to as pre-mRNA cleavage complex 2 protein Pcf11, has been shown to enhance degradation of RNAP II-associated nascent RNA and transcriptional termination. The family also includes plant PCFS4 (Pcf11-similar-4 protein or Polyadenylation and cleavage factor homolog 4) and Caenorhabditis elegans Polyadenylation and cleavage factor homolog 11. CID binds tightly to the carboxy-terminal domain (CTD) of RNAP II. Pcf11 CID preferentially interacts with CTD phosphorylated at Ser2. During transcription, RNAP II synthesizes eukaryotic messenger RNA. Transcription is coupled to RNA processing through the CTD, which consists of up to 52 repeats of the sequence Tyr1-Ser2-Pro3-Thr4-Ser5-Pro6-Ser7. CID contains eight alpha-helices in a right-handed superhelical arrangement, which closely resembles that of the VHS domains and ARM (Armadillo) repeat proteins, except for its two amino-terminal helices.	127
340780	cd16983	CID_SCAF8_like	CID (CTD-Interacting Domain) of SR-related and CTD-associated factor 8 and similar proteins. This subfamily includes SR-related and CTD-associated factors 8 (SCAF8) and 4 (SCAF4), and similar proteins. SCAF4 is also called Splicing factor arginine serine rich 15 (SFRS15). Members may play roles in mRNA processing. Both SCAF4 and SCAF8 contain a CTD-interacting domain (CID) at the amino terminus and a Ser/Arg-rich domain followed by an RNA recognition motif. CID binds tightly to the carboxy-terminal domain (CTD) of RNA polymerase (Pol) II (RNAP II). During transcription, RNAP II synthesizes eukaryotic messenger RNA. Transcription is coupled to RNA processing through the CTD, which consists of up to 52 repeats of the sequence Tyr1-Ser2-Pro3-Thr4-Ser5-Pro6-Ser7. CID contains eight alpha-helices in a right-handed superhelical arrangement, which closely resembles that of the VHS domains and ARM (Armadillo) repeat proteins, except for its two amino-terminal helices.	131
340781	cd16984	CID_Nrd1_like	CID (CTD-Interacting Domain) of Nrd1 and similar proteins. This subfamily includes Saccharomyces cerevisiae protein Nrd1, Schizosaccharomyces pombe Rpb7-binding protein Seb1, and similar proteins. Nrd1 cooperates with Nab3 and Sen1, also called the Nrd1-Nab3-Sen1 (NNS) complex, to terminate the transcription by RNA polymerase (Pol) II (RNAPII) of many noncoding RNAs (ncRNAs), including small nuclear RNAs (snRNAs), small nucleolar RNAs (snoRNAs), and cryptic unstable transcripts (CUTs). Schizosaccharomyces pombe Seb1 does not function in an NNS-like termination pathway but promotes polyadenylation site selection of coding and noncoding genes. It cotranscriptionally controls alternative polyadenylation. CID binds tightly to the carboxy-terminal domain (CTD) of RNAP II. Nrd1 CID preferentially interacts with CTD phosphorylated at Ser5. During transcription, RNAP II synthesizes eukaryotic messenger RNA. Transcription is coupled to RNA processing through the CTD, which consists of up to 52 repeats of the sequence Tyr1-Ser2-Pro3-Thr4-Ser5-Pro6-Ser7. CID contains eight alpha-helices in a right-handed superhelical arrangement, which closely resembles that of the VHS domains and ARM (Armadillo) repeat proteins, except for its two amino-terminal helices.	145
340782	cd16985	ANTH_N_AP180	ANTH (AP180 N-Terminal Homology) domain, N-terminal region, of adaptor protein 180 (AP180) subfamily. The Adaptor Protein 180 (AP180) subfamily members are phosphatidylinositol-binding clathrin assembly proteins, including mammalian clathrin coat assembly protein AP180 and Clathrin Assembly Lymphoid Myeloid Leukemia protein (CALM), Drosophila LAP (also called Like-AP180 or AP180), and Caenorhabditis elegans Uncoordinated protein 11 (unc-11, also called AP180-like adaptor protein). They are components of the adaptor complexes which link clathrin to receptors in coated vesicles. AP180 and CALM play important roles in clathrin-mediated endocytosis. AP180, also called 91 kDa synaptosomal-associated protein (SNAP91) or phosphoprotein F1-20, is a brain-specific clathrin-binding protein which stimulates clathrin assembly during the recycling of synaptic vesicles. CALM, also called phosphatidylinositol binding clathrin assembly protein (PICALM), is ubiquitously expressed. Members of this subfamily contain ANTH domains, which bind both inositol phospholipids and proteins, and contribute to the nucleation and formation of clathrin coats on membranes. The ANTH domain is a unique module whose N-terminal half is structurally similar to the Epsin N-Terminal Homology (ENTH) and Vps27/Hrs/STAM (VHS) domains, containing a superhelix of eight alpha helices. In addition, it contains a coiled-coil C-terminal half with structural similarity to spectrin repeats. It binds phosphoinositide PtdIns(4,5)P2 at a short conserved motif K[X]9[K/R][H/Y] between helices 1 and 2. This model describes the N-terminal region of ANTH domains of the Adaptor Protein 180 (AP180) subfamily.	117
340783	cd16986	ANTH_N_Sla2p_HIP1_like	ANTH (AP180 N-Terminal Homology) domain, N-terminal region, of Sla2p/HIP1/HIP1R subfamily. Members of the Sla2p/HIP1/HIP1R subfamily share a common domain architecture, containing an N-terminal ANTH, a central clathrin-binding coiled-coil, and a C-terminal actin-binding talin-like (also called I/LWEQ) domains. HIP1 was identified in 1997 as an interactor of huntingtin; when mutated, it is involved in the neurodegenerative disorder Huntington's disease. Both HIP1 and HIP1R promote clathrin assembly in vitro. Yeast Sla2p is a regulator of membrane cytoskeleton assembly. ANTH domains bind both inositol phospholipids and proteins, and contribute to the nucleation and formation of clathrin coats on membranes. The ANTH domain is a unique module whose N-terminal half is structurally similar to the Epsin N-Terminal Homology (ENTH) and Vps27/Hrs/STAM (VHS) domains, containing a superhelix of eight alpha helices. In addition, it contains a coiled-coil C-terminal half with structural similarity to spectrin repeats. It binds phosphoinositide PtdIns(4,5)P2 at a short conserved motif K[X]9[K/R][H/Y] between helices 1 and 2. While the ANTH domain of Sla2p preferentially binds PtdIns(4,5)P2, which is considered to be an interaction hub in the clathrin interactome, mammalian HIP1 and HIP1R were found to preferentially bind PtdIns(3,4)P2 and PtdIns(3,5)P2, respectively. This model describes the N-terminal region of ANTH domains of the Sla2p/HIP1/HIP1R subfamily.	117
340784	cd16987	ANTH_N_AP180_plant	ANTH (AP180 N-Terminal Homology) domain, N-terminal region, of plant Clathrin coat assembly protein AP180 and similar proteins. This subfamily is composed of plant clathrin coat assembly protein AP180 and other ANTH domain containing proteins that are yet to be characterized. Arabidopsis thaliana AP180 (At-AP180) is a binding partner of plant alphaC-adaptin; it functions as a clathrin assembly protein that promotes the formation of cages with an almost uniform size distribution. In addition to At-AP180, Arabidopsis thaliana contains many ANTH domain containing proteins labelled as putative clathrin assembly proteins included in this subfamily such as At4g02650, At5g10410, At2g25430, and At1g33340, among others. ANTH domains bind both inositol phospholipids and proteins, and contribute to the nucleation and formation of clathrin coats on membranes. The ANTH domain is a unique module whose N-terminal half is structurally similar to the Epsin N-Terminal Homology (ENTH) and Vps27/Hrs/STAM (VHS) domains, containing a superhelix of eight alpha helices. In addition, it contains a coiled-coil C-terminal half with structural similarity to spectrin repeats. It binds phosphoinositide PtdIns(4,5)P2 at a short conserved motif K[X]9[K/R][H/Y] between helices 1 and 2. This model describes the N-terminal region of ANTH domains of plant clathrin coat assembly protein AP180 and similar proteins.	122
340785	cd16988	ANTH_N_YAP180	ANTH (AP180 N-Terminal Homology) domain, N-terminal region, of yeast clathrin coat assembly protein AP180 (YAP180) and similar proteins. This subfamily includes yeast clathrin coat assembly protein AP180 (YAP180) and similar proteins. There are two YAP180 proteins in Saccharomyces cerevisiae, AP180A (yAP180A or YAP1801) and AP180B (yAP180B or YAP1802). They are involved in endocytosis and clathrin cage assembly. ANTH domains bind both inositol phospholipids and proteins, and contribute to the nucleation and formation of clathrin coats on membranes. The ANTH domain is a unique module whose N-terminal half is structurally similar to the Epsin N-Terminal Homology (ENTH) and Vps27/Hrs/STAM (VHS) domains, containing a superhelix of eight alpha helices. In addition, it contains a coiled-coil C-terminal half with structural similarity to spectrin repeats. It binds phosphoinositide PtdIns(4,5)P2 at a short conserved motif K[X]9[K/R][H/Y] between helices 1 and 2. This model describes the N-terminal region of ANTH domains of yeast clathrin coat assembly protein AP180 (YAP180) and similar proteins.	117
340786	cd16989	ENTH_EpsinR	Epsin N-Terminal Homology (ENTH) domain of Epsin-related protein. Epsin-related protein (EpsinR) is also called clathrin interactor 1 (Clint), enthoprotin, or epsin-4. It is a clathrin-coated vesicle (CCV) protein that binds to membranes enriched in phosphatidylinositol 4,5-bisphosphate (PtdIns(4,5)P2), clathrin, and the gamma appendage domain of the adaptor protein complex 1 (AP1). It contains an Epsin N-Terminal Homology (ENTH) domain, an evolutionarily conserved protein module found primarily in proteins that participate in clathrin-mediated endocytosis. The ENTH domain is highly similar to the N-terminal region of the AP180 N-Terminal Homology (ANTH_N) domain. ENTH and ANTH_N domains are structurally similar to the VHS domain and are composed of a superhelix of eight alpha helices. ENTH domains bind both, inositol phospholipids with preference for PtdIns(4,5)P2, and proteins, and contribute to the nucleation and formation of clathrin coats on membranes. ENTH domains also function in the development of membrane curvature through lipid remodeling during the formation of clathrin-coated vesicles. The ENTH domain of human epsinR binds directly to the helical bundle domain of the mouse SNARE Vti1b; soluble NSF attachment protein receptors (SNAREs) are type II transmembrane proteins that have critical roles in providing the specificity and energy for transport-vesicle fusion. Specific ENTH domains may also function as protein cargo selection/recognition modules. ENTH and ANTH (E/ANTH)-containing proteins have recently been shown to function with adaptor protein-1 and GGA adaptors at the Trans-Golgi Network, which suggests that E/ANTH domains are universal components of the machinery for clathrin-mediated membrane budding.	130
340787	cd16990	ENTH_Epsin	Epsin N-Terminal Homology (ENTH) domain of Epsin family. Members of the epsin family play an important role as accessory proteins in clathrin-mediated endocytosis. They are important factors in clathrin-coated vesicle (CCV) generation. They contribute to membrane deformation and play a key function as adaptor proteins, coupling various components of clathrin-mediated uptake. They also have an important role in selecting and recognizing cargo. Three isoforms have been identified in mammals, epsin-1 to -3, and these are conserved in vertebrates. Epsin-1 is highly enriched and represents the dominant isoform in the brain. It is required for proper synaptic vesicle retrieval and modulates the endocytic capacity of synaptic vesicles. Epsins contain an Epsin N-Terminal Homology (ENTH) domain, an evolutionarily conserved protein module found primarily in proteins that participate in clathrin-mediated endocytosis. The ENTH domain is highly similar to the N-terminal region of the AP180 N-Terminal Homology (ANTH_N) domain. ENTH and ANTH_N domains are structurally similar to the VHS domain and are composed of a superhelix of eight alpha helices. ENTH domains bind both, inositol phospholipids with preference for PtdIns(4,5)P2, and proteins, and contribute to the nucleation and formation of clathrin coats on membranes. ENTH domains also function in the development of membrane curvature through lipid remodeling during the formation of CCVs. ENTH and ANTH (E/ANTH)-containing proteins have recently been shown to function with adaptor protein-1 and GGA adaptors at the Trans-Golgi Network, which suggests that E/ANTH domains are universal components of the machinery for clathrin-mediated membrane budding.	124
340788	cd16991	ENTH_Ent1_Ent2	Epsin N-Terminal Homology (ENTH) domain of Yeast Ent1, Ent2, and similar proteins. This subfamily is composed of the two orthologs of epsin in Saccharomyces cerevisiae, Epsin-1 (Ent1 or Ent1p) and Epsin-2 (Ent2 or Ent2p), and similar proteins. Yeast single epsin knockouts, either Ent1 or Ent2, are viable while the double knockout is not. Yeast epsins are required for endocytosis and localization of actin. Ent2 also plays a signaling role during cell division. The ENTH domain of Ent2 interacts with the septin organizing, Cdc42 GTPase activating protein, Bem3, leading to increased cytokinesis failure when overexpressed. Yeast epsins contain an Epsin N-Terminal Homology (ENTH) domain, an evolutionarily conserved protein module found primarily in proteins that participate in clathrin-mediated endocytosis. ENTH domain is highly similar to the N-terminal region of the AP180 N-Terminal Homology (ANTH_N) domain. ENTH and ANTH_N domains are structurally similar to the VHS domain and are composed of a superhelix of eight alpha helices. ENTH domains bind both, inositol phospholipids with preference for PtdIns(4,5)P2, and proteins, and contribute to the nucleation and formation of clathrin coats on membranes. ENTH domains also function in the development of membrane curvature through lipid remodeling during the formation of clathrin-coated vesicles. ENTH and ANTH (E/ANTH)-containing proteins have recently been shown to function with adaptor protein-1 and GGA adaptors at the Trans-Golgi Network, which suggests that E/ANTH domains are universal components of the machinery for clathrin-mediated membrane budding.	132
340789	cd16992	ENTH_Ent3	Epsin N-Terminal Homology (ENTH) domain of Yeast Ent3 and similar proteins. This subfamily is composed of one of two epsinR orthologs present in Saccharomyces cerevisiae, Epsin-3 (Ent3 or Ent3p), and similar proteins. Ent3 is an adaptor protein at the Trans-Golgi Network (TGN); it cooperates with yeast SNARE Vti1p to regulate transport from the TGN to the prevacuolar endosome. Ent3 facilitates the interaction between Gga2p with both the endosomal syntaxin Pep12p and clathrin in the GGA-dependent transport to the late endosome. Yeast epsins contain an Epsin N-Terminal Homology (ENTH) domain, an evolutionarily conserved protein module found primarily in proteins that participate in clathrin-mediated endocytosis. ENTH domain is highly similar to the N-terminal region of the AP180 N-Terminal Homology (ANTH_N) domain. ENTH and ANTH_N domains are structurally similar to the VHS domain and are composed of a superhelix of eight alpha helices. ENTH domains bind both, inositol phospholipids with preference for PtdIns(4,5)P2, and proteins, and contribute to the nucleation and formation of clathrin coats on membranes. ENTH domains also function in the development of membrane curvature through lipid remodeling during the formation of clathrin-coated vesicles. Similar to mammalian epsinR, the ENTH domain of Ent3 binds to the yeast SNARE Vti1p; soluble NSF attachment protein receptors (SNAREs) are type II transmembrane proteins that have critical roles in providing the specificity and energy for transport-vesicle fusion. Specific ENTH domains may also function as protein cargo selection/recognition modules. ENTH and ANTH (E/ANTH)-containing proteins have recently been shown to function with adaptor protein-1 and GGA adaptors at the Trans-Golgi Network, which suggests that E/ANTH domains are universal components of the machinery for clathrin-mediated membrane budding.	121
340790	cd16993	ENTH_Ent5	Epsin N-Terminal Homology (ENTH) domain of Yeast Ent5 and similar proteins. This subfamily is composed of one of two epsinR orthologs present in Saccharomyces cerevisiae, Epsin-5 (Ent5 or Ent5p), and similar proteins. Ent5 is required, together with Ent3 and Vps27p for ubiquitin-dependent protein sorting into the multivesicular body. It is also required for protein transport from the Trans-Golgi Network (TGN) to the vacuole. Yeast epsins contain an Epsin N-Terminal Homology (ENTH) domain, an evolutionarily conserved protein module found primarily in proteins that participate in clathrin-mediated endocytosis. ENTH domain is highly similar to the N-terminal region of the AP180 N-Terminal Homology (ANTH_N) domain. ENTH and ANTH_N domains are structurally similar to the VHS domain and are composed of a superhelix of eight alpha helices. ENTH domains bind both, inositol phospholipids with preference for PtdIns(4,5)P2, and proteins, and contribute to the nucleation and formation of clathrin coats on membranes. ENTH domains also function in the development of membrane curvature through lipid remodeling during the formation of clathrin-coated vesicles. ENTH and ANTH (E/ANTH)-containing proteins have recently been shown to function with adaptor protein-1 and GGA adaptors at the Trans-Golgi Network, which suggests that E/ANTH domains are universal components of the machinery for clathrin-mediated membrane budding.	158
340791	cd16994	ENTH_Ent4	Epsin N-Terminal Homology (ENTH) domain of Yeast Ent4 and similar proteins. Yeast Epsin-4 (Ent4 or Ent4p) has been reported to be involved in the Trans-Golgi Network (TGN)-to-vacuole sorting of Arn1p, a transporter for the uptake of ferrichrome, an important nutritional source of iron. Yeast epsins contain an Epsin N-Terminal Homology (ENTH) domain, an evolutionarily conserved protein module found primarily in proteins that participate in clathrin-mediated endocytosis. ENTH domain is highly similar to the N-terminal region of the AP180 N-Terminal Homology (ANTH_N) domain. ENTH and ANTH_N domains are structurally similar to the VHS domain and are composed of a superhelix of eight alpha helices. ENTH domains bind both, inositol phospholipids with preference for PtdIns(4,5)P2, and proteins, and contribute to the nucleation and formation of clathrin coats on membranes. ENTH domains also function in the development of membrane curvature through lipid remodeling during the formation of clathrin-coated vesicles. ENTH and ANTH (E/ANTH)-containing proteins have recently been shown to function with adaptor protein-1 and GGA adaptors at the Trans-Golgi Network, which suggests that E/ANTH domains are universal components of the machinery for clathrin-mediated membrane budding.	126
340792	cd16995	VHS_Tom1	VHS (Vps27/Hrs/STAM) domain of Target of Myb protein 1. Tom1 (Target of myb1 - retroviral oncogene) is a novel negative regulator of interleukin-1 and tumor necrosis factor-induced signaling pathways. It also plays important roles in protein-degradation systems in Alzheimer's disease pathogenesis. Tom1 contains VHS and GAT domains in the N-terminal and central region, respectively. The VHS domain has a superhelical structure similar to the structure of the ARM repeats and is present at the very N-termini of proteins. It is a right-handed superhelix of eight alpha helices. The VHS domain has been found in a number of proteins, some of which have been implicated in intracellular trafficking and sorting. The VHS domain of Tom1 is essential for its function as a negative regulator.	137
340793	cd16996	VHS_Tom1L2	VHS (Vps27/Hrs/STAM) domain of TOM1-like protein 2. TOM1-like protein 2 (Tom1L2) is a member of the Tom1 (Target of myb1) subfamily, characterized by the presence of a VHS (Vps27p/Hrs/Stam) domain in the N-terminal portion followed by a GAT (GGA and Tom) domain. They are novel regulators for post-Golgi trafficking and signaling. Studies in Tom1L2 hypomorphic mice suggest that Tom1L2 may play roles in immune responses and tumor suppression. The VHS domain has a superhelical structure similar to the structure of the ARM repeats and is present at the very N-termini of proteins. It is a right-handed superhelix of eight alpha helices. The VHS domain has been found in a number of proteins, some of which have been implicated in intracellular trafficking and sorting.	137
340794	cd16997	VHS_Tom1L1	VHS (Vps27/Hrs/STAM) domain of TOM1-like protein 1. TOM1-like protein 1 (Tom1L1) is also called Src-activating and signaling molecule protein (Srcasm). It is a member of the Tom1 (Target of myb1) subfamily, characterized by the presence of a VHS (Vps27p/Hrs/Stam) domain in the N-terminal portion followed by a GAT (GGA and Tom) domain. They are novel regulators for post-Golgi trafficking and signaling. Tom1L1 has been implicated in multivesicular body (MVB) formation, viral egress from the cell, and cytokinesis. Its amplification enhances the metastatic progression of ERBB2-positive breast cancers. The VHS domain has a superhelical structure similar to the structure of the ARM repeats and is present at the very N-termini of proteins. It is a right-handed superhelix of eight alpha helices. The VHS domain has been found in a number of proteins, some of which have been implicated in intracellular trafficking and sorting.	137
340795	cd16998	VHS_GGA_fungi	VHS (Vps27/Hrs/STAM) domain of fungal GGA (Golgi-localized, Gamma-ear-containing, Arf-binding) proteins. GGA (Golgi-localized, Gamma-ear-containing, Arf-binding) comprises a subfamily of ubiquitously expressed, monomeric, motif-binding cargo/clathrin adaptor proteins involved in membrane trafficking between the Trans-Golgi Network (TGN) and endosomes. Yeast GGAs facilitate the specific and direct delivery of vacuolar sorting receptor Vps10p and the processing protease Kex2p from the TGN to the late endosome/prevacuolar compartment (PVC). The VHS domain has a superhelical structure similar to the structure of the ARM (Armadillo) repeats and is present at the N-termini of proteins. GGA proteins have a multidomain structure consisting of an N-terminal VHS domain linked by a short proline-rich linker to a GAT (GGA and TOM) domain, which is followed by a long flexible linker to the C-terminal appendage, GAE (Gamma-Adaptin Ear) domain. The VHS domain of GGA proteins binds to the acidic-cluster dileucine (DxxLL) motif found on the cytoplasmic tails of cargo proteins trafficked between the Trans-Golgi Network and the endosomal system.	139
340796	cd16999	VHS_STAM2	VHS (Vps27/Hrs/STAM) domain of Signal Transducing Adapter Molecule 2. Signal Transducing Adapter Molecule 2 (STAM2) is also called EAST (EGFR-Associated protein with SH3 and TAM domain) and Hbp (Hrs-binding protein). It is highly expressed in neurons, where it is localized in the nucleus. STAM (Signal Transducing Adaptor Molecule) subfamily members have at their N-termini a VHS domain, which is involved in cytokine-mediated intracellular signal transduction and has a superhelical structure similar to the structure of ARM (Armadillo) repeats, followed by a Ubiquitin-Interacting Motif (UIM) and a SH3 (Src Homology 3) domain, which is a well-established protein-protein interaction domain, and a GAT (GGA and TOM) domain. At the C-termini of most vertebrate STAMS, an Immunoreceptor Tyrosine-based Activation Motif (ITAM) is present, which mediates the binding of HRS (hepatocyte growth factor-regulated tyrosine kinase substrate) in endocytic and exocytic machineries. STAM is a component of the ESCRT (Endosomal Sorting Complex Required for Transport)-0 machinery and together with Hrs, functions to bind and sequester cargoes for downstream sorting into intralumenal vesicles.	139
340797	cd17000	VHS_STAM1	VHS (Vps27/Hrs/STAM) domain of Signal Transducing Adapter Molecule 1. Signal Transducing Adapter Molecule 1 (STAM1) is part of a crucial regulatory axis for the ventral axonal trajectory of developing spinal motor neurons. It forms a complex with beta-arrestin, which regulates lysosomal trafficking of the chemokine receptor CXCR4 and also mediates CXCR4-dependent chemotaxis. STAM (Signal Transducing Adaptor Molecule) subfamily members have at their N-termini a VHS domain, which is involved in cytokine-mediated intracellular signal transduction and has a superhelical structure similar to the structure of ARM (Armadillo) repeats, followed by a Ubiquitin-Interacting Motif (UIM) and a SH3 (Src Homology 3) domain, which is a well-established protein-protein interaction domain, and a GAT (GGA and TOM) domain. At the C-termini of most vertebrate STAMS, an Immunoreceptor Tyrosine-based Activation Motif (ITAM) is present, which mediates the binding of HRS (hepatocyte growth factor-regulated tyrosine kinase substrate) in endocytic and exocytic machineries. STAM is a component of the ESCRT (Endosomal Sorting Complex Required for Transport)-0 machinery and together with Hrs, functions to bind and sequester cargoes for downstream sorting into intralumenal vesicles.	131
340798	cd17001	CID_RPRD2	CID (CTD-Interacting Domain) of Regulation of nuclear pre-mRNA domain-containing protein 2. Regulation of nuclear pre-mRNA domain-containing protein 2 (RPRD2) is a CID (CTD-Interacting Domain) domain containing protein that co-purifies with RNA polymerase (Pol) II (RNAP II) and three other RNAP II-associated proteins, RPAP2, GRINL1A and RECQL5, but not with the Mediator complex. CID binds tightly to the carboxy-terminal domain (CTD) of RNAP II. During transcription, RNAP II synthesizes eukaryotic messenger RNA. Transcription is coupled to RNA processing through the CTD, which consists of up to 52 repeats of the sequence Tyr1-Ser2-Pro3-Thr4-Ser5-Pro6-Ser7. CID contains eight alpha-helices in a right-handed superhelical arrangement, which closely resembles that of the VHS domains and ARM (Armadillo) repeat proteins, except for its two amino-terminal helices.	125
340799	cd17002	CID_RPRD1	CID (CTD-Interacting Domain) of Regulation of nuclear pre-mRNA domain-containing protein 1 and similar proteins. This subfamily contains Regulation of nuclear pre-mRNA domain-containing proteins 1A (RPRD1A) and 1B (RPRD1B) from jawed vertebrates, CID domain-containing protein 1 (CIDS1 or cids-1) from Caenorhabditis elegans, and similar proteins. RPRD1A and RPRD1B are CID (CTD-Interacting Domain) containing proteins that co-purify with RNA polymerase (Pol) II (RNAP II) and three other RNAP II-associated proteins, RPAP2, GRINL1A and RECQL5, but not with the Mediator complex. CID binds tightly to the carboxy-terminal domain (CTD) of RNAP II. During transcription, RNAP II synthesizes eukaryotic messenger RNA. Transcription is coupled to RNA processing through the CTD, which consists of up to 52 repeats of the sequence Tyr1-Ser2-Pro3-Thr4-Ser5-Pro6-Ser7. RPRD1A and RPRD1B form homodimers and heterodimers through their coiled-coil domains. Both associate directly with RPAP2 phosphatase and serve as CTD scaffolds to coordinate the dephosphorylation of phospho-S5 by RPAP2. The function of CIDS1 is not yet known. CID contains eight alpha-helices in a right-handed superhelical arrangement, which closely resembles that of the VHS domains and ARM (Armadillo) repeat proteins, except for its two amino-terminal helices.	128
340800	cd17003	CID_Rtt103	CID (CTD-Interacting Domain) of yeast transcription termination factor Rtt103 and similar proteins. Yeast transcription termination factor Rtt103 is a CID (CTD-Interacting Domain) containing protein that functions in DNA damage response. It associates with sites of DNA breaks and is essential for recovery from DNA double strand breaks in the chromosome. CID binds tightly to the carboxy-terminal domain (CTD) of RNA polymerase (Pol) II (RNAP II). Rtt103 CID preferentially interacts with CTD phosphorylated at Ser2. During transcription, RNAP II synthesizes eukaryotic messenger RNA. Transcription is coupled to RNA processing through the CTD, which consists of up to 52 repeats of the sequence Tyr1-Ser2-Pro3-Thr4-Ser5-Pro6-Ser7. CID contains eight alpha-helices in a right-handed superhelical arrangement, which closely resembles that of the VHS domains and ARM (Armadillo) repeat proteins, except for its two amino-terminal helices.	127
340801	cd17004	CID_SCAF8	CID (CTD-Interacting Domain) of SR-related and CTD-associated factor 8. SR-related and CTD-associated factor 8 (SCAF8) is also called CDC5L complex-associated protein 7 (CCAP7) or RNA-binding motif protein 16 (RBM16). It may play a role in mRNA processing. SCAF8 contains a CTD-interacting domain (CID) at the amino terminus and a Ser/Arg-rich domain followed by an RNA recognition motif. CID binds tightly to the carboxy-terminal domain (CTD) of  RNA polymerase (Pol) II (RNAP II). During transcription, RNAP II synthesizes eukaryotic messenger RNA. Transcription is coupled to RNA processing through the CTD, which consists of up to 52 repeats of the sequence Tyr1-Ser2-Pro3-Thr4-Ser5-Pro6-Ser7. CID contains eight alpha-helices in a right-handed superhelical arrangement, which closely resembles that of the VHS domains and ARM (Armadillo) repeat proteins, except for its two amino-terminal helices.	131
340802	cd17005	CID_SFRS15_SCAF4	CID (CTD-Interacting Domain) of Splicing factor arginine serine rich 15. Splicing factor arginine serine rich 15 (SFRS15) is also called CTD-binding SR-like protein RA4 or SR-related and CTD-associated factor 4 (SCAF4). It may act to physically and functionally link transcription and pre-mRNA processing. SFRS15/SCAF4 contains a CTD-interacting domain (CID) at the amino terminus and a Ser/Arg-rich domain followed by an RNA recognition motif. CID binds tightly to the carboxy-terminal domain (CTD) of  RNA polymerase (Pol) II (RNAP II). During transcription, RNAP II synthesizes eukaryotic messenger RNA. Transcription is coupled to RNA processing through the CTD, which consists of up to 52 repeats of the sequence Tyr1-Ser2-Pro3-Thr4-Ser5-Pro6-Ser7. CID contains eight alpha-helices in a right-handed superhelical arrangement, which closely resembles that of the VHS domains and ARM (Armadillo) repeat proteins, except for its two amino-terminal helices.	131
340803	cd17006	ANTH_N_HIP1_like	ANTH (AP180 N-Terminal Homology) domain, N-terminal region, of Huntingtin-interacting protein 1 and related proteins. This subfamily includes Huntingtin-interacting protein 1 (HIP1), HIP1-related protein (HIP1R), and similar proteins. Mammalian HIP1 was identified in 1997 as an interactor of huntingtin; when mutated, it is involved in the neurodegenerative disorder Huntington's disease. HIP1 is expressed only in neurons while HIP1R is ubiquitously expressed. Together with its interacting partner HIPPI, HIP1 regulates apoptosis and gene expression. Both HIP1 and HIP1R promote clathrin assembly in vitro, and they share a common domain architecture, containing an N-terminal ANTH, a central clathrin-binding coiled-coil, and a C-terminal actin-binding talin-like (also called I/LWEQ) domain. ANTH domains bind both inositol phospholipids and proteins, and contribute to the nucleation and formation of clathrin coats on membranes. The ANTH domain is a unique module whose N-terminal half is structurally similar to the Epsin N-Terminal Homology (ENTH) and Vps27/Hrs/STAM (VHS) domains, containing a superhelix of eight alpha helices. In addition, it contains a coiled-coil C-terminal half with structural similarity to spectrin repeats. It binds phosphoinositide PtdIns(4,5)P2 at a short conserved motif K[X]9[K/R][H/Y] between helices 1 and 2. Mammalian HIP1 and HIP1R were found to preferentially bind PtdIns(3,4)P2 and PtdIns(3,5)P2, respectively, instead of PtdIns(4,5)P2, which is considered to be an interaction hub in the clathrin interactome. This model describes the N-terminal region of the ANTH domain of Huntingtin-interacting protein 1 and related proteins.	114
340804	cd17007	ANTH_N_Sla2p	ANTH (AP180 N-Terminal Homology) domain, N-terminal region, of Sla2p and similar proteins. This subfamily is composed of Saccharomyces cerevisiae Sla2 protein (Sla2p, also called transmembrane protein MOP2), Schizosaccharomyces pombe endocytosis protein End4 (End4p, also called Sla2 protein homolog), and similar proteins. In yeast, cells lacking Sla2p have severe defects in actin organization, cell morphology, and endocytosis, suggesting roles in these processes. Sla2p regulates the Eps15-like Arp2/3 complex activator, Pan1p, controlling actin polymerization during endocytosis. In fission yeast, End4p has been implicated in cellular morphogenesis. Sla2p contains an N-terminal ANTH, a central coiled-coil, and a C-terminal actin-binding talin-like (also called I/LWEQ) domain. ANTH domains bind both inositol phospholipids and proteins, and contribute to the nucleation and formation of clathrin coats on membranes. The ANTH domain is a unique module whose N-terminal half is structurally similar to the Epsin N-Terminal Homology (ENTH) and Vps27/Hrs/STAM (VHS) domains, containing a superhelix of eight alpha helices. In addition, it contains a coiled-coil C-terminal half with structural similarity to spectrin repeats. It binds phosphoinositide PtdIns(4,5)P2 at a short conserved motif K[X]9[K/R][H/Y] between helices 1 and 2. The ANTH domain of Sla2p preferentially binds PtdIns(4,5)P2, which is considered to be an interaction hub in the clathrin interactome. This model describes the N-terminal region of ANTH domains of Sla2p and similar proteins.	115
340805	cd17008	VHS_GGA3	VHS (Vps27/Hrs/STAM) domain of ADP-ribosylation factor-binding protein GGA3. ADP-ribosylation factor-binding protein GGA3 (Golgi-localized, Gamma-ear-containing, Arf-binding 3) regulates the trafficking and is required for the lysosomal degradation of BACE (beta-site APP-cleaving enzyme), the protease that initiates the production of beta-amyloid, which causes Alzheimer's disease. It also plays a key role in GABA (+) transmission, which is important in the regulation of anxiety-like behaviors. GGA3 is a member of the GGA subfamily, which is comprised of ubiquitously expressed, monomeric, motif-binding cargo/clathrin adaptor proteins involved in membrane trafficking between the Trans-Golgi Network (TGN) and endosomes. The VHS domain has a superhelical structure similar to the structure of the ARM (Armadillo) repeats and is present at the N-termini of proteins. GGA proteins have a multidomain structure consisting of an N-terminal VHS domain linked by a short proline-rich linker to a GAT (GGA and TOM) domain, which is followed by a long flexible linker to the C-terminal appendage, GAE (Gamma-Adaptin Ear) domain. The VHS domain of GGA proteins binds to the acidic-cluster dileucine (DxxLL) motif found on the cytoplasmic tails of cargo proteins trafficked between the Trans-Golgi Network and the endosomal system.	141
340806	cd17009	VHS_GGA1	VHS (Vps27/Hrs/STAM) domain of ADP-ribosylation factor-binding protein GGA1. ADP-ribosylation factor-binding protein GGA1 (Golgi-localized, Gamma-ear-containing, Arf-binding 1) is also called Gamma-adaptin-related protein 1. It is expressed in human brain and affects the generation of amyloid beta-peptide, and may be involved in the pathogenesis of Alzheimer disease. It is a member of the GGA subfamily, which is comprised of ubiquitously expressed, monomeric, motif-binding cargo/clathrin adaptor proteins involved in membrane trafficking between the Trans-Golgi Network (TGN) and endosomes. The VHS domain has a superhelical structure similar to the structure of the ARM (Armadillo) repeats and is present at the N-termini of proteins. GGA proteins have a multidomain structure consisting of an N-terminal VHS domain linked by a short proline-rich linker to a GAT (GGA and TOM) domain, which is followed by a long flexible linker to the C-terminal appendage, GAE (Gamma-Adaptin Ear) domain. The VHS domain of GGA proteins binds to the acidic-cluster dileucine (DxxLL) motif found on the cytoplasmic tails of cargo proteins trafficked between the Trans-Golgi Network and the endosomal system.	139
340807	cd17010	VHS_GGA2	VHS (Vps27/Hrs/STAM) domain of ADP-ribosylation factor-binding protein GGA2. ADP-ribosylation factor-binding protein GGA2 (Golgi-localized, Gamma-ear-containing, Arf-binding 2) is also called Gamma-adaptin-related protein 2 and VHS domain and ear domain of gamma-adaptin (Vear). It is a member of the GGA subfamily, which is comprised of ubiquitously expressed, monomeric, motif-binding cargo/clathrin adaptor proteins involved in membrane trafficking between the Trans-Golgi Network (TGN) and endosomes. The VHS domain has a superhelical structure similar to the structure of the ARM (Armadillo) repeats and is present at the N-termini of proteins. GGA proteins have a multidomain structure consisting of an N-terminal VHS domain linked by a short proline-rich linker to a GAT (GGA and TOM) domain, which is followed by a long flexible linker to the C-terminal appendage, GAE (Gamma-Adaptin Ear) domain. The VHS domain of GGA proteins binds to the acidic-cluster dileucine (DxxLL) motif found on the cytoplasmic tails of cargo proteins trafficked between the Trans-Golgi Network and the endosomal system.	139
340808	cd17011	CID_RPRD1A	CID (CTD-Interacting Domain) of Regulation of nuclear pre-mRNA domain-containing protein 1A. Regulation of nuclear pre-mRNA domain-containing protein 1A (RPRD1A) is also called Cyclin-dependent kinase inhibitor 2B-related protein or p15INK4B-related protein (P15RS). RPRD1A is a CID (CTD-Interacting Domain) containing protein that co-purifies with RNA polymerase (Pol) II (RNAP II) and three other RNAP II-associated proteins, RPAP2, GRINL1A and RECQL5, but not with the Mediator complex. CID binds tightly to the carboxy-terminal domain (CTD) of RNAP II. During transcription, RNAP II synthesizes eukaryotic messenger RNA. Transcription is coupled to RNA processing through the CTD, which consists of up to 52 repeats of the sequence Tyr1-Ser2-Pro3-Thr4-Ser5-Pro6-Ser7. RPRD1A forms homodimers and heterodimers with RPRD1B through their coiled-coil domains. Both RPRD1A and RPRD1B associate directly with RPAP2 phosphatase and serve as CTD scaffolds to coordinate the dephosphorylation of phospho-S5 by RPAP2. CID contains eight alpha-helices in a right-handed superhelical arrangement, which closely resembles that of the VHS domains and ARM (Armadillo) repeat proteins, except for its two amino-terminal helices.	128
340809	cd17012	CID_RPRD1B	CID (CTD-Interacting Domain) of Regulation of nuclear pre-mRNA domain-containing protein 1B. Regulation of nuclear pre-mRNA domain-containing protein 1B (RPRD1B) is also called Cell cycle-related and expression-elevated protein in tumor (CREPT). RPRD1B is a CID (CTD-Interacting Domain) containing protein that co-purifies with RNA polymerase (Pol) II (RNAP II) and three other RNAP II-associated proteins, RPAP2, GRINL1A and RECQL5, but not with the Mediator complex. CID binds tightly to the carboxy-terminal domain (CTD) of RNAP II. During transcription, RNAP II synthesizes eukaryotic messenger RNA. Transcription is coupled to RNA processing through the CTD, which consists of up to 52 repeats of the sequence Tyr1-Ser2-Pro3-Thr4-Ser5-Pro6-Ser7. RPRD1B forms homodimers and heterodimers with RPRD1A through their coiled-coil domains. Both RPRD1A and RPRD1B associate directly with RPAP2 phosphatase and serve as CTD scaffolds to coordinate the dephosphorylation of phospho-S5 by RPAP2. RPRD1B is highly expressed during tumorigenesis and, in endometrial cancer, has been shown to promote tumor growth by accelerating the cell cycle. CID contains eight alpha-helices in a right-handed superhelical arrangement, which closely resembles that of the VHS domains and ARM (Armadillo) repeat proteins, except for its two amino-terminal helices.	129
340810	cd17013	ANTH_N_HIP1	ANTH (AP180 N-Terminal Homology) domain, N-terminal region, of Huntingtin-interacting protein 1. Huntingtin-interacting protein 1 (HIP1) was identified in 1997 as an interactor of huntingtin; when mutated, it is involved in the neurodegenerative disorder Huntington's disease. HIP1 promotes clathrin assembly in vitro. Together with its interacting partner HIPPI, it regulates apoptosis and gene expression. HIP1 contains an N-terminal ANTH, a central clathrin-binding coiled-coil, and a C-terminal actin-binding talin-like (also called I/LWEQ) domain. ANTH domains bind both inositol phospholipids and proteins, and contribute to the nucleation and formation of clathrin coats on membranes. The ANTH domain is a unique module whose N-terminal half is structurally similar to the Epsin N-Terminal Homology (ENTH) and Vps27/Hrs/STAM (VHS) domains, containing a superhelix of eight alpha helices. In addition, it contains a coiled-coil C-terminal half with structural similarity to spectrin repeats. It binds phosphoinositide PtdIns(4,5)P2 at a short conserved motif K[X]9[K/R][H/Y] between helices 1 and 2. The ANTH domain of mammalian HIP1 was found to preferentially bind PtdIns(3,4)P2 instead of PtdIns(4,5)P2, which is considered to be an interaction hub in the clathrin interactome. This model describes the N-terminal region of ANTH domain of Huntingtin-interacting protein 1.	114
340811	cd17014	ANTH_N_HIP1R	ANTH (AP180 N-Terminal Homology) domain, N-terminal region, of Huntingtin-interacting protein 1-related protein. Huntingtin-interacting protein 1-related protein (HIP1R), also called HIP12, promotes clathrin assembly in vitro. It is an endocytic protein involved in receptor trafficking, including regulating cell surface expression of receptor tyrosine kinases. Low HIP1R protein expression is associated with worse survival in diffuse large B-cell lymphoma (DLBCL) patients; it is preferentially expressed in germinal center B-cell (GCB)-like DLBCL, and may be potentially useful in subtyping DLBCL cases. HIP1R contains an N-terminal ANTH, a central clathrin-binding coiled-coil, and a C-terminal actin-binding talin-like (also called I/LWEQ) domain. ANTH domains bind both inositol phospholipids and proteins, and contribute to the nucleation and formation of clathrin coats on membranes. The ANTH domain is a unique module whose N-terminal half is structurally similar to the Epsin N-Terminal Homology (ENTH) and Vps27/Hrs/STAM (VHS) domains, containing a superhelix of eight alpha helices. In addition, it contains a coiled-coil C-terminal half with structural similarity to spectrin repeats. It binds phosphoinositide PtdIns(4,5)P2 at a short conserved motif K[X]9[K/R][H/Y] between helices 1 and 2. The ANTH domain of mammalian HIP1R was found to preferentially bind PtdIns(3,5)P2 instead of PtdIns(4,5)P2, which is considered to be an interaction hub in the clathrin interactome. This model describes the N-terminal region of ANTH domain of Huntingtin-interacting protein 1-related protein.	114
341097	cd17015	ING_plant	Inhibitor of growth (ING) domain of plant inhibitor of growth proteins. This subfamily is composed of mainly plant inhibitor of growth proteins such as Arabidopsis thaliana ING1 (AtING1 or PHD finger protein ING1) and ING2 (AtING2 or PHD finger protein ING2). They are histone-binding components that specifically recognize H3 tails trimethylated on 'Lys-4' (H3K4me3), which mark transcription start sites of virtually all active genes. The related mammalian ING proteins act as readers and writers of the histone epigenetic code, affecting DNA damage response, chromatin remodeling, cellular senescence, differentiation, cell cycle regulation, and apoptosis. They may have a general role in mediating the cellular response to genotoxic stress through binding to and regulating the activities of histone acetyltransferase (HAT) and histone deacetylase (HDAC) chromatin remodeling complexes. All ING proteins contain an N-terminal leucine zipper-like (LZL) motif-containing ING domain that binds unmodified H3 tails, and a well-characterized C-terminal plant homeodomain (PHD)-type zinc-finger domain, which binds lysine 4-tri-methylated histone H3 (H3K4me3). Although these two regions can bind histones independently, together they increase the apparent association of the ING domain for the H3 tail.	98
341098	cd17016	ING_Pho23p_like	Inhibitor of growth (ING) domain of yeast Pho23p and similar proteins. This family is composed of Saccharomyces cerevisiae transcriptional regulatory protein PHO23 (Pho23p), Schizosaccharomyces pombe chromatin modification-related protein png2 (also called ING1 homolog 2), and similar proteins. Pho23p is part of Rpd3/Sin3 histone deacetylase (HDAC) complex. It is required for the normal function of Rpd3 in the silencing of rDNA, telomeric, and mating-type loci. Pho23p inhibits p53-dependent transcription. The related mammalian ING proteins act as readers and writers of the histone epigenetic code, affecting DNA damage response, chromatin remodeling, cellular senescence, differentiation, cell cycle regulation and apoptosis. They may have a general role in mediating the cellular response to genotoxic stress through binding to and regulating the activities of histone acetyltransferase (HAT) and histone deacetylase (HDAC) chromatin remodeling complexes. All ING proteins contain an N-terminal leucine zipper-like (LZL) motif-containing ING domain that binds unmodified H3 tails, and a well-characterized C-terminal plant homeodomain (PHD)-type zinc-finger domain, binding with lysine 4-tri-methylated histone H3 (H3K4me3). Although these two regions can bind histones independently, together they increase the apparent association of the ING for the H3 tail.	89
341099	cd17017	ING_Yng1p	Inhibitor of growth (ING) domain of yeast Yng1p and similar proteins. The ING family includes three yeast orthologs, chromatin modification-related protein YNG1 (Yng1p), YNG2 (Yng2p), and transcriptional regulatory protein PHO23 (Pho23p). Yng1p, also termed ING1 homolog 1, is one of the components of the NuA3 histone acetyltransferase (HAT) complex. Yng2p, also termed ESA1-associated factor 4, or ING1 homolog 2, is a subunit of the NuA4 HAT complex. It plays a critical role in intra-S-phase DNA damage response. Pho23p is part of Rpd3/Sin3 histone deacetylase (HDAC) complex. It is required for the normal function of Rpd3 in the silencing of rDNA, telomeric, and mating-type loci. Yng1p and Pho23p inhibit p53-dependent transcription. In contrast, Yng2p has the opposite effect. The related mammalian ING proteins act as readers and writers of the histone epigenetic code, affecting DNA damage response, chromatin remodeling, cellular senescence, differentiation, cell cycle regulation and apoptosis. They may have a general role in mediating the cellular response to genotoxic stress through binding to and regulating the activities of histone acetyltransferase (HAT) and histone deacetylase (HDAC) chromatin remodeling complexes. All ING proteins contain an N-terminal leucine zipper-like (LZL) motif-containing ING domain that binds unmodified H3 tails, and a well-characterized C-terminal plant homeodomain (PHD)-type zinc-finger domain, binding with lysine 4-tri-methylated histone H3 (H3K4me3). Although these two regions can bind histones independently, together they increase the apparent association of the ING for the H3 tail.	100
409302	cd17018	T3SC_IA_ExsC-like	Class IA type III secretion system chaperone protein, similar to Pseudomonas aeruginosa exoenzyme S synthesis protein C (ExsC). This family includes type III secretion system (T3SS) chaperone proteins similar to Pseudomonas aeruginosa and Aeromonas hydrophila ExsC (also known as exoenzyme S synthesis protein C). P. aeruginosa ExsC, a member of the type IA family of T3SS chaperones, is unique because, as part of the signaling process, it binds small secreted protein ExsE as well as the non-secreted anti-activator protein ExsD; it relieves repression of the transcriptional activator ExsA (which activates expression of T3SS genes) by ExsD. However, in Aeromonas, although ExsA is likely the master regulator of the T3SS, there is little evidence of ExsC and ExsE involvement in the regulation of the T3SS.	127
409303	cd17019	T3SC_IA_ShcA-like	Class IA type III secretion system chaperone protein, similar to Pseudomonas syringae chaperone protein ShcA. This family includes type III secretion system (T3SS) chaperone proteins similar to Pseudomonas syringae ShcA and similar proteins. In P. syringae, which is a plant pathogen that can infect a wide range of species, the T3SS allows injection of the effector HopA1 (previously known as HopPsyA or HrmA), a protein that has unknown functions in the host cell but possesses close homologs that trigger the plant hypersensitive response in resistant strains. Chaperone ShcA binding to HopA1 shows that interactions in animal pathogens are preserved in the Gram-negative pathogens of plants.	122
409304	cd17020	T3SC_IA_ShcM-like	Class IA type III secretion system chaperone protein, similar to Pseudomonas syringae chaperone protein ShcM. This family includes type III secretion system (T3SS) chaperone proteins similar to Pseudomonas syringae ShcM and similar proteins. In P. syringae, which is a plant pathogen that can infect a wide range of species, the T3SS allows injection of the effector protein HopPtoM (previously known as CEL ORF3), among known plant pathogen effectors, that makes a major contribution to the elicitation of lesion symptoms but not growth in host tomato leaves. Chaperone ShcM is required for efficient translocation and function of HopPtoM in the plant cell, consistent with the presence of customized chaperones in plant pathogenic bacteria.	122
409305	cd17021	T3SC_IA_SicP-like	Class IA type III secretion system chaperone protein, similar to Salmonella enterica chaperone protein SicP. This family includes type III secretion system (T3SS) chaperone proteins similar to Salmonella enterica SicP and similar proteins. In S. enterica, many of its serovars being serious human pathogens, the T3SS allows injection of the effector SptP, a virulence protein that is involved in bacterial invasion into a host cell. Chaperone SicP forms a complex with SptP at an early stage of the effector protein secretion process in order to avoid premature degradation; also, the complex is dissociated at a late stage to secrete only SptP with the help of the ATPase InvC which is part of the related T3SS injectisome.	121
409306	cd17022	T3SC_IA_SigE-like	Class IA type III secretion system chaperone protein, similar to Salmonella enterica SigE. This family includes type III secretion system (T3SS) chaperone proteins similar to Salmonella enterica chaperone SigE and similar proteins. In S. enterica, many of its serovars being serious human pathogens, the T3SS allows injection of the effector SigD (also known as SopB) which is an inositol phosphatase. Chaperone SigE binds to SigD, which, upon translocation into the host cell, preferentially dephosphorylates specific inositol phospholipids that are thought to be crucial for subsequent activation of the host cell Ser-Thr kinase Akt.	113
409307	cd17023	T3SC_IA_CesT-like	Class IA type III secretion system chaperone protein, similar to Escherichia coli CesT. This family includes type III secretion system (T3SS) chaperone proteins similar to Escherichia coli CesT and also contains Stm2138, a novel virulence chaperone in Salmonella enterica subsp. enterica serovar Typhimurium. In E. coli, the T3SS allows injection of the effector Tir (translocated intimin receptor), which plays a key role in enterohemorrhagic Escherichia coli (EHEC) infection, attaching and effacing (A/E) lesions, and intracellular signal transduction. CesT binds to Tir, which interacts with intimin and anchors the infected cell membrane inside the host cytoplasm for signaling.	133
409308	cd17024	T3SC_IA_DspF-like	Class IA type III secretion system chaperone protein, similar to Erwinia amylovora DspF (DspF/AvrF family protein). This family includes type III secretion system (T3SS) chaperone proteins similar to Erwinia amylovora DspF, Pantoea stewartii WtsE, Pseudomonas viridiflava AvrF, and similar proteins, all of which bind AvrE family type III effector proteins. In E. amylovora, a gram-negative enterobacterium that causes a devastating blight disease of apple and pear trees, the T3SS allows injection of effector DspE via the chaperone DspF. DspE has been shown to interact with several apple proteins, suppress salicylic acid-mediated host defenses and cause necrotic cell death in host and non-host plants. In Pectobacterium carotovorum, DspE is required early in Solanum tuberosum leaf infection to cause cell death. Effector WtsE in P. stewartii causes disease-associated cell death in corn and requires the chaperone protein WtsF for stability.	121
409309	cd17025	T3SC_IA_ShcF-like	Class IA type III secretion system chaperone protein, similar to Pseudomonas syringae ShcF. This family includes type III secretion system (T3SS) chaperone proteins similar to Pseudomonas syringae ShcF and similar proteins. In P. syringae, which is a plant pathogen that can infect a wide range of species, the T3SS allows injection of the effector protein AvrPphF into genetically susceptible host cells.  Chaperone ShcF (originally known as AvrPphF ORD1) binds AvrPphF in a similar manner to type III chaperones from bacterial pathogens of animals, indicating structural conservation of these specialized chaperones, despite high sequence divergence.	124
409310	cd17026	T3SC_IA_SpcU-like	Class IA type III secretion system chaperone protein, similar to Pseudomonas aeruginosa SpcU. This family includes type III secretion system (T3SS) chaperone proteins similar to Pseudomonas aeruginosa SpcU and similar proteins. In P. aeruginosa, a multidrug resistant pathogen associated with serious illnesses such as ventilator-associated pneumonia and various sepsis syndromes, the T3SS allows injection of effector protein ExoU, one of the most aggressive toxins injected by a T3SS, into the cytosol of target eukaryotic cells, leading to rapid cell necrosis. Chaperone SpcU binds the cytotoxin ExoU, which is a broad-specificity phospholipase A2 (PLA2) and lysophospholipase, and maintains the N-terminus of ExoU in an unfolded state which is required for secretion.	120
409311	cd17027	T3SC_IA_YscB_AscB-like	Class IA type III secretion system chaperone protein, similar to Yersinia pestis YscB. This family includes type III secretion system (T3SS) chaperone proteins similar to Yersinia pestis YscB and its homologs, Aeromonas hydrophila AscB and Photorhabdus luminescens LscB. In Yersinia pestis, which causes the deadly bubonic plague, the T3SS allows injection of effector proteins, termed Yersinia outer proteins (Yops) into macrophages and other immune cells, forming pores in the host cell membrane. The secretion of Yops is regulated by the activity of the YopN/SycN/YscB/TyeA complex. YscB acts, along with SycN, as a chaperone for YopN, a key part of a complex that regulates type III secretion so that it responds to contact with the eukaryotic target cell.	127
409312	cd17028	T3SC_IA_SycE_Scc1-like	Class IA type III secretion system chaperone protein, similar to Chlamydia SycE/Scc1. This family includes type III secretion system (T3SS) chaperone proteins similar to Chlamydia SycE (also known as Scc1) and similar proteins. Chlamydia SycE is homologous to the SycE chaperone protein of Yersinia, which is involved in promoting translocation of Yersinia outer protein E (YopE). In Chlamydia, two T3SS chaperones, Scc1 and Scc4, work together to promote secretion of the important effector and plug protein, CopN, whereas the Scc3 chaperone represses its secretion.	134
409313	cd17029	T3SC_IA_SycE_SpcS-like	Class IA type III secretion system chaperone protein, similar to Yersinia SycE. This family includes type III secretion system (T3SS) chaperone proteins similar to Yersinia SycE and its homolog Pseudomonas aeruginosa SpcS. Involvement of Yersinia chaperone SycE (also known as YerA) in promoting translocation of Yersinia outer protein E (YopE), a selective activator of mammalian Rho-family GTPases, into host macrophages is essential to Yersinia pathogenesis. In P. aeruginosa, which is an opportunistic pathogen that harbors multiple virulence factors that widely manipulate host cell signaling and immune response, the effector toxin proteins of T3SS are ExoT, ExoS, ExoU and ExoY. Chaperone SpcS (formerly known as Orf1) binds to ExoT as well as its homolog, ExoS, both known to be the actual virulence determinants due to the presence of bifunctional GTPase-activating (GAP) and ADP-ribosyltransferase (ADPRT) domains which are essential for inhibition of bacterial internalization and epithelial cell migration by altering the actin cytoskeleton.	115
409314	cd17030	T3SC_IA_SycH-like	Class IA type III secretion system chaperone protein, similar to Yersinia pestis SycH. This family includes type III secretion system (T3SS) chaperone proteins similar to Yersinia pestis SycH and similar proteins. In Yersinia pestis, the causative agent of bubonic and pneumonic plague, the T3SS allows injection of effector proteins, termed Yersinia outer proteins (Yops), into macrophages and other immune cells; these proteins form pores in the host cell membrane and have been linked to cytolysis. The secretion of Yops is regulated by the activity of the YopN/SycN/YscB/TyeA complex. SycH is the chaperone for YopH, a potent eukaryotic-like protein tyrosine phosphatase that is essential for virulence. SycH also binds two negative regulators of type III secretion, YscM1 and YscM2, both sharing significant sequence homology with the chaperone-binding domain of YopH.	120
409315	cd17031	T3SC_IA_SycN-like	Class IA type III secretion system chaperone protein, similar to Yersinia pestis SycN. This family includes type III secretion system (T3SS) chaperone proteins similar to Yersinia pestis SycN and similar proteins. In Yersinia pestis, the causative agent of bubonic and pneumonic plague, the T3SS allows injection of effector proteins, termed Yersinia outer proteins (Yops), into macrophages and other immune cells; these proteins form pores in the host cell membrane and have been linked to cytolysis. The secretion of Yops is regulated by the activity of the YopN/SycN/YscB/TyeA complex; SycN-YscB forms a heterodimeric secretion chaperone and binds YopN, a key part of a complex that regulates type III secretion, in response to calcium levels, so that secretion occurs only after contact with the targeted eukaryotic cell. Negative regulation is mediated by the complex by blocking the entrance to the secretion apparatus prior to contact with mammalian cells.	118
409316	cd17032	T3SC_IA_SycT-like	Class IA type III secretion system chaperone protein, similar to Yersinia enterocolitica SycT. This family includes type III secretion system (T3SS) chaperone proteins similar to Yersinia enterocolitica SycT and similar proteins. In Y. enterocolitica, a food-borne pathogen causing gastroenteritis and mesenteric lymphadenitis, chaperone SycT promotes translocation of effector YopT (Yersinia outer protein T), a cysteine protease that inactivates the small GTPase RhoA of targeted host cells by cleaving its C-terminal, prenylated cysteine, thereby releasing the GTPase into the host cytosol.	121
409317	cd17033	DR1245-like	possible type III secretion system (T3SS) chaperone protein DR1245 found in Deinococcus radiodurans. This family includes a possible type III secretion system (T3SS) chaperone protein DR1245 found in Deinococcus radiodurans, a bacterium that is exceptionally resistant to the lethal effects of ionizing radiation (IR), ultraviolet light and other DNA-damaging agents. DR1245, a protein of unknown function conserved only in the Deinococcaceae, and with strong structural homology to YbjN proteins and T3SS chaperones, may display some chaperone activity towards DdrB, a protein found to be highly up-regulated following irradiation; DR1245 may also bind to other substrates.	138
409318	cd17034	T3SC_IA_ShcO1-like	Class IA type III secretion system chaperone proteins, similar to Pseudomonas syringae ShcO1, ShcS1, and ShcS2. This family includes type III secretion system (T3SS) chaperone proteins similar to Pseudomonas syringae ShcO1 and similar proteins. In P. syringae, which is a plant pathogen that can infect a wide range of species, the T3SS allows injection of effector Hrp-dependent outer proteins (HOPs) HopO1-1, HopS1, and HopS2. Three homologous chaperones ShcO1, ShcS1, and ShcS2 facilitate the translocation of their cognate effectors HopO1-1, HopS1, and HopS2, respectively. Interestingly, ShcS1 and ShcS2 are capable of substituting for ShcO1 in facilitating HopO1-1 secretion and translocation. ShcS1 and ShcO1 are exceptional class IA T3SS chaperones because they can bind more than one target effector.	123
409319	cd17035	T3SC_IB_Spa15-like	Class IB type III secretion system chaperone protein, similar to Shigella flexneri Spa15. This family includes type III secretion system (T3SS) chaperone proteins similar to Shigella flexneri Spa15, Salmonella enterica InvB, and similar proteins. In S. flexneri, which is a facultative intracellular pathogen that invades the colonic epithelium and causes bacillary dysentery, the T3SS allows injection of a number of effectors to ensure their stabilization prior to secretion. Spa15 is the chaperone for several TTS effectors, including IpaA, IpgB1, OspC3, OspB and OspD1. The effector IpgB, which is chaperoned by Spa15, is a mimic of the human Ras-like Rho guanosine triphosphatase RhoG, thus activating Rac1 guanosine triphosphatase and setting off membrane ruffling of the cell, assisting the internalization of Shigella. Also, Spa15 is a chaperone for secreted anti-activator OspD1 which is involved in the control of transcription by the type III secretion apparatus (T3SA) activity in Shigella flexneri.	129
409320	cd17036	T3SC_YbjN-like_1	T110839 is structurally similar to type III secretion system chaperones and YbjN family proteins. This family includes protein T110839 from Synechococcus elongatus that is structurally similar to type III secretion system (T3SS) chaperones (T3SC) that bind effector proteins, and is homologous to YbjN, a putative sensory transduction regulator protein found in Proteobacteria.	125
409321	cd17037	T3SC_IA_ShcV-like	Class IA type III secretion system chaperone protein, similar to Pseudomonas syringae ShcV. This family includes type III secretion system (T3SS) chaperone proteins similar to Pseudomonas syringae ShcV. In P. syringae, which is a plant pathogen that can infect a wide range of species, the T3SS allows injection of effector HopPtoV which may play a subtle role in pathogenesis. Chaperone ShcV facilitates secretion of HopPtoV into plant cells via the amino-terminal third of the effector.	131
341208	cd17038	Flavi_M	Flavivirus envelope glycoprotein M. Flaviviruses are small enveloped viruses with a membrane-anchored envelope comprised of 3 proteins called C, M and E. The envelope glycoprotein M is translated as a precursor, called prM. The precursor portion of the protein is the signal peptide for the protein's entry into the membrane. prM is cleaved to form M by the proprotein convertase furin in a late-stage cleavage event. Associated with this cleavage is a change in the infectivity and fusion activity of the virus.	75
340559	cd17039	Ubl_ubiquitin_like	ubiquitin-like (Ubl) domain found in ubiquitin and ubiquitin-like Ubl proteins. Ubiquitin-like (Ubl) proteins have a similar ubiquitin (Ub) beta-grasp fold and attach to other proteins in a Ubl manner but with biochemically distinct roles. Ub and Ubl proteins conjugate and deconjugate via ligases and peptidases to covalently modify target polypeptides. Some Ubl domains have adaptor roles in Ub-signaling by mediating protein-protein interaction. Prokaryotic sulfur carrier proteins are Ub-related proteins that can be activated in an ATP-dependent manner. Polyubiquitination signals for a diverse set of cellular events via different isopeptide linkages formed between the C terminus of one ubiquitin (Ub) and the epsilon-amine of K6, K11, K27, K29, K33, K48, or K63 of a second Ub. One of these seven lysine residues (K27, Ub numbering) is conserved in this Ubl_ubiquitin_like family.  K27-linked Ub chains are versatile and can be recognized by several downstream receptor proteins. K27 has roles beyond chain linkage, such as in Ubl NEDD8 (which contains many of the same lysines (K6, K11, K27, K33, K48) as Ub) where K27 has a role (other than conjugation) in the mechanism of protein neddylation.	68
340560	cd17040	Ubl_MoaD_like	ubiquitin-like (Ubl) domain found in a group of small sulfide carrier proteins. Ubiquitin-like (Ubl) domain found in a group of small sulfide carrier proteins. This family includes ThiS, MoaD, CysO, QbsE, and their homologs, which are structurally homologous to ubiquitin (Ub) and may function as the sulfide donor for the biosynthesis of thiamin, molybdopterin, cysteine, thioquinolobactin, and other sulfur-containing natural products. Ub is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair. Ubiquitination is comprised of a cascade of E1, E2 and E3 enzymes that results in a covalent bond between the C-terminus of Ub and the epsilon-amino group of a substrate lysine. Like Ub, small sulfide carrier proteins in this family are adenylated at a diglycyl C-terminus by specific activating proteins. The adenylated C-terminus is subsequently converted to a thiocarboxylate, serving as the sulfide source. Those activating proteins are diverse and show little sequence similarity. This family also includes the small archaeal modifier protein (SAMP), including SAMP1, SAMP2 and SAMP3, which are Ub-like proteins that function as protein modifiers and are required for the production of sulfur-containing biomolecules in the archaeon Haloferax volcanii. SAMP1 and SAMP2 are involved in sulfur transfer during molybdenum cofactor biosynthesis and tRNA thiolation much like MoaD and Urm1, respectively. They can form covalent conjugates with their protein targets through an isopeptide linkage via their C-terminal diglycine motif in a streamlined archaeal E1-dependent pathway. SAMP2 also forms homo-conjugates through the intermolecular isopeptide bond between the C-terminal Gly and the Lys58 side chain, a feature that likely resembles polyubiquitination. SAMP3 conjugates are dependent on the Ub-activating E1 enzyme homolog of archaea (UbaA) for synthesis and are cleaved by the JAMM/MPN+ domain metalloprotease HvJAMM1.	88
340561	cd17041	Ubl_WDR48	Ubiquitin-like (Ubl) domain found in WD repeat-containing protein 48 (WDR48) and similar proteins. WDR48, also termed USP1-associated factor 1 (UAF1), or WD repeat endosomal protein, or p80, is required for histone deubiquitination activity. It stimulates activity of ubiquitin-specific proteases USP1, USP12, and USP46. As a potential tumor suppressor, WDR48 in complex with deubiquitinase USP12 suppresses Akt-dependent cell survival signaling by stabilizing PH domain leucine-rich repeat protein phosphatase 1 (PHLPP1). WDR48 also functions as a novel interaction partner of E1 helicase from anogenital human papillomavirus (HPV) types, and plays an essential role in anogenital HPV DNA replication. WDR48 contains a WD40 domain and a ubiquitin-like domain that shows high sequence and structural similarity with RING finger- and WD40-associated ubiquitin-like (RAWUL) domain.	97
340562	cd17042	Ubl_TmoB	Ubiquitin-like (Ubl) domain found in toluene-4-monooxygenase system protein B (TmoB). TmoB is a component of the multicomponent toluene-4-monooxygenase (T4MO) system that metabolizes toluene as a carbon source. The T4MO complex is composed of a diiron hydroxylase (T4MOH), a Rieske-type ferredoxin (T4MOC), an effector protein (T4MOD), and an NADH oxidoreductase (T4MOF). The T4MOH component consists of TmoA, TmoB, and TmoE. TmoB adopts a beta-grasp ubiquitin-like fold but its precise role remains unclear.	79
340563	cd17043	RA	Ras-associating (RA) domain, structurally similar to a beta-grasp ubiquitin-like fold. RA domain-containing proteins function by interacting with Ras proteins directly or indirectly and are involved in various functions ranging from tumor suppression to being oncoproteins. Ras proteins are small GTPases that are involved in cellular signal transduction. The RA domain has the beta-grasp ubiquitin-like (Ubl) fold with low sequence similarity to ubiquitin (Ub); Ub is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair. RA-containing proteins include RalGDS, AF6, RIN, RASSF1, SNX27, CYR1, STE50, and phospholipase C epsilon.	87
340564	cd17044	Ubl_TBCE	ubiquitin-like (Ubl) domain found in tubulin-folding cofactor E (TBCE) and similar proteins. TBCE, also termed tubulin-specific chaperone E, is a tubulin polymerizing protein involved in the second step of the tubulin folding pathway through cooperating in tubulin heterodimer dissociation both in vivo and in vitro. It may also be implicated in the maintenance of the neuronal microtubule network. Mutations in TBCE gene cause hypoparathyroidism, mental retardation and facial dysmorphism. TBCE contains an N-terminal cytoskeleton-associated protein with glycine-rich segment (CAP-Gly) domain, a leucine-rich repeat protein-protein interaction domain followed by leucine-rich repeat (LRR) domains, and a C-terminal ubiquitin-like (Ubl) domain. Ubiquitin is a protein modifier in eukaryotes that is involved in various cellular processes.	83
340565	cd17045	Ubl_TBCEL	ubiquitin-like (Ubl) domain found in tubulin-specific chaperone cofactor E-like protein (TBCEL) and similar proteins. TBCEL, also termed leucine-rich repeat-containing protein 35 (LRRC35), or E-like (EL), is a novel regulator of tubulin stability, suggesting a link between tubulin turnover and vesicle transport. TBCEL is abundantly expressed in testis, but is also present in several tissues at a much lower level. It is required for the synchronous movement of the investment cones and is important for normal male fertility. TBCEL shows high sequence similarity to tubulin-specific chaperone cofactor E (TBCE), a component of the multimolecular complex required for tubulin heterodimer formation in all eukaryotic cells. It contains a leucine-rich repeat protein-protein interaction domain and a C-terminal ubiquitin-like (Ubl) domain, but does not harbor the cytoskeleton-associated protein with glycine-rich segment (CAP-Gly) domain found in TBCE.	87
340566	cd17046	Ubl_IKKA_like	ubiquitin-like (Ubl) domain found in inhibitor of nuclear factor kappa-B kinases, IKK-alpha and IKK-beta, and similar proteins. IKK, also termed IkappaB kinase, is an enzyme complex involved in propagating the cellular response to inflammation. It is part of the upstream nuclear factor kappa-B kinase (NF-kappaB) signal transduction cascade, and plays an important role in regulating the NF-kappaB transcription factor. IKK is composed of three subunits, IKK-alpha/CHUK, IKK-beta/IKBKB, and IKK-gamma/NEMO. The IKK-alpha and IKK-beta subunits together are catalytically active whereas the IKK-gamma subunit serves a regulatory function. IKK-alpha and IKK-beta phosphorylate the IkappaB proteins, marking them for degradation via ubiquitination and allowing NF-kappaB transcription factors to go into the nucleus. IKK-alpha, also known as IKK-A, or IkappaB kinase A (IkBKA), or conserved helix-loop-helix ubiquitous kinase (CHUK), or I-kappa-B kinase 1 (IKK1), or nuclear factor NF-kappa-B inhibitor kinase alpha (NFKBIKA), or transcription factor 16 (TCF-16), belongs to the serine/threonine protein kinase family. In addition to NF-kappaB response, it has many additional cellular targets in an NF-kappaB-independent manner. For instance, it plays a role in epidermal differentiation, as well as in the regulation of the cell cycle protein cyclin D1. IKK-beta, also known as IKK-B, or IkappaB kinase B (IkBKB), or I-kappa-B kinase 2 (IKK2), or nuclear factor NF-kappa-B inhibitor kinase beta (NFKBIKB), belongs to the serine/threonine protein kinase family as well. It interacts with many different protein partners and has been implicated in the treatment of many inflammatory diseases and cancers. Both IKK-alpha and IKK-beta contain an N-terminal catalytic domain followed by a conserved ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, a common structure involved in protein-protein interactions.	75
340567	cd17047	Ubl_UBFD1	ubiquitin-like (Ubl) domain found in ubiquitin domain-containing protein UBFD1 and similar proteins. UBFD1, also termed ubiquitin-binding protein homolog (UBPH), is a polyubiquitin binding protein containing a conserved ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, a common structure involved in protein-protein interactions. It may play a role as nuclear factor-kappaB (NF-kappaB) regulator.	70
340568	cd17048	Ubl_UBL3	ubiquitin-like (Ubl) domain found in ubiquitin-like protein 3 (UBL3) and similar proteins. UBL3, also termed membrane-anchored ubiquitin-fold protein (MUB), or protein HCG-1, belongs to a newly described MUB protein family with structural homology with ubiquitin. MUB proteins have a beta-grasp ubiquitin-like (Ubl) domain with longer N- and C-termini and extended loops. The Ubl domain contains a C-terminal CAAX-box, a canonical motif for protein prenylation, which is modified through protein lipidation with a hydrophobic membrane anchor. The lipidation and membrane localization inhibit attachment of MUBs to target proteins.	82
340569	cd17049	Ubl_Sacsin	ubiquitin-like (Ubl) domain found in Sacsin and similar proteins. Sacsin, also termed DnaJ homolog subfamily C member 29 (DNAJC29), is encoded by the SACS gene that is highly expressed in the brain. Mutations in SACS can cause the neurodegenerative disease autosomal recessive spastic ataxia of Charlevoix Saguenay (ARSACS) which is characterized by early-onset spastic ataxia. Sacsin is a modular protein that is localized on the mitochondrial surface and possibly required for normal mitochondrial network organization. Sacsin knockdown resulted in a reduction in cells expressing polyglutamine-expanded ataxin-1, which correlated with a loss of cells with large nuclear ataxin-1 inclusions. At the N-terminus, sacsin contains a ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, which can interact with the proteasome. At the C-terminus, sacsin harbors a protein-protein interaction J-domain followed by a higher eukaryotes and prokaryotes nucleotide-binding (HEPN) domain. The J-domain is typically associated with DnaJ-like co-chaperones involved in regulation of the Hsp70 heat shock system.	73
340570	cd17050	Ubl1_ANKUB1	ubiquitin-like (Ubl) domain 1 found in Ankyrin repeat and ubiquitin domain-containing 1 (ANKUB1) and similar proteins. ANKUB1 is an uncharacterized protein with two tandem ubiquitin-like (Ubl) domains located at the N-terminal of Ankyrin repeats (ANK). The Ubl domain may have an adaptor role in ubiquitin (Ub)-signaling by mediating protein-protein interaction. Ubl proteins have a beta-grasp Ubl fold and attach to other proteins in a Ubl manner with biochemically distinct roles. The ankyrin repeats have been identified in numerous proteins with diverse functions. The family corresponds to the first Ubl domain.	79
340571	cd17051	Ubl2_ANKUB1	ubiquitin-like (Ubl) domain 2 found in Ankyrin repeat and ubiquitin domain-containing 1 (ANKUB1) and similar proteins. ANKUB1 is an uncharacterized protein with two tandem ubiquitin-like (Ubl) domains located at the N-terminal of Ankyrin repeats (ANK). The Ubl domain may have an adaptor role in ubiquitin (Ub)-signaling by mediating protein-protein interaction. Ubl proteins have a beta-grasp Ubl fold and attach to other proteins in a Ubl manner with biochemically distinct roles. The ankyrin repeats have been identified in numerous proteins with diverse functions. The family corresponds to the second Ubl domain.	83
340572	cd17052	Ubl1_FAT10	ubiquitin-like (Ubl) domain 1 found in leukocyte antigen F (HLA-F) adjacent transcript 10 (FAT10) and similar proteins. FAT10, also termed ubiquitin D (UBD), or diubiquitin, is a cytokine-inducible ubiquitin-like (Ubl) modifier that is highly expressed in the thymus, and targets substrates covalently for 26S proteasomal degradation. It is also associated with cancer development, antigen processing and antimicrobial defense, chromosomal stability and cell cycle regulation. FAT10 is presented on immune cells and, under inflammatory conditions, is synergistically induced by interferon gamma (IFNgamma) and tumor necrosis factor (TNFalpha) in non-immune (liver parenchymal) cells. FAT10 contains two Ubl domains. The family corresponds to the first Ubl domain of FAT10. Some family members contain only one Ubl domain.	74
340573	cd17053	Ubl2_FAT10	ubiquitin-like (Ubl) domain 2 found in leukocyte antigen F (HLA-F) adjacent transcript 10 (FAT10) and similar proteins. FAT10, also termed ubiquitin D (UBD), or diubiquitin, is a cytokine-inducible ubiquitin-like (Ubl) modifier that is highly expressed in the thymus, and targets substrates covalently for 26S proteasomal degradation. It is also associated with cancer development, antigen processing and antimicrobial defense, chromosomal stability and cell cycle regulation. FAT10 is presented on immune cells and, under inflammatory conditions, is synergistically induced by interferon gamma (IFNgamma) and tumor necrosis factor (TNFalpha) in non-immune (liver parenchymal) cells. FAT10 contains two Ubl domains. The family corresponds to the second Ubl domain of FAT10. Some family members contain only one Ubl domain.	71
340574	cd17054	Ubl_AtBAG1_like	ubiquitin-like (Ubl) domain found in Arabidopsis thaliana Bcl-2-associated athanogenes AtBAG1, AtBAG2, AtBAG3, AtBAG4, and similar proteins. The family includes four Arabidopsis BAG family proteins (AtBAG1, AtBAG2, AtBAG3, AtBAG4) that have very similar domain organizations with a ubiquitin-like (Ubl) domain in the N-terminus and a BAG domain in the C-terminus. They may function as co-chaperones that regulate diverse cellular pathways, such as programmed cell death and stress responses. AtBAG1, AtBAG3, and AtBAG4 are predicted to localize in the cytoplasm, but the localization of AtBAG2 is the microbody. AtBAG4 can interact with Hsc70. The overexpression of AtBAG4 in tobacco plants confers tolerance to a wide range of abiotic stresses such as UV light, cold, oxidants, and salt treatments.	70
340575	cd17055	Ubl_AtNPL4_like	ubiquitin-like (Ubl) domain found in Arabidopsis thaliana NPL4-like proteins NPL4-1, NPL4-2, and similar proteins. The family includes a group of uncharacterized plant ubiquitin-like (Ubl) domain-containing proteins, including Arabidopsis thaliana NPL4-like protein 1 and NPL4-like protein 2.	73
340576	cd17056	Ubl_FAF1	ubiquitin-like (Ubl) domain found in FAS-associated factor 1 (FAF1) and similar proteins. FAF1, also termed UBX domain-containing protein 12 (UBXD12), or UBX domain-containing protein 3A (UBXN3A), belongs to the UBXD family of proteins that contains the ubiquitin regulatory domain X (UBX) with a beta-grasp ubiquitin-like (Ubl) fold, but without the C-terminal double glycine motif. UBX domain is typically located at the carboxyl terminus of proteins, and participates broadly in the regulation of protein degradation. In addition, FAF1 contains two tandem Ubl domains, which show high structural similarity with UBX domain. FAF1 functions as a cofactor of p97 (also known as VCP or Cdc48), which is a homohexameric AAA ATPase (ATPase associated with a variety of activities) involved in a variety of functions ranging from cell-cycle regulation to membrane fusion and protein degradation. The FAF1-p97 complex inhibits the proteasomal protein degradation in which p97 acts as a co-chaperone. Moreover, FAF1 is an apoptotic signaling molecule that acts downstream in the Fas signal transduction pathway. It interacts with the cytoplasmic domain of Fas, but not with a Fas mutant that is deficient in signal transduction. FAF1 is widely expressed in adult and embryonic tissues, and in tumor cell lines, and is localized not only in the cytoplasm where it interacts with Fas, but also in the nucleus. FAF1 contains phosphorylation sites for protein kinase CK2 within the nuclear targeting domain. Phosphorylation influences nuclear localization of FAF1 but does not affect its potentiation of Fas-induced apoptosis. Other functions have also been attributed to FAF1. It inhibits nuclear factor-kappaB (NF-kappaB) by interfering with the nuclear translocation of the p65 subunit. Although the precise role of FAF1 in the ubiquitination pathway remains unclear, FAF1 interacts with valosin-containing protein (VCP), which is involved in the ubiquitin-proteasome pathway. This family corresponds to Ubl domains.	71
340577	cd17057	Ubl_TMUB1_like	ubiquitin-like (Ubl) domain found in transmembrane and ubiquitin-like domain-containing proteins TMUB1, TMUB2, and similar proteins. TMUB1, also termed dendritic cell-derived ubiquitin-like protein (DULP), or hepatocyte odd protein shuttling protein, or ubiquitin-like protein SB144, or HOPS, is highly expressed in the nervous system. It is involved in the termination of liver regeneration and plays a negative role in interleukin-6-induced hepatocyte proliferation. The overexpression of Tmub1 has been shown to play a role in the inhibition of cell proliferation. TMUB1 has been implicated in the regulation of locomotor activity and wakefulness in mice, perhaps acting through its interaction with CAMLG. It also facilitates the recycling of AMPA receptors into synaptic membrane in cultured primary neurons. TMUB1 contains transmembrane domains and a ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold. TMUB2 is an uncharacterized transmembrane domain and Ubl domain-containing protein that shows high sequence similarity to TMUB1.	74
340578	cd17058	Ubl_SNRNP25	ubiquitin-like (Ubl) domain found in small nuclear ribonucleoprotein U11/U12 subunit 25 (SNRNP25) and similar proteins. SNRNP25, also termed U11/U12 small nuclear ribonucleoprotein 25 kDa protein, U11/U12 snRNP 25 kDa protein (U11/U12-25K), or minus-99 protein, is a component of the U11/U12 snRNPs that are part of the U12-type spliceosome. It contains a conserved ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, a common structure involved in protein-protein interactions.	89
340579	cd17059	Ubl_OTU1	ubiquitin-like (Ubl) domain found in ubiquitin thioesterase OTU1 and similar proteins. OTU1 (EC 3.4.19.12), also termed YOD1, or DUBA-8, or HIV-1-induced protease 7 (HIN-7), or OTU domain-containing protein 2 (OTUD2), is a p97-associated deubiquitinylase that functions as a key player in endoplasmic reticulum-associated degradation (ERAD). Its deubiquitinylase activity is also required for negatively regulating cholera toxin A1 (CTA1) retro-translocation. OTU1 contains a conserved ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, a C2H2-type zinc finger, and an OTU domain.	75
340580	cd17060	Ubl_RB1CC1	ubiquitin-like (Ubl) domain found in retinoblastoma 1-inducible coiled-coil protein 1 (RB1CC1) and similar proteins. RB1CC1, also termed FAK family kinase-interacting protein of 200 kDa (FIP200), is the mammalian counterpart of the yeast Atg17 gene and functions as a component of the ULK1/Atg13/RB1CC1/Atg101 complex essential for induction of autophagy. RB1CC1 is a key signaling node to regulate cellular proliferation and differentiation. As a DNA-binding transcription factor, RB1CC1 has been implicated in the regulation of retinoblastoma 1 (RB1) expression. RB1CC1 contains a conserved ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, as well as a nuclear localization signal (KPRK), a leucine zipper motif and a coiled-coil structure.	75
340581	cd17061	Ubl_IQUB	ubiquitin-like (Ubl) domain found in IQ and ubiquitin-like domain-containing protein (IQUB) and similar proteins. IQUB is an IQ motif and ubiquitin domain-containing protein that may play roles in cilia formation and/or maintenance. It contains a conserved ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, a common structure involved in protein-protein interactions.	79
340582	cd17062	Ubl_NUB1	ubiquitin-like (Ubl) domain found in NEDD8 ultimate buster 1 (NUB1) and similar proteins. NUB1, also termed negative regulator of ubiquitin-like proteins 1, or renal carcinoma antigen NY-REN-18, or protein BS4, is a NEDD8-interacting protein that can be induced by interferon. It functions as a strong post-transcriptional down-regulator of the NEDD8 expression and plays critical roles in regulating many biological events, such as cell growth, NF-kappaB signaling, and biological responses to hypoxia. NUB1 can also interact with aryl hydrocarbon receptor-interacting protein-like 1 (AIPL1), which may function in the regulation of cell cycle progression. NUB1 contains a conserved ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, three ubiquitin-associated domains (UBA), a bipartite nuclear localization signal (NLS) and a PEST motif.	78
340583	cd17063	Ubl_ANKRD60	ubiquitin-like (Ubl) domain found in ankyrin repeat domain-containing protein 60 (ANKRD60) and similar proteins. ANKRD60 is an uncharacterized ankyrin repeat domain-containing protein which also harbors a conserved ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, a common structure involved in protein-protein interactions.	77
340584	cd17064	Ubl_TAFs_like	ubiquitin-like (Ubl) domain found in plant TBP-associated factors (TAFs) and similar proteins. TAFs, also termed transcription initiation factor TFIID subunits, or TAFII250 subunits, are components of the TFIID complex, a multisubunit protein complex involved in promoter recognition and essential for mediating regulation of RNA polymerase transcription. TAFs are the core scaffold of the TFIID complex, which is comprised of the TATA binding protein (TBP) and 12-15 TAFs. TAFs contain a ubiquitin-like (Ubl) domain and a Bromo domain.	72
340585	cd17065	Ubl_UBP24	ubiquitin-like (Ubl) domain found in ubiquitin carboxyl-terminal hydrolase 24 (UBP24) and similar proteins. UBP24 (EC 3.4.19.12), also termed deubiquitinating enzyme 24, or ubiquitin thioesterase 24, or ubiquitin-specific-processing protease 24 (USP24), is a deubiquitinating protein that interacts with damage-specific DNA-binding protein 2 (DDB2) and regulates DDB2 stability. It may also play a role in the pathogenesis of Parkinson's disease (PD). UBP24 proteins contain an N-terminal ubiquitin-associated (UBA) domain, a ubiquitin-like (Ubl) domain, and a C-terminal peptidase C19 domain.	79
340586	cd17066	Ubl_KPC2	ubiquitin-like (Ubl) domain found in Kip1 ubiquitination-promoting complex protein 2 (KPC2) and similar proteins. KPC2, also termed ubiquitin-associated domain-containing protein 1 or UBA domain containing 1 (UBAC1), or glialblastoma cell differentiation-related protein 1 (GBDR1), is one of two subunits of Kip1 ubiquitination-promoting complex (KPC), a novel E3 ubiquitin-protein ligase that also contains KPC1 subunit and regulates the ubiquitin-dependent degradation of the cyclin-dependent kinase (CDK) inhibitor p27 at G1 phase. KPC2 contains a ubiquitin-like (Ubl) domain and two ubiquitin-associated (UBA) domains.	87
340587	cd17067	RBD2_RGS12_like	Ras-binding domain (RBD) 2 of regulator of G protein signaling 12 (RGS12) and similar proteins. Regulator of G-protein signaling (RGS) proteins belong to a large family of GTPase-accelerating proteins (GAPs) which act as key inhibitors of G-protein-mediated cell responses in eukaryotes. The RGS12-like subfamily is composed of RGS12 and RGS14, with multidomain architectures including an RGS domain, two tandem Ras-binding domains (RBDs), and a second Galpha interacting domain, the GoLoco motif. The RBD is structurally similar to the beta-grasp fold of ubiquitin, a common structure involved in protein-protein interactions. Ubiquitin is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair.	72
340588	cd17068	RBD_PLEKHG5	Ras-binding domain (RBD) found in pleckstrin homology (PH) and RhoGEF domain containing G5 (PLEKHG5) and similar proteins. PLEKHG5, also termed PH domain-containing family G member 5, or guanine nucleotide exchange factor 720 (GEF720), Syx, or Tech, is a novel Dbl-like protein related to p115Rho-GEF. It functions as a Rho guanine nucleotide exchange factor directly activating RhoA in vivo and potentially involved in the control of neuronal cell differentiation. It also regulates the balance of the RhoA downstream effector Dia and ROCK activities to promote polarized-cancer-cell migration. Moreover, PLEKHG5 activates the nuclear factor kappaB (NFkappaB) signaling pathway. Mutations in the PLEKHG5 gene are associated with autosomal recessive intermediate Charcot-Marie-Tooth disease (CMT) and lower motor neuron disease (LMND).	75
340589	cd17069	DCX2	Doublecortin-like domain 2. Members of the doublecortin (DCX) gene family are microtubule-associated proteins (MAPs). Microtubules are key components of the cytoskeleton that are involved in cell movement, shape determination, division and transport. The DCX gene family consists of eleven paralogs in human and mouse, and its protein domains can occur in double tandem or as a single repeat. The first repeat of DCX domain has a stable ubiquitin-like tertiary fold. Proteins with DCX double tandem domains in general have roles in microtubule (MT) regulation and signal transduction such as X-linked doublecortin (DCX), retinitis pigmentosa-1 (RP1) and doublecortin-like kinase (DCLK).	84
340590	cd17070	DCX2_RP_like	Doublecortin-like domain 2 found in retinitis pigmentosa (RP)-like protein. The RP-like protein family is part of the doublecortin (DCX) superfamily with double tandem DCX repeats that are associated with retinitis pigmentosa. DCX is a microtubule-associated protein (MAP) with a stable ubiquitin-like tertiary fold. Microtubules are key components of the cytoskeleton that are involved in cell movement, shape determination, division and transport. RP-like proteins are colocalized to the photoreceptor and share a function in outer segment disc morphogenesis.	69
340591	cd17071	DCX1_DCDC2_like	Doublecortin-like domain 1 found in doublecortin domain-containing protein 2 (DCDC2) and similar proteins. DCDC2 is a member of the doublecortin (DCX) family. It is a microtubule-associated protein (MAP) with stable double tandem DCX repeats of ubiquitin-like tertiary fold. Ubiquitin (Ub) is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair. Microtubules are key components of the cytoskeleton that are involved in cell movement, shape determination, division and transport. DCDC2 genetic variation in humans is associated with reading disability, attention deficit hyperactivity disorder (ADHD), and difficulties in mathematics. A genetic variant of DCDC2 associates with dyslexia, a common neurobehavioral disorder of reading. DCDC2 protein interacts with many of the same cytoskeleton related proteins that other members of the DCX family interact with.	80
340592	cd17072	DCX_DCDC5_like	Doublecortin-like domain found in doublecortin domain-containing protein 5 (DCDC5) and similar proteins. DCDC5 is a member of doublecortin (DCX) family. It is a microtubule-associated protein (MAP) with stable double tandem DCX repeats of ubiquitin-like tertiary fold. Ubiquitin (Ub) is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair. Microtubules are key components of the cytoskeleton that are involved in cell movement, shape determination, division and transport. DCDC5 is expressed during mitosis and involved in coordinating late cytokinesis. DCDC5 interacts with cytoplasmic dynein and Rab8 as well as with the Rab8 nucleotide exchange factor Rabin8. This family also includes DCDC1, which is a hydrophilic intracellular protein that contains only one DCX repeat. Therefore, DCDC1 might only bind to microtubules without microtubule polymerization properties. DCDC1 is mainly expressed in adult testis.	71
340593	cd17073	KHA	KHA, dimerization domain of potassium ion channel, similar to doublecortin-like domain, found in potassium channel tetramerization domain containing 9 (KCTD9) and similar proteins. This family corresponds to KHA, the tetramerization domain of eukaryotic voltage-dependent potassium ion-channel proteins, mainly found in vertebrate KCTD9 and plant AKT proteins. In plants the domain lies at the C-terminus whereas in many chordates it lies at the N-terminus. KHA shows high sequence similarity with doublecortin-like domain, which has a stable ubiquitin-like tertiary fold. KCTD9, also termed BTB/POZ domain-containing protein 9, belongs to the KCTD protein family, which corresponds to potassium channel tetramerization domain proteins, a class of BTB-domain-containing proteins. It is involved in potassium channel formation. Moreover, KCTD9 contributes to liver injury through NK cell activation during hepatitis B virus (HBV)-induced acute-on-chronic liver failure. AKT proteins play crucial roles in K+ uptake and translocation in plant cells.	65
340594	cd17074	Ubl_CysO_like	ubiquitin-like (Ubl) domain found in Mycobacterium tuberculosis CysO and similar proteins. CysO, also termed 9.5 kDa culture filtrate antigen cfp10A, together with CysM (Cysteine synthase M), forms a protein complex CysM-CysO that represents a new cysteine biosynthetic pathway in Mycobacterium tuberculosis. The replacement of the acetyl group of O-acetylserine by CysO thiocarboxylate to generate a protein-bound cysteine is catalyzed by CysM in a pyridoxal 5'-phosphate (PLP)-dependent manner. The family also includes QbsE that functions as the sulfide donor for the biosynthesis of thioquinolobactin in Pseudomonas fluorescens. A JAMM motif protein QbsD catalyzes removal of the carboxy-terminal dipeptide from QbsE. Both CysO and QbsE are similar to prokaryotic sulfur carrier proteins such as ThiS and MoaD, containing the beta-grasp ubiquitin-like fold.	89
340595	cd17075	UBX1_UBXN9	Ubiquitin regulatory domain X (UBX) 1 found in UBX domain protein 9 (UBXN9, UBXD9, or ASPSCR1) and similar proteins. UBXN9, also termed tether containing UBX domain for GLUT4 (TUG), or alveolar soft part sarcoma chromosomal region candidate gene 1 protein (ASPSCR1), or alveolar soft part sarcoma locus (ASPL), or renal papillary cell carcinoma protein 17 (RCC17), belongs to the UBXD family of proteins that contains two ubiquitin regulatory domains X (UBX) with a beta-grasp ubiquitin-like fold, but without the C-terminal double glycine motif. UBX domain is typically located at the carboxyl terminus of proteins, and participates broadly in the regulation of protein degradation. In addition, UBXN9 contains an N-terminal ubiquitin-like (Ubl) domain.  UBXN9 functions as a cofactor of p97 (also known as VCP or Cdc48), which is a homohexameric AAA ATPase (ATPase associated with a variety of activities) involved in a variety of functions ranging from cell-cycle regulation to membrane fusion and protein degradation. UBXN9 is involved in insulin-stimulated redistribution of the glucose transporter GLUT4, assembly of the Golgi apparatus. In addition to GLUT4, UBXN9 also controls vesicle translocation by interacting with insulin-regulated aminopeptidase (IRAP), a transmembrane aminopeptidase. UBXN9 and its budding yeast ortholog, Ubx4p, are multifunctional proteins that share some, but not all functions. Yeast Ubx4p is important for endoplasmic reticulum-associated protein degradation (ERAD) but UBXN9 appears not to share this function.	85
340596	cd17076	UBX_UBXN10	Ubiquitin regulatory domain X (UBX) found in UBX domain protein 10 (UBXN10) and similar proteins. UBXN10, also termed UBX domain-containing protein 3 (UBXD3), belongs to the UBXD family of proteins that contains the ubiquitin regulatory domain X (UBX) with a beta-grasp ubiquitin-like fold, but without the C-terminal double glycine motif. UBX domain is typically located at the carboxyl terminus of proteins, and participates broadly in the regulation of protein degradation. UBXN10 functions as a cofactor of p97 (also known as VCP or Cdc48), which is a homohexameric AAA ATPase (ATPase associated with a variety of activities) involved in a variety of functions ranging from cell-cycle regulation to membrane fusion and protein degradation. UBXN10 localizes to cilia in a p97-dependent manner, and both p97 and UBXN10 are required for ciliogenesis. Additionally, UBXN10 interacts with the intraflagellar transport B (IFT-B) and regulates anterograde transport into cilia.	76
340597	cd17077	UBX_UBXN11	Ubiquitin regulatory domain X (UBX) found in UBX domain protein 11 (UBXN11) and similar proteins. UBXN11, also termed colorectal tumor-associated antigen COA-1, or socius, or UBX domain-containing protein 5 (UBXD5), belongs to the UBXD family of proteins that contains the ubiquitin regulatory domain X (UBX) with a beta-grasp ubiquitin-like fold, but without the C-terminal double glycine motif. UBX domain is typically located at the carboxyl terminus of proteins, and participates broadly in the regulation of protein degradation. UBXN11 may function as a cofactor of p97 (also known as VCP or Cdc48), which is a homohexameric AAA ATPase (ATPase associated with a variety of activities) involved in a variety of functions ranging from cell-cycle regulation to membrane fusion and protein degradation. UBXN11 also acts as a novel interacting partner of Rnd proteins (Rnd1, Rnd2, and Rnd3/RhoE), new members of Rho family of small GTPases. It directly binds to Rnd GTPases through its C-terminal region, and further participates in disassembly of actin stress fibers. UBXN11 also binds directly to Galpha12 and Galpha13 through its N-terminal region. As a novel activator of the Galpha12 family, UBXN11 promotes the Galpha12-induced RhoA activation.	76
340598	cd17078	Ubl_SLD1_NFATC2ip	SUMO-like domain 1 (SLD1), structurally similar to a beta-grasp ubiquitin-like fold, found in nuclear factor of activated T-cells 2 interacting protein (NFATC2ip) and similar proteins. NFATC2ip, also termed nuclear factor of activated T cells (NFAT), cytoplasmic, calcineurin dependent 2 interacting protein, or 45 kDa NF-AT-interacting protein, or 45 kDa NFAT-interacting protein (Nip45), or nuclear factor of activated T-cells, or cytoplasmic 2-interacting protein, belongs to the eukaryotic-specific Rad60-Esc2-Nip45 (RENi) protein family. The family members may act as factors in transcriptional regulation, chromatin silencing and genomic stability, and typically contain an N-terminal polar/charged low-complexity segment and two C-terminal consecutive unique small ubiquitin-related modifier (SUMO)-like domains (SLD1 and SLD2) with beta-grasp fold. NFATC2ip was first identified as a co-regulator with NFAT and the T helper 2 (Th2)-specific transcription factor, c-Maf, to induce IL-4 production. NFATC2ip has also been implicated in cellular differentiation and coordination of the immune response in humans and mice.	74
340599	cd17079	Ubl_SLD2_NFATC2ip	SUMO-like domain 2 (SLD2), structurally similar to a beta-grasp ubiquitin-like fold, found in nuclear factor of activated T-cells 2 interacting protein (NFATC2ip) and similar proteins. NFATC2ip, also termed nuclear factor of activated T cells (NFAT), cytoplasmic, calcineurin dependent 2 interacting protein, or 45 kDa NF-AT-interacting protein, or 45 kDa NFAT-interacting protein (Nip45), or nuclear factor of activated T-cells, or cytoplasmic 2-interacting protein, belongs to the eukaryotic-specific Rad60-Esc2-Nip45 (RENi) protein family. The family members may act as factors in transcriptional regulation, chromatin silencing and genomic stability, and typically contain an N-terminal polar/charged low-complexity segment and two C-terminal consecutive unique small ubiquitin-related modifier (SUMO)-like domains (SLD1 and SLD2) with beta-grasp fold. NFATC2ip was first identified as a co-regulator with NFAT and the T helper 2 (Th2)-specific transcription factor, c-Maf, to induce IL-4 production. NFATC2ip has also been implicated in cellular differentiation and coordination of the immune response in humans and mice. NFATC2ip SLD2 domain binds to E2 SUMOylation enzyme, Ubc9, in an almost identical manner to that of SUMO and thereby inhibits elongation of poly-SUMO chains.	73
340600	cd17080	Ubl_SLD2_Esc2_like	SUMO-like domain 2 (SLD2), structurally similar to a beta-grasp ubiquitin-like fold, found in Saccharomyces cerevisiae establishes silent chromatin protein 2 (Esc2p) and similar proteins. Protein Esc2p belongs to the eukaryotic-specific Rad60-Esc2-Nip45 (RENi) protein family, whose members may act as factors in transcriptional regulation, chromatin silencing and genomic stability, and typically contain an N-terminal polar/charged low-complexity segment and two C-terminal consecutive unique small ubiquitin-related modifier (SUMO)-like domains (SLD1 and SLD2) with beta-grasp fold. Yeast Esc2p was identified as a factor promoting gene silencing. It is also required for genome integrity during DNA replication and sister chromatid cohesion in Saccharomyces cerevisiae. Esc2p promotes Mus81p complex-activity via its SUMO-like and DNA binding domains. It also acts as a novel structure-specific DNA-binding factor implicated in the local regulation of the Srs2p helicase through promoting recombination at sites of stalled replication. In addition, Esc2p specifically promotes the accumulation of SUMOylated Mms21-specific substrates and functions with Mms21p to suppress gross chromosomal rearrangements (GCRs). This family also includes DNA repair protein Rad60p from Schizosaccharomyces pombe. It is a SUMO mimetic and SUMO-targeted ubiquitin ligase (STUbL)-interacting protein that is required for the repair of DNA double strand breaks, recovery from S phase replication arrest, and plays an essential role in cell viability. Like other RENi family members, Rad60p has two SLD domains.	74
340601	cd17081	RAWUL_PCGF1	RING finger- and WD40-associated ubiquitin-like (RAWUL) domain found in polycomb group RING finger protein 1 (PCGF1) and similar proteins. PCGF1, also termed nervous system Polycomb-1 (NSPc1), or RING finger protein 68 (RNF68), is one of six PcG RING finger (PCGF) homologs (PCGF1/NSPc1, PCGF2/Mel-18, PCGF3, PCGF4/BMI1, PCGF5, and PCGF6/MBLR) and serves as the core component of a noncanonical Polycomb repressive complex 1 (PRC1)-like BCOR complex that also contains RING1, RNF2, RYBP, SKP1, as well as the BCL6 co-repressor BCOR and the histone demethylase KDM2B, and is required to maintain the transcriptionally repressive state of some genes, such as Hox genes, BCL6 and the cyclin-dependent kinase inhibitor, CDKN1A. PCGF1 promotes cell cycle progression and enhances cell proliferation as well. It is a cell growth regulator that acts as a transcriptional repressor of p21Waf1/Cip1 via the retinoic acid response element (RARE). Moreover, PCGF1 functions as an epigenetic regulator involved in hematopoietic cell differentiation. It cooperates with the transcription factor runt-related transcription factor 1 (Runx1) in regulating differentiation and self-renewal of hematopoietic cells. Furthermore, PCGF1 represents a physical and functional link between Polycomb function and pluripotency. PCGF1 contains a C3HC4-type RING-HC finger and a RAWUL domain.	92
340602	cd17082	RAWUL_PCGF2_like	RING finger- and WD40-associated ubiquitin-like (RAWUL) domain found in polycomb group RING finger proteins PCGF2, PCGF4, and similar proteins. This family includes polycomb group RING finger proteins, PCGF2 (also known as Mel-18, or RNF110, or ZNF144) and PCGF4 (also known as BMI-1, or RNF51), both serving as the core component of a canonical polycomb repressive complex 1 (PRC1). PRC1 is composed of a chromodomain-containing protein (CBX2, CBX4, CBX6, CBX7 or CBX8) and a polyhomeotic protein (PHC1, PHC2, or PHC3). Like other PCGF homologs, PCGF2 associates with ring finger protein 2 (RNF2) to form a RNF2-PCGF heterodimer, which is catalytically competent as an E3 ubiquitin transferase and is the scaffold for the assembly of additional components. PCGF4 associates with the Runx1/CBFbeta transcription factor complex to silence the target gene in a PRC2-independent manner. Both, PCGF2 and PCGF4, contain a C3HC4-type RING-HC finger and a RAWUL domain.	108
340603	cd17083	RAWUL_PCGF3	RING finger- and WD40-associated ubiquitin-like (RAWUL) domain found in polycomb group RING finger protein 3 (PCGF3) and similar proteins. PCGF3, also termed RING finger protein 3A (RNF3A), is one of six PcG RING finger (PCGF) homologs (PCGF1, PCGF2/Mel-18, PCGF3, PCGF4/BMI1, PCGF5, and PCGF6) and serves as the core component of a Polycomb repressive complex 1 (PRC1). Like other PCGF homologs, PCGF3 associates with ring finger protein 2 (RNF2) to form a RNF2-PCGF heterodimer, which is catalytically competent as an E3 ubiquitin transferase and is the scaffold for the assembly of additional components. PCGF3 contains a C3HC4-type RING-HC finger, and a RAWUL domain.	85
340604	cd17084	RAWUL_PCGF5	RING finger- and WD40-associated ubiquitin-like (RAWUL) domain found in polycomb group RING finger protein 5 (PCGF5) and similar proteins. PCGF5, also termed RING finger protein 159 (RNF159), is one of six PcG RING finger (PCGF) homologs (PCGF1/NSPc1, PCGF2/Mel-18, PCGF3, PCGF4/BMI1, PCGF5, and PCGF6/MBLR) and serves as the core component of a Polycomb repressive complex 1 (PRC1). Like other PCGF homologs, PCGF5 associates with ring finger protein 2 (RNF2) to form a RNF2-PCGF heterodimer, which is catalytically competent as an E3 ubiquitin transferase and is the scaffold for the assembly of additional components. PCGF5 contains a C3HC4-type RING-HC finger, and a RAWUL domain.	101
340605	cd17085	RAWUL_PCGF6	RING finger- and WD40-associated ubiquitin-like (RAWUL) domain found in polycomb group RING finger protein 6 (PCGF6) and similar proteins. PCGF6, also termed Mel18 and Bmi1-like RING finger (MBLR), or RING finger protein 134 (RNF134), is one of six PcG RING finger (PCGF) homologs (PCGF1/NSPc1, PCGF2/Mel-18, PCGF3, PCGF4/BMI1, PCGF5 and PCGF6/MBLR), and serves as the core component of a noncanonical Polycomb repressive complex 1 (PRC1)-like L3MBTL2 complex, which is composed of some canonical components, such as RNF2, CBX3, CBX4, CBX6, CBX7 and CBX8, as well as some noncanonical components, such as L3MBTL2, E2F6, WDR5, HDAC1 and RYBP, and plays critical roles in epigenetic transcriptional silencing in higher eukaryotes. Like other PCGF homologs, PCGF6 possesses transcriptional repression activity, and also associates with ring finger protein 2 (RNF2) to form a RNF2-PCGF heterodimer, which is catalytically competent as an E3 ubiquitin transferase and is the scaffold for the assembly of additional components. Moreover, PCGF6 can regulate the enzymatic activity of JARID1d/KDM5D, a trimethyl H3K4 demethylase, through direct interaction with it. Furthermore, PCGF6 is expressed predominantly in meiotic and post-meiotic male germ cells and may play important roles in mammalian male germ cell development. It also regulates mesodermal lineage differentiation in mammalian embryonic stem cells (ESCs) and functions in induced pluripotent stem (iPS) reprogramming. The activity of PCGF6 is found to be regulated by cell cycle dependent phosphorylation. PCGF6 contains a C3HC4-type RING-HC finger, and a RAWUL domain.	89
340606	cd17086	RAWUL_RING1_like	RING finger- and WD40-associated ubiquitin-like (RAWUL) domain found in really interesting new gene proteins RING1, RING2 and similar proteins. RING1, also termed polycomb complex protein RING1, or RING finger protein 1 (RNF1), or RING finger protein 1A (RING1A), has been identified as a transcriptional repressor that is associated with the Polycomb group (PcG) protein complex involved in stable repression of gene activity. RING2, also termed huntingtin-interacting protein 2-interacting protein 3, or HIP2-interacting protein 3, or protein DinG, or RING finger protein 1B (RING1B), or RING finger protein 2 (RNF2), or RING finger protein BAP-1, is an E3 ubiquitin-protein ligase that interacts with both nucleosomal DNA and an acidic patch on histone H4 to achieve the specific monoubiquitination of K119 on histone H2A (H2AK119ub), thereby playing a central role in histone code and gene regulation. Both RING1 and RING2 are core components of polycomb repressive complex 1 (PRC1) that functions as an E3 ubiquitin ligase transferring the mono-ubiquitin mark to the C-terminal tail of Histone H2A at K118/K119. PRC1 is also capable of chromatin compaction, a function not requiring histone tails, and this activity appears important in gene silencing. RING2 acts as the main E3 ubiquitin ligase on histone H2A of the PRC1 complex, while RING1 may rather act as a modulator of RNF2/RING2 activity. Members in this family contain a C3HC4-type RING-HC finger, and a RAWUL domain.	106
340607	cd17087	RAWUL_DRIP_like	RING finger- and WD40-associated ubiquitin-like (RAWUL) domain found in DREB2A-interacting protein (DRIP) and similar proteins. Dehydration-Responsive Element-Binding Protein 2A (DREB2A) regulates the expression of stress-inducible genes via the dehydration-responsive elements and requires posttranslational modification for its activation. DREB2A-Interacting Protein (DRIP) contains a RING finger, and a RING finger- and WD40-associated ubiquitin-like (RAWUL) domain. DRIP interacts with DREB2A and functions as an E3 ubiquitin ligase that negatively regulates DREB2A expression.	104
340608	cd17088	FERM_F1_FRMPD1_like	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in FERM and PDZ domain-containing proteins FRMPD1, FRMPD3, FRMPD4, and similar proteins. This family includes FERM and PDZ domain-containing proteins FRMPD1, FRMPD3, and FRMPD4, which all contain a PDZ domain and a FERM domain. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain of the FERM domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). FRMPD1, also termed FERM domain-containing protein 2, is an activator of G-protein signaling 3 (AGS3)-binding protein that regulates the subcellular location of AGS3 and its interaction with G-proteins. FRMPD4, also termed PDZ domain-containing protein 10, or PSD-95-interacting regulator of spine morphogenesis (Preso), is a multiscaffolding protein that modulates both Homer1 and post-synaptic density protein 95 activity. Both FRMPD1 and FRMPD4 can associate with the tetratricopeptide repeat (TPR) motif-containing adaptor protein LGN. The biological role of FRMPD3 remains unclear.	90
340609	cd17089	FERM_F0_TLN	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F0 sub-domain, found in Talin and similar proteins. Talin is a cytoskeletal protein that activates integrins and couples them to cytoskeletal actin. Talin consists of an N-terminal head and a C-terminal rod. The talin head harbors a FERM (Band 4.1, ezrin, radixin, moesin) domain made up of F1, F2 and F3 domains, as well as an N-terminal region that precedes the FERM domain and has been referred to as the F0 domain. Both F0 and F1 domains have similar ubiquitin-like folds. This family corresponds to the F0 domain that is joined to the F1 domain in a novel fixed orientation by an extensive charged interface. It is required for maximal integrin-activation, by interacting with other FA components; no binding partner has yet been found for it.	84
340610	cd17090	FERM_F1_TLN	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in Talin and similar proteins. Talin is a cytoskeletal protein that activates integrins and couples them to cytoskeletal actin. Talin consists of an N-terminal head and a C-terminal rod. The talin head harbors a FERM (Band 4.1, ezrin, radixin, moesin) domain made up of F1, F2 and F3 domains, as well as an N-terminal region that precedes the FERM domain and has been referred to as the F0 domain. Both F0 and F1 domains have similar ubiquitin-like folds. This family corresponds to the F1 domain, which is joined to the F0 domain in a novel fixed orientation by an extensive charged interface. The F0 domain is required for maximal integrin-activation, by interacting with other FA components; no binding partner has yet been found for it.	111
340611	cd17091	FERM_F0_SHANK	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F0 sub-domain, found in SH3 and multiple ankyrin repeat domains proteins, SHANK1, SHANK2, and SHANK3. SHANK proteins (SHANK1, SHANK2, and SHANK3) are core components of the postsynaptic density (PSD) of excitatory synapses. They act as scaffolding molecules that cluster neurotransmitter receptors as well as cell adhesion molecules attaching them to the actin cytoskeleton. They play important roles in proper excitatory synapse and circuit function. Mutations in SHANK genes, especially in SHANK3 and SHANK2, may lead to neuropsychiatric disorders, such as autism spectrum disorder (ASD). SHANK proteins contain an N-terminal F0 domain of FERM (Band 4.1, ezrin, radixin, moesin), six ankyrin (ANK) repeats, one SH3 (Src homology 3) domain, one PDZ (PSD-95, Dlg, and ZO-1/2, also termed DHR or GLGF) domain, and a C-terminal SAM (sterile alpha motif) domain. This family corresponds to the F0 domain that adopts a ubiquitin-like fold.	84
340612	cd17092	FERM1_F1_Myosin-VII	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain 1, F1 sub-domain, found in Myosin-VIIa, Myosin-VIIb, and similar proteins. This family includes two nontraditional members of the myosin superfamily, myosin-VIIa and myosin-VIIb. Myosin-VIIa, also termed myosin-7a (Myo7a), has been implicated in the structural organization of hair bundles at the apex of sensory hair cells (SHCs) where it serves mechanotransduction in the process of hearing and balance. Mutations in the MYO7A gene may be associated with Usher Syndrome type 1B (USH1B) and nonsyndromic hearing loss (DFNB2, DFNA11). Myosin-VIIb, also termed myosin-7b (Myo7b), is a high duty ratio motor adapted for generating and maintaining tension. It associates with harmonin and ANKS4B to form a stable ternary complex for anchoring microvilli tip-link cadherins. Like other unconventional myosins, myosin-VII is composed of a conserved motor head, a neck region and a tail region containing two MyTH4 domains, a SH3 domain, and two FERM domains. The FERM domain is made up of three sub-domains, F1, F2, and F3. The family corresponds to the F1 sub-domain of the first FERM domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N).	99
340613	cd17093	FERM2_F1_Myosin-VII	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain 2, F1 sub-domain, found in Myosin-VIIa, Myosin-VIIb, and similar proteins. This family includes two nontraditional members of myosin superfamily, myosin-VIIa and myosin-VIIb. Myosin-VIIa, also termed myosin-7a (Myo7a), has been implicated in the structural organization of hair bundles at the apex of sensory hair cells (SHCs) where it serves mechanotransduction in the process of hearing and balance. Mutations in MYO7A gene may be associated with Usher Syndrome type 1B (USH1B) and nonsyndromic hearing loss (DFNB2, DFNA11). Myosin-VIIb, also termed myosin-7b (Myo7b), is a high duty ratio motor adapted for generating and maintaining tension. It associates with harmonin and ANKS4B to form a stable ternary complex for anchoring microvilli tip-link cadherins. Like other unconventional myosins, myosin-VII is composed of a conserved motor head, a neck region and a tail region containing two MyTH4 domains, a SH3 domain, and two FERM domains. The FERM domain is made up of three sub-domains, F1, F2, and F3. The family corresponds to the F1 sub-domain of the second FERM domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N).	98
340614	cd17094	FERM_F1_Max1_like	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in Caenorhabditis elegans max-1 and its homologs PLEKHH1 and PLEKHH2. Caenorhabditis elegans max-1 is expressed and functions in motor neurons. MAX-1 protein plays a possible role in netrin-induced axon repulsion by modulating the UNC-5 receptor signaling pathway. PLEKHH1 is critically required in vascular patterning in vertebrate species through acting upstream of the ephrin pathway. PLEKHH2 is highly enriched in renal glomerular podocytes, and acts as a novel, important component of the podocyte foot processes. It is involved in matrix adhesion and actin dynamics. Members in this family all contain two Pleckstrin homology (PH) domains, a MyTH4 domain, and a FERM (Band 4.1, ezrin, radixin, moesin) domain within the C-terminal half. The FERM domain is made up of three sub-domains, F1, F2, and F3. The family corresponds to F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N).	102
340615	cd17095	FERM_F0_kindlins	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F0 sub-domain, found in the kindlin family. The kindlin family is composed of kindlin-1, 2 and 3, which are FERM domain-containing adaptor molecules that interact with the cytoplasmic component of integrins and regulate cell-matrix connections. Kindlins belong to the 4.1-ezrin-radixin-moesin (FERM) domain containing protein family. They contain F1, F2 and F3 subdomains that typify FERM family members, and these subdomains are preceded by an N-terminal F0 subdomain. Both F0 and F1 domains have similar ubiquitin-like folds. This family corresponds to the F0 domain. In addition, a distinctive feature of kindlins is the insertion of a pleckstrin homology (PH) subdomain into the F2 subdomain.	80
340616	cd17096	FERM_F1_kindlins	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in the kindlin family. The kindlin family is composed of kindlin-1, 2 and 3, which are FERM domain-containing adaptor molecules that interact with the cytoplasmic component of integrins and regulate cell-matrix connections. Kindlins belong to the 4.1-ezrin-radixin-moesin (FERM) domain containing protein family. They contain F1, F2 and F3 subdomains that typify FERM family members, and these subdomains are preceded by an N-terminal F0 subdomain. Both F0 and F1 domains have similar ubiquitin-like folds. This family corresponds to the F1 domain. In addition, a distinctive feature of kindlins is the insertion of a pleckstrin homology (PH) subdomain into the F2 subdomain.	90
340617	cd17097	FERM_F1_ERM_like	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in the ERM family proteins. The ezrin-radixin-moesin (ERM) family includes a group of closely related cytoskeletal proteins that play an essential role in microvilli formation, T-cell activation, and tumor metastasis through providing a regulated linkage between F-actin and membrane-associated proteins. These proteins may also function in signaling cascades that regulate the assembly of actin stress fibers. The ERM proteins consist of an N-terminal FERM domain, a coiled-coil (CC) domain and a C-terminal tail segment (C-tail) containing a well-defined actin-binding motif. They exist in two states, a dormant state in which the FERM domain binds to its own C-terminal tail and thereby precludes binding of some partner proteins, and an activated state, in which the FERM domain binds to one of many membrane binding proteins and the C-terminal tail binds to F-actin. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). Merlin, which is highly related to the members of the ezrin, radixin, and moesin (ERM) protein family that are directly attached to and functionally linked with NHE1, is included in this family.	83
340618	cd17098	FERM_F1_FARP1_like	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in FERM, RhoGEF and pleckstrin domain-containing protein 1 (FARP1) and similar proteins. This family includes the F1 sub-domain of FERM, RhoGEF and pleckstrin domain-containing proteins FARP1, FARP2, and FERM domain-containing protein 7 (FRMD7). FARP1, also termed chondrocyte-derived ezrin-like protein (CDEP), or pleckstrin homology (PH) domain-containing family C member 2 (PLEKHC2), is a neuronal activator of the RhoA GTPase. It promotes outgrowth of developing motor neuron dendrites. It also regulates excitatory synapse formation and morphology, as well as activates the GTPase Rac1 to promote F-actin assembly. FARP2, also termed FERM domain including RhoGEF (FIR), or Pleckstrin homology (PH) domain-containing family C member 3, is a Dbl-family guanine nucleotide exchange factor (GEF) that activates Rac1 or Cdc42 in response to upstream signals, suggesting roles in regulating processes such as neuronal axon guidance and bone homeostasis. It is also a key molecule involved in the response of neuronal growth cones to class-3 semaphorins. FRMD7 plays an important role in neuronal development and is involved in the regulation of F-actin, neurofilament, and microtubule dynamics. All family members contain a FERM domain that is made up of three sub-domains, F1, F2, and F3. This family corresponds to F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N).	85
340619	cd17099	FERM_F1_PTPN14_like	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in tyrosine-protein phosphatase non-receptors PTPN14, PTPN21, and similar proteins. This family includes tyrosine-protein phosphatase non-receptors PTPN14 and PTPN21, both of which are protein-tyrosine phosphatase (PTP). They belong to the FERM family of PTPs characterized by a conserved N-terminal FERM domain and a C-terminal PTP catalytic domain with an intervening sequence containing an acidic region and a putative SH3 domain-binding sequence.  The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). PTPN14 plays a role in the nucleus during cell proliferation. PTPN21 interacts with a Tec tyrosine kinase family member, the epithelial and endothelial tyrosine kinase (Etk, also known as Bmx), modulates Stat3 activation, and plays a role in the regulation of cell growth and differentiation.	85
340620	cd17100	FERM_F1_PTPN3_like	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in tyrosine-protein phosphatase non-receptor type 3 (PTPN3) and similar proteins. This family includes two tyrosine-protein phosphatase non-receptors, PTPN3 and PTPN4, both of which belong to the non-transmembrane FERM-containing protein-tyrosine phosphatase (PTP) subfamily characterized by a conserved N-terminal FERM domain, a PDZ domain, and a C-terminal PTP catalytic domain. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N).	86
340621	cd17101	FERM_F1_PTPN13_like	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in tyrosine-protein phosphatase non-receptor type 13 (PTPN13) and similar proteins. This family includes tyrosine-protein phosphatase non-receptor type 13 (PTPN13), FERM and PDZ domain-containing protein 2 (FRMPD2), and FERM domain-containing proteins FRMD1 and FRMD6. All family members contain a FERM domain that is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N).	97
340622	cd17102	FERM_F1_FRMD3	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in FERM domain-containing protein 3 (FRMD3) and similar proteins. FRMD3, also termed band 4.1-like protein 4O, or ovary type protein 4.1 (4.1O), belongs to the 4.1 protein superfamily, which share the highly conserved membrane-association FERM domain. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). FRMD3 is involved in maintaining cell shape and integrity. It also functions as a tumour suppressor in non-small cell lung carcinoma (NSCLC). Some single nucleotide polymorphisms (SNPs) located in FRMD3 have been associated with diabetic kidney disease (DKD) in different ethnicities.	82
340623	cd17103	FERM_F1_FRMD4	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in FERM domain-containing proteins FRMD4A, FRMD4B, and similar proteins. This family includes FERM domain-containing proteins FRMD4A and FRMD4B, both of which contain a FERM domain that is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). FRMD4A is a cytohesin adaptor involved in cell structure, transport and signaling. It promotes the growth of cancer cells in tongue, head and neck squamous cell carcinomas. FRMD4B, also termed GRP1-binding protein GRSP1, interacts with the coil-coil domain of ARF exchange factor GRP1 to form the Grsp1-Grp1 complex that co-localizes with cortical actin rich regions in response to stimulation of CHO-T cells with insulin or epidermal growth factor (EGF).	91
340624	cd17104	FERM_F1_MYLIP	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in E3 ubiquitin-protein ligase MYLIP and similar proteins. MYLIP, also termed inducible degrader of the LDL-receptor (Idol), or myosin regulatory light chain interacting protein (MIR), is an E3 ubiquitin-protein ligase that mediates ubiquitination and subsequent proteasomal degradation of myosin regulatory light chain (MRLC), LDLR, VLDLR and LRP8. Its activity depends on E2 ubiquitin-conjugating enzymes of the UBE2D family, including UBE2D1, UBE2D2, UBE2D3, and UBE2D4. MYLIP stimulates clathrin-independent endocytosis and acts as a sterol-dependent inhibitor of cellular cholesterol uptake by binding directly to the cytoplasmic tail of the LDLR and promoting its ubiquitination via the UBE2D1/E1 complex. The ubiquitinated LDLR then enters the multivesicular body (MVB) protein-sorting pathway and is shuttled to the lysosome for degradation. Moreover, MYLIP has been identified as a novel ERM-like protein that affects cytoskeleton interactions regulating cell motility, such as neurite outgrowth. The ERM proteins include ezrin, radixin, and moesin, which are cytoskeletal effector proteins linking actin to membrane-bound proteins at the cell surface. MYLIP contains a FERM domain and a C-terminal C3HC4-type RING-HC finger. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N).	81
340625	cd17105	FERM_F1_EPB41	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in erythrocyte membrane protein band 4.1 (EPB41) and similar proteins. EPB41, also termed protein 4.1 (P4.1), or 4.1R, or Band 4.1, or EPB4.1, belongs to the skeletal protein 4.1 family that is involved in cellular processes such as cell adhesion, migration and signaling. EPB41 is a widely expressed cytoskeletal phosphoprotein that stabilizes the spectrin-actin cytoskeleton and anchors the cytoskeleton to the cell membrane. EPB41 contains a FERM domain, a spectrin and actin binding (SAB) domain, and a C-terminal domain. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N).	83
340626	cd17106	FERM_F1_EPB41L	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in erythrocyte membrane protein band 4.1-like proteins. The family includes erythrocyte membrane protein band 4.1-like proteins EPB41L1/4.1N, EPB41L2/4.1G, and EPB41L3/4.1B. They belong to the skeletal protein 4.1 family that is involved in cellular processes such as cell adhesion, migration and signaling. EPB41L1 is a cytoskeleton-associated protein that may serve as a tumor suppressor in solid tumors. EPB41L2 is involved in cellular processes such as cell adhesion, migration and signaling. EPB41L3 also acts as a tumor suppressor implicated in a variety of meningiomas and carcinomas. Members in this family contain a FERM domain, a spectrin and actin binding (SAB) domain, and a C-terminal domain. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N).	84
340627	cd17107	FERM_F1_EPB41L4A	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in erythrocyte band 4.1-like protein 4A (EPB41L4A) and similar proteins. EPB41L4A, also termed protein NBL4, is a member of the band 4.1/Nbl4 (novel band 4.1-like protein 4) group of the FERM protein superfamily. It contains a FERM domain that is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). EPB41L4A is an important component of the beta-catenin/Tcf pathway. It may be related to determination of cell polarity or proliferation.	91
340628	cd17108	FERM_F1_EPB41L5_like	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in erythrocyte membrane protein band 4.1-like 5 (EPB41L5) and similar proteins. This family includes FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in erythrocyte membrane protein band 4.1-like proteins, EPB41L5 and EPB41L4B. EPB41L5 is a mesenchymal-specific protein that is an integral component of the ARF6-based pathway. EPB41L4B is a positive regulator of keratinocyte adhesion and motility, suggesting a role in wound healing. It also promotes cancer metastasis in melanoma, prostate cancer and breast cancer. Both EPB41L5 and EPB41L4B contain a FERM domain that is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N).	81
340629	cd17109	FERM_F1_SNX17_like	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in PX-FERM family sorting nexin proteins. This family includes three endosome-associated PX (Phox homology) and FERM (Band 4.1, ezrin, radixin, moesin) domain-containing proteins called sorting nexin (SNX) 17, SNX27, and SNX31, which are modular peripheral membrane proteins acting as central scaffolds mediating protein-lipid interactions, cargo binding, and regulatory protein recruitment. They are key regulators of endosomal recycling and bind conserved NPX(Y/F) peptide sorting motifs in transmembrane cargos via an atypical FERM domain. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N).	93
340630	cd17110	FERM_F1_Myo10_like	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in unconventional myosin-X and similar proteins. Myosin-X, also termed myosin-10 (Myo10), is an unconventional member of the myosin superfamily. It is an actin-based motor protein that plays a critical role in diverse cellular motile events, such as filopodia formation/extension, phagocytosis, cell migration, and mitotic spindle maintenance, as well as a number of disease states including cancer metastasis and pathogen infection. Myosin-X functions as an important regulator of cytoskeleton that modulates cell motilities in many different cellular contexts. It regulates neuronal radial migration through interacting with N-cadherin. Like other unconventional myosins, Myosin-X is composed of a conserved motor head, a neck region and a variable tail. The neck region consists of three IQ motifs (light chain-binding sites), and a predicted stalk of coiled coil. The tail contains three PEST regions, three PH domains, a MyTH4 domain, and a FERM domain. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). Amoebozoan Dictyostelium discoideum myosin VII (DdMyo7) and uncharacterized pleckstrin homology domain-containing family H member 3 (PLEKHH3) are also included in this family. Like metazoan Myo10, DdMyo7 is essential for the extension of filopodia, plasma membrane protrusions filled with parallel bundles of F-actin.	97
340631	cd17111	RA1_DAGK-theta	Ras-associating (RA) domain 1 found in diacylglycerol kinase theta (DAGK-theta) and similar proteins. DAGK phosphorylates the second messenger diacylglycerol to phosphatidic acid as part of a protein kinase C pathway. DAGK-theta is characterized as a type V DAGK that has three cysteine-rich domains (all other isoforms have two), a proline/glycine-rich domain at its N-terminus, and a proposed Ras-associating (RA) domain. RA domain-containing proteins function by interacting with Ras proteins directly or indirectly and are involved in several different functions ranging from tumor suppression to being oncoproteins. Ras proteins are small GTPases that are involved in cellular signal transduction. The RA domain has a beta-grasp ubiquitin-like (Ubl) fold with low sequence similarity to ubiquitin (Ub). Ub is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair. Ten mammalian isoforms of DAGK have been identified to date, which are organized into five categories based on the domain architecture. DAGK-theta also contains a pleckstrin homology (PH) domain. The subcellular localization and the activity of DAGK-theta are regulated in a complex (stimulation- and cell type-dependent) manner. This family corresponds to the first RA domain of DAGK-theta.	91
340632	cd17112	RA_MRL_like	Ras-associating (RA) domain found in Mig10/RIAM/Lpd (MRL) family and growth factor receptor-bound (Grb) protein 7/10/14. MRL proteins share a common structural architecture, including a central structural unit consisting of a Ras-associating (RA) domain and a pleckstrin homology (PH) domain, an upstream coiled-coil region, and a number of polyproline motifs. RA domain-containing proteins function by interacting with Ras proteins directly or indirectly and are involved in several different functions ranging from tumor suppression to being oncoproteins. Ras proteins are small GTPases that are involved in cellular signal transduction. The RA domain has the beta-grasp ubiquitin-like (Ubl) fold with low sequence similarity to ubiquitin (Ub). Ub is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair. RA and PH form a tandem domain pair (RA-PH), and serve tightly coordinated functions in both Ras GTPase signaling via the RA domain and membrane translocalization via the PH domain. MRL proteins have distinct functions in cell migration and adhesion, signaling, and in cell growth. Grb7/10/14 are multi-domain cytoplasmic adaptor proteins that are recruited to activated receptor tyrosine kinases. Grb7 and its related family members Grb10 and Grb14 share a conserved domain architecture that includes an amino-terminal proline-rich region, a central segment termed the GM region (for Grb and Mig) which includes the RA, PIR, and pleckstrin homology (PH) domains, and a carboxyl-terminal SH2 domain. The tandem RA and PH domains of Grb7/10/14 are also found in a second adaptor family, Rap1-interacting adaptor molecule (RIAM) and lamellipodin, which is involved in actin-cytoskeleton rearrangement. Grb7/10/14 family proteins are phosphorylated on serine/threonine as well as tyrosine residues and are mainly localized to the cytoplasm.	81
340633	cd17113	RA_ARAPs	Ras-associating (RA) domain found in Arf-GAP with Rho-GAP domain, ANK repeat and PH domain-containing proteins ARAP1, ARAP2, ARAP3, and similar proteins. ARAPs are phosphatidylinositol 3,4,5-trisphosphate (PtdIns(3,4,5)P(3))-dependent Arf Rap-activated guanosine triphosphatase (GTPase)-activating proteins (GAPs). They contain multiple functional domains, including ArfGAP and RhoGAP domains, as well as a sterile alpha motif (Sam) domain, five PH domains, and a RA domain. The RA domain has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin (Ub); Ub is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair in eukaryotes.	95
340634	cd17114	RA_PLC-epsilon	Ras-associating (RA) domain found in Phosphatidylinositide-specific phospholipase C (PLC)-epsilon. PLC is a signaling enzyme that hydrolyzes membrane phospholipids to generate inositol triphosphate. PLC-epsilon represents a novel fourth class of PLC that has a PLC catalytic core domain, a CDC25 guanine nucleotide exchange factor domain and two RA (Ras-association) domains. The RA domain has the beta-grasp ubiquitin-like (Ubl) fold with low sequence similarity to ubiquitin (Ub). Although PLC RA1 and RA2 have homologous ubiquitin-like folds, only RA2 can bind Ras and activate it. RA domain-containing proteins function by interacting with Ras proteins directly or indirectly and are involved in several different functions ranging from tumor suppression to being oncoproteins. Ras proteins are small GTPases that are involved in cellular signal transduction.	97
340635	cd17115	RA_RHG20	Ras-associating (RA) domain found in Rho GTPase-activating protein 20 (RHG20) and similar proteins. RHG20, also termed ARHGAP20, is one of the GTPase-activating proteins for Rho family proteins (RhoGAPs). It contains a PH-like domain, an RA domain, a RhoGap domain, and two Annexin-like (ANXL) repeats. The RA domain has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin, a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair.	117
340636	cd17116	RA_Radil_like	Ras-associating (RA) domain found in ras-associating and dilute domain-containing protein (Radil) and similar proteins. Radil acts as an important small GTPase Rap1 effector required for cell spreading and migration. It regulates neutrophil adhesion and motility through linking Rap1 to beta2-integrin activation. This family also includes Ras-interacting protein 1 (Rain, also termed Rasip1), which is a novel Ras-interacting protein with a unique subcellular localization. It interacts with Ras in a GTP-dependent manner, and may serve as an effector for endomembrane-associated Ras. Radil contains RA, DIL, and PDZ domains. In contrast, Rain contains a myosin5-like cargo binding domain, a RA domain and a PDZ domain. The RA domain has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin (Ub). Ub is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair.	121
340637	cd17117	RA_ANKFN1_like	Ras-associating (RA) domain found in Ankyrin repeat and fibronectin type-III domain-containing protein 1 (ANKFN1) and similar proteins. ANKFN1 is a multi-domain protein, with unknown function, that contains two ankyrin repeats and one fibronectin type-III domain. Except for the mammalian homologs, most metazoan ANKFN1 proteins harbor a RA domain at the C-terminus. The RA domain has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin.	97
340638	cd17118	Ubl_HERP1	ubiquitin-like (Ubl) domain found in homocysteine-inducible endoplasmic reticulum stress protein HERP1 and similar proteins. HERP1, also termed homocysteine-responsive endoplasmic reticulum-resident ubiquitin-like domain member 1 protein (HERPUD1), or methyl methanesulfonate (MMF)-inducible fragment protein 1 (MIF1), is an endoplasmic reticulum (ER) integral membrane protein containing an N-terminal ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold. HERP1 is a component of the ER quality control (ERQC) system, also called ER-associated degradation (ERAD), which is involved in ubiquitin-dependent degradation of misfolded ER proteins. It promotes the degradation of ER-resident Ca2+ channels. It is also involved in ubiquitin ligase HRD1-dependent protein degradation at the ER. Moreover, HERP1 plays a role in regulating the cell cycle, apoptosis and steroid hormone biosynthesis in mouse granulosa cells.	78
340639	cd17119	Ubl_HERP2	ubiquitin-like (Ubl) domain found in homocysteine-inducible endoplasmic reticulum stress protein HERP2 and similar proteins. HERP2, also termed homocysteine-responsive endoplasmic reticulum-resident ubiquitin-like domain member 2 protein (HERPUD2), is an endoplasmic reticulum (ER) integral membrane protein containing an N-terminal ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold. It is homologous to HERP1, and could be involved in the unfolded protein response (UPR) pathway. It regulates the ubiquitin ligase HRD1-dependent protein degradation at the ER.	77
340640	cd17120	Ubl_UBTD1	ubiquitin-like (Ubl) domain found in ubiquitin domain-containing protein 1 (UBTD1). UBTD1 is the mammalian homolog of the mitochondrial Dc-UbP/UBTD2 protein, both of which contain an N-terminal ubiquitin binding domain (UBD) and a C-terminal ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, a common structure involved in protein-protein interactions. UBTD1 stably interacts with the UBE2D family of E2 ubiquitin conjugating enzymes. As a tumor suppressor, UBTD1 plays a pivotal role in regulating cellular senescence that mediates p53 function. It is also involved in MDM2 ubiquitination and degradation.	69
340641	cd17121	Ubl_UBTD2	ubiquitin-like (Ubl) domain found in ubiquitin domain-containing protein 2 (UBTD2). UBTD2, also termed dendritic cell-derived ubiquitin-like protein (DC-UbP), or ubiquitin-like protein SB72, is a potential ubiquitin shuttle protein first identified in dendritic cells and implicated in apoptosis. It binds proteins involved in the ubiquitination pathway and may play an important role in regulating protein ubiquitination and delivery of ubiquitinated substrates in eukaryotic cells. UBTD2 is expressed in tumor cells but not in normal human adult tissue, suggesting a role in tumorigenesis. It can potentially regulate the stability of proteins involved in various physiological processes relevant to many disease phenotypes, such as ageing and cancer. UBTD2 reconciles protein ubiquitination and deubiquitination via linking UbE1 and USP5 enzymes.	75
340642	cd17122	Ubl_UHRF1	ubiquitin-like (Ubl) domain found in ubiquitin-like PHD and RING finger domain-containing protein 1 (UHRF1). UHRF1, also termed inverted CCAAT box-binding protein of 90 kDa, or nuclear protein 95, or nuclear zinc finger protein Np95 (Np95), or RING finger protein 106, or transcription factor ICBP90, or E3 ubiquitin-protein ligase UHRF1, is a unique chromatin effector protein that integrates the recognition of both histone PTMs and DNA methylation. It is essential for cell proliferation and plays a critical role in the development and progression of many human carcinomas, such as laryngeal squamous cell carcinoma, gastric cancer, esophageal squamous cell carcinoma, colorectal cancer, prostate cancer, and breast cancer. UHRF1 can act as a transcriptional repressor through its binding to histone H3 when it is unmodified at Arg2. Its overexpression in human lung fibroblasts results in downregulation of expression of the tumor suppressor pRB. It also plays a role in transcriptional repression of the cell cycle regulator p21. Moreover, UHRF1-dependent repression of factors can facilitate the G1-S transition. It interacts with Tat-interacting protein of 60 kDa (TIP60) and induces degradation-independent ubiquitination of TIP60. It is also an N-methylpurine DNA glycosylase (MPG)-interacting protein that binds MPG in a p53 status-independent manner in the DNA base excision repair (BER) pathway. In addition, UHRF1 functions as an epigenetic regulator that is important for multiple aspects of epigenetic regulation, including maintenance of DNA methylation patterns and recognition of various histone modifications. UHRF1 contains an N-terminal ubiquitin-like domain (Ubl), a tandem Tudor domain (TTD), a plant homeodomain (PHD) finger, a SET and RING finger associated (SRA) domain, and a C-terminal RING-finger domain. It specifically binds to hemimethylated DNA, double-stranded CpG dinucleotides, and recruits the maintenance methyltransferase DNMT1 to its hemimethylated DNA substrate through its SRA domain. UHRF1-dependent H3K23 ubiquitination has an essential role in maintenance DNA methylation and replication. The tandem Tudor domain directs UHRF1 binding to the heterochromatin mark histone H3K9me3 and the PHD finger targets UHRF1 to unmodified histone H3 in euchromatic regions. The RING-finger domain exhibits both autocatalytic E3 ubiquitin (Ub) ligase activity and activity against histone H3 and DNMT1.	74
340643	cd17123	Ubl_UHRF2	ubiquitin-like (Ubl) domain found in ubiquitin-like PHD and RING finger domain-containing protein 2 (UHRF2). UHRF2, also termed Np95/ICBP90-like RING finger protein (NIRF), or Np95-like RING finger protein, or nuclear protein 97, or nuclear zinc finger protein Np97, or RING finger protein 107, or E3 ubiquitin-protein ligase UHRF2, was originally identified as a ubiquitin ligase acting as a small ubiquitin-like modifier (SUMO) E3 ligase that enhances zinc finger protein 131 (ZNF131) SUMOylation but does not enhance ZNF131 ubiquitination. It also ubiquitinates PCNP, a PEST-containing nuclear protein. Moreover, UHRF2 functions as a nuclear protein involved in cell-cycle regulation and has been implicated in tumorigenesis. It interacts with cyclins, CDKs, p53, pRB, PCNA, HDAC1, DNMTs, G9a, methylated histone H3 lysine 9, and methylated DNA. It interacts with the cyclin E-CDK2 complex, ubiquitinates cyclins D1 and E1, induces G1 arrest, and is involved in the G1/S transition regulation. Furthermore, UHRF2 is a direct transcriptional target of the transcription factor E2F-1 in the induction of apoptosis. It recruits HDAC1 and binds to methyl-CpG. UHRF2 also participates in the maturation of Hepatitis B virus (HBV) through interacting with HBV core protein and promoting its degradation. UHRF2 contains an N-terminal ubiquitin-like domain (Ubl), a tandem Tudor domain (TTD), a plant homeodomain (PHD) finger, a set- and ring-associated (SRA) domain, and a C-terminal RING finger.	74
340644	cd17124	Ubl_TECR	ubiquitin-like (Ubl) domain found in trans-2,3-enoyl-CoA reductase (TECR). TECR, also termed very-long-chain enoyl-CoA reductase, or synaptic glycoprotein SC2, or TER, or GPSN2, is a synaptic glycoprotein that catalyzes the fourth reaction in the synthesis of very long-chain fatty acids (VLCFA) which is the reduction step of the microsomal fatty acyl-elongation process. Diseases involving perturbations to normal synthesis and degradation of VLCFA (e.g. adrenoleukodystrophy and Zellweger syndrome) have significant neurological consequences. The mammalian TECR P182L mutation causes nonsyndromic mental retardation. Deletion of the yeast TECR homolog (TSC13p) is lethal. TECR contains an N-terminal ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, a common structure involved in protein-protein interactions, as well as a C-terminal catalytic domain.	79
340645	cd17125	Ubl_TECRL	ubiquitin-like (Ubl) domain found in trans-2,3-enoyl-CoA reductase-like (TECRL). TECRL, also termed steroid 5-alpha-reductase 2-like 2 protein (SRD5A2L2), is associated with life-threatening inherited arrhythmias displaying features of both long QT syndrome (LQTS) and catecholaminergic polymorphic ventricular tachycardia (CPVT). TECRL contains an N-terminal ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold, a common structure involved in protein-protein interactions, as well as a C-terminal catalytic domain.	78
340646	cd17126	Ubl_HR23A	ubiquitin-like (Ubl) domain found in UV excision repair protein RAD23 homolog A (HR23A). HR23A, also termed RAD23A, is a DNA repair protein that binds to 19S subunit of the 26S proteasome and shuttles ubiquitinated proteins to the proteasome for degradation, which is required for efficient nucleotide excision repair (NER), a primary mechanism for removing UV-induced DNA lesions. HR23A also plays a critical role in the interaction of HIV-1 viral protein R (Vpr) with the proteasome, especially facilitating Vpr to promote protein poly-ubiquitination. HR23A contains an N-terminal ubiquitin-like (Ubl) domain that binds proteasomes and two C-terminal ubiquitin-associated (UBA) domains that bind ubiquitin or multi-ubiquitinated substrates. In addition, it has a XPC protein-binding domain that might be necessary for its efficient NER function.	76
340647	cd17127	Ubl_TBK1	ubiquitin-like (Ubl) domain found in TRAF family member-associated NF-kappaB activator (TANK)-binding kinase 1 (TBK1). TBK1, also termed NF-kappa-B-activating kinase, or T2K, or TANK-binding kinase 1, is an interferon regulatory factor-activating kinase that is a non-canonical member of IKK family. It plays a role in regulating innate immunity, inflammation and oncogenic signaling. TBK1 is involved in the regulation of type I interferons and of nuclear factor-kappaB (NF-kappaB) signal transduction. It regulates factors such as IRF3 and IRF7, promoting antiviral activity in the interferon signaling pathways. It modulates inflammatory hyperalgesia by regulating MAP kinases and NF-kappaB dependent genes. Moreover, TBK1 acts as a central player in the intracellular nucleic acid-sensing pathways involved in antiviral signaling. Dysregulation of TBK1 activity is often associated with autoimmune diseases and cancer. TBK1 contains an N-terminal protein kinase domain followed by a ubiquitin-like (Ubl) domain, and a C-terminal elongated helical domain. The Ubl domain acts as a protein-protein interaction domain, and has been implicated in regulating kinase activity, which modulates interactions in the interferon pathway.	78
340648	cd17128	Ubl_IKKE	ubiquitin-like (Ubl) domain found in inhibitor of nuclear factor kappa-B kinase subunit epsilon (IKK-E). IKK-E (EC 2.7.11.10), also termed I-kappa-B kinase epsilon (IKKepsilon), or IKK-epsilon, or IkBKE, or inducible I kappa-B kinase (IKK-i), is an interferon regulatory factor-activating kinase that is a non-canonical member of IKK family. It is involved in the cellular innate immunity by inducing type I interferons. It is induced by the activation of nuclear factor-kappaB (NF-kappaB). IKK-E has also been implicated in antiviral immune response in higher vertebrates. It acts as a crucial pro-survival factor in human T cell leukemia virus type 1 (HTLV-1)-transformed T lymphocytes. Moreover, IKK-E plays an essential role in tumor initiation and progression. It inhibits protein kinase C (PKC) to promote Fascin-dependent actin bundling. IKK-E contains an N-terminal protein kinase domain followed by a ubiquitin-like (Ubl) domain, and a C-terminal elongated helical domain. The Ubl domain acts as a protein-protein interaction domain, and has been implicated in regulating kinase activity, which modulates interactions in the interferon pathway.	78
340649	cd17129	Ubl1_FAF1	ubiquitin-like (Ubl) domain 1 found in FAS-associated factor 1 (FAF1) and similar proteins. FAF1, also termed UBX domain-containing protein 12 (UBXD12), or UBX domain-containing protein 3A (UBXN3A), belongs to the UBXD family of proteins that contains the ubiquitin (Ub) regulatory domain X (UBX) with a beta-grasp ubiquitin-like (Ubl) fold, but without the C-terminal double glycine motif. The UBX domain is typically located at the carboxyl terminus of proteins, and participates broadly in the regulation of protein degradation. In addition, FAF1 contains two tandem Ubl domains, which show high structural similarity with the UBX domain. FAF1 functions as a cofactor of p97 (also known as VCP or Cdc48), which is a homohexameric AAA ATPase (ATPase associated with a variety of activities) involved in a variety of functions ranging from cell-cycle regulation to membrane fusion and protein degradation. The FAF1-p97 complex inhibits the proteasomal protein degradation in which p97 acts as a co-chaperone. Moreover, FAF1 is an apoptotic signaling molecule that acts downstream in the Fas signal transduction pathway. It interacts with the cytoplasmic domain of Fas, but not with a Fas mutant that is deficient in signal transduction. FAF1 is widely expressed in adult and embryonic tissues, and in tumor cell lines, and is localized not only in the cytoplasm where it interacts with Fas, but also in the nucleus. FAF1 contains phosphorylation sites for protein kinase CK2 within the nuclear targeting domain. Phosphorylation influences nuclear localization of FAF1 but does not affect its potentiation of Fas-induced apoptosis. Other functions have also been attributed to FAF1. It inhibits nuclear factor-kappaB (NF-kappaB) by interfering with the nuclear translocation of the p65 subunit. Although the precise role of FAF1 in the ubiquitination pathway remains unclear, FAF1 interacts with valosin-containing protein (VCP), which is involved in the ubiquitin-proteasome pathway. The family corresponds to the first Ubl domain.	73
340650	cd17130	Ubl2_FAF1	ubiquitin-like (Ubl) domain 2 found in FAS-associated factor 1 (FAF1) and similar proteins. FAF1, also termed UBX domain-containing protein 12 (UBXD12), or UBX domain-containing protein 3A (UBXN3A), belongs to the UBXD family of proteins that contains the ubiquitin regulatory domain X (UBX) with a beta-grasp ubiquitin-like (Ubl) fold, but without the C-terminal double glycine motif. The UBX domain is typically located at the carboxyl terminus of proteins, and participates broadly in the regulation of protein degradation. In addition, FAF1 contains two tandem Ubl domains, which show high structural similarity with the UBX domain. FAF1 functions as a cofactor of p97 (also known as VCP or Cdc48), which is a homohexameric AAA ATPase (ATPase associated with a variety of activities) involved in a variety of functions ranging from cell-cycle regulation to membrane fusion and protein degradation. The FAF1-p97 complex inhibits the proteasomal protein degradation in which p97 acts as a co-chaperone. Moreover, FAF1 is an apoptotic signaling molecule that acts downstream in the Fas signal transduction pathway. It interacts with the cytoplasmic domain of Fas, but not with a Fas mutant that is deficient in signal transduction. FAF1 is widely expressed in adult and embryonic tissues, and in tumor cell lines, and is localized not only in the cytoplasm where it interacts with Fas, but also in the nucleus. FAF1 contains phosphorylation sites for protein kinase CK2 within the nuclear targeting domain. Phosphorylation influences nuclear localization of FAF1 but does not affect its potentiation of Fas-induced apoptosis. Other functions have also been attributed to FAF1. It inhibits nuclear factor-kappaB (NF-kappaB) by interfering with the nuclear translocation of the p65 subunit. Although the precise role of FAF1 in the ubiquitination pathway remains unclear, FAF1 interacts with valosin-containing protein (VCP), which is involved in the ubiquitin-proteasome pathway. The family corresponds to the second Ubl domain.	75
340651	cd17131	Ubl_TMUB1	ubiquitin-like (Ubl) domain found in transmembrane and ubiquitin-like domain-containing protein 1 (TMUB1). TMUB1, also termed dendritic cell-derived ubiquitin-like protein (DULP), or hepatocyte odd protein shuttling protein, or ubiquitin-like protein SB144, or HOPS, is highly expressed in the nervous system. It is involved in the termination of liver regeneration and plays a negative role in interleukin-6-induced hepatocyte proliferation. The overexpression of Tmub1 has been shown to play a role in the inhibition of cell proliferation. TMUB1 has also been implicated in the regulation of locomotor activity and wakefulness in mice, perhaps acting through its interaction with CAMLG. It also facilitates the recycling of AMPA receptors into synaptic membrane in cultured primary neurons. TMUB1 contains transmembrane domains and a ubiquitin-like (Ubl) domain with a beta-grasp Ubl fold.	71
340652	cd17132	Ubl_TMUB2	ubiquitin-like (Ubl) domain found in transmembrane and ubiquitin-like domain-containing protein 2 (TMUB2). TMUB2 is composed of an uncharacterized transmembrane domain and a ubiquitin-like (Ubl) domain-containing protein that shows high sequence similarity to TMUB1; the latter is highly expressed in the nervous system and is involved in the termination of liver regeneration.	71
340653	cd17133	RBD_ARAF	Ras-binding domain (RBD) found in serine/threonine-protein kinase ARAF. ARAF, also termed proto-oncogene ARAF, or proto-oncogene ARAF1, or proto-oncogene PKS2, belongs to the RAF protein family. The RAF family includes three RAF kinases ARAF, BRAF, and RAF1/CRAF, encoded by proto-oncogenes, which activate the mitogen-activated protein kinase/extracellular-signal-regulated kinase (MAPK/ERK) cascade downstream of RAS. They share a common structure consisting of an N-terminal regulatory domain and a C-terminal kinase domain. There are three conserved regions (CR1-3) in the regulatory domain, CR1 contains a Ras-binding domain (RBD) and a cysteine-rich domain (CRD), CR2 is a serine/threonine-rich domain, and CR3 encodes the kinase domain required for RAF. The RBD of RAF has a beta-grasp ubiquitin-like fold, a common structure involved in protein-protein interactions. ARAF is predominantly found in urogenital tissue with a low basal kinase activity. It directly cross-talks with NODAL/SMAD2 signaling in a MAPK-independent manner. It also promotes MAPK pathway activation and cell migration in a cell type-dependent manner. Moreover, ARAF acts as a scaffold to stabilize BRAF-CRAF heterodimers. Mice deleted for ARAF are viable but die perinatally.	73
340654	cd17134	RBD_BRAF	Ras-binding domain (RBD) found in serine/threonine-protein kinase BRAF. BRAF, also termed proto-oncogene B-Raf, or p94, or v-Raf murine sarcoma viral oncogene homolog B1, belongs to the RAF family. The RAF family includes three RAF kinases ARAF, BRAF, and RAF1/CRAF, encoded by proto-oncogenes, which activate the mitogen-activated protein kinase/extracellular-signal-regulated kinase (MAPK/ERK) cascade downstream of RAS. They share a common structure consisting of an N-terminal regulatory domain and a C-terminal kinase domain. There are three conserved regions (CR1-3) in the regulatory domain, CR1 contains a Ras-binding domain (RBD) and a cysteine-rich domain (CRD), CR2 is a serine/threonine-rich domain, and CR3 encodes the kinase domain required for RAF. The RBD of RAF has a beta-grasp ubiquitin-like fold, a common structure involved in protein-protein interactions.  BRAF is the most effective RAF kinase in terms of induction of MEK/ERK activity. It is somatically mutated in a number of human cancers.	79
340655	cd17135	RBD_CRAF	Ras-binding domain (RBD) found in RAF proto-oncogene serine/threonine-protein kinase RAF1/CRAF. RAF1/CRAF, also termed proto-oncogene c-RAF (cRaf), or Raf-1, belongs to the RAF family. The RAF family includes three RAF kinases ARAF, BRAF, and RAF1/CRAF, encoded by proto-oncogenes, which activate the mitogen-activated protein kinase/extracellular-signal-regulated kinase (MAPK/ERK) cascade downstream of RAS. They share a common structure consisting of an N-terminal regulatory domain and a C-terminal kinase domain. There are three conserved regions (CR1-3) in the regulatory domain, CR1 contains a Ras-binding domain (RBD) and a cysteine-rich domain (CRD), CR2 is a serine/threonine-rich domain, and CR3 encodes the kinase domain required for RAF. The RBD of RAF has a beta-grasp ubiquitin-like fold, a common structure involved in protein-protein interactions. RAF1/CRAF is an important effector of Ras-mediated signaling and is a critical regulator of the MAPK/ERK pathway.	77
340656	cd17136	RBD1_RGS12	Ras-binding domain (RBD) 1 of regulator of G protein signaling 12 (RGS12). RGS12 (regulator of G protein signaling 12) is a multidomain RGS protein with numerous signaling regulatory elements. In addition to a central RGS domain which is responsible for GAP activity, the long RGS12 splice variant contains a PDZ (PSD-95/Discs-large/ZO-1 homology) domain capable of binding the interleukin-8 receptor B (CXCR2) or its own C-terminal, a phosphotyrosine-binding (PTB) domain that associates with tyrosine-phosphorylated N-type calcium channel, two tandem Ras-binding domains (RBDs) that may integrate signaling pathways involving both heterotrimeric and monomeric G-proteins, and a GoLoco (G-alpha-i/o-Loco) interaction motif which has guanine nucleotide dissociation inhibitor (GDI) activity toward G-alpha-i subunits. RBD is structurally similar to the beta-grasp fold of ubiquitin, a common structure involved in protein-protein interactions. RGS proteins belong to a large family of GTPase-accelerating proteins (GAPs) which act as key inhibitors of G-protein-mediated cell responses in eukaryotes.	70
340657	cd17137	RBD1_RGS14	Ras-binding domain (RBD) 1 of regulator of G protein signaling 14 (RGS14). RGS14 (regulator of G protein signaling 14) is a RGS protein with a multidomain structure that allows it to interact with binding partners from multiple signaling pathways. RGS proteins belong to a large family of GTPase-accelerating proteins (GAPs) which act as key inhibitors of G-protein-mediated cell responses in eukaryotes. RGS14 contains an N-terminal RGS domain, two tandem Ras-binding domains (RBDs) and a G protein regulatory (GPR, also referred to as a GoLoco) motif. RGS14 binds activated H-Ras-GTP through its first RBD and interacts with Rap2-GTP and RAF kinases by the second tandem RBD. RBD is structurally similar to the beta-grasp fold of ubiquitin, a common structure involved in protein-protein interactions. RGS14 modulates neuronal physiology and all of its binding partners have roles in synaptic plasticity.	71
340658	cd17138	RBD2_RGS12	Ras-binding domain (RBD) 2 of regulator of G protein signaling 12 (RGS12). RGS12 (regulator of G-protein signaling 12) is a multidomain RGS protein with numerous signaling regulatory elements. In addition to a central RGS domain which is responsible for GAP activity, the long RGS12 splice variant contains a PDZ (PSD-95/Discs-large/ZO-1 homology) domain capable of binding the interleukin-8 receptor B (CXCR2) or its own C-terminal, a phosphotyrosine-binding (PTB) domain that associates with tyrosine-phosphorylated N-type calcium channel, two tandem Ras-binding domains (RBDs) that may integrate signaling pathways involving both heterotrimeric and monomeric G-proteins, and a GoLoco (G-alpha-i/o-Loco) interaction motif which has guanine nucleotide dissociation inhibitor (GDI) activity toward G-alpha-i subunits. RBD is structurally similar to the beta-grasp fold of ubiquitin, a common structure involved in protein-protein interactions. RGS proteins belong to a large family of GTPase-accelerating proteins (GAPs) which act as key inhibitors of G-protein-mediated cell responses in eukaryotes.	73
340659	cd17139	RBD2_RGS14	Ras-binding domain (RBD) 2 of regulator of G protein signaling 14 (RGS14). RGS14 (regulator of G protein signaling 14) is a RGS protein with a multidomain structure that allows it to interact with binding partners from multiple signaling pathways. RGS proteins belong to a large family of GTPase-accelerating proteins (GAPs) which act as key inhibitors of G-protein-mediated cell responses in eukaryotes. RGS14 contains an N-terminal RGS domain, two tandem Ras-binding domains (RBDs) and a G protein regulatory (GPR, also referred to as a GoLoco) motif. RGS14 binds activated H-Ras-GTP through its first RBD and interacts with RAP2-GTP and RAF kinases by the second tandem RBD. RBD is structurally similar to the beta-grasp fold of ubiquitin, a common structure involved in protein-protein interactions. RGS14 modulates neuronal physiology and all of its binding partners have roles in synaptic plasticity.	73
340660	cd17140	DCX1_DCLK1	Doublecortin-like domain 1 found in doublecortin-like kinase 1 (DCLK1). DCLK1 is a member of the doublecortin (DCX) protein superfamily that functions as a microtubule-associated protein (MAP), and contains two conserved tubulin binding domains. The DCX domain has a stable ubiquitin-like tertiary fold. Ubiquitin (Ub) is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair. In addition to microtubule-binding domains, DCLK encodes a serine/threonine kinase domain that is similar to Ca/calmodulin-dependent (Cam) protein kinases. DCLK1 appears to regulate cyclic AMP signaling and is involved in neuronal migration, retrograde transport, neuronal apoptosis and neurogenesis. Unlike DCX, this DCLK has varying levels of expression throughout embryonic and adult life.	89
340661	cd17141	DCX1_DCLK2	Doublecortin-like domain 1 found in doublecortin-like kinase 2 (DCLK2). DCLK2 is a member of the doublecortin (DCX) protein superfamily that functions as a microtubule-associated protein (MAP), and contains two conserved tubulin binding domains, which typically occur in tandem. The DCX domain has a stable ubiquitin-like tertiary fold. Ubiquitin (Ub) is a protein modifier (Ubiquitination) in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair. In addition to microtubule binding domains, DCLK encodes a serine/threonine kinase-domain that is similar to Ca/calmodulin-dependent (Cam) protein kinases. Molecular actions of DCX members are less well characterized, but DCLK2 members have been shown to regulate cyclic AMP signaling. Unlike DCX, this DCLK has varying levels of expression throughout embryonic and adult life.	85
340662	cd17142	DCX2_DCX	Doublecortin-like domain 2 found in neuronal migration protein doublecortin (DCX). DCX, also termed doublin or lissencephalin-X (Lis-XDCX), is a microtubule-associated protein (MAP). It belongs to the doublecortin (DCX) family, has double tandem DCX repeats, and is expressed in migrating neurons. Structure studies show that the N-terminal DCX domain has a stable ubiquitin-like fold. DCX is not only a unique MAP in terms of its structure, but also interacts with multiple additional proteins. Mutations in the human DCX genes are associated with abnormal neuronal migration, epilepsy, and mental retardation.	84
340663	cd17143	DCX2_DCLK1	Doublecortin-like domain 2 found in doublecortin-like kinase 1 (DCLK1). DCLK1 is a member of the doublecortin (DCX) protein family that functions as a microtubule-associated protein (MAP), and contains two conserved tubulin binding domains. The DCX domain has a stable ubiquitin-like tertiary fold. Ubiquitin (Ub) is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair. In addition to microtubule binding domains, DCLK encodes a serine/threonine kinase-domain that is similar to Ca/calmodulin-dependent (Cam) protein kinases. DCLK1 appears to regulate cyclic AMP signaling and is involved in neuronal migration, retrograde transport, neuronal apoptosis and neurogenesis. Unlike DCX, the DCLK has varying levels of expression throughout embryonic and adult life.	84
340664	cd17144	DCX2_DCLK2	Doublecortin-like domain 2 found in doublecortin-like kinase 2 (DCLK2). DCLK2 is a member of the doublecortin (DCX) protein family that functions as a microtubule-associated protein (MAP), and contains two conserved tubulin binding domains, which typically occur in double tandem. The DCX domain has a stable ubiquitin-like tertiary fold. Ubiquitin (Ub) is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair. In addition to microtubule binding domains, DCLK encodes a serine/threonine kinase-domain that is similar to Ca/calmodulin-dependent (Cam) protein kinases. DCLK2 members regulate cyclic AMP signaling. Unlike DCX, the DCLK has varying levels of expression throughout embryonic and adult life.	84
340665	cd17145	DCX1_RP1	Doublecortin-like domain 1 found in retinitis pigmentosa 1 (RP1)-like protein. RP1, also termed oxygen-regulated protein 1, is a member of the doublecortin (DCX) family. Its DCX domains occur in double tandem repeats. RP1 is associated with retinitis pigmentosa, which is a type of inherited blindness. DCX is a microtubule-associated protein (MAP) with a stable ubiquitin-like tertiary fold. Microtubules are key components of cytoskeleton that are involved in cell movement, shape determination, division and transport. The RP1 protein is expressed in photoreceptors and is required for correct stacking of outer segment discs. It interacts with many of the same cytoskeleton related proteins that other members of the DCX family interact with.	79
340666	cd17146	DCX1_RP1L1	Doublecortin-like domain 1 found in retinitis pigmentosa 1-like 1 (RP1L1) protein. RP1L1 is a member of the doublecortin (DCX) family. Its DCX domains occur in double tandem repeats. DCX is a microtubule-associated protein (MAP) with a stable ubiquitin-like tertiary fold. Microtubules are key components of cytoskeleton that are involved in cell movement, shape determination, division and transport. The DCX-domain of RP1L1 localizes to the photoreceptor and is genetically associated with retinitis pigmentosa.	79
340667	cd17147	DCX2_RP1	Doublecortin-like domain 2 found in retinitis pigmentosa 1 (RP1)-like protein. RP1, also termed oxygen-regulated protein 1, is a member of the doublecortin (DCX) superfamily that contains double tandem repeats of the DCX domains. RP1 is associated with retinitis pigmentosa, which is a type of inherited blindness. DCX is a microtubule-associated protein (MAP) with a stable ubiquitin-like tertiary fold. Microtubules are key components of cytoskeleton that are involved in cell movement, shape determination, division and transport. The RP1 protein is expressed in photoreceptors and is required for correct stacking of outer segment discs. RP1 protein interacts with many of the same cytoskeleton related proteins that other members of the DCX family interact with.	76
340668	cd17148	DCX2_RP1L1	Doublecortin-like domain 2 found in retinitis pigmentosa 1-like 1 (RP1L1) protein. RP1L1 is a member of the doublecortin (DCX) family. Its protein domains occur in tandem repeats. DCX is a microtubule-associated protein (MAP) with a stable ubiquitin-like tertiary fold. Microtubules are key components of cytoskeleton that are involved in cell movement, shape determination, division and transport. The DCX-domain of RP1L1 localizes to the photoreceptor and is genetically associated with retinitis pigmentosa.	76
340669	cd17149	DCX1_DCDC2	Doublecortin-like domain 1 found in doublecortin domain-containing protein 2 (DCDC2). DCDC2 is a member of the doublecortin (DCX) family. It is a microtubule-associated protein (MAP) with a stable ubiquitin-like tertiary fold. Ubiquitin (Ub) is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair. Microtubules are key components of the cytoskeleton that are involved in cell movement, shape determination, division and transport. DCDC2 genetic variation in humans is associated with reading disability, attention deficit hyperactivity disorder (ADHD), and difficulties in mathematics. A genetic variant of DCDC2 associates with dyslexia, a common neurobehavioral disorder of reading. DCDC2 protein interacts with many of the same cytoskeleton related proteins that other members of the DCX family interact with.	80
340670	cd17150	DCX1_DCDC2B	Doublecortin-like domain 1 found in doublecortin domain-containing protein 2B (DCDC2B). DCDC2 is a member of the doublecortin (DCX) family. It is a microtubule-associated protein (MAP) with stable double tandem DCX repeats of ubiquitin-like tertiary fold. Microtubules are key components of the cytoskeleton that are involved in cell movement, shape determination, division and transport. DCDC2 genetic variation in humans is associated with reading disability, attention deficit hyperactivity disorder (ADHD), and difficulties in mathematics. A genetic variant of DCDC2 associates with dyslexia, a common neurobehavioral disorder of reading. DCDC2 protein interacts with many of the same cytoskeleton related proteins that other members of the DCX family interact with.	79
340671	cd17151	DCX1_DCDC2C	Doublecortin-like domain 1 found in doublecortin domain-containing protein 2C (DCDC2C). DCDC2 is a member of the doublecortin (DCX) family. It is a microtubule-associated protein (MAP) with stable double tandem DCX repeats of ubiquitin-like tertiary fold. Microtubules are key components of the cytoskeleton that are involved in cell movement, shape determination, division and transport. DCDC2 genetic variation in humans is associated with reading disability, attention deficit hyperactivity disorder (ADHD), and difficulties in mathematics. A genetic variant of DCDC2 associates with dyslexia, a common neurobehavioral disorder of reading. DCDC2 protein interacts with many of the same cytoskeleton related proteins that other members of the DCX family interact with.	79
340672	cd17152	DCX2_DCDC2	Doublecortin-like domain 2 found in doublecortin domain-containing protein 2 (DCDC2). DCDC2 is a member of the doublecortin (DCX) family. It is a microtubule-associated protein (MAP) with a stable ubiquitin-like tertiary fold. Ubiquitin (Ub) is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair. Microtubules are key components of cytoskeleton that are involved in cell movement, shape determination, division and transport. DCDC2 genetic variation in humans is associated with reading disability, attention deficit hyperactivity disorder (ADHD), and difficulties in mathematics. A genetic variant of DCDC2 associates with dyslexia, a common neurobehavioral disorder of reading. DCDC2 protein interacts with many of the same cytoskeleton related proteins that other members of the DCX family interact with.	80
340673	cd17153	DCX2_DCDC2B	Doublecortin-like domain 2 found in doublecortin domain-containing protein 2B (DCDC2B). DCDC2 is a member of the doublecortin (DCX) family. It is a microtubule-associated protein (MAP) with stable double tandem DCX repeats of a ubiquitin-like tertiary fold.  Microtubules are key components of cytoskeleton that are involved in cell movement, shape determination, division and transport. DCDC2 genetic variation in humans is associated with reading disability, attention deficit hyperactivity disorder (ADHD), and difficulties in mathematics. A genetic variant of DCDC2 associates with dyslexia, a common neurobehavioral disorder of reading. DCDC2 protein interacts with many of the same cytoskeleton related proteins that other members of the DCX family interact with.	80
340674	cd17154	DCX2_DCDC2C	Doublecortin-like domain 2 found in doublecortin domain-containing protein 2C (DCDC2C). DCDC2 is a member of the doublecortin (DCX) family. It is a microtubule-associated protein (MAP) with stable double tandem DCX repeats of a ubiquitin-like tertiary fold.  Microtubules are key components of cytoskeleton that are involved in cell movement, shape determination, division and transport. DCDC2 genetic variation in humans is associated with reading disability, attention deficit hyperactivity disorder (ADHD), and difficulties in mathematics. A genetic variant of DCDC2 associates with dyslexia, a common neurobehavioral disorder of reading. DCDC2 protein interacts with many of the same cytoskeleton related proteins that other members of the DCX family interact with.	80
340675	cd17155	DCX_DCDC1	Doublecortin-like domain found in doublecortin domain-containing protein 1 (DCDC1). Doublecortin (DCX) is a microtubule-associated protein (MAP) with a stable ubiquitin-like tertiary fold.  Microtubules are key components of the cytoskeleton that are involved in cell movement, shape determination, division and transport. DCX gene family consists of eleven paralogs in human and mouse, such that its protein domains can occur in tandem or single repeats. Proteins with DCX tandem domains in general have roles in microtubule (MT) regulation and signal transduction while single DCX repeat proteins are normally localized to actin-rich subcellular structures or to the nucleus. DCDC1 is a hydrophilic intracellular protein that contains only one DCX repeat. Therefore, DCDC1 might only bind to microtubules without microtubule polymerization properties. DCDC1 is mainly expressed in adult testis.	72
340676	cd17156	DCX1_DCDC5	Doublecortin-like domain 1 found in doublecortin domain-containing protein 5 (DCDC5). DCDC5 is a member of doublecortin (DCX) family. It is a microtubule-associated protein (MAP) with stable double tandem DCX repeats of ubiquitin-like tertiary fold. Ubiquitin (Ub) is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair. Microtubules are key components of the cytoskeleton that are involved in cell movement, shape determination, division and transport. DCDC5 is expressed during mitosis and is involved in coordinating late cytokinesis. DCDC5 interacts with cytoplasmic dynein and Rab8, as well as with the Rab8 nucleotide exchange factor Rabin8.	76
340677	cd17157	DCX2_DCDC5	Doublecortin-like domain 2 found in doublecortin domain-containing protein 5 (DCDC5). DCDC5 is a member of doublecortin (DCX) family. It is a microtubule-associated protein (MAP) with stable double tandem DCX repeats of ubiquitin-like tertiary fold. Ubiquitin (Ub) is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair. Microtubules are key components of the cytoskeleton that are involved in cell movement, shape determination, division and transport. DCDC5 is expressed during mitosis and involved in coordinating late cytokinesis. DCDC5 interacts with cytoplasmic dynein and Rab8, as well as with the Rab8 nucleotide exchange factor Rabin8.	86
340678	cd17158	DCX3_DCDC5	Doublecortin-like domain 3 found in doublecortin domain-containing protein 5 (DCDC5). DCDC5 is a member of doublecortin (DCX) family. It is a microtubule-associated protein (MAP) with stable double tandem DCX repeats of ubiquitin-like tertiary fold. Ubiquitin (Ub) is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair. Microtubules are key components of the cytoskeleton that are involved in cell movement, shape determination, division and transport. DCDC5 is expressed during mitosis and involved in coordinating late cytokinesis. DCDC5 interacts with cytoplasmic dynein and Rab8, as well as with the Rab8 nucleotide exchange factor Rabin8.	73
340679	cd17159	DCX4_DCDC5	Doublecortin-like domain 4 found in doublecortin domain-containing protein 5 (DCDC5). DCDC5 is a member of doublecortin (DCX) family. It is a microtubule-associated protein (MAP) with stable double tandem DCX repeats of ubiquitin-like tertiary fold. Ubiquitin (Ub) is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair. Microtubules are key components of the cytoskeleton that are involved in cell movement, shape determination, division and transport. DCDC5 is expressed during mitosis and involved in coordinating late cytokinesis. DCDC5 interacts with cytoplasmic dynein and Rab8, as well as with the Rab8 nucleotide exchange factor Rabin8.	73
340680	cd17160	UBX_UBXN2A	Ubiquitin regulatory domain X (UBX) found in UBX domain protein 2A (UBXN2A) and similar proteins. UBXN2A, also termed UBX domain-containing protein 4 (UBXD4), belongs to the UBXD family of proteins that contains the ubiquitin regulatory domain X (UBX) with a beta-grasp ubiquitin-like fold, but without the C-terminal double glycine motif. UBX domain is typically located at the carboxyl terminus of proteins, and participates broadly in the regulation of protein degradation. UBXN2A functions as a p97 (also known as VCP or Cdc48) adaptor protein facilitating the regulation of the cell surface number and stability of alpha3-containing neuronal nicotinic acetylcholine receptors (nAChRs). It also regulates nicotinic receptor degradation by modulating the E3 ligase activity of carboxyl terminus of Hsc70 interacting protein (CHIP) that is involved in the degradation of several disease-related proteins. In addition, UBXN2A is an important anticancer factor that contributes to p53 localization and activation as a host defense mechanism against cancerous growth. It acts as a potential mortalin inhibitor, as well as a potential chemotherapy sensitizer for clinical application. It binds to the oncoprotein mortalin-2 (mot-2), and further unsequesters p53 from mot-2 in the cytoplasm, resulting in translocation of p53 to the nucleus where p53 proteins activate their downstream biological effects, including apoptosis.	84
340681	cd17161	UBX_UBXN2B	Ubiquitin regulatory domain X (UBX) found in UBX domain protein 2B (UBXN2B) and similar proteins. UBXN2B, also termed NSFL1 cofactor p37, or p97 cofactor p37, belongs to the UBXD family of proteins that contains the ubiquitin regulatory domain X (UBX) with a beta-grasp ubiquitin-like fold, but without the C-terminal double glycine motif. UBX domain is typically located at the carboxyl terminus of proteins, and participates broadly in the regulation of protein degradation. UBXN2B is an adaptor protein of p97 (also known as VCP or Cdc48), which is a homohexameric AAA ATPase (ATPase associated with a variety of activities) involved in a variety of functions ranging from cell-cycle regulation to membrane fusion and protein degradation. UBXN2B forms a tight complex with p97 in the cytosol and localizes to the Golgi and endoplasmic reticulum (ER).	82
340682	cd17162	UBX_UBXN2C	Ubiquitin regulatory domain X (UBX) found in NSFL1 cofactor (also known as UBX domain-containing protein 2C (UBXN2C)) and similar proteins. UBXN2C, also termed NSFL1C, or NSFL1 cofactor p47, or p97 cofactor p47, UBX1, or UBXD10, belongs to the UBXD family of proteins that contains the ubiquitin regulatory domain X (UBX) with a beta-grasp ubiquitin-like fold, but without the C-terminal double glycine motif. UBX domain is typically located at the carboxyl terminus of proteins, and participates broadly in the regulation of protein degradation. UBXN2C is a major adaptor of p97 (also known as VCP or Cdc48), which is a homohexameric AAA ATPase (ATPase associated with a variety of activities) involved in a variety of functions ranging from cell-cycle regulation to membrane fusion and protein degradation. The main role of the UBXN2C/p97 complex is in regulation of membrane fusion events. It plays an essential role in the reassembly of Golgi stacks at the end of mitosis. UBXN2C also functions as an essential factor for Golgi membrane fusion, which associates with the nuclear factor-kappaB essential modulator (NEMO) subunit of the IkappaB kinase (IKK) complex upon tumor necrosis factor (TNF)-alpha or interleukin (IL)-1 stimulation, induces the lysosome-dependent degradation of polyubiquitinated NEMO without p97, and thus inhibits IKK activation. Moreover, UBXN2C regulates a membrane fusion process, which is required by the maintenance of the endoplasmic reticulum (ER) network, through phosphorylation by Cdc2 kinase.	82
340683	cd17163	Ubl_ATG8_GABARAPL2	ubiquitin-like (Ubl) domain found in GABA type A receptor associated protein like 2 (GABARAPL2, also known as GATE16). Autophagy is an essential intracellular process that targets large protein complexes, bacterial pathogens, and organelles for degradation. GABARAPL2 (GABA type A receptor associated protein like 2), also known as GATE-16 (golgi-associated adenosine triphosphatase enhancer of 16 kDa), has a ubiquitin-like (Ubl) domain and is a sub-family of the autophagy-related 8 (ATG8) protein family. GABARAPL2 participates in autophagosome maturation, and seems to be involved in constitutive transport under normal conditions and in autophagic processes during stress. GABARAPL2 is characterized as a membrane transport modulator that controls intra-Golgi transport by interacting with NSF and Golgi v-SNARE GOS-28.	112
340684	cd17164	RAWUL_PCGF2	RING finger- and WD40-associated ubiquitin-like (RAWUL) domain found in polycomb group RING finger protein 2 (PCGF2). PCGF2, also termed DNA-binding protein Mel-18, or RING finger protein 110 (RNF110), or zinc finger protein 144 (ZNF144), is one of six PcG RING finger (PCGF) homologs (PCGF1/NSPc1, PCGF2/Mel-18, PCGF3, PCGF4/BMI1, PCGF5, and PCGF6/MBLR) and serves as the core component of a canonical polycomb repressive complex 1 (PRC1), which is composed of a chromodomain-containing protein (CBX2, CBX4, CBX6, CBX7 or CBX8) and a polyhomeotic protein (PHC1, PHC2, or PHC3). Like other PCGF homologs, PCGF2 associates with ring finger protein 2 (RNF2) to form a RNF2-PCGF heterodimer, which is catalytically competent as an E3 ubiquitin transferase and is the scaffold for the assembly of additional components. Moreover, PCGF2 uniquely regulates PRC1 to specify mesoderm cell fate in embryonic stem cells. It is required for PRC1 stability and maintenance of gene repression in embryonic stem cells (ESCs) and essential for ESC differentiation into early cardiac-mesoderm precursors. PCGF2 also plays a significant role in the angiogenic function of endothelial cells (ECs) by regulating endothelial gene expression. Furthermore, PCGF2 is a SUMO-dependent regulator of hormone receptors. It facilitates the deSUMOylation process by inhibiting PCGF4/BMI1-mediated ubiquitin-proteasomal degradation of SUMO1/sentrin-specific protease 1 (SENP1). It is also a novel negative regulator of breast cancer stem cells (CSCs) that inhibits the stem cell population, and in vitro and in vivo self-renewal through the inactivation of Wnt-mediated Notch signaling. PCGF2 contains a C3HC4-type RING-HC finger, and a RAWUL domain.	99
340685	cd17165	RAWUL_PCGF4	RING finger- and WD40-associated ubiquitin-like (RAWUL) domain found in polycomb group RING finger protein 4 (PCGF4). PCGF4, also termed polycomb complex protein BMI-1 (B cell-specific Moloney murine leukemia virus integration site 1), or RING finger protein 51 (RNF51), is one of six PcG RING finger (PCGF) homologs (PCGF1/NSPc1, PCGF2/Mel-18, PCGF3, PCGF4/BMI1, PCGF5, and PCGF6/MBLR) and serves as the core component of a canonical Polycomb repressive complex 1 (PRC1), which is composed of a chromodomain-containing protein (CBX2, CBX4, CBX6, CBX7 or CBX8) and a Polyhomeotic protein (PHC1, PHC2, or PHC3), and plays important roles in  chromatin compaction and H2AK119 monoubiquitination. PCGF4 associates with the Runx1/CBFbeta transcription factor complex to silence target gene in a PRC2-independent manner. Moreover, PCGF4 is expressed in the hair cells and supporting cells. It can regulate cell survival by controlling mitochondrial function and reactive oxygen species (ROS) level in thymocytes and neurons, thus having an important role in the survival and sensitivity to ototoxic drug of auditory hair cells. Furthermore, PCGF4 controls memory CD4 T-cell survival through direct repression of Noxa gene in an Ink4a- and Arf-independent manner. It is required in neurons to suppress p53-induced apoptosis via regulating the antioxidant defensive response, and also involved in the tumorigenesis of various cancer types. PCGF4 contains a C3HC4-type RING-HC finger, and a RAWUL domain.	97
340686	cd17166	RAWUL_RING1	RING finger- and WD40-associated ubiquitin-like (RAWUL) domain found in really interesting new gene 1 protein (RING1). RING1, also termed polycomb complex protein RING1, or RING finger protein 1 (RNF1), or RING finger protein 1A (RING1A), has been identified as a transcriptional repressor that is associated with the Polycomb group (PcG) protein complex involved in stable repression of gene activity. It is a core component of polycomb repressive complex 1 (PRC1) that functions as an E3-ubiquitin ligase transferring the mono-ubiquitin mark to the C-terminal tail of Histone H2A at K118/K119. PRC1 is also capable of chromatin compaction, a function not requiring histone tails, and this activity appears important in gene silencing. RING1 interacts with multiple PcG proteins and displays tumorigenic activity. It also shows zinc-dependent DNA binding activity. Moreover, RING1 inhibits transactivation of the DNA-binding protein recombination signal binding protein-Jkappa (RBP-J) by Notch through interaction with the LIM domains of KyoT2. RING1 contains a C3HC4-type RING-HC finger, and a RAWUL domain.	124
340687	cd17167	RAWUL_RING2	RING finger- and WD40-associated ubiquitin-like (RAWUL) domain found in really interesting new gene 2 protein (RING2). RING2, also termed huntingtin-interacting protein 2-interacting protein 3, or HIP2-interacting protein 3, or protein DinG, or RING finger protein 1B (RING1B), or RING finger protein 2 (RNF2), or RING finger protein BAP-1, is an E3 ubiquitin-protein ligase that interacts with both nucleosomal DNA and an acidic patch on histone H4 to achieve the specific monoubiquitination of K119 on histone H2A (H2AK119ub), thereby playing a central role in histone code and gene regulation. RING2 is a core component of polycomb repressive complex 1 (PRC1) that functions as an E3-ubiquitin ligase transferring the mono-ubiquitin mark to the C-terminal tail of Histone H2A at K118/K119. PRC1 is also capable of chromatin compaction, a function not requiring histone tails, and this activity appears important in gene silencing. The enzymatic activity of RING2 is enhanced by the interaction with BMI1/PCGF4, and it is dispensable for early embryonic development and much of the gene repression activity of PRC1. Moreover, RING2 plays a key role in terminating neural precursor cell (NPC)-mediated production of subcerebral projection neurons (SCPNs) during neocortical development. It also plays a critical role in nonhomologous end-joining (NHEJ)-mediated end-to-end chromosome fusions. Furthermore, RING2 is essential for expansion of hepatic stem/progenitor cells. It promotes hepatic stem/progenitor cell expansion through simultaneous suppression of cyclin-dependent kinase inhibitors (CDKIs) Cdkn1a and Cdkn2a, known negative regulators of cell proliferation. RING2 also negatively regulates p53 expression through directly binding with both p53 and MDM2 and promoting MDM2-mediated p53 ubiquitination in selective cancer cell types to stimulate tumor development. RING2 contains a C3HC4-type RING-HC finger, and a RAWUL domain.	106
340688	cd17168	FERM_F1_FRMPD1	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in FERM and PDZ domain-containing protein 1 (FRMPD1). FRMPD1, also termed FERM domain-containing protein 2, is an activator of G-protein signaling 3 (AGS3)-binding protein that regulates the subcellular location of AGS3 and its interaction with G-proteins. It also binds to the tetratricopeptide repeat (TPR) motif-containing adaptor protein LGN. FRMPD1 contains a PDZ domain and a FERM domain. The FERM domain is made up of three sub-domains, F1, F2, and F3. The family corresponds to the F1 sub-domain of the FERM domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N).	90
340689	cd17169	FERM_F1_FRMPD3	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in FERM and PDZ domain-containing protein 3 (FRMPD3). FRMPD3 is an uncharacterized FERM and PDZ domain-containing protein. The FERM domain is made up of three sub-domains, F1, F2, and F3. The family corresponds to the F1 sub-domain of the FERM domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N).	93
340690	cd17170	FERM_F1_FRMPD4	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in FERM and PDZ domain-containing protein 4 (FRMPD4). FRMPD4, also termed PDZ domain-containing protein 10, or PSD-95-interacting regulator of spine morphogenesis (Preso), is a multiscaffolding protein that modulates both Homer1 and postsynaptic density protein 95 activity. It can associate with the tetratricopeptide repeat (TPR) motif-containing adaptor protein LGN. Moreover, FRMPD4 is asymmetrically distributed in the cytosol and nuclei of neural stem/progenitor cells in the adult brain, suggesting a significant role in cell differentiation via association with cell polarity machinery. FRMPD4 contains a WW domain, a PDZ domain, and a FERM domain. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain of the FERM domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N).	94
340691	cd17171	FERM_F0_TLN1	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F0 sub-domain, found in Talin-1 (TLN1). TLN1 is a cytoskeletal protein that plays a pivotal role in regulating the activity of the integrin family of cell adhesion proteins by coupling them to F-actin. It functions as a focal adhesion protein involved in the attachment of the bacterium. It binds to multiple adhesion molecules, including integrins, vinculin, focal adhesion kinase (FAK), and actin. TLN1 also plays an essential role in integrin activation. TLN1 interacts with the hepatitis B virus (HBV) accessory protein X (HBx), which induces the degradation of TLN1. It also acts as an adaptor protein that regulates leukocyte function-associated antigen-1 (LFA-1) affinity. In addition, TLN1 is required for myoblast fusion, sarcomere assembly and the maintenance of myotendinous junctions. TLN1 consists of an N-terminal head and a C-terminal rod. The talin head harbors a FERM (Band 4.1, ezrin, radixin, moesin) domain made up of F1, F2 and F3 domains, as well as an N-terminal region that precedes the FERM domain and has been referred to as the F0 domain. Both F0 and F1 domains have similar ubiquitin-like folds. This family corresponds to the F0 domain.	84
340692	cd17172	FERM_F0_TLN2	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F0 sub-domain, found in Talin-2 (TLN2). TLN2 is a cytoskeletal protein that plays an important role in cell adhesion and recycling of synaptic vesicles. TLN2 is required for myoblast fusion, sarcomere assembly and the maintenance of myotendinous junctions. TLN2 consists of an N-terminal head and a C-terminal rod. The talin head harbors a FERM (Band 4.1, ezrin, radixin, moesin) domain made up of F1, F2 and F3 domains, as well as an N-terminal region that precedes the FERM domain and has been referred to as the F0 domain. Both F0 and F1 domains have similar ubiquitin-like folds. This family corresponds to the F0 domain.	84
340693	cd17173	FERM_F1_TLN1	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in Talin-1 (TLN1). TLN1 is a cytoskeletal protein that plays a pivotal role in regulating the activity of the integrin family of cell adhesion proteins by coupling them to F-actin. It functions as a focal adhesion protein involved in the attachment of the bacterium. It binds to multiple adhesion molecules, including integrins, vinculin, focal adhesion kinase (FAK), and actin. TLN1 also plays an essential role in integrin activation. TLN1 interacts with the hepatitis B virus (HBV) accessory protein X (HBx), which induces the degradation of TLN1. It also acts as an adaptor protein that regulates leukocyte function-associated antigen-1 (LFA-1) affinity. In addition, TLN1 is required for myoblast fusion, sarcomere assembly and the maintenance of myotendinous junctions. TLN1 consists of an N-terminal head and a C-terminal rod. The talin head harbors a FERM (Band 4.1, ezrin, radixin, moesin) domain made up of F1, F2 and F3 domains, as well as an N-terminal region that precedes the FERM domain and has been referred to as the F0 domain. Both F0 and F1 domains have similar ubiquitin-like folds. This family corresponds to the F1 domain.	112
340694	cd17174	FERM_F1_TLN2	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in Talin-2 (TLN2). TLN2 is a cytoskeletal protein that plays an important role in cell adhesion and recycling of synaptic vesicles. TLN2 is required for myoblast fusion, sarcomere assembly and the maintenance of myotendinous junctions. TLN2 consists of an N-terminal head and a C-terminal rod. The talin head harbors a FERM (Band 4.1, ezrin, radixin, moesin) domain made up of F1, F2 and F3 domains, as well as an N-terminal region that precedes the FERM domain and has been referred to as the F0 domain. Both F0 and F1 domains have similar ubiquitin-like folds. This family corresponds to the F1 domain.	112
340695	cd17175	FERM_F0_SHANK1	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F0 sub-domain, found in SH3 and multiple ankyrin repeat domains protein 1 (SHANK1). SHANK1, also termed somatostatin receptor-interacting protein, or SSTR-interacting protein (SSTRIP), is a postsynaptic density (PSD)-associated scaffolding protein at the excitatory synapse that interconnects neurotransmitter receptors and cell adhesion molecules by direct and indirect interactions with numerous other PSD-associated proteins. Mutations in the SHANK1 synaptic scaffolding gene may lead to autism spectrum disorder and mental retardation. SHANK1 contains an N-terminal F0 domain of FERM (Band 4.1, ezrin, radixin, moesin), six ankyrin (ANK) repeats, one SH3 (Src homology 3) domain, one PDZ (PSD-95, Dlg, and ZO-1/2, also termed DHR or GLGF) domain, and a C-terminal SAM (sterile alpha motif) domain. This family corresponds to the F0 domain that adopts a ubiquitin-like fold.	89
340696	cd17176	FERM_F0_SHANK2	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F0 sub-domain, found in SH3 and multiple ankyrin repeat domains protein 2 (SHANK2). SHANK2, also termed cortactin-binding protein 1 (CortBP1), or proline-rich synapse-associated protein 1, is a postsynaptic density (PSD)-associated scaffolding protein at the excitatory synapse that interconnects neurotransmitter receptors and cell adhesion molecules by direct and indirect interactions with numerous other PSD-associated proteins. It is strongly expressed in the cerebellum. Moreover, SHANK2 acts as a component of the albumin endocytic pathway in podocytes, and regulates renal albumin endocytosis. It also associates with and regulates Na+/H+ exchanger 3 (NHE3) and is involved in the fine regulation of transepithelial salt and water transport through affecting NHE3 expression and activity. Mutations in the SHANK2 synaptic scaffolding gene may lead to autism spectrum disorder and mental retardation. SHANK2 contains an N-terminal F0 domain of FERM (Band 4.1, ezrin, radixin, moesin), six ankyrin (ANK) repeats, one SH3 (Src homology 3) domain, one PDZ (PSD-95, Dlg, and ZO-1/2, also termed DHR or GLGF) domain, and a C-terminal SAM (sterile alpha motif) domain. This family corresponds to the F0 domain that adopts a ubiquitin-like fold.	88
340697	cd17177	FERM_F0_SHANK3	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F0 sub-domain, found in SH3 and multiple ankyrin repeat domains protein 3 (SHANK3). SHANK3, also termed proline-rich synapse-associated protein 2 (ProSAP2), is a postsynaptic density (PSD)-associated scaffolding protein at the excitatory synapse that interconnects neurotransmitter receptors and cell adhesion molecules by direct and indirect interactions with numerous other PSD-associated proteins. It is critical for synaptic plasticity and the trans-synaptic coupling between the reliability of presynaptic neurotransmitter release and postsynaptic responsiveness. It is a key component of a zinc-sensitive signaling system that regulates excitatory synaptic strength. Mutations in the SHANK3 synaptic scaffolding gene may lead to autism spectrum disorder and mental retardation, and the cause of human Phelan-McDermid syndrome (22q13.3 deletion syndrome) has been isolated to loss of function of one copy of the SHANK3 gene. SHANK3 contains an N-terminal F0 domain of FERM (Band 4.1, ezrin, radixin, moesin), six ankyrin (ANK) repeats, one SH3 (Src homology 3) domain, one PDZ (PSD-95, Dlg, and ZO-1/2, also known as DHR or GLGF) domain, and a C-terminal SAM (sterile alpha motif) domain. This family corresponds to the F0 domain that adopts a ubiquitin-like fold.	87
340698	cd17178	FERM_F1_PLEKHH1	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in pleckstrin homology domain-containing family H member 1 (PLEKHH1). PLEKHH1 is a homolog of Caenorhabditis elegans MAX-1 that has been implicated in motor neuron axon guidance. PLEKHH1 is critical in vascular patterning in vertebrate species through acting upstream of the ephrin pathway. PLEKHH1 contains a putative alpha-helical coiled-coil segment within the N-terminal half, and two Pleckstrin homology (PH) domains, a MyTH4 domain, and a FERM (Band 4.1, ezrin, radixin, moesin) domain within the C-terminal half. The FERM domain is made up of three sub-domains, F1, F2, and F3. The family corresponds to F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N).	106
340699	cd17179	FERM_F1_PLEKHH2	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in pleckstrin homology domain-containing family H member 2 (PLEKHH2). PLEKHH2 is a novel podocyte protein downregulated in human focal segmental glomerulosclerosis. It is highly enriched in renal glomerular podocytes, and acts as a novel, important component of the podocyte foot processes. PLEKHH2 contains a putative alpha-helical coiled-coil segment within the N-terminal half, and two Pleckstrin homology (PH) domains, a MyTH4 domain, and a FERM (Band 4.1, ezrin, radixin, moesin) domain within the C-terminal half. The FERM domain is made up of three sub-domains, F1, F2, and F3. The family corresponds to F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). PLEKHH2 is involved in matrix adhesion and actin dynamics. It directly interacts through its FERM domain with the focal adhesion protein Hic-5 and actin.	103
340700	cd17180	FERM_F0_KIND1	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F0 sub-domain, found in kindlin-1 (KIND1). KIND1, also termed Kindlerin, or Kindler syndrome protein, or fermitin family homolog 1 (FERMT1), or Unc-112-related protein 1 (URP1), is an integrin-interacting protein that has been implicated in cell adhesion, proliferation, polarity, and motility. It is essential for maintaining the structure of cell-matrix adhesion, such as focal adhesions and podosomes. KIND1 is expressed primarily in epithelial cells. Loss or mutations of KIND1 gene may cause the Kindler syndrome (KS), an autosomal recessive skin disorder with an intriguing progressive phenotype comprising skin blistering, photosensitivity, progressive poikiloderma with extensive skin atrophy, and propensity to skin cancer. KIND1 forms a molecular complex with the key transforming growth factor (TGF)-beta/Smad3 signaling components including type I TGFbeta receptor (TbetaRI), Smad3 and Smad anchor for receptor activation (SARA) to control the activation of TGF-beta/Smad3 signaling pathway. KIND1 consists of an atypical FERM domain that is made up of F1, F2 and F3 domains, as well as an N-terminal region, which precedes the FERM domain and has been referred to as the F0 domain. This family corresponds to the F0 domain.	84
340701	cd17181	FERM_F0_KIND2	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F0 sub-domain, found in kindlin-2 (KIND2). KIND2, also termed fermitin family homolog 2 (FERMT2), or mitogen-inducible gene 2 protein (MIG-2), or Pleckstrin homology (PH) domain-containing family C member 1, is an adaptor protein that is widely distributed and is particularly abundant in adherent cells. It binds to the integrin beta cytoplasmic tail to promote integrin activation. It promotes carcinogenesis through regulation of cell-cell and cell-extracellular matrix adhesion. In addition, KIND2 plays an important role in cardiac development. KIND2 consists of an atypical FERM domain that is made up of F1, F2 and F3 domains, as well as an N-terminal region, which precedes the FERM domain and has been referred to as the F0 domain. This family corresponds to the F0 domain.	80
340702	cd17182	FERM_F0_KIND3	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F0 sub-domain, found in kindlin-3 (KIND3). KIND3, also termed fermitin family homolog 3 (FERMT3), or MIG2-like protein, or Unc-112-related protein 2, is an adaptor protein that is expressed primarily in hematopoietic cells. It plays a central role in cell adhesion in hematopoietic cells, and also promotes integrin activation, clustering and outside-in signaling. KIND3, together with talin-1, contributes essentially to the activation of beta2-integrins in neutrophils. In addition, KIND3 interacts with the ribosome and regulates c-Myc expression required for proliferation of chronic myeloid leukemia cells. Mutations in the KIND3 gene cause leukocyte adhesion deficiency type III (LAD III), which is characterized by high susceptibility to infections, spontaneous and episodic bleedings, and osteopetrosis. KIND3 consists of an atypical FERM domain that is made up of F1, F2 and F3 domains, as well as an N-terminal region, which precedes the FERM domain and has been referred to as the F0 domain. This family corresponds to the F0 domain.	83
340703	cd17183	FERM_F1_KIND1	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in kindlin-1 (KIND1). KIND1, also termed Kindlerin, or Kindler syndrome protein, or fermitin family homolog 1 (FERMT1), or Unc-112-related protein 1 (URP1), is an integrin-interacting protein that has been implicated in cell adhesion, proliferation, polarity, and motility. It is essential for maintaining the structure of cell-matrix adhesion, such as focal adhesions and podosomes. KIND1 is expressed primarily in epithelial cells. Loss or mutations of KIND1 gene may cause the Kindler syndrome (KS), an autosomal recessive skin disorder with an intriguing progressive phenotype comprising skin blistering, photosensitivity, progressive poikiloderma with extensive skin atrophy, and propensity to skin cancer. KIND1 forms a molecular complex with the key transforming growth factor (TGF)-beta/Smad3 signaling components including type I TGFbeta receptor (TbetaRI), Smad3 and Smad anchor for receptor activation (SARA) to control the activation of TGF-beta/Smad3 signaling pathway. KIND1 consists of an atypical FERM domain that is made up of F1, F2 and F3 domains, as well as an N-terminal region, which precedes the FERM domain and has been referred to as the F0 domain. This family corresponds to the F1 domain.	93
340704	cd17184	FERM_F1_KIND2	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in kindlin-2 (KIND2). KIND2, also termed fermitin family homolog 2 (FERMT2), or mitogen-inducible gene 2 protein (MIG-2), or Pleckstrin homology (PH) domain-containing family C member 1, is an adaptor protein that is widely distributed and is particularly abundant in adherent cells. It binds to the integrin beta cytoplasmic tail to promote integrin activation. It promotes carcinogenesis through regulation of cell-cell and cell-extracellular matrix adhesion. KIND2 also plays an important role in cardiac development. KIND2 consists of an atypical FERM domain that is made up of F1, F2 and F3 domains, as well as an N-terminal region, which precedes the FERM domain and has been referred to as the F0 domain. This family corresponds to the F1 domain.	101
340705	cd17185	FERM_F1_KIND3	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in kindlin-3 (KIND3). KIND3, also termed fermitin family homolog 3 (FERMT3), or MIG2-like protein, or Unc-112-related protein 2, is an adaptor protein that is expressed primarily in hematopoietic cells. It plays a central role in cell adhesion in hematopoietic cells, and also promotes integrin activation, clustering and outside-in signaling. KIND3, together with talin-1, contributes essentially to the activation of beta2-integrins in neutrophils. In addition, KIND3 interacts with the ribosome and regulates c-Myc expression required for proliferation of chronic myeloid leukemia cells. Mutations in the KIND3 gene cause leukocyte adhesion deficiency type III (LAD III), which is characterized by high susceptibility to infections, spontaneous and episodic bleedings, and osteopetrosis. KIND3 consists of an atypical FERM domain that is made up of F1, F2 and F3 domains, as well as an N-terminal region, which precedes the FERM domain and has been referred to as the F0 domain. This family corresponds to the F1 domain.	91
340706	cd17186	FERM_F1_Merlin	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in merlin and similar proteins. Merlin, also termed moesin-ezrin-radixin-like protein, or neurofibromin-2 (NF2), or Schwannomerlin, or Schwannomin, is a member of the ezrin/radixin/moesin (ERM) family of cytoskeletal proteins that plays an essential role in microvilli formation, T-cell activation, and tumor metastasis through providing a regulated linkage between F-actin and membrane-associated proteins. These proteins may also function in signaling cascades that regulate the assembly of actin stress fibers. The ERM proteins consist of an N-terminal FERM domain, a coiled-coil (CC) domain and a C-terminal tail segment (C-tail) containing a well-defined actin-binding motif; merlin, however, lacks the typical actin-binding motif in the C-tail. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). Merlin plays vital roles in controlling proper development of organ sizes by specifically binding to a large number of target proteins localized both in cytoplasm and nuclei. Merlin may function as a tumor suppressor that functions upstream of the core Hippo pathway kinases Lats1/2 (Wts in Drosophila) and Mst1/2 (Hpo in Drosophila), as well as the nuclear E3 ubiquitin ligase DDB1-and-Cullin 4-associated Factor 1 (DCAF1)-associated cullin 4-Roc1 ligase, CRL4(DCAF1). Merlin may also have a tumor suppressor function in melanoma cells through the inhibition of the proto-oncogenic Na(+)/H(+) exchanger isoform 1 (NHE1) activity.	85
340707	cd17187	FERM_F1_ERM	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in the ERM family proteins, Ezrin, Radixin, and Moesin. The ezrin-radixin-moesin (ERM) family includes a group of closely related cytoskeletal proteins that plays an essential role in microvilli formation, T-cell activation, and tumor metastasis through providing a regulated linkage between F-actin and membrane-associated proteins. These proteins may also function in signaling cascades that regulate the assembly of actin stress fibers. The ERM proteins consist of an N-terminal FERM domain, a coiled-coil (CC) domain and a C-terminal tail segment (C-tail) containing a well-defined actin-binding motif. They exist in two states, a dormant state in which the FERM domain binds to its own C-terminal tail and thereby precludes binding of some partner proteins, and an activated state, in which the FERM domain binds to one of many membrane binding proteins and the C-terminal tail binds to F-actin. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N).	83
340708	cd17188	FERM_F1_FRMD7	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in FERM domain-containing protein 7 (FRMD7). FRMD7 plays an important role in neuronal development and is involved in the regulation of F-actin, neurofilament, and microtubule dynamics. It interacts with the Rho GTPase regulator, RhoGDIalpha, and activates the Rho subfamily member Rac1, which regulates reorganization of actin filaments and controls neuronal outgrowth. Mutations in the FRMD7 gene are responsible for the X-linked idiopathic congenital nystagmus (ICN), a disease which affects ocular motor control. FRMD7 contains a FERM domain, and a pleckstrin homology (PH) domain. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N).	86
340709	cd17189	FERM_F1_FARP1	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in FERM, ARH/RhoGEF and pleckstrin domain-containing protein 1 (FARP1). FARP1, also termed chondrocyte-derived ezrin-like protein (CDEP), or pleckstrin homology (PH) domain-containing family C member 2 (PLEKHC2), is a neuronal activator of the RhoA GTPase. It promotes outgrowth of developing motor neuron dendrites. It also regulates excitatory synapse formation and morphology, as well as activates the GTPase Rac1 to promote F-actin assembly. As a novel downstream signaling partner of Rif, FARP1 is involved in the regulation of semaphorin signaling in neurons. FARP1 contains a FERM domain, a Dbl-homology (DH) domain and two pleckstrin homology (PH) domains. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N).	85
340710	cd17190	FERM_F1_FARP2	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in FERM, ARH/RhoGEF and pleckstrin domain-containing protein 2 (FARP2) and similar proteins. FARP2, also termed FERM domain including RhoGEF (FIR), or Pleckstrin homology (PH) domain-containing family C member 3, is a Dbl-family guanine nucleotide exchange factor (GEF) that activates Rac1 or Cdc42 in response to upstream signals, suggesting roles in regulating processes such as neuronal axon guidance and bone homeostasis. It is also a key molecule involved in the response of neuronal growth cones to class-3 semaphorins. FARP2 contains a FERM domain, a Dbl-homology (DH) domain and two pleckstrin homology (PH) domains. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N).	85
340711	cd17191	FERM_F1_PTPN14	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in tyrosine-protein phosphatase non-receptor type 14 (PTPN14) and similar proteins. PTPN14, also termed protein-tyrosine phosphatase pez, or PTPD2, or PTP36, is a widely expressed non-transmembrane cytosolic protein tyrosine phosphatase (PTP). It belongs to the FERM family of PTPs characterized by a conserved N-terminal FERM domain and a C-terminal PTP catalytic domain with an intervening sequence containing an acidic region and a putative SH3 domain-binding sequence. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). PTPN14 plays a role in the nucleus during cell proliferation. It forms a complex with Kibra and LATS1 proteins and negatively regulates the key Hippo pathway protein Yes-associated protein (YAP) oncogenic function by controlling its localization. It specifically regulates p130 Crk-associated substrate (p130Cas) phosphorylation at tyrosine residue 128 (Y128) in colorectal cancer (CRC) cells. Moreover, PTPN14 may be a critical enzyme in regulating endothelial cell function. It plays a crucial role in organogenesis by inducing transforming growth factor beta (TGFbeta) and epithelial-mesenchymal transition (EMT). It also acts as a modifier of angiogenesis and hereditary haemorrhagic telangiectasia. It regulates the lymphatic function and choanal development through the interaction with the vascular endothelial growth factor receptor 3 (VEGFR3), a receptor tyrosine kinase essential for lymphangiogenesis. Furthermore, PTPN14 functions as a regulator of cell motility through its action on cell-cell adhesion. Beta-Catenin, a central component of adherens junctions, has been identified as a PTPN14 substrate. PTPN14 works as a novel sperm-motility biomarker and a potential mitochondrial protein.	87
340712	cd17192	FERM_F1_PTPN21	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in tyrosine-protein phosphatase non-receptor type 21 (PTPN21) and similar proteins. PTPN21, also termed protein-tyrosine phosphatase D1 (PTPD1), is a cytosolic non-receptor protein-tyrosine phosphatase (PTP) that belongs to the FERM family of PTPs characterized by a conserved N-terminal FERM domain and a C-terminal PTP catalytic domain with an intervening sequence containing an acidic region and a putative SH3 domain-binding sequence. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). PTPN21 interacts with a Tec tyrosine kinase family member, the epithelial and endothelial tyrosine kinase (Etk, also known as Bmx), modulates Stat3 activation, and plays a role in the regulation of cell growth and differentiation. It also associates with and activates Src tyrosine kinase, and directs epidermal growth factor (EGF)/Src signaling to the nucleus through activating ERK1/2- and Elk1-dependent gene transcription. The PTPD1-Src complex interacts with the protein kinase A-anchoring protein AKAP121 to form a PTPD1-Src-AKAP121 complex, which is required for efficient maintenance of mitochondrial membrane potential and ATP oxidative synthesis. As a novel component of the endocytic pathway, PTPN21 supports EGF receptor stability and mitogenic signaling in bladder cancer cells. Moreover, PTPD1 regulates focal adhesion kinase (FAK) autophosphorylation and cell migration through modulating Src-FAK signaling at adhesion sites.	87
340713	cd17193	FERM_F1_PTPN3	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in tyrosine-protein phosphatase non-receptor type 3 (PTPN3). PTPN3, also termed protein-tyrosine phosphatase H1 (PTP-H1), belongs to the non-transmembrane FERM-containing protein-tyrosine phosphatase (PTP) subfamily characterized by a conserved N-terminal FERM domain, a PDZ domain, and a C-terminal PTP catalytic domain. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). PTPN3 associates with the mitogen-activated protein kinase p38gamma (also known as MAPK12) to form a PTPN3-p38gamma complex that promotes Ras-induced oncogenesis. It may also act as a tumor suppressor in lung cancer through its modulation of epidermal growth factor receptor (EGFR) signaling. Moreover, PTPN3 shows sensitizing effect to anti-estrogens. It dephosphorylates the tyrosine kinase EGFR, disrupts its interaction with the nuclear estrogen receptor, and increases breast cancer sensitivity to small molecule tyrosine kinase inhibitors (TKIs). It also cooperates with vitamin D receptor to stimulate breast cancer growth through their mutual stabilization.	84
340714	cd17194	FERM_F1_PTPN4	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in tyrosine-protein phosphatase non-receptor type 4 (PTPN4). PTPN4, also termed protein-tyrosine phosphatase MEG1 (MEG) or PTPase-MEG1, belongs to the non-transmembrane FERM-containing protein-tyrosine phosphatase (PTP) subfamily characterized by a conserved N-terminal FERM domain, a PDZ domain, and a C-terminal PTP catalytic domain. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). PTPN4 protects cells against apoptosis. It associates with the mitogen-activated protein kinase p38gamma (also known as MAPK12) to form a PTPN4-p38gamma complex that promotes cellular signaling, preventing cell death induction. It also inhibits tyrosine phosphorylation and subsequent cytoplasm translocation of TRIF-related adaptor molecule (TRAM, also known as TICAM2), resulting in the disturbance of TRAM-TRIF interaction. Moreover, PTPN4 negatively regulates cell proliferation and motility through dephosphorylation of CrkI.	84
340715	cd17195	FERM_F1_PTPN13	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in tyrosine-protein phosphatase non-receptor type 13 (PTPN13). PTPN13, also termed Fas-associated protein-tyrosine phosphatase 1 (FAP-1), or PTP-BAS, or protein-tyrosine phosphatase 1E (PTP-E1 or PTPE1), or protein-tyrosine phosphatase PTPL1, belongs to the non-transmembrane FERM-containing protein-tyrosine phosphatase (PTP) subfamily characterized by a KIND domain, a FERM domain, five PDZ domains, and a C-terminal PTP catalytic domain. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). PTPN13 interacts with a variety of ligands, suggesting an important role as a scaffolding protein. It is also involved in the regulation of apoptosis, cytokinesis and cell cycle progression.	96
340716	cd17196	FERM_F1_FRMPD2	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in FERM and PDZ domain-containing protein 2 (FRMPD2). FRMPD2, also termed PDZ domain-containing protein 4 (PDZK4), or PDZ domain-containing protein 5C (PDZD5C), is a potential scaffold protein involved in basolateral membrane targeting in epithelial cells. It interacts with nucleotide-binding oligomerization domain-2 (NOD2) through leucine-rich repeats and forms a complex with the membrane-associated protein ERBB2IP. FRMPD2 contains an N-terminal KIND domain, a FERM domain and three PDZ domains. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N).	95
340717	cd17197	FERM_F1_FRMD1	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in FERM domain-containing protein 1 (FRMD1). FRMD1 is an uncharacterized FERM domain-containing protein. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N).	98
340718	cd17198	FERM_F1_FRMD6	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in FERM domain-containing protein 6 (FRMD6). FRMD6, also termed willin, expanded or expanded homolog, is a FERM domain-containing protein that plays a critical role in regulating both cell proliferation and apoptosis. It acts as a tumor suppressor of human breast cancer cells independently of the Hippo pathway. It also inhibits human glioblastoma growth and progression by negatively regulating activity of receptor tyrosine kinases. As an upstream component of the hippo signaling pathway, FRMD6 orchestrates mammalian peripheral nerve fibroblasts. FRMD6 contains a FERM domain that is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N).	98
340719	cd17199	FERM_F1_FRMD4A	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in FERM domain-containing protein 4A (FRMD4A). FRMD4A is a cytohesin adaptor involved in cell structure, transport and signaling. It promotes the growth of cancer cells in tongue, head and neck squamous cell carcinomas. It also regulates tau secretion by activating cytohesin-Arf6 signaling through connecting cytohesin family Arf6-specific guanine-nucleotide exchange factors (GEFs) and Par-3 at primordial adherens junctions during epithelial polarization. As a genetic risk factor for late-onset Alzheimer's disease (AD), FRMD4A may play a role in amyloidogenic and tau-related pathways in AD. FRMD4A contains a FERM domain that is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N).	89
340720	cd17200	FERM_F1_FRMD4B	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in FERM domain-containing protein 4B (FRMD4B). FRMD4B, also termed GRP1-binding protein GRSP1, interacts with the coiled-coil domain of ARF exchange factor GRP1 to form the Grsp1-Grp1 complex that co-localizes with cortical actin rich regions in response to stimulation of CHO-T cells with insulin or epidermal growth factor (EGF). FRMD4B contains a FERM protein interaction domain as well as two coiled coil domains and may therefore function as a scaffolding protein. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N).	89
340721	cd17201	FERM_F1_EPB41L1	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in erythrocyte membrane protein band 4.1-like protein 1 (EPB41L1) and similar proteins. EPB41L1, also termed neuronal protein 4.1 (4.1N), belongs to the skeletal protein 4.1 family that is involved in cellular processes such as cell adhesion, migration and signaling. It is a cytoskeleton-associated protein that may serve as a tumor suppressor in solid tumors. It suppresses hypoxia-induced epithelial-mesenchymal transition in epithelial ovarian cancer (EOC) cells. The down-regulation of EPB41L1 expression is a critical step for non-small cell lung cancer (NSCLC) development. Moreover, EPB41L1 functions as a linker protein between inositol 1,4,5-trisphosphate receptor type1 (IP3R1) and actin filaments in neurons. EPB41L1 contains a FERM domain, a spectrin and actin binding (SAB) domain, and a C-terminal domain. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N).	84
340722	cd17202	FERM_F1_EPB41L2	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in erythrocyte membrane protein band 4.1-like protein 2 (EPB41L2) and similar proteins. EPB41L2, also termed generally expressed protein 4.1 (4.1G), belongs to the skeletal protein 4.1 family that is involved in cellular processes such as cell adhesion, migration and signaling. EPB41L2 contains a FERM domain, a spectrin and actin binding (SAB) domain, and a C-terminal domain. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N).	84
340723	cd17203	FERM_F1_EPB41L3	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in erythrocyte membrane protein band 4.1-like protein 3 (EPB41L3) and similar proteins. EPB41L3, also termed 4.1B, or differentially expressed in adenocarcinoma of the lung protein 1 (DAL-1), belongs to the skeletal protein 4.1 family that is involved in cellular processes such as cell adhesion, migration and signaling. EPB41L3 is a tumor suppressor that has been implicated in a variety of meningiomas and carcinomas. EPB41L3 contains a FERM domain, a spectrin and actin binding (SAB) domain, and a C-terminal domain. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N).	84
340724	cd17204	FERM_F1_EPB41L4B	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in erythrocyte band 4.1-like protein 4B (EPB41L4B). EPB41L4B, also termed FERM-containing protein CG1, or expressed in high metastatic cells (Ehm2), or Lulu2, is a member of the band 4.1/Nbl4 (novel band 4.1-like protein 4) group of the FERM protein superfamily. It contains a FERM domain that is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). EPB41L4B is a positive regulator of keratinocyte adhesion and motility, suggesting a role in wound healing. It also promotes cancer metastasis in melanoma, prostate cancer and breast cancer.	84
340725	cd17205	FERM_F1_EPB41L5	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in erythrocyte membrane protein band 4.1-like 5 (EPB41L5). EPB41L5 is a mesenchymal-specific protein that is an integral component of the ARF6-based pathway. It is normally induced during epithelial-mesenchymal transition (EMT) by an EMT-related transcriptional factor, ZEB1, which drives ARF6-based invasion, metastasis and drug resistance. EPB41L5 also binds to paxillin to enhance integrin/paxillin association, and thus promotes focal adhesion dynamics. Moreover, EPB41L5 acts as a substrate for the E3 ubiquitin ligase Mind bomb 1 (Mib1), which is essential for activation of Notch signaling. EPB41L5 is a member of the band 4.1/Nbl4 (novel band 4.1-like protein 4) group of the FERM protein superfamily. It contains a FERM domain that is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N).	86
340726	cd17206	FERM_F1_Myosin-X	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in unconventional myosin-X. Myosin-X, also termed myosin-10 (Myo10), is an untraditional member of myosin superfamily. It is an actin-based motor protein that plays a critical role in diverse cellular motile events, such as filopodia formation/extension, phagocytosis, cell migration, and mitotic spindle maintenance, as well as a number of disease states including cancer metastasis and pathogen infection. Myosin-X functions as an important regulator of cytoskeleton that modulates cell motilities in many different cellular contexts. It regulates neuronal radial migration through interacting with N-cadherin. Like other unconventional myosins, Myosin-X is composed of a conserved motor head, a neck region and a variable tail. The neck region consists of three IQ motifs (light chain-binding sites), and a predicted stalk of coiled coil. The tail contains three PEST regions, three PH domains, a MyTH4 domain, and a FERM domain. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N).	97
340727	cd17207	FERM_F1_PLEKHH3	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in pleckstrin homology domain-containing family H member 3 (PLEKHH3). PLEKHH3 is an uncharacterized Pleckstrin homology (PH) domain-containing protein that shows high sequence similarity with unconventional myosin-X, an actin-based motor protein that plays a critical role in diverse cellular motile events, such as filopodia formation/extension, phagocytosis, cell migration, and mitotic spindle maintenance, as well as a number of disease states including cancer metastasis and pathogen infection. In addition to two PH domains, PLEKHH3 harbors a MyTH4 domain, and a FERM (Band 4.1, ezrin, radixin, moesin) domain. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N).	96
340728	cd17208	FERM_F1_DdMyo7_like	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in Dictyostelium discoideum Myosin-VIIa (DdMyo7) and similar proteins. DdMyo7, also termed Myosin-I heavy chain, or class VII unconventional myosin, or M7, plays a role in adhesion in Dictyostelium where it is a component of a complex of proteins that serve to link membrane receptors to the underlying actin cytoskeleton. It interacts with talinA, an actin-binding protein with a known role in cell-substrate adhesion. DdMyo7 is required for phagocytosis. It is also essential for the extension of filopodia, plasma membrane protrusions filled with parallel bundles of F-actin. Members in this family contain a myosin motor domain, two MyTH4 domains, two FERM (Band 4.1, ezrin, radixin, moesin) domains, and two Pleckstrin homology (PH) domains. Some family members contain an extra SH3 domain. Each FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N).	98
340729	cd17209	RA_RalGDS	Ras-associating (RA) domain found in Ral guanine nucleotide dissociation stimulator (RalGDS) and similar proteins. RalGDS, also termed Ral guanine nucleotide exchange factor (RalGEF), is a guanine exchange factor (GEF) for the Ral family of small GTPases. It is the prototype of RalGDS family proteins that are involved in Ras and Ral signaling pathways as downstream effector proteins. RalGDS stimulates the dissociation of GDP from the Ras-related RalA and RalB GTPases which allows GTP binding and activation of the GTPases. It interacts and acts as an effector molecule for R-Ras, H-Ras, K-Ras, and Rap. Moreover, RalGDS functions as a novel interacting partner for Rab7-interacting lysosomal protein (RILP), a key regulator for late endosomal/lysosomal trafficking. RILP suppresses invasion of breast cancer cells by inhibiting the GEF activity for RalA of RalGDS. RalGDS also plays a vital role in the regulation of Ral-dependent Weibel-Palade bodies (WPB) exocytosis from endothelial cells. In addition, RalGDS couples growth factor signaling to Akt activation by promoting PDK1-induced Akt phosphorylation.  Members in this family have similar domain structure: a central CDC25 homology domain with an upstream Ras Exchange motif (REM), and a C-terminal Ras-associating (RA) domain. The RA domain has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin; ubiquitin is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair.	86
340730	cd17210	RA_RGL	Ras-associating (RA) domain found in Ral guanine nucleotide dissociation stimulator-like 1 (RalGDS-like 1) and similar proteins. RalGDS-like 1 (RGL) is a Ral-specific guanine nucleotide exchange factor that belongs to RalGDS family, whose members are involved in Ras and Ral signaling pathways as downstream effector proteins. RGL has been identified as a possible effector protein of Ras. It also regulates c-fos promoter and the GDP/GTP exchange of Ral. Members in this family have similar structure: a central CDC25 homology domain with an upstream Ras Exchange motif (REM), and a C-terminal Ras-associating (RA) domain. The RA domain has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin; ubiquitin is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair.	87
340731	cd17211	RA_RGL2	Ras-associating (RA) domain found in Ral guanine nucleotide dissociation stimulator-like 2 (RalGDS-like 2) and similar proteins. RalGDS-like 2 (RGL2), also termed RalGDS-like factor (RLF), or Ras-associated protein RAB2L, is a novel Ras and Rap 1A-associating protein that belongs to RalGDS family, whose members are involved in Ras and Ral signaling pathways as downstream effector proteins. RGL2 exhibits guanine nucleotide exchange activity towards the small GTPase Ral. Members in this family have similar domain structure: a central CDC25 homology domain with an upstream Ras Exchange motif (REM), and a C-terminal Ras-associating (RA) domain. The RA domain has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin; ubiquitin is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair. The RA domain of RGL2 is phosphorylated by protein kinase A and the phosphorylation affects the ability of RGL2 to bind both Ras and Rap1.	86
340732	cd17212	RA_RGL3	Ras-associating (RA) domain found in Ral guanine nucleotide dissociation stimulator-like 3 (RalGDS-like 3) and similar proteins. RalGDS-like 3 (RGL3), also termed Ras pathway modulator (RPM), interacts in a GTP- and effector loop-dependent manner with Rit and Ras. As a novel potential effector of both p21 Ras and M-Ras, RGL3 negatively regulates Elk-1-dependent gene induction downstream of p21 Ras or mitogen activated protein/extracellular signal regulated kinase Kinase 1 (MEKK1). It also functions as a potential binding partner for Rap-family small G-proteins and profilin II. RGL3 belongs to RalGDS family, whose members are involved in Ras and Ral signaling pathways as downstream effector proteins. Members in this family have similar domain structure: a central CDC25 homology domain with an upstream Ras Exchange motif (REM), and a C-terminal Ras-associating (RA) domain. The RA domain has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin; ubiquitin is a protein modifier (Ubiquitination) in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair.	87
340733	cd17213	RA_PHLPP	Ras-associating (RA) domain found in PH domain leucine-rich repeat-containing protein phosphatase PHLPP1, PHLPP2, and similar proteins. PHLPP represents a novel family of Ser/Thr protein phosphatases, which is involved in two key signaling pathways, the phosphatidylinositol 3-kinase and diacylglycerol signaling pathways, by directly dephosphorylating and inactivating Akt serine-threonine kinases (Akt1, Akt2, Akt3) and protein kinase C (PKC) isoforms. PHLPP targets oncogenic kinases and may act as a tumor suppressor in several types of cancers. Two PHLPP isoforms are included in this family, PHLPP1 and PHLPP2. They regulate Akt activation together when both phosphatases are expressed. PHLPP1 is also termed pleckstrin homology domain-containing family E member 1, or PH domain-containing family E member 1, or suprachiasmatic nucleus circadian oscillatory protein (SCOP). It plays a suppression role in inflammatory response of glioma. Its loss contributes to gliomas development and progression. Loss of PHLPP1 also protects against colitis by inhibiting intestinal epithelial cell apoptosis. The overexpression of PHLPP1 impairs hippocampus-dependent learning, suggesting a role for PHLPP1 in learning and memory. PHLPP2 is also termed PH domain leucine-rich repeat-containing protein phosphatase-like (PHLPP-like). Both PHLPP1 and PHLPP2 contain a putative Ras-associating (RA) domain followed by a pleckstrin homology (PH) domain, a series of leucine-rich repeats and a protein phosphatase 2C (PP2C) domain.	97
340734	cd17214	RA_CYR1_like	Ras-associating (RA) domain found in Saccharomyces cerevisiae adenylate cyclase and similar proteins. CYR1, also termed ATP pyrophosphate-lyase, or adenylyl cyclase, is a fungal adenylate cyclase that regulates developmental processes such as hyphal growth, biofilm formation, and phenotypic switching. CYR1 plays essential roles in regulation of cellular metabolism by catalyzing the synthesis of a second messenger, cAMP. It acts as a scaffold protein keeping Ras2 available for its regulatory factors, the Ira proteins. CYR1 has at least four domains, including an N-terminal adenylate cyclase G-alpha binding domain, a Ras-associating (RA) domain, a middle leucine-rich repeat region, and a catalytic domain. The RA domain has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin; ubiquitin is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair. The RA domain of CYR1 post-translationally modifies a small GTPase called Ras, which is involved in cellular signal transduction. CYR1 activity is stimulated directly by regulatory proteins (Ras1 and Gpa2), peptidoglycan fragments and carbon dioxide.	99
340735	cd17215	RA_Rin1	Ras-associating (RA) domain found in Ras and Rab interactor 1 (Rin1). Rin1, also termed Ras inhibitor JC99, or Ras interaction/interference protein 1, is a downstream Ras effector that represents a unique class of Ras effector connected to two independent signaling pathways. The first effector pathway is the direct activation of RAB5-mediated endocytosis and the second pathway involves direct activation of ABL tyrosine kinase activity. Rin1 functions as a guanine nucleotide exchange factor (GEF) for RAB5 GTPases. The RAB5 GEF activity of Rin1 promotes early endosome fusion, an early event in transit to the lysosome. Rin1 binds the SH3 and SH2 domains of ABL proteins, ABL1 and ABL2, and activates their tyrosine kinase activity. Rin1 contains SH2 and proline-rich domains in the N-terminal region, and RH, VPS9, and RA domains in the C-terminal region. The RA domain has the beta-grasp ubiquitin-like (Ubl) fold with low sequence similarity to ubiquitin; ubiquitin is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair.	88
340736	cd17216	RA_Myosin-IXa	Ras-associating (RA) domain found in Myosin-IXa. Myosin-IXa, also termed myosin-9a (Myo9a), is a single-headed, actin-dependent motor protein of the unconventional myosin IX class. It is expressed in several tissues and is enriched in the brain and testes. Myosin-IXa contains a Ras-associating (RA) domain, a motor domain, a protein kinase C conserved region 1 (C1), and a Rho GTPase activating domain (RhoGAP). Its RA domain is located at its head domain and has the beta-grasp ubiquitin-like fold with unknown function. Myosin-IXa binds the alpha-amino-3-hydroxy-5-methyl-4-isoxazole propionic acid receptor (AMPAR) GluA2 subunit, and plays a key role in controlling the molecular structure and function of hippocampal synapses. Moreover, Myosin-IXa functions in epithelial cell morphology and differentiation such that its knockout mice develop hydrocephalus and kidney dysfunction. Myosin-IXa regulates collective epithelial cell migration by targeting RhoGAP activity to cell-cell junctions. Myosin-IXa negatively regulates Rho GTPase signaling, and functions as a regulator of kidney tubule function.	96
340737	cd17217	RA_Myosin-IXb	Ras-associating (RA) domain found in Myosin-IXb. Myosin-IXb, also termed myosin-9b (Myo9b), is a motor protein with a Rho GTPase activating domain (RhoGAP); it is an actin-dependent motor protein of the unconventional myosin IX class. It is expressed abundantly in tissues of the immune system, like lymph nodes, thymus, and spleen and in several immune cells including dendritic cells, macrophages and CD4+ T cells. Myosin-IXb contains a Ras-associating (RA) domain, a motor domain, a protein kinase C conserved region 1 (C1), and a RhoGAP domain. Its RA domain is located at its head domain and has the beta-grasp ubiquitin-like fold with unknown function. Myosin-IXb acts as a motorized signaling molecule that links Rho signaling to the dynamic actin cytoskeleton. It regulates leukocyte migration by controlling RhoA signaling. Myosin-IXb is also involved in the development of autoimmune diseases, including rheumatoid arthritis, systemic lupus erythematosus and type 1 diabetes. Moreover, Myosin-IXb is a ROBO-interacting protein that suppresses RhoA activity in lung cancer cells.	96
340738	cd17218	RA_RASSF1	Ras-associating (RA) domain found in Ras-association domain-containing protein 1 (RASSF1). RASSF1 is a member of a family of six related RASSF1-6 proteins (the classical RASSF proteins). RASSF1 has eight transcripts (A-H) arising from alternative splicing and differential promoter usage. With the exception of some minor splice variants (RASSF1F and RASSF1G), RASSF1 contains an RA domain and a C-terminal SARAH protein-protein interaction motif. The RA domain of the classical RASSF proteins has a beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin. The RA domain mediates interactions with Ras and other small GTPases, and the SARAH domain mediates protein-protein interactions crucial in the pathways that induce cell cycle arrest and apoptosis. RASSF1A and 1C are the most extensively studied RASSF1 with both localized to microtubules and involved in regulation of growth and migration.	157
340739	cd17219	RA_RASSF3	Ras-associating (RA) domain found in Ras-association domain-containing protein 3 (RASSF3). RASSF3 is a member of a family of six related classical RASSF1-6 proteins (the classical RASSF proteins). RASSF3 has three transcripts (A-C) due to alternative splicing of the exons. The RASSF3B and 3C isoforms are shorter than RASSF3A, and unlike RASSF3A do not contain the RA or SARAH domains. The RA domain of the classical RASSF proteins has a beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin. The RA domain mediates interactions with Ras and other small GTPases, and the SARAH domain mediates protein-protein interactions crucial in the pathways that induce cell cycle arrest and apoptosis. RASSF3A regulates apoptosis and cell cycle via p53 stabilization and possibly is involved in DNA repair.	141
340740	cd17220	RA_RASSF5	Ras-associating (RA) domain of Ras-association domain family 5 (RASSF5). RASSF5, also called New ras effector 1 (NORE1), or regulator for cell adhesion and polarization enriched in lymphoid tissues (RAPL), is a member of a family of six related RASSF1-6 proteins (the classical RASSF proteins) and is expressed as three transcripts (A-C) via differential promoter usage and alternative splicing. All transcript variants of RASSF5 contain the RA or SARAH domains. The RA domain of the classical RASSF proteins has a beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin. The RA domain mediates interactions with Ras and other small GTPases, and the SARAH domain mediates protein-protein interactions crucial in the pathways that induce cell cycle arrest and apoptosis. RASSF5A is a pro-apoptotic Ras effector and functions as a Ras regulated tumor suppressor. RASSF5C is regulated by Ras related protein and modulates cellular adhesion.	152
340741	cd17221	RA_RASSF2	Ras-associating (RA) domain found in Ras-association domain-containing protein 2 (RASSF2). RASSF2 is a member of a family of six related classical RASSF1-6 proteins. The RASSF2 gene is transcribed into two major isoforms (A and C). RASSF2 is structurally related to RASSF1A but unlike RASSF1A it is primarily a nuclear protein. RASSF2 contains the RA and SARAH domains. The RA domain of the classical RASSF protein family has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin. RA domains mediate interactions with Ras and other small GTPases, and SARAH domains mediate protein-protein interactions crucial in the pathways that induce cell cycle arrest and apoptosis. RASSF2 is inactivated in different cancers and cancer cell lines by promoter methylation and loss of expression, implicating the correlation and significance of RASSF2 in tumorigenesis. In addition to regulating apoptosis and proliferation RASSF2 may have other functions as RASSF2 knockout mice develop normally for the first two weeks but then develop growth retardation and die 4 weeks after birth.	87
340742	cd17222	RA_RASSF4	Ras-associating (RA) domain found in Ras-association domain-containing protein 4 (RASSF4). RASSF4 is a member of a family of six related classical RASSF1-6 proteins and is broadly expressed in normal tissues. RASSF4 expression is reduced in tumor cell lines and primary tumors by promoter specific hypermethylation. RASSF4 contains the RA and SARAH domains. The RA domain of the classical RASSF protein family has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin. RA domains mediate interactions with Ras and other small GTPases, and SARAH domains mediate protein-protein interactions crucial in the pathways that induce cell cycle arrest and apoptosis. RASSF4 inhibits lung cancer cell proliferation and invasion.	87
340743	cd17223	RA_RASSF6	Ras-associating (RA) domain found in Ras-association domain-containing protein 6 (RASSF6). RASSF6 is a member of a family of six related classical RASSF1-6 proteins and is expressed as four transcripts via alternative splicing. All transcript variants of RASSF6 contain the RA and SARAH domains. The RA domain of the classical RASSF protein family has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin. RA domains mediate interactions with Ras and other small GTPases, and SARAH domains mediate protein-protein interactions crucial in the pathways that induce cell cycle arrest and apoptosis. RASSF6 is ubiquitinated and degraded by interacting with MDM2 to stabilize P53 and regulates apoptosis and cell cycle. RASSF6 is a tumor suppressor protein and is epigenetically silenced in childhood leukemia and neuroblastomas. Overexpression of RASSF6 causes apoptosis in HeLa cells.	87
340744	cd17224	RA_ASPP1	Ras-associating (RA) domain found in apoptosis-stimulating protein of p53 protein 1 (ASPP1). ASPP1 is a member of the ASPP protein family (Apoptosis-Stimulating Protein of p53) that activates the p53-mediated apoptotic response. ASPP1 functions as a tumor suppressor and coordinates with p53 to protect hematopoietic stem cell (HSC) pool integrity, guarding against hematological malignancies. ASPP1 contains a RA domain at the N-terminus. The RA domain is a ubiquitin-like domain and RA domain-containing proteins are involved in several different functions ranging from tumor suppression to being oncoproteins.	85
340745	cd17225	RA_ASPP2	Ras-associating (RA) domain found in apoptosis-stimulating protein of p53 protein 2 (ASPP2). ASPP2, also termed Bcl2-binding protein (Bbp), or renal carcinoma antigen NY-REN-51, or tumor suppressor p53-binding protein 2 (53BP2), or p53-binding protein 2 (p53BP2), is a member of ASPP protein family and it functions as a tumor suppressor. ASPP2 binds to p53 and enhances p53-mediated transcription of proapoptotic genes. ASPP2 contains a RA domain at the N-terminus. The RA domain is a ubiquitin-like domain and RA domain-containing proteins are involved in several different functions ranging from tumor suppression to being oncoproteins. All p53 amino acids that are important for ASPP2 binding are mutated in human cancer, and ASPP2 is frequently downregulated in these tumor cells.	80
340746	cd17226	RA_ARAP1	Ras-associating (RA) domain found in Arf-GAP with Rho-GAP domain, ANK repeat and PH domain-containing protein 1 (ARAP1). ARAP1, also termed Centaurin-delta-2 (Cnt-d2), is a phosphatidylinositol 3,4,5-trisphosphate (PtdIns(3,4,5)P(3))-dependent Arf Rap-activated guanosine triphosphatase (GTPase)-activating protein (GAP) that inhibits the trafficking of epidermal growth factor receptor (EGFR) to the early endosome. It associates with the Cbl-interacting protein of 85 kDa (CIN85), regulates endocytic trafficking of the EGFR, and thus affects ubiquitination of EGFR. It also regulates the ring size of circular dorsal ruffles through Arf1 and Arf5. ARAP1 contains multiple functional domains, including ArfGAP and RhoGAP domains, as well as a sterile alpha motif (Sam) domain, five PH domains, and a RA domain. The RA domain has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin (Ub); Ub is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair in eukaryotes.	93
340747	cd17227	RA_ARAP2	Ras-associating (RA) domain found in Arf-GAP with Rho-GAP domain, ANK repeat and PH domain-containing protein 2 (ARAP2). ARAP2, also termed Centaurin-delta-1 (Cnt-d1), or Protein PARX, is a phosphatidylinositol 3,4,5-trisphosphate (PtdIns(3,4,5)P(3))-dependent Arf Rap-activated guanosine triphosphatase (GTPase)-activating protein (GAP), which promotes GLUT1-mediated basal glucose uptake by modifying sphingolipid metabolism through glucosylceramide synthase (GCS). ARAP2 signals through Arf6 and Rac1 to control focal adhesion morphology. ARAP2 contains multiple functional domains, including ArfGAP and RhoGAP domains, as well as a sterile alpha motif (Sam) domain, five PH domains, and a RA domain. The RA domain has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin (Ub); Ub is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair in eukaryotes.	98
340748	cd17228	RA_ARAP3	Ras-associating (RA) domain found in Arf-GAP with Rho-GAP domain, ANK repeat and PH domain-containing protein 3 (ARAP3). ARAP3, also termed Centaurin-delta-3 (Cnt-d3), is a phosphatidylinositol 3,4,5-trisphosphate (PtdIns(3,4,5)P(3))-dependent Arf Rap-activated guanosine triphosphatase (GTPase)-activating protein (GAP) that modulates actin cytoskeleton remodeling by regulating ARF and RHO family members, ADP-ribosylation factor 6 (Arf6) and Ras homolog gene family member A (RhoA). It is regulated by phosphatidylinositol 3,4,5-trisphosphate and a small GTPase Rap1-GTP, and has been implicated in the regulation of cell shape and adhesion. ARAP3 contains multiple functional domains, including ArfGAP and RhoGAP domains, as well as a sterile alpha motif (Sam) domain, five PH domains, and a RA domain. The RA domain has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin (Ub); Ub is a protein modifier in eukaryotes that is involved in various cellular processes including transcriptional regulation, cell cycle control, and DNA repair in eukaryotes.	99
340749	cd17229	RA1_PLC-epsilon	Ras-associating (RA) domain 1 found in Phosphatidylinositide-specific phospholipase C (PLC)-epsilon. PLC is a signaling enzyme that hydrolyzes membrane phospholipids to generate inositol triphosphate. PLC-epsilon represents a novel fourth class of PLC that has a PLC catalytic core domain, a CDC25 guanine nucleotide exchange factor domain and two RA (Ras-association) domains. RA domain has the beta-grasp ubiquitin-like fold with low sequence similarity to ubiquitin. Although PLC RA1 and RA2 have homologous ubiquitin-like folds only RA2 can bind Ras and activate it. RA domain-containing proteins function by interacting with Ras proteins directly or indirectly and are involved in several different functions ranging from tumor suppression to being oncoproteins. Ras proteins are small GTPases that are involved in cellular signal transduction. This family corresponds to the first RA domain of PLC-epsilon.	108
340750	cd17230	TGS_DRG1	TGS (ThrRS, GTPase and SpoT) domain found in developmentally regulated GTP binding protein 1 (DRG-1). DRG-1 is a potassium-dependent GTPase that belongs to the DRG family GTP-binding proteins. It plays an important role in regulating cell growth. It functions as a potential oncogene in lung adenocarcinoma and promotes tumor progression via spindle checkpoint signaling regulation. It also plays an important role in melanoma cell growth and transformation, indicating a novel role in CD4(+) T cell-mediated immunotherapy in melanoma. In addition, DRG-1 is regulated by ZC3H15 (zinc finger CCCH-type containing 15, also known as Lerepo4), and displays a high temperature optimum of activity at 42C, suggesting the ability of being active under possible heat stress conditions. DRG-1 contains a domain of characteristic Obg-type G-motifs that may be the core of GTPase activity, as well as this C-terminal TGS (ThrRS, GTPase and SpoT) domain that has a predominantly beta-grasp ubiquitin-like fold and may be related to RNA binding.	80
340751	cd17231	TGS_DRG2	TGS (ThrRS, GTPase and SpoT) domain found in developmentally regulated GTP binding protein 2 (DRG-2). DRG-2 is a member of the DRG family GTP-binding proteins. It has been implicated in cell growth, differentiation and death. DRG-2 plays a critical role in control of the cell cycle and apoptosis in Jurkat T cells. It regulates G2/M progression via the cyclin B1-Cdk1 complex. Moreover, DRG-2 is an endosomal protein and a key regulator of the small GTPase Rab5 deactivation and transferrin recycling. It enhances experimental autoimmune encephalomyelitis (EAE) by suppressing the development of TH17 cells. It is also associated with survival and cytoskeleton organization of osteoclasts under influence of macrophage colony-stimulating factor, and its overexpression leads to elevated bone resorptive activity of osteoclasts, resulting in bone loss. DRG-2 contains a domain of characteristic Obg-type G-motifs that may be the core of GTPase activity, as well as this C-terminal TGS (ThrRS, GTPase and SpoT) domain that has a predominantly beta-grasp ubiquitin-like fold and may be involved in RNA binding.	79
340752	cd17232	Ubl_ATG8_GABARAP	ubiquitin-like (Ubl) domain found in gamma-aminobutyric acid receptor-associated protein (GABARAP). GABARAP (also termed GABA(A) receptor-associated protein, ATG8A, or MM46) has been implicated in intracellular protein trafficking. It is a cytosolic protein that is localized to transport vesicles, the Golgi network and the endoplasmic reticulum. It interacts with the intracellular domain of the gamma2 subunit of GABA(A) receptors, and thus functions as a trafficking modulator implicated in the intracellular trafficking of GABA(A) receptor. GABARAP also acts as a Ubl modifier belonging to the ATG8 (autophagy-related 8) protein family, which is essential for autophagosome biogenesis and maturation. GABARAP recruits phosphatidylinositol 4-kinase II alpha (PI4KIIalpha) as a specific downstream effector, and regulates phosphatidylinositol 4-phosphate (PI4P)-dependent autophagosome lysosome fusion.	115
340753	cd17233	Ubl_ATG8_GABARAPL1_like	ubiquitin-like (Ubl) domain found in gamma-aminobutyric acid receptor-associated protein-like 1 (GABARAPL1) and similar proteins. GABARAPL1, also termed GEC1, or GABA(A) receptor-associated protein-like, belongs to the small family of GABARAP proteins which includes GABARAP, GABARAPL1, GABARAPL2/GATE-16, and GABARAPL3. GABARAPL1 has been involved in the intracellular transport of receptors via interactions with tubulin and GABA(A) or kappa opioid receptors. It is also a Ubl modifier that functions as a mediator involved in androgen-regulated autophagy process. It is transcriptionally modulated by androgen receptor (AR) and has a repressive role in autophagy. In addition, GABARAPL1 is required for increased membrane expression of epidermal growth factor receptor (EGFR) during hypoxia, suggesting a possible role in the trafficking of these membrane proteins. GABARAPL1 may also play a key role in several important biological processes such as cancer or neurodegenerative diseases. Low expression of GABARAPL1 is associated with poor prognosis of patients with hepatocellular carcinoma. This family also includes GABARAPL3, a paralog of GABARAPL1.	107
340754	cd17234	Ubl_ATG8_MAP1LC3A	ubiquitin-like (Ubl) domain found in microtubule-associated protein 1 light chain 3A (MAP1LC3A). Autophagy is an essential intracellular process that targets large protein complexes, bacterial pathogens, and organelles for degradation. MAP1LC3A belongs to the MAP1LC3 (short name LC3) family proteins. MAP1LC3 has a ubiquitin-like (Ubl) fold and belongs to the autophagy-related 8 (ATG8) protein family. A Ubl conjugation of MAP1LC3 by the phospholipid phosphatidylethanolamine (PE) is an essential process for the formation of autophagosomes. MAP1LC3 is cleaved by cysteine protease ATG4 and then conjugated with PE by E1-like enzyme ATG7 and ATG3, an E2-like enzyme. The Ubl conversion of MAP1LC3 is known as a marker of autophagy-induction. MAP1LC3A staining patterns are used for different cancer diagnostics.	117
340755	cd17235	Ubl_ATG8_MAP1LC3B	ubiquitin-like (Ubl) domain found in microtubule-associated protein 1 light chain 3B (MAP1LC3B). Autophagy is an essential intracellular process that targets large protein complexes, bacterial pathogens, and organelles for degradation. MAP1LC3B belongs to the MAP1LC3 (short name LC3) family proteins. MAP1LC3 has a ubiquitin-like (Ubl) fold and belongs to the autophagy-related 8 (ATG8) protein family. A Ubl conjugation of MAP1LC3 by the phospholipid phosphatidylethanolamine (PE) is an essential process for the formation of autophagosomes. MAP1LC3 is cleaved by cysteine protease ATG4 and then conjugated with PE by E1-like enzyme ATG7 and ATG3, an E2-like enzyme. The Ubl conversion of MAP1LC3 is known as a marker of autophagy-induction. All MAP1LC3 proteins are dispensable for basal autophagy; however, it has been shown that MAP1LC3B is responsible for selective degradation of p62 through autophagy.	115
340756	cd17236	Ubl_ATG8_MAP1LC3C	ubiquitin-like (Ubl) domain found in microtubule-associated protein 1 light chain 3C (MAP1LC3C). Autophagy is an essential intracellular process that targets large protein complexes, bacterial pathogens, and organelles for degradation. MAP1LC3C belongs to the MAP1LC3 (short name LC3) family proteins. MAP1LC3 has a ubiquitin-like (Ubl) fold and belongs to the autophagy-related 8 (ATG8) protein family. A Ubl conjugation of MAP1LC3 by the phospholipid phosphatidylethanolamine (PE) is an essential process for the formation of autophagosomes. MAP1LC3 is cleaved by cysteine protease ATG4 and then conjugated with PE by E1-like enzyme ATG7 and ATG3, an E2-like enzyme. The Ubl conversion of MAP1LC3 is known as a marker of autophagy-induction. ATG8 proteins are ubiquitously expressed, although some subfamily members are expressed at increased levels in certain tissues, e.g. MAP1LC3C is transcribed at lower levels than other members of MAP1LC3 subfamily and expressed predominantly in the lung.	113
340757	cd17237	FERM_F1_Moesin	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in moesin and similar proteins. Moesin, also termed membrane-organizing extension spike protein, is a member of the ezrin/radixin/moesin (ERM) family of cytoskeletal proteins that plays an essential role in microvilli formation, T-cell activation, and tumor metastasis through providing a regulated linkage between F-actin and membrane-associated proteins. These proteins may also function in signaling cascades that regulate the assembly of actin stress fibers. The ERM proteins consist of an N-terminal FERM domain, a coiled-coil (CC) domain and a C-terminal tail segment (C-tail) containing a well-defined actin-binding motif. The C-terminal domain can fold back to bind to the FERM domain forming an autoinhibited conformation. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N). Moesin is involved in mitotic spindle function through stabilizing cell shape and microtubules at the cell cortex. It is required for the formation of F-actin networks that mediate endosome biogenesis or maturation and transport through the degradative pathway.	84
340758	cd17238	FERM_F1_Radixin	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in radixin and similar proteins. Radixin is a member of the ezrin/radixin/moesin (ERM) family of cytoskeletal proteins that plays an essential role in microvilli formation, T-cell activation, and tumor metastasis through providing a regulated linkage between F-actin and membrane-associated proteins. These proteins may also function in signaling cascades that regulate the assembly of actin stress fibers. The ERM proteins consist of an N-terminal FERM domain, a coiled-coil (CC) domain and a C-terminal tail segment (C-tail) containing a well-defined actin-binding motif. The C-terminal domain can fold back to bind to the FERM domain forming an autoinhibited conformation. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to the F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N).  Radixin plays important roles in cell polarity, cell motility, invasion and tumor progression. It mediates the binding of F-actin to the plasma membrane after a conformational activation through Akt2-dependent phosphorylation at Thr564. It is also involved in reversal learning and short-term memory by regulating synaptic GABAA receptor density.	83
340759	cd17239	FERM_F1_Ezrin	FERM (Four.1 protein, Ezrin, Radixin, Moesin) domain, F1 sub-domain, found in Ezrin and similar proteins. Ezrin, also termed cytovillin, or villin-2, or p81, is a member of the ezrin/radixin/moesin (ERM) family of cytoskeletal proteins that plays an essential role in microvilli formation, T-cell activation, and tumor metastasis through providing a regulated linkage between F-actin and membrane-associated proteins. These proteins may also function in signaling cascades that regulate the assembly of actin stress fibers. The ERM proteins consist of an N-terminal FERM domain, a coiled-coil (CC) domain and a C-terminal tail segment (C-tail) containing a well-defined actin-binding motif. The C-terminal domain can fold back to bind to the FERM domain forming an autoinhibited conformation. The FERM domain is made up of three sub-domains, F1, F2, and F3. This family corresponds to F1 sub-domain, which is also called the N-terminal ubiquitin-like structural domain of the FERM domain (FERM_N).  Ezrin is a tyrosine kinase substrate that functions as a cross-linker between actin cytoskeleton and plasma membrane. It has been implicated in the regulation of the proliferation, apoptosis, adhesion, invasion, metastasis and angiogenesis of cancer cells.	85
340760	cd17240	RA_PHLPP1	Ras-associating (RA) domain found in PH domain leucine-rich repeat-containing protein phosphatase 1 (PHLPP1). PHLPP1, also termed pleckstrin homology domain-containing family E member 1, or PH domain-containing family E member 1, or suprachiasmatic nucleus circadian oscillatory protein (SCOP), is involved in two key signaling pathways, the phosphatidylinositol 3-kinase and diacylglycerol signaling pathways, by directly dephosphorylating and inactivating Akt serine-threonine kinases (Akt1, Akt2, Akt3) and protein kinase C (PKC) isoforms. PHLPP1 also plays critical roles in many cancers, such as gastric cancer, sacral chordoma, gallbladder cancer, hypopharyngeal squamous cell carcinomas, and non-small cell lung cancer. It plays a suppressive role in the inflammatory response of glioma, and its loss contributes to glioma development and progression. Loss of PHLPP1 also protects against colitis by inhibiting intestinal epithelial cell apoptosis. The overexpression of PHLPP1 impairs hippocampus-dependent learning, suggesting a role in learning and memory. PHLPP1 contains a Ras-associating (RA) domain followed by a pleckstrin homology (PH) domain, a series of leucine-rich repeats and a protein phosphatase 2C (PP2C) domain.	90
340761	cd17241	RA_PHLPP2	Ras-associating (RA) domain found in PH domain leucine-rich repeat-containing protein phosphatase 2 (PHLPP2). PHLPP2, also termed PH domain leucine-rich repeat-containing protein phosphatase-like (PHLPP-like), is involved in two key signaling pathways, the phosphatidylinositol 3-kinase and diacylglycerol signaling pathways, by directly dephosphorylating and inactivating Akt serine-threonine kinases (Akt1, Akt2, Akt3) and protein kinase C (PKC) isoforms. PHLPP2 also plays critical roles in many cancers, such as glioma, hypopharyngeal squamous cell carcinomas, and non-small cell lung cancer. PHLPP2 contains a Ras-associating (RA) domain followed by a PH domain, leucine-rich repeats and protein phosphatase 2C (PP2C) domain.	108
410988	cd17242	MobM_relaxase	relaxase domain of MobM and similar proteins. With some plasmids, recombination can occur in a site-specific manner that is independent of RecA. In such cases, the recombination event requires another protein called Pre (plasmid recombination enzyme), also known as Mob (conjugative mobilization). The best characterized member of this family is encoded by the streptococcal plasmid pMV158 that recognizes the plasmid origin of transfer. MobM converts supercoiled plasmid DNA into relaxed DNA by cleaving a phosphodiester bond of a specific dinucleotide and remains bound to the 5'-end of the nick site.	196
341132	cd17243	RMtype1_S_AchA6I-TRD2-CR2_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Arthrobacter chlorophenolicus A6 S subunit (S.AchA6I) TRD2-CR2. The S.AchA6I S subunit recognizes 5'... TGAANNNNNTCG ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. For example, S.AchA6I-TRD1 recognizes TGAA/TTCA, and TRD2 recognizes CGA/TCG. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	182
341133	cd17244	RMtype1_S_Apa101655I-TRD2-CR2_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Acetobacter pasteurianus S subunit (S.Apa101655I) TRD2-CR2. The S. Apa101655I S subunit recognizes 5'... TTAGNNNNNNTTC... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	180
341134	cd17245	RMtype1_S_TteMORF1547P-TRD2-CR2_Aco12261I-TRD1-CR1_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Thermoanaerobacter tengcongensis S subunit (S.TteMORF1547P) TRD2-CR2 and Aminobacterium colombiense DSM 12261 S subunit (S.Aco12261I) TRD1-CR1. The S.Aco12261I S subunit recognizes 5'... GCANNNNNNTGT ... 3', while the recognition sequence is undetermined for S.TteMORF1547P TRD2-CR2. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. S.TteMORF1547P TRD1-CR1 and S.Aco12261I TRD2-CR2 do not belong to this family. This family may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	174
341135	cd17246	RMtype1_S_SonII-TRD2-CR2_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Shewanella oneidensis MR-1 S subunit (S.SonII) TRD2-CR2. This model contains Shewanella oneidensis MR-1 S subunit (S.SonII) TRD2-CR2 and similar TRD-CR's. S.SonII recognizes 5'...  GTCANNNNNNRTCA  ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. S.SonII TRD1-CR1 does not belong to this subfamily.	189
341136	cd17247	RMtype1_S_Eco2747I-TRD2-CR2_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Escherichia coli ST2747 S subunit (S.Eco2747I) TRD2-CR2. The S. Eco2747I S subunit recognizes 5'... CACNNNNNNNGTTG ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This CD contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	190
341137	cd17248	RMtype1_S_AmiI-TRD2-CR2_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Actinosynnema mirum DSM 43827 S subunit (S. AmiI) TRD2-CR2. The S. AmiI S subunit recognizes 5'... CAGNNNNNNNTCGA ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. For example, S. AmiI -TRD1 recognizes CAG/CTG, and TRD2 recognizes TCGA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	196
341138	cd17249	RMtype1_S_EcoR124I-TRD2-CR2_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to S.EcoR124I TRD2-CR2, S.Eco540I TRD2-CR2, S.Eco540AI TRD2-CR2, and S.Eco540ANI TRD2-CR2. Escherichia coli (R124) S subunit (S.EcoR124I), E. coli ST540 S subunit (S.Eco540I), E. coli ST540A S subunit (S.Eco540AI), and Escherichia coli ST540AN S subunit (S.Eco540ANI) recognize the sequence 5'... GAANNNNNNRTCG ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. For example, S.EcoR124I -TRD1 recognizes GAA/TTC, and -TRD2 recognizes CGAY/RTCG. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	185
341139	cd17250	RMtype1_S_Eco4255II_TRD2-CR2_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Escherichia coli O118:H16 07-4255 S subunit (S.Eco4255II) TRD2-CR2 and Shewanella oneidensis MR-1 S subunit (S.SonIV) TRD1-CR1. Escherichia coli O118:H16 07-4255 S subunit (S.Eco4255II) recognizes 5'... TACNNNNNNNRTRTC ... 3' while Shewanella oneidensis MR-1 S subunit (S.SonIV) recognizes 5'... TACNNNNNNGTNGT ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. For example, S.SonIV-TRD1 recognizes TAC/GTA and S.SonIV-TRD2 recognizes ACNAC/GTNGT. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	185
341140	cd17251	RMtype1_S_HinAWORF1578P-TRD2-CR2_like	Type I restriction-modification system specificity (S) subunit TRD-CR, similar to S.HinAWORF1578P TRD2-CR2. Haemophilus influenzae RdAW S subunit (S.HinAWORF1578P) recognizes 5'... CTANNNNNGTTY ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains mostly TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	185
341141	cd17252	RMtype1_S_EcoKI-TRD1-CR1_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to S.EcoKI TRD1-CR1, S.StySPI TRD1-CR1, S.Ara36733II TRD1-CR1, and S.Eco3722I TRD1-CR1. Escherichia coli str. K-12 substr. MG1655 S subunit (S.EcoKI) and Escherichia coli NCM3722 S subunit (S.Eco3722I) recognize 5'... AACNNNNNNGTGC ... 3', Salmonella enterica subsp. enterica serovar Potsdam S subunit (S.StySPI) recognizes 5'... AACNNNNNNGTRC ... 3', and Actinomyces radicidentis S subunit (S.Ara36733II) recognizes 5'... CATCNNNNNNCTC ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. For example, S.EcoKI-TRD1 and S.StySPI-TRD1 recognize AAC/GTT, S.EcoKI-TRD2 recognizes GCAC/GTGC, and S.StySPI-TRD2 recognizes GYAC/GTRC. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It also includes TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases such as Treponema pedis T A4 putative Type IIG restriction enzyme/N6-adenine DNA methyltransferase RM.TpeTA4ORF2695P. It may also include type I DNA methyltransferases.	189
341142	cd17253	RMtype1_S_Eco933I-TRD2-CR2_like	Type I restriction-modification system specificity (S) subunit TRD-CR, similar to Escherichia coli O157:H7 EDL933 S subunit (S.Eco933I), Escherichia coli O104:H4 2009EL-2071 S subunit (S.Eco2071ORF3585P) TRD2-CR2, and Streptomyces species SirexAA-E S subunit (S.SspAAEORF2129P) TRD1-CR1 and TRD2-CR2. Escherichia coli O157:H7 EDL933 S subunit (S.Eco933I) recognizes 5'... CACNNNNNNNCTGG ... 3' and Escherichia coli O104:H4 2009EL-2071 S subunit (S.Eco2071ORF3585P) recognizes 5'... RTCANNNNNNNNGTGG ... 3'. The recognition sequence of Streptomyces species SirexAA-E S subunit (S.SspAAEORF2129P) is undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. For example,  S.Eco2071ORF3585P TRD1 recognizes RTCA/TGAY and S.Eco2071ORF3585P TRD2 recognizes CCAC/GTGG. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	193
341143	cd17254	RMtype1_S_FclI-TRD1-CR1_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to S.FclI TRD1-CR1. The recognition sequence of Flavobacterium columnare G4 S subunit (S.FclI) is undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It also contains TRD-CR-like sequence-recognition domains of type I DNA methyltransferases, such as putative Type I N6-adenine DNA methyltransferases from Microbacterium ketosireducens (M.Msp12510ORF408P) and Treponema primitia ZAS-2 (M.TprZAS2ORF3630P). It may also include various type II restriction enzymes and methyltransferases.	173
341144	cd17255	RMtype1_S_Fco49512ORF2615P-TRD2-CR2_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Flavobacterium columnare S subunit (S.Fco49512ORF2615P) TRD2-CR2. The recognition sequence of Flavobacterium columnare S subunit (S.Fco49512ORF2615P) is undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	166
341145	cd17256	RMtype1_S_EcoJA65PI-TRD1-CR1_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to S.EcoJA65PI TRD1-CR1, S.Fco49512ORF2615P TRD1-CR1, and S.SonIV TRD2-CR2. Escherichia coli UCD_JA65_pb S subunit (S.EcoJA65PI) recognizes 5'... AGCANNNNNNTGA ... 3' while Shewanella oneidensis MR-1 S subunit (S.SonIV) recognizes 5'... TACNNNNNNGTNGT ... 3'. The recognition sequence of Flavobacterium columnare S subunit (S.Fco49512ORF2615P) is undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. For example, S.EcoJA65PI TRD1 recognizes AGCA/TGCT and S.EcoJA65PI TRD2 recognizes TCA/TGA; S.SonIV TRD1 recognizes TAC/GTA and S.SonIV TRD2 recognizes ACNAC/GTNGT. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	182
341146	cd17257	RMtype1_S_EcoBI-TRD1-CR1_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to S.EcoBI TRD1-CR1, S.EcoSanI TRD1-CR1, and S.EcoVR50I TRD1-CR1. Escherichia coli B S subunit (S.EcoBI) and Escherichia coli VR50 S subunit (S.EcoVR50I) recognize 5'... TGANNNNNNNNTGCT ... 3', while Escherichia coli Sanji S subunit (S.EcoSanI) recognizes 5'... TGANNNNNNCTTC ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	176
341147	cd17258	RMtype1_S_Sau13435ORF2165P-TRD1-CR1_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to S.Sau13435ORF2165P TRD1-CR1 and S.SauL3067ORFAP TRD1-CR1. Staphylococcus aureus NCTC 13435 S subunit (S.Sau13435ORF2165P) recognizes 5'... TCTANNNNNNRTTC ... 3'; the recognition sequence of Staphylococcus aureus 3067 S.Sau3067ORFAP S subunit is as yet undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. For example, S.Sau13435ORF2165P TRD1 recognizes TCTA/TAGA, and S.Sau13435ORF2165P TRD2 recognizes GAAY/RTTC. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains mostly TRD1-CR1. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	173
341148	cd17259	RMtype1_S_StySKI-TRD2-CR2_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to TRD2-CR2's of StySKI, S.EcoAI, S.EcoJA17PI, and S.EcoJA23PI. Salmonella kaduna CDC-388 S subunit (StySKI) recognizes 5'... CGATNNNNNNNGTTA ... 3' while Escherichia coli Type-1 restriction enzyme EcoAI specificity protein (S.EcoAI), Escherichia coli UCD_JA17_pb S subunit (S.EcoJA17PI) and Escherichia coli UCD_JA23_pb S subunit (S.EcoJA23PI) recognize 5'... GAGNNNNNNNGTCA ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	189
341149	cd17260	RMtype1_S_EcoEI-TRD1-CR1_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to S.EcoEI TRD1-CR1, S.EcoJA17PI TRD1-CR1, S.EcoJA23PI TRD1-CR1, and S.StyLTIII TRD1-CR1. Escherichia coli A58 S subunit (S.EcoEI) recognizes 5'... GAGNNNNNNNATGC ... 3', Escherichia coli UCD_JA17_pb S subunit (S.EcoJA17PI) and Escherichia coli UCD_JA23_pb S subunit (S.EcoJA23PI) recognize 5'... GAGNNNNNNNGTCA ... 3', and Salmonella typhimurium LT7 S subunit (S.StyLTIII) recognizes 5'... GAGNNNNNNRTAYG ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. For example: S.EcoEI TRD1 and S.StyLTIII TRD1 recognize GAG/CTC, S.EcoEI TRD2 recognizes GCAT/ATGC, and S.StyLTIII TRD2 recognizes CRTAY/RTAYG. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It also includes TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases, such as Pseudomonas putida Jo 4-731 Type IIG restriction enzyme/N6-adenine DNA methyltransferase RM.PpiI and Porphyromonas macacae COT-192 OH2631 RM.Pma2631ORF8845P, as well as type I DNA methyltransferases such as Chlorobium limicola M.Cli245ORF128P. RM.PpiI recognizes the sequence 5' ... GAACNNNNNCTC ... 3'.	165
341150	cd17261	RMtype1_S_EcoKI-TRD2-CR2_like	Type I restriction-modification system specificity (S) subunit TRD-CR, similar to Escherichia coli str. K-12 substr. MG1655 S subunit (S.EcoKI) TRD2-CR2, Escherichia coli A58 S subunit (S.EcoEI) TRD2-CR2, and Aminomonas paucivorans S subunit (S.Apa12260I) TRD2-CR2. Escherichia coli str. K-12 substr. MG1655 S subunit (S.EcoKI) recognizes 5'...  AACNNNNNNGTGC  ... 3', Escherichia coli A58 S subunit (S.EcoEI) recognizes 5'... GAGNNNNNNNATGC ... 3', and Aminomonas paucivorans S subunit (S.Apa12260I) recognizes 5'... GCCNNNNNCTCC ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. For example, S.EcoKI-TRD1 recognizes AAC/GTT and S.EcoKI-TRD2 recognizes GCAC/GTGC, S.EcoEI TRD1 recognizes GAG/CTC and S.EcoEI TRD2 recognizes GCAT/ATGC, and S.Apa12260I TRD1 recognizes GCC/GGC and S.Apa12260I TRD2 recognizes GGAG/CTCC. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	191
341151	cd17262	RMtype1_S_Aco12261I-TRD2-CR2	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Aminobacterium colombiense DSM 12261 S subunit (S.Aco12261I) TRD2-CR2 and Moraxella catarrhalis S subunit (S.Mca353ORF290P) TRD2-CR2. Aminobacterium colombiense DSM 12261 S subunit (S.Aco12261I) recognizes 5'... GCANNNNNNTGT ... 3', and Moraxella catarrhalis S subunit (S.Mca353ORF290P) recognizes 5'... CAAGNNNNNNTGT ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	172
341152	cd17263	RMtype1_S_AbaB8300I-TRD1-CR1_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Acinetobacter baumannii B8300 S subunit (S.AbaB8300I) TRD1-CR1. Acinetobacter baumannii B8300 S subunit (S.AbaB8300I) recognizes 5'... GAYNNNNNNNTCYC ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	177
341153	cd17264	RMtype1_S_Eco3763I-TRD2-CR2_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Escherichia coli O69:H11 07-3763 S subunit (S.Eco3763I) TRD2-CR2. Escherichia coli O69:H11 07-3763 S subunit (S.Eco3763I) recognizes 5'... TACNNNNNNNRTRTC ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	184
341154	cd17265	RMtype1_S_Eco4255III-TRD2-CR2_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Escherichia coli O118:H16 07-4255 S subunit (S.Eco4255III) TRD2-CR2 and Escherichia coli ECONIH1 S subunit (S.EcoNIH1II) TRD2-CR2. Escherichia coli O118:H16 07-4255 S subunit (S.Eco4255III) recognizes 5'... GAGNNNNNGTTY ... 3', and Escherichia coli ECONIH1 S subunit (S.EcoNIH1II) recognizes 5'... YTCANNNNNNGTTY ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. For example, S.Eco4255III-TRD1 recognizes GAG/CTC and S.EcoNIH1II-TRD1 recognizes YTCA/TGAR, while both S.EcoNIH1II-TRD2 and S.Eco4255III-TRD2 recognize RAAC/GTTY. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	181
341155	cd17266	RMtype1_S_Sau1132ORF3780P-TRD2-CR2_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Staphylococcus aureus subsp. aureus MSHR1132 S subunit (S.Sau1132ORF3780P) TRD2-CR2. Staphylococcus aureus subsp. aureus MSHR1132 S subunit (S.Sau1132ORF3780P) recognizes 5'... CAAGNNNNNRTC ... 3'. The RM system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. For example S.Sau1132ORF3780P-TRD1 recognizes CAAG/CTTG and S.Sau1132ORF3780P-TRD2 recognizes GAY/RTC. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	159
341156	cd17267	RMtype1_S_EcoAO83I-TRD1-CR1_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to S.EcoAO83I TRD1-CR1 and S.AbaB8342I TRD2-CR2. Escherichia coli strain A0 34/86 S subunit (S.EcoAO83I) recognizes 5'... GGANNNNNNNNATGC ... 3', and Acinetobacter baumannii B8342 S subunit (S.AbaB8342I) recognizes 5'... TTCANNNNNNTCC ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. For example S.AbaB8342I-TRD1 recognizes TTCA/TGAA and S.AbaB8342I-TRD2 recognizes GGA/TCC. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It also includes TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases such as Type IIG restriction enzyme/N6-adenine DNA methyltransferases from Thermus scotoductus RFL1 (RM.TstI) and Acinetobacter lwoffi Ks 4-8 (RM.AloI), as well as type I DNA methyltransferases such as Sideroxydans lithotrophicus ES-1 Type I N6-adenine DNA methyltransferase (M.SliESORF1090P). RM.TstI recognizes 5' ... CACNNNNNNTCC ... 3' and RM.AloI recognizes 5' ... GAACNNNNNNTCC ... 3'.	158
341157	cd17268	RMtype1_S_Ara36733I_TRD1-CR1_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to S.Ara36733I TRD1-CR1 and S.Ara36733I TRD2-CR2. Actinomyces radicidentis S subunit (S.Ara36733I) recognizes 5'... CGAGNNNNNCTG ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	185
341158	cd17269	RMtype1_S_PluTORF4319P-TRD2-CR2_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Photorhabdus luminescens S subunit (S.PluTORF4319P) TRD2-CR2. The recognition sequence of Photorhabdus luminescens S subunit (S.PluTORF4319P) is undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	168
341159	cd17270	RMtype1_S_Sba223ORF3470P-TRD1-CR1_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to S.Sba223ORF3470P TRD1-CR1. The recognition sequence of Shewanella baltica OS223 S subunit (S.Sba223ORF3470P) is undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	183
341160	cd17271	RMtype1_S_NmaSCMORF606P_TRD2-CR2_like	Type I restriction-modification system specificity (S) subunit TRD-CR, similar to Nitrosopumilus maritimus SCM1 S subunit (S2.NmaSCMORF606P) TRD2-CR2, Corynebacterium jeikeium K411 S subunit (S.CjeKORF1254P) TRD2-CR2 and Porphyromonas canoris COT-108 OH2762 S subunit (S2.Pca2762ORF8685P) TRD1-CR1. The recognition sequences of Nitrosopumilus maritimus SCM1 S subunit (S2.NmaSCMORF606P), Corynebacterium jeikeium K411 S subunit (S.CjeKORF1254P), and Porphyromonas canoris COT-108 OH2762 S subunit (S2.Pca2762ORF8685P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	186
341161	cd17272	RMtype1_S_Eco2747II-TRD2-CR2-like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to S.Eco2747II TRD2-CR2 and S.Eco2747AII TRD2-CR2. Escherichia coli ST2747 S subunit (S.Eco2747II) and Escherichia coli ST2747A S subunit (S.Eco2747AII) recognize 5'... GAANNNNNNNTAAA ... 3'. Generally, the restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains mainly TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	189
341162	cd17273	RMtype1_S_EcoJA69PI-TRD1-CR1_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to S.EcoJA69PI TRD1-CR1, MjaXIP/S.MjaORF132P TRD2-CR2, and S.HspDL1ORF16625P TRD2-CR2. Escherichia coli UCD_JA69_pb S subunit (S.EcoJA69PI) recognizes 5'... CCANNNNNNNCTTC ... 3'. The recognition sequences of Methanococcus jannaschii MjaXIP/S.MjaORF132P TRD2-CR2 and Halobacterium species DL1 S subunit (S.HspDL1ORF16625P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It also includes TRD-CR-like sequence-recognition domains of various putative type II restriction enzymes and methyltransferases and may also include type I DNA methyltransferases.	186
341163	cd17274	RMtype1_S_Eco540ANI-TRD1-CR1_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to S.Eco540ANI TRD1-CR1, S.Eco2747AII TRD1-CR1, S.Eco540AI TRD1-CR1, S.Eco2747II TRD1-CR1, and S.Eco540I TRD1-CR1. Escherichia coli ST540AN S subunit (S.Eco540ANI), Escherichia coli ST540A S subunit (S.Eco540AI), and Escherichia coli ST540 S subunit (S.Eco540I) recognize 5'... GAANNNNNNRTCG ... 3'. Escherichia coli ST2747A S subunit (S.Eco2747AII) and Escherichia coli ST2747 S subunit (S.Eco2747II) recognize 5'... GAANNNNNNNTAAA ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	171
341164	cd17275	RMtype1_S_MjaORF132P-TRD1-CR1_like	Type I restriction-modification system specificity (S) subunit TRD-CR, similar to MjaXIP/S.MjaORF132P TRD1-CR1. The recognition sequence of Methanococcus jannaschii S subunit (MjaXIP/S.MjaORF132P) is undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	186
341165	cd17276	RMtype1_S_Sau1132ORF3780P-TRD1-CR1_like	Type I restriction-modification system specificity (S) subunit TRD-CR, similar to S.Sau1132ORF3780P TRD1-CR1, and S.Mca353ORF290P TRD1-CR1. The Staphylococcus aureus subsp. aureus MSHR1132 S subunit (S.Sau1132ORF3780P) recognizes 5'... CAAGNNNNNRTC ... 3', and Moraxella catarrhalis S subunit (S.Mca353ORF290P) recognizes 5'... CAAGNNNNNNTGT ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. For example, S.Sau1132ORF3780P-TRD1 recognizes CAAG/CTTG, and S.Sau1132ORF3780P-TRD2 recognizes GAY/RTC. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	187
341166	cd17277	RMtype1_M_Cni19672ORF1405P_RMtype11G_Hci611ORFHP_TRD1-CR1_like	Restriction modification N6-adenine DNA methyltransferase TRD-CR, similar to RMtype1 Calditerrivibrio nitroreducens M.Cni19672ORF1405P TRD1-CR1 and RMtype11G Helicobacter cinaedi PAGU611 RM.Hci611ORFHP TRD1-CR1. The recognition sequence of Calditerrivibrio nitroreducens M.Cni19672ORF1405P is undetermined, and the predicted recognition sequence of RM.Hci611ORFHP is 5'... GAGNNNNNGT ... 3'. M.Cni19672ORF1405P is a putative type I N6-adenine DNA methyltransferase. RM.Hci611ORFHP is a type II subtype gamma (also called type IIG and type IIC) N6-adenine DNA methyltransferase. Both are DNA methyltransferase-specificity subunit fusion proteins; they each have a domain corresponding to a HsdM methylation (M) subunit followed by a C-terminal, TRD-CR-like domain for sequence-recognition, which corresponds to the HsdS specificity (S) subunit. The latter consists of two variable TRDs, and two CRs which separate the TRDs; the TRDs each bind to different specific sequences in the DNA. RM.Hci611ORFHP has an additional N-terminal HSDR_N domain. Restriction-modification (RM) systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit, two modification (M) subunits, and two restriction (R) subunits.	184
341167	cd17278	RMtype1_S_LdeBORF1052P-TRD2-CR2	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Lactobacillus delbrueckii subsp. bulgaricus S subunit (S2.LdeBORF1052P) TRD2-CR2. The recognition sequence of Lactobacillus delbrueckii subsp. bulgaricus S subunit (S2.LdeBORF1052P) is undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	189
341168	cd17279	RMtype1_S_BmuCF2ORF3362P_TRD1-CR1_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Burkholderia multivorans CF2 S subunit (S.BmuCF2ORF3362P) TRD1-CR1 and Halomonas campaniensis LS21 S subunit (S.HcaLS21ORF9970P) TRD1-CR1. The recognition sequences of Burkholderia multivorans CF2 S subunit (S.BmuCF2ORF3362P) and Halomonas campaniensis LS21 S subunit (S.HcaLS21ORF9970P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	184
341169	cd17280	RMtype1_S_MspEN3ORF6650P_TRD2-CR2_like	Type I restriction-modification system specificity (S) subunit TRD-CR, similar to Marinobacter species EN3 S subunit (S.MspEN3ORF6650P) TRD1-CR1, Methanothermobacter marburgensis str. Marburg S subunit (S.Mma2133ORF14720P) TRD2-CR2 and Nostoc species NIES-3756 S subunit (S.Nsp3756ORF27100P) TRD1-CR1. The recognition sequences of Marinobacter species EN3 S subunit (S.MspEN3ORF6650P), Methanothermobacter marburgensis str. Marburg S subunit (S.Mma2133ORF14720P), and Nostoc species NIES-3756 S subunit (S.Nsp3756ORF27100P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	187
341170	cd17281	RMtype1_S_HpyAXIII_TRD1-CR1_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Helicobacter pylori 26695 S subunit (S.HpyAXIII/Prototype S.Hpy26695ORF4050P) TRD1-CR1, Neisseria meningitidis 510612 S subunit (S.Nme510612ORF1157P) TRD1-CR1 and Streptococcus mitis SVGS_061 S subunit (S2.Smi61ORF7905P) TRD1-CR1. Helicobacter pylori 26695 S subunit (S.HpyAXIII/Prototype S.Hpy26695ORF4050P) recognizes 5'... CTANNNNNNNNTGT ... 3', and the recognition sequences of Neisseria meningitidis 510612 S subunit (S.Nme510612ORF1157P) and Streptococcus mitis SVGS_061 S subunit (S2.Smi61ORF7905P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. For example, Helicobacter pylori 26695  S subunit (S.HpyAXIII/Prototype S.Hpy26695ORF4050P) TRD1 recognizes CTA/TAG, and TRD2 recognizes ACA/TGT. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	196
341171	cd17282	RMtype1_S_Eco16444ORF1681_TRD1-CR1_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Escherichia coli G4/9 S subunit (S.Eco16444ORF1681P) TRD1-CR1 and Zobellia galactanivorans DsiJT S subunit (S.ZgaJTORF2697P) TRD2-CR2. The recognition sequences of Escherichia coli G4/9 S subunit (S.Eco16444ORF1681P) and Zobellia galactanivorans DsiJT S subunit (S.ZgaJTORF2697P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It also includes TRD-CR-like sequence-recognition domains of various putative type II restriction enzymes and methyltransferases and may also include type I DNA methyltransferases.	186
341172	cd17283	RMtype1_S_Hpy180ORF7835P_TRD2-CR2_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Helicobacter pylori SJM180 S subunit (S.Hpy180ORF7835P) TRD2-CR2 and Haemophilus influenzae PittGG S subunit (S.HinGGORF3080P) TRD2-CR2. The recognition sequences of Helicobacter pylori SJM180 S subunit (S.Hpy180ORF7835P) and Haemophilus influenzae PittGG S subunit (S.HinGGORF3080P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	181
341173	cd17284	RMtype1_S_Cbo7060ORF11580P_TRD2-CR2_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Clostridium botulinum CFSAN024410 S subunit (S.Cbo7060ORF11580P) TRD2-CR2 and Shewanella xiamenensis BC01 S subunit (S.SxiBC01ORF77P) TRD1-CR1. The recognition sequences of Clostridium botulinum CFSAN024410 S subunit (S.Cbo7060ORF11580P) and Shewanella xiamenensis BC01 S subunit (S.SxiBC01ORF77P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	185
341174	cd17285	RMtype1_S_Csp16704I_TRD2-CR2_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Campylobacter species RM16704 S subunit (S.Csp16704I) TRD2-CR2, Aeromonas media WS S subunit (S.AmeWSORF2351P) TRD1-CR1, and Clostridium carboxidivorans P7 S subunit (S.CcaPORF573P) TRD2-CR2. Campylobacter species RM16704 S subunit (S.Csp16704I) recognizes 5'... ACANNNNNNNNTCG ... 3', and the recognition sequences of Aeromonas media WS S subunit (S.AmeWSORF2351P) and Clostridium carboxidivorans P7 S subunit (S.CcaPORF573P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	181
341175	cd17286	RMtype1_S_Lla161ORF747P_TRD1-CR1_like	Type I restriction-modification system specificity (S) subunit TRD-CR, similar to Lactococcus lactis subsp. lactis Dephy 1 S subunit (S.Lla161ORF747P) TRD1-CR1, and Lactococcus lactis IO-1 S subunit (S2.LlaIO1ORF1141P) TRD2-CR2. The recognition sequences of Lactococcus lactis subsp. lactis Dephy 1 S subunit (S.Lla161ORF747P) and Lactococcus lactis IO-1 S subunit (S2.LlaIO1ORF1141P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	179
341176	cd17287	RMtype1_S_EcoN10ORF171P_TRD2-CR2_like	Type I restriction-modification system specificity (S) subunit TRD-CR, similar to Escherichia coli N10-0505 S subunit (S.EcoN10ORF171P) TRD2-CR2, and Herpetosiphon aurantiacus S subunit (S.HauORF5277P) TRD2-CR2. The recognition sequences of Escherichia coli N10-0505 S subunit (S.EcoN10ORF171P) and Herpetosiphon aurantiacus S subunit (S.HauORF5277P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	184
341177	cd17288	RMtype1_S_LlaAI06ORF1089P_TRD1-CR1_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Lactococcus lactis S subunit (S.LlaAI06ORF1089P) TRD1-CR1 and Bacillus subtilis B4071 S subunit (S2.BsuCC16ORF609P) TRD2-CR2. The recognition sequences of Lactococcus lactis S subunit  (S.LlaAI06ORF1089P) and Bacillus subtilis B4071 S subunit (S2.BsuCC16ORF609P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	163
341178	cd17289	RMtype1_S_BamJRS5ORF1993P-TRD1-CR1_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Bacillus amyloliquefaciens JRS5 S subunit (S.BamJRS5ORF1993P) TRD1-CR1 and Bacillus pumilus Jo2 S subunit (S.BpuJo2I) TRD1-CR1. The recognition sequences of Bacillus amyloliquefaciens JRS5 S subunit (S.BamJRS5ORF1993P) and Bacillus pumilus Jo2 S subunit (S.BpuJo2I) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	191
341179	cd17290	RMtype1_S_AleSS8ORF2795P_TRD1-CR1_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Algibacter lectus S subunit (S.AleSS8ORF2795P) TRD1-CR1 and Vibrio parahaemolyticus O1:K33 CDC_K4557 S subunit (S.Vpa4557ORF22590P) TRD2-CR2. The recognition sequences of Algibacter lectus S subunit (S.AleSS8ORF2795P) and Vibrio parahaemolyticus O1:K33 CDC_K4557 S subunit (S.Vpa4557ORF22590P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	184
341180	cd17291	RMtype1_S_MgeORF438P-TRD-CR_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to S.MgeORF438P TRD1-CR1 and TRD2-CR2, and to Escherichia coli G5 S subunit (S.Eco16445ORF5013P ) TRD2-CR2 and Acetobacter pasteurianus IFO 3283-01 S subunit (S2.Apa3283ORF14230P) TRD1-CR1. The recognition sequences of Mycoplasma genitalium G-37 S subunit (S.MgeORF438P), Escherichia coli G5 S subunit (S.Eco16445ORF5013P), and Acetobacter pasteurianus IFO 3283-01 S subunit (S2.Apa3283ORF14230P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	161
341181	cd17292	RMtype1_S_LlaA17I_TRD2-CR2_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to the S subunit TRD2-CR2 regions of Lactococcus lactis subsp. cremoris A17 (S.LlaA17I), Haemophilus influenzae Rd (S.HindORF215P) and Clostridium species ASF502 (S.Csp502ORF478P). Lactococcus lactis subsp. cremoris A17 S subunit (S.LlaA17I) recognizes 5'...  CAANNNNNNNNTAYG... 3', while the recognition sequences of Clostridium species ASF502 S subunit (S.Csp502ORF478P) and Haemophilus influenzae Rd S subunit (S.HindORF215P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It also includes TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases, such as Porphyromonas species COT-108 OH1349 Type IIG restriction enzyme/N6-adenine DNA methyltransferase (RM.Psp1349ORF730P) of unknown recognition sequence. It may also include type I DNA methyltransferases.	149
341182	cd17293	RMtype1_S_Ppo21ORF8840P_TRD1-CR1_like	Type I restriction-modification system specificity (S) subunit TRD-CR, similar to Paenibacillus polymyxa SQR-21 SQR21 S subunit (S.Ppo21ORF8840P) TRD1-CR1, Nitrosococcus halophilus Nc4 S subunit (S.NhaNc4ORF3964P) TRD1-CR1. The recognition sequences of Paenibacillus polymyxa SQR-21 SQR21 S subunit (S.Ppo21ORF8840P) and Nitrosococcus halophilus Nc4 S subunit (S.NhaNc4ORF3964P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This superfamily contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	180
341183	cd17294	RMtype1_S_MmaC7ORF19P_TRD1-CR1_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Methanococcus maripaludis C7 S subunit (S.MmaC7ORF19P) TRD1-CR1 and Mycoplasma gallinaceum S subunit (S3.Mme68BORF1125P) TRD2-CR2. The recognition sequences of Methanococcus maripaludis C7 S subunit (S.MmaC7ORF19P) and Mycoplasma gallinaceum S subunit (S3.Mme68BORF1125P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	188
341184	cd17296	RMtype1_S_MmaC5ORF1169P_TRD1-CR1_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Methanococcus maripaludis C5 S subunit (S.MmaC5ORF1169P) TRD1-CR1, and Methanobacterium formicicum S subunit (S.Mfo3637ORF3708P) TRD2-CR2. The recognition sequences of Methanococcus maripaludis C5 S subunit (S.MmaC5ORF1169P) and Methanobacterium formicicum S subunit (S.Mfo3637ORF3708P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	182
341209	cd17297	AldB-like	proteins similar to alpha-acetolactate dehydrogenase. The structure of this domain displays an alpha-beta-beta-alpha four layer topology, with an H(x)H(x)nH motif (x could be any residue, n could be 9 or 10) that coordinates a zinc ion. The proteins are homologous to bacterial alpha-acetolactate decarboxylase (AldB, E.C. 4.1.1.5), which converts acetolactate into acetoin.	209
341210	cd17298	DUF1907	proteins similar to putative ester hydrolase C11orf54/PTD012. The structure of this domain displays an alpha-beta-beta-alpha four layer topology, with an HxHxxxxxxxxxH motif (x could be any residue) that coordinates a zinc ion, and an acetate anion at a site that may support the enzymatic activity of an ester hydrolase. In vitro hydrolytic activity towards para-nitrophenylacetate for the human enzyme was reported. The proteins are homologous to bacterial alpha-acetolactate decarboxylase (AldB, E.C. 4.1.1.5), which converts acetolactate into acetoin.	287
341211	cd17299	acetolactate_decarboxylase	alpha-acetolactate decarboxylase. alpha-acetolactate decarboxylase (AldB, E.C. 4.1.1.5) converts acetolactate ((2S)-2-hydroxy-2-methyl-3-oxobutanoate) into acetoin ((3R)-3-hydroxybutan-2-one) and CO(2). Acetoin may be secreted by the cells, perhaps in order to control the internal pH. AldB may function as a regulator in valine and leucine biosynthesis and in catalyzing the second step of the 2,3-butanediol pathway. The structure of this domain displays an alpha-beta-beta-alpha four layer topology, with an HxHxxxxxxxxxxH motif  (x could be any residue) that coordinates a zinc ion.	232
340437	cd17300	PIPKc_PIKfyve	Phosphatidylinositol phosphate kinase (PIPK) catalytic domain found in 1-phosphatidylinositol-3-phosphate 5-kinase and similar proteins. 1-phosphatidylinositol-3-phosphate 5-kinase (EC 2.7.1.150) is also called FYVE finger-containing phosphoinositide kinase, PIKfyve, phosphatidylinositol 3-phosphate 5-kinase (PIP5K3), or phosphatidylinositol 3-phosphate 5-kinase type III (PIPkin-III or type III PIP kinase). It forms a complex with its regulators, the scaffolding protein Vac14 and the lipid phosphatase Fig4. The complex is responsible for synthesizing phosphatidylinositol 3,5-bisphosphate [PtdIns(3,5)P2] by catalyzing the phosphorylation of phosphatidylinositol 3-phosphate (PtdIns3P or PI3P) on the fifth hydroxyl of the myo-inositol ring. Then phosphatidylinositol-5-phosphate (PtdIns5P) is generated directly from PtdIns(3,5)P2. PtdIns(3,5)P2 and PtdIns5P regulate endosomal trafficking and responses to extracellular stimuli. PIKfyve is vital in early embryonic development. It forms a complex with ArPIKfyve (associated regulator of PIKfyve) and SAC3 at the endomembranes, playing a role in receptor tyrosine kinase (RTK) degradation. The phosphorylation of PIKfyve by AKT can facilitate epidermal growth factor receptor (EGFR) degradation. In addition, PIKfyve may participate in the regulation of the glutamate transporters EAAT2, EAAT3 and EAAT4, and the cystic fibrosis transmembrane conductance regulator (CFTR). It is also essential for systemic glucose homeostasis and insulin-regulated glucose uptake/GLUT4 translocation in skeletal muscle. It can be activated by protein kinase B (PKB/Akt) and further up-regulates human Ether-a-go-go-Related Gene (hERG) channels. This family also includes the yeast ortholog of human PIKfyve, Fab1. PIKfyve and its orthologs share a similar architecture. They contain an N-terminal FYVE domain, a middle region related to the CCT/TCP-1/Cpn60 chaperonins that are involved in productive folding of actin and tubulin, a second middle domain that contains a number of conserved cysteine residues (CCR) unique to this family, and a C-terminal catalytic lipid kinase domain related to PtdInsP kinases (or the PIPKc domain).	262
340438	cd17301	PIPKc_PIP5KI	Phosphatidylinositol phosphate kinase (PIPK) catalytic domain found in type I phosphatidylinositol 4-phosphate (PtdIns(4)P) 5-kinases (PIP5KI) and similar proteins. PIP5KIs, also known as PIPKIs, or PI4P5KIs, phosphorylate the head group of phosphatidylinositol 4-phosphate (PtdIns4P) to generate phosphatidylinositol 4,5-bisphosphate (PtdIns4,5P2), an essential lipid molecule in various cellular processes. Three distinct PIP5KIs have been characterized in erythrocytes, PIP5K1alpha, PIP5K1beta, and PIP5K1gamma isoforms.	320
340439	cd17302	PIPKc_AtPIP5K_like	Phosphatidylinositol phosphate kinase (PIPK) catalytic domain found in Arabidopsis thaliana phosphatidylinositol 4-phosphate 5-kinases (PIP5Ks) and similar proteins. PIP5K (EC 2.7.1.68), also known as PtdIns(4)P-5-kinase, or diphosphoinositide kinase, phosphorylates phosphatidylinositol-4-phosphate to produce phosphatidylinositol-4,5-bisphosphate as a precursor of two second messengers, inositol-1,4,5-triphosphate and diacylglycerol, and as a regulator of many cellular proteins involved in signal transduction and cytoskeletal organization. The family includes several PIP5Ks from Arabidopsis thaliana. AtPIP5K1 is involved in water-stress signal transduction. AtPIP5K2 acts as an interactor of all five Arabidopsis RAB-E proteins but not with other Rab subclasses residing at the Golgi or trans-Golgi network. AtPIP5K3 is a key regulator of root hair tip growth. AtPIP5K4 and AtPIP5K5 are type B PI4P 5-kinases expressed in pollen and have important functions in pollen germination and in pollen tube growth. AtPIP5K6 regulates clathrin-dependent endocytosis in pollen tubes. AtPIP5K9 interacts with a cytosolic invertase to negatively regulate sugar-mediated root growth.	314
340440	cd17303	PIPKc_PIP5K_yeast_like	Phosphatidylinositol phosphate kinase (PIPK) catalytic domain found in yeast phosphatidylinositol 4-phosphate 5-kinases (PIP5Ks) and similar proteins. PIP5K (EC 2.7.1.68), also known as PtdIns(4)P-5-kinase, or diphosphoinositide kinase, phosphorylates phosphatidylinositol-4-phosphate to produce phosphatidylinositol-4,5-bisphosphate as a precursor of two second messengers, inositol-1,4,5-triphosphate and diacylglycerol, and as a regulator of many cellular proteins involved in signal transduction and cytoskeletal organization. The family includes Saccharomyces cerevisiae PIP5K MSS4, Schizosaccharomyces pombe PIP5K Its3. MSS4 is required for organization of the actin cytoskeleton in budding yeast. Its3 is involved, together with the calcineurin ppb1, in cytokinesis of fission yeast.	318
340441	cd17304	PIPKc_PIP5KL1	Phosphatidylinositol phosphate kinase (PIPK) catalytic domain found in phosphatidylinositol 4-phosphate 5-kinase-like protein 1 (PIP5KL1) and similar proteins. PIP5KL1 (EC 2.7.1.68), also known as PI(4)P 5-kinase-like protein 1, or PtdIns(4)P-5-kinase-like protein 1, may act as a scaffold to localize and regulate type I PI(4)P 5-kinases to specific compartments within the cell, where they generate PI(4,5)P2 for actin nucleation, signaling and scaffold protein recruitment, and conversion to PI(3,4,5)P3.	319
340442	cd17305	PIPKc_PIP5KII	Phosphatidylinositol phosphate kinase (PIPK) catalytic domain found in type II phosphatidylinositol 5-phosphate 4-kinase (PIP5KII) and similar proteins. PIP5KIIs, also known as PIPKIIs, or PI4P5KIIs, are responsible for the synthesis of phosphatidylinositol-4,5-bisphosphate (PtdIns4,5P2), an essential lipid molecule in various cellular processes, from phosphatidylinositol-5-phosphate (PtdIns5P). Three distinct PIP5KIIs have been characterized in erythrocytes, PIP5K2A, PIP5K2B, and PIP5K2C isoforms.	300
340443	cd17306	PIPKc_PIP5K1A_like	Phosphatidylinositol phosphate kinase (PIPK) catalytic domain found in phosphatidylinositol 4-phosphate 5-kinase type-1 alpha (PIP5K1alpha) and similar proteins. PIP5K1alpha (EC 2.7.1.68), also termed PIP5K1A, or PtdIns(4)P-5-kinase 1 alpha, or 68 kDa type I phosphatidylinositol 4-phosphate 5-kinase alpha, or PIPKI-alpha, catalyzes the phosphorylation of phosphatidylinositol 4-phosphate (PtdIns4P) to form phosphatidylinositol 4,5-bisphosphate (PtdIns(4,5)P2). It mediates extracellular calcium-induced keratinocyte differentiation. Unlike other type I phosphatidylinositol-4-phosphate 5-kinase (PIPKI) isoforms, PIP5K1alpha regulates directed cell migration by modulating Rac1 plasma membrane targeting and activation. This function is independent of its catalytic activity, and requires physical interaction of PIP5K1alpha with the Rac1 polybasic domain. The family also includes testis-specific PIP5K1A and PSMD4-like protein, also known as PIP5K1A-PSMD4 or PIPSL. It has negligible PIP5 kinase activity and binds to ubiquitinated proteins.	339
340444	cd17307	PIPKc_PIP5K1B	Phosphatidylinositol phosphate kinase (PIPK) catalytic domain found in phosphatidylinositol 4-phosphate 5-kinase type-1 beta (PIP5K1beta) and similar proteins. PIP5K1beta (EC 2.7.1.68), also known as PtdIns(4)P-5-kinase 1 beta, or protein STM-7, or PIP5K1B, is encoded by the Friedreich's ataxia (FRDA) gene, STM7. FRDA is a progressive neurodegenerative disease characterized by ataxia, variously associating heart disease, diabetes mellitus, and/or glucose intolerance. PIP5K1beta is an enzyme functionally linked to actin cytoskeleton dynamics and it phosphorylates phosphatidylinositol 4-phosphate (PtdIns4P) to generate phosphatidylinositol-4,5-bisphosphate (PtdIns(4,5)P2).	321
340445	cd17308	PIPKc_PIP5K1C	Phosphatidylinositol phosphate kinase (PIPK) catalytic domain found in phosphatidylinositol 4-phosphate 5-kinase type-1 gamma (PIP5K1gamma) and similar proteins. PIP5K1gamma (EC 2.7.1.68), also known as PtdIns(4)P-5-kinase 1 gamma, or PIP5K1gamma, or PIPKIgamma, or PtdInsPKI gamma, is a phosphatidylinositol-4-phosphate 5-kinase that catalyzes the phosphorylation of phosphatidylinositol 4-phosphate (PtdIns4P) to form phosphatidylinositol 4,5-bisphosphate (PtdIns(4,5)P2), which is involved in a variety of cellular processes and is the substrate to form phosphatidylinositol 3,4,5-trisphosphate (PtdIns(3,4,5)P3), another second messenger. PIP5K1gamma is required for epidermal growth factor (EGF)-stimulated directional cell migration. It also modulates adherens junction and E-cadherin trafficking via a direct interaction with mu 1B adaptin.	323
340446	cd17309	PIPKc_PIP5K2A	Phosphatidylinositol phosphate kinase (PIPK) catalytic domain found in Phosphatidylinositol 5-phosphate 4-kinase type-2 alpha (PIP5K2A) and similar proteins. PIP5K2A (EC 2.7.1.149), also known as PIP4K2A, or 1-phosphatidylinositol 5-phosphate 4-kinase 2-alpha, or diphosphoinositide kinase 2-alpha, or PIP5KIII, or phosphatidylinositol 5-phosphate 4-kinase type II alpha, or PI(5)P 4-kinase type II alpha, or PIP4KII-alpha, or PtdIns(4)P-5-kinase C isoform, or PtdIns(5)P-4-kinase isoform 2-alpha, catalyzes the phosphorylation of phosphatidylinositol 5-phosphate (PtdIns5P) on the fourth hydroxyl of the myo-inositol ring, to form phosphatidylinositol 4,5-bisphosphate (PtdIns(4,5)P2), one of the key metabolic crossroads in phosphoinositide signaling. It is possibly involved in a mechanism protecting against tardive dyskinesia-inducing neurotoxicity. PIP5K2A is associated with schizophrenia. It controls the function of KCNQ channels via phosphatidylinositol-4,5-bisphosphate (PIP2) synthesis, and plays a potential role in the regulation of alpha-amino-3-hydroxy-5-methyl-4-isoxazole propionic acid (AMPA) receptors.	309
340447	cd17310	PIPKc_PIP5K2B	Phosphatidylinositol phosphate kinase (PIPK) catalytic domain found in Phosphatidylinositol 5-phosphate 4-kinase type-2 beta (PIP5K2B) and similar proteins. PIP5K2B (EC 2.7.1.149), also known as 1-phosphatidylinositol 5-phosphate 4-kinase 2-beta, or diphosphoinositide kinase 2-beta, or phosphatidylinositol 5-phosphate 4-kinase type II beta, or PI(5)P 4-kinase type II beta, or PIP4KII-beta, or PtdIns(5)P-4-kinase isoform 2-beta, or PIP5KIIbeta, or PIP4K2B, participates in the biosynthesis of phosphatidylinositol 4,5-bisphosphate. It directly regulates the levels of two important phosphoinositide second messengers, PtdIns5P and phosphatidylinositol-(4,5)-bisphosphate (PtdIns(4,5)P2), one of the key metabolic crossroads in phosphoinositide signaling. It regulates the levels of nuclear PtdIns5P, which in turn modulates the acetylation of the tumour suppressor p53. It also interacts with and modulates nuclear localization of the high-activity PtdIns5P-4-kinase isoform PIP4Kalpha. Moreover, PIP5K2B is a molecular sensor that transduces changes in GTP into changes in the levels of the phosphoinositide PtdIns5P to modulate tumour cell growth.	311
340448	cd17311	PIPKc_PIP5K2C	Phosphatidylinositol phosphate kinase (PIPK) catalytic domain found in Phosphatidylinositol 5-phosphate 4-kinase type-2 gamma (PIP5K2C) and similar proteins. PIP5K2C (EC 2.7.1.149), also known as 1-phosphatidylinositol 5-phosphate 4-kinase 2-gamma, or PI5P4Kgamma, or diphosphoinositide kinase 2-gamma, or phosphatidylinositol 5-phosphate 4-kinase type II gamma, or PI(5)P 4-kinase type II gamma, or PIP4KII-gamma, or PIP4K2C, may play an important role in the production of phosphatidylinositol bisphosphate (PIP2) in the endoplasmic reticulum. It contributes to the development and maintenance of epithelial cell functional polarity. It also plays a role in the regulation of the immune system via mTORC1 signaling. Moreover, PIP5K2C is involved in arsenic trioxide (ATO) cytotoxicity. It mediates PIP2 generation required for positioning and assembly of bipolar spindles and alteration of PIP5K2C function by ATO may thus lead to spindle abnormalities.	298
340870	cd17312	MFS_OPA_SLC37	Organophosphate:Pi antiporter/Solute Carrier family 37 of the Major Facilitator Superfamily of transporters. Organophosphate:Pi antiporters (OPA) are integral membrane proteins responsible for the transport of specific organophosphates or sugar phosphates across biological membranes with the simultaneous translocation of inorganic phosphate in the opposite direction. The OPA family is also called solute carrier family 37 (SLC37) in vertebrates. Members include glucose-6-phosphate (Glc6P) transporter (also called translocase or exchanger), glycerol-3-phosphate permease, 2-phosphonopropionate transporter, phosphoglycerate transporter, as well as membrane sensor protein UhpC from Escherichia coli. UhpC is both a sensor and a transport protein; it recognizes external Glc6P and induces transport by UhpT, and it can also transport Glc6P. Vertebrates contain four SLC37 or sugar-phosphate exchange (SPX) proteins: SLC37A1 (SPX1), SLC37A2 (SPX2), SLC37A3 (SPX3), and SLC37A4 (SPX4). The OPA/SLC37 family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	364
340871	cd17313	MFS_SLC45_SUC	Solute carrier family 45 and similar sugar transporters of the Major Facilitator Superfamily of transporters. This group includes the solute carrier 45 (SLC45) family as well as plant sucrose transporters (SUCs or SUTs) and similar proteins such as Schizosaccharomyces pombe general alpha-glucoside permease. The SLC45 family is composed of four (A1-A4) vertebrate proteins as well as related insect proteins such as Drosophila sucrose transporter SCRT or Slc45-1. Members of this group transport sucrose and other sugars like maltose into the cell, with the concomitant uptake of protons (symport system). Plant sucrose transporters are crucial to carbon partitioning, playing a key role in phloem loading/unloading. They play a key role in loading and unloading of sucrose into the phloem and as a result, they control sucrose distribution throughout the whole plant and drive the osmotic flow system in the phloem. They also play a role in the exchange of sucrose between beneficial symbionts (mycorrhiza and Rhizobium) as well as pathogens such as nematodes and parasitic fungi. There are nine sucrose transporter genes in Arabidopsis and five in rice. Vertebrate SLC45 family proteins have been implicated in the regulation of glucose homoeostasis in the brain (SLC45A1), with skin and hair pigmentation (SLC45A2), and with prostate cancer and myelination (SLC45A3). Mutations in SLC45A2, also called MATP (membrane-associated transporter protein) or melanoma antigen AIM1, cause oculocutaneous albinism type 4 (OCA4), an autosomal recessive disorder of melanin biosynthesis that results in congenital hypopigmentation of ocular and cutaneous tissues. The SLC45 family and related sugar transporters belong to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	421
340872	cd17314	MFS_MCT_like	Monocarboxylate transporter (MCT) family and similar transporters of the Major Facilitator Superfamily. The group is composed of the Monocarboxylate transporter (MCT) family in animals and similar transporters from fungi, plants, archaea, and bacteria. MCT is also called Solute carrier family 16 (SLC16 or SLC16A). It is composed of 14 members, MCT1-14. MCTs play an integral role in cellular metabolism via lactate transport and have been implicated in metabolic synergy in tumors. MCTs have been found to facilitate the transport across the plasma membrane not only of monocarboxylates (MCT1-4), but also thyroid hormones (MCT8/10), and aromatic acids (MCT10). Yeast MCT homologous (Mch) proteins are not involved in the uptake of monocarboxylates; their substrates are not known. The MCT-like family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	385
340873	cd17315	MFS_GLUT_like	Glucose transporters (GLUTs) and other similar sugar transporters of the Major Facilitator Superfamily. This family is composed of glucose transporters (GLUTs) and other sugar transporters including fungal hexose transporters (HXT), bacterial xylose transporter (XylE), plant sugar transport proteins (STP) and polyol transporters (PLT), H(+)-myo-inositol cotransporter (HMIT), and similar proteins. GLUTs, also called Solute carrier family 2, facilitated glucose transporters (SLC2A), are a family of proteins that facilitate the transport of hexoses such as glucose and fructose. There are fourteen GLUTs found in humans; they display different substrate specificities and tissue expression. They have been categorized into three classes based on sequence similarity: Class 1 (GLUTs 1-4, 14); Class 2 (GLUTs 5, 7, 9, and 11); and Class 3 (GLUTs 6, 8, 10, 12, and HMIT). GLUT proteins are comprised of about 500 amino acid residues, possess a single N-linked oligosaccharide, and have 12 transmembrane segments. The GLUT-like family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	365
340874	cd17316	MFS_SV2_like	Metazoan Synaptic vesicle glycoprotein 2 (SV2) and related small molecule transporters of the Major Facilitator Superfamily. This family is composed of metazoan synaptic vesicle glycoprotein 2 (SV2) and related small molecule transporters including those that transport inorganic phosphate (Pht), aromatic compounds (PcaK and related proteins), proline/betaine (ProP), alpha-ketoglutarate (KgtP), citrate (CitA), shikimate (ShiA), and cis,cis-muconate (MucK), among others. SV2 is a transporter-like protein that serves as the receptor for botulinum neurotoxin A (BoNT/A), one of seven neurotoxins produced by the bacterium Clostridium botulinum. BoNT/A blocks neurotransmitter release by cleaving synaptosome-associated protein of 25 kD (SNAP-25) within presynaptic nerve terminals. Also included in this family is synaptic vesicle 2 (SV2)-related protein (SVOP) and similar proteins. SVOP is a transporter-like nucleotide binding protein that localizes to neurotransmitter-containing vesicles. The SV2-like family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	353
340875	cd17317	MFS_SLC22	Solute carrier 22 (SLC22) family of organic cation/anion/zwitterion transporters of the Major Facilitator Superfamily. The Solute carrier 22 (SLC22) family of organic cation/anion/zwitterion transporters includes organic cation transporters (OCTs), organic zwitterion/cation transporters (OCTNs), and organic anion transporters (OATs). SLC22 transporters interact with a variety of compounds that include drugs of abuse, environmental toxins, opioid analgesics, antidepressant and anxiolytic agents, and neurotransmitters and their metabolites. The SLC22 family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	331
340876	cd17318	MFS_SLC17	Solute carrier 17 (SLC17) family of the Major Facilitator Superfamily of transporters. The Solute carrier 17 (SLC17) family is primarily involved in the transport of organic anions. There are nine human proteins belonging to this family including: the type I phosphate transporters (SLC17A1-4) that were initially identified as sodium-dependent inorganic phosphate (Pi) transporters but are now known to be involved in the transport of organic anions; lysosomal acidic sugar transporter (SLC17A5 or sialin), vesicular glutamate transporters (VGluT1-3 or SLC17A7, SLC17A6, and SLC17A8, respectively), and a vesicular nucleotide transporter (VNUT or SLC17A9). SLC17A1 and SLC17A3 have roles in the transport of urate and para-aminohippurate, respectively. The SLC17 family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	389
340877	cd17319	MFS_ExuT_GudP_like	Hexuronate transporter, Glucarate transporter, and similar transporters of the Major Facilitator Superfamily. This family is composed of predominantly bacterial transporters for hexuronate (ExuT), glucarate (GudP), galactarate (GarP), and galactonate (DgoT). They mediate the uptake of these compounds into the cell. They belong to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	358
340878	cd17320	MFS_MdfA_MDR_like	Multidrug transporter MdfA and similar multidrug resistance (MDR) transporters of the Major Facilitator Superfamily. This family is composed of bacterial multidrug resistance (MDR) transporters including several proteins from Escherichia coli such as MdfA (also called chloramphenicol resistance pump Cmr), EmrD, MdtM, MdtL, bicyclomycin resistance protein (also called sulfonamide resistance protein), and the uncharacterized inner membrane transport protein YdhC. EmrD is a proton-dependent secondary transporter, first identified as an efflux pump for uncouplers of oxidative phosphorylation. It expels a range of drug molecules and amphipathic compounds across the inner membrane of E. coli. Similarly, MdfA is a secondary multidrug transporter that exports a broad spectrum of structurally and electrically dissimilar toxic compounds. These MDR transporters are drug/H+ antiporters (DHA) belonging to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	379
340879	cd17321	MFS_MMR_MDR_like	Methylenomycin A resistance protein (also called MMR peptide) and similar multidrug resistance (MDR) transporters of the Major Facilitator Superfamily. This family is composed of bacterial, fungal, and archaeal multidrug resistance (MDR) transporters including several proteins from Bacilli such as methylenomycin A resistance protein (also called MMR peptide), tetracycline resistance protein (TetB), and lincomycin resistance protein LmrB, as well as fungal proteins such as vacuolar basic amino acid transporters, which are involved in the transport into vacuoles of the basic amino acids histidine, lysine, and arginine in Saccharomyces cerevisiae, and aminotriazole/azole resistance proteins. MDR transporters are drug/H+ antiporters (DHA) that mediate the efflux of a variety of drugs and toxic compounds, and confer resistance to these compounds. For example, MMR confers resistance to the epoxide antibiotic methylenomycin while TetB resistance to tetracycline by an active tetracycline efflux. MMR-like MDR transporters belong to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	370
340880	cd17322	MFS_ARN_like	Yeast ARN family of Siderophore iron transporters and similar proteins of the Major Facilitator Superfamily. The ARN family of siderophore iron transporters includes ARN1 (or ferrichrome permease), ARN2 (or triacetylfusarinine C transporter 1 or TAF1), ARN3 (or siderophore iron transporter 1 or SIT1 or ferrioxamine B permease) and ARN4 (or Enterobactin permease or ENB1). They specifically recognize siderophore-iron chelates and are expressed under conditions of iron deprivation. They facilitate the uptake of both hydroxamate- and catecholate-type siderophores. This group also includes glutathione exchanger 1 (Gex1p) and Gex2p, which are proton/glutathione antiporters that import glutathione from the vacuole and export it through the plasma membrane. The ARN family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	514
340881	cd17323	MFS_Tpo1_MDR_like	Yeast Polyamine transporter 1 (Tpo1) and similar multidrug resistance (MDR) transporters of the Major Facilitator Superfamily. This family is composed of fungal multidrug resistance (MDR) transporters including several proteins from Saccharomyces cerevisiae such as polyamine transporters 1-4 (Tpo1-4), quinidine resistance proteins 1-3 (Qdr1-3), dityrosine transporter 1 (Dtr1), fluconazole resistance protein 1 (Flr1), and protein HOL1. MDR transporters are drug/H+ antiporters (DHA) that mediate the efflux of a variety of drugs and toxic compounds, and confer resistance to these compounds. For example, Flr1 confers resistance to the azole derivative fluconazole while Tpo1 confers resistance and adaptation to quinidine and ketoconazole. The polyamine transporters are involved in the detoxification of excess polyamines in the cytoplasm. Tpo1-like MDR transporters belong to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	376
340882	cd17324	MFS_NepI_like	Purine ribonucleoside efflux pump NepI and similar transporters of the Major Facilitator Superfamily. This family is composed of purine efflux pumps such as Escherichia coli NepI and Bacillus subtilis PbuE, sugar efflux transporters such as Corynebacterium glutamicum arabinose efflux permease, multidrug resistance (MDR) transporters such as Streptomyces lividans chloramphenicol resistance protein (CmlR), and similar proteins. NepI and PbuE are involved in the efflux of purine ribonucleosides such as guanosine, adenosine and inosine, as well as purine bases like guanine, adenine, and hypoxanthine, and purine base analogs. They play a role in the maintenance of cellular purine base pools, as well as in protecting the cells and conferring resistance against toxic purine base analogs such as 6-mercaptopurine. MDR transporters are drug/H+ antiporters (DHA) that mediate the efflux of a variety of drugs and toxic compounds, and confer resistance to these compounds. The NepI-like family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	370
340883	cd17325	MFS_MdtG_SLC18_like	Bacterial MdtG-like and eukaryotic solute carrier 18 (SLC18) family of the Major Facilitator Superfamily of transporters. This family is composed of eukaryotic solute carrier 18 (SLC18) family transporters and related bacterial multidrug resistance (MDR) transporters including several proteins from Escherichia coli such as multidrug resistance protein MdtG, from Bacillus subtilis such as multidrug resistance proteins 1 (Bmr1) and 2 (Bmr2), and from Staphylococcus aureus such as quinolone resistance protein NorA. The family also includes Escherichia coli arabinose efflux transporters YfcJ and YhhS. MDR transporters are drug/H+ antiporters (DHA) that mediate the efflux of a variety of drugs and toxic compounds, and confer resistance to these compounds. The SLC18 transporter family includes vesicular monoamine transporters (VAT1 and VAT2), vesicular acetylcholine transporter (VAChT), and SLC18B1, which is proposed to be a vesicular polyamine transporter (VPAT). The MdtG/SLC18 family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	375
340884	cd17326	MFS_MFSD8	Major facilitator superfamily domain-containing protein 8. Major facilitator superfamily (MFS) domain-containing protein 8 (MFSD8) is also called ceroid-lipofuscinosis neuronal protein 7 (CLN7). It is a polytopic lysosomal membrane protein that may transport small solutes by using chemiosmotic ion gradients. Mutations in MFSD8/CLN7 cause a variant of late-infantile neuronal ceroid lipofuscinoses (vLINCL), a neurodegenerative lysosomal storage disorder. Some variants are associated with nonsyndromic autosomal recessive macular dystrophy. MFSD8/CLN7 belongs to the Eukaryotic Solute carrier 46 (SLC46)/Bacterial Tetracycline resistance -like (SLC46/TetA-like) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	374
340885	cd17327	MFS_FEN2_like	Pantothenate transporter FEN2 and similar transporters of the Major Facilitator Superfamily. This family is composed of Saccharomyces cerevisiae pantothenate transporter FEN2 (or fenpropimorph resistance protein 2) and similar proteins from fungi and bacteria including fungal vitamin H transporter, allantoate permease, and high-affinity nicotinic acid transporter, as well as Pseudomonas putida phthalate transporter and nicotinate degradation protein T (nicT). These proteins are involved in the uptake into the cell of specific substrates such as pantothenate, biotin, allantoate, and nicotinic acid, among others. The FEN2-like family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	406
340886	cd17328	MFS_spinster_like	Protein spinster and spinster homologs of the Major Facilitator Superfamily of transporters. The protein spinster family includes Drosophila protein spinster, its vertebrate homologs, and similar proteins. Humans contain three homologs called protein spinster homologs 1 (SPNS1), 2 (SPNS2), and 3 (SPNS3). Protein spinster and its homologs may be sphingolipid transporters that play central roles in endosome and/or lysosome storage. SPNS2 is also called sphingosine 1-phosphate (S1P) transporter and is required for migration of myocardial precursors. S1P is a secreted lipid mediator that plays critical roles in cardiovascular, immunological, and neural development and function. The spinster-like family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	405
340887	cd17329	MFS_MdtH_MDR_like	Multidrug resistance protein MdtH and similar multidrug resistance (MDR) transporters of the Major Facilitator Superfamily. This family is composed of Escherichia coli MdtH and similar multidrug resistance (MDR) transporters from bacteria and archaea, many of which remain uncharacterized. MDR transporters are drug/H+ antiporters (DHA) that mediate the efflux of a variety of drugs and toxic compounds, and confer resistance to these compounds. MdtH confers resistance to norfloxacin and enoxacin. MdtH-like MDR transporters belong to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	376
340888	cd17330	MFS_SLC46_TetA_like	Eukaryotic Solute carrier 46 (SLC46) family, Bacterial Tetracycline resistance proteins, and similar proteins of the Major Facilitator Superfamily of transporters. This family is composed of the eukaryotic proteins MFSD9, MFSD10, MFSD14, and SLC46 family proteins, as well as bacterial multidrug resistance (MDR) transporters such as tetracycline resistance protein TetA and multidrug resistance protein MdtG. MDR transporters are drug/H+ antiporters (DHA) that mediate the efflux of a variety of drugs and toxic compounds, and confer resistance to these compounds. TetA proteins confer resistance to tetracycline while MdtG confers resistance to fosfomycin and deoxycholate. The Solute carrier 46 (SLC46) family is composed of three vertebrate members (SLC46A1, SLC46A2, and SLC46A3), the best-studied of which is SLC46A1, which functions both as an intestinal proton-coupled high-affinity folate transporter involved in the absorption of folates and as an intestinal heme transporter which mediates heme uptake. MFSD10 facilitates the uptake of organic anions such as some non-steroidal anti-inflammatory drugs (NSAIDs) and confers resistance to such NSAIDs. The SLC46/TetA-like family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	349
340889	cd17331	MFS_SLC22A18	Solute carrier family 22 member 18 of the Major Facilitator Superfamily of transporters. Solute carrier family 22 member 18 (SLC22A18) is also called Beckwith-Wiedemann syndrome chromosomal region 1 candidate gene A protein (BWR1A or BWSCR1A), efflux transporter-like protein, imprinted multi-membrane-spanning polyspecific transporter-related protein 1 (IMPT1), organic cation transporter-like protein 2 (ORCTL2), or tumor-suppressing subchromosomal transferable fragment candidate gene 5 protein (TSSC5). It is localized at the apical membrane surface of renal proximal tubules and may act as an organic cation/proton antiporter. It functions as a tumor suppressor in several cancer types including glioblastoma and colorectal cancer. SLC22A18 belongs to the Eukaryotic Solute carrier 46 (SLC46)/Bacterial Tetracycline resistance (TetA) -like (SLC46/TetA-like) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	382
340890	cd17332	MFS_MelB_like	Salmonella enterica Na+/melibiose symporter MelB and similar transporters of the Major Facilitator Superfamily. This family is composed of Salmonella enterica Na+/melibiose symporter MelB, Major Facilitator Superfamily domain-containing proteins, MFSD2 and MFSD12, and other sugar transporters. MelB catalyzes the electrogenic symport of galactosides with Na+, Li+ or H+. The MFSD2 subfamily is composed of two vertebrate members, MFSD2A and MFSD2B. MFSD2A is more commonly called sodium-dependent lysophosphatidylcholine symporter 1 (NLS1). It is an LPC symporter that plays an essential role for blood-brain barrier formation and function. Inactivating mutations in MFSD2A cause a lethal microcephaly syndrome. MFSD2B is a potential risk or protective factor in the prognosis of lung adenocarcinoma. The MelB-like family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	424
340891	cd17333	MFS_FucP_MFSD4_like	Bacterial fucose permease, eukaryotic Major facilitator superfamily domain-containing protein 4, and similar proteins. This family is composed of bacterial L-fucose permease (FucP), eukaryotic Major facilitator superfamily domain-containing protein 4 (MFSD4) proteins, and similar proteins. L-fucose permease facilitates the uptake of L-fucose across the boundary membrane with the concomitant transport of protons into the cell; it can also transport L-galactose and D-arabinose. The MFSD4 subfamily consists of two vertebrate members: MFSD4A and MFSD4B. The function of MFSD4A is unknown. MFSD4B is more commonly known as Sodium-dependent glucose transporter 1 (NaGLT1), a primary fructose transporter in rat renal brush-border membranes that also facilitates sodium-independent urea uptake. The FucP/MFSD4 family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	372
340892	cd17334	MFS_SLC49	Solute carrier 49 (SLC49) family of the Major Facilitator Superfamily of transporters. The Solute carrier 49 (SLC49) family is composed of four members: feline leukemia virus subgroup C receptor 1 (FLVCR1, SLC49A1); FLVCR2 (SLC49A2); major facilitator superfamily domain-containing protein 7 (MFSD7, SLC49A3); and disrupted in renal carcinoma protein 2 (DIRC2, SLC49A4). FLVCR1 and FLVCR2 are heme transporters. In addition, FLVCR2 also functions as a transporter for a calcium-chelator complex that is important for growth and calcium metabolism. The function of MFSD7 is unknown. DIRC2 is an electrogenic lysosomal metabolite transporter that is regulated by limited proteolytic processing by cathepsin L. The SLC49 family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	407
340893	cd17335	MFS_MFSD6	Major facilitator superfamily domain-containing protein 6. Human Major facilitator superfamily domain-containing protein 6 (MFSD6) is also called macrophage MHC class I receptor 2 homolog (MMR2). It has been postulated as a possible receptor for human leukocyte antigen (HLA)-B62. MFSD6 is conserved through evolution and appeared before bilateral animals. It belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	375
340894	cd17336	MFS_SLCO_OATP	Solute carrier organic anion transporters of the Major Facilitator Superfamily of transporters. Solute carrier organic anion transporters (SLCOs) are also called organic anion transporting polypeptides (OATPs) or SLC21 (Solute carrier family 21) proteins. They are sodium-independent transporters that mediate the transport of a broad range of endo- as well as xenobiotics. Their substrates are mainly amphipathic organic anions with a molecular weight of more than 300 Da, although there are a few known neutral or positively charged substrates. These include drugs such as statins, angiotensin-converting enzyme inhibitors, angiotensin receptor blockers, antibiotics, antihistaminics, antihypertensives, and anticancer drugs. SLCOs/OATPs can be classified into 6 families (SLCO1-6 or OATP1-6) and each family may have subfamilies (e.g. OATP1A, OATP1B, OATP1C). Within the subfamilies, individual members are numbered according to the chronology of their identification and if there is already an ortholog known, they are given the same number. For example, the first SLCO identified is rat OATP1A1 (encoded by the Slco1a1 gene). The second SLCO identified is the first human SLCO from the same subfamily and is called OATP1A2 (encoded by the SLCO1A2 gene). There are 11 human SLCOs/OATPs. SLCOs belong to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	411
340895	cd17337	MFS_CsbX	CsbX family of the Major Facilitator Superfamily of transporters. The CsbX family is composed of Bacillus subtilis CsbX protein (also named alpha-ketoglutarate permease), Klebsiella pneumoniae D-arabinitol transporter (DalT), and similar proteins. The csbX gene is a sigmaB-controlled gene that is expressed during the stationary phase of cell growth. DalT is a pentose-specific ion symporter for D-arabinitol uptake. Most members of this family remain uncharacterized. The CsbX family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	388
340896	cd17338	MFS_unc93_like	Unc-93 family of the Major Facilitator Superfamily of transporters. The Unc-93 family is composed of Caenorhabditis elegans uncoordinated protein 93 (also called putative potassium channel regulatory protein unc-93) and similar proteins including three vertebrate members: protein unc-93 homolog A (UNC93A), protein unc-93 homolog B1 (UNC93B1), and UNC93-like protein MFSD11 (also called major facilitator superfamily domain-containing protein 11 or protein ET). Unc-93 acts as a regulatory subunit of a multi-subunit potassium channel complex that may function in coordinating muscle contraction in C. elegans. The human UNC93A gene is located in a region of the genome that is frequently associated with ovarian cancer; however, there is no evidence that UNC93A has a tumor suppressor function. UNC93B1 controls intracellular trafficking and transport of a subset of Toll-like receptors (TLRs), including TLR3, TLR7 and TLR9, from the endoplasmic reticulum to endolysosomes where they can engage pathogen nucleotides and activate signaling cascades. MFSD11 is ubiquitously expressed in the periphery and the central nervous system of mice, where it is expressed in excitatory and inhibitory mouse brain neurons. The unc93-like family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	388
340897	cd17339	MFS_NIMT_CynX_like	2-nitroimidazole and cyanate transporters and similar proteins of the Major Facilitator Superfamily of transporters. This family is composed of Escherichia coli 2-nitroimidazole transporter (NIMT) and cyanate transport protein CynX, and similar proteins. NIMT, also called YeaN, confers resistance to 2-nitroimidazole, the antibacterial and antifungal antibiotic, by mediating the active efflux of this compound. CynX is part of an active transport system that transports exogenous cyanate into E. coli cells. The NIMT/CynX-like family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	374
340898	cd17340	MFS_MFSD1	Major facilitator superfamily domain-containing protein 1. Human major facilitator superfamily domain-containing protein 1 (MFSD1) is also called smooth muscle cell-associated protein 4 (SMAP-4). The function of MFSD1 is still unknown. Its expression is affected by altered nutrient intake. During starvation, expression of MFSD1 is downregulated in anterior brain sections in mice while it is upregulated in the brainstem. In mice raised on high-fat diet, MFSD1 is specifically downregulated in brainstem and hypothalamus. MFSD1 belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	394
340899	cd17341	MFS_NRT2_like	Plant Nitrate transporter NRT2 family and Bacterial Nitrate/Nitrite transporters of the Major Facilitator Superfamily. This family is composed of plant NRT2 family high-affinity nitrate transporters as well as nitrate and nitrite transporters from bacteria including Bacillus subtilis nitrate transporter NasA and nitrite extrusion protein NarK, Staphylococcus aureus NarT, Synechococcus sp. nitrate permease NapA, Mycobacterium tuberculosis NarK2 and nitrite extrusion protein NarU. NRT2 family proteins are involved in the uptake of nitrate by plant roots from the soil through the high-affinity transport system (HATS). There are seven Arabidopsis thaliana NRT2 proteins, called AtNRT2:1 to AtNRT2:7. The NRT2-like family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	384
340900	cd17342	MFS_SLC37A3	Solute carrier family 37 member 3 of the Major Facilitator Superfamily of transporters. Solute carrier family 37 member 3 (SLC37A3) is also called sugar phosphate exchanger 3 (SPX3), and is one of four SLC37 family proteins in vertebrates. Its function and activity are unknown. The best characterized SLC37 family member is SLC37A4, also called the glucose-6-phosphate transporter (G6PT), a phosphate (Pi)-linked G6P antiporter. SLC37A3 is a member of the Organophosphate:Pi antiporter (OPA)/SLC37 family, whose members are integral membrane proteins responsible for the transport of specific organophosphates or sugar phosphates across biological membranes with the simultaneous translocation of inorganic phosphate into the opposite direction. The OPA/SLC37 family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	399
340901	cd17343	MFS_SLC37A4	Solute carrier family 37 member 4 of the Major Facilitator Superfamily of transporters. Solute carrier family 37 member 4 (SLC37A4), one of four SLC37 family proteins in vertebrates, is better known as glucose-6-phosphate transporter (G6PT). It is also called sugar phosphate exchanger 4 (SPX4), G6P translocase, or transformation-related gene 19 protein (TRG-19). G6PT is a phosphate (Pi)-linked G6P antiporter, catalyzing G6P:Pi and Pi:Pi exchanges. Deficiencies in human G6PT lead to glycogen storage disease type Ib (GSD-Ib), which is a metabolic and immune disorder. G6PT is a member of the Organophosphate:Pi antiporter (OPA)/SLC37 family, whose members are integral membrane proteins responsible for the transport of specific organophosphates or sugar phosphates across biological membranes with the simultaneous translocation of inorganic phosphate into the opposite direction. The OPA/SLC37 family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	409
340902	cd17344	MFS_SLC37A1_2	Solute carrier family 37 members 1 and 2 of the Major Facilitator Superfamily of transporters. Solute carrier family 37 members 1 (SLC37A1) and 2 (SLC37A2) are also called sugar phosphate exchangers 1 (SPX1) and 2 (SPX2). SLC37A1 and SLC37A2 are ER-associated, Pi-linked antiporters that can transport glucose-6-phosphate (G6P) but are insensitive to chlorogenic acid, a competitive inhibitor of physiological ER G6P transport, unlike SLC37A4, which is the best characterized SLC37 family member and the physiological G6P transporter (G6PT). SLC37A1 and SLC37A2 belong to the Organophosphate:Pi antiporter (OPA)/SLC37 family, whose members are integral membrane proteins responsible for the transport of specific organophosphates or sugar phosphates across biological membranes with the simultaneous translocation of inorganic phosphate into the opposite direction. The OPA/SLC37 family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	400
340903	cd17345	MFS_GlpT	Glycerol-3-Phosphate Transporter of the Major Facilitator Superfamily of transporters. Glycerol-3-Phosphate Transporter (also called GlpT or G-3-P permease) is responsible for glycerol-3-phosphate uptake. It is part of the Organophosphate:Pi antiporter (OPA) family of integral membrane proteins responsible for the transport of specific organophosphates or sugar phosphates across biological membranes with the simultaneous translocation of inorganic phosphate into the opposite direction. The GlpT group belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	411
340904	cd17346	MFS_DtpA_like	Dipeptide and tripeptide permease A (DtpA)-like subfamily of the Major Facilitator Superfamily of transporters. The DtpA-like subfamily includes four Escherichia coli proteins: dipeptide and tripeptide permeases A (DtpA, TppB or YdgR), B (DtpB or YhiP), C (DtpC or YjdL), and D (DtpD or YbgH). They are proton-dependent permeases that transport di- and tripeptides. DtpA and DtpB display a preference for di- and tripeptides composed of L-amino acids. DtpC shows higher specificity for dipeptides compared to tripeptides, and prefers dipeptides containing a C-terminal lysine residue. The DtpA-like subfamily belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	399
340905	cd17347	MFS_SLC15A1_2_like	Solute carrier family 15 members 1 and 2, and similar Major Facilitator Superfamily transporters. Solute carrier family 15 member 1 (SLC15A1) and SLC15A2 are members of the proton-coupled oligopeptide transporter (POT) family of integral membrane proteins that mediate the cellular uptake of di/tripeptides and peptide-like drugs. They mediate the proton-coupled active transport of a broad range of dipeptides and tripeptides, including zwitterionic, anionic and cationic peptides, as well as a variety of peptide-like drugs such as cefadroxil, enalapril, and valacyclovir. SLC15A1, or peptide transporter 1 (PepT1), is primarily expressed in the brush border membranes of enterocytes of the small intestine and is also known as the intestinal isoform. SLC15A2, or peptide transporter 2 (PepT2), is abundantly expressed in the apical membrane of kidney proximal tubules and is also referred to as the renal isoform. Both proteins transport di/tripeptides, but not tetrapeptides or free amino acids, using the energy generated by an inwardly directed transmembrane proton gradient. The SLC15A1/SLC15A2-like group belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	427
340906	cd17348	MFS_SLC15A3_4	Solute Carrier family 15 members 3 and 4 of the Major Facilitator Superfamily of transporters. Solute carrier family 15 members 3 (SLC15A3) and 4 (SLC15A4) are members of the proton-coupled oligopeptide transporter (POT) family of integral membrane proteins that mediate the cellular uptake of di/tripeptides and peptide-like drugs. They are peptide/histidine transporters (PHTs) that transport free histidine in addition to di/tripeptides. SLC15A4, also called peptide transporter 4 or peptide/histidine transporter 1 (PHT1), is expressed in the human brain, retina, placenta, and immune cells. It is required for Toll-like receptor 7 (TLR7)- and TLR9-mediated type I interferon production in plasmacytoid dendritic cells (pDCs) and is involved in the pathogenesis of lupus-like autoimmunity. SLC15A3, also called osteoclast transporter, peptide transporter 3, or peptide/histidine transporter 2 (PHT2), is expressed in immune tissues including the spleen and thymus. The SLC15A3/SLC15A4 group belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	435
340907	cd17349	MFS_SLC15A5	Solute Carrier family 15 member 5 of the Major Facilitator Superfamily of transporters. Solute carrier family 15 member 5 (SLC15A5) is a member of the proton-coupled oligopeptide transporter (POT) family of integral membrane proteins that mediate the cellular uptake of di/tripeptides and peptide-like drugs. The specific function of SLC15A5 is unknown. SLC15A5 belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	437
340908	cd17350	MFS_PTR2	Peptide transporter PTR2 of the Major Facilitator Superfamily of transporters. Fungal peptide transporter or permease PTR2 is a member of the proton-coupled oligopeptide transporter (POT) family of integral membrane proteins that mediate the cellular uptake of di/tripeptides and peptide-like drugs. It is a 12-transmembrane domain (TMD) integral membrane protein that translocates di-/tripeptides. As with other POT family proteins, it displays characteristic substrate multispecificity. PTR2 belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	438
340909	cd17351	MFS_NPF	Plant NRT1/PTR family (NPF) of the Major Facilitator Superfamily of transporters. The plant Nitrate transporter/Peptide transporter (NRT1/PTR) family (NPF) is related to the POT (proton-coupled oligopeptide transporter), Peptide transporter (PepT/PTR), or Solute Carrier 15 (SLC15) family in animals. In contrast to related animal and bacterial counterparts, the plant proteins transport a wide variety of substrates including nitrate, peptides, amino acids, dicarboxylates, glucosinolates, as well as the plant hormones indole-3-acetic acid (IAA) and abscisic acid (ABA). A recent study identified eight subfamilies within this family, named NPF1-NPF8. NPF belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	445
340910	cd17352	MFS_MCT_SLC16	Monocarboxylate transporter (MCT) family of the Major Facilitator Superfamily of transporters. The animal Monocarboxylate transporter (MCT) family is also called Solute carrier family 16 (SLC16 or SLC16A). It is composed of 14 members, MCT1-14. MCTs play an integral role in cellular metabolism via lactate transport and have been implicated in metabolic synergy in tumors. MCT1-4 are proton-coupled transporters that facilitate the transport across the plasma membrane of monocarboxylates such as lactate, pyruvate, branched-chain oxo acids derived from leucine, valine and isoleucine, and ketone bodies such as acetoacetate, beta-hydroxybutyrate and acetate. MCT8 and MCT10 are transporters which stimulate the cellular uptake of thyroid hormones such as thyroxine (T4), triiodothyronine (T3), reverse triiodothyronine (rT3) and diiodothyronine (T2). MCT10 also functions as a sodium-independent transporter that mediates the uptake or efflux of aromatic acids. Many members are orphan transporters whose substrates are yet to be determined. The MCT family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	361
340911	cd17353	MFS_OFA_like	Oxalate:formate antiporter (OFA) and similar proteins of the Major Facilitator Superfamily of transporters. This subfamily is composed of Oxalobacter formigenes oxalate:formate antiporter (OFA or OxlT) and similar proteins. O. formigenes, a commensal found in the gut of animals and humans, plays an important role in clearing dietary oxalate from the intestinal tract, which is carried out by OFA/OxlT, an anion transporter that facilitates the exchange of divalent oxalate with monovalent formate, the product of oxalate decarboxylation. This exchange generates an electrochemical proton gradient and is the source of energy for ATP synthesis in this cell. The OFA-like subfamily belongs to the Monocarboxylate transporter -like (MCT-like) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	389
340912	cd17354	MFS_Mch1p_like	Monocarboxylate transporter-homologous (Mch) 1 protein and similar transporters of the Major Facilitator Superfamily of transporters. Yeast monocarboxylate transporter-homologous (Mch) proteins are putative transporters that do not transport monocarboxylic acids across the plasma membrane, and may play roles distinct from their mammalian counterparts. Their function has not been determined. The Mch1p-like group belongs to the Monocarboxylate transporter -like (MCT-like) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	385
340913	cd17355	MFS_YcxA_like	MFS-type transporter YcxA and similar proteins of the Major Facilitator Superfamily of transporters. This group is composed of uncharacterized bacterial MFS-type transporters including Bacillus subtilis YcxA and YbfB. YcxA has been shown to facilitate the export of surfactin in B. subtilis. The YcxA-like group belongs to the Monocarboxylate transporter -like (MCT-like) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	386
340914	cd17356	MFS_HXT	Fungal Hexose transporter subfamily of the Major Facilitator Superfamily of transporters and similar proteins. The fungal hexose transporter (HXT) subfamily is comprised of functionally redundant proteins that function mainly in the transport of glucose, as well as other sugars such as galactose and fructose. Saccharomyces cerevisiae has 20 genes that encode proteins in this family (HXT1 to HXT17, GAL2, SNF3, and RGT2). Seven of these (HXT1-7) encode functional glucose transporters. Gal2p is a galactose transporter, while Rgt2p and Snf3p act as cell surface glucose receptors that initiate signal transduction in response to glucose, functioning in an induction pathway responsible for glucose uptake. Rgt2p is activated by high levels of glucose and stimulates expression of low affinity glucose transporters such as Hxt1p and Hxt3p, while Snf3p generates a glucose signal in response to low levels of glucose, stimulating the expression of high affinity glucose transporters such as Hxt2p and Hxt4p. Schizosaccharomyces pombe contains eight GHT genes (GHT1-8) belonging to this family. Ght1, Ght2, and Ght5 are high-affinity glucose transporters; Ght3 is a high-affinity gluconate transporter; and Ght6 is a high-affinity fructose transporter. The substrate specificities for Ght4, Ght7, and Ght8 remain undetermined. The HXT subfamily belongs to the Glucose transporter -like (GLUT-like) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	403
340915	cd17357	MFS_GLUT_Class1_2_like	Class 1 and Class 2 Glucose transporters (GLUTs) of the Major Facilitator Superfamily. This subfamily includes Class 1 and Class 2 glucose transporters (GLUTs) including Solute carrier family 2, facilitated glucose transporter member 1 (SLC2A1, also called glucose transporter type 1 or GLUT1), SLC2A2-5 (GLUT2-5), SLC2A7 (GLUT7), SLC2A9 (GLUT9), SLC2A11 (GLUT11), SLC2A14 (GLUT14), and similar proteins. GLUTs are a family of proteins that facilitate the transport of hexoses such as glucose and fructose. There are fourteen GLUTs found in humans; they display different substrate specificities and tissue expression. They have been categorized into three classes based on sequence similarity: Class 1 (GLUTs 1-4, 14); Class 2 (GLUTs 5, 7, 9, and 11); and Class 3 (GLUTs 6, 8, 10, 12, and HMIT). GLUTs 1-5 are the most thoroughly studied and are well-established as glucose and/or fructose transporters in various tissues and cell types. GLUT proteins are comprised of about 500 amino acid residues, possess a single N-linked oligosaccharide, and have 12 transmembrane segments. They belong to the Glucose transporter -like (GLUT-like) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	447
340916	cd17358	MFS_GLUT6_8_Class3_like	Glucose transporter (GLUT) types 6 and 8, Class 3 GLUTs, and similar transporters of the Major Facilitator Superfamily. This subfamily is composed of glucose transporter type 6 (GLUT6), GLUT8, plant early dehydration-induced gene ERD6-like proteins, and similar insect proteins including facilitated trehalose transporter Tret1-1. GLUTs, also called Solute carrier family 2, facilitated glucose transporters (SLC2A), are a family of proteins that facilitate the transport of hexoses such as glucose and fructose. There are fourteen GLUTs found in humans; they display different substrate specificities and tissue expression. They have been categorized into three classes based on sequence similarity: Class 1 (GLUTs 1-4, 14); Class 2 (GLUTs 5, 7, 9, and 11); and Class 3 (GLUTs 6, 8, 10, 12, and HMIT). Insect Tret1-1 is a low-capacity facilitative transporter for trehalose that mediates the transport of trehalose synthesized in the fat body and the incorporation of trehalose into other tissues that require a carbon source. GLUT proteins are comprised of about 500 amino acid residues, possess a single N-linked oligosaccharide, and have 12 transmembrane segments. They belong to the Glucose transporter -like (GLUT-like) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	436
340917	cd17359	MFS_XylE_like	D-xylose-proton symporter and similar transporters of the Major Facilitator Superfamily. This subfamily includes bacterial transporters such as D-xylose-proton symporter (XylE or XylT), arabinose-proton symporter (AraE), galactose-proton symporter (GalP), major myo-inositol transporter IolT, glucose transport protein, putative metabolite transport proteins YfiG, YncC, and YwtG, and similar proteins. The symporters XylE, AraE, and GalP facilitate the uptake of D-xylose, arabinose, and galactose, respectively, across the boundary membrane with the concomitant transport of protons into the cell. IolT is involved in polyol metabolism and myo-inositol degradation into acetyl-CoA. The XylE-like subfamily belongs to the Glucose transporter -like (GLUT-like) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	383
340918	cd17360	MFS_HMIT_like	H(+)-myo-inositol cotransporter and similar transporters of the Major Facilitator Superfamily. This subfamily is composed of myo-inositol/inositol transporters and similar transporters from vertebrates, plants, and fungi. The human protein is called H(+)-myo-inositol cotransporter/Proton myo-inositol cotransporter (HMIT), or H(+)-myo-inositol symporter, or Solute carrier family 2 member 13 (SLC2A13). HMIT is classified as a Class 3 GLUT (glucose transporter) based on sequence similarity with GLUTs, but it does not transport glucose. It specifically transports myo-inositol and is expressed predominantly in the brain, with high expression in the hippocampus, hypothalamus, cerebellum and brainstem. HMIT may be involved in regulating processes that require high levels of myo-inositol or its phosphorylated derivatives, such as membrane recycling, growth cone dynamics, and synaptic vesicle exocytosis. Arabidopsis Inositol transporter 4 (AtINT4) mediates high-affinity H+ symport of myo-inositol across the plasma membrane. The HMIT-like subfamily belongs to the Glucose transporter -like (GLUT-like) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	362
340919	cd17361	MFS_STP	Plant Sugar transport protein subfamily of the Major Facilitator Superfamily of transporters. The plant Sugar transport protein (STP) subfamily includes STP1-STP14; they are also called hexose transporters. They mediate the active uptake of hexoses such as glucose, 3-O-methylglucose, fructose, xylose, mannose, galactose, fucose, 2-deoxyglucose and arabinose, by sugar/hydrogen symport. Several STP family transporters are expressed in a tissue-specific manner, or at specific developmental stages. STP1 is the member with the highest expression level of all members and high expression is detected in photosynthetic tissues, such as leaves and stems, while roots, siliques, and flowers show lower expression levels. It plays a major role in the uptake and response of Arabidopsis seeds and seedlings to sugars. The STP subfamily belongs to the Glucose transporter -like (GLUT-like) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	390
340920	cd17362	MFS_GLUT10_12_Class3_like	Glucose transporter (GLUT) types 10 and 12, Class 3 GLUTs, and similar transporters of the Major Facilitator Superfamily. This subfamily is composed of glucose transporter type 10, GLUT12, plant polyol transporters (PLTs), and similar proteins. GLUTs, also called Solute carrier family 2, facilitated glucose transporters (SLC2A), are a family of proteins that facilitate the transport of hexoses such as glucose and fructose. There are fourteen GLUTs found in humans; they display different substrate specificities and tissue expression. They have been categorized into three classes based on sequence similarity: Class 1 (GLUTs 1-4, 14); Class 2 (GLUTs 5, 7, 9, and 11); and Class 3 (GLUTs 6, 8, 10, 12, and HMIT). GLUT proteins are comprised of about 500 amino acid residues, possess a single N-linked oligosaccharide, and have 12 transmembrane segments. They belong to the Glucose transporter -like (GLUT-like) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	389
340921	cd17363	MFS_SV2	Synaptic vesicle glycoprotein 2 of the Major Facilitator Superfamily of transporters. Synaptic vesicle glycoprotein 2 (SV2) is a transporter-like integral membrane glycoprotein with 12 transmembrane regions that is expressed in vertebrates and localized to synaptic and endocrine secretory vesicles. Three isoforms have been identified, SV2A, SV2B, and SV2C. SV2A and SV2B are widely expressed in the brain, while SV2C is more restricted to evolutionarily older brain regions. SV2 isoforms have been shown to be critical for the proper function of the central nervous system. SV2 serves as the receptor for botulinum neurotoxin A (BoNT/A), one of seven neurotoxins produced by the bacterium Clostridium botulinum. BoNT/A blocks neurotransmitter release by cleaving synaptosome-associated protein of 25 kD (SNAP-25) within presynaptic nerve terminals. The SV2 family belongs to the Metazoan Synaptic Vesicle Glycoprotein 2 (SV2) and related small molecule transporter family (SV2-like) of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	474
340922	cd17364	MFS_PhT	Inorganic Phosphate Transporter of the Major Facilitator Superfamily of transporters. This subfamily is composed of predominantly fungal and plant high-affinity inorganic phosphate transporters (PhT or PiPT), which are involved in the uptake, translocation, and internal transport of inorganic phosphate. They also function in sensing external phosphate levels as transceptors. Phosphate is crucial for structural and metabolic needs, including nucleotide and lipid synthesis, signalling and chemical energy storage. The Pht subfamily belongs to the Metazoan Synaptic Vesicle Glycoprotein 2 (SV2) and related small molecule transporter family (SV2-like) of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	389
340923	cd17365	MFS_PcaK_like	4-hydroxybenzoate transporter PcaK and similar transporters of the Major Facilitator Superfamily. This aromatic acid:H(+) symporter subfamily includes Acinetobacter sp. 4-hydroxybenzoate transporter PcaK, Pseudomonas putida gallate transporter (GalT), Corynebacterium glutamicum gentisate transporter (GenK), Nocardioides sp. 1-hydroxy-2-naphthoate transporter (PhdT), Escherichia coli 3-(3-hydroxy-phenyl)propionate (3HPP) transporter (MhpT), and similar proteins. These transporters are involved in the uptake across the cytoplasmic membrane of specific aromatic compounds such as 4-hydroxybenzoate, gallate, gentisate (2,5-dihydroxybenzoate), 1-hydroxy-2-naphthoate, and 3HPP, respectively. The PcaK-like aromatic acid:H(+) symporter subfamily belongs to the Metazoan Synaptic Vesicle Glycoprotein 2 (SV2) and related small molecule transporter family (SV2-like) of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	351
340924	cd17366	MFS_ProP	Proline/betaine transporter of the Major Facilitator Superfamily of transporters. This subfamily is composed of Escherichia coli proline/betaine transporter, also called proline porter II (PPII), and similar proteins. ProP is a proton symporter that senses osmotic shifts and responds by importing osmolytes such as proline, glycine betaine, stachydrine, pipecolic acid, ectoine and taurine. It is both an osmosensor and an osmoregulator which is available to participate early in the bacterial osmoregulatory response. The ProP subfamily belongs to the Metazoan Synaptic Vesicle Glycoprotein 2 (SV2) and related small molecule transporter family (SV2-like) of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	377
340925	cd17367	MFS_KgtP	Alpha-ketoglutarate permease of the Major Facilitator Superfamily of transporters. This subfamily includes Escherichia coli alpha-ketoglutarate permease (KgtP) and similar proteins. KgtP is a constitutively expressed proton symporter that functions in the uptake of alpha-ketoglutarate across the boundary membrane. Also included is a putative transporter from Pseudomonas aeruginosa named dicarboxylic acid transporter PcaT. The KgtP subfamily belongs to the Metazoan Synaptic Vesicle Glycoprotein 2 (SV2) and related small molecule transporter family (SV2-like) of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	407
340926	cd17368	MFS_CitA	Citrate-proton symporter of the Major Facilitator Superfamily of transporters. Citrate-proton symporter, also called citrate carrier protein or citrate transporter or citrate utilization protein A (CitA), is a proton symporter that functions in the uptake of citrate across the boundary membrane. It allows the utilization of citrate as a sole source of carbon and energy. In Klebsiella pneumoniae, the gene encoding this protein is called citH, instead of citA, which is the case for Escherichia coli and other organisms. CitA belongs to the Metazoan Synaptic Vesicle Glycoprotein 2 (SV2) and related small molecule transporter family (SV2-like) of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	407
340927	cd17369	MFS_ShiA_like	Shikimate transporter and similar proteins of the Major Facilitator Superfamily. This subfamily is composed of Escherichia coli shikimate transporter (ShiA), inner membrane metabolite transport protein YhjE, and other putative metabolite transporters. ShiA is involved in the uptake of shikimate, an aromatic compound involved in siderophore biosynthesis. It has been suggested that YhjE may mediate the uptake of osmoprotectants. The ShiA-like subfamily belongs to the Metazoan Synaptic Vesicle Glycoprotein 2 (SV2) and related small molecule transporter family (SV2-like) of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	408
340928	cd17370	MFS_MJ1317_like	MJ1317 and similar transporters of the Major Facilitator Superfamily. This family is composed of Methanocaldococcus jannaschii MFS-type transporter MJ1317, Mycobacterium bovis protein Mb2288, and similar proteins. They are uncharacterized transporters belonging to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	371
340929	cd17371	MFS_MucK	Cis,cis-muconate transport protein and similar proteins of the Major Facilitator Superfamily. This subfamily is composed of Acinetobacter sp. Cis,cis-muconate transport protein (MucK), Escherichia coli putative sialic acid transporter 1, and similar proteins. MucK functions in the uptake of muconate and allows Acinetobacter calcoaceticus ADP1 (BD413) to grow on exogenous cis,cis-muconate as the sole carbon source. The MucK subfamily belongs to the Metazoan Synaptic Vesicle Glycoprotein 2 (SV2) and related small molecule transporter family (SV2-like) of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	389
340930	cd17372	MFS_SVOP_like	Synaptic vesicle 2-related protein (SVOP) and related proteins of the Major Facilitator Superfamily. This subfamily is composed of synaptic vesicle 2 (SV2)-related protein (SVOP), SVOP-like protein (SVOPL), and similar proteins. SVOP is a transporter-like nucleotide binding protein that localizes to neurotransmitter-containing vesicles. Like SV2, SVOP is expressed in all brain regions, with highest levels in cerebellum, hindbrain and pineal gland. Studies with knockout mice suggest that SVOP may perform a subtle function that is not necessary for survival under normal conditions, since mice lacking SVOP are viable, fertile, and phenotypically normal. SVOP and SVOPL share structural similarity to the solute carrier family 22 (SLC22), a large family of organic cation and anion transporters. The SVOP-like subfamily belongs to the Metazoan Synaptic Vesicle Glycoprotein 2 (SV2) and related small molecule transporter family (SV2-like) of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	367
340931	cd17373	MFS_SLC22A17_like	Solute carrier family 22, member 17 and similar proteins of the Major Facilitator Superfamily. This group is composed of Solute carrier family 22, members 17, 23, and 31. They are members of the SLC22 family of organic cation/anion/zwitterion transporters, which includes organic cation transporters (OCTs/OCTNs) and organic anion transporters (OATs). SLC22A17 functions as a cell surface receptor for lipocalin-2 (LCN2), also called NGAL or 24p3, which plays a key role in iron homeostasis and transport. SLC22A23 and SLC22A31 are orphan members of the SLC22 family. SLC22 transporters interact with a variety of compounds that include drugs of abuse, environmental toxins, opioid analgesics, antidepressant and anxiolytic agents, and neurotransmitters and their metabolites. The SLC22A17-like group belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	348
340932	cd17374	MFS_OAT	Organic anion transporters of the Major Facilitator Superfamily of transporters. Organic anion transporters (OATs) generally display broad substrate specificity and they facilitate the exchange of extracellular with intracellular organic anions (OAs). Several OATs have been characterized including OAT1-10 and urate anion exchanger 1 (URAT1, also called SLC22A12). Many OATs occur in renal proximal tubules, the site of active drug secretion. OATs mediate the absorption, distribution, and excretion of a diverse array of environmental toxins and clinically important drugs, including anti-HIV therapeutics, anti-tumor drugs, antibiotics, anti-hypertensives, and anti-inflammatories, and are therefore critical for the survival of mammalian species. OATs fall into the SLC22 (solute carrier 22) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	341
340933	cd17375	MFS_SLC22A16_CT2	Solute carrier family 22 member 16 (also called Carnitine transporter 2) of the Major Facilitator Superfamily of transporters. Solute carrier family 22 member 16 (SLC22A16) is also called carnitine transporter 2 (CT2), fly-like putative transporter 2 (FLIPT2), organic cation transporter OKB1, or organic cation/carnitine transporter 6 (OCT6). It is a partially sodium-ion dependent high affinity carnitine transporter. It also transports organic cations such as tetraethylammonium (TEA) and doxorubicin. It is one of several organic cation transporters (OCTs) that fall into the SLC22 (solute carrier 22) family. OCTs are broad-specificity transporters that play a critical role in the excretion and distribution of endogenous organic cations and for the uptake, elimination and distribution of cationic drugs, toxins, and environmental waste products. SLC22A16 belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	341
340934	cd17376	MFS_SLC22A4_5_OCTN1_2	Solute carrier family 22 members 4 and 5 (also called Organic cation/carnitine transporters 1 and 2) of the Major Facilitator Superfamily of transporters. This subfamily is composed of solute carrier family 22 members 4 (SLC22A4) and 5 (SLC22A5), and similar proteins. SLC22A4 is also called ergothioneine transporter (ETT) or organic cation/carnitine transporter 1 (OCTN1). It is a sodium-ion dependent, low affinity carnitine transporter, and a highly specific transporter for the uptake of ergothioneine (ET), a thiolated derivative of histidine with antioxidant properties. ET is a natural compound produced only by certain fungi and bacteria and must be absorbed from the diet by humans and other vertebrates. SLC22A5, also called organic cation/carnitine transporter 2 (OCTN2), is a sodium-ion dependent, high affinity carnitine transporter involved in the active cellular uptake of carnitine. SLC22A4/5 belongs to the Solute carrier 22 (SLC22) family of organic cation/anion/zwitterion transporters of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	342
340935	cd17377	MFS_SLC22A15	Solute carrier family 22 member 15 of the Major Facilitator Superfamily of transporters. Solute carrier family 22 member 15 (SLC22A15) is also called fly-like putative transporter 1 (FLIPT1). It is expressed at the highest levels in the kidney and brain. It is a member of the SLC22 family of transporters, which includes organic cation transporters (OCTs), organic zwitterion/cation transporters (OCTNs), and organic anion transporters (OATs). SLC22 transporters interact with a variety of compounds that include drugs of abuse, environmental toxins, opioid analgesics, antidepressant and anxiolytic agents, and neurotransmitters and their metabolites. SLC22A15 belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	353
340936	cd17378	MFS_OCT_plant	Plant organic cation/carnitine transporters of the Major Facilitator Superfamily of transporters. Plant organic cation/carnitine transporters (OCTs) are sequence-similar to their animal counterparts, which are broad-specificity transporters that play a critical role in the excretion and distribution of endogenous organic cations and for the uptake, elimination and distribution of cationic drugs, toxins, and environmental waste products. Little is known about plant OCTs. In Arabidopsis, there are six genes belonging to this family, and the individual genes show distinct, organ-specific expression patterns. AtOCT1 has been found to affect root development and carnitine-related responses in Arabidopsis. AtOCT4, 5 and 6 are up-regulated during drought stress, AtOCT3 and 5 during cold stress and AtOCT5 and 6 during salt stress treatments. Plant OCTs belong to the Solute carrier 22 (SLC22) family of organic cation/anion/zwitterion transporters of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	342
340937	cd17379	MFS_SLC22A1_2_3	Solute carrier family 22 members 1, 2, and 3 (also called Organic cation transporters 1, 2, and 3) of the Major Facilitator Superfamily of transporters. This subfamily includes solute carrier family 22 member 1 (SLC22A1, also called organic cation transporter 1 or OCT1), SLC22A2 (or OCT2), SLC22A3 (or OCT3), and similar proteins. OCT1-3 have similar basic functional properties: they are able to translocate a variety of structurally different organic cations in both directions across the plasma membrane; to translocate organic cations independently from sodium, chloride or proton gradients; and to function as electrogenic uniporters for cations or as electroneutral cation exchangers. They show overlapping but distinct substrate and inhibitor specificities, and different tissue expression patterns. In humans, OCT1 is strongly expressed in the liver, OCT2 is highly expressed in the kidney where it is localized at the basolateral membrane of renal proximal tubules, and OCT3 is most strongly expressed in skeletal muscle. OCTs are broad-specificity transporters that play a critical role in the excretion and distribution of endogenous organic cations and for the uptake, elimination and distribution of cationic drugs, toxins, and environmental waste products. The SLC22A1-3 subfamily belongs to the Solute carrier 22 (SLC22) family of organic cation/anion/zwitterion transporters of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	340
340938	cd17380	MFS_SLC17A9_like	Solute carrier family 17 member 9 and similar proteins of the Major Facilitator Superfamily of transporters. This subfamily includes solute carrier family 17 member 9 (SLC17A9) and similar proteins including plant inorganic phosphate transporters (PHT4) that are also probably anion transporters. SLC17A9, also called vesicular nucleotide transporter (VNUT), is involved in vesicular storage and exocytosis of ATP. It facilitates the accumulation of ATP and other nucleotides in secretory vesicles such as adrenal chromaffin granules and synaptic vesicles. It also functions as a lysosomal ATP transporter and regulates cell viability. Plant PHT4 family transporters mediate the transport of inorganic phosphate and may also transport organic anions. The Arabidopsis protein AtPHT4;4 is a chloroplast-localized ascorbate transporter. PHT4 proteins show differential expression that suggests specialized functions. The SLC17A9-like subfamily belongs to the Solute carrier 17 (SLC17) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	361
340939	cd17381	MFS_SLC17A5	Solute carrier family 17 member 5 (also called sialin) of the Major Facilitator Superfamily of transporters. Solute carrier family 17 member 5 (SLC17A5) is also called sialin, H(+)/nitrate cotransporter, H(+)/sialic acid cotransporter (AST), membrane glycoprotein HP59, or vesicular H(+)/aspartate-glutamate cotransporter. It transports glucuronic acid and free sialic acid out of the lysosome after its cleavage from sialoglycoconjugates, which is required for normal CNS myelination. It also mediates the membrane potential-dependent uptake of aspartate and glutamate into synaptic vesicles and synaptic-like microvesicles. In the plasma membrane, it functions as a nitrate transporter. Recessive mutations in the SLC17A5 gene cause the allelic disorders, Infantile sialic acid storage disease (ISSD) and Salla disease (a predominantly neurological disorder). SLC17A5 belongs to the Solute carrier 17 (SLC17) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	397
340940	cd17382	MFS_SLC17A6_7_8_VGluT	Solute carrier family 17 members 6, 7, and 8 (also called Vesicular glutamate transporters) of the Major Facilitator Superfamily of transporters. This subfamily is composed of solute carrier family 17 member 6 (SLC17A6), SLC17A7, SLC17A8, and similar proteins. SLC17A6 is also called vesicular glutamate transporter 2 (VGluT2), differentiation-associated BNPI, or differentiation-associated Na(+)-dependent inorganic phosphate cotransporter. SLC17A7 is also called VGluT1 or brain-specific Na(+)-dependent inorganic phosphate cotransporter. SLC17A8 is also called VGluT3. They mediate the uptake of glutamate into synaptic vesicles at presynaptic nerve terminals of excitatory neural cells, and may also mediate the transport of inorganic phosphate. VGluTs are also expressed and localized in various secretory vesicles in non-neuronal peripheral organelles such as hormone-containing secretory granules in endocrine cells, and thus, also act as metabolic regulators. The VGluT subfamily belongs to the Solute carrier 17 (SLC17) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	380
340941	cd17383	MFS_SLC18A3_VAChT	Vesicular acetylcholine transporter (VAChT) and similar transporters of the Major Facilitator Superfamily. Vesicular acetylcholine transporter (VAChT) is also called solute carrier family 18 member 3 (SLC18A3) in vertebrates and uncoordinated protein 17 (unc-17) in Caenorhabditis elegans. It is a glycoprotein involved in acetylcholine transport into synaptic vesicles and is responsible for the accumulation of acetylcholine into pre-synaptic vesicles of cholinergic neurons. Variants in SLC18A3 are associated with congenital myasthenic syndrome in humans. VAChT belongs to the bacterial MdtG-like and eukaryotic solute carrier 18 (SLC18) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	378
340942	cd17384	MFS_SLC18A1_2_VAT1_2	Vesicular amine transporters 1 (VAT1) and 2 (VAT2), and similar transporters of the Major Facilitator Superfamily. Vesicular amine transporter 1 (VAT1 or VMAT1) is also called solute carrier family 18 member 1 (SLC18A1) or chromaffin granule amine transporter, while VAT2 (or VMAT2) is also called SLC18A2, synaptic vesicular amine transporter, or monoamine transporter. VATs (or VMATs) are responsible for the uptake of cytosolic monoamines into synaptic vesicles in monoaminergic neurons. VAT1 and VAT2 have distinct pharmacological properties and tissue distributions. VAT1 is preferentially expressed in neuroendocrine cells and endocrine cells, where it transports biogenic monoamines, such as serotonin, from the cytoplasm into the secretory vesicles. VAT2 is primarily expressed in the CNS and is involved in the ATP-dependent vesicular transport of biogenic amine neurotransmitters including dopamine, norepinephrine, serotonin, and histamine into synaptic vesicles. VATs belong to the bacterial MdtG-like and eukaryotic solute carrier 18 (SLC18) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	373
340943	cd17385	MFS_SLC18B1	Solute carrier family 18 member B1 of the Major Facilitator Superfamily of transporters. Solute carrier family 18 member B1 (SLC18B1) is the fourth member of the SLC18 transporter family, which includes vesicular monoamine transporters and vesicular acetylcholine transporter. It is predominantly expressed in the hippocampus and is associated with vesicles in astrocytes. It actively transports spermine and spermidine by exchange of H(+), and has been suggested to be a vesicular polyamine transporter (VPAT). SLC18B1 belongs to the bacterial MdtG-like and eukaryotic solute carrier 18 (SLC18) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	390
340944	cd17386	MFS_SLC46	Solute carrier 46 (SLC46) family of the Major Facilitator Superfamily of transporters. The solute carrier 46 (SLC46) family is composed of three vertebrate members (SLC46A1, SLC46A2, and SLC46A3) and similar proteins from insects and nematodes. The best-studied member is SLC46A1, also called proton-coupled folate transporter (PCFT), which functions both as an intestinal proton-coupled high-affinity folate transporter involved in the absorption of folates and as an intestinal heme transporter which mediates heme uptake. SLC46A2, also called thymic stromal cotransporter protein (TSCOT), is a putative 12-transmembrane protein mainly expressed in the thymic cortex in a specific thymic epithelial cell (TEC) subpopulation. SLC46A3 is a lysosomal membrane protein that functions as a direct transporter of noncleavable antibody maytansine-based catabolites from the lysosome to the cytoplasm. The SLC46 family belongs to the Eukaryotic Solute carrier 46 (SLC46)/Bacterial Tetracycline resistance (TetA) -like (SLC46/TetA-like) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	360
340945	cd17387	MFS_MFSD14	Major facilitator superfamily domain-containing 14A and 14B. This subfamily is composed of major facilitator superfamily domain-containing 14A (MFSD14A) and MFSD14B, and similar proteins. MFSD14A and MFSD14B are also called hippocampus abundant transcript 1 protein (HIAT1) and hippocampus abundant transcript-like protein 1 (HIATL1), respectively. They are both ubiquitously expressed with HIAT1 highly expressed in testis and HIATL1 most abundantly expressed in skeletal muscle. Gene disruption of MFSD14A causes globozoospermia and infertility in male mice. It has been suggested that MFSD14A may transport a solute from the bloodstream that is required for spermiogenesis. The function of MFSD14B is unknown. The MFSD14 subfamily belongs to the Eukaryotic Solute carrier 46 (SLC46)/Bacterial Tetracycline resistance (TetA) -like (SLC46/TetA-like) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	410
340946	cd17388	MFS_TetA	Tetracycline resistance protein TetA and related proteins of the Major Facilitator Superfamily of transporters. This subfamily is composed of tetracycline resistance proteins similar to Escherichia coli TetA(A), TetA(B), and TetA(E), which are metal-tetracycline/H(+) antiporters that confer resistance to tetracycline by an active tetracycline efflux, which is an energy-dependent process that decreases the accumulation of the antibiotic in cells. TetA-like tetracycline resistance proteins belong to the Eukaryotic Solute carrier 46 (SLC46)/Bacterial Tetracycline resistance (TetA) -like (SLC46/TetA-like) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	385
340947	cd17389	MFS_MFSD10	Major facilitator superfamily domain-containing protein 10. Major facilitator superfamily domain-containing protein 10 (MFSD10) is also called tetracycline transporter-like protein (TETRAN). It is expressed in various human tissues, including the kidney. In cultured cells, its overexpression facilitated the uptake of organic anions such as some non-steroidal anti-inflammatory drugs (NSAIDs). MFSD10/TETRAN overexpression causes resistance to some NSAIDs, suggesting that it may be an organic anion transporter that serves as an efflux pump for some NSAIDs and various other organic anions at the final excretion step. MFSD10 belongs to the Eukaryotic Solute carrier 46 (SLC46)/Bacterial Tetracycline resistance (TetA) -like (SLC46/TetA-like) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	391
340948	cd17390	MFS_MFSD9	Major facilitator superfamily domain-containing protein 9. Major facilitator superfamily domain-containing protein 9 (MFSD9) is expressed in the central nervous system (CNS) and in most peripheral tissues but at very low expression levels. The function of MFSD9 is unknown. MFSD9 belongs to the Eukaryotic Solute carrier 46 (SLC46)/Bacterial Tetracycline resistance (TetA) -like (SLC46/TetA-like) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	350
340949	cd17391	MFS_MdtG_MDR_like	Multidrug resistance protein MdtG and similar multidrug resistance (MDR) transporters of the Major Facilitator Superfamily. This subfamily is composed of Escherichia coli multidrug resistance protein MdtG, Streptococcus pneumoniae multidrug resistance efflux pump PmrA, and similar multidrug resistance (MDR) transporters from bacteria. MDR transporters are drug/H+ antiporters (DHA) that mediate the efflux of a variety of drugs and toxic compounds, and confer resistance to these compounds. MdtG confers resistance to fosfomycin and deoxycholate. PmrA serves as an efflux pump for various substrates and is associated with fluoroquinolone resistance. MdtG-like MDR transporters belong to the bacterial MdtG-like and eukaryotic solute carrier 18 (SLC18) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	380
340950	cd17392	MFS_MFSD2	Major facilitator superfamily domain-containing protein 2 subfamily. The major facilitator superfamily domain-containing protein 2 (MFSD2) subfamily is composed of two vertebrate members, MFSD2A and MFSD2B. MFSD2A is more commonly called sodium-dependent lysophosphatidylcholine symporter 1 (NLS1). It is an LPC symporter that plays an essential role in blood-brain barrier formation and function. Inactivating mutations in MFSD2A cause a lethal microcephaly syndrome. MFSD2B is a potential risk or protective factor in the prognosis of lung adenocarcinoma. The MFSD2 subfamily belongs to the Salmonella enterica Na+/melibiose symporter like (MelB-like) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	446
340951	cd17393	MFS_MosC_like	Membrane protein MosC and similar proteins of the Major Facilitator Superfamily of transporters. The gene encoding Sinorhizobium meliloti membrane protein MosC is part of the mos locus, which encodes the biosynthesis of the rhizopine 3-O-methyl-scyllo-inosamine. MosC belongs to the bacterial fucose permease, eukaryotic Major facilitator superfamily domain-containing protein 4 (FucP/MFSD4) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	373
340952	cd17394	MFS_FucP_like	Fucose permease and similar proteins of the Major Facilitator Superfamily of transporters. This subfamily is composed of L-fucose permease (also called L-fucose-proton symporter) and similar proteins such as glucose/galactose transporter and N-acetyl glucosamine transporter NagP. L-fucose permease facilitates the uptake of L-fucose across the boundary membrane with the concomitant transport of protons into the cell; it can also transport L-galactose and D-arabinose. Glucose/galactose transporter functions in the uptake of glucose and galactose. The FucP-like subfamily belongs to the bacterial fucose permease, eukaryotic Major facilitator superfamily domain-containing protein 4 (FucP/MFSD4) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	401
340953	cd17395	MFS_MFSD4	Major facilitator superfamily domain-containing protein 4. The Major facilitator superfamily domain-containing protein 4 (MFSD4) subfamily consists of two vertebrate members: MFSD4A and MFSD4B. The function of MFSD4A is unknown. MFSD4B is more commonly known as sodium-dependent glucose transporter 1 (NaGLT1), a primary fructose transporter in rat renal brush-border membranes that also facilitates sodium-independent urea uptake. The MFSD4 subfamily belongs to the bacterial fucose permease, eukaryotic Major facilitator superfamily domain-containing protein 4 (FucP/MFSD4) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	367
340954	cd17396	MFS_YdiM_like	Inner membrane transport protein YdiM and similar proteins of the Major Facilitator Superfamily of transporters. This subfamily contains Escherichia coli inner membrane transport proteins YdiM and YdiN, which belong to the bacterial fucose permease, eukaryotic Major facilitator superfamily domain-containing protein 4 (FucP/MFSD4) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	384
340955	cd17397	MFS_DIRC2	Disrupted in renal carcinoma protein 2 of the Major Facilitator Superfamily of transporters. Disrupted in renal carcinoma protein 2 or disrupted in renal cancer protein 2 (DIRC2), encoded by the SLC49A4 gene, was initially identified as a breakpoint-spanning gene in a chromosomal translocation associated with the development of renal cancer. It is an electrogenic lysosomal metabolite transporter that is regulated by limited proteolytic processing by cathepsin L. DIRC2 belongs to the Solute carrier 49 (SLC49) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	381
340956	cd17398	MFS_FLVCR_like	Feline leukemia virus subgroup C receptor subfamily of the Major Facilitator Superfamily of transporters. The Feline leukemia virus subgroup C receptor (FLVCR) subfamily is conserved in metazoans and is composed of two vertebrate members, FLVCR1 and FLVCR2. FLVCR1 is a heme transporter and it has two isoforms: 1 (or FLVCR1a), which exports cytoplasmic heme as well as coproporphyrin and protoporphyrin IX; and 2 (FLVCR1b), which promotes heme efflux from the mitochondrion to the cytoplasm. FLVCR2 functions as a heme importer as well as a transporter for a calcium-chelator complex that is important for growth and calcium metabolism. The FLVCR subfamily belongs to the Solute carrier 49 (SLC49) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	406
340957	cd17399	MFS_MFSD7	Major facilitator superfamily domain-containing protein 7. Major facilitator superfamily domain-containing protein 7 (MFSD7) is also called myosin light polypeptide 5 regulatory protein (MYL5). Its function is unknown. It is encoded by the SLC49A3 gene and is a member of the Solute carrier 49 (SLC49) family, which also includes feline leukemia virus subgroup C receptor 1 (FLVCR1, SLC49A1), FLVCR2 (SLC49A2), as well as disrupted in renal carcinoma protein 2 (DIRC2, SLC49A4). FLVCR1 and FLVCR2 are heme transporters. DIRC2 is an electrogenic lysosomal metabolite transporter that is regulated by limited proteolytic processing by cathepsin L. MFSD7 belongs to the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	419
340958	cd17400	MFS_SLCO1_OATP1	Solute carrier organic anion transporter 1 family of the Major Facilitator Superfamily of transporters. The Solute carrier organic anion transporter 1 (SLCO1) or Organic anion transporting polypeptide 1 (OATP1) family contains three subfamilies: OATP1A, OATP1B, and OATP1C. OATP1A contains one human member, OATP1A2, which shows a broad spectrum of substrates including endogenous compounds (such as bile acids, steroid hormones and their conjugates, thyroid hormones) and various drugs (such as fexofenadine, ouabain and the cyanobacterial toxin microcystin). OATP1B contains two human proteins, OATP1B1 and OATP1B3, which can both accept a wide variety of structurally-unrelated compounds as substrates including clinically-important drugs such as hydroxymethylglutaryl (HMG)-CoA reductase inhibitors (statins), angiotensin II receptor blockers (sartans), angiotensin converting enzyme (ACE) inhibitors, and anti-diabetes drugs (glinides). OATP1C contains one mammalian member, OATP1C1, which is also called thyroxine transporter. It mediates the high affinity transport of the thyroid hormones, T4 (3,5,3',5'tetraiodo-L-thyronine or thyroxine), rT3 (3,3'5'-triiodo-L-thyronine), and T3 (3,5,3'tri-iodo-L-thyronine or triiodothyronine). The SLCO1/OATP1 family belongs to the Solute carrier organic anion transporter [SLCO, also called organic anion transporting polypeptides (OATPs) or Solute carrier family 21] family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	436
340959	cd17401	MFS_SLCO2_OATP2	Solute carrier organic anion transporter 2 family of the Major Facilitator Superfamily of transporters. The Solute carrier organic anion transporter 2 (SLCO2) or Organic anion transporting polypeptide 2 (OATP2) family contains two subfamilies: OATP2A and OATP2B, each containing one mammalian member, OATP2A1 and OATP2B1, respectively. OATP2A1 (encoded by SLCO2A1) is a lactate/prostaglandin anion exchanger that mediates the release of newly synthesized prostaglandins (PGD2, PGE1, PGE2, PGF2A and PGI2) from cells, the transepithelial transport of prostaglandins, and the clearance of prostaglandins from the circulation. OATP2B1 (encoded by SLCO2B1) mediates the Na(+)-independent transport of various organic anions such as taurocholate, the prostaglandins PGD2, PGE1, PGE2, leukotriene C4, thromboxane B2 and iloprost, as well as endogenous sex steroid conjugates such as dehydroepiandrosterone sulfate (DHEAS). The SLCO2/OATP2 family belongs to the Solute carrier organic anion transporter [SLCO, also called organic anion transporting polypeptides (OATPs) or Solute carrier family 21] family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	440
340960	cd17402	MFS_SLCO3_OATP3	Solute carrier organic anion transporter 3 family of the Major Facilitator Superfamily of transporters. The Solute carrier organic anion transporter 3 (SLCO3) or Organic anion transporting polypeptide 3 (OATP3) family contains only one subfamily, OATP3A, which contains only one mammalian member OATP3A1 (encoded by SLCO3A1). It mediates the Na(+)-independent transport of organic anions such as estrone-3-sulfate, prostaglandins (PG) E1 and E2, thyroxine (T4), deltorphin II, BQ-123, and vasopressin. SLCO3A1 has been identified as a Crohn's disease (CD)-associated gene, which mediates inflammatory processes in intestinal epithelial cells through NF-kappaB transcription activation, resulting in a higher incidence of bowel perforation in CD patients. The SLCO3/OATP3 family belongs to the Solute carrier organic anion transporter [SLCO, also called organic anion transporting polypeptides (OATPs) or Solute carrier family 21] family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	444
340961	cd17403	MFS_SLCO4_OATP4	Solute carrier organic anion transporter 4 family of the Major Facilitator Superfamily of transporters. The Solute carrier organic anion transporter 4 (SLCO4) or Organic anion transporting polypeptide 4 (OATP4) family contains two subfamilies: OATP4A and OATP4C, each containing one mammalian member, OATP4A1 and OATP4C1, respectively. OATP4A1 (encoded by SLCO4A1) is ubiquitously expressed and mediates the Na(+)-independent transport of the thyroid hormones T3 (triiodo-L-thyronine), T4 (thyroxine) and rT3, and other organic anions such as estrone sulfate and taurocholate. OATP4C1 (encoded by SLCO4C1) is capable of transporting pharmacological substances such as digoxin, ouabain, thyroxine, methotrexate, cAMP, and uremic toxins, which accumulate in patients with chronic kidney diseases (CKDs). The SLCO4/OATP4 family belongs to the Solute carrier organic anion transporter [SLCO, also called organic anion transporting polypeptides (OATPs) or Solute carrier family 21] family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	420
340962	cd17404	MFS_SLCO5_OATP5	Solute carrier organic anion transporter 5 family of the Major Facilitator Superfamily of transporters. The Solute carrier organic anion transporter 5 (SLCO5) or Organic anion transporting polypeptide 5 (OATP5) family contains only one subfamily, OATP5A, which contains only one mammalian member OATP5A1 (encoded by SLCO5A1). Deletion of the SLCO5A1 gene has been implicated in the pathogenesis of Mesomelia-synostoses syndrome (MSS), a rare autosomal-dominant disorder characterized by mesomelic limb shortening, acral synostoses, and multiple congenital malformations. OATP5A1 may be a non-classical OATP which is involved in biological processes that require the reorganization of the cell shape, such as differentiation and migration. It seems to affect intracellular transport of drugs and may participate in chemoresistance of small cell lung cancer (SCLC) by sequestration, rather than mediating cellular uptake. The SLCO5/OATP5 family belongs to the Solute carrier organic anion transporter [SLCO, also called organic anion transporting polypeptides (OATPs) or Solute carrier family 21] family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	425
340963	cd17405	MFS_SLCO6_OATP6	Solute carrier organic anion transporter 6 family of the Major Facilitator Superfamily of transporters. The Solute carrier organic anion transporter 6 (SLCO6) or Organic anion transporting polypeptide 6 (OATP6) family contains only one subfamily, OATP6A, which contains only one human member OATP6A1 (encoded by SLCO6A1). The OATP6 family is the most diverged of the OATPs. OATP6A1 is also called cancer/testis antigen 48 (CT48) or gonad-specific transporter. It is strongly expressed only in normal testis, and weakly in spleen, brain, fetal brain, and placenta. It is found in tumor samples (lung, bladder, and esophageal) and cancer cell lines (lung), and may be of potential use as a target for therapy for a variety of tumor types. The SLCO6/OATP6 family belongs to the Solute carrier organic anion transporter [SLCO, also called organic anion transporting polypeptides (OATPs) or Solute carrier family 21] family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	428
340964	cd17406	MFS_unc93A_like	Protein unc-93 homolog A and similar proteins of the Major Facilitator Superfamily of transporters. This subfamily is composed of Caenorhabditis elegans Uncoordinated protein 93 (also called putative potassium channel regulatory protein unc-93), human protein unc-93 homolog A (HmUnc-93A or UNC93A), and similar proteins. Unc-93 acts as a regulatory subunit of a multi-subunit potassium channel complex that may function in coordinating muscle contraction in C. elegans. The human UNC93A gene is located in a region of the genome that is frequently associated with ovarian cancer; however, there is no evidence that UNC93A has a tumor suppressor function. This unc93A-like subfamily belongs to the Unc-93 family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	390
340965	cd17407	MFS_MFSD11	UNC93-like Major facilitator superfamily domain-containing protein 11. This group is composed of UNC93-like protein MFSD11 (also called major facilitator superfamily domain-containing protein 11 or protein ET) and similar proteins, most of which are uncharacterized. MFSD11 is ubiquitously expressed in the periphery and the central nervous system of mice, where it is expressed in excitatory and inhibitory mouse brain neurons. Its expression is affected by altered energy homeostasis, suggesting plausible involvement in the energy regulation. MFSD11 belongs to the Unc-93 family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	387
340966	cd17408	MFS_unc93B1	Protein unc-93 homolog B1 of the Major Facilitator Superfamily of transporters. Protein unc-93 homolog B1 (UNC93B1) controls intracellular trafficking and transport of a subset of Toll-like receptors (TLRs), including TLR3, TLR7 and TLR9, from the endoplasmic reticulum to endolysosomes where they can engage pathogen nucleotides and activate signaling cascades. It regulates differential transport of TLR7 and TLR9 into signaling endosomes to prevent autoimmunity. UNC93B1 belongs to the Unc-93 family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	456
340967	cd17409	MFS_NIMT_like	2-nitroimidazole transporter and similar proteins of the Major Facilitator Superfamily of transporters. This subfamily is composed of Escherichia coli 2-nitroimidazole transporter (NIMT), also called YeaN, and similar proteins. NIMT confers resistance to 2-nitroimidazole, the antibacterial and antifungal antibiotic, by mediating the active efflux of this compound. The NIMT-like subfamily belongs to the 2-nitroimidazole and cyanate transporters like (NIMT/CynX-like) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	374
340968	cd17410	MFS_CynX_like	Cyanate transport protein CynX and similar proteins of the Major Facilitator Superfamily of transporters. This subfamily is composed of Escherichia coli cyanate transport protein CynX and similar proteins. CynX is part of an active transport system that transports exogenous cyanate into E. coli cells. The gene encoding CynX is part of the cyn operon that also includes cynS, encoding cynase, which catalyzes the reaction of cyanate with bicarbonate to give ammonia and carbon dioxide, and cynT, which encodes a carbonic anhydrase. The CynX-like subfamily belongs to the 2-nitroimidazole and cyanate transporters like (NIMT/CynX-like) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	372
340969	cd17411	MFS_SLC15A2	Solute carrier family 15 member 2 of the Major Facilitator Superfamily of transporters. Solute carrier family 15 member 2 (SLC15A2), also called peptide transporter 2 (PepT2), is a member of the proton-coupled oligopeptide transporter (POT) family of integral membrane proteins that mediate the cellular uptake of di/tripeptides and peptide-like drugs. SLC15A2, as well as SLC15A1, mediate the proton-coupled active transport of a broad range of dipeptides and tripeptides, including zwitterionic, anionic and cationic peptides, as well as a variety of peptide-like drugs such as cefadroxil, enalapril, and valacyclovir. SLC15A2 is a high-affinity transporter and is abundantly expressed in the apical membrane of kidney proximal tubules and choroid plexus epithelial cells. It is the major transporter involved in the reclamation of peptide-bound amino acids and peptide-like drugs in the kidney, and is also called the renal isoform. In choroid plexus and the brain, it acts as an efflux transporter and plays a role in regulating peptide/neuropeptide homeostasis. SLC15A2/PepT2 belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to  function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	403
340970	cd17412	MFS_SLC15A1	Solute Carrier family 15 member 1 of the Major Facilitator Superfamily of transporters. Solute carrier family 15 member 1 (SLC15A1), also called peptide transporter 1 (PepT1), is a member of the proton-coupled oligopeptide transporter (POT) family of integral membrane proteins that mediate the cellular uptake of di/tripeptides and peptide-like drugs. SLC15A1, as well as SLC15A2, mediate the proton-coupled active transport of a broad range of dipeptides and tripeptides, including zwitterionic, anionic and cationic peptides, as well as a variety of peptide-like drugs such as cefadroxil, enalapril, and valacyclovir. SLC15A1 is primarily expressed in the brush border membranes of enterocytes of the small intestine and is also known as the intestinal isoform. It is a high-capacity/low-affinity transporter that drives the transport of di- and tripeptides for metabolic purposes. Its expression is upregulated in the colon during chronic inflammation associated with inflammatory bowel disease. SLC15A1/PepT1 belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	415
340971	cd17413	MFS_NPF6	NRT1/PTR family (NPF), subfamily 6 of the Major Facilitator Superfamily of transporters. The plant Nitrate transporter 1/Peptide transporter (NRT1/PTR) family (NPF) is related to the POT (proton-coupled oligopeptide transporter), Peptide transporter (PepT/PTR), or Solute Carrier 15 (SLC15) family in animals. In contrast to related animal and bacterial counterparts, the plant proteins transport a wide variety of substrates including nitrate, peptides, amino acids, dicarboxylates, glucosinolates, as well as the plant hormones indole-3-acetic acid (IAA) and abscisic acid (ABA). A recent study identified eight subfamilies within this family, named NPF1-NPF8. NPF6 includes the first identified member of the NRT1/PTR family: Arabidopsis thaliana NRT1.1, now called AtNPF6.3. It is a dual affinity nitrate influx transporter and a nitrate sensor. It also transports auxin and has nitrate efflux activity. NPF6 belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to  function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	457
340972	cd17414	MFS_NPF4	NRT1/PTR family (NPF), subfamily 4 of the Major Facilitator Superfamily of transporters. The plant Nitrate transporter/Peptide transporter (NRT1/PTR) family (NPF) is related to the POT (proton-coupled oligopeptide transporter), Peptide transporter (PepT/PTR), or Solute Carrier 15 (SLC15) family in animals. In contrast to related animal and bacterial counterparts, the plant proteins transport a wide variety of substrates including nitrate, peptides, amino acids, dicarboxylates, glucosinolates, as well as the plant hormones indole-3-acetic acid (IAA) and abscisic acid (ABA). A recent study identified eight subfamilies within this family, named NPF1-NPF8. Members of the NPF4 subfamily have been shown to transport ABA. NPF4 belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to  function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	456
340973	cd17415	MFS_NPF3	NRT1/PTR family (NPF), subfamily 3 of the Major Facilitator Superfamily of transporters. The plant Nitrate transporter/Peptide transporter (NRT1/PTR) family (NPF) is related to the POT (proton-coupled oligopeptide transporter), Peptide transporter (PepT/PTR), or Solute Carrier 15 (SLC15) family in animals. In contrast to related animal and bacterial counterparts, the plant proteins transport a wide variety of substrates including nitrate, peptides, amino acids, dicarboxylates, glucosinolates, as well as the plant hormones indole-3-acetic acid (IAA) and abscisic acid (ABA). A recent study identified eight subfamilies within this family, named NPF1-NPF8. NPF3 is the smallest NPF subfamily and it includes Cucumis sativus nitrite transporter (CsNitr1), now named CsNPF3.2. It functions as a chloroplast nitrite uptake transporter to remove toxic nitrite from the cytosol. NPF3 belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to  function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	448
340974	cd17416	MFS_NPF1_2	NRT1/PTR family (NPF), subfamily 1 and 2 of the Major Facilitator Superfamily of transporters. The plant Nitrate transporter/Peptide transporter (NRT1/PTR) family (NPF) is related to the POT (proton-coupled oligopeptide transporter), Peptide transporter (PepT/PTR), or Solute Carrier 15 (SLC15) family in animals. In contrast to related animal and bacterial counterparts, the plant proteins transport a wide variety of substrates including nitrate, peptides, amino acids, dicarboxylates, glucosinolates, as well as the plant hormones indole-3-acetic acid (IAA) and abscisic acid (ABA). A recent study identified eight subfamilies within this family, named NPF1-NPF8. NPF1 includes Medicago truncatula LATD/NIP, now named MtNPF1.7, which is a high-affinity nitrate transporter and is involved in nodulation and root architecture. NPF2 members are well-established nitrate and glucosinolate transporters, including Arabidopsis nitrate influx and efflux transporters with varied tissue and developmental specificity. Examples are AtNPF2.7, which is expressed in the cortex of mature roots, and AtNPF2.9, which is expressed in root companion cells where it is involved in phloem loading. NPF1/2 belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to  function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	444
340975	cd17417	MFS_NPF5	NRT1/PTR family (NPF), subfamily 5 of the Major Facilitator Superfamily of transporters. The plant Nitrate transporter/Peptide transporter (NRT1/PTR) family (NPF) is related to the POT (proton-coupled oligopeptide transporter), Peptide transporter (PepT/PTR), or Solute Carrier 15 (SLC15) family in animals. In contrast to related animal and bacterial counterparts, the plant proteins transport a wide variety of substrates including nitrate, peptides, amino acids, dicarboxylates, glucosinolates, as well as the plant hormones indole-3-acetic acid (IAA) and abscisic acid (ABA). A recent study identified eight subfamilies within this family, named NPF1-NPF8. NPF5 includes Arabidopsis thaliana PTR3 (AtPTR3, now named AtNPF5.2), which is a wound-induced peptide transporter that is necessary for defense against virulent bacterial pathogens. NPF5 belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to  function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	452
340976	cd17418	MFS_NPF8	NRT1/PTR family (NPF), subfamily 8 of the Major Facilitator Superfamily of transporters. The plant Nitrate transporter/Peptide transporter (NRT1/PTR) family (NPF) is related to the POT (proton-coupled oligopeptide transporter), Peptide transporter (PepT/PTR), or Solute Carrier 15 (SLC15) family in animals. In contrast to related animal and bacterial counterparts, the plant proteins transport a wide variety of substrates including nitrate, peptides, amino acids, dicarboxylates, glucosinolates, as well as the plant hormones indole-3-acetic acid (IAA) and abscisic acid (ABA). A recent study identified eight subfamilies within this family, named NPF1-NPF8. NPF8 contains the Arabidopsis dipeptide transporters AtNPF8.1 (PTR1), AtNPF8.2 (PTR5), and AtNPF8.3 (PTR2), as well as tonoplast-localized transporters AtNPF8.4 (PTR4) and AtNPF8.5 (PTR6). Oryza sativa NRT1 (now called OsNPF8.9) is a low-affinity nitrate transporter. NPF8 belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to  function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	447
340977	cd17419	MFS_NPF7	NRT1/PTR family (NPF), subfamily 7 of the Major Facilitator Superfamily of transporters. The plant Nitrate transporter/Peptide transporter (NRT1/PTR) family (NPF) is related to the POT (proton-coupled oligopeptide transporter), Peptide transporter (PepT/PTR), or Solute Carrier 15 (SLC15) family in animals. In contrast to related animal and bacterial counterparts, the plant proteins transport a wide variety of substrates including nitrate, peptides, amino acids, dicarboxylates, glucosinolates, as well as the plant hormones indole-3-acetic acid (IAA) and abscisic acid (ABA). A recent study identified eight subfamilies within this family, named NPF1-NPF8. NPF7 includes the nitrate transporters AtNPF7.2 and AtNPF7.3, as well as the  dipeptide transporter OsNPF7.3. AtNPF7.3 is a bidirectional transporter involved in nitrate influx and efflux. NPF7 belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to  function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	447
340978	cd17420	MFS_MCT8_10	Monocarboxylate transporters 8 and 10, and similar proteins of the Major Facilitator Superfamily of transporters. Monocarboxylate transporters 8 (MCT8) and 10 (MCT10) are transporters which stimulate the cellular uptake of thyroid hormones such as thyroxine (T4), triiodothyronine (T3), reverse triiodothyronine (rT3) and diiodothyronine (T2). MCT has a preference for T3 and is also a sodium-independent transporter that mediates the uptake or efflux of aromatic acids such as Phe, Tyr, and Trp, as well as L-3,4-di-hydroxy-phenylalanine. MCT8/10 and similar proteins belong to the Monocarboxylate transporter (MCT) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	400
340979	cd17421	MFS_MCT5	Monocarboxylate transporter 5 of the Major Facilitator Superfamily of transporters. Monocarboxylate transporter 5 (MCT5) is also called Solute carrier family 16 member 4 (SLC16A4). It is an orphan transporter expressed in the brain, muscle, liver, kidney, lung, ovary, placenta, and heart. It is a member of the monocarboxylate transporter (MCT) family, whose members include MCT1-4, which are proton-coupled transporters that facilitate the transport across the plasma membrane of monocarboxylates such as lactate, pyruvate, branched-chain oxo acids derived from leucine, valine and isoleucine, and ketone bodies such as acetoacetate, beta-hydroxybutyrate and acetate. MCT5 belongs to the Monocarboxylate transporter (MCT) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to  function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	369
340980	cd17422	MFS_MCT7	Monocarboxylate transporter 7 of the Major Facilitator Superfamily of transporters. Monocarboxylate transporter 7 (MCT7) is also called Solute carrier family 16 member 6 (SLC16A6). Zebrafish MCT7 is required for hepatocyte secretion of ketone bodies during fasting; it has been shown to be a selective transporter of the major ketone body beta-hydroxybutyrate, whose abundance is increased during fasting. MCT7 is expressed in the brain, pancreas, muscle, and prostate. It belongs to the Monocarboxylate transporter (MCT) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to  function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	363
340981	cd17423	MFS_MCT11_13	Monocarboxylate transporters 11 and 13 of the Major Facilitator Superfamily of transporters. Monocarboxylate transporters 11 (MCT11) and 13 (MCT13) are also called Solute carrier family 16 members 11 (SLC16A11) and 13 (SLC16A13), respectively. They are orphan transporters whose substrates are yet to be determined. MCT11 is expressed in skin, lung, ovary, breast, pancreas, retinal pigment epithelium, and choroid plexus. Genetic variants in SLC16A11, the gene encoding MCT11, are associated with type 2 diabetes in Mexican and other Latin American populations. MCT13 is expressed in breast and bone marrow stem cells. MCT11/13 belongs to the Monocarboxylate transporter (MCT) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	383
340982	cd17424	MFS_MCT12	Monocarboxylate transporter 12 of the Major Facilitator Superfamily of transporters. Monocarboxylate transporter 12 (MCT12) is also called Solute carrier family 16 member 12 (SLC16A12). It is a creatine transporter encoded by the cataract and glucosuria associated gene SLC16A12. A heterozygous mutation of the gene causes a syndrome with juvenile cataracts, microcornea, and glucosuria. MCT12 may function in a basolateral exit pathway for creatine in the proximal tubule. It belongs to the Monocarboxylate transporter (MCT) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	363
340983	cd17425	MFS_MCT6	Monocarboxylate transporter 6 of the Major Facilitator Superfamily of transporters. Monocarboxylate transporter 6 (MCT6) is also called Solute carrier family 16 member 5 (SLC16A5). MCT6 has been shown to transport bumetanide, nateglinide, probenecid, and prostaglandin F2a, but not L-lactic acid, in a pH- and membrane potential-dependent manner. It may be involved in the disposition and absorption of various drugs. MCT6 is expressed in the kidney, muscle, brain, heart, pancreas, prostate, lung, and placenta. It belongs to the Monocarboxylate transporter (MCT) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to  function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	364
340984	cd17426	MFS_MCT1	Monocarboxylate transporter 1 of the Major Facilitator Superfamily of transporters. Monocarboxylate transporter 1 (MCT1) is also called Solute carrier family 16 member 1 (SLC16A1). It is a proton-coupled transporter that facilitates the transport across the plasma membrane of monocarboxylates such as lactate, pyruvate, branched-chain oxo acids derived from leucine, valine and isoleucine, and ketone bodies such as acetoacetate, beta-hydroxybutyrate and acetate. It is widely expressed in many tissues; its main function is to transport lactate into the cell. MCT1 deficiency has been identified as a cause of profound ketoacidosis, a potentially lethal condition caused by the imbalance between hepatic production and extrahepatic utilization of ketone bodies. This suggests that MCT1-mediated ketone-body transport is crucial in maintaining acid-base balance. MCT1 belongs to the Monocarboxylate transporter (MCT) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	374
340985	cd17427	MFS_MCT2	Monocarboxylate transporter 2 of the Major Facilitator Superfamily of transporters. Monocarboxylate transporter 2 (MCT2) is also called Solute carrier family 16 member 7 (SLC16A7). It is a proton-coupled transporter that facilitates the transport across the plasma membrane of monocarboxylates such as lactate, pyruvate, branched-chain oxo acids derived from leucine, valine and isoleucine, and ketone bodies such as acetoacetate, beta-hydroxybutyrate and acetate. It transports pyruvate and lactate outside and inside of sperm and plays roles in the regulation of spermatogenesis. Genetic variation in MCT2 has functional and clinical relevance with male infertility. MCT2 is consistently overexpressed in prostate cancer (PCa) cells and its location at peroxisomes is associated with malignant transformation. MCT2 belongs to the Monocarboxylate transporter (MCT) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to  function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	367
340986	cd17428	MFS_MCT9	Monocarboxylate transporter 9 of the Major Facilitator Superfamily of transporters. Monocarboxylate transporter 9 (MCT9) is also called Solute carrier family 16 member 9 (SLC16A9). It is an orphan transporter that is expressed in a number of tissues including intestine and kidney. A missense variant of MCT9 (K258T) is associated with significant increase in susceptibility to renal overload (ROL) gout with intestinal urate underexcretion. This suggests that MCT9 may have a role in intestinal urate excretion; it is possible that it transports urate. MCT9 belongs to the Monocarboxylate transporter (MCT) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to  function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	361
340987	cd17429	MFS_MCT14	Monocarboxylate transporter 14 of the Major Facilitator Superfamily of transporters. Monocarboxylate transporter 14 (MCT14) is also called Solute carrier family 16 member 14 (SLC16A14). It is an orphan transporter expressed in the brain, heart, muscle, ovary, prostate, breast, lung, pancreas, liver, spleen, and thymus. It may function as a neuronal aromatic-amino-acid transporter. MCT14 belongs to the Monocarboxylate transporter (MCT) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to  function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	361
340988	cd17430	MFS_MCT3_4	Monocarboxylate transporters 3 and 4, and similar proteins of the Major Facilitator Superfamily of transporters. Monocarboxylate transporters 3 (MCT3) and 4 (MCT4) are also called Solute carrier family 16 members 8 (SLC16A8) and 3 (SLC16A3), respectively. They are proton-coupled transporters that facilitate the transport across the plasma membrane of monocarboxylates such as lactate, pyruvate, branched-chain oxo acids derived from leucine, valine and isoleucine, and ketone bodies such as acetoacetate, beta-hydroxybutyrate and acetate. MCT3 is preferentially expressed in the basolateral membrane of the retinal pigment epithelium and plays a role in pH and ion homeostasis of the outer retina by facilitating the transport of lactate and H(+) out of the retina. Mice deficient in MCT3 display altered visual function. MCT4 is highly expressed in tissues dependent on glycolysis, and it plays an important role in lactate efflux from cells. MCT4 is expressed in neurons and astrocytes; it has been found to play a role in the neuroprotective mechanism of ischemic preconditioning in animals (in the gerbil) with transient cerebral ischemia. Increased MCT4 expression has also been correlated with worse prognosis across many cancer types. MCT3/4 belongs to the Monocarboxylate transporter (MCT) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	368
340989	cd17431	MFS_GLUT_Class1	Class 1 Glucose transporters (GLUTs) of the Major Facilitator Superfamily. GLUTs, also called Solute carrier family 2, facilitated glucose transporters (SLC2A), are a family of proteins that facilitate the transport of hexoses such as glucose and fructose. There are fourteen GLUTs found in humans; they display different substrate specificities and tissue expression. They have been categorized into three classes based on sequence similarity: Class 1 (GLUTs 1-4, 14); Class 2 (GLUTs 5, 7, 9, and 11); and Class 3 (GLUTs 6, 8, 10, 12, and HMIT). GLUTs 1-4 are well-established as glucose and/or fructose transporters in various tissues and cell types. GLUT1, also called solute carrier family 2, facilitated glucose transporter member 1 (SLC2A1), displays broad substrate specificity and can transport a wide range of pentoses and hexoses including glucose, galactose, mannose, and glucosamine. It is found in the brain, erythrocytes, and in many fetal tissues. GLUT2 (or SLC2A2) is found in the liver, islet of Langerhans, intestine, and kidney, and is the isoform that likely mediates the bidirectional transfer of glucose across the plasma membrane of hepatocytes and is responsible for uptake of glucose by beta cells. GLUT3 (or SLC2A3) is found in the brain and can mediate the uptake of glucose, 2-deoxyglucose, galactose, mannose, xylose and fucose, and dehydroascorbate. GLUT4 (or SLC2A4) is an insulin-regulated facilitative glucose transporter found in adipose tissues, and in skeletal and cardiac muscle. GLUT14 (or SLC2A14) is an orphan transporter expressed mainly in the testis. GLUT proteins are comprised of about 500 amino acid residues, possess a single N-linked oligosaccharide, and have 12 transmembrane segments. They belong to the Glucose transporter-like (GLUT-like) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	445
340990	cd17432	MFS_GLUT_Class2	Class 2 Glucose transporters (GLUTs) of the Major Facilitator Superfamily. GLUTs, also called Solute carrier family 2, facilitated glucose transporters (SLC2A), are a family of proteins that facilitate the transport of hexoses such as glucose and fructose. There are fourteen GLUTs found in humans; they display different substrate specificities and tissue expression. They have been categorized into three classes based on sequence similarity: Class 1 (GLUTs 1-4, 14); Class 2 (GLUTs 5, 7, 9, and 11); and Class 3 (GLUTs 6, 8, 10, 12, and HMIT). GLUT5, also called Solute carrier family 2, facilitated glucose transporter member 5 (SLC2A5), is a well-established fructose transporter found in the small intestine. GLUT7 (or SLC2A7) is a high-affinity glucose and fructose transporter expressed in the small intestine and colon. GLUT9 (or SLC2A9) transports urate and fructose, and is most strongly expressed in the basolateral membranes of proximal renal tubular cells, liver and placenta. It may play a role in urate reabsorption by proximal tubules. GLUT11 (or SLC2A11) is a facilitative glucose transporter expressed in heart and skeletal muscle. GLUT proteins are comprised of about 500 amino acid residues, possess a single N-linked oligosaccharide, and have 12 transmembrane segments. They belong to the Glucose transporter -like (GLUT-like) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	452
340991	cd17433	MFS_GLUT8_Class3	Glucose transporter type 8, a Class 3 GLUT, of the Major Facilitator Superfamily of transporters. Glucose transporter type 8 (GLUT8) is also called Solute carrier family 2, facilitated glucose transporter member 8 (SLC2A8) or glucose transporter type X1 (GLUTX1). It is classified as a Class 3 GLUT protein and is an insulin-regulated facilitative glucose transporter predominantly expressed in testis and brain. It can also transport fructose and galactose. SLC2A8 knockout mice were viable, developed normally, and displayed only a very mild phenotype, including mild alterations in the brain (increased proliferation of hippocampal neurons), heart (impaired transmission of electrical wave through the atrium), and sperm cells (reduced number of motile sperm cells). GLUT proteins are comprised of about 500 amino acid residues, possess a single N-linked oligosaccharide, and have 12 transmembrane segments. They belong to the Glucose transporter -like (GLUT-like) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	416
340992	cd17434	MFS_GLUT6_Class3	Glucose transporter type 6, a Class 3 GLUT, of the Major Facilitator Superfamily of transporters. Glucose transporter type 6 (GLUT6) is also called Solute carrier family 2, facilitated glucose transporter member 6 (SLC2A6). It is classified as a Class 3 GLUT protein, and is a facilitative glucose transporter that binds cytochalasin B with low affinity. It is found in the brain, spleen, and leucocytes. GLUT6 may function in oxalate secretion. SLC2A6 has been identified as an oxalate nephrolithiasis gene in mice; its deletion causes spontaneous calcium oxalate nephrolithiasis in the setting of hyperoxalaemia, hyperoxaluria, and nephrocalcinosis. GLUT proteins are comprised of about 500 amino acid residues, possess a single N-linked oligosaccharide, and have 12 transmembrane segments. They belong to the Glucose transporter -like (GLUT-like) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	417
340993	cd17435	MFS_GLUT12_Class3	Glucose transporter type 12 (GLUT12), a Class 3 GLUT, of the Major Facilitator Superfamily of transporters. Glucose transporter type 12 (GLUT12) is also called Solute carrier family 2, facilitated glucose transporter member 12 (SLC2A12). It is a facilitative glucose transporter, classified as a Class 3 GLUT, and is expressed in the heart, skeletal muscle, prostate, and small intestine, and is highly upregulated in breast ductal cell carcinoma. It plays a role as a secondary insulin-sensitive glucose transporter in insulin-dependent tissues. The GLUT12 subfamily belongs to the Glucose transporter -like (GLUT-like) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	376
340994	cd17436	MFS_GLUT10_Class3	Glucose transporter type 10 (GLUT10), a Class 3 GLUT, of the Major Facilitator Superfamily of transporters. Glucose transporter type 10 (GLUT10) is also called Solute carrier family 2, facilitated glucose transporter member 10 (SLC2A10). It is classified as a Class 3 GLUT and is a facilitative glucose transporter that exhibits a wide tissue distribution. It is expressed in pancreas, placenta, heart, lung, liver, brain, fat, muscle, and kidney. GLUT10 facilitates the transport of dehydroascorbic acid (DHA), the oxidized form of vitamin C, into mitochondria, and also increases cellular uptake of DHA, which in turn protects cells against oxidative stress. Loss-of-function mutations in SLC2A10 cause arterial tortuosity syndrome (ATS), an autosomal recessive connective tissue disorder characterized by twisting and lengthening of the major arteries, hypermobility of the joints, and laxity of skin. The GLUT10 subfamily belongs to the Glucose transporter -like (GLUT-like) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	376
340995	cd17437	MFS_PLT	Plant Polyol transporter family of the Major Facilitator Superfamily of transporters. The plant Polyol transporter (PLT) subfamily includes PLT1-6 from Arabidopsis thaliana and similar transporters. The best characterized member of the group is Polyol transporter 5, also called Sugar-proton symporter PLT5, which mediates the H+-symport of numerous substrates including linear polyols (such as sorbitol, xylitol, erythritol or glycerol), cyclic polyol myo-inositol, and different hexoses, pentoses (including ribose), tetroses, and sugar alcohols. It functions to transport a wide range of substrates into specific sink tissues in the plant. The PLT subfamily belongs to the Glucose transporter -like (GLUT-like) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	387
340996	cd17438	MFS_SV2B	Synaptic vesicle glycoprotein 2B of the Major Facilitator Superfamily of transporters. Synaptic vesicle glycoprotein 2 (SV2) is a transporter-like integral membrane glycoprotein, with 12 transmembrane regions, expressed in vertebrates and is localized to synaptic and endocrine secretory vesicles. Three isoforms have been identified, SV2A, SV2B, and SV2C. SV2A and SV2B are widely expressed in the brain, while SV2C is more restricted to evolutionarily older brain. SV2 isoforms have been shown to be critical for the proper function of the central nervous system. SV2 serves as the receptor for botulinum neurotoxin A (BoNT/A), one of seven neurotoxins produced by the bacterium Clostridium botulinum. BoNT/A blocks neurotransmitter release by cleaving synaptosome-associated protein of 25 kD (SNAP-25) within presynaptic nerve terminals. SV2B is a key modulator of amyloid toxicity at the synaptic site and also has an essential role in the formation and maintenance of the glomerular capillary wall. SV2B belongs to the Metazoan Synaptic Vesicle Glycoprotein 2 (SV2) and related small molecule transporter family (SV2-like) of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	477
340997	cd17439	MFS_SV2A	Synaptic vesicle glycoprotein 2A of the Major Facilitator Superfamily of transporters. Synaptic vesicle glycoprotein 2 (SV2) is a transporter-like integral membrane glycoprotein, with 12 transmembrane regions, expressed in vertebrates and is localized to synaptic and endocrine secretory vesicles. Three isoforms have been identified, SV2A, SV2B, and SV2C. SV2A and SV2B are widely expressed in the brain, while SV2C is more restricted to evolutionarily older brain. SV2 isoforms have been shown to be critical for the proper function of the central nervous system. SV2 serves as the receptor for botulinum neurotoxin A (BoNT/A), one of seven neurotoxins produced by the bacterium Clostridium botulinum. BoNT/A blocks neurotransmitter release by cleaving synaptosome-associated protein of 25 kD (SNAP-25) within presynaptic nerve terminals. It is unclear how SV2A is involved in correct SV function, but it has been suggested to either act as a transporter or a regulator of exocytosis by mediating Ca2+ dynamics. SV2A has been identified as the molecular target of the antiepileptic drug levetiracetam (LEV). Its expression is decreased in patients with epilepsy and in epileptic animal models. SV2A belongs to the Metazoan Synaptic Vesicle Glycoprotein 2 (SV2) and related small molecule transporter family (SV2-like) of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	478
340998	cd17440	MFS_SV2C	Synaptic vesicle glycoprotein 2C of the Major Facilitator Superfamily of transporters. Synaptic vesicle glycoprotein 2 (SV2) is a transporter-like integral membrane glycoprotein, with 12 transmembrane regions, expressed in vertebrates and is localized to synaptic and endocrine secretory vesicles. Three isoforms have been identified, SV2A, SV2B, and SV2C. SV2A and SV2B are widely expressed in the brain, while SV2C is more restricted to evolutionarily older brain. SV2 isoforms have been shown to be critical for the proper function of the central nervous system. SV2 serves as the receptor for botulinum neurotoxin A (BoNT/A), one of seven neurotoxins produced by the bacterium Clostridium botulinum. BoNT/A blocks neurotransmitter release by cleaving synaptosome-associated protein of 25 kD (SNAP-25) within presynaptic nerve terminals. SV2C exhibits enriched expression in several basal ganglia nuclei, has been found to be involved in normal operation of the basal ganglia network, and could also be involved in system adaptation in basal ganglia pathological conditions. SV2C belongs to the Metazoan Synaptic Vesicle Glycoprotein 2 (SV2) and related small molecule transporter family (SV2-like) of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	479
340999	cd17441	MFS_SVOP	Synaptic vesicle 2-related protein (SVOP) of the Major Facilitator Superfamily. Synaptic vesicle 2 (SV2)-related protein (SVOP) is a transporter-like nucleotide binding protein that localizes to neurotransmitter-containing vesicles. Like SV2, SVOP is expressed in all brain regions, with highest levels in cerebellum, hindbrain and pineal gland. Studies with knockout mice suggest that SVOP may perform a subtle function that is not necessary for survival under normal conditions, since mice lacking SVOP are viable, fertile, and phenotypically normal. SVOP shares structural similarity to the solute carrier family 22 (SLC22), a large family of organic cation and anion transporters. This SVOP subfamily belongs to the Metazoan Synaptic Vesicle Glycoprotein 2 (SV2) and related small molecule transporter family (SV2-like) of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	372
341000	cd17442	MFS_SVOPL	Synaptic vesicle 2 (SV2)-related protein-like (SVOPL) of the Major Facilitator Superfamily. Synaptic vesicle 2 (SV2)-related protein-like (SVOPL) or SVOP-like protein is a transporter-like protein that shares structural similarity to the solute carrier family 22 (SLC22), a large family of organic cation and anion transporters. It belongs to the Metazoan Synaptic Vesicle Glycoprotein 2 (SV2) and related small molecule transporter family (SV2-like) of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	375
341001	cd17443	MFS_SLC22A31	Solute carrier family 22, member 31 of the Major Facilitator Superfamily. Solute carrier family 22, member 31 (SLC22A31) is an uncharacterized member of the SLC22 family of transporters, which includes organic cation transporters (OCTs), organic zwitterion/cation transporters (OCTNs), and organic anion transporters (OATs). SLC22 transporters interact with a variety of compounds that include drugs of abuse, environmental toxins, opioid analgesics, antidepressant and anxiolytic agents, and neurotransmitters and their metabolites. SLC22A31 belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	343
341002	cd17444	MFS_SLC22A23	Solute carrier family 22, member 23 of the Major Facilitator Superfamily. Solute carrier family 22, member 23 (SLC22A23) is an orphan member of the SLC22 family of organic cation/anion/zwitterion transporters, which includes organic cation transporters (OCTs/OCTNs) and organic anion transporters (OATs). It is abundantly expressed in brain and is also found in liver. Single-nucleotide polymorphisms in SLC22A23 are associated with inflammatory bowel disease (IBD) in a Canadian white population. SLC22 transporters interact with a variety of compounds that include drugs of abuse, environmental toxins, opioid analgesics, antidepressant and anxiolytic agents, and neurotransmitters and their metabolites. SLC22A23 belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	364
341003	cd17445	MFS_SLC22A17	Solute carrier family 22, member 17 of the Major Facilitator Superfamily. Solute carrier family 22, member 17 (SLC22A17) is also called 24p3 receptor (24p3R), lipocalin-2 receptor, or neutrophil gelatinase-associated lipocalin (NGAL) receptor (NGALR). It functions as a cell surface receptor for lipocalin-2 (LCN2), also called NGAL or 24p3, which plays a key role in iron homeostasis and transport. LCN2 is a secreted protein of the lipocalin family that induces apoptosis in some types of cells and inhibits bacterial growth by sequestration of the iron-laden bacterial siderophore. Overexpression of NGAL and NGALR has been found to be correlated with unfavorable clinicopathologic features and poor prognosis of patients with hepatocellular carcinoma. SLC22A17 is a member of the SLC22 family of organic cation/anion/zwitterion transporters, which includes organic cation transporters (OCTs/OCTNs) and organic anion transporters (OATs). It belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	346
341004	cd17446	MFS_SLC22A6_OAT1_like	Solute carrier family 22 member 6 (also called Organic anion transporter 1) and similar transporters of the Major Facilitator Superfamily. This subfamily includes solute carrier family 22 member 6 (SLC22A6, also called organic anion transporter 1 or OAT1 or para-aminohippurate (PAH) transporter), SLC22A8 (or OAT3), and SLC22A20 (or OAT6). OAT1 and OAT3 are involved in the renal elimination of endogenous and exogenous organic anions (OAs). They function as OA exchangers, coupling the uptake of OAs against an electrochemical gradient with the efflux of intracellular dicarboxylates. SLC22A20 is an OA transporter that mediates the uptake of estrone sulfate. The OAT1-like subfamily belongs to the Solute carrier 22 (SLC22) family of organic cation/anion/zwitterion transporters of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	339
341005	cd17447	MFS_SLC22A7_OAT2	Solute carrier family 22 member 7 (also called Organic anion transporter 2) of the Major Facilitator Superfamily of transporters. Solute carrier family 22 member 7 (SLC22A7), also called organic anion transporter 2 (OAT2), mediates sodium-independent transport of a variety of organic anions including prostaglandin E2, prostaglandin F2, tetracycline, bumetanide, estrone sulfate, glutarate, dehydroepiandrosterone sulfate, allopurinol, 5-fluorouracil, paclitaxel, L-ascorbic acid, salicylate, methotrexate, and alpha-ketoglutarate. It also plays a role in renal uric acid uptake from blood as a first step of tubular secretion. OAT2 belongs to the Solute carrier 22 (SLC22) family of organic cation/anion/zwitterion transporters of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	341
341006	cd17448	MFS_SLC46A3	Solute carrier family 46 member 3 of the Major Facilitator Superfamily of transporters. Solute carrier family 46 member 3 (SLC46A3) is a lysosomal membrane protein that functions as a direct transporter of noncleavable antibody maytansine-based catabolites from the lysosome to the cytoplasm. SLC46A3 belongs to the Eukaryotic Solute carrier 46 (SLC46)/Bacterial Tetracycline resistance (TetA) -like (SLC46/TetA-like) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	396
341007	cd17449	MFS_SLC46A1_PCFT	Solute carrier family 46 member 1, also called Proton-coupled folate transporter, of the Major Facilitator Superfamily of transporters. Solute carrier family 46 member 1 (SLC46A1) is also called proton-coupled folate transporter (PCFT), G21, or heme carrier protein 1 (HCP1). It functions in two ways: as an intestinal proton-coupled high-affinity folate transporter that facilitates the absorption of folates across the brush-border membrane of the small intestine; and as an intestinal heme transporter which mediates heme uptake from the gut lumen into duodenal epithelial cells. It displays a higher affinity for folate than heme. It is also expressed in the choroid plexus and is required for transport of folates into the cerebrospinal fluid. Loss-of-function mutations in the SLC46A1 gene result in the autosomal recessive disorder "hereditary folate malabsorption" (HFM), characterized by severe systemic and cerebral folate deficiency. SLC46A1 belongs to the Eukaryotic Solute carrier 46 (SLC46)/Bacterial Tetracycline resistance (TetA) -like (SLC46/TetA-like) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	425
341008	cd17450	MFS_SLC46A2_TSCOT	Solute carrier family 46 member 2, also called Thymic stromal cotransporter protein, of the Major Facilitator Superfamily of transporters. Solute carrier family 46 member 2 (SLC46A2) is also called thymic stromal cotransporter protein (TSCOT). It is a putative 12-transmembrane protein mainly expressed in the thymic cortex in a specific thymic epithelial cell (TEC) subpopulation. Polymorphisms in TSCOT are linked to cervical cancer in affected sib-pairs with high mean age at diagnosis. TSCOT belongs to the Eukaryotic Solute carrier 46 (SLC46)/Bacterial Tetracycline resistance (TetA) -like (SLC46/TetA-like) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	383
341009	cd17451	MFS_NLS1_MFSD2A	Sodium-dependent lysophosphatidylcholine symporter 1 of the Major Facilitator Superfamily of transporters. Sodium-dependent lysophosphatidylcholine (LPC) symporter 1 (NLS1) is also called major facilitator superfamily domain-containing protein 2A (MFSD2A). NLS1/MFSD2A is an LPC symporter that plays an essential role for blood-brain barrier formation and function. It also transports the essential omega-3 fatty acid docosahexaenoic acid (DHA), which is essential for normal brain growth and cognitive function, in the form of LPC into the brain across the blood-brain barrier. Inactivating mutations in MFSD2A cause a lethal microcephaly syndrome. NLS1/MFSD2A belongs to the Salmonella enterica Na+/melibiose symporter like (MelB-like) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	419
341010	cd17452	MFS_MFSD2B	Major facilitator superfamily domain-containing protein 2B. Major facilitator superfamily domain-containing protein 2B (MFSD2B) is closely related to MFSD2A, and their conserved genomic structure suggests that they are derived from the duplication of an ancestral gene. Variations in chromosome 2 gene expression between patients with and without lung cancer identified MFSD2B as a potential risk or protective factor in the prognosis of lung adenocarcinoma. MFSD2B belongs to the Salmonella enterica Na+/melibiose symporter like (MelB-like) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	416
341011	cd17453	MFS_MFSD4A	Major facilitator superfamily domain-containing protein 4A. Major facilitator superfamily domain-containing protein 4A (MFSD4A) belongs to the bacterial fucose permease, eukaryotic Major facilitator superfamily domain-containing protein 4 (FucP/MFSD4) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	415
341012	cd17454	MFS_NaGLT1_MFSD4B	Sodium-dependent glucose transporter 1, also called Major facilitator superfamily domain-containing protein 4B. Sodium-dependent glucose transporter 1 (NaGLT1) is also called major facilitator superfamily domain-containing protein 4B (MFSD4B). NaGLT1 is a primary fructose transporter in rat renal brush-border membranes. It also facilitates sodium-independent urea uptake in assays performed on Xenopus oocytes. NaGLT1/MFSD4B belongs to the bacterial fucose permease, eukaryotic Major facilitator superfamily domain-containing protein 4 (FucP/MFSD4) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	369
341013	cd17455	MFS_FLVCR1	Feline leukemia virus subgroup C receptor-related protein 1 of the Major Facilitator Superfamily of transporters. Feline leukemia virus subgroup C receptor-related protein 1 (FLVCR1) is also called feline leukemia virus subgroup C receptor (FLVCR). FLVCR1 is a heme transporter and it has two isoforms: 1 (or FLVCR1a), which exports cytoplasmic heme as well as coproporphyrin and protoporphyrin IX; and 2 (FLVCR1b), which promotes heme efflux from the mitochondrion to the cytoplasm. Mutations in the FLVCR1 gene have been linked to vision impairment, posterior column ataxia, and sensory neurodegeneration with loss of pain perception. FLVCR1 belongs to the Solute carrier 49 (SLC49) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	407
341014	cd17456	MFS_FLVCR2	Feline leukemia virus subgroup C receptor-related protein 2 of the Major Facilitator Superfamily of transporters. Feline leukemia virus subgroup C receptor-related protein 2 (FLVCR2) is also called calcium-chelate transporter (CCT). It functions as a heme importer as well as a transporter for a calcium-chelator complex that is important for growth and calcium metabolism. Mutations in the FLVCR2 gene cause Proliferative vasculopathy and hydranencephaly-hydrocephaly syndrome (PVHH), also known as Fowler syndrome, a rare autosomal recessive disorder characterized by glomerular vasculopathy in the central nervous system, severe hydrocephaly, hypokinesia and arthrogryposis. FLVCR2 belongs to the Solute carrier 49 (SLC49) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	406
341015	cd17457	MFS_SLCO1B_OATP1B	Solute carrier organic anion transporter 1B subfamily of the Major Facilitator Superfamily of transporters. The Solute carrier organic anion transporter 1B (SLCO1B), also called organic anion-transporting polypeptide 1B (OATP1B), subfamily is composed of two human proteins, OATP1B1 (encoded by SLCO1B1) and OATP1B3 (encoded by SLCO1B3), and one rodent member, OATP1B2 (encoded by Slco1b2). OATP1B1 and OATP1B3 are almost exclusively expressed on the basal side of hepatocytes in normal human organs. They both can accept a wide variety of structurally-unrelated compounds as substrates including clinically-important drugs such as hydroxymethylglutaryl (HMG)-CoA reductase inhibitors (statins), angiotensin II receptor blockers (sartans), angiotensin converting enzyme (ACE) inhibitors, and anti-diabetes drugs (glinides). Loss-of-function mutations in both SLCO1B1 and SLCO1B3 genes result in the Rotor syndrome, a hereditary hyperbilirubinemia. The SLCO1B/OATP1B subfamily belongs to the Solute carrier organic anion transporter [SLCO, also called organic anion transporting polypeptides (OATPs) or Solute carrier family 21] family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	455
341016	cd17458	MFS_SLCO1A_OATP1A	Solute carrier organic anion transporter 1A subfamily of the Major Facilitator Superfamily of transporters. The Solute carrier organic anion transporter 1A (SLCO1A), also called Organic anion-transporting polypeptide 1A (OATP1A), subfamily is composed of one human member OATP1A2 (encoded by SLCO1A2) and several rodent proteins encoded by the Slco1a1, Slco1a3, Slco1a4, Slco1a5, and Slco1a6 genes. OATP1A2, also known as human OATP-A or OATP1, shows a broad spectrum of substrates including endogenous compounds (such as bile acids, steroid hormones and their conjugates, thyroid hormones) and various drugs (such as fexofenadine, ouabain and the cyanobacterial toxin microcystin). It is expressed in the brain, kidney, intestine, liver, lung, testes, and the eye (ciliary body). The SLCO1A/OATP1A subfamily belongs to the Solute carrier organic anion transporter [SLCO, also called organic anion transporting polypeptides (OATPs) or Solute carrier family 21] family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	527
341017	cd17459	MFS_SLCO1C_OATP1C	Solute carrier organic anion transporter 1C subfamily of the Major Facilitator Superfamily of transporters. The Solute carrier organic anion transporter 1C (SLCO1C), also called Organic anion-transporting polypeptide 1C (OATP1C), subfamily contains one mammalian member, OATP1C1 (encoded by SLCO1C1), which is also called thyroxine transporter. It mediates the high affinity transport of the thyroid hormones, T4 (3,5,3',5'-tetraiodo-L-thyronine or thyroxine), rT3 (3,3',5'-triiodo-L-thyronine), and T3 (3,5,3'-triiodo-L-thyronine or triiodothyronine), as well as organic anions such as 17-beta-glucuronosyl estradiol, estrone-3-sulfate, and sulfobromophthalein (BSP), which are transported with much lower efficiency. The SLCO1C/OATP1C subfamily belongs to the Solute carrier organic anion transporter [SLCO, also called organic anion transporting polypeptides (OATPs) or Solute carrier family 21] family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	498
341018	cd17460	MFS_SLCO2B_OATP2B	Solute carrier organic anion transporter 2B subfamily of the Major Facilitator Superfamily of transporters. The Solute carrier organic anion transporter 2B (SLCO2B), also called Organic anion-transporting polypeptide 2B (OATP2B), subfamily has one mammalian member, OATP2B1 (encoded by SLCO2B1). It mediates the Na(+)-independent transport of various organic anions such as taurocholate, the prostaglandins PGD2, PGE1, PGE2, leukotriene C4, thromboxane B2 and iloprost. It also mediates the transport of endogenous sex steroid conjugates such as dehydroepiandrosterone sulfate (DHEAS). SLCO2B1 variations result in differential expression and uptake of DHEAS, which impacts subsequent resistance to androgen-deprivation therapy (ADT), the primary treatment of metastatic prostate cancer. The SLCO2B/OATP2B subfamily belongs to the Solute carrier organic anion transporter [SLCO, also called organic anion transporting polypeptides (OATPs) or Solute carrier family 21] family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	479
341019	cd17461	MFS_SLCO2A_OATP2A	Solute carrier organic anion transporter 2A subfamily of the Major Facilitator Superfamily of transporters. The Solute carrier organic anion transporter 2A (SLCO2A), also called Organic anion-transporting polypeptide 2A (OATP2A), subfamily has one mammalian member, OATP2A1 (encoded by SLCO2A1), which is also called prostaglandin transporter. It is a lactate/prostaglandin anion exchanger that mediates the release of newly synthesized prostaglandins (PGD2, PGE1, PGE2, PGF2A and PGI2) from cells, the transepithelial transport of prostaglandins, and the clearance of prostaglandins from the circulation. Mutations in SLCO2A1 can cause primary hypertrophic osteoarthropathy (PHO), a rare multi-organic disease characterized by digital clubbing, pachydermia and periosteal reaction. The SLCO2A/OATP2A subfamily belongs to the Solute carrier organic anion transporter [SLCO, also called organic anion transporting polypeptides (OATPs) or Solute carrier family 21] family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	474
341020	cd17462	MFS_SLCO4A_OATP4A	Solute carrier organic anion transporter 4A subfamily of the Major Facilitator Superfamily of transporters. The Solute carrier organic anion transporter 4A (SLCO4A), also called Organic anion-transporting polypeptide 4A (OATP4A), subfamily has one mammalian member, OATP4A1 (encoded by SLCO4A1). It is ubiquitously expressed and it mediates the Na(+)-independent transport of the thyroid hormones T3 (triiodo-L-thyronine), T4 (thyroxine) and rT3, and other organic anions such as estrone sulfate and taurocholate. OATP4A1 is the most abundantly expressed transporter in colorectal cancer (CRC) and its role in the transport of estrone sulfate, which is used in hormone replacement therapy (HRT), affects the outcome of the treatment. The SLCO4A/OATP4A subfamily belongs to the Solute carrier organic anion transporter [SLCO, also called organic anion transporting polypeptides (OATPs) or Solute carrier family 21] family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	427
341021	cd17463	MFS_SLCO4C_OATP4C	Solute carrier organic anion transporter 4C subfamily of the Major Facilitator Superfamily of transporters. The Solute carrier organic anion transporter 4C (SLCO4C), also called Organic anion-transporting polypeptide 4C (OATP4C), subfamily has one mammalian member, OATP4C1 (encoded by SLCO4C1). It is capable of transporting pharmacological substances such as digoxin, ouabain, thyroxine, methotrexate and cAMP. It is the only OATP expressed at the basolateral side of proximal tubular cells in human kidney and it mediates the excretion of uremic toxins, which accumulate in patients with chronic kidney diseases (CKDs) and cause further progression of renal damage and cardiovascular diseases. Overexpression of human SLCO4C1 in rat kidney promotes the renal excretion of uremic toxins and reduces hypertension, cardiomegaly, and renal inflammation in renal failure. The SLCO4C/OATP4C subfamily belongs to the Solute carrier organic anion transporter [SLCO, also called organic anion transporting polypeptides (OATPs) or Solute carrier family 21] family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	429
341022	cd17464	MFS_MCT10	Monocarboxylate transporter 10 of the Major Facilitator Superfamily of transporters. Monocarboxylate transporter 10 (MCT10) is also called Solute carrier family 16 member 10 (SLC16A10). In addition, human MCT10 is also called T-type amino acid transporter 1 (TAT1). MCT10 is a sodium-independent transporter that mediates the uptake or efflux of aromatic acids such as Phe, Tyr, and Trp, as well as L-3,4-di-hydroxy-phenylalanine. It is also a thyroid hormone transporter with preference for triiodothyronine (T3). MCT10 is expressed in intestine, kidney, liver, muscle, and placenta, and appears predominantly localized in the basolateral membrane. It belongs to the Monocarboxylate transporter (MCT) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to  function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	395
341023	cd17465	MFS_MCT8	Monocarboxylate transporter 8 of the Major Facilitator Superfamily of transporters. Monocarboxylate transporter 8 (MCT8) is also called Solute carrier family 16 member 2 (SLC16A2) or X-linked PEST-containing transporter. MCT8 is a very active and specific thyroid hormone transporter which stimulates the cellular uptake of thyroxine (T4), triiodothyronine (T3), reverse triiodothyronine (rT3) and diiodothyronine (T2). Inactivating mutations in SLC16A2, the gene that encodes MCT8, lead to an X-linked syndrome with severe neurological impairment known as Allan-Herndon-Dudley syndrome (AHDS). AHDS is characterized by congenital hypotonia that progresses to spasticity with severe psychomotor delays, spastic paraplegia and dystonic movements, global developmental delay, and profound intellectual disability. MCT8 belongs to the Monocarboxylate transporter (MCT) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	367
350654	cd17466	T3SS_Flik_C_like	C-terminal domain of type III secretion proteins FliK, HrpP, YscP, and similar domains. Type III secretion systems (T3SS) are essential components of two complex bacterial machineries: the flagellum, which drives cell motility, and the non-flagellar T3SS, which delivers effectors into eukaryotic cells. This model represents the C-terminal domain of T3SS proteins such as the flagellar hook-length control protein FliK, and non-flagellar Yop proteins translocation protein P (YscP) and HrpP. FliK is responsible for switching secretion from the hook protein to that of the filament protein, by interacting with FlhB, the switchable secretion gate. HrpP is a type III secretion system substrate specificity switch-domain protein that is required for the export of pathogenicity factors into plant cells by pathogens. YscP is a needle-length sensing protein that controls the needle length of the injectisome, which is used by pathogenic bacteria to inject effector proteins across eukaryotic cell membranes. FliK, YscP, and HrpP contain a C-terminal globular domain that is necessary for the hierarchical switching of substrates during T3SS assembly and subsequent virulence effector secretion and is also referred to as the substrate-switching (SS) domain or the type III secretion substrate specificity switch (T3S4) domain.	87
350655	cd17467	T3SS_YscP_C	C-terminal substrate-switching domain of ruler proteins from the Ysc family, such as YscP and PscP. This subfamily includes needle-length sensing proteins, also called ruler proteins, in type 3 secretion systems (T3SS), such as Yersinia pestis Yop proteins translocation protein P (YscP) and Pseudomonas aeruginosa PscP. T3SS ruler proteins contain an N-terminal helical region that dictates needle length and is referred to as the length-sensing (LS) domain, and a C-terminal globular domain that is necessary for the hierarchical switching of substrates during T3SS assembly and subsequent virulence effector secretion and is also referred to as the substrate-switching (SS) domain or the type III secretion substrate specificity switch (T3S4) domain. The C-terminal SS domain is highly stable and sits on the extracellular side prior to needle assembly.	111
350656	cd17468	T3SS_HrpP_C	C-terminal domain of type III secretion protein HrpP and similar domains. This subfamily contains Pseudomonas syringae HrpP, a type III secretion system (T3SS) substrate specificity switch-domain protein that has an atypical T3SS translocation signal. HrpP is required for the export of pathogenicity factors into plant cells. HrpP contains a C-terminal domain similar to the globular C-terminal substrate-switching (SS) domain, also called the type III secretion substrate specificity switch (T3S4) domain, of Yersinia pestis YscP.	89
350657	cd17470	T3SS_Flik_C	C-terminal domain of flagellar hook-length control protein FliK and similar domains. The flagellar hook-length control protein FliK is a soluble cytoplasmic protein that is secreted during flagellar formation. It controls hook elongation by two successive events: by determining hook length and by stopping the supply of hook protein. It contains an N-terminal domain that determines hook length and a C-terminal domain that is responsible for switching secretion from the hook protein to that of the filament protein, by interacting with FlhB, the switchable secretion gate.	86
341024	cd17471	MFS_Set	Sugar efflux transporter (Set) family of the Major Facilitator Superfamily of transporters. This family is composed of sugar transporters such as Escherichia coli Sugar efflux transporter SetA, SetB, SetC and other sugar transporters. SetA, SetB, and SetC are involved in the efflux of sugars such as lactose, glucose, IPTG, and substituted glucosides or galactosides. They may be involved in the detoxification of non-metabolizable sugar analogs. The Set family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	371
341025	cd17472	MFS_YajR_like	Escherichia coli inner membrane transport protein YajR and similar multidrug-efflux transporters of the Major Facilitator Superfamily. This family is composed of Escherichia coli inner membrane transport protein YajR and some uncharacterized multidrug-efflux transporters. YajR is a putative proton-driven major facilitator superfamily (MFS) transporter found in many gram-negative bacteria. Unlike most MFS transporters, YajR contains a C-terminal, cytosolic YAM domain, which may play an essential role for the proper functioning of the transporter. YajR-like transporters belong to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	371
341026	cd17473	MFS_arabinose_efflux_permease_like	Putative arabinose efflux permease family transporters of the Major Facilitator Superfamily. This family includes a group of putative arabinose efflux permease family transporters, such as alpha proteobacterium quinolone resistance protein NorA (characterized Staphylococcus aureus Quinolone resistance protein NorA belongs to a different group), Desulfovibrio dechloracetivorans bacillibactin exporter, and Vibrio aerogenes antiseptic resistance protein. The biological function of these transporters remains unclear. They belong to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	374
341027	cd17474	MFS_YfmO_like	Bacillus subtilis multidrug efflux protein YfmO and similar transporters of the Major Facilitator Superfamily. This family is composed of Bacillus subtilis multidrug efflux protein YfmO, bacillibactin exporter YmfD/YmfE, uncharacterized MFS-type transporter YvmA, and similar proteins. YfmO acts to efflux copper or a copper complex, and could contribute to copper resistance. YmfD/YmfE is involved in secretion of bacillibactin. The YfmO-like family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	374
341028	cd17475	MFS_MT3072_like	Mycobacterium tuberculosis uncharacterized MFS-type transporter MT3072 and similar transporters of the Major Facilitator Superfamily. This family includes the Mycobacterium tuberculosis uncharacterized MFS-type transporter MT3072. It belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	378
341029	cd17476	MFS_Amf1_MDR_like	Saccharomyces cerevisiae low affinity ammonium transporter Amf1p/YOR378W, aminotriazole resistance protein Atr1p, and similar transporters of the Major Facilitator Superfamily. Saccharomyces cerevisiae Amf1p/Ammonium Facilitator 1/YOR378W functions as a low affinity NH4+ transporter. S. cerevisiae aminotriazole resistance protein (Atr1p) is required for controlling sensitivity to aminotriazole; it is a putative component of the machinery responsible for pumping aminotriazole (and possibly other toxic compounds) out of the cell. This subfamily also includes S. cerevisiae YMR279C, a putative boron transporter involved in boron efflux and resistance, and Kluyveromyces lactis Knq1p which is involved in oxidative stress response and iron homeostasis. Amf1p, Atr1p, and YMR279C have been classified as group 1 members of the DHA2 (Drug:H+ Antiporter family 2) family, K. lactis Knq1 as group 2. This subfamily also includes two Aspergillus terreus terrein biosynthesis cluster proteins, efflux pump TerG and TerJ which may be required for efficient secretion of terrein or other secondary metabolites produced by the terrein gene cluster. The Amf1p-like subfamily belongs to the Methylenomycin A resistance protein (also called MMR peptide) and similar multidrug resistance (MDR) transporters (MMR-like MDR transporter) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	362
341030	cd17477	MFS_YcaD_like	YcaD and similar transporters of the Major Facilitator Superfamily. This family is composed of Escherichia coli MFS-type transporter YcaD, Bacillus subtilis MFS-type transporter YfkF, and similar proteins. They are uncharacterized transporters belonging to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	360
341031	cd17478	MFS_FsR	Fosmidomycin resistance protein of the Major Facilitator Superfamily of transporters. Fosmidomycin resistance protein (FsR) confers resistance against fosmidomycin. It shows sequence similarity with the bacterial drug-export proteins that mediate resistance to tetracycline and chloramphenicol. This FsR family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	365
341032	cd17479	MFS_MFSD6L	Major facilitator superfamily domain-containing protein 6-like and similar transporters of the Major Facilitator Superfamily. Major facilitator superfamily domain-containing protein 6-like (MFSD6L) protein family includes a group of uncharacterized proteins similar to human major facilitator superfamily domain-containing protein 6 (MFSD6). MFSD6 is also called Macrophage MHC class I receptor 2 homolog (MMR2). It has been postulated as a possible receptor for human leukocyte antigen (HLA)-B62. The MFSD6L family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	376
341033	cd17480	MFS_SLC40A1_like	Solute carrier family 40 member 1 of the Major Facilitator Superfamily of transporters. Solute carrier family 40 member 1 (SLC40A1 or SLC11A3) is also called ferroportin-1 (FPN1) or iron-regulated transporter 1 (IREG1). In the presence of a ferroxidase (hephaestin and/or ceruloplasmin), SLC40A1 acts as an iron exporter that releases Fe(2+) from cells into plasma, thereby maintaining iron homeostasis. Specifically, it is involved in iron export from duodenal epithelial cells and also in the transfer of iron between maternal and fetal circulation. The transport activity of SLC40A1 is suppressed by the peptide hormone hepcidin. This family also includes a bacterial homologue of SLC40A1 (Bdellovibrio bacteriovorus ferroportin). It adopts the major facilitator superfamily fold, but undergoes an intra-domain conformational rearrangement during the transport cycle. SLC40A1 belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	386
341034	cd17481	MFS_MFSD13A	Major facilitator superfamily domain containing 13A. Human major facilitator superfamily domain containing 13A (MFSD13A) protein is also called transmembrane protein 180. Its function is still unknown. MFSD13A belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	429
341035	cd17482	MFS_YxiO_like	Bacillus subtilis YxiO, Listeria monocytogenes BtlA, and similar transporters of the Major Facilitator Superfamily. This family is composed of Bacillus subtilis MFS-type transporter YxiO, and similar proteins including Listeria monocytogenes BtlA. The function of B. subtilis YxiO is still unknown, and L. monocytogenes BtlA is a putative secondary transporter involved in bile tolerance and general stress resistance. This family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	362
341036	cd17483	MFS_Atg22_like	Autophagy-related protein 22 and similar proteins; member of the Major Facilitator Superfamily of transporters. Atg22 (also known as Aut4) protein functions as a vacuolar effluxer which mediates the efflux of amino acids resulting from autophagic degradation. The release of autophagic amino acids allows the maintenance of protein synthesis and viability during nitrogen starvation. Members of this family belong to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	467
341037	cd17484	MFS_FBT	Folate-biopterin transporter of the Major Facilitator Superfamily of transporters. The Folate-biopterin transporter (FBT) family includes folate carriers related to those of trypanosomatids in higher plant plastids and cyanobacteria. FBT mediates folate monoglutamate transport involved in tetrahydrofolate biosynthesis. It also mediates transport of antifolates, such as methotrexate and aminopterin. The FBT family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	399
341038	cd17485	MFS_MFSD3	Major facilitator superfamily domain containing 3 protein. Major facilitator superfamily domain containing 3 protein (MFSD3) is a predicted acetyl-CoA transporter. As an atypical putative membrane-bound solute carrier (SLC), MFSD3 is most likely to be functionally active in the plasma membrane and not in any intracellular organelles. MFSD3 belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	386
341039	cd17486	MFS_AmpG_like	AmpG and similar transporters of the Major Facilitator Superfamily. AmpG acts as an inner membrane permease in the beta-lactamase induction system and in peptidoglycan recycling. It transports muropeptide from the periplasm into the cytosol in gram-negative bacteria, which is essential for the induction of the ampC encoding beta-lactamase. The AmpG family belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	387
341040	cd17487	MFS_MFSD5_like	Major facilitator superfamily domain containing 5 protein. Human major facilitator superfamily domain containing 5 protein (MFSD5) is also called molybdate-anion transporter, or molybdate transporter 2 homolog (MOT2). It acts as an atypical solute carrier (SLC) that mediates high-affinity intracellular uptake of the rare oligo-element molybdenum. It may also play a role in maintaining the glucose homeostasis and pancreatic beta-cell proliferation, as well as in altered energy homeostasis. MFSD5 belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	385
341041	cd17488	MFS_UhpC	Membrane sensor protein UhpC of the Major Facilitator Superfamily of transporters. Membrane sensor protein UhpC acts as both a sensor and a transport protein. It is part of the UhpABC signaling cascade that controls the expression of the hexose phosphate transporter UhpT. UhpC recognizes external glucose-6-phosphate (Glc6P) and induces transport by UhpT. It can also transport and sense Glc6P, and interacts with the histidine kinase UhpB, leading to the stimulation of the autokinase activity of UhpB. This group also includes the hexose phosphate transport protein UhpT from Chlamydia pneumoniae; it is a transport protein for sugar phosphate uptake. It is part of the Organophosphate:Pi antiporter (OPA) family of integral membrane proteins responsible for the transport of specific organophosphates or sugar phosphates across biological membranes with the simultaneous translocation of inorganic phosphate into the opposite direction. The UhpC group belongs to the Major Facilitator Superfamily (MFS) of membrane transport proteins, which are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	364
341042	cd17489	MFS_YfcJ_like	Escherichia coli YfcJ, YhhS, and similar transporters of the Major Facilitator Superfamily. This subfamily is composed of Escherichia coli membrane proteins, YfcJ and YhhS, Bacillus subtilis uncharacterized MFS-type transporter YwoG, and similar proteins. YfcJ and YhhS are putative arabinose efflux transporters. YhhS has been implicated in glyphosate resistance. YfcJ-like arabinose efflux transporters belong to the bacterial MdtG-like and eukaryotic solute carrier 18 (SLC18) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	367
341043	cd17490	MFS_YxlH_like	Bacillus subtilis YxlH and similar transporters of the Major Facilitator Superfamily. This subfamily is composed of Bacillus subtilis uncharacterized MFS-type transporter YxlH and similar proteins. The biological function of YxlH remains unclear. The YxlH-like subfamily belongs to the bacterial MdtG-like and eukaryotic solute carrier 18 (SLC18) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	371
341044	cd17491	MFS_MFSD12	Major facilitator superfamily domain-containing protein 12. Major facilitator superfamily domain-containing protein 12 (MFSD12) protein subfamily includes a group of uncharacterized proteins similar to human MFSD2. MFSD2 is composed of two vertebrate members, MFSD2A and MFSD2B. MFSD2A is an LPC symporter that plays an essential role for blood-brain barrier formation and function. MFSD2B is a potential risk or protective factor in the prognosis of lung adenocarcinoma. The MFSD12 subfamily belongs to the Salmonella enterica Na+/melibiose symporter like (MelB-like) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	438
341212	cd17492	toxin_CptN	type III toxin-antitoxin system toxin CptN and similar proteins. CptN-like toxin component of a type III toxin-antitoxin (TA) system, which consists of a ribonuclease (RNase) toxin that processes its structured and specific cognate RNA antitoxin, which in turn then directly inhibits the toxin. TA systems have been associated with many important phenotypes, like phage resistance, maintenance of genomic islands, and formation of bacterial persister cells.	149
341213	cd17493	toxin_TenpN	type III toxin-antitoxin system toxin TenpN and similar proteins. TenpN-like toxin component of a type III toxin-antitoxin (TA) system, which consists of a ribonuclease (RNase) toxin that processes its structured and specific cognate RNA antitoxin, which in turn then directly inhibits the toxin. TA systems have been associated with many important phenotypes, like phage resistance, maintenance of genomic islands, and formation of bacterial persister cells.	121
341185	cd17494	RMtype1_S_Sma198ORF994P-TRD2-CR2_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Streptococcus macedonicus ACA-DC 198 S subunit (S1.Sma198ORF994P) TRD2-CR2 and Lactobacillus amylovorus GRL 1112 S subunit (S1.LamGRLORF5415P) TRD2-CR2. The recognition sequences of Streptococcus macedonicus ACA-DC 198 S subunit (S1.Sma198ORF994P) and Lactobacillus amylovorus GRL 1112 S subunit (S1.LamGRLORF5415P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. This subfamily of TRD-CR's shows similarity to TRD1-CR1 of Aminobacterium colombiense DSM 12261 S subunit (S.Aco12261I), which recognizes 5'... GCANNNNNNTGT ... 3'. This subfamily may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	171
341186	cd17495	RMtype1_S_Cep9333ORF4827P-TRD2-CR2_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Crinalium epipsammum S subunit (S.Cep9333ORF4827P) TRD2-CR2 and Corynebacterium genitalium sp. nov. S subunit (S.CgeORF10704P) TRD2-CR2. The recognition sequences for Crinalium epipsammum S subunit (S.Cep9333ORF4827P) and Corynebacterium genitalium sp. nov. S subunit (S.CgeORF10704P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. This subfamily of TRD-CR's shows similarity to TRD1-CR1 of Aminobacterium colombiense DSM 12261 S subunit (S.Aco12261I), which recognizes 5'... GCANNNNNNTGT ... 3'. This subfamily may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	174
341187	cd17496	RMtype1_S_BliBORF2384P-TRD1-CR1_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Bacillus licheniformis S subunit (S1.BliBORF2384P) TRD1-CR1 and Chlorobium tepidum TLS S subunit (S.CteTORF675P) TRD1-CR1. The recognition sequences for Bacillus licheniformis S subunit (S1.BliBORF2384P) and Chlorobium tepidum TLS S subunit (S.CteTORF675P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. This subfamily of TRD-CR's shows similarity to TRD1-CR1 of Aminobacterium colombiense DSM 12261 S subunit (S.Aco12261I), which recognizes 5'... GCANNNNNNTGT ... 3'. This subfamily may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	175
341188	cd17497	RMtype1_S_TteMORF1547P-TRD2-CR2_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Thermoanaerobacter tengcongensis S subunit (S.TteMORF1547P) TRD2-CR2. The recognition sequence is undetermined for Thermoanaerobacter tengcongensis S subunit (S.TteMORF1547P). The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This CD contains both TRD1-CR1 and TRD2-CR2. This subfamily of TRD-CR's shows similarity to TRD1-CR1 of Aminobacterium colombiense DSM 12261 S subunit (S.Aco12261I), which recognizes 5'... GCANNNNNNTGT ... 3'. This subfamily may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. S.TteMORF1547P TRD1-CR1 does not belong to this subfamily.	174
341189	cd17498	RMtype1_S_Aco12261I-TRD1-CR1_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Aminobacterium colombiense DSM 12261 S subunit (S.Aco12261I) TRD1-CR1. The S.Aco12261I S subunit recognizes 5'... GCANNNNNNTGT ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. This subfamily may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases. S.Aco12261I TRD2-CR2 does not belong to this subfamily.	173
341190	cd17499	RMtype1_S_CloLW9ORF3270P-TRD1-CR1_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Cecembia lonarensis LW9 S subunit (S.CloLW9ORF3270P) TRD1-CR1 and Bacillus licheniformis 9945A S subunit (S.Bli9945ORF10320P) TRD1-CR1. The recognition sequences for Cecembia lonarensis LW9 S subunit (S.CloLW9ORF3270P) and Bacillus licheniformis 9945A S subunit (S.Bli9945ORF10320P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. This subfamily of TRD-CR's shows similarity to TRD1-CR1 of Aminobacterium colombiense DSM 12261 S subunit (S.Aco12261I), which recognizes 5'... GCANNNNNNTGT ... 3'. This subfamily also includes TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases such as Helicobacter bizzozeronii CIII-1 putative Type IIG restriction enzyme/N6-adenine DNA methyltransferase RM.HbiCORF8670P, and may also contain type I DNA methyltransferases.	175
341191	cd17500	RMtype1_S_MmaGORF2198P_TRD1-CR1_like	Type I restriction-modification system specificity (S) subunit TRD-CR, similar to Methanosarcina mazei Goe1 S subunit (S.MmaGORF2198P) TRD1-CR1, and Flavobacterium psychrophilum FPG3 S subunit (S.FpsFPG3ORF6820P) TRD1-CR1. The recognition sequences of Methanosarcina mazei Goe1 S subunit (S.MmaGORF2198P) and Flavobacterium psychrophilum FPG3 S subunit (S.FpsFPG3ORF6820P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains TRD1-CR1. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	186
341192	cd17501	RMtype1_S_Vch69ORF1407P_TRD2-CR2_like	Type I restriction-modification system specificity (S) subunit TRD-CR, similar to Vibrio cholerae 1311-69 S subunit (S.Vch69ORF1407P) TRD2-CR2, and Methanococcoides methylutens MM1 S subunit (S.MmeMM1ORF456P) TRD2-CR2. The recognition sequences of Vibrio cholerae 1311-69 S subunit (S.Vch69ORF1407P) and Methanococcoides methylutens MM1 S subunit (S.MmeMM1ORF456P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	191
341045	cd17502	MFS_Azr1_MDR_like	Saccharomyces cerevisiae Azole resistance protein 1 (Azr1p), and similar multidrug resistance (MDR) transporters of the Major Facilitator Superfamily. This subfamily is composed of multidrug resistance (MDR) transporters including various Saccharomyces cerevisiae proteins such as azole resistance protein 1 (Azr1p), vacuolar basic amino acid transporter 1 (Vba1p), vacuolar basic amino acid transporter 5 (Vba5p), and Sge1p (also known as Nor1p, 10-N-nonyl acridine orange resistance protein, and crystal violet resistance protein). MDR transporters are drug/H+ antiporters (DHA) that mediate the efflux of a variety of drugs and toxic compounds, and confer resistance to these compounds. This subfamily belongs to the Methylenomycin A resistance protein (also called MMR peptide) and similar multidrug resistance (MDR) transporters (MMR-like MDR transporter) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	337
341046	cd17503	MFS_LmrB_MDR_like	Bacillus subtilis lincomycin resistance protein (LmrB) and similar multidrug resistance (MDR) transporters of the Major Facilitator Superfamily. This subfamily is composed of multidrug resistance (MDR) transporters including Bacillus subtilis lincomycin resistance protein LmrB, and several proteins from Escherichia coli such as the putative MDR transporters EmrB, MdtD, and YieQ. MDR transporters are drug/H+ antiporters (DHA) that mediate the efflux of a variety of drugs and toxic compounds, and confer resistance to these compounds. For example, MMR confers resistance to the epoxide antibiotic methylenomycin. This subfamily belongs to the Methylenomycin A resistance protein (also called MMR peptide) and similar multidrug resistance (MDR) transporters (MMR-like MDR transporter) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	380
341047	cd17504	MFS_MMR_MDR_like	Methylenomycin A resistance protein (also called MMR peptide)-like multidrug resistance (MDR) transporters of the Major Facilitator Superfamily. This subfamily is composed of putative multidrug resistance (MDR) transporters including Chlamydia trachomatis antiseptic resistance protein QacA_2, and Serratia sp. DD3 Bmr3. MDR transporters are drug/H+ antiporters (DHA) that mediate the efflux of a variety of drugs and toxic compounds, and confer resistance to these compounds. This subfamily belongs to the Methylenomycin A resistance protein (also called MMR peptide) and similar multidrug resistance (MDR) transporters (MMR-like MDR transporter) family of the Major Facilitator Superfamily (MFS) of transporters. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.	371
340762	cd17505	Ubl_SAMP1_like	ubiquitin-like (Ubl) domain found in small archaeal modifier protein 1 (SAMP1). Ubiquitin-like small archaeal modifier protein 1 (SAMP1) shows a beta-grasp fold of Ub, suggesting that this archaeal Ubl molecule is more closely related to eukaryotic Ub and Ubls than to its prokaryotic counterpart. Several Ub-like structural features, such as a single N-terminal lysine residue and a di-glycine motif at the C-terminus, which are spatially isolated, suggest the formation of poly-SAMPylated chains (poly-SAMPylation). SAMP1 can form covalent conjugates with its protein targets through an isopeptide linkage via its C-terminal diglycine motif in a streamlined archaeal E1-dependent pathway. It is involved in sulfur transfer during molybdenum cofactor biosynthesis, much like MoaD. This family also includes proteins such as Thermoplasma acidophilum TA0895 and others, all closely related to MoaD proteins.	90
340763	cd17506	Ubl_SAMP2_like	ubiquitin-like (Ubl) domain found in small archaeal modifier protein 2 (SAMP2). Ubiquitin-like small archaeal modifier protein 2 (SAMP2) shows a beta-grasp fold of Ub, suggesting that this archaeal Ubl molecule is more closely related to eukaryotic Ub and Ubls than to its prokaryotic counterpart. Several Ub-like structural features, such as a single N-terminal lysine residue and a di-glycine motif at the C-terminus, which are spatially isolated, suggest the formation of poly-SAMPylated chains (poly-SAMPylation). SAMP2 can form covalent conjugates with its protein targets through an isopeptide linkage via its C-terminal diglycine motif in a streamlined archaeal E1-dependent pathway. It also forms homo-conjugates through an intermolecular isopeptide bond between the C-terminal Gly and the Lys58 side chain, a feature that likely resembles polyubiquitination. SAMP2 is involved in sulfur transfer during tRNA thiolation, much like Urm1. This family also includes uncharacterized proteins such as Methanothermococcus thermolithotrophicus Mth1743, Pyrococcus furiosus PF1061 and others, all closely related to MoaD proteins.	67
340861	cd17507	GT28_Beta-DGS-like	beta-diglucosyldiacylglycerol synthase and similar proteins. beta-diglucosyldiacylglycerol synthase (processive diacylglycerol beta-glucosyltransferase EC 2.4.1.315) is involved in the biosynthesis of both the bilayer- and non-bilayer-forming membrane glucolipids. This family of glycosyltransferases also contains plant major galactolipid synthase (chloroplastic monogalactosyldiacylglycerol synthase 1 EC 2.4.1.46). Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. The structures of the formed glycoconjugates are extremely diverse, reflecting a wide range of biological functions. The members of this family share a common GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility.	364
341225	cd17508	Alpha_kinase	Alpha kinase family; uncharacterized subgroup. The alpha kinase family is a novel family of eukaryotic protein kinase catalytic domains, which have no detectable similarity to conventional serine/threonine protein kinases. The family contains myosin heavy chain kinases, elongation factor-2 kinases, and bifunctional ion channel kinases. These kinases are implicated in a large variety of cellular processes such as protein translation, Mg2+/Ca2+ homeostasis, intracellular transport, cell migration, adhesion, and proliferation. The alpha-kinase family was named after the unique mode of substrate recognition by its initial members, the Dictyostelium heavy chain kinases, which targeted protein sequences that adopt an alpha-helical conformation. More recently, alpha-kinases were found to also target residues in non-helical regions.	243
341226	cd17509	Alpha_kinase	Alpha kinase family; uncharacterized subgroup. The alpha kinase family is a novel family of eukaryotic protein kinase catalytic domains, which have no detectable similarity to conventional serine/threonine protein kinases. The family contains myosin heavy chain kinases, elongation factor-2 kinases, and bifunctional ion channel kinases. These kinases are implicated in a large variety of cellular processes such as protein translation, Mg2+/Ca2+ homeostasis, intracellular transport, cell migration, adhesion, and proliferation. The alpha-kinase family was named after the unique mode of substrate recognition by its initial members, the Dictyostelium heavy chain kinases, which targeted protein sequences that adopt an alpha-helical conformation. More recently, alpha-kinases were found to also target residues in non-helical regions.	221
409322	cd17510	T3SC_YbjN-like_2	Uncharacterized protein structurally similar to type III secretion system chaperones and YbjN family proteins. This family includes an uncharacterized protein from Methanothermobacter thermautotrophicus that is structurally similar to type III secretion system (T3SS) chaperones (T3SC) that bind effector proteins, and is homologous to YbjN, a putative sensory transduction regulator protein found in Proteobacteria.	142
409323	cd17511	YbjN_AmyR-like	YbjN protein family is structurally similar to type III secretion system chaperones. This YbjN protein family includes Escherichia coli YbjN, Erwinia amylovora AmyR, and similar proteins. YbjN proteins share a class I type III secretion chaperone (T3SC)-like fold with type III secretion system (T3SS) chaperone proteins but appear to function independently of the T3SS. YbjN is an enterobacteria-specific protein. In E. coli, it acts as a sensory transduction regulator that may play important roles in regulating bacterial multicellular behavior, metabolism, and survival under stress conditions. E. amylovora AmyR, a functionally conserved ortholog of E. coli YbjN, is a stress- and virulence-associated protein that regulates the ams operon. Ams proteins are required for amylovoran biosynthesis. AmyR may also regulate the Rcs phosphorelay system, an atypical two-component signal transduction (TCST) system that is present only in Enterobacteriaceae and positively regulates amylovoran biosynthesis by activating ams operon transcription.	122
341193	cd17512	RMtype1_S_BceB55ORF5615P-TRD2-CR2_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Bacillus cereus HuB5-5 S subunit (S.BceB55ORF5615P) TRD2-CR2. The recognition sequence of Bacillus cereus HuB5-5 S subunit (S.BceB55ORF5615P) is undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	195
341194	cd17513	RMtype1_S_AveSPN6ORF1907P_TRD2-CR2_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Archaeoglobus veneficus SNP6 S subunit (S.AveSPN6ORF1907P) TRD2-CR2 and Bacillus subtilis JRS2 S subunit (S.BsuJRS7ORF3308P) TRD1-CR1. The recognition sequences of Archaeoglobus veneficus SNP6 S subunit (S.AveSPN6ORF1907P) and Bacillus subtilis JRS2 S subunit (S.BsuJRS7ORF3308P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	182
341195	cd17514	RMtype1_S_Eco2747I_MmaC7ORF19P-TRD-CR_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Escherichia coli ST2747 S subunit (S.Eco2747I) TRD2-CR2, Methanococcus maripaludis C7 S subunit (S.MmaC7ORF19P) TRD1-CR1, and related domains. The S.Eco2747I S subunit recognizes 5'... CACNNNNNNNGTTG ... 3'. The recognition sequence of Methanococcus maripaludis C7 S subunit (S.MmaC7ORF19P) is undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This CD contains both TRD1-CR1 and TRD2-CR2. It also includes the TRD-CR-like domains of putative type II restriction enzymes and methyltransferases, such as Helicobacter cinaedi PAGU611 Hci611ORFHP which may recognize 5'... GAGNNNNNGT ... 3', and type I N6-adenine DNA methyltransferases, such as Calditerrivibrio nitroreducens M.Cni19672ORF1405P whose recognition sequence is undetermined.	183
341196	cd17515	RMtype1_S_MjaORF132P_Sau1132ORF3780P-TRD1-CR1_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to MjaXIP/S.MjaORF132P TRD1-CR1, S.Sau1132ORF3780P TRD1-CR1, S.Mca353ORF290P TRD1-CR1, and other TRD-CR's. The Staphylococcus aureus subsp. aureus MSHR1132 S subunit (S.Sau1132ORF3780P) recognizes 5'... CAAGNNNNNRTC ... 3', and Moraxella catarrhalis S subunit (S.Mca353ORF290P) recognizes 5'... CAAGNNNNNNTGT ... 3'. The recognition sequence of Methanococcus jannaschii S subunit (MjaXIP/S.MjaORF132P) is undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. For example, S.Sau1132ORF3780P-TRD1 recognizes CAAG/CTTG, and S.Sau1132ORF3780P-TRD2 recognizes GAY/RTC. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	181
341197	cd17516	RMtype1_S_HinAWORF1578P-TRD2-CR2_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to S.HinAWORF1578P TRD2-CR2. Haemophilus influenzae RdAW S subunit (S.HinAWORF1578P) recognizes 5'... CTANNNNNGTTY ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	184
341198	cd17517	RMtype1_S_EcoKI_StySPI-TRD2-CR2_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Escherichia coli str. K-12 substr. MG1655 S subunit (S.EcoKI) TRD2-CR2, Salmonella enterica subsp. enterica serovar Potsdam S subunit (S.StySPI) TRD2-CR2, and other TRD-CR's. Escherichia coli str. K-12 substr. MG1655 S subunit (S.EcoKI) recognizes 5'... AACNNNNNNGTGC ... 3' and Salmonella enterica subsp. enterica serovar Potsdam S subunit (S.StySPI) recognizes 5'... AACNNNNNNGTRC ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. For example, S.EcoKI-TRD1 and S.StySPI-TRD1 both recognize AAC/GTT, S.EcoKI-TRD2 recognizes GCAC/GTGC and S.StySPI-TRD2 recognizes GYAC/GTRC. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It also includes TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases, such as Pseudomonas putida Jo 4-731 Type IIG restriction enzyme/N6-adenine DNA methyltransferase (RM.PpiI), and type I DNA methyltransferases such as Bacillus cereus BDRD-ST24 M subunit of Type I N6-adenine DNA methyltransferase (M.Bce24ORF51270P). RM.PpiI recognizes 5' ... GAACNNNNNCTC ... 3'.	192
341199	cd17518	RMtype1_S_Asp27244ORF1181P-TRD1-CR1_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Acinetobacter sp. S subunit (S.Asp27244ORF1181P) TRD1-CR1. The recognition sequence of Acinetobacter sp. S subunit (S.Asp27244ORF1181P) is undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	180
341200	cd17519	RMtype1_S_HpyCR35ORFAP-TRD1-CR1_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Helicobacter pylori CR35 S subunit (S.HpyCR35ORFAP) TRD1-CR1 and Mycoplasma haemofelis str. Langford 1 S subunit (S2.Mha1ORF7190P) TRD1-CR1. The recognition sequences of Helicobacter pylori CR35 S subunit (S.HpyCR35ORFAP) and Mycoplasma haemofelis str. Langford 1 S subunit (S2.Mha1ORF7190P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	183
341201	cd17520	RMtype1_S_HmoORF3075P-TRD1-CR1_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to Heliobacterium modesticaldum Ice1 S subunit (S1.HmoORF3075P) TRD1-CR1. The recognition sequence of Heliobacterium modesticaldum Ice1 S subunit (S1.HmoORF3075P) is undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This subfamily also includes the TRD-CR-like sequence-recognition domain of type I N6-adenine DNA methyltransferase (M) subunit of Clostridium intestinale URNW (M2.CinURNWORF2828P). The recognition sequence of M2.CinURNWORF2828P is undetermined. Type I methyltransferases included in this group include two domains: one for methylation, and another (TRD-CR-like) for sequence-recognition.	180
341202	cd17521	RMtype1_S_Sau13435ORF2165P_TRD2-CR2_like	Type I restriction-modification system specificity (S) subunit TRD-CR, similar to Staphylococcus aureus NCTC 13435 S subunit (S.Sau13435ORF2165P) TRD2-CR2, Escherichia coli E24377A S subunit (S.EcoE24377ORF286P) TRD1-CR1 and Pseudoalteromonas species P1-13-1a S subunit (S.Psp1bORF2093P) TRD2-CR2. Staphylococcus aureus NCTC 13435 S subunit (S.Sau13435ORF2165P) recognizes 5'... TCTANNNNNNRTTC ... 3', and the recognition sequences of Escherichia coli E24377A S subunit (S.EcoE24377ORF286P) and Pseudoalteromonas species P1-13-1a S subunit (S.Psp1bORF2093P) are undetermined. The restriction-modification (RM) system S subunit generally consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. For example, Staphylococcus aureus NCTC 13435 S subunit (S.Sau13435ORF2165P) TRD1 recognizes TCTA/TAGA, and -TRD2 recognizes GAAY/RTTC. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. In addition, this family includes RMtype1_S_TRD-CR_like domains of various putative Helicobacter type II restriction enzymes and methyltransferases, such as Hci611ORFHP and HfeORF12890P.	187
341203	cd17522	RMtype1_S_MjaORF1531P-TRD1-CR1_like	Type I restriction-modification system specificity (S) subunit TRD-CR, similar to Methanocaldococcus jannaschii DSM 2661 S subunit (S.MjaORF1531P/MjaXIIP) TRD1-CR1. The recognition sequence of Methanocaldococcus jannaschii DSM 2661 S subunit (S.MjaORF1531P, also called MjaXIIP) is undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	190
341204	cd17523	RMtype1_S_StySPI-TRD2-CR2_like	Type I restriction-modification system specificity (S) subunit TRD-CR, similar to Salmonella enterica subsp. enterica serovar Potsdam S subunit (S.StySPI) TRD2-CR2. Salmonella enterica subsp. enterica serovar Potsdam S subunit (S.StySPI) recognizes 5'... AACNNNNNNGTRC ... 3'. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. For example, S.StySPI-TRD1 recognizes AAC/GTT and S.StySPI-TRD2 recognizes GYAC/GTRC. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	190
341205	cd17524	RMtype1_S_EcoUTORF5051P-TRD2-CR2_like	Type I restriction-modification system specificity (S) subunit TRD-CR, similar to Escherichia coli UTI89 S subunit (S.EcoUTORF5051P) TRD2-CR2 and Archaeoglobus fulgidus VC-16 S subunit (S.AfuORF1715P) TRD2-CR2. Escherichia coli UTI89 S subunit (S.EcoUTORF5051P) recognizes 5'... CCANNNNNNNCTTC ... 3' and the recognition sequence of Archaeoglobus fulgidus VC-16 S subunit (S.AfuORF1715P) is undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It also includes TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases, such as Pseudomonas putida Jo 4-731 Type IIG restriction enzyme/N6-adenine DNA methyltransferase (RM.PpiI), and type I DNA methyltransferases such as Bacillus cereus BDRD-ST24 M subunit of Type I N6-adenine DNA methyltransferase (M.Bce24ORF51270P). RM.PpiI recognizes 5' ... GAACNNNNNCTC ... 3'.	189
341206	cd17525	RMtype1_S_Eco15ORF14057P-TRD1-CR1_like	Type I restriction-modification system specificity (S) subunit TRD-CR, similar to Escherichia coli 541-15 S subunit (S.Eco15ORF14057P) TRD1-CR1 and Desulfotignum phosphitoxidans S subunit (S.Dph13687ORF2110P) TRD2-CR2. The recognition sequences of Escherichia coli 541-15 S subunit (S.Eco15ORF14057P) and Desulfotignum phosphitoxidans S subunit (S.Dph13687ORF2110P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. It may also include TRD-CR-like sequence-recognition domains of various type II restriction enzymes and methyltransferases and type I DNA methyltransferases.	190
341207	cd17526	RMtype1_S_Cje2232P-TRD2-CR2_like	Type I restriction-modification system specificity (S) subunit TRD-CR, similar to Campylobacter jejuni RM 2232 S subunit (S.Cje2232P) TRD2-CR2 and Shewanella baltica OS223 S subunit (S.Sba223ORF389P) TRD1-CR1. The recognition sequences of Campylobacter jejuni RM 2232 S subunit (S.Cje2232P) and Shewanella baltica OS223 S subunit (S.Sba223ORF389P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. Also included in this subfamily is the C-terminal TRD-CR-like sequence-recognition domain of Microcystis aeruginosa putative type I N6-adenine DNA methyltransferase M subunit (M.Mae7806ORF3969P). The recognition sequence of M.Mae7806ORF3969P is undetermined.	192
381744	cd17527	HAMP_II	second HAMP domain of aerotaxis transducer Aer2 and similar domains. HAMP is a signaling domain which occurs in a wide variety of signaling proteins, many of which are bacterial. The HAMP domain consists of two alpha helices connected by an extended linker. The structure of the Af1503 HAMP dimer from Archaeoglobus fulgidus has been solved using nuclear magnetic resonance, revealing a parallel four-helix bundle; this structure has been confirmed by cross-linking analysis of HAMP domains from the Escherichia coli aerotaxis receptor Aer. It has been suggested that the four-helix arrangement can rotate between the unusually packed conformation observed in the NMR structure and a canonical coiled-coil arrangement. Such rotation may coincide with signal transduction, but a common mechanism by which HAMP domains relay a variety of input signals has yet to be established.	46
381745	cd17528	HAMP_III	third HAMP domain of aerotaxis transducer Aer2 and similar domains. HAMP is a signaling domain which occurs in a wide variety of signaling proteins, many of which are bacterial. The HAMP domain consists of two alpha helices connected by an extended linker. The structure of the Af1503 HAMP dimer from Archaeoglobus fulgidus has been solved using nuclear magnetic resonance, revealing a parallel four-helix bundle; this structure has been confirmed by cross-linking analysis of HAMP domains from the Escherichia coli aerotaxis receptor Aer. It has been suggested that the four-helix arrangement can rotate between the unusually packed conformation observed in the NMR structure and a canonical coiled-coil arrangement. Such rotation may coincide with signal transduction, but a common mechanism by which HAMP domains relay a variety of input signals has yet to be established.	44
381746	cd17529	HAMP_I	first HAMP domain of aerotaxis transducer Aer2 and similar domains. HAMP is a signaling domain which occurs in a wide variety of signaling proteins, many of which are bacterial. The HAMP domain consists of two alpha helices connected by an extended linker. The structure of the Af1503 HAMP dimer from Archaeoglobus fulgidus has been solved using nuclear magnetic resonance, revealing a parallel four-helix bundle; this structure has been confirmed by cross-linking analysis of HAMP domains from the Escherichia coli aerotaxis receptor Aer. It has been suggested that the four-helix arrangement can rotate between the unusually packed conformation observed in the NMR structure and a canonical coiled-coil arrangement. Such rotation may coincide with signal transduction, but a common mechanism by which HAMP domains relay a variety of input signals has yet to be established.	44
381086	cd17530	REC_RocR	phosphoacceptor receiver (REC) domain of response regulator RocR. The response regulator RocR from some pathogens contains an N-terminal phosphoreceiver (REC) domain and a C-terminal EAL domain that possesses c-di-GMP specific phosphodiesterase activity. The RocR REC domain is phosphorylated and modulates its EAL domain enzymatic activity, regulating the local level of c-di-GMP. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	123
381087	cd17532	REC_LytTR_AlgR-like	phosphoacceptor receiver (REC) domain of LytTR/AlgR family response regulators similar to AlgR. Members of the LytTR/AlgR family of response regulators contain a REC domain and a unique LytTR DNA-binding output domain that lacks the helix-turn-helix motif and consists mostly of beta-strands. Transcriptional regulators with the LytTR-type output domains are involved in biosynthesis of extracellular polysaccharides, fimbriation, expression of exoproteins, including toxins, and quorum sensing. Included in this AlgR-like group of LytTR/AlgR family response regulators are Streptococcus agalactiae sensory transduction protein LytR, Pseudomonas aeruginosa positive alginate biosynthesis regulatory protein AlgR, Bacillus subtilis sensory transduction protein LytT, and Escherichia coli transcriptional regulatory protein BtsR, which are members of two-component regulatory systems. LytR and LytT are components of regulatory systems that regulate genes involved in cell wall metabolism. AlgR positively regulates the algD gene, which codes for a GDP-mannose dehydrogenase, a key enzyme in the alginate biosynthesis pathway. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	118
381088	cd17533	REC_LytTR_AgrA-like	phosphoacceptor receiver (REC) domain of LytTR/AlgR family response regulators similar to AgrA. Members of the LytTR/AlgR family of response regulators contain a REC domain and a unique LytTR DNA-binding output domain that lacks the helix-turn-helix motif and consists mostly of beta-strands. Transcriptional regulators with the LytTR-type output domains are involved in biosynthesis of extracellular polysaccharides, fimbriation, expression of exoproteins, including toxins, and quorum sensing. Included in this AgrA-like group of LytTR/AlgR family response regulators are Staphylococcus aureus accessory gene regulator protein A (AgrA) and Streptococcus pneumoniae response regulator ComE, which are members of two-component regulatory systems. AgrA is a global regulator that controls the synthesis of virulence factors and other exoproteins. ComE is part of the ComD-ComE system that is part of a quorum-sensing signaling pathway that controls the development of competence, a physiological state required for genetic transformation. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	131
381089	cd17534	REC_DC-like	phosphoacceptor receiver (REC) domain of modulated diguanylate cyclase and similar domains. This group includes a modulated diguanylate cyclase containing a PAS sensor domain from Desulfovibrio desulfuricans G20. Members of this group contain N-terminal REC domains and various output domains including the GGDEF, histidine kinase, and helix-turn-helix (HTH) DNA binding domains. Also included in this family is Mycobacterium tuberculosis PdtaR, a transcriptional antiterminator that contains a REC domain and an ANTAR RNA-binding output domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	117
381090	cd17535	REC_NarL-like	phosphoacceptor receiver (REC) domain of NarL (Nitrate/Nitrite response regulator L) family response regulators. The NarL family is one of the more abundant families of DNA-binding response regulators (RRs). Members of the NarL family contain a REC domain and a helix-turn-helix (HTH) DNA-binding output domain, with a majority of members containing a LuxR-type HTH domain. They function as transcriptional regulators. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	117
381091	cd17536	REC_YesN-like	phosphoacceptor receiver (REC) domain of YesN and related helix-turn-helix containing response regulators. This family is composed of uncharacterized response regulators that contain a REC domain and an AraC family helix-turn-helix (HTH) DNA-binding output domain, including Bacillus subtilis uncharacterized transcriptional regulatory protein YesN and Staphylococcus aureus uncharacterized response regulatory protein SAR0214. YesN is a member of the two-component regulatory system YesM/YesN and SAR0214 is a member of the probable two-component regulatory system SAR0215/SAR0214. Also included in this family is the AlgR-like group of LytTR/AlgR family response regulators, which includes Pseudomonas aeruginosa positive alginate biosynthesis regulatory protein AlgR and Bacillus subtilis sensory transduction protein LytT, among others. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	121
381092	cd17537	REC_FixJ	phosphoacceptor receiver (REC) domain of FixJ family response regulators. FixJ family response regulators contain an N-terminal receiver domain (REC) and a C-terminal LuxR family helix-turn-helix (HTH) DNA-binding output domain. The Sinorhizobium meliloti two-component system FixL/FixJ regulates nitrogen fixation in response to oxygen during symbiosis. Under microaerobic conditions, the kinase FixL phosphorylates the response regulator FixJ resulting in the regulation of nitrogen fixation genes such as nifA and fixK. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	116
381093	cd17538	REC_D1_PleD-like	first (D1) phosphoacceptor receiver (REC) domain of response regulator PleD and similar domains. PleD contains a REC domain (D1) with the phosphorylatable aspartate, a REC-like adaptor domain (D2), and the enzymatic diguanylate cyclase (DGC) domain, also called the GGDEF domain according to a conserved sequence motif, as its output domain. The GGDEF-containing PleD response regulators are global regulators of cell metabolism in some important human pathogens. This model describes D1 of PleD and similar domains. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	104
381094	cd17539	psREC-like_D2_PleD	REC-like adaptor domain (D2) of response regulator PleD. PleD contains a REC domain (D1) with the phosphorylatable aspartate, a pseudo receiver (psREC)-like adaptor domain (D2), and the enzymatic diguanylate cyclase (DGC) domain, also called the GGDEF domain according to a conserved sequence motif, as its output domain. The GGDEF-containing PleD response regulators are global regulators of cell metabolism in some important human pathogens. This model describes the REC-like adaptor domain D2 of PleD, which is an inactive domain.	124
381095	cd17540	REC_PhyR	phosphoacceptor receiver (REC) domain of response regulator PhyR and similar proteins. PhyR is a hybrid stress regulator that contains an N-terminal sigma-like (SL) domain and a C-terminal REC domain. Phosphorylation of the REC domain is known to promote binding of the SL domain to an anti-sigma factor. PhyR thus functions as an anti-anti-sigma factor in its phosphorylated state. It is involved in the general stress response. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	117
381096	cd17541	REC_CheB-like	phosphoacceptor receiver (REC) domain of chemotaxis response regulator protein-glutamate methylesterase CheB and similar chemotaxis proteins. Methylesterase CheB is a chemotaxis response regulator with an N-terminal REC domain and a C-terminal methylesterase domain. Chemotaxis is a behavior known in motile bacteria that directs their movement in response to chemical gradients. CheB is a phosphorylation-activated response regulator involved in the reversible modification of bacterial chemotaxis receptors. It catalyzes the demethylation of specific methylglutamate residues introduced into the chemoreceptors (methyl-accepting chemotaxis proteins) by CheR. The CheB REC domain packs against the active site of the C-terminal domain and inhibits methylesterase activity by directly restricting access to the active site. Also included in this family is chemotaxis response regulator CheY, which contains a stand-alone REC domain, and an uncharacterized subfamily composed of proteins containing an N-terminal REC domain and a C-terminal CheY-P phosphatase (CheC) domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	125
381097	cd17542	REC_CheY	phosphoacceptor receiver (REC) domain of chemotaxis protein CheY. The chemotaxis response regulator CheY contains a stand-alone REC domain. Chemotaxis is a behavior known for motile bacteria that directs their movement in response to chemical gradients. CheY is involved in transmitting sensory signals from chemoreceptors to the flagellar motors. Phosphorylated CheY interacts with the flagella switch components FliM and FliY, which causes counterclockwise rotation of the flagella, resulting in smooth swimming. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	117
381098	cd17544	REC_2_GGDEF	second phosphoacceptor receiver (REC) domain of uncharacterized GGDEF domain proteins. This family is composed of uncharacterized PleD-like response regulators that contain two N-terminal REC domains and a C-terminal diguanylate cyclase output domain with the characteristic GGDEF motif at the active site. Unlike PleD which contains a REC-like adaptor domain, the second REC domain of these uncharacterized GGDEF domain proteins, described in this model, contains characteristic metal-binding and active site residues. PleD response regulators are global regulators of cell metabolism in some important human pathogens. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	122
381099	cd17546	REC_hyHK_CKI1_RcsC-like	phosphoacceptor receiver (REC) domain of hybrid sensor histidine kinases/response regulators similar to Arabidopsis thaliana CKI1 and Escherichia coli RcsC. This family is composed of hybrid sensor histidine kinases/response regulators that are sensor histidine kinases (HKs) fused with a REC domain, similar to the sensor histidine kinase CKI1 from Arabidopsis thaliana, which is involved in multi-step phosphorelay (MSP) signaling that mediates responses to a variety of important stimuli in plants. MSP involves a signal being transferred from HKs via histidine phosphotransfer proteins (AHP1-AHP5) to nuclear response regulators. The CKI1 REC domain specifically interacts with the downstream signaling proteins AHP2, AHP3 and AHP5. The plant MSP system has evolved from the prokaryotic two-component system (TCS), which allows organisms to sense and respond to changes in environmental conditions. This family also includes bacterial hybrid sensor HKs such as Escherichia coli RcsC, which is a component of the Rcs signalling pathway that controls a variety of physiological functions like capsule synthesis, cell division, and motility. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	113
381100	cd17548	REC_DivK-like	phosphoacceptor receiver (REC) domain of DivK and similar proteins. Caulobacter crescentus DivK is an essential response regulator that is involved in the complex phosphorelay pathways controlling both cell division and motility. It localizes cell cycle regulators to specific poles of the cell during division. DivK contains a stand-alone REC domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	115
381101	cd17549	REC_DctD-like	phosphoacceptor receiver (REC) domain of C4-dicarboxylic acid transport protein D (DctD) and similar proteins. C4-dicarboxylic acid transport protein D (DctD) is part of the two-component regulatory system DctB/DctD, which regulates C4-dicarboxylate transport via regulation of expression of the dctPQM operon and dctA. It is an activator of sigma(54)-RNA polymerase holoenzyme that uses the energy released from ATP hydrolysis to stimulate the isomerization of a closed promoter complex to an open complex capable of initiating transcription. DctD is a member of the NtrC family, characterized by a domain architecture containing an N-terminal REC domain, followed by a central sigma-54 interaction/ATPase domain, and a C-terminal DNA binding domain. The ability of the central domain to hydrolyze ATP and thus to interact effectively with a complex of RNA polymerase, sigma54, and promoter, is controlled by the phosphorylation status of the REC domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	130
381102	cd17550	REC_NtrX-like	phosphoacceptor receiver (REC) domain of nitrogen assimilation regulatory protein NtrX and similar proteins. NtrX is part of the two-component regulatory system NtrY/NtrX that is involved in the activation of nitrogen assimilatory genes such as Gln. It is phosphorylated by the histidine kinase NtrY and interacts with sigma-54. NtrX is a member of the NtrC family, characterized by a domain architecture containing an N-terminal REC domain, followed by a central sigma-54 interaction/ATPase domain, and a C-terminal DNA binding domain. NtrC family response regulators are sigma54-dependent transcriptional activators. Also included in this subfamily is Aquifex aeolicus NtrC4. The ability of the central domain to hydrolyze ATP and thus to interact effectively with a complex of RNA polymerase, sigma54, and promoter, is controlled by the phosphorylation status of the REC domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	115
381103	cd17551	REC_RpfG-like	phosphoacceptor receiver (REC) domain of cyclic di-GMP phosphodiesterase response regulator RpfG and similar proteins. Cyclic di-GMP phosphodiesterase response regulator RpfG, together with sensory/regulatory protein RpfC, constitutes a two-component system implicated in sensing and responding to the diffusible signal factor (DSF) that is essential for cell-cell signaling. RpfC is a hybrid sensor/histidine kinase that phosphorylates and activates RpfG, which degrades cyclic di-GMP to GMP, leading to the activation of Clp, a global transcriptional regulator that regulates a large set of genes in the DSF pathway. RpfG contains a CheY-like receiver domain attached to a histidine-aspartic acid-glycine-tyrosine-proline (HD-GYP) cyclic di-GMP phosphodiesterase domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	118
381104	cd17552	REC_RR468-like	phosphoacceptor receiver (REC) domain of Thermotoga maritima response regulator RR468 and similar domains. Thermotoga maritima RR468 (encoded by gene TM0468) is the cognate response regulator (RR) of the class I histidine kinase HK853 (product of gene TM0853). HK853/RR468 comprise a two-component system (TCS) that couples environmental stimuli to adaptive responses. This subfamily also includes Fremyella diplosiphon complementary adaptation response regulator homolog RcaF, a small RR that is involved in four-step phosphorelays of the complementary chromatic adaptation (CCA) system that occurs in many cyanobacteria. Both RR468 and RcaF are stand-alone RRs containing only a REC domain with no output/effector domain. The REC domain itself functions as an effector domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	121
381105	cd17553	REC_Spo0F-like	phosphoacceptor receiver (REC) domain of Spo0F and similar domains. Spo0F, a stand-alone response regulator containing only a REC domain with no output/effector domain, controls sporulation in Bacillus subtilis through the exchange of a phosphoryl group. Bacillus subtilis forms spores when conditions for growth become unfavorable. The initiation of sporulation is controlled by a phosphorelay (an expanded version of the two-component system) that consists of four main components: a histidine kinase (KinA), a secondary messenger (Spo0F), a phosphotransferase (Spo0B), and a transcription factor (Spo0A). REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	117
381106	cd17554	REC_TrrA-like	phosphoacceptor receiver (REC) domain of Thermotoga maritima response regulator TrrA and similar domains. Thermotoga maritima contains a two-component signal transduction system (TCS) composed of the ThkA sensory histidine kinase (HK) and its cognate response regulator (RR) TrrA; the specific function of the system is unknown. TCSs couple environmental stimuli to adaptive responses. TrrA is a stand-alone RR containing only a REC domain with no output/effector domain. The REC domain itself functions as an effector domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	113
381107	cd17555	REC_RssB-like	phosphoacceptor receiver (REC) domain of Pseudomonas aeruginosa RssB and similar domains. Pseudomonas aeruginosa RssB is an orphan atypical response regulator containing a REC domain and a PP2C-type protein phosphatase output domain. Its function is still unknown. Escherichia RssB, which is not included in this subfamily, is a ClpX adaptor protein which alters ClpX specificity by mediating a specific interaction between ClpX and the substrates such as RpoS, an RNA polymerase sigma factor. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	116
381108	cd17557	REC_Rcp-like	phosphoacceptor receiver (REC) domain of cyanobacterial phytochrome response regulator Rcp and similar domains. This family is composed of response regulators (RRs) that are members of phytochrome-associated, light-sensing two-component signal transduction pathways such as Synechocystis sp. Rcp1, Tolypothrix sp. RcpA, and Agrobacterium tumefaciens bacteriophytochrome response regulator AtBRR. They are stand-alone RRs containing only a REC domain with no output/effector domain. The REC domain itself functions as an effector domain. Also included in this family is Methanosaeta harundinacea methanogenesis regulatory protein FilR2, also a stand-alone RR. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	129
381109	cd17561	REC_Spo0A	phosphoacceptor receiver (REC) domain of Spo0A. Spo0A is a response regulator of the phosphorelay system in the early stage of spore formation. It may be an element of the effector pathway responsible for the activation of sporulation genes in response to nutritional stress and may act in concert with sigma factor spo0H to control the expression of some genes that are critical to the sporulation process. Spo0A contains a regulatory N-terminal REC domain and a C-terminal DNA-binding transcription activation domain as its effector/output domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	108
381110	cd17562	REC_CheY4-like	phosphoacceptor receiver (REC) domain of chemotaxis response regulator CheY4 and similar CheY family proteins. CheY family chemotaxis response regulators (RRs) comprise about 17% of bacterial RRs and almost half of all RRs in archaea. This subfamily contains Vibrio cholerae CheY4 and similar CheY family RRs. CheY proteins control bacterial motility and participate in signaling phosphorelays and in protein-protein interactions. CheY RRs contain only the REC domain with no output/effector domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	118
381111	cd17563	REC_RegA-like	phosphoacceptor receiver (REC) domain of photosynthetic apparatus regulatory protein RegA. Rhodobacter sphaeroides RegA, also called response regulator PrrA, is the DNA binding regulatory protein of a redox-responsive two-component regulatory system RegB/RegA that is involved in transactivating anaerobic expression of the photosynthetic apparatus. It contains a REC domain and a DNA-binding helix-turn-helix output domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	112
381112	cd17565	REC_GlnL-like	phosphoacceptor receiver (REC) domain of transcriptional regulatory protein GlnL and similar proteins. Bacillus subtilis GlnL is part of the GlnK-GlnL (formerly YcbA-YcbB) two-component system that positively regulates the expression of the glsA-glnT (formerly ybgJ-ybgH) operon in response to glutamine. It contains a REC domain and a DNA-binding output domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	103
381113	cd17569	REC_HupR-like	phosphoacceptor receiver (REC) domain of hydrogen uptake protein regulator (HupR) and similar domains. This family is composed of mostly uncharacterized response regulators with similarity to the REC domains of response regulator components of two-component systems that regulate hydrogenase activity, including HupR and HoxA. HupR is part of the HupT/HupR system that controls the synthesis of the membrane-bound [NiFe]hydrogenase, HupSL, of the photosynthetic bacterium Rhodobacter capsulatus. It contains an N-terminal REC domain, a central sigma-54 interaction domain that lacks ATPase activity, and a C-terminal DNA-binding domain. Members of this family contain a REC domain and various output domains including the cyclase homology domain (CHD) and the c-di-GMP phosphodiesterase domains, HD-GYP and EAL. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	118
381114	cd17572	REC_NtrC1-like	phosphoacceptor receiver (REC) domain of nitrogen regulatory protein C 1 (NtrC1) from Aquifex aeolicus and similar NtrC family response regulators. NtrC family proteins are transcriptional regulators that have REC, AAA+ ATPase/sigma-54 interaction, and DNA-binding output domains. This subfamily of NtrC proteins includes Aquifex aeolicus NtrC1 and Vibrio quorum-sensing signal integrator LuxO. The N-terminal REC domain of NtrC proteins regulates the activity of the protein and its phosphorylation controls the AAA+ domain oligomerization, while the central AAA+ domain participates in nucleotide binding, hydrolysis, oligomerization, and sigma54 interaction. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	121
381115	cd17573	REC_HP-RR-like	phosphoacceptor receiver (REC) domain of orphan response regulator HP-RR and similar proteins. Helicobacter pylori response regulator hp1043 (HP-RR) is an orphan response regulator which is phosphorylation-independent and is essential for growth. HP-RR functions as a cell growth-associated regulator in the absence of post-translational modification. Members of this subfamily contain REC and DNA-binding output domains. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	110
381116	cd17574	REC_OmpR	phosphoacceptor receiver (REC) domain of OmpR family response regulators. OmpR-like proteins are among the most widespread transcriptional regulators. OmpR family members contain REC and winged helix-turn-helix (wHTH) DNA-binding output effector domains. They are involved in the control of environmental stress tolerance (such as the oxidative, osmotic and acid stress response), motility, virulence, outer membrane biogenesis and other processes. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	99
381117	cd17575	REC_WspR-like	phosphoacceptor receiver (REC) domain of WspR response regulator and similar proteins. The GGDEF response regulator WspR is part of the Wsp system that is homologous to chemotaxis systems and also includes the membrane-bound receptor protein WspA. In response to growth on surfaces, WspR is phosphorylated by the Wsp signal transduction complex and is activated, functioning as a diguanylate cyclase (DGC) that catalyzes c-di-GMP synthesis. WspR is a hybrid response regulator-diguanylate cyclase, containing an N-terminal REC domain and a C-terminal GGDEF domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	128
381118	cd17580	REC_2_DhkD-like	second phosphoacceptor receiver (REC) domain of Dictyostelium discoideum hybrid signal transduction histidine kinase D and similar domains. Dictyostelium discoideum hybrid signal transduction histidine kinase D (DhkD) is a large protein that contains two histidine kinase (HK) and two REC domains on the intracellular side of a single pass transmembrane domain, and extracellular PAS and PAC domains that likely are involved in ligand binding. This model represents the second REC domain and similar domains. DhkD activates the cAMP phosphodiesterase RegA to ensure proper prestalk and prespore patterning, tip formation, and the vertical elongation of the mound into a finger, in Dictyostelium discoideum. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	112
381119	cd17581	REC_typeA_ARR	phosphoacceptor receiver (REC) domain of type A Arabidopsis response regulators (ARRs) and similar proteins. Type-A response regulators of Arabidopsis (ARRs) are involved in cytokinin signaling, which involves a phosphorelay cascade by histidine kinase receptors (AHKs), histidine phosphotransfer proteins (AHPs) and downstream ARRs. Cytokinin is a plant hormone implicated in many growth and development processes including shoot organogenesis, leaf senescence, sink/source relationships, vascular development, lateral bud release, and photomorphogenic development. Type-A ARRs function downstream of and are regulated by type-B ARRs, which are a class of MYB-type transcription factors. As primary cytokinin response genes, type-A ARRs act as redundant negative feedback regulators of cytokinin signaling by inactivating the phosphorelay. ARRs are divided into two groups, type-A and -B, according to their sequence and domain structure. Type-A ARRs are similar in domain structure to CheY, in that they lack a typical output domain and only contain a stand-alone receiver (REC) domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	122
381120	cd17582	psREC_PRR	pseudo receiver domain of pseudo-response regulators. In Arabidopsis, five pseudo-response regulators (PRRs), also called APRRs, comprise a core group of clock components that controls the pace of the central oscillator of the circadian clock, an endogenous time-keeping mechanism that enables organisms to adapt to external daily cycles. The coordinated sequential expression of PRR9 (APRR9), PRR7 (APRR7), PRR5 (APRR5), PRR3 (APRR3), and PRR1 (APRR1) results in circadian waves that may be at the basis of the endogenous circadian clock. PRRs contain an N-terminal pseudo receiver (psREC) domain that resembles the receiver domain of a two-component response regulator, but lacks an aspartate residue that accepts a phosphoryl group from the sensor kinase, and a CCT motif at the C-terminus that contains a putative nuclear localization signal. The psREC domain is involved in protein-protein interactions.	104
381121	cd17584	REC_typeB_ARR-like	phosphoacceptor receiver (REC) domain of type B Arabidopsis response regulators (ARRs) and similar domains. Type-B ARRs (Arabidopsis response regulators) are a class of MYB-type transcription factors that act as major players in the transcriptional activation of cytokinin-responsive genes. They directly regulate the expression of type-A ARR genes and other downstream target genes. Cytokinin is a plant hormone implicated in many growth and development processes including shoot organogenesis, leaf senescence, sink/source relationships, vascular development, lateral bud release, and photomorphogenic development. Cytokinin signaling involves a phosphorelay cascade by histidine kinase receptors (AHKs), histidine phosphotransfer proteins (AHPs) and downstream ARRs. ARRs are divided into two groups, type-A and -B, according to their sequence and domain structure. Type-B ARRs contain a receiver (REC) domain and a large C-terminal extension that has characteristics of an effector or output domain, with a Myb-like DNA binding domain referred to as the GARP domain. The GARP domain is a motif specific to plant transcription factors. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	115
381122	cd17586	REC_PFxFATGY	phosphoacceptor receiver (REC) domain of PFxFATGY motif single-domain (stand-alone) response regulators. This subfamily is composed of stand-alone response regulators (RRs) containing the PFxFATG[G/Y] motif; RRs with such a motif are also called ''FAT GUY'' response regulators. Included in this subfamily are Sphingomonas melonis SdrG, Sinorhizobium meliloti Sma0114, and Erythrobacter litoralis EL_LovR. SdrG is involved in the control of the general stress response. Sma0114 is part of the Sma0113/Sma0114 two-component system (TCS) that is involved in catabolite repression and polyhydroxy butyrate synthesis. EL_LovR is involved in a light-regulated TCS. PFxFATG[G/Y] RRs are typically associated with histidine-tryptophan-glutamate (HWE) histidine kinases that constitute a subclass of the larger histidine kinase superfamily characterized by an altered ATP binding site, which lacks the F-box that is normally an integral component of the ATP lid. The PFxFATG[G/Y] motif is involved in conformational changes after phosphorylation that result in the activation of the RR. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	111
381123	cd17589	REC_TPR	phosphoacceptor receiver (REC) domain of uncharacterized tetratricopeptide repeat (TPR)-containing response regulators. Response regulators share the common phosphoacceptor REC domain and different output domains. This subfamily contains uncharacterized response regulators with TPR repeats as the effector or output domain, which might contain between 3 and 16 TPR repeats (each about 34 amino acids). TPR-containing proteins occur in all domains of life and the abundance of TPR-containing proteins in a bacterial proteome is not indicative of virulence. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. Some members in this subfamily may contain inactive REC domains lacking canonical metal-binding and active site residues.	115
381124	cd17593	REC_CheC-like	phosphoacceptor receiver (REC) domain of uncharacterized response regulators containing a CheC domain. This subfamily is composed of uncharacterized proteins containing an N-terminal REC domain and a C-terminal CheC domain that may function as the output/effector domain of a response regulator. CheC is a CheY-P phosphatase, affecting the level of phosphorylated CheY which controls the sense of flagella rotation and determines swimming behavior of chemotactic bacteria. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	117
381125	cd17594	REC_OmpR_VirG	phosphoacceptor receiver (REC) domain of VirG-like OmpR family response regulators. VirG is part of the VirA/VirG two-component system that regulates the expression of virulence (vir) genes. The histidine kinase VirA senses a phenolic wound response signal, undergoes autophosphorylation, and phosphorelays to the VirG response regulator, which induces transcription of the vir regulon. VirG belongs to the OmpR family of DNA-binding response regulators that contain N-terminal receiver (REC) and C-terminal DNA-binding winged helix-turn-helix effector domains. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	113
381126	cd17595	REC_TrxB	phosphoacceptor receiver (REC) domain of a fused response regulator with a thioredoxin reductase output domain. This family is composed of uncharacterized fusion proteins containing a REC domain and a thioredoxin reductase domain. Thioredoxin reductase catalyzes the reduction of thioredoxin and is thus a central component in the thioredoxin system. Fusion proteins containing REC and thioredoxin reductase domains could play an important role in the environmental regulation of the cellular dithiol-disulfide ratio. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	135
381127	cd17596	REC_HupR	phosphoacceptor receiver (REC) domain of hydrogen uptake protein regulator (HupR). Members of this subfamily are response regulator components of two-component systems that regulate hydrogenase activity, including HupR and HoxA. HupR is part of the HupT/HupR system that controls the synthesis of the membrane-bound [NiFe]hydrogenase, HupSL, of the photosynthetic bacterium Rhodobacter capsulatus. It belongs to the nitrogen regulatory protein C (NtrC) family of response regulators, which activate transcription by RNA polymerase (RNAP) in response to a change in the environment. HupR is an unusual member of this family as it activates transcription when unphosphorylated, and transcription is inhibited by phosphorylation. Proteins in this subfamily contain an N-terminal REC domain, a central sigma-54 interaction domain that lacks ATPase activity, and a C-terminal DNA-binding domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	133
381128	cd17598	REC_hyHK	phosphoacceptor receiver (REC) domain of uncharacterized hybrid sensor histidine kinase/response regulators. Typically, two-component regulatory systems (TCSs) consist of a sensor (histidine kinase) that responds to specific input(s) by modifying the output of a cognate response regulator (RR). TCSs allow organisms to sense and respond to changes in environmental conditions. Hybrid sensor histidine kinase/response regulators contain all the elements of a classical TCS in a single polypeptide chain. RRs share the common phosphoacceptor REC domain and different effector/output domains such as DNA, RNA, ligand-binding, protein-binding, or enzymatic domains. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	118
381129	cd17602	REC_PatA-like	phosphoacceptor receiver (REC) domain of PatA and similar domains. Nostoc sp. (or Anabaena sp.) PatA is necessary for proper patterning of heterocysts along filaments. PatA contains a phosphoacceptor REC domain at its C-terminus and an N-terminal PATAN (PatA N-terminus) domain, which was proposed in a bioinformatics study to mediate protein-protein interactions. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. Some members of this group may have an inactive REC domain, lacking canonical metal-binding and active site residues.	102
381130	cd17614	REC_OmpR_YycF-like	phosphoacceptor receiver (REC) domain of YycF-like OmpR family response regulators. YycF appears to play an important role in cell wall integrity in a wide range of gram-positive bacteria, and may also modulate cell membrane integrity. It functions as part of a phosphotransfer system that ultimately controls the levels of competence within the bacteria. YycF belongs to the OmpR family of response regulators, which are characterized by a REC domain and a winged helix-turn-helix effector domain involved in DNA binding. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	115
381131	cd17615	REC_OmpR_MtPhoP-like	phosphoacceptor receiver (REC) domain of MtPhoP-like OmpR family response regulators. Mycobacterium tuberculosis PhoP (MtPhoP) is part of the PhoP/PhoR two-component system that is involved in phosphate control by stimulating expression of genes involved in scavenging, transport and mobilization of phosphate, and repressing the utilization of nitrogen sources. Also included in this subfamily is Mycobacterium tuberculosis transcriptional regulatory protein TcrX, part of the two-component regulatory system TcrY/TcrX that may be involved in virulence. Members of this subfamily belong to the OmpR family of DNA-binding response regulators, which are characterized by a REC domain and a winged helix-turn-helix (wHTH) DNA-binding output effector domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	118
381132	cd17616	REC_OmpR_CtrA	phosphoacceptor receiver (REC) domain of CtrA-like OmpR family response regulators. CtrA is part of the CckA-ChpT-CtrA phosphorelay that is conserved in alphaproteobacteria and is important in orchestrating the cell cycle, polar development, and flagellar biogenesis. CtrA is the master regulator of flagella synthesis genes and also regulates genes involved in the cell cycle, exopolysaccharide synthesis, and cyclic-di-GMP signaling. CtrA is active as a transcription factor when phosphorylated. It is a member of the OmpR family of DNA-binding response regulators, characterized by a REC domain and a winged helix-turn-helix (wHTH) DNA-binding output effector domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	114
381133	cd17618	REC_OmpR_PhoB	phosphoacceptor receiver (REC) domain of PhoB response regulator from the OmpR family. The transcription factor PhoB is a component of the PhoR/PhoB two-component system, a key regulatory protein network that facilitates response to inorganic phosphate (Pi) starvation conditions by turning on the phosphate (pho) regulon whose products are involved in phosphorus uptake and metabolism. PhoB is a member of the OmpR family of DNA-binding response regulators that contains REC and winged helix-turn-helix (wHTH) DNA-binding output effector domains. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	118
381134	cd17619	REC_OmpR_ArcA_TorR-like	phosphoacceptor receiver (REC) domain of ArcA- and TorR-like OmpR family response regulators. This subfamily includes Escherichia coli TorR and ArcA, both OmpR family response regulators that mediate adaptation to changes in various respiratory growth conditions. The TorS-TorR two-component system (TCS) is responsible for the tight regulation of the torCAD operon, which encodes the trimethylamine N-oxide (TMAO) reductase respiratory system in response to anaerobic conditions and the presence of TMAO. The ArcA-ArcB TCS is involved in cell growth during anaerobiosis. ArcA is a global regulator that controls more than 30 operons involved in redox regulation (the Arc modulon). OmpR family DNA-binding response regulators are characterized by a REC domain and a winged helix-turn-helix (wHTH) DNA-binding output effector domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	113
381135	cd17620	REC_OmpR_KdpE-like	phosphoacceptor receiver (REC) domain of KdpE-like OmpR family response regulators. KdpE is a component of the KdpD/KdpE two-component system (TCS) and is activated when histidine kinase KdpD senses a drop in external K+ concentration or upshift in ionic osmolarity, resulting in the expression of a heterooligomeric transporter KdpFABC. In addition, the KdpD/KdpE TCS is also an adaptive regulator involved in the virulence and intracellular survival of pathogenic bacteria. KdpE is a member of the OmpR family of DNA-binding response regulators that contain REC and winged helix-turn-helix (wHTH) DNA-binding output effector domains. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	99
381136	cd17621	REC_OmpR_RegX3-like	phosphoacceptor receiver (REC) domain of RegX3-like OmpR family response regulators. RegX3 is a member of the SenX3-RegX3 two-component system that is involved in phosphate-sensing signal transduction. Phosphorylated RegX3 functions as a transcriptional activator of phoA. It induces transcription in phosphate limiting environment and also controls expression of several critical metabolic enzymes in aerobic condition. RegX3 belongs to the OmpR family of DNA-binding response regulators that contain N-terminal receiver and C-terminal DNA-binding winged helix-turn-helix effector domains. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	99
381137	cd17622	REC_OmpR_kpRstA-like	phosphoacceptor receiver (REC) domain of kpRstA-like OmpR family response regulators. Klebsiella pneumoniae RstA (kpRstA) is part of the RstA/RstB two-component regulatory system that may play a regulatory role in virulence. It belongs to the OmpR family of DNA-binding response regulators that contain N-terminal receiver (REC) and C-terminal DNA-binding winged helix-turn-helix effector domains. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	116
381138	cd17623	REC_OmpR_CpxR	phosphoacceptor receiver (REC) domain of CpxR-like OmpR family response regulators. CpxR is part of the CpxA/CpxR two-component regulatory system that mediates envelope stress responses that is key for virulence and antibiotic resistance in several Gram negative pathogens. CpxR is a transcription factor/response regulator that controls the expression of numerous genes, including those of the classical porins OmpF and OmpC. It belongs to the OmpR family of DNA-binding response regulators that contain N-terminal receiver (REC) and C-terminal DNA-binding winged helix-turn-helix effector domains. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	115
381139	cd17624	REC_OmpR_PmrA-like	phosphoacceptor receiver (REC) domain of PmrA-like OmpR family response regulators. This subfamily contains various OmpR family response regulators including PmrA, BasR, QseB, tctD, and RssB, which are components of two-component regulatory systems (TCSs). The PmrA/PmrB TCS controls transcription of genes that are involved in lipopolysaccharide modification in the outer membrane of bacteria, increasing bacterial resistance to host-derived antimicrobial peptides. The BasS/BasR TCS functions as an iron- and zinc-sensing transcription regulator. The QseB/QseC TCS activates the flagella regulon by activating transcription of FlhDC. The RssA/RssB TCS regulates swarming behavior in Serratia marcescens. OmpR family DNA-binding response regulators contain N-terminal receiver (REC) and C-terminal DNA-binding winged helix-turn-helix effector domains. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	115
381140	cd17625	REC_OmpR_DrrD-like	phosphoacceptor receiver (REC) domain of DrrD-like OmpR family response regulators. DrrD is an OmpR/PhoB homolog from Thermotoga maritima whose function is not yet known. This subfamily also includes Streptococcus agalactiae transcriptional regulatory protein DltR, part of the DltS/DltR two-component system (TCS), and Pseudomonas aeruginosa transcriptional activator protein PfeR, part of the PfeR/PfeS TCS, which activates expression of the ferric enterobactin receptor. The DltS/DltR TCS regulates the expression of the dlt operon, which comprises four genes (dltA, dltB, dltC, and dltD) that catalyze the incorporation of D-alanine residues into the lipoteichoic acids. Members of this subfamily belong to the OmpR/PhoB family, whose members comprise two domains, an N-terminal receiver domain and a C-terminal DNA-binding winged helix-turn-helix effector domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	115
381141	cd17626	REC_OmpR_MtrA-like	phosphoacceptor receiver (REC) domain of MtrA-like OmpR family response regulators. MtrA is part of MtrA/MtrB (or MtrAB), a highly conserved two-component system (TCS) implicated in the regulation of cell division in the actinobacteria. In unicellular Mycobacterium tuberculosis, MtrAB coordinates DNA replication with cell division and regulates the transcription of resuscitation-promoting factor B. In filamentous Streptomyces venezuelae, it links antibiotic production to sporulation. MtrA belongs to the OmpR family of DNA-binding response regulators that contain N-terminal receiver (REC) and C-terminal DNA-binding winged helix-turn-helix effector domains. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	115
381142	cd17627	REC_OmpR_PrrA-like	phosphoacceptor receiver (REC) domain of PrrA-like OmpR family response regulators. The Mycobacterium tuberculosis PrrA is part of the PrrA/PrrB two-component system (TCS) that has been implicated in early intracellular multiplication and is essential for viability. Also included in this subfamily is Mycobacterium tuberculosis MprA, part of the MprAB TCS that regulates EspR, a key regulator of the ESX-1 secretion system, and is required for establishment and maintenance of persistent infection in a tissue- and stage-specific fashion. PrrA and MprA belong to the OmpR family of DNA-binding response regulators, which contain N-terminal receiver (REC) and C-terminal DNA-binding winged helix-turn-helix effector domains. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	116
341285	cd17630	OSB_MenE-like	O-succinylbenzoic acid-CoA ligase. This family contains O-succinylbenzoyl-CoA (OSB-CoA) synthetase (also known as O-succinylbenzoic acid CoA ligase) that belongs to the ANL superfamily and catalyzes the ligation of CoA to o-succinylbenzoate (OSB). It includes MenE in the bacterial menaquinone biosynthesis pathway which is a promising target for the development of novel antibacterial agents. MenE catalyzes CoA ligation via an acyl-adenylate intermediate; tight-binding inhibitors of MenE based on stable acyl-sulfonyladenosine analogs of this intermediate provide a pathway toward the development of optimized MenE inhibitors.	325
341286	cd17631	FACL_FadD13-like	fatty acyl-CoA synthetase, including FadD13. This family contains fatty acyl-CoA synthetases, including Mycobacterium tuberculosis acid-induced operon MymA encoding the fatty acyl-CoA synthetase FadD13 which is essential for virulence and intracellular growth of the pathogen. The fatty acyl-CoA synthetase activates lipids before entering into the metabolic pathways and is also involved in transmembrane lipid transport. However, unlike soluble fatty acyl-CoA synthetases, but like the mammalian integral-membrane very-long-chain acyl-CoA synthetases, FadD13 accepts lipid substrates up to the maximum length of C26, and this is facilitated by an extensive hydrophobic tunnel from the active site to a positively charged patch. Also included is feruloyl-CoA synthetase (Fcs) in Rhodococcus strains where it is involved in biotechnological vanillin production from eugenol and ferulic acid via a non-beta-oxidative pathway.	435
341287	cd17632	AFD_CAR-like	adenylation domain of carboxylic acid reductase (CAR). This family contains the adenylation domain of carboxylic acid reductase enzymes (CARs), and performs an equivalent function to that of the ANL superfamily of adenylating enzymes. It takes a carboxylic acid substrate and ATP, and produces an AMP-acyl phosphoester intermediate, releasing pyrophosphate. Kinetic analysis using various substrates shows that this enzyme has a broad but similar substrate specificity, preferring electron-rich acids. This suggests that attack by the carboxylate on the alpha-phosphate of adenosine triphosphate (ATP) is the step that determines the substrate specificity and reaction kinetics. CAR is an important enzyme for use as a biocatalyst providing regiospecific route to aldehydes from their respective carboxylic acids.	588
341288	cd17633	AFD_YhfT-like	fatty acid-CoA ligase VraA. This family of acyl-CoA ligases includes Bacillus subtilis YhfT, as well as long-chain fatty acid-CoA ligase VraA, all of which are yet to be characterized. These proteins belong to the adenylate-forming enzymes which catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain.	320
341289	cd17634	ACS-like	acetate-CoA ligase. This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain.	587
341290	cd17635	FADD10	adenylate forming domain, fatty acid CoA ligase (FadD10). This family contains long chain fatty acid CoA ligases, including FadD10 which is involved in the synthesis of a virulence-related lipopeptide. FadD10 is a fatty acyl-AMP ligase (FAAL) that transfers fatty acids to an acyl carrier protein. Structures of FadD10 in apo- and complexed form with dodecanoyl-AMP, show a novel open conformation, facilitated by its unique inter-domain and intermolecular interactions, which is critical for the enzyme to carry out the acyl transfer onto the acyl carrier protein (Rv0100) rather than coenzyme A.	340
341291	cd17636	PtmA	long-chain fatty acid CoA ligase (FadD). This family contains fatty acid CoA ligases, including acyl-CoA synthetase (AMP-forming)/AMP-acid ligase II, most of which are yet to be characterized. Fatty acyl-CoA ligases catalyze the ATP-dependent activation of fatty acids in a two-step reaction. The carboxylate substrate first reacts with ATP to form an acyl-adenylate intermediate, which then reacts with CoA to produce an acyl-CoA ester. This is a required step before free fatty acids can participate in most catabolic and anabolic reactions.	331
341292	cd17637	ACLS-CaiC	acyl-CoA synthetase (AMP-forming)/AMP-acid ligase II. This family contains fatty acid CoA ligases, including acyl-CoA synthetase (AMP-forming)/AMP-acid ligase II, most of which are yet to be characterized, but may be similar to Carnitine-CoA ligase (CaiC) which catalyzes the transfer of CoA to carnitine. Fatty acyl-CoA ligases catalyze the ATP-dependent activation of fatty acids in a two-step reaction. The carboxylate substrate first reacts with ATP to form an acyl-adenylate intermediate, which then reacts with CoA to produce an acyl-CoA ester. This is a required step before free fatty acids can participate in most catabolic and anabolic reactions.	333
341293	cd17638	FadD3	acyl-CoA synthetase FadD3 and similar proteins. This family contains long chain fatty acid CoA ligases, including FadD3 which is an acyl-CoA synthetase that initiates catabolism of cholesterol rings C and D in actinobacteria. The cholesterol catabolic pathway occurs in most mycolic acid-containing actinobacteria, such as Rhodococcus jostii RHA1, and is critical for Mycobacterium tuberculosis (Mtb) during infection. FadD3 catalyzes the ATP-dependent CoA thioesterification of 3a-alpha-H-4alpha(3'-propanoate)-7a-beta-methylhexahydro-1,5-indanedione (HIP) to yield HIP-CoA. Hydroxylated analogs of HIP, 5alpha-OH HIP and 1beta-OH HIP, can also be used.	330
341294	cd17639	LC_FACS_euk1	Eukaryotic long-chain fatty acid CoA synthetase (LC-FACS), including fungal proteins. The members of this family are eukaryotic fatty acid CoA synthetases (EC 6.2.1.3) that activate fatty acids with chain lengths of 12 to 20 and include fungal proteins. They act on a wide range of long-chain saturated and unsaturated fatty acids, but the enzymes from different tissues show some variation in specificity. LC-FACS catalyzes the formation of fatty acyl-CoA in a two-step reaction: the formation of a fatty acyl-AMP molecule as an intermediate, and the formation of a fatty acyl-CoA. This is a required step before free fatty acids can participate in most catabolic and anabolic reactions. Organisms tend to have multiple isoforms of LC-FACS genes with multiple splice variants. For example, nine genes are found in Arabidopsis and six genes are expressed in mammalian cells. In Schizosaccharomyces pombe, the lcf1 gene encodes a new fatty acyl-CoA synthetase that preferentially recognizes myristic acid as a substrate.	507
341295	cd17640	LC_FACS_like	Long-chain fatty acid CoA synthetase. This family includes long-chain fatty acid (C12-C20) CoA synthetases, including an Arabidopsis gene At4g14070 that plays a role in activation and elongation of exogenous fatty acids. FACS catalyzes the formation of fatty acyl-CoA in a two-step reaction: the formation of a fatty acyl-AMP molecule as an intermediate, and the formation of a fatty acyl-CoA. Eukaryotes generally have multiple isoforms of LC-FACS genes with multiple splice variants. For example, nine genes are found in Arabidopsis and six genes are expressed in mammalian cells. Free fatty acids must be "activated" to their CoA thioesters before participating in most catabolic and anabolic reactions.	468
341296	cd17641	LC_FACS_bac1	bacterial long-chain fatty acid CoA synthetase. The members of this family are bacterial long-chain fatty acid CoA synthetases, most of which are as yet uncharacterized. LC-FACS catalyzes the formation of fatty acyl-CoA in a two-step reaction: the formation of a fatty acyl-AMP molecule as an intermediate, and the formation of a fatty acyl-CoA. Free fatty acids must be "activated" to their CoA thioesters before participating in most catabolic and anabolic reactions.	569
341297	cd17642	Firefly_Luc	insect luciferase, similar to plant 4-coumarate: CoA ligases. This family contains insect firefly luciferases that share significant sequence similarity to plant 4-coumarate:coenzyme A ligases, despite their functional diversity. Luciferase catalyzes the production of light in the presence of MgATP, molecular oxygen, and luciferin. In the first step, luciferin is activated by acylation of its carboxylate group with ATP, resulting in an enzyme-bound luciferyl adenylate. In the second step, luciferyl adenylate reacts with molecular oxygen, producing an enzyme-bound excited state product (Luc=O*) and releasing AMP. This excited-state product then decays to the ground state (Luc=O), emitting a quantum of visible light.	532
341298	cd17643	A_NRPS_Cytc1-like	similar to adenylation domain of cytotrienin synthetase CytC1. This family of the adenylation (A) domain of nonribosomal peptide synthases (NRPS) includes Streptomyces sp. cytotrienin synthetase (CytC1), a relatively promiscuous adenylation enzyme that installs the aminoacyl moieties on the phosphopantetheinyl arm of the holo carrier protein CytC2. Also included are Streptomyces sp. Thr1, involved in the biosynthesis of 4-chlorothreonine, Pseudomonas aeruginosa pyoverdine synthetase D (PvdD), involved in the biosynthesis of the siderophore pyoverdine, and Pseudomonas syringae syringopeptin synthetase, where syringopeptin is a necrosis-inducing phytotoxin that functions as a virulence determinant in the plant-pathogen interaction. The adenylation (A) domain of NRPS recognizes a specific amino acid or hydroxy acid and activates it as an (amino) acyl adenylate by hydrolysis of ATP. The activated acyl moiety then forms a thioester bond to the enzyme-bound cofactor phosphopantetheine of a peptidyl carrier protein domain. NRPSs are large multifunctional enzymes which synthesize many therapeutically useful peptides in bacteria and fungi via a template-directed, nucleic acid independent nonribosomal mechanism. These natural products include antibiotics, immunosuppressants, plant and animal toxins, and enzyme inhibitors. NRPS has a distinct modular structure in which each module is responsible for the recognition, activation, and in some cases, modification of a single amino acid residue of the final peptide product. The modules can be subdivided into domains that catalyze specific biochemical reactions.	450
341299	cd17644	A_NRPS_ApnA-like	similar to adenylation domain of anabaenopeptin synthetase (ApnA). This family of the adenylation (A) domain of nonribosomal peptide synthases (NRPS) includes Planktothrix agardhii anabaenopeptin synthetase (ApnA A1), which is capable of activating two chemically distinct amino acids (Arg and Tyr). Structural studies show that the architecture of the active site forces Arg to adopt a Tyr-like conformation, thus explaining the bispecificity. The adenylation (A) domain of NRPS recognizes a specific amino acid or hydroxy acid and activates it as an (amino) acyl adenylate by hydrolysis of ATP. The activated acyl moiety then forms a thioester bond to the enzyme-bound cofactor phosphopantetheine of a peptidyl carrier protein domain. NRPSs are large multifunctional enzymes which synthesize many therapeutically useful peptides in bacteria and fungi via a template-directed, nucleic acid independent nonribosomal mechanism. These natural products include antibiotics, immunosuppressants, plant and animal toxins, and enzyme inhibitors. NRPS has a distinct modular structure in which each module is responsible for the recognition, activation, and in some cases, modification of a single amino acid residue of the final peptide product. The modules can be subdivided into domains that catalyze specific biochemical reactions.	465
341300	cd17645	A_NRPS_LgrA-like	adenylation (A) domain of linear gramicidin synthetase (LgrA) and similar proteins. This family of the adenylation (A) domain of nonribosomal peptide synthases (NRPS) includes linear gramicidin synthetase (LgrA) in Brevibacillus brevis. LgrA has a formylation domain fused to the N-terminal end that formylates its substrate for linear gramicidin synthesis to proceed. This formyl group is essential for the clinically important antibacterial activity of gramicidin by enabling head-to-head gramicidin dimers to make a beta-helical pore in gram-positive bacterial membranes, allowing free passage of monovalent cations, destroying the ion gradient and killing bacteria. This family also includes bacitracin synthetase 1 (known as ATP-dependent cysteine adenylase or BA1); it activates cysteine, incorporates two D-amino acids, releases and cyclizes the mature bacitracin, an antibiotic that is a mixture of related cyclic peptides that disrupt gram positive bacteria by interfering with cell wall and peptidoglycan synthesis. Also included is surfactin synthetase which activates and polymerizes the amino acids Leu, Glu, Asp, and Val to form the antibiotic surfactin.	440
341301	cd17646	A_NRPS_AB3403-like	Peptide Synthetase. The adenylation (A) domain of NRPS recognizes a specific amino acid or hydroxy acid and activates it as an (amino) acyl adenylate by hydrolysis of ATP. The activated acyl moiety then forms a thioester bond to the enzyme-bound cofactor phosphopantetheine of a peptidyl carrier protein domain. NRPSs are large multifunctional enzymes which synthesize many therapeutically useful peptides in bacteria and fungi via a template-directed, nucleic acid independent nonribosomal mechanism. These natural products include antibiotics, immunosuppressants, plant and animal toxins, and enzyme inhibitors. NRPS has a distinct modular structure in which each module is responsible for the recognition, activation, and in some cases, modification of a single amino acid residue of the final peptide product. The modules can be subdivided into domains that catalyze specific biochemical reactions.	488
341302	cd17647	A_NRPS_alphaAR	Alpha-aminoadipate reductase. This family contains L-2-aminoadipate reductase, also known as alpha-aminoadipate reductase (EC 1.2.1.95) or alpha-AR or L-aminoadipate-semialdehyde dehydrogenase (EC 1.2.1.31), which catalyzes the activation of alpha-aminoadipate by ATP-dependent adenylation and the reduction of activated alpha-aminoadipate by NADPH. The activated alpha-aminoadipate is bound to the phosphopantetheinyl group of the enzyme itself before it is reduced to (S)-2-amino-6-oxohexanoate.	520
341303	cd17648	A_NRPS_ACVS-like	N-(5-amino-5-carboxypentanoyl)-L-cysteinyl-D-valine synthase. This family contains ACV synthetase (ACVS, EC 6.3.2.26; also known as N-(5-amino-5-carboxypentanoyl)-L-cysteinyl-D-valine synthase or delta-(L-alpha-aminoadipyl)-L-cysteinyl-D-valine synthetase), which is involved in medically important antibiotic biosynthesis. ACV synthetase is active in an early step in the penicillin G biosynthesis pathway which involves the formation of the tripeptide 6-(L-alpha-aminoadipyl)-L-cysteinyl-D-valine (ACV); each of the constituent amino acids of the tripeptide ACV is activated as an aminoacyl-adenylate, with peptide bonds formed through the participation of amino acid thioester intermediates. ACV is then cyclized by the action of isopenicillin N synthase.	453
341304	cd17649	A_NRPS_PvdJ-like	non-ribosomal peptide synthetase. This family of the adenylation (A) domain of nonribosomal peptide synthases (NRPS) includes pyoverdine biosynthesis protein PvdJ involved in the synthesis of pyoverdine, which consists of a chromophore group attached to a variable peptide chain and comprises around 6-12 amino acids that are specific for each Pseudomonas species, and for which the peptide might be first synthesized before the chromophore assembly. Also included is ornibactin biosynthesis protein OrbI; ornibactin is a tetrapeptide siderophore with an l-ornithine-d-hydroxyaspartate-l-serine-l-ornithine backbone. The adenylation domain at the N-terminal of OrbI possibly initiates the ornibactin with the binding of N5-hydroxyornithine. NRPSs are large multifunctional enzymes which synthesize many therapeutically useful peptides in bacteria and fungi via a template-directed, nucleic acid independent nonribosomal mechanism. These natural products include antibiotics, immunosuppressants, plant and animal toxins, and enzyme inhibitors. NRPS has a distinct modular structure in which each module is responsible for the recognition, activation, and in some cases, modification of a single amino acid residue of the final peptide product. The modules can be subdivided into domains that catalyze specific biochemical reactions.	450
341305	cd17650	A_NRPS_PpsD_like	similar to adenylation domain of plipastatin synthase (PpsD). This family of the adenylation (A) domain of nonribosomal peptide synthases (NRPS) includes bacitracin synthetase 1 (BacA) in Bacillus licheniformis, tyrocidine synthetase in Brevibacillus brevis, plipastatin synthase (PpsD, an important antifungal protein) in Bacillus subtilis and mannopeptimycin peptide synthetase (MppB) in Streptomyces hygroscopicus. Plipastatin has strong fungitoxic activity and is involved in inhibition of phospholipase A2 and biofilm formation. Bacitracin, a mixture of related cyclic peptides, is used as a polypeptide antibiotic while function of tyrocidine is thought to be regulation of sporulation. MppB is involved in biosynthetic pathway of mannopeptimycin, a novel class of mannosylated lipoglycopeptides. The adenylation (A) domain of NRPS recognizes a specific amino acid or hydroxy acid and activates it as an (amino) acyl adenylate by hydrolysis of ATP. The activated acyl moiety then forms a thioester bond to the enzyme-bound cofactor phosphopantetheine of a peptidyl carrier protein domain. NRPSs are large multifunctional enzymes which synthesize many therapeutically useful peptides in bacteria and fungi via a template-directed, nucleic acid independent nonribosomal mechanism. These natural products include antibiotics, immunosuppressants, plant and animal toxins, and enzyme inhibitors. NRPS has a distinct modular structure in which each module is responsible for the recognition, activation, and in some cases, modification of a single amino acid residue of the final peptide product. The modules can be subdivided into domains that catalyze specific biochemical reactions.	447
341306	cd17651	A_NRPS_VisG_like	similar to adenylation domain of virginiamycin S synthetase. This family of the adenylation (A) domain of nonribosomal peptide synthases (NRPS) includes virginiamycin S synthetase (VisG) in Streptomyces virginiae; VisG is involved in virginiamycin S (VS) biosynthesis as the provider of an L-pheGly molecule, a highly specific substrate for the last condensation step by VisF. This family also includes linear gramicidin synthetase B (LgrB) in Brevibacillus brevis. Substrate specificity analysis using residues of the substrate-binding pockets of all 16 adenylation domains has shown good agreement of the substrate amino acids predicted with the sequence of linear gramicidin. The adenylation (A) domain of NRPS recognizes a specific amino acid or hydroxy acid and activates it as an (amino) acyl adenylate by hydrolysis of ATP. The activated acyl moiety then forms a thioester bond to the enzyme-bound cofactor phosphopantetheine of a peptidyl carrier protein domain. NRPSs are large multifunctional enzymes which synthesize many therapeutically useful peptides in bacteria and fungi via a template-directed, nucleic acid independent nonribosomal mechanism. These natural products include antibiotics, immunosuppressants, plant and animal toxins, and enzyme inhibitors. NRPS has a distinct modular structure in which each module is responsible for the recognition, activation, and in some cases, modification of a single amino acid residue of the final peptide product. The modules can be subdivided into domains that catalyze specific biochemical reactions.	491
341307	cd17652	A_NRPS_CmdD_like	similar to adenylation domain of chondramide synthase cmdD. This family of the adenylation (A) domain of nonribosomal peptide synthases (NRPS) includes phosphinothricin tripeptide (PTT, phosphinothricylalanylalanine) synthetase, where PTT is a natural-product antibiotic and potent herbicide that is produced by Streptomyces hygroscopicus. This adenylation domain has been confirmed to directly activate beta-tyrosine, and fluorinated chondramides are produced through precursor-directed biosynthesis. Also included in this family is chondramide synthase D (also known as ATP-dependent phenylalanine adenylase or phenylalanine activase or tyrosine activase). Chondramides A-D, depsipeptide antitumor and antifungal antibiotics produced by C. crocatus, are a class of mixed peptide/polyketide depsipeptides composed of three amino acids (alanine, N-methyltryptophan, plus the unusual amino acid beta-tyrosine or alpha-methoxy-beta-tyrosine) and a polyketide chain ([E]-7-hydroxy-2,4,6-trimethyloct-4-enoic acid).	436
341308	cd17653	A_NRPS_GliP_like	nonribosomal peptide synthase GliP-like. This family includes the adenylation (A) domain of nonribosomal peptide synthases (NRPS) gliotoxin biosynthesis protein P (GliP), thioclapurine biosynthesis protein P (tcpP) and Sirodesmin biosynthesis protein P (SirP). In the filamentous fungus Aspergillus fumigatus, NRPS GliP is involved in the biosynthesis of gliotoxin, which is initiated by the condensation of serine and phenylalanine. Studies show that GliP is not required for invasive aspergillosis, suggesting that the principal targets of gliotoxin are neutrophils or other phagocytes. SirP is a phytotoxin produced by the fungus Leptosphaeria maculans, which causes blackleg disease of canola (Brassica napus). In the fungus Claviceps purpurea, NRPS tcpP catalyzes condensation of tyrosine and glycine, part of biosynthesis of an unusual class of epipolythiodioxopiperazines (ETPs) that lacks the reactive thiol group for toxicity. The adenylation (A) domain of NRPS recognizes a specific amino acid or hydroxy acid and activates it as an (amino) acyl adenylate by hydrolysis of ATP. The activated acyl moiety then forms a thioester bond to the enzyme-bound cofactor phosphopantetheine of a peptidyl carrier protein domain. NRPSs are large multifunctional enzymes which synthesize many therapeutically useful peptides in bacteria and fungi via a template-directed, nucleic acid independent nonribosomal mechanism. These natural products include antibiotics, immunosuppressants, plant and animal toxins, and enzyme inhibitors. NRPS has a distinct modular structure in which each module is responsible for the recognition, activation, and in some cases, modification of a single amino acid residue of the final peptide product. The modules can be subdivided into domains that catalyze specific biochemical reactions.	433
341309	cd17654	A_NRPS_acs4	acyl-CoA synthetase family member 4. This family of the adenylation (A) domain of nonribosomal peptide synthases (NRPS) contains acyl-CoA synthetase family member 4, also known as 2-aminoadipic 6-semialdehyde dehydrogenase or aminoadipate-semialdehyde dehydrogenase, most of which are uncharacterized. Acyl-CoA synthetase catalyzes the initial reaction in fatty acid metabolism, by forming a thioester with CoA. NRPSs are large multifunctional enzymes which synthesize many therapeutically useful peptides in bacteria and fungi via a template-directed, nucleic acid independent nonribosomal mechanism. These natural products include antibiotics, immunosuppressants, plant and animal toxins, and enzyme inhibitors. NRPS has a distinct modular structure in which each module is responsible for the recognition, activation, and in some cases, modification of a single amino acid residue of the final peptide product. The modules can be subdivided into domains that catalyze specific biochemical reactions.	449
341310	cd17655	A_NRPS_Bac	bacitracin synthetase and related proteins. This family of the adenylation (A) domain of nonribosomal peptide synthases (NRPS) includes bacitracin synthetases 1, 2, and 3 (BA1, also known as ATP-dependent cysteine adenylase or cysteine activase, BA2, also known as ATP-dependent lysine adenylase or lysine activase, and BA3, also known as ATP-dependent isoleucine adenylase or isoleucine activase) in Bacilli. Bacitracin is a mixture of related cyclic peptides used as a polypeptide antibiotic. This family also includes gramicidin synthetase 1 involved in synthesis of the cyclic peptide antibiotic gramicidin S via activation of phenylalanine. NRPSs are large multifunctional enzymes which synthesize many therapeutically useful peptides in bacteria and fungi via a template-directed, nucleic acid independent nonribosomal mechanism. These natural products include antibiotics, immunosuppressants, plant and animal toxins, and enzyme inhibitors. NRPS has a distinct modular structure in which each module is responsible for the recognition, activation, and in some cases, modification of a single amino acid residue of the final peptide product. The modules can be subdivided into domains that catalyze specific biochemical reactions.	490
341311	cd17656	A_NRPS_ProA	gramicidin S synthase 2, also known as ATP-dependent proline adenylase. This family of the adenylation (A) domain of nonribosomal peptide synthases (NRPS) contains gramicidin S synthase 2 (also known as ATP-dependent proline adenylase or proline activase or ProA). ProA is a multifunctional enzyme involved in synthesis of the cyclic peptide antibiotic gramicidin S and able to activate and polymerize the amino acids proline, valine, ornithine and leucine. NRPSs are large multifunctional enzymes which synthesize many therapeutically useful peptides in bacteria and fungi via a template-directed, nucleic acid independent nonribosomal mechanism. These natural products include antibiotics, immunosuppressants, plant and animal toxins, and enzyme inhibitors. NRPS has a distinct modular structure in which each module is responsible for the recognition, activation, and in some cases, modification of a single amino acid residue of the final peptide product. The modules can be subdivided into domains that catalyze specific biochemical reactions.	479
350495	cd17657	CDC14_N	N-terminal pseudophosphatase domain of CDC14 family proteins. The cell division control protein 14 (CDC14) family is highly conserved in all eukaryotes, although the roles of its members seem to have diverged during evolution. Yeast Cdc14, the best characterized member of this family, is a dual-specificity phosphatase that plays key roles in cell cycle control. It preferentially dephosphorylates cyclin-dependent kinase (CDK) targets, which makes it the main antagonist of CDK in the cell. Cdc14 functions at the end of mitosis and it triggers the events that completely eliminate the activity of CDK and other mitotic kinases. It is also involved in coordinating the nuclear division cycle with cytokinesis through the cytokinesis checkpoint, and in chromosome segregation. Cdc14 phosphatases also function in DNA replication, DNA damage checkpoint, and DNA repair. Vertebrates may contain more than one Cdc14 homolog; humans have three (CDC14A, CDC14B, and CDC14C). CDC14 family proteins contain a highly conserved N-terminal pseudophosphatase domain that contributes to substrate specificity and a C-terminal catalytic dual-specificity phosphatase domain with the PTP signature motif. The N-terminal pseudophosphatase domain lacks the catalytic residues.	144
350496	cd17658	PTPc_plant_PTP1	protein tyrosine phosphatase 1 from Arabidopsis thaliana and similar plant PTPs. Arabidopsis thaliana protein tyrosine phosphatase 1 (AtPTP1) belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. AtPTP1 dephosphorylates and inhibits MAP kinase 6 (MPK6) in non-oxidative stress conditions. Together with MAP kinase phosphatase 1 (MKP1), it represses salicylic acid (SA) and camalexin biosynthesis, thereby modulating the defense response.	206
350497	cd17659	PTP_paladin_1	protein tyrosine phosphatase-like domain of paladin, repeat 1. Paladin is a putative phosphatase, which in mouse is expressed in endothelial cells during embryonic development and in arterial smooth muscle cells in adults. It has been suggested to be an antiphosphatase that regulates the activity of specific neural crest regulatory factors and thus, modulates neural crest cell formation and migration. Paladin contains two tyrosine-protein phosphatase domains. This model represents repeat 1.	220
350498	cd17660	PTP_paladin_2	protein tyrosine phosphatase-like domain of paladin, repeat 2. Paladin is a putative phosphatase, which in mouse is expressed in endothelial cells during embryonic development and in arterial smooth muscle cells in adults. It has been suggested to be an antiphosphatase that regulates the activity of specific neural crest regulatory factors and thus, modulates neural crest cell formation and migration. Paladin contains two tyrosine-protein phosphatase domains. This model represents repeat 2.	216
350499	cd17661	PFA-DSP_Oca2	atypical dual specificity phosphatases similar to oxidant-induced cell-cycle arrest protein 2. Oxidant-induced cell-cycle arrest protein 2 (Oca2) is an atypical dual specificity phosphatase of unknown function. It has been identified as a putative negative regulator acting on cell wall integrity and mating MAPK pathways in yeast. It belongs to a group of atypical DSPs present in plants, fungi, kinetoplastids, and slime molds called plant and fungi atypical dual-specificity phosphatases (PFA-DSPs). Oca2 may be an inactive DSP-like protein as it lacks the CxxxxxR catalytic motif.	146
350500	cd17662	PFA-DSP_Oca4	atypical dual specificity phosphatases similar to oxidant-induced cell-cycle arrest protein 4. Oxidant-induced cell-cycle arrest protein 4 (Oca4) is an atypical dual specificity phosphatase of unknown function. It belongs to a group of atypical DSPs present in plants, fungi, kinetoplastids, and slime molds called plant and fungi atypical dual-specificity phosphatases (PFA-DSPs). Oca4 may be an inactive DSP-like protein as it lacks the CxxxxxR catalytic motif.	177
350501	cd17663	PFA-DSP_Oca6	atypical dual specificity phosphatases similar to oxidant-induced cell-cycle arrest protein 6. Oxidant-induced cell-cycle arrest protein 6 (Oca6) is an atypical dual specificity phosphatase of unknown function. It belongs to a group of atypical DSPs present in plants, fungi, kinetoplastids, and slime molds called plant and fungi atypical dual-specificity phosphatases (PFA-DSPs). Oca6 may be an inactive DSP-like protein as it lacks the CxxxxxR catalytic motif.	162
350502	cd17664	Mce1_N	N-terminal triphosphatase domain of mRNA capping enzyme. mRNA capping enzyme, also known as RNA guanylyltransferase and 5'-phosphatase (RNGTT) or mammalian mRNA capping enzyme (Mce1) in mammals, is a bifunctional enzyme that catalyzes the first two steps of cap formation: (1) by removing the gamma-phosphate from the 5'-triphosphate end of nascent mRNA to yield a diphosphate end using the polynucleotide 5'-phosphatase activity (EC 3.1.3.33) of the N-terminal triphosphatase domain; and (2) by transferring the GMP moiety of GTP to the 5'-diphosphate terminus through the C-terminal mRNA guanylyltransferase domain (EC 2.7.7.50). The enzyme is also referred to as CEL-1 in Caenorhabditis elegans.	167
350503	cd17665	DSP_DUSP11	dual-specificity phosphatase domain of dual specificity protein phosphatase 11 and similar proteins. Dual specificity protein phosphatase 11 (DUSP11), also known as RNA/RNP complex-1-interacting phosphatase or phosphatase that interacts with RNA/RNP complex 1 (PIR1), has RNA 5'-triphosphatase and diphosphatase activity, but only poor protein-tyrosine phosphatase activity. It has activity for short RNAs but is less active toward mononucleotide triphosphates, suggesting that its primary function in vivo is to dephosphorylate RNA 5'-ends. It may play a role in nuclear mRNA metabolism. Also included in this subfamily is baculovirus RNA 5'-triphosphatase for Autographa californica nuclear polyhedrosis virus.	169
350504	cd17666	PTP-MTM-like_fungal	protein tyrosine phosphatase-like domain of fungal myotubularins. Myotubularins are a unique subgroup of protein tyrosine phosphatases that use inositol phospholipids, rather than phosphoproteins, as substrates. They dephosphorylate the D-3 position of phosphatidylinositol 3-phosphate [PI(3)P] and phosphatidylinositol 3,5-bisphosphate [PI(3,5)P2], generating phosphatidylinositol and phosphatidylinositol 5-phosphate [PI(5)P], respectively. Not all members are catalytically active proteins; some function as adaptors for the active members.	229
350505	cd17667	R-PTPc-G-1	catalytic domain of receptor-type tyrosine-protein phosphatase G, repeat 1. Receptor-type tyrosine-protein phosphatase G (PTPRG), also called protein-tyrosine phosphatase gamma (R-PTP-gamma), belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRG is an important tumor suppressor gene in multiple human cancers such as lung, ovarian, and breast cancers. It is widely expressed in many tissues, including the central nervous system, where it plays a role during neuroinflammation processes. It can dephosphorylate platelet-derived growth factor receptor beta (PDGFRB) and may play a role in PDGFRB-related infantile myofibromatosis. PTPRG has four splicing isoforms: three transmembrane isoforms, PTPRG-A, B, and C, and one secretory isoform, PTPRG-S, which are expressed in many tissues including the brain. PTPRG is a type 1 integral membrane protein consisting of an extracellular region with a carbonic anhydrase-like (CAH) and a fibronectin type III (FN3) domains, and an intracellular region with a catalytic PTP domain (repeat 1) proximal to the membrane, and a catalytically inactive PTP-fold domain (repeat 2) distal to the membrane. This model represents the catalytic PTP domain (repeat 1).	274
350506	cd17668	R-PTPc-Z-1	catalytic domain of receptor-type tyrosine-protein phosphatase Z, repeat 1. Receptor-type tyrosine-protein phosphatase Z (PTPRZ), also called receptor-type tyrosine-protein phosphatase zeta (R-PTP-zeta), belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. Three isoforms are generated by alternative splicing from a single PTPRZ gene: two transmembrane isoforms, PTPRZ-A and PTPRZ-B, and one secretory isoform, PTPRZ-S (also known as phosphacan); all are preferentially expressed in the central nervous system (CNS) as chondroitin sulfate (CS) proteoglycans. PTPRZ isoforms play important roles in maintaining oligodendrocyte precursor cells in an undifferentiated state. PTPRZ is a type 1 integral membrane protein consisting of an extracellular region with a carbonic anhydrase-like (CAH) and a fibronectin type III (FN3) domains, and an intracellular region with a catalytic PTP domain (repeat 1) proximal to the membrane, and a catalytically inactive PTP-fold domain (repeat 2) distal to the membrane. This model represents the catalytic PTP domain (repeat 1).	209
350507	cd17669	R-PTP-Z-2	PTP-like domain of receptor-type tyrosine-protein phosphatase Z, repeat 2. Receptor-type tyrosine-protein phosphatase Z (PTPRZ), also called receptor-type tyrosine-protein phosphatase zeta (R-PTP-zeta), belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. Three isoforms are generated by alternative splicing from a single PTPRZ gene: two transmembrane isoforms, PTPRZ-A and PTPRZ-B, and one secretory isoform, PTPRZ-S (also known as phosphacan); all are preferentially expressed in the central nervous system (CNS) as chondroitin sulfate (CS) proteoglycans. PTPRZ isoforms play important roles in maintaining oligodendrocyte precursor cells in an undifferentiated state. PTPRZ is a type 1 integral membrane protein consisting of an extracellular region with a carbonic anhydrase-like (CAH) and a fibronectin type III (FN3) domains, and an intracellular region with a catalytic PTP domain (repeat 1) proximal to the membrane, and a catalytically inactive PTP-fold domain (repeat 2) distal to the membrane. This model represents the inactive PTP-like domain (repeat 2).	204
350508	cd17670	R-PTP-G-2	PTP-like domain of receptor-type tyrosine-protein phosphatase G, repeat 2. Receptor-type tyrosine-protein phosphatase G (PTPRG), also called protein-tyrosine phosphatase gamma (R-PTP-gamma), belongs to the family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides. PTPRG is an important tumor suppressor gene in multiple human cancers such as lung, ovarian, and breast cancers. It is widely expressed in many tissues, including the central nervous system, where it plays a role during neuroinflammation processes. It can dephosphorylate platelet-derived growth factor receptor beta (PDGFRB) and may play a role in PDGFRB-related infantile myofibromatosis. PTPRG is a type 1 integral membrane protein consisting of an extracellular region with a carbonic anhydrase-like (CAH) and a fibronectin type III (FN3) domains, and an intracellular region with a catalytic PTP domain (repeat 1) proximal to the membrane, and a catalytically inactive PTP-fold domain (repeat 2) distal to the membrane. This model represents the inactive PTP-like domain (repeat 2).	205
349491	cd17672	MDM2	p53-binding domain found in E3 ubiquitin-protein ligase MDM2 and similar proteins. MDM2, also known as double minute 2 protein (Hdm2), or oncoprotein MDM2, or p53-binding protein, exerts its oncogenic activity predominantly by binding the p53 tumor suppressor and blocking its transcriptional activity. It forms homo-oligomers and displays E3 ubiquitin ligase activity, catalyzing the attachment of ubiquitin to p53 as an essential step in the regulation of its expression levels in cells. Moreover, in response to ribosomal stress, MDM2-mediated p53 ubiquitination and degradation can be inhibited through the interaction with ribosomal proteins L5, L11, and L23. MDM2 also has a p53-independent role in tumorigenesis and cell growth regulation. In addition, it binds interferon (IFN) regulatory factor-2 (IRF-2), an IFN-regulated transcription factor, and mediates its ubiquitination. MDM2 contains an N-terminal p53-binding domain and a C-terminal zinc RING-finger domain conferring E3 ligase activity that is required for ubiquitination and nuclear export of p53. It is also responsible for the hetero-oligomerization of MDM2, which is crucial for the suppression of P53 activity during embryonic development, and the recruitment of E2 ubiquitin-conjugating enzymes. MDM2 also harbors a RanBP2-type zinc finger (zf-RanBP2) domain, as well as a nuclear localization signal (NLS) and a nuclear export signal (NES), near the central acidic region.	83
349492	cd17673	MDM4	p53-binding domain found in MDM4 and similar proteins. MDM4, also known as double minute 4 protein, MDM2-like p53-binding protein, protein MDMX, HDMX, or p53-binding protein MDM4, exerts its oncogenic activity predominantly by binding the p53 tumor suppressor and blocking its transcriptional activity. MDM4 is phosphorylated and destabilized in response to DNA damage stress. It can also be specifically dephosphorylated through directly interacting with protein phosphatase 1 (PP1), which may increase its stability and thus inhibit p53 activity. MDM4 also has a p53-independent role in tumorigenesis and cell growth regulation. MDM4 contains an N-terminal p53-binding domain and a C-terminal zinc RING-finger domain responsible for its hetero-oligomerization, which is crucial for the suppression of P53 activity during embryonic development and the recruitment of E2 ubiquitin-conjugating enzymes. MDM4 also harbors a RanBP2-type zinc finger (zf-RanBP2) domain near the central acidic region.	79
349493	cd17674	SWIB_BAF60A	SWIB domain found in BRG1-associated factor 60A (BAF60A) and similar proteins. BAF60A, also termed SWI/SNF-related matrix-associated actin-dependent regulator of chromatin subfamily D member 1 (SMARCD1), or 60 kDa BRG-1/Brm-associated factor subunit A, or SWI/SNF complex 60 kDa subunit, is a core subunit of the SWI/SNF chromatin-remodeling complex that activates the transcription of fatty acid oxidation genes during fasting. BAF60A is involved in chromatin remodeling and hepatic lipid metabolism. It mediates critical interactions between nuclear receptors and the BRG1 chromatin-remodeling complex for transactivation. It is also a key component of the transcriptional control in cardiac progenitors. Moreover, BAF60A interacts with p53 to recruit the SWI/SNF complex, suggesting that the SWI/SNF chromatin remodeling complex is involved in the suppression of tumors.	77
349494	cd17675	SWIB_BAF60B	SWIB domain found in BRG1-associated factor 60B (BAF60B) and similar proteins. BAF60B, also termed SWI/SNF-related matrix-associated actin-dependent regulator of chromatin subfamily D member 2 (SMARCD2), or 60 kDa BRG-1/Brm-associated factor subunit B, is a component of the BAF complex. It is involved in transcriptional activation and repression of select genes by chromatin remodeling. It plays a role in the ATM-p53 pathway in sensing chromatin opening by facilitating ATM recruitment to the SWI/SNF complex, as well as ATM activation. It also regulates transcriptional networks controlling differentiation of neutrophil granulocytes. Thus, it acts as a key factor controlling myelopoiesis and is a potential tumor suppressor in leukemia.	80
349495	cd17676	SWIB_BAF60C	SWIB domain found in BRG1-associated factor 60C (BAF60C) and similar proteins. BAF60C, also termed SWI/SNF-related matrix-associated actin-dependent regulator of chromatin subfamily D member 3 (SMARCD3), or 60 kDa BRG-1/Brm-associated factor subunit C, is a core subunit of the SWI/SNF chromatin-remodeling complex that activates the transcription of fatty acid oxidation genes during fasting. It is involved in chromatin remodeling and hepatic lipid metabolism. It is also essential for cardiomyocyte differentiation in early heart development. Moreover, BAF60C drives glycolytic metabolism in the muscle and improves systemic glucose homeostasis through Deptor-mediated Akt activation. Furthermore, BAF60C epigenetically regulates epithelial-mesenchymal transition (EMT) by activating WNT signaling pathways.	74
350658	cd17706	MCM	MCM helicase family. MCM helicases are a family of helicases that play an important role in replication and homologous recombination repair. The heterohexameric ring-shaped Mcm2-7 complex is part of the replicative helicase that unwinds parental double-stranded DNA at a replication fork to provide single-stranded DNA templates for the replicative polymerases. Mcm8 and Mcm9 form a complex required for homologous recombination (HR) repair induced by DNA interstrand crosslinks (ICLs).	311
349340	cd17707	BRCT_XRCC1_rpt2	Second (C-terminal) BRCT domain in X-ray repair cross-complementing protein 1 (XRCC1) and similar proteins. XRCC1 is a DNA repair protein that corrects defective DNA strand-break repair and sister chromatid exchange following treatment with ionizing radiation and alkylating agents. It forms homodimers and interacts with polynucleotide kinase (PNK), DNA polymerase-beta (POLB), DNA ligase III (LIG3), APTX, APLF, and APEX1. XRCC1 contains an N-terminal XRCC1-specific domain and two BRCT domains. This model corresponds to the second BRCT domain.	94
349341	cd17709	BRCT_pescadillo_like	BRCT domain of pescadillo and related proteins. Pescadillo has been characterized in zebrafish as a protein involved in the control of cell proliferation, specifically in the developing embryo. Mammalian homologs have been linked to ribosome biogenesis and nucleologenesis, and yeast homologs have been shown to be required for synthesis of the 60S ribosomal subunit. Pescadillo contains a BRCT domain.	86
349342	cd17710	BRCT_PAXIP1_rpt2	second BRCT domain of PAX-interacting protein 1 (PAXIP1) and similar proteins. PAXIP1, also termed PAX transactivation activation domain-interacting protein (PTIP), is involved in DNA damage response and in transcriptional regulation through histone methyltransferase (HMT) complexes. It also facilitates ATM-mediated activation of p53 and promotes cellular resistance to ionizing radiation. PAXIP1 contains six BRCT repeats. This family corresponds to the second BRCT domain.	81
349343	cd17711	BRCT_PAXIP1_rpt3	third BRCT domain of PAX-interacting protein 1 (PAXIP1) and similar proteins. PAXIP1, also termed PAX transactivation activation domain-interacting protein (PTIP), is involved in DNA damage response and in transcriptional regulation through histone methyltransferase (HMT) complexes. It also facilitates ATM-mediated activation of p53 and promotes cellular resistance to ionizing radiation. PAXIP1 contains six BRCT repeats. This family corresponds to the third BRCT domain.	81
349344	cd17712	BRCT_PAXIP1_rpt5	fifth BRCT domain of PAX-interacting protein 1 (PAXIP1) and similar proteins. PAXIP1, also termed PAX transactivation activation domain-interacting protein (PTIP), is involved in DNA damage response and in transcriptional regulation through histone methyltransferase (HMT) complexes. It also facilitates ATM-mediated activation of p53 and promotes cellular resistance to ionizing radiation. PAXIP1 contains six BRCT repeats. This family corresponds to the fifth BRCT domain.	75
349345	cd17713	BRCT_polymerase_mu_like	BRCT domain of DNA-directed DNA/RNA polymerase mu (polymerase mu), DNA nucleotidylexotransferase and similar proteins. The family includes DNA-directed DNA/RNA polymerase mu (polymerase mu) and DNA nucleotidylexotransferase. Polymerase mu (EC 2.7.7.7), also termed Pol mu, or terminal transferase, is a gap-filling polymerase involved in repair of DNA double-strand breaks by non-homologous end joining (NHEJ). It participates in immunoglobulin (Ig) light chain gene rearrangement in V(D)J recombination. DNA nucleotidylexotransferase (EC 2.7.7.31), also termed terminal addition enzyme, or terminal deoxynucleotidyltransferase, or terminal transferase, is a template-independent DNA polymerase which catalyzes the random addition of deoxynucleoside 5'-triphosphate to the 3'-end of a DNA initiator. It mediates the addition of nucleotides at the junction (N region) of rearranged Ig heavy chain and T-cell receptor gene segments during the maturation of B- and T-cells. All family members contain a BRCT domain.	87
349346	cd17714	BRCT_PAXIP1_rpt1	first (N-terminal) BRCT domain of PAX-interacting protein 1 (PAXIP1) and similar proteins. PAXIP1, also termed PAX transactivation activation domain-interacting protein (PTIP), is involved in DNA damage response and in transcriptional regulation through histone methyltransferase (HMT) complexes. It also facilitates ATM-mediated activation of p53 and promotes cellular resistance to ionizing radiation. PAXIP1 contains six BRCT repeats. This family corresponds to the first BRCT domain.	76
349347	cd17715	BRCT_polymerase_lambda	BRCT domain of DNA polymerase lambda and similar proteins. DNA polymerase lambda, also termed Pol Lambda, or DNA polymerase beta-2 (Pol beta2), or DNA polymerase kappa, is involved in base excision repair (BER) and is responsible for repair of lesions that give rise to abasic (AP) sites in DNA. It also contributes to DNA double-strand break repair by non-homologous end joining and homologous recombination. DNA polymerase lambda has both template-dependent and template-independent (terminal transferase) DNA polymerase activities, as well as a 5'-deoxyribose-5-phosphate lyase (dRP lyase) activity. DNA polymerase lambda contains one BRCT domain.	80
349348	cd17716	BRCT_microcephalin_rpt1	first (N-terminal) BRCT domain of microcephalin and similar proteins. Microcephalin is a DNA damage response protein involved in regulation of CHK1 and BRCA1. It has been implicated in chromosome condensation and DNA damage induced cellular responses. It may play a role in neurogenesis and regulation of the size of the cerebral cortex. Microcephalin contains three BRCT repeats. This family corresponds to the first repeat.	78
349349	cd17717	BRCT_DNA_ligase_IV_rpt2	second BRCT domain of DNA ligase 4 (LIG4) and similar proteins. LIG4 (EC 6.5.1.1), also termed DNA ligase IV, or polydeoxyribonucleotide synthase [ATP] 4, is involved in DNA non-homologous end joining (NHEJ) required for double-strand break repair and V(D)J recombination. It is a component of the LIG4-XRCC4 complex that is responsible for the NHEJ ligation step. LIG4 contains two BRCT domains. The family corresponds to the second one.	88
349350	cd17718	BRCT_TopBP1_rpt3	third BRCT domain of DNA topoisomerase 2-binding protein 1 (TopBP1) and similar proteins. TopBP1, also termed DNA topoisomerase II-beta-binding protein 1, or DNA topoisomerase II-binding protein 1, functions in DNA replication and damage response. It binds double-stranded DNA breaks and nicks as well as single-stranded DNA. TopBP1 contains six copies of BRCT domain. The family corresponds to the third BRCT domain.	83
349351	cd17719	BRCT_Rev1	BRCT domain of DNA repair protein Rev1 and similar proteins. REV1, also termed alpha integrin-binding protein 80, or AIBP80, or Rev1-like terminal deoxycytidyl transferase, is a DNA template-dependent dCMP transferase required for mutagenesis induced by UV light.	87
349352	cd17720	BRCT_Bard1_rpt2	second (C-terminal) BRCT domain of BRCA1-associated RING domain protein 1 (Bard1) and similar proteins. Bard1, also termed BARD-1, or RING-type E3 ubiquitin transferase BARD1, is a critical factor in BRCA1-mediated tumor suppression and may also serve as a target for tumorigenic lesions in some human cancers. It associates with BRCA1 (breast cancer-1) to form a heterodimeric BRCA1/BARD1 complex that is responsible for maintaining genomic stability through nuclear functions involving DNA damage signaling and repair, transcriptional regulation, and cell cycle control. The BRCA1/BARD1 complex catalyzes autoubiquitination of BRCA1 and trans ubiquitination of other protein substrates. Its E3 ligase activity is dramatically reduced in the presence of UBX domain protein 1 (UBXN1). BARD-1 contains an N-terminal C3HC4-type RING-HC finger that binds BRCA1, and a C-terminal region with three ankyrin repeats and tandem BRCT domains that bind CstF-50 (cleavage stimulation factor) to modulate mRNA processing and RNAP II stability in response to DNA damage. The family corresponds to the second BRCT domain.	101
349353	cd17721	BRCT_BRCA1_rpt2	second (C-terminal) BRCT domain of breast cancer type 1 susceptibility protein (BRCA1) and similar proteins. BRCA1, also termed RING finger protein 53 (RNF53), is a RING finger protein encoded by BRCA1, a tumor suppressor gene that regulates all DNA double-strand break (DSB) repair pathways. BRCA1 is frequently mutated in patients with hereditary breast and ovarian cancer (HBOC). Its mutation is also associated with an increased risk of pancreatic, stomach, laryngeal, fallopian tube, and prostate cancer. It plays an important role in the DNA damage response signaling, and has been implicated in various cellular processes such as cell cycle regulation, transcriptional regulation, chromatin remodeling, DNA DSBs, and apoptosis. BRCA1 contains an N-terminal C3HC4-type RING-HC finger, and two BRCT repeats at the C-terminus. The family corresponds to the second BRCT domain.	98
349354	cd17722	BRCT_DNA_ligase_IV_rpt1	first BRCT domain of DNA ligase 4 (LIG4) and similar proteins. LIG4 (EC 6.5.1.1), also termed DNA ligase IV, or polydeoxyribonucleotide synthase [ATP] 4, is involved in DNA non-homologous end joining (NHEJ) required for double-strand break repair and V(D)J recombination. It is a component of the LIG4-XRCC4 complex that is responsible for the NHEJ ligation step. LIG4 contains two BRCT domains. The family corresponds to the first one.	90
349355	cd17723	BRCT_Rad4_rpt4	fourth BRCT domain of Schizosaccharomyces pombe S-M checkpoint control protein Rad4 and similar proteins. Rad4, also termed P74, or protein cut5, is an essential component for DNA replication and the checkpoint control system, which couples S and M phases. It may directly or indirectly interact with chromatin proteins to form the complex required for the initiation and/or progression of DNA synthesis. Rad4 contains four BRCT repeats. The family corresponds to the fourth one.	74
349356	cd17724	BRCT_p53bp1_rpt2	Second (C-terminal) BRCT domain in p53-binding protein 1 (p53BP1) and similar proteins. p53BP1, also termed 53BP1, or TP53-binding protein 1 (TP53BP1), is a double-strand break (DSB) repair protein involved in response to DNA damage, telomere dynamics, and class-switch recombination (CSR) during antibody genesis. TP53BP1 contains two tandem BRCT repeats. This family corresponds to the second BRCT domain.	87
349357	cd17725	BRCT_XRCC1_rpt1	First (central) BRCT domain in X-ray repair cross-complementing protein 1 (XRCC1) and similar proteins. XRCC1 is a DNA repair protein that corrects defective DNA strand-break repair and sister chromatid exchange following treatment with ionizing radiation and alkylating agents. It forms homodimers and interacts with polynucleotide kinase (PNK), DNA polymerase-beta (POLB), DNA ligase III (LIG3), APTX, APLF, and APEX1. XRCC1 contains an N-terminal XRCC1-specific domain and two BRCT domains. This family corresponds to the first one.	80
349358	cd17726	BRCT_PARP4_like	BRCT domain of poly [ADP-ribose] polymerase 4 (PARP-4) and similar proteins. PARP-4, also termed 193 kDa vault protein, or ADP-ribosyltransferase diphtheria toxin-like 4 (ARTD4), or PARP-related/IalphaI-related H5/proline-rich (PH5P), or vault poly(ADP-ribose) polymerase (VPARP), shows poly(ADP-ribosyl)ation activity that catalyzes the formation of ADP-ribose polymers in response to DNA damage. PARP-4 is a component of the vault ribonucleoprotein particle, at least composed of MVP, PARP4 and one or more vault RNAs (vRNAs). The Trp-X-X-X-Cys/Ser signature motif of the BRCT family is not conserved in this group.	85
349359	cd17727	BRCT_TopBP1_rpt6	sixth BRCT domain of DNA topoisomerase 2-binding protein 1 (TopBP1) and similar proteins. TopBP1, also termed DNA topoisomerase II-beta-binding protein 1, or DNA topoisomerase II-binding protein 1, functions in DNA replication and damage response. It binds double-stranded DNA breaks and nicks as well as single-stranded DNA. TopBP1 contains six copies of BRCT domain. The family corresponds to the sixth BRCT domain.	75
349360	cd17728	BRCT_TopBP1_rpt8	eighth (C-terminal) BRCT domain of DNA topoisomerase 2-binding protein 1. TopBP1, also termed DNA topoisomerase II-beta-binding protein 1, or DNA topoisomerase II-binding protein 1, functions in DNA replication and damage response. It binds double-stranded DNA breaks and nicks as well as single-stranded DNA. TopBP1 contains six copies of BRCT domain. The family corresponds to the eighth BRCT domain. The Trp-X-X-X-Cys/Ser signature motif of the BRCT family is not conserved in this group.	80
349361	cd17729	BRCT_CTDP1	BRCT domain of RNA polymerase II subunit A C-terminal domain phosphatase (CTDP1) and similar proteins. CTDP1 (EC 3.1.3.16), also termed TFIIF-associating CTD phosphatase, or TFIIF-associating RNA polymerase C-terminal domain phosphatase (FCP1), promotes the activity of RNA polymerase II through processively dephosphorylating 'Ser-2' and 'Ser-5' of the heptad repeats YSPTSPS in the C-terminal domain of the largest RNA polymerase II subunit. It plays a role in the exit from mitosis by dephosphorylating crucial mitotic substrates (USP44, CDC20 and WEE1) that are required for M-phase-promoting factor (MPF)/CDK1 inactivation.	97
349362	cd17730	BRCT_PAXIP1_rpt4	fourth BRCT domain of PAX-interacting protein 1 (PAXIP1) and similar proteins. PAXIP1, also termed PAX transactivation activation domain-interacting protein (PTIP), is involved in DNA damage response and in transcriptional regulation through histone methyltransferase (HMT) complexes. It also facilitates ATM-mediated activation of p53 and promotes cellular resistance to ionizing radiation. PAXIP1 contains six BRCT repeats. This family corresponds to the fourth BRCT domain.	73
349363	cd17731	BRCT_TopBP1_rpt2_like	second BRCT domain of DNA topoisomerase 2-binding protein 1 (TopBP1) and similar proteins. TopBP1, also termed DNA topoisomerase II-beta-binding protein 1, or DNA topoisomerase II-binding protein 1, functions in DNA replication and damage response. It binds double-stranded DNA breaks and nicks as well as single-stranded DNA. TopBP1 contains six copies of BRCT domain. The family corresponds to the second BRCT domain.	77
349364	cd17732	BRCT_Ect2_rpt2	second BRCT domain of epithelial cell-transforming sequence 2 protein (ECT2) and similar proteins. ECT2 is a guanine nucleotide exchange factor (GEF) for Rho GTPases, phosphorylated in G2/M phases, and is involved in the regulation of cytokinesis. It contains two tandem BRCT domains. The family corresponds to the second BRCT domain.	80
349365	cd17733	BRCT_Ect2_rpt1	first BRCT domain of epithelial cell-transforming sequence 2 protein (ECT2) and similar proteins. ECT2 is a guanine nucleotide exchange factor (GEF) for Rho GTPases, phosphorylated in G2/M phases, and is involved in the regulation of cytokinesis. It contains two tandem BRCT domains. The family corresponds to the first BRCT domain.	76
349366	cd17734	BRCT_Bard1_rpt1	first BRCT domain of BRCA1-associated RING domain protein 1 (Bard1) and similar proteins. Bard1, also termed BARD-1, or RING-type E3 ubiquitin transferase BARD1, is a critical factor in BRCA1-mediated tumor suppression and may also serve as a target for tumorigenic lesions in some human cancers. It associates with BRCA1 (breast cancer-1) to form a heterodimeric BRCA1/BARD1 complex that is responsible for maintaining genomic stability through nuclear functions involving DNA damage signaling and repair, transcriptional regulation, and cell cycle control. The BRCA1/BARD1 complex catalyzes autoubiquitination of BRCA1 and trans ubiquitination of other protein substrates. Its E3 ligase activity is dramatically reduced in the presence of UBX domain protein 1 (UBXN1). BARD-1 contains an N-terminal C3HC4-type RING-HC finger that binds BRCA1, and a C-terminal region with three ankyrin repeats and tandem BRCT domains that bind CstF-50 (cleavage stimulation factor) to modulate mRNA processing and RNAP II stability in response to DNA damage. The family corresponds to the first BRCT domain.	80
349367	cd17735	BRCT_BRCA1_rpt1	first BRCT domain of breast cancer type 1 susceptibility protein (BRCA1) and similar proteins. BRCA1, also termed RING finger protein 53 (RNF53), is a RING finger protein encoded by BRCA1, a tumor suppressor gene that regulates all DNA double-strand break (DSB) repair pathways. BRCA1 is frequently mutated in patients with hereditary breast and ovarian cancer (HBOC). Its mutation is also associated with an increased risk of pancreatic, stomach, laryngeal, fallopian tube, and prostate cancer. It plays an important role in the DNA damage response signaling, and has been implicated in various cellular processes such as cell cycle regulation, transcriptional regulation, chromatin remodeling, DNA DSBs, and apoptosis. BRCA1 contains an N-terminal C3HC4-type RING-HC finger, and two BRCT (BRCA1 C-terminus domain) repeats at the C-terminus. The family corresponds to the first BRCT domain.	97
349368	cd17736	BRCT_microcephalin_rpt2	second BRCT domain of microcephalin and similar proteins. Microcephalin is a DNA damage response protein involved in regulation of CHK1 and BRCA1. It has been implicated in chromosome condensation and DNA damage induced cellular responses. It may play a role in neurogenesis and regulation of the size of the cerebral cortex. Microcephalin contains three BRCT repeats. This family corresponds to the second repeat.	76
349369	cd17737	BRCT_TopBP1_rpt1	first BRCT domain of DNA topoisomerase 2-binding protein 1 (TopBP1) and similar proteins. TopBP1, also termed DNA topoisomerase II-beta-binding protein 1, or DNA topoisomerase II-binding protein 1, functions in DNA replication and damage response. It binds double-stranded DNA breaks and nicks as well as single-stranded DNA. TopBP1 contains six copies of BRCT domain. The family corresponds to the first BRCT domain.	72
349370	cd17738	BRCT_TopBP1_rpt7	seventh BRCT domain of DNA topoisomerase 2-binding protein 1. TopBP1, also termed DNA topoisomerase II-beta-binding protein 1, or DNA topoisomerase II-binding protein 1, functions in DNA replication and damage response. It binds double-stranded DNA breaks and nicks as well as single-stranded DNA. TopBP1 contains six copies of BRCT domain. The family corresponds to the seventh BRCT domain. The Trp-X-X-X-Cys/Ser signature motif of the BRCT family is missing in this group.	75
349371	cd17740	BRCT_Rad4_rpt1	first BRCT domain of Schizosaccharomyces pombe S-M checkpoint control protein Rad4 and similar proteins. Rad4, also termed P74, or protein cut5, is an essential component for DNA replication and the checkpoint control system which couples the S and M phases. It may directly or indirectly interact with chromatin proteins to form the complex required for the initiation and/or progression of DNA synthesis. Rad4 contains four BRCT repeats. The family corresponds to the first one.	82
349372	cd17741	BRCT_nibrin	BRCT domain of nibrin and similar proteins. Nibrin (NBN), also termed Nijmegen breakage syndrome protein 1 (NBS1), or cell cycle regulatory protein p95, is a novel DNA double-strand break repair protein that is mutated in Nijmegen breakage syndrome. It is a component of the MRE11-RAD50-NBN (MRN complex) which plays a critical role in the cellular response to DNA damage and the maintenance of chromosome integrity. The BRCT (Breast Cancer Suppressor Protein BRCA1, carboxy-terminal) domain is found within many DNA damage repair and cell cycle checkpoint proteins. The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks. The Trp-X-X-X-Cys/Ser signature motif of the BRCT family is absent in this group.	74
349373	cd17742	BRCT_CHS5_like	BRCT domain of yeast chitin biosynthesis protein CHS5 and similar proteins. CHS5, also termed protein CAL3, is a component of the CHS5/6 complex which mediates export of specific cargo proteins, including chitin synthase CHS3. It is also involved in targeting FUS1 to sites of polarized growth.	77
349374	cd17743	BRCT_BRC1_like_rpt5	fifth BRCT domain of Schizosaccharomyces pombe BRCT-containing protein 1 (BRC1) and similar proteins. Schizosaccharomyces pombe BRC1 is required for mitotic fidelity, specifically in the G2 phase of the cell cycle. It plays a role in chromatin organization. The family also includes Cryptococcus neoformans DNA ligase 4 (LIG4, also known as DNA ligase IV or polydeoxyribonucleotide synthase [ATP] 4), which is involved in dsDNA break repair, and plays a role in non-homologous integration (NHI) pathways where it is required in the final step of non-homologous end-joining. Members in this family contain six BRCT domains. This family corresponds to the fifth one.	70
349375	cd17744	BRCT_MDC1_rpt1	first BRCT domain of mediator of DNA damage checkpoint protein 1 (MDC1) and similar proteins. MDC1, also termed nuclear factor with BRCT domains 1 (NFBD1), is a nuclear chromatin-associated protein that is required for checkpoint mediated cell cycle arrest in response to DNA damage within both the S phase and G2/M phases of the cell cycle. It directly binds phosphorylated histone H2AX to regulate cellular responses to DNA double-strand breaks. MDC1 contains a forkhead-associated (FHA) domain and two BRCT domains, as well as an internal 41-amino acid repeat sequence. The family corresponds to the first BRCT domain.	72
349376	cd17745	BRCT_p53bp1_rpt1	first (central) BRCT domain in p53-binding protein 1 (p53BP1) and similar proteins. p53BP1, also termed 53BP1, or TP53-binding protein 1 (TP53BP1), is a double-strand break (DSB) repair protein involved in response to DNA damage, telomere dynamics and class-switch recombination (CSR) during antibody genesis. TP53BP1 contains two tandem BRCT repeats. This family also includes Schizosaccharomyces pombe Crb2, which is a checkpoint mediator required for the cellular response to DNA damage. This model corresponds to the first BRCT domain.	99
349377	cd17746	BRCT_Rad4_rpt2	second BRCT domain of Schizosaccharomyces pombe S-M checkpoint control protein Rad4 and similar proteins. Rad4, also termed P74, or protein cut5, is an essential component for DNA replication and the checkpoint control system which couples S and M phases. It may directly or indirectly interact with chromatin proteins to form the complex required for the initiation and/or progression of DNA synthesis. Rad4 contains four BRCT repeats. The family corresponds to the second one.	91
349378	cd17747	BRCT_PARP1	BRCT domain of poly [ADP-ribose] polymerase 1 (PARP-1) and similar proteins. PARP-1 (EC 2.4.2.30), also termed ADP-ribosyltransferase diphtheria toxin-like 1 (ARTD1), or NAD(+) ADP-ribosyltransferase 1 (ADPRT 1), or poly[ADP-ribose] synthase 1, is involved in the base excision repair (BER) pathway, by catalyzing the poly(ADP-ribosyl)ation of a limited number of acceptor proteins involved in chromatin architecture and in DNA metabolism.	76
349379	cd17748	BRCT_DNA_ligase_like	BRCT domain of bacterial NAD-dependent DNA ligase (LigA) and similar proteins. LigA, also called NAD(+)-dependent polydeoxyribonucleotide synthase, catalyzes the formation of phosphodiester linkages between 5'-phosphoryl and 3'-hydroxyl groups in double-stranded DNA using NAD as a coenzyme and as the energy source for the reaction. It is essential for DNA replication and repair of damaged DNA. The Trp-X-X-X-Cys/Ser signature motif of the BRCT family is not conserved in this family.	76
349380	cd17749	BRCT_TopBP1_rpt4	fourth BRCT domain of DNA topoisomerase 2-binding protein 1 (TopBP1) and similar proteins. TopBP1, also called DNA topoisomerase II-beta-binding protein 1, or DNA topoisomerase II-binding protein 1, functions in DNA replication and damage response. It binds double-stranded DNA breaks and nicks as well as single-stranded DNA. TopBP1 contains six copies of BRCT domain. The family corresponds to the fourth BRCT domain.	84
349381	cd17750	BRCT_SLF1	BRCT domain of SMC5-SMC6 complex localization factor protein 1 (SLF1) and similar proteins. SLF1, also termed Smc5/6 localization factor 1, or ankyrin repeat domain-containing protein 32 (ANKRD32), or BRCT domain-containing protein 1 (BRCTD1), plays a role in the DNA damage response (DDR) pathway by regulating post replication repair of UV-damaged DNA and genomic stability maintenance. It is a component of the SLF1-SLF2 complex that acts to link RAD18 with the SMC5-SMC6 complex at replication-coupled interstrand cross-links (ICL) and DNA double-strand break (DSB) sites on chromatin during DNA repair in response to stalled replication forks. The Trp-X-X-X-Cys/Ser signature motif of the BRCT family is missing in this group.	81
349382	cd17751	BRCT_microcephalin_rpt3	third BRCT domain of microcephalin and similar proteins. Microcephalin is a DNA damage response protein involved in regulation of CHK1 and BRCA1. It has been implicated in chromosome condensation and DNA damage induced cellular responses. It may play a role in neurogenesis and regulation of the size of the cerebral cortex. Microcephalin contains three BRCT repeats. This family corresponds to the third repeat.	75
349383	cd17752	BRCT_RFC1	BRCT domain of replication factor C subunit 1 (RFC1) and similar proteins. RFC1, also termed activator 1 140 kDa subunit, or A1 140 kDa subunit, or activator 1 large subunit, or activator 1 subunit 1, or replication factor C 140 kDa subunit, or RF-C 140 kDa subunit, or RFC140, is the large subunit of replication factor C (RFC), which is a heteropentameric protein essential for DNA replication and repair. RFC1 can bind single- or double-stranded DNA. It could play a role in DNA transcription regulation as well as DNA replication and/or repair. The Trp-X-X-X-Cys/Ser signature motif of the BRCT family is not conserved in this family.	79
350659	cd17753	MCM2	DNA replication licensing factor Mcm2. Mcm2 is a helicase that plays an important role in replication. It is part of the heterohexameric ring-shaped Mcm2-7 complex, which is part of the replicative helicase that unwinds parental double-stranded DNA at a replication fork to provide single-stranded DNA templates for the replicative polymerases.	325
350660	cd17754	MCM3	DNA replication licensing factor Mcm3. Mcm3 is a helicase that plays an important role in replication. It is part of the heterohexameric ring-shaped Mcm2-7 complex, which is part of the replicative helicase that unwinds parental double-stranded DNA at a replication fork to provide single-stranded DNA templates for the replicative polymerases.	299
350661	cd17755	MCM4	DNA replication licensing factor Mcm4. Mcm4 is a helicase that plays an important role in replication. It is part of the heterohexameric ring-shaped Mcm2-7 complex, which is part of the replicative helicase that unwinds parental double-stranded DNA at a replication fork to provide single-stranded DNA templates for the replicative polymerases.	309
350662	cd17756	MCM5	DNA replication licensing factor Mcm5. Mcm5 is a helicase that plays an important role in replication. It is part of the heterohexameric ring-shaped Mcm2-7 complex, which is part of the replicative helicase that unwinds parental double-stranded DNA at a replication fork to provide single-stranded DNA templates for the replicative polymerases.	317
350663	cd17757	MCM6	DNA replication licensing factor Mcm6. Mcm6 is a helicase that plays an important role in replication. It is part of the heterohexameric ring-shaped Mcm2-7 complex, which is part of the replicative helicase that unwinds parental double-stranded DNA at a replication fork to provide single-stranded DNA templates for the replicative polymerases.	307
350664	cd17758	MCM7	DNA replication licensing factor Mcm7. Mcm7 is a helicase that plays an important role in replication. It is part of the heterohexameric ring-shaped Mcm2-7 complex, which is part of the replicative helicase that unwinds parental double-stranded DNA at a replication fork to provide single-stranded DNA templates for the replicative polymerases.	306
350665	cd17759	MCM8	DNA helicase Mcm8. Mcm8 plays an important role in homologous recombination repair. It forms a complex with Mcm9 that is required for homologous recombination (HR) repair induced by DNA interstrand crosslinks (ICLs).	289
350666	cd17760	MCM9	DNA helicase Mcm9. Mcm9 plays an important role in homologous recombination repair. It forms a complex with Mcm8 that is required for homologous recombination (HR) repair induced by DNA interstrand crosslinks (ICLs).	299
350667	cd17761	MCM_arch	archaeal MCM protein. Archaeal MCM proteins form a homohexameric ring homologous to the eukaryotic Mcm2-7 helicase and also function as the replicative helicase at the replication fork.	308
350162	cd17762	AMN	AMP nucleosidase. AMP nucleosidase (AMN) catalyzes the hydrolysis of AMP to ribose 5-phosphate and adenine. It is a prokaryotic enzyme which plays a role in purine nucleoside salvage and intracellular AMP level regulation. AMN is active as a homohexamer; each monomer is comprised of a catalytic domain and a putative regulatory domain. This model represents the catalytic domain. AMN belongs to the nucleoside phosphorylase-I (NP-I) family, whose members accept a range of purine nucleosides as well as the pyrimidine nucleoside uridine. The NP-I family includes phosphorolytic nucleosidases, such as purine nucleoside phosphorylase (PNPs, EC 2.4.2.1), uridine phosphorylase (UP, EC 2.4.2.3), and 5'-deoxy-5'-methylthioadenosine phosphorylase (MTAP, EC 2.4.2.28), and hydrolytic nucleosidases, such as AMP nucleosidase (AMN, EC 3.2.2.4), and 5'-methylthioadenosine/S-adenosylhomocysteine (MTA/SAH) nucleosidase (MTAN, EC 3.2.2.16). The NP-I family is distinct from nucleoside phosphorylase-II, which belongs to a different structural family.	242
350163	cd17763	UP_hUPP-like	uridine phosphorylases similar to human UPP1 and UPP2. Uridine phosphorylase (UP) catalyzes the reversible phosphorolysis of uracil ribosides and analogous compounds to their respective nucleobases and ribose 1-phosphate. Human UPP1 has a role in the activation of pyrimidine nucleoside analogues used in chemotherapy, such as 5-fluorouracil. This subfamily belongs to the nucleoside phosphorylase-I (NP-I) family, whose members accept a range of purine nucleosides as well as the pyrimidine nucleoside uridine. The NP-I family includes phosphorolytic nucleosidases, such as purine nucleoside phosphorylase (PNPs, EC 2.4.2.1), uridine phosphorylase (UP, EC 2.4.2.3), and 5'-deoxy-5'-methylthioadenosine phosphorylase (MTAP, EC 2.4.2.28), and hydrolytic nucleosidases, such as AMP nucleosidase (AMN, EC 3.2.2.4), and 5'-methylthioadenosine/S-adenosylhomocysteine (MTA/SAH) nucleosidase (MTAN, EC 3.2.2.16). The NP-I family is distinct from nucleoside phosphorylase-II, which belongs to a different structural family.	276
350164	cd17764	MTAP_SsMTAPI_like	5'-deoxy-5'-methylthioadenosine phosphorylases similar to Sulfolobus solfataricus MTAPI. 5'-deoxy-5'-methylthioadenosine phosphorylase (MTAP) catalyzes the reversible phosphorolysis of 5'-deoxy-5'-methylthioadenosine (MTA) to adenine and 5-methylthio-D-ribose-1-phosphate. Sulfolobus solfataricus MTAPI will utilize inosine, guanosine, and adenosine as substrates, in addition to MTA. Two MTAPs have been isolated from S. solfataricus: SsMTAPI and SsMTAPII. SsMTAPII belongs to a different subfamily of the nucleoside phosphorylase-I (NP-I) family, whose members accept a range of purine nucleosides as well as the pyrimidine nucleoside uridine. The NP-I family includes phosphorolytic nucleosidases, such as purine nucleoside phosphorylase (PNPs, EC 2.4.2.1), uridine phosphorylase (UP, EC 2.4.2.3), and 5'-deoxy-5'-methylthioadenosine phosphorylase (MTAP, EC 2.4.2.28), and hydrolytic nucleosidases, such as AMP nucleosidase (AMN, EC 3.2.2.4), and 5'-methylthioadenosine/S-adenosylhomocysteine (MTA/SAH) nucleosidase (MTAN, EC 3.2.2.16). The NP-I family is distinct from nucleoside phosphorylase-II, which belongs to a different structural family.	220
350165	cd17765	PNP_ThPNP_like	purine nucleoside phosphorylases similar to Thermus thermophilus PNP. Purine nucleoside phosphorylase (PNP) catalyzes the reversible phosphorolysis of purine nucleosides. Thermus thermophilus PNP catalyzes the phosphorolysis of guanosine but not adenosine. This subfamily belongs to the nucleoside phosphorylase-I (NP-I) family, whose members accept a range of purine nucleosides as well as the pyrimidine nucleoside uridine. The NP-I family includes phosphorolytic nucleosidases, such as purine nucleoside phosphorylase (PNPs, EC 2.4.2.1), uridine phosphorylase (UP, EC 2.4.2.3), and 5'-deoxy-5'-methylthioadenosine phosphorylase (MTAP, EC 2.4.2.28), and hydrolytic nucleosidases, such as AMP nucleosidase (AMN, EC 3.2.2.4), and 5'-methylthioadenosine/S-adenosylhomocysteine (MTA/SAH) nucleosidase (MTAN, EC 3.2.2.16). The NP-I family is distinct from nucleoside phosphorylase-II, which belongs to a different structural family.	234
350166	cd17766	futalosine_nucleosidase_MqnB	futalosine nucleosidase which catalyzes the hydrolysis of futalosine to dehypoxanthinylfutalosine and a hypoxanthine base; similar to Thermus thermophilus MqnB. Futalosine nucleosidase (MqnB, EC 3.2.2.26, also known as futalosine hydrolase) functions in an alternative menaquinone biosynthetic pathway (the futalosine pathway) which operates in some bacteria, including Streptomyces coelicolor and Thermus thermophilus. This domain model belongs to the PNP_UDP_1 superfamily which includes members which accept a range of purine nucleosides as well as the pyrimidine nucleoside uridine. PNP_UDP_1 includes phosphorolytic nucleosidases, such as purine nucleoside phosphorylase (PNPs, EC 2.4.2.1), uridine phosphorylase (UP, EC 2.4.2.3), and 5'-deoxy-5'-methylthioadenosine phosphorylase (MTAP, EC 2.4.2.28), and hydrolytic nucleosidases, such as AMP nucleosidase (AMN, EC 3.2.2.4), and 5'-methylthioadenosine/S-adenosylhomocysteine (MTA/SAH) nucleosidase (MTAN, EC 3.2.2.16). Superfamily members have different physiologically relevant quaternary structures: hexameric such as the trimer-of-dimers arrangement of Shewanella oneidensis MR-1 UP, homotrimeric such as human PNP and Escherichia coli PNPII (XapA), homohexameric (with some evidence for co-existence of a trimeric form) such as E. coli PNPI (DeoD), or homodimeric such as human and Trypanosoma brucei UP. The PNP_UDP_2 (nucleoside phosphorylase-II family) is a different structural family.	217
350167	cd17767	UP_EcUdp-like	uridine phosphorylases similar to Escherichia coli Udp and related phosphorylases. Uridine phosphorylase (UP) is specific for pyrimidines, and is involved in pyrimidine salvage and in the maintenance of uridine homeostasis. In addition to E. coli Udp, this subfamily includes Shewanella oneidensis MR-1 UP and Plasmodium falciparum purine nucleoside phosphorylase (PfPNP). PfPNP is an outlier in terms of genetic distance from the other families of PNPs. PfPNP is catalytically active for inosine and guanosine, and in addition, has a weak UP activity. This subfamily belongs to the nucleoside phosphorylase-I (NP-I) family, whose members accept a range of purine nucleosides as well as the pyrimidine nucleoside uridine. The NP-I family includes phosphorolytic nucleosidases, such as purine nucleoside phosphorylase (PNPs, EC 2.4.2.1), uridine phosphorylase (UP, EC 2.4.2.3), and 5'-deoxy-5'-methylthioadenosine phosphorylase (MTAP, EC 2.4.2.28), and hydrolytic nucleosidases, such as AMP nucleosidase (AMN, EC 3.2.2.4), and 5'-methylthioadenosine/S-adenosylhomocysteine (MTA/SAH) nucleosidase (MTAN, EC 3.2.2.16). The NP-I family is distinct from nucleoside phosphorylase-II, which belongs to a different structural family.	239
350168	cd17768	adenosylhopane_nucleosidase_HpnG-like	adenosylhopane nucleosidase which cleaves adenine from adenosylhopane to form ribosyl hopane; similar to Burkholderia cenocepacia HpnG. Adenosylhopane nucleosidase HpnG catalyzes the second step in hopanoid side-chain biosynthesis. Hopanoids are bacterial membrane lipids. This CD belongs to the PNP_UDP_1 superfamily which includes members which accept a range of purine nucleosides as well as the pyrimidine nucleoside uridine. PNP_UDP_1 includes phosphorolytic nucleosidases, such as purine nucleoside phosphorylase (PNPs, EC 2.4.2.1), uridine phosphorylase (UP, EC 2.4.2.3), and 5'-deoxy-5'-methylthioadenosine phosphorylase (MTAP, EC 2.4.2.28), and hydrolytic nucleosidases, such as AMP nucleosidase (AMN, EC 3.2.2.4), and 5'-methylthioadenosine/S-adenosylhomocysteine (MTA/SAH) nucleosidase (MTAN, EC 3.2.2.16). Superfamily members have different physiologically relevant quaternary structures: hexameric such as the trimer-of-dimers arrangement of Shewanella oneidensis MR-1 UP, homotrimeric such as human PNP and Escherichia coli PNPII (XapA), homohexameric (with some evidence for co-existence of a trimeric form) such as E. coli PNPI (DeoD), or homodimeric such as human and Trypanosoma brucei UP. The PNP_UDP_2 (nucleoside phosphorylase-II family) is a different structural family.	188
350169	cd17769	NP_TgUP-like	nucleoside phosphorylases similar to Toxoplasma gondii uridine phosphorylase. This subfamily is composed of mostly uncharacterized proteins with similarity to Toxoplasma gondii uridine phosphorylase (TgUPase). Toxoplasma gondii appears to have a single non-specific uridine phosphorylase which catalyzes the reversible phosphorolysis of uridine, deoxyuridine and thymidine, rather than the two distinct enzymes of mammalian cells: uridine phosphorylase (nucleoside phosphorylase-I family) and thymidine phosphorylase (nucleoside phosphorylase-II family). TgUPase is a potential target for intervention against toxoplasmosis. It belongs to the nucleoside phosphorylase-I (NP-I) family, whose members accept a range of purine nucleosides as well as the pyrimidine nucleoside uridine. The NP-I family includes phosphorolytic nucleosidases, such as purine nucleoside phosphorylase (PNPs, EC 2.4.2.1), uridine phosphorylase (UP, EC 2.4.2.3), and 5'-deoxy-5'-methylthioadenosine phosphorylase (MTAP, EC 2.4.2.28), and hydrolytic nucleosidases, such as AMP nucleosidase (AMN, EC 3.2.2.4), and 5'-methylthioadenosine/S-adenosylhomocysteine (MTA/SAH) nucleosidase (MTAN, EC 3.2.2.16). The NP-I family is distinct from nucleoside phosphorylase-II, which belongs to a different structural family.	255
341407	cd17771	CBS_pair_CAP-ED_NT_Pol-beta-like_DUF294_assoc	CBS domain protein. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the bacterial CAP_ED (cAMP receptor protein effector domain) family of transcription factors, the NT_Pol-beta-like domain, and the DUF294 domain.  Members of CAP_ED include CAP, which binds cAMP, FNR (fumarate and nitrate reductase), which uses an iron-sulfur cluster to sense oxygen, and CooA, a heme-containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. The NT_Pol-beta-like domain includes the Nucleotidyltransferase (NT) domains of DNA polymerase beta and other family X DNA polymerases, as well as the NT domains of class I and class II CCA-adding enzymes, RelA- and SpoT-like ppGpp synthetases and hydrolases, 2'5'-oligoadenylate (2-5A) synthetases, Escherichia coli adenylyltransferase (GlnE), Escherichia coli uridylyl transferase (GlnD), poly (A) polymerases, terminal uridylyl transferases, Staphylococcus aureus kanamycin nucleotidyltransferase, and similar proteins.  DUF294 is a putative nucleotidyltransferase with a conserved DxD motif. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	115
341408	cd17772	CBS_pair_DHH_polyA_Pol_assoc	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the DHH and nucleotidyltransferase (NT) domains. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with an upstream DHH domain which performs a phosphoesterase function and a downstream nucleotidyltransferase (NT) domain of family X DNA polymerases. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	112
341409	cd17773	CBS_pair_NeuB	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domain present in N-acylneuraminate-9-phosphate synthase. This CD contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domain present in N-acylneuraminate-9-phosphate synthase NeuB.  NeuB catalyzes the condensation of phosphoenolpyruvate (PEP) and N-acetylmannosamine, directly forming N-acetylneuraminic acid (or sialic acid). The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	118
341410	cd17774	CBS_two-component_sensor_histidine_kinase_repeat2	2 tandem repeats of the CBS domain in the two-component sensor histidine kinase and related proteins, repeat 2. This cd contains 2 tandem repeats of the CBS domain in the two-component sensor histidine kinase and related proteins. Two-component regulation is the predominant form of signal recognition and response coupling mechanism used by bacteria to sense and respond to diverse environmental stresses and cues ranging from common environmental stimuli to host signals recognized by pathogens and bacterial cell-cell communication signals.  The structures of both sensors and regulators are modular, and numerous variations in domain architecture and composition have evolved to tailor to specific needs in signal perception and signal transduction. The simplest histidine kinase sensors consist of only sensing and kinase domains. The more complex hybrid sensors contain an additional REC domain typical of two-component regulators and in some cases a C-terminal histidine phosphotransferase (HPT) domain. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	137
341411	cd17775	CBS_pair_bact_arch	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains  present in bacteria and archaea. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	117
341412	cd17776	CBS_pair_arch	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains present in archaea. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	115
341413	cd17777	CBS_arch_repeat1	CBS pair domains found in archaeal proteins, repeat 1. CBS pair domains found in archaeal proteins that contain 2 repeats. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.	137
341414	cd17778	CBS_arch_repeat2	CBS pair domains found in archaeal proteins, repeat 2. CBS pair domains found in archaeal proteins that contain 2 repeats. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.	131
341415	cd17779	CBS_archAMPK_gamma-repeat1	signal transduction protein with CBS domains. The archaeal gamma subunit of 5'-AMP-activated protein kinase (AMPK) contains four CBS domains in tandem repeats, similar to eukaryotic homologs. AMPK is an important regulator of metabolism and of energy homeostasis. It is a heterotrimeric protein composed of a catalytic serine/threonine kinase subunit (alpha) and two regulatory subunits (beta and gamma). The gamma subunit senses the intracellular energy status by competitively binding AMP and ATP and is believed to be responsible for allosteric regulation of the whole complex. In humans, mutations in the gamma subunit of AMPK are associated with hypertrophic cardiomyopathy, Wolff-Parkinson-White syndrome and glycogen storage in the skeletal muscle. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.	136
341416	cd17780	CBS_pair_arch1_repeat1	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains present in archaea, repeat 1. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	106
341417	cd17781	CBS_pair_MUG70_1	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains similar to MUG70 repeat1. Two tandem repeats of the cystathionine beta-synthase (CBS pair) domain, present in MUG70. The MUG70 protein, encoded by the Meiotically Up-regulated Gene 70, plays a role in meiosis and contains, besides the two CBS pairs, a PB1 domain. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	118
341418	cd17782	CBS_pair_MUG70_2	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains similar to MUG70 repeat2. Two tandem repeats of the cystathionine beta-synthase (CBS pair) domain, present in MUG70. The MUG70 protein, encoded by the Meiotically Up-regulated Gene 70, plays a role in meiosis and contains, besides the two CBS pairs, a PB1 domain.  The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	118
341419	cd17783	CBS_pair_bac	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains present in bacteria. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	108
341420	cd17784	CBS_pair_Euryarchaeota	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains present in Euryarchaeota. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	120
341421	cd17785	CBS_pair_bac_arch	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains present in bacteria and archaea. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	136
341422	cd17786	CBS_pair_Thermoplasmatales	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains present in Thermoplasmatales. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	114
341423	cd17787	CBS_pair_ACT	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains found in Thermotoga in combination with an ACT domain. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	111
341424	cd17788	CBS_pair_bac	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains present in bacteria. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	137
341425	cd17789	CBS_pair_plant_CBSX	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains from plant CBSX proteins. This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains of plant single cystathionine beta-synthase (CBS) pair proteins (CBSX). CBSX1 and CBSX2 have been identified as redox regulators of the thioredoxin (Trx) system. The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria.  The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair.  The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here.  It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase).	141
341356	cd17790	7tmA_mAChR_M1	muscarinic acetylcholine receptor subtype M1, member of the class A family of seven-transmembrane G protein-coupled receptors. Muscarinic acetylcholine receptors (mAChRs) regulate the activity of many fundamental central and peripheral functions. The mAChR family consists of 5 subtypes M1-M5, which can be further divided into two major groups according to their G-protein coupling preference. The M1, M3 and M5 receptors selectively interact with G proteins of the G(q/11) family, whereas the M2 and M4 receptors preferentially link to the G(i/o) types of  G proteins. Activation of mAChRs by agonist (acetylcholine) leads to a variety of biochemical and electrophysiological responses. M1 is the dominant mAChR subtype involved in learning and memory. It is linked to synaptic plasticity, neuronal excitability, and neuronal differentiation during early development. All GPCRs have a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	262
341490	cd17791	HipA-like	serine/threonine-protein kinases similar to HipA and CtkA. This family contains serine/threonine-protein kinases similar to Escherichia coli HipA, a type II toxin-antitoxin (TA) system HipA family toxin that phosphorylates Glu-tRNA-ligase (GltX), preventing it from being charged, leading to an increase in uncharged tRNA(Glu), and is the toxin component of the HipA-HipB TA module, as well as similar to the Helicobacter pylori serine/threonine-protein kinase CtkA (proinflammatory kinase), which has been shown to be secreted by the bacteria and to induce cytokines in gastric epithelial cells relevant to chronic gastric inflammation.	284
341491	cd17792	CtkA	serine/threonine-protein kinase CtkA and similar proteins. The Helicobacter pylori serine/threonine-protein kinase CtkA (proinflammatory kinase), encoded by the jhp940 gene, has been shown to be secreted by the bacterium and to induce cytokines in gastric epithelial cells. It may play a role in chronic gastric inflammation. CtkA autophosphorylates itself at a threonine residue near the N-terminus and it translocates into cultured human cells. It also enhances phosphorylation of the NF-kappaB p65 subunit at Ser276 in human epithelial cancer cells; phosphorylation at this position is known to activate the transcriptional activity of NF-kappaB.	281
341492	cd17793	HipA	type II toxin-antitoxin system toxin HipA and similar proteins. This family contains type II toxin-antitoxin (TA) system HipA family toxins similar to Escherichia coli and Shewanella oneidensis HipA, which is a serine/threonine-protein kinase that phosphorylates Glu-tRNA-ligase (GltX), preventing it from being charged, leading to an increase in uncharged tRNA(Glu). This induces amino acid starvation and the stringent response via RelA/SpoT and increased (p)ppGpp levels, which inhibits replication, transcription, translation and cell wall synthesis, reducing growth and leading to persistence and multidrug resistance. HipA is the toxin component of the HipA-HipB TA module that is a major factor in persistence and biofilm formation; its toxic effect is neutralized by its cognate antitoxin HipB. HipA, with HipB, acts as a corepressor for transcription of the hipBA promoter. Structures of HipAB:DNA complexes from both Escherichia coli and Shewanella oneidensis reveal distinct complex assembly.	358
341493	cd17808	HipA_Ec_like	type II toxin-antitoxin system toxin HipA from Escherichia coli and similar proteins. This family contains type II toxin-antitoxin (TA) system HipA family toxins similar to Escherichia coli HipA, a serine/threonine-protein kinase that phosphorylates Glu-tRNA-ligase (GltX), preventing it from being charged, leading to an increase in uncharged tRNA(Glu). This induces amino acid starvation and the stringent response via RelA/SpoT and increased (p)ppGpp levels, which inhibits replication, transcription, translation and cell wall synthesis, reducing growth and leading to persistence and multidrug resistance. HipA is the toxin component of the HipA-HipB TA module that is a major factor in persistence and biofilm formation; its toxic effect is neutralized by its cognate antitoxin HipB. HipA, with HipB, acts as a corepressor for transcription of the hipBA promoter. In the Escherichia coli HipAB:DNA promoter complex, HipA forms a dimer and each HipA monomer interacts with a HipB homodimer which binds DNA. The HipAB component of the complex is composed of two HipA and four HipB subunits.	401
341494	cd17809	HipA_So_like	type II toxin-antitoxin system toxin HipA from Shewanella oneidensis and similar proteins. This family contains type II toxin-antitoxin (TA) system HipA family toxins similar to Shewanella oneidensis HipA, a serine/threonine-protein kinase that phosphorylates Glu-tRNA-ligase (GltX), preventing it from being charged, leading to an increase in uncharged tRNA(Glu). This induces amino acid starvation and the stringent response via RelA/SpoT and increased (p)ppGpp levels, which inhibits replication, transcription, translation and cell wall synthesis, reducing growth and leading to persistence and multidrug resistance. HipA is the toxin component of the HipA-HipB TA module that is a major factor in persistence and biofilm formation; its toxic effect is neutralized by its cognate antitoxin HipB. HipA, with HipB, acts as a corepressor for transcription of the hipBA promoter. In the Shewanella oneidensis HipAB:DNA promoter complex, HipB forms a dimer that binds the duplex operator DNA, with each HipB monomer interacting with separate HipA monomers. The HipAB component of the complex is composed of two HipA and two HipB subunits.	405
341489	cd17814	Fe-ADH-like	iron-containing alcohol dehydrogenases (Fe-ADH)-like. This family contains iron-containing alcohol dehydrogenase (Fe-ADH) which catalyzes the reduction of acetaldehyde to alcohol with NADP as cofactor. Its activity requires iron ions. The protein structure represents a dehydroquinate synthase-like fold and is a member of the iron-activated alcohol dehydrogenase-like family. It is distinct from other alcohol dehydrogenases which contain different protein domains. Proteins of this family have not been characterized.	374
349777	cd17868	GPN	GPN-loop GTPase. GPN-loop GTPases are a deeply evolutionarily conserved family of three small GTPases, Gpn1, 2, and 3. They form heterodimers, interact with RNA polymerase II and may function in nuclear import of RNA polymerase II.	198
349778	cd17869	TadZ-like	pilus assembly protein TadZ. Pilus assembly protein TadZ is involved in the production of a variant of type IV pili. It is part of the SIMIBI superfamily which contains a variety of proteins which share a common ATP-binding domain. Functionally, proteins in this superfamily use the energy from hydrolysis of NTP to transfer electrons or ions.	219
349779	cd17870	GPN1	GPN-loop GTPase 1. GPN-loop GTPase 1 (GPN1, also known as MBD2-interacting protein or MBDin, RNAPII-associated protein 4, and XPA-binding protein 1) is a GTPase that is required for nuclear targeting of RNA polymerase II. It forms heterodimers with GPN3.	241
349780	cd17871	GPN2	GPN-loop GTPase 2. GPN-loop GTPase 2 (GPN2) is a small GTPase required for proper localization of RNA polymerase II and III (RNAPII and RNAPIII). It forms heterodimers with GPN1 or GPN3.	196
349781	cd17872	GPN3	GPN-loop GTPase 3. GPN-loop GTPase 3 (GPN3) is a small GTPase that is required for nuclear targeting of RNA polymerase II. It forms heterodimers with GPN1.	196
349782	cd17873	FlhF	signal-recognition particle GTPase FlhF. FlhF protein is a signal-recognition particle (SRP)-type GTPase that is essential for the placement and assembly of polar flagella. It is similar to the 54 kd subunit (SRP54) of the signal recognition particle (SRP) that mediates the transport to or across the plasma membrane in bacteria and the endoplasmic reticulum in eukaryotes. SRP recognizes N-terminal signal sequences of newly synthesized polypeptides at the ribosome. The SRP-polypeptide complex is then targeted to the membrane by an interaction between SRP and its cognate receptor (SR).	189
349783	cd17874	FtsY	signal recognition particle receptor FtsY. FtsY, the bacterial  signal-recognition particle (SRP) receptor (SR), is homologous to the SRP receptor alpha-subunit (SRalpha) of the eukaryotic SR. It interacts with the signal-recognition particle (SRP) and is required for the co-translational membrane targeting of proteins.	199
349784	cd17875	SRP54_G	GTPase domain of the signal recognition 54 kDa subunit. The signal recognition particle (SRP) mediates the transport to or across the plasma membrane in bacteria and the endoplasmic reticulum in eukaryotes. SRP recognizes N-terminal signal sequences of newly synthesized polypeptides at the ribosome. The SRP-polypeptide complex is then targeted to the membrane by an interaction between SRP and its cognate receptor (SR). In mammals, SRP consists of six protein subunits and a 7SL RNA. One of these subunits is a 54 kd protein (SRP54), which is a GTP-binding protein that interacts with the signal sequence when it emerges from the ribosome. SRP54 is a multidomain protein that consists of an N-terminal domain, followed by a central G (GTPase) domain and a C-terminal M domain.	193
349785	cd17876	SRalpha_C	C-terminal domain of signal recognition particle receptor alpha subunit. The signal-recognition particle (SRP) receptor (SR) alpha-subunit (SRalpha) of the eukaryotic SR  interacts with the signal-recognition particle (SRP) and is essential for the co-translational membrane targeting of proteins.	204
350170	cd17877	NP_MTAN-like	nucleoside phosphorylases similar to 5'-methylthioadenosine/S-adenosylhomocysteine nucleosidases. This subfamily includes both bacterial and plant 5'-methylthioadenosine/S-adenosylhomocysteine (MTA/SAH) nucleosidases (MTANs), as well as futalosine nucleosidase and adenosylhopane nucleosidase. Bacterial MTANs show comparable efficiency in hydrolyzing MTA and SAH, while plant enzymes are highly specific for MTA and are unable to metabolize SAH or show significantly reduced activity towards SAH. MTAN is involved in methionine and S-adenosyl-methionine recycling, polyamine biosynthesis, and bacterial quorum sensing. This subfamily belongs to the nucleoside phosphorylase-I (NP-I) family, whose members accept a range of purine nucleosides as well as the pyrimidine nucleoside uridine. The NP-I family  includes phosphorolytic nucleosidases, such as purine nucleoside phosphorylase (PNPs, EC 2.4.2.1), uridine phosphorylase (UP, EC 2.4.2.3), and 5'-deoxy-5'-methylthioadenosine phosphorylase (MTAP, EC 2.4.2.28), and hydrolytic nucleosidases, such as AMP nucleosidase (AMN, EC 3.2.2.4), and 5'-methylthioadenosine/S-adenosylhomocysteine (MTA/SAH) nucleosidase (MTAN, EC 3.2.2.16). The NP-I family is distinct from nucleoside phosphorylase-II, which belongs to a different structural family.	210
350625	cd17880	D-Ala-D-Ala_dipeptidase	D-Ala-D-Ala_dipeptidase. This family contains D-Ala-D-Ala dipeptidase enzymes which include D-alanyl-D-alanine dipeptidase vanX and Aad, among others. VanX is a Zn2+-dependent enzyme that mediates resistance to the antibiotic vancomycin in Enterococci and other bacteria (both Gram-positive and Gram-negative). It is part of a gene cluster that affects cell-wall biosynthesis. The operon triggers the termination of peptidoglycan precursors by D-Ala-(R)-lactate instead of D-Ala-D-Ala dipeptides. The enzyme is stereospecific, as L-Ala-L-Ala, D-Ala-L-Ala and L-Ala-D-Ala are not substrates. This family includes Lactobacillus Aad peptidase and belongs in the MEROPS peptidase family M15, subfamily D.	110
350087	cd17900	ArfGap_ASAP3	ArfGAP domain of ASAP3 (ArfGAP with ANK repeat and PH domain-containing protein 3). The ArfGAPs are a family of multidomain proteins with a common catalytic domain that promotes the hydrolysis of GTP bound to Arf, thereby  inactivating Arf signaling. ASAP-subfamily GAPs include three members: ASAP1, ASAP2, ASAP3.  The ASAP subfamily comprises Arf GAP, SH3, ANK repeat and PH domains. From the N-terminus, each member has a BAR, PH, Arf GAP, ANK repeat, and proline rich domains. Unlike ASAP1 and ASAP2, ASAP3 does not have an SH3 domain at the C-terminus. ASAP1 and ASAP2 show strong GTPase-activating protein (GAP) activity toward Arf1 and Arf5 and weak activity toward Arf6. ASAP1 is a target of Src and FAK signaling that regulates focal adhesions, circular dorsal ruffles (CDR), invadopodia, and podosomes. ASAP1 GAP activity is synergistically stimulated by phosphatidylinositol 4,5-bisphosphate (PIP2) and phosphatidic acid.  ASAP2 is believed to function as an ArfGAP that controls ARF-mediated vesicle budding when recruited to Golgi membranes. It also functions as a substrate and downstream target for protein tyrosine kinases Pyk2 and Src, a pathway that may be involved in the regulation of vesicular transport. ASAP3 is a focal adhesion-associated ArfGAP that functions in cell migration and invasion. Similar to ASAP1, the GAP activity of ASAP3 is strongly enhanced by PIP2 via PH domain. Like ASAP1, ASAP3 associates with focal adhesions and circular dorsal ruffles. However, unlike ASAP1, ASAP3 does not localize to invadopodia or podosomes. ASAP1 and ASAP3 have been implicated in oncogenesis, as ASAP1 is highly expressed in metastatic breast cancer and ASAP3 in hepatocellular carcinoma.	124
350088	cd17901	ArfGap_ARAP1	ArfGap with Rho-Gap domain, ANK repeat and PH domain-containing protein 1. The ARAP subfamily includes three members, ARAP1-3, and belongs to the ADP-ribosylation factor GTPase-activating proteins (Arf GAPs) family of proteins that promotes the hydrolysis of GTP bound to Arf, thereby inactivating Arf signaling.  The function of Arfs is dependent on GAPs and guanine nucleotide exchange factors (GEFs), which allow Arfs to cycle between the GDP-bound and GTP-bound forms. In addition to the Arf GAP domain, ARAPs contain the SAM (sterile-alpha motif) domain, 5 pleckstrin homology (PH) domains, the Rho-GAP domain, the Ras-association domain, and ANK repeats. ARAPs show phosphatidylinositol 3,4,5-trisphosphate (PI(3,4,5)P3)-dependent GAP activity toward Arf6. ARAPs play important roles in endocytic trafficking, cytoskeleton reorganization in response to growth factors stimulation, and focal adhesion dynamics. ARAP1 localizes to the plasma membrane, the Golgi complex, and endosomal compartments. It displays PI(3,4,5)P3-dependent ArfGAP activity that regulates Arf-, RhoA-, and Cdc42-dependent cellular events. For example, ARAP1 inhibits the trafficking of epidermal growth factor receptor (EGFR) to the early endosome.	116
350089	cd17902	ArfGap_ARAP3	ArfGap with Rho-Gap domain, ANK repeat and PH domain-containing protein 3. The ARAP subfamily includes three members, ARAP1-3, and belongs to the ADP-ribosylation factor GTPase-activating proteins (Arf GAPs) family of proteins that promotes the hydrolysis of GTP bound to Arf, thereby inactivating Arf signaling.  The function of Arfs is dependent on GAPs and guanine nucleotide exchange factors (GEFs), which allow Arfs to cycle between the GDP-bound and GTP-bound forms. In addition to the Arf GAP domain, ARAPs contain the SAM (sterile-alpha motif) domain, 5 pleckstrin homology (PH) domains, the Rho-GAP domain, the Ras-association domain, and ANK repeats. ARAPs show phosphatidylinositol 3,4,5-trisphosphate (PI(3,4,5)P3)-dependent GAP activity toward Arf6. ARAPs play important roles in endocytic trafficking, cytoskeleton reorganization in response to growth factors stimulation, and focal adhesion dynamics. ARAP3 possesses a unique dual-specificity GAP activity for Arf6 and RhoA regulated by PI(3,4,5)P3 and a small GTPase Rap1-GTP. The RhoGAP activity of ARAP3 is enhanced by direct binding of Rap1-GTP to the Ras-association (RA) domain. ARAP3 is involved in regulation of cell shape and adhesion.	116
350090	cd17903	ArfGap_AGFG2	ArfGAP domain of AGFG2 (ArfGAP domain and FG repeat-containing protein 2). The ArfGAP domain and FG repeat-containing proteins (AGFG) subfamily of Arf GTPase-activating proteins consists of the two structurally-related members: AGFG1 and AGFG2. AGFG2 is a member of the HIV-1 Rev binding protein (HRB) family and contains one Arf-GAP zinc finger domain, several Phe-Gly (FG) motifs, and four Asn-Pro-Phe (NPF) motifs. AGFG2 interacts with Eps15 homology (EH) domains and plays a role in the Rev export pathway, which mediates the nucleocytoplasmic transfer of proteins and RNAs. In humans, the presence of the FG repeat motifs (11 in AGFG1 and 7 in AGFG2) is thought to be required for these proteins to act as HIV-1 Rev cofactors. Hence, AGFG promotes movement of Rev-responsive element-containing RNAs from the nuclear periphery to the cytoplasm, which is an essential step for HIV-1 replication.	116
380783	cd17904	PFM_monalysin-like	pore-forming module of Pseudomonas entomophila monalysin and similar aerolysin-type beta-barrel pore-forming proteins. Monalysin plays a role in Pseudomonas entomophila virulence against Drosophila, contributing to host intestinal damage and lethality. Members of this group belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin).	206
381733	cd17905	CheC-like	chemotaxis protein CheC; includes CheC classes I, II, and III. This family contains chemotaxis protein CheC that acts as a weak CheY-P phosphatase but shows increased activity in the presence of CheD. This CheC family includes three classes: class I containing Bacillus subtilis CheC which might function as a regulator of CheD; class II CheCs that likely function as phosphatases in systems other than chemotaxis; and class III CheCs that are found chiefly in the archaeal class Halobacteria and might function similarly as class I CheCs. Class I CheCs contain two active sites with the consensus sequence ([DS]xxxExxNx(22)P), with four conserved residues thought to form the phosphatase active site; class II and class III CheCs have only one active site.	173
381734	cd17906	CheX	chemotaxis phosphatase CheX. This family contains CheX CheY-P phosphatase which is very closely related to CheC chemotaxis phosphatase; both dephosphorylate CheY, although CheC requires binding of CheD to achieve the level of activity of CheX. CheX has been shown to be the most powerful CheY-P phosphatase of the CheC-FliY-CheX (CXY) family. Structural and functional data of CheX and its CheY3 substrate in Borrelia burgdorferi (the causative agent of Lyme disease) bound to the phosphoryl analog BeF3(-) and Mg2+ reveal a unique mode of binding, but a catalytic mechanism which is virtually identical to that used by the structurally unrelated CheZ, providing a striking example of convergent evolution. Thus, CheX is quite divergent from the rest of the CXY family; it forms a dimer and some may function outside chemotaxis. The data also suggest a possible CheX regulatory mechanism through dissociation of the CheX homodimer.	148
381735	cd17907	FliY_FliN-Y	flagellar motor switch protein FliY. This family contains the flagellar rotor protein FliY, a highly conserved and essential member of the CheC phosphatase family, that distinguishes flagellar architecture and function in different types of bacteria. Unlike CheC and CheX, FliY is localized in the flagellar switch complex, which also contains the stator-coupling protein FliG and the target of CheY-P, FliM, all present in many copies, and together corresponding structurally to the C-ring of the flagellar basal body. FliY structure resembles that of the rotor protein FliM but contains two active centers for CheY dephosphorylation. In bacteria such as Thermotogae and Bacilli, FliY is fused to FliN. It incorporates properties of the FliM/FliN rotor proteins and the CheC/CheX phosphatases to serve multiple functions in the flagellar switch. FliY seems to act on CheY-P constitutively, as compared to CheC and CheX that appear to be primarily involved in restoring normal CheY-P levels.	191
381736	cd17908	FliM	flagellar protein FliM. This family contains bacterial flagellar protein FliM which is localized in the flagellar switch complex along with FliG and FliY; all are present in many copies, and together they correspond structurally to the C-ring of the flagellar basal body. FliM does not contain the CheC consensus sequence of the phosphatase active site ([DS]xxxExxNx(22)P) and is not a CheY-P phosphatase. FliM sits in the center of the rotor with the N-terminal region interacting with the signaling protein, phosphorylated CheY (CheY-P). The activated form of CheY destabilizes the parallel arrangement of FliM molecules, and perturbs FliG alignment in a process that may reflect the onset of rotation switching. This suggests a model of C-ring assembly in which intermolecular contacts among FliG domains provide a template for FliM assembly. Recent data show that binding of FliM to spermine synthase, SpeE, contributes to flagellar motility, an association that is unique to Helicobacter species.	181
381737	cd17909	CheC_ClassI	chemotaxis protein CheC, Class I. This subfamily contains Class I CheC proteins with phosphatase activity. The Class I cheC genes are generally found in firmicute and archaeal chemotaxis operons with cheD, usually translationally coupled. Class I CheCs interact with the CheD protein which is responsible for deamidation of certain glutamine residues to glutamates on the chemotaxis receptor proteins. This family contains two active sites with the consensus sequence ([DS]xxxExxNx(22)P), with four conserved residues thought to form the phosphatase active site. The C-terminal helix of CheC acts as a mimic of the natural enzymatic target of CheD, the alpha-helical receptors, and serves as the binding site for CheD. The CheC/CheD heterodimerization increases CheY-P phosphatase activity five-fold. Class I CheCs are involved in adaptation of the chemotaxis system.	189
381738	cd17910	CheC_ClassII	chemotaxis protein CheC, Class II. This family contains class II CheC proteins found in proteobacteria, which diverge from class I CheCs in sequence conservation and lack critical well-conserved residues for CheD binding. These proteins are likely to be dedicated phosphatases. The class II cheC genes are not found in chemotaxis operons, but in operons containing more archetypical two-component signaling components, non-signaling operons, or as orphans. Thus, class II CheCs appear to be involved in non-chemotactic two component systems. Class II CheCs lack the first of the two phosphatase active sites of class I CheCs, and retain the second active site of class I CheCs.	187
381739	cd17911	CheC_ClassIII	chemotactic protein CheC, Class III. This family contains class III CheC proteins, present chiefly in the archaeal class Halobacteria. Sequence analysis shows that class III CheC proteins are structurally and functionally similar to class I CheCs, and not to CheX, despite the fact that both class III CheCs and CheX lack the first of the two phosphatase active sites of class I CheCs, and retain the second active site. Mutation analysis shows that the second active site is more important for function than the first one, suggesting that class III proteins arose by loss of the unnecessary first active site through mutational shift. All chemotactic archaea have a CheC homologue.	187
350670	cd17912	DEAD-like_helicase_N	N-terminal helicase domain of the DEAD-box helicase superfamily. The DEAD-like helicase superfamily is a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. The N-terminal domain contains the ATP-binding region.	81
350671	cd17913	DEXQc_Suv3	DEXQ-box helicase domain of Suv3. Suppressor of var1 3-like protein (Suv3) is a DNA/RNA unwinding enzyme belonging to the class of DexH-box helicases. It localizes predominantly in the mitochondria, where it forms an RNA-degrading complex called mitochondrial degradosome (mtEXO) with exonuclease PNP (polynucleotide phosphorylase), that degrades 3' overhang double-stranded RNA with a 3'-to-5' directionality in an ATP-dependent manner. Suv3 plays a role in the RNA surveillance system in mitochondria; it regulates the stability of mature mRNAs, the removal of aberrantly formed mRNAs and the rapid degradation of non coding processing intermediates. It also confers salinity and drought stress tolerance by maintaining both photosynthesis and antioxidant machinery, probably via an increase in plant hormone levels such as gibberellic acid (GA3), the cytokinin zeatin (Z), and indole-3-acetic acid (IAA). Suv3 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	142
350672	cd17914	DExxQc_SF1-N	DEXQ-box helicase domain of superfamily 1 helicase. The superfamily (SF)1 family members include UvrD/Rep, Pif1-like, and Upf-1-like proteins. Like SF2, they do not form toroidal, predominantly hexameric structures like SF3-6. Their helicase core is surrounded by C and N-terminal domains with specific functions such as nucleases, RNA or DNA binding domains or domains engaged in protein-protein interactions. SF1 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	116
350673	cd17915	DEAHc_XPD-like	DEAH-box helicase domain of XPD family DEAD-like helicases. The xeroderma pigmentosum group D (XPD)-like family members are DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	138
350674	cd17916	DEXHc_UvrB	DEXH-box helicase domain of excinuclease ABC subunit B. Excinuclease ABC subunit B (or UvrB) plays a central role in nucleotide excision repair (NER). Together with other components of the NER system, like UvrA, UvrC, UvrD (helicase II) and DNA polymerase I, it recognizes and cleaves damaged DNA in a multistep ATP-dependent reaction. UvrB is critical for the second phase of damage recognition by verifying the nature of the damage and forming the pre-incision complex. Its ATPase site becomes activated in the presence of UvrA and damaged DNA, but its activity is strand destabilization via distortion of the DNA at lesion site, with very limited DNA unwinding. UvrB is a member of the DEAD-like helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	299
350675	cd17917	DEXHc_RHA-like	DEXH-box helicase domain of DEAD-like helicase RHA family proteins. The RNA helicase A (RHA) family includes RHA, also called DEAH-box helicase 9 (DHX9), DHX8, DHX15-16, DHX32-38, and many others. The RHA family belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	159
350676	cd17918	DEXHc_RecG	DEXH/Q-box helicase domain of DEAD-like helicase RecG family proteins. The DEAD-like helicase RecG family is part of the DEAD-like helicases superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	180
350677	cd17919	DEXHc_Snf	DEXH/Q-box helicase domain of DEAD-like helicase Snf family proteins. Sucrose Non-Fermenting (SNF) proteins belong to the DEAD-like helicases superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	182
350678	cd17920	DEXHc_RecQ	DEXH-box helicase domain of RecQ family proteins. The RecQ family of the type II DEAD box helicase superfamily is a family of highly conserved DNA repair helicases. This domain contains the ATP-binding region.	200
350679	cd17921	DEXHc_Ski2	DEXH-box helicase domain of DEAD-like helicase Ski2 family proteins. Ski2-like RNA helicases play an important role in RNA degradation, processing, and splicing pathways. They belong to the type II DEAD box helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	181
350680	cd17922	DEXHc_LHR-like	DEXH-box helicase domain of LHR. Large helicase-related protein (LHR) is a DNA damage-inducible helicase that uses ATP hydrolysis to drive unidirectional 3'-to-5' translocation along single-stranded DNA (ssDNA) and to unwind RNA:DNA duplexes. This group also includes related bacterial and archaeal helicases from the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	166
350681	cd17923	DEXHc_Hrq1-like	DEAH-box helicase domain of Hrq1 and similar proteins. Yeast Hrq1, similar to RecQ4, plays a role in DNA inter-strand crosslink (ICL) repair and in telomere maintenance. Hrq1 lacks the Sld2-like domain found in RecQ4. Hrq1 belongs to the type II DEAD box helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	182
350682	cd17924	DDXDc_reverse_gyrase	DDXD-box helicase domain of reverse gyrase. Reverse gyrase modifies the topological state of DNA by introducing positive supercoils in an ATP-dependent process. Reverse gyrase belongs to the type II DEAD box helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	189
350683	cd17925	DEXDc_ComFA	DEXD-box helicase domain of ComFA. ATP-dependent helicase ComFA (also called ComF operon protein 1) is part of the complex mediating the binding and uptake of single-stranded DNA. ComFA is required for DNA uptake but not for binding. It belongs to the type II DEAD box helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	143
350684	cd17926	DEXHc_RE	DEXH-box helicase domain of DEAD-like helicase restriction enzyme family proteins. This family is composed of helicase restriction enzymes and similar proteins such as TFIIH basal transcription factor complex helicase XPB subunit. These proteins are part of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	146
350685	cd17927	DEXHc_RIG-I	DEXH-box helicase domain of DEAD-like helicase RIG-I family proteins. Members of the RIG-I family include FANCM, dicer, Hef, and the RIG-I-like receptors. Fanconi anemia group M (FANCM) protein is a DNA-dependent ATPase component of the Fanconi anemia (FA) core complex required for the normal activation of the FA pathway, leading to monoubiquitination of the FANCI-FANCD2 complex in response to DNA damage, cellular resistance to DNA cross-linking drugs, and prevention of chromosomal breakage. Dicer ribonucleases cleave double-stranded RNA (dsRNA) precursors to generate microRNAs (miRNAs) and small interfering RNAs (siRNAs). Hef (helicase-associated endonuclease fork-structure) is involved in stalled replication fork repair. RIG-I-like receptors (RLRs) sense cytoplasmic viral RNA and comprise RIG-I, RLR-2/MDA5 (melanoma differentiation-associated protein 5) and RLR-3/LGP2 (laboratory of genetics and physiology 2). The RIG-I family is part of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	201
350686	cd17928	DEXDc_SecA	DEXD-box helicase domain of SecA. SecA is a part of the Sec translocase that transports the vast majority of bacterial and ER-exported proteins. SecA binds both the signal sequence and the mature domain of the preprotein emerging from the ribosome. SecA belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	230
350687	cd17929	DEXHc_priA	DEXH-box helicase domain of PriA. PriA, also known as replication factor Y or primosomal protein N', is a 3'-->5' superfamily 2 DNA helicase that acts to remodel stalled replication forks and as a specificity factor for origin-independent assembly of a new replisome at the stalled fork. PriA is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	178
350688	cd17930	DEXHc_cas3	DEXH/Q-box helicase domain of Cas3. CRISPR-associated (Cas) 3 is a nuclease-helicase responsible for degradation of dsDNA. The two enzymatic units of Cas3, a histidine-aspartate (HD) nuclease and a Superfamily 2 (SF2) helicase, may be expressed from separate genes as Cas3' (SF2 helicase) and Cas3'' (HD nuclease) or may be fused as a single HD-SF2 polypeptide. The nucleolytic activity of most Cas3 enzymes is transition metal ion-dependent. Cas3 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	186
350689	cd17931	DEXHc_viral_Ns3	DEXH-box helicase domain of NS3 protease-helicase. NS3 is a nonstructural multifunctional protein found in pestiviruses that contains an N-terminal protease and a C-terminal helicase. The N-terminal domain is a chymotrypsin-like serine protease, which is responsible for most of the maturation cleavages of the polyprotein precursor in the cytosolic side of the endoplasmic reticulum membrane. The C-terminal domain, about two-thirds of NS3, is a helicase belonging to superfamily 2 (SF2) thought to be important for unwinding highly structured regions of the RNA genome during replication. NS3 plays an essential role in viral polyprotein processing and genome replication. NS3 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	151
350690	cd17932	DEXQc_UvrD	DEXQD-box helicase domain of UvrD. UvrD is a highly conserved helicase involved in mismatch repair, nucleotide excision repair, and recombinational repair. It plays a critical role in maintaining genomic stability and facilitating DNA lesion repair in many prokaryotic species including Helicobacter pylori and Escherichia coli. UvrD is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	189
350691	cd17933	DEXSc_RecD-like	DEXS-box helicase domain of RecD and similar proteins. RecD is a member of the RecBCD (EC 3.1.11.5, Exonuclease V) complex. It is the alpha chain of the complex and functions as a 3'-5' helicase. The RecBCD enzyme is both a helicase that unwinds, or separates the strands of DNA, and a nuclease that makes single-stranded nicks in DNA. RecD is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	155
350692	cd17934	DEXXQc_Upf1-like	DEXXQ-box helicase domain of Upf1-like helicase. The Upf1-like helicase family includes UPF1, HELZ, Mov10L1, Aquarius, IGHMBP2 (SMUBP2), and similar proteins. They belong to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	133
350693	cd17935	EEXXQc_AQR	EEXXQ-box helicase domain of AQR. Aquarius (AQR) is a multifunctional RNA helicase that binds precursor-mRNA introns at a defined position and is part of a pentameric intron-binding complex (IBC). It is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	207
350694	cd17936	EEXXEc_NFX1	EEXXE-box helicase domain of NFX1. Human NFX1 protein was identified as a protein that represses class II MHC (major histocompatibility complex) gene expression. NFX1 binds a conserved cis-acting element, termed the X-box, in promoters of human class II MHC genes. The Cys-rich region contains several NFX1-type zinc finger domains. Frequently, a R3H domain is present in the C-terminus, and a RING finger domain and a PAM2 motif are present in the N-terminus. The lack of R3H and PAM2 motifs in the plant proteins indicates functional differences. Plant NFX1-like proteins are proposed to modulate growth and survival by coordinating reactive oxygen species, salicylic acid, further biotic stress and abscisic acid responses. A common feature of all members may be E3 ubiquitin ligase, due to the presence of a RING finger domain, as well as DNA binding. NFX1 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	178
350695	cd17937	DEXXYc_viral_SF1-N	DEXXY-box helicase domain of viral superfamily 1 helicase. Superfamily 1 (SF1) helicases are nucleic acid motor proteins that couple ATP hydrolysis to translocation along with the concomitant unwinding of DNA or RNA. The members here contain arterivirus equine arteritis virus (EAV) non-structural protein (nsp)10. Nsp10 is composed of two domains, ZBD (ATPase) and HEL1 (helicase) along with 2 additional non-enzymatic domains that are thought to regulate HEL1 function. The helicase activity depends on the extensive relay of interactions between the ZBD and HEL1 domains. The arterivirus helicase structurally resembles the cellular Upf1 helicase, suggesting that nidoviruses may also use their helicases for post-transcriptional quality control of their large RNA genomes. The proteins here are members of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	137
350696	cd17938	DEADc_DDX1	DEAD-box helicase domain of DEAD box protein 1. DEAD box protein 1 (DDX1) acts as an ATP-dependent RNA helicase, able to unwind both RNA-RNA and RNA-DNA duplexes. It possesses 5' single-stranded RNA overhang nuclease activity as well as ATPase activity on various RNA, but not DNA polynucleotides. DDX1 may play a role in RNA clearance at DNA double-strand breaks (DSBs), thereby facilitating the template-guided repair of transcriptionally active regions of the genome. It may also be involved in 3'-end cleavage and polyadenylation of pre-mRNAs. DDX1 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	204
350697	cd17939	DEADc_EIF4A	DEAD-box helicase domain of eukaryotic initiation factor 4A. The eukaryotic initiation factor-4A (eIF4A) family consists of 3 proteins EIF4A1, EIF4A2, and EIF4A3. These factors are required for the binding of mRNA to 40S ribosomal subunits. In addition these proteins are helicases that function to unwind double-stranded RNA. EIF4A proteins are members of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	199
350698	cd17940	DEADc_DDX6	DEAD-box helicase domain of DEAD box protein 6. DEAD box protein 6 (DDX6, also known as Rck or p54) participates in mRNA regulation mediated by miRNA-mediated silencing. It also plays a role in global and transcript-specific messenger RNA (mRNA) storage, translational repression, and decay. It is a member of the DEAD-box helicase family, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	201
350699	cd17941	DEADc_DDX10	DEAD-box helicase domain of DEAD box protein 10. Fusion of the DDX10 gene and the nucleoporin gene, NUP98, by inversion 11 (p15q22) chromosome translocation is found in the patients with de novo or therapy-related myeloid malignancies. Diseases associated with DDX10 (also known as DDX10-NUP98 Fusion Protein Type 2) include myelodysplastic syndrome and leukemia, acute myeloid. DDX10 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	198
350700	cd17942	DEADc_DDX18	DEAD-box helicase domain of DEAD box protein 18. This DDX18 gene encodes a DEAD box protein and is activated by Myc protein. DDX18 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	198
350701	cd17943	DEADc_DDX20	DEAD-box helicase domain of DEAD box protein 20. DDX20 (also called DEAD Box Protein DP 103, Component Of Gems 3, Gemin-3, and SMN-Interacting Protein) interacts directly with SMN (survival of motor neurons), the spinal muscular atrophy gene product, and may play a catalytic role in the function of the SMN complex on ribonucleoproteins. Diseases associated with DDX20 include spinal muscular atrophy and muscular atrophy. DDX20 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	192
350702	cd17944	DEADc_DDX21_DDX50	DEAD-box helicase domain of DEAD box proteins 21 and 50. DDX21 (also called Gu-Alpha and nucleolar RNA helicase 2) is an RNA helicase that acts as a sensor of the transcriptional status of both RNA polymerase (Pol) I and II. It promotes ribosomal RNA (rRNA) processing and transcription from polymerase II (Pol II) and binds various RNAs, such as rRNAs, snoRNAs, 7SK and, to a lesser extent, mRNAs. DDX50 is also called Gu-Beta, Nucleolar Protein Gu2, and malignant cell derived RNA helicase. DDX21 and DDX50 have similar genomic structures and are in tandem orientation on chromosome 10, suggesting that the two genes arose by gene duplication in evolution. Diseases associated with DDX21 include stomach disease and cerebral creatine deficiency syndrome 3. Diseases associated with DDX50 include rectal disease. Both are members of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. Their name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	202
350703	cd17945	DEADc_DDX23	DEAD-box helicase domain of DEAD box protein 23. DDX23 (also called U5 snRNP 100kD protein and PRP28 homolog) is involved in pre-mRNA splicing and its phosphorylated form (by SRPK2) is required for spliceosomal B complex formation. Diseases associated with DDX23 include distal hereditary motor neuropathy, type II. DDX23 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	220
350704	cd17946	DEADc_DDX24	DEAD-box helicase domain of DEAD box protein 24. The human DDX24 gene encodes a DEAD box protein, which shows little similarity to any of the other known human DEAD box proteins, but shows a high similarity to mouse Ddx24 at the amino acid level. MDM2 mediates nonproteolytic polyubiquitylation of the DEAD-Box RNA helicase DDX24. DDX24 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	235
350705	cd17947	DEADc_DDX27	DEAD-box helicase domain of DEAD box protein 27. DDX27 (also called RHLP, deficiency of ribosomal subunits protein 1 homolog, and probable ATP-dependent RNA helicase DDX27) is involved in the processing of 5.8S and 28S ribosomal RNAs. More specifically, the encoded protein localizes to the nucleolus, where it interacts with the PeBoW complex to ensure proper 3' end formation of 47S rRNA. DDX27 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	196
350706	cd17948	DEADc_DDX28	DEAD-box helicase domain of DEAD box protein 28. DDX28 (also called mitochondrial DEAD-box polypeptide 28) plays an essential role in facilitating the proper assembly of the mitochondrial large ribosomal subunit and its helicase activity is essential for this function. DDX28 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	231
350707	cd17949	DEADc_DDX31	DEAD-box helicase domain of DEAD box protein 31. DDX31 (also called helicain or G2 helicase) plays a role in ribosome biogenesis and TP53/p53 regulation through its interaction with NPM1. DDX31 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	214
350708	cd17950	DEADc_DDX39	DEAD-box helicase domain of DEAD box protein 39. DDX39A is involved in pre-mRNA splicing and is required for the export of mRNA out of the nucleus. DDX39B is an essential splicing factor required for association of U2 small nuclear ribonucleoprotein with pre-mRNA, and it also plays an important role in mRNA export from the nucleus to the cytoplasm. Diseases associated with DDX39A (also called UAP56-Related Helicase, 49 kDa) include gastrointestinal stromal tumor and inflammatory bowel disease 6, while diseases associated with DDX39B (also called 56 kDa U2AF65-Associated Protein) include Plasmodium vivax malaria. DDX39 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	208
350709	cd17951	DEADc_DDX41	DEAD-box helicase domain of DEAD box protein 41. DDX41 (also called ABS and MPLPF) interacts with several spliceosomal proteins and may recognize the bacterial second messengers cyclic di-GMP and cyclic di-AMP, resulting in the induction of genes involved in the innate immune response. Diseases associated with DDX41 include "myeloproliferative/lymphoproliferative neoplasms, familial" and "Ddx41-related susceptibility to familial myeloproliferative/lymphoproliferative neoplasms". DDX41 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	206
350710	cd17952	DEADc_DDX42	DEAD-box helicase domain of DEAD box protein 42. DDX42 (also called Splicing Factor 3B-Associated 125 kDa Protein, RHELP, or RNAHP) is an NTPase with a preference for ATP, the hydrolysis of which is enhanced by various RNA substrates. It acts as a non-processive RNA helicase with protein displacement and RNA annealing activities. DDX42 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	197
350711	cd17953	DEADc_DDX46	DEAD-box helicase domain of DEAD box protein 46. DDX46 (also called Prp5-like DEAD-box protein) is a component of the 17S U2 snRNP complex. It plays an important role in pre-mRNA splicing and has a role in antiviral innate immunity. DDX46 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	222
350712	cd17954	DEADc_DDX47	DEAD-box helicase domain of DEAD box protein 47. DDX47 (also called E4-DEAD box protein) can shuttle between the nucleus and the cytoplasm, and has an RNA-independent ATPase activity. DDX47 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	203
350713	cd17955	DEADc_DDX49	DEAD-box helicase domain of DEAD box protein 49. DDX49 (also called Dbp8) is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	204
350714	cd17956	DEADc_DDX51	DEAD-box helicase domain of DEAD box protein 51. DDX51 aids cancer cell proliferation by regulating multiple signalling pathways. Mammalian DEAD box protein Ddx51 acts in 3' end maturation of 28S rRNA by promoting the release of U8 snoRNA. It is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	231
350715	cd17957	DEADc_DDX52	DEAD-box helicase domain of DEAD box protein 52. DDX52 (also called ROK1 and HUSSY19) is ubiquitously expressed in testis, endometrium, and other tissues in humans. DDX52 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	198
350716	cd17958	DEADc_DDX43_DDX53	DEAD-box helicase domain of DEAD box proteins 43 and 53. DDX43 (also called cancer/testis antigen 13 or helical antigen) displays tumor-specific expression. Diseases associated with DDX43 include rheumatoid lung disease. DDX53 is also called cancer/testis antigen 26 or DEAD-Box Protein CAGE. Both DDX43 and DDX53 are members of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	197
350717	cd17959	DEADc_DDX54	DEAD-box helicase domain of DEAD box protein 54. DDX54 interacts in a hormone-dependent manner with nuclear receptors, and represses their transcriptional activity. DDX54 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	205
350718	cd17960	DEADc_DDX55	DEAD-box helicase domain of DEAD box protein 55. DDX55 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	202
350719	cd17961	DEADc_DDX56	DEAD-box helicase domain of DEAD box protein 56. DDX56 is a helicase required for assembly of infectious West Nile virus particles. New research suggests that DDX56 relocalizes to the site of virus assembly during WNV infection and that its interaction with WNV capsid in the cytoplasm may occur transiently during virion morphogenesis. DDX56 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	206
350720	cd17962	DEADc_DDX59	DEAD-box helicase domain of DEAD box protein 59. DDX59 plays an important role in lung cancer development by promoting DNA replication. DDX59 knockdown mice showed reduced cell proliferation, anchorage-independent cell growth, and reduction of tumor formation. Recent work shows that EGFR and Ras regulate DDX59 during lung cancer development. Diseases associated with DDX59 (also called zinc finger HIT domain-containing protein 5) include orofaciodigital syndrome V and orofaciodigital syndrome. DDX59 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	193
350721	cd17963	DEADc_DDX19_DDX25	DEAD-box helicase domain of ATP-dependent RNA helicases DDX19 and DDX25. DDX19 (also called DEAD box RNA helicase DEAD5) and DDX25 (also called gonadotropin-regulated testicular RNA helicase (GRTH)) are members of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	196
350722	cd17964	DEADc_MSS116	DEAD-box helicase domain of DEAD-box helicase Mss116. Mss116 is an RNA chaperone important for mitochondrial group I and II intron splicing, translational activation, and RNA end processing. Mss116 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	211
350723	cd17965	DEADc_MRH4	DEAD-box helicase domain of ATP-dependent RNA helicase MRH4. Mitochondrial RNA helicase 4 (MRH4) plays an essential role during the late stages of mitochondrial ribosome or mitoribosome assembly by promoting remodeling of the 21S rRNA-protein interactions. MRH4 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	251
350724	cd17966	DEADc_DDX5_DDX17	DEAD-box helicase domain of ATP-dependent RNA helicases DDX5 and DDX17. DDX5 and DDX17 are members of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	197
350725	cd17967	DEADc_DDX3_DDX4	DEAD-box helicase domain of ATP-dependent RNA helicases DDX3 and DDX4. This subfamily includes Drosophila melanogaster Vasa, which is essential for development. DEAD box protein 3 (DDX3) has been reported to display a high level of RNA-independent ATPase activity stimulated by both RNA and DNA. DEAD box protein 4 (DDX4, also known as VASA homolog) is an ATP-dependent RNA helicase required during spermatogenesis and is essential for the germline integrity. DDX3 and DDX4 are members of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	221
350726	cd17968	DEAHc_DDX11_starthere	DEAH-box helicase domain of ATP-dependent DNA helicase DDX11. DDX11 (also called ChlR1) encodes a protein of the conserved family of Iron-Sulfur (Fe-S) cluster DNA helicases and is thought to function in maintaining chromosome transmission fidelity and genome stability. Mutations in the Chl1 human homologs ChlR1/DDX11 and BACH1/BRIP1/FANCJ collectively result in Warsaw Breakage Syndrome, Fanconi anemia, cell aneuploidy and breast and ovarian cancers. DDX11 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	134
350727	cd17969	DEAHc_XPD	DEAH-box helicase domain of TFIIH basal transcription factor complex helicase XPD subunit. TFIIH can be resolved biochemically into a seven subunit core complex containing XPD/Rad3, XPB/Ssl2, p62/Tfb1, p52/Tfb2, p44/Ssl1, p34/Tfb4, and p8/Tfb5 and a three subunit Cdk Activating Kinase (CAK) complex containing CDK7/Kin28, cyclin H/Ccl1, and MAT1/Tfb3. XPD interacts directly with p44, which stimulates XPD helicase activity. XPD/Rad3 also interacts directly with the CAK via its MAT1/Tfb3 subunit inhibiting the helicase activity of XPD. XPD is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	157
350728	cd17970	DEAHc_FancJ	DEAH-box helicase domain of Fanconi anemia group J protein and similar proteins. Fanconi anemia group J protein (FACJ or FANCJ, also known as BRIP1) is a DNA helicase required for the maintenance of chromosomal stability. It plays a role in the repair of DNA double-strand breaks by homologous recombination dependent on its interaction with BRCA1. FANCJ belongs to the DEAD-box helicase family, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	181
350729	cd17971	DEXHc_DHX8	DEXH-box helicase domain of DEAH-box helicase 8. DEAH-box helicase 8 (DHX8, also known as pre-mRNA-splicing factor ATP-dependent RNA helicase PRP22) acts late in the splicing of pre-mRNA and mediates the release of the spliced mRNA from spliceosomes. DHX8 belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	179
350730	cd17972	DEXHc_DHX9	DEXH-box helicase domain of DEAH-box helicase 9. DEAH-box helicase 9 (DHX9, also known as ATP-dependent RNA helicase A or RHA and leukophysin or LKP) plays an important role in many cellular processes, including regulation of DNA replication, transcription, translation, microRNA biogenesis, RNA processing and transport, and maintenance of genomic stability. DHX9 belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	234
350731	cd17973	DEXHc_DHX15	DEXH-box helicase domain of DEAH-box helicase 15. DEAH-box helicase 15 (DHX15) is a pre-mRNA processing factor involved in disassembly of spliceosomes after the release of mature mRNA. DHX15 belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	186
350732	cd17974	DEXHc_DHX16	DEXH-box helicase domain of DEAH-box helicase 16. DEAH-box helicase 16 (DHX16) is probably involved in pre-mRNA splicing. DHX16 belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	174
350733	cd17975	DEXHc_DHX29	DEXH-box helicase domain of DEAH-box helicase 29. DEAH-box helicase 29 (DHX29) is a part of the 43S pre-initiation complex involved in translation initiation of mRNAs with structured 5'-UTRs. DHX29 is part of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	183
350734	cd17976	DEXHc_DHX30	DEXH-box helicase domain of DEAH-box helicase 30. DEAH-box helicase 30 (DHX30) plays an important role in the assembly of the mitochondrial large ribosomal subunit. DHX30 belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	178
350735	cd17977	DEXHc_DHX32	DEXH-box helicase domain of DEAH-box helicase 32. DEAH-box helicase 32 (DHX32) belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	176
350736	cd17978	DEXHc_DHX33	DEXH-box helicase domain of DEAH-box helicase 33. DEAH-box helicase 33 (DHX33) stimulates RNA polymerase I transcription of the 47S precursor rRNA. DHX33 belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	179
350737	cd17979	DEXHc_DHX34	DEXH-box helicase domain of DEAH-box helicase 34. DEAH-box helicase 34 (DHX34) plays a role in the nonsense-mediated decay (NMD), a surveillance mechanism that degrades aberrant mRNAs. DHX34 belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	170
350738	cd17980	DEXHc_DHX35	DEXH-box helicase domain of DEAH-box helicase 35. DHX35 plays a role in colorectal cancers and seems to be associated with risk of thyroid cancers. It has also been shown to positively regulate poxviruses, such as Myxoma virus. DEAH-box helicase 35 (DHX35) belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	185
350739	cd17981	DEXHc_DHX36	DEXH-box helicase domain of DEAH-box helicase 36. DEAH-box helicase 36 (DHX36, also known as G4-resolvase 1 or G4R1, MLE-like protein 1 and RNA helicase associated with AU-rich element or RHAU) unwinds a G4-quadruplex in human telomerase RNA. DHX36 belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	180
350740	cd17982	DEXHc_DHX37	DEXH-box helicase domain of DEAH-box helicase 37. DHX37 plays a role in the development of the human nervous system and has been linked to schizophrenia. It also negatively regulates poxviruses such as Myxoma virus. DEAH-box helicase 37 (DHX37) belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	191
350741	cd17983	DEXHc_DHX38	DEXH-box helicase domain of DEAH-box helicase 38. DEAH-box helicase 38 (DHX38, also known as PRP16) is involved in pre-mRNA splicing. DHX38 belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	173
350742	cd17984	DEXHc_DHX40	DEXH-box helicase domain of DEAH-box helicase 40. DEAH-box helicase 40 (DHX40) belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	178
350743	cd17985	DEXHc_DHX57	DEXH-box helicase domain of DEAH-box helicase 57. DEAH-box helicase 57 (DHX57) belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	177
350744	cd17986	DEXQc_DQX1	DEXQ-box helicase domain of DEAQ-box RNA dependent ATPase 1. DEAQ-box RNA dependent ATPase 1 (DQX1) belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	177
350745	cd17987	DEXHc_YTHDC2	DEXH-box helicase domain of YTH domain containing 2. YTH domain containing 2 (YTHDC2) regulates mRNA translation and stability via binding to N6-methyladenosine, a modified RNA nucleotide enriched in the stop codons and 3' UTRs of eukaryotic messenger RNAs. YTHDC2 belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	176
350746	cd17988	DEXHc_TDRD9	DEXH-box helicase domain of tudor domain containing 9. Tudor domain containing 9 (TDRD9, also known as HIG-1 or NET54 or C14orf75) is a part of the nuclear PIWI-interacting RNA (piRNA) pathway essential for transposon silencing and male fertility. TDRD9 belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	180
350747	cd17989	DEXHc_HrpA	DEXH-box helicase domain of ATP-dependent RNA helicase HrpA. HrpA is part of the HrpB-HrpA two-partner secretion (TPS) system, a secretion pathway important to the secretion of large virulence-associated proteins. HrpA belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	173
350748	cd17990	DEXHc_HrpB	DEXH-box helicase domain of ATP-dependent helicase HrpB. HrpB is part of the HrpB-HrpA two-partner secretion (TPS) system, a secretion pathway important to the secretion of large virulence-associated proteins. HrpB belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	174
350749	cd17991	DEXHc_TRCF	DEXH/Q-box helicase domain of the transcription-repair coupling factor. Transcription-repair coupling factor (TrcF) dissociates transcription elongation complexes blocked at nonpairing lesions and mediates recruitment of DNA repair proteins. TrcF is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	193
350750	cd17992	DEXHc_RecG	DEXH/Q-box helicase domain of RecG. ATP-dependent DNA helicase RecG plays a critical role in recombination and DNA repair. It is a member of the DEAD-like helicases superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	225
350751	cd17993	DEXHc_CHD1_2	DEXH-box helicase domain of the chromodomain helicase DNA binding proteins 1 and 2, and similar proteins. Chromodomain-helicase-DNA-binding protein 1 (CHD1) is an ATP-dependent chromatin-remodeling factor which functions as the substrate recognition component of the transcription regulatory histone acetylation (HAT) complex SAGA. It regulates polymerase II transcription and is also required for efficient transcription by RNA polymerase I, and more specifically the polymerase I transcription termination step. It is not only involved in transcription-related chromatin-remodeling, but is also required to maintain a specific chromatin configuration across the genome. CHD1 is also associated with histone deacetylase (HDAC) activity. Chromodomain-helicase-DNA-binding protein 2 (CHD2) is a DNA-binding helicase that specifically binds to the promoter of target genes, leading to chromatin remodeling, possibly by promoting deposition of histone H3.3. It is involved in myogenesis via interaction with MYOD1; it binds to myogenic gene regulatory sequences and mediates incorporation of histone H3.3 prior to the onset of myogenic gene expression, promoting their expression. Both are members of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	218
350752	cd17994	DEXHc_CHD3_4_5	DEAH-box helicase domain of the chromodomain helicase DNA binding proteins 3, 4 and 5. Chromodomain-helicase-DNA-binding protein 3 (CHD3) is a component of the histone deacetylase NuRD complex which participates in the remodeling of chromatin by deacetylating histones. It is required for anchoring centrosomal pericentrin in both interphase and mitosis, for spindle organization and centrosome integrity. Chromodomain-helicase-DNA-binding protein 4 (CHD4) is a component of the histone deacetylase NuRD complex which participates in the remodeling of chromatin by deacetylating histones. Chromodomain-helicase-DNA-binding protein 5 (CHD5) is a chromatin-remodeling protein that binds DNA through histones and regulates gene transcription. It is thought to specifically recognize and bind trimethylated 'Lys-27' (H3K27me3) and non-methylated 'Lys-4' of histone H3 and plays a role in the development of the nervous system by activating the expression of genes promoting neuron terminal differentiation. In parallel, it may also positively regulate the trimethylation of histone H3 at 'Lys-27' thereby specifically repressing genes that promote the differentiation into non-neuronal cell lineages. As a tumor suppressor, it regulates the expression of genes involved in cell proliferation and differentiation. In spermatogenesis, it probably regulates histone hyperacetylation and the replacement of histones by transition proteins in chromatin, a crucial step in the condensation of spermatid chromatin and the production of functional spermatozoa. CHD3, CHD4, and CHD5 are members of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	196
350753	cd17995	DEXHc_CHD6_7_8_9	DEXH-box helicase domain of the chromodomain helicase DNA binding protein 6, 7, 8 and 9. Chromodomain-helicase-DNA-binding protein 6-9 (CHD6, CHD7, CHD8, and CHD9) are members of the DEAD-like helicases superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	223
350754	cd17996	DEXHc_SMARCA2_SMARCA4	DEXH-box helicase domain of SMARCA2 and SMARCA4. SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a, members 2 and 4 (SMARCA2 and SMARCA4) are members of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	233
350755	cd17997	DEXHc_SMARCA1_SMARCA5	DEAH-box helicase domain of SMARCA1 and SMARCA5. SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a, members 1 and 5 (SMARCA1 and SMARCA5) are members of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	222
350756	cd17998	DEXHc_SMARCAD1	DEXH-box helicase domain of SMARCAD1. SWI/SNF-related matrix-associated actin-dependent regulator of chromatin subfamily A containing DEAD/H box 1 (SMARCAD1, also known as ATP-dependent helicase 1 or Hel1) possesses intrinsic ATP-dependent nucleosome-remodeling activity and is required for both DNA repair and heterochromatin organization. SMARCAD1 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	187
350757	cd17999	DEXHc_Mot1	DEXH-box helicase domain of Mot1. Modifier of transcription 1 (Mot1, also known as TAF172 in eukaryotes) regulates transcription in association with TATA binding protein (TBP). Mot1, Ino80C, and NC2 function coordinately to regulate pervasive transcription in yeast and mammals. Mot1 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	232
350758	cd18000	DEXHc_ERCC6	DEXH-box helicase domain of ERCC6. ERCC excision repair 6, chromatin remodeling factor (ERCC6, also known as Cockayne syndrome group B (CSB), Rad26 in Saccharomyces cerevisiae, and Rhp26 in Schizosaccharomyces pombe) is a DNA-binding protein that is important in transcription-coupled excision repair. ERCC6 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	193
350759	cd18001	DEXHc_ERCC6L	DEXH-box helicase domain of ERCC6L. ERCC excision repair 6 like, spindle assembly checkpoint helicase (ERCC6L, also known as RAD26L) is an essential component of the mitotic spindle assembly checkpoint, by acting as a tension sensor that associates with catenated DNA which is stretched under tension until it is resolved during anaphase. ERCC6L is proposed to stimulate cancer cell proliferation by promoting cell cycle through a way of RAB31-MAPK-CDK2. ERCC6L is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	232
350760	cd18002	DEXQc_INO80	DEAQ-box helicase domain of INO80. INO80 is the catalytic ATPase subunit of the INO80 chromatin remodeling complex. INO80 removes histone H3-containing nucleosomes from associated chromatin, promotes CENP-ACnp1 chromatin assembly at the centromere in a redundant manner with another chromatin-remodeling factor Chd1Hrp1. INO80 mutants have severe defects in oxygen consumption and promiscuous cell division that is no longer coupled with metabolic status. INO80 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	229
350761	cd18003	DEXQc_SRCAP	DEXH/Q-box helicase domain of SRCAP. Snf2-related CBP activator (SRCAP, also known as SWR1 or DOMO1) is the core catalytic component of the multiprotein chromatin-remodeling SRCAP complex, that is necessary for the incorporation of the histone variant H2A.Z into nucleosomes. SRCAP is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	223
350762	cd18004	DEXHc_RAD54	DEXH-box helicase domain of RAD54. RAD54 proteins play a role in recombination. They are members of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	240
350763	cd18005	DEXHc_ERCC6L2	DEXH-box helicase domain of ERCC6L2. ERCC excision repair 6 like 2 (ERCC6L2, also known as RAD26L) may play a role in DNA repair and mitochondrial function. In humans, mutations in the ERCC6L2 gene are associated with bone marrow failure syndrome 2. ERCC6L2 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	245
350764	cd18006	DEXHc_CHD1L	DEAH/Q-box helicase domain of CHD1L. Chromodomain helicase DNA binding protein 1 like (CHD1L, also known as ALC1) is involved in DNA repair by regulating chromatin relaxation following DNA damage. CHD1L is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	216
350765	cd18007	DEXHc_ATRX-like	DEXH-box helicase domain of ATRX-like proteins. This family includes ATRX-like members such as transcriptional regulator ATRX (also called alpha thalassemia/mental retardation syndrome X-linked and X-linked nuclear protein or XNP) which is involved in transcriptional regulation and chromatin remodeling, and ARIP4 (also called androgen receptor-interacting protein 4, RAD54 like 2 or RAD54L2) which modulates androgen receptor (AR)-dependent transactivation in a promoter-dependent manner. They are members of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	239
350766	cd18008	DEXDc_SHPRH-like	DEXH-box helicase domain of SHPRH-like proteins. The SHPRH-like subgroup belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	241
350767	cd18009	DEXHc_HELLS_SMARCA6	DEXH-box helicase domain of HELLS. HELLS (helicase, lymphoid specific, also known as Lsh or SMARCA6) is a major epigenetic regulator crucial for normal heterochromatin structure and function. HELLS is part of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	236
350768	cd18010	DEXHc_HARP_SMARCAL1	DEXH-box helicase domain of SMARCAL1. SMARCAL1 (SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a like 1, also known as HARP) is recruited to stalled replication forks to promote repair and helps restart replication. It plays a role in DNA repair, telomere maintenance and replication fork stability in response to DNA replication stress. Mutations cause Schimke Immunoosseous Dysplasia. SMARCAL1 is part of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	213
350769	cd18011	DEXDc_RapA	DEXH-box helicase domain of RapA. In bacteria, RapA is an RNA polymerase (RNAP)-associated SWI2/SNF2 (switch/sucrose non-fermentable) protein that mediates RNAP recycling during transcription. The ATPase activity of RapA is stimulated by its interaction with RNAP and inhibited by its N-terminal domain. The conformational changes of RapA and its interaction with RNAP are essential for RNAP recycling. RapA is part of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	207
350770	cd18012	DEXQc_arch_SWI2_SNF2	DEAQ-box helicase domain of archaeal and bacterial SNF2-related proteins. Proteins belonging to SNF2 family of DNA dependent ATPases are important members of the chromatin remodeling complexes that are implicated in epigenetic control of gene expression. The Snf2 family comprises a large group of ATP-hydrolyzing proteins that are ubiquitous in eukaryotes, but also present in eubacteria and archaea. Archaeal SWI2 and SNF2 are members of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	218
350771	cd18013	DEXQc_bact_SNF2	DEXQ-box helicase domain of bacterial SNF2 family proteins. Proteins belonging to the SNF2 family of DNA dependent ATPases are important members of the chromatin remodeling complexes that are implicated in epigenetic control of gene expression. The Snf2 family comprises a large group of ATP-hydrolyzing proteins that are ubiquitous in eukaryotes, but also present in eubacteria and archaea. The bacterial SNF2 present in this family are members of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	218
350772	cd18014	DEXHc_RecQ5	DEAH-box helicase domain of RecQ5. ATP-dependent DNA helicase Q5 (RecQ5) is part of the RecQ family of highly conserved DNA repair helicases that is part of the type II DEAD box helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	205
350773	cd18015	DEXHc_RecQ1	DEXH-box helicase domain of RecQ1. ATP-dependent DNA helicase Q1 (RecQ1) is part of the RecQ family of highly conserved DNA repair helicases that is part of the type II DEAD box helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	209
350774	cd18016	DEXHc_RecQ2_BLM	DEAH-box helicase domain of RecQ2. ATP-dependent DNA helicase Q2 (RecQ2, also called Bloom syndrome protein homolog or BLM) is part of the RecQ family of highly conserved DNA repair helicases that is part of the type II DEAD box helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Mutations in RecQ2 cause Bloom syndrome.	208
350775	cd18017	DEXHc_RecQ3	DEAH-box helicase domain of RecQ3. DEAD-like helicase RecQ3 (also called Werner syndrome ATP-dependent helicase or WRN) is part of the RecQ family of highly conserved DNA repair helicases that is part of the type II DEAD box helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Mutations cause Werner's syndrome.	193
350776	cd18018	DEXHc_RecQ4-like	DEAH-box helicase domain of RecQ4 and similar proteins. ATP-dependent DNA helicase Q4 (RecQ4) is part of the RecQ family of highly conserved DNA repair helicases that is part of the type II DEAD box helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Mutations cause Rothmund-Thomson/RAPADILINO/Baller-Gerold syndrome.	201
350777	cd18019	DEXHc_Brr2_1	N-terminal DEXH-box helicase domain of spliceosomal Brr2 RNA helicase. Brr2 is a type II DEAD box helicase that mediates spliceosome catalytic activation. It is a stable subunit of the spliceosome, required during splicing catalysis and spliceosome disassembly. Brr2 belongs to the type II DEAD box helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	214
350778	cd18020	DEXHc_ASCC3_1	N-terminal DEXH-box helicase domain of Activating signal cointegrator 1 complex subunit 3. Activating signal cointegrator 1 complex subunit 3 (ASCC3) is a type II DEAD box helicase that plays a role in the repair of N-alkylated nucleotides. ASCC3 belongs to the type II DEAD box helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	199
350779	cd18021	DEXHc_Brr2_2	C-terminal D[D/E]X[H/Q]-box helicase domain of spliceosomal Brr2 RNA helicase. Brr2 is a type II DEAD box helicase that mediates spliceosome catalytic activation. It is a stable subunit of the spliceosome, required during splicing catalysis and spliceosome disassembly. Brr2 belongs to the type II DEAD box helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	191
350780	cd18022	DEXHc_ASCC3_2	C-terminal DEXH-box helicase domain of Activating signal cointegrator 1 complex subunit 3. Activating signal cointegrator 1 complex subunit 3 (ASCC3) is a type II DEAD box helicase that plays a role in the repair of N-alkylated nucleotides. ASCC3 belongs to the type II DEAD box helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	189
350781	cd18023	DEXHc_HFM1	DEXH-box helicase domain of ATP-dependent DNA helicase HFM1. HFM1 is a type II DEAD box helicase, required for crossover formation and complete synapsis of homologous chromosomes during meiosis. HFM1 belongs to the type II DEAD box helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	206
350782	cd18024	DEXHc_Mtr4-like	DEXH-box helicase domain of ATP-dependent RNA helicase Mtr4. Mtr4 (also known as DOB1 or SKIV2L2) is a type II DEAD box helicase that plays a role in the processing of structured RNAs, including the maturation of 5.8S ribosomal RNA (rRNA), and is part of the TRAMP complex that is involved in exosome-mediated degradation of aberrant RNAs. Mtr4 belongs to the type II DEAD box helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	205
350783	cd18025	DEXHc_DDX60	DEXH-box helicase domain of DEAD box protein 60. DEAD box protein 60 (DDX60) is an IFN-inducible cytoplasmic helicase that plays a role in RIG-I-mediated type I interferon (IFN) nuclease-mediated viral RNA degradation. DDX60 belongs to the type II DEAD box helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	192
350784	cd18026	DEXHc_POLQ-like	DEXH-box helicase domain of DNA polymerase theta. DNA polymerase theta (POLQ) is important in the repair of genomic double-strand breaks (DSBs). POLQ contains an N-terminal type II DEAD box helicase domain which contains the ATP-binding region.	202
350785	cd18027	DEXHc_SKIV2L	DEXH-box helicase domain of SKIV2L. Superkiller viralicidic activity 2-like (SKIV2L, also called SKI2 or DHX13) plays a role in a number of cellular processes involving alteration of RNA secondary structure such as translation initiation, nuclear and mitochondrial splicing, and ribosome and spliceosome assembly. SKIV2L belongs to the type II DEAD box helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	179
350786	cd18028	DEXHc_archSki2	DEXH-box helicase domain of archaeal Ski2-type helicase. Archaeal Ski2-type RNA helicases play an important role in RNA degradation, processing and splicing pathways. They belong to the type II DEAD box helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	177
350787	cd18029	DEXHc_XPB	DEXH-box helicase domain of TFIIH XPB subunit and similar proteins. TFIIH basal transcription factor complex helicase XPB subunit (also known as DNA excision repair protein ERCC-3 or TFIIH 89 kDa subunit) is the ATP-dependent 3'-5' DNA helicase component of the core-TFIIH basal transcription factor, involved in nucleotide excision repair (NER) of DNA and, when complexed to CAK, in RNA transcription by RNA polymerase II. XPB is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	169
350788	cd18030	DEXHc_RE_I_HsdR	DEXH-box helicase domain of type I restriction enzyme HsdR subunit. The HsdR motor subunit of type I restriction-modification enzymes contains the DNA cleavage and ATP-dependent DNA translocation activities of the heteromeric complex. It is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	208
350789	cd18031	DEXHc_UvsW	DEXH-box helicase domain of bacteriophage UvsW. Bacteriophage UvsW is part of the WXY system that repairs DNA damage by a process that involves homologous recombination. UvsW is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	161
350790	cd18032	DEXHc_RE_I_III_res	DEXH-box helicase domain of type III restriction enzyme res subunit. Members of this cd include both type I and type III restriction enzymes. Both are hetero-oligomeric proteins. Type I REs are encoded by three closely linked genes: a specificity subunit (HsdS or S) for recognizing a DNA sequence, a methylation subunit (HsdM or M) for methylating the recognized target bases, and a restriction subunit (HsdR or R) for the translocation and random cleavage of non-methylated DNA. They show diverse catalytic activities, including methyltransferase (MTase), ATP hydrolase (ATPase), DNA translocation and restriction activities. These enzymes cut at a site that differs, and is a random distance (at least 1000 bp) away, from their recognition site. Cleavage at these random sites follows a process of DNA translocation, which shows that these enzymes are also molecular motors. The recognition site is asymmetrical and is composed of two specific portions: one containing 3-4 nucleotides, and another containing 4-5 nucleotides, separated by a non-specific spacer of about 6-8 nucleotides. Type III enzymes are composed of two subunits, Res and Mod. The Mod subunit recognizes the DNA sequence specific for the system and is a modification methyltransferase; as such, it is functionally equivalent to the M and S subunits of type I restriction endonucleases. Res is required for restriction, although it has no enzymatic activity on its own. Type III enzymes recognize short 5-6 bp-long asymmetric DNA sequences and cleave 25-27 bp downstream to leave short, single-stranded 5' protrusions. They require the presence of two inversely oriented unmethylated recognition sites for restriction to occur. These enzymes methylate only one strand of the DNA, at the N-6 position of adenosyl residues, so newly replicated DNA will have only one strand methylated, which is sufficient to protect against restriction. Both type I and type III REs are members of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	163
350791	cd18033	DEXDc_FANCM	DEAH-box helicase domain of FANCM. Fanconi anemia group M (FANCM) protein is a DNA-dependent ATPase component of the Fanconi anemia (FA) core complex. It is required for the normal activation of the FA pathway, leading to monoubiquitination of the FANCI-FANCD2 complex in response to DNA damage, cellular resistance to DNA cross-linking drugs, and prevention of chromosomal breakage. In complex with CENPS and CENPX, it binds double-stranded DNA (dsDNA), fork-structured DNA (fsDNA), and Holliday junction substrates. Its ATP-dependent DNA branch migration activity can process branched DNA structures such as a movable replication fork. This activity is strongly stimulated in the presence of CENPS and CENPX. In complex with FAAP24, it efficiently binds to single-strand DNA (ssDNA), splayed-arm DNA, and 3'-flap substrates. In vitro, on its own, it strongly binds ssDNA oligomers and weakly fsDNA, but does not bind to dsDNA. FANCM is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	182
350792	cd18034	DEXHc_dicer	DEXH-box helicase domain of endoribonuclease Dicer. Dicer ribonucleases cleave double-stranded RNA (dsRNA) precursors to generate microRNAs (miRNAs) and small interfering RNAs (siRNAs). In concert with Argonautes, these small RNAs bind complementary mRNAs to down-regulate their expression. miRNAs are processed by Dicer from small hairpins, while siRNAs are typically processed from longer dsRNA, from endogenous sources, or exogenous sources such as viral replication intermediates. Some organisms, such as Homo sapiens and Caenorhabditis elegans, encode one Dicer that generates miRNAs and siRNAs, but other organisms have multiple dicers with specialized functions. Dicers exist throughout eukaryotes, and a subset have an N-terminal helicase domain of the RIG-I-like receptor (RLR) subgroup. RLRs often function in innate immunity and Dicer helicase domains sometimes show differences in activity that correlate with roles in immunity. Dicer is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	200
350793	cd18035	DEXHc_Hef	DEXH-box helicase domain of Hef. Hef (helicase-associated endonuclease fork-structure) belongs to the XPF/MUS81/FANCM family of endonucleases and is involved in stalled replication fork repair. All archaea encode a protein of the XPF/MUS81/FANCM family of endonucleases. It exists in two forms: a long form, referred to as Hef, which consists of an N-terminal helicase fused to a C-terminal nuclease and is specific to euryarchaea, and a short form, referred to as XPF, which lacks the helicase domain and is specific to crenarchaea and thaumarchaea. Hef has the unique feature of having both active helicase and nuclease domains. This domain configuration is highly similar to that of human FANCM, a possible ortholog of archaeal Hef proteins. Hef is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	181
350794	cd18036	DEXHc_RLR	DEXH-box helicase domain of RIG-I-like receptors. RIG-I-like receptors (RLRs) sense cytoplasmic viral RNA and comprise RIG-I, RLR-2/MDA5 (melanoma differentiation-associated protein 5) and RLR-3/LGP2 (laboratory of genetics and physiology 2). RIG-I-like receptors (RLRs) are members of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	204
350795	cd18037	DEXSc_Pif1_like	DEAD-box helicase domain of Pif1. Pif1 and other members of this family are RecD-like helicases involved in maintaining genome stability through unwinding double-stranded DNAs (dsDNAs), DNA/RNA hybrids, and G quadruplex (G4) structures. The members of the Pif1 helicase subfamily studied so far all appear to contribute to telomere maintenance. Pif1 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	183
350796	cd18038	DEXXQc_Helz-like	DEXXQ/H-box helicase domain of Helz-like helicase. This subfamily contains HELZ, Mov10L1, and similar proteins. Helicase with zinc finger (HELZ) acts as a helicase that plays a role in RNA metabolism during development. Moloney leukemia virus 10-like protein 1 (Mov10L1) binds Piwi-interacting RNA (piRNA) precursors to initiate piRNA processing. All are members of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	229
350797	cd18039	DEXXQc_UPF1	DEXXQ-box helicase domain of UPF1. UPF1 (also called RNA Helicase And ATPase, Regulator Of Nonsense Transcripts, or ATP-Dependent Helicase RENT1) is an RNA-dependent helicase and ATPase required for nonsense-mediated decay (NMD) of mRNAs containing premature stop codons. It is recruited to mRNAs upon translation termination and undergoes a cycle of phosphorylation and dephosphorylation; its phosphorylation appears to be a key step in NMD. It is recruited by release factors to stalled ribosomes together with the SMG1C protein kinase complex to form the transient SURF (SMG1-UPF1-eRF1-eRF3) complex. In EJC-dependent NMD, the SURF complex associates with the exon junction complex (EJC) located downstream from the termination codon through UPF2 and allows the formation of an UPF1-UPF2-UPF3 surveillance complex which is believed to activate NMD.  Diseases associated with UPF1 include juvenile amyotrophic lateral sclerosis and epidermolysis bullosa, junctional, non-Herlitz type. UPF1 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	234
350798	cd18040	DEXXc_HELZ2-C	C-terminal DEXX-box helicase domain of HELZ2. Helicase with zinc finger 2 (HELZ2, also known as PPAR-alpha-interacting complex protein 285 or PRIC285 and PPAR-gamma DBD-interacting protein 1 or PDIP1) acts as a transcriptional coactivator for a number of nuclear receptors including PPARA, PPARG, THRA, THRB and RXRA. It belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	271
350799	cd18041	DEXXQc_DNA2	DEXXQ-box helicase domain of DNA2. DNA2 (DNA Replication Helicase/Nuclease 2) possesses different enzymatic activities, such as single-stranded DNA (ssDNA)-dependent ATPase, 5'-3' helicase, and endonuclease activities, and is involved in DNA replication and DNA repair in the nucleus and mitochondrion. It is involved in Okazaki fragment processing by cleaving long flaps that escape FEN1: flaps that are longer than 27 nucleotides are coated by replication protein A complex (RPA), leading to the recruitment of DNA2, which cleaves the flap until it is too short to bind RPA and becomes a substrate for FEN1. It is also involved in 5'-end resection of DNA during double-strand break (DSB) repair; it is recruited by BLM and mediates the cleavage of 5'-ssDNA, while the 3'-ssDNA cleavage is prevented by the presence of RPA. DNA2 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	203
350800	cd18042	DEXXQc_SETX	DEXXQ-box helicase domain of SETX. The RNA/DNA helicase senataxin (SETX) plays a role in transcription, neurogenesis, and antiviral response. SETX is an R-loop-associated protein that is thought to function as an RNA/DNA helicase. R-loops consist of RNA/DNA hybrids, formed during transcription when nascent RNA hybridizes to the DNA template strand, displacing the non-template DNA strand. Mutations in SETX are linked to two neurodegenerative disorders: ataxia with oculomotor apraxia type 2 (AOA2) and amyotrophic lateral sclerosis type 4 (ALS4). S. cerevisiae homolog splicing endonuclease 1 (Sen1) is an exclusively nuclear protein, important for nucleolar organization. S. cerevisiae Sen1 and its ortholog, the Schizosaccharomyces pombe Sen1, share conserved domains and belong to the family I class of helicases. Both proteins translocate 5' to 3' and unwind both DNA and RNA duplexes and also RNA/DNA hybrids in vitro. SETX is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	217
350801	cd18043	DEXXQc_SF1	DEXXQ-box helicase domain of Superfamily 1 helicases. Superfamily 1 (SF1) helicases are nucleic acid motor proteins that couple ATP hydrolysis to translocation along with the concomitant unwinding of DNA or RNA. This is central to many aspects of cellular DNA and RNA metabolism and accordingly, they are implicated in a wide range of nucleic acid processing events including DNA replication, recombination, and repair as well as many aspects of RNA metabolism. Superfamily 1 helicases are members of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	127
350802	cd18044	DEXXQc_SMUBP2	DEXXQ-box helicase domain of SMUBP2. SMUBP2 (also called immunoglobulin mu-binding protein 2, or IGHMBP2) is a 5' to 3' helicase that unwinds RNA and DNA duplexes in an ATP-dependent reaction. It is a DNA-binding protein specific to 5'-phosphorylated single-stranded guanine-rich sequence (5'-GGGCT-3') related to the immunoglobulin mu chain switch region. The IGHMBP2 gene is responsible for Charcot-Marie-Tooth disease (CMT) type 2S and spinal muscular atrophy with respiratory distress type 1 (SMARD1). It is also thought to play a role in frontotemporal dementia (FTD) with amyotrophic lateral sclerosis (ALS) and major depressive disorder (MDD). SMUBP2 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	191
350803	cd18045	DEADc_EIF4AIII_DDX48	DEAD-box helicase domain of eukaryotic initiation factor 4A-III. Eukaryotic initiation factor 4A-III (EIF4AIII, also known as DDX48) is part of the exon junction complex (EJC) that plays a major role in posttranscriptional regulation of mRNA. EJC consists of four proteins (eIF4AIII, Barentsz [Btz], Mago, and Y14), mRNA, and ATP. DDX48 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	201
350804	cd18046	DEADc_EIF4AII_EIF4AI_DDX2	DEAD-box helicase domain of eukaryotic initiation factor 4A-I and 4A-II. Eukaryotic initiation factor 4A-I (DDX2A) and eukaryotic initiation factor 4A-II (DDX2B) are involved in cap recognition and are required for mRNA binding to the ribosome. They are DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	201
350805	cd18047	DEADc_DDX19	DEAD-box helicase domain of DEAD box protein 19. DDX19 is an RNA helicase involved in both messenger RNA (mRNA) export from the nucleus into the cytoplasm and mRNA translation. DDX19 functions in the nucleus in resolving RNA:DNA hybrids (R-loops). Activation of a DNA damage response pathway dependent upon the ATR kinase, a major regulator of replication fork progression, stimulates translocation of DDX19 from the cytoplasm into the nucleus. Only nuclear DDX19 is competent to resolve R-loops. DDX19 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	205
350806	cd18048	DEADc_DDX25	DEAD-box helicase domain of DEAD box protein 25. DDX25 (also called gonadotropin-regulated testicular RNA helicase, GRTH) is a testis-specific protein essential for completion of spermatogenesis. DDX25 is also a novel negative regulator of the IFN pathway and facilitates RNA virus infection. Diseases associated with DDX25 include hydrolethalus syndrome, an autosomal recessive lethal malformation syndrome characterized by multiple developmental defects of the fetus. DDX25 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	229
350807	cd18049	DEADc_DDX5	DEAD-box helicase domain of DEAD box protein 5. DDX5 (also called RNA helicase P68, HLR1, G17P1, or HUMP68) is involved in pathways that include the alteration of RNA structures, plays a role as a coregulator of transcription, a regulator of splicing, and in the processing of small noncoding RNAs. It synergizes with DDX17 and SRA1 RNA to activate MYOD1 transcriptional activity and is involved in skeletal muscle differentiation. Dysregulation of this gene may play a role in cancer development. DDX5 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	234
350808	cd18050	DEADc_DDX17	DEAD-box helicase domain of DEAD box protein 17. DDX17 (also called DEAD Box Protein P72 or DEAD Box Protein P82) has a wide variety of functions including regulating the alternative splicing of exons exhibiting specific features such as the inclusion of AC-rich alternative exons in CD44 transcripts, playing a role in innate immunity, and promoting mRNA degradation mediated by the antiviral zinc-finger protein ZC3HAV1 in an ATPase-dependent manner. DDX17 synergizes with DDX5 and SRA1 RNA to activate MYOD1 transcriptional activity and is involved in skeletal muscle differentiation. DDX17 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	271
350809	cd18051	DEADc_DDX3	DEAD-box helicase domain of DEAD box protein 3. DDX3 (also called helicase-like protein, DEAD box, X isoform, or DDX14) has been reported to display a high level of RNA-independent ATPase activity stimulated by both RNA and DNA. This protein has multiple conserved domains and is thought to play roles in both the nucleus and cytoplasm. Nuclear roles include transcriptional regulation, mRNP assembly, pre-mRNA splicing, and mRNA export. In the cytoplasm, this protein is thought to be involved in translation, cellular signaling, and viral replication. Misregulation of this gene has been implicated in tumorigenesis. Diseases associated with DDX3 include mental retardation, X-linked 102 and agenesis of the corpus callosum, with facial anomalies and robin sequence. DDX3 is a member of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	249
350810	cd18052	DEADc_DDX4	DEAD-box helicase domain of DEAD box protein 4. DEAD box protein 4 (DDX4, also known as VASA homolog) is an ATP-dependent RNA helicase required during spermatogenesis and is essential for the germline integrity. DEAD-box helicases are a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. The name derives from the sequence of the Walker B motif (motif II). This domain contains the ATP-binding region.	264
350811	cd18053	DEXHc_CHD1	DEAH-box helicase domain of the chromodomain helicase DNA binding protein 1. Chromodomain-helicase-DNA-binding protein 1 (CHD1) is an ATP-dependent chromatin-remodeling factor which functions as substrate recognition component of the transcription regulatory histone acetylation (HAT) complex SAGA. It regulates polymerase II transcription and is also required for efficient transcription by RNA polymerase I, and more specifically the polymerase I transcription termination step. It is not only involved in transcription-related chromatin-remodeling, but also required to maintain a specific chromatin configuration across the genome. CHD1 is also associated with histone deacetylase (HDAC) activity. It is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	237
350812	cd18054	DEXHc_CHD2	DEAH-box helicase domain of the chromodomain helicase DNA binding protein 2. Chromodomain-helicase-DNA-binding protein 2 (CHD2) is a DNA-binding helicase that specifically binds to the promoter of target genes, leading to chromatin remodeling, possibly by promoting deposition of histone H3.3. It is involved in myogenesis via interaction with MYOD1; it binds to myogenic gene regulatory sequences and mediates incorporation of histone H3.3 prior to the onset of myogenic gene expression, promoting their expression. CHD2 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	237
350813	cd18055	DEXHc_CHD3	DEAH-box helicase domain of the chromodomain helicase DNA binding protein 3. Chromodomain-helicase-DNA-binding protein 3 (CHD3) is a component of the histone deacetylase NuRD complex which participates in the remodeling of chromatin by deacetylating histones. It is required for anchoring centrosomal pericentrin in both interphase and mitosis, for spindle organization and centrosome integrity. CHD3 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	232
350814	cd18056	DEXHc_CHD4	DEAH-box helicase domain of the chromodomain helicase DNA binding protein 4. Chromodomain-helicase-DNA-binding protein 4 (CHD4) is a component of the histone deacetylase NuRD complex which participates in the remodeling of chromatin by deacetylating histones. CHD4 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	232
350815	cd18057	DEXHc_CHD5	DEAH-box helicase domain of the chromodomain helicase DNA binding protein 5. Chromodomain-helicase-DNA-binding protein 5 (CHD5) is a chromatin-remodeling protein that binds DNA through histones and regulates gene transcription. It is thought to specifically recognize and bind trimethylated 'Lys-27' (H3K27me3) and non-methylated 'Lys-4' of histone H3 and plays a role in the development of the nervous system by activating the expression of genes promoting neuron terminal differentiation. In parallel, it may also positively regulate the trimethylation of histone H3 at 'Lys-27' thereby specifically repressing genes that promote the differentiation into non-neuronal cell lineages. As a tumor suppressor, it regulates the expression of genes involved in cell proliferation and differentiation. In spermatogenesis, it probably regulates histone hyperacetylation and the replacement of histones by transition proteins in chromatin, a crucial step in the condensation of spermatid chromatin and the production of functional spermatozoa. CHD5 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	232
350816	cd18058	DEXHc_CHD6	DEAH-box helicase domain of the chromodomain helicase DNA binding protein 6. Chromodomain-helicase-DNA-binding protein 6 (CHD6) is a DNA-dependent ATPase that plays a role in chromatin remodeling. It regulates transcription by disrupting nucleosomes in a largely non-sliding manner which strongly increases the accessibility of chromatin. It activates transcription of specific genes in response to oxidative stress through interaction with NFE2L2.2 and acts as a transcriptional repressor of different viruses including influenza virus or papillomavirus. During influenza virus infection, the viral polymerase complex localizes CHD6 to inactive chromatin where it is degraded in a proteasome-independent manner. CHD6 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	222
350817	cd18059	DEXHc_CHD7	DEAH-box helicase domain of the chromodomain helicase DNA binding protein 7. Chromodomain-helicase-DNA-binding protein 7 (CHD7) is a probable transcription regulator. It may be involved in the 45S precursor rRNA production. CHD7 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	222
350818	cd18060	DEXHc_CHD8	DEAH-box helicase domain of the chromodomain helicase DNA binding protein 8. Chromodomain-helicase-DNA-binding protein 8 (CHD8) is a DNA helicase that acts as a chromatin remodeling factor and regulates transcription. It also acts as a transcription repressor by remodeling chromatin structure and recruiting histone H1 to target genes. It suppresses p53/TP53-mediated apoptosis by recruiting histone H1 and preventing p53/TP53 transactivation activity and of STAT3 activity by suppressing the LIF-induced STAT3 transcriptional activity. It also acts as a negative regulator of Wnt signaling pathway and CTNNB1-targeted gene expression. CHD8 is also involved in both enhancer blocking and epigenetic remodeling at chromatin boundary via its interaction with CTCF. It also acts as a transcription activator via its interaction with ZNF143 by participating in efficient U6 RNA polymerase III transcription. CHD8 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	222
350819	cd18061	DEXHc_CHD9	DEAH-box helicase domain of the chromodomain helicase DNA binding protein 9. Chromodomain-helicase-DNA-binding protein 9 (CHD9) acts as a transcriptional coactivator for PPARA and possibly other nuclear receptors. It is proposed to be an ATP-dependent chromatin remodeling protein. CHD9 has DNA-dependent ATPase activity and binds to A/T-rich DNA. It also associates with A/T-rich regulatory regions in promoters of genes that participate in the differentiation of progenitors during osteogenesis. CHD9 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	222
350820	cd18062	DEXHc_SMARCA4	DEXH-box helicase domain of SMARCA4. SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a, member  4 (SMARCA4, also known as transcription activator BRG1) is a component of the CREST-BRG1 complex that regulates promoter activation by orchestrating a calcium-dependent release of a repressor complex and a recruitment of an activator complex. Mutation of SMARCA4 (BRG1), the ATPase of BAF (mSWI/SNF) and PBAF complexes, contributes to a range of malignancies and neurologic disorders. SMARCA4 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	251
350821	cd18063	DEXHc_SMARCA2	DEXH-box helicase domain of SMARCA2. SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a, member 2 (SMARCA2, also known as brahma homolog) is a component of the BAF complex. SMARCA2 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	251
350822	cd18064	DEXHc_SMARCA5	DEAH-box helicase domain of SMARCA5. SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a, member 5 (SMARCA5, also called SNF2H) is the catalytic subunit of the four known chromatin-remodeling complexes: CHRAC, RSF, ACF/WCRF, and WICH. SMARCA5 plays a major role in organising arrays of nucleosomes adjacent to the binding sites for the architectural transcription factor CTCF and acts to promote CTCF binding. SMARCA5 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	244
350823	cd18065	DEXHc_SMARCA1	DEAH-box helicase domain of SMARCA1. SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a, member 1 (SMARCA1, also called SNF2L) is a component of NURF (nucleosome-remodeling factor) and CERF (CECR2-containing-remodeling factor) complexes which promote the perturbation of chromatin structure in an ATP-dependent manner. SMARCA1 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	233
350824	cd18066	DEXHc_RAD54B	DEXH-box helicase domain of RAD54B. DNA repair and recombination protein RAD54B, also known as RDH54, binds to double-stranded DNA, displays ATPase activity in the presence of DNA, and may have a role in meiotic and mitotic recombination. RAD54B is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	235
350825	cd18067	DEXHc_RAD54A	DEXH-box helicase domain of RAD54A. DNA repair and recombination protein RAD54A, also known as RAD54L or RAD54, plays a role in homologous recombination related repair of DNA double-strand breaks. RAD54A is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	243
350826	cd18068	DEXHc_ATRX	DEXH-box helicase domain of ATRX. Transcriptional regulator ATRX (also called alpha thalassemia/mental retardation syndrome X-linked and X-linked nuclear protein or XNP) is involved in transcriptional regulation and chromatin remodeling. Mutations in humans cause mental retardation, X-linked, syndromic, with hypotonic facies 1 (MRXSHF1) and alpha-thalassemia myelodysplasia syndrome (ATMDS). ATRX is part of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	246
350827	cd18069	DEXHc_ARIP4	DEXH-box helicase domain of ARIP4. Androgen receptor-interacting protein 4 (ARIP4, also called RAD54 like 2 or RAD54L2) modulates androgen receptor (AR)-dependent transactivation in a promoter-dependent manner. ARIP4 is part of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	227
350828	cd18070	DEXQc_SHPRH	DEXQ-box helicase domain of SHPRH. E3 ubiquitin-protein ligase SHPRH is a ubiquitously expressed protein that contains motifs characteristic of several DNA repair proteins, transcription factors, and helicases. SHPRH is a functional homolog of S. cerevisiae RAD5 and is involved in DNA repair. SHPRH is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	257
350829	cd18071	DEXHc_HLTF1_SMARC3	DEXH-box helicase domain of HLTF1. Helicase like transcription factor (HLTF1, also known as HIP116 or SMARCA3) has both helicase and E3 ubiquitin ligase activities and ATP-dependent nucleosome-remodeling activity. HLTF1 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	239
350830	cd18072	DEXHc_TTF2	DEAH-box helicase domain of TTF2. Transcription termination factor 2 (TTF2, also called Forkhead-box E1/FOXE1) is a transcription termination factor that couples ATP hydrolysis with the removal of RNA polymerase II from the DNA template. A single nucleotide polymorphism (SNP) within the 5'-UTR of TTF2 is associated with thyroid cancer risk. TTF2 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	241
350831	cd18073	DEXHc_RIG-I_DDX58	DEXH-box helicase domain of RIG-I. RIG-I (Retinoic acid-inducible gene I protein), also called DEAD box protein 58 (DDX58), is a pathogen-recognition receptor that recognizes viral 5'-triphosphates carrying double-stranded RNA. Upon binding to these microbe-associated molecular patterns (MAMPs), RIG-I forms oligomers and promotes downstream processes that result in type I interferon production and induction of an antiviral state. The optimal ligand for RIG-I has been found to be base-paired or double-stranded RNA (dsRNA) molecules containing a 5' triphosphate (5'-ppp-dsRNA). RIG-I contains two N-terminal caspase activation and recruitment domains (CARDs), which are required for interaction with IPS-1, a superfamily 2 helicase/translocase/ATPase (SF2) domain and a C-terminal regulatory/repressor domain (RD). RIG-I is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	202
350832	cd18074	DEXHc_RLR-2	DEXH-box helicase domain of RLR-2. RIG-I-like receptor 2 (RLR-2, also known as melanoma differentiation-associated protein 5 or Mda5 and IFIH1) is a viral double-stranded RNA (dsRNA) receptor that shares sequence similarity and signaling pathways with RIG-I, yet plays essential functions in antiviral immunity through distinct specificity for viral RNA. RLR-2 recognizes the internal duplex structure, whereas RIG-I recognizes the terminus of dsRNA. RLR-2 uses direct protein-protein contacts to stack along dsRNA in a head-to-tail arrangement. The signaling domain (tandem CARD), which decorates the outside of the core RLR-2 filament, also has an intrinsic propensity to oligomerize into an elongated structure that activates the signaling adaptor, MAVS. RLR-2 uses long dsRNA as a signaling platform to cooperatively assemble the core filament, which in turn promotes stochastic assembly of the tandem CARD oligomers for signaling. LGP2 appears to positively and negatively regulate RLR-2 and RIG-I signaling, respectively. RLR-2 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	216
350833	cd18075	DEXHc_RLR-3	DEXH-box helicase domain of RLR-3. RIG-I-like receptor 3 (RLR-3, also known as laboratory of genetics and physiology 2 or LGP2 and DHX58) appears to positively and negatively regulate MDA5 and RIG-I signaling, respectively. RLR-3 resembles a chimera combining a MDA5-like helicase domain and RIG-I like CTD supporting both stem and end binding. RNA binding is required for RLR-3-mediated enhancement of MDA5 activation. RLR-3 end-binding may promote nucleation of MDA5 oligomerization on dsRNA. RLR-3 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	200
350834	cd18076	DEXXQc_HELZ2-N	N-terminal DEXXQ-box helicase domain of HELZ2. Helicase with zinc finger 2 (HELZ2, also known as PPAR-alpha-interacting complex protein 285 or PRIC285 and PPAR-gamma DBD-interacting protein 1 or PDIP1) acts as a transcriptional coactivator for a number of nuclear receptors including PPARA, PPARG, THRA, THRB, and RXRA.  It belongs to the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	230
350835	cd18077	DEXXQc_HELZ	DEXXQ-box helicase domain of HELZ. Helicase with zinc finger (HELZ) acts as a helicase that plays a role in RNA metabolism during development. HELZ is a member of the family I class of RNA helicases of the  DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	226
350836	cd18078	DEXXQc_Mov10L1	DEXXQ-box helicase domain of Mov10L1. Moloney leukemia virus 10-like protein 1 (Mov10L1) binds Piwi-interacting RNA (piRNA) precursors to initiate piRNA processing. Mov10L1 is a member of the DEAD-like helicase superfamily, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region.	230
350837	cd18079	S-AdoMet_synt	S-adenosylmethionine synthetase. S-adenosylmethionine synthetase (EC 2.5.1.6), also known as methionine adenosyltransferase, catalyzes the formation of S-adenosylmethionine (AdoMet) from methionine and ATP in two steps, the formation of AdoMet and hydrolysis of the tripolyphosphate, which occurs prior to release of the product from the enzyme, which consists of three structural domains that have a similar alpha+beta fold.	371
349953	cd18080	TrmD-like	tRNA-M1G37-methyltransferase TrmD. The bacterial tRNA-(N(1)G37) methyltransferase (TrmD) catalyzes the transfer of a methyl group from S-adenosyl-L-methionine (AdoMet) to the N1 position of G37 in the anticodon loop of a subset of tRNA that contains a G at position 36.  The presence of the modification prevents Watson-Crick base-pairing of this guanosine with cytosine in mRNA and translational frame-shifting. This family of proteins contains members of the SPOUT methyltransferases. The SPOUT methyltransferase superfamily is a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot.	219
349954	cd18081	RlmH-like	23S-rRNA-pseudouridine1915-N3-methyltransferase RlmH. 23S rRNA (pseudouridine1915-N3)-methyltransferase RlmH catalyzes the addition of a methyl group at the N-3 position of pseudouridine Psi1915 in 23S rRNA to form m(3)Psi1915. This family of proteins belongs to the SPOUT methyltransferases.  The SPOUT methyltransferase superfamily is a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot.	152
349955	cd18082	SpoU-like_family	SAM-dependent rRNA or tRNA methylase related to SpoU. RNA 2'-O ribose methyltransferase catalyzes the methyltransfer from S-adenosyl-L-methionine (AdoMet) to the 2'-OH group of ribose in tRNA or rRNA. It is part of the SpoU family of MTases, a subfamily of the SPOUT methyltransferase superfamily. The SPOUT methyltransferase superfamily is a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot.	145
349956	cd18083	aTrm56-like	archaeal tRNA (cytidine(56)-2'-O)-methyltransferase Trm56. Archaeal tRNA (cytidine(56)-2'-O)-methyltransferase Trm56 catalyzes the 2'-O-ribose methylation of cytidine at position 56 in tRNAs. Trm56 is a member of the SPOUT (SpoU-TrmD) methyltransferase (MTase) superfamily, a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot.	169
349957	cd18084	RsmE-like	SPOUT superfamily RNA methyltransferase RsmE-like. 16S rRNA m3U1498 methyltransferase RsmE modifies nucleotides during ribosomal RNA maturation in a site-specific manner. The Escherichia coli member is specific for U1498 methylation. RsmE is a member of the SPOUT (SpoU-TrmD) methyltransferase (MTase) superfamily, a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot.	159
349958	cd18085	TM1570-like	SPOUT superfamily RNA methyltransferase TM1570-like. DUF2168; This domain, found in various hypothetical prokaryotic proteins, has no known function. It is also found in a few prokaryotic tRNA (guanine-N(1)-)-methyltransferases. Proteins of this family are members of the SPOUT (SpoU-TrmD) methyltransferase (MTase) superfamily, a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot.	178
349959	cd18086	HsC9orf114-like	SPOUT superfamily RNA methyltransferase HsC9orf114-like. Human C9orf114 (also known as centromere protein 32 or CENP-32) is required for association of the centrosomes with the poles of the bipolar mitotic spindle during metaphase. CENP-32 is a member of the SPOUT (SpoU-TrmD) methyltransferase (MTase) superfamily, a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot.	187
349960	cd18087	TrmY-like	tRNA (pseudouridine(54)-N(1))-methyltransferase TrmY. tRNA (pseudouridine(54)-N(1))-methyltransferase TrmY catalyzes the N1-methylation of pseudouridine at position 54 (Psi54) in tRNAs. TrmY is a member of the SPOUT (SpoU-TrmD) methyltransferase (MTase) superfamily, a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot.	193
349961	cd18088	Nep1-like	18S rRNA (pseudouridine(1248)-N1)-methyltransferase Nep1. 18S rRNA (pseudouridine(1248)-N1)-methyltransferase Nep1 (also known as EMG1) methylates pseudouridine at position 1248 (Psi1248) in 18S rRNA and is required for small subunit (SSU) ribosomal RNA (rRNA) maturation. Mutations in the human protein cause Bowen-Conradi Syndrome. Nep1 is a member of the SPOUT (SpoU-TrmD) methyltransferase (MTase) superfamily, a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot.	204
349962	cd18089	SPOUT_Trm10-like	tRNA methyltransferase Trm10-like. Family of tRNA methyltransferase Trm10-like proteins catalyzes the N(1) methylation of guanine at position 9 (m(1)G9) of tRNA (eukaryotes) or N(1) methylation of guanine or adenine at position 9 (m1G9/m1A9) of tRNA (archaea), which might play a role in the stabilization of tRNA and in translation termination efficiency. Trm10 is a member of the SPOUT (SpoU-TrmD) methyltransferase (MTase) superfamily, a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot.	171
349963	cd18090	Arginine_MT_Sfm1	SAM-dependent arginine methyltransferase related to yeast Sfm1. Arginine methyltransferase Sfm1 methylates R146 of 40S ribosomal protein S3 (Rps3), which contacts 18S RNA. Sfm1 is part of the SPOUT (SpoU-TrmD) methyltransferase (MTase) superfamily, a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent mainly RNA MTases which are structurally characterized by a deep trefoil knot.	140
349964	cd18091	SpoU-like_TRM3-like	SAM-dependent tRNA methylase related to TRM3. Yeast tRNA (guanosine(18)-2'-O)-methyltransferase TRM3 catalyzes the formation of 2'-O-methylguanosine at position 18 (Gm18) in various tRNAs. TRM3 is similar to C-terminal domain of TAR (HIV-1) RNA binding protein 1 (TARBP1), a protein binding to TAR, which functions as a RNA regulatory signal by forming a stable stem-loop structure to which transactivator protein Tat binds. The role of TARBP1 is believed to be to disengage RNA polymerase II from TAR during transcriptional elongation. TRM3 and the C-terminal methyltransferase domain of TARBP1 are members of the SPOUT methyltransferase superfamily.	145
349965	cd18092	SpoU-like_TrmH	SAM-dependent tRNA methylase related to TrmH. TrmH catalyzes the transfer of the methyl group from S-adenosyl-L-methionine (AdoMet) to the 2'-OH group of the ribose of the universally conserved guanosine 18 (G18) position in tRNA. It is part of the SpoU family of MTases, a subfamily of the SPOUT methyltransferase superfamily. The SPOUT methyltransferase superfamily is a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot.	162
349966	cd18093	SpoU-like_TrmJ	SAM-dependent tRNA methylase related to TrmJ. tRNA methyltransferase TrmJ catalyzes the methyl transfer from S-adenosyl-L-methionine (AdoMet) to the 2'-OH at position 32 in both tRNASer1 and tRNAGln2. It is part of the SpoU family of MTases, a subfamily of the SPOUT methyltransferase superfamily. The SPOUT methyltransferase superfamily is a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot.	153
349967	cd18094	SpoU-like_TrmL	SAM-dependent tRNA methylase related to TrmL. tRNA (Um34/Cm34) methyltransferase TrmL catalyzes the methyl transfer from S-adenosyl-L-methionine (AdoMet) to the 2'-OH at position 34 in both tRNA(Leu)(CmAA) and tRNA(Leu)(cmnm5UmAA). It is part of the SpoU family of MTases, a subfamily of the SPOUT methyltransferase superfamily. The SPOUT  methyltransferase superfamily is a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot.	145
349968	cd18095	SpoU-like_rRNA-MTase	SAM-dependent rRNA methylase related to SpoU-TrmH. RNA 2'-O ribose methyltransferase catalyzes the methyltransfer from S-adenosyl-L-methionine (AdoMet) to the 2'-OH group of ribose in tRNA or rRNA. It is part of the SpoU family of MTases, a subfamily of the SPOUT methyltransferase superfamily. The SPOUT methyltransferase superfamily is a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot.	143
349969	cd18096	SpoU-like	SAM-dependent rRNA or tRNA methylase related to SpoU. RNA 2'-O ribose methyltransferase catalyzes the methyltransfer from S-adenosyl-L-methionine (AdoMet) to the 2'-OH group of ribose in tRNA or rRNA. It is part of the SpoU family of MTases, a subfamily of the SPOUT methyltransferase superfamily. The SPOUT methyltransferase superfamily is a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot.	140
349970	cd18097	SpoU-like	SAM-dependent rRNA or tRNA methylase related to SpoU. RNA 2'-O ribose methyltransferase catalyzes the methyltransfer from S-adenosyl-L-methionine (AdoMet) to the 2'-OH group of ribose in tRNA or rRNA. It is part of the SpoU family of MTases, a subfamily of the SPOUT methyltransferase superfamily. The SPOUT methyltransferase superfamily is a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot.	144
349971	cd18098	SpoU-like	SAM-dependent rRNA or tRNA methylase related to SpoU. RNA 2'-O ribose methyltransferase catalyzes the methyltransfer from S-adenosyl-L-methionine (AdoMet) to the 2'-OH group of ribose in tRNA or rRNA. It is part of the SpoU family of MTases, a subfamily of the SPOUT methyltransferase superfamily. The SPOUT methyltransferase superfamily is a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot.	138
349972	cd18099	Trm10arch	archaeal tRNA(m1G9/m1A9)-methyltransferase Trm10. Archaeal tRNA(m1G9/m1A9)-methyltransferase Trm10 catalyzes the N(1) methylation of guanine or adenine at position 9 (m1G9/m1A9) of tRNA, which might play a role in the stabilization of tRNA and in translation termination efficiency. Trm10 is a member of the SPOUT (SpoU-TrmD) methyltransferase (MTase) superfamily, a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot.	170
349973	cd18100	Trm10euk_B	eukaryotic tRNA m1G9 methyltransferase Trm10 homolog B. Eukaryotic tRNA m1G9 methyltransferase Trm10 homolog B (TM10B) catalyzes the N(1) methylation of guanine at Position 9 (m(1)G9) of tRNA, which might play a role in the stabilization of tRNA and in translation termination efficiency. Trm10 is a member of the SPOUT (SpoU-TrmD) methyltransferase (MTase) superfamily, a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot.	182
349974	cd18101	Trm10euk_A	eukaryotic tRNA m1G9 methyltransferase Trm10 homolog A. Eukaryotic tRNA m1G9 methyltransferase Trm10 homolog A (TM10A) catalyzes the N(1) methylation of guanine at Position 9 (m(1)G9) of tRNA, which might play a role in the stabilization of tRNA and in translation termination efficiency. Trm10 is a member of the SPOUT (SpoU-TrmD) methyltransferase (MTase) superfamily, a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot.	174
349975	cd18102	Trm10_MRRP1	Mitochondrial ribonuclease P protein 1. Mitochondrial ribonuclease P protein 1 (or tRNA methyltransferase 10 homolog C) functions in mitochondrial tRNA maturation and is part of mitochondrial ribonuclease P, an enzyme composed of MRPP1/RG9MTD1, MRPP2/HSD17B10 and MRPP3/KIAA0391, which cleaves tRNA molecules in their 5'-ends. MRRP1 is related to Trm10, a tRNA m1G9 methyltransferase and is a member of the SPOUT (SpoU-TrmD) methyltransferase (MTase) superfamily, a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot.	179
349976	cd18103	SpoU-like_RlmB	SAM-dependent rRNA methylase related to RlmB. 23S rRNA-M2G2251-MTase RlmB catalyzes the methylation of guanosine 2251, a modification conserved in the peptidyltransferase domain of 23S rRNA. It is part of the SpoU family of MTases, a subfamily of the SPOUT methyltransferase superfamily. The SPOUT methyltransferase superfamily is a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot.	143
349977	cd18104	SpoU-like_RNA-MTase	SAM-dependent RNA methylase related to SpoU-TrmH. RNA 2'-O ribose methyltransferase catalyzes the methyltransfer from S-adenosyl-L-methionine (AdoMet) to the 2'-OH group of ribose in tRNA or rRNA. It is part of the SpoU family of MTases, a subfamily of the SPOUT methyltransferase superfamily. The SPOUT methyltransferase superfamily is a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot.	146
349978	cd18105	SpoU-like_MRM1	SAM-dependent rRNA methylase related to MRM1. MRM1 catalyzes the methylation of 2'-O-ribose residues G1145 to GmG residue of the mitochondrial 16S rRNA. MRM1 is a member of the SPOUT (SpoU-TrmD) methyltransferase (MTase) superfamily, a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot.	158
349979	cd18106	SpoU-like_RNMTL1	SAM-dependent rRNA methylase related to RNMTL1. RNMTL1 (also known as HC90, MRM3 and RMTL1) catalyzes the methylation of 2'-O-ribose residues G1370 to GmG residue of the mitochondrial 16S rRNA. RNMTL1  is a member of the SPOUT (SpoU-TrmD) methyltransferase (MTase) superfamily, a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot.	148
349980	cd18107	SpoU-like_AviRb	SAM-dependent rRNA methylase related to AviRb. AviRb from Streptomyces viridochromogenes methylates the 2'-O atom of U2479 of the 23S ribosomal RNA. AviRb is a member of the SPOUT (SpoU-TrmD) methyltransferase (MTase) superfamily, a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot.	148
349981	cd18108	SpoU-like_NHR	Nosiheptide-resistance methyltransferase (NHR). Nosiheptide-resistance methyltransferase (NHR) confers resistance to the thiazole antibiotic nosiheptide via catalyzing 2'O-methylation of 23S rRNA at the nucleotide A1067. NHR is a member of the SPOUT (SpoU-TrmD) methyltransferase (MTase) superfamily, a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot.	144
349982	cd18109	SpoU-like_RNA-MTase	SAM-dependent RNA methylase related to SpoU-TrmH. RNA 2'-O ribose methyltransferase catalyzes the methyltransfer from S-adenosyl-L-methionine (AdoMet) to the 2'-OH group of ribose in tRNA or rRNA. It is part of the SpoU family of MTases, a subfamily of the SPOUT methyltransferase superfamily. The SPOUT methyltransferase superfamily is a large class of S-adenosyl-L-methionine (AdoMet or SAM)-dependent RNA MTases which are structurally characterized by a deep trefoil knot.	141
349745	cd18110	ATP-synt_F1_beta_C	F1-ATP synthase beta (B) subunit, C-terminal domain. The beta (B) subunit of the F1 complex of F0F1-ATP synthase, C-terminal domain. The F-ATP synthase (also called FoF1-ATPase) is found in bacterial plasma membranes, mitochondrial inner membranes and in chloroplast thylakoid membranes. It has also been found in the archaea Methanosarcina barkeri. It uses a proton gradient to drive ATP synthesis and hydrolyzes ATP to build the proton gradient. The extrinsic membrane domain, F1, is composed of alpha, beta, gamma, delta and epsilon subunits with a stoichiometry of 3:3:1:1:1. The beta subunit of ATP synthase is catalytic.	108
349746	cd18111	ATP-synt_V_A-type_alpha_C	V/A-type ATP synthase catalytic subunit A (alpha), C-terminal domain. The alpha (A) subunit of the V1/A1 complex of V/A-type ATP synthases, C-terminal domain.  The V- and A-type family of ATPases are composed of two linked multi-subunit complexes: the V1 and A1 complexes contain three copies each of the alpha and beta subunits that form the soluble catalytic core, which is involved in ATP synthesis/hydrolysis, and the Vo or Ao complex that forms the membrane-embedded proton pore. The A-ATP synthase (AoA1-ATPase) is found in archaea and functions like F-ATP synthase. Structurally, however, the A-ATP synthase is more closely related to the V-ATP synthase (vacuolar VoV1-ATPase), which is a proton-translocating ATPase responsible for acidification of eukaryotic intracellular compartments and for ATP synthesis in archaea and some eubacteria. Collectively, the V- and A-type synthases can function in both ATP synthesis and hydrolysis modes.	105
349747	cd18112	ATP-synt_V_A-type_beta_C	V/A-type ATP synthase beta (B) subunit, C-terminal domain. The beta (B) subunit of the V1/A1 complexes of V/A-type ATP synthases, C-terminal domain.  The V- and A-type family of ATPases are composed of two linked multi-subunit complexes: the V1 and A1 complexes contain three copies each of the alpha and beta subunits that form the soluble catalytic core, which is involved in ATP synthesis/hydrolysis, and the Vo or Ao complex that forms the membrane-embedded proton pore. The A-ATP synthase (AoA1-ATPase) is found in archaea and functions like F-ATP synthase. Structurally, however, the A-ATP synthase is more closely related to the V-ATP synthase (vacuolar VoV1-ATPase), which is a proton-translocating ATPase responsible for acidification of eukaryotic intracellular compartments and for ATP synthesis in archaea and some eubacteria. Collectively, the V- and A-type synthases can function in both ATP synthesis and hydrolysis modes. This subfamily consists of the non-catalytic beta subunit.	95
349748	cd18113	ATP-synt_F1_alpha_C	F1-ATP synthase alpha (A) subunit, C-terminal domain. The alpha (A) subunit of the F1 complex of F0F1-ATP synthase, C-terminal domain. The F-ATP synthase (also called FoF1-ATPase) is found in bacterial plasma membranes, mitochondrial inner membranes and in chloroplast thylakoid membranes. It has also been found in the archaea Methanosarcina barkeri. It uses a proton gradient to drive ATP synthesis and hydrolyzes ATP to build the proton gradient. The extrinsic membrane domain, F1, is composed of alpha, beta, gamma, delta and epsilon subunits with a stoichiometry of 3:3:1:1:1. The alpha subunit of the F1 ATP synthase can bind nucleotides, but is non-catalytic.	126
349749	cd18114	ATP-synt_flagellum-secretory_path_III_C	Flagellum-specific ATP synthase, C-terminal domain. The C-terminal domain of the flagellum-specific ATPase/type III secretory pathway virulence-related protein. This group of ATPases are responsible for the export of flagellum and virulence-related proteins. The flagellum-specific ATPase FliI is the soluble export component that drives flagellar protein export, and it shows extensive similarity to the alpha and beta subunits of FoF1-ATP synthase. Although they both are proton driven rotary molecular devices, the main function of the bacterial flagellar motor is to rotate the flagellar filament for cell motility. Intracellular pathogens such as Salmonella and Chlamydia also have proteins which are similar to the flagellar-specific ATPase, but function in the secretion of virulence-related proteins via the type III secretory pathway.	71
349739	cd18115	ATP-synt_F1_beta_N	F1-ATP synthase beta (B) subunit, N-terminal domain. The beta (B) subunit of the F1 complex of FoF1-ATP synthase, N-terminal domain. The F-ATP synthase (also called FoF1-ATPase) is found in bacterial plasma membranes, mitochondrial inner membranes and in chloroplast thylakoid membranes. It has also been found in the archaea Methanosarcina barkeri. It uses a proton gradient to drive ATP synthesis and hydrolyzes ATP to build the proton gradient. The extrinsic membrane domain, F1, is composed of alpha, beta, gamma, delta and epsilon subunits with a stoichiometry of 3:3:1:1:1. The beta subunit of ATP synthase is catalytic.	76
349740	cd18116	ATP-synt_F1_alpha_N	F1-ATP synthase alpha (A) subunit, N-terminal domain. The alpha (A) subunit of the F1 complex of FoF1-ATP synthase, N-terminal domain. The F-ATP synthase (also called FoF1-ATPase) is found in bacterial plasma membranes, in mitochondrial inner membranes, and in chloroplast thylakoid membranes. It has also been found in the archaea Methanosarcina barkeri. It uses a proton gradient to drive ATP synthesis and hydrolyzes ATP to build the proton gradient. The extrinsic membrane domain, F1, is composed of alpha, beta, gamma, delta, and epsilon subunits with a stoichiometry of 3:3:1:1:1. The alpha subunit of the F1 ATP synthase can bind nucleotides, but is non-catalytic.	67
349741	cd18117	ATP-synt_flagellum-secretory_path_III_N	Flagellum-specific ATP synthase, N-terminal domain. The N-terminal domain of the flagellum-specific ATPase/type III secretory pathway virulence-related protein. This group of ATPases are responsible for the export of flagellum and virulence-related proteins.  The FliI ATPase is the soluble export component that drives flagellar protein export, and it shows extensive similarity to the alpha and beta subunits of F1-ATP synthase. Although they both are proton driven rotary molecular devices, the main function of the bacterial flagellar motor is to rotate the flagellar filament for cell motility. Intracellular pathogens, such as Salmonella and Chlamydia, also have proteins which are similar to the flagellar-specific ATPase, but function in the secretion of virulence-related proteins via the type III secretory pathway.	70
349742	cd18118	ATP-synt_V_A-type_beta_N	V/A-type ATP synthase beta (B) subunit, N-terminal domain. The beta (B) subunit of the V1/A1 complexes of V/A-type ATP synthases, N-terminal domain.  The V- and A-type family of ATPases are composed of two linked multi-subunit complexes: the V1 or A1 complex which contains three copies each of the alpha and beta subunits that form the soluble catalytic core, that is involved in ATP synthesis/hydrolysis, and the Vo or Ao complex which forms the membrane-embedded proton pore. The A-ATP synthase (AoA1-ATPase) is found in archaea and functions like F-ATP synthase. Structurally, however, the A-ATP synthase is more closely related to the V-ATP synthase (vacuolar VoV1-ATPase), which is a proton-translocating ATPase responsible for acidification of eukaryotic intracellular compartments and for ATP synthesis in archaea and some eubacteria. Collectively, the V- and A-type synthases can function in both ATP synthesis and hydrolysis modes. This subfamily consists of the non-catalytic beta subunit.	72
349743	cd18119	ATP-synt_V_A-type_alpha_N	V/A-type ATP synthase catalytic subunit A (alpha), N-terminal domain. The alpha (A) subunit of the V1/A1 complexes of V/A-type ATP synthases, N-terminal domain. The V- and A-type family of ATPases are composed of two linked multi-subunit complexes: the V1 or A1 complex, which contains three copies each of the alpha and beta subunits that form the soluble catalytic core involved in ATP synthesis/hydrolysis, and the Vo or Ao complex, which forms the membrane-embedded proton pore. The A-ATP synthase (AoA1-ATPase) is found in archaea and functions like F-ATP synthase. Structurally, however, the A-ATP synthase is more closely related to the V-ATP synthase (vacuolar VoV1-ATPase), which is a proton-translocating ATPase responsible for acidification of eukaryotic intracellular compartments and for ATP synthesis in archaea and some eubacteria. Collectively, the V- and A-type synthases can function in both ATP synthesis and hydrolysis modes.	67
349413	cd18120	ATP-synt_Vo_Ao_c	Membrane-bound Vo/Ao complexes of V/A-type ATP synthases, subunit c. Vo/Ao-ATP synthase subunit c. The V- and A-type family of ATPases are composed of two linked multi-subunit complexes: the V1 and A1 complexes contain three copies each of the alpha and beta subunits that form the soluble catalytic core, which is involved in ATP synthesis/hydrolysis, and the Vo or Ao complex that forms the membrane-embedded proton pore. The A-ATP synthase (AoA1-ATPases) is exclusively found in archaea and functions like the F-ATP synthase. Structurally, however, the A-ATP synthase is more closely related to the V-ATP synthase (vacuolar VoV1-ATPase), which is a proton-translocating ATPase responsible for acidification of eukaryotic intracellular compartments and for ATP synthesis in archaea and some eubacteria. The V- and A-type synthases can function in both ATP synthesis and hydrolysis modes. The V1 complex consists of three A and three B subunits, two G subunits plus the C, D, E, F, and H subunits. The Vo complex consists of five different subunits: a, c, c', c'', and d. The Ao/A1 complexes are composed of nine subunits in a stoichiometry of A(3):B(3):C:D:E:F:H(2):a:c(x). ATP is synthesized on the A3:B3 hexamer and the energy released during that process is transferred to the Ao complex, which consists of the C-terminal segment of subunit a and subunit c.	62
349414	cd18121	ATP-synt_Fo_c	membrane-bound Fo complex of F-ATP synthase, subunit c. Subunit c (also called subunit 9, or proteolipid) of the Fo complex of F-ATP synthase. The F-ATP synthase (also called FoF1-ATPase) consists of two structural domains: the F1 (factor one) complex containing the soluble catalytic core, and the Fo (oligomycin sensitive factor) complex containing the membrane proton channel, linked together by a central stalk and a peripheral stalk. F1 is composed of alpha, beta, gamma, delta, and epsilon subunits with a stoichiometry of 3:3:1:1:1, while Fo consists of the three subunits a, b, and c (1:2:10-14). An oligomeric ring of 10-14 c subunits (c-ring) makes up the Fo rotor. The flux of protons through the ATPase channel (Fo) drives the rotation of the c-ring, which in turn is coupled to the rotation of the F1 complex gamma subunit rotor due to the permanent binding between the gamma and epsilon subunits of F1 and the c-ring of Fo. The F-ATP synthases are primarily found in the inner membranes of eukaryotic mitochondria, in the thylakoid membranes of chloroplasts, or in the plasma membranes of bacteria. The F-ATP synthases are the primary producers of ATP, using the proton gradient generated by oxidative phosphorylation (mitochondria) or photosynthesis (chloroplasts). Alternatively, under conditions of low driving force, ATP synthases function as ATPases, thus generating a transmembrane proton or Na(+) gradient at the expense of energy derived from ATP hydrolysis. This group also includes the F-ATP synthase found in the archaeon Methanosarcina acetivorans.	65
350838	cd18133	HLD_clamp	helical lid domain of clamp loader-like AAA+ proteins. Clamp loader complexes are multisubunit complexes that play an important role in DNA replication. They open sliding clamps for assembly and close them around DNA, specifically targeting them to sites where DNA synthesis is initiated and orienting them correctly for replication. The subunits belong to the clamp loader clade of AAA+ superfamily.	65
350839	cd18137	HLD_clamp_pol_III_gamma_tau	helical lid domain of DNA polymerase III subunits gamma and tau. DNA polymerase III subunit gamma/tau is part of the DNA polymerase III holoenzyme. Gamma and tau subunits are isoforms, both containing the helical lid domain. Gamma interacts with the delta subunit to transfer the beta subunit on the DNA while tau serves as a scaffold to help in the dimerization of the core complex. Both are members of the clamp-loader clade of the AAA+ superfamily.	65
350840	cd18138	HLD_clamp_pol_III_delta	helical lid domain of DNA polymerase III subunit delta. DNA polymerase III subunit delta is part of the DNA polymerase III holoenzyme. The delta subunit is required for ring opening and binds the beta subunit. It is a member of the clamp-loader clade of the AAA+ superfamily.	65
350841	cd18139	HLD_clamp_RarA	helical lid domain of recombination factor protein RarA. Recombination factor RarA (Replication associated recombination gene/protein A, also known as MgsA (Maintenance of genome stability A) or Mgs1 in yeast and WRNIP1 in mammals) is a member of the clamp-loader clade of the AAA+ superfamily. It functions as a tetramer. RarA co-localizes with the replication fork throughout the cell cycle and may play a role in the rescue of stalled replication forks.	75
350842	cd18140	HLD_clamp_RFC	helical lid domain of replication factor C subunit. Replication factor C (RFC) is a five-protein clamp loader complex that forms a stable ATP-dependent complex with the sliding clamp, PCNA, which binds specifically to primed DNA. RFC subunits belong to the clamp loader clade of the AAA+ superfamily.	63
381143	cd18159	REC_OmpR_NsrR-like	phosphoacceptor receiver (REC) domain of Streptococcus agalactiae NsrR-like OmpR family response regulators. Streptococcus agalactiae NsrR is a lantibiotic resistance-associated response regulator and is part of the nisin resistance operon. It is a member of the NsrRK two-component system (TCS) that is involved in the regulation of lantibiotic resistance genes such as a membrane-associated lipoprotein of LanI, and the nsr gene cluster which encodes for the resistance protein NSR and the ABC transporter NsrFP, both conferring resistance against nisin. This subfamily also includes Staphylococcus epidermidis GraR, part of the GraR/GraS TCS involved in resistance against cationic antimicrobial peptides, and Bacillus subtilis BceR, part of the BceS/BceR TCS involved in the regulation of bacitracin resistance. Members of this subfamily belong to the OmpR family of DNA-binding response regulators, which contain N-terminal receiver (REC) and C-terminal DNA-binding winged helix-turn-helix effector domains. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	113
381144	cd18160	REC_CpdR_CckA-like	phosphoacceptor receiver (REC) domain of Brucella abortus CpdR and CckA, and similar domains. Two-component systems (TCSs), consisting of a sensor and a response regulator, are used by bacteria to adapt to changing environments. Processes regulated by TCSs in bacteria include sporulation, pathogenicity, virulence, chemotaxis and membrane transport. Response regulators share the common phosphoacceptor REC domain and differ in their output domains, such as DNA-, RNA-, ligand-, and protein-binding, or enzymatic domains. CpdR is a stand-alone REC protein. CckA is a sensor histidine kinase containing N-terminal PAS domains and a C-terminal REC domain. CpdR and CckA are components of a regulatory phosphorelay system (composed of CckA, ChpT, CtrA and CpdR) that controls Brucella abortus cell growth, division, and intracellular survival inside mammalian host cells. CckA autophosphorylates in the presence of ATP and transfers a phosphoryl group to the conserved aspartic acid residue on its C-terminal REC domain, which is relayed to the ChpT phosphotransferase. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	103
381145	cd18161	REC_hyHK_blue-like	phosphoacceptor receiver (REC) domain of hybrid sensor histidine kinase/response regulators similar to Pseudomonas savastanoi blue-light-activated histidine kinase. Typically, two-component regulatory systems (TCSs) consist of a sensor (histidine kinase) that responds to specific input(s) by modifying the output of a cognate response regulator (RR). TCSs allow organisms to sense and respond to changes in environmental conditions. Hybrid sensor histidine kinase (HK)/response regulators contain all the elements of a classical TCS in a single polypeptide chain. Pseudomonas savastanoi blue-light-activated histidine kinase is a photosensitive HK and RR that is involved in increased bacterial virulence upon exposure to light. RRs share the common phosphoacceptor REC domain and different effector/output domains such as DNA, RNA, ligand-binding, protein-binding, or enzymatic domains. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	102
349482	cd18172	M14_CP_plant	Zinc carboxypeptidase, including SOL1, a carboxypeptidase D in plants. This family includes only plant members of the carboxypeptidase (CP) N/E-like subfamily of the M14 family of metallocarboxypeptidases (MCPs). It includes Arabidopsis thaliana SOL1 carboxypeptidase D which is known to possess enzymatic activity to remove the C-terminal arginine residue of CLE19 proprotein in vitro, and SOL1-dependent cleavage of the C-terminal arginine residue is necessary for CLE19 activity in vivo. The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. The N/E subfamily includes eight members, of which five (CPN, CPE, CPM, CPD, CPZ) are considered enzymatically active, while the other three are non-active (CPX1, CPX2, ACLP/AEBP1) and lack the critical active site and substrate-binding residues considered necessary for CP activity. These non-active members may function as binding proteins or display catalytic activity towards other substrates. Unlike the A/B CP subfamily, enzymes belonging to the N/E subfamily are not produced as inactive precursors that require proteolysis to produce the active form; rather, they rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavages that would otherwise damage the cell. In addition, all members of the N/E subfamily contain an extra C-terminal domain that is not present in the A/B subfamily. This domain has structural homology to transthyretin and other proteins and has been proposed to function as a folding domain. The active N/E enzymes fulfill a variety of cellular functions, including prohormone processing, regulation of peptide hormone activity, alteration of protein-protein or protein-cell interactions and transcriptional regulation.	276
349483	cd18173	M14_CP_bacteria	bacterial peptidase M14 carboxypeptidase, uncharacterized. This family contains only bacterial carboxypeptidase (CP) members of the M14 family of metallocarboxypeptidases (MCPs), most of which have yet to be characterized. The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. The N/E subfamily includes eight members, of which five (CPN, CPE, CPM, CPD, CPZ) are considered enzymatically active, while the other three are non-active (CPX1, CPX2, ACLP/AEBP1) and lack the critical active site and substrate-binding residues considered necessary for CP activity. These non-active members may function as binding proteins or display catalytic activity towards other substrates. Unlike the A/B CP subfamily, enzymes belonging to the N/E subfamily are not produced as inactive precursors that require proteolysis to produce the active form; rather, they rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavages that would otherwise damage the cell. In addition, all members of the N/E subfamily contain an extra C-terminal domain that is not present in the A/B subfamily. This domain has structural homology to transthyretin and other proteins and has been proposed to function as a folding domain. The active N/E enzymes fulfill a variety of cellular functions, including prohormone processing, regulation of peptide hormone activity, alteration of protein-protein or protein-cell interactions and transcriptional regulation.	281
349484	cd18174	M14_ASTE_ASPA_like	Peptidase M14 Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA)-like; uncharacterized subgroup. A functionally uncharacterized subgroup of the Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA) subfamily which is part of the M14 family of metallocarboxypeptidases. ASTE catalyzes the fifth and last step in arginine catabolism by the arginine succinyltransferase pathway, and aspartoacylase (ASPA, also known as aminoacylase 2, and ACY-2; EC:3.5.1.15) cleaves N-acetyl L-aspartic acid (NAA) into aspartate and acetate. NAA is abundant in the brain, and hydrolysis of NAA by ASPA may help maintain white matter. ASPA is an NAA scavenger in other tissues. Mutations in the gene encoding ASPA cause Canavan disease (CD), a fatal progressive neurodegenerative disorder involving dysmyelination and spongiform degeneration of white matter in children. This enzyme binds zinc which is necessary for activity. Measurement of elevated NAA levels in urine is used in the diagnosis of CD.	187
349415	cd18175	ATP-synt_Vo_c_ATP6C_rpt1	V-type proton ATPase 16 kDa proteolipid subunit (ATP6C/ATP6V0C/ATP6L/ATPL) and similar proteins. ATP6C (also called the V-ATPase 16 kDa proteolipid subunit, or vacuolar proton pump 16 kDa proteolipid subunit) is a proton-conducting pore forming subunit of the membrane integral Vo complex of vacuolar ATPase. V-ATPase is responsible for acidifying a variety of intracellular compartments in eukaryotic cells.	68
349416	cd18176	ATP-synt_Vo_c_ATP6C_rpt2	V-type proton ATPase 16 kDa proteolipid subunit (ATP6C/ATP6V0C/ATP6L/ATPL) and similar proteins. ATP6C (also called V-ATPase 16 kDa proteolipid subunit, or vacuolar proton pump 16 kDa proteolipid subunit) is a proton-conducting pore forming subunit of the membrane integral Vo complex of vacuolar ATPase. V-ATPase is responsible for acidifying a variety of intracellular compartments in eukaryotic cells.	68
349417	cd18177	ATP-synt_Vo_c_ATP6F_rpt1	V-type proton ATPase 21 kDa proteolipid subunit (ATP6F/ATP6V0B) and similar proteins. ATP6F (also called V-ATPase 21 kDa proteolipid subunit, or vacuolar proton pump 21 kDa proteolipid subunit) is a proton-conducting pore forming subunit of the membrane integral Vo complex of vacuolar ATPase. V-ATPase is responsible for acidifying a variety of intracellular compartments in eukaryotic cells.	63
349418	cd18178	ATP-synt_Vo_c_ATP6F_rpt2	V-type proton ATPase 21 kDa proteolipid subunit (ATP6F/ATP6V0B) and similar proteins. ATP6F (also called V-ATPase 21 kDa proteolipid subunit, or vacuolar proton pump 21 kDa proteolipid subunit) is a proton-conducting pore forming subunit of the membrane integral Vo complex of vacuolar ATPase. V-ATPase is responsible for acidifying a variety of intracellular compartments in eukaryotic cells.	65
349419	cd18179	ATP-synt_Vo_Ao_c_NTPK_rpt1	V-type sodium ATPase subunit K (NTPK) and similar proteins. NTPK (also called Na(+)-translocating ATPase subunit K, or sodium ATPase proteolipid component) is involved in ATP-driven sodium extrusion.	63
349420	cd18180	ATP-synt_Vo_Ao_c_NTPK_rpt2	V-type sodium ATPase subunit K (NTPK) and similar proteins. NTPK (also called Na(+)-translocating ATPase subunit K, or sodium ATPase proteolipid component) is involved in ATP-driven sodium extrusion.	64
349421	cd18181	ATP-synt_Vo_Ao_c_TtATPase_like	Thermus thermophilus V/A-ATPase and similar proteins. This family includes a group of uncharacterized ATPase similar to Thermus thermophilus V/A-ATPase, which is homologous to the eukaryotic V-ATPase, but has a simpler subunit composition and functions in vivo to synthesize ATP rather than pump protons.	62
349422	cd18182	ATP-synt_Fo_c_ATP5G3	ATP synthase F(0) complex subunit C3 (ATP5G3) and similar proteins. ATP5G3 (also called ATP synthase lipid-binding protein, ATP synthase proteolipid P3, ATP synthase proton-transporting mitochondrial F(o) complex subunit C3, ATPase protein 9, or ATPase subunit c) transports protons across the inner mitochondrial membrane to the F1-ATPase protruding on the matrix side, resulting in the generation of ATP.	65
349423	cd18183	ATP-synt_Fo_c_ATPH	F-type proton-translocating ATP synthase (ATPH) and similar proteins. This family includes subunit c of chloroplast F-ATP synthase (F1Fo-ATP synthase), also known as ATP synthase F(o) sector subunit c (also called ATPase subunit III, F-type ATPase subunit c, or F-ATPase subunit c) and similar proteins. It is a proton-translocating subunit of the ATP synthase encoded by gene atpH.	75
349424	cd18184	ATP-synt_Fo_c_NaATPase	F-type sodium ion-translocating ATP synthase and similar proteins. This family includes F-type Na(+)-coupled ATP synthase and similar proteins.	65
349425	cd18185	ATP-synt_Fo_c_ATPE	F-type proton-translocating ATPase subunit c (ATPE) and similar proteins. This family includes subunit c of F-ATP synthase (also called ATP synthase F(o) sector subunit c, F-type ATPase subunit c, or F-ATPase subunit c) and similar proteins. It is a proton-translocating subunit of the ATP synthase encoded by gene atpE.	65
349497	cd18186	BTB_POZ_ZBTB_KLHL-like	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing (ZBTB) proteins, Kelch-like (KLHL) proteins, and similar proteins. This family includes a variety of BTB/POZ domain-containing proteins, such as zinc finger and BTB domain-containing (ZBTB) proteins and Kelch-like (KLHL) proteins. They have diverse functions, such as transcriptional regulation, chromatin remodeling, protein degradation and cytoskeletal regulation. Many BTB/POZ proteins contain one or two additional domains, such as kelch repeats, zinc-finger domains, FYVE (Fab1, YOTB, Vac1, and EEA1) fingers, or ankyrin repeats. These special additional domains or interaction partners provide unique characteristics and functions to BTB/POZ proteins. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	82
349498	cd18187	BTB_POZ_Kv_KCTD	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in voltage-gated potassium (Kv) channels and potassium channel tetramerization domain-containing (KCTD) proteins. This family includes two protein groups: voltage-gated potassium (Kv) channels and potassium channel tetramerization domain-containing (KCTD) proteins. Kv channels are membrane proteins with fundamental physiological roles. They are responsible for a variety of electrical phenomena, such as the repolarization of the action potential, spike frequency adaptation, synaptic repolarization, and smooth muscle contraction. KCTD proteins play crucial roles in a variety of fundamental biological processes, such as protein ubiquitination and degradation, suppression of proliferation or transcription, cytoskeleton regulation, tetramerization and gating of ion channels, and others. All family members contain the BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization.	83
349499	cd18190	BTB_POZ_ETO1-like	BTB (Broad-Complex, Tramtrack and Bric a brac) /POZ (poxvirus and zinc finger) domain found in Arabidopsis thaliana ethylene-overproduction protein 1 (ETO1) and similar proteins. ETO1, also called protein ethylene overproducer 1, is an essential regulator of the ethylene pathway, which acts by regulating the stability of 1-aminocyclopropane-1-carboxylate synthase (ACS) enzymes. It may act as a substrate-specific adaptor that connects ACS enzymes, such as ACS5, to ubiquitin ligase complexes, leading to proteasomal degradation of ACS enzymes. The family also includes ETO1-like proteins 1 (EOL1) and 2 (EOL2). ETO1, EOL1, and EOL2 contain a BTB domain and tetratricopeptide (TPR) repeats. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	83
349500	cd18191	BTB_POZ_ARMC5	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in armadillo repeat-containing protein 5 (ARMC5). ARMC5 plays a role in steroidogenesis, and modulates the expression of steroidogenic enzymes and cortisol production. It negatively regulates adrenal cell survival. It contains armadillo (ARM) repeats and a BTB domain, which is a common protein-protein interaction motif of about 100 amino acids.	100
349501	cd18192	BTB_POZ_ZBTB1	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 1 (ZBTB1). ZBTB1 acts as a transcriptional repressor that represses cAMP-responsive element (CRE)-mediated transcriptional activation. It also has a role in translesion DNA synthesis, and is essential for lymphocyte development. ZBTB1 contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	114
349502	cd18193	BTB_POZ_ZBTB2	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 2 (ZBTB2). ZBTB2 is a POZ domain Kruppel-like zinc finger (POK) family transcription factor acting as a potent repressor of the ARF-HDM2-p53-p21 pathway, which is important in cell cycle regulation. It represses transcription of the ARF, p53, and p21 genes, but activates the HDM2 gene. ZBTB2 contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	115
349503	cd18194	BTB_POZ_ZBTB3-like	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing proteins, ZBTB3, ZBTB18, ZBTB42 and similar proteins. The family includes zinc finger and BTB domain-containing proteins, ZBTB3, ZBTB18 and ZBTB42. ZBTB3 is a transcription factor essential for cancer cell growth via the regulation of the reactive oxygen species (ROS) detoxification pathway. ZBTB18 is a sequence-specific transrepressor associated with heterochromatin. ZBTB42 is a transcriptional repressor that specifically binds DNA and probably acts by recruiting chromatin remodeling multiprotein complexes. Members of this family contain a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	128
349504	cd18195	BTB_POZ_ZBTB4	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 4 (ZBTB4). ZBTB4, also called KAISO-like zinc finger protein 1 (KAISO-L1), is a transcriptional repressor with bimodal DNA-binding specificity. It binds with a higher affinity to methylated CpG dinucleotides in the consensus sequence 5'-CGCG-3' but can also bind to the non-methylated consensus sequence 5'-CTGCNA-3', also known as the consensus kaiso binding site (KBS). It can also bind specifically to a single methyl-CpG pair and can bind hemimethylated DNA but with a lower affinity compared to methylated DNA. ZBTB4 contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	124
349505	cd18196	BTB_POZ_ZBTB5	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 5 (ZBTB5). ZBTB5 is a POZ domain Kruppel-like zinc finger (POK) family transcription repressor of cell cycle arrest gene p21 and a potential proto-oncogene stimulating cell proliferation. ZBTB5 contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	126
349506	cd18197	BTB_POZ_ZBTB6	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 6 (ZBTB6). ZBTB6, also called zinc finger protein 482 (ZNF482) or zinc finger protein with interaction domain, may be involved in transcriptional regulation. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	116
349507	cd18198	BTB_POZ_ZBTB7	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 7 (ZBTB7). There are three ZBTB7 isoforms: ZBTB7A, ZBTB7B, and ZBTB7C. ZBTB7A is a transcription repressor of key glycolytic genes, including GLUT3, PFKP, and PKM, and its downregulation in human cancer contributes to tumor metabolism. ZBTB7B is a transcriptional regulator of extracellular matrix gene expression. ZBTB7C is a transcriptional repressor with a pro-oncogenic role that relies upon binding to p53 and inhibition of its transactivation function. ZBTB7 isoforms contain a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	120
349508	cd18199	BTB_POZ_ZBTB8	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 8 (ZBTB8). There are two ZBTB8 isoforms: ZBTB8A and ZBTB8B. ZBTB8A is a novel proto-oncoprotein that stimulates cell proliferation. ZBTB8B may be involved in transcriptional regulation. They both contain a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	113
349509	cd18200	BTB_POZ_ZBTB9	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 9 (ZBTB9). ZBTB9 may be involved in transcriptional regulation. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	111
349510	cd18201	BTB_POZ_ZBTB10	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 10 (ZBTB10). ZBTB10, also called zinc finger protein RIN ZF, is an mRNA target of miR-27a and a transcriptional repressor of Specificity protein (Sp) expression. The microRNA-27a:ZBTB10-specificity protein pathway is involved in follicle stimulating hormone-induced VEGF, Cox2, and survivin expression in ovarian epithelial cancer cells. ZBTB10 contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	122
349511	cd18202	BTB_POZ_ZBTB11	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 11 (ZBTB11). ZBTB11 is a transcriptional repressor of TP53. It is critical for basal and emergency granulopoiesis. It regulates neutrophil development through its integrase-like zinc finger domain. ZBTB11 contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	118
349512	cd18203	BTB_POZ_ZBTB12	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 12 (ZBTB12). ZBTB12, also called protein G10, may be involved in transcriptional regulation. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	122
349513	cd18204	BTB_POZ_ZBTB14	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 14 (ZBTB14). ZBTB14 is also called zinc finger protein 161 (Zfp-161), zinc finger protein 478, zinc finger protein 5 (ZF5), or Zfp-5. It is a novel transcriptional activator of the dopamine transporter, binding its promoter at the consensus sequence 5'-CCTGCACAGTTCACGGA-3'. It also binds to 5'-d(GCC)(n)-3' trinucleotide repeats in promoter regions and acts as a repressor of the FMR1 gene. ZBTB14 acts as a transcriptional repressor of MYC and thymidine kinase promoters. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	114
349514	cd18205	BTB_POZ_ZBTB16_PLZF	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 16 (ZBTB16). ZBTB16 is also called promyelocytic leukemia zinc finger protein, zinc finger protein 145, or zinc finger protein PLZF. It is a DNA-binding transcription factor essential for undifferentiated cell maintenance. ZBTB16 also acts as a downstream transcriptional regulator of Osterix and can be useful as a late marker of osteoblastic differentiation. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	107
349515	cd18206	BTB_POZ_ZBTB17_MIZ1	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 17 (ZBTB17). ZBTB17 is also called c-Myc-interacting zinc finger protein 1 (Miz-1), zinc finger protein 151, or zinc finger protein 60. It is a poly-Cys2His2 zinc finger (ZF) transcription factor that can function as an activator or repressor depending on its binding partners, and by targeting negative regulators of cell cycle progression. ZBTB17 has been implicated in cardiomyopathy and is important in cardiac stress response. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	112
349516	cd18207	BTB_POZ_ZBTB19_PATZ1	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in POZ-, AT hook-, and zinc finger-containing protein 1 (PATZ1). PATZ1 is also called zinc finger and BTB domain-containing protein 19 (ZBTB19), BTB/POZ domain zinc finger transcription factor, protein kinase A RI subunit alpha-associated protein, zinc finger protein 278, or zinc finger sarcoma gene protein. It is an important transcriptional regulatory factor that regulates divergent pathways depending on the cellular context. For instance, it acts as a transcriptional suppressor that functions in T lymphocytes. It is also a DNA damage-responsive transcription factor that inhibits p53 function. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	128
349517	cd18208	BTB_POZ_ZBTB20_DPZF	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 20 (ZBTB20). ZBTB20, also called dendritic-derived BTB/POZ zinc finger protein (DPZF) or zinc finger protein 288, may be a transcription factor involved in hematopoiesis, oncogenesis, and immune responses. It is an essential regulator of hepatic lipogenesis and may be a therapeutic target for the treatment of fatty liver disease. It also functions as a critical regulator of anterior pituitary development and lactotrope specification. Moreover, it promotes astrocytogenesis during neocortical development. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	117
349518	cd18209	BTB_POZ_ZBTB21_ZNF295	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 21 (ZBTB21). ZBTB21, also called zinc finger protein 295 (ZNF295), is a transcription repressor that acts in a selective manner on different promoters. It may be involved in the bi-directional control of gene expression in concert with another transcription factor ZFP161. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	112
349519	cd18210	BTB_POZ_ZBTB22_BING1	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 22 (ZBTB22). ZBTB22, also called protein BING1 or zinc finger protein 297, may be involved in transcriptional regulation. Its gene, together with BING 3-5, TAPASIN, DAXX, RGL2, and HKE2, forms a dense cluster at the centromeric end of the major histocompatibility complex class I region. ZBTB22 contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	112
349520	cd18211	BTB_POZ_ZBTB23_GZF1	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in glial cell line-derived neurotrophic factor-inducible zinc finger protein 1 (GZF1). GZF1 is also called GDNF-inducible zinc finger protein 1, zinc finger and BTB domain-containing protein 23 (ZBTB23), or zinc finger protein 336 (ZNF336). It is a sequence-specific transcriptional repressor that binds the GZF1 responsive element (GRE), with the consensus sequence of 5'-TGCGCN[TG][CA]TATA-3'. It may play a role in renal branching morphogenesis. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	128
349521	cd18212	BTB_POZ_ZBTB24_ZNF450	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 24 (ZBTB24). ZBTB24, also called zinc finger protein 450, functions as a transcription factor essentially involved in B-cell functions in humans. The loss-of-function mutations in ZBTB24 can cause ICF2 (immunodeficiency, centromeric instability and facial anomalies syndrome 2) with immunological characteristics of greatly reduced serum antibodies and circulating memory B cells. ZBTB24 contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	118
349522	cd18213	BTB_POZ_ZBTB25_ZNF46_KUP	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 25 (ZBTB25). ZBTB25, also called zinc finger protein 46 (ZNF46) or zinc finger protein KUP, is a transcription repressor that facilitates viral RNA transcription and replication. It interacts with viral RNA-dependent RNA polymerase (RdRp) proteins and modulates their transcription activity. It also functions as a viral RNA-binding protein, binding preferentially to the U-rich sequence within 5' UTR of vRNA. ZBTB25 contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	128
349523	cd18214	BTB_POZ_ZBTB26_Bioref	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 26 (ZBTB26). ZBTB26, also called zinc finger protein 481 (ZNF481) or zinc finger protein Bioref, may be involved in transcriptional regulation. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	122
349524	cd18215	BTB_POZ_ZBTB27-like	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in B-cell lymphoma 6 proteins, BCL-6 and BCL-6B. This family includes B-cell lymphoma 6 proteins, BCL-6 and BCL-6B. BCL-6 is a transcriptional repressor mainly required for germinal center (GC) formation and antibody affinity maturation, which have different mechanisms of action specific to the lineage and biological functions. BCL-6B is a sequence-specific transcriptional repressor in association with BCL-6. It may function in a narrow stage or be related to some events in the early B-cell development. Family members contain a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	113
349525	cd18216	BTB_POZ_ZBTB29-like	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in hypermethylated in cancer proteins, Hic-1 and Hic-2. The family includes hypermethylated in cancer proteins, Hic-1 and Hic-2. Hic-1 is a sequence-specific transcriptional repressor that recognizes and binds to the consensus sequence 5'-[CG]NG[CG]GGGCA[CA]CC-3'. Hic-2 is a homolog of tumor suppressor Hic-1 that functions as a transcriptional regulator. Family members contain a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	118
349526	cd18217	BTB_POZ_ZBTB31_myoneurin	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in myoneurin. Myoneurin, also called zinc finger and BTB domain-containing protein 31 (ZBTB31), is a novel member of the BTB/POZ-zinc finger family highly expressed in the neuromuscular system and is associated with neuromuscular junctions during the late embryonic period. It may function as a synaptic gene regulator. Myoneurin contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	111
349527	cd18218	BTB_POZ_ZBTB32_FAZF_TZFP	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 32 (ZBTB32). ZBTB32 is also called FANCC-interacting protein, fanconi anemia zinc finger protein (FAZF), testis zinc finger protein (TZFP), or zinc finger protein 538 (ZNF538). It is a DNA-binding transcription factor that binds to the 5'-TGTACAGTGT-3' core sequence. It acts as a transcription suppressor that controls T cell-mediated autoimmunity. ZBTB32 is essential for down-regulation of GATA3 via ZPO2; this promotes aggressive breast cancer development. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	110
349528	cd18219	BTB_POZ_ZBTB33_KAISO	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kaiso. Kaiso, also called zinc finger and BTB domain-containing protein 33 (ZBTB33), is a DNA methylation-dependent transcriptional repressor that binds to methylated CpG dinucleotides in the consensus sequence 5'-CGCG-3'. It also binds to the non-methylated consensus sequence 5'-CTGCNA-3', also known as the consensus kaiso binding site (KBS). It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	106
349529	cd18220	BTB_POZ_ZBTB34	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 34 (ZBTB34). ZBTB34 acts as a transcriptional regulator. It downregulates specificity protein (Sp) transcription factors Sp1, Sp3, and Sp4 in pancreatic cancer cells. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	120
349530	cd18221	BTB_POZ_ZBTB35_ZNF131	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger protein 131 (ZNF131). ZNF131, also called zinc finger and BTB domain-containing protein 35 (ZBTB35), is a transcriptional activator implicated as a regulator of Kaiso-mediated biological processes. It regulates cell growth of developing and mature T cells. It inhibits estrogen signaling by suppressing estrogen receptor alpha homo-dimerization, and plays a role in breast cancer cell proliferation. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	113
349531	cd18222	BTB_POZ_ZBTB37	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 37 (ZBTB37). ZBTB37 may be involved in transcriptional regulation. It is differentially expressed in aryl hydrocarbon receptor (AhR)-KO mice compared with WT mice, and may potentially contribute to the aging phenotype of AhR-KO mice. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	123
349532	cd18223	BTB_POZ_ZBTB38_CIBZ	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 38 (ZBTB38). ZBTB38, also termed CIBZ, is a transcriptional regulator with bimodal DNA-binding specificity. It binds with a higher affinity to methylated CpG dinucleotides in the consensus sequence 5'-CGCG-3', as well as E-box elements (5'-CACGTG-3'). It can also bind specifically to a single methyl-CpG pair. ZBTB38 represses transcription in a methyl-CpG-dependent manner. It is a negative regulator of endoplasmic reticulum stress-associated apoptosis. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	114
349533	cd18224	BTB_POZ_ZBTB39	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 39 (ZBTB39). ZBTB39 may be involved in transcriptional regulation. Its specific function is as yet unknown. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	123
349534	cd18225	BTB_POZ_ZBTB40	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 40 (ZBTB40). ZBTB40 may be involved in transcriptional regulation. Single-nucleotide polymorphisms of ZBTB40 are associated with bone mineral density in European and East-Asian populations. ZBTB40 contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	116
349535	cd18226	BTB_POZ_ZBTB41	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 41 (ZBTB41). ZBTB41, also called FRBZ1, may be involved in transcriptional regulation. Its specific function is as yet unknown. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	114
349536	cd18227	BTB_POZ_ZBTB43	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 43 (ZBTB43). ZBTB43, also called zinc finger and BTB domain-containing protein 22B (ZBTB22b), zinc finger protein 297B (ZNF297B), or ZnF-x, may be involved in transcriptional regulation. It interacts with BDP1, a subunit of transcription factor IIIB (TFIIIB). Since BDP1 is essential in Pol III transcription, ZBTB43 may also regulate these transcriptional pathways. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	121
349537	cd18228	BTB_POZ_ZBTB44	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 44 (ZBTB44). ZBTB44, also called BTB/POZ domain-containing protein 15 (BTBD15) or zinc finger protein 851 (ZNF851), may be involved in transcriptional regulation. Single-nucleotide polymorphisms of ZBTB44 showed a suggestive association with disease progression of Crohn's disease. ZBTB44 has also preferentially been recognized by sera of patients with peripheral T-cell lymphoma (PTCL). It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	126
349538	cd18229	BTB_POZ_ZBTB45	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 45 (ZBTB45). ZBTB45, also called zinc finger protein 499 (ZNF499), may act as a transcriptional regulator that is essential for proper glial differentiation of neural and oligodendrocyte progenitor cells. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	112
349539	cd18230	BTB_POZ_ZBTB46	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 46 (ZBTB46). ZBTB46 is also called BTB-ZF protein expressed in effector lymphocytes (BZEL), BTB/POZ domain-containing protein 4 (BTBD4), or zinc finger protein 340 (ZNF340). It is a conventional dendritic cell (cDC) lineage specific transcription factor that acts as a negative regulator required to prevent activation of classical dendritic cells in the steady state. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	125
349540	cd18231	BTB_POZ_ZBTB47_ZNF651	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 47 (ZBTB47). ZBTB47, also called zinc finger protein 651 (ZNF651), is a paralog of ZNF652, a novel zinc-finger transcriptional repressor. It interacts with CBFA2T3 via its carboxy-terminal proline-rich region. CBFA2T3-ZNF651 functions as a transcriptional co-repressor complex. ZBTB47 contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	115
349541	cd18232	BTB_POZ_ZBTB48_TZAP_KR3	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in telomere zinc finger-associated protein (TZAP). TZAP is also called Krueppel-related zinc finger protein 3 (KR3), zinc finger and BTB domain-containing protein 48 (ZBTB48), or zinc finger protein 855 (ZNF855). It is a vertebrate telomere-binding protein involved in telomere length control. It directly binds the telomeric double-stranded 5'-TTAGGG-3' repeat. TZAP also acts as a transcription regulator that binds to promoter regions. It is a transcriptional activator of alternate reading frame (ARF) gene. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	108
349542	cd18233	BTB_POZ_ZBTB49	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 49 (ZBTB49). ZBTB49, also called zinc finger protein 509 (ZNF509), is a transcription factor that inhibits cell proliferation by activating either CDKN1A/p21 transcription or RB1 transcription. There are four ZNF509 isoforms generated by alternative splicing. Short ZNF509 (ZNF509S1, -S2 and -S3) isoforms contain one or two out of the seven zinc-fingers contained in long ZNF509 (ZNF509L). ZBTB49 contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	112
349543	cd18234	BTB_POZ_KLHL1-like	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like proteins KLHL1, KLHL4 and KLHL5. This family contains the Kelch-like proteins: KLHL1, KLHL4 and KLHL5, all of which share high identity and similarity with the Drosophila kelch protein, a component of ring canals. KLHL1 is a neuronal actin-binding protein that modulates voltage-gated CaV2.1 (P/Q-type) and CaV3.2 (alpha1H T-type) calcium channels. Family members contain a BTB domain and kelch repeat domains, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	105
349544	cd18235	BTB_POZ_KLHL2-like	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like proteins, KLHL2 and KLHL3. The family includes Kelch-like proteins, KLHL2 and KLHL3. KLHL2 is a novel actin-binding protein predominantly expressed in brain. It plays a role in the reorganization of the actin cytoskeleton, and promotes growth of cell projections in oligodendrocyte precursors. KLHL2 and KLHL3 each functions as a component of an E3 ubiquitin ligase complex that mediates the ubiquitination of target proteins. They contain a BTB domain and kelch repeat domains, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	121
349545	cd18236	BTB_POZ_KLHL6	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 6 (KLHL6). KLHL6 is a BTB-kelch protein with a lymphoid tissue-restricted expression pattern. It is involved in B-lymphocyte antigen receptor signaling and germinal center formation. It belongs to the KLHL gene family, which is composed of an N-terminal BTB-POZ domain and four to six Kelch motifs in tandem. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	129
349546	cd18237	BTB_POZ_KLHL7	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 7 (KLHL7). KLHL7 is a component of a Cul3-based E3 ubiquitin ligase complex and is involved in the ubiquitination of target proteins for proteasome-mediated degradation. Mutations in KLHL7 cause autosomal-dominant retinitis pigmentosa. It contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	126
349547	cd18238	BTB_POZ_KLHL8	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 8 (KLHL8). KLHL8 is a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin ligase complex required for the ubiquitination and degradation of rapsyn, a postsynaptic protein required for clustering of nicotinic acetylcholine receptors (nAChRs) at the neuromuscular junction. It contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	120
349548	cd18239	BTB_POZ_KLHL9_13	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like proteins KLHL9 and KLHL13. KLHL9 and KLHL13 (also called BTB and kelch domain-containing protein 2, or BKLHD2) are substrate-specific adaptors of a BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complex required for mitotic progression and cytokinesis. The BCR(KLHL9-KLHL13) E3 ubiquitin ligase complex mediates the ubiquitination of AURKB and controls the dynamic behavior of AURKB on mitotic chromosomes, thereby coordinating faithful mitotic progression and completion of cytokinesis. They contain a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	128
349549	cd18240	BTB_POZ_KLHL10	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 10 (KLHL10). KLHL10 is a substrate-specific adaptor of a CUL3-based E3 ubiquitin-protein ligase complex which mediates the ubiquitination and subsequent proteasomal degradation of target proteins specifically in the testis during spermatogenesis. Haploinsufficiency of Klhl10 causes infertility in male mice. KLHL10 contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	120
349550	cd18241	BTB_POZ_KLHL11	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 11 (KLHL11). KLHL11 is a component of a cullin-RING-based BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complex that mediates the ubiquitination of target proteins, leading most often to their proteasomal degradation. It contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	135
349551	cd18242	BTB_POZ_KLHL12_C3IP1_DKIR	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 12 (KLHL12). KLHL12, also called CUL3-interacting protein 1 (C3IP1) or DKIR homolog, is a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin ligase complex that acts as a negative regulator of the Wnt signaling pathway and ER-Golgi transport. It contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	124
349552	cd18243	BTB_POZ_KLHL14_printor	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 14 (KLHL14). KLHL14 is also called protein interactor of Torsin-1A (TOR1A), protein interactor of torsinA, or Printor. It is a novel torsinA-interacting protein that preferentially interacts with ATP-free form of TOR1A and is implicated in dystonia pathogenesis. It contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	130
349553	cd18244	BTB_POZ_KLHL15	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 15 (KLHL15). KLHL15 is a substrate-specific adaptor for the Cullin3 E3 ubiquitin-protein ligase complex that targets the serine/threonine-protein phosphatase 2A (PP2A) subunit PPP2R5B for ubiquitination and subsequent proteasomal degradation, thus promoting exchange with other regulatory subunits. It also plays a key role in DNA damage response, favoring DNA double-strand repair through error-prone non-homologous end joining (NHEJ) over error-free, RBBP8-mediated homologous recombination (HR), by targeting the DNA-end resection factor RBBP8/CtIP for ubiquitination and subsequent proteasomal degradation. KLHL15 contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	137
349554	cd18245	BTB_POZ_KLHL16_gigaxonin	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in gigaxonin. Gigaxonin, also called Kelch-like protein 16 (KLHL16), may be a cytoskeletal component that directly or indirectly plays an important role in neurofilament architecture. It may also act as a substrate-specific adaptor of an E3 ubiquitin-protein ligase complex which mediates the ubiquitination and subsequent proteasomal degradation of target proteins, including tubulin folding cofactor B (TBCB), microtubule-associated protein MAP1B, and glial fibrillary acidic protein (GFAP). Gigaxonin is mutated in giant axonal neuropathy. It contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	111
349555	cd18246	BTB_POZ_KLHL17_actinfilin	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 17 (KLHL17). KLHL17, also called actinfilin, is a substrate-recognition component of some cullin-RING-based BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complexes. It acts as a Cullin 3 (Cul3) substrate adaptor that links GLUR6 to the E3 ubiquitin-ligase complex, and mediates the ubiquitination and subsequent degradation of GLUR6. It may play a role in actin-based neuronal function. It contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	139
349556	cd18247	BTB_POZ_KLHL18	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 18 (KLHL18). KLHL18 acts as a substrate-specific adaptor for a Cullin3 E3 ubiquitin-protein ligase complex that regulates mitotic entry and ubiquitylates Aurora-A. It contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	116
349557	cd18248	BTB_POZ_KLHL19_KEAP1	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like ECH-associated protein 1 (KEAP1). KEAP1, also called cytosolic inhibitor of Nrf2 (INrf2) or Kelch-like protein 19 (KLHL19), is a redox-regulated substrate adaptor protein for a Cullin3-dependent ubiquitin ligase complex that targets NFE2L2/NRF2 for ubiquitination and degradation by the proteasome, thus resulting in the suppression of its transcriptional activity and the repression of antioxidant response element-mediated detoxifying enzyme gene expression. It contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	124
349558	cd18249	BTB_POZ_KLHL20_KLEIP	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 20 (KLHL20). KLHL20, also called Kelch-like ECT2-interacting protein (KLEIP) or Kelch-like protein X, is a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complex involved in interferon response and anterograde Golgi to endosome transport. KLHL20 plays a role in actin assembly at cell-cell contact sites of Madin-Darby canine kidney cells. It also controls endothelial migration and sprouting angiogenesis. It contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	128
349559	cd18250	BTB_POZ_KLHL21	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 21 (KLHL21). KLHL21 is a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complex required for efficient chromosome alignment and cytokinesis. The BCR(KLHL21) E3 ubiquitin ligase complex regulates localization of the chromosomal passenger complex (CPC) from chromosomes to the spindle midzone in anaphase and mediates the ubiquitination of aurora B. KLHL21 also targets IkappaB kinase-beta to regulate nuclear factor kappa-light chain enhancer of activated B cells (NF-kappaB) signaling negatively. It contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	124
349560	cd18251	BTB_POZ_KLHL22	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 22 (KLHL22). KLHL22 is a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin ligase complex required for chromosome alignment and localization of polo-like kinase 1 (PLK1) at kinetochores. The BCR(KLHL22) ubiquitin ligase complex mediates mono-ubiquitination of PLK1, leading to PLK1 dissociation from phosphoreceptor proteins and subsequent removal from kinetochores, allowing silencing of the spindle assembly checkpoint (SAC) and chromosome segregation. It contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	125
349561	cd18252	BTB_POZ_KLHL23	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 23 (KLHL23). KLHL23 overexpression is associated with increased cell proliferation and invasion in gastric cancer. Downregulation of KLHL23 is associated with invasion, metastasis, and poor prognosis of hepatocellular carcinoma and pancreatic cancer. KLHL23 contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	127
349562	cd18253	BTB_POZ_KLHL24_KRIP6	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 24 (KLHL24). KLHL24, also called kainate receptor-interacting protein for GluR6 (KRIP6) or protein DRE1, is necessary to maintain the balance between intermediate filament stability and degradation, a process that is essential for skin integrity. KLHL24 is a component of a BCR (BTB-CUL3-RBX1) E3 ubiquitin ligase complex that mediates ubiquitination of KRT14 and controls its levels during keratinocyte differentiation. It contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	121
349563	cd18254	BTB_POZ_KLHL25	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 25 (KLHL25). KLHL25, also called ectoderm-neural cortex protein 2 (ENC-2), is a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin ligase complex that is required for translational homeostasis. The BCR(KLHL25) ubiquitin ligase complex acts by mediating ubiquitination of hypophosphorylated EIF4EBP1 (4E-BP1). Cullin3-KLHL25 ubiquitin ligase also targets ATP-citrate lyase (ACLY), a key enzyme for lipid synthesis, for degradation to inhibit lipid synthesis and tumor progression. KLHL25 contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	128
349564	cd18255	BTB_POZ_KLHL26	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 26 (KLHL26). KLHL26 is encoded by the klhl26 gene, which is regulated by p53 via fuzzy tandem repeats. It contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	121
349565	cd18256	BTB_POZ_KLHL27_IPP	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in intracisternal A particle-promoted polypeptide (IPP). IPP, also called Kelch-like protein 27 (KLHL27) or actin-binding protein IPP, is an actin-binding protein that may play a role in organizing the actin cytoskeleton. It contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	125
349566	cd18257	BTB_POZ_KLHL28_BTBD5	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 28 (KLHL28). KLHL28, also called BTB/POZ domain-containing protein 5 (BTBD5), contains a BTB domain and kelch repeats, characteristics of a kelch family protein. Its function remains unclear. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	118
349567	cd18258	BTB_POZ_KLHL29_KBTBD9	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 29 (KLHL29). KLHL29 is also called Kelch repeat and BTB domain-containing protein 9 (KBTBD9). A novel fusion transcript NR5A2-KLHL29FT, resulting from transchromosomal insertion, may influence the origin or progression of colon cancer. KLHL29 contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	125
349568	cd18259	BTB_POZ_KLHL30	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 30 (KLHL30). KLHL30 contains a BTB domain and kelch repeats, characteristics of a kelch family protein. Its function remains unclear. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	137
349569	cd18260	BTB_POZ_KLHL31_KBTBD1	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 31 (KLHL31). KLHL31 is also called BTB and kelch domain-containing protein 6 (BKLHD6), Kelch repeat and BTB domain-containing protein 1 (KBTBD1), or Kelch-like protein KLHL. It is a transcriptional repressor in the MAPK/JNK signaling pathway to regulate cellular functions. Overexpression inhibits the transcriptional activities of both the TPA-response element (TRE) and serum response element (SRE). It is also a novel modulator of canonical Wnt signaling, which is important for vertebrate myogenesis. It contains a BTB domain and kelch repeats, characteristics of a kelch family protein. Its function remains unclear. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	120
349570	cd18261	BTB_POZ_KLHL32	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 32 (KLHL32). KLHL32, also called BTB and kelch domain-containing protein 5 (BKLHD5), contains a BTB domain and kelch repeats, characteristics of a kelch family protein. Its function remains unclear. Deletion of KLHL32 may be associated with Tourette syndrome and obsessive-compulsive disorder. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	133
349571	cd18262	BTB1_POZ_KLHL33	first BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 33 (KLHL33). KLHL33 contains BTB domains and kelch repeats, characteristics of a kelch family protein. Its function remains unclear. KLHL33 gene expression in normal and tumor tissue suggests a significant association with prostate cancer risk. KLHL33 contains two BTB domains. This model corresponds to the first BTB domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	101
349572	cd18263	BTB2_POZ_KLHL33	second BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 33 (KLHL33). KLHL33 contains BTB domains and kelch repeats, characteristics of a kelch family protein. Its function remains unclear. KLHL33 gene expression in normal and tumor tissue suggests a significant association with prostate cancer risk. KLHL33 contains two BTB domains. This model corresponds to the second BTB domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	118
349573	cd18264	BTB_POZ_KLHL34	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 34 (KLHL34). KLHL34 contains a BTB domain and kelch repeats, characteristics of a kelch family protein. Its function remains unclear. The methylation status of KLHL34 cg14232291 appears to be predictive of pathologic response to preoperative chemoradiation therapy in rectal cancer patients. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	136
349574	cd18265	BTB_POZ_KLHL35	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 35 (KLHL35). KLHL35 contains a BTB domain and kelch repeats, characteristics of a kelch family protein. Its function remains unclear. Significant differences in DNA methylation of the KLHL35 gene in abdominal aortic aneurysm (AAA) patients compared to non-AAA controls suggest a potential role in AAA pathology. Hypermethylation of the KLHL35 gene has also been associated with the development of hepatocellular carcinoma. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	128
349575	cd18266	BTB_POZ_KLHL36	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 36 (KLHL36). KLHL36 may act as a substrate-specific adaptor of an E3 ubiquitin-protein ligase complex which mediates the ubiquitination and subsequent proteasomal degradation of target proteins. It contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	135
349576	cd18267	BTB_POZ_KLHL37_ENC1	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Ectoderm-neural cortex protein 1 (ENC-1). ENC-1 is also called Kelch-like protein 37 (KLHL37), nuclear matrix protein NRP/B, or p53-induced gene 10 protein. It is an actin-binding nuclear matrix protein that associates with p110(RB), and is involved in the regulation of neuronal process formation and in differentiation of neural crest cells. It contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	147
349577	cd18268	BTB_POZ_KLHL38	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 38 (KLHL38). KLHL38 contains a BTB domain and kelch repeats, characteristics of a kelch family protein. Its function remains unclear. The KLHL38 gene is significantly up-regulated during diapause, a temporary arrest of development during early ontogeny. It may also function in preadipocyte differentiation in the chicken. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	129
349578	cd18269	BTB_POZ_KLHL40-like	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like proteins, KLHL40 and KLHL41. This family includes Kelch-like proteins, KLHL40 and KLHL41. KLHL40 is a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin ligase complex that acts as a key regulator of skeletal muscle development. KLHL41 is a novel kelch related protein that is involved in pseudopod elongation in transformed cells. They both contain a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	133
349579	cd18270	BTB_POZ_KBTBD2_BKLHD1	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch repeat and BTB domain-containing protein 2 (KBTBD2). KBTBD2, also called BTB and kelch domain-containing protein 1 (BKLHD1), plays an essential role in the regulation of insulin-signaling pathway. It is a BTB-Kelch family substrate recognition subunit of the Cullin-3-based E3 ubiquitin ligase, which targets p85alpha, the regulatory subunit of the phosphoinositol-3-kinase (PI3K) heterodimer, causing p85alpha ubiquitination and proteasome-mediated degradation. It contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	133
349580	cd18271	BTB_POZ_KBTBD3_BKLHD3	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch repeat and BTB domain-containing protein 3 (KBTBD3). KBTBD3, also called BTB and kelch domain-containing protein 3 (BKLHD3), contains a BTB domain and kelch repeats, characteristics of a kelch family protein. Its function remains unclear. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	130
349581	cd18272	BTB_POZ_KBTBD4	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch repeat and BTB domain-containing protein 4 (KBTBD4). KBTBD4, also called BTB and kelch domain-containing protein 4 (BKLHD4), is a BTB-BACK-Kelch domain protein belonging to a large family of cullin-RING ubiquitin ligase adaptors that facilitate the ubiquitination of target substrates. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	140
349582	cd18273	BTB_POZ_KBTBD6_7	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch repeat and BTB domain-containing proteins KBTBD6 and KBTBD7. KBTBD6 and KBTBD7 are substrate adaptors of a cullin-3 RING ubiquitin ligase complex that mediates ubiquitylation and proteasomal degradation of T-lymphoma and metastasis gene 1 (TIAM1), a RAC1-specific guanine exchange factor (GEF), by cooperating with gamma-aminobutyric acid receptor-associated proteins (GABARAP). KBTBD7 may also act as a new transcriptional activator in mitogen-activated protein kinase (MAPK) signaling. They both contain a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	142
349583	cd18274	BTB_POZ_KBTBD8	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch repeat and BTB domain-containing protein 8 (KBTBD8). KBTBD8, also called T-cell activation kelch repeat protein (TA-KRP), is a BTB-kelch family protein that is located in the Golgi apparatus and translocates to the spindle apparatus during mitosis. It acts as a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin ligase complex that acts as a regulator of neural crest specification. The BCR(KBTBD8) complex monoubiquitylates NOLC1 and its paralog TCOF1, the mutation of which underlies the neurocristopathy Treacher Collins syndrome. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	129
349584	cd18275	BTB_POZ_KBTBD11	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch repeat and BTB domain-containing protein 11 (KBTBD11). KBTBD11 is also called chronic myelogenous leukemia-associated protein (CMLAP) or Kelch domain-containing protein 7B, or KLHDC7C. It is a BTB-Kelch family protein whose function remains unclear. A novel polymorphism rs11777210 in KBTBD11 is significantly associated with colorectal cancer risk. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	104
349585	cd18276	BTB_POZ_KBTBD12	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch repeat and BTB domain-containing protein 12 (KBTBD12). KBTBD12, also called Kelch domain-containing protein 6 (KLHDC6), contains a BTB domain and kelch repeats, characteristics of a kelch family protein. Its function remains unclear. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	127
349586	cd18277	BTB_POZ_BACH1	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in BTB and CNC homolog 1 (BACH1). BACH1, also called BTB-basic leucine zipper transcription factor 1, belongs to the cap 'n' collar (CNC) and basic leucine zipper (bZIP) factor family. It can act as repressor or activator. BACH1 is a heme-responsive transcriptional repressor of heme oxygenase (HO)-1. It represses genes involved in heme metabolism and oxidative stress response. It is also a negative regulator of nuclear factor erythroid 2-related factor 2 (Nrf2) that controls antioxidant response elements (ARE)-dependent gene expressions. BACH1 binds to NF-E2 binding sites in vitro, and plays important roles in coordinating transcription activation and repression by MafK. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	120
349587	cd18278	BTB_POZ_BACH2	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in BTB and CNC homolog 2 (BACH2). BACH2, also called BTB-basic leucine zipper transcription factor 2, belongs to the cap 'n' collar (CNC) and basic leucine zipper (bZIP) factor family. BACH2 is a lymphoid-specific transcription factor with a prominent role in B-cell development. It is transcriptionally regulated by the BCR/ABL oncogene. It represses the anti-apoptotic factor heme oxygenase-1 (HO-1). It is also a potent general repressor of effector differentiation in naive T cells. Moreover, BACH2 is required for pulmonary surfactant homeostasis and alveolar macrophage function. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	124
349588	cd18279	BTB_POZ_SPOP-like	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in speckle-type POZ protein (SPOP) and similar proteins. This family includes speckle-type POZ protein (SPOP), speckle-type POZ protein-like (SPOPL), TD and POZ domain-containing proteins (TDPOZ), Drosophila melanogaster protein roadkill and similar proteins. Both SPOP and SPOPL serve as adaptors of cullin-RING-based BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complexes that mediate the ubiquitination and proteasomal degradation of target proteins. TDPOZ is a family of bipartite animal and plant proteins that contain a tumor necrosis factor receptor-associated factor (TRAF) domain (TD) and a POZ/BTB domain. TDPOZ proteins may be nuclear scaffold proteins probably involved in transcription regulation in early development and other cellular processes. Drosophila melanogaster protein roadkill, also called Hh-induced MATH and BTB domain-containing protein (HIB), is a hedgehog-induced BTB protein that modulates hedgehog signaling by degrading Ci/Gli transcription factor. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	120
349589	cd18280	BTB_POZ_BPM_plant	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in plant BTB/POZ-MATH (BPM) protein family. The BPM protein family includes Arabidopsis thaliana BTB/POZ and MATH domain-containing proteins, AtBPM1-6, and similar proteins from other plants. BPM protein, also called protein BTB-POZ and MATH domain, may act as a substrate-specific adaptor of an E3 ubiquitin-protein ligase complex (CUL3-RBX1-BTB) which mediates the ubiquitination and subsequent proteasomal degradation of target proteins. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	121
349590	cd18281	BTB_POZ_BTBD1_2	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in BTB/POZ domain-containing proteins, BTBD1 and BTBD2. This family includes BTB/POZ domain-containing proteins BTBD1 and BTBD2, both of which are BTB-domain-containing Kelch-like proteins that interact with DNA topoisomerase 1 (Topo1), a key enzyme of cell survival. BTBD1 and BTBD2 colocalize to cytoplasmic bodies with the RBCC/tripartite motif protein, TRIM5delta. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	127
349591	cd18282	BTB_POZ_BTBD3_6	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in BTB/POZ domain-containing proteins, BTBD3 and BTBD6. This family includes BTB/POZ domain-containing proteins BTBD3 and BTBD6, both of which are BTB-domain-containing Kelch-like proteins. BTBD3 controls dendrite orientation toward active axons in mammalian neocortex. BTBD6 is required for proper embryogenesis and plays an essential evolutionary conserved role during neuronal development. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	108
349592	cd18283	BTB1_POZ_BTBD7	first BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in BTB/POZ domain-containing protein 7 (BTBD7). BTBD7 is a crucial regulator that is essential for region-specific epithelial cell dynamics and branching morphogenesis. It has been implicated in various cancers. BTBD7 contains two BTB domains. This model corresponds to the first domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	92
349593	cd18284	BTB2_POZ_BTBD7	second BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in BTB/POZ domain-containing protein 7 (BTBD7). BTBD7 is a crucial regulator that is essential for region-specific epithelial cell dynamics and branching morphogenesis. It has been implicated in various cancers. BTBD7 contains two BTB domains. This model corresponds to the second domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	146
349594	cd18285	BTB1_POZ_BTBD8	first BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in BTB/POZ domain-containing protein 8 (BTBD8). BTBD8 is a BTB-domain-containing Kelch-like protein that may play a role in developmental processes. It may also act as a protein-protein adaptor in a transcription complex and thus be involved in brain development. BTBD8 contains two BTB domains. This model corresponds to the first domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	104
349595	cd18286	BTB2_POZ_BTBD8	second BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in BTB/POZ domain-containing protein 8 (BTBD8). BTBD8 is a BTB-domain-containing Kelch-like protein that may play a role in developmental processes. It may also act as a protein-protein adaptor in a transcription complex and thus be involved in brain development. BTBD8 contains two BTB domains. This model corresponds to the second domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	121
349596	cd18287	BTB_POZ_BTBD9	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in BTB/POZ domain-containing protein 9 (BTBD9). BTBD9 is a risk factor for Restless Legs Syndrome (RLS) encoding a Cullin-3 substrate adaptor. The BTBD9 gene may be associated with antipsychotic-induced RLS in schizophrenia. Mutations in BTBD9 lead to reduced dopamine, increased locomotion and sleep fragmentation. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	119
349597	cd18288	BTB_POZ_BTBD12_SLX4	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Structure-specific endonuclease subunit SLX4. SLX4, also called BTB/POZ domain-containing protein 12 (BTBD12), is a Holliday junction resolvase subunit that binds multiple DNA repair/recombination endonucleases and is required for DNA repair. Mutations of the SLX4 gene are found in Fanconi anemia. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	116
349598	cd18289	BTB_POZ_BTBD14A_NAC2	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in nucleus accumbens-associated protein 2 (NAC-2). NAC-2, also called BTB/POZ domain-containing protein 14A (BTBD14A) or repressor with BTB domain and BEN domain (RBB), is a novel transcription repressor through its association with the NuRD complex. It recruits the NuRD complex to the promoter of MDM2, leading to the repression of MDM2 transcription and subsequent stability of p53/TP53. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	131
349599	cd18290	BTB_POZ_BTBD14B_NAC1	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in nucleus accumbens-associated protein 1 (NAC-1). NAC-1, also called BTB/POZ domain-containing protein 14B (BTBD14B), is a transcriptional repressor that contributes to tumor progression, tumor cell proliferation, and survival. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	123
349600	cd18291	BTB_POZ_BTBD16	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in BTB/POZ domain-containing protein 16 (BTBD16). BTBD16 is a BTB domain-containing protein. Its function remains unclear. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	114
349601	cd18292	BTB_POZ_BTBD17	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in BTB/POZ domain-containing protein 17 (BTBD17). BTBD17, also called galectin-3-binding protein-like, is a BTB domain-containing protein. Its function remains unclear. It may be associated with hepatocellular carcinoma. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	114
349602	cd18293	BTB_POZ_BTBD18	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in BTB/POZ domain-containing protein 18 (BTBD18). BTBD18 acts as a specific controller for transcription activation through RNA polymerase II elongation at a subset of genomic PIWI-interacting RNA (piRNA) loci. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	120
349603	cd18294	BTB_POZ_BTBD19	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in BTB/POZ domain-containing protein 19 (BTBD19). BTBD19 is a BTB domain-containing protein. Its function remains unclear. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	111
349604	cd18295	BTB1_POZ_ABTB1_BPOZ1	first BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Ankyrin repeat and BTB/POZ domain-containing protein 1 (ABTB1). ABTB1, also called elongation factor 1A-binding protein or bood POZ containing gene type 1 (BPOZ-1), is an anti-proliferative factor that may act as a mediator of the phosphatase and tensin homolog (PTEN) growth-suppressive signaling pathway. It may play a role in developmental processes. ABTB1 contains an ankyrin repeat and two BTB domains. This model corresponds to the first BTB domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	119
349605	cd18296	BTB2_POZ_ABTB1_BPOZ1	second BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Ankyrin repeat and BTB/POZ domain-containing protein 1 (ABTB1). ABTB1, also called elongation factor 1A-binding protein or bood POZ containing gene type 1 (BPOZ-1), is an anti-proliferative factor that may act as a mediator of the phosphatase and tensin homolog (PTEN) growth-suppressive signaling pathway. It may play a role in developmental processes. ABTB1 contains an ankyrin repeat and two BTB domains. This model corresponds to the second BTB domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	121
349606	cd18297	BTB_POZ_ABTB2-like	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Ankyrin repeat and BTB/POZ domain-containing protein 2 (ABTB2) and similar proteins. This family includes ABTB2, BTBD11, plant ARM repeat protein interacting with ABF2 (ARIA), and similar proteins. ABTB2, also called bood POZ containing gene type 2 (BPOZ-2), is a scaffold protein that controls the degradation of many biological proteins ranging from embryonic development to tumor progression. It may be involved in the initiation of hepatocyte growth. ABTB2 functions as an adaptor protein for the E3 ubiquitin ligase scaffold protein Cullin-3. It directly binds to eukaryotic elongation factor 1A1 (eEF1A1) to promote eEF1A1 ubiquitylation and degradation, and prevent translation. The BTBD11 gene has been recently identified as an all-trans retinoic acid (atRA)-responsive gene that lies downstream of atRA and its receptors in the regulation of neurite outgrowth and cell adhesion in neural as well as non-neural tissues. ARIA is an armadillo (ARM) repeat and BTB domain-containing protein that acts as a positive regulator of ABA response via the modulation of the transcriptional activity of ABF2. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	117
349607	cd18298	BTB_POZ_RCBTB1_2	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in RCC1 and BTB domain-containing proteins, RCBTB1 and RCBTB2. The RCC1-related guanine nucleotide exchange factor (GEF) family includes RCC1 and BTB domain-containing proteins, RCBTB1 and RCBTB2, both of which are chromosome condensation regulator-like guanine nucleotide exchange factors. They contain an RCC1 repeat, a BTB domain, and a BACK domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	108
349608	cd18299	BTB1_POZ_RhoBTB	first BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Rho-related BTB domain-containing proteins (RhoBTB). RhoBTB proteins constitute a subfamily of atypical members within the Rho family of small guanosine triphosphatases (GTPases), which is characterized by containing a GTPase domain (in most cases, non-functional) followed by a proline-rich region, tandem BTB domains, and a conserved C-terminal region. In vertebrates, the RhoBTB subfamily consists of 3 isoforms: RhoBTB1, RhoBTB2, and RhoBTB3. Orthologs are present in several other eukaryotes, such as Drosophila and Dictyostelium, but have been lost in plants and fungi. This model corresponds to the first BTB domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	103
349609	cd18300	BTB2_POZ_RhoBTB	second BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Rho-related BTB domain-containing proteins (RhoBTB). RhoBTB proteins constitute a subfamily of atypical members within the Rho family of small guanosine triphosphatases (GTPases), which is characterized by containing a GTPase domain (in most cases, non-functional) followed by a proline-rich region, a tandem of 2 BTB domains, and a conserved C-terminal region. In humans, the RhoBTB subfamily consists of 3 isoforms: RhoBTB1, RhoBTB2, and RhoBTB3. Orthologs are present in several other eukaryotes, such as Drosophila and Dictyostelium, but have been lost in plants and fungi. This model corresponds to the second BTB domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	108
349610	cd18301	BTB1_POZ_IBtk	first BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in inhibitor of Bruton tyrosine kinase (IBtk). IBtk is an inhibitor or negative regulator of Bruton tyrosine kinase (Btk), which is required for B-cell differentiation and development. IBtk binds to the PH domain of Btk and down-regulates the Btk kinase activity. It contains two BTB domains. This model corresponds to the first BTB domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	99
349611	cd18302	BTB2_POZ_IBtk	second BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in inhibitor of Bruton tyrosine kinase (IBtk). IBtk is an inhibitor or negative regulator of Bruton tyrosine kinase (Btk), which is required for B-cell differentiation and development. IBtk binds to the PH domain of Btk and down-regulates the Btk kinase activity. It contains two BTB domains. This model corresponds to the second BTB domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	113
349612	cd18303	BTB_POZ_Rank-5	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in rabankyrin-5 (Rank-5). Rank-5, also called ankyrin repeat and FYVE domain-containing protein 1 (ANKFY1) or ankyrin repeats hooked to a zinc finger motif (ANKHZN), is a Rab5 effector that regulates and coordinates different endocytic mechanisms. It contains an N-terminal BTB domain, followed by a BACK domain, several ankyrin (ANK) repeats and a C-terminal FYVE domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	120
349613	cd18304	BTB_POZ_M2BP	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Mac-2-binding protein (MAC2BP/M2BP). M2BP is also called galectin-3-binding protein, basement membrane autoantigen p105, lectin galactoside-binding soluble 3-binding protein (LGALS3BP), or tumor-associated antigen 90K. It promotes integrin-mediated cell adhesion and may stimulate host defense against viruses and tumor cells. It contains a scavenger receptor cysteine-rich domain, followed by BTB and BACK domains. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	114
349614	cd18305	BTB_POZ_GCL	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Drosophila melanogaster protein germ cell-less (GCL) and similar proteins. GCL proteins are nuclear envelope proteins highly conserved between the mammalian and Drosophila orthologs. Drosophila melanogaster GCL is a key regulator required for the specification of pole cells and primordial germ cell formation in embryos. Both human germ cell-less protein-like 1 (GMCL1) and germ cell-less protein-like 2 (GMCL2), the latter also called germ cell-less protein-like 1-like (GMCL1P1 or GMCL1L), may function in spermatogenesis. They may also be substrate-specific adaptors of E3 ubiquitin-protein ligase complexes which mediate the ubiquitination and subsequent proteasomal degradation of target proteins. They contain BTB and BACK domains. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	115
349615	cd18306	BTB_POZ_NS1BP	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Influenza virus NS1A-binding protein (NS1-BP). NS1-BP is also called NS1-binding protein, aryl hydrocarbon receptor-associated protein 3 (ARA3), or IVNS1ABP. It is a novel protein that interacts with the influenza A virus nonstructural NS1 protein, which is relocalized in the nuclei of infected cells. It plays a role in cell division and in the dynamic organization of the actin skeleton as a stabilizer of actin filaments by association with F-actin through its kelch repeats. It also interacts with alpha-enolase/MBP-1 and is involved in c-Myc gene transcriptional control. NS1-BP contains BTB and BACK domains at the N-terminal region and kelch repeats at the C-terminal region. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	124
349616	cd18307	BTB_POZ_calicin	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in calicin. Calicin is a basic cytoskeletal protein involved in the formation and maintenance of the highly regular organization of the postacrosomal perinuclear theca, the calyx of mammalian spermatozoa. It contains BTB and BACK domains at the N-terminal region and kelch repeats at the C-terminal region. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	97
349617	cd18308	BTB1_POZ_LZTR1	first BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in leucine-zipper-like transcriptional regulator 1 (LZTR-1). LZTR-1 is a Golgi BTB-kelch protein that is degraded upon induction of apoptosis. It may also function as a transcriptional regulator that plays a crucial role in embryogenesis. Germline loss-of-function mutations in LZTR-1 predispose to an inherited disorder of multiple schwannomas. LZTR-1 contains two BTB domains. This model corresponds to the first domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	156
349618	cd18309	BTB2_POZ_LZTR1	second BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in leucine-zipper-like transcriptional regulator 1 (LZTR-1). LZTR-1 is a Golgi BTB-kelch protein that is degraded upon induction of apoptosis. It may also function as a transcriptional regulator that plays a crucial role in embryogenesis. Germline loss-of-function mutations in LZTR-1 predispose to an inherited disorder of multiple schwannomas. LZTR-1 contains two BTB domains. This model corresponds to the second domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	126
349619	cd18310	BTB_POZ_NPR_plant	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in plant regulatory proteins, NPR1-4, and similar proteins. NPR1 and NPR2 are essential for pathogenicity and the utilization of many nitrogen sources. NPR1 is also called nitrogen pathogenicity regulation protein NPR1, non-inducible immunity protein 1 (Nim1), nonexpresser of PR genes 1, or salicylic acid insensitive 1 (Sai1). It acts as a transcription coactivator that plays dual roles in regulating plant immunity. NPR3 and NPR4 are involved in negative regulation of defense responses against pathogens in plants. NPR proteins contain a BTB domain, DUF3420, ankyrin (ANK) repeats, and a conserved C-terminal domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	145
349620	cd18311	BTB_POZ_CP190-like	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Drosophila melanogaster centrosomal protein 190kD (CP190) and similar proteins. CP190 is a large, multi-domain protein, first identified as a centrosome protein with oscillatory localization over the course of the cell cycle. It has an essential function in the nucleus as a chromatin insulator. It is known to associate with the nuclear matrix, components of the RNAi machinery, active promoters and borders of the repressive chromatin domains. CP190 contains an N-terminal BTB domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	110
349621	cd18312	BTB_POZ_NPY3-like	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Arabidopsis thaliana protein naked pins in YUC mutants 3 (NPY3), Root phototropism protein 3 (RPT3), and similar proteins. NPY3 may play an essential role in auxin-mediated organogenesis and in root gravitropic responses in Arabidopsis. RPT3 is a signal transducer of the phototropic response and photo-induced movements. It is necessary for root and hypocotyl phototropisms, but not for the regulation of stomata opening. Proteins in this subfamily contain an N-terminal BTB domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	105
349622	cd18313	BTB_POZ_BT	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Arabidopsis thaliana BTB/POZ and TAZ domain-containing proteins, BT1-5. BT1-5 may act as substrate-specific adaptors of an E3 ubiquitin-protein ligase complex (CUL3-RBX1-BTB) which mediates the ubiquitination and subsequent proteasomal degradation of target proteins. They contain a BTB domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	102
349623	cd18314	BTB_POZ_trishanku-like	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Dictyostelium discoideum trishanku and similar proteins. Trishanku is a novel regulator required for normal morphogenesis and cell-type stability in Dictyostelium discoideum. It contains a BTB domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	96
349624	cd18315	BTB_POZ_BAB-like	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Drosophila melanogaster proteins bric-a-brac 1 (BAB1), bric-a-brac 2 (BAB2), modifier of mdg4 (doom), and similar proteins. BAB1 and BAB2 probably act as transcriptional regulators that are required for specification of the tarsal segment and are involved in antenna development. Doom is a product of the Drosophila mod(mdg4) gene. It induces apoptosis and binds to baculovirus inhibitor-of-apoptosis proteins. This subfamily also includes Drosophila melanogaster sex determination protein fruitless (FRU), protein jim lovell (LOV), zinc finger protein chinmo, transcription factor GAGA, transcription factor Ken, and longitudinals lacking proteins (LOLA). FRU probably acts as a transcriptional regulator that plays a role in male courtship behavior and sexual orientation, and enhances male-specific expression of takeout in brain-associated fat body. LOV, also called tyrosine kinase-related (TKR), has a regulatory role during midline cell development. Chinmo is a functional effector of the JAK/STAT pathway that regulates eye development, tumor formation, and stem cell self-renewal in Drosophila. GAGA is a transcriptional activator that functions by regulating chromatin structure. Ken, also termed protein Ken and Barbie, is a transcription factor required for Terminalia development. LOLA proteins are putative transcription factors required for axon growth and guidance in the central and peripheral nervous systems. Proteins in this subfamily contain a BTB domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	85
349625	cd18316	BTB_POZ_KCTD-like	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing proteins. The potassium channel tetramerization domain (KCTD) family proteins contain the BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD proteins play crucial roles in a variety of fundamental biological processes, such as protein ubiquitination and degradation, suppression of proliferation or transcription, cytoskeleton regulation, tetramerization and gating of ion channels and others. Some KCTD proteins are involved in protein ubiquitination as part of the CRL (Cullin RING Ligase) E3 ligases. Some others show Cullin-independent functions including binding and regulation of GABA (gamma-aminobutyric acid) receptors (KCTD8, KCTD12 and KCTD16) and inhibition of AP-2 function (KCTD15). KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization.	83
349626	cd18317	BTB_POZ_Kv	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in voltage-gated potassium (Kv) channels. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. This family includes several groups of alpha subunits such as KCNA/Kv1 family of Shaker-type Kv channels, KCNB/Kv2 family of Shab-type Kv channels, KCNC/Kv3 family of Shaw-type Kv channels, KCND/Kv4 family of Shal-type Kv channels, KCNF/Kv5 subfamily of Kv channels, KCNG/Kv6 subfamily of Kv channels, KCNV/Kv8 subfamily of Kv channels, and KCNS/Kv9 subfamily of Kv channels. Kv alpha subunits form functional homo- or hetero-tetrameric channels (typically with other alpha subunits from the same subfamily) through their BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif. KCNQ/Kv7 channels are not included in this family, since they do not contain a BTB/POZ domain.	82
349627	cd18318	BTB_POZ_KCTD20-like	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 20 (KCTD20) and BTB/POZ domain-containing protein 10 (BTBD10). KCTD20, also termed potassium channel tetramerization domain containing 20, is a positive regulator of Akt signaling. It may play an important role in regulating the death and growth of some non-nervous and nervous cells. BTBD10, also termed glucose metabolism-related protein 1 (GMRP1), plays a major role as an activator of AKT family members. It binds to Akt and protein phosphatase 2A (PP2A) and inhibits the PP2A-mediated dephosphorylation of Akt, thereby keeping Akt activated. It also plays a role in preventing motor neuronal death and accelerating the growth of pancreatic beta cells.	92
349628	cd18319	BTB_POZ_KLHL42	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 42 (KLHL42). KLHL42, also called Cullin-3-binding protein 9 (Ctb9) or Kelch domain-containing protein 5, is a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complex required for mitotic progression and cytokinesis. The BCR(KLHL42) E3 ubiquitin ligase complex mediates the ubiquitination and subsequent degradation of katanin (KATNA1). KLHL42 is involved in microtubule dynamics throughout mitosis. It contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	93
349629	cd18320	BTB_POZ_KBTBD13	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch repeat and BTB domain-containing protein 13 (KBTBD13). KBTBD13 is a muscle specific protein. Its autosomal dominant mutations may cause Nemaline Myopathy (NEM). KBTBD13 may act as a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin ligase complex that functions as a muscle specific ubiquitin ligase, thereby implicating the ubiquitin proteasome pathway in the pathogenesis of KBTBD13-associated NEM. KBTBD13 contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	83
349630	cd18321	BTB_POZ_EloC	BTB (Broad-Complex, Tramtrack and Bric a brac) /POZ (poxvirus and zinc finger) domain found in Elongin-C (EloC) and similar proteins. Elongin-C is also called elongin 15 kDa subunit, RNA polymerase II transcription factor SIII subunit C, SIII p15, or transcription elongation factor B polypeptide 1 (TCEB1). It is a component of SIII (also known as elongin), which is a general transcription elongation factor that increases the RNA polymerase II transcription elongation past template-encoded arresting sites. It forms a regulatory complex with subunit B or elongin-B (BC) that enhances the activity of the transcriptionally active subunit A. The BC complex also functions as an adaptor protein in the proteasomal degradation of target proteins via different E3 ubiquitin ligase complexes, including the von Hippel-Lindau ubiquitination complex CBC (VHL) and the suppressors of cytokine signaling (SOCS) box ubiquitin ligase family. Elongin-C belongs to the BTB/POZ domain family; the domain is a common protein-protein interaction motif of about 100 amino acids.	95
349631	cd18322	BTB_POZ_SKP1	BTB (Broad-Complex, Tramtrack and Bric a brac) /POZ (poxvirus and zinc finger) domain found in S-phase kinase-associated protein 1 (SKP1) and similar proteins. SKP1 is also called cyclin-A/CDK2-associated protein p19 (p19A), organ of Corti protein 2 (OCP-2), organ of Corti protein II (OCP-II), RNA polymerase II elongation factor-like protein, transcription elongation factor B polypeptide 1-like, or p19skp1. It is an essential component of the SCF (SKP1-CUL1-F-box protein) ubiquitin ligase complex, which mediates the ubiquitination of proteins involved in cell cycle progression, signal transduction and transcription. SKP1 serves as an adaptor protein that links the F-box protein to CUL1. SKP1 and CUL1 are invariant components of all SCF complexes, while  F-box proteins are variable substrate binding modules that determine specificity. SKP1 belongs to the BTB/POZ domain family; the domain is a common protein-protein interaction motif of about 100 amino acids.	120
349632	cd18323	BTB_POZ_ZBTB3	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 3 (ZBTB3). ZBTB3 is a transcription factor essential for cancer cell growth via the regulation of the reactive oxygen species (ROS) detoxification pathway. ZBTB3 contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	128
349633	cd18324	BTB_POZ_ZBTB18_RP58	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 18 (ZBTB18). ZBTB18 is also called 58 kDa repressor protein (RP58), transcriptional repressor RP58, translin-associated zinc finger protein 1 (TAZ-1), zinc finger protein 238 (ZNF238), or zinc finger protein C2H2-171. It is a sequence-specific transrepressor associated with heterochromatin. It plays a role in various developmental processes such as myogenesis and brain development. It specifically binds the consensus DNA sequence 5'-[AC]ACATCTG[GT][AC]-3' which contains the E box core, and acts by recruiting chromatin remodeling multiprotein complexes. ZBTB18 contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	147
349634	cd18325	BTB_POZ_ZBTB18_2-like	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 18.2 and similar proteins. This subfamily is composed of Xenopus laevis zinc finger and BTB domain-containing protein 18.2, encoded by the znf238.2.L gene, and similar proteins. Many proteins in this group are annotated as zinc finger and BTB domain-containing protein 42 (ZBTB42). However, characterized mammalian ZBTB42 does not belong to this subfamily. ZBTB18.2, like ZBTB18, functions as a transcriptional repressor that plays a role in various developmental processes. Members of this family contain a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	128
349635	cd18326	BTB_POZ_ZBTB7A	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 7A (ZBTB7A). ZBTB7A is also called factor binding IST protein 1 (FBI-1), factor that binds to inducer of short transcripts protein 1, HIV-1 1st-binding protein 1, Leukemia/lymphoma-related factor (LRF), POZ and Krueppel erythroid myeloid ontogenic factor, POK erythroid myeloid ontogenic factor, Pokemon, TTF-I-interacting peptide 21 (TIP21), or zinc finger protein 857A (ZNF857A). It is a transcription repressor of key glycolytic genes, including GLUT3, PFKP, and PKM, and its downregulation in human cancer contributes to tumor metabolism. It has been implicated in carcinogenesis and cell differentiation and development. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	120
349636	cd18327	BTB_POZ_ZBTB7B_ZBTB15	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 7B (ZBTB7B). ZBTB7B is also called Krueppel-related zinc finger protein cKrox, T-helper-inducing POZ/Krueppel-like factor, zinc finger and BTB domain-containing protein 15 (ZBTB15), zinc finger protein 67 (ZNF67), Zfp67, zinc finger protein 857B (ZNF857B), or zinc finger protein Th-POK. It is a transcriptional regulator of extracellular matrix gene expression. It plays widespread and critical roles in T-cell development, particularly as the master regulator of CD4 commitment. It also plays a role as a potent driver of brown fat development and thermogenesis, as well as cold-induced beige fat formation. ZBTB7B contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	127
349637	cd18328	BTB_POZ_ZBTB7C_ZBTB36	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 7C (ZBTB7C). ZBTB7C is also called affected by papillomavirus DNA integration in ME180 cells protein 1 (APM-1), zinc finger and BTB domain-containing protein 36 (ZBTB36), zinc finger protein 857C (ZNF857C), or kidney cancer-related POZ domain and Kruppel-like protein (Kr-POK). It is a transcriptional repressor with a pro-oncogenic role that relies upon binding to p53 and inhibition of its transactivation function. It may act as an important regulator of fatty acid synthesis and may induce rapid cancer cell proliferation by increasing palmitate synthesis. The ZBTB7C gene has been identified as a susceptibility gene to ischemic injury. ZBTB7C contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	120
349638	cd18329	BTB_POZ_ZBTB8A_BOZF1	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 8A (ZBTB8A). ZBTB8A, also called BTB/POZ and zinc-finger domain-containing factor or BTB/POZ and zinc-finger domains factor on chromosome 1 (BOZ-F1), is a novel proto-oncoprotein that stimulates cell proliferation. It binds to all the proximal GC boxes to repress transcription, and it inhibits p53 acetylation without affecting p53 stability. It may be involved in gastric adenocarcinoma cell differentiation, cancer invasion, and metastasis. ZBTB8A contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	116
349639	cd18330	BTB_POZ_ZBTB8B	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 8B (ZBTB8B). ZBTB8B may be involved in transcriptional regulation. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	113
349640	cd18331	BTB_POZ_ZBTB27_BCL6	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in B-cell lymphoma 6 protein (BCL-6). BCL-6 is also called B-cell lymphoma 5 protein (BCL-5), zinc finger and BTB domain-containing protein 27 (ZBTB27), protein LAZ-3, or zinc finger protein 51 (ZNF51). It is a transcriptional repressor mainly required for germinal center (GC) formation and antibody affinity maturation, which have different mechanisms of action specific to the lineage and biological functions. It represses its target genes by binding directly to the DNA sequence 5'-TTCCTAGAA-3' or indirectly by repressing the transcriptional activity of transcription factors. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	118
349641	cd18332	BTB_POZ_ZBTB28_BCL6B	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in B-cell CLL/lymphoma 6 member B protein (BCL6B). BCL6B is also called Bcl6-associated zinc finger protein, zinc finger protein 62, or zinc finger and BTB domain-containing protein 28 (ZBTB28). It is a sequence-specific transcriptional repressor in association with BCL-6. It may function in a narrow stage or be related to some events in the early B-cell development. BCL6B plays an important role as a potential tumor suppressor in gastric cancer; it is found preferentially methylated in gastric cancer. It also inhibits both colorectal cancer growth and hepatocellular carcinoma metastases. BCL6B contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	114
349642	cd18333	BTB_POZ_ZBTB29_HIC1	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in hypermethylated in cancer 1 protein (Hic-1). Hic-1, also called zinc finger and BTB domain-containing protein 29 (ZBTB29), is a sequence-specific transcriptional repressor that recognizes and binds to the consensus sequence 5'-[CG]NG[CG]GGGCA[CA]CC-3'. It may act as a tumor suppressor, and is involved in regulatory loops modulating P53-dependent and E2F1-dependent cell survival, growth control, and stress responses. It also regulates intestinal immunity and homeostasis. Hic-1 contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	121
349643	cd18334	BTB_POZ_ZBTB30_HIC2	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in hypermethylated in cancer 2 protein (Hic-2). Hic-2 is also called HIC1-related gene on chromosome 22 protein (HRG22), Hic-3, or zinc finger and BTB domain-containing protein 30 (ZBTB30). It is a homolog of tumor suppressor Hic-1. It functions as a transcriptional regulator. It is a dosage-dependent regulator of cardiac development located within the distal 22q11 deletion syndrome region. Hic-2 contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	120
349644	cd18335	BTB_POZ_KLHL1	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 1 (KLHL1). KLHL1 is a neuronal actin-binding protein that modulates voltage-gated CaV2.1 (P/Q-type) and CaV3.2 (alpha1H T-type) calcium channels. It may play a role in organizing the actin cytoskeleton of brain cells. KLHL1 contains a BTB domain and kelch repeat domains, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	126
349645	cd18336	BTB_POZ_KLHL4	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 4 (KLHL4). KLHL4 shares high identity and similarity with the Drosophila kelch protein, a component of ring canals. It may be associated with X-linked cleft palate (CPX) and is also a candidate gene in the impairment of mullerian duct development. In addition, it has been identified as a target of insulin-like growth factor binding protein 5 (IGFBP5). KLHL4 contains a BTB domain and kelch repeat domains, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	126
349646	cd18337	BTB_POZ_KLHL5	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 5 (KLHL5). KLHL5 shares high identity and similarity with the Drosophila kelch protein, a component of ring canals. It is abundantly expressed in the ovary, adrenal gland, and thymus. It contains a BTB domain and kelch repeat domains, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	130
349647	cd18338	BTB_POZ_KLHL2_Mayven	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 2 (KLHL2). KLHL2, also called actin-binding protein Mayven, is a novel actin-binding protein predominantly expressed in the brain. It plays a role in the reorganization of the actin cytoskeleton, and promotes growth of cell projections in oligodendrocyte precursors. KLHL2 is a component of a cullin-RING-based BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complex that mediates the ubiquitination of target proteins, such as NPTXR, leading most often to their proteasomal degradation. It contains a BTB domain and kelch repeat domains, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	121
349648	cd18339	BTB_POZ_KLHL3	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 3 (KLHL3). KLHL3 is a component of an E3 ubiquitin ligase complex that regulates blood pressure by targeting With-No-Lysine (WNK) kinases for degradation. It contains a BTB domain and kelch repeat domains, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	121
349649	cd18340	BTB_POZ_KLHL40_KBTBD5	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 40 (KLHL40). KLHL40, also called Kelch repeat and BTB domain-containing protein 5 (KBTBD5) or sarcosynapsin, is a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin ligase complex that acts as a key regulator of skeletal muscle development. Mutations in KLHL40 may cause severe autosomal-recessive nemaline myopathy. KLHL40 contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	134
349650	cd18341	BTB_POZ_KLHL41_KBTBD10	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Kelch-like protein 41 (KLHL41). KLHL41 is also called Kel-like protein 23, Kelch repeat and BTB domain-containing protein 10 (KBTBD10), Kelch-related protein 1 (Krp1), or sarcosin. It is a novel kelch-related protein that is involved in pseudopod elongation in transformed cells. It is also involved in skeletal muscle development and differentiation. It regulates proliferation and differentiation of myoblasts and plays a role in myofibril assembly by promoting lateral fusion of adjacent thin fibrils into mature, wide myofibrils. It contains a BTB domain and kelch repeats, characteristics of a kelch family protein. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	133
349651	cd18342	BTB_POZ_SPOP	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in speckle-type POZ protein (SPOP). SPOP, also called HIB homolog 1 or Roadkill homolog 1, is a novel nuclear speckle-type protein which serves as an adaptor of a cullin-RING-based BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complex that mediates the ubiquitination and proteasomal degradation of target proteins, such as BRMS1, DAXX, PDX1/IPF1, GLI2 and GLI3. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	125
349652	cd18343	BTB_POZ_SPOPL	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in speckle-type POZ protein-like (SPOPL). SPOPL, also called HIB homolog 2 or Roadkill homolog 2, is a component of cullin-RING-based BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complexes that mediate the ubiquitination and subsequent proteasomal degradation of target proteins. The complexes may contain homodimeric SPOPL or the heterodimers formed by speckle-type POZ protein (SPOP) and SPOPL, which are less efficient than ubiquitin ligase complexes containing only SPOP. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	123
349653	cd18344	BTB_POZ_TDPOZ	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in TD and POZ domain-containing proteins (TDPOZ). TDPOZ is a family of bipartite animal and plant proteins that contains a tumor necrosis factor receptor-associated factor (TRAF) domain (TD) and a POZ/BTB domain. TDPOZ proteins may be nuclear scaffold proteins probably involved in transcription regulation in early development and other cellular processes. This subfamily contains only mammalian members. Plant TDPOZ proteins contain a MATH domain at the N-terminal region and are named "BTB/POZ and MATH domain-containing proteins (BPM)", not included in this subfamily. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	128
349654	cd18345	BTB_POZ_roadkill-like	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Drosophila melanogaster protein roadkill and similar proteins. Drosophila melanogaster protein roadkill, also called Hh-induced MATH and BTB domain-containing protein (HIB), is a hedgehog-induced BTB protein that modulates hedgehog signaling by degrading Ci/Gli transcription factor. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	121
349655	cd18346	BTB_POZ_BTBD1	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in BTB/POZ domain-containing protein 1 (BTBD1). BTBD1, also called Hepatitis C virus NS5A-transactivated protein 8 or HCV NS5A-transactivated protein 8, is a BTB-domain-containing Kelch-like protein that is expressed in skeletal muscle and interacts with DNA topoisomerase 1 (Topo1), a key enzyme of cell survival. BTBD1 and BTBD2 colocalize to cytoplasmic bodies with the RBCC/tripartite motif protein, TRIM5delta. BTBD1 may serve as substrate-specific adaptor of an E3 ubiquitin-protein ligase complex that mediates the ubiquitination and subsequent proteasomal degradation of target proteins. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	133
349656	cd18347	BTB_POZ_BTBD2	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in BTB/POZ domain-containing protein 2 (BTBD2). BTBD2 is a BTB-domain-containing Kelch-like protein that interacts with DNA topoisomerase 1 (Topo1), a key enzyme of cell survival. BTBD1 and BTBD2 colocalize to cytoplasmic bodies with the RBCC/tripartite motif protein, TRIM5delta. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	127
349657	cd18348	BTB_POZ_BTBD3	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in BTB/POZ domain-containing protein 3 (BTBD3). BTBD3 is a BTB-domain-containing Kelch-like protein that controls dendrite orientation toward active axons in the mammalian neocortex. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	131
349658	cd18349	BTB_POZ_BTBD6	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in BTB/POZ domain-containing protein 6 (BTBD6). BTBD6, also termed lens BTB domain protein, is a BTB-domain-containing Kelch-like protein that is required for proper embryogenesis and plays an essential, evolutionarily conserved role during neuronal development. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	109
349659	cd18350	BTB_POZ_ABTB2_BPOZ2	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Ankyrin repeat and BTB/POZ domain-containing protein 2 (ABTB2). ABTB2, also called bood POZ containing gene type 2 (BPOZ-2), is a scaffold protein that controls the degradation of many biological proteins with various functions ranging from embryonic development to tumor progression. It may be involved in the initiation of hepatocyte growth. It inhibits the aggregation of alpha-synuclein, with implications for Parkinson's disease. ABTB2 functions as an adaptor protein for the E3 ubiquitin ligase scaffold protein Cullin-3. It directly binds to eukaryotic elongation factor 1A1 (eEF1A1) to promote eEF1A1 ubiquitylation and degradation, and prevent translation. It is also involved in the growth suppressive effect of the phosphatase and tensin homolog (PTEN). It contains an ankyrin repeat, BTB/POZ, and BACK domains. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	134
349660	cd18351	BTB_POZ_BTBD11	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in BTB/POZ domain-containing protein 11 (BTBD11). BTBD11, also called ankyrin repeat and BTB/POZ domain-containing protein BTBD11, is a BTB-domain-containing protein. The BTBD11 gene has been recently identified as an all-trans retinoic acid (atRA)-responsive gene that lies downstream of atRA and its receptors in the regulation of neurite outgrowth and cell adhesion in neural as well as non-neural tissues. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	131
349661	cd18352	BTB_POZ_ARIA_plant	BTB (Broad-Complex, Tramtrack and Bric a brac) /POZ (poxvirus and zinc finger) domain found in plant ARM repeat protein interacting with ABF2 (ARIA) and similar proteins. ARIA is an armadillo (ARM) repeat and BTB domain-containing protein that acts as a positive regulator of ABA response via the modulation of the transcriptional activity of ABF2, a transcription factor which controls ABA-dependent gene expression via the G-box-type ABA-responsive elements. ARIA is a novel abscisic acid signaling component. It negatively regulates seed germination and young seedling growth. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	116
349662	cd18353	BTB_POZ_RCBTB1_CLLD7	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in RCC1 and BTB domain-containing protein 1 (RCBTB1). RCBTB1 is also called chronic lymphocytic leukemia deletion region gene 7 protein (CLLD7), CLL deletion region gene 7 protein, regulator of chromosome condensation and BTB domain-containing protein 1, or E4.5. It is a novel chromosome condensation regulator-like guanine nucleotide exchange factor that may be involved in cell cycle regulation by chromatin remodeling. It may also function as a tumor suppressor that regulates pathways of DNA damage/repair and apoptosis. RCBTB1 may also be a substrate adaptor for a cullin3 (CUL3) E3 ligase complex that mediates the ubiquitination and subsequent proteasomal degradation of target proteins. Biallelic mutations in RCBTB1 may cause isolated and syndromic retinal dystrophy. It contains an RCC1 repeat, a BTB domain, and a BACK domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	117
349663	cd18354	BTB_POZ_RCBTB2_CHC1L	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in RCC1 and BTB domain-containing protein 2 (RCBTB2). RCBTB2 is also called chromosome condensation 1-like (CHC1-L), RCC1-like G exchanging factor, or regulator of chromosome condensation and BTB domain-containing protein 2. It is a chromosome condensation regulator-like guanine nucleotide exchange factor (GEF) protein for the Ras-related GTPase Ran. It contains an RCC1 repeat, a BTB domain, and a BACK domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	117
349664	cd18355	BTB1_POZ_RHOBTB1	first BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Rho-related BTB domain-containing protein 1 (RhoBTB1). RhoBTB1 is an atypical Rho family small guanosine triphosphatase (GTPase) and is a member of the RhoBTB subfamily, which is characterized by containing a GTPase domain (in most cases, non-functional) followed by a proline-rich region, tandem BTB domains, and a conserved C-terminal region. The carboxyl terminal extension that harbors two BTB domains is capable of assembling cullin 3-dependent ubiquitin ligase complexes. RhoBTB1 functions as a tumor suppressor that regulates the integrity of the Golgi complex through the methyltransferase METTL7B. It also acts as an adaptor of the Cullin-3-dependent E3 ubiquitin ligase complex. This model corresponds to the first BTB domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	146
349665	cd18356	BTB1_POZ_RHOBTB2	first BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Rho-related BTB domain-containing protein 2 (RhoBTB2). RhoBTB2, also called Deleted in breast cancer 2 gene protein (DBC2) or p83, is an atypical Rho family small guanosine triphosphatase (GTPase) and is a member of the RhoBTB subfamily, which is characterized by containing a GTPase domain (in most cases, non-functional) followed by a proline-rich region, tandem BTB domains, and a conserved C-terminal region. The carboxyl terminal extension that harbors two BTB domains is capable of assembling cullin 3-dependent ubiquitin ligase complexes. RhoBTB2 functions as a tumor suppressor that regulates the expression of the methyltransferase METTL7A. It also acts as an adaptor of the Cullin-3-dependent E3 ubiquitin ligase complex. This model corresponds to the first BTB domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	148
349666	cd18357	BTB1_POZ_RHOBTB3	first BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Rho-related BTB domain-containing protein 3 (RhoBTB3). RhoBTB3 is an atypical Rho family small guanosine triphosphatase (GTPase) and is a member of the RhoBTB subfamily, which is characterized by containing a GTPase domain (in most cases, non-functional) followed by a proline-rich region, tandem BTB domains, and a conserved C-terminal region. The carboxyl terminal extension that harbors two BTB domains is capable of assembling cullin 3-dependent ubiquitin ligase complexes. RhoBTB3 is a Golgi-associated Rho-related ATPase that regulates the S/G2 transition of the cell cycle by targeting cyclin E for ubiquitylation. It is involved in vesicle trafficking and in targeting proteins for degradation in the proteasome. It binds directly to Rab9 GTPase and functions with Rab9 in protein transport from endosomes to the trans Golgi network. It also promotes proteasomal degradation of Hypoxia-inducible factor alpha (HIFalpha) through facilitating hydroxylation and suppresses the Warburg effect. This model corresponds to the first BTB domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	159
349667	cd18358	BTB2_POZ_RHOBTB1	second BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Rho-related BTB domain-containing protein 1 (RhoBTB1). RhoBTB1 is an atypical Rho family small guanosine triphosphatase (GTPase) and is a member of the RhoBTB subfamily, which is characterized by containing a GTPase domain (in most cases, non-functional) followed by a proline-rich region, tandem BTB domains, and a conserved C-terminal region. The carboxyl terminal extension that harbors two BTB domains is capable of assembling cullin 3-dependent ubiquitin ligase complexes. RhoBTB1 functions as a tumor suppressor that regulates the integrity of the Golgi complex through the methyltransferase METTL7B. It also acts as an adaptor of the Cullin-3-dependent E3 ubiquitin ligase complex. This model corresponds to the second BTB domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	126
349668	cd18359	BTB2_POZ_RHOBTB2	second BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Rho-related BTB domain-containing protein 2 (RhoBTB2). RhoBTB2, also called Deleted in breast cancer 2 gene protein (DBC2) or p83, is an atypical Rho family small guanosine triphosphatase (GTPase) and is a member of the RhoBTB subfamily, which is characterized by containing a GTPase domain (in most cases, non-functional) followed by a proline-rich region, tandem BTB domains, and a conserved C-terminal region. The carboxyl terminal extension that harbors two BTB domains is capable of assembling cullin 3-dependent ubiquitin ligase complexes. RhoBTB2 functions as a tumor suppressor that regulates the expression of the methyltransferase METTL7A. It also acts as an adaptor of the Cullin-3-dependent E3 ubiquitin ligase complex. This model corresponds to the second BTB domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	124
349669	cd18360	BTB2_POZ_RHOBTB3	second BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Rho-related BTB domain-containing protein 3 (RhoBTB3). RhoBTB3 is an atypical Rho family small guanosine triphosphatase (GTPase) and is a member of the RhoBTB subfamily, which is characterized by containing a GTPase domain (in most cases, non-functional) followed by a proline-rich region, tandem BTB domains, and a conserved C-terminal region. The carboxyl terminal extension that harbors two BTB domains is capable of assembling cullin 3-dependent ubiquitin ligase complexes. RhoBTB3 is a Golgi-associated Rho-related ATPase that regulates the S/G2 transition of the cell cycle by targeting cyclin E for ubiquitylation. It is involved in vesicle trafficking and in targeting proteins for degradation in the proteasome. It binds directly to Rab9 GTPase and functions with Rab9 in protein transport from endosomes to the trans Golgi network. It also promotes proteasomal degradation of Hypoxia-inducible factor alpha (HIFalpha) through facilitating hydroxylation and suppresses the Warburg effect. This model corresponds to the second BTB domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	110
349670	cd18361	BTB_POZ_KCTD1-like	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing proteins, KCTD1 and KCTD15. This subfamily of KCTD proteins includes KCTD1 and KCTD15. KCTD1 is a nuclear BTB/POZ domain-containing protein that acts as a transcriptional repressor and mediates protein-protein interactions through a BTB domain. It represses the transcriptional activity of AP-2 family members, including TFAP2A, TFAP2B and TFAP2C. It also functions as a novel inhibitor of the Wnt signaling pathway. Mutations in KCTD1 cause scalp-ear-nipple (SEN) syndrome. KCTD15 is a BTB/POZ domain-containing protein that plays a role in the regulation of neural crest (NC) formation and other steps in embryonic development. It inhibits AP2 transcriptional activity by interaction with its activation domain. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. The KCTD1 BTB domains form pentamers.	94
349671	cd18362	BTB_POZ_KCTD2-like	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing proteins KCTD2, KCTD5, and KCTD17, and similar proteins. This subfamily includes potassium channel tetramerization domain-containing proteins KCTD2, KCTD5, and KCTD17, all of which function as adaptors of Cullin3-based E3 ubiquitin ligases. KCTD2 suppresses gliomagenesis by destabilizing c-Myc. KCTD5 is a negative regulator of the AKT pathway, a key signaling cascade frequently deregulated in cancer. KCTD5 does not impact the operation of Kv4.2, Kv3.4, Kv2.1, or Kv1.2 channels. KCTD17 polyubiquitylates trichoplein, a protein involved in ciliogenesis down-regulation. It is a positive regulator of ciliogenesis, playing a crucial role in the initial steps of axoneme extension. A missense mutation in KCTD17 causes autosomal dominant myoclonus-dystonia (M-D). The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. KCTD5 and KCTD17 BTB domains form pentamer structures.	85
349672	cd18363	BTB_POZ_KCTD3-like	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 3 (KCTD3) and SH3KBP1-binding protein 1 (SHKBP1). The group of KCTD proteins includes KCTD3 and SHKBP1. KCTD3, also called renal carcinoma antigen NY-REN-45, is a BTB/POZ domain-containing protein that is an accessory subunit of potassium/sodium hyperpolarization-activated cyclic nucleotide-gated channel 3 (HCN3), upregulating its cell-surface expression and current density without affecting its voltage dependence and kinetics. SHKBP1, also called SETA-binding protein 1, interacts with cathepsin B and participates in tumor necrosis factor (TNF)-induced apoptosis in ovarian cancer cells. It can promote epidermal growth factor receptor (EGFR) signaling by interrupting c-Cbl-CIN85 complex and inhibiting EGFR degradation. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization.	86
349673	cd18364	BTB_POZ_KCTD4	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 4 (KCTD4). KCTD4 is a BTB/POZ domain-containing protein with an unknown biological function. KCTD proteins play crucial roles in a variety of fundamental biological processes, such as protein ubiquitination and degradation, suppression of proliferation or transcription, cytoskeleton regulation, tetramerization and gating of ion channels and others. Some KCTD proteins are involved in protein ubiquitination as part of the CRL (Cullin RING Ligase) E3 ligases. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization.	86
349674	cd18365	BTB_POZ_KCTD6_like	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing proteins, KCTD6, KCTD21 and similar proteins. KCTD6, also called KCASH3 (KCTD containing, Cullin3 adaptor, suppressor of Hedgehog 3), is a substrate-specific adaptor of cullin-3, effectively regulating protein levels of the muscle small ankyrin-1 isoform 5 (sAnk1.5). KCTD21, also called KCASH2, functions as a substrate-specific adaptor of cullin-3, promoting the ubiquitination and degradation of histone deacetylase HDAC1, thereby inhibiting the deacetylation-mediated transcriptional activation of the Hedgehog effectors Gli1 and Gli2. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization.	94
349675	cd18366	BTB_POZ_KCTD7	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 7 (KCTD7). KCTD7 is a BTB/POZ domain-containing protein that has an impact on K+ fluxes, neurotransmitter synthesis, and neuronal function. It functions as a regulator of potassium conductance in neurons, and is involved in the control of excitability of cortical neurons. Mutations in KCTD7 may cause progressive myoclonus epilepsy (PME). The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization.	92
349676	cd18367	BTB_POZ_KCTD8-like	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing proteins, KCTD8, KCTD12, KCTD16 and similar proteins. This subfamily of KCTD proteins includes KCTD8, KCTD12 (also called predominantly fetal expressed T1 domain/Pfetin), and KCTD16. They act as auxiliary subunits of GABAB receptors associated with mood disorders. KCTD8 interacts as a tetramer with GABRB1 and GABRB2. KCTD12 regulates agonist potency and kinetics of GABAB receptor signaling. It promotes tumorigenesis by facilitating CDC25B/CDK1/Aurora A-dependent G2/M transition. KCTD16 interacts with amyloid beta precursor protein (APP), a type I transmembrane protein involved in a variety of cellular processes such as cell adhesion, and axon guidance. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization.	100
349677	cd18368	BTB_POZ_KCTD9	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 9 (KCTD9). KCTD9 is a BTB/POZ domain-containing protein that contributes to liver injury through NK cell activation during hepatitis B virus-induced acute-on-chronic liver failure. It functions as a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complex, which mediates the ubiquitination of target proteins, leading to their degradation by the proteasome. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. The KCTD9 BTB domain forms a pentameric structure.	100
349678	cd18369	BTB_POZ_KCTD10-like_BACURD	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing proteins, KCTD10 (BACURD3), KCTD13 (BACURD1), and TNFAIP1 (BACURD2). This subfamily of KCTD proteins, also called the BTB/POZ domain-containing adapter for CUL3-mediated RhoA degradation protein (BACURD) subfamily, includes KCTD10 (BACURD3), KCTD13 (BACURD1), and TNFAIP1 (BACURD2). KCTD10 is a BTB/POZ domain-containing protein that interacts with proliferating cell nuclear antigen (PCNA) and polymerase delta, and participates in DNA repair, DNA replication, and cell-cycle control. Its down-regulation could inhibit cell proliferation. KCTD10 also plays crucial roles in embryonic angiogenesis and heart development in mammals by negatively regulating the Notch signaling pathway. KCTD13 is a BTB/POZ domain-containing protein that may function as a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complex that mediates the ubiquitination of RhoA, leading to its degradation by the proteasome, thereby regulating the actin cytoskeleton and cell migration. TNFAIP1, also called protein B12, is a BTB/POZ domain-containing protein that is involved in DNA replication, DNA damage repair, cell apoptosis, and is implicated in human diseases including cancer, Alzheimer's disease (AD) and type 2 diabetic nephropathy. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. KCTD10 and KCTD13 BTB domains form a novel two-fold symmetric tetramer that is distinct from the tetramer formed by voltage-gated potassium (Kv) channels.	91
349679	cd18370	BTB_POZ_KCTD11	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein KCTD11. KCTD11 may function as an antagonist of the Hedgehog pathway of cell proliferation and differentiation by affecting the nuclear transfer of transcription factor GLI1, thus maintaining cerebellar granule cells in the undifferentiated state. It is a probable substrate-specific adapter for a BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complex towards HDAC1. It contains a BTB/POZ domain; in some cases the domain may be truncated. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. Variants of the human/mouse KCTD11 appear to contain truncated BTB/POZ domains.	88
349680	cd18371	BTB_POZ_KCTD14	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 14 (KCTD14). KCTD14 is a BTB/POZ domain-containing protein with unknown biological function. KCTD proteins play crucial roles in a variety of fundamental biological processes, such as protein ubiquitination and degradation, suppression of proliferation or transcription, cytoskeleton regulation, tetramerization and gating of ion channels and others. Some KCTD proteins are involved in protein ubiquitination as part of the CRL (Cullin RING Ligase) E3 ligases. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization.	99
349681	cd18372	BTB_POZ_KCTD18	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 18 (KCTD18). KCTD18 is a BTB/POZ domain-containing protein with unknown biological function. A duplication of the KCTD18 gene has been found in a patient with epilepsy, developmental delay, and autistic behavior, which may contribute to the phenotype. KCTD proteins play crucial roles in a variety of fundamental biological processes, such as protein ubiquitination and degradation, suppression of proliferation or transcription, cytoskeleton regulation, tetramerization and gating of ion channels and others. Some KCTD proteins are involved in protein ubiquitination as part of the CRL (Cullin RING Ligase) E3 ligases. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization.	101
349682	cd18373	BTB1_POZ_KCTD19	first BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 19 (KCTD19). KCTD19 is a BTB/POZ domain-containing protein with unclear biological function. It may be a host factor involved in Nef-induced downregulation of MHC-I. Nef is a HIV-1-encoded protein that plays a key role in the development of AIDS. KCTD19 contains two BTB domains. This model corresponds to the first domain. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization.	98
349683	cd18374	BTB2_POZ_KCTD19	second BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 19 (KCTD19). KCTD19 is a BTB/POZ domain-containing protein with unclear biological function. It may be a host factor involved in Nef-induced downregulation of MHC-I. Nef is a HIV-1-encoded protein that plays a key role in the development of AIDS. KCTD19 contains two BTB domains. This model corresponds to the second domain. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization.	99
349684	cd18375	BTB_POZ_KCNRG	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel regulatory protein (KCNRG). KCNRG, also called potassium channel regulator or protein CLLD4, is an endoplasmic reticulum (ER)-associated tumor suppressor that regulates Kv1 family potassium channel proteins by retaining a fraction of the channels in endomembranes. It contains a BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization.	97
349685	cd18376	BTB_POZ_FIP2-like	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Arabidopsis thaliana FH protein interacting protein FIP2 and similar proteins. FIP2 may act as a substrate-specific adaptor of an E3 ubiquitin-protein ligase complex (CUL3-RBX1-BTB) which mediates the ubiquitination and subsequent proteasomal degradation of target proteins. It contains a BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization.	89
349686	cd18377	BTB_POZ_Kv1_KCNA	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in KCNA/Kv1 subfamily of Shaker-type voltage-dependent potassium channels. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. The potassium voltage-gated channel subfamily Kv1, also known as subfamily A, contains eight alpha subunit members, Kv1.1 (KCNA1), Kv1.2 (KCNA2), Kv1.3 (KCNA3), Kv1.4 (KCNA4), Kv1.5 (KCNA5), Kv1.6 (KCNA6), Kv1.7 (KCNA7), and Kv1.8 (KCNA10), which are orthologs of the Shaker gene in Drosophila. They are delayed rectifiers except for Kv1.4 (KCNA4), which is an A-type potassium channel. Delayed rectifiers are slow opening and closing voltage-gated potassium channels. Because of their delayed activation kinetics, they play an important role in controlling action potential duration. A-type channels are fast/rapidly inactivating potassium channels. Kv1/KCNA subfamily alpha subunits form functional homo- or hetero-tetrameric channels (with other Kv1/KCNA alpha subunits) through their BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif.	85
349687	cd18378	BTB_POZ_Kv2_KCNB	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in KCNB/Kv2 subfamily of Shab-type voltage-dependent potassium channels. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. The potassium voltage-gated channel subfamily Kv2, also known as subfamily B, contains two alpha subunit members, Kv2.1 (KCNB1) and Kv2.2 (KCNB2), which are orthologs of the Shab gene in Drosophila. They mediate delayed-rectifier potassium currents in various neurons, although their physiological roles often remain elusive. Kv2/KCNB subfamily alpha subunits form functional homo- or hetero-tetrameric channels (with other alpha subunits) through their BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif.	109
349688	cd18379	BTB_POZ_Kv3_KCNC	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in KCNC/Kv3 subfamily of Shaw-type voltage-dependent potassium channels. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. The potassium voltage-gated channel subfamily Kv3, also known as subfamily C, contains four alpha subunit members, Kv3.1 (KCNC1), Kv3.2 (KCNC2), Kv3.3 (KCNC3), and Kv3.4 (KCNC4), which are orthologs of the Shaw gene in Drosophila. Unlike other Kv subfamilies, Kv3 channels typically open only at positive potentials, and both activation and deactivation in response to changes in voltage are very rapid. They are uniquely associated with the ability of certain neurons to fire action potentials and to release neurotransmitter at high rates of up to 1,000 Hz. Kv3/KCNC subfamily alpha subunits form functional homo- or hetero-tetrameric channels (with other alpha subunits) through their BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif.	109
349689	cd18380	BTB_POZ_Kv4_KCND	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in KCND/Kv4 subfamily of Shal-type voltage-dependent potassium channels. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. The potassium voltage-gated channel subfamily Kv4, also known as subfamily D, contains three alpha subunit members, Kv4.1 (KCND1), Kv4.2 (KCND2), and Kv4.3 (KCND3), which are orthologs of the Shal gene in Drosophila. They are A-type potassium channels that mediate the native, fast inactivating (A-type) K+ current (IA) described both in the nervous system (A currents) and the heart (transient outward current). Kv4/KCND subfamily alpha subunits form functional homo- or hetero-tetrameric channels through their BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif. They are modulated by cytoplasmic KChIPs/KCNIPs (Kv-channel interacting proteins), which are small calcium binding proteins with EF-hand-like domains.	102
349690	cd18381	BTB_POZ_Kv5_KCNF1	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in KCNF/Kv5 subfamily of potassium voltage-gated channels. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. The potassium voltage-gated channel subfamily Kv5, also known as subfamily F, only contains KCNF1 (also known as Kv5.1 or kH1), which functions as a regulatory alpha-subunit of voltage-gated potassium channel that when coassembled with Kv2.1 can modulate gating in a physiologically relevant manner. It forms hetero-tetrameric channels (with other alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif.	116
349691	cd18382	BTB_POZ_Kv6_KCNG	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in KCNG/Kv6 subfamily of potassium voltage-gated channels. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. The potassium voltage-gated channel subfamily Kv6, also known as subfamily G, includes KCNG1 (Kv6.1), KCNG2 (Kv6.2 or KCNF2), KCNG3 (Kv6.3) and KCNG4 (Kv6.4), which are regulatory alpha subunits and do not form functional channels on their own. KCNG1 can form functional heterotetrameric channels with KCNB1 (also known as Kv2.1), and further modulates the delayed rectifier voltage-gated potassium channel activation and deactivation rates of KCNB1. KCNG2, also called cardiac potassium channel subunit, can form functional heterodimeric channels with KCNB1, and further modulates channel activity by shifting the threshold and the half-maximal activation to more negative values. KCNG3, also called voltage-gated potassium channel subunit Kv10.1, is an electrically silent modulatory subunit that can form functional heterotetrameric channels with KCNB1, and further promotes a reduction in the rate of activation and inactivation of the delayed rectifier voltage-gated potassium channel KCNB1. KCNG4 is a silent voltage-gated potassium (KvS) channel subunit that can form functional heterotetrameric channels with KCNB1, and further modulates the delayed rectifier voltage-gated potassium channel activation and deactivation rates of KCNB1.	109
349692	cd18384	BTB_POZ_Kv9_KCNS	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in KCNS/Kv9 subfamily of potassium voltage-gated channels. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. The potassium voltage-gated channel subfamily Kv9, also known as subfamily S, includes KCNS1 (Kv9.1), KCNS2 (Kv9.2) and KCNS3 (Kv9.3). They are regulatory alpha subunits that cannot form functional homo-tetrameric channels. Both KCNS1 and KCNS2 are delayed-rectifier K(+) channel alpha subunits that can form functional heterotetrameric channels with KCNB1 (also known as Kv2.1) and KCNB2 (also known as Kv2.2), and further modulate the delayed rectifier voltage-gated potassium channel activation and deactivation rates of KCNB1 and KCNB2. KCNS3 is a delayed-rectifier K(+) channel alpha subunit linked to tissue oxygenation responses. It can form functional heterotetrameric channels with KCNB1, and further modulates the delayed rectifier voltage-gated potassium channel activation and deactivation rates of KCNB1.	106
349693	cd18385	BTB_POZ_BTBD10_GMRP1	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in BTB/POZ domain-containing protein 10 (BTBD10). BTBD10, also called glucose metabolism-related protein 1 (GMRP1), plays a major role as an activator of AKT family members. It binds to Akt and protein phosphatase 2A (PP2A) and inhibits the PP2A-mediated dephosphorylation of Akt, thereby keeping Akt activated. It also plays a role in preventing motor neuronal death and accelerating the growth of pancreatic beta cells. BTBD10 contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization.	110
349694	cd18386	BTB_POZ_KCTD20	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 20 (KCTD20). KCTD20, also called potassium channel tetramerization domain containing 20, is a positive regulator of Akt signaling. It may play an important role in regulating the death and growth of some non-nervous and nervous cells. It contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization.	104
349695	cd18387	BTB_POZ_KCTD1	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 1 (KCTD1). KCTD1 is a nuclear BTB/POZ domain-containing protein that acts as a transcriptional repressor and mediates protein-protein interactions through a BTB domain. It represses the transcriptional activity of AP-2 family members, including TFAP2A, TFAP2B and TFAP2C, to varying extents. It also functions as a novel inhibitor of the Wnt signaling pathway. Mutations in KCTD1 cause scalp-ear-nipple (SEN) syndrome. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. The KCTD1 BTB domains form pentamers.	105
349696	cd18388	BTB_POZ_KCTD15	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 15 (KCTD15). KCTD15 is a BTB/POZ domain-containing protein that plays a role in the regulation of neural crest (NC) formation and other steps in embryonic development. It inhibits AP2 transcriptional activity by interaction with its activation domain. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. The KCTD1 BTB domains, closely related to KCTD15, form pentamers.	99
349697	cd18389	BTB_POZ_KCTD2	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 2 (KCTD2). KCTD2 is a BTB/POZ domain-containing protein that functions as an adaptor of Cullin3 E3 ubiquitin ligase. It suppresses gliomagenesis by destabilizing c-Myc. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. The BTB domains of KCTD5 and KCTD17, which are highly similar to KCTD2, form pentamer structures.	105
349698	cd18390	BTB_POZ_KCTD5	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 5 (KCTD5). KCTD5 is a BTB/POZ domain-containing protein that functions as a substrate adaptor for cullin3 based ubiquitin E3 ligases. It is a negative regulator of the AKT pathway, a key signaling cascade frequently deregulated in cancer. KCTD5 does not impact the operation of Kv4.2, Kv3.4, Kv2.1, or Kv1.2 channels. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. KCTD5 forms pentamers mediated by its BTB domain.	112
349699	cd18391	BTB_POZ_KCTD17	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 17 (KCTD17). KCTD17 is a BTB/POZ domain-containing protein that functions as a substrate-adaptor for cullin3-RING ubiquitin ligases that polyubiquitylates trichoplein, a protein involved in ciliogenesis down-regulation. It is a positive regulator of ciliogenesis, playing a crucial role in the initial steps of axoneme extension. A missense mutation in KCTD17 causes autosomal dominant myoclonus-dystonia (M-D). The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. The KCTD17 BTB domains form pentamers.	101
349700	cd18392	BTB_POZ_KCTD3	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 3 (KCTD3). KCTD3, also called renal carcinoma antigen NY-REN-45, is a BTB/POZ domain-containing protein that is an accessory subunit of potassium/sodium hyperpolarization-activated cyclic nucleotide-gated channel 3 (HCN3), upregulating its cell-surface expression and current density without affecting its voltage dependence and kinetics. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization.	88
349701	cd18393	BTB_POZ_SHKBP1	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in SH3KBP1-binding protein 1 (SHKBP1). SHKBP1, also called SETA-binding protein 1, interacts with cathepsin B and participates in tumor necrosis factor (TNF)-induced apoptosis in ovarian cancer cells. It can promote epidermal growth factor receptor (EGFR) signaling by interrupting c-Cbl-CIN85 complex and inhibiting EGFR degradation. It contains a BTB/POZ domain, also known as tetramerization (T1) domain, a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization.	103
349702	cd18394	BTB_POZ_KCTD6	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 6 (KCTD6). KCTD6, also called KCTD containing, Cullin3 adaptor, suppressor of Hedgehog 3 (KCASH3), is a BTB/POZ domain-containing protein that functions as a substrate-specific adaptor of cullin-3, regulating protein levels of the muscle small ankyrin-1 isoform 5 (sAnk1.5) as well as suppressing histone deacetylase and Hedgehog activity in medulloblastoma. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization.	104
349703	cd18395	BTB_POZ_KCTD21	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 21 (KCTD21). KCTD21, also called KCTD containing, Cullin3 adaptor, suppressor of Hedgehog 2 (KCASH2), is a BTB/POZ domain-containing protein that functions as a substrate-specific adaptor of cullin-3, promoting the ubiquitination and degradation of histone deacetylase HDAC1, thereby inhibiting the deacetylation-mediated transcriptional activation of the Hedgehog effectors Gli1 and Gli2. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization.	98
349704	cd18396	BTB_POZ_KCTD8	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein KCTD8. KCTD8, a BTB/POZ domain-containing protein, is an auxiliary subunit of GABA-B receptors that determine the pharmacology and kinetics of the receptor response. It interacts as a tetramer with GABRB1 and GABRB2. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization.	103
349705	cd18397	BTB_POZ_KCTD12_Pfetin	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 12 (KCTD12). KCTD12, also called predominantly fetal expressed T1 domain (Pfetin), is a BTB/POZ domain-containing protein that is an auxiliary subunit of GABAB receptors associated with mood disorders. It regulates agonist potency and kinetics of GABAB receptor signaling. It promotes tumorigenesis by facilitating CDC25B/CDK1/Aurora A-dependent G2/M transition. It also regulates colorectal cancer cell stemness through the ERK pathway. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization.	100
349706	cd18398	BTB_POZ_KCTD16	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 16 (KCTD16). KCTD16 is a BTB/POZ domain-containing protein that is an auxiliary subunit of GABAB receptors associated with mood disorders. It interacts with amyloid beta precursor protein (APP), a type I transmembrane protein involved in a variety of cellular processes such as cell adhesion and axon guidance. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization.	103
349707	cd18399	BTB_POZ_KCTD10_BACURD3	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 10 (KCTD10). KCTD10, also called BTB/POZ domain-containing adapter for CUL3-mediated RhoA degradation protein 3 (BACURD3), is a BTB/POZ domain-containing protein that interacts with proliferating cell nuclear antigen (PCNA) and polymerase delta, and participates in DNA repair, DNA replication, and cell-cycle control. Its down-regulation could inhibit cell proliferation. KCTD10 also plays crucial roles in embryonic angiogenesis and heart development in mammals by negatively regulating the Notch signaling pathway. Furthermore, KCTD10 may function as a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complex, which mediates the ubiquitination of target proteins, leading to their degradation by the proteasome. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. The KCTD10 BTB domain forms a novel two-fold symmetric tetramer that is distinct from the tetramer formed by voltage-gated potassium (Kv) channels.	110
349708	cd18400	BTB_POZ_KCTD13_BACURD1	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium channel tetramerization domain-containing protein 13 (KCTD13). KCTD13, also called BTB/POZ domain-containing adapter for CUL3-mediated RhoA degradation protein 1 (BACURD1), or TNFAIP1-like protein, is a BTB/POZ domain-containing protein that may function as a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complex that mediates the ubiquitination of RhoA, leading to its degradation by the proteasome, thereby regulating the actin cytoskeleton and cell migration. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. The KCTD13 BTB domain forms a novel two-fold symmetric tetramer that is distinct from the tetramer formed by voltage-gated potassium (Kv) channels.	103
349709	cd18401	BTB_POZ_TNFAIP1_BACURD2	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in tumor necrosis factor, alpha-induced protein 1, endothelial (TNFAIP1). TNFAIP1, also called BTB/POZ domain-containing adapter for CUL3-mediated RhoA degradation protein 2 (BACURD2), or protein B12, is a BTB/POZ domain-containing protein that is involved in DNA replication, DNA damage repair and cell apoptosis, and is implicated in human diseases including cancer, Alzheimer's disease (AD) and type 2 diabetic nephropathy. The BTB/POZ domain, also known as tetramerization (T1) domain, is a versatile protein-protein interaction motif that facilitates homodimerization or heterodimerization. KCTD family BTB domains can adopt a wide range of oligomerization geometries, including homodimerization, tetramerization, and pentamerization. The BTB domains of other BACURD subfamily members, KCTD10 and KCTD13, form a novel two-fold symmetric tetramer that is distinct from the tetramer formed by voltage-gated potassium (Kv) channels.	104
349710	cd18402	BTB_POZ_KCNA1	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily A member 1 (KCNA1). KCNA1 is also called voltage-gated K(+) channel HuKI, voltage-gated potassium channel HBK1, or voltage-gated potassium channel subunit Kv1.1. It mediates transmembrane potassium transport in excitable membranes, primarily in the brain and the central nervous system, but also in the kidney. It is involved in the regulation of the membrane potential and nerve signaling, and prevents neuronal hyperexcitability. Assuming opened or closed conformations in response to the voltage difference across the membrane, the protein forms a tetrameric potassium-selective channel through which potassium ions may pass in accordance with their electrochemical gradient. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCNA1 is an alpha subunit that forms functional homo- or hetero-tetrameric channels (with other Kv1/KCNA alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif.	98
349711	cd18403	BTB_POZ_KCNA2_KCNA3	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily A members 2 (KCNA2) and 3 (KCNA3). KCNA2 is also called NGK1, voltage-gated K(+) channel HuKIV, voltage-gated potassium channel HBK5, or voltage-gated potassium channel subunit Kv1.2. KCNA3 is also called HGK5, HLK3, HPCN3, voltage-gated K(+) channel HuKIII, or voltage-gated potassium channel subunit Kv1.3. KCNA2 and KCNA3 mediate transmembrane potassium transport in excitable membranes. Assuming opened or closed conformations in response to the voltage difference across the membrane, the protein forms a tetrameric potassium-selective channel through which potassium ions may pass in accordance with their electrochemical gradient. KCNA2 primarily functions in the brain and the central nervous system, but also in the cardiovascular system. It prevents aberrant action potential firing and regulates neuronal output. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCNA2 and KCNA3 are alpha subunits that form functional homo- or hetero-tetrameric channels (with other Kv1/KCNA alpha subunits) through their BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif.	99
349712	cd18405	BTB_POZ_KCNA4	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily A member 4 (KCNA4). KCNA4 is also called HPCN2, or voltage-gated K(+) channel HuKII, voltage-gated potassium channel HBK4, voltage-gated potassium channel HK1, or voltage-gated potassium channel subunit Kv1.4. It mediates transmembrane potassium transport in excitable membranes. Assuming opened or closed conformations in response to the voltage difference across the membrane, the protein forms a tetrameric potassium-selective channel through which potassium ions may pass in accordance with their electrochemical gradient. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCNA4 is an alpha subunit that forms functional homo- or hetero-tetrameric channels (with other Kv1/KCNA alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif.	97
349713	cd18406	BTB_POZ_KCNA5	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily A member 5 (KCNA5). KCNA5, also called HPCN1, voltage-gated potassium channel HK2, or voltage-gated potassium channel subunit Kv1.5, mediates transmembrane potassium transport in excitable membranes. Assuming opened or closed conformations in response to the voltage difference across the membrane, the protein forms a tetrameric potassium-selective channel through which potassium ions may pass in accordance with their electrochemical gradient. KCNA5 may play a role in regulating the secretion of insulin in normal pancreatic islets. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCNA5 is an alpha subunit that forms functional homo- or hetero-tetrameric channels (with other Kv1/KCNA alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif.	97
349714	cd18407	BTB_POZ_KCNA6	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily A member 6 (KCNA6). KCNA6, also called voltage-gated potassium channel HBK2 or voltage-gated potassium channel subunit Kv1.6, mediates transmembrane potassium transport in excitable membranes. Assuming opened or closed conformations in response to the voltage difference across the membrane, the protein forms a tetrameric potassium-selective channel through which potassium ions may pass in accordance with their electrochemical gradient. KCNA6 is distributed primarily in neurons of central and peripheral nervous systems. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCNA6 is an alpha subunit that forms functional homo- or hetero-tetrameric channels (with other Kv1/KCNA alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif.	127
349715	cd18408	BTB_POZ_KCNA7	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily A member 7 (KCNA7). KCNA7, also called voltage-gated potassium channel subunit Kv1.7, mediates transmembrane potassium transport in excitable membranes. Assuming opened or closed conformations in response to the voltage difference across the membrane, the protein forms a tetrameric potassium-selective channel through which potassium ions may pass in accordance with their electrochemical gradient. KCNA7 plays an important role in the repolarization of cell membranes. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCNA7 is an alpha subunit that forms functional homo- or hetero-tetrameric channels (with other Kv1/KCNA alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif.	115
349716	cd18409	BTB_POZ_KCNA10	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily A member 10 (KCNA10). KCNA10, also called voltage-gated potassium channel subunit Kv1.8, is a cyclic nucleotide-gated, voltage-activated potassium channel that mediates transmembrane potassium transport in excitable membranes. Assuming opened or closed conformations in response to the voltage difference across the membrane, the protein forms a tetrameric potassium-selective channel through which potassium ions may pass in accordance with their electrochemical gradient. KCNA10 is expressed in proximal tubular cells, glomerular and vascular endothelial cells, as well as in vascular smooth muscle cells. It may facilitate proximal tubular sodium absorption by stabilizing cell membrane voltage. The channel activity is up-regulated by cAMP. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCNA10 is an alpha subunit that forms functional homotetrameric channels through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif.	87
349717	cd18410	BTB_POZ_Shaker-like	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Drosophila melanogaster potassium voltage-gated channel protein Shaker and similar proteins. Shaker, also termed protein minisleep, represents a family of putative potassium channel proteins in the nervous system of Drosophila. It is a voltage-gated potassium channel that mediates transmembrane potassium transport in excitable membranes. Assuming opened or closed conformations in response to the voltage difference across the membrane, the protein forms a tetrameric potassium-selective channel through which potassium ions may pass in accordance with their electrochemical gradient. Shaker plays a role in the regulation of sleep need or efficiency. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. Shaker is an alpha subunit that forms functional homo- or hetero-tetrameric channels (with other alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif.	100
349718	cd18411	BTB_POZ_KCNB1	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily B member 1 (KCNB1). KCNB1, also called delayed rectifier potassium channel 1 (DRK1) or voltage-gated potassium channel subunit Kv2.1, mediates transmembrane potassium transport in excitable membranes, primarily in the brain, but also in the pancreas and cardiovascular system. Assuming opened or closed conformations in response to the voltage difference across the membrane, the protein forms a tetrameric potassium-selective channel through which potassium ions may pass in accordance with their electrochemical gradient. KCNB1 is involved in the regulation of the action potential (AP) repolarization, duration and frequency of repetitive AP firing in neurons, muscle cells and endocrine cells and plays a role in homeostatic attenuation of electrical excitability throughout the brain. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCNB1 is an alpha subunit that forms functional homo- or hetero-tetrameric channels (with other alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif.	117
349719	cd18412	BTB_POZ_KCNB2	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily B member 2 (KCNB2). KCNB2, also called voltage-gated potassium channel subunit Kv2.2, mediates transmembrane potassium transport in excitable membranes, primarily in the brain and smooth muscle cells. Assuming opened or closed conformations in response to the voltage difference across the membrane, the protein forms a tetrameric potassium-selective channel through which potassium ions may pass in accordance with their electrochemical gradient. KCNB2 contributes to the delayed-rectifier voltage-gated potassium current in cortical pyramidal neurons and smooth muscle cells. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCNB2 is an alpha subunit that forms functional homo- or hetero-tetrameric channels (with other alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif.	127
349720	cd18413	BTB_POZ_Shab-like	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Drosophila melanogaster potassium voltage-gated channel protein Shab and similar proteins. Shab is a slow delayed rectifier voltage-gated potassium channel in Drosophila. It mediates transmembrane potassium transport in excitable membranes. Assuming opened or closed conformations in response to the voltage difference across the membrane, the protein forms a tetrameric potassium-selective channel through which potassium ions may pass in accordance with their electrochemical gradient. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. Shab is an alpha subunit that forms functional homo- or hetero-tetrameric channels (with other alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif.	109
349721	cd18414	BTB_KCNC1_3	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily C members KCNC1 and KCNC3. KCNC1 (also called NGK2, voltage-gated potassium channel subunit Kv3.1, or voltage-gated potassium channel subunit Kv4) and KCNC3 (also called KSHIIID or voltage-gated potassium channel subunit Kv3.3) play important roles in the rapid repolarization of fast-firing brain neurons. Assuming opened or closed conformations in response to the voltage difference across the membrane, the proteins form tetrameric potassium-selective channels through which potassium ions may pass in accordance with their electrochemical gradient. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCNC1 and KCNC3 are alpha subunits that form functional homo- or hetero-tetrameric channels (with other alpha subunits) through their BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif.	117
349722	cd18415	BTB_KCNC2_4	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily C members KCNC2 and KCNC4. KCNC2, also called Shaw-like potassium channel or voltage-gated potassium channel Kv3.2, is a delayed rectifier voltage-gated potassium channel that mediates transmembrane potassium transport in excitable membranes, primarily in the brain. It contributes to the regulation of the fast action potential repolarization and in sustained high-frequency firing in neurons of the central nervous system. KCNC4, also called KSHIIIC or voltage-gated potassium channel subunit Kv3.4, is a novel high-voltage-activating, tetraethylammonium (TEA)-sensitive, type-A potassium channel that mediates the voltage-dependent potassium ion permeability of excitable membranes. It plays a pivotal role in oxidative stress-related neural cell damage as an oxidation-sensitive channel. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCNC2 and KCNC4 are alpha subunits that form functional homo- or hetero-tetrameric channels (with other alpha subunits) through their BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif.	124
349723	cd18416	BTB_Shaw-like	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel protein Shaw. Shaw, also called Shaw2, is a voltage-gated potassium channel in Drosophila. It mediates transmembrane potassium transport in excitable membranes. Assuming opened or closed conformations in response to the voltage difference across the membrane, the protein forms a tetrameric potassium-selective channel through which potassium ions may pass in accordance with their electrochemical gradient. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. Shaw is an alpha subunit that forms functional homo- or hetero-tetrameric channels (with other alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif.	112
349724	cd18417	BTB_POZ_KCND1	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily D member 1 (KCND1). KCND1, also called voltage-gated potassium channel subunit Kv4.1, is a pore-forming subunit of voltage-gated rapidly inactivating A-type potassium channels. It may contribute to I (To) current in heart and I (Sa) current in neurons. Its properties are modulated by interactions with other alpha subunits and with regulatory subunits. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCND1 is an alpha subunit that forms functional homo- or hetero-tetrameric channels (with other Kv4/KCND alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif. It is modulated by cytoplasmic KChIPs/KCNIPs (Kv-channel interacting proteins), which are small calcium binding proteins with EF-hand-like domains.	138
349725	cd18418	BTB_POZ_KCND2	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily D member 2 (KCND2). KCND2, also called voltage-gated potassium channel subunit Kv4.2, is a major pore-forming subunit in somatodendritic subthreshold A-type potassium current I(SA) channels. It mediates transmembrane potassium transport in excitable membranes, primarily in the brain. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCND2 is an alpha subunit that forms functional homo- or hetero-tetrameric channels (with other Kv4/KCND alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif. It is modulated by cytoplasmic KChIPs/KCNIPs (Kv-channel interacting proteins), which are small calcium binding proteins with EF-hand-like domains.	103
349726	cd18419	BTB_POZ_KCND3	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily D member 3 (KCND3). KCND3, also called voltage-gated potassium channel subunit Kv4.3, is a pore-forming subunit of voltage-gated rapidly inactivating A-type potassium channels. Mutations in KCND3 cause spinocerebellar ataxia. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCND3 is an alpha subunit that forms functional homo- or hetero-tetrameric channels (with other Kv4/KCND alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif. It is modulated by cytoplasmic KChIPs/KCNIPs (Kv-channel interacting proteins), which are small calcium binding proteins with EF-hand-like domains.	138
349727	cd18420	BTB_POZ_Shal-like	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in Drosophila melanogaster potassium voltage-gated channel protein Shal and similar proteins. Drosophila melanogaster Shal, also called Shaker cognate l or Shal2, is a transient potassium current (I(A)) channel, which is required for maintaining excitability during repetitive firing and normal locomotion in Drosophila. It may play a role in the nervous system and in the regulation of beating frequency in pacemaker cells. Shal mediates the voltage-dependent potassium ion permeability of excitable membranes. Assuming opened or closed conformations in response to the voltage difference across the membrane, the protein forms a potassium-selective channel through which potassium ions may pass in accordance with their electrochemical gradient. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. Shal is an alpha subunit that forms functional homo- or hetero-tetrameric channels (with other alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif.	139
349728	cd18421	BTB_POZ_KCNG1_2	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily G members, KCNG1 and KCNG2. KCNG1, also called voltage-gated potassium channel subunit Kv6.1 or kH2, functions as a regulatory alpha-subunit of voltage-gated potassium channel that can form functional heterotetrameric channels with KCNB1 (also known as Kv2.1), and further modulates the delayed rectifier voltage-gated potassium channel activation and deactivation rates of KCNB1. KCNG2, also called cardiac potassium channel subunit or voltage-gated potassium channel subunit Kv6.2, is a new gamma-subunit of voltage-gated potassium channels that can form functional heterodimeric channels with KCNB1, and further modulates channel activity by shifting the threshold and the half-maximal activation to more negative values. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCNG1 and KCNG2 are regulatory alpha subunits and do not form homomultimers. They form heteromultimers (with other alpha subunits) through their BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif.	114
349729	cd18422	BTB_POZ_KCNG3	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily G member 3 (KCNG3). KCNG3, also called voltage-gated potassium channel subunit Kv6.3 or voltage-gated potassium channel subunit Kv10.1, is an electrically silent modulatory subunit that can form functional heterotetrameric channels with KCNB1 (also known as Kv2.1), and further promotes a reduction in the rate of activation and inactivation of the delayed rectifier voltage-gated potassium channel KCNB1. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCNG3 is a regulatory alpha subunit that cannot form a functional homo-tetrameric channel. It forms hetero-tetrameric channels (with other functional alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif.	111
349730	cd18423	BTB_POZ_KCNG4	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily G member 4 (KCNG4). KCNG4, also called voltage-gated potassium channel subunit Kv6.4, is a silent voltage-gated potassium (KvS) channel subunit that can form functional heterotetrameric channels with KCNB1 (also known as Kv2.1), and further modulates the delayed rectifier voltage-gated potassium channel activation and deactivation rates of KCNB1. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCNG4 is a regulatory alpha subunit that cannot form a functional homo-tetrameric channel. It forms hetero-tetrameric channels (with other functional alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif.	112
349731	cd18424	BTB_POZ_KCNV1	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily V member 1 (KCNV1). KCNV1, also called neuronal potassium channel alpha subunit HNKA or voltage-gated potassium channel subunit Kv8.1, is a new neuronal voltage-gated potassium channel alpha subunit with specific inhibitory properties towards Shab and Shaw channels. It modulates KCNB1 (also known as Kv2.1) and KCNB2 (also known as Kv2.2) channel activity by shifting the threshold for inactivation to more negative values and by slowing the rate of inactivation. It can also down-regulate the channel activity of KCNB1, KCNB2, KCNC4 (also known as Kv3.4) and KCND1 (also known as Kv4.1), possibly by trapping them in intracellular membranes. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCNV1 is a regulatory alpha subunit that cannot form a functional homo-tetrameric channel. It forms hetero-tetrameric channels (with other functional alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif.	109
349732	cd18425	BTB_POZ_KCNV2	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily V member 2 (KCNV2). KCNV2, also called voltage-gated potassium channel subunit Kv8.2, is a modulatory voltage-gated potassium channel alpha subunit that modulates channel activity by shifting the threshold and the half-maximal activation to more negative values. KCNV2 is essential for visual function and cone survival. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCNV2 is a regulatory alpha subunit that cannot form a functional homo-tetrameric channel. It forms hetero-tetrameric channels (with other functional alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif.	108
349733	cd18426	BTB_POZ_KCNS1	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily S member 1 (KCNS1). KCNS1, also called delayed-rectifier K(+) channel alpha subunit 1 or voltage-gated potassium channel subunit Kv9.1, is a modulatory alpha subunit of voltage-gated potassium channel that mediates neuropathic pain following nerve injury. It can form functional heterotetrameric channels with KCNB1 (also known as Kv2.1) and KCNB2 (also known as Kv2.2), and further modulates the delayed rectifier voltage-gated potassium channel activation and deactivation rates of KCNB1 and KCNB2. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCNS1 is a regulatory alpha subunit that cannot form a functional homo-tetrameric channel. It forms hetero-tetrameric channels (with other functional alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif.	106
349734	cd18427	BTB_POZ_KCNS2	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily S member 2 (KCNS2). KCNS2, also called delayed-rectifier K(+) channel alpha subunit 2 or voltage-gated potassium channel subunit Kv9.2, is a modulatory alpha subunit of voltage-gated potassium channel that can form functional heterotetrameric channels with KCNB1 (also known as Kv2.1) and KCNB2 (also known as Kv2.2), and further modulates the delayed rectifier voltage-gated potassium channel activation and deactivation rates of KCNB1 and KCNB2. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCNS2 is a regulatory alpha subunit that cannot form a functional homo-tetrameric channel. It forms hetero-tetrameric channels (with other functional alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif.	107
349735	cd18428	BTB_POZ_KCNS3	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in potassium voltage-gated channel subfamily S member 3 (KCNS3). KCNS3, also called delayed-rectifier K(+) channel alpha subunit 3 or voltage-gated potassium channel subunit Kv9.3, is an alpha subunit of voltage-gated potassium channel linked to tissue oxygenation responses. It can form functional heterotetrameric channels with KCNB1 (also known as Kv2.1), and further modulates the delayed rectifier voltage-gated potassium channel activation and deactivation rates of KCNB1. Voltage-gated potassium (Kv) channels are composed of alpha subunits, which form the actual conductance pore, and cytoplasmic beta subunits, which are auxiliary proteins that associate with alpha subunits to modulate the activity of the Kv channel. KCNS3 is a regulatory alpha subunit that cannot form a functional homo-tetrameric channel. It forms hetero-tetrameric channels (with other functional alpha subunits) through its BTB/POZ domain, also known as tetramerization (T1) domain, which is a versatile protein-protein interaction motif.	108
349485	cd18429	M14_Nna1-like	Peptidase M14-like domain of ATP/GTP binding proteins and cytosolic carboxypeptidases; uncharacterized bacterial subgroup. A bacterial subgroup of the Peptidase M14-like domain of Nna-1 (Nervous system Nuclear protein induced by Axotomy), also known as ATP/GTP binding protein (AGTPBP-1) and cytosolic carboxypeptidase (CCP),-like proteins. The Peptidase M14 family of metallocarboxypeptidases are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Nna1-like proteins are active metallopeptidases that are thought to act on cytosolic proteins (such as alpha-tubulin in eukaryotes) to remove a C-terminal tyrosine. Nna1-like proteins from the different phyla are highly diverse, but they all contain a unique N-terminal conserved domain right before the CP domain. It has been suggested that this N-terminal domain might act as a folding domain.	253
349486	cd18430	M14_ASTE_ASPA_like	Succinylglutamate desuccinylase/aspartoacylase; uncharacterized. A functionally uncharacterized subgroup of the Succinylglutamate desuccinylase (ASTE)/aspartoacylase (ASPA) subfamily which is part of the M14 family of metallocarboxypeptidases. ASTE catalyzes the fifth and last step in arginine catabolism by the arginine succinyltransferase pathway, and aspartoacylase (ASPA, also known as aminoacylase 2, and ACY-2; EC:3.5.1.15) cleaves N-acetyl L-aspartic acid (NAA) into aspartate and acetate. NAA is abundant in the brain, and hydrolysis of NAA by ASPA may help maintain white matter. ASPA is an NAA scavenger in other tissues. Mutations in the gene encoding ASPA cause Canavan disease (CD), a fatal progressive neurodegenerative disorder involving dysmyelination and spongiform degeneration of white matter in children. This enzyme binds zinc which is necessary for activity. Measurement of elevated NAA levels in urine is used in the diagnosis of CD.	168
349384	cd18431	BRCT_DNA_ligase_III	BRCT domain of DNA ligase 3 (LIG3) and similar proteins. LIG3 (EC 6.5.1.1), also termed DNA ligase III, or polydeoxyribonucleotide synthase [ATP] 3, functions as heterodimer with DNA-repair protein XRCC1 in the nucleus and can correct defective DNA strand-break repair and sister chromatid exchange following treatment with ionizing radiation and alkylating agents.	78
349385	cd18432	BRCT_PAXIP1_rpt6_like	sixth BRCT domain of PAX-interacting protein 1 (PAXIP1), second BRCT domain of mediator of DNA damage checkpoint protein 1 (MDC1) and similar proteins. PAXIP1, also termed PAX transactivation activation domain-interacting protein (PTIP), is involved in DNA damage response and in transcriptional regulation through histone methyltransferase (HMT) complexes. It also facilitates ATM-mediated activation of p53 and promotes cellular resistance to ionizing radiation. PAXIP1 contains six BRCT repeats. MDC1, also termed nuclear factor with BRCT domains 1 (NFBD1), is a nuclear chromatin-associated protein that is required for checkpoint mediated cell cycle arrest in response to DNA damage within both the S phase and G2/M phases of the cell cycle. It directly binds phosphorylated histone H2AX to regulate cellular responses to DNA double-strand breaks. MDC1 contains a forkhead-associated (FHA) domain and two BRCT domains, as well as an internal 41-amino acid repeat sequence. The family corresponds to the sixth BRCT domain of PAXIP1 and the second BRCT domain of MDC1.	85
349386	cd18433	BRCT_Rad4_rpt3	third BRCT domain of Schizosaccharomyces pombe S-M checkpoint control protein Rad4 and similar proteins. Rad4, also termed P74, or protein cut5, is an essential component for DNA replication and the checkpoint control system which couples S and M phases. It may directly or indirectly interact with chromatin proteins to form the complex required for the initiation and/or progression of DNA synthesis. Rad4 contains four BRCT repeats. The family corresponds to the third repeat.	83
349387	cd18434	BRCT_TopBP1_rpt5	fifth BRCT domain of DNA topoisomerase 2-binding protein 1 (TopBP1) and similar proteins. TopBP1, also termed DNA topoisomerase II-beta-binding protein 1, or DNA topoisomerase II-binding protein 1, functions in DNA replication and damage response. It binds double-stranded DNA breaks and nicks as well as single-stranded DNA. TopBP1 contains six copies of BRCT domain. The family corresponds to the fifth BRCT domain.	89
349388	cd18435	BRCT_BRC1_like_rpt1	first (N-terminal) BRCT domain of Schizosaccharomyces pombe BRCT-containing protein 1 (BRC1) and similar proteins. Schizosaccharomyces pombe BRC1 is required for mitotic fidelity, specifically in the G2 phase of the cell cycle. It plays a role in chromatin organization. Members of this family contain six BRCT domains. This family corresponds to the first repeat.	107
349389	cd18436	BRCT_BRC1_like_rpt2	second BRCT domain of Schizosaccharomyces pombe BRCT-containing protein 1 (BRC1) and similar proteins. Schizosaccharomyces pombe BRC1 is required for mitotic fidelity, specifically in the G2 phase of the cell cycle. It plays a role in chromatin organization. The family also includes Cryptococcus neoformans DNA ligase 4 (LIG4, also known as DNA ligase IV or polydeoxyribonucleotide synthase [ATP] 4), which is involved in dsDNA break repair, and plays a role in non-homologous integration (NHI) pathways where it is required in the final step of non-homologous end-joining. Members of this family contain six BRCT domains. This family corresponds to the second repeat.	75
349390	cd18437	BRCT_BRC1_like_rpt3	third BRCT domain of Schizosaccharomyces pombe BRCT-containing protein 1 (BRC1) and similar proteins. Schizosaccharomyces pombe BRC1 is required for mitotic fidelity, specifically in the G2 phase of the cell cycle. It plays a role in chromatin organization. The family also includes Cryptococcus neoformans DNA ligase 4 (LIG4, also known as DNA ligase IV or polydeoxyribonucleotide synthase [ATP] 4), which is involved in dsDNA break repair, and plays a role in non-homologous integration (NHI) pathways where it is required in the final step of non-homologous end-joining. Members of this family contain six BRCT domains. This family corresponds to the third repeat. The Trp-X-X-X-Cys/Ser signature motif of the BRCT family is not conserved in this group; it contains a conserved Trp, but not the Cys/Ser residue.	78
349391	cd18438	BRCT_BRC1_like_rpt4	fourth BRCT domain of Schizosaccharomyces pombe BRCT-containing protein 1 (BRC1) and similar proteins. Schizosaccharomyces pombe BRC1 is required for mitotic fidelity, specifically in the G2 phase of the cell cycle. It plays a role in chromatin organization. Members of this family contain six BRCT domains. This family corresponds to the fourth repeat.	68
349392	cd18439	BRCT_BRC1_like_rpt6	sixth (C-terminal) BRCT domain of Schizosaccharomyces pombe BRCT-containing protein 1 (BRC1) and similar proteins. Schizosaccharomyces pombe BRC1 is required for mitotic fidelity, specifically in the G2 phase of the cell cycle. It plays a role in chromatin organization. The family also includes Cryptococcus neoformans DNA ligase 4 (LIG4, also known as DNA ligase IV or polydeoxyribonucleotide synthase [ATP] 4), which is involved in dsDNA break repair, and plays a role in non-homologous integration (NHI) pathways where it is required in the final step of non-homologous end-joining. Members of this family contain six BRCT domains. This family corresponds to the sixth repeat.	116
349393	cd18440	BRCT_PAXIP1_rpt6	sixth BRCT domain of PAX-interacting protein 1 (PAXIP1) and similar proteins. PAXIP1, also termed PAX transactivation activation domain-interacting protein (PTIP), is involved in DNA damage response and in transcriptional regulation through histone methyltransferase (HMT) complexes. It also facilitates ATM-mediated activation of p53 and promotes cellular resistance to ionizing radiation. PAXIP1 contains six BRCT repeats. This family corresponds to the sixth BRCT domain. The Trp-X-X-X-Cys/Ser signature motif of the BRCT family is not conserved in this family.	90
349394	cd18441	BRCT_MDC1_rpt2	second BRCT domain of mediator of DNA damage checkpoint protein 1 (MDC1) and similar proteins. MDC1, also termed nuclear factor with BRCT domains 1 (NFBD1), is a nuclear chromatin-associated protein that is required for checkpoint mediated cell cycle arrest in response to DNA damage within both the S phase and G2/M phases of the cell cycle. It directly binds phosphorylated histone H2AX to regulate cellular responses to DNA double-strand breaks. MDC1 contains a forkhead-associated (FHA) domain and two BRCT domains, as well as an internal 41-amino acid repeat sequence. The family corresponds to the second BRCT domain. The Trp-X-X-X-Cys/Ser signature motif of the BRCT family is not conserved in this family.	81
349395	cd18442	BRCT_polymerase_mu	BRCT domain of DNA-directed DNA/RNA polymerase mu (polymerase mu) and similar proteins. Polymerase mu (EC 2.7.7.7), also termed Pol mu, or terminal transferase, is a gap-filling polymerase involved in repair of DNA double-strand breaks by non-homologous end joining (NHEJ). It participates in immunoglobulin (Ig) light chain gene rearrangement in V(D)J recombination. Polymerase mu contains a BRCT domain.	98
349396	cd18443	BRCT_DNTT	BRCT domain of DNA nucleotidylexotransferase (DNTT) and similar proteins. DNTT (EC 2.7.7.31), also termed terminal addition enzyme, or terminal deoxynucleotidyltransferase, or terminal transferase, is a template-independent DNA polymerase which catalyzes the random addition of deoxynucleoside 5'-triphosphate to the 3'-end of a DNA initiator. It is responsible for the addition of nucleotides at the junction (N region) of rearranged Ig heavy chain and T-cell receptor gene segments during the maturation of B- and T-cells. DNA nucleotidylexotransferase contains a BRCT domain.	95
350519	cd18444	BACK_KLHL1_like	BACK (BTB and C-terminal Kelch) domain found in Kelch-like proteins KLHL1, KLHL4 and KLHL5. This subfamily contains Kelch-like proteins: KLHL1, KLHL4 and KLHL5, all of which share high identity and similarity with the Drosophila kelch protein, a component of ring canals. Members of this subfamily contain a BTB domain and kelch repeat domains, characteristics of a kelch family protein. KLHL1 is a neuronal actin-binding protein that modulates voltage-gated CaV2.1 (P/Q-type) and CaV3.2 (alpha1H T-type) calcium channels.	106
350520	cd18445	BACK_KLHL2_like	BACK (BTB and C-terminal Kelch) domain found in Kelch-like proteins, KLHL2 and KLHL3. This subfamily includes Kelch-like proteins, KLHL2 and KLHL3. KLHL2 is a novel actin-binding protein predominantly expressed in the brain. It plays a role in the reorganization of the actin cytoskeleton, and promotes growth of cell projections in oligodendrocyte precursors. Both KLHL2 and KLHL3 function as a component of an E3 ubiquitin ligase complex that mediates the ubiquitination of target proteins.	114
350521	cd18446	BACK_KLHL6	BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 6 (KLHL6). KLHL6 is a BTB-kelch protein with a lymphoid tissue-restricted expression pattern. It belongs to the KLHL gene family, which is composed of an N-terminal BTB-POZ domain and four to six Kelch motifs in tandem. It is involved in B-lymphocyte antigen receptor signaling and germinal center formation.	108
350522	cd18447	BACK_KLHL7	BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 7 (KLHL7). KLHL7 is a BTB-Kelch protein that constitutes a Cul3-based E3 ubiquitin ligase complex and is involved in the ubiquitination of target proteins for proteasome-mediated degradation. Mutations in KLHL7 cause autosomal-dominant retinitis pigmentosa.	98
350523	cd18448	BACK_KLHL8	BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 8 (KLHL8). KLHL8 is a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin ligase complex. The BCR(KLHL8) ubiquitin ligase complex mediates ubiquitination and degradation of RAPSN.	97
350524	cd18449	BACK_KLHL9_13	BACK (BTB and C-terminal Kelch) domain found in Kelch-like proteins, KLHL9 and KLHL13. KLHL9 and KLHL13 (also termed BTB and kelch domain-containing protein 2, or BKLHD2) are substrate-specific adaptors of a BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complex required for mitotic progression and cytokinesis. The BCR(KLHL9-KLHL13) E3 ubiquitin ligase complex mediates the ubiquitination of AURKB and controls the dynamic behavior of AURKB on mitotic chromosomes and thereby coordinates faithful mitotic progression and completion of cytokinesis.	95
350525	cd18450	BACK_KLHL10	BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 10 (KLHL10). KLHL10 may be a substrate-specific adapter of a CUL3-based E3 ubiquitin-protein ligase complex which mediates the ubiquitination and subsequent proteasomal degradation of target proteins during spermatogenesis.	80
350526	cd18451	BACK_KLHL11	BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 11 (KLHL11). KLHL11 is a component of a cullin-RING-based BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complex that mediates the ubiquitination of target proteins, leading most often to their proteasomal degradation.	88
350527	cd18452	BACK_KLHL12	BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 12 (KLHL12). KLHL12, also termed CUL3-interacting protein 1 (C3IP1), or DKIR, is a substrate-specific adapter of a BCR (BTB-CUL3-RBX1) E3 ubiquitin ligase complex that acts as a negative regulator of Wnt signaling pathway and ER-Golgi transport.	136
350528	cd18453	BACK_KLHL14	BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 14 (KLHL14). KLHL14, also termed protein interactor of Torsin-1A, or Printor, or protein interactor of torsinA, is a novel ATP-free form of torsinA-interacting protein implicated in dystonia pathogenesis.	102
350529	cd18454	BACK_KLHL15	BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 15 (KLHL15). KLHL15 is a substrate-specific adaptor for Cullin3 E3 ubiquitin-protein ligase complex that targets the serine/threonine-protein phosphatase 2A (PP2A) subunit PPP2R5B for ubiquitination and subsequent proteasomal degradation, thus promoting exchange with other regulatory subunits. It also plays a key role in DNA damage response, favoring DNA double-strand repair through error-prone non-homologous end joining (NHEJ) over error-free, RBBP8-mediated homologous recombination (HR), by targeting the DNA-end resection factor RBBP8/CtIP for ubiquitination and subsequent proteasomal degradation.	108
350530	cd18455	BACK_KLHL16_gigaxonin	BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 16 (KLHL16). Gigaxonin, also termed Kelch-like protein 16 (KLHL16), may be a cytoskeletal component that directly or indirectly plays an important role in neurofilament architecture. It may also act as a substrate-specific adaptor of an E3 ubiquitin-protein ligase complex which mediates the ubiquitination and subsequent proteasomal degradation of target proteins, such as tubulin folding cofactor B (TBCB), microtubule-associated protein MAP1B and glial fibrillary acidic protein (GFAP). Gigaxonin is mutated in giant axonal neuropathy.	97
350531	cd18456	BACK_KLHL17	BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 17 (KLHL17). KLHL17, also termed actinfilin, is a substrate-recognition component of some cullin-RING-based BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complexes. It acts as a Cullin 3 (Cul3) substrate adaptor that links GluR6 to the E3 ubiquitin-ligase complex, and mediates the ubiquitination and subsequent degradation of GLUR6. It may play a role in the actin-based neuronal function.	102
350532	cd18457	BACK_KLHL18	BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 18 (KLHL18). KLHL18 acts as a substrate-specific adaptor for the Cullin3 E3 ubiquitin-protein ligase complex that regulates mitotic entry and ubiquitylates Aurora-A.	107
350533	cd18458	BACK_KLHL19_KEAP1	BACK (BTB and C-terminal Kelch) domain found in Kelch-like ECH-associated protein 1 (KEAP1). KEAP1, also termed cytosolic inhibitor of Nrf2 (INrf2), or Kelch-like protein 19 (KLHL19), is a redox-regulated substrate adaptor protein for a Cullin3-dependent ubiquitin ligase complex that targets NFE2L2/NRF2 for ubiquitination and degradation by the proteasome, thus resulting in the suppression of its transcriptional activity and the repression of antioxidant response element-mediated detoxifying enzyme gene expression.	91
350534	cd18459	BACK_KLHL20	BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 20 (KLHL20). KLHL20, also termed Kelch-like ECT2-interacting protein (KLEIP), or Kelch-like protein X, is a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complex involved in interferon response and anterograde Golgi to endosome transport. KLHL20 plays a role in actin assembly at cell-cell contact sites of Madin-Darby canine kidney cells. It also controls endothelial migration and sprouting angiogenesis.	100
350535	cd18460	BACK_KLHL21	BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 21 (KLHL21). KLHL21 is a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complex required for efficient chromosome alignment and cytokinesis. The BCR(KLHL21) E3 ubiquitin ligase complex regulates localization of the chromosomal passenger complex (CPC) from chromosomes to the spindle midzone in anaphase and mediates the ubiquitination of aurora B. KLHL21 targets IkappaB kinase-beta to regulate nuclear factor kappa-light chain enhancer of activated B cells (NF-kappaB) signaling negatively.	101
350536	cd18461	BACK_KLHL22	BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 22 (KLHL22). KLHL22 is a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin ligase complex required for chromosome alignment and localization of Polo-like kinase 1 (PLK1) at kinetochores. The BCR(KLHL22) ubiquitin ligase complex mediates monoubiquitination of PLK1, leading to PLK1 dissociation from phosphoreceptor proteins and subsequent removal from kinetochores, allowing silencing of the spindle assembly checkpoint (SAC) and chromosome segregation.	104
350537	cd18462	BACK_KLHL23	BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 23 (KLHL23). KLHL23 is involved in tumorigenesis and resistance to anticancer drug treatment. It is also associated with cone-rod dystrophy.	102
350538	cd18463	BACK_KLHL24	BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 24 (KLHL24). KLHL24, also called kainate receptor-interacting protein for GluR6 (KRIP6), or protein DRE1, is necessary to maintain the balance between intermediate filament stability and degradation, a process that is essential for skin integrity. KLHL24 is a component of the BCR (BTB-CUL3-RBX1) E3 ubiquitin ligase complex that mediates ubiquitination of KRT14 and controls its levels during keratinocyte differentiation.	78
350539	cd18464	BACK_KLHL25_like	BACK (BTB and C-terminal Kelch) domain found in Kelch-like proteins, KLHL25 and KLHL37. The family includes KLHL25 and KLHL37. KLHL25, also called ectoderm-neural cortex protein 2 (ENC-2), is a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin ligase complex required for translational homeostasis. The BCR(KLHL25) ubiquitin ligase complex acts by mediating ubiquitination of hypophosphorylated EIF4EBP1 (4E-BP1). KLHL37, also called ectoderm-neural cortex protein 1 (ENC-1), or nuclear matrix protein NRP/B, or p53-induced gene 10 protein, is an actin-binding nuclear matrix protein that associates with p110(RB), and is involved in the regulation of neuronal process formation and in differentiation of neural crest cells.	98
350540	cd18465	BACK_KLHL26	BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 26 (KLHL26). KLHL26 is a kelch family protein encoded by gene klhl26, which is regulated by p53 via fuzzy tandem repeats.	97
350541	cd18466	BACK_KLHL27_IPP	BACK (BTB and C-terminal Kelch) domain found in intracisternal A particle-promoted polypeptide (IPP). IPP, also termed Kelch-like protein 27 (KLHL27), is an actin-binding protein that may play a role in organizing the actin cytoskeleton.	103
350542	cd18467	BACK_KLHL28_BTBD5	BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 28 (KLHL28). KLHL28, also termed BTB/POZ domain-containing protein 5 (BTBD5), belongs to the KLHL family. Its function remains unclear.	99
350543	cd18468	BACK_KLHL29_KBTBD9	BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 29 (KLHL29). KLHL29, also termed Kelch repeat and BTB domain-containing protein 9 (KBTBD9), belongs to the KLHL family. Its function remains unclear. A nuclear receptor subfamily 5, group A, member 2 (NR5A2)-Kelch-like family member 29 (KLHL29) fusion transcript may participate in the origin or progression of some colon cancers.	102
350544	cd18469	BACK_KLHL30	BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 30 (KLHL30). KLHL30 belongs to the KLHL family. Its function remains unclear. Differential expression of the KLHL30 gene has been observed in glioblastoma multiforme versus normal brain.	104
350545	cd18470	BACK_KLHL31_KBTBD1	BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 31 (KLHL31). KLHL31, also termed BTB and kelch domain-containing protein 6, or Kelch repeat and BTB domain-containing protein 1, or Kelch-like protein KLHL, is a transcriptional repressor in MAPK/JNK signaling pathway that regulates cellular functions. Overexpression inhibits the transcriptional activities of both the TPA-response element (TRE) and serum response element (SRE).	98
350546	cd18471	BACK_KLHL32_BKLHD5	BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 32 (KLHL32). KLHL32, also termed BTB and kelch domain-containing protein 5 (BKLHD5), belongs to the KLHL family. Its function remains unclear. KLHL32 SNPs may be associated with body mass index in individuals of African ancestry.	98
350547	cd18472	BACK_KLHL33	BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 33 (KLHL33). KLHL33 belongs to the KLHL family. Its function remains unclear. KLHL33 SNPs may be associated with prostate cancer risk.	75
350548	cd18473	BACK_KLHL34	BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 34 (KLHL34). KLHL34 belongs to the KLHL family. Its function remains unclear. The methylation status of KLHL34 cg14232291 may be a predictive candidate of sensitivity to preoperative chemoradiation therapy.	106
350549	cd18474	BACK_KLHL35	BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 35 (KLHL35). KLHL35 belongs to the KLHL family. Its function remains unclear. Hypermethylation of KLHL35 is associated with hepatocellular carcinoma and abdominal aortic aneurysm.	79
350550	cd18475	BACK_KLHL36	BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 36 (KLHL36). KLHL36 may act as a substrate-specific adaptor of an E3 ubiquitin-protein ligase complex which mediates the ubiquitination and subsequent proteasomal degradation of target proteins.	100
350551	cd18476	BACK_KLHL38	BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 38 (KLHL38). KLHL38 belongs to the KLHL family. Its function remains unclear. The klhl38 gene has recently been identified as a possible diapause (a temporary arrest of development during early ontogeny) gene, as it is significantly up-regulated during diapause. It may also be involved in chicken preadipocyte differentiation.	99
350552	cd18477	BACK_KLHL40_like	BACK (BTB and C-terminal Kelch) domain found in Kelch-like proteins, KLHL40 and KLHL41. The family includes Kelch-like proteins, KLHL40 and KLHL41. KLHL40 is a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin ligase complex that acts as a key regulator of skeletal muscle development. KLHL41 is a novel kelch related protein that is involved in pseudopod elongation in transformed cells.	99
350553	cd18478	BACK_KLHL42_KLHDC5	BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 42 (KLHL42). KLHL42, also called Cullin-3-binding protein 9 (Ctb9), or Kelch domain-containing protein 5, is a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complex required for mitotic progression and cytokinesis. The BCR(KLHL42) E3 ubiquitin ligase complex mediates the ubiquitination and subsequent degradation of KATNA1. KLHL42 is involved in microtubule dynamics throughout mitosis.	103
350554	cd18479	BACK_KBTBD2	BACK (BTB and C-terminal Kelch) domain found in Kelch repeat and BTB domain-containing protein 2 (KBTBD2). KBTBD2, also called BTB and kelch domain-containing protein 1 (BKLHD1), plays an essential role in regulating the insulin-signaling pathway. It is a BTB-Kelch family substrate recognition subunit of the Cullin-3-based E3 ubiquitin ligase, which targets p85alpha, the regulatory subunit of the phosphoinositol-3-kinase (PI3K) heterodimer, causing p85alpha ubiquitination and proteasome-mediated degradation.	96
350555	cd18480	BACK_KBTBD3	BACK (BTB and C-terminal Kelch) domain found in Kelch repeat and BTB domain-containing protein 3 (KBTBD3). KBTBD3, also termed BTB and kelch domain-containing protein 3 (BKLHD3), is a BTB-Kelch family protein. Its function remains unclear.	82
350556	cd18481	BACK_KBTBD4	BACK (BTB and C-terminal Kelch) domain found in Kelch repeat and BTB domain-containing protein 4 (KBTBD4). KBTBD4, also termed BTB and kelch domain-containing protein 4 (BKLHD4), is a BTB-BACK-Kelch domain protein belonging to a large family of cullin-RING ubiquitin ligase adaptors that facilitate the ubiquitination of target substrates.	88
350557	cd18482	BACK_KBTBD6_7	BACK (BTB and C-terminal Kelch) domain found in Kelch repeat and BTB domain-containing proteins, KBTBD6 and KBTBD7. KBTBD6 and KBTBD7 are substrate adaptors of a cullin-3 RING ubiquitin ligase complex that mediates ubiquitylation and proteasomal degradation of T-lymphoma and metastasis gene 1 (TIAM1), a RAC1-specific guanine exchange factor (GEF), by cooperating with gamma-aminobutyric acid receptor-associated proteins (GABARAP). KBTBD7 may also act as a new transcriptional activator in mitogen-activated protein kinase (MAPK) signaling.	99
350558	cd18483	BACK_KBTBD8	BACK (BTB and C-terminal Kelch) domain found in Kelch repeat and BTB domain-containing protein 8 (KBTBD8). KBTBD8, also called T-cell activation kelch repeat protein (TA-KRP), is a BTB-kelch family protein that is located in the Golgi apparatus and translocates to the spindle apparatus during mitosis. It acts as a substrate-specific adaptor for a BCR (BTB-CUL3-RBX1) E3 ubiquitin ligase complex that acts as a regulator of neural crest specification. The BCR(KBTBD8) complex monoubiquitylates NOLC1 and its paralogue TCOF1, the mutation of which underlies the neurocristopathy Treacher Collins syndrome.	97
350559	cd18484	BACK_KBTBD11_CMLAP	BACK (BTB and C-terminal Kelch) domain found in Kelch repeat and BTB domain-containing protein 11 (KBTBD11). KBTBD11, also termed chronic myelogenous leukemia-associated protein (CMLAP), or Kelch domain-containing protein 7B, or KLHDC7C, is a BTB-Kelch family protein. Its function remains unclear. A novel polymorphism rs11777210 in KBTBD11 is significantly associated with colorectal cancer risk; KBTBD11 may function as a tumor suppressor. KBTBD11 hypomethylation may also be a potential target for differentiating between the mostly fatal TCF3-HLF and curable TCF3-PBX1 pediatric acute lymphoblastic leukemia subtypes.	77
350560	cd18485	BACK_KBTBD12	BACK (BTB and C-terminal Kelch) domain found in Kelch repeat and BTB domain-containing protein 12 (KBTBD12). KBTBD12, also termed Kelch domain-containing protein 6 (KLHDC6), is a BTB-Kelch family protein. Its function remains unclear.	100
350561	cd18486	BACK_KBTBD13	BACK (BTB and C-terminal Kelch) domain found in Kelch repeat and BTB domain-containing protein 13 (KBTBD13). KBTBD13 is a muscle-specific protein. Autosomal dominant mutations may cause nemaline myopathy (NEM); these disease-associated mutations are located in conserved Kelch repeats and are predicted to disrupt the beta-propeller structure. KBTBD13 may act as a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin ligase complex that functions as a muscle specific ubiquitin ligase, and thereby implicate the ubiquitin proteasome pathway in the pathogenesis of KBTBD13-associated NEM.	89
350562	cd18487	BACK_BTBD1_like	BACK (BTB and C-terminal Kelch) domain found in BTB/POZ domain-containing proteins, BTBD1 and BTBD2. This subfamily includes BTB/POZ domain-containing proteins BTBD1 and BTBD2, both of which are BTB-domain-containing Kelch-like proteins that interact with DNA topoisomerase 1 (Topo1), a key enzyme in cell survival. BTBD1 and BTBD2 colocalize to cytoplasmic bodies with the RBCC/tripartite motif protein, TRIM5delta.	95
350563	cd18488	BACK_BTBD3_like	BACK (BTB and C-terminal Kelch) domain found in BTB/POZ domain-containing proteins, BTBD3 and BTBD6. This subfamily includes BTB/POZ domain-containing proteins BTBD3 and BTBD6, both of which are BTB-domain-containing Kelch-like proteins. BTBD3 controls dendrite orientation toward active axons in mammalian neocortex. BTBD6 is required for proper embryogenesis and plays an essential evolutionarily-conserved role during neuronal development.	95
350564	cd18489	BACK_BTBD7	BACK (BTB and C-terminal Kelch) domain found in BTB/POZ domain-containing protein 7 (BTBD7). BTBD7 is a crucial regulator that is essential for region-specific epithelial cell dynamics and branching morphogenesis. It has been implicated in various cancers. BTBD7 contains two BTB domains and a BACK domain.	98
350565	cd18490	BACK_BTBD8	BACK (BTB and C-terminal Kelch) domain found in BTB/POZ domain-containing protein 8 (BTBD8). BTBD8 is a BTB-domain-containing Kelch-like protein that may play a role in developmental process. It may also act as a protein-protein adaptor in a transcription complex and thus may be involved in brain development.	64
350566	cd18491	BACK_ABTB2_like	BACK (BTB and C-terminal Kelch) domain found in ankyrin repeat and BTB/POZ domain-containing protein 2 (ABTB2) and similar proteins. ABTB2, also called Bood POZ containing gene type 2 (BPOZ-2), is a scaffold protein that controls the degradation of many biological proteins involved in a range of functions from embryonic development to tumor progression. It may be involved in the initiation of hepatocyte growth. It inhibits the aggregation of alpha-synuclein, with implications in Parkinson's disease. ABTB2 functions as an adaptor protein for the E3 ubiquitin ligase scaffold protein Cullin-3. It directly binds to eukaryotic elongation factor 1A1 (eEF1A1) to promote eEF1A1 ubiquitylation and degradation and prevent translation. It is also involved in the growth suppressive effect of the phosphatase and tensin homologue (PTEN). This subfamily also includes BTB/POZ domain-containing protein 11 (BTBD11), whose function is unclear.	72
350567	cd18492	BACK_BTBD16	BACK (BTB and C-terminal Kelch) domain found in BTB/POZ domain-containing protein 16 (BTBD16). BTBD16 is a BTB-domain-containing Kelch-like protein. Its function remains unclear. BTBD16 SNPs may be bipolar disorder (BD) genetic susceptibility variants exhibiting genetic background-dependent effects.	97
350568	cd18493	BACK_BTBD17	BACK (BTB and C-terminal Kelch) domain found in BTB/POZ domain-containing protein 17 (BTBD17). BTBD17, also termed galectin-3-binding protein-like, is a BTB-domain-containing Kelch-like protein. Its function remains unclear. It may be involved in hepatocellular carcinoma development and progression.	74
350569	cd18494	BACK_BTBD19	BACK (BTB and C-terminal Kelch) domain found in BTB/POZ domain-containing protein 19 (BTBD19). BTBD19 is a BTB-domain-containing Kelch-like protein. Its function remains unclear.	73
350570	cd18495	BACK_GCL	BACK (BTB and C-terminal Kelch) domain found in Drosophila melanogaster protein germ cell-less (GCL) and similar proteins. The GCL protein is a nuclear envelope protein highly conserved between the mammalian and Drosophila orthologs. Drosophila melanogaster GCL is a key regulator required for the specification of pole cells and primordial germ cell formation in Drosophila embryos. Both human germ cell-less protein-like 1 (GMCL1) and germ cell-less protein-like 1-like (GMCL1P1 or GMCL1L) may function in spermatogenesis. They may also be substrate-specific adaptors of an E3 ubiquitin-protein ligase complex which mediates the ubiquitination and subsequent proteasomal degradation of target proteins.	78
350571	cd18496	BACK_LGALS3BP	BACK (BTB and C-terminal Kelch) domain found in lectin galactoside-binding soluble 3-binding protein (LGALS3BP). LGALS3BP, also called galectin-3-binding protein, or basement membrane autoantigen p105, or Mac-2-binding protein (MAC2BP/M2BP), or tumor-associated antigen 90K, promotes integrin-mediated cell adhesion. It may stimulate host defense against viruses and tumor cells.	74
350572	cd18497	BACK_ABTB1_BPOZ	BACK (BTB and C-terminal Kelch) domain found in ankyrin repeat and BTB/POZ domain-containing protein 1 (ABTB1). ABTB1, also called elongation factor 1A-binding protein, or Bood POZ containing gene type 1 (BPOZ-1), is an anti-proliferative factor that may act as a mediator of the phosphatase and tensin homologue (PTEN) growth-suppressive signaling pathway. It may play a role in developmental processes.	72
350573	cd18498	BACK_RCBTB1_2	BACK (BTB and C-terminal Kelch) domain found in RCC1 and BTB domain-containing proteins, RCBTB1 and RCBTB2. The RCC1-related guanine nucleotide exchange factor (GEF) family includes RCC1 and BTB domain-containing proteins, RCBTB1 and RCBTB2, both of which are chromosome condensation regulator-like guanine nucleotide exchange factors.	64
350574	cd18499	BACK_RHOBTB	BACK (BTB and C-terminal Kelch) domain found in Rho-related BTB domain-containing proteins (RhoBTB). RhoBTB proteins constitute a subfamily of atypical members within the Rho family of small guanosine triphosphatases (GTPases), which is characterized by containing a GTPase domain (in most cases, non-functional) followed by a proline rich region, a tandem of 2 BTB domains, and a C-terminal BACK domain. In humans, the RhoBTB subfamily consists of 3 isoforms: RhoBTB1, RhoBTB2, and RhoBTB3. Orthologs are present in several other eukaryotes, such as Drosophila and Dictyostelium, but have been lost in plants and fungi.	76
350575	cd18500	BACK_IBtk	BACK (BTB and C-terminal Kelch) domain found in inhibitor of Bruton tyrosine kinase (IBtk). IBtk is an inhibitor of Bruton's tyrosine kinase (Btk), thereby playing a role in B-cell development.	60
350576	cd18501	BACK_ANKFY1_Rank5	BACK (BTB and C-terminal Kelch) domain found in rabankyrin-5 (Rank-5). Rank-5, also called ankyrin repeat and FYVE domain-containing protein 1 (ANKFY1), or ankyrin repeats hooked to a zinc finger motif (ANKHZN), is a Rab5 effector that regulates and coordinates different endocytic mechanisms.	89
350577	cd18502	BACK_NS1BP_IVNS1ABP	BACK (BTB and C-terminal Kelch) domain found in influenza virus NS1A-binding protein (NS1-BP). NS1-BP, also called NS1-binding protein, or Aryl hydrocarbon receptor-associated protein 3, or IVNS1ABP, is a novel protein that interacts with the influenza A virus nonstructural NS1 protein, which is relocalized in the nuclei of infected cells. It plays a role in cell division and in the dynamic organization of the actin skeleton as a stabilizer of actin filaments by association with F-actin through Kelch repeats. It also interacts with alpha-enolase/MBP-1 and is involved in c-Myc gene transcriptional control.	99
350578	cd18503	BACK_calicin	BACK (BTB and C-terminal Kelch) domain found in calicin. Calicin is a basic cytoskeletal protein involved in the formation and maintenance of the highly regular organization of the postacrosomal perinuclear theca, the calyx of mammalian spermatozoa.	78
350579	cd18504	BACK_ARIA_like	BACK (BTB and C-terminal Kelch) domain found in plant ARM repeat protein interacting with ABF2 (ARIA) and similar proteins. ARIA is an ARM repeat protein that acts as a positive regulator of ABA response via the modulation of the transcriptional activity of ABF2, a transcription factor which controls ABA-dependent gene expression via the G-box-type ABA-responsive elements. ARIA is a novel abscisic acid signaling component. It negatively regulates seed germination and young seedling growth.	64
350580	cd18505	BACK1_LZTR1	first BACK (BTB and C-terminal Kelch) domain found in leucine-zipper-like transcriptional regulator 1 (LZTR-1). LZTR-1 is a Golgi BTB-kelch protein that is degraded upon induction of apoptosis. It may also function as a transcriptional regulator that plays a crucial role in embryogenesis. Germline loss-of-function mutations in LZTR-1 predispose to an inherited disorder of multiple schwannomas.	59
350581	cd18506	BACK2_LZTR1	second BACK (BTB and C-terminal Kelch) domain found in leucine-zipper-like transcriptional regulator 1 (LZTR-1). LZTR-1 is a Golgi BTB-kelch protein that is degraded upon induction of apoptosis. It may also function as a transcriptional regulator that plays a crucial role in embryogenesis. Germline loss-of-function mutations in LZTR-1 predispose to an inherited disorder of multiple schwannomas.	61
350582	cd18507	BACK_GPRS_like	BACK (BTB and C-terminal Kelch) domain found in Drosophila melanogaster serine-enriched protein (GPRS) and similar proteins. The family includes uncharacterized Drosophila melanogaster serine-enriched protein (GPRS) and similar proteins.	80
350583	cd18508	BACK_KEL_like	BACK (BTB and C-terminal Kelch) domain found in Drosophila melanogaster ring canal kelch protein (KEL) and similar proteins. KEL, also termed kelch short protein, is a component of ring canals that regulates the flow of cytoplasm between cells. It binds actin and may be involved in the regulation of cytoplasm flow from nurse cells to the oocyte during oogenesis.	77
350584	cd18509	BACK_KLHL1	BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 1 (KLHL1). KLHL1 is a neuronal actin-binding protein that modulates voltage-gated CaV2.1 (P/Q-type) and CaV3.2 (alpha1H T-type) calcium channels. It may play a role in organizing the actin cytoskeleton in brain cells. KLHL1 contains a BTB domain and kelch repeat domains, characteristics of a kelch family protein.	106
350585	cd18510	BACK_KLHL4	BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 4 (KLHL4). KLHL4 shares high identity and similarity with the Drosophila kelch protein, a component of ring canals. It contains a BTB domain and kelch repeat domains, characteristics of a kelch family protein.	106
350586	cd18511	BACK_KLHL5	BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 5 (KLHL5). KLHL5 shares high identity and similarity with the Drosophila kelch protein, a component of ring canals. It contains a BTB domain and kelch repeat domains, characteristics of a kelch family protein. It is abundantly expressed in ovary, adrenal gland, and thymus.	106
350587	cd18512	BACK_KLHL2_Mayven	BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 2 (KLHL2). KLHL2, also called actin-binding protein Mayven, is a novel actin-binding protein predominantly expressed in the brain. It plays a role in the reorganization of the actin cytoskeleton, and promotes growth of cell projections in oligodendrocyte precursors. KLHL2 is a component of a cullin-RING-based BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complex that mediates the ubiquitination of target proteins, such as NPTXR, leading most often to their proteasomal degradation.	130
350588	cd18513	BACK_KLHL3	BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 3 (KLHL3). KLHL3 serves as a substrate adapter in Cullin3 (Cul3) E3 ubiquitin ligase complexes. It is a component of an E3 ubiquitin ligase complex that regulates blood pressure by targeting With-No-Lysine (WNK) kinases for degradation.	130
350589	cd18514	BACK_KLHL25_ENC2	BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 25 (KLHL25). KLHL25, also called ectoderm-neural cortex protein 2 (ENC-2), is a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin ligase complex required for translational homeostasis. The BCR(KLHL25) ubiquitin ligase complex acts by mediating ubiquitination of hypophosphorylated EIF4EBP1 (4E-BP1).	99
350590	cd18515	BACK_KLHL37_ENC1	BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 37 (KLHL37). KLHL37, also called ectoderm-neural cortex protein 1 (ENC-1), or nuclear matrix protein NRP/B, or p53-induced gene 10 protein, is an actin-binding nuclear matrix protein that associates with p110(RB), and is involved in the regulation of neuronal process formation and in differentiation of neural crest cells.	98
350591	cd18516	BACK_KLHL40_KBTBD5	BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 40 (KLHL40). KLHL40, also called Kelch repeat and BTB domain-containing protein 5, or sarcosynapsin, is a substrate-specific adaptor of a BCR (BTB-CUL3-RBX1) E3 ubiquitin ligase complex that acts as a key regulator of skeletal muscle development. Mutations in KLHL40 may cause severe autosomal-recessive nemaline myopathy.	99
350592	cd18517	BACK_KLHL41_KBTBD10	BACK (BTB and C-terminal Kelch) domain found in Kelch-like protein 41 (KLHL41). KLHL41, also called Kel-like protein 23, or Kelch repeat and BTB domain-containing protein 10, or Kelch-related protein 1 (Krp1), or sarcosin, is a novel kelch related protein that is involved in pseudopod elongation in transformed cells. It is also involved in skeletal muscle development and differentiation. It regulates proliferation and differentiation of myoblasts and plays a role in myofibril assembly by promoting lateral fusion of adjacent thin fibrils into mature, wide myofibrils.	99
350593	cd18518	BACK_SPOP	BACK (BTB and C-terminal Kelch) domain found in speckle-type POZ protein (SPOP). SPOP, also termed HIB homolog 1, or Roadkill homolog 1, is a novel nuclear speckle-type protein which serves as an adaptor of cullin-RING-based BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complex that mediates the ubiquitination and proteasomal degradation of target proteins, such as BRMS1, DAXX, PDX1/IPF1, GLI2 and GLI3.	71
350594	cd18519	BACK_SPOPL	BACK (BTB and C-terminal Kelch) domain found in speckle-type POZ protein-like (SPOPL). SPOPL, also termed HIB homolog 2, or Roadkill homolog 2, is a component of a cullin-RING-based BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complex that mediates the ubiquitination and subsequent proteasomal degradation of target proteins. The complexes containing homodimeric SPOPL or the heterodimer formed by speckle-type POZ protein (SPOP) and SPOPL are less efficient than ubiquitin ligase complexes containing only SPOP.	96
350595	cd18520	BACK_roadkill_like	BACK (BTB and C-terminal Kelch) domain found in Drosophila melanogaster protein roadkill and similar proteins. Drosophila melanogaster protein roadkill, also termed Hh-induced MATH and BTB domain-containing protein (HIB), is a hedgehog-induced BTB protein that modulates hedgehog signaling by degrading Ci/Gli transcription factor.	74
350596	cd18521	BACK_Tdpoz	BACK (BTB and C-terminal Kelch) domain found in TD and POZ domain-containing proteins, Tdpoz1-4. TDPOZ is a family of bipartite animal and plant proteins that contain a tumor necrosis factor receptor-associated factor (TRAF) domain (TD) and a POZ/BTB domain. TDPOZ proteins may be nuclear scaffold proteins probably involved in transcription regulation in early development and other cellular processes. This subfamily contains only mammalian members. Plant TDPOZ proteins contain a MATH domain at the N-terminal region and are named "BTB/POZ and MATH domain-containing proteins (BPM)", and are not included in this subfamily.	67
350597	cd18522	BACK_BTBD1	BACK (BTB and C-terminal Kelch) domain found in BTB/POZ domain-containing protein 1 (BTBD1). BTBD1, also called Hepatitis C virus NS5A-transactivated protein 8 or HCV NS5A-transactivated protein 8, is a BTB-domain-containing Kelch-like protein specifically expressed in skeletal muscle. It interacts with DNA topoisomerase 1 (Topo1), a key enzyme in cell survival. BTBD1 and BTBD2 colocalize to cytoplasmic bodies with the RBCC/tripartite motif protein, TRIM5delta. It may serve as a substrate-specific adaptor of an E3 ubiquitin-protein ligase complex which mediates the ubiquitination and subsequent proteasomal degradation of target proteins.	107
350598	cd18523	BACK_BTBD2	BACK (BTB and C-terminal Kelch) domain found in BTB/POZ domain-containing protein 2 (BTBD2). BTBD2 is a BTB-domain-containing Kelch-like protein that interacts with DNA topoisomerase 1 (Topo1), a key enzyme in cell survival. BTBD1 and BTBD2 colocalize to cytoplasmic bodies with the RBCC/tripartite motif protein, TRIM5delta.	106
350599	cd18524	BACK_BTBD3	BACK (BTB and C-terminal Kelch) domain found in BTB/POZ domain-containing protein 3 (BTBD3). BTBD3 is a BTB-domain-containing Kelch-like protein that controls dendrite orientation toward active axons in mammalian neocortex.	95
350600	cd18525	BACK_BTBD6	BACK (BTB and C-terminal Kelch) domain found in BTB/POZ domain-containing protein 6 (BTBD6). BTBD6, also called lens BTB domain protein, is a BTB-domain-containing Kelch-like protein that is required for proper embryogenesis and plays an essential evolutionarily-conserved role during neuronal development.	95
350601	cd18526	BACK_ABTB2	BACK (BTB and C-terminal Kelch) domain found in ankyrin repeat and BTB/POZ domain-containing protein 2 (ABTB2). ABTB2, also called Bood POZ containing gene type 2 (BPOZ-2), is a scaffold protein that controls the degradation of many biological proteins with functions ranging from embryonic development to tumor progression. It may be involved in the initiation of hepatocyte growth. It inhibits the aggregation of alpha-synuclein, with implications in Parkinson's disease. ABTB2 functions as an adaptor protein for the E3 ubiquitin ligase scaffold protein Cullin-3. It directly binds to eukaryotic elongation factor 1A1 (eEF1A1) to promote eEF1A1 ubiquitylation and degradation and prevent translation. It is also involved in the growth suppressive effect of the phosphatase and tensin homologue (PTEN).	79
350602	cd18527	BACK_BTBD11	BACK (BTB and C-terminal Kelch) domain found in BTB/POZ domain-containing protein 11 (BTBD11). BTBD11, also termed ankyrin repeat and BTB/POZ domain-containing protein BTBD11, is a BTB-domain-containing Kelch-like protein. Its function remains unclear. The BTBD11 gene has been identified as an all-trans retinoic acid-responsive gene that may play a role in neural development.	83
350603	cd18528	BACK_RCBTB1	BACK (BTB and C-terminal Kelch) domain found in RCC1 and BTB domain-containing protein 1 (RCBTB1). RCBTB1, also called chronic lymphocytic leukemia deletion region gene 7 protein (CLLD7), or CLL deletion region gene 7 protein, or regulator of chromosome condensation and BTB domain-containing protein 1, or E4.5, is a novel chromosome condensation regulator-like guanine nucleotide exchange factor (GEF) that may be involved in cell cycle regulation by chromatin remodeling. It may also function as a tumor suppressor that regulates pathways of DNA damage/repair and apoptosis. Moreover, RCBTB1 acts as a putative substrate adaptor for a cullin3 (CUL3) E3 ligase complex that mediates the ubiquitination and subsequent proteasomal degradation of target proteins. Biallelic mutations in RCBTB1 may cause isolated and syndromic retinal dystrophy.	66
350604	cd18529	BACK_RCBTB2	BACK (BTB and C-terminal Kelch) domain found in RCC1 and BTB domain-containing protein 2 (RCBTB2). RCBTB2, also called chromosome condensation 1-like (CHC1-L), or RCC1-like G exchanging factor, or regulator of chromosome condensation and BTB domain-containing protein 2, is a chromosome condensation regulator-like guanine nucleotide exchange factor (GEF) protein for the Ras-related GTPase Ran.	65
350605	cd18530	BACK_RHOBTB1	BACK (BTB and C-terminal Kelch) domain found in Rho-related BTB domain-containing protein 1 (RhoBTB1). RhoBTB1 is an atypical member of the Rho GTPase family of signaling proteins, which is characterized by containing a carboxyl terminal extension that harbors two BTB domains and a BACK domain and is capable of assembling cullin 3-dependent ubiquitin ligase complexes. It functions as a tumor suppressor that regulates the integrity of the Golgi complex through the methyltransferase METTL7B. RhoBTB1 also acts as an adaptor of the Cullin-3-dependent E3 ubiquitin ligase complex.	100
350606	cd18531	BACK_RHOBTB2	BACK (BTB and C-terminal Kelch) domain found in Rho-related BTB domain-containing protein 2 (RhoBTB2). RhoBTB2, also called Deleted in breast cancer 2 gene protein (DBC2), or p83, is an atypical member of the Rho GTPase family of signaling proteins, which is characterized by containing a carboxyl terminal extension that harbors two BTB domains and a BACK domain and is capable of assembling cullin 3-dependent ubiquitin ligase complexes. It functions as a tumor suppressor that regulates the expression of the methyltransferase METTL7A. RhoBTB2 also acts as an adaptor of the Cullin-3-dependent E3 ubiquitin ligase complex.	97
350607	cd18532	BACK_RHOBTB3	BACK (BTB and C-terminal Kelch) domain found in Rho-related BTB domain-containing protein 3 (RhoBTB3). RhoBTB3 is an atypical member of the Rho GTPase family of signaling proteins, which is characterized by containing a carboxyl terminal extension that harbors two BTB domains and a BACK domain and is capable of assembling cullin 3-dependent ubiquitin ligase complexes. It is a Golgi-associated Rho-related ATPase that regulates the S/G2 transition of the cell cycle by targeting cyclin E for ubiquitylation. RhoBTB3 is involved in vesicle trafficking and in targeting proteins for degradation in the proteasome. It binds directly to Rab9 GTPase and functions with Rab9 in protein transport from endosomes to the trans Golgi network. It also promotes proteasomal degradation of Hypoxia-inducible factor alpha (HIFalpha) by facilitating hydroxylation and ubiquitination.	83
350509	cd18533	PTP_fungal	fungal protein tyrosine phosphatases. This subfamily contains Saccharomyces cerevisiae protein-tyrosine phosphatases 1 (PTP1) and 2 (PTP2), Schizosaccharomyces pombe PTP1, PTP2, and PTP3, and similar fungal proteins. PTPs (EC 3.1.3.48) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. PTP2, together with PTP3, is the major phosphatase that dephosphorylates and inactivates the MAP kinase HOG1 and also modulates its subcellular localization.	212
350510	cd18534	DSP_plant_IBR5-like	dual specificity phosphatase domain of plant IBR5-like protein phosphatases. This subfamily is composed of Arabidopsis thaliana INDOLE-3-BUTYRIC ACID (IBA) RESPONSE 5 (IBR5) and similar plant proteins. IBR5 protein is also called SKP1-interacting partner 33. The IBR5 gene encodes a dual-specificity phosphatase (DUSP) which acts as a positive regulator of plant responses to auxin and abscisic acid. DUSPs function as protein-serine/threonine phosphatases (EC 3.1.3.16) and protein-tyrosine-phosphatases (EC 3.1.3.48). Typical DUSPs, also called mitogen-activated protein kinase (MAPK) phosphatases (MKPs), deactivate MAPKs by dephosphorylating the threonine and tyrosine residues in the conserved Thr-Xaa-Tyr motif residing in their activation sites. IBR5 is an atypical DUSP; it contains the catalytic dual specificity phosphatase domain but lacks the N-terminal Cdc25/rhodanese-like domain that is present in typical DUSPs. It has been shown to target MPK12, which is a negative regulator of auxin signaling.	130
350511	cd18535	PTP-IVa3	protein tyrosine phosphatase type IVA 3. Protein tyrosine phosphatase type IVA 3 (PTP-IVa3), also known as protein-tyrosine phosphatase of regenerating liver 3 (PRL-3), stimulates progression from G1 into S phase during mitosis and enhances cell proliferation, cell motility and invasive activity, and promotes cancer metastasis. It exerts its oncogenic functions through activation of PI3K/Akt, which is a key regulator of the rapamycin-sensitive mTOR complex 1. PRL-3 is a member of the PTP-IVa/PRL family of small, prenylated phosphatases that are the most oncogenic of all PTPs. PRLs associate with magnesium transporters of the cyclin M (CNNM) family, which results in increased intracellular magnesium levels that promote oncogenic transformation.	154
350512	cd18536	PTP-IVa2	protein tyrosine phosphatase type IVA 2. Protein tyrosine phosphatase type IVA 2 (PTP-IVa2), also known as protein-tyrosine phosphatase of regenerating liver 2 (PRL-2), stimulates progression from G1 into S phase during mitosis and promotes tumors. It regulates tumor cell migration and invasion through an ERK-dependent signaling pathway. Its overexpression correlates with breast tumor formation and progression. PRL-2 is a member of the PTP-IVa/PRL family of small, prenylated phosphatases that are the most oncogenic of all PTPs. PRLs associate with magnesium transporters of the cyclin M (CNNM) family, which results in increased intracellular magnesium levels that promote oncogenic transformation.	155
350513	cd18537	PTP-IVa1	protein tyrosine phosphatase type IVA 1. Protein tyrosine phosphatase type IVA 1 (PTP-IVa1), also known as protein-tyrosine phosphatase of regenerating liver 1 (PRL-1), stimulates progression from G1 into S phase during mitosis and enhances cell proliferation, cell motility and invasive activity, and promotes cancer metastasis. It may play a role in the development and maintenance of differentiating epithelial tissues. PRL-1 promotes cell growth and migration by activating both the ERK1/2 and RhoA pathways. It is a member of the PTP-IVa/PRL family of small, prenylated phosphatases that are the most oncogenic of all PTPs. PRLs associate with magnesium transporters of the cyclin M (CNNM) family, which results in increased intracellular magnesium levels that promote oncogenic transformation.	167
350514	cd18538	PFA-DSP_unk	unknown subfamily of atypical dual-specificity phosphatases from fungi. This uncharacterized subfamily belongs to the plant and fungi atypical dual-specificity phosphatases (PFA-DSPs) group of atypical DSPs that are present in plants, fungi, kinetoplastids, and slime molds. They share structural similarity with atypical- and lipid phosphatase DSPs from mammals. The PFA-DSP group is composed of active as well as inactive phosphatases. This unknown subgroup contains the conserved CxxxxxR catalytic motif present in active cysteine phosphatases.	145
349786	cd18539	SRP_G	GTPase domain of signal recognition particle protein. The signal recognition particle (SRP) mediates the transport to or across the plasma membrane in bacteria and the endoplasmic reticulum in eukaryotes. SRP recognizes N-terminal signal sequences of newly synthesized polypeptides at the ribosome. The SRP-polypeptide complex is then targeted to the membrane by an interaction between SRP and its cognate receptor (SR). In mammals, SRP consists of six protein subunits and a 7SL RNA. One of these subunits is a 54 kd protein (SRP54), which is a GTP-binding protein that interacts with the signal sequence when it emerges from the ribosome. SRP54 is a multidomain protein that consists of an N-terminal domain, followed by a central G (GTPase) domain and a C-terminal M domain.	193
349984	cd18540	ABC_6TM_exporter_like	Six-transmembrane helical domain (TMD) of an uncharacterized ABC exporter, and similar proteins. This group includes a subunit of six transmembrane (TM) helices typically found in the ATP-binding cassette (ABC) transporters that function as exporters, which contain 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds and various types of lipids. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied between the different types of transporters, suggesting the chemical diversity of the translocated substrates, while NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane. However, some ABC genes are organized as half-transporters, which must form either homodimers or heterodimers to form a functional transporter. The ABC exporters play a role in multidrug resistance to antibiotics and anticancer agents, and mutations in these proteins have been shown to cause severe human diseases such as cystic fibrosis.	295
349985	cd18541	ABC_6TM_TmrB_like	Six-transmembrane helical domain (TmrB) of the heterodimeric Thermus thermophilus multidrug resistance proteins TmrAB, and similar proteins. This group represents the six-transmembrane helical domain (6-TMD) of the heterodimeric Thermus thermophilus multidrug resistance proteins A and B (TmrAB), a homolog of the Antigen Translocation Complex Tap, and similar proteins. TmrAB has been shown to be able to restore antigen processing in human TAP-deficient cells. The 6 transmembrane (TM) helices are typically found in the ATP-binding cassette (ABC) transporters that function as exporters, which contain 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds and various types of lipids. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied between the different types of transporters, suggesting significant structural diversity of the translocated substrates, while NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane. However, some ABC genes are organized as half-transporters, which must form either homodimers or heterodimers to form a functional transporter. The ABC exporters play a role in multidrug resistance to antibiotics and anticancer agents, and mutations in these proteins have been shown to cause severe human diseases such as cystic fibrosis.	293
349986	cd18542	ABC_6TM_YknU_like	Six-transmembrane helical domain (6-TMD) of the uncharacterized ABC transporter YknU and similar proteins. This group represents the six-transmembrane helical domain (6-TMD) of the uncharacterized ABC transporter YknU and similar proteins. This TMD possesses the ATP-binding cassette (ABC) exporter fold, which is characterized by 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds and a various type of lipids. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied between the different type of transporters, suggesting significant structural diversity of the translocated substrates, while NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane. However, some ABC genes are organized as half-transporters, which must form either homodimers or heterodimers to form a functional transporter. The ABC exporters play a role in multidrug resistance to antibiotics and anticancer agents, and mutations in these proteins have been shown to cause severe human diseases such as cystic fibrosis.	292
349987	cd18543	ABC_6TM_Rv0194_D1_like	Six-transmembrane helical domain 1 (TMD1) of the multidrug efflux ABC transporter Rv0194 and similar proteins. This group includes the six-transmembrane helical domain 1 (TMD1) of the multidrug efflux ATP-binding/permease protein Rv0194 from Mycobacterium tuberculosis and similar proteins. This TMD possesses the ATP-binding cassette (ABC) exporter fold, which is characterized by 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds and a various type of lipids. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied between the different type of transporters, suggesting significant structural diversity of the translocated substrates, while NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane. However, some ABC genes are organized as half-transporters, which must form either homodimers or heterodimers to form a functional transporter. The ABC exporters play a role in multidrug resistance to antibiotics and anticancer agents, and mutations in these proteins have been shown to cause severe human diseases such as cystic fibrosis.	291
349988	cd18544	ABC_6TM_TmrA_like	Six-transmembrane helical domain (TmrA) of the heterodimeric Thermus thermophilus multidrug resistance proteins TmrAB, and similar proteins. This group represents the six-transmembrane helical domain (TmrA) of the heterodimeric Thermus thermophilus multidrug resistance proteins A and B (TmrAB), a homolog of the Antigen Translocation Complex TAP, and similar proteins. TmrAB has been shown to be able to restore antigen processing in human TAP-deficient cells. The 6-transmembrane (TM) helices typically found in the ATP-binding cassette (ABC) transporters that function as exporters, which contain 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds and a various type of lipids. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied between the different type of transporters, suggesting significant structural diversity of the translocated substrates, while NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane. However, some ABC genes are organized as half-transporters, which must form either homodimers or heterodimers to form a functional transporter. The ABC exporters play a role in multidrug resistance to antibiotics and anticancer agents, and mutations in these proteins have been shown to cause severe human diseases such as cystic fibrosis.	294
349989	cd18545	ABC_6TM_YknV_like	Six-transmembrane helical domain (6-TMD) of the uncharacterized ABC transporter YknV and similar proteins. This group represents the six-transmembrane helical domain (6-TMD) of the uncharacterized ABC transporter YknV and similar proteins. This TMD possesses the ATP-binding cassette (ABC) exporter fold, which is characterized by 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds and a various type of lipids. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied between the different type of transporters, suggesting significant structural diversity of the translocated substrates, while NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane. However, some ABC genes are organized as half-transporters, which must form either homodimers or heterodimers to form a functional transporter. The ABC exporters play a role in multidrug resistance to antibiotics and anticancer agents, and mutations in these proteins have been shown to cause severe human diseases such as cystic fibrosis.	293
349990	cd18546	ABC_6TM_Rv0194_D2_like	Six-transmembrane helical domain 2 (TMD2) of the multidrug efflux ABC transporter Rv0194 and similar proteins. This group includes the six-transmembrane helical domain 2 (TMD2) of the multidrug efflux ATP-binding/permease protein Rv0194 from Mycobacterium tuberculosis and similar proteins. This TMD possesses the ATP-binding cassette (ABC) exporter fold, which is characterized by 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds and a various type of lipids. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied between the different type of transporters, suggesting significant structural diversity of the translocated substrates, while NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane. However, some ABC genes are organized as half-transporters, which must form either homodimers or heterodimers to form a functional transporter. The ABC exporters play a role in multidrug resistance to antibiotics and anticancer agents, and mutations in these proteins have been shown to cause severe human diseases such as cystic fibrosis.	292
349991	cd18547	ABC_6TM_Tm288_like	Six-transmembrane helical domain Tm288 of a heterodimeric ABC transporter Tm287/288 from Thermotoga maritima and similar proteins. This group represents the six-transmembrane helical domain (Tm288) of a heterodimeric ABC transporter Tm287/288 from Thermotoga maritima and similar proteins. This TMD possesses the ATP-binding cassette (ABC) exporter fold, which is characterized by 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds, a various type of lipids and polypeptides. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied between the different type of transporters, suggesting significant structural diversity of the translocated substrates, while NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane by alternating between inward- and outward-facing conformations. Moreover, some ABC genes are organized as half-transporters, which must form either homodimers or heterodimers to form a functional transporter. The ABC exporters play a role in multidrug resistance to antibiotics and anticancer agents, and mutations in these proteins have been shown to cause severe human diseases such as cystic fibrosis.	298
349992	cd18548	ABC_6TM_Tm287_like	Six-transmembrane helical domain Tm287 of a heterodimeric ABC transporter Tm287/288 from Thermotoga maritima and similar proteins. This group represents the six-transmembrane helical domain (Tm287) of a heterodimeric ABC transporter Tm287/288 from Thermotoga maritima and similar proteins. This TMD possesses the ATP-binding cassette (ABC) exporter fold, which is characterized by 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds, a various type of lipids and polypeptides. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied between the different type of transporters, suggesting significant structural diversity of the translocated substrates, while NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane by alternating between inward- and outward-facing conformations. Moreover, some ABC genes are organized as half-transporters, which must form either homodimers or heterodimers to form a functional transporter. The ABC exporters play a role in multidrug resistance to antibiotics and anticancer agents, and mutations in these proteins have been shown to cause severe human diseases such as cystic fibrosis.	292
349993	cd18549	ABC_6TM_YwjA_like	Six-transmembrane helical domain of an uncharacterized ABC transporter YwjA and similar proteins. This group represents the six-transmembrane helical domain of an uncharacterized ABC transporter YwjA from Bacillus subtilis and similar proteins. This transmembrane (TM) subunit possesses the ATP-binding cassette (ABC) exporter fold, which is characterized by 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds, a various type of lipids and polypeptides. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied between the different type of transporters, suggesting significant structural diversity of the translocated substrates, while NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane by alternating between inward- and outward-facing conformations. Moreover, some ABC genes are organized as half-transporters, which must form either homodimers or heterodimers to form a functional transporter. The ABC exporters play a role in multidrug resistance to antibiotics and anticancer agents, and mutations in these proteins have been shown to cause severe human diseases such as cystic fibrosis.	295
349994	cd18550	ABC_6TM_exporter_like	Six-transmembrane helical domain (TMD) of an uncharacterized ABC exporter, and similar proteins. This group includes a subunit of six transmembrane (TM) helices typically found in the ATP-binding cassette (ABC) transporters that function as exporters, which contain 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds and a various type of lipids. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied between the different type of transporters, suggesting the chemical diversity of the translocated substrates, while NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane. However, some ABC genes are organized as half-transporters, which must form either homodimers or heterodimers to form a functional transporter. The ABC exporters play a role in multidrug resistance to antibiotics and anticancer agents, and mutations in these proteins have been shown to cause severe human diseases such as cystic fibrosis.	294
349995	cd18551	ABC_6TM_LmrA_like	Six-transmembrane helical domain of the multidrug resistance ABC transporter LmrA and similar proteins. This group represents the six-transmembrane helical domain of the multidrug resistance ABC transporter LmrA from Lactococcus lactis and similar proteins. This transmembrane (TM) subunit possesses the ATP-binding cassette (ABC) exporter fold, which is characterized by 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds, a various type of lipids and polypeptides. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied between the different type of transporters, suggesting significant structural diversity of the translocated substrates, while NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane by alternating between inward- and outward-facing conformations. Moreover, some ABC genes are organized as half-transporters, which must form either homodimers or heterodimers to form a functional transporter. The ABC exporters play a role in multidrug resistance to antibiotics and anticancer agents, and mutations in these proteins have been shown to cause severe human diseases such as cystic fibrosis.	289
349996	cd18552	ABC_6TM_MsbA_like	Six-transmembrane helical domain of the bacterial ABC lipid flippase MsbA and similar proteins. The bacterial lipid flippase MsbA is found in Gram-negative bacteria and transports lipid A and lipopolysaccharide (LPS) from the cytoplasmic leaflet to the periplasmic leaflet of the inner membrane. MsbA is also a polyspecific transporter capable of transporting a broad spectrum of drug molecules. Additionally, MsbA exhibits significant sequence similarity to mammalian multidrug resistance (MDR) proteins such as human MDR protein 1 (MDR1) and LmrA from Lactococcus lactis. This subgroup also contains a putative transporter Brevibacillus brevis TycD; the location of the tycD gene within the Tyc (tyrocidine) biosynthesis operon suggests that TycD may play a role in the secretion of the cyclic decapeptide antibiotic tyrocidine. This transmembrane (TM) subunit possesses the ATP-binding cassette (ABC) exporter fold, which is characterized by 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds, a various type of lipids and polypeptides. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied between the different type of transporters, suggesting significant structural diversity of the translocated substrates, while NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane by alternating between inward- and outward-facing conformations. Moreover, some ABC genes are organized as half-transporters, which must form either homodimers or heterodimers to form a functional transporter. The ABC exporters play a role in multidrug resistance to antibiotics and anticancer agents, and mutations in these proteins have been shown to cause severe human diseases such as cystic fibrosis.	292
349997	cd18553	ABC_6TM_PglK_like	Six-transmembrane helical domain of the ABC transporter PglK and similar proteins. This group represents the transmembrane (TM) domain of an active lipid-linked oligosaccharides flippase PglK (protein glycosylation K), which is a homodimeric ABC transporter that flips a lipid-linked oligosaccharide that serves as a glycan donor in N-linked protein glycosylation. PglK mediates the ATP-dependent translocation of the undecaprenylpyrophosphate-linked heptasaccharide intermediate across the cell membrane; this is an essential step during the N-linked protein glycosylation pathway. This TM subunit exhibits the ATP-binding cassette (ABC) exporter fold, which is characterized by 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds, a various type of lipids and polypeptides. Bacterial ABC exporters are typically expressed as half-transporters that contain one transmembrane domain (TMD) fused to a nucleotide-binding domain (NBD), which dimerize to form the full transporter.	300
349998	cd18554	ABC_6TM_Sav1866_like	Six-transmembrane helical domain of the bacterial ABC multidrug exporter Sav1866 and similar proteins. This group represents the homodimeric bacterial ABC multidrug exporter Sav1866, which is homologous to the lipid flippase MsbA, and both of which are functionally related to the human P-glycoprotein multidrug transporter (ABCB1 or MDR1). This transmembrane (TM) subunit possesses the ATP-binding cassette (ABC) exporter fold, which is characterized by 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds, a various type of lipids and polypeptides. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied between the different type of transporters, suggesting significant structural diversity of the translocated substrates, while NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane by alternating between inward- and outward-facing conformations. Bacterial exporters are typically formed by dimers of TMD-NBD half-transporters. Thus, most bacterial ABC transporters are formed of two identical TMDs and two identical NBDs.	299
349999	cd18555	ABC_6TM_T1SS_like	Six-transmembrane helical domain (6-TMD) of the ATP-binding cassette subunit in the type 1 secretion systems, and similar proteins. This group represents the six-transmembrane helical domain (6-TMD) of the ABC subunit in the type 1 secretion systems (T1SS) and similar proteins. These transporter subunits include HlyB, PrtD, CyaB, CvaB, RsaD, HasD, LipB, and LapB, among many others. T1SS are found in pathogenic Gram-negative bacteria (such as Escherichia coli, Vibrio cholerae or Bordetella pertussis) to export proteins (often proteases) across both inner and outer membranes to the extracellular medium. This is one of three proteins of the type I secretion apparatus. In the case of the Escherichia coli HlyA T1SS, these three proteins are HlyB (a dimeric ABC transporter), HlyD (MFP, oligomeric membrane fusion protein) and TolC (OMP, a trimeric oligomeric outer membrane protein). Most targeted proteins are not cleaved at the N terminus, but rather carry signals located toward the extreme C terminus to direct type I secretion. However, the 10 kDa Escherichia coli colicin V (CvaB) targets the ABC transporter using a cleaved, N-terminal signal sequence. Almost all transport substrates of the type I system have critical functions in attacking host cells either directly or by being essential for host colonization. The ABC-dependent T1SS transports various molecules, from ions, drugs, to proteins of various sizes up to 900 kDa. The molecules secreted vary in size from the small Escherichia coli peptide colicin V (10 kDa) to the Pseudomonas fluorescens cell adhesion protein LapA of 520 kDa. The best characterized are the RTX toxins such as the adenylate cyclase (CyaA) toxin from Bordetella pertussis, the causative agent of whooping cough, and the lipases such as LipA. Type I secretion is also involved in export of non-protein substrates such as cyclic beta-glucans and polysaccharides.	294
350000	cd18556	ABC_6TM_McjD_like	Six-transmembrane helical domain of the antibacterial peptide ATP-binding cassette transporter McjD and similar proteins. This group represents the 6-TM subunit of the ABC transporter McjD that exports the antibacterial peptide microcin J25, which is an antimicrobial peptide produced by Enterobacteriaceae against other microorganisms for survival under nutrient starvation. Thus, the ABC exporter McjD provides self-immunity of the producing bacteria through export of the toxic peptide out of the cell. Bacterial ABC exporters are typically expressed as half-transporters that contain one transmembrane domain (TMD) fused to a nucleotide-binding domain (NBD), which dimerize to form the full transporter.	298
350001	cd18557	ABC_6TM_TAP_ABCB8_10_like	Six-transmembrane helical domain (6-TMD) of the ABC transporter TAP, ABCB8 and ABCB10. This group includes ABC transporter associated with antigen processing (TAP), which is essential to cellular immunity against viral infection, as well as ABCB8 and ABCB10, which are found in the inner membrane of mitochondria, with the nucleotide-binding domains (NBDs) inside the mitochondrial matrix. TAP is involved in the transport of antigens from the cytoplasm to the endoplasmic reticulum (ER) for association with MHC class I molecules, which play a central role in the adaptive immune response to viruses and cancers by presenting antigenic peptides to CD8+ cytotoxic T lymphocytes (CTLs). Mammalian ABCB10 is essential for erythropoiesis and for protection of mitochondria against oxidative stress, while ABCB8 is essential for normal cardiac function, maintenance of mitochondrial iron homeostasis and maturation of cytosolic Fe/S proteins.	289
350002	cd18558	ABC_6TM_Pgp_ABCB1	Six-transmembrane helical domain of P-glycoprotein 1 (Pgp) and related proteins. P-glycoprotein 1 (permeability glycoprotein, Pgp), also known as multidrug resistance protein 1 (MDR1) or ATP-binding cassette sub-family B member 1 (ABCB1), is a member of the superfamily of ATP-binding cassette (ABC) transporters. Pgp acts as an ATP-dependent efflux pump that binds drugs with diverse chemical structures and pumps them out of drug-resistant cancer cells. It is responsible for decreased drug accumulation in multidrug-resistant cells and mediates the development of resistance to anticancer drugs. Pgp consists of two alpha-helical transmembrane domains (TMDs) and two cytoplasmic nucleotide-binding domains (NBDs). This protein also functions as a transporter in the blood-brain barrier. In addition to Pgp, breast cancer resistance protein (BCRP/MXR/ABC-P/ABCG2) and multidrug resistance-associated proteins (MRP1/ABCC1 and MRP2/ABCC2) function as drug efflux pumps of anticancer drugs, and overexpression of these transporters induces multidrug resistance to a broad spectrum of anticancer drugs including doxorubicin, taxol, and vinca alkaloids by actively pumping the drugs out of cells.	312
350003	cd18559	ABC_6TM_ABCC	Six-transmembrane helical domain of the ABC transporters, subfamily C. This group represents the 6-transmembrane (6TM) domain of the ABC transporters that belong to the ABCC subfamily, such as the sulphonylurea receptors SUR1/2 (ABCC8), the cystic fibrosis transmembrane conductance regulator (CFTR, ABCC7), Multidrug-Resistance associated Proteins (MRP1-9), VMR1 (vacuolar multidrug resistance protein 1), and YOR1 (yeast oligomycin resistance transporter protein). This TM subunit exhibits the type 3 ATP-binding cassette (ABC) exporter fold, which is characterized by 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The type 3 ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds, a various type of lipids and polypeptides.	290
350004	cd18560	ABC_6TM_ATM1_ABCB7_HMT1_ABCB6	Six-transmembrane helical domain (6-TMD) of the Atm1/ABCB7/HMT1/ABCB6 subfamily. This group represents the Atm1/ABCB7/HMT1/ABCB6 subfamily of ATP Binding Cassette (ABC) transporters that are involved in transition metal homeostasis and detoxification processes. It includes yeast ATM1 and human ABCB7 (ABC transporter subfamily B, member 7), which are involved in the assembly of cytosolic iron-sulfur (Fe/S) cluster-containing proteins by mediating export of Fe/S cluster precursors from mitochondria. In eukaryotes, the Atm1/ABCB7 is present in the inner membrane of mitochondria and is required for the formation of cytosolic iron sulfur cluster containing proteins; mutations of ABCB7 gene result in mitochondrial iron accumulation and are responsible for X-linked sideroblastic anemia. ABCB6 was originally identified as a porphyrin transporter present in the outer membrane of mitochondria. It is highly expressed in cells resistant to arsenic and protects against arsenic cytotoxicity. Moreover, Heavy Metal Tolerance Factor-1 (HMT1) proteins are required for cadmium resistance in Caenorhabditis elegans and Drosophila melanogaster.	292
350005	cd18561	ABC_6TM_AarD_CydDC_like	Six-transmembrane helical domain (6-TMD) of the ABC cysteine/GSH transporter CydDC, and similar proteins. The CydD protein, together with the CydC protein, constitutes a bacterial heterodimeric ATP-binding cassette (ABC) transporter complex required for formation of the functional cytochrome bd oxidase in both gram-positive and gram-negative aerobic bacteria. In Escherichia coli, the biogenesis of both cytochrome bd-type quinol oxidases and periplasmic cytochromes requires the ABC-type cysteine/GSH transporter CydDC, which exports cysteine and glutathione from the cytoplasm to the periplasm to maintain redox homeostasis. Mutations in AarD, a homolog from Providencia stuartii, also show phenotypic characteristics consistent with a defect in the cytochrome d oxidase. The CydDC forms a heterodimeric ABC transporter with two transmembrane domains (TMDs), each predicted to comprise six TM alpha-helices and two nucleotide binding domains (NBDs).	289
350006	cd18562	ABC_6TM_NdvA_beta-glucan_exporter_like	Six-transmembrane helical domain of the cyclic beta-glucan ABC transporter NdvA, and similar proteins. This group represents the six-transmembrane domain of NdvA, an ATP-dependent exporter of cyclic beta glucans, and similar proteins. NdvA is required for nodulation of legume roots and is involved in beta-(1,2)-glucan export to the periplasm. NdvA mutants in Brucella abortus and Sinorhizobium meliloti have been shown to exhibit decreased virulence in mice and inhibit intracellular multiplication in HeLa cells. These results suggest that cyclic beta-(1,2)-glucan must be transported into the periplasmic space to function as a virulence factor. Bacterial exporters are typically formed by dimers of TMD-NBD half-transporters. Thus, most bacterial ABC transporters are formed of two identical TMDs and two identical NBDs.	289
350007	cd18563	ABC_6TM_exporter_like	Six-transmembrane helical domain (TMD) of an uncharacterized ABC exporter, and similar proteins. This group includes a subunit of six transmembrane (TM) helices typically found in the ATP-binding cassette (ABC) transporters that function as exporters, which contain 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds and various types of lipids. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied between the different types of transporters, reflecting the chemical diversity of the translocated substrates, while NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane. However, some ABC genes are organized as half-transporters, which must form either homodimers or heterodimers to form a functional transporter. The ABC exporters play a role in multidrug resistance to antibiotics and anticancer agents, and mutations in these proteins have been shown to cause severe human diseases such as cystic fibrosis.	296
350008	cd18564	ABC_6TM_exporter_like	Six-transmembrane helical domain (TMD) of an uncharacterized ABC exporter, and similar proteins. This group includes a subunit of six transmembrane (TM) helices typically found in the ATP-binding cassette (ABC) transporters that function as exporters, which contain 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds and various types of lipids. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied between the different types of transporters, reflecting the chemical diversity of the translocated substrates, while NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane. However, some ABC genes are organized as half-transporters, which must form either homodimers or heterodimers to form a functional transporter. The ABC exporters play a role in multidrug resistance to antibiotics and anticancer agents, and mutations in these proteins have been shown to cause severe human diseases such as cystic fibrosis.	307
350009	cd18565	ABC_6TM_exporter_like	Six-transmembrane helical domain (TMD) of an uncharacterized ABC exporter, and similar proteins. This group includes a subunit of six transmembrane (TM) helices typically found in the ATP-binding cassette (ABC) transporters that function as exporters, which contain 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds and various types of lipids. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied between the different types of transporters, reflecting the chemical diversity of the translocated substrates, while NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane. However, some ABC genes are organized as half-transporters, which must form either homodimers or heterodimers to form a functional transporter. The ABC exporters play a role in multidrug resistance to antibiotics and anticancer agents, and mutations in these proteins have been shown to cause severe human diseases such as cystic fibrosis.	313
350010	cd18566	ABC_6TM_PrtD_LapB_HlyB_like	Six-transmembrane helical domain (6-TMD) of the ABC subunit in the type 1 secretion systems (PrtD, LapB, HlyB), and similar proteins. This group represents the six-transmembrane helical domain (6-TMD) of the ABC subunit in the type 1 secretion systems (T1SS), including PrtD, LapB, and HlyB. T1SS are found in pathogenic Gram-negative bacteria (such as Escherichia coli, Vibrio cholerae or Bordetella pertussis) to export proteins (often proteases) across both inner and outer membranes to the extracellular medium. This is one of three proteins of the type 1 secretion apparatus. In the case of the Escherichia coli HlyA T1SS, these three proteins are HlyB (a dimeric ABC transporter), HlyD (MFP, oligomeric membrane fusion protein) and TolC (OMP, a trimeric oligomeric outer membrane protein). These three components assemble into a complex spanning both membranes and provide a channel for the translocation of unfolded polypeptides. In addition, PrtD is the integral membrane ATP-binding cassette component of the Erwinia chrysanthemi metalloprotease secretion system (PrtDEF). LapB is an inner-membrane transporter component of the LapBCE system that is required for the secretion of the LapA adhesin.	294
350011	cd18567	ABC_6TM_CvaB_RaxB_like	Six-transmembrane helical domain (6-TMD) of the ABC transporter subunit of the type 1 secretion systems, CvaB and RaxB, and similar proteins. This group represents the six-transmembrane helical domain (6-TMD) of the peptidase-containing ABC transporter subunit of T1SS (Type 1 secretion systems), such as Escherichia coli colicin V secretion/processing ATP-binding protein CvaB and putative ABC transporter RaxB. These ABC-transporter proteins carry a proteolytic peptidase domain in their N-termini, termed as C39, which cleaves a double glycine (GG) motif-containing signal peptide from substrates before secretion. RaxB is part of the T1SS RaxABC, which is responsible for the type 1-dependent secretion of the bacterial quorum-sensing molecule AvrXa21. Both CvaB and RaxB belong to a subgroup of T1SS ABC transporters that contain a C39 peptidase domain. T1SS are found in pathogenic Gram-negative bacteria to export proteins (often proteases) across both inner and outer membranes to the extracellular medium.	294
350012	cd18568	ABC_6TM_HetC_like	Six-transmembrane helical domain (6-TMD) of the ABC subunit of T1SS-like HetC and similar proteins. This group represents the six-transmembrane helical domain (6-TMD) of the ABC subunit of T1SS (type 1 secretion systems), such as heterocyst differentiation protein HetC. HetC is similar to ABC protein exporters of T1SS (type 1 secretion systems) and is involved in early regulation of heterocyst differentiation in the filamentous cyanobacterium Anabaena sp. T1SS are found in pathogenic Gram-negative bacteria (such as Escherichia coli, Vibrio cholerae or Bordetella pertussis) to export proteins (often proteases) across both inner and outer membranes to the extracellular medium. ABC-transporter proteins in this group carry a proteolytic peptidase domain in their N-termini, termed C39, which cleaves a double glycine (GG) motif-containing signal peptide from substrates before secretion.	294
350013	cd18569	ABC_6TM_NHLM_bacteriocin	Six-transmembrane helical domain (6-TMD) of NHLP family bacteriocin export ABC transporters. This group includes the six-transmembrane helical domain (6-TMD) of the ABC subunit of NHLM (Nitrile Hydratase Leader Microcin) bacteriocin system, which contains ABC transporter (permease/ATP-binding fused protein) with a peptidase domain. ABC-transporter proteins in this group are predicted to be a subunit of a bacteriocin processing and export system, and they carry a proteolytic peptidase domain in their N-termini, termed as C39, which cleaves a double glycine (GG) motif-containing signal peptide from substrates before secretion.	294
350014	cd18570	ABC_6TM_PCAT1_LagD_like	Six-transmembrane helical domain (6-TMD) of the peptidase-containing ATP-binding cassette transporters. This group includes the 6-TMD of the peptidase-containing ATP-binding cassette transporters (PCATs) such as Clostridium thermocellum PCAT1, a polypeptide processing and secretion transporter, and LagD, a bacteriocin ABC transporter from Lactococcus lactis. Bacterial exporters are typically formed by dimers of TMD-NBD half-transporters. Thus, most bacterial ABC transporters are formed of two identical TMDs and two identical NBDs. The transporters involved in protein secretion often contain additional peptidase domains essential for substrate processing. These peptidase domains belong to the cysteine protease superfamily, classified as family C39, bacteriocin-processing peptidase. LagD is highly similar to the peptidase-containing ATP-binding cassette transporters (PCATs). In Gram-positive bacteria, the PCATs are responsible for exporting quorum-sensing or antimicrobial peptides called bacteriocins.	294
350015	cd18571	ABC_6TM_peptidase_like	Six-transmembrane helical domain (6-TMD) of an uncharacterized peptidase ABC transporter and similar proteins. This group includes the 6-TMD of an uncharacterized peptidase-containing ABC transporter of T1SS (type 1 secretion systems), similar to heterocyst differentiation protein HetC. HetC is involved in early regulation of heterocyst differentiation in the filamentous cyanobacterium Anabaena sp. T1SS are found in pathogenic Gram-negative bacteria (such as Escherichia coli, Vibrio cholerae or Bordetella pertussis) to export proteins (often proteases) across both inner and outer membranes to the extracellular medium. ABC-transporter proteins in this group carry a proteolytic peptidase domain in their N-termini, termed C39, which cleaves a double glycine (GG) motif-containing signal peptide from substrates before secretion.	294
350016	cd18572	ABC_6TM_TAP	Six-transmembrane helical domain (6-TMD) of the ABC transporter associated with antigen processing. This group represents the 6-TM subunit of the ABC transporter associated with antigen processing (TAP), which is essential to cellular immunity against viral infection. TAP is involved in the transport of antigens from the cytoplasm to the endoplasmic reticulum (ER) for association with MHC class I molecules, which play a central role in the adaptive immune response to viruses and cancers by presenting antigenic peptides to CD8+ cytotoxic T lymphocytes (CTLs). It also acts as a molecular scaffold for the assembly of the MHC I peptide-loading complex in the ER membrane. Newly synthesized MHC class I molecules associate with TAP via tapasin, which is one component of the peptide-loading complex. TAP is a heterodimer formed by two distinct subunits, TAP1 (ABCB2) and TAP2 (ABCB3); each half-transporter comprises one transmembrane domain (TMD) and one nucleotide-binding domain (NBD). Two 6-helical core TMDs contain the peptide-binding pocket and translocation channel, while the NBDs bind and hydrolyze ATP to power peptide translocation.	289
350017	cd18573	ABC_6TM_ABCB10_like	Six-transmembrane helical domain (6-TMD) of the mitochondrial transporter ABCB10 (subfamily B, member 10) and similar proteins. This group includes the 6-TM subunit of ABCB10 (also known as ABC mitochondrial erythroid, ABC-me, mABC2, or ABCBA), which is one of the three ATP-binding cassette (ABC) transporters found in the inner membrane of mitochondria, with the nucleotide-binding domains (NBDs) inside the mitochondrial matrix. In mammals, ABCB10 is essential for erythropoiesis and for protection of mitochondria against oxidative stress. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied between the different types of transporters, suggesting significant structural diversity of the translocated substrates, while NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane.	294
350018	cd18574	ABC_6TM_ABCB8_like	Six-transmembrane helical domain (6-TMD) of ATP-binding cassette transporter subfamily B member 8, mitochondrial, and similar proteins. This group includes ABCB8, which is one of the three ATP-binding cassette (ABC) transporters found in the inner membrane of mitochondria, with the nucleotide-binding domains (NBDs) inside the mitochondrial matrix. ABCB8 is essential for maintenance of normal cardiac function, is involved in mitochondrial iron export, and plays a role in the maturation of cytosolic Fe/S cluster-containing enzymes. ABCB8 is a half-molecule ABC protein that contains one TMD fused to an NBD and dimerizes to form a functional transporter.	295
350019	cd18575	ABC_6TM_bac_exporter_ABCB8_10_like	Six-transmembrane helical domain of putative bacterial ABC exporters, similar to ABCB8 and ABCB10. This group includes putative bacterial ABC transporters similar to ABCB8 and ABCB10, which are found in the inner membrane of mitochondria, with the nucleotide-binding domains (NBDs) inside the mitochondrial matrix. Mammalian ABCB10 is essential for erythropoiesis and for protection of mitochondria against oxidative stress, while ABCB8 is essential for normal cardiac function, maintenance of mitochondrial iron homeostasis and maturation of cytosolic Fe/S proteins. Bacterial exporters are typically formed by dimers of TMD-NBD half-transporters. Thus, most bacterial ABC transporters are formed of two identical TMDs and two identical NBDs.	289
350020	cd18576	ABC_6TM_bac_exporter_ABCB8_10_like	Six-transmembrane helical domain of putative bacterial ABC exporters, similar to ABCB8 and ABCB10. This group includes putative bacterial ABC transporters similar to ABCB8 and ABCB10, which are found in the inner membrane of mitochondria, with the nucleotide-binding domains (NBDs) inside the mitochondrial matrix. Mammalian ABCB10 is essential for erythropoiesis and for protection of mitochondria against oxidative stress, while ABCB8 is essential for normal cardiac function, maintenance of mitochondrial iron homeostasis and maturation of cytosolic Fe/S proteins. Bacterial exporters are typically formed by dimers of TMD-NBD half-transporters. Thus, most bacterial ABC transporters are formed of two identical TMDs and two identical NBDs.	289
350021	cd18577	ABC_6TM_Pgp_ABCB1_D1_like	Six-transmembrane helical domain 1 (TMD1) of P-glycoprotein 1 (Pgp) and related proteins. P-glycoprotein 1 (permeability glycoprotein, Pgp), also known as multidrug resistance protein 1 (MDR1) or ATP-binding cassette sub-family B member 1 (ABCB1), is a member of the superfamily of ATP-binding cassette (ABC) transporters. Pgp acts as an ATP-dependent efflux pump, binding drugs with diverse chemical structures and pumping them out of drug-resistant cancer cells. It is responsible for decreased drug accumulation in multidrug-resistant cells and mediates the development of resistance to anticancer drugs. Pgp consists of two alpha-helical transmembrane domains (TMDs) and two cytoplasmic nucleotide-binding domains (NBDs). This protein also functions as a transporter in the blood-brain barrier. In addition to Pgp, breast cancer resistance protein (BCRP/MXR/ABC-P/ABCG2) and multidrug resistance-associated proteins (MRP1/ABCC1 and MRP2/ABCC2) function as efflux pumps for anticancer drugs, and overexpression of these transporters induces multidrug resistance to a broad spectrum of anticancer drugs including doxorubicin, taxol, and vinca alkaloids by actively pumping the drugs out of cells.	300
350022	cd18578	ABC_6TM_Pgp_ABCB1_D2_like	Six-transmembrane helical domain 2 (TMD2) of P-glycoprotein 1 (Pgp) and related proteins. P-glycoprotein 1 (permeability glycoprotein, Pgp), also known as multidrug resistance protein 1 (MDR1) or ATP-binding cassette sub-family B member 1 (ABCB1), is a member of the superfamily of ATP-binding cassette (ABC) transporters. Pgp acts as an ATP-dependent efflux pump, binding drugs with diverse chemical structures and pumping them out of drug-resistant cancer cells. It is responsible for decreased drug accumulation in multidrug-resistant cells and mediates the development of resistance to anticancer drugs. Pgp consists of two alpha-helical transmembrane domains (TMDs) and two cytoplasmic nucleotide-binding domains (NBDs). This protein also functions as a transporter in the blood-brain barrier. In addition to Pgp, breast cancer resistance protein (BCRP/MXR/ABC-P/ABCG2) and multidrug resistance-associated proteins (MRP1/ABCC1 and MRP2/ABCC2) function as efflux pumps for anticancer drugs, and overexpression of these transporters induces multidrug resistance to a broad spectrum of anticancer drugs including doxorubicin, taxol, and vinca alkaloids by actively pumping the drugs out of cells.	317
350023	cd18579	ABC_6TM_ABCC_D1	Six-transmembrane helical domain 1 (TMD1) of the ABC transporters, subfamily C. This group represents the six-transmembrane domain 1 (TMD1) of the ABC transporters that belong to the ABCC subfamily, such as the sulphonylurea receptors SUR1/2 (ABCC8), the cystic fibrosis transmembrane conductance regulator (CFTR, ABCC7), Multidrug-Resistance associated Proteins (MRP1-9), VMR1 (vacuolar multidrug resistance protein 1), and YOR1 (yeast oligomycin resistance transporter protein). This TM subunit exhibits the type 3 ATP-binding cassette (ABC) exporter fold, which is characterized by 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The type 3 ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds, various types of lipids, and polypeptides. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane by alternating between inward- and outward-facing conformations. By contrast, bacterial ABC exporters are typically assembled from dimers of TMD-NBD half-transporters. Thus, most bacterial ABC transporters are composed of two identical TMDs and two identical NBDs.	289
350024	cd18580	ABC_6TM_ABCC_D2	Six-transmembrane helical domain 2 (TMD2) of the ABC transporters, subfamily C. This group represents the six-transmembrane domain 2 (TMD2) of the ABC transporters that belong to the ABCC subfamily, such as the sulphonylurea receptors SUR1/2 (ABCC8), the cystic fibrosis transmembrane conductance regulator (CFTR, ABCC7), Multidrug-Resistance associated Proteins (MRP1-9), VMR1 (vacuolar multidrug resistance protein 1), and YOR1 (yeast oligomycin resistance transporter protein). This TM subunit exhibits the type 3 ATP-binding cassette (ABC) exporter fold, which is characterized by 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The type 3 ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds, various types of lipids, and polypeptides. All ABC transporters share a common architecture of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane by alternating between inward- and outward-facing conformations. By contrast, bacterial ABC exporters are typically assembled from dimers of TMD-NBD half-transporters. Thus, most bacterial ABC transporters are composed of two identical TMDs and two identical NBDs.	294
350025	cd18581	ABC_6TM_ABCB6	Six-transmembrane helical domain of the ATP-binding cassette subfamily B member 6, mitochondrial. This group represents the ABCB6 subfamily of ATP Binding Cassette (ABC) transporters that are involved in transition metal homeostasis and detoxification processes. ABCB6 was originally identified as a porphyrin transporter present in the outer membrane of mitochondria. It is highly expressed in cells resistant to arsenic and protects against arsenic cytotoxicity. Moreover, ABCB6 (ABC transporter subfamily B, member 6) is closely related to yeast ATM1 and human ABCB7, which are involved in the assembly of cytosolic iron-sulfur (Fe/S) cluster-containing proteins by mediating export of Fe/S cluster precursors from mitochondria. In eukaryotes, Atm1/ABCB7 is present in the inner membrane of mitochondria and is required for the formation of cytosolic iron-sulfur cluster-containing proteins; mutations of the ABCB7 gene result in mitochondrial iron accumulation and are responsible for X-linked sideroblastic anemia.	300
350026	cd18582	ABC_6TM_ATM1_ABCB7	Six-transmembrane helical domain of the Atm1/ABCB7 transporters. This group represents the Atm1/ABCB7 subfamily of ATP Binding Cassette (ABC) transporters that are involved in transition metal homeostasis and detoxification processes. Yeast ATM1 and human ABCB7 (ABC transporter subfamily B, member 7) are involved in the assembly of cytosolic iron-sulfur (Fe/S) cluster-containing proteins by mediating export of Fe/S cluster precursors from mitochondria. In eukaryotes, Atm1/ABCB7 is present in the inner membrane of mitochondria and is required for the formation of cytosolic iron-sulfur cluster-containing proteins; mutations of the ABCB7 gene result in mitochondrial iron accumulation and are responsible for X-linked sideroblastic anemia.	292
350027	cd18583	ABC_6TM_HMT1	Six-transmembrane helical domain of the heavy metal tolerance protein. This group represents the HMT1 subfamily of ATP Binding Cassette (ABC) transporters that are involved in transition metal homeostasis and detoxification processes. Heavy Metal Tolerance Factor-1 (HMT1) proteins are required for cadmium resistance in Caenorhabditis elegans and Drosophila melanogaster. HMT1 is closely related to yeast ATM1 and human ABCB7 (ABC transporter subfamily B, member 7), which are involved in the assembly of cytosolic iron-sulfur (Fe/S) cluster-containing proteins by mediating export of Fe/S cluster precursors from mitochondria.	290
350028	cd18584	ABC_6TM_AarD_CydD	Six-transmembrane helical domain (6TM) of CydD, a component of the ABC cysteine/GSH transporter, and its homolog AarD. The CydD protein, together with the CydC protein, constitutes a bacterial heterodimeric ATP-binding cassette (ABC) transporter complex required for formation of the functional cytochrome bd oxidase in both gram-positive and gram-negative aerobic bacteria. In Escherichia coli, the biogenesis of both cytochrome bd-type quinol oxidases and periplasmic cytochromes requires the ABC-type cysteine/GSH transporter CydDC, which exports cysteine and glutathione from the cytoplasm to the periplasm to maintain redox homeostasis. Mutations in AarD, a homolog from Providencia stuartii, also show phenotypic characteristics consistent with a defect in the cytochrome d oxidase. CydDC forms a heterodimeric ABC transporter with two transmembrane domains (TMDs), each predicted to comprise six TM alpha-helices, and two nucleotide-binding domains (NBDs).	290
350029	cd18585	ABC_6TM_CydC	Six-transmembrane helical domain (6-TMD) of the CydC, a component of the ABC cysteine/GSH transporter. The CydC protein, together with the CydD protein, constitutes a bacterial heterodimeric ATP-binding cassette (ABC) transporter complex required for formation of the functional cytochrome bd oxidase in both gram-positive and gram-negative aerobic bacteria. In Escherichia coli, the biogenesis of both cytochrome bd-type quinol oxidases and periplasmic cytochromes requires the ABC-type cysteine/GSH transporter CydDC, which exports cysteine and glutathione from the cytoplasm to the periplasm to maintain redox homeostasis. The CydDC forms a heterodimeric ABC transporter with two transmembrane domains (TMDs), each predicted to comprise six TM alpha-helices and two nucleotide binding domains (NBDs).	290
350030	cd18586	ABC_6TM_PrtD_like	Six-transmembrane helical domain (6TM) of the ABC subunit (PrtD) in the T1SS metalloprotease secretion system, and similar proteins. This group represents the six-transmembrane helical domain (6-TMD) of the ABC subunit in the type 1 secretion systems (T1SS) such as PrtD, which is the integral membrane ATP-binding cassette component of the Erwinia chrysanthemi metalloprotease secretion system (PrtDEF). T1SS are found in pathogenic Gram-negative bacteria (such as Escherichia coli, Vibrio cholerae or Bordetella pertussis) to export proteins (often proteases) across both inner and outer membranes to the extracellular medium. This is one of three proteins of the type I secretion apparatus. The Aquifex aeolicus PrtDEF of T1SS is composed of an inner-membrane ABC transporter (PrtD), a periplasmic membrane-fusion protein (PrtE), and an outer-membrane porin (PrtF). These three components assemble into a complex spanning both membranes and provide a channel for the translocation of unfolded polypeptides.	291
350031	cd18587	ABC_6TM_LapB_like	Six-transmembrane helical domain of the ABC transporter subunit LapB and similar proteins. This group represents the six-transmembrane helical domain (6-TMD) of the ABC subunit in the type 1 secretion systems (T1SS), such as LapB. LapB is an inner-membrane transporter component of the LapBCE system that is required for the secretion of the LapA adhesin; LapA is an RTX (repeats in toxin) protein found in Pseudomonas fluorescens and is required for biofilm formation in this organism. T1SS are found in pathogenic Gram-negative bacteria to export proteins (often proteases) across both inner and outer membranes to the extracellular medium. In this T1SS system, LapB is a cytoplasmic membrane-localized ATPase, LapC is a membrane fusion protein, and LapE is an outer membrane protein.	293
350032	cd18588	ABC_6TM_CyaB_HlyB_like	Six-transmembrane helical domain of the ABC subunits of T1SS, CyaB/HlyB, and similar proteins. This group represents the six-transmembrane helical domain (6-TMD) of the ABC subunits of T1SS, such as CyaB and HlyB. T1SS are found in pathogenic Gram-negative bacteria (such as Escherichia coli, Vibrio cholerae or Bordetella pertussis) to export proteins (often proteases) across both inner and outer membranes to the extracellular medium. This is one of three proteins of the type I secretion apparatus. In the case of the Escherichia coli HlyA T1SS, these three proteins are HlyB (a dimeric ABC transporter), HlyD (MFP, oligomeric membrane fusion protein) and TolC (OMP, a trimeric oligomeric outer membrane protein). These three components assemble into a complex spanning both membranes and provide a channel for the translocation of unfolded polypeptides. Additionally, CyaB is one of the three T1SS complex proteins for adenylate cyclase toxin CyaA, which is a primary virulence factor in Bordetella pertussis: CyaB (an ABC transporter), CyaD (a membrane fusion protein), and CyaE (an outer membrane protein).	294
350033	cd18589	ABC_6TM_TAP1	Six-transmembrane helical domain 1 (6-TMD1) of the ABC transporter associated with antigen processing 1 (TAP1). This group represents the 6-TM subunit of the ABC transporter associated with antigen processing (TAP), which is essential to cellular immunity against viral infection. TAP is involved in the transport of antigens from the cytoplasm to the endoplasmic reticulum (ER) for association with MHC class I molecules, which play a central role in the adaptive immune response to viruses and cancers by presenting antigenic peptides to CD8+ cytotoxic T lymphocytes (CTLs). It also acts as a molecular scaffold for the assembly of the MHC I peptide-loading complex in the ER membrane. Newly synthesized MHC class I molecules associate with TAP via tapasin, which is one component of the peptide-loading complex. TAP is a heterodimer formed by two distinct subunits, TAP1 (ABCB2) and TAP2 (ABCB3); each half-transporter comprises one transmembrane domain (TMD) and one nucleotide-binding domain (NBD). Two 6-helical core TMDs contain the peptide-binding pocket and translocation channel, while the NBDs bind and hydrolyze ATP to power peptide translocation.	289
350034	cd18590	ABC_6TM_TAP2	Six-transmembrane helical domain 2 (6-TMD2) of the ABC transporter associated with antigen processing 2 (TAP2). This group represents the 6-TM subunit of the ABC transporter associated with antigen processing (TAP), which is essential to cellular immunity against viral infection. TAP is involved in the transport of antigens from the cytoplasm to the endoplasmic reticulum (ER) for association with MHC class I molecules, which play a central role in the adaptive immune response to viruses and cancers by presenting antigenic peptides to CD8+ cytotoxic T lymphocytes (CTLs). It also acts as a molecular scaffold for the assembly of the MHC I peptide-loading complex in the ER membrane. Newly synthesized MHC class I molecules associate with TAP via tapasin, which is one component of the peptide-loading complex. TAP is a heterodimer formed by two distinct subunits, TAP1 (ABCB2) and TAP2 (ABCB3); each half-transporter comprises one transmembrane domain (TMD) and one nucleotide-binding domain (NBD). Two 6-helical core TMDs contain the peptide-binding pocket and translocation channel, while the NBDs bind and hydrolyze ATP to power peptide translocation.	289
350035	cd18591	ABC_6TM_SUR1_D1_like	Six-transmembrane helical domain 1 (TMD1) of the sulphonylurea receptors SUR1/2. This group represents the six-transmembrane domain 1 (TMD1) of the sulphonylurea receptors SUR1/2 (ABCC8), which function as a modulator of ATP-sensitive potassium channels and insulin release, and they belong to the ABCC subfamily. The ATP-sensitive (K-ATP) channel is an octameric complex of four pore-forming Kir6.2 subunits and four regulatory SUR subunits. Thus, in contrast to other ABC transporters, the SUR serves as the regulatory subunit of an ion channel. Mutations and deficiencies in the SUR proteins have been observed in patients with hyperinsulinemic hypoglycemia of infancy, an autosomal recessive disorder of unregulated and high insulin secretion. Mutations have also been associated with non-insulin-dependent diabetes mellitus type 2, an autosomal dominant disease of defective insulin secretion.	309
350036	cd18592	ABC_6TM_MRP5_8_9_D1	Six-transmembrane helical domain 1 (TMD1) of multidrug resistance-associated proteins (MRPs) 5, 8, and 9. This group represents the six-transmembrane domain 1 (TMD1) of multidrug resistance-associated proteins (MRPs) 5, 8, and 9, all of which belong to the subfamily C of the ATP-binding cassette (ABC) transporter superfamily. The MRP subfamily (ABCC subfamily) is composed of 13 members, of which MRP1 to MRP9 are the major transporters that cause multidrug resistance in tumor cells by pumping anticancer drugs out of the cell. These nine MRP members function as ATP-dependent exporters for endogenous substances and xenobiotics. The MRP family can be divided into two groups, depending on their structural architecture. MRP4, MRP5, MRP8, and MRP9 (ABCC4, 5, 11 and 12, respectively) have a typical ABC transporter structure and are each composed of two transmembrane domains (TMD1 and TMD2) and two nucleotide-binding domains (NBD1 and NBD2). On the other hand, MRP1, 2, 3, 6 and 7 (ABCC1, 2, 3, 6 and 7, respectively) have five additional N-terminal transmembrane segments in a single domain (TMD0) connected to the core (TMD-NBD) by a cytoplasmic linker (L0).	287
350037	cd18593	ABC_6TM_MRP4_D1_like	Six-transmembrane helical domain 1 (TMD1) of multidrug resistance-associated protein 4 (MRP4) and similar proteins. This group represents the six-transmembrane domain 1 (TMD1) of multidrug resistance-associated protein 4 (MRP4), which belongs to the subfamily C of the ATP-binding cassette (ABC) transporter superfamily. The MRP subfamily (ABCC subfamily) is composed of 13 members, of which MRP1 to MRP9 are the major transporters that cause multidrug resistance in tumor cells by pumping anticancer drugs out of the cell. These nine MRP members function as ATP-dependent exporters for endogenous substances and xenobiotics. The MRP family can be divided into two groups, depending on their structural architecture. MRP4, MRP5, MRP8, and MRP9 (ABCC4, 5, 11 and 12, respectively) have a typical ABC transporter structure and are each composed of two transmembrane domains (TMD1 and TMD2) and two nucleotide-binding domains (NBD1 and NBD2). On the other hand, MRP1, 2, 3, 6 and 7 (ABCC1, 2, 3, 6 and 7, respectively) have five additional N-terminal transmembrane segments in a single domain (TMD0) connected to the core (TMD-NBD) by a cytoplasmic linker (L0).	291
350038	cd18594	ABC_6TM_CFTR_D1	Six-transmembrane helical domain 1 of Cystic Fibrosis Transmembrane Conductance Regulator. This group represents the six-transmembrane domain 1 (TMD1) of the cystic fibrosis transmembrane conductance regulator (CFTR, ABCC7), which belongs to the ABCC subfamily. CFTR functions as a chloride channel, in contrast to other ABC transporters, and controls ion and water secretion and absorption in epithelial tissues. ABC proteins are formed from two homologous halves each containing a transmembrane domain (TMD) and a cytosolic nucleotide binding domain (NBD). In CFTR, these two TMD-NBD halves are linked by the unique regulatory (R) domain, which is not present in other ABC transporters. The ion channel only opens when its R-domain is phosphorylated by cyclic AMP-dependent protein kinase (PKA) and ATP is bound at the NBDs. Mutations in CFTR cause cystic fibrosis, the most common lethal genetic disorder in populations of Northern European descent.	291
350039	cd18595	ABC_6TM_MRP1_2_3_6_D1_like	Six-transmembrane helical domain 1 (TMD1) of multidrug resistance-associated proteins (MRPs) 1, 2, 3 and 6. This group represents the six-transmembrane domain 1 (TMD1) of multidrug resistance-associated proteins (MRPs) 1, 2, 3 and 6, all of which belong to the subfamily C of the ATP-binding cassette (ABC) transporter superfamily. The MRP subfamily (ABCC subfamily) is composed of 13 members, of which MRP1 to MRP9 are the major transporters that cause multidrug resistance in tumor cells by pumping anticancer drugs out of the cell. These nine MRP members function as ATP-dependent exporters for endogenous substances and xenobiotics. The MRP family can be divided into two groups, depending on their structural architecture. MRP4, MRP5, MRP8, and MRP9 (ABCC4, 5, 11 and 12, respectively) have a typical ABC transporter structure and are each composed of two transmembrane domains (TMD1 and TMD2) and two nucleotide-binding domains (NBD1 and NBD2). On the other hand, MRP1, 2, 3, 6 and 7 (ABCC1, 2, 3, 6 and 7, respectively) have five additional N-terminal transmembrane segments in a single domain (TMD0) connected to the core (TMD-NBD) by a cytoplasmic linker (L0).	290
350040	cd18596	ABC_6TM_VMR1_D1_like	Six-transmembrane helical domain 1 (TMD1) of the yeast Vmr1p, Ybt1p and Nft1; ABCC subfamily. This group includes the six-transmembrane domain 1 (TMD1) of the yeast Vmr1p, Ybt1p and Nft1, all of which are ABC transporters of the MRP (multidrug resistance-associated protein) subfamily (ABCC). Yeast ABCC (also termed MRP/CFTR) subfamily includes six members (Ycf1p, Bpt1p, Ybt1p/Bat1p, Nft1p, Vmr1p, and Yor1p), of which three members (Ycf1p, Bpt1p and Yor1p) are not included here. While Yor1p, an oligomycin resistance ABC transporter, has been shown to localize to the plasma membrane, the other five members (Ycf1p, Bpt1p, Ybt1p/Bat1p, Nft1p and Vmr1p) have been shown to localize to the vacuolar membrane. Ybt1p was originally identified as a bile acid transporter and regulates membrane fusion through Ca2+ transport modulation. Ybt1p also plays a part in ade2 pigment transport. Moreover, Ybt1p has been recently shown to translocate phosphatidylcholine from the outer leaflet of the vacuole to the inner leaflet for degradation and choline recycling. Vmr1p, a vacuolar membrane protein, participates in the export of numerous growth inhibitors from the cell, such as cycloheximide, 2,4-dinitrophenol, cadmium and other toxic metals. Nft1p is not well-characterized, but it is proposed to regulate Ycf1p, which is involved in heavy metal detoxification.	309
350041	cd18597	ABC_6TM_YOR1_D1_like	Six-transmembrane helical domain 1 (TMD1) of the yeast Yor1p and similar proteins; ABCC subfamily. This group includes the six-transmembrane domain 1 (TMD1) of the yeast Yor1p, an oligomycin resistance ABC transporter, and similar proteins. Members of this group belong to the MRP (multidrug resistance-associated protein) subfamily (ABCC). In addition to Yor1p, the yeast ABCC (also termed MRP/CFTR) subfamily comprises five other members (Ycf1p, Bpt1p, Ybt1p/Bat1p, Nft1p, and Vmr1p), which are not included in this group. Yor1p is a plasma membrane ATP-binding transporter that mediates export of many different organic anions including oligomycin. While Yor1p has been shown to localize to the plasma membrane, the other five members (Ycf1p, Bpt1p, Ybt1p/Bat1p, Nft1p and Vmr1p) have been shown to localize to the vacuolar membrane.	293
350042	cd18598	ABC_6TM_MRP7_D1_like	Six-transmembrane helical domain 1 (TMD1) of multidrug resistance-associated protein 7, and similar proteins. This group represents the six-transmembrane domain 1 (TMD1) of multidrug resistance-associated protein 7 (MRP7), which belongs to the subfamily C of the ATP-binding cassette (ABC) transporter superfamily. The MRP subfamily (ABCC subfamily) is composed of 13 members, of which MRP1 to MRP9 are the major transporters that cause multidrug resistance in tumor cells by pumping anticancer drugs out of the cell. These nine MRP members function as ATP-dependent exporters for endogenous substances and xenobiotics. The MRP family can be divided into two groups, depending on structural architecture. MRP4, MRP5, MRP8, and MRP9 (ABCC4, 5, 11 and 12, respectively) have a typical ABC transporter structure and are each composed of two transmembrane domains (TMD1 and TMD2) and two nucleotide-binding domains (NBD1 and NBD2). On the other hand, MRP1, 2, 3, 6 and 7 (ABCC1, 2, 3, 6 and 10, respectively) have five additional N-terminal transmembrane segments in a single domain (TMD0) connected to the core (TMD-NBD) by a cytoplasmic linker (L0).	288
350043	cd18599	ABC_6TM_MRP5_8_9_D2	Six-transmembrane helical domain 2 (TMD2) of multidrug resistance-associated proteins (MRPs) 5, 8, and 9. This group represents the six-transmembrane domain 2 (TMD2) of multidrug resistance-associated proteins (MRPs) 5, 8, and 9, all of which belong to the subfamily C of the ATP-binding cassette (ABC) transporter superfamily. The MRP subfamily (ABCC subfamily) is composed of 13 members, of which MRP1 to MRP9 are the major transporters that cause multidrug resistance in tumor cells by pumping anticancer drugs out of the cell. These nine MRP members function as ATP-dependent exporters for endogenous substances and xenobiotics. The MRP family can be divided into two groups, depending on structural architecture. MRP4, MRP5, MRP8, and MRP9 (ABCC4, 5, 11 and 12, respectively) have a typical ABC transporter structure and are each composed of two transmembrane domains (TMD1 and TMD2) and two nucleotide-binding domains (NBD1 and NBD2). On the other hand, MRP1, 2, 3, 6 and 7 (ABCC1, 2, 3, 6 and 10, respectively) have five additional N-terminal transmembrane segments in a single domain (TMD0) connected to the core (TMD-NBD) by a cytoplasmic linker (L0).	313
350044	cd18600	ABC_6TM_CFTR_D2	Six-transmembrane helical domain 2 of Cystic Fibrosis Transmembrane Conductance Regulator. This group represents the six-transmembrane domain 2 (TMD2) of the ABC transporters that belong to the ABCC subfamily, such as the sulphonylurea receptors SUR1/2 (ABCC8/9), the cystic fibrosis transmembrane conductance regulator (CFTR, ABCC7), multidrug resistance-associated proteins (MRP1-9), VMR1 (vacuolar multidrug resistance protein 1), and YOR1 (yeast oligomycin resistance transporter protein). This TM subunit exhibits the type 3 ATP-binding cassette (ABC) exporter fold, which is characterized by 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The type 3 ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds, various types of lipids, and polypeptides. All ABC transporters share a common architecture of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane by alternating between inward- and outward-facing conformations. By contrast, bacterial ABC exporters are typically assembled from dimers of TMD-NBD half-transporters. Thus, most bacterial ABC transporters are comprised of two identical TMDs and two identical NBDs.	324
350045	cd18601	ABC_6TM_MRP4_D2_like	Six-transmembrane helical domain 2 (TMD2) of multidrug resistance-associated protein 4 (MRP4) and similar proteins. This group represents the six-transmembrane domain 2 (TMD2) of multidrug resistance-associated protein 4 (MRP4), which belongs to the subfamily C of the ATP-binding cassette (ABC) transporter superfamily. The MRP subfamily (ABCC subfamily) is composed of 13 members, of which MRP1 to MRP9 are the major transporters that cause multidrug resistance in tumor cells by pumping anticancer drugs out of the cell. These nine MRP members function as ATP-dependent exporters for endogenous substances and xenobiotics. The MRP family can be divided into two groups, depending on structural architecture. MRP4, MRP5, MRP8, and MRP9 (ABCC4, 5, 11 and 12, respectively) have a typical ABC transporter structure and are each composed of two transmembrane domains (TMD1 and TMD2) and two nucleotide-binding domains (NBD1 and NBD2). On the other hand, MRP1, 2, 3, 6 and 7 (ABCC1, 2, 3, 6 and 10, respectively) have five additional N-terminal transmembrane segments in a single domain (TMD0) connected to the core (TMD-NBD) by a cytoplasmic linker (L0).	314
350046	cd18602	ABC_6TM_SUR1_D2_like	Six-transmembrane helical domain 2 (TMD2) of the sulphonylurea receptors SUR1/2. This group represents the six-transmembrane domain 2 (TMD2) of the sulphonylurea receptors SUR1/2 (ABCC8/9), which function as modulators of ATP-sensitive potassium channels and insulin release, and belong to the ABCC subfamily. The ATP-sensitive potassium (K-ATP) channel is an octameric complex of four pore-forming Kir6.2 subunits and four regulatory SUR subunits. Thus, in contrast to other ABC transporters, the SUR serves as the regulatory subunit of an ion channel. Mutations and deficiencies in the SUR proteins have been observed in patients with hyperinsulinemic hypoglycemia of infancy, an autosomal recessive disorder of unregulated and high insulin secretion. Mutations have also been associated with non-insulin-dependent diabetes mellitus type 2, an autosomal dominant disease of defective insulin secretion.	307
350047	cd18603	ABC_6TM_MRP1_2_3_6_D2_like	Six-transmembrane helical domain 2 (TMD2) of multidrug resistance-associated proteins (MRPs) 1, 2, 3 and 6. This group represents the six-transmembrane domain 2 (TMD2) of multidrug resistance-associated proteins (MRPs) 1, 2, 3 and 6, all of which belong to the subfamily C of the ATP-binding cassette (ABC) transporter superfamily. The MRP subfamily (ABCC subfamily) is composed of 13 members, of which MRP1 to MRP9 are the major transporters that cause multidrug resistance in tumor cells by pumping anticancer drugs out of the cell. These nine MRP members function as ATP-dependent exporters for endogenous substances and xenobiotics. The MRP family can be divided into two groups, depending on structural architecture. MRP4, MRP5, MRP8, and MRP9 (ABCC4, 5, 11 and 12, respectively) have a typical ABC transporter structure and are each composed of two transmembrane domains (TMD1 and TMD2) and two nucleotide-binding domains (NBD1 and NBD2). On the other hand, MRP1, 2, 3, 6 and 7 (ABCC1, 2, 3, 6 and 10, respectively) have five additional N-terminal transmembrane segments in a single domain (TMD0) connected to the core (TMD-NBD) by a cytoplasmic linker (L0).	296
350048	cd18604	ABC_6TM_VMR1_D2_like	Six-transmembrane helical domain 2 (TMD2) of the yeast Vmr1p, Ybt1p and Nft1; ABCC subfamily. This group includes the six-transmembrane domain 2 (TMD2) of the yeast Vmr1p, Ybt1p and Nft1, all of which are ABC transporters of the MRP (multidrug resistance-associated protein) subfamily (ABCC). Yeast ABCC (also termed MRP/CFTR) subfamily includes six members (Ycf1p, Bpt1p, Ybt1p/Bat1p, Nft1p, Vmr1p, and Yor1p), of which three members (Ycf1p, Bpt1p and Yor1p) are not included here. While Yor1p, an oligomycin resistance ABC transporter, has been shown to localize to the plasma membrane, the other five members (Ycf1p, Bpt1p, Ybt1p/Bat1p, Nft1p and Vmr1p) have been shown to localize to the vacuolar membrane. Ybt1p was originally identified as a bile acid transporter and regulates membrane fusion through Ca2+ transport modulation. Ybt1p also plays a part in ade2 pigment transport. Moreover, Ybt1p has been recently shown to translocate phosphatidylcholine from the outer leaflet of the vacuole to the inner leaflet for degradation and choline recycling. Vmr1p, a vacuolar membrane protein, participates in the export of numerous growth inhibitors from the cell, such as cycloheximide, 2,4-dinitrophenol, cadmium and other toxic metals. Nft1p is not well-characterized, but it is proposed to regulate Ycf1p, which is involved in heavy metal detoxification.	297
350049	cd18605	ABC_6TM_MRP7_D2_like	Six-transmembrane helical domain 2 (TMD2) of multidrug resistance-associated protein 7, and similar proteins. This group represents the six-transmembrane domain 2 (TMD2) of multidrug resistance-associated protein 7 (MRP7), which belongs to the subfamily C of the ATP-binding cassette (ABC) transporter superfamily. The MRP subfamily (ABCC subfamily) is composed of 13 members, of which MRP1 to MRP9 are the major transporters that cause multidrug resistance in tumor cells by pumping anticancer drugs out of the cell. These nine MRP members function as ATP-dependent exporters for endogenous substances and xenobiotics. The MRP family can be divided into two groups, depending on structural architecture. MRP4, MRP5, MRP8, and MRP9 (ABCC4, 5, 11 and 12, respectively) have a typical ABC transporter structure and are each composed of two transmembrane domains (TMD1 and TMD2) and two nucleotide-binding domains (NBD1 and NBD2). On the other hand, MRP1, 2, 3, 6 and 7 (ABCC1, 2, 3, 6 and 10, respectively) have five additional N-terminal transmembrane segments in a single domain (TMD0) connected to the core (TMD-NBD) by a cytoplasmic linker (L0).	300
350050	cd18606	ABC_6TM_YOR1_D2_like	Six-transmembrane helical domain 2 (TMD2) of the yeast Yor1p and similar proteins; ABCC subfamily. This group includes the six-transmembrane domain 2 (TMD2) of the yeast Yor1p, an oligomycin resistance ABC transporter, and similar proteins. Members of this group belong to the MRP (multidrug resistance-associated protein) subfamily (ABCC). In addition to Yor1p, the yeast ABCC (also termed MRP/CFTR) subfamily comprises five other members (Ycf1p, Bpt1p, Ybt1p/Bat1p, Nft1p, and Vmr1p), which are not included in this group. Yor1p is a plasma membrane ATP-binding transporter that mediates export of many different organic anions including oligomycin. While Yor1p has been shown to localize to the plasma membrane, the other five members (Ycf1p, Bpt1p, Ybt1p/Bat1p, Nft1p and Vmr1p) have been shown to localize to the vacuolar membrane.	290
350119	cd18607	GH130	Glycoside hydrolase family 130. Members of the glycosyl hydrolase family 130, as classified by the carbohydrate-active enzymes database (CAZY), are phosphorylases and hydrolases for beta-mannosides, and include beta-1,4-mannosylglucose phosphorylase (EC 2.4.1.281), beta-1,4-mannooligosaccharide phosphorylase (EC 2.4.1.319), beta-1,4-mannosyl-N-acetyl-glucosamine phosphorylase (EC 2.4.1.320), beta-1,2-mannobiose phosphorylase (EC 2.4.1.-), beta-1,2-oligomannan phosphorylase (EC 2.4.1.-) and beta-1,2-mannosidase (EC 3.2.1.-). They possess 5-bladed beta-propeller domains similar to families 32, 43, 62, 68, 117 (GH32, GH43, GH62, GH68, GH117). GH130 enzymes are involved in the bacterial utilization of mannans or N-linked glycans. Beta-1,4-mannosylglucose phosphorylase is involved in degradation of beta-1,4-D-mannosyl-N-acetyl-D-glucosamine linkages in the core of N-glycans; it produces alpha-mannose 1-phosphate and glucose from 4-O-beta-D-mannosyl-D-glucose and inorganic phosphate, using a critical catalytic Asp as a proton donor.	269
350120	cd18608	GH43_F5-8_typeC-like	Glycosyl hydrolase family 43 proteins, most having an F5/8 type C domain C-terminal to the GH43 domain. This glycosyl hydrolase family 43 (GH43) subgroup includes enzymes that have been annotated as having beta-xylosidase (EC 3.2.1.37), xylanase (EC 3.2.1.8), and beta-galactosidase (EC 3.2.1.145) activities, and some as F5/8 type C domain (also known as the discoidin (DS) domain)-containing proteins. Most contain an F5/8 type C domain C-terminal to the GH43 domain. This subgroup belongs to the glycosyl hydrolase clan F (according to the carbohydrate-active enzymes database (CAZY)), which includes families 43 (GH43) and 62 (GH62). GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. Characterized enzymes belonging to this subgroup include those from Lactobacillus brevis (LbAraf43) and Weissella sp. (WAraf43), which show activity with similar catalytic efficiency on 1,5-alpha-L-arabinooligosaccharides with a degree of polymerization (DP) of 2-3; size is limited by an extended loop at the entrance to the active site. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	276
350121	cd18609	GH32-like	Glycosyl hydrolase family 32 protein. The GH32 family contains glycosyl hydrolase family GH32 proteins that cleave sucrose into fructose and glucose via beta-fructofuranosidase activity, producing invert sugar that is a mixture of dextrorotatory D-glucose and levorotatory D-fructose, thus named invertase (EC 3.2.1.26). This family also contains other fructofuranosidases such as inulinase (EC 3.2.1.7), exo-inulinase (EC 3.2.1.80), levanase (EC 3.2.1.65), and transfructosidases such as sucrose:sucrose 1-fructosyltransferase (EC 2.4.1.99), fructan:fructan 1-fructosyltransferase (EC 2.4.1.100), sucrose:fructan 6-fructosyltransferase (EC 2.4.1.10), fructan:fructan 6G-fructosyltransferase (EC 2.4.1.243) and levan fructosyltransferases (EC 2.4.1.-). These retaining enzymes (i.e. they retain the configuration at the anomeric carbon atom of the substrate) catalyze hydrolysis in two steps involving a covalent glycosyl enzyme intermediate: an aspartate located close to the N-terminus acts as the catalytic nucleophile and a glutamate acts as the general acid/base; a conserved aspartate residue in the Arg-Asp-Pro (RDP) motif stabilizes the transition state. These enzymes are predicted to display a 5-fold beta-propeller fold as found for GH43 and GH68. The breakdown of sucrose is widely used as a carbon or energy source by bacteria, fungi, and plants. Invertase is used commercially in the confectionery industry, since fructose has a sweeter taste than sucrose and a lower tendency to crystallize. A common structural feature of all these enzymes is a 5-bladed beta-propeller domain, similar to GH43, that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	303
350122	cd18610	GH130_BT3780-like	Glycosyl hydrolase family 130, such as beta-mannosidase BT3780 and BACOVA_03624. This subfamily contains glycosyl hydrolase family 130, as classified by the carbohydrate-active enzymes database (CAZY), and includes Bacteroides enzymes, BT3780 and BACOVA_03624. Members of this family possess 5-bladed beta-propeller domains similar to families 32, 43, 62, 68 and 117 (GH32, GH43, GH62, GH68, GH117). GH130 enzymes are involved in the bacterial utilization of mannans or N-linked glycans. GH130 enzymes have also been shown to target beta-1,2- and beta-1,4-mannosidic linkages where these phosphorylases mediate bond cleavage by a single displacement reaction in which phosphate functions as the catalytic nucleophile. However, some lack the conserved basic residues that bind the phosphate nucleophile, as observed for the Bacteroides enzymes, BT3780 and BACOVA_03624, which are indeed beta-mannosidases that hydrolyze beta-1,2-mannosidic linkages through an inverting mechanism.	301
350123	cd18611	GH130	Glycosyl hydrolase family 130; uncharacterized. This subfamily contains glycosyl hydrolase family 130 (GH130) proteins, as classified by the carbohydrate-active enzymes database (CAZY), most of which are as yet uncharacterized. GH130 enzymes are phosphorylases and hydrolases for beta-mannosides, and include beta-1,4-mannosylglucose phosphorylase (EC 2.4.1.281), beta-1,4-mannooligosaccharide phosphorylase (EC 2.4.1.319), beta-1,4-mannosyl-N-acetyl-glucosamine phosphorylase (EC 2.4.1.320), beta-1,2-mannobiose phosphorylase (EC 2.4.1.-), beta-1,2-oligomannan phosphorylase (EC 2.4.1.-) and beta-1,2-mannosidase (EC 3.2.1.-). They possess 5-bladed beta-propeller domains similar to families 32, 43, 62, 68, 117 (GH32, GH43, GH62, GH68, GH117). GH130 enzymes are involved in the bacterial utilization of mannans or N-linked glycans. Beta-1,4-mannosylglucose phosphorylase is involved in degradation of beta-1,4-D-mannosyl-N-acetyl-D-glucosamine linkages in the core of N-glycans; it produces alpha-mannose 1-phosphate and glucose from 4-O-beta-D-mannosyl-D-glucose and inorganic phosphate, using a critical catalytic Asp as a proton donor.	289
350124	cd18612	GH130_Lin0857-like	Glycoside hydrolase family 130 such as Listeria innocua beta-1,2-mannobiose phosphorylase. This subfamily contains glycosyl hydrolase family 130 (GH130) enzymes, as classified by the carbohydrate-active enzymes database (CAZY), which are phosphorylases and hydrolases for beta-mannosides, and includes Listeria innocua beta-1,2-mannobiose phosphorylase (Lin0857). They possess 5-bladed beta-propeller domains similar to families 32, 43, 62, 68 and 117 (GH32, GH43, GH62, GH68, GH117). GH130 enzymes are involved in the bacterial utilization of mannans or N-linked glycans. The structure of Lin0857 shows beta-1,2-mannotriose bound in a U-shape, interacting with a phosphate analog at both ends. Lin0857 has a unique dimer structure connected by a loop, with a significant open-close loop displacement observed for substrate entry. A long loop, which is exclusively present in Lin0857, covers the active site to limit the pocket size.	261
350125	cd18613	GH130	Glycosyl hydrolase family 130; uncharacterized. This subfamily contains glycosyl hydrolase family 130 (GH130) proteins, as classified by the carbohydrate-active enzymes database (CAZY), most of which are as yet uncharacterized. GH130 enzymes are phosphorylases and hydrolases for beta-mannosides, and include beta-1,4-mannosylglucose phosphorylase (EC 2.4.1.281), beta-1,4-mannooligosaccharide phosphorylase (EC 2.4.1.319), beta-1,4-mannosyl-N-acetyl-glucosamine phosphorylase (EC 2.4.1.320), beta-1,2-mannobiose phosphorylase (EC 2.4.1.-), beta-1,2-oligomannan phosphorylase (EC 2.4.1.-) and beta-1,2-mannosidase (EC 3.2.1.-). They possess 5-bladed beta-propeller domains similar to families 32, 43, 62, 68, 117 (GH32, GH43, GH62, GH68, GH117). GH130 enzymes are involved in the bacterial utilization of mannans or N-linked glycans. Beta-1,4-mannosylglucose phosphorylase is involved in degradation of beta-1,4-D-mannosyl-N-acetyl-D-glucosamine linkages in the core of N-glycans; it produces alpha-mannose 1-phosphate and glucose from 4-O-beta-D-mannosyl-D-glucose and inorganic phosphate, using a critical catalytic Asp as a proton donor.	302
350126	cd18614	GH130	Glycosyl hydrolase family 130; uncharacterized. This subfamily contains glycosyl hydrolase family 130 (GH130) proteins, as classified by the carbohydrate-active enzymes database (CAZY), most of which are as yet uncharacterized. GH130 enzymes are phosphorylases and hydrolases for beta-mannosides, and include beta-1,4-mannosylglucose phosphorylase (EC 2.4.1.281), beta-1,4-mannooligosaccharide phosphorylase (EC 2.4.1.319), beta-1,4-mannosyl-N-acetyl-glucosamine phosphorylase (EC 2.4.1.320), beta-1,2-mannobiose phosphorylase (EC 2.4.1.-), beta-1,2-oligomannan phosphorylase (EC 2.4.1.-) and beta-1,2-mannosidase (EC 3.2.1.-). They possess 5-bladed beta-propeller domains similar to families 32, 43, 62, 68, 117 (GH32, GH43, GH62, GH68, GH117). GH130 enzymes are involved in the bacterial utilization of mannans or N-linked glycans. Beta-1,4-mannosylglucose phosphorylase is involved in degradation of beta-1,4-D-mannosyl-N-acetyl-D-glucosamine linkages in the core of N-glycans; it produces alpha-mannose 1-phosphate and glucose from 4-O-beta-D-mannosyl-D-glucose and inorganic phosphate, using a critical catalytic Asp as a proton donor.	276
350127	cd18615	GH130	Glycosyl hydrolase family 130; uncharacterized. This subfamily contains glycosyl hydrolase family 130 (GH130) proteins, as classified by the carbohydrate-active enzymes database (CAZY), most of which are as yet uncharacterized. GH130 enzymes are phosphorylases and hydrolases for beta-mannosides, and include beta-1,4-mannosylglucose phosphorylase (EC 2.4.1.281), beta-1,4-mannooligosaccharide phosphorylase (EC 2.4.1.319), beta-1,4-mannosyl-N-acetyl-glucosamine phosphorylase (EC 2.4.1.320), beta-1,2-mannobiose phosphorylase (EC 2.4.1.-), beta-1,2-oligomannan phosphorylase (EC 2.4.1.-) and beta-1,2-mannosidase (EC 3.2.1.-). They possess 5-bladed beta-propeller domains similar to families 32, 43, 62, 68, 117 (GH32, GH43, GH62, GH68, GH117). GH130 enzymes are involved in the bacterial utilization of mannans or N-linked glycans. Beta-1,4-mannosylglucose phosphorylase is involved in degradation of beta-1,4-D-mannosyl-N-acetyl-D-glucosamine linkages in the core of N-glycans; it produces alpha-mannose 1-phosphate and glucose from 4-O-beta-D-mannosyl-D-glucose and inorganic phosphate, using a critical catalytic Asp as a proton donor.	277
350128	cd18616	GH43_ABN-like	Glycosyl hydrolase family 43 such as arabinan endo-1,5-alpha-L-arabinosidase. This glycosyl hydrolase family 43 (GH43) subgroup includes mostly enzymes with endo-alpha-L-arabinanase (ABN; EC 3.2.1.99) activity. These are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. The GH43 ABN enzymes hydrolyze alpha-1,5-L-arabinofuranoside linkages. These arabinan-degrading enzymes are important in the food industry for efficient production of L-arabinose from agricultural waste; L-arabinose is often used as a bioactive sweetener. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	291
350129	cd18617	GH43_XynB-like	Glycosyl hydrolase family 43, such as Bacteroides ovatus alpha-L-arabinofuranosidase (BoGH43, XynB). This glycosyl hydrolase family 43 (GH43) subgroup includes enzymes that have been characterized to have alpha-L-arabinofuranosidase (EC 3.2.1.55) and beta-1,4-xylosidase (beta-D-xylosidase; xylan 1,4-beta-xylosidase; EC 3.2.1.37) activities. Beta-1,4-xylosidases are part of an array of hemicellulases that are involved in the final breakdown of plant cell-wall whereby they degrade xylan. They hydrolyze beta-1,4 glycosidic bonds between two xylose units in short xylooligosaccharides. These are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Also included in this subfamily are Bacteroides ovatus alpha-L-arabinofuranosidases, BoGH43A and BoGH43B, both having a two-domain architecture, consisting of an N-terminal 5-bladed beta-propeller domain harboring the catalytic active site, and a C-terminal beta-sandwich domain. However, despite significant functional overlap between these two enzymes, BoGH43A and BoGH43B share just 41% sequence identity. The latter appears to be significantly less active on the same substrates, suggesting that these paralogs may play subtly different roles during the degradation of xyloglucans from different sources, or may function most optimally at different stages in the catabolism of xyloglucan oligosaccharides (XyGOs), for example before or after hydrolysis of certain side-chain moieties. It also includes Phanerochaete chrysosporium BKM-F-1767 Xyl, a bifunctional xylosidase/arabinofuranosidase. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	285
350130	cd18618	GH43_Xsa43E-like	Glycosyl hydrolase family 43, including Butyrivibrio proteoclasticus arabinofuranosidase Xsa43E. This glycosyl hydrolase family 43 (GH43) subgroup belongs to the GH43_AXH-like subgroup which includes enzymes that have been characterized with beta-xylosidase (EC 3.2.1.37), alpha-L-arabinofuranosidase (EC 3.2.1.55), alpha-1,2-L-arabinofuranosidase 43A (arabinan-specific; EC 3.2.1.-), endo-alpha-L-arabinanase as well as arabinoxylan arabinofuranohydrolase (AXH) activities. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. AXHs specifically hydrolyze the glycosidic bond between arabinofuranosyl substituents and xylopyranosyl backbone residues of arabinoxylan. This subgroup includes Cellvibrio japonicus arabinan-specific alpha-1,2-arabinofuranosidase, CjAbf43A, which confers its specificity by a surface cleft that is complementary to the helical backbone of the polysaccharide, and Butyrivibrio proteoclasticus GH43 enzyme Xsa43E, also an arabinofuranosidase, which has been shown to cleave arabinose side chains from short segments of xylan. Several of these enzymes also contain carbohydrate binding modules (CBMs) that bind cellulose or xylan. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	275
350131	cd18619	GH43_CoXyl43_like	Glycosyl hydrolase family 43 protein such as metagenomic beta-xylosidase/alpha-L-arabinofuranosidase CoXyl43. This glycosyl hydrolase family 43 (GH43) subgroup belongs to the GH43_AXH-like subgroup which includes enzymes that have been characterized with beta-xylosidase (EC 3.2.1.37), alpha-L-arabinofuranosidase (EC 3.2.1.55), alpha-1,2-L-arabinofuranosidase 43A (arabinan-specific; EC 3.2.1.-), endo-alpha-L-arabinanase as well as arabinoxylan arabinofuranohydrolase (AXH) activities. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. Included in this subfamily is the metagenomic beta-xylosidase/alpha-L-arabinofuranosidase CoXyl43, which shows synergy with Trichoderma reesei cellulases and promotes plant biomass saccharification by degrading xylo-oligosaccharides, such as xylobiose and xylotriose, into the monosaccharide xylose. Studies show that the hydrolytic activity of CoXyl43 is stimulated in the presence of calcium. Several of these enzymes also contain carbohydrate binding modules (CBMs) that bind cellulose or xylan. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	313
350132	cd18620	GH43_XylA-like	Glycosyl hydrolase family 43-like protein such as Clostridium stercorarium alpha-L-arabinofuranosidase XylA. This glycosyl hydrolase family 43 (GH43) subgroup belongs to the GH43_AXH-like subgroup which includes enzymes that have been characterized with beta-xylosidase (EC 3.2.1.37), alpha-L-arabinofuranosidase (EC 3.2.1.55), alpha-1,2-L-arabinofuranosidase 43A (arabinan-specific; EC 3.2.1.-), endo-alpha-L-arabinanase as well as arabinoxylan arabinofuranohydrolase (AXH) activities. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. The GH43_XylA-like subgroup includes Clostridium stercorarium alpha-L-arabinofuranosidase XylA, and enzymes that have been annotated as having beta-xylosidase (EC 3.2.1.37), alpha-L-arabinofuranosidase (EC 3.2.1.55), endo-alpha-L-arabinanase (EC 3.2.1.-) as well as arabinoxylan arabinofuranohydrolase (AXH) activities. AXHs specifically hydrolyze the glycosidic bond between arabinofuranosyl substituents and xylopyranosyl backbone residues of arabinoxylan.	274
350133	cd18621	GH32_XdINV-like	glycoside hydrolase family 32 protein such as Xanthophyllomyces dendrorhous beta-fructofuranosidase (Inv;Xd-INV;XdINV). This subfamily of glycosyl hydrolase family GH32 includes fructan:fructan 1-fructosyltransferase (FT, EC 2.4.1.100) and beta-fructofuranosidase (invertase or Inv, EC 3.2.1.26), among others. These enzymes cleave sucrose into fructose and glucose via beta-fructofuranosidase activity, producing invert sugar that is a mixture of dextrorotatory D-glucose and levorotatory D-fructose, thus named invertase (EC 3.2.1.26). These retaining enzymes (i.e. they retain the configuration at anomeric carbon atom of the substrate) catalyze hydrolysis in two steps involving a covalent glycosyl enzyme intermediate: an aspartate located close to the N-terminus acts as the catalytic nucleophile and a glutamate acts as the general acid/base; a conserved aspartate residue in the Arg-Asp-Pro (RDP) motif stabilizes the transition state. Xanthophyllomyces dendrorhous beta-fructofuranosidase  (XdINV) also catalyzes the synthesis of fructooligosaccharides  (FOS, a beneficial prebiotic), producing neo-FOS, making it an interesting biotechnology target.  Structural studies show plasticity of its active site, having a flexible loop that is essential in binding sucrose and beta(2-1)-linked oligosaccharide, making it a valuable biocatalyst to produce novel bioconjugates. The breakdown of sucrose is widely used as a carbon or energy source by bacteria, fungi, and plants. Invertase is used commercially in the confectionery industry, since fructose has a sweeter taste than sucrose and a lower tendency  to crystallize. A common structural feature of all these enzymes is a 5-bladed beta-propeller domain, similar to GH43, that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	337
350134	cd18622	GH32_Inu-like	glycoside hydrolase family 32 protein such as Aspergillus ficuum endo-inulinase (Inu2). This subfamily of glycosyl hydrolase family GH32 includes endo-inulinase (Inu2, EC 3.2.1.7), exo-inulinase (Inu1, EC 3.2.1.80), invertase (EC 3.2.1.26), and levan fructotransferase (LftA, EC 4.2.2.16), among others. These enzymes cleave sucrose into fructose and glucose via beta-fructofuranosidase activity, producing invert sugar that is a mixture of dextrorotatory D-glucose and levorotatory D-fructose, thus named invertase (EC 3.2.1.26). These retaining enzymes (i.e. they retain the configuration at the anomeric carbon atom of the substrate) catalyze hydrolysis in two steps involving a covalent glycosyl enzyme intermediate: an aspartate located close to the N-terminus acts as the catalytic nucleophile and a glutamate acts as the general acid/base; a conserved aspartate residue in the Arg-Asp-Pro (RDP) motif stabilizes the transition state. These enzymes are predicted to display a 5-fold beta-propeller fold as found for GH43 and GH68. The breakdown of sucrose is widely used as a carbon or energy source by bacteria, fungi, and plants. Invertase is used commercially in the confectionery industry, since fructose has a sweeter taste than sucrose and a lower tendency to crystallize. A common structural feature of all these enzymes is a 5-bladed beta-propeller domain, similar to GH43, that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	289
350135	cd18623	GH32_ScrB-like	glycoside hydrolase family 32 sucrose 6 phosphate hydrolase (sucrase). Glycosyl hydrolase family GH32 subgroup contains sucrose-6-phosphate hydrolase (sucrase, EC:3.2.1.26) among others. The enzyme cleaves sucrose into fructose and glucose via beta-fructofuranosidase activity, producing invert sugar that is a mixture of dextrorotatory D-glucose and levorotatory D-fructose. These retaining enzymes (i.e. they retain the configuration at anomeric carbon atom of the substrate) catalyze hydrolysis in two steps involving a covalent glycosyl enzyme intermediate: an aspartate located close to the N-terminus acts as the catalytic nucleophile and a glutamate acts as the general acid/base; a conserved aspartate residue in the Arg-Asp-Pro (RDP) motif stabilizes the transition state. The breakdown of sucrose is widely used as a carbon or energy source by bacteria, fungi, and plants. Invertase is used commercially in the confectionery industry, since fructose has a sweeter taste than sucrose and a lower tendency  to crystallize. A common structural feature of all these enzymes is a 5-bladed beta-propeller domain, similar to GH43, that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	289
350136	cd18624	GH32_Fruct1-like	glycoside hydrolase family 32 protein such as Arabidopsis thaliana cell-wall invertase 1 (AtBFruct1;Fruct1;AtcwINV1;At3g13790). This subfamily of glycosyl hydrolase family GH32 includes fructan beta-(2,1)-fructosidase and fructan 1-exohydrolase IIa (1-FEH IIa, EC 3.2.1.153), cell-wall invertase 1 (EC 3.2.1.26), sucrose:fructan 6-fructosyltransferase (6-Sst/6-Dft, EC 2.4.1.10), and levan fructosyltransferases (EC 2.4.1.-) among others. This enzyme cleaves sucrose into fructose and glucose via beta-fructofuranosidase activity, producing invert sugar that is a mixture of dextrorotatory D-glucose and levorotatory D-fructose, thus named invertase. These retaining enzymes (i.e. they retain the configuration at anomeric carbon atom of the substrate) catalyze hydrolysis in two steps involving a covalent glycosyl enzyme intermediate: an aspartate located close to the N-terminus acts as the catalytic nucleophile and a glutamate acts as the general acid/base; a conserved aspartate residue in the Arg-Asp-Pro (RDP) motif stabilizes the transition state. The breakdown of sucrose is widely used as a carbon or energy source by bacteria, fungi, and plants. Invertase is used commercially in the confectionery industry, since fructose has a sweeter taste than sucrose and a lower tendency  to crystallize. A common structural feature of all these enzymes is a 5-bladed beta-propeller domain, similar to GH43, that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	296
350137	cd18625	GH32_BfrA-like	glycoside hydrolase family 32 protein such as Thermotoga maritima invertase (BfrA or Tm1414). This subfamily of glycosyl hydrolase family GH32 includes beta-fructosidase (invertase, EC 3.2.1.26) that cleaves sucrose into fructose and glucose via beta-fructofuranosidase activity, producing invert sugar that is a mixture of dextrorotatory D-glucose and levorotatory D-fructose, thus named invertase. These retaining enzymes (i.e. they retain the configuration at anomeric carbon atom of the substrate) catalyze hydrolysis in two steps involving a covalent glycosyl enzyme intermediate: an aspartate located close to the N-terminus acts as the catalytic nucleophile and a glutamate acts as the general acid/base; a conserved aspartate residue in the Arg-Asp-Pro (RDP) motif stabilizes the transition state. The breakdown of sucrose is widely used as a carbon or energy source by bacteria, fungi, and plants. Invertase is used commercially in the confectionery industry, since fructose has a sweeter taste than sucrose and a lower tendency  to crystallize. A common structural feature of all these enzymes is a 5-bladed beta-propeller domain, similar to GH43, that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	286
349276	cd18626	CD_eEF3	chromodomain-like insertion in an ATPase domain of elongation factor eEF3. Eukaryotic elongation factor eEF3 (also known as EF-3, YEF3, and TEF3), a member of the ATP-binding cassette (ABC) family of proteins, is a ribosomal binding ATPase essential for fungal translation machinery. Until recently it was considered fungal-specific and therefore an attractive target for antifungal therapy; however, recent bioinformatics analysis indicates it may be more widely distributed among other unicellular eukaryotes, and translation elongation factor 3 activity has been demonstrated from a non-fungal species, Phytophthora infestans. eEF3 is a soluble factor lacking a transmembrane domain and having two ABC domains arranged in tandem, with a unique chromodomain inserted within the ABC2 domain. Chromodomain mutations in the ABC2 domain of eEF3 have been shown to reduce ATPase activity, but not ribosome binding. Thus, the chromodomain-like insertion is critical to eEF3 function. In addition to its elongation function, eEF3 has been shown to interact with mRNA in a translation independent manner, suggesting an additional, non-elongation function for this factor.	56
349277	cd18627	CD_polycomb_like	chromodomain of polycomb and chromobox family proteins. CHRomatin Organization Modifier (chromo) domain of Polycomb and Polycomb-group (PcG) chromobox (CBX) family proteins such as CBX2, CBX4, CBX6, CBX7, and CBX8. These CBX proteins are components of the PcG repressive complex PRC1, one of the two classes of PRCs. PcG proteins form large multiprotein complexes (PcG bodies) which are involved in the stable repression of genes involved in development, signaling or cancer via chromatin-based epigenetic modifications. Mammalian PRC1 includes canonical (cPRC1) and non-canonical complexes; cPRC1, contains four core subunits including one CBX protein (CBX2, CBX4, and CBX6-CBX8) that binds H3K27me3. CBX family members have different affinity for H3K27me3, with CBX7 having the highest binding capability. The human CBX proteins show distinct nuclear localizations and contribute differently to transcriptional repression. Some CBX proteins of the PRC1 complex have been implicated in transcriptional activation as well as in PRC1-independent roles in embryonic stem cells and in somatic cells.	49
349278	cd18628	CD3_cpSRP43_like	chromodomain 3 of chloroplast signal recognition particle 43 kDa protein, and similar proteins. This subgroup includes the chromodomain 3 of chloroplast SRP43 (cpSRP43), and similar proteins. CpSRP43 is a component of the chloroplast signal recognition particle (SRP) pathway. It forms a stable complex with cpSRP54 (cpSRP complex) which is required for the efficient posttranslational transport of members of the nuclearly encoded light harvesting chlorophyll-a/b-binding proteins (LHCPs) to the thylakoid membrane. Chromatin organization modifier (chromo) domain is a conserved region of around 50 amino acids found in a variety of chromosomal proteins, which appear to play a role in the functional organization of the eukaryotic nucleus. Experimental evidence implicates the chromodomain in the binding activity of these proteins to methylated histone tails and maybe RNA. May occur as single instance, in a tandem arrangement or followed by a related chromo shadow domain.	51
349279	cd18629	CD2_cpSRP43_like	chromodomain 2 of chloroplast signal recognition particle 43 kDa protein, and similar proteins. This subgroup includes the chromodomain 2 of chloroplast SRP43 (cpSRP43), and similar proteins. CpSRP43 is a component of the chloroplast signal recognition particle (SRP) pathway. It forms a stable complex with cpSRP54 (cpSRP complex) which is required for the efficient posttranslational transport of members of the nuclearly encoded light harvesting chlorophyll-a/b-binding proteins (LHCPs) to the thylakoid membrane. Chromatin organization modifier (chromo) domain is a conserved region of around 50 amino acids found in a variety of chromosomal proteins, which appear to play a role in the functional organization of the eukaryotic nucleus. Experimental evidence implicates the chromodomain in the binding activity of these proteins to methylated histone tails and maybe RNA. May occur as single instance, in a tandem arrangement or followed by a related chromo shadow domain.	48
349280	cd18630	CD_Rhino	chromodomain of Drosophila melanogaster Rhino, and similar proteins. N-terminal CHRomatin Organization Modifier (chromo) domain of Drosophila melanogaster Rhino (also known as heterochromatin protein 1-like), and similar proteins.  Rhino is a female-specific protein that affects chromosome structure and egg polarity that is required for germline PIWI-interacting RNA (piRNA) production. In Drosophila the RDC (rhino, deadlock, and cutoff) complex, composed of rhino, the protein deadlock (Del) and the Rai1-like transcription termination cofactor cutoff (Cuff) binds to chromatin of dual-strand piRNA clusters, special genomic regions, which encode piRNA precursors.  The RDC complex is anchored to H3K9me3-marked chromatin in part via the H3K9me3-binding activity of Rhino, and is required for transcription of piRNA precursors. A chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and which appears to play a role in the functional organization of the eukaryotic nucleus. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain.	51
349281	cd18631	CD_HP1_like	chromodomain of heterochromatin protein 1 proteins, including HP1alpha, HP1beta, and HP1gamma. CHRomatin Organization Modifier (chromo) domain of mammalian HP1alpha (Cbx5), HP1beta (Cbx1), HP1gamma (Cbx3), and similar proteins. HP1 has diverse functions in heterochromatin formation and impacts both gene expression and gene silencing. HP1 has two conserved protein-protein interaction domains, a single N-terminal chromodomain (CD) which can bind to histone proteins via methylated lysine residues, and a related C-terminal chromo shadow domain (CSD) which is responsible for the homodimerization and interaction with a number of chromatin-associated non-histone proteins; a flexible hinge region separates the CD and CSD and may bind nucleic acid. HP1 is a highly conserved non-histone chromosomal protein that is evolutionarily conserved from fission yeast to plants and animals. There are three human homologs of HP1 proteins: HP1alpha (also known as Cbx5), HP1beta (also known as Cbx1), and HP1gamma (also known as Cbx3).	50
349282	cd18632	CD_Clr4_like	N-terminal chromodomain of the fission yeast histone methyltransferase Clr4, and similar proteins. N-terminal CHRomatin Organization Modifier (chromo) domain of cryptic loci regulator 4 (Clr4), a histone H3 lysine methyltransferase which targets H3K9. Clr4 regulates silencing and switching at the mating-type loci and affects chromatin structure at centromeres. Clr4 is a catalytic component of the rik1-associated E3 ubiquitin ligase complex that shows ubiquitin ligase activity and is required for histone H3K9 methylation. H3K9me represents a specific tag for epigenetic transcriptional repression by recruiting swi6/HP1 to methylated histones which leads to transcriptional silencing within centromeric heterochromatin, telomeric regions and at the silent mating-type loci. A chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and which appears to play a role in the functional organization of the eukaryotic nucleus. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain.	55
349283	cd18633	CD_MMP8	chromodomain of M-phase phosphoprotein 8. The chromodomain of M-phase phosphoprotein 8 (MPP8), a component of the RanBPM-containing large protein complex, binds methylated H3K9. This may in turn recruit the H3K9 methyltransferases GLP and ESET, and DNA methyltransferase 3A to the promoter of the E-cadherin gene, mediating the E-cadherin gene silencing and promoting tumor cell motility and invasion. A chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and which appears to play a role in the functional organization of the eukaryotic nucleus. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain.	51
349284	cd18634	CD_CDY	chromodomain of the Chromodomain Y-like protein family. This group includes the chromodomain found in the mammalian chromodomain Y-like (CDY) protein family, and similar proteins. The human CDY family includes 6 proteins: the genes encoding four of these, two copies of CDY1 (CDY1a and CDY1b) and two copies of CDY2 (CDY2a and CDY2b), are located on chromosome Y, and the genes encoding the other two members (CDYL and CDYL2) are located on autosomes. The Y-chromosomal genes are only present in primates, whereas the CDYL and CDYL2 genes exist in most mammalian species. The CDY family proteins contain two functional domains: a chromodomain involved in chromatin binding and a catalytic domain found in many coenzyme A (CoA)-dependent acylation enzymes. CDYL is ubiquitously expressed, whereas CDYL2 shows selective expression in testis, prostate, spleen, and leukocytes; the CDY genes are expressed only in the testis. Deletion of the CDY1b gene has been shown to be a risk factor for male infertility. Impairments in CDY2 expression could be implicated in the pathogenesis of maturation arrest (a failure of germ cell development).	52
349285	cd18635	CD_CMT3_like	chromodomain of chromomethylase 3, and similar proteins. CHRomatin Organization Modifier (chromo) domain of DNA (cytosine-5)-methyltransferase chromomethylase 3 (CMT3, EC:2.1.1.37), and similar proteins. CMT3 is primarily a CHG (where H is either A, T or C) methyltransferase and is predominantly expressed in actively replicating cells. The protein is involved in preferentially methylating transposon-related sequences, reducing their mobility. Studies suggest that in order to target DNA methylation, CMT3 associates with H3K9me2-containing nucleosomes through binding of its BAH- and chromo-domains to H3K9me2. A chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and which appears to play a role in the functional organization of the eukaryotic nucleus. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain.	57
349286	cd18636	CD_Chp1_like	chromodomain of chromodomain-containing protein 1, and similar proteins. CHRomatin Organization Modifier (chromo) domain of chromodomain-containing protein 1 (CHp1), and similar proteins. Chp1 is needed for RNA interference-dependent heterochromatin formation in fission yeast. Chp1 is a member of the RNA-induced transcriptional silencing (RITS) complex which maintains the heterochromatin regions. The chromodomain of the Chp1 component binds the histone H3 lysine 9 methylated tail (H3K9me) and the core of the nucleosome. A chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and which appears to play a role in the functional organization of the eukaryotic nucleus. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain.	52
349287	cd18637	CD_Swi6_like	chromodomain of fission yeast Swi6, and similar proteins. Fission yeast Swi6 protein is a structural and functional homolog of mammalian HP1 (heterochromatin protein 1) and contributes to chromatin structure by binding to centromeres, telomeres, and the silent mating-type locus. Swi6 contains an N-terminal chromo (CHRomatin Organization MOdifier) domain and a C-terminal chromo shadow domain (CSD). Swi6 binds histone H3 tails methylated at Lys- and the cohesin subunit Psc3, leading to gene silencing and sister chromatid cohesion. It is also involved in the repression of the silent mating-type loci MAT2 and MAT3. Swi6 may compact MAT2/3 into a heterochromatin-like conformation which represses the transcription of these silent cassettes. Chromodomains mediate the interaction of the heterochromatin with other heterochromatin proteins, thereby affecting chromatin structure (e.g. Drosophila and human heterochromatin protein (HP1) and mammalian modifier 1 and modifier 2). CSDs have only been found in proteins that also possess a chromodomain.	54
349288	cd18638	CD_EhHp1_like	chromodomain of Entamoeba histolytica heterochromatin protein 1, and similar proteins. This subgroup includes the N-terminal CHRomatin Organization Modifier (chromo) domain of heterochromatin protein 1 (HP1)-like protein from Entamoeba histolytica, and similar proteins. A chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and which appears to play a role in the functional organization of the eukaryotic nucleus. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain.	52
349289	cd18639	CD_SUV39H1_like	chromodomain of histone methyltransferase SUV39H1, and similar proteins. CHRomatin Organization Modifier (chromo) domain of human SUV39H1, a histone lysine methyltransferase (HMT) which catalyzes di- and tri-methylation of lysine 9 of histone H3 (H3K9me2/3), leading to heterochromatin formation and gene silencing. H3K9me2/3 represents a specific mark for epigenetic transcriptional repression by recruiting HP1 (CBX1, CBX3, and/or CBX5) proteins to methylated histones. SUV39H1 mainly functions in heterochromatin regions. The human SUV39H1/2, histone H3K9 methyltransferases, are the mammalian homologs of Drosophila Su(var)3-9 and Schizosaccharomyces pombe Clr4. SUV39H1 contains a chromodomain at its N-terminus and a SET domain at its C-terminus. Although the SET domain performs the catalytic activity, the chromodomain of SUV39H1 is essential for the catalytic activity of SUV39H1. A chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and which appears to play a role in the functional organization of the eukaryotic nucleus. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain.	49
349290	cd18640	CD_Chro-like	chromodomain of Drosophila melanogaster chromator chromodomain protein, and similar proteins. This subgroup includes the chromo (CHRomatin Organization Modifier) domain of Drosophila melanogaster chromator (also known as Chriz/Chro) chromodomain protein, and similar proteins. Chromator is a nuclear protein that plays a role in proper spindle dynamics during mitosis. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain.	52
350843	cd18641	CBD_RBP1_like	chromo barrel domain of retinoblastoma binding protein 1, and similar proteins. Retinoblastoma-binding protein 1 (RBP1), also termed AT-rich interaction domain 4A, is a ubiquitously expressed nuclear protein. RBP1 is a tumor and leukemia suppressor that binds both methylated histone tails and DNA, and is involved in epigenetic regulation in leukemia and Prader-Willi/Angelman syndromes. The chromo barrel domain of RBP1 has been reported to recognize histone H4K20me3 weakly, and this binding is enhanced by the simultaneous binding of DNA. RBP1 binds directly, with several other proteins, to retinoblastoma protein (pRB) which regulates cell proliferation; pRB represses transcription by recruiting RBP1. SH3-fold-beta-barrel domains of the chromo-like superfamily include chromodomains, chromo shadow domains, and chromo barrel domains, and are implicated in the recognition of lysine-methylated histone tails and nucleic acids. The chromodomain differs, in that it lacks the first strand of the SH3-fold-beta-barrel. This first strand is altered by insertion in the chromo shadow domains, and chromo barrel domains are typical SH3-fold-beta-barrel domains with sequence similarity to the canonical chromodomain.	59
350844	cd18642	CBD_MOF_like	chromo barrel domain of Drosophila melanogaster males-absent on the first protein, and similar proteins. This subgroup includes the chromo barrel domains found in human Tat-interactive protein 60 (TIP60, also known as KAT5 or HTATIP), Drosophila melanogaster males-absent on the first (MOF) protein, and Saccharomyces ESA1. SH3-fold-beta-barrel domains of the chromo-like superfamily include chromodomains, chromo shadow domains, and chromo barrel domains, and are implicated in the recognition of lysine-methylated histone tails and nucleic acids. The chromodomain differs, in that it lacks the first strand of the SH3-fold-beta-barrel. This first strand is altered by insertion in the chromo shadow domains, and chromo barrel domains are typical SH3-fold-beta-barrel domains with sequence similarity to the canonical chromo domain. The MOF-like chromo barrels may be auto-inhibited, i.e. they seem to have occluded peptide binding sites.	67
350845	cd18643	CBD	chromo barrel domain of MOF acetyltransferase, and similar proteins. This group includes the chromo barrel domains found in human Tat-interactive protein 60 (TIP60, also known as KAT5 or HTATIP), Drosophila melanogaster males-absent on the first (MOF) protein, human male-specific lethal (MSL) complex subunit 3 (MSL3), and retinoblastoma binding protein 1. SH3-fold-beta-barrel domains of the chromo-like superfamily include chromodomains, chromo shadow domains, and chromo barrel domains, and are implicated in the recognition of lysine-methylated histone tails and nucleic acids. The chromodomain differs, in that it lacks the first strand of the SH3-fold-beta-barrel. This first strand is altered by insertion in the chromo shadow domains, and chromo barrel domains are typical SH3-fold-beta-barrel domains with sequence similarity to the canonical chromo domain. The chromo barrel domains include a MOF-like subgroup which may be auto-inhibited, i.e. they seem to have occluded peptide binding sites.	61
349291	cd18644	CD_polycomb	chromodomain of polycomb. CHRomatin Organization Modifier (chromo) domain of the PcG (polycomb-group) chromodomain protein Polycomb (Pc) from Drosophila melanogaster, arthropod, worm, and sea cucumber, and similar proteins. Pc is a component of the Polycomb-group (PcG) multiprotein PRC1 complex, a complex class required to maintain the transcriptionally repressive state of many genes, including Hox genes, throughout development. The core subunits of PRC1 are polycomb (Pc), polyhomeotic (Ph), posterior sex combs (Psc), and sex comb extra (Sce, also known as dRing). Polycomb (Pc) plays a role in modulating life span in flies, where it negatively regulates longevity.	54
349292	cd18645	CD_Cbx4	chromodomain of chromobox homolog 4. CHRomatin Organization Modifier (chromo) domain of chromobox homolog 4 (CBX4), a component of the PcG repressive complex PRC1, one of the two classes of PRCs. PcG proteins form large multiprotein complexes (PcG bodies) which are involved in the stable repression of genes involved in development, signaling or cancer via chromatin-based epigenetic modifications. Mammalian PRC1 includes canonical (cPRC1) and non-canonical complexes; cPRC1, contains four core subunits including one CBX protein (CBX2, CBX4, and CBX6-CBX8) that binds H3K27me3. CBX family members have different affinity for H3K27me3, with CBX7 having the highest binding capability. The human CBX proteins show distinct nuclear localizations and contribute differently to transcriptional repression. Some CBX proteins of the PRC1 complex have been implicated in transcriptional activation as well as in PRC1-independent roles in embryonic stem cells and in somatic cells. In addition to a chromodomain with H3K27me3-binding activity, Cbx4 contains two SUMO-interacting motifs responsible for its small ubiquitin-related modifier (SUMO) E3 ligase activity. CBX proteins may act as an oncogene or tumor suppressor in a cell-type-dependent manner, for example CBX8 promotes proliferation while suppressing metastasis, in colorectal carcinoma progression. CBX4 may serve as a tumor suppressor in colorectal carcinoma, and has been shown to be an oncogene in osteosarcoma and breast cancer.	55
349293	cd18646	CD_Cbx7	chromodomain of chromobox homolog 7. CHRomatin Organization Modifier (chromo) domain of chromobox homolog 7 (CBX7), a component of the PcG repressive complex PRC1, one of the two classes of PRCs. PcG proteins form large multiprotein complexes (PcG bodies) which are involved in the stable repression of genes involved in development, signaling or cancer via chromatin-based epigenetic modifications. Mammalian PRC1 includes canonical (cPRC1) and non-canonical complexes; cPRC1, contains four core subunits including one CBX protein (CBX2, CBX4, and CBX6-CBX8) that binds H3K27me3. CBX family members have different affinity for H3K27me3, with CBX7 having the highest binding capability. The human CBX proteins show distinct nuclear localizations and contribute differently to transcriptional repression. Some CBX proteins of the PRC1 complex have been implicated in transcriptional activation as well as in PRC1-independent roles in embryonic stem cells and in somatic cells. CBX proteins may act as an oncogene or tumor suppressor in a cell-type-dependent manner, for example CBX8 promotes proliferation while suppressing metastasis, in colorectal carcinoma progression. CBX7 has been shown to function as a tumor suppressor in lung carcinoma and an oncogene in gastric cancer and lymphoma.	56
349294	cd18647	CD_Cbx2	chromodomain of chromobox homolog 2. CHRomatin Organization Modifier (chromo) domain of chromobox homolog 2 (CBX2), a component of the PcG repressive complex PRC1, one of the two classes of PRCs. PcG proteins form large multiprotein complexes (PcG bodies) which are involved in the stable repression of genes involved in development, signaling or cancer via chromatin-based epigenetic modifications. Mammalian PRC1 includes canonical (cPRC1) and non-canonical complexes; cPRC1 contains four core subunits including one CBX protein (CBX2, CBX4, and CBX6-CBX8) that binds H3K27me3. CBX family members have different affinity for H3K27me3, with CBX7 having the highest binding capability. The human CBX proteins show distinct nuclear localizations and contribute differently to transcriptional repression. Some CBX proteins of the PRC1 complex have been implicated in transcriptional activation as well as in PRC1-independent roles in embryonic stem cells and in somatic cells.	53
349295	cd18648	CD_Cbx6	chromodomain of chromobox homolog 6. CHRomatin Organization Modifier (chromo) domain of chromobox homolog 6 (CBX6), a component of the PcG repressive complex PRC1, one of the two classes of PRCs. PcG proteins form large multiprotein complexes (PcG bodies) which are involved in the stable repression of genes involved in development, signaling or cancer via chromatin-based epigenetic modifications. Mammalian PRC1 includes canonical (cPRC1) and non-canonical complexes; cPRC1 contains four core subunits including one CBX protein (CBX2, CBX4, and CBX6-CBX8) that binds H3K27me3. CBX family members have different affinity for H3K27me3, with CBX7 having the highest binding capability. The human CBX proteins show distinct nuclear localizations and contribute differently to transcriptional repression. Some CBX proteins of the PRC1 complex have been implicated in transcriptional activation as well as in PRC1-independent roles in embryonic stem cells and in somatic cells.	58
349296	cd18649	CD_Cbx8	chromodomain of chromobox homolog 8. CHRomatin Organization Modifier (chromo) domain of chromobox homolog 8 (CBX8), a component of the PcG repressive complex PRC1, one of the two classes of PRCs. PcG proteins form large multiprotein complexes (PcG bodies) which are involved in the stable repression of genes involved in development, signaling or cancer via chromatin-based epigenetic modifications. Mammalian PRC1 includes canonical (cPRC1) and non-canonical complexes; cPRC1 contains four core subunits including one CBX protein (CBX2, CBX4, and CBX6-CBX8) that binds H3K27me3. CBX family members have different affinity for H3K27me3, with CBX7 having the highest binding capability. The human CBX proteins show distinct nuclear localizations and contribute differently to transcriptional repression. Some CBX proteins of the PRC1 complex have been implicated in transcriptional activation as well as in PRC1-independent roles in embryonic stem cells and in somatic cells. CBX proteins may act as oncogenes or tumor suppressors in a cell-type-dependent manner; CBX8, for example, promotes proliferation while suppressing metastasis in colorectal carcinoma progression.	55
349297	cd18650	CD_HP1beta_Cbx1	chromodomain of heterochromatin protein 1 homolog beta. CHRomatin Organization Modifier (chromo) domain of heterochromatin protein 1 homolog beta (also known as HP1beta, CBX1, and chromobox 1), and related proteins. HP1beta is a highly conserved non-histone protein, which is a member of the heterochromatin protein family, and is enriched in the heterochromatin and associated with centromeres. HP1 has two conserved protein-protein interaction domains, a single N-terminal chromodomain (CD) which can bind to histone proteins via methylated lysine residues, and a related C-terminal chromo shadow domain (CSD) which is responsible for the homodimerization and interaction with a number of chromatin-associated non-histone proteins; a flexible hinge region separates the CD and CSD and may bind nucleic acid. HP1 is a highly conserved non-histone chromosomal protein that is evolutionarily conserved from fission yeast to plants and animals. There are three human homologs of HP1 proteins: HP1alpha (also known as Cbx5), HP1beta, and HP1gamma (also known as Cbx3).	50
349298	cd18651	CD_HP1alpha_Cbx5	chromodomain of heterochromatin protein 1 homolog alpha. CHRomatin Organization Modifier (chromo) domain of heterochromatin protein 1 homolog alpha (also known as HP1alpha, Cbx5, and Chromobox 5), and related proteins. HP1alpha has diverse functions in heterochromatin formation, gene regulation, and mitotic progression, and forms complex networks of gene, RNA, and protein interactions. HP1 has two conserved protein-protein interaction domains, a single N-terminal chromodomain (CD) which can bind to histone proteins via methylated lysine residues, and a related C-terminal chromo shadow domain (CSD) which is responsible for the homodimerization and interaction with a number of chromatin-associated non-histone proteins; a flexible hinge region separates the CD and CSD and may bind nucleic acid.  HP1 is a highly conserved non-histone chromosomal protein that is evolutionarily conserved from fission yeast to plants and animals. There are three human homologs of HP1 proteins: HP1alpha, HP1beta (also known as Cbx1), and HP1gamma (also known as Cbx3).	50
349299	cd18652	CD_HP1gamma_Cbx3	chromodomain of heterochromatin protein 1 homolog gamma. CHRomatin Organization Modifier (chromo) domain of heterochromatin protein 1 homolog gamma (also known as HP1gamma, Cbx3, and Chromobox 3), and related proteins. HP1gamma  is a highly conserved non-histone protein, which is a member of the heterochromatin protein family, and is enriched in the heterochromatin and associated with centromeres. HP1 has two conserved protein-protein interaction domains, a single N-terminal chromodomain (CD) which can bind to histone proteins via methylated lysine residues, and a related C-terminal chromo shadow domain (CSD) which is responsible for the homodimerization and interaction with a number of chromatin-associated non-histone proteins; a flexible hinge region separates the CD and CSD and may bind nucleic acid. In addition to being involved in transcriptional silencing in heterochromatin-like complexes, HP1gamma also binds lamin B receptor, an integral membrane protein found in the inner nuclear membrane. The dual binding functions of the protein may explain the association of heterochromatin with the inner nuclear membrane. HP1gamma is also recruited to sites of ultraviolet-induced DNA damage and double-strand breaks. HP1 is a highly conserved non-histone chromosomal protein that is evolutionarily conserved from fission yeast to plants and animals. There are three human homologs of HP1 proteins: HP1alpha (also known as Cbx5), HP1beta (also known as Cbx1), and HP1gamma.	50
349300	cd18653	CD_HP1a_insect	chromodomain of insect HP1a. CHRomatin Organization Modifier (chromo) domain of insect HP1a. HP1a is a member of the heterochromatin protein family, and is enriched in the heterochromatin and associated with centromeres. HP1 has diverse functions in heterochromatin formation and impacts both gene expression and gene silencing. HP1 has two conserved protein-protein interaction domains, a single N-terminal chromodomain (CD) which can bind to histone proteins via methylated lysine residues, and a related C-terminal chromo shadow domain (CSD) which is responsible for the homodimerization and interaction with a number of chromatin-associated non-histone proteins; a flexible hinge region separates the CD and CSD and may bind nucleic acid.  HP1 is a highly conserved non-histone chromosomal protein that is evolutionarily conserved from fission yeast to plants and animals. In Drosophila, there are at least five HP1 family proteins, this subgroup includes the CD of Drosophila melanogaster HP1a.	50
349301	cd18654	CSD_HP1beta_Cbx1	chromo shadow domain of heterochromatin protein 1 homolog beta. Heterochromatin protein 1 homolog beta (also known as HP1beta, Cbx1, chromobox 1) is a highly conserved non-histone protein, which is a member of the heterochromatin protein family, and is enriched in the heterochromatin and associated with centromeres. HP1beta has a single N-terminal chromodomain (CD) which can bind to histone proteins via methylated lysine residues, and a C-terminal chromo shadow domain (CSD) which is responsible for the homodimerization and interaction with a number of chromatin-associated non-histone proteins; a flexible hinge region separates the CD and CSD and may bind nucleic acid. HP1beta may play an important role in the epigenetic control of chromatin structure and gene expression. CSD domains have only been found in proteins that also possess a related chromodomain, while chromodomains can exist in isolation. HP1 is a highly conserved non-histone chromosomal protein that is evolutionarily conserved from fission yeast to plants and animals. The HP1 CSD, in addition to interacting with various proteins bearing the PXVXL motif, also interacts with a region of histone H3 that bears the similar PXXVXL motif. There are three human homologs of HP1 proteins: HP1alpha (also known as Cbx5), HP1beta, and HP1gamma (also known as Cbx3). The CSD domains of all three human HP1 homologs have similar affinities to the PXXVXL motif of histone H3.	58
349302	cd18655	CSD_HP1alpha_Cbx5	chromo shadow domain of heterochromatin protein 1 homolog alpha. Chromo shadow domain (CSD) of heterochromatin protein 1 homolog alpha (also known as HP1alpha, Cbx5, and Chromobox 5), and related proteins. HP1alpha has diverse functions in heterochromatin formation, gene regulation, and mitotic progression, and forms complex networks of gene, RNA, and protein interactions. HP1 has two conserved protein-protein interaction domains, a single N-terminal chromodomain (CD) which can bind to histone proteins via methylated lysine residues, and a related C-terminal chromo shadow domain (CSD) which is responsible for the homodimerization and interaction with a number of chromatin-associated non-histone proteins; a flexible hinge region separates the CD and CSD and may bind nucleic acid.  CSD domains have only been found in proteins that also possess a related chromodomain, while chromodomains can exist in isolation. HP1 is a highly conserved non-histone chromosomal protein that is evolutionarily conserved from fission yeast to plants and animals. The HP1 CSD, in addition to interacting with various proteins bearing the PXVXL motif, also interacts with a region of histone H3 that bears the similar PXXVXL motif. There are three human homologs of HP1 proteins: HP1alpha, HP1beta (also known as Cbx1), and HP1gamma (also known as Cbx3). The CSD domains of all three human HP1 homologs have similar affinities to the PXXVXL motif of histone H3.	58
349303	cd18656	CSD_HP1gamma_Cbx3	chromo shadow domain of heterochromatin protein 1 homolog gamma. Chromo shadow domain (CSD) of heterochromatin protein 1 homolog gamma (also known as HP1gamma, Cbx3, Chromobox 3), and related proteins. HP1gamma appears to be involved in transcriptional silencing in heterochromatin-like complexes. It binds histone H3 tails methylated at Lys-9, leading to epigenetic repression, and also binds lamin B receptor, an integral membrane protein found in the inner nuclear membrane. The dual binding functions of the protein may explain the association of heterochromatin with the inner nuclear membrane. HP1gamma is also recruited to sites of ultraviolet-induced DNA damage and double-strand breaks. HP1 has two conserved protein-protein interaction domains, a single N-terminal chromodomain (CD) which can bind to histone proteins via methylated lysine residues, and a related C-terminal CSD which is responsible for the homodimerization and interaction with a number of chromatin-associated non-histone proteins; a flexible hinge region separates the CD and CSD and may bind nucleic acid. CSD domains have only been found in proteins that also possess a related chromodomain, while chromodomains can exist in isolation. HP1 is a highly conserved non-histone chromosomal protein that is evolutionarily conserved from fission yeast to plants and animals. The HP1 CSD, in addition to interacting with various proteins bearing the PXVXL motif, also interacts with a region of histone H3 that bears the similar PXXVXL motif. There are three human homologs of HP1 proteins: HP1alpha (also known as Cbx5), HP1beta (also known as Cbx1), and HP1gamma. The CSD domains of all three human HP1 homologs have similar affinities to the PXXVXL motif of histone H3.	58
349304	cd18657	CSD_Swi6	chromo shadow domain of chromatin-associated protein Swi6. Chromo shadow domain (CSD) of fission yeast Swi6 protein. Swi6 is a structural and functional homolog of mammalian HP1 (heterochromatin protein 1) and is involved in chromatin structure by binding to the centromere, telomere, and silent mating-type locus. Swi6 contains an N-terminal chromo (CHRomatin Organization MOdifier) domain and a C-terminal chromo shadow domain (CSD). Swi6 binds histone H3 tails methylated at Lys- and the cohesin subunit Psc3, leading to gene silencing and sister chromatid cohesion. It is also involved in the repression of the silent mating-type loci MAT2 and MAT3. Swi6 may compact MAT2/3 into a heterochromatin-like conformation which represses the transcription of these silent cassettes. The chromo shadow domain (CSD) is always found in association with a related N-terminal chromo (CHRomatin Organization MOdifier) domain. CSD domains have only been found in proteins that also possess a chromodomain, while chromodomains can exist in isolation.	55
349305	cd18658	CSD_HP1a_insect	chromo shadow domain of insect heterochromatin protein 1a. The chromo shadow domain (CSD) is always found in association with a related N-terminal chromo (CHRomatin Organization MOdifier) domain. CSD domains have only been found in proteins that also possess a chromodomain, while chromodomains can exist in isolation. HP1 has two conserved protein-protein interaction domains, a single N-terminal chromodomain (CD) which can bind to histone proteins via methylated lysine residues, and a related C-terminal CSD which is responsible for the homodimerization and interaction with a number of chromatin-associated non-histone proteins; a flexible hinge region separates the CD and CSD and may bind nucleic acid. HP1 is a highly conserved non-histone chromosomal protein that is evolutionarily conserved from fission yeast to plants and animals. The HP1 CSD, in addition to interacting with various proteins bearing the PXVXL motif, also interacts with a region of histone H3 that bears the similar PXXVXL motif. There are three human homologs of HP1 proteins: HP1alpha (also known as Cbx5), HP1beta (also known as Cbx1), and HP1gamma (also known as Cbx3). The CSD domains of all three human HP1 homologs have similar affinities to the PXXVXL motif of histone H3. In Drosophila, there are at least five HP1 family proteins; this subgroup includes the CSD of Drosophila melanogaster HP1a.	53
349306	cd18659	CD2_tandem	repeat 2 of paired tandem chromodomains. Repeat 2 of tandem CHRomatin Organization Modifier (chromo) domains, found in CHD (chromodomain helicase DNA-binding) proteins such as mammalian helicase DNA-binding proteins CHD1 to CHD9, and yeast protein CHD1. The CHD proteins belong to the SNF2 superfamily of ATP-dependent chromatin remodelers and contain two signature motifs: a pair of chromodomains located in the N-terminal region, and the SNF2-like ATPase domain located in the central region of the protein. CHD chromatin remodelers are important regulators of transcription and play critical roles during developmental processes. The N-terminal chromodomains of CHD1 have been shown to guard against sliding hexasomes. Mutations in the chromodomains of mouse CHD1 result in nuclear redistribution, suggesting that the chromodomain is essential for proper association with chromatin; also, deletion of the chromodomains in the Drosophila melanogaster CHD3-4 homolog impaired nucleosome binding, mobilization, and ATPase functions. A chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and which appears to play a role in the functional organization of the eukaryotic nucleus. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain.	54
349307	cd18660	CD1_tandem	repeat 1 of paired tandem chromodomains. Repeat 1 of tandem CHRomatin Organization Modifier (chromo) domains, found in CHD (chromodomain helicase DNA-binding) proteins such as mammalian helicase DNA-binding proteins CHD1 to CHD9, and yeast protein CHD1. The CHD proteins belong to the SNF2 superfamily of ATP-dependent chromatin remodelers and contain two signature motifs: a pair of chromodomains located in the N-terminal region, and the SNF2-like ATPase domain located in the central region of the protein. CHD chromatin remodelers are important regulators of transcription and play critical roles during developmental processes. The N-terminal chromodomains of CHD1 have been shown to guard against sliding hexasomes. Mutations in the chromodomains of mouse CHD1 result in nuclear redistribution, suggesting that the chromodomain is essential for proper association with chromatin; also, deletion of the chromodomains in the Drosophila melanogaster CHD3-4 homolog impaired nucleosome binding, mobilization, and ATPase functions. A chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and which appears to play a role in the functional organization of the eukaryotic nucleus. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain.	70
349308	cd18661	CD2_tandem_CHD1-2_like	repeat 2 of the paired tandem chromodomains of chromodomain helicase DNA-binding protein 1 and 2, and similar proteins. Repeat 2 of tandem CHRomatin Organization Modifier (chromo) domains, found in CHD (chromodomain helicase DNA-binding) proteins such as mammalian helicase DNA-binding proteins CHD1 and CHD2. The CHD proteins belong to the SNF2 superfamily of ATP-dependent chromatin remodelers and contain two signature motifs: a pair of chromodomains located in the N-terminal region, and the SNF2-like ATPase domain located in the central region of the protein. CHD chromatin remodelers are important regulators of transcription and play critical roles during developmental processes. The N-terminal chromodomains of CHD1 have been shown to guard against sliding hexasomes. Mutations in the chromodomains of mouse CHD1 result in nuclear redistribution, suggesting that the chromodomain is essential for proper association with chromatin; also, deletion of the chromodomains in the Drosophila melanogaster CHD3-4 homolog impaired nucleosome binding, mobilization, and ATPase functions. A chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and which appears to play a role in the functional organization of the eukaryotic nucleus. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain.	58
349309	cd18662	CD2_tandem_CHD3-4_like	repeat 2 of the paired tandem chromodomains of chromodomain helicase DNA-binding protein 3 and 4, and similar proteins. Repeat 2 of tandem CHRomatin Organization Modifier (chromo) domains, found in CHD (chromodomain helicase DNA-binding) proteins such as mammalian helicase DNA-binding proteins CHD3 and CHD4, and yeast protein CHD1. The CHD proteins belong to the SNF2 superfamily of ATP-dependent chromatin remodelers and contain two signature motifs: a pair of chromodomains located in the N-terminal region, and the SNF2-like ATPase domain located in the central region of the protein. CHD chromatin remodelers are important regulators of transcription and play critical roles during developmental processes. The N-terminal chromodomains of CHD1 have been shown to guard against sliding hexasomes. Mutations in the chromodomains of mouse CHD1 result in nuclear redistribution, suggesting that the chromodomain is essential for proper association with chromatin; also, deletion of the chromodomains in the Drosophila melanogaster CHD3-4 homolog impaired nucleosome binding, mobilization, and ATPase functions. Human CHD3 (also named Mi-2 alpha) and CHD4 (also named Mi-2 beta) are coexpressed in many cell lines and tissues and may act as the motor subunit of the NuRD complex (nucleosome remodeling and deacetylase activities). The proteins form distinct CHD3- and CHD4-NuRD complexes that repress, as well as activate gene transcription of overlapping and specific target genes. A chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and which appears to play a role in the functional organization of the eukaryotic nucleus. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain.	55
349310	cd18663	CD2_tandem_CHD5-9_like	repeat 2 of the paired tandem chromodomains of chromodomain helicase DNA-binding protein 5-9, and similar proteins. Repeat 2 of tandem CHRomatin Organization Modifier (chromo) domains, found in CHD (chromodomain helicase DNA-binding) proteins such as mammalian helicase DNA-binding proteins CHD5, CHD6, CHD7, CHD8, and CHD9. The CHD proteins belong to the SNF2 superfamily of ATP-dependent chromatin remodelers and contain two signature motifs: a pair of chromodomains located in the N-terminal region, and the SNF2-like ATPase domain located in the central region of the protein. CHD chromatin remodelers are important regulators of transcription and play critical roles during developmental processes. The N-terminal chromodomains of CHD1 have been shown to guard against sliding hexasomes. Mutations in the chromodomains of mouse CHD1 result in nuclear redistribution, suggesting that the chromodomain is essential for proper association with chromatin; also, deletion of the chromodomains in the Drosophila melanogaster CHD3-4 homolog impaired nucleosome binding, mobilization, and ATPase functions. CHD6, CHD7, and CHD8 enzymes have been demonstrated to have different substrate specificities and remodeling activities. A chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and which appears to play a role in the functional organization of the eukaryotic nucleus. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain.	59
349311	cd18664	CD2_tandem_ScCHD1_like	repeat 2 of the paired tandem chromodomains of yeast chromodomain helicase DNA-binding protein 1, and similar proteins. Repeat 2 of tandem CHRomatin Organization Modifier (chromo) domains, found in CHD (chromodomain helicase DNA-binding) proteins such as yeast protein CHD1. The CHD proteins belong to the SNF2 superfamily of ATP-dependent chromatin remodelers and contain two signature motifs: a pair of chromodomains located in the N-terminal region, and the SNF2-like ATPase domain located in the central region of the protein. CHD chromatin remodelers are important regulators of transcription and play critical roles during developmental processes. The N-terminal chromodomains of CHD1 have been shown to guard against sliding hexasomes. Mutations in the chromodomains of mouse CHD1 result in nuclear redistribution, suggesting that the chromodomain is essential for proper association with chromatin; also, deletion of the chromodomains in the Drosophila melanogaster CHD3-4 homolog impaired nucleosome binding, mobilization, and ATPase functions. A chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and which appears to play a role in the functional organization of the eukaryotic nucleus. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain.	59
349312	cd18665	CD1_tandem_CHD1_yeast_like	repeat 1 of the paired tandem chromodomains of yeast chromodomain helicase DNA-binding protein 1, and similar proteins. Repeat 1 of tandem CHRomatin Organization Modifier (chromo) domains, found in CHD (chromodomain helicase DNA-binding) proteins such as yeast protein CHD1. The CHD proteins belong to the SNF2 superfamily of ATP-dependent chromatin remodelers and contain two signature motifs: a pair of chromodomains located in the N-terminal region, and the SNF2-like ATPase domain located in the central region of the protein. CHD chromatin remodelers are important regulators of transcription and play critical roles during developmental processes. The N-terminal chromodomains of CHD1 have been shown to guard against sliding hexasomes. Mutations in the chromodomains of mouse CHD1 result in nuclear redistribution, suggesting that the chromodomain is essential for proper association with chromatin; also, deletion of the chromodomains in the Drosophila melanogaster CHD3-4 homolog impaired nucleosome binding, mobilization, and ATPase functions. A chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and which appears to play a role in the functional organization of the eukaryotic nucleus. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain.	65
349313	cd18666	CD1_tandem_CHD1-2_like	repeat 1 of the paired tandem chromodomains of chromodomain helicase DNA-binding protein 1 and 2, and similar proteins. Repeat 1 of tandem CHRomatin Organization Modifier (chromo) domains, found in CHD (chromodomain helicase DNA-binding) proteins such as mammalian helicase DNA-binding proteins CHD1 and CHD2. The CHD proteins belong to the SNF2 superfamily of ATP-dependent chromatin remodelers and contain two signature motifs: a pair of chromodomains located in the N-terminal region, and the SNF2-like ATPase domain located in the central region of the protein. CHD chromatin remodelers are important regulators of transcription and play critical roles during developmental processes. The N-terminal chromodomains of CHD1 have been shown to guard against sliding hexasomes. Mutations in the chromodomains of mouse CHD1 result in nuclear redistribution, suggesting that the chromodomain is essential for proper association with chromatin; also, deletion of the chromodomains in the Drosophila melanogaster CHD3-4 homolog impaired nucleosome binding, mobilization, and ATPase functions. A chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and which appears to play a role in the functional organization of the eukaryotic nucleus. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain.	85
349314	cd18667	CD1_tandem_CHD3-4_like	repeat 1 of the paired tandem chromodomains of chromodomain helicase DNA-binding protein 3 and 4, and similar proteins. Repeat 1 of tandem CHRomatin Organization Modifier (chromo) domains, found in CHD (chromodomain helicase DNA-binding) proteins such as mammalian helicase DNA-binding proteins CHD3 and CHD4. The CHD proteins belong to the SNF2 superfamily of ATP-dependent chromatin remodelers and contain two signature motifs: a pair of chromodomains located in the N-terminal region, and the SNF2-like ATPase domain located in the central region of the protein. CHD chromatin remodelers are important regulators of transcription and play critical roles during developmental processes. The N-terminal chromodomains of CHD1 have been shown to guard against sliding hexasomes. Mutations in the chromodomains of mouse CHD1 result in nuclear redistribution, suggesting that the chromodomain is essential for proper association with chromatin; also, deletion of the chromodomains in the Drosophila melanogaster CHD3-4 homolog impaired nucleosome binding, mobilization, and ATPase functions. Human CHD3 (also named Mi-2 alpha) and CHD4 (also named Mi-2 beta) are coexpressed in many cell lines and tissues and may act as the motor subunit of the NuRD complex (nucleosome remodeling and deacetylase activities). The proteins form distinct CHD3- and CHD4-NuRD complexes that repress, as well as activate gene transcription of overlapping and specific target genes. A chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and which appears to play a role in the functional organization of the eukaryotic nucleus. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain.	79
349315	cd18668	CD1_tandem_CHD5-9_like	repeat 1 of the paired tandem chromodomains of chromodomain helicase DNA-binding protein 5-9, and similar proteins. Repeat 1 of tandem CHRomatin Organization Modifier (chromo) domains, found in CHD (chromodomain helicase DNA-binding) proteins such as mammalian helicase DNA-binding proteins CHD5, CHD6, CHD7, CHD8, and CHD9. The CHD proteins belong to the SNF2 superfamily of ATP-dependent chromatin remodelers and contain two signature motifs: a pair of chromodomains located in the N-terminal region, and the SNF2-like ATPase domain located in the central region of the protein. CHD chromatin remodelers are important regulators of transcription and play critical roles during developmental processes. The N-terminal chromodomains of CHD1 have been shown to guard against sliding hexasomes. Mutations in the chromodomains of mouse CHD1 result in nuclear redistribution, suggesting that the chromodomain is essential for proper association with chromatin; also, deletion of the chromodomains in the Drosophila melanogaster CHD3-4 homolog impaired nucleosome binding, mobilization, and ATPase functions. CHD6, CHD7, and CHD8 enzymes have been demonstrated to have different substrate specificities and remodeling activities. A chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and which appears to play a role in the functional organization of the eukaryotic nucleus. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain.	68
349948	cd18669	M20_18_42	M20, M18 and M42 Zn-peptidases include aminopeptidases and carboxypeptidases. This family corresponds to the MEROPS MH clan families M18, M20, and M42. The peptidase M20 family contains exopeptidases, including carboxypeptidases such as the glutamate carboxypeptidase from Pseudomonas, the thermostable carboxypeptidase Ss1 of broad specificity from archaea and yeast Gly-X carboxypeptidase, dipeptidases such as bacterial dipeptidase, peptidase V (PepV), a eukaryotic, non-specific dipeptidase, and two Xaa-His dipeptidases (carnosinases). This family also includes the bacterial aminopeptidase peptidase T (PepT) that acts only on tripeptide substrates and has therefore been termed a tripeptidase. These peptidases generally hydrolyze the late products of protein degradation so as to complete the conversion of proteins to free amino acids. Glutamate carboxypeptidase hydrolyzes folate analogs such as methotrexate, and therefore can be used to treat methotrexate toxicity. Peptidase families M18 and M42 contain metallo-aminopeptidases. M18 (aspartyl aminopeptidase, DAP) family cleaves only unblocked N-terminal acidic amino-acid residues and is highly selective for hydrolyzing aspartate or glutamate residues. Some M42 (also known as glutamyl aminopeptidase) enzymes exhibit aminopeptidase specificity while others also have acylaminoacyl-peptidase activity (i.e. hydrolysis of acylated N-terminal residues).	198
350850	cd18670	PIN_Mut7-C-like	PIN domain at the C-terminus of Caenorhabditis elegans exonuclease Mut-7 and related domains. The Mut7-C-like family of the PIN domain superfamily includes the C-terminal domain of Caenorhabditis elegans Mut-7 (also known as exonuclease 3'-5' domain-containing protein 3 homolog). Mut-7 is involved in RNA interference (RNAi) and transposon silencing in C. elegans. The Mut7-C PIN domain family is recognized as a genuine PIN domain; however, it is not included in the CDD PIN domain superfamily hierarchical model as it lacks a core strand and helix (H3 and S3). The PIN (PilT N terminus) domain belongs to a large nuclease superfamily, and was originally named for its sequence similarity to the N-terminal domain of an annotated pili biogenesis protein, PilT, a domain fusion between a PIN-domain and a PilT ATPase domain. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Other PIN domain families are: the FEN-like PIN domain family which includes the PIN domains of Flap endonuclease-1 (FEN1), Exonuclease-1 (EXO1), Mkt1, Gap Endonuclease 1 (GEN1), and Xeroderma pigmentosum complementation group G (XPG) nuclease, 5'-3' exonucleases of DNA polymerase I and bacteriophage T4- and T5-5' nucleases; the VapC-like PIN domain family which includes toxins of prokaryotic toxin/antitoxin operons FitAB and VapBC, as well as eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease, and rRNA-processing protein Fcf1; the LabA-like PIN domain family which includes the PIN domains of Synechococcus elongatus LabA (low-amplitude and bright); the PRORP-Zc3h12a-like PIN domain family which includes the PIN domains of RNase P (PRORP) and ribonuclease Zc3h12a; and Bacillus subtilis YacP/Rae1-like PIN domains. Matelska et al. recently classified PIN-like domains into distinct groups; this family includes some sequences belonging to two of these, PIN_10 and PIN_16.	65
350238	cd18671	PIN_PRORP-Zc3h12a-like	PIN domain of protein-only RNase P (PRORP), ribonuclease Zc3h12a, and related proteins. PRORPs catalyze the maturation of the 5' end of precursor tRNAs in eukaryotes. This family includes human PRORP, also known as proteinaceous RNase P and mitochondrial RNase P protein subunit 3 (MRPP3), and Arabidopsis thaliana PRORP1-3; PRORP1 localizes to the chloroplast and the mitochondria, and PRORP2 and PRORP3 localize to the nucleus. Zc3h12a (zinc finger CCCH-type containing 12A, also known as MCPIP1/MCP induced protein 1 and Regnase-1) is a critical regulator of inflammatory response, with additional roles in defense against viruses and various stresses, cellular differentiation, and apoptosis. This PIN_PRORP-Zc3h12a-like family also includes Caenorhabditis elegans REGE-1 (REGnasE-1), which also functions as a cytoplasmic endonuclease. Additionally, it includes three less-studied mammalian homologs: Zc3h12b-d/Regnase-2-4, as well as N4BP1 (NEDD4-binding partner-1), NYNRIN (NYN domain and retroviral integrase containing, also known as CGIN1/Cousin of GIN1), and KHNYN (KH and NYN domain containing) protein. N4BP1, CGIN1, and KHNYN proteins are probably of retroviral origin. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	126
350239	cd18672	PIN_FAM120B-like	FEN-like PIN domains of FAM120B (family with sequence similarity 120B) and related proteins. FAM120B (also known as CCPG, "constitutive coactivator of PPAR-gamma", PGCC1, "PPARgamma constitutive coactivator 1") is a constitutive coactivator of peroxisome proliferator-activated receptor (PPARgamma) that promotes adipogenesis in a PPARgamma-dependent manner. This subfamily belongs to the structure-specific, 5' nuclease family (FEN-like) that catalyzes hydrolysis of DNA duplex-containing nucleic acid structures during DNA replication, repair, and recombination. Canonical members of the FEN-like family possess a PIN domain with a two-helical structure insert (also known as the helical arch, helical clamp or I domain) of variable length (approximately 16 to 800 residues), the helical arch/clamp region is involved in DNA binding. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	210
350240	cd18673	PIN_XRN1-2-like	FEN-like PIN domains of XRN1, XRN2, and related proteins. XRN1 (5'-3' exoribonuclease 1, also known as SEP1) is a processive 5'-3' exoribonuclease that degrades the body of transcripts in the major pathway of RNA decay; XRN2 (5'-3' exoribonuclease 2) is predominantly localized in the nucleus and recognizes single-stranded RNAs with a 5'-terminal monophosphate to degrade them processively to mononucleotides. XRN2 has a critical function to process maturation of 5.8S and 25S/28S rRNAs as well as degradation of some spacer fragments that are excised during rRNA maturation. Both XRN1 and XRN2 preferentially cleave 5'-monophosphorylated RNA. XRN2 is also known as Rat1p in yeast. This subfamily belongs to the structure-specific, 5' nuclease family (FEN-like) that catalyzes hydrolysis of DNA duplex-containing nucleic acid structures during DNA replication, repair, and recombination. Canonical members of the FEN-like family possess a PIN domain with a two-helical structure insert (also known as the helical arch, helical clamp or I domain) of variable length (approximately 16 to 800 residues), the helical arch/clamp region is involved in DNA binding. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	240
350241	cd18674	PIN_Pox_G5	FEN-like PIN domain of vaccinia virus G5 protein and related proteins. Poxvirus G5 nuclease is involved in DNA replication and double-strand break repair by homologous recombination. This subfamily belongs to the structure-specific, 5' nuclease family (FEN-like) that catalyzes hydrolysis of DNA duplex-containing nucleic acid structures during DNA replication, repair, and recombination. Canonical members of the FEN-like family possess a PIN domain with a two-helical structure insert (also known as the helical arch, helical clamp or I domain) of variable length (approximately 16 to 800 residues), the helical arch/clamp region is involved in DNA binding. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	151
350242	cd18675	PIN_SpAst1-like	FEN-like PIN domain of Schizosaccharomyces pombe asteroid homolog 1 and related proteins. Schizosaccharomyces pombe Ast1 is a homologue of Drosophila Asteroid and human ASTE1. Ast1 appears to be involved in mounting a checkpoint response to endogenous damage in cells lacking Rad2 and Exo1. This subfamily belongs to the structure-specific, 5' nuclease family (FEN-like) that catalyzes hydrolysis of DNA duplex-containing nucleic acid structures during DNA replication, repair, and recombination. Canonical members of the FEN-like family possess a PIN domain with a two-helical structure insert (also known as the helical arch, helical clamp or I domain) of variable length (approximately 16 to 800 residues), the helical arch/clamp region is involved in DNA binding. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	163
350243	cd18676	PIN_asteroid-like	FEN-like PIN domain of Drosophila melanogaster asteroid and related proteins. This subfamily includes Drosophila melanogaster asteroid protein which may function in EGF receptor signaling, and may play a role in compound eye morphogenesis. This subfamily belongs to the structure-specific, 5' nuclease family (FEN-like) that catalyzes hydrolysis of DNA duplex-containing nucleic acid structures during DNA replication, repair, and recombination. Canonical members of the FEN-like family possess a PIN domain with a two-helical structure insert (also known as the helical arch, helical clamp or I domain) of variable length (approximately 16 to 800 residues), the helical arch/clamp region is involved in DNA binding. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	164
350244	cd18677	PIN_MjVapC2-VapC6_like	VapC-like PIN domain of Methanocaldococcus jannaschii VapC2, and VapC6, and related proteins. This subfamily includes Methanocaldococcus jannaschii VapC2 and VapC6. It belongs to the VapC (virulence-associated protein C)-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	136
350245	cd18678	PIN_MtVapC25_VapC33-like	VapC-like PIN domain of Mycobacterium tuberculosis VapC25, VapC33, and related proteins. This subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC25, VapC29, VapC33, VapC37, and VapC39 toxins. It belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is a PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	140
350246	cd18679	PIN_VapC-Af1683-like	VapC-like PIN domain of VapC ribonuclease similar to Archaeoglobus fulgidus uncharacterized Af1683 protein. Uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	126
350247	cd18680	PIN_MtVapC20-like	VapC-like PIN domain of Mycobacterium tuberculosis VapC20 and related proteins. M. tuberculosis VapC20 inhibits translation by site-specific cleavage of the universally conserved Sarcin-Ricin loop in 23S rRNA. This subfamily belongs to the VapC (virulence-associated protein C)-like nuclease family of the PIN domain-like superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1.	131
350248	cd18681	PIN_MtVapC27-VapC40_like	VapC-like PIN domain of Mycobacterium tuberculosis VapC27, and VapC40, and related proteins. This subfamily includes Mycobacterium tuberculosis VapC27 and VapC40. It belongs to the VapC (virulence-associated protein C)-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	129
350249	cd18682	PIN_VapC-like	Uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	118
350250	cd18683	PIN_VapC-like	Uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	125
350251	cd18684	PIN_VapC-like	uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	131
350252	cd18685	PIN_VapC-like	uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains and included distant subgroups; this subgroup includes some sequences belonging to one of these, PIN_14.	110
350253	cd18686	PIN_VapC-like	uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	119
350254	cd18687	PIN_VapC-like	uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_3.	118
350255	cd18688	PIN_VapC-like	uncharacterized subfamily of the VapC-like nuclease family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	134
350256	cd18689	PIN_VapC-like	uncharacterized subfamily of the VapC-like nuclease family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	125
350257	cd18690	PIN_VapC-like	uncharacterized subfamily of the VapC-like nuclease family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_12.	134
350258	cd18691	PIN_VapC-like	uncharacterized subfamily of the VapC-like nuclease family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	130
350259	cd18692	PIN_VapC-like	uncharacterized subfamily of the VapC-like nuclease family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	129
350260	cd18693	PIN_VapC-like	uncharacterized subfamily of the VapC-like nuclease family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_22.	129
350261	cd18694	PIN_VapC-like	uncharacterized subfamily of the VapC-like nuclease family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_24.	133
350262	cd18695	PIN_VapC-like	uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_13.	118
350263	cd18696	PIN_MtVapC26-like	VapC-like PIN domain of Mycobacterium tuberculosis VapC26 and related proteins. Mycobacterium tuberculosis VapC26 cleaves 23S rRNA in the Sarcin-Ricin Loop; it is inhibited by the cognate VapB26 antitoxin. This subfamily belongs to the VapC (virulence-associated protein C)-like nuclease family of the PIN domain-like superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	132
350264	cd18697	PIN_VapC_N-like	VapC-like N-terminal PIN (DUF4935) domain of DUF4935 domain-containing proteins and related proteins. This subgroup includes the N-terminal PIN domain of DUF4935 domain-containing proteins, and is an uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	183
350265	cd18698	PIN_VapC_C-like	VapC-like C-terminus of DUF1308 domain in DUF1308 domain-containing proteins and related proteins. This subfamily includes the C-terminus of DUF1308 domain in DUF1308 domain-containing proteins, and is an uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	138
350266	cd18699	PIN_VapC_like	uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_15.	129
350267	cd18700	PIN_GNAT-like	VapC-like PIN domain of uncharacterized GNAT family proteins. This subfamily includes uncharacterized GNAT family proteins having an N-terminal GNAT family N-acetyltransferase domain. This subgroup belongs to the VapC (virulence-associated protein C)-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_17.	137
350268	cd18701	PIN_VapC_like	uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_18.	144
350269	cd18702	PIN_VapC_like	uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_19.	139
350270	cd18703	PIN_VapC-like	uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_20.	148
350271	cd18704	PIN_VapC-like	uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_21.	145
350272	cd18705	PIN_VapC-like	uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_23.	130
350273	cd18706	PIN_STKc_like	uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain superfamily; includes the PIN domains of uncharacterized serine/threonine kinases. This subfamily includes the PIN domains of some uncharacterized proteins having serine/threonine kinase catalytic domains and annotated as serine/threonine kinases. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_25.	126
350274	cd18707	PIN_VapC-like	uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_26.	131
350275	cd18708	PIN_VapC-like	uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_28.	116
350276	cd18709	PIN_VapC-like	uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_4-1.	196
350277	cd18710	PIN_VapC-like	uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_4-2.	130
350278	cd18711	PIN_VapC-like_DUF411	VapC-like PIN (DUF411) domain in DUF411 domain-containing proteins and related proteins. This subfamily includes the DUF411 PIN domain in proteins annotated as DUF411 domain-containing proteins. It is an uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	146
350279	cd18712	PIN_VapC-like_DUF411	VapC-like PIN (DUF411) domain in DUF411 domain-containing proteins and related proteins. This subfamily includes the DUF411 PIN domain in proteins annotated as DUF411 domain-containing proteins. It is an uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	148
350280	cd18713	PIN_VapC-like	uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_27.	140
350281	cd18714	PIN_VapC-like	uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_8.	228
350282	cd18715	PIN_VapC-like	uncharacterized subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The PIN domain belongs to a large nuclease superfamily. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	125
350283	cd18716	PIN_SSO1118-like	VapC-like PIN domain of Sulfolobus solfataricus SSO1118 and related proteins. This subfamily includes the functionally uncharacterized protein SSO1118 from the hyperthermophilic archaeon Sulfolobus solfataricus P2. This subfamily belongs to the VapC (virulence-associated protein C)-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	102
350284	cd18717	PIN_ScNmd4p-like	VapC-like PIN domain of Saccharomyces cerevisiae Nmd4p and related proteins. Saccharomyces cerevisiae Nmd4p may be involved in nonsense-mediated mRNA decay. This subfamily belongs to the VapC (virulence-associated protein C)-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	150
350285	cd18718	PIN_PRORP	PIN domain of protein-only RNase P (PRORP) and related proteins. PRORPs catalyze the maturation of the 5' end of precursor tRNAs in eukaryotes. This family includes human PRORP, also known as proteinaceous RNase P and mitochondrial RNase P protein subunit 3 (MRPP3), and Arabidopsis thaliana PRORP1-3; PRORP1 localizes to the chloroplast and the mitochondria, and PRORP2 and PRORP3 localize to the nucleus. This subfamily belongs to the PRORP-Zc3h12a-like PIN family which in addition includes Zc3h12a (also known as MCPIP1/MCP induced protein 1 and Regnase-1), Caenorhabditis elegans REGE-1 (REGnasE-1), Zc3h12b-d (also known as Regnase-2-4), N4BP1, and NYNRIN (also known as CGIN1). The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	124
350286	cd18719	PIN_Zc3h12a-N4BP1-like	PRORP-like PIN domain of ribonuclease Zc3h12a, NEDD4-binding partner-1, and related proteins. Zc3h12a (zinc finger CCCH-type containing 12A, also known as MCPIP1/MCP induced protein 1 and Regnase-1) is a critical regulator of inflammatory response, with additional roles in defense against viruses and various stresses, cellular differentiation, and apoptosis. This subfamily also includes Caenorhabditis elegans REGE-1 (REGnasE-1), which also functions as a cytoplasmic endonuclease. Additionally, it includes three less-studied mammalian homologs: Zc3h12b-d/Regnase-2-4, as well as N4BP1 (NEDD4-binding partner-1), NYNRIN (NYN domain and retroviral integrase containing, also known as CGIN1/Cousin of GIN1), and KHNYN (KH and NYN domain containing) protein. N4BP1, CGIN1, and KHNYN proteins are probably of retroviral origin. This subfamily belongs to the PRORP-Zc3h12a-like PIN family which in addition includes human PRORP, also known as proteinaceous RNase P and mitochondrial RNase P protein subunit 3 (MRPP3), and Arabidopsis thaliana PRORP1-3. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	127
350287	cd18720	PIN_YqxD-like	LabA-like PIN domain of uncharacterized Bacillus subtilis YqxD and related proteins. This subfamily includes the PIN domain of uncharacterized Bacillus subtilis YqxD (also known as YqfM) and Escherichia coli YaiI. In Firmicutes such as Bacillus and Listeria, YqxD proteins are encoded within RNA polymerase major sigma43 operons, whereas E. coli YaiI is transcribed as a monocistron. This subfamily belongs to LabA-like PIN domain family which includes Synechococcus elongatus PCC 7942 LabA, human ZNF451, uncharacterized Bacillus subtilis YqxD and Escherichia coli YaiI, and the N-terminal domain of a well-conserved group of mainly bacterial proteins with no defined function, which contain a C-terminal LabA_like_C domain. Curiously, a gene labeled NicB from Pseudomonas putida S16, which is described as a putative NADH-dependent hydroxylase involved in the microbial degradation of nicotine also falls into the LabA-like PIN family. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	96
350288	cd18721	PIN_ZNF451-like	LabA-like PIN domain of human zinc finger protein 451 and related proteins. Human ZNF451 (also known as COASTER) functions as a transcriptional cofactor in promyelocytic leukemia bodies in the nucleus, it acts as a coactivator or corepressor, depending on the factors with which it interacts. ZNF451 interacts with p300 by the PIN-like domain and down regulates TGF-beta signaling in a p300-dependent and sumoylation-independent manner. This subfamily belongs to LabA-like PIN domain family which includes Synechococcus elongatus PCC 7942 LabA, human ZNF451, uncharacterized Bacillus subtilis YqxD and Escherichia coli YaiI, and the N-terminal domain of a well-conserved group of mainly bacterial proteins with no defined function, which contain a C-terminal LabA_like_C domain. Curiously, a gene labeled NicB from Pseudomonas putida S16, which is described as a putative NADH-dependent hydroxylase involved in the microbial degradation of nicotine also falls into the LabA-like PIN family. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_11.	117
350289	cd18722	PIN_NicB-like	LabA-like PIN domain of Pseudomonas putida S16 NicB and related proteins. Curiously NicB from Pseudomonas putida S16 is described as a putative NADH-dependent hydroxylase involved in the microbial degradation of nicotine. This subfamily also includes the uncharacterized CPP15 (plasmid) protein from Campylobacter jejuni. This subfamily belongs to LabA-like PIN domain family which includes Synechococcus elongatus PCC 7942 LabA, human ZNF451, uncharacterized Bacillus subtilis YqxD and Escherichia coli YaiI, and the N-terminal domain of a well-conserved group of mainly bacterial proteins with no defined function, which contain a C-terminal LabA_like_C domain. Curiously, a gene labeled NicB from Pseudomonas putida S16, which is described as a putative NADH-dependent hydroxylase involved in the microbial degradation of nicotine also falls into the LabA-like PIN family. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	117
350290	cd18723	PIN_LabA-like	uncharacterized subfamily of the LabA-like PIN domain of Synechococcus elongatus LabA (low-amplitude and bright) and related proteins. The LabA-like PIN domain family includes Synechococcus elongatus PCC 7942 LabA which participates in cyanobacterial circadian timing, it is required for negative feedback regulation of the autokinase/autophosphatase KaiC, a central component of the circadian clock system, and appears to be necessary for KaiC-dependent repression of gene expression. It also includes the N-terminal domain of limkain b1, a human autoantigen localized to a subset of ABCD3 and PXF marked peroxisomes, human ZNF451, uncharacterized Bacillus subtilis YqxD, uncharacterized Escherichia coli YaiI, and the N-terminal domain of a well-conserved group of mainly bacterial proteins with no defined function, which contain a C-terminal LabA_like_C domain. Curiously Pseudomonas putida S16 NicB, which is described as a putative NADH-dependent hydroxylase involved in the microbial degradation of nicotine also falls into this family. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_7.	110
350291	cd18724	PIN_LabA-like	uncharacterized subfamily of the LabA-like PIN domain of Synechococcus elongatus LabA (low-amplitude and bright) and related proteins. The LabA-like PIN domain family includes Synechococcus elongatus PCC 7942 LabA which participates in cyanobacterial circadian timing, it is required for negative feedback regulation of the autokinase/autophosphatase KaiC, a central component of the circadian clock system, and appears to be necessary for KaiC-dependent repression of gene expression. It also includes the N-terminal domain of limkain b1, a human autoantigen localized to a subset of ABCD3 and PXF marked peroxisomes, human ZNF451, uncharacterized Bacillus subtilis YqxD, uncharacterized Escherichia coli YaiI, and the N-terminal domain of a well-conserved group of mainly bacterial proteins with no defined function, which contain a C-terminal LabA_like_C domain. Curiously Pseudomonas putida S16 NicB, which is described as a putative NADH-dependent hydroxylase involved in the microbial degradation of nicotine also falls into this family. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	172
350292	cd18725	PIN_LabA-like	uncharacterized subfamily of the LabA-like PIN domain of Synechococcus elongatus LabA (low-amplitude and bright) and related proteins. The LabA-like PIN domain family includes Synechococcus elongatus PCC 7942 LabA which participates in cyanobacterial circadian timing, it is required for negative feedback regulation of the autokinase/autophosphatase KaiC, a central component of the circadian clock system, and appears to be necessary for KaiC-dependent repression of gene expression. It also includes the N-terminal domain of limkain b1, a human autoantigen localized to a subset of ABCD3 and PXF marked peroxisomes, human ZNF451, uncharacterized Bacillus subtilis YqxD, uncharacterized Escherichia coli YaiI, and the N-terminal domain of a well-conserved group of mainly bacterial proteins with no defined function, which contain a C-terminal LabA_like_C domain. Curiously Pseudomonas putida S16 NicB, which is described as a putative NADH-dependent hydroxylase involved in the microbial degradation of nicotine also falls into this family. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	131
350293	cd18726	PIN_LabA-like	uncharacterized subfamily of the LabA-like PIN domain of Synechococcus elongatus LabA (low-amplitude and bright) and related proteins. The LabA-like PIN domain family includes Synechococcus elongatus PCC 7942 LabA which participates in cyanobacterial circadian timing, it is required for negative feedback regulation of the autokinase/autophosphatase KaiC, a central component of the circadian clock system, and appears to be necessary for KaiC-dependent repression of gene expression. It also includes the N-terminal domain of limkain b1, a human autoantigen localized to a subset of ABCD3 and PXF marked peroxisomes, human ZNF451, uncharacterized Bacillus subtilis YqxD, uncharacterized Escherichia coli YaiI, and the N-terminal domain of a well-conserved group of mainly bacterial proteins with no defined function, which contain a C-terminal LabA_like_C domain. Curiously Pseudomonas putida S16 NicB, which is described as a putative NADH-dependent hydroxylase involved in the microbial degradation of nicotine also falls into this family. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	113
350294	cd18727	PIN_Swt1-like	VapC-like PIN domain of Saccharomyces cerevisiae Swt1p, human SWT1 and related proteins. Saccharomyces cerevisiae mRNA-processing endoribonuclease Swt1p plays an important role in quality control of nuclear mRNPs in eukaryotes. Human transcriptional protein SWT1 (RNA endoribonuclease homolog, also known as HsSwt1, C1orf26, and chromosome 1 open reading frame 26) is an RNA endonuclease that participates in quality control of nuclear mRNPs and can associate with the nuclear pore complex (NPC). This subfamily belongs to the Smg5 and Smg6-like PIN domain family. Smg5 and Smg6 are essential factors in NMD, a post-transcriptional regulatory pathway that recognizes and rapidly degrades mRNAs containing premature translation termination codons. In vivo, the Smg6 PIN domain elicits degradation of bound mRNAs, as well as, metal-ion dependent, degradation of single-stranded RNA, in vitro. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases (also known as Flap endonuclease-1-like), PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Point mutation studies of the conserved aspartate residues in the catalytic center of the Smg6 PIN domain revealed that Smg6 is the endonuclease involved in human NMD. However, Smg5 lacks several of these key catalytic residues and does not degrade single-stranded RNA, in vivo.	141
350295	cd18728	PIN_N4BP1-like	PRORP-like PIN domain of NEDD4 binding protein 1 and related proteins. NEDD4-binding partner-1 (N4BP1) interacts with and is a substrate of NEDD4 ubiquitin ligase (neural precursor cell expressed, developmentally down-regulated 4, E3 ubiquitin protein ligase). It is also an inhibitor of the E3 ubiquitin-protein ligase ITCH, a NEDD4 structurally related E3. This subfamily additionally includes NYNRIN (NYN domain and retroviral integrase containing, also known as CGIN1/Cousin of GIN1), and KHNYN (KH and NYN domain containing) protein. N4BP1, CGIN1, and KHNYN proteins are probably of retroviral origin. This subfamily belongs to the Zc3h12a-N4BP1-like PIN subfamily of the PRORP-Zc3h12a-like PIN family, the latter of which additionally includes human PRORP, also known as proteinaceous RNase P and mitochondrial RNase P protein subunit 3 (MRPP3), and Arabidopsis thaliana PRORP1-3. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	127
350296	cd18729	PIN_Zc3h12-like	PRORP-like PIN domain of ribonuclease Zc3h12a and related proteins. Zc3h12a (zinc finger CCCH-type containing 12A, also known as MCPIP1/MCP induced protein 1 and Regnase-1) is a critical regulator of inflammatory response, with additional roles in defense against viruses and various stresses, cellular differentiation, and apoptosis. This subfamily also includes three less-studied mammalian homologs: Zc3h12b-d/Regnase-2-4. It belongs to the Zc3h12a-N4BP1-like PIN subfamily of the PRORP-Zc3h12a-like PIN family, the latter of which additionally includes human PRORP, also known as proteinaceous RNase P and mitochondrial RNase P protein subunit 3 (MRPP3), and Arabidopsis thaliana PRORP1-3. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	131
350297	cd18730	PIN_PH0500-like	VapC-like PIN-domain of Pyrococcus horikoshii protein PH0500 and related proteins. This subfamily includes Pyrococcus horikoshii protein PH0500, a protein with possible exonuclease activity and involvement in DNA or RNA editing. This subfamily belongs to the VapC (virulence-associated protein C)-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	126
350298	cd18731	PIN_NgFitB-like	VapC-like PIN domain of Neisseria gonorrhoeae FitB and related proteins. This subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Neisseria gonorrhoeae FitB toxin of the FitAB toxin/antitoxin (TA) system. This subfamily belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. In N. gonorrhoeae, FitA and FitB form a heterodimer: FitA is the DNA binding subunit and FitB contains a ribonuclease activity that is blocked by the presence of FitA. A tetramer of FitAB heterodimers binds DNA from the fitAB upstream promoter region with high affinity. This results in both sequestration of FitAB and repression of fitAB transcription. It is thought that FitAB release from the DNA and subsequent dissociation both slows N. gonorrhoeae replication and transcytosis by an as yet undefined mechanism. The toxin M. tuberculosis VapC is a structural homolog of N. gonorrhoeae FitB, but their antitoxin partners, VapB and FitA, respectively, differ structurally. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	136
350299	cd18732	PIN_MtVapC4-C5_like	VapC-like PIN domain of Mycobacterium tuberculosis VapC4, VapC5, and related proteins. This subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC4 and VapC5 toxin of the VapBC toxin/antitoxin (TA) system. This family belongs to the PIN_VapC4-5_FitB-like subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. M. tuberculosis VapC4 interacts with, and cleaves tRNA44Cys-GCA. M. tuberculosis VapC5 has endonucleolytic activity with RNA, this activity is low with dsRNA, and no activity has been demonstrated on dsDNA. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	124
350300	cd18733	PIN_RfVapC1-like	VapC-like PIN domain of Rickettsia felis VapC1 and related proteins. This subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Rickettsia felis VapC1, a ribonuclease toxin of the VapBC toxin/antitoxin (TA) system. This subfamily belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC TA systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	131
350301	cd18734	PIN_RfVapC2-like	VapC-like PIN domain of Rickettsia felis VapC2 and related proteins. This subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Rickettsia felis VapC2, a ribonuclease toxin of the VapBC toxin/antitoxin (TA) system. Rickettsia felis VapC2 cleaves single-stranded RNA. This subfamily belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC TA systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	129
350302	cd18735	PIN_HiVapC1-like	VapC-like PIN domain of Haemophilus influenzae VapC1 and related proteins. Haemophilus influenzae VapC1 has endonucleolytic activity with RNA, it cleaves initiator tRNA between the anticodon stem and loop, but does not cleave mRNA, rRNA or tmRNA, and has no activity on ssDNA or dsDNA. This subfamily belongs to the PIN_VapC4-5_FitB-like subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC TA systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. FitB is a toxin of the FitAB TA system. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	128
350303	cd18736	PIN_CcVapC1-like	VapC-like PIN domain of Caulobacter crescentus VapC1-like and related proteins. This subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Caulobacter crescentus VapC1, a ribonuclease toxin of the VapBC toxin/antitoxin (TA) system. This subfamily belongs to the PIN_VapC4-5_FitB-like subfamily of the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC TA systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. FitB is a toxin of the FitAB TA system. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	123
350304	cd18737	PIN_VapC4-5_FitB-like	uncharacterized subgroup of the PIN_VapC4-5_FitB-like subfamily of the PIN domain superfamily. The PIN_VapC4-5_FitB-like subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC4 and VapC5 ribonuclease toxins of the VapBC toxin/antitoxin (TA) system, and Neisseria gonorrhoeae FitB toxin of the FitAB TA system. This subfamily belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	115
350305	cd18738	PIN_VapC4-5_FitB-like	uncharacterized subgroup of the PIN_VapC4-5_FitB-like subfamily of the PIN domain superfamily. The PIN_VapC4-5_FitB-like subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC4 and VapC5 ribonuclease toxins of the VapBC toxin/antitoxin (TA) system, and Neisseria gonorrhoeae FitB toxin of the FitAB TA system. This subfamily belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	118
350306	cd18739	PIN_VapC4-5_FitB-like	uncharacterized subgroup of the PIN_VapC4-5_FitB-like subfamily of the PIN domain superfamily. The PIN_VapC4-5_FitB-like subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC4 and VapC5 ribonuclease toxins of the VapBC toxin/antitoxin (TA) system, and Neisseria gonorrhoeae FitB toxin of the FitAB TA system. This subfamily belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	124
350307	cd18740	PIN_VapC4-5_FitB-like	uncharacterized subgroup of the PIN_VapC4-5_FitB-like subfamily of the PIN domain superfamily. The PIN_VapC4-5_FitB-like subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC4 and VapC5 ribonuclease toxins of the VapBC toxin/antitoxin (TA) system, and Neisseria gonorrhoeae FitB toxin of the FitAB TA system. This subfamily belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	126
350308	cd18741	PIN_VapC4-5_FitB-like	uncharacterized subgroup of the PIN_VapC4-5_FitB-like subfamily of the PIN domain superfamily. The PIN_VapC4-5_FitB-like subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC4 and VapC5 ribonuclease toxins of the VapBC toxin/antitoxin (TA) system, and Neisseria gonorrhoeae FitB toxin of the FitAB TA system. This subfamily belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	120
350309	cd18742	PIN_VapC4-5_FitB-like	uncharacterized subgroup of the PIN_VapC4-5_FitB-like subfamily of the PIN domain superfamily. The PIN_VapC4-5_FitB-like subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC4 and VapC5 ribonuclease toxins of the VapBC toxin/antitoxin (TA) system, and Neisseria gonorrhoeae FitB toxin of the FitAB TA system. This subfamily belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	127
350310	cd18743	PIN_VapC4-5_FitB-like	uncharacterized subgroup of the PIN_VapC4-5_FitB-like subfamily of the PIN domain superfamily. The PIN_VapC4-5_FitB-like subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC4 and VapC5 ribonuclease toxins of the VapBC toxin/antitoxin (TA) system, and Neisseria gonorrhoeae FitB toxin of the FitAB TA system. This subfamily belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	127
350311	cd18744	PIN_VapC4-5_FitB-like	uncharacterized subgroup of the PIN_VapC4-5_FitB-like subfamily of the PIN domain superfamily. The PIN_VapC4-5_FitB-like subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC4 and VapC5 ribonuclease toxins of the VapBC toxin/antitoxin (TA) system, and Neisseria gonorrhoeae FitB toxin of the FitAB TA system. This subfamily belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	128
350312	cd18745	PIN_VapC4-5_FitB-like	uncharacterized subgroup of the PIN_VapC4-5_FitB-like subfamily of the PIN domain superfamily. The PIN_VapC4-5_FitB-like subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC4 and VapC5 ribonuclease toxins of the VapBC toxin/antitoxin (TA) system, and Neisseria gonorrhoeae FitB toxin of the FitAB TA system. This subfamily belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	130
350313	cd18746	PIN_VapC4-5_FitB-like	uncharacterized subgroup of the PIN_VapC4-5_FitB-like subfamily of the PIN domain superfamily. The PIN_VapC4-5_FitB-like subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC4 and VapC5 ribonuclease toxins of the VapBC toxin/antitoxin (TA) system, and Neisseria gonorrhoeae FitB toxin of the FitAB TA system. This subfamily belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	133
350314	cd18747	PIN_VapC4-5_FitB-like	uncharacterized subgroup of the PIN_VapC4-5_FitB-like subfamily of the PIN domain superfamily. The PIN_VapC4-5_FitB-like subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC4 and VapC5 ribonuclease toxins of the VapBC toxin/antitoxin (TA) system, and Neisseria gonorrhoeae FitB toxin of the FitAB TA system. This subfamily belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	132
350315	cd18748	PIN_VapC4-5_FitB-like	uncharacterized subgroup of the PIN_VapC4-5_FitB-like subfamily of the PIN domain superfamily. The PIN_VapC4-5_FitB-like subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC4 and VapC5 ribonuclease toxins of the VapBC toxin/antitoxin (TA) system, and Neisseria gonorrhoeae FitB toxin of the FitAB TA system. This subfamily belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	128
350316	cd18749	PIN_VapC4-5_FitB-like	uncharacterized subgroup of the PIN_VapC4-5_FitB-like subfamily of the PIN domain superfamily. The PIN_VapC4-5_FitB-like subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC4 and VapC5 ribonuclease toxins of the VapBC toxin/antitoxin (TA) system, and Neisseria gonorrhoeae FitB toxin of the FitAB TA system. This subfamily belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	126
350317	cd18750	PIN_VapC4-5_FitB-like	uncharacterized subgroup of the PIN_VapC4-5_FitB-like subfamily of the PIN domain superfamily. The PIN_VapC4-5_FitB-like subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC4 and VapC5 ribonuclease toxins of the VapBC toxin/antitoxin (TA) system, and Neisseria gonorrhoeae FitB toxin of the FitAB TA system. This subfamily belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	127
350318	cd18751	PIN_VapC4-5_FitB-like	uncharacterized subgroup of the PIN_VapC4-5_FitB-like subfamily of the PIN domain superfamily. The PIN_VapC4-5_FitB-like subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC4 and VapC5 ribonuclease toxins of the VapBC toxin/antitoxin (TA) system, and Neisseria gonorrhoeae FitB toxin of the FitAB TA system. This subfamily belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	133
350319	cd18752	PIN_VapC4-5_FitB-like	uncharacterized subgroup of the PIN_VapC4-5_FitB-like subfamily of the PIN domain superfamily. The PIN_VapC4-5_FitB-like subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC4 and VapC5 ribonuclease toxins of the VapBC toxin/antitoxin (TA) system, and Neisseria gonorrhoeae FitB toxin of the FitAB TA system. This subfamily belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	130
350320	cd18753	PIN_VapC4-5_FitB-like	uncharacterized subgroup of the PIN_VapC4-5_FitB-like subfamily of the PIN domain superfamily. The PIN_VapC4-5_FitB-like subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC4 and VapC5 ribonuclease toxins of the VapBC toxin/antitoxin (TA) system, and Neisseria gonorrhoeae FitB toxin of the FitAB TA system. This subfamily belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	123
350321	cd18754	PIN_VapC4-5_FitB-like	uncharacterized subgroup of the PIN_VapC4-5_FitB-like subfamily of the PIN domain superfamily. The PIN_VapC4-5_FitB-like subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC4 and VapC5 ribonuclease toxins of the VapBC toxin/antitoxin (TA) system, and Neisseria gonorrhoeae FitB toxin of the FitAB TA system. This subfamily belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions, in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	128
350322	cd18755	PIN_MtVapC3_VapC21-like	VapC-like PIN domain of Mycobacterium tuberculosis VapC3, VapC21 and related proteins. This subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC3 and VapC21 toxins. It belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is a PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	127
350323	cd18756	PIN_MtVapC15-VapC11-like	VapC-like PIN domain of Mycobacterium tuberculosis VapC11, VapC15, and related proteins. This subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC11 and VapC15 toxins. M. tuberculosis VapC11 and VapC15 cleave tRNA3Leu-CAG; VapC11 may additionally cleave tRNA13Leu-GAG and tRNA10Gln-CTG. This subgroup belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is a PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	129
350324	cd18757	PIN_MtVapC3-like	uncharacterized subgroup of the VapC3-like nuclease subfamily of the PIN domain superfamily. The VapC3-like nuclease subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of various Mycobacterium tuberculosis VapC toxins including VapC3, VapC11, VapC15, VapC21, VapC25, VapC28, VapC29, VapC30, VapC32, VapC33, VapC37, and VapC39. It belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is a PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	129
350325	cd18758	PIN_MtVapC3-like	uncharacterized subgroup of the VapC3-like nuclease subfamily of the PIN domain superfamily. The VapC3-like nuclease subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of various Mycobacterium tuberculosis VapC toxins including VapC3, VapC11, VapC15, VapC21, VapC25, VapC28, VapC29, VapC30, VapC32, VapC33, VapC37, and VapC39. It belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is a PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	127
350326	cd18759	PIN_MtVapC3-like	uncharacterized subgroup of the VapC3-like nuclease subfamily of the PIN domain superfamily. The VapC3-like nuclease subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of various Mycobacterium tuberculosis VapC toxins including VapC3, VapC11, VapC15, VapC21, VapC25, VapC28, VapC29, VapC30, VapC32, VapC33, VapC37, and VapC39. It belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is a PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	126
350327	cd18760	PIN_MtVapC3-like	uncharacterized subgroup of the VapC3-like nuclease subfamily of the PIN domain superfamily. The VapC3-like nuclease subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of various Mycobacterium tuberculosis VapC toxins including VapC3, VapC11, VapC15, VapC21, VapC25, VapC28, VapC29, VapC30, VapC32, VapC33, VapC37, and VapC39. It belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is a PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	126
350328	cd18761	PIN_MtVapC3-like	uncharacterized subgroup of the VapC3-like nuclease subfamily of the PIN domain superfamily. The VapC3-like nuclease subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of various Mycobacterium tuberculosis VapC toxins including VapC3, VapC11, VapC15, VapC21, VapC25, VapC28, VapC29, VapC30, VapC32, VapC33, VapC37, and VapC39. It belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is a PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	127
350329	cd18762	PIN_MtVapC3-like	uncharacterized subgroup of the VapC3-like nuclease subfamily of the PIN domain superfamily. The VapC3-like nuclease subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of various Mycobacterium tuberculosis VapC toxins including VapC3, VapC11, VapC15, VapC21, VapC25, VapC28, VapC29, VapC30, VapC32, VapC33, VapC37, and VapC39. It belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is a PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	130
350330	cd18763	PIN_MtVapC3-like	uncharacterized subgroup of the VapC3-like nuclease subfamily of the PIN domain superfamily. The VapC3-like nuclease subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of various Mycobacterium tuberculosis VapC toxins including VapC3, VapC11, VapC15, VapC21, VapC25, VapC28, VapC29, VapC30, VapC32, VapC33, VapC37, and VapC39. It belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is a PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	128
350331	cd18764	PIN_MtVapC3-like	uncharacterized subgroup of the VapC3-like nuclease subfamily of the PIN domain superfamily. The VapC3-like nuclease subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of various Mycobacterium tuberculosis VapC toxins including VapC3, VapC11, VapC15, VapC21, VapC25, VapC28, VapC29, VapC30, VapC32, VapC33, VapC37, and VapC39. It belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is a PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	129
350332	cd18765	PIN_MtVapC3-like	uncharacterized subgroup of the VapC3-like nuclease subfamily of the PIN domain superfamily. The VapC3-like nuclease subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of various Mycobacterium tuberculosis VapC toxins including VapC3, VapC11, VapC15, VapC21, VapC25, VapC28, VapC29, VapC30, VapC32, VapC33, VapC37, and VapC39. It belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is a PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	131
350333	cd18766	PIN_MtVapC3-like	uncharacterized subgroup of the VapC3-like nuclease subfamily of the PIN domain superfamily. The VapC3-like nuclease subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of various Mycobacterium tuberculosis VapC toxins including VapC3, VapC11, VapC15, VapC21, VapC25, VapC28, VapC29, VapC30, VapC32, VapC33, VapC37, and VapC39. It belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is a PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	130
350334	cd18767	PIN_MtVapC3-like	uncharacterized subgroup of the VapC3-like nuclease subfamily of the PIN domain superfamily. The VapC3-like nuclease subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of various Mycobacterium tuberculosis VapC toxins including VapC3, VapC11, VapC15, VapC21, VapC25, VapC28, VapC29, VapC30, VapC32, VapC33, VapC37, and VapC39. It belongs to the VapC-like family of the PIN domain nuclease superfamily. VapC is a PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. Other members of the VapC-like nuclease family include FitB toxin of the FitAB TA system, eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease and rRNA-processing protein Fcf1. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	126
350335	cd18768	PIN_MtVapC4-C5-like	VapC-like PIN domain of Mycobacterium tuberculosis VapC4, VapC5, and related proteins. This subfamily includes the Virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of Mycobacterium tuberculosis VapC4 and VapC5 toxins of the VapBC toxin/antitoxin (TA) system. This family belongs to the PIN_VapC4-5_FitB-like subfamily of the VapC (virulence-associated protein C)-like family of the PIN domain nuclease superfamily. VapC is the PIN-domain ribonuclease toxin from prokaryotic VapBC toxin-antitoxin (TA) systems. VapB is a transcription factor-like protein antitoxin acting as an inhibitor. M. tuberculosis VapC4 interacts with and cleaves tRNA44Cys-GCA. M. tuberculosis VapC5 has endonucleolytic activity with RNA; this activity is low with dsRNA, and no activity has been demonstrated on dsDNA. The structural properties of the PIN (PilT N terminus) domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found. Some members of the superfamily lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.	123
350851	cd18769	PIN_Mut7-C-like	uncharacterized subgroup of the Mut7-C-like family of the PIN domain superfamily. The Mut7-C-like family of the PIN domain superfamily includes the C-terminal domain of Caenorhabditis elegans Mut-7 (also known as exonuclease 3'-5' domain-containing protein 3 homolog). Mut-7 is involved in RNA interference (RNAi) and transposon silencing in C. elegans. The Mut7-C PIN domain family is recognized as a genuine PIN domain; however, it is not included in the CDD PIN domain superfamily hierarchical model as it lacks a core strand and helix (H3 and S3). PIN (PilT N terminus) domains belong to a large nuclease superfamily and were originally named for their sequence similarity to the N-terminal domain of an annotated pili biogenesis protein, PilT, a domain fusion between a PIN-domain and a PilT ATPase domain. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Other PIN domain families are: the FEN-like PIN domain family which includes the PIN domains of Flap endonuclease-1 (FEN1), Exonuclease-1 (EXO1), Mkt1, Gap endonuclease 1 (GEN1), and Xeroderma pigmentosum complementation group G (XPG) nuclease, 5'-3' exonucleases of DNA polymerase I and bacteriophage T4- and T5-5' nucleases; the VapC-like PIN domain family which includes toxins of prokaryotic toxin/antitoxin operons FitAB and VapBC, as well as eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease, and rRNA-processing protein Fcf1; the LabA-like PIN domain family which includes the PIN domains of Synechococcus elongatus LabA (low-amplitude and bright); the PRORP-Zc3h12a-like PIN domain family which includes the PIN domains of RNase P (PRORP) and ribonuclease Zc3h12a; and Bacillus subtilis YacP/Rae1-like PIN domains. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_10.	85
350852	cd18770	PIN_Mut7-C-like	uncharacterized subgroup of the Mut7-C-like family of the PIN domain superfamily. The Mut7-C-like family of the PIN domain superfamily includes the C-terminal domain of Caenorhabditis elegans Mut-7 (also known as exonuclease 3'-5' domain-containing protein 3 homolog). Mut-7 is involved in RNA interference (RNAi) and transposon silencing in C. elegans. The Mut7-C PIN domain family is recognized as a genuine PIN domain; however, it is not included in the CDD PIN domain superfamily hierarchical model as it lacks a core strand and helix (H3 and S3). PIN (PilT N terminus) domains belong to a large nuclease superfamily and were originally named for their sequence similarity to the N-terminal domain of an annotated pili biogenesis protein, PilT, a domain fusion between a PIN-domain and a PilT ATPase domain. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Other PIN domain families are: the FEN-like PIN domain family which includes the PIN domains of Flap endonuclease-1 (FEN1), Exonuclease-1 (EXO1), Mkt1, Gap endonuclease 1 (GEN1), and Xeroderma pigmentosum complementation group G (XPG) nuclease, 5'-3' exonucleases of DNA polymerase I and bacteriophage T4- and T5-5' nucleases; the VapC-like PIN domain family which includes toxins of prokaryotic toxin/antitoxin operons FitAB and VapBC, as well as eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease, and rRNA-processing protein Fcf1; the LabA-like PIN domain family which includes the PIN domains of Synechococcus elongatus LabA (low-amplitude and bright); the PRORP-Zc3h12a-like PIN domain family which includes the PIN domains of RNase P (PRORP) and ribonuclease Zc3h12a; and Bacillus subtilis YacP/Rae1-like PIN domains. Matelska et al. recently classified PIN-like domains into distinct groups; this subgroup includes some sequences belonging to one of these, PIN_16.	80
350853	cd18771	PIN_Mut7-C-like	uncharacterized subgroup of the Mut7-C-like family of the PIN domain superfamily. The Mut7-C-like family of the PIN domain superfamily includes the C-terminal domain of Caenorhabditis elegans Mut-7 (also known as exonuclease 3'-5' domain-containing protein 3 homolog). Mut-7 is involved in RNA interference (RNAi) and transposon silencing in C. elegans. The Mut7-C PIN domain family is recognized as a genuine PIN domain; however, it is not included in the CDD PIN domain superfamily hierarchical model as it lacks a core strand and helix (H3 and S3). PIN (PilT N terminus) domains belong to a large nuclease superfamily and were originally named for their sequence similarity to the N-terminal domain of an annotated pili biogenesis protein, PilT, a domain fusion between a PIN-domain and a PilT ATPase domain. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Other PIN domain families are: the FEN-like PIN domain family which includes the PIN domains of Flap endonuclease-1 (FEN1), Exonuclease-1 (EXO1), Mkt1, Gap endonuclease 1 (GEN1), and Xeroderma pigmentosum complementation group G (XPG) nuclease, 5'-3' exonucleases of DNA polymerase I and bacteriophage T4- and T5-5' nucleases; the VapC-like PIN domain family which includes toxins of prokaryotic toxin/antitoxin operons FitAB and VapBC, as well as eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease, and rRNA-processing protein Fcf1; the LabA-like PIN domain family which includes the PIN domains of Synechococcus elongatus LabA (low-amplitude and bright); the PRORP-Zc3h12a-like PIN domain family which includes the PIN domains of RNase P (PRORP) and ribonuclease Zc3h12a; and Bacillus subtilis YacP/Rae1-like PIN domains.	62
350854	cd18772	PIN_Mut7-C-like	Mut7-C-like family of the PIN domain superfamily similar to the PIN domain found at the C-terminus of Caenorhabditis elegans exonuclease Mut-7 and related proteins. This Mut7-C-like subgroup of the PIN domain superfamily includes the C-terminal domain of Caenorhabditis elegans Mut-7 (also known as exonuclease 3'-5' domain-containing protein 3 homolog). Mut-7 is involved in RNA interference (RNAi) and transposon silencing in C. elegans. PIN (PilT N terminus) domains belong to a large nuclease superfamily and were originally named for their sequence similarity to the N-terminal domain of an annotated pili biogenesis protein, PilT, a domain fusion between a PIN-domain and a PilT ATPase domain. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Other PIN domain families are: the FEN-like PIN domain family which includes the PIN domains of Flap endonuclease-1 (FEN1), Exonuclease-1 (EXO1), Mkt1, Gap endonuclease 1 (GEN1), and Xeroderma pigmentosum complementation group G (XPG) nuclease, 5'-3' exonucleases of DNA polymerase I and bacteriophage T4- and T5-5' nucleases; the VapC-like PIN domain family which includes toxins of prokaryotic toxin/antitoxin operons FitAB and VapBC, as well as eukaryotic ribonucleases such as Smg6, ribosome assembly factor NOB1, exosome subunit Rrp44 endoribonuclease, and rRNA-processing protein Fcf1; the LabA-like PIN domain family which includes the PIN domains of Synechococcus elongatus LabA (low-amplitude and bright); the PRORP-Zc3h12a-like PIN domain family which includes the PIN domains of RNase P (PRORP) and ribonuclease Zc3h12a; and Bacillus subtilis YacP/Rae1-like PIN domains.	81
350341	cd18773	PDC1_HK_sensor	first PDC (PhoQ/DcuS/CitA) domain of methyl-accepting chemotaxis proteins, diguanylate-cyclase and similar domains. Histidine kinase (HK) receptors are part of two-component systems (TCS) in bacteria that play a critical role for sensing and adapting to environmental changes. Typically, HK receptors contain an extracellular sensing domain flanked by two transmembrane helices, an intracellular dimerization histidine phosphorylation domain (DHp), and a C-terminal kinase domain, with many variations on this theme. HK receptors in this family contain double PDC (PhoQ/DcuS/CitA) sensor domains. Signals detected by the sensor domain are transmitted through DHp to the kinase domain, resulting in the phosphorylation of a conserved histidine residue in DHp; phosphotransfer to a conserved aspartate in its cognate response regulator (RR) follows, which leads to the activation of genes for downstream cellular responses. The HK family includes not just histidine kinase receptors but also sensors for chemotaxis proteins and diguanylate cyclase receptors, implying a combinatorial molecular evolution.	125
350342	cd18774	PDC2_HK_sensor	second PDC (PhoQ/DcuS/CitA) domain of methyl-accepting chemotaxis proteins, diguanylate-cyclase and similar domains. Histidine kinase (HK) receptors are part of two-component systems (TCS) in bacteria that play a critical role for sensing and adapting to environmental changes. Typically, HK receptors contain an extracellular sensing domain flanked by two transmembrane helices, an intracellular dimerization histidine phosphorylation domain (DHp), and a C-terminal kinase domain, with many variations on this theme. HK receptors in this family contain double PDC (PhoQ/DcuS/CitA) sensor domains. Signals detected by the sensor domain are transmitted through DHp to the kinase domain, resulting in the phosphorylation of a conserved histidine residue in DHp; phosphotransfer to a conserved aspartate in its cognate response regulator (RR) follows, which leads to the activation of genes for downstream cellular responses. The HK family includes not just histidine kinase receptors but also sensors for chemotaxis proteins and diguanylate cyclase receptors, implying a combinatorial molecular evolution.	89
350651	cd18775	SafA-like	Saf-pilin pilus formation protein SafA. This subfamily is composed of Saf-pilin pilus formation protein SafA from Salmonella enterica and similar proteins. SafA is the major subunit of Saf pili, which are often found in clinical isolates of Salmonella and are assembled by the chaperone-usher secretion pathway. In addition to safA, the saf operon is also composed of safB (periplasmic chaperone), safC (outer membrane usher), and safD (minor subunit). SafA and SafD subunits are transported from the cytoplasm into the periplasm via the SEC machinery, and the periplasmic chaperone SafB donates its G1 strand to complete the correct folding of SafA or SafD. In Saf pili assembly, the N-terminal extension (NTE) of an incoming SafA replaces the G1 strand (in SafB) via a zip-in-zip-out mechanism (also called donor-strand complementation or exchange) to form the polymer of SafD-(SafA)n (n > 100).	122
350652	cd18776	AfaD-like	AfaD and similar proteins. This subfamily consists of Escherichia coli AfaD, Salmonella SafD, and similar proteins. The afa gene clusters encode an afimbrial adhesive sheath produced by Escherichia coli. The adhesive sheath is composed of two proteins, AfaD and AfaE, which are independently exposed at the bacterial cell surface. AfaE is required for bacterial adhesion to HeLa cells and AfaD for the uptake of adherent bacteria into these cells. SafD is the minor subunit of Saf pili, which are often found in clinical isolates of Salmonella and are assembled by the chaperone-usher secretion pathway. In addition to safD, the saf operon is also composed of safA (major subunit), safB (periplasmic chaperone), and safC (outer membrane usher). Also included is the enteroaggregative Escherichia coli AAF/IV pilus tip protein, which is implicated in adhesion as well. During fimbria/pili assembly, polymerization occurs when the N-terminal extension (NTE) of one monomer is inserted into an adjacent monomer, providing the final beta strand or G-strand, to complete the Ig-like fold, in a mechanism called the donor-strand complementation (DSC) or donor-strand exchange (DSE).	118
350653	cd18777	PsaA_MyfA	Fimbrial subunit PsaA, MyfA, and similar proteins. This subfamily is composed of Yersinia pestis PsaA, Yersinia enterocolitica MyfA, and similar proteins. PsaA and MyfA are the major subunits of pH 6 antigen (Psa) and Myf fimbrial homopolymers. Psa and Myf specifically recognize beta1-3- or beta1-4-linked galactose in glycosphingolipids, but while Psa also binds phosphatidylcholine, Myf does not. Psa has acquired a tyrosine-rich surface that enables it to bind to phosphatidylcholine and mediate adhesion of Y. pestis/pseudotuberculosis to alveolar cells. Myf has specialized as a carbohydrate-binding adhesin, facilitating the attachment of Y. enterocolitica to intestinal cells. During fimbria/pili assembly, polymerization occurs when the N-terminal extension (NTE) of one monomer is inserted into an adjacent monomer, providing the final beta strand or G-strand, to complete the Ig-like fold, in a mechanism called the donor-strand complementation (DSC) or donor-strand exchange (DSE).	110
350051	cd18778	ABC_6TM_exporter_like	Six-transmembrane helical domain (TMD) of an uncharacterized ABC exporter, and similar proteins. This group includes a subunit of six transmembrane (TM) helices typically found in the ATP-binding cassette (ABC) transporters that function as exporters, which contain 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds and various types of lipids. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied between the different types of transporters, suggesting the chemical diversity of the translocated substrates, while NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane. However, some ABC genes are organized as half-transporters, which must form either homodimers or heterodimers to form a functional transporter. The ABC exporters play a role in multidrug resistance to antibiotics and anticancer agents, and mutations in these proteins have been shown to cause severe human diseases such as cystic fibrosis.	293
350052	cd18779	ABC_6TM_T1SS_like	uncharacterized subgroup of the six-transmembrane helical domain (6-TMD) of the ATP-binding cassette subunit in the type 1 secretion systems, and similar proteins. uncharacterized subgroup of the six-transmembrane helical domain (6-TMD) of the ABC subunit in the type 1 secretion systems (T1SS) and similar proteins. These transporter subunits include HlyB, PrtD, CyaB, CvaB, RsaD, HasD, LipB, and LapB, among many others. T1SS are found in pathogenic Gram-negative bacteria (such as Escherichia coli, Vibrio cholerae or Bordetella pertussis) to export proteins (often proteases) across both inner and outer membranes to the extracellular medium. This is one of three proteins of the type I secretion apparatus. In the case of the Escherichia coli HlyA T1SS, these three proteins are HlyB (a dimeric ABC transporter), HlyD (MFP, oligomeric membrane fusion protein) and TolC (OMP, a trimeric oligomeric outer membrane protein). Most targeted proteins are not cleaved at the N terminus, but rather carry signals located toward the extreme C terminus to direct type I secretion. However, the 10 kDa Escherichia coli colicin V targets the ABC transporter CvaB using a cleaved, N-terminal signal sequence. Almost all transport substrates of the type I system have critical functions in attacking host cells either directly or by being essential for host colonization. The ABC-dependent T1SS transports various molecules, from ions and drugs to proteins of various sizes up to 900 kDa. The molecules secreted vary in size from the small Escherichia coli peptide colicin V (10 kDa) to the Pseudomonas fluorescens cell adhesion protein LapA of 520 kDa. The best characterized are the RTX toxins such as the adenylate cyclase (CyaA) toxin from Bordetella pertussis, the causative agent of whooping cough, and the lipases such as LipA. Type I secretion is also involved in export of non-protein substrates such as cyclic beta-glucans and polysaccharides.	294
350053	cd18780	ABC_6TM_AtABCB27_like	Six-transmembrane helical domain (6-TMD) of the Arabidopsis ABC transporter B family member 27 and similar proteins. This group includes Arabidopsis ABC transporter B family member 27 (also known as AtABCB27, aluminum tolerance-related ATP-binding cassette transporter, transporter associated with antigen processing-like protein 2, AtTAP2, and ALS1) which may play a role in aluminum resistance. The ABC_6TM_TAP_ABCB8_10_like subgroup of the ABC_6TM exporter family includes ABC transporter associated with antigen processing (TAP), which is essential to cellular immunity against viral infection, as well as ABCB8 and ABCB10, which are found in the inner membrane of mitochondria, with the nucleotide-binding domains (NBDs) inside the mitochondrial matrix. Mammalian ABCB10 is essential for erythropoiesis and for protection of mitochondria against oxidative stress, while ABCB8 is essential for normal cardiac function, maintenance of mitochondrial iron homeostasis and maturation of cytosolic Fe/S proteins. The ABC_6TM exporter family represents the six transmembrane (TM) helices typically found in the ATP-binding cassette (ABC) transporters that function as exporters, which contain 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds and various types of lipids. In addition to ABC exporters, ABC transporters include two classes of ABC importers, classified depending on details of their architecture and mechanism. Only the ABC exporters are included in the ABC_6TM exporter family.	295
350054	cd18781	ABC_6TM_AarD_CydDC_like	uncharacterized subgroup of the six-transmembrane helical domain (6-TMD) of the ABC cysteine/GSH transporter CydDC, and similar proteins. This subgroup belongs to the ABC_6TM_AarD_CydDC_like subgroup of the ABC_6TM exporter family. The CydD protein, together with the CydC protein, constitutes a bacterial heterodimeric ATP-binding cassette (ABC) transporter complex required for formation of the functional cytochrome bd oxidase in both gram-positive and gram-negative aerobic bacteria. In Escherichia coli, the biogenesis of both cytochrome bd-type quinol oxidases and periplasmic cytochromes requires the ABC-type cysteine/GSH transporter CydDC, which exports cysteine and glutathione from the cytoplasm to the periplasm to maintain redox homeostasis. Mutations in AarD, a homolog from Providencia stuartii, also show phenotypic characteristics consistent with a defect in the cytochrome d oxidase. CydDC forms a heterodimeric ABC transporter with two transmembrane domains (TMDs), each predicted to comprise six TM alpha-helices and two nucleotide binding domains (NBDs). The ABC_6TM exporter family represents the six transmembrane (TM) helices typically found in the ATP-binding cassette (ABC) transporters that function as exporters, which contain 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds and various types of lipids. In addition to ABC exporters, ABC transporters include two classes of ABC importers, classified depending on details of their architecture and mechanism. Only the ABC exporters are included in the ABC_6TM exporter family.	290
350055	cd18782	ABC_6TM_PrtD_LapB_HlyB_like	uncharacterized subgroup of the six-transmembrane helical domain (6-TMD) of the ABC subunit in the type 1 secretion systems (PrtD, LapB, HlyB), and similar proteins. Uncharacterized subgroup of the six-transmembrane helical domain (6-TMD) of the ABC subunit in the type 1 secretion systems (T1SS), including PrtD, LapB, and HlyB. T1SS are found in pathogenic Gram-negative bacteria (such as Escherichia coli, Vibrio cholerae or Bordetella pertussis) to export proteins (often proteases) across both inner and outer membranes to the extracellular medium. This is one of three proteins of the type 1 secretion apparatus. In the case of the Escherichia coli HlyA T1SS, these three proteins are HlyB (a dimeric ABC transporter), HlyD (MFP, oligomeric membrane fusion protein) and TolC (OMP, a trimeric oligomeric outer membrane protein). These three components assemble into a complex spanning both membranes and provide a channel for the translocation of unfolded polypeptides. In addition, PrtD is the integral membrane ATP-binding cassette component of the Erwinia chrysanthemi metalloprotease secretion system (PrtDEF). LapB is an inner-membrane transporter component of the LapBCE system that is required for the secretion of the LapA adhesin.	294
350056	cd18783	ABC_6TM_PrtD_LapB_HlyB_like	uncharacterized subgroup of the six-transmembrane helical domain (6-TMD) of the ABC subunit in the type 1 secretion systems (PrtD, LapB, HlyB), and similar proteins. Uncharacterized subgroup of the six-transmembrane helical domain (6-TMD) of the ABC subunit in the type 1 secretion systems (T1SS), including PrtD, LapB, and HlyB. T1SS are found in pathogenic Gram-negative bacteria (such as Escherichia coli, Vibrio cholerae or Bordetella pertussis) to export proteins (often proteases) across both inner and outer membranes to the extracellular medium. This is one of three proteins of the type 1 secretion apparatus. In the case of the Escherichia coli HlyA T1SS, these three proteins are HlyB (a dimeric ABC transporter), HlyD (MFP, oligomeric membrane fusion protein) and TolC (OMP, a trimeric oligomeric outer membrane protein). These three components assemble into a complex spanning both membranes and provide a channel for the translocation of unfolded polypeptides. In addition, PrtD is the integral membrane ATP-binding cassette component of the Erwinia chrysanthemi metalloprotease secretion system (PrtDEF). LapB is an inner-membrane transporter component of the LapBCE system that is required for the secretion of the LapA adhesin.	294
350057	cd18784	ABC_6TM_ABCB9_like	Six-transmembrane helical domain (6-TMD) of ATP-binding cassette sub-family B member 9 and similar proteins. ATP-binding cassette sub-family B member 9 is also known as transporter associated with antigen processing, TAP-like protein, TAPL, and ABCB9. It is a half transporter that forms a homodimeric lysosomal peptide transport complex. It belongs to the ABC_6TM_TAP_ABCB8_10_like subgroup of the ABC_6TM exporter family. The ABC_6TM exporter family represents the six transmembrane (TM) helices typically found in the ATP-binding cassette (ABC) transporters that function as exporters, which contain 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds and various types of lipids. In addition to ABC exporters, ABC transporters include two classes of ABC importers, classified depending on details of their architecture and mechanism. Only the ABC exporters are included in the ABC_6TM exporter family. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied between the different types of transporters, suggesting chemical diversity of the translocated substrates, whereas NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane. However, some ABC genes are organized as half-transporters, which must form either homodimers or heterodimers to form a functional unit.	289
350172	cd18785	SF2_C	C-terminal helicase domain of superfamily 2 DEAD/H-box helicases. Superfamily (SF)2 helicases include DEAD-box helicases, UvrB, RecG, Ski2, Sucrose Non-Fermenting (SNF) family helicases, and dicer proteins, among others. Similar to SF1 helicases, they do not form toroidal structures like SF3-6 helicases. SF2 helicases are a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Their helicase core is surrounded by C- and N-terminal domains with specific functions such as nucleases, RNA or DNA binding domains, or domains engaged in protein-protein interactions. The core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC.	77
350173	cd18786	SF1_C	C-terminal helicase domain of superfamily 1 DEAD/H-box helicases. Superfamily (SF)1 family members include UvrD/Rep, Pif1-like, and Upf-1-like proteins. Similar to SF2 helicases, they do not form toroidal, predominantly hexameric structures like SF3-6. SF1 helicases are a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Their helicase core is surrounded by C- and N-terminal domains with specific functions such as nucleases, RNA or DNA binding domains, or domains engaged in protein-protein interactions. The core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC.	89
350174	cd18787	SF2_C_DEAD	C-terminal helicase domain of the DEAD box helicases. DEAD-box helicases comprise a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis, and RNA degradation. They are superfamily (SF)2 helicases that, similar to SF1, do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC.	131
350175	cd18788	SF2_C_XPD	C-terminal helicase domain of xeroderma pigmentosum group D (XPD) family DEAD-like helicases. The xeroderma pigmentosum group D (XPD)-like family members are DEAD-box helicases belonging to superfamily (SF)2. This family includes DDX11 (also called ChlR1), a protein involved in maintaining chromosome transmission fidelity and genome stability, the TFIIH basal transcription factor complex XPD subunit, and FANCJ (also known as BRIP1), a DNA helicase required for the maintenance of chromosomal stability. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC.	159
350176	cd18789	SF2_C_XPB	C-terminal helicase domain of XPB-like helicases. TFIIH basal transcription factor complex helicase XPB (xeroderma pigmentosum type B) subunit (also known as DNA excision repair protein ERCC-3 or TFIIH 89 kDa subunit) is the ATP-dependent 3'-5' DNA helicase component of the core-TFIIH basal transcription factor, involved in nucleotide excision repair (NER) of DNA and, when complexed to CAK, in RNA transcription by RNA polymerase II. XPB is a DEAD-like helicase belonging to superfamily (SF)2, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC.	153
350177	cd18790	SF2_C_UvrB	C-terminal helicase domain of the UvrB family helicases. Excinuclease ABC subunit B (or UvrB) plays a central role in nucleotide excision repair (NER). Together with other components of the NER system, like UvrA, UvrC, UvrD (helicase II), and DNA polymerase I, it recognizes and cleaves damaged DNA in a multistep ATP-dependent reaction. UvrB is critical for the second phase of damage recognition by verifying the nature of the damage and forming the pre-incision complex. Its ATPase site becomes activated in the presence of UvrA and damaged DNA. Its activity is strand destabilization via distortion of the DNA at lesion site, with very limited DNA unwinding. UvrB is a DEAD-like helicase belonging to superfamily (SF)2, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC.	171
350178	cd18791	SF2_C_RHA	C-terminal helicase domain of the RNA helicase A (RHA) family helicases. The RNA helicase A (RHA) family includes RHA, also called DEAH-box helicase 9 (DHX9), DHX8, DHX15-16, DHX32-38, and many others. The RHA family members are DEAD-like helicases belonging to superfamily (SF)2, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC.	171
350179	cd18792	SF2_C_RecG_TRCF	C-terminal helicase domain of the RecG family helicases. The DEAD-like helicase RecG family contains recombination factor RecG and transcription-repair coupling factor TrcF. They are DEAD-like helicases belonging to superfamily (SF)2, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC.	160
350180	cd18793	SF2_C_SNF	C-terminal helicase domain of the SNF family helicases. The Sucrose Non-Fermenting (SNF) family includes chromatin-remodeling factors, such as CHD proteins and SMARCA proteins, recombination proteins Rad54, and many others. They are DEAD-like helicases belonging to superfamily (SF)2, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC.	135
350181	cd18794	SF2_C_RecQ	C-terminal helicase domain of the RecQ family helicases. The RecQ helicase family is an evolutionarily conserved class of enzymes, dedicated to preserving genomic integrity by operating in telomere maintenance, DNA repair, and replication. They are DEAD-like helicases belonging to superfamily (SF)2, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC.	134
350182	cd18795	SF2_C_Ski2	C-terminal helicase domain of the Ski2 family helicases. Ski2-like RNA helicases play an important role in RNA degradation, processing, and splicing pathways. This family includes spliceosomal Brr2 RNA helicase, ASCC3 (involved in the repair of N-alkylated nucleotides), Mtr4 (involved in processing of structured RNAs), DDX60 (involved in viral RNA degradation), and other proteins. Ski2-like RNA helicases are DEAD-like helicases belonging to superfamily (SF)2, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC.	154
350183	cd18796	SF2_C_LHR	C-terminal helicase domain of LHR family helicases. Large helicase-related protein (LHR) is a DNA damage-inducible helicase that uses ATP hydrolysis to drive unidirectional 3'-to-5' translocation along single-stranded DNA (ssDNA) and to unwind RNA:DNA duplexes. This group also includes related bacterial and archaeal helicases. LHR family helicases are DEAD-like helicases belonging to superfamily (SF)2, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC.	150
350184	cd18797	SF2_C_Hrq	C-terminal helicase domain of HrQ family helicases. Yeast Hrq1, similar to RecQ4, plays a role in DNA inter-strand crosslink (ICL) repair and in telomere maintenance. Hrq1 lacks the Sld2-like domain found in RecQ4. HrQ family helicases are DEAD-like helicases belonging to superfamily (SF)2, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC.	146
350185	cd18798	SF2_C_reverse_gyrase	C-terminal helicase domain of the reverse gyrase. Reverse gyrase modifies the topological state of DNA by introducing positive supercoils in an ATP-dependent process. Reverse gyrase is a DEAD-like helicase belonging to superfamily (SF)2, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC.	174
350186	cd18799	SF2_C_EcoAI-like	C-terminal helicase domain of EcoAI HsdR-like restriction enzyme family helicases. This family is composed of helicase restriction enzymes, including the HsdR subunit of restriction-modification enzymes such as Escherichia coli type I restriction enzyme EcoAI R protein (R.EcoAI). The EcoAI enzyme recognizes 5'-GAGN(7)GTCA-3'. The HsdR or R subunit is required for both nuclease and ATPase activities, but not for modification. These proteins are DEAD-like helicases belonging to superfamily (SF)2, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC.	116
350187	cd18800	SF2_C_EcoR124I-like	C-terminal helicase domain of EcoR124I HsdR-like restriction enzyme family helicases. This family is composed of helicase restriction enzymes, including the HsdR subunit of restriction-modification enzymes such as Escherichia coli type I restriction enzyme EcoR124I R protein. EcoR124I recognizes the sequence, 5'-GAAN(6)RTCG-3', and cleaves at random sites. The HsdR or R subunit is required for both nuclease and ATPase activities, but not for modification. These proteins are DEAD-like helicases belonging to superfamily (SF)2, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC.	82
350188	cd18801	SF2_C_FANCM_Hef	C-terminal helicase domain of Fanconi anemia group M family helicases. Fanconi anemia group M (FANCM) protein is a DNA-dependent ATPase component of the Fanconi anemia (FA) core complex. It is required for the normal activation of the FA pathway, leading to monoubiquitination of the FANCI-FANCD2 complex in response to DNA damage, cellular resistance to DNA cross-linking drugs, and prevention of chromosomal breakage. Hef (helicase-associated endonuclease fork-structure) belongs to the XPF/MUS81/FANCM family of endonucleases and is involved in stalled replication fork repair. FANCM and Hef are DEAD-like helicases belonging to superfamily (SF)2, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC.	143
350189	cd18802	SF2_C_dicer	C-terminal helicase domain of the endoribonuclease Dicer. Dicer ribonucleases cleave double-stranded RNA (dsRNA) precursors to generate microRNAs (miRNAs) and small interfering RNAs (siRNAs). In concert with Argonautes, these small RNAs bind complementary mRNAs to down-regulate their expression. miRNAs are processed by Dicer from small hairpins, while siRNAs are typically processed from longer dsRNA, from endogenous sources, or exogenous sources such as viral replication intermediates. Some organisms, such as Homo sapiens and Caenorhabditis elegans, encode one Dicer that generates miRNAs and siRNAs, but other organisms have multiple dicers with specialized functions. Dicer exists throughout eukaryotes, and a subset has an N-terminal helicase domain of the RIG-I-like receptor (RLR) subgroup. RLRs often function in innate immunity and Dicer helicase domains sometimes show differences in activity that correlate with roles in immunity. Dicer helicase domains are DEAD-like helicases belonging to superfamily (SF)2, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC.	142
350190	cd18803	SF2_C_secA	C-terminal helicase domain of the protein translocase subunit secA. SecA is a component of the Sec translocase that transports the vast majority of bacterial and ER-exported proteins. SecA binds both the signal sequence and the mature domain of the preprotein emerging from the ribosome. SecA is a DEAD-like helicase belonging to superfamily (SF)2, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC.	141
350191	cd18804	SF2_C_priA	C-terminal helicase domain of ATP-dependent helicase PriA. PriA, also known as replication factor Y or primosomal protein N', is a 3'-->5' DNA helicase that acts to remodel stalled replication forks and as a specificity factor for origin-independent assembly of a new replisome at the stalled fork. PriA is a DEAD-like helicase belonging to superfamily (SF)2, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC.	238
350192	cd18805	SF2_C_suv3	C-terminal helicase domain of ATP-dependent RNA helicase. The SUV3 (suppressor of Var 3) gene encodes a DNA and RNA helicase, which is localized in mitochondria and is a subunit of the degradosome complex involved in regulation of RNA surveillance and turnover. SUV3 exhibits DNA and RNA-dependent ATPase, DNA and RNA-binding and DNA and RNA unwinding activities. SUV3 is a DEAD-like helicase belonging to superfamily (SF)2, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC.	135
350193	cd18806	SF2_C_viral	C-terminal helicase domain of viral helicase. Viral helicases in this family are DEAD-like helicases belonging to superfamily (SF)2, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC.	145
350194	cd18807	SF1_C_UvrD	C-terminal helicase domain of UvrD family helicases. UvrD is a highly conserved helicase involved in mismatch repair, nucleotide excision repair, and recombinational repair. It plays a critical role in maintaining genomic stability and facilitating DNA lesion repair in many prokaryotic species including Helicobacter pylori and Escherichia coli. This family also includes ATP-dependent helicase/nuclease AddA and helicase/nuclease RecBCD subunit RecB, among others. UvrD family helicases are DEAD-like helicases belonging to superfamily (SF)1, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF2 helicases, SF1 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC.	150
350195	cd18808	SF1_C_Upf1	C-terminal helicase domain of Upf1-like family helicases. The Upf1-like helicase family includes UPF1, HELZ, Mov10L1, Aquarius, IGHMBP2 (SMUBP2), and similar proteins. They are DEAD-like helicases belonging to superfamily (SF)1, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF2 helicases, SF1 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC.	184
350196	cd18809	SF1_C_RecD	C-terminal helicase domain of RecD family helicases. RecD is a member of the RecBCD (EC 3.1.11.5, Exonuclease V) complex. It is the alpha chain of the complex and functions as a 3'-5' helicase. The RecBCD enzyme is both a helicase that unwinds, or separates the strands of DNA, and a nuclease that makes single-stranded nicks in DNA. RecD family helicases are DEAD-like helicases belonging to superfamily (SF)1, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF2 helicases, SF1 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC.	80
350197	cd18810	SF2_C_TRCF	C-terminal helicase domain of the transcription-repair coupling factor. Transcription-repair coupling factor (TrcF) dissociates transcription elongation complexes blocked at nonpairing lesions and mediates recruitment of DNA repair proteins. TrcF is a DEAD-like helicase belonging to superfamily (SF)2, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC.	151
350198	cd18811	SF2_C_RecG	C-terminal helicase domain of DNA helicase RecG. ATP-dependent DNA helicase RecG plays a critical role in recombination and DNA repair. RecG helps process Holliday junction intermediates to mature products by catalyzing branch migration. It is a DEAD-like helicase belonging to superfamily (SF)2, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC.	159
349406	cd18812	CAP_PI15-like	CAP (cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins) domain of peptidase inhibitor 15 and similar proteins. This family is composed of peptidase inhibitor 15 (PI15), peptidase inhibitor R3HDML, cysteine-rich secretory protein LCCL domain-containing 1 (CRISPLD1), and cysteine-rich secretory protein LCCL domain-containing 2 (CRISPLD2). PI15 is a serine protease inhibitor which displays weak inhibitory activity against trypsin and may play a role in facial patterning during embryonic development. The PI15 gene is a candidate gene for abdominal aortic internal elastic lamina ruptures in the rat. R3HDML is a putative serine protease inhibitor, whose gene may be associated with clinical dimensions of schizophrenia. CRISPLD1 may play a role in NSCLP (nonsyndromic cleft lip with or without cleft palate) through the interaction with CRISPLD2 and folate pathway genes. CRISPLD2 plays a role in the etiology of NSCLP and is required for neural crest cell migration and cell viability during craniofacial development. The wider family of CAP domain containing proteins includes plant pathogenesis-related protein 1 (PR-1), cysteine-rich secretory proteins (CRISPs), and allergen 5 from vespid venom, among others.	146
349407	cd18813	CAP_CRISPLD1	CAP (cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins) domain of cysteine-rich secretory protein LCCL domain-containing 1. Cysteine-rich secretory protein LCCL domain-containing 1 (CRISPLD1) is also called cysteine-rich secretory protein 10 (CRISP-10), CocoaCrisp, LCCL domain-containing cysteine-rich secretory protein 1 (LCRISP1), or CAP and LCCL domain containing protein 1 (CAPLD1). CRISPLD1 is clearly distinct from CRISPs because it does not contain the 10 absolutely conserved cysteines or the ICR (ion channel regulator) domain of the CRISPs. It may play a role in NSCLP (nonsyndromic cleft lip with or without cleft palate) through the interaction with CRISPLD2 and folate pathway genes. The wider family of CAP domain containing proteins includes plant pathogenesis-related protein 1 (PR-1), cysteine-rich secretory proteins (CRISPs), and allergen 5 from vespid venom, among others.	146
349408	cd18814	CAP_PI15	CAP (cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins) domain of peptidase inhibitor 15. Peptidase inhibitor 15 (PI15) is also called 25 kDa trypsin inhibitor (p25TI), cysteine-rich secretory protein 8 (CRISP-8), or SugarCrisp. It is a serine protease inhibitor which displays weak inhibitory activity against trypsin and may play a role in facial patterning during embryonic development. The PI15 gene is a candidate gene for abdominal aortic internal elastic lamina ruptures in the rat. PI15 may also participate in the regulation of drug resistance in ovarian cancer and serve as a potential target in targeted therapies. The wider family of CAP domain containing proteins includes plant pathogenesis-related protein 1 (PR-1), cysteine-rich secretory proteins (CRISPs), and allergen 5 from vespid venom, among others.	146
349409	cd18815	CAP_R3HDML	CAP (cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins) domain of peptidase inhibitor R3HDML. Peptidase inhibitor R3HDML, also called cysteine-rich secretory protein R3HDML, is a putative serine protease inhibitor. The R3HDML gene may be associated with clinical dimensions of schizophrenia. The wider family of CAP domain containing proteins includes plant pathogenesis-related protein 1 (PR-1), cysteine-rich secretory proteins (CRISPs), and allergen 5 from vespid venom, among others.	146
349410	cd18816	CAP_CRISPLD2	CAP (cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins) domain of cysteine-rich secretory protein LCCL domain-containing 2. Cysteine-rich secretory protein LCCL domain-containing 2 (CRISPLD2) is also called cysteine-rich secretory protein 11 (CRISP-11), LCCL domain-containing cysteine-rich secretory protein 2 (LCRISP2), or CAP and LCCL domain containing protein 2 (CAPLD2). It plays a role in the etiology of NSCLP (non-syndromic cleft lip with or without cleft palate). It is required for neural crest cell migration and cell viability during craniofacial development. The CRISPLD2 gene has been identified as a glucocorticoid responsive gene that modulates cytokine function in airway smooth muscle cells. The wider family of CAP domain containing proteins includes plant pathogenesis-related protein 1 (PR-1), cysteine-rich secretory proteins (CRISPs), and allergen 5 from vespid venom, among others.	146
350138	cd18817	GH43f_LbAraf43-like	Glycosyl hydrolase family 43 such as Lactobacillus brevis alpha-L-arabinofuranosidase LbAraf43. This glycosyl hydrolase family 43 (GH43) subgroup includes characterized enzymes with alpha-L-arabinofuranosidase (EC 3.2.1.55) activity. It belongs to the glycosyl hydrolase clan F (according to carbohydrate-active enzymes database (CAZY)) which includes family 43 (GH43) and 62 (GH62) families. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. Characterized enzymes belonging to this subgroup include Lactobacillus brevis (LbAraf43) and Weissella sp (WAraf43) which show activity with similar catalytic efficiency on 1,5-alpha-L-arabinooligosaccharides with a degree of polymerization (DP) of 2-3; size is limited by an extended loop at the entrance to the active site. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	262
350139	cd18818	GH43_GbtXyl43B-like	Glycosyl hydrolase family 43 such as Geobacillus thermoleovorans IT-08 beta-xylosidase/exo-xylanase (GbtXyl43B). This glycosyl hydrolase family 43 (GH43) subgroup includes the characterized enzymes Geobacillus thermoleovorans IT-08 beta-xylosidase (EC 3.2.1.37) / exo-xylanase (GbtXyl43B), and Paenibacillus sp. strain E18 alpha-L-arabinofuranosidase (EC 3.2.1.55) Abf43B. It belongs to the glycosyl hydrolase clan F (according to carbohydrate-active enzymes database (CAZY)) which includes family 43 (GH43) and 62 (GH62) families. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	273
350140	cd18819	GH43_LbAraf43-like	Glycosyl hydrolase family 43 proteins similar to Lactobacillus brevis alpha-L-arabinofuranosidase LbAraf43 and Geobacillus thermoleovorans GbtXyl43B. This uncharacterized glycosyl hydrolase family 43 (GH43) subgroup belongs to a subgroup which includes enzymes with beta-xylosidase (EC 3.2.1.37), alpha-L-arabinofuranosidase (EC 3.2.1.55) and possibly bifunctional xylosidase/arabinofuranosidase activities, similar to Lactobacillus brevis alpha-L-arabinofuranosidase LbAraf43 and Geobacillus thermoleovorans IT-08 beta-xylosidase / exo-xylanase (GbtXyl43B). It belongs to the glycosyl hydrolase clan F (according to carbohydrate-active enzymes database (CAZY)) which includes family 43 (GH43) and 62 (GH62) families. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	277
350141	cd18820	GH43_LbAraf43-like	Glycosyl hydrolase family 43 proteins similar to Lactobacillus brevis alpha-L-arabinofuranosidase LbAraf43 and Geobacillus thermoleovorans GbtXyl43B. This uncharacterized glycosyl hydrolase family 43 (GH43) subgroup belongs to a subgroup which includes enzymes with beta-xylosidase (EC 3.2.1.37), alpha-L-arabinofuranosidase (EC 3.2.1.55) and possibly bifunctional xylosidase/arabinofuranosidase activities, similar to Lactobacillus brevis alpha-L-arabinofuranosidase LbAraf43 and Geobacillus thermoleovorans IT-08 beta-xylosidase / exo-xylanase (GbtXyl43B). It belongs to the glycosyl hydrolase clan F (according to carbohydrate-active enzymes database (CAZY)) which includes family 43 (GH43) and 62 (GH62) families. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	258
350142	cd18821	GH43_Pc3Gal43A-like	Glycosyl hydrolase family 43 protein such as Phanerochaete chrysosporium exo-beta-1,3-galactanase (Pc1, 3Gal43A, 1,3Gal43A). This glycosyl hydrolase family 43 (GH43) subgroup includes characterized enzymes with exo-beta-1,3-galactanase (EC 3.2.1.145, also known as galactan 1,3-beta-galactosidase) activity such as Phanerochaete chrysosporium 1,3Gal43A (Pc1, 3Gal43A), Fusarium oxysporum 12S Fo/1 (3Gal), and Streptomyces sp. 19(2012) SGalase1 and SGalase2. It belongs to the GH43_CtGH43 subgroup of the glycosyl hydrolase clan F (according to carbohydrate-active enzymes database (CAZY)) which includes family 43 (GH43) and 62 (GH62) families. GH43_CtGH43 includes proteins such as Clostridium thermocellum exo-beta-1,3-galactanase (Ct1,3Gal43A or CtGH43) which is comprised of the GH43 domain, a CBM13 domain, and a dockerin domain, exhibits an unusual ability to hydrolyze beta-1,3-galactan in the presence of a beta-1,6 linked branch, and is missing an essential acidic residue suggesting a mechanism by which it bypasses beta-1,6 linked branches in the substrate. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	262
350143	cd18822	GH43_CtGH43-like	Glycosyl hydrolase family 43 protein such as Clostridium thermocellum exo-beta-1,3-galactanase (Ct1,3Gal43A or CtGH43). This glycosyl hydrolase family 43 (GH43) subgroup includes characterized enzymes with exo-beta-1,3-galactanase (EC 3.2.1.145, also known as galactan 1,3-beta-galactosidase) activity such as Clostridium thermocellum exo-beta-1,3-galactanase (Ct1,3Gal43A or CtGH43), Streptomyces avermitilis MA-4680 = NBRC 14893 (Sa1,3Gal43A;SAV2109) (1,3Gal43A), and Ruminiclostridium thermocellum ATCC 27405 (Ct1,3Gal43A;CtGH43;Cthe_0661) (1,3Gal43A). It belongs to the GH43_CtGH43 subgroup of the glycosyl hydrolase clan F (according to carbohydrate-active enzymes database (CAZY)) which includes family 43 (GH43) and 62 (GH62) families. GH43_CtGH43 includes proteins such as Clostridium thermocellum exo-beta-1,3-galactanase (Ct1,3Gal43A or CtGH43) which is comprised of the GH43 domain, a CBM13 domain, and a dockerin domain, exhibits an unusual ability to hydrolyze beta-1,3-galactan in the presence of a beta-1,6 linked branch, and is missing an essential acidic residue suggesting a mechanism by which it bypasses beta-1,6 linked branches in the substrate. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	266
350144	cd18823	GH43_RcAra43A-like	Glycosyl hydrolase family 43 such as Ruminococcus champanellensis arabinanase Ara43A. This glycosyl hydrolase family 43 (GH43) subgroup includes characterized enzymes with arabinanase (EC 3.2.1.99) activity such as Ruminococcus champanellensis arabinanase Ara43A and Fibrobacter succinogenes subsp. succinogenes S85 Fisuc_1994 / FSU_2517. It belongs to the GH43_CtGH43 subgroup of the glycosyl hydrolase clan F (according to carbohydrate-active enzymes database (CAZY)) which includes family 43 (GH43) and 62 (GH62) families. GH43_CtGH43 includes proteins such as Clostridium thermocellum exo-beta-1,3-galactanase (Ct1,3Gal43A or CtGH43) (EC 3.2.1.145, also known as galactan 1,3-beta-galactosidase) which is comprised of the GH43 domain, a CBM13 domain, and a dockerin domain, exhibits an unusual ability to hydrolyze beta-1,3-galactan in the presence of a beta-1,6 linked branch, and is missing an essential acidic residue suggesting a mechanism by which it bypasses beta-1,6 linked branches in the substrate. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	289
350145	cd18824	GH43_CtGH43-like	Glycosyl hydrolase family 43 protein similar to Clostridium thermocellum exo-beta-1,3-galactanase CtGH43 and Ruminococcus champanellensis arabinanase Ara43A. This uncharacterized glycosyl hydrolase family 43 (GH43) subgroup belongs to a subgroup which includes characterized enzymes with exo-beta-1,3-galactanase (EC 3.2.1.145, also known as galactan 1,3-beta-galactosidase) activity such as Clostridium thermocellum (Ct1,3Gal43A or CtGH43) and Phanerochaete chrysosporium 1,3Gal43A (Pc1, 3Gal43A), and arabinanase (EC 3.2.1.99) activity such as Ruminococcus champanellensis Ara43A. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	282
350146	cd18825	GH43_CtGH43-like	Glycosyl hydrolase family 43 protein similar to Clostridium thermocellum exo-beta-1,3-galactanase CtGH43 and Ruminococcus champanellensis arabinanase Ara43A. This uncharacterized glycosyl hydrolase family 43 (GH43) subgroup belongs to a subgroup which includes characterized enzymes with exo-beta-1,3-galactanase (EC 3.2.1.145, also known as galactan 1,3-beta-galactosidase) activity such as Clostridium thermocellum (Ct1,3Gal43A or CtGH43) and Phanerochaete chrysosporium 1,3Gal43A (Pc1, 3Gal43A), and arabinanase (EC 3.2.1.99) activity such as Ruminococcus champanellensis Ara43A. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	285
350147	cd18826	GH43_CtGH43-like	Glycosyl hydrolase family 43 protein similar to Clostridium thermocellum exo-beta-1,3-galactanase CtGH43 and Ruminococcus champanellensis arabinanase Ara43A. This uncharacterized glycosyl hydrolase family 43 (GH43) subgroup belongs to a subgroup which includes characterized enzymes with exo-beta-1,3-galactanase (EC 3.2.1.145, also known as galactan 1,3-beta-galactosidase) activity such as Clostridium thermocellum (Ct1,3Gal43A or CtGH43) and Phanerochaete chrysosporium 1,3Gal43A (Pc1, 3Gal43A), and arabinanase (EC 3.2.1.99) activity such as Ruminococcus champanellensis Ara43A. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	269
350148	cd18827	GH43_XlnD-like	Glycosyl hydrolase family 43 protein such as Aspergillus niger DMS1957 xylanase D (XlnD); includes mostly xylanases. This glycosyl hydrolase family 43 (GH43) subgroup includes enzymes that have mostly been annotated as xylanases (endo-1,4-beta-xylanase, EC 3.2.1.8). It belongs to the GH43_bXyl-like subgroup of the glycosyl hydrolase clan F (according to carbohydrate-active enzymes database (CAZY)) which includes family 43 (GH43) and 62 (GH62) families. The GH43_bXyl-like subgroup includes enzymes that have been annotated as xylan-digesting beta-xylosidases (EC 3.2.1.37) and xylanases, as well as the Bacteroides thetaiotaomicron VPI-5482 alpha-L-arabinofuranosidases (EC 3.2.1.55) (BT3675;BT_3675) and (BT3662;BT_3662). GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	277
350149	cd18828	GH43_BT3675-like	Glycosyl hydrolase family 43 protein such as Bacteroides thetaiotaomicron VPI-5482 alpha-L-arabinofuranosidases (BT3675;BT_3675). This glycosyl hydrolase family 43 (GH43) subgroup includes the Bacteroides thetaiotaomicron VPI-5482 alpha-L-arabinofuranosidases (EC 3.2.1.55) (BT3675;BT_3675) and (BT3662;BT_3662). It belongs to the GH43_bXyl subgroup of the glycosyl hydrolase clan F (according to carbohydrate-active enzymes database (CAZY)) which includes family 43 (GH43) and 62 (GH62) families. The GH43_bXyl subgroup also includes enzymes annotated as having xylan-digesting beta-xylosidase (EC 3.2.1.37) and xylanase (endo-1,4-beta-xylanase, EC 3.2.1.8) activities. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. Many GH43 enzymes display both alpha-L-arabinofuranosidase and beta-D-xylosidase activity using aryl-glycosides as substrates. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	283
350150	cd18829	GH43_BsArb43A-like	Glycosyl hydrolase family 43 protein such as Bacillus subtilis subsp. subtilis str. 168 endo-alpha-1,5-L-arabinanase Arb43A. This glycosyl hydrolase family 43 (GH43) subgroup includes mostly enzymes annotated as having endo-alpha-L-arabinanase (ABN; EC 3.2.1.99) activities, and includes Bacillus subtilis subsp. subtilis str. 168 endo-alpha-1,5-L-arabinanase (AbnA;BSU28810) (Arb43A). It belongs to the glycosyl hydrolase clan F (according to carbohydrate-active enzymes database (CAZY)) which includes family 43 (GH43) and 62 (GH62) families. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. The GH43 ABN enzymes hydrolyze alpha-1,5-L-arabinofuranoside linkages while the arabinofuranosidase (ABF; EC 3.2.1.55) enzymes cleave arabinose side chains so that the combined actions of these two enzymes reduce arabinan to L-arabinose and/or arabinooligosaccharides. Many of these enzymes such as the Bacillus subtilis arabinanase Abn2, that hydrolyzes sugar beet arabinan (branched), linear alpha-1,5-L-arabinan and pectin, are different from other arabinases; they are organized into two different domains with a divalent metal cluster close to the catalytic residues to guarantee the correct protonation state of the catalytic residues and consequently the enzyme activity. These arabinan-degrading enzymes are important in the food industry for efficient production of L-arabinose from agricultural waste; L-arabinose is often used as a bioactive sweetener. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	273
350151	cd18830	GH43_CjArb43A-like	Glycosyl hydrolase family 43 protein such as Cellvibrio japonicus Ueda107 endo-alpha-1,5-L-arabinanase / exo-alpha-1,5-L-arabinanase 43A (ArbA;CJA_0805) (Arb43A). This glycosyl hydrolase family 43 (GH43) subgroup includes mostly enzymes annotated with alpha-L-arabinofuranosidase (ABF; EC 3.2.1.55) and endo-alpha-L-arabinanase (ABN; EC 3.2.1.99) activities, and includes the bifunctional Cellvibrio japonicus Ueda107 endo-alpha-1,5-L-arabinanase / exo-alpha-1,5-L-arabinanase 43A (ArbA;CJA_0805) (Arb43A). It belongs to the glycosyl hydrolase clan F (according to carbohydrate-active enzymes database (CAZY)) which includes family 43 (GH43) and 62 (GH62) families. GH43 are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. The GH43 ABN enzymes hydrolyze alpha-1,5-L-arabinofuranoside linkages while the ABF enzymes cleave arabinose side chains so that the combined actions of these two enzymes reduce arabinan to L-arabinose and/or arabinooligosaccharides. Many of these enzymes such as the Bacillus subtilis arabinanase Abn2, that hydrolyzes sugar beet arabinan (branched), linear alpha-1,5-L-arabinan and pectin, are different from other arabinases; they are organized into two different domains with a divalent metal cluster close to the catalytic residues to guarantee the correct protonation state of the catalytic residues and consequently the enzyme activity. These arabinan-degrading enzymes are important in the food industry for efficient production of L-arabinose from agricultural waste; L-arabinose is often used as a bioactive sweetener. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	291
350152	cd18831	GH43_AnAbnA-like	Glycosyl hydrolase family 43 protein such as Aspergillus niger endo-alpha-L-arabinanase (AbnA). This glycosyl hydrolase family 43 (GH43) subgroup includes characterized enzymes with endo-alpha-L-arabinanase (ABN; EC 3.2.1.99) activities such as Aspergillus niger AbnA, Aspergillus niveus AbnA, and Chrysosporium lucknowense Abn1. It belongs to the GH43_Arb43a subgroup of the glycosyl hydrolase clan F (according to carbohydrate-active enzymes database (CAZY)) which includes family 43 (GH43) and 62 (GH62) families. GH43_Arb43a subgroup includes mostly enzymes with alpha-L-arabinofuranosidase (ABF; EC 3.2.1.55) and endo-alpha-L-arabinanase activities. These are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. The GH43 ABN enzymes hydrolyze alpha-1,5-L-arabinofuranoside linkages while the ABF enzymes cleave arabinose side chains so that the combined actions of these two enzymes reduce arabinan to L-arabinose and/or arabinooligosaccharides. The GH43_Arb43a subgroup includes many enzymes such as Bacillus subtilis arabinanase Abn2, that hydrolyzes sugar beet arabinan (branched), linear alpha-1,5-L-arabinan and pectin, and are different from other arabinases; they are organized into two different domains with a divalent metal cluster close to the catalytic residues to guarantee the correct protonation state of the catalytic residues and consequently the enzyme activity. These arabinan-degrading enzymes are important in the food industry for efficient production of L-arabinose from agricultural waste; L-arabinose is often used as a bioactive sweetener. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	286
350153	cd18832	GH43_GsAbnA-like	Glycosyl hydrolase family 43 protein such as Geobacillus stearothermophilus endo-alpha-1,5-L-arabinanase AbnA. This glycosyl hydrolase family 43 (GH43) subgroup includes mostly enzymes with alpha-L-arabinofuranosidase (ABF; EC 3.2.1.55) and endo-alpha-L-arabinanase (ABN; EC 3.2.1.99) activities. It includes Geobacillus stearothermophilus T-6 NCIMB 40222 AbnA, Bacillus subtilis subsp. subtilis str. 168 (Abn2;YxiA;J3A;BSU39330) (Arb43B), and Thermotoga petrophila RKU-1 (AbnA;TpABN;Tpet_0637). These are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. The GH43 ABN enzymes hydrolyze alpha-1,5-L-arabinofuranoside linkages while the ABF enzymes cleave arabinose side chains so that the combined actions of these two enzymes reduce arabinan to L-arabinose and/or arabinooligosaccharides. Many of these enzymes are different from other arabinases; they are organized into two different domains with a divalent metal cluster close to the catalytic residues to guarantee the correct protonation state of the catalytic residues and consequently the enzyme activity. These arabinan-degrading enzymes are important in the food industry for efficient production of L-arabinose from agricultural waste; L-arabinose is often used as a bioactive sweetener. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	332
350154	cd18833	GH43_PcXyl-like	Glycosyl hydrolase family 43 protein such as the bifunctional Phanerochaete chrysosporium xylosidase/arabinofuranosidase (Xyl;PcXyl). This glycosyl hydrolase family 43 (GH43) subgroup includes Phanerochaete chrysosporium BKM-F-1767 Xyl, a characterized bifunctional enzyme with beta-1,4-xylosidase (beta-D-xylosidase;xylan 1,4-beta-xylosidase; EC 3.2.1.37)/ alpha-L-arabinofuranosidase (EC 3.2.1.55) activities. This subgroup belongs to the GH43_XybB subgroup of the glycosyl hydrolase clan F (according to carbohydrate-active enzymes database (CAZY)) which includes family 43 (GH43) and 62 (GH62) families. The GH43_XybB subgroup includes enzymes having beta-1,4-xylosidase and alpha-L-arabinofuranosidase activities. Beta-1,4-xylosidases are part of an array of hemicellulases that are involved in the final breakdown of plant cell-wall whereby they degrade xylan. They hydrolyze beta-1,4 glycosidic bonds between two xylose units in short xylooligosaccharides. These are inverting enzymes (i.e. they invert the stereochemistry of the anomeric carbon atom of the substrate) that have an aspartate as the catalytic general base, a glutamate as the catalytic general acid and another aspartate that is responsible for pKa modulation and orienting the catalytic acid. The GH43_XybB subgroup includes Bacteroides ovatus alpha-L-arabinofuranosidases, BoGH43A and BoGH43B, both having a two-domain architecture, consisting of an N-terminal 5-bladed beta-propeller domain harboring the catalytic active site, and a C-terminal beta-sandwich domain. However, despite significant functional overlap between these two enzymes, BoGH43A and BoGH43B share just 41% sequence identity. The latter appears to be significantly less active on the same substrates, suggesting that these paralogs may play subtly different roles during the degradation of xyloglucans from different sources, or may function most optimally at different stages in the catabolism of xyloglucan oligosaccharides (XyGOs), for example before or after hydrolysis of certain side-chain moieties. A common structural feature of GH43 enzymes is a 5-bladed beta-propeller domain that contains the catalytic acid and catalytic base. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	292
380671	cd18892	TET	oxygenase domain of ten-eleven translocation (TET)1, TET2, and TET3 methylcytosine dioxygenases and similar proteins. TET proteins are involved in DNA demethylation through iteratively oxidizing 5-methylcytosine (5mC) into 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC). TET proteins contain a C-terminal catalytic domain which consists of a cysteine-rich region and a double-stranded beta-helix (DSBH) fold. Alterations in TET protein function have been linked to cancer, and TETs influence many cell differentiation processes. TET family genes have been implicated as tumor suppressors, for example mutations/deletions of the TET2 gene frequently occur in multiple spectra of myeloid malignancies. TET3 acts as a suppressor of ovarian cancer by demethylating the miR-30d precursor gene promoter to block TGF-beta1 induced epithelial-mesenchymal transition (EMT).  TET3 (and TET2) promoters are silenced in melanoma cells by mechanisms triggered by TGF-beta and mediated by DNA methyltransferase 3 alpha (DNMT3A). TET genes are downregulated in endometriosis. TET proteins belong to the TET/JBP family of dioxygenases that require Fe2+ and alpha-ketoglutarate (also known as 2-oxoglutarate) for activity.	398
380672	cd18893	TET-like	oxygenase domain of ten-eleven translocation (TET)-like proteins such as Naegleria gruberi Tet-like protein (NgTet1) and similar proteins. Naegleria gruberi Tet1 can catalyze the iterative oxidation of both 5-methylcytosine (5mC) and thymidine (T) on various DNA forms. Like mammalian TETs, it catalyzes the oxidation of 5mC to 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC) in three consecutive, Fe2+ and alpha-ketoglutarate (also known as 2-oxoglutarate, 2-OG)-dependent oxidation reactions. Like JBP1 and JBP2, NgTet1 can perform T-oxidation to form 5-hydroxymethyluridine (5hmU), but in addition it can catalyze the formation of 5-formyluridine (5fU) and 5-carboxyluridine (5caU). This family belongs to the TET/JBP family of dioxygenases that require Fe2+ and alpha-ketoglutarate (also known as 2-oxoglutarate) for activity.	243
380673	cd18894	JBP-like	oxygenase domain of J-binding protein (JBP) 1 and JBP2 thymidine hydroxylases and similar proteins, including uncharacterized bacterial and phage proteins. J binding protein (JBP) 1 and JBP2 catalyze the first step of base J biosynthesis: the hydroxylation of thymine in DNA to form 5-hydroxymethyluracil (hmU). Base J (beta-d-glucopyranosyloxymethyluracil) is a hyper-modified DNA base found in the DNA of kinetoplastids (Trypanosoma brucei, Trypanosoma cruzi, and Leishmania). JBP1 and JBP2 each contain a J-DNA binding domain and this oxygenase domain. They belong to the TET/JBP family of dioxygenases that require Fe2+ and alpha-ketoglutarate (also known as 2-oxoglutarate) for activity.	250
380674	cd18895	TET1	oxygenase domain of ten-eleven translocation (TET)1 methylcytosine dioxygenase and similar proteins. TET1 is involved in DNA demethylation through iteratively oxidizing 5-methylcytosine (5mC) into 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC). Human TET1 (and TET2) are more active on 5mC-DNA than 5hmC/5fC-DNA substrates. TET proteins contain a C-terminal catalytic domain which consists of a cysteine-rich region and a double-stranded beta-helix (DSBH) fold. TET1 plays multiple roles in tumor development and progression. TET1 serves as a tumor suppressor gene; loss of TET1 is associated with tumorigenesis and can be used as a potential biomarker for cancer therapy. In addition to its dioxygenase activity, it can induce epithelial-mesenchymal transition and act as a coactivator to regulate gene transcription. The regulation of TET1 is also correlated with microRNA in a posttranscriptional modification process. TET1 belongs to the TET/JBP family of dioxygenases that require Fe2+ and alpha-ketoglutarate (also known as 2-oxoglutarate) for activity.	410
380675	cd18896	TET2	oxygenase domain of ten-eleven translocation (TET)2 methylcytosine dioxygenase and similar proteins. TET2 is involved in DNA demethylation through iteratively oxidizing 5-methylcytosine (5mC) into 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC). Human TET2 (and TET1) have been shown to be more active on 5mC-DNA than 5hmC/5fC-DNA substrates. TET proteins contain a C-terminal catalytic domain which consists of a cysteine-rich region and a double-stranded beta-helix (DSBH) fold. TET2 acts as a tumor suppressor in hematopoiesis; mutations/deletions of the TET2 gene frequently occur in multiple spectra of myeloid malignancies. TET2 (and TET3) promoters are silenced in melanoma cells by mechanisms triggered by TGF-beta and mediated by DNA methyltransferase 3 alpha (DNMT3A), which play a functional role in the epithelial-mesenchymal transition process and metastasis. In addition, TET2 (and TET3) may be guardians of regulatory T cell stability and immune homeostasis. TET2 belongs to the TET/JBP family of dioxygenases that require Fe2+ and alpha-ketoglutarate (also known as 2-oxoglutarate) for activity.	434
380676	cd18897	TET3	oxygenase domain of ten-eleven translocation (TET)3 methylcytosine dioxygenase and similar proteins. TET3 is involved in DNA demethylation through iteratively oxidizing 5-methylcytosine (5mC) into 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC). TET proteins contain a C-terminal catalytic domain which consists of a cysteine-rich region and a double-stranded beta-helix (DSBH) fold. TET3 serves as a tumor suppressor; it acts as a suppressor of ovarian cancer by demethylating the miR-30d precursor gene promoter to block TGF-beta1 induced epithelial-mesenchymal transition (EMT). TET3 (and TET2) promoters are silenced in melanoma cells by mechanisms triggered by TGF-beta and mediated by DNA methyltransferase 3 alpha (DNMT3A), which play a functional role in the EMT process and metastasis. In addition, TET3 (and TET2) may be guardians of regulatory T cell stability and immune homeostasis. TET3 has been shown to prevent terminal differentiation of adult neural stem cells by a mechanism involving direct binding of TET3 to, and repression of, the imprinted gene Snrpn. TET3 has also been shown to mediate the activation of hepatic stellate cells via modulation of the long non-coding RNA HIF1A-AS1 expression. TET3 belongs to the TET/JBP family of dioxygenases that require Fe2+ and alpha-ketoglutarate (also known as 2-oxoglutarate) for activity.	452
381478	cd18908	bHLH_SOHLH1_2	basic helix-loop-helix (bHLH) domain found in the spermatogenesis- and oogenesis-specific basic helix-loop-helix-containing protein (SOHLH) family. The SOHLH family includes two bHLH transcription factors, SOHLH1 and SOHLH2. They are specifically expressed in spermatogonia and oocytes and are essential for early spermatogonial and oocyte differentiation.	59
381479	cd18909	bHLH_TCFL5	basic helix-loop-helix (bHLH) domain found in transcription factor-like 5 protein (TCFL5) and similar proteins. TCFL5, also termed Cha transcription factor, or HPV-16 E2-binding protein 1 (E2BP-1), is a bHLH transcription factor that plays a crucial role in spermatogenesis. It regulates cell proliferation or differentiation of cells through binding to a specific DNA sequence like other bHLH molecules.	60
381480	cd18910	bHLHzip_USF3	basic Helix-Loop-Helix-zipper (bHLHzip) domain found in basic helix-loop-helix domain-containing protein USF3 and similar proteins. USF3, also termed upstream transcription factor 3, is a bHLHzip protein that is involved in the negative regulation of epithelial-mesenchymal transition, the process by which epithelial cells lose their polarity and adhesion properties to become mesenchymal cells with enhanced migration and invasive properties.	65
381481	cd18911	bHLHzip_MGA	basic Helix-Loop-Helix-zipper (bHLHzip) domain found in MAX gene-associated protein (MGA) and similar proteins. MGA, also termed MAX dimerization protein 5 (MAD5), is a dual specificity T-box/ bHLHzip transcription factor that regulates the expression of both Max-network and T-box family target genes. It contains a Myc-like bHLHZip motif and requires heterodimerization with Max for binding to the preferred Myc-Max-binding site CACGTG. In addition to the bHLHZip domain, MGA harbors a second DNA-binding domain, the T-box or T-domain. It thus binds the preferred Brachyury-binding sequence and represses transcription of reporter genes containing promoter-proximal Brachyury-binding sites.	65
381482	cd18912	bHLH_TS_bHLHa9	basic helix-loop-helix (bHLH) domain found in Class A basic helix-loop-helix protein 9 (bHLHa9) and similar proteins. bHLHa9, also termed Class F basic helix-loop-helix factor 42 (bHLHf42), is a bHLH transcription factor that plays an essential role in limb development.	63
381483	cd18913	bHLH-O_hairy_like	basic helix-loop-helix-orange (bHLH-O) domain found in Drosophila melanogaster protein hairy, protein deadpan and similar proteins. Protein hairy is a bHLH transcriptional repressor of genes that require a bHLH-O protein for their transcription. It acts as a pair-rule protein that regulates embryonic segmentation and adult bristle patterning. Protein deadpan is closely related to the product of the segmentation gene hairy. It is a direct target of Notch signaling and regulates neuroblast self-renewal in Drosophila.	67
381484	cd18914	bHLH_AtORG2_like	basic helix-loop-helix (bHLH) domain found in Arabidopsis thaliana OBP3-responsive gene 2 (ORG2), 3 (ORG3) and similar proteins. The family includes ORG2 (also termed AtbHLH38, or EN 8) and ORG3 (also termed AtbHLH39, or EN 9), both of which act as bHLH transcription factors.	77
381485	cd18915	bHLH_AtLHW_like	basic helix-loop-helix (bHLH) domain found in Arabidopsis thaliana protein LONESOME HIGHWAY (LHW) and similar proteins. The family includes several bHLH transcription factors from Arabidopsis thaliana, such as LHW, and EMB1444. LHW, also termed AtbHLH156, or bHLH delta, is a bHLH transcription activator that regulates root development and promotes the production of stele cells in roots. It coordinately controls the number of all vascular cell types by regulating the size of the pool of cells from which they arise. EMB1444, also termed AtbHLH169, or lonesome highway-like protein 1, or protein embryo defective 1444, may regulate root development.	71
381486	cd18916	bHLH-O_ESM5_like	basic helix-loop-helix-orange (bHLH-O) domain found in Drosophila melanogaster Enhancer of split proteins, E(spl)m5, E(spl)m8 and similar proteins. The family includes two bHLH-O transcriptional repressors, E(spl)m5 and E(spl)m8, which participate in the control of cell fate choice by uncommitted neuroectodermal cells in the embryo. They bind DNA on N-box motifs, 5'-CACNAG-3'.	59
381487	cd18917	bHLH_AtSAC51_like	basic helix-loop-helix (bHLH) domain found in Arabidopsis thaliana suppressor of acaulis 51 (SAC51) and similar proteins. SAC51, also termed AtbHLH142, or EN 128, is a bHLH transcription factor that is involved in stem elongation, probably by regulating a subset of genes involved in this process.	53
381488	cd18918	bHLH_AtMYC1_like	basic Helix-Loop-Helix (bHLH) domain found in Arabidopsis thaliana MYC1 and similar proteins. MYC1, also termed AtbHLH12, or EN 58, acts as a transcription activator, when associated with MYB75/PAP1 or MYB90/PAP2.	70
381489	cd18919	bHLH_AtBPE_like	basic helix-loop-helix (bHLH) domain found in Arabidopsis thaliana BIG PETAL (BPE) and similar proteins. The family includes several bHLH transcription factors from Arabidopsis thaliana, such as BPE, HBI1 and BEE proteins (BEE1-3). BPE, also termed AtbHLH31, or EN 88, is involved in the control of Arabidopsis petal size, by interfering with postmitotic cell expansion to limit final petal cell size. HBI1, also termed AtbHLH64, or homolog of bee2 interacting with IBH1, or EN 79, is an atypical bHLH transcription factor that acts as positive regulator of cell elongation downstream of multiple external and endogenous signals by direct binding to the promoters and activation of the two expansin genes EXPA1 and EXPA8, encoding cell wall loosening enzymes. BEEs, also termed protein Brassinosteroid enhanced expression, are positive regulators of brassinosteroid signaling.	86
381490	cd18920	bHLH-O_HEY2	basic helix-loop-helix-orange (bHLH-O) domain found in hairy/enhancer-of-split related with YRPW motif protein 2 (HEY2) and similar proteins. HEY2, also termed cardiovascular helix-loop-helix factor 1 (CHF-1), or Class B basic helix-loop-helix protein 32 (bHLHb32), or HES-related repressor protein 2, or hairy and enhancer of split-related protein 2 (HESR-2), or hairy-related transcription factor 2 (HRT-2), or protein gridlock homolog, is a bHLH-O transcriptional repressor expressed preferentially in the developing and adult cardiovascular system. As a downstream effector of Notch signaling, HEY2 may be required for cardiovascular development.  It also plays an important role in neurologic development, as well as in the progression of human cancers.	82
381491	cd18921	bHLHzip_SREBP1	basic Helix-Loop-Helix-zipper (bHLHzip) domain found in sterol regulatory element-binding protein 1 (SREBP1) and similar proteins. SREBP1, also termed Class D basic helix-loop-helix protein 1 (bHLHd1), or sterol regulatory element-binding transcription factor 1 (SREBF1), is a member of a family of bHLHzip transcription factors that recognize sterol regulatory element 1 (SRE-1). It acts as a transcriptional activator required for lipid homeostasis. It may control transcription of the low-density lipoprotein receptor gene as well as the fatty acid. SREBP1 has dual sequence specificity binding to both an E-box motif (5'-ATCACGTGA-3') and to SRE-1 (5'-ATCACCCCAC-3').	75
381492	cd18922	bHLHzip_SREBP2	basic Helix-Loop-Helix-zipper (bHLHzip) domain found in sterol regulatory element-binding protein 2 (SREBP2) and similar proteins. SREBP2, also termed Class D basic helix-loop-helix protein 2 (bHLHd2), or sterol regulatory element-binding transcription factor 2 (SREBF2), is a member of a family of bHLHzip transcription factors that recognize sterol regulatory element 1 (SRE-1). It acts as a transcription activator of cholesterol biosynthesis.	77
381493	cd18923	bHLHzip_USF2	basic Helix-Loop-Helix-zipper (bHLHzip) domain found in upstream stimulatory factor 2 (USF2) and similar proteins. USF2, also termed Class B basic helix-loop-helix protein 12 (bHLHb12), or major late transcription factor 2, or FOS-interacting protein (FIP), or upstream transcription factor 2, is a bHLHzip transcription factor that binds to a symmetrical DNA sequence (E-boxes) (5'-CACGTG-3') that is found in a variety of viral and cellular promoters.	80
381494	cd18924	bHLHzip_USF1	basic Helix-Loop-Helix-zipper (bHLHzip) domain found in upstream stimulatory factor 1 (USF1) and similar proteins. USF1, also termed Class B basic helix-loop-helix protein 11 (bHLHb11), or major late transcription factor 1, is a bHLHzip transcription factor that binds to a symmetrical DNA sequence (E-boxes) (5'-CACGTG-3') that is found in a variety of viral and cellular promoters. It is ubiquitously expressed and involved in the transcription activation of various functional genes implicated in lipid and glucose metabolism, stress response, immune response, cell cycle control and tumour suppression. USF-1 recruits chromatin remodeling enzymes and interacts with co-activators and the members of the transcription pre-initiation complex. Genetic polymorphisms of USF1 are associated with some metabolic and cardiovascular diseases, like diabetes, atherosclerosis, coronary artery calcifications and familial combined hyperlipidaemia (FCHL).	65
381495	cd18925	bHLHzip_TFEC	basic Helix-Loop-Helix-zipper (bHLHzip) domain found in transcription factor EC (TFEC) and similar proteins. TFEC, also termed Class E basic helix-loop-helix protein 34 (bHLHe34), or transcription factor EC-like (TFEC-L), is a bHLHzip transcriptional regulator that acts as a repressor or an activator and regulates gene expression in macrophages. It plays an important role in the niche to expand hematopoietic progenitors through the modulation of several cytokines.	85
381496	cd18926	bHLHzip_MITF	basic Helix-Loop-Helix-zipper (bHLHzip) domain found in microphthalmia-associated transcription factor (MITF) and similar proteins. MITF, also termed Class E basic helix-loop-helix protein 32 (bHLHe32), is a bHLHzip transcription factor that is involved in neural crest melanocytes development as well as the pigmented retinal epithelium. It regulates the expression of genes with essential roles in cell differentiation, proliferation and survival. It binds to M-boxes (5'-TCATGTG-3') and symmetrical DNA sequences (E-boxes) (5'-CACGTG-3') found in the promoters of target genes, such as BCL2 and tyrosinase (TYR).	104
381497	cd18927	bHLHzip_TFEB	basic Helix-Loop-Helix-zipper (bHLHzip) domain found in transcription factor EB (TFEB) and similar proteins. TFEB, also termed Class E basic helix-loop-helix protein 35 (bHLHe35), is a bHLHzip transcription factor that is required for vascularization of the mouse placenta. It specifically recognizes and binds E-box sequences (5'-CANNTG-3'). Its efficient DNA-binding requires dimerization with itself or with another MiT/TFE family member such as TFE3 or MITF.	91
381498	cd18928	bHLHzip_TFE3	basic Helix-Loop-Helix-zipper (bHLHzip) domain found in transcription factor E3 (TFE3) and similar proteins. TFE3, also termed Class E basic helix-loop-helix protein 33 (bHLHe33), is a bHLHzip transcription factor that is involved in B cell function. It specifically recognizes and binds E-box sequences (5'-CANNTG-3'). Its efficient DNA-binding requires dimerization with itself or with another MiT/TFE family member such as TFEB or MITF.	91
381499	cd18929	bHLHzip_Mad4	basic Helix-Loop-Helix-zipper (bHLHzip) domain found in Max-associated protein 4 (Mad4) and similar proteins. Mad4, also termed Max dimerization protein 4, or Max dimerizer 4 (MXD4), or Class C basic helix-loop-helix protein 12 (bHLHc12), or Max-interacting transcriptional repressor MAD4, is a bHLHZip Max-interacting transcriptional repressor that suppresses c-myc dependent transformation and is expressed during neural and epidermal differentiation. It is regulated by a transcriptional repressor complex that contains Miz-1 and c-Myc.	88
381500	cd18930	bHLHzip_MXI1	basic Helix-Loop-Helix-zipper (bHLHzip) domain found in Max-interacting protein 1 (MXI1) and similar proteins. MXI1, also termed Max interactor 1, or Class C basic helix-loop-helix protein 11 (bHLHc11), is a bHLHZip transcriptional repressor that binds with MAX to form a sequence-specific DNA-binding protein complex which recognizes the core sequence 5'-CAC[GA]TG-3'. It thus antagonizes MYC transcriptional activity by competing for MAX. It plays an important role in the regulation of cell proliferation.	80
381501	cd18931	bHLHzip_Mad1	basic Helix-Loop-Helix-zipper (bHLHzip) domain found in Max-associated protein 1 (Mad1) and similar proteins. Mad1, also termed Max dimerization protein 1 (MXD1), or Max dimerizer 1, or protein MAD, is a bHLHZip transcriptional repressor that binds with MAX to form a sequence-specific DNA-binding protein complex which recognizes the core sequence 5'-CAC[GA]TG-3'. It thus antagonizes MYC transcriptional activity by competing for MAX.	80
381502	cd18932	bHLHzip_Mad3	basic Helix-Loop-Helix-zipper (bHLHzip) domain found in Max-associated protein 3 (Mad3) and similar proteins. Mad3, also termed Max dimerization protein 3, or Max dimerizer 3 (MXD3), or Class C basic helix-loop-helix protein 13 (bHLHc13), or Max-interacting transcriptional repressor MAD3, or Myx, is a bHLHZip Max-interacting transcriptional repressor that plays an important role in cellular proliferation. It suppresses c-myc dependent transformation and is expressed during neural and epidermal differentiation.	85
381503	cd18933	bHLH-O_HES3	basic helix-loop-helix-orange (bHLH-O) domain found in transcription factor HES-3 and similar proteins. HES-3, also termed Class B basic helix-loop-helix protein 43 (bHLHb43), or hairy and enhancer of split 3, is a bHLH-O transcription factor expressed in neural stem and progenitor cells that is involved in tissue regeneration. It regulates gene expression, cell growth, and insulin release. HES-3 is one mammalian counterpart of the Hairy and Enhancer of split proteins that play a critical role in many physiological processes including cellular differentiation, cell cycle arrest, apoptosis and self-renewal ability.	55
381504	cd18934	bHLH_TS_MRF4_Myf6	basic helix-loop-helix (bHLH) domain found in muscle-specific regulatory factor 4 (MRF4) and similar proteins. MRF4, also termed Class C basic helix-loop-helix protein 4 (bHLHc4), or myogenic factor 6 (Myf-6), is a bHLH transcription factor associated with myogenesis. It plays a role in skeletal muscle differentiation.	64
381505	cd18935	bHLH_TS_MYOG_Myf4	basic helix-loop-helix (bHLH) domain found in myogenin (MYOG) and similar proteins. MYOG, also termed Class C basic helix-loop-helix protein 3 (bHLHc3), or myogenic factor 4 (Myf-4), is a bHLH transcriptional activator that promotes transcription of muscle-specific target genes and plays a role in muscle differentiation, cell cycle exit and muscle atrophy.	59
381506	cd18936	bHLH_TS_MYOD1_Myf3	basic helix-loop-helix (bHLH) domain found in myoblast determination protein 1 (MYOD1) and similar proteins. MYOD1, also termed Class C basic helix-loop-helix protein 1 (bHLHc1), or myogenic factor 3 (Myf-3), is a bHLH transcriptional activator that promotes transcription of muscle-specific target genes and plays a role in muscle differentiation. Together with Myf-5 and MYOG, MYOD1 co-occupies muscle-specific gene promoter core region during myogenesis.	61
381507	cd18937	bHLH_TS_Myf5	basic helix-loop-helix (bHLH) domain found in myogenic factor 5 (Myf-5) and similar proteins. Myf-5, also termed Class C basic helix-loop-helix protein 2 (bHLHc2), is a nuclear bHLH transcriptional activator that promotes transcription of muscle-specific target genes and plays a role in muscle specification and differentiation. It also acts as an RNA-binding protein which enhances Ccnd1/Cyclin D1 mRNA translation during myogenesis.	64
381508	cd18938	bHLH_TS_Mesp	basic helix-loop-helix (bHLH) domain found in the mesoderm posterior protein (Mesp) family. Mesp, a bHLH tissue specific transcription factor, acts as a key regulator of the cardiovascular transcriptional network by inducing directly and/or indirectly the expression of the majority of key cardiovascular transcription factors. The Mesp family includes two bHLH transcription factors, Mesp1 and Mesp2. Mesp1, also termed Class C basic helix-loop-helix protein 5 (bHLHc5), promotes cardiovascular differentiation during embryonic development and embryonic stem cell differentiation. Mesp2, also termed Class C basic helix-loop-helix protein 6 (bHLHc6), plays an important role in somitogenesis.	65
381509	cd18939	bHLH_TS_Msgn1	basic helix-loop-helix (bHLH) domain found in mesogenin-1 (Msgn1) and similar proteins. Msgn1, also termed paraxial mesoderm-specific mesogenin1, or pMesogenin1 (pMsgn1), is a bHLH transcription factor required for maturation and segmentation of paraxial mesoderm. It may regulate the expression of T-box transcription factors essential for mesoderm formation and differentiation.	66
381510	cd18940	bHLH_TS_OLIG2	basic helix-loop-helix (bHLH) domain found in oligodendrocyte transcription factor 2 (Oligo2) and similar proteins. Oligo2, also termed Class B basic helix-loop-helix protein 1 (bHLHb1), or Class E basic helix-loop-helix protein 19 (bHLHe19), or protein kinase C-binding protein 2, or protein kinase C-binding protein RACK17, is a bHLH transcription factor that is required for oligodendrocyte and motor neuron specification in the spinal cord, as well as for the development of somatic motor neurons in the hindbrain. It cooperates with OLIG1 to establish the MN progenitors (pMN) domain of the embryonic neural tube.	85
381511	cd18941	bHLH_TS_OLIG3	basic helix-loop-helix (bHLH) domain found in oligodendrocyte transcription factor 3 (Oligo3) and similar proteins. Oligo3, also termed Class B basic helix-loop-helix protein 7 (bHLHb7), or Class E basic helix-loop-helix protein 20 (bHLHe20), is a bHLH transcription factor that is expressed in the ventricular zone of the dorsal alar plate of the hindbrain and involved in regulating the development of dorsal and ventral spinal cord. It may determine the distinct specification program of class A neurons in the dorsal part of the spinal cord and suppress specification of class B neurons.	81
381512	cd18942	bHLH_TS_OLIG1	basic helix-loop-helix (bHLH) domain found in oligodendrocyte transcription factor 1 (Oligo1) and similar proteins. Oligo1, also termed Class B basic helix-loop-helix protein 6 (bHLHb6), or Class E basic helix-loop-helix protein 21 (bHLHe21), is a bHLH transcription factor that promotes formation and maturation of oligodendrocytes, especially within the brain.	75
381513	cd18943	bHLH_E-protein_E47-like	basic helix-loop-helix (bHLH) domain found in transcription factor E47 and similar proteins. E47 is a class I bHLH transcriptional regulator that forms heterodimers with class II bHLH proteins to regulate distinct differentiation pathways. Its homodimers regulate B lymphocytes development.	74
381514	cd18944	bHLH_E-protein_E2A_TCF3	basic helix-loop-helix (bHLH) domain found in transcription factor E2-alpha (E2A) and similar proteins. E2A, also termed Class B basic helix-loop-helix protein 21 (bHLHb21), or immunoglobulin enhancer-binding factor E12/E47, or immunoglobulin transcription factor 1, or Kappa-E2-binding factor, or transcription factor 3 (TCF-3), or transcription factor ITF-1, is a bHLH transcriptional regulator involved in the initiation of neuronal differentiation.	74
381515	cd18945	bHLH_E-protein_TCF4_E2-2	basic helix-loop-helix (bHLH) domain found in transcription factor 4 (TCF-4) and similar proteins. TCF-4, also termed E2-2, or Class B basic helix-loop-helix protein 19 (bHLHb19), or immunoglobulin transcription factor 2 (ITF-2), or SL3-3 enhancer factor 2 (SEF-2), is a bHLH transcription factor that binds to the immunoglobulin enhancer Mu-E5/KE5-motif. It is involved in the initiation of neuronal differentiation.	85
381516	cd18946	bHLH_E-protein_TCF12_HEB	basic helix-loop-helix (bHLH) domain found in transcription factor 12 (TCF-12) and similar proteins. TCF-12, also termed HEB, or Class B basic helix-loop-helix protein 20 (bHLHb20), or DNA-binding protein HTF4, or E-box-binding protein, or transcription factor HTF-4, is a bHLH transcription factor that is involved in the initiation of neuronal differentiation.	83
381517	cd18947	bHLH-PAS_ARNT	basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in aryl hydrocarbon receptor nuclear translocator (ARNT) and similar proteins. ARNT, also termed Class E basic helix-loop-helix protein 2 (bHLHe2), or Dioxin receptor, nuclear translocator, or hypoxia-inducible factor 1-beta (HIF1b), or HIF-1-beta, or HIF1-beta, is a member of bHLH-PAS transcription regulators that acts as the heterodimeric partner for bHLH-PAS proteins such as aryl hydrocarbon receptor (AhR), hypoxia-inducible factor (HIF), and single-minded (SIM). These bHLH-PAS transcription complexes are involved in transcriptional responses to xenobiotic, hypoxia, and developmental pathways. Heterodimerization of bHLH-PAS proteins with ARNT is mediated by contacts between both the bHLH and the tandem PAS domains. ARNT uses bHLH and/or PAS domains to interact with several transcriptional coactivators. It is required for activity of the aryl hydrocarbon (dioxin) receptor.	65
381518	cd18948	bHLH-PAS_NCoA1_SRC1	basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in nuclear receptor coactivator 1 (NCoA-1) and similar proteins. NCoA-1, also termed Class E basic helix-loop-helix protein 74 (bHLHe74), or protein Hin-2, or RIP160, or renal carcinoma antigen NY-REN-52, or steroid receptor coactivator 1 (SRC-1), is a bHLH-PAS nuclear receptor coactivator that directly binds nuclear receptors and stimulates the transcriptional activities in a hormone-dependent fashion.	61
381519	cd18949	bHLH-PAS_NCoA3_SRC3	basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in nuclear receptor coactivator 3 (NCoA-3) and similar proteins. NCoA-3, also termed ACTR, or amplified in breast cancer 1 protein (AIB-1), or CBP-interacting protein (pCIP), or Class E basic helix-loop-helix protein 42 (bHLHe42), or receptor-associated coactivator 3 (RAC-3), or steroid receptor coactivator protein 3 (SRC-3), or thyroid hormone receptor activator molecule 1 (TRAM-1), is a bHLH-PAS steroid/nuclear receptor-associated coactivator that directly binds nuclear receptors and stimulates the transcriptional activities in a hormone-dependent fashion. It also plays a central role in creating a multisubunit coactivator complex, which probably acts via remodeling of chromatin.	73
381520	cd18950	bHLH-PAS_NCoA2_SRC2	basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in nuclear receptor coactivator 2 (NCoA-2) and similar proteins. NCoA-2, also termed Class E basic helix-loop-helix protein 75 (bHLHe75), or transcriptional intermediary factor 2 (TIF2), or steroid receptor coactivator 2 (SRC-2), or glucocorticoid receptor interacting protein-1 (GRIP1), is a bHLH-PAS transcriptional coactivator for steroid receptors and nuclear receptors. It is required with NCoA-1 to control energy balance between white and brown adipose tissues.	64
381521	cd18951	bHLH_TS_scleraxis	basic helix-loop-helix (bHLH) domain found in scleraxis and similar proteins. Scleraxis, also termed SCX, or Class A basic helix-loop-helix protein 41 (bHLHa41), or Class A basic helix-loop-helix protein 48 (bHLHa48), is a bHLH transcription factor that is expressed in sclerotome, limb bud, cranial and body wall mesenchyme, pericardium and heart valves, ligaments and tendons. It is required for the formation of tendons, ligaments, connective tissue, and the diaphragm, and for testis development. Scleraxis plays a central role in promoting fibroblast proliferation and matrix synthesis during the embryonic development of tendons.	68
381522	cd18952	bHLH_TS_HAND1	basic helix-loop-helix (bHLH) domain found in heart- and neural crest derivatives-expressed protein 1 (HAND1) and similar proteins. HAND1, also termed Class A basic helix-loop-helix protein 27 (bHLHa27), or extraembryonic tissues, heart, autonomic nervous system and neural crest derivatives-expressed protein 1 (eHAND), is a bHLH transcription factor that plays an essential role in both trophoblast-giant cells differentiation and in cardiac morphogenesis.	60
381523	cd18953	bHLH_TS_bHLHe23_bHLHb4	basic helix-loop-helix (bHLH) domain found in Class E basic helix-loop-helix protein 23 (bHLHe23) and similar proteins. bHLHe23, also termed Class B basic helix-loop-helix protein 4 (bHLHb4), is an OLIG-related bHLH transcription factor that is expressed in rod bipolar cells and is required for rod bipolar cell maturation. bHLHe23 has roles in spinal interneuron differentiation by mechanisms linked to the Notch signaling pathway. It modulates the expression of genes required for the differentiation and/or maintenance of pancreatic and neuronal cell types.	81
381524	cd18954	bHLH_TS_bHLHe22_bHLHb5	basic helix-loop-helix (bHLH) domain found in Class E basic helix-loop-helix protein 22 (bHLHe22) and similar proteins. bHLHe22, also termed Class B basic helix-loop-helix protein 5 (bHLHb5), or trinucleotide repeat-containing gene 20 protein, is an OLIG-related bHLH neural-specific transcriptional repressor that is expressed in both excitatory (unipolar brush cells) and inhibitory neurons (cartwheel cells) of the dorsal cochlear nucleus (DCN) during development. It is important for the proper development and/or survival of a number of neural cell types.	70
349736	cd18955	BTB_POZ_BACH	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in BTB and CNC homolog (BACH) proteins. This subfamily includes BACH1 (also called BTB-basic leucine zipper transcription factor 1), BACH2 (also called BTB-basic leucine zipper transcription factor 2), and similar proteins. They belong to the cap 'n' collar (CNC) and basic leucine zipper (bZIP) factor family. BACH1 is a heme-responsive transcriptional repressor of heme oxygenase (HO)-1. It represses genes involved in heme metabolism and oxidative stress response. BACH2 is a lymphoid-specific transcription factor with a prominent role in B-cell development. It is transcriptionally regulated by the BCR/ABL oncogene. It represses the anti-apoptotic factor heme oxygenase-1 (HO-1). Subfamily members contain a BTB domain and a basic leucine zipper (bZIP) domain. The BTB/POZ domain is a common protein-protein interaction motif of about 100 amino acids.	94
349737	cd18956	BTB_POZ_ZBTB42	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain found in zinc finger and BTB domain-containing protein 42 (ZBTB42). ZBTB42 is a transcriptional repressor that specifically binds DNA and probably acts by recruiting chromatin remodeling multiprotein complexes. It is enriched in skeletal muscles, especially at the neuromuscular junction. A ZBTB42 mutation has been identified to define a novel lethal congenital contracture syndrome (LCCS6), a lethal autosomal recessive form of arthrogryposis multiplex congenita (AMC). ZBTB42 contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	129
349316	cd18960	CD_HP1_like	chromodomain of heterochromatin protein 1 proteins, including HP1alpha, HP1beta, and HP1gamma; uncharacterized subgroup. CHRomatin Organization Modifier (chromo) domain of mammalian HP1alpha (Cbx5), HP1beta (Cbx1), HP1gamma (Cbx3), and similar proteins. HP1 has diverse functions in heterochromatin formation and impacts both gene expression and gene silencing. HP1 has two conserved protein-protein interaction domains, a single N-terminal chromodomain (CD) which can bind to histone proteins via methylated lysine residues, and a related C-terminal chromo shadow domain (CSD) which is responsible for the homodimerization and interaction with a number of chromatin-associated non-histone proteins; a flexible hinge region separates the CD and CSD and may bind nucleic acid. HP1 is a highly conserved non-histone chromosomal protein that is evolutionarily conserved from fission yeast to plants and animals. There are three human homologs of HP1 proteins: HP1alpha (also known as Cbx5), HP1beta (also known as Cbx1), and HP1gamma (also known as Cbx3).	51
349317	cd18961	CD_CEC-4_like	chromodomain of Caenorhabditis elegans chromodomain protein 4, and similar proteins. CHRomatin Organization Modifier (chromo) domain of Caenorhabditis elegans CEC-4, and similar proteins. CEC-4 is a perinuclear heterochromatin anchor; it mediates the anchoring of H3K9 methylation-bearing chromatin at the nuclear periphery in early to mid-stage embryos. It is necessary for anchoring, but does not affect transcriptional repression. CEC-4 contributes to the efficiency with which muscle differentiation is induced following ectopic expression of the master regulator, HLH-1 (MyoD in mammals). A chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and which appears to play a role in the functional organization of the eukaryotic nucleus. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain.	51
349318	cd18962	CD_MT_like	chromodomain of a putative Coemansia reversa NRRL 1564 methyltransferase, and similar proteins. This subgroup includes the CHROMO (CHRomatin Organization Modifier) domain found in a Coemansia reversa NRRL 1564 SET (Su(var)3-9, enhancer-of-zeste, trithorax) domain-containing protein, and similar proteins. The SU(VAR)3-9 protein is the main chromocenter-specific histone H3-K9 methyltransferase (HMTase) in Drosophila where it plays a role in heterochromatic gene silencing. A chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and which appears to play a role in the functional organization of the eukaryotic nucleus. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain.	52
349319	cd18963	chromodomain	CHROMO (CHRomatin Organization Modifier) domain; uncharacterized subgroup. The chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. Chromodomains belong to the chromo-like superfamily of SH3-fold-beta-barrel domains which includes chromo shadow domains and chromo barrel domains. Chromodomains differ from these in that they lack the first strand of the SH3-fold-beta-barrel. This first strand is altered by insertion in the chromo shadow domains, and chromo barrel domains are typical SH3-fold-beta-barrel domains with sequence similarity to the canonical chromo domain.	57
349320	cd18964	chromodomain	CHROMO (CHRomatin Organization Modifier) domain; uncharacterized subgroup. The chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. Chromodomains belong to the chromo-like superfamily of SH3-fold-beta-barrel domains which includes chromo shadow domains and chromo barrel domains. Chromodomains differ from these in that they lack the first strand of the SH3-fold-beta-barrel. This first strand is altered by insertion in the chromo shadow domains, and chromo barrel domains are typical SH3-fold-beta-barrel domains with sequence similarity to the canonical chromo domain.	54
349321	cd18965	chromodomain	CHROMO (CHRomatin Organization Modifier) domain; uncharacterized subgroup. The chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. Chromodomains belong to the chromo-like superfamily of SH3-fold-beta-barrel domains which includes chromo shadow domains and chromo barrel domains. Chromodomains differ from these in that they lack the first strand of the SH3-fold-beta-barrel. This first strand is altered by insertion in the chromo shadow domains, and chromo barrel domains are typical SH3-fold-beta-barrel domains with sequence similarity to the canonical chromo domain.	53
349322	cd18966	chromodomain	CHROMO (CHRomatin Organization Modifier) domain; uncharacterized subgroup. The chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. Chromodomains belong to the chromo-like superfamily of SH3-fold-beta-barrel domains which includes chromo shadow domains and chromo barrel domains. Chromodomains differ from these in that they lack the first strand of the SH3-fold-beta-barrel. This first strand is altered by insertion in the chromo shadow domains, and chromo barrel domains are typical SH3-fold-beta-barrel domains with sequence similarity to the canonical chromo domain.	49
349323	cd18967	chromodomain	CHROMO (CHRomatin Organization Modifier) domain; uncharacterized subgroup. The chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. Chromodomains belong to the chromo-like superfamily of SH3-fold-beta-barrel domains which includes chromo shadow domains and chromo barrel domains. Chromodomains differ from these in that they lack the first strand of the SH3-fold-beta-barrel. This first strand is altered by insertion in the chromo shadow domains, and chromo barrel domains are typical SH3-fold-beta-barrel domains with sequence similarity to the canonical chromo domain.	55
349324	cd18968	chromodomain	CHROMO (CHRomatin Organization Modifier) domain; uncharacterized subgroup. The chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. Chromodomains belong to the chromo-like superfamily of SH3-fold-beta-barrel domains which includes chromo shadow domains and chromo barrel domains. Chromodomains differ from these in that they lack the first strand of the SH3-fold-beta-barrel. This first strand is altered by insertion in the chromo shadow domains, and chromo barrel domains are typical SH3-fold-beta-barrel domains with sequence similarity to the canonical chromo domain.	57
349325	cd18969	chromodomain	CHROMO (CHRomatin Organization Modifier) domain; uncharacterized subgroup; for most members of this subgroup, the chromodomain is followed by a chromo shadow domain. The chromodomain is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain. Chromodomains belong to the chromo-like superfamily of SH3-fold-beta-barrel domains which includes chromo shadow domains and chromo barrel domains. Chromodomains differ from these in that they lack the first strand of the SH3-fold-beta-barrel. This first strand is altered by insertion in the chromo shadow domains, and chromo barrel domains are typical SH3-fold-beta-barrel domains with sequence similarity to the canonical chromo domain. For the majority of members of this subgroup, the chromodomain is followed by a chromo shadow domain (CSD).	56
349326	cd18970	CD_POL_like	chromodomain of Hypsizygus marmoreus TY3B-I_0 protein, and similar proteins. This subgroup includes the CHROMO (CHRromatin Organization Modifier) domain found in Hypsizygus marmoreus TY3B-I_0 protein, a putative TY3/gypsy retrotransposon polyprotein, and similar proteins. The pol gene in TY3/gypsy elements generally encodes domains in the following order: an aspartyl protease, a reverse transcriptase, RNase H, and an integrase, here the chromodomain is found at the C-terminus of the integrase domain. The chromodomain, is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain.	49
349327	cd18971	CD_POL_like	chromodomain of a Magnaporthe grisea putative retrotransposon polyprotein, and similar proteins. This subgroup includes the CHROMO (CHRromatin Organization Modifier) domain found in a Magnaporthe grisea putative retrotransposon polyprotein which includes domains in the following order: an aspartyl protease, a reverse transcriptase, RNase H, and an integrase, here the chromodomain is found at the C-terminus of the integrase domain. The chromodomain, is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain.	50
349328	cd18972	CD_POL_like	chromodomain of a Moniliophthora perniciosa FA553 putative retrotransposon polyprotein, and similar proteins. This subgroup includes the CHROMO (CHRromatin Organization Modifier) domain found in a Moniliophthora perniciosa FA553 putative retrotelement polyprotein, which includes domains in the following order: a reverse transcriptase, RNase H, and an integrase, here the chromodomain is found at the C-terminus of the integrase domain. The chromodomain, is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related "chromo shadow" domain	50
349329	cd18973	CD_Tf2-1_POL_like	chromodomain of Rhizoctonia solani AG-1 IB retrotransposable element Tf2 155 kDa protein type 1, and similar proteins. This subgroup includes the CHROMO (CHRromatin Organization Modifier) domain found in Rhizoctonia solani AG-1 IB retrotransposable element Tf2 155 kDa protein type 1 (Tf2-1), and similar proteins. It belongs to the Ty3/gypsy family of long terminal repeat (LTR) retrotransposons. The pol gene in TY3/gypsy elements generally encodes domains in the following order: an aspartyl protease, a reverse transcriptase, RNase H, and an integrase, here the chromodomain is found at the C-terminus of the integrase domain. The chromodomain, is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain.	50
349330	cd18974	CD_POL_like	chromodomain of Penicillium solitum protein PENSOL_c198G03123. This subgroup includes the CHROMO (CHRromatin Organization Modifier) domain found in Penicillium solitum protein PENSOL_c198G03123 a putative polyprotein from a Ty3/Gypsy long terminal repeat (LTR) retroelement. The pol gene in TY3/gypsy elements generally encodes domains in the following order: an aspartyl protease, a reverse transcriptase, RNase H, and an integrase, here the chromodomain is found at the C-terminus of the integrase domain. The chromodomain, is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain.	50
349331	cd18975	CD_MarY1_POL_like	chromodomain of Tricholoma matsutake polyprotein, and similar proteins. This subgroup includes the CHROMO (CHRromatin Organization Modifier) domain found in the polyprotein from the MarY1 Ty3/Gypsy long terminal repeat (LTR) retroelement from the ectomycorrhizal basidiomycete Tricholoma matsutake. The pol gene in TY3/gypsy elements generally encodes domains in the following order: prt-reverse transcriptase-RNase H-integrase, in marY1 POL the chromodomain is found at the C-terminus of the integrase domain. The chromodomain, is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain.	49
349332	cd18976	CD_POL_like	chromodomain of uncharacterized putative retroelement polyprotein proteins. This subgroup includes the CHROMO (CHRromatin Organization Modifier) domain found in  uncharacterized putative retrotransposon proteins, and similar proteins. The chromodomain, is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain.	51
349333	cd18977	CD_POL_like	chromodomain of a Rhizoctonia solani AG-3 Rhs1AP polyprotein, and similar proteins. This subgroup includes the CHROMO (CHRromatin Organization Modifier) domain found in a Rhizoctonia solani AG-3 Rhs1AP, a putative Ty3/Gypsy polyprotein/retrotransposon which includes a protease, a reverse transcriptase, a ribonuclease H, and an integrase domain, in that order, with a chromodomain at the C-terminus of the integrase domain. The chromodomain, is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain.	57
349334	cd18978	CD_DDE_transposase_like	chromodomain of Rhizopus microsporus putative DDE transposases, and similar proteins. This subgroup includes the CHROMO (CHRromatin Organization Modifier) domain found in Rhizopus microsporus putative DDE transposases, and similar proteins. The chromodomain, is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain.	52
349335	cd18979	CD_POL_like	chromodomain of a Zea maize putative metaviridae (gypsy-type) retrotransposon polyproteins (Z195D10.9), and similar proteins. This subgroup includes the CHROMO (CHRromatin Organization Modifier) domain found in Zea maize Z195D10.9 protein, and other putative TY3/gypsy retrotransposon polyproteins. The pol gene in TY3/gypsy elements generally encodes domains in the following order: an aspartyl protease, a reverse transcriptase, RNase H, and an integrase, here the chromodomain is found at the C-terminus of the integrase domain. The chromodomain, is a conserved region of about 50 amino acids, found in a variety of chromosomal proteins, and implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain.	48
349336	cd18980	CD_NC-like	chromodomain of a Trichosporon asahii var. asahii CBS 8904 retrotransposon nucleocapsid protein, and similar proteins. This subgroup includes the CHROMO (CHRromatin Organization Modifier) domain found in Trichosporon asahii var. asahii CBS 8904 retrotransposon nucleocapsid protein, and similar proteins. The chromodomain is implicated in the binding, of the proteins in which it is found, to methylated histone tails and maybe RNA. A chromodomain may occur as a single instance, in a tandem arrangement, or followed by a related chromo shadow domain.	56
349337	cd18981	CSD_HP1e_insect	chromo shadow domain of insect heterochromatin protein 1E. The chromo shadow domain (CSD) is always found in association with a related N-terminal chromo (CHRromatin Organization MOdifier) domain. CSD domains have only been found in proteins that also possess a chromodomain, while chromodomains can exist in isolation. CSDs are found for example in Drosophila and human heterochromatin protein (HP1) and mammalian modifier 1 and modifier 2. HP1 is a highly conserved non-histone chromosomal protein that is evolutionarily conserved from fission yeast to plants and animals. HP1 has two conserved protein-protein interaction domains, a single N-terminal chromodomain (CD) which can bind to histone proteins via methylated lysine residues, and a related C-terminal chromo shadow domain (CSD) which is responsible for the homodimerization and interaction with a number of chromatin-associated non-histone proteins; a flexible hinge region separates the CD and CSD and may bind nucleic acid. The HP1 CSD, in addition to interacting with various proteins bearing the PXVXL motif, also interacts with a region of histone H3 that bears the similar PXXVXL motif. There are three human homologs of HP1 proteins: HP1alpha (also known as Cbx5), HP1beta (also known as Cbx1), and HP1gamma (also known as Cbx3). The CSD domains of all three human HP1 homologs have similar affinities to the PXXVXL motif of histone H3.	53
349338	cd18982	CSD	chromo shadow domain; uncharacterized subgroup. The chromo shadow domain (CSD) is always found in association with a related N-terminal chromo (CHRromatin Organization MOdifier) domain. CSD domains have only been found in proteins that also possess a chromodomain, while chromodomains can exist in isolation. CSDs are found for example in Drosophila and human heterochromatin protein (HP1) and mammalian modifier 1 and modifier 2. HP1 is a highly conserved non-histone chromosomal protein that is evolutionarily conserved from fission yeast to plants and animals. HP1 has two conserved protein-protein interaction domains, a single N-terminal chromodomain (CD) which can bind to histone proteins via methylated lysine residues, and a related C-terminal chromo shadow domain (CSD) which is responsible for the homodimerization and interaction with a number of chromatin-associated non-histone proteins; a flexible hinge region separates the CD and CSD and may bind nucleic acid. The HP1 CSD, in addition to interacting with various proteins bearing the PXVXL motif, also interacts with a region of histone H3 that bears the similar PXXVXL motif. There are three human homologs of HP1 proteins: HP1alpha (also known as Cbx5), HP1beta (also known as Cbx1), and HP1gamma (also known as Cbx3). The CSD domains of all three human HP1 homologs have similar affinities to the PXXVXL motif of histone H3.	50
350846	cd18983	CBD_MSL3_like	chromo barrel domain of human male-specific lethal complex subunit 3, and similar proteins. This subgroup includes human male-specific lethal (MSL) complex subunit 3 (MSL3, also known as MSL3L1). The MSL3 chromodomain specifically recognizes the H4K20 monomethyl mark, in a DNA-dependent manner, and may be involved in chromosomal targeting of the MSL complex. Also included is MORF-related gene on chromosome 15 (MRG15, also known as MORF4L1) which specifically binds to Lys36-methylated histone H3 and plays a role in transcriptional regulation and in DNA repair. This subgroup also includes Arabidopsis thaliana Morf Related Gene 2 (MRG2) which acts as a H3K4me3/H3K36me3 reader involved in the regulation of Arabidopsis flowering. SH3-fold-beta-barrel domains of the chromo-like superfamily include chromodomains, chromo shadow domains and chromo barrel domains, and are implicated in the recognition of lysine-methylated histone tails and nucleic acids. The chromodomain differs, in that it lacks the first strand of the SH3-fold-beta-barrel. This first strand is altered by insertion in the chromo shadow domains, and chromo barrel domains are typical SH3-fold-beta-barrel domains with sequence similarity to the canonical chromo domain.	57
350847	cd18984	CBD_MOF_like	chromo barrel domain of Drosophila melanogaster males-absent on the first protein, and similar proteins. This subgroup includes the chromo barrel domain of Drosophila melanogaster males-absent on the first (MOF) protein. The histone H4 lysine 16 (H4K16)-specific acetyltransferase MOF is part of two distinct complexes involved in X chromosome dosage compensation and autosomal transcription regulation. Its chromobarrel domain is essential for H4K16 acetylation throughout the Drosophila genome and controls spreading of the male-specific lethal (MSL) complex on the X chromosome. SH3-fold-beta-barrel domains of the chromo-like superfamily include chromodomains, chromo shadow domains, and chromo barrel domains, and are implicated in the recognition of lysine-methylated histone tails and nucleic acids. The chromodomain differs, in that it lacks the first strand of the SH3-fold-beta-barrel. This first strand is altered by insertion in the chromo shadow domains, and chromo barrel domains are typical SH3-fold-beta-barrel domains with sequence similarity to the canonical chromodomain. The MOF-like chromo barrels may be auto-inhibited, i.e. they seem to have occluded peptide binding sites.	70
350848	cd18985	CBD_TIP60_like	chromo barrel domain of human tat-interactive protein 60, and similar proteins. Tat-interactive protein 60 (also known as KAT5 or HTATIP) catalyzes the acetylation of lysine side chains in various histone and nonhistone proteins, and in itself. It plays roles in multiple cellular processes including remodeling, transcription, DNA double-strand break repair, apoptosis, embryonic stem cell identity, and embryonic development. The TIP60 chromo barrel domain recognizes trimethylated lysine at site 9 of histone H3 (H3K9me3) which triggers TIP60 to acetylate and activate ataxia telangiectasia-mutated kinase, thereby promoting the DSB repair pathway. In a different study, the TIP60 chromo barrel domain was shown to bind H3K4me1, which stabilizes TIP60 recruitment to a subset of estrogen receptor alpha target genes, facilitating regulation of the associated gene transcription. SH3-fold-beta-barrel domains of the chromo-like superfamily include chromodomains, chromo shadow domains, and chromo barrel domains, and are implicated in the recognition of lysine-methylated histone tails and nucleic acids. The chromodomain differs, in that it lacks the first strand of the SH3-fold-beta-barrel. This first strand is altered by insertion in the chromo shadow domains, and chromo barrel domains are typical SH3-fold-beta-barrel domains with sequence similarity to the canonical chromodomain. This subgroup belongs to the MOF-like chromo barrels, which may be auto-inhibited, i.e. they seem to have occluded peptide binding sites.	64
350849	cd18986	CBD_ESA1_like	chromo barrel domain of yeast NuA4 histone acetyltransferase complex catalytic subunit ESA1, and similar proteins. The subgroup includes the chromo barrel domain of NuA4 histone acetyltransferase (HAT) complex catalytic subunit Esa1 (also known as Tas1 and Kat5). Yeast Esa1p acetylates specific histones nonrandomly in H4, H3, and H2A. Esa1 also plays roles in cell cycle progression. In addition, its chromo barrel domain plays a role in the yeast Piccolo NuA4 complex's ability to distinguish between histones and nucleosomes; however, the chromodomain is not required for the Piccolo to bind to nucleosomes. SH3-fold-beta-barrel domains of the chromo-like superfamily include chromodomains, chromo shadow domains, and chromo barrel domains, and are implicated in the recognition of lysine-methylated histone tails and nucleic acids. The chromodomain differs, in that it lacks the first strand of the SH3-fold-beta-barrel. This first strand is altered by insertion in the chromo shadow domains, and chromo barrel domains are typical SH3-fold-beta-barrel domains with sequence similarity to the canonical chromodomain. This subgroup belongs to the MOF-like chromo barrels, which may be auto-inhibited, i.e. they seem to have occluded peptide binding sites.	65
349788	cd18987	LGIC_ECD_anion	extracellular domain (ECD) of anionic Cys-loop neurotransmitter-gated ion channels. This family contains the extracellular domain (ECD) of anionic Cys-loop neurotransmitter-gated ion channels which include type-A gamma-aminobutyric acid receptor (GABAAR), glycine receptor (GlyR), invertebrate glutamate-gated chloride channel (GluCl), and histamine-gated chloride channel (HisCl). These neurotransmitter receptors directly mediate chloride permeability and constitute one half of the Cys-loop receptor family. Receptors in this family are composed of five either identical or homologous subunits, which generate diversity in functional profiles and pharmacological preferences. GABAAR and GlyR both mediate fast inhibitory synaptic transmission. Cl- ions are selectively conducted through the GABAAR receptor pore, resulting in hyperpolarization of the neuron. GluCl channels are found only in protostomia, but are closely related to mammalian glycine receptors (GlyRs). They have several roles in these invertebrates, including controlling locomotion and feeding, and mediating sensory inputs into behavior. Ligand-gated chloride channels are critical not only for maintaining appropriate neuronal activity, but have long been important therapeutic targets: benzodiazepines, barbiturates, some intravenous and volatile anaesthetics, alcohol, strychnine, picrotoxin, and ivermectin all derive their biological activity from acting on this inhibitory half of the Cys-loop receptor family. Many of the therapeutically useful compounds acting at Cys-loop receptors target an allosteric site. The sites in Cys-loop receptors at which these allosteric ligands bind and their structure-based mechanisms of action are largely unresolved.	185
349789	cd18988	LGIC_ECD_bact	extracellular domain of prokaryotic pentameric ligand-gated ion channels (pLGIC). This family contains the extracellular domain (ECD) of bacterial pentameric ligand-gated ion channels (pLGICs), including ones from Gloeobacter violaceus (GLIC) and Erwinia chrysanthemi (ELIC). These prokaryotic homologs of Cys-loop receptors have been useful in understanding their eukaryotic counterparts. The largely beta-sheet ECD in this family is similar to other pLGICs, but lacks the cysteine loop and an intracellular domain. While most pLGICs undergo desensitization on prolonged exposure to the agonist, GLIC is activated by protons, but does not desensitize, even at proton concentrations eliciting maximal electrophysiological response (pH 4.5). Studies show that GLIC activation is inhibited by most general anaesthetics at clinical concentrations, including xenon which has been used in clinical practice as a potent gaseous anesthetic for decades. Xenon binding sites have been identified in three distinct regions of the TMD: in a large intra-subunit cavity, in the pore, and at the interface between adjacent subunits.	182
349790	cd18989	LGIC_ECD_cation	extracellular domain (ECD) of cationic Cys-loop neurotransmitter-gated ion channels. This superfamily contains the extracellular domain (ECD) of cationic Cys-loop neurotransmitter-gated ion channels, which include nicotinic acetylcholine receptor (nAChR), serotonin 5-hydroxytryptamine receptor (5-HT3), and zinc-activated ligand-gated ion channel (ZAC) receptor. These ligand-gated ion channels (LGICs) are found across metazoans and have close homologs in bacteria. They are vital for communication throughout the nervous system. nAChR is a non-selective cation channel that is permeable to Na+ and K+, and some subunit combinations are also permeable to Ca2+. Na+ enters and K+ exits to allow net flow of positively charged ions inward. 5-HT3, a cation-selective channel, binds serotonin and is permeable to Na+, K+, and Ca2+. It mediates neuronal depolarization and excitation within the central and peripheral nervous systems. ZAC forms an ion channel gated by Zn2+, Cu2+, and H+ and is non-selectively permeable to monovalent cations. However, the role of ZAC in Zn2+, Cu2+, and H+ signaling is as yet unknown.	180
349791	cd18990	LGIC_ECD_GABAAR	gamma-aminobutyric acid receptor extracellular domain. This family contains extracellular domain (ECD) of type-A gamma-aminobutyric acid receptor (GABAAR), a member of the pentameric "Cys-loop" superfamily of transmitter-gated ion channels. This family includes 19 isoforms in human; six alpha, 3 beta, 3 gamma, one of delta, epsilon, pi, and theta, known to form heteropentameric GABAARs, and 3 rho subunits that only form homopentameric channels (also known as GABAA rho or GABAC receptor) or pseudoheteromeric if consisting of different rho subunits. The majority of GABAA receptor pentamers contain two alpha subunits, two beta subunits, and a gamma subunit, with different isoforms affecting potency of the neurotransmitter. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Upon gamma-aminobutyric acid (GABA) binding to its site on the ECD, Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. Benzodiazepine and barbiturates each bind to their own distinct sites on the ECD. The channels have to contain the gamma subunit and alpha subunits in order to respond to benzodiazepines. Specific combinations of alpha, beta, and gamma subunits exhibit ethanol sensitivity. All these major classes of drugs favor channel-opening. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy.	184
349792	cd18991	LGIC_ECD_GlyR	extracellular domain of glycine receptor (GlyR). This subfamily contains extracellular domain of glycine receptor (GlyR or GLR) of the amino acid neurotransmitter glycine. GlyR has four known isoforms of the alpha-subunit (alpha1-4, encoded by GLRA1, GLRA2, GLRA3, GLRA4) that are essential to bind ligands and a single beta-subunit (encoded by GLRB), all of which have been described to have a regionally and temporally controlled expression during development and maturation of the central nervous system (CNS). Functional chloride-permeable GlyR ion channels are formed by 5 alpha subunit homopentamers or by alpha and beta subunit heteropentamers, which form complexes with either a 2alpha-3beta or 3alpha-2beta stoichiometry. The receptor can be activated by glycine as well as beta-alanine and taurine, and can be selectively blocked by the high-affinity competitive antagonist strychnine. Caffeine is also a competitive antagonist of GlyR. In human, glycine receptor alpha1 and beta subunits are the major targets of mutations that cause disruption of GlyR surface expression or reduced ability of expressed GlyRs to conduct chloride ions, leading to hyperekplexia, a rare neurological disorder characterized by neonatal hypertonia and exaggerated startle responses to unexpected stimuli. Mutations in GlyR alpha2 are known to cause cortical neuronal migration/autism spectrum disorder and in GlyR alpha3 to cause inflammatory pain sensitization/rhythmic breathing.	185
349793	cd18992	LGIC_ECD_HisCl	extracellular domain of histamine-gated chloride channel (HisCl or HGCC). This family contains the extracellular domain (ECD) of histamine-gated chloride channel (HisCl), a member of the Cys-loop receptor superfamily of ligand-gated ion channels that is closely related to the mammalian GABAA receptor and glycine receptor (GlyR). Histamine (HA) is a neurotransmitter that activates GPCRs in vertebrates, but in arthropods, it is a photoreceptor neurotransmitter, directly gating chloride channels on large monopolar cells (LMCs), postsynaptic to photoreceptors in the lamina. It has also been reported to play important roles in mechanosensory reception, temperature preference, and sleep in insects. HA activates its receptor channels to cause an inward chloride flux in the insect nervous system. In Drosophila, HA acts on two histamine-gated chloride channel (HGCC) subunits called HisCl1 (HisClalpha2, HCLB) and HisCl2 (HisClalpha1, Ort, HCLA). HisCl1 (HCLB) and HisCl2 (HCLA) are expressed predominantly in the insect eye, sharing 60% sequence identity, and forming homomeric and heteromeric HGCCs. HCLA homomers are involved in synaptic transmission in the lamina, while HCLB homomers, localized in the glia cells, have a role in shaping the transmission. HCLB channels, but not HCLA channels, are also responsible for the activation and maintenance of wake state in D. melanogaster. In Manduca sexta, HCLB channels in the flight sensory-motor have been shown to be involved in olfactory processing circuit. Studies show that HCLB channels are more sensitive to agonists when compared with HCLA channels, but insensitive to known LGCC insecticides.	185
349794	cd18993	LGIC_ECD_GluCl	glutamate-gated chloride channel (GluCl) extracellular domain. This subfamily contains extracellular domain of glutamate-gated chloride channel (GluCl) found only in protostomia, but are closely related to mammalian glycine receptors. They have several roles in these invertebrates, including controlling locomotion and feeding, and mediating sensory inputs into behavior. Comparison of the GluCl gene families between organisms shows that insect gene family is relatively simple, while that found in nematodes tends to be larger and more diverse. Glutamate is an inhibitory neurotransmitter that shapes the responses of projection neurons to olfactory stimuli in the Drosophila. GluCls are targeted by the macrocyclic lactone family of anthelmintics and pesticides in arthropods and nematodes, thus making the GluCls of considerable medical and economic importance. In Drosophila melanogaster, GluCl mediates sensitivity to the antiparasitic agents ivermectin and nodulisporic acid, suggesting that their drug target is the same throughout the Ecdysozoa.	183
349795	cd18994	LGIC_ECD_ZAC	extracellular domain of zinc-activated ligand-gated ion channel. This family is the extracellular domain of zinc-activated ligand-gated ion channel (ZAC), a cationic ion channel belonging to the superfamily of Cys-loop receptors, which consists of pentameric ligand-gated ion channels. ZAC displays low sequence similarity to other members in the superfamily, with closest matches to the human serotonin 5-HT3 receptor (5-HT3R) subunits 5-HT3A and 5-HT3B, and nAChR alpha7 subunits that exhibit approximately 15% amino acid sequence identity to ZAC. Expression of ZAC has been detected in human fetal whole brain, spinal cord, pancreas, placenta, prostate, thyroid, trachea, and stomach, as well as in adult hippocampus, striatum, amygdala, and thalamus. ZAC forms an ion channel gated by Zn2+, Cu2+, and H+, and is non-selectively permeable to monovalent cations. However, the role of ZAC in Zn2+, Cu2+, and H+ signaling is as yet unknown.	170
349796	cd18995	LGIC_AChBP	acetylcholine binding protein (AChBP). This family contains acetylcholine binding protein (AChBP) which is a soluble extracellular domain homolog secreted by protostomia, and has been widely recognized as a surrogate for the ligand binding domain of nicotinic acetylcholine receptors (nAChRs). AChBP forms a pentameric structure where the interfaces between the subunits provide an acetylcholine (ACh) binding pocket homologous to the binding pocket of nAChRs. Thus far, AChBPs have been characterized only in aquatic mollusks, which have shown low sensitivity to neonicotinoids, the insecticides targeting insect nAChRs. Lymnaea stagnalis acetylcholine binding protein (Ls-AChBP), which has been found in glial cells as a water-soluble protein modulating synaptic ACh concentration, has a binding pocket that shows better resemblance, as it contains all five aromatic residues fully conserved in nAChR. Five AChBP subunits have been characterized in Pardosa pseudoannulata, a predatory enemy of rice insect pests, and share higher sequence similarities with nAChR subunits of both insects and mammals compared with mollusk AChBP subunits.	180
349797	cd18996	LGIC_ECD_5-HT3	extracellular domain of serotonin 5-HT3 receptor. This family contains extracellular domain of serotonin 5-HT3 receptor which belongs to the Cys-loop superfamily of ligand-gated ion channels (LGICs). This ion channel is cation-selective and mediates neuronal depolarization and excitation within the central and peripheral nervous systems. Like other ligand gated ion channels, the 5-HT3 receptor consists of five subunits arranged around a central ion conducting pore, which is permeable to Na+, K+, and Ca2+ ions. Binding of the neurotransmitter 5-hydroxytryptamine (serotonin) to the 5-HT3 receptor opens the channel, which then leads to an excitatory response in neurons, and the rapidly activating, desensitizing, inward current is predominantly carried by Na+ and K+ ions. This receptor is most closely related by homology to the nicotinic acetylcholine receptor (nAChR). Five subunits have been identified for this family: 5-HT3A, 5-HT3B, 5-HT3C, 5-HT3D, and 5-HT3E, encoded by HTR3A-E genes. Only 5-HT3A subunits are able to form functional homomeric receptors, whereas the 5-HT3B, C, D, and E subunits form heteromeric receptors with 5-HT3A. Different receptor subtypes are important mediators of nausea and vomiting during chemotherapy, pregnancy, and following surgery, while some contribute to neuro-gastroenterologic disorders such as irritable bowel syndrome (IBS) and eating disorders as well as co-morbid psychiatric conditions. 5-HT3 receptor antagonists are established treatments for emesis and IBS, and are beneficial in the treatment of psychiatric diseases.	215
349798	cd18997	LGIC_ECD_nAChR	extracellular domain of nicotinic acetylcholine receptor. This family contains the extracellular domain of nicotinic acetylcholine receptor (nAChR), a member of the pentameric "Cys-loop" superfamily of transmitter-gated ion channels. nAChR is found in high concentrations at the nerve-muscle synapse, where it mediates fast chemical transmission of electrical signals in response to the endogenous neurotransmitter acetylcholine (ACh) released from the nerve terminal into the synaptic cleft. Thus far, seventeen nAChR subunits have been identified, including ten alpha subunits, four beta subunits, and one gamma, delta, and epsilon subunit each, all found on the cell membrane that non-selectively conducts cations (Na+, K+, Ca++). These nAChR subunits combine in several different ways to form functional nAChR subtypes which are broadly categorized as either muscle subtype located at the neuromuscular junction or neuronal subtype that are found on neurons and on other cell types throughout the body. The muscle type of nAChRs are formed by the alpha1, beta1, gamma, delta, and epsilon subunits while the neuronal type are composed of nine alpha subunits and three beta subunits, which combine in various permutations and combinations to form functional receptors. Among various subtypes of neuronal nAChRs, the homomeric alpha7 and the heteromeric alpha4beta2 receptors are the main subtypes widely distributed in the brain and implicated in the pathophysiology of neurodevelopmental disorders such as schizophrenia and autism and neurodegenerative disorders such as Alzheimer's disease and Parkinson's disease. Among subtypes of muscle nAChRs, the heteromeric subunits (alpha1)2, beta, gamma, and delta in fetal muscle, and the gamma subunit replaced by epsilon in adult muscle have been implicated in congenital myasthenic syndromes and multiple pterygium syndromes due to various mutations. This family also includes alpha- and beta-like nAChRs found in protostomia.	181
349799	cd18998	LGIC_ECD_GABAAR_A	extracellular domain of gamma-aminobutyric acid receptor subunit alpha. This family contains extracellular domain (ECD) of type-A gamma-aminobutyric acid receptor (GABAAR), a member of the pentameric "Cys-loop" superfamily of transmitter-gated ion channels. This family includes 19 isoforms in human; six alpha, 3 beta, 3 gamma, one of delta, epsilon, pi, and theta, known to form heteromeric GABAARs, and 3 rho subunits that only form homomeric channels (also known as GABAA rho or GABAC receptor) or pseudoheteromeric if consisting of different rho subunits. GABAAR is assembled from a variety of different subunit subtypes which determines their pharmacology and physiology; the most abundant being 2alpha2beta1gamma stoichiometry. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Upon gamma-aminobutyric acid (GABA) binding to its site on the ECD, Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. Benzodiazepine and barbiturates each bind to their own distinct sites on the ECD. The channels have to contain the gamma subunit and alpha subunits in order to respond to benzodiazepines. All these major classes of drugs favor channel-opening. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. GABRA1, GABRA3, GABRB3, GABRG2, and GABRD, encoding the alpha1-, alpha3-, beta3-, gamma2-, and delta-subunits have been directly associated with epilepsy. Specific combinations of alpha, beta, and gamma subunits exhibit ethanol sensitivity.	184
349800	cd18999	LGIC_ECD_GABAAR_B	extracellular domain of gamma-aminobutyric acid receptor subunit beta (GABAAR-B or GABRB). This family contains extracellular domain (ECD) of beta subunits of type-A gamma-aminobutyric acid receptor (GABAAR), which include beta1-beta4 in vertebrates. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Upon gamma-aminobutyric acid (GABA) binding to the ECD, Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. Benzodiazepine and barbiturates each bind to their own distinct sites on the LBD. The channels must contain the gamma subunit and alpha subunits in order to respond to benzodiazepines. Specific combinations of alpha, beta, and gamma subunits exhibit ethanol sensitivity. All these major classes of drugs favor channel-opening. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. Mutations or genetic variations of the genes encoding the GABRB2 and GABRB3 have been associated with human epilepsy, both with and without febrile seizures. Mutations in GABRB2, and GABRB3 have been associated with infantile spasms and Lennox-Gastaut syndrome. A de novo missense mutation of GABRB2 causes early myoclonic encephalopathy, a disease with a devastating prognosis, characterized by neonatal onset of seizures. Another de novo heterozygous missense variant in exon 4 of GABRB2 is associated with intellectual disability and epilepsy. Mutations in the GABRB1 gene promote alcohol consumption through increased tonic inhibition.	182
349801	cd19000	LGIC_ECD_GABAAR_G	extracellular domain of gamma-aminobutyric acid receptor subunit gamma. This family contains extracellular domain (ECD) of the theta subunit of type-A gamma-aminobutyric acid receptor (GABAAR). GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. GABA stimulates human hepatocellular carcinoma growth through overexpressed GABAA receptor theta subunit. Also, two autism spectrum disorder (ASD)-associated protein truncation variants have been identified in alpha 3 (GABRA3) and theta (GABRQ) genes.	182
349802	cd19001	LGIC_ECD_GABAAR_delta	extracellular domain of gamma-aminobutyric acid receptor subunit delta. This family contains extracellular domain of delta subunit of type-A gamma-aminobutyric acid receptor (GABAAR). GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Upon gamma-aminobutyric acid (GABA) binding to the ligand binding site on the ECD, Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. Receptors containing the delta subunit (GABRD) are expressed exclusively extra-synaptically (in the cortex, hippocampus, thalamus, striatum, and cerebellum) and mediate tonic inhibition. Studies suggest that delta subunits form heteropentamers in similar stoichiometry and arrangement as alpha/beta/gamma receptors, with the delta subunit replacing the gamma subunit (2alpha:2beta:1delta), although other stoichiometries have also been detected. The delta subunit is flexible in its positioning in the pentameric complex, producing receptors with diverse pharmacological properties. Mutations in GABRD have been associated with susceptibility to generalized epilepsy with febrile seizures, type 5. GABRD gene may also be associated with childhood-onset mood disorders.	184
349803	cd19002	LGIC_ECD_GABAAR_E	extracellular domain of gamma-aminobutyric acid receptor subunit epsilon (GABRE). This family contains extracellular domain of epsilon subunit of type-A gamma-aminobutyric acid receptor (GABAAR), a protein that is encoded by the GABRE gene in humans. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Upon gamma-aminobutyric acid (GABA) binding to the ligand binding site on the ECD, Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. The epsilon subunits form heteropentamers with other GABAAR subunits, possibly with alpha3, beta4, and theta subunits since their genes are clustered on the same human chromosome. Various combinations of alpha3-, theta-, and epsilon-subunits may be assembled at a regional and developmental level in the brain. Brainstem expression of epsilon subunit-containing GABAA receptors is upregulated during pregnancy, particularly in the ventral respiratory neurons, thus protecting breathing, despite increased neurosteroid levels during pregnancy.	182
349804	cd19003	LGIC_ECD_GABAAR_theta	extracellular domain of gamma-aminobutyric acid receptor subunit theta (GABRQ). This family contains extracellular domain (ECD) of the theta subunit of type-A gamma-aminobutyric acid receptor (GABAAR), encoded by the GABRQ gene, which is mapped to chromosome Xq28 in a cluster of genes that also encode the alpha 3 and epsilon subunits. The transmembrane region consists of four transmembrane-spanning alpha-helical segments (M1-M4) that are linked by loops. The intracellular loop that links M1 and M2 determines the ion selectivity of the channel. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. GABA stimulates human hepatocellular carcinoma growth through overexpressed GABAAR theta subunit. Also, two autism spectrum disorder (ASD)-associated protein truncation variants have been identified in alpha 3 (GABRA3) and theta (GABRQ) genes.	183
349805	cd19004	LGIC_ECD_GABAAR_pi	extracellular domain of gamma-aminobutyric acid receptor subunit pi (GABRP). This family contains extracellular domain of pi subunit of type-A gamma-aminobutyric acid receptor (GABAAR). GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Upon gamma-aminobutyric acid (GABA) binding to the ligand binding site on ECD, Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. GABRP is expressed mainly in non-neuronal tissues such as the mammary gland, prostate gland, lung, thymus, and uterus. It is also highly expressed in certain types of cancer such as basal-like breast cancer and pancreatic ductal adenocarcinoma. GABRP is involved in inhibitory synaptic transmission in the central nervous system. Its assembly with other GABAAR subunits alters the sensitivity of recombinant receptors to modulatory agents such as pregnanolone. Studies suggest that polymorphisms in the GABRP gene may be associated with the susceptibility to systemic lupus erythematosus (SLE).	182
349806	cd19005	LGIC_ECD_GABAAR_rho	extracellular domain of gamma-aminobutyric acid receptor subunit rho. This family contains extracellular domain of rho subunits (rho1, rho2, and rho3, encoded by GABRR1, GABRR2, and GABRR3, respectively) of type-A gamma-aminobutyric acid receptor (GABAAR). These subunits homo-oligomerize to form GABAA-rho receptors (formerly classified as GABA-rho or GABAC receptor), but do not co-assemble with any of the classical GABAA subunits. They are especially highly expressed in the retina and have distinctive pharmacological properties; they are not modulated by many GABAA receptor modulators such as barbiturates, benzodiazepines, and neuroactive steroids. In humans, mutations in the GABRR1 and GABRR2 genes may be responsible for some cases of autosomal recessive retinitis pigmentosa. Variation in GABRR1 is also associated with susceptibility to bipolar schizoaffective disorder while a SNP in GABRR2 has been reported to show association with autism.	186
349807	cd19006	LGIC_ECD_GABAAR_LCCH3-like	gamma-aminobutyric acid receptor subunit beta-like extracellular domain in protostomia, such as LCCH3 (ligand-gated chloride channel homolog 3). This family contains extracellular domain of beta-like subunits of type-A gamma-aminobutyric acid receptor (GABAAR) found in protostomia, similar to Drosophila melanogaster ligand-gated chloride channel homolog 3 (LCCH3) subunits. Drosophila melanogaster expresses three GABA-receptor subunit orthologs: (RDL, resistant to dieldrin; GRD, GABA/glycine-like receptor of Drosophila; LCCH3, ligand-gated chloride channel homolog 3), and may possibly form homo- and/or heteropentameric associations. LCCH3 has been shown to combine with subunit GRD to form cation-selective GABA-gated ion channels when coexpressed in Xenopus laevis oocytes. GABAARs are known to be the molecular targets of a class of insecticides. The resulting pentameric receptors in this family have been shown to be activated by insect GABA-receptor agonists muscimol and CACA, and blocked by antagonists fipronil, dieldrin, and picrotoxin, but not bicuculline. GABAARs are abundant in the CNS, where their physiological role is to mediate fast inhibitory neurotransmission. In insects, this inhibitory transmission plays a crucial role in olfactory information processing.	183
349808	cd19007	LGIC_ECD_GABAR_GRD-like	gamma-aminobutyric acid receptor subunit alpha-like extracellular domain in protostomia, such as GRD (GABA/glycine-like receptor of Drosophila). This family contains extracellular domain of alpha-like subunits of type-A gamma-aminobutyric acid receptor (GABAAR) found in protostomia, similar to Drosophila melanogaster GABA/ glycine-like receptor of Drosophila (GRD) subunits. Drosophila melanogaster expresses three GABA-receptor subunit orthologs: (RDL, resistant to dieldrin; GRD, GABA/glycine-like receptor of Drosophila; LCCH3, ligand-gated chloride channel homolog 3), and may possibly form homo- and/or heteropentameric associations. LCCH3 has been shown to combine with subunit GRD to form cation-selective GABA-gated ion channels when co-expressed in Xenopus laevis oocytes. GABAARs are known to be the molecular targets of a class of insecticides. The resulting pentameric receptors in this family have been shown to be activated by insect GABA-receptor agonists muscimol and CACA, and blocked by antagonists fipronil, dieldrin, and picrotoxin, but not bicuculline. GABAARs are abundant in the CNS, where their physiological role is to mediate fast inhibitory neurotransmission. In insects, this inhibitory transmission plays a crucial role in olfactory information processing.	183
349809	cd19008	LGIC_ECD_GABAR_RDL-like	gamma-aminobutyric acid receptor subunit beta-like extracellular domain in protostomia, such as RDL (resistant to dieldrin). This family contains extracellular domain of beta-like subunits of type-A gamma-aminobutyric acid receptor (GABAAR) found in protostomia, similar to Drosophila melanogaster resistant to dieldrin (RDL) subunits. Drosophila melanogaster expresses three GABA-receptor subunit orthologs: (RDL, resistant to dieldrin; GRD, GABA/glycine-like receptor of Drosophila; LCCH3, ligand-gated chloride channel homolog 3), and may possibly form homo- and/or heteropentameric associations. GABAARs are known to be the molecular targets of a class of insecticides. The resulting pentameric receptors in this family have been shown to be activated by insect GABA-receptor agonists muscimol and CACA, and blocked by antagonists fipronil, dieldrin, and picrotoxin, but not bicuculline. GABAARs are abundant in the CNS, where their physiological role is to mediate fast inhibitory neurotransmission. In insects, this inhibitory transmission plays a crucial role in olfactory information processing. Bombyx mori includes three RDL (RD1, RD2, RD3), one LCCH3, and one GRD subunits. Its RDL1 gene has RNA-editing sites, and the RDL1 and RDL3 genes possess alternative splicing, enhancing the diversity of its GABA-receptor gene family. The three RDL subunits may have arisen from two duplication events.	184
349810	cd19009	LGIC_ECD_GlyR_alpha	extracellular domain of glycine receptor alpha subunit. This subfamily contains extracellular domain of glycine receptor (GlyR or GLR) alpha subunits of the amino acid neurotransmitter glycine. GlyR has four known isoforms of alpha-subunit (alpha1-4, encoded by GLRA1, GLRA2, GLRA3, GLRA4) that are essential to bind ligands, and, along with the GlyR beta subunit, have been described to have a regionally and temporally controlled expression during development and maturation of the central nervous system (CNS). These alpha subunits are highly homologous, but differ in their kinetic properties, temporal and regional expression and physiological functions. They can form functional chloride-permeable GlyR ion channels by forming homopentamers with 5 alpha subunits or heteropentamers with a combination of alpha and beta subunits, either a 2alpha-3beta or 3alpha-2beta stoichiometry. In human, mutations in glycine receptor alpha subunits cause disruption of GlyR surface expression or reduced ability of expressed GlyRs to conduct chloride ions. Mutations in GlyR alpha1 subunit lead to hyperekplexia, a rare neurological disorder characterized by neonatal hypertonia and exaggerated startle responses to unexpected stimuli, while mutations in GlyR alpha2 are known to cause cortical neuronal migration/autism spectrum disorder and in GlyR alpha3 to cause inflammatory pain sensitization/rhythmic breathing. GlyR alpha1 and alpha2 subunits have an important role in regulation of the excitatory-inhibitory balance, control of motor actions, modulation of sedative ethanol effects and probably regulation of ethanol preference and consumption.	184
349811	cd19010	LGIC_ECD_GlyR_beta	extracellular domain of glycine receptor beta subunit. This subfamily contains extracellular domain of glycine receptor (GlyR or GLR) beta subunit of the amino acid neurotransmitter glycine encoded by GLRB gene. These subunits form heteropentamers with a combination of alpha and beta subunits, either a 2alpha-3beta or 3alpha-2beta stoichiometry. While the alpha subunits contain binding sites for agonists and antagonists and are responsible for ion channel formation, the beta subunit displays structural and regulatory functions, such as GlyR clustering in synaptic locations by interaction between intracellular loop domains with the scaffolding protein gephyrin, and control of pharmacologic responses to agonist or allosteric modulators due in part to the presence of interfaces alpha/beta and beta/beta. GLRB gene mutations are associated with the neurological disorder hyperekplexia, a rare neurological disorder characterized by neonatal hypertonia and exaggerated startle responses to unexpected stimuli, as well as agoraphobic cognitions.	187
349812	cd19011	LGIC_ECD_5-HT3A	extracellular domain of serotonin 5-hydroxytryptamine receptor (5-HT3) receptor subunit A (5HT3A). This subfamily contains extracellular domain of subunit A of serotonin 5-HT3 receptor (5-HT3AR), encoded by the HTR3A gene. 5-HT3A subunit forms a homopentameric complex or a heterologous combination with other subunits (B-E). Heteromeric combination of A and B subunits provides the full functional features of this receptor, since either subunit alone results in receptors with very low conductance and response amplitude. 5-HT3A receptors are located in the dorsal vagal complex of the brainstem and in the gastrointestinal (GI) tract, and form a channel circuit that controls gut motility, secretion, visceral perception, and the emesis reflex. These receptors are implicated in several GI and psychiatric disorder conditions including anxiety, depression, bipolar disorder, and irritable bowel syndrome (IBS). Several 5-HT3AR antagonists, such as the isoquinoline Palonosetron, are in clinical use to control emetic reflexes associated with gastrointestinal pathologies and cancer therapies. SNPs in the 5-HT3A serotonin receptor gene are associated with psychiatric disorders.	208
349813	cd19012	LGIC_ECD_5-HT3B	extracellular domain of serotonin 5-hydroxytryptamine receptor (5-HT3) receptor subunit B (5HT3B). This subfamily contains extracellular domain of subunit B of serotonin 5-HT3 receptor (5-HT3BR), encoded by the HTR3B gene. 5-HT3B is not functional as a homopentameric complex; co-expression with the 5-HT3A subunit results in heteromeric 5-HT3AB receptors that are functionally distinct from homomeric 5-HT3A receptors. This receptor causes fast, depolarizing responses in neurons after activation, with affinities of competitive ligands at the two receptor subtypes' extracellular domains mostly similar. HTR3B gene variants may contribute to variability in severity of and response to anti-emetic therapy for nausea and vomiting in pregnancy, as well as efficacy of ondansetron in cancer chemotherapy, radiation therapy, or surgery. 5-HT3B subunit affects high-potency inhibition of 5-HT3 receptors by morphine by reducing its affinity at its high-affinity, non-competitive site.	210
349814	cd19013	LGIC_ECD_5-HT3C_E	extracellular domain of serotonin 5-hydroxytryptamine receptor (5-HT3) receptor subunit E (5HT3E); may include subunits C and D (5-HT3C,D). This subfamily contains extracellular domain of subunit E of serotonin 5-HT3 receptor (5-HT3ER), encoded by the HTR3E gene, and may also contain subunits C and D, all three encoding genes forming a cluster on chromosome 3. Data show that 5-HT3C, 5-HT3D, and 5-HT3E subunits are co-expressed with 5-HT3A in cell bodies of myenteric neurons, and that 5-HT3A and 5-HT3D are expressed in submucosal plexus of the human large intestine while HTR3E is restricted to the colon, intestine, and stomach. None of these subunits can form functional homopentamers, but, upon co-expression with the 5-HT3A subunit, they give rise to functional receptors that differ in maximal responses to 5-HT, and thus modulate 5-HT3 receptor's pharmacological profile. HTR3A and HTR3E polymorphisms have been shown to remarkably up-regulate the expression of 5-HT3 receptors, which have been proved to cause the gastric functional disorders including emesis, eating disorders and IBS-D.	215
349815	cd19014	LGIC_ECD_nAChR_A1	extracellular domain of nicotinic acetylcholine receptor subunit alpha 1 (CHRNA1). This subfamily contains the extracellular domain of nicotinic acetylcholine receptor subunit alpha 1 (alpha1), encoded by the CHRNA1 gene. These muscle type nicotinic subunits form heteropentamers with other nAChR subunits, most broadly expressed as combination of two alpha1, beta1, delta, and epsilon subunits in mature muscles, and of two alpha1, beta1, delta, and gamma in embryonic cells. The alpha1 subunit in human nAChR is the primary target of Myasthenia gravis antibodies that disrupt communication between the nervous system and the muscle, causing chronic muscle weakness.	210
349816	cd19015	LGIC_ECD_nAChR_A2	extracellular domain of nicotinic acetylcholine receptor subunit alpha 2 (CHRNA2). This subfamily contains the extracellular domain of nicotinic acetylcholine receptor subunit alpha 2 (alpha2), encoded by the CHRNA2 gene. It is specifically expressed in medial subpallium-derived amygdalar nuclei from early developmental stages to adult. This subunit is incorporated in heteropentameric neuronal nAChRs mainly with beta2 or beta4 subunits and, along with the alpha4 and alpha7, is one of the main nAChR subunits expressed in primate brain. In Xenopus laevis oocytes, when alpha2 is co-expressed with the beta2 subunit, two subtypes of alpha2beta2 nAChR are formed with either low or high ACh sensitivity. Mouse mutation studies show that alpha2 subunits in the nAChRs influence hippocampus-dependent learning and memory as well as CA1 synaptic plasticity in adolescent mice.	207
349817	cd19016	LGIC_ECD_nAChR_A3	extracellular domain of nicotinic acetylcholine receptor subunit alpha 3 (CHRNA3). This subfamily contains the extracellular domain of nicotinic acetylcholine receptor subunit alpha 3 (alpha3), encoded by the CHRNA3 gene, and likely plays a role in neurotransmission. The alpha3 subunit is expressed in the aorta and macrophages, and may play a regulatory role in the process of vascular inflammation. One of the most broadly expressed subtypes is the alpha3beta4 nAChR, also known as the ganglion-type nicotinic receptor, located in the autonomic ganglia and adrenal medulla, where activation yields post- and/or presynaptic excitation, mainly by increased Na+ and K+ permeability. The exact pentameric stoichiometry of alpha3beta4 receptor is not known and functional assemblies with varying subunit stoichiometries are possible. Alpha4 plays a pivotal role in regulating the inflammatory responses in endothelial cells and macrophages, via mechanisms involving the modulations of multiple cell signaling pathways. Polymorphisms in this gene (CHRNA3) have been associated with an increased risk of smoking initiation and an increased susceptibility to lung cancer.	207
349818	cd19017	LGIC_ECD_nAChR_A4	extracellular domain of neuronal acetylcholine receptor subunit alpha 4 (CHRNA4). This subfamily contains the extracellular domain of nicotinic acetylcholine receptor subunit alpha 4 (alpha4), encoded by the CHRNA4 gene. Alpha4 forms a functional nAChR by interacting with either nAChR beta2 or beta4 subunits. Alpha4beta2, the major heteropentameric nAChR in the brain, exists in two isoforms, (alpha4)3(beta2)2 and (alpha4)2(beta2)3, with the latter believed to constitute the majority of alpha4beta2 nAChR in the cortex. Both isoforms contain two canonical alpha4:beta2 ACh-binding sites with either low or high ACh sensitivity. This protein is an integral membrane receptor subunit that can interact with either nAChR beta-2 or nAChR beta-4 to form a functional receptor. Mutations in this gene (CHRNA4) cause nocturnal frontal lobe epilepsy type 1. Polymorphisms in this gene may provide protection against nicotine addiction.	181
349819	cd19018	LGIC_ECD_nAChR_A5	extracellular domain of nicotinic acetylcholine receptor subunit alpha 5 (CHRNA5). This subfamily contains the extracellular domain of nicotinic acetylcholine receptor subunit alpha 5 (alpha5), encoded by the CHRNA5 gene, which is part of the CHRNA5/A3/B4 gene cluster. Polymorphisms in this gene cluster have been identified as risk factors for nicotine dependence, lung cancer, chronic obstructive pulmonary disease, alcoholism, and peripheral arterial disease. A loss-of-function polymorphism in CHRNA5 is strongly linked to nicotine abuse and schizophrenia; the alpha5 nAChR subunit is strategically situated in the prefrontal cortex (PFC), where a loss-of-function in this subunit may contribute to cognitive disruptions in both disorders. Alpha5 forms heteropentamers with alpha3beta2 or alpha3beta4 nAChRs which increases the calcium permeability of the resulting receptors possibly playing significant roles in the initiation of ACh-induced signaling cascades under normal and pathological condition. Acetylcholine (ACh) release and signaling via alpha4/beta2 nAChR subunits plays a central role in the control of attention, but a subset of these oligomers also contains alpha5 subunit. A strong association is seen between a CHRNA5 polymorphism and the risk of lung cancer, especially in smokers.	207
349820	cd19019	LGIC_ECD_nAChR_A6	extracellular domain of nicotinic acetylcholine receptor subunit alpha 6 (CHRNA6). This subfamily contains the extracellular domain of nicotinic acetylcholine receptor subunit alpha 6 (alpha6), encoded by the CHRNA6 gene. Human (alpha6beta2)(alpha4beta2)3 nicotinic acetylcholine receptors (AChRs) are essential for addiction to nicotine and a target for drug development for smoking cessation. In Xenopus oocytes, data show efficient expression of (alpha6beta2)2beta3 AChR subunits with only small changes in alpha6 subunits, while not altering AChR pharmacology or channel structure. Alternatively spliced transcript variants have been observed for this gene. Single nucleotide polymorphisms in this gene have been associated with both nicotine and alcohol dependence. CHRNA6 has a cellular expression signature for retinal ganglion cells with high correlation to Thy1, a known marker, and is preferentially expressed by retinal ganglion cells (RGCs) in the young and adult mouse retina and expression is reduced in glaucoma. A genetic variant in CHRNB3-CHRNA6 cluster is associated with esophageal adenocarcinoma.	181
349821	cd19020	LGIC_ECD_nAChR_A7	extracellular domain of neuronal acetylcholine receptor subunit alpha 7 (CHRNA7). This subfamily contains the extracellular domain of nicotinic acetylcholine receptor subunit alpha 7 (alpha7), encoded by the CHRNA7 gene. Alpha7 subunits form a homo-pentameric channel, displays marked permeability to calcium ions and is a major component of brain nicotinic receptors that are blocked by, and highly sensitive to, alpha-bungarotoxin. This protein is ubiquitously expressed in both the central nervous system and in the periphery, in several tissues, including adrenal, small intestine, testis, and stomach. CHRNA7 is located in a region identified as a major susceptibility locus for juvenile myoclonic epilepsy and a chromosomal location involved in the genetic transmission of schizophrenia. It is also genetically linked to other disorders with cognitive deficits, including bipolar disorder, ADHD, Alzheimer's disease, and Rett syndrome. An evolutionarily recent partial duplication of CHRNA7 on chromosome 15 forms a new gene, CHRFAM7A or FAM7A, which encodes the protein dup-alpha7. This protein assembles with alpha7 subunits, results in fewer binding sites and is a dominant negative regulator of alpha7 nAChR function.	180
349822	cd19021	LGIC_ECD_nAChR_A7L	extracellular domain of neuronal acetylcholine receptor subunit alpha-7-like. This family contains the extracellular domain of nicotinic acetylcholine receptor (nAChR), a member of the pentameric "Cys-loop" superfamily of transmitter-gated ion channels. nAChR is found in high concentrations at the nerve-muscle synapse, where it mediates fast chemical transmission of electrical signals in response to the endogenous neurotransmitter acetylcholine (ACh) released from the nerve terminal into the synaptic cleft. Thus far, seventeen nAChR subunits have been identified, including ten alpha subunits, four beta subunits and one gamma, delta, and epsilon subunit each, all found on the cell membrane that non-selectively conducts cations (Na+, K+, Ca++). These nAChR subunits combine in several different ways to form functional nAChR subtypes which are broadly categorized as either muscle subtype located at the neuromuscular junction or neuronal subtype that are found on neurons and on other cell types throughout the body. The muscle type of nAChRs are formed by the alpha1, beta1, gamma, delta, and epsilon subunits while the neuronal type are composed of nine alpha subunits and three beta subunits, which combine in various permutations and combinations to form functional receptors. Among various subtypes of neuronal nAChRs, the homomeric alpha7 and the heteromeric alpha4beta2 receptors are the main subtypes widely distributed in the brain and implicated in the pathophysiology of neurodevelopmental disorders such as schizophrenia and autism and neurodegenerative disorders such as Alzheimer's disease and Parkinson's disease.	179
349823	cd19022	LGIC_ECD_nAChR_A9	extracellular domain of neuronal acetylcholine receptor subunit alpha 9 (CHRNA9). This subfamily contains the extracellular domain of nicotinic acetylcholine receptor subunit alpha 9 (alpha9), encoded by the CHRNA9 gene. This protein is involved in cochlea hair cell development and is also expressed in the outer hair cells (OHCs) of the adult cochlea as well as in keratinocytes, the pituitary gland, B-cells, and T-cells. Mammalian alpha9 subunits can form functional homomeric alpha9 receptors as well as the heteromeric alpha9alpha10 receptors, the latter being atypical since the heteromeric alpha9alpha10 receptor is composed only of alpha subunits compared to nAChRs typically assembled from alpha and beta subunits. A stoichiometry of (alpha9)2(alpha10)3 has been determined for the rat recombinant receptor. The alpha9alpha10 nAChR is an important therapeutic target for pain; selective block of alpha9alpha10 nicotinic acetylcholine receptors by the conotoxin RgIA has been shown to be analgesic in an animal model of nerve injury pain, and accelerates recovery of nerve function after injury, possibly through immune/inflammatory-mediated mechanisms. CHRNA9 polymorphisms are associated with non-small cell lung cancer, and effect of a particular SNP (rs73229797) and passive smoking exposure on risk of breast malignancy has been observed.	207
349824	cd19023	LGIC_ECD_nAChR_A10	extracellular domain of neuronal acetylcholine receptor subunit alpha 10 (CHRNA10). This subfamily contains the extracellular domain of nicotinic acetylcholine receptor subunit alpha 10 (alpha10), encoded by the CHRNA10 gene. This protein is involved in cochlea hair cell development and is also expressed in the outer hair cells (OHCs) of the adult cochlea as well as in keratinocytes, the pituitary gland, B-cells, and T-cells. Unlike alpha9 nAChR subunits, alpha10 subunits do not generate functional channels when expressed heterologously, suggesting that alpha10 might serve as a structural subunit, much like a beta subunit of heteromeric receptors, providing only complementary components to the agonist binding site. Mammalian alpha10 subunits can form functional heteromeric alpha9alpha10 receptors, an atypical heteromeric receptor since it is composed only of alpha subunits compared to nAChRs typically assembled from alpha and beta subunits. A stoichiometry of (alpha9)2(alpha10)3 has been determined for the rat recombinant receptor. The alpha9alpha10 nAChR is an important therapeutic target for pain; selective block of alpha9alpha10 nicotinic acetylcholine receptors by the conotoxin RgIA has been shown to be analgesic in an animal model of nerve injury pain, and accelerates recovery of nerve function after injury, possibly through immune/inflammatory-mediated mechanisms.	181
349825	cd19024	LGIC_ECD_nAChR_B1	extracellular domain of nicotinic acetylcholine receptor subunit beta 1 (CHRNB1). This subfamily contains the extracellular domain of nicotinic acetylcholine receptor subunit beta 1 (beta1), encoded by the CHRNB1 gene. It is a muscle type subunit found predominantly in the neuromuscular junction (NMJ), but also in other tissues and cell lines such as adrenal glands, carcinomas, brain, and lung. Simultaneous mRNA and protein expression of beta1 nAChR subunit is present in human placenta and skeletal muscle. The beta1 nAChR subunit forms a heteropentamer with either (alpha1)2, gamma and delta subunits in embryonic type or (alpha1)2, epsilon and delta subunits in adult type receptors. nAChRs containing beta1 subunits have been attributed to efficient clustering and anchoring of the receptors to the cytoskeleton which is important for formation of synapses in the NMJ. Mutations in the transmembrane domain region of this gene are associated with slow-channel congenital myasthenic syndrome (CMS).	213
349826	cd19025	LGIC_ECD_nAChR_B2	extracellular domain of nicotinic acetylcholine receptor subunit beta 2 (CHRNB2). This subfamily contains the extracellular domain of nicotinic acetylcholine receptor subunit beta 2 (beta2), encoded by the CHRNB2 gene. The most abundant nicotinic subtype in the human brain is alpha4beta2 receptor which is known to assemble in two functional subunit stoichiometries, (alpha4)3(beta2)2 and (alpha4)2(beta2)3, the latter having a much higher affinity for both acetylcholine and nicotine. This subtype is implicated in the pathophysiology of neurodevelopmental disorders such as schizophrenia and autism, and neurodegenerative disorders such as Parkinson's disease and Alzheimer's disease. Thus, pharmacological ligands targeting this subtype have been researched and developed as a treatment approach implicated in these diseases. They include agonists such as varenicline and cytisine used as smoking cessation aids, as well as positive allosteric modulators (PAMs) such as desformylflustrabromine (dFBr), which are ligands that bind to nicotinic receptors at sites other than the orthosteric site where acetylcholine binds, and are not able to act as agonists on nAChR.	204
349827	cd19026	LGIC_ECD_nAChR_B3	extracellular domain of nicotinic acetylcholine receptor subunit beta 3 (CHRNB3). This subfamily contains the extracellular domain of nicotinic acetylcholine receptor subunit beta 3 (beta3), encoded by the CHRNB3 gene. CHRNB3 polymorphisms have been reported to potentially affect nicotine-induced upregulation of nicotinic and to be associated with disorders such as schizophrenia, autism, and cancer. Beta3 subunit is depleted in the striatum of Parkinson's disease patients. Rare variants in CHRNB3 are also implicated in risk for alcohol and cocaine dependence and independently associated with bipolar disorder. Human alpha6beta2beta3* (* indicating possible additional assembly partners) nAChRs on dopaminergic neurons are important targets for drugs to treat nicotine addiction and Parkinson's disease; (alpha6beta2)(alpha4beta2)beta3 nAChR is essential for addiction to nicotine and a target for drug development for smoking cessation.	179
349828	cd19027	LGIC_ECD_nAChR_B4	extracellular domain of nicotinic acetylcholine receptor subunit beta 4 (CHRNB4). This subfamily contains the extracellular domain of nicotinic acetylcholine receptor subunit beta 4 (beta4), encoded by the CHRNB4 gene and ubiquitously expressed on lung epithelial cells. The human neuronal nicotinic receptor gene cluster CHRNA5-CHRNA3-CHRNB4 is related to drug-related behaviors and the development of lung cancer. One of the most broadly expressed subtypes is the alpha-3 beta-4 nAChR, also known as the ganglion-type nicotinic receptor, located in the autonomic ganglia and adrenal medulla, where activation yields post- and/or pre-synaptic excitation, mainly by increased Na+ and K+ permeability. Beta4 forms heteromeric nAChRs to modulate receptor affinity for nicotine, but the exact pentameric stoichiometry of the alpha3beta4 receptor is not known; functional assemblies with varying subunit stoichiometries are possible.	178
349829	cd19028	LGIC_ECD_nAChR_D	extracellular domain of nicotinic acetylcholine receptor subunit delta (CHRND). This subfamily contains the extracellular domain of nicotinic acetylcholine receptor subunit delta (delta), encoded by the CHRND gene and found in the muscle. Delta nAChR subunit forms a heteropentamer with either (alpha1)2, beta and gamma subunits in embryonic type or (alpha1)2, beta and epsilon subunits in adult type receptors. Defects in this gene are a cause of multiple pterygium syndrome lethal type (MUPSL), congenital myasthenic syndrome slow-channel type (SCCMS), and congenital myasthenic syndrome fast-channel type (FCCMS). The slow-channel congenital myasthenic syndromes (SCCMS) are caused by prolonged opening episodes of AChR due to dominant gain-of-function mutations in the transmembrane regions of the heteropentamer. These mutations produce an increase in the channel opening rate, a decrease in the channel closing rate, or an increase in the affinity of ACh for the AChR, resulting in the stabilization of the open state or the destabilization of the closed state of the AChR.	221
349830	cd19029	LGIC_ECD_nAChR_G	extracellular domain of nicotinic acetylcholine receptor subunit gamma (CHRNG). This subfamily contains the extracellular domain of nicotinic acetylcholine receptor subunit gamma (gamma), encoded by the CHRNG gene expressed during early fetal development, and replaced by the epsilon subunit in the adult. The gamma subunit forms a heteropentamer with (alpha1)2, beta, and delta and plays a role in neuromuscular organogenesis and ligand binding. Disruption of gamma subunit expression prevents the correct localization of the receptor in cell membranes. Mutations in CHRNG may cause the non-lethal Escobar variant (EVMPS) and lethal form (LMPS) of multiple pterygium syndrome (MPS), a condition characterized by prenatal growth failure with pterygium and akinesia leading to muscle weakness and severe congenital contractures, as well as scoliosis. Muscle-type acetylcholine receptor is the major antigen in the autoimmune disease myasthenia gravis.	193
349831	cd19030	LGIC_ECD_nAChR_E	extracellular domain of nicotinic acetylcholine receptor subunit epsilon (CHRNE). This subfamily contains the extracellular domain of nicotinic acetylcholine receptor subunit epsilon (epsilon), encoded by the CHRNE gene and found in adult skeletal muscle. Epsilon subunit forms a heteropentamer with (alpha1)2, beta and delta after birth, replacing the gamma subunit seen in embryonic receptors. The adult-type epsilon-AChR has a higher conductance and a shorter open time compared to embryonic gamma-AChR and the open channel is non-selectively cation permeable. Mutations of the CHRNE gene are the most common causes of congenital myasthenic syndrome (CMS), most of which are autosomal recessive loss-of-function mutations, resulting in endplate AChR deficiency. A highly fatal fast-channel syndrome is caused by AChR epsilon subunit mutation (Trp to Arg; changing environment from anionic to cationic) at the agonist binding site at the alpha/epsilon interface of the receptor, thus disrupting agonist binding affinity and gating efficiency.	191
349832	cd19031	LGIC_ECD_nAChR_proto_alpha-like	extracellular domain of nicotinic acetylcholine receptor subunit alpha-like found in protostomia. This subfamily contains the extracellular domain of nicotinic acetylcholine receptor subunit alpha-like in organisms that include arthropods, mollusks, annelid worms, and flat worms, and have their cholinergic system limited to the central nervous system. C. elegans genome encodes 29 acetylcholine receptor subunits, of which the levamisole-sensitive receptor (L-AChR) alpha-subunits, UNC-38, UNC-63, and LEV-8, included in this subfamily, form heteromers with the two non-alpha (also known as beta-like) subunits, UNC-29 and LEV-1. This receptor functions as the main excitatory postsynaptic receptor at neuromuscular junctions, indicating that many are expressed in neurons. Also included is the nicotinic alpha subunit MARA1 (Manduca ACh Receptor Alpha 1) which is expressed in Ca2+ responding neurons and contributes to the nicotinic responses in the neurons. In insects, the receptors supply fast synaptic excitatory transmission and represent a major target for several insecticides. In Drosophila, ten exclusively neuronal nAChRs have been identified, Dalpha1-Dalpha7 and Dbeta1-Dbeta3, and various combinations of these subunits and mutations are key to nAChR function. Alpha5 subunit is involved in alpha-bungarotoxin sensitivity while the alpha6 subunit is essential for the insecticidal effect of spinosad. nAChR agonists acetylcholine, nicotine, and neonicotinoids stimulate dopamine release in Drosophila larval ventral nerve cord and mutations in nAChR subunits affect how insecticides stimulate dopamine release.	222
349833	cd19032	LGIC_ECD_nAChR_proto_beta-like	extracellular domain of nicotinic acetylcholine receptor subunit beta-like found in protostomia. This subfamily contains the extracellular domain of nicotinic acetylcholine receptor subunit beta-like in organisms that include arthropods, mollusks, annelid worms, and flat worms, and have their cholinergic system limited to the central nervous system. C. elegans genome encodes 29 acetylcholine receptor subunits, of which the levamisole-sensitive receptor alpha-subunits (L-AChR), UNC-38, UNC-63, and LEV-8, form heteromers with the two non-alpha (also known as beta-like) subunits, UNC-29 and LEV-1 found in this subfamily. This receptor functions as the main excitatory postsynaptic receptor at neuromuscular junctions, indicating that many are expressed in neurons. In insects, the receptors supply fast synaptic excitatory transmission and represent a major target for several insecticides. In Drosophila, ten exclusively neuronal nAChR subunits have been identified, Dalpha1-Dalpha7 and Dbeta1-Dbeta3, and various combinations of these subunits and mutations are key to nAChR function. Dbeta1 subunits in dopaminergic neurons play a role in acute locomotor hyperactivity caused by nicotine in male Drosophila. Mutations of Dbeta2 or Dalpha1 nAChR subunits in Drosophila strains have significantly lower neonicotinoid-stimulated release, but no changes in nicotine-stimulated release; they are highly resistant to the neonicotinoids nitenpyram and imidacloprid. This family also includes a novel nAChR found in Aplysia bag cell neurons (neuroendocrine cells that control reproduction) which is a cholinergic ionotropic receptor that is both, nicotine insensitive and acetylcholine sensitive.	208
349834	cd19033	LGIC_ECD_nAChR_proto-like	nicotinic acetylcholine receptor (nAChR) subunit extracellular domain in molluscs and annelids. This subfamily contains the extracellular domain of nicotinic acetylcholine receptor subunit found in molluscs, including several Lymnaea nAChRs, and annelids that are mostly uncharacterized. To date, 12 Lymnaea nAChRs have been identified which can be subdivided in two subtypes according to the residues that may be contributing to the selectivity of ion conductance. Phylogenetic analysis of the nAChR gene sequences suggests that anionic nAChRs in molluscs probably evolved from cationic ancestors through amino acid substitutions in the ion channel pore which is a mechanism different from acetylcholine-gated channels in other invertebrates.	183
349835	cd19034	LGIC_ECD_GABAAR_A1	extracellular domain of gamma-aminobutyric acid receptor subunit alpha-1 (GABAAR-A1 or GABRA1). This family contains extracellular domain of gamma-aminobutyric acid receptor subunit alpha-1 (GABAAR-A1), a protein that is encoded by the GABRA1 gene in humans. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Upon gamma-aminobutyric acid (GABA) binding to the ligand binding site on the ECD, Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. The alpha-1 subunits form heteropentamers with other GABAAR subunits, most broadly expressed as combination of two alpha1, beta1, gamma. Alpha1, beta2, and gamma2 subunits are clustered on the same human chromosome and may be why alpha1beta2gamma2 receptors are one of the most abundant GABAA receptor isoforms in CNS neurons. Mutations in this gene cause familial juvenile myoclonic epilepsy, sporadic childhood absence epilepsy type 4, and idiopathic familial generalized epilepsy. Polymorphisms in GABRA1 are also significantly associated with schizophrenia. GABRA1 has also been associated with methamphetamine abuse. The GABRA1 receptor is the specific target of the z-drug class of nonbenzodiazepine hypnotic agents and is responsible for their hypnotic and hallucinogenic effects.	194
349836	cd19035	LGIC_ECD_GABAAR_A2	extracellular domain of gamma-aminobutyric acid receptor subunit alpha-2 (GABAAR-A2 or GABRA2). This family contains extracellular domain of gamma-aminobutyric acid receptor subunit alpha-2 (GABAAR-A2), a protein that is encoded by the GABRA2 gene in humans. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Upon gamma-aminobutyric acid (GABA) binding to the ligand binding site on the ECD, Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. The alpha-2 subunit forms heteropentamers with other GABAAR subunits, most broadly expressed as combination of alpha2beta3gamma2. The alpha-2 (GABRA2) subunit is found primarily in the forebrain and hippocampus, and is more confined to areas of the brain compared to other alpha subunits. GABRA2 increases the risk of anxiety, making it a target for treating behavioral disorders including alcohol dependence, and drug use. GABRA2 is a binding site for benzodiazepines (psychoactive drugs known to reduce anxiety), causing chloride channels to open, leading to the hyper-polarization of the membrane. Other anxiolytic drugs such as Diazepam bind this subunit to induce inhibitory effects. GABRA2 is associated with reward behavior when it activates the insula, the part of the cerebral cortex responsible for emotions. GABA alpha2 and/or alpha3 receptor subtypes are also involved in GABAergic modulation of prolactin secretion.	203
349837	cd19036	LGIC_ECD_GABAAR_A3	extracellular domain of gamma-aminobutyric acid receptor subunit alpha-3 (GABAAR-A3 or GABRA3). This family contains extracellular domain of gamma-aminobutyric acid receptor subunit alpha-3 (GABAAR-A3), a protein that is encoded by the GABRA3 gene in humans. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Upon gamma-aminobutyric acid (GABA) binding to the ligand binding site on the ECD, Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. The alpha-3 subunit forms heteropentamers with other GABAAR subunits, most broadly expressed as combination of alpha3betagamma2, typically found post-synaptically. Rare loss-of-function variants in GABRA3 have been shown to increase the risk for a varying combination of epilepsy, intellectual disability/developmental delay, and dysmorphic features. GABRA3, normally exclusively expressed in adult brain, is also expressed in breast cancer, with high expression being inversely correlated with breast cancer survival.  It activates the AKT pathway to promote breast cancer cell migration, invasion, and metastasis. GABRA3 promotes lymphatic metastasis in lung adenocarcinoma by mediating upregulation of matrix metalloproteinases, MMP-2 and MMP-9, through activation of the JNK/AP-1 signaling pathway. GABRA3 is overexpressed in human hepatocellular carcinoma growth and, with GABA, promotes the proliferation of cancer cells.	200
349838	cd19037	LGIC_ECD_GABAAR_A4	extracellular domain of gamma-aminobutyric acid receptor subunit alpha-4 (GABAAR-A4 or GABRA4). This family contains extracellular domain of gamma-aminobutyric acid receptor subunit alpha-4 (GABAAR-A4), a protein that is encoded by the GABRA4 gene in humans, with biased expression in the brain and heart. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Upon gamma-aminobutyric acid (GABA) binding to the ligand binding site on the ECD, Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. The alpha-4 subunit forms heteropentamers with other GABAAR subunits, most broadly expressed as combination of alpha2alpha4beta1gamma1, all four subunits existing on the same gene cluster. Alpha-4 is involved in the etiology of autism and eventually increases autism risk through interaction with the beta-1 (GABRB1) subunit. Polymorphism in GABRA4 may trigger migraine by ethanol, while another is associated to faster reaction times and with lower ethanol effects. A rare variant in GABRA4 may have modest physiological effect in autism spectrum disorder etiology.	199
349839	cd19038	LGIC_ECD_GABAAR_A5	extracellular domain of gamma-aminobutyric acid receptor subunit alpha-5 (GABAAR-A5 or GABRA5). This family contains extracellular domain of gamma-aminobutyric acid receptor subunit alpha-5 (GABAAR-A5), a protein that is encoded by the GABRA5 gene in humans, with biased expression in the brain and heart. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Upon gamma-aminobutyric acid (GABA) binding to the ligand binding site on the ECD, Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. The alpha-5 subunit forms heteropentamers with other GABAAR subunits, most broadly expressed as alpha5-beta-gamma2, and probably alpha5-beta3-gamma2, predominantly expressed in the hippocampus and localized extrasynaptically. These receptors have been demonstrated to play an important modulatory role in learning and memory processes, thus making them suitable targets for pharmacological intervention. Studies show that alpha5-containing GABAARs play an important part in tonic inhibition in hippocampal pyramidal neurons, and that these can also contribute to synaptic inhibition. Studies strongly suggest that amnesia is primarily mediated by alpha5-beta-gamma2. Polymorphisms in GABRA5 (and GABRA3) are linked to the susceptibility to panic disorder. A genetic association also exists between GABRA5 and bipolar affective disorder.	199
349840	cd19039	LGIC_ECD_GABAAR_A6	extracellular domain of gamma-aminobutyric acid receptor subunit alpha-6 (GABAAR-A6 or GABRA6). This family contains extracellular domain of gamma-aminobutyric acid receptor subunit alpha-6 (GABAAR-A6), a protein that is encoded by the GABRA6 gene in humans. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Upon gamma-aminobutyric acid (GABA) binding to the ligand binding site on the ECD, Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. The alpha-6 subunit forms heteropentamers with other GABAAR subunits, most broadly expressed as alpha6-beta-gamma2 found extrasynaptically, alpha6-beta2/3-delta in the cerebellar granule cells and likely also forms alpha1-alpha6-beta-gamma/alpha1-alpha6-beta-delta. A GABRA6 mutation from Arg to Trp, has been identified as a susceptibility gene that may contribute to the pathogenesis of childhood absence epilepsy and cause neuronal disinhibition and increase in seizures via a reduction of alphabetagamma and alphabetadelta receptor function and expression. Polymorphism in the GABRA6 gene is associated with specific personality characteristics as well as a marked attenuation in hormonal and blood pressure responses to psychological stress. Alpha6-containing receptors lack high sensitivity to diazepam.	198
349841	cd19040	LGIC_ECD_GABAAR_B1	extracellular domain of gamma-aminobutyric acid receptor subunit beta-1 (GABAAR-B1 or GABRB1). This family contains extracellular domain (ECD) of gamma-aminobutyric acid receptor beta-1 subunit, a protein that is encoded by the GABRB1 gene. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Upon gamma-aminobutyric acid (GABA) binding to the ligand binding site on the ECD, Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. The beta-1 subunit forms heteropentamers with other GABAAR subunits, likely expressed as alpha-beta1-gamma/delta, mainly found in the brain. It is clustered on the chromosome with genes encoding alpha 4, alpha 2, and gamma 1 subunits of the GABAAR. GABRB1 expression is altered significantly in the lateral cerebellum of subjects with schizophrenia, major depression, and bipolar disorder. Mutations in the GABRB1 gene promote alcohol consumption through increased tonic inhibition. Epigenetic control of gene expression may affect the expression of GABRB1 and disrupt inhibitory synaptic transmission during embryonic development. The GABRB1 gene is also associated with thalamus volume and modulates the association between thalamus volume and intelligence.	182
349842	cd19041	LGIC_ECD_GABAAR_B2	extracellular domain of gamma-aminobutyric acid receptor subunit beta-2 (GABAAR-B2 or GABRB2). This family contains extracellular domain (ECD) of gamma-aminobutyric acid receptor beta-2 subunit, a protein that is encoded by the GABRB2 gene. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Upon gamma-aminobutyric acid (GABA) binding to the ligand binding site on the ECD, Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. The beta-2 subunit forms heteropentamers with other GABAAR subunits, with alpha1-beta2-gamma2 subtype being the most prevalent isoform (approximately 50%-60% of all GABAARs), and are expressed in almost all regions of the brain. It also assembles less abundantly as alpha4beta2/3delta and alpha6beta2/3delta. Mutations or genetic variations of the genes encoding the GABRB2 and GABRB3 have been associated with human epilepsy, both with and without febrile seizures. Mutations in GABRB2, and GABRB3 have been associated with infantile spasms and Lennox-Gastaut syndrome. A de novo missense mutation of GABRB2 causes early myoclonic encephalopathy, a disease with a devastating prognosis, characterized by neonatal onset of seizures. Another de novo heterozygous missense variant in exon 4 of GABRB2 is associated with intellectual disability and epilepsy. GABRB2 plays important tumorigenic functions and acts as a novel oncogene in papillary thyroid carcinoma (PTC).	182
349843	cd19042	LGIC_ECD_GABAAR_B3	extracellular domain of gamma-aminobutyric acid receptor subunit beta-3 (GABAAR-B3 or GABRB3). This family contains extracellular domain (ECD) of gamma-aminobutyric acid receptor beta-3 subunit, a protein that is encoded by the GABRB3 gene. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Upon gamma-aminobutyric acid (GABA) binding to the ligand binding site on the ECD, Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. The beta-3 subunit forms heteropentamers with other GABAAR subunits, with alpha2-beta3-gamma2 and alpha3-beta3-gamma2 subtypes highly enriched in hippocampal pyramidal neurons and cholinergic neurons of the basal forebrain, respectively. Other heteromers include alpha1-beta3-gamma2 and alpha5-beta3-gamma2. GABRB3 mutations are likely associated with a broad phenotypic spectrum of epilepsies and that reduced receptor function causing GABAergic disinhibition represents the relevant disease mechanism. GABRB3 might be associated with heroin dependence, and increased expression possibly contributing to the pathogenesis of heroin dependence. This gene may also be associated with the pathogenesis of other disorders such as Angelman syndrome, Prader-Willi syndrome, nonsyndromic orofacial clefts, schizophrenia, and autism.	183
349844	cd19043	LGIC_ECD_GABAAR_G1	extracellular domain of gamma-aminobutyric acid receptor subunit gamma-1 (GABAAR-G1 or GABRG1). This family contains extracellular domain of gamma-aminobutyric acid receptor subunit gamma-1 (GABAAR-G1), a protein that is encoded by the GABRG1 gene in humans, clustered with the alpha2 gene GABRA2, which is associated with alcohol dependence. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Upon gamma-aminobutyric acid (GABA) binding to the ligand binding site on the ECD, Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. The gamma-1 subunit forms heteropentamers with other GABAAR subunits, likely expressed as combination of alpha1/2-beta-gamma1 subunits. A variant in GABRG1 shows the strongest statistical evidence of association of recovery from eating disorders. Studies show that upregulating or preserving GABAA gamma1/3 and gamma2 receptors may protect neurons against neurofibrillary pathology in Alzheimer's disease.	182
349845	cd19044	LGIC_ECD_GABAAR_G2	extracellular domain of gamma-aminobutyric acid receptor subunit gamma-2 (GABAAR-G2 or GABRG2). This family contains extracellular domain of gamma-aminobutyric acid receptor subunit gamma-2 (GABAAR-G2), a protein that is encoded by the GABRG2 gene in humans. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Upon gamma-aminobutyric acid (GABA) binding to the ligand binding site on the ECD, Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. The gamma-2 subunit forms heteropentamers with other GABAAR subunits, most prevalently expressed as alpha1-beta2-gamma2. The gamma2 subunit also coassembles with other alpha and beta variants in the brain, but these receptors are found in considerably less abundance and are restricted in their regional, e.g. the alpha2-beta3-gamma2 and alpha3-beta3-gamma2 subtypes are highly enriched in hippocampal pyramidal neurons and cholinergic neurons of the basal forebrain, respectively. Pathogenic missense and truncating variants in this gene have been associated with spectrum of epilepsies, from Dravet syndrome to milder simple febrile seizures, while a recurrent GABRG2 missense variant is associated with early-onset seizures, significant motor and speech delays, intellectual disability, hypotonia, movement disorder, dysmorphic features, and vision/ocular issues.	184
349846	cd19045	LGIC_ECD_GABAAR_G3	extracellular domain of gamma-aminobutyric acid receptor subunit gamma-3 (GABAAR-G3 or GABRG3). This family contains extracellular domain of gamma-aminobutyric acid receptor subunit gamma-3 (GABAAR-G3), a protein that is encoded by the GABRG3 gene in humans. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Upon gamma-aminobutyric acid (GABA) binding to the ligand binding site on the ECD, Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. The gamma-3 subunit forms heteropentamers with other GABAAR subunits, likely expressed as alpha1-beta3-gamma3. This subunit contains the benzodiazepine binding site. Polymorphisms in GABRG3 show consistent evidence of association with alcohol dependence.	182
349847	cd19046	LGIC_ECD_GABAAR_rho1	extracellular domain of gamma-aminobutyric acid receptor subunit rho-1 (GABA-rho1 or GABRR1). This family contains extracellular domain (ECD) of the rho subunit 1 of type-A gamma-aminobutyric acid receptor (GABAAR), encoded by the GABRR1 gene, expressed in many areas of the brain, but especially high in the retina. GABRR1 exists next to GABRR2 (encoding rho subunit 2) on the chromosome region thought to be associated with susceptibility for psychiatric disorders and epilepsy. Close proximity of the rho1 and rho2 subunit genes suggests that they emerged via a local duplication event. This subunit homo-oligomerizes to form GABAA-rho receptors (formerly classified as GABA-rho or GABAc receptor), but does not co-assemble with any of the classical GABAAR subunits. In humans, mutations in the GABRR1 gene may be responsible for some cases of autosomal recessive retinitis pigmentosa. Variation in GABRR1 is also associated with susceptibility to bipolar schizoaffective disorder, and may be associated with alcohol dependency.	186
349848	cd19047	LGIC_ECD_GABAAR_rho2	extracellular domain of gamma-aminobutyric acid receptor subunit rho-2 (GABA-rho2 or GABRR2). This family contains extracellular domain (ECD) of the rho subunit 2 of type-A gamma-aminobutyric acid receptor (GABAAR), encoded by the GABRR2 gene which exists next to GABRR1 (encoding rho subunit 1) on the chromosome region thought to be associated with susceptibility for psychiatric disorders and epilepsy. Close proximity of the rho1 and rho2 subunit genes suggests that they emerged via a local duplication event. Rho1 is expressed in many areas of the brain, but especially high in the retina. This subunit homo-oligomerizes to form GABAA-rho receptors (formerly classified as GABA-rho or GABAc receptor), but does not co-assemble with any of the classical GABAAR subunits. In humans, mutations in the GABRR2 gene may be responsible for some cases of autosomal recessive retinitis pigmentosa. Variation in GABRR2 is also associated with susceptibility to bipolar schizoaffective disorder, as well as alcohol dependence and general cognitive ability. GABA-rho2 receptors expressed pre-synaptically in the spinal dorsal horn have been implicated in pain perception and identified as a novel target for analgesia.	186
349849	cd19048	LGIC_ECD_GABAAR_rho3	extracellular domain of gamma-aminobutyric acid receptor subunit rho-3 (GABAA-rho3). This family contains extracellular domain (ECD) of the rho subunit 3 of type-A gamma-aminobutyric acid receptor (GABAAR), encoded by the GABRR3 gene which maps to a different chromosome to that of GABRR1 and GABRR2. While close proximity of the rho1 and rho2 subunit genes suggests that they emerged via a local duplication event, GABRR3 may have arisen by duplication of a GABRR1/GABRR2 progenitor. This subunit homo-oligomerizes to form GABAA-rho receptors (formerly classified as GABA-rho or GABAc receptor), but does not co-assemble with any of the classical GABAAR subunits. In humans, some individuals contain a variant that is predicted to inactivate this gene product.	186
349851	cd19049	LGIC_TM_anion	transmembrane domain of anionic Cys-loop neurotransmitter-gated ion channels, includes GABAAR, GlyR and GluCl. This family contains transmembrane domain of type-A gamma-aminobutyric acid receptor (GABAAR) as well as glycine receptor (GlyR) subunits. Thus far, there are 18 vertebrate receptor subunits categorized in 7 families: alpha1-6, beta1-4, gamma1-4, delta, epsilon, theta, rho, and pi. The transmembrane region consists of four transmembrane-spanning alpha-helical segments (M1-M4) that are linked by loops. The intracellular loop that links M1 and M2 determines the ion selectivity of the channel. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. GlyR, with a similar structure as GABAAR, is concentrated in the brain stem and spinal cord in the CNS and can be activated by glycine, beta-alanine, or taurine. It is selectively blocked by the high-affinity competitive antagonist strychnine, which causes death by asphyxiation. An autosomal dominant R271Q mutation in GLRA1 causes hyperekplexia (Startle disease or Stiff Baby Syndrome) by decreasing glycine sensitivity.	111
349852	cd19050	LGIC_TM_bact	transmembrane domain of prokaryotic pentameric ligand-gated ion channels (pLGIC). This family contains transmembrane (TM) domain of bacterial pentameric ligand-gated ion channels (pLGICs) including ones from Gloeobacter violaceus (GLIC) and Erwinia chrysanthemi (ELIC). The transmembrane region consists of four transmembrane-spanning alpha-helical segments (M1-M4) that are linked by loops. Studies show that GLIC activation is inhibited by most general anaesthetics at clinical concentrations, including xenon which has been used in clinical practice as a potent gaseous anesthetic for decades. Xenon binding sites have been identified in three distinct regions of the TMD: in a large intra-subunit cavity, in the pore, and at the interface between adjacent subunits. Propofol, the drug used for induction and maintenance of general anesthesia, and desflurane, a negative allosteric modulator of GLIC bind at the entrance in the intra-subunit cavity. Alzheimer's drug memantine, which blocks ion conduction at vertebrate pLGICs by plugging the channel pore, has been shown to have similar potency in ELIC.	119
349853	cd19051	LGIC_TM_cation	transmembrane domain of Cys-loop neurotransmitter-gated ion channels, includes 5HT3, nAChR, and ZAC. This superfamily contains the transmembrane (TM) domain of cationic Cys-loop neurotransmitter-gated ion channels, which include nicotinic acetylcholine receptor (nAChR), serotonin 5-hydroxytryptamine receptor (5-HT3), and zinc-activated ligand-gated ion channel (ZAC) receptor. The transmembrane region consists of four transmembrane-spanning alpha-helical segments (M1-M4) that are linked by loops. The intracellular loop that links M1 and M2 determines the ion selectivity of the channel. The ligand-gated ion channels (LGICs) in this family are found across metazoans and have close homologs in bacteria. They are vital for communication throughout the nervous system. nAChR is a non-selective cation channel that is permeable to Na+ and K+, and some subunit combinations are also permeable to Ca2+. Na+ enters and K+ exits to allow net flow of positively charged ions inward. 5-HT3, a cation-selective channel, binds serotonin and is permeable to Na+, K+, and Ca2+. It mediates neuronal depolarization and excitation within the central and peripheral nervous systems. ZAC forms an ion channel gated by Zn2+, Cu2+, and H+ and is non-selectively permeable to monovalent cations. However, the role of ZAC in Zn2+, Cu2+, and H+ signaling is as yet unknown.	112
349854	cd19052	LGIC_TM_GABAAR_alpha	transmembrane domain of alpha subunits of type-A gamma-aminobutyric acid receptor (GABAAR). This family contains transmembrane domain of type-A gamma-aminobutyric acid receptor (GABAAR) as well as glycine receptor (GlyR) subunits. Thus far, there are 18 vertebrate receptor subunits categorized in 7 families: alpha1-6, beta1-4, gamma1-4, delta, epsilon, theta, rho, and pi. The transmembrane region consists of four transmembrane-spanning alpha-helical segments (M1-M4) that are linked by loops. The intracellular loop that links M1 and M2 determines the ion selectivity of the channel. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. GlyR, with a similar structure as GABAAR, is concentrated in the brain stem and spinal cord in the CNS and can be activated by glycine, beta-alanine or taurine. It is selectively blocked by the high-affinity competitive antagonist strychnine, which causes death by asphyxiation. An autosomal dominant R271Q mutation in GLRA1 causes hyperekplexia (Startle disease or Stiff Baby Syndrome) by decreasing glycine sensitivity.	111
349855	cd19053	LGIC_TM_GABAAR_beta	transmembrane domain of beta subunits of type-A gamma-aminobutyric acid receptor (GABAAR). This family contains transmembrane (TM) domain of the beta subunit of type-A gamma-aminobutyric acid receptor (GABAAR), which includes beta1-beta4 in vertebrates. The transmembrane region consists of four transmembrane-spanning alpha-helical segments (M1-M4) that are linked by loops. The intracellular loop that links M1 and M2 determines the ion selectivity of the channel. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. Mutations or genetic variations of the genes encoding beta2 (GABRB2) and beta3 (GABRB3) have been associated with human epilepsy, both with and without febrile seizures. Mutations in GABRB2 and GABRB3 have been associated with infantile spasms and Lennox-Gastaut syndrome. A de novo missense mutation of GABRB2 causes early myoclonic encephalopathy, a disease with a devastating prognosis, characterized by neonatal onset of seizures. Another de novo heterozygous missense variant in exon 4 of GABRB2 is associated with intellectual disability and epilepsy. Mutations in the GABRB1 gene encoding beta1 promote alcohol consumption through increased tonic inhibition.	111
349856	cd19054	LGIC_TM_GABAAR_gamma	transmembrane domain of gamma subunits of type-A gamma-aminobutyric acid receptor (GABAAR). This family contains transmembrane (TM) domain of the gamma subunit of type-A gamma-aminobutyric acid receptor (GABAAR), which includes gamma1-gamma3 in vertebrates. The transmembrane region consists of four transmembrane-spanning alpha-helical segments (M1-M4) that are linked by loops. The intracellular loop that links M1 and M2 determines the ion selectivity of the channel. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. Studies show upregulating or preserving GABAA gamma1/3 and gamma2 receptors may protect neurons against neurofibrillary pathology in Alzheimer's disease. Pathogenic missense and truncating variants in GABRG2 have been associated with a spectrum of epilepsies, from Dravet syndrome to milder simple febrile seizures. Polymorphisms in GABRG3 show consistent evidence of association with alcohol dependence.	111
349857	cd19055	LGIC_TM_GABAAR_delta	transmembrane domain of delta subunits of type-A gamma-aminobutyric acid receptor (GABAAR). This family contains transmembrane (TM) domain of the delta subunit of type-A gamma-aminobutyric acid receptor (GABAAR), encoded by the gene GABRD. The transmembrane region consists of four transmembrane-spanning alpha-helical segments (M1-M4) that are linked by loops. The intracellular loop that links M1 and M2 determines the ion selectivity of the channel. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. Receptors containing the delta subunit (GABRD) are expressed exclusively extra-synaptically (in the cortex, hippocampus, thalamus, striatum, and cerebellum) and mediate tonic inhibition. Studies suggest that delta subunits form heteropentamers in similar stoichiometry and arrangement as alpha/beta/gamma receptors, with the delta subunit replacing the gamma subunit (2alpha:2beta:1delta), although other stoichiometries have also been detected. The delta subunit is flexible in its positioning in the pentameric complex, producing receptors with diverse pharmacological properties. Mutations in GABRD have been associated with susceptibility to generalized epilepsy with febrile seizures, type 5. GABRD gene may also be associated with childhood-onset mood disorders.	121
349858	cd19056	LGIC_TM_GABAAR_theta	transmembrane domain of theta subunits of type-A gamma-aminobutyric acid receptor (GABAAR). This family contains transmembrane (TM) domain of the theta subunit of type-A gamma-aminobutyric acid receptor (GABAAR). The transmembrane region consists of four transmembrane-spanning alpha-helical segments (M1-M4) that are linked by loops. The intracellular loop that links M1 and M2 determines the ion selectivity of the channel. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. GABA stimulates human hepatocellular carcinoma growth through overexpressed GABAA receptor theta subunit. Also, two autism spectrum disorder (ASD)-associated protein truncation variants have been identified in alpha 3 (GABRA3) and theta (GABRQ) genes.	118
349859	cd19057	LGIC_TM_GABAAR_epsilon	transmembrane domain of epsilon subunits of type-A gamma-aminobutyric acid receptor (GABAAR). This family contains transmembrane (TM) domain of type-A gamma-aminobutyric acid receptor (GABAAR) subunits as well as glycine receptor (GlyR). Thus far, there are 18 vertebrate receptor subunits categorized in 7 families: alpha1-6, beta1-4, gamma1-4, delta, epsilon, theta, rho, and pi. The transmembrane region consists of four transmembrane-spanning alpha-helical segments (M1-M4) that are linked by loops. The intracellular loop that links M1 and M2 determines the ion selectivity of the channel. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. GlyR, with a similar structure as GABAAR, is concentrated in the brain stem and spinal cord in the CNS and can be activated by glycine, beta-alanine, or taurine. It is selectively blocked by the high-affinity competitive antagonist strychnine, which causes death by asphyxiation. An autosomal dominant R271Q mutation in GLRA1 causes hyperekplexia (Startle disease or Stiff Baby Syndrome) by decreasing glycine sensitivity.	115
349860	cd19058	LGIC_TM_GABAAR_pi	transmembrane domain of pi subunits of type-A gamma-aminobutyric acid receptor (GABAAR). This family contains transmembrane (TM) domain of the pi subunit of type-A gamma-aminobutyric acid receptor (GABAAR), encoded by the gene GABRP. The transmembrane region consists of four transmembrane-spanning alpha-helical segments (M1-M4) that are linked by loops. The intracellular loop that links M1 and M2 determines the ion selectivity of the channel. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. GABRP is expressed mainly in non-neuronal tissues such as the mammary gland, prostate gland, lung, thymus, and uterus. It is also highly expressed in certain types of cancer such as basal-like breast cancer and pancreatic ductal adenocarcinoma. GABRP is involved in inhibitory synaptic transmission in the central nervous system. Its assembly with other GABAAR subunits alters the sensitivity of recombinant receptors to modulatory agents such as pregnanolone. Studies suggest that polymorphisms in the GABRP gene may be associated with susceptibility to systemic lupus erythematosus (SLE).	123
349861	cd19059	LGIC_TM_GABAAR_rho	transmembrane domain of rho subunits of type-A gamma-aminobutyric acid receptor (GABAAR). This family contains transmembrane (TM) domain of the rho subunit of type-A gamma-aminobutyric acid receptor (GABAAR), which includes rho1-3. The transmembrane region consists of four transmembrane-spanning alpha-helical segments (M1-M4) that are linked by loops. The intracellular loop that links M1 and M2 determines the ion selectivity of the channel. GABAAR is an anionic channel, mediating fast inhibitory synaptic transmission. Cl- ions are selectively conducted through the GABAAR pore, resulting in hyperpolarization of the neuron. GABAAR is the principal mediator of rapid inhibitory synaptic transmission in the human brain. A decline in GABAAR signaling triggers hyperactive neurological disorders such as insomnia, anxiety, and epilepsy. These rho subunits homo-oligomerize to form GABAA-rho receptors (formerly classified as GABA-rho or GABAC receptor) but do not co-assemble with any of the classical GABAA subunits. They are especially highly expressed in the retina and have distinctive pharmacological properties; they are not modulated by many GABAA receptor modulators such as barbiturates, benzodiazepines, and neuroactive steroids. In humans, mutations in the rho-1 and rho-2 genes, GABRR1 and GABRR2, may be responsible for some cases of autosomal recessive retinitis pigmentosa. Variation in GABRR1 is also associated with susceptibility to bipolar schizoaffective disorder while a SNP in GABRR2 has been reported to show association with autism.	113
349862	cd19060	LGIC_TM_GlyR_alpha	transmembrane domain of alpha subunits of glycine receptor (GlyR). This family contains transmembrane (TM) domain of the alpha subunit of glycine receptor (GlyR or GLR) of the amino acid neurotransmitter glycine. The transmembrane region consists of four transmembrane-spanning alpha-helical segments (M1-M4) that are linked by loops. The intracellular loop that links M1 and M2 determines the ion selectivity of the channel. GlyR has four known isoforms of the alpha-subunit (alpha1-4, encoded by GLRA1, GLRA2, GLRA3, GLRA4) that are essential to bind ligands and, along with the GlyR beta subunit, have been described to have a regionally and temporally controlled expression during development and maturation of the central nervous system (CNS). These alpha subunits are highly homologous but differ in their kinetic properties, temporal and regional expression and physiological functions. They can form functional chloride-permeable GlyR ion channels by forming homopentamers with 5 alpha subunits or heteropentamers with a combination of alpha and beta subunits, either a 2alpha-3beta or 3alpha-2beta stoichiometry. In humans, mutations in glycine receptor alpha subunits cause disruption of GlyR surface expression or reduced ability of expressed GlyRs to conduct chloride ions. Mutations in the GlyR alpha1 subunit lead to hyperekplexia, a rare neurological disorder characterized by neonatal hypertonia and exaggerated startle responses to unexpected stimuli, while mutations in GlyR alpha2 are known to cause cortical neuronal migration/autism spectrum disorder and in GlyR alpha3 to cause inflammatory pain sensitization/rhythmic breathing. GlyR alpha1 and alpha2 subunits have an important role in regulation of the excitatory-inhibitory balance, control of motor actions, modulation of sedative ethanol effects and probably regulation of ethanol preference and consumption.	120
349863	cd19061	LGIC_TM_GlyR_beta	transmembrane domain of beta subunits of glycine receptor (GlyR). This family contains transmembrane (TM) domain of the beta subunit of glycine receptor (GlyR or GLR) of the amino acid neurotransmitter glycine, encoded by GLRB gene. The transmembrane region consists of four transmembrane-spanning alpha-helical segments (M1-M4) that are linked by loops. The intracellular loop that links M1 and M2 determines the ion selectivity of the channel. These subunits form heteropentamers with a combination of alpha and beta subunits, either a 2alpha-3beta or 3alpha-2beta stoichiometry. While the alpha subunits contain binding sites for agonists and antagonists and are responsible for ion channel formation, the beta subunit displays structural and regulatory functions, such as GlyR clustering in synaptic locations by interaction between intracellular loop domains with the scaffolding protein gephyrin, and control of pharmacologic responses to agonist or allosteric modulators due in part to the presence of interfaces alpha/beta and beta/beta. GLRB gene mutations are associated with the neurological disorder hyperekplexia, a rare neurological disorder characterized by neonatal hypertonia and exaggerated startle responses to unexpected stimuli, as well as agoraphobic cognitions.	114
349864	cd19062	LGIC_TM_GluCl	transmembrane domain of glutamate gated chloride channel (GluCl). This family contains transmembrane (TM) domain of the glutamate-gated chloride channel (GluCl), which is found only in protostomia but is closely related to mammalian glycine receptors. The transmembrane region consists of four transmembrane-spanning alpha-helical segments (M1-M4) that are linked by loops. The intracellular loop that links M1 and M2 determines the ion selectivity of the channel. These GluCl channels have several roles in these invertebrates, including controlling locomotion and feeding, and mediating sensory inputs into behavior. Comparison of the GluCl gene families between organisms shows that the insect gene family is relatively simple, while that found in nematodes tends to be larger and more diverse. Glutamate is an inhibitory neurotransmitter that shapes the responses of projection neurons to olfactory stimuli in Drosophila. GluCls are targeted by the macrocyclic lactone family of anthelmintics and pesticides in arthropods and nematodes, thus making the GluCls of considerable medical and economic importance. In Drosophila melanogaster, GluCl mediates sensitivity to the antiparasitic agents ivermectin and nodulisporic acid, suggesting that their drug target is the same throughout the Ecdysozoa.	116
349865	cd19063	LGIC_TM_5-HT3	transmembrane domain of 5-hydroxytryptamine 3 (5-HT3) receptor. This family contains transmembrane (TM) domain of the serotonin 5-HT3 receptors. The transmembrane region consists of four transmembrane-spanning alpha-helical segments (M1-M4) that are linked by loops. The intracellular loop that links M1 and M2 determines the ion selectivity of the channel. The 5-HT3 channel is cation-selective and mediates neuronal depolarization and excitation within the central and peripheral nervous systems. Like other ligand gated ion channels, the 5-HT3 receptor consists of five subunits arranged around a central ion conducting pore, which is permeable to Na+, K+, and Ca2+ ions. Binding of the neurotransmitter 5-hydroxytryptamine (serotonin) to the 5-HT3 receptor opens the channel, which then leads to an excitatory response in neurons, and the rapidly activating, desensitizing, inward current is predominantly carried by Na+ and K+ ions. This receptor is most closely related by homology to the nicotinic acetylcholine receptor (nAChR). Five subunits have been identified for this family: 5-HT3A, 5-HT3B, 5-HT3C, 5-HT3D, and 5-HT3E, encoded by HTR3A-E genes. Only 5-HT3A subunits are able to form functional homomeric receptors, whereas the 5-HT3B, C, D, and E subunits form heteromeric receptors with 5-HT3A. Different receptor subtypes are important mediators of nausea and vomiting during chemotherapy, pregnancy, and following surgery, while some contribute to neuro-gastroenterologic disorders such as irritable bowel syndrome (IBS) and eating disorders as well as co-morbid psychiatric conditions. 5-HT3 receptor antagonists are established treatments for emesis and IBS, and are beneficial in the treatment of psychiatric diseases.	121
349866	cd19064	LGIC_TM_nAChR	transmembrane domain of nicotinic acetylcholine receptor (nAChR). This family contains transmembrane (TM) domain of the nicotinic acetylcholine receptor (nAChR). The transmembrane region consists of four transmembrane-spanning alpha-helical segments (M1-M4) that are linked by loops. The intracellular loop that links M1 and M2 determines the ion selectivity of the channel. nAChR is found in high concentrations at the nerve-muscle synapse, where it mediates fast chemical transmission of electrical signals in response to the endogenous neurotransmitter acetylcholine (ACh) released from the nerve terminal into the synaptic cleft. Thus far, seventeen nAChR subunits have been identified, including ten alpha subunits, four beta subunits and one gamma, delta, and epsilon subunit each, all found on the cell membrane that non-selectively conducts cations (Na+, K+, Ca++). These nAChR subunits combine in several different ways to form functional nAChR subtypes which are broadly categorized as either muscle subtype located at the neuromuscular junction or neuronal subtype that are found on neurons and on other cell types throughout the body. The muscle type of nAChRs are formed by the alpha1, beta1, gamma, delta, and epsilon subunits while the neuronal type are composed of nine alpha subunits and three beta subunits, which combine in various permutations and combinations to form functional receptors. Among various subtypes of neuronal nAChRs, the homomeric alpha7 and the heteromeric alpha4beta2 receptors are the main subtypes widely distributed in the brain and implicated in the pathophysiology of neurodevelopmental disorders such as schizophrenia and autism and neurodegenerative disorders such as Alzheimer's disease and Parkinson's disease. Among subtypes of muscle nAChRs, the heteromeric subunits (alpha1)2, beta, gamma, and delta in fetal muscle, and the gamma subunit replaced by epsilon in adult muscle have been implicated in congenital myasthenic syndromes and multiple pterygium syndromes due to various mutations. This family also includes alpha- and beta-like nAChRs found in protostomia.	113
349867	cd19065	LGIC_TM_ZAC	transmembrane domain of zinc-activated ligand-gated ion channel. This family contains transmembrane (TM) domain of zinc-activated ligand-gated ion channel (ZAC). The transmembrane region consists of four transmembrane-spanning alpha-helical segments (M1-M4) that are linked by loops. The intracellular loop that links M1 and M2 determines the ion selectivity of the channel. ZAC displays low sequence similarity to other members in the superfamily, with closest matches to the human serotonin 5-HT3 receptor (5-HT3R) subunits 5-HT3A and 5-HT3B, and nAChR alpha7 subunits that exhibit approximately 15% amino acid sequence identity to ZAC. Expression of ZAC has been detected in human fetal whole brain, spinal cord, pancreas, placenta, prostate, thyroid, trachea, and stomach, as well as in adult hippocampus, striatum, amygdala, and thalamus. ZAC forms an ion channel gated by Zn2+, Cu2+, and H+, and is non-selectively permeable to monovalent cations. However, the role of ZAC in Zn2+, Cu2+, and H+ signaling is as yet unknown.	176
380453	cd19066	C_NRPS-like	Condensation domain of nonribosomal peptide synthetases (NRPSs). Condensation (C) domains of nonribosomal peptide synthetases (NRPSs) catalyze peptide bond formation within (usually) large multi-modular enzymatic complexes. NRPS can use a large variety of acyl monomers (approximately 500 different possible monomer substrates as opposed to the 20 standard amino acids in ribosomal protein synthesis) to construct bioactive secondary metabolites of 2 to 18 units long, with various activities such as antibiotic, antifungal, antitumor and immunosuppression. There are various subtypes of C-domains such as the LCL-type which catalyzes peptide bond formation between two L-amino acids, the DCL-type which links an L-amino acid to the D-amino acid at the end of a growing peptide, starter C-domains which acylate the first amino acid with a beta-hydroxy carboxylic acid, and heterocyclization (Cyc) domains which catalyze both peptide bond formation and cyclization of Cys, Ser, or Thr residues. Typically, an NRPS module consists of an adenylation domain, a peptidyl carrier protein (PCP) domain (also known as thiolation (T) domain) and a C-domain. NRPS modules may also include specialized domains such as the terminal-module thioesterase (Te) domain that releases the product via hydrolysis or macrocyclization and any of various C-domain family members such as the epimerization (E) domain, the ester-bond forming C-domain, dual E/C (epimerization and condensation) domains, and the X-domain. C-domains typically have a conserved HHxxxD motif at the active site; mutations in this motif can abolish or diminish condensation activity.	427
410989	cd19067	PfuEndoQ-like	lesion-specific endonuclease similar to Pyrococcus furiosus EndoQ. Pyrococcus furiosus EndoQ is a lesion-specific endonuclease which is assumed to be involved in DNA repair pathways in Thermococcales. It recognizes a deaminated base and hydrolyzes the phosphodiester bond 5' to the site of the lesion. Initially identified as a hypoxanthine-specific endonuclease, it has now been shown that EndoQ also recognizes uracil, xanthine, and apurinic/apyrimidinic (AP) sites in DNA, and that a homolog in Bacillus pumilus shares functional properties of the archaeal EndoQs.	395
381297	cd19071	AKR_AKR1-5-like	AKR1/2/3/4/5 family of aldo-keto reductase (AKR) and similar proteins. Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. The family includes AKR1A/B/C/D/E/G/I, AKR2A/B/C/D/E, AKR3A/B/C/D/E/G, AKR4A/B/C, AKR5A/B/C/D/E/F/G/H, and similar proteins.	251
381298	cd19072	AKR_AKR3F1-like	Thermotoga maritima Tm1743, Escherichia coli YeaE and similar proteins. Thermotoga maritima Tm1743 is a founding member of aldo-keto reductase family 3 member F1 (AKR3F1). It is an aldo/keto reductase family oxidoreductase. Escherichia coli YeaE may act as an aldo-keto reductase (AKR) that catalyzes the reversible reduction of ketones to the respective alcohols using NAD(P)H as a hydride donor.	263
381299	cd19073	AKR_AKR3F2_3	Escherichia coli 2,5-diketo-D-gluconic acid reductase B (DkgB/YafB), Sinorhizobium meliloti isatin reductase and similar proteins. Escherichia coli DkgB/YafB (EC 1.1.1.346), also called 2,5-didehydrogluconate reductase (2-dehydro-L-gulonate-forming), or 2,5-DKG reductase B, or 2,5-DKGR B, or 25DKGR-B, is a founding member of aldo-keto reductase family 3 member F2 (AKR3F2). It catalyzes the reduction of 2,5-diketo-D-gluconic acid (25DKG) to 2-keto-L-gulonic acid (2KLG). Sinorhizobium meliloti isatin reductase is a founding member of aldo-keto reductase family 3 member F3 (AKR3F3). It is an aldo/keto reductase family oxidoreductase.	243
381300	cd19074	Aldo_ket_red_shaker-like	Shaker potassium channel beta subunit family and similar proteins. This family includes voltage-gated potassium channel subunits, beta-1 (KCAB1B), beta-2 (KCAB2B) and beta-3 (KCAB3B). KCAB1B and KCAB2B are cytoplasmic potassium channel subunits that modulate the characteristics of the channel-forming alpha-subunits. KCAB3B is an accessory potassium channel protein which modulates the activity of the pore-forming alpha subunit. The family also includes Drosophila melanogaster Hk protein, a founding member of aldo-keto reductase family 6 member B1 (AKR6B1), as well as voltage-gated potassium channel subunit beta (KCAB) from Arabidopsis thaliana and Egeria densa, founding members of AKR6C1 and AKR6C2, respectively. Hk protein, also called hyperkinetic, is a beta subunit of Shaker (Sh) K+ channels and shows high sequence homology to aldoketoreductase. KCAB, also called Shaker channel b-subunit, or K(+) channel subunit beta, or potassium voltage beta 1, or KV-beta1, or KAB1, is a probable accessory potassium channel protein which modulates the activity of the pore-forming alpha subunit.	297
381301	cd19075	AKR_AKR7A1-5	AKR7A family of aldo-keto reductase (AKR). Aflatoxin B1 aldehyde reductase member 1/3 (AKR7A1/AKR7A3/AFAR) from Rattus norvegicus, aflatoxin B1 aldehyde reductase member 2 (AKR7A2/AFAR1/AFAR) and aflatoxin B1 aldehyde reductase member 3 (AKR7A3/AFAR2) from Homo sapiens, aflatoxin B1 aldehyde reductase member 2 (AKR7A2/AFAR2) from Rattus norvegicus, and aflatoxin B1 aldehyde reductase member 2 (AKR7A2/AKR7A5/AFAR) from Mus musculus, are founding members of aldo-keto reductase family 7 member A1-5 (AKR7A1-5), respectively. AKR7A2 (EC 1.1.1.n11), also called AFB1 aldehyde reductase 1, or AFB1-AR 1, or aldoketoreductase 7, or succinic semialdehyde reductase, or SSA reductase, catalyzes the NADPH-dependent reduction of succinic semialdehyde to gamma-hydroxybutyrate (GHB). It has NADPH-dependent aldehyde reductase activity towards 2-carboxybenzaldehyde, 2-nitrobenzaldehyde and pyridine-2-aldehyde (in vitro). AKR7A2, AKR7A3 (also called AFB1 aldehyde reductase 2 or AFB1-AR 2), and AKR7A4 (also called AFB1 aldehyde reductase 3, or AFB1-AR 3, or aldoketoreductase 7-like), may be involved in protection of liver against the toxic and carcinogenic effects of aflatoxin B1 (AFB1), a potent hepatocarcinogen. They can reduce the dialdehyde protein-binding form of AFB1 to the non-binding AFB1 dialcohol.	304
381302	cd19076	AKR_AKR13A_13D	AKR13A and AKR13D families of aldo-keto reductase (AKR). Schizosaccharomyces pombe aldo-keto reductase YakC is a founding member of aldo-keto reductase family 13 member A1 (AKR13A1). It catalyzes the reversible reduction of ketones to the respective alcohols using NADP(+) as a hydride donor. Rauvolfia serpentina PR is a founding member of aldo-keto reductase family 13 member D1 (AKR13D1). It catalyzes the NADPH-dependent reduction of the aldehyde perakine to yield the alcohol raucaffrinoline in the biosynthetic pathway of ajmaline in Rauvolfia, a key step in indole alkaloid biosynthesis. This family also includes Arabidopsis thaliana aldo-keto reductases, ALKR1-6.	303
381303	cd19077	AKR_AKR8A1-2	AKR8A family of aldo-keto reductase (AKR). Schizosaccharomyces pombe PLR and PLR2 are founding members of aldo-keto reductase family 8 member A1-2 (AKR8A1-2), respectively. PLR (EC 1.1.1.65), also called PL reductase (PL-red), catalyzes the reduction of pyridoxal (PL) with NADPH and oxidation of pyridoxine (PN) with NADP(+).	302
381304	cd19078	AKR_AKR13C1_2	AKR13C family of aldo-keto reductase (AKR). The AKR13C family includes Helicobacter pylori aldehyde reductase (AKR13C1) and Thermotoga maritima aldo-keto reductase (AKR13C2). Aldehyde reductase (EC 1.1.1.21), also called aldose reductase, is a cytosolic NADPH-dependent oxidoreductase that catalyzes the reduction of a variety of aldehydes and carbonyls, including monosaccharides.	301
381305	cd19079	AKR_EcYajO-like	Escherichia coli YajO and similar proteins. Escherichia coli YajO is the prototype of this family. It is an uncharacterized aldo/keto reductase family oxidoreductase.	312
381306	cd19080	AKR_AKR9A_9B	AKR9A and AKR9B families of aldo-keto reductase (AKR). The AKR9A family includes Aspergillus nidulans sterigmatocystin biosynthesis dehydrogenase StcV, Aspergillus flavus norsolorinic acid reductase (NOR), and Phanerochaete chrysosporium aryl-alcohol dehydrogenase [NADP(+)] (AAD), which are founding members of aldo-keto reductase family 9 member A1-3 (AKR9A1-3), respectively. StcV may be involved in the dehydration of 5'-hydroxyaverantin to form averufin. NOR is involved in aflatoxin biosynthesis. AAD (EC 1.1.1.91) is involved in lignin degradation and reduces aromatic benzaldehydes to their respective alcohols in the presence of NADP(H). The AKR9B family includes Saccharomyces cerevisiae aryl-alcohol dehydrogenases AAD14p, AAD3p, AAD4p, and AAD10p, which are founding members of aldo-keto reductase family 9 member B1-4 (AKR9B1-4), respectively.	307
381307	cd19081	AKR_AKR9C1	AKR9C family of aldo-keto reductase (AKR). Haloferax volcanii aldo-keto reductase is a founding member of aldo-keto reductase family 9 member C1 (AKR9C1).	308
381308	cd19082	AKR_AKR10A1_2	AKR10A family of aldo-keto reductase (AKR). Streptomyces bluensis aldo-keto reductase (BlmT) and Streptomyces glaucescens aldo-keto reductase (StrT) are founding members of aldo-keto reductase family 10 member A1 (AKR10A1) and A2 (AKR10A2). BlmT is bluensomycin aldo-keto reductase (AKR) and StrT is streptomycin AKR.	291
381309	cd19083	AKR_AKR11A1_11D1	AKR11A and AKR11D families of aldo-keto reductase (AKR). Bacillus subtilis aldo-keto reductase IolS, also called vegetative protein 147 (VEG147), is a founding member of aldo-keto reductase family 11 member A1 (AKR11A1). It is able to reduce the standard aldo-keto reductase (AKR) substrates DL-glyceraldehyde, D-erythrose, and methylglyoxal in the presence of NADPH, albeit with poor efficiency in vitro. Bacillus aryabhattai aldo keto reductase is a founding member of aldo-keto reductase family 11 member D1 (AKR11D1).	307
381310	cd19084	AKR_AKR11B1-like	AKR11B1/AKR11B2 subfamily of aldo-keto reductase (AKR). Bacillus subtilis YhdN, also called general stress protein 69 (GSP69), is a founding member of aldo-keto reductase family 11 member B1 (AKR11B1). It acts as an aldo-keto reductase (AKR) that catalyzes the reversible reduction of ketones to the respective alcohols using NAD(P)H as a hydride donor. Escherichia coli YdjG is a founding member of aldo-keto reductase family 11 member B2 (AKR11B2). It catalyzes the NADH-dependent reduction of methylglyoxal (2-oxopropanal) in vitro. It may play some role in intestinal colonization.	296
381311	cd19085	AKR_AKR11B3	Synechococcus sp. aldo-keto reductase (SakR1) and similar proteins. Synechococcus sp. SakR1 is a founding member of aldo-keto reductase family 11 member B3 (AKR11B3). It is responsible for methylglyoxal detoxification.	292
381312	cd19086	AKR_AKR11C1	AKR11C family of aldo-keto reductase (AKR). Bacillus subtilis uncharacterized oxidoreductase YqkF is a founding member of aldo-keto reductase family 11 member C1 (AKR11C1). It may function as oxidoreductase. This family also includes Bacillus halodurans AKR11C1, an NADPH-dependent 4-hydroxy-2,3-trans-nonenal reductase.	238
381313	cd19087	AKR_AKR12A1_B1_C1	AKR12A, AKR12B, and AKR12C families of aldo-keto reductase (AKR). Streptomyces fradiae TylCII, Saccharopolyspora erythraea EryBII, and Streptomyces avermitilis aveBVIII are founding members of aldo-keto reductase family 12 member A1 (AKR12A1), B1 (AKR12B1), and C1 (AKR12C1), respectively. TylCII acts as an NDP-hexose 2,3-enoyl reductase. EryBII is a mycarose/desosamine reductase involved in L-mycarose and D-desosamine production. aveBVIII functions as a dTDP-4-keto-6-deoxy-L-hexose-2,3-reductase.	310
381314	cd19088	AKR_AKR13B1	AKR13B family of aldo-keto reductase (AKR). Xylella fastidiosa phenylacetaldehyde dehydrogenase is a founding member of aldo-keto reductase family 13 member B1 (AKR13B1). Phenylacetaldehyde dehydrogenase (EC 1.2.1.39) catalyzes the NAD+-dependent oxidation of phenylacetaldehyde to phenylacetic acid.	256
381315	cd19089	AKR_AKR14A1_2	AKR14A family of aldo-keto reductase (AKR). Escherichia coli L-glyceraldehyde 3-phosphate reductase (GPR/YghZ), also called GAP reductase, is a founding member of aldo-keto reductase family 14 member A1 (AKR14A1). It catalyzes the stereospecific, NADPH-dependent reduction of L-glyceraldehyde 3-phosphate (L-GAP). It is also involved in the stress response as a methylglyoxal reductase which converts the toxic metabolite methylglyoxal to acetol in vitro and in vivo. Salmonella enterica AKR is a founding member of aldo-keto reductase family 14 member A2 (AKR14A2). It catalyzes the conversion of 3-hydroxybutanal (3-HB) to 1,3-butanediol (1,3-BDO) by using NADPH as a cofactor.	308
381316	cd19090	AKR_AKR15A-like	AKR15A family of aldo-keto reductase and similar proteins. The AKR15 family includes Microbacterium luteolum pyridoxal 4-dehydrogenase (PLD), Pseudomonas sp. D-threo-aldose 1-dehydrogenase (FDH) and similar proteins. PLD (EC 1.1.1.107) catalyzes irreversible oxidation of pyridoxal. FDH (EC 1.1.1.122), also called (2S,3R)-aldose dehydrogenase, or L-fucose dehydrogenase, catalyzes the oxidation of L-fucose to L-fuconolactone in the presence of NADP(+). It is also active against L-galactose and, to a much lesser degree, D-arabinose. The family also includes L-galactose dehydrogenase (L-galDH) and D-arabinose 1-dehydrogenase (ARA2). L-galDH (EC 1.1.1.316), also called L-galactose 1-dehydrogenase, catalyzes the oxidation of L-galactose to L-galactono-1,4-lactone in the presence of NAD(+). It uses NAD(+) as a hydrogen acceptor much more efficiently than NADP(+). ARA2 (EC 1.1.1.116), also called NAD(+)-specific D-arabinose dehydrogenase, catalyzes the oxidation of D-arabinose to D-arabinono-1,4-lactone in the presence of NAD(+).	278
381317	cd19091	AKR_PsAKR	Polaromonas sp. aldo-keto reductase and similar proteins. The prototype of this family is an uncharacterized aldo-keto reductase from Polaromonas sp.	319
381318	cd19092	AKR_BsYcsN_EcYdhF-like	Bacillus subtilis YcsN, Escherichia coli YdhF and similar proteins. Bacillus subtilis YcsN and Escherichia coli YdhF are prototypes of this family. They are uncharacterized aldo/keto reductase family oxidoreductases.	287
381319	cd19093	AKR_AtPLR-like	Arabidopsis thaliana pyridoxal reductase (PLR) and similar proteins. Arabidopsis thaliana PLR (EC 1.1.1.65) is the prototype of this family. It catalyzes the reduction of pyridoxal (PL) with NADPH and oxidation of pyridoxine (PN) with NADP(+), and is involved in the PLP salvage pathway.	293
381320	cd19094	AKR_Tas-like	Escherichia coli Tas protein and similar proteins. Escherichia coli Tas protein is the prototype of this family. It is an NADP(H)-dependent aldo-keto reductase that catalyzes the reversible reduction of ketones to the respective alcohols using NADP(H) as a hydride donor.	328
381321	cd19095	AKR_PA4992-like	Pseudomonas aeruginosa PA4992 and similar proteins. Pseudomonas aeruginosa PA4992 is the prototype of this family. It is a putative aldo-keto reductase that catalyzes the reversible reduction of ketones to the respective alcohols using NAD(P)H as a hydride donor.	253
381322	cd19096	AKR_Fe-S_oxidoreductase	Fe-S oxidoreductase and similar proteins. The family includes a group of uncharacterized Fe-S oxidoreductases that belong to the aldo-keto reductase (AKR) superfamily. Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. AKRs are present in all phyla and are of importance in both health and industrial applications.	255
381323	cd19097	AKR_unchar	uncharacterized aldo-keto reductase (AKR) superfamily protein. This family includes a group of uncharacterized AKR superfamily proteins. Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. AKRs are present in all phyla and are of importance in both health and industrial applications.	267
381324	cd19098	AKR_unchar	uncharacterized aldo-keto reductase (AKR) superfamily protein. This family includes a group of uncharacterized AKR superfamily proteins. Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. AKRs are present in all phyla and are of importance in both health and industrial applications.	318
381325	cd19099	AKR_unchar	uncharacterized aldo-keto reductase (AKR) superfamily protein. This family includes a group of uncharacterized AKR superfamily proteins. Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. AKRs are present in all phyla and are of importance in both health and industrial applications.	316
381326	cd19100	AKR_unchar	uncharacterized aldo-keto reductase (AKR) superfamily protein. This family includes a group of uncharacterized AKR superfamily proteins. Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. AKRs are present in all phyla and are of importance in both health and industrial applications.	238
381327	cd19101	AKR_unchar	uncharacterized aldo-keto reductase (AKR) superfamily protein. This family includes a group of uncharacterized AKR superfamily proteins. Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. AKRs are present in all phyla and are of importance in both health and industrial applications.	304
381328	cd19102	AKR_unchar	uncharacterized aldo-keto reductase (AKR) superfamily protein. This family includes a group of uncharacterized AKR superfamily proteins. Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. AKRs are present in all phyla and are of importance in both health and industrial applications.	302
381329	cd19103	AKR_unchar	uncharacterized aldo-keto reductase (AKR) superfamily protein. This family includes a group of uncharacterized AKR superfamily proteins. Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. AKRs are present in all phyla and are of importance in both health and industrial applications.	299
381330	cd19104	AKR_unchar	uncharacterized aldo-keto reductase (AKR) superfamily protein. This family includes a group of uncharacterized AKR superfamily proteins. Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. AKRs are present in all phyla and are of importance in both health and industrial applications.	321
381331	cd19105	AKR_unchar	uncharacterized aldo-keto reductase (AKR) superfamily protein. This family includes a group of uncharacterized AKR superfamily proteins. Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. AKRs are present in all phyla and are of importance in both health and industrial applications.	250
381332	cd19106	AKR_AKR1A1-4	AKR1A family of aldo-keto reductase (AKR). The AKR1A family of AKR includes alcohol dehydrogenase [NADP(+)] (ALR, EC 1.1.1.2) from Homo sapiens (AKR1A1), Sus scrofa (AKR1A2), Rattus norvegicus (liver, AKR1A3), and Mus musculus (AKR1A4). ALR, also known as aldehyde reductase, or ALDR1, catalyzes the NADPH-dependent reduction of a variety of aromatic and aliphatic aldehydes to their corresponding alcohols. In vitro substrates include succinic semialdehyde, 4-nitrobenzaldehyde, 1,2-naphthoquinone, methylglyoxal, and D-glucuronic acid.	305
381333	cd19107	AKR_AKR1B1-19	AKR1B family of aldo-keto reductase (AKR). The AKR1B family of AKR includes aldose reductase (AR, EC 1.1.1.21) from Homo sapiens (AKR1B1), Oryctolagus cuniculus (kidney, AKR1B2), Mus musculus (AKR1B3), Rattus norvegicus (lens, AKR1B4), Bos taurus (lens/testis, AKR1B5), and Sus scrofa (lens, AKR1B6), aldose reductase-related protein 1 (ALD1, EC 1.1.1.21) from Mus musculus (AKR1B7), Rattus norvegicus (AKR1B14), and Homo sapiens (AKR1B15), Mus musculus fibroblast growth factor induced protein (FR-1 or AKR1B8, EC 1.1.1.21), Cricetulus griseus aldose reductase-related protein 2 (ALD2 or AKR1B9, EC 1.1.1.21), aldose reductase-like from Homo sapiens (ARL-1 or AKR1B10) and Rattus norvegicus (AKR1B13), aldo-keto reductase from Gallus domesticus (eye, tongue, esophagus, AKR1B12), and Oryctolagus cuniculus AR-like protein (3beta-HSD, AKR1B19). AR, also called aldehyde reductase, catalyzes the NADPH-dependent reduction of a wide variety of carbonyl-containing compounds to their corresponding alcohols with a broad range of catalytic efficiencies. ALD1 reduces a broad range of aliphatic and aromatic aldehydes to the corresponding alcohols. It may play a role in the metabolism of xenobiotic aromatic aldehydes. FR-1, also called aldose reductase-related protein 2, or fibroblast growth factor-regulated protein (FGFRP), is induced by fibroblast growth factor-1. It may play a role in the regulation of the cell cycle. FR-1 belongs to the NADPH-dependent aldo-keto reductase family. ALD2 is an inducible aldo-keto reductase with a preference for aliphatic substrates. It can also act on small aromatic aldehydes, steroid aldehydes and some ketone substrates. ARL-1, also called aldose reductase-like, or aldose reductase-related protein (ARP), or small intestine reductase, or SI reductase, acts as all-trans-retinaldehyde reductase that can efficiently reduce aliphatic and aromatic aldehydes, and is less active on hexoses (in vitro). It may be responsible for detoxification of reactive aldehydes in the digested food before the nutrients are passed on to other organs. AKR1B15, also called estradiol 17-beta-dehydrogenase AKR1B15, is a mitochondrial aldo-keto reductase that catalyzes the reduction of androgens and estrogens with high positional selectivity (shows 17-beta-hydroxysteroid dehydrogenase activity) as well as 3-keto-acyl-CoAs. It has a strong selectivity towards NADP(H). AKR1B19 is aldose reductase-like that may show 3-beta-hydroxysteroid dehydrogenase (3beta-HSD) activity.	307
381334	cd19108	AKR_AKR1C1-35	AKR1C family of aldo-keto reductase (AKR). The AKR1C family of aldo-keto reductase (AKR) includes AKR1C1 (20-alpha-hydroxysteroid dehydrogenase, also known as 20alpha-HSD), AKR1C2 (3alpha-HSD type 3), AKR1C3 (17beta-HSD type 5), and AKR1C4 (3alpha-HSD type 1) from Homo sapiens; AKR1C5 (20alpha-HSD, also known as prostaglandin-E(2) 9-reductase) from Rattus norvegicus (ovary); AKR1C6 (estradiol 17beta-HSD type 5) from Mus musculus; AKR1C7 (prostaglandin F synthase 1 or PGF1) from Bos taurus (lung); AKR1C8 (20alpha-HSD) from Rattus norvegicus (ovary); AKR1C9 (3alpha-HSD) from Rattus norvegicus (liver); AKR1C10a (Rho crystallin) from Rana temporaria and AKR1C10b (Rho crystallin) from Rana catesbeiana; AKR1C11 (prostaglandin F synthase 2 or PGF2) from Bos taurus (liver); AKR1C12 (aldo-keto reductase or AKR), AKR1C13 (interleukin-3-regulated AKR), and AKR1C14 (3alpha-HSD) from Mus musculus; AKR1C15 (NADPH-dependent reductase), AKR1C16 (NAD+-preferring 3alpha/17beta/20alpha-HSD), and AKR1C17 (NAD+-dependent 3alpha-HSD) from Rattus norvegicus; AKR1C18 (20alpha-HSD), AKR1C19 (3-hydroxybutyrate dehydrogenase or 3HB dehydrogenase), AKR1C20 (3alpha(17beta)-HSD), AKR1C21 (3(17)alpha-HSD), AKR1C22 (dihydrodiol dehydrogenase or DD) from Mus musculus; AKR1C23 (20alpha-HSD) from Equus caballus; AKR1C24 (NAD+-dependent 17beta-HSD) from Rattus norvegicus; AKR1C25 (3(20)alpha-HSD) from Macaca fuscata; AKR1C26 (identical to morphine 6-dehydrogenase or M6DH, acts as NAD(+)-dependent 3alpha/17beta-HSD), AKR1C27/AKR1C28 (NAD(+)-dependent 3alpha/17beta-HSDs), AKR1C29 (identical to 3-hydroxyhexobarbital dehydrogenase or 3HBD, acts as NADPH-preferring reductase with 3alpha/3beta/17beta/20alpha-HSD activity), AKR1C30 (identical to naloxone reductase type 1 and acts as 17beta-HSD), AKR1C31 (3alpha/17beta/20alpha-HSD), AKR1C32 (identical to loxoprofen reductase and acts as 3alpha/20alpha-HSD), and AKR1C33 (identical to naloxone reductase type 2 and mainly acts as 3alpha-HSD) from Oryctolagus cuniculus; AKR1C34 (NAD+-dependent morphine 6-dehydrogenase or M6DH with 3beta/17beta/20alpha-HSD activity) and AKR1C35 (NAD+-dependent dehydrogenase with 3(17)beta-HSD activity) from Mesocricetus auratus.	303
381335	cd19109	AKR_AKR1D1-3	AKR1D family of aldo-keto reductase (AKR). The AKR1D family of aldo-keto reductase includes 3-oxo-5-beta-steroid 4-dehydrogenase (EC 1.3.1.3) from Homo sapiens (AKR1D1), Rattus norvegicus (liver, AKR1D2), and Oryctolagus cuniculus (AKR1D3). 3-oxo-5-beta-steroid 4-dehydrogenase, also called delta(4)-3-ketosteroid 5-beta-reductase (EC 1.3.99.6), or delta(4)-3-oxosteroid 5-beta-reductase, or 5-beta-reductase, efficiently catalyzes the reduction of progesterone, androstenedione, 17-alpha-hydroxyprogesterone and testosterone to 5-beta-reduced metabolites.	308
381336	cd19110	AKR_AKR1E1-2	AKR1E family of aldo-keto reductase (AKR). The AKR1E family of AKR includes 1,5-anhydro-D-fructose reductase (EC 1.1.1.263) from Mus musculus (liver, AKR1E1) and Homo sapiens (AKR1E2). 1,5-anhydro-D-fructose reductase, also called AF reductase, or aldo-keto reductase family 1 member C-like protein 2 (AKR1CL2), catalyzes the NADPH-dependent reduction of 1,5-anhydro-D-fructose (AF) to 1,5-anhydro-D-glucitol. AKR1E2 is a testis aldo-keto reductase (tAKR), which is also known as testis-specific protein (TSP), or LoopADR.	301
381337	cd19111	AKR_AKR1G1_1I	Caenorhabditis elegans aldo-keto reductase (CeAKR), Coptotermes gestroi aldo-keto reductase (CgAKR-1) and similar proteins. CeAKR is a founding member of aldo-keto reductase family 1 member G1 (AKR1G1). It may catalyze the reversible reduction of ketones to the respective alcohols using NAD(P)H as a hydride donor. Coptotermes gestroi aldo-keto reductase (CgAKR-1) is a founding member of aldo-keto reductase family 1 member I (AKR1I). It is a multipurpose enzyme with potential biotechnological applications.	286
381338	cd19112	AKR_AKR2A1-2	AKR2A family of aldo-keto reductase (AKR). The AKR2A family of AKR includes AKR2A1 (NADP-dependent D-sorbitol-6-phosphate dehydrogenase or NADP-S6PDH) from Malus domestica, and AKR2A2 (NADPH-dependent mannose-6-phosphate reductase or NADPH-M6PR) from Apium graveolens. NADP-S6PDH (EC 1.1.1.200), also called aldose-6-phosphate reductase [NADPH], synthesizes sorbitol-6-phosphate, a key intermediate in the synthesis of sorbitol which is a major photosynthetic product in many members of the Rosaceae family. NADPH-M6PR (EC 1.1.1.224), also called NADPH-dependent M6P reductase, is a key enzyme involved in mannitol biosynthesis.	308
381339	cd19113	AKR_AKR2B1-10	AKR2B family of aldo-keto reductase (AKR). The AKR2B family of AKR includes NAD(P)H-dependent D-xylose reductase (XR) from Pichia stipitis, Kluyveromyces lactis, Pachysolen tannophilus, Candida tropicalis, and Candida tenuis, Gre3p from Saccharomyces cerevisiae, XR from Candida tropicalis, Pichia guilliermondii, Debaryomyces hansenii, and Debaryomyces nepalensis, which correspond to aldo-keto reductase family 2 member B1-B10 (AKR2B1-10), respectively. XR (EC 1.1.1.307) catalyzes the NAD(P)H-dependent reduction of xylose to xylitol.	310
381340	cd19114	AKR_AKR2C1	AKR2C family of aldo-keto reductase (AKR). Mucor mucedo NADP-dependent 4-dihydromethyl-trisporate dehydrogenase (TDH), also called 4-dihydromethyltrisporate dehydrogenase, or 4-dihydromethyl-TA dehydrogenase, is a founding member of aldo-keto reductase family 2 member C1 (AKR2C1). It is involved in the biosynthesis of trisporic acid, the sexual hormone of zygomycetes, which induces the first steps of zygophore development. TDH catalyzes the NADP-dependent oxidation of (+) mating-type specific precursor 4-dihydromethyl-trisporate to methyl-trisporate.	302
381341	cd19115	AKR_AKR2D1	AKR2D family of aldo-keto reductase (AKR). Aspergillus niger NAD(P)H-dependent D-xylose reductase xyl1 (XR, EC 1.1.1.307) is a founding member of aldo-keto reductase family 2 member D1 (AKR2D1). It catalyzes the initial reaction in the xylose utilization pathway by reducing D-xylose into xylitol in a NAD(P)H dependent manner.	311
381342	cd19116	AKR_AKR2E1-5	AKR2E family of aldo-keto reductase (AKR). Bombyx mori 3-dehydroecdysone reductase is a founding member of aldo-keto reductase family 2 member E4 (AKR2E4). It is a NADP-dependent oxidoreductase with high 3-dehydroecdysone reductase activity. It may play a role in the regulation of molting and has lower activity with phenylglyoxal and isatin (in vitro). This family also includes 3-dehydroecdysone 3b-reductase from Spodoptera littoralis and Trichoplusia ni, DL-glyceraldehyde reductase from Drosophila melanogaster, aldo-keto reductase from Bombyx mori, which correspond to aldo-keto reductase family 2 member E1, E2, E3 and E5 (AKR2E1/2/3/5), respectively.	292
381343	cd19117	AKR_AKR3A1-2	AKR3A family of aldo-keto reductase (AKR). Saccharomyces cerevisiae Gcy1p and Ypr1p are founding members of aldo-keto reductase family 3 member A1 (AKR3A1) and A2 (AKR3A2), respectively. Gcy1p, also called galactose-inducible crystallin-like protein 1, is a glycerol dehydrogenase involved in glycerol catabolism under microaerobic conditions. It has mRNA binding activity. Ypr1p acts as a 2-methylbutyraldehyde reductase that displays high specific activity towards 2-methylbutyraldehyde, as well as other aldehydes such as hexanal.	284
381344	cd19118	AKR_AKR3B1-3	AKR3B family of aldo-keto reductase (AKR). Sporidiobolus salmonicolor NADPH-dependent aldehyde reductase 1 (ARI, EC 1.1.1.2) and Trichosporonoides megachiliensis NADPH-dependent erythrose reductase (ER) 1/2 and 3 are founding members of aldo-keto reductase family 3 member B1 (AKR3B1), B2 (AKR3B2), and B3 (AKR3B3), respectively. Sporidiobolus salmonicolor NADPH-ARI, also called alcohol dehydrogenase [NADP(+)], or aldehyde reductase I, or ALR 1, catalyzes the asymmetric reduction of aliphatic and aromatic aldehydes and ketones to an R-enantiomer. It reduces ethyl 4-chloro-3-oxobutanoate to ethyl (R)-4-chloro-3-hydroxybutanoate. Trichosporonoides megachiliensis NADPH-ERs catalyze the reduction of D-erythrose.	283
381345	cd19119	AKR_AKR3C1	Saccharomyces cerevisiae D-arabinose dehydrogenase [NAD(P)+] heavy chain (Ara1p) and similar proteins. Saccharomyces cerevisiae Ara1p (EC 1.1.1.117), also called D-arabinose 1-dehydrogenase (NAD(P)(+)), is a founding member of aldo-keto reductase family 3 member C1 (AKR3C1). It catalyzes the oxidation of D-arabinose, L-xylose, L-fucose, and L-galactose in the presence of NADP(+).	294
381346	cd19120	AKR_AKR3C2-3	Schizosaccharomyces pombe NAD/NADP-dependent indole-3-acetaldehyde reductase, Candida parapsilosis NADPH-dependent conjugated polyketone reductase C2 (CPR), and similar proteins. Schizosaccharomyces pombe NAD/NADP-dependent indole-3-acetaldehyde reductase (EC 1.1.1.190/EC 1.1.1.191) and Candida parapsilosis NADPH-dependent CPR (EC 1.1.1.358/EC 1.1.1.168) are founding members of aldo-keto reductase family 3 member C2 (AKR3C2) and C3 (AKR3C3), respectively. Schizosaccharomyces pombe NAD/NADP-dependent indole-3-acetaldehyde reductase catalyzes the conversion from (indol-3-yl)ethanol to (indol-3-yl)acetaldehyde in a NAD/NADP-dependent manner. CPR, also called 2-dehydropantolactone reductase, or 2-dehydropantolactone reductase (A-specific), or ketopantoyl-lactone reductase, acts as a NADPH-dependent conjugated polyketone reductase with broad substrate specificity and strict stereospecificity. It reduces ketopantoyl lactone and isatin.	269
381347	cd19121	AKR_AKR3D1	AKR3D family of aldo-keto reductase (AKR). Trichoderma reesei D-galacturonate reductase (GAR1, EC 1.1.1.365), also called D-galacturonic acid reductase, or GalUR, is a founding member of aldo-keto reductase family 3 member D1 (AKR3D1). It mediates the reduction of D-galacturonate to L-galactonate, the first step in D-galacturonate catabolic process. It also has activity with D-glucuronate and DL-glyceraldehyde. Its activity is seen only with NADPH and not with NADH.	279
381348	cd19122	AKR_AKR3E1	AKR3E family of aldo-keto reductase (AKR). Trichoderma reesei NADP(+)-dependent glycerol 2-dehydrogenase (GLD2, EC 1.1.1.156), also called dihydroxyacetone reductase, is a founding member of aldo-keto reductase family 3 member E1 (AKR3E1). It acts as a glycerol oxidoreductase probably involved in glycerol synthesis.	291
381349	cd19123	AKR_AKR3G1	AKR3G family of aldo-keto reductase (AKR). Synechocystis sp. aldo/keto reductase slr0942 is a founding member of aldo-keto reductase family 3 member G1 (AKR3G1). It is an aldo/keto reductase that catalyzes the NADPH-dependent reduction of aldehyde- and ketone-groups of different classes of carbonyl compounds to the corresponding alcohols.	297
381350	cd19124	AKR_AKR4A_4B	AKR4A and AKR4B families of aldo-keto reductase (AKR). The AKR4A family of AKR includes Glycine max NAD(P)H-dependent 6'-deoxychalcone synthase (6DCS, EC 3.1.170), chalcone reductase (CHR, EC 2.3.1.74) from Medicago sativa, Glycyrrhiza echinata, and Glycyrrhiza glabra, which are founding members of aldo-keto reductase family 4 member A1 (AKR4A1), A2 (AKR4A2), A3 (AKR4A3), and A4 (AKR4A4), respectively. NAD(P)H-6DCS co-acts with chalcone synthase in formation of 4,2',4'-trihydroxychalcone, involved in the biosynthesis of glyceollin type phytoalexins. CHR, also called chalcone polyketide reductase, is a key enzyme of the flavonoid/isoflavonoid biosynthesis pathway. The AKR4B family of AKR includes Sesbania rostrata chalcone reductase (CHR, AKR4B1), Papaver somniferum codeinone reductase (COR, AKR4B2/AKR4B3), Fragaria x ananassa D-galacturonate reductase (GalUR, AKR4B4), deoxymugineic acid synthase 1 (DMAS1) from Zea mays (AKR4B5), Oryza sativa (AKR4B6), Hordeum vulgare (AKR4B7), Triticum aestivum (AKR4B8), and Erythroxylum coca methylecgonone reductase (MecgoR, AKR4B10). NADPH-dependent COR and non-functional NADPH-dependent COR from Papaver somniferum are founding members of aldo-keto reductase family 4 member B2 (AKR4B2) and B3 (AKR4B3), respectively. NADPH-dependent COR (EC 1.1.1.247) reduces codeinone to codeine in the penultimate step in morphine biosynthesis. It can use morphinone, hydrocodone, and hydromorphone as substrates during reductive reaction with NADPH as cofactor, and morphine and dihydrocodeine as substrates during oxidative reaction with NADP as cofactor. GalUR (EC 1.1.1.365), also called aldo-keto reductase 2 (AKR2), is involved in ascorbic acid (vitamin C) biosynthesis by catalyzing the conversion from L-galactonate and NADP(+) to D-galacturonate and NADPH. DMAS1 (EC 1.1.1.285) catalyzes the reduction of a 3''-keto intermediate during the biosynthesis of 2'-deoxymugineic acid (DMA) from L-Met. It is involved in the formation of phytosiderophores (MAs) belonging to the mugineic acid family and required to acquire iron. MecgoR catalyzes the stereospecific reduction of methylecgonone to methylecgonine, the penultimate step in cocaine biosynthesis.	281
381351	cd19125	AKR_AKR4C1-15	AKR4C family of aldo-keto reductase (AKR). The AKR4C family of AKR includes aldose reductase (ALR) from Hordeum vulgare (AKR4C1), Bromus inermis (AKR4C2), Avena fatua (AKR4C3), and Xerophyta viscosa (AKR4C4), two aldose reductases, DpAR1 (AKR4C5) and DpAR2 (AKR4C6), from Digitalis purpurea, aldehyde reductase from Zea mays (AKR4C7), four aldo-keto reductases from Arabidopsis thaliana (AKR4C8-11), and another three aldo-keto reductases from Aloe arborescens (AKR4C12) and Oryza sativa (AKR4C14/15). ALR (EC 1.1.1.21), also called AR, aldehyde reductase, or polyol dehydrogenase (NADP(+)), is a cytosolic NADPH-dependent oxidoreductase that catalyzes the reduction of a variety of aldehydes and carbonyls, including monosaccharides. Both DpAR1 and DpAR2 reduce the ketone group of steroid structures. They may be involved in plant steroid metabolism in general and in cardenolide biosynthesis in particular. Plant aldo-keto reductases of the AKR4C subfamily play key roles during stress and are attractive targets for developing stress-tolerant crops.	287
381352	cd19126	AKR_AKR5A_5G	AKR5A and AKR5G families of aldo-keto reductase (AKR). The AKR5A family of AKR includes prostaglandin F2-alpha synthase (PGFS) from Leishmania major (AKR5A1) and Trypanosoma brucei (AKR5A2). PGFS, also called 9,11-endoperoxide prostaglandin H2 reductase, catalyzes the NADP-dependent formation of prostaglandin F2-alpha from prostaglandin H2. It also has aldo/keto reductase activity for synthetic substrates 9,10-phenanthrenequinone and p-nitrobenzaldehyde. The AKR5G family of AKR includes Bacillus subtilis glyoxal reductase (GR), uncharacterized oxidoreductase YtbE, and Bacillus aryabhattai aldo-keto reductase, which correspond to aldo-keto reductase family 5 member G1-3 (AKR5G1-3), respectively. GR (YvgN, EC 1.1.1.283), also called methylglyoxal reductase, reduces glyoxal and methylglyoxal (2-oxopropanal). It is not involved in vitamin B6 biosynthesis.	254
381353	cd19127	AKR_AKR5B1	AKR5B family of aldo-keto reductase (AKR). Pseudomonas putida morphine 6-dehydrogenase (M6DH) is a founding member of the aldo-keto reductase family 5 member B1 (AKR5B1). M6DH (EC 1.1.1.218), also called naloxone reductase, oxidizes the C-6 hydroxy group of morphine and codeine.	268
381354	cd19128	AKR_GlAR-like	Giardia lamblia aldose reductase (AR) and similar proteins. Giardia lamblia AR (EC 1.1.1.21), also called aldehyde reductase, is the prototype of this family. It catalyzes the NADPH-dependent reduction of a wide variety of carbonyl-containing compounds to their corresponding alcohols with a broad range of catalytic efficiencies.	277
381355	cd19129	AKR_BaDH-like	Bradyrhizobium diazoefficiens dehydrogenase (DH) and similar proteins. Bradyrhizobium diazoefficiens DH is the prototype of this family. It belongs to the aldo/keto reductase family.	295
381356	cd19130	AKR_AKR5C1	Corynebacterium sp. 2,5-diketo-D-gluconic acid reductase A (DkgA) and similar proteins. Corynebacterium sp. DkgA is a founding member of aldo-keto reductase family 5 member C1 (AKR5C1). DkgA (EC 1.1.1.346), also called 2,5-DKG reductase A, or 2,5-DKGR A, or 25DKGR-A, or AKR5C, catalyzes the reduction of 2,5-diketo-D-gluconic acid (25DKG) to 2-keto-L-gulonic acid (2KLG). 5-keto-D-fructose and dihydroxyacetone can also serve as substrates.	256
381357	cd19131	AKR_AKR5C2	Escherichia coli 2,5-diketo-D-gluconic acid reductase A (DkgA/YqhE) and similar proteins. Escherichia coli DkgA/YqhE is a founding member of aldo-keto reductase family 5 member C2 (AKR5C2). DkgA/YqhE (EC 1.1.1.274), also called 2,5-DKG reductase A, or 2,5-DKGR A, or 25DKGR-A, or AKR5C, catalyzes the reduction of 2,5-diketo-D-gluconic acid (25DKG) to 2-keto-L-gulonic acid (2KLG). It is also capable of stereoselective beta-keto ester reductions on ethyl acetoacetate and other 2-substituted derivatives.	256
381358	cd19132	AKR_AKR5D1_E1	AKR5D and AKR5E families of aldo-keto reductase (AKR). 2,5-diketo-D-gluconic acid reductase B (DkgB) from Corynebacterium sp. and 2,5-diketo-D-gluconic acid reductase from Zymomonas mobilis are founding members of aldo-keto reductase family 5 member D1 (AKR5D1) and E1 (AKR5E1), respectively. DkgB (EC 1.1.1.274), also called 2,5-didehydrogluconate reductase (2-dehydro-D-gluconate-forming), or 2,5-DKG reductase B, or 2,5-DKGR B, or 25DKGR-B, catalyzes the reduction of 2,5-diketo-D-gluconic acid (25DKG) to 2-keto-L-gulonic acid (2KLG).	255
381359	cd19133	AKR_AKR5F1	AKR5F family of aldo-keto reductase (AKR). Klebsiella sp. 2,5-diketo-D-gluconic acid reductase (2,5-DKG reductase) is a founding member of aldo-keto reductase family 5 member F1 (AKR5F1). It catalyzes the reduction of 2,5-diketo-D-gluconic acid (25DKG) to 2-keto-L-gulonic acid (2KLG).	255
381360	cd19134	AKR_AKR5H1	AKR5H family of aldo-keto reductase (AKR). Mycobacterium smegmatis MSMEG_2407 is a founding member of aldo-keto reductase family 5 member H1 (AKR5H1). It is a NADPH-dependent aldo-keto reductase that reduces methylglyoxal and phenylglyoxal.	263
381361	cd19135	AKR_CeZK1290-like	Caenorhabditis elegans ZK1290.5 and similar proteins. Caenorhabditis elegans ZK1290.5 is the prototype of this family. It is an uncharacterized aldo/keto reductase family oxidoreductase.	265
381362	cd19136	AKR_DrGR-like	Danio rerio glyoxal reductase-like (GR-like) protein and similar proteins. Danio rerio GR-like protein is the prototype of this family. It is an uncharacterized aldo/keto reductase family oxidoreductase similar to Bacillus subtilis glyoxal reductase (YvgN) that reduces glyoxal and methylglyoxal (2-oxopropanal).	262
381363	cd19137	AKR_AKR3F1	Thermotoga maritima Tm1743 and similar proteins. Thermotoga maritima Tm1743 is a founding member of aldo-keto reductase family 3 member F1 (AKR3F1). It is an aldo/keto reductase family oxidoreductase.	260
381364	cd19138	AKR_YeaE	Escherichia coli YeaE and similar proteins. Escherichia coli YeaE is the prototype of this family. It acts as an aldo-keto reductase (AKR) that catalyzes the reversible reduction of ketones to the respective alcohols using NAD(P)H as a hydride donor.	266
381365	cd19139	AKR_AKR3F2	Escherichia coli 2,5-diketo-D-gluconic acid reductase B (DkgB/YafB) and similar proteins. Escherichia coli DkgB/YafB (EC 1.1.1.346), also called 2,5-didehydrogluconate reductase (2-dehydro-L-gulonate-forming), or 2,5-DKG reductase B, or 2,5-DKGR B, or 25DKGR-B, is a founding member of aldo-keto reductase family 3 member F2 (AKR3F2). It catalyzes the reduction of 2,5-diketo-D-gluconic acid (25DKG) to 2-keto-L-gulonic acid (2KLG).	248
381366	cd19140	AKR_AKR3F3	Sinorhizobium meliloti isatin reductase and similar proteins. Sinorhizobium meliloti isatin reductase is a founding member of aldo-keto reductase family 3 member F3 (AKR3F3). It is an aldo/keto reductase family oxidoreductase.	253
381367	cd19141	Aldo_ket_red_shaker	Shaker potassium channel beta subunit (AKR6A) family of aldo-keto reductase (AKR). This family includes voltage-gated potassium channel subunits, beta-1 (KCAB1B), beta-2 (KCAB2B) and beta-3 (KCAB3B). KCAB1B and KCAB2B are cytoplasmic potassium channel subunits that modulate the characteristics of the channel-forming alpha-subunits. KCAB3B is an accessory potassium channel protein which modulates the activity of the pore-forming alpha subunit.	310
381368	cd19142	AKR_AKR6B1	AKR6B family of aldo-keto reductase (AKR). Drosophila melanogaster Hk protein is a founding member of aldo-keto reductase family 6 member B1 (AKR6B1). Hk protein, also called hyperkinetic, is a beta subunit of Shaker (Sh) K+ channels and shows high sequence homology to aldoketoreductase.	325
381369	cd19143	AKR_AKR6C1_2	AKR6C family of aldo-keto reductase (AKR). Voltage-gated potassium channel subunit beta (KCAB) from Arabidopsis thaliana and Egeria densa are founding members of aldo-keto reductase family 6 member C1 (AKR6C1) and C2 (AKR6C2), respectively. KCAB, also called Shaker channel b-subunit, or K(+) channel subunit beta, or potassium voltage beta 1, or KV-beta1, or KAB1, is a probable accessory potassium channel protein which modulates the activity of the pore-forming alpha subunit.	319
381370	cd19144	AKR_AKR13A1	AKR13A family of aldo-keto reductase (AKR). Schizosaccharomyces pombe aldo-keto reductase YakC is a founding member of aldo-keto reductase family 13 member A1 (AKR13A1). It catalyzes the reversible reduction of ketones to the respective alcohols using NADP(+) as a hydride donor.	323
381371	cd19145	AKR_AKR13D1	AKR13D family of aldo-keto reductase (AKR). Rauvolfia serpentina PR is a founding member of aldo-keto reductase family 13 member D1 (AKR13D1). It catalyzes the NADPH-dependent reduction of the aldehyde perakine to yield the alcohol raucaffrinoline in the biosynthetic pathway of ajmaline in Rauvolfia, a key step in indole alkaloid biosynthesis. This family also includes Arabidopsis thaliana aldo-keto reductases, ALKR1-6.	304
381372	cd19146	AKR_AKR9A1-2	Aspergillus nidulans sterigmatocystin biosynthesis dehydrogenase StcV, Aspergillus flavus norsolorinic acid reductase (NOR), and similar proteins. Aspergillus nidulans sterigmatocystin biosynthesis dehydrogenase StcV and Aspergillus flavus norsolorinic acid reductase (NOR), are founding members of aldo-keto reductase family 9 member A1-2 (AKR9A1-2), respectively. StcV may be involved in the dehydration of 5'-hydroxyaverantin to form averufin. NOR is involved in aflatoxin biosynthesis.	326
381373	cd19147	AKR_AKR9A3_9B1-4	Phanerochaete chrysosporium aryl-alcohol dehydrogenase [NADP(+)] (AAD) and similar proteins. Phanerochaete chrysosporium AAD (EC1.1.1.91) is a founding member of aldo-keto reductase family 9 member A3. It is involved in lignin degradation and reduces aromatic benzaldehydes to their respective alcohols in the presence of NADP(H). This family also includes Saccharomyces cerevisiae aryl-alcohol dehydrogenases AAD14p, AAD3p, AAD4p, and AAD10p, which are founding members of aldo-keto reductase family 9 member B1-4 (AKR9B1-4), respectively.	319
381374	cd19148	AKR_AKR11B1	Bacillus subtilis aldo-keto reductase YhdN and similar proteins. Bacillus subtilis YhdN, also called general stress protein 69 (GSP69), is a founding member of aldo-keto reductase family 11 member B1 (AKR11B1). It acts as an aldo-keto reductase (AKR) that catalyzes the reversible reduction of ketones to the respective alcohols using NAD(P)H as a hydride donor.	302
381375	cd19149	AKR_AKR11B2	Escherichia coli NADH-specific methylglyoxal reductase (YdjG) and similar proteins. Escherichia coli YdjG is a founding member of aldo-keto reductase family 11 member B2 (AKR11B2). It catalyzes the NADH-dependent reduction of methylglyoxal (2-oxopropanal) in vitro. It may play some role in intestinal colonization.	315
381376	cd19150	AKR_AKR14A1	Escherichia coli L-glyceraldehyde 3-phosphate reductase (GPR/YghZ/AKR14A1) and similar proteins. Escherichia coli L-glyceraldehyde 3-phosphate reductase (GPR/YghZ), also called GAP reductase, is a founding member of aldo-keto reductase family 14 member A1 (AKR14A1). It catalyzes the stereospecific, NADPH-dependent reduction of L-glyceraldehyde 3-phosphate (L-GAP). It is also involved in the stress response as a methylglyoxal reductase which converts the toxic metabolite methylglyoxal to acetol in vitro and in vivo.	309
381377	cd19151	AKR_AKR14A2	Salmonella enterica aldo-keto reductase (AKR) and similar proteins. Salmonella enterica AKR is a founding member of aldo-keto reductase family 14 member A2 (AKR14A2).	309
381378	cd19152	AKR_AKR15A	AKR15A family of aldo-keto reductase. The AKR15 family includes Microbacterium luteolum pyridoxal 4-dehydrogenase (PLD), Pseudomonas sp. D-threo-aldose 1-dehydrogenase (FDH), and similar proteins. PLD (EC1.1.1.107) catalyzes irreversible oxidation of pyridoxal. FDH (EC1.1.1.122), also called (2S,3R)-aldose dehydrogenase, or L-fucose dehydrogenase, catalyzes the oxidation of L-fucose to L-fuconolactone in the presence of NADP(+). It is also active against L-galactose, and to a much lesser degree, D-arabinose.	308
381379	cd19153	AKR_galDH-like	L-galactose dehydrogenase (L-galDH), D-arabinose 1-dehydrogenase (ARA2) and similar proteins. L-galDH (EC 1.1.1.316), also called L-galactose 1-dehydrogenase, catalyzes the oxidation of L-galactose to L-galactono-1,4-lactone in the presence of NAD(+). It uses NAD(+) as a hydrogen acceptor much more efficiently than NADP(+). ARA2 (EC1.1.1.116), also called NAD(+)-specific D-arabinose dehydrogenase, catalyzes the oxidation of D-arabinose to D-arabinono-1,4-lactone in the presence of NAD(+).	294
381380	cd19154	AKR_AKR1G1_CeAKR	Caenorhabditis elegans aldo-keto reductase (CeAKR) and similar proteins. CeAKR is a founding member of aldo-keto reductase family 1 member G1 (AKR1G1). It may catalyze the reversible reduction of ketones to the respective alcohols using NAD(P)H as a hydride donor.	303
381381	cd19155	AKR_AKR1I_CgAKR1	Coptotermes gestroi aldo-keto reductase (CgAKR-1) and similar proteins. Coptotermes gestroi aldo-keto reductase (CgAKR-1) is a founding member of aldo-keto reductase family 1 member I (AKR1I). It is a multipurpose enzyme with potential biotechnological applications.	307
381382	cd19156	AKR_AKR5A1_2	AKR5A family of aldo-keto reductase (AKR). Prostaglandin F2-alpha synthase (PGFS) from Leishmania major and Trypanosoma brucei are founding members of aldo-keto reductase family 5 member A1 (AKR5A1) and A2 (AKR5A2), respectively. PGFS, also called 9,11-endoperoxide prostaglandin H2 reductase, catalyzes the NADP-dependent formation of prostaglandin F2-alpha from prostaglandin H2. It also has aldo/ketoreductase activity toward the synthetic substrates 9,10-phenanthrenequinone and p-nitrobenzaldehyde.	266
381383	cd19157	AKR_AKR5G1-3	AKR5G family of aldo-keto reductase (AKR). Bacillus subtilis glyoxal reductase (GR), uncharacterized oxidoreductase YtbE, and Bacillus aryabhattai aldo-keto reductase are founding members of aldo-keto reductase family 5 member G1-3 (AKR5G1-3), respectively. GR (YvgN, EC 1.1.1.283), also called methylglyoxal reductase, reduces glyoxal and methylglyoxal (2-oxopropanal). It is not involved in vitamin B6 biosynthesis.	265
381384	cd19158	AKR_KCAB2B_AKR6A1-like	voltage-gated potassium channel subunit beta-2 (KCAB2B) and similar proteins. KCAB2B from Bos taurus, Rattus norvegicus, Mus musculus, Homo sapiens, and Oryctolagus cuniculus, are founding members of aldo-keto reductase family 6 member A1 (AKR6A1), A2 (AKR6A2), A4 (AKR6A4), A5 (AKR6A5), and A6 (AKR6A6), respectively. KCAB2B, also called Shaker channel b-subunit 2 (Kvb2), or K(+) channel subunit beta-2, or Kv-beta-2, or Kvbeta2, is a cytoplasmic potassium channel subunit that modulates the characteristics of the channel-forming alpha-subunits. It may be involved in the regulation of nerve signaling, and prevents neuronal hyperexcitability.	324
381385	cd19159	AKR_KCAB1B_AKR6A3-like	voltage-gated potassium channel subunit beta-1 (KCAB1B) and similar proteins. KCAB1B from Homo sapiens, Mus musculus, Mustela putorius, Rattus norvegicus, and Kvb1.1, Kvb1.2 from Oryctolagus cuniculus, are founding members of aldo-keto reductase family 6 member A3 (AKR6A3), A8 (AKR6A8), A10a (AKR6A10a), A13 (AKR6A13), A7 (AKR6A7) and A10b (AKR6A10b), respectively. KCAB1B, also called Shaker channel b-subunit 1 (Kvb1), K(+) channel subunit beta-1, or Kv-beta-1, is a cytoplasmic potassium channel subunit that modulates the characteristics of the channel-forming alpha-subunits. It modulates action potentials via its effect on the pore-forming alpha subunits.	323
381386	cd19160	AKR_KCAB3B_AKR6A9-like	voltage-gated potassium channel subunit beta-3 (KCAB3B) and similar proteins. KCAB3B from Homo sapiens, Rattus norvegicus, and Mus musculus, are founding members of aldo-keto reductase family 6 member A9 (AKR6A9), A12 (AKR6A12), A14 (AKR6A14), respectively. KCAB3B, also called Shaker channel b-subunit 3 (Kvb3), K(+) channel subunit beta-3, or Kv-beta-3, is an accessory potassium channel protein which modulates the activity of the pore-forming alpha subunit. It alters the functional properties of Kv1.5.	325
381387	cd19161	AKR_AKR15A1	Microbacterium luteolum pyridoxal 4-dehydrogenase (PLD) and similar proteins. Microbacterium luteolum PLD (EC1.1.1.107) is a founding member of aldo-keto reductase family 15 member A1 (AKR15A1). It catalyzes irreversible oxidation of pyridoxal.	310
381388	cd19162	AKR_FDH	D-threo-aldose 1-dehydrogenase (FDH) and similar proteins. FDH (EC1.1.1.122), also called (2S,3R)-aldose dehydrogenase, or L-fucose dehydrogenase, catalyzes the oxidation of L-fucose to L-fuconolactone in the presence of NADP(+). It is also active against L-galactose, and to a much lesser degree, D-arabinose.	290
381389	cd19163	AKR_galDH	L-galactose dehydrogenase (L-galDH) and similar proteins. L-galDH (EC 1.1.1.316), also called L-galactose 1-dehydrogenase, catalyzes the oxidation of L-galactose to L-galactono-1,4-lactone in the presence of NAD(+). It uses NAD(+) as a hydrogen acceptor much more efficiently than NADP(+).	293
381390	cd19164	AKR_ARA2	D-arabinose 1-dehydrogenase (ARA2) and similar proteins. ARA2 (EC1.1.1.116), also called NAD(+)-specific D-arabinose dehydrogenase, catalyzes the oxidation of D-arabinose to D-arabinono-1,4-lactone in the presence of NAD(+).	298
350856	cd19165	HemeO	heme oxygenase in eukaryotes and some bacteria. This subfamily contains heme oxygenase (HO, EC 1.14.14.18) found in eukaryotes as well as some proteobacteria, including cyanobacteria. Heme oxygenase (HO) catalyzes the rate limiting step in the degradation of heme to biliverdin in a multi-step reaction. HO is essential for recycling of iron from heme which is used as a substrate and cofactor for its own degradation to biliverdin, iron, and carbon monoxide. In vertebrates, HO plays a role in heme homeostasis and oxidative stress response, and cellular signaling in mammals that include isoforms HO-1, HO-2 and HO-3. HO-1 is ubiquitously expressed after induction while HO-2 expression is constitutive, mostly limited to certain organs, such as the brain, testes, and the vascular system. HO-3 is non-functional in humans, suggesting that the Hmox3 gene is a pseudogene derived from HO-2 transcripts. In higher plants and cyanobacteria, heme oxygenase is required for the synthesis of light-harvesting pigments, which contain tetrapyrrols derived from biliverdin. Candida albicans expresses a heme oxygenase that is required for the utilization of heme as a nutritional iron source, whereas Saccharomyces cerevisiae responds to iron deprivation by increasing Hmx1p transcription, which is controlled by the major iron-dependent transcription factor, Aft1p, and promotes both the re-utilization of heme iron and the regulation of heme-dependent transcription during periods of iron scarcity. In pathogenic bacteria, HO is part of a pathway for iron acquisition from host heme. In Leptospira interrogans, a pathogenic spirochete that causes leptospirosis, HO is required for iron utilization when hemoglobin is the sole iron source, thus making HO an interesting target for novel antimicrobial agents. HO shares tertiary structure similarity to methane monooxygenase (EC 1.14.13.25), ribonucleotide reductase (EC 1.17.4.1) and thiaminase II (EC 3.5.99.2), but shares little sequence homology.	205
350857	cd19166	HemeO-bac	heme oxygenase found in pathogenic bacteria. This subfamily contains bacterial heme oxygenase (HO, EC 1.14.14.18), where HO is part of a pathway for iron acquisition from host heme and heme products. Most of these proteins have yet to be characterized. HO catalyzes the rate limiting step in the degradation of heme to biliverdin in a multi-step reaction. HO is essential for recycling of iron from heme which is used as a substrate and cofactor for its own degradation to biliverdin, iron, and carbon monoxide. This family includes heme oxygenase (pa-HO) from Pseudomonas aeruginosa, an opportunistic pathogen that causes a variety of systemic infections, particularly in those afflicted with cystic fibrosis, as well as cancer and AIDS patients who are immunosuppressed. Pa-HO, expressed by the PigA gene, is critical for the acquisition of host iron since there is essentially no free iron in mammals, and is unusual since it hydroxylates heme predominantly at the delta-meso heme carbon, while all other well-studied HOs hydroxylate the alpha-meso carbon. Also included in this family is Neisseria meningitidis HO which is substantially different from the human HO, with the reaction product being ferric biliverdin IXalpha rather than reduced iron and free biliverdin IXalpha. HO shares tertiary structure similarity to methane monooxygenase (EC 1.14.13.25), ribonucleotide reductase (EC 1.17.4.1) and thiaminase II (EC 3.5.99.2), but shares little sequence homology.	182
380944	cd19167	SET_SMYD1_2_3-like	SET domain (including post-SET domain) found in SET and MYND domain-containing proteins, SMYD1, SMYD2, SMYD3 and similar proteins. The family includes SET and MYND domain-containing proteins, SMYD1, SMYD2 and SMYD3. SMYD1 (EC 2.1.1.43; also termed BOP) is a heart and muscle specific SET-MYND domain containing protein, which functions as a histone methyltransferase and regulates downstream gene transcription. It methylates histone H3 at 'Lys-4' (H3K4me) and seems able to perform mono-, di-, and trimethylation. SMYD2 (also termed HSKM-B, or lysine N-methyltransferase 3C (KMT3C)) functions as a histone methyltransferase that methylates both histones and non-histone proteins, including p53/TP53 and RB1. It specifically methylates histone H3 'Lys-4' (H3K4me) and dimethylates histone H3 'Lys-36' (H3K36me2). SMYD3 (also termed zinc finger MYND domain-containing protein 1) functions as a histone methyltransferase that specifically methylates 'Lys-4' of histone H3, inducing di- and tri-methylation, but not monomethylation. It also methylates 'Lys-5' of histone H4. SMYD3 plays an important role in transcriptional activation as a member of an RNA polymerase complex.	205
380945	cd19168	SET_EZH-like	SET domain found in enhancer of zeste homolog 1 (EZH1) and zeste homolog 2 (EZH2) of polycomb repressive complex 2 (PRC2), and similar proteins. The family includes EZH1 and EZH2. EZH1 (EC 2.1.1.43; also termed ENX-2, or histone-lysine N-methyltransferase EZH1) is a catalytic subunit of the PRC2/EED-EZH1 complex, which methylates 'Lys-27' of histone H3, leading to transcriptional repression of the affected target gene. EZH2 (EC 2.1.1.43; also termed lysine N-methyltransferase 6, ENX-1, or histone-lysine N-methyltransferase EZH2) is a catalytic subunit of the PRC2/EED-EZH2 complex, which methylates 'Lys-9' (H3K9me) and 'Lys-27' (H3K27me) of histone H3, leading to transcriptional repression of the affected target gene. Both EZH1 and EZH2 can mono-, di- and trimethylate 'Lys-27' of histone H3 to form H3K27me1, H3K27me2 and H3K27me3, respectively. PRC2 is involved in several cancers; EZH2 is overexpressed in breast, liver and prostate cancer, while point mutations in EZH2 alter the substrate preference and product specificity of PRC2 in Non-Hodgkin lymphomas (NHLs). Thus, PRC2 is a popular target for cancer therapeutics.	124
380946	cd19169	SET_SETD1	SET domain (including post-SET domain) found in SET domain-containing protein 1 (SETD1) and similar proteins. This family includes SET domain-containing protein 1A (SETD1A) and SET domain-containing protein 1B (SETD1B). These proteins are histone-lysine N-methyltransferases that specifically methylate 'Lys-4' of histone H3 (H3K4me) when part of the SET1 histone methyltransferase (HMT) complex, but not if the neighboring 'Lys-9' residue is already methylated.	148
380947	cd19170	SET_KMT2A_2B	SET domain (including post-SET domain) found in histone-lysine N-methyltransferase 2A (KMT2A), 2B (KMT2B) and similar proteins. This family includes KMT2A and KMT2B. Both KMT2A (also termed ALL-1 or CXXC7 or MLL or MLL1 or TRX1 or HRX) and KMT2B (also termed MLL4 or TRX2) act as histone methyltransferases that methylate 'Lys-4' of histone H3 (H3K4me).	152
380948	cd19171	SET_KMT2C_2D	SET domain (including post-SET domain) found in histone-lysine N-methyltransferase 2C (KMT2C), 2D (KMT2D) and similar proteins. This family includes KMT2C and KMT2D. Both KMT2C (also termed HALR or MLL3) and KMT2D (also termed ALR or MLL2) act as histone methyltransferases that methylate 'Lys-4' of histone H3 (H3K4me). They are subunits of the MLL2/3 complex, a coactivator complex of nuclear receptors, involved in transcriptional coactivation.	153
380949	cd19172	SET_SETD2	SET domain (including post-SET domain) found in SET domain-containing protein 2 (SETD2) and similar proteins. SETD2 (also termed HIF-1, huntingtin yeast partner B, huntingtin-interacting protein 1 (HIP-1), huntingtin-interacting protein B, lysine N-methyltransferase 3A or protein-lysine N-methyltransferase SETD2) acts as histone-lysine N-methyltransferase that specifically trimethylates 'Lys-36' of histone H3 (H3K36me3) using dimethylated 'Lys-36' (H3K36me2) as substrate. It has been shown that methylation is a posttranslational modification of dynamic microtubules and that SETD2 methylates alpha-tubulin at lysine 40, the same lysine that is marked by acetylation on microtubules. Methylation of microtubules occurs during mitosis and cytokinesis and can be ablated by SETD2 deletion, which causes mitotic spindle and cytokinesis defects, micronuclei, and polyploidy.	142
380950	cd19173	SET_NSD	SET domain (including post-SET domain) found in nuclear SET domain-containing proteins, NSD1, NSD2, NSD3 and similar proteins. The nuclear receptor-binding SET Domain (NSD) family of histone H3 lysine 36 methyltransferases is comprised of NSD1, NSD2, and NSD3, which are primarily known to be involved in chromatin integrity and gene expression through mono-, di-, or tri-methylating lysine 36 of histone H3 (H3K36), respectively. NSD1 (EC 2.1.1.43; also termed histone-lysine N-methyltransferase H3 lysine-36 and H4 lysine-20 specific, androgen receptor coactivator 267 kDa protein (ARA267), androgen receptor-associated protein of 267 kDa, H3-K36-HMTase, H4-K20-HMTase, lysine N-methyltransferase 3B (KMT3B) or NR-binding SET domain-containing protein 1) functions as a histone-lysine N-methyltransferase that preferentially methylates 'Lys-36' of histone H3 and 'Lys-20' of histone H4. NSD2 (EC 2.1.1.43; also termed multiple myeloma SET domain-containing protein (MMSET), protein trithorax-5 (TRX5), or wolf-Hirschhorn syndrome candidate 1 protein (WHSC1)) acts as histone-lysine N-methyltransferase with histone H3 'Lys-27' (H3K27me) methyltransferase activity. NSD3 (EC 2.1.1.43; also termed protein whistle, WHSC1-like 1 isoform 9 with methyltransferase activity to lysine, Wolf-Hirschhorn syndrome candidate 1-like protein 1 (WHSC1L1), or WHSC1-like protein 1) functions as a histone-lysine N-methyltransferase that preferentially methylates 'Lys-4' and 'Lys-27' of histone H3.	142
380951	cd19174	SET_ASH1L	SET domain (including post-SET domain) found in ASH1-like protein (ASH1L) and similar proteins. ASH1L (EC 2.1.1.43; also termed absent small and homeotic disks protein 1 homolog, KMT2H, or lysine N-methyltransferase 2H) acts as histone-lysine N-methyltransferase that specifically methylates 'Lys-36' of histone H3 (H3K36me). It plays important roles in development; heterozygous mutation of ASH1L is associated with severe intellectual disability (ID) and multiple congenital anomaly (MCA).	141
380952	cd19175	SET_ASHR3-like	SET domain (including post-SET domain) found in Arabidopsis thaliana ASH1-related protein 3 (ASHR3) and similar proteins. This family includes Arabidopsis thaliana ASH1-related protein 3 (ASHR3, also termed protein SET DOMAIN GROUP 4 or protein stamen loss), ASH1 homolog 3 (ASHH3, also termed protein SET DOMAIN GROUP 7) and homolog 4 (ASHH4, also termed protein SET DOMAIN GROUP 24). They all function as histone-lysine N-methyltransferases (EC 2.1.1.43).	139
380953	cd19176	SET_SETD3	SET domain found in SET domain-containing protein 3 (SETD3) and similar proteins. SETD3 (EC 2.1.1.43) is a histone-lysine N-methyltransferase that methylates 'Lys-4' and 'Lys-36' of histone H3 (H3K4me and H3K36me). It functions as a transcriptional activator that plays an important role in the transcriptional regulation of muscle cell differentiation via interaction with MYOD1.	251
380954	cd19177	SET_SETD4	SET domain found in SET domain-containing protein 4 (SETD4) and similar proteins. SETD4 is a cytosolic and nuclear functional lysine methyltransferase that plays a crucial role in breast carcinogenesis. However, its specific substrates and modification sites remain to be disclosed.	245
380955	cd19178	SET_SETD6	SET domain found in SET domain-containing protein 6 (SETD6) and similar proteins. SETD6 is a lysine N-methyltransferase that monomethylates 'Lys-310' of the RELA subunit of NF-kappa-B complex, leading to down-regulation of NF-kappa-B transcription factor activity. It also monomethylates 'Lys-8' of H2AZ (H2AZK8me1).	250
380956	cd19179	SET_RBCMT	SET domain found in chloroplastic ribulose-1,5 bisphosphate carboxylase/oxygenase large subunit N-methyltransferase (RBCMT) and similar proteins. RBCMT (EC 2.1.1.127; also termed [Ribulose-bisphosphate carboxylase]-lysine N-methyltransferase, RuBisCO LSMT, RuBisCO methyltransferase, or rbcMT) methylates 'Lys-14' of the large subunit of RuBisCO.	237
380957	cd19180	SET_SpSET10-like	SET domain found in Schizosaccharomyces pombe SET domain-containing protein 10 (SETD10) and similar proteins. Schizosaccharomyces pombe SETD10 is a ribosomal S-adenosyl-L-methionine-dependent protein-lysine N-methyltransferase that methylates ribosomal protein L23 (rpl23a and rpl23b).	252
380958	cd19181	SET_SETD5	SET domain (including post-SET domain) found in SET domain-containing protein 5 (SETD5) and similar proteins. SETD5 is a probable transcriptional regulator that acts via the formation of large multiprotein complexes that modify and/or remodel the chromatin. SETD5 loss-of-function mutations are a likely cause of a familial syndromic intellectual disability with variable phenotypic expression.	150
380959	cd19182	SET_KMT2E	SET domain found in inactive histone-lysine N-methyltransferase 2E (KMT2E) and similar proteins. KMT2E (also termed inactive lysine N-methyltransferase 2E, myeloid/lymphoid or mixed-lineage leukemia protein 5 (MLL5)) plays a key role in hematopoiesis, spermatogenesis and cell cycle progression. It associates with chromatin regions downstream of transcriptional start sites of active genes and thus regulates gene transcription. Lack of key residues in the SET domain as well as the presence of an unusually large loop in the SET-I subdomain preclude the interaction of MLL5 SET with its cofactor and substrate thus making MLL5 devoid of any in vitro methyltransferase activity on full-length histones and histone H3 peptide.	129
380960	cd19183	SET_SpSET3-like	SET domain (including post-SET domain) found in Schizosaccharomyces pombe SET domain-containing protein 3 (SETD3) and similar proteins. Schizosaccharomyces pombe SETD3 functions as a transcriptional regulator that acts via the formation of large multiprotein complexes that modify and/or remodel the chromatin. It is required for both gene activation and repression.	173
380961	cd19184	SET_KMT5B	SET domain (including post-SET domain) found in histone-lysine N-methyltransferase 5B (KMT5B) and similar proteins. KMT5B (also termed lysine N-methyltransferase 5B, lysine-specific methyltransferase 5B, suppressor of variegation 4-20 homolog 1, Su(var)4-20 homolog 1 or Suv4-20h1) is a histone methyltransferase that specifically trimethylates 'Lys-20' of histone H4 (H4K20me3). It plays a central role in the establishment of constitutive heterochromatin in pericentric heterochromatin regions.	144
380962	cd19185	SET_KMT5C	SET domain (including post-SET domain) found in histone-lysine N-methyltransferase 5C (KMT5C) and similar proteins. KMT5C (also termed lysine N-methyltransferase 5C, lysine-specific methyltransferase 5C, suppressor of variegation 4-20 homolog 2, Su(var)4-20 homolog 2 or Suv4-20h2) is a histone methyltransferase that specifically trimethylates 'Lys-20' of histone H4 (H4K20me3). It plays a central role in the establishment of constitutive heterochromatin in pericentric heterochromatin regions.	142
380963	cd19186	SET_Suv4-20	SET domain (including post-SET domain) found in Drosophila melanogaster suppressor of variegation 4-20 (Suv4-20) and similar proteins. Suv4-20 (also termed Su(var)4-20) is a histone-lysine N-methyltransferase that specifically trimethylates 'Lys-20' of histone H4. It acts as a dominant suppressor of position-effect variegation.	142
380964	cd19187	PR-SET_PRDM1	PR-SET domain found in PR domain zinc finger protein 1 (PRDM1) and similar proteins. PRDM1 (also termed BLIMP-1, beta-interferon gene positive regulatory domain I-binding factor, PR domain-containing protein 1, positive regulatory domain I-binding factor 1, PRDI-BF1, or PRDI-binding factor 1) acts as a transcription factor that mediates a transcriptional program in various innate and adaptive immune tissue-resident lymphocyte T cell types such as tissue-resident memory T (Trm), natural killer (trNK) and natural killer T (NKT) cells and negatively regulates gene expression of proteins that promote the egress of tissue-resident T-cell populations from non-lymphoid organs.	128
380965	cd19188	PR-SET_PRDM2	PR-SET domain found in PR domain zinc finger protein 2 (PRDM2) and similar proteins. PRDM2 (also termed GATA-3-binding protein G3B, lysine N-methyltransferase 8, MTB-or MTE-binding protein, PR domain-containing protein 2, retinoblastoma protein-interacting zinc finger protein, or zinc finger protein RIZ) is an S-adenosyl-L-methionine-dependent histone methyltransferase that specifically methylates 'Lys-9' of histone H3. It may function as a DNA-binding transcription factor.	123
380966	cd19189	PR-SET_PRDM4	PR-SET domain found in PR domain zinc finger protein 4 (PRDM4) and similar proteins. PRDM4 (also termed PR domain-containing protein 4, or PFM1) may function as a transcription factor involved in cell differentiation.	133
380967	cd19190	PR-SET_PRDM5	PR-SET domain found in PR domain zinc finger protein 5 (PRDM5) and similar proteins. PRDM5 (also termed PR domain-containing protein 5) is a sequence-specific DNA-binding transcription factor that represses transcription at least in part by recruitment of the histone methyltransferase EHMT2/G9A and histone deacetylases such as HDAC1.	127
380968	cd19191	PR-SET_PRDM6	PR-SET domain found in PR domain zinc finger protein 6 (PRDM6) and similar proteins. PRDM6 (also termed PR domain-containing protein 6) is a putative histone-lysine N-methyltransferase that acts as a transcriptional repressor of smooth muscle gene expression. It may specifically methylate 'Lys-20' of histone H4 when associated with other proteins and in vitro.	128
380969	cd19192	PR-SET_PRDM8	PR-SET domain found in PR domain zinc finger protein 8 (PRDM8) and similar proteins. PRDM8 (also termed PR domain-containing protein 8) may function as histone methyltransferase, preferentially acting on 'Lys-9' of histone H3.	131
380970	cd19193	PR-SET_PRDM7_9	PR-SET domain found in PR domain zinc finger protein 7 (PRDM7) and 9 (PRDM9) and similar proteins. PRDM7 (also termed PR domain-containing protein 7) is a primate-specific histone methyltransferase that is the result of a recent gene duplication of PRDM9. It selectively catalyzes the trimethylation of H3 lysine 4 (H3K4me3). PRDM9 (also termed PR domain-containing protein 9) is a histone methyltransferase that specifically trimethylates 'Lys-4' of histone H3 (H3K4me3) during meiotic prophase and is essential for proper meiotic progression. It also efficiently mono-, di-, and trimethylates H3K36. Aberrant PRDM9 expression is associated with genome instability in cancer.	129
380971	cd19194	PR-SET_PRDM10	PR-SET domain found in PR domain zinc finger protein 10 (PRDM10) and similar proteins. PRDM10 (also termed PR domain-containing protein 10, or tristanin) may be involved in transcriptional regulation.	128
380972	cd19195	PR-SET_PRDM11	PR-SET domain found in PR domain zinc finger protein 11 (PRDM11) and similar proteins. PRDM11 (also termed PR domain-containing protein 11) may be involved in transcription regulation.	127
380973	cd19196	PR-SET_PRDM12	PR-SET domain found in PR domain zinc finger protein 12 (PRDM12) and similar proteins. PRDM12 (also termed PR domain-containing protein 12) acts as a transcription factor that is involved in the positive regulation of histone H3-K9 dimethylation.	130
380974	cd19197	PR-SET_PRDM13	PR-SET domain found in PR domain zinc finger protein 13 (PRDM13) and similar proteins. PRDM13 (also termed PR domain-containing protein 13) may be involved in transcriptional regulation. It mediates the balance of inhibitory and excitatory neurons in somatosensory circuits.	103
380975	cd19198	PR-SET_PRDM14	PR-SET domain found in PR domain zinc finger protein 14 (PRDM14) and similar proteins. PRDM14 (also termed PR domain-containing protein 14) acts as a transcription factor that has both positive and negative roles on transcription. It regulates epigenetic modifications in cells, playing a key role in the regulation of cell pluripotency, epigenetic reprogramming, differentiation and development. Aberrant PRDM14 expression is associated with tumorigenesis, cell migration and resistance to chemotherapeutic drugs.	133
380976	cd19199	PR-SET_PRDM15	PR-SET domain found in PR domain zinc finger protein 15 (PRDM15) and similar proteins. PRDM15 (also termed PR domain-containing protein 15, or zinc finger protein 298 (ZNF298)) may be involved in transcriptional regulation. It plays an essential role as a chromatin factor that modulates the transcription of upstream regulators of WNT and MAPK-ERK signaling to safeguard naive pluripotency.	126
380977	cd19200	PR-SET_PRDM16_PRDM3	PR-SET domain found in PR domain zinc finger protein 16 (PRDM16), MDS1 and EVI1 complex locus protein and similar proteins. PRDM16 (also termed PR domain-containing protein 16, transcription factor MEL1, or MDS1/EVI1-like gene 1) functions as a transcriptional regulator. PRDM16 is preferentially expressed by hematopoietic and neuronal stem cells. It is closely related to paralog of PRDM3 (also termed MDS1 and EVI1 complex locus protein, ecotropic virus integration site 1 protein, EVI-1, myelodysplasia syndrome 1 protein, myelodysplasia syndrome-associated protein 1, or MECOM) which is a nuclear transcription factor essential for the proliferation/maintenance of hematopoietic stem cells (HSCs). PRDM3 and PRDM16 are both directly linked to various aspects of oncogenic transformation.	135
380978	cd19201	PR-SET_ZFPM	PR-SET domain found in zinc finger protein ZFPM1, ZFPM2 and similar proteins. ZFPM1 (also termed friend of GATA protein 1, FOG-1, friend of GATA 1, zinc finger protein 89A, or zinc finger protein multitype 1) functions as a transcription regulator that plays an essential role in erythroid and megakaryocytic cell differentiation. ZFPM2 (also termed friend of GATA protein 2, FOG-2, friend of GATA 2, zinc finger protein 89B, or zinc finger protein multitype 2) functions as a transcription regulator that plays a central role in heart morphogenesis and development of coronary vessels from epicardium, by regulating genes that are essential during cardiogenesis.	122
380979	cd19202	SET_SMYD2	SET domain (including post-SET domain) found in SET and MYND domain-containing protein 2 (SMYD2) and similar proteins. SMYD2 (also termed HSKM-B, lysine N-methyltransferase 3C (KMT3C)) functions as a histone methyltransferase that methylates both histones and non-histone proteins, including p53/TP53 and RB1. It specifically methylates histone H3 'Lys-4' (H3K4me) and dimethylates histone H3 'Lys-36' (H3K36me2). It plays a role in myofilament organization in both skeletal and cardiac muscles via Hsp90 methylation. SMYD2 overexpression is associated with tumor cell proliferation and a worse outcome in human papillomavirus-unrelated nonmultiple head and neck carcinomas.  It regulates leukemia cell growth such that diminished SMYD2 expression upregulates SET7/9, thereby possibly shifting leukemia cells from growth to quiescence state associated with resistance to DNA damage associated with Acute Myeloid Leukemia (AML).	206
380980	cd19203	SET_SMYD3	SET domain (including post-SET domain) found in SET and MYND domain-containing protein 3 (SMYD3) and similar proteins. SMYD3 (also termed zinc finger MYND domain-containing protein 1) functions as a histone methyltransferase that specifically methylates 'Lys-4' of histone H3, inducing di- and tri-methylation, but not monomethylation. It also methylates 'Lys-5' of histone H4. SMYD3 plays an important role in transcriptional activation as a member of an RNA polymerase complex. It is overexpressed in colorectal, breast, prostate, and hepatocellular tumors, and has been implicated as an oncogene in human malignancies. Methylation of MEKK2 by SMYD3 is important for regulation of the MEK/ERK pathway, suggesting the possibility of selectively targeting SMYD3 in RAS-driven cancers.	210
380981	cd19204	SET_SETD1A	SET domain (including post-SET domain) found in SET domain-containing protein 1A (SETD1A) and similar proteins. SETD1A (EC2.1.1.43), also termed lysine N-methyltransferase 2F, or Set1/Ash2 histone methyltransferase complex subunit SET1, is a histone-lysine N-methyltransferase that specifically methylates 'Lys-4' of histone H3 (H3K4me), when part of the SET1 histone methyltransferase (HMT) complex, but not if the neighboring 'Lys-9' residue is already methylated. Human SET domain containing protein 1A (hSETD1A) expression occurs at a high rate in hepatocellular carcinoma patients and controls tumor metastasis in breast cancer by activating MMP expression.	153
380982	cd19205	SET_SETD1B	SET domain (including post-SET domain) found in SET domain-containing protein 1B (SETD1B) and similar proteins. SETD1B (EC2.1.1.43), also termed lysine N-methyltransferase 2G, is a histone-lysine N-methyltransferase that specifically methylates 'Lys-4' of histone H3 (H3K4me) when part of the SET1 histone methyltransferase (HMT) complex, but not if the neighboring 'Lys-9' residue is already methylated. Loss of SETD1B occurs in up to half the gastric and colorectal cancers, most commonly via SETD1B mutations, while de novo variants in SETD1B are associated with intellectual disability, epilepsy and autism.	153
380983	cd19206	SET_KMT2A	SET domain (including post-SET domain) found in histone-lysine N-methyltransferase 2A (KMT2A) and similar proteins. KMT2A (EC2.1.1.43; also termed lysine N-methyltransferase 2A, ALL-1, CXXC-type zinc finger protein 7 (CXXC7), myeloid/lymphoid or mixed-lineage leukemia (MLL), myeloid/lymphoid or mixed-lineage leukemia protein 1 (MLL1), trithorax-like protein (TRX1), or zinc finger protein HRX) acts as a histone methyltransferase that plays an essential role in early development and hematopoiesis. It is a catalytic subunit of the MLL1/MLL complex, a multiprotein complex that mediates both methylation of 'Lys-4' of histone H3 (H3K4me) complex and acetylation of 'Lys-16' of histone H4 (H4K16ac).	154
380984	cd19207	SET_KMT2B	SET domain (including post-SET domain) found in histone-lysine N-methyltransferase 2B (KMT2B) and similar proteins. KMT2B (EC2.1.1.43; also termed lysine N-methyltransferase 2B, myeloid/lymphoid or mixed-lineage leukemia protein 4 (MLL2/MLL4), trithorax homolog 2 (TRX2), or WW domain-binding protein 7 (WBP-7)), acts as a histone methyltransferase that methylates 'Lys-4' of histone H3 (H3K4me). It is required during the transcriptionally active period of oocyte growth for the establishment and/or maintenance of bulk H3K4 trimethylation (H3K4me3), global transcriptional silencing that precedes resumption of meiosis, oocyte survival and normal zygotic genome activation.	154
380985	cd19208	SET_KMT2C	SET domain (including post-SET domain) found in histone-lysine N-methyltransferase 2C (KMT2C) and similar proteins. KMT2C (EC2.1.1.43; also termed lysine N-methyltransferase 2C, homologous to ALR protein (HALR) myeloid/lymphoid, or mixed-lineage leukemia protein 3 (MLL3)), acts as a histone methyltransferase that methylates 'Lys-4' of histone H3 (H3K4me) and may be involved in leukemogenesis and developmental disorder. KMT2C is a catalytic subunit of MLL2/3 complex, a coactivator complex of nuclear receptors, involved in transcriptional coactivation. Overexpression of KMT2C is associated with estrogen receptor-positive breast cancer; KMT2C mediates the estrogen dependence of breast cancer through regulation of estrogen receptor alpha (ERalpha) enhancer function. KMT2C is frequently mutated in certain populations with diffuse-type gastric adenocarcinomas (DGA); its loss promotes epithelial-to-mesenchymal transition (EMT) and is associated with worse overall survival.	154
380986	cd19209	SET_KMT2D	SET domain (including post-SET domain) found in histone-lysine N-methyltransferase 2D (KMT2D) and similar proteins. KMT2D (EC2.1.1.43; also termed lysine N-methyltransferase 2D, ALL1-related protein (ALR), or myeloid/lymphoid or mixed-lineage leukemia protein 2 (MLL2)), acts as histone methyltransferase that methylates 'Lys-4' of histone H3 (H3K4me). It is a coactivator for estrogen receptor by being recruited by ESR1, thereby activating transcription. KMT2D is a subunit of MLL2/3 complex, a coactivator complex of nuclear receptors, involved in transcriptional coactivation.	155
380987	cd19210	SET_NSD1	SET domain (including post-SET domain) found in nuclear receptor-binding SET domain-containing protein 1 (NSD1) and similar proteins. NSD1 (EC 2.1.1.43; also termed Histone-lysine N-methyltransferase H3 lysine-36 and H4 lysine-20 specific, androgen receptor coactivator 267 kDa protein (ARA267), androgen receptor-associated protein of 267 kDa, H3-K36-HMTase, H4-K20-HMTase, lysine N-methyltransferase 3B (KMT3B), or NR-binding SET domain-containing protein 1) functions as a histone-lysine N-methyltransferase that preferentially methylates 'Lys-36' of histone H3 and 'Lys-20' of histone H4. NSD1 is altered in approximately 10% of head and neck cancer patients with 55% decrease in risk of death in NSD1-mutated versus non-mutated patients; its disruption promotes favorable chemotherapeutic responses linked to hypomethylation.	142
380988	cd19211	SET_NSD2	SET domain (including post-SET domain) found in nuclear SET domain-containing protein 2 (NSD2) and similar proteins. NSD2 (EC 2.1.1.43; also termed multiple myeloma SET domain-containing protein (MMSET), protein trithorax-5 (TRX5), or wolf-Hirschhorn syndrome candidate 1 protein (WHSC1)) acts as histone-lysine N-methyltransferase with histone H3 'Lys-36' (H3K36me) methyltransferase activity. NSD2 has been shown to mediate di- and trimethylation of H3K36 and dimethylation of H4K20 in different systems, and has been characterized as a transcriptional repressor interacting with histone deacetylase HDAC1 and histone demethylase LSD1. NSD2 mediates constitutive NF-kappaB signaling for cancer cell proliferation, survival and tumor growth. It is highly overexpressed in several types of human cancers, including small-cell lung cancers, neuroblastoma, carcinomas of stomach and colon, and bladder cancers, and its overexpression tends to be associated with tumor aggressiveness. WHSC1 is frequently deleted in Wolf-Hirschhorn syndrome (WHS).	142
380989	cd19212	SET_NSD3	SET domain (including post-SET domain) found in nuclear receptor-binding SET domain-containing protein 3 (NSD3) and similar proteins. NSD3 (EC 2.1.1.43; also termed protein whistle, WHSC1-like 1 isoform 9 with methyltransferase activity to lysine, Wolf-Hirschhorn syndrome candidate 1-like protein 1 (WHSC1L1), or WHSC1-like protein 1) functions as a histone-lysine N-methyltransferase that preferentially methylates 'Lys-4' and 'Lys-27' of histone H3. NSD3 is amplified and overexpressed in multiple cancer types, including acute myeloid leukemia (AML), breast, lung, pancreatic and bladder cancers, as well as squamous cell carcinoma of the head and neck (SCCHN). NSD3 contributes to tumorigenesis by interacting with bromodomain-containing protein 4 (BRD4), the bromodomain and extraterminal (BET) protein, which is a potential therapeutic target in acute myeloid leukemia (AML). NSD3 is amplified in primary tumors and cell lines from breast carcinoma, and can promote the cell viability of small-cell lung cancer and pancreatic ductal adenocarcinoma. High NSD3 expression is implicated in poor grade and heavy smoking history in SCCHN. Thus, NSD3 may serve as a potential druggable target for selective cancer therapy.	142
380990	cd19213	PR-SET_PRDM16	PR-SET domain found in PR domain zinc finger protein 16 (PRDM16) and similar proteins. PRDM16, also termed PR domain-containing protein 16, or transcription factor MEL1, or MDS1/EVI1-like gene 1, functions as a transcriptional regulator. PRDM16 is preferentially expressed by hematopoietic and neuronal stem cells and is closely related to its paralog PRDM3, both of which are directly linked to various aspects of oncogenic transformation.	162
380991	cd19214	PR-SET_PRDM3	PR-SET domain found in MDS1 and EVI1 complex locus protein and similar proteins. PRDM3 (also termed MDS1 and EVI1 complex locus protein, ecotropic virus integration site 1 protein, EVI-1, myelodysplasia syndrome 1 protein, myelodysplasia syndrome-associated protein 1, or MECOM) is a nuclear transcription factor, which is essential for the proliferation/maintenance of hematopoietic stem cells (HSCs). It is closely related to paralog PRDM16, both of which are directly linked to various aspects of oncogenic transformation.	158
380992	cd19215	PR-SET_ZFPM1	PR-SET domain found in zinc finger protein ZFPM1 and similar proteins. ZFPM1 (also termed friend of GATA protein 1, FOG-1, friend of GATA 1, zinc finger protein 89A, or zinc finger protein multitype 1) functions as a transcription regulator that plays an essential role in erythroid and megakaryocytic cell differentiation.	110
380993	cd19216	PR-SET_ZFPM2	PR-SET domain found in zinc finger protein ZFPM2 and similar proteins. ZFPM2 (also termed friend of GATA protein 2, FOG-2, friend of GATA 2, zinc finger protein 89B, or zinc finger protein multitype 2) functions as a transcription regulator that plays a central role in heart morphogenesis and development of coronary vessels from epicardium, by regulating genes that are essential during cardiogenesis.	111
380994	cd19217	SET_EZH1	SET domain found in enhancer of zeste homolog 1 (EZH1) and similar proteins. EZH1 (EC 2.1.1.43), also termed ENX-2, or histone-lysine N-methyltransferase EZH1, is a catalytic subunit of the PRC2/EED-EZH1 complex, which methylates 'Lys-27' of histone H3, leading to transcriptional repression of the affected target gene. It can mono-, di- and trimethylate 'Lys-27' of histone H3 to form H3K27me1, H3K27me2 and H3K27me3, respectively.	136
380995	cd19218	SET_EZH2	SET domain found in enhancer of zeste homolog 2 (EZH2) and similar proteins. EZH2 (EC 2.1.1.43), also termed lysine N-methyltransferase 6, or ENX-1, or histone-lysine N-methyltransferase EZH2, is a catalytic subunit of the polycomb repressive complex 2 (PRC2)/EED-EZH2 complex, which methylates 'Lys-9' (H3K9me) and 'Lys-27' (H3K27me) of histone H3, leading to transcriptional repression of the affected target gene. It can mono-, di- and trimethylate 'Lys-27' of histone H3 to form H3K27me1, H3K27me2 and H3K27me3, respectively. PRC2 is involved in several cancers; EZH2 is overexpressed in breast, liver and prostate cancer, while point mutations in EZH2 alter the substrate preference and product specificity of PRC2 in Non-Hodgkin lymphomas (NHLs). Thus, PRC2 is a popular target for cancer therapeutics.	120
412037	cd19318	Rev1_UBM2	Ubiquitin-Binding Motif 2 (UBM2) of Y-family polymerase Rev1. This model characterizes UBM2, the second ubiquitin-binding motif of Rev1, a DNA damage tolerance protein. Rev1 acts as a translesion synthesis (TLS) DNA polymerase and may also recruit other TLS polymerases to the site of DNA damage; in that process the UBMs are essential for Rev1 function, triggering TLS activation via recognition of ubiquitin moieties in PCNA, the proliferating cell nuclear antigen.	36
381707	cd19333	Wnt_Wnt1	Wnt domain found in proto-oncogene Wnt-1 and similar proteins. Wnt-1, also called proto-oncogene Int-1, acts in the canonical Wnt signaling pathway by promoting beta-catenin-dependent transcriptional activation. It plays a role in osteoblast function, bone development and bone homeostasis. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. The Wnt signaling mediated by Wnt proteins that orchestrate and influence a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, apoptosis, and participation in immune defense during microbe infection.	311
381708	cd19334	Wnt_Wnt2_like	Wnt domain found in protein Wnt-2, Wnt-2b and similar proteins. The family includes Wnt-2 and Wnt-2b. Wnt-2, also called Int-1-like protein 1 (INT1L1), or Int-1-related protein (IRP), functions in the canonical Wnt signaling pathway that results in activation of transcription factors of the TCF/LEF family. It plays an important role in embryonic lung development. Wnt-2b, also called protein Wnt-13, functions in the canonical Wnt/beta-catenin signaling pathway. It plays a redundant role in embryonic lung development. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. The Wnt signaling mediated by Wnt proteins that orchestrate and influence a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, apoptosis, and participation in immune defense during microbe infection.	314
381709	cd19335	Wnt_Wnt3_Wnt3a	Wnt domain found in proto-oncogene Wnt-3 and similar proteins. Wnt-3, also called proto-oncogene Int-4, functions in the canonical Wnt signaling pathway that results in activation of transcription factors of the TCF/LEF family. It is required for normal embryonic development, and especially for limb development. Wnt-3a functions in the canonical Wnt signaling pathway and plays crucial roles in both proliferation and differentiation processes in several types of stem cells. Wnt3a stimulates the migration and invasion of trophoblasts and induces the survival, proliferation, and migration of human embryonic kidney (HEK) 293 cells. It also up-regulates genes implicated in melanocyte differentiation and increases the expression and nuclear localization of the transcriptional co-activator with PDZ-binding motif (TAZ), a transcriptional modulator involved in activating osteoblastic differentiation. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. The Wnt signaling mediated by Wnt proteins that orchestrate and influence a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, apoptosis, and participation in immune defense during microbe infection.	314
381710	cd19336	Wnt_Wnt4	Wnt domain found in protein Wnt-4 and similar proteins. Wnt-4 may function as a signaling molecule which affects the development of discrete regions of tissues. Its overexpression may be associated with abnormal proliferation in human breast tissue. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. The Wnt signaling mediated by Wnt proteins that orchestrate and influence a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, apoptosis, and participation in immune defense during microbe infection.	310
381711	cd19337	Wnt_Wnt5	Wnt domain found in protein Wnt-5a, Wnt-5b and similar proteins. The family includes Wnt-5a and Wnt-5b, both of which are secreted growth factors that belong to the noncanonical members of the Wingless-related MMTV-integration family. Wnt-5a can activate or inhibit canonical Wnt signaling, depending on receptor context. It specifically regulates dendritic spine formation in rodent hippocampal neurons, resulting in postsynaptic development that promotes the clustering of the postsynaptic density protein 95 (PSD-95). The overexpression of Wnt-5b is associated with cancer aggressiveness. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. The Wnt signaling mediated by Wnt proteins that orchestrate and influence a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, apoptosis, and participation in immune defense during microbe infection.	313
381712	cd19338	Wnt_Wnt6	Wnt domain found in protein Wnt-6 and similar proteins. Wnt-6 may function as a signaling molecule which affects the development of discrete regions of tissues. It may promote tumorigenesis in gastrointestinal cancer and cervical cancer. It can compensate for the absence of ectoderm and can induce the formation of muscle cells in the limb. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. The Wnt signaling mediated by Wnt proteins that orchestrate and influence a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, apoptosis, and participation in immune defense during microbe infection.	310
381713	cd19339	Wnt_Wnt7	Wnt domain found in protein Wnt-7a, Wnt-7b and similar proteins. The family includes Wnt-7a and Wnt-7b. Wnt-7a acts as a canonical Wnt ligand that modulates the synaptic vesicle cycle and synaptic transmission in hippocampal neurons. It also plays an important role in embryonic development, including dorsal versus ventral patterning during limb development, skeleton development, and urogenital tract development. Wnt-7b functions in the canonical Wnt/beta-catenin signaling pathway in vascular smooth muscle cells. It is required for normal fusion of the chorion and the allantois during placenta development. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. The Wnt signaling mediated by Wnt proteins that orchestrate and influence a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, apoptosis, and participation in immune defense during microbe infection.	313
381714	cd19340	Wnt_Wnt8	Wnt domain found in protein Wnt-8a, Wnt-8b and similar proteins. The family includes Wnt-8a and Wnt-8b. Wnt-8a, also called protein Wnt-8d, plays a role in embryonic patterning. Wnt-8b may play an important role in the development and differentiation of certain forebrain structures, notably the hippocampus. It acts as a suppressor of early eye and retinal progenitor formation. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. The Wnt signaling mediated by Wnt proteins that orchestrate and influence a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, apoptosis, and participation in immune defense during microbe infection.	301
381715	cd19341	Wnt_Wnt9	Wnt domain found in protein Wnt-9a, Wnt-9b and similar proteins. The family includes Wnt-9a and Wnt-9b, both of which function in the canonical Wnt/beta-catenin signaling pathway. Wnt-9a, also called protein Wnt-14, is required for normal timing of IHH expression during embryonic bone development, normal chondrocyte maturation and for normal bone mineralization during embryonic bone development. Wnt-9a plays a redundant role in maintaining joint integrity. It is a conserved regulator of hematopoietic stem and progenitor cell development. Wnt-9b, also called protein Wnt-14b, or protein Wnt-15, plays a central role in the regulation of mesenchymal to epithelial transitions underlying organogenesis of the mammalian urogenital system. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. The Wnt signaling mediated by Wnt proteins that orchestrate and influence a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, apoptosis, and participation in immune defense during microbe infection.	299
381716	cd19342	Wnt_Wnt10	Wnt domain found in protein Wnt-10a, Wnt-10b and similar proteins. The family includes protein Wnt-10a and Wnt-10b. Wnt-10a plays a role in normal ectoderm development. It is required for normal postnatal development and maintenance of tongue papillae and sweat ducts, as well as normal hair follicle function. Wnt-10b, also called protein Wnt-12, specifically activates canonical Wnt/beta-catenin signaling and thus triggers beta-catenin/LEF/TCF-mediated transcriptional programs. It is involved in signaling networks controlling stemness, pluripotency, and cell fate decisions. Wnt-10b is unique and plays an important role in differentiation of epithelial cells in the hair follicle. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. The Wnt signaling mediated by Wnt proteins that orchestrate and influence a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, apoptosis, and participation in immune defense during microbe infection.	316
381717	cd19343	Wnt_Wnt11	Wnt domain found in protein Wnt-11 and similar proteins. Wnt-11 may be a signaling molecule which has possible roles in the development of skeleton, kidney and lung. It is a positive regulator of the Wnt signaling pathway, which plays a crucial role in carcinogenesis. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. The Wnt signaling mediated by Wnt proteins that orchestrate and influence a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, apoptosis, and participation in immune defense during microbe infection.	305
381718	cd19344	Wnt_Wnt16	Wnt domain found in protein Wnt-16 and similar proteins. Wnt-16 is a mixed canonical and noncanonical Wnt ligand involved in the regulation of postnatal bone homeostasis. It promotes bone formation and inhibits bone resorption. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. The Wnt signaling mediated by Wnt proteins that orchestrate and influence a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, apoptosis, and participation in immune defense during microbe infection.	306
381719	cd19345	Wnt_Wnt2	Wnt domain found in protein Wnt-2 and similar proteins. Wnt-2, also called Int-1-like protein 1 (INT1L1), or Int-1-related protein (IRP), functions in the canonical Wnt signaling pathway that results in activation of transcription factors of the TCF/LEF family. It plays an important role in embryonic lung development. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. The Wnt signaling mediated by Wnt proteins that orchestrate and influence a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, apoptosis, and participation in immune defense during microbe infection.	314
381720	cd19346	Wnt_Wnt2b	Wnt domain found in protein Wnt-2b and similar proteins. Wnt-2b, also called protein Wnt-13, functions in the canonical Wnt/beta-catenin signaling pathway. It plays a redundant role in embryonic lung development. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility, little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathways) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. Wnt signaling orchestrates and influences a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, and apoptosis, and participates in immune defense during microbial infection.	314
381721	cd19347	Wnt_Wnt5a	Wnt domain found in protein Wnt-5a and similar proteins. Wnt-5a is a secreted growth factor that belongs to the noncanonical members of the Wingless-related MMTV-integration family. It can activate or inhibit canonical Wnt signaling, depending on receptor context. Wnt-5a specifically regulates dendritic spine formation in rodent hippocampal neurons, resulting in postsynaptic development that promotes the clustering of the postsynaptic density protein 95 (PSD-95). Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility, little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathways) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. Wnt signaling orchestrates and influences a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, and apoptosis, and participates in immune defense during microbial infection.	312
381722	cd19348	Wnt_Wnt5b	Wnt domain found in protein Wnt-5b and similar proteins. Wnt-5b is a secreted growth factor that belongs to the noncanonical members of the Wingless-related MMTV-integration family. Its overexpression is associated with cancer aggressiveness. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility, little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathways) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. Wnt signaling orchestrates and influences a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, and apoptosis, and participates in immune defense during microbial infection.	312
381723	cd19349	Wnt_Wnt7a	Wnt domain found in protein Wnt-7a and similar proteins. Wnt-7a acts as a canonical Wnt ligand that modulates the synaptic vesicle cycle and synaptic transmission in hippocampal neurons. It also plays an important role in embryonic development, including dorsal versus ventral patterning during limb development, skeleton development and urogenital tract development. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility, little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathways) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. Wnt signaling orchestrates and influences a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, and apoptosis, and participates in immune defense during microbial infection.	318
381724	cd19350	wnt_Wnt7b	Wnt domain found in protein Wnt-7b and similar proteins. Wnt-7b functions in the canonical Wnt/beta-catenin signaling pathway in vascular smooth muscle cells. It is required for normal fusion of the chorion and the allantois during placenta development. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility, little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathways) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. Wnt signaling orchestrates and influences a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, and apoptosis, and participates in immune defense during microbial infection.	318
381725	cd19351	Wnt_Wnt8a	Wnt domain found in protein Wnt-8a and similar proteins. Wnt-8a, also called protein Wnt-8d, plays a role in embryonic patterning. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility, little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathways) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. Wnt signaling orchestrates and influences a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, and apoptosis, and participates in immune defense during microbial infection.	307
381726	cd19352	Wnt_Wnt8b	Wnt domain found in protein Wnt-8b and similar proteins. Wnt-8b may play an important role in the development and differentiation of certain forebrain structures, notably the hippocampus. It acts as a suppressor of early eye and retinal progenitor formation. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility, little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathways) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. Wnt signaling orchestrates and influences a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, and apoptosis, and participates in immune defense during microbial infection.	305
381727	cd19353	Wnt_Wnt9a	Wnt domain found in protein Wnt-9a and similar proteins. Wnt-9a, also called protein Wnt-14, functions in the canonical Wnt/beta-catenin signaling pathway. It is required for normal timing of IHH expression during embryonic bone development, normal chondrocyte maturation and for normal bone mineralization during embryonic bone development. Wnt-9a plays a redundant role in maintaining joint integrity. It is a conserved regulator of hematopoietic stem and progenitor cell development. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility, little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathways) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. Wnt signaling orchestrates and influences a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, and apoptosis, and participates in immune defense during microbial infection.	298
381728	cd19354	Wnt_Wnt9b	Wnt domain found in protein Wnt-9b and similar proteins. Wnt-9b, also called protein Wnt-14b or Wnt-15, functions in the canonical Wnt/beta-catenin signaling pathway. It plays a central role in the regulation of mesenchymal to epithelial transitions underlying organogenesis of the mammalian urogenital system. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility, little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathways) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. Wnt signaling orchestrates and influences a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, and apoptosis, and participates in immune defense during microbial infection.	297
381729	cd19355	Wnt_Wnt10a	Wnt domain found in protein Wnt-10a and similar proteins. Wnt-10a plays a role in normal ectoderm development. It is required for normal postnatal development and maintenance of tongue papillae and sweat ducts, as well as normal hair follicle function. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility, little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathways) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. Wnt signaling orchestrates and influences a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, and apoptosis, and participates in immune defense during microbial infection.	302
381730	cd19356	Wnt_Wnt10b	Wnt domain found in protein Wnt-10b and similar proteins. Wnt-10b, also called protein Wnt-12, specifically activates canonical Wnt/beta-catenin signaling and thus triggers beta-catenin/LEF/TCF-mediated transcriptional programs. It is involved in signaling networks controlling stemness, pluripotency and cell fate decisions. Wnt-10b is unique and plays an important role in differentiation of epithelial cells in the hair follicle. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility, little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathways) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. Wnt signaling orchestrates and influences a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, and apoptosis, and participates in immune defense during microbial infection.	299
381692	cd19357	TenA_E_At3g16990-like	TenA_E proteins similar to Arabidopsis thaliana At3g16990. This family of TenA proteins belongs to the TenA_E class, and lacks the conserved active site Cys residue of the TenA_C class; most have a pair of structurally conserved Glu residues in the active site. TenA_C proteins (EC 3.5.99.2; aminopyrimidine aminohydrolase, also known as thiaminase II) catalyze the hydrolysis of the thiamin breakdown product 4-amino-5-amino-methyl-2-methylpyrimidine (amino-HMP) to 4-amino-5-hydroxymethyl-2-methylpyrimidine (HMP) in a thiamin salvage pathway; the role of TenA_E proteins is less clear. Members of this family include Arabidopsis thaliana At3g16990, Zea mays GRMZM2G080501, and Pyrococcus furiosus PF1337, among others. Arabidopsis thaliana TenA_E hydrolyzes amino-HMP to HMP, and the N-formyl derivative of amino-HMP to amino-HMP, but does not hydrolyze thiamin; nor does it have activity with other thiamine degradation products such as thiamine mono- or diphosphate, oxythiamine, oxothiamine, thiamine disulfide, desthiothiamine or thiochrome as substrates. Structural studies of P. furiosus PF1337 strongly support its enzymatic function in thiamine biosynthesis. It has also been suggested that TenA proteins act as transcriptional regulators based on changes in gene-expression patterns when TenA is overexpressed in Bacillus subtilis; however, this effect may be indirect.	217
381693	cd19358	TenA_E_Spr0628-like	TenA_E proteins similar to Streptococcus pneumoniae Spr0628 and Saccharomyces cerevisiae S288C PET18. This family of TenA proteins belongs to the TenA_E class, and lacks the conserved active site Cys residue of the TenA_C class; most have a pair of structurally conserved Glu residues in the active site. TenA_C proteins (EC 3.5.99.2; aminopyrimidine aminohydrolase, also known as thiaminase II) catalyze the hydrolysis of the thiamin breakdown product 4-amino-5-amino-methyl-2-methylpyrimidine (amino-HMP) to 4-amino-5-hydroxymethyl-2-methylpyrimidine (HMP) in a thiamin salvage pathway; the role of TenA_E proteins is less clear. Arabidopsis thaliana TenA_E (not belonging to this family) hydrolyzes amino-HMP to HMP, and the N-formyl derivative of amino-HMP to amino-HMP. Members of this family include the putative thiaminase Streptococcus pneumoniae Spr0628, and Saccharomyces cerevisiae S288C PET18, a protein of unknown function whose expression is induced in the absence of thiamin. It has also been suggested that TenA proteins act as transcriptional regulators based on changes in gene-expression patterns when TenA is overexpressed in Bacillus subtilis; however, this effect may be indirect. Many proteins in this family have yet to be characterized.	209
381694	cd19359	TenA_C_Bt3146-like	uncharacterized TenA_C proteins similar to Bacteroides thetaiotaomicron Bt3146. This family of TenA proteins belongs to the TenA_C class as it has a conserved active site Cys residue; only one of the double Glu residues identified in the active site of TenA_E from the archaeon Pyrococcus furiosus is conserved in this family. TenA_C proteins (EC 3.5.99.2) catalyze the hydrolysis of the thiamin breakdown product 4-amino-5-amino-methyl-2-methylpyrimidine (amino-HMP) to 4-amino-5-hydroxymethyl-2-methylpyrimidine (HMP) in a thiamin salvage pathway. It has also been suggested that TenA proteins act as transcriptional regulators based on changes in gene-expression patterns when TenA is overexpressed in Bacillus subtilis; however, this effect may be indirect. This family includes mostly uncharacterized TenA-like proteins such as Bacteroides thetaiotaomicron Bt3146.	206
381695	cd19360	TenA_C_SaTenA-like	TenA_C proteins similar to Staphylococcus aureus TenA (SaTenA). This family of TenA proteins belongs to the TenA_C class as it has a conserved active site Cys residue; the double Glu residues identified in the active site of TenA_E from the archaeon Pyrococcus furiosus are conserved in this family. TenA_C proteins (EC 3.5.99.2) catalyze the hydrolysis of the thiamin breakdown product 4-amino-5-amino-methyl-2-methylpyrimidine (amino-HMP) to 4-amino-5-hydroxymethyl-2-methylpyrimidine (HMP) in a thiamin salvage pathway. This family includes Staphylococcus aureus TenA (SaTenA) which plays two essential roles in thiamin metabolism: in the deamination of aminopyrimidine to HMP, and in hydrolyzing thiamin into HMP and 5-(2-hydroxyethyl)-4-methylthiazole (THZ). It has also been suggested that TenA proteins act as transcriptional regulators based on changes in gene-expression patterns when TenA is overexpressed in Bacillus subtilis; however, this effect may be indirect. SaTenA is thus also a putative transcriptional regulator controlling the secretion of extracellular proteases such as subtilisin-type proteases in bacteria. This family includes mostly uncharacterized TenA-like proteins.	211
381696	cd19361	TenA_C_HP1287-like	TenA_C proteins similar to Helicobacter pylori TenA (HP1287). This family of TenA proteins belongs to the TenA_C class as it has a conserved active site Cys residue; the double Glu residues identified in the active site of TenA_E from the archaeon Pyrococcus furiosus are conserved in this family. TenA_C proteins (EC 3.5.99.2) catalyze the hydrolysis of the thiamin breakdown product 4-amino-5-amino-methyl-2-methylpyrimidine (amino-HMP) to 4-amino-5-hydroxymethyl-2-methylpyrimidine (HMP) in a thiamin salvage pathway. This family includes Helicobacter pylori TenA (HP1287) protein which is thought to catalyze a salvage reaction in thiamin metabolism; however, its pyrimidine substrate has not yet been identified. It has also been suggested that TenA proteins act as transcriptional regulators based on changes in gene-expression patterns when TenA is overexpressed in Bacillus subtilis; however, this effect may be indirect. HP1287 may contribute to stomach colonization and persistence.	212
381697	cd19362	TenA_C_SsTenA-1-like	uncharacterized TenA_C proteins similar to Sulfolobus solfataricus TenA-1 (Sso2206). This family of TenA proteins belongs to the TenA_C class as it has a conserved active site Cys residue; only one of the double Glu residues identified in the active site of TenA_E from the archaeon Pyrococcus furiosus is conserved in this family. TenA_C proteins (EC 3.5.99.2) catalyze the hydrolysis of the thiamin breakdown product 4-amino-5-amino-methyl-2-methylpyrimidine (amino-HMP) to 4-amino-5-hydroxymethyl-2-methylpyrimidine (HMP) in a thiamin salvage pathway. It has also been suggested that TenA proteins act as transcriptional regulators based on changes in gene-expression patterns when TenA is overexpressed in Bacillus subtilis; however, this effect may be indirect. This family includes mostly uncharacterized TenA-like proteins such as Sulfolobus solfataricus putative TenA-like thiaminase (TenA-1, Sso2206).	200
381698	cd19363	TenA_C_PH1161-like	uncharacterized TenA_C proteins similar to Pyrococcus horikoshii PH1161. This family of TenA proteins belongs to the TenA_C class as it has a conserved active site Cys residue; only one of the double Glu residues identified in the active site of TenA_E from the archaeon Pyrococcus furiosus is conserved in this family. TenA_C proteins (EC 3.5.99.2) catalyze the hydrolysis of the thiamin breakdown product 4-amino-5-amino-methyl-2-methylpyrimidine (amino-HMP) to 4-amino-5-hydroxymethyl-2-methylpyrimidine (HMP) in a thiamin salvage pathway. It has also been suggested that TenA proteins act as transcriptional regulators based on changes in gene-expression patterns when TenA is overexpressed in Bacillus subtilis; however, this effect may be indirect. This family includes functionally uncharacterized TenA-like proteins such as Pyrococcus horikoshii PH1161 protein.	210
381699	cd19364	TenA_C_BsTenA-like	TenA_C proteins similar to Bacillus subtilis TenA. This family of TenA proteins belongs to the TenA_C class as it has a conserved active site Cys residue; the double Glu residues identified in the active site of TenA_E from the archaeon Pyrococcus furiosus are conserved in this family. TenA_C proteins (EC 3.5.99.2) catalyze the hydrolysis of the thiamin breakdown product 4-amino-5-amino-methyl-2-methylpyrimidine (amino-HMP) to 4-amino-5-hydroxymethyl-2-methylpyrimidine (HMP) in a thiamin salvage pathway. This family includes Bacillus subtilis TenA which has been shown to be a thiaminase II, catalyzing the hydrolysis of thiamine into HMP and 5-(2-hydroxyethyl)-4-methylthiazole (THZ), within thiamine metabolism. It has also been suggested that TenA proteins act as transcriptional regulators based on changes in gene-expression patterns when TenA is overexpressed in Bacillus subtilis; however, this effect may be indirect.	212
381700	cd19365	TenA_C-like	uncharacterized TenA_C proteins. This family of TenA proteins belongs to the TenA_C class as it has a conserved active site Cys residue; the double Glu residues identified in the active site of TenA_E from the archaeon Pyrococcus furiosus are conserved in this family. TenA_C proteins (EC 3.5.99.2) catalyze the hydrolysis of the thiamin breakdown product 4-amino-5-amino-methyl-2-methylpyrimidine (amino-HMP) to 4-amino-5-hydroxymethyl-2-methylpyrimidine (HMP) in a thiamin salvage pathway. It has also been suggested that TenA proteins act as transcriptional regulators based on changes in gene-expression patterns when TenA is overexpressed in Bacillus subtilis; however, this effect may be indirect. This family includes mostly uncharacterized TenA_C-like proteins.	205
381701	cd19366	TenA_C_BhTenA-like	TenA_C proteins similar to Bacillus halodurans TenA. This family of TenA proteins belongs to the TenA_C class as it has a conserved active site Cys residue; the double Glu residues identified in the active site of TenA_E from the archaeon Pyrococcus furiosus are conserved in this family. TenA_C proteins (EC 3.5.99.2) catalyze the hydrolysis of the thiamin breakdown product 4-amino-5-amino-methyl-2-methylpyrimidine (amino-HMP) to 4-amino-5-hydroxymethyl-2-methylpyrimidine (HMP) in a thiamin salvage pathway. This family includes Bacillus halodurans TenA which participates in a salvage pathway where the thiamine degradation product 2-methyl-4-formylamino-5-aminomethylpyrimidine (formylamino-HMP) is hydrolyzed first to amino-HMP by the YlmB protein, and the amino-HMP is then hydrolyzed by TenA to produce HMP. It has also been suggested that TenA proteins act as transcriptional regulators based on changes in gene-expression patterns when TenA is overexpressed in Bacillus subtilis; however, this effect may be indirect.	213
381702	cd19367	TenA_C_ScTHI20-like	TenA_C family similar to the C-terminal TenA_C domain of Saccharomyces cerevisiae THI20 protein. This TenA family belongs to the TenA_C class as it has a conserved active site Cys residue; the double Glu residues identified in the active site of TenA_E from the archaeon Pyrococcus furiosus are conserved in this family. TenA_C proteins (EC 3.5.99.2) catalyze the hydrolysis of the thiamin breakdown product 4-amino-5-amino-methyl-2-methylpyrimidine (amino-HMP) to 4-amino-5-hydroxymethyl-2-methylpyrimidine (HMP) in a thiamin salvage pathway. Saccharomyces cerevisiae THI20 includes a C-terminal tetrameric TenA-like domain fused to an N-terminal HMP kinase/HMP-P kinase (ThiD-like) domain, and participates in thiamin biosynthesis, degradation and salvage; the TenA-like domain catalyzes the production of HMP from thiamin degradation products (salvage). A majority of this family are single-domain TenA_C-like proteins; some, however, have additional domains such as a ThiD domain. It has also been suggested that TenA proteins act as transcriptional regulators based on changes in gene-expression patterns when TenA is overexpressed in Bacillus subtilis; however, this effect may be indirect.	204
381703	cd19368	TenA_C_AtTH2-like	TenA_C family similar to the N-terminal TenA_C domain of Arabidopsis thaliana thiamine requiring 2. This TenA family belongs to the TenA_C class as it has a conserved active site Cys residue; the double Glu residues identified in the active site of TenA_E from the archaeon Pyrococcus furiosus are conserved in this family. TenA_C proteins (EC 3.5.99.2) catalyze the hydrolysis of the thiamin breakdown product 4-amino-5-amino-methyl-2-methylpyrimidine (amino-HMP) to 4-amino-5-hydroxymethyl-2-methylpyrimidine (HMP) in a thiamin salvage pathway. Arabidopsis thaliana TH2 is an orphan enzyme thiamin monophosphate phosphatase which has a haloacid dehalogenase (HAD) family domain fused to its TenA_C domain; its TenA_C domain has thiamin salvage hydrolase activity against amino-HMP. This family includes mostly uncharacterized single-domain TenA_C-like proteins; some, however, have additional domains such as a HAD family domain or a kinase domain. It has also been suggested that TenA proteins act as transcriptional regulators based on changes in gene-expression patterns when TenA is overexpressed in Bacillus subtilis; however, this effect may be indirect.	210
381704	cd19369	TenA_C-like	uncharacterized TenA_C proteins. This family of TenA proteins belongs to the TenA_C class as it has a conserved active site Cys residue; the double Glu residues identified in the active site of TenA_E from the archaeon Pyrococcus furiosus are conserved in this family. TenA_C proteins (EC 3.5.99.2) catalyze the hydrolysis of the thiamin breakdown product 4-amino-5-amino-methyl-2-methylpyrimidine (amino-HMP) to 4-amino-5-hydroxymethyl-2-methylpyrimidine (HMP) in a thiamin salvage pathway. It has also been suggested that TenA proteins act as transcriptional regulators based on changes in gene-expression patterns when TenA is overexpressed in Bacillus subtilis; however, this effect may be indirect. This family includes mostly uncharacterized TenA_C-like proteins.	202
381705	cd19370	TenA_PqqC	TenA-like proteins, including PqqC and CADD. This family contains proteins with similarity to TenA, and includes bacterial coenzyme pyrroloquinoline quinone (PQQ) synthesis protein C or PQQC proteins. PQQ is the prosthetic group of several bacterial enzymes, including methanol dehydrogenase of methylotrophs and the glucose dehydrogenase of a number of bacteria. PQQC catalyzes the last step of PQQ biogenesis which involves a ring closure and an eight-electron oxidation of the substrate [3a-(2-amino-2-carboxyethyl)-4,5-dioxo-4,5,6,7,8,9-hexahydroquinoline-7,9-dicarboxylic acid (AHQQ)]. The exact molecular function of members of this family is unclear. Also belonging to this family is Chlamydia protein CADD (Chlamydia protein Associating with Death Domains), a redox protein toxin unique to Chlamydia species, which modulates host cell apoptosis; its redox activity and death domain binding ability may be required for this biological activity. CADD may have a role in folate metabolism.	219
381686	cd19371	UDG-F1-like	Uracil DNA glycosylase family 1, includes human uracil DNA glycosylase, Vaccinia virus protein D4, Nitratifractor salsuginis UNG and similar proteins. Uracil DNA glycosylase family 1 is the most efficient of all uracil-DNA glycosylases (UDGs, also known as UNGs) and shows a specificity for uracil in DNA. UDG catalyzes the removal of uracil from DNA to initiate the DNA base excision repair pathway. Uracil in DNA can arise as a result of misincorporation of dUMP residues by DNA polymerase or deamination of cytosine. Uracil mispaired with guanine in DNA is one of the major pro-mutagenic events, causing G:C->A:T mutations. Thus, UDG is an essential enzyme for maintaining the integrity of genetic information. UDGs have been classified into various families on the basis of their substrate specificity, conserved motifs, and structural similarities. Although these families demonstrate different substrate specificities, often the function of one enzyme can be complemented by the other. More distant members of UDG family 1 include Nitratifractor salsuginis UNG (NsaUNG) and Vaccinia virus (VACV) protein D4 uracil-DNA glycosylase, a subunit of the VACV DNA polymerase holoenzyme. NsaUNG only exhibits robust enzymatic activity on uracil-containing DNAs, in particular double-stranded uracil-containing substrates; it does not act on hypoxanthine- and xanthine-containing substrates. NsaUNG is not inhibited by the Ugi protein, which specifically inhibits conventional family 1 UDGs. D4, in addition to excising uracil residues from DNA, is part of a heterodimeric processivity factor which potentiates the DNA polymerase activity.	135
381687	cd19372	UDG_F1_VAVC_D4-like	Uracil DNA glycosylase family 1 subfamily, includes Vaccinia virus protein D4 and similar proteins. Vaccinia virus (VACV) protein D4 uracil-DNA glycosylase is a subunit of the VACV DNA polymerase holoenzyme and a more distant member of uracil DNA glycosylase (UDG) family 1. D4, in addition to excising uracil residues from DNA, is part of a heterodimeric processivity factor which potentiates the DNA polymerase activity. UDG catalyzes the removal of uracil from DNA to initiate the DNA base excision repair pathway. Uracil in DNA can arise as a result of misincorporation of dUMP residues by DNA polymerase or deamination of cytosine. Uracil mispaired with guanine in DNA is one of the major pro-mutagenic events, causing G:C->A:T mutations. Thus, UDG is an essential enzyme for maintaining the integrity of genetic information. UDGs have been classified into various families on the basis of their substrate specificity, conserved motifs, and structural similarities. Although these families demonstrate different substrate specificities, often the function of one enzyme can be complemented by the other.	200
381688	cd19373	UDG-F1_NsUNG-like	Uracil DNA glycosylase family 1 subfamily, includes Nitratifractor salsuginis UNG and similar proteins. Uracil DNA glycosylase family 1 is the most efficient of all uracil-DNA glycosylases (UDGs, also known as UNGs) and shows a specificity for uracil in DNA. Nitratifractor salsuginis UNG (NsaUNG) only exhibits robust enzymatic activity on uracil-containing DNAs, in particular double-stranded uracil-containing substrates, and does not act on hypoxanthine- and xanthine-containing substrates. NsaUNG is not inhibited by the Ugi protein, which specifically inhibits conventional family 1 UDGs. UDG catalyzes the removal of uracil from DNA to initiate the DNA base excision repair pathway. Uracil in DNA can arise as a result of misincorporation of dUMP residues by DNA polymerase or deamination of cytosine. Uracil mispaired with guanine in DNA is one of the major pro-mutagenic events, causing G:C->A:T mutations. Thus, UDG is an essential enzyme for maintaining the integrity of genetic information. UDGs have been classified into various families on the basis of their substrate specificity, conserved motifs, and structural similarities. Although these families demonstrate different substrate specificities, often the function of one enzyme can be complemented by the other.	174
381689	cd19374	UDG-F3_SMUG1-like	Uracil DNA glycosylase family 3 subfamily, includes single-strand-selective monofunctional uracil-DNA glycosylase 1 and similar proteins. Uracil DNA glycosylase family 3 includes human SMUG1 that can remove uracil and its oxidized pyrimidine derivatives from both single-stranded DNA and double-stranded DNA, with a preference for single-stranded DNA substrates. The SMUG-targeted mismatched uracil derivatives include 5-hydroxyuracil, 5-hydroxymethyluracil and 5-formyluracil. Also included in this subfamily is Geobacter metallireducens SMUG1 which has dual substrate specificities for DNA with uracil or xanthine. UDG catalyzes the removal of uracil from DNA to initiate the DNA base excision repair pathway. Uracil in DNA can arise as a result of misincorporation of dUMP residues by DNA polymerase or deamination of cytosine. Uracil mispaired with guanine in DNA is one of the major pro-mutagenic events, causing G:C->A:T mutations. Thus, UDG is an essential enzyme for maintaining the integrity of genetic information. UDGs have been classified into various families on the basis of their substrate specificity, conserved motifs, and structural similarities. Although these families demonstrate different substrate specificities, often the function of one enzyme can be complemented by the other.	232
381690	cd19375	UDG-F3-like_SMUG2	Uracil DNA glycosylase family 3-like subfamily, includes single-strand-selective monofunctional uracil-DNA glycosylase 2 and similar proteins. Uracil DNA glycosylase family 3-like, which includes Pedobacter heparinus SMUG2, displays catalytic activities towards DNA containing uracil or hypoxanthine/xanthine. UDG catalyzes the removal of uracil from DNA to initiate the DNA base excision repair pathway. Uracil in DNA can arise as a result of misincorporation of dUMP residues by DNA polymerase or deamination of cytosine. Uracil mispaired with guanine in DNA is one of the major pro-mutagenic events, causing G:C->A:T mutations. Thus, UDG is an essential enzyme for maintaining the integrity of genetic information. UDGs have been classified into various families on the basis of their substrate specificity, conserved motifs, and structural similarities. Although these families demonstrate different substrate specificities, often the function of one enzyme can be complemented by the other.	218
381646	cd19376	TGF_beta_GDF15	transforming growth factor beta (TGF-beta) like domain found in mammalian growth/differentiation factor 15 (GDF-15) and similar proteins. GDF-15, also termed macrophage inhibitory cytokine 1 (MIC-1), or NSAID-activated gene 1 protein (NAG-1), or NSAID-regulated gene 1 protein (NRG-1), or placental TGF-beta, or placental bone morphogenetic protein, or prostate differentiation factor, regulates food intake, energy expenditure and body weight in response to metabolic and toxin-induced stresses.	101
381647	cd19377	TGF_beta_INHA_B_like	transforming growth factor beta (TGF-beta) like domain found in inhibin alpha chain (INHA), beta chain (INHB) and similar proteins. INHA is a component of inhibins (inhibin A or inhibin B) that inhibit the secretion of follitropin by the pituitary gland. INHB includes inhibin beta A chain (INHBA), B chain (INHBB), C chain (INHBC), and E chain (INHBE). INHBA, also termed activin beta-A chain, or erythroid differentiation protein (EDF), is a component of inhibin A, activin A, or activin AB. INHBB, also termed activin beta-B chain, is a component of inhibin B, activin B, or activin AB. Inhibins and activins inhibit and activate, respectively, the secretion of follitropin by the pituitary gland. INHBC, also termed activin beta-C chain, might play important roles in carcinogenesis. It may function as a negative regulator of liver growth. INHBE, also termed activin beta-E chain, is a possible insulin resistance-associated hepatokine with hepatic gene expression that is positively correlated with insulin resistance and body mass index in humans. It is also a possible new marker for drug-induced endoplasmic reticulum stress.	101
381648	cd19378	TGF_beta_DAF7	transforming growth factor beta (TGF-beta) like domain found in Caenorhabditis elegans Dauer larva development regulatory growth factor DAF-7 and similar proteins. DAF-7, also termed abnormal dauer formation protein 7, may act as a negative regulator of dauer larva development by transducing chemosensory information from ASI neurons. It is involved in sensitivity to CO2 levels.	100
381649	cd19379	TGF_beta_GSDF	transforming growth factor beta (TGF-beta) like domain found in Danio rerio gonadal somatic cell derived factor (GSDF) and similar proteins. GSDF is a new member of the transforming growth factor beta (TGF-beta) superfamily. It is a teleost- and gonad-specific growth factor that controls sex determination in some fish and plays an important role in mediating germ cell/soma signaling.	94
381650	cd19380	TGF_beta_GDNF	transforming growth factor beta (TGF-beta) like domain found in glial cell line-derived neurotrophic factor (GDNF) and similar proteins. GDNF, also termed astrocyte-derived trophic factor (ATF), is a member of the glial cell-line-derived neurotrophic factor (GDNF) family. It acts as a neurotrophic factor that enhances survival and morphological differentiation of dopaminergic neurons and increases their high-affinity dopamine uptake.	96
381651	cd19381	TGF_beta_Artemin	transforming growth factor beta (TGF-beta) like domain found in Artemin and similar proteins. Artemin, also termed Enovin, or Neublastin, is a member of the glial cell-line-derived neurotrophic factor (GDNF) family with growth promoting activity on neuronal cells. It acts as the ligand for the GFR-alpha-3-RET receptor complex but can also activate the GFR-alpha-1-RET receptor complex. It supports peripheral and central neurons and signals through the GFR-alpha-3-RET receptor complex.	98
381652	cd19382	TGF_beta_Persephin	transforming growth factor beta (TGF-beta) like domain found in Persephin and similar proteins. Persephin is a member of the glial cell-line-derived neurotrophic factor (GDNF) family with neurotrophic activity on mesencephalic dopaminergic and motor neurons.	99
381653	cd19383	TGF_beta_Neurturin	transforming growth factor beta (TGF-beta) like domain found in Neurturin and similar proteins. Neurturin is a member of the glial cell-line-derived neurotrophic factor (GDNF) family. It acts as a neurotrophic factor that supports the survival of sympathetic neurons in culture and may regulate the development and maintenance of the CNS. It might control the size of non-neuronal cell populations such as haemopoietic cells.	104
381654	cd19384	TGF_beta_TGFB1	transforming growth factor beta (TGF-beta) like domain found in transforming growth factor beta-1 (TGF-beta-1) and similar proteins. TGF-beta-1 is a polypeptide member of the transforming growth factor beta superfamily of cytokines. It is a secreted protein that performs many cellular functions, including the control of cell growth, cell proliferation, cell differentiation, and apoptosis.	99
381655	cd19385	TGF_beta_TGFB2	transforming growth factor beta (TGF-beta) like domain found in transforming growth factor beta-2 (TGF-beta-2) and similar proteins. TGF-beta-2, also termed BSC-1 cell growth inhibitor, or cetermin, or glioblastoma-derived T-cell suppressor factor (G-TSF), or polyergin, is a polypeptide member of the transforming growth factor beta superfamily of cytokines. It is a secreted protein that performs many cellular functions and has a vital role during embryonic development. It can suppress the effects of interleukin-2 dependent T-cell growth.	97
381656	cd19386	TGF_beta_TGFB3	transforming growth factor beta (TGF-beta) like domain found in transforming growth factor beta-3 (TGF-beta-3) and similar proteins. TGF-beta-3 is a polypeptide member of the transforming growth factor beta superfamily of cytokines. It is involved in embryogenesis and cell differentiation. It regulates molecules involved in cellular adhesion and extracellular matrix (ECM) formation during the process of palate development.	101
381657	cd19387	TGF_beta_univin	transforming growth factor beta (TGF-beta) like domain found in Strongylocentrotus purpuratus univin and similar proteins. Univin may have a critical role in early developmental decisions in the sea urchin embryo.	104
381658	cd19388	TGF_beta_GDF8	transforming growth factor beta (TGF-beta) like domain found in growth/differentiation factor 8 (GDF8) and similar proteins. GDF8, also termed myostatin, acts specifically as a negative regulator of skeletal muscle growth.	108
381659	cd19389	TGF_beta_GDF11	transforming growth factor beta (TGF-beta) like domain found in growth/differentiation factor 11 (GDF11) and similar proteins. GDF11, also termed bone morphogenetic protein 11 (BMP-11), is a secreted signal that acts globally to specify positional identity along the anterior/posterior axis during development.	109
381660	cd19390	TGF_beta_BMP2	transforming growth factor beta (TGF-beta) like domain found in bone morphogenetic protein 2 (BMP-2) and similar proteins. BMP-2, also termed BMP-2A, induces cartilage and bone formation. It stimulates the differentiation of myoblasts into osteoblasts via the EIF2AK3-EIF2A-ATF4 pathway.	103
381661	cd19391	TGF_beta_BMP4_BMP2B	transforming growth factor beta (TGF-beta) like domain found in bone morphogenetic protein 4 (BMP-4) and similar proteins. BMP-4, also termed BMP-2B, induces cartilage and bone formation. It also acts in mesoderm induction, tooth development, limb formation and fracture repair.	107
381662	cd19392	TGF_beta_DPP	transforming growth factor beta (TGF-beta) like domain found in Drosophila melanogaster protein decapentaplegic (Dpp) and similar proteins. Dpp, also termed protein DPP-C, is required later in embryogenesis for dorsal closure and patterning of the hindgut. It also functions postembryonically as a long-range morphogen during imaginal disk development and is responsible for the progression of the morphogenetic furrow during eye development.	109
381663	cd19393	TGF_beta_BMP3	transforming growth factor beta (TGF-beta) like domain found in bone morphogenetic protein 3 (BMP-3) and similar proteins. BMP-3, also termed BMP-3A, or osteogenin, negatively regulates bone density. It antagonizes the ability of certain osteogenic BMPs to induce osteoprogenitor differentiation and ossification.	110
381664	cd19394	TGF_beta_GDF10	transforming growth factor beta (TGF-beta) like domain found in growth/differentiation factor 10 (GDF10) and similar proteins. GDF10, also termed bone morphogenetic protein 3B (BMP-3B), or bone-inducing protein (BIP), is a growth factor involved in osteogenesis and adipogenesis.	112
381665	cd19395	TGF_beta_BMP5	transforming growth factor beta (TGF-beta) like domain found in bone morphogenetic protein 5 (BMP-5) and similar proteins. BMP-5 induces cartilage and bone formation.	113
381666	cd19396	TGF_beta_BMP6	transforming growth factor beta (TGF-beta) like domain found in bone morphogenetic protein 6 (BMP-6) and similar proteins. BMP-6, also termed VG-1-related protein, or VG-1-R, or VGR-1, induces cartilage and bone formation.	103
381667	cd19397	TGF_beta_BMP7	transforming growth factor beta (TGF-beta) like domain found in bone morphogenetic protein 7 (BMP-7) and similar proteins. BMP-7, also termed osteogenic protein 1 (OP-1), or eptotermin alfa, induces cartilage and bone formation. It may act as the osteoinductive factor responsible for the phenomenon of epithelial osteogenesis and play a role in calcium regulation and bone homeostasis.	107
381668	cd19398	TGF_beta_BMP8	transforming growth factor beta (TGF-beta) like domain found in bone morphogenetic protein 8A (BMP-8A), 8B (BMP-8B) and similar proteins. BMP-8A plays a role in the maintenance of spermatogenesis and the integrity of the epididymis. BMP-8B, also termed BMP-8, or osteogenic protein 2 (OP-2), may act as a secreted factor in cancer progression. It also plays an essential role in bone metabolism and can regulate thermogenesis and energy balance. Like BMP-8A, BMP-8B plays a role in spermatogenesis and placental development. Mutation in either of the genes encoding BMP-8A or BMP-8B causes postnatal depletion of spermatogonia in mice.	105
381669	cd19399	TGF_beta_GDF5	transforming growth factor beta (TGF-beta) like domain found in growth/differentiation factor 5 (GDF5) and similar proteins. GDF5, also termed bone morphogenetic protein 14 (BMP-14), or cartilage-derived morphogenetic protein 1 (CDMP-1), or lipopolysaccharide-associated protein 4 (LAP-4), or LPS-associated protein 4, or radotermin, is a growth factor involved in bone and cartilage formation.	103
381670	cd19400	TGF_beta_BMP9	transforming growth factor beta (TGF-beta) like domain found in bone morphogenetic protein 9 (BMP-9) and similar proteins. BMP-9, also termed growth/differentiation factor 2 (GDF-2), is a potent circulating inhibitor of angiogenesis. It signals through the type I activin receptor ACVRL1 but not other activin receptor-like kinases (ALKs).	105
381671	cd19401	TGF_beta_BMP10	transforming growth factor beta (TGF-beta) like domain found in bone morphogenetic protein 10 (BMP-10) and similar proteins. BMP-10 is required for maintaining the proliferative activity of embryonic cardiomyocytes by preventing premature activation of the negative cell cycle regulator CDKN1C/p57KIP and maintaining the required expression levels of cardiogenic factors such as MEF2C and NKX2-5. It inhibits endothelial cell migration and growth. It may reduce cell migration and cell matrix adhesion in breast cancer cell lines.	105
381672	cd19402	TGF_beta_GDF9B	transforming growth factor beta (TGF-beta) like domain found in growth/differentiation factor 9B (GDF-9B) and similar proteins. GDF-9B, also termed bone morphogenetic protein 15 (BMP15), acts as an oocyte-specific growth/differentiation factor that stimulates folliculogenesis and granulosa cell (GC) growth.	104
381673	cd19403	TGF_beta_GDF9	transforming growth factor beta (TGF-beta) like domain found in growth/differentiation factor 9 (GDF-9) and similar proteins. GDF-9 is required for ovarian folliculogenesis. It promotes primordial follicle development and stimulates granulosa cell proliferation.	106
381674	cd19404	TGF_beta_INHBA	transforming growth factor beta (TGF-beta) like domain found in inhibin beta A chain (INHBA) and similar proteins. INHBA, also termed activin beta-A chain, or erythroid differentiation protein (EDF), is a component of inhibin A, activin A, or activin AB. Inhibins and activins inhibit and activate, respectively, the secretion of follitropin by the pituitary gland.	108
381675	cd19405	TGF_beta_INHBB	transforming growth factor beta (TGF-beta) like domain found in inhibin beta B chain (INHBB) and similar proteins. INHBB, also termed activin beta-B chain, is a component of inhibin B, activin A, or activin AB. Inhibins and activins inhibit and activate, respectively, the secretion of follitropin by the pituitary gland.	107
381676	cd19406	TGF_beta_INHBC_E	transforming growth factor beta (TGF-beta) like domain found in inhibin beta C chain (INHBC), inhibin beta E chain (INHBE) and similar proteins. The family includes INHBC and INHBE. INHBC, also termed activin beta-C chain, might play important roles in carcinogenesis. It may function as a negative regulator of liver growth. INHBE, also termed activin beta-E chain, is a possible insulin resistance-associated hepatokine with hepatic gene expression that positively correlated with insulin resistance and body mass index in humans. It also acts as a possible new marker for drug-induced endoplasmic reticulum stress.	104
410990	cd19412	pMMO-AMO_C	subunit C of particulate methane monooxygenase (pMMO, also known as membrane-bound MMO) from methanotrophic bacteria, and of ammonia monooxygenase (AMO) from ammonia-oxidizing bacteria, and related proteins. This family contains subunit C of particulate methane monooxygenase (pMMO; EC 1.14.18.3), an integral membrane metalloenzyme that catalyzes the conversion of methane to methanol. MMO is the first enzyme in the metabolic pathway of methanotrophic bacteria. It also contains subunit C of AMO (EC 1.14.99.39) from ammonia-oxidizing bacteria such as Nitrosomonas europaea (AmoC1-AmoC3). AMO catalyzes the conversion of ammonia to hydroxylamine. pMMO, along with soluble MMO (sMMO; EC 1.14.13.25), and the related enzyme AMO are the only known enzymes capable of methane hydroxylation. pMMO is composed of three subunits, PmoB (B or alpha), PmoA (A or beta), and PmoC (C or gamma), each containing membrane-spanning helices, with three copies each of the subunits forming a cylindrical A3B3C3 oligomer with a hole in the center. This subunit of pMMO has a metal-binding site that is exposed to the center of the pMMO oligomer, the metal being zinc or copper. Although biochemical and mutagenesis data indicate that the active site is located at the dicopper site in subunit B, the metal-binding site in this transmembrane subunit C may also be functionally relevant since all ligands are conserved and best enzymatic activity is obtained from intact pMMO containing all three subunits. Zinc inhibition studies of several respiratory complexes in Methylococcus capsulatus and Methylosinus trichosporium suggest that zinc might inhibit proton transfer in pMMO by either replacing active site copper ions or another copper site that is involved in reducing the active site. Nitrosomonas europaea AMO is composed of three subunits (AmoA, AmoB, and AmoC); it has two nearly identical copies of AmoC encoded by duplicate amoCAB operons and a more divergent AmoC encoded by a monocistronic amoC. The significantly shorter related C subunits of AMO from ammonia-oxidizing archaea are not included in this model.	217
381292	cd19413	RsbR_N-like	globin-like domain of positive regulator of sigma-B activity (RsbRA). The globin-like domain of Bacillus subtilis RsbRA is a non-heme globin presumed to channel sensory input to the C-terminal sulfate transporter/anti-sigma factor antagonist (STAT) domain. RsbRA is a component of the sigma B-activating stressosome, and a regulator of the RNA polymerase sigma factor subunit sigma (B).	132
381189	cd19414	lipocalin_1_3_4_13-like	lipocalin-1, -3, -4, -13 and similar proteins. Lipocalin-1 (LCN1, also known as tear lipocalin, von Ebner's gland protein, or tear specific prealbumin), the main lipid carrier in human tears, is critical to functions involving lipids in protection of the ocular surface. Its large ligand pocket accommodates a range of ligands including alkyl alcohols, glycolipids, phospholipids, cholesterol, steroids, and siderophores. Lipocalin-3 (LCN3, also known as vomeronasal secretory protein 1) and lipocalin-4 (LCN4, also known as vomeronasal secretory protein 2) are involved in transport of lipophilic molecules, and are possibly pheromone-carriers. Lipocalin-13 (LCN13, also known as odorant binding protein 2A) may bind and transport small hydrophobic volatile molecules with a higher affinity for aldehydes and large fatty acids. Another member of this family is late lactation protein B (LLPB), a milk protein produced during the late phase of lactation, which may be involved in transporting a small ligand released during the hydrolysis of milk fat. This group belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	147
381190	cd19415	lipocalin_ApoM_AGP	apolipoprotein M and alpha1-acid glycoprotein family. Apolipoprotein M (ApoM) is mainly found in high-density lipoproteins (HDL) and is expressed in the liver and in the kidney; it is associated to a lesser extent with low density lipids and triglyceride rich lipoproteins. ApoM is involved in lipid transport and can bind sphingosine-1-phosphate, myristic acid, palmitic acid and stearic acid, retinol, all-trans-retinoic acid and 9-cis-retinoic acid. Alpha1-acid glycoprotein (AGP), also known as orosomucoid, has many important biological roles such as in the acute-phase reaction in response to inflammation, in immune regulation, in drug-binding and drug-transportation, in regulating sphingolipid synthesis and metabolism, and in maintaining the capillary barrier. This group belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	153
381191	cd19416	lipocalin_beta-LG-like	beta-lactoglobulin and similar proteins. Beta-Lactoglobulin (beta-LG) is the major whey protein of ruminant species and present in the milk of many other species, with a notable exception of human. It is the major allergen of bovine milk. Beta-LG has been shown to bind hydrophobic ligands such as curcumin, vitamin E or fatty acids, or hydrophilic such as vitamin B9. This group also includes human glycodelin (also known as placental protein 14, pregnancy-associated endometrial alpha-2 globulin, and progestagen-associated endometrial protein) which is involved in crucial biological processes such as reproduction and immune reaction. Four glycoforms of glycodelin have been identified in reproductive tissue that differ in glycosylation and biological activity. This group belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	160
381192	cd19417	lipocalin_C8gamma	complement protein C8 gamma. Human complement protein C8 gamma, together with C8alpha and C8beta, form one of five components of the cytolytic membrane attack complex (MAC), a pore-like structure that assembles on bacterial membranes. C8alpha and C8gamma form a disulfide-linked heterodimer that is noncovalently associated with C8beta. MAC plays an important role in the defense against gram-negative bacteria and other pathogenic organisms. C8gamma belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	162
381193	cd19418	lipocalin_A1M-like	lipocalin domain of alpha1-microglobulin and similar proteins. Alpha(1)-microglobulin (A1M, also known as protein AMBP, alpha-1 microglycoprotein, and protein HC), has immunosuppressive properties, such as inhibition of antigen induced lymphocyte cell-proliferation, cytokine secretion, and oxidative burst of neutrophils. A1M may participate in the reducing and scavenging of biological pro-oxidants such as heme and heme-proteins. It binds heme strongly, and a C-terminally processed form of the protein degrades the heme. It can reduce cytochrome C, nitroblue tetrazolium, methemoglobin and free iron, using NADH, NADPH or ascorbate as cofactor. Intravenous administration of recombinant A1M in animal models eliminates or significantly reduces the manifestations of preeclampsia. A1M is a useful biomarker in clinical diagnostics for monitoring pre-eclampsia, hepatitis E, renal tubular dysfunction, and renal toxicity. A1M belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	163
381194	cd19419	lipocalin_L-PGDS	lipocalin-type prostaglandin D synthase. Lipocalin-type prostaglandin D synthase (L-PGDS; EC:5.3.99.2) is a secreted enzyme and the second most abundant protein in human cerebrospinal fluid. L-PGDS acts as both an enzyme and a lipid transporter, converting prostaglandin H2 to prostaglandin D2 and serving as a carrier for hydrophobic ligands including retinoids, hemoglobin metabolites, thyroid hormones, gangliosides, and fatty acids. L-PGDS belongs to the lipocalin/cytosolic fatty-acid binding protein family which has a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	158
381195	cd19420	lipocalin_VDE	lipocalin domain of violaxanthin deepoxidase and similar proteins. Plant violaxanthin de-epoxidase (VDE, EC 1.23.5.1) participates in the xanthophyll cycle for controlling the concentration of zeaxanthin in chloroplasts. It catalyzes the conversion of violaxanthin to antheraxanthin and zeaxanthin in strong light, and plays a central role in adjusting photosynthetic activity to changing light conditions. In addition, maize VDE has been shown to interact with sugarcane mosaic virus helper component-proteinase, HC-(SCMV), and to attenuate the RNA silencing suppression activity of the latter. VDE belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	177
381196	cd19421	lipocalin_5_8-like	lipocalin similar to human epididymal-specific lipocalin-8, mouse lipocalin-5 and -8, and similar proteins. Lipocalin 5 (LCN5; also known as epididymal retinoic acid binding protein Erabp, mouse epididymal protein 10, MEP10, and E-RABP) and Lipocalin 8 (LCN8; also known as mouse epididymal protein 17, MEP17) are homologous proteins belonging to the epididymis-specific lipocalins; they may play a role in male fertility, and may act as retinoid carrier proteins within the epididymis. In mice, genes encoding the two proteins are contiguous; in humans, there is one gene LCN8 (which has been previously called LCN5). This group belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	150
381197	cd19422	lipocalin_15-like	lipocalin 15 and similar proteins, such as chicken CALbeta. This subfamily includes uncharacterized human lipocalin 15, and chicken chondrogenesis-associated lipocalin (CAL) beta which is associated with chondrogenesis and inflammation. It belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	143
381198	cd19423	lipocalin_LTBP1-like	Triatominae salivary lipocalins such as Rhodnius prolixus LTBP1 and Meccus pallidipennis triabin, and similar proteins. This subfamily includes various insect proteins found in the saliva of Triatominae (kissing bugs), including Rhodnius prolixus leukotriene-binding LTBP1. Rhodnius prolixus, a vector of the pathogen Trypanosoma cruzi, sequesters cysteinyl leukotrienes during feeding to inhibit immediate inflammatory responses; LTBP1 binds leukotrienes C4 (LTC4), D4 (LTD4), and E4 (LTE4). Meccus pallidipennis (syn Triatoma pallidipennis) triabin is a potent and selective thrombin inhibitor. It also includes Triatoma protracta procalin, a major salivary allergen which causes an allergic reaction in humans. It belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	132
381199	cd19424	lipocalin_NPs-like	nitrophorins and similar proteins. Nitrophorins (NPs) represent a group of nitric oxide (NO)-carrying heme proteins found in the saliva of Rhodnius prolixus. In its adult phase, R. prolixus expresses at least 4 nitrophorins (designated NP1-4 in order of their increasing abundance in the saliva of adult insects). Two additional nitrophorins, NP5 and NP6, have been detected mainly in the five instar nymphal stages of insect development. NP7 has not been isolated from the insects but was instead recognized in a cDNA library. Besides NO, NPs also show high affinities for histamine (Hm). This group also includes Rhodnius prolixus amine-binding protein (ABP) which plays an important role in biogenic amine binding; it binds serotonin and norepinephrine with high affinity. NPs belong to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	175
381200	cd19425	lipocalin_10-like	Epididymal-specific lipocalin-10 and similar proteins. Epididymal-specific lipocalin-10 (LCN10) may play a role in male fertility, and may act as a retinoid carrier protein within the epididymis. It belongs to the lipocalin/cytosolic fatty-acid binding protein family which has a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	111
381201	cd19426	lipocalin_6	Epididymal-specific lipocalin-6. Epididymal-specific lipocalin-6 (LCN6) may play a role in male fertility. It belongs to the lipocalin/cytosolic fatty-acid binding protein family which has a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	144
381202	cd19427	lipocalin_OBP-like	Lipocalin odorant-binding protein and similar proteins. Odorant-binding proteins (OBPs) transport small hydrophobic molecules in the nasal mucosa of vertebrates. This subfamily includes mouse odorant-binding protein 1a (Obp1a), Obp1b, and probasin. Mouse Obp1a and Obp1b, which are expressed in the nasal mucosa, bind the chemical odorant 2-isobutyl-3-methoxypyrazine, and may form an Obp1a/Obp1b heterodimer. Mouse probasin may play a role in the biology of the prostate gland. This group also includes hamster female-specific lacrimal gland protein (FLP) and aphrodisin. FLP may bind tear lipids or lipid-like pheromones found in hamster tears; aphrodisin is found in hamster vaginal discharge, carries pheromones, and stimulates copulatory behavior in males. This group also includes dog allergen Can f 4 which is expressed by tongue epithelial tissue and found in saliva and dander. Bovine OBP is believed to act as a homodimer, having the C-terminal alpha-helix of each monomer stacking against the beta-barrel of the other monomer; this is possible due to its lack of cysteines and therefore lack of disulfide bonds. This group belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	147
381203	cd19428	lipocalin_MUP-like	major urinary proteins (MUPs) and similar proteins. Mouse urine contains major urinary proteins (MUPs) which bind low molecular weight hydrophobic organic compounds such as urinary volatile pheromones, including the male-specific 2-sec-butyl-4,5-dihydrothiazole (SB2HT), which hastens puberty in female mice. The association between MUPs and these volatiles slows the release of the volatiles into the air from urine marks. MUPs may also act as pheromones themselves. MUPs, expressed in the nasal and vomeronasal mucosa, may be important for delivering urinary volatiles to receptors in the vomeronasal organ. This group includes MUPs encoded by central genes in the MUP cluster, as well as those encoded by peripheral genes such as Darcin/Mup20 which binds most of the male pheromone SB2HT in urine and was the first MUP shown to have male pheromonal activity in its own right. This group includes rat MUPs (also called alpha-2U globulins) and other lipocalins such as major horse allergen Equ c 1 and boar salivary lipocalin, a pheromone-binding protein specifically expressed in the submaxillary glands of the boar. It belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	158
381204	cd19429	lipocalin_9	lipocalin 9. Lipocalin 9 (LCN9) is specifically expressed in the epididymis. It belongs to the lipocalin/cytosolic fatty-acid binding protein family. Lipocalins are typically small extracellular proteins that bind small hydrophobic molecules, such as lipids, steroid hormones, bilins, and retinoids and form covalent or non-covalent complexes with soluble macromolecules as well as membrane bound-receptors. They are involved in many important functions, like ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior.	156
381205	cd19430	lipocalin_trichosorin-like	trichosurin and similar proteins. Trichosurin is a protein from the milk whey of the common brushtail possum, Trichosurus vulpecula, and shows a preference for binding small phenolic ligands. This group belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	153
381206	cd19431	lipocalin_Can_f_2	Minor allergen Can f 2. The minor dog lipocalin allergen Can f 2 is an important cause of allergic sensitization in humans worldwide. It is one of two major allergens present in dog dander extracts, and is produced by tongue and the parotid gland (a major salivary gland). Can f 2 belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	162
381207	cd19432	lipocalin_2_12-like	lipocalin 2 and 12 and similar proteins. Lipocalin-2 (LCN2, also known as siderocalin, uterocalin, neutrophil gelatinase-associated lipocalin) is expressed in renal, endothelial, liver, smooth muscle cells, cardiomyocytes, in various populations of immune cells and dendritic cells. Roles ascribed to LCN2 include chemotactic and bacteriostatic effects, and iron trafficking. LCN2 can also act as a growth factor. It plays a key role in the pathophysiology of renal and cardiovascular diseases, and is involved in various deleterious processes, such as inflammation and fibrosis. It is used as a renal injury biomarker. Lipocalin 12 (LCN12) is an epididymis-specific protein which binds all-trans retinoic acid. It may act as a retinoid carrier protein within the epididymis and play a role in male reproduction. This group belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	154
381208	cd19433	lipocalin_CpcS-CpeS	CpcS/CpeS phycobiliprotein lyase family. Phycobilin lyases covalently attach a chromophore to the Cys residue(s) of cyanobacterial phycobiliproteins. They include Synechococcus sp. PCC 7002 phycocyanobilin lyase CpcS which attaches a phycocyanobilin chromophore to C-phycocyanin beta subunit and to allophycocyanin alpha and beta subunits, Synechococcus sp. PCC 7002 phycocyanobilin lyase subunit CpcU which forms a heterodimer with CpcS-I to attach phycocyanobilin to beta-phycocyanin and to allophycocyanin subunits, and Prochlorococcus marinus phycoerythrobilin lyase CpeS which attaches 3Z-phycoerythrobilin to phycoerythrin. This group belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	169
381209	cd19434	lipocalin_YxeF	Lipocalin similar to uncharacterized Bacillus subtilis YxeF. Bacillus subtilis YxeF lacks the alpha-helix that packs in all lipocalins with known structure against the beta-barrel. It belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	108
381210	cd19435	lipocalin_Bacteroides	bacteroides lipocalin. An uncharacterized Bacteroides subfamily of the lipocalin/cytosolic fatty-acid binding protein family, a characteristic of which is a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	131
381211	cd19436	lipocalin_crustacyanin	crustacyanin Type I CRTC and Type II CRTA subunits. Alpha crustacyanin bound with the carotenoid astaxanthin (AXT) is the predominant carotenoprotein generating the slate-grey/blue color of the lobster carapace. Crustacyanin forms heterodimers (beta-crustacyanin) or complexes of 16 subunits (alpha-crustacyanin) assembled from beta-crustacyanin. Beta-crustacyanin is formed from one type I CRTC lipocalin subunit, and one type II CRTA lipocalin subunit (and two bound astaxanthin molecules). Homarus gammarus (European lobster) crustacyanin has five distinct subunits evident on 6 M urea-PAGE gels: type I CRTC (A1, C1, C2) and type II CRTA (A2, A3). Homarus americanus crustacyanin consists of only two major subunits, namely type I CRTC (H1) and type II CRTA (H2), both of which behave like Ax subunits on a 6 M urea-PAGE gel. This family includes both type I CRTC subunit and type II CRTA subunits and belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	169
381212	cd19437	lipocalin_apoD-like	apolipoprotein D and similar proteins. Human apolipoprotein D (ApoD) is a small glycoprotein associated with high density lipoproteins (HDL) in plasma. It appears promiscuous since it can bind hydrophobic ligands belonging to different lipid groups, with different shapes and biochemical properties; however, it exhibits specificity between very similar lipidic species. Some ligands, such as progesterone and arachidonic acid, bind to the ligand-binding pocket with high affinity, while others may interact with ApoD via its region of surface hydrophobicity. This hydrophobic surface cluster may facilitate its association with HDL particles and facilitate its insertion into cellular lipid membranes. Drosophila NLaz and Schistocerca Laz belong to this group, and share functional properties with human ApoD, including regulation of lifespan, lipid and carbohydrate metabolism control, and protection against oxidative stress or starvation. This group also includes Sandercyanin, a blue protein secreted in the skin mucus of blue forms of walleye, Sander vitreus. Walleye is an important golden yellow commercial and sport fish; the findings of blue walleye are recent. This group belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	160
381213	cd19438	lipocalin_Blc-like	bacterial lipocalin Blc, Arabidopsis thaliana temperature-induced lipocalin-1, and similar proteins. Escherichia coli bacterial lipocalin (Blc, also known as YjeL) is an outer membrane lipoprotein involved in the storage or transport of lipids necessary for membrane maintenance under stressful conditions. Blc has a binding preference for lysophospholipids. This group includes eukaryotic lipocalins such as Arabidopsis thaliana temperature-induced lipocalin-1 (TIL) which is involved in thermotolerance, oxidative, salt, drought and high light stress tolerance, and is needed for seed longevity by ensuring polyunsaturated lipids integrity. This group belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	143
381214	cd19439	lipocalin_Ex-FABP-like	extracellular fatty acid-binding protein. Ex-FABP (also known as siderocalin, lipocalin Q83 or protein Ch21) displays a dual ligand binding mode as it can bind siderophore and fatty acids simultaneously. Ex-FABP has a cavity which extends through the protein and has two separate ligand specificities, one for bacterial siderophores at one end, and the other specifically binding co-purified lysophosphatidic acid (LPA), a potent cell signaling molecule, at the other end. As well as acting as an LPA "sensor", Ex-FABP is bacteriostatic, and tightly binds the 2,3-catechol-type ferric siderophores enterobactin, bacillibactin, and parabactin, associated with enteric bacteria and Gram-positive bacilli. This group belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	142
381215	cd19440	lipocalin_Bla_g_4_Per_a_4	major allergens Bla g 4 and Per a 4. Inhalant allergens from cockroaches are an important cause of asthma. Bla g 4 and Per a 4 are male pheromone transport lipocalins, and both are major allergens. Bla g 4 is produced by Blattella germanica (German cockroach) and has been shown to bind two biogenic amines, tyramine and octopamine which may be its physiological ligands. Per a 4 is produced by Periplaneta americana (American cockroach) and may bind different ligands from Bla g 4 or have different modes for tyramine/octopamine binding. This group belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	148
381216	cd19441	CRABP	cellular retinoic acid-binding proteins (CRABP1 and CRABP2). Cellular retinoic acid-binding proteins (CRABPs) play a role in the metabolism of vitamin A and retinoic acid. They bind all-trans-retinoic acid, but not retinol. Retinol, the alcohol form of vitamin A, is an essential dietary nutrient. Within the cell, it gets oxidized into its biologically active acid form, retinoic acid, which interacts with the nuclear receptors (RARs and RXRs). The two CRABPs (CRABP1 and CRABP2) differ in their pattern of expression across cells and developmental stages. Like other lipid binding proteins, CRABPs serve to solubilize and protect their ligand in the aqueous cytosol and transport retinoic acid between cellular compartments. CRABP1 (also known as CRABP, CRABP-I, CRABPI, RBP5) is thought to play an important role in retinoic acid-mediated differentiation and proliferation processes. CRABP1 has been shown to modulate stem cell proliferation to affect learning and memory. It has also been shown to regulate CaMKII, excessive and/or persistent activation of which is detrimental in acute and chronic cardiac injury. CRABP2 (also known as CRABP-II, RBP6) transports retinoic acid to the nucleus, and delivers all-trans-retinoic acid to nuclear retinoic acid receptors. CRABPs belong to the intracellular fatty acid-binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and, besides CRABPs, include the cellular retinol-binding proteins (CRBPs) and the fatty acid-binding proteins (FABPs).	135
381217	cd19442	CRBP	cellular retinol-binding protein. Cellular retinol-binding proteins (CRBPs) participate in the cellular uptake of vitamin A in the form of free retinol. Retinol achieves a higher chemical stability when bound to CRBPs, and its interaction with retinol-binding proteins allows the solubilization in the aqueous medium of the hydrophobic retinol molecule. There are four human CRBP types (CRBP1, -2, -3, -4) which differ in their tissue-specific expression pattern, as well as in their different ligand affinities. CRBPs belong to the intracellular fatty acid-binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and, besides CRBPs, include the cellular retinoic acid-binding proteins (CRABPs) and the fatty acid-binding proteins (FABPs).	131
381218	cd19443	FABP3-like	fatty acid-binding protein 3 and similar proteins including FABP4, -5, -7, -8, -9, -11, and -12. This FABP3-like subfamily includes FABP3, -4, -5, -7, -8, -9, -11, -12, and similar proteins and belongs to the intracellular fatty acid-binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	128
381219	cd19444	FABP1	fatty acid-binding protein 1. Fatty acid-binding protein 1, FABP1 (also known as fatty acid-binding protein 1, liver FABP, L-FABP) occurs at high cytosolic concentration in liver, intestine, and, in the case of humans, also in kidney. FABP1 binds to two molecules of long-chain fatty acids; the two binding sites appear to be inter-dependent. FABP1 binds to fatty acyl-CoAs, peroxisome proliferators, prostaglandins, bile acids, bilirubin, heme, hydroxyl and hydroperoxyl metabolites of fatty acids, lysophosphatidic acids, selenium, and other hydrophobic ligands. FABP1 is down-regulated in about ten percent of hepatocellular carcinoma (HCC) as well as in colorectal cancer at the adenoma stage, but can also be over-expressed in various cancers. This subgroup belongs to the intracellular fatty acid-binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	126
381220	cd19445	FABP2	fatty acid-binding protein 2. FABP2 (also known as fatty acid-binding protein 2, intestinal, and I-FABP) is a small cytosolic protein abundantly present in mature enterocytes of small and large intestine and responsible for the absorption and intracellular transport of fatty acids. It is present throughout the small intestine; its highest expression is in the jejunum. It is a sensitive marker for damage to the intestinal epithelium. This subgroup belongs to the intracellular fatty acid-binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	130
381221	cd19446	FABP6	fatty acid-binding protein 6. Human fatty acid-binding protein 6 (also known as gastrotropin, I-15P, I-BABP, I-BALB, I-BAP, ILBP, ILBP3, ileal bile acid-binding protein, ILLBP, ileal lipid-binding protein) is an intracellular carrier of bile salts in the epithelial cells of the distal small intestine and has a key role in the enterohepatic circulation of bile salts. It recognizes a series of physiological bile salts that vary in the number and position of steroidal hydroxyl groups, and the presence and type of side-chain conjugation. This subgroup belongs to the intracellular fatty acid-binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	125
381222	cd19447	L-BABP-like	liver bile acid-binding protein and similar proteins. Liver bile acid-binding protein (also known as "fatty acid-binding protein, liver", LB-FABP, L-BABP, L-FABP, FABP1) is present in the liver of the vertebrates fish, amphibians, reptiles, and birds but not in mammals. L-BABPs bind free fatty acids and their coenzyme A derivatives, bilirubin, and some other small molecules in the cytoplasm. The role of L-BABPs may be that of cellular and metabolic trafficking of bile acids; they may be involved in intracellular lipid transport. This subgroup belongs to the intracellular fatty acid-binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	124
381223	cd19448	FABP_pancrustacea	fatty acid-binding protein similar to Manduca sexta and Eriocheir sinensis fatty acid-binding protein 1. This subfamily includes fatty acid-binding protein found mainly in insects such as Manduca sexta FABP1 (also known as MFB1) and Luciola cerata FABP (LcFABP), and crustacea such as Eriocheir sinensis FABP (Es-FABP). MFB1, which is isolated from midgut cytosol, binds fatty acids in a 1:1 molar ratio. LcFABP, abundantly and specifically expressed in the cytosol as well as the nucleus of cells of the photogenic layer of firefly light organ, binds fatty acids of length C14-C18. Es-FABP plays a role in lipid transport during the period of rapid ovarian growth and is involved in lipid nutrient absorption and utilization processes in the hepatopancreas, ovary, and hemocytes. It is also expressed in gills, muscle, thoracic ganglia, heart, and intestine. This subgroup belongs to the intracellular fatty acid-binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	130
381224	cd19449	ReP1-NCXSQ-like	fatty acid-binding protein ReP1-NCXSQ and similar proteins. Arthropod ReP1-NCXSQ (regulatory protein of the squid nerve sodium calcium exchanger) is required for MgATP stimulation of the squid nerve Na(+)/Ca(2+) exchanger NCXSQ1. ReP1-NCXSQ acts as a carrier of fatty acids; it is possible that its biological ligand is palmitic acid, which is abundant in squid axons. The mechanism for fine-tuning of the regulation of NCXSQ1 by ReP1-NCXSQ may then involve the transport of palmitic acid. This subgroup belongs to the intracellular fatty acid-binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	129
381225	cd19450	lipocalin_ApoM	Apolipoprotein M. Apolipoprotein M (ApoM) is mainly found in high-density lipoproteins (HDL) and is expressed in the liver and the kidney; it is associated to a lesser extent with low density lipids and triglyceride rich lipoproteins. It is involved in lipid transport and can bind sphingosine-1-phosphate, myristic acid, palmitic acid and stearic acid, retinol, all-trans-retinoic acid and 9-cis-retinoic acid. This subgroup belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	161
381226	cd19451	lipocalin_AGP-like	alpha1-acid glycoprotein and similar proteins. Alpha1-acid glycoprotein (AGP), also known as orosomucoid, has many important biological roles such as in the acute-phase reaction in response to inflammation, in immune regulation, in drug-binding and drug-transportation, in regulating sphingolipid synthesis and metabolism, and in maintaining the capillary barrier. This subgroup belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	173
381227	cd19452	lipocalin_ABP	Rhodnius prolixus amine-binding protein and similar proteins. Rhodnius prolixus amine-binding protein (ABP) plays an important role in biogenic amine binding; it binds serotonin and norepinephrine with high affinity. It is a subgroup of the lipocalin NP-like family. Nitrophorins (NPs) represent a group of nitric oxide (NO)-carrying heme proteins found in the saliva of Rhodnius prolixus. The lipocalin NP-like family belongs to the lipocalin/cytosolic fatty-acid binding protein family which has a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	189
381228	cd19453	lipocalin_NP2	nitrophorin 2. Nitrophorins (NPs) represent a group of nitric oxide (NO)-carrying heme proteins found in the saliva of Rhodnius prolixus. In its adult phase R. prolixus expresses at least 4 nitrophorins (designated NP1-4 in order of their increasing abundance in the saliva of adult insects). Two additional nitrophorins, NP5 and NP6, have been detected mainly in the five instar nymphal stages of insect development. NP7 has not been isolated from the insects but was instead recognized in a cDNA library. NPs belong to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	179
381229	cd19454	lipocalin_NP3	nitrophorin 3. Nitrophorins (NPs) represent a group of nitric oxide (NO)-carrying heme proteins found in the saliva of Rhodnius prolixus. In its adult phase R. prolixus expresses at least 4 nitrophorins (designated NP1-4 in order of their increasing abundance in the saliva of adult insects). Two additional nitrophorins, NP5 and NP6, have been detected mainly in the five instar nymphal stages of insect development. NP7 has not been isolated from the insects but was instead recognized in a cDNA library. NPs belong to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	189
381230	cd19455	lipocalin_NP1	nitrophorin 1. Nitrophorins (NPs) represent a group of nitric oxide (NO)-carrying heme proteins found in the saliva of Rhodnius prolixus. In its adult phase R. prolixus expresses at least 4 nitrophorins (designated NP1-4 in order of their increasing abundance in the saliva of adult insects). Two additional nitrophorins, NP5 and NP6, have been detected mainly in the five instar nymphal stages of insect development. NP7 has not been isolated from the insects but was instead recognized in a cDNA library. NPs belong to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	184
381231	cd19456	lipocalin_NP4	nitrophorin 4. Nitrophorins (NPs) represent a group of nitric oxide (NO)-carrying heme proteins found in the saliva of Rhodnius prolixus. In its adult phase R. prolixus expresses at least 4 nitrophorins (designated NP1-4 in order of their increasing abundance in the saliva of adult insects). Two additional nitrophorins, NP5 and NP6, have been detected mainly in the five instar nymphal stages of insect development. NP7 has not been isolated from the insects but was instead recognized in a cDNA library. NPs belong to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	184
381232	cd19457	lipocalin_2-like	lipocalin 2 and similar proteins. Lipocalin-2 (LCN2, also known as siderocalin, uterocalin, oncogene 24p3, and neutrophil gelatinase-associated lipocalin) is expressed in renal, endothelial, liver, smooth muscle cells, cardiomyocytes, in various populations of immune cells and dendritic cells. Roles ascribed to LCN2 include chemotactic and bacteriostatic effects, and iron trafficking. LCN2 can also act as a growth factor. It plays a key role in the pathophysiology of renal and cardiovascular diseases, and is involved in various deleterious processes, such as inflammation and fibrosis. It is used as a renal injury biomarker. This subgroup belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	173
381233	cd19458	lipocalin_12	Lipocalin 12. Lipocalin 12 (LCN12) is an epididymis-specific protein which binds all-trans retinoic acid. It may act as a retinoid carrier protein within the epididymis and play a role in male reproduction. This subgroup belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	165
381234	cd19459	lipocalin_C1_moubatin-like	Ornithodoros moubata CI, O. moubata moubatin, and similar proteins. The soft tick Ornithodoros moubata complement inhibitor CI (OmCI, also known as coversin) specifically targets C5, a member of the C3/C4/C5 protein family that orchestrates the assembly of the terminal C multiprotein complexes.  O.  moubata moubatin is a specific inhibitor of collagen-induced platelet aggregation. This subgroup belongs to the lipocalin/cytosolic fatty-acid binding protein family which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane bound-receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins, also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	146
381235	cd19460	CRABP1	cellular retinoic acid-binding protein 1. Cellular retinoic acid-binding proteins (CRABPs) play a role in the metabolism of vitamin A and retinoic acid. They bind all-trans-retinoic acid, but not retinol. Retinol, the alcohol form of vitamin A, is an essential dietary nutrient. Within the cell, it gets oxidized into its biologically active acid form, retinoic acid, which interacts with the nuclear receptors (RARs and RXRs). The two CRABPs (CRABP1 and CRABP2) differ in their pattern of expression across cells and developmental stages. Like other lipid binding proteins, CRABPs serve to solubilize and protect their ligand in the aqueous cytosol and transport retinoic acid between cellular compartments. This subgroup includes CRABP1 (also known as CRABP, CRABP-I, CRABPI, RBP5), which is thought to play an important role in retinoic acid-mediated differentiation and proliferation processes. CRABP1 has been shown to modulate stem cell proliferation to affect learning and memory. It has also been shown to regulate CaMKII, excessive and/or persistent activation of which is detrimental in acute and chronic cardiac injury. CRABPs belong to the intracellular fatty acid-binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and, besides CRABPs, include the cellular retinol-binding proteins (CRBPs) and the fatty acid-binding proteins (FABPs).	136
381236	cd19461	CRABP2	Cellular retinoic acid-binding protein 2. Cellular retinoic acid-binding proteins (CRABPs) play a role in the metabolism of vitamin A and retinoic acid. They bind all-trans-retinoic acid, but not retinol. Retinol, the alcohol form of vitamin A, is an essential dietary nutrient. Within the cell, it gets oxidized into its biologically active acid form, retinoic acid, which interacts with the nuclear receptors (RARs and RXRs). The two CRABPs (CRABP1 and CRABP2) differ in their pattern of expression across cells and developmental stages. Like other lipid binding proteins, CRABPs serve to solubilize and protect their ligand in the aqueous cytosol and transport retinoic acid between cellular compartments. This subgroup includes CRABP2 (also known as CRABP-II, RBP6) which transports retinoic acid to the nucleus, and delivers all-trans-retinoic acid to nuclear retinoic acid receptors. CRABPs belong to the intracellular fatty acid-binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and, besides CRABPs, include the cellular retinol-binding proteins (CRBPs) and the fatty acid-binding proteins (FABPs).	136
381237	cd19462	CRBP1	cellular retinol-binding protein 1. Cellular retinol-binding proteins (CRBPs) participate in the cellular uptake of vitamin A in the form of free retinol. Retinol achieves a higher chemical stability when bound to CRBPs, and its interaction with retinol-binding proteins allows the solubilization in the aqueous medium of the hydrophobic retinol molecule. There are four human CRBP types (CRBP1, -2, -3, -4) which differ in their tissue-specific expression pattern, as well as in their different ligand affinities. CRBP1 (also known as Retinol-Binding Protein 1, CRBP, RBPC, CRBP1, CRBPI, CRABP-I) is widely expressed in numerous tissues: it has highest abundance in the liver, kidney, lung, and retinal pigment epithelium cells of the eye. CRBP1 has a high affinity for retinol. It accepts retinol transported from the plasma to cytosol via a cell surface receptor named STRA6, which interacts with serum retinol-binding protein. CRBP1 can bind all-trans-retinol, all trans-retinal and 13-cis-retinol, but not 9-cis-retinol. CRBPs belong to the intracellular fatty acid-binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and, besides CRBPS, include the cellular retinoic acid-binding proteins (CRABPs) and the fatty acid-binding proteins (FABPs).	131
381238	cd19463	CRBP2	cellular retinol-binding protein 2. Cellular retinol-binding proteins (CRBPs) participate in the cellular uptake of vitamin A in the form of free retinol. Retinol achieves a higher chemical stability when bound to CRBPs, and its interaction with retinol-binding proteins allows the solubilization in the aqueous medium of the hydrophobic retinol molecule. There are four human CRBP types (CRBP1, -2, -3, -4) which differ in their tissue-specific expression pattern, as well as in their different ligand affinities. CRBP2 is also known as: "retinol-binding protein 2, cellular", CRABP-II, CRBP2, CRBPII, and RBPC2. Expression of CRBP2 is limited to the small intestine. CRBP2 binds both retinol and retinal; rat CRBP2 appears to bind both with equal affinity, human CRBP2 showed a significantly higher affinity for retinol relative to retinal. CRBP2 can bind all-trans-retinol, all trans-retinal and 13-cis-retinol, but not 9-cis-retinol. CRBPs belong to the intracellular fatty acid-binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and, besides CRBPS, include the cellular retinoic acid-binding proteins (CRABPs) and the fatty acid-binding proteins (FABPs).	131
381239	cd19464	CRBP3	cellular retinol-binding protein 3. Cellular retinol-binding proteins (CRBPs) participate in the cellular uptake of vitamin A in the form of free retinol. Retinol achieves a higher chemical stability when bound to CRBPs, and its interaction with retinol-binding proteins allows the solubilization in the aqueous medium of the hydrophobic retinol molecule. There are four human CRBP types (CRBP1, -2, -3, -4) which differ in their tissue-specific expression pattern, as well as in their different ligand affinities. This group includes human CRBP3 (also known as retinol-binding protein 5, HRBPiso) which is expressed at highest levels in kidney and liver. CRBP3 binds retinol, and may be a human intracellular carrier of retinol in such tissues. CRBPs belong to the intracellular fatty acid-binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and, besides CRBPS, include the cellular retinoic acid-binding proteins (CRABPs) and the fatty acid-binding proteins (FABPs).	131
381240	cd19465	CRBP4	cellular retinol-binding protein 4. Cellular retinol-binding proteins (CRBPs) participate in the cellular uptake of vitamin A in the form of free retinol. Retinol achieves a higher chemical stability when bound to CRBPs, and its interaction with retinol-binding proteins allows the solubilization in the aqueous medium of the hydrophobic retinol molecule. There are four human CRBP types (CRBP1, -2, -3, -4) which differ in their tissue-specific expression pattern, as well as in their different ligand affinities. This group includes human CRBP4 (also known as retinoid-binding protein 7, CRABP4, CRBP4, CRBPIV) which is expressed primarily in kidney, heart, and transverse colon, and mouse CRBP4 which is highly expressed in white adipose tissue and mammary gland. Human CRBP4 binds retinol with an affinity lower than those for CRBP1, -2, -3. CRBPs belong to the intracellular fatty acid-binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and, besides CRBPS, include the cellular retinoic acid-binding proteins (CRABPs) and the fatty acid-binding proteins (FABPs).	131
381241	cd19466	FABP3	fatty acid binding protein 3. FABP3 (also known as heart-type fatty acid binding protein, H-FABP, MDGI, O-FABP) is a cytosolic protein mainly expressed in cardiac and skeletal muscle cells. In these tissues, it plays an important role in fatty acid transportation, cell growth, cell signaling, and gene transcription. This subgroup belongs to the intracellular fatty-acid binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	128
381242	cd19467	FABP4	fatty acid binding protein 4. FABP4 (also known as A-FABP, adipocyte fatty acid binding protein, aP2) is highly expressed in macrophages and in adipocytes where it regulates fatty acid storage and lipolysis and is an important mediator of inflammation. It binds long chain fatty acids, retinoic acid and eicosanoids. This subgroup belongs to the intracellular fatty-acid binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	130
381243	cd19468	FABP5	fatty acid binding protein 5. FABP5 (also known as epidermal FABP, E-FABP, cutaneous fatty-acid-binding protein, C-FABP, psoriasis-associated fatty-acid-binding protein, KFABP,  PA-FABP) binds a wide array of ligands. It is an intracellular carrier for long-chain fatty acids and related active lipids, and also selectively delivers specific fatty acids from the cytosol to the nucleus. Its ligands include vitamin A metabolite all-trans-retinoic acid, endocannabinoid and numerous synthetic drugs and probes. It may be involved in keratinocyte differentiation. Mouse FABP5 is found only in the monomeric form; however, human FABP5 can exist as a monomer as well as a domain-swapped dimer. This subgroup belongs to the intracellular fatty-acid binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	128
381244	cd19469	FABP8	fatty acid binding protein 8. FABP8 (also known as peripheral myelin protein 2, PMP2, myelin fatty acid binding protein, M-FABP, myelin P2 protein, MP2) is a fatty acid-binding structural component of the myelin sheath in the peripheral nervous system and may play a role in lipid transport and homeostasis in myelin. It may bind cholesterol which is present in myelin at high concentrations. In addition to binding monomeric ligands, P2 is able to bind membrane surfaces, and to stack lipid bilayers together. This subgroup belongs to the intracellular fatty-acid binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	129
381245	cd19470	FABP7	Fatty acid binding protein 7. FABP7 (also known as brain FABP, B-FABP,  BLBP, brain lipid binding protein) is highly expressed in glial cells through development of the nervous system. In the developing brain, FABP7 is required for the establishment of the radial glial fiber system, which is involved in the migration of immature neurons. This subgroup belongs to the intracellular fatty-acid binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	130
381246	cd19471	FABP9	fatty acid binding protein 9 and similar proteins. FABP9 (also known as testis-FABP, T-FABP, PERF15) is a major protein found in the inner acrosomal membrane and outer face of the nuclear envelope of mammalian sperm. Its expression is increased in prostate cancer. This subgroup belongs to the intracellular fatty-acid binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	130
380996	cd19473	SET_SUV39H_DIM5-like	SET domain (including pre-SET domain) found in Neurospora crassa (DIM-5) and similar proteins. This subfamily contains Neurospora crassa DIM-5 (also termed H3-K9-HMTase dim-5, or HKMT) which functions as histone-lysine N-methyltransferase that specifically trimethylates histone H3 to form H3K9me3.	274
410883	cd19475	FlaH	flagellar accessory protein FlaH. Flagellar accessory protein FlaH is part of the motor of the archaellum membrane-anchored archaeal motility structure, together with FlaX and FlaI. FlaH forms a hexameric ring, and binds ATP which is essential for its interaction with FlaI and for archaellum assembly.	220
410884	cd19476	RecA-like_ion-translocating_ATPases	RecA-like domain of ion-translocating ATPases. RecA-like NTPases. This family includes the NTP-binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. This group also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion.	270
410885	cd19477	type_II_IV_secretion_ATPases	type II/type IV hexameric secretion ATPases. RecA-like NTPases. This family includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. This group also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion.	168
410886	cd19478	Csm2	Shu complex subunit Csm2. Csm2, together with Shu1, Shu2, and Psy3, forms the Shu complex which is thought to play a role in maintaining genome stability by linking error-free post-replication repair to homologous recombination.	206
410887	cd19479	Elp456	Elongator subcomplex subunits Elp4, 5 and 6. Elongator is a highly conserved multiprotein complex involved in RNA polymerase II-mediated transcriptional elongation and many other processes, including cytoskeleton organization, exocytosis, and tRNA modification. It is composed of two subcomplexes, Elp1-3 and Elp4-6. Elp4-6 forms a heterohexameric RecA-like ring structure, although they lack the key sequence signatures of ATPases.	175
410888	cd19480	Psy3	Shu complex subunit Psy3. Psy3, together with Shu1, Shu2, and Csm2, forms the Shu complex which is thought to play a role in maintaining genome stability by linking error-free post-replication repair (PRR) to homologous recombination (HR).	218
410889	cd19481	RecA-like_protease	proteases similar to RecA. RecA-like NTPases. This family includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. This group also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion.	158
410890	cd19482	RecA-like_Thep1	RecA-like domain of the nucleoside-triphosphatase THEP1 family. This family represents the THEP1 family ATPase domain. It includes nucleoside-triphosphatase THEP1 from Aquifex aeolicus (aaTHEP1), a nucleoside-phosphatase with activity towards ATP, GTP, CTP, TTP and UTP, which may also hydrolyze nucleoside diphosphates with lower efficiency. The catalytic function of aaTHEP1 remains unclear; it may be a DNA/RNA modifying enzyme. Human THEP1 (hsTHEP1) may have a general function in many human tissues, as it is widely expressed in most examined tissues (such as in brain, heart, lymph node, skin, pancreas); it is especially highly expressed in embryonic and various tumor tissues. This family belongs to the RecA-like NTPase superfamily which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion.	164
410891	cd19483	RecA-like_Gp4D_helicase	RecA-like domain of Escherichia coli bacteriophage T7 Gp4D helicase. This family includes the RecA-like domain of the Gp4D fragment of the Gene4 helicase-primase (Gp4) from bacteriophage T7. Gp4D (residues 241-566) is the minimal fragment of the Gp4 that forms hexameric rings, it contains the helicase domain and the linker connecting the helicase and primase domains. Helicases are ring-shaped oligomeric enzymes that unwind DNA at the replication fork; they couple NTP hydrolysis to the unwinding of nucleic acid duplexes into their component strands. This family belongs to the RecA-like NTPase superfamily which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion.	231
410892	cd19484	KaiC_C	C-terminal domain of Circadian Clock Protein KaiC. KaiC is a circadian clock protein, most studied in cyanobacteria.  KaiC, an autokinase, autophosphatase, and ATPase, is part of the core oscillator, composed of three proteins: KaiA, KaiB, and KaiC. The circadian oscillation is regulated via KaiC phosphorylation.	218
410893	cd19485	KaiC-N	N-terminal domain of Circadian Clock Protein KaiC. KaiC is a circadian clock protein, most studied in cyanobacteria. KaiC, an autokinase, autophosphatase, and ATPase, is part of the core oscillator, composed of three proteins: KaiA, KaiB, and KaiC. The circadian oscillation is regulated via KaiC phosphorylation.	226
410894	cd19486	KaiC_arch	KaiC family protein; uncharacterized subfamily similar to  Pyrococcus horikoshii PH0284. KaiC is a circadian clock protein, most studied in cyanobacteria.  KaiC, an autokinase, autophosphatase, and ATPase, is part of the core oscillator, composed of three proteins: KaiA, KaiB, and KaiC. The circadian oscillation is regulated via KaiC phosphorylation.	230
410895	cd19487	KaiC-like_C	C-terminal domain of KaiC family protein; uncharacterized subfamily. KaiC is a circadian clock protein, most studied in cyanobacteria.  KaiC, an autokinase, autophosphatase, and ATPase, is part of the core oscillator, composed of three proteins: KaiA, KaiB, and KaiC. The circadian oscillation is regulated via KaiC phosphorylation.	219
410896	cd19488	KaiC-like_N	N-terminal domain of KaiC family protein; uncharacterized subfamily. KaiC is a circadian clock protein, most studied in cyanobacteria.  KaiC, an autokinase, autophosphatase, and ATPase, is part of the core oscillator, composed of three proteins: KaiA, KaiB, and KaiC. The circadian oscillation is regulated via KaiC phosphorylation.	225
410897	cd19489	Rad51D	RAD51D recombinase. RAD51D recombinase, a RAD51 paralog, plays an important role in DNA repair by homologous recombination (HR). HR is an important error-free repair mechanism for chromosomal double-strand break (DSB) which otherwise leads to cell cycle arrest and death. RAD51D, together with the other RAD51 paralogs, RAD51B, RAD51C, XRCC3, and XRCC2, helps recruit RAD51 to the break site.	209
410898	cd19490	XRCC2	XRCC2 recombinase. XRCC2 (X-ray repair complementing defective repair in Chinese hamster cells 2) recombinase, a RAD51 paralog, plays an important role in DNA repair by homologous recombination (HR). HR is an important error-free repair mechanism for chromosomal double-strand break (DSB) which otherwise leads to cell cycle arrest and death. XRCC2, together with the other RAD51 paralogs, RAD51B, RAD51C, RAD51D, and XRCC3, helps recruit RAD51 to the break site.	226
410899	cd19491	XRCC3	XRCC3 recombinase. XRCC3 (X-ray repair complementing defective repair in Chinese hamster cells 3) recombinase, a RAD51 paralog, plays an important role in DNA repair by homologous recombination (HR). HR is an important error-free repair mechanism for chromosomal double-strand break (DSB) which otherwise leads to cell cycle arrest and death. XRCC3, together with the other RAD51 paralogs, RAD51B, RAD51C, RAD51D, and XRCC2, helps recruit RAD51 to the break site.	250
410900	cd19492	Rad51C	RAD51C recombinase. RAD51C recombinase, a RAD51 paralog, plays an important role in DNA repair by homologous recombination (HR). HR is an important error-free repair mechanism for chromosomal double-strand break (DSB) which otherwise leads to cell cycle arrest and death. RAD51C, together with the other RAD51 paralogs, RAD51B, RAD51D, XRCC3, and XRCC2, helps recruit RAD51 to the break site. Additionally, RAD51C acts as a mediator in the early steps of DNA damage signaling.	172
410901	cd19493	Rad51B	RAD51B recombinase. RAD51B recombinase, a RAD51 paralog, plays an important role in DNA repair by homologous recombination (HR). HR is an important error-free repair mechanism for chromosomal double-strand break (DSB) which otherwise leads to cell cycle arrest and death. RAD51B, together with the other RAD51 paralogs, RAD51C, RAD51D, XRCC3, and XRCC2, helps recruit RAD51 to the break site.	222
410902	cd19494	Elp4	Elongator subcomplex subunit Elp4. Elongator is a highly conserved multiprotein complex involved in RNA polymerase II-mediated transcriptional elongation and many other processes, including cytoskeleton organization, exocytosis, and tRNA modification. It is composed of two subcomplexes, Elp1-3 and Elp4-6. Elp4-6 forms a heterohexameric RecA-like ring structure, although they lack the key sequence signatures of ATPases.	259
410903	cd19495	Elp6	Elongator subcomplex subunit Elp6. Elongator is a highly conserved multiprotein complex involved in RNA polymerase II-mediated transcriptional elongation and many other processes, including cytoskeleton organization, exocytosis, and tRNA modification. It is composed of two subcomplexes, Elp1-3 and Elp4-6. Elp4-6 forms a heterohexameric RecA-like ring structure, although they lack the key sequence signatures of ATPases.	228
410904	cd19496	Elp5	Elongator subcomplex subunit Elp5. Elongator is a highly conserved multiprotein complex involved in RNA polymerase II-mediated transcriptional elongation and many other processes, including cytoskeleton organization, exocytosis, and tRNA modification. It is composed of two subcomplexes, Elp1-3 and Elp4-6. Elp4-6 forms a heterohexameric RecA-like ring structure, although they lack the key sequence signatures of ATPases.	143
410905	cd19497	RecA-like_ClpX	ATP-dependent Clp protease ATP-binding subunit ClpX. ClpX is a component of the ATP-dependent protease ClpXP. In ClpXP,  ClpX ATPase serves to specifically recognize, unfold, and translocate protein substrates into the chamber of ClpP protease for degradation. This RecA-like_ClpX domain subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion.	251
410906	cd19498	RecA-like_HslU	ATP-dependent protease ATPase subunit HslU. HslU is a component of the ATP-dependent protease HslVU. In HslVU, HslU ATPase serves to unfold and translocate protein substrate, and the HslV protease degrades the unfolded proteins. This RecA-like_HslU subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion.	183
410907	cd19499	RecA-like_ClpB_Hsp104-like	Chaperone protein ClpB/Hsp104 subfamily. Bacterial Caseinolytic peptidase B (ClpB) and eukaryotic Heat shock protein 104 (Hsp104) are ATP-dependent molecular chaperones and essential proteins of the heat-shock response. ClpB/Hsp104 ATPases, in concert with the DnaK/Hsp70 chaperone system, disaggregate and reactivate aggregated proteins. This RecA-like_ClpB_Hsp104_like subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion.	178
410908	cd19500	RecA-like_Lon	lon protease homolog 2 peroxisomal. Lon protease (also known as Lon peptidase) is an evolutionarily conserved ATP-dependent serine protease, present in bacteria and eukaryotic mitochondria and peroxisomes, which mediates the selective degradation of mutant and abnormal proteins as well as certain short-lived regulatory proteins. Lon protease is both an ATP-dependent peptidase and a protein-activated ATPase. This RecA-like Lon domain subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion.	182
410909	cd19501	RecA-like_FtsH	ATP-dependent zinc metalloprotease FtsH. FtsH ATPase is a processive, ATP-dependent zinc metallopeptidase for both cytoplasmic and membrane proteins. It is anchored to the cytoplasmic membrane such that the amino- and carboxy-termini are exposed to the cytoplasm. It presents a membrane-bound hexameric structure that is able to unfold and degrade protein substrates. It comprises an N-terminal transmembrane region and the larger C-terminal cytoplasmic region, which consists of an ATPase domain and a protease domain. This RecA-like FtsH subfamily represents the ATPase domain, and belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion.	171
410910	cd19502	RecA-like_PAN_like	proteasome activating nucleotidase PAN and related proteasome subunits. This subfamily contains ATPase subunits of the eukaryotic 26S proteasome, and of the archaeal proteasome which carry out ATP-dependent degradation of substrates of the ubiquitin-proteasome pathway. The eukaryotic 26S proteasome consists of a proteolytic 20S core particle (CP), and a 19S regulatory particle (RP) which provides the ATP-dependence and the specificity for ubiquitinated proteins.  In the archaea the RP is a homohexameric complex of proteasome-activating nucleotidase (PAN). This subfamily also includes various eukaryotic 26S subunits including, proteasome 26S subunit, ATPase 2 (PSMC2, also known as S7 and MSS1) which is a member of the 19S RP and has a chaperone like activity; and proteasome 20S subunit alpha 6 (PSMA6, also known as IOTA, p27K, and PROS27) which is a member of the 20S CP. This RecA-like_PAN subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion.	171
410911	cd19503	RecA-like_CDC48_NLV2_r1-like	first of two ATPase domains of CDC48 and NVL2, and similar ATPase domains. CDC48 in yeast and p97 or VCP in metazoans is an ATP-dependent molecular chaperone which plays an essential role in many cellular processes, by segregating polyubiquitinated proteins from complexes or membranes. Cdc48/p97 consists of an N-terminal domain and two ATPase domains; this subfamily represents the first of the two ATPase domains. This subfamily also includes the first of the two ATPase domains of NVL (nuclear VCP-like protein) 2, an isoform of NVL mainly present in the nucleolus, which is involved in ribosome biogenesis, in telomerase assembly and the regulation of telomerase activity, and in pre-rRNA processing. This subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion.	165
410912	cd19504	RecA-like_NSF-SEC18_r1-like	first of two ATPase domains of NSF and SEC18, and similar ATPase domains. N-ethylmaleimide-sensitive factor (NSF) and Saccharomyces cerevisiae Vesicular-fusion protein Sec18, key factors for eukaryotic trafficking, are ATPases and SNARE disassembly chaperones. NSF/Sec18 activate or prime SNAREs, the terminal catalysts of membrane fusion. Sec18/NSF associates with SNARE complexes through binding Sec17/alpha-SNAP. Sec18 has an N-terminal cap domain and two nucleotide-binding domains (D1 and D2) which form the two rings of the hexameric complex. The hydrolysis of ATP by D1 generates most of the energy necessary to disassemble inactive SNARE bundles, while the D2 ring binds ATP to stabilize the homohexamer. This subfamily includes the first (D1) ATPase domain of NSF/Sec18, and belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion.	177
410913	cd19505	RecA-like_Ycf2	ATPase domain of plant YCF2. Ycf2 is a chloroplast ATPase which has an essential function; however, its function remains unclear. The gene encoding YCF2 is the largest known plastid gene in angiosperms and has been used to predict phylogenetic relationships. This subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion.	161
410914	cd19506	RecA-like_IQCA1	ATPase domain of IQ and AAA domain-containing protein 1 (IQCA1). IQCA1 (also known as dynein regulatory complex subunit 11, DRC11 and IQCA) is an ATPase subunit of the nexin-dynein regulatory complex (N-DRC). The 9 + 2 axoneme of most motile cilia and flagella consists of nine outer doublet microtubules arranged in a ring surrounding a central pair of two singlet microtubules. The N-DRC complex maintains alignment between outer doublet microtubules and limits microtubule sliding in motile axonemes. This subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion.	160
410915	cd19507	RecA-like_Ycf46-like	ATPase domain of Ycf46 and similar ATPase domains. Ycf46 may play a role in the regulation of photosynthesis in cyanobacteria, especially in CO2 uptake and utilization. This subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion.	161
410916	cd19508	RecA-like_Pch2-like	ATPase domain of Pachytene checkpoint 2 (Pch2) and similar ATPase domains. Pch2 (also known as Thyroid hormone receptor interactor 13 (TRIP13) and 16E1BP) is a key regulator of specific chromosomal events, like the control of G2/prophase processes such as DNA break formation and recombination, checkpoint signaling, and chromosome synapsis. This subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion.	199
410917	cd19509	RecA-like_VPS4-like	ATPase domain of VPS4, ATAD1, KTNA1, Spastin, FIGL-1 and similar ATPase domains. This subfamily includes the ATPase domains of vacuolar protein sorting-associated protein 4 (VPS4), ATPase family AAA domain-containing protein 1 (ATAD1, also known as Thorase), Katanin p60 ATPase-containing subunit A1 (KTNA1), Spastin, and Fidgetin-Like 1 (FIGL-1). This subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion.	163
410918	cd19510	RecA-like_BCS1	Mitochondrial chaperone BCS1. Mitochondrial chaperone BCS1 is necessary for the assembly of mitochondrial respiratory chain complex III and plays an important role in the maintenance of mitochondrial tubular networks, respiratory chain assembly and formation of the LETM1 complex. RecA-like NTPases. This family includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. This group also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion.	153
410919	cd19511	RecA-like_CDC48_r2-like	second of two ATPase domains of CDC48/p97, PEX1 and -6, VAT and NVL, and similar ATPase domains. This subfamily includes the second of two ATPase domains of the molecular chaperone CDC48 in yeast and p97 or VCP in metazoans, Peroxisomal biogenesis factor 1 (PEX1) and -6 (PEX6), Valosin-containing protein-like ATPase (VAT), and nuclear VCP-like protein (NVL). This subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion.	159
410920	cd19512	RecA-like_ATAD3-like	ATPase domains of ATPase AAA-domain protein 3A (ATAD3A), -3B, and -3C, and similar ATPase domains. ATPase AAA-domain protein 3 (ATAD3) is a ubiquitously expressed mitochondrial protein involved in mitochondrial dynamics, DNA-nucleoid structural organization, cholesterol transport and steroidogenesis. The ATAD3 gene family in human comprises three paralog genes: ATAD3A, ATAD3B and ATAD3C. This subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion.	150
410921	cd19513	Rad51	RAD51 recombinase. RAD51 recombinase plays an essential role in DNA repair by homologous recombination (HR). HR is an important error-free repair mechanism for chromosomal double-strand break (DSB) which otherwise leads to cell cycle arrest and death. RAD51 is recruited to the break site with the help of its paralogs, RAD51D, RAD51B, RAD51C, XRCC3, and XRCC2, where it forms long helical polymers that wrap around the ssDNA tail at the break, leading to pairing and strand invasion.	235
410922	cd19514	DMC1	homologous-pairing protein DMC1. DMC1 has a central role in homologous recombination in meiosis. It assembles at the sites of programmed DNA double-strand breaks and carries out a search for allelic DNA sequences located on homologous chromatids. It forms octameric rings.	236
410923	cd19515	archRadA	archaeal recombinase Rad51/RadA. This group includes the archaeal protein RadA which is a homolog of Rad51. RAD51 recombinase plays an essential role in DNA repair by homologous recombination (HR).	233
410924	cd19516	DotB_TraJ	dot/icm secretion system protein DotB-like. Defect in organelle trafficking (Dot)B is part of the type IVb secretion (T4bS) system, also known as the dot/icm system, and is the main energy supplier of the secretion system. It is an ATPase, similar to the VirB11 component of the T4aS systems. This family also includes Escherichia coli IncI plasmid-encoded conjugative transfer ATPase TraJ encoded on the tra (transfer) operon.	179
410925	cd19517	RecA-like_Yta7-like	ATPase domain of Saccharomyces cerevisiae Yta7 and similar ATPase domains. Saccharomyces cerevisiae Yta7 is a chromatin-associated AAA-ATPase involved in regulation of chromatin dynamics. Its human ortholog  ANCCA/ATAD2 transcriptionally activates pathways of malignancy in a broad range of cancers. The RecA-like_Yta7 subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion.	170
410926	cd19518	RecA-like_NVL_r1-like	first of two ATPase domains of NVL (nuclear VCP-like protein) and similar ATPase domains. NVL exists in two forms with N-terminal extensions of different lengths in mammalian cells. NVL has two alternatively spliced isoforms, a short form, NVL1, and a long form, NVL2. NVL2, the major species, is mainly present in the nucleolus, whereas NVL1 is nucleoplasmic. Each has an N-terminal domain, followed by two tandem ATPase domains; this subfamily includes the first of the two ATPase domains. NVL2 is involved in the biogenesis of the 60S ribosome subunit by associating specifically with ribosome protein L5 and modulating the function of DOB1. NVL2 is also required for telomerase assembly and the regulation of telomerase activity, and is involved in pre-rRNA processing. The role of NVL1 is unclear. This RecA-like_NVL_r1-like subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion.	169
410927	cd19519	RecA-like_CDC48_r1-like	first of two ATPase domains of CDC48 and similar ATPase domains. CDC48 in yeast and p97 or VCP in metazoans is an ATP-dependent molecular chaperone which plays an essential role in many cellular processes, by segregating polyubiquitinated proteins from complexes or membranes. Cdc48/p97 consists of an N-terminal domain and two ATPase domains; this subfamily represents the first of the two ATPase domains. CDC48 is involved in the fragmentation of Golgi stacks during mitosis and in their reassembly after mitosis, and in the formation of the nuclear envelope and of the transitional endoplasmic reticulum (tER). This RecA-like_CDC48_r1-like subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion.	166
410928	cd19520	RecA-like_ATAD1	ATPase domain of ATPase family AAA domain-containing protein 1 and similar ATPase domains. ATPase family AAA domain-containing protein 1 (ATAD1, also known as Thorase) is an ATPase that plays a critical role in regulating the surface expression of alpha-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid (AMPA) receptors, thereby regulating synaptic plasticity, learning and memory. This subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion.	166
410929	cd19521	RecA-like_VPS4	ATPase domain of vacuolar protein sorting-associated protein 4. Vacuolar protein sorting-associated protein 4 (Vps4) is believed to be involved in intracellular protein transport out of a prevacuolar endosomal compartment. This subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion.	170
410930	cd19522	RecA-like_KTNA1	Katanin p60 ATPase-containing subunit A1. Katanin p60 ATPase-containing subunit A1 (KTNA1) is the catalytic subunit of the Katanin complex which severs microtubules in an ATP-dependent manner, and is implicated in multiple aspects of microtubule dynamics. In addition to the p60 catalytic ATPase subunit, Katanin contains an accessory subunit (p80 or p80-like). The microtubule-severing activity of the ATPase is essential for female meiotic spindle assembly, and male gamete production; and the katanin complex severing microtubules is under tight regulation during the transition from the meiotic to mitotic stage to allow proper embryogenesis. This subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion.	170
410931	cd19523	RecA-like_fidgetin	ATPase domain of fidgetin. Fidgetin (FIGN) is an ATP-dependent microtubule-severing protein. This subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion.	163
410932	cd19524	RecA-like_spastin	ATPase domain of spastin. Spastin is an ATP-dependent microtubule-severing protein involved in microtubule dynamics; it specifically recognizes and cuts microtubules that are polyglutamylated. This subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion.	164
410933	cd19525	RecA-like_Figl-1	ATPase domain of Fidgetin-Like 1 (FIGL-1). FIGL-1 may participate in DNA repair in the nucleus; it may be involved in DNA double-strand break repair via homologous recombination. Caenorhabditis elegans FIGL-1 is a nuclear protein that controls mitotic progression in the germ line, and mouse FIGL-1 may be involved in the control of male meiosis. Human FIGL-1 has been shown to be a centrosome protein involved in ciliogenesis, perhaps as a microtubule-severing protein. This subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion.	186
410934	cd19526	RecA-like_PEX1_r2	second of two ATPase domains of Peroxisomal biogenesis factor 1 (PEX1). PEX1 (also known as Peroxin-1)/PEX6 is a protein unfoldase; PEX1 and PEX6 form a heterohexameric Type-2 AAA-ATPase complex and are essential for peroxisome biogenesis as they are required for the import of folded proteins into the peroxisomal matrix. PEX1 is required for stability of PEX5. This subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion.	158
410935	cd19527	RecA-like_PEX6_r2	second of two ATPase domains of Peroxisomal biogenesis factor 6 (PEX6). PEX6 (also known as Peroxin-6)/PEX1 is a protein unfoldase; PEX6 and PEX1 form a heterohexameric Type-2 AAA-ATPase complex and are essential for peroxisome biogenesis as they are required for the import of folded proteins into the peroxisomal matrix. This subfamily represents the second ATPase domain of PEX6. This subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion.	160
410936	cd19528	RecA-like_CDC48_r2-like	second of two ATPase domains of CDC48 and similar ATPase domains. CDC48 in yeast and p97 or VCP in metazoans is an ATP-dependent molecular chaperone which plays an essential role in many cellular processes, by segregating polyubiquitinated proteins from complexes or membranes. Cdc48/p97 consists of an N-terminal domain and two ATPase domains; this subfamily represents the second of the two ATPase domains. CDC48 is involved in the fragmentation of Golgi stacks during mitosis and in their reassembly after mitosis, and in the formation of the nuclear envelope and of the transitional endoplasmic reticulum (tER). This RecA-like_CDC48_r2-like subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion.	161
410937	cd19529	RecA-like_VCP_r2	second of two ATPase domains of Valosin-containing protein-like ATPase (VAT) and similar ATPase domains. The Valosin-containing protein-like ATPase of Thermoplasma acidophilum (VAT), is an archaeal homolog of the ubiquitous Cdc48/p97. It is a protein unfoldase that functions in concert with the 20S proteasome by unfolding proteasome substrates and passing them on for degradation. VAT forms a homohexamer, each monomer contains two tandem ATPase domains, referred to as D1 and D2, and an N-terminal domain. This subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion.	159
410938	cd19530	RecA-like_NVL_r2-like	second of two ATPase domains of NVL (nuclear VCP-like protein) and similar ATPase domains. NVL exists in two forms with N-terminal extensions of different lengths in mammalian cells. NVL has two alternatively spliced isoforms, a short form, NVL1, and a long form, NVL2. NVL2, the major species, is mainly present in the nucleolus, whereas NVL1 is nucleoplasmic. Each has an N-terminal domain, followed by two tandem ATPase domains; this subfamily includes the second of the two ATPase domains. NVL2 is involved in the biogenesis of the 60S ribosome subunit by associating specifically with ribosome protein L5 and modulating the function of DOB1. NVL2 is also required for telomerase assembly and the regulation of telomerase activity, and is involved in pre-rRNA processing. The role of NVL1 is unclear. This RecA-like_NVL_r2-like subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion.	161
380454	cd19531	LCL_NRPS-like	LCL-type Condensation (C) domain of non-ribosomal peptide synthetases(NRPSs) and similar domains including the C-domain of SgcC5, a free-standing NRPS with both ester- and amide- bond forming activity. LCL-type Condensation (C) domains catalyze peptide bond formation between two L-amino acids, ((L)C(L)). C-domains of NRPSs catalyze peptide bond formation within (usually) large multi-modular enzymatic complexes. NRPS can use a large variety of acyl monomers (approximately 500 different possible monomer substrates as opposed to the 20 standard amino acids in ribosomal protein synthesis) to construct bioactive secondary metabolites of 2 to 18 units long (with various activities such as antibiotic, antifungal, antitumor and immunosuppression). In addition to the LCL-type, there are various subtypes of C-domains such as the DCL-type which links an L-amino acid to the D-amino acid at the end of a growing peptide, starter C-domains which acylate the first amino acid with a beta-hydroxy carboxylic acid, and heterocyclization (Cyc) domains which catalyze both peptide bond formation and cyclization of Cys, Ser, or Thr residues. Typically, an NRPS module consists of an adenylation domain, a peptidyl carrier protein (PCP) domain (also known as thiolation (T) domain) and a C-domain. NRPS modules may also include specialized domains such as the terminal-module thioesterase (Te) domain that releases the product via hydrolysis or macrocyclization and any of various C-domain family members such as the epimerization (E) domain, the ester-bond forming C-domain, dual E/C (epimerization and condensation) domains, and the X-domain. Streptomyces globisporus SgcC5 is a free-standing NRPS condensation enzyme (rather than a modular NRPS), which catalyzes the condensation between the SgcC2-tethered (S)-3-chloro-5-hydroxy-beta-tyrosine and (R)-1-phenyl-1,2-ethanediol, forming an ester bond, during the synthesis of the chromoprotein enediyne antitumor antibiotic C-1027. It has some acceptor substrate promiscuity as it has been shown to also catalyze the formation of an amide bond between SgcC2-tethered (S)-3-chloro-5-hydroxy-beta-tyrosine and a mimic of the enediyne core acceptor substrate having an amine at its C-2 position. C-domains typically have a conserved HHxxxD motif at the active site; mutations in this motif can abolish or diminish condensation activity. An HHxx[SAG]DGxSx(6)[ED] motif is characteristic of LCL-type C-domains.	427
380455	cd19532	C_PKS-NRPS	Condensation domain of hybrid polyketide synthetase/nonribosomal peptide synthetases (PKS/NRPSs). Condensation (C) domains of nonribosomal peptide synthetases (NRPSs) catalyze peptide bond formation within (usually) large multi-modular enzymatic complexes. Hybrid PKS/NRPS create polymers containing both polyketide and amide linkages. C-domains typically have a conserved HHxxxD motif at the active site; mutations in this motif can abolish or diminish condensation activity. Most members of this subfamily have the typical C-domain HHxxxD motif; a few such as Monascus pilosus lovastatin nonaketide synthase MokA have a non-canonical HRxxxD motif in the C-domain and are unable to catalyze amide-bond formation. NRPS can use a large variety of acyl monomers (approximately 500 different possible monomer substrates as opposed to the 20 standard amino acids in ribosomal protein synthesis) to construct bioactive secondary metabolites of 2 to 18 units long (with various activities such as antibiotic, antifungal, antitumor and immunosuppression). There are various subtypes of C-domains such as the LCL-type which catalyzes peptide bond formation between two L-amino acids, the DCL-type which links an L-amino acid to the D-amino acid at the end of a growing peptide, starter C-domains which acylate the first amino acid with a beta-hydroxy carboxylic acid, and heterocyclization (Cyc) domains which catalyze both peptide bond formation and cyclization of Cys, Ser, or Thr residues. Typically, an NRPS module consists of an adenylation domain, a peptidyl carrier protein (PCP) domain (also known as thiolation (T) domain) and a C-domain. NRPS modules may also include specialized domains such as the terminal-module thioesterase (Te) domain that releases the product via hydrolysis or macrocyclization and any of various C-domain family members such as the epimerization (E) domain, the ester-bond forming C-domain, dual E/C (epimerization and condensation) domains, and the X-domain.	421
380456	cd19533	starter-C_NRPS	Starter Condensation domains, found in the first module of nonribosomal peptide synthetases (NRPSs). Condensation (C) domains of nonribosomal peptide synthetases (NRPSs) catalyze peptide bond formation within (usually) large multi-modular enzymatic complexes. While standard C-domains catalyze peptide bond formation between two amino acids, an initial, ('starter') C-domain may instead acylate an amino acid with a fatty acid. NRPS can use a large variety of acyl monomers (approximately 500 different possible monomer substrates as opposed to the 20 standard amino acids in ribosomal protein synthesis) to construct bioactive secondary metabolites of 2 to 18 units long (with various activities such as antibiotic, antifungal, antitumor and immunosuppression). There are various subtypes of C-domains such as the LCL-type which catalyzes peptide bond formation between two L-amino acids, the DCL-type which links an L-amino acid to the D-amino acid at the end of a growing peptide, starter C-domains which acylate the first amino acid with a beta-hydroxy carboxylic acid, and heterocyclization (Cyc) domains which catalyze both peptide bond formation and cyclization of Cys, Ser, or Thr residues. Typically, an NRPS module consists of an adenylation domain, a peptidyl carrier protein (PCP) domain (also known as thiolation (T) domain) and a C-domain. NRPS modules may also include specialized domains such as the terminal-module thioesterase (Te) domain that releases the product via hydrolysis or macrocyclization and any of various C-domain family members such as the epimerization (E) domain, the ester-bond forming C-domain, dual E/C (epimerization and condensation) domains, and the X-domain. C-domains typically have a conserved HHxxxD motif at the active site; mutations in this motif can abolish or diminish condensation activity.	419
380457	cd19534	E_NRPS	Epimerization domain of nonribosomal peptide synthetases (NRPSs); belongs to the Condensation-domain family. Epimerization (E) domains of nonribosomal peptide synthetases (NRPS) flip the chirality of the end amino acid of a peptide being manufactured by the NRPS. E-domains are homologous to the Condensation (C) domains. NRPSs catalyze peptide bond formation within (usually) large multi-modular enzymatic complexes. Specialized tailoring NRPS domains such as E-domains greatly increase the range of possible peptide products created by the NRPS machinery. NRPS can use a large variety of acyl monomers (approximately 500 different possible monomer substrates as opposed to the 20 standard amino acids in ribosomal protein synthesis) to construct bioactive secondary metabolites of 2 to 18 units long (with various activities such as antibiotic, antifungal, antitumor and immunosuppression). There are various subtypes of C-domains such as the LCL-type which catalyzes peptide bond formation between two L-amino acids, the DCL-type which links an L-amino acid to the D-amino acid at the end of a growing peptide, starter C-domains which acylate the first amino acid with a beta-hydroxy carboxylic acid, and heterocyclization (Cyc) domains which catalyze both peptide bond formation and cyclization of Cys, Ser, or Thr residues. Typically, an NRPS module consists of an adenylation domain, a peptidyl carrier protein (PCP) domain (also known as thiolation (T) domain) and a C-domain. NRPS modules may also include specialized domains such as the terminal-module thioesterase (Te) domain that releases the product via hydrolysis or macrocyclization and any of various C-domain family members such as the E-domain, the ester-bond forming C-domain, dual E/C (epimerization and condensation) domains, and the X-domain. C-domains typically have a conserved HHxxxD motif at the active site; mutations in this motif can abolish or diminish condensation activity.	428
380458	cd19535	Cyc_NRPS	Cyc (heterocyclization) domain of nonribosomal peptide synthetases (NRPSs); belongs to the Condensation-domain family. Cyc (heterocyclization) domains catalyze two separate reactions in the creation of heterocyclized peptide products in nonribosomal peptide synthesis: amide bond formation followed by intramolecular cyclodehydration between a Cys, Ser, or Thr side chain and a carbonyl carbon on the peptide backbone to form a thiazoline, oxazoline, or methyloxazoline ring. Cyc-domains are homologous to standard NRPS Condensation (C) domains. C-domains typically have a conserved HHxxxD motif at the active site; Cyc-domains have an alternative, conserved DxxxxD active site motif, mutation of the aspartate residues in this motif can abolish or diminish condensation activity. NRPS can use a large variety of acyl monomers (approximately 500 different possible monomer substrates as opposed to the 20 standard amino acids in ribosomal protein synthesis) to construct bioactive secondary metabolites of 2 to 18 units long (with various activities such as antibiotic, antifungal, antitumor and immunosuppression). There are various subtypes of C-domains such as the LCL-type which catalyzes peptide bond formation between two L-amino acids, the DCL-type which links an L-amino acid to the D-amino acid at the end of a growing peptide, starter C-domains which acylate the first amino acid with a beta-hydroxy carboxylic acid, and Cyc-domains. Typically, an NRPS module consists of an adenylation domain, a peptidyl carrier protein (PCP) domain (also known as thiolation (T) domain) and a C-domain. NRPS modules may also include specialized domains such as the terminal-module thioesterase (Te) domain that releases the product via hydrolysis or macrocyclization and any of various C-domain family members such as the epimerization (E) domain, the ester-bond forming C-domain, dual E/C (epimerization and condensation) domains, and the X-domain.	423
380459	cd19536	DCL_NRPS-like	DCL-type Condensation domains of nonribosomal peptide synthetases (NRPSs), such as terminal fungal CT domains and Dual Epimerization/Condensation (E/C) domains. Condensation (C) domains of nonribosomal peptide synthetases (NRPSs) catalyze peptide bond formation within (usually) large multi-modular enzymatic complexes. NRPS can use a large variety of acyl monomers (approximately 500 different possible monomer substrates as opposed to the 20 standard amino acids in ribosomal protein synthesis) to construct bioactive secondary metabolites of 2 to 18 units long (with various activities such as antibiotic, antifungal, antitumor and immunosuppression). There are various subtypes of C-domains such as the LCL-type which catalyzes peptide bond formation between two L-amino acids, the DCL-type [D-specific for the peptidyl donor and L-specific for the aminoacyl acceptor ((D)C(L))], which links an L-amino acid to the D-amino acid at the end of a growing peptide, starter C-domains which acylate the first amino acid with a beta-hydroxy carboxylic acid, and heterocyclization (Cyc) domains which catalyze both peptide bond formation and cyclization of Cys, Ser, or Thr residues. Typically, an NRPS module consists of an adenylation domain, a peptidyl carrier protein (PCP) domain (also known as thiolation (T) domain) and a C-domain. NRPS modules may also include specialized domains such as the terminal-module thioesterase (Te) domain that releases the product via hydrolysis or macrocyclization and any of various C-domain family members such as the epimerization (E) domain, the ester-bond forming C-domain, dual E/C (epimerization and condensation) domains, and the X-domain. C-domains typically have a conserved HHxxxD motif at the active site; mutations in this motif can abolish or diminish condensation activity.	419
380460	cd19537	C_NRPS-like	Condensation family domain with an atypical active site motif. Condensation (C) domains of nonribosomal peptide synthetases (NRPSs) catalyze peptide bond formation within (usually) large multi-modular enzymatic complexes. C-domains typically have a conserved HHxxxD motif at the active site; mutations in this motif can abolish or diminish condensation activity. Members of this subfamily typically have a non-canonical conserved SHXXXDX(14)Y motif. NRPS can use a large variety of acyl monomers (approximately 500 different possible monomer substrates as opposed to the 20 standard amino acids in ribosomal protein synthesis) to construct bioactive secondary metabolites of 2 to 18 units long (with various activities such as antibiotic, antifungal, antitumor and immunosuppression). There are various subtypes of C-domains such as the LCL-type which catalyzes peptide bond formation between two L-amino acids, the DCL-type which links an L-amino acid to the D-amino acid at the end of a growing peptide, starter C-domains which acylate the first amino acid with a beta-hydroxy carboxylic acid, and heterocyclization (Cyc) domains which catalyze both peptide bond formation and cyclization of Cys, Ser, or Thr residues. Typically, an NRPS module consists of an adenylation domain, a peptidyl carrier protein (PCP) domain (also known as thiolation (T) domain) and a C-domain. NRPS modules may also include specialized domains such as the terminal-module thioesterase (Te) domain that releases the product via hydrolysis or macrocyclization and any of various C-domain family members such as the epimerization (E) domain, the ester-bond forming C-domain, dual E/C (epimerization and condensation) domains, and the X-domain.	395
380461	cd19538	LCL_NRPS	LCL-type Condensation domain of non-ribosomal peptide synthetases (NRPSs) and similar domains. LCL-type Condensation (C) domains catalyze peptide bond formation between two L-amino acids, ((L)C(L)). C-domains of NRPSs catalyze peptide bond formation within (usually) large multi-modular enzymatic complexes. NRPS can use a large variety of acyl monomers (approximately 500 different possible monomer substrates as opposed to the 20 standard amino acids in ribosomal protein synthesis) to construct bioactive secondary metabolites of 2 to 18 units long (with various activities such as antibiotic, antifungal, antitumor and immunosuppression). In addition to the LCL-type, there are various subtypes of C-domains such as the DCL-type which links an L-amino acid to the D-amino acid at the end of a growing peptide, starter C-domains which acylate the first amino acid with a beta-hydroxy carboxylic acid, and heterocyclization (Cyc) domains which catalyze both peptide bond formation and cyclization of Cys, Ser, or Thr residues. Typically, an NRPS module consists of an adenylation domain, a peptidyl carrier protein (PCP) domain (also known as thiolation (T) domain) and a C-domain. NRPS modules may also include specialized domains such as the terminal-module thioesterase (Te) domain that releases the product via hydrolysis or macrocyclization and any of various C-domain family members such as the epimerization (E) domain, the ester-bond forming C-domain, dual E/C (epimerization and condensation) domains, and the X-domain. C-domains typically have a conserved HHxxxD motif at the active site; mutations in this motif can abolish or diminish condensation activity. An HHxx[SAG]DGxSx(6)[ED] motif is characteristic of LCL-type C-domains.	432
380462	cd19539	SgcC5_NRPS-like	SgcC5 is a non-ribosomal peptide synthetase (NRPS) condensation enzyme with ester- and amide- bond forming activity and similar C-domains of modular NRPSs. SgcC5 is a free-standing NRPS condensation enzyme (rather than a modular NRPS), which catalyzes the condensation between the SgcC2-tethered (S)-3-chloro-5-hydroxy-beta-tyrosine and (R)-1-phenyl-1,2-ethanediol, forming an ester bond, during the synthesis of the chromoprotein enediyne antitumor antibiotic C-1027. It has some acceptor substrate promiscuity as it has been shown to also catalyze the formation of an amide bond between SgcC2-tethered (S)-3-chloro-5-hydroxy-beta-tyrosine and a mimic of the enediyne core acceptor substrate having an amine at its C-2 position. This subfamily also includes similar C-domains of modular NRPSs such as Penicillium chrysogenum N-(5-amino-5-carboxypentanoyl)-L-cysteinyl-D-valine synthase PCBAB. Condensation (C) domains of NRPSs normally catalyze peptide bond formation within (usually) large multi-modular enzymatic complexes. NRPS can use a large variety of acyl monomers (approximately 500 different possible monomer substrates as opposed to the 20 standard amino acids in ribosomal protein synthesis) to construct bioactive secondary metabolites of 2 to 18 units long (with various activities such as antibiotic, antifungal, antitumor and immunosuppression). There are various subtypes of C-domains such as the LCL-type which catalyzes peptide bond formation between two L-amino acids, the DCL-type which links an L-amino acid to the D-amino acid at the end of a growing peptide, starter C-domains which acylate the first amino acid with a beta-hydroxy carboxylic acid, and heterocyclization (Cyc) domains which catalyze both peptide bond formation and cyclization of Cys, Ser, or Thr residues. Typically, an NRPS module consists of an adenylation domain, a peptidyl carrier protein (PCP) domain (also known as thiolation (T) domain) and a C-domain. NRPS modules may also include specialized domains such as the terminal-module thioesterase (Te) domain that releases the product via hydrolysis or macrocyclization and any of various C-domain family members such as the epimerization (E) domain, the ester-bond forming C-domain, dual E/C (epimerization and condensation) domains, and the X-domain. C-domains typically have a conserved HHxxxD motif at the active site; mutations in this motif can abolish or diminish condensation activity.	427
380463	cd19540	LCL_NRPS-like	LCL-type Condensation domain of nonribosomal peptide synthetases (NRPSs) and similar domains. LCL-type Condensation (C) domains catalyze peptide bond formation between two L-amino acids, ((L)C(L)). C-domains of NRPSs catalyze peptide bond formation within (usually) large multi-modular enzymatic complexes. NRPS can use a large variety of acyl monomers (approximately 500 different possible monomer substrates as opposed to the 20 standard amino acids in ribosomal protein synthesis) to construct bioactive secondary metabolites of 2 to 18 units long (with various activities such as antibiotic, antifungal, antitumor and immunosuppression). In addition to the LCL-type, there are various subtypes of C-domains such as the DCL-type which links an L-amino acid to the D-amino acid at the end of a growing peptide, starter C-domains which acylate the first amino acid with a beta-hydroxy carboxylic acid, and heterocyclization (Cyc) domains which catalyze both peptide bond formation and cyclization of Cys, Ser, or Thr residues. Typically, an NRPS module consists of an adenylation domain, a peptidyl carrier protein (PCP) domain (also known as thiolation (T) domain) and a C-domain. NRPS modules may also include specialized domains such as the terminal-module thioesterase (Te) domain that releases the product via hydrolysis or macrocyclization and any of various C-domain family members such as the epimerization (E) domain, the ester-bond forming C-domain, dual E/C (epimerization and condensation) domains, and the X-domain. C-domains typically have a conserved HHxxxD motif at the active site; mutations in this motif can abolish or diminish condensation activity. An HHxx[SAG]DGxSx(6)[ED] motif is characteristic of LCL-type C-domains.	433
380464	cd19542	CT_NRPS-like	Terminal Condensation (CT)-like domains of nonribosomal peptide synthetases (NRPSs). Unlike bacterial NRPS, which typically have specialized terminal thioesterase (TE) domains to cyclize peptide products, many fungal NRPSs employ a terminal condensation-like (CT) domain to produce macrocyclic peptidyl products (e.g. cyclosporine and echinocandin). Domains in this subfamily (which includes both terminal and non-terminal domains) typically have a non-canonical conserved [SN]HxxxDx(14)Y motif at their active site compared to the standard Condensation (C) domain active site motif (HHxxxD). C-domains of NRPSs catalyze peptide bond formation within (usually) large multi-modular enzymatic complexes. NRPS can use a large variety of acyl monomers (approximately 500 different possible monomer substrates as opposed to the 20 standard amino acids in ribosomal protein synthesis) to construct bioactive secondary metabolites of 2 to 18 units long (with various activities such as antibiotic, antifungal, antitumor and immunosuppression). There are various subtypes of C-domains such as the LCL-type which catalyzes peptide bond formation between two L-amino acids, the DCL-type which links an L-amino acid to the D-amino acid at the end of a growing peptide, starter C-domains which acylate the first amino acid with a beta-hydroxy carboxylic acid, and heterocyclization (Cyc) domains which catalyze both peptide bond formation and cyclization of Cys, Ser, or Thr residues. Typically, an NRPS module consists of an adenylation domain, a peptidyl carrier protein (PCP) domain (also known as thiolation (T) domain) and a C-domain. NRPS modules may also include specialized domains such as the terminal-module thioesterase (Te) domain that releases the product via hydrolysis or macrocyclization and any of various C-domain family members such as the epimerization (E) domain, the ester-bond forming C-domain, dual E/C (epimerization and condensation) domains, and the X-domain.	401
380465	cd19543	DCL_NRPS	DCL-type Condensation domain of nonribosomal peptide synthetases (NRPSs), which catalyzes the condensation between a D-aminoacyl/peptidyl-PCP donor and an L-aminoacyl-PCP acceptor. The DCL-type Condensation (C) domain catalyzes the condensation between a D-aminoacyl/peptidyl-PCP donor and an L-aminoacyl-PCP acceptor. This domain is D-specific for the peptidyl donor and L-specific for the aminoacyl acceptor ((D)C(L)); this is in contrast with the standard LCL domains which catalyze peptide bond formation between two L-amino acids, and the restriction of ribosomes to use only L-amino acids. C domains of nonribosomal peptide synthetases (NRPSs) catalyze peptide bond formation within (usually) large multi-modular enzymatic complexes. NRPS can use a large variety of acyl monomers (approximately 500 different possible monomer substrates as opposed to the 20 standard amino acids in ribosomal protein synthesis) to construct bioactive secondary metabolites of 2 to 18 units long (with various activities such as antibiotic, antifungal, antitumor and immunosuppression). There are various subtypes of C-domains in addition to the LCL- and DCL-types such as starter C-domains which acylate the first amino acid with a beta-hydroxy carboxylic acid, and heterocyclization (Cyc) domains which catalyze both peptide bond formation and cyclization of Cys, Ser, or Thr residues. Typically, an NRPS module consists of an adenylation domain, a peptidyl carrier protein (PCP) domain (also known as thiolation (T) domain) and a C-domain. NRPS modules may also include specialized domains such as the terminal-module thioesterase (Te) domain that releases the product via hydrolysis or macrocyclization and any of various C-domain family members such as the epimerization (E) domain, the ester-bond forming C-domain, dual E/C (epimerization and condensation) domains, and the X-domain. C-domains typically have a conserved HHxxxD motif at the active site; mutations in this motif can abolish or diminish condensation activity.	423
380466	cd19544	E-C_NRPS	Dual Epimerization/Condensation (E/C) domains of nonribosomal peptide synthetases (NRPSs). Dual function Epimerization/Condensation (E/C) domains have both an epimerization and a DCL condensation activity. Dual E/C domains first epimerize the substrate amino acid to produce a D-configuration, then catalyze the condensation between the D-aminoacyl/peptidyl-PCP donor and an L-aminoacyl-PCP acceptor. They are D-specific for the peptidyl donor and L-specific for the aminoacyl acceptor ((D)C(L)); this is in contrast with the standard LCL domains which catalyze peptide bond formation between two L-amino acids, and the restriction of ribosomes to use only L-amino acids. These Dual E/C domains contain an extended His-motif (HHx(N)GD) near the N-terminus of the domain in addition to the standard Condensation (C) domain active site motif (HHxxxD). C domains of nonribosomal peptide synthetases (NRPSs) catalyze peptide bond formation within (usually) large multi-modular enzymatic complexes. NRPS can use a large variety of acyl monomers (approximately 500 different possible monomer substrates as opposed to the 20 standard amino acids in ribosomal protein synthesis) to construct bioactive secondary metabolites of 2 to 18 units long (with various activities such as antibiotic, antifungal, antitumor and immunosuppression). There are various subtypes of C-domains; these include the DCL-type, LCL-type, starter C-domains which acylate the first amino acid with a beta-hydroxy carboxylic acid, and heterocyclization (Cyc) domains which catalyze both peptide bond formation and cyclization of Cys, Ser, or Thr residues. Typically, an NRPS module consists of an adenylation domain, a peptidyl carrier protein (PCP) domain (also known as thiolation (T) domain) and a C-domain. NRPS modules may also include specialized domains such as the terminal-module thioesterase (Te) domain that releases the product via hydrolysis or macrocyclization and any of various C-domain family members such as the epimerization (E) domain, the ester-bond forming C-domain, dual E/C domains, and the X-domain.	413
380467	cd19545	FUM14_C_NRPS-like	Condensation domains of nonribosomal peptide synthetases (NRPSs) similar to the ester-bond forming Fusarium verticillioides FUM14 protein. Condensation (C) domains of nonribosomal peptide synthetases (NRPSs) typically catalyze peptide bond formation within (usually) large multi-modular enzymatic complexes. However, some C-domains have ester-bond forming activity. This subfamily includes Fusarium verticillioides FUM14 (also known as NRPS8), a bi-domain protein with an ester-bond forming NRPS C-domain, which catalyzes linkages between an aminoacyl/peptidyl-PCP donor and a hydroxyl-containing acceptor. C-domains typically have a conserved HHxxxD motif at the active site; mutations in this motif can abolish or diminish condensation activity. FUM14 has an altered active site motif DHTHCD instead of the typical HHxxxD motif seen in other subfamily members. NRPS can use a large variety of acyl monomers (approximately 500 different possible monomer substrates as opposed to the 20 standard amino acids in ribosomal protein synthesis) to construct bioactive secondary metabolites of 2 to 18 units long (with various activities such as antibiotic, antifungal, antitumor and immunosuppression). There are various subtypes of C-domains such as the LCL-type which catalyzes peptide bond formation between two L-amino acids, the DCL-type which links an L-amino acid to the D-amino acid at the end of a growing peptide, starter C-domains which acylate the first amino acid with a beta-hydroxy carboxylic acid, and heterocyclization (Cyc) domains which catalyze both peptide bond formation and cyclization of Cys, Ser, or Thr residues. Typically, an NRPS module consists of an adenylation domain, a peptidyl carrier protein (PCP) domain (also known as thiolation (T) domain) and a C-domain. NRPS modules may also include specialized domains such as the terminal-module thioesterase (Te) domain that releases the product via hydrolysis or macrocyclization and any of various C-domain family members such as the epimerization (E) domain, the ester-bond forming C-domain, dual E/C (epimerization and condensation) domains, and the X-domain.	395
380468	cd19546	X-Domain_NRPS	X-domain is a catalytically inactive Condensation-like domain shown to recruit oxygenases to the non-ribosomal peptide synthetase (NRPS). The X-domain is a catalytically inactive member of the Condensation (C) domain family of non-ribosomal peptide synthetase (NRPS). It has been shown to recruit oxygenases to the NRPS to perform side-chain crosslinking in the production of glycopeptide antibiotics. C-domains of nonribosomal peptide synthetases (NRPSs) catalyze peptide bond formation within (usually) large multi-modular enzymatic complexes. NRPS can use a large variety of acyl monomers (approximately 500 different possible monomer substrates as opposed to the 20 standard amino acids in ribosomal protein synthesis) to construct bioactive secondary metabolites of 2 to 18 units long (with various activities such as antibiotic, antifungal, antitumor and immunosuppression). There are various subtypes of C-domains such as the LCL-type which catalyzes peptide bond formation between two L-amino acids, the DCL-type which links an L-amino acid to the D-amino acid at the end of a growing peptide, starter C-domains which acylate the first amino acid with a beta-hydroxy carboxylic acid, and heterocyclization (Cyc) domains which catalyze both peptide bond formation and cyclization of Cys, Ser, or Thr residues. Typically, an NRPS module consists of an adenylation domain, a peptidyl carrier protein (PCP) domain (also known as thiolation (T) domain) and a C-domain. NRPS modules may also include specialized domains such as this X-domain, the terminal-module thioesterase (Te) domain that releases the product via hydrolysis or macrocyclization and any of various C-domain family members such as the epimerization (E) domain, the ester-bond forming C-domain, and dual E/C (epimerization and condensation) domains. C-domains typically have a conserved HHxxxD motif at the active site; mutations in this motif can abolish or diminish condensation activity; members of this X-domain subfamily lack the second H of this motif.	440
380469	cd19547	beta-lac_NRPS	Condensation domain of nonribosomal peptide synthetases (NRPSs) similar to Nocardia uniformis NocB which exhibits an unusual cyclization to form beta-lactam rings in pro-nocardicin G synthesis. Nocardia uniformis NRPS NocB acts centrally in the biosynthesis of the nocardicin monocyclic beta-lactam antibiotics. Along with another NRPS NocA, it mediates an unusual cyclization to form beta-lactam rings in the synthesis of the beta-lactam-containing pentapeptide pro-nocardicin G. This small subfamily is related to DCL-type Condensation (C) domains, which catalyze condensation between a D-aminoacyl/peptidyl-PCP donor and an L-aminoacyl-PCP acceptor. NRPSs catalyze peptide bond formation within (usually) large multi-modular enzymatic complexes. NRPS can use a large variety of acyl monomers (approximately 500 different possible monomer substrates as opposed to the 20 standard amino acids in ribosomal protein synthesis) to construct bioactive secondary metabolites of 2 to 18 units long (with various activities such as antibiotic, antifungal, antitumor and immunosuppression). There are various subtypes of C-domains such as the LCL-type which catalyzes peptide bond formation between two L-amino acids, the DCL-type which links an L-amino acid to the D-amino acid at the end of a growing peptide, starter C-domains which acylate the first amino acid with a beta-hydroxy carboxylic acid, and heterocyclization (Cyc) domains which catalyze both peptide bond formation and cyclization of Cys, Ser, or Thr residues. Typically, an NRPS module consists of an adenylation domain, a peptidyl carrier protein (PCP) domain (also known as thiolation (T) domain) and a C-domain. NRPS modules may also include specialized domains such as the terminal-module thioesterase (Te) domain that releases the product via hydrolysis or macrocyclization and any of various C-domain family members such as the epimerization (E) domain, the ester-bond forming C-domain, dual E/C (epimerization and condensation) domains, and the X-domain. C-domains typically have a conserved HHxxxD motif at the active site; domains belonging to this subfamily have an HHHxxxD motif at the active site.	422
381016	cd19548	serpinA_A1AT-like	serpin family A member, alpha-1-antitrypsin and similar serpin proteins in birds and reptiles. The alpha-1-antitrypsin family has a variety of different members of sauropsida belonging to the clade A of the serpin superfamily. This branch includes members from zebra finch, green anole, king cobra, gekko, crocodile, and central bearded dragon. Alpha-1-antitrypsin (also called A1AT, A1A, AAT, alpha1-proteinase inhibitor/A1PI, alpha1-antiproteinase/A1AP, and serum trypsin inhibitor) is a protease inhibitor. Clade A includes the classical serine proteinase inhibitors, alpha-1-antitrypsin and alpha-1-antichymotrypsin, protein C inhibitor, kallistatin, and non-inhibitory serpins, like corticosteroid and thyroxin binding globulins. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	370
381017	cd19549	serpinA_A1AT-like	serpin family A member, alpha-1-antitrypsin and similar proteins. This group contains proteins similar to alpha-1-antitrypsin (also called A1AT, A1A, AAT, alpha1-proteinase inhibitor/A1PI, alpha1-antiproteinase/A1AP, and serum trypsin inhibitor), a protease inhibitor that belongs to the serpin superfamily. It is encoded in humans by the SERPINA1 gene. When the blood contains inadequate amounts of A1AT or functionally defective A1AT (such as in alpha-1 antitrypsin deficiency), neutrophil elastase is excessively free to break down elastin, degrading the elasticity of the lungs, which results in respiratory complications, such as chronic obstructive pulmonary disease. Normally, A1AT leaves its site of origin, the liver, and joins the systemic circulation; defective A1AT can fail to do so, building up in the liver, which results in cirrhosis. This group belongs to the clade A of the serpin superfamily, which includes the classical serine proteinase inhibitors, alpha-1-antitrypsin and alpha-1-antichymotrypsin, protein C inhibitor, kallistatin, and non-inhibitory serpins, like corticosteroid and thyroxin binding globulins. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	367
381018	cd19550	serpinA2_PIL	serpin family A member 2, protease inhibitor 1-like. Protease inhibitor 1-like (also called serpin peptidase inhibitor, clade A (alpha-1 antiproteinase, antitrypsin), member 2, ARGS, protease inhibitor 1 (alpha-1-antitrypsin)-like)/PIL, and alpha-1-antitrypsin-related protein/ATR) belongs to the serpin superfamily and is encoded by the SERPINA2 gene in humans. SERPINA2 was once thought to be a pseudogene, but recent evidence shows that it produces an active transcript. It is very similar in structure and function to SERPINA1. This family belongs to the clade A of the serpin superfamily, which includes the classical serine proteinase inhibitors, alpha-1-antitrypsin and alpha-1-antichymotrypsin, protein C inhibitor, kallistatin, and non-inhibitory serpins, like corticosteroid and thyroxin binding globulins. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	363
381019	cd19551	serpinA3_A1AC	serpin family A member 3, alpha 1-antichymotrypsin. Alpha 1-antichymotrypsin (a1AC/A1AC/a1ACT/AACT) is an alpha globulin glycoprotein that is a member of the serpin superfamily. In humans, it is encoded by the SERPINA3 gene. It inhibits the activity of proteases, such as cathepsin G that is found in neutrophils, and chymases found in mast cells, by cleaving them into a different shape or conformation. This activity protects some tissues, such as the lower respiratory tract, from damage caused by proteolytic enzymes. Deficiency of this protein has been associated with liver disease. Mutations have been identified in patients with Parkinson disease and chronic obstructive pulmonary disease. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	382
381020	cd19552	serpinA4_KST	serpin family A member 4, kallistatin. Kallistatin (KST, also called proteinase inhibitor 4/PI4, or kallikrein inhibitor/KAL) is a protein that in humans is encoded by the SERPINA4 gene. Kallistatin inhibits human amidolytic and kininogenase activities of tissue kallikrein. Heparin blocks kallistatin's complex formation with tissue kallikrein and abolishes its inhibitory effect on tissue kallikrein's activity. Kallistatin was found to be expressed in human liver, stomach, pancreas, kidney, aorta, testes, prostate, artery, atrium, ventricle, lung, renal proximal tubular cell, and a colonic carcinoma cell line T84. This family belongs to the clade A of the serpin superfamily, which includes the classical serine proteinase inhibitors, alpha-1-antitrypsin and alpha-1-antichymotrypsin, protein C inhibitor, kallistatin, and non-inhibitory serpins, like corticosteroid and thyroxin binding globulins. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	383
381021	cd19553	serpinA5_PCI	serpin family A member 5, protein C inhibitor. Protein C inhibitor (PCI/PROCI, also called PAI3, plasminogen activator inhibitor-3/PLANH3, plasma serine protease inhibitor) has many biological functions. It acts as a pro-coagulant in blood, and in the seminal vesicles it is required for spermatogenesis. It is a member of the clade A serpin family that includes the classical serine proteinase inhibitors, alpha-1-antitrypsin and alpha-1-antichymotrypsin, protein C inhibitor, kallistatin, and non-inhibitory serpins, like corticosteroid and thyroxin binding globulins. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	364
381022	cd19554	serpinA6_CBG	serpin family A member 6, corticosteroid-binding globulin. Corticosteroid-binding globulin (CBG, also known as transcortin) is encoded by the SERPINA6 gene in humans which encodes an alpha-globulin with corticosteroid-binding properties. It is produced in the liver. CBG binds several steroid hormones at high rates including cortisol, cortisone, deoxycorticosterone (DOC), corticosterone, aldosterone, progesterone, and 17a-hydroxyprogesterone. This family belongs to the clade A of the serpin superfamily, which includes the classical serine proteinase inhibitors, alpha-1-antitrypsin and alpha-1-antichymotrypsin, protein C inhibitor, kallistatin, and non-inhibitory serpins, like corticosteroid and thyroxin binding globulins. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	373
381023	cd19555	serpinA7_TBG	serpin family A member 7, thyroxine-binding globulin. Thyroxine-binding globulin (TBG, also called T4-binding globulin) is a globulin that binds thyroid hormones in circulation. It is one of three transport proteins (along with transthyretin and serum albumin) responsible for carrying the thyroid hormones thyroxine (T4) and triiodothyronine (T3) in the bloodstream. TBG is synthesized primarily in the liver and is a serpin with no inhibitory function like many other members of this class of proteins. There are two forms of inherited thyroxine-binding globulin deficiency: the complete form (TBG-CD), which results in a total loss of thyroxine-binding globulin, and the partial form (TBG-PD), which reduces the amount of this protein or alters its structure. Neither of these conditions causes any problems with thyroid function, but it can be mistaken for more serious thyroid disorders, such as hypothyroidism. This family belongs to the clade A of the serpin superfamily, which includes the classical serine proteinase inhibitors, alpha-1-antitrypsin and alpha-1-antichymotrypsin, protein C inhibitor, kallistatin, and non-inhibitory serpins, like corticosteroid and thyroxin binding globulins. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	379
381024	cd19556	serpinA9_centerin	serpin family A member 9, centerin. Centerin, also known as germinal center B-cell-expressed transcript 1/GCET1, is a serpin whose expression is restricted to germinal center B-cells and lymphoid malignancies with germinal center B-cell maturation. Expression of centerin, together with bcl-6 and GCET2, constitutes a germinal center B-cell signature, which is associated with a good prognosis in diffuse large B-cell lymphomas. Centerin is thought to function in vivo in the germinal centre as an efficient inhibitor of a trypsin-like protease. It also inhibits the trypsin-like serine proteases trypsin, thrombin and plasmin and is able to bind heparin and DNA. The centerin gene maps to the A clade serpin cluster on chromosome 14q32.1, which also contains a1-antitrypsin and a1-antichymotrypsin together with seven other serpins. The clade A of the serpin superfamily includes the classical serine proteinase inhibitors, alpha-1-antitrypsin and alpha-1-antichymotrypsin, protein C inhibitor, kallistatin, and non-inhibitory serpins, like corticosteroid and thyroxin binding globulins. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	388
381025	cd19557	serpinA11	serpin family A member 11. Serpin A11, in rats also called liver regeneration-related protein LRRG023, is a serpin encoded by the gene SERPINA11. It maps on chromosome 14, at 14q32.13 and is strongly expressed in the human liver. The function of this protein is unknown. It belongs to the clade A of the serpin superfamily, which includes the classical serine proteinase inhibitors, alpha-1-antitrypsin and alpha-1-antichymotrypsin, protein C inhibitor, kallistatin, and non-inhibitory serpins, like corticosteroid and thyroxin binding globulins. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	373
381026	cd19558	serpinA12_vaspin	serpin family A member 12, visceral adipose tissue-derived serpin. Vaspin, also called visceral adipose tissue-derived serpin or serpinA12, was identified as an adipokine with insulin-sensitizing effects and has been shown to significantly reduce blood glucose concentrations in various mouse models. As such, vaspin may represent a novel treatment tool for diabetes intervention strategies. Human kallikrein 7 (hK7), which cleaves human insulin within A and B chain, was the first protease target of vaspin inhibited by classical serpin mechanism with high specificity in vitro. This family belongs to the clade A of the serpin superfamily, which includes the classical serine proteinase inhibitors, alpha-1-antitrypsin and alpha-1-antichymotrypsin, protein C inhibitor, kallistatin, and non-inhibitory serpins, like corticosteroid and thyroxin binding globulins. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	372
381027	cd19559	serpinA14_UTMP_UABP-2	serpin family A member 14, uterine milk protein and uteroferrin-associated basic protein 2. The uteroferrin(Uf)-associated basic proteins-2(UABP-2/UABP/UfAP) are a group of three (Mr = 42K, 48K, and 50K) antigenically related, basic glycoproteins secreted by the porcine uterus under the influence of progesterone (P4), which exist as heterodimers (Mr = 80,000) with the iron-binding acid phosphatase, Uf. This group also contains UTMP (uterine milk protein), encoded by SERPINA14. UTMP binds noncovalently to the iron-containing glycoprotein uteroferrin, which displays phosphatase activity and is thought to be involved with iron transport to the fetus. Synthesis of these serpins is induced by progesterone in the uterus. UTMP is also an activin-binding protein and has been implicated in regulation of uterine immune function. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	386
381028	cd19560	serpinB1_LEI	serpin family B member 1 (serpin B1), leukocyte elastase inhibitor (LEI). Leukocyte elastase inhibitor (LEI, also known as proteinase inhibitor 2/PI2, monocyte neutrophil elastase inhibitor/MNEI, EI, or ELANH2) is a member of the clade B serpins or ov-serpins (ovalbumin related serpins) that in humans is encoded by the SERPINB1 gene. Human SERPINB1 is a potent intracellular inhibitor for granzyme H (GzmH) which is constitutively expressed in NK cells and induces target cell death. GzmH cleaves SERPINB1 at Phe343 in the RCL to mediate suicide inhibition. Equine leukocyte elastase inhibitor (HLEI) in contrast to other serpins contains no carbohydrate and has a blocked amino terminus. HLEI is a thymosin beta4-binding protein suggesting a physiological role for cytoplasmic elastase inhibitors in the thymosin B4-regulated rearrangement of the cytoskeleton of leukocytes. HLEI has been proposed to be involved with the control of intracellular protein turnover or the control of elastinolytic activity during inflammation. Ov-serpins are a family of closely related proteins, whose members can be secreted (ovalbumin), cytosolic (leukocyte elastase inhibitor, LEI), or targeted to both compartments (plasminogen activator inhibitor 2, PAI-2). It is also characterized by N- and C-terminal extensions, the absence of a signal peptide, and a Ser rather than an Asn residue at the penultimate position. The ov-serpins correspond to clade B of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants can cause blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	379
381029	cd19562	serpinB2_PAI-2	serpin family B member 2, plasminogen activator inhibitor 2. Plasminogen activator inhibitor-2 (PAI-2/PLANH2, also called placental PAI, monocyte arg-serpin, or urokinase inhibitor) is a serine protease inhibitor that belongs to the ovalbumin family of serpins (ov-serpins). It is an effective inhibitor of urinary plasminogen activator (urokinase or uPA) and is involved in cell differentiation, tissue growth and regeneration. Ov-serpins are a family of closely related proteins, whose members can be secreted (ovalbumin), cytosolic (leukocyte elastase inhibitor, LEI), or targeted to both compartments (plasminogen activator inhibitor 2, PAI-2). It is also characterized by N- and C-terminal extensions, the absence of a signal peptide, and a Ser rather than an Asn residue at the penultimate position. The ov-serpins correspond to clade B of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants can cause blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	414
381030	cd19563	serpinB3_B4_SCCA1_2	serpin family B members 3 and 4, squamous cell carcinoma antigens 1 and 2. Squamous cell carcinoma antigen 1 (SCCA1, also called HsT1196 or protein T4-A) and squamous cell carcinoma antigen 2 (SCCA2, also called PI11 or leupin), which are encoded by the SERPINB3 and SERPINB4 genes, respectively, are members of the serpin family of serine protease inhibitors. SCCA1 is a so-called cross-class serpin, inhibiting cysteine proteinases such as cathepsin S, K, L, and papain. SCCA2 inhibits chymotrypsin-like serine proteases including chymase, cathepsin G, and Der p1. Elevated levels of SCCA1 and SCCA2 have been detected in chronic inflammatory conditions involving the skin, especially atopic dermatitis (AD) and psoriasis, as well as in respiratory inflammatory diseases such as asthma, chronic obstructive pulmonary disease (COPD), and tuberculosis. They are both normally co-expressed in squamous epithelial cells of tongue, esophagus, tonsils, epidermal hair follicles, lung and uterus, and become highly up-regulated in squamous carcinomas of these organs. Diseases associated with SERPINB3 include anal cancer and cervical squamous cell carcinoma, whereas those associated with SERPINB4 include squamous cell carcinoma and chromosome 18Q deletion syndrome. The ovalbumin family of serpins (ov-serpins) is a family of closely related proteins, whose members can be secreted (ovalbumin), cytosolic (leukocyte elastase inhibitor, LEI), or targeted to both compartments (plasminogen activator inhibitor 2, PAI-2). It is also characterized by N- and C-terminal extensions, the absence of a signal peptide, and a Ser rather than an Asn residue at the penultimate position. The ov-serpins correspond to clade B of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants can cause blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	390
381031	cd19565	serpinB6_CAP	serpin family B member 6, cytoplasmic antiproteinase. Cytoplasmic antiproteinase (CAP, also called proteinase inhibitor 6/PI6 or placental thrombin inhibitor/PTI) is thought to be involved in the regulation of serine proteinases present in the brain or extravasated from the blood. It may play an important role in the inner ear in the protection against leakage of lysosomal content during stress; loss of this protection results in cell death and sensorineural hearing loss. It is an inhibitor of cathepsin G, kallikrein-8 and thrombin. The ovalbumin family of serpins (ov-serpins) is a family of closely related proteins, whose members can be secreted (ovalbumin), cytosolic (leukocyte elastase inhibitor, LEI), or targeted to both compartments (plasminogen activator inhibitor 2, PAI-2). It is also characterized by N- and C-terminal extensions, the absence of a signal peptide, and a Ser rather than an Asn residue at the penultimate position. The ov-serpins correspond to clade B of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants can cause blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	378
381032	cd19566	serpinB7_megsin	serpin family B member 7, megsin. Megsin is named as such due to its primary expression in the mesangium, a structure associated with the capillaries in the glomerulus of the kidney. Megsin is thought to play a role in the regulation of a wide variety of processes in mesangial cells, such as matrix metabolism, cell proliferation, and apoptosis. Identification of the exact biological functions and target proteases of megsin will lead to the development of novel therapeutic approaches to glomerular diseases. Expression of this gene is upregulated in IgA nephropathy and mutations have been found to cause palmoplantar keratoderma, Nagashima type. Megsin belongs to the ovalbumin family of serpins (ov-serpins), a family of closely related proteins, whose members can be secreted (ovalbumin), cytosolic (leukocyte elastase inhibitor, LEI), or targeted to both compartments (plasminogen activator inhibitor 2, PAI-2). It is also characterized by N- and C-terminal extensions, the absence of a signal peptide, and a Ser rather than an Asn residue at the penultimate position. The ov-serpins correspond to clade B of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants can cause blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	380
381033	cd19567	serpinB8_CAP-2	serpin family B member 8, cytoplasmic antiproteinase 2. Cytoplasmic antiproteinase 2 (CAP-2 or peptidase inhibitor 8/PI-8) is a member of the ovalbumin family of serpins (ov-serpins). Serpin B8 is produced by platelets and can bind to and inhibit the function of furin, a serine protease involved in platelet functions. In addition, this protein has been found to enhance the mechanical stability of cell-cell adhesion in the skin, and defects in this gene have been associated with an autosomal-recessive form of exfoliative ichthyosis. Diseases associated with SERPINB8 include Peeling Skin Syndrome 5 and Exfoliative Ichthyosis. Among its related pathways are Response to elevated platelet cytosolic Ca2+ and CFTR-dependent regulation of ion channels in Airway Epithelium (norm and CF). The ov-serpins are a family of closely related proteins, whose members can be secreted (ovalbumin), cytosolic (leukocyte elastase inhibitor, LEI), or targeted to both compartments (plasminogen activator inhibitor 2, PAI-2). It is also characterized by N- and C-terminal extensions, the absence of a signal peptide, and a Ser rather than an Asn residue at the penultimate position. The ov-serpins correspond to clade B of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants can cause blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	374
381034	cd19568	serpinB9_CAP-3	serpin family B member 9, cytoplasmic antiproteinase 3. Cytoplasmic antiproteinase 3 (CAP-3; peptidase inhibitor 9/PI-9, Spi6, or testicular tissue protein Li 180) is an intracellular inhibitor of granzyme B (grB) that protects cytotoxic lymphocytes from grB-mediated death. It is also thought to be expressed in accessory immune cells, including dendritic cells (DCs), although there is some debate about this. Overexpression of serpin B9 may prevent cytotoxic T-lymphocytes from eliminating certain tumor cells. A pseudogene of this gene is found on chromosome 6. Diseases associated with serpin B9 include chronic obstructive pulmonary disease (COPD) and oral squamous cell carcinoma (OSCC). The ovalbumin family of serpins (ov-serpins) is a family of closely related proteins, whose members can be secreted (ovalbumin), cytosolic (leukocyte elastase inhibitor, LEI), or targeted to both compartments (plasminogen activator inhibitor 2, PAI-2). It is also characterized by N- and C-terminal extensions, the absence of a signal peptide, and a Ser rather than an Asn residue at the penultimate position. The ov-serpins correspond to clade B of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants can cause blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	376
381035	cd19569	serpinB10_bomapin	serpin family B member 10, bomapin. Bomapin (also called proteinase inhibitor 10/PI10) is a hematopoietic- and myeloid leukaemia-specific protease inhibitor which is thought to augment proliferation or apoptosis of leukemia cells, depending on growth factor availability. Bomapin is expressed only in bone marrow, leukocytes of patients with myeloid leukaemia that correspond to myeloid progenitors, and promyelocytic leukaemia cell lines (HL60, THP1, and AML-193), but it is not present in terminally differentiated leukocytes. The ovalbumin family of serpins (ov-serpins) is a family of closely related proteins, whose members can be secreted (ovalbumin), cytosolic (leukocyte elastase inhibitor, LEI), or targeted to both compartments (plasminogen activator inhibitor 2, PAI-2). It is also characterized by N- and C-terminal extensions, the absence of a signal peptide, and a Ser rather than an Asn residue at the penultimate position. The ov-serpins correspond to clade B of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants can cause blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	397
381036	cd19570	serpinB11_epipin	serpin family B member 11, epipin. Epipin/SERPINB11 has no serine protease inhibitory activity, probably due to mutations in the scaffold, impairing conformational changes, and may have evolved a non-inhibitory function. The ovalbumin family of serpins (ov-serpins) is a family of closely related proteins, whose members can be secreted (ovalbumin), cytosolic (leukocyte elastase inhibitor, LEI), or targeted to both compartments (plasminogen activator inhibitor 2, PAI-2). It is also characterized by N- and C-terminal extensions, the absence of a signal peptide, and a Ser rather than an Asn residue at the penultimate position. The ov-serpins correspond to clade B of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants can cause blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	392
381037	cd19571	serpinB12_yukopin	serpin family B member 12, yukopin. Yukopin, encoded by the SERPINB12 gene, is a member of the serpin superfamily of serine protease inhibitors. It inhibits trypsin and plasmin, but not thrombin, coagulation factor Xa, or urokinase-type plasminogen activator. An important paralog of this gene is SERPINB4. The ovalbumin family of serpins (ov-serpins) is a family of closely related proteins, whose members can be secreted (ovalbumin), cytosolic (leukocyte elastase inhibitor, LEI), or targeted to both compartments (plasminogen activator inhibitor 2, PAI-2). It is also characterized by N- and C-terminal extensions, the absence of a signal peptide, and a Ser rather than an Asn residue at the penultimate position. The ov-serpins correspond to clade B of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants can cause blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	420
381038	cd19572	serpinB13_headpin	serpin family B member 13, headpin. Headpin (also known as hurpin or proteinase inhibitor 13/PI13) maps to chromosome 18q21.3 and is expressed in normal squamous epithelium of the oral mucosa, skin, and cervix. Inhibitory serpins are known to play an important role in tumor invasion, metastasis, tumor suppression and apoptosis. Headpin belongs to the ovalbumin family of serpins (ov-serpins), a family of closely related proteins, whose members can be secreted (ovalbumin), cytosolic (leukocyte elastase inhibitor, LEI), or targeted to both compartments (plasminogen activator inhibitor 2, PAI-2). It is also characterized by N- and C-terminal extensions, the absence of a signal peptide, and a Ser rather than an Asn residue at the penultimate position. The ov-serpins correspond to clade B of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants can cause blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	391
381039	cd19573	serpinE2_GDN	serpin family E member 2, glia derived nexin (GDN). Serpin glia-derived nexin (GDN; also called peptidase inhibitor 7/PI-7 or protease nexin 1/PN-1) is a specific and extremely efficient inhibitor of thrombin. Unlike other thrombin inhibitors, it is not synthesized in the liver and does not circulate in the blood. It is instead expressed by multiple cell types and is located on the surface of these cells, bound to glycosaminoglycans. GDN plays a role in thrombosis and atherosclerosis and is a clade E serpin. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	375
381040	cd19574	serpinE3	serpin family E member 3. The function of serpin E3 is not known. It is a member of clade E, which also includes nexin and plasminogen activator inhibitor type 1, of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	384
381041	cd19575	serpinH2	serpin family H member 2. The function of Danio rerio serpin H2 is not known. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants can cause blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	382
381042	cd19576	serpinI2_pancpin	serpin family I member 2, pancpin. Pancpin (also called proteinase inhibitor 14/PI14 or myoepithelium-derived serine protease inhibitor/MEPI) is an inhibitory member of the serpin superfamily. It is downregulated in pancreatic and breast cancer, and is associated with acinar cell apoptosis and pancreatic insufficiency when absent in mice. Pancpin was found to inhibit pancreatic chymotrypsin and elastase. It is thought that pancpin protects pancreatic cells from the consequences of premature activation of their respective zymogens. This subgroup belongs to clade I of the serpin superfamily. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	371
381043	cd19577	serpinJ_IRS-2-like	serpin family J, Ixodes ricinus serpin-2 (IRS-2). The serpin family J clade contains serpins from the Chelicerates. This model includes serpins from the Japanese horseshoe crab, mites, ticks, and spiders. The Limulus intracellular coagulation inhibitor, designated LICI, was isolated from hemocytes of the Japanese horseshoe crab. It blocks the amidolytic activities of Limulus lipopolysaccharide-sensitive serine protease, factor C and also inhibits human alpha-thrombin, rat salivary kallikrein, bovine plasmin, and trypsin but not Limulus clotting enzyme, Limulus factor B, bovine factor Xa, human factor XIa, human tissue plasminogen activator, human urokinase, chymotrypsin, elastase, and papain. Glycosaminoglycans such as heparin and heparan sulfate had no effect on the inhibitory activity. The castor bean tick, Ixodes ricinus serpin-2 (IRS-2) whose structure has been solved, unlike that of the LICI, is found in the saliva of the tick and primarily targets 2 proinflammatory serine proteases: cathepsin G and mast cell chymase, and in higher molar excess, thrombin. It also blocks cathepsin G- and thrombin-induced platelet aggregation. Thus it has a dual role and can interfere with both inflammation and wound healing during tick feeding. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	372
381044	cd19578	serpinK_insect_SRPN2-like	serpin family K, insect Serpin-2 and similar proteins. Serpin-2 (SRPN2) is a negative regulator of the melanization response in the malaria vector Anopheles gambiae. SRPN2 irreversibly inhibits clip domain serine proteinase 9 (CLIPB9), which functions in a serine proteinase cascade ending in the activation of prophenoloxidase and melanization. Silencing of SRPN2 results in spontaneous melanization and decreased life span of the mosquito and is a promising target for vector control. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	376
381045	cd19579	serpin1K-like	Manduca sexta Serpin 1K and similar proteins. Serpin 1K is a chymotrypsin inhibitor and is 1 of 12 serpins found in the hemolymph of the hornworm moth Manduca sexta. Serpins may be involved in the immune response in insect hemolymph. All of these serpins are encoded by the same gene, and the message for each is produced by alternative splicing of the final exon. This exon encodes the RCL and two strands of sheet B. Serpin 1K has a canonical structure at the reactive center, as is observed in a1-antitrypsin, whereas hinge residues (P17-P13) adopt the position and conformation observed in ovalbumin. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	368
381046	cd19580	serpin_silkworm16_18_22	silk gland serpins 16, 18, and 22 from Bombyx mori. Serpins 16, 18, and 22 of the silkworm Bombyx mori are found in the silk gland, a highly specialized organ that functions to synthesize and store silk proteins. These three serpins are mainly distributed in the middle silk gland and contain a signal peptide for secretion. They also share high sequence homology (~87%), implying that they might carry out a similar and specific function in the middle silk gland lumen. They have a canonical serpin fold, but contain a unique reactive center loop, which is shorter than that of typical serpins. It is thought that active proteases in silk glands are restricted by serpins until the wandering stage. Studies show that serpins 16 and 18 act as inhibitors of cysteine proteases, with serpin 18 acting specifically on fibroinase. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	365
381047	cd19581	serpinL_nematode	serpin family L, serpin family proteins from nematodes. The role of nematode serpins remains largely elusive. The only nematode serpin for which experimental evidence indicates an evasive function is Brugia malayi SPN-2 which specifically inhibits two human neutrophil-derived serine proteinases, cathepsin G and elastase. Less is known of Brugia malayi SPN-1, which is present at all stages of the parasite life cycle and could exist to inhibit a cognate proteinase endogenous to the parasite. Schistosoma serpins are hypothesized to play a role in both the physiological control of elastase within the schistosomes, and protection of the parasite from activated neutrophils during inflammation. Caenorhabditis elegans serpins are thought to regulate endogenous serine proteinases as well as inhibit proteinases produced by pathogenic microorganisms. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	357
381048	cd19582	serpinM_ShSPI	serpin family M, Schistosoma haematobium serpin. ShSPI is a serpin from the trematode Schistosoma haematobium. The protein is exposed on the surface of invading cercaria as well as of adult worms, suggesting its involvement in the parasite-host interaction. It has several distinctive features, mostly concerning the helical subdomain of the protein. It is proposed that these peculiarities are related to the unique biological properties of a small serpin subfamily which is conserved among pathogenic schistosomes. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	388
381049	cd19583	serpinN_SPI-1_SPI-2	serpin family N, viral serpin-1 and serpin-2. This group of viral serpins is from the Orthopoxvirus branch (cowpox, ectromelia, vaccinia, variola, and rabbitpox) and corresponds to clade N, which contains viral serpin-1 (SPI-1-like) and viral serpin-2 (SPI-2-like) serpins. The other is clade O, which contains the viral serpin-3 (SPI-3-like) serpins. SPI-2, also called cytokine response modifier A (crmA), acts to inhibit inflammation and apoptosis. SPI-1, a serpin that is approximately 45% identical to SPI-2, has also been implicated in the inhibition of apoptosis, since certain cells infected with RPV SPI-1 mutants undergo apoptotic cell death. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	347
381050	cd19584	serpinO_SPI-3_virus	serpin family O, viral serpin-3. This group of viral serpins is from the Orthopoxvirus branch (cowpox, ectromelia, vaccinia, variola, and rabbitpox) and corresponds to clade O, which contains the viral serpin-3 (SPI-3-like) serpins. The other is clade N, which contains viral serpin-1 (SPI-1-like) and viral serpin-2 (SPI-2-like) serpins. SPI-3 is an N-glycosylated bifunctional protein that acts as both a proteinase inhibitor and a suppressor of infected cell-cell fusion. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	350
381051	cd19585	serpin_poxvirus	serpin-like proteins found in poxviruses. These are viral serpins from poxviridae that are not in the Orthopoxvirus branch (cowpox, ectromelia, vaccinia, variola, and rabbitpox) that contains clade N serpins (viral serpin-1/SPI-1-like and viral serpin-2/SPI-2-like) and clade O serpins (viral serpin-3/SPI-3-like). The members here include fowlpox virus, canarypox virus, deerpox virus, tanapox virus, and cotia virus and belong to other poxviridae branches including Leporipoxvirus, Yatapoxvirus, and Avipoxvirus. These viruses have a variety of hosts including humans, birds, and mice. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	345
381052	cd19586	serpin_mimivirus	serpin-like proteins found in mimiviruses. These viral serpins are from Mimiviridae (Tupanvirus, Powai, Bandra, Moumouvirus, and Megavirus) and may represent a new clade of viral serpins. Mimiviridae are thought to have a common evolutionary origin with Poxviridae whose viral serpins are classified into clades N and O. N is composed of viral serpin-1 (SPI-1-like) and viral serpin-2 (SPI-2-like) serpins and clade O is made up of viral serpin-3 (SPI-3-like) serpins. Mimiviruses have the only known viral serpins outside of the poxvirus family. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	355
381053	cd19587	serpinA16_HongrES1-like	serpin family A member 16, HongrES1 and similar proteins. HongrES1 is an epididymis-specific secretory protein and is encoded by the SERPINA16 gene. It is one of several potential decapacitation factors of rodents, including a 40-kDa glycoprotein, phosphatidylethanolamine-binding protein 1 (PEBP1), a cysteine-rich secretory protein 1, an acrosome-stabilizing factor, SVA, SVS2, and SPINKL. In humans, some potential decapacitation factors that have been reported are glycodelin-S, semenogelin I, a 130-kDa glycoprotein, and some mannosyl glycopeptides. Decapacitation factors are removed from the sperm head surface during the capacitation process and are able to reverse sperm capacitation. The clade A of the serpin superfamily includes the classical serine proteinase inhibitors, alpha-1-antitrypsin and alpha-1-antichymotrypsin, protein C inhibitor, kallistatin, and non-inhibitory serpins, like corticosteroid and thyroxin binding globulins. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	373
381054	cd19588	serpin_miropin-like	serpin miropin and similar proteins. Miropin, the serpin from Tannerella forsythia, is thought to contribute to the virulence of periodontal pathogens by inhibiting neutrophil serine proteases. Miropin broadly inhibits serine endopeptidases (SEPs) including trypsin, neutrophil elastase, pancreatic elastase, subtilisin, and cathepsin G and cysteine endopeptidases (CEPs) including papain, calpain-like peptidase Tpr, and gingipain K through various reactive-site bonds. This is achieved by offering several target bonds of the RCL for cleavage within a bait region, instead of a single RSB as found in canonical serpins. In addition, promiscuous inhibition is facilitated by the capacity to insert strands deviating from the canonical length into the central sheet A, while keeping the prey peptidase bound and inactivated. The structural adaptation of miropin to provide a relaxed inhibitory specificity, which allows for formation of inhibitory complexes using different sites, is unique among serpins. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	365
381055	cd19589	serpin_tengpin-like	serpin tengpin and similar proteins. Tengpin is an unusual prokaryotic serpin from the extremophile Thermoanaerobacter tengcongensis. In addition to the serpin domain, tengpin contains an N-terminal region that functions to trap the serpin domain in the native metastable state and prevent the spontaneous transition to the latent conformation. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	367
381056	cd19590	serpin_thermopin-like	serpin thermopin and similar proteins. Thermopin, the serpin from Thermobifida fusca, functions as an irreversible proteinase inhibitor with resistance to polymerization at high temperatures. The crystal structure of the cleaved thermopin was found to adopt the canonical serpin fold, supporting its inclusion as a classical inhibitory member of the serpin superfamily. A detailed structural comparison revealed unique features, including charge-stabilizing interactions, a deleted element of secondary structure (the G helix), and a C-terminal "tail" that interacts with the top of the A beta sheet and plays an important role in the folding/unfolding of the molecule. These unique features provide structural and biophysical evidence as to how this unusual serpin member has adapted to remain functional in an extreme environment. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	366
381057	cd19591	serpin_like	serpin family proteins. This group includes a variety of serpins from the three domains of life: eukaryotes, bacteria, and archaea. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	364
381058	cd19593	serpin_bacteria_crustaceans	serpin family proteins from bacteria and crustaceans. This group includes a variety of serpin family proteins from various bacteria and crustaceans including sea louse and salmon louse. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	370
381059	cd19594	serpin_crustaceans_chelicerates_insects	serpin family proteins from crustaceans, chelicerates, and insects. This group includes a variety of serpins from crustaceans (sea louse, Chinese mitten crab, signal crayfish, red king crab, Asian tiger shrimp), chelicerates (Atlantic horseshoe crab, common house spider), and insects (Asian tiger mosquito, caddisfly, pea aphid, bed bug, fruit fly, Australian sheep blowfly, tobacco hornworm, alfalfa leafcutting bee). SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	374
381060	cd19596	serpin_fungal	cellulosomal serpin precursor. A single fungal serpin has been characterized to date: celpin from Piromyces spp. strain E2. Piromyces is a genus of anaerobic fungi found in the gut of ruminants and is important for digesting plant material. Celpin is predicted to be inhibitory and contains two N-terminal dockerin domains in addition to its serpin domain. Dockerins are commonly found in proteins that localise to the fungal cellulosome, a large extracellular multiprotein complex that breaks down cellulose.[21] It is therefore suggested that celpin may protect the cellulosome against plant proteases. Certain bacterial serpins similarly localize to the cellulosome.[186] SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	361
381061	cd19597	serpin28D-like_insects	insect serpins similar to Drosophila melanogaster Serpin-28D. Serpins in insects function within development, wound healing and immunity. Drosophila melanogaster Serpin-28D is required for pupal viability and plays an essential role in regulating melanization. Insect serpins from mosquitoes, Mediterranean fruit fly, fruit fly, and blowfly are included in this subfamily. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	395
381062	cd19598	serpin77Ba-like_insects	insect serpins similar to Drosophila melanogaster Serpin 77Ba. Serpins in insects function within development, wound healing and immunity. Drosophila melanogaster Serpin 77Ba plays an essential role in regulating the tracheal melanization immune response to bacterial and fungal infection. Insect serpins from pine beetle, diamondback moth, red flour beetle, mosquito, silkworm, and fruit fly are included in this subfamily. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	376
381063	cd19599	serpin18-like_insects	insect serpins similar to Anopheles gambiae Serpin 18. Serpins in insects function within development, wound healing and immunity. A. gambiae serpin 18 is categorized as non-inhibitory based on the sequence of its reactive-center loop. It is expressed throughout all life stages in multiple tissues and the hemolymph, and is predicted to be secreted based on the presence of a signal peptide. Insect serpins from mosquitoes are included in this subfamily. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	354
381064	cd19600	serpin11-like_insects	insect serpins similar to Bombyx mori Serpin-11. Serpins in insects function within development, wound healing and immunity. The specific function of Bombyx mori serpin-11 (SPN19) is unknown. Insect serpins from sawfly, mealworm, riceborer, moth, silkworm, bollworm are included in this subfamily. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	366
381065	cd19601	serpin42Da-like	serpins similar to Drosophila melanogaster Serpin 42Da. This subfamily is composed mainly of insect serpins, including Drosophila melanogaster serpin 42Da. Serpins in insects function within development, wound healing and immunity. Serpin 42Da, previously serpin 4, is a serine protease inhibitor that is capable of remarkable functional diversity through the alternative splicing of four different reactive center loop exons. Insect serpins from stink bug, alfalfa leafcutting bee, red flour beetle, house fly, and brown planthopper are also included in this subfamily. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	361
381066	cd19602	serpin_mollusks	serpin family proteins from mollusks. This group includes a variety of serpins from mollusks (freshwater snail, sea slug, and disk abalone). SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	374
381067	cd19603	serpin_platyhelminthes	serpin family proteins from platyhelminthes. This group includes a variety of serpins from platyhelminthes (lung fluke, tapeworm, flatworm). SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	380
381068	cd19604	serpin_protozoa	serpin family proteins from protozoa. This group includes a variety of serpin clades from various protozoa including Neospora caninum that causes neosporosis, Toxoplasma gondii that causes toxoplasmosis, and Hammondia hammondi. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	439
381069	cd19605	serpin_protozoa	viral serpin. CrmA is a viral serpin that inhibits both cysteine and serine proteinases involved in the regulation of host inflammatory and apoptosis processes. It differs from other members of the serpin superfamily by having a shorter reactive center loop as well as possessing an additional highly charged antiparallel beta-strand of beta-sheet A, whose sequence and length are unique. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	413
381624	cd19606	GH113-like	Glycoside hydrolase family 113 beta-mannosidase and similar proteins. Family 113 glycoside hydrolases cleave (1->4)-beta-glycosidic linkages, such as endo-1,4-beta-mannanase. This family also includes TIM-barrel domains found in gene transfer agent proteins.	303
381625	cd19607	GTA_TIM-barrel-like	Putative glycoside hydrolase TIM-barrel domain in gene transfer agent and similar proteins. This domain is found in the gene transfer agent protein, such as the Rhodobacter capsulatus putative gene transfer agent protein encoded by orfg15. In the purple nonsulfur bacterium Rhodobacter capsulatus, DNA transmission is mediated via an unusual system, a small bacteriophage-like particle called the gene transfer agent (GTA) that transfers random 4.5-kb segments of the producing cell's genome to recipient cells, where allelic replacement occurs.	438
381626	cd19608	GH113_mannanase-like	Glycoside hydrolase family 113 beta-1,4-mannanase and similar proteins. Mannan endo-1,4-beta mannosidase (E.C 3.2.1.78) randomly cleaves (1->4)-beta-D-mannosidic linkages in mannans, galactomannans and glucomannans and is also called beta-1,4-mannanase, endo-1,4-beta-mannanase, endo-beta-1,4-mannase, beta-mannanase B, beta-1, 4-mannan 4-mannanohydrolase, endo-beta-mannanase, beta-D-mannanase, 1,4-beta-D-mannan mannanohydrolase, and 4-beta-D-mannan mannanohydrolase. (1->4)-beta-linked mannans are polysaccharides with a linear polymer backbone of (1->4)-beta-linked mannose units (in plants and fungi) or alternating mannose and glucose/galactose units (glucomannan in plants and fungi, and galactomannan and galactoglucomannan in plants), such as in the hemicellulose fraction of hard- and softwoods. Complete degradation of mannan requires a series of enzymes, including beta-1,4-mannanase. According to the CAZy database beta-1,4-mannanases are grouped into various glycoside hydrolase (GH) families; GH family 113 beta-1,4-mannanases include mostly bacterial and archaeal sequences.	310
410991	cd19609	NTD_TDP-43	N-terminal domain of transactive response DNA-binding protein 43. Transactive response DNA-binding protein of 43 kDa (TDP-43) is a nuclear DNA/RNA-binding protein involved in gene transcription and mRNA processing, transport, and translational regulation. It is vital to pre-mRNA and microRNA processing and regulates stress granule activity through the differential regulation of G3BP and TIA-1. It also forms aggregates implicated in amyotrophic lateral sclerosis. The N-terminal domain of TDP-43 is required for its physiological functions and pathological aggregation.	74
381622	cd19610	mannanase_GH134	glycosyl hydrolase family 134 inverting endo-beta-1,4-mannanase. Glycosyl hydrolase family 134 beta-mannanase (E.C. 3.2.1.78) differs from other mannanases in that it has a hen egg white lysozyme fold and cleaves beta-1,4-mannans with inversion of stereochemistry. Beta-mannosidases are enzymes involved in seed germination and the degradation of the hemicellulose fraction of soft- and hardwoods.	162
381623	cd19611	Ctf13_LRR_LRR-insertion	leucine-rich-repeat (LRR) domain and LRR insertion domain of centromere DNA-binding protein complex CBF3 subunit C (Ctf13). Ctf13 is an F-box protein of the leucine-rich-repeat superfamily; it is a component of CEN binding factor 3 (CBF3), a complex that recognizes point centromeres found in budding yeast, associating specifically with the third centromere DNA element (CDEIII). CBF3 is composed of two homodimers of Cep3 and Ndc10, and a Ctf13-Skp1 heterodimer. The Skp1-Ctf13 heterodimer interacts with Cep3, Ndc10 and CDEIII at a completely conserved G, centrally positioned between the TGC/CCG sites. The eight leucine-rich repeat (LRR) motifs of Ctf13 (LRR 1-8) form a solenoid structure. At the N-terminus of the Ctf13 LRR is an expanded F-box, and at the C-terminal end, an alpha-beta domain formed by insertions within the latter LRRs of Ctf13 (LRR insertion domain). This domain model includes the LRR domain and the LRR insertion domain.	290
381247	cd19614	FABP_Der_p_13-like	mite group 13 allergens similar to Dermatophagoides farinae Der p 13, and related proteins. The minor house dust mite allergen Der p 13 is a fatty acid-binding protein and an activator of a TLR2-mediated innate immune response. This group also contains other mite group 13 allergens, including Tyrophagus putrescentiae Tyr p 13 and Blomia tropicalis mite blo t 13. blo t 13 binds the natural fluorescent fatty acid cis-parinaric acid and oleic acid by competition, but not retinol, retinoic acid, cholesterol, or dansylated or anthroxylated fatty acids. This subgroup belongs to the intracellular fatty acid-binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	128
381248	cd19615	lipocalin_NP7	nitrophorin 7. Nitrophorins (NPs) represent a group of nitric oxide (NO)-carrying heme proteins found in the saliva of Rhodnius prolixus. In its adult phase R. prolixus expresses at least 4 nitrophorins (designated NP1-4 in order of their increasing abundance in the saliva of adult insects). Two additional nitrophorins, NP5 and NP6, have been detected mainly in the five instar nymphal stages of insect development. NP7 has not been isolated from the insects but was instead recognized in a cDNA library. NP7 displays peculiar properties, such as an abnormally high isoelectric point, the ability to bind negatively charged membranes, and a strong pH sensitivity of NO affinity. In contrast to NP1-4, which show high affinities for histamine (Hm), NP7 does not appear to sequester Hm. NPs belong to the lipocalin/cytosolic fatty-acid binding protein family, members of which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane-bound receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	174
381249	cd19616	FABP11	fatty acid binding protein 11 similar to zebrafish FABP11. This group includes zebrafish FABP11a and FABP11b, Senegalese sole FABP11, and similar proteins. The two copies of the fabp11 gene in the zebrafish genome may have resulted from a fish-specific whole genome duplication event. Fabp11a transcripts have been detected in the liver, brain, heart, testis, muscle, ovary and skin of adult zebrafish while fabp11b transcripts have been found in the brain, heart, ovary and eye in adult tissues. This subgroup belongs to the intracellular fatty-acid binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	129
381250	cd19617	FABP12	fatty acid-binding protein 12. FABP12 is expressed in rodent retina and testis, as well as in human retinoblastoma cell lines. This subgroup belongs to the intracellular fatty-acid binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	128
410998	cd19668	TmYidC_peri	periplasmic beta-super sandwich fold domain of membrane protein insertase YidC from Thermotoga maritima and similar domains. This subfamily is composed of Thermotoga maritima YidC (TmYidC) and similar proteins. Membrane protein insertase YidC, also called foldase YidC or membrane integrase YidC, facilitates proper folding, insertion, and assembly of inner membrane proteins and complexes. Depending on the nature of the substrate, YidC functions in a Sec-independent (YidC only) or a Sec-dependent manner as part of a complex containing YidC, the SecYEG channel, and SecDFYajC. YidC belongs to the YidC/Oxa1/Alb3 protein family of insertases that contain a core domain of five transmembrane (TM) segments that is essential to insertase function. In addition to this core transmembrane domain, YidC from Gram-negative bacteria contain an extra transmembrane segment (TM1) at the N-terminus and a large periplasmic domain, located between TM1 and TM2, that adopts a beta-super sandwich fold that is found in sugar-binding proteins such as galactose mutarotase. This periplasmic domain may have a role in protein assembly: a region of YidC that binds to SecF maps to one edge of the beta-super sandwich. The periplasmic domain of TmYidC shows no amino acid sequence identity with that of the prototypical Escherichia coli YidC (EcYidC), yet they adopt a similar fold. However, the periplasmic domain of TmYidC displays shorter beta strands and some differences in connectivity, compared to EcYidC.	196
381525	cd19682	bHLHzip_MGA_like	basic Helix-Loop-Helix-zipper (bHLHzip) domain found in MAX gene-associated protein (MGA) family. The MGA family includes MGA, Schizosaccharomyces pombe ESC1 (spESC1) and similar proteins. MGA, also termed MAX dimerization protein 5 (MAD5), is a dual specificity T-box/ bHLHzip transcription factor that regulates the expression of both Max-network and T-box family target genes. It contains a Myc-like bHLHZip motif and requires heterodimerization with Max for binding to the preferred Myc-Max-binding site CACGTG. In addition to the bHLHZip domain, MGA harbors a second DNA-binding domain, the T-box or T-domain. It thus binds the preferred Brachyury-binding sequence and represses transcription of reporter genes containing promoter-proximal Brachyury-binding sites. spESC1 is a bHLHzip protein with homology to human MyoD and Myf-5 myogenic differentiation inducers. It is involved in the sexual differentiation process.	65
381526	cd19683	bHLH_SOHLH_like	basic helix-loop-helix (bHLH) domain found in the spermatogenesis- and oogenesis-specific basic helix-loop-helix-containing protein (SOHLH) family and similar proteins. The SOHLH family includes two bHLH transcription factors, SOHLH1 and SOHLH2. They are specifically expressed in spermatogonia and oocytes and are essential for early spermatogonial and oocyte differentiation. The family also includes transcription factor-like 5 protein (TCFL5) and similar proteins. TCFL5, also termed Cha transcription factor, or HPV-16 E2-binding protein 1 (E2BP-1), is a bHLH transcription factor that plays a crucial role in spermatogenesis. Like other bHLH molecules, it regulates cell proliferation or differentiation through binding to a specific DNA sequence.	58
381527	cd19684	bHLH_dnHLH_ID	basic helix-loop-helix (bHLH) domain found in the DNA-binding protein inhibitor (ID) family. The ID family includes a dominant negative group of helix-loop-helix (dnHLH) proteins, ID1-4, that are negative regulators of bHLH transcription factors. They contain the HLH-dimerization domain but lack the basic domain necessary for DNA-binding. ID proteins inhibit binding to DNA and transcriptional transactivation by heterodimerization with bHLH proteins. They also interact with many non-bHLH proteins in complex networks. ID proteins have been implicated in regulating gene expression as well as cell-cycle progression. Whereas ID1, ID2 and ID3 are generally considered tumor promoters, ID4 has emerged as a tumor suppressor.	47
381528	cd19685	bHLH-O_HERP_HES	basic helix-loop-helix-orange (bHLH-O) domain found in HERP/HES-like family. The HERP/HES-like family includes bHLH-O transcriptional regulators that are related to the Drosophila hairy and Enhancer-of-split (HES) proteins. The HERP (HES-related repressor protein) subfamily proteins contain a basic helix-loop-helix (bHLH) domain with an invariant glycine residue in its basic region, an orange domain in the central region and YXXW sequence motif at its C-terminal region. Hairy and enhancer of split (HES)-related repressor protein (HERP) proteins (HEY1, HEY2 and HEYL) act as downstream effectors of Notch signaling. They are involved in cardiovascular development and have roles in somitogenesis, myogenesis and gliogenesis. Hairy and enhancer of split-related protein HELT is a transcriptional repressor expressed in the developing central nervous system. It binds preferentially to the canonical E box sequence 5'-CACGCG-3' and regulates neuronal differentiation and/or identity. Differentially expressed in chondrocytes proteins, DEC1 and DEC2, are widely expressed in both embryonic and adult tissues and have been implicated in apoptosis, cell proliferation, and circadian rhythms, as well as malignancy in various cancers.  Drosophila melanogaster protein clockwork orange (Cwo) is also included in this subfamily. It is involved in the regulation of Drosophila circadian rhythms. It functions as both an activator and a repressor of clock gene expression. The HES subfamily proteins contain a basic helix-loop-helix (bHLH) domain with an invariant proline residue in its basic region, an orange domain in the central region and a conserved tetrapeptide motif, WRPW, at its C-terminal region. They form heterodimers or homodimers via their HLH domain and bind DNA to repress gene transcription that play an essential role in development of both compartment and boundary cells of the central nervous system.	52
381529	cd19686	bHLH_TS_FERD3L_like	basic helix-loop-helix (bHLH) domain found in Fer3-like protein (FERD3L), pancreas transcription factor 1 subunit alpha (PTF1A) and similar proteins. The family corresponds to a group of bHLH transcription factors, including FERD3L and PTF1A. FERD3L, also termed basic helix-loop-helix protein N-twist, or Class A basic helix-loop-helix protein 31 (bHLHa31), or nephew of atonal 3 (NATO3), or Neuronal twist (NTWIST), is expressed in the developing central nervous system (CNS). It regulates floor plate (FP) cells development. FP is a critical organizing center located at the ventral-most midline of the neural tube. FERD3L binds to the E-box and functions as inhibitor of transcription. PTF1A, also termed Class A basic helix-loop-helix protein 29 (bHLHa29), or pancreas-specific transcription factor 1a, or bHLH transcription factor p48, or p48 DNA-binding subunit of transcription factor PTF1 (PTF1-p48), is implicated in the cell fate determination in various organs. It binds to the E-box consensus sequence 5'-CANNTG-3' and plays a role in early and late pancreas development and differentiation.	56
381530	cd19687	bHLHzip_Mlx	basic Helix-Loop-Helix-zipper (bHLHzip) domain found in Max-like protein X (Mlx) and similar proteins. Mlx, also termed Class D basic helix-loop-helix protein 13 (bHLHd13), or Max-like bHLHZip protein, or protein BigMax, or transcription factor-like protein 4, is a Max-like bHLHZip transcription regulator that interacts with the Max network of transcription factors. It forms a sequence-specific DNA-binding protein complex with some members of the Mad family (Mad1 and Mad4) and the Mondo family, but not the Myc family, and binds E-box DNA to control transcription.	76
381531	cd19688	bHLHzip_MLXIP	basic Helix-Loop-Helix-zipper (bHLHzip) domain found in MLX-interacting protein (MLXIP) and similar proteins. MLXIP, also termed Class E basic helix-loop-helix protein 36 (bHLHe36), or transcriptional activator MondoA, is a bHLHZip transcriptional activator that binds DNA as a heterodimer with Mlx. It binds to the canonical E box sequence 5'-CACGTG-3' and plays a role in transcriptional activation of glycolytic target genes. MLXIP is most highly expressed in skeletal muscle and functions as an indirect glucose sensor, by sensing glucose 6-phosphate and shuttling between the nucleus and the cytoplasm.	72
381532	cd19689	bHLHzip_MLXIPL	basic Helix-Loop-Helix-zipper (bHLHzip) domain found in MLX-interacting protein-like (MLXIPL) and similar proteins. MLXIPL, also termed carbohydrate-responsive element-binding protein (ChREBP), or Class D basic helix-loop-helix protein 14 (bHLHd14), or MLX interactor, or WS basic-helix-loop-helix leucine zipper protein (WS-bHLH), or Williams-Beuren syndrome chromosomal region 14 protein (WBSCR14), is a bHLHZip transcriptional factor integral to the regulation of glycolysis and lipogenesis in the liver. It forms heterodimers with the bHLHZip protein Mlx to bind the DNA sequence 5'-CACGTG-3'.	76
381533	cd19690	bHLHzip_spESC1_like	basic Helix-Loop-Helix-zipper (bHLHzip) domain found in Schizosaccharomyces pombe ESC1 (spESC1) and similar proteins. spESC1 is a bHLHzip protein with homology to human MyoD and Myf-5 myogenic differentiation inducers. It is involved in the sexual differentiation process.	65
381534	cd19691	bHLH_dnHLH_ID1	basic helix-loop-helix (bHLH) domain found in DNA-binding protein inhibitor ID1 and similar proteins. ID1, also termed Class B basic helix-loop-helix protein 24 (bHLHb24), or inhibitor of DNA binding 1, or inhibitor of differentiation 1, is a dominant negative helix-loop-helix (dnHLH) transcriptional regulator (lacking a basic DNA binding domain) which negatively regulates the bHLH transcription factors by forming heterodimers and inhibiting their DNA binding and transcriptional activity. ID1 interferes with centrosomal function. It has been implicated in the regulation of cell proliferation and differentiation in myogenesis, neurogenesis, and/or hematopoiesis.	52
381535	cd19692	bHLH_dnHLH_ID2	basic helix-loop-helix (bHLH) domain found in DNA-binding protein inhibitor ID2 and similar proteins. ID2, also termed Class B basic helix-loop-helix protein 26 (bHLHb26), or inhibitor of DNA binding 2, or inhibitor of differentiation 2, is a dominant negative helix-loop-helix (dnHLH) transcriptional regulator (lacking a basic DNA binding domain) which negatively regulates the bHLH transcription factors by forming heterodimers and inhibiting their DNA binding and transcriptional activity. It has been implicated in the regulation of cell proliferation and differentiation in myogenesis, neurogenesis, and/or hematopoiesis.	66
381536	cd19693	bHLH_dnHLH_ID3	basic helix-loop-helix (bHLH) domain found in DNA-binding protein inhibitor ID3 and similar proteins. ID3, also termed Class B basic helix-loop-helix protein 25 (bHLHb25), or helix-loop-helix protein HEIR-1, or ID-like protein inhibitor HLH 1R21, or inhibitor of DNA binding 3, or inhibitor of differentiation 3, is a dominant negative helix-loop-helix (dnHLH) transcriptional regulator (lacking a basic DNA binding domain) which negatively regulates the bHLH transcription factors by forming heterodimers and inhibiting their DNA binding and transcriptional activity. It negatively regulates muscle differentiation by inhibiting the DNA-binding activities of the myogenic regulatory factors.	61
381537	cd19694	bHLH_dnHLH_ID4	basic helix-loop-helix (bHLH) domain found in DNA-binding protein inhibitor ID4 and similar proteins. ID4, also termed Class B basic helix-loop-helix protein 27 (bHLHb27), or inhibitor of DNA binding 4, or inhibitor of differentiation 4, is a dominant negative helix-loop-helix (dnHLH) transcriptional regulator (lacking a basic DNA binding domain) which negatively regulates the bHLH transcription factors by forming heterodimers and inhibiting their DNA binding and transcriptional activity.  It plays a role in adipose cell differentiation.	60
381538	cd19695	bHLH_dnHLH_EMC_like	basic helix-loop-helix (bHLH) domain found in Drosophila melanogaster protein extra-macrochaetae and similar proteins. Extra-macrochaetae is a negative regulator of sensory organ development in Drosophila. It belongs to the dominant negative group of helix-loop-helix (dnHLH) proteins, which lack a basic DNA-binding domain but can form heterodimers with other HLH proteins, thereby inhibiting DNA binding.	52
381539	cd19696	bHLH-PAS_AhR_like	basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in the aryl hydrocarbon receptor (AhR) family. The AhR family includes AhR, AhR repressor (AhRR) and Drosophila melanogaster protein spineless. AhR, also termed Ah receptor, or Dioxin receptor (DR), or Class E basic helix-loop-helix protein 76 (bHLHe76), is the only member of bHLH-PAS transcription regulators that binds and is activated by small chemical ligands. It is activated by Dioxin to control the expression of certain genes to influence biological processes such as apoptosis, proliferation, cell growth and differentiation. To form active DNA binding complexes AhR dimerizes with a bHLH-PAS factor ARNT (Aryl hydrocarbon Nuclear Receptor Translocator). AhRR, also termed Class E basic helix-loop-helix protein 77 (bHLHe77), is a member of bHLH-PAS transcription factors that acts as a negative regulator of AhR, playing key roles in development and environmental sensing. AhRR functions by competing with AhR for its partner ARNT. AhRR-ARNT complexes are transcriptionally inactive. Spineless is a bHLH-PAS transcription factor that plays an important role in fly morphogenesis. It is both necessary and sufficient for the formation of the ommatidial mosaic.	59
381540	cd19697	bHLH-PAS_NPAS4_PASD10	basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in neuronal PAS domain-containing protein 4 (NPAS4) and similar proteins. NPAS4, also termed neuronal Per-Arnt-Sim homology (PAS) factor 4, or neuronal PAS4, or Class E basic helix-loop-helix protein 79 (bHLHe79), or HLH-PAS transcription factor NXF, or PAS domain-containing protein 10 (PASD10), is a bHLH-PAS neuronal activity-dependent transcription factor which heterodimerizes with ARNT2 to regulate genes involved in inhibitory synapse formation.	57
381541	cd19698	bHLH_AtMEE8_like	basic helix-loop-helix (bHLH) domain found in Arabidopsis thaliana protein maternal effect embryo arrest 8 (AtMEE8) and similar proteins. AtMEE8, also termed AtbHLH108, or EN 132, is a bHLH transcription factor required during early embryo development, for the endosperm formation.	71
381542	cd19699	bHLH_TS_dMYOD_like	basic helix-loop-helix (bHLH) domain found in Drosophila melanogaster myogenic-determination protein (dMYOD) and similar proteins. dMYOD, also termed protein nautilus, or Myd, may play an important role in the early development of muscle in Drosophila.	56
381543	cd19700	bHLH_TS_TWIST2	basic helix-loop-helix (bHLH) domain found in twist-related protein 2 (TWIST2) and similar proteins. TWIST2, also termed Class A basic helix-loop-helix protein 39 (bHLHa39), or Dermis-expressed protein 1, or Dermo-1, is a bHLH transcription factor that regulates the development of mesenchymal tissues and plays a critical role in embryogenesis. It binds to the E-box consensus sequence 5'-CANNTG-3' as a heterodimer and inhibits transcriptional activation by MYOD1, MYOG, MEF2A and MEF2C.	82
381544	cd19701	bHLH_TS_HEN1	basic helix-loop-helix (bHLH) domain found in helix-loop-helix protein 1 (HEN-1) and similar proteins. HEN-1, also termed nescient helix-loop-helix 1 (Nhlh1), or Class A basic helix-loop-helix protein 35 (bHLHa35), or nescient helix loop helix 1 (NSCL-1), is a neuron-specific bHLH transcription factor that may serve as DNA-binding protein and may be involved in the control of cell-type determination, possibly within the developing nervous system.	72
381545	cd19702	bHLH_TS_HEN2	basic helix-loop-helix (bHLH) domain found in helix-loop-helix protein 2 (HEN-2) and similar proteins. HEN-2, also termed nescient helix-loop-helix 2 (Nhlh2), or Class A basic helix-loop-helix protein 34 (bHLHa34), or nescient helix loop helix 2 (NSCL-2), is a neuron-specific bHLH transcription factor that may serve as DNA-binding protein and may be involved in the control of cell-type determination, possibly within the developing nervous system.	73
381546	cd19703	bHLH_TS_musculin	basic helix-loop-helix (bHLH) domain found in musculin and similar proteins. Musculin, also termed activated B-cell factor 1 (ABF-1), or Class A basic helix-loop-helix protein 22 (bHLHa22), is a bHLH transcription factor expressed in activated B lymphocytes. It acts as a transcription repressor capable of inhibiting the transactivation capability of TCF3/E47. Musculin may play a role in regulating antigen-dependent B-cell differentiation. The mouse homolog, musculin, is suggested to be a repressor of myogenesis that is expressed in developing muscle and in the spleen. Musculin heterodimerizes with products of the E2A gene.	66
381547	cd19704	bHLH_TS_TCF21_capsulin	basic helix-loop-helix (bHLH) domain found in transcription factor 21 (TCF-21) and similar proteins. TCF-21, also termed capsulin, or Class A basic helix-loop-helix protein 23 (bHLHa23), or epicardin, or podocyte-expressed 1 (Pod-1), is a bHLH transcription factor expressed specifically in mesodermally-derived cells that surround the epithelium of the developing gastrointestinal, genitourinary and respiratory systems during mouse embryogenesis. It may play a role in the specification or differentiation of one or more subsets of epicardial cell types.	64
381548	cd19705	bHLH_TS_LYL1	basic helix-loop-helix (bHLH) domain found in protein lyl-1 and similar proteins. Lyl-1, also termed Class A basic helix-loop-helix protein 18 (bHLHa18), or lymphoblastic leukemia-derived sequence 1, is a proto-oncogenic bHLH transcription factor involved in T-cell acute lymphoblastic leukemia. It plays an important role in hematopoietic stem cell function and is required for the late stages of postnatal angiogenesis to limit the formation of new blood vessels, notably by regulating the activity of the small GTPase Rap1. LYL-1 deficiency induces a stress erythropoiesis.	65
381549	cd19706	bHLH_TS_TAL1	basic helix-loop-helix (bHLH) domain found in T-cell acute lymphocytic leukemia protein 1 (TAL-1) and similar proteins. TAL-1, also termed Class A basic helix-loop-helix protein 17 (bHLHa17), or stem cell protein (SCL), or T-cell leukemia/lymphoma protein 5, is a hematopoietic-specific bHLH transcription factor that functions in embryonic and adult hematopoiesis in vertebrates. It is also required for embryonic vascular remodeling. It acts as a regulator of erythroid differentiation and binds to regulatory regions of a large cohort of erythroid genes as part of a complex with GATA-1, LMO2 and Ldb1. TAL-1 has been implicated in T-cell acute lymphoblastic leukemia. In common with other tissue-specific bHLH proteins, Tal heterodimerizes with ubiquitously-expressed members of the E2A family and forms a DNA-binding complex with an E-box (CANNTG) to regulate transcription at its recognition site.	65
381550	cd19707	bHLH_TS_TAL2	basic helix-loop-helix (bHLH) domain found in T-cell acute lymphocytic leukemia protein 2 (TAL-2) and similar proteins. TAL-2, also termed Class A basic helix-loop-helix protein 19 (bHLHa19), is a bHLH transcription factor essential for the normal brain development. It has been implicated in T-cell acute lymphoblastic leukemia.	61
381551	cd19708	bHLH_TS_dHLH3B_like	basic helix-loop-helix (bHLH) domain found in Drosophila melanogaster helix loop helix protein 3B (dHLH3B) and similar proteins. Drosophila HLH3B is an uncharacterized bHLH transcription factor that belongs to the T-cell acute lymphocytic leukemia protein/ lymphoblastic leukemia-derived sequence (TAL/LYL) family.	60
381552	cd19709	bHLH_TS_TCF23_OUT	basic helix-loop-helix (bHLH) domain found in transcription factor 23 (TCF-23) and similar proteins. TCF-23, also termed Class A basic helix-loop-helix protein 24 (bHLHa24), is a bHLH transcription factor that is essential for progesterone-dependent decidualization. The mouse homolog is also called ovary, uterus and testis protein (OUT), which is expressed predominantly in the reproductive organs such as the uterus, ovary and testis. It shows an Id-like inhibitory activity and functions as a negative regulator of bHLH factors through the formation of a functionally inactive heterodimeric complex. OUT inhibits the formation of TCF3 and MYOD1 homodimers and heterodimers, but lacks DNA binding activity. OUT is involved in the regulation or modulation of smooth muscle contraction of the uterus during pregnancy and particularly around the time of delivery. It also plays a role in the inhibition of myogenesis. Unlike typical bHLH factors, OUT proteins do not bind E-box (CANNTG) or N-box DNA sequences and inhibit DNA binding of homo- and heterodimers consisting of E12 and MyoD in gel mobility shift assays.	56
381553	cd19710	bHLH_TS_TCF24	basic helix-loop-helix (bHLH) domain found in transcription factor 24 (TCF-24) and similar proteins. TCF-24 is an uncharacterized bHLH transcription factor that shows high sequence similarity with TCF-23.	56
381554	cd19711	bHLH_TS_MIST1	basic helix-loop-helix (bHLH) domain found in muscle, intestine and stomach expression 1 (MIST-1) and similar proteins. MIST-1, also termed Class A basic helix-loop-helix protein 15 (bHLHa15), or Class B basic helix-loop-helix protein 8 (bHLHb8), is a bHLH transcription factor expressed in pancreatic acinar cells and other serous exocrine cells. It is essential for cytoskeletal organization and secretory activity. It also functions as a potent endoplasmic reticulum (ER) stress-inducible transcriptional regulator. MIST-1 is capable of binding to E-box (CANNTG) motifs as a homodimer or a heterodimer with E-proteins (E12 and E47) to regulate transcription.	62
381555	cd19712	bHLH_TS_dimmed_like	basic helix-loop-helix (bHLH) domain found in Drosophila melanogaster protein dimmed and similar proteins. Dimmed, also termed DIMM, is a bHLH transcription factor that regulates neurosecretory (NS) cell function and neuroendocrine cell fate in Drosophila.	60
381556	cd19713	bHLH_TS_ATOH1	basic helix-loop-helix (bHLH) domain found in protein atonal homolog 1 (ATOH1) and similar proteins. ATOH1, also termed Class A basic helix-loop-helix protein 14 (bHLHa14), or helix-loop-helix protein hATH-1 (hATH1), or Math1, or Cath1, is a proneural bHLH transcription factor that is essential for inner ear hair cell differentiation. It dimerizes with E47 and activates E-box (CANNTG) dependent transcription. ATOH1 is a mammalian homolog of the Drosophila melanogaster gene atonal and mouse atonal homolog 1 (Math1).	64
381557	cd19714	bHLH_TS_ATOH7	basic helix-loop-helix (bHLH) domain found in protein atonal homolog 7 (ATOH7) and similar proteins. ATOH7, also termed Class A basic helix-loop-helix protein 13 (bHLHa13), or helix-loop-helix protein hATH-5 (hATH5), or Math5, is a bHLH transcription factor involved in the differentiation of retinal ganglion cells.	69
381558	cd19715	bHLH_TS_amos_like	basic helix-loop-helix (bHLH) domain found in Drosophila melanogaster protein Amos and similar proteins. Amos, also termed absent MD neurons and olfactory sensilla protein, or reduced olfactory organs protein, or rough eye protein, is a bHLH transcription factor that promotes multiple dendritic neuron formation in the Drosophila peripheral nervous system.	64
381559	cd19716	bHLH_TS_NGN1_NeuroD3	basic helix-loop-helix (bHLH) domain found in neurogenin-1 (NGN-1) and similar proteins. NGN-1, also termed Class A basic helix-loop-helix protein 6 (bHLHa6), or neurogenic basic-helix-loop-helix protein, or neurogenic differentiation factor 3 (NeuroD3), is a neural-specific bHLH transcription factor involved in the initiation of neuronal differentiation. It activates transcription by binding to the E box (5'-CANNTG-3').	77
381560	cd19717	bHLH_TS_NGN2_ATOH4	basic helix-loop-helix (bHLH) domain found in neurogenin-2 (NGN-2) and similar proteins. NGN-2, also termed Class A basic helix-loop-helix protein 8 (bHLHa8), or protein atonal homolog 4 (ATOH4), is a neural-specific bHLH transcription factor required for sensory neurogenesis. It activates transcription by binding to the E box (5'-CANNTG-3').	69
381561	cd19718	bHLH_TS_NGN3_ATOH5	basic helix-loop-helix (bHLH) domain found in neurogenin-3 (NGN-3) and similar proteins. NGN-3, also termed Class A basic helix-loop-helix protein 7 (bHLHa7), or protein atonal homolog 5 (ATOH5), is a neural-specific bHLH transcription factor expressed in the developing central nervous system and the embryonic pancreas. It is involved in neurogenesis and plays an important role in spermatogenesis.	68
381562	cd19719	bHLH_TS_NeuroD1	basic helix-loop-helix (bHLH) domain found in neurogenic differentiation factor 1 (NeuroD1) and similar proteins. NeuroD1, also termed Class A basic helix-loop-helix protein 3 (bHLHa3), is a neuronal bHLH transcription factor involved in the development and maintenance of the endocrine pancreas and neuronal elements. It acts as an essential regulator of glutamatergic neuronal differentiation. Loss of NeuroD1 causes ataxia, cerebellar hypoplasia, sensorineural deafness, and severe retinal dystrophy in mice.	86
381563	cd19720	bHLH_TS_NeuroD2	basic helix-loop-helix (bHLH) domain found in neurogenic differentiation factor 2 (NeuroD2) and similar proteins. NeuroD2, also termed Class A basic helix-loop-helix protein 1 (bHLHa1), or NeuroD-related factor (NDRF), is a neuronal calcium-dependent bHLH transcription factor that induces neuronal differentiation and promotes neuronal survival. It plays a central role in thalamocortical synaptic maturation. NeuroD2 mediates calcium-dependent transcription activation by binding to E box-containing promoter.	93
381564	cd19721	bHLH_TS_NeuroD4_ATOH3	basic helix-loop-helix (bHLH) domain found in neurogenic differentiation factor 4 (NeuroD4) and similar proteins. NeuroD4, also termed Class A basic helix-loop-helix protein 4 (bHLHa4), or protein atonal homolog 3 (ATH-3), or Atoh3, or Math-3, is a bHLH transcriptional activator that mediates neuronal differentiation.	87
381565	cd19722	bHLH_TS_NeuroD6_ATOH2	basic helix-loop-helix (bHLH) domain found in neurogenic differentiation factor 6 (NeuroD6) and similar proteins. NeuroD6, also termed Class A basic helix-loop-helix protein 2 (bHLHa2), or protein atonal homolog 2 (ATH-2), or Atoh2, or Math2, or Nex1, is a neurogenic bHLH transcription factor involved in neuronal development, differentiation, and survival in Alzheimer's disease (AD) brains of both cohorts. It plays an integrative role in coordinating increase in mitochondrial mass with cytoskeletal remodeling, suggesting that it may act as a co-regulator of neuronal differentiation and energy metabolism.	70
381566	cd19723	bHLH_TS_ASCL1_like	basic helix-loop-helix (bHLH) domain found in Drosophila melanogaster achaete-scute complex (AS-C) proteins, achaete-scute-like proteins, ASCL1-2, and similar proteins. This subfamily includes Drosophila melanogaster AS-C proteins and two ASCL family transcription factors, ASCL-1 and ASCL-2. Drosophila melanogaster AS-C proteins include lethal of scute (also known as achaete-scute complex protein T3 or AST3), scute (also known as achaete-scute complex protein T4 or AST4), achaete (also known as achaete-scute complex protein T5 or AST5), and asense (also known as achaete-scute complex protein T8 or AST8). They are involved in the determination of the neuronal precursors in the peripheral nervous system and the central nervous system, as well as in sex determination and dosage compensation. ASCL-1, also termed Class A basic helix-loop-helix protein 46 (bHLHa46), or achaete-scute homolog 1 (ASH-1), or mammalian achaete scute homolog 1 (Mash1), is expressed in subsets of neural progenitors in both the central and peripheral nervous system. It plays a key role in neuronal differentiation and specification in the nervous system. ASCL-2, also termed achaete-scute homolog 2 (ASH-2), or Class A basic helix-loop-helix protein 45 (bHLHa45), or mammalian achaete scute homolog 2 (Mash2), is involved in Schwann cell differentiation and control of proliferation in adult peripheral nerves.	56
381567	cd19724	bHLH_TS_ASCL3_like	basic helix-loop-helix (bHLH) domain found in achaete-scute-like proteins, ASCL3-5, and similar proteins. This subfamily includes three ASCL family transcription factors, ASCL-3, ASCL-4 and ASCL-5. ASCL-3, also termed Class A basic helix-loop-helix protein 42 (bHLHa42), or bHLH transcriptional regulator Sgn-1, or achaete-scute homolog 3 (ASH-3), is specifically localized in the duct cells of the salivary glands. It may act as transcriptional repressor that inhibits myogenesis. ASCL-4, also termed Class A basic helix-loop-helix protein 44 (bHLHa44), or achaete-scute homolog 4 (ASH-4), or Hash4, may be involved in skin development. ASCL-5, also termed Class A basic helix-loop-helix protein 47 (bHLHa47), or achaete-scute homolog 5 (ASH-5), is an uncharacterized bHLH transcription factor that is closely related to ASCL-3 and ASCL-4.	56
381568	cd19725	bHLH_TS_OLIG2_like	basic helix-loop-helix (bHLH) domain found in oligodendrocyte transcription factors, Oligo2, Oligo3 and similar proteins. The family includes two bHLH transcription factors, Oligo2 and Oligo3. Oligo2, also termed Class B basic helix-loop-helix protein 1 (bHLHb1), or Class E basic helix-loop-helix protein 19 (bHLHe19), or protein kinase C-binding protein 2, or protein kinase C-binding protein RACK17, is required for oligodendrocyte and motor neuron specification in the spinal cord, as well as for the development of somatic motor neurons in the hindbrain. It cooperates with OLIG1 to establish the MN progenitors (pMN) domain of the embryonic neural tube. Oligo3, also termed Class B basic helix-loop-helix protein 7 (bHLHb7), or Class E basic helix-loop-helix protein 20 (bHLHe20), is expressed in the ventricular zone of the dorsal alar plate of the hindbrain and involved in regulating the development of dorsal and ventral spinal cord. It may determine the distinct specification program of class A neurons in the dorsal part of the spinal cord and suppress specification of class B neurons.	63
381569	cd19726	bHLH-PAS_cycle_like	basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in Drosophila melanogaster protein cycle and similar proteins. Protein cycle, also termed brain and muscle ARNT-like 1 (BMAL1), or MOP3, is a putative bHLH-PAS transcription factor involved in the generation of biological rhythms in Drosophila. It activates cycling transcription of Period (PER) and Timeless (TIM) by binding to the E-box (5'-CACGTG-3') present in their promoters.	62
381570	cd19727	bHLH-PAS_HIF1a_PASD8	basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in hypoxia-inducible factor 1-alpha (HIF1a) and similar proteins. HIF1a, also termed HIF-1-alpha, or HIF1-alpha, or ARNT-interacting protein, or Basic-helix-loop-helix-PAS protein MOP1, or Class E basic helix-loop-helix protein 78 (bHLHe78), or Member of PAS protein 1, or PAS domain-containing protein 8 (PASD8), functions as a master transcriptional regulator of the adaptive response to hypoxia.	71
381571	cd19728	bHLH-PAS_HIF2a_PASD2	basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in hypoxia-inducible factor 2-alpha (HIF2a) and similar proteins. HIF2a, also termed HIF-2-alpha, or HIF2-alpha, or endothelial PAS domain-containing protein 1 (EPAS-1), or Basic-helix-loop-helix-PAS protein MOP2, or Class E basic helix-loop-helix protein 73 (bHLHe73), or Member of PAS protein 2, or PAS domain-containing protein 2 (PASD2), or HIF-1-alpha-like factor (HLF), is a bHLH-PAS transcription factor involved in the induction of oxygen regulated genes.	66
381572	cd19729	bHLH-PAS_HIF3a_PASD7	basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in hypoxia-inducible factor 3-alpha (HIF3a) and similar proteins. HIF3a, also termed HIF-3-alpha, or HIF3-alpha, or endothelial PAS domain-containing protein 1 (EPAS-1), or Basic-helix-loop-helix-PAS protein MOP7, or Class E basic helix-loop-helix protein 17 (bHLHe17), or Member of PAS protein 7, or PAS domain-containing protein 7 (PASD7), or HIF3-alpha-1, or inhibitory PAS domain protein (IPAS), is a bHLH-PAS transcriptional regulator in adaptive response to low oxygen tension. It plays a role in the regulation of hypoxia-inducible gene expression.	63
381573	cd19730	bHLH-PAS_spineless_like	basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in Drosophila melanogaster protein spineless and similar proteins. Spineless is a bHLH-PAS transcription factor that plays an important role in fly morphogenesis. It is both necessary and sufficient for the formation of the ommatidial mosaic.	64
381574	cd19731	bHLH-PAS_NPAS1_PASD5	basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in neuronal PAS domain-containing protein 1 (NPAS1) and similar proteins. NPAS1, also termed neuronal PAS1, or Basic-helix-loop-helix-PAS protein MOP5, or Class E basic helix-loop-helix protein 11 (bHLHe11), or member of PAS protein 5, or PAS domain-containing protein 5 (PASD5), is a bHLH-PAS transcriptional repressor expressed in the central nervous system and involved in neuronal differentiation. It is active during late embryogenesis and postnatal development.	74
381575	cd19732	bHLH-PAS_NPAS3_PASD6	basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in neuronal PAS domain-containing protein 3 (NPAS3) and similar proteins. NPAS3, also termed neuronal PAS3, or Basic-helix-loop-helix-PAS protein MOP6, or Class E basic helix-loop-helix protein 12 (bHLHe12), or member of PAS protein 6, or PAS domain-containing protein 6 (PASD6), is a bHLH-PAS brain-enriched transcription factor that is involved in central nervous system development and neurogenesis. It is a replicated genetic risk factor for psychiatric disorders. Human chromosomal rearrangements that affect NPAS3 normal expression are associated with schizophrenia and mental retardation.	78
381576	cd19733	bHLH-PAS_trachealess_like	basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in Drosophila melanogaster protein trachealess and similar proteins. Protein trachealess is a bHLH-PAS transcription factor that acts as an inducer of tracheal cell fates in Drosophila. It is necessary for the development of the salivary gland duct and the posterior spiracles.	79
381577	cd19734	bHLH-PAS_CLOCK	basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in Circadian locomotor output cycles protein kaput (CLOCK) and similar proteins. CLOCK, also termed Class E basic helix-loop-helix protein 8 (bHLHe8), is a bHLH-PAS transcriptional activator which forms a core component of the circadian clock. It forms heterodimers with another bHLH-PAS protein, Brain-Muscle-Arnt-Like (also known as BMAL or ARNT3 or mop3), which regulates circadian rhythm. BMAL1-CLOCK heterodimer complex activates transcription from E-box (CANNTG) elements found in the promoter of circadian responsive genes.	61
381578	cd19735	bHLH-PAS_dCLOCK	basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in Drosophila melanogaster Circadian locomotor output cycles protein kaput (dCLOCK) and similar proteins. dCLOCK, also termed dPAS1, is a bHLH-PAS Circadian regulator that acts as a transcription factor and generates a rhythmic output with a period of about 24 hours.	80
381579	cd19736	bHLH-PAS_PASD1	basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in circadian clock protein PASD1. PASD1, also termed PAS domain-containing protein 1, is evolutionarily related to Circadian locomotor output cycles protein kaput (CLOCK) and functions as a suppressor of the biological clock that drives the daily circadian rhythms of cells throughout the body. Mammalian PASD1 doesn't harbor the bHLH-PAS domain and is not included in this family.	70
381580	cd19737	bHLH-PAS_NPAS2_PASD4	basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in neuronal PAS domain-containing protein 2 (NPAS2) and similar proteins. NPAS2, also termed neuronal PAS2, or basic-helix-loop-helix-PAS protein MOP4, or Class E basic helix-loop-helix protein 9 (bHLHe9), or member of PAS protein 4, or PAS domain-containing protein 4 (PASD4), is a bHLH-PAS transcriptional activator which forms a core component of the circadian clock.	77
381581	cd19738	bHLH-PAS_SIM1	basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in single-minded homolog 1 (SIM1) and similar proteins. SIM1, also termed Class E basic helix-loop-helix protein 14 (bHLHe14), is a bHLH-PAS transcription factor that may have pleiotropic effects during embryogenesis and in the adult.	71
381582	cd19739	bHLH-PAS_SIM2	basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in single-minded homolog 2 (SIM2) and similar proteins. SIM2, also termed Class E basic helix-loop-helix protein 15 (bHLHe15), is a bHLH-PAS transcription factor that may be a master gene of central nervous system (CNS) development in cooperation with ARNT. It may have pleiotropic effects in the tissues expressed during development.	74
381583	cd19740	bHLH-PAS_dSIM_like	basic helix-loop-helix-Per-ARNT-Sim (bHLH-PAS) domain found in Drosophila melanogaster protein single-minded (SIM) and similar proteins. SIM is a nuclear bHLH-PAS transcription factor that functions as a master developmental regulator controlling midline development of the ventral nerve cord in Drosophila.	62
381584	cd19741	bHLH-O_ESMB_like	basic helix-loop-helix-orange (bHLH-O) domain found in Drosophila melanogaster enhancer of split mbeta protein (ESMB) and similar proteins. ESMB, also termed E(spl)mbeta, or HLH-mbeta, or split locus enhancer protein mA, is a bHLH-O transcriptional repressor of genes that require a bHLH protein for their transcription. It is involved in the neural-epidermal lineage decision during early neurogenesis. The family also includes Enhancer of split m7 protein (also known as E(spl)m7), which acts as a transcriptional repressor that participates in the control of cell fate choice by uncommitted neuroectodermal cells in the embryo.	69
381585	cd19742	bHLH_TS_ASCL1_Mash1	basic helix-loop-helix (bHLH) domain found in achaete-scute-like protein 1 (ASCL-1) and similar proteins. ASCL-1, also termed Class A basic helix-loop-helix protein 46 (bHLHa46), or achaete-scute homolog 1 (ASH-1), or mammalian achaete scute homolog 1 (Mash1), is a neural-specific bHLH transcription factor that is expressed in subsets of neural progenitors in both the central and peripheral nervous system. It plays a key role in neuronal differentiation and specification in the nervous system.	71
381586	cd19743	bHLH_TS_ASCL2_Mash2	basic helix-loop-helix (bHLH) domain found in achaete-scute-like protein 2 (ASCL-2) and similar proteins. ASCL-2, also termed achaete-scute homolog 2 (ASH-2), or Class A basic helix-loop-helix protein 45 (bHLHa45), or mammalian achaete scute homolog 2 (Mash2), is a bHLH transcription factor that is involved in Schwann cell differentiation and control of proliferation in adult peripheral nerves.	64
381587	cd19744	bHLH_TS_dAS-C_like	basic helix-loop-helix (bHLH) domain found in Drosophila melanogaster achaete-scute complex (AS-C) proteins and similar proteins. Drosophila melanogaster AS-C proteins include lethal of scute (also known as achaete-scute complex protein T3 or AST3), scute (also known as achaete-scute complex protein T4 or AST4), achaete (also known as achaete-scute complex protein T5 or AST5), and asense (also known as achaete-scute complex protein T8 or AST8). They are involved in the determination of the neuronal precursors in the peripheral nervous system and the central nervous system, as well as in sex determination and dosage compensation.	67
381588	cd19745	bHLH_TS_ASCL3	basic helix-loop-helix (bHLH) domain found in achaete-scute-like protein 3 (ASCL-3) and similar proteins. ASCL-3, also termed Class A basic helix-loop-helix protein 42 (bHLHa42), or bHLH transcriptional regulator Sgn-1, or achaete-scute homolog 3 (ASH-3), is a bHLH transcription factor specifically localized in the duct cells of the salivary glands. It may act as transcriptional repressor that inhibits myogenesis.	59
381589	cd19746	bHLH_TS_ASCL4	basic helix-loop-helix (bHLH) domain found in achaete-scute-like protein 4 (ASCL-4) and similar proteins. ASCL-4, also termed Class A basic helix-loop-helix protein 44 (bHLHa44), or achaete-scute homolog 4 (ASH-4), or Hash4, is a bHLH transcriptional regulator that may be involved in skin development.	64
381590	cd19747	bHLH_TS_ASCL5	basic helix-loop-helix (bHLH) domain found in achaete-scute-like protein 5 (ASCL-5) and similar proteins. ASCL-5, also termed Class A basic helix-loop-helix protein 47 (bHLHa47), or achaete-scute homolog 5 (ASH-5), is an uncharacterized bHLH transcription factor that belongs to AS-C (achaete, scute, lethal of scute, and asense) family.	61
381591	cd19748	bHLH-O_HEY1	basic helix-loop-helix-orange (bHLH-O) domain found in hairy/enhancer-of-split related with YRPW motif protein 1 (HEY1) and similar proteins. HEY1, also termed cardiovascular helix-loop-helix factor 2 (CHF-2), or Class B basic helix-loop-helix protein 31 (bHLHb31), or HES-related repressor protein 1, or hairy and enhancer of split-related protein 1 (HESR-1), or hairy-related transcription factor 1 (HRT-1), is a bHLH-O transcriptional repressor that acts as an essential downstream effector of the Notch signaling pathway and may play a fundamental role in vascular development. HEY1 also participates in several cancer-related pathways. It acts as a positive regulator of the tumor suppressor p53.	71
381592	cd19749	bHLH-O_DEC1	basic helix-loop-helix-orange (bHLH-O) domain found in differentially expressed in chondrocytes protein 1 (DEC1) and similar proteins. DEC1, also termed Class E basic helix-loop-helix protein 40 (bHLHe40), or Class B basic helix-loop-helix protein 2 (bHLHb2), or enhancer-of-split and hairy-related protein 2 (SHARP-2), or stimulated by retinoic acid gene 13 protein (STRA13), is a bHLH-O transcriptional repressor involved in the regulation of the circadian rhythm by negatively regulating the activity of the clock genes and clock-controlled genes.	90
381593	cd19750	bHLH-O_DEC2	basic helix-loop-helix-orange (bHLH-O) domain found in differentially expressed in chondrocytes protein 2 (DEC2) and similar proteins. DEC2, also termed Class E basic helix-loop-helix protein 41 (bHLHe41), or Class B basic helix-loop-helix protein 3 (bHLHb3), or enhancer-of-split and hairy-related protein 1 (SHARP-1), is a bHLH-O transcriptional repressor involved in the regulation of the circadian rhythm by negatively regulating the activity of the clock genes and clock-controlled genes.	92
410992	cd19751	5TM_YidC_Oxa1_Alb3	Five transmembrane core domain of YidC/Oxa1/Alb3 protein family of insertases. The YidC/Oxa1/Alb3 protein family of insertases facilitate the insertion, folding and assembly of proteins of the inner membranes of bacteria and mitochondria, and the thylakoid membrane of plastids. Members include bacterial YidC, mitochondrial Cox18 and Oxa1, and chloroplastic Alb3 and Alb4. Membrane protein insertase YidC, also called foldase YidC or membrane integrase YidC, facilitates proper folding, insertion, and assembly of inner membrane proteins and complexes. Oxa1 and Cox18/Oxa2 mediate the insertion of both mitochondrion-encoded precursors and nuclear-encoded proteins from the matrix into the mitochondrial inner membrane. Alb3 and Alb3-like proteins, including Alb4, are required for the post-translational insertion of the light-harvesting chlorophyll-binding proteins (LHCPs) into the chloroplast thylakoid membrane. YidC/Oxa1/Alb3 family insertases contain a core domain of five transmembrane (5TM) segments that is essential to insertase function.	189
381391	cd19752	AKR_unchar	uncharacterized aldo-keto reductase (AKR) superfamily protein. This family includes a group of uncharacterized AKR superfamily proteins. Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. AKRs are present in all phyla and are of importance in both health and industrial applications.	291
381293	cd19753	Mb-like_oxidoreductase	Globin domain of uncharacterized oxidoreductases containing a FAD/NADH binding domain. This subfamily is composed of uncharacterized proteins containing an N-terminal myoglobin-like (M family globin) domain and a C-terminal oxygenase reductase FAD/NADH binding domain belonging to the ferredoxin reductase (FNR) family and is usually part of multi-component bacterial oxygenases which oxidize hydrocarbons using oxygen as the oxidant. The domain architecture of this subfamily is similar to flavohemoglobins, which function primarily as nitric oxide dioxygenases (NODs, EC 1.14.12.17), converting NO and O2 to inert NO3- (nitrate). They protect from nitrosative stress (the broad range of cellular toxicities caused by NO), and modulate NO signaling pathways. NO scavenging by flavoHb attenuates the expression of the nitrosative stress response, affects the swarming behavior of Escherichia coli, and maintains squid-Vibrio fischeri and Medicago truncatula-Sinorhizobium meliloti symbioses.	121
381294	cd19754	FHb_fungal-globin	Globin domain of fungal flavohemoglobin. FlavoHbs function primarily as nitric oxide dioxygenases (NODs, EC 1.14.12.17), converting NO and O2 to inert NO3- (nitrate). They have an N-terminal globin domain and a C-terminal ferredoxin reductase-like NAD- and FAD-binding domain, and use the reducing power of cellular NAD(P)H to drive regeneration of the ferrous heme. They protect from nitrosative stress (the broad range of cellular toxicities caused by NO), and modulate NO signaling pathways. NO scavenging by flavoHb attenuates the expression of the nitrosative stress response, affects the swarming behavior of Escherichia coli, and maintains squid-Vibrio fischeri and Medicago truncatula-Sinorhizobium meliloti symbioses. FlavoHb expression affects Aspergillus nidulans sexual development and mycotoxin production, and Dictyostelium discoideum development.	141
381295	cd19755	TrHb2_AtGlb3-like_O	nonsymbiotic haemoglobin Ahb3 (GLB3) and similar truncated hemoglobins, group 2 (O). The M- and S families exhibit the canonical secondary structure of hemoglobins, a 3-over-3 alpha-helical sandwich structure (3/3 Mb-fold), built by eight alpha-helical segments. Truncated hemoglobins (TrHbs, 2/2Hb, or 2/2 globins) or the T family globins adopt a 2-on-2 alpha-helical sandwich structure, resulting from extensive and complex modifications of the canonical 3-on-3 alpha-helical sandwich that are distributed throughout the whole protein molecule. TrHbs are classified into three main groups based on their structural properties and named after Mycobacterium sp. genes glbN, glbO, and glbP: TrHb1s (N), TrHb2s (O) and TrHb3s (P). This subfamily includes the dimeric Arabidopsis thaliana TrHb2 AtGLB3. GLB3 is likely to have a function distinct from other plant globins: it exhibits a low O2 affinity, an unusual concentration-independent binding of O2 and CO, and does not respond to any of the treatments that induce plant 3-on-3 globins.	119
380814	cd19756	Bbox2	B-box-type 2 zinc finger (Bbox2). The B-box-type zinc finger is a short zinc binding domain of around 40 amino acid residues in length. It has been found in transcription factors, ribonucleoproteins and proto-oncoproteins, such as in TRIM (tripartite motif) proteins that consist of an N-terminal RING finger (originally called an A-box), followed by 1-2 B-box domains and a coiled-coil domain (also called RBCC for Ring, B-box, Coiled-Coil). The B-box-type zinc finger often presents in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain, in functionally unrelated proteins, most likely mediating protein-protein interactions. Based on different consensus sequences and the spacing of the 7-8 zinc-binding residues, the B-box-type zinc fingers can be divided into two groups, type 1 (Bbox1: C6H2) and type 2 (Bbox2: CHC3H2). This family corresponds to the type 2 B-box (Bbox2).	39
380815	cd19757	Bbox1	B-box-type 1 zinc finger (Bbox1). The B-box-type zinc finger is a short zinc binding domain of around 40 amino acid residues in length. It has been found in transcription factors, ribonucleoproteins and proto-oncoproteins, such as in TRIM (tripartite motif) proteins that consist of an N-terminal RING finger (originally called an A-box), followed by 1-2 B-box domains and a coiled-coil domain (also called RBCC for Ring, B-box, Coiled-Coil). The B-box-type zinc finger often presents in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain, in functionally unrelated proteins, most likely mediating protein-protein interactions. Based on different consensus sequences and the spacing of the 7-8 zinc-binding residues, the B-box-type zinc fingers can be divided into two groups, type 1 (Bbox1: C6H2) and type 2 (Bbox2: CHC3H2). This family corresponds to the type 1 B-box (Bbox1).	44
380816	cd19758	Bbox2_MID	B-box-type 2 zinc finger found in midline (MID) family. The MID family includes MID1 and MID2. MID1, also known as midin, midline 1 RING finger protein, putative transcription factor XPRF, RING finger protein 59 (RNF59), or tripartite motif-containing protein 18 (TRIM18), is a microtubule-associated E3 ubiquitin-protein ligase implicated in epithelial-mesenchymal differentiation, cell migration and adhesion, and programmed cell death along specific regions of the ventral midline during embryogenesis. MID2, also known as midin-2, midline defect 2, RING finger protein 60 (RNF60), or tripartite motif-containing protein 1 (TRIM1), is highly related to MID1. It associates with the microtubule network and may at least partially compensate for the loss of MID1. Both MID1 and MID2 interact with Alpha 4, which is a regulatory subunit of PP2-type phosphatases, such as PP2A, and an integral component of the rapamycin-sensitive signaling pathway. They also play a central role in the regulation of granule exocytosis, and functional redundancy exists between MID1 and MID2 in cytotoxic lymphocytes (CTL). Both MID1 and MID2 belong to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, a fibronectin type III (FN3) domain, and a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	40
380817	cd19759	Bbox2_TRIM2-like	B-box-type 2 zinc finger  found in tripartite motif-containing protein TRIM2, TRIM3, and similar proteins. TRIM2, also known as RING finger protein 86 (RNF86), is an E3 ubiquitin-protein ligase that ubiquitinates the neurofilament light chain, a component of the intermediate filament in axons. Loss of function of TRIM2 results in early-onset axonal neuropathy. TRIM3, also known as brain-expressed RING finger protein (BERP), RING finger protein 97 (RNF97), or RING finger protein 22 (RNF22), is an E3 ubiquitin-protein ligase involved in the pathogenesis of various cancers. It also plays an important role in the central nervous system (CNS). In addition, TRIM3 may be involved in vesicular trafficking via its association with the cytoskeleton-associated-recycling or transport (CART) complex that is necessary for efficient transferrin receptor recycling, but not for epidermal growth factor receptor (EGFR) degradation. Both TRIM2 and TRIM3 belong to the C-VII subclass of TRIM (tripartite motif)-NHL family that is defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil domain, as well as a NHL (named after proteins NCL-1, HT2A and Lin-41 that contain repeats folded into a six-bladed beta propeller) repeat domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	42
380818	cd19760	Bbox2_TRIM4-like	B-box-type 2 zinc finger found in tripartite motif-containing proteins, TRIM4, TRIM17, TRIM41, TRIM52 and similar proteins. This family includes a group of tripartite motif-containing proteins, including TRIM4, TRIM17, TRIM41 and TRIM52. TRIM4, also known as RING finger protein 87 (RNF87), is a cytoplasmic E3 ubiquitin-protein ligase that recently evolved and is present only in higher mammals. It transiently interacts with mitochondria, induces mitochondrial aggregation and sensitizes the cells to hydrogen peroxide (H2O2) induced death. Its interaction with peroxiredoxin 1 (PRX1) is critical for the regulation of H2O2 induced cell death. Moreover, TRIM4 functions as a positive regulator of RIG-I-mediated type I interferon induction. It regulates the K63-linked ubiquitination of RIG-I and assembly of antiviral signaling complex at mitochondria. TRIM17, also known as RING finger protein 16 (RNF16) or testis RING finger protein (Terf), is a crucial E3 ubiquitin ligase that is necessary and sufficient for neuronal apoptosis and contributes to Mcl-1 ubiquitination in cerebellar granule neurons (CGNs). It interacts in a SUMO-dependent manner with nuclear factor of activated T cell NFATc3 transcription factor, and thus inhibits the activity of NFATc3 by preventing its nuclear localization. In contrast, it binds to and inhibits NFATc4 transcription factor in a SUMO-independent manner. Moreover, TRIM17 stimulates degradation of kinetochore protein ZW10 interacting protein (ZWINT), a known component of the kinetochore complex required for mitotic spindle checkpoint, and negatively regulates cell proliferation. TRIM41, also known as RING finger-interacting protein with C kinase (RINCK), is an E3 ubiquitin-protein ligase that promotes the ubiquitination of protein kinase C (PKC) isozymes in cells. It specifically recognizes the C1 domain of PKC isozymes. It controls the amplitude of PKC signaling by controlling the amount of PKC in the cell. 
TRIM52, also known as RING finger protein 102 (RNF102), is encoded by a novel, noncanonical antiviral TRIM52 gene in primate genomes with unique specificity determined by the rapidly evolving RING domain. TRIM4, TRIM17 and TRIM41 belong to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. In contrast, TRIM52 lacks the putative viral recognition SPRY/B30.2 domain, and thus has been classified to the C-V subclass of TRIM family that contains only RBCC domains. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	39
380819	cd19761	Bbox2_TRIM5-like	B-box-type 2 zinc finger  found in tripartite motif-containing proteins, TRIM5, TRIM6, TRIM22, TRIM34, TRIM38 and similar proteins. The family includes TRIM5, TRIM6, TRIM22, TRIM34, and TRIM38, all of which belong to the C-IV subclass of the TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. TRIM5, also termed RING finger protein 88 (RNF88), is a capsid-specific restriction factor that prevents infection from non-host-adapted retroviruses in a species-specific manner by binding to and destabilizing the retroviral capsid lattice before reverse transcription is completed. Its retroviral restriction activity correlates with the ability to activate TAK1-dependent innate immune signaling. TRIM5 also acts as a pattern recognition receptor that activates innate immune signaling in response to the retroviral capsid lattice. Moreover, TRIM5 plays a role in regulating autophagy through activation of autophagy regulator BECN1 by causing its dissociation from its inhibitors BCL2 and TAB2. It also plays a role in autophagy by acting as a selective autophagy receptor which recognizes and targets HIV-1 capsid protein p24 for autophagic destruction. TRIM6, also termed RING finger protein 89 (RNF89), is an E3-ubiquitin ligase that cooperates with the E2-ubiquitin conjugase UbE2K to catalyze the synthesis of unanchored K48-linked polyubiquitin chains, and further stimulates the interferon-I kappa B kinase epsilon (IKKepsilon) kinase-mediated antiviral response. 
It also regulates the transcriptional activity of Myc during the maintenance of embryonic stem (ES) cell pluripotency, and may act as a novel regulator for Myc-mediated transcription in ES cells. TRIM22, also termed 50 kDa-stimulated trans-acting factor (Staf-50), or RING finger protein 94 (RNF94), is an E3 ubiquitin-protein ligase that plays an integral role in the host innate immune response to viruses. It has been shown to inhibit the replication of a number of viruses, including HIV-1, hepatitis B, and influenza A. TRIM22 acts as a suppressor of basal HIV-1 long terminal repeat (LTR)-driven transcription by preventing transcription factor specificity protein 1 (Sp1) binding to the HIV-1 promoter. It also controls FoxO4 activity and cell survival by directing Toll-like receptor 3 (TLR3)-stimulated cells toward type I interferon (IFN) gene induction or apoptosis. Moreover, TRIM22 can activate the noncanonical nuclear factor-kappaB (NF-kappaB) pathway by activating I kappa B kinase alpha (IKKalpha). It also regulates nucleotide binding oligomerization domain containing 2 (NOD2)-dependent activation of interferon-beta signaling and nuclear factor-kappaB. TRIM34, also termed interferon-responsive finger protein 1, or RING finger protein 21 (RNF21), may function as an antiviral protein that contributes to the defense against retroviral infections. TRIM38, also known as RING finger protein 15 (RNF15) or zinc finger protein RoRet, is an E3 ubiquitin-protein ligase that promotes K63- and K48-linked ubiquitination of cellular proteins and also catalyzes self-ubiquitination. It negatively regulates tumor necrosis factor alpha (TNF-alpha) and interleukin-1beta-triggered nuclear factor-kappaB (NF-kappaB) activation by mediating lysosomal-dependent degradation of transforming growth factor beta (TGFbeta)-activated kinase 1 (TAK1)-binding protein (TAB)2/3, two critical components of the TAK1 kinase complex. 
It also inhibits TLR3/4-mediated activation of NF-kappaB and interferon regulatory factor 3 (IRF3) by mediating ubiquitin-proteasomal degradation of TNF receptor-associated factor 6 (Traf6) and NAK-associated protein 1 (Nap1), respectively. Moreover, TRIM38 negatively regulates TLR3-mediated interferon beta (IFN-beta) signaling by targeting ubiquitin-proteasomal degradation of TIR domain-containing adaptor inducing IFN-beta (TRIF). It functions as a valid target for autoantibodies in primary Sjogren's Syndrome.	40
380820	cd19762	Bbox2_TRIM7-like	B-box-type 2 zinc finger found in tripartite motif-containing proteins TRIM7, TRIM27 and similar proteins. The family includes TRIM7 and TRIM27, both of which belong to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. TRIM7, also known as glycogenin-interacting protein (GNIP) or RING finger protein 90 (RNF90), is an E3 ubiquitin-protein ligase that mediates c-Jun/AP-1 activation by Ras signalling. Its phosphorylation and activation by MSK1 in response to direct activation by the Ras-Raf-MEK-ERK pathway can stimulate TRIM7 E3 ubiquitin ligase activity in mediating Lys63-linked ubiquitination of the AP-1 coactivator RACO-1, leading to RACO-1 protein stabilization. Moreover, TRIM7 binds and activates glycogenin, the self-glucosylating initiator of glycogen biosynthesis. TRIM27, also termed RING finger protein 76 (RNF76), or RET finger protein (RFP), or zinc finger protein RFP, is a nuclear E3 ubiquitin-protein ligase that is highly expressed in testis and in various tumor cell lines. Expression of TRIM27 is associated with prognosis of colon and endometrial cancers. TRIM27 was first identified as a fusion partner of the RET receptor tyrosine kinase. It functions as a transcriptional repressor and associates with several proteins involved in transcriptional activity, such as enhancer of polycomb 1 (Epc1), a member of the Polycomb group proteins, and Mi-2beta, a main component of the nucleosome remodeling and deacetylase (NuRD) complex, and the cell cycle regulator retinoblastoma protein (RB1). 
It also interacts with HDAC1, leading to downregulation of thioredoxin binding protein 2 (TBP-2), which inhibits the function of thioredoxin. Moreover, TRIM27 mediates Pax7-induced ubiquitination of MyoD in skeletal muscle atrophy. It also inhibits muscle differentiation by modulating serum response factor (SRF) and Epc1. Furthermore, TRIM27 promotes non-canonical polyubiquitination of PTEN, a lipid phosphatase that catalyzes the conversion of PtdIns(3,4,5)P3 (PIP3) to PtdIns(4,5)P2 (PIP2). It is an IKKepsilon-interacting protein that regulates IkappaB kinase (IKK) function and negatively regulates signaling involved in the antiviral response and inflammation. In addition, TRIM27 forms a protein complex with MBD4 or MBD2 or MBD3, and thus plays an important role in the enhancement of transcriptional repression through MBD proteins in tumorigenesis, spermatogenesis, and embryogenesis. It is also a component of an estrogen receptor 1 (ESR1) regulatory complex, and is involved in estrogen receptor-mediated transcription in MCF-7 cells. Meanwhile, TRIM27 interacts with the hinge region of structural maintenance of chromosomes protein 3 (SMC3), a component of the multimeric cohesin complex that holds sister chromatids together and prevents their premature separation during mitosis.	44
380821	cd19763	Bbox2_TRIM8_C-V	B-box-type 2 zinc finger  found in tripartite motif-containing protein 8 (TRIM8) and similar proteins. TRIM8, also known as glioblastoma-expressed RING finger protein (GERP) or RING finger protein 27 (RNF27), is a probable E3 ubiquitin-protein ligase that may promote proteasomal degradation of suppressor of cytokine signaling 1 (SOCS1) and further regulate interferon-gamma signaling. It functions as a new p53 modulator that stabilizes p53, impairing its association with MDM2 and inducing the reduction of cell proliferation. TRIM8 deficit dramatically impairs p53 stabilization and activation in response to chemotherapeutic drugs. TRIM8 also modulates tumor necrosis factor-alpha (TNFalpha) and interleukin-1beta (IL-1beta)-triggered nuclear factor-kappaB (NF-kappa B) activation by targeting transforming growth factor beta (TGFbeta) activated kinase 1 (TAK1) for K63-linked polyubiquitination. Moreover, TRIM8 modulates translocation of phosphorylated STAT3 into the nucleus through interaction with Hsp90beta and consequently regulates transcription of Nanog in embryonic stem cells. It also interacts with protein inhibitor of activated STAT3 (PIAS3), which inhibits IL-6-dependent activation of STAT3. TRIM8 belongs to the C-V subclass of nuclear TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil domain, as well as an uncharacterized region positioned C-terminal to the RBCC domain. The coiled coil domain is required for homodimerization and the region immediately C-terminal to the RING motif is sufficient to mediate the interaction with SOCS1. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	41
380822	cd19764	Bbox2_TRIM9-like	B-box-type 2 zinc finger  found in tripartite motif-containing proteins, TRIM9, TRIM67 and similar proteins. This family includes a group of tripartite motif-containing proteins including TRIM9 and TRIM67, both of which belong to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, consisting of three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, a fibronectin type III (FN3) domain, and a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. TRIM9 (the human ortholog of rat Spring), also known as RING finger protein 91 (RNF91), is a brain-specific E3 ubiquitin-protein ligase collaborating with an E2 ubiquitin conjugating enzyme UBCH5b. It plays an important role in the regulation of neuronal functions and participates in neurodegenerative disorders through its ligase activity. TRIM67, also termed TRIM9-like protein (TNL), is a protein selectively expressed in the cerebellum. It interacts with PRG-1, an important molecule in the control of hippocampal excitability dependent on presynaptic LPA2 receptor signaling, and 80K-H (also known as glucosidase II beta), a protein kinase C substrate. It negatively regulates Ras signaling in cell proliferation via degradation of 80K-H, leading to neural differentiation including neuritogenesis.	39
380823	cd19765	Bbox2_TRIM10-like	B-box-type 2 zinc finger  found in tripartite motif-containing proteins, TRIM10, TRIM15, TRIM26, TRIM31 and similar proteins. This family includes TRIM10, TRIM15, TRIM26 and TRIM31. TRIM10, also known as B30-RING finger protein (RFB30), RING finger protein 9 (RNF9), or hematopoietic RING finger 1 (HERF1), is a novel hematopoiesis-specific RING finger protein required for terminal differentiation of erythroid cells. TRIM15, also termed RING finger protein 93 (RNF93), or zinc finger protein 178 (ZNF178), or zinc finger protein B7 (ZNFB7), is a focal adhesion protein that regulates focal adhesion disassembly. It localizes to focal contacts in a myosin-II-independent manner by an interaction between its coiled-coil domain and the LD2 motif of paxillin. TRIM15 can also associate with coronin 1B, cortactin, filamin binding LIM protein1, and vasodilator-stimulated phosphoprotein, which are involved in actin cytoskeleton dynamics. As an additional component of the integrin adhesome, it regulates focal adhesion turnover and cell migration. TRIM26, also known as acid finger protein (AFP), RING finger protein 95 (RNF95), or zinc finger protein 173 (ZNF173), is an E3 ubiquitin-protein ligase that negatively regulates interferon-beta production and antiviral response through polyubiquitination and degradation of nuclear transcription factor IRF3. It functions as an important regulator for RNA virus-triggered innate immune response by bridging TBK1 to NEMO (NF-kappaB essential modulator, also known as IKKgamma) and mediating TBK1 activation. It also acts as a novel tumor suppressor of hepatocellular carcinoma by regulating cancer cell proliferation, colony forming ability, migration, and invasion. TRIM31 is an E3 ubiquitin-protein ligase that primarily localizes to the cytoplasm, but is also associated with the mitochondria. 
It can negatively regulate cell proliferation and may be a potential biomarker of gastric cancer as it is overexpressed from the early stage of gastric carcinogenesis. TRIM31 is downregulated in non-small cell lung cancer and serves as a potential tumor suppressor. It interacts with p52 (Shc) and inhibits Src-induced anchorage-independent growth. TRIM10, TRIM15 and TRIM26 belong to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. TRIM31 belongs to the C-V subclass of TRIM family of proteins. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	39
380824	cd19766	Bbox2_TRIM11_C-IV	B-box-type 2 zinc finger found in tripartite motif-containing protein 11 (TRIM11) and similar proteins. TRIM11, also known as protein BIA1, or RING finger protein 92 (RNF92), is an E3 ubiquitin-protein ligase involved in the development of the central nervous system. It is overexpressed in high-grade gliomas and promotes proliferation, invasion, migration and glial tumor growth. TRIM11 acts as a potential therapeutic target for congenital central hypoventilation syndrome (CCHS) through mediating the degradation of CCHS-associated polyalanine-expanded Phox2b. TRIM11 modulates the function of neurogenic transcription factor Pax6 through the ubiquitin-proteasome system, and thus plays an essential role in Pax6-dependent neurogenesis. It also binds to and destabilizes a key component of the activator-mediated cofactor complex (ARC105), humanin, a neuroprotective peptide against Alzheimer's disease-relevant insults, and further regulates ARC105 function in transforming growth factor beta (TGFbeta) signaling. Moreover, TRIM11 negatively regulates retinoic acid-inducible gene-I (RIG-I)-mediated interferon-beta (IFNbeta) production and antiviral activity by targeting TANK-binding kinase-1 (TBK1). It may contribute to the endogenous restriction of retroviruses in cells. It enhances N-tropic murine leukemia virus (N-MLV) entry by interfering with Ref1 restriction. It also suppresses the early steps of human immunodeficiency virus HIV-1 transduction, resulting in decreased reverse transcripts. TRIM11 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil region, as well as a SPRY/B30.2 domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	44
380825	cd19767	Bbox2_TRIM13_C-XI	B-box-type 2 zinc finger found in tripartite motif-containing protein 13 (TRIM13) and similar proteins. TRIM13, also known as B-cell chronic lymphocytic leukemia tumor suppressor Leu5, leukemia-associated protein 5, putative tumor suppressor RFP2, RING finger protein 77 (RNF77), or Ret finger protein 2, is an endoplasmic reticulum (ER) membrane-anchored E3 ubiquitin-protein ligase that interacts with proteins localized to the ER, including valosin-containing protein (VCP), a protein indispensable for ER-associated degradation (ERAD). It also targets the known ER proteolytic substrate CD3-delta, but not the N-end rule substrate Ub-R-YFP (yellow fluorescent protein) for its degradation. Moreover, TRIM13 regulates ubiquitination and degradation of NEMO to suppress tumor necrosis factor (TNF) induced nuclear factor-kappaB (NF-kappaB) activation. It is also involved in NF-kappaB p65 activation and nuclear factor of activated T-cells (NFAT)-dependent activation of c-Rel upon T-cell receptor engagement. Furthermore, TRIM13 negatively regulates melanoma differentiation-associated gene 5 (MDA5)-mediated type I interferon production. It also regulates caspase-8 ubiquitination, translocation to autophagosomes, and activation during ER stress induced cell death. Meanwhile, TRIM13 enhances ionizing radiation-induced apoptosis by increasing p53 stability and decreasing AKT kinase activity through MDM2 and AKT degradation. TRIM13 belongs to the C-XI subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil region. In addition, TRIM13 contains a C-terminal transmembrane domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	42
380826	cd19768	Bbox2_TRIM14	B-box-type 2 zinc finger  found in tripartite motif-containing protein 14 (TRIM14) and similar proteins. TRIM14 is a mitochondrial adaptor that facilitates innate immune signaling. It also plays a critical role in tumor development. TRIM14 belongs to an unclassified TRIM (tripartite motif) family of proteins that do not have RING fingers and thus lack the characteristic tripartite (RING (R), B-box, and coiled coil (CC)) RBCC motif. It contains a Bbox2 zinc finger as well as a C-terminal SPRY/B30.2 domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	44
380827	cd19769	Bbox2_TRIM16-like	B-box-type 2 zinc finger  found in tripartite motif-containing proteins, TRIM16, TRIM29, TRIM47 and similar proteins. This family includes a group of tripartite motif-containing proteins, such as TRIM16, TRIM29 and TRIM47. TRIM16, also termed estrogen-responsive B box protein (EBBP), is a regulator that may play a role in the regulation of keratinocyte differentiation. It may also act as a tumor suppressor through affecting cell proliferation and migration or tumorigenicity in carcinogenesis. TRIM29, also termed ataxia telangiectasia group D-associated protein (ATDC), plays a crucial role in the regulation of macrophage activation in response to viral or bacterial infections within the respiratory tract. TRIM47, also known as gene overexpressed in astrocytoma protein (GOA) or RING finger protein 100 (RNF100), plays an important role in the process of dedifferentiation that is associated with astrocytoma tumorigenesis. TRIM16 and TRIM29 belong to an unclassified TRIM (tripartite motif) family of proteins that do not have RING fingers and thus lack the characteristic tripartite (RING (R), B-box, and coiled coil (CC)) RBCC motif. TRIM47 belongs to the C-IV subclass of TRIM family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and two coiled coil domains, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	46
380828	cd19770	Bbox2_TRIM19_C-V	B-box-type 2 zinc finger found in tripartite motif-containing protein 19, also called promyelocytic leukemia protein (PML), and similar proteins. Protein PML, also known as RING finger protein 71 (RNF71) or tripartite motif-containing protein 19 (TRIM19), is predominantly a nuclear protein with a broad intrinsic antiviral activity. It is the eponymous component of PML nuclear bodies (PML NBs) and has been implicated in a wide variety of cell processes, including DNA damage signaling, apoptosis, and transcription. PML interferes with the replication of many unrelated viruses, including human immunodeficiency virus 1 (HIV-1), human foamy virus (HFV), poliovirus, influenza virus, rabies virus, EMCV, adeno-associated virus (AAV), and vesicular stomatitis virus (VSV). It also selectively interacts with misfolded proteins through distinct substrate recognition sites and conjugates these proteins with the small ubiquitin-like modifiers (SUMOs) through its SUMO ligase activity. PML belongs to the C-V subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as an uncharacterized region positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	50
380829	cd19771	Bbox2_TRIM20	B-box-type 2 zinc finger  found in tripartite motif-containing protein TRIM20 and similar proteins. TRIM20, also termed Pyrin, or Marenostrin (MEFV), is involved in the regulation of innate immunity and the inflammatory response in response to IFNG/IFN-gamma. TRIM20 belongs to unclassified TRIM family of proteins that do not have RING fingers and thus lack the characteristic tripartite (RING (R), B-box, and coiled coil (CC)) RBCC motif. It contains a pyrin domain, a Bbox2 zinc finger, and a C-terminal SPRY/B30.2 domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	39
380830	cd19772	Bbox2_TRIM21_C-IV	B-box-type 2 zinc finger  found in tripartite motif-containing protein 21 (TRIM21) and similar proteins. TRIM21, also known as 52 kDa Ro protein, 52 kDa ribonucleoprotein autoantigen Ro/SS-A, Ro(SS-A), RING finger protein 81 (RNF81), or Sjoegren's syndrome type A antigen (SS-A), is a ubiquitously expressed E3 ubiquitin-protein ligase and a high affinity antibody receptor uniquely expressed in the cytosol of mammalian cells. As a cytosolic Fc receptor, TRIM21 binds the Fc of virus-associated antibodies and targets the complex in the cytosol for proteasomal degradation in a process known as antibody-dependent intracellular neutralization (ADIN), and provides an intracellular immune response to protect host defense against pathogen infection. It shows remarkably broad isotype specificity as it does not only bind IgG, but also IgM and IgA. Moreover, TRIM21 promotes the cytosolic DNA sensor cGAS and the cytosolic RNA sensor RIG-I sensing of viral genomes during infection by antibody-opsonized virus. It stimulates inflammatory signaling and activates innate transcription factors, such as nuclear factor-kappaB (NF-kappaB). TRIM21 also plays an essential role in p62-regulated redox homeostasis, suggesting a viable target for treating pathological conditions resulting from oxidative damage. Furthermore, TRIM21 may have implications for various autoimmune diseases associated with uncontrolled antiviral signaling through the regulation of Nmi-IFI35 complex-mediated inhibition of innate antiviral response. TRIM21 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	40
380831	cd19773	Bbox2_TRIM23_C-IX_rpt1	first B-box-type 2 zinc finger  found in tripartite motif-containing protein 23 (TRIM23) and similar proteins. TRIM23, also known as ADP-ribosylation factor domain-containing protein 1, GTP-binding protein ARD-1, or RING finger protein 46 (RNF46), is an E3 ubiquitin-protein ligase belonging to the C-IX subclass of TRIM (tripartite motif) family of proteins that are defined by an N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, two Bbox2, and a coiled coil region, as well as C-terminal ADP ribosylation factor (ARF) domains. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. TRIM23 is involved in nuclear factor (NF)-kappaB activation. It mediates atypical lysine 27 (K27)-linked polyubiquitin conjugation to NF-kappaB essential modulator NEMO, also known as IKKgamma, which plays an important role in the NF-kappaB pathway, and this conjugation is essential for TLR3- and RIG-I/MDA5-mediated antiviral innate and inflammatory responses. It also regulates adipocyte differentiation via stabilization of the adipogenic activator peroxisome proliferator-activated receptor gamma (PPARgamma) through atypical ubiquitin conjugation to PPARgamma. Moreover, TRIM23 interacts with and polyubiquitinates yellow fever virus (YFV) NS5 to promote its binding to STAT2 and trigger type I interferon (IFN-I) signaling inhibition.	50
380832	cd19774	Bbox2_TRIM23_C-IX_rpt2	second B-box-type 2 zinc finger  found in tripartite motif-containing protein 23 (TRIM23) and similar proteins. TRIM23, also known as ADP-ribosylation factor domain-containing protein 1, GTP-binding protein ARD-1, or RING finger protein 46 (RNF46), is an E3 ubiquitin-protein ligase belonging to the C-IX subclass of TRIM (tripartite motif) family of proteins that are defined by an N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, two Bbox2, and a coiled coil region, as well as C-terminal ADP ribosylation factor (ARF) domains. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. TRIM23 is involved in nuclear factor (NF)-kappaB activation. It mediates atypical lysine 27 (K27)-linked polyubiquitin conjugation to NF-kappaB essential modulator NEMO, also known as IKKgamma, which plays an important role in the NF-kappaB pathway, and this conjugation is essential for TLR3- and RIG-I/MDA5-mediated antiviral innate and inflammatory responses. It also regulates adipocyte differentiation via stabilization of the adipogenic activator peroxisome proliferator-activated receptor gamma (PPARgamma) through atypical ubiquitin conjugation to PPARgamma. Moreover, TRIM23 interacts with and polyubiquitinates yellow fever virus (YFV) NS5 to promote its binding to STAT2 and trigger type I interferon (IFN-I) signaling inhibition.	50
380833	cd19775	Bbox2_TIF1_C-VI	B-box-type 2 zinc finger  found in transcription intermediary factor 1 (TIF1) family. This family corresponds to the TIF1 family of transcriptional cofactors including TIF1-alpha (TRIM24), TIF1-beta (TRIM28), and TIF1-gamma (TRIM33), which belong to the C-VI subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a plant homeodomain (PHD), and a bromodomain (Bromo) positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. TIF1 proteins couple chromatin modifications to transcriptional regulation, signaling, and tumor suppression. They exert a deacetylase-dependent silencing effect when tethered to a promoter region. TIF1alpha and TIF1beta can homodimerize and contain a PXVXL motif necessary and sufficient for heterochromatin protein 1(HP1) binding. They bind nuclear receptors and Kruppel-associated boxes (KRAB) specifically and respectively. TIF1gamma is structurally closely related to TIF1alpha and TIF1beta, but has very little functional features in common with them. It does not interact with the KRAB silencing domain of KOX1 or the heterochromatinic proteins HP1alpha, beta and gamma. It cannot bind to nuclear receptors (NRs).	43
380834	cd19776	Bbox2_TRIM25_C-IV	B-box-type 2 zinc finger  found in tripartite motif-containing protein 25 (TRIM25) and similar proteins. TRIM25, also termed estrogen-responsive finger protein (EFP), or ubiquitin/ISG15-conjugating enzyme TRIM25, or zinc finger protein 147 (ZNF147), or E3 ubiquitin/ISG15 ligase TRIM25, is induced by estrogen and particularly abundant in placenta and uterus. It has been implicated in cell proliferation, protein modification, and the retinoic acid inducible gene I (RIG-I)-mediated antiviral signaling pathway. It functions as an E3-ubiquitin ligase able to transfer ubiquitin and ISG15 to target proteins. It binds to mono-ubiquitinated PCNA and promotes the ISG15 modification (ISGylation) of PCNA, suggesting a crucial role in termination of error-prone translesion DNA synthesis. TRIM25 also enhances p53 and Mdm2 abundance by inhibiting their ubiquitination and degradation in 26S proteasomes. It suppresses p53's transcriptional activity and dampens the response to DNA damage. Upon deubiquitylation by ubiquitin-specific peptidase 15 (USP15), it mediates K63-linked polyubiquitination of RIG-I that is crucial for downstream antiviral interferon signaling. TRIM25 is required for melanoma differentiation-associated gene 5 (MDA5) and mitochondrial antiviral signaling (MAVS, also known as IPS-1, VISA, Cardiff) mediated activation of nuclear factor-kappaB (NF- kappa B) and interferon production. It is an RNA binding protein acting as RNA-specific activator for Lin28a/TuT4-mediated uridylation. TRIM25 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a SPRY/B30.2 domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	38
380835	cd19777	Bbox2_TRIM35_C-IV	B-box-type 2 zinc finger  found in tripartite motif-containing protein 35 (TRIM35) and similar proteins. TRIM35, also known as hemopoietic lineage switch protein 5 (HLS5), is a putative hepatocellular carcinoma (HCC) suppressor that inhibits phosphorylation of pyruvate kinase isoform M2 (PKM2), which is involved in aerobic glycolysis of cancer cells and further suppresses the Warburg effect and tumorigenicity in HCC. It also negatively regulates Toll-like receptor 7 (TLR7)- and TLR9-mediated type I interferon production by suppressing the stability of interferon regulatory factor 7 (IRF7). Moreover, TRIM35 regulates erythroid differentiation by modulating globin transcription factor 1 (GATA-1) activity. TRIM35 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	44
380836	cd19778	Bbox2_TRIM36_C-I	B-box-type 2 zinc finger  found in tripartite motif-containing protein 36 (TRIM36) and similar proteins. TRIM36, human ortholog of mouse Haprin, also known as RING finger protein 98 (RNF98) or zinc-binding protein Rbcc728, is an E3 ubiquitin-protein ligase expressed in the germ plasm. It has been implicated in acrosome reaction, fertilization, and embryogenesis, as well as in carcinogenesis. TRIM36 functions upstream of Wnt/beta-catenin activation, and plays a role in controlling the stability of proteins regulating microtubule polymerization during cortical rotation, and subsequently dorsal axis formation. It is also potentially associated with chromosome segregation through interacting with the kinetochore protein centromere protein-H (CENP-H), and colocalizing with the microtubule protein alpha-tubulin. Its overexpression may cause chromosomal instability and carcinogenesis. It is, thus, a novel regulator affecting cell cycle progression. Moreover, TRIM36 plays a critical role in the arrangement of somites during embryogenesis. TRIM36 belongs to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, a fibronectin type III (FN3) domain, and a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	45
380837	cd19779	Bbox2_TRIM37_C-VIII	B-box-type 2 zinc finger  found in tripartite motif-containing protein 37 (TRIM37) and similar proteins. TRIM37, also known as Mulibrey nanism protein, is a peroxisomal E3 ubiquitin-protein ligase that is involved in the tumorigenesis of several cancer types, including pancreatic ductal adenocarcinoma (PDAC), hepatocellular carcinoma (HCC), breast cancer, and sporadic fibrothecoma. It mono-ubiquitinates histone H2A, a chromatin modification associated with transcriptional repression. Moreover, TRIM37 possesses anti-HIV-1 activity, and interferes with viral DNA synthesis. Mutations in the human TRIM37 gene (also known as MUL) cause Mulibrey (muscle-liver-brain-eye) nanism, a rare growth disorder of prenatal onset characterized by dysmorphic features, pericardial constriction, and hepatomegaly. TRIM37 belongs to the C-VIII subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil region, as well as a MATH (meprin and TRAF-C homology) domain positioned C-terminal to the RBCC domain. Its MATH domain has been shown to interact with the TRAF (TNF-Receptor-Associated Factor) domain of six known TRAFs in vitro. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	40
380838	cd19780	Bbox2_TRIM39-like	B-box-type 2 zinc finger  found in tripartite motif-containing proteins TRIM39, TRIM58 and similar proteins. The family includes TRIM39 and TRIM58, both of which belong to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil region, as well as a SPRY/B30.2 domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. TRIM39, also termed RING finger protein 23 (RNF23), or testis-abundant finger protein, is an E3 ubiquitin-protein ligase that plays a role in controlling DNA damage-induced apoptosis through inhibition of the anaphase promoting complex (APC/C), a multiprotein ubiquitin ligase that controls multiple cell cycle regulators, including cyclins, geminin, and others. TRIM39 also functions as a regulator of several key processes in the proliferative cycle. It directly regulates p53 stability and modulates cell cycle progression and DNA damage responses via stabilization of p21. TRIM39 also negatively regulates the nuclear factor kappaB (NFkappaB)-mediated signaling pathway through stabilization of Cactin, an inhibitor of NFkappaB- and Toll-like receptor (TLR)-mediated transcription, which is induced by inflammatory stimulants such as tumor necrosis factor alpha (TNFalpha). TRIM39 is a MOAP-1-binding protein that can promote apoptosis signaling through stabilization of MOAP-1 via the inhibition of its poly-ubiquitination process. TRIM58, also known as protein BIA2, is an erythroid E3 ubiquitin-protein ligase induced during late erythropoiesis. It binds and ubiquitinates the intermediate chain of the microtubule motor dynein (DYNC1LI1/DYNC1LI2), stimulating the degradation of the dynein holoprotein complex. It may participate in the erythroblast enucleation process through regulation of nuclear polarization.	44
380839	cd19781	Bbox2_TRIM40_C-V	B-box-type 2 zinc finger  found in tripartite motif-containing protein 40 (TRIM40) and similar proteins. TRIM40, also termed probable E3 NEDD8-protein ligase, or RING finger protein 35, may function as an E3 ubiquitin-protein ligase of the NEDD8 conjugation pathway. It promotes neddylation of IKBKG/NEMO, stabilizing NFKBIA, and inhibiting NF-kappaB nuclear translocation and activity. TRIM40 belongs to the C-V subclass of TRIM (tripartite motif) family of proteins that are defined by an N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil region. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	44
380840	cd19782	Bbox2_TRIM42_C-III	B-box-type 2 zinc finger  found in tripartite motif-containing protein 42 (TRIM42) and similar proteins. TRIM42 belongs to the C-III subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil domain. It also has a novel cysteine-rich motif N-terminal to the RBCC domain, as well as a COS (carboxyl-terminal subgroup one signature) box and a fibronectin type-III (FN3) domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. TRIM42 can interact with TRIM27, a known cancer-associated protein. Its precise biological function remains unclear.	40
380841	cd19783	Bbox2_TRIM43-like	B-box-type 2 zinc finger  found in tripartite motif-containing proteins TRIM43, TRIM48, TRIM49, TRIM51, TRIM64, TRIM77 and similar proteins. The family includes a group of closely related uncharacterized tripartite motif-containing proteins, TRIM43, TRIM43B, TRIM48/RNF101, TRIM49/RNF18, TRIM49B, TRIM49C/TRIM49L2, TRIM49D/TRIM49L, TRIM51/SPRYD5, TRIM64, TRIM64B, TRIM64C, and TRIM77, whose biological functions remain unclear. TRIM49, also known as testis-specific RING-finger protein, has moderate similarity with SS-A/Ro52 antigen, suggesting it may be one of target proteins of autoantibodies in the sera of patients with these autoimmune disorders. All family members (except for TRIM51) belong to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil region, as well as a SPRY/B30.2 domain positioned C-terminal to the RBCC domain. TRIM51 belongs to unclassified TRIM (tripartite motif) family of proteins that do not have RING fingers. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	53
380842	cd19784	Bbox2_TRIM44	B-box-type 2 zinc finger  found in tripartite motif-containing protein 44 (TRIM44) and similar proteins. TRIM44, also termed protein DIPB, functions as a critical regulator in tumor metastasis and progression. TRIM44 belongs to an unclassified TRIM (tripartite motif) family of proteins that do not have RING fingers and thus lack the characteristic tripartite (RING (R), B-box, and coiled coil (CC)) RBCC motif. It contains a Bbox2 domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	39
380843	cd19785	Bbox2_TRIM45_C-X	B-box-type 2 zinc finger  found in tripartite motif-containing protein 45 (TRIM45) and similar proteins. TRIM45, also known as RING finger protein 99 (RNF99), is a novel receptor for activated C-kinase (RACK1)-interacting protein that suppresses transcriptional activities of Elk-1 and AP-1 and downregulates mitogen-activated protein kinase (MAPK) signal transduction by inhibiting RACK1/PKC (protein kinase C) complex formation. It also negatively regulates tumor necrosis factor alpha (TNFalpha)-induced nuclear factor-kappaB (NF-kappa B)-mediated transcription and suppresses cell proliferation. TRIM45 belongs to the C-X subclass of TRIM (tripartite motif) family that is defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a filamin-type immunoglobulin (IG-FLMN) domain and NHL repeats positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	43
380844	cd19786	Bbox2_TRIM46_C-I	B-box-type 2 zinc finger  found in tripartite motif-containing protein 46 (TRIM46) and similar proteins. TRIM46, also known as gene Y protein (GeneY) or tripartite, fibronectin type-III and C-terminal SPRY motif protein (TRIFIC), is a microtubule-associated protein that specifically localizes to the proximal axon, partly overlaps with the axon initial segment (AIS) at later stages, and organizes uniform microtubule orientation in axons. It controls neuronal polarity and axon specification by driving the formation of parallel microtubule arrays. TRIM46 belongs to the C-I subclass of TRIM (tripartite motif) family of proteins, which are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, a fibronectin type III (FN3) domain, and a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	46
380845	cd19787	Bbox2_TRIM50-like	B-box-type 2 zinc finger  found in tripartite motif-containing protein TRIM50, TRIM73, TRIM74 and similar proteins. TRIM50 is a stomach-specific E3 ubiquitin-protein ligase, encoded by the Williams-Beuren syndrome (WBS) TRIM50 gene, which regulates vesicular trafficking for acid secretion in gastric parietal cells. It colocalizes, interacts with, and increases the level of p62/SQSTM1, a multifunctional adaptor protein implicated in various cellular processes including the autophagy clearance of polyubiquitinated protein aggregates. It also promotes the formation and clearance of aggresome-associated polyubiquitinated proteins through the interaction with the histone deacetylase 6 (HDAC6), a tubulin specific deacetylase that regulates microtubule-dependent aggresome formation. TRIM50 can be acetylated by PCAF and p300. TRIM50 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. The family also includes two paralogs of TRIM50, tripartite motif-containing protein 73 (TRIM73), also known as tripartite motif-containing protein 50B (TRIM50B), and tripartite motif-containing protein 74 (TRIM74), also known as tripartite motif-containing protein 50C (TRIM50C), both of which are WBS-related genes encoding proteins and may also act as E3 ligases. In contrast with TRIM50, TRIM73 and TRIM74 belong to the C-V subclass of TRIM family of proteins that are defined by the N-terminal RBCC domains only.	39
380846	cd19788	Bbox2_MuRF	B-box-type 2 zinc finger  found in muscle-specific RING finger protein (MuRF) family. This family corresponds to a group of striated muscle-specific tripartite motif (TRIM) proteins, including TRIM63/MuRF-1, TRIM55/MuRF-2, and TRIM54/MuRF-3, which function as E3 ubiquitin ligases in ubiquitin-mediated muscle protein turnover. They are tightly developmentally regulated in skeletal muscle and associate with different cytoskeleton components, such as microtubules, Z-disks and M-bands, as well as with metabolic enzymes and nuclear proteins. They also cooperate with diverse proteins implicated in selective protein degradation by the proteasome and autophagosome, and target proteins of metabolic regulation, sarcomere assembly and transcriptional regulation. Moreover, MURFs display variable fibre-type preferences. TRIM63/MuRF-1 is predominantly fast (type II) fibre-associated in skeletal muscle. TRIM55/MuRF-2 is predominantly slow-fibre associated. TRIM54/MuRF-3 is ubiquitously present. They play an active role in microtubule-mediated sarcomere assembly. MuRFs belong to the C-II subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, and an acidic residue-rich (AR) domain positioned C-terminal to the RBCC domain. They also harbor a MURF family-specific conserved box (MFC) between its RING-HC finger and Bbox domains. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	39
380847	cd19789	Bbox2_TRIM56_C-V	B-box-type 2 zinc finger  found in tripartite motif-containing protein 56 (TRIM56) and similar proteins. TRIM56, also known as RING finger protein 109 (RNF109), is a virus-inducible E3 ubiquitin ligase that restricts pestivirus infection. It positively regulates the Toll-like receptor 3 (TLR3) antiviral signaling pathway, and possesses antiviral activity against bovine viral diarrhea virus (BVDV), a ruminant pestivirus classified within the family Flaviviridae shared by tick-borne encephalitis virus (TBEV). It also possesses antiviral activity against two classical flaviviruses, yellow fever virus (YFV) and dengue virus (DENV), as well as a human coronavirus, HCoV-OC43, which is responsible for a significant share of common cold cases. It may not act on positive-strand RNA viruses indiscriminately. Moreover, TRIM56 is an interferon-inducible E3 ubiquitin ligase that modulates STING to confer double-stranded DNA-mediated innate immune responses. TRIM56 belongs to the C-V subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as an uncharacterized region positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	47
380848	cd19790	Bbox2_TRIM59_C-XI	B-box-type 2 zinc finger  found in tripartite motif-containing protein 59 (TRIM59) and similar proteins. TRIM59, also known as TRIM57, or RING finger protein 104 (RNF104) or tumor suppressor TSBF-1, is a putative E3 ubiquitin-protein ligase that functions as a novel multiple cancer biomarker for immunohistochemical detection of early tumorigenesis. It is upregulated in gastric cancer and promotes gastric carcinogenesis by interacting with and targeting the P53 tumor suppressor for its ubiquitination and degradation. It also acts as a novel accessory molecule involved in cytotoxicity of BCG-activated macrophages (BAM). Moreover, TRIM59 may serve as a multifunctional regulator for innate immune signaling pathways. It interacts with ECSIT and negatively regulates nuclear factor-kappaB (NF- kappa B) and interferon regulatory factor (IRF)-3/7-mediated signal pathways. TRIM59 belongs to the C-XI subclass of TRIM (tripartite motif) family of proteins that are defined by an N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil region. In addition, TRIM59 contains a C-terminal transmembrane domain.	40
380849	cd19791	Bbox2_TRIM60-like	B-box-type 2 zinc finger  found in tripartite motif-containing proteins, TRIM60, TRIM61, TRIM75 and similar proteins. This family includes a group of tripartite motif-containing proteins, including TRIM60, TRIM61 and TRIM75. TRIM60, also known as RING finger protein 129 (RNF129) or RING finger protein 33 (RNF33), is a cytoplasmic protein expressed in the testis. It may play an important role in the spermatogenesis process, the development of the preimplantation embryo, and in testicular functions. TRIM60 interacts with the cytoplasmic kinesin motor proteins KIF3A and KIF3B suggesting possible contribution to cargo movement along the microtubule in the expressed sites. It is also involved in spermatogenesis in Sertoli cells under the regulation of nuclear factor-kappaB (NF-kappaB). TRIM61 is closely related to TRIM60, but its biological function remains unclear. TRIM75 could be the product of a pseudogene. Its biological function remains unclear. TRIM60 and TRIM75 belong to the C-IV subclass of the TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and two coiled coil domains, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. In contrast, TRIM61 belongs to the C-V subclass of TRIM family that contains RBCC domains only. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	39
380850	cd19792	Bbox2_TRIM62_C-IV	B-box-type 2 zinc finger  found in tripartite motif-containing protein 62 (TRIM62) and similar proteins. TRIM62, also known as Ductal Epithelium Associated Ring Chromosome 1 (DEAR1), is a cytoplasmic E3 ubiquitin-protein ligase that was identified as a dominant regulator of acinar morphogenesis in the mammary gland. It is implicated in the inflammatory response of immune cells by regulating the Toll-like receptor 4 (TLR4) signaling pathway, leading to increased activity of the activator protein 1 (AP-1) transcription factor in primary macrophages. It is also involved in muscular protein homeostasis, especially during inflammation-induced atrophy, and may play a role in the pathogenesis of ICU-acquired weakness (ICUAW) by activating and maintaining inflammation in myocytes. Moreover, TRIM62 facilitates K27-linked poly-ubiquitination of CARD9 and also regulates CARD9-mediated anti-fungal immunity and intestinal inflammation. Furthermore, TRIM62 is involved in the regulation of apical-basal polarity and acinar morphogenesis. It also functions as a chromosome 1p35 tumor suppressor and negatively regulates transforming growth factor beta (TGFbeta)-driven epithelial-mesenchymal transition (EMT) through binding to and promoting the ubiquitination of SMAD3, a major effector of TGFbeta-mediated EMT. TRIM62 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	38
380851	cd19793	Bbox2_TRIM65-like	B-box-type 2 zinc finger  found in tripartite motif-containing protein 65 (TRIM65), B box and SPRY domain-containing protein (BSPRY) and similar proteins. The family includes TRIM65 and BSPRY. TRIM65 is an E3 ubiquitin-protein ligase that interacts with the innate immune receptor MDA5 enhancing its ability to stimulate interferon-beta signaling. It functions as a potential oncogenic protein that negatively regulates p53 through ubiquitination, providing insight into development of novel approaches targeting TRIM65 for non-small cell lung carcinoma (NSCLC) treatment, and also overcoming chemotherapy resistance. Moreover, TRIM65 negatively regulates microRNA-driven suppression of mRNA translation by targeting TNRC6 proteins for ubiquitination and degradation. BSPRY is a regulatory protein for maintaining calcium homeostasis. It may regulate epithelial calcium transport by inhibiting TRPV5 activity. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	43
380852	cd19794	Bbox2_TRIM66-like	B-box-type 2 zinc finger  found in tripartite motif-containing protein 66 (TRIM66) and similar proteins. TRIM66, also termed transcriptional intermediary factor 1 delta (TIF1delta), is a novel heterochromatin protein 1 (HP1)-interacting member of the transcriptional intermediary factor 1 (TIF1) family expressed by elongating spermatids. Like other TIF1 proteins, TRIM66 displays a potent trichostatin A (TSA)-sensitive repression function; TSA is a specific inhibitor of histone deacetylases. Moreover, TRIM66 plays an important role in heterochromatin-mediated gene silencing during postmeiotic phases of spermatogenesis. It functions as a negative regulator of postmeiotic genes acting through HP1 isotype gamma (HP1gamma) complex formation and centromere association. TRIM66 belongs to an unclassified TRIM (tripartite motif) family of proteins that do not have RING fingers and thus lack the characteristic tripartite (RING (R), B-box, and coiled coil (CC)) RBCC motif. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	43
380853	cd19795	Bbox2_TRIM68_C-IV	B-box-type 2 zinc finger found in tripartite motif-containing protein 68 (TRIM68) and similar proteins. TRIM68, also known as RING finger protein 137 (RNF137) or SSA protein SS-56 (SS-56), is an E3 ubiquitin-protein ligase that negatively regulates Toll-like receptor (TLR)- and RIG-I-like receptor (RLR)-driven type I interferon production by degrading TRK fused gene (TFG), a novel driver of IFN-beta downstream of anti-viral detection systems. It also functions as a cofactor for androgen receptor-mediated transcription by regulating ligand-dependent transcription of androgen receptor in prostate cancer cells. Moreover, TRIM68 is a cellular target of autoantibody responses in Sjogren's syndrome (SS), as well as systemic lupus erythematosus (SLE). It is also an auto-antigen for T cells in SS and SLE. TRIM68 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and two coiled coil domains, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	44
380854	cd19796	Bbox2_TRIM71_C-VII	B-box-type 2 zinc finger  found in tripartite motif-containing protein 71 (TRIM71) and similar proteins. TRIM71, also known as protein lineage variant 41 (lin-41), is an E3 ubiquitin-protein ligase that may play essential roles in embryonic stem cells, cellular reprogramming, and the timing of embryonic neurogenesis. It was first identified in the nematode Caenorhabditis elegans as a target of the differentiation-associated microRNA (miRNA) let-7 (lethal 7), and therefore part of a heterochronic gene network that controls larval development. In humans, it regulates let-7 microRNA biogenesis via modulation of Lin28B protein polyubiquitination. TRIM71 localizes to cytoplasmic P-bodies and directly interacts with the miRNA pathway proteins Argonaute 2 (AGO2) and DICER. It represses miRNA activity by promoting degradative ubiquitination of AGO2. Moreover, TRIM71 associates with SHCBP1, a novel component of the fibroblast growth factor (FGF) signaling pathway, and regulates its non-degradative polyubiquitination. It is also involved in the post-transcriptional regulation of the CDKN1A, RBL1 and RBL2 or EGR1 mRNAs through mediating RNA-binding in embryonic stem cells. TRIM71 belongs to the C-VII subclass of TRIM (tripartite motif)-NHL family that is defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil domain, as well as a NHL (named after proteins NCL-1, HT2A and Lin-41 that contain repeats folded into a six-bladed beta propeller) repeat domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	48
380855	cd19797	Bbox2_TRIM72_C-IV	B-box-type 2 zinc finger found in tripartite motif-containing protein 72 (TRIM72) and similar proteins. TRIM72, also known as Mitsugumin-53 (MG53), is a muscle-specific protein that plays a central role in cell membrane repair by nucleating the assembly of the repair machinery at muscle injury sites. It is required in repair of alveolar epithelial cells under plasma membrane stress failure. It interacts with dysferlin to regulate sarcolemmal repair. Upregulation of TRIM72 leads to obesity, systemic insulin resistance, dyslipidemia, and hyperglycemia, and induces diabetic cardiomyopathy through transcriptional activation of the peroxisome proliferator-activated receptor alpha (PPAR-alpha) signaling pathway. Compensation for the absence of AKT signaling by ERK signaling during TRIM72 overexpression leads to pathological hypertrophy. Moreover, TRIM72 functions as a novel negative feedback regulator of myogenesis via targeting insulin receptor substrate-1 (IRS-1). It is transcriptionally activated by the synergism of myogenin (MyoD) and myocyte enhancer factor 2 (MEF2). TRIM72 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	42
380856	cd19798	Bbox2_BRAT-like	B-box-type 2 zinc finger  found in Drosophila melanogaster brain tumor protein (BRAT) and similar proteins. BRAT is a NHL-domain family protein that functions as a translational repressor to inhibit cell proliferation. This family also contains Caenorhabditis elegans B-box type zinc finger protein ncl-1, a C. elegans Brat homolog which functions as a translational repressor that inhibits protein synthesis. BRAT contains Bbox1 and Bbox2 zinc fingers and NHL repeats. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	44
380857	cd19799	Bbox2_MYCBP2	B-box-type 2 zinc finger  found in Myc-binding protein 2 (MYCBP2) and similar proteins. MYCBP2, also termed protein associated with Myc (Pam), is an atypical E3 ubiquitin-protein ligase which specifically mediates ubiquitination of threonine and serine residues on target proteins, instead of ubiquitinating lysine residues. MYCBP2 harbors a B-box motif that shows high sequence similarity with B-Box-type zinc finger 2 found in tripartite motif-containing proteins (TRIMs). The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	50
380858	cd19800	Bbox2_xNF7-like	B-box-type 2 zinc finger  found in Xenopus laevis nuclear factor 7 (xNF7) and similar proteins. xNF7 is a maternally expressed novel zinc finger nuclear phosphoprotein. It acts as a transcription factor that determines dorsal-ventral body axis. xNF7 harbors a B-box motif that shows high sequence similarity with B-Box-type zinc finger 2 found in tripartite motif-containing proteins (TRIMs). The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	39
380859	cd19801	Bbox1_MID	B-box-type 1 zinc finger found in the midline (MID) family. The MID family includes MID1 and MID2. MID1, also known as midin, midline 1 RING finger protein, putative transcription factor XPRF, RING finger protein 59 (RNF59), or tripartite motif-containing protein 18 (TRIM18), is a microtubule-associated E3 ubiquitin-protein ligase implicated in epithelial-mesenchymal differentiation, cell migration and adhesion, and programmed cell death along specific regions of the ventral midline during embryogenesis. MID2, also known as midin-2, midline defect 2, RING finger protein 60 (RNF60), or tripartite motif-containing protein 1 (TRIM1), is highly related to MID1. It associates with the microtubule network and may at least partially compensate for the loss of MID1. Both MID1 and MID2 interact with alpha4, a regulatory subunit of PP2-type phosphatases, such as PP2A, and an integral component of the rapamycin-sensitive signaling pathway. They also play a central role in the regulation of granule exocytosis, and functional redundancy exists between MID1 and MID2 in cytotoxic lymphocytes (CTL). Both MID1 and MID2 belong to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, a fibronectin type III (FN3) domain, and a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif.	49
380860	cd19802	Bbox1_TRIM8-like	B-box-type 1 zinc finger found in tripartite motif-containing proteins, TRIM8, TRIM16, TRIM25, TRIM29, TRIM44, TRIM47 and similar proteins. This family includes a group of tripartite motif-containing proteins, including TRIM8, TRIM16, TRIM25, TRIM29, TRIM44 and TRIM47. TRIM8, also known as glioblastoma-expressed RING finger protein (GERP) or RING finger protein 27 (RNF27), is a probable E3 ubiquitin-protein ligase that may promote proteasomal degradation of suppressor of cytokine signaling 1 (SOCS1) and further regulate interferon-gamma signaling. It functions as a new p53 modulator that stabilizes p53, impairing its association with MDM2 and inducing the reduction of cell proliferation. TRIM16, also termed estrogen-responsive B box protein (EBBP), may play a role in the regulation of keratinocyte differentiation. It may also act as a tumor suppressor by affecting cell proliferation and migration or tumorigenicity in carcinogenesis. TRIM25, also termed estrogen-responsive finger protein (EFP), or ubiquitin/ISG15-conjugating enzyme TRIM25, or zinc finger protein 147 (ZNF147), or E3 ubiquitin/ISG15 ligase TRIM25, is induced by estrogen and is particularly abundant in placenta and uterus. It has been implicated in cell proliferation, protein modification, and the retinoic acid inducible gene I (RIG-I)-mediated antiviral signaling pathway. It functions as an E3-ubiquitin ligase able to transfer ubiquitin and ISG15 to target proteins. TRIM29, also termed ataxia telangiectasia group D-associated protein (ATDC), plays a crucial role in the regulation of macrophage activation in response to viral or bacterial infections within the respiratory tract. TRIM44, also termed protein DIPB, functions as a critical regulator in tumor metastasis and progression. TRIM47, also known as gene overexpressed in astrocytoma protein (GOA) or RING finger protein 100 (RNF100), plays an important role in the process of dedifferentiation that is associated with astrocytoma tumorigenesis. The TRIM (tripartite motif) family of proteins are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil domain. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif.	46
380861	cd19803	Bbox1_TRIM9-like_C-I	B-box-type 1 zinc finger found in tripartite motif-containing proteins, TRIM9, TRIM67 and similar proteins. This family includes a group of tripartite motif-containing proteins, including TRIM9 and TRIM67, both of which belong to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, consisting of three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, a fibronectin type III (FN3) domain, and a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. TRIM9 (the human ortholog of rat Spring), also known as RING finger protein 91 (RNF91), is a brain-specific E3 ubiquitin-protein ligase collaborating with an E2 ubiquitin conjugating enzyme UBCH5b. It plays an important role in the regulation of neuronal functions and participates in the neurodegenerative disorders through its ligase activity. TRIM67, also termed TRIM9-like protein (TNL), is a protein selectively expressed in the cerebellum. It interacts with PRG-1, an important molecule in the control of hippocampal excitability dependent on presynaptic LPA2 receptor signaling, and 80K-H (also known as glucosidase II beta), a protein kinase C substrate. It negatively regulates Ras signaling in cell proliferation via degradation of 80K-H, leading to neural differentiation including neuritogenesis.	47
380862	cd19804	Bbox1_TRIM19_C-V	B-box-type 1 zinc finger found in promyelocytic leukemia protein (PML) and similar proteins. Protein PML, also known as RING finger protein 71 (RNF71) or tripartite motif-containing protein 19 (TRIM19), is predominantly a nuclear protein with a broad intrinsic antiviral activity. It is the eponymous component of PML nuclear bodies (PML NBs) and has been implicated in a wide variety of cellular processes, including DNA damage signaling, apoptosis, and transcription. PML interferes with the replication of many unrelated viruses, including human immunodeficiency virus 1 (HIV-1), human foamy virus (HFV), poliovirus, influenza virus, rabies virus, EMCV, adeno-associated virus (AAV), and vesicular stomatitis virus (VSV). It also selectively interacts with misfolded proteins through distinct substrate recognition sites, and conjugates these proteins with the small ubiquitin-like modifiers (SUMOs) through its SUMO ligase activity. PML belongs to the C-V subclass of the TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as an uncharacterized region positioned C-terminal to the RBCC domain. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif.	47
380863	cd19805	Bbox1_TIF1	B-box-type 1 zinc finger found in transcription intermediary factor 1 (TIF1) family. This family corresponds to the TIF1 family of transcriptional cofactors including TIF1-alpha (TRIM24), TIF1-beta (TRIM28), and TIF1-gamma (TRIM33), which belongs to the C-VI subclass of the TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a plant homeodomain (PHD), and a bromodomain (Bromo) positioned C-terminal to the RBCC domain. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. TIF1 proteins couple chromatin modifications to transcriptional regulation, signaling, and tumor suppression. They exert a deacetylase-dependent silencing effect when tethered to a promoter region. TIF1-alpha and TIF1-beta can homodimerize and contain a PXVXL motif necessary and sufficient for heterochromatin protein 1 (HP1) binding. They bind nuclear receptors and Kruppel-associated boxes (KRAB) specifically and respectively. TIF1-gamma is structurally closely related to TIF1-alpha and TIF1-beta, but has very few functional features in common with them. It does not interact with the KRAB silencing domain of KOX1 or the heterochromatinic proteins HP1alpha, beta and gamma. It cannot bind to nuclear receptors (NRs).	44
380864	cd19806	Bbox1_TRIM32_C-VII	B-box-type 1 zinc finger found in tripartite motif-containing protein 32 (TRIM32) and similar proteins. TRIM32, also known as 72 kDa Tat-interacting protein, or zinc finger protein HT2A, or BBS11, is an E3 ubiquitin-protein ligase that promotes degradation of several targets, including actin, PIASgamma, Abl interactor 2, dysbindin, X-linked inhibitor of apoptosis (XIAP), p73 transcription factor, thin filaments and Z-bands during fasting. It plays important roles in neuronal differentiation of neural progenitor cells, as well as in controlling cell fate in skeletal muscle progenitor cells. It reduces PI3K-Akt-FoxO signaling in muscle atrophy by promoting plakoglobin-PI3K dissociation. It also functions as a pluripotency-reprogramming roadblock that facilitates cellular transition towards differentiation via modulating the levels of Oct4 and cMyc. Moreover, TRIM32 is an intrinsic influenza A virus (IAV) restriction factor which senses and targets the polymerase basic protein 1 (PB1) for ubiquitination and protein degradation. It also plays a significant role in mediating the biological activity of the HIV-1 Tat protein in vivo, binding specifically to the activation domain of HIV-1 Tat; it can also interact with the HIV-2 and EIAV Tat proteins. Furthermore, TRIM32 regulates myoblast proliferation by controlling turnover of NDRG2 (N-myc downstream-regulated gene). It negatively regulates tumor suppressor p53 to promote tumorigenesis. It also facilitates degradation of MYCN on spindle poles and induces asymmetric cell division in human neuroblastoma cells. In addition, TRIM32 plays important roles in regulation of hyperactivities and positively regulates the development of anxiety and depression disorders induced by chronic stress. It also plays a role in regeneration by affecting satellite cell cycle progression via modulation of the SUMO ligase PIASy (PIAS4). Defects in TRIM32 lead to limb-girdle muscular dystrophy type 2H (LGMD2H), sarcotubular myopathies (STM) and Bardet-Biedl syndrome. TRIM32 belongs to the C-VII subclass of TRIM (tripartite motif)-NHL family that is defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1, and a coiled coil domain, as well as a NHL (named after proteins NCL-1, HT2A and Lin-41 that contain repeats folded into a six-bladed beta propeller) repeat domain positioned C-terminal to the RBCC domain. The NHL domain mediates the interaction with Argonaute proteins and consequently allows TRIM32 to modulate the activity of certain miRNAs. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif.	41
380865	cd19807	Bbox1_TRIM36-like	B-box-type 1 zinc finger found in tripartite motif-containing proteins, TRIM36, TRIM46 and similar proteins. The family includes tripartite motif-containing proteins, TRIM36 and TRIM46, both of which belong to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, a fibronectin type III (FN3) domain, and a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. TRIM36, the human ortholog of mouse Haprin, also known as RING finger protein 98 (RNF98) or zinc-binding protein Rbcc728, is an E3 ubiquitin-protein ligase expressed in the germ plasm. It has been implicated in acrosome reaction, fertilization, and embryogenesis, as well as in carcinogenesis. TRIM36 functions upstream of Wnt/beta-catenin activation, and plays a role in controlling the stability of proteins regulating microtubule polymerization during cortical rotation, and subsequent dorsal axis formation. It is also potentially associated with chromosome segregation by interacting with the kinetochore protein centromere protein-H (CENP-H), and colocalizing with the microtubule protein alpha-tubulin. Its overexpression may cause chromosomal instability and carcinogenesis. It is, thus, a novel regulator affecting cell cycle progression. Moreover, TRIM36 plays a critical role in the arrangement of somites during embryogenesis. TRIM46, also known as gene Y protein (GeneY) or tripartite, fibronectin type-III and C-terminal SPRY motif protein (TRIFIC), is a microtubule-associated protein that specifically localizes to the proximal axon, partly overlaps with the axon initial segment (AIS) at later stages, and organizes uniform microtubule orientation in axons. It controls neuronal polarity and axon specification by driving the formation of parallel microtubule arrays.	52
380866	cd19808	Bbox1_TRIM42_C-III	B-box-type 1 zinc finger found in tripartite motif-containing protein 42 (TRIM42) and similar proteins. TRIM42 belongs to the C-III subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil domain. It also has a novel cysteine-rich motif N-terminal to the RBCC domain, as well as a COS (carboxyl-terminal subgroup one signature) box and a fibronectin type-III (FN3) domain positioned C-terminal to the RBCC domain. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. TRIM42 can interact with TRIM27, a known cancer-associated protein. Its precise biological function remains unclear.	47
380867	cd19809	Bbox1_TRIM45_C-X	B-box-type 1 zinc finger found in tripartite motif-containing protein 45 (TRIM45) and similar proteins. TRIM45, also known as RING finger protein 99 (RNF99), is a novel receptor for activated C-kinase (RACK1)-interacting protein that suppresses transcriptional activities of Elk-1 and AP-1, and downregulates mitogen-activated protein kinase (MAPK) signal transduction by inhibiting RACK1/PKC (protein kinase C) complex formation. It also negatively regulates tumor necrosis factor alpha (TNFalpha)-induced nuclear factor-kappaB (NF-kappa B)-mediated transcription, and suppresses cell proliferation. TRIM45 belongs to the C-X subclass of the TRIM (tripartite motif) family of proteins that is defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a filamin-type immunoglobulin (IG-FLMN) domain and NHL repeats positioned C-terminal to the RBCC domain. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif.	46
380868	cd19810	Bbox1_TRIM56_C-V	B-box-type 1 zinc finger found in tripartite motif-containing protein 56 (TRIM56) and similar proteins. TRIM56, also known as RING finger protein 109 (RNF109), is a virus-inducible E3 ubiquitin ligase that restricts pestivirus infection. It positively regulates the Toll-like receptor 3 (TLR3) antiviral signaling pathway, and possesses antiviral activity against bovine viral diarrhea virus (BVDV), a ruminant pestivirus classified within the family Flaviviridae shared by tick-borne encephalitis virus (TBEV). It also possesses antiviral activity against two classical flaviviruses, yellow fever virus (YFV) and dengue virus (DENV), as well as a human coronavirus, HCoV-OC43, which is responsible for a significant share of common cold cases. It may not act on positive-strand RNA viruses indiscriminately. Moreover, TRIM56 is an interferon-inducible E3 ubiquitin ligase that modulates STING to confer double-stranded DNA-mediated innate immune responses. TRIM56 belongs to the C-V subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as an uncharacterized region positioned C-terminal to the RBCC domain. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif.	49
380869	cd19811	Bbox1_TRIM66	B-box-type 1 zinc finger found in tripartite motif-containing protein 66 (TRIM66) and similar proteins. TRIM66, also termed transcriptional intermediary factor 1 delta (TIF1delta), is a novel heterochromatin protein 1 (HP1)-interacting member of the transcriptional intermediary factor 1 (TIF1) family, and is expressed by elongating spermatids. Like other TIF1 proteins, TRIM66 displays a potent trichostatin A (TSA)-sensitive repression function; TSA is a specific inhibitor of histone deacetylases. Moreover, TRIM66 plays an important role in heterochromatin-mediated gene silencing during postmeiotic phases of spermatogenesis. It functions as a negative regulator of postmeiotic genes acting through HP1 isotype gamma (HP1gamma) complex formation and centromere association. TRIM66 belongs to an unclassified TRIM (tripartite motif) family of proteins that do not have RING fingers and thus lack the characteristic tripartite (RING (R), B-box, and coiled coil (CC)) RBCC motif. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif.	37
380870	cd19812	Bbox1_TRIM71_C-VII	B-box-type 1 zinc finger found in tripartite motif-containing protein 71 (TRIM71) and similar proteins. TRIM71, also known as protein lineage variant 41 (lin-41), is an E3 ubiquitin-protein ligase that may play essential roles in embryonic stem cells, cellular reprogramming, and the timing of embryonic neurogenesis. It was first identified in the nematode Caenorhabditis elegans as a target of the differentiation-associated microRNA (miRNA) let-7 (lethal 7) and therefore part of a heterochronic gene network that controls larval development. In humans, it regulates let-7 microRNA biogenesis via modulation of Lin28B protein polyubiquitination. TRIM71 localizes to cytoplasmic P-bodies and directly interacts with the miRNA pathway proteins Argonaute 2 (AGO2) and DICER. It represses miRNA activity by promoting degradative ubiquitination of AGO2. Moreover, TRIM71 associates with SHCBP1, a novel component of the fibroblast growth factor (FGF) signaling pathway, and regulates its non-degradative polyubiquitination. It is also involved in the post-transcriptional regulation of the CDKN1A, RBL1 and RBL2 or EGR1 mRNAs by mediating RNA-binding in embryonic stem cells. TRIM71 belongs to the C-VII subclass of the TRIM (tripartite motif) family of proteins that is defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil domain, as well as a NHL (named after proteins NCL-1, HT2A and Lin-41 that contain repeats folded into a six-bladed beta propeller) repeat domain positioned C-terminal to the RBCC domain. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif.	44
380871	cd19813	Bbox1_BRAT-like	B-box-type 1 zinc finger found in Drosophila melanogaster brain tumor protein (BRAT) and similar proteins. BRAT is a NHL-domain family protein that functions as a translational repressor to inhibit cell proliferation. The family also contains Caenorhabditis elegans B-box type zinc finger protein ncl-1, a C. elegans Brat homolog which functions as a translational repressor that inhibits protein synthesis. BRAT contains Bbox1 and Bbox2 zinc fingers and NHL repeats. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif.	44
380872	cd19814	Bbox1_RNF207-like	B-box-type 1 zinc finger found in RING finger protein 207 (RNF207) and similar proteins. RNF207 is a cardiac-specific E3 ubiquitin-protein ligase that plays an important role in the regulation of cardiac repolarization. It regulates action potential duration, likely via effects on human ether-a-go-go-related gene (HERG) trafficking and localization, in a heat shock protein-dependent manner. RNF207 contains a RING finger, a B-box motif and Bbox C-terminal (BBC) domain, as well as a C-terminal non-homologous region (CNHR). The B-box motif shows high sequence similarity with B-Box-type 1 zinc finger found in tripartite motif-containing proteins (TRIMs). The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif.	49
380873	cd19815	Bbox1_HOIP	B-box-type 1 zinc finger found in HOIL-1-interacting protein (HOIP) and similar proteins. HOIP, also termed RING finger protein 31 (RNF31), or zinc in-between-RING-finger ubiquitin-associated domain protein, together with HOIL-1 and SHARPIN, forms the E3-ligase complex (also known as linear-ubiquitin-chain assembly complex LUBAC) that regulates NF-kappaB activity and apoptosis. It also interacts with the atypical mammalian orphan receptor DAX-1, triggers DAX-1 ubiquitination and stabilization, and participates in repressing steroidogenic gene expression. HOIP contains a B-box motif that shows high sequence similarity with B-Box-type 1 zinc finger found in tripartite motif-containing proteins (TRIMs). The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif.	43
380874	cd19816	Bbox1_CYLD	B-box-type 1 zinc finger found in tumor suppressor cylindromatosis (CYLD) and similar proteins. CYLD, also termed ubiquitin carboxyl-terminal hydrolase CYLD, or deubiquitinating enzyme CYLD, or ubiquitin thioesterase CYLD, or ubiquitin-specific-processing protease CYLD, is a microtubule-associated deubiquitinase that specifically cleaves Lys-63-linked polyubiquitin chains. It plays a pivotal role in a wide range of cellular activities, including innate immunity, cell division, and ciliogenesis. CYLD antagonizes NF-kappaB and JNK signaling by disassembly of Lys63-linked ubiquitin chains synthesized in response to cytokine stimulation. Structural characterization reveals a small zinc-binding B-box inserted within the ubiquitin specific protease (USP) domain of CYLD. The B-box motif shows high sequence similarity with B-Box-type 1 zinc finger found in tripartite motif-containing proteins (TRIMs) and is responsible for its intermolecular interaction and cytoplasmic localization. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif.	56
380875	cd19817	Bbox1_ANCHR-like	B-box-type 1 zinc finger found in Abscission/NoCut checkpoint regulator (ANCHR) and similar proteins. ANCHR, also termed MLL partner containing FYVE domain, or zinc finger FYVE domain-containing protein 19, is a key regulator of the abscission step in cytokinesis: part of the cytokinesis checkpoint, a process required to delay abscission to prevent both premature resolution of intercellular chromosome bridges and accumulation of DNA damage. The family also includes zinc finger B-box domain-containing protein 1 (ZBBX), a B-box motif containing protein with unclear biological function. The B-box motif of this family shows high sequence similarity with B-Box-type 1 zinc finger found in tripartite motif-containing proteins (TRIMs). The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif.	45
380876	cd19818	Bbox1_ZBBX	B-box-type 1 zinc finger found in zinc finger B-box domain-containing protein 1 (ZBBX) and similar proteins. The family corresponds to a group of uncharacterized zinc finger B-box domain-containing proteins. The B-box motif shows high sequence similarity with B-Box-type 1 zinc finger found in tripartite motif-containing proteins (TRIMs). The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif.	43
380877	cd19819	Bbox1_ZFYVE1_rpt1	first B-box-type 1 zinc finger found in zinc finger FYVE domain-containing protein 1 (ZFYVE1) and similar proteins. ZFYVE1 also termed double FYVE-containing protein 1 (DFCP1), or SR3, or tandem FYVE fingers-1, is a novel tandem FYVE domain containing protein that binds phosphatidylinositol 3-phosphate (PtdIns3P or PI3P) with high specificity over other phosphoinositides. The subcellular distribution of exogenously-expressed ZFYVE1 to Golgi, endoplasmic reticulum (ER) and vesicular is governed in part by its FYVE domains but unaffected by Wortmannin, a PI3-kinase inhibitor. ZFYVE1 harbors two B-box motifs, both of which show high sequence similarity with B-Box-type 1 zinc finger found in tripartite motif-containing proteins (TRIMs). The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif.	48
380878	cd19820	Bbox1_ZFYVE1_rpt2	second B-box-type 1 zinc finger found in zinc finger FYVE domain-containing protein 1 (ZFYVE1) and similar proteins. ZFYVE1 also termed double FYVE-containing protein 1 (DFCP1), or SR3, or tandem FYVE fingers-1, is a novel tandem FYVE domain containing protein that binds phosphatidylinositol 3-phosphate (PtdIns3P or PI3P) with high specificity over other phosphoinositides. The subcellular distribution of exogenously-expressed ZFYVE1 to Golgi, endoplasmic reticulum (ER) and vesicular is governed in part by its FYVE domains but unaffected by Wortmannin, a PI3-kinase inhibitor. ZFYVE1 harbors two B-box motifs, both of which show high sequence similarity with B-Box-type 1 zinc finger found in tripartite motif-containing proteins (TRIMs). The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif.	59
380879	cd19821	Bbox1_BBX-like	B-box-type 1 zinc finger found in B-box (BBX) family of plant transcription factors and similar proteins. The BBX family includes a group of zinc finger transcription factors that contain one or two B-box motifs, and sometimes also feature a CCT (CONSTANS, CO-like, and TOC1) domain. They play important roles in plant growth and development, including seedling photomorphogenesis, photoperiodic regulation of flowering, shade avoidance, and responses to biotic and abiotic stresses. Their B-box motifs show high sequence similarity with B-Box-type 1 zinc finger found in tripartite motif-containing proteins (TRIMs) and are involved in mediating transcriptional regulation and protein-protein interaction in plant signaling. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif; this family contains a modified motif, C3XC2H2, where X can be D, E, C or H.	44
380880	cd19822	Bbox2_MID1_C-I	B-box-type 2 zinc finger found in midline-1 (MID1) and similar proteins. MID1, also termed midin, or midline 1 RING finger protein, or putative transcription factor XPRF, or RING finger protein 59 (RNF59), or tripartite motif-containing protein 18 (TRIM18), is a microtubule-associated E3 ubiquitin-protein ligase implicated in epithelial-mesenchymal differentiation, cell migration and adhesion, and programmed cell death along specific regions of the ventral midline during embryogenesis. It monoubiquitinates the alpha4 subunit of protein phosphatase 2A (PP2A), promoting proteasomal degradation of the catalytic subunit of PP2A (PP2Ac) and preventing the A and B subunits from forming an active complex. It promotes allergen and rhinovirus-induced asthma through the inhibition of PP2A activity. It is strongly upregulated in cytotoxic lymphocytes (CTLs) and directs lytic granule exocytosis and cytotoxicity of killer T cells. Loss-of-function mutations in MID1 lead to the human X-linked Opitz G/BBB (XLOS) syndrome characterized by defective midline development during embryogenesis. It heterodimerizes in vitro with its paralog MID2. MID1 belongs to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, the fibronectin type III domain and the SPRY/B30.2 domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	47
380881	cd19823	Bbox2_MID2_C-I	B-box-type 2 zinc finger found in midline-2 (MID2) and similar proteins. MID2, also known as midin-2, midline defect 2, RING finger protein 60 (RNF60), or tripartite motif-containing protein 1 (TRIM1), is a probable E3 ubiquitin-protein ligase that is highly related to MID1, which associates with cytoplasmic microtubules along their length and throughout the cell cycle. Like MID1, MID2 associates with the microtubule network and may at least partially compensate for the loss of MID1. Both MID1 and MID2 interact with alpha4, which is a regulatory subunit of PP2-type phosphatases, such as PP2A, and an integral component of the rapamycin-sensitive signaling pathway. MID2 can also substitute for MID1 to control exocytosis of lytic granules in cytotoxic T cells. It heterodimerizes in vitro with its paralog MID1. Loss-of-function mutations in MID2 lead to human X-linked intellectual disability (XLID). MID2 belongs to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxy-terminal subgroup one signature) box, a fibronectin type III (FN3) domain, and a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	40
380882	cd19824	Bbox2_TRIM2_C-VII	B-box-type 2 zinc finger found in tripartite motif-containing protein 2 (TRIM2) and similar proteins. TRIM2, also known as RING finger protein 86 (RNF86), is an E3 ubiquitin-protein ligase that ubiquitinates the neurofilament light chain, a component of the intermediate filament in axons. Loss of function of TRIM2 results in early-onset axonal neuropathy. TRIM2 also plays a role in mediating the p42/p44 MAPK-dependent ubiquitination of the cell death-promoting protein Bcl-2-interacting mediator of cell death (Bim) in rapid ischemic tolerance. TRIM2 belongs to the C-VII subclass of TRIM (tripartite motif)-NHL family that is defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil domain, as well as a NHL (named after proteins NCL-1, HT2A and Lin-41 that contain repeats folded into a six-bladed beta propeller) repeat domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	42
380883	cd19825	Bbox2_TRIM3_C-VII	B-box-type 2 zinc finger found in tripartite motif-containing protein 3 (TRIM3). TRIM3, also known as brain-expressed RING finger protein (BERP), RING finger protein 97 (RNF97), or RING finger protein 22 (RNF22), is an E3 ubiquitin-protein ligase involved in the pathogenesis of various cancers. It functions as a tumor suppressor that regulates asymmetric cell division in neuroblastoma. It binds to the cdk inhibitor p21(WAF1/CIP1) and regulates its availability that promotes cyclin D1-cdk4 nuclear accumulation. Moreover, TRIM3 plays an important role in the central nervous system (CNS). It corresponds to gene BERP (brain-expressed RING finger protein), a unique p53-regulated gene that modulates seizure susceptibility and GABAAR cell surface expression. Furthermore, TRIM3 mediates activity-dependent turnover of postsynaptic density (PSD) scaffold proteins GKAP/SAPAP1 and is a negative regulator of dendritic spine morphology. In addition, TRIM3 may be involved in vesicular trafficking via its association with the cytoskeleton-associated-recycling or transport (CART) complex that is necessary for efficient transferrin receptor recycling, but not for epidermal growth factor receptor (EGFR) degradation. It also regulates the motility of the kinesin superfamily protein KIF21B. TRIM3 belongs to the C-VII subclass of TRIM (tripartite motif)-NHL family that is defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil domain, as well as a NHL (named after proteins NCL-1, HT2A and Lin-41 that contain repeats folded into a six-bladed beta propeller) repeat domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	47
380884	cd19826	Bbox2_TRIM9_C-I	B-box-type 2 zinc finger found in tripartite motif-containing protein 9 (TRIM9) and similar proteins. TRIM9 (the human ortholog of rat Spring), also termed RING finger protein 91 (RNF91), is a brain-specific E3 ubiquitin-protein ligase collaborating with an E2 ubiquitin conjugating enzyme UBCH5b. TRIM9 plays an important role in the regulation of neuronal functions and participates in neurodegenerative disorders through its ligase activity. It interacts with the WD repeat region of beta-transducin repeat-containing protein (beta-TrCP) through its N-terminal degron motif (DSGXXS) depending on the phosphorylation status, and thus negatively regulates nuclear factor-kappaB (NF-kappaB) activation in the NF-kappaB pro-inflammatory signaling pathway. Moreover, TRIM9 acts as a critical catalytic link between Netrin-1 and exocytosis soluble NSF attachment receptor protein (SNARE) machinery in murine cortical neurons. It promotes SNARE-mediated vesicle fusion and axon branching in a Netrin-dependent manner. TRIM9 belongs to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, the fibronectin type III domain and the SPRY/B30.2 domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	49
380885	cd19827	Bbox2_TRIM67_C-I	B-box-type 2 zinc finger  found in tripartite motif-containing protein 67 (TRIM67) and similar proteins. TRIM67, also termed TRIM9-like protein (TNL), is a protein selectively expressed in the cerebellum. It interacts with PRG-1, an important molecule in the control of hippocampal excitability dependent on presynaptic LPA2 receptor signaling, and 80K-H (also known as glucosidase II beta), a protein kinase C substrate. It negatively regulates Ras signaling in cell proliferation via degradation of 80K-H, leading to neural differentiation including neuritogenesis. TRIM67 belongs to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, the fibronectin type III domain and the SPRY/B30.2 domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	45
380886	cd19828	Bbox2_TIF1a_C-VI	B-box-type 2 zinc finger  found in transcription intermediary factor 1-alpha (TIF1-alpha). TIF1-alpha, also known as tripartite motif-containing protein 24 (TRIM24), E3 ubiquitin-protein ligase TRIM24, or RING finger protein 82, belongs to the C-VI subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a plant homeodomain (PHD), and a bromodomain (Bromo) positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. TIF1-alpha interacts specifically and in a ligand-dependent manner with the ligand binding domain (LBD) of several nuclear receptors (NRs), including retinoid X (RXR), retinoic acid (RAR), vitamin D3 (VDR), estrogen (ER), and progesterone (PR) receptors. It also associates with heterochromatin-associated factors HP1alpha, MOD1 (HP1beta), and MOD2 (HP1gamma), as well as the vertebrate Kruppel-type (C2H2) zinc finger proteins that contain the transcriptional silencing domain KRAB. TIF1-alpha is a ligand-dependent co-repressor of retinoic acid receptor (RAR) that interacts with multiple nuclear receptors in vitro via an LXXLL motif and further acts as a gatekeeper of liver carcinogenesis. It also functions as an E3-ubiquitin ligase targeting p53, and is broadly associated with chromatin silencing. Moreover, it is a chromatin regulator that recognizes specific, combinatorial histone modifications through its C-terminal PHD-Bromo region. In addition, it interacts with chromatin and estrogen receptor to activate estrogen-dependent genes associated with cellular proliferation and tumor development.	57
380887	cd19829	Bbox2_TIF1b_C-VI	B-box-type 2 zinc finger  found in transcription intermediary factor 1-beta (TIF1-beta). TIF1-beta, also known as Kruppel-associated Box (KRAB)-associated protein 1 (KAP-1), KRAB-interacting protein 1 (KRIP-1), nuclear co-repressor KAP-1, RING finger protein 96, tripartite motif-containing protein 28 (TRIM28), or E3 SUMO-protein ligase TRIM28, belongs to the C-VI subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a plant homeodomain (PHD) and a bromodomain (Bromo) positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. TIF1-beta acts as a nuclear co-repressor that plays a role in transcription and in the DNA damage response. Upon DNA damage, the phosphorylation of KAP-1 on serine 824 by the ataxia telangiectasia-mutated (ATM) kinase enhances cell survival and facilitates chromatin relaxation and heterochromatic DNA repair. It also regulates CHD3 nucleosome remodeling during the DNA double-strand break (DSB) response. Meanwhile, KAP-1 can be dephosphorylated by protein phosphatase PP4C in the DNA damage response. Moreover, KAP-1 is a co-activator of the orphan nuclear receptor NGFI-B (or Nur77) and is involved in NGFI-B-dependent transcription. It is also a coiled-coil binding partner, substrate and activator of the c-Fes protein tyrosine kinase. The N-terminal RBCC domains of TIF1-beta are responsible for the interaction with KRAB zinc finger proteins (KRAB-ZFPs), MDM2, MM1, C/EBPbeta, and the regulation of homo- and heterodimerization. The C-terminal PHD/Bromo domains are involved in interacting with SETDB1, Mi-2alpha and other proteins to form complexes with histone deacetylase or methyltransferase activity.	44
380888	cd19830	Bbox2_TIF1g_C-VI	B-box-type 2 zinc finger  found in transcription intermediary factor 1 gamma (TIF1-gamma). TIF1-gamma, also known as tripartite motif-containing 33 (TRIM33), ectodermin, RFG7, or PTC7, belongs to the C-VI subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a plant homeodomain (PHD), and a bromodomain (Bromo) positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif. TIF1-gamma is an E3-ubiquitin ligase that functions as a regulator of transforming growth factor beta (TGFbeta) signaling. It inhibits the Smad4-mediated TGFbeta response by interaction with Smad2/3 or ubiquitylation of Smad4. Moreover, TIF1gamma is an important regulator of transcription during hematopoiesis, as well as a key actor of tumorigenesis. Like other TIF1 family members, TIF1-gamma also contains an intrinsic transcriptional silencing function. It can control erythroid cell fate by regulating transcription elongation. It can bind to the anaphase-promoting complex/cyclosome (APC/C) and promotes mitosis.	53
380889	cd19831	Bbox2_MuRF1_C-II	B-box-type 2 zinc finger  found in muscle-specific RING finger protein 1 (MuRF-1) and similar proteins. MuRF-1, also known as tripartite motif-containing protein 63 (TRIM63), RING finger protein 28 (RNF28), iris RING finger protein, or striated muscle RING zinc finger, is an E3 ubiquitin-protein ligase in ubiquitin-mediated muscle protein turnover. It is predominantly fast (type II) fibre-associated in skeletal muscle and can bind to many myofibrillar proteins, including titin, nebulin, the nebulin-related protein NRAP, troponin-I (TnI), troponin-T (TnT), myosin light chain 2 (MLC-2), myotilin, and T-cap. The early and robust upregulation of MuRF-1 is triggered by disuse, denervation, starvation, sepsis, or steroid administration resulting in skeletal muscle atrophy. It also plays a role in maintaining titin M-line integrity. It associates with the periphery of the M-line lattice and may be involved in the regulation of the titin kinase domain. It also participates in muscle stress response pathways and gene expression. MuRF-1 belongs to the C-II subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, and an acidic residue-rich (AR) domain. It also harbors a MURF family-specific conserved box (MFC) between its RING-HC finger and Bbox domains. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	43
380890	cd19832	Bbox2_MuRF2_C-II	B-box-type 2 zinc finger  found in muscle-specific RING finger protein 2 (MuRF-2) and similar proteins. MuRF-2, also known as tripartite motif-containing protein 55 (TRIM55) or RING finger protein 29 (RNF29), is a muscle-specific E3 ubiquitin-protein ligase in ubiquitin-mediated muscle protein turnover and also a ligand of the transactivation domain of the serum response transcription factor (SRF). It is predominantly slow-fibre associated and highly expressed in embryonic skeletal muscle. MuRF-2 associates transiently with microtubules, myosin, and titin during sarcomere assembly. It has been implicated in microtubule, intermediate filament, and sarcomeric M-line maintenance in striated muscle development, as well as in signalling from the sarcomere to the nucleus. It plays an important role in the earliest stages of skeletal muscle differentiation and myofibrillogenesis. It is developmentally downregulated and is assembled at the M-line region of the sarcomere and with microtubules. MuRF-2 belongs to the C-II subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, and an acidic residue-rich (AR) domain. It also harbors a MURF family-specific conserved box (MFC) between its RING-HC finger and Bbox domains. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	45
380891	cd19833	Bbox2_MuRF3_C-II	B-box-type 2 zinc finger  found in muscle-specific RING finger protein 3 (MuRF-3) and similar proteins. MuRF-3, also known as tripartite motif-containing protein 54 (TRIM54), or RING finger protein 30 (RNF30), is an E3 ubiquitin-protein ligase in ubiquitin-mediated muscle protein turnover. It is ubiquitously detected in all fibre types, and is developmentally upregulated, associates with microtubules, the sarcomeric M-line and Z-line, and is required for microtubule stability and myogenesis. It associates with glutamylated microtubules during skeletal muscle development, and is required for skeletal myoblast differentiation and development of cellular microtubular networks. MuRF-3 controls the degradation of four-and-a-half LIM domain (FHL2) and gamma-filamin and is required for maintenance of ventricular integrity after myocardial infarction (MI). MuRF-3 belongs to the C-II subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, and an acidic residue-rich (AR) domain. It also harbors a MURF family-specific conserved box (MFC) between its RING-HC finger and Bbox domains. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	43
380892	cd19834	Bbox2_BSPRY	B-box-type 2 zinc finger found in B box and SPRY domain-containing protein (BSPRY) and similar proteins. BSPRY is a regulatory protein for maintaining calcium homeostasis. It may regulate epithelial calcium transport by inhibiting TRPV5 activity. BSPRY is composed of a B-box, an alpha-helical coiled coil and a SPRY domain. The B-box motif shows high sequence similarity with the B-box-type 2 zinc finger found in tripartite motif-containing proteins (TRIMs). The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	43
380893	cd19835	Bbox2_TRIM65_C-IV	B-box-type 2 zinc finger  found in tripartite motif-containing protein 65 (TRIM65) and similar proteins. TRIM65 is an E3 ubiquitin-protein ligase that interacts with the innate immune receptor MDA5 enhancing its ability to stimulate interferon-beta signaling. It functions as a potential oncogenic protein that negatively regulates p53 through ubiquitination, providing insight into development of novel approaches targeting TRIM65 for non-small cell lung carcinoma (NSCLC) treatment, and also overcoming chemotherapy resistance. Moreover, TRIM65 negatively regulates microRNA-driven suppression of mRNA translation by targeting TNRC6 proteins for ubiquitination and degradation. TRIM65 belongs to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox2, and a coiled coil region, as well as a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	42
380894	cd19836	Bbox1_MID1_C-I	B-box-type 1 zinc finger found in midline-1 (MID1) and similar proteins. MID1, also termed midin, or midline 1 RING finger protein, or putative transcription factor XPRF, or RING finger protein 59 (RNF59), or tripartite motif-containing protein 18 (TRIM18), is a microtubule-associated E3 ubiquitin-protein ligase implicated in epithelial-mesenchymal differentiation, cell migration and adhesion, and programmed cell death along specific regions of the ventral midline during embryogenesis. It monoubiquitinates the alpha4 subunit of protein phosphatase 2A (PP2A), promoting proteasomal degradation of the catalytic subunit of PP2A (PP2Ac) and preventing the A and B subunits from forming an active complex. It promotes allergen and rhinovirus-induced asthma through the inhibition of PP2A activity. It is strongly upregulated in cytotoxic lymphocytes (CTLs) and directs lytic granule exocytosis and cytotoxicity of killer T cells. Loss-of-function mutations in MID1 lead to the human X-linked Opitz G/BBB (XLOS) syndrome characterized by defective midline development during embryogenesis. MID1 belongs to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, the fibronectin type III domain and the SPRY/B30.2 domain positioned C-terminal to the RBCC domain. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. MID1 heterodimerizes in vitro with its paralog MID2.	50
380895	cd19837	Bbox1_MID2_C-I	B-box-type 1 zinc finger found in midline-2 (MID2) and similar proteins. MID2, also known as midin-2, midline defect 2, RING finger protein 60 (RNF60), or tripartite motif-containing protein 1 (TRIM1), is a probable E3 ubiquitin-protein ligase that is highly related to MID1, which associates with cytoplasmic microtubules along their length and throughout the cell cycle. Like MID1, MID2 associates with the microtubule network and may at least partially compensate for the loss of MID1. Both MID1 and MID2 interact with alpha4, a regulatory subunit of PP2-type phosphatases, such as PP2A, and an integral component of the rapamycin-sensitive signaling pathway. MID2 can also substitute for MID1 to control exocytosis of lytic granules in cytotoxic T cells. Loss-of-function mutations in MID2 lead to human X-linked intellectual disability (XLID). MID2 belongs to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxy-terminal subgroup one signature) box, a fibronectin type III (FN3) domain, and a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. MID2 heterodimerizes in vitro with its paralog MID1.	53
380896	cd19838	Bbox1_TRIM8_C-V	B-box-type 1 zinc finger found in tripartite motif-containing protein 8 (TRIM8) and similar proteins. TRIM8, also known as glioblastoma-expressed RING finger protein (GERP) or RING finger protein 27 (RNF27), is a probable E3 ubiquitin-protein ligase that may promote proteasomal degradation of suppressor of cytokine signaling 1 (SOCS1) and further regulate interferon-gamma signaling. It functions as a new p53 modulator that stabilizes p53, impairing its association with MDM2 and inducing the reduction of cell proliferation. TRIM8 deficit dramatically impairs p53 stabilization and activation in response to chemotherapeutic drugs. TRIM8 also modulates tumor necrosis factor-alpha (TNFalpha) and interleukin-1beta (IL-1beta)-triggered nuclear factor-kappaB (NF-kappa B) activation by targeting transforming growth factor beta (TGFbeta) activated kinase 1 (TAK1) for K63-linked polyubiquitination. Moreover, TRIM8 modulates translocation of phosphorylated STAT3 into the nucleus through interaction with Hsp90beta and consequently regulates transcription of Nanog in embryonic stem cells. It also interacts with protein inhibitor of activated STAT3 (PIAS3), which inhibits IL-6-dependent activation of STAT3. TRIM8 belongs to the C-V subclass of nuclear TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil domain, as well as an uncharacterized region positioned C-terminal to the RBCC domain. The coiled coil domain is required for homodimerization and the region immediately C-terminal to the RING motif is sufficient to mediate the interaction with SOCS1. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif.	48
380897	cd19839	Bbox1_TRIM16	B-box-type 1 zinc finger found in tripartite motif-containing protein 16 (TRIM16) and similar proteins. TRIM16, also termed estrogen-responsive B box protein (EBBP), is a regulator that may play a role in the regulation of keratinocyte differentiation. It may also act as a tumor suppressor by affecting cell proliferation and migration or tumorigenicity in carcinogenesis. TRIM16 belongs to an unclassified TRIM (tripartite motif) family of proteins that do not have RING fingers and thus lack the characteristic tripartite (RING (R), B-box, and coiled coil (CC)) RBCC motif. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif.	46
380898	cd19840	Bbox1_TRIM29	B-box-type 1 zinc finger found in tripartite motif-containing protein 29 (TRIM29) and similar proteins. TRIM29, also termed ataxia telangiectasia group D-associated protein (ATDC), plays a crucial role in the regulation of macrophage activation in response to viral or bacterial infections within the respiratory tract. TRIM29 belongs to an unclassified TRIM (tripartite motif) family of proteins that do not have RING fingers and thus lack the characteristic tripartite (RING (R), B-box, and coiled coil (CC)) RBCC motif. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif.	47
380899	cd19841	Bbox1_TRIM44	B-box-type 1 zinc finger found in tripartite motif-containing protein 44 (TRIM44) and similar proteins. TRIM44, also termed protein DIPB, functions as a critical regulator in tumor metastasis and progression. TRIM44 belongs to an unclassified TRIM (tripartite motif) family of proteins that do not have RING fingers and thus lack the characteristic tripartite (RING (R), B-box, and coiled coil (CC)) RBCC motif. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif; this family contains a modified motif, C5H3.	46
380900	cd19842	Bbox1_TRIM25-like_C-IV	B-box-type 1 zinc finger found in tripartite motif-containing proteins, TRIM25, TRIM47 and similar proteins. The family includes tripartite motif-containing proteins, TRIM25 and TRIM47, both of which belong to the C-IV subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a SPRY/B30.2 domain positioned C-terminal to the RBCC domain. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. TRIM25, also termed estrogen-responsive finger protein (EFP), or ubiquitin/ISG15-conjugating enzyme TRIM25, or zinc finger protein 147 (ZNF147), or E3 ubiquitin/ISG15 ligase TRIM25, is induced by estrogen and is particularly abundant in placenta and uterus. It has been implicated in cell proliferation, protein modification, and the retinoic acid inducible gene I (RIG-I)-mediated antiviral signaling pathway. It functions as an E3-ubiquitin ligase able to transfer ubiquitin and ISG15 to target proteins. TRIM47, also known as gene overexpressed in astrocytoma protein (GOA) or RING finger protein 100 (RNF100), plays an important role in the process of dedifferentiation that is associated with astrocytoma tumorigenesis.	49
380901	cd19843	Bbox1_TRIM9_C-I	B-box-type 1 zinc finger found in tripartite motif-containing protein 9 (TRIM9) and similar proteins. TRIM9 (the human ortholog of rat Spring), also termed RING finger protein 91 (RNF91), is a brain-specific E3 ubiquitin-protein ligase collaborating with an E2 ubiquitin conjugating enzyme UBCH5b. TRIM9 plays an important role in the regulation of neuronal functions and participates in neurodegenerative disorders through its ligase activity. It interacts with the WD repeat region of beta-transducin repeat-containing protein (beta-TrCP) through its N-terminal degron motif depending on the phosphorylation status, and thus negatively regulates nuclear factor-kappaB (NF-kappaB) activation in the NF-kappaB pro-inflammatory signaling pathway. Moreover, TRIM9 acts as a critical catalytic link between Netrin-1 and exocytosis soluble NSF attachment receptor protein (SNARE) machinery in murine cortical neurons. It promotes SNARE-mediated vesicle fusion and axon branching in a Netrin-dependent manner. TRIM9 belongs to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, the fibronectin type III domain and the SPRY/B30.2 domain positioned C-terminal to the RBCC domain. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif.	47
380902	cd19844	Bbox1_TRIM67_C-I	B-box-type 1 zinc finger found in tripartite motif-containing protein 67 (TRIM67) and similar proteins. TRIM67, also termed TRIM9-like protein (TNL), is a protein selectively expressed in the cerebellum. It interacts with PRG-1, an important molecule in the control of hippocampal excitability dependent on presynaptic LPA2 receptor signaling, and 80K-H (also known as glucosidase II beta), a protein kinase C substrate. It negatively regulates Ras signaling in cell proliferation via degradation of 80K-H, leading to neural differentiation including neuritogenesis. TRIM67 belongs to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, the fibronectin type III domain and the SPRY/B30.2 domain positioned C-terminal to the RBCC domain. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif.	49
380903	cd19845	Bbox1_TIF1a_C-VI	B-box-type 1 zinc finger found in transcription intermediary factor 1-alpha (TIF1-alpha). TIF1-alpha, also known as tripartite motif-containing protein 24 (TRIM24), E3 ubiquitin-protein ligase TRIM24, or RING finger protein 82, belongs to the C-VI subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a plant homeodomain (PHD), and a bromodomain (Bromo) positioned C-terminal to the RBCC domain. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. TIF1-alpha interacts specifically and in a ligand-dependent manner with the ligand binding domain (LBD) of several nuclear receptors (NRs), including retinoic X (RXR), retinoic acid (RAR), vitamin D3 (VDR), estrogen (ER), and progesterone (PR) receptors. It also associates with heterochromatin-associated factors HP1alpha, MOD1 (HP1beta), and MOD2 (HP1gamma), as well as the vertebrate Kruppel-type (C2H2) zinc finger proteins that contain the transcriptional silencing domain KRAB. TIF1-alpha is a ligand-dependent co-repressor of retinoic acid receptor (RAR) that interacts with multiple nuclear receptors in vitro via an LXXLL motif and further acts as a gatekeeper of liver carcinogenesis. It also functions as an E3-ubiquitin ligase targeting p53, and is broadly associated with chromatin silencing. Moreover, it is a chromatin regulator that recognizes specific, combinatorial histone modifications through its C-terminal PHD-Bromo region. In addition, it interacts with chromatin and estrogen receptor to activate estrogen-dependent genes associated with cellular proliferation and tumor development.	45
380904	cd19846	Bbox1_TIF1b_C-VI	B-box-type 1 zinc finger found in transcription intermediary factor 1-beta (TIF1-beta). TIF1-beta, also known as Kruppel-associated Box (KRAB)-associated protein 1 (KAP-1), KRAB-interacting protein 1 (KRIP-1), nuclear co-repressor KAP-1, RING finger protein 96, tripartite motif-containing protein 28 (TRIM28), or E3 SUMO-protein ligase TRIM28, belongs to the C-VI subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a plant homeodomain (PHD), and a bromodomain (Bromo) positioned C-terminal to the RBCC domain. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. TIF1-beta/KAP-1 acts as a nuclear co-repressor that plays a role in transcription and in the DNA damage response. Upon DNA damage, the phosphorylation of KAP-1 on serine 824 by the ataxia telangiectasia-mutated (ATM) kinase enhances cell survival and facilitates chromatin relaxation and heterochromatic DNA repair. It also regulates CHD3 nucleosome remodeling during the DNA double-strand break (DSB) response. Meanwhile, KAP-1 can be dephosphorylated by protein phosphatase PP4C in the DNA damage response. Moreover, KAP-1 is a co-activator of the orphan nuclear receptor NGFI-B (or Nur77) and is involved in NGFI-B-dependent transcription. It is also a coiled-coil binding partner, substrate and activator of the c-Fes protein tyrosine kinase. The N-terminal RBCC domains of TIF1-beta are responsible for the interaction with KRAB zinc finger proteins (KRAB-ZFPs), MDM2, MM1, C/EBPbeta, and the regulation of homo- and heterodimerization. The C-terminal PHD/Bromo domains are involved in interacting with SETDB1, Mi-2alpha and other proteins to form complexes with histone deacetylase or methyltransferase activity.	52
380905	cd19847	Bbox1_TIF1g_C-VI	B-box-type 1 zinc finger found in transcriptional intermediary factor 1 gamma (TIF1-gamma). TIF1-gamma, also known as tripartite motif-containing 33 (TRIM33), ectodermin, RFG7, or PTC7, belongs to the C-VI subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a plant homeodomain (PHD), and a bromodomain (Bromo) positioned C-terminal to the RBCC domain. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif. TIF1-gamma is an E3-ubiquitin ligase that functions as a regulator of transforming growth factor beta (TGFbeta) signaling. It inhibits the Smad4-mediated TGFbeta response by interaction with Smad2/3 or ubiquitylation of Smad4. Moreover, TIF1-gamma is an important regulator of transcription during hematopoiesis, as well as a key actor of tumorigenesis. Like other TIF1 family members, TIF1-gamma also contains an intrinsic transcriptional silencing function. It can control erythroid cell fate by regulating transcription elongation. It can bind to the anaphase-promoting complex/cyclosome (APC/C) and promotes mitosis.	54
380906	cd19848	Bbox1_TRIM36_C-I	B-box-type 1 zinc finger found in tripartite motif-containing protein 36 (TRIM36) and similar proteins. TRIM36, the human ortholog of mouse Haprin, also known as RING finger protein 98 (RNF98) or zinc-binding protein Rbcc728, is an E3 ubiquitin-protein ligase expressed in the germ plasm. It has been implicated in acrosome reaction, fertilization, and embryogenesis, as well as in carcinogenesis. TRIM36 functions upstream of Wnt/beta-catenin activation, and plays a role in controlling the stability of proteins regulating microtubule polymerization during cortical rotation, and subsequent dorsal axis formation. It is also potentially associated with chromosome segregation by interacting with the kinetochore protein centromere protein-H (CENP-H), and colocalizing with the microtubule protein alpha-tubulin. Its overexpression may cause chromosomal instability and carcinogenesis. It is, thus, a novel regulator affecting cell cycle progression. Moreover, TRIM36 plays a critical role in the arrangement of somites during embryogenesis. TRIM36 belongs to the C-I subclass of TRIM (tripartite motif) family of proteins that are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, a fibronectin type III (FN3) domain, and a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif.	55
380907	cd19849	Bbox1_TRIM46_C-I	B-box-type 1 zinc finger found in tripartite motif-containing protein 46 (TRIM46) and similar proteins. TRIM46, also known as gene Y protein (GeneY) or tripartite, fibronectin type-III and C-terminal SPRY motif protein (TRIFIC), is a microtubule-associated protein that specifically localizes to the proximal axon, partly overlaps with the axon initial segment (AIS) at later stages, and organizes uniform microtubule orientation in axons. It controls neuronal polarity and axon specification by driving the formation of parallel microtubule arrays. TRIM46 belongs to the C-I subclass of TRIM (tripartite motif) family of proteins, which are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a RING finger, Bbox1 and Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box, a fibronectin type III (FN3) domain, and a B30.2/SPRY (SplA and ryanodine receptor) domain positioned C-terminal to the RBCC domain. The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif.	52
381251	cd19851	lipocalin_CHL	chloroplastic lipocalin (CHL) similar to Arabidopsis CHL. Chloroplastic lipocalin (CHL) prevents peroxidation of thylakoid membrane lipids and is protective against oxidative stress, especially that mediated by singlet oxygen in response to excess light and other stress (e.g. heat shocks). CHL is required for seed longevity. This group belongs to the lipocalin/cytosolic fatty-acid binding protein family, members of which have a large beta-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane-bound receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	141
381252	cd19852	FABP_pancrustacea	fatty acid-binding protein similar to Locusta migratoria FABP (Lm-FABP). This subfamily includes fatty acid-binding proteins found mainly in insects such as the migratory locust (Locusta migratoria) FABP (Lm-FABP) and the desert locust (Schistocerca gregaria) FABP (Sg-FABP), having flight muscle tissues that contain unusually high levels of FABP, similar to migratory birds. Both Sg- and Lm-FABP are closely related to the mammalian i-LBP subfamily IV, especially to the heart and adipocyte FABP forms. This subgroup belongs to the intracellular fatty-acid binding protein (FABP) family, members of which are small proteins that bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.	128
380683	cd19854	DSRM_DHX9_rpt1	first double-stranded RNA binding motif of DEAH box protein 9 (DHX9) and similar proteins. DHX9 (EC 3.6.4.13; also known as ATP-dependent RNA helicase A, DExH-box helicase 9 (DDX9), Leukophysin (LKP), nuclear DNA helicase II (NDH II), NDH2, or RNA helicase A) is a multifunctional ATP-dependent nucleic acid helicase that unwinds DNA and RNA in a 3' to 5' direction and plays important roles in many processes, such as DNA replication, transcriptional activation, post-transcriptional RNA regulation, mRNA translation, and RNA-mediated gene silencing. It contains two double-stranded RNA binding motifs (DSRMs) at the N-terminal region. This model corresponds to the first motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	69
380684	cd19855	DSRM_DHX9_rpt2	second double-stranded RNA binding motif of DEAH box protein 9 (DHX9) and similar proteins. DHX9 (EC 3.6.4.13; also known as ATP-dependent RNA helicase A, DExH-box helicase 9 (DDX9), Leukophysin (LKP), nuclear DNA helicase II (NDH II), NDH2, or RNA helicase A) is a multifunctional ATP-dependent nucleic acid helicase that unwinds DNA and RNA in a 3' to 5' direction and plays important roles in many processes, such as DNA replication, transcriptional activation, post-transcriptional RNA regulation, mRNA translation and RNA-mediated gene silencing. It contains two double-stranded RNA binding motifs (DSRMs) at the N-terminal region. This model corresponds to the second motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	75
380685	cd19856	DSRM_Kanadaptin	double-stranded RNA binding motif of Kanadaptin and similar proteins. Kanadaptin (also known as human lung cancer oncogene 3 protein (HLC-3), kidney anion exchanger adapter protein, or solute carrier family 4 anion exchanger member 1 adapter protein (SLC4A1AP)) is a nuclear protein widely expressed in mammalian tissues. It was originally isolated as a kidney Cl-/HCO3- anion exchanger 1 (kAE1)-binding protein. It is a highly mobile nucleocytoplasmic shuttling and multilocalizing protein. Its role in mammalian cells remains unclear. The double-stranded RNA binding motif (DSRM) is not sequence specific, but highly specific for dsRNAs of various origin and structure.	86
380686	cd19857	DSRM_STAU_rpt1	first double-stranded RNA binding motif of Drosophila melanogaster maternal effect protein Staufen and similar proteins. Staufen is a double-stranded RNA binding protein required both for the localization of maternal determinants to the posterior pole of the egg, oskar (osk) RNA, and for correct localization to the anterior pole, anchoring bicoid (bcd) RNA. The family also includes two Staufen homologs from vertebrates, Staufen 1 and Staufen 2. They are present in distinct ribonucleoprotein complexes and associate with different mRNAs. Staufen 1 may play a role in specific positioning of mRNAs at given sites in the cell by cross-linking cytoskeletal and RNA components, and in stimulating their translation at the site. It binds double-stranded RNA (regardless of the sequence) and tubulin. Staufen 2 is an RNA-binding protein required for the microtubule-dependent transport of neuronal RNA from the cell body to the dendrite. Staufen proteins contain five double-stranded RNA binding motifs (DSRMs). This model describes the first motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	64
380687	cd19858	DSRM_STAU_rpt2	second double-stranded RNA binding motif of Drosophila melanogaster maternal effect protein Staufen and similar proteins. Staufen is a double-stranded RNA binding protein required both for the localization of maternal determinants to the posterior pole of the egg, oskar (osk) RNA, and for correct localization to the anterior pole, anchoring bicoid (bcd) RNA. The family also includes two Staufen homologs from vertebrates, Staufen 1 and Staufen 2. They are present in distinct ribonucleoprotein complexes and associate with different mRNAs. Staufen 1 may play a role in specific positioning of mRNAs at given sites in the cell by cross-linking cytoskeletal and RNA components, and in stimulating their translation at the site. It binds double-stranded RNA (regardless of the sequence) and tubulin. Staufen 2 is an RNA-binding protein required for the microtubule-dependent transport of neuronal RNA from the cell body to the dendrite. Staufen proteins contain five double-stranded RNA binding motifs (DSRMs). This model describes the second motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	67
380688	cd19859	DSRM_STAU_rpt3	third double-stranded RNA binding motif of Drosophila melanogaster maternal effect protein Staufen and similar proteins. Staufen is a double-stranded RNA binding protein required both for the localization of maternal determinants to the posterior pole of the egg, oskar (osk) RNA, and for correct localization to the anterior pole, anchoring bicoid (bcd) RNA. The family also includes two Staufen homologs from vertebrates, Staufen 1 and Staufen 2. They are present in distinct ribonucleoprotein complexes and associate with different mRNAs. Staufen 1 may play a role in specific positioning of mRNAs at given sites in the cell by cross-linking cytoskeletal and RNA components, and in stimulating their translation at the site. It binds double-stranded RNA (regardless of the sequence) and tubulin. Staufen 2 is an RNA-binding protein required for the microtubule-dependent transport of neuronal RNA from the cell body to the dendrite. Staufen proteins contain five double-stranded RNA binding motifs (DSRMs). This model describes the third motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	65
380689	cd19860	DSRM_STAU_rpt4	fourth double-stranded RNA binding motif of Drosophila melanogaster maternal effect protein Staufen and similar proteins. Staufen is a double-stranded RNA binding protein required both for the localization of maternal determinants to the posterior pole of the egg, oskar (osk) RNA, and for correct localization to the anterior pole, anchoring bicoid (bcd) RNA. The family also includes two Staufen homologs from vertebrates, Staufen 1 and Staufen 2. They are present in distinct ribonucleoprotein complexes and associate with different mRNAs. Staufen 1 may play a role in specific positioning of mRNAs at given sites in the cell by cross-linking cytoskeletal and RNA components, and in stimulating their translation at the site. It binds double-stranded RNA (regardless of the sequence) and tubulin. Staufen 2 is an RNA-binding protein required for the microtubule-dependent transport of neuronal RNA from the cell body to the dendrite. Staufen proteins contain five double-stranded RNA binding motifs (DSRMs). This model describes the fourth motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	68
380690	cd19861	DSRM_STAU_rpt5	fifth double-stranded RNA binding motif of Drosophila melanogaster maternal effect protein Staufen and similar proteins. Staufen is a double-stranded RNA binding protein required both for the localization of maternal determinants to the posterior pole of the egg, oskar (osk) RNA, and for correct localization to the anterior pole, anchoring bicoid (bcd) RNA. The family also includes two Staufen homologs from vertebrates, Staufen 1 and Staufen 2. They are present in distinct ribonucleoprotein complexes and associate with different mRNAs. Staufen 1 may play a role in specific positioning of mRNAs at given sites in the cell by cross-linking cytoskeletal and RNA components, and in stimulating their translation at the site. It binds double-stranded RNA (regardless of the sequence) and tubulin. Staufen 2 is an RNA-binding protein required for the microtubule-dependent transport of neuronal RNA from the cell body to the dendrite. Staufen proteins contain five double-stranded RNA binding motifs (DSRMs). This model describes the fifth motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	68
380691	cd19862	DSRM_PRKRA-like_rpt1	first double-stranded RNA binding motif of protein activator of the interferon-induced protein kinase (PRKRA) and similar proteins. This family includes protein activator of the interferon-induced protein kinase (PRKRA) and the RISC-loading complex subunit TARBP2. PRKRA (also known as interferon-inducible double-stranded RNA-dependent protein kinase activator A, PKR-associated protein X (RAX), PKR-associating protein X, protein kinase, interferon-inducible double-stranded RNA-dependent activator, PACT, or HSD14) is a cellular activator for double-stranded RNA-dependent protein kinase during stress signaling. TARBP2 (also called TAR RNA-binding protein 2, or trans-activation-responsive RNA-binding protein (TRBP)), participates in the formation of the RNA-induced silencing complex (RISC). It is part of the RISC-loading complex (RLC), together with dicer1 and eif2c2/ago2, and is required to process precursor miRNAs. This family also includes Drosophila melanogaster Loquacious and similar proteins. Loquacious (Loqs) is a double-stranded RNA-binding domain (dsRBD) protein, a homolog of human TAR RNA binding protein (TRBP) that is a protein first identified as binding the HIV trans-activator RNA (TAR). Loqs interacts with Dicer1 (dmDcr1) to facilitate miRNA processing. PRKRA family proteins contain three double-stranded RNA binding motifs (DSRMs). This model describes the first motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	70
380692	cd19863	DSRM_PRKRA-like_rpt2	second double-stranded RNA binding motif of PRKRA, TARBP2 and similar proteins. The family includes protein activator of the interferon-induced protein kinase (PRKRA) and the RISC-loading complex subunit TARBP2. PRKRA (also known as interferon-inducible double-stranded RNA-dependent protein kinase activator A, PKR-associated protein X (RAX), PKR-associating protein X, protein kinase, interferon-inducible double-stranded RNA-dependent activator, PACT, or HSD14) is a cellular activator for double-stranded RNA-dependent protein kinase during stress signaling. TARBP2 (also called TAR RNA-binding protein 2, or trans-activation-responsive RNA-binding protein (TRBP)) participates in the formation of the RNA-induced silencing complex (RISC). It is part of the RISC-loading complex (RLC), together with dicer1 and eif2c2/ago2, and is required to process precursor miRNAs. The family also includes Drosophila melanogaster Loquacious and similar proteins. Loquacious (Loqs) is a double-stranded RNA-binding domain (dsRBD) protein, a homolog of human TAR RNA binding protein (TRBP) that is a protein first identified as binding the HIV trans-activator RNA (TAR). Loqs interacts with Dicer1 (dmDcr1) to facilitate miRNA processing. PRKRA family proteins contain three double-stranded RNA binding motifs (DSRMs). This model describes the second motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	67
380693	cd19864	DSRM_PRKRA-like_rpt3	third double-stranded RNA binding motif of PRKRA, TARBP2 and similar proteins. The family includes protein activator of the interferon-induced protein kinase (PRKRA) and the RISC-loading complex subunit TARBP2. PRKRA (also known as interferon-inducible double-stranded RNA-dependent protein kinase activator A, PKR-associated protein X (RAX), PKR-associating protein X, protein kinase, interferon-inducible double-stranded RNA-dependent activator, PACT, or HSD14) is a cellular activator for double-stranded RNA-dependent protein kinase during stress signaling. TARBP2 (also called TAR RNA-binding protein 2, or trans-activation-responsive RNA-binding protein (TRBP)) participates in the formation of the RNA-induced silencing complex (RISC). It is part of the RISC-loading complex (RLC), together with dicer1 and eif2c2/ago2, and is required to process precursor miRNAs. The family also includes Drosophila melanogaster Loquacious and similar proteins. Loquacious (Loqs) is a double-stranded RNA-binding domain (dsRBD) protein, a homolog of human TAR RNA binding protein (TRBP) that is a protein first identified as binding the HIV trans-activator RNA (TAR). Loqs interacts with Dicer1 (dmDcr1) to facilitate miRNA processing. PRKRA family proteins contain three double-stranded RNA binding motifs (DSRMs). This model describes the third motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	72
380694	cd19865	DSRM_STRBP_RED-like_rpt1	first double-stranded RNA binding motif of STRBP, ILF3, RED1, RED2 and similar proteins. This family includes spermatid perinuclear RNA-binding protein (STRBP) and interleukin enhancer-binding factor 3 (ILF3), as well as two RNA-editing deaminases, RED1 and RED2. STRBP is a double-stranded DNA and RNA binding protein that is involved in spermatogenesis and sperm function. It plays a role in regulation of cell growth. ILF3 (also known as double-stranded RNA-binding protein 76 (DRBP76), M-phase phosphoprotein 4 (MPP4), nuclear factor associated with dsRNA (NFAR), nuclear factor of activated T-cells 90 kDa (NF-AT-90), or translational control protein 80 (TCP80)) is an RNA-binding protein that plays an essential role in the biogenesis of circular RNAs (circRNAs) which are produced by back-splicing circularization of pre-mRNAs. RED1 (EC 3.5.4.37; also called double-stranded RNA-specific editase 1, RNA-editing enzyme 1, dsRNA adenosine deaminase, ADARB1, ADAR2, or DRADA2) catalyzes the hydrolytic deamination of adenosine to inosine in double-stranded RNA (dsRNA), referred to as A-to-I RNA editing. RED2 (also called double-stranded RNA-specific editase B2, RNA-dependent adenosine deaminase 3, RNA-editing enzyme 2, dsRNA adenosine deaminase B2, ADAR3, or ADARB2) prevents the binding of other ADAR enzymes to targets in vitro, and decreases the efficiency of these enzymes. It is capable of binding to dsRNA, but also to ssRNA. RED2 lacks editing activity for currently known substrate RNAs. Members of this group contain two double-stranded RNA binding motifs (DSRMs). This model describes the first motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	63
380695	cd19866	DSRM_STRBP_RED-like_rpt2	second double-stranded RNA binding motif of STRBP, ILF3, RED1, RED2 and similar proteins. This family includes spermatid perinuclear RNA-binding protein (STRBP) and interleukin enhancer-binding factor 3 (ILF3), as well as two RNA-editing deaminases, RED1 and RED2. STRBP is a double-stranded DNA and RNA binding protein that is involved in spermatogenesis and sperm function. It plays a role in regulation of cell growth. ILF3 (also known as double-stranded RNA-binding protein 76 (DRBP76), M-phase phosphoprotein 4 (MPP4), nuclear factor associated with dsRNA (NFAR), nuclear factor of activated T-cells 90 kDa (NF-AT-90), or translational control protein 80 (TCP80)) is an RNA-binding protein that plays an essential role in the biogenesis of circular RNAs (circRNAs) which are produced by back-splicing circularization of pre-mRNAs. RED1 (EC 3.5.4.37; also called double-stranded RNA-specific editase 1, RNA-editing enzyme 1, dsRNA adenosine deaminase, ADARB1, ADAR2, or DRADA2) catalyzes the hydrolytic deamination of adenosine to inosine in double-stranded RNA (dsRNA), referred to as A-to-I RNA editing. RED2 (also called double-stranded RNA-specific editase B2, RNA-dependent adenosine deaminase 3, RNA-editing enzyme 2, dsRNA adenosine deaminase B2, ADAR3, or ADARB2) prevents the binding of other ADAR enzymes to targets in vitro, and decreases the efficiency of these enzymes. It is capable of binding to dsRNA but also to ssRNA. RED2 lacks editing activity for currently known substrate RNAs. Members of this group contain two double-stranded RNA binding motifs (DSRMs). This model describes the second motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	63
380696	cd19867	DSRM_DGCR8_rpt1	first double-stranded RNA binding motif of DiGeorge syndrome critical region 8 (DGCR8) and similar proteins. DGCR8 is a component of the microprocessor complex that acts as an RNA- and heme-binding protein that is involved in the initial step of microRNA (miRNA) biogenesis. Within the microprocessor complex, DGCR8 functions as a molecular anchor necessary for the recognition of pri-miRNA at dsRNA-ssRNA junction and directs DROSHA to cleave 11bp away from the junction to release hairpin-shaped pre-miRNAs that are subsequently cut by the cytoplasmic DICER to generate mature miRNAs. DGCR8 contains two double-stranded RNA binding motifs (DSRMs). This model describes the first motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	74
380697	cd19868	DSRM_DGCR8_rpt2	second double-stranded RNA binding motif of DiGeorge syndrome critical region 8 (DGCR8) and similar proteins. DGCR8 is a component of the microprocessor complex that acts as an RNA- and heme-binding protein that is involved in the initial step of microRNA (miRNA) biogenesis. Within the microprocessor complex, DGCR8 functions as a molecular anchor necessary for the recognition of pri-miRNA at dsRNA-ssRNA junction and directs DROSHA to cleave 11bp away from the junction to release hairpin-shaped pre-miRNAs that are subsequently cut by the cytoplasmic DICER to generate mature miRNAs. DGCR8 contains two double-stranded RNA binding motifs (DSRMs). This model describes the second motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	69
380698	cd19869	DSRM_DCL_plant	double-stranded RNA binding motif of plant Dicer-like proteins. The family includes plant Dicer-like (DCL) proteins and other ribonuclease (RNase) III-like (RTL) proteins. DCLs are endoribonucleases involved in RNA-mediated post-transcriptional gene silencing (PTGS). They function in the microRNA (miRNA) biogenesis pathway by cleaving primary miRNAs (pri-miRNAs) and precursor miRNAs (pre-miRNAs). Family members contain a double-stranded RNA binding motif (DSRM) at the C-terminus. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	70
380699	cd19870	DSRM_SON-like	double-stranded RNA binding motif of protein SON and similar proteins. Protein SON (also known as Bax antagonist selected in saccharomyces 1 (BASS1), negative regulatory element-binding protein (NRE-binding protein), or protein DBP-5, or SON3) is an RNA-binding protein which acts as an mRNA splicing cofactor by promoting efficient splicing of transcripts that possess weak splice sites. It specifically promotes splicing of many cell-cycle and DNA-repair transcripts that possess weak splice sites, such as TUBG1, KATNB1, TUBGCP2, AURKB, PCNT, AKT1, RAD23A, and FANCG. Members of this group contain a double-stranded RNA binding motif (DSRM) at the C-terminus. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	75
380700	cd19871	DSRM_DUS2L	double-stranded RNA binding motif of tRNA-dihydrouridine(20) synthase [NAD(P)+]-like (DUS2L) and similar proteins. DUS2L (also known as dihydrouridine synthase 2 (DUS2), up-regulated in lung cancer protein 8 (URLC8), or tRNA-dihydrouridine synthase 2-like) catalyzes the synthesis of dihydrouridine, a modified base found in the D-loop of most tRNAs. It negatively regulates the activation of EIF2AK2/PKR. DUS2L contains an N-terminal FMN-binding domain and a C-terminal double-stranded RNA binding motif (DSRM) that is not sequence specific, but highly specific for dsRNAs of various origin and structure.	68
380701	cd19872	DSRM_A1CF-like	double-stranded RNA binding motif of APOBEC1 complementation factor (A1CF), RNA-binding protein 46 (RBM46) and similar proteins. The family includes two dsRNA-binding motif-containing proteins, A1CF and RBM46. A1CF (also known as APOBEC1-stimulating protein) is an essential component of the apolipoprotein B mRNA editing enzyme complex which is responsible for the posttranscriptional editing of a CAA codon for Gln to a UAA codon for stop in APOB mRNA. A1CF binds to APOB mRNA and is probably responsible for docking the catalytic subunit, APOBEC1, to the mRNA to allow it to deaminate its target cytosine. RBM46 (also called cancer/testis antigen 68 (CT68), or RNA-binding motif protein 46) plays a novel role in the regulation of embryonic stem cell (ESC) differentiation by regulating the degradation of beta-catenin mRNA. It also regulates trophectoderm specification by stabilizing Cdx2 mRNA in early mouse embryos. Members of this family contain three RNA recognition motifs (RRMs) and a C-terminal double-stranded RNA binding motif (DSRM) that is not sequence specific, but highly specific for dsRNAs of various origin and structure.	75
380702	cd19873	DSRM_MRPL3_like	double-stranded RNA binding motif of Saccharomyces cerevisiae mitochondrial 54S ribosomal protein L3 (MRPL3) and similar proteins. MRPL3 (also called mitochondrial large ribosomal subunit protein mL44) is a component of the mitochondrial ribosome (mitoribosome), a dedicated translation machinery responsible for the synthesis of mitochondrial genome-encoded proteins, including at least some of the essential transmembrane subunits of the mitochondrial respiratory chain. MRPL3 contains an RNase III-like domain and a double-stranded RNA binding motif (DSRM). DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	84
380703	cd19874	DSRM_MRPL44	double-stranded RNA binding motif of mitochondrial 39S ribosomal protein L44 (MRPL44) and similar proteins. MRPL44 (also known as L44mt, MRP-L44, or mitochondrial large ribosomal subunit protein mL44) is a component of the 39S subunit of mitochondrial ribosome. It may play a role in the assembly/stability of nascent mitochondrial polypeptides exiting the ribosome. MRPL44 contains an RNase III-like domain and a double-stranded RNA binding motif (DSRM). DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	84
380704	cd19875	DSRM_EIF2AK2-like	double-stranded RNA binding motif of eukaryotic translation initiation factor 2-alpha kinase 2 (EIF2AK2) and similar proteins. The family includes EIF2AK2 and adenosine deaminase domain-containing proteins, ADAD1 and ADAD2. EIF2AK2 (EC 2.7.11.1/EC 2.7.10.2; also known as interferon-induced, double-stranded RNA-activated protein kinase, eIF-2A protein kinase 2, interferon-inducible RNA-dependent protein kinase, P1/eIF-2A protein kinase, protein kinase RNA-activated (PKR), protein kinase R, tyrosine-protein kinase EIF2AK2, or p68 kinase) acts as an IFN-induced dsRNA-dependent serine/threonine-protein kinase which plays a key role in the innate immune response to viral infection and is also involved in the regulation of signal transduction, apoptosis, cell proliferation and differentiation. ADAD1 (also called testis nuclear RNA-binding protein (TENR)) and ADAD2 (also called testis nuclear RNA-binding protein-like (TENRL)) are phylogenetically related to a family of adenosine deaminases involved in RNA editing. ADAD1 plays an essential function in spermatid morphogenesis. It may be involved in testis-specific nuclear post-transcriptional processes such as heterogeneous nuclear RNA (hnRNA) packaging, alternative splicing, or nuclear/cytoplasmic transport of mRNAs. ADAD2 is a double-stranded RNA binding protein with unclear biological function. Members of this group contain varying numbers of double-stranded RNA binding motifs (DSRMs). DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	67
380705	cd19876	DSRM_RNT1p-like	double-stranded RNA binding motif of Saccharomyces cerevisiae ribonuclease 3 (RNT1p) and similar proteins. RNT1p (EC 3.1.26.3; also known as ribonuclease III (RNase III)) is a dsRNA-specific nuclease that cleaves eukaryotic pre-ribosomal RNA at the U3 snoRNP-dependent A0 site in the 5'-external transcribed spacer (ETS) and in the 3'-ETS. RNT1p contains a double-stranded RNA binding motif (DSRM) at the C-terminus. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	69
380706	cd19877	DSRM_RNAse_III_meta_like	double-stranded RNA binding motif of metazoan ribonuclease III (RNase III) and similar proteins. RNase III (EC 3.1.26.3; also known as Drosha, or ribonuclease 3) is a double-stranded RNA (dsRNA)-specific endoribonuclease that is involved in the initial step of microRNA (miRNA) biogenesis. It is a component of the microprocessor complex that is required to process primary miRNA transcripts (pri-miRNAs) to release precursor miRNA (pre-miRNA) in the nucleus. Within the microprocessor complex, RNase III cleaves the 3' and 5' strands of a stem-loop in pri-miRNAs (processing center 11 bp from the dsRNA-ssRNA junction) to release hairpin-shaped pre-miRNAs that are subsequently cut by the cytoplasmic DICER to generate mature miRNAs. It is also involved in pre-rRNA processing. Metazoan RNase III is a larger protein than bacterial RNase III. It contains two RNase III domains in the C-terminal half of the protein followed by a double-stranded RNA binding motif (DSRM). DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	75
380707	cd19878	DSRM_AtDRB-like	double-stranded RNA binding motif of Arabidopsis thaliana double-stranded RNA-binding proteins (AtDRBs) and similar proteins. This family includes a group of Arabidopsis thaliana double-stranded RNA-binding proteins (AtDRB1-5). They bind double-stranded RNA (dsRNA) and may be involved in RNA-mediated silencing. Members of this family contain two to three double-stranded RNA binding motifs (DSRMs). DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	67
380708	cd19879	DSRM_STAU1_rpt1	first double-stranded RNA binding motif of double-stranded RNA-binding protein Staufen homolog 1 (Staufen 1) and similar proteins. Staufen 1 may play a role in specific positioning of mRNAs at given sites in the cell by cross-linking cytoskeletal and RNA components, and in stimulating their translation at the site. It binds double-stranded RNA (regardless of the sequence) and tubulin. Staufen 1 contains five double-stranded RNA binding motifs (DSRMs). This model describes the first motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	66
380709	cd19880	DSRM_STAU2_rpt1	first double-stranded RNA binding motif of double-stranded RNA-binding protein Staufen homolog 2 (Staufen 2) and similar proteins. Staufen 2 is an RNA-binding protein required for the microtubule-dependent transport of neuronal RNA from the cell body to the dendrite. Staufen 2 contains five double-stranded RNA binding motifs (DSRMs). This model describes the first motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	68
380710	cd19881	DSRM_STAU1_rpt2	second double-stranded RNA binding motif of double-stranded RNA-binding protein Staufen homolog 1 (Staufen 1) and similar proteins. Staufen 1 may play a role in specific positioning of mRNAs at given sites in the cell by cross-linking cytoskeletal and RNA components, and in stimulating their translation at the site. It binds double-stranded RNA (regardless of the sequence) and tubulin. Staufen 1 contains five double-stranded RNA binding motifs (DSRMs). This model describes the second motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	79
380711	cd19882	DSRM_STAU2_rpt2	second double-stranded RNA binding motif of double-stranded RNA-binding protein Staufen homolog 2 (Staufen 2) and similar proteins. Staufen 2 is an RNA-binding protein required for the microtubule-dependent transport of neuronal RNA from the cell body to the dendrite. Staufen 2 contains five double-stranded RNA binding motifs (DSRMs). This model describes the second motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	82
380712	cd19883	DSRM_STAU1_rpt3	third double-stranded RNA binding motif of double-stranded RNA-binding protein Staufen homolog 1 (Staufen 1) and similar proteins. Staufen 1 may play a role in specific positioning of mRNAs at given sites in the cell by cross-linking cytoskeletal and RNA components, and in stimulating their translation at the site. It binds double-stranded RNA (regardless of the sequence) and tubulin. Staufen 1 contains five double-stranded RNA binding motifs (DSRMs). This model describes the third motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	67
380713	cd19884	DSRM_STAU2_rpt3	third double-stranded RNA binding motif of double-stranded RNA-binding protein Staufen homolog 2 (Staufen 2) and similar proteins. Staufen 2 is an RNA-binding protein required for the microtubule-dependent transport of neuronal RNA from the cell body to the dendrite. Staufen 2 contains five double-stranded RNA binding motifs (DSRMs). This model describes the third motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	67
380714	cd19885	DSRM_STAU1_rpt4	fourth double-stranded RNA binding motif of double-stranded RNA-binding protein Staufen homolog 1 (Staufen 1) and similar proteins. Staufen 1 may play a role in specific positioning of mRNAs at given sites in the cell by cross-linking cytoskeletal and RNA components, and in stimulating their translation at the site. It binds double-stranded RNA (regardless of the sequence) and tubulin. Staufen 1 contains five double-stranded RNA binding motifs (DSRMs). This model describes the fourth motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	86
380715	cd19886	DSRM_STAU2_rpt4	fourth double-stranded RNA binding motif of double-stranded RNA-binding protein Staufen homolog 2 (Staufen 2) and similar proteins. Staufen 2 is an RNA-binding protein required for the microtubule-dependent transport of neuronal RNA from the cell body to the dendrite. Staufen 2 contains five double-stranded RNA binding motifs (DSRMs). This model describes the fourth motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	86
380716	cd19887	DSRM_STAU1_rpt5	fifth double-stranded RNA binding motif of double-stranded RNA-binding protein Staufen homolog 1 (Staufen 1) and similar proteins. Staufen 1 may play a role in specific positioning of mRNAs at given sites in the cell by cross-linking cytoskeletal and RNA components, and in stimulating their translation at the site. It binds double-stranded RNA (regardless of the sequence) and tubulin. Staufen 1 contains five double-stranded RNA binding motifs (DSRMs). This model describes the fifth motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	70
380717	cd19888	DSRM_STAU2_rpt5	fifth double-stranded RNA binding motif of double-stranded RNA-binding protein Staufen homolog 2 (Staufen 2) and similar proteins. Staufen 2 is an RNA-binding protein required for the microtubule-dependent transport of neuronal RNA from the cell body to the dendrite. Staufen 2 contains five double-stranded RNA binding motifs (DSRMs). This model describes the fifth motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	68
380718	cd19889	DSRM_PRKRA_rpt1	first double-stranded RNA binding motif of protein activator of the interferon-induced protein kinase (PRKRA) and similar proteins. PRKRA (also known as interferon-inducible double-stranded RNA-dependent protein kinase activator A, PKR-associated protein X (RAX), PKR-associating protein X, protein kinase, interferon-inducible double-stranded RNA-dependent activator, PACT, or HSD14) is a cellular activator for double-stranded RNA-dependent protein kinase during stress signaling. PRKRA contains three double-stranded RNA binding motifs (DSRMs). This model describes the first motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	71
380719	cd19890	DSRM_TARBP2_rpt1	first double-stranded RNA binding motif of the RISC-loading complex subunit TARBP2 and similar proteins. TARBP2 (also known as TAR RNA-binding protein 2, or trans-activation-responsive RNA-binding protein (TRBP)), participates in the formation of the RNA-induced silencing complex (RISC). It is part of the RISC-loading complex (RLC), together with dicer1 and eif2c2/ago2, and is required to process precursor miRNAs. TARBP2 contains three double-stranded RNA binding motifs (DSRMs). This model describes the first motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	72
380720	cd19891	DSRM_PRKRA_rpt2	second double-stranded RNA binding motif of protein activator of the interferon-induced protein kinase (PRKRA) and similar proteins. PRKRA (also known as interferon-inducible double-stranded RNA-dependent protein kinase activator A, PKR-associated protein X (RAX), PKR-associating protein X, protein kinase, interferon-inducible double-stranded RNA-dependent activator, PACT, or HSD14) is a cellular activator for double-stranded RNA-dependent protein kinase during stress signaling. PRKRA contains three double-stranded RNA binding motifs (DSRMs). This model describes the second motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	67
380721	cd19892	DSRM_PRKRA_rpt3	third double-stranded RNA binding motif of protein activator of the interferon-induced protein kinase (PRKRA) and similar proteins. PRKRA (also known as interferon-inducible double-stranded RNA-dependent protein kinase activator A, PKR-associated protein X (RAX), PKR-associating protein X, protein kinase, interferon-inducible double-stranded RNA-dependent activator, PACT, or HSD14) is a cellular activator for double-stranded RNA-dependent protein kinase during stress signaling. PRKRA contains three double-stranded RNA binding motifs (DSRMs). This model describes the third motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	72
380722	cd19893	DSRM_TARBP2_rpt3	third double-stranded RNA binding motif of the RISC-loading complex subunit TARBP2 and similar proteins. TARBP2 (also known as TAR RNA-binding protein 2, or trans-activation-responsive RNA-binding protein (TRBP)) participates in the formation of the RNA-induced silencing complex (RISC). It is part of the RISC-loading complex (RLC), together with dicer1 and eif2c2/ago2, and is required to process precursor miRNAs. TARBP2 contains three double-stranded RNA binding motifs (DSRMs). This model describes the third motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	72
380723	cd19894	DSRM_STRBP-like_rpt1	first double-stranded RNA binding motif of STRBP, ILF3 and similar proteins. This family includes spermatid perinuclear RNA-binding protein (STRBP) and interleukin enhancer-binding factor 3 (ILF3). STRBP is a double-stranded DNA and RNA binding protein that is involved in spermatogenesis and sperm function. It plays a role in regulation of cell growth. ILF3 (also known as double-stranded RNA-binding protein 76 (DRBP76), M-phase phosphoprotein 4 (MPP4), nuclear factor associated with dsRNA (NFAR), nuclear factor of activated T-cells 90 kDa (NF-AT-90), or translational control protein 80 (TCP80)) is an RNA-binding protein that plays an essential role in the biogenesis of circular RNAs (circRNAs) which are produced by back-splicing circularization of pre-mRNAs. Members of this STRBP/ILF3 group contain an N-terminal DZF domain and two double-stranded RNA binding motifs (DSRMs). This model describes the first motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	63
380724	cd19895	DSRM_RED1_rpt1	first double-stranded RNA binding motif of RNA-editing deaminase 1 (RED1) and similar proteins. RED1 (EC 3.5.4.37; also known as double-stranded RNA-specific editase 1, RNA-editing enzyme 1, dsRNA adenosine deaminase, ADARB1, ADAR2, or DRADA2) catalyzes the hydrolytic deamination of adenosine to inosine in double-stranded RNA (dsRNA), referred to as A-to-I RNA editing. It contains two double-stranded RNA binding motifs (DSRMs) and a C-terminal RNA-specific adenosine-deaminase (editase) domain. This model describes the first DSRM. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	72
380725	cd19896	DSRM_RED2_rpt1	first double-stranded RNA binding motif of RNA-editing deaminase 2 (RED2) and similar proteins. RED2 (also known as double-stranded RNA-specific editase B2, RNA-dependent adenosine deaminase 3, RNA-editing enzyme 2, dsRNA adenosine deaminase B2, ADAR3, or ADARB2) prevents the binding of other ADAR enzymes to targets in vitro, and decreases the efficiency of these enzymes. It is capable of binding to dsRNA but also to ssRNA. RED2 contains two double-stranded RNA binding motifs (DSRMs) and a C-terminal RNA-specific adenosine-deaminase (editase) domain. This model describes the first DSRM. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. RED2 lacks editing activity for currently known substrate RNAs, and may have an inactive editase domain.	74
380726	cd19897	DSRM_STRBP-like_rpt2	second double-stranded RNA binding motif of STRBP, ILF3 and similar proteins. This family includes spermatid perinuclear RNA-binding protein (STRBP) and interleukin enhancer-binding factor 3 (ILF3). STRBP is a double-stranded DNA and RNA binding protein that is involved in spermatogenesis and sperm function. It plays a role in regulation of cell growth. ILF3 (also known as double-stranded RNA-binding protein 76 (DRBP76), M-phase phosphoprotein 4 (MPP4), nuclear factor associated with dsRNA (NFAR), nuclear factor of activated T-cells 90 kDa (NF-AT-90), or translational control protein 80 (TCP80)) is an RNA-binding protein that plays an essential role in the biogenesis of circular RNAs (circRNAs) which are produced by back-splicing circularization of pre-mRNAs. Members of this STRBP/ILF3 group contain an N-terminal DZF domain and two double-stranded RNA binding motifs (DSRMs). This model describes the second motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	64
380727	cd19898	DSRM_RED1_rpt2	second double-stranded RNA binding motif of RNA-editing deaminase 1 (RED1) and similar proteins. RED1 (EC 3.5.4.37; also known as double-stranded RNA-specific editase 1, RNA-editing enzyme 1, dsRNA adenosine deaminase, ADARB1, ADAR2, or DRADA2) catalyzes the hydrolytic deamination of adenosine to inosine in double-stranded RNA (dsRNA), referred to as A-to-I RNA editing. It contains two double-stranded RNA binding motifs (DSRMs) and a C-terminal RNA-specific adenosine-deaminase (editase) domain. This model describes the second DSRM. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	70
380728	cd19899	DSRM_RED2_rpt2	second double-stranded RNA binding motif of RNA-editing deaminase 2 (RED2) and similar proteins. RED2 (also known as double-stranded RNA-specific editase B2, RNA-dependent adenosine deaminase 3, RNA-editing enzyme 2, dsRNA adenosine deaminase B2, ADAR3, or ADARB2) prevents the binding of other ADAR enzymes to targets in vitro, and decreases the efficiency of these enzymes. It is capable of binding to dsRNA but also to ssRNA. RED2 contains two double-stranded RNA binding motifs (DSRMs) and a C-terminal RNA-specific adenosine-deaminase (editase) domain. This model describes the second DSRM. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure. RED2 lacks editing activity for currently known substrate RNAs, and may have an inactive editase domain.	74
380729	cd19900	DSRM_A1CF	double-stranded RNA binding motif of APOBEC1 complementation factor (A1CF) and similar proteins. A1CF (also known as APOBEC1-stimulating protein) is an essential component of the apolipoprotein B mRNA editing enzyme complex which is responsible for the posttranscriptional editing of a CAA codon for Gln to a UAA codon for stop in APOB mRNA. A1CF binds to APOB mRNA and is probably responsible for docking the catalytic subunit, APOBEC1, to the mRNA to allow it to deaminate its target cytosine. It contains three RNA recognition motifs (RRMs) and a C-terminal double-stranded RNA binding motif (DSRM) that is not sequence specific, but highly specific for dsRNAs of various origin and structure.	81
380730	cd19901	DSRM_RBM46	double-stranded RNA binding motif of RNA-binding protein 46 (RBM46) and similar proteins. RBM46 (also known as cancer/testis antigen 68 (CT68), or RNA-binding motif protein 46) plays a novel role in the regulation of embryonic stem cell (ESC) differentiation by regulating the degradation of beta-catenin mRNA. It also regulates trophectoderm specification by stabilizing Cdx2 mRNA in early mouse embryos. RBM46 contains three RNA recognition motifs (RRMs) and a C-terminal double-stranded RNA binding motif (DSRM) that is not sequence specific, but highly specific for dsRNAs of various origin and structure.	78
380731	cd19902	DSRM_DRADA	double-stranded RNA binding motif of double-stranded RNA-specific adenosine deaminase (DRADA) and similar proteins. DRADA (EC 3.5.4.37; also known as 136 kDa double-stranded RNA-binding protein (p136), interferon-inducible protein 4 (IFI-4), K88DSRBP, ADAR1, G1P1, or ADAR) catalyzes the hydrolytic deamination of adenosine to inosine in double-stranded RNA (dsRNA), referred to as A-to-I RNA editing. DRADA family members contain at least one double-stranded RNA binding motif (DSRM); vertebrate proteins contain three. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	71
380732	cd19903	DSRM_EIF2AK2_rpt1	first double-stranded RNA binding motif of eukaryotic translation initiation factor 2-alpha kinase 2 (EIF2AK2) and similar proteins. EIF2AK2 (EC 2.7.11.1/EC 2.7.10.2; also known as interferon-induced, double-stranded RNA-activated protein kinase, eIF-2A protein kinase 2, interferon-inducible RNA-dependent protein kinase, P1/eIF-2A protein kinase, protein kinase RNA-activated (PKR), protein kinase R, tyrosine-protein kinase EIF2AK2, or p68 kinase) acts as an IFN-induced dsRNA-dependent serine/threonine-protein kinase which plays a key role in the innate immune response to viral infection and is also involved in the regulation of signal transduction, apoptosis, cell proliferation and differentiation. EIF2AK2 proteins contain two to three double-stranded RNA binding motifs (DSRMs). This model describes the first motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	68
380733	cd19904	DSRM_EIF2AK2_rpt2	second double-stranded RNA binding motif of eukaryotic translation initiation factor 2-alpha kinase 2 (EIF2AK2) and similar proteins. EIF2AK2 (EC 2.7.11.1/EC 2.7.10.2; also known as interferon-induced, double-stranded RNA-activated protein kinase, eIF-2A protein kinase 2, interferon-inducible RNA-dependent protein kinase, P1/eIF-2A protein kinase, protein kinase RNA-activated (PKR), protein kinase R, tyrosine-protein kinase EIF2AK2, or p68 kinase) acts as an IFN-induced dsRNA-dependent serine/threonine-protein kinase which plays a key role in the innate immune response to viral infection and is also involved in the regulation of signal transduction, apoptosis, cell proliferation and differentiation. EIF2AK2 proteins contain two to three double-stranded RNA binding motifs (DSRMs). This model describes the second motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	69
380734	cd19905	DSRM_ADAD1	double-stranded RNA binding motif of adenosine deaminase domain-containing protein 1 (ADAD1) and similar proteins. ADAD1 (also known as testis nuclear RNA-binding protein (TENR)) is phylogenetically related to a family of adenosine deaminases involved in RNA editing. It plays an essential function in spermatid morphogenesis. It may be involved in testis-specific nuclear post-transcriptional processes such as heterogeneous nuclear RNA (hnRNA) packaging, alternative splicing, or nuclear/cytoplasmic transport of mRNAs. ADAD1 contains a double-stranded RNA binding motif (DSRM) and a C-terminal adenosine-deaminase (editase) domain. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	69
380735	cd19906	DSRM_ADAD2	double-stranded RNA binding motif of adenosine deaminase domain-containing protein 2 (ADAD2) and similar proteins. ADAD2 (also known as testis nuclear RNA-binding protein-like (TENRL)) is phylogenetically related to a family of adenosine deaminases involved in RNA editing. It is a double-stranded RNA binding protein with unclear biological function. ADAD2 contains a double-stranded RNA binding motif (DSRM) and a C-terminal adenosine-deaminase (editase) domain. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	74
380736	cd19907	DSRM_AtDRB-like_rpt1	first double-stranded RNA binding motif of Arabidopsis thaliana double-stranded RNA-binding proteins (AtDRBs) and similar proteins. This family includes a group of Arabidopsis thaliana double-stranded RNA-binding proteins (AtDRB1-5). They bind double-stranded RNA (dsRNA) and may be involved in RNA-mediated silencing. Members of this family contain two to three double-stranded RNA binding motifs (DSRMs). This model describes the first motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	69
380737	cd19908	DSRM_AtDRB-like_rpt2	second double-stranded RNA binding motif of Arabidopsis thaliana double-stranded RNA-binding proteins (AtDRBs) and similar proteins. This family includes a group of Arabidopsis thaliana double-stranded RNA-binding proteins (AtDRB1-5). They bind double-stranded RNA (dsRNA) and may be involved in RNA-mediated silencing. Members of this family contain two to three double-stranded RNA binding motifs (DSRMs). This model describes the second motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	69
380738	cd19909	DSRM_STRBP_rpt1	first double-stranded RNA binding motif of spermatid perinuclear RNA-binding protein (STRBP) and similar proteins. STRBP is a double-stranded DNA and RNA binding protein that is involved in spermatogenesis and sperm function. It plays a role in regulation of cell growth. STRBP contains an N-terminal DZF domain and two double-stranded RNA binding motifs (DSRMs). This model describes the first motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	84
380739	cd19910	DSRM_ILF3_rpt1	first double-stranded RNA binding motif of interleukin enhancer-binding factor 3 (ILF3) and similar proteins. ILF3 (also known as double-stranded RNA-binding protein 76 (DRBP76), M-phase phosphoprotein 4 (MPP4), nuclear factor associated with dsRNA (NFAR), nuclear factor of activated T-cells 90 kDa (NF-AT-90), or translational control protein 80 (TCP80)) is an RNA-binding protein that plays an essential role in the biogenesis of circular RNAs (circRNAs) which are produced by back-splicing circularization of pre-mRNAs. ILF3 contains an N-terminal DZF domain and two double-stranded RNA binding motifs (DSRMs). This model describes the first motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	73
380740	cd19911	DSRM_STRBP_rpt2	second double-stranded RNA binding motif of spermatid perinuclear RNA-binding protein (STRBP) and similar proteins. STRBP is a double-stranded DNA and RNA binding protein that is involved in spermatogenesis and sperm function. It plays a role in regulation of cell growth. STRBP contains an N-terminal DZF domain and two double-stranded RNA binding motifs (DSRMs). This model describes the second motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	64
380741	cd19912	DSRM_ILF3_rpt2	second double-stranded RNA binding motif of interleukin enhancer-binding factor 3 (ILF3) and similar proteins. ILF3 (also known as double-stranded RNA-binding protein 76 (DRBP76), M-phase phosphoprotein 4 (MPP4), nuclear factor associated with dsRNA (NFAR), nuclear factor of activated T-cells 90 kDa (NF-AT-90), or translational control protein 80 (TCP80)) is an RNA-binding protein that plays an essential role in the biogenesis of circular RNAs (circRNAs) which are produced by back-splicing circularization of pre-mRNAs. ILF3 contains an N-terminal DZF domain and two double-stranded RNA binding motifs (DSRMs). This model describes the second motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	72
380742	cd19913	DSRM_DRADA_rpt1	first double-stranded RNA binding motif of double-stranded RNA-specific adenosine deaminase (DRADA). DRADA (EC 3.5.4.37; also known as 136 kDa double-stranded RNA-binding protein (p136), interferon-inducible protein 4 (IFI-4), K88DSRBP, ADAR1, G1P1, or ADAR) catalyzes the hydrolytic deamination of adenosine to inosine in double-stranded RNA (dsRNA), referred to as A-to-I RNA editing. Vertebrate DRADA contains three double-stranded RNA binding motifs (DSRMs). This model describes the first motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	71
380743	cd19914	DSRM_DRADA_rpt2	second double-stranded RNA binding motif of double-stranded RNA-specific adenosine deaminase (DRADA) and similar proteins. DRADA (EC 3.5.4.37; also known as 136 kDa double-stranded RNA-binding protein (p136), interferon-inducible protein 4 (IFI-4), K88DSRBP, ADAR1, G1P1, or ADAR) catalyzes the hydrolytic deamination of adenosine to inosine in double-stranded RNA (dsRNA), referred to as A-to-I RNA editing. Vertebrate DRADA contains three double-stranded RNA binding motifs (DSRMs). This model describes the second motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	71
380744	cd19915	DSRM_DRADA_rpt3	third double-stranded RNA binding motif of double-stranded RNA-specific adenosine deaminase (DRADA) and similar proteins. DRADA (EC 3.5.4.37; also known as 136 kDa double-stranded RNA-binding protein (p136), interferon-inducible protein 4 (IFI-4), K88DSRBP, ADAR1, G1P1, or ADAR) catalyzes the hydrolytic deamination of adenosine to inosine in double-stranded RNA (dsRNA), referred to as A-to-I RNA editing. Vertebrate DRADA contains three double-stranded RNA binding motifs (DSRMs). This model describes the third motif. DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	71
381179	cd19916	OphMA_like	tetrapyrrole methylase family protein similar to Omphalotus olearius omphalotin methyltransferase (OphMA) and Dendrothele bispora dbOphMA. OphMA is the precursor protein of the fungal cyclic peptide Omphalotin A. Omphalotin A is a potent nematicide, having 9 out of 12 of its residues methylated at the backbone amide. Omphalotin A derives from the C-terminus of OphMA (also known as OphA). OphMA catalyzes the automethylation of its own C-terminus using S-adenosyl methionine (SAM); this C-terminus is subsequently released and macrocyclized by the protease OphP to give Omphalotin A.	237
381180	cd19917	RsmI_like	tetrapyrrole methylase family protein similar to ribosomal RNA small subunit methyltransferase I (RsmI). RsmI, also known as rRNA (cytidine-2'-O-)-methyltransferase, is an S-AdoMet (S-adenosyl-L-methionine or SAM)-dependent methyltransferase responsible for the 2'-O-methylation of cytidine 1402 (C1402) at the P site of bacterial 16S rRNA. Another S-AdoMet-dependent methyltransferase, RsmH (not included in this family), is responsible for N4-methylation at C1402. These methylation reactions may occur at a late step during 30S assembly in the cell. The dimethyl modification is believed to be conserved in bacteria, may play a role in fine-tuning the shape and functions of the P-site to increase the translation fidelity, and has been shown, for Staphylococcus aureus, to contribute to virulence in host animals by conferring resistance to oxidative stress.	217
381181	cd19918	RsmI_like	uncharacterized subfamily of the tetrapyrrole methylase family similar to ribosomal RNA small subunit methyltransferase I (RsmI). RsmI, also known as rRNA (cytidine-2'-O-)-methyltransferase, is an S-AdoMet (S-adenosyl-L-methionine or SAM)-dependent methyltransferase responsible for the 2'-O-methylation of cytidine 1402 (C1402) at the P site of bacterial 16S rRNA. Another S-AdoMet-dependent methyltransferase, RsmH (not included in this family), is responsible for N4-methylation at C1402. These methylation reactions may occur at a late step during 30S assembly in the cell. The dimethyl modification is believed to be conserved in bacteria, may play a role in fine-tuning the shape and functions of the P-site to increase the translation fidelity, and has been shown, for Staphylococcus aureus, to contribute to virulence in host animals by conferring resistance to oxidative stress.	217
381146	cd19919	REC_NtrC	phosphoacceptor receiver (REC) domain of DNA-binding transcriptional regulator NtrC. DNA-binding transcriptional regulator NtrC is also called nitrogen regulation protein NR(I) or nitrogen regulator I (NRI). It contains an N-terminal receiver (REC) domain, followed by a sigma-54 interaction domain, and a C-terminal helix-turn-helix DNA-binding domain. It is part of the two-component regulatory system NtrB/NtrC, which controls expression of the nitrogen-regulated (ntr) genes in response to nitrogen limitation. DNA-binding response regulator NtrC is phosphorylated by NtrB; phosphorylation of the N-terminal REC domain activates the central sigma-54 interaction domain and leads to the transcriptional activation from promoters that require sigma(54)-containing RNA polymerase. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	116
381147	cd19920	REC_PA4781-like	phosphoacceptor receiver (REC) domain of cyclic di-GMP phosphodiesterase PA4781 and similar domains. Pseudomonas aeruginosa cyclic di-GMP phosphodiesterase PA4781 contains an N-terminal REC domain and a C-terminal catalytic HD-GYP domain, characteristics of RpfG family response regulators. PA4781 is involved in cyclic di-3',5'-GMP (c-di-GMP) hydrolysis/degradation in a two-step reaction via the linear intermediate pGpG to produce GMP. Its unphosphorylated REC domain prevents accessibility of c-di-GMP to the active site. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	103
381148	cd19921	REC_1_GGDEF	first phosphoacceptor receiver (REC) domain of uncharacterized GGDEF domain proteins. This family is composed of uncharacterized PleD-like response regulators that contain two N-terminal REC domains and a C-terminal diguanylate cyclase output domain with the characteristic GGDEF motif at the active site. Unlike PleD which contains a REC-like adaptor domain, the second REC domain of these uncharacterized GGDEF domain proteins contains characteristic metal-binding and active site residues. PleD response regulators are global regulators of cell metabolism in some important human pathogens. This model describes the first REC domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	115
381149	cd19922	REC_RitR-like	receiver (REC) domain of orphan response regulator RitR and similar domains. Streptococcus pneumoniae RitR (Repressor of iron transport Regulator, formerly RR489) is an orphan two-component signal transduction response regulator that is required for lung pathogenicity. It acts to repress iron uptake via binding the pneumococcal iron uptake (Piu) transporter promoter. Members of this subfamily contain REC and DNA-binding output domains. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays. However, members of this family do not contain the phosphorylatable aspartic acid residue and are phosphorylation-independent.	110
381150	cd19923	REC_CheY_CheY3	phosphoacceptor receiver (REC) domain of chemotaxis response regulator CheY3 and similar CheY family proteins. CheY family chemotaxis response regulators (RRs) comprise about 17% of bacterial RRs and almost half of all RRs in archaea. This subfamily contains Vibrio cholerae CheY3, Escherichia coli CheY, and similar CheY family RRs. CheY proteins control bacterial motility and participate in signaling phosphorelays and in protein-protein interactions. CheY RRs contain only the REC domain with no output/effector domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	119
381151	cd19924	REC_CheV-like	phosphoacceptor receiver (REC) domain of chemotaxis protein CheV and similar proteins. This subfamily includes the REC domains of Bacillus subtilis chemotaxis protein CheV, Myxococcus xanthus gliding motility regulatory protein FrzE, and similar proteins. CheV is a hybrid protein with an N-terminal CheW-like domain and a C-terminal CheY-like REC domain. The CheV pathway is one of three systems employed by B. subtilis for sensory adaptation that contribute to chemotaxis. It is involved in the transmission of sensory signals from chemoreceptors to flagellar motors. Together with CheW, it is involved in the coupling of methyl-accepting chemoreceptors to the central two-component histidine kinase CheA. FrzE is a hybrid sensor histidine kinase/response regulator that is part of the Frz pathway that controls cell reversal frequency to support directional motility during swarming and fruiting body formation. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	111
381152	cd19925	REC_citrate_TCS	phosphoacceptor receiver (REC) domain of citrate family two-component system response regulators. This family includes Lactobacillus paracasei MaeR, Escherichia coli DcuR and DpiA, Klebsiella pneumoniae CitB, as well as Bacillus DctR, MalR, and CitT. These are all response regulators of two-component systems (TCSs) from the citrate family, and are involved in the transcriptional regulation of genes associated with L-malate catabolism (MaeRK), citrate-specific fermentation (DpiAB, CitAB), plasmid inheritance (DpiAB), anaerobic fumarate respiratory system (DcuRS), and malate transport/utilization (MalKR). REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	118
381153	cd19926	REC_PilR	phosphoacceptor receiver (REC) domain of type 4 fimbriae expression regulatory protein PilR and similar proteins. Pseudomonas aeruginosa PilR is the response regulator of the PilS/PilR two-component regulatory system (PilSR TCS) that acts in conjunction with sigma-54 to regulate the expression of type 4 pilus (T4P) major subunit PilA. In addition, the PilSR TCS regulates flagellum-dependent swimming motility and pilus-dependent twitching motility. PilR contains an N-terminal REC domain, a central sigma-54 interaction domain, and a C-terminal Fis-type helix-turn-helix DNA-binding domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	100
381154	cd19927	REC_Ycf29	phosphoacceptor receiver (REC) domain of probable transcriptional regulator Ycf29. Ycf29 is a probable response regulator of a two-component system (TCS), typically consisting of a sensor and a response regulator, that functions in adaptation to changing environments. Processes regulated by TCSs in bacteria include sporulation, pathogenicity, virulence, chemotaxis, and membrane transport. Ycf29 contains an N-terminal REC domain and a LuxR-type helix-turn-helix DNA-binding output domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	102
381155	cd19928	REC_RcNtrC-like	phosphoacceptor receiver (REC) domain of Rhodobacter capsulatus nitrogen regulatory protein C (NtrC) and similar NtrC family response regulators. NtrC family proteins are transcriptional regulators that have REC, AAA+ ATPase/sigma-54 interaction, and DNA-binding output domains. This subfamily of NtrC proteins includes NtrC, also called nitrogen regulator I (NRI), from Rhodobacter capsulatus, Azospirillum brasilense, and Azorhizobium caulinodans. NtrC is part of the NtrB/NtrC two-component system that controls the expression of the nitrogen-regulated (ntr) genes in response to nitrogen limitation. The N-terminal REC domain of NtrC proteins regulates the activity of the protein and its phosphorylation controls the AAA+ domain oligomerization, while the central AAA+ domain participates in nucleotide binding, hydrolysis, oligomerization, and sigma-54 interaction. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	100
381156	cd19929	psREC_Atg32	pseudo receiver domain of autophagy receptor Atg32. Autophagy receptor Atg32 is a single-pass outer mitochondrial membrane protein that is required for the selective autophagy of mitochondria, called mitophagy, in yeast. It mediates ubiquitin-independent mitophagy in response to nitrogen deprivation. It recruits the autophagy machinery to mitochondria, facilitating mitochondrial capture in phagophores, the precursors to autophagosomes. Whereas mammals have at least 7 different autophagy receptors, yeast only has one. Little is known about the structure of Atg32; it contains a binding region for the selective autophagy scaffolding protein Atg11 and an Atg8-interacting motif (AIM). Limited proteolysis has identified a structured domain, a pseudo receiver (psREC) domain, within the cytosolic region of Atg32 that is essential for the induction of mitophagy. psREC domains lack the metal-binding, phosphorylatable asp, and active site residues of canonical REC domains and are thought to function in protein-protein interactions.	139
381157	cd19930	REC_DesR-like	phosphoacceptor receiver (REC) domain of DesR and similar proteins. This group is composed of Bacillus subtilis DesR, Streptococcus pneumoniae response regulator spr1814, and similar proteins, all containing an N-terminal REC domain and a C-terminal LuxR family helix-turn-helix (HTH) DNA-binding output domain. DesR is a response regulator that, together with its cognate sensor kinase DesK, comprises a two-component regulatory system that controls membrane fluidity. Phosphorylation of the REC domain of DesR is allosterically coupled to two distinct exposed surfaces of the protein, controlling noncanonical dimerization/tetramerization, cooperative activation, and DesK binding. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	117
381158	cd19931	REC_NarL	phosphoacceptor receiver (REC) domain of Nitrate/Nitrite response regulator L (NarL). Nitrate/nitrite response regulator protein NarL contains an N-terminal REC domain and a C-terminal LuxR family helix-turn-helix (HTH) DNA-binding output domain. Escherichia coli NarL activates the expression of the nitrate reductase (narGHJI) and formate dehydrogenase-N (fdnGHI) operons, and represses the transcription of the fumarate reductase (frdABCD) operon in response to a nitrate/nitrite induction signal. Phosphorylation of the NarL REC domain releases the C-terminal HTH output domain that subsequently binds specific DNA promoter sites to repress or activate gene expression. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	117
381159	cd19932	REC_PdtaR-like	phosphoacceptor receiver (REC) domain of PdtaR and similar proteins. This subfamily includes Mycobacterium tuberculosis PdtaR, also called Rv1626, and similar proteins containing a REC domain and an ANTAR (AmiR and NasR transcription antitermination regulators) RNA-binding output domain. PdtaR is a response regulator that acts at the level of transcriptional antitermination and is a member of the PdtaR/PdtaS two-component regulatory system. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	118
381160	cd19933	REC_ETR-like	phosphoacceptor receiver (REC) domain of plant ethylene receptors ETR1, ETR2, and EIN4, and similar proteins. Plant ethylene receptors contain N-terminal transmembrane domains that contain an ethylene binding site and also serve in localization of the receptor to the endoplasmic reticulum or the Golgi apparatus and a C-terminal histidine kinase (HK)-like domain. There are five ethylene receptors (ETR1, ERS1, ETR2, ERS2, and EIN4) in Arabidopsis thaliana. ETR1, ETR2, and EIN4 also contain REC domains C-terminal to the HK domain. ETR1 and ERS1 belong to subfamily 1, and have functional HK domains while ETR2, ERS2, and EIN4 belong to subfamily 2, and lack the necessary residues for HK activity and may function as serine/threonine kinases. The plant hormone ethylene plays an important role in plant growth and development. It regulates seed germination, seedling growth, leaf and petal abscission, fruit ripening, organ senescence, and pathogen responses. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	117
381161	cd19934	REC_OmpR_EcPhoP-like	phosphoacceptor receiver (REC) domain of EcPhoP-like OmpR family response regulators. Escherichia coli PhoP (EcPhoP) is part of the PhoQ/PhoP two-component system (TCS) that regulates virulence genes and plays an essential role in the response of the bacteria to the environment of their mammalian hosts, sensing several stimuli such as extracellular magnesium limitation, low pH, the presence of cationic antimicrobial peptides, and osmotic upshift. This subfamily also includes Brucella suis FeuP, part of the FeuPQ TCS that is involved in the regulation of iron uptake, and Microchaete diplosiphon RcaC, which is required for chromatic adaptation. Members of this subfamily belong to the OmpR family of DNA-binding response regulators, which contain N-terminal receiver (REC) and C-terminal DNA-binding winged helix-turn-helix effector domains. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	117
381162	cd19935	REC_OmpR_CusR-like	phosphoacceptor receiver (REC) domain of CusR-like OmpR family response regulators. Escherichia coli CusR is part of the CusS/CusR two-component system (TCS) that is involved in response to copper and silver. Other members of this subfamily include Escherichia coli PcoR, Pseudomonas syringae CopR, and Streptomyces coelicolor CutR, which are all transcriptional regulatory proteins and components of TCSs that regulate genes involved in copper resistance and/or metabolism. Another member of the subfamily is Escherichia coli HprR (hydrogen peroxide response regulator), previously called YdeW, which is part of the HprSR (or YedVW) TCS involved in stress response to hydrogen peroxide, as well as Cupriavidus metallidurans CzcR, which is part of the CzcS/CzcR TCS involved in the control of cobalt, zinc, and cadmium homeostasis. Members of this subfamily belong to the OmpR family of DNA-binding response regulators, which contain N-terminal receiver (REC) and C-terminal DNA-binding winged helix-turn-helix effector domains. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	100
381163	cd19936	REC_OmpR_ChvI-like	phosphoacceptor receiver (REC) domain of ChvI-like OmpR family response regulators. Sinorhizobium meliloti ChvI is part of the ExoS/ChvI two-component regulatory system (TCS) that is required for nitrogen-fixing symbiosis and exopolysaccharide synthesis. ExoS/ChvI also play important roles in regulating biofilm formation, motility, nutrient utilization, and the viability of free-living bacteria. ChvI belongs to the OmpR family of DNA-binding response regulators that contain N-terminal receiver (REC) and C-terminal DNA-binding winged helix-turn-helix effector domains. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	99
381164	cd19937	REC_OmpR_BsPhoP-like	phosphoacceptor receiver (REC) domain of BsPhoP-like OmpR family response regulators. Bacillus subtilis PhoP (BsPhoP) is part of the PhoPR two-component system that participates in a signal transduction network that controls adaptation of the bacteria to phosphate deficiency by regulating (activating or repressing) genes of the Pho regulon upon phosphorylation by PhoR. When activated, PhoPR directs expression of phosphate scavenging enzymes, lowers synthesis of the phosphate-rich wall teichoic acid (WTA) and initiates synthesis of teichuronic acid, a non-phosphate containing replacement anionic polymer. Members of this subfamily belong to the OmpR family of DNA-binding response regulators, which are characterized by a REC domain and a winged helix-turn-helix (wHTH) DNA-binding output effector domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	116
381165	cd19938	REC_OmpR_BaeR-like	phosphoacceptor receiver (REC) domain of BaeR-like OmpR family response regulators. BaeR is part of the BaeSR two-component system that is involved in regulating genes that confer multidrug and metal resistance. In Salmonella, BaeSR induces AcrD and MdtABC drug efflux systems, increasing multidrug and metal resistance. In Escherichia coli, BaeR stimulates multidrug resistance via mdtABC (multidrug transporter ABC, formerly known as yegMNO) genes, which encode a resistance-nodulation-cell division (RND) drug efflux system. Members of this subfamily belong to the OmpR family of DNA-binding response regulators, which are characterized by a REC domain and a winged helix-turn-helix (wHTH) DNA-binding output effector domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	114
381166	cd19939	REC_OmpR_BfmR-like	phosphoacceptor receiver (REC) domain of BfmR-like OmpR family response regulators. Acinetobacter baumannii BfmR is part of the BfmR/S two-component system that functions as the master regulator of biofilm initiation. BfmR confers resistance to complement-mediated bactericidal activity, independent of capsular polysaccharide, and also increases resistance to the clinically important antimicrobials meropenem and colistin, making it a potential antimicrobial target. Its inhibition would have the dual benefit of significantly decreasing in vivo survival and increasing sensitivity to selected antimicrobials. Members of this subfamily belong to the OmpR family of DNA-binding response regulators, which are characterized by a REC domain and a winged helix-turn-helix (wHTH) DNA-binding output effector domain. REC domains function as phosphorylation-mediated switches within response regulators, but some also transfer phosphoryl groups in multistep phosphorelays.	116
410849	cd19940	XPF_nuclease-like	nuclease domain of XPF/MUS81 family proteins. The XPF/MUS81 family comprises 3'-flap endonucleases that act upon 3'-flap structures and are involved in DNA repair pathways that are necessary for the removal of UV-light-induced DNA lesions and cross-links between DNA strands. Family members exist either as heterodimers or as homodimers in their functionally competent states which consist of a catalytic and a noncatalytic subunit. The catalytic subunits have a DX(n)RKX(3)D motif. This motif is required for metal-dependent endonuclease activity but not for DNA junction binding. The equivalent regions of the noncatalytic subunits (ERCC1, EME1, and FAAP24) have diverged. The noncatalytic subunits have roles such as binding ssDNA or an ability to target the endonuclease to defined DNA structures or sites of DNA damage.	126
410995	cd19941	TIL	trypsin inhibitor-like cysteine rich domain. TIL (trypsin inhibitor-like) cysteine rich domains are found in smapins (small serine proteinase inhibitor), or Ascaris trypsin inhibitor (ATI)-like proteins, whose members include anticoagulant proteins, elastase inhibitors, trypsin inhibitors, thrombin inhibitors, and chymotrypsin inhibitors. The TIL domain is also found in some large modular glycoproteins, including the von Willebrand factor (VWF), mucin-6, mucin-19, and SCO-spondin, among others. The TIL domain is characterized by the presence of five disulfide bonds (two of which are located on either side of the reactive site) in a single small protein domain of 61-62 residues. The cysteine residues that form the disulfide bonds are linked in the pattern: cysteines 1-7, 2-6, 3-5, 4-10 and 8-9. TILs can occur as a single domain or in multiple tandem arrangements. The disulfide bonds account for the unusual resistance to proteolysis and heat denaturation of these proteins. Smapins possess an unusual fold and, with the exception of the reactive site, show no similarity to other serine protease inhibitors. The serine protease inhibitors comprise a large family of molecules involved in inflammatory responses, blood clotting, and complement activation.	55
381075	cd19942	Fer2_BFD-like	[2Fe-2S]-binding domain of bacterioferritin-associated ferredoxin (BFD) and related proteins. The BFD-like [2Fe-2S]-binding domain comprises a helix-turn-helix fold, and binds an [2Fe-2S] cluster via 4 highly-conserved Cys residues, found in loops between the alpha-helices. The Cys residues are organized in a unique C-X2-C-X31-35-C-X2-9-C-arrangement. [2Fe-2S] clusters are sulfide-linked diiron centers, a primary role for which is electron transport. BFD-like [2Fe-2S]-binding domains are found in proteins such as bacterioferritin-associated ferredoxin (BFD), the large subunit of NADH-dependent nitrite reductase, Cu+ chaperone CopZ, anaerobic glycerol 3-phosphate dehydrogenase subunit A, hydrogen cyanide synthase subunit B, nitrogen fixation protein NifU, prokaryotic assimilatory nitrate reductase catalytic subunit NasA, and archaeal proline dehydrogenase PDH1. This superfamily also includes uncharacterized proteins having an N-terminal BFD-like [2Fe-2S]-binding domain and a C-terminal domain belonging to the Ni,Fe-hydrogenase I small subunit family.	49
381076	cd19943	NirB_Fer2_BFD-like_1	first bacterioferritin-associated ferredoxin (BFD)-like [2Fe-2S]-binding domain of the large subunit of the NADH-dependent nitrite reductase. The NADH-dependent nitrite reductase (NirBD) complex comprises a large and a small subunit, and is also known as nitrite reductase (reduced nicotinamide adenine dinucleotide), NADH-nitrite oxidoreductase, and assimilatory nitrite reductase. NirBD uses NADH as electron donor, and FAD, iron-sulfur cluster, and siroheme cofactors, all embedded in the large subunit NirB to catalyze the 6-electron reduction of nitrite to ammonium. NirBD plays a role in regulating nitric oxide homeostasis in Streptomyces coelicolor. In addition to NirB, the BFD-like [2Fe-2S]-binding domain is found in a variety of proteins including bacterioferritin-associated ferredoxin (BFD) and Cu+ chaperone CopZ. It comprises a helix-turn-helix fold, and binds an [2Fe-2S] cluster via 4 highly-conserved Cys residues, found in loops between the alpha-helices. For the class of proteins having a BFD-like [2Fe-2S]-binding domain, the Cys residues are organized in a unique C-X2-C-X31-35-C-X2-9-C-arrangement. [2Fe-2S] clusters are sulfide-linked diiron centers, a primary role for which is electron transport.	53
381077	cd19944	NirB_Fer2_BFD-like_2	second bacterioferritin-associated ferredoxin (BFD)-like [2Fe-2S]-binding domain of the large subunit of the NADH-dependent nitrite reductase. The NADH-dependent nitrite reductase (NirBD) complex comprises a large (NirB) and a small (NirD) subunit, and is also known as nitrite reductase (reduced nicotinamide adenine dinucleotide), NADH-nitrite oxidoreductase, and assimilatory nitrite reductase. NirBD uses NADH as electron donor, and FAD, iron-sulfur cluster, and siroheme cofactors, all embedded in the large subunit NirB to catalyze the 6-electron reduction of nitrite to ammonium. Some of the second [2Fe-2S]-binding domains have one of the Cys residues replaced by a His residue; they may interact with non-Rieske NirD subunits. NirBD plays a role in regulating nitric oxide homeostasis in Streptomyces coelicolor. In addition to NirB, the BFD-like [2Fe-2S]-binding domain is found in a variety of proteins including bacterioferritin-associated ferredoxin (BFD) and Cu+ chaperone CopZ. It comprises a helix-turn-helix fold, and binds an [2Fe-2S] cluster via 4 highly-conserved Cys residues, found in loops between the alpha-helices. For the class of proteins having a BFD-like [2Fe-2S]-binding domain, the Cys residues are organized in a unique C-X2-C-X31-35-C-X2-9-C-arrangement. [2Fe-2S] clusters are sulfide-linked diiron centers, a primary role for which is electron transport.	52
381078	cd19945	Fer2_BFD	bacterioferritin-associated ferredoxin (BFD) [2Fe-2S]-binding domain. This family includes Escherichia coli and Pseudomonas aeruginosa bacterioferritin-associated ferredoxin BFD which binds an [2Fe-2S] cluster and appears to interact with bacterioferritin (E. coli BFR/YheA and P. aeruginosa BfrB), a dynamic regulator of intracellular iron levels. It has been suggested that BFD and bacterioferritin form an electron transfer complex which may participate in the iron storage or iron immobilization functions of bacterioferritin. For Pseudomonas aeruginosa, it has been shown that mobilization of Fe3+ stored in BfrB requires interaction with BFD, which transfers electrons to reduce Fe3+ in the internal cavity of BfrB for subsequent release of Fe2+. The stability of BFD may be aided by an anion-binding site found within this domain. In addition to BFD, the BFD-like [2Fe-2S]-binding domain is found in a variety of proteins such as the large subunit of NADH-dependent nitrite reductase and the Cu+ chaperone CopZ. It comprises a helix-turn-helix fold, and binds an [2Fe-2S] cluster via 4 highly-conserved Cys residues, found in loops between the alpha-helices. For the class of proteins having a BFD-like [2Fe-2S]-binding domain, the Cys residues are organized in a unique C-X2-C-X31-35-C-X2-9-C-arrangement. [2Fe-2S] clusters are sulfide-linked diiron centers, a primary role for which is electron transport.	54
381079	cd19946	GlpA-like_Fer2_BFD-like	bacterioferritin-associated ferredoxin (BFD)-like [2Fe-2S]-binding domain of anaerobic glycerol 3-phosphate dehydrogenase subunit A, hydrogen cyanide synthase subunit B, and similar proteins. This subgroup includes the BFD-like [2Fe-2S]-binding domains of subunits of various component dehydrogenase/oxidases, including anaerobic glycerol 3-phosphate dehydrogenase subunit A of GlpABC, hydrogen cyanide synthase subunit HcnB of HcnABC, octopine oxidase subunit A of OoxAB, and nopaline oxidase subunit A of NoxAB. GlpABC catalyzes the conversion of glycerol 3-phosphate to dihydroxyacetone phosphate, and participates in the glycerol degradation by glycerol kinase pathway in step 1 of the sub-pathway that synthesizes glycerone phosphate from sn-glycerol 3-phosphate (anaerobic route). HcnABC oxidizes glycine producing hydrogen cyanide and CO2. In Agrobacterium spp., the first enzymic step in the catabolic utilization of octopine and nopaline is the oxidative cleavage into L-arginine and pyruvate or 2-ketoglutarate, respectively; nopaline oxidase (NoxAB) accepts nopaline and octopine while octopine oxidase (OoxAB) has high activity with octopine but barely detectable activity with nopaline, both subunits possibly contributing to the substrate specificity. The BFD-like [2Fe-2S]-binding domain is found in a variety of other proteins including bacterioferritin-associated ferredoxin (BFD), the large subunit of NADH-dependent nitrite reductase, and Cu+ chaperone CopZ. It comprises a helix-turn-helix fold, and binds an [2Fe-2S] cluster via 4 highly-conserved Cys residues, found in loops between the alpha-helices. For the class of proteins having a BFD-like [2Fe-2S]-binding domain, the Cys residues are organized in a unique C-X2-C-X31-35-C-X2-9-C-arrangement. [2Fe-2S] clusters are sulfide-linked diiron centers, a primary role for which is electron transport.	55
381080	cd19947	NifU_Fer2_BFD-like	bacterioferritin-associated ferredoxin (BFD)-like [2Fe-2S]-binding domain of nitrogen fixation protein NifU and similar proteins. This family includes the BFD-like [2Fe-2S]-binding domain of Azotobacter vinelandii and Klebsiella pneumoniae nitrogen fixation protein NifU. NifU binds one Fe cation per subunit and one [2Fe-2S] cluster per subunit, and is involved in the formation or repair of [Fe-S] clusters present in iron-sulfur proteins. The BFD-like [2Fe-2S]-binding domain is found in a variety of other proteins including bacterioferritin-associated ferredoxin (BFD), the large subunit of NADH-dependent nitrite reductase, and Cu+ chaperone CopZ. It comprises a helix-turn-helix fold, and binds an [2Fe-2S] cluster via 4 highly-conserved Cys residues, found in loops between the alpha-helices. For the class of proteins having a BFD-like [2Fe-2S]-binding domain, the Cys residues are organized in a unique C-X2-C-X31-35-C-X2-9-C-arrangement. [2Fe-2S] clusters are sulfide-linked diiron centers, a primary role for which is electron transport.	55
381081	cd19948	NasA-like_Fer2_BFD-like	bacterioferritin-associated ferredoxin (BFD)-like [2Fe-2S]-binding domain at the C-terminus of prokaryotic assimilatory nitrate reductase catalytic subunit NasA and similar proteins. The BFD-like [2Fe-2S]-binding domain described in this family is found at the C-terminus of prokaryotic assimilatory nitrate reductase catalytic subunit (NasA) such as Rhodobacter capsulatus E1F1 NasA. Nitrate reductase catalyzes the reduction of nitrate to nitrite, the first step of nitrate assimilation. R. capsulatus E1F1 nitrate reductase is composed of this NasA subunit and a small diaphorase subunit with FAD. Note that this [2Fe-2S]-binding domain is not always present; for example, it is absent from the characterized haloarchaeal Haloferax mediterranei NasA; both, however, have an [4Fe-4S] binding domain at their N-terminus. The BFD-like [2Fe-2S]-binding domain is found in a variety of other proteins including bacterioferritin-associated ferredoxin (BFD), the large subunit of NADH-dependent nitrite reductase, and Cu+ chaperone CopZ. It comprises a helix-turn-helix fold, and binds an [2Fe-2S] cluster via 4 highly-conserved Cys residues, found in loops between the alpha-helices. For the class of proteins having a BFD-like [2Fe-2S]-binding domain, the Cys residues are organized in a unique C-X2-C-X31-35-C-X2-9-C-arrangement. [2Fe-2S] clusters are sulfide-linked diiron centers, a primary role for which is electron transport.	53
381082	cd19949	PDH1_Fer2_BFD-like	bacterioferritin-associated ferredoxin (BFD)-like [2Fe-2S]-binding domain of the alpha-subunit of archaeal proline dehydrogenase PDH1. This domain family describes the alpha-subunit of archaeal dye-linked L-proline dehydrogenase PDH1. Dye-linked PDH catalyzes the oxidation of L-proline to 1-pyrroline-5-carboxylate in the presence of artificial electron acceptors. It includes the alpha subunit of Pyrococcus horikoshii PDH1 which has been shown to exist as an (alphabeta)4 heterooctamer and to contain three cofactors: FAD, FMN, and ATP; the alpha subunit contains ATP but exhibits no PDH activity, the beta subunit is the catalytic component, contains FAD, and exhibits PDH activity, and FMN is located between the alpha and beta subunits. The BFD-like [2Fe-2S]-binding domain is found in a variety of other proteins including bacterioferritin-associated ferredoxin (BFD), the large subunit of NADH-dependent nitrite reductase, and Cu+ chaperone CopZ. It comprises a helix-turn-helix fold, and binds an [2Fe-2S] cluster via 4 highly-conserved Cys residues, found in loops between the alpha-helices. For the class of proteins having a BFD-like [2Fe-2S]-binding domain, the Cys residues are organized in a unique C-X2-C-X31-35-C-X2-9-C-arrangement. [2Fe-2S] clusters are sulfide-linked diiron centers, a primary role for which is electron transport.	56
381083	cd19950	Fer2_BFD-like	[2Fe-2S]-binding domain of bacterioferritin-associated ferredoxin (BFD) and related proteins; uncharacterized subgroup. The BFD-like [2Fe-2S]-binding domain comprises a helix-turn-helix fold, and binds an [2Fe-2S] cluster via 4 highly-conserved Cys residues, found in loops between the alpha-helices. The Cys residues are organized in a unique C-X2-C-X31-35-C-X2-9-C-arrangement. [2Fe-2S] clusters are sulfide-linked diiron centers, a primary role for which is electron transport. BFD-like [2Fe-2S]-binding domains are found in proteins such as bacterioferritin-associated ferredoxin (BFD), the large subunit of NADH-dependent nitrite reductase, Cu+ chaperone CopZ, anaerobic glycerol 3-phosphate dehydrogenase subunit A, hydrogen cyanide synthase subunit B, nitrogen fixation protein NifU, prokaryotic assimilatory nitrate reductase catalytic subunit NasA, and archaeal proline dehydrogenase PDH1.	51
381084	cd19951	HyaA_family_Fer2_BFD-like	bacterioferritin-associated ferredoxin (BFD)-like [2Fe-2S]-binding domain of uncharacterized proteins having a C-terminal Ni,Fe-hydrogenase I small subunit (HyaA) family domain. The BFD-like [2Fe-2S]-binding domain is found in a variety of other proteins including bacterioferritin-associated ferredoxin (BFD), the large subunit of NADH-dependent nitrite reductase, and Cu+ chaperone CopZ. It comprises a helix-turn-helix fold, and binds an [2Fe-2S] cluster via 4 highly-conserved Cys residues, found in loops between the alpha-helices. For the class of proteins having a BFD-like [2Fe-2S]-binding domain, the Cys residues are organized in a unique C-X2-C-X31-35-C-X2-9-C-arrangement. [2Fe-2S] clusters are sulfide-linked diiron centers, a primary role for which is electron transport.	54
410996	cd19953	PDS5	Sister chromatid cohesion protein PDS5. Pds5 plays a crucial role in sister chromatid cohesion. Together with WapI and Scc3, it is involved in the release of the cohesin complex from chromosomes during S phase. The core of the cohesin complex consists of a coiled-coil heterodimer of Smc1 and Smc3, together with Scc1 (also called kleisin). Pds5 interacts with Scc1 via a conserved patch on the surface of its heat repeats. Pds5 also promotes the acetylation of Smc3 that protects cohesin from releasing activity in G2 phase.	630
381070	cd19954	serpin42Dd-like_insects	insect serpins similar to Drosophila melanogaster Serpin 42Dd. Serpins in insects function within development, wound healing and immunity. Drosophila melanogaster Serpin 42Dd, also called serpin 1 (Spn1), regulates Toll-mediated immune responses, functioning as a repressor of Toll activation upon fungal infection. Insect serpins from house flies,  fruit flies, and stable flies are included in this subfamily. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	366
381071	cd19955	serpin48-like_insects	insect serpins similar to Tenebrio molitor serpin 48. Serpins in insects function within development, wound healing and immunity. Tenebrio molitor serpin 48 (SPN48) is highly specific for Spatzle-processing enzyme, an essential component in insect innate immunity. SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	361
381072	cd19956	serpinB	serpin B family, ov-serpins. The clade B of the serpin superfamily corresponds to the ovalbumin family of serpins (ov-serpins), a family of closely related proteins, whose members can be secreted (ovalbumin), cytosolic (leukocyte elastase inhibitor, LEI), or targeted to both compartments (plasminogen activator inhibitor 2, PAI-2). Family members are also characterized by N- and C-terminal extensions, the absence of a signal peptide, and a Ser rather than an Asn residue at the penultimate position. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants can cause blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	376
381073	cd19957	serpinA	serpin family A. The clade A of the serpin superfamily includes the classical serine proteinase inhibitors, alpha-1-antitrypsin and alpha-1-antichymotrypsin, protein C inhibitor, kallistatin, and non-inhibitory serpins, like corticosteroid and thyroxin binding globulins. In general, SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia. A classification based on evolutionary relatedness has resulted in the assignment of serpins to 16 clades designated A-P along with some orphans.	363
410997	cd19958	pyocin_knob	knob domain of R1 and R2 pyocins and similar domains. The knob domain is present as a tandemly repeated structural domain in R-type pyocins, which are high-molecular weight bacteriocins produced by some strains of Pseudomonas aeruginosa to specifically kill other strains of the same species. R-type pyocins are structurally similar to simple contractile tails, such as those of phage P2 and Mu, and they punch a hole in the bacterial envelope to efficiently kill target cells. The second knob domain may contain regions responsible for determining the killing spectrum. Knob-like domains occur in host-recognition and binding proteins of, not only pyocins, but also phages, such as in phage K1F endosialidase (not represented by this model), where it may interact with sialic acid, the cell surface molecule that is recognized during infection.	80
410999	cd19960	YidC_peri	periplasmic beta-super sandwich fold domain of membrane protein insertase YidC from Gram-negative bacteria and similar domains. Membrane protein insertase YidC, also called foldase YidC or membrane integrase YidC, facilitates proper folding, insertion, and assembly of inner membrane proteins and complexes. Depending on the nature of the substrate, YidC functions in a Sec-independent (YidC only) or a Sec-dependent manner as part of a complex containing YidC, the SecYEG channel, and SecDFYajC. YidC belongs to the YidC/Oxa1/Alb3 protein family of insertases that contains a core domain of five transmembrane (TM) segments that is essential to insertase function. In addition to this core transmembrane domain, YidC from Gram-negative bacteria contain an extra transmembrane segment (TM1) at the N-terminus and a large periplasmic domain, located between TM1 and TM2, that adopts a beta-super sandwich fold that is found in sugar-binding proteins such as galactose mutarotase. This periplasmic domain may have a role in protein assembly: a region of YidC that binds to SecF maps to one edge of the beta-super sandwich. Other members of the YidC/Oxa1/Alb3 family include YidC1/YidC2 from gram-positive bacteria as well as eukaryotic  members such as mitochondrial Oxa1/Oxa2 (or Cox18) and chloroplastic Alb3/Alb4; they are not part of this hierarchy as they do not possess the periplasmic domain.	233
411000	cd19961	EcYidC-like_peri	periplasmic beta-super sandwich fold domain of membrane protein insertase YidC from Escherichia coli and similar domains. This subfamily is composed of Escherichia coli YidC and similar proteins. Membrane protein insertase YidC, also called foldase YidC or membrane integrase YidC, facilitates proper folding, insertion, and assembly of inner membrane proteins and complexes. Depending on the nature of the substrate, YidC functions in a Sec-independent (YidC only) or a Sec-dependent manner as part of a complex containing YidC, the SecYEG channel, and SecDFYajC. YidC belongs to the YidC/Oxa1/Alb3 protein family of insertases that contain a core domain of five transmembrane (TM) segments that is essential to insertase function. In addition to this core transmembrane domain, YidC from Gram-negative bacteria contain an extra transmembrane segment (TM1) at the N-terminus and a large periplasmic domain, located between TM1 and TM2, that adopts a beta-super sandwich fold that is found in sugar-binding proteins such as galactose mutarotase. This periplasmic domain may have a role in protein assembly: a region of YidC that binds to SecF maps to one edge of the beta-super sandwich.	255
380617	cd19962	PBP1_ABC_RfuA-like	periplasmic riboflavin-binding component (RfuA) of ABC transporter (RfuABCD) from Treponema pallidum and its close homologs in other bacteria. This group includes the basic membrane lipoprotein (BMP) family ABC transporter substrate-binding protein RfuA from Treponema pallidum and its close homologs in other bacteria. RfuA is the riboflavin-binding component of ABC transporter (RfuABCD) in spirochetes. The members of this group are highly similar to that of the periplasmic binding domain of basic membrane lipoprotein (BMP), PnrA. The PnrA lipoprotein, also known as Tp0319 or TmpC, represents a novel family of bacterial purine nucleoside receptor encoded within an ATP-binding cassette (ABC) transport system (pnrABCDE). It shows a striking structural similarity to another basic membrane lipoprotein Med which regulates the competence transcription factor gene, comK, in Bacillus subtilis. PnrA-like proteins are likely to have similar nucleoside-binding functions and a similar type 1 periplasmic sugar-binding protein-like fold.	305
380618	cd19963	PBP1_BMP-like	periplasmic binding component of a basic membrane lipoprotein (BMP) from Brucella abortus and its close homologs in other bacteria. Periplasmic binding component of a family of basic membrane lipoproteins from Borrelia and various putative lipoproteins from other bacteria. These outer membrane proteins include Med, a cell-surface localized protein regulating the competence transcription factor gene comK in Bacillus subtilis, and PnrA, a periplasmic purine nucleoside binding protein of an ATP-binding cassette (ABC) transport system in Treponema pallidum. All contain the type 1 periplasmic sugar-binding protein-like fold.	279
380619	cd19964	PBP1_BMP-like	periplasmic binding component of a basic membrane lipoprotein (BMP) from Aeropyrum pernix K1 and its close homologs in other bacteria. Periplasmic binding component of a family of basic membrane lipoproteins from Aeropyrum pernix K1 and various putative lipoproteins from other bacteria. These outer membrane proteins include Med, a cell-surface localized protein regulating the competence transcription factor gene comK in Bacillus subtilis, and PnrA, a periplasmic purine nucleoside binding protein of an ATP-binding cassette (ABC) transport system in Treponema pallidum. All contain the type 1 periplasmic sugar-binding protein-like fold.	263
380620	cd19965	PBP1_ABC_sugar_binding-like	monosaccharide ABC transporter substrate binding protein CUT2 family and similar proteins. Periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis.	272
380621	cd19966	PBP1_ABC_sugar_binding-like	monosaccharide ABC transporter substrate binding protein CUT2 family and similar proteins. Periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis.	278
380622	cd19967	PBP1_TmRBP-like	D-ribose ABC transporter substrate-binding protein such as Thermoanaerobacter tengcongensis ribose binding protein (ttRBP). Periplasmic sugar-binding domain of the thermophilic Thermoanaerobacter tengcongensis ribose binding protein (ttRBP) and its mesophilic homologs. Members of this group belong to the type 1 periplasmic binding protein superfamily, whose members are involved in chemotaxis, ATP-binding cassette transport, and intercellular communication in the central nervous system. The thermophilic and mesophilic ribose-binding proteins are structurally very similar, but differ substantially in thermal stability.	272
380623	cd19968	PBP1_ABC_IbpA-like	periplasmic sugar-binding protein IbpA of an ABC transporter and similar proteins. The periplasmic binding protein (PBP) IbpA mediates the uptake of myo-inositol by an ABC transporter that consists of the PBP IbpA, the transmembrane permease IatP, and the ABC IatA. IbpA shares homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge.	271
380624	cd19969	PBP1_ABC_sugar_binding-like	monosaccharide ABC transporter substrate binding protein. Periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis.	278
380625	cd19970	PBP1_ABC_sugar_binding-like	monosaccharide ABC transporter substrate-binding protein. Periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis.	275
380626	cd19971	PBP1_ABC_sugar_binding-like	monosaccharide ABC transporter substrate-binding protein. Periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis.	267
380627	cd19972	PBP1_ABC_sugar_binding-like	monosaccharide ABC transporter substrate-binding protein. Periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis.	269
380628	cd19973	PBP1_ABC_sugar_binding-like	monosaccharide ABC transporter substrate-binding protein. Periplasmic sugar-binding domain of active transport systems that are members of the type 1 periplasmic binding protein (PBP1) superfamily. The members of this family function as the primary receptors for chemotaxis and transport of many sugar based solutes in bacteria and archaea. The sugar binding domain is also homologous to the ligand-binding domain of eukaryotic receptors such as glutamate receptor (GluR) and DNA-binding transcriptional repressors such as LacI and GalR. Moreover, this periplasmic binding domain, also known as Venus flytrap domain, undergoes transition from an open to a closed conformational state upon the binding of ligands such as lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars. This family also includes the periplasmic binding domain of autoinducer-2 (AI-2) receptors such as LsrB and LuxP which are highly homologous to periplasmic pentose/hexose sugar-binding proteins.	285
380629	cd19974	PBP1_LacI-like	ligand-binding domain of uncharacterized DNA-binding regulatory proteins that are members of the LacI-GalR family of bacterial transcription repressors. This group includes the ligand-binding domain of uncharacterized DNA-binding regulatory proteins that are members of the LacI-GalR family of bacterial transcription repressors. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor.	270
380630	cd19975	PBP1_CcpA-like	ligand-binding domain of putative DNA transcription regulators highly similar to that of the catabolite control protein A (CcpA), which functions as the major transcriptional regulator of carbon catabolite repression/regulation. This group includes the ligand-binding domain of uncharacterized DNA transcription repressors highly similar to that of the catabolite control protein A (CcpA), which functions as the major transcriptional regulator of carbon catabolite repression/regulation (CCR), a process in which enzymes necessary for the metabolism of alternative sugars are inhibited in the presence of glucose. In gram-positive bacteria, CCR is controlled by HPr, a phosphoenolpyruvate:sugar phosphotransferase system (PTS) and a transcriptional regulator CcpA. Moreover, CcpA can regulate sporulation and antibiotic resistance as well as play a role in virulence development of certain pathogens such as the group A streptococcus. The ligand binding domain of CcpA is a member of the LacI-GalR family of bacterial transcription regulators.	269
380631	cd19976	PBP1_DegA_Like	ligand-binding domain of putative DNA transcription regulators highly similar to that of the transcription regulator DegA. This group includes the ligand-binding domain of uncharacterized DNA transcription repressors highly similar to that of the transcription regulator DegA, which is involved in the control of degradation of Bacillus subtilis amidophosphoribosyltransferase (purF). Members of this group belong to the LacI-GalR family of repressors and are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor.	268
380632	cd19977	PBP1_EndR-like	periplasmic ligand-binding domain of putative repressor of the endoglucanase operon and its close homologs. This group includes the ligand-binding domain of putative repressor of the endoglucanase operon from Paenibacillus polymyxa and its close homologs from other bacteria. Members of this group belong to the LacI-GalR family of repressors and are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type 1 periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor.	264
380633	cd19978	PBP1_ABC_ligand_binding-like	periplasmic ligand-binding domain of uncharacterized ABC-type transport systems predicted to be involved in the uptake of amino acids, peptides, or inorganic ions. This group includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type transport systems that are predicted to be involved in the uptake of amino acids, peptides, or inorganic ions. This subgroup has high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP); its ligand specificity has not been determined experimentally, however.	341
380634	cd19979	PBP1_ABC_ligand_binding-like	amino acid amide ABC transporter substrate binding protein haat family. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems that are predicted to be involved in transport of amino acids, peptides, or inorganic ions. Members of this group are sequence-similar to members of the family of ABC-type hydrophobic amino acid transporters, such as leucine-isoleucine-valine binding protein (LIVBP); however their ligand specificity has not been determined experimentally.	350
380635	cd19980	PBP1_ABC_ligand_binding-like	type 1 periplasmic ligand-binding domain of uncharacterized ABC (Atpase Binding Cassette)-type active transport systems predicted to be involved in uptake of amino acids, peptides, or inorganic ions. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (Atpase Binding Cassette)-type active transport systems that are predicted to be involved in uptake of amino acids, peptides, or inorganic ions. This subgroup has high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP); however, its ligand specificity has not been determined experimentally.	334
380636	cd19981	PBP1_ABC_HAAT-like	type 1 periplasmic ligand-binding domain of uncharacterized ABC (Atpase Binding Cassette)-type active transport systems predicted to be involved in uptake of amino acids or peptides. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (Atpase Binding Cassette)-type active transport systems that are predicted to be involved in the uptake of amino acids or peptides. This subgroup has high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP); however, its ligand specificity has not been determined experimentally.	297
380637	cd19982	PBP1_ABC_ligand_binding-like	type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems predicted to be involved in transport of amino acids, peptides, or inorganic ions. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems that are predicted to be involved in transport of amino acids, peptides, or inorganic ions. This subgroup has high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP); however, their ligand specificity has not been determined experimentally.	302
380638	cd19983	PBP1_ABC_HAAT-like	type 1 periplasmic ligand-binding domain of uncharacterized ABC (Atpase Binding Cassette)-type active transport systems predicted to be involved in uptake of hydrophobic amino acids or peptides. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (Atpase Binding Cassette)-type active transport systems that are predicted to be involved in the uptake of hydrophobic amino acids or peptides. This subgroup has high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP); however, its ligand specificity has not been determined experimentally.	303
380639	cd19984	PBP1_ABC_ligand_binding-like	type 1 periplasmic ligand-binding domain of uncharacterized ABC (Atpase Binding Cassette)-type active transport systems predicted to be involved in uptake of amino acids, peptides, or inorganic ions. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (Atpase Binding Cassette)-type active transport systems that are predicted to be involved in uptake of amino acids, peptides, or inorganic ions. This subgroup has high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP); however, its ligand specificity has not been determined experimentally.	296
380640	cd19985	PBP1_ABC_HAAT-like	type 1 periplasmic ligand-binding domain of uncharacterized ABC (Atpase Binding Cassette)-type active transport systems predicted to be involved in uptake of hydrophobic amino acids or peptides. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (Atpase Binding Cassette)-type active transport systems that are predicted to be involved in the uptake of hydrophobic amino acids or peptides. This subgroup has high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP); however, its ligand specificity has not been determined experimentally.	321
380641	cd19986	PBP1_ABC_HAAT-like	type 1 periplasmic ligand-binding domain of uncharacterized ABC (Atpase Binding Cassette)-type active transport systems predicted to be involved in uptake of amino acids or peptides. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (Atpase Binding Cassette)-type active transport systems that are predicted to be involved in the uptake of amino acids or peptides. This subgroup has high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP); however, its ligand specificity has not been determined experimentally.	297
380642	cd19987	PBP1_SBP-like	periplasmic substrate-binding domain of active transport proteins. Periplasmic substrate-binding domain of active transport proteins found in bacteria and Archaea. Members of this group are initial receptors in the process of active transport across cellular membrane, but their substrate specificities are not known in detail. However, they closely resemble the group of AmiC and active transport systems for short-chain amides and urea (FmdDEF), and thus are likely to exhibit a ligand-binding mode similar to that of the amide sensor protein AmiC from Pseudomonas aeruginosa. Moreover, this binding domain has high sequence identity to the family of hydrophobic amino acid transporters (HAAT), and thus it may also be involved in transport of amino acids.	353
380643	cd19988	PBP1_ABC_HAAT-like	type 1 periplasmic ligand-binding domain of uncharacterized ABC (Atpase Binding Cassette)-type active transport systems predicted to be involved in uptake of amino acids or peptides. This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (Atpase Binding Cassette)-type active transport systems that are predicted to be involved in the uptake of amino acids or peptides. This subgroup has high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP); however, its ligand specificity has not been determined experimentally.	302
380644	cd19989	PBP1_SBP-like	periplasmic substrate-binding domain of active transport proteins. Periplasmic substrate-binding domain of active transport proteins found in bacteria and Archaea. Members of this group are initial receptors in the process of active transport across cellular membrane, but their substrate specificities are not known in detail. However, they closely resemble the group of AmiC and active transport systems for short-chain amides and urea (FmdDEF), and thus are likely to exhibit a ligand-binding mode similar to that of the amide sensor protein AmiC from Pseudomonas aeruginosa. Moreover, this binding domain has high sequence identity to the family of hydrophobic amino acid transporters (HAAT), and thus it may also be involved in transport of amino acids.	299
380645	cd19990	PBP1_GABAb_receptor_plant	periplasmic ligand-binding domain of Arabidopsis thaliana glutamate receptors and its close homologs in other plants. This group includes the ligand-binding domain of Arabidopsis thaliana glutamate receptors, which have sequence similarity with animal ionotropic glutamate receptor and its close homologs in other plants. The ligand-binding domain of GABAb receptors are metabotropic transmembrane receptors for gamma-aminobutyric acid (GABA). GABA is the major inhibitory neurotransmitter in the mammalian CNS and, like glutamate and other transmitters, acts via both ligand gated ion channels (GABAa receptors) and G-protein coupled receptors (GABAb receptor or GABAbR). GABAa receptors are members of the ionotropic receptor superfamily which includes alpha-adrenergic and glycine receptors. The GABAb receptor is a member of a receptor superfamily which includes the mGlu receptors. The GABAb receptor is coupled to G alpha-i proteins, and activation causes a decrease in calcium, an increase in potassium membrane conductance, and inhibition of cAMP formation. The response is thus inhibitory and leads to hyperpolarization and decreased neurotransmitter release, for example.	373
380646	cd19991	PBP1_ABC_xylose_binding	D-xylose binding periplasmic protein. Periplasmic xylose-binding component of the ABC-type transport systems that belong to a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein (PBP1) superfamily, which consists of two alpha/beta globular domains connected by a three-stranded hinge. This Venus flytrap-like domain undergoes a transition from an open to a closed conformational state upon ligand binding. Moreover, the periplasmic xylose-binding protein is homologous to the ligand-binding domain of eukaryotic receptors such as glutamate receptor (GluR) and DNA-binding transcriptional repressors such as LacI and GalR.	284
380647	cd19992	PBP1_ABC_xylose_binding-like	periplasmic xylose-like sugar-binding component of the ABC-type transport systems. Periplasmic xylose-binding component of the ABC-type transport systems that belong to a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein (PBP1) superfamily, which consists of two alpha/beta globular domains connected by a three-stranded hinge. This Venus flytrap-like domain undergoes a transition from an open to a closed conformational state upon ligand binding. Moreover, the periplasmic xylose-binding protein is homologous to the ligand-binding domain of eukaryotic receptors such as glutamate receptor (GluR) and DNA-binding transcriptional repressors such as LacI and GalR.	284
380648	cd19993	PBP1_ABC_xylose_binding-like	periplasmic xylose-like sugar-binding component of the ABC-type transport systems. Periplasmic xylose-binding component of the ABC-type transport systems that belong to a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein (PBP1) superfamily, which consists of two alpha/beta globular domains connected by a three-stranded hinge. This Venus flytrap-like domain undergoes a transition from an open to a closed conformational state upon ligand binding. Moreover, the periplasmic xylose-binding protein is homologous to the ligand-binding domain of eukaryotic receptors such as glutamate receptor (GluR) and DNA-binding transcriptional repressors such as LacI and GalR.	287
380649	cd19994	PBP1_ChvE	periplasmic sugar binding protein ChvE that interacts with a bacterial two-component signaling system. Periplasmic aldose-monosaccharides binding protein ChvE that belongs to a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein (PBP1) superfamily, which consists of two alpha/beta globular domains connected by a three-stranded hinge. This Venus flytrap-like domain undergoes a transition from an open to a closed conformational state upon ligand binding. Moreover, the periplasmic xylose-binding protein is homologous to the ligand-binding domain of eukaryotic receptors such as glutamate receptor (GluR) and DNA-binding transcriptional repressors such as LacI and GalR.	304
380650	cd19995	PBP1_ABC_xylose_binding-like	periplasmic xylose-like sugar-binding component of the ABC-type transport systems. Periplasmic xylose-binding component of the ABC-type transport systems that belong to a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein (PBP1) superfamily, which consists of two alpha/beta globular domains connected by a three-stranded hinge. This Venus flytrap-like domain undergoes a transition from an open to a closed conformational state upon ligand binding. Moreover, the periplasmic xylose-binding protein is homologous to the ligand-binding domain of eukaryotic receptors such as glutamate receptor (GluR) and DNA-binding transcriptional repressors such as LacI and GalR.	294
380651	cd19996	PBP1_ABC_sugar_binding-like	monosaccharide ABC transporter substrate binding protein such as CUT2. Periplasmic sugar-binding component of uncharacterized ABC-type transport systems that are members of the pentose/hexose sugar-binding protein family of the type 1 periplasmic binding protein superfamily, which consists of two alpha/beta globular domains connected by a three-stranded hinge. This Venus flytrap-like domain undergoes transition from an open to a closed conformational state upon ligand binding. Members of this group are predicted to be involved in the transport of sugar-containing molecules across cellular and organellar membranes; however their substrate specificity is not known in detail.	302
380652	cd19997	PBP1_ABC_sugar_binding-like	monosaccharide ABC transporter substrate binding protein such as CUT2. Periplasmic sugar-binding component of uncharacterized ABC-type transport systems that are members of the pentose/hexose sugar-binding protein family of the type 1 periplasmic binding protein superfamily, which consists of two alpha/beta globular domains connected by a three-stranded hinge. This Venus flytrap-like domain undergoes transition from an open to a closed conformational state upon ligand binding. Members of this group are predicted to be involved in the transport of sugar-containing molecules across cellular and organellar membranes; however their substrate specificity is not known in detail.	305
380653	cd19998	PBP1_ABC_sugar_binding-like	monosaccharide ABC transporter substrate binding protein such as CUT2. Periplasmic sugar-binding component of uncharacterized ABC-type transport systems that are members of the pentose/hexose sugar-binding protein family of the type 1 periplasmic binding protein superfamily, which consists of two alpha/beta globular domains connected by a three-stranded hinge. This Venus flytrap-like domain undergoes transition from an open to a closed conformational state upon ligand binding. Members of this group are predicted to be involved in the transport of sugar-containing molecules across cellular and organellar membranes; however their substrate specificity is not known in detail.	302
380654	cd19999	PBP1_ABC_sugar_binding-like	monosaccharide ABC transporter substrate binding protein such as CUT2. Periplasmic sugar-binding component of uncharacterized ABC-type transport systems that are members of the pentose/hexose sugar-binding protein family of the type 1 periplasmic binding protein superfamily, which consists of two alpha/beta globular domains connected by a three-stranded hinge. This Venus flytrap-like domain undergoes transition from an open to a closed conformational state upon ligand binding. Members of this group are predicted to be involved in the transport of sugar-containing molecules across cellular and organellar membranes; however their substrate specificity is not known in detail.	313
380655	cd20000	PBP1_ABC_rhamnose	rhamnose ABC transporter substrate-binding protein. Rhamnose ABC transporter substrate-binding protein similar to periplasmic binding domain of autoinducer-2 (AI-2) receptor LsrB from Salmonella typhimurium. The members of this group are homologous to a family of periplasmic pentose/hexose sugar-binding proteins that function as the primary receptors for chemotaxis and transporters of many sugar based solutes in bacteria and archaea and that are a member of the type 1 periplasmic binding protein superfamily. LsrB, which is part of the ABC transporter complex LsrABCD, binds a chemically distinct form of the AI-2 signal that lacks boron, in contrast to the Vibrio harveyi AI-2 signaling molecule that has an unusual furanosyl borate diester. Hence, many bacteria coordinate their gene expression according to the local density of their population by producing species specific AI-2. This process of quorum sensing allows LsrB to function as a periplasmic AI-2 binding protein in interspecies signaling.	298
380656	cd20001	PBP1_LsrB_Quorum_Sensing-like	ligand-binding protein LsrB-like of ABC transporter periplasmic binding protein. Ligand-binding protein LsrB-like of a transport system, similar to periplasmic binding domain of autoinducer-2 (AI-2) receptor LsrB from Salmonella typhimurium and its close homologs from other bacteria. The members of this group are homologous to a family of periplasmic pentose/hexose sugar-binding proteins that function as the primary receptors for chemotaxis and transporters of many sugar based solutes in bacteria and archaea and that are a member of the type 1 periplasmic binding protein superfamily. LsrB, which is part of the ABC transporter complex LsrABCD, binds a chemically distinct form of the AI-2 signal that lacks boron, in contrast to the Vibrio harveyi AI-2 signaling molecule that has an unusual furanosyl borate diester. Hence, many bacteria coordinate their gene expression according to the local density of their population by producing species specific AI-2. This process of quorum sensing allows LsrB to function as a periplasmic AI-2 binding protein in interspecies signaling.	296
380657	cd20002	PBP1_LsrB_Quorum_Sensing-like	ligand-binding protein LsrB-like of ABC transporter periplasmic binding protein. Ligand-binding protein LsrB-like of a transport system, similar to periplasmic binding domain of autoinducer-2 (AI-2) receptor LsrB from Salmonella typhimurium and its close homologs from other bacteria. The members of this group are homologous to a family of periplasmic pentose/hexose sugar-binding proteins that function as the primary receptors for chemotaxis and transporters of many sugar based solutes in bacteria and archaea and that are a member of the type 1 periplasmic binding protein superfamily. LsrB, which is part of the ABC transporter complex LsrABCD, binds a chemically distinct form of the AI-2 signal that lacks boron, in contrast to the Vibrio harveyi AI-2 signaling molecule that has an unusual furanosyl borate diester. Hence, many bacteria coordinate their gene expression according to the local density of their population by producing species specific AI-2. This process of quorum sensing allows LsrB to function as a periplasmic AI-2 binding protein in interspecies signaling.	295
380658	cd20003	PBP1_LsrB_Quorum_Sensing	ligand-binding protein LsrB of ABC transporter periplasmic binding protein. Periplasmic binding domain of autoinducer-2 (AI-2) receptor LsrB from Salmonella typhimurium and its close homologs from other bacteria. The members of this group are homologous to a family of periplasmic pentose/hexose sugar-binding proteins that function as the primary receptors for chemotaxis and transporters of many sugar based solutes in bacteria and archaea and that are a member of the type 1 periplasmic binding protein superfamily. LsrB, which is part of the ABC transporter complex LsrABCD, binds a chemically distinct form of the AI-2 signal that lacks boron, in contrast to the Vibrio harveyi AI-2 signaling molecule that has an unusual furanosyl borate diester. Hence, many bacteria coordinate their gene expression according to the local density of their population by producing species specific AI-2. This process of quorum sensing allows LsrB to function as a periplasmic AI-2 binding protein in interspecies signaling.	298
380659	cd20004	PBP1_ABC_sugar_binding-like	monosaccharide ABC transporter substrate-binding protein such as CUT2. Periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis.	273
380660	cd20005	PBP1_ABC_sugar_binding-like	monosaccharide ABC transporter substrate-binding protein such as CUT2. Periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis.	274
380661	cd20006	PBP1_ABC_sugar_binding-like	monosaccharide ABC transporter substrate-binding protein such as CUT2. Periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis.	274
380662	cd20007	PBP1_ABC_sugar_binding-like	monosaccharide ABC transporter substrate-binding protein such as CUT2. Periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis.	271
380663	cd20008	PBP1_ABC_sugar_binding-like	monosaccharide ABC transporter substrate-binding protein such as CUT2. Periplasmic sugar-binding domain of uncharacterized ABC-type transport systems that share homology with a family of pentose/hexose sugar-binding proteins of the type 1 periplasmic binding protein superfamily, which consists of two domains connected by a three-stranded hinge. The substrate specificity of this group is not known, but it is predicted to be involved in the transport of sugar-containing molecules and chemotaxis.	277
380664	cd20009	PBP1_RafR-like	Ligand-binding domain of DNA transcription repressor specific for raffinose (RafR) and similar proteins. Ligand-binding domain of DNA transcription repressor specific for raffinose (RafR) which is a member of the LacI-GalR family of bacterial transcription regulators. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type I periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor.	266
380665	cd20010	PBP1_AglR-like	Ligand-binding domain of DNA transcription repressor specific for alpha-glucosides (AglR) and similar proteins. Ligand-binding domain of DNA transcription repressor specific for alpha-glucosides (AglR) which is a member of the LacI-GalR family of bacterial transcription regulators. The LacI-GalR family repressors are composed of two functional domains: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal ligand-binding domain, which is homologous to the sugar-binding domain of ABC-type transport systems that contain the type I periplasmic binding protein-like fold. As also observed in the periplasmic binding proteins, the C-terminal domain of the bacterial transcription repressor undergoes a conformational change upon ligand binding which in turn changes the DNA binding affinity of the repressor.	269
380666	cd20013	PBP1_RPA0985_benzoate-like	type 1 periplasmic binding-protein component of an ABC system (RPA0985), involved in the active transport of lignin-derived benzoate derivative compounds, and its close homologs. This group includes RPA0985 from Rhodopseudomonas palustris and its close homologs in other bacteria. Rpa0985 is the periplasmic binding-protein component of an ABC system that is involved in the active transport of lignin-derived benzoate derivative compounds. Members of this group have high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP).	356
380667	cd20014	PBP1_RPA0668_benzoate-like	type 1 periplasmic binding-protein component of an ABC system (RPA0668), involved in the active transport of lignin-derived benzoate derivative compounds, and its close homologs. This group includes RPA0668 from Rhodopseudomonas palustris and its close homologs in other bacteria. Rpa0668 is the periplasmic binding-protein component of an ABC system that is involved in the active transport of lignin-derived benzoate derivative compounds. Members of this group have high sequence similarity to members of the family of hydrophobic amino acid transporters (HAAT), such as leucine-isoleucine-valine binding protein (LIVBP).	346
410789	cd20015	FH_FOXA	Forkhead (FH) domain found in the Forkhead box protein A (FOXA) subfamily. The FOXA subfamily includes three winged helix transcription factors, FOXA1 (also called hepatocyte nuclear factor 3-alpha or transcription factor 3A), FOXA2 (also called hepatocyte nuclear factor 3-beta or transcription factor 3B), and FOXA3 (also called hepatocyte nuclear factor 3-gamma or transcription factor 3G). FOXA1 is essential for epithelial lineage differentiation and has been found to be upregulated in numerous cancers. FOXA2 controls cell differentiation. It is a key transcriptional regulator that maintains airway mucus homeostasis and may also have an important role in bone metabolism. FOXA3 acts as an essential transcriptional regulator engaged in adipogenesis and energy metabolism. This subfamily also includes Xenopus tropicalis FOXA4, Drosophila melanogaster protein fork head (dFKH), and similar proteins. FOXA4 is only present in amphibians, where it is required for the correct regionalization and maintenance of the central nervous system. dFKH promotes terminal as opposed to segmental development. In the absence of dFKH, this developmental switch does not occur. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	97
410790	cd20016	FH_FOXB	Forkhead (FH) domain found in the Forkhead box protein B (FOXB) subfamily. The FOXB subfamily includes two winged helix transcription factors, FOXB1 (also called transcription factor FKH-5) and FOXB2 (also called transcription factor FKH-4). FOXB1 controls development of mammary glands and regions of the central nervous system (CNS) that regulate the milk-ejection reflex. It is essential for access of mammillothalamic axons to the thalamus. FOXB2 may act as a tumor suppressor; it has been found to inhibit the malignant characteristics of the pancreatic cancer cell line Panc-1 in vitro. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	79
410791	cd20017	FH_FOXC	Forkhead (FH) domain found in the Forkhead box protein C (FOXC) subfamily. The FOXC subfamily includes two winged helix transcription factors, FOXC1 (also called Forkhead-related protein FKHL7, or Forkhead-related transcription factor 3) and FOXC2 (also called Forkhead-related protein FKHL14, or Mesenchyme fork head protein 1). FOXC1 is a DNA-binding transcriptional factor that plays a role in a broad range of cellular and developmental processes such as the development of the eyes, bones, cardiovascular system, kidneys, and skin. FOXC2 acts as a transcriptional activator that might be involved in the formation of special mesenchymal tissues. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	77
410792	cd20018	FH_FOXD	Forkhead (FH) domain found in the Forkhead box protein D (FOXD) subfamily. The FOXD subfamily includes four winged helix transcription factors, FOXD1-4. FOXD1, also called Forkhead-related protein FKHL8 or Forkhead-related transcription factor 4 (FREAC-4), is involved in transcriptional activation of Placental Growth Factor (PGF) and the complement component (C3) genes. It plays an important role in early embryonic development and organogenesis, and functions as an oncogene in several cancers. FOXD2, also called Forkhead-related protein FKHL17 or Forkhead-related transcription factor 9 (FREAC-9), is a probable transcription factor involved in embryogenesis and somatogenesis. FOXD3, also called HNF3/FH transcription factor genesis, acts as a transcriptional repressor that binds to the consensus sequence 5'-A[AT]T[AG]TTTGTTT-3'. It also acts as a transcriptional activator. It promotes development of neural crest cells from neural tube progenitors and restricts neural progenitor cells to the neural crest lineage while suppressing interneuron differentiation. FOXD3 is required for maintenance of pluripotent cells in the pre-implantation and peri-implantation stages of embryogenesis. FOXD4, also called Forkhead-related protein FKHL9 or Forkhead-related transcription factor 5 (FREAC-5), is essential for establishing neural cell fate and for neuronal differentiation. The family also includes Forkhead box protein D4-like proteins, FOXD4L1-6. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	78
410793	cd20019	FH_FOXE	Forkhead (FH) domain found in the Forkhead box protein E (FOXE) subfamily. The FOXE subfamily includes two winged helix transcription factors, FOXE1 (also known as FOXE2) and FOXE3. FOXE1, also called Forkhead-related protein FKHL15, HFKH4, HNF-3/fork head-like protein 5 (HFKL5), or thyroid transcription factor 2 (TTF-2), is a transcription factor that binds consensus sites on a variety of gene promoters and activates their transcription. FOXE3, also called Forkhead-related protein FKHL12 or Forkhead-related transcription factor 8 (FREAC-8), is a transcription factor that controls lens epithelial cell growth through regulation of proliferation, apoptosis, and the cell cycle. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	94
410794	cd20020	FH_FOXF	Forkhead (FH) domain found in the Forkhead box protein F (FOXF) subfamily. The FOXF subfamily includes two winged helix transcription factors, FOXF1 and FOXF2, both of which are probable transcription activators for a number of lung-specific genes. FOXF1 mutations in sporadic and familial cases of alveolar capillary dysplasia with misaligned pulmonary veins (ACD/MPV) suggest its involvement in ACD/MPV and lung organogenesis. FOXF2 is involved in programming organogenesis and regulating epithelial-to-mesenchymal transition (EMT) and cell proliferation. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	78
410795	cd20021	FH_FOXG	Forkhead (FH) domain found in the Forkhead box protein G (FOXG) subfamily. The FOXG subfamily includes a winged helix transcription factor FOXG1, which is also called brain factor 1 (BF-1), brain factor 2 (BF-2), Forkhead box protein G1A, Forkhead box protein G1B, Forkhead box protein G1C, Forkhead-related protein FKHL1, Forkhead-related protein FKHL2, or Forkhead-related protein FKHL3. FOXG1 acts as a transcription repression factor which plays an important role in the establishment of the regional subdivision of the developing brain and in the development of the telencephalon. It is repetitively used in the sequential events of telencephalic development to control multi-steps of brain circuit formation ranging from cell cycle control to neuronal differentiation in a clade- and species-specific manner. Individuals with mutations in FOXG1 harbor "FOXG1-related encephalopathy", characterized by two clinical phenotypes/syndromes including microcephaly, developmental delay, severe cognitive disabilities, early-onset dyskinesia and hyperkinetic movements, stereotypies, epilepsy, and cerebral malformation for those with deletions or intragenic mutations of FOXG1. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	79
410796	cd20022	FH_FOXH	Forkhead (FH) domain found in the Forkhead box protein H (FOXH) subfamily. The FOXH subfamily includes a winged helix transcription factor, FOXH1, which is also called Forkhead activin signal transducer 1 (Fast-1), or Forkhead activin signal transducer 2 (Fast-2). FOXH1 acts as a transcriptional activator that recognizes and binds to the DNA sequence 5'-TGT[GT][GT]ATT-3'. It is required for induction of the goosecoid (GSC) promoter by TGF-beta or activin signaling. FOXH1 forms a transcriptionally active complex containing FOXH1/SMAD2/SMAD4 on a site on the GSC promoter called TARE (TGF-beta/activin response element). The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	79
410797	cd20023	FH_FOXJ1	Forkhead (FH) domain found in Forkhead box protein J1 (FOXJ1) and similar proteins. FOXJ1, also called Forkhead-related protein FKHL13 or hepatocyte nuclear factor 3 Forkhead homolog 4 (HFH-4), acts as a transcription factor specifically required for the formation of motile cilia. It acts by activating transcription of genes that mediate assembly of motile cilia, such as CFAP157. FOXJ1 binds the DNA consensus sequences 5'-HWDTGTTTGTTTA-3' or 5'-KTTTGTTGTTKTW-3' (where H is not G, W is A or T, D is not C, and K is G or T). It activates the transcription of a variety of ciliary proteins in the developing brain and lung. The FH domain is a winged helix DNA-binding domain.	79
410798	cd20024	FH_FOXJ2-like	Forkhead (FH) domain found in Forkhead box proteins, FOXJ2, FOXJ3 and similar proteins. The FOXJ2-like subfamily includes two winged helix transcription factors, FOXJ2 and FOXJ3. FOXJ2, also called Forkhead homologous X (FHX), plays an important role in tumorigenesis, progression, and metastasis of certain cancers. It acts as a transcriptional activator that can bind to two different types of DNA binding sites. FOXJ3 is a transcription factor which regulates sperm function. It transcriptionally activates Mef2c and regulates adult skeletal muscle fiber type identity. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	77
410799	cd20025	FH_FOXI	Forkhead (FH) domain found in the Forkhead box protein I (FOXI) subfamily. The FOXI subfamily includes three human winged helix transcription factors, FOXI1-3, Xenopus laevis FoxI1c, and similar proteins. FOXI1 acts as a transcriptional activator required for the development of normal hearing, sense of balance, and kidney function. FOXI2 may act as a transcriptional activator that plays a possible role in controlling cellular identity. Loss of function mutations in the FOXI2 gene may contribute to ectodermal dysplasia. FOXI3 plays a critical role in the development of the inner ear and jaw. It is a regulator of ectodermal development. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	77
410800	cd20026	FH_FOXK	Forkhead (FH) domain found in the Forkhead box protein K (FOXK) subfamily. The FOXK subfamily includes two winged helix transcription factors, FOXK1 and FOXK2. FOXK1, also called myocyte nuclear factor (MNF), acts as a transcriptional regulator that binds to the upstream enhancer region (CCAC box) of myoglobin genes. It positively regulates Wnt/beta-catenin signaling by translocating dishevelled (DVL) proteins into the nucleus. It also reduces virus replication, probably by binding the interferon stimulated response element (ISRE) to promote antiviral gene expression. In addition, FOXK1 plays important roles in multiple human cancers. FOXK2, also called cellular transcription factor ILF-1 or interleukin enhancer-binding factor 1, is a transcriptional regulator that recognizes the core sequence 5'-TAAACA-3'. It binds to NFAT-like motifs (purine-rich) in the IL2 promoter. It also binds to the HIV-1 long terminal repeat. FOXK2 may be involved in both positive and negative regulation of important viral and cellular promoter elements. In addition, FOXK2 plays a critical role in suppressing tumorigenesis. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	78
410801	cd20027	FH_FOXL1	Forkhead (FH) domain found in Forkhead box protein L1 (FOXL1) and similar proteins. FOXL1, also called Forkhead-related protein FKHL11 or Forkhead-related transcription factor 7 (FREAC-7), acts as a transcription factor required for proper proliferation and differentiation in the gastrointestinal epithelium. It may play a critical role in suppressing tumorigenesis. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	98
410802	cd20028	FH_FOXL2	Forkhead (FH) domain found in Forkhead box protein L2 (FOXL2) and similar proteins. FOXL2 is a transcriptional regulator that is essential for ovary differentiation and maintenance, and repression of the genetic program for somatic testis determination. Mutations in the FOXL2 gene cause blepharophimosis-ptosis-epicanthus inversus syndrome (BPES) types I and II, a rare genetic disorder. In BPES type I, a complex eyelid malformation is associated with premature ovarian failure (POF), whereas in BPES type II, the eyelid defect occurs as an isolated entity. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	89
410803	cd20029	FH_FOXM	Forkhead (FH) domain found in the Forkhead box protein M (FOXM) subfamily. The FOXM subfamily includes a winged helix transcription factor, FOXM1, which is also called Forkhead-related protein FKHL16, Hepatocyte nuclear factor 3 Forkhead homolog 11 (HFH-11), HNF-3/fork-head homolog 11, M-phase phosphoprotein 2, MPM-2 reactive phosphoprotein 2, transcription factor Trident, or Winged-helix factor from INS-1 cells. FOXM1 acts as a transcriptional factor regulating the expression of cell cycle genes essential for DNA replication and mitosis. It plays a role in the control of cell proliferation. It is also involved in DNA break repair, participating in the DNA damage checkpoint response. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	77
410804	cd20030	FH_FOXN1-like	Forkhead (FH) domain found in Forkhead box protein N1 (FOXN1) and similar proteins. The FOXN1-like group includes two FOXN subfamily winged helix transcription factors, FOXN1 and FOXN4. FOXN1, also called winged helix transcription factor nude, acts as a transcriptional regulator which regulates the development, differentiation, and function of thymic epithelial cells (TECs), both in the prenatal and postnatal thymus. FOXN4 acts as a transcription factor essential for neural and some non-neural tissue development, such as retina and lung, respectively. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	81
410805	cd20031	FH_FOXN2-like	Forkhead (FH) domain found in Forkhead box protein N2 (FOXN2) and similar proteins. The FOXN2-like group includes two FOXN subfamily winged helix transcription factors, FOXN2 and FOXN3. FOXN2, also called human T-cell leukemia virus enhancer factor (HTLF), is a potential tumor suppressor that can facilitate replication fork reversal. It acts as a transcription factor that binds to the purine-rich region in human T-cell leukemia virus long terminal repeat (HTLV-I LTR). It may be a potential therapeutic and radiosensitization target for lung cancer. FOXN3, also called checkpoint suppressor 1, acts as a transcriptional repressor that may be involved in DNA damage-inducible cell cycle arrests (checkpoints). The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	76
410806	cd20032	FH_FOXO	Forkhead (FH) domain found in the Forkhead box protein O (FOXO) subfamily. The FOXO subfamily includes several winged helix transcription factors: FOXO1, FOXO3, FOXO4 and FOXO6. FOXO transcription factors are involved in the regulation of longevity phenomenon via insulin and insulin-like growth factor signaling. All FOXOs bind to the consensus sequence 5'-GTAAACAA-3', known as the DAF-16 family member-binding element, which includes the core sequence 5'-(A/C)AA(C/T)A-3' recognized by all FOX family members. FOXO1, also called Forkhead box protein O1A or Forkhead in rhabdomyosarcoma (FKHR), is a transcription factor that is the main target of insulin signaling and regulates metabolic homeostasis in response to oxidative stress. FOXO3, also called AF6q21 protein or Forkhead in rhabdomyosarcoma-like 1 (FKHRL1), is a transcriptional activator which triggers apoptosis in the absence of survival factors, including neuronal cell death upon oxidative stress. It recognizes and binds to the DNA sequence 5'-[AG]TAAA[TC]A-3'. FOXO4, also called Fork head domain transcription factor AFX1, is a transcription factor involved in the regulation of the insulin signaling pathway. It binds to insulin-response elements (IREs) and can activate transcription of IGFBP1. FOXO6 acts as a transcriptional activator that may play an important role in tumor invasion, metastasis, and prognosis. The FH domain is a winged helix DNA-binding domain.	80
410807	cd20033	FH_FOXP	Forkhead (FH) domain found in the Forkhead box protein P (FOXP) subfamily. The FOXP subfamily includes four winged helix transcription factors, FOXP1-4. They are involved in the development of the central nervous system. Mutations in FOXP1, also called Mac-1-regulated Forkhead (MFH), lead to developmental delay, intellectual disability, autism spectrum disorder, speech delay, and dysmorphic features. FOXP2, also called CAG repeat protein 44 or Trinucleotide repeat-containing gene 10 protein, is a transcriptional repressor that may play a role in the specification and differentiation of lung epithelium. It may also play a role in developing neural, gastrointestinal and cardiovascular tissues. FOXP3, also called Scurfin, is a transcriptional regulator which is crucial for the development and inhibitory function of regulatory T-cells (Treg). FOXP4, also called Forkhead-related protein-like A, is a transcriptional repressor that represses lung-specific expression. It is necessary for normal T cell cytokine recall responses to antigen following pathogenic infection. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	73
410808	cd20034	FH_FOXQ1-like	Forkhead (FH) domain found in Forkhead box protein Q1 (FOXQ1) and similar proteins. FOXQ1, also called HNF-3/Forkhead-like protein 1 (HFH-1), or hepatocyte nuclear factor 3 Forkhead homolog 1, plays a role in hair follicle differentiation. It has also been reported to promote epithelial differentiation, inhibit smooth muscle differentiation, activate T cells and autoimmunity, and control mucin gene expression and granule content in stomach surface mucous cells. FOXQ1 is significantly associated with the pathogenesis of various tumor types including gastric, breast, colorectal, pancreatic, bladder and ovarian cancer, and glioma. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	79
410809	cd20035	FH_FOXQ2-like	Forkhead (FH) domain found in Forkhead box protein Q2 (FOXQ2) and similar proteins. FOXQ2 is the neurogenic ectoderm specification transcription factor that controls aboral development in cnidarians and anterior identity in bilaterians. It plays essential roles in epidermal development and central brain patterning. The foxQ2 gene is absent in placental mammals. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	95
410810	cd20036	FH_FOXR	Forkhead (FH) domain found in the Forkhead box protein R (FOXR) subfamily. The FOXR subfamily includes two winged helix transcription factors, FOXR1-2. FOXR1, also called Forkhead box protein N5 (FOXN5), is required for proper cell division and survival possibly via the p21 and mTOR pathways. FOXR2, also called Forkhead box protein N6 (FOXN6), is an important player in a wide range of cellular processes such as proliferation, migration, differentiation, and apoptosis. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	87
410811	cd20037	FH_FOXS1	Forkhead (FH) domain found in Forkhead box protein S1 (FOXS1). FOXS1, also called Forkhead-like 18 protein or Forkhead-related transcription factor 10 (FREAC-10), is a transcriptional repressor that suppresses transcription from the FASLG, FOXO3, and FOXO4 promoters. It may have a role in the organization of the testicular vasculature. It has also been implicated in energy turnover, motor function, and body weight. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	100
410812	cd20038	FH_FOXA1	Forkhead (FH) domain found in Forkhead box protein A1 (FOXA1) and similar proteins. FOXA1, also called hepatocyte nuclear factor 3-alpha (HNF-3-alpha or HNF-3A) or transcription factor 3A (TCF-3A), acts as a transcription factor that is essential for epithelial lineage differentiation. It has been found to be upregulated in numerous cancers. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	112
410813	cd20039	FH_FOXA2	Forkhead (FH) domain found in Forkhead box protein A2 (FOXA2) and similar proteins. FOXA2, also called hepatocyte nuclear factor 3-beta (HNF-3-beta or HNF-3B) or transcription factor 3B (TCF-3B), acts as a core transcription factor that controls cell differentiation. It is a key transcriptional regulator that maintains airway mucus homeostasis. It may also have an important role in bone metabolism. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	104
410814	cd20040	FH_FOXA3	Forkhead (FH) domain found in Forkhead box protein A3 (FOXA3) and similar proteins. FOXA3, also called hepatocyte nuclear factor 3-gamma (HNF-3-gamma or HNF-3G), fork head-related protein FKH H3, or transcription factor 3G (TCF-3G), acts as an essential transcriptional regulator engaged in adipogenesis and energy metabolism. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	102
410815	cd20041	FH_dFKH	Forkhead (FH) domain found in Drosophila melanogaster protein fork head (dFKH) and similar proteins. dFKH promotes terminal as opposed to segmental development. In the absence of dFKH, this developmental switch does not occur. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	98
410816	cd20042	FH_FOXB1	Forkhead (FH) domain found in Forkhead box protein B1 (FOXB1) and similar proteins. FOXB1, also called transcription factor FKH-5, is a winged helix transcription factor that controls development of mammary glands and regions of the central nervous system (CNS) that regulate the milk-ejection reflex. It is essential for access of mammillothalamic axons to the thalamus. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	92
410817	cd20043	FH_FOXB2	Forkhead (FH) domain found in Forkhead box protein B2 (FOXB2) and similar proteins. FOXB2, also called transcription factor FKH-4, may act as a transcription factor. It may also act as a tumor suppressor; it has been found to inhibit the malignant characteristics of the pancreatic cancer cell line Panc-1 in vitro. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	110
410818	cd20044	FH_FOXC1	Forkhead (FH) domain found in Forkhead box protein C1 (FOXC1) and similar proteins. FOXC1, also called Forkhead-related protein FKHL7, or Forkhead-related transcription factor 3 (FREAC-3), is a DNA-binding transcriptional factor that plays a role in a broad range of cellular and developmental processes such as the development of the eyes, bones, cardiovascular system, kidneys, and skin. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	93
410819	cd20045	FH_FOXC2	Forkhead (FH) domain found in Forkhead box protein C2 (FOXC2) and similar proteins. FOXC2, also called Forkhead-related protein FKHL14, Mesenchyme fork head protein 1 (MFH-1 protein), or transcription factor FKH-14, acts as a transcriptional activator that might be involved in the formation of special mesenchymal tissues. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	90
410820	cd20046	FH_FOXD1_D2-like	Forkhead (FH) domain found in Forkhead box proteins FOXD1, FOXD2 and similar proteins. FOXD1, also called Forkhead-related protein FKHL8, or Forkhead-related transcription factor 4 (FREAC-4), is involved in transcriptional activation of Placental Growth Factor (PGF) and the complement component (C3) genes. It plays an important role in early embryonic development and organogenesis, and functions as an oncogene in several cancers. FOXD2, also called Forkhead-related protein FKHL17, or Forkhead-related transcription factor 9 (FREAC-9), is a probable transcription factor involved in embryogenesis and somatogenesis. It has been found that long noncoding RNA FOXD2 adjacent opposite strand RNA1 (lncRNA FOXD2-AS1) expression is upregulated in various human malignancies, including gastric, lung, bladder, colorectal, nasopharyngeal, esophageal, hepatocellular, thyroid and skin cancer. It is a promising candidate as a new biomarker and therapeutic target for cancer diagnosis/prognostication due to high tissue specificity and elevated efficiency. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	99
410821	cd20047	FH_FOXD3	Forkhead (FH) domain found in Forkhead box protein D3 (FOXD3) and similar proteins. FOXD3, also called HNF3/FH transcription factor genesis, acts as a transcriptional repressor that binds to the consensus sequence 5'-A[AT]T[AG]TTTGTTT-3'. It also acts as a transcriptional activator. It promotes development of neural crest cells from neural tube progenitors and restricts neural progenitor cells to the neural crest lineage while suppressing interneuron differentiation. FOXD3 is required for maintenance of pluripotent cells in the pre-implantation and peri-implantation stages of embryogenesis. The FH domain is a winged helix DNA-binding domain.	97
410822	cd20048	FH_FOXD4-like	Forkhead (FH) domain found in Forkhead box protein D4 (FOXD4) and similar proteins. FOXD4, also called Forkhead-related protein FKHL9, Forkhead-related transcription factor 5 (FREAC-5), or myeloid factor-alpha, is essential for establishing neural cell fate and for neuronal differentiation. The family also includes Forkhead box protein D4-like proteins, FOXD4L1-6. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	96
410823	cd20049	FH_FOXF1	Forkhead (FH) domain found in Forkhead box protein F1 (FOXF1) and similar proteins. FOXF1, also called Forkhead-related activator 1 (FREAC-1), Forkhead-related protein FKHL5, or Forkhead-related transcription factor 1, is a probable transcription activator for a number of lung-specific genes. FOXF1 mutations in sporadic and familial cases of alveolar capillary dysplasia with misaligned pulmonary veins (ACD/MPV) suggest its involvement in ACD/MPV and lung organogenesis. The role of FOXF1 in cancer is conflicting; its loss in some cancers suggests a tumor suppressive function, but its abundance in others is associated with protumorigenic and metastatic traits. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	99
410824	cd20050	FH_FOXF2	Forkhead (FH) domain found in Forkhead box protein F2 (FOXF2) and similar proteins. FOXF2, also called Forkhead-related activator 2 (FREAC-2), Forkhead-related protein FKHL6, or Forkhead-related transcription factor 2, is a probable transcription activator for a number of lung-specific genes. It is involved in programming organogenesis and regulating epithelial-to-mesenchymal transition (EMT) and cell proliferation. FOXF2 dysregulation is critical for tumorigenesis of various tissue types. Its expression correlates with good prognosis in patients with early non-invasive stages of breast cancer, but with poor prognosis in advanced breast cancer. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	93
410825	cd20051	FH_FOXJ2	Forkhead (FH) domain found in Forkhead box protein J2 (FOXJ2) and similar proteins. FOXJ2, also called Fork head homologous X (FHX), plays an important role in tumorigenesis, progression, and metastasis of certain cancers. It acts as a transcriptional activator that can bind to two different types of DNA binding sites. It is specifically expressed in meiotic spermatocytes in adult mouse testes and appears to promote meiotic progression during spermatogenesis. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	82
410826	cd20052	FH_FOXJ3	Forkhead (FH) domain found in Forkhead box protein J3 (FOXJ3) and similar proteins. FOXJ3 is a transcription factor which regulates sperm function. It transcriptionally activates Mef2c and regulates adult skeletal muscle fiber type identity. Polymorphisms in the FOXJ3 gene may be associated with the development of rheumatoid arthritis. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	79
410827	cd20053	FH_FOXI1	Forkhead (FH) domain found in Forkhead box protein I1 (FOXI1) and similar proteins. FOXI1 is also called Forkhead-related protein FKHL10, Forkhead-related transcription factor 6 (FREAC-6), Hepatocyte nuclear factor 3 Forkhead homolog 3 (HFH-3), or HNF-3/fork-head homolog 3. It is a master regulator of vacuolar H-ATPase proton pump subunits in the inner ear, kidney, and epididymis, and is required for the development of normal hearing, sense of balance, and kidney function. Its epididymal expression is required for male fertility. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	100
410828	cd20054	FH_FOXK1	Forkhead (FH) domain found in Forkhead box protein K1 (FOXK1) and similar proteins. FOXK1, also called myocyte nuclear factor (MNF), acts as a transcriptional regulator that binds to the upstream enhancer region (CCAC box) of myoglobin genes. It positively regulates Wnt/beta-catenin signaling by translocating dishevelled (DVL) proteins into the nucleus. It also reduces virus replication, probably by binding the interferon stimulated response element (ISRE) to promote antiviral gene expression. In addition, FOXK1 plays important roles in multiple human cancers. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	101
410829	cd20055	FH_FOXK2	Forkhead (FH) domain found in Forkhead box protein K2 (FOXK2) and similar proteins. FOXK2, also called cellular transcription factor ILF-1 or interleukin enhancer-binding factor 1, is a transcriptional regulator that recognizes the core sequence 5'-TAAACA-3'. It binds to NFAT-like motifs (purine-rich) in the IL2 promoter. It also binds to the HIV-1 long terminal repeat. FOXK2 may be involved in both positive and negative regulation of important viral and cellular promoter elements. In addition, FOXK2 plays a critical role in suppressing tumorigenesis. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	98
410830	cd20056	FH_FOXN1	Forkhead (FH) domain found in Forkhead box protein N1 (FOXN1). FOXN1, also called winged helix transcription factor nude, acts as a transcriptional regulator which regulates the development, differentiation, and function of thymic epithelial cells (TECs), both in the prenatal and postnatal thymus. It is also an important factor in controlling the skin wound healing process, as it actively participates in re-epithelialization and is responsible for scar formation. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	97
410831	cd20057	FH_FOXN4	Forkhead (FH) domain found in Forkhead box protein N4 (FOXN4). FOXN4 acts as a transcription factor essential for neural and some non-neural tissue development, such as retina and lung, respectively. During development of the central nervous system, FOXN4 is required in specifying the amacrine and horizontal cell fates from multipotent retinal progenitors, while suppressing alternative photoreceptor cell fates. In non-neural tissues, it plays an essential role in specifying the atrioventricular canal and is indirectly required for patterning the distal airway during lung development. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	106
410832	cd20058	FH_FOXN2	Forkhead (FH) domain found in Forkhead box protein N2 (FOXN2). FOXN2, also called human T-cell leukemia virus enhancer factor (HTLF), is a potential tumor suppressor that can facilitate replication fork reversal. It acts as a transcription factor that binds to the purine-rich region in human T-cell leukemia virus long terminal repeat (HTLV-I LTR). It may be a potential therapeutic and radiosensitization target for lung cancer. FOXN2 has also been found to be down-regulated in breast cancer, where it regulates migration, invasion, and epithelial-mesenchymal transition. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	82
410833	cd20059	FH_FOXN3	Forkhead (FH) domain found in Forkhead box protein N3 (FOXN3). FOXN3, also called checkpoint suppressor 1 (CHES1), acts as a transcriptional repressor that may be involved in DNA damage-inducible cell cycle arrests (checkpoints). It displays transcriptional inhibitory activity, and is involved in cell cycle regulation and tumorigenesis. Alterations in FOXN3 are found in a variety of cancers including melanoma, osteosarcoma, and hepatocellular carcinoma. FOXN3 also regulates hepatic glucose utilization/metabolism by regulating gluconeogenic substrate selection. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	90
410834	cd20060	FH_FOXO1	Forkhead (FH) domain found in Forkhead box protein O1 (FOXO1). FOXO1, also called Forkhead box protein O1A or Forkhead in rhabdomyosarcoma (FKHR), is a transcription factor that is the main target of insulin signaling and regulates metabolic homeostasis in response to oxidative stress. It binds to the insulin response element (IRE) with the consensus sequence 5'-TT[G/A]TTTTG-3' and the related Daf-16 family binding element (DBE) with the consensus sequence 5'-TT[G/A]TTTAC-3'. The FH domain is a winged helix DNA-binding domain.	99
410835	cd20061	FH_FOXO3	Forkhead (FH) domain found in Forkhead box protein O3 (FOXO3). FOXO3, also called AF6q21 protein or Forkhead in rhabdomyosarcoma-like 1 (FKHRL1), is a transcriptional activator which triggers apoptosis in the absence of survival factors, including neuronal cell death upon oxidative stress. It recognizes and binds to the DNA sequence 5'-[AG]TAAA[TC]A-3'. The FH domain is a winged helix DNA-binding domain. All FOXOs bind to the consensus sequence 5'-GTAAACAA-3', known as the DAF-16 family member-binding element, which includes the core sequence 5'-(A/C)AA(C/T)A-3' recognized by all FOX family members.	83
410836	cd20062	FH_FOXO4	Forkhead (FH) domain found in Forkhead box protein O4 (FOXO4) and similar proteins. FOXO4, also called Fork head domain transcription factor AFX1, is a transcription factor involved in the regulation of the insulin signaling pathway. It binds to insulin-response elements (IREs) and can activate transcription of IGFBP1. The FH domain is a winged helix DNA-binding domain. All FOXOs bind to the consensus sequence 5'-GTAAACAA-3', known as the DAF-16 family member-binding element, which includes the core sequence 5'-(A/C)AA(C/T)A-3' recognized by all FOX family members.	86
410837	cd20063	FH_FOXO6	Forkhead (FH) domain found in Forkhead box protein O6 (FOXO6) and similar proteins. FOXO6 acts as a transcriptional activator that may play an important role in tumor invasion, metastasis and prognosis. The FH domain is a winged helix DNA-binding domain. All FOXOs bind to the consensus sequence 5'-GTAAACAA-3', known as the DAF-16 family member-binding element, which includes the core sequence 5'-(A/C)AA(C/T)A-3' recognized by all FOX family members.	88
410838	cd20064	FH_FOXP1	Forkhead (FH) domain found in Forkhead box protein P1 (FOXP1). FOXP1, also called Mac-1-regulated Forkhead (MFH), is a transcription factor that is widely expressed and has a broad range of functions. It has been shown to have a role in cardiac, lung, and lymphocyte development. Deregulation of FOXP1 is an important contributor to the pathogenesis of diffuse large B-cell lymphoma (DLBCL), suggesting it may function as an oncogene. Loss of FOXP1 expression in breast cancer is associated with a worse outcome, suggesting that FOXP1 may function as a tumor suppressor in some tissues. Haploinsufficiency of the FOXP1 gene leads to a neurodevelopmental disorder termed FOXP1 syndrome, characterized by developmental delay, intellectual disability, autism spectrum disorder, speech delay, and dysmorphic features. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	87
410839	cd20065	FH_FOXP2	Forkhead (FH) domain found in Forkhead box protein P2 (FOXP2). FOXP2, also called CAG repeat protein 44, or Trinucleotide repeat-containing gene 10 protein, is a transcriptional repressor that may play a role in the specification and differentiation of lung epithelium. It may also play a role in developing neural, gastrointestinal, and cardiovascular tissues. An arginine-to-histidine missense mutation (R553H) in the FOXP2 Forkhead domain has been linked to a severe speech and language disorder. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	82
410840	cd20066	FH_FOXP3	Forkhead (FH) domain found in Forkhead box protein P3 (FOXP3) and similar proteins. FOXP3, also called Scurfin, is a transcriptional regulator which is crucial for the development and inhibitory function of regulatory T-cells (Treg), which are required for maintaining self-tolerance. It may also have intrinsic regulatory function in conventional T (Tconv) cells. A deletion of the Forkhead domain arising from a frame-shift mutation in mouse FOXP3 is linked to the autoimmune disorder scurfy, and a similar congenital disease in humans is known as IPEX (immune dysregulation, polyendocrinopathy, enteropathy, X-linked syndrome). The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	81
410841	cd20067	FH_FOXP4	Forkhead (FH) domain found in Forkhead box protein P4 (FOXP4) and similar proteins. FOXP4, also called Forkhead-related protein-like A, is a transcriptional repressor that represses lung-specific expression. It is not required for T cell development, but is necessary for normal T cell cytokine recall responses to antigen following pathogenic infection. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	87
410993	cd20069	5TM_Oxa1-like	Five transmembrane core domain of mitochondrial inner membrane protein Oxa1 and similar proteins. This group is composed mostly of the mitochondrial members of the YidC/Oxa1/Alb3 protein family of insertases, including mitochondrial inner membrane proteins Oxa1, Oxa1-like (Oxa1L), cytochrome c oxidase assembly protein 18 (Cox18, also called Oxa2), and Arabidopsis thaliana mitochondrial ALBINO3-like protein 3 (ALB33). It also includes Arabidopsis thaliana chloroplastic ALBINO3-like protein 2 (ALB32). Members of this group mediate the insertion of both mitochondrion-encoded precursors and nuclear-encoded proteins from the matrix into the mitochondrial inner membrane. Oxa1 and Cox18/Oxa2 are essential for the activity and assembly of cytochrome c oxidase, playing central roles in the translocation and export of the N-terminal and C-terminal parts, respectively, of the COX2 protein into the mitochondrial intermembrane space. ALB32 may be involved in the insertion of integral membrane proteins into the chloroplast thylakoid membranes. YidC/Oxa1/Alb3 family insertases contain a core domain of five transmembrane (5TM) segments that is essential to insertase function.	201
410994	cd20070	5TM_YidC_Alb3	Five transmembrane core domain of membrane protein insertase YidC, Alb3, and similar proteins. This group is composed of the bacterial and chloroplastic members of the YidC/Oxa1/Alb3 protein family of insertases, including bacterial YidC, and chloroplastic ALBINO3 (Alb3) and Alb3-like proteins such as ALBINO3-like protein 1 (also called Alb4). Membrane protein insertase YidC, also called foldase YidC or membrane integrase YidC, facilitates proper folding, insertion, and assembly of inner membrane proteins and complexes. Depending on the nature of the substrate, YidC functions in a Sec-independent (YidC only) or a Sec-dependent manner as part of a complex containing YidC, the SecYEG channel, and SecDFYajC. YidC from Gram-negative bacteria contains an extra transmembrane segment (TM1) at the N-terminus and a large periplasmic domain, located between TM1 and TM2, that adopts a beta-super sandwich fold that is found in sugar-binding proteins such as galactose mutarotase. Alb3 and Alb3-like proteins are required for the post-translational insertion of the light-harvesting chlorophyll-binding proteins (LHCPs) into the chloroplast thylakoid membrane. Alb3 acts independently and may also function cooperatively with the thylakoid cpSecYE translocase to insert proteins co-translationally into the thylakoid membrane, similar to bacterial YidC that can function with the SecYEG translocase. YidC/Oxa1/Alb3 family insertases contain a core domain of five transmembrane (5TM) segments that is essential to insertase function.	181
380997	cd20071	SET_SMYD	SET domain (including SET domain and post-SET domain) found in SET and MYND domain-containing protein, and similar proteins. The family includes SET and MYND domain-containing proteins, SMYD1-SMYD5. SMYD1 (EC 2.1.1.43; also termed BOP) is a heart and muscle specific SET-MYND domain containing protein, which functions as a histone methyltransferase and regulates downstream gene transcription. It methylates histone H3 at 'Lys-4' (H3K4me), and seems able to perform mono-, di-, and trimethylation. SMYD2 (also termed HSKM-B, or lysine N-methyltransferase 3C (KMT3C)) functions as a histone methyltransferase that methylates both histones and non-histone proteins, including p53/TP53 and RB1. It specifically methylates histone H3 'Lys-4' (H3K4me) and dimethylates histone H3 'Lys-36' (H3K36me2). SMYD3 (also termed zinc finger MYND domain-containing protein 1) functions as a histone methyltransferase that specifically methylates 'Lys-4' of histone H3, inducing di- and tri-methylation, but not monomethylation. It also methylates 'Lys-5' of histone H4. SMYD3 plays an important role in transcriptional activation as a member of an RNA polymerase complex. SMYD4 functions as a potential tumor suppressor that plays a critical role in breast carcinogenesis at least partly through inhibiting the expression of PDGFR-alpha. SMYD5 (also termed protein NN8-4AG, or retinoic acid-induced protein 15) functions as a histone lysine methyltransferase that mediates H4K20me3 at heterochromatin regions.	122
380998	cd20072	SET_SET1	SET domain (including post-SET domain) found in catalytic component of the Saccharomyces cerevisiae COMPASS complex and similar proteins. The family contains mostly fungal SET domains, including SET1 found in the catalytic component of the Saccharomyces cerevisiae COMPASS (complex of proteins associated with Set1). SET1 is a histone-lysine N-methyltransferase that specifically methylates 'Lys-4' of histone H3 (H3K4me), when part of the SET1 histone methyltransferase (HMT) complex. The activity of this catalytic domain is established through forming a complex with a set of core proteins; it is extensively contacted by Cps60 (Bre2), Cps50 (Swd1), and Cps30 (Swd3).	148
380999	cd20073	SET_SUV39H_Clr4-like	SET domain (including pre-SET and post-SET domains) found in Schizosaccharomyces pombe H3K9 methyltransferase Clr4, and similar proteins. This subfamily contains fission yeast Schizosaccharomyces pombe H3K9 methyltransferase Clr4 (also known as Suv39h), the sole homolog of the mammalian SUV39H1 and SUV39H2 enzymes, that has a critical role in preventing aberrant heterochromatin formation. It is known to di- and tri-methylate Lys-9 of histone H3, a central heterochromatic histone modification, with its specificity profile most similar to that of the human SUV39H2 homolog.	259
410850	cd20074	XPF_nuclease_Mus81	XPF-like nuclease domain of Mus81. Mus81 is a crossover junction endonuclease that interacts with Eme1 and Eme2 to form a DNA structure-specific endonuclease with substrate preference for branched DNA structures with a 5'-end at the branch nick. The typical substrates include 3'-flap structures, replication forks and nicked Holliday junctions. Mus81 may be required in mitosis for the processing of stalled or collapsed replication forks. Mus81 consists of the active nuclease domain with the GDX(n)ERKX(3)D motif which is required for metal-dependent endonuclease activity and two helix-hairpin-helix (HhH2) domains.	150
410851	cd20075	XPF_nuclease_XPF_arch	nuclease domain of XPF found in archaea. XPF, also called DNA excision repair protein ERCC-4, or DNA repair protein complementing XP-F cells, or Xeroderma pigmentosum group F-complementing protein, is a 3'-flap repair endonuclease that cleaves 5' of ds/ssDNA interfaces in 3' flap structures, although it also cuts bubble, Y-DNA structures and mobile and immobile Holliday junctions. XPF cuts preferentially after pyrimidines and may continue to progressively cleave substrate upstream of the initial cleavage, at least in vitro. It may be involved in nucleotide excision repair. The nuclease domains of the catalytic subunits XPF have the GDX(n)ERKX(3)D motif which is required for metal-dependent endonuclease activity but not for DNA junction binding. XPF-ERCC1 and its yeast homolog Rad1-Rad10 play key roles in the excision of DNA lesions and are required for certain types of homologous recombination events and for the repair of DNA cross-links.	127
410852	cd20076	XPF_nuclease_FAAP24	XPF-like nuclease domain of Fanconi anemia associated protein 24 kDa (FAAP24). FAAP24, also called Fanconi anemia core complex-associated protein 24, plays a role in DNA repair through recruitment of the Fanconi anemia (FA) core complex to damaged DNA. It regulates FANCD2 monoubiquitination upon DNA damage and induces chromosomal instability as well as hypersensitivity to DNA cross-linking agents, when repressed. FAAP24 may possess a high affinity toward single-stranded DNA (ssDNA). The nuclease domain of FAAP24 lacks the catalytic motif. The FANCM/FAAP24 complex is related to XPF/MUS81 endonucleases but lacks endonucleolytic activity. It binds branched DNA structures containing ssDNA regions, such as splayed-arm and 3'-flap DNA structures, and anchors the FA core complex to chromatin in repairing DNA interstrand crosslinks.	123
410853	cd20077	XPF_nuclease_FANCM	XPF-like nuclease domain of Fanconi anemia group M protein (FANCM). FANCM (EC 3.6.4.13), also called Fanconi anemia-associated polypeptide of 250 kDa (FAAP250), or protein Hef ortholog, or ATP-dependent RNA helicase FANCM, is a DNA-dependent ATPase component of the Fanconi anemia (FA) core complex. It is required for the normal activation of the FA pathway, leading to monoubiquitination of the FANCI-FANCD2 complex in response to DNA damage, cellular resistance to DNA cross-linking drugs, and prevention of chromosomal breakage. In complex with CENPS and CENPX, it binds double-stranded DNA (dsDNA), fork-structured DNA (fsDNA) and Holliday junction substrates. In complex with FAAP24, it efficiently binds to single-strand DNA (ssDNA), splayed-arm DNA, and 3'-flap substrates. In vitro, on its own, FANCM strongly binds ssDNA oligomers and weakly fsDNA, but does not bind to dsDNA.	139
410854	cd20078	XPF_nuclease_XPF_euk	nuclease domain of XPF found in eukaryotes. XPF, also called DNA excision repair protein ERCC-4, or DNA repair protein complementing XP-F cells, or Xeroderma pigmentosum group F-complementing protein, is a DNA repair endonuclease that is a catalytic component of a structure-specific DNA repair endonuclease responsible for the 5-prime incision during DNA repair. It is involved in homologous recombination that assists in removing interstrand cross-links. The nuclease domains of the catalytic subunits XPF have the GDX(n)ERKX(3)D motif which is required for metal-dependent endonuclease activity but not for DNA junction binding. XPF-ERCC1 and its yeast homolog Rad1-Rad10 play key roles in the excision of DNA lesions and are required for certain types of homologous recombination events and for the repair of DNA cross-links.	136
410855	cd20079	XPF_nuclease_ERCC1	XPF-like nuclease domain of DNA excision repair protein ERCC1. ERCC1 is a non-catalytic component of a structure-specific DNA repair endonuclease responsible for the 5'-incision during DNA repair. In conjunction with SLX4, ERCC1 is responsible for the first step in the repair of interstrand cross-links (ICL), as well as for homology-directed repair (HDR) of DNA double-strand breaks. ERCC1 participates in the processing of anaphase bridge-generating DNA structures, which consist of incompletely processed DNA lesions arising during S or G2 phase, and can result in cytokinesis failure. ERCC1 also plays a critical role in targeting the XPF-ERCC1 complex to DNA. XPF-ERCC1 and its yeast homolog Rad1-Rad10 play key roles in the excision of DNA lesions and are required for certain types of homologous recombination events and for the repair of DNA cross-links. The critical motif, DX(n)ERKX(3)D, for endonuclease activity is absent in the nuclease domain of ERCC1.	115
410856	cd20080	XPF_nuclease_EME-like	XPF-like nuclease domain of the family of Essential Meiotic Endonucleases (EMEs) and similar proteins. The family of EMEs includes EME1 and EME2. EME1, also called MMS4 homolog (hMMS4), interacts with MUS81 to form a DNA structure-specific endonuclease with substrate preference for branched DNA structures with a 5'-end at the branch nick. Its typical substrates include 3'-flap structures, replication forks and nicked Holliday junctions. EME1 may be required in mitosis for the processing of stalled or collapsed replication forks. EME2 interacts with MUS81 to form a DNA structure-specific endonuclease which cleaves substrates such as 3'-flap structures. MUS81-EME2 is responsible for fork cleavage and restart in human cells. The MUS81-EME2 protein, whose actions are restricted to S phase, is also responsible for telomere maintenance in telomerase-negative ALT (Alternative Lengthening of Telomeres) cells. The nuclease domain of EMEs is a nuclease-like domain which is involved in targeting the MUS81-EME heterodimer complex to DNA. The family also includes budding yeast Mms4 (also known as Eme1 in other organisms), a putative transcriptional (co)activator that protects Saccharomyces cerevisiae cells from endogenous and environmental DNA damage. It interacts with MUS81 to form a DNA structure-specific endonuclease with substrate preference for branched DNA structures with a 5'-end at the branch nick. The nuclease domain of Mms4 lacks the catalytic motif.	164
410857	cd20081	XPF_nuclease_EME1	XPF-like nuclease domain of crossover junction endonuclease EME1. EME1, also called MMS4 homolog (hMMS4), interacts with MUS81 to form a DNA structure-specific endonuclease with substrate preference for branched DNA structures with a 5'-end at the branch nick. Its typical substrates include 3'-flap structures, replication forks and nicked Holliday junctions. EME1 may be required in mitosis for the processing of stalled or collapsed replication forks. The nuclease domain of EME1 is a nuclease-like domain which is involved in targeting the MUS81-EME1 heterodimer complex to DNA.	179
410858	cd20082	XPF_nuclease_EME2	XPF-like nuclease domain of crossover junction endonuclease EME2. EME2 interacts with MUS81 to form a DNA structure-specific endonuclease which cleaves substrates such as 3'-flap structures. MUS81-EME2 is responsible for fork cleavage and restart in human cells. The MUS81-EME2 protein, whose actions are restricted to S phase, is also responsible for telomere maintenance in telomerase-negative ALT (Alternative Lengthening of Telomeres) cells. The nuclease domain of EME2 is a nuclease-like domain which is involved in targeting the MUS81-EME2 heterodimer complex to DNA.	195
410859	cd20083	XPF_nuclease_EME	XPF-like nuclease domain of crossover junction endonucleases, EME1, EME2 and similar proteins. The Mus81-EME1 complex is a structure-selective endonuclease with a critical role in the resolution of recombination intermediates during DNA repair after interstrand cross-links, replication fork collapse, or double-strand breaks. The ERCC4 domain of Eme1 is a nuclease-like domain which is involved in targeting the MUS81-EME1 heterodimer complex to DNA.	179
410860	cd20085	XPF_nuclease_Mms4	XPF-like nuclease domain of Saccharomyces cerevisiae crossover junction endonuclease Mms4 and similar proteins. Budding yeast Mms4, also known as Eme1 in other organisms, is a putative transcriptional (co)activator that protects Saccharomyces cerevisiae cells from endogenous and environmental DNA damage. It interacts with MUS81 to form a DNA structure-specific endonuclease with substrate preference for branched DNA structures with a 5'-end at the branch nick. Typical substrates include 3'-flap structures, D-loops, replication forks with regressed leading strands and nicked Holliday junctions. The nuclease domain of Mms4 lacks the catalytic motif.	220
380911	cd20167	Peptidase_M90-like	M90 peptidase is a zinc-metallopeptidase. The M90 peptidase family includes the MtfA (Mlc Titration Factor A) peptidase from Escherichia coli, also known as the YeeI gene product, which is involved in the control of the glucose-phosphotransferase sensory and regulatory system by inactivation of the repressor Mlc (making large colonies). E. coli MtfA has been shown to have aminopeptidase activity with the presence of a single zinc ion in the active site ligated by two histidines in a HEXXH motif. This family also includes uncharacterized proteins similar to MtfA peptidase.	208
380912	cd20169	Peptidase_M90_mtfA	Mlc titration factor A (MtfA) is a zinc metallopeptidase (M90 peptidase). This subfamily includes the Mlc Titration Factor A (MtfA; also known as YeeI or DgsA anti-repressor MtfA) which is involved in the control of the glucose-phosphotransferase sensory and regulatory system by inactivation of the repressor Mlc (making large colonies). It can cleave synthetic substrates of both carboxypeptidases and aminopeptidases, with strongest activity towards the latter. Its biologically relevant substrate has yet to be identified. Although it interacts with the transcription repressor Mlc, it does not cleave it. However, Mlc seems to activate the peptidase activity of MtfA. MtfA is related to the catalytic domain of the anthrax lethal factor which is a zinc-dependent metalloprotease, targeting mitogen-activated protein kinase kinases (MAPKKs), and resulting in apoptosis, as well as the Mop (modulation of pathogenesis) protein involved in the virulence of Vibrio cholerae; although sequence similarity is low, conservation is observed in the overall structure as well as in the residues around the active site.	208
380913	cd20170	Peptidase_M90-like	uncharacterized M90 peptidase family-like proteins. This subfamily contains uncharacterized M90 peptidase-like domains, similar to the Mlc Titration Factor A (MtfA) peptidase from Escherichia coli, also known as the YeeI gene product, which is involved in the control of the glucose-phosphotransferase sensory and regulatory system by inactivation of the repressor Mlc (making large colonies). E. coli MtfA has been shown to have aminopeptidase activity with the presence of a single zinc ion in the active site ligated by two histidines in an HEXXH motif. MtfA is related to the catalytic domain of the anthrax lethal factor and the Mop protein involved in the virulence of Vibrio cholerae; although sequence similarity is low, conservation is observed in the overall structure as well as in the residues around the active site.	210
380332	cd20171	M34_peptidase	Peptidase family M34 includes the C-terminal catalytic domain of anthrax lethal factor (ATLF), the protective antigen-binding domains of ATLF and edema factor, and Pro-Pro endopeptidase. Peptidase family M34 (also known as the anthrax lethal factor family) includes the C-terminal catalytic domain of anthrax lethal factor (ATLF, EC 3.4.24.83), and the N-terminal protective antigen-binding domains (PABDs) of ATLF and edema factor (EF). ATLF and EF are enzyme components of anthrax toxin and are carried into the cell by a third component, the protective antigen (PA). ATLF is a highly selective protease whose major substrates are mitogen-activated protein kinase kinases (MKKs). At its N-terminus, ATLF has a PABD domain which lacks the hallmark metalloprotease motif HEXXH, and, at its C-terminus, the related catalytic domain has the HEXXH motif where the two His residues bind a single zinc atom, and the Glu has a catalytic role. EF acts as a Ca2+- and calmodulin-dependent adenylyl cyclase that can cause edema when associated with PA. EF is comprised of the PABD and an adenylyl cyclase domain. This family also includes Pro-Pro endopeptidase (PPEP-1; EC 3.4.24.89, also known as Zmp1) which is an extracellular metalloprotease that shows a unique specificity for hydrolyzing a Pro-Pro bond and is involved in bacterial adhesion.	156
380910	cd20174	GH18_LinChi78-like_UFR	an unknown function domain of Listeria innocua LinChi78 GH18 chitinase that is essential for its catalytic activity; found in similar chitinase-like proteins. This domain is referred to as an unknown-function region (UFR) and shown to be necessary for the hydrolytic activity of LinChi78 glycosyl hydrolase family 18 (GH18) chitinase (a product of the lin0153 gene) from the nonpathogenic bacterium Listeria innocua. The catalytic domain (CatD) of GH18 chitinases folds into a TIM barrel and has a conserved DXXDXDXE motif, in which the Glu residue functions as a catalytic residue; these chitinases contain additional domains such as a chitin-binding domain (ChBD) and/or a fibronectin type III-like (FnIII) domain. LinChi78 consists of a CatD, a FnIII, and a ChBD domain, and has this UFR region located between the CatD and the FnIII domain. Its catalytic site is composed of a typical CatD and a portion of this UFR, in particular the key Gln and Ile residues which are indispensable for LinChi78 to exhibit full catalytic activity. This UFR domain is also found in proteins where it is located between a CatD domain and DUF5011 and ChBD(s) domains. LinChi78 exhibits chitinase activity towards artificial and natural substrates, including colloidal chitin and chitin oligosaccharides of various lengths, and hydrolyzes these in a processive manner. Members of this family include some uncharacterized chitinase-like proteins from pathogenic bacteria such as Listeria monocytogenes and Clostridium botulinum.	136
412038	cd20175	ThyX	FAD-dependent thymidylate synthase (ThyX), mechanistically and structurally unrelated to thymidylate synthase (ThyA). This family contains FAD-dependent thymidylate synthase (also known as ThyX, Thy1, FDTS or thymidylate synthase complementing protein), found in many microbial genomes including several human pathogens, but absent in humans. This protein is mechanistically and structurally unrelated to thymidylate synthase (TS or ThyA) found in mammals. ThyA and ThyX both produce de novo thymidylate or deoxythymidine 5'-monophosphate (dTMP), an essential DNA precursor. The classic ThyA catalyzes the reductive methylation of deoxyuridine 5'-monophosphate (dUMP) to form dTMP, with methylenetetrahydrofolate (CH2H4folate) serving as a one-carbon donor and as the source of reductive power. On the other hand, ThyX contains FAD, tightly bound by a novel fold, that mediates hydride transfer from NADPH during catalysis. Consequently, CH2H4folate serves only as a carbon donor and tetrahydrofolate (and not dihydrofolate as in the case of ThyA) is produced. The differences between ThyX and ThyA are used for mechanism-based drugs to selectively inhibit FDTS and not have much effect on human and other eukaryotic TS. ThyX has been pursued for the development of new antibacterial agents against Mycobacterium tuberculosis, the causative agent of the widespread infectious disease tuberculosis (TB). It is also an attractive target for designing specific antibiotic drugs against many diseases such as ulcers, periodontal disease, and Lyme disease, as well as biological warfare agents such as anthrax, botulism, and typhus.	186
380333	cd20183	M34_PPEP	Pro-Pro endopeptidase (PPEP) and similar proteins; belongs to peptidase family M34. This subfamily includes the enzyme Pro-Pro endopeptidase (PPEP-1, EC 3.4.24.89, also known as Zmp1 (Clostridium difficile-type)), an extracellular metalloprotease showing a unique specificity for hydrolyzing a Pro-Pro bond. It belongs to peptidase family M34 and has the hallmark metalloprotease motif HEXXH, where the two His residues bind a single zinc atom, and the Glu has a catalytic role. PPEP-1 cleaves two C. difficile cell surface proteins (CD2831 and CD3246) involved in adhesion, one of which is encoded by the gene adjacent to the ppep-1 gene. There are multiple PPEP-1 cleavage sites located just above the site of attachment to the peptidoglycan layer.  PPEP-1 may play a role in switching from an adhesive to a motile phenotype. Also included in this subfamily is Paenibacillus alvei PPEP-2, a secreted Pro-Pro endopeptidase. The cleavage motif of PPEP-2, PLP PVP, is distinct from that of PPEP-1 (VNP PVP). PPEP-2 cleavage sites in a cell-surface protein, with putative extracellular matrix-binding domains, and encoded by the adjacent gene, suggests a similar role of PPEP-2 in controlling bacterial adhesion.	184
380334	cd20184	M34_peptidase_like	uncharacterized subfamily of peptidase family M34. Peptidase family M34 (also known as the anthrax lethal factor family) includes the C-terminal catalytic domain of anthrax lethal factor (ATLF, EC 3.4.24.83), and the N-terminal protective antigen-binding domains (PABDs) of ATLF and edema factor (EF). ATLF and EF are enzyme components of anthrax toxin and are carried into the cell by a third component, the protective antigen (PA). ATLF is a highly selective protease whose major substrates are mitogen-activated protein kinase kinases (MKKs). At its N-terminus, ATLF has a PABD domain which lacks the hallmark metalloprotease motif HEXXH, and, at its C-terminus, the related catalytic domain has the HEXXH motif where the two His residues bind a single zinc atom, and the Glu has a catalytic role. EF acts as a Ca2+- and calmodulin-dependent adenylyl cyclase that can cause edema when associated with PA; it is comprised of the PABD and an adenylyl cyclase domain. Pro-Pro endopeptidase (PPEP-1; EC 3.4.24.89, also known as Zmp1) is an extracellular metalloprotease that shows a unique specificity for hydrolyzing a Pro-Pro bond and is involved in bacterial adhesion. This uncharacterized subfamily includes proteins which have an N-terminal SLH domain, and proteins which may have an N-terminal IG-like domain; these proteins have the hallmark metalloprotease motif HEXXH.	131
380335	cd20185	M34_PABD	N-terminal protective antigen-binding domain (PABD) of Anthrax Toxin Lethal Factor (ATLF) and Edema Factor (EF), and similar domains; belongs to peptidase family M34. This subfamily includes the functional N-terminal protective antigen-binding domain (PABD) of the anthrax edema factor (EF), as well as the likely inactive N-terminal PABD of anthrax toxin lethal factor (ATLF), both secreted by Bacillus anthracis. ATLF and EF are enzyme components of anthrax toxin and are carried into the cell by a third component, the protective antigen (PA). ATLF-PABD resembles the C-terminal catalytic domain of ATLF (EC 3.4.24.83) but lacks the hallmark metalloprotease motif HEXXH. This subfamily belongs to the peptidase family M34 (also known as the anthrax lethal factor family), which also includes the C-terminal catalytic domain of ATLF, and Pro-Pro endopeptidase (PPEP-1; EC 3.4.24.89, also known as Zmp1), which is an extracellular metalloprotease that shows a unique specificity for hydrolyzing a Pro-Pro bond and is involved in bacterial adhesion.	219
410313	cd20187	T-box_TBX1_10-like	DNA-binding domain of T-box transcription factor 1 and 10, and related T-box proteins. This subfamily includes TBX1 and TBX10. TBX1 is a T-box transcription factor which plays an important role in heart development and has been implicated in DiGeorge or 22q11.2 deletion syndrome. This syndrome is associated with various types of cardiac outflow tract (OFT) and vascular defects. Wnt5a is regulated by TBX1 in the second heart field (SHF). TBX1 is required to maintain the integrity of extracellular matrix-cell interactions in the SHF and this interaction is critical for cardiac (OFT) development. TBX10 is a putative T-box transcription factor. Diseases associated with TBX10 include Isolated Cleft Lip and Cleft Lip/cleft lip with or without cleft palate. This subfamily belongs to the T-box family of transcription factors which play a multitude of diverse functions throughout development. The founding member of the T-box family is Brachyury (also known as TBXT, or T). T-box family members share a conserved DNA-binding domain (T-box) which binds DNA in a sequence-specific manner. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development, and conserved expression patterns.	189
410314	cd20188	T-box_TBX2_3-like	DNA-binding domain of T-box transcription factor 2 and 3, and related T-box proteins. This subfamily includes the T-box transcription factors TBX2 and TBX3 and similar proteins. TBX2 is an oncogenic transcription factor implicated in developmental processes, including coordinating cell fate, patterning and morphogenesis of a wide range of tissues and organs. It is overexpressed in several cancers, including melanoma and breast, and plays a key role during cardiac development. TBX2 is a negative regulator of promyelocytic leukemia protein (PML) function in cellular senescence, and it interacts with HP1 to recruit a repression complex to EGR1-responsive promoters to drive the proliferation of breast cancer cells. TBX3 has also been implicated in oncogenesis in breast cancer and melanoma. The tbx3 gene is downregulated by PML. TBX3 directly represses TBX2 under the control of the PRC2 complex in skeletal muscle and rhabdomyosarcoma. Also included in this family is the Drosophila melanogaster optomotor-blind protein (Omb, also known as lethal(1)optomotor-blind, or L(1)omb, or protein bifid) which controls many developmental processes such as wing, eye, and abdominal tergites and optic lobes, and induces epithelial cell migration and extrusion in vivo. This subfamily belongs to the T-box family of transcription factors which play a multitude of diverse functions throughout development. The founding member of the T-box family is brachyury (also known as TBXT, or T). T-box family members share a conserved DNA-binding domain (T-box) which binds DNA in a sequence-specific manner. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development, and conserved expression patterns.	185
410315	cd20189	T-box_TBX4_5-like	DNA-binding domain of T-box transcription factor 4 and 5, and related T-box proteins. This subfamily includes the T-box transcription factors TBX4 and TBX5 which play important roles in vertebrate limb and heart development, and in lung and trachea development. TBX4 is needed for normal skeletal and muscular hindlimb development and is involved in super-enhancer-driven transcriptional programs underlying features specific to lung fibroblasts. TBX5 plays a role in regulating cardiac conduction system function, and in coordinating forelimb muscle pattern.  Mutations in human TBX5 and TBX4 are associated with Holt-Oram syndrome and Small Patella syndrome, respectively. Both syndromes are characterized by limb defects in addition to other abnormalities. This subfamily belongs to the T-box family of transcription factors which play a multitude of diverse functions throughout development. The founding member of the T-box family is Brachyury (also known as TBXT, or T). T-box family members share a conserved DNA-binding domain (T-box) which binds DNA in a sequence-specific manner. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development, and conserved expression patterns.	185
410316	cd20190	T-box_TBX6_VegT-like	DNA-binding domain of T-box transcription factor 6, VegT and related T-box proteins. This subfamily includes the transcriptional regulators TBX6 and VegT. TBX6 plays an essential role in the fate determination of axial stem to become either neural or mesodermal. It also plays an essential role in the regulation of left/right patterning in mouse embryos through effects on nodal cilia and perinodal signaling. VegT (also known as Antipodean, Brat and Xombi) is required in early Xenopus embryos for the formation of both the mesoderm and endoderm germ layers. This subfamily belongs to the T-box family of transcription factors which play a multitude of diverse functions throughout development. The founding member of the T-box family is Brachyury (also known as TBXT, or T). T-box family members share a conserved DNA-binding domain (T-box) which binds DNA in a sequence-specific manner. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development, and conserved expression patterns.	183
410317	cd20191	T-box_TBX15_18_22-like	DNA-binding domain of T-box transcription factor 15, 18 and 22, and related T-box proteins. This subfamily includes the transcriptional regulators TBX15, TBX18 and TBX22 which are involved in various developmental processes. TBX15 (also known as TBX14) plays an important role in the development of the skeleton of the limb, vertebral column and head, possibly through its control of the number of mesenchymal precursor cells and chondrocytes; it also plays a role in the differentiation of brown and brite adipocytes. TBX18 is involved in the developmental processes of a variety of tissues and organs, including the ureter, vertebral column, epicardium and coronary vessels; it is important for the development of the head portion of the sino atrial node (SAN). Mutations in the T-box transcription factor gene TBX22 are found in X-linked Cleft Palate with or without Ankyloglossia syndrome (CPX syndrome), and associated with cleft lip and palate, and tooth agenesis. This subfamily belongs to the T-box family of transcription factors which play a multitude of diverse functions throughout development. The founding member of the T-box family is Brachyury (also known as TBXT, or T). T-box family members share a conserved DNA-binding domain (T-box) which binds DNA in a sequence-specific manner. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development, and conserved expression patterns.	194
410318	cd20192	T-box_TBXT_TBX19-like	DNA-binding domain of T-box transcription factor T, T-box transcription factor 19 and related T-box proteins. Tbx19 (also known as Tpit) is a T-box factor restricted to two pituitary (pro-opiomelanocortin) POMC-expressing lineages, the corticotrophs and melanotrophs; it controls terminal differentiation of these lineages. TBX19 activates POMC gene transcription with the cooperation of another transcription factor Pitx1. TBXT, also known as Brachyury protein, or protein T, is a transcription factor needed for posterior mesoderm formation and differentiation as well as for the notochord development during embryogenesis. It binds to a 24 base-pair (bp) palindromic site (called the T site) and activates gene transcription when bound to such a site. This subfamily belongs to the T-box family of transcription factors which play a multitude of diverse functions throughout development. TBXT is the founding member of the T-box family, members of which share a conserved DNA-binding domain (T-box) which binds DNA in a sequence-specific manner. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development, and conserved expression patterns.	180
410319	cd20193	T-box_TBX20-like	DNA-binding domain of T-box transcription factor 20 and related T-box proteins. TBX20 is a T-box transcriptional factor which functions in embryonic development and its deficiency is associated with congenital heart disease. It acts both as a transcriptional activator and a repressor required for cardiac development, and has key roles in maintaining the functional and structural phenotypes in the adult heart. The TBX20-cardiac transcription factor CASZ1 protein complex is protective against dilated cardiomyopathy and is essential for maintaining cardiac homeostasis. TBX20 has also been shown to regulate angiogenesis through the PROK2-PROKR1 (prokineticin receptor 1) pathway and is involved in both, pathological and developmental, angiogenesis. This subfamily belongs to the T-box family of transcription factors which play a multitude of diverse functions throughout development. The founding member of the T-box family is Brachyury (also known as TBXT, or T). T-box family members share a conserved DNA-binding domain (T-box) which binds DNA in a sequence-specific manner. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development, and conserved expression patterns.	190
410320	cd20194	T-box_TBR1_2_21-like	DNA-binding domain of T-box brain protein 1 and 2, T-box transcription factor 21 and related T-box proteins. TBX21 (also known as T-cell-specific T-box transcription factor T-bet or transcription factor TBLYM) is a lineage-defining transcription factor which directs T helper type 1 (Th1) cell differentiation. This subfamily includes TBR1 (also known as T-brain-1, or TES-56), which is a neuron-specific transcription factor involved in forebrain development, and TBR2 (also known as Eomesodermin, Eomes, or T-brain-2), which is associated with neurogenesis, cardiogenesis and tumor immune response. This subfamily belongs to the T-box family of transcription factors which play a multitude of diverse functions throughout development. The founding member of the T-box family is Brachyury (also known as TBXT, or T). T-box family members share a conserved DNA-binding domain (T-box) which binds DNA in a sequence-specific manner. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development, and conserved expression patterns.	185
410321	cd20195	T-box_MGA-like	DNA-binding domain of MAX gene-associated protein and related T-box proteins. MGA (also known as MGAP, MAX dimerization protein, MAD5, MXD5) is a dual-specificity transcription factor that regulates the expression of both, MAX-network and T-box family target genes. MGA functions as a repressor or an activator; it binds to 5'-AATTTCACACCTAGGTGTGAAATT-3' core sequence. Its function is activated by heterodimerization with MAX. This subfamily belongs to the T-box family of transcription factors which play a multitude of diverse functions throughout development. The founding member of the T-box family is Brachyury (also known as TBXT, or T). T-box family members share a conserved  DNA-binding domain (T-box) which binds DNA in a sequence-specific manner. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development, and conserved expression patterns.	186
410322	cd20196	T-box_TBX6	DNA-binding domain of T-box transcription factor 6, and related T-box proteins. TBX6 is a T-box transcription factor which plays an essential role in the fate determination of axial stem to become either neural or mesodermal. It also plays an essential role in the regulation of left/right patterning in mouse embryos, through effects on nodal cilia and perinodal signaling. This subfamily belongs to the T-box family of transcription factors which play a multitude of diverse functions throughout development. The founding member of the T-box family is Brachyury (also known as TBXT, or T). T-box family members share a conserved DNA-binding domain (T-box) which binds DNA in a sequence-specific manner. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development, and conserved expression patterns.	182
410323	cd20197	T-box_VegT-like	DNA-binding domain of Xenopus VegT and related T-box proteins. VegT (also known as Antipodean, Brat and Xombi) is a T-box transcription factor required in early Xenopus embryos for the formation of both the mesoderm and endoderm germ layers. This subfamily belongs to the T-box family of transcription factors which play a multitude of diverse functions throughout development. The founding member of the T-box family is Brachyury (also known as TBXT, or T). T-box family members share a conserved DNA-binding domain (T-box) which binds DNA in a sequence-specific manner. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development, and conserved expression patterns.	183
410324	cd20198	T-box_TBX15-like	DNA-binding domain of T-box transcription factor 15 and related T-box proteins. TBX15 (also known as TBX14) plays an important role in the development of the skeleton of the limb, vertebral column and head, possibly through its control of the number of mesenchymal precursor cells and chondrocytes. TBX15 also plays a role in the differentiation of brown and brite adipocytes. This subgroup belongs to the T-box family of transcription factors which play a multitude of diverse functions throughout development. The founding member of the T-box family is Brachyury (also known as TBXT, or T). T-box family members share a conserved DNA-binding domain (T-box) which binds DNA in a sequence-specific manner. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development, and conserved expression patterns.	198
410325	cd20199	T-box_TBX18_like	DNA-binding domain of T-box transcription factor 18 and related T-box proteins. TBX18 acts as a transcription repressor involved in the developmental processes of a variety of tissues and organs, including the ureter, vertebral column, epicardium and coronary vessels. TBX18 is important for the development of the head portion of the sino atrial node (SAN); SAN is the pacemaker region of the heart that initiates each heartbeat. This subgroup belongs to the T-box family of transcription factors which play a multitude of diverse functions throughout development. The founding member of the T-box family is Brachyury (also known as TBXT, or T). T-box family members share a conserved DNA-binding domain (T-box) which binds DNA in a sequence-specific manner. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development, and conserved expression patterns.	195
410326	cd20200	T-box_TBX22-like	DNA-binding domain of T-box transcription factor 22 and related T-box proteins. TBX22 is a transcriptional regulator involved in developmental processes. Mutations in the T-Box transcription factor gene TBX22 are found in X-linked Cleft Palate with or without Ankyloglossia syndrome (CPX syndrome). TBX22 mutation is also associated with cleft lip and palate, and tooth agenesis. This subgroup belongs to the T-box family of transcription factors which play a multitude of diverse functions throughout development. The founding member of the T-box family is Brachyury (also known as TBXT, or T). T-box family members share a conserved DNA-binding domain (T-box) which binds DNA in a sequence-specific manner. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development, and conserved expression patterns.	194
410327	cd20201	T-box_TBX19-like	DNA-binding domain of T-box transcription factor 19 and related T-box proteins. TBX19 (also known as Tpit) is a T-box factor restricted to two pituitary pro-opiomelanocortin (POMC)-expressing lineages, the corticotrophs and melanotrophs; it controls terminal differentiation of these lineages. TBX19 activates POMC gene transcription with the cooperation of another transcription factor, Pitx1. Mutations of the human TPIT gene cause early onset pituitary adrenocorticotrophic hormone (ACTH) deficiency. This subfamily belongs to the T-box family of transcription factors which play a multitude of diverse functions throughout development. The founding member of the T-box family is Brachyury (also known as TBXT, or T). T-box family members share a conserved DNA-binding domain (T-box) which binds DNA in a sequence-specific manner. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development, and conserved expression patterns.	183
410328	cd20202	T-box_TBXT	DNA-binding domain of T-box transcription factor T and related T-box proteins. TBXT, also known as Brachyury protein, or protein T, is a transcription factor needed for posterior mesoderm formation and differentiation as well as for the notochord development during embryogenesis. It binds to a 24 base-pair (bp) palindromic site (called the T site) and activates gene transcription when bound to such a site. This subfamily belongs to the T-box family of transcription factors which play a multitude of diverse functions throughout development. TBXT is the founding member of the T-box family, members of which share a conserved DNA-binding domain (T-box) which binds DNA in a sequence-specific manner. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development, and conserved expression patterns.	179
410329	cd20203	T-box_TBX21	DNA-binding domain of T-box transcription factor 21 and related T-box proteins. TBX21 (also known as T-cell-specific T-box transcription factor T-bet or transcription factor TBLYM) is a lineage-defining transcription factor which directs T helper type 1 (Th1) cell differentiation. It initiates Th1 lineage development from naive T helper precursor cells both by initiating the Th1 genetic programs and by inhibiting the opposing Th2 and Th17 lineage-commitment programs. This subfamily belongs to the T-box family of transcription factors which play a multitude of diverse functions throughout development. The founding member of the T-box family is Brachyury (also known as TBXT, or T). T-box family members share a conserved DNA-binding domain (T-box) which binds DNA in a sequence-specific manner. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development, and conserved expression patterns.	191
410330	cd20204	T-box_TBR1	DNA-binding domain of T-box brain protein 1 and related T-box proteins. TBR1 (also known as T-brain-1 or TES-56) is a neuron-specific transcription factor of the T-box family and involved in forebrain development. It has been recognized as a high-confidence risk gene for autism spectrum disorders (ASD); it regulates the expression of ASD-related genes that are critical for cortical development. This subfamily belongs to the T-box family of transcription factors which play a multitude of diverse functions throughout development. The founding member of the T-box family is Brachyury (also known as TBXT, or T). T-box family members share a conserved DNA-binding domain (T-box) which binds DNA in a sequence-specific manner. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development, and conserved expression patterns.	191
410331	cd20205	T-box_TBR2	DNA-binding domain of T-box brain protein 2 and related T-box proteins. TBR2 (also known as Eomesodermin, Eomes, or T-brain-2) is a member of the T-box family of transcription factors and is associated with neurogenesis, cardiogenesis and tumor immune response. This subfamily belongs to the T-box family of transcription factors which play a multitude of diverse functions throughout development. The founding member of the T-box family is Brachyury (also known as TBXT, or T). T-box family members share a conserved DNA-binding domain (T-box) which binds DNA in a sequence-specific manner. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development, and conserved expression patterns.	191
411001	cd20206	YbbR	YbbR domain. YbbR domains occur as tandem repeats or in architectures together with other domains. Putative roles in cell growth, cell division, and/or virulence have been suggested for this domain.	79
380908	cd20207	Bbox2_GefO-like	B-box-type 2 zinc finger found in Ras guanine nucleotide exchange factor O (GefO) and similar proteins. Ras guanine-nucleotide exchange factors (RasGEFs) activate Ras by catalyzing the replacement of GDP with GTP, and thus lie near the top of many signaling pathways. They are important for signaling in development and chemotaxis in many organisms. Ras guanine nucleotide exchange factor O (GefO), also known as RasGEF domain-containing protein O, is faintly expressed during development of Dictyostelium discoideum. It contains a C3HC4-type RING finger, a B-box motif that shows high sequence similarity with B-Box-type zinc finger 2 found in tripartite motif-containing proteins (TRIMs), a REM (Ras exchanger motif) domain, and a RasGEF domain. The type 2 B-box (Bbox2) zinc finger is characterized by a CHC3H2 zinc-binding consensus motif.	40
380909	cd20208	Bbox1_DUF2009	B-box-type 1 zinc finger found in DUF2009 domain-containing proteins and similar proteins. This group is composed of uncharacterized proteins containing a zinc finger B-box domain and a DUF2009 domain, and similar zinc finger B-box domain-containing proteins. The B-box motif shows high sequence similarity with B-Box-type 1 zinc finger found in tripartite motif-containing proteins (TRIMs). The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif.	43
380784	cd20214	PFM_CEL-III-like	pore-forming module of hemolytic lectin Cucumaria echinata CEL-III and similar aerolysin-type beta-barrel pore-forming proteins. Cucumaria echinata CEL-III is a Ca(2+)-dependent and galactose-specific lectin, which is cytotoxic to some cultured cell lines, has strong hemolytic activity toward human and rabbit erythrocytes, and anti-malarial activity. Hemolysis results from ion-permeable pores formed from CEL-III oligomers in the target cell membrane. Members of this group include CEL-III isoforms: CEL-III-L1, CEL-III-L2, CEL-III-S1, CEL-III-S2, and CEL-III-LS1. Many proteins belonging to this group have two N-terminal ricin-type carbohydrate-binding domains which adopt beta-trefoil folds. Members of this group belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin). The CEL-III oligomer in the membrane may be composed of six monomers.	124
380785	cd20215	PFM_LSL-like	pore-forming module of Laetiporus sulphureus LSL lectin and similar aerolysin-type beta-barrel pore-forming proteins. LSL is a lectin, produced by the parasitic mushroom Laetiporus sulphureus, which exhibits hemolytic and hemagglutinating activities. Members of this family belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin).	164
380786	cd20216	PFM_HFR-2-like	pore-forming module of wheat HFR-2 toxin, FEM32, and similar aerolysin-type beta-barrel pore-forming proteins. HFR-2 is a wheat cytolytic toxin which may normally function in defense against certain insects or pathogens. The Hfr-2 gene is upregulated in virulent Hessian fly larval feeding. The HFR-2 protein may insert in plant cell membranes at the feeding sites and by forming pores provide water, ions and other small nutritive molecules to the developing larvae. This group also contains FEM32, a flower-specific lectin-like protein from the dioecious plant Rumex acetosa, which alters flower development and induces male sterility in transgenic tobacco. It has been suggested that the FEM32 gene activates some form of programmed cell death (PCD), a process that could be mediated by the action of its lectin domains for binding to specific glycoproteins on the cell membrane and facilitated by the formation of pore structures in the membranes and the subsequent leakage of the cytosolic content through its pore-forming aerolysin domain. Most proteins belonging to this group have N-terminal agglutinin (also known as amaranthin) lectin domains; most have two agglutinin domains, in combination with one aerolysin domain. Members of this group belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin).	152
380787	cd20217	PFM_agglutinin-like	pore-forming module (PFM) of uncharacterized proteins which have agglutinin domain(s), and similar aerolysin-type beta-barrel pore-forming proteins. Most proteins belonging to this group have an N-terminal agglutinin (also known as amaranthin) lectin domain; some have fascin-like domains which adopt a beta-trefoil topology. Members of this group belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin).	150
380788	cd20218	PFM_aerolysin	pore-forming module of aerolysin and similar aerolysin-type beta-barrel pore-forming proteins. Aerolysin is a cytolytic bacterial toxin that forms pores in the host membrane, leading to destruction of the membrane permeability barrier and host cell death. Another member of this family is alpha-toxin from Clostridium septicum, the main virulence factor of this bacterium, known for causing non-traumatic gas gangrene. Many proteins belonging to this group have an N-terminal APT domain; an APT domain is the N-terminal domain of aerolysin and pertussis toxin and has a type-C lectin-like fold. Members of this group belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin).	144
380789	cd20219	PFM_physalysin-1-like	pore-forming module of Physella acuta physalysin1 and similar aerolysin-type beta-barrel pore-forming proteins. From a comparative immunological study of the snail Physella acuta, physalysin1 was identified as one of three physalysins in the snail. Members of this family belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin).	125
380790	cd20220	PFM_natterin-3-like	pore-forming module of Thalassophryne nattereri fish venom natterins 1-4, and similar aerolysin-type beta-barrel pore-forming proteins. This group includes 4 of the 5 Thalassophryne nattereri fish venom natterins: natterin-1, -2, -3, and -4. Natterins have kininogenase and kallikrein activity, and are allodynic and edema-inducing. They also cleave type I and type IV collagen, resulting in necrosis of the affected cells. In contrast to their edema-inducing activity, natterins also have anti-inflammatory effects through inhibition of interactions between leukocytes and the endothelium, and reduction in neutrophil accumulation. Many proteins belonging to this group have an N-terminal DUF3421 domain. They belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin).	152
380791	cd20221	PFM_Dln1-like	pore-forming module of Danio rerio Dln1, and similar aerolysin-type beta-barrel pore-forming proteins. Since Danio rerio Dln1 has a specific affinity towards high-mannose glycans, which are common on the surface of viruses and fungi, it has been suggested that it may play a defense role. Members of this group also include lamprey immune protein (LIP), a defense molecule derived from lamprey supraneural body tissue which has efficient cytocidal actions against tumor cells. Many proteins belonging to this group have an N-terminal Jacalin-like lectin domain. They belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin).	168
380792	cd20222	PFM_parasporin-2-like	pore-forming module of parasporin-2, hydralysin and similar aerolysin-type beta-barrel pore-forming proteins. Bacillus thuringiensis strain A1547 parasporin-2 (PS2, also named Cry46Aa1) is an anti-cancer protein which causes specific cell damage via PS2 oligomerization in the cell membrane. Glycosylphosphatidylinositol (GPI)-anchored proteins may be involved in the cytocidal action of PS2 as co-receptors for PS2's cytocidal action. This family also includes hydralysin (Hln-1) and Hln-2 produced by the green hydra Chlorohydra viridissima. Hydralysin is a paralysis-inducing protein not found in the stinging cells (nematocytes), with a cell type-selective cytolytic activity; it binds erythrocyte membranes and forms discrete pores. They belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin).	147
380793	cd20223	PFM_epsilon-toxin-like	pore-forming module of Clostridium perfringens epsilon-toxin and similar aerolysin-type beta-barrel pore-forming proteins. Clostridium perfringens epsilon-toxin is responsible for fatal enterotoxemia in ungulates. It forms a heptamer in the lipid rafts of Madin-Darby Canine Kidney (MDCK) cells, leading to cell death; its oligomer formation is induced by activation of neutral sphingomyelinase. This group also includes an insecticidal crystal protein Cry14-4 (encoded on plasmid pBMBt1 of Bacillus thuringiensis serovar darmstadiensis). Also included is pXO2-60 (a protein from the pathogenic pXO2 plasmid of Bacillus anthracis) which harbors a unique ubiquitin-like fold domain at the C-terminus of the aerolysin-like domain, and is involved in virulence. They belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin).	144
380794	cd20224	PFM_alpha-toxin-like	pore-forming module of Clostridium septicum alpha-toxin and similar aerolysin-type beta-barrel pore-forming proteins. Clostridium septicum alpha-toxin is the main virulence factor of this bacterium, known for causing non-traumatic gas gangrene. Members of this family belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin).	121
380795	cd20225	PFM_lysenin-like	pore-forming module of lysenin and similar aerolysin-type beta-barrel pore-forming proteins. Lysenin (also known as Efl1) is a sphingomyelin-binding defense protein found in the coelomic fluid of the annelid earthworm Eisenia fetida. This group also contains lysenin-related proteins LRP-1, LRP-2, and LRP-3 from Eisenia sp. They belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin).	150
380796	cd20226	PFM_Cry51Aa-like	pore-forming module of Bacillus thuringiensis insecticidal Cry51A toxin, Bacillus thuringiensis cytotoxic parasporin-5 and similar aerolysin-type beta-barrel pore-forming proteins. Bacillus thuringiensis parasporin-5 has strong cytocidal activity against several types of cancer cells and may or may not have insecticidal activity. Cry51A toxin is toxic to coleopteran (beetle) larvae. Other members of this family include Bacillus thuringiensis Cry15Aa, which is toxic to lepidopteran (butterfly and moth) larvae. They belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin).	172
380797	cd20227	PFM_CPE-like	CPE (Clostridium perfringens enterotoxin), HA-70 type C, and similar aerolysin-type beta-barrel pore-forming proteins. This domain is also known as the clenterotox domain (Clostridium enterotoxin). Clostridium perfringens enterotoxin is the major virulence determinant for C. perfringens type-A food poisoning. After binding to its receptors, which include particular human claudins, the toxin forms pores in the cell membrane. This family also includes HA-70 type C, a component of the haemagglutinin complex of Clostridium botulinum type C toxin. They belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin).	162
380798	cd20228	PFM_TDP-like	pore-forming module of Flammulina velutipes transepithelial electrical resistance (TEER)-decreasing protein, and similar aerolysin-type beta-barrel pore-forming proteins. Flammulina velutipes TEER-decreasing protein (also known as flammutoxin, FTX), is a pore-forming hemolytic protein known to cause a rapid decrease in TEER and a parallel increase in paracellular permeability in the human intestinal epithelial Caco-2 cell monolayer. Members of this group belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin).	118
380799	cd20229	PFM_tachylectin-like	pore-forming module (PFM) of uncharacterized proteins having tachylectin domain(s), and similar aerolysin-type beta-barrel pore-forming proteins. Many proteins belonging to this group have tachylectin domain(s), N-terminal to this PFM; some also have an immunoglobulin (Ig) domain. Tachylectins are lectins which bind N-acetylglucosamine and N-acetylgalactosamine. Members of this group belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin).	148
380800	cd20230	PFM_EP37-like	pore-forming module of Cynops pyrrhogaster EP37, and similar aerolysin-type beta-barrel pore-forming proteins. Cynops pyrrhogaster (Japanese newt) EP37 is an epidermis-specific protein which has a non-lens beta/gamma-crystallin domain in tandem and N-terminal to this pore-forming module. C. pyrrhogaster has several EP37-like proteins present in skin, gastric epithelium and fundic glands of an adult newt and in the swimming larva. This group also includes the alpha subunit of Bombina maxima betagamma-CAT (a non-lens betagamma-crystallin (alpha-subunit) and trefoil factor (beta subunit) complex) identified from skin secretions. Betagamma-CAT shows potent hemolytic activity on mammalian erythrocytes. Many proteins belonging to this group have N-terminal crystallin (beta/gamma crystallin) domain(s). Members of this group belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin).	146
380801	cd20231	PFM_jacalin-like	pore-forming module of uncharacterized proteins which have an N-terminal jacalin-like lectin domain, and similar aerolysin-type beta-barrel pore-forming proteins. Jacalin-like lectins are sugar-binding protein domains. Proteins having these lectin domains may bind mono- or oligosaccharides with high specificity. Generally, pore-forming proteins (PFPs) are secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores detrimental to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel. Many of this family are bacterial toxins. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin).	150
380802	cd20232	PFM_crystallin-like	pore-forming module (PFM) of uncharacterized proteins which have N-terminal crystallin domain(s), and similar aerolysin-type beta-barrel pore-forming proteins. Many proteins belonging to this group have N-terminal crystallin (beta/gamma crystallins) domain(s). Members of this group belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin).	151
380803	cd20233	PFM_unzipped-like	pore-forming module (PFM) of proteins having a DUF3421 domain including Drosophila unzipped, honey bee anarchy 1, and similar aerolysin-type beta-barrel pore-forming proteins. Many proteins belonging to this group have N-terminal DUF3421. Drosophila melanogaster unzipped protein is required for normal axon patterning during neurogenesis, and honey bee anarchy 1 may play a role in worker sterility in a social insect. Members of this group belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin).	134
380804	cd20234	PFM_fascin-like	pore-forming module (PFM) of uncharacterized proteins which have an N-terminal fascin-like domain, and similar aerolysin-type beta-barrel pore-forming proteins. Most proteins belonging to this group have an N-terminal fascin-like domain, which adopts a beta-trefoil topology. Members of this group belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin).	139
380805	cd20235	PFM_spherulin-2a-like	pore-forming module of Physarum polycephalum spherulin-2a, Plodia interpunctella follicular epithelium yolk protein subunit YP4, and similar aerolysin-type beta-barrel pore-forming proteins. Spherulin 2a is a coat glycoprotein produced during encystment from the slime mold, Physarum polycephalum. YP4 is one of two subunits of the follicular epithelium yolk protein from Plodia interpunctella and other pyralid moths; it is produced in the follicle cells during vitellogenesis, and after secretion it is taken up into the oocyte and stored in the yolk spheres for utilization during embryogenesis. Members of this group belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin).	150
380806	cd20236	PFM_SP17-like	pore-forming module-like domain of Phlebotomus argentipes 29 kDa salivary protein SP17, and similar aerolysin-type beta-barrel pore-forming proteins. Members include two putative secreted proteins from the salivary glands of Phlebotomus argentipes: 29 kDa salivary protein SP17 and 30 kDa salivary protein SP15. They belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin).	117
380807	cd20237	PFM_LIN24-like	pore-forming module of Caenorhabditis elegans LIN-24 and similar aerolysin-type beta-barrel pore-forming proteins. The process of cytotoxic cell death occurs in Caenorhabditis elegans containing mutations in either of lin-24 and lin-33. The cytotoxicity caused by mutation of either gene requires the function of the other. Genes required for the engulfment of apoptotic corpses function in the cytotoxic cell deaths induced by mutations in lin-24 and lin-33. It has been proposed that Caenorhabditis elegans LIN-24 may function to interact with bacterial toxins having similarity with it, and inactivate these, thereby allowing C. elegans to consume or survive exposure to bacteria that produce such toxins. Members of this group belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin).	120
380808	cd20238	PFM_ABFB-like	pore-forming module (PFM) of uncharacterized proteins which have an N-terminal ABFB (alpha-L-arabinofuranosidase B) domain, and similar aerolysin-type beta-barrel pore-forming proteins. Most proteins belonging to this group have a PFM C-terminal to an ABFB domain. Alpha-L-arabinofuranosidase (Araf-ase, EC 3.2.1.55) belongs to the glycosyl hydrolase family GH54, and in Aspergillus niger exhibits both Araf-ase (EC 3.2.1.55) and alpha-D-galactofuranosidase (Galf-ase) activities, with Galf-ase activity being lower than Araf-ase activity. Some members have a Ricin-type carbohydrate-binding domain which adopts a beta-trefoil fold. Members of this group belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin).	146
380809	cd20239	PFM_aerolysin-like	pore-forming module of aerolysin-type beta-barrel pore-forming proteins; uncharacterized subgroup. Generally, pore-forming proteins (PFPs) are secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores detrimental to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel. Many of this family are bacterial toxins. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin).	145
380810	cd20240	PFM_aerolysin-like	pore-forming module of aerolysin-type beta-barrel pore-forming proteins; uncharacterized subgroup. Generally, pore-forming proteins (PFPs) are secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores detrimental to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel. Many of this family are bacterial toxins. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin).	145
380811	cd20241	PFM_aerolysin-like	pore-forming module of aerolysin-type beta-barrel pore-forming proteins; uncharacterized subgroup. Members of this group belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin).	139
380812	cd20242	PFM_aerolysin-like	pore-forming module of aerolysin-type beta-barrel pore-forming proteins; uncharacterized subgroup. Members of this group belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin).	144
380781	cd20243	NoBody	Negative regulator of P-body association. Negative regulator of P-body association, also called P-body dissociating protein or protein NoBody (non-annotated P-body dissociating polypeptide), is a microprotein that interacts with mRNA decapping proteins, which remove the 5' cap from mRNAs to promote 5'-to-3' decay. It localizes to P-bodies, which are mRNA-decay-associated RNA-protein granules. NoBody promotes dispersal of P-body components and is likely to play a role in the mRNA decapping process. Decapping proteins participate in mRNA turnover and nonsense-mediated decay (NMD).	68
380780	cd20244	Toddler	Apelin receptor early endogenous ligand, also called Toddler. Apelin receptor early endogenous ligand, also called protein Elabela or protein Toddler, is an endogenous peptidic ligand of the G protein-coupled receptor APJ, also called apelin receptor (APLNR). The apelin/APJ axis contributes to maintaining homeostasis in normal and pathological hearts, and is also involved in heart development, including endoderm differentiation, heart morphogenesis and coronary vascular formation. Elabela/Toddler plays crucial roles in heart development and disease conditions, presumably at time points or at areas of the heart different from apelin. It is a potential therapeutic peptide, showing beneficial effects on body fluid homeostasis, cardiovascular health, and renal insufficiency, as well as potential benefits for metabolism and diabetes.	53
380778	cd20245	humanin	humanin and similar peptides. Humanin (HN) is a peptide encoded in the mitochondrial genome by the 16S ribosomal RNA gene MT-RNR2. HN has neuroprotective, anti-inflammatory, anti-apoptotic, anti-aging, and anti-fibrilogenic properties through interaction with a series of targets including BAX, IGFBP-3 (insulin-like growth factor binding protein 3), and a trimeric CNTFR/WSX-1/gp130 complex. Endogenous HN is both an intracellular and a secreted protein, and has been detected in brain, retinal pigment epithelium, blood vessels, pancreatic beta cells, tumors, testes and other tissues, and is also present in serum and in human seminal plasma. Single amino acid substitutions of HN can lead to significant changes in its potency and biological functions. There are multiple nuclearly-encoded HN isoforms which may be functional genes regulated in a tissue- and factor-specific manner. HN has potential as a therapeutic target for neurodegenerative and cardiovascular diseases, diabetes, male infertility, and cancer; it may have value as a biomarker for these diseases. Humanin analogs and peptide mimetics have been developed which show promising results in preclinical models of degenerative diseases.	24
380775	cd20246	CASIMO1	Cancer Associated Small Integral Membrane Open reading frame 1 (CASIMO1). The Cancer-Associated Small Integral Membrane Open reading frame 1 (CASIMO1), a small open reading frame (sORF)-encoded protein (also known as a microprotein), controls cell proliferation and interacts with squalene epoxidase (SQLE) modulating lipid droplet formation. CASIMO1 RNA is overexpressed predominantly in hormone receptor-positive breast tumors, and its knockdown has been shown to decrease proliferation in multiple breast cancer cell lines. Loss of CASIMO1 disturbs the organization of the actin cytoskeleton, leads to inhibition of cell motility, and stalls the cell cycle in the G0/G1 phase. CASIMO1 interacts with SQLE, a key enzyme in cholesterol synthesis and a known oncogene in breast cancer. This family contains two variants expressed on different chromosomes in humans, small integral membrane protein 5 (SMIM5) and small integral membrane protein 22 (SMIM22).	68
380779	cd20247	DWORF	DWarf Open Reading Frame (DWORF). DWarf Open Reading Frame (DWORF) is a small protein that plays a key role in heart muscle contraction. DWORF, Sarcolipin (SLN), and myoregulin (MRLN) are transmembrane regulators of the sarcoplasmic reticulum calcium transporting ATPase (SERCA). DWORF enhances SERCA activity by displacing phospholamban (PLN), a potent SERCA inhibitor. This makes DWORF an attractive candidate for a heart failure therapeutic. DWORF is also present in slow-twitch skeletal muscle fibers.	35
380749	cd20248	phospholamban_like	phospholamban, sarcolipin, and sarcolamban family of bioactive peptides. Vertebrate phospholamban (PLN), sarcolipin (SLN), and invertebrate sarcolamban (SCL) constitute a family of bioactive peptides. They are involved in the regulation of Ca2+ traffic, and their alteration can result in irregular muscle contractions. Invertebrate SCL (SCLA and SCLB) are encoded within a single putative noncoding transcript, pncr003:2L; vertebrate PLN and SLN are each encoded within a single exon of a spliced transcript. PLN is chiefly expressed in the cardiac muscle, while SLN is expressed in the atria of the heart and embryonic slow-type skeletal muscle; SCL is found in cardiac and somatic muscle of Drosophila melanogaster. PLN and SLN are each a single-pass transmembrane alpha-helix that interacts directly with the sarcoplasmic reticulum (SR) calcium pump (SERCA), lowering its affinity for Ca2+ and thereby decreasing the rate of Ca2+ reuptake into the SR from the sarcoplasm. In the heart, PLN and SLN inhibit the activity of SERCA2a isoform and function as important regulators of cardiac contractility and disease. SCLA and SCLB are each predicted to form a single-pass transmembrane helix, localize to the SR with the SR calcium pump (Ca-P60A), and dampen its activity. SLN in skeletal muscle has also been shown to amplify calcineurin signaling, and is a potential therapeutic target for the management of muscular dystrophy, as it is upregulated in the skeletal muscle of the classical mouse model for Duchenne muscular dystrophy.	27
380772	cd20249	Hemotin	Hemotin. Hemotin is a transmembrane alpha-helical microprotein localized to early endosomes in hemocytes (Drosophila macrophages), where it regulates endosomal maturation during phagocytosis by repressing the cooperation of 14-3-3zeta with specific phosphatidylinositol enzymes. Hemocytes are professional phagocytes tasked with removing dying cells and microorganisms invading the body. Drosophila hemotin mutants accumulate undigested phagocytic material inside enlarged endolysosomes, resulting in reduced ability to fight bacteria and severely reduced life span.	86
380750	cd20250	Phospholamban	Phospholamban bioactive peptide and similar proteins. Vertebrate phospholamban (PLN) belongs to a family of bioactive peptides which includes vertebrate sarcolipin (SLN), and the invertebrate sarcolamban (SCLA and SCLB). SCLA and SCLB are encoded within a single putative noncoding transcript, pncr003:2L; PLN and SLN are each encoded within a single exon of a spliced transcript. PLN is chiefly expressed in the cardiac muscle, while SLN is expressed in the atria of the heart and embryonic slow-type skeletal muscle; SCL is found in cardiac and somatic muscle of Drosophila melanogaster. PLN and SLN are each a single-pass transmembrane alpha-helix that interacts directly with the sarcoplasmic reticulum (SR) calcium pump (SERCA), lowering its affinity for Ca2+ and thereby decreasing the rate of Ca2+ reuptake into the SR from the sarcoplasm. In the heart, PLN and SLN inhibit the activity of SERCA2a isoform and function as important regulators of cardiac contractility and disease. SCLA and SCLB are each predicted to form a single-pass transmembrane helix, localize to the SR with the SR calcium pump (Ca-P60A), and dampen its activity. PLN and SLN differ in their interaction with SERCA; PLN is an affinity modulator of SERCA. It is thought to form a pentamer in the membrane.	52
380755	cd20251	Complex1_LYR_SF	LYR (leucine-tyrosine-arginine) motif found in Complex1_LYR-like superfamily. The Complex1_LYR-like superfamily consists of proteins of diverse functions that are exclusively found in eukaryotes and contain the conserved tripeptide 'LYR' close to the N-terminus. The human genome has at least ten LYR proteins that were predominantly identified as mitochondrial proteins. Some family members were also found in the cytosol or nucleus. LYR motif-containing protein 4 (LYRM4) represents the only LYR protein that is directly involved in the first steps of Fe-S cluster generation. Other LYR proteins have been identified as accessory subunits or assembly factors of mitochondrial OXPHOS (oxidative phosphorylation) complexes I, II, III and V, and they play specific roles in acetate metabolism.	57
380751	cd20253	Sarcolipin	Sarcolipin bioactive peptide and similar proteins. Vertebrate sarcolipin (SLN) belongs to a family of bioactive peptides which includes phospholamban (PLN) and invertebrate sarcolamban (SCL). SCLA and SCLB are encoded within a single putative noncoding transcript, pncr003:2L; PLN and SLN are each encoded within a single exon of a spliced transcript. PLN is chiefly expressed in the cardiac muscle, while SLN is expressed in the atria of the heart and embryonic slow-type skeletal muscle; SCL is found in cardiac and somatic muscle of Drosophila melanogaster. PLN and SLN are each a single-pass transmembrane alpha-helix that interacts directly with the sarcoplasmic reticulum (SR) calcium pump (SERCA), lowering its affinity for Ca2+ and thereby decreasing the rate of Ca2+ reuptake into the SR from the sarcoplasm. In the heart, PLN and SLN inhibit the activity of SERCA2a isoform and function as important regulators of cardiac contractility and disease. SCLA and SCLB are each predicted to form a single-pass transmembrane helix, localize to the SR with the SR calcium pump (Ca-P60A), and dampen its activity. PLN and SLN differ in their interaction with SERCA. SLN in skeletal muscle has also been shown to amplify calcineurin signaling, and is a potential therapeutic target for the management of muscular dystrophy, as it is upregulated in the skeletal muscle of the classical mouse model for Duchenne muscular dystrophy.	30
380776	cd20254	CASIMO1_SMIM5	small integral membrane protein 5 (SMIM5) of CASIMO1. This family contains the small integral membrane protein 5 (SMIM5) variant of the Cancer-Associated Small Integral Membrane Open reading frame 1 (CASIMO1), a small open reading frame (sORF)-encoded protein (also known as a microprotein). CASIMO1 controls cell proliferation and interacts with squalene epoxidase (SQLE) modulating lipid droplet formation. CASIMO1 RNA is overexpressed predominantly in hormone receptor-positive breast tumors, and its knockdown has been shown to decrease proliferation in multiple breast cancer cell lines. Loss of CASIMO1 disturbs the organization of the actin cytoskeleton, leads to inhibition of cell motility, and stalls the cell cycle in the G0/G1 phase. CASIMO1 interacts with SQLE, a key enzyme in cholesterol synthesis and a known oncogene in breast cancer. This variant is expressed on chromosome 16 in humans, mostly in the stomach, kidney, thyroid and esophagus.	71
380777	cd20255	CASIMO1_SMIM22	small integral membrane protein 22 (SMIM22) of CASIMO1. This family contains the small integral membrane protein 22 (SMIM22) variant of the Cancer-Associated Small Integral Membrane Open reading frame 1 (CASIMO1), a small open reading frame (sORF)-encoded protein (also known as a microprotein). CASIMO1 controls cell proliferation and interacts with squalene epoxidase (SQLE) modulating lipid droplet formation. CASIMO1 RNA is overexpressed predominantly in hormone receptor-positive breast tumors, and its knockdown has been shown to decrease proliferation in multiple breast cancer cell lines. Loss of CASIMO1 disturbs the organization of the actin cytoskeleton, leads to inhibition of cell motility, and stalls the cell cycle in the G0/G1 phase. CASIMO1 interacts with SQLE, a key enzyme in cholesterol synthesis and a known oncogene in breast cancer. This variant is expressed on chromosome 17 in humans, mostly in the colon and stomach.	80
380773	cd20256	Stannin_family	Stannin family includes vertebrate Stannin and insect Hemotin. The Stannin family includes vertebrate Stannin and insect Hemotin, which are functional homologs required at the cellular level for endosomal maturation, and at the molecular level, to bind and antagonize 14-3-3zeta. Stannin is a monotopic membrane protein containing an N-terminal single transmembrane helix that transverses the lipid bilayer, an unstructured linker which includes a conserved CXC metal-binding motif and a putative 14-3-3zeta binding site, and a C-terminal distorted cytoplasmic helix. Analysis of the Hemotin sequence using a transmembrane topology prediction program revealed a very similar potential transmembrane alpha-helical domain arrangement as Stannin.	83
380774	cd20257	Stannin	Stannin. Stannin (SNN) is a monotopic membrane protein containing an N-terminal single transmembrane helix that transverses the lipid bilayer, an unstructured linker which includes a conserved CXC metal-binding motif and a putative 14-3-3zeta binding site, and a C-terminal distorted cytoplasmic helix. It binds and antagonizes 14-3-3zeta and is required for endosomal maturation. It has also been identified as the specific marker for neuronal cell apoptosis induced by trimethyltin (TMT) intoxication. TMT is one of the most toxic organotin compounds (or alkyltins), and is known to selectively inflict injury to specific regions of the brain.	84
380771	cd20258	Tal_Pri	Tarsal-less (Tal), also known as polished rice (Pri), and related peptides. The tal/pri gene produces a single polycistronic transcript that encodes 4 related peptides: tal-1A, tal-2A, and tal-3A, which are each 11 amino acids long, and tal-AA, which is 32 amino acids long. The shorter ones contain one conserved LDPTGXY motif; tal-AA contains two. The Tal/Pri peptides function redundantly in several developmental processes. They are required for embryonic and imaginal development in Drosophila. They control epidermal differentiation in Drosophila by triggering the amino-terminal truncation of the transcription factor Shavenbaby (Svb), converting Svb from a repressor to an activator. In addition, Tal/Pri peptides are required for denticle formation and may play a role in the developmental timing of trichome differentiation. They are essential for the development of taenidial folds in the trachea, and in the early stages of leg development, for the intercalation of the tarsal segments during the mid-third instar stage and later for tarsal joint formation. Furthermore, Tal/Pri peptides are required for correct wing and leg formation through their regulation of several genes including those in the Notch signaling pathway. The Tribolium orthologue mille-pattes (mlpt) is essential for embryo segmentation; it also encodes a polycistronic mRNA that codes for four peptides: Mlpt peptides 1-3 range in size from 11 to 15 amino acids and each contain one conserved LDPTGXY motif; Mlpt peptide 4 is 23 amino acids, is not represented here, and does not contain this motif.	32
380770	cd20259	pgc	polar granule component. Polar granule component (pgc) is implicated in primordial germ cell specification in Drosophila, which requires transcriptional quiescence and three genes: pgc, nanos (nos) and germ cell-less (gcl), that act to down-regulate Pol II transcription. The microprotein pgc inhibits positive transcription elongation factor b (P-TEFb), which phosphorylates the C-terminal domain of the largest Pol II subunit.	66
380769	cd20260	Myoregulin	Myoregulin. Myoregulin (MLN) is encoded by a skeletal muscle-specific RNA Linc-RAM, which is annotated as a putative long noncoding RNA (lncRNA). It is a single-pass transmembrane alpha-helix that interacts directly with sarcoplasmic reticulum (SR) calcium transporting ATPase (SERCA) and impedes Ca(2+) uptake into the SR. SERCA is the membrane pump that controls muscle relaxation by regulating Ca(2+) uptake into the SR. MLN is the dominant regulator of SERCA1 activity in adult skeletal muscle and is a promising drug target for improving muscle performance.	45
380756	cd20261	Complex1_LYR_LYRM1	LYR (leucine-tyrosine-arginine) motif found in LYR motif-containing protein 1 (LYRM1) and similar proteins. LYR motif-containing protein 1 (LYRM1) may promote cell proliferation and inhibition of apoptosis of preadipocytes. Overexpression of the human LYRM1 causes mitochondrial dysfunction and induces insulin resistance in 3T3-L1 adipocytes. LYRM1 belongs to the Complex1_LYR-like superfamily that consists of proteins of diverse functions that are exclusively found in eukaryotes and contain the conserved tripeptide 'LYR' close to the N-terminus.	70
380757	cd20262	Complex1_LYR_LYRM2	LYR (leucine-tyrosine-arginine) motif found in LYR motif-containing protein 2 (LYRM2) and similar proteins. LYRM2 is an uncharacterized LYR motif-containing protein that belongs to the Complex1_LYR-like superfamily which consists of proteins of diverse functions that are exclusively found in eukaryotes; these proteins contain the conserved tripeptide 'LYR' close to the N-terminus.	63
380758	cd20263	Complex1_LYR_NDUFB9_LYRM3	LYR (leucine-tyrosine-arginine) motif found in NADH dehydrogenase [ubiquinone] 1 beta subcomplex subunit 9 (NDUFB9) and similar proteins. NDUFB9, also called LYR motif-containing protein 3 (LYRM3), or Complex I-B22 (CI-B22), or NADH-ubiquinone oxidoreductase B22 subunit (UQOR22), is an accessory subunit of the mitochondrial membrane respiratory chain NADH dehydrogenase (Complex I), and is believed not to be involved in catalysis. In general, accessory subunits are integral for assembly and function of Complex I, which functions in the transfer of electrons from NADH to the respiratory chain. NDUFB9 belongs to the Complex1_LYR-like superfamily that consists of proteins of diverse functions that are exclusively found in eukaryotes and contain the conserved tripeptide 'LYR' close to the N-terminus.	77
380759	cd20264	Complex1_LYR_LYRM4	LYR (leucine-tyrosine-arginine) motif found in LYR motif-containing protein 4 (LYRM4) and similar proteins. LYRM4, also called ISD11, is a eukaryote-specific component of the mitochondrial biogenesis of Fe-S clusters which are essential cofactors in multiple processes, including oxidative phosphorylation. It is required for nuclear and mitochondrial iron-sulfur protein biosynthesis by forming a complex with, and stabilizing, the sulfur donor NFS1. LYRM4 belongs to the Complex1_LYR-like superfamily that consists of proteins of diverse functions that are exclusively found in eukaryotes and contain the conserved tripeptide 'LYR' close to the N-terminus.	69
380760	cd20265	Complex1_LYR_ETFRF1_LYRM5	LYR (leucine-tyrosine-arginine) motif found in electron transfer flavoprotein regulatory factor 1 (ETFRF1) and similar proteins. ETFRF1, also called LYR motif-containing protein 5 (LYRM5), or Ghiso (growth-factor inducible soluble) factor, acts as a regulator of the electron transfer flavoprotein by promoting the removal of flavin from the ETF holoenzyme (composed of ETFA and ETFB). It belongs to the Complex1_LYR-like superfamily that consists of proteins of diverse functions that are exclusively found in eukaryotes and contain the conserved tripeptide 'LYR' close to the N-terminus.	74
380761	cd20266	Complex1_LYR_NDUFA6_LYRM6	LYR (leucine-tyrosine-arginine) motif found in NADH dehydrogenase [ubiquinone] 1 alpha subcomplex subunit 6 (NDUFA6) and similar proteins. NDUFA6, also called LYR motif-containing protein 6 (LYRM6), or Complex I-B14 (CI-B14), or NADH-ubiquinone oxidoreductase B14 subunit, is an accessory subunit of the mitochondrial membrane respiratory chain NADH dehydrogenase (Complex I), and is believed to be not involved in catalysis. In general, accessory subunits are integral for assembly and function of Complex I, which functions in the transfer of electrons from NADH to the respiratory chain. NDUFA6 belongs to the Complex1_LYR-like superfamily that consists of proteins of diverse functions that are exclusively found in eukaryotes and contain the conserved tripeptide 'LYR' close to the N-terminus.	75
380762	cd20267	Complex1_LYR_LYRM7	LYR (leucine-tyrosine-arginine) motif found in LYR motif-containing protein 7 (LYRM7) and similar proteins. LYRM7 is an assembly factor required for Rieske iron sulfur (Fe-S) protein UQCRFS1 incorporation into the cytochrome b-c1 (CIII) complex. It functions as a chaperone, binding to this subunit within the mitochondrial matrix and stabilizing it prior to its translocation and insertion into the late CIII dimeric intermediate within the mitochondrial inner membrane. LYRM7 mutations cause a multifocal cavitating leukoencephalopathy with a distinct and recognizable magnetic resonance imaging (MRI) pattern. LYRM7 belongs to the Complex1_LYR-like superfamily that consists of proteins of diverse functions that are exclusively found in eukaryotes and contain the conserved tripeptide 'LYR' close to the N-terminus.	72
380763	cd20268	Complex1_LYR_SDHAF1_LYRM8	LYR (leucine-tyrosine-arginine) motif found in mitochondrial succinate dehydrogenase assembly factor 1 (SDHAF1) and similar proteins. SDHAF1, also called SDH assembly factor 1, or LYR motif-containing protein 8 (LYRM8), is a LYR complex-II specific assembly factor that plays an essential role in the assembly of succinate dehydrogenase (SDH), an enzyme complex (also referred to as respiratory complex II) that is a component of both the tricarboxylic acid (TCA) cycle and the mitochondrial electron transport chain, and which couples the oxidation of succinate to fumarate with the reduction of ubiquinone (coenzyme Q) to ubiquinol. It promotes maturation of the iron-sulfur protein subunit SDHB of the SDH catalytic dimer, protecting it from the deleterious effects of oxidants. SDHAF1 may act together with SDHAF3. It is mutated in SDH-defective infantile leukoencephalopathy. SDHAF1 belongs to the Complex1_LYR-like superfamily that consists of proteins of diverse functions that are exclusively found in eukaryotes and contain the conserved tripeptide 'LYR' close to the N-terminus.	64
380764	cd20269	Complex1_LYR_LYRM9	LYR (leucine-tyrosine-arginine) motif found in LYR motif-containing protein 9 (LYRM9) and similar proteins. LYRM9 is an uncharacterized LYR motif-containing protein that belongs to the Complex1_LYR-like superfamily which consists of proteins of diverse functions that are exclusively found in eukaryotes and contain the conserved tripeptide 'LYR' close to the N-terminus.	60
380765	cd20270	Complex1_LYR_SDHAF3_LYRM10	LYR (leucine-tyrosine-arginine) motif found in mitochondrial succinate dehydrogenase assembly factor 3 (SDHAF3) and similar proteins. SDHAF3, also called SDH assembly factor 3, or LYR motif-containing protein 10 (LYRM10), plays an essential role in the assembly of succinate dehydrogenase (SDH), an enzyme complex (also referred to as respiratory complex II) that is a component of both the tricarboxylic acid (TCA) cycle and the mitochondrial electron transport chain, and which couples the oxidation of succinate to fumarate with the reduction of ubiquinone (coenzyme Q) to ubiquinol. It promotes maturation of the iron-sulfur protein subunit SDHB of the SDH catalytic dimer, protecting it from the deleterious effects of oxidants. SDHAF3 may act together with SDHAF1. Its mutations may be associated with idiopathic SDH-associated diseases. SDHAF3 belongs to the Complex1_LYR-like superfamily that consists of proteins of diverse functions that are exclusively found in eukaryotes and contain the conserved tripeptide 'LYR' close to the N-terminus.	56
380766	cd20271	Complex1_LYR_FMC1	LYR (leucine-tyrosine-arginine) motif found in formation of mitochondrial complex V assembly factor 1 (FMC1) and similar proteins. FMC1, also known as formation of mitochondrial complexes protein 1, is an ATP synthase assembly factor that plays a role in the assembly/stability of the mitochondrial membrane ATP synthase (F(1)F(0) ATP synthase or Complex V). It belongs to the Complex1_LYR-like superfamily that consists of proteins of diverse functions that are exclusively found in eukaryotes and contain the conserved tripeptide 'LYR' close to the N-terminus.	95
380767	cd20272	Complex1_LYR_MIEF1-MP	LYR (leucine-tyrosine-arginine) motif found in mitochondrial elongation factor 1 microprotein (MIEF1-MP) and similar proteins. MIEF1-MP, also called alternative mitochondrial elongation factor 1 (MIEF1) protein (AltMIEF1), or MIEF1 upstream open reading frame protein, is involved in the regulation of mitochondrial fission mediated by dynamin-1-like protein (DNM1L). It positively regulates mitochondrial translation. MIEF1-MP belongs to the Complex1_LYR-like superfamily that consists of proteins of diverse functions that are exclusively found in eukaryotes and contain the conserved tripeptide 'LYR' close to the N-terminus.	58
380768	cd20273	Complex1_LYR_unchar	LYR (leucine-tyrosine-arginine) motif found in uncharacterized LYR motif-containing protein. This group contains uncharacterized LYR motif-containing proteins belonging to the Complex1_LYR-like superfamily that consists of proteins of diverse functions that are exclusively found in eukaryotes and contain the conserved tripeptide 'LYR' close to the N-terminus.	61
380752	cd20274	Sarcolamban	Sarcolamban A and B bioactive peptides and similar proteins. Invertebrate sarcolamban (SCLA and SCLB) belong to a family of bioactive peptides which includes vertebrate phospholamban (PLN) and sarcolipin (SLN). SCLA and SCLB are encoded within a single putative noncoding transcript, pncr003:2L; PLN and SLN are each encoded within a single exon of a spliced transcript. PLN is chiefly expressed in the cardiac muscle, while SLN is expressed in the atria of the heart and embryonic slow-type skeletal muscle; SCL is found in cardiac and somatic muscle of Drosophila melanogaster. PLN and SLN are each a single-pass transmembrane alpha-helix that interacts directly with the sarcoplasmic reticulum (SR) calcium pump (SERCA) and lowers its affinity for Ca2+, thereby decreasing the rate of Ca2+ reuptake into the SR from the sarcoplasm. In the heart, PLN and SLN inhibit the activity of SERCA2a isoform and function as important regulators of cardiac contractility and disease. SCLA and SCLB are each predicted to form a single-pass transmembrane helix, localize to the SR with the SR calcium pump (Ca-P60A), and dampen its activity.	27
380753	cd20275	Sarcolamban_B	Sarcolamban B bioactive peptide and similar proteins. Invertebrate sarcolamban B (SCLB) belongs to a family of bioactive peptides which includes invertebrate sarcolamban A (SCLA), and vertebrate phospholamban (PLN) and sarcolipin (SLN). SCLA and SCLB are encoded within a single putative noncoding transcript, pncr003:2L; PLN and SLN are each encoded within a single exon of a spliced transcript. PLN is chiefly expressed in the cardiac muscle, while SLN is expressed in the atria of the heart and embryonic slow-type skeletal muscle; SCL is found in cardiac and somatic muscle of Drosophila melanogaster. PLN and SLN are each a single-pass transmembrane alpha-helix that interacts directly with the sarcoplasmic reticulum (SR) calcium pump (SERCA) and lowers its affinity for Ca2+, thereby decreasing the rate of Ca2+ reuptake into the SR from the sarcoplasm. In the heart, PLN and SLN inhibit the activity of SERCA2a isoform and function as important regulators of cardiac contractility and disease. SCLA and SCLB are each predicted to form a single-pass transmembrane helix, localize to the SR with the SR calcium pump (Ca-P60A), and dampen its activity.	28
380754	cd20276	Sarcolamban_A	Sarcolamban A bioactive peptide. Invertebrate sarcolamban A (SCLA) belongs to a family of bioactive peptides which includes invertebrate sarcolamban B (SCLB), and vertebrate phospholamban (PLN) and sarcolipin (SLN). SCLA and SCLB are encoded within a single putative noncoding transcript, pncr003:2L; PLN and SLN are each encoded within a single exon of a spliced transcript. PLN is chiefly expressed in the cardiac muscle, while SLN is expressed in the atria of the heart and embryonic slow-type skeletal muscle; SCL is found in cardiac and somatic muscle of Drosophila melanogaster. PLN and SLN are each a single-pass transmembrane alpha-helix that interacts directly with the sarcoplasmic reticulum (SR) calcium pump (SERCA) and lowers its affinity for Ca2+, thereby decreasing the rate of Ca2+ reuptake into the SR from the sarcoplasm. In the heart, PLN and SLN inhibit the activity of SERCA2a isoform and function as important regulators of cardiac contractility and disease. SCLA and SCLB are each predicted to form a single-pass transmembrane helix, localize to the SR with the SR calcium pump (Ca-P60A), and dampen its activity.	27
410555	cd20277	FXYD	phenylalanine-X-tyrosine-aspartate (FXYD) family. FXYDs are small single-transmembrane proteins that act as novel regulators of Na+/K+-ATPase (NKA). The transmembrane domain and the conserved Phe-X-Tyr-Asp motif of FXYD play a role in the binding of FXYD to the alpha- and beta-subunits of NKA. PFXYD (proline-phenylalanine-X-tyrosine-aspartate) at the beginning of the signature sequence is invariant in all known examples in mammals and identical except for the proline in other vertebrates; X is usually Y (tyrosine), but can also be E, T, or H (glutamate, threonine, or histidine). The FXYD protein family contains at least twelve members that have the extracellular FXYD motif, transmembrane domain, and intracellular domain. Members share a 35-amino acid signature sequence domain, beginning with PFXYD and containing 7 invariant and 6 highly conserved amino acids. In mammals, members of the FXYD family include FXYD1 (phospholemman, PLM), FXYD2 (the gamma-subunit of NKA), FXYD3 (mammary tumor marker Mat-8), FXYD4 (corticosteroid hormone-induced factor, CHIF), FXYD5 (dysadherin), FXYD6 (phosphohippolin), and FXYD7. In elasmobranchs, FXYD10 (phospholemman-like protein from shark, PLMS) was first identified in the rectal glands of Squalus acanthias. In addition, studies on sharks reported that the functions of FXYD10 via its C-terminal cysteine residue interactions were associated with negative regulation of shark NKA activity. Teleostean FXYD proteins (FXYD2, 5-9, 11, and 12) have been reported in certain teleosts such as the Tetraodon nigroviridis, Salmo salar, Danio rerio, and Oryzias dancena. Recent studies have demonstrated that several teleost FXYD isoforms are expressed in the gills and kidneys of the fish, and their expression levels are altered in response to salinity changes, suggesting that these FXYDs may regulate electrolyte homeostasis and body fluid of the fish.	30
380748	cd20278	Minion	Microprotein INducer of fusION (Minion). Microprotein INducer of fusION (Minion), also called protein myomixer or protein myomerger, is encoded by the MYMX gene. Along with Myomaker, it allows cells to fuse and form multinucleated fibers that are capable of contracting. A lack of Minion disables skeletal muscles, including the diaphragm, resulting in perinatal death in mice. This insight into the Minion-Myomaker system may one day be exploited for targeted drug delivery involving fusing cells in cancer or other contexts. The production of Minion peaks three to four days after injury, similar to the expression profile of Myomaker.	57
411710	cd20280	NotI-like	Restriction endonuclease NotI and similar proteins. Restriction enzyme NotI is a type IIP restriction enzyme (the simplest being separate homodimeric endonucleases and methyltransferases that each recognize the same palindromic DNA target sequence) that recognizes sites of 8 bp or longer in invasive DNA. NotI is commonly used for the introduction of radiolabeled landmarks in the restriction landmark genomic scanning (RLGS) method, which has become a common technique for the study of aberrant DNA methylation patterns in tumor- and tissue-specific cell lines.	357
380417	cd20281	cupin_QDO_C	quercetinase, C-terminal cupin domain. This family contains the C-terminal domain of quercetinase (also known as quercetin 2,3-dioxygenase, 2,3QD, QDO and YxaG; EC 1.13.11.24), a mononuclear copper-dependent dioxygenase that catalyzes the cleavage of the flavonol quercetin (5,7,3',4'-tetrahydroxyflavonol) heterocyclic ring to produce 2-protocatechuoyl-phloroglucinol carboxylic acid and carbon monoxide. This family includes Aspergillus japonicus quercetin 2,3-dioxygenase (QDO), a homodimer that shows oxygenase activity with Cu2+. The dioxygen binds to the metal ion of the Cu-QDO-quercetin complex, yielding a Cu2+-superoxo quercetin radical intermediate, which forms a Cu2+-alkylperoxo complex that evolves into an endoperoxide intermediate that decomposes to the product. Quercetinase is a bicupin with two tandem cupin beta-barrel domains, only the C-terminal domain is included in this alignment. The pirins, which also belong to the cupin domain family, have been shown to catalyze a reaction involving quercetin and may have a function similar to that of quercetinase.	114
380418	cd20282	cupin_DddQ	dimethylsulfoniopropionate lyase DddQ, cupin domain. Dimethylsulfoniopropionate (DMSP) is produced worldwide in large amounts, mainly by marine phytoplankton and macroalgae. DMSP lyase catalyzes the cleavage of DMSP to generate the volatile dimethyl sulfide (DMS) and plays a major role in the biogeochemical cycling of sulfur. When released into the atmosphere from the oceans, DMS is oxidized, forming cloud condensation nuclei that may influence weather and climate. DMSP lyase belongs to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	103
380419	cd20283	cupin_DddY	dimethylsulfoniopropionate lyase DddY, cupin domain. This family includes dimethylsulfoniopropionate (DMSP) lyase DddY, the only known periplasmic DMSP lyase that is present in certain proteobacteria. DddY cleaves dimethylsulfoniopropionate (DMSP), the organic osmolyte and antioxidant produced in marine environments, and yields acrylate and the climate-active gas dimethyl sulfide (DMS). The catabolism of DMSP by microbial organisms provides a major source of carbon and sulfur in the marine environment. Studies show that DddY binds a zinc ion as cofactor, and uses a key tyrosine as a general base to attack DMSP. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold.	294
380420	cd20285	cupin_7S_11S_C	7S and 11S seed storage globulin, C-terminal cupin domain. This family contains the C-terminal cupin domains of 7S and 11S seed storage proteins. The 7S globulins include soybean allergen beta-conglycinin, peanut allergen conarachin (Ara h 1), walnut allergen Jug r 2, and lentil allergen Len c 1. Proteins in this family perform various functions, including a role in sucrose binding, desiccation, defense against microbes and oxidative stress. The 11S globulins include many common food allergens such as the peanut major allergen Ara h 3, almond allergen Pru du 6, pecan allergen Car i 4, hazelnut allergen Cor a 9, Brazil nut allergen Ber e 2, cashew allergen Ana o 2, pistachio allergen Pis v 2/5, and walnut allergen Jug n/r 4. These plant seed storage globulins have tandem cupin-like beta-barrel folds (referred to as a bicupin). Storage proteins are the cause of well-known allergic reactions to peanuts and cereals. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold.	109
380421	cd20287	cupin_pirin-like_N	pirin-like, N-terminal cupin domain. This family contains the N-terminal cupin domain of pirin and pirin-like proteins, including Escherichia coli YhhW and YhaK. Pirin functions as both a transcriptional cofactor and an apoptosis-related protein in mammals and is involved in seed germination and seedling development in plants. Proteins in this family have two tandem cupin-like folds but the C-terminal cupin fold has diverged considerably and does not have a metal binding site. The exact functions of pirins are unknown but they have quercetinase activity in Escherichia coli and are thought to play important roles in transcription and apoptosis. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold.	81
380422	cd20288	cupin_pirin-like_C	pirin-like, C-terminal cupin domain. This family contains the C-terminal cupin domain of pirin and pirin-like proteins, including Escherichia coli YhhW and YhaK. Pirin functions as both a transcriptional cofactor and an apoptosis-related protein in mammals and is involved in seed germination and seedling development in plants. Proteins in this family have two tandem cupin-like folds but the C-terminal cupin fold has diverged considerably and does not have a metal binding site. The exact functions of pirins are unknown but they have quercetinase activity in Escherichia coli and are thought to play important roles in transcription and apoptosis. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold.	70
380423	cd20289	cupin_ADO	2-aminoethanethiol dioxygenase, cupin domain. This family contains 2-aminoethanethiol dioxygenase (also known as cysteamine dioxygenase, persulfurase or ADO; EC 1.13.11.19), which catalyzes the addition of two oxygen atoms to free cysteamine (2-aminoethanethiol) to form hypotaurine that subsequently oxidizes to taurine. These enzymes are found in prokaryotes as well as eukaryotes and belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	103
380424	cd20290	cupin_Mj0764-like	uncharacterized Methanocaldococcus jannaschii Mj0764 and related proteins, cupin domain. This family includes archaeal and bacterial proteins homologous to MJ0764, a Methanocaldococcus jannaschii protein of unknown function. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	100
380425	cd20291	cupin_CucA-like	soluble periplasm cuproprotein CucA and related proteins, cupin domain. This family includes bacterial proteins homologous to a soluble periplasm protein, CucA, found in the periplasm of the cyanobacterium Synechocystis where it shows some Cu2+-dependent quercetin dioxygenase activity. Studies show that a copper-trafficking pathway enables Cu2+ occupancy of CucA to accumulate in the periplasm, and this involves two copper transporters (CtaA and PacS) and a metallochaperone (Atx1). CucA belongs to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	201
380426	cd20292	cupin_QdtA-like	sugar 3,4-ketoisomerase QdtA and related proteins, cupin domain. This family includes cupin domains of several bacterial proteins homologous to sugar 3,4-ketoisomerases. Thermoanaerobacterium thermosaccharolyticum QdtA catalyzes a key step in the biosynthesis of these sugars, the conversion of thymidine diphosphate (dTDP)-4-keto-6-deoxyglucose to dTDP-3-keto-6-deoxyglucose. In Aneurinibacillus thermoaerophilus, TDP-4-oxo-6-deoxy-alpha-D-glucose-3,4-oxoisomerase (also known as FdtA) is involved in the biosynthesis of dTDP-Fucp3NAc (3-acetamido-3,6-dideoxy-alpha-d-galactose), which is part of the repeating units of the glycan chain in the S-layer. Shewanella denitrificans bifunctional ketoisomerase/N-acetyltransferase (also known as FdtD) is involved in the third and fifth steps in the production of 3-acetamido-3,6-dideoxy-alpha-d-galactose or Fuc3NAc; the C-terminal cupin domain harbors the active site responsible for the isomerization reaction. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	117
380427	cd20293	cupin_HutD_N	histidine utilization protein HutD and related proteins, N-terminal cupin domain. This model represents the N-terminal domain of a bicupin protein HutD, involved in histidine utilization (Hut) in Pseudomonas species. Although a metal binding site is not found in Pseudomonas fluorescens (PfluHutD), a binding pocket for ligands is located in the middle of the N-terminal cupin domain near the metal binding sites; N-formyl-l-glutamate (FG, a Hut pathway intermediate) has been identified as a potential ligand in vivo. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	92
380428	cd20294	cupin_KduI_N	5-keto-4-deoxyuronate isomerase (KduI) and related proteins, N-terminal cupin domain. 5-keto-4-deoxyuronate isomerase (KduI; EC 5.3.1.17), also called 5-dehydro-4-deoxy-D-glucuronate isomerase or 4-deoxy-L-threo-5-hexosulose-uronate ketol-isomerase, catalyzes the interconversion of 5-keto-4-deoxyuronate and 2,5-diketo-3-deoxygluconate in the breakdown of pectin. KduI is a bicupin; this model describes the N-terminal cupin domain. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	100
380429	cd20295	cupin_Pac13-like	monomeric dehydratase Pac13 and related proteins, cupin domain. This family includes a small monomeric dehydratase Pac13 that mediates the formation of the 3'-deoxynucleotide of pacidamycins, which are uridyl peptide antibiotics (UPAs). Pac13 is involved in the formation of the unique 3'-deoxyuridine moiety found in these UPAs; it catalyzes the dehydration of uridine-5'-aldehyde. The similarity of the 3'-deoxy pacidamycin moiety with synthetic anti-retrovirals offers a potential opportunity for the utilization of Pac13 in the biocatalytic generation of antiviral compounds. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	101
380430	cd20296	cupin_PpnP-like	pyrimidine/purine nucleoside phosphorylase and related proteins, cupin domain. This family includes cupin domain proteins that are homologous to pyrimidine/purine nucleoside phosphorylase PpnP. Purine and pyrimidine nucleoside phosphorylases are key enzymes of the nucleoside salvage pathway; they catalyze the reversible phosphorolytic cleavage of the glycosidic bond of purine and pyrimidine nucleosides. Nucleoside phosphorylases are of medical interest since phosphorylases can be used in activating prodrugs; high-molecular mass purine nucleoside phosphorylases may be used in gene therapy of some solid tumors and their inhibitors could be selective immunosuppressive, anticancer, and antiparasitic agents. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	90
380431	cd20297	cupin_HQDO_small	hydroquinol 1,2-dioxygenase (HQDO) small subunit, cupin domain. This model describes the small (or alpha) subunit of hydroquinone 1,2-dioxygenase (HQDO), which adopts a cupin domain fold. HQDO is a heterotetramer of two alpha and two beta subunits of 19kDa and 38kDa, respectively, and is a Fe(II) ring cleaving dioxygenase that is a key enzyme in the hydroquinone pathway of para-nitrophenol degradation, where it catalyzes the ring cleavage of hydroquinone to gamma-hydroxymuconic semialdehyde. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	160
380432	cd20298	cupin_UAH	ureidoglycolate amidohydrolase (UAH) and related proteins, cupin domain. This family includes the cupin-fold protein, ureidoglycolate hydrolase (AllA; EC 3.5.3.19), which is involved in the breakdown of allantoin under aerobic conditions. Allantoin is the key intermediate of nitrogen fixation in bacteria, and its degradation occurs in several steps. AllA is involved in the third step of this pathway which consists of hydrolysis of (S)-ureidoglycolate to yield glyoxylate, ammonia, and CO2. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	92
380433	cd20299	cupin_YP766765-like	Rhizobium leguminosarum YP_766765.1 and related proteins, cupin domain. This family includes mostly bacterial proteins homologous to Rhizobium leguminosarum YP_766765.1, a protein of unknown function. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	90
380434	cd20300	cupin_npun_f5605-like_N	Nostoc punctiforme npun_f5605 and related proteins, N-terminal cupin domain. This family includes proteins homologous to Nostoc punctiforme putative dioxygenase npun_f5605, a protein of unknown function. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	99
380435	cd20301	cupin_ChrR	anti-ECFsigma factor ChrR, cupin domain. This family contains bacterial anti-sigma factor ChrR from the photosynthetic bacterium Rhodobacter sphaeroides (Rsp) and similar proteins. ChrR is a member of the ZAS (Zn2+ anti-sigma) subfamily of group IV anti-sigmas. It inhibits transcriptional activity by binding to the Rsp extra cytoplasmic function (ECF) sigma factor E (sigmaE), an essential factor to mount a transcriptional response to singlet oxygen and for viability when carotenoids are limiting. ChrR comprises two structural and functional modules; the N-terminal anti-sigma domain (ASD) binds a Zn(2+) ion, contacts sigma(E), and is sufficient to inhibit sigma(E)-dependent transcription. The ChrR C-terminal domain adopts a cupin fold, can coordinate an additional Zn(2+), and is required for the transcriptional response to singlet oxygen, a potent oxidant that damages cellular biomolecules and can kill cells. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold.	161
380436	cd20302	cupin_DAD	2,4'-Dihydroxyacetophenone dioxygenase (DAD), cupin domain. 2,4'-Dihydroxyacetophenone dioxygenase (DAD) catalyzes the oxidation of 2,4'-dihydroxyacetophenone to 4-hydroxybenzoate and formate as part of the 4-hydroxyacetophenone catabolic pathway. This enzyme is a homo-tetramer containing one iron per molecule of enzyme. This enzyme is an unusual dioxygenase in that it cleaves a C-C bond in a substituent of the aromatic ring rather than within the ring itself. As a bacterial dioxygenase, DAD plays an important environmental role in the aerobic catabolism of aromatic compounds; expression of this enzyme in appropriately engineered microorganisms has the potential to use these aromatic pollutants as a carbon source and thus remove them from the environment. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold.	123
380437	cd20303	cupin_ChrR_1	Marinobacter hydrocarbonoclasticus anti-ECFsigma factor ChrR, and similar proteins; 2 heterologous tandem repeats of cupin domain. This family contains bacterial anti-sigma factors such as ChrR from Marinobacter hydrocarbonoclasticus. Anti-sigma factor ChrR is a member of the ZAS (Zn2+ anti-sigma) subfamily of group IV anti-sigmas. It inhibits transcriptional activity by binding to the ECF sigma factor E (sigmaE), an essential factor to mount a transcriptional response to singlet oxygen and for viability when carotenoids are limiting. This protein family likely contains two distinct homologous functional domains belonging to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold.	102
380438	cd20304	cupin_OxDC_N	Oxalate decarboxylase (OxDC), N-terminal cupin domain. This model represents the N-terminal cupin domain of oxalate decarboxylase (OxDC; EC 4.1.1.2), a manganese-dependent bicupin that catalyzes the conversion of oxalate to formate and carbon dioxide, utilizing dioxygen as a cofactor. It is evolutionarily related to oxalate oxidase (OxOx or germin; EC 1.2.3.4) which, in contrast, converts oxalate and dioxygen to carbon dioxide and hydrogen peroxide. OxDC is classified as a bicupin because it contains two cupin folds with each domain containing one manganese binding site, with four manganese binding residues (three histidines and one glutamate) conserved as well as a number of hydrophobic residues. Members of this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold.	155
380439	cd20305	cupin_OxDC_C	Oxalate decarboxylase (OxDC), C-terminal cupin domain. This model represents the C-terminal cupin domain of oxalate decarboxylase (OxDC; EC 4.1.1.2), a manganese-dependent bicupin that catalyzes the conversion of oxalate to formate and carbon dioxide, utilizing dioxygen as a cofactor. It is evolutionarily related to oxalate oxidase (OxOx or germin; EC 1.2.3.4) which, in contrast, converts oxalate and dioxygen to carbon dioxide and hydrogen peroxide. OxDC is classified as a bicupin because it contains two cupin folds with each domain containing one manganese binding site, with four manganese binding residues (three histidines and one glutamate) conserved as well as a number of hydrophobic residues. Members of this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold.	153
380440	cd20306	cupin_OxDC-like	Oxalate decarboxylase (OxDC)-like cupin domain. This subfamily contains bacterial and eukaryotic cupin domains of proteins homologous to oxalate decarboxylase (OxDC; EC 4.1.1.2) such as MSMEG_2254, a putative OxDC from Mycobacterium smegmatis. OxDC is a manganese-dependent bicupin that catalyzes the conversion of oxalate to formate and carbon dioxide, utilizing dioxygen as a cofactor. It is evolutionarily related to oxalate oxidase (OxOx or germin; EC 1.2.3.4) which, in contrast, converts oxalate and dioxygen to carbon dioxide and hydrogen peroxide. OxDC is classified as a bicupin because it contains two cupin folds with each domain containing one manganese binding site, with four manganese binding residues (three histidines and one glutamate) conserved as well as a number of hydrophobic residues.	151
380441	cd20307	cupin_BacB_N	Bacillus subtilis bacilysin and related proteins, N-terminal cupin domain. This model represents the N-terminal domain of bacilysin (BacB, also known as AerE in Microcystis aeruginosa), a non-ribosomally synthesized dipeptide antibiotic that is produced and excreted by certain strains of Bacillus subtilis. Bacilysin is an oxidase that catalyzes the synthesis of 2-oxo-3-(4-oxocyclohexa-2,5-dienyl)propanoic acid, a precursor to L-anticapsin. Each bacilysin monomer has two tandem cupin domains. It is active against a wide range of bacteria and some fungi. The antimicrobial activity of bacilysin is antagonized by glucosamine and N-acetyl glucosamine, indicating that bacilysin interferes with glucosamine synthesis, and thus, with the synthesis of microbial cell walls. AerE is thought to be involved in the formation of the 2-carboxy-6-hydroxyoctahydroindole (Choi) moiety found on all aeruginosin tetrapeptides, based on gene knock-out experiments. It is encoded by the aerE gene of the aerABCDEF Aeruginosin biosynthesis gene cluster in Microcystis aeruginosa. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold.	100
380442	cd20308	cupin_YdaE	D-lyxose isomerase YdaE, cupin domain. This family includes D-lyxose isomerase (D-LI; EC 5.3.1.15) homologous to YdaE from the sigma B regulon of Bacillus subtilis, a protein with an active site that is highly similar to the E. coli O157 z5688 D-lyxose isomerase. YdaE may have a synergistic role with ydaD, an NAD(P)-dependent alcohol dehydrogenase, in the adaptation to environment stresses; YdaD may be active against the ketose sugar produced by YdaE and function in providing resistance to oxidative stress through the production of reducing equivalents in the form of NAD(P)H. YdaE forms a cupin-type beta-barrel, with two alpha helices at the N-terminus. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold.	160
380443	cd20309	cupin_EcSI	Escherichia coli sugar isomerase (EcSI), cupin domain. This family includes a sugar isomerase homologous to pathogenic Escherichia coli O157 z5688 D-lyxose isomerase (EcSI or Z5688) which has an active site highly similar to YdaE from the sigma B regulon of Bacillus subtilis. Extensive substrate screening has revealed that EcSI is capable of acting on D-lyxose and D-mannose. Studies show that overexpression of EcSI enables cell growth on D-lyxose as the sole carbon source. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold.	199
380444	cd20310	cupin_L-RbI	L-ribose isomerase, cupin domain. L-ribose isomerase (RbI) catalyzes the reversible isomerization between L-ribose and L-ribulose, which are rare sugars and non-abundant in nature. RbI from Acinetobacter sp. DL-28 has been shown to have D-lyxose isomerase activity of about 47% compared to L-ribose. Cellulomonas parahominis MB426 RbI has a broad substrate specificity and can also catalyze the isomerization between D-lyxose and D-xylulose, D-talose and D-tagatose, L-allose and L-psicose, L-gulose and L-sorbose, and D-mannose and D-fructose. RbI adopts a cupin-type structure and belongs to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold.	235
380445	cd20311	cupin_Yhhw_C	Escherichia coli YhhW and YhaK and related proteins, pirin-like bicupin, C-terminal cupin domain. This family includes the C-terminal domain of YhhW and YhaK, Escherichia coli pirin-like proteins with unknown function. YhhW is structurally similar not only to human pirin but also to quercetin 2,3-dioxygenase (quercetinase). Although the function of YhhW is not completely understood, YhhW and its human ortholog have quercetinase activity and are likely to play an important role in transcription and apoptosis. This C-terminal cupin-like domain has diverged considerably and has closer alignment with C-terminal pirin, while the N-terminal cupin domain of YhhW has a metal coordination site and is thought to have catalytic activity. YhaK is found in low abundance in the cytosol of E. coli and is strongly up-regulated by nitroso-glutathione (GSNO). There are major structural differences at the N-terminus of YhaK compared with YhhW; YhaK lacks the canonical cupin metal-binding residues of pirins and may be involved in chloride binding and/or sensing of oxidative stress in enterobacteria. YhaK showed no quercetinase or peroxidase activity; however, reduced YhaK was very sensitive to reactive oxygen species (ROS). Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold.	70
380745	cd20313	DSRM_DND1	double-stranded RNA binding motif of dead end protein homolog 1 (DND1) and similar proteins. DND1 (also known as dead end protein, or RNA-binding motif single-stranded-interacting protein 4 (RBMS4)) is an RNA-binding protein that is required for the survival of primordial germ cells (PGCs) and suppresses the formation of germ-cell tumors. DND1 binds a UU(A/U) trinucleotide motif predominantly in the 3' untranslated regions of mRNA, and destabilizes target mRNAs. It also counteracts the function of several microRNAs (miRNAs), which are inhibitors of gene expression, by binding mRNAs and prohibiting miRNAs from associating with their target sites. DND1 contains two RNA recognition motifs (RRMs) and a C-terminal double-stranded RNA binding motif (DSRM) that is not sequence specific, but highly specific for dsRNAs of various origin and structure.	80
380746	cd20314	DSRM_EIF2AK2	double-stranded RNA binding motif of eukaryotic translation initiation factor 2-alpha kinase 2 (EIF2AK2) and similar proteins. EIF2AK2 (EC 2.7.11.1/EC 2.7.10.2; also known as interferon-induced, double-stranded RNA-activated protein kinase, eIF-2A protein kinase 2, interferon-inducible RNA-dependent protein kinase, P1/eIF-2A protein kinase, protein kinase RNA-activated (PKR), protein kinase R, tyrosine-protein kinase EIF2AK2, or p68 kinase) acts as an IFN-induced dsRNA-dependent serine/threonine-protein kinase which plays a key role in the innate immune response to viral infection and is also involved in the regulation of signal transduction, apoptosis, cell proliferation and differentiation. EIF2AK2 proteins contain two to three double-stranded RNA binding motifs (DSRMs). DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	68
380747	cd20315	DSRM_mL44_subfamily	double-stranded RNA binding motif of mL44 subfamily proteins. The mitochondrion-specific ribosomal protein mL44 subfamily is composed of mitochondrial 54S ribosomal protein L3 (MRPL3) and mitochondrial 39S ribosomal protein L44 (MRPL44). MRPL3 (also known as mitochondrial large ribosomal subunit protein mL44) is a component of the mitochondrial ribosome (mitoribosome), a dedicated translation machinery responsible for the synthesis of mitochondrial genome-encoded proteins, including at least some of the essential transmembrane subunits of the mitochondrial respiratory chain. MRPL44 (also called L44mt, MRP-L44, or mitochondrial large ribosomal subunit protein mL44) is a component of the 39S subunit of mitochondrial ribosome. It may play a role in the assembly/stability of nascent mitochondrial polypeptides exiting the ribosome. Members of this family contain a RNase III-like domain and a double-stranded RNA binding motif (DSRM). DSRM is not sequence specific, but highly specific for dsRNAs of various origin and structure.	71
410556	cd20317	FXYD1	FXYD domain-containing ion transport regulator 1. FXYD domain-containing ion transport regulator 1 (FXYD1), also known as phospholemman (PLM), or sodium/potassium-transporting ATPase subunit FXYD1, associates with and regulates the activity of the sodium/potassium-transporting ATPase (NKA) which transports Na+ out of the cell and K+ into the cell. It is a plasma membrane substrate for several kinases, including protein kinase A, protein kinase C, NIMA kinase, and myotonic dystrophy kinase. It is thought to form an ion channel or regulate ion channel activity. Transcript variants with different 5' UTR sequences have been described in the literature.	64
410557	cd20318	FXYD2	FXYD domain-containing ion transport regulator 2. FXYD domain-containing ion transport regulator 2 (FXYD2), also known as sodium/potassium-transporting ATPase subunit gamma, or Na(+)/K(+) ATPase subunit gamma, or sodium pump gamma chain, is the regulatory subunit of the sodium/potassium-transporting ATPase (Na,K-ATPase). Na+,K+-ATPase is a heteromeric complex consisting of a large alpha-subunit, which is responsible for ATP hydrolysis, ion transport, and CTS binding, and a beta-subunit, acting as a chaperone. Although the Na,K-ATPase does not depend on the gamma subunit to be functional, it is thought that the gamma subunit modulates the enzyme's activity by inducing ion channel activity. Mutations in this gene have been associated with renal hypomagnesaemia.	43
410558	cd20322	FXYD4	FXYD domain-containing ion transport regulator 4. FXYD domain-containing ion transport regulator 4 (FXYD4), also known as CHIF (channel-inducing factor or corticosteroid hormone-induced factor), evokes K+ conductance in oocytes and is localized in the distal parts of the nephron and in the colon. CHIF, a putative K channel regulator, is regulated by aldosterone in the colon and by K+ intake in the kidney.	48
410559	cd20323	FXYD_FXYD5	FXYD domain of FXYD domain-containing ion transport regulator 5. FXYD domain-containing ion transport regulator 5 (FXYD5) is also called dysadherin in humans or related to ion channel (RIC) in mice. Two transcript variants have been found for this gene, and they are both predicted to encode the same protein. Dysadherin is the gamma subunit of the human Na,K-ATPase and is the only member that has a large extracellular sequence of 140 amino acids. Dysadherin has been observed to be over-expressed on the surface of cells that have down-regulated levels of surface E-cadherin. CCL2 (bone homing cytokine) is a protein that is highly affected by silencing dysadherin expression. Dysadherin interferes with cell adhesion via beta1 subunit interactions and is a target for an extracellular antibody drug conjugate where the antibody to dysadherin is attached to a cardiac glycoside. FXYD5 expression in mouse is mainly in the kidney, intestine, spleen, and lung. Confocal immunofluorescence microscopy of mouse kidney detected FXYD5 on basolateral membranes of connecting tubules, collecting tubules, intercalated cells of collecting duct, and on apical membranes in long thin limb of Henle loop.	48
410560	cd20324	FXYD6	FXYD domain-containing ion transport regulator 6. FXYD domain-containing ion transport regulator 6 (FXYD6) encodes the protein phosphohippolin and is located at 11q23.3. It can be found in all human tissues except blood. FXYD6 in humans is primarily in the brain, with highest levels of expression found in the prefrontal cortex, amygdala, hypothalamus, and occipital lobe. FXYD6 is up-regulated in hepatocellular carcinoma (HCC) and it enhances the migration and proliferation of HCC cells. Therapy targeting FXYD6 could potentially benefit the clinical treatment of HCC patients. FXYD6 is also associated with mental diseases. Mutations in the FXYD6 gene, or in sequences close to this gene, can predispose to schizophrenia which is known to be strongly heritable. FXYD6 was also found to be significantly downregulated in a Tg2576 mouse model of Alzheimer's disease (AD) brain and hippocampus. FXYD6 is a novel regulator of Na,K-ATPase expressed in the inner ear.	66
410561	cd20325	FXYD7	FXYD domain-containing ion transport regulator 7. FXYD domain-containing ion transport regulator 7 (FXYD7) has a potential splice variant with an additional 3 residues. In rats, expression of FXYD7 was restricted to the brain, with highest levels in the cerebrum, followed by brainstem, and hippocampus, and relatively weak expression in the hypothalamus. Immunofluorescence microscopy demonstrated colocalization with synaptophysin and modest colocalization with glial fibrillary acidic protein, indicating predominant expression in neurons and lower expression in astroglial cells. The FXYD7 gene maps to chromosome 19.	51
410562	cd20327	FXYD8	FXYD domain-containing ion transport regulator 8. FXYD domain-containing ion transport regulator 8 (FXYD8), also known as FXYD domain containing ion transport regulator 6 pseudogene 3 (FXYD6P3), is a member of the FXYD protein family that is involved in the modulation of NKA activity in the kidneys. The human FXYD8 gene is located on the X chromosome. However, the gene is located on chromosome 9 in the mouse and chromosome 8 in the rat.	93
410563	cd20328	FXYD3-like	FXYD domain-containing ion transport regulator 3 and similar proteins. This subfamily includes FXYD domain-containing ion transport regulator 3 (FXYD3), FXYD9, and FXYD10, also called PLMS/Phospholemman-like protein. FXYD3, also known as mammary tumor 8 kDa protein (MAT-8), or chloride conductance inducer protein Mat-8, or phospholemman-like (PLML), may function as a chloride channel or as a chloride channel regulator. It associates with and regulates the activity of the sodium/potassium-transporting ATPase (NKA) which transports Na+ out of the cell and K+ into the cell. Two transcript variants encode two different isoforms of the protein; in addition, transcripts utilizing alternative polyA signals have been described in the literature. Members here include mammals and reptiles. FXYD9 is present in teleosts including Danio rerio, Atlantic salmon, and Japanese medaka. In general, the FXYD9 isoform has the highest degree of conservation among the examined teleost species, indicating that it may be involved in physiological processes that are not evolving within this group of vertebrates. FXYD10, present in shark, associates with and modifies the activity of Na,K-ATPase in vitro through interactions mediated by its transmembrane and cytoplasmic C-terminal domains. It is important in the phosphorylation and potassium deocclusion reactions, which are known to be controlled by A domain movements. It is thought that FXYD10 interacts with the A domain of the shark Na,K-ATPase alpha-subunit.	51
410564	cd20329	FXYD11	FXYD domain-containing ion transport regulator 11. FXYD domain-containing ion transport regulator 11 (FXYD11) is a putative regulatory subunit of the Na(+)/K(+)-ATPase (NKA) pump. FXYD11 is expressed predominantly in the gills of euryhaline teleosts, such as the spotted scat, Scatophagus argus. It regulates NKA activity through protein-protein interactions. The regulation of NKA and FXYD11 is of critical importance for osmotic homeostasis. The expression and activity of NKA, as well as FXYD11 mRNA expression in gills have been shown to respond to different environmental salinity by dual-labeling immunohistochemistry and quantitative PCR (RT-qPCR) methods, indicating that there is an interaction between NKA and FXYD.	64
410565	cd20330	FXYD12	FXYD domain-containing ion transport regulator 12. The FXYD domain-containing ion transport regulator 12 (FXYD12) mRNA is mainly distributed in kidneys and intestines of fish. In co-immunoprecipitation experiments, FXYD12 was shown to associate with the Na(+)/K(+)-ATPase (NKA) alpha-subunit in the intestines of two closely related medakas, Oryzias dancena and O. latipes. These results suggest that FXYD12 may play a role in modulating NKA activity in the intestines following salinity changes in the maintenance of internal homeostasis.	53
380677	cd20331	JBP-like	oxygenase domain of uncharacterized bacterial and phage proteins similar to kinetoplastid J-binding protein (JBP) 1 and JBP2. J binding protein (JBP) 1 and JBP2 catalyze the first step of base J biosynthesis: the hydroxylation of thymine in DNA to form 5-hydroxymethyluracil (hmU). Base J (beta-d-glucopyranosyloxymethyluracil) is a hyper-modified DNA base found in the DNA of kinetoplastids (Trypanosoma brucei, Trypanosoma cruzi, and Leishmania). JBP1 and JBP2 each contain a J-DNA binding domain and this oxygenase domain. They belong to the TET/JBP family of dioxygenases that require Fe2+ and alpha-ketoglutarate (also known as 2-oxoglutarate) for activity.	270
380678	cd20332	JBP	J-binding protein. J binding protein (JBP) 1 and JBP2 catalyze the first step of base J biosynthesis: the hydroxylation of thymine in DNA to form 5-hydroxymethyluracil (hmU). Base J (beta-d-glucopyranosyloxymethyluracil) is a hyper-modified DNA base found in the DNA of kinetoplastids (Trypanosoma brucei, Trypanosoma cruzi, and Leishmania). JBP1 and JBP2 each contain a J-DNA binding domain and this oxygenase domain. They belong to the TET/JBP family of dioxygenases that require Fe2+ and alpha-ketoglutarate (also known as 2-oxoglutarate) for activity.	250
380474	cd20334	Cas13b	Class 2 type VI-B CRISPR-associated RNA-guided ribonuclease Cas13b. CRISPR-Cas (clustered regularly interspaced short palindromic repeats and CRISPR-associated proteins) adaptive immune systems defend microbes against foreign nucleic acids via RNA-guided endonucleases. These systems are divided into two classes; class 1 systems utilize multiple Cas proteins and CRISPR RNA (crRNA) to form an effector complex while class 2 systems employ a large, single effector with crRNA to mediate interference. Class 2 type VI CRISPR-Cas13 systems use a single enzyme to target RNA using a programmable crRNA guide and are divided into four subtypes based on the identity of the Cas13 protein (Cas13a-d). The Cas13 proteins are capable of both pre-crRNA processing and target RNA cleavage, which protect the host from phage attacks. Once bound to a target RNA, their non-specific RNase activity is activated. Cas13b has many distinctive features compared to the other Cas13 proteins, including the lack of significant sequence similarity, disparate crRNA repeat region, and double-sided protospacer flanking sequence (PFS)-dependent target RNA cleavage.	759
380669	cd20374	Pot1C	Protection Of Telomeres Protein 1 (POT1) C-terminal region. POT1 is part of shelterin, a hexameric nucleoprotein complex (comprising TRF1, TRF2, TIN2, RAP1, POT1 and TPP1 in humans) that protects telomeres, the physical ends of chromosomes. Shelterin protects against these ends being recognized as double-stranded DNA breaks, as well as against degradation of the telomeric overhang by endonucleases. It also helps control access of telomerase to the telomeric overhang, thereby affecting telomere length. This C-terminal region has an OB-fold domain and a Holliday junction resolvase (HJR) domain which make dimer contacts with TPP1.	286
380668	cd20378	PBP1_SBP-like	periplasmic substrate-binding domain of active transport proteins. Periplasmic substrate-binding domain of active transport proteins found in bacteria and Archaea. Members of this group are initial receptors in the process of active transport across cellular membrane, but their substrate specificities are not known in detail. However, they closely resemble the group of AmiC and active transport systems for short-chain amides and urea (FmdDEF), and thus are likely to exhibit a ligand-binding mode similar to that of the amide sensor protein AmiC from Pseudomonas aeruginosa. Moreover, this binding domain has high sequence identity to the family of hydrophobic amino acid transporters (HAAT), and thus it may also be involved in transport of amino acids.	357
410450	cd20379	Tudor_dTUD-like	Tudor domain found in Drosophila melanogaster maternal protein Tudor (dTUD) and similar proteins. dTUD is required during oogenesis for the formation of primordial germ cells and for normal abdominal segmentation. It contains 11 Tudor domains. The family also includes mitochondrial A-kinase anchor protein 1 (AKAP1) and Tudor domain-containing proteins (TDRDs). AKAP1, also called A-kinase anchor protein 149 kDa (AKAP 149), or dual specificity A-kinase-anchoring protein 1 (D-AKAP-1), or protein kinase A-anchoring protein 1 (PRKA1), or Spermatid A-kinase anchor protein 84 (S-AKAP84), is found in mitochondria and in the endoplasmic reticulum-nuclear envelope where it anchors protein kinases, phosphatases, and a phosphodiesterase. It regulates multiple cellular processes governing mitochondrial homeostasis and cell viability. AKAP1 binds to type I and II regulatory subunits of protein kinase A and anchors them to the cytoplasmic face of the mitochondrial outer membrane. TDRDs have diverse biological functions and may contain one or more copies of the Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	50
410451	cd20380	Tudor_TDRD13-like	Tudor domain found in Tudor domain-containing protein 13 (TDRD13) and similar proteins. The TDRD13 family includes TDRD13 and OTU domain-containing protein 4 (OTUD4). TDRD13, also called asparagine-linked glycosylation 13 (ALG13), glycosyltransferase 28 domain-containing protein 1 (GLT28D1), or UDP-N-acetylglucosamine transferase subunit ALG13, is a putative bifunctional UDP-N-acetylglucosamine transferase and deubiquitinase (EC 2.4.1.141/EC 3.4.19.12). It is a potential member of the Alg7p/Alg13p/Alg14p complex catalyzing the first two initial reactions in the N-glycosylation process. OTUD4, also called HIV-1-induced protein HIN-1, is a phospho-activated K63 deubiquitinase that hydrolyzes the isopeptide bond between the ubiquitin C-terminus and the lysine epsilon-amino group of the target protein. It may negatively regulate inflammatory and pathogen recognition signaling in innate immune response. Members of this family contain one Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	54
410452	cd20381	Tudor_LBR	Tudor domain found in Lamin-B receptor (LBR) and similar proteins. LBR, also called integral nuclear envelope inner membrane (INM) protein or LMN2R, is a nuclear envelope protein that anchors the lamina and the heterochromatin to the inner nuclear membrane, in cellular senescence induced by excess thymidine. It is also important for cholesterol biosynthesis. LBR can interact with chromodomain proteins and DNA. It contains one Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	51
410453	cd20382	Tudor_SETDB1_rpt1	first Tudor domain found in SET domain bifurcated 1 (SETDB1) and similar proteins. SETDB1, also called ERG-associated protein with SET domain (ESET), histone H3-K9 methyltransferase 4, H3-K9-HMTase 4, or lysine N-methyltransferase 1E (KMT1E), acts as a histone-lysine N-methyltransferase that specifically trimethylates 'Lys-9' of histone H3 (H3K9me3). It mainly functions in euchromatin regions, thereby playing a central role in the silencing of euchromatic genes. It contains two Tudor domains. This model corresponds to the first one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	82
410454	cd20383	Tudor_53BP1	Tudor domain found in tumor suppressor TP53-binding protein 1 (53BP1) and similar proteins. 53BP1, also called p53-binding protein 1 (p53BP1), is a double-strand break (DSB) repair protein involved in response to DNA damage, telomere dynamics, and class-switch recombination (CSR) during antibody genesis. It plays a key role in the repair of DSBs in response to DNA damage by promoting non-homologous end joining (NHEJ)-mediated repair of DSBs and specifically counteracting the function of the homologous recombination (HR) repair protein BRCA1. It is recruited to DSB sites by recognizing and binding histone H2A monoubiquitinated at 'Lys-15' (H2AK15Ub) and histone H4 dimethylated at 'Lys-20' (H4K20me2), two histone marks that are present at DSB sites. 53BP1 contains one Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	52
410455	cd20384	Tudor_ZGPAT	Tudor domain found in zinc finger CCCH-type with G patch domain-containing protein (ZGPAT) and similar proteins. ZGPAT, also called ZIP, G patch domain-containing protein 6 (GPATC6), GPATCH6, zinc finger CCCH domain-containing protein 9 (ZC3HDC9), ZC3H9, or zinc finger and G patch domain-containing protein, is a transcription repressor that specifically binds the 5'-GGAG[GA]A[GA]A-3' consensus sequence. It represses transcription by recruiting the chromatin multiprotein complex NuRD to target promoters. It contains one Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	55
410456	cd20385	Tudor_PCL	Tudor domain found in polycomb repressive complex 2 (PRC2)-associated polycomb-like (PCL) family proteins. The PCL family includes PHD finger protein 1 (PHF1) and its homologs, metal-response element-binding transcription factor 2 (MTF2/PCL2) and PHF19/PCL3, which are accessory components of the Polycomb repressive complex 2 (PRC2) core complex. Members contain an N-terminal Tudor domain followed by two PHD domains, and a C-terminal MTF2 domain. PCL proteins specifically recognize tri-methylated H3K36 (H3K36me3) through their N-terminal Tudor domains. The interaction between their Tudor domains and H3K36me3 is critical for both the targeting and spreading of PRC2 into active chromatin regions and for the maintenance of optimal repression of poised developmental genes where PCL proteins, H3K36me3, and H3K27me3 coexist. Moreover, unlike other PHD domain-containing proteins, the first PHD domains of PCL proteins do not display histone H3K4 binding affinity and they do not affect the binding of the Tudor domain to histones.	54
410457	cd20386	Tudor_PHF20-like	Tudor domain found in PHD finger protein 20 (PHF20), PHF20-like protein 1 (PHF20L1), and similar proteins. PHF20, also called Glioma-expressed antigen 2, hepatocellular carcinoma-associated antigen 58, novel zinc finger protein, or transcription factor TZP (referring to Tudor and zinc finger domain containing protein), is a regulator of NF-kappaB activation by disrupting recruitment of PP2A to p65. It also functions as a transcription factor that binds to Akt and plays a role in Akt cell survival/growth signaling. Moreover, it transcriptionally regulates p53. The phosphorylation of PHF20 on Ser291 mediated by protein kinase B (PKB) is essential in tumorigenesis via the regulation of p53-mediated signaling. PHF20L1 is an active malignant brain tumor (MBT) domain-containing protein that binds to monomethylated lysine 142 on DNA (cytosine-5) Methyltransferase 1 (DNMT1) (DNMT1K142me1) and colocalizes at the perinucleolar space in a SET7-dependent manner. Both PHF20 and PHF20L1 contain an N-terminal malignant brain tumor (MBT) domain, a Tudor domain, a plant homeodomain (PHD) finger and putative DNA-binding domains AT hook and C2H2-type zinc finger. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	50
410458	cd20387	Tudor_UHRF_rpt1	first Tudor domain found in the UHRF (ubiquitin-like PHD and RING finger domain-containing protein) family. The UHRF family includes UHRF1 and UHRF2. UHRF1 is a unique chromatin effector protein that integrates the recognition of both histone PTMs and DNA methylation. It is essential for cell proliferation and plays a critical role in the development and progression of many human carcinomas, such as laryngeal squamous cell carcinoma (LSCC), gastric cancer (GC), esophageal squamous cell carcinoma (ESCC), colorectal cancer, prostate cancer, and breast cancer. UHRF1 acts as a transcriptional repressor through its binding to histone H3 when it is unmodified at Arg2. Its overexpression in human lung fibroblasts results in downregulation of expression of the tumour suppressor pRB. It also plays a role in transcriptional repression of the cell cycle regulator p21. It is also an N-methylpurine DNA glycosylase (MPG)-interacting protein that binds MPG in a p53 status-independent manner in the DNA base excision repair (BER) pathway. In addition, UHRF1 functions as an epigenetic regulator that is important for multiple aspects of epigenetic regulation, including maintenance of DNA methylation patterns and recognition of various histone modifications. UHRF2 was originally identified as a ubiquitin ligase acting as a small ubiquitin-like modifier (SUMO) E3 ligase that enhances zinc finger protein 131 (ZNF131) SUMOylation but does not enhance ZNF131 ubiquitination. It also ubiquitinates PCNP, a PEST-containing nuclear protein. Moreover, UHRF2 functions as a nuclear protein involved in cell-cycle regulation and has been implicated in tumorigenesis. It interacts with cyclins, CDKs, p53, pRB, PCNA, HDAC1, DNMTs, G9a, methylated histone H3 lysine 9, and methylated DNA. It interacts with the cyclin E-CDK2 complex, ubiquitinates cyclins D1 and E1, induces G1 arrest, and is involved in the G1/S transition regulation. Both UHRF1 and UHRF2 contain an N-terminal ubiquitin-like domain (UBL), a tandem Tudor domain (TTD), a plant homeodomain (PHD) finger, a SET- and RING-associated (SRA) domain, and a C-terminal RING finger. The model corresponds to the first Tudor domain. The tandem Tudor domain directs binding of UHRF to the heterochromatin mark histone H3K9me3.	73
410459	cd20388	Tudor_UHRF_rpt2	second Tudor domain found in the UHRF (ubiquitin-like PHD and RING finger domain-containing protein) family. The UHRF family includes UHRF1 and UHRF2. UHRF1 is a unique chromatin effector protein that integrates the recognition of both histone PTMs and DNA methylation. It is essential for cell proliferation and plays a critical role in the development and progression of many human carcinomas, such as laryngeal squamous cell carcinoma (LSCC), gastric cancer (GC), esophageal squamous cell carcinoma (ESCC), colorectal cancer, prostate cancer, and breast cancer. UHRF1 acts as a transcriptional repressor through its binding to histone H3 when it is unmodified at Arg2. Its overexpression in human lung fibroblasts results in downregulation of expression of the tumour suppressor pRB. It also plays a role in transcriptional repression of the cell cycle regulator p21. It is also an N-methylpurine DNA glycosylase (MPG)-interacting protein that binds MPG in a p53 status-independent manner in the DNA base excision repair (BER) pathway. In addition, UHRF1 functions as an epigenetic regulator that is important for multiple aspects of epigenetic regulation, including maintenance of DNA methylation patterns and recognition of various histone modifications. UHRF2 was originally identified as a ubiquitin ligase acting as a small ubiquitin-like modifier (SUMO) E3 ligase that enhances zinc finger protein 131 (ZNF131) SUMOylation but does not enhance ZNF131 ubiquitination. It also ubiquitinates PCNP, a PEST-containing nuclear protein. Moreover, UHRF2 functions as a nuclear protein involved in cell-cycle regulation and has been implicated in tumorigenesis. It interacts with cyclins, CDKs, p53, pRB, PCNA, HDAC1, DNMTs, G9a, methylated histone H3 lysine 9, and methylated DNA. It interacts with the cyclin E-CDK2 complex, ubiquitinates cyclins D1 and E1, induces G1 arrest, and is involved in the G1/S transition regulation. Both UHRF1 and UHRF2 contain an N-terminal ubiquitin-like domain (UBL), a tandem Tudor domain (TTD), a plant homeodomain (PHD) finger, a SET- and RING-associated (SRA) domain, and a C-terminal RING finger. The model corresponds to the second Tudor domain. The tandem Tudor domain directs binding of UHRF to the heterochromatin mark histone H3K9me3.	72
410460	cd20389	Tudor_ARID4_rpt1	first Tudor domain found in AT-rich interactive domain-containing protein ARID4 family. The family contains ARID4A and its paralog ARID4B, both of which are retinoblastoma (RB)-binding proteins that function as coactivators to enhance the androgen receptor (AR) and RB transcriptional activity, and play important roles in the AR and RB pathways to control male fertility. They also act as leukemia and tumor suppressors involved in epigenetic regulation in leukemia and Prader-Willi/Angelman syndromes. Moreover, they associate with the mSIN3A histone deacetylase (HDAC) chromatin remodeling complex through their interaction with each other, as well as with the breast cancer associated tumor suppressor ING1 and the breast cancer metastasis suppressor BRMS1. Both ARID4A and ARID4B contain tandem Tudor domains, a PWWP domain (also known as HATH domain or RBB1NT domain), an AT-rich DNA-interacting domain (ARID, also known as BRIGHT), a chromobarrel domain, and a C-terminal R2 domain. The model corresponds to the first Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	53
410461	cd20390	Tudor_ARID4_rpt2	second Tudor domain found in AT-rich interactive domain-containing protein ARID4 family. The family contains ARID4A and its paralog ARID4B, both of which are retinoblastoma (RB)-binding proteins that function as coactivators to enhance the androgen receptor (AR) and RB transcriptional activity, and play important roles in the AR and RB pathways to control male fertility. They also act as leukemia and tumor suppressors involved in epigenetic regulation in leukemia and Prader-Willi/Angelman syndromes. Moreover, they associate with the mSIN3A histone deacetylase (HDAC) chromatin remodeling complex through their interaction with each other, as well as with the breast cancer associated tumor suppressor ING1 and the breast cancer metastasis suppressor BRMS1. Both ARID4A and ARID4B contain tandem Tudor domains, a PWWP domain (also known as HATH domain or RBB1NT domain), an AT-rich DNA-interacting domain (ARID, also known as BRIGHT), a chromobarrel domain, and a C-terminal R2 domain. The model corresponds to the second Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	53
410462	cd20391	Tudor_JMJD2_rpt1	first Tudor domain found in Jumonji domain-containing protein 2 (JMJD2) family of histone demethylases. JMJD2 proteins, also called lysine-specific demethylase 4 histone demethylases (KDM4), have been implicated in various cellular processes including DNA damage response, transcription, cell cycle regulation, cellular differentiation, senescence, and carcinogenesis. They selectively catalyze the demethylation of di- and trimethylated H3K9 and H3K36. This model contains only three JMJD2 proteins, JMJD2A-C, which all contain jmjN and jmjC domains in the N-terminal region, followed by a canonical PHD domain, a noncanonical extended PHD domain, and tandem Tudor domains. The model corresponds to the first Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. JMJD2D is not included in this model, since it lacks both the PHD and Tudor domains and has a different substrate specificity. JMJD2A-C are required for efficient cancer cell growth.	53
410463	cd20392	Tudor_JMJD2_rpt2	second Tudor domain found in Jumonji domain-containing protein 2 (JMJD2) family of histone demethylases. JMJD2 proteins, also called lysine-specific demethylase 4 histone demethylases (KDM4), have been implicated in various cellular processes including DNA damage response, transcription, cell cycle regulation, cellular differentiation, senescence, and carcinogenesis. They selectively catalyze the demethylation of di- and trimethylated H3K9 and H3K36. This model contains only three JMJD2 proteins, JMJD2A-C, which all contain jmjN and jmjC domains in the N-terminal region, followed by a canonical PHD domain, a noncanonical extended PHD domain, and tandem Tudor domains. The model corresponds to the second Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. JMJD2D is not included in this model, since it lacks both the PHD and Tudor domains and has a different substrate specificity. JMJD2A-C are required for efficient cancer cell growth.	56
410464	cd20393	Tudor_SGF29_rpt1	first Tudor domain found in SAGA-associated factor 29 (SGF29) and similar proteins. SGF29, also called coiled-coil domain-containing protein 101, or SAGA complex-associated factor 29, is a chromatin reader component of some histone acetyltransferase (HAT) SAGA-type complexes, like the TFTC-HAT, ATAC or STAGA complexes. It specifically recognizes and binds methylated 'Lys-4' of histone H3 (H3K4me), with a preference for the trimethylated form (H3K4me3). SGF29 contains two Tudor domains. The model corresponds to the first one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	67
410465	cd20394	Tudor_SGF29_rpt2	second Tudor domain found in SAGA-associated factor 29 (SGF29) and similar proteins. SGF29, also called coiled-coil domain-containing protein 101, or SAGA complex-associated factor 29, is a chromatin reader component of some histone acetyltransferase (HAT) SAGA-type complexes, like the TFTC-HAT, ATAC or STAGA complexes. It specifically recognizes and binds methylated 'Lys-4' of histone H3 (H3K4me), with a preference for the trimethylated form (H3K4me3). SGF29 contains two Tudor domains. The model corresponds to the second one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	60
410466	cd20395	Tudor_SpCrb2-like_rpt1	first Tudor domain found in Schizosaccharomyces pombe Cut5-repeat binding protein 2 (Crb2) and similar proteins. Crb2, also called RAD9 protein homolog, or checkpoint mediator protein crb2, is a DNA repair protein essential for cell cycle arrest at the G1 and G2 stages following DNA damage by X- and UV-irradiation, or inactivation of DNA ligase. Crb2 contains two Tudor domains. The model corresponds to the first one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	50
410467	cd20396	Tudor_SpCrb2-like_rpt2	second Tudor domain found in Schizosaccharomyces pombe Cut5-repeat binding protein 2 (Crb2) and similar proteins. Crb2, also called RAD9 protein homolog, or checkpoint mediator protein crb2, is a DNA repair protein essential for cell cycle arrest at the G1 and G2 stages following DNA damage by X- and UV-irradiation, or inactivation of DNA ligase. Crb2 contains two Tudor domains. The model corresponds to the second one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	73
410468	cd20397	Tudor_BAHCC1-like	Tudor domain found in the BAH and coiled-coil domain-containing protein 1 (BAHCC1) family. The family of BAHCC1 includes BAHCC1 and trinucleotide repeat-containing gene 18 protein (TNRC18). BAHCC1 may function as a transcriptional regulator. The biological function of TNRC18 remains unclear. Members of this family contain one Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	67
410469	cd20398	Tudor_SMN	Tudor domain found in survival motor neuron protein (SMN) and similar proteins. SMN, also called component of gems 1, or Gemin-1, is part of a multimeric SMN complex that includes spliceosomal Sm core proteins and plays a catalytic role in the assembly of small nuclear ribonucleoproteins (snRNPs), the building blocks of the spliceosome. Mutations in human SMN lead to motor neuron degeneration and spinal muscular atrophy. SMN contains a central, highly conserved Tudor domain that is required for U snRNP assembly and Sm protein binding and has been shown to bind arginine-glycine-rich motifs in a methylarginine-dependent manner.	56
410470	cd20399	Tudor_SPF30	Tudor domain found in survival of motor neuron-related-splicing factor 30 (SPF30) and similar proteins. SPF30, also called 30 kDa splicing factor SMNrp, SMN-related protein, or survival motor neuron domain-containing protein 1 (SMNDC1), is an essential pre-mRNA splicing factor required for assembly of the U4/U5/U6 tri-small nuclear ribonucleoprotein into the spliceosome. Overexpression of SPF30 causes apoptosis. It contains one Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	55
410471	cd20400	Tudor_ERCC6L2	Tudor domain found in DNA excision repair protein ERCC-6-like 2 (ERCC6L2) and similar proteins. ERCC6L2, also called DNA repair and recombination protein RAD26-like (RAD26L), may be involved in early DNA damage response. It regulates RNA Pol II-mediated transcription via its interaction with DNA-dependent protein kinase (DNA-PK) to resolve R loops and minimize transcription-associated genome instability. ERCC6L2 gene mutations have been associated with bone marrow failure that includes developmental delay and microcephaly. It contains an N-terminal Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	59
410472	cd20401	Tudor_AtPTM-like	Tudor domain found in Arabidopsis thaliana DDT domain-containing protein PTM (AtPTM), Dirigent protein 17 (AtDIR17), and similar proteins. This family includes AtPTM and AtDIR17. AtPTM, also called DDT domain-containing protein 1, or PHD type transcription factor with transmembrane domains, is a membrane-bound transcription factor required for plastid-to-nucleus retrograde signaling. AtDIR17 imparts stereoselectivity on the phenoxy radical-coupling reaction, yielding optically active lignans from two molecules of coniferyl alcohol in the biosynthesis of lignans, flavonolignans, and alkaloids, and thus plays a central role in plant secondary metabolism. Members of this family contain one Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	50
410473	cd20402	Tudor_Agenet_FMRP-like_rpt1	first Tudor-like Agenet domain found in the fragile X mental retardation protein (FMRP) family. The FMRP family includes synaptic functional regulator FMR1, fragile X mental retardation syndrome-related protein 1 (FXR1) and 2 (FXR2). FMR1, also called fragile X mental retardation protein 1 (FMRP), is a multifunctional polyribosome-associated RNA-binding protein that plays a central role in neuronal development and synaptic plasticity through the regulation of alternative mRNA splicing, mRNA stability, mRNA dendritic transport and postsynaptic local protein synthesis of a subset of mRNAs. FXR1 and FXR2 are RNA-binding proteins that shuttle between the nucleus and cytoplasm and associate with polyribosomes, predominantly with the 60S ribosomal subunit. Members of this family contain two copies of the Tudor-like Agenet domain. The model corresponds to the first one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	50
410474	cd20403	Tudor_Agenet_FMRP-like_rpt2	second Tudor-like Agenet domain found in the fragile X mental retardation protein (FMRP) family. The FMRP family includes synaptic functional regulator FMR1, fragile X mental retardation syndrome-related protein 1 (FXR1) and 2 (FXR2). FMR1, also called fragile X mental retardation protein 1 (FMRP), is a multifunctional polyribosome-associated RNA-binding protein that plays a central role in neuronal development and synaptic plasticity through the regulation of alternative mRNA splicing, mRNA stability, mRNA dendritic transport and postsynaptic local protein synthesis of a subset of mRNAs. FXR1 and FXR2 are RNA-binding proteins that shuttle between the nucleus and cytoplasm and associate with polyribosomes, predominantly with the 60S ribosomal subunit. Members of this family contain two copies of the Tudor-like Agenet domain. The model corresponds to the second one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	50
410475	cd20404	Tudor_Agenet_AtEML-like	Tudor-like Agenet domain found in Arabidopsis thaliana proteins EMSY-LIKE 1-4 (AtEML1-4) and similar proteins. This family includes Arabidopsis thaliana proteins EMSY-LIKE 1-4 (AtEML1-4), histone-lysine N-methyltransferase trithorax-like proteins ATX1-2 (AtATX1-2), histone-lysine N-methyltransferase ASHH3, DNA mismatch repair protein MSH6, and similar proteins. EMSY-like proteins contain an EMSY N-terminal domain, a central Tudor-like Agenet domain, and a C-terminal coiled-coil motif. AtEML1, AtEML2, and likely AtEML4, contribute to RPP7-mediated immunity. Besides this, AtEML1 and AtEML2 participate in a second EDM2-dependent function and affect floral transition. ATX-like proteins are plant counterparts of the Drosophila melanogaster trithorax (TRX) and mammalian mixed-lineage leukemia (MLL1) proteins. ATX1, also called protein SET domain group 27, or trithorax-homolog protein 1 (TRX-homolog protein 1), is a methyltransferase that trimethylates histone H3 at lysine 4 (H3K4me3). It also acts as a histone modifier and as a positive effector of gene expression. ATX1 regulates transcription from diverse classes of genes implicated in biotic and abiotic stress responses. It is involved in dehydration stress signaling in both abscisic acid (ABA)-dependent and ABA-independent pathways. ATX2, also called protein SET domain group 30, or trithorax-homolog protein 2 (TRX-homolog protein 2), is involved in dimethylating histone H3 at lysine 4 (H3K4me2). Both ATX1 and ATX2 are multi-domain proteins that consist of an N-terminal Tudor-like Agenet domain, a PWWP domain, FYRN- and FYRC (DAST, domain associated with SET in trithorax) domains, a canonical plant homeodomain (PHD) domain, a non-canonical extended PHD (ePHD) domain, and a C-terminal SET domain. ASHH3, also called protein SET DOMAIN GROUP 7, functions as a histone-lysine N-methyltransferase (EC 2.1.1.43). It contains a SET domain and a Tudor-like Agenet domain. AtMSH6, also called MutS protein homolog 6, is a component of the post-replicative DNA mismatch repair system (MMR). It forms a heterodimer with MSH2, MutS alpha (MSH2-MSH6 heterodimer), which binds to DNA mismatches, thereby initiating DNA repair. AtMSH6 contains a Tudor-like Agenet domain and a MutS domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	51
410476	cd20405	Tudor_Agenet_AtDUF_rpt1_3	first and third Tudor-like Agenet domains found in a family of Arabidopsis thaliana DUF724 domain-containing proteins (AtDUFs). The family includes a group of AtDUFs (AtDUF1-3 and AtDUF6-8) that may be involved in the polar growth of plant cells via transportation of RNAs. Members of this family have four Tudor-like Agenet domains, except for AtDUF8, which contains only two copies of the Tudor-like Agenet domain. AtDUF4 and AtDUF5 are not included here due to the lack of a Tudor-like Agenet domain. The model corresponds to the first and third Tudor-like Agenet domains in AtDUF1-3 and AtDUF6-7, as well as the first Tudor-like Agenet domain in AtDUF8. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	65
410477	cd20406	Tudor_Agenet_AtDUF_rpt2_4	second and fourth Tudor-like Agenet domains found in a family of Arabidopsis thaliana DUF724 domain-containing proteins (AtDUFs). The family includes a group of AtDUFs (AtDUF1-3 and AtDUF6-8) that may be involved in the polar growth of plant cells via transportation of RNAs. Members of this family have four Tudor-like Agenet domains, except for AtDUF8, which contains only two copies of the Tudor-like Agenet domain. AtDUF4 and AtDUF5 are not included here due to the lack of a Tudor-like Agenet domain. The model corresponds to the second and fourth Tudor-like Agenet domains in AtDUF1-3 and AtDUF6-7, as well as the second Tudor-like Agenet domain in AtDUF8. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	47
410478	cd20407	Tudor_AKAP1	Tudor domain found in mitochondrial A-kinase anchor protein 1 (AKAP1) and similar proteins. AKAP1, also called A-kinase anchor protein 149 kDa (AKAP 149), dual specificity A-kinase-anchoring protein 1 (D-AKAP-1), protein kinase A-anchoring protein 1 (PRKA1), or Spermatid A-kinase anchor protein 84 (S-AKAP84), is found in mitochondria and in the endoplasmic reticulum-nuclear envelope, where it anchors protein kinases, phosphatases, and a phosphodiesterase. It regulates multiple cellular processes governing mitochondrial homeostasis and cell viability. AKAP1 binds to type I and II regulatory subunits of protein kinase A and anchors them to the cytoplasmic face of the mitochondrial outer membrane. It contains a C-terminal Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	76
410479	cd20408	Tudor_TDRD1_rpt1	first Tudor domain found in Tudor domain-containing protein 1 (TDRD1) and similar proteins. TDRD1, also called cancer/testis antigen 41.1 (CT41.1), plays a central role during spermatogenesis by participating in the repression of transposable elements and preventing their mobilization, which is essential for germline integrity. It acts via the piRNA metabolic process, which mediates the repression of transposable elements during meiosis by forming complexes composed of piRNAs and Piwi proteins, and governs the methylation and subsequent repression of transposons. TDRD1 contains four Tudor domains. This model corresponds to the first one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	130
410480	cd20409	Tudor_TDRD1_rpt2	second Tudor domain found in Tudor domain-containing protein 1 (TDRD1) and similar proteins. TDRD1, also called cancer/testis antigen 41.1 (CT41.1), plays a central role during spermatogenesis by participating in the repression of transposable elements and preventing their mobilization, which is essential for germline integrity. It acts via the piRNA metabolic process, which mediates the repression of transposable elements during meiosis by forming complexes composed of piRNAs and Piwi proteins, and governs the methylation and subsequent repression of transposons. TDRD1 contains four Tudor domains. This model corresponds to the second one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	82
410481	cd20410	Tudor_TDRD1_rpt3	third Tudor domain found in Tudor domain-containing protein 1 (TDRD1) and similar proteins. TDRD1, also called cancer/testis antigen 41.1 (CT41.1), plays a central role during spermatogenesis by participating in the repression of transposable elements and preventing their mobilization, which is essential for germline integrity. It acts via the piRNA metabolic process, which mediates the repression of transposable elements during meiosis by forming complexes composed of piRNAs and Piwi proteins, and governs the methylation and subsequent repression of transposons. TDRD1 contains four Tudor domains. This model corresponds to the third one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	59
410482	cd20411	Tudor_TDRD1_rpt4	fourth Tudor domain found in Tudor domain-containing protein 1 (TDRD1) and similar proteins. TDRD1, also called cancer/testis antigen 41.1 (CT41.1), plays a central role during spermatogenesis by participating in the repression of transposable elements and preventing their mobilization, which is essential for germline integrity. It acts via the piRNA metabolic process, which mediates the repression of transposable elements during meiosis by forming complexes composed of piRNAs and Piwi proteins, and governs the methylation and subsequent repression of transposons. TDRD1 contains four Tudor domains. This model corresponds to the fourth one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	116
410483	cd20412	Tudor_TDRD2	Tudor domain found in Tudor domain-containing protein 2 (TDRD2) and similar proteins. TDRD2, also called Tudor and KH domain-containing protein (TDRKH), participates in the primary piwi-interacting RNA (piRNA) biogenesis pathway and is required during spermatogenesis to repress transposable elements and prevent their mobilization, which is essential for germline integrity. The family also includes the TDRD2 homolog found in Drosophila melanogaster (dTDRKH), which is also called partner of PIWIs protein, or PAPI, and is involved in Zucchini-mediated piRNA 3'-end maturation. TDRD2 contains two KH domains and one Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	95
410484	cd20413	Tudor_TDRD3	Tudor domain found in Tudor domain-containing protein 3 (TDRD3) and similar proteins. TDRD3 is a scaffolding protein that specifically recognizes and binds dimethylarginine-containing proteins. In the nucleus, it acts as a coactivator; it recognizes and binds asymmetric dimethylation on the core histone tails associated with transcriptional activation (H3R17me2a and H4R3me2a) and recruits proteins at these arginine-methylated loci. In the cytoplasm, it may play a role in the assembly and/or disassembly of mRNA stress granules and in the regulation of translation of target mRNAs by binding Arg/Gly-rich motifs (GAR) in dimethylarginine-containing proteins. TDRD3 contains one Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	53
410485	cd20414	Tudor_TDRD4_rpt1	first Tudor domain found in Tudor domain-containing protein 4 (TDRD4) and similar proteins. TDRD4, also called RING finger protein 17 (RNF17), is a component of the mammalian germ cell nuage and is essential for spermiogenesis. It seems to be involved in the regulation of transcriptional activity of MYC. In vitro, TDRD4 inhibits the DNA-binding activity of Mad-MAX heterodimers. It can recruit Mad transcriptional repressors (MXD1, MXD3, MXD4 and MXI1) to the cytoplasm. TDRD4 also acts as a potential cancer/testis antigen in liver cancer. TDRD4 contains a RING finger and five Tudor domains. This model corresponds to the first one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	77
410486	cd20415	Tudor_TDRD4_rpt2	second Tudor domain found in Tudor domain-containing protein 4 (TDRD4) and similar proteins. TDRD4, also called RING finger protein 17 (RNF17), is a component of the mammalian germ cell nuage and is essential for spermiogenesis. It seems to be involved in the regulation of transcriptional activity of MYC. In vitro, TDRD4 inhibits the DNA-binding activity of Mad-MAX heterodimers. It can recruit Mad transcriptional repressors (MXD1, MXD3, MXD4 and MXI1) to the cytoplasm. TDRD4 also acts as a potential cancer/testis antigen in liver cancer. TDRD4 contains a RING finger and five Tudor domains. This model corresponds to the second one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	96
410487	cd20416	Tudor_TDRD4_rpt3	third Tudor domain found in Tudor domain-containing protein 4 (TDRD4) and similar proteins. TDRD4, also called RING finger protein 17 (RNF17), is a component of the mammalian germ cell nuage and is essential for spermiogenesis. It seems to be involved in the regulation of transcriptional activity of MYC. In vitro, TDRD4 inhibits the DNA-binding activity of Mad-MAX heterodimers. It can recruit Mad transcriptional repressors (MXD1, MXD3, MXD4 and MXI1) to the cytoplasm. TDRD4 also acts as a potential cancer/testis antigen in liver cancer. TDRD4 contains a RING finger and five Tudor domains. This model corresponds to the third one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	82
410488	cd20417	Tudor_TDRD4_rpt4	fourth Tudor domain found in Tudor domain-containing protein 4 (TDRD4) and similar proteins. TDRD4, also called RING finger protein 17 (RNF17), is a component of the mammalian germ cell nuage and is essential for spermiogenesis. It seems to be involved in the regulation of transcriptional activity of MYC. In vitro, TDRD4 inhibits the DNA-binding activity of Mad-MAX heterodimers. It can recruit Mad transcriptional repressors (MXD1, MXD3, MXD4 and MXI1) to the cytoplasm. TDRD4 also acts as a potential cancer/testis antigen in liver cancer. TDRD4 contains a RING finger and five Tudor domains. This model corresponds to the fourth one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	68
410489	cd20418	Tudor_TDRD4_rpt5	fifth Tudor domain found in Tudor domain-containing protein 4 (TDRD4) and similar proteins. TDRD4, also called RING finger protein 17 (RNF17), is a component of the mammalian germ cell nuage and is essential for spermiogenesis. It seems to be involved in the regulation of transcriptional activity of MYC. In vitro, TDRD4 inhibits the DNA-binding activity of Mad-MAX heterodimers. It can recruit Mad transcriptional repressors (MXD1, MXD3, MXD4 and MXI1) to the cytoplasm. TDRD4 also acts as a potential cancer/testis antigen in liver cancer. TDRD4 contains a RING finger and five Tudor domains. This model corresponds to the fifth one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	105
410490	cd20419	Tudor_TDRD5	Tudor domain found in Tudor domain-containing protein 5 (TDRD5) and similar proteins. TDRD5 is an RNA-binding protein directly associated with piRNA precursors. It is required for retrotransposon silencing, chromatoid body assembly, and spermiogenesis. TDRD5 participates in the repression of transposable elements and prevents their mobilization, which is essential for germline integrity. TDRD5 contains three LOTUS domains and one Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	118
410491	cd20420	Tudor_TDRD6_rpt1	first Tudor domain found in Tudor domain-containing protein 6 (TDRD6) and similar proteins. TDRD6, also called antigen NY-CO-45 or cancer/testis antigen 41.2 (CT41.2), is a testis-specific protein that localizes to the chromatoid bodies in germ cells and is involved in spermiogenesis, chromatoid body formation, and proper precursor and mature miRNA expression. Mutations in TDRD6 may be associated with human male infertility and early embryonic lethality. TDRD6 contains seven Tudor domains. This model corresponds to the first one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	132
410492	cd20421	Tudor_TDRD6_rpt2	second Tudor domain found in Tudor domain-containing protein 6 (TDRD6) and similar proteins. TDRD6, also called antigen NY-CO-45 or cancer/testis antigen 41.2 (CT41.2), is a testis-specific protein that localizes to the chromatoid bodies in germ cells and is involved in spermiogenesis, chromatoid body formation, and proper precursor and mature miRNA expression. Mutations in TDRD6 may be associated with human male infertility and early embryonic lethality. TDRD6 contains seven Tudor domains. This model corresponds to the second one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	130
410493	cd20422	Tudor_TDRD6_rpt3	third Tudor domain found in Tudor domain-containing protein 6 (TDRD6) and similar proteins. TDRD6, also called antigen NY-CO-45 or cancer/testis antigen 41.2 (CT41.2), is a testis-specific protein that localizes to the chromatoid bodies in germ cells and is involved in spermiogenesis, chromatoid body formation, and proper precursor and mature miRNA expression. Mutations in TDRD6 may be associated with human male infertility and early embryonic lethality. TDRD6 contains seven Tudor domains. This model corresponds to the third one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	135
410494	cd20423	Tudor_TDRD6_rpt4	fourth Tudor domain found in Tudor domain-containing protein 6 (TDRD6) and similar proteins. TDRD6, also called antigen NY-CO-45 or cancer/testis antigen 41.2 (CT41.2), is a testis-specific protein that localizes to the chromatoid bodies in germ cells and is involved in spermiogenesis, chromatoid body formation, and proper precursor and mature miRNA expression. Mutations in TDRD6 may be associated with human male infertility and early embryonic lethality. TDRD6 contains seven Tudor domains. This model corresponds to the fourth one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	80
410495	cd20424	Tudor_TDRD6_rpt5	fifth Tudor domain found in Tudor domain-containing protein 6 (TDRD6) and similar proteins. TDRD6, also called antigen NY-CO-45 or cancer/testis antigen 41.2 (CT41.2), is a testis-specific protein that localizes to the chromatoid bodies in germ cells and is involved in spermiogenesis, chromatoid body formation, and proper precursor and mature miRNA expression. Mutations in TDRD6 may be associated with human male infertility and early embryonic lethality. TDRD6 contains seven Tudor domains. This model corresponds to the fifth one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	126
410496	cd20425	Tudor_TDRD6_rpt6	sixth Tudor domain found in Tudor domain-containing protein 6 (TDRD6) and similar proteins. TDRD6, also called antigen NY-CO-45 or cancer/testis antigen 41.2 (CT41.2), is a testis-specific protein that localizes to the chromatoid bodies in germ cells and is involved in spermiogenesis, chromatoid body formation, and proper precursor and mature miRNA expression. Mutations in TDRD6 may be associated with human male infertility and early embryonic lethality. TDRD6 contains seven Tudor domains. This model corresponds to the sixth one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	115
410497	cd20426	Tudor_TDRD6_rpt7	seventh Tudor domain found in Tudor domain-containing protein 6 (TDRD6) and similar proteins. TDRD6, also called antigen NY-CO-45 or cancer/testis antigen 41.2 (CT41.2), is a testis-specific protein that localizes to the chromatoid bodies in germ cells and is involved in spermiogenesis, chromatoid body formation, and proper precursor and mature miRNA expression. Mutations in TDRD6 may be associated with human male infertility and early embryonic lethality. TDRD6 contains seven Tudor domains. This model corresponds to the seventh one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	140
410498	cd20427	Tudor_TDRD7_rpt1	first Tudor domain found in Tudor domain-containing protein 7 (TDRD7) and similar proteins. TDRD7, also called PCTAIRE2-binding protein, or Tudor repeat associator with PCTAIRE-2 (Trap), is a component of specific cytoplasmic RNA granules involved in post-transcriptional regulation of specific genes; it probably acts by binding to specific mRNAs and regulating their translation. It is required for lens transparency during lens development, regulating translation of genes such as CRYBB3 and HSPB1 in the developing lens. It is also essential for dynamic ribonucleoprotein (RNP) remodeling of chromatoid bodies during spermatogenesis. TDRD7 contains three Tudor domains. The model corresponds to the first one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	98
410499	cd20428	Tudor_TDRD7_rpt2	second Tudor domain found in Tudor domain-containing protein 7 (TDRD7) and similar proteins. TDRD7, also called PCTAIRE2-binding protein, or Tudor repeat associator with PCTAIRE-2 (Trap), is a component of specific cytoplasmic RNA granules involved in post-transcriptional regulation of specific genes: probably acts by binding to specific mRNAs and regulating their translation. It is required for lens transparency during lens development, by regulating translation of genes such as CRYBB3 and HSPB1 in the developing lens. It is also essential for dynamic ribonucleoprotein (RNP) remodeling of chromatoid bodies during spermatogenesis. TDRD7 contains three Tudor domains. The model corresponds to the second one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	140
410500	cd20429	Tudor_TDRD7_rpt3	third Tudor domain found in Tudor domain-containing protein 7 (TDRD7) and similar proteins. TDRD7, also called PCTAIRE2-binding protein, or Tudor repeat associator with PCTAIRE-2 (Trap), is a component of specific cytoplasmic RNA granules involved in post-transcriptional regulation of specific genes: probably acts by binding to specific mRNAs and regulating their translation. It is required for lens transparency during lens development, by regulating translation of genes such as CRYBB3 and HSPB1 in the developing lens. It is also essential for dynamic ribonucleoprotein (RNP) remodeling of chromatoid bodies during spermatogenesis. TDRD7 contains three Tudor domains. The model corresponds to the third one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	91
410501	cd20430	Tudor_TDRD8	Tudor domain found in Tudor domain-containing protein 8 (TDRD8) and similar proteins. TDRD8, also called serine/threonine-protein kinase (EC 2.7.11.1) 31 (STK31), serine/threonine-protein kinase NYD-SPK, or Sugen kinase 396 (SgK396), is a germ cell-specific factor expressed in embryonic gonocytes of both sexes, and in postnatal spermatocytes and round spermatids in males. It acts as a cell-cycle regulated protein that contributes to the tumorigenicity of epithelial cancer cells. TDRD8 contains a Tudor domain and a serine/threonine kinase domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	75
410502	cd20431	Tudor_TDRD9	Tudor domain found in Tudor domain-containing protein 9 (TDRD9) and similar proteins. TDRD9 is an ATP-dependent DEAD-like RNA helicase required during spermatogenesis. It is involved in the biosynthesis of PIWI-interacting RNAs (piRNAs). A recessive deleterious mutation in TDRD9 causes non-obstructive azoospermia in infertile men. TDRD9 contains an N-terminal HrpA-like RNA helicase module and a Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	101
410503	cd20432	Tudor_TDRD10	Tudor domain found in Tudor domain-containing protein 10 (TDRD10) and similar proteins. TDRD10 is widely expressed and localized both to the nucleus and cytoplasm, and may play general roles like regulation of RNA metabolism. It contains a Tudor domain and an RNA recognition motif (RRM). The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	139
410504	cd20433	Tudor_TDRD11	Tudor domain found in Tudor domain-containing protein 11 (TDRD11) and similar proteins. TDRD11, also called Staphylococcal nuclease domain-containing protein 1 (SND1), 100 kDa coactivator, EBNA2 coactivator p100, or p100 co-activator, is a multifunctional protein that is reportedly associated with different types of RNA molecules, including mRNA, miRNA, pre-miRNA, and dsRNA. It has been implicated in a number of biological processes in eukaryotic cells, including the cell cycle, DNA damage repair, proliferation, and apoptosis. TDRD11 is overexpressed in multiple cancers and functions as an oncogene. It contains multiple Staphylococcal nuclease (SN) domains and a C-terminal Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	84
410505	cd20434	Tudor_TDRD12_rpt1	first Tudor domain found in Tudor domain-containing protein 12 (TDRD12) and similar proteins. TDRD12, also called ES cell-associated transcript 8 protein (ECAT8), is a putative ATP-dependent DEAD-like RNA helicase that is essential for germ cell development and maintenance. It acts as a unique piRNA biogenesis factor essential for secondary PIWI interacting RNA (piRNA) biogenesis. TDRD12 contains two Tudor domains, one at the N-terminus and the other at the C-terminal end. The model corresponds to the first/N-terminal one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	164
410506	cd20435	Tudor_TDRD12_rpt2	second Tudor domain found in Tudor domain-containing protein 12 (TDRD12) and similar proteins. TDRD12, also called ES cell-associated transcript 8 protein (ECAT8), is a putative ATP-dependent DEAD-like RNA helicase that is essential for germ cell development and maintenance. It acts as a unique piRNA biogenesis factor essential for secondary PIWI interacting RNA (piRNA) biogenesis. TDRD12 contains two Tudor domains, one at the N-terminus and the other at the C-terminal end. The model corresponds to the second/C-terminal one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	134
410507	cd20436	Tudor_TDRD15_rpt1	first Tudor domain found in Tudor domain-containing protein 15 (TDRD15) and similar proteins. TDRD15 is an uncharacterized Tudor domain-containing protein that contains seven Tudor domains. This model corresponds to the first one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	147
410508	cd20437	Tudor_TDRD15_rpt2	second Tudor domain found in Tudor domain-containing protein 15 (TDRD15) and similar proteins. TDRD15 is an uncharacterized Tudor domain-containing protein that contains seven Tudor domains. This model corresponds to the second one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	120
410509	cd20438	Tudor_TDRD15_rpt3	third Tudor domain found in Tudor domain-containing protein 15 (TDRD15) and similar proteins. TDRD15 is an uncharacterized Tudor domain-containing protein that contains seven Tudor domains. This model corresponds to the third one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	141
410510	cd20439	Tudor_TDRD15_rpt4	fourth Tudor domain found in Tudor domain-containing protein 15 (TDRD15) and similar proteins. TDRD15 is an uncharacterized Tudor domain-containing protein that contains seven Tudor domains. This model corresponds to the fourth one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	125
410511	cd20440	Tudor_TDRD15_rpt5	fifth Tudor domain found in Tudor domain-containing protein 15 (TDRD15) and similar proteins. TDRD15 is an uncharacterized Tudor domain-containing protein that contains seven Tudor domains. This model corresponds to the fifth one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	127
410512	cd20441	Tudor_TDRD15_rpt6	sixth Tudor domain found in Tudor domain-containing protein 15 (TDRD15) and similar proteins. TDRD15 is an uncharacterized Tudor domain-containing protein that contains seven Tudor domains. This model corresponds to the sixth one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	108
410513	cd20442	Tudor_TDRD15_rpt7	seventh Tudor domain found in Tudor domain-containing protein 15 (TDRD15) and similar proteins. TDRD15 is an uncharacterized Tudor domain-containing protein that contains seven Tudor domains. This model corresponds to the seventh one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	160
410514	cd20443	Tudor_AtTudor1-like	Tudor domain found in Arabidopsis thaliana ribonuclease Tudor 1 (AtTudor1), ribonuclease Tudor 2 (AtTudor2), and similar proteins. The family includes AtTudor1 (also called Tudor-SN protein 1) and AtTudor2 (also called Tudor-SN protein 2 or 100 kDa coactivator-like protein). They are cytoprotective ribonucleases (RNases) required for resistance to abiotic stresses, acting as positive regulators of mRNA decapping during stress. Members of this family contain one Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	117
410515	cd20444	Tudor_vreteno-like_rpt1	first Tudor domain found in Drosophila melanogaster protein vreteno and similar proteins. Vreteno is a gonad-specific protein essential for germline development; it represses transposable elements and prevents their mobilization, which is essential for germline integrity. It acts via the piRNA metabolic process in both germline and somatic gonadal tissues by mediating the repression of transposable elements during meiosis. Vreteno contains two Tudor domains. The model corresponds to the first one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	55
410516	cd20445	Tudor_vreteno-like_rpt2	second Tudor domain found in Drosophila melanogaster protein vreteno and similar proteins. Vreteno is a gonad-specific protein essential for germline development; it represses transposable elements and prevents their mobilization, which is essential for germline integrity. It acts via the piRNA metabolic process in both germline and somatic gonadal tissues by mediating the repression of transposable elements during meiosis. Vreteno contains two Tudor domains. The model corresponds to the second one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	56
410517	cd20446	Tudor_SpSPF30-like	Tudor domain found in Schizosaccharomyces pombe splicing factor spf30 (SpSPF30) and similar proteins. SpSPF30, also called survival of motor neuron-related-splicing factor 30, is necessary for spliceosome assembly. It contains one Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	56
410518	cd20447	Tudor_TDRD13	Tudor domain found in Tudor domain-containing protein 13 (TDRD13). TDRD13, also called asparagine-linked glycosylation 13 (ALG13), glycosyltransferase 28 domain-containing protein 1 (GLT28D1), or UDP-N-acetylglucosamine transferase subunit ALG13, is a putative bifunctional UDP-N-acetylglucosamine transferase and deubiquitinase (EC 2.4.1.141/EC 3.4.19.12). It is a potential member of the Alg7p/Alg13p/Alg14p complex catalyzing the first two initial reactions in the N-glycosylation process. It contains one Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	80
410519	cd20448	Tudor_OTUD4	Tudor domain found in OTU domain-containing protein 4 (OTUD4). OTUD4, also called HIV-1-induced protein HIN-1, is a phospho-activated K63 deubiquitinase that hydrolyzes the isopeptide bond between the ubiquitin C-terminus and the lysine epsilon-amino group of the target protein. It may negatively regulate inflammatory and pathogen recognition signaling in innate immune response. It contains one Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	64
410520	cd20449	Tudor_PHF1	Tudor domain found in PHD finger protein 1 (PHF1) and similar proteins. PHF1, also called Polycomb-like protein 1 (PCL1), together with JARID2 and AEBP2, associates with the Polycomb repressive complex 2 (PRC2), which is the major H3K27 methyltransferase that regulates pluripotency, differentiation, and tumorigenesis, through catalysis of histone H3 lysine 27 trimethylation (H3K27me3) on chromatin. PHF1 is essential in epigenetic regulation and genome maintenance. It acts as a dual reader of lysine trimethylation at lysine 36 of histone H3 and lysine 27 of histone variant H3t. Moreover, PHF1 is required for efficient H3-K27 trimethylation (H3K27me3) and Hox gene silencing. It can mediate deposition of the repressive H3K27me3 mark and acts as a cofactor in early DNA-damage response. PHF1 consists of an N-terminal Tudor domain followed by two PHD domains, and a C-terminal MTF2 domain. Its Tudor domain selectively binds to histone H3K36me3.	54
410521	cd20450	Tudor_MTF2	Tudor domain found in metal-response element-binding transcription factor 2 (MTF2) and similar proteins. MTF2, also called metal regulatory transcription factor 2, metal-response element DNA-binding protein M96, or Polycomb-like protein 2 (PCL2), complexes with the Polycomb repressive complex-2 (PRC2) in embryonic stem cells and regulates the transcriptional networks during embryonic stem cell self-renewal and differentiation. It recruits the PRC2 complex to the inactive X chromosome and target loci in embryonic stem cells. Moreover, MTF2 is required for PRC2-mediated Hox cluster repression. It activates the Cdkn2a gene and promotes cellular senescence, thus suppressing the catalytic activity of PRC2 locally. MTF2, like other PCL family proteins, consists of an N-terminal Tudor domain followed by two PHD domains, and a C-terminal MTF2 domain. PCL proteins specifically recognize tri-methylated H3K36 (H3K36me3) through their N-terminal Tudor domains.	54
410522	cd20451	Tudor_PHF19	Tudor domain found in PHD finger protein 19 (PHF19) and similar proteins. PHF19, also called Polycomb-like protein 3 (PCL3), is a component of the Polycomb repressive complex 2 (PRC2), which is the major H3K27 methyltransferase that regulates pluripotency, differentiation, and tumorigenesis through catalysis of histone H3 lysine 27 trimethylation (H3K27me3) on chromatin. PHF19 consists of an N-terminal Tudor domain followed by two PHD domains, and a C-terminal MTF2 domain. It binds trimethylated histone H3 Lys36 (H3K36me3) through its Tudor domain and recruits the PRC2 complex and the H3K36me3 demethylase NO66 to embryonic stem cell genes during differentiation. Moreover, PHF19 and its upstream regulator, Akt, play roles in the phenotype switch of melanoma cells from proliferative to invasive states.	57
410523	cd20452	Tudor_dPCL-like	Tudor domain found in Drosophila melanogaster Polycomb protein PCL (dPCL) and similar proteins. dPCL, also called Polycomblike protein, is a Polycomb group (PcG) protein that is specifically required during the first 6 hours of embryogenesis to establish the repressed state. dPCL is a component of the Esc/E(z) complex, which methylates 'Lys-9' and 'Lys-27' residues of histone H3, leading to transcriptional repression of the affected target gene. Like other PCL family proteins, it consists of an N-terminal Tudor domain followed by two PHD domains, and a C-terminal MTF2 domain. PCL proteins specifically recognize tri-methylated H3K36 (H3K36me3) through their N-terminal Tudor domains.	55
410524	cd20453	Tudor_PHF20	Tudor domain found in PHD finger protein 20 (PHF20) and similar proteins. PHF20, also called Glioma-expressed antigen 2, hepatocellular carcinoma-associated antigen 58, novel zinc finger protein, or transcription factor TZP (referring to Tudor and zinc finger domain containing protein), is a regulator of NF-kappaB activation by disrupting recruitment of PP2A to p65. It also functions as a transcription factor that binds to Akt and plays a role in Akt cell survival/growth signaling. Moreover, it transcriptionally regulates p53. The phosphorylation of PHF20 on Ser291 mediated by protein kinase B (PKB) is essential in tumorigenesis via the regulation of p53-mediated signaling. PHF20 contains an N-terminal malignant brain tumor (MBT) domain, a Tudor domain, a plant homeodomain (PHD) finger and putative DNA-binding domains AT hook and C2H2-type zinc finger. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	53
410525	cd20454	Tudor_PHF20L1	Tudor domain found in PHD finger protein 20-like protein 1 (PHF20L1) and similar proteins. PHF20L1 is an active malignant brain tumor (MBT) domain-containing protein that binds to monomethylated lysine 142 on DNA (cytosine-5) Methyltransferase 1 (DNMT1) (DNMT1K142me1) and colocalizes at the perinucleolar space in a SET7-dependent manner. Its MBT domain reads and controls enzyme levels of methylated DNMT1 in cells, thus representing a novel antagonist of DNMT1 proteasomal degradation. In addition to the MBT domain, PHF20L1 also contains a Tudor domain, a plant homeodomain (PHD) finger and putative DNA-binding domains AT hook and C2H2-type zinc finger. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	59
410526	cd20455	Tudor_UHRF1_rpt1	first Tudor domain found in ubiquitin-like PHD and RING finger domain-containing protein 1 (UHRF1) and similar proteins. UHRF1, also called inverted CCAAT box-binding protein of 90 kDa, nuclear protein 95, nuclear zinc finger protein Np95 (Np95), RING finger protein 106, transcription factor ICBP90, or E3 ubiquitin-protein ligase UHRF1, is a unique chromatin effector protein that integrates the recognition of both histone PTMs and DNA methylation. It is essential for cell proliferation and plays a critical role in the development and progression of many human carcinomas, such as laryngeal squamous cell carcinoma (LSCC), gastric cancer (GC), esophageal squamous cell carcinoma (ESCC), colorectal cancer, prostate cancer, and breast cancer. UHRF1 can act as a transcriptional repressor through its binding to histone H3 when it is unmodified at Arg2. Its overexpression in human lung fibroblasts results in downregulation of expression of the tumour suppressor pRB. It also plays a role in transcriptional repression of the cell cycle regulator p21. Moreover, UHRF1-dependent repression of factors can facilitate the G1-S transition. It interacts with Tat-interacting protein of 60 kDa (TIP60) and induces degradation-independent ubiquitination of TIP60. It is also an N-methylpurine DNA glycosylase (MPG)-interacting protein that binds MPG in a p53 status-independent manner in the DNA base excision repair (BER) pathway. In addition, UHRF1 functions as an epigenetic regulator that is important for multiple aspects of epigenetic regulation, including maintenance of DNA methylation patterns and recognition of various histone modifications. UHRF1 contains an N-terminal ubiquitin-like domain (UBL), a tandem Tudor domain (TTD), a plant homeodomain (PHD) domain, a SET- and RING-associated (SRA) domain, and a C-terminal RING-finger domain. It specifically binds to hemimethylated DNA, double-stranded CpG dinucleotides, and recruits the maintenance methyltransferase DNMT1 to its hemimethylated DNA substrate through its SRA domain. UHRF1-dependent H3K23 ubiquitylation has an essential role in maintenance DNA methylation and replication. The tandem Tudor domain directs UHRF1 binding to the heterochromatin mark histone H3K9me3 and the PHD domain targets UHRF1 to unmodified histone H3 in euchromatic regions. The RING-finger domain exhibits both autocatalytic E3 ubiquitin (Ub) ligase activity and activity against histone H3 and DNMT1. The model corresponds to the first Tudor domain.	79
410527	cd20456	Tudor_UHRF2_rpt1	first Tudor domain found in ubiquitin-like PHD and RING finger domain-containing protein 2 (UHRF2) and similar proteins. UHRF2, also called Np95/ICBP90-like RING finger protein (NIRF), Np95-like RING finger protein, nuclear protein 97, nuclear zinc finger protein Np97, RING finger protein 107, or E3 ubiquitin-protein ligase UHRF2, was originally identified as a ubiquitin ligase acting as a small ubiquitin-like modifier (SUMO) E3 ligase that enhances zinc finger protein 131 (ZNF131) SUMOylation but does not enhance ZNF131 ubiquitination. It also ubiquitinates PCNP, a PEST-containing nuclear protein. UHRF2 also functions as a nuclear protein involved in cell-cycle regulation and has been implicated in tumorigenesis. It interacts with cyclins, CDKs, p53, pRB, PCNA, HDAC1, DNMTs, G9a, methylated histone H3 lysine 9, and methylated DNA. It interacts with the cyclin E-CDK2 complex, ubiquitinates cyclins D1 and E1, induces G1 arrest, and is involved in the G1/S transition regulation. Furthermore, UHRF2 is a direct transcriptional target of the transcription factor E2F-1 in the induction of apoptosis. It recruits HDAC1 and binds to methyl-CpG. UHRF2 also participates in the maturation of Hepatitis B virus (HBV) through interacting with HBV core protein and promoting its degradation. UHRF2 contains an N-terminal ubiquitin-like domain (UBL), a tandem Tudor domain (TTD), a plant homeodomain (PHD) domain, a SET- and RING-associated (SRA) domain, and a C-terminal RING finger domain. The model corresponds to the first Tudor domain. The tandem Tudor domain directs binding of UHRF to the heterochromatin mark histone H3K9me3.	91
410528	cd20457	Tudor_UHRF1_rpt2	second Tudor domain found in ubiquitin-like PHD and RING finger domain-containing protein 1 (UHRF1) and similar proteins. UHRF1, also called inverted CCAAT box-binding protein of 90 kDa, nuclear protein 95, nuclear zinc finger protein Np95 (Np95), RING finger protein 106, transcription factor ICBP90, or E3 ubiquitin-protein ligase UHRF1, is a unique chromatin effector protein that integrates the recognition of both histone PTMs and DNA methylation. It is essential for cell proliferation and plays a critical role in the development and progression of many human carcinomas, such as laryngeal squamous cell carcinoma (LSCC), gastric cancer (GC), esophageal squamous cell carcinoma (ESCC), colorectal cancer, prostate cancer, and breast cancer. UHRF1 can act as a transcriptional repressor through its binding to histone H3 when it is unmodified at Arg2. Its overexpression in human lung fibroblasts results in downregulation of expression of the tumour suppressor pRB. It also plays a role in transcriptional repression of the cell cycle regulator p21. Moreover, UHRF1-dependent repression of factors can facilitate the G1-S transition. It interacts with Tat-interacting protein of 60 kDa (TIP60) and induces degradation-independent ubiquitination of TIP60. It is also an N-methylpurine DNA glycosylase (MPG)-interacting protein that binds MPG in a p53 status-independent manner in the DNA base excision repair (BER) pathway. In addition, UHRF1 functions as an epigenetic regulator that is important for multiple aspects of epigenetic regulation, including maintenance of DNA methylation patterns and recognition of various histone modifications. UHRF1 contains an N-terminal ubiquitin-like domain (UBL), a tandem Tudor domain (TTD), a plant homeodomain (PHD) domain, a SET- and RING-associated (SRA) domain, and a C-terminal RING-finger domain. It specifically binds to hemimethylated DNA, double-stranded CpG dinucleotides, and recruits the maintenance methyltransferase DNMT1 to its hemimethylated DNA substrate through its SRA domain. UHRF1-dependent H3K23 ubiquitylation has an essential role in maintenance DNA methylation and replication. The tandem Tudor domain directs UHRF1 binding to the heterochromatin mark histone H3K9me3 and the PHD domain targets UHRF1 to unmodified histone H3 in euchromatic regions. The RING-finger domain exhibits both autocatalytic E3 ubiquitin (Ub) ligase activity and activity against histone H3 and DNMT1. The model corresponds to the second Tudor domain.	72
410529	cd20458	Tudor_UHRF2_rpt2	second Tudor domain found in ubiquitin-like PHD and RING finger domain-containing protein 2 (UHRF2) and similar proteins. UHRF2, also called Np95/ICBP90-like RING finger protein (NIRF), Np95-like RING finger protein, nuclear protein 97, nuclear zinc finger protein Np97, RING finger protein 107, or E3 ubiquitin-protein ligase UHRF2, was originally identified as a ubiquitin ligase acting as a small ubiquitin-like modifier (SUMO) E3 ligase that enhances zinc finger protein 131 (ZNF131) SUMOylation but does not enhance ZNF131 ubiquitination. It also ubiquitinates PCNP, a PEST-containing nuclear protein. UHRF2 also functions as a nuclear protein involved in cell-cycle regulation and has been implicated in tumorigenesis. It interacts with cyclins, CDKs, p53, pRB, PCNA, HDAC1, DNMTs, G9a, methylated histone H3 lysine 9, and methylated DNA. It interacts with the cyclin E-CDK2 complex, ubiquitinates cyclins D1 and E1, induces G1 arrest, and is involved in the G1/S transition regulation. Furthermore, UHRF2 is a direct transcriptional target of the transcription factor E2F-1 in the induction of apoptosis. It recruits HDAC1 and binds to methyl-CpG. UHRF2 also participates in the maturation of Hepatitis B virus (HBV) through interacting with HBV core protein and promoting its degradation. UHRF2 contains an N-terminal ubiquitin-like domain (UBL), a tandem Tudor domain (TTD), a plant homeodomain (PHD) domain, a SET- and RING-associated (SRA) domain, and a C-terminal RING finger domain. The model corresponds to the second Tudor domain. The tandem Tudor domain directs binding of UHRF to the heterochromatin mark histone H3K9me3.	73
410530	cd20459	Tudor_ARID4A_rpt1	first Tudor domain found in AT-rich interactive domain-containing protein 4A (ARID4A) and similar proteins. ARID4A, also called retinoblastoma-binding protein 1 (RBBP1 or RBP1), is a leukemia and tumor suppressor involved in epigenetic regulation in leukemia and Prader-Willi/Angelman syndromes. It associates with the mSIN3A histone deacetylase (HDAC) chromatin remodeling complex through its interaction with the breast cancer associated tumor suppressor ING1, the breast cancer metastasis suppressor BRMS1, and the ARID4 family homolog ARID4B ( also known as RBP1L1). ARID4A specifically interacts with retinoblastoma protein (pRb) and shows both HDAC -dependent and -independent repression activities. It also acts as a Runx2 coactivator and is involved in the regulation of osteoblastic differentiation in Runx2-osterix transcriptional cascade. ARID4A contains tandem Tudor domains, a PWWP domain (also known as HATH domain or RBB1NT domain), an AT-rich DNA-interacting domain (ARID, also known as BRIGHT), a chromobarrel domain, and a C-terminal R2 domain. The ARID and R2 domains are responsible for the repression activities. The Tudor, PWWP, and chromobarrel domains are all Royal Family domains, but only the chromobarrel domain of ARID4A is responsible for recognizing both dsDNA and methylated histone tails, particularly H4K20me3, in chromatin remodeling and epigenetic regulation. The model corresponds to the first Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	58
410531	cd20460	Tudor_ARID4B_rpt1	first Tudor domain found in AT-rich interactive domain-containing protein 4B (ARID4B) and similar proteins. ARID4B, also called 180 kDa Sin3-associated polypeptide (p180), breast cancer-associated antigen BRCAA1, histone deacetylase complex subunit SAP180, or retinoblastoma-binding protein 1-like 1 (RBP1L1 or RBBP1L1), is a leukemia and tumor suppressor involved in epigenetic regulation in leukemia and Prader-Willi/Angelman syndromes. It associates with the mSIN3A histone deacetylase (HDAC) chromatin remodeling complex through its interaction with the breast cancer associated tumor suppressor ING1, the breast cancer metastasis suppressor BRMS1, and ARID4A ( also known as RBP1). ARID4B plays a causative role in metastatic progression of breast cancer. It may also be associated with regulating the cell cycle. ARID4B contains tandem Tudor domains, a PWWP domain (also known as HATH domain or RBB1NT domain), an AT-rich DNA-interacting domain (ARID, also known as BRIGHT), a chromobarrel domain, and a C-terminal R2 domain. The model corresponds to the first Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	61
410532	cd20461	Tudor_ARID4A_rpt2	second Tudor domain found in AT-rich interactive domain-containing protein 4A (ARID4A) and similar proteins. ARID4A, also called retinoblastoma-binding protein 1 (RBBP1 or RBP1), is a leukemia and tumor suppressor involved in epigenetic regulation in leukemia and Prader-Willi/Angelman syndromes. It associates with the mSIN3A histone deacetylase (HDAC) chromatin remodeling complex through its interaction with the breast cancer associated tumor suppressor ING1, the breast cancer metastasis suppressor BRMS1, and the ARID4 family homolog ARID4B (also known as RBP1L1). ARID4A specifically interacts with retinoblastoma protein (pRb) and shows both HDAC-dependent and -independent repression activities. It also acts as a Runx2 coactivator and is involved in the regulation of osteoblastic differentiation in the Runx2-osterix transcriptional cascade. ARID4A contains tandem Tudor domains, a PWWP domain (also known as HATH domain or RBB1NT domain), an AT-rich DNA-interacting domain (ARID, also known as BRIGHT), a chromobarrel domain, and a C-terminal R2 domain. The ARID and R2 domains are responsible for the repression activities. The Tudor, PWWP, and chromobarrel domains are all Royal Family domains, but only the chromobarrel domain of ARID4A is responsible for recognizing both dsDNA and methylated histone tails, particularly H4K20me3, in chromatin remodeling and epigenetic regulation. The model corresponds to the second Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	60
410533	cd20462	Tudor_ARID4B_rpt2	second Tudor domain found in AT-rich interactive domain-containing protein 4B (ARID4B) and similar proteins. ARID4B, also called 180 kDa Sin3-associated polypeptide (p180), breast cancer-associated antigen BRCAA1, histone deacetylase complex subunit SAP180, or retinoblastoma-binding protein 1-like 1 (RBP1L1 or RBBP1L1), is a leukemia and tumor suppressor involved in epigenetic regulation in leukemia and Prader-Willi/Angelman syndromes. It associates with the mSIN3A histone deacetylase (HDAC) chromatin remodeling complex through its interaction with the breast cancer associated tumor suppressor ING1, the breast cancer metastasis suppressor BRMS1, and ARID4A (also known as RBP1). ARID4B plays a causative role in metastatic progression of breast cancer. It may also be associated with regulating the cell cycle. ARID4B contains tandem Tudor domains, a PWWP domain (also known as HATH domain or RBB1NT domain), an AT-rich DNA-interacting domain (ARID, also known as BRIGHT), a chromobarrel domain, and a C-terminal R2 domain. The model corresponds to the second Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	57
410534	cd20463	Tudor_JMJD2A_rpt1	first Tudor domain found in Jumonji domain-containing protein 2A (JMJD2A) and similar proteins. JMJD2A, also called lysine-specific demethylase 4A (KDM4A), or JmjC domain-containing histone demethylation protein 3A (JHDM3A), catalyzes the demethylation of di- and trimethylated H3K9 and H3K36. It is involved in carcinogenesis and functions as a transcription regulator that may either stimulate or repress gene transcription. It associates with nuclear receptor corepressor complex or histone deacetylases. Moreover, JMJD2A forms complexes with both the androgen and estrogen receptor (ER), and plays an essential role in growth of both ER-positive and -negative breast tumors. It is also involved in prostate, colon, and lung cancer progression. JMJD2A contains jmjN and jmjC domains in the N-terminal region, followed by a canonical plant homeodomain (PHD) domain, a noncanonical extended PHD domain, and tandem Tudor domains. The model corresponds to the first Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	55
410535	cd20464	Tudor_JMJD2B_rpt1	first Tudor domain found in Jumonji domain-containing protein 2B (JMJD2B) and similar proteins. JMJD2B, also called lysine-specific demethylase 4B (KDM4B), or JmjC domain-containing histone demethylation protein 3B (JHDM3B), specifically antagonizes the tri-methyl group from H3K9 in pericentric heterochromatin and reduces H3K36 methylation in mammalian cells. It plays an essential role in the growth regulation of cancer cells by modulating the G1-S transition and promotes cell-cycle progression through the regulation of cyclin-dependent kinase 6 (CDK6). It interacts with heat shock protein 90 (Hsp90) and its stability can be regulated by Hsp90. JMJD2B also functions as a direct transcriptional target of p53, which induces its expression through promoter binding. Moreover, JMJD2B expression can be controlled by hypoxia-inducible factor 1alpha (HIF1alpha) in colorectal cancer and estrogen receptor alpha (ERalpha) in breast cancer. It is also involved in bladder, lung, and gastric cancer. JMJD2B contains jmjN and jmjC domains in the N-terminal region, followed by a canonical plant homeodomain (PHD) domain, a noncanonical extended PHD domain, and tandem Tudor domains. The model corresponds to the first Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	54
410536	cd20465	Tudor_JMJD2C_rpt1	first Tudor domain found in Jumonji domain-containing protein 2C (JMJD2C) and similar proteins. JMJD2C, also called lysine-specific demethylase 4C (KDM4C), gene amplified in squamous cell carcinoma 1 protein (GASC-1 protein), or JmjC domain-containing histone demethylation protein 3C (JHDM3C), is an epigenetic factor that catalyzes the demethylation of di- and trimethylated H3K9 and H3K36, and may be involved in the development and/or progression of various types of cancer including esophageal squamous cell carcinoma (ESC) and breast cancer. It selectively interacts with hypoxia-inducible factor 1alpha (HIF1alpha) and plays a role in breast cancer progression. Moreover, JMJD2C may play an important role in the treatment of obesity and its complications by modulating the regulation of adipogenesis by nuclear receptor peroxisome proliferator-activated receptor gamma (PPARgamma). JMJD2C contains jmjN and jmjC domains in the N-terminal region, followed by a canonical plant homeodomain (PHD) domain, a noncanonical extended PHD domain, and tandem Tudor domains. The model corresponds to the first Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	54
410537	cd20466	Tudor_JMJD2A_rpt2	second Tudor domain found in Jumonji domain-containing protein 2A (JMJD2A) and similar proteins. JMJD2A, also called lysine-specific demethylase 4A (KDM4A), or JmjC domain-containing histone demethylation protein 3A (JHDM3A), catalyzes the demethylation of di- and trimethylated H3K9 and H3K36. It is involved in carcinogenesis and functions as a transcription regulator that may either stimulate or repress gene transcription. It associates with nuclear receptor corepressor complex or histone deacetylases. Moreover, JMJD2A forms complexes with both the androgen and estrogen receptor (ER), and plays an essential role in growth of both ER-positive and -negative breast tumors. It is also involved in prostate, colon, and lung cancer progression. JMJD2A contains jmjN and jmjC domains in the N-terminal region, followed by a canonical plant homeodomain (PHD) domain, a noncanonical extended PHD domain, and tandem Tudor domains. The model corresponds to the second Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	56
410538	cd20467	Tudor_JMJD2B_rpt2	second Tudor domain found in Jumonji domain-containing protein 2B (JMJD2B) and similar proteins. JMJD2B, also called lysine-specific demethylase 4B (KDM4B), or JmjC domain-containing histone demethylation protein 3B (JHDM3B), specifically antagonizes the tri-methyl group from H3K9 in pericentric heterochromatin and reduces H3K36 methylation in mammalian cells. It plays an essential role in the growth regulation of cancer cells by modulating the G1-S transition and promotes cell-cycle progression through the regulation of cyclin-dependent kinase 6 (CDK6). It interacts with heat shock protein 90 (Hsp90) and its stability can be regulated by Hsp90. JMJD2B also functions as a direct transcriptional target of p53, which induces its expression through promoter binding. Moreover, JMJD2B expression can be controlled by hypoxia-inducible factor 1alpha (HIF1alpha) in colorectal cancer and estrogen receptor alpha (ERalpha) in breast cancer. It is also involved in bladder, lung, and gastric cancer. JMJD2B contains jmjN and jmjC domains in the N-terminal region, followed by a canonical plant homeodomain (PHD) domain, a noncanonical extended PHD domain, and tandem Tudor domains. The model corresponds to the second Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	56
410539	cd20468	Tudor_JMJD2C_rpt2	second Tudor domain found in Jumonji domain-containing protein 2C (JMJD2C) and similar proteins. JMJD2C, also called lysine-specific demethylase 4C (KDM4C), gene amplified in squamous cell carcinoma 1 protein (GASC-1 protein), or JmjC domain-containing histone demethylation protein 3C (JHDM3C), is an epigenetic factor that catalyzes the demethylation of di- and trimethylated H3K9 and H3K36, and may be involved in the development and/or progression of various types of cancer including esophageal squamous cell carcinoma (ESC) and breast cancer. It selectively interacts with hypoxia-inducible factor 1alpha (HIF1alpha) and plays a role in breast cancer progression. Moreover, JMJD2C may play an important role in the treatment of obesity and its complications by modulating the regulation of adipogenesis by nuclear receptor peroxisome proliferator-activated receptor gamma (PPARgamma). JMJD2C contains jmjN and jmjC domains in the N-terminal region, followed by a canonical plant homeodomain (PHD) domain, a noncanonical extended PHD domain, and tandem Tudor domains. The model corresponds to the second Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	60
410540	cd20469	Tudor_TNRC18	Tudor domain found in trinucleotide repeat-containing gene 18 protein (TNRC18) and similar proteins. TNRC18, also called long CAG trinucleotide repeat-containing gene 79 protein (CAGL79), is a protein that in humans is encoded by the TNRC18 gene. Its biological function remains unclear. TNRC18 contains one Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	67
410541	cd20470	Tudor_BAHCC1	Tudor domain found in BAH and coiled-coil domain-containing protein 1 (BAHCC1) and similar proteins. BAHCC1, also called Bromo adjacent homology domain-containing protein 2 (BAHD2), or BAH domain-containing protein 2, may function as a transcriptional regulator. BAHCC1 contains one Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	70
410542	cd20471	Tudor_Agenet_FMR1_rpt1	first Tudor-like Agenet domain found in synaptic functional regulator FMR1 and similar proteins. FMR1, also called fragile X mental retardation protein 1 (FMRP), is a multifunctional polyribosome-associated RNA-binding protein that plays a central role in neuronal development and synaptic plasticity through the regulation of alternative mRNA splicing, mRNA stability, mRNA dendritic transport and postsynaptic local protein synthesis of a subset of mRNAs. FMR1 contains two copies of the Tudor-like Agenet domain. The model corresponds to the first one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	55
410543	cd20472	Tudor_Agenet_FXR1_rpt1	first Tudor-like Agenet domain found in fragile X mental retardation syndrome-related protein 1 (FXR1) and similar proteins. FXR1 is an RNA binding protein that interacts with the functionally similar proteins FMR1 and FXR2. It shuttles between the nucleus and cytoplasm and associates with polyribosomes, predominantly with the 60S ribosomal subunit. FXR1 contains two copies of the Tudor-like Agenet domain. The model corresponds to the first one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	55
410544	cd20473	Tudor_Agenet_FXR2_rpt1	first Tudor-like Agenet domain found in fragile X mental retardation syndrome-related protein 2 (FXR2) and similar proteins. FXR2 is an RNA-binding protein that associates with polyribosomes, predominantly with 60S large ribosomal subunits. It may have a role in the development of fragile X mental retardation syndrome. FXR2 contains two copies of the Tudor-like Agenet domain. The model corresponds to the first one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	55
410545	cd20474	Tudor_Agenet_FMR1_rpt2	second Tudor-like Agenet domain found in synaptic functional regulator FMR1 and similar proteins. FMR1, also called fragile X mental retardation protein 1 (FMRP), is a multifunctional polyribosome-associated RNA-binding protein that plays a central role in neuronal development and synaptic plasticity through the regulation of alternative mRNA splicing, mRNA stability, mRNA dendritic transport and postsynaptic local protein synthesis of a subset of mRNAs. FMR1 contains two copies of the Tudor-like Agenet domain. The model corresponds to the second one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	63
410546	cd20475	Tudor_Agenet_FXR1_rpt2	second Tudor-like Agenet domain found in fragile X mental retardation syndrome-related protein 1 (FXR1) and similar proteins. FXR1 is an RNA binding protein that interacts with the functionally similar proteins FMR1 and FXR2. It shuttles between the nucleus and cytoplasm and associates with polyribosomes, predominantly with the 60S ribosomal subunit. FXR1 contains two copies of the Tudor-like Agenet domain. The model corresponds to the second one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	66
410547	cd20476	Tudor_Agenet_FXR2_rpt2	second Tudor-like Agenet domain found in fragile X mental retardation syndrome-related protein 2 (FXR2) and similar proteins. FXR2 is an RNA-binding protein that associates with polyribosomes, predominantly with 60S large ribosomal subunits. It may have a role in the development of fragile X mental retardation syndrome. FXR2 contains two copies of the Tudor-like Agenet domain. The model corresponds to the second one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	68
380475	cd20477	Cas13b_Pb-like	Class 2 type VI-B CRISPR-associated RNA-guided ribonuclease Cas13b from Prevotella buccae and similar Cas13b proteins. CRISPR-Cas (clustered regularly interspaced short palindromic repeats and CRISPR-associated proteins) adaptive immune systems defend microbes against foreign nucleic acids via RNA-guided endonucleases. These systems are divided into two classes: class 1 systems utilize multiple Cas proteins and CRISPR RNA (crRNA) to form an effector complex while class 2 systems employ a large, single effector with crRNA to mediate interference. Class 2 type VI CRISPR-Cas13 systems use a single enzyme to target RNA using a programmable crRNA guide and are divided into four subtypes based on the identity of the Cas13 protein (Cas13a-d). The Cas13 proteins are capable of both pre-crRNA processing and target RNA cleavage, which protect the host from phage attacks. Once bound to a target RNA, their non-specific RNase activity is activated. Cas13b has many distinctive features compared to the other Cas13 proteins, including the lack of significant sequence similarity, disparate crRNA repeat region, and double-sided protospacer flanking sequence (PFS)-dependent target RNA cleavage.	995
380476	cd20478	Cas13b_Bz-like	Class 2 type VI-B CRISPR-associated RNA-guided ribonuclease Cas13b from Bergeyella zoohelcum and similar Cas13b proteins. CRISPR-Cas (clustered regularly interspaced short palindromic repeats and CRISPR-associated proteins) adaptive immune systems defend microbes against foreign nucleic acids via RNA-guided endonucleases. These systems are divided into two classes: class 1 systems utilize multiple Cas proteins and CRISPR RNA (crRNA) to form an effector complex while class 2 systems employ a large, single effector with crRNA to mediate interference. Class 2 type VI CRISPR-Cas13 systems use a single enzyme to target RNA using a programmable crRNA guide and are divided into four subtypes based on the identity of the Cas13 protein (Cas13a-d). The Cas13 proteins are capable of both pre-crRNA processing and target RNA cleavage, which protect the host from phage attacks. Once bound to a target RNA, their non-specific RNase activity is activated. Cas13b has many distinctive features compared to the other Cas13 proteins, including the lack of significant sequence similarity, disparate crRNA repeat region, and double-sided protospacer flanking sequence (PFS)-dependent target RNA cleavage.	1118
380470	cd20480	ArgR-Cyc_NRPS-like	Cyc (heterocyclization)-like domain of Vibrio anguillarum AngR and similar proteins; belongs to the Condensation-domain family. Vibrio anguillarum AngR plays a role in regulating the expression of iron transport genes as well as in the production of the siderophore anguibactin. Cyc-domains are a type of Condensation (C) domain. Cyc-domains catalyze two separate reactions in the creation of heterocyclized peptide products in nonribosomal peptide synthesis: amide bond formation followed by intramolecular cyclodehydration between a Cys, Ser, or Thr side chain and a carbonyl carbon on the peptide backbone to form a thiazoline, oxazoline, or methyloxazoline ring. C-domains typically have a conserved HHxxxD motif at the active site; Cyc-domains have an alternative, conserved DxxxxD active site motif; mutation of the aspartate residues in this motif can abolish or diminish condensation activity. Members of this subfamily have an SxxxD motif at the active site. C-domains of nonribosomal peptide synthetases (NRPSs) catalyze peptide bond formation within (usually) large multi-modular enzymatic complexes. NRPS can use a large variety of acyl monomers (approximately 500 different possible monomer substrates as opposed to the 20 standard amino acids in ribosomal protein synthesis) to construct bioactive secondary metabolites of 2 to 18 units long (with various activities such as antibiotic, antifungal, antitumor and immunosuppression). In addition to Cyc-domains there are various other subtypes of C-domains such as the LCL-type which catalyzes peptide bond formation between two L-amino acids, the DCL-type which links an L-amino acid to the D-amino acid at the end of a growing peptide, and starter C-domains which acylate the first amino acid with a beta-hydroxy carboxylic acid. Typically, an NRPS module consists of an adenylation domain, a peptidyl carrier protein (PCP) domain (also known as thiolation (T) domain) and a C-domain. NRPS modules may also include specialized domains such as the terminal-module thioesterase (Te) domain that releases the product via hydrolysis or macrocyclization and any of various C-domain family members such as the epimerization (E) domain, the ester-bond forming C-domain, dual E/C (epimerization and condensation) domains, and the X-domain.	406
380473	cd20481	phage_tailspike_middle	N-terminal and middle domains of tailspike protein in Acinetobacter bacteriophages. This model describes the middle beta-helical domain of Acinetobacter bacteriophage tailspike proteins, as well as a separate N-terminal domain that does not appear to be part of the beta-helical substructure. The N-terminal domain may be involved in virion binding, and the molecules form a homo-trimeric arrangement. A C-terminal domain that may be involved in receptor binding is omitted from the model.	419
380471	cd20483	C_PKS-NRPS	Condensation domain of hybrid polyketide synthetase/nonribosomal peptide synthetases (PKS/NRPSs). Condensation (C) domains of nonribosomal peptide synthetases (NRPSs) catalyze peptide bond formation within (usually) large multi-modular enzymatic complexes. Hybrid PKS/NRPS create polymers containing both polyketide and amide linkages. C-domains typically have a conserved HHxxxD motif at the active site; mutations in this motif can abolish or diminish condensation activity. Most members of this subfamily have the typical C-domain HHXXXD motif. NRPS can use a large variety of acyl monomers (approximately 500 different possible monomer substrates as opposed to the 20 standard amino acids in ribosomal protein synthesis) to construct bioactive secondary metabolites of 2 to 18 units long (with various activities such as antibiotic, antifungal, antitumor and immunosuppression). There are various subtypes of C-domains such as the LCL-type which catalyzes peptide bond formation between two L-amino acids, the DCL-type which links an L-amino acid to the D-amino acid at the end of a growing peptide, starter C-domains which acylate the first amino acid with a beta-hydroxy carboxylic acid, and heterocyclization (Cyc) domains which catalyze both peptide bond formation and cyclization of Cys, Ser, or Thr residues. Typically, an NRPS module consists of an adenylation domain, a peptidyl carrier protein (PCP) domain (also known as thiolation (T) domain) and a C-domain. NRPS modules may also include specialized domains such as the terminal-module thioesterase (Te) domain that releases the product via hydrolysis or macrocyclization and any of various C-domain family members such as the epimerization (E) domain, the ester-bond forming C-domain, dual E/C (epimerization and condensation) domains, and the X-domain.	430
380472	cd20484	C_PKS-NRPS_PksJ-like	Condensation domain of hybrid polyketide synthetase/nonribosomal peptide synthetases (PKS/NRPSs), similar to Bacillus subtilis PksJ. Condensation (C) domains of nonribosomal peptide synthetases (NRPSs) catalyze peptide bond formation within (usually) large multi-modular enzymatic complexes. Hybrid PKS/NRPS create polymers containing both polyketide and amide linkages. C-domains typically have a conserved HHxxxD motif at the active site; mutations in this motif can abolish or diminish condensation activity. Members of this subfamily have the typical C-domain HHxxxD motif. PksJ is involved in some intermediate steps for the synthesis of the antibiotic polyketide bacillaene which is important in secondary metabolism. NRPS can use a large variety of acyl monomers (approximately 500 different possible monomer substrates as opposed to the 20 standard amino acids in ribosomal protein synthesis) to construct bioactive secondary metabolites of 2 to 18 units long (with various activities such as antibiotic, antifungal, antitumor and immunosuppression). There are various subtypes of C-domains such as the LCL-type which catalyzes peptide bond formation between two L-amino acids, the DCL-type which links an L-amino acid to the D-amino acid at the end of a growing peptide, starter C-domains which acylate the first amino acid with a beta-hydroxy carboxylic acid, and heterocyclization (Cyc) domains which catalyze both peptide bond formation and cyclization of Cys, Ser, or Thr residues. Typically, an NRPS module consists of an adenylation domain, a peptidyl carrier protein (PCP) domain (also known as thiolation (T) domain) and a C-domain. NRPS modules may also include specialized domains such as the terminal-module thioesterase (Te) domain that releases the product via hydrolysis or macrocyclization and any of various C-domain family members such as the epimerization (E) domain, the ester-bond forming C-domain, dual E/C (epimerization and condensation) domains, and the X-domain.	430
380450	cd20485	USP25_USP28_C-like	carboxyl-terminal domain of ubiquitin-specific protease 25 (USP25) and 28 (USP28), and similar domains. This family contains the C-terminal domain of two deubiquitinases (DUBs), ubiquitin-specific proteases USP25 and USP28, which share high similarity but vary in their cellular functions. USP25 is a regulator of the innate immune system and may play a role in tumorigenesis, while USP28 is known for its tumor-promoting role. These two closely related DUBs contain an N-terminal domain harboring a Ub-associated domain (UBA) and two Ub-interacting motifs (UIMs), a central catalytic USP domain, and a C-terminal region of unknown function and variable size due to alternative splicing. In general, USP catalytic domains are around 350 amino acids in length; however, in USP25 and 28, the catalytic domains span around 550 amino acids due to a large, conserved insertion at a common insertion point called USP25/28 catalytic domain inserted domain (UCID). This alignment model represents the C-terminal region that has been implicated in substrate binding for both USP25 and USP28 and harbors the splicing site for isoform-specific sequences.	273
380451	cd20486	USP25_C	carboxyl-terminal domain of ubiquitin-specific protease 25 (USP25). This subfamily contains the C-terminal domain of ubiquitin-specific protease USP25, a deubiquitinase (DUB), which shares high similarity with USP28 but varies in cellular function; USP25 is a regulator of the innate immune system and may play a role in tumorigenesis, while USP28 is known for its tumor-promoting role. USP25 regulates inflammatory TRAF signaling and USP28 stabilizes c-MYC and other nuclear proteins. These two closely related DUBs contain an N-terminal domain harboring a Ub-associated domain (UBA) and two Ub-interacting motifs (UIMs), a central catalytic USP domain, and a C-terminal region of unknown function and variable size due to alternative splicing. In general, USP catalytic domains are around 350 amino acids in length; however, in USP25 and 28, the catalytic domains span around 550 amino acids due to a large, conserved insertion at a common insertion point called USP25/28 catalytic domain inserted domain (UCID). This C-terminal region has been implicated in substrate binding for USP25 and harbors the splicing site for isoform-specific sequences. Structure studies show that the C-terminally extended USP25 is exclusively tetrameric.	281
380452	cd20487	USP28_C	carboxyl-terminal domain of ubiquitin-specific protease 28 (USP28). This family contains the C-terminal domain of ubiquitin-specific protease USP28, a deubiquitinase (DUB), which shares high similarity with USP25 but varies in cellular function; USP28 is known for its tumor-promoting role while USP25 is a regulator of the innate immune system and may play a role in tumorigenesis. USP28 stabilizes c-MYC and other nuclear proteins, and USP25 regulates inflammatory TRAF signaling. These two closely related DUBs contain an N-terminal domain harboring a Ub-associated domain (UBA) and two Ub-interacting motifs (UIMs), a central catalytic USP domain, and a C-terminal region of unknown function and variable size due to alternative splicing. In general, USP catalytic domains are around 350 amino acids in length; however, in USP25 and 28, the catalytic domains span around 550 amino acids due to a large, conserved insertion at a common insertion point called USP25/28 catalytic domain inserted domain (UCID). This C-terminal region has been implicated in substrate binding for USP28 and harbors the splicing site for isoform-specific sequences. Structure studies suggest that the C-terminal domain forms an independent entity.	280
410774	cd20488	peptidase_C58-like	C58 peptidase domain and similar domains. This family contains C58 peptidases and similar proteins. C58 family peptidases are endopeptidases that also act as transamidases, attaching a lipid moiety to the newly exposed N-terminus of the substrate. These include the Pseudomonas avirulence (Avr) protein AvrPphB and the homologous protein from Yersinia known as YopT; both are involved in bacterial pathogenesis. These proteins have a papain-like fold and a distinct substrate-binding site. Also included is a cysteine-protease-like domain in Photorhabdus asymbiotica toxin PaTox that enhances cytotoxic effects of the toxin, and therefore is essential for full PaTox activity. The C58 cysteine protease domain is also found in Vibrio vulnificus biotype 3 multifunctional autoprocessing RTX toxin. It usually contains the characteristic Cys, His, Asp residues in the active site. Some members may lack this catalytic triad.	143
380446	cd20489	cupin_HppE-like_C	hydroxypropylphosphonic acid epoxidase (HppE) and similar proteins, C-terminal cupin domain. This family includes HppE (hydroxypropylphosphonic acid epoxidase or HPP epoxidase or 2-hydroxypropylphosphonic acid epoxidase; EC 1.11.1.23), a non-heme mononuclear iron-dependent enzyme that catalyzes a unique epoxidation reaction as part of the biosynthetic pathway of the clinically important oxirane antibiotic fosfomycin. HppE uses a facial triad with two histidine ligands and one aspartic acid or glutamic acid, His2(Glu/Asp), to catalyze a variety of different reactions, including DNA repair and antibiotic biosynthesis. The C-terminal catalytic domain of HppE has a cupin fold that binds a divalent cation, whereas the N-terminal domain carries a helix-turn-helix (HTH) motif with putative DNA-binding helices. HppE converts (S)-2-hydroxypropyl-1-phosphonate (S-HPP) to the antibiotic fosfomycin [(1R,2S)-epoxypropylphosphonate] in an unusual 1,3-dehydrogenation of a secondary alcohol to an epoxide; it uses H2O2 as a co-substrate to abstract hydrogen (Ho) from C1 of S-HPP to initiate epoxide ring closure, using an iron(IV)-oxo complex as the Ho abstractor. HppE belongs to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization and its structure serves as a model for numerous proteins of unknown function, predicted to be transcription factors, containing an HTH motif at the N-terminus and a cupin domain at the C-terminus.	97
380447	cd20490	cupin_HutD_C	histidine utilization protein HutD and related proteins, C-terminal cupin domain. This model represents the C-terminal domain of a bicupin protein HutD, involved in histidine utilization (Hut) in Pseudomonas species. Although a metal binding site is not found in Pseudomonas fluorescens (PfluHutD), a binding pocket for ligands is located in the middle of the N-terminal cupin domain near the metal binding sites; N-formyl-l-glutamate (FG, a Hut pathway intermediate) has been identified as a potential ligand in vivo. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	77
380448	cd20491	cupin_KduI_C	Escherichia coli 5-keto-4-deoxyuronate isomerase (KduI) and related proteins, C-terminal cupin domain. 5-keto-4-deoxyuronate isomerase (KduI; EC 5.3.1.17), also called 5-dehydro-4-deoxy-D-glucuronate isomerase or 4-deoxy-L-threo-5-hexosulose-uronate ketol-isomerase, catalyzes the interconversion of 5-keto-4-deoxyuronate and 2,5-diketo-3-deoxygluconate in the breakdown of pectin. KduI is a bicupin; this model describes the C-terminal cupin domain. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	108
380449	cd20492	cupin_HQDO_large_C	hydroquinol 1,2-dioxygenase (HQDO) large subunit, C-terminal cupin domain. This model describes the C-terminal cupin domain of the large (or beta) subunit of hydroquinone 1,2-dioxygenase (HQDO), a heterotetramer of two alpha and two beta subunits of 19kDa and 38kDa, respectively. HQDO is a Fe(II) ring cleaving dioxygenase that is a key enzyme in the hydroquinone pathway of para-nitrophenol degradation, where it catalyzes the ring cleavage of hydroquinone to gamma-hydroxymuconic semialdehyde. Proteins in this family belong to the cupin superfamily with a conserved "jelly roll-like" beta-barrel fold capable of homodimerization.	100
380336	cd20493	M34_ATLF_C-like	C-terminal catalytically active domain of anthrax toxin lethal factor and similar domains; belongs to peptidase family M34. This subfamily includes the C-terminal catalytic domain of anthrax toxin lethal factor (ATLF; EC 3.4.24.83). ATLF and edema factor are enzyme components of anthrax toxin and are carried into the cell by a third component, the protective antigen (PA). ATLF is secreted by Bacillus anthracis to promote disease virulence through disruption of host signaling pathways. ATLF belongs to peptidase family M34 and has the hallmark metalloprotease HEXXH motif, where the two His residues bind a single zinc atom and the Glu has a catalytic role. ATLF is a highly selective protease whose major substrates are mitogen-activated protein kinase kinases (MKKs). MKKs are cleaved by ATLF near their N-termini, removing the docking sequence for the downstream cognate mitogen-activated protein kinase. Preferred amino acids around the cleavage site can be denoted BBBBxHxH, in which B denotes Arg or Lys, H denotes a hydrophobic amino acid, and x is any amino acid. At its N-terminus, ATLF has a related PABD domain which lacks the hallmark metalloprotease motif HEXXH. This subfamily also includes Bacillus thuringiensis Vip2Ac-like_2 which belongs to the Vip family of proteins that are secreted during the vegetative growth phase.	208
410775	cd20494	C58_RtxA	peptidase C58-like domain of cytotoxin RtxA and similar proteins. This subfamily includes the C58 peptidase-like domain of Vibrio vulnificus biotype 3 multifunctional autoprocessing RTX (MARTX) toxin, the primary virulence factor of V. vulnificus. MARTX has been shown to be an essential virulence factor contributing to highly inflammatory skin wounds with severe damage affecting every tissue layer. This toxin is a large single polypeptide composed of repeat sequences that form a pore in eukaryotic cell plasma membranes for the translocation of centrally located effector domains. This C58 family cysteine protease domain usually contains the invariant C/H/D residues that form an active site triad; however, cysteine is not fully conserved in this group.	229
410776	cd20495	C58_PaToxP-like	peptidase C58 domain of Photorhabdus asymbiotica toxin PaTox and LifA/Efa1-related large cytotoxin, and similar proteins. This subfamily includes the cysteine protease domain of Photorhabdus asymbiotica toxin PaTox, a large virulence-associated multifunctional protein toxin. This domain is similar to AvrPphB protease found in Pseudomonas syringae, a C58 protease. Mutation studies show that this domain enhances cytotoxic effects of the toxin, and therefore is essential for full PaTox activity. Also included in this family is the enteropathogenic Escherichia coli (EPEC) factor for adherence/lymphocyte activation inhibitor (efa1/lifA) gene which is strongly associated with diarrhea. Efa1/LifA proteins are important for A/E lesion formation efficiency in EPEC strains lacking multiple effectors. This domain contains the invariant C/H/D residues conserved in the C58/YopT family.	179
410777	cd20496	C58	peptidase C58 domain. The C58 family peptidases are endopeptidases that also act as transamidases, attaching a lipid moiety to the newly exposed N-terminus of the substrate. These include the Pseudomonas avirulence (Avr) protein AvrPphB and the homologous protein from Yersinia known as YopT; both are involved in bacterial pathogenesis. These proteins have a papain-like fold and a distinct substrate-binding site. The proteolytic activity of AvrPphB is essential for autoproteolytic cleavage of an AvrPphB precursor as well as for eliciting the hypersensitive response in plants. Yersinia pestis YopT cleaves the post-translationally modified Rho GTPases near their carboxyl termini, releasing them from the membrane. This leads to the disruption of actin cytoskeleton in host cells. Also included in this family is the Pseudomonas syringae HopN1 peptidase, a type III secretion system effector that can suppress plant cell death events in both compatible and incompatible interactions. All of these proteolytic activities are dependent upon the invariant C/H/D residues conserved in the C58/YopT family peptidase domain.	149
410778	cd20497	C58_YopT-like	peptidase C58 domain of YopT-like proteins, including Pseudomonas avirulence AvrPphB. This subfamily includes the C58 peptidase domain of the Pseudomonas avirulence (Avr) protein AvrPphB which is homologous to Yersinia effector known as YopT; both are involved in bacterial pathogenesis. These proteins have a papain-like fold and a distinct substrate-binding site. The proteolytic activity of AvrPphB is essential for autoproteolytic cleavage of an AvrPphB precursor as well as for eliciting the hypersensitive response in plants. Also included is the Ralstonia solanacearum type III effector protein RipT, a YopT-like cysteine protease. All of these proteolytic activities are dependent upon the invariant C/H/D residues conserved in the C58/YopT family peptidase domain.	185
410779	cd20498	C58_YopT	peptidase C58 domain of the YopT subfamily, including Yersinia pestis YopT and related proteins. This subfamily includes the plague organism Yersinia pestis cysteine protease YopT, an outer membrane protein. Y. pestis can disarm the host immune response by interfering with cell-signaling pathways; YopT cleaves post-translationally modified Rho GTPases near their carboxyl termini, releasing them from the membrane. This leads to the disruption of the actin cytoskeleton in host cells. YopT's proteolytic activity is dependent upon the invariant C/H/D residues conserved in the C58/YopT family peptidase domain.	211
410780	cd20499	HopN1-like	peptidase C58 domain of Pseudomonas syringae type III effector HopN1 and related proteins. This family includes the C58 peptidase domain of Pseudomonas syringae HopN1 peptidase, a type III secretion system effector that can suppress plant cell death events in both compatible and incompatible interactions. HopN1's proteolytic activity is dependent upon the invariant C/H/D residues conserved in the C58/YopT family peptidase domain.	216
410973	cd20500	Peptidase_C80	peptidase C80 family. The peptidase C80 family includes self-cleaving proteins that are precursors of bacterial toxins such as the Vibrio cholerae RTX self-cleaving toxin, as well as the major virulence factors of Clostridium difficile multidomain toxins, TcdA and TcdB. These toxins contain a cysteine protease domain (CPD) that autoproteolytically releases a cytotoxic effector domain upon binding intracellular inositol hexakisphosphate. This family also contains filamentous hemagglutinin family cysteine protease C80 domains, that are located at the C-terminus. All domains in this family contain the characteristic Cys/His residues in the active site. Site-directed mutagenesis has identified functional residues Asp/His/Cys in Clostridium toxin B and His/Cys in cholera RTX toxin.	150
410974	cd20501	C80_RtxA-like	peptidase C80 cysteine binding domain of RTX toxin RtxA and related proteins. This peptidase C80 family includes the autoproteolytic cysteine protease domain (CPD) of Vibrio cholerae multifunctional autoprocessing repeats-in-toxin (MARTX) toxin that causes disassembly of the actin cytoskeleton and enhances V. cholerae colonization of the small intestine, possibly by facilitating evasion of phagocytic cells. The central region of this toxin is composed of several domains, including the actin cross-linking domain (ACD) that introduces lysine-glutamate cross-links between actin protomers, the Rho-inactivating domain (RID) that disables small Rho GTPases, and an autoprocessing cysteine protease domain (CPD). Within the cell, the CPD is activated by the binding of inositol hexakisphosphate to release individual effector domains of the toxin into the cytosol. The CPD contains the characteristic Cys/His residues in the active site.	194
410975	cd20502	C80_toxinA_B-like	Peptidase C80 cysteine binding domain of Clostridium difficile toxins A and B, and related proteins. This peptidase C80 family includes the major virulence factors of Clostridium difficile multidomain toxins TcdA and TcdB. These large homologous toxins contain several distinct domains including a cysteine protease domain (CPD) that autoproteolytically releases a cytotoxic effector domain upon binding of intracellular inositol hexakisphosphate. C. difficile is a major cause of intestinal tissue damage and inflammation, and TcdA is generally more inflammatory whereas TcdB is more cytotoxic; studies show that the CPD is an internal regulator of the proinflammatory activity. Site-directed mutagenesis has identified functional residues Asp/His/Cys in Clostridium toxin B.	209
410976	cd20503	C80_adhesin-like	peptidase C80 domains found in filamentous hemagglutinin or adhesin, and other similar proteins. This peptidase C80 family includes the cysteine-binding domain (CPD) of several large, repetitive bacterial exoproteins involved in heme utilization or adhesion and many typically having CPD repeats as well as regions rich in repeats. Many members of this family have been designated adhesins or filamentous haemagglutinins. The CPD contains the characteristic Asp/Cys/His residues found in Clostridium toxin B active site.	156
410208	cd20504	CYCLIN_CCNA_rpt1	first cyclin box found in cyclin-A (CCNA) family. The CCNA family includes two A-type cyclins, CCNA1 and CCNA2. CCNA1 may primarily function in the control of the germline meiotic cell cycle and additionally in the control of mitotic cell cycle in some somatic cells. CCNA2 controls both the G1/S and the G2/M transition phases of the cell cycle. Members in this family contain two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain.	128
410209	cd20505	CYCLIN_CCNA_rpt2	second cyclin box found in cyclin-A (CCNA) family. The CCNA family includes two A-type cyclins, CCNA1 and CCNA2. CCNA1 may primarily function in the control of the germline meiotic cell cycle and additionally in the control of mitotic cell cycle in some somatic cells. CCNA2 controls both the G1/S and the G2/M transition phases of the cell cycle. Members in this family contain two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain.	110
410210	cd20506	CYCLIN_AtCycA-like_rpt2	second cyclin box found in Arabidopsis thaliana A-type cyclins (CycAs) and similar proteins. Plant A-type cyclins (CycAs) correspond to a group of G2/mitotic-specific cyclins that are functionally linked to S- and M-phases of the mitotic cycle, which predicts their involvement also in meiosis. CycAs associate with their partner cyclin-dependent kinases (CDKs) to trigger the kinase activity. They contain two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain.	111
410211	cd20507	CYCLIN_CCNB1-like_rpt1	first cyclin box found in cyclin-B1 (CCNB1)-like family. The CCNB1-like family includes two B-type cyclins, CCNB1 and CCNB2, both of which are essential for the control of the cell cycle at the G2/M (mitosis) transition. Members in this family contain two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain.	130
410212	cd20508	CYCLIN_CCNB3_rpt1	first cyclin box found in G2/mitotic-specific cyclin-B3 (CCNB3) and similar proteins. CCNB3 is a mitotic B-type cyclin that promotes the metaphase-anaphase transition. It controls anaphase onset independent of spindle assembly checkpoint in meiotic oocytes. CCNB3 contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain.	142
410213	cd20509	CYCLIN_CCNB1-like_rpt2	second cyclin box found in cyclin-B1 (CCNB1)-like family. The CCNB1-like family includes two B-type cyclins, CCNB1 and CCNB2, both of which are essential for the control of the cell cycle at the G2/M (mitosis) transition. Members in this family contain two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain.	111
410214	cd20510	CYCLIN_CCNB3_rpt2	second cyclin box found in G2/mitotic-specific cyclin-B3 (CCNB3) and similar proteins. CCNB3 is a mitotic B-type cyclin that promotes the metaphase-anaphase transition. It controls anaphase onset independent of spindle assembly checkpoint in meiotic oocytes. CCNB3 contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain.	115
410215	cd20511	CYCLIN_AtCycB-like_rpt2	second cyclin box found in Arabidopsis thaliana B-type cyclins (CycBs) and similar proteins. Plant B-type cyclins (CycBs) correspond to a group of G2/mitotic-specific cyclins that are functionally linked to S- and M-phases of the mitotic cycle, which predicts their involvement also in meiosis. CycBs associate with their partner cyclin-dependent kinases (CDKs) to trigger the kinase activity. They contain two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain.	117
410216	cd20512	CYCLIN_CLBs_yeast_rpt2	second cyclin box found in yeast B-type cyclins. The family includes Saccharomyces cerevisiae G2/mitotic-specific cyclins 1-4 (ScCLB1-4), S-phase entry cyclins 5-6 (ScCLB5-6), and Schizosaccharomyces pombe G2/mitotic-specific cyclins, cig1, cig2 and cdc13. ScCLB1-4 are essential for the control of the cell cycle at the G2/M (mitosis) transition. They interact with the CDC2 protein kinase to form maturation promoting factor (MPF). ScCLB5-6 interact with CDC28 and are involved in DNA replication in Saccharomyces cerevisiae. ScCLB5 is required for efficient progression through S phase and possibly for the normal progression through meiosis. ScCLB6 is involved in G1/S and/or S phase progression. Cig1 is required for efficient passage of the G1/S transition. Cig2 and cdc13 are essential for the control of the cell cycle at the G2/M and G1/S (mitosis) transition. They interact with the cdc2 protein kinase to form MPF. Members in this family contain two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain.	116
410217	cd20513	CYCLIN_CCNC_rpt1	first cyclin box found in cyclin-C (CCNC) and similar proteins. CCNC, also termed CycC, or SRB11, is a component of the Mediator complex, a coactivator involved in regulated gene transcription of nearly all RNA polymerase II-dependent genes. It mediates stress-induced mitochondrial fission and apoptosis. CCNC contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain.	101
410218	cd20514	CYCLIN_CCNC_rpt2	second cyclin box found in cyclin-C (CCNC) and similar proteins. CCNC, also termed CycC, or SRB11, is a component of the Mediator complex, a coactivator involved in regulated gene transcription of nearly all RNA polymerase II-dependent genes. It mediates stress-induced mitochondrial fission and apoptosis. CCNC contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain.	92
410219	cd20515	CYCLIN_CCND_rpt1	first cyclin box found in cyclin-D (CCND) family. The CCND family includes three mitogen-induced D-type cyclins, CCND1, CCND2 and CCND3, which function as regulatory subunits of the cyclin-dependent kinases CDK4 and CDK6, that drive progression through the G1 phase of the cell cycle. Members in this family contain two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain.	150
410220	cd20516	CYCLIN_CCND_rpt2	second cyclin box found in cyclin-D (CCND) family. The CCND family includes three mitogen-induced D-type cyclins, CCND1, CCND2 and CCND3, which function as regulatory subunits of the cyclin-dependent kinases CDK4 and CDK6, that drive progression through the G1 phase of the cell cycle. Members in this family contain two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain.	98
410221	cd20517	CYCLIN_vCyC_rpt1	first cyclin box found in viral cyclin (v-cyclin). v-Cyclin modulates host cell cycle progression and apoptotic signaling pathways. It forms an active kinase complex with cellular CDK6, a cellular cyclin-dependent kinase known to interact with cellular type D cyclins. v-Cyclin belongs to Cyclin D subfamily. It contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain.	99
410222	cd20518	CYCLIN_vCyC_rpt2	second cyclin box found in viral cyclin (v-cyclin). v-Cyclin modulates host cell cycle progression and apoptotic signaling pathways. It forms an active kinase complex with cellular CDK6, a cellular cyclin-dependent kinase known to interact with cellular type D cyclins. v-Cyclin belongs to Cyclin D subfamily. It contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain.	100
410223	cd20519	CYCLIN_CCNE_rpt1	first cyclin box found in G1/S-specific cyclin-E (CCNE) family. The CCNE family includes two E-type cyclins, CCNE1 and CCNE2. CCNE1 is essential for the control of the cell cycle at the G1/S (start) transition. It interacts with CDK2 protein kinase to form a serine/threonine kinase holoenzyme complex. CCNE2 is essential for the control of the cell cycle at the late G1 and early S phase. It interacts with the CDK2 (in vivo) and CDK3 (in vitro) protein kinases to form serine/threonine kinase holoenzyme complexes. Members of this family contain two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain.	131
410224	cd20520	CYCLIN_CCNE_rpt2	second cyclin box found in G1/S-specific cyclin-E (CCNE) family. The CCNE family includes two E-type cyclins, CCNE1 and CCNE2. CCNE1 is essential for the control of the cell cycle at the G1/S (start) transition. It interacts with CDK2 protein kinase to form a serine/threonine kinase holoenzyme complex. CCNE2 is essential for the control of the cell cycle at the late G1 and early S phase. It interacts with the CDK2 (in vivo) and CDK3 (in vitro) protein kinases to form serine/threonine kinase holoenzyme complexes. Members in this family contain two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain.	105
410225	cd20521	CYCLIN_CCNF_rpt1	first cyclin box found in G2/mitotic-specific cyclin-F (CCNF) and similar proteins. CCNF, also termed F-box only protein 1 (FBXO1), is a substrate recognition component of a SCF (SKP1-CUL1-F-box protein) E3 ubiquitin-protein ligase complex which mediates the ubiquitination and subsequent proteasomal degradation of CP110 during G2 phase, thereby acting as an inhibitor of centrosome reduplication. It is the largest among all cyclins and oscillates in the cell cycle like other cyclins. CCNF contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain.	95
410226	cd20522	CYCLIN_CCNF_rpt2	second cyclin box found in G2/mitotic-specific cyclin-F (CCNF) and similar proteins. CCNF, also termed F-box only protein 1 (FBXO1), is a substrate recognition component of a SCF (SKP1-CUL1-F-box protein) E3 ubiquitin-protein ligase complex which mediates the ubiquitination and subsequent proteasomal degradation of CP110 during G2 phase, thereby acting as an inhibitor of centrosome reduplication. It is the largest among all cyclins and oscillates in the cell cycle like other cyclins. CCNF contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain.	112
410227	cd20523	CYCLIN_CCNG	cyclin box found in the cyclin-G (CCNG) family. The CCNG family includes two cyclins, CCNG1 and CCNG2. CCNG1 is the only cyclin that has either positive or negative effects on cell growth. It is associated with G2/M phase arrest in response to DNA damage. It is also involved in the development of human carcinoma. CCNG2 may play a role in growth regulation and in negative regulation of cell cycle progression. It has been identified as a tumor suppressor in several cancers. Members of this family contain one cyclin box. The cyclin box is a protein binding domain.	94
410228	cd20524	CYCLIN_CCNH_rpt1	first cyclin box found in cyclin-H (CCNH) and similar proteins. CCNH, also called MO15-associated protein, p34, or p37, is normally associated with the cyclin-dependent kinase cdk7, the catalytic subunit of the CDK-activating kinase (CAK) enzymatic complex. CCNH contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain.	150
410229	cd20525	CYCLIN_CCNH_rpt2	second cyclin box found in cyclin-H (CCNH) and similar proteins. CCNH, also called MO15-associated protein, p34, or p37, is normally associated with the cyclin-dependent kinase cdk7, the catalytic subunit of the CDK-activating kinase (CAK) enzymatic complex. CCNH contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain.	126
410230	cd20526	CYCLIN_CCNI-like	cyclin box found in cyclin-I (CCNI) and similar proteins. CCNI is an atypical cyclin because it is most abundant in post-mitotic cells. It is involved in various biological processes, such as cell survival, angiogenesis, cell differentiation, and cell cycle progression. CCNI contains a typical cyclin box near the N-terminus and a PEST sequence near the C-terminus. The cyclin box is a protein binding domain.	99
410231	cd20528	CYCLIN_CCNJ-like_rpt1	first cyclin box found in cyclin-J (CCNJ) family. The CCNJ family includes two cyclins, CCNJ and CCNJ-like. CCNJ may regulate the cell cycle or transcription. Members of this family contain two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain.	103
410232	cd20529	CYCLIN_CCNJ-like_rpt2	second cyclin box found in cyclin-J (CCNJ) family. The CCNJ family includes two cyclins, CCNJ and CCNJ-like. CCNJ may regulate the cell cycle or transcription. Members of this family contain two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain.	101
410233	cd20530	CYCLIN_CCNK_rpt1	first cyclin box found in cyclin-K (CCNK) and similar proteins. CCNK is a novel RNA polymerase II-associated C-type cyclin possessing both carboxy-terminal domain kinase and Cdk-activating kinase activity. It is a regulatory subunit of cyclin-dependent kinases that mediates the activation of these target kinases. It plays a role in transcriptional regulation by controlling the phosphorylation of the C-terminal domain (CTD) of the large subunit of RNA polymerase II (POLR2A). CCNK contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain.	115
410234	cd20531	CYCLIN_CCNK_rpt2	second cyclin box found in cyclin-K (CCNK) and similar proteins. CCNK is a novel RNA polymerase II-associated C-type cyclin possessing both carboxy-terminal domain kinase and Cdk-activating kinase activity. It is a regulatory subunit of cyclin-dependent kinases that mediates the activation of these target kinases. It plays a role in transcriptional regulation by controlling the phosphorylation of the C-terminal domain (CTD) of the large subunit of RNA polymerase II (POLR2A). CCNK contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain.	101
410235	cd20532	CYCLIN_CCNL_rpt1	first cyclin box found in cyclin-L (CCNL) family. The CCNL family includes two cyclins, CCNL1 and CCNL2. CCNL1 is involved in the regulation of RNA polymerase II (pol II) transcription. It functions in association with cyclin-dependent kinases (CDKs). CCNL2 is a novel RNA polymerase II-associated cyclin involved in pre-mRNA splicing. It may induce cell death, possibly by acting on the transcription and RNA processing of apoptosis-related factors. Members of this family contain two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain.	127
410236	cd20533	CYCLIN_CCNL_rpt2	second cyclin box found in cyclin-L (CCNL) family. The CCNL family includes two cyclins, CCNL1 and CCNL2. CCNL1 is involved in regulation of RNA polymerase II (pol II) transcription. It functions in association with cyclin-dependent kinases (CDKs). CCNL2 is a novel RNA polymerase II-associated cyclin involved in pre-mRNA splicing. It may induce cell death, possibly by acting on the transcription and RNA processing of apoptosis-related factors. Members of this family contain two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain.	92
410237	cd20534	CYCLIN_CCNM_CCNQ_rpt1	first cyclin box found in cyclin-M (CCNM) family. The CCNM family proteins, also called ancient conserved domain proteins (ACDPs), are evolutionarily conserved Mg2+ transporters. CCNM, also called cyclin-Q (CCNQ), or CDK10-activating cyclin, or cyclin-related protein FAM58A, associates with CDK10 to promote its kinase activity. Members of this family contain two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain.	110
410238	cd20535	CYCLIN_CCNM_CCNQ_rpt2	second cyclin box found in cyclin-M (CCNM) family. The CCNM family proteins, also called ancient conserved domain proteins (ACDPs), are evolutionarily conserved Mg2+ transporters. CCNM, also called cyclin-Q (CCNQ), or CDK10-activating cyclin, or cyclin-related protein FAM58A, associates with CDK10 to promote its kinase activity. Members of this family contain two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain.	104
410239	cd20536	CYCLIN_CCNO_rpt1	first cyclin box found in cyclin-O (CCNO) and similar proteins. CCNO is specifically required for generation of multiciliated cells, possibly by promoting a cell cycle state compatible with centriole amplification and maturation. It acts downstream of MCIDAS (multiciliate differentiation and DNA synthesis associated cell cycle protein) to promote mother centriole amplification and maturation in preparation for apical docking. CCNO is involved in the activation of cyclin-dependent kinase 2. CCNO contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain.	93
410240	cd20537	CYCLIN_CCNO-like_rpt2	second cyclin box found in cyclin-O (CCNO) and similar proteins. This subfamily is composed of CCNO and similar proteins including Schizosaccharomyces pombe meiosis-specific cyclin rem1, Drosophila melanogaster G2/mitotic-specific cyclin-A (CCNA), and Candida albicans G1/S-specific cyclin CCN1, among others. Rem1 is required for pre-meiotic DNA synthesis and S phase progression. CCNA is essential for the control of the cell cycle at the G2/M (mitosis) transition. CCN1 is essential for the control of the cell cycle at the G1/S (start) transition and for maintenance of filamentous growth. CCNO is specifically required for generation of multiciliated cells, possibly by promoting a cell cycle state compatible with centriole amplification and maturation. It is involved in the activation of cyclin-dependent kinase 2. Members of this subfamily contain two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain.	91
410241	cd20538	CYCLIN_CCNT_rpt1	first cyclin box found in cyclin-T (CCNT) family. The CCNT family includes two C-type cyclins, cyclin-T1 (CCNT1) and cyclin-T2 (CCNT2), both of which are regulatory subunits of the cyclin-dependent kinase pair (CDK9/cyclin-T) complex, also called positive transcription elongation factor B (P-TEFb), which is proposed to facilitate the transition from abortive to productive elongation by phosphorylating the CTD (C-terminal domain) of the large subunit of RNA polymerase II (RNA Pol II). Members of this family contain two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain.	137
410242	cd20539	CYCLIN_CCNT_rpt2	second cyclin box found in cyclin-T (CCNT) family. The CCNT family includes two C-type cyclins, cyclin-T1 (CCNT1) and cyclin-T2 (CCNT2), both of which are regulatory subunits of the cyclin-dependent kinase pair (CDK9/cyclin-T) complex, also called positive transcription elongation factor B (P-TEFb), which is proposed to facilitate the transition from abortive to productive elongation by phosphorylating the CTD (C-terminal domain) of the large subunit of RNA polymerase II (RNA Pol II). Members in this family contain two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain.	92
410243	cd20540	CYCLIN_CCNY_like	cyclin box found in cyclin-Y (CCNY) family. The CCNY family includes two cyclins, CCNY and CCNY-like protein 1 (CCNYL1). They can enhance Wnt/beta-catenin signaling in mitosis. CCNY, also called Cyc-Y, cyclin box protein 1 (CBCP1), cyclin fold protein 1 (CFP1), or cyclin-X (CCNX), is a key cell cycle regulator that acts as a growth factor sensor to integrate extracellular signals with the cell cycle machinery. It is a positive regulatory subunit of the cyclin-dependent kinases CDK14/PFTK1 and CDK16. It acts as a cell-cycle regulator of Wnt signaling pathway during G2/M phase by recruiting CDK14/PFTK1 to the plasma membrane and promoting phosphorylation of LRP6, leading to the activation of the Wnt signaling pathway. Members of this family contain one cyclin box. The cyclin box is a protein binding domain.	97
410244	cd20541	CYCLIN_CNTD1	cyclin box found in Cyclin N-terminal domain-containing protein 1 (CNTD1) and similar proteins. CNTD1 is a cyclin-related protein critical for meiotic crossover maturation and deselection of excess precrossover sites. CNTD1 contains one cyclin box. The cyclin box is a protein binding domain.	127
410245	cd20542	CYCLIN_CNTD2	cyclin box found in Cyclin N-terminal domain-containing protein 2 (CNTD2) and similar proteins. CNTD2 is an atypical cyclin upregulated in human cancer tissues. It promotes cell proliferation and migration, as well as increases tumor growth in vivo. It can function as a prognostic factor and drug target. CNTD2 contains one cyclin box. The cyclin box is a protein binding domain.	96
410246	cd20543	CYCLIN_AtCycD-like_rpt1	first cyclin box found in plant cyclin-delta family. This subfamily is composed of plant delta family cyclins, including a group of G1/S-specific D-type cyclins from Arabidopsis thaliana which may activate the cell cycle in the root apical meristem (RAM) and promote embryonic root (radicle) protrusion. Members of this family contain two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain.	99
410247	cd20544	CYCLIN_AtCycD-like_rpt2	second cyclin box found in plant cyclin-delta family. This subfamily is composed of plant delta family cyclins, including a group of G1/S-specific D-type cyclins from Arabidopsis thaliana which may activate the cell cycle in the root apical meristem (RAM) and promote embryonic root (radicle) protrusion. Members of this family contain two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain.	90
410248	cd20545	CYCLIN_SpCG1C-like_rpt1	first cyclin box found in Schizosaccharomyces pombe cyclin C homolog 1 (pch1) and similar proteins. Cyclin pch1 is essential for progression through the whole cell cycle. It is a homolog of cyclin T; it forms a heterodimer with its partner kinase, cyclin-dependent kinase 9 (CDK9), that can phosphorylate both the pol II C-terminal domain (CTD) and the CTD of transcription elongation factor Spt5. Yeast Cdk9/Pch1, with mRNA capping enzyme Pct1, may also form an elongation checkpoint for mRNA quality control. Members of this family contain two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain.	116
410249	cd20546	CYCLIN_SpCG1C_ScCTK2-like_rpt2	second cyclin box found in Schizosaccharomyces pombe cyclin C homolog 1 (pch1), Saccharomyces cerevisiae CTD kinase subunit 2 (ScCTK2), and similar proteins. Cyclin pch1 is essential for progression through the whole cell cycle. It is a homolog of cyclin T; it forms a heterodimer with its partner kinase, cyclin-dependent kinase 9 (CDK9), that can phosphorylate both the pol II C-terminal domain (CTD) and the CTD of transcription elongation factor Spt5. Yeast Cdk9/Pch1, with mRNA capping enzyme Pct1, may also form an elongation checkpoint for mRNA quality control. CTK2, also called CTD kinase subunit beta, CTDK-I subunit beta, or CTD kinase 38 kDa subunit, is the cyclin subunit of the CTDK-I complex, which hyperphosphorylates the C-terminal heptapeptide repeat domain (CTD) of the largest RNA polymerase II subunit. This group also includes yeast RNA polymerase II holoenzyme cyclin-like subunit, a component of the SRB8-11 complex, a regulatory module of the Mediator complex which is involved in regulation of basal and activated RNA polymerase II-dependent transcription. Members of this family contain two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain.	97
410250	cd20547	CYCLIN_ScCTK2-like_rpt1	first cyclin box found in Saccharomyces cerevisiae CTD kinase subunit 2 (ScCTK2) and similar proteins. CTK2, also called CTD kinase subunit beta, CTDK-I subunit beta, or CTD kinase 38 kDa subunit, is the cyclin subunit of the CTDK-I complex, which hyperphosphorylates the C-terminal heptapeptide repeat domain (CTD) of the largest RNA polymerase II subunit. CTK2 contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain.	110
410251	cd20548	CYCLIN_RB-like	cyclin box found in retinoblastoma-associated protein (RB) family. The RB family includes retinoblastoma-associated protein (RB), and two retinoblastoma-like proteins, RBL1 and RBL2. RB, also called p105-Rb, pRb, or pp110, is a key regulator of entry into cell division, and also acts as a tumor suppressor. It promotes G0-G1 transition when phosphorylated by CDK3/cyclin-C. It also acts as a transcription repressor of E2F1 target genes. RB is directly involved in heterochromatin formation by maintaining overall chromatin structure. It recruits and targets histone methyltransferases SUV39H1, KMT5B and KMT5C, leading to epigenetic transcriptional repression. RBL1 and RBL2 are also key regulators of entry into cell division. RBL1 and RBL2 recruit and target histone methyltransferases KMT5B and KMT5C, leading to epigenetic transcriptional repression. They control histone H4 'Lys-20' trimethylation and probably act as transcription repressors by recruiting chromatin-modifying enzymes to promoters. They may also act as tumor suppressors. Members of this family contain one cyclin box. The cyclin box is a protein binding domain.	122
410252	cd20549	CYCLIN_TFIIB_archaea_like_rpt1	first cyclin box found in archaeal transcription initiation factor IIB (TFIIB) and similar proteins. Archaeal TFIIB stabilizes TATA-binding protein (TBP) binding to an archaeal box-A promoter. It is also responsible for recruiting RNA polymerase II to the pre-initiation complex (DNA-TBP-TFIIB). TFIIB contains two cyclin boxes. This model corresponds to the first one. The cyclin box fold is generally a protein binding domain, but binds DNA in TFIIB.	99
410253	cd20550	CYCLIN_TFIIB_archaea_like_rpt2	second cyclin box found in archaeal transcription initiation factor IIB (TFIIB) and similar proteins. Archaeal TFIIB stabilizes TATA-binding protein (TBP) binding to an archaeal box-A promoter. It is also responsible for recruiting RNA polymerase II to the pre-initiation complex (DNA-TBP-TFIIB). TFIIB contains two cyclin boxes. This model corresponds to the second one. The cyclin box fold is generally a protein binding domain, but binds DNA in TFIIB.	87
410254	cd20551	CYCLIN_TFIIB_rpt1	first cyclin box found in transcription initiation factor IIB (TFIIB) and similar proteins. TFIIB, also called B-related factor 2 (BRF-2) or S300-II, is a general transcription factor that plays a role in transcription initiation by RNA polymerase II (Pol II). It is involved in the pre-initiation complex (PIC) formation and Pol II recruitment at promoter DNA. TFIIB contains two cyclin boxes. This model corresponds to the first one. The cyclin box fold is generally a protein binding domain, but binds DNA in TFIIB.	88
410255	cd20552	CYCLIN_TFIIB_rpt2	second cyclin box found in transcription initiation factor IIB (TFIIB) and similar proteins. TFIIB, also called B-related factor 2 (BRF-2) or S300-II, is a general transcription factor that plays a role in transcription initiation by RNA polymerase II (Pol II). It is involved in the pre-initiation complex (PIC) formation and Pol II recruitment at promoter DNA. TFIIB contains two cyclin boxes. This model corresponds to the second one. The cyclin box fold is generally a protein binding domain, but binds DNA in TFIIB.	97
410256	cd20553	CYCLIN_TFIIIB90_rpt1	first cyclin box found in transcription factor IIIB 90 kDa subunit (TFIIIB90) and similar proteins. TFIIIB90, also called B-related factor 1 (BRF-1), or TATA box-binding protein-associated factor, RNA polymerase III, subunit 2 (TAF3B2), is a general activator of RNA polymerase which utilizes different TFIIIB complexes at structurally distinct promoters. TFIIIB90 contains two cyclin boxes. This model corresponds to the first one. The cyclin box fold is generally a protein binding domain, but binds DNA in TFIIIB90.	91
410257	cd20554	CYCLIN_TFIIIB90_rpt2	second cyclin box found in transcription factor IIIB 90 kDa subunit (TFIIIB90) and similar proteins. TFIIIB90, also called B-related factor 1 (BRF-1), or TATA box-binding protein-associated factor, RNA polymerase III, subunit 2 (TAF3B2), is a general activator of RNA polymerase which utilizes different TFIIIB complexes at structurally distinct promoters. TFIIIB90 contains two cyclin boxes. This model corresponds to the second one. The cyclin box fold is generally a protein binding domain, but binds DNA in TFIIIB90.	92
410258	cd20555	CYCLIN_BRF2	cyclin box found in B-related factor 2 (BRF-2) and similar proteins. BRF-2, also called transcription factor IIIB 50 kDa subunit (TFIIIB50), or BRFU, is a general activator of RNA polymerase (Pol) III transcription and is required for Pol III transcription of genes with promoter elements upstream of the initiation sites. It recruits Pol III to type III gene-external promoters, including the U6 spliceosomal RNA and selenocysteine tRNA genes. BRF-2 contains one cyclin box. The cyclin box fold is generally a protein binding domain, but binds DNA in BRF-2.	97
410259	cd20556	CYCLIN_CABLES	cyclin box found in CDK5 and ABL1 enzyme substrate (CABLES) family. The CABLES family includes CABLES1 and CABLES2. CABLES1, also called interactor with CDK3 1 (Ik3-1), is a cyclin-dependent kinase binding protein that enhances cyclin-dependent kinase tyrosine phosphorylation by non-receptor tyrosine kinases, such as that of CDK5 by activated ABL1, which leads to increased CDK5 activity and is critical for neuronal development, and that of CDK2 by WEE1, which leads to decreased CDK2 activity and growth inhibition. CABLES2, also called interactor with CDK3 2 (Ik3-2), acts as a proapoptotic factor involved in both p53-mediated and p53-independent apoptotic pathways. Both CABLES1 and CABLES2 contain one cyclin box. The cyclin box is a protein binding domain.	119
410260	cd20557	CYCLIN_ScPCL1-like	cyclin box found in Saccharomyces cerevisiae G1/S-specific cyclin PCL1, PCL2 and similar proteins. The family includes a group of cyclin-like proteins that interact with the Pho85 cyclin-dependent kinase, such as Saccharomyces cerevisiae G1/S-specific cyclin PCL1, PCL2, PCL9 and their vertebrate counterparts, cyclin Pas1/PHO80 domain-containing protein 1 (CNPPD1). PCL1 (also called PHO85 cyclin-1, or cyclin HCS26) and PCL2 (also called PHO85 cyclin-2, or cyclin HCS26 homolog) are G1/S-specific cyclin partners of the cyclin-dependent kinase (CDK) PHO85. They are essential for the control of the cell cycle at the G1/S (start) transition. The PCL1-PHO85 cyclin-CDK holoenzyme is involved in phosphorylation of the CDK inhibitor (CKI) SIC1, which is required for its ubiquitination and degradation, releasing repression of b-type cyclins and promoting exit from mitosis. Together with cyclin PCL2, it positively controls degradation of sphingoid long chain base kinase LCB4. PCL1-PHO85 also phosphorylates HMS1, NCP1 and NPA3, which may all have a role in mitotic exit. PCL2-PHO85 also phosphorylates RVS167, linking cyclin-CDK activity with organization of the actin cytoskeleton. PCL9 is an M/G1-specific cyclin partner of the cyclin-dependent kinase (CDK) PHO85. It may have a role in bud site selection in the G1 phase. The family also includes cyclin Pas1/PHO80 domain-containing protein 1 (CNPPD1) and similar proteins. Their biological functions remain unclear. Members of this family contain one cyclin box. The cyclin box is a protein binding domain.	94
410261	cd20558	CYCLIN_ScPCL7-like	cyclin box found in Saccharomyces cerevisiae PHO85 cyclin-7 (ScPCL7) and similar proteins. ScPCL7, also called PHO85-associated protein 1, is a cyclin partner of the cyclin-dependent kinase (CDK) PHO85. Together with cyclin PCL6, ScPCL7 controls glycogen phosphorylase and glycogen synthase activities in response to nutrient availability. This family also includes Schizosaccharomyces pombe PHO85 cyclin-like protein Psl1 (SpPsl1) and Arabidopsis thaliana PHO80-like proteins, P-type cyclins (CYCPs). SpPsl1 is the cyclin partner of the CDK pef1 (PHO85 homolog). CYCPs may be involved in cell division, cell differentiation, and the nutritional status of the cell in Arabidopsis thaliana. Members of this family contain one cyclin box. The cyclin box is a protein binding domain.	101
410262	cd20559	CYCLIN_ScCLN_like	cyclin box found in Saccharomyces cerevisiae G1/S-specific cyclins (ScCLNs) and similar proteins. ScCLNs, including ScCLN1-3, are essential for the control of the cell cycle at the G1/S (start) transition in Saccharomyces. ScCLN1 and ScCLN2 interact with the CDC28 protein kinase to form maturation promoting factor (MPF). ScCLN3 may be an upstream activator of the G1 cyclins that directly initiate G1/S transition. This family also includes Schizosaccharomyces pombe cyclin puc1, which contributes to negative regulation of the timing of sexual development in fission yeast, and functions at the transition between cycling and non-cycling cells. It interacts with protein kinase A. Members of this family contain one cyclin box. The cyclin box is a protein binding domain.	95
410263	cd20560	CYCLIN_CCNA1_rpt1	first cyclin box found in cyclin-A1 (CCNA1). CCNA1 may primarily function in the control of the germline meiotic cell cycle and additionally in the control of mitotic cell cycle in some somatic cells. CCNA1 contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain.	163
410264	cd20561	CYCLIN_CCNA2_rpt1	first cyclin box found in cyclin-A2 (CCNA2) and similar proteins. CCNA2 controls both the G1/S and the G2/M transition phases of the cell cycle. It is significantly over-expressed in various cancer types, and can be used as a prognostic biomarker for estrogen receptor positive (ER+) breast cancer and tamoxifen resistance. CCNA2 contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain.	131
410265	cd20562	CYCLIN_AtCycA_like_rpt1	first cyclin box found in Arabidopsis thaliana A-type cyclins (CycAs) and similar proteins. Plant A-type cyclins (CycAs) correspond to a group of G2/mitotic-specific cyclins that are functionally linked to S- and M-phases of the mitotic cycle, which predicts their involvement also in meiosis. CycAs associate with their partner cyclin-dependent kinases (CDKs) to trigger the kinase activity. They contain two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain.	136
410266	cd20563	CYCLIN_CCNA1_rpt2	second cyclin box found in cyclin-A1 (CCNA1) and similar proteins. CCNA1 may primarily function in the control of the germline meiotic cell cycle and additionally in the control of mitotic cell cycle in some somatic cells. CCNA1 contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain.	123
410267	cd20564	CYCLIN_CCNA2_rpt2	second cyclin box found in cyclin-A2 (CCNA2) and similar proteins. CCNA2 controls both the G1/S and the G2/M transition phases of the cell cycle. It is significantly overexpressed in various cancer types, and can be used as a prognostic biomarker for estrogen receptor positive (ER+) breast cancer and tamoxifen resistance. CCNA2 contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain.	110
410268	cd20565	CYCLIN_CCNB1_rpt1	first cyclin box found in G2/mitotic-specific cyclin-B1 (CCNB1). CCNB1 is essential for the control of the cell cycle at the G2/M (mitosis) transition. It is required for embryo development. Over-expression of human CCNB1 has been found in numerous cancers and has been associated with tumor aggressiveness. CCNB1 contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain.	129
410269	cd20566	CYCLIN_CCNB2_rpt1	first cyclin box found in G2/mitotic-specific cyclin-B2 (CCNB2) and similar proteins. CCNB2 is essential for the control of the cell cycle at the G2/M (mitosis) transition. It is required for progression through meiosis. CCNB2 contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain.	133
410270	cd20567	CYCLIN_AtCycB-like_rpt1	first cyclin box found in Arabidopsis thaliana B-type cyclins (CycBs) and similar proteins. Plant B-type cyclins (CycBs) correspond to a group of G2/mitotic-specific cyclins that are functionally linked to S- and M-phases of the mitotic cycle, which predicts their involvement also in meiosis. CycBs associate with their partner cyclin-dependent kinases (CDKs) to trigger the kinase activity. They contain two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain.	147
410271	cd20568	CYCLIN_CLBs_yeast_rpt1	first cyclin box found in yeast B-type cyclins. This subfamily includes Saccharomyces cerevisiae G2/mitotic-specific cyclins 1-4 (ScCLB1-4), S-phase entry cyclins 5-6 (ScCLB5-6), and Schizosaccharomyces pombe G2/mitotic-specific cyclins, cig1, cig2 and cdc13. ScCLB1-4 are essential for the control of the cell cycle at the G2/M (mitosis) transition. They interact with the CDC2 protein kinase to form maturation promoting factor (MPF). ScCLB5-6 interact with CDC28 and are involved in DNA replication in Saccharomyces cerevisiae. ScCLB5 is required for efficient progression through S phase and possibly for the normal progression through meiosis. ScCLB6 is involved in G1/S and/or S phase progression. Cig1 is required for efficient passage of the G1/S transition. Cig2 and cdc13 are essential for the control of the cell cycle at the G2/M and G1/S (mitosis) transitions. They interact with the cdc2 protein kinase to form MPF. Members of this family contain two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain.	134
410272	cd20569	CYCLIN_CCNB1_rpt2	second cyclin box found in G2/mitotic-specific cyclin-B1 (CCNB1) and similar proteins. CCNB1 is essential for the control of the cell cycle at the G2/M (mitosis) transition. It is required for embryo development. Over-expression of human CCNB1 has been found in numerous cancers and has been associated with tumor aggressiveness. CCNB1 contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain.	121
410273	cd20570	CYCLIN_CCNB2_rpt2	second cyclin box found in G2/mitotic-specific cyclin-B2 (CCNB2) and similar proteins. CCNB2 is essential for the control of the cell cycle at the G2/M (mitosis) transition. It is required for progression through meiosis. CCNB2 contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain.	119
410274	cd20571	CYCLIN_AtCycC_rpt1	first cyclin box found in Arabidopsis thaliana C-type cyclins (CycCs) and similar proteins. Plant CycCs are the cognate cyclin partners of cyclin-dependent kinase CDK8. They may be involved in cell cycle control. CycCs contain two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain.	107
410275	cd20572	CYCLIN_AtCycC_rpt2	second cyclin box found in Arabidopsis thaliana C-type cyclins (CycCs) and similar proteins. CycCs are the cognate cyclin partners of cyclin-dependent kinase CDK8. They may be involved in cell cycle control. CycCs contain two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain.	102
410276	cd20573	CYCLIN_CCND1_rpt1	first cyclin box found in G1/S-specific cyclin-D1 (CCND1). CCND1, also called B-cell lymphoma 1 protein (BCL-1), or BCL-1 oncogene, or PRAD1 oncogene, is a regulatory component of the cyclin D1-CDK4 (DC) complex that phosphorylates and inhibits members of the retinoblastoma (RB) protein family, including RB1. The complex also regulates the cell-cycle during G(1)/S transition. It is an important cell cycle regulatory protein involved in carcinogenesis of various human cancers. CCND1 contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain.	149
410277	cd20574	CYCLIN_CCND2_rpt1	first cyclin box found in G1/S-specific cyclin-D2 (CCND2). CCND2 is a regulatory component of the cyclin D2-CDK4 (DC) complex that phosphorylates and inhibits members of the retinoblastoma (RB) protein family, including RB1, and regulates the cell-cycle during G(1)/S transition. It is a critical mediator of exercise-induced cardiac hypertrophy. It also acts as a regulator of cell cycle proteins affecting SAMHD1-mediated HIV-1 restriction in non-proliferating macrophages. CCND2 contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain.	150
410278	cd20575	CYCLIN_CCND3_rpt1	first cyclin box found in G1/S-specific cyclin-D3 (CCND3) and similar proteins. CCND3 is a regulatory component of the cyclin D3-CDK4 (DC) complex that phosphorylates and inhibits members of the retinoblastoma (RB) protein family, including RB1, and regulates the cell-cycle during G(1)/S transition. In skeletal muscle, CCND3 plays a unique function in controlling the proliferation/differentiation balance of myogenic progenitor cells. CCND3 contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain.	150
410279	cd20576	CYCLIN_CCND1_rpt2	second cyclin box found in G1/S-specific cyclin-D1 (CCND1). CCND1, also called B-cell lymphoma 1 protein (BCL-1), or BCL-1 oncogene, or PRAD1 oncogene, is a regulatory component of the cyclin D1-CDK4 (DC) complex that phosphorylates and inhibits members of the retinoblastoma (RB) protein family, including RB1, and regulates the cell-cycle during G(1)/S transition. It is an important cell cycle regulatory protein involved in carcinogenesis of various human cancers. CCND1 contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain.	110
410280	cd20577	CYCLIN_CCND2_rpt2	second cyclin box found in G1/S-specific cyclin-D2 (CCND2). CCND2 is a regulatory component of the cyclin D2-CDK4 (DC) complex that phosphorylates and inhibits members of the retinoblastoma (RB) protein family, including RB1, and regulates the cell-cycle during G(1)/S transition. It is a critical mediator of exercise-induced cardiac hypertrophy. It also acts as a regulator of cell cycle proteins affecting SAMHD1-mediated HIV-1 restriction in non-proliferating macrophages. CCND2 contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain.	105
410281	cd20578	CYCLIN_CCND3_rpt2	second cyclin box found in G1/S-specific cyclin-D3 (CCND3). CCND3 is a regulatory component of the cyclin D3-CDK4 (DC) complex that phosphorylates and inhibits members of the retinoblastoma (RB) protein family, including RB1, and regulates the cell-cycle during G(1)/S transition. In skeletal muscle, CCND3 plays a unique function in controlling the proliferation/differentiation balance of myogenic progenitor cells. CCND3 contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain.	105
410282	cd20579	CYCLIN_CCNE1_rpt1	first cyclin box found in G1/S-specific cyclin-E1 (CCNE1). CCNE1 is essential for the control of the cell cycle at the G1/S (start) transition. It interacts with CDK2 protein kinase to form a serine/threonine kinase holoenzyme complex. CCNE1 contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain.	137
410283	cd20580	CYCLIN_CCNE2_rpt1	first cyclin box found in G1/S-specific cyclin-E2 (CCNE2). CCNE2 is essential for the control of the cell cycle at the late G1 and early S phase. It interacts with the CDK2 (in vivo) and CDK3 (in vitro) protein kinases to form serine/threonine kinase holoenzyme complexes. CCNE2 contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain.	137
410284	cd20581	CYCLIN_CCNE1_rpt2	second cyclin box found in G1/S-specific cyclin-E1 (CCNE1). CCNE1 is essential for the control of the cell cycle at the G1/S (start) transition. It interacts with CDK2 protein kinase to form a serine/threonine kinase holoenzyme complex. CCNE1 contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain.	114
410285	cd20582	CYCLIN_CCNE2_rpt2	second cyclin box found in G1/S-specific cyclin-E2 (CCNE2). CCNE2 is essential for the control of the cell cycle at the late G1 and early S phase. It interacts with the CDK2 (in vivo) and CDK3 (in vitro) protein kinases to form serine/threonine kinase holoenzyme complexes. CCNE2 contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain.	99
410286	cd20583	CYCLIN_CCNG1	cyclin box found in cyclin-G1 (CCNG1) and similar proteins. CCNG1 is the only cyclin that has either positive or negative effects on cell growth. It is associated with G2/M phase arrest in response to DNA damage. It is also involved in the development of human carcinoma. CCNG1 contains one cyclin box. The cyclin box is a protein binding domain.	98
410287	cd20584	CYCLIN_CCNG2	cyclin box found in cyclin-G2 (CCNG2) and similar proteins. CCNG2 may play a role in growth regulation and in negative regulation of cell cycle progression. It has been identified as a tumor suppressor in several cancers. CCNG2 contains one cyclin box. The cyclin box is a protein binding domain.	96
410288	cd20585	CYCLIN_AcCycH_rpt1	first cyclin box found in Arabidopsis thaliana H-type cyclin (CycH) and similar proteins. CycH associates with and activates the cyclin-dependent kinases, CDK-2 and CDK-3. CycH contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain.	170
410289	cd20586	CYCLIN_AcCycH_rpt2	second cyclin box found in Arabidopsis thaliana H-type cyclin (CycH) and similar proteins. CycH associates with and activates the cyclin-dependent kinases, CDK-2 and CDK-3. CycH contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain.	132
410290	cd20587	CYCLIN_AcCycT_rpt1	first cyclin box found in Arabidopsis thaliana T-type cyclins (CycTs) and similar proteins. CycTs associate with their partner cyclin-dependent kinases (CDKs) to trigger their kinase activity. CycTs show high sequence similarity with metazoan cyclin-K (CCNK). CycTs contain two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain.	124
410291	cd20588	CYCLIN_AcCycT_rpt2	second cyclin box found in Arabidopsis thaliana T-type cyclins (CycTs) and similar proteins. CycTs associate with their partner cyclin-dependent kinases (CDKs) to trigger their kinase activity. CycTs show high sequence similarity with metazoan cyclin-K (CCNK). CycTs contain two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain.	93
410292	cd20589	CYCLIN_CCNL1_rpt1	first cyclin box found in cyclin-L1 (CCNL1) and similar proteins. CCNL1 is an L-type cyclin involved in the regulation of RNA polymerase II (pol II) transcription. It functions in association with cyclin-dependent kinases (CDKs). CCNL1 contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain.	124
410293	cd20590	CYCLIN_CCNL2_rpt1	first cyclin box found in cyclin-L2 (CCNL2) and similar proteins. CCNL2 is a novel RNA polymerase II-associated cyclin that is involved in pre-mRNA splicing. It may induce cell death, possibly by acting on the transcription and RNA processing of apoptosis-related factors. CCNL2 contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain.	156
410294	cd20591	CYCLIN_AcCycL_rpt1	first cyclin box found in Arabidopsis thaliana L-type cyclins (CycL) and similar proteins. Cyclin-L1-1 (CycL1), also called arginine-rich cyclin 1 (AtRCY1), or protein MODIFIER OF SNC1 12, is the cognate cyclin for cyclin-dependent kinase G1 (CDKG1). It is involved in regulation of DNA methylation and transcriptional silencing. It is required for synapsis and male meiosis, and for the proper splicing of specific resistance (R) genes. L-type cyclins contain two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain.	128
410295	cd20592	CYCLIN_CCNL1_rpt2	second cyclin box found in cyclin-L1 (CCNL1) and similar proteins. CCNL1 is an L-type cyclin involved in the regulation of RNA polymerase II (pol II) transcription. It functions in association with cyclin-dependent kinases (CDKs). CCNL1 contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain.	123
410296	cd20593	CYCLIN_CCNL2_rpt2	second cyclin box found in cyclin-L2 (CCNL2) and similar proteins. CCNL2 is a novel RNA polymerase II-associated cyclin that is involved in pre-mRNA splicing. It may induce cell death, possibly by acting on the transcription and RNA processing of apoptosis-related factors. CCNL2 contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain.	123
410297	cd20594	CYCLIN_AcCycL_rpt2	second cyclin box found in Arabidopsis thaliana L-type cyclins (CycL) and similar proteins. Cyclin-L1-1 (CycL1), also called arginine-rich cyclin 1 (AtRCY1), or protein MODIFIER OF SNC1 12, is the cognate cyclin for cyclin-dependent kinase G1 (CDKG1). It is involved in regulation of DNA methylation and transcriptional silencing. It is required for synapsis and male meiosis, and for the proper splicing of specific resistance (R) genes. L-type cyclins contain two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain.	101
410298	cd20595	CYCLIN_CCNT1_rpt1	first cyclin box found in cyclin-T1 (CCNT1). CCNT1, also termed CycT1, is a host factor essential for HIV-1 replication in CD4 T cells and macrophages. It is a regulatory subunit of the cyclin-dependent kinase pair (CDK9/cyclin-T1) complex, also called positive transcription elongation factor B (P-TEFb), which is proposed to facilitate the transition from abortive to productive elongation by phosphorylating the CTD (C-terminal domain) of the large subunit of RNA polymerase II (RNA Pol II). CCNT1 contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain.	138
410299	cd20596	CYCLIN_CCNT2_rpt1	first cyclin box found in cyclin-T2 (CCNT2). CCNT2, also termed CycT2, is a regulatory subunit of the cyclin-dependent kinase pair (CDK9/cyclin T) complex, also called positive transcription elongation factor B (P-TEFb), which is proposed to facilitate the transition from abortive to productive elongation by phosphorylating the CTD (carboxy-terminal domain) of the large subunit of RNA polymerase II (RNAP II). CCNT2 contains two cyclin boxes. This model corresponds to the first one. The cyclin box is a protein binding domain.	139
410300	cd20597	CYCLIN_CCNT1_rpt2	second cyclin box found in cyclin-T1 (CCNT1). CCNT1, also termed CycT1, is a host factor essential for HIV-1 replication in CD4 T cells and macrophages. It is a regulatory subunit of the cyclin-dependent kinase pair (CDK9/cyclin-T1) complex, also called positive transcription elongation factor B (P-TEFb), which is proposed to facilitate the transition from abortive to productive elongation by phosphorylating the CTD (C-terminal domain) of the large subunit of RNA polymerase II (RNA Pol II). CCNT1 contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain.	109
410301	cd20598	CYCLIN_CCNT2_rpt2	second cyclin box found in cyclin-T2 (CCNT2). CCNT2, also termed CycT2, is a regulatory subunit of the cyclin-dependent kinase pair (CDK9/cyclin T) complex, also called positive transcription elongation factor B (P-TEFb), which is proposed to facilitate the transition from abortive to productive elongation by phosphorylating the CTD (carboxy-terminal domain) of the large subunit of RNA polymerase II (RNAP II). CCNT2 contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain.	114
410302	cd20599	CYCLIN_RB	cyclin box found in retinoblastoma-associated protein (RB) and similar proteins. RB, also called p105-Rb, pRb, or pp110, is a key regulator of entry into cell division and also acts as a tumor suppressor. It promotes G0-G1 transition when phosphorylated by CDK3/cyclin-C. It also acts as a transcription repressor of E2F1 target genes. RB is directly involved in heterochromatin formation by maintaining overall chromatin structure, especially that of constitutive heterochromatin by stabilizing histone methylation. It recruits and targets histone methyltransferases SUV39H1, KMT5B and KMT5C, leading to epigenetic transcriptional repression. It controls histone H4 'Lys-20' trimethylation. RB contains one cyclin box. The cyclin box is a protein binding domain.	126
410303	cd20600	CYCLIN_RBL	cyclin box found in retinoblastoma-like protein (RBL) subfamily. The RBL subfamily includes two retinoblastoma-like proteins, RBL1 and RBL2. They are key regulators of entry into cell division and are directly involved in heterochromatin formation by maintaining overall chromatin structure. RBL1 and RBL2 recruit and target histone methyltransferases KMT5B and KMT5C, leading to epigenetic transcriptional repression. They control histone H4 'Lys-20' trimethylation and probably act as transcription repressors by recruiting chromatin-modifying enzymes to promoters. They may also act as tumor suppressors. Members of this family contain one cyclin box. The cyclin box is a protein binding domain.	112
410304	cd20601	CYCLIN_AtRBR_like	cyclin box found in Arabidopsis thaliana retinoblastoma-related protein 1 (AtRBR1) and similar proteins. AtRBR1 is a key regulator of entry into cell division. It acts as a transcription repressor of E2F target genes, whose activity is required for progress from the G1 to the S phase of the cell cycle. AtRBR1 plays a central role in the mechanism controlling meristem cell differentiation, cell fate establishment and cell fate maintenance during organogenesis and gametogenesis. AtRBR1 contains one cyclin box. The cyclin box is a protein binding domain.	129
410305	cd20602	CYCLIN_CABLES1	cyclin box found in CDK5 and ABL1 enzyme substrate 1 (CABLES1). CABLES1, also called interactor with CDK3 1 (Ik3-1), is a cyclin-dependent kinase binding protein that enhances cyclin-dependent kinase tyrosine phosphorylation by non-receptor tyrosine kinases, such as that of CDK5 by activated ABL1, which leads to increased CDK5 activity and is critical for neuronal development, and that of CDK2 by WEE1, which leads to decreased CDK2 activity and growth inhibition. CABLES1 contains one cyclin box. The cyclin box is a protein binding domain.	132
410306	cd20603	CYCLIN_CABLES2	cyclin box found in CDK5 and ABL1 enzyme substrate 2 (CABLES2). CABLES2, also called interactor with CDK3 2 (Ik3-2), acts as a proapoptotic factor involved in both p53-mediated and p53-independent apoptotic pathways. CABLES2 contains one cyclin box. The cyclin box is a protein binding domain.	121
410307	cd20604	CYCLIN_AtCycU-like	cyclin box found in Arabidopsis thaliana U-type cyclins (CycUs) and similar proteins. CycUs interact with cyclin-dependent kinase A-1 (CDKA-1) to trigger its kinase activity. CycUs contain one cyclin box. The cyclin box is a protein binding domain.	126
410308	cd20605	CYCLIN_RBL1	cyclin box found in retinoblastoma-like protein 1 (RBL1) and similar proteins. RBL1, also called 107 kDa retinoblastoma-associated protein (p107), retinoblastoma-related protein 1 (RBR-1), or pRb1, is a key regulator of entry into cell division. It is directly involved in heterochromatin formation by maintaining overall chromatin structure, especially that of constitutive heterochromatin by stabilizing histone methylation. RBL1 recruits and targets histone methyltransferases KMT5B and KMT5C, leading to epigenetic transcriptional repression. It controls histone H4 'Lys-20' trimethylation. RBL1 probably acts as a transcription repressor by recruiting chromatin-modifying enzymes to promoters. It may also act as a tumor suppressor. RBL1 contains one cyclin box. The cyclin box is a protein binding domain.	130
410309	cd20606	CYCLIN_RBL2	cyclin box found in retinoblastoma-like protein 2 (RBL2) and similar proteins. RBL2, also called 130 kDa retinoblastoma-associated protein (p130), retinoblastoma-related protein 2 (RBR-2), or pRb2, is a key regulator of entry into cell division. It is directly involved in heterochromatin formation by maintaining overall chromatin structure, especially that of constitutive heterochromatin by stabilizing histone methylation. RBL2 recruits and targets histone methyltransferases KMT5B and KMT5C, leading to epigenetic transcriptional repression. It controls histone H4 'Lys-20' trimethylation. It probably acts as a transcription repressor by recruiting chromatin-modifying enzymes to promoters. It may also act as a tumor suppressor. RBL2 contains one cyclin box. The cyclin box is a protein binding domain.	189
380328	cd20607	FbiB_C-like	nitroreductase family domain similar to the C-terminal domain of F420:gamma-glutamyl ligase FbiB. Proteins of this family catalyze the reduction of flavin or nitrocompounds using NAD(P)H as electron donor in an obligatory two-electron transfer, utilizing FMN or FAD as cofactor. They are often found to be homodimers. Mycobacterium tuberculosis FbiB is a two-domain protein that produces F420 with predominantly 5 to 7 L-glutamate residues in the poly-gamma-glutamate tail; its C-terminal domain is homologous to FMN-dependent nitroreductases.	155
380329	cd20608	nitroreductase	nitroreductase family protein. Proteins of this family catalyze the reduction of flavin or nitrocompounds using NAD(P)H as electron donor in an obligatory two-electron transfer, utilizing FMN or FAD as cofactor. They are often found to be homodimers. Enzymes of this family are described as NAD(P)H:FMN oxidoreductases, oxygen-insensitive nitroreductase, flavin reductase P, dihydropteridine reductase, NADH oxidase or NADH dehydrogenase.	145
380330	cd20609	nitroreductase	nitroreductase family protein. A subfamily of the nitroreductase family containing uncharacterized proteins. Nitroreductase catalyzes the reduction of nitroaromatic compounds such as nitrotoluenes, nitrofurans and nitroimidazoles. This process requires NAD(P)H as electron donor in an obligatory two-electron transfer and uses FMN as cofactor. The enzyme is typically a homodimer.	145
380331	cd20610	nitroreductase	nitroreductase family protein. Proteins of this family catalyze the reduction of flavin or nitrocompounds using NAD(P)H as electron donor in an obligatory two-electron transfer, utilizing FMN or FAD as cofactor. They are often found to be homodimers. Enzymes of this family are described as NAD(P)H:FMN oxidoreductases, oxygen-insensitive nitroreductase, flavin reductase P, dihydropteridine reductase, NADH oxidase or NADH dehydrogenase.	167
410705	cd20612	CYP_LDS-like_C	C-terminal cytochrome P450 domain of linoleate diol synthase and similar cytochrome P450s. This family contains Gaeumannomyces graminis linoleate diol synthase (LDS) and similar proteins including Ssp1 from the phytopathogenic basidiomycete Ustilago maydis. LDS, also called linoleate (8R)-dioxygenase, catalyzes the dioxygenation of linoleic acid to (8R)-hydroperoxylinoleate and the isomerization of the resulting hydroperoxide to (7S,8S)-dihydroxylinoleate. Ssp1 is expressed in mature teliospores, which are produced by U. maydis only after infection of its host plant, maize. Ssp1 is localized on lipid bodies in germinating teliospores, suggesting a role in the mobilization of storage lipids. LDS and Ssp1 contain an N-terminal dioxygenase domain related to animal heme peroxidases, and a C-terminal cytochrome P450 domain. The LDS-like subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	370
410706	cd20613	CYP46A1-like	cytochrome P450 family 46, subfamily A, polypeptide 1, also called cholesterol 24-hydroxylase, and similar cytochrome P450s. CYP46A1 is also called cholesterol 24-hydroxylase (EC 1.14.14.25), CH24H, cholesterol 24-monooxygenase, or cholesterol 24S-hydroxylase. It catalyzes the conversion of cholesterol into 24S-hydroxycholesterol and, to a lesser extent, 25-hydroxycholesterol. CYP46A1 is associated with high-order brain functions; increased expression improves cognition while a reduction leads to a poor cognitive performance. It also plays a role in the pathogenesis or progression of neurodegenerative disorders. CYP46A1 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	429
410707	cd20614	CYPBJ-4-like	cytochrome P450 BJ-4 homolog and similar cytochrome P450s. This group is composed of mostly uncharacterized proteins including Sinorhizobium fredii CYPBJ-4 homolog. It belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	406
410708	cd20615	CYP_GliC-like	cytochrome P450 monooxygenases similar to gliotoxin biosynthesis protein C. This subfamily is composed of cytochrome P450 monooxygenases that are part of gene clusters involved in the biosynthesis of various compounds such as mycotoxins and alkaloids, including Aspergillus fumigatus gliotoxin biosynthesis protein (GliC), Penicillium rubens roquefortine/meleagrin synthesis protein R (RoqR), Aspergillus oryzae aspirochlorine biosynthesis protein C (AclC), Aspergillus terreus bimodular acetylaranotin synthesis protein ataTC, Kluyveromyces lactis pulcherrimin biosynthesis cluster protein 2 (PUL2), and Aspergillus nidulans aspyridones biosynthesis protein B (ApdB). The GliC-like subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	409
410709	cd20616	CYP19A1	cytochrome P450 family 19, subfamily A, polypeptide 1. CYP19A1, also called aromatase or estrogen synthetase (EC 1.14.14.14), catalyzes the formation of aromatic C18 estrogens from C19 androgens. The CYP19A1 subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	414
410710	cd20617	CYP1_2-like	cytochrome P450 families 1 and 2, and similar cytochrome P450s. This model includes cytochrome P450 families 1 (CYP1) and 2 (CYP2), CYP17A1, and CYP21 in vertebrates, as well as insect and crustacean CYPs similar to CYP15A1 and CYP306A1. CYP1 and CYP2 enzymes are involved in the metabolism of endogenous and exogenous compounds such as hormones, xenobiotics, and drugs. CYP17A1 catalyzes the conversion of pregnenolone and progesterone to their 17-alpha-hydroxylated products, while CYP21 catalyzes the 21-hydroxylation of steroids such as progesterone and 17-alpha-hydroxyprogesterone (17-alpha-OH-progesterone) to form 11-deoxycorticosterone and 11-deoxycortisol, respectively. Members of this group belong to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	419
410711	cd20618	CYP71_clan	Plant cytochrome P450s, clan CYP71. The number of cytochrome P450s (P450s, CYPs) in plants is considerably larger than in other taxa. In individual plant genomes, CYPs form the third largest family of plant genes; the two largest gene families code for F-box proteins and receptor-like kinases. CYPs have been classified into families and subfamilies based on homology and phylogenetic criteria; family membership is defined as 40% amino acid sequence identity or higher. However, there is a phenomenon called family creep, where a sequence (below 40% identity) is absorbed into a large family; this is seen in the plant CYP71 and CYP89 families. The plant CYPs have also been classified according to clans; land plants have 11 clans that form two groups: single-family clans (CYP51, CYP74, CYP97, CYP710, CYP711, CYP727, CYP746) and multi-family clans (CYP71, CYP72, CYP85, CYP86). The CYP71 clan has expanded dramatically and represents 50% of all plant CYPs; it includes several families including CYP71, CYP73, CYP76, CYP81, CYP82, CYP89, and CYP93, among others. It belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	429
410712	cd20619	CYP_XplA	cytochrome P450 XplA. XplA is a cytochrome P450 that was found to mediate the microbial metabolism of the military explosive hexahydro-1,3,5-trinitro-1,3,5-triazine (RDX). XplA has an unusual structural organization comprising a heme domain that is fused to its flavodoxin redox partner. XplA and its partner reductase XplB are plasmid encoded, and the xplA gene has now been found in divergent genera across the globe with nearly identical sequence. It has only been detected at explosive-contaminated sites, suggesting rapid dissemination of this novel catabolic activity, possibly within a 50-year period since the introduction of RDX into the environment. XplA belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	358
410713	cd20620	CYP132-like	cytochrome P450 family 132 and similar cytochrome P450s. This subfamily is composed of Mycobacterium tuberculosis cytochrome P450 132 (CYP132) and similar proteins. The function of CYP132 is as yet unknown. CYP132 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	406
410714	cd20621	CYP5011A1-like	cytochrome P450 monooxygenase CYP5011A1 and similar cytochrome P450s. This subfamily is composed of CYPs from unicellular ciliates similar to Tetrahymena thermophila CYP5011A1, whose function is still unknown. It belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	427
410715	cd20622	CYP_TRI13-like	fungal cytochrome P450s similar to TRI13. This subfamily is composed of cytochrome P450 monooxygenase TRI13, also called core trichothecene cluster (CTC) protein 13, and similar proteins. The tri13 gene is located in the trichothecene biosynthesis gene cluster in Fusarium species, which produce a great diversity of agriculturally important trichothecene toxins that differ from each other in their pattern of oxygenation and esterification. Trichothecenes comprise a large family of chemically related bicyclic sesquiterpene compounds acting as mycotoxins, including the T-2 toxin; TRI13 is required for the addition of the C-4 oxygen of T-2 toxin. The TRI13-like subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	494
410716	cd20623	CYP_unk	unknown subfamily of actinobacterial cytochrome P450s. This subfamily is composed of uncharacterized cytochrome P450s. Cytochrome P450 (P450, CYP) is a large superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. Their monooxygenase activity relies on the reductive scission of molecular oxygen bound to the P450 heme iron, and the delivery of two electrons to the heme iron during the catalytic cycle.	367
410717	cd20624	CYP_unk	unknown subfamily of actinobacterial cytochrome P450s. This subfamily is composed of uncharacterized cytochrome P450s. Cytochrome P450 (P450, CYP) is a large superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. Their monooxygenase activity relies on the reductive scission of molecular oxygen bound to the P450 heme iron, and the delivery of two electrons to the heme iron during the catalytic cycle.	376
410718	cd20625	CYP164-like	cytochrome P450 family 164 and similar cytochrome P450s. This group is composed mostly of bacterial cytochrome P450s from multiple families, including Mycobacterium smegmatis CYP164A2, Streptomyces sp. CYP245A1, Bacillus subtilis CYP107H1, Micromonospora echinospora P450 oxidase CalO2, and putative P450s such as Xylella fastidiosa CYP133 and Mycobacterium tuberculosis CYP140. CYP107H1, also called cytochrome P450(BioI), catalyzes the C-C bond cleavage of fatty acid linked to acyl carrier protein (ACP) to generate pimelic acid for biotin biosynthesis. CYP245A1, also called cytochrome P450 StaP, catalyzes the intramolecular C-C bond formation and oxidative decarboxylation of chromopyrrolic acid (CPA) to form the indolocarbazole core, a key step in staurosporine biosynthesis. CalO2 is involved in calicheamicin biosynthesis. The CYP164-like group belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	369
410719	cd20626	CYP_Pc22g25500-like	cytochrome P450 Pc22g25500 and similar cytochrome P450s. Penicillium rubens Pc22g25500 is a putative cytochrome P450 of unknown function. Cytochrome P450 (P450, CYP) is a large superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop. Their monooxygenase activity relies on the reductive scission of molecular oxygen bound to the P450 heme iron, and the delivery of two electrons to the heme iron during the catalytic cycle.	381
410720	cd20627	CYP20A1	cytochrome P450 family 20, subfamily A, polypeptide 1. Cytochrome P450, family 20, subfamily A, polypeptide 1 (cytochrome P450 20A1 or CYP20A1) is expressed in human hippocampus and substantia nigra. In zebrafish, maternal transcript of CYP20A1 occurs in eggs, suggesting involvement in brain and early development. CYP20A1 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	394
410721	cd20628	CYP4	cytochrome P450 family 4. Cytochrome P450 family 4 (CYP4) proteins catalyze the omega-hydroxylation of the terminal carbon of fatty acids, including essential signaling molecules such as eicosanoids, prostaglandins and leukotrienes, and they are important for chemical defense. There are seven vertebrate family 4 subfamilies: CYP4A, CYP4B, CYP4F, CYP4T, CYP4V, CYP4X, and CYP4Z; three (CYP4X, CYP4A, CYP4Z) are specific to mammals. CYP4 enzymes metabolize fatty acids of various lengths, levels of saturation, and branching. Specific subfamilies show preferences for the length of fatty acids; CYP4B, CYP4A and CYP4V, and CYP4F preferentially metabolize short (C7-C10), medium (C10-C16), and long to very long (C18-C26) fatty acid chains, respectively. CYP4 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	426
410722	cd20629	P450_pinF1-like	cytochrome P450-pinF1 and similar cytochrome P450s. This subfamily is composed of bacterial CYPs similar to Agrobacterium tumefaciens plant-inducible cytochrome P450-pinF1, which is not essential for virulence but may be involved in the detoxification of plant protective agents at the site of wounding. The P450-pinF1-like subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	353
410723	cd20630	P450_epoK-like	cytochrome P450epoK and similar cytochrome P450s. Sorangium cellulosum cytochrome P450epoK is a heme-containing monooxygenase that participates in epothilone biosynthesis, where it catalyzes the epoxidation of epothilones C and D into epothilones A and B, respectively. This subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	373
410724	cd20631	CYP7A1	cytochrome P450 family 7, subfamily A, polypeptide 1. Cytochrome P450 7A1 (CYP7A1) is also called cholesterol 7-alpha-monooxygenase (EC 1.14.14.23) or cholesterol 7-alpha-hydroxylase. It catalyzes the hydroxylation at position 7 of cholesterol, a rate-limiting step in the classic (or neutral) pathway of cholesterol catabolism and bile acid biosynthesis. It is important for cholesterol homeostasis. CYP7A1 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	451
410725	cd20632	CYP7B1	cytochrome P450 family 7, subfamily B, polypeptide 1. Cytochrome P450 7B1 (CYP7B1) is also called 25-hydroxycholesterol 7-alpha-hydroxylase (EC 1.14.14.29) or oxysterol 7-alpha-hydroxylase. It catalyzes the 7alpha-hydroxylation of both steroids and oxysterols, and is thus implicated in the metabolism of neurosteroids and bile acid synthesis, respectively. It participates in the alternative (or acidic) pathway of cholesterol catabolism and bile acid biosynthesis. It also mediates the formation of 7-alpha,25-dihydroxycholesterol (7-alpha,25-OHC) from 25-hydroxycholesterol; 7-alpha,25-OHC acts as a ligand for the G protein-coupled receptor GPR183/EBI2, a chemotactic receptor in lymphoid cells. CYP7B1 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	438
410726	cd20633	Cyp8B1	cytochrome P450 family 8, subfamily B, polypeptide 1. Cytochrome P450 8B1 (CYP8B1) is also called 7-alpha-hydroxycholest-4-en-3-one 12-alpha-hydroxylase (EC 1.14.18.8) or sterol 12-alpha-hydroxylase. It is involved in the classic (or neutral) pathway of cholesterol catabolism and bile acid synthesis, and is responsible for sterol 12alpha-hydroxylation, which directs the synthesis to cholic acid (CA). It converts 7-alpha-hydroxy-4-cholesten-3-one into 7-alpha,12-alpha-dihydroxy-4-cholesten-3-one, but also displays broad substrate specificity including other 7-alpha-hydroxylated C27 steroids. CYP8B1 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	449
410727	cd20634	PGIS_CYP8A1	prostacyclin synthase, also called cytochrome P450 family 8, subfamily A, polypeptide 1. Prostacyclin synthase, also called prostaglandin I2 synthase (PGIS) or cytochrome P450 8A1 (CYP8A1), catalyzes the isomerization of prostaglandin H2 to prostacyclin (or prostaglandin I2), a potent mediator of vasodilation and anti-platelet aggregation. It belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	442
410728	cd20635	CYP39A1	cytochrome P450 family 39, subfamily A, polypeptide 1. Cytochrome P450 39A1 (CYP39A1) is also called 24-hydroxycholesterol 7-alpha-hydroxylase (EC 1.14.14.26) or oxysterol 7-alpha-hydroxylase. It is involved in the metabolism of bile acids and has a preference for 24-hydroxycholesterol, converting it into the 7-alpha-hydroxylated product. It may play a role in the alternative bile acid synthesis pathway in the liver. CYP39A1 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	410
410729	cd20636	CYP26C1	cytochrome P450 family 26, subfamily C, polypeptide 1. Cytochrome P450 26C1 (CYP26C1) is a retinoic acid-metabolizing cytochrome that plays key roles in retinoic acid (RA) metabolism. It effectively metabolizes all-trans retinoic acid (atRA), 9-cis-retinoic acid (9-cis-RA), 13-cis-retinoic acid, and 4-oxo-atRA with the highest intrinsic clearance toward 9-cis-RA. RA is a critical signaling molecule that regulates gene transcription and the cell cycle. Loss of function mutations in the CYP26C1 gene cause type IV focal facial dermal dysplasia (FFDD), a rare syndrome characterized by facial lesions resembling aplasia cutis. CYP26C1 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	431
410730	cd20637	CYP26B1	cytochrome P450 family 26, subfamily B, polypeptide 1. Cytochrome P450 26B1 (CYP26B1) is a retinoic acid-metabolizing cytochrome that plays key roles in retinoic acid (RA) metabolism. It is an all-trans-retinoic acid (atRA) hydroxylase that catalyzes the formation of metabolites similar to those formed by CYP26A1. RA is a critical signaling molecule that regulates gene transcription and the cell cycle. In rats, CYP26B1 regulates sex-specific timing of meiotic initiation, independent of RA signaling. CYP26B1 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	430
410731	cd20638	CYP26A1	cytochrome P450 family 26, subfamily A, polypeptide 1. Cytochrome P450 26A1 (CYP26A1) is a retinoic acid-metabolizing cytochrome that plays key roles in retinoic acid (RA) metabolism. It is the main all-trans-retinoic acid (atRA) hydroxylase that catalyzes the formation of several hydroxylated forms of RA, including 4-OH-RA, 4-oxo-RA and 18-OH-RA. RA is a critical signaling molecule that regulates gene transcription and the cell cycle. CYP26A1 has been shown to upregulate fascin and promote the malignant behavior of breast carcinoma cells. It belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	432
410732	cd20639	CYP734	cytochrome P450 family 734. Cytochrome P450 family 734 (CYP734) belongs to the plant CYP72 clan, which is generally associated with the metabolism of a diversity of fairly hydrophobic compounds including fatty acids and isoprenoids, with the catabolism of hormones (brassinosteroids and gibberellin, GA) and with the biosynthesis of cytokinins. CYP734As function as multisubstrate and multifunctional enzymes in brassinosteroid (BR) catabolism and regulation of BR homeostasis. Arabidopsis thaliana CYP734A1/BAS1 (formerly CYP72B1) inactivates bioactive brassinosteroids such as castasterone (CS) and brassinolide (BL) by C-26 hydroxylation. Rice CYP734As can catalyze C-22 hydroxylation as well as second and third oxidations to produce aldehyde and carboxylate groups at C-26. CYP734 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	428
410733	cd20640	CYP714	cytochrome P450 family 714. Cytochrome P450 family 714 (CYP714) belongs to the plant CYP72 clan, which is generally associated with the metabolism of a diversity of fairly hydrophobic compounds including fatty acids and isoprenoids, with the catabolism of hormones (brassinosteroids and gibberellin, GA) and with the biosynthesis of cytokinins. CYP714 enzymes are involved in the biosynthesis of gibberellins (GAs) and the mechanism to control their bioactive endogenous levels. They contribute to the production of diverse GA compounds through various oxidations of C and D rings in both monocots and eudicots. CYP714B1 and CYP714B2 encode the enzyme GA 13-oxidase, which is required for GA1 biosynthesis, while CYP714D1 encodes GA 16a,17-epoxidase, which inactivates the non-13-hydroxy GAs in rice. Arabidopsis CYP714A1 is an inactivation enzyme that catalyzes the conversion of GA12 to 16-carboxylated GA12 (16-carboxy-16beta,17-dihydro GA12). CYP714 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	426
410734	cd20641	CYP709	cytochrome P450 family 709. Cytochrome P450 family 709 (CYP709) belongs to the plant CYP72 clan, which is generally associated with the metabolism of a diversity of fairly hydrophobic compounds including fatty acids and isoprenoids, with the catabolism of hormones (brassinosteroids and gibberellin, GA) and with the biosynthesis of cytokinins. Arabidopsis thaliana CYP709B3 is involved in abscisic acid (ABA) and salt stress response. CYP709 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	431
410735	cd20642	CYP72	cytochrome P450 family 72. Cytochrome P450 family 72 (CYP72) belongs to the plant CYP72 clan, which is generally associated with the metabolism of a diversity of fairly hydrophobic compounds including fatty acids and isoprenoids, with the catabolism of hormones (brassinosteroids and gibberellin, GA) and with the biosynthesis of cytokinins. Characterized members, among others, include: Catharanthus roseus cytochrome P450 72A1 (CYP72A1), also called secologanin synthase (EC 1.3.3.9), that catalyzes the conversion of loganin into secologanin, the precursor of monoterpenoid indole alkaloids and ipecac alkaloids; Medicago truncatula CYP72A67 that catalyzes a key oxidative step in hemolytic sapogenin biosynthesis; and Arabidopsis thaliana CYP72C1, an atypical CYP that acts on brassinolide precursors and functions as a brassinosteroid-inactivating enzyme. This family also includes Panax ginseng CYP716A47 that catalyzes the formation of protopanaxadiol from dammarenediol-II during ginsenoside biosynthesis. CYP72 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	431
410736	cd20643	CYP11A1	cytochrome P450 family 11, subfamily A, polypeptide 1, also called cholesterol side-chain cleavage enzyme. Cytochrome P450 11A1 (CYP11A1, EC 1.14.15.6) is also called cholesterol side-chain cleavage enzyme, cholesterol desmolase, or cytochrome P450(scc). It catalyzes the side-chain cleavage reaction of cholesterol to form pregnenolone, the precursor of all steroid hormones. Missense or nonsense mutations of the CYP11A1 gene cause mild to severe early-onset adrenal failure depending on the severity of the enzyme dysfunction/deficiency. CYP11A1 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	425
410737	cd20644	CYP11B	cytochrome P450 family 11, subfamily B. Cytochrome P450 11B (CYP11B) enzymes catalyze the final steps in the production of glucocorticoids and mineralocorticoids that takes place in the adrenal gland. There are two human CYP11B isoforms: CYP11B1 (11-beta-hydroxylase or P45011beta), which catalyzes the final step of cortisol synthesis by a one-step reaction from 11-deoxycortisol; and CYP11B2 (aldosterone synthase or P450aldo), which catalyzes three steps in the synthesis of aldosterone. The CYP11B subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	428
410738	cd20645	CYP24A1	cytochrome P450 family 24, subfamily A, polypeptide 1, also called vitamin D(3) 24-hydroxylase. Cytochrome P450 24A1 (CYP24A1, EC 1.14.15.16) is also called 1,25-dihydroxyvitamin D(3) 24-hydroxylase (24-OHase), vitamin D(3) 24-hydroxylase, or cytochrome P450-CC24. It catalyzes the NADPH-dependent 24-hydroxylation of calcidiol (25-hydroxyvitamin D(3)) and calcitriol (1-alpha,25-dihydroxyvitamin D(3) or 1,25(OH)2D3). CYP24A1 regulates vitamin D activity through its hydroxylation of calcitriol, the physiologically active vitamin D hormone, which controls gene-expression and signal-transduction processes associated with calcium homeostasis, cellular growth, and the maintenance of heart, muscle, immune, and skin function. CYP24A1 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	419
410739	cd20646	CYP27A1	cytochrome P450 family 27, subfamily A, polypeptide 1, also called vitamin D(3) 25-hydroxylase. Cytochrome P450 27A1 (CYP27A1, EC 1.14.15.15) is also called CYP27, cholestanetriol 26-monooxygenase, sterol 26-hydroxylase, 5-beta-cholestane-3-alpha,7-alpha,12-alpha-triol 27-hydroxylase, cytochrome P-450C27/25, sterol 27-hydroxylase, or vitamin D(3) 25-hydroxylase. It catalyzes the first step in the oxidation of the side chain of sterol intermediates, the 27-hydroxylation of 5-beta-cholestane-3-alpha,7-alpha,12-alpha-triol, and the first three sterol side chain oxidations in bile acid biosynthesis via the neutral (classic) pathway. It also hydroxylates vitamin D3 at the 25-position, as well as cholesterol at positions 24 and 25. CYP27A1 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	430
410740	cd20647	CYP27C1	cytochrome P450 family 27, subfamily C, polypeptide 1, also called all-trans retinol 3,4-desaturase. Cytochrome P450 27C1 (CYP27C1) is also called all-trans retinol 3,4-desaturase. It catalyzes the conversion of all-trans retinol (also called vitamin A1, the precursor of 11-cis retinal) to 3,4-didehydroretinol (also called vitamin A2, the precursor of 11-cis 3,4-didehydroretinal). CYP27C1 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	433
410741	cd20648	CYP27B1	cytochrome P450 family 27, subfamily B, polypeptide 1, also called calcidiol 1-monooxygenase. Cytochrome P450 27B1 (CYP27B1) is also called calcidiol 1-monooxygenase (EC 1.14.15.18), 25-hydroxyvitamin D(3) 1-alpha-hydroxylase (VD3 1A hydroxylase), 25-hydroxyvitamin D-1 alpha hydroxylase, 25-OHD-1 alpha-hydroxylase, 25-hydroxycholecalciferol 1-hydroxylase, or 25-hydroxycholecalciferol 1-monooxygenase. It catalyzes the conversion of 25-hydroxyvitamin D3 (25(OH)D3) to 1-alpha,25-dihydroxyvitamin D3 (1,25(OH)2D3 or calcitriol), and of 24,25-dihydroxyvitamin D3 (24,25(OH)(2)D3) to 1-alpha,24,25-trihydroxyvitamin D3 (1alpha,24,25(OH)(3)D3). It is also active with 25-hydroxy-24-oxo-vitamin D3, and has an important role in normal bone growth, calcium metabolism, and tissue differentiation. CYP27B1 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	430
410742	cd20649	CYP5A1	cytochrome P450 family 5, subfamily A, polypeptide 1, also called thromboxane-A synthase. Cytochrome P450 5A1 (CYP5A1), also called thromboxane-A synthase (EC 5.3.99.5) or thromboxane synthetase, converts prostaglandin H2 into thromboxane A2, a biologically active metabolite of arachidonic acid that has been implicated in stroke, asthma, and various cardiovascular diseases, due to its acute and chronic effects in promoting platelet aggregation, vasoconstriction, bronchoconstriction, and proliferation. CYP5A1 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	457
410743	cd20650	CYP3A	cytochrome P450 family 3, subfamily A. The cytochrome P450 3A (CYP3A) subfamily, the most abundant CYP subfamily in the liver, consists of drug-metabolizing enzymes. In humans, there are at least four isoforms: CYP3A4, 3A5, 3A7, and 3A3. CYP3A enzymes are embedded in the endoplasmic reticulum, where they can catalyze a wide variety of biochemical reactions including hydroxylation, N-demethylation, O-dealkylation, S-oxidation, deamination, or epoxidation of substrates. They oxidize a variety of structurally unrelated compounds including steroids, fatty acids, and xenobiotics. The CYP3A subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	426
410744	cd20651	CYP15A1-like	cytochrome P450 family 15, subfamily A, polypeptide 1, and similar cytochrome P450s. This subfamily is composed of insect and crustacean cytochrome P450s including Diploptera punctata cytochrome P450 15A1 (CYP15A1), Panulirus argus CYP2L1, and CYP303A1, CYP304A1, and CYP305A1 from Drosophila melanogaster. CYP15A1, also called methyl farnesoate epoxidase, catalyzes the conversion of methyl farnesoate to juvenile hormone III acid during juvenile hormone biosynthesis. CYP303A1, CYP304A1, and CYP305A1 may be involved in the metabolism of insect hormones and in the breakdown of synthetic insecticides. The CYP15A1-like subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	423
410745	cd20652	CYP306A1-like	cytochrome P450 306A1 and similar cytochrome P450s. This subfamily is composed of insect and crustacean cytochrome P450s including insect cytochrome P450 306A1 (CYP306A1 or Cyp306a1) and CYP18A1. CYP306A1 functions as a carbon 25-hydroxylase and has an essential role in ecdysteroid biosynthesis during insect development. CYP18A1 is a 26-hydroxylase and plays a key role in steroid hormone inactivation. The CYP306A1-like subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	432
410746	cd20653	CYP81	cytochrome P450 family 81. The only characterized member of the cytochrome P450 family 81 (CYP81 or Cyp81) is CYP81E1, also called isoflavone 2'-hydroxylase, that catalyzes the hydroxylation of isoflavones, daidzein, and formononetin, to yield 2'-hydroxyisoflavones, 2'-hydroxydaidzein, and 2'-hydroxyformononetin, respectively. It is involved in the biosynthesis of isoflavonoid-derived antimicrobial compounds of legumes. CYP81 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	420
410747	cd20654	CYP82	cytochrome P450 family 82. Cytochrome P450 family 82 (CYP82 or Cyp82) genes specifically reside in dicots and are usually induced by distinct environmental stresses. Characterized members include: Glycine max CYP82A3 that is induced by infection, salinity and drought stresses, and is involved in the jasmonic acid and ethylene signaling pathway, enhancing plant resistance; Arabidopsis thaliana CYP82G1 that catalyzes the breakdown of the C(20)-precursor (E,E)-geranyllinalool to the insect-induced C(16)-homoterpene (E,E)-4,8,12-trimethyltrideca-1,3,7,11-tetraene (TMTT); and Papaver somniferum CYP82N4, also called methyltetrahydroprotoberberine 14-monooxygenase, and CYP82Y1, also called N-methylcanadine 1-hydroxylase. CYP82N4 catalyzes the conversion of N-methylated protoberberine alkaloids N-methylstylopine and N-methylcanadine into protopine and allocryptopine, respectively, in the biosynthesis of isoquinoline alkaloid sanguinarine. CYP82Y1 catalyzes the 1-hydroxylation of N-methylcanadine to 1-hydroxy-N-methylcanadine, the first committed step in the formation of noscapine. CYP82 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	447
410748	cd20655	CYP93	cytochrome P450 family 93. The cytochrome P450 family 93 (CYP93) is specifically found in flowering plants and could be classified into ten subfamilies, CYP93A-K. CYP93A appears to be the ancestor that was derived in flowering plants, and the remaining subfamilies show lineage-specific distribution: CYP93B and CYP93C are present in dicots; CYP93F is distributed only in Poaceae; CYP93G and CYP93J are monocot-specific; CYP93E is unique to legumes; CYP93H and CYP93K are only found in Aquilegia coerulea; and CYP93D is Brassicaceae-specific. Members of this family include: Glycyrrhiza echinata CYP93B1, also called licodione synthase (EC 1.14.14.140), that catalyzes the formation of licodione and 2-hydroxynaringenin from (2S)-liquiritigenin and (2S)-naringenin, respectively; and Glycine max CYP93A1, also called 3,9-dihydroxypterocarpan 6A-monooxygenase (EC 1.14.14.93), that is involved in the biosynthesis of the phytoalexin glyceollin. CYP93 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	433
410749	cd20656	CYP98	cytochrome P450 family 98. Cytochrome P450 family 98 (CYP98) monooxygenases catalyze the meta-hydroxylation step in the phenylpropanoid biosynthetic pathway. CYP98A3, also called p-coumaroylshikimate/quinate 3'-hydroxylase, catalyzes 3'-hydroxylation of p-coumaric esters of shikimic/quinic acids to form lignin monomers. CYP98A8, also called p-coumarate 3-hydroxylase, acts redundantly with CYP98A9 as tricoumaroylspermidine meta-hydroxylase. CYP98 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	432
410750	cd20657	CYP75	cytochrome P450 family 75. Cytochrome P450 family 75 (CYP75) enzymes play important roles in the biosynthesis of anthocyanins, the colored class of flavonoids, which confer a diverse range of colors to flowers from orange to red to violet and blue. The number of hydroxyl groups on the B-ring of anthocyanidins, the chromophores and precursors of anthocyanins, impacts the anthocyanin color: the more hydroxyl groups, the bluer. The hydroxylation pattern is determined by CYP75 proteins: flavonoid 3'-hydroxylase (F3'H, EC 1.14.14.82) and flavonoid 3',5'-hydroxylase (F3'5'H, EC 1.14.14.81), which belong to CYP75B and CYP75A subfamilies, respectively. Both enzymes have broad substrate specificity and catalyze the hydroxylation of flavanones, dihydroflavonols, flavonols and flavones. F3'H catalyzes the 3'-hydroxylation of the flavonoid B-ring to the 3',4'-hydroxylated state. F3'5'H catalysis leads to trihydroxylated delphinidin-based anthocyanins that tend to have violet/blue colours. CYP75 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	438
410751	cd20658	CYP79	cytochrome P450 family 79. Cytochrome P450 family 79 (CYP79) enzymes catalyze the first committed step in the biosynthesis of the core structure of glucosinolates, the conversion of amino acids to the corresponding aldoximes. Glucosinolates are amino acid-derived natural plant products that function in the defense against herbivores and microorganisms. Arabidopsis thaliana contains seven family members: CYP79B2 and CYP79B3, which metabolize tryptophan; CYP79F1 and CYP79F2, which metabolize chain-elongated methionine derivatives with respectively 1-6 or 5-6 additional methylene groups in the side chain; CYP79A2 that metabolizes phenylalanine; and CYP79C1 and CYP79C2, with unknown function. CYP79 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	444
410752	cd20659	CYP4B_4F-like	cytochrome P450 family 4, subfamilies B and F, and similar cytochrome P450s. This group is composed of family 4 cytochrome P450s from vertebrate subfamilies A (CYP4A), B (CYP4B), F (CYP4F), T (CYP4T), X (CYP4X), and Z (CYP4Z). Also included are similar proteins from lancelets, tunicates, hemichordates, echinoderms, mollusks, annelid worms, sponges, and choanoflagellates, among others. The CYP4A, CYP4X, and CYP4Z subfamilies are specific to mammals, CYP4T is present in fish, while CYP4B and CYP4F are conserved among vertebrates. CYP4Bs specialize in omega-hydroxylation of short chain fatty acids and also participate in the metabolism of exogenous compounds that are protoxic including valproic acid (C8), 3-methylindole (C9), 4-ipomeanol, 3-methoxy-4-aminoazobenzene, and several aromatic amines. CYP4F enzymes are known for omega-hydroxylation of very long fatty acids (VLFA; C18-C26), leukotrienes, prostaglandins, and vitamins with long alkyl side chains. The CYP4B_4F-like group belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	423
410753	cd20660	CYP4V-like	cytochrome P450 family 4, subfamily V, and similar cytochrome P450s. This group is composed of vertebrate cytochrome P450 family 4, subfamily V (CYP4V) enzymes and similar proteins, including invertebrate subfamily C (CYP4C). Insect CYP4C enzymes may be involved in the metabolism of insect hormones and in the breakdown of synthetic insecticides. CYP4V2, the most characterized member of the CYP4V subfamily, is a selective omega-hydroxylase of saturated, medium-chain fatty acids, such as laurate, myristate and palmitate, with high catalytic efficiency toward myristate. Polymorphisms in the CYP4V2 gene cause Bietti's crystalline corneoretinal dystrophy (BCD), a recessive degenerative retinopathy that is characterized clinically by a progressive decline in central vision, night blindness, and constriction of the visual field. The CYP4V-like group belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	429
410754	cd20661	CYP2R1	cytochrome P450 2R1. CYP2R1, also called vitamin D 25-hydroxylase (EC 1.14.14.24), is a microsomal enzyme that is required for the activation of vitamin D; it catalyzes the initial step converting vitamin D into 25-hydroxyvitamin D (25(OH)D), the major circulating metabolite of vitamin D. The 1alpha-hydroxylation of 25(OH)D by CYP27B1 generates the fully active vitamin D metabolite, 1,25-dihydroxyvitamin D (1,25(OH)2D). Mutations in the CYP2R1 gene are associated with an atypical form of vitamin D-deficiency rickets, which has been classified as vitamin D dependent rickets type 1B. CYP2R1 belongs to family 2 of the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	436
410755	cd20662	CYP2J	cytochrome P450 family 2, subfamily J. Members of CYP2J are expressed in multiple tissues in mice and humans. They function as catalysts of arachidonic acid metabolism and are active in the metabolism of fatty acids to generate bioactive compounds. Human CYP2J2, also called arachidonic acid epoxygenase or albendazole monooxygenase (hydroxylating), is a membrane-bound cytochrome P450 primarily expressed in the heart and plays a significant role in cardiovascular diseases. The CYP2J subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	421
410756	cd20663	CYP2D	cytochrome P450 family 2, subfamily D. Members of CYP2D are present in mammals, birds, reptiles, and amphibians. The hominin CYP2D subfamily consists of a functional CYP2D6 and two paralogs, CYP2D7 and CYP2D8, that are often not functional in some species. Human CYP2D6 has a high affinity for alkaloids and can detoxify them. It is also responsible for metabolizing about 25% of commonly used drugs, such as antidepressants, beta-blockers, and antiarrhythmics. The CYP2D subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	428
410757	cd20664	CYP2K	cytochrome P450 family 2, subfamily K. Members of CYP2K are present in fish, birds, and amphibians. CYP2K6 from zebrafish has been shown to catalyze the conversion of aflatoxin B1 (AFB1) to its cytotoxic derivative AFB1 exo-8,9-epoxide, while its ortholog in rainbow trout CYP2K1 is also capable of oxidizing lauric acid. In birds, CYP2K is one of the largest CYP2 subfamilies. The CYP2K subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	424
410758	cd20665	CYP2C-like	cytochrome P450 family 2, subfamily C, and similar cytochrome P450s. This CYP2C-like group includes CYP2C, and similar CYPs including mammalian CYP2E1, also called 4-nitrophenol 2-hydroxylase, as well as chicken CYP2H1 and CYP2H2. The CYP2C subfamily is composed of four human members (CYP2C8, CYP2C9, CYP2C18, CYP2C19) that metabolize approximately 20% of clinically used drugs, and all four exhibit genetic polymorphisms that result in toxicity or altered efficacy of some drugs in affected individuals. CYP2E1 participates in the metabolism of endogenous substrates, including acetone and fatty acids, and exogenous compounds such as anesthetics, ethanol, nicotine, acetaminophen, aspartame, and chlorzoxazone, among others. The CYP2C-like subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	425
410759	cd20666	CYP2U1	cytochrome P450 family 2, subfamily U, polypeptide 1. CYP2U1 is a thymus- and brain-specific cytochrome P450 that catalyzes omega- and (omega-1)-hydroxylation of fatty acids such as arachidonic acid, docosahexaenoic acid, and other long chain fatty acids. Mutations in CYP2U1 are associated with hereditary spastic paraplegia (HSP), a neurological disorder, and pigmentary degenerative maculopathy associated with progressive spastic paraplegia. CYP2U1 belongs to family 2 of the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	426
410760	cd20667	CYP2AB1-like	cytochrome P450, family 2, subfamily AB, polypeptide 1 and similar cytochrome P450s. The function of CYP2AB1 is unknown. CYP2AB1 belongs to family 2 of the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	423
410761	cd20668	CYP2A	cytochrome P450 family 2, subfamily A. Cytochrome P450 family 2, subfamily A (CYP2A) includes CYP2A1, 2A2, and 2A3 in rats; CYP2A4, 2A5, 2A12, 2A20p, 2A21p, 2A22, and 2A23p in mice; CYP2A6, 2A7, 2A13, 2A18P in humans; CYP2A8, 2A9, 2A14, 2A15, 2A16, and 2A17 in hamsters; CYP2A10 and 2A11 in rabbits; and CYP2A19 in pigs. CYP2A enzymes metabolize numerous xenobiotic compounds, including coumarin, aflatoxin B1, nicotine, cotinine, 1,3-butadiene, and acetaminophen, among others, as well as endogenous compounds, including testosterone, progesterone, and other steroid hormones. Human CYP2A6 is responsible for the systemic clearance of nicotine, while CYP2A13 activates the nicotine-derived procarcinogen 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone (NNK) into DNA-altering compounds that cause lung cancer. The CYP2A subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	425
410762	cd20669	Cyp2F	cytochrome P450 family 2, subfamily F. Cytochrome P450 family 2, subfamily F (CYP2F) members are selectively expressed in lung tissues. They are responsible for the bioactivation of several pneumotoxic and carcinogenic chemicals such as benzene, styrene, naphthalene, and 1,1-dichloroethylene. CYP2F1 and CYP2F3 selectively catalyze the 3-methyl dehydrogenation of 3-methylindole, forming toxic reactive intermediates that can form adducts with proteins and DNA. The CYP2F subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	425
410763	cd20670	CYP2G	cytochrome P450 family 2, subfamily G. CYP2G1 is uniquely expressed in the olfactory mucosa of rats and rabbits and may have important functions for the olfactory chemosensory system. It is involved in the metabolism of sex steroids and xenobiotic compounds. In cynomolgus monkeys, CYP2G2 is a functional drug-metabolizing enzyme in nasal mucosa. In humans, two different CYP2G genes, CYP2GP1 and CYP2GP2, are pseudogenes because of loss-of-function deletions/mutations. The CYP2G subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	425
410764	cd20671	CYP2W1	cytochrome P450 family 2, subfamily W, polypeptide 1. Cytochrome P450 2W1 (CYP2W1) is expressed during development of the gastrointestinal tract, is silenced after birth in the intestine and colon by epigenetic modifications, but is activated following demethylation in colorectal cancer (CRC). Its expression levels in CRC correlate with the degree of malignancy, are higher in metastases and are predictive of survival. Thus, it is an attractive tumor-specific diagnostic and therapeutic target. CYP2W1 belongs to family 2 of the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	422
410765	cd20672	CYP2B	cytochrome P450 family 2, subfamily B. The human cytochrome P450 family 2, subfamily B (CYP2B) consists of only one functional member CYP2B6, which shows broad substrate specificity and plays a key role in the metabolism of many clinical drugs, environmental toxins, and endogenous compounds. Rodents have multiple functional CYP2B proteins; mouse subfamily members include CYP2B9, 2B10, 2B13, 2B19, and 2B23. CYP2B enzymes are highly inducible by chemicals that interact with the constitutive androstane receptor (CAR) and/or pregnane X receptor (PXR), such as rifampicin and phenobarbital. The CYP2B subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	425
410766	cd20673	CYP17A1	cytochrome P450 family 17, subfamily A, polypeptide 1. Cytochrome P450 17A1 (CYP17A1 or Cyp17a1), also called cytochrome P450c17, steroid 17-alpha-hydroxylase (EC 1.14.14.19)/17,20 lyase (EC 1.14.14.32), or 17-alpha-hydroxyprogesterone aldolase, catalyzes the conversion of pregnenolone and progesterone to their 17-alpha-hydroxylated products and subsequently to dehydroepiandrosterone (DHEA) and androstenedione. It is a dual enzyme that catalyzes both the 17-alpha-hydroxylation and the 17,20-lyase reactions. Severe mutations on the enzyme cause combined 17-hydroxylase/17,20-lyase deficiency (17OHD); patients with 17OHD synthesize 11-deoxycorticosterone (DOC) which causes hypertension and hypokalemia. Loss of 17,20-lyase activity precludes sex steroid synthesis and leads to sexual infantilism. Included in this group is a second 17A P450 from teleost fish, CYP17A2, that is more efficient in pregnenolone 17-alpha-hydroxylation than CYP17A1, but does not catalyze the lyase reaction. CYP17A1 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	432
410767	cd20674	CYP21	cytochrome P450 21, also called steroid 21-hydroxylase. Cytochrome P450 21 (CYP21 or Cyp21), also called steroid 21-hydroxylase (EC 1.14.14.16) or cytochrome P-450c21 or CYP21A2 (in humans), catalyzes the 21-hydroxylation of steroids such as progesterone and 17-alpha-hydroxyprogesterone (17-alpha-OH-progesterone) to form 11-deoxycorticosterone and 11-deoxycortisol, respectively. It is required for the adrenal synthesis of mineralocorticoids and glucocorticoids. Deficiency of this CYP is involved in ~95% of cases of human congenital adrenal hyperplasia, a disorder of adrenal steroidogenesis. There are two CYP21 genes in the human genome, CYP21A1 (a pseudogene) and CYP21A2 (the functional gene). Deficiencies in steroid 21-hydroxylase activity lead to a type of congenital adrenal hyperplasia, which has three clinical forms: a severe form with concurrent defects in both cortisol and aldosterone biosynthesis; a form with adequate aldosterone biosynthesis; and a mild, non-classic form that can be asymptomatic or associated with signs of postpubertal androgen excess without cortisol deficiency. CYP21A2 is also the major autoantigen in autoimmune Addison disease. Cyp21 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	424
410768	cd20675	CYP1B1-like	cytochrome P450 family 1, subfamily B, polypeptide 1 and similar cytochrome P450s. Cytochrome P450 1B1 (CYP1B1) is expressed in liver and extrahepatic tissues where it carries out the metabolism of numerous xenobiotics, including metabolic activation of polycyclic aromatic hydrocarbons. It is also important in regulating endogenous metabolic pathways, including the metabolism of steroid hormones, fatty acids, melatonin, and vitamins. CYP1B1 is overexpressed in a wide variety of tumors and is associated with angiogenesis. It is also associated with adipogenesis, obesity, hypertension, and atherosclerosis. It is therefore a target for the treatment of metabolic diseases and cancer. Also included in this subfamily are CYP1C proteins from fish, birds and amphibians. The CYP1B1-like subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	434
410769	cd20676	CYP1A	cytochrome P450 family 1, subfamily A. Cytochrome P450 family 1, subfamily A (CYP1A) consists of two human members, CYP1A1 and CYP1A2, which overlap in their activities. CYP1A2 is the highly expressed cytochrome enzyme in the human liver, while CYP1A1 is mostly found in extrahepatic tissues. Known common substrates include aromatic compounds such as polycyclic aromatic hydrocarbons, arachidonic acid and eicosapentaenoic acid, as well as melatonin and 6-hydroxylate melatonin. In addition, CYP1A1 activates procarcinogens into carcinogens via epoxides, and metabolizes heterocyclic aromatic amines of industrial origin. CYP1A2 metabolizes numerous natural products that result in toxic products, such as the transformation of methyleugenol to 1'-hydroxymethyleugenol, estragole to reactive metabolites, and oxidation of nephrotoxins. It also plays an important role in the metabolism of several clinical drugs including analgesics, antipyretics, antipsychotics, antidepressants, anti-inflammatory, and cardiovascular drugs. The CYP1A subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	437
410770	cd20677	CYP1D1	cytochrome P450 family 1, subfamily D, polypeptide 1. The cytochrome P450 1D1 (CYP1D1) gene is pseudogenized in humans because of five nonsense mutations in the putative coding region. However, in other organisms including cynomolgus monkey, CYP1D1 is a functional drug-metabolizing enzyme that is highly expressed in the liver. CYP1D1 belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	435
410771	cd20678	CYP4B-like	cytochrome P450 family 4, subfamily B and similar cytochrome P450s, including subfamilies A, T, X, and Z. This group is composed of family 4 cytochrome P450s from subfamilies A (CYP4A), B (CYP4B), T (CYP4T), X (CYP4X), and Z (CYP4Z). The CYP4A, CYP4X, and CYP4Z subfamilies are specific to mammals, CYP4T is present in fish, while CYP4B is conserved among vertebrates. CYP4As are known for catalyzing arachidonic acid to 20-HETE (20-hydroxy-5Z,8Z,11Z,14Z-eicosatetraenoic acid), and some can also metabolize lauric and palmitic acid. CYP4Bs specialize in omega-hydroxylation of short chain fatty acids and also participate in the metabolism of exogenous compounds that are protoxic including valproic acid (C8), 3-methylindole (C9), 4-ipomeanol, 3-methoxy-4-aminoazobenzene, and several aromatic amines. CYP4X1 is expressed at high levels in the mammalian brain and may play a role in regulating fat metabolism. CYP4Z1 is a fatty acid hydroxylase that is unique among human CYPs in that it is predominantly expressed in the mammary gland. Monophyly was not found with the CYP4T and CYP4B subfamilies, and further consideration should be given to their nomenclature. The CYP4B-like group belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	436
410772	cd20679	CYP4F	cytochrome P450 family 4, subfamily F. Cytochrome P450 family 4, subfamily F (CYP4F) enzymes are known for omega-hydroxylation of very long fatty acids (VLFA; C18-C26), leukotrienes, prostaglandins, and vitamins with long alkyl side chains. The CYP4F subfamily shows diverse specificities among its members: CYP4F2 and CYP4F3 metabolize pro- and anti-inflammatory leukotrienes; CYP4F8 and CYP4F12 metabolize prostaglandins, endoperoxides and arachidonic acid; CYP4F11 and CYP4F12 metabolize VLFA and are unique in the CYP4F subfamily since they also hydroxylate xenobiotics such as benzphetamine, ethylmorphine, erythromycin, and ebastine. CYP4F belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	442
410773	cd20680	CYP4V	cytochrome P450 family 4, subfamily V. Cytochrome P450 family 4, subfamily V, polypeptide 2 (CYP4V2) is the most characterized member of the CYP4V subfamily. It is a selective omega-hydroxylase of saturated, medium-chain fatty acids, such as laurate, myristate and palmitate, with high catalytic efficiency toward myristate. Polymorphisms in the CYP4V2 gene cause Bietti's crystalline corneoretinal dystrophy (BCD), a recessive degenerative retinopathy that is characterized clinically by a progressive decline in central vision, night blindness, and constriction of the visual field. The CYP4V subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	440
410332	cd20681	T-box_Drosocross-like	DNA-binding domain of Drosophila Dorsocross and related T-box proteins. Drosophila Dorsocross (Doc) includes three Dorsocross paralogs, Doc1-3. These are key cardiogenic T-box transcription factors during specification and differentiation of heart cells. Drosophila Doc also functions in caudal visceral mesoderm development, and modulates Notch signaling in the developing Drosophila eye by regulating the expression of Delta in the eye imaginal discs. Doc also functions in the morphogenesis of epithelial tissues: in Drosophila, which possesses a single extraembryonic (EE) membrane, it is essential for EE epithelia tissue maintenance while in Tribolium castaneum, which has 2 EE membranes, Doc plays a major role in EE morphogenetic events throughout development without affecting EE tissue specificity or maintenance. This subfamily belongs to the T-box family of transcription factors which play a multitude of diverse functions throughout development. The founding member of the T-box family is Brachyury (also known as TBXT, or T). T-box family members share a conserved DNA-binding domain (T-box) which binds DNA in a sequence-specific manner. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development, and conserved expression patterns.	186
410333	cd20682	T-box-like	T-box DNA-binding domain; uncharacterized subfamily. The T-box family is an ancient group that appears to play a critical role in development in all animal species. These genes were uncovered on the basis of similarity to the DNA binding domain of murine Brachyury (T) gene product, the defining feature of the family. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development and conserved expression patterns, most of the known genes in all species being expressed in mesoderm or mesoderm precursors.	191
410334	cd20683	T-box_Fungi_incertae_sedis	T-box DNA-binding domain; uncharacterized subfamily of fungi classified as Fungi incertae sedis. Fungi incertae sedis refers to a fungal taxonomic group whose broader relationships are unknown or undefined. The T-box family is an ancient group that appears to play a critical role in development in all animal species. These genes were uncovered on the basis of similarity to the DNA binding domain of murine Brachyury (T) gene product, the defining feature of the family. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development and conserved expression patterns, most of the known genes in all species being expressed in mesoderm or mesoderm precursors.	214
411002	cd20684	CdiA-CT_Yk_RNaseA-like	C-terminal (CT) domain of the contact-dependent growth inhibition (CDI) system protein CdiA (CdiA-CT) of Yersinia kristensenii, and similar proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. This model represents the C-terminal (CT) toxin domain of CdiA effector protein from Yersinia kristensenii and similar proteins. CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered via the C-terminal domain. A wide variety of C-terminal toxin domains appear to exist; Yersinia kristensenii CdiA-CT has potent RNase activity in vivo and in vitro. Although CdiA-CT has structural homology with angiogenin and other RNase A paralogs, it does not share sequence similarity with these nucleases and lacks the characteristic disulfide bonds of the superfamily. It binds its cognate immunity protein CdiI which neutralizes toxicity by blocking access to RNA substrates. Y. kristensenii CdiA-CT is the first non-vertebrate protein found to possess the RNase A superfamily fold. Homologs of this toxin are associated with secretion systems in many Gram-negative and Gram-positive bacteria, suggesting that RNase A-like toxins are commonly deployed in inter-bacterial competition.	112
411003	cd20685	CdiA-CT_Ecl_RNase-like	C-terminal (CT) domain of the contact-dependent growth inhibition (CDI) system (CdiA-CT) protein CdiA of Enterobacter cloacae, and similar proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered via the C-terminal domain. A wide variety of C-terminal toxin domains appear to exist; this particular model contains the C-terminal (CT) domain of Enterobacter cloacae CdiA and similar domains. This CdiA-CT toxin has structural homology to the C-terminal nuclease domain of colicin E3, which cleaves 16S ribosomal RNA to disrupt protein synthesis, and has been shown to use the same nuclease activity to inhibit bacterial growth. The CdiA-CT toxin is specifically neutralized by cognate immunity protein CdiI to protect the toxin-producing cell from autoinhibition. Despite carrying equivalent toxin domains, the corresponding immunity proteins for CdiA-CT and colicin E3 are unrelated in sequence, structure, and toxin-binding site, thus showing diversity among 16S rRNase toxins.	75
411004	cd20686	CdiA-CT_Ec-like	C-terminal (CT) domain of the contact-dependent growth inhibition (CDI) system (CdiA-CT) protein CdiA of Escherichia coli STEC_O31, and similar proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered via the C-terminal domain. A wide variety of C-terminal toxin domains appear to exist; this particular model contains the C-terminal (CT) domain of Escherichia coli STEC_O31 CdiA and similar domains. The function of this CdiA-CT is as yet unknown, but its C-terminal end is similar to EndoU domain-containing protein which may act as a nuclease toxin that cleaves RNAs in competitor cells. CDI(+) bacteria also produce a CDI immunity protein (CdiI) to specifically neutralize the CdiA-CT toxins to prevent auto-inhibition. This CdiA-CT binds its cognate CdiI with high affinity.	135
412039	cd20687	CdiI_Ykris-like	inhibitor (or immunity protein) of the contact-dependent growth inhibition (CDI) system of Yersinia kristensenii, and similar proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. This model represents the inhibitor (CdiI, also called CdiI immunity protein) of the CdiA effector protein from Yersinia kristensenii (which is an RNase), and similar proteins. CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered via its C-terminal domain (CdiA-CT). The CdiI immunity proteins are intracellular proteins that inactivate the toxin/effector protein. They are specific for their cognate CdiA-CT and do not protect cells from the toxins of other CDI+ bacteria. The CdiI immunity protein binds the CdiA toxin via its C-terminal domain to prevent auto-inhibition. Thus, CDI systems encode a complex network of toxin-immunity protein pairs that are deployed for intercellular competition. Y. kristensenii CdiI binds directly over the putative active site of the CdiA-CT toxin and likely neutralizes toxicity by blocking access to RNA substrates.	90
412040	cd20688	CdiI_Ecoli_Nm-like	inhibitor (or immunity protein) of the contact-dependent growth inhibition (CDI) system of Escherichia coli STEC_O31, Neisseria meningitidis MC58, and similar proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. This model represents the inhibitor (CdiI, also called CdiI immunity protein) of the CdiA effector protein from Escherichia coli STEC_O31, Neisseria meningitidis MC58, and similar proteins. CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered via its C-terminal domain (CdiA-CT). The CdiI immunity proteins are intracellular proteins that inactivate the toxin/effector protein to prevent auto-inhibition. They are specific for their cognate CdiA-CT and do not protect cells from the toxins of other CDI+ bacteria. Thus, CDI systems encode a complex network of toxin-immunity protein pairs that are deployed for intercellular competition. Neisseria meningitidis MC58 immunity protein CdiI has structural homology to the Whirly family of RNA-binding proteins, but lacks the characteristic nucleic acid-binding motif of the family. It has been predicted to neutralize toxin activity by preventing access to RNA substrates.	100
412042	cd20689	CDI_toxin_Bp_tRNase-like	C-terminal (CT) domain of the contact-dependent growth inhibition (CDI) system (CdiA-CT) of Burkholderia pseudomallei, and similar proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. This model represents the C-terminal (CT) toxin domain of CdiA effector proteins. CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered via the C-terminal domain. A wide variety of C-terminal toxin domains appear to exist; this particular model includes the C-terminal (CT) toxin domains of Burkholderia pseudomallei E479 and 1026b, both appearing to be RNAses acting on tRNA.	99
412045	cd20690	CdiI_BpE479-like	inhibitor (or immunity protein) of the contact-dependent growth inhibition (CDI) system of Burkholderia pseudomallei E479, and similar proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. This model represents the inhibitor (CdiI, also called CdiI immunity protein) of the CdiA effector protein from Burkholderia pseudomallei E479 (which is a tRNase). CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered via its C-terminal domain (CdiA-CT). The CdiI immunity proteins are intracellular proteins that inactivate the toxin/effector protein to prevent auto-inhibition. They are specific for their cognate CdiA-CT and do not protect cells from the toxins of other CDI+ bacteria. Thus, CDI systems encode a complex network of toxin-immunity protein pairs that are deployed for intercellular competition. Although related B. pseudomallei E479 CdiA-CT has structural homology to B. pseudomallei 1026B CdiA-CT (both tRNases), their cognate CdiI immunity proteins share no significant sequence or structure homology. This CdiI binds its cognate toxin CdiA-CT domain with high affinity.	100
412046	cd20691	CdiI_EC536-like	inhibitor (or immunity protein) of the contact-dependent growth inhibition (CDI) system of Escherichia coli 536, and similar proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. This model represents the inhibitor (CdiI, also called CdiI immunity protein) of the CdiA effector protein from Escherichia coli 536 (which is a predicted RNase), and similar proteins. CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered via its C-terminal domain (CdiA-CT). The CdiI immunity proteins are intracellular proteins that inactivate the toxin/effector protein to prevent auto-inhibition. They are specific for their cognate CdiA-CT and do not protect cells from the toxins of other CDI+ bacteria. Thus, CDI systems encode a complex network of toxin-immunity protein pairs that are deployed for intercellular competition. This E. coli CdiI's cognate toxin CdiA-CT domain is activated only when it is bound to the biosynthetic enzyme O-acetylserine sulfhydrylase-A (CysK), one of two isoenzymes (along with CysM) that catalyze the final reaction in cysteine synthesis. CdiA's predicted nuclease active site is occluded by immunity protein in the CysK/CdiA-CT/CdiI structure, suggesting that CdiI blocks the binding of tRNA substrates to the toxin.	120
411005	cd20692	CdiA-CT_Ec-like	C-terminal (CT) domain of the contact-dependent growth inhibition (CDI) system (CdiA-CT) protein CdiA of Escherichia coli A0 34/86, and similar proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered via the C-terminal domain. A wide variety of C-terminal toxin domains appear to exist; this particular model contains the C-terminal (CT) domain of Escherichia coli A0 34/86 CdiA. Activity of this E. coli CdiA-CT is as yet unknown. CDI(+) bacteria also produce a CDI immunity protein (CdiI) to specifically neutralize the CdiA-CT toxins to prevent auto-inhibition. This CdiA-CT binds its cognate CdiI with high affinity.	99
412047	cd20693	CdiI_EcoliA0-like	inhibitor (or immunity protein) of the contact-dependent growth inhibition (CDI) system of Escherichia coli A0 34/86, and similar proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. This model represents the inhibitor (CdiI, also called CdiI immunity protein) of the CdiA effector protein from Escherichia coli A0 34/86, and similar proteins. CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered via its C-terminal domain (CdiA-CT). The CdiI immunity proteins are intracellular proteins that inactivate the toxin/effector protein to prevent auto-inhibition. They are specific for their cognate CdiA-CT and do not protect cells from the toxins of other CDI+ bacteria. Thus, CDI systems encode a complex network of toxin-immunity protein pairs that are deployed for intercellular competition. This E. coli CdiI binds its cognate toxin CdiA-CT domain with high affinity.	124
412048	cd20694	CdiI_Ct-like	inhibitor (or immunity protein) of the contact-dependent growth inhibition (CDI) system of Cupriavidus taiwanensis CdiI immunity protein and similar proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. This model represents the inhibitor (CdiI, also called CdiI immunity protein) of the CdiA effector protein from Cupriavidus taiwanensis, and similar proteins. CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered via its C-terminal domain (CdiA-CT). The CdiI immunity proteins are intracellular proteins that inactivate the toxin/effector protein to prevent auto-inhibition. They are specific for their cognate CdiA-CT and do not protect cells from the toxins of other CDI+ bacteria. Thus, CDI systems encode a complex network of toxin-immunity protein pairs that are deployed for intercellular competition. This C. taiwanensis CdiI is alpha-helical and binds its cognate toxin CdiA-CT domain with high affinity.	96
411006	cd20695	CdiA-CT_5T87E_Ct	C-terminal (CT) domain of the contact-dependent growth inhibition (CDI) system protein CdiA (CdiA-CT) of Cupriavidus taiwanensis, and related proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered via the C-terminal domain. A wide variety of C-terminal toxin domains appear to exist; this particular model contains the C-terminal (CT) domain of Cupriavidus taiwanensis CdiA. The exact biochemical function of this CdiA-CT cannot be predicted easily and may include RNase or DNase activity. CDI(+) bacteria also produce a CDI immunity protein (CdiI) to specifically neutralize the CdiA-CT toxins to prevent auto-inhibition. This CdiA-CT binds its cognate CdiI with high affinity.	61
412049	cd20696	CdiI_Ecoli3006-like	inhibitor (or immunity protein) of the contact-dependent growth inhibition (CDI) system of Escherichia coli 3006, and similar proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. This model represents the inhibitor (CdiI, also called CdiI immunity protein) of the CdiA effector protein from Escherichia coli 3006, and similar proteins. CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered via its C-terminal domain (CdiA-CT). The CdiI immunity proteins are intracellular proteins that inactivate the toxin/effector protein to prevent auto-inhibition. They are specific for their cognate CdiA-CT and do not protect cells from the toxins of other CDI+ bacteria. Thus, CDI systems encode a complex network of toxin-immunity protein pairs that are deployed for intercellular competition. The E. coli CdiI binds its cognate CdiA-CT with high affinity via one end of its beta-sandwich structure.	150
410969	cd20697	CdiA-CT_Ec_Kp-like	C-terminal (CT) domain of the contact-dependent growth inhibition (CDI) system (CdiA-CT) of Escherichia coli and Klebsiella pneumoniae CdiA, and similar proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. This model represents the C-terminal (CT) toxin domain of CdiA effector proteins. CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered via the C-terminal domain. A wide variety of C-terminal toxin domains appear to exist; this particular model contains the C-terminal domain of CdiA (CdiA-CT) from Escherichia coli, Klebsiella pneumoniae and other bacteria. The exact biochemical function of this CdiA-CT is as yet unknown. CDI(+) bacteria also produce a CDI immunity protein (CdiI) to specifically neutralize the CdiA-CT toxins to prevent auto-inhibition. This CdiA-CT binds its cognate CdiI with high affinity.	94
412050	cd20698	CdiI_Kp-like	inhibitor (or immunity protein) of the contact-dependent growth inhibition (CDI) system of Klebsiella pneumoniae, and similar proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. This model represents the inhibitor (CdiI, also called CdiI immunity protein) of the CdiA effector protein from Klebsiella pneumoniae, and similar proteins. CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered via its C-terminal domain (CdiA-CT). The CdiI immunity proteins are intracellular proteins that inactivate the toxin/effector protein to prevent auto-inhibition. They are specific for their cognate CdiA-CT and do not protect cells from the toxins of other CDI+ bacteria. Thus, CDI systems encode a complex network of toxin-immunity protein pairs that are deployed for intercellular competition. The K. pneumoniae CdiI binds its cognate CdiA-CT via one end of its beta-sandwich structure.	110
412051	cd20699	CdiI_ECL-like	inhibitor (or immunity protein) of the contact-dependent growth inhibition (CDI) system of Enterobacter cloacae, and similar proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. This model represents the inhibitor (CdiI, also called CdiI immunity protein) of the CdiA effector protein from Enterobacter cloacae, and similar proteins. CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered via its C-terminal domain (CdiA-CT). The CdiI immunity proteins are intracellular proteins that inactivate the toxin/effector protein to prevent auto-inhibition. They are specific for their cognate CdiA-CT and do not protect cells from the toxins of other CDI+ bacteria. Thus, CDI systems encode a complex network of toxin-immunity protein pairs that are deployed for intercellular competition. Although E. cloacae CdiA-CT has structural homology to the C-terminal nuclease domain of colicin E3, which cleaves 16S ribosomal RNA to disrupt protein synthesis, and has been shown to use the same nuclease activity to inhibit bacterial growth, the corresponding CdiI immunity proteins are unrelated in sequence, structure and toxin-binding sites. Structural homology searches reveal that E. cloacae CdiI is most similar to the Whirly family of single-stranded DNA-binding proteins.	141
411007	cd20700	CdiA-CT_Ec_tRNase	C-terminal (CT) domain of the contact-dependent growth inhibition (CDI) system protein CdiA (CdiA-CT) of Escherichia coli 536, and similar proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered via the C-terminal domain. A wide variety of C-terminal toxin domains appear to exist; this particular model contains the C-terminal (CT) domain of Escherichia coli 536 CdiA and similar domains. This CdiA-CT (EC536) region is composed of two domains that have distinct functions during CDI. This domain is the extreme C-terminal domain, an RNase toxin that possesses an all alpha-helical fold and conserved aspartate and glutamate residues, and K[DE] and [DN]HxxE motifs. The N-terminal domain facilitates translocation of the tethered nuclease into the cytosol of target bacteria. Although this CdiA-CT rapidly cleaves tRNA in vivo, the purified toxin has no detectable nuclease activity in vitro. Experiments show that it is activated when bound to the biosynthetic enzyme O-acetylserine sulfhydrylase-A (CysK), which is one of two isoenzymes (along with CysM) that catalyze the final reaction in cysteine synthesis. CDI(+) bacteria also produce a CDI immunity protein (CdiI) to specifically neutralize the CdiA-CT toxins to prevent auto-inhibition. This CdiA-CT binds its cognate CdiI with high affinity.	115
410945	cd20701	MIX	Marker for type sIX effectors domain. This family contains the MIX (Marker for type sIX effectors) domain, a marker of type VI secretion system (T6SS) effectors carrying polymorphic C-terminal toxins. MIX domains have been classified into five clans (called MIX I-V) by Dar et al. based on sequence similarity. These domains have been further classified as either antibacterial or anti-eukaryotic, based on the presence or absence of adjacent putative immunity genes, respectively. In Vibrionaceae, antibacterial MIX-effectors carrying domains with pore-forming, phospholipase, nuclease, peptidoglycan hydrolase, and protease activities have been identified. Additionally, novel virulence MIX-effectors that employ a combination of antibacterial and anti-eukaryotic MIX-effectors have been found, suggesting that certain bacteria adapted their antibacterial T6SS to mediate interactions with eukaryotic hosts or predators. A subset of polymorphic MIX-effectors, a widespread class of effectors secreted by T6SSs, are horizontally shared between marine bacteria and used to diversify their T6SS effector repertoires, thus enhancing their environmental fitness.	128
410637	cd20702	PoNe	Polymorphic Nuclease effector (PoNe) domain is a deoxyribonuclease. This family contains the DNase toxin domain called PoNe (Polymorphic Nuclease effector), which belongs to a diverse superfamily of PD-(D/E)xK phosphodiesterases, and is associated with several toxin delivery systems including type V, type VI, and type VII. PoNe toxicity is antagonized by cognate immunity proteins (PoNi) containing DUF1911 and DUF1910 domains. The PoNe domain co-occurs with a variety of N-terminal domains such as filamentous hemagglutinin, nuclease, HINT, DUFs, PAAR, RHS repeat, or LXG domains. Some members of this family also co-occur with the FIX (Found in type sIX effector) domain of unknown function, as identified by Jana et al., who have also identified this PoNe domain.	77
410939	cd20703	FIX-like	Found in type sIX effector (FIX) domain of unknown function. This family contains the Found in type sIX effector (FIX) domain and similar proteins. FIX is found N-terminal to known toxin domains and is genetically and functionally linked to type VI secretion system (T6SS), a widespread mechanism used by Gram-negative bacteria to antagonize neighboring cells. In Vibrio parahaemolyticus, it also co-occurs with C-terminal nuclease toxin PoNe (Polymorphic Nuclease effector) which is associated with several toxin delivery systems including type V, type VI, and type VII. FIX is present in various established T6SS-secreted effectors that have an N-terminal VgrG or PAAR or PAAR-like (i.e., DUF4280) domain, suggesting that FIX may play a role in delivery of T6SS effectors, and serve as a new marker for T6SS-delivered proteins to enable the identification of novel T6SS substrates.	75
412052	cd20704	Orc3	Origin recognition complex subunit 3. Origin recognition complex subunit 3 (Orc3) is a subunit of the heterohexameric origin recognition complex (ORC) that is essential for coordinating replication onset. ORC binds to the origin of replication, binds CDC6, and recruits the hexameric MCM2-7 ring to the DNA, which leads to the assembly of the pre-replicative complex (pre-RC). Five of the 6 ORC subunits (Orc 1-5) retain AAA+ (ATPases associated with a variety of cellular activities) folds, but Orc3, as well as Orc2, lost their ATP-binding signatures.	387
410946	cd20705	MIX_I	Marker for type sIX effectors domain, clan I. This subfamily contains the MIX (Marker for type sIX effectors) clan I (MIX I) domain. MIX is a marker of type VI secretion system (T6SS) effectors carrying polymorphic C-terminal toxins predicted to mediate antibacterial toxicity. These C-terminal toxin domains of Vibrionaceae MIX I effectors include pore-forming, nuclease and nucleotide deaminase activities. Members of the MIX I clan are similar, in both sequence and synteny, to the Vibrio parahaemolyticus MIX-effector VP1388, but their activity is unknown. Notably, many toxins identified as T6SS effectors do not contain a recognizable delivery domain or signal, suggesting that additional delivery domains may exist.	115
410947	cd20706	MIX_II	Marker for type sIX effectors domain, clan II. This subfamily contains the MIX (Marker for type sIX effectors) II clan (MIX II) domain. MIX is a marker of type VI secretion system (T6SS) effectors carrying polymorphic C-terminal toxins. Predicted activity of the C-terminal toxin domains of Vibrionaceae MIX II effectors is mainly pore-forming. Notably, many toxins identified as T6SS effectors do not contain a recognizable delivery domain or signal, suggesting that additional delivery domains may exist. Some of these MIX II effectors also contain N-terminal domains such as the T6SS-secreted tail component PAAR.	149
410948	cd20707	MIX_III	Marker for type sIX effectors domain, clan III. This subfamily contains the MIX (Marker for type sIX effectors) III clan (MIX III) domain. MIX is a marker of type VI secretion system (T6SS) effectors carrying polymorphic C-terminal toxins. No MIX III clan members have been detected in Vibrionaceae. Predicted activity of the C-terminal toxin domains of MIX III effectors is mainly pore-forming. Studies have shown that many members of the MIX III clan neighbor transposable elements. Notably, many toxins identified as T6SS effectors do not contain a recognizable delivery domain or signal, suggesting that additional delivery domains may exist.	137
410949	cd20708	MIX_IV	Marker for type sIX effectors domain, clan IV. This subfamily contains the MIX (Marker for type sIX effectors) IV clan (MIX IV) domain. MIX is a marker of type VI secretion system (T6SS) effectors carrying polymorphic C-terminal toxins. Predicted activity of the C-terminal toxin domains of Vibrionaceae MIX IV effectors is mainly pore-forming. Members of MIX IV are similar, in both sequence and synteny, to the Vibrio parahaemolyticus MIX-effector VP1388, but their activity is unknown. Notably, many toxins identified as T6SS effectors do not contain a recognizable delivery domain or signal, suggesting that additional delivery domains may exist.	133
410950	cd20709	MIX_V	Marker for type sIX effectors domain, clan V. This subfamily contains the MIX (Marker for type sIX effectors) V clan (MIX V) domain. MIX is a marker of type VI secretion system (T6SS) effectors carrying polymorphic C-terminal toxins. Predicted antibacterial activities of the C-terminal toxin domains of Vibrionaceae MIX V effectors include peptidase, peptidoglycan hydrolase, nuclease and pore-forming. Also included in this clan is VPR01S_11_01570, encoded by V. proteolyticus, that carries a CNF1 (cytotoxic necrotizing factor 1) toxin domain and modulates the actin cytoskeleton of eukaryotic phagocytic cells. Some members contain DUF2235, which is predicted as a phospholipase domain. Members of the MIX V clan are shared between marine bacteria via horizontal gene transfer, thereby enhancing their bacterial competitive fitness. Notably, many toxins identified as T6SS effectors do not contain a recognizable delivery domain or signal, suggesting that additional delivery domains may exist.	123
411008	cd20710	NOT1_connector	Connector domain of NOT1. This NOT1 connector domain is one of several catalytically inactive subunits of the multisubunit CCR4-NOT complex assembly that plays a central role in post-transcriptional gene regulation in eukaryotes. CCR4-NOT contains the catalytic center formed by two deadenylase subunits CCR4 and CAF1, and the conserved core complex which contains a minimum of four catalytically inactive subunits, NOT1, NOT2, NOT3 and CAF40/NOT9. NOT1 is the largest subunit which functions as a central scaffold for complex assembly in human orthologs. The Chaetomium thermophilum NOT1 connector domain consists of five alpha-helical hairpin repeats of the HEAT type that structurally resemble MIF4G domains, and hence is also called the MIF4G-C domain. However, NOT1 MIF4G-C does not interact with DEAD-box helicases such as DDX6 like MIF4G does. Structural conservation of this domain suggests an important role but its function is as yet unknown.	202
411009	cd20712	LNYV_P-protein-C_like	C-terminal domain of lettuce necrotic yellows virus phosphoprotein and related domains. This family includes the C-terminal domain of the P protein of plant viruses belonging to the Rhabdoviridae family such as Lettuce necrotic yellows virus (LNYV). LNYV P protein acts as a weak local RNA silencing suppressor in plants to counteract RNA silencing antiviral defense. It suppresses both RNA induced silencing complex (RISC)-mediated cleavage and RNA silencing amplification. The C-terminal domain of LNYV P protein is essential for both local RNA silencing suppression and interaction with Argonaute (AGO) 1, AGO2, and AGO4 (key components of the RISC complexes), and with SGS3 and RDR6 (which function in the amplification step of RNA silencing). The family Rhabdoviridae belongs to the order Mononegavirales which are nonsegmented negative-stranded RNA viruses (NNVs). The genomes of NNVs are encapsidated by their nucleocapsid (N) proteins to form N-RNA complexes which serve as a template for transcription and replication. The C-terminus of P protein binds nucleocapsid. P protein plays multiple roles in transcription and translation, including acting as a chaperone of nascent nucleoprotein (N), and as a cofactor of the viral polymerase (L) where P forms a two-subunit polymerase with a large catalytic subunit (L) and stabilizes the polymerase on its template of N-RNA.	69
411010	cd20714	NSP3_rotavirus	rotavirus non-structural protein 3 (NSP3). Rotaviruses co-opt the eukaryotic translation machinery during their life cycle. Most eukaryotic mRNAs are characterized by a 5' cap structure and a 3' poly(A) tail. Eukaryotic translation initiation is facilitated by interactions between the 3' poly(A) tail and the 5' end of the message mediated by poly(A) binding protein (PABP) and eukaryotic translation initiation factor 4G (eIF4G). Rotavirus NSP3 is a functional analog of PABP that enables rotaviruses to direct eukaryotic translation machinery to viral mRNAs. It binds to the 3' consensus sequence of viral mRNA and participates in mRNA circularization by interacting with eIF4G. NSP3 closes the viral mRNA loop and facilitates translation of its own mRNAs while blocking recruitment of PABP to the eukaryotic translation initiation machinery.	127
410956	cd20716	cyt_P460_fam	Cytochrome P460 family. The cytochrome P460 family is composed mostly of monoheme, ~17 kDa, c-cytochromes typified by the cytochromes P460 of Nitrosomonas europaea and Methylococcus capsulatus (Bath), and the cytochrome c'-beta of M. capsulatus. Members of this family can be characterized by a predominantly beta-sheet structure as opposed to the four elongate, tightly-packed alpha-helices of the widely distributed cytochromes c' of photoheterotrophic and denitrifying bacteria. They are involved in the oxidation/reduction or ligation of N-oxides for detoxification or energy generation. Phylogenetic studies suggest that cytochrome P460 (cytL) genes evolved from ancestral cytochrome c'-beta genes (cytS) by acquisition of features including the lysine-heme cross-link. The protein-bound c-type heme cofactor, heme P460, named for its characteristic ferrous Soret peak maximum at 460 nm, has the distinction of being the only known heme in biology to withdraw electrons from an iron coordinated substrate.	124
410310	cd20721	CYCLIN_SDS-like_rpt2	second cyclin box found in Arabidopsis thaliana cyclin-SDS and similar proteins. Cyclin-SDS, also called protein SOLO DANCERS, is a meiosis-specific cyclin that is required for normal homolog synapsis and recombination in early to mid-prophase 1. It contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain.	104
410311	cd20722	CYCLIN_CCNO_rpt2	second cyclin box found in cyclin-O (CCNO). CCNO is specifically required for generation of multiciliated cells, possibly by promoting a cell cycle state compatible with centriole amplification and maturation. It acts downstream of MCIDAS (multiciliate differentiation and DNA synthesis associated cell cycle protein) to promote mother centriole amplification and maturation in preparation for apical docking. CCNO is involved in the activation of cyclin-dependent kinase 2. CCNO contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain.	97
410970	cd20723	CdiA-CT_Ec-like	C-terminal (CT) domain of the contact-dependent growth inhibition (CDI) system (CdiA-CT) of Escherichia coli CdiA, and similar proteins. This family includes the C-terminal (CT) domain of Escherichia coli CdiA, an effector protein involved in contact-dependent growth inhibition (CDI), a mechanism of inter-bacterial competition. The large CdiA effector protein carries a C-terminal toxin domain (CdiA-CT) which is delivered to neighboring bacteria to inhibit target-cell growth. The exact biochemical function of this E. coli CdiA is as yet unknown. CDI(+) bacteria also produce a CDI immunity protein (CdiI) to specifically neutralize the CdiA-CT toxins to prevent auto-inhibition. This CdiA-CT binds its all helical cognate CdiI with high affinity.	160
410971	cd20724	CdiA-CT_Kp342-like	C-terminal (CT) domain of the contact-dependent growth inhibition (CDI) system (CdiA-CT) of Klebsiella pneumoniae 342 CdiA, and similar proteins. This family includes the C-terminal (CT) domain of Klebsiella pneumoniae CdiA, an effector protein involved in contact-dependent growth inhibition (CDI), a mechanism of inter-bacterial competition. The large CdiA effector protein carries a C-terminal toxin domain (CdiA-CT) which is delivered to neighboring bacteria to inhibit target-cell growth. The exact biochemical function of this CdiA-CT is as yet unknown. CDI(+) bacteria also produce a CDI immunity protein (CdiI) to specifically neutralize the CdiA-CT toxins to prevent auto-inhibition. This CdiA-CT binds its cognate CdiI immunity protein (all beta-sheet structure) with high affinity.	107
410972	cd20725	CdiA-CT_Kp-like	CdiA C-terminal domain of the contact-dependent growth inhibition (CDI) system (CdiA-CT) similar to Klebsiella pneumoniae CdiA-CT. This family includes the C-terminal (CT) domain of bacterial CdiA, an effector protein involved in contact-dependent growth inhibition (CDI), a mechanism of inter-bacterial competition. The large CdiA effector protein carries a C-terminal toxin domain (CdiA-CT) which is delivered to neighboring bacteria to inhibit target-cell growth. Many of the domains in this family are associated with RHS repeats N-terminal to the domain. The exact biochemical function of this CdiA-CT is as yet unknown. CDI(+) bacteria also produce a CDI immunity protein (CdiI) to specifically neutralize the CdiA-CT toxins to prevent auto-inhibition. This CdiA-CT binds its cognate CdiI with high affinity.	99
412043	cd20726	CDI_toxin_BpE479_tRNase-like	C-terminal (CT) toxin domain of the contact-dependent growth inhibition (CDI) system (CdiA-CT) of Burkholderia pseudomallei, and similar proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. This model represents the C-terminal (CT) toxin domain of CdiA effector proteins. CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered via the C-terminal domain. A wide variety of C-terminal toxin domains appear to exist; this particular model contains the C-terminal (CT) domain of Burkholderia pseudomallei E479 CdiA. This CdiA-CT domain is a tRNase that contains a core alpha/beta-fold that is characteristic of PD(D/E)XK superfamily nucleases. It is structurally similar to another CDI toxin domain from B. pseudomallei 1026b which is unrelated in sequence but has a similar nuclease domain, and shares similar fold and active-site architecture. The PD(D/E)XK superfamily includes most restriction endonucleases and other enzymes involved in DNA recombination and repair.	121
412044	cd20727	CDI_toxin_Bp_tRNase-like	C-terminal (CT) toxin domain of a contact-dependent growth inhibition (CDI) system (CdiA-CT) similar to that of Burkholderia pseudomallei, and related proteins. CDI toxins are expressed by gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. This model represents the C-terminal (CT) toxin domain of CdiA effector proteins. CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered via the C-terminal domain. A wide variety of C-terminal toxin domains appear to exist; this particular model contains the C-terminal (CT) toxin domains that are similar to Burkholderia pseudomallei E479 and 1026b CdiA toxins, both of which are tRNases.	105
410638	cd20729	PoNe_LXG-like	Polymorphic Nuclease effector (PoNe) co-occurring with N-terminal LXG domains. The DNase toxin domain called PoNe (Polymorphic Nuclease effector) belongs to a diverse superfamily of PD-(D/E)xK phosphodiesterases, and is associated with several toxin delivery systems including type V, type VI, and type VII. PoNe toxicity is antagonized by cognate immunity proteins (PoNi) containing DUF1911 and DUF1910 domains. This subfamily contains some members that contain LXG domains in the N-terminal region. This group of polymorphic toxin proteins in bacteria is predicted to be associated with type VII secretion pathways to mediate export of bacterial toxins.	139
410639	cd20730	PoNe_FilH-like	Polymorphic Nuclease effector (PoNe) co-occurring with filamentous hemagglutinin N-terminal domain repeats. The DNase toxin domain called PoNe (Polymorphic Nuclease effector) belongs to a diverse superfamily of PD-(D/E)xK phosphodiesterases, and is associated with several toxin delivery systems including type V, type VI, and type VII. PoNe toxicity is antagonized by cognate immunity proteins (PoNi) containing DUF1911 and DUF1910 domains. This subfamily contains proteins with PoNe domains that typically co-occur with N-terminal hemagglutinin repeats and/or a hemagglutination activity domain.	133
410640	cd20731	PoNe_FilH_TF-like	Polymorphic Nuclease effector (PoNe) co-occurring with N-terminal domains such as filamentous hemagglutinin repeats or TANFOR. The DNase toxin domain called PoNe (Polymorphic Nuclease effector) belongs to a diverse superfamily of PD-(D/E)xK phosphodiesterases, and is associated with several toxin delivery systems including type V, type VI, and type VII. PoNe toxicity is antagonized by cognate immunity proteins (PoNi) containing DUF1911 and DUF1910 domains. This subfamily contains members with PoNe domains typically co-occurring with N-terminal domains such as hemagglutinin repeats and/or hemagglutination activity domains, or a TANFOR domain, which contains uncharacterized single or repeat domains that co-occur with fibronectin type III domains.	129
410641	cd20732	PoNe_FilH_DUF637_VENN-like	Polymorphic Nuclease effector (PoNe) domain co-occurring with N-terminal domains such as filamentous hemagglutinin repeats, DUF637, or pre-toxin domain with VENN motif. The DNase toxin domain called PoNe (Polymorphic Nuclease effector) belongs to a diverse superfamily of PD-(D/E)xK phosphodiesterases, and is associated with several toxin delivery systems including type V, type VI, and type VII. PoNe toxicity is antagonized by cognate immunity proteins (PoNi) containing DUF1911 and DUF1910 domains. This subfamily contains proteins with a PoNe domain typically co-occurring with N-terminal domains such as filamentous hemagglutinin repeats, hemagglutination activity domains, DUF637 - predicted to be a hemagglutinin domain, or pre-toxin domains with VENN motifs, which are found in many bacterial polymorphic toxins and are located before the C-terminal toxin modules.	121
410642	cd20733	PoNe_PAAR-like	Polymorphic Nuclease effector (PoNe) co-occurring with an N-terminal PAAR domain. The DNase toxin domain called PoNe (Polymorphic Nuclease effector) belongs to a diverse superfamily of PD-(D/E)xK phosphodiesterases, and is associated with several toxin delivery systems including type V, type VI, and type VII. PoNe toxicity is antagonized by cognate immunity proteins (PoNi) containing DUF1911 and DUF1910 domains. This subfamily contains members with PoNe domains that typically co-occur with N-terminal domains such as proline-alanine-alanine-arginine (PAAR) repeat domains that form a sharp conical extension on VgrG spikes, which is a trimeric protein complex of the bacterial type VI secretion system (T6SS).	125
410643	cd20734	PoNe_RHS-like	Polymorphic Nuclease effector (PoNe) domain co-occurring with RHS repeat-associated core domain. The DNase toxin domain called PoNe (Polymorphic Nuclease effector) belongs to a diverse superfamily of PD-(D/E)xK phosphodiesterases, and is associated with several toxin delivery systems including type V, type VI, and type VII. PoNe toxicity is antagonized by cognate immunity proteins (PoNi) containing DUF1911 and DUF1910 domains. This subfamily contains proteins with PoNe domains typically co-occurring with N-terminal domains such as RHS repeat-associated core domains, which may contain FG-GAP, RHS or YD repeats, and are found in secreted bacterial insecticidal toxins.	90
410644	cd20735	PoNe_RHS-like	Polymorphic Nuclease effector (PoNe) domain co-occurring with N-terminal domains such as RHS repeats. The DNase toxin domain called PoNe (Polymorphic Nuclease effector) belongs to a diverse superfamily of PD-(D/E)xK phosphodiesterases, and is associated with several toxin delivery systems including type V, type VI, and type VII. PoNe toxicity is antagonized by cognate immunity proteins (PoNi) containing DUF1911 and DUF1910 domains. This subfamily contains proteins with PoNe domains that typically co-occur with uncharacterized N-terminal RHS repeat domains.	111
410645	cd20736	PoNe_Nuclease	Polymorphic Nuclease effector (PoNe) co-occurring with an N-terminal nuclease domain. The DNase toxin domain called PoNe (Polymorphic Nuclease effector) belongs to a diverse superfamily of PD-(D/E)xK phosphodiesterases, and is associated with several toxin delivery systems including type V, type VI, and type VII. PoNe toxicity is antagonized by cognate immunity proteins (PoNi) containing DUF1911 and DUF1910 domains. This subfamily contains proteins with PoNe domains that typically co-occur with nuclease N-terminal domains such as endonucleases involved in methyl-directed DNA mismatch repair in gram negative bacteria.	80
410646	cd20737	PoNe_HINT	Polymorphic Nuclease effector (PoNe) domain co-occurring with N-terminal domains such as the HINT domain. The DNase toxin domain called PoNe (Polymorphic Nuclease effector) belongs to a diverse superfamily of PD-(D/E)xK phosphodiesterases, and is associated with several toxin delivery systems including type V, type VI, and type VII. PoNe toxicity is antagonized by cognate immunity proteins (PoNi) containing DUF1911 and DUF1910 domains. This subfamily contains proteins with PoNe domains that typically co-occur with a pre-toxin HINT domain, a member of the HINT superfamily of proteases usually found N-terminal to the toxin module in polymorphic toxin systems; the HINT domain is predicted to function in releasing the toxin domain by autoproteolysis.	91
410647	cd20738	PoNe_DUF4280	Polymorphic Nuclease effector (PoNe) co-occurring with an N-terminal DUF4280 domain. The DNase toxin domain called PoNe (Polymorphic Nuclease effector) belongs to a diverse superfamily of PD-(D/E)xK phosphodiesterases, and is associated with several toxin delivery systems including type V, type VI, and type VII. PoNe toxicity is antagonized by cognate immunity proteins (PoNi) containing DUF1911 and DUF1910 domains. This subfamily contains proteins with PoNe domains that typically co-occur with an N-terminal domain of unknown function (DUF4280), which has a single completely conserved residue C that may be functionally important.	127
410648	cd20739	PoNe_DUF637	Polymorphic Nuclease effector (PoNe) co-occurring with an N-terminal DUF637 domain. The DNase toxin domain called PoNe (Polymorphic Nuclease effector) belongs to a diverse superfamily of PD-(D/E)xK phosphodiesterases, and is associated with several toxin delivery systems including type V, type VI, and type VII. PoNe toxicity is antagonized by cognate immunity proteins (PoNi) containing DUF1911 and DUF1910 domains. This subfamily contains proteins with PoNe domains that typically co-occur with N-terminal domains such as DUF637 predicted to be a hemagglutinin domain.	124
410649	cd20740	PoNe_LXG_HINT-like	Polymorphic Nuclease effector (PoNe) domain co-occurring with N-terminal LXG or pre-toxin HINT domains. The DNase toxin domain called PoNe (Polymorphic Nuclease effector) belongs to a diverse superfamily of PD-(D/E)xK phosphodiesterases, and is associated with several toxin delivery systems including type V, type VI, and type VII. PoNe toxicity is antagonized by cognate immunity proteins (PoNi) containing DUF1911 and DUF1910 domains. This subfamily contains members with PoNe domains that co-occur with N-terminal domains such as HINT or LXG domains. The pre-toxin HINT domain, a member of the HINT superfamily of proteases, is usually found N-terminal to the toxin module in polymorphic toxin systems; the HINT domain is predicted to function in releasing the toxin domain by autoproteolysis. The LXG domains are present in the N-terminal region of a group of polymorphic toxin proteins in bacteria and are predicted to use a Type VII secretion pathway to mediate export of bacterial toxins.	96
410650	cd20741	PoNe_HINT_TF-like	Polymorphic Nuclease effector (PoNe) domain co-occurring with N-terminal domains such as HINT or TANFOR. The DNase toxin domain called PoNe (Polymorphic Nuclease effector) belongs to a diverse superfamily of PD-(D/E)xK phosphodiesterases, and is associated with several toxin delivery systems including type V, type VI, and type VII. PoNe toxicity is antagonized by cognate immunity proteins (PoNi) containing DUF1911 and DUF1910 domains. This subfamily contains proteins with PoNe domains that typically co-occur with N-terminal domains such as a TANFOR domain which contains uncharacterized single or repeat domains that co-occur with fibronectin type III domains, or a pre-toxin HINT domain, a member of the HINT superfamily of proteases usually found N-terminal to the toxin module in polymorphic toxin systems; the HINT domain is predicted to function in releasing the toxin domain by autoproteolysis.	77
410940	cd20742	FIX_vWA-like	Found in type sIX effector (FIX) domain of unknown function co-occurring with von Willebrand factor type A (vWA) domain or MSCRAMM family adhesin SdrC domain. The Found in type sIX effector (FIX) domain is found N-terminal to known toxin domains and is genetically and functionally linked to type VI secretion system (T6SS), a widespread mechanism used by Gram-negative bacteria to antagonize neighboring cells. In Vibrio parahaemolyticus, it also co-occurs with C-terminal nuclease toxin PoNe (Polymorphic Nuclease effector) which is associated with several toxin delivery systems including type V, type VI, and type VII. In this subfamily, members contain a FIX domain that co-occurs with domains such as the von Willebrand factor type A (vWA) domain, which has a wide variety of important cellular functions, and the MSCRAMM (Microbial Surface Components Recognizing Adhesive Matrix Molecules) family adhesin SdrC domain, that contains a variable-length C-terminal region of Ser-Asp (SD) repeats.	80
410941	cd20743	FIX_RhsA-like	Found in type sIX effector (FIX) domain of unknown function co-occurring with RhsA domains with RHS repeats. The Found in type sIX effector (FIX) domain is found N-terminal to known toxin domains and is genetically and functionally linked to type VI secretion system (T6SS), a widespread mechanism used by Gram-negative bacteria to antagonize neighboring cells. In Vibrio parahaemolyticus, it also co-occurs with C-terminal nuclease toxin PoNe (Polymorphic Nuclease effector) which is associated with several toxin delivery systems including type V, type VI, and type VII. In this subfamily, members contain a FIX domain that co-occurs with C-terminal RhsA-like domain, which contains extended repeat regions and RHS repeats. Some in this family have additional C-terminal domains such as AHH, a predicted nuclease domain with conserved AHH motif that is found in bacterial polymorphic toxin systems and functions as a toxin module.	92
410942	cd20744	FIX_AHH_RhsA-like	Found in type sIX effector (FIX) domain of unknown function co-occurring with C-terminal AHH domain and some RhsA domains with RHS repeats. The Found in type sIX effector (FIX) domain is found N-terminal to known toxin domains and is genetically and functionally linked to type VI secretion system (T6SS), a widespread mechanism used by Gram-negative bacteria to antagonize neighboring cells. In Vibrio parahaemolyticus, it also co-occurs with C-terminal nuclease toxin PoNe (Polymorphic Nuclease effector) which is associated with several toxin delivery systems including type V, type VI, and type VII. In this subfamily, members contain a FIX domain that co-occurs with C-terminal domains such as AHH, a predicted nuclease domain with conserved AHH motif that is found in bacterial polymorphic toxin systems and functions as a toxin module. Some in this family have additional C-terminal domains such as RhsA protein which contains extended repeat regions and RHS repeats.	76
410943	cd20745	FIX_RhsA_AHH_HNH-like	Found in type sIX effector (FIX) domain of unknown function co-occurring with RhsA, AHH or HNH domain. The Found in type sIX effector (FIX) domain is found N-terminal to known toxin domains and is genetically and functionally linked to type VI secretion system (T6SS), a widespread mechanism used by Gram-negative bacteria to antagonize neighboring cells. In Vibrio parahaemolyticus, it also co-occurs with C-terminal nuclease toxin PoNe (Polymorphic Nuclease effector) which is associated with several toxin delivery systems including type V, type VI, and type VII. In this subfamily, members contain a FIX domain that co-occurs with C-terminal RhsA-like domain which contains extended repeat regions and RHS repeats. Some in this subfamily have additional C-terminal domains such as AHH, a predicted nuclease domain with conserved AHH motif that is found in bacterial polymorphic toxin systems and functions as a toxin module, and HNH endonuclease domain, which usually contains a conserved HNH motif in the sequence. Some members also contain additional N-terminal VgrG or PAAR or PAAR-like (i.e., DUF4280) domain.	69
410944	cd20746	FIX_Ntox15_NUC_DUF4112_RhsA-like	Found in type sIX effector (FIX) domain of unknown function co-occurring with Ntox15, endonuclease, or RHS repeat domain. The Found in type sIX effector (FIX) domain is found N-terminal to known toxin domains and is genetically and functionally linked to type VI secretion system (T6SS), a widespread mechanism used by Gram-negative bacteria to antagonize neighboring cells. In Vibrio parahaemolyticus, it also co-occurs with C-terminal nuclease toxin PoNe (Polymorphic Nuclease effector) which is associated with several toxin delivery systems including type V, type VI, and type VII. In this subfamily, members contain a FIX domain that generally co-occurs with the C-terminal Ntox15 (Novel toxin 15), a predicted RNase toxin that possesses a conserved HxxD motif, as well as with domains such as DNA/RNA non-specific endonuclease, RhsA domain regions with extended RHS repeats, or DUF4112. Some members also contain an N-terminal PAAR-like (i.e., DUF4280) domain.	84
410957	cd20750	cyt_c_I	Uncharacterized subfamily of the cytochrome P460 family. This subfamily is composed mainly of hypothetical proteins, including Sphingopyxis alaskensis class I cytochrome C. Members of this subfamily belong to the cytochrome P460 family that is composed mostly of monoheme, ~17 kDa, c-cytochromes typified by the cytochromes P460 of Nitrosomonas europaea and Methylococcus capsulatus (Bath), and the cytochrome c'-beta of M. capsulatus. Cytochrome P460 family members can be characterized by a predominantly beta-sheet structure as opposed to the four elongate, tightly-packed alpha-helices of the widely distributed cytochromes c' of photoheterotrophic and denitrifying bacteria. They are involved in the oxidation/reduction or ligation of N-oxides for detoxification or energy generation. The protein-bound c-type heme cofactor, heme P460, named for its characteristic ferrous Soret peak maximum at 460 nm, has the distinction of being the only known heme in biology to withdraw electrons from an iron coordinated substrate.	146
410958	cd20751	cyt_P460_Ne-like	cytochrome P460 from Nitrosomonas europaea and similar proteins. Cytochrome (cyt) P460 is a small soluble periplasmic protein that binds the c-type heme cofactor, heme P460, named for its characteristic ferrous Soret peak maximum at 460 nm, which has the distinction of being the only known heme in biology to withdraw electrons from an iron coordinated substrate. The heme P460 in N. europaea cyt P460 contains a third proteinaceous cross-link, similar to that found in hydroxylamine oxidoreductase (HAO), but in this case, the cross-link is to a conserved lysine residue, K70. The biological function of cyt P460 is yet to be determined, but it binds hydroxylamine, hydrazine, hydrogen peroxide, and cyanide in the ferric form, and CO in the ferrous form; it also possesses a weak hydroxylamine oxidation/cyt c reduction activity. It belongs to a family, called the cytochrome P460 family, of small mono-heme c-type cytochromes that are predominantly of beta-sheet structure, as opposed to the four elongate, tightly-packed alpha-helices of the widely distributed cytochromes c' of photoheterotrophic and denitrifying bacteria.	153
410959	cd20752	cyt_c'_beta	Cytochrome c'-beta from Methylococcus capsulatus (Bath) and similar proteins. Cytochromes (cyt) c' are defined by a pentacoordinate heme Fe with a CXXCH c-heme-binding motif located close to the C-terminus. Most cyt c' have four alpha-helix bundle structures, and are referred to as cyt c'-alpha. M. capsulatus (Bath) cytochrome c'-beta, encoded by the cytS gene, is a homodimeric heme protein with a higher molecular weight of about 16 kDa per monomer, compared to cyt c'-alpha (~12 kDa), and it adopts a beta-sheet structure. It is involved in nitric oxide scavenging and protection against nitrosative stress. Cyt c'-beta belongs to a family, called the cytochrome P460 family, of small mono-heme c-type cytochromes that are predominantly of beta-sheet structure, as opposed to the four elongate, tightly-packed alpha-helices of the widely distributed cytochromes c' of photoheterotrophic and denitrifying bacteria.	136
410960	cd20753	cyt_P460_Mc-like	cytochrome P460 from Methylococcus capsulatus (Bath) and similar proteins. Cytochrome (cyt) P460 is a small soluble periplasmic protein that binds the c-type heme cofactor, heme P460, named for its characteristic ferrous Soret peak maximum at 460 nm, which has the distinction of being the only known heme in biology to withdraw electrons from an iron coordinated substrate. M. capsulatus (Bath) cytochrome P460, encoded by the cytL gene, catalyzes the oxidation of hydroxylamine (NH2OH) to form nitrous oxide (N2O) under anaerobic conditions. Similar to Nitrosomonas europaea cyt P460, it is defined by an unusual porphyrin (heme)-lysine cross link. This subfamily belongs to a family, called the cytochrome P460 family, of small mono-heme c-type cytochromes that are predominantly of beta-sheet structure, as opposed to the four elongate, tightly-packed alpha-helices of the widely distributed cytochromes c' of photoheterotrophic and denitrifying bacteria.	136
394914	cd20754	capping_2-OMTase_viral	viral Cap-0 specific (nucleoside-2'-O-)-methyltransferase. Cap-0 specific (nucleoside-2'-O-)-methyltransferase (2'OMTase) catalyzes the methylation of Cap-0 (m7GpppNp) at the 2'-hydroxyl of the ribose of the first nucleotide, using S-adenosyl-L-methionine (AdoMet) as the methyl donor. This reaction is the fourth and last step in mRNA capping, the creation of the stabilizing five-prime cap (5' cap) on mRNA. Some dsDNA and dsRNA viruses, like the bluetongue virus (BTV), a member of the Reoviridae family, and Vaccinia virus, a member of the Poxviridae family, as well as some ss(+)RNA viruses, like Flaviviridae and Nidovirales, also cap their mRNAs and encode their own 2'OMTase. In BTV, all four reactions are catalyzed by a single protein, VP4. In Vaccinia, the activity is located in the processing factor of the poly(A) polymerase, VP39.	179
394915	cd20756	capping_2-OMTase_Poxviridae	Cap-0 specific (nucleoside-2'-O-)-methyltransferase of poxviridae. Cap-0 specific (nucleoside-2'-O-)-methyltransferase (2'OMTase) catalyzes the methylation of Cap-0 (m7GpppNp) at the 2'-hydroxyl of the ribose of the first nucleotide, using S-adenosyl-L-methionine (AdoMet) as the methyl donor. This reaction is the fourth and last step in mRNA capping, the creation of the stabilizing five-prime cap (5' cap) on mRNA. Poxviridae, a family of dsDNA viruses, cap their mRNAs. The 2'OMTase activity is located in the processing factor of the poly(A) polymerase, VP39.	270
394916	cd20757	capping_2-OMTase_Rotavirus	Cap-0 specific (nucleoside-2'-O-)-methyltransferase of rotavirus. Cap-0 specific (nucleoside-2'-O-)-methyltransferase (2'OMTase) catalyzes the methylation of Cap-0 (m7GpppNp) at the 2'-hydroxyl of the ribose of the first nucleotide, using S-adenosyl-L-methionine (AdoMet) as the methyl donor. This reaction is the fourth and last step in mRNA capping, the creation of the stabilizing five-prime cap (5' cap) on mRNA. Rotavirus, a family of dsRNA viruses, cap their mRNAs. The 2'OMTase activity is located in the multifunctional capping enzyme, VP3.	197
394917	cd20758	capping_2-OMTase_Orbivirus	Cap-0 specific (nucleoside-2'-O-)-methyltransferase of orbivirus. Cap-0 specific (nucleoside-2'-O-)-methyltransferase (2'OMTase) catalyzes the methylation of Cap-0 (m7GpppNp) at the 2'-hydroxyl of the ribose of the first nucleotide, using S-adenosyl-L-methionine (AdoMet) as the methyl donor. This reaction is the fourth and last step in mRNA capping, the creation of the stabilizing five-prime cap (5' cap) on mRNA. Orbivirus, a family of dsRNA viruses, cap their mRNAs. The 2'OMTase activity is located in the multifunctional capping enzyme, VP4.	211
394918	cd20759	capping_2-OMTase_Phytoreovirus	Cap-0 specific (nucleoside-2'-O-)-methyltransferase of phytoreovirus. Cap-0 specific (nucleoside-2'-O-)-methyltransferase (2'OMTase) catalyzes the methylation of Cap-0 (m7GpppNp) at the 2'-hydroxyl of the ribose of the first nucleotide, using S-adenosyl-L-methionine (AdoMet) as the methyl donor. This reaction is the fourth and last step in mRNA capping, the creation of the stabilizing five-prime cap (5' cap) on mRNA. Phytoreovirus, a family of dsRNA viruses, cap their mRNAs. The 2'OMTase activity is located in the mRNA capping enzyme P5.	199
394919	cd20760	capping_2-OMTase_Mimiviridae	Cap-0 specific (nucleoside-2'-O-)-methyltransferase of mimiviridae and pithoviridae. Cap-0 specific (nucleoside-2'-O-)-methyltransferase (2'OMTase) catalyzes the methylation of Cap-0 (m7GpppNp) at the 2'-hydroxyl of the ribose of the first nucleotide, using S-adenosyl-L-methionine (AdoMet) as the methyl donor. This reaction is the fourth and last step in mRNA capping, the creation of the stabilizing five-prime cap (5' cap) on mRNA. Mimiviridae and pithoviridae are part of the nucleocytoplasmic large dsDNA virus clade (NCLDV). The 2'OMTase activity is located in the polyA polymerase regulatory subunit.	233
394920	cd20761	capping_2-OMTase_Flaviviridae	Cap-0 specific (nucleoside-2'-O-)-methyltransferase of flaviviridae. Cap-0 specific (nucleoside-2'-O-)-methyltransferase (2'OMTase) catalyzes the methylation of Cap-0 (m7GpppNp) at the 2'-hydroxyl of the ribose of the first nucleotide, using S-adenosyl-L-methionine (AdoMet) as the methyl donor. This reaction is the fourth and last step in mRNA capping, the creation of the stabilizing five-prime cap (5' cap) on mRNA. Flaviviridae, a family of ss(+)RNA viruses, cap their mRNAs. The 2'OMTase activity is located in the nonstructural protein 5 (NS5).	222
394921	cd20762	capping_2-OMTase_Nidovirales	Cap-0 specific (nucleoside-2'-O-)-methyltransferase of nidovirales. Cap-0 specific (nucleoside-2'-O-)-methyltransferase (2'OMTase) catalyzes the methylation of Cap-0 (m7GpppNp) at the 2'-hydroxyl of the ribose of the first nucleotide, using S-adenosyl-L-methionine (AdoMet) as the methyl donor. This reaction is the fourth and last step in mRNA capping, the creation of the stabilizing five-prime cap (5' cap) on mRNA. Nidovirales, a family of ss(+)RNA viruses, cap their mRNAs. For one member, Coronavirus, the 2'OMTase activity is located in the nonstructural protein 16 (NSP16). For others, the 2'OMTase activity may be located in replicase polyprotein 1ab.	176
411011	cd20786	tapirin_C	C-terminal domain of cellulose binding protein tapirin. This family contains the C-terminal domain of tapirin, a unique cellulose binding protein that is present only in the extremely thermophilic bacterial species Caldicellulosiruptor that grow on carbohydrates from lignocellulose at elevated temperatures. Tapirins appear to be specifically attached to cellulose, having similar binding affinities to cellulose as family 3 carbohydrate binding modules (CBM3). Structures of the C-terminal region indicate that aromatic and hydrophobic residues are responsible for cellulose binding, while a flexible peptide loop may protect and control access to this region. The basis for the genomic localization of the tapirins is unknown; however, these proteins are located next to type IV pili in the Caldicellulosiruptor genomes and therefore may be exposed on the cell membrane beside or as part of pili proteins. Caldicellulosiruptor hydrothermalis, which has less capability to deconstruct lignocellulose itself, may use tapirin as one of the mechanisms for its survival in extreme environments by anchoring itself to biomass that is hydrolyzed by enzymes from other species. Understanding mechanisms by which these microorganisms attach to and degrade lignocellulose may be important in finding effective approaches for conversion of plant biomass into fuels and chemicals.	343
412053	cd20788	TBC1D23_C-like	C-terminal domain of TBC1 domain family member 23, and similar proteins. This family contains the C-terminal domain of Tre2-Bub2-Cdc16 (TBC) family 23 (TBC1D23), which adopts a Pleckstrin homology (PH) domain fold. It selectively binds to phosphoinositides, in particular, PtdIns(4)P, through one surface while it binds FAM21 via the opposite surface. TBC1D23, which is highly conserved in many eukaryotes but missing in plants and fungi, also possesses an N-terminal domain which is a catalytically inactive TBC domain. TBC1D23 encodes a protein functioning in endosome-to-Golgi trafficking in cells; it is a specificity determinant that links the vesicle to the target membrane. Homozygous mutations of TBC1D23 have been found in patients diagnosed with pontocerebellar hypoplasia (PCH), a group of neurological disorders that affect the brain development, particularly, the pons and cerebellum. Mutation of key residues of TBC1D23 (or FAM21) selectively disrupts the endosomal vesicular trafficking toward the Trans-Golgi Network. This C-terminal domain is missing in some PCH patients.	115
411012	cd20789	Cas13d	Class 2 type VI-D CRISPR-associated RNA-guided ribonuclease Cas13d. CRISPR-Cas (clustered regularly interspaced short palindromic repeats and CRISPR-associated proteins) adaptive immune systems defend microbes against foreign nucleic acids via RNA-guided endonucleases. These systems are divided into two classes: class 1 systems utilize multiple Cas proteins and CRISPR RNA (crRNA) to form an effector complex while class 2 systems employ a large, single effector with crRNA to mediate interference. Class 2 type VI CRISPR-Cas13 systems use a single enzyme to target RNA using a programmable crRNA guide and are divided into four subtypes based on the identity of the Cas13 protein (Cas13a-d). The Cas13 proteins are capable of both pre-crRNA processing and target RNA cleavage, which protect the host from phage attacks. Once bound to a target RNA, their non-specific RNase activity is activated. Cas13d enzymes are 20-30% smaller than other Cas13 subtypes, which enables flexible packaging into size-constrained therapeutic viral vectors such as adeno-associated virus.	875
411013	cd20790	Cas13a	Class 2 type VI-A CRISPR-associated RNA-guided ribonuclease Cas13a. CRISPR-Cas (clustered regularly interspaced short palindromic repeats and CRISPR-associated proteins) adaptive immune systems defend microbes against foreign nucleic acids via RNA-guided endonucleases. These systems are divided into two classes: class 1 systems utilize multiple Cas proteins and CRISPR RNA (crRNA) to form an effector complex while class 2 systems employ a large, single effector with crRNA to mediate interference. Class 2 type VI CRISPR-Cas13 systems use a single enzyme to target RNA using a programmable crRNA guide and are divided into four subtypes based on the identity of the Cas13 protein (Cas13a-d). The Cas13 proteins are capable of both pre-crRNA processing and target RNA cleavage, which protect the host from phage attacks. Once bound to a target RNA, their non-specific RNase activity is activated. Within the Cas13a (also called C2c2) subfamily, the active site is functionally diverse in terms of both nucleotide cleavage preference and turnover efficiency. There are two distinct types of Cas13a enzymes, based on their cleavage preference: adenosine (A) cleaving or uridine (U) cleaving.	1188
410342	cd20792	C1_cPKC_nPKC_rpt1	first protein kinase C conserved region 1 (C1 domain) found in classical (or conventional) protein kinase C (cPKC), novel protein kinase C (nPKC), and similar proteins. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domains. PKCs undergo three phosphorylations in order to take mature forms. In addition, cPKCs depend on calcium, DAG (1,2-diacylglycerol), and in most cases, phosphatidylserine (PS) for activation. nPKCs are calcium-independent, but require DAG and PS for activity, while atypical PKCs (aPKCs) only require PS. PKCs phosphorylate and modify the activities of a wide variety of cellular proteins including receptors, enzymes, cytoskeletal proteins, transcription factors, and other kinases. They play a central role in signal transduction pathways that regulate cell migration and polarity, proliferation, differentiation, and apoptosis. This family includes classical PKCs (cPKCs) and novel PKCs (nPKCs). There are four cPKC isoforms (named alpha, betaI, betaII, and gamma) and four nPKC isoforms (delta, epsilon, eta, and theta). Members of this family contain two copies of the C1 domain. This model corresponds to the first one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	53
410343	cd20793	C1_cPKC_nPKC_rpt2	second protein kinase C conserved region 1 (C1 domain) found in classical (or conventional) protein kinase C (cPKC), novel protein kinase C (nPKC), and similar proteins. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domain. PKCs undergo three phosphorylations in order to take mature forms. In addition, cPKCs depend on calcium, DAG (1,2-diacylglycerol), and in most cases, phosphatidylserine (PS) for activation. nPKCs are calcium-independent, but require DAG and PS for activity, while atypical PKCs (aPKCs) only require PS. PKCs phosphorylate and modify the activities of a wide variety of cellular proteins including receptors, enzymes, cytoskeletal proteins, transcription factors, and other kinases. They play a central role in signal transduction pathways that regulate cell migration and polarity, proliferation, differentiation, and apoptosis. This family includes classical PKCs (cPKCs) and novel PKCs (nPKCs). There are four cPKC isoforms (named alpha, betaI, betaII, and gamma) and four nPKC isoforms (delta, epsilon, eta, and theta). Members of this family contain two copies of C1 domain. This model corresponds to the second one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	50
410344	cd20794	C1_aPKC	protein kinase C conserved region 1 (C1 domain) found in the atypical protein kinase C (aPKC) family. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domain. aPKCs only require phosphatidylserine (PS) for activation. They contain a C2-like region, instead of a calcium-binding (C2) region found in classical PKCs, in their regulatory domain. There are two aPKC isoforms, zeta and iota. aPKCs are involved in many cellular functions including proliferation, migration, apoptosis, polarity maintenance and cytoskeletal regulation. They also play a critical role in the regulation of glucose metabolism and in the pathogenesis of type 2 diabetes. PKC-zeta plays a critical role in activating the glucose transport response. It is activated by glucose, insulin, and exercise through diverse pathways. PKC-zeta also plays a central role in maintaining cell polarity in yeast and mammalian cells. In addition, it affects actin remodeling in muscle cells. PKC-iota is directly implicated in carcinogenesis. It is critical to oncogenic signaling mediated by Ras and Bcr-Abl. The PKC-iota gene is the target of tumor-specific gene amplification in many human cancers, and has been identified as a human oncogene. In addition to its role in transformed growth, PKC-iota also promotes invasion, chemoresistance, and tumor cell survival. Expression profiling of PKC-iota is a prognostic marker of poor clinical outcome in several human cancers. PKC-iota also plays a role in establishing cell polarity, and has critical embryonic functions. Members of this family contain one C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	55
410345	cd20795	C1_PKD_rpt1	first protein kinase C conserved region 1 (C1 domain) found in the protein kinase D (PKD) family. PKDs are important regulators of many intracellular signaling pathways such as ERK and JNK, and cellular processes including the organization of the trans-Golgi network, membrane trafficking, cell proliferation, migration, and apoptosis. They are activated in a PKC-dependent manner by many agents including diacylglycerol (DAG), PDGF, neuropeptides, oxidative stress, and tumor-promoting phorbol esters, among others. Mammals harbor three types of PKDs: PKD1 (or PKCmu), PKD2, and PKD3 (or PKCnu). PKDs contain N-terminal tandem cysteine-rich zinc binding C1 (PKC conserved region 1), central PH (Pleckstrin Homology), and C-terminal catalytic kinase domains. This model corresponds to the first C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	56
410346	cd20796	C1_PKD_rpt2	second protein kinase C conserved region 1 (C1 domain) found in the family of protein kinase D (PKD). PKDs are important regulators of many intracellular signaling pathways such as ERK and JNK, and cellular processes including the organization of the trans-Golgi network, membrane trafficking, cell proliferation, migration, and apoptosis. They are activated in a PKC-dependent manner by many agents including diacylglycerol (DAG), PDGF, neuropeptides, oxidative stress, and tumor-promoting phorbol esters, among others. Mammals harbor three types of PKDs: PKD1 (or PKCmu), PKD2, and PKD3 (or PKCnu). PKDs contain N-terminal tandem cysteine-rich zinc binding C1 (PKC conserved region 1), central PH (Pleckstrin Homology), and C-terminal catalytic kinase domains. This model corresponds to the second C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	54
410347	cd20797	C1_CeDKF1-like_rpt1	first protein kinase C conserved region 1 (C1 domain) found in Caenorhabditis elegans serine/threonine-protein kinase DKF-1 and similar proteins. DKF-1 converts transient diacylglycerol (DAG) signals into prolonged physiological effects, independently of PKC. It plays a role in the regulation of growth and neuromuscular control of movement. It is involved in immune response to Staphylococcus aureus bacterium by activating transcription factor hlh-30 downstream of phospholipase plc-1. Members of this group contain two copies of the C1 domain. This model corresponds to the first one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	56
410348	cd20798	C1_CeDKF1-like_rpt2	second protein kinase C conserved region 1 (C1 domain) found in Caenorhabditis elegans serine/threonine-protein kinase DKF-1 and similar proteins. DKF-1 converts transient diacylglycerol (DAG) signals into prolonged physiological effects, independently of PKC. It plays a role in the regulation of growth and neuromuscular control of movement. It is involved in immune response to Staphylococcus aureus bacterium by activating transcription factor hlh-30 downstream of phospholipase plc-1. Members of this group contain two copies of the C1 domain. This model corresponds to the second one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	54
410349	cd20799	C1_DGK_typeI_rpt1	first protein kinase C conserved region 1 (C1 domain) found in type I diacylglycerol kinases. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. Type I DAG kinases (DGKs) contain EF-hand structures that bind Ca(2+) and recoverin homology domains, in addition to C1 and catalytic domains that are present in all DGKs. Type I DGKs, regulated by calcium binding, include three DGK isozymes (alpha, beta and gamma). DAG kinase alpha, also called 80 kDa DAG kinase, or diglyceride kinase alpha (DGK-alpha), is active upon cell stimulation, initiating the resynthesis of phosphatidylinositols and attenuating protein kinase C activity. DAG kinase beta, also called 90 kDa DAG kinase, or diglyceride kinase beta (DGK-beta), exhibits high phosphorylation activity for long-chain diacylglycerols. DAG kinase gamma, also called diglyceride kinase gamma (DGK-gamma), reverses the normal flow of glycerolipid biosynthesis by phosphorylating diacylglycerol back to phosphatidic acid. Members of this family contain two copies of the C1 domain. This model corresponds to the first one. DGK-alpha contains atypical C1 domains, while DGK-beta and DGK-gamma contain typical C1 domains that bind DAG and phorbol esters. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	62
410350	cd20800	C1_DGK_typeII_rpt1	first protein kinase C conserved region 1 (C1 domain) found in type II diacylglycerol kinases. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. Type II DAG kinases (DGKs) contain pleckstrin homology (PH) and sterile alpha motifs (SAM) domains, in addition to C1 and catalytic domains that are present in all DGKs. The SAM domain mediates oligomerization of type II DGKs. Three DGK isozymes (delta, eta and kappa) are classified as type II. DAG kinase delta, also called 130 kDa DAG kinase, or diglyceride kinase delta (DGK-delta), is a residential lipid kinase in the endoplasmic reticulum. It promotes lipogenesis and is involved in triglyceride biosynthesis. DAG kinase eta, also called diglyceride kinase eta (DGK-eta), plays a key role in promoting cell growth. The DAG kinase eta gene, DGKH, is a replicated risk gene of bipolar disorder (BPD). DAG kinase kappa is also called diglyceride kinase kappa (DGK-kappa) or 142 kDa DAG kinase. Members of this family contain two copies of the C1 domain. This model corresponds to the first one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	60
410351	cd20801	C1_DGKepsilon_typeIII_rpt1	first protein kinase C conserved region 1 (C1 domain) found in type III diacylglycerol kinase, DAG kinase epsilon, and similar proteins. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. DAG kinase epsilon, also called diglyceride kinase epsilon (DGK-epsilon), is the only isoform classified as type III; it possesses a hydrophobic domain in addition to C1 and catalytic domains that are present in all DGKs, and shows selectivity for acyl chains. It is highly selective for arachidonate-containing species of DAG. It may terminate signals transmitted through arachidonoyl-DAG or may contribute to the synthesis of phospholipids with defined fatty acid composition. DAG kinase epsilon contains two copies of the C1 domain. This model corresponds to the first one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	54
410352	cd20802	C1_DGK_typeIV_rpt1	first protein kinase C conserved region 1 (C1 domain) found in type IV diacylglycerol kinases. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. Type IV DAG kinases (DGKs) contain myristoylated alanine-rich protein kinase C substrate (MARCKS), PDZ-binding, and ankyrin domains, in addition to C1 and catalytic domains that are present in all DGKs. The MARCKS domain regulates the nuclear localizations of type IV DGKs while the PDZ-binding and ankyrin domains regulate interactions with several proteins. Two DGK isozymes (zeta and iota) are classified as type IV. DAG kinase zeta, also called diglyceride kinase zeta (DGK-zeta), displays a strong preference for 1,2-diacylglycerols over 1,3-diacylglycerols, but lacks substrate specificity among molecular species of long chain diacylglycerols. DAG kinase iota, also called diglyceride kinase iota (DGK-iota), or DGKI, is a homolog of Drosophila DGK2, RdgA. It may have important cellular functions in the retina and brain. Members of this family contain two copies of the C1 domain. This model corresponds to the first one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	62
410353	cd20803	C1_DGKtheta_typeV_rpt1	first protein kinase C conserved region 1 (C1 domain) found in type V diacylglycerol kinase, DAG kinase theta, and similar proteins. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. DAG kinase theta, also called diglyceride kinase theta (DGK-theta), is the only isoform classified as type V; it contains a pleckstrin homology (PH)-like domain and an additional C1 domain, compared to other DGKs. It may regulate the activity of protein kinase C by controlling the balance between the two signaling lipids, diacylglycerol and phosphatidic acid. DAG kinase theta contains three copies of the C1 domain. This model corresponds to the first one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	56
410354	cd20804	C1_DGKtheta_typeV_rpt2	second protein kinase C conserved region 1 (C1 domain) found in type V diacylglycerol kinase, DAG kinase theta, and similar proteins. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. DAG kinase theta, also called diglyceride kinase theta (DGK-theta), is the only isoform classified as type V; it contains a pleckstrin homology (PH)-like domain and an additional C1 domain, compared to other DGKs. It may regulate the activity of protein kinase C by controlling the balance between the two signaling lipids, diacylglycerol and phosphatidic acid. DAG kinase theta contains three copies of the C1 domain. This model corresponds to the second one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	57
410355	cd20805	C1_DGK_rpt2	second protein kinase C conserved region 1 (C1 domain) found in the diacylglycerol kinase family. The diacylglycerol kinase (DGK, EC 2.7.1.107) family of enzymes plays critical roles in lipid signaling pathways by converting diacylglycerol to phosphatidic acid, thereby downregulating signaling by the former and upregulating signaling by the latter second messenger. Ten DGK family isozymes have been identified to date, which possess different interaction motifs imparting distinct temporal and spatial control of DGK activity to each isozyme. They have been classified into five types (I-V), according to domain architecture and some common features. All DGK isozymes, except for DGKtheta, contain two copies of the C1 domain. This model corresponds to the second one. DGKtheta harbors three C1 domains. Its third C1 domain is included here. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	55
410356	cd20806	C1_CHN	protein kinase C conserved region 1 (C1 domain) found in the chimaerin family. Chimaerins are a family of phorbolester- and diacylglycerol-responsive GTPase activating proteins (GAPs) specific for the Rho-like GTPase Rac. Alpha1-chimerin (formerly known as N-chimerin) and alpha2-chimerin are alternatively spliced products of a single gene, as are beta1- and beta2-chimerin. Alpha1- and beta1-chimerin have a relatively short N-terminal region that does not encode any recognizable domains, whereas alpha2- and beta2-chimerin both include a functional SH2 domain that can bind to phosphotyrosine motifs within receptors. All the isoforms contain a GAP domain with specificity in vitro for Rac1 and a diacylglycerol (DAG)-binding C1 domain which allows them to translocate to membranes in response to DAG signaling and anchors them in close proximity to activated Rac. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	53
410357	cd20807	C1_Munc13	protein kinase C conserved region 1 (C1 domain) found in the Munc13 family. The Munc13 gene family encodes a family of neuron-specific, synaptic molecules that bind to syntaxin, an essential mediator of neurotransmitter release. Munc13-1 is a component of presynaptic active zones in which it acts as an essential synaptic vesicle priming protein. Munc13-2 is essential for normal release probability at hippocampal mossy fiber synapses. Munc13-3 is almost exclusively expressed in the cerebellum. It acts as a tumor suppressor and plays a critical role in the formation of release sites with calcium channel nanodomains. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	53
410358	cd20808	C1_RASGRP	protein kinase C conserved region 1 (C1 domain) found in the RAS guanyl-releasing protein (RASGRP) family. The RASGRP family includes RASGRP1-4. They function as cation-, usually calcium-, and diacylglycerol (DAG)-regulated nucleotide exchange factors that activate Ras through the exchange of bound GDP for GTP. RASGRP1, also called calcium and DAG-regulated guanine nucleotide exchange factor II (CalDAG-GEFII) or Ras guanyl-releasing protein, activates the Erk/MAP kinase cascade and regulates T-cell/B-cell development, homeostasis and differentiation by coupling T-lymphocyte/B-lymphocyte antigen receptors to Ras. RASGRP1 also regulates NK cell cytotoxicity and ITAM-dependent cytokine production by activation of Ras-mediated ERK and JNK pathways. RASGRP2, also called calcium and DAG-regulated guanine nucleotide exchange factor I (CalDAG-GEFI), Cdc25-like protein (CDC25L), or F25B3.3 kinase-like protein, specifically activates Rap and may also activate other GTPases such as RRAS, RRAS2, NRAS, KRAS but not HRAS. RASGRP2 is involved in aggregation of platelets and adhesion of T-lymphocytes and neutrophils probably through inside-out integrin activation, as well as in the muscarinic acetylcholine receptor M1/CHRM1 signaling pathway. RASGRP3, also called calcium and DAG-regulated guanine nucleotide exchange factor III (CalDAG-GEFIII), or guanine nucleotide exchange factor for Rap1, is a guanine nucleotide-exchange factor activating H-Ras, R-Ras and Ras-associated protein-1/2. It functions as an important mediator of signaling downstream from receptor coupled phosphoinositide turnover in B and T cells. RASGRP4 may function in mast cell differentiation. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	52
410359	cd20809	C1_MRCK	protein kinase C conserved region 1 (C1 domain) found in the Myotonic dystrophy kinase-related Cdc42-binding kinase (MRCK) family. MRCK is thought to be a coincidence detector of signaling by the small GTPase Cdc42 and phosphoinositides. MRCK/Cdc42 signaling mediates myosin-dependent cell motility. MRCK has been shown to promote cytoskeletal reorganization, which affects many biological processes. Three isoforms of MRCK are known, named alpha, beta and gamma. MRCKgamma is expressed in heart and skeletal muscles, unlike MRCKalpha and MRCKbeta, which are expressed ubiquitously. MRCK consists of a serine/threonine kinase domain, a cysteine-rich (C1) region, a PH domain and a p21 binding motif. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	53
410360	cd20810	C1_VAV	protein kinase C conserved region 1 (C1 domain) found in VAV proteins. VAV proteins function both as cytoplasmic guanine nucleotide exchange factors (GEFs) for Rho GTPases and as scaffold proteins, and they play important roles in cell signaling by coupling cell surface receptors to various effector functions. They play key roles in processes that require cytoskeletal reorganization including immune synapse formation, phagocytosis, cell spreading, and platelet aggregation, among others. Vertebrates have three VAV proteins (VAV1, VAV2, and VAV3). VAV proteins contain several domains that enable their function: N-terminal calponin homology (CH), acidic, RhoGEF (also called Dbl-homologous or DH), Pleckstrin Homology (PH), C1 (zinc finger), SH2, and two SH3 domains. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	52
410361	cd20811	C1_Raf	protein kinase C conserved region 1 (C1 domain) found in the Raf (Rapidly Accelerated Fibrosarcoma) kinase family. Raf kinases are serine/threonine kinases (STKs) that catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. They act as mitogen-activated protein kinase kinase kinases (MAP3Ks, MKKKs, MAPKKKs), which phosphorylate and activate MAPK kinases (MAPKKs or MKKs or MAP2Ks), which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. They function in the linear Ras-Raf-MEK-ERK pathway that regulates many cellular processes including cell cycle regulation, proliferation, differentiation, survival, and apoptosis. Aberrant expression or activation of components in this pathway is associated with tumor initiation, progression, and metastasis. Raf proteins contain a Ras binding domain, a zinc finger cysteine-rich domain (C1), and a catalytic kinase domain. Vertebrates have three Raf isoforms (A-, B-, and C-Raf) with different expression profiles, modes of regulation, and abilities to function in the ERK cascade, depending on cellular context and stimuli. They have essential and non-overlapping roles during embryo- and organogenesis. Knockout of each isoform results in a lethal phenotype or abnormality in most mouse strains. This model describes the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	49
410362	cd20812	C1_KSR	protein kinase C conserved region 1 (C1 domain) found in the kinase suppressor of Ras (KSR) family. KSR is a scaffold protein that functions downstream of Ras and upstream of Raf in the Extracellular signal-Regulated Kinase (ERK) pathway that regulates many cellular processes including cell cycle regulation, proliferation, differentiation, survival, and apoptosis. KSR proteins regulate the assembly and activation of the Raf/MEK/ERK module upon Ras activation at the membrane by direct association of its components. They are widely regarded as pseudokinases, but there is some debate over this designation as a few groups have reported detecting kinase catalytic activity for KSRs, specifically KSR1. Vertebrates contain two KSR proteins, KSR1 and KSR2. KSR proteins contain a SAM-like domain, a zinc finger cysteine-rich domain (C1), and a pseudokinase domain. This model describes the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	48
410363	cd20813	C1_ROCK	protein kinase C conserved region 1 (C1 domain) found in the Rho-associated coiled-coil containing protein kinase (ROCK) family. ROCK is a serine/threonine protein kinase, catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. It is also referred to as Rho-associated kinase or simply as Rho kinase. It contains an N-terminal extension, a catalytic kinase domain, and a C-terminal extension, which contains a coiled-coil region encompassing a Rho-binding domain (RBD), a pleckstrin homology (PH) domain and a C1 domain. ROCK is auto-inhibited by the RBD and PH domain interacting with the catalytic domain. It is activated via interaction with Rho GTPases and is involved in many cellular functions including contraction, adhesion, migration, motility, proliferation, and apoptosis. The ROCK subfamily consists of two isoforms, ROCK1 and ROCK2, which may be functionally redundant in some systems, but exhibit different tissue distributions. Both isoforms are ubiquitously expressed in most tissues, but ROCK2 is more prominent in brain and skeletal muscle while ROCK1 is more pronounced in the liver, testes, and kidney. Knockout mice for the two isoforms show different phenotypes, suggesting that the two isoforms do not compensate for each other during embryonic development. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	65
410364	cd20814	CRIK	protein kinase C conserved region 1 (C1 domain) found in citron Rho-interacting kinase (CRIK) and similar proteins. CRIK, also called serine/threonine-protein kinase 21, is an effector of the small GTPase Rho. It plays an important function during cytokinesis and affects its contractile process. CRIK-deficient mice show severe ataxia and epilepsy as a result of abnormal cytokinesis and massive apoptosis in neuronal precursors. A Down syndrome critical region protein TTC3 interacts with CRIK and inhibits CRIK-dependent neuronal differentiation and neurite extension. CRIK contains a catalytic domain, a central coiled-coil domain, and a C-terminal region containing a Rho-binding domain (RBD), a zinc finger (C1 domain), and a pleckstrin homology (PH) domain, in addition to other motifs. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	56
410365	cd20815	C1_p190RhoGEF-like	protein kinase C conserved region 1 (C1 domain) found in the 190 kDa guanine nucleotide exchange factor (p190RhoGEF)-like family. The p190RhoGEF-like protein family includes p190RhoGEF, Rho guanine nucleotide exchange factor 2 (ARHGEF2), A-kinase anchor protein 13 (AKAP-13) and similar proteins. p190RhoGEF is a brain-enriched, RhoA-specific guanine nucleotide exchange factor that regulates signaling pathways downstream of integrins and growth factor receptors. It is involved in axonal branching, synapse formation and dendritic morphogenesis, as well as in focal adhesion formation, cell motility and B-lymphocyte activation. ARHGEF2 acts as a guanine nucleotide exchange factor (GEF) that activates Rho-GTPases by promoting the exchange of GDP for GTP. It is thought to play a role in actin cytoskeleton reorganization in different tissues since its activation induces formation of actin stress fibers. AKAP-13 is a scaffold protein that plays an important role in assembling signaling complexes downstream of several types of G protein-coupled receptors. It activates RhoA in response to signaling via G protein-coupled receptors via its function as Rho guanine nucleotide exchange factor. It may also activate other Rho family members. AKAP-13 plays a role in cell growth, cell development and actin fiber formation. Members of this family share a common domain architecture containing C1, RhoGEF or Dbl-homologous (DH), and Pleckstrin Homology (PH) domains. Some members may contain additional domains such as the DUF5401 domain. This model describes the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	54
410366	cd20816	C1_GMIP-like	protein kinase C conserved region 1 (C1 domain) found in the GEM-interacting protein (GMIP)-like family. The GMIP-like family includes GMIP, Rho GTPase-activating protein 29 (ARHGAP29) and Rho GTPase-activating protein 45 (ARHGAP45). GMIP is a RhoA-specific GTPase-activating protein that acts as a key factor in saltatory neuronal migration. It associates with the Rab27a effector JFC1 and modulates vesicular transport and exocytosis. ARHGAP29, also called PTPL1-associated RhoGAP protein 1 (PARG1) or Rho-type GTPase-activating protein 29, is a GTPase activator for the Rho-type GTPases, converting them to an inactive GDP-bound state. It has strong activity toward RHOA, and weaker activity toward RAC1 and CDC42. ARHGAP29 may act as a specific effector of RAP2A to regulate Rho. In concert with RASIP1, ARHGAP29 suppresses RhoA signaling and dampens ROCK and MYH9 activities in endothelial cells and plays an essential role in blood vessel tubulogenesis. ARHGAP45, also called minor histocompatibility antigen HA-1 (mHag HA-1), is a Rac-GAP (GTPase-Activating Protein) in endothelial cells. It acts as a novel regulator of endothelial integrity. ARHGAP45 contains a GTPase activator for the Rho-type GTPases (RhoGAP) domain that would be able to negatively regulate the actin cytoskeleton as well as cell spreading. However, it also contains an N-terminal BAR domain that can exert an autoinhibitory effect on this RhoGAP activity. Members of this family contain a zinc-binding C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	51
410367	cd20817	C1_Stac	protein kinase C conserved region 1 (C1 domain) found in the SH3 and cysteine-rich domain-containing protein (Stac) family. Stac proteins are putative adaptor proteins that are important for neuronal function. There are three mammalian members (Stac1, Stac2 and Stac3) of this family. Stac1 and Stac3 contain two SH3 domains while Stac2 contains a single SH3 domain at the C-terminus. Stac1 and Stac2 have been found to be expressed differently in mature dorsal root ganglia (DRG) neurons. Stac1 is mainly expressed in peptidergic neurons while Stac2 is found in a subset of nonpeptidergic and all trkB+ neurons. Stac proteins contain a cysteine-rich C1 domain and one or two SH3 domains at the C-terminus. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	51
410368	cd20818	C1_Myosin-IX	protein kinase C conserved region 1 (C1 domain) found in the unconventional myosin-IX family. Myosin IX (Myo9) is a class of unique motor proteins with a common structure of an N-terminal extension preceding a myosin head homologous to the Ras-association (RA) domain, a head (motor) domain, a neck with IQ motifs that bind light chains, and a C-terminal tail containing cysteine-rich zinc binding (C1) and Rho-GTPase activating protein (RhoGAP) domains. There are two genes for myosin IX in humans, IXa and IXb, that differ in their expression and localization. IXa is expressed abundantly in brain and testis, and IXb is expressed abundantly in tissues of the immune system. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	56
410369	cd20819	C1_DEF8	protein kinase C conserved region 1 (C1 domain) found in differentially expressed in FDCP 8 (DEF-8) and similar proteins. DEF-8 positively regulates lysosome peripheral distribution and ruffled border formation in osteoclasts. It is involved in bone resorption. DEF-8 contains a protein kinase C conserved region 1 (C1) domain followed by a putative zinc-RING and/or ribbon. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	62
410370	cd20820	C1_RASSF1-like	protein kinase C conserved region 1 (C1 domain) found in the Ras association domain-containing protein 1 (RASSF1)-like family. The RASSF1-like family includes RASSF1 and RASSF5. RASSF1 and RASSF5 are members of a family of RAS effectors, of which there are currently 8 members (RASSF1-8), all containing a Ras-association (RA) domain of the Ral-GDS/AF6 type. RASSF1 has eight transcripts (A-H) arising from alternative splicing and differential promoter usage. RASSF1A and 1C are the most extensively studied RASSF1 isoforms; both are localized to microtubules and involved in the regulation of growth and migration. RASSF1 is a potential tumor suppressor that is required for death receptor-dependent apoptosis. RASSF5, also called new ras effector 1 (NORE1), or regulator for cell adhesion and polarization enriched in lymphoid tissues (RAPL), is expressed as three transcripts (A-C) via differential promoter usage and alternative splicing. RASSF5A is a pro-apoptotic Ras effector and functions as a Ras regulated tumor suppressor. RASSF5C is regulated by Ras related protein and modulates cellular adhesion. RASSF5 is a potential tumor suppressor that seems to be involved in lymphocyte adhesion by linking RAP1A activation upon T-cell receptor or chemokine stimulation to integrin activation. RASSF1 and RASSF5 contain a C1 domain, which is described in this model. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	52
410371	cd20821	C1_MgcRacGAP	protein kinase C conserved region 1 (C1 domain) found in male germ cell RacGap (MgcRacGAP) and similar proteins. MgcRacGAP, also called Rac GTPase-activating protein 1 (RACGAP1) or protein CYK4, plays an important dual role in cytokinesis: i) it is part of the centralspindlin complex, together with the mitotic kinesin MKLP1, which is critical for the structure of the central spindle by promoting microtubule bundling; and ii) after phosphorylation by aurora B, MgcRacGAP becomes an effective regulator of RhoA and plays an important role in the assembly of the contractile ring and the initiation of cytokinesis. MgcRacGAP-like proteins contain an N-terminal C1 domain, and a C-terminal RhoGAP domain. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	55
410372	cd20822	C1_ScPKC1-like_rpt1	first protein kinase C conserved region 1 (C1 domain) found in Saccharomyces cerevisiae protein kinase C-like 1 (ScPKC1) and similar proteins. ScPKC1 is required for cell growth and for the G2 to M transition of the cell division cycle. It mediates a protein kinase cascade, activating BCK1 which itself activates MKK1/MKK2. The family also includes Schizosaccharomyces pombe PKC1 and PKC2, which are involved in the control of cell shape and act as targets of the inhibitor staurosporine. Members of this family contain two copies of the C1 domain. This model corresponds to the first one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	52
410373	cd20823	C1_ScPKC1-like_rpt2	second protein kinase C conserved region 1 (C1 domain) found in Saccharomyces cerevisiae protein kinase C-like 1 (ScPKC1) and similar proteins. ScPKC1 is required for cell growth and for the G2 to M transition of the cell division cycle. It mediates a protein kinase cascade, activating BCK1 which itself activates MKK1/MKK2. The family also includes Schizosaccharomyces pombe PKC1 and PKC2, which are involved in the control of cell shape and act as targets of the inhibitor staurosporine. Members of this family contain two copies of the C1 domain. This model corresponds to the second one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	59
410374	cd20824	C1_SpBZZ1-like	protein kinase C conserved region 1 (C1 domain) found in Schizosaccharomyces pombe protein BZZ1 and similar proteins. BZZ1 is a syndapin-like F-BAR protein that plays a role in endocytosis and trafficking to the vacuole. It functions with type I myosins to restore polarity of the actin cytoskeleton after NaCl stress. BZZ1 contains an N-terminal F-BAR (FES-CIP4 Homology and Bin/Amphiphysin/Rvs), a central coiled-coil, and two C-terminal SH3 domains. Schizosaccharomyces pombe BZZ1 also harbors a C1 domain, but Saccharomyces cerevisiae BZZ1 does not. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	53
410375	cd20825	C1_PDZD8	protein kinase C conserved region 1 (C1 domain) found in PDZ domain-containing protein 8 (PDZD8) and similar proteins. PDZD8, also called Sarcoma antigen NY-SAR-84/NY-SAR-104, is a molecular tethering protein that connects endoplasmic reticulum (ER) and mitochondrial membranes. PDZD8-dependent ER-mitochondria membrane tethering is essential for ER-mitochondria Ca2+ transfer. In neurons, it is involved in the regulation of dendritic Ca2+ dynamics by regulating mitochondrial Ca2+ uptake. PDZD8 also plays an indirect role in the regulation of cell morphology and cytoskeletal organization. It contains a PDZ domain and a C1 domain. This model describes the C1 domain, a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	55
410376	cd20826	C1_TNS2-like	protein kinase C conserved region 1 (C1 domain) found in tensin-2-like (TNS2-like) proteins. The TNS2-like group includes TNS2, and variants of TNS1 and TNS3. Tensin-2 (TNS2), also called C1 domain-containing phosphatase and tensin (C1-TEN), or tensin-like C1 domain-containing phosphatase (TENC1), is an essential component for the maintenance of glomerular basement membrane (GBM) structures. It regulates cell motility and proliferation. It may have phosphatase activity. TNS2 reduces AKT1 phosphorylation, lowers AKT1 kinase activity and interferes with AKT1 signaling. Tensin-1 (TNS1) plays a role in fibrillar adhesion formation. It may be involved in cell migration, cartilage development and in linking signal transduction pathways to the cytoskeleton. Tensin-3 (TNS3), also called tensin-like SH2 domain-containing protein 1 (TENS1), or tumor endothelial marker 6 (TEM6), may play a role in actin remodeling. It is involved in the dissociation of the integrin-tensin-actin complex. Typical TNS1 and TNS3 do not contain C1 domains, but some isoforms/variants do. Members of this family contain an N-terminal region with a zinc finger (C1 domain), a protein tyrosine phosphatase (PTP)-like domain and a protein kinase C conserved region 2 (C2) domain, and a C-terminal region with SH2 and pTyr binding (PTB) domains. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	52
410377	cd20827	C1_Sbf-like	protein kinase C conserved region 1 (C1 domain) found in the myotubularin-related protein Sbf and similar proteins. This group includes Drosophila melanogaster SET domain binding factor (Sbf), the single homolog of human MTMR5/MTMR13, and similar proteins, that show high sequence similarity to vertebrate myotubularin-related proteins (MTMRs) which may function as guanine nucleotide exchange factors (GEFs). Sbf is a pseudophosphatase that coordinates both phosphatidylinositol 3-phosphate (PI(3)P) turnover and Rab21 GTPase activation in an endosomal pathway that controls macrophage remodeling. It also functions as a GEF that promotes Rab21 GTPase activation associated with PI(3)P endosomes. Vertebrate MTMR5 and MTMR13 contain an N-terminal DENN domain, a PH-GRAM domain, an inactive PTP domain, a SET interaction domain, a coiled-coil domain, and a C-terminal PH domain. Members of this family contain these domains and have an additional C1 domain. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	53
410378	cd20828	C1_MTMR-like	protein kinase C conserved region 1 (C1 domain) found in uncharacterized proteins similar to myotubularin-related proteins. The family includes a group of uncharacterized proteins that show high sequence similarity to vertebrate myotubularin-related proteins (MTMRs), such as MTMR5 and MTMR13. MTMRs may function as guanine nucleotide exchange factors (GEFs). Vertebrate MTMR5 and MTMR13 contain an N-terminal DENN domain, a PH-GRAM domain, an inactive PTP domain, a SET interaction domain, a coiled-coil domain, and a C-terminal PH domain. Members of this family contain these domains and have an additional C1 domain. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	57
410379	cd20829	C1_PIK3R-like_rpt1	first protein kinase C conserved region 1 (C1 domain) found in uncharacterized phosphatidylinositol 3-kinase regulatory subunit-like proteins. The family includes a group of uncharacterized proteins that show high sequence similarity to vertebrate phosphatidylinositol 3-kinase regulatory subunits (PIK3Rs), which bind to activated (phosphorylated) protein-tyrosine kinases through their SH2 domains and regulate their kinase activity. Unlike typical PIK3Rs, members of this family have two C1 domains. This model corresponds to the first one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	53
410380	cd20830	C1_PIK3R-like_rpt2	second protein kinase C conserved region 1 (C1 domain) found in uncharacterized phosphatidylinositol 3-kinase regulatory subunit-like proteins. The family includes a group of uncharacterized proteins that show high sequence similarity to vertebrate phosphatidylinositol 3-kinase regulatory subunits (PIK3Rs), which bind to activated (phosphorylated) protein-tyrosine kinases through their SH2 domains and regulate their kinase activity. Unlike typical PIK3Rs, members of this family have two C1 domains. This model corresponds to the second one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	52
410381	cd20831	C1_dGM13116p-like	protein kinase C conserved region 1 (C1 domain) found in Drosophila melanogaster GM13116p and similar proteins. This group contains uncharacterized proteins including Drosophila melanogaster GM13116p and Caenorhabditis elegans hypothetical protein R11G1.4, both of which contain C2 (a calcium-binding domain) and C1 domains. This model describes the C1 domain, a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	58
410382	cd20832	C1_ARHGEF-like	protein kinase C conserved region 1 (C1 domain) found in uncharacterized Rho guanine nucleotide exchange factor (ARHGEF)-like proteins. The family includes a group of uncharacterized proteins that show high sequence similarity to vertebrate Rho guanine nucleotide exchange factors ARHGEF11 and ARHGEF12, which may play a role in the regulation of RhoA GTPase by guanine nucleotide-binding alpha-12 (GNA12) and alpha-13 (GNA13). Unlike typical ARHGEF11 and ARHGEF12, members of this family contain a C1 domain. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	53
410383	cd20833	C1_cPKC_rpt1	first protein kinase C conserved region 1 (C1 domain) found in the classical (or conventional) protein kinase C (cPKC) family. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domains. cPKCs are potent kinases for histones, myelin basic protein, and protamine. They depend on calcium, DAG (1,2-diacylglycerol), and in most cases, phosphatidylserine (PS) for activation. There are four cPKC isoforms, named alpha, betaI, betaII, and gamma. PKC-alpha is expressed in many tissues and is associated with cell proliferation, apoptosis, and cell motility. It plays a role in the signaling of the growth factors PDGF, VEGF, EGF, and FGF. Abnormal levels of PKC-alpha have been detected in many transformed cell lines and several human tumors. In addition, PKC-alpha is required for HER2 dependent breast cancer invasion. The PKC beta isoforms (I and II), generated by alternative splicing of a single gene, are preferentially activated by hyperglycemia-induced DAG (1,2-diacylglycerol) in retinal tissues. This is implicated in diabetic microangiopathy such as ischemia, neovascularization, and abnormal vasodilator function. PKC-beta also plays an important role in VEGF signaling. In addition, glucose regulates proliferation in retinal endothelial cells via PKC-betaI. PKC-beta is also being explored as a therapeutic target in cancer. It contributes to tumor formation and is involved in the tumor host mechanisms of inflammation and angiogenesis. PKC-gamma is mainly expressed in neuronal tissues. It plays a role in protection from ischemia. Members of this family contain two copies of the C1 domain. This model corresponds to the first one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	58
410384	cd20834	C1_nPKC_theta-like_rpt1	first protein kinase C conserved region 1 (C1 domain) found in novel protein kinase C (nPKC) theta, delta, and similar proteins. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domains. nPKCs are calcium-independent, but require DAG (1,2-diacylglycerol) and phosphatidylserine (PS) for activity. PKC-theta is selectively expressed in T-cells and plays an important and non-redundant role in several aspects of T-cell biology. PKC-delta plays a role in cell cycle regulation and programmed cell death in many cell types. Members of this family contain two copies of the C1 domain. This model corresponds to the first one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	61
410385	cd20835	C1_nPKC_epsilon-like_rpt1	first protein kinase C conserved region 1 (C1 domain) found in novel protein kinase C (nPKC) epsilon, eta, and similar proteins. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domains. nPKCs are calcium-independent, but require DAG (1,2-diacylglycerol) and phosphatidylserine (PS) for activity. PKC-epsilon has been shown to behave as an oncoprotein. Its overexpression contributes to neoplastic transformation depending on the cell type. It contributes to oncogenesis by inducing disordered cell growth and inhibiting cell death. It also plays a role in tumor invasion and metastasis. PKC-epsilon has also been found to confer cardioprotection against ischemia and reperfusion-mediated damage. Other cellular functions include the regulation of gene expression, cell adhesion, and cell motility. PKC-eta is predominantly expressed in squamous epithelia, where it plays a crucial role in the signaling of cell-type specific differentiation. It is also expressed in pro-B cells and early-stage thymocytes, and acts as a key regulator in early B-cell development. PKC-eta increases glioblastoma multiforme (GBM) proliferation and resistance to radiation, and is being developed as a therapeutic target for the management of GBM. Members of this family contain two copies of the C1 domain. This model corresponds to the first one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	64
410386	cd20836	C1_cPKC_rpt2	second protein kinase C conserved region 1 (C1 domain) found in the classical (or conventional) protein kinase C (cPKC) family. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domain. cPKCs are potent kinases for histones, myelin basic protein, and protamine. They depend on calcium, DAG (1,2-diacylglycerol), and in most cases, phosphatidylserine (PS) for activation. There are four cPKC isoforms, named alpha, betaI, betaII, and gamma. PKC-alpha is expressed in many tissues and is associated with cell proliferation, apoptosis, and cell motility. It plays a role in the signaling of the growth factors PDGF, VEGF, EGF, and FGF. Abnormal levels of PKC-alpha have been detected in many transformed cell lines and several human tumors. In addition, PKC-alpha is required for HER2 dependent breast cancer invasion. The PKC beta isoforms (I and II), generated by alternative splicing of a single gene, are preferentially activated by hyperglycemia-induced DAG (1,2-diacylglycerol) in retinal tissues. This is implicated in diabetic microangiopathy such as ischemia, neovascularization, and abnormal vasodilator function. PKC-beta also plays an important role in VEGF signaling. In addition, glucose regulates proliferation in retinal endothelial cells via PKC-betaI. PKC-beta is also being explored as a therapeutic target in cancer. It contributes to tumor formation and is involved in the tumor host mechanisms of inflammation and angiogenesis. PKC-gamma is mainly expressed in neuronal tissues. It plays a role in protection from ischemia. Members of this family contain two copies of C1 domain. This model corresponds to the second one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	54
410387	cd20837	C1_nPKC_theta-like_rpt2	second protein kinase C conserved region 1 (C1 domain) found in novel protein kinase C (nPKC) theta, delta, and similar proteins. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domain. nPKCs are calcium-independent, but require DAG (1,2-diacylglycerol) and phosphatidylserine (PS) for activity. PKC-theta is selectively expressed in T-cells and plays an important and non-redundant role in several aspects of T-cell biology. PKC-delta plays a role in cell cycle regulation and programmed cell death in many cell types. Members of this family contain two copies of the C1 domain. This model corresponds to the second one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	50
410388	cd20838	C1_nPKC_epsilon-like_rpt2	second protein kinase C conserved region 1 (C1 domain) found in novel protein kinase C (nPKC) epsilon, eta, and similar proteins. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domain. nPKCs are calcium-independent, but require DAG (1,2-diacylglycerol) and phosphatidylserine (PS) for activity. PKC-epsilon has been shown to behave as an oncoprotein. Its overexpression contributes to neoplastic transformation depending on the cell type. It contributes to oncogenesis by inducing disordered cell growth and inhibiting cell death. It also plays a role in tumor invasion and metastasis. PKC-epsilon has also been found to confer cardioprotection against ischemia and reperfusion-mediated damage. Other cellular functions include the regulation of gene expression, cell adhesion, and cell motility. PKC-eta is predominantly expressed in squamous epithelia, where it plays a crucial role in the signaling of cell-type specific differentiation. It is also expressed in pro-B cells and early-stage thymocytes, and acts as a key regulator in early B-cell development. PKC-eta increases glioblastoma multiforme (GBM) proliferation and resistance to radiation, and is being developed as a therapeutic target for the management of GBM. Members of this family contain two copies of C1 domain. This model corresponds to the second one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	55
410389	cd20839	C1_PKD1_rpt1	first protein kinase C conserved region 1 (C1 domain) found in protein kinase D (PKD) and similar proteins. PKD is also called PKD1, PRKD1, protein kinase C mu type (nPKC-mu), PRKCM, serine/threonine-protein kinase D1, or nPKC-D1. It is a serine/threonine-protein kinase that converts transient diacylglycerol (DAG) signals into prolonged physiological effects downstream of PKC, and is involved in the regulation of MAPK8/JNK1 and Ras signaling, Golgi membrane integrity and trafficking, cell survival through NF-kappa-B activation, cell migration, cell differentiation by mediating HDAC7 nuclear export, cell proliferation via MAPK1/3 (ERK1/2) signaling, and plays a role in cardiac hypertrophy, VEGFA-induced angiogenesis, genotoxic-induced apoptosis and flagellin-stimulated inflammatory response. PKD contains N-terminal tandem cysteine-rich zinc binding C1 (PKC conserved region 1), central PH (Pleckstrin Homology), and C-terminal catalytic kinase domains. This model corresponds to the first C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	72
410390	cd20840	C1_PKD2_rpt1	first protein kinase C conserved region 1 (C1 domain) found in protein kinase D2 (PKD2) and similar proteins. PKD2, also called PRKD2, HSPC187, or serine/threonine-protein kinase D2 (nPKC-D2), is a serine/threonine-protein kinase that converts transient diacylglycerol (DAG) signals into prolonged physiological effects downstream of PKC, and is involved in the regulation of cell proliferation via MAPK1/3 (ERK1/2) signaling, oxidative stress-induced NF-kappa-B activation, inhibition of HDAC7 transcriptional repression, signaling downstream of T-cell antigen receptor (TCR) and cytokine production, and plays a role in Golgi membrane trafficking, angiogenesis, secretory granule release and cell adhesion. PKD2 contains N-terminal tandem cysteine-rich zinc binding C1 (PKC conserved region 1), central PH (Pleckstrin Homology), and C-terminal catalytic kinase domains. This model corresponds to the first C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	73
410391	cd20841	C1_PKD3_rpt1	first protein kinase C conserved region 1 (C1 domain) found in protein kinase D3 (PKD3) and similar proteins. PKD3 is also called PRKD3, PRKCN, serine/threonine-protein kinase D3 (nPKC-D3), protein kinase C nu type (nPKC-nu), or protein kinase EPK2. It converts transient diacylglycerol (DAG) signals into prolonged physiological effects, downstream of PKC. It is involved in the regulation of the cell cycle by modulating microtubule nucleation and dynamics. PKD3 acts as a key mediator in several cancer development signaling pathways. PKD3 contains N-terminal tandem cysteine-rich zinc binding C1 (PKC conserved region 1), central PH (Pleckstrin Homology), and C-terminal catalytic kinase domains. This model corresponds to the first C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	75
410392	cd20842	C1_PKD1_rpt2	second protein kinase C conserved region 1 (C1 domain) found in protein kinase D (PKD) and similar proteins. PKD is also called PKD1, PRKD1, protein kinase C mu type (nPKC-mu), PRKCM, serine/threonine-protein kinase D1, or nPKC-D1. It is a serine/threonine-protein kinase that converts transient diacylglycerol (DAG) signals into prolonged physiological effects downstream of PKC, and is involved in the regulation of MAPK8/JNK1 and Ras signaling, Golgi membrane integrity and trafficking, cell survival through NF-kappa-B activation, cell migration, cell differentiation by mediating HDAC7 nuclear export, cell proliferation via MAPK1/3 (ERK1/2) signaling, and plays a role in cardiac hypertrophy, VEGFA-induced angiogenesis, genotoxic-induced apoptosis and flagellin-stimulated inflammatory response. PKD contains N-terminal tandem cysteine-rich zinc binding C1 (PKC conserved region 1), central PH (Pleckstrin Homology), and C-terminal catalytic kinase domains. This model corresponds to the second C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	94
410393	cd20843	C1_PKD2_rpt2	second protein kinase C conserved region 1 (C1 domain) found in protein kinase D2 (PKD2) and similar proteins. PKD2, also called PRKD2, HSPC187, or serine/threonine-protein kinase D2 (nPKC-D2), is a serine/threonine-protein kinase that converts transient diacylglycerol (DAG) signals into prolonged physiological effects downstream of PKC, and is involved in the regulation of cell proliferation via MAPK1/3 (ERK1/2) signaling, oxidative stress-induced NF-kappa-B activation, inhibition of HDAC7 transcriptional repression, signaling downstream of T-cell antigen receptor (TCR) and cytokine production, and plays a role in Golgi membrane trafficking, angiogenesis, secretory granule release and cell adhesion. PKD2 contains N-terminal tandem cysteine-rich zinc binding C1 (PKC conserved region 1), central PH (Pleckstrin Homology), and C-terminal catalytic kinase domains. This model corresponds to the second C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	79
410394	cd20844	C1_PKD3_rpt2	second protein kinase C conserved region 1 (C1 domain) found in protein kinase D3 (PKD3) and similar proteins. PKD3 is also called PRKD3, PRKCN, serine/threonine-protein kinase D3 (nPKC-D3), protein kinase C nu type (nPKC-nu), or protein kinase EPK2. It converts transient diacylglycerol (DAG) signals into prolonged physiological effects, downstream of PKC. It is involved in the regulation of the cell cycle by modulating microtubule nucleation and dynamics. PKD3 acts as a key mediator in several cancer development signaling pathways. PKD3 contains N-terminal tandem cysteine-rich zinc binding C1 (PKC conserved region 1), central PH (Pleckstrin Homology), and C-terminal catalytic kinase domains. This model corresponds to the second C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	69
410395	cd20845	C1_DGKbeta_rpt1	first protein kinase C conserved region 1 (C1 domain) found in diacylglycerol kinase beta (DAG kinase beta) and similar proteins. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. DAG kinase beta, also called 90 kDa diacylglycerol kinase, or diglyceride kinase beta (DGK-beta), exhibits high phosphorylation activity for long-chain diacylglycerols. It is classified as a type I DAG kinase (DGK), containing EF-hand structures that bind Ca(2+) and a recoverin homology domain, in addition to C1 and catalytic domains that are present in all DGKs. As a type I DGK, it is regulated by calcium binding. DAG kinase beta contains two copies of the C1 domain. This model corresponds to the first one. DGK-beta contains typical C1 domains that bind DAG and phorbol esters. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	66
410396	cd20846	C1_DGKgamma_rpt1	first protein kinase C conserved region 1 (C1 domain) found in diacylglycerol kinase gamma (DAG kinase gamma) and similar proteins. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. DAG kinase gamma, also called diglyceride kinase gamma (DGK-gamma), reverses the normal flow of glycerolipid biosynthesis by phosphorylating diacylglycerol back to phosphatidic acid. It is classified as a type I DAG kinase (DGK), containing EF-hand structures that bind Ca(2+) and a recoverin homology domain, in addition to C1 and catalytic domains that are present in all DGKs. As a type I DGK, it is regulated by calcium binding. DGK-gamma contains two copies of the C1 domain. This model corresponds to the first one. DGK-gamma contains typical C1 domains that bind DAG and phorbol esters. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	73
410397	cd20847	C1_DGKdelta_rpt1	first protein kinase C conserved region 1 (C1 domain) found in diacylglycerol kinase delta (DAG kinase delta) and similar proteins. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. DAG kinase delta, also called 130 kDa diacylglycerol kinase, or diglyceride kinase delta (DGK-delta), is a residential lipid kinase in the endoplasmic reticulum. It promotes lipogenesis and is involved in triglyceride biosynthesis. It is classified as a type II DAG kinase (DGK), containing pleckstrin homology (PH) and sterile alpha motifs (SAM) domains, in addition to C1 and catalytic domains that are present in all DGKs. The SAM domain mediates oligomerization of type II DGKs. DAG kinase delta contains two copies of the C1 domain. This model corresponds to the first one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	85
410398	cd20848	C1_DGKeta_rpt1	first protein kinase C conserved region 1 (C1 domain) found in diacylglycerol kinase eta (DAG kinase eta) and similar proteins. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. DAG kinase eta, also called diglyceride kinase eta (DGK-eta), plays a key role in promoting cell growth. It is classified as a type II DAG kinase (DGK), containing pleckstrin homology (PH) and sterile alpha motifs (SAM) domains, in addition to C1 and catalytic domains that are present in all DGKs. The SAM domain mediates oligomerization of type II DGKs. The diacylglycerol kinase eta gene, DGKH, is a replicated risk gene of bipolar disorder (BPD). DAG kinase eta contains two copies of the C1 domain. This model corresponds to the first one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	86
410399	cd20849	C1_DGKzeta_rpt1	first protein kinase C conserved region 1 (C1 domain) found in diacylglycerol kinase zeta (DAG kinase zeta) and similar proteins. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. DAG kinase zeta, also called diglyceride kinase zeta (DGK-zeta), displays a strong preference for 1,2-diacylglycerols over 1,3-diacylglycerols, but lacks substrate specificity among molecular species of long chain diacylglycerols. It is classified as a type IV DAG kinase (DGK), containing myristoylated alanine-rich protein kinase C substrate (MARCKS), PDZ-binding, and ankyrin domains, in addition to C1 and catalytic domains that are present in all DGKs. The MARCKS domain regulates the nuclear localizations of type IV DGKs while the PDZ-binding and ankyrin domains regulate interactions with several proteins. DAG kinase zeta contains two copies of the C1 domain. This model corresponds to the first one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	74
410400	cd20850	C1_DGKiota_rpt1	first protein kinase C conserved region 1 (C1 domain) found in diacylglycerol kinase iota (DAG kinase iota) and similar proteins. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. DAG kinase iota, also called diglyceride kinase iota (DGK-iota), or DGKI, is a homolog of Drosophila DGK2, RdgA. It may have important cellular functions in the retina and brain. It is classified as a type IV DAG kinase (DGK), containing myristoylated alanine-rich protein kinase C substrate (MARCKS), PDZ-binding, and ankyrin domains, in addition to C1 and catalytic domains that are present in all DGKs. The MARCKS domain regulates the nuclear localizations of type IV DGKs while the PDZ-binding and ankyrin domains regulate interactions with several proteins. DAG kinase iota contains two copies of the C1 domain. This model corresponds to the first one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	73
410401	cd20851	C1_DGK_typeI_like_rpt2	second protein kinase C conserved region 1 (C1 domain) found in type I diacylglycerol kinases. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. Type I DAG kinases (DGKs) contain EF-hand structures that bind Ca(2+) and recoverin homology domains, in addition to C1 and catalytic domains that are present in all DGKs. Type I DGKs, regulated by calcium binding, include three DGK isozymes (alpha, beta and gamma). DAG kinase alpha, also called 80 kDa DAG kinase, or diglyceride kinase alpha (DGK-alpha), is active upon cell stimulation, initiating the resynthesis of phosphatidylinositols and attenuating protein kinase C activity. DAG kinase beta, also called 90 kDa DAG kinase, or diglyceride kinase beta (DGK-beta), exhibits high phosphorylation activity for long-chain diacylglycerols. DAG kinase gamma, also called diglyceride kinase gamma (DGK-gamma), reverses the normal flow of glycerolipid biosynthesis by phosphorylating diacylglycerol back to phosphatidic acid. Members of this family contain two copies of the C1 domain. This model corresponds to the second one. DGK-alpha contains atypical C1 domains, while DGK-beta and DGK-gamma contain typical C1 domains that bind DAG and phorbol esters. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	52
410402	cd20852	C1_DGK_typeII_rpt2	second protein kinase C conserved region 1 (C1 domain) found in type II diacylglycerol kinases. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. Type II DAG kinases (DGKs) contain pleckstrin homology (PH) and sterile alpha motifs (SAM) domains, in addition to C1 and catalytic domains that are present in all DGKs. The SAM domain mediates oligomerization of type II DGKs. Three DGK isozymes (delta, eta and kappa) are classified as type II. DAG kinase delta, also called 130 kDa DAG kinase, or diglyceride kinase delta (DGK-delta), is a residential lipid kinase in the endoplasmic reticulum. It promotes lipogenesis and is involved in triglyceride biosynthesis. DAG kinase eta, also called diglyceride kinase eta (DGK-eta), plays a key role in promoting cell growth. The DAG kinase eta gene, DGKH, is a replicated risk gene of bipolar disorder (BPD). DAG kinase kappa is also called diglyceride kinase kappa (DGK-kappa) or 142 kDa DAG kinase. Members of this family contain two copies of the C1 domain. This model corresponds to the second one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	54
410403	cd20853	C1_DGKepsilon_typeIII_rpt2	second protein kinase C conserved region 1 (C1 domain) found in type III diacylglycerol kinase, DAG kinase epsilon, and similar proteins. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. DAG kinase epsilon, also called diglyceride kinase epsilon (DGK-epsilon), is the only isoform classified as type III; it possesses a hydrophobic domain in addition to C1 and catalytic domains that are present in all DGKs, and shows selectivity for acyl chains. It is highly selective for arachidonate-containing species of DAG. It may terminate signals transmitted through arachidonoyl-DAG or may contribute to the synthesis of phospholipids with defined fatty acid composition. DAG kinase epsilon contains two copies of the C1 domain. This model corresponds to the second one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	63
410404	cd20854	C1_DGKtheta_typeV_rpt3	third protein kinase C conserved region 1 (C1 domain) found in type V diacylglycerol kinase, DAG kinase theta, and similar proteins. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. DAG kinase theta, also called diglyceride kinase theta (DGK-theta), is the only isoform classified as type V; it contains a pleckstrin homology (PH)-like domain and an additional C1 domain, compared to other DGKs. It may regulate the activity of protein kinase C by controlling the balance between the two signaling lipids, diacylglycerol and phosphatidic acid. DAG kinase theta contains three copies of the C1 domain. This model corresponds to the third one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	63
410405	cd20855	C1_DGK_typeIV_rpt2	second protein kinase C conserved region 1 (C1 domain) found in type IV diacylglycerol kinases. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. Type IV DAG kinases (DGKs) contain myristoylated alanine-rich protein kinase C substrate (MARCKS), PDZ-binding, and ankyrin domains, in addition to C1 and catalytic domains that are present in all DGKs. The MARCKS domain regulates the nuclear localizations of type IV DGKs while the PDZ-binding and ankyrin domains regulate interactions with several proteins. Two DGK isozymes (zeta and iota) are classified as type IV. DAG kinase zeta, also called diglyceride kinase zeta (DGK-zeta), displays a strong preference for 1,2-diacylglycerols over 1,3-diacylglycerols, but lacks substrate specificity among molecular species of long chain diacylglycerols. DAG kinase iota, also called diglyceride kinase iota (DGK-iota), or DGKI, is a homolog of Drosophila DGK2, RdgA. It may have important cellular functions in the retina and brain. Members of this family contain two copies of the C1 domain. This model corresponds to the second one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	62
410406	cd20856	C1_alphaCHN	protein kinase C conserved region 1 (C1 domain) found in alpha-chimaerin and similar proteins. Alpha-chimaerin, also called A-chimaerin, N-chimaerin (CHN), alpha-chimerin, N-chimerin (NC), or Rho GTPase-activating protein 2 (ARHGAP2), is a GTPase-activating protein (GAP) for p21-rac and a phorbol ester receptor. It is involved in the assembly of neuronal locomotor circuits as a direct effector of EPHA4 in axon guidance. Alpha-chimaerin contains a functional SH2 domain that can bind to phosphotyrosine motifs within receptors, a GAP domain with specificity in vitro for Rac1 and a diacylglycerol (DAG)-binding C1 domain which allows them to translocate to membranes in response to DAG signaling and anchors them in close proximity to activated Rac. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	57
410407	cd20857	C1_betaCHN	protein kinase C conserved region 1 (C1 domain) found in beta-chimaerin and similar proteins. Beta-chimaerin, also called beta-chimerin (BCH) or Rho GTPase-activating protein 3 (ARHGAP3), is a GTPase-activating protein (GAP) for p21-rac. Insufficient expression of beta-2 chimaerin is expected to lead to higher Rac activity and could therefore play a role in the progression from low-grade to high-grade tumors. Beta-chimaerin contains a functional SH2 domain that can bind to phosphotyrosine motifs within receptors, a GAP domain with specificity in vitro for Rac1 and a diacylglycerol (DAG)-binding C1 domain which allows them to translocate to membranes in response to DAG signaling and anchors them in close proximity to activated Rac. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	61
410408	cd20858	C1_Munc13-1	protein kinase C conserved region 1 (C1 domain) found in Munc13-1 and similar proteins. Munc13-1, also called protein unc-13 homolog A (Unc13A), is a diacylglycerol (DAG) receptor that plays a role in vesicle maturation during exocytosis as a target of the diacylglycerol second messenger pathway. It is involved in neurotransmitter release by acting in synaptic vesicle priming prior to vesicle fusion and participates in the activity-dependent refilling of readily releasable vesicle pool (RRP). Loss of MUNC13-1 function causes microcephaly, cortical hyperexcitability, and fatal myasthenia. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	60
410409	cd20859	C1_Munc13-2-like	protein kinase C conserved region 1 (C1 domain) found in Munc13-2, Munc13-3 and similar proteins. Munc13-2, also called protein unc-13 homolog B (Unc13B), plays a role in vesicle maturation during exocytosis as a target of the diacylglycerol second messenger pathway. It is involved in neurotransmitter release by acting in synaptic vesicle priming prior to vesicle fusion and participates in the activity-dependent refilling of readily releasable vesicle pool (RRP). Munc13-2 is essential for normal release probability at hippocampal mossy fiber synapses. Munc13-3 is almost exclusively expressed in the cerebellum. It acts as a tumor suppressor and plays a critical role in the formation of release sites with calcium channel nanodomains. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	82
410410	cd20860	C1_RASGRP1	protein kinase C conserved region 1 (C1 domain) found in RAS guanyl-releasing protein 1 (RASGRP1) and similar proteins. RASGRP1, also called calcium and DAG-regulated guanine nucleotide exchange factor II (CalDAG-GEFII) or Ras guanyl-releasing protein, functions as a calcium- and diacylglycerol (DAG)-regulated nucleotide exchange factor specifically activating Ras through the exchange of bound GDP for GTP. It activates the Erk/MAP kinase cascade and regulates T-cell/B-cell development, homeostasis and differentiation by coupling T-lymphocyte/B-lymphocyte antigen receptors to Ras. RASGRP1 also regulates NK cell cytotoxicity and ITAM-dependent cytokine production by activation of Ras-mediated ERK and JNK pathways. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	55
410411	cd20861	C1_RASGRP2	protein kinase C conserved region 1 (C1 domain) found in RAS guanyl-releasing protein 2 (RASGRP2) and similar proteins. RASGRP2, also called calcium and DAG-regulated guanine nucleotide exchange factor I (CalDAG-GEFI), Cdc25-like protein (CDC25L), or F25B3.3 kinase-like protein, functions as a calcium- and DAG-regulated nucleotide exchange factor specifically activating Rap through the exchange of bound GDP for GTP. It may also activate other GTPases such as RRAS, RRAS2, NRAS, KRAS but not HRAS. RASGRP2 is also involved in aggregation of platelets and adhesion of T-lymphocytes and neutrophils probably through inside-out integrin activation, as well as in the muscarinic acetylcholine receptor M1/CHRM1 signaling pathway. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	56
410412	cd20862	C1_RASGRP3	protein kinase C conserved region 1 (C1 domain) found in RAS guanyl-releasing protein 3 (RASGRP3) and similar proteins. RASGRP3, also called calcium and DAG-regulated guanine nucleotide exchange factor III (CalDAG-GEFIII), or guanine nucleotide exchange factor for Rap1, is a guanine nucleotide-exchange factor activating H-Ras, R-Ras and Ras-associated protein-1/2. It functions as an important mediator of signaling downstream from receptor coupled phosphoinositide turnover in B and T cells. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	59
410413	cd20863	C1_RASGRP4	protein kinase C conserved region 1 (C1 domain) found in RAS guanyl-releasing protein 4 (RASGRP4) and similar proteins. RASGRP4 functions as a cation- and diacylglycerol (DAG)-regulated nucleotide exchange factor activating Ras through the exchange of bound GDP for GTP. It may function in mast cell differentiation. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	57
410414	cd20864	C1_MRCKalpha	protein kinase C conserved region 1 (C1 domain) found in myotonic dystrophy kinase-related Cdc42-binding kinase alpha (MRCK alpha) and similar proteins. MRCK alpha, also called Cdc42-binding protein kinase alpha, DMPK-like alpha, or myotonic dystrophy protein kinase-like alpha, is a serine/threonine-protein kinase expressed ubiquitously in many tissues. It plays a role in the regulation of peripheral actin reorganization and neurite outgrowth. It may also play a role in the transferrin iron uptake pathway. MRCK alpha is an important downstream effector of Cdc42 and plays a role in the regulation of cytoskeleton reorganization and cell migration. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	60
410415	cd20865	C1_MRCKbeta	protein kinase C conserved region 1 (C1 domain) found in myotonic dystrophy kinase-related Cdc42-binding kinase beta (MRCK beta) and similar proteins. MRCK beta, also called Cdc42-binding protein kinase beta (Cdc42BP-beta), DMPK-like beta, or myotonic dystrophy protein kinase-like beta, is a serine/threonine-protein kinase expressed ubiquitously in many tissues. MRCK beta is an important downstream effector of Cdc42 and plays a role in the regulation of cytoskeleton reorganization and cell migration. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	53
410416	cd20866	C1_MRCKgamma	protein kinase C conserved region 1 (C1 domain) found in myotonic dystrophy kinase-related Cdc42-binding kinase gamma (MRCK gamma) and similar proteins. MRCK gamma (MRCKG), also called Cdc42-binding protein kinase gamma, DMPK-like gamma, myotonic dystrophy protein kinase-like gamma, or myotonic dystrophy protein kinase-like alpha, is a serine/threonine-protein kinase expressed in heart and skeletal muscles. It may act as a downstream effector of Cdc42 in cytoskeletal reorganization and contributes to the actomyosin contractility required for cell invasion, through the regulation of MYPT1 and thus MLC2 phosphorylation. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	52
410417	cd20867	C1_VAV1	protein kinase C conserved region 1 (C1 domain) found in VAV1 protein. VAV1 is expressed predominantly in the hematopoietic system and plays an important role in the development and activation of B and T cells. It is activated by tyrosine phosphorylation to function as a guanine nucleotide exchange factor (GEF) for Rho GTPases following cell surface receptor activation, triggering various effects such as cytoskeletal reorganization, transcription regulation, cell cycle progression, and calcium mobilization. It also serves as a scaffold protein and has been shown to interact with Ku70, Socs1, Janus kinase 2, SIAH2, S100B, Abl gene, ZAP-70, SLP76, and Syk, among others. VAV proteins contain several domains that enable their function: N-terminal calponin homology (CH), acidic, RhoGEF (also called Dbl-homologous or DH), Pleckstrin Homology (PH), C1 (zinc finger), SH2, and two SH3 domains. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	57
410418	cd20868	C1_VAV2	protein kinase C conserved region 1 (C1 domain) found in VAV2 protein. VAV2 is widely expressed and functions as a guanine nucleotide exchange factor (GEF) for RhoA, RhoB and RhoG and also activates Rac1 and Cdc42. It is implicated in many cellular and physiological functions including blood pressure control, eye development, neurite outgrowth and branching, EGFR endocytosis and degradation, and cell cluster morphology, among others. It has been reported to associate with Nek3. VAV proteins contain several domains that enable their function: N-terminal calponin homology (CH), acidic, RhoGEF (also called Dbl-homologous or DH), Pleckstrin Homology (PH), C1 (zinc finger), SH2, and two SH3 domains. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	58
410419	cd20869	C1_VAV3	protein kinase C conserved region 1 (C1 domain) found in VAV3 protein. VAV3 is ubiquitously expressed and functions as a phosphorylation-dependent guanine nucleotide exchange factor (GEF) for RhoA, RhoG, and Rac1. Its function has been implicated in the hematopoietic, bone, cerebellar, and cardiovascular systems. VAV3 is essential in axon guidance in neurons that control blood pressure and respiration. It is overexpressed in prostate cancer cells and plays a role in regulating androgen receptor transcriptional activity. VAV proteins contain several domains that enable their function: N-terminal calponin homology (CH), acidic, RhoGEF (also called Dbl-homologous or DH), Pleckstrin Homology (PH), C1 (zinc finger), SH2, and two SH3 domains. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	59
410420	cd20870	C1_A_C-Raf	protein kinase C conserved region 1 (C1 domain) found in A- and C-Raf (Rapidly Accelerated Fibrosarcoma) kinases, and similar proteins. This group includes A-Raf and C-Raf, both of which are serine/threonine-protein kinases. A-Raf, also called proto-oncogene A-Raf or proto-oncogene A-Raf-1, cooperates with C-Raf in regulating ERK transient phosphorylation that is associated with cyclin D expression and cell cycle progression. Mice deficient in A-Raf are born alive but show neurological and intestinal defects. A-Raf demonstrates low kinase activity to MEK, compared with B- and C-Raf, and may also have alternative functions other than in the ERK signaling cascade. It regulates the M2 type pyruvate kinase, a key glycolytic enzyme. It also plays a role in endocytic membrane trafficking. C-Raf, also known as proto-oncogene Raf-1 or c-Raf-1, is ubiquitously expressed and was the first Raf identified. It was characterized as the acquired oncogene from an acutely transforming murine sarcoma virus (3611-MSV) and the transforming agent from the avian retrovirus MH2. C-Raf-deficient mice embryos die around mid-gestation with increased apoptosis of embryonic tissues, especially in the fetal liver. One of the main functions of C-Raf is restricting caspase activation to promote survival in response to specific stimuli such as Fas stimulation, macrophage apoptosis, and erythroid differentiation. Both A- and C-Raf are mitogen-activated protein kinase kinase kinases (MAP3K, MKKK, MAPKKK), which phosphorylate and activate MAPK kinases (MAPKKs or MKKs or MAP2Ks), which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. They function in the linear Ras-Raf-MEK-ERK pathway that regulates many cellular processes including cycle regulation, proliferation, differentiation, survival, and apoptosis. Raf proteins contain a Ras binding domain, a zinc finger cysteine-rich domain (C1), and a catalytic kinase domain. This model describes the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	52
410421	cd20871	C1_B-Raf	protein kinase C conserved region 1 (C1 domain) found in B-Raf (Rapidly Accelerated Fibrosarcoma) kinase and similar proteins. Serine/threonine-protein kinase B-Raf, also called proto-oncogene B-Raf, p94, or v-Raf murine sarcoma viral oncogene homolog B1, activates ERK with the strongest magnitude, compared with other Raf kinases. Mice embryos deficient in B-Raf die around midgestation due to vascular hemorrhage caused by apoptotic endothelial cells. Mutations in B-Raf have been implicated in initiating tumorigenesis and tumor progression, and are found in malignant cutaneous melanoma, papillary thyroid cancer, as well as in ovarian and colorectal carcinomas. Most oncogenic B-Raf mutations are located at the activation loop of the kinase and surrounding regions; the V600E mutation accounts for around 90% of oncogenic mutations. The V600E mutant constitutively activates MEK, resulting in sustained activation of ERK. B-Raf is a mitogen-activated protein kinase kinase kinase (MAP3K, MKKK, MAPKKK), which phosphorylates and activates MAPK kinases (MAPKKs or MKKs or MAP2Ks), which in turn phosphorylate and activate MAPKs during signaling cascades that are important in mediating cellular responses to extracellular signals. They function in the linear Ras-Raf-MEK-ERK pathway that regulates many cellular processes including cycle regulation, proliferation, differentiation, survival, and apoptosis. Raf proteins contain a Ras binding domain, a zinc finger cysteine-rich domain (C1), and a catalytic kinase domain. This model describes the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	60
410422	cd20872	C1_KSR1	protein kinase C conserved region 1 (C1 domain) found in kinase suppressor of Ras 1 (KSR1) and similar proteins. KSR1 functions as a transducer of TNFalpha-stimulated C-Raf activation of ERK1/2 and NF-kB. Detected activity of KSR1 is cell type specific and context dependent. It is inactive in normal colon epithelial cells and becomes activated at the onset of inflammatory bowel disease (IBD). Similarly, KSR1 activity is undetectable prior to stimulation by EGF or ceramide in COS-7 or YAMC cells, respectively. KSR proteins are widely regarded as pseudokinases; however, this is debated, as catalytic activity has been detected for KSR1 in some systems. KSR proteins contain a SAM-like domain, a zinc finger cysteine-rich domain (C1), and a pseudokinase domain. This model describes the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	47
410423	cd20873	C1_KSR2	protein kinase C conserved region 1 (C1 domain) found in kinase suppressor of Ras 2 (KSR2) and similar proteins. KSR2 interacts with the protein phosphatase calcineurin and functions in calcium-mediated ERK signaling. It also functions in energy metabolism by regulating AMP kinase and AMPK-dependent processes such as glucose uptake and fatty acid oxidation. KSR proteins act as scaffold proteins that function downstream of Ras and upstream of Raf in the Extracellular signal-Regulated Kinase (ERK) pathway that regulates many cellular processes including cycle regulation, proliferation, differentiation, survival, and apoptosis. KSR proteins regulate the assembly and activation of the Raf/MEK/ERK module upon Ras activation at the membrane by direct association of its components. They are widely regarded as pseudokinases. KSR proteins contain a SAM-like domain, a zinc finger cysteine-rich domain (C1), and a pseudokinase domain. This model describes the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	57
410424	cd20874	C1_ROCK1	protein kinase C conserved region 1 (C1 domain) found in Rho-associated coiled-coil containing protein kinase 1 (ROCK1) and similar proteins. ROCK1 is a serine/threonine kinase, catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. ROCK1, also called Rho-associated protein kinase 1, renal carcinoma antigen NY-REN-35, Rho-associated, coiled-coil-containing protein kinase I (ROCK-I), p160 ROCK-1, or p160ROCK, is preferentially expressed in the liver, lung, spleen, testes, and kidney. It mediates signaling from Rho to the actin cytoskeleton. It is implicated in the development of cardiac fibrosis, cardiomyocyte apoptosis, and hyperglycemia. Mice deficient in ROCK1 display eyelids open at birth (EOB) and omphalocele phenotypes due to the disorganization of actin filaments in the eyelids and the umbilical ring. ROCK proteins contain an N-terminal extension, a catalytic kinase domain, and a C-terminal extension, which contains a coiled-coil region encompassing a Rho-binding domain (RBD), a pleckstrin homology (PH) domain and a C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	69
410425	cd20875	C1_ROCK2	protein kinase C conserved region 1 (C1 domain) found in Rho-associated coiled-coil containing protein kinase 2 (ROCK2) and similar proteins. ROCK2 is a serine/threonine kinase, catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. ROCK2, also called Rho-associated protein kinase 2, Rho kinase 2, Rho-associated, coiled-coil-containing protein kinase II (ROCK-II), or p164 ROCK-2, was the first identified target of activated RhoA, and was found to play a role in stress fiber and focal adhesion formation. It is prominently expressed in the brain, heart, and skeletal muscles. It is implicated in vascular and neurological disorders, such as hypertension and vasospasm of the coronary and cerebral arteries. ROCK2 is also activated by caspase-2 cleavage, resulting in thrombin-induced microparticle generation in response to cell activation. Mice deficient in ROCK2 show intrauterine growth retardation and embryonic lethality because of placental dysfunction. ROCK proteins contain an N-terminal extension, a catalytic kinase domain, and a C-terminal extension, which contains a coiled-coil region encompassing a Rho-binding domain (RBD), a pleckstrin homology (PH) domain and a C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	71
410426	cd20876	C1_p190RhoGEF	protein kinase C conserved region 1 (C1 domain) found in 190 kDa guanine nucleotide exchange factor (p190RhoGEF) and similar proteins. p190RhoGEF, also called Rho guanine nucleotide exchange factor (RGNEF), Rho guanine nucleotide exchange factor 28 (ARHGEF28), or RIP2, is a brain-enriched, RhoA-specific guanine nucleotide exchange factor that regulates signaling pathways downstream of integrins and growth factor receptors. It is involved in axonal branching, synapse formation and dendritic morphogenesis, as well as in focal adhesion formation, cell motility and B-lymphocytes activation. In addition to the Dbl homology (DH)-PH domain, p190RhoGEF contains an N-terminal C1 (Protein kinase C conserved region 1) domain. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	61
410427	cd20877	C1_ARHGEF2	protein kinase C conserved region 1 (C1 domain) found in Rho guanine nucleotide exchange factor 2 (ARHGEF2) and similar proteins. ARHGEF2, also called guanine nucleotide exchange factor H1 (GEF-H1), microtubule-regulated Rho-GEF, or proliferating cell nucleolar antigen p40, acts as guanine nucleotide exchange factor (GEF) that activates Rho-GTPases by promoting the exchange of GDP for GTP. It is thought to play a role in actin cytoskeleton reorganization in different tissues since its activation induces formation of actin stress fibers. ARHGEF2 may be involved in epithelial barrier permeability, cell motility and polarization, dendritic spine morphology, antigen presentation, leukemic cell differentiation, cell cycle regulation, innate immune response, and cancer. It contains a C1 domain followed by Dbl-homology (DH) and pleckstrin-homology (PH) domains which bind and catalyze the exchange of GDP for GTP on RhoA. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	61
410428	cd20878	C1_AKAP13	protein kinase C conserved region 1 (C1 domain) found in A-kinase anchor protein 13 (AKAP-13) and similar proteins. AKAP-13, also called AKAP-Lbc, breast cancer nuclear receptor-binding auxiliary protein (Brx-1), guanine nucleotide exchange factor Lbc, human thyroid-anchoring protein 31, lymphoid blast crisis oncogene (LBC oncogene), non-oncogenic Rho GTPase-specific GTP exchange factor, protein kinase A-anchoring protein 13 (PRKA13), or p47, is a scaffold protein that plays an important role in assembling signaling complexes downstream of several types of G protein-coupled receptors (GPCRs). It activates RhoA in response to GPCR signaling via its function as a Rho guanine nucleotide exchange factor. It may also activate other Rho family members. AKAP-13 plays a role in cell growth, cell development and actin fiber formation. Its Rho-GEF activity is regulated by protein kinase A (PKA), through binding and phosphorylation. Alternative splicing of this gene in humans has at least 3 transcript variants encoding different isoforms (i.e. proto-/onco-Lymphoid blast crisis, Lbc and breast cancer nuclear receptor-binding auxiliary protein, and Brx) that contain a C1 domain followed by a dbl oncogene homology (DH) domain and a PH domain which are required for full transforming activity. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	60
410429	cd20879	C1_ARHGEF18-like	protein kinase C conserved region 1 (C1 domain) found in uncharacterized Rho guanine nucleotide exchange factor 18 (ARHGEF18)-like proteins. The family includes a group of uncharacterized proteins that show high sequence similarity to vertebrate ARHGEF18, which is also called 114 kDa Rho-specific guanine nucleotide exchange factor (p114-Rho-GEF), p114RhoGEF, or septin-associated RhoGEF (SA-RhoGEF). ARHGEF18 acts as guanine nucleotide exchange factor (GEF) for RhoA GTPases. Its activation induces formation of actin stress fibers. ARHGEF18 also acts as a GEF for RAC1, inducing production of reactive oxygen species (ROS). Members of this family contain C1, RhoGEF or Dbl-homologous (DH), and Pleckstrin Homology (PH) domains, as well as a DUF5401 domain. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	53
410430	cd20880	C1_Stac1	protein kinase C conserved region 1 (C1 domain) found in SH3 and cysteine-rich domain-containing protein (Stac1) and similar proteins. Stac1, also called Src homology 3 and cysteine-rich domain-containing protein, promotes expression of the ion channel CACNA1H at the cell membrane, and thereby contributes to the regulation of channel activity. It plays a minor and redundant role in promoting the expression of calcium channel CACNA1S at the cell membrane, and thereby contributes to increased channel activity. It slows down the inactivation rate of the calcium channel CACNA1C. Stac1 contains a cysteine-rich C1 domain and two SH3 domains at the C-terminus. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	57
410431	cd20881	C1_Stac2	protein kinase C conserved region 1 (C1 domain) found in SH3 and cysteine-rich domain-containing protein 2 (Stac2) and similar proteins. Stac2, also called 24b2/Stac2, or Src homology 3 and cysteine-rich domain-containing protein 2, plays a redundant role in promoting the expression of calcium channel CACNA1S at the cell membrane, and thereby contributes to increased channel activity. It slows down the inactivation rate of the calcium channel CACNA1C. Stac2 contains a cysteine-rich C1 domain and one SH3 domain at the C-terminus. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	59
410432	cd20882	C1_Stac3	protein kinase C conserved region 1 (C1 domain) found in SH3 and cysteine-rich domain-containing protein 3 (Stac3) and similar proteins. Stac3 is an essential component of the skeletal muscle excitation-contraction coupling (ECC) machinery. It is required for normal excitation-contraction coupling in skeletal muscle and for normal muscle contraction in response to membrane depolarization. It plays an essential role for normal Ca2+ release from the sarcoplasmic reticulum, which ultimately leads to muscle contraction. Stac3 contains a cysteine-rich C1 domain and two SH3 domains at the C-terminus. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	59
410433	cd20883	C1_Myosin-IXa	protein kinase C conserved region 1 (C1 domain) found in unconventional myosin-IXa and similar proteins. Myosin-IXa, also called unconventional myosin-9a (Myo9a), is a single-headed, actin-dependent motor protein of the unconventional myosin IX class. It is expressed in several tissues and is enriched in the brain and testes. Myosin-IXa contains a Ras-associating (RA) domain, a motor domain, a protein kinase C conserved region 1 (C1), and a Rho GTPase activating domain (RhoGAP). Myosin-IXa binds the alpha-amino-3-hydroxy-5-methyl-4-isoxazole propionic acid receptor (AMPAR) GluA2 subunit, and plays a key role in controlling the molecular structure and function of hippocampal synapses. Moreover, Myosin-IXa functions in epithelial cell morphology and differentiation, such that its knockout mice develop hydrocephalus and kidney dysfunction. Myosin-IXa regulates collective epithelial cell migration by targeting RhoGAP activity to cell-cell junctions. Myosin-IXa negatively regulates Rho GTPase signaling, and functions as a regulator of kidney tubule function. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	58
410434	cd20884	C1_Myosin-IXb	protein kinase C conserved region 1 (C1 domain) found in unconventional myosin-IXb and similar proteins. Myosin-IXb, also called unconventional myosin-9b (Myo9b), is an actin-dependent motor protein of the unconventional myosin IX class. It is expressed abundantly in tissues of the immune system, like lymph nodes, thymus, and spleen, and in several immune cells including dendritic cells, macrophages and CD4+ T cells. Myosin-IXb contains a Ras-associating (RA) domain, a motor domain, a protein kinase C conserved region 1 (C1), and a Rho GTPase activating (RhoGAP) domain. Myosin-IXb acts as a motorized signaling molecule that links Rho signaling to the dynamic actin cytoskeleton. It regulates leukocyte migration by controlling RhoA signaling. Myosin-IXb is also involved in the development of autoimmune diseases, including rheumatoid arthritis, systemic lupus erythematosus, and type 1 diabetes. Moreover, Myosin-IXb is a ROBO-interacting protein that suppresses RhoA activity in lung cancer cells. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	58
410435	cd20885	C1_RASSF1	protein kinase C conserved region 1 (C1 domain) found in Ras association domain-containing protein 1 (RASSF1) and similar proteins. RASSF1 is a member of a family of RAS effectors, of which there are currently 8 members (RASSF1-8), all containing a Ras-association (RA) domain of the Ral-GDS/AF6 type. RASSF1 has eight transcripts (A-H) arising from alternative splicing and differential promoter usage. RASSF1A and 1C are the most extensively studied RASSF1 with both localized to microtubules and involved in regulation of growth and migration. RASSF1 is a potential tumor suppressor that is required for death receptor-dependent apoptosis. It contains a C1 domain, which is described in this model. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	54
410436	cd20886	C1_RASSF5	protein kinase C conserved region 1 (C1 domain) found in Ras association domain-containing protein 5 (RASSF5) and similar proteins. RASSF5, also called new ras effector 1 (NORE1), or regulator for cell adhesion and polarization enriched in lymphoid tissues (RAPL), is a member of a family of RAS effectors, of which there are currently 8 members (RASSF1-8), all containing a Ras-association (RA) domain of the Ral-GDS/AF6 type. It is expressed as three transcripts (A-C) via differential promoter usage and alternative splicing. RASSF5A is a pro-apoptotic Ras effector and functions as a Ras regulated tumor suppressor. RASSF5C is regulated by Ras related protein and modulates cellular adhesion. RASSF5 is a potential tumor suppressor that seems to be involved in lymphocyte adhesion by linking RAP1A activation upon T-cell receptor or chemokine stimulation to integrin activation. It contains a C1 domain, which is described in this model. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	50
410437	cd20887	C1_TNS2	protein kinase C conserved region 1 (C1 domain) found in tensin-2 and similar proteins. Tensin-2 (TNS2), also called C1 domain-containing phosphatase and tensin (C1-TEN), or tensin-like C1 domain-containing phosphatase (TENC1), is an essential component for the maintenance of glomerular basement membrane (GBM) structures. It regulates cell motility and proliferation. It may have phosphatase activity. TNS2 reduces AKT1 phosphorylation, lowers AKT1 kinase activity, and interferes with AKT1 signaling. It contains an N-terminal region with a zinc finger (C1 domain), a protein tyrosine phosphatase (PTP)-like domain and a protein kinase C conserved region 2 (C2) domain, and a C-terminal region with SH2 and pTyr binding (PTB) domains. This model corresponds to the C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	53
410438	cd20888	C1_TNS1_v	protein kinase C conserved region 1 (C1 domain) found in tensin-1 (TNS1) variant and similar proteins. Tensin-1 (TNS1) plays a role in fibrillar adhesion formation. It may be involved in cell migration, cartilage development and in linking signal transduction pathways to the cytoskeleton. This model corresponds to the C1 domain found in TNS1 variant. Typical TNS1 does not contain C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	57
410439	cd20889	C1_TNS3_v	protein kinase C conserved region 1 (C1 domain) found in tensin-3 (TNS3) variant and similar proteins. Tensin-3 (TNS3), also called tensin-like SH2 domain-containing protein 1 (TENS1), or tumor endothelial marker 6 (TEM6), may play a role in actin remodeling. It is involved in the dissociation of the integrin-tensin-actin complex. This model corresponds to the C1 domain found in TNS3 variant. Typical TNS3 does not contain C1 domain. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	56
410440	cd20890	C1_DGKalpha_rpt2	second protein kinase C conserved region 1 (C1 domain) found in diacylglycerol kinase alpha (DAG kinase alpha) and similar proteins. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. DAG kinase alpha, also called 80 kDa diacylglycerol kinase, or diglyceride kinase alpha (DGK-alpha), converts the second messenger diacylglycerol into phosphatidate upon cell stimulation, initiating the resynthesis of phosphatidylinositols and attenuating protein kinase C activity. It is classified as a type I DAG kinase (DGK), containing EF-hand structures that bind Ca(2+) and a recoverin homology domain, in addition to C1 and catalytic domains that are present in all DGKs. As a type I DGK, it is regulated by calcium binding. DAG kinase alpha contains two copies of the C1 domain. This model corresponds to the second one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	62
410441	cd20891	C1_DGKbeta_rpt2	second protein kinase C conserved region 1 (C1 domain) found in diacylglycerol kinase beta (DAG kinase beta) and similar proteins. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. DAG kinase beta, also called 90 kDa diacylglycerol kinase, or diglyceride kinase beta (DGK-beta), exhibits high phosphorylation activity for long-chain diacylglycerols. It is classified as a type I DAG kinase (DGK), containing EF-hand structures that bind Ca(2+) and a recoverin homology domain, in addition to C1 and catalytic domains that are present in all DGKs. As a type I DGK, it is regulated by calcium binding. DAG kinase beta contains two copies of the C1 domain. This model corresponds to the second one. DGK-beta contains typical C1 domains that bind DAG and phorbol esters. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	59
410442	cd20892	C1_DGKgamma_rpt2	second protein kinase C conserved region 1 (C1 domain) found in diacylglycerol kinase gamma (DAG kinase gamma) and similar proteins. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. DAG kinase gamma, also called diglyceride kinase gamma (DGK-gamma), reverses the normal flow of glycerolipid biosynthesis by phosphorylating diacylglycerol back to phosphatidic acid. It is classified as a type I DAG kinase (DGK), containing EF-hand structures that bind Ca(2+) and a recoverin homology domain, in addition to C1 and catalytic domains that are present in all DGKs. As a type I DGK, it is regulated by calcium binding. DGK-gamma contains two copies of the C1 domain. This model corresponds to the second one. DGK-gamma contains typical C1 domains that bind DAG and phorbol esters. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	61
410443	cd20893	C1_DGKdelta_rpt2	second protein kinase C conserved region 1 (C1 domain) found in diacylglycerol kinase delta (DAG kinase delta) and similar proteins. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. DAG kinase delta, also called 130 kDa diacylglycerol kinase, or diglyceride kinase delta (DGK-delta), is a residential lipid kinase in the endoplasmic reticulum. It promotes lipogenesis and is involved in triglyceride biosynthesis. It is classified as a type II DAG kinase (DGK), containing pleckstrin homology (PH) and sterile alpha motifs (SAM) domains, in addition to C1 and catalytic domains that are present in all DGKs. The SAM domain mediates oligomerization of type II DGKs. DAG kinase delta contains two copies of the C1 domain. This model corresponds to the second one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	61
410444	cd20894	C1_DGKeta_rpt2	second protein kinase C conserved region 1 (C1 domain) found in diacylglycerol kinase eta (DAG kinase eta) and similar proteins. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. DAG kinase eta, also called diglyceride kinase eta (DGK-eta), plays a key role in promoting cell growth. It is classified as a type II DAG kinase (DGK), containing pleckstrin homology (PH) and sterile alpha motifs (SAM) domains, in addition to C1 and catalytic domains that are present in all DGKs. The SAM domain mediates oligomerization of type II DGKs. The diacylglycerol kinase eta gene, DGKH, is a replicated risk gene of bipolar disorder (BPD). DAG kinase eta contains two copies of the C1 domain. This model corresponds to the second one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	62
410445	cd20895	C1_DGKzeta_rpt2	second protein kinase C conserved region 1 (C1 domain) found in diacylglycerol kinase zeta (DAG kinase zeta) and similar proteins. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. DAG kinase zeta, also called diglyceride kinase zeta (DGK-zeta), displays a strong preference for 1,2-diacylglycerols over 1,3-diacylglycerols, but lacks substrate specificity among molecular species of long chain diacylglycerols. It is classified as a type IV DAG kinase (DGK), containing myristoylated alanine-rich protein kinase C substrate (MARCKS), PDZ-binding, and ankyrin domains, in addition to C1 and catalytic domains that are present in all DGKs. The MARCKS domain regulates the nuclear localizations of type IV DGKs while the PDZ-binding and ankyrin domains regulate interactions with several proteins. DAG kinase zeta contains two copies of the C1 domain. This model corresponds to the second one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	75
410446	cd20896	C1_DGKiota_rpt2	second protein kinase C conserved region 1 (C1 domain) found in diacylglycerol kinase iota (DAG kinase iota) and similar proteins. Diacylglycerol (DAG) kinase (EC 2.7.1.107) is a lipid kinase that phosphorylates diacylglycerol to form phosphatidic acid. DAG kinase iota, also called diglyceride kinase iota (DGK-iota), or DGKI, is a homolog of Drosophila DGK2, RdgA. It may have important cellular functions in the retina and brain. It is classified as a type IV DAG kinase (DGK), containing myristoylated alanine-rich protein kinase C substrate (MARCKS), PDZ-binding, and ankyrin domains, in addition to C1 and catalytic domains that are present in all DGKs. The MARCKS domain regulates the nuclear localizations of type IV DGKs while the PDZ-binding and ankyrin domains regulate interactions with several proteins. DAG kinase iota contains two copies of the C1 domain. This model corresponds to the second one. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	75
411014	cd20897	Smlt3025-like	S. maltophilia immunity protein with unknown function (Smlt3025) and similar proteins. This family includes Smlt3025, an immunity protein of the type IV secretion system (T4SS) found in the multi-drug-resistant opportunistic pathogen Stenotrophomonas maltophilia. Experiments show that Smlt3025 counteracts Smlt3024, an effector protein transferred by the T4SS into target cells. The crystal structure of Smlt3025 reveals a topology similar to the iron-regulated protein FrpD from Neisseria meningitidis which has been shown to interact with the RTX protein FrpC, while its counterpart Smlt3024 is homologous to the N-terminal domain of large Ca2+-binding RTX proteins.	234
412054	cd20900	HopBF1	type III secretion system (T3SS) effector HopBF1 from Ewingella americana, and similar proteins. This family includes the HopBF1 family of bacterial type III secretion system (T3SS) effectors identified as eukaryotic-specific HSP90 protein kinases. HopBF1 adopts a minimal and atypical protein kinase fold such that it is recognized by HSP90 as a host client. Utilizing this "betrayal-like" mechanism to achieve specificity, HopBF1 phosphorylates and inactivates eukaryotic HSP90 by inhibiting the chaperone's ATPase activity. This prevents activation of immune receptors that trigger hypersensitive response in plants, thereby inducing severe disease symptoms in plants infected by certain plant pathogens.	169
411015	cd20901	CC_AF10	coiled coil domain of ALL1-Fused gene from chromosome 10 protein (AF10) and similar proteins. This family includes AF10 (ALL1-Fused gene from chromosome 10 protein) which is one of mixed-lineage leukemia 1 (MLL1)-fusion partners that function in acute myeloid leukemia (AML). Aberration of the mixed-lineage leukemia (MLL) gene is implicated in acute leukemia; chromosomal translocations of MLL1 generate oncogenic chimeric proteins, containing the non-catalytic N-terminal portion of MLL1 fused with many partners such as AF10. The MLL-AF10 fusion oncoprotein recruits DOT1L (disruptor of telomeric-silencing 1-like) to the homeobox A. The aberrant recruitment of DOT1L, a histone methyltransferase that methylates H3 lysine residues (H3K79), by MLL fusions and the resulting H3K79 methylation are thought to affect gene expression by altering chromatin accessibility. AF10 and DOT1L interact through their coiled coil domains.	64
411016	cd20902	CC_DOT1L	coiled coil domain of disruptor of telomeric-silencing 1-like (DOT1L) and similar proteins. This family contains DOT1L (disruptor of telomeric-silencing 1-like), a non-SET domain histone lysine methyltransferase (HKMT) that catalyzes monomethylation, dimethylation, and trimethylation of nucleosomal H3K79. DOT1L is recruited to the homeobox A by AF10 (ALL1-Fused gene from chromosome 10 protein), one of the mixed-lineage leukemia 1 (MLL1)-fusion partners that function in acute myeloid leukemia (AML). Aberration of the MLL gene is implicated in acute leukemia; chromosomal translocations of MLL1 generate oncogenic chimeric proteins, containing the non-catalytic N-terminal portion of MLL1 fused with many partners such as AF10. The aberrant recruitment of DOT1L by MLL fusions and the resulting H3K79 methylation are thought to affect gene expression by altering chromatin accessibility. AF10 and DOT1L interact through their coiled coil domains.	65
411017	cd20903	HCV_p7	Hepatitis C virus p7 protein. Hepatitis C virus (HCV) p7 protein is a viroporin essential for virus production. The p7 monomer is comprised of 2 trans-membrane helices connected by a cytosolic loop, and oligomerizes to form cation-specific ion channels. These ion channels dissipate pH gradients in secretory vesicles potentially protecting acid-labile intracellular virions during egress (the rupturing of the infected cell and release of viral contents). p7 protein has at least two different functions in culture, one via the formation of these ion channels, the other through its specific interaction with the non-structural viral protein NS2. Several compounds targeting p7 have been investigated as anti-HCV drugs.	58
411018	cd20905	EHMT_ZBD	Zinc-binding domain of euchromatic histone lysine methyltransferases EHMT1 and EHMT2. EHMT1 (also known as GLP) and EHMT2 (also known as NG36 and G9a) are histone methyltransferases that methylate the K9 position of histone H3, marking genomic regions for transcriptional repression. They may play a role in the G0/G1 cell cycle transition and are associated with promoting various types of cancer. Mutations in EHMT1 are associated with the genetic disorder Kleefstra syndrome. A functional role for the zinc-binding domain has not been established.	133
411019	cd20907	CBM86	carbohydrate binding module family 86. This family describes what is most likely a xylan-binding module such as found in the Xyn10A protein of Roseburia intestinalis L1-82, which is involved in the extracellular capture and breakdown of xylan.	127
411020	cd20908	SUF4-like	N-terminal domain of Oryza sativa transcription factor SUPPRESSOR OF FRI 4 (OsSUF4), Arabidopsis thaliana SUF4 (AtSUF4), and similar proteins. Oryza sativa SUPPRESSOR OF FRI 4 (OsSUF4) is a C2H2-type zinc finger transcription factor which interacts with the major H3K36 methyltransferase SDG725 to promote H3K36me3 (tri-methylation at H3K36) establishment. The transcription factor OsSUF4 recognizes a specific 7-bp DNA element (5'-CGGAAAT-3'), which is contained in the promoter regions of many genes throughout the rice genome. Through interaction with OsSUF4, SDG725 is recruited to the promoters of key florigen genes, RICE FLOWERING LOCUS T1 (RFT1) and Heading date 3a (Hd3a), for H3K36 deposition to promote gene activation and rice plant flowering. OsSUF4 target genes include a number of genes involved in many biological processes. Flowering plant Arabidopsis SUF4 binds to a 15bp DNA element (5'-CCAAATTTTAAGTTT-3') within the promoter of the floral repressor gene FLOWERING LOCUS C (FLC) and recruits the FRI-C transcription activator complex to the FLC promoter. Although the DNA-binding element and target genes of AtSUF4 are different from those of OsSUF4, AtSUF4 is known to interact with the Arabidopsis H3K36 methyltransferase SDG8 (also known as ASHH2/EFS/SET8), and the methylation deposition mechanism mediated by the SUF4 transcription factor and H3K36 methyltransferase may be conserved in Arabidopsis and rice. Proteins in this family have two conserved C2H2-type zinc finger motifs at the N-terminus (included in this model), and a large proline-rich domain at the C-terminus; for OsSUF4, it has been shown that the N-terminal zinc-finger domain is responsible for DNA binding, and that the C-terminal domain interacts with SDG725.	82
411021	cd20910	NCBD_CREBBP-p300_like	Nuclear Coactivator Binding Domain (NCBD) of CREB (cyclic AMP response element binding protein) binding protein (CREBBP, also known as CBP) and its paralog p300. CREBBP (also called CBP) and its paralog p300, generally referred to as CREBBP/p300, are universal transcriptional coactivators that interact with many important transcription factors and comodulators to activate transcription. The NCBD domain [nuclear coactivator binding domain, also known as IRF-3 binding domain (IBiD) or SRC1 interaction domain (SID)] of CREBBP/p300 behaves as an intrinsically disordered domain in isolation, but folds into helical structures with different topologies upon binding to different ligands such as nuclear receptor coactivator p160, CREBBP interaction domain (CID) from nuclear receptor coactivator 1 (NCOA1 or Src1), NCOA2 (Tif2), and NCOA3 (ACTR), or interferon regulatory factor 3 (IRF-3). In Drosophila, there is only one CREB-binding protein ortholog and it is called nejire, dCBP, CBP/p300, or CBP.	43
411022	cd20912	AIR_RAP80-like	ABRAXAS Interacting Region (AIR) of Receptor-Associated Protein 80 (RAP80), and related domains. RAP80 and ABRAXAS are integral subunits of the BRCA1-A complex which also contains MERIT40 (Mediator of Rap80 Interactions and Targeting 40 kD, also known as BABAM1), BRE (also known as BABAM2, BRCC45 and BRCC4), and BRCC36 (also known as BRCC3). BRCA1-A functions in DNA double-strand break (DSB) repair. RAP80 interacts with the ABRAXAS, MERIT40, and BRE subunits. It is the interaction with ABRAXAS that drives specific incorporation of RAP80 into BRCA1-A. RAP80 contains one SUMO-interacting motif (SIM), two ubiquitin-interacting motifs (UIMs), this AIR, and two zinc finger motifs (ZnF). The SIM and UIM domains recruit BRCA1-A to sites of DNA damage. The AIR is integral in the interaction of RAP80 with the ABRAXAS, MERIT40, and BRE subunits.	59
411023	cd20913	DCAF15-CTD	C-terminal domain of DDB1- and CUL4-associated factor 15. This model represents the C-terminal domain of DCAF15 (DDB1- and CUL4-associated factor 15), the cullin RING ligase substrate receptor/adaptor that forms a complex with CUL4A or CUL4B, as part of the Rbx-Cul4-DDA1-DDB1-DCAF15 E3 ubiquitin ligase that is responsible for the proteasome degradation of certain proteins. Aryl sulfonamide anticancer agents such as indisulam, tasisulam, E7820, and chloroquinoxaline have been shown to recruit the essential mRNA-splicing factor RBM39 to DCAF15. These agents appear to promote binding of DCAF15 to the RNA-recognition motif (RRM) of RBM39, which suggests that derivatives of the aryl-sulfonamides may be used to target other RRM-containing proteins. Cell proliferation is inhibited by these aryl sulfonamides by causing degradation of RBM39, which leads to aberrant processing of pre-mRNA in hundreds of genes, primarily reflected by intron retention and exon skipping, thus collectively referred to as splicing inhibitor sulfonamides, or SPLAMs.	224
411024	cd20917	DCAF15-NTD	N-terminal domain of DDB1- and CUL4-associated factor 15. This model represents the N-terminal domain of DCAF15 (DDB1- and CUL4-associated factor 15), the cullin RING ligase substrate receptor/adaptor that forms a complex with CUL4A or CUL4B, as part of the Rbx-Cul4-DDA1-DDB1-DCAF15 E3 ubiquitin ligase that is responsible for the proteasome degradation of certain proteins. Aryl sulfonamide anticancer agents such as indisulam, tasisulam, E7820, and chloroquinoxaline have been shown to recruit the essential mRNA-splicing factor RBM39 to DCAF15. These agents appear to promote binding of DCAF15 to the RNA-recognition motif (RRM) of RBM39, which suggests that derivatives of the aryl-sulfonamides may be used to target other RRM-containing proteins. Cell proliferation is inhibited by these aryl sulfonamides by causing degradation of RBM39, which leads to aberrant processing of pre-mRNA in hundreds of genes, primarily reflected by intron retention and exon skipping, thus collectively referred to as splicing inhibitor sulfonamides, or SPLAMs.	225
410842	cd20918	polyA_pol_NCLDV	RNA polyadenylate polymerase of nucleocytoplasmic large DNA viruses. This model represents the poly(A) polymerases (PAPs) from nucleocytoplasmic large DNA viruses (NCLDV), a group of giant eukaryotic double-stranded DNA viruses that make up the phylum Nucleocytoviricota. They are referred to as nucleocytoplasmic because they are often able to replicate in both the host's cell nucleus and cytoplasm. PAPs catalyze the attachment of adenylates to the 3' ends of messenger RNA and other RNAs, forming poly(A) tails. PAP acts as a nucleic acid template-independent NMP-transferase, preferentially utilizing a single species of NTP, namely ATP. The polyadenylation state of an mRNA may correlate with the efficiency of its translation. This group includes PAPs from the Poxviridae and Mimiviridae family of viruses. In Vaccinia virus, from the Poxviridae family, polyadenylation is crucial for virion maturation and is carried out by a heterodimer, formed by the catalytic subunit VP55 and the processivity factor (VP39), which is required for the formation of long poly(A) tails. PAPs from Acanthamoeba polyphaga mimivirus and Megavirus chiliensis, which belong to the Mimiviridae family, are homodimeric and intrinsically self-processive, generating >700 nucleotides long poly(A) tails. Homodimerization is required for PAP activity; monomers are able to bind RNA but are enzymatically inactive. Thus, while other PAPs form heterodimers with processivity factors, the Mimiviridae PAPs become processive upon homodimerization. The catalytic subunit of NCLDV PAPs contains two topologically identical subdomains with a nucleotidyltransferase fold, suggesting that an ancestral duplication was at the origin of these viral PAPs.	335
410843	cd20919	polyA_pol_Pox	RNA polyadenylate polymerase catalytic subunit from the Poxviridae family of viruses. Poly(A) polymerases (PAPs) catalyze the attachment of adenylates to the 3' ends of messenger RNA and other RNAs, forming poly(A) tails. PAP acts as a nucleic acid template-independent NMP-transferase, preferentially utilizing a single species of NTP, namely ATP. The polyadenylation state of an mRNA may correlate with the efficiency of its translation. In Vaccinia virus, from the Poxviridae family of viruses, polyadenylation is crucial for virion maturation and is carried out by a heterodimer, formed by the catalytic subunit VP55 and the processivity factor (VP39). In the absence of VP39, oligo(A) tails are added to permissive primers by VP55 in a rapid processive burst, which ceases abruptly after tails have reached 30-35 nucleotides in length. With VP39, tails with lengths in the hundreds of nucleotides are processively synthesized with no abrupt termination of elongation. In contrast to mammalian cells, polyadenylation is not dependent on a multiprotein mRNA 3' end processing complex. VP55 translocates with respect to its single-stranded nucleic acid substrate during poly(A) tail addition. The catalytic subunit of Poxviridae PAPs contains two topologically identical subdomains with a nucleotidyltransferase fold, suggesting that an ancestral duplication was at the origin of these viral PAPs.	461
410844	cd20920	polyA_pol_Mimi	RNA polyadenylate polymerase from the Mimiviridae family of viruses. Poly(A) polymerases (PAPs) catalyze the attachment of adenylates to the 3' ends of messenger RNA and other RNAs, forming poly(A) tails. PAP acts as a nucleic acid template-independent NMP-transferase, preferentially utilizing a single species of NTP, namely ATP. The polyadenylation state of an mRNA may correlate with the efficiency of its translation. PAPs from Acanthamoeba polyphaga mimivirus and Megavirus chiliensis, which belong to the Mimiviridae family, are homodimeric and intrinsically self-processive, generating >700 nucleotides long poly(A) tails. Homodimerization is required for PAP activity; monomers are able to bind RNA but are enzymatically inactive. Thus, while other PAPs form heterodimers with processivity factors, the Mimiviridae PAPs become processive upon homodimerization. mRNA polyadenylation in Mimiviridae occurs at hairpin-forming palindromic sequences terminating viral transcripts. The catalytic subunit of Mimiviridae PAPs contains two topologically identical subdomains with a nucleotidyltransferase fold, suggesting that an ancestral duplication was at the origin of these viral PAPs.	449
410845	cd20921	polyA_pol_Pycodna	RNA polyadenylate polymerase from the Phycodnaviridae family of viruses. Poly(A) polymerases (PAPs) catalyze the attachment of adenylates to the 3' ends of messenger RNA and other RNAs, forming poly(A) tails. PAP acts as a nucleic acid template-independent NMP-transferase, preferentially utilizing a single species of NTP, namely ATP. The polyadenylation state of an mRNA may correlate with the efficiency of its translation. The catalytic subunit of NCLDV PAPs contains two topologically identical subdomains with a nucleotidyltransferase fold, suggesting that an ancestral duplication was at the origin of these viral PAPs.	357
410846	cd20922	polyA_pol_Marseille	RNA polyadenylate polymerase from the Marseilleviridae family of viruses. Poly(A) polymerases (PAPs) catalyze the attachment of adenylates to the 3' ends of messenger RNA and other RNAs, forming poly(A) tails. PAP acts as a nucleic acid template-independent NMP-transferase, preferentially utilizing a single species of NTP, namely ATP. The polyadenylation state of an mRNA may correlate with the efficiency of its translation. The catalytic subunit of NCLDV PAPs contains two topologically identical subdomains with a nucleotidyltransferase fold, suggesting that an ancestral duplication was at the origin of these viral PAPs.	316
410847	cd20923	polyA_pol_Fausto	RNA polyadenylate polymerase from Faustovirus. Poly(A) polymerases (PAPs) catalyze the attachment of adenylates to the 3' ends of messenger RNA and other RNAs, forming poly(A) tails. PAP acts as a nucleic acid template-independent NMP-transferase, preferentially utilizing a single species of NTP, namely ATP. The polyadenylation state of an mRNA may correlate with the efficiency of its translation. The catalytic subunit of NCLDV PAPs contains two topologically identical subdomains with a nucleotidyltransferase fold, suggesting that an ancestral duplication was at the origin of these viral PAPs.	351
410848	cd20924	polyA_pol_Asfar	RNA polyadenylate polymerase from the Asfarviridae family of viruses. Poly(A) polymerases (PAPs) catalyze the attachment of adenylates to the 3' ends of messenger RNA and other RNAs, forming poly(A) tails. PAP acts as a nucleic acid template-independent NMP-transferase, preferentially utilizing a single species of NTP, namely ATP. The polyadenylation state of an mRNA may correlate with the efficiency of its translation. The catalytic subunit of NCLDV PAPs contains two topologically identical subdomains with a nucleotidyltransferase fold, suggesting that an ancestral duplication was at the origin of these viral PAPs.	336
409519	cd20925	IgV_CD28	Immunoglobulin Variable (IgV) domain Cluster of Differentiation (CD) 28. The members here are composed of the immunoglobulin variable region (IgV) of Cluster of Differentiation (CD) 28. CD28 is one of the proteins expressed on T cells that provide co-stimulatory signals required for T cell activation and survival. CD28 is the receptor for CD80 (B7.1) and CD86 (B7.2) proteins. CD28 consists of a paired V-set of immunoglobulin (Ig) superfamily domains attached to single-transmembrane domains and cytoplasmic domains that contain the MYPPY motif, which is involved in binding B7.1 or B7.2. CD28 is very similar to CTLA-4 (cytotoxic T-lymphocyte-associated protein 4, also known as CD152 (cluster of differentiation 152)), which is involved in the regulation of T cell response, acting as an inhibitor of intracellular signaling. CTLA-4 also binds the B7 molecules (B7.1 and B7.2) with a higher affinity than does CD28. The B7/CTLA-4 interaction generates inhibitory signals down-regulating the response, and may prevent T cell activation by weak TCR signals. CD28 and CTLA-4 then elicit opposing signals in the regulation of T cell responsiveness and homeostasis. The IgSF is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. The two sheets are linked together by a conserved disulfide bond between B strand and F strand. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. The N-terminal Ig-like domain of CD28 is a member of the V-set Ig domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C'-C" in the other. However, each CD28-B7 family member is slightly different, some have an IgV domain which lacks an A' or C" strand.	117
409520	cd20926	IgV_NKp30	Immunoglobulin variable (IgV) domain of Natural Killer cell activating receptor NKp30 and similar domains. The members here are composed of the immunoglobulin variable region (IgV) of Natural Killer cell activating receptor NKp30 (also known as Natural Cytotoxicity triggering Receptor 3 (NCR3)) and similar domains. NKp30 recognizes the N-terminal IgV domain of B7-H6. In humans, the activating natural cytotoxicity receptor NKp30 plays a major role in NK cell-mediated tumor cell lysis. NKp30 recognizes the cell-surface protein B7-H6, which is expressed on tumor, but not healthy, cells.	112
409521	cd20927	IgI_Titin_M1-like	Immunoglobulin-like M1 domain from Titin; a member of the I-set of IgSF domains. The members here are composed of the Immunoglobulin-like M1 I-set domain from Titin and similar proteins. Titin is a key component in the assembly and functioning of vertebrate striated muscles. By providing connections at the level of individual microfilaments, it contributes to the fine balance of forces between the two halves of the sarcomere. The Ig superfamily (IgSF) is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. Unlike the V-set, one of the distinctive features of I-set domains is the lack of a C" strand. The structure of the titin-M1 domain lacks this strand and thus it belongs to the I-set of the IgSF. I-set domains are found in several cell adhesion molecules (such as VCAM, ICAM, and MADCAM), and are also present in numerous other diverse protein families, including several tyrosine-protein kinase receptors, the hemolymph protein hemolin, the muscle proteins titin, telokin, and twitchin, the neuronal adhesion molecule axonin-1, and the signaling molecule semaphorin 4D that is involved in axonal guidance, immune function and angiogenesis.	90
409522	cd20928	IgI_BTLA	Extracellular Immunoglobulin (Ig) domain of the B and T lymphocyte attenuator (BTLA); member of the I-set Ig superfamily domains. The members here are composed of the extracellular immunoglobulin (Ig) domain of the B and T lymphocyte attenuator (BTLA; also known as CD270). BTLA is a type I transmembrane glycoprotein that is structurally similar to the CD28 family of T cell co-stimulatory or coinhibitory molecules. BTLA is a coinhibitory molecule expressed on T cells, B cells, macrophages, dendritic and natural killer (NK) cells. Unlike CD28 family members, BTLA interacts with the tumor necrosis factor receptor superfamily member HVEM (herpes virus entry mediator) rather than with B7 family ligands. In addition, BTLA does not form a homodimer. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. In contrast to CD28 family members, the structure of the BTLA extracellular Ig domain lacks a C" strand and thus is better described as a member of the I-set of Ig domains. I-set domains are found in several cell adhesion molecules, including vascular (VCAM), intercellular (ICAM), neural (NCAM) and mucosal addressin (MADCAM) cell adhesion molecules, as well as junction adhesion molecules (JAM).	101
409523	cd20929	IgV_CD22_d1	First immunoglobulin domain of Cluster of Differentiation (CD) 22; member of the V-set of IgSF domains. The members here are composed of the first immunoglobulin domain in Cluster of Differentiation (CD) 22 (also known as Siglec-2). CD22, a sialic-acid binding immunoglobulin type-lectin (Siglec) family member, is an inhibitory co-receptor of the B-cell receptor (BCR). The inhibitory function of CD22 and its restricted expression on B cells makes CD22 an attractive target against dysregulated B cells that cause autoimmune diseases and B-cell-derived cancers. CD22 plays a vital role in establishing a baseline level of B-cell inhibition, and thus is an important determinant of homeostasis in humoral immunity. Siglecs are primarily expressed on immune cells and recognize sialic acid-containing glycan ligands. Siglecs are organized as an extracellular module composed of Ig-like domains (an N-terminal variable set of Ig-like carbohydrate recognition domains, and 1 to 16 constant Ig-like domains), followed by transmembrane and short cytoplasmic domains. Human Siglecs are classified into two subgroups, one subgroup is comprised of sialoadhesin (Siglec-1), CD22 (Siglec-2), and MAG (Siglec-4, myelin-associated glycoprotein), the other subgroup is comprised of CD33-related Siglecs which include CD33 (Siglec-3) and human Siglecs 5-11. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. This group belongs to the V-set of IgSF domains.	114
409524	cd20930	Ig3_Nectin-5_like	Third immunoglobulin domain of Nectin-like Protein-5, and similar domains. The members here are composed of the third immunoglobulin domain of Nectin-like Protein-5 (also known as Cluster of Differentiation 155 (CD155)). Nectin-like Protein-5 mediates NK (Natural Killer) cell adhesion and triggers NK cell effector functions. CD155 binds two different NK cell receptors: CD96 and CD226. These interactions accumulate at the cell-cell contact site, leading to the formation of a mature immunological synapse between NK cell and target cell. This may trigger adhesion and secretion of lytic granules and IFN-gamma and activate cytotoxicity of activated NK cells. CD155 may also promote NK cell-target cell modular exchange, and PVR transfer to the NK cell. This transfer is more important in some tumor cells expressing a lot of PVR, and may trigger fratricide NK cell activation, providing tumors with a mechanism of immunoevasion. Moreover, CD155 plays a role in mediating tumor cell invasion and migration.	86
409525	cd20931	Ig3_IL1RAP	Third immunoglobulin domain of interleukin-1 receptor accessory protein (IL1RAP). The members here are composed of the third immunoglobulin (Ig) domain of interleukin-1 receptor accessory protein (IL1RAP). The interleukin 1 receptor accessory protein (IL-1RAP), also known as IL-1R3, is a coreceptor of type 1 interleukin 1 receptor (IL-1R1) and is required for transmission of IL-1 signaling. The activated IL-1 receptor complex, which consists of IL-1R1 and IL-1RAP, induces multiple cellular responses including NF-kappa-B activation, IL-2 secretion, and IL-2 promoter activation. Signaling involves the recruitment of adapter molecules such as TOLLIP, MYD88, and IRAK1 or IRAK2 via the respective Toll/IL-1 receptor (TIR) domains of the receptor/coreceptor subunits. Moreover, IL1RAP is known to be the accessory co-receptor that activates signal transduction upon IL-36 binding to IL-36R. IL-36 cytokines, which are a subfamily of the IL-1 superfamily, bind to the IL-36 receptor (IL-36R) and use IL1RAP as a co-receptor.	107
409526	cd20932	Ig3_IL1R_like	Third immunoglobulin (Ig)-like domain of interleukin-1 receptor (IL1R), and similar domains. The members here are composed of the third immunoglobulin (Ig)-like domain of interleukin-1 receptor (IL1R) and similar proteins. Members of this family are characterized by extracellular immunoglobulin-like domains and intracellular Toll/Interleukin-1R (TIR) domain. Three naturally occurring ligands for the IL-1 receptor (IL1R) are known: the agonists IL-1alpha and IL-1beta and the IL-1-receptor antagonist IL1RA. IL-1Rs are involved in immune host defense and hematopoiesis. After binding to interleukin-1, IL1R associates with the coreceptor IL1RAP (interleukin 1 receptor accessory protein, also known as IL-1R3) to form the high affinity interleukin-1 receptor complex, which induces multiple cellular responses including NF-kappa-B activation, IL-2 secretion, and IL-2 promoter activation. Signaling involves the recruitment of adapter molecules such as TOLLIP, MYD88, and IRAK1 or IRAK2 via the respective TIR domains of the receptor/coreceptor subunits. IL1R binds ligands with comparable affinity to its antagonist IL1RA, and binding of IL1RA to IL1R, prevents association of the latter with IL1RAP to form a signaling complex.	104
409527	cd20933	Ig_ch-CD3_epsilon_like	Immunoglobulin (Ig)-like domain of chicken Cluster of Differentiation (CD) 3 epsilon chain and similar proteins. The members here are composed of the immunoglobulin (Ig)-like domain of chicken Cluster of Differentiation (CD) 3 epsilon chain and similar proteins. CD3 is a T cell surface receptor that is associated with alpha/beta T cell receptors (TCRs). The CD3 complex consists of one gamma, one delta, two epsilon, and two zeta chains. The CD3 subunits form heterodimers as gamma/epsilon, delta/epsilon, and zeta/zeta. The gamma, delta, and epsilon chains each contain an extracellular Ig domain, whereas the extracellular domains of the zeta chains are very small and have unknown structure. The CD3 domain participates in intracellular signaling once the TCR has bound an MHC/antigen complex. The chicken CD3epsilon Ig domain has low sequence identity with human (22%) and mouse (24%) CD3epsilon, but overall is structurally very similar over the entire domain.	63
409528	cd20934	IgV_B7-H3	Immunoglobulin Variable (IgV) domain of B7-H3, a member of the B7 family of immune checkpoint molecules. The members here are composed of the immunoglobulin variable (IgV) domain of B7-H3 (also known as CD276), a member of the B7 family of immune checkpoint molecules. B7-H3 is an important immune checkpoint member of the B7 family and shares homology with other B7 ligands such as programmed death ligand 1 (PD-L1). The B7 family molecules interact with CD28 on T-cells to provide co-stimulatory signals that regulate T-cell activation and T-helper cell differentiation. Although B7-H3 has been shown to have both co-stimulatory and co-inhibitory effects on T-cell responses, the most current studies describe B7-H3 as a T cell inhibitor that promotes tumor aggressiveness and proliferation. Moreover, B7-H3 is highly overexpressed on a wide range of human solid cancers and promotes tumor growth, metastasis, and drug resistance. Thus, B7-H3 expression in tumors often correlates with both negative prognosis and poor clinical outcome in cancer patients. B7-H3 protein contains a predicted signal peptide, V- and C-like Ig domains (IgV and IgC), a transmembrane region, and an intracellular tail.	115
409529	cd20935	IgV_B7-H2	Immunoglobulin Variable (IgV) domain of B7-H2 (B7 homolog 2). The members here are composed of the immunoglobulin variable (IgV) domain of B7-H2 (B7 homolog 2 also known as ICOSL (inducible T cell costimulator ligand) or CD275). B7-H2 is a ligand for the T-cell-specific cell surface receptor ICOS and acts as a costimulatory signal for T-cell proliferation and cytokine secretion. The interaction of ICOS with ICOSL (B7-H2) regulates T cell activation and expansion, is involved in T cell dependent B cell activation, and T-helper cell differentiation. It is a member of the B7 family of immune regulatory proteins and shares homology with other B7 ligands, such as B7-1, B7-2, B7-H1 (PD-L1), PD-L2, and B7-H3. The extracellular domains of B7 proteins contain two Ig-like domains and all members have short cytoplasmic domains. These ligands are typically expressed on antigen presenting cells (such as macrophages, B cells and dendritic cells) and have the ability to regulate T-cell proliferation and function. Tumor cells are also capable of expressing the B7 family members in order to evade immune surveillance.	113
409530	cd20936	IgI_3_CSF-1R	Third immunoglobulin domain of the hematopoietic colony-stimulating factor 1 receptor (CSF-1R), and similar domains; member of the I-set of IgSF domains. The members here are composed of the third immunoglobulin domain of the hematopoietic colony-stimulating factor 1 receptor (CSF-1R) and similar proteins. CSF-1R, a class III receptor tyrosine kinase (RTKIII), is critical to the survival, proliferation, and differentiation of mononuclear phagocytic cells such as monocytes, tissue macrophages, muscularis macrophages, microglia, osteoclasts, Paneth cells, and myeloid dendritic cells. Human colony-stimulating factor 1 receptor (hCSF-1R) is unique among the hematopoietic receptors because it is activated by two distinct cytokines, CSF-1 and interleukin-34 (IL-34). The Ig superfamily (IgSF) is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. This group belongs to the I-set of IgSF domains.	93
409531	cd20937	IgC2_CD22_d3	Third immunoglobulin domain in Cluster of Differentiation (CD) 22; member of the Constant 2 (C2)-set of IgSF domains. The members here are composed of the third immunoglobulin domain in Cluster of Differentiation (CD) 22 (also known as Siglec-2).  CD22, a sialic-acid binding immunoglobulin type-lectin (Siglec) family member, is an inhibitory co-receptor of the B-cell receptor (BCR). The inhibitory function of CD22 and its restricted expression on B cells makes CD22 an attractive target against dysregulated B cells that cause autoimmune diseases and B-cell-derived cancers. CD22 plays a vital role in establishing a baseline level of B-cell inhibition, and thus is an important determinant of homeostasis in humoral immunity. Siglecs are primarily expressed on immune cells and recognize sialic acid-containing glycan ligands. Siglecs are organized as an extracellular module composed of Ig-like domains (an N-terminal variable set of Ig-like carbohydrate recognition domains, and 1 to 16 constant Ig-like domains), followed by transmembrane and short cytoplasmic domains. Human Siglecs are classified into two subgroups, one subgroup is comprised of sialoadhesin (Siglec-1), CD22 (Siglec-2), and MAG (Siglec-4, myelin-associated glycoprotein), the other subgroup is comprised of CD33-related Siglecs which include CD33 (Siglec-3) and human Siglecs 5-11. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. This group belongs to the C2-set of IgSF domains, having A, B, and E strands in one beta-sheet and A', G, F, C' in the other. Unlike other Ig domain sets, the C2-set lacks the D strand.	88
409532	cd20938	IgC1_CD22_d2	Second immunoglobulin domain of Cluster of Differentiation (CD) 22; member of the Constant 1 (C1)-set of IgSF domains. The members here are composed of the second immunoglobulin domain of clusters of differentiation (CD) 22 (also known as Siglec-2). CD22, a sialic-acid binding immunoglobulin type-lectin (Siglec) family member, is an inhibitory co-receptor of the B-cell receptor (BCR). The inhibitory function of CD22 and its restricted expression on B cells makes CD22 an attractive target against dysregulated B cells that cause autoimmune diseases and B-cell-derived cancers. CD22 plays a vital role in establishing a baseline level of B-cell inhibition, and thus is an important determinant of homeostasis in humoral immunity. Siglecs are primarily expressed on immune cells and recognize sialic acid-containing glycan ligands. Siglecs are organized as an extracellular module composed of Ig-like domains (an N-terminal variable set of Ig-like carbohydrate recognition domains, and 1 to 16 constant Ig-like domains), followed by transmembrane and short cytoplasmic domains. Human Siglecs are classified into two subgroups, one subgroup is comprised of sialoadhesin (Siglec-1), CD22 (Siglec-2), and MAG (Siglec-4, myelin-associated glycoprotein), the other subgroup is comprised of CD33-related Siglecs which include CD33 (Siglec-3) and human Siglecs 5-11. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. This group belongs to the C1-set of IgSF domains.	98
409533	cd20939	IgC2_D1_IL-6RA	Immunoglobulin-like domain D1 of interleukin-6 receptor alpha-chain (IL-6RA, also known as CD126); member of the C2-set of IgSF domains. The members here are composed of the immunoglobulin-like domain D1 of interleukin-6 receptor alpha-chain (IL-6RA, also known as CD126). The IL-6RA ectodomain is highly modular, consisting of three domains (D1, D2, and D3). Interleukin-6 (IL-6) is a multifunctional cytokine that regulates the immune response, hemopoiesis, the acute phase response and inflammation. It is generated in an infectious lesion and sends out a warning signal to the entire body. IL-6 binds first to its cognate alpha-chain receptor (IL-6R), and the IL-6/IL-6R complex in turn induces homodimerization of gp130. As a result, a high-affinity functional receptor complex of IL-6, IL-6R and gp130 is formed, and subsequently the complex triggers a downstream signal cascade. Aberrant production of IL-6 and its receptor (IL-6R) are implicated in the pathogenesis of multiple myeloma, autoimmune diseases and prostate cancer. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. This group belongs to the C2-set of IgSF domains. Unlike other Ig domain sets, the C2-set lacks the D strand.	77
409534	cd20940	Ig0_BSG1	Immunoglobulin-like Ig0 domain of basigin-1 (BSG1) and similar proteins. The members here are composed of the immunoglobulin (Ig) domain of the collagenase stimulatory factor, basigin-1 (BSG1; also known as Cluster of Differentiation 147 (CD147) and Extracellular Matrix Metalloproteinase Inducer (EMMPRIN)) and similar proteins.  CD147 is a transmembrane glycoprotein that belongs to the immunoglobulin superfamily. It is expressed in nearly all cells including platelets and fibroblasts and is involved in inflammatory diseases, and cancer progression. CD147 is highly expressed in several cancers and used as a prognostic marker. The two primary isoforms of CD147 that are related to cancer progression have been identified: CD147 Ig1-Ig2 (also called Basigin-2) that is ubiquitously expressed in most tissues and CD147 Ig0-Ig1-Ig2 (also called Basigin-1) that is retinal specific and implicated in retinoblastoma. Studies showed that CD147 Ig0 domain is a potent stimulator of interleukin-6 and suggest that the CD147 Ig0 dimer is the functional unit required for activity.	116
409535	cd20942	IgI_MAdCAM-1	Immunoglobulin-like domain of Mucosal addressin cell-adhesion molecule (MAdCAM-1); member of the I-set of IgSF domains. The members here include the immunoglobulin-like domain of Mucosal addressin cell-adhesion molecule (MAdCAM-1). MadCAM-1 is an endothelial cell adhesion molecule that interacts preferentially with the leukocyte beta7 integrin LPAM-1 (alpha4beta7), L-selectin, and VLA-4 (alpha4beta1) on myeloid cells to direct leukocytes into mucosal and inflamed tissues. MadCAM-1 is expressed primarily on HEV of Peyer's patches and on venules in small intestinal lamina propria, on the marginal sinus of the spleen, and on HEV of embryonic lymph nodes. It is a member of the immunoglobulin superfamily (IgSF), which is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. The two sheets are linked together by a conserved disulfide bond between B strand and F strand. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. The first Ig-like domain of MAdCAM-1 is a member of the I-set IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, the A strand of the I-set is discontinuous, but lacks a C" strand.	88
409536	cd20943	IgI_VCAM-1	First immunoglobulin-like domain of vascular endothelial cell adhesion molecule-1 (VCAM-1), and similar domains; member of the I-set of IgSF domains. The members here include the first immunoglobulin-like domain of vascular endothelial cell adhesion molecule-1 (VCAM-1; also known as Cluster of Differentiation 106 (CD106)) and similar proteins. During the inflammation process, these molecules recruit leukocytes onto the vascular endothelium before extravasation to the injured tissues. The interaction of VCAM-1 binding to the beta1 integrin very late antigen (VLA-4) expressed by lymphocytes and monocytes mediates the adhesion of leucocytes to blood vessel walls, and regulates migration across the endothelium. During metastasis, some circulating cancer cells extravasate to a secondary site by a similar process. VCAM-1 may be involved in organ targeted tumor metastasis and may also act as host receptors for viruses and parasites. VCAM-1 contains seven Ig domains. The Ig superfamily (IgSF) is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. The two sheets are linked together by a conserved disulfide bond between B strand and F strand. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. The first Ig-like domain of VCAM-1 is a member of the I-set IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, the A strand of the I-set is discontinuous, but lacks a C" strand.	89
409537	cd20944	IgI_N_ICAM1-2-3	N-terminal immunoglobulin domain of the intercellular adhesion molecules ICAM-1 (Cluster of Differentiation 54 or CD54), ICAM-2 (CD102) and ICAM-3 (CD50); members of the I-set of IgSF domains. The members here are composed of the immunoglobulin (Ig) domain found in the N-terminus of the intercellular adhesion molecules ICAM-1 (Cluster of Differentiation 54 or CD54), ICAM-2 (CD102), and ICAM-3 (CD50). ICAM-1, ICAM-2, and ICAM-3 mediate a variety of critical intercellular adhesion events in the immune system through interactions with their counter-receptors, the beta2-integrins LFA-1 (CD11a/CD18), Mac-1 (CD11b/CD18), p150,95 (CD11c/CD18), and CD11d/CD18. The ICAMs are type I transmembrane glycoproteins belonging to the immunoglobulin superfamily (IgSF). The binding of the ICAM family members with the beta2-integrins physically stabilizes interactions between pairs of T and B cells, T cells and antigen-presenting cells (APCs), and brings effector cells such as cytotoxic T lymphocytes (CTLs) and natural killer (NK) cells into close proximity to their target cells. All three ICAMs share a common polypeptide homology and structural motif, and the ability to bind LFA-1. The distinct functional role of each ICAM is affected by their relative affinities for LFA-1 (ICAM-1 > ICAM-2 > ICAM-3). ICAM-1 is expressed in most tissues at low levels, and expression is increased by inflammatory cytokines. In contrast, ICAM-2 is expressed predominantly on endothelium and leukocytes (except neutrophils), and its expression generally is not responsive to cytokines. ICAM-3 is expressed on leukocytes and Langerhans cells, but not on resting, cytokine-induced endothelium, or nonhematopoietic tissues.	81
409538	cd20946	IgV_1_JAM1-like	First Ig-like domain of Junctional adhesion molecule-1 (JAM1) and similar domains; a member of the V-set of IgSF domains. The members here are composed of the first Ig-like domain of Junctional Adhesion Molecule-1 (JAM1) and similar domains. JAM1 is an immunoglobulin superfamily (IgSF) protein with two Ig-like domains in its extracellular region; it plays a role in the formation of endothelial and epithelial tight junction and acts as a receptor for mammalian reovirus sigma-1. The IgSF is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. The two sheets are linked together by a conserved disulfide bond between B strand and F strand. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. The first Ig-like domain of JAM1 is a member of the V-set Ig domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C'-C" in the other.	102
409539	cd20947	IgV_PDl1	Immunoglobulin Variable (IgV) domain of Programmed death ligand 1 (PD-L1). The members here are composed of the immunoglobulin variable (IgV) domain of Programmed death ligand 1 (PD-L1; also known as Cluster of Differentiation 274 (CD274)). PD-L1 is a cell-surface ligand that competes with PD-L2 for binding to the immunosuppressive receptor programmed death-1 (PD-1). PD-1 is a member of the B7 family that plays an important role in negatively regulating immune responses upon interaction with its two ligands, PD-L1 or PD-L2. Like PD-L2, PD-L1 interacts with PD-1 and suppresses T cell proliferation and cytokine production. The PD-1 receptor is expressed on the surface of activated T cells, while PD-L1 is expressed on cancer cells. When PD-1 and PD-L1 bind together, they form a molecular shield protecting tumor cells from being destroyed by the immune system. Thus, inhibiting the binding of PD-L1 to PD-1 with an antibody leads to killing of tumor cells by T cells. PD-1 inhibitors (such as Pembrolizumab, Nivolumab, and Cemiplimab) and PD-L1 inhibitors (such as Atezolizumab, Avelumab, and Durvalumab) are an emerging class of immunotherapy that stimulate lymphocytes against tumor cells.	110
409540	cd20948	IgC2_CEACAM5-like	Fifth immunoglobulin (Ig)-like domain of the carcinoembryonic antigen (CEA) related cell adhesion molecule 5 (CEACAM5) and similar domains; member of the C2-set IgSF domains. The members here are composed of the fifth immunoglobulin (Ig)-like domain of the carcinoembryonic antigen (CEA) related cell adhesion molecule 5 (CEACAM5) and similar domains. The CEA family is a group of anchored or secreted glycoproteins, expressed by epithelial cells, leukocytes, endothelial cells and placenta. The CEA family is divided into the CEACAM and pregnancy-specific glycoprotein (PSG) subfamilies. Carcinoembryonic antigen-related cell adhesion molecule 5 (CEACAM5), also known as CD66e (Cluster of Differentiation 66e), is a cell surface glycoprotein that plays a role in cell adhesion, intracellular signaling and tumor progression. Diseases associated with CEACAM5 include lung cancer and rectum cancer. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. This group belongs to the C2-set of IgSF domains, having A, B, and E strands in one beta-sheet and A', G, F, C' in the other. Unlike other Ig domain sets, the C2-set lacks the D strand.	76
409541	cd20949	IgI_Twitchin_like	C-terminal immunoglobulin-like domain of the myosin-associated giant protein kinase Twitchin, and similar domains; member of the I-set IgSF domains. The members here are composed of the C-terminal immunoglobulin-like domain of the myosin-associated giant protein kinase Twitchin and similar proteins, including Caenorhabditis elegans and Aplysia californica Twitchin, Drosophila melanogaster Projectin, and similar proteins. These are very large muscle proteins containing multiple immunoglobulin (Ig)-like and fibronectin type III (FN3) domains and a single kinase domain near the C-terminus. In humans these proteins are called Titin. The Ig superfamily (IgSF) is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. The Ig-like domain of the Twitchin is a member of the I-set IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand but lack a C" strand.  I-set domains are found in several cell adhesion molecules (such as VCAM, ICAM, and MADCAM), and are also present in numerous other diverse protein families, including several tyrosine-protein kinase receptors, the hemolymph protein hemolin, the muscle proteins (titin, telokin, and twitchin), the neuronal adhesion molecule axonin-1, and the signaling molecule semaphorin 4D.	89
409542	cd20950	IgI_2_JAM1	Second Ig-like domain of Junctional adhesion molecule-1 (JAM1); a member of the I-set of IgSF domains. The members here are composed of the second Ig-like domain of Junctional adhesion molecule-1 (JAM1). JAM1 is an immunoglobulin superfamily (IgSF) protein with two Ig-like domains in its extracellular region; it plays a role in the formation of endothelial and epithelial tight junction and acts as a receptor for mammalian reovirus sigma-1. The IgSF is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. The two sheets are linked together by a conserved disulfide bond between B strand and F strand. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. The second Ig-like domain of JAM1 is a member of the I-set Ig domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, the A strand of the I-set is discontinuous but lacks a C" strand. I-set domains are found in several cell adhesion molecules (such as VCAM, ICAM, and MADCAM), and are also present in numerous other diverse protein families, including several tyrosine-protein kinase receptors.	97
409543	cd20951	IgI_titin_I1-like	Immunoglobulin domain I1 of the titin I-band and similar proteins; a member of the I-set of IgSF domains. The members here are composed of the immunoglobulin domain I1 of the titin I-band and similar proteins. Titin is a key component in the assembly and functioning of vertebrate striated muscles. By providing connections at the level of individual microfilaments, it contributes to the fine balance of forces between the two halves of the sarcomere. The Ig superfamily (IgSF) is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. The two sheets are linked together by a conserved disulfide bond between B strand and F strand. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. The Ig I1 domain of the titin I-band is a member of the I-set Ig domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand but lack a C" strand. I-set domains are found in several cell adhesion molecules (such as VCAM, ICAM, and MADCAM), and are also present in numerous other diverse protein families, including several tyrosine-protein kinase receptors, the hemolymph protein hemolin, the muscle proteins titin, telokin, and twitchin, the neuronal adhesion molecule axonin-1, and the signaling molecule semaphorin 4D that is involved in axonal guidance, immune function and angiogenesis.	94
409544	cd20952	IgI_5_Robo	Fifth Ig-like domain of Roundabout (Robo) homolog 1/2, and similar domains; a member of the I-set of IgSF domains. The members here are composed of the fifth Ig-like domain of Roundabout (Robo) homolog 1/2 and similar domains. Robo receptors play a role in the development of the central nervous system (CNS), and are receptors of Slit protein. Slit is a repellant secreted by the neural cells in the midline. Slit acts through Robo to prevent most neurons from crossing the midline from either side. Three mammalian Robo homologs (Robo1, -2, and -3), and three mammalian Slit homologs (Slit-1, -2, -3), have been identified. Commissural axons, which cross the midline, express low levels of Robo; longitudinal axons, which avoid the midline, express high levels of Robo. Robo1, -2, and -3 are expressed by commissural neurons in the vertebrate spinal cord and Slits 1, -2, -3 are expressed at the ventral midline. Robo-3 is a divergent member of the Robo family which instead of being a positive regulator of slit responsiveness, antagonizes slit responsiveness in precrossing axons. The Slit-Robo interaction is mediated by the second leucine-rich repeat (LRR) domain of Slit and the two N-terminal Ig domains of Robo, Ig1 and Ig2. The primary Robo binding site for Slit2 has been shown by surface plasmon resonance experiments and mutational analysis to be the Ig1 domain, while the Ig2 domain has been proposed to harbor a weak secondary binding site. The fifth Ig-like domain of Robo 1 and 2 is a member of the I-set Ig domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand but lack a C" strand. I-set domains are found in several cell adhesion molecules (such as VCAM, ICAM, and MADCAM), and are also present in numerous other diverse protein families, including several tyrosine-protein kinase receptors.	87
409545	cd20953	IgI_2_Dscam	Second immunoglobulin domain of the Drosophila melanogaster Dscam protein, and similar domains; a member of the I-set of IgSF domains. The members here are composed of the second immunoglobulin domain of the Drosophila melanogaster Down syndrome cell adhesion molecule (DSCAM) protein and similar proteins. DSCAM is a cell adhesion molecule that plays critical roles in neural development, including axon guidance and branching, axon target recognition, self-avoidance and synaptic formation. DSCAM belongs to the immunoglobulin superfamily and contributes to defects in the central nervous system in Down syndrome patients. Vertebrate DSCAMs differ from Drosophila Dscam1 in that they lack the extensive alternative splicing that occurs in the insect gene. Drosophila melanogaster Dscam has 38,016 isoforms generated by the alternative splicing of four variable exon clusters, which allows every neuron in the fly to display a distinctive set of Dscam proteins on its cell surface. Drosophila Dscam1 is a cell-surface protein that plays important roles in neural development and axon tiling of neurons. It is shown that thousands of isoforms bind themselves through specific homophilic (self-binding) interactions, a process which mediates cellular self-recognition. Drosophila Dscam2 is also alternatively spliced and plays a key role in the development of two visual system neurons, monopolar cells L1 and L2. This group is a member of the I-set Ig domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand but lack a C" strand.	95
409546	cd20954	IgI_7_Dscam	Seventh immunoglobulin domain of the Drosophila melanogaster Dscam protein, and similar domains; a member of the I-set of IgSF domains. The members here are composed of the seventh immunoglobulin domain of the Drosophila melanogaster Down syndrome cell adhesion molecule (DSCAM) protein and similar proteins. Down syndrome cell adhesion molecule (DSCAM) is a cell adhesion molecule that plays critical roles in neural development, including axon guidance and branching, axon target recognition, self-avoidance and synaptic formation. DSCAM belongs to the immunoglobulin superfamily and contributes to defects in the central nervous system in Down syndrome patients. Vertebrate DSCAMs differ from Drosophila Dscam1 in that they lack the extensive alternative splicing that occurs in the insect gene. Drosophila melanogaster Dscam has 38,016 isoforms generated by the alternative splicing of four variable exon clusters, which allows every neuron in the fly to display a distinctive set of Dscam proteins on its cell surface. Drosophila Dscam1 is a cell-surface protein that plays important roles in neural development and axon tiling of neurons. It is shown that thousands of isoforms bind themselves through specific homophilic (self-binding) interactions, a process which mediates cellular self-recognition. Drosophila Dscam2 is also alternatively spliced and plays a key role in the development of two visual system neurons, monopolar cells L1 and L2. This group is a member of the I-set Ig domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand but lack a C" strand.	96
409547	cd20955	IgI_1_Dscam	First immunoglobulin domain of the Drosophila melanogaster Dscam protein, and similar domains; a member of the I-set of IgSF domains. The members here are composed of the first immunoglobulin domain of the Drosophila melanogaster Down syndrome cell adhesion molecule (DSCAM) protein and similar proteins. Down syndrome cell adhesion molecule (DSCAM) is a cell adhesion molecule that plays critical roles in neural development, including axon guidance and branching, axon target recognition, self-avoidance and synaptic formation. DSCAM belongs to the immunoglobulin superfamily and contributes to defects in the central nervous system in Down syndrome patients. Vertebrate DSCAMs differ from Drosophila Dscam1 in that they lack the extensive alternative splicing that occurs in the insect gene. Drosophila melanogaster Dscam has 38,016 isoforms generated by the alternative splicing of four variable exon clusters, which allows every neuron in the fly to display a distinctive set of Dscam proteins on its cell surface. Drosophila Dscam1 is a cell-surface protein that plays important roles in neural development and axon tiling of neurons. It is shown that thousands of isoforms bind themselves through specific homophilic (self-binding) interactions, a process which mediates cellular self-recognition. Drosophila Dscam2 is also alternatively spliced and plays a key role in the development of two visual system neurons, monopolar cells L1 and L2. This group is a member of the I-set Ig domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand but lack a C" strand.	99
409548	cd20956	IgI_4_Dscam	Fourth immunoglobulin domain of the Drosophila melanogaster Dscam protein, and similar domains; a member of the I-set of IgSF domains. The members here are composed of the fourth immunoglobulin domain of the Drosophila melanogaster Down syndrome cell adhesion molecule (DSCAM) protein and similar proteins. Down syndrome cell adhesion molecule (DSCAM) is a cell adhesion molecule that plays critical roles in neural development, including axon guidance and branching, axon target recognition, self-avoidance and synaptic formation. DSCAM belongs to the immunoglobulin superfamily and contributes to defects in the central nervous system in Down syndrome patients. Vertebrate DSCAMs differ from Drosophila Dscam1 in that they lack the extensive alternative splicing that occurs in the insect gene. Drosophila melanogaster Dscam has 38,016 isoforms generated by the alternative splicing of four variable exon clusters, which allows every neuron in the fly to display a distinctive set of Dscam proteins on its cell surface. Drosophila Dscam1 is a cell-surface protein that plays important roles in neural development and axon tiling of neurons. It is shown that thousands of isoforms bind themselves through specific homophilic (self-binding) interactions, a process which mediates cellular self-recognition. Drosophila Dscam2 is also alternatively spliced and plays a key role in the development of two visual system neurons, monopolar cells L1 and L2. This group is a member of the I-set Ig domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand but lack a C" strand.	96
409549	cd20957	IgC2_3_Dscam	Third immunoglobulin domain of the Drosophila melanogaster Dscam protein, and similar domains; a member of the Constant 2 (C2)-set of IgSF domains. The members here are composed of the third immunoglobulin domain of the Drosophila melanogaster Down syndrome cell adhesion molecule (DSCAM) protein and similar proteins. Down syndrome cell adhesion molecule (DSCAM) is a cell adhesion molecule that plays critical roles in neural development, including axon guidance and branching, axon target recognition, self-avoidance and synaptic formation. DSCAM belongs to the immunoglobulin superfamily and contributes to defects in the central nervous system in Down syndrome patients. Vertebrate DSCAMs differ from Drosophila Dscam1 in that they lack the extensive alternative splicing that occurs in the insect gene. Drosophila melanogaster Dscam has 38,016 isoforms generated by the alternative splicing of four variable exon clusters, which allows every neuron in the fly to display a distinctive set of Dscam proteins on its cell surface. Drosophila Dscam1 is a cell-surface protein that plays important roles in neural development and axon tiling of neurons. It is shown that thousands of isoforms bind themselves through specific homophilic (self-binding) interactions, a process which mediates cellular self-recognition. Drosophila Dscam2 is also alternatively spliced and plays a key role in the development of two visual system neurons, monopolar cells L1 and L2. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. This group belongs to the C2-set of IgSF domains, having A, B, and E strands in one beta-sheet and A', G, F, C, and C' in the other. Unlike other Ig domain sets, the C2-set lacks the D strand.	88
409550	cd20958	IgI_5_Dscam	Fifth immunoglobulin domain of the Drosophila melanogaster Dscam protein, and similar domains; a member of the I-set of IgSF domains. The members here are composed of the fifth immunoglobulin domain of the Drosophila melanogaster Down syndrome cell adhesion molecule (DSCAM) protein and similar proteins. Down syndrome cell adhesion molecule (DSCAM) is a cell adhesion molecule that plays critical roles in neural development, including axon guidance and branching, axon target recognition, self-avoidance and synaptic formation. DSCAM belongs to the immunoglobulin superfamily and contributes to defects in the central nervous system in Down syndrome patients. Vertebrate DSCAMs differ from Drosophila Dscam1 in that they lack the extensive alternative splicing that occurs in the insect gene. Drosophila melanogaster Dscam has 38,016 isoforms generated by the alternative splicing of four variable exon clusters, which allows every neuron in the fly to display a distinctive set of Dscam proteins on its cell surface. Drosophila Dscam1 is a cell-surface protein that plays important roles in neural development and axon tiling of neurons. It is shown that thousands of isoforms bind themselves through specific homophilic (self-binding) interactions, a process which mediates cellular self-recognition. Drosophila Dscam2 is also alternatively spliced and plays a key role in the development of two visual system neurons, monopolar cells L1 and L2. This group is a member of the I-set Ig domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand but lack a C" strand.	89
409551	cd20959	IgI_6_Dscam	Sixth immunoglobulin domain of the Drosophila melanogaster Dscam protein, and similar domains; a member of the I-set of IgSF domains. The members here are composed of the sixth immunoglobulin domain of the Drosophila melanogaster Down syndrome cell adhesion molecule (DSCAM) protein and similar proteins. Down syndrome cell adhesion molecule (DSCAM) is a cell adhesion molecule that plays critical roles in neural development, including axon guidance and branching, axon target recognition, self-avoidance and synaptic formation. DSCAM belongs to the immunoglobulin superfamily and contributes to defects in the central nervous system in Down syndrome patients. Vertebrate DSCAMs differ from Drosophila Dscam1 in that they lack the extensive alternative splicing that occurs in the insect gene. Drosophila melanogaster Dscam has 38,016 isoforms generated by the alternative splicing of four variable exon clusters, which allows every neuron in the fly to display a distinctive set of Dscam proteins on its cell surface. Drosophila Dscam1 is a cell-surface protein that plays important roles in neural development and axon tiling of neurons. It is shown that thousands of isoforms bind themselves through specific homophilic (self-binding) interactions, a process which mediates cellular self-recognition. Drosophila Dscam2 is also alternatively spliced and plays a key role in the development of two visual system neurons, monopolar cells L1 and L2. This group is a member of the I-set Ig domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand but lack a C" strand.	94
409552	cd20960	IgV_CAR_like	Immunoglobulin Variable (V) domain of the Coxsackievirus and Adenovirus Receptor (CAR), and similar proteins. The members here are composed of the Variable (V) domain of the Coxsackievirus and Adenovirus Receptor (CAR), and similar proteins. CAR, which is encoded by the human CXADR gene, is a cell adhesion molecule of the Immunoglobulin (Ig) superfamily. The CAR acts as a type I membrane receptor for group B1-B6 coxsackie viruses and subgroup C adenoviruses. For instance, adenovirus interacts with the coxsackievirus and adenovirus receptor to enter epithelial airway cells. The CAR is also shown to be involved in physiological processes such as neuronal and heart development, epithelial tight junction integrity, and tumor suppression. The CAR is a component of the epithelial apical junction complex that may function as a homophilic cell adhesion molecule and is essential for tight junction integrity. The CAR is also involved in transepithelial migration of leukocytes through adhesive interactions with JAML, a transmembrane protein of the plasma membrane of leukocytes. The interaction between both receptors also mediates the activation of gamma-delta T-cells, a subpopulation of T-cells residing in epithelia and involved in tissue homeostasis and repair. The CAR is composed of one V-set and one C2-set Ig module, a single transmembrane helix, and an intracellular domain. This group belongs to the V-set of IgSF domains, having A, B, E and D strands in one beta-sheet and A', G, F, C, C' and C" in the other.	114
409553	cd20961	Ig1_Tyro3_like	First immunoglobulin (Ig)-like domain of Tyro3 receptor tyrosine kinase (RTK), and similar domains. The members here are composed of the first immunoglobulin (Ig)-like domain of Tyro3 receptor tyrosine kinase (RTK). Tyro3 together with Axl and Mer form the Axl/Tyro3 family of receptor tyrosine kinases (RTKs). This family includes Axl (also known as Ark, Ufo, and Tyro7), Tyro3 (also known as Sky, Rse, Brt, Dtk, and Tif), and Mer (also known as Nyk, c-Eyk, and Tyro12). Axl/Tyro3 family receptors have an extracellular portion with two Ig-like domains followed by two fibronectin type III (FNIII) domains, a membrane-spanning single helix, and a cytoplasmic tyrosine kinase domain. Axl, Tyro3 and Mer are widely expressed in adult tissues, though they show higher expression in the brain, in the lymphatic and vascular systems, and in the testis. Axl, Tyro3, and Mer bind the vitamin K dependent protein Gas6 with high affinity, and in doing so activate their tyrosine kinase activity. Axl/Gas6 signaling may play a part in cell adhesion processes, prevention of apoptosis, and cell proliferation.	87
409554	cd20962	IgI_C1_MyBP-C_like	Immunoglobulin Domain C1 of human cardiac Myosin Binding Protein C and similar proteins; a member of the I-set of IgSF domains. The members here are composed of the immunoglobulin domain C1 of human cardiac Myosin Binding Protein C (MyBP-C). MyBP-C is a thick filament protein involved in the regulation of muscle contraction. Mutations in the cardiac MyBP-C gene are the second most frequent cause of hypertrophic cardiomyopathy. MyBP-C binds to myosin with two binding sites, one at its C-terminus and another at its N-terminus. The N-terminal binding site, consisting of immunoglobulin (Ig) domains C1 and C2 connected by a flexible linker, interacts with the S2 segment of myosin in a phosphorylation-regulated manner. The C1 and C2 Ig domains can bind to and activate or inhibit the thin filament. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. The C1 domain of MyBP-C is a member of the I-set Ig domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand but lack a C" strand. I-set domains are found in several cell adhesion molecules (such as VCAM, ICAM, and MADCAM), and are also present in numerous other diverse protein families, including several tyrosine-protein kinase receptors.	101
409555	cd20963	IgV_VCBP	Immunoglobulin Variable region-containing chitin-binding proteins; an immunoglobulin V-set domain. The members here are composed of the immunoglobulin variable (IgV) region-containing chitin-binding proteins (VCBPs). VCBPs are secreted, immune-type molecules that have been identified in both amphioxus and sea squirt (Ciona intestinalis). VCBPs, which consist of a leader peptide, two tandem N-terminal immunoglobulin V-type domains and a single C-terminal chitin-binding domain, belong to a multigene family encoding secreted proteins. The VCBPs were identified first in the cephalochordate Branchiostoma floridae and show structural similarities with V-type domains of immunoglobulins and T cell receptors, suggesting that VCBPs represent a unique gut-associated form of innate immune proteins. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. This group belongs to the V-set of IgSF domains, having A, B, E and D strands in one beta-sheet and A', G, F, C, C' and C" in the other.	123
409556	cd20964	IgI_Tie2	Immunoglobulin domain of Tie2 tyrosine kinase; a member of the I-set of IgSF domains. The members here are composed of the immunoglobulin (Ig) domain of Tie2 tyrosine kinase. The Tie receptor tyrosine kinases and their angiopoietin (Ang) ligands play central roles in developmental and tumor-induced angiogenesis. Tie2 contains three immunoglobulin (Ig) domains, which fold together with the three epidermal growth factor domains into a compact, arrowhead-shaped structure. Ang2-Tie2 recognition is similar to antibody-protein antigen recognition, including the location of the ligand-binding site within the Ig fold. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. Unlike the V-set, one of the distinctive features of I-set domains is the lack of a C" strand. The structure of Tie2 lacks this strand and thus it belongs to the I-set of the IgSF. I-set domains are found in several cell adhesion molecules (such as VCAM, ICAM, and MADCAM), and are also present in numerous other diverse protein families, including several tyrosine-protein kinase receptors.	92
409557	cd20965	IgI_2_hemolin-like	Second immunoglobulin (Ig)-like domain of hemolin, and similar domains; a member of the I-set of IgSF domains. The members here are composed of the second immunoglobulin (Ig)-like domain of hemolin and similar proteins. Hemolin, an insect immunoglobulin superfamily (IgSF) member containing four Ig-like domains, is a lipopolysaccharide-binding immune protein induced during bacterial infection. Hemolin shares significant sequence similarity with the first four Ig-like domains of the transmembrane cell adhesion molecules (CAMs) of the L1 family. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. Unlike the V-set, one of the distinctive features of I-set domains is the lack of a C" strand. The structures of this group show that the second Ig domain lacks this strand and thus belongs to the I-set of the IgSF. I-set domains are found in several cell adhesion molecules, including vascular (VCAM), intercellular (ICAM), neural (NCAM) and mucosal addressin (MADCAM) cell adhesion molecules, as well as junction adhesion molecules (JAM).	101
409558	cd20966	IgI_1_Axl_like	First immunoglobulin (Ig)-like domain of Axl receptor tyrosine kinase (RTK), and similar domains; a member of the I-set Ig domains. The members here are composed of the first immunoglobulin (Ig)-like domain of Axl receptor tyrosine kinase (RTK). Axl together with Tyro3 and Mer form the Axl/Tyro3 family of receptor tyrosine kinases (RTKs). This family includes Axl (also known as Ark, Ufo, and Tyro7), Tyro3 (also known as Sky, Rse, Brt, Dtk, and Tif), and Mer (also known as Nyk, c-Eyk, and Tyro12). Axl/Tyro3 family receptors have an extracellular portion with two Ig-like domains followed by two fibronectin type III (FNIII) domains, a membrane-spanning single helix, and a cytoplasmic tyrosine kinase domain. Axl, Tyro3 and Mer are widely expressed in adult tissues, though they show higher expression in the brain, in the lymphatic and vascular systems, and in the testis. Axl, Tyro3, and Mer bind the vitamin K dependent protein Gas6 with high affinity, and in doing so activate their tyrosine kinase activity. Axl/Gas6 signaling may play a part in cell adhesion processes, prevention of apoptosis, and cell proliferation. Ig superfamily domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. The Ig-like domain of Axl is a member of the I-set Ig domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand but lack a C" strand.	101
409559	cd20967	IgI_C2_MyBP-C-like	Domain C2 of human cardiac Myosin Binding Protein C and similar domains; a member of the I-set of IgSF domains. The members here are composed of the immunoglobulin (Ig) Domain C2 of human cardiac Myosin Binding Protein C (MyBP-C) and similar domains. MyBP-C is a thick filament protein involved in the regulation of muscle contraction. Mutations in the cardiac MyBP-C gene are the second most frequent cause of hypertrophic cardiomyopathy. MyBP-C binds to myosin with two binding sites, one at its C-terminus and another at its N-terminus. The N-terminal binding site, consisting of immunoglobulin (Ig) domains C1 and C2 connected by a flexible linker, interacts with the S2 segment of myosin in a phosphorylation-regulated manner. The C1 and C2 Ig domains can bind to and activate or inhibit the thin filament. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. Unlike the V-set, one of the distinctive features of I-set domains is the lack of a C" strand. The structures of the Ig domains of MyBP-C lack this strand and thus belong to the I-set of Ig superfamily domains. I-set domains are found in several cell adhesion molecules (such as VCAM, ICAM, and MADCAM), and are also present in numerous other diverse protein families, including several tyrosine-protein kinase receptors.	82
409560	cd20968	IgI_2_MuSK	agrin-responsive second immunoglobulin-like domains (Ig2) of the Muscle-specific kinase (MuSK) ectodomain; a member of the I-set of Ig superfamily domains. The members here are composed of the second immunoglobulin-like (Ig) domains of the Muscle-specific kinase (MuSK) ectodomain. MuSK is a receptor tyrosine kinase specifically expressed in skeletal muscle, where it plays a central role in the formation and maintenance of the neuromuscular junction (NMJ). MuSK is activated by agrin, a neuron-derived heparan sulfate proteoglycan. The activation of MuSK in myotubes regulates the formation of NMJs through the regulation of different processes including the specific expression of genes in subsynaptic nuclei, the reorganization of the actin cytoskeleton and the clustering of the acetylcholine receptors (AChR) in the postsynaptic membrane. The Ig superfamily (IgSF) is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. Unlike the V-set, one of the distinctive features of I-set domains is the lack of a C" strand. The structure of MuSK lacks this strand and thus it belongs to the I-set of the IgSF. I-set domains are found in several cell adhesion molecules (such as VCAM, ICAM, and MADCAM), and are also present in numerous other diverse protein families, including several tyrosine-protein kinase receptors, the hemolymph protein hemolin, the muscle proteins titin, telokin, and twitchin, the neuronal adhesion molecule axonin-1, and the signaling molecule semaphorin 4D that is involved in axonal guidance, immune function and angiogenesis.	88
409561	cd20969	IgI_Lingo-1	Immunoglobulin I-set domain of the Leucine-rich repeat and immunoglobulin-like domain-containing protein 1 (Lingo-1). The members here are composed of the immunoglobulin I-set (IgI) domain of the Leucine-rich repeat and immunoglobulin-like domain-containing protein 1 (Lingo-1). Human Lingo-1 is a central nervous system-specific transmembrane glycoprotein also known as LERN-1, which functions as a negative regulator of neuronal survival, axonal regeneration, and oligodendrocyte differentiation and myelination. Lingo-1 is a key component of the Nogo receptor signaling complex (RTN4R/NGFR) in RhoA activation responsible for some inhibition of axonal regeneration by myelin-associated factors. The ligand-binding ectodomain of human Lingo-1 contains a bimodular, kinked structure composed of leucine-rich repeat (LRR) and immunoglobulin (Ig)-like modules. Diseases associated with Lingo-1 include mental retardation, autosomal recessive 64 and essential tremor. The Ig superfamily (IgSF) is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. Unlike the V-set, one of the distinctive features of I-set domains is the lack of a C" strand. The structure of Lingo-1 lacks this strand and thus it belongs to the I-set of the IgSF. I-set domains are found in several cell adhesion molecules (such as VCAM, ICAM, and MADCAM), and are also present in numerous other diverse protein families, including several tyrosine-protein kinase receptors, the hemolymph protein hemolin, the muscle proteins titin, telokin, and twitchin, the neuronal adhesion molecule axonin-1, and the signaling molecule semaphorin 4D that is involved in axonal guidance, immune function and angiogenesis.	92
409562	cd20970	IgI_1_MuSK	agrin-responsive first immunoglobulin-like domains (Ig1) of the MuSK ectodomain; a member of the I-set of IgSF domains. The members here are composed of the first immunoglobulin-like domains (Ig1) of the Muscle-specific kinase (MuSK). MuSK is a receptor tyrosine kinase specifically expressed in skeletal muscle, where it plays a central role in the formation and maintenance of the neuromuscular junction (NMJ). MuSK is activated by agrin, a neuron-derived heparan sulfate proteoglycan. The activation of MuSK in myotubes regulates the formation of NMJs through the regulation of different processes including the specific expression of genes in subsynaptic nuclei, the reorganization of the actin cytoskeleton and the clustering of the acetylcholine receptors (AChR) in the postsynaptic membrane. The Ig superfamily (IgSF) is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. Unlike the V-set, one of the distinctive features of I-set domains is the lack of a C" strand. The structure of MuSK lacks this strand and thus it belongs to the I-set of the IgSF. I-set domains are found in several cell adhesion molecules (such as VCAM, ICAM, and MADCAM), and are also present in numerous other diverse protein families, including several tyrosine-protein kinase receptors, the hemolymph protein hemolin, the muscle proteins titin, telokin, and twitchin, the neuronal adhesion molecule axonin-1, and the signaling molecule semaphorin 4D that is involved in axonal guidance, immune function and angiogenesis.	92
409563	cd20971	IgI_1_Titin-A168_like	First immunoglobulin-like domain A168 within the A-band segment of human cardiac titin, and similar domains; a member of the I-set of IgSF domains. The members here are composed of the first immunoglobulin-like domain A168 within the A-band segment of human cardiac titin. Titin is a key component in the assembly and functioning of vertebrate striated muscles. By providing connections at the level of individual microfilaments, it contributes to the fine balance of forces between the two halves of the sarcomere. The Ig superfamily (IgSF) is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. Unlike the V-set, one of the distinctive features of I-set domains is the lack of a C" strand. The structure of the titin-A168169 lacks this strand and thus it belongs to the I-set of the IgSF. I-set domains are found in several cell adhesion molecules (such as VCAM, ICAM, and MADCAM), and are also present in numerous other diverse protein families, including several tyrosine-protein kinase receptors, the hemolymph protein hemolin, the muscle proteins titin, telokin, and twitchin, the neuronal adhesion molecule axonin-1, and the signaling molecule semaphorin 4D that is involved in axonal guidance, immune function and angiogenesis.	93
409564	cd20972	IgI_2_Titin_Z1z2-like	Second Ig-like domain of the giant muscle protein titin Z1z2 in the sarcomeric Z-disk, and similar domains; a member of the I-set of IgSF domains. The members here are composed of the second immunoglobulin (Ig)-like domain of the giant muscle protein titin Z1z2 in the sarcomeric Z-disk and similar proteins. Titin is a key component in the assembly and functioning of vertebrate striated muscles. By providing connections at the level of individual microfilaments, it contributes to the fine balance of forces between the two halves of the sarcomere. The Ig superfamily (IgSF) is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. Unlike the V-set, one of the distinctive features of I-set domains is the lack of a C" strand. The structure of the titin Z1z2 lacks this strand and thus it belongs to the I-set of the IgSF. I-set domains are found in several cell adhesion molecules (such as VCAM, ICAM, and MADCAM), and are also present in numerous other diverse protein families, including several tyrosine-protein kinase receptors, the hemolymph protein hemolin, the muscle proteins titin, telokin, and twitchin, the neuronal adhesion molecule axonin-1, and the signaling molecule semaphorin 4D that is involved in axonal guidance, immune function and angiogenesis.	91
409565	cd20973	IgI_telokin-like	immunoglobulin-like domain of telokin and similar proteins; a member of the I-set of IgSF domains. The members here are composed of the immunoglobulin (Ig) domain in telokin, the C-terminal domain of myosin light chain kinase which is identical to telokin, and similar proteins. The Ig superfamily (IgSF) is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. Unlike the V-set, one of the distinctive features of I-set domains is the lack of a C" strand. The structure of the telokin Ig domain lacks this strand and thus it belongs to the I-set of the IgSF. I-set domains are found in several cell adhesion molecules (such as VCAM, ICAM, and MADCAM), and are also present in numerous other diverse protein families, including several tyrosine-protein kinase receptors, the hemolymph protein hemolin, the muscle proteins titin, telokin, and twitchin, the neuronal adhesion molecule axonin-1, and the signaling molecule semaphorin 4D that is involved in axonal guidance, immune function and angiogenesis.	88
409566	cd20974	IgI_1_Titin_Z1z2-like	First Ig-like domain of the giant muscle protein titin Z1z2 in the sarcomeric Z-disk and similar proteins; a member of the I-set of IgSF domains. The members here are composed of the first immunoglobulin (Ig)-like domain of the giant muscle protein titin Z1z2 in the sarcomeric Z-disk and similar proteins.  Titin is a key component in the assembly and functioning of vertebrate striated muscles. By providing connections at the level of individual microfilaments, it contributes to the fine balance of forces between the two halves of the sarcomere. The Ig superfamily (IgSF) is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. Unlike the V-set, one of the distinctive features of I-set domains is the lack of a C" strand. The structure of the titin Z1z2 lacks this strand and thus it belongs to the I-set of the IgSF. I-set domains are found in several cell adhesion molecules (such as VCAM, ICAM, and MADCAM), and are also present in numerous other diverse protein families, including several tyrosine-protein kinase receptors, the hemolymph protein hemolin, the muscle proteins titin, telokin, and twitchin, the neuronal adhesion molecule axonin-1, and the signaling molecule semaphorin 4D that is involved in axonal guidance, immune function and angiogenesis.	93
409567	cd20975	IgI_APEG-1_like	Immunoglobulin-like domain of human Aortic Preferentially Expressed Protein-1 (APEG-1) and similar proteins; a member of the I-set of IgSF domains. The members here are composed of the immunoglobulin I-set (IgI) domain of the Human Aortic Preferentially Expressed Protein-1 (APEG-1) and similar proteins. APEG-1 is a novel specific smooth muscle differentiation marker predicted to play a role in the growth and differentiation of arterial smooth muscle cells (SMCs). The Ig superfamily (IgSF) is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. Unlike the V-set, one of the distinctive features of I-set domains is the lack of a C" strand. The structure of the human APEG-1 lacks this strand and thus it belongs to the I-set of the IgSF. I-set domains are found in several cell adhesion molecules (such as VCAM, ICAM, and MADCAM), and are also present in numerous other diverse protein families, including several tyrosine-protein kinase receptors, the hemolymph protein hemolin, the muscle proteins titin, telokin, and twitchin, the neuronal adhesion molecule axonin-1, and the signaling molecule semaphorin 4D that is involved in axonal guidance, immune function and angiogenesis.	91
409568	cd20976	IgI_4_MYLK-like	Fourth Ig-like domain from smooth muscle myosin light chain kinase and similar domains; a member of the I-set of IgSF domains. The members here are composed of the fourth immunoglobulin (Ig)-like domain from smooth muscle myosin light chain kinase (MYLK) and similar domains. The Ig superfamily (IgSF) is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. Unlike the V-set, one of the distinctive features of I-set domains is the lack of a C" strand. The structure of this group shows that the fourth Ig-like domain from myosin light chain kinase lacks this strand and thus belongs to the I-set of the IgSF. I-set domains are found in several cell adhesion molecules (such as VCAM, ICAM, and MADCAM), and are also present in numerous other diverse protein families, including several tyrosine-protein kinase receptors, the hemolymph protein hemolin, the muscle proteins titin, telokin, and twitchin, the neuronal adhesion molecule axonin-1, and the signaling molecule semaphorin 4D that is involved in axonal guidance, immune function and angiogenesis.	90
409569	cd20977	IgI_3_hemolin-like	Third immunoglobulin (Ig)-like domain of hemolin, and similar domains; a member of the I-set of IgSF domains. The members here are composed of the third immunoglobulin (Ig)-like domain of hemolin and similar proteins. Hemolin, an insect immunoglobulin superfamily (IgSF) member containing four Ig-like domains, is a lipopolysaccharide-binding immune protein induced during bacterial infection. Hemolin shares significant sequence similarity with the first four Ig-like domains of the transmembrane cell adhesion molecules (CAMs) of the L1 family. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. The third Ig-like domain of hemolin is a member of the I-set Ig domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand but lack a C" strand. I-set domains are found in several cell adhesion molecules, including vascular (VCAM), intercellular (ICAM), neural (NCAM) and mucosal addressin (MADCAM) cell adhesion molecules, as well as junction adhesion molecules (JAM).	93
409570	cd20978	IgI_4_hemolin-like	Fourth immunoglobulin (Ig)-like domain of hemolin, and similar domains; a member of the I-set of IgSF domains. The members here are composed of the fourth immunoglobulin (Ig)-like domain of hemolin and similar proteins. Hemolin, an insect immunoglobulin superfamily (IgSF) member containing four Ig-like domains, is a lipopolysaccharide-binding immune protein induced during bacterial infection. Hemolin shares significant sequence similarity with the first four Ig-like domains of the transmembrane cell adhesion molecules (CAMs) of the L1 family. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. The fourth Ig-like domain of hemolin is a member of the I-set Ig domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set domains, members of the I-set have a discontinuous A strand but lack a C" strand. I-set domains are found in several cell adhesion molecules (such as VCAM, ICAM, and MADCAM), and are also present in numerous other diverse protein families, including several tyrosine-protein kinase receptors, the muscle proteins titin, telokin, and twitchin, the neuronal adhesion molecule axonin-1, and the signaling molecule semaphorin 4D that is involved in axonal guidance, immune function and angiogenesis.	88
409571	cd20979	IgI_1_hemolin-like	First immunoglobulin (Ig)-like domain of hemolin, and similar domains; a member of the I-set of IgSF domains. The members here are composed of the first immunoglobulin (Ig)-like domain of hemolin and similar proteins. Hemolin, an insect immunoglobulin superfamily (IgSF) member containing four Ig-like domains, is a lipopolysaccharide-binding immune protein induced during bacterial infection. Hemolin shares significant sequence similarity with the first four Ig-like domains of the transmembrane cell adhesion molecules (CAMs) of the L1 family. IgSF domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. The first Ig-like domain of hemolin is a member of the I-set Ig domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand but lack a C" strand. I-set domains are found in several cell adhesion molecules, including vascular (VCAM), intercellular (ICAM), neural (NCAM) and mucosal addressin (MADCAM) cell adhesion molecules, as well as junction adhesion molecules (JAM).	91
409572	cd20980	IgV_VISTA	Immunoglobulin variable (IgV) domain of V-domain immunoglobulin suppressor of T cell activation (VISTA). The members here are composed of the immunoglobulin variable (IgV) domain of V-domain immunoglobulin suppressor of T cell activation (VISTA; also known as B7-H5, PD-1H, Gi24, Dies1, SISP1 and DD1alpha). VISTA is an immune checkpoint protein involved in the regulation of T cell activity and inhibits the T cell response against cancer. VISTA is a type I transmembrane protein with a single IgV domain with sequence homology to the IgV domains of the members of B7 family. VISTA is the only B7 family member that lacks an IgC domain. VISTA is primarily expressed in white blood cells and its transcription is partially controlled by p53. Similar to PD-1/PD-L1 and CTLA-4, a blockade of VISTA promotes tumor clearance by the immune system. Unlike the B7 family members, VISTA contains 10 beta-strands, instead of the nine that typically comprise an IgV fold. Moreover, human VISTA contains the 21-residue extended loop between strands C and C', which does not align with any B7 family structure.	147
409573	cd20981	IgV_B7-H6	Immunoglobulin variable (IgV) domain of B7-H6. The members here are composed of the immunoglobulin variable (IgV) domain of B7-H6 (also known as NCR3LG1). B7-H6 contains one IgV domain and one IgC domain (IgV-IgC) and belongs to the B7-family, which consists of structurally related cell-surface protein ligands which bind to receptors on lymphocytes that regulate immune responses. B7-H6 is a ligand of NKp30, which is a member of CD28 family and an activating receptor of natural killer (NK) cells. The expression of NKp30 has been found in most of NK cells, which is involved in the process of tumor cell killing and interaction with antigen presenting cells (APCs) such as dendritic cells. Studies showed that NK cells eliminate B7-H6-expressing tumor cells either directly via cytotoxicity or indirectly by cytokine secretion. For instance, chimeric NKp30-expressing T cells responded to B7-H6(+) tumor cells and those T cells produced IFN-gamma and killed B7-H6-expressing tumor cells in vivo.  B7-H6 mRNA is not found in normal cells, while high expression of B7-H6 is found in certain type tumor cells, such as lymphoma, leukemia, ovarian cancer, brain tumors, breast cancers, and various sarcomas. Since B7-H6 can bind NKp30 to exert anti-tumor effects by NK cells, which are able to recognize the difference between cancer cells and normal cells, B7-H6 may serve as a promising target for cancer immunotherapy.	114
409574	cd20982	IgV_TIM-3_like	Immunoglobulin Variable (IgV) domain of T cell Immunoglobulin Domain and Mucin Domain 3 (Tim-3), and similar domains. The members here are composed of the immunoglobulin variable (IgV) domain of T cell immunoglobulin domain and mucin domain 3 (Tim-3; also known as Hepatitis A virus cellular receptor 2 (HAVcr-2) and Cluster of Differentiation 366 (CD366)) and similar proteins. TIM-3 is a checkpoint inhibitor in immune responses to tumors, as well as involved in chronic viral infections. Thus, Tim-3 has emerged as one of the most promising immune checkpoint targets for cancer immunotherapy. Tim-3 is highly expressed on Th1 lymphocytes and CD11b(+) macrophages and is upregulated on activated T and myeloid cells. TIM-3 regulates macrophage activation and inhibits Th1 mediated immune responses to promote immunological tolerance. There are three TIM family members in humans (TIM-1, TIM-3, and TIM-4) and eight members in mice (TIM-1 to TIM-8). The IgV domain of human TIM-3 has been shown to bind ligands such as carcinoembryonic antigen cell adhesion molecule 1 (CEACAM1), high mobility group protein B1 (HMGB1) and galectin-9 (GAL9). The binding of GAL9 to TIM-3 can negatively regulate Th1 immune response, enhance immune tolerance and inhibit anti-tumor immunity. Dysregulation of the TIM-3/GAL9 pathway is implicated in numerous chronic autoimmune diseases, such as multiple sclerosis and systemic lupus erythematosus.	107
409575	cd20983	IgV_PD-L2	Immunoglobulin Variable (IgV) domain of Programmed death ligand 2 (PD-L2). The members here are composed of the immunoglobulin variable (IgV) domain of Programmed death ligand 2 (PD-L2; also known as B7-DC or CD273). Receptor-binding domain of PD-L2 is a cell-surface ligand that competes with PD-L1 for binding to the immunosuppressive receptor programmed death-1 (PD-1). PD-1 is a member of the CD28/B7 family that plays an important role in negatively regulating immune responses upon interaction with its two ligands, PD-L1 or PD-L2. PD-L2 has a higher affinity for PD-1 but is expressed at lower levels. PD-L2 interaction with PD-1 suppresses T cell proliferation, cytokine production and cytotoxic activity. PD-L2 is expressed on tumor cells, antigen-presenting cells or APCs (such as macrophages, B cells and dendritic cells), and a variety of other immune and nonimmune cells. Tumor expression of PD-L2 may contribute to tumor evasion of immune destruction by inactivating T cells. Thus, PD-L2 is a negative predictor for prognosis among solid cancer patients.	100
409576	cd20984	IgV_B7-H4	Immunoglobulin Variable (IgV) domain of B7-H4. The members here are composed of the immunoglobulin variable (IgV) domain of B7-H4 (also known as B7-S1, B7x, or Vtcn1). B7-H4 is one of the B7 family of immune-regulatory ligands that act as negative regulators of T cell function; it contains one IgV domain and one IgC domain. The B7-family consists of structurally related cell-surface protein ligands, which bind to receptors on lymphocytes that regulate immune responses. The binding of B7-H4 to unidentified receptors results in the inhibition of TCR-mediated T cell proliferation, cell-cycle progression and IL-2 production. As a co-inhibitory molecule, B7-H4 is widely expressed in tumor tissues and its expression is significantly associated with poor prognosis in human cancers such as glioma, pancreatic cancer, oral squamous cell carcinoma, renal cell carcinoma, and lung cancer.	110
409577	cd20985	IgV_CD200R-like	Immunoglobulin Variable domain of cell surface glycoprotein CD200 receptor and similar proteins. The members here are composed of the immunoglobulin variable (IgV) domain of cell surface glycoprotein CD200 receptor and similar proteins. CD200 (also known as OX2) is a widely distributed membrane glycoprotein that regulates myeloid cell activity through its interaction with an inhibitory receptor (CD200R). CD200-CD200R interactions are involved in the control of myeloid cellular function. In the mouse, several CD200R-related genes have been identified, including CD200RL (for receptor like), CD200R1, and CD200R2. While CD200 gives good binding to CD200R, it does not bind CD200RLa, CD200RLb, CD200RLc, or CD200RLe. For instance, CD200RLa has a 50-fold lower binding affinity to CD200, although CD200RLa shares a high amino acid sequence identity with CD200R in the V-like domain. Furthermore, the CD200-CD200R regulatory interactions provide an attractive target for immunomodulation, because its manipulation can provoke either immune tolerance or autoimmune diseases.	107
409578	cd20986	IgC1_PD-L2	Immunoglobulin Constant 1 (IgC1) domain of Programmed death ligand 2 (PD-L2). The members here are composed of the immunoglobulin Constant 1 (IgC1) domain of Programmed death ligand 2 (PD-L2; also known as B7-DC or CD273). PD-L2 is a cell-surface ligand that competes with PD-L1 for binding to the immunosuppressive receptor programmed death-1 (PD-1). PD-1 is a member of the CD28/B7 family that plays an important role in negatively regulating immune responses upon interaction with its two ligands, PD-L1 or PD-L2. PD-L2 has a higher affinity for PD-1 but is expressed at lower levels. PD-L2 interaction with PD-1 suppresses T cell proliferation, cytokine production and cytotoxic activity. PD-L2 is expressed on tumor cells, antigen-presenting cells or APCs (such as macrophages, B cells and dendritic cells), and a variety of other immune and nonimmune cells. Tumor expression of PD-L2 may contribute to tumor evasion of immune destruction by inactivating T cells. Thus, PD-L2 is a negative predictor for prognosis among solid cancer patients.	82
409579	cd20987	IgC2_CD33_d2_like	Second immunoglobulin domain of Cluster of Differentiation (CD) 33 and related Siglecs; member of the C2-set of IgSF domains. The members here are composed of the second immunoglobulin (Ig) domain of Cluster of Differentiation (CD) 33 (also known as sialic-acid binding immunoglobulin type-lectin 3 (Siglec-3)) and related Siglecs. CD33, a Siglec family member, is a well-known immunotherapeutic target in acute myeloid leukemia (AML). It is an inhibitory sialoadhesin expressed in human leukocytes of the myeloid lineage and some lymphoid subsets, including natural killer (NK) cells. Siglecs are primarily expressed on immune cells and recognize sialic acid-containing glycan ligands. Siglecs are organized as an extracellular module composed of Ig-like domains (an N-terminal variable set of Ig-like carbohydrate recognition domains, and 1 to 16 constant Ig-like domains), followed by transmembrane and short cytoplasmic domains. Human Siglecs are classified into two subgroups, one subgroup is comprised of sialoadhesin (Siglec-1), CD22 (Siglec-2), and MAG (Siglec-4, myelin-associated glycoprotein), the other subgroup is comprised of CD33-related Siglecs which include CD33 (Siglec-3) and human Siglecs 5-11. CD33 (Siglec-3) is the smallest Siglec member. It preferentially binds to alpha2-6- and alpha2-3-sialylated glycans and strongly binds to sialylated ligands on leukemic cell lines. Ig Superfamily (IgSF) domains can be divided into 4 main classes based on their structures and sequences: the Variable (V), Constant 1 (C1), Constant 2 (C2), and Intermediate (I) sets. This group includes CD33-related Siglecs which belong to the C2-set of IgSF domains. Unlike the C1-set, the C2-set structures do not have a D strand.	94
409580	cd20988	IgV_TCR_gammadelta	Gammadelta T-cell antigen receptor, variable (V) domain. The members here are composed of the immunoglobulin (Ig) variable (V) domain of the gamma/delta T-cell receptors (TCRs). TCRs mediate antigen recognition by T lymphocytes, and are heterodimers consisting of alpha and beta chains or gamma and delta chains.  Each chain contains a variable (V) and a constant (C) region. The majority of T cells contain alpha/beta TCRs, but a small subset contain gamma/delta TCRs. Alpha/beta TCRs recognize antigen as peptide fragments presented by major histocompatibility complex (MHC) molecules. Gamma/delta TCRs recognize intact protein antigens; they recognize protein antigens directly and without antigen processing, and MHC independently of the bound peptide. Gamma/delta T cells can also be stimulated by non-peptide antigens such as small phosphate- or amine-containing compounds. The variable domain of gamma/delta TCRs is responsible for antigen recognition and is located at the N-terminus of the receptor. Members of this group contain standard Ig superfamily V-set AGFCC'C"/DEB domain topology. 	114
409581	cd20989	IgV_1_Nectin-2_NecL-5_like_CD112_CD155	First immunoglobulin variable (IgV) domain of nectin-2, nectin-like protein 5, and similar domains. The members here are composed of the second immunoglobulin (Ig) domain of nectin-2 (also known as poliovirus receptor related protein 2 or Cluster of Differentiation 112 (CD112)), nectin-like protein 5 (CD155), and similar proteins. Nectins and Nectin-like molecules are a family of Ca(2+)-independent immunoglobulin-like transmembrane glycoproteins belonging to the class of adhesion receptors, consisting of nine members (nectins 1 through 4 and nectin-like proteins 1 through 5). Nectins are synaptic cell adhesion molecules (CAMs) which facilitate adhesion and signaling at various intracellular junctions. Nectins form homophilic cis-dimers, followed by homophilic and heterophilic trans-dimers involved in cell-cell adhesion. Nectin-2 and nectin-3 localize at Sertoli-spermatid junctions where they form heterophilic trans-interactions between the cells that are essential for the formation and maintenance of the junctions and for spermatid development. CD155 is the fifth member in the nectin-like molecule family, and functions as the receptor of poliovirus; therefore, CD155 is also referred to as Necl-5, or PVR. In contrast to all other family members, CD155 lacks self-adhesion capacity, yet it shares with nectins the feature to interact with other nectins. For instance, CD155 heterophilically trans-interacts with nectin-3, thereby contributing significantly to the establishment of adherens junctions between epithelial cells. This group belongs to the Constant 1 (C1)-set of IgSF domains, which has one beta-sheet that is formed by strands A-B-E-D and the other strands by G-F-C-C'.	112
409582	cd20990	IgI_2_Palladin_C	Second C-terminal immunoglobulin (Ig)-like domain of palladin; member of the I-set of Ig superfamily (IgSF) domains. The members here are composed of the C-terminal immunoglobulin (Ig)-like domain of palladin. Palladin belongs to the palladin-myotilin-myopalladin family. Proteins belonging to this family contain multiple Ig-like domains and function as scaffolds, modulating actin cytoskeleton. Palladin binds to alpha-actinin, ezrin, vasodilator-stimulated phosphoprotein VASP, SPIN90 (also known as DIP or mDia interacting protein), and Src. Palladin also binds F-actin directly, via its Ig3 domain. Palladin is expressed as several alternatively spliced isoforms, having various combinations of Ig-like domains, in a cell-type-specific manner. It has been suggested that palladin's different Ig-like domains may be specialized for distinct functions. This group belongs to the I-set of IgSF domains, having A-B-E-D strands in one beta-sheet and A'-G-F-C-C' in the other. Like the V-set Ig domains, members of the I-set have a discontinuous A strand, but lack a C" strand.	91
409583	cd20991	Ig1_IL1R_like	First immunoglobulin (Ig)-like domain of interleukin-1 receptor (IL1R), and similar domains. The members here are composed of the first immunoglobulin (Ig)-like domain of interleukin-1 receptor (IL1R). IL-1 alpha and IL-1 beta are cytokines which participate in the regulation of inflammation, immune responses, and hematopoiesis. These cytokines bind to the IL-1 receptor type 1 (IL1R1), which is activated on additional association with interleukin-1 receptor accessory protein (IL1RAP).  IL-1 also binds a second receptor designated type II (IL1R2). Mature IL1R1 consists of three Ig-like domains, a transmembrane domain, and a large cytoplasmic domain. Mature IL1R2 is organized similarly except that it has a short cytoplasmic domain. The latter does not initiate signal transduction. IL-1 receptor antagonist (IL-1RA), a naturally occurring cytokine, is widely expressed and binds to IL-1 receptors, inhibiting the binding of IL-1 alpha and IL-1 beta.	91
409584	cd20992	Ig1_IL1R_like	First immunoglobulin (Ig)-like domain of interleukin-1 receptor (IL1R), and similar domains. The members here are composed of the first immunoglobulin (Ig)-like domain of interleukin-1 receptor (IL1R). IL-1 alpha and IL-1 beta are cytokines which participate in the regulation of inflammation, immune responses, and hematopoiesis. These cytokines bind to the IL-1 receptor type 1 (IL1R1), which is activated on additional association with interleukin-1 receptor accessory protein (IL1RAP). IL-1 also binds a second receptor designated type II (IL1R2). Mature IL1R1 consists of three Ig-like domains, a transmembrane domain, and a large cytoplasmic domain. Mature IL1R2 is organized similarly except that it has a short cytoplasmic domain. The latter does not initiate signal transduction. A naturally occurring cytokine IL-1RA (IL-1 receptor antagonist) is widely expressed and binds to IL-1 receptors, inhibiting the binding of IL-1 alpha and IL-1 beta.	108
409585	cd20993	Ig2_IL-1RAP_like	Second immunoglobulin (Ig)-like domain of interleukin-1 receptor (IL1R), and similar domains. The members here are composed of the second immunoglobulin (Ig)-like domain of interleukin-1 receptor (IL1R). IL-1 alpha and IL-1 beta are cytokines which participate in the regulation of inflammation, immune responses, and hematopoiesis. These cytokines bind to the IL-1 receptor type 1 (IL1R1), which is activated on additional association with interleukin-1 receptor accessory protein (IL1RAP).  IL-1 also binds a second receptor designated type II (IL1R2). Mature IL1R1 consists of three IG-like domains, a transmembrane domain, and a large cytoplasmic domain. Mature IL1R2 is organized similarly except that it has a short cytoplasmic domain. The latter does not initiate signal transduction. A naturally occurring cytokine IL-1RA (IL-1 receptor antagonist) is widely expressed and binds to IL-1 receptors, inhibiting the binding of IL-1 alpha and IL-1 beta. This group also contains ILIR-like 1 (IL1R1L) which maps to the same chromosomal location as IL1R1 and IL1R2.	93
409586	cd20994	Ig2_IL1R_like	Second immunoglobulin (Ig)-like domain of interleukin-1 receptor (IL1R), and similar domains. The members here are composed of the second immunoglobulin (Ig)-like domain of interleukin-1 receptor (IL1R). IL-1 alpha and IL-1 beta are cytokines which participate in the regulation of inflammation, immune responses, and hematopoiesis. These cytokines bind to the IL-1 receptor type 1 (IL1R1), which is activated on additional association with interleukin-1 receptor accessory protein (IL1RAP).  IL-1 also binds a second receptor designated type II (IL1R2). Mature IL1R1 consists of three IG-like domains, a transmembrane domain, and a large cytoplasmic domain. Mature IL1R2 is organized similarly except that it has a short cytoplasmic domain. The latter does not initiate signal transduction. A naturally occurring cytokine IL-1RA (IL-1 receptor antagonist) is widely expressed and binds to IL-1 receptors, inhibiting the binding of IL-1 alpha and IL-1 beta. This group also contains ILIR-like 1 (IL1R1L) which maps to the same chromosomal location as IL1R1 and IL1R2.	94
409587	cd20995	IgI_N_ICAM-2	N-terminal immunoglobulin domain of the intercellular adhesion molecules ICAM-2 (Cluster of Differentiation 102 or CD102); member of the I-set of IgSF domains. The members here are composed of the N-terminal immunoglobulin domain of the intercellular adhesion molecules ICAM-2 (Cluster of Differentiation 102 or CD102). The intercellular adhesion molecules ICAM-1 (Cluster of Differentiation 54 or CD54), ICAM-2 and ICAM-3 (Cluster of Differentiation 50 or CD50) mediate a variety of critical intercellular adhesion events in the immune system through interactions with their counter-receptors, the beta2-integrins LFA-1 (CD11a/CD18), Mac-1 (CD11b/CD18), p150,95 (CD11c/CD18), and CD11d/CD18. The ICAMs are type I transmembrane glycoproteins belonging to the immunoglobulin superfamily (IgSF). The binding of the ICAM family members with the beta2-integrins physically stabilizes interactions between pairs of T and B cells, T cells and antigen-presenting cells (APCs), and brings effector cells such as cytotoxic T lymphocytes (CTLs) and natural killer (NK) cells into close proximity to their target cells. All three ICAMs share a common polypeptide homology and structural motif, and the ability to bind LFA-1. The distinct functional role of each ICAM is affected by their relative affinities for LFA-1 (ICAM-1 > ICAM-2 > ICAM-3). ICAM-1 is expressed in most tissues at low levels, and expression is increased by inflammatory cytokines. In contrast, ICAM-2 is expressed predominantly on endothelium and leukocytes (except neutrophils), and its expression generally is not responsive to cytokines. ICAM-3 is expressed on leukocytes and Langerhans cells, but not on resting, cytokine-induced endothelium, or nonhematopoietic tissues.	83
409588	cd20996	IgI_N_ICAM-1	N-terminal immunoglobulin domain of the intercellular adhesion molecules ICAM-1 (Cluster of Differentiation 54 or CD54); member of the I-set of IgSF domains. The members here are composed of the N-terminal immunoglobulin domain of the intercellular adhesion molecules ICAM-1 (Cluster of Differentiation 54 or CD54).  The intercellular adhesion molecules ICAM-1, ICAM-2 (Cluster of Differentiation 102 or CD102) and ICAM-3 (Cluster of Differentiation 50 or CD50) mediate a variety of critical intercellular adhesion events in the immune system through interactions with their counter-receptors, the beta2-integrins LFA-1 (CD11a/CD18), Mac-1 (CD11b/CD18), p150,95 (CD11c/CD18), and CD11d/CD18. The ICAMs are type I transmembrane glycoproteins belonging to the immunoglobulin superfamily (IgSF). The binding of the ICAM family members with the beta2-integrins physically stabilizes interactions between pairs of T and B cells, T cells and antigen-presenting cells (APCs), and brings effector cells such as cytotoxic T lymphocytes (CTLs) and natural killer (NK) cells into close proximity to their target cells. All three ICAMs share a common polypeptide homology and structural motif, and the ability to bind LFA-1. The distinct functional role of each ICAM is affected by their relative affinities for LFA-1 (ICAM-1 > ICAM-2 > ICAM-3). ICAM-1 is expressed in most tissues at low levels, and expression is increased by inflammatory cytokines. In contrast, ICAM-2 is expressed predominantly on endothelium and leukocytes (except neutrophils), and its expression generally is not responsive to cytokines. ICAM-3 is expressed on leukocytes and Langerhans cells, but not on resting, cytokine-induced endothelium, or nonhematopoietic tissues.	82
409589	cd20997	IgI_N_ICAM-3	N-terminal immunoglobulin domain of the intercellular adhesion molecules ICAM-3 (Cluster of Differentiation 50 or CD50); member of the I-set of IgSF domains. The members here are composed of the N-terminal immunoglobulin domain of the intercellular adhesion molecules ICAM-3 (Cluster of Differentiation 50 or CD50). The intercellular adhesion molecules ICAM-1 (Cluster of Differentiation 54 or CD54), ICAM-2 (Cluster of Differentiation 102 or CD102) and ICAM-3 mediate a variety of critical intercellular adhesion events in the immune system through interactions with their counter-receptors, the beta2-integrins LFA-1 (CD11a/CD18), Mac-1 (CD11b/CD18), p150,95 (CD11c/CD18), and CD11d/CD18. The ICAMs are type I transmembrane glycoproteins belonging to the immunoglobulin superfamily (IgSF). The binding of the ICAM family members with the beta2-integrins physically stabilizes interactions between pairs of T and B cells, T cells and antigen-presenting cells (APCs), and brings effector cells such as cytotoxic T lymphocytes (CTLs) and natural killer (NK) cells into close proximity to their target cells. All three ICAMs share a common polypeptide homology and structural motif, and the ability to bind LFA-1. The distinct functional role of each ICAM is affected by their relative affinities for LFA-1 (ICAM-1 > ICAM-2 > ICAM-3). ICAM-1 is expressed in most tissues at low levels, and expression is increased by inflammatory cytokines. In contrast, ICAM-2 is expressed predominantly on endothelium and leukocytes (except neutrophils), and its expression generally is not responsive to cytokines. ICAM-3 is expressed on leukocytes and Langerhans cells, but not on resting, cytokine-induced endothelium, or nonhematopoietic tissues.	85
409590	cd20998	IgC1_MHC_II_beta_I-E	Class II major histocompatibility complex (MHC) beta chain immunoglobulin domain of histocompatibility antigen (HLA) I-E; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the Class II major histocompatibility complex (MHC) beta chain immunoglobulin domain of histocompatibility antigen (HLA) I-E. Three genetically distinct isotypes of class II MHC molecules are found in humans (HLA-DR, HLA-DQ, and HLA-DP), and two in mice (I-E and I-A).  I-A and I-E molecules have the same basic features insofar as peptide loading and presentation, although each interacts with distinctly different sets of peptides. They also differ in that there is a relatively high incidence of deletion of the I-E gene in both inbred strains of mice as well as wild mice and the lack of the reverse situation i.e. the deletion of I-A genes.  A detailed structural understanding of the similarities and differences between I-A and the paralogous I-E could help illuminate the respective roles these molecules play in peptide presentation and T cell activation. Mouse I-Ag7 has a genetic susceptibility to autoimmune diabetes due to its small, uncharged amino acid residue at position 57 of their beta chain which results in the absence of a salt bridge between beta 57 and Arg alpha 76, which is adjacent to the P9 pocket of the peptide-binding groove. MHC class II molecules play a key role in the initiation of the antigen-specific immune response. These molecules have been shown to be expressed constitutively on the cell surface of professional antigen-presenting cells (APCs), including B-lymphocytes, monocytes, and macrophages in both humans and mice. The expression of these molecules has been shown to be induced in nonprofessional APCs such as keratinocytes, and they are expressed on the surface of activated human T cells and on T cells from other species. The MHC II molecules present antigenic peptides to CD4(+) T-lymphocytes. These peptides derive mostly from proteolytic processing via the endocytic pathway, of antigens internalized by the APC. These peptides bind to the MHC class II molecules in the endosome before they are transported to the cell surface. MHC class II molecules are heterodimers, comprised of two similarly-sized membrane-spanning chains, alpha and beta. Each chain has two globular domains (N- and C-terminal), and a membrane-anchoring transmembrane segment. The two chains form a compact four-domain structure. The peptide-binding site is a cleft in the structure.	99
409591	cd21000	IgC1_MHC_II_beta_HLA-DR	Class II major histocompatibility complex (MHC) beta chain immunoglobulin domain of histocompatibility antigen (HLA) DR; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the Class II major histocompatibility complex (MHC) beta chain immunoglobulin domain of histocompatibility antigen (HLA) DR. HLA-DR is an MHC class II cell surface receptor encoded by the human leukocyte antigen complex on chromosome 6 region 6p21.31. HLA-DR is also involved in several autoimmune conditions, disease susceptibility, and disease resistance including seronegative-rheumatoid arthritis, penicillamine-induced myasthenia, schizophrenia, Goodpasture syndrome, systemic lupus erythematosus, Alzheimers, tuberculoid leprosy, and Hashimoto's thyroiditis. HLA-DR molecules are upregulated in response to signaling.  HLA-DR is an alphabeta heterodimer cell surface receptor, each subunit of which contains two extracellular domains, a membrane-spanning domain, and a cytoplasmic tail. Both alpha and beta chains are anchored in the membrane. The DR beta chain is encoded by 4 loci, however no more than 3 functional loci are present in a single individual, and no more than two on a single chromosome. Sometimes an individual may only possess 2 copies of the same locus, DRB1*. The HLA-DRB1 locus is ubiquitous and encodes a very large number of functionally variable gene products (HLA-DR1 to HLA-DR17). The HLA-DRB3 locus encodes the HLA-DR52 specificity, is moderately variable and is variably associated with certain HLA-DRB1 types. The HLA-DRB4 locus encodes the HLA-DR53 specificity, has some variation, and is associated with certain HLA-DRB1 types. The HLA-DRB5 locus encodes the HLA-DR51 specificity, which is typically invariable, and is linked to the HLA-DR2 types. Three genetically distinct isotypes of class II MHC molecules are found in humans (HLA-DR, HLA-DQ, and HLA-DP), and two in mice (I-E and I-A). 
MHC class II molecules play a key role in the initiation of the antigen-specific immune response. These molecules have been shown to be expressed constitutively on the cell surface of professional antigen-presenting cells (APCs), including B-lymphocytes, monocytes, and macrophages in both humans and mice. The expression of these molecules has been shown to be induced in nonprofessional APCs such as keratinocytes, and they are expressed on the surface of activated human T cells and on T cells from other species. The MHC II molecules present antigenic peptides to CD4(+) T-lymphocytes. These peptides derive mostly from proteolytic processing, via the endocytic pathway, of antigens internalized by the APC. These peptides bind to the MHC class II molecules in the endosome before they are transported to the cell surface. MHC class II molecules are heterodimers composed of two similarly sized membrane-spanning chains, alpha and beta. Each chain has two globular domains (N- and C-terminal) and a membrane-anchoring transmembrane segment. The two chains form a compact four-domain structure. The peptide-binding site is a cleft in the structure.	96
409592	cd21001	IgC1_MHC_II_beta_HLA-DQ_I-A	Class II major histocompatibility complex (MHC) beta chain immunoglobulin domain of histocompatibility antigen (HLA) DQ and I-A; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the Class II major histocompatibility complex (MHC) beta chain immunoglobulin domain of human histocompatibility antigen (HLA) DQ and mouse I-A. Three genetically distinct isotypes of class II MHC molecules are found in humans (HLA-DR, HLA-DQ, and HLA-DP), and two in mice (I-E and I-A). I-A and I-E have the same basic features insofar as peptide loading and presentation; they differ in that each interacts with distinctly different sets of peptides, and in the incidence of deletion of their genes. A structural understanding of the similarities and differences between I-A and I-E may help with understanding their roles in peptide presentation and T cell activation. Mouse I-Ag7 has a genetic susceptibility to autoimmune diabetes due to its small, uncharged amino acid residue at position 57 of its beta chain, which results in the absence of a salt bridge between beta 57 and Arg alpha 76, which is adjacent to the P9 pocket of the peptide-binding groove. Human HLA-DR, -DQ, and -DP are about 70% similar to each other. HLA-DQ (DQ) is a cell surface receptor protein found on antigen presenting cells. It is an alphabeta heterodimer of type MHC class II. The alpha and beta chains are encoded by two loci, HLA-DQA1 and HLA-DQB1, that are adjacent to each other on chromosome band 6p21.3. A person often produces two alpha-chain and two beta-chain variants and thus 4 isoforms of DQ. HLA-DQ is involved in the autoimmune diseases celiac disease and diabetes mellitus type 1. DQ is one of several antigens involved in rejection of organ transplants. DQ2 is encoded by the HLA-DQB1*02 allele group. DQ6 is encoded by the HLA-DQB1*06 allele group. 
DQ2 beta-chains combine with alpha-chains, encoded by genetically linked HLA-DQA1 alleles, to form the cis-haplotype isoforms. These isoforms, nicknamed DQ2.2 and DQ2.5, are also encoded by the DQA1*0201 and DQA1*0501 genes, respectively. DQ6 beta-chains combine with alpha-chains, encoded by genetically linked HLA-DQA1 alleles, to form the cis-haplotype isoforms. For DQ6, however, cis-isoform pairing only occurs with DQ1 alpha-chains. There are many haplotypes of DQ6. Susceptibility to Leptospirosis infection was found associated with undifferentiated DQ6. DQ8 is determined by the antibody recognition of beta8 and this generally detects the gene product of DQB1*0302. DQ8 is commonly linked to autoimmune disease in the human population. DQ8 is the second most predominant isoform linked to celiac disease and the DQ most linked to Type 1 diabetes. DQ8 increases the risk for rheumatoid arthritis and is linked to the primary risk locus for RA, HLA-DR4. DR4 also plays an important role in Type 1 diabetes.   DQ8 is a split antigen of the DQ3 broad antigen. MHC class II molecules play a key role in the initiation of the antigen-specific immune response. They are expressed constitutively on the cell surface of professional antigen-presenting cells (APCs), including B-lymphocytes, monocytes, and macrophages in both humans and mice, and induced in nonprofessional APCs, such as keratinocyctes; they are expressed on the surface of activated human T cells and on T cells from other species. MHC II molecules present antigenic peptides to CD4(+) T-lymphocytes; these peptides derive mostly from proteolytic processing via the endocytic pathway, of antigens internalized by the APC, and bind to the MHC class II molecules in the endosome before they are transported to the cell surface. MHC class II molecules are heterodimers, comprised of two similarly-sized membrane-spanning chains, alpha and beta. 
Each chain has two globular domains (N- and C-terminal) and a membrane-anchoring transmembrane segment. The two chains form a compact four-domain structure. The peptide-binding site is a cleft in the structure.	97
409593	cd21002	IgC1_MHC_II_beta_HLA-DM	Class II major histocompatibility complex (MHC) beta chain immunoglobulin domain of histocompatibility antigen (HLA) DM; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the Class II major histocompatibility complex (MHC) beta chain immunoglobulin domain of histocompatibility antigen (HLA) DM. Human HLA-DM plays a critical role in antigen presentation to CD4 T cells by catalyzing the exchange of peptides bound to MHC class II molecules. Type 1 diabetes is correlated with DM activation and it is also implicated in viral infections such as herpes simplex virus, celiac disease, multiple sclerosis, other autoimmune diseases, and leukemia. MHC class II molecules play a key role in the initiation of the antigen-specific immune response. These molecules have been shown to be expressed constitutively on the cell surface of professional antigen-presenting cells (APCs), including B-lymphocytes, monocytes, and macrophages in both humans and mice. The expression of these molecules has been shown to be induced in nonprofessional APCs such as keratinocytes, and they are expressed on the surface of activated human T cells and on T cells from other species. The MHC II molecules present antigenic peptides to CD4(+) T-lymphocytes. These peptides derive mostly from proteolytic processing, via the endocytic pathway, of antigens internalized by the APC. These peptides bind to the MHC class II molecules in the endosome before they are transported to the cell surface. MHC class II molecules are heterodimers composed of two similarly sized membrane-spanning chains, alpha and beta. Each chain has two globular domains (N- and C-terminal) and a membrane-anchoring transmembrane segment. The two chains form a compact four-domain structure. The peptide-binding site is a cleft in the structure.	97
409594	cd21003	IgC1_MHC_II_beta_HLA-DP	Class II major histocompatibility complex (MHC) beta chain immunoglobulin domain of histocompatibility antigen (HLA) DP; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the Class II major histocompatibility complex (MHC) beta chain immunoglobulin domain of histocompatibility antigen (HLA) DP. HLA class II histocompatibility antigen, DP(W2) beta chain is a protein that in humans is encoded by the HLA-DPB1 gene.  It plays a central role in the immune system by presenting peptides derived from extracellular proteins.  MHC class II molecules are encoded by three different loci, HLA-DR, -DQ, and -DP, which are about 70% similar to each other. HLA-DP is an alphabeta heterodimer cell-surface receptor. Each DP subunit (alpha-subunit, beta-subunit) is composed of a alpha-helical N-terminal domain, an IgG-like beta sheet, a membrane spanning domain, and a cytoplasmic domain. The alpha-helical domain forms the sides of the peptide binding groove. The beta sheet regions form the base of the binding groove and the bulk of the molecule as well as the inter-subunit (non-covalent) binding region. Individuals carrying the MHCII allele, HLA-DP2, are at risk for chronic beryllium disease (CBD), a debilitating inflammatory lung condition caused by the reaction of CD4 T cells to inhaled beryllium. MHC class II molecules play a key role in the initiation of the antigen-specific immune reponse. These molecules have been shown to be expressed constitutively on the cell surface of professional antigen-presenting cells (APCs), including B-lymphocytes, monocytes, and macrophages in both humans and mice. The expression of these molecules has been shown to be induced in nonprofessional APCs such as keratinocyctes, and they are expressed on the surface of activated human T cells and on T cells from other species. The MHC II molecules present antigenic peptides to CD4(+) T-lymphocytes. 
These peptides derive mostly from proteolytic processing, via the endocytic pathway, of antigens internalized by the APC. These peptides bind to the MHC class II molecules in the endosome before they are transported to the cell surface. MHC class II molecules are heterodimers composed of two similarly sized membrane-spanning chains, alpha and beta. Each chain has two globular domains (N- and C-terminal) and a membrane-anchoring transmembrane segment. The two chains form a compact four-domain structure. The peptide-binding site is a cleft in the structure.	96
409595	cd21004	IgC1_MHC_II_alpha_HLA_DO	HLA class II histocompatibility antigen DO alpha; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the nonclassical MHC class II (MHCII) protein, HLA-DO, which binds HLA-DM and influences the repertoire of peptides presented by MHCII proteins. In complex with HLA-DM, HLA-DO adopts a classical MHCII structure, with alterations near the alpha subunit's 3-10 helix. HLA-DO binds to HLA-DM at the same sites implicated in MHCII interaction, and kinetic analysis showed that HLA-DO acts as a competitive inhibitor by acting as a substrate mimic. Though more remains to be elucidated about the function of HLA-DO, its unique distribution in the mammalian body, namely the exclusive expression of HLA-DO in B cells, thymic medullary epithelial cells, and dendritic cells, indicates that it may be of physiological importance and has inspired further research. Class I MHC proteins bind antigenic peptide fragments and present them to CD8+ T lymphocytes. Class I molecules consist of a transmembrane alpha chain and a small chain called the beta-2-microglobulin. The alpha chain contains three extracellular domains, two of which fold together to form the peptide-binding cleft (alpha1 and alpha2), and one which has an Ig fold (alpha3). Peptide binding to class I molecules occurs in the endoplasmic reticulum (ER) and involves both chaperones and dedicated factors to assist in peptide loading. Class I MHC molecules are expressed on most nucleated cells.	95
409596	cd21005	IgC1_MHC_II_alpha_I-EK	Class II major histocompatibility complex (MHC) alpha chain immunoglobulin domain of histocompatibility antigen (HLA) I-E; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin (Ig) domain of major histocompatibility complex (MHC) class II alpha chain of histocompatibility antigen (HLA) I-E. MHC class II molecules play a key role in the initiation of the antigen-specific immune response. These molecules have been shown to be expressed constitutively on the cell surface of professional antigen-presenting cells (APCs), including B-lymphocytes, monocytes, and macrophages in both humans and mice. The expression of these molecules has been shown to be induced in nonprofessional APCs such as keratinocytes, and they are expressed on the surface of activated human T cells and on T cells from other species. The MHC II molecules present antigenic peptides to CD4(+) T-lymphocytes. These peptides derive mostly from proteolytic processing, via the endocytic pathway, of antigens internalized by the APC. These peptides bind to the MHC class II molecules in the endosome before they are transported to the cell surface. MHC class II molecules are heterodimers composed of two similarly sized membrane-spanning chains, alpha and beta. Each chain has two globular domains (N- and C-terminal) and a membrane-anchoring transmembrane segment. The two chains form a compact four-domain structure. The peptide-binding site is a cleft in the structure.	95
409597	cd21006	IgC1_MHC_II_alpha_I-A	Class II major histocompatibility complex (MHC) alpha chain immunoglobulin domain of histocompatibility antigen (HLA) I-A; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin (Ig) domain of major histocompatibility complex (MHC) class II alpha chain of histocompatibility antigen (HLA) I-A. Three genetically distinct isotypes of class II MHC molecules are found in humans (HLA-DR, HLA-DQ, and HLA-DP), and two in mice (I-E and I-A).  I-A and I-E molecules have the same basic features insofar as peptide loading and presentation, although each interacts with distinctly different sets of peptides. They also differ in that there is a relatively high incidence of deletion of the I-E a gene in both inbred strains of mice as well as wild mice and the lack of the reverse situation i.e. the deletion of I-A genes.  A detailed structural understanding of the similarities and differences between I-A and the paralogous I-E could help illuminate the respective roles these molecules play in peptide presentation and T cell activation. Mouse I-Ag7 has a genetic susceptibility to autoimmune diabetes due to its small, uncharged amino acid residue at position 57 of their beta chain which results in the absence of a salt bridge between beta 57 and Arg alpha 76, which is adjacent to the P9 pocket of the peptide-binding groove. MHC class II molecules play a key role in the initiation of the antigen-specific immune reponse. These molecules have been shown to be expressed constitutively on the cell surface of professional antigen-presenting cells (APCs), including B-lymphocytes, monocytes, and macrophages in both humans and mice. The expression of these molecules has been shown to be induced in nonprofessional APCs such as keratinocyctes, and they are expressed on the surface of activated human T cells and on T cells from other species. 
The MHC II molecules present antigenic peptides to CD4(+) T-lymphocytes. These peptides derive mostly from proteolytic processing, via the endocytic pathway, of antigens internalized by the APC. These peptides bind to the MHC class II molecules in the endosome before they are transported to the cell surface. MHC class II molecules are heterodimers composed of two similarly sized membrane-spanning chains, alpha and beta. Each chain has two globular domains (N- and C-terminal) and a membrane-anchoring transmembrane segment. The two chains form a compact four-domain structure. The peptide-binding site is a cleft in the structure.	95
409598	cd21007	IgC1_MHC_II_alpha_HLA-DR	Class II major histocompatibility complex (MHC) alpha chain immunoglobulin domain of histocompatibility antigen (HLA) DR; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin (Ig) domain of major histocompatibility complex (MHC) class II alpha chain of histocompatibility antigen (HLA) DR. MHC class II molecules are encoded by three different loci, HLA-DR, -DQ, and -DP, which are about 70% similar to each other.  HLA-DR is a cell surface receptor protein found on antigen presenting cells. It is an alphabeta heterodimer of type MHC class II. The alpha and beta chains are encoded by two loci, HLA-DRA1 and HLA-DRB1, that are adjacent to each other on chromosome band 6p21.31. Susceptibility to multiple sclerosis and rheumatoid arthritis are associated with the human histocompatibility leukocyte antigen HLA-DR2 and HLA-DR4, respectively. MHC class II molecules play a key role in the initiation of the antigen-specific immune reponse. These molecules have been shown to be expressed constitutively on the cell surface of professional antigen-presenting cells (APCs), including B-lymphocytes, monocytes, and macrophages in both humans and mice. The expression of these molecules has been shown to be induced in nonprofessional APCs such as keratinocyctes, and they are expressed on the surface of activated human T cells and on T cells from other species. The MHC II molecules present antigenic peptides to CD4(+) T-lymphocytes. These peptides derive mostly from proteolytic processing via the endocytic pathway, of antigens internalized by the APC. These peptides bind to the MHC class II molecules in the endosome before they are transported to the cell surface. MHC class II molecules are heterodimers, comprised of two similarly-sized membrane-spanning chains, alpha and beta. Each chain had two globular domains (N- and C-terminal), and a membrane-anchoring transmembrane segment. 
The two chains form a compact four-domain structure. The peptide-binding site is a cleft in the structure.	95
409599	cd21008	IgC1_MHC_II_alpha_HLA-DQ	Class II major histocompatibility complex (MHC) alpha chain immunoglobulin domain of histocompatibility antigen (HLA) DQ and related proteins; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin (Ig) domain of major histocompatibility complex (MHC) class II alpha chain of histocompatibility antigen (HLA) DQ. MHC class II molecules are encoded by three different loci, HLA-DR, -DQ, and -DP, which are about 70% similar to each other.  HLA-DQ (DQ) is a cell surface receptor protein found on antigen presenting cells. It is an alphabeta heterodimer of type MHC class II. The alpha and beta chains are encoded by two loci, HLA-DQA1 and HLA-DQB1, that are adjacent to each other on chromosome band 6p21.3.  A person often produces two alpha-chain and two beta chain variants and thus 4 isoforms of DQ.  Two autoimmune diseases in which HLA-DQ is involved are celiac disease and diabetes mellitus type 1. DQ is one of several antigens involved in rejection of organ transplants. DQ8 is a split antigen of the DQ3 broad antigen. MHC class II molecules play a key role in the initiation of the antigen-specific immune reponse. These molecules have been shown to be expressed constitutively on the cell surface of professional antigen-presenting cells (APCs), including B-lymphocytes, monocytes, and macrophages in both humans and mice. The expression of these molecules has been shown to be induced in nonprofessional APCs such as keratinocyctes, and they are expressed on the surface of activated human T cells and on T cells from other species. The MHC II molecules present antigenic peptides to CD4(+) T-lymphocytes. These peptides derive mostly from proteolytic processing via the endocytic pathway, of antigens internalized by the APC. These peptides bind to the MHC class II molecules in the endosome before they are transported to the cell surface. 
MHC class II molecules are heterodimers composed of two similarly sized membrane-spanning chains, alpha and beta. Each chain has two globular domains (N- and C-terminal) and a membrane-anchoring transmembrane segment. The two chains form a compact four-domain structure. The peptide-binding site is a cleft in the structure.	95
409600	cd21009	IgC1_MHC_II_alpha_HLA-DM	Class II major histocompatibility complex (MHC) alpha chain immunoglobulin domain of histocompatibility antigen (HLA) DM; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin (Ig) domain of major histocompatibility complex (MHC) class II alpha chain of histocompatibility antigen (HLA) DM. Human HLA-DM, also known as H2-M in mice, plays a critical role in antigen presentation to CD4 T cells by catalyzing the exchange of peptides bound to MHC class II molecules. MHC class II molecules play a key role in the initiation of the antigen-specific immune response. These molecules have been shown to be expressed constitutively on the cell surface of professional antigen-presenting cells (APCs), including B-lymphocytes, monocytes, and macrophages in both humans and mice. The expression of these molecules has been shown to be induced in nonprofessional APCs such as keratinocytes, and they are expressed on the surface of activated human T cells and on T cells from other species. The MHC II molecules present antigenic peptides to CD4(+) T-lymphocytes. These peptides derive mostly from proteolytic processing, via the endocytic pathway, of antigens internalized by the APC. These peptides bind to the MHC class II molecules in the endosome before they are transported to the cell surface. MHC class II molecules are heterodimers composed of two similarly sized membrane-spanning chains, alpha and beta. Each chain has two globular domains (N- and C-terminal) and a membrane-anchoring transmembrane segment. The two chains form a compact four-domain structure. The peptide-binding site is a cleft in the structure.	94
409601	cd21010	IgC1_MHC-like_ZAG	Immunoglobulin domain of Zn-alpha2-glycoprotein (ZAG); member of the C1-set of Ig superfamily (IgSF) domains.  The members here are composed of the immunoglobulin domain of Zn-alpha2-glycoprotein (ZAG). ZAG is a soluble protein that is present in serum and other body fluids. ZAG stimulates lipid degradation in adipocytes and causes the extensive fat losses associated with some advanced cancers. The 2.8 angstrom crystal structure of ZAG resembles a class I major histocompatibility complex (MHC) heavy chain, but ZAG does not bind the class I light chain beta-2-microglobulin. The ZAG structure includes a large groove analogous to class I MHC peptide binding grooves. Instead of a peptide, the ZAG groove contains a nonpeptidic compound that may be implicated in lipid catabolism under normal or pathological conditions. IgC_MHC_I_alpha3;  Immunoglobulin (Ig) domain of major histocompatibility complex (MHC) class I alpha chain. Class I MHC proteins bind antigenic peptide fragments and present them to CD8+ T lymphocytes.  Class I molecules consist of a transmembrane alpha chain and a small chain called the beta-2-microglobulin. The alpha chain contains three extracellular domains, two of which fold together to form the peptide-binding cleft (alpha1 and alpha2), and one which has an Ig fold (alpha3).  Peptide binding to class I molecules occurs in the endoplasmic reticulum (ER) and involves both chaperones and dedicated factors to assist in peptide loading.  Class I MHC molecules are expressed on most nucleated cells.	93
409602	cd21011	IgC1_MHC-like_FcRn	immunoglobulin domain of neonatal Fc receptor, major histocompatibility complex (MHC)-like; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin domain of neonatal Fc receptor (FcRn). FcRn performs two distinct functions: the transport of maternal immunoglobulin G (IgG) to pre- or neonatal mammals which provides passive immunity and protection of IgG from normal serum protein catabolism. FcRn is related to class I MHC proteins, but lacks a functional peptide binding groove. Class I MHC proteins bind antigenic peptide fragments and present them to CD8+ T lymphocytes. Class I molecules consist of a transmembrane alpha chain and a small chain called the beta-2-microglobulin. The alpha chain contains three extracellular domains, two of which fold together to form the peptide-binding cleft (alpha1 and alpha2), and one which has an Ig fold (alpha3). Peptide binding to class I molecules occurs in the endoplasmic reticulum (ER) and involves both chaperones and dedicated factors to assist in peptide loading.  Class I MHC molecules are expressed on most nucleated cells.	93
409603	cd21012	IgC1_MHC_H-2_TLA	H-2 class I histocompatibility complex TLA (thymus leukemia antigen); member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the major histocompatibility complex (MHC) H-2 class I histocompatibility complex TLA (thymus leukemia antigen). The murine MHC class I histocompatibility TLA (Thymus leukemia antigen), which is encoded in the T region by T3 and T18 genes, is expressed mainly by intestinal epithelial cells and thymocytes.  The murine TLAs are class I, beta-2-microglobulin-associated glycoproteins. The TLA function is not defined by antigen presentation, but rather by its relatively high affinity binding to CD8-alpha-alpha compared with CD8-alpha-beta. The existence of a human homolog for murine TLA remains unresolved. This group is a member of the C1-set Ig domains, which have one beta sheet that is formed by strands A, B,  E, and D and the other strands by G, F, C, and C'.	95
409604	cd21013	IgC1_MHC_Ib_Qa-1	Class Ib major histocompatibility complex (MHC) immunoglobulin domain of Qa-1 and similar proteins; member of the C1-set of Ig superfamily (IgSF) domains. Class Ib major histocompatibility complex (MHC) immunoglobulin domain of Qa-1 and similar proteins. Qa-1 presents hydrophobic peptides including Qdm derived from the leader sequence of classical MHC I molecules for immune surveillance by NK cells. Qa-1 bound peptides derived from the TCR Vbeta8.2 of activated T cells also activates CD8+ regulatory T cells to control autoimmunity and maintain self-tolerance. Four allotypes of Qa-1 (Qa-1a-d) are expressed that are highly conserved in sequence but have several variations that could affect peptide binding to Qa-1 or TCR recognition. Class I MHC proteins bind antigenic peptide fragments and present them to CD8+ T lymphocytes.  Class I molecules consist of a transmembrane alpha chain and a small chain called the beta-2-microglobulin. The alpha chain contains three extracellular domains, two of which fold together to form the peptide-binding cleft (alpha1 and alpha2), and one which has an Ig fold (alpha3).  Peptide binding to class I molecules occurs in the endoplasmic reticulum (ER) and involves both chaperones and dedicated factors to assist in peptide loading.  Class I MHC molecules are expressed on most nucleated cells.	97
409605	cd21014	IgC1_MHC_Ib_Qa-2	Class Ib major histocompatibility complex (MHC) immunoglobulin domain of Qa-2; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the Class Ib major histocompatibility complex (MHC) immunoglobulin domain of QA-2. Qa-2 is a nonclassical MHC Ib antigen, which has been implicated in both innate and adaptive immune responses, as well as embryonic development. Qa-2 has an unusual peptide binding specificity in that it requires two dominant C-terminal anchor residues and is capable of associating with a substantially more diverse array of peptide sequences than other nonclassical MHC. Class I MHC proteins bind antigenic peptide fragments and present them to CD8+ T lymphocytes.  Class I molecules consist of a transmembrane alpha chain and a small chain called the beta-2-microglobulin. The alpha chain contains three extracellular domains, two of which fold together to form the peptide-binding cleft (alpha1 and alpha2), and one which has an Ig fold (alpha3).  Peptide binding to class I molecules occurs in the endoplasmic reticulum (ER) and involves both chaperones and dedicated factors to assist in peptide loading.  Class I MHC molecules are expressed on most nucleated cells.	94
409606	cd21015	IgC1_MHC_Ia_RT1-Aa	Class Ia major histocompatibility complex (MHC) immunoglobulin domain of RT1-Aa; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the Class Ia major histocompatibility complex (MHC) immunoglobulin domain of RT1-Aa. While most mammalian species transport these peptides into the ER via a single allele of TAP, rats have evolved different TAPs, TAP-A and TAP-B, RT1-Aa and RT1-A1c, which are associated with TAP-A and TAP-B. The rat MHC class Ia molecule RT1-Aa has the unusual capacity to bind long peptides ending in arginine, such as MTF-E, a thirteen-residue, maternally transmitted minor histocompatibility antigen. Class I MHC proteins bind antigenic peptide fragments and present them to CD8+ T lymphocytes. Class I molecules consist of a transmembrane alpha chain and a small chain called the beta-2-microglobulin. The alpha chain contains three extracellular domains, two of which fold together to form the peptide-binding cleft (alpha1 and alpha2), and one which has an Ig fold (alpha3).  Peptide binding to class I molecules occurs in the endoplasmic reticulum (ER) and involves both chaperones and dedicated factors to assist in peptide loading.  Class I MHC molecules are expressed on most nucleated cells.	95
409607	cd21016	IgC1_MHC_Ib_T10_T22_like	Class Ib major histocompatibility complex (MHC) immunoglobulin domain of T10, T22, and similar proteins; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the Class Ib major histocompatibility complex (MHC) immunoglobulin domain of the murine H-2T-encoded T10, T22, and similar proteins.  T10 and T22 are highly related nonclassical major histocompatibility complex (MHC) class Ib proteins that bind to certain gammadelta T cell receptors (TCRs) in the absence of other components.  Classical MHC class I (class Ia) molecules participate in immune responses by presenting peptide antigens to cytolytic alpha beta T cells. Many nonclassical MHC class I (class Ib) molecules have distinct antigen-binding capabilities, suggesting that they have evolved for specific tasks that are distinct from those of MHC class Ia. Members of the IgC family are components of immunoglobulin, T-cell receptors, CD1 cell surface glycoproteins, secretory glycoproteins A/C, and major histocompatibility complex (MHC) class I/II molecules. In immunoglobulins, each chain is composed of one variable domain (IgV) and one or more IgC domains. These names reflect the fact that the variability in sequences is higher in the variable domain than in the constant domain. The IgV domain is responsible for antigen binding, and the IgC domain is involved in oligomerization and molecular interactions.	97
409608	cd21017	IgC1_MHC_Ia_MIC-A_MIC-B	Class Ia major histocompatibility complex (MHC) immunoglobulin domain of MIC-A and MIC-B; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the Class Ia major histocompatibility complex (MHC) immunoglobulin domain of MIC-A and MIC-B. MIC-A and MIC-B are homologs that serve as stress-inducible antigens on epithelial and epithelially derived cells. Both serve as ligands for the widely expressed activating immunoreceptor NKG2D, a C-type lectin-like activating immunoreceptor. MIC-B is very similar in structure to MIC-A and likely interacts with NKG2D in an analogous manner. The interdomain flexibility observed in the MIC-A structures, a feature unique to MIC proteins among MHC class I proteins and homologs, is also displayed by MIC-B, with an interdomain relationship intermediate between the two examples of MIC-A structures. Mapping sequence variations onto the structures of MIC-A and MIC-B reveals patterns completely distinct from those displayed by classical MHC class I proteins, with a number of substitutions falling on positions likely to affect interactions with NKG2D, but with other positions lying distant from the NKG2D binding sites or buried within the core of the proteins. Members of the IgC family are components of immunoglobulin, T-cell receptors, CD1 cell surface glycoproteins, secretory glycoproteins A/C, and major histocompatibility complex (MHC) class I/II molecules. In immunoglobulins, each chain is composed of one variable domain (IgV) and one or more IgC domains. These names reflect the fact that the variability in sequences is higher in the variable domain than in the constant domain. The IgV domain is responsible for antigen binding and the IgC domain is involved in oligomerization and molecular interactions.	95
409609	cd21018	IgC1_MHC_Ia_H2Db_H2Ld	Class Ia major histocompatibility complex (MHC) immunoglobulin domain of human leukocyte antigen (HLA) H2Db and H2Ld; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the Class Ia major histocompatibility complex (MHC) immunoglobulin domain of human leukocyte antigen (HLA) H2Db and H2Ld.  H-2Ld complexed with peptide QL9 (or p2Ca) and complexed with influenza virus peptide NP366-374 (ASNEN-METM), respectively are high-affinity alloantigens for the 2C T cell receptor (TCR). The a1-a2 super domains of H-2Ld, H-2Db, and H-2Kb closely superimpose. Class I MHC proteins bind antigenic peptide fragments and present them to CD8+ T lymphocytes. Class I molecules consist of a transmembrane alpha chain and a small chain called the beta-2-microglobulin. The alpha chain contains three extracellular domains, two of which fold together to form the peptide-binding cleft (alpha1 and alpha2), and one which has an Ig fold (alpha3).  Peptide binding to class I molecules occurs in the endoplasmic reticulum (ER) and involves both chaperones and dedicated factors to assist in peptide loading.  Class I MHC molecules are expressed on most nucleated cells.	95
409610	cd21019	IgC1_MHC_Ia_H-2Kb	Class Ia major histocompatibility complex (MHC) immunoglobulin domain of H-2Kb; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the Class Ia major histocompatibility complex (MHC) immunoglobulin domain of H-2Kb. H-2Kb is an alloantigen for the 2C T cell receptor (TCR). H-2Kb forms a complex with beta-2-microglobulin, and a peptide, including VSV-8 (RGYVYNGL), SEV-9 (FAPGNYPAL), and OVA-8 (SIINFEKL). Comparison of the OVA-8, VSV-8, and SEV-9 complexes with H-2Kb indicates that four side chains (Lys-66, Glu-152, Arg-155, and Trp-167) adopt peptide-specific conformations.  H-2Kb paralogs include H-2Db, H-2Kbml and H-2KbI1s. Class I MHC proteins bind antigenic peptide fragments and present them to CD8+ T lymphocytes. Class I molecules consist of a transmembrane alpha chain and a small chain called the beta-2-microglobulin. The alpha chain contains three extracellular domains, two of which fold together to form the peptide-binding cleft (alpha1 and alpha2), and one which has an Ig fold (alpha3). Peptide binding to class I molecules occurs in the endoplasmic reticulum (ER) and involves both chaperones and dedicated factors to assist in peptide loading.  Class I MHC molecules are expressed on most nucleated cells.	94
409611	cd21020	IgC1_MHC_Ia_H-2Dd	Class Ia major histocompatibility complex (MHC) immunoglobulin domain of H2-Dd; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the Class Ia major histocompatibility complex (MHC) immunoglobulin domain of H2-Dd. Mouse MHC is composed of 11 subclasses. It includes the classical MHC class I (MHC-Ia) that comprises H-2D, H-2K and H-2L subclasses, the non-classical MHC class I (MHCIb) that comprises H-2Q, H-2M and H-2T subclasses, the classical MHC class II (MHC-IIa) that includes H-2A(I-A) and H-2E(I-E) subclasses, and the non-classical MHC class II (MHC-IIb) comprises H-2M and H-2O. H-2K, H-2D, and H-2L are 80 to 90% homologous at the amino acid level yet appear to be involved in different recognition reactions and are differentially expressed on lymphoid cells.  Class I MHC proteins bind antigenic peptide fragments and present them to CD8+ T lymphocytes. Class I molecules consist of a transmembrane alpha chain and a small chain called the beta-2-microglobulin. The alpha chain contains three extracellular domains, two of which fold together to form the peptide-binding cleft (alpha1 and alpha2), and one which has an Ig fold (alpha3).  Peptide binding to class I molecules occurs in the endoplasmic reticulum (ER) and involves both chaperones and dedicated factors to assist in peptide loading.  Class I MHC molecules are expressed on most nucleated cells.	95
409612	cd21021	IgC1_MHC_Ib_HLA-H	Class Ib major histocompatibility complex (MHC) immunoglobulin domain of human leukocyte antigen H; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the Class Ib major histocompatibility complex (MHC) immunoglobulin domain of human leukocyte antigen H (HLA-H). HLA-H (also known as hereditary hemochromatosis protein; HFE) is a major histocompatibility complex (MHC) class I-like protein that is mutated in Hereditary Hemochromatosis. HFE is a protein of 343 amino acids that includes a signal peptide, an extracellular transferrin receptor-binding region (a1 and a2), an immunoglobulin-like domain (a3), a transmembrane region, and a short cytoplasmic tail.  HFE binds beta-2-microglobulin to form a heterodimer expressed at the cell surface. It binds transferrin receptor (TFRC) in its extracellular alpha1-alpha2 domain.  HFE plays an important part in the regulation of hepcidin expression in response to iron overload and the liver is important in the pathophysiology of HFE-associated hemochromatosis.  Nine HFE splicing variants have been reported with transcripts lacking exon 2 or exon 3, or exons 2-3, 2-4, or 2-5.  Diverse mutations involving HFE introns and exons discovered in persons with hemochromatosis or their family members cause or probably cause high iron phenotypes. Class I MHC proteins bind antigenic peptide fragments and present them to CD8+ T lymphocytes. Class I molecules consist of a transmembrane alpha chain and a small chain called the beta-2-microglobulin. The alpha chain contains three extracellular domains, two of which fold together to form the peptide-binding cleft (alpha1 and alpha2), and one which has an Ig fold (alpha3).  Peptide binding to class I molecules occurs in the endoplasmic reticulum (ER) and involves both chaperones and dedicated factors to assist in peptide loading.  Class I MHC molecules are expressed on most nucleated cells.	94
409613	cd21022	IgC1_MHC_Ia_HLA-G	Class Ib major histocompatibility complex (MHC) immunoglobulin domain of human leukocyte antigen (HLA) G; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the Class Ib major histocompatibility complex (MHC) immunoglobulin domain of human leukocyte antigen (HLA) G. HLA-G histocompatibility antigen (also known as human leukocyte antigen G; HLA-G) is a protein that in humans is encoded by the HLA-G gene. HLA-G belongs to the HLA nonclassical class I heavy chain paralogs. This class I molecule is a heterodimer consisting of a heavy chain and light chain, beta-2-microglobulin. The heavy chain is anchored in the membrane. HLA-G may play a role in immune tolerance in pregnancy, being expressed in the placenta by extravillous trophoblast cells (EVT), while the classical MHC class I genes (HLA-A and HLA-B) are not. Immunoglobulin (Ig) domain of major histocompatibility complex (MHC) class I and class II. Class I MHC proteins bind antigenic peptide fragments and present them to CD8+ T lymphocytes.  Class I molecules consist of a transmembrane alpha chain and a small chain called the beta-2-microglobulin. The alpha chain contains three extracellular domains, two of which fold together to form the peptide-binding cleft (alpha1 and alpha2), and one which has an Ig fold (alpha3).  Peptide binding to class I molecules occurs in the endoplasmic reticulum (ER) and involves both chaperones and dedicated factors to assist in peptide loading.  Class I MHC molecules are expressed on most nucleated cells. MHC class II molecules play a key role in the initiation of the antigen-specific immune response. These molecules have been shown to be expressed constitutively on the cell surface of professional antigen-presenting cells (APCs), including B-lymphocytes, monocytes, and macrophages in both humans and mice. 
The expression of these molecules has been shown to be induced in nonprofessional APCs such as keratinocytes, and they are expressed on the surface of activated human T cells and on T cells from other species. The MHC II molecules present antigenic peptides to CD4(+) T-lymphocytes. These peptides derive mostly from proteolytic processing, via the endocytic pathway, of antigens internalized by the APC. These peptides bind to the MHC class II molecules in the endosome before they are transported to the cell surface. MHC class II molecules are heterodimers, composed of two similarly-sized membrane-spanning chains, alpha and beta. Each chain has two globular domains (N- and C-terminal), and a membrane-anchoring transmembrane segment. The two chains form a compact four-domain structure. The peptide-binding site is a cleft in the structure.	94
409614	cd21023	IgC1_MHC_Ia_HLA-F	Class Ib major histocompatibility complex (MHC) immunoglobulin domain of human leukocyte antigen (HLA) F; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the Class Ib major histocompatibility complex (MHC) immunoglobulin domain of human leukocyte antigen alpha chain F (HLA-F).  HLA-F, encoded by the HLA-F gene in humans, belongs to the non-classical HLA class I heavy chain paralogs. This class I molecule mainly exists as a heterodimer associated with the invariant light chain beta-2-microglobulin. HLA-F molecules can interact with both activating and inhibitory receptors on immune cells, such as NK cells, and can present a diverse panel of peptides. Class I MHC proteins bind antigenic peptide fragments and present them to CD8+ T lymphocytes.  Class I molecules consist of a transmembrane alpha chain and a small chain called the beta-2-microglobulin. The alpha chain contains three extracellular domains, two of which fold together to form the peptide-binding cleft (alpha1 and alpha2), and one which has an Ig fold (alpha3).  Peptide binding to class I molecules occurs in the endoplasmic reticulum (ER) and involves both chaperones and dedicated factors to assist in peptide loading.  Class I MHC molecules are expressed on most nucleated cells.	98
409615	cd21024	IgC1_MHC_Ib_HLA-E	Class Ib major histocompatibility complex (MHC) immunoglobulin domain of human leukocyte antigen (HLA) E; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the Class Ib major histocompatibility complex (MHC) immunoglobulin domain of human leukocyte antigen (HLA) E. HLA-E is the first human class Ib major histocompatibility complex molecule to be crystallized. Like other MHC class I molecules, HLA-E is a heterodimer consisting of an alpha heavy chain and a light chain, beta-2-microglobulin. HLA-E is highly conserved and almost nonpolymorphic, and has recently been shown to be the first specialized ligand for natural killer cell receptors.  Class I MHC proteins bind antigenic peptide fragments and present them to CD8+ T lymphocytes. Class I molecules consist of a transmembrane alpha chain and a small chain called the beta-2-microglobulin. The alpha chain contains three extracellular domains, two of which fold together to form the peptide-binding cleft (alpha1 and alpha2), and one which has an Ig fold (alpha3). Peptide binding to class I molecules occurs in the endoplasmic reticulum (ER) and involves both chaperones and dedicated factors to assist in peptide loading.  Class I MHC molecules are expressed on most nucleated cells.	95
409616	cd21025	IgC1_MHC_Ib_HLA-Cw3-4	Class Ib major histocompatibility complex (MHC) immunoglobulin domain of HLA-Cw3 and HLA-Cw4; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the Class Ib major histocompatibility complex (MHC) immunoglobulin domain of HLA-Cw3 and HLA-Cw4. HLA-C belongs to the MHC class I heavy chain receptors. The C receptor is a heterodimer consisting of a HLA-C mature gene product and beta-2-microglobulin. The mature C chain is anchored in the membrane. MHC Class I molecules, like HLA-C, are expressed in nearly all cells, and present small peptides to the immune system which surveys for non-self peptides.  HLA-C is a locus on chromosome 6, which encodes a large number of HLA-C alleles that are Class-I MHC receptors. Class Ib histocompatibility leukocyte antigens (HLA)-Cw3 and (HLA)-Cw4 are ligands for the natural killer (NK) cell inhibitory receptors KIR2DL2 and KIR2DL1, respectively.  HLA-Cw3 and related alleles (HLA-Cw1, -Cw7, and -Cw8) contain Ser77 and Asn80 and interact with KIR that are reactive with the GL183 antibody. HLA-Cw4 and related alleles (HLA-Cw2, -Cw5, and -Cw6) have Asn77 and Lys80 and are recognized by KIR reactive with the EB6 or HP-3E4 antibody. Class I MHC proteins bind antigenic peptide fragments and present them to CD8+ T lymphocytes.  Members of the IgC family are components of immunoglobulin, T-cell receptors, CD1 cell surface glycoproteins, secretory glycoproteins A/C, and major histocompatibility complex (MHC) class I/II molecules. In immunoglobulins, each chain is composed of one variable domain (IgV) and one or more IgC domains. These names reflect the fact that the variability in sequences is higher in the variable domain than in the constant domain. The IgV domain is responsible for antigen binding, and the IgC domain is involved in oligomerization and molecular interactions.	96
409617	cd21026	IgC1_MHC_Ia_HLA-B	Class Ia major histocompatibility complex (MHC) immunoglobulin domain of human leukocyte antigen (HLA) B and similar proteins; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the class Ia major histocompatibility complex (MHC) immunoglobulin domain of human leukocyte antigen (HLA) B and similar proteins. The classical class I molecules (HLA-A, -B, and -C) are responsible for the presentation of endogenous antigen to CD8+ T cells. The receptor is a heterodimer, and is composed of a heavy alpha chain and smaller beta chain. The alpha chain is encoded by a variant HLA-B gene, and the beta chain (beta-2-microglobulin) is an invariant beta-2-microglobulin molecule.  The beta-2-microglobulin protein is coded for by a separate region of the human genome. Human leukocyte antigen (HLA) B*3501 (B35) is a common human allele involved in mediating protective immunity against HIV.  Class I MHC proteins bind antigenic peptide fragments and present them to CD8+ T lymphocytes.  Class I molecules consist of a transmembrane alpha chain and a small chain called the beta-2-microglobulin. The alpha chain contains three extracellular domains, two of which fold together to form the peptide-binding cleft (alpha1 and alpha2), and one which has an Ig fold (alpha3).  Peptide binding to class I molecules occurs in the endoplasmic reticulum (ER) and involves both chaperones and dedicated factors to assist in peptide loading.  Class I MHC molecules are expressed on most nucleated cells.	97
409618	cd21027	IgC1_MHC_Ia_HLA-A	Class Ia major histocompatibility complex (MHC) immunoglobulin domain of human leukocyte antigen (HLA) A; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the class Ia major histocompatibility complex (MHC) immunoglobulin domain of human leukocyte antigen (HLA) A. The classical class I molecules (HLA-A, -B, and -C) are responsible for the presentation of endogenous antigen to CD8+ T cells. The receptor is a heterodimer, and is composed of a heavy alpha chain and smaller beta chain. The alpha chain is encoded by a variant HLA-A gene, and the beta chain (beta-2-microglobulin) is an invariant beta-2-microglobulin molecule.  The beta-2-microglobulin protein is coded for by a separate region of the human genome. HLA-A2 is associated with spontaneous abortions, HIV, and Hodgkin lymphoma.  Class I molecules consist of a transmembrane alpha chain and a small chain called the beta-2-microglobulin. The alpha chain contains three extracellular domains, two of which fold together to form the peptide-binding cleft (alpha1 and alpha2), and one which has an Ig fold (alpha3).  Peptide binding to class I molecules occurs in the endoplasmic reticulum (ER) and involves both chaperones and dedicated factors to assist in peptide loading.  Class I MHC molecules are expressed on most nucleated cells.	95
409619	cd21028	IgC1_MHC_I_M144	Class I major histocompatibility complex (MHC) homolog m144; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin (Ig) domain of major histocompatibility complex (MHC) homolog m144 class I alpha chain. Class I MHC proteins bind antigenic peptide fragments and present them to CD8+ T lymphocytes.  Class I molecules consist of a transmembrane alpha chain and a small chain called the beta-2-microglobulin. The alpha chain contains three extracellular domains, two of which fold together to form the peptide-binding cleft (alpha1 and alpha2), and one which has an Ig fold (alpha3).  Peptide binding to class I molecules occurs in the endoplasmic reticulum (ER) and involves both chaperones and dedicated factors to assist in peptide loading.  Class I MHC molecules are expressed on most nucleated cells.	101
409620	cd21029	IgC1_CD1	Immunoglobulin domain of Cluster of Differentiation (CD) 1; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the immunoglobulin domain of Cluster of Differentiation (CD) 1.  The CD1 family of transmembrane glycoproteins is structurally related to the major histocompatibility complex (MHC) proteins, and its members form heterodimers with beta-2-microglobulin. They mediate the presentation of primarily lipid and glycolipid antigens of self or microbial origin to T cells. The human genome contains five CD1 family genes (CD1a, CD1b, CD1c, CD1d, and CD1e) organized in a cluster on chromosome 1. The CD1 family members are thought to differ in their cellular localization and specificity for particular lipid ligands. CD1a localizes to the plasma membrane and to recycling vesicles of the early endocytic system.  Alternative splicing results in multiple transcript variants. Immunoglobulin (Ig) domain of major histocompatibility complex (MHC) class I alpha chain. Class I MHC proteins bind antigenic peptide fragments and present them to CD8+ T lymphocytes. Class I molecules consist of a transmembrane alpha chain and a small chain called the beta-2-microglobulin. The alpha chain contains three extracellular domains, two of which fold together to form the peptide-binding cleft (alpha1 and alpha2), and one which has an Ig fold (alpha3).  Peptide binding to class I molecules occurs in the endoplasmic reticulum (ER) and involves both chaperones and dedicated factors to assist in peptide loading.  Class I MHC molecules are expressed on most nucleated cells. C1-set Ig domains have one beta sheet formed by strands A, B, E, and D and the other formed by strands G, F, C, and C'.	93
411025	cd21030	V35-RBD_P-protein-C_like	C-terminal RNA-binding domain (RBD) of Ebola virus VP35 phosphoprotein and related proteins. This family includes the C-terminal RNA-binding domain (RBD) of the P protein of viruses belonging to the Filoviridae family, such as Ebola virus or Marburg virus. VP35-RBD contains two subdomains: an alpha-helical subdomain and a beta-sheet subdomain.  Virus infection typically activates host innate immunity, including the interferon (IFN) signaling pathway; VP35-RBD binds double-stranded RNA (dsRNA), inhibiting IFN-alpha/beta signaling. The family Filoviridae belongs to the order Mononegavirales which are nonsegmented negative-stranded RNA viruses (NNVs). The genomes of NNVs are encapsidated by their nucleocapsid (N) proteins to form N-RNA complexes which serve as a template for transcription and replication. The C-terminus of P protein binds nucleocapsid. P protein plays multiple roles in transcription and translation, which include acting as a chaperone of nascent nucleoprotein (N), and as a cofactor of the viral polymerase (L) where P forms a two-subunit polymerase with a large catalytic subunit (L) and stabilizes the polymerase on its template of N-RNA.	125
411026	cd21031	MEV_P-protein-C_like	C-terminal domain of Measles virus phosphoprotein and related proteins. This family includes the C-terminal domain of the P protein of viruses belonging to the Paramyxoviridae family, such as measles virus and mumps virus. The family Paramyxoviridae belongs to the order Mononegavirales which are nonsegmented negative-stranded RNA viruses (NNVs). The genomes of NNVs are encapsidated by their nucleocapsid (N) proteins to form N-RNA complexes which serve as a template for transcription and replication. The C-terminus of P protein binds nucleocapsid. P protein plays multiple roles in transcription and translation, which include acting as a chaperone of nascent nucleoprotein (N), and as a cofactor of the viral polymerase (L) where P forms a two-subunit polymerase with a large catalytic subunit (L) and stabilizes the polymerase on its template of N-RNA. Paramyxoviruses have a polycistronic phosphoprotein (P) gene which encodes proteins in addition to P protein; for example, the measles virus P gene encodes P protein and virulence factor V (MV-V). This domain family includes the unshared C-terminal domain of P protein not present in MV-V.	46
411027	cd21032	RABV_P-protein-C_like	C-terminal domain of Rabies virus phosphoprotein and related proteins. This family includes the C-terminal domain of the P protein of viruses belonging to the Rhabdoviridae family, such as Rabies virus (RABV). RABV P protein is known to counteract the functions of various cellular factors involved in antiviral responses, including STAT1 and interferon-induced promyelocytic leukaemia (PML) protein; the C-terminal domain of the RABV P protein includes STAT1 and PML binding sites. The family Rhabdoviridae belongs to the order Mononegavirales which are nonsegmented negative-stranded RNA viruses (NNVs). The genomes of NNVs are encapsidated by their nucleocapsid (N) proteins to form N-RNA complexes which serve as a template for transcription and replication. The C-terminus of P protein binds nucleocapsid. P protein plays multiple roles in transcription and translation, which include acting as a chaperone of nascent nucleoprotein (N), and as a cofactor of the viral polymerase (L) where P forms a two-subunit polymerase with a large catalytic subunit (L) and stabilizes the polymerase on its template of N-RNA.	105
411028	cd21033	VSV_P-protein-C_like	C-terminal domain of Vesicular stomatitis Indiana virus phosphoprotein and related proteins. This family includes the C-terminal domain of the P protein of viruses belonging to the Rhabdoviridae family, such as Vesicular stomatitis Indiana virus (VSV). The family Rhabdoviridae belongs to the order Mononegavirales which are nonsegmented negative-stranded RNA viruses (NNVs). The genomes of NNVs are encapsidated by their nucleocapsid (N) proteins to form N-RNA complexes which serve as a template for transcription and replication. The C-terminus of P protein binds nucleocapsid. P protein plays multiple roles in transcription and translation, which include acting as a chaperone of nascent nucleoprotein (N), and as a cofactor of the viral polymerase (L) where P forms a two-subunit polymerase with a large catalytic subunit (L) and stabilizes the polymerase on its template of N-RNA.	71
411029	cd21036	WH_MUS81	winged helix domain found in crossover junction endonuclease MUS81 and similar proteins. MUS81 is a crossover junction endonuclease that interacts with EME1 (essential meiotic structure-specific endonuclease 1) and EME2 to form a DNA structure-specific endonuclease with substrate preference for branched DNA structures with a 5'-end at the branch nick. The MUS81-EME1 endonuclease maintains genomic integrity in metazoans by cleaving branched DNA structures that can form during mitosis and fission yeast meiosis, and during processing of damaged replication forks. This model corresponds to the winged helix (WH) domain of MUS81, which is responsible for DNA binding. It comprises four helices and two beta strands.	94
411030	cd21037	MLKL_NTD	N-terminal domain of mixed lineage kinase domain-like protein (MLKL) and similar proteins. MLKL is a pseudokinase that does not have protein kinase activity and plays a key role in tumor necrosis factor (TNF)-induced necroptosis, a programmed cell death process. The model corresponds to the MLKL N-terminal region that reveals a four-helix bundle with an additional helix at the top which is likely key for MLKL function. The N-terminal domain binds directly to phospholipids and induces membrane permeabilization.	138
410951	cd21039	NURR	NURR (N-terminal unit for RNA recognition) domain. The NURR domain is a self-folding globular RNA-binding domain with an all alpha-helix architecture and a highly conserved negatively charged surface area. It also contains a large hydrophobic cavity and a positively charged surface area as potential epitopes for inter-molecular interactions. The NURR domain has been found in Drosophila melanogaster Syncrip and the vertebrate heterogeneous nuclear ribonucleoproteins hnRNPR and hnRNPQ.	77
411031	cd21044	Rab11BD_RAB3IP_like	Rab11 binding domain of Rab-3A-interacting protein (RAB3IP), Rab-3A-interacting-like protein 1 (RAB3IL1) and similar proteins. The family includes RAB3IP and RAB3IL1, as well as Rab guanine nucleotide exchange factor SEC2 from yeast. RAB3IP, also called Rabin-3, or SSX2-interacting protein, or Rabin8, acts as a guanine nucleotide exchange factor (GEF) which promotes the exchange of GDP to GTP, converting inactive GDP-bound Rab proteins into their active GTP-bound form. It mediates the release of GDP from RAB8A and RAB8B but not from RAB3A or RAB5. It modulates actin organization and promotes polarized transport of RAB8A-specific vesicles to the cell surface. RAB3IL1, also called guanine nucleotide exchange factor for Rab-3A (GRAB), or Rab3A-interacting-like protein 1, or Rabin3-like 1, acts as a guanine nucleotide exchange factor (GEF) which promotes the exchange of GDP to GTP, converting inactive GDP-bound Rab proteins into their active GTP-bound form. As a dual Rab-binding protein, RAB3IL1 could potentially link Rab3 and Rab11 and/or Rab8 and Rab11-mediated intracellular trafficking processes. It may activate RAB3A, a GTPase that regulates synaptic vesicle exocytosis. It may also activate RAB8A and RAB8B. In addition, RAB3IL1 interacts with InsP6K1 and plays a role for InsP7 in vesicle exocytosis. SEC2 is a guanine nucleotide exchange factor for SEC4, catalyzing the dissociation of GDP from SEC4 and also potently promoting binding of GTP. Activation of SEC4 by SEC2 is needed for the directed transport of vesicles to sites of exocytosis. SEC2 binds the Rab GTPase YPT32 but does not have exchange activity on YPT32. The model corresponds to the Rab11a/Rab11b-binding region of family members which lies within the carboxy-terminus, a region distinct from their GEF domain and Rab3a-binding region.	178
410965	cd21050	ELD_TRPML	extracytosolic/lumenal domain (ELD) found in transient receptor potential channel mucolipins (TRPMLs). TRPML family proteins contain a linker between the first two transmembrane helices (S1 and S2), which is called TRPML I-II linker. It forms a tight tetramer that is crucial for full-length TRPMLs assembly and localization. In lysosomes and endosomes, this linker faces the lumen (it is therefore also referred to as the 'luminal linker'); on the plasma membrane, it faces the extracellular solution. TRPML I-II linker has been named as extracytosolic/lumenal domain (ELD).	167
411034	cd21055	WH_NTD_SMARCB1_like	N-terminal winged helix DNA-binding domain found in SMARCB1, PHF10 and similar proteins. The family includes SWI/SNF-related matrix-associated actin-dependent regulator of chromatin subfamily B member 1 (SMARCB1) and PHD finger protein 10 (PHF10), both of which have an N-terminal winged helix DNA-binding domain that is structurally related to the SKI/SNO/DAC domain found in a number of metazoan chromatin-associated proteins. SMARCB1, also termed BRG1-associated factor 47 (BAF47), or integrase interactor 1 protein (INI1), or SNF5, or SNF5L1, is a core component of the BAF (hSWI/SNF) complex, an ATP-dependent chromatin-remodeling complex that plays important roles in cell proliferation and differentiation, in cellular antiviral activities and inhibition of tumor formation. PHF10, also termed BRG1-associated factor 45a (BAF45a), or XAP135, is involved in transcription activity regulation by chromatin remodeling. It is a component of the neural progenitors-specific chromatin remodeling complex (npBAF complex) and plays a role in the proliferation of neural progenitors.	80
411038	cd21058	toxin_MLD_like	membrane localization domain (MLD) of Vibrio MARTX, Pasteurella PMT, clostridial glycosylating cytotoxins, toxin effectors BteA (Bordetella T3SS effector A) and related proteins. This family includes membrane localization domains (MLDs) for toxin effectors such as the Rho-inactivation domain of Vibrio MARTX, Pasteurella mitogenic toxin (PMT), where it has been termed PMT C1 domain, and clostridial glycosylating cytotoxins including Clostridium difficile toxins A (TcdA) and B (TcdB), Clostridium novyi alpha-toxin (TcnA), and Clostridium sordellii lethal toxin (TcsL). It also includes the MLD located in the N-terminal minimal membrane-binding fragment of BteA, a type III secretion system (T3SS) effector protein from Bordetella pertussis, the causative agent of whooping cough.	78
411040	cd21059	LciA-like	lactococcin A immunity protein (LciA) and similar proteins. This family includes pore-forming bacteriocin class IId lactococcin A immunity protein (LciA) and similar proteins. The subclass IId is a linear, one-peptide bacteriocin that shares no sequence similarity to the other class II pediocin-like bacteriocins (class IIa), two-peptide bacteriocins (class IIb) or cyclic bacteriocins (class IIc). However, they all induce membrane leakage and cell death by specifically binding the mannose phosphotransferase system (man-PTS) on their target cells. LciA shares the same 4-helical bundle structure as the pediocin-like immunity proteins but has a shorter C-terminal helix and a different surface potential. Also, it has a flexible C-terminal tail that is important for the functionality of the immunity protein.	69
410634	cd21061	7tm_viral_rhodopsin	viral rhodopsins and similar proteins, members of the seven-transmembrane GPCR superfamily. This subfamily is composed of viral homologs of proteorhodopsins (PRs), which are blue-light absorbing and green-light absorbing proteins acting as light-driven proton pumps that play a major role in supplying light energy for phototrophic marine microorganisms, by a mechanism similar to that of bacteriorhodopsin. Viral proteorhodopsins are predicted to function as sensory rhodopsins that could affect signaling, for example, phototaxis in the infected protists, perhaps stimulating relocation of the infected protists to areas that are rich in nutrients required for virus reproduction. Viral proteorhodopsins are monophyletic and split into two distinct groups, I and II, represented by Phaeocystis globosa virus 12T VirRDTS and Organic Lake phycodnavirus OLPVRII, respectively. PRs belong to the microbial rhodopsin family, also known as type 1 rhodopsins, which also comprise the light-driven inward chloride pump halorhodopsin (HR), the light-gated cation channel channelrhodopsin (ChR), the light-sensor activating transmembrane transducer protein sensory rhodopsin II (SRII), the light-sensor activating soluble transducer protein Anabaena sensory rhodopsin (ASR), and the other light-driven proton pumps such as bacteriorhodopsin (BR). While microbial (type 1) and animal (type 2) rhodopsins have no sequence similarity with each other, they share a common architecture consisting of seven-transmembrane alpha-helices (TM) connected by extracellular loops and intracellular loops. Both types of rhodopsins consist of opsin and a covalently attached retinal (the aldehyde of vitamin A), a photoreactive chromophore, via a protonated Schiff base linkage to an amino group of lysine in the middle of the seventh transmembrane helix (TM7). 
Upon the absorption of light, microbial rhodopsins undergo light-induced photoisomerization of all-trans retinal into the 13-cis isomer, whereas the photoisomerization of 11-cis retinal to all-trans isomer occurs in the animal rhodopsins. While animal visual rhodopsins are activated by light to catalyze GDP/GTP exchange in the alpha subunit of the retinal G protein transducin (Gt), microbial rhodopsins do not activate G proteins, but instead can function as light-dependent ion pumps, cation channels, and sensors.	210
410952	cd21064	NURR_hnRNPQ-like	NURR (N-terminal unit for RNA recognition) domain found in heterogeneous nuclear ribonucleoproteins hnRNPQ, hnRNPR and similar proteins. The family includes hnRNPQ and hnRNPR. hnRNPQ, also termed glycine- and tyrosine-rich RNA-binding protein (GRY-RBP), or NS1-associated protein 1 (NSAP1), or Synaptotagmin-binding, cytoplasmic RNA-interacting protein (SYNCRIP), is a highly conserved RNA-binding protein that mediates the exosomal partition of a set of miRNAs. It acts as a component of the hepatocyte exosomal miRNA sorting machinery. hnRNPR is a highly conserved RNA-binding protein that belongs to the heterogeneous nuclear ribonucleoprotein (hnRNP) family. hnRNP plays an important role in processing of precursor mRNA in the nucleus. hnRNPR acts as a general positive regulator of MHC class I expression.	78
410953	cd21065	NURR_Syncrip-like	NURR (N-terminal unit for RNA recognition) domain found in Drosophila melanogaster Syncrip and similar proteins. Syncrip is a conserved RNA-binding protein important in neuronal and muscular development in Drosophila. It is essential for the morphology and growth of the neuromuscular junction and regulates cytoplasmic vesicle-based messenger RNA (mRNA) transport. The model corresponds to the NURR domain of Syncrip, which is an RNA-binding domain with a highly conserved RNA-binding surface.	83
410954	cd21066	NURR_hnRNPQ	NURR (N-terminal unit for RNA recognition) domain found in heterogeneous nuclear ribonucleoprotein Q (hnRNPQ) and similar proteins. hnRNPQ, also termed glycine- and tyrosine-rich RNA-binding protein (GRY-RBP), or NS1-associated protein 1 (NSAP1), or Synaptotagmin-binding, cytoplasmic RNA-interacting protein (SYNCRIP), is a highly conserved RNA-binding protein that mediates the exosomal partition of a set of miRNAs. It acts as a component of the hepatocyte exosomal miRNA sorting machinery. The model corresponds to NURR domain of hnRNPQ, which has structural similarity to bacterial protein Barstar and binds to Apobec1.	85
410955	cd21067	NURR_hnRNPR	NURR (N-terminal unit for RNA recognition) domain found in heterogeneous nuclear ribonucleoprotein R (hnRNPR) and similar proteins. hnRNPR is a highly conserved RNA-binding protein that belongs to the heterogeneous nuclear ribonucleoprotein (hnRNP) family. hnRNP plays an important role in processing of precursor mRNA in the nucleus. hnRNPR acts as a general positive regulator of MHC class I expression. The model corresponds to NURR domain of hnRNPR.	84
411032	cd21068	Rab11BD_RAB3IP	Rab11 binding domain of Rab-3A-interacting protein (RAB3IP). RAB3IP, also called Rabin-3, or SSX2-interacting protein, or Rabin8, acts as a guanine nucleotide exchange factor (GEF) which promotes the exchange of GDP to GTP, converting inactive GDP-bound Rab proteins into their active GTP-bound form. It mediates the release of GDP from RAB8A and RAB8B but not from RAB3A or RAB5. It modulates actin organization and promotes polarized transport of RAB8A-specific vesicles to the cell surface. The model corresponds to the Rab11a/Rab11b-binding region of RAB3IP, which lies within its carboxy-terminus, a region distinct from its GEF domain and Rab3a-binding region.	193
411033	cd21069	Rab11BD_RAB3IL1	Rab11 binding domain of Rab-3A-interacting-like protein 1 (RAB3IL1). RAB3IL1, also called guanine nucleotide exchange factor for Rab-3A (GRAB), or Rab3A-interacting-like protein 1, or Rabin3-like 1, acts as a guanine nucleotide exchange factor (GEF) which promotes the exchange of GDP to GTP, converting inactive GDP-bound Rab proteins into their active GTP-bound form. As a dual Rab-binding protein, RAB3IL1 could potentially link Rab3 and Rab11 and/or Rab8 and Rab11-mediated intracellular trafficking processes. It may activate RAB3A, a GTPase that regulates synaptic vesicle exocytosis. It may also activate RAB8A and RAB8B. In addition, RAB3IL1 interacts with InsP6K1 and plays a role for InsP7 in vesicle exocytosis. The model corresponds to the Rab11a/Rab11b-binding region of RAB3IL1, which lies within its carboxy-terminus, a region distinct from its GEF domain and Rab3a-binding region.	163
410966	cd21070	ELD_TRPML1	extracytosolic/lumenal domain (ELD) found in transient receptor potential channel mucolipin 1 (TRPML1). TRPML1, also called mucolipin-1 (ML1), or MG-2, or Mucolipidin, may play a major role in Ca(2+) release from late endosome and lysosome vesicles to the cytoplasm, which is important for many lysosome-dependent cellular events, including the fusion and trafficking of these organelles, exocytosis and autophagy. The model corresponds to extracytosolic/lumenal domain (ELD), a linker located between the first two transmembrane segments (S1 and S2) of TRPML1. It forms a tight tetramer that is crucial for full-length TRPML1 assembly and localization.	171
410967	cd21071	ELD_TRPML2	extracytosolic/lumenal domain (ELD) found in transient receptor potential channel mucolipin 2 (TRPML2). TRPML2, also called mucolipin-2 (ML2), acts as Ca(2+)-permeable cation channel with inwardly rectifying activity. It may activate ARF6 and be involved in the trafficking of GPI-anchored cargo proteins to the cell surface via the ARF6-regulated recycling pathway. The model corresponds to extracytosolic/lumenal domain (ELD), a linker located between the first two transmembrane segments (S1 and S2) of TRPML2. It forms a tight tetramer that is crucial for full-length TRPML2 assembly and localization.	167
410968	cd21072	ELD_TRPML3	extracytosolic/lumenal domain (ELD) found in transient receptor potential channel mucolipin 3 (TRPML3). TRPML3, also called mucolipin-3 (ML3), acts as Ca(2+)-permeable cation channel with inwardly rectifying activity. It mediates release of Ca(2+) from endosomes to the cytoplasm, contributes to endosomal acidification and is involved in the regulation of membrane trafficking and fusion in the endosomal pathway. The model corresponds to extracytosolic/lumenal domain (ELD), a linker located between the first two transmembrane segments (S1 and S2) of TRPML3. It forms a tight tetramer that is crucial for full-length TRPML3 assembly and localization.	169
411039	cd21073	toxin_BteA-MLD_like	membrane localization domain (MLD) of BteA (Bordetella T3SS effector A) cytotoxin, the N-terminal domain of Photox toxin and related proteins. This family includes the MLD located in the N-terminal minimal membrane-binding segment of BteA (residues 1-131, BteA131), which has also been referred to as the lipid raft targeting (LRT) domain/motif. BteA is a type III secretion system (T3SS) effector protein from Bordetella pertussis, a bacterial respiratory pathogen and the causative agent of whooping cough. The BteA131 segment is multifunctional: in addition to targeting phosphatidylinositol (PI)-rich microdomains in the host membrane, it binds its cognate chaperone BtcA. The MLD adopts a four-helix bundle structure, with a positively charged surface that targets phosphatidylinositol 4,5-bisphosphate (PIP2) in the host membrane via critical arginine and lysine residues. A flexible region preceding the BteA helical bundle contains the characteristic beta-motif required for binding BtcA. This domain has significant sequence similarity to the N-terminal domain of effectors and the endo-domain of RTX-type toxins from Photorhabdus luminescens. This family includes the N-terminal domain of Photorhabdus laumondii Photox toxin; little is known about the N-terminus of Photox, but its C-terminus is an actin-targeting ADP-ribosyltransferase.	87
410781	cd21074	DHD_Ski_Sno_Dac	Dachshund-homology domain found in the Ski/Sno/Dac family of transcriptional regulators. The Dachshund-homology domain (DHD), also known as the N-terminal Ski/Sno/Dac domain, adopts a mixed alpha/beta structure containing a helix-turn-helix motif, similar to features found in the forkhead/winged-helix family of DNA binding proteins. It contains a conserved CLPQ motif and can bind co-factors. Its structure suggests that it may also bind DNA. Members of this family include the Ski protein, Ski-like protein (Sno), and Dachshund proteins. Ski may play a role in terminal differentiation of skeletal muscle cells but not in the determination of cells to the myogenic lineage. It functions as a repressor of transforming growth factor-beta (TGF-beta) signaling. Ski-like protein, also known as SKIL or Sno, is the ski proto-oncogene homolog. It may have regulatory roles in cell division or differentiation in response to extracellular signals. Dachshund proteins are essential components of a regulatory network controlling cell fate determination. They have been implicated in eye, limb, brain, and muscle development.	88
410961	cd21075	DBD_XPA-like	DNA-binding domain found in DNA repair protein complementing XP-A cells (XPA), yeast DNA repair protein RAD14 and similar proteins. The family includes DNA repair protein complementing XP-A cells (XPA), yeast DNA repair protein RAD14, zinc transporter 9 (ZNT9) and similar proteins. XPA, also known as xeroderma pigmentosum group A-complementing protein (XPAC), is involved in DNA excision repair. It initiates repair by binding to damaged sites with various affinities, depending on the photoproduct and the transcriptional state of the region. Rad14 is involved in nucleotide excision repair. It binds specifically to damaged DNA and is required for the incision step. Rad14 is a component of the nucleotide excision repair factor 1 (NEF1) complex consisting of Rad1, Rad10 and Rad14. ZNT9, also known as solute carrier family 30 member 9 (SLC30A9), may act as a zinc transporter involved in intracellular zinc homeostasis and may also play a role as nuclear receptor coactivator. The model corresponds to the DNA-binding domain found in XPA and Rad14. It consists of a conserved N-terminal zinc-binding subdomain and a C-terminal alpha/beta fold subdomain. ZNT9 contains only the C-terminal alpha/beta fold subdomain and lacks the N-terminal zinc-binding subdomain.	67
410962	cd21076	DBD_XPA	DNA-binding domain found in DNA repair protein complementing XP-A cells (XPA) and similar proteins. XPA, also known as xeroderma pigmentosum group A-complementing protein (XPAC), is involved in DNA excision repair. It initiates repair by binding to damaged sites with various affinities, depending on the photoproduct and the transcriptional state of the region.	107
410963	cd21077	DBD_Rad14	DNA-binding domain found in yeast DNA repair protein Rad14 and similar proteins. Rad14 is involved in nucleotide excision repair. It binds specifically to damaged DNA and is required for the incision step. Rad14 is a component of the nucleotide excision repair factor 1 (NEF1) complex consisting of Rad1, Rad10 and Rad14.	105
410964	cd21078	NTD_ZNT9	N-terminal domain found in zinc transporter 9 (ZNT9) and similar proteins. ZNT9, also known as solute carrier family 30 member 9 (SLC30A9), may act as a zinc transporter involved in intracellular zinc homeostasis and may also play a role as nuclear receptor coactivator.	89
410782	cd21079	DHD_Ski_Sno	Dachshund-homology domain found in Ski, Ski-like protein (Sno), and similar proteins. Ski may play a role in terminal differentiation of skeletal muscle cells but not in the determination of cells to the myogenic lineage. It functions as a repressor of transforming growth factor-beta (TGF-beta) signaling. Ski-like protein, also known as SKIL or Sno, is the ski proto-oncogene homolog. It may have regulatory roles in cell division or differentiation in response to extracellular signals. The Dachshund-homology domain (DHD), also known as the N-terminal Ski/Sno/Dac domain, adopts a mixed alpha/beta structure containing a helix-turn-helix motif, similar to features found in the forkhead/winged-helix family of DNA binding proteins. It contains a conserved CLPQ motif and can bind co-factors. Its structure suggests that it may also bind DNA.	91
410783	cd21080	DHD_Skor	Dachshund-homology domain found in SKI family transcriptional corepressors, Skor1, Skor2 and similar proteins. Skor1, also known as functional Smad-suppressing element on chromosome 15 (Fussel-15), LBX1 corepressor 1, or ladybird homeobox corepressor 1, acts as a transcriptional corepressor of LBX1 and inhibits BMP signaling. Skor2, also known as functional Smad-suppressing element on chromosome 18 (Fussel-18), LBX1 corepressor 1-like protein, or ladybird homeobox corepressor 1-like protein, exhibits transcriptional repressor activity. It acts as a transforming growth factor-beta (TGF-beta) antagonist in the nervous system. The Dachshund-homology domain (DHD), also known as the N-terminal Ski/Sno/Dac domain, adopts a mixed alpha/beta structure containing a helix-turn-helix motif, similar to features found in the forkhead/winged-helix family of DNA binding proteins. It contains a conserved CLPQ motif and can bind co-factors. Its structure suggests that it may also bind DNA.	91
410784	cd21081	DHD_Dac	Dachshund-homology domain found in the retinal determination protein Dachshund and similar proteins. Dachshund proteins act as transcription factors involved in the regulation of organogenesis. They may be a regulator of SIX1, SIX6 and probably SIX5. The Dachshund-homology domain (DHD), also known as the N-terminal Ski/Sno/Dac domain, adopts a mixed alpha/beta structure containing a helix-turn-helix motif, similar to features found in the forkhead/winged-helix family of DNA binding proteins. It contains a conserved CLPQ motif and can bind co-factors. Its structure suggests that it may also bind DNA. It has been postulated that Dachshund proteins may bind to chromatin DNA via their DHD domains.	95
410785	cd21082	DHD_SKIDA1	Dachshund-homology domain found in SKI/DACH domain-containing protein 1 (SKIDA1) and similar proteins. SKIDA1 is also known as protein DLN-1. Its biological function remains unclear. The Dachshund-homology domain (DHD), also known as the N-terminal Ski/Sno/Dac domain, adopts a mixed alpha/beta structure containing a helix-turn-helix motif, similar to features found in the forkhead/winged-helix family of DNA binding proteins. It contains a conserved CLPQ motif and can bind co-factors. Its structure suggests that it may also bind DNA.	91
410786	cd21083	DHD_Ski	Dachshund-homology domain found in Ski and similar proteins. Ski may play a role in terminal differentiation of skeletal muscle cells but not in the determination of cells to the myogenic lineage. It functions as a repressor of transforming growth factor-beta (TGF-beta) signaling. The Dachshund-homology domain (DHD), also known as the N-terminal Ski/Sno/Dac domain, adopts a mixed alpha/beta structure containing a helix-turn-helix motif, similar to features found in the forkhead/winged-helix family of DNA binding proteins. It contains a conserved CLPQ motif and can bind co-factors. Its structure suggests that it may also bind DNA.	102
410787	cd21084	DHD_Sno	Dachshund-homology domain found in Ski-like protein (Sno) and similar proteins. Ski-like protein, also known as SKIL, Ski-related oncogene (Sno), or Ski-related protein, is the ski proto-oncogene homolog. It may have regulatory roles in cell division or differentiation in response to extracellular signals. The Dachshund-homology domain (DHD), also known as the N-terminal Ski/Sno/Dac domain, adopts a mixed alpha/beta structure containing a helix-turn-helix motif, similar to features found in the forkhead/winged-helix family of DNA binding proteins. It contains a conserved CLPQ motif and can bind co-factors. Its structure suggests that it may also bind DNA.	100
411035	cd21085	WH_NTD_PHF10	N-terminal winged helix DNA-binding domain found in PHD finger protein 10 (PHF10) and similar proteins. PHF10, also termed BRG1-associated factor 45a (BAF45a), or XAP135, is involved in transcription activity regulation by chromatin remodeling. It is a component of the neural progenitors-specific chromatin remodeling complex (npBAF complex) and plays a role in the proliferation of neural progenitors. The model corresponds to the N-terminal winged helix DNA-binding domain of PHF10, which is structurally related to the SKI/SNO/DAC domain that is found in a number of metazoan chromatin-associated proteins.	89
411036	cd21086	WH_NTD_SMARCB1	N-terminal winged helix DNA-binding domain found in SWI/SNF-related matrix-associated actin-dependent regulator of chromatin subfamily B member 1 (SMARCB1) and similar proteins. SMARCB1, also termed BRG1-associated factor 47 (BAF47), or integrase interactor 1 protein (INI1), or SNF5, or SNF5L1, is a core component of the BAF (hSWI/SNF) complex, an ATP-dependent chromatin-remodeling complex that plays important roles in cell proliferation and differentiation, in cellular antiviral activities and inhibition of tumor formation. The model corresponds to the N-terminal winged helix DNA binding domain of SMARCB1, which is structurally related to the SKI/SNO/DAC domain that is found in a number of metazoan chromatin-associated proteins.	88
410635	cd21087	7tm_viral_rhod_II_OLPVRII-like	viral group II rhodopsins such as OLPVRII and similar proteins, members of the seven-transmembrane GPCR superfamily. The viral group II rhodopsins include Organic Lake Phycodnavirus rhodopsin II (OLPVRII), a pentameric light-gated channel that is functionally analogous to well-studied pentameric ligand-gated ion channels playing crucial roles in many cellular processes. It is most likely specific for chloride. Members of this group are considered homologs of proteorhodopsins (PRs), which are blue-light absorbing and green-light absorbing proteins acting as light-driven proton pumps that play a major role in supplying light energy for phototrophic marine microorganisms, by a mechanism similar to that of bacteriorhodopsin. Viral proteorhodopsins are predicted to function as sensory rhodopsins that could affect signaling, for example, phototaxis in the infected protists, perhaps stimulating relocation of the infected protists to areas that are rich in nutrients required for virus reproduction. PRs belong to the microbial rhodopsin family, also known as type 1 rhodopsins, which also comprise the light-driven inward chloride pump halorhodopsin (HR), the light-gated cation channel channelrhodopsin (ChR), the light-sensor activating transmembrane transducer protein sensory rhodopsin II (SRII), the light-sensor activating soluble transducer protein Anabaena sensory rhodopsin (ASR), and the other light-driven proton pumps such as bacteriorhodopsin (BR). While microbial (type 1) and animal (type 2) rhodopsins have no sequence similarity with each other, they share a common architecture consisting of seven-transmembrane alpha-helices (TM) connected by extracellular loops and intracellular loops. Both types of rhodopsins consist of opsin and a covalently attached retinal (the aldehyde of vitamin A), a photoreactive chromophore, via a protonated Schiff base linkage to an amino group of lysine in the middle of the seventh transmembrane helix (TM7). Upon the absorption of light, microbial rhodopsins undergo light-induced photoisomerization of all-trans retinal into the 13-cis isomer, whereas the photoisomerization of 11-cis retinal to all-trans isomer occurs in the animal rhodopsins. While animal visual rhodopsins are activated by light to catalyze GDP/GTP exchange in the alpha subunit of the retinal G protein transducin (Gt), microbial rhodopsins do not activate G proteins, but instead can function as light-dependent ion pumps, cation channels, and sensors.	210
410636	cd21088	7tm_viral_rhod_I_VirRDTS-like	viral group I rhodopsins such as VirRDTS and similar proteins, members of the seven-transmembrane GPCR superfamily. The viral group I rhodopsins include Phaeocystis globosa virus 12T divergent type-1 DTS-motif rhodopsin (VirRDTS), a green light-absorbing proton pump that has a structure similar to that of bacteriorhodopsin (BR) and transfers light energy in a manner that substantially changes medium pH when expressed in a cell. Members of this group are considered homologs of proteorhodopsins (PRs), which are blue-light absorbing and green-light absorbing proteins acting as light-driven proton pumps that play a major role in supplying light energy for phototrophic marine microorganisms, by a mechanism similar to that of bacteriorhodopsin. Viral proteorhodopsins are predicted to function as sensory rhodopsins that could affect signaling, for example, phototaxis in the infected protists, perhaps stimulating relocation of the infected protists to areas that are rich in nutrients required for virus reproduction. PRs belong to the microbial rhodopsin family, also known as type 1 rhodopsins, which also comprise the light-driven inward chloride pump halorhodopsin (HR), the light-gated cation channel channelrhodopsin (ChR), the light-sensor activating transmembrane transducer protein sensory rhodopsin II (SRII), the light-sensor activating soluble transducer protein Anabaena sensory rhodopsin (ASR), and the other light-driven proton pumps such as bacteriorhodopsin (BR). While microbial (type 1) and animal (type 2) rhodopsins have no sequence similarity with each other, they share a common architecture consisting of seven-transmembrane alpha-helices (TM) connected by extracellular loops and intracellular loops. Both types of rhodopsins consist of opsin and a covalently attached retinal (the aldehyde of vitamin A), a photoreactive chromophore, via a protonated Schiff base linkage to an amino group of lysine in the middle of the seventh transmembrane helix (TM7). Upon the absorption of light, microbial rhodopsins undergo light-induced photoisomerization of all-trans retinal into the 13-cis isomer, whereas the photoisomerization of 11-cis retinal to all-trans isomer occurs in the animal rhodopsins. While animal visual rhodopsins are activated by light to catalyze GDP/GTP exchange in the alpha subunit of the retinal G protein transducin (Gt), microbial rhodopsins do not activate G proteins, but instead can function as light-dependent ion pumps, cation channels, and sensors.	211
411041	cd21089	Trm112-like	eukaryotic tRNA methyltransferase 112, a partner protein of both rRNA/tRNA and protein methyltransferases, and similar proteins. This family contains eukaryotic tRNA methyltransferase 112 (Trm112)-like proteins such as human multifunctional methyltransferase subunit Trm112 protein, which acts as an activator of both rRNA/tRNA and protein methyltransferases. Trm112 acts as an obligate activating platform for at least four methyltransferases (MTase) involved in the modification of 18S rRNA (Bud23), tRNA (Trm9 and Trm11) and translation termination factor eRF1 (Mtq2) in eukaryotes. Hence, Trm112 is at a nexus between ribosome synthesis and function. Trm112 is a partner protein of N6amt1 (N6-adenine-specific DNA methyltransferase 1), which is suggested to be the N6-adenine DNA methyltransferase (MTase) in human cells. Trm112 binds to a hydrophobic surface of N6amt1, stabilizing its structure but not directly contributing to substrate binding and catalysis. In Yarrowia lipolytica, it forms a complex with Trm9 methyltransferase, which is involved in the 5-methoxycarbonylmethyluridine (mcm(5)U) modification of the tRNA anticodon wobble position and hence promotes translational fidelity. In Saccharomyces cerevisiae, Trm112 (also called Ynr046w or tRNA methyltransferase 112) is a zinc binding protein that is plurifunctional and a component of the eRF1 methyltransferase, putatively containing a zinc finger signature motif.	117
411042	cd21090	C11orf65	chromosome 11 open reading frame 65 and homologs. Chromosome 11 open reading frame 65 (C11orf65) is an uncharacterized protein that may be associated with potential sensitivity to metformin in type 2 diabetes (diabetes mellitus) patients without cancer.	260
411043	cd21091	Fuzzy	protein fuzzy and homologs. Protein fuzzy (or FUZ) is a planar cell polarity (PCP) effector that controls multiple cellular processes during development. PCP signalling is an evolutionarily conserved pathway by which directional information regarding polarized cell movement is provided to cells. The PCP signalling axis involves PCP core and PCP effector genes, which are activated consecutively to govern orientated cell migration and the establishment of cytoskeletal structures. Dishevelled (Dvl in mammals or Dsh in Drosophila) and Fuz (or fuzzy in Drosophila) are two representative PCP core and effector genes, respectively. PCP regulates mammalian nervous system development; Fuz-null mutant mice display neural tube defects due to failure of directional cell motility and cell fusion.	401
411044	cd21092	TPT_S35C2	solute carrier family 35 member C2, member of the triose-phosphate transporter family. Solute carrier family 35 member C2 (S35C2 or Slc35c2), also called ovarian cancer-overexpressed gene 1 protein (OVCOV1), is a member of the triose-phosphate transporter (TPT) family, which is part of the drug/metabolite transporter (DMT) superfamily. It may function either as a GDP-fucose transporter that competes with Slc35c1 (S35C1) for GDP-fucose, or a factor that otherwise enhances the fucosylation of Notch and is required for optimal Notch signaling in mammalian cells.	248
410606	cd21093	KLF8_12_N	N-terminal domain of Kruppel-like factor (KLF) 8, KLF12, and similar proteins. Kruppel-like transcription factors (also known as Krueppel-like transcription factors, KLFs) belong to a family of proteins called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Although these factors bind to similar elements in vitro, they have distinct activities in vivo depending on their expression profile and the sequence of the N-terminal activation/repression domain, which differs between members. This model represents the related N-terminal activation/repression domains of KLF8 and KLF12.	172
410447	cd21094	C1_aPKC_iota	protein kinase C conserved region 1 (C1 domain) found in the atypical protein kinase C (aPKC) iota type. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domain. aPKCs only require phosphatidylserine (PS) for activation. They contain a C2-like region, instead of a calcium-binding (C2) region found in classical PKCs, in their regulatory domain. There are two aPKC isoforms, zeta and iota. aPKCs are involved in many cellular functions including proliferation, migration, apoptosis, polarity maintenance and cytoskeletal regulation. They also play a critical role in the regulation of glucose metabolism and in the pathogenesis of type 2 diabetes. PKC-iota is directly implicated in carcinogenesis. It is critical to oncogenic signaling mediated by Ras and Bcr-Abl. The PKC-iota gene is the target of tumor-specific gene amplification in many human cancers, and has been identified as a human oncogene. In addition to its role in transformed growth, PKC-iota also promotes invasion, chemoresistance, and tumor cell survival. Expression profiling of PKC-iota is a prognostic marker of poor clinical outcome in several human cancers. PKC-iota also plays a role in establishing cell polarity, and has critical embryonic functions. Members of this family contain C1 domain found in aPKC isoform iota. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	55
410448	cd21095	C1_aPKC_zeta	protein kinase C conserved region 1 (C1 domain) found in the atypical protein kinase C (aPKC) zeta type. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domain. aPKCs only require phosphatidylserine (PS) for activation. They contain a C2-like region, instead of a calcium-binding (C2) region found in classical PKCs, in their regulatory domain. There are two aPKC isoforms, zeta and iota. aPKCs are involved in many cellular functions including proliferation, migration, apoptosis, polarity maintenance and cytoskeletal regulation. They also play a critical role in the regulation of glucose metabolism and in the pathogenesis of type 2 diabetes. PKC-zeta plays a critical role in activating the glucose transport response. It is activated by glucose, insulin, and exercise through diverse pathways. PKC-zeta also plays a central role in maintaining cell polarity in yeast and mammalian cells. In addition, it affects actin remodeling in muscle cells. Members of this family contain C1 domain found in aPKC isoform zeta. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	55
411045	cd21101	MAF1-ALBA4_C	C-terminal domain of mitochondrial association factor 1 (MAF1), Alba4, and related proteins. Mitochondria play a role in the regulation of the innate immune response. Host mitochondria are recruited to the membranes that surround certain intracellular bacteria and parasites during infection, a phenomenon termed host mitochondrial association (HMA). In Toxoplasma gondii, HMA is driven by a gene family that encodes mitochondrial association factor 1 (MAF1) proteins. MAF1 is the parasite protein needed to recruit host mitochondria to the Toxoplasma-containing vacuole during infection. The T. gondii MAF1 locus harbors multiple distinct paralogs that differ in their ability to mediate HMA; these fall into two broad groups designated MAF1a and MAF1b based on residue percent identity. MAF1b paralogs, but not MAF1a paralogs, have been shown to be responsible for the HMA phenotype. This family also includes Plasmodium yoelii ALBA4, which has been shown to modulate its stage-specific interactions and the fates of specific mRNAs during the parasite's growth and transmission, acting to regulate the development of the parasite's transmission stages.	264
411696	cd21102	Arl6IP1_RETR3-like	ADP-ribosylation factor-like protein 6-interacting protein 1, Reticulophagy regulator 3, and similar proteins. This family contains ADP-ribosylation factor-like 6 binding factor 1 (Arl6IP1 or Arl6ip-1) and the N-terminal reticulon-homology domain (RHD) of Reticulophagy regulators 1-3. Arl6IP1 is an endoplasmic reticulum (ER) protein that has an important role in cell conduction and material transport. Arl6IP1, a tetraspan membrane protein, is an anti-apoptotic protein specific to multicellular organisms, and is a potential player in shaping the ER tubules in mammalian cells. In Drosophila, knockdown of the Arl6IP1 gene leads to progressive motor deficit. An Arl6IP1 variant has also been associated with hereditary spastic paraplegia (HSP), motor and sensory polyneuropathy, and acromutilation. Reticulophagy regulator 1 (RETREG1/FAM134B) is an endoplasmic reticulum (ER)-anchored autophagy receptor that regulates the size and shape of the ER. It regulates turnover of the ER by selective phagocytosis, mediating ER delivery into lysosomes through sequestration into autophagosomes. It promotes membrane remodeling and ER scission through its membrane bending activity, and targets the fragments into autophagosomes by interacting with ATG8 family modifier proteins such as MAP1LC3A, MAP1LC3B, GABARAP, GABARAPL1 and GABARAPL2. RETREG2/FAM134A and RETREG3/FAM134C have been shown to interact with ATG8 family modifier proteins MAP1LC3A, MAP1LC3B, GABARAP, and GABARAPL1. Arl6IP1 shows some sequence similarity to the RHD of reticulophagy regulators, which may function in inducing membrane curvature.	178
411046	cd21104	SNU13	U4/U6.U5 small nuclear ribonucleoprotein SNU13. U4/U6.U5 small nuclear ribonucleoprotein SNU13, also known as NHP2-like protein 1 or U4/U6.U5 tri-snRNP 15.5 kDa protein, is a component of the spliceosome B complex, involved in pre-mRNA splicing. It binds to the 5'-stem-loop of U4 snRNA.	122
409189	cd21105	PGAP4-like	Post-GPI attachment to proteins factor 4 and similar proteins. This family includes post-GPI attachment to proteins factor 4 (PGAP4), also known as post-GPI attachment to proteins GalNAc transferase 4 or transmembrane protein 246 (TMEM246). PGAP4 has been shown to be a Golgi-resident GPI-GalNAc transferase. Many eukaryotic proteins are anchored to the cell surface through glycolipid glycosylphosphatidylinositol (GPI). GPIs have a conserved core but exhibit diverse N-acetylgalactosamine (GalNAc) modifications. PGAP4 knockout cells lose GPI-GalNAc structures. PGAP4 is most likely involved in the initial steps of GPI-GalNAc biosynthesis. In contrast to other Golgi glycotransferases, it contains three transmembrane domains. This family also includes uncharacterized fungal proteins with similarity to PGAP4.	364
411047	cd21106	TM6SF1-like	transmembrane 6 superfamily member 1, member 2, and similar proteins. This family includes transmembrane 6 superfamily members 1 (TM6SF1) and 2 (TM6SF2), and similar proteins. TM6SF1 is a widely expressed lysosomal transmembrane protein that may be suitable as a lysosomal marker. Polymorphism of its paralog, TM6SF2, has been associated with the risk for hepatocellular carcinoma, and a variant of the gene has been found to impact the processing of lipids in the liver and the small intestine, causing non-alcoholic fatty liver disease (NAFLD).	356
411048	cd21107	RsiG	anti-sigma factor RsiG (AmfC). RsiG is an anti-sigma factor that binds and sequesters the sporulation-specific sigma factor WhiG in a fashion dependent on 3',5'-cyclic diguanylic acid (c-di-GMP). RsiG can bind the cyclic dinucleotide in the absence of sigma factor WhiG, and does so in a specific manner via a unique signature conserved in all Streptomyces. This gene was originally named amfC (aerial mycelium formation protein) but no specific role was established, and was later renamed rsiG (regulator of sigma WhiG).	145
410609	cd21109	SPASM	Iron-sulfur cluster-binding SPASM domain. This iron-sulfur cluster-binding domain is named SPASM after the biochemically characterized members, AlbA, PqqE, anSME, and MftC, which are involved in Subtilosin A, Pyrroloquinoline quinone, Anaerobic Sulfatase, and Mycofactocin maturation, respectively. SPASM occurs as an additional C-terminal domain in many peptide-modifying enzymes of the radical S-adenosylmethionine (SAM) superfamily. Radical SAM enzymes are characterized by a conserved CxxxCxxC motif, which coordinates the conserved iron-sulfur cluster that is involved in the reductive cleavage of SAM and generates a 5'-deoxyadenosyl radical, which in turn abstracts a hydrogen from the appropriately positioned carbon atom of the substrate. Radical SAM enzymes with a C-terminal SPASM domain contain at least one other iron-sulfur cluster.	65
411049	cd21111	IFTase	inulin fructotransferase. Inulin fructotransferase (IFTase; EC 4.2.2.17 and EC 4.2.2.18), a member of the glycoside hydrolase family 91, catalyzes depolymerization of beta-2,1-fructans inulin by successively removing the terminal difructosaccharide units as cyclic anhydrides via intramolecular fructosyl transfer. As a result, IFTase produces DFA-I (alpha-D-fructofuranose-beta-D-fructofuranose 2',1:2,1'-dianhydride) and DFA-III (alpha-D-fructofuranose-beta-D-fructofuranose 2',1:2,3'-dianhydride).	395
411050	cd21112	alphaLP-like	alpha-lytic protease (alpha-LP), a bacterial serine protease of the chymotrypsin family, and similar proteins. This family represents the catalytic domain of alpha-lytic protease (alpha-LP) and its closely-related homologs. Alpha-lytic protease (EC 3.4.21.12; also called alpha-lytic endopeptidase), originally isolated from the myxobacterium Lysobacter enzymogenes, belongs to the MEROPS peptidase family S1, subfamily S1E (streptogrisin A subfamily). It is synthesized as a pro-enzyme, thus having two domains; the N-terminal pro-domain acts as a foldase, required transiently for the correct folding of the protease domain, and also acts as a potent inhibitor of the mature enzyme, while the C-terminal domain catalyzes the cleavage of peptide bonds. Members of the alpha-lytic protease subfamily include Nocardiopsis alba protease (NAPase), a secreted chymotrypsin from the alkaliphile Cellulomonas bogoriensis, streptogrisins (SPG-A, SPG-B, SPG-C, and SPG-D), and Thermobifida fusca protease A (TFPA). These serine proteases have characteristic kinetic stability, exhibited by their extremely slow unfolding kinetics. The active site, characteristic of serine proteases, contains the catalytic triad consisting of serine acting as a nucleophile, aspartate as an electrophile, and histidine as a base, all required for activity. This model represents the C-terminal catalytic domain of alpha-lytic proteases.	188
409232	cd21114	NAC	NAC domain. This family contains the NAC domain, named after the nascent polypeptide-associated complex (NAC) whose subunits contain NAC domains. In eukaryotes, the NAC complex, which plays an important role in co-translational targeting of nascent polypeptides to endoplasmic reticulum (ER), consists of 2 subunits: NAC alpha and a shortened splice variant of the basal transcription factor 3 (BTF3; also called BTF3b or NAC beta). The full length BTF3a protein excites transcription.	43
411051	cd21115	legumain_C	C-terminal prodomain of legumain. This family contains the C-terminal propeptide of legumain, a lysosomal endopeptidase with a specificity for hydrolysis of asparaginyl bonds. Legumain (also called vacuolar processing enzyme or VPE in plants, and asparaginyl endopeptidase or AEP in animals) is synthesized as a precursor with both N- and C-terminal propeptides. Prolegumain is directed to the lysosome or plant vacuole, where activation occurs at least partially by autolysis. The N-terminal catalytic domain is a cysteine protease from the C13 family. The C-terminal prodomain can be organized into an activation peptide (AP), spanning a helical region, and a C-terminal death domain-like fold, denoted as legumain stabilization and activity modulation (LSAM) domain. The C-terminal prodomain binds over the active site and inhibits the catalytic domain. During activation, the C-terminal prodomain is autocatalytically cleaved. This process is induced by pH changes. Human legumain has been shown to process the tetanus toxin generating the fragments found in class II antigen presentation. Legumain from plant seeds is thought to be responsible for the post-translational processing of seed proteins prior to storage. Legumain is highly expressed in some cancers such as colorectal cancer (CRC) and uveal melanoma (UM); it is associated with poor outcome in CRC and upregulation of legumain is associated with malignant behavior of UM. Thus, legumain may be used as a negative prognostic factor as well as a therapeutic target.	119
411052	cd21117	Twitch_MoaA	Iron-sulfur cluster-binding Twitch domain of GTP 3',8-cyclase. The iron-sulfur cluster-binding Twitch domain is found at the C-terminus of GTP 3',8-cyclase (EC 4.1.99.22), which is also called molybdenum cofactor biosynthesis protein A (MoaA) in bacteria and archaea, molybdenum cofactor biosynthesis protein 1 (MOCS1) in most eukaryotes, and molybdenum cofactor biosynthesis enzyme CNX2 in plants. GTP 3',8-cyclase is a radical S-adenosylmethionine (SAM) enzyme that catalyzes the first step in molybdopterin biosynthesis, the cyclization of guanosine triphosphate to (8S)-3',8-cyclo-7,8-dihydroguanosine 5'-triphosphate, which is then converted to molybdopterin in subsequent steps. Radical SAM enzymes are characterized by a conserved CxxxCxxC motif, which coordinates the conserved iron-sulfur cluster that is involved in the reductive cleavage of SAM and generates a 5'-deoxyadenosyl radical, which in turn abstracts a hydrogen from the appropriately positioned carbon atom of the substrate. GTP 3',8-cyclase contains an additional iron-sulfur cluster at the C-terminal Twitch domain that is involved in substrate binding. The Twitch domain may be related to another iron-sulfur cluster-binding domain found at the C-terminus of some radical SAM enzymes, the SPASM domain, named after the biochemically characterized members, AlbA, PqqE, anSMEs, and MftC, which are involved in Subtilosin A, Pyrroloquinoline quinone, Anaerobic Sulfatase, and Mycofactocin maturation, respectively.	70
411053	cd21118	dermokine	dermokine. Dermokine, also known as epidermis-specific secreted protein SK30/SK89, is a skin-specific glycoprotein that may play a regulatory role in the crosstalk between barrier dysfunction and inflammation, and therefore play a role in inflammatory diseases such as psoriasis. Dermokine is one of the most highly expressed proteins in differentiating keratinocytes, found mainly in the spinous and granular layers of the epidermis, but also in the epithelia of the small intestine, macrophages of the lung, and endothelial cells of the lung. Mouse dermokine has been reported to be encoded by 22 exons, and its expression leads to alpha, beta, and gamma transcripts.	495
410610	cd21119	SPASM_PqqE	Iron-sulfur cluster-binding SPASM domain of coenzyme PQQ synthesis protein E. Coenzyme PQQ synthesis protein E (PqqE), also called pyrroloquinoline quinone (PQQ) biosynthesis protein E or PqqA peptide cyclase (EC 1.21.98.4), is a radical S-adenosylmethionine (SAM) enzyme that catalyzes the formation of a C-C bond between C-4 of glutamate and C-3 of tyrosine residues of the PqqA protein, which is the first enzymatic step in the biosynthesis of the bacterial enzyme cofactor PQQ. Radical SAM enzymes are characterized by a conserved CxxxCxxC motif, which coordinates the conserved iron-sulfur cluster that is involved in the reductive cleavage of SAM and generates a 5'-deoxyadenosyl radical, which in turn abstracts a hydrogen from the appropriately positioned carbon atom of the substrate. Radical SAM (RS) enzymes with a C-terminal SPASM domain contain at least one other iron-sulfur cluster. PqqE contains two auxiliary Fe-S clusters in its SPASM domain: one nearest the RS site (AuxI) is in the form of a 2Fe-2S cluster ligated by four cysteines; and a more remote cluster (AuxII) in the form of a 4Fe-4S center that is ligated by three cysteine residues and one aspartate residue.	114
410611	cd21120	SPASM_anSME	Iron-sulfur cluster-binding SPASM domain of anaerobic sulfatase maturating enzyme. Anaerobic sulfatase maturating enzyme (anSME) is a radical S-adenosylmethionine (SAM) enzyme that catalyzes, under anaerobic conditions, the co- or post-translational modification of arylsulfatases to form a catalytically essential formylglycine (FGly) residue to perform their hydrolysis function, removing sulfate groups from a wide array of substrates. Radical SAM enzymes are characterized by a conserved CxxxCxxC motif, which coordinates the conserved iron-sulfur cluster that is involved in the reductive cleavage of SAM and generates a 5'-deoxyadenosyl radical, which in turn abstracts a hydrogen from the appropriately positioned carbon atom of the substrate. Radical SAM (RS) enzymes with a C-terminal SPASM domain contain at least one other iron-sulfur cluster; anSME contains two auxiliary 4Fe-4S clusters in its SPASM domain.	107
410612	cd21121	SPASM_Cmo-like	Iron-sulfur cluster-binding SPASM domain of tungsten-containing aldehyde ferredoxin oxidoreductase cofactor-modifying protein and similar proteins. This group is composed of Pyrococcus furiosus tungsten-containing aldehyde ferredoxin oxidoreductase (AOR; EC 1.2.7.5) cofactor-modifying protein, encoded by the cmo gene, and similar proteins. AOR cofactor-modifying protein is involved in the biosynthesis of a molybdopterin-based tungsten cofactor. Members of this group are radical S-adenosylmethionine (SAM) enzymes with a SPASM domain. Radical SAM enzymes are characterized by a conserved CxxxCxxC motif, which coordinates the conserved iron-sulfur cluster that is involved in the reductive cleavage of SAM and generates a 5'-deoxyadenosyl radical, which in turn abstracts a hydrogen from the appropriately positioned carbon atom of the substrate. Radical SAM enzymes with a C-terminal SPASM domain contain at least one other iron-sulfur cluster. This group appears to contain one auxiliary Fe-S cluster, similar to the auxiliary 4Fe-4S cluster in Bacillus circulans butirosin biosynthetic enzyme BtrN.	80
410613	cd21122	SPASM_rSAM	Iron-sulfur cluster-binding SPASM domain of an uncharacterized group of radical SAM proteins. Members of this group are radical S-adenosylmethionine (SAM) enzymes with a SPASM domain, named after the biochemically characterized members, AlbA, PqqE, anSME, and MftC, which are involved in Subtilosin A, Pyrroloquinoline quinone, Anaerobic Sulfatase, and Mycofactocin maturation, respectively. Radical SAM enzymes are characterized by a conserved CxxxCxxC motif, which coordinates the conserved iron-sulfur cluster that is involved in the reductive cleavage of SAM and generates a 5'-deoxyadenosyl radical, which in turn abstracts a hydrogen from the appropriately positioned carbon atom of the substrate. Radical SAM enzymes with a C-terminal SPASM domain contain at least one other iron-sulfur cluster. This group appears to contain one auxiliary Fe-S cluster, similar to the auxiliary 4Fe-4S cluster in Bacillus circulans butirosin biosynthetic enzyme BtrN.	71
410614	cd21123	SPASM_MftC-like	Iron-sulfur cluster-binding SPASM domain of mycofactocin radical SAM maturase MftC and similar proteins. This group is composed of Mycobacterium tuberculosis putative mycofactocin radical SAM maturase MftC and similar proteins. MftC is a radical S-adenosylmethionine (SAM) enzyme that may function to modify mycofactocin, a conserved polypeptide that might serve as an electron carrier. Radical SAM enzymes are characterized by a conserved CxxxCxxC motif, which coordinates the conserved iron-sulfur cluster that is involved in the reductive cleavage of SAM and generates a 5'-deoxyadenosyl radical, which in turn abstracts a hydrogen from the appropriately positioned carbon atom of the substrate. Radical SAM enzymes with a C-terminal SPASM domain contain at least one other iron-sulfur cluster. This group appears to contain one auxiliary Fe-S cluster that is similar to the second auxiliary 4Fe-4S cluster (AuxII) of Clostridium perfringens anaerobic sulfatase-maturating enzyme (anSME).	91
410615	cd21124	SPASM_CteB-like	Iron-sulfur cluster-binding SPASM domain of sactionine bond-forming enzyme CteB and similar proteins. Clostridium thermocellum sactionine bond-forming enzyme CteB is a radical S-adenosylmethionine (SAM) enzyme that catalyzes the formation of the requisite thioether bridge between a cysteine and the alpha-carbon of an opposing amino acid that is required in sactipeptide biosynthesis. Radical SAM enzymes are characterized by a conserved CxxxCxxC motif, which coordinates the conserved iron-sulfur cluster that is involved in the reductive cleavage of SAM and generates a 5'-deoxyadenosyl radical, which in turn abstracts a hydrogen from the appropriately positioned carbon atom of the substrate. Radical SAM (RS) enzymes with a C-terminal SPASM domain contain at least one other iron-sulfur cluster. CteB contains two auxiliary 4Fe-4S clusters in its SPASM domain; the auxiliary cluster nearest the RS site, called AuxI, exhibits an open coordination site in the absence of peptide substrate, which is coordinated by a peptidyl-cysteine residue in the bound state.	96
410616	cd21125	SPASM_AlbA-like	Iron-sulfur cluster-binding SPASM domain of antilisterial bacteriocin subtilosin biosynthesis protein AlbA and similar proteins. Bacillus subtilis antilisterial bacteriocin subtilosin biosynthesis protein AlbA is a radical S-adenosylmethionine (SAM) enzyme that catalyzes the formation of three thioether bonds in the post-translational modification of a linear peptide into the cyclic peptide subtilosin A. The thioether bonds formed are between the sulfur of three cysteine residues and the alpha-carbons of two phenylalanines and one threonine to produce a rigid cyclic peptide. Radical SAM enzymes are characterized by a conserved CxxxCxxC motif, which coordinates the conserved iron-sulfur cluster that is involved in the reductive cleavage of SAM and generates a 5'-deoxyadenosyl radical, which in turn abstracts a hydrogen from the appropriately positioned carbon atom of the substrate. Radical SAM enzymes with a C-terminal SPASM domain contain at least one other iron-sulfur cluster. AlbA appears to contain one auxiliary Fe-S cluster, similar to the auxiliary 4Fe-4S cluster in Bacillus circulans butirosin biosynthetic enzyme BtrN.	97
410617	cd21126	SPASM_rSAM	Iron-sulfur cluster-binding SPASM domain of an uncharacterized group of radical SAM proteins. Members of this group are radical S-adenosylmethionine (SAM) enzymes with a SPASM domain, named after the biochemically characterized members, AlbA, PqqE, anSME, and MftC, which are involved in Subtilosin A, Pyrroloquinoline quinone, Anaerobic Sulfatase, and Mycofactocin maturation, respectively. Radical SAM enzymes are characterized by a conserved CxxxCxxC motif, which coordinates the conserved iron-sulfur cluster that is involved in the reductive cleavage of SAM and generates a 5'-deoxyadenosyl radical, which in turn abstracts a hydrogen from the appropriately positioned carbon atom of the substrate. Radical SAM enzymes with a C-terminal SPASM domain contain at least one other iron-sulfur cluster. This group appears to contain one auxiliary Fe-S cluster, similar to the auxiliary 4Fe-4S cluster in Bacillus circulans butirosin biosynthetic enzyme BtrN.	70
410618	cd21127	SPASM_rSAM	Iron-sulfur cluster-binding SPASM domain of an uncharacterized group of radical SAM proteins. Members of this group are radical S-adenosylmethionine (SAM) enzymes with a SPASM domain, named after the biochemically characterized members, AlbA, PqqE, anSME, and MftC, which are involved in Subtilosin A, Pyrroloquinoline quinone, Anaerobic Sulfatase, and Mycofactocin maturation, respectively. Radical SAM enzymes are characterized by a conserved CxxxCxxC motif, which coordinates the conserved iron-sulfur cluster that is involved in the reductive cleavage of SAM and generates a 5'-deoxyadenosyl radical, which in turn abstracts a hydrogen from the appropriately positioned carbon atom of the substrate. Radical SAM enzymes with a C-terminal SPASM domain contain at least one other iron-sulfur cluster. This group appears to contain one auxiliary Fe-S cluster, similar to the auxiliary 4Fe-4S cluster in Bacillus circulans butirosin biosynthetic enzyme BtrN.	83
410619	cd21128	SPASM_rSAM	Iron-sulfur cluster-binding SPASM domain of an uncharacterized group of radical SAM proteins. Members of this group are radical S-adenosylmethionine (SAM) enzymes with a SPASM domain, named after the biochemically characterized members, AlbA, PqqE, anSME, and MftC, which are involved in Subtilosin A, Pyrroloquinoline quinone, Anaerobic Sulfatase, and Mycofactocin maturation, respectively. Radical SAM enzymes are characterized by a conserved CxxxCxxC motif, which coordinates the conserved iron-sulfur cluster that is involved in the reductive cleavage of SAM and generates a 5'-deoxyadenosyl radical, which in turn abstracts a hydrogen from the appropriately positioned carbon atom of the substrate. Radical SAM enzymes with a C-terminal SPASM domain contain at least one other iron-sulfur cluster. This group may contain one auxiliary Fe-S cluster with an open coordination site, similar to the auxiliary 4Fe-4S cluster in Bacillus circulans butirosin biosynthetic enzyme BtrN, but missing one conserved cysteine in the binding site.	65
410620	cd21129	SPASM_BtrN	Iron-sulfur cluster-binding SPASM domain of butirosin biosynthesis protein N. Butirosin biosynthesis protein N (BtrN), also called S-adenosyl-L-methionine-dependent 2-deoxy-scyllo-inosamine dehydrogenase (EC 1.1.99.38), is a radical S-adenosylmethionine (SAM) enzyme that catalyzes the two-electron oxidation of 2-deoxy-scyllo-inosamine (DOIA) to amino-dideoxy-scyllo-inosose (amino-DOI) in the biosynthetic pathway of the aminoglycoside antibiotic butirosin. Radical SAM enzymes are characterized by a conserved CxxxCxxC motif, which coordinates the conserved iron-sulfur cluster that is involved in the reductive cleavage of SAM and generates a 5'-deoxyadenosyl radical, which in turn abstracts a hydrogen from the appropriately positioned carbon atom of the substrate. Radical SAM enzymes with a C-terminal SPASM domain contain at least one other iron-sulfur cluster. BtrN contains one auxiliary 4Fe-4S cluster.	87
412055	cd21131	TbPSSA-2-like	ectodomain of Trypanosoma, including T. brucei Procyclic-Specific Surface Antigen-2 (TbPSSA-2), T. congolense Insect Stage Antigen (TcISA), and similar proteins. This family includes the ectodomains of Trypanosoma brucei Procyclic-Specific Surface Antigen-2 (TbPSSA-2) and homolog T. congolense Insect Stage Antigen (TcISA). Trypanosomal parasites transmit disease through the arthropod vector Glossina spp (the tsetse fly). Studies have shown that TbPSSA-2 plays an important role in parasite survival in the tsetse; TbPSSA-2 knock-out reduced the efficiency of trypanosome migration from the tsetse midgut to the salivary glands. The TbPSSA-2 and TcISA ectodomains adopt a novel architecture, having two lobes connected by a loop, exhibiting conformational flexibility. The inter-lobe hinge region displaying rotational flexibility suggests a potential mechanism for coordinating a binding partner.	208
410977	cd21132	EVE-like	EVE and YTH domains belong to the PUA superfamily. The EVE domain was formerly known as DUF55 and is thought to be involved in RNA binding. The YTH (YT521-B homology) domain is a novel RNA-binding domain that has been shown to bind to short, degenerate, single-stranded RNA motifs that loosely follow a consensus sequence. Both domains are part of the larger PUA superfamily.	138
410978	cd21133	EVE	EVE domains are putative RNA-binding domains that belong to the PUA superfamily. The EVE domain, formerly known as DUF55, has been revealed via structural similarity to be part of the PUA superfamily. It is most similar in three-dimensional fold to the YTH (YT521-B homology) domain, and is thought to be involved in RNA-binding.	148
410979	cd21134	YTH	YTH (YT521-B homology) domains are RNA-binding domains that belong to the PUA superfamily. Individual members of the YTH family have been shown to selectively remove transcripts of meiosis-specific genes expressed in mitotic cells. In general, eukaryotic YTH-family members may be involved in similar mechanisms to suppress gene regulation during gametogenesis or in other forms of silencing. The YTH domain is a novel RNA-binding domain that has been shown to bind to short, degenerate, single-stranded RNA motifs that loosely follow a consensus sequence. It belongs to the larger PUA superfamily.	133
412056	cd21137	AA13_LPMO-like	AA13 lytic polysaccharide monooxygenase, and similar proteins. This family contains starch-degrading (also called starch-active) lytic polysaccharide monooxygenase (LPMO), a representative of the new CAZy AA13 family and classified as an auxiliary activity enzyme. This enzyme acts on alpha-linked glycosidic bonds and displays a binding surface that is quite different from those of LPMOs acting on beta-linked glycosidic bonds, indicating that the AA13 family proteins interact with their substrate in a distinct fashion. The active site contains an amino-terminal histidine-ligated mononuclear copper. This enzyme generates aldonic acid-terminated malto-oligosaccharides from retrograded starch and significantly boosts the conversion of this recalcitrant substrate to maltose by beta-amylase.	233
411054	cd21138	McdB-like	Maintenance of carboxysome distribution (Mcd) protein B and similar proteins. This family contains maintenance of carboxysome distribution (Mcd) protein B (McdB), also called maintenance of carboxysome positioning B protein (McsB). It is found in cyanobacteria, where carboxysome maintenance is mediated by a DNA partition-like ParA-ParB system called the McdA-McdB. In order to actively position carboxysomes, McdB binds directly to McdA, a putative Walker-box ParA-like protein. McdB harbors a unique helical fold and it enables McdA dimer formation.	132
410981	cd21140	Cas6_I-like	Class 1 type I CRISPR-associated endoribonuclease Cas6. CRISPR-Cas (clustered regularly interspaced short palindromic repeats and CRISPR-associated proteins) adaptive immune systems defend microbes against foreign nucleic acids via RNA-guided endonucleases. These systems are divided into two classes: class 1 systems utilize multiple Cas proteins and CRISPR RNA (crRNA) to form an effector complex while class 2 systems employ a large, single effector with crRNA to mediate interference. Cas6 family endoribonucleases are metal-independent nucleases that catalyze RNA cleavage via a mechanism involving a 2'-3' cyclic intermediate. They share a common ferredoxin or RNA recognition motif (RRM) fold, and they recognize and excise CRISPR repeat RNAs that vary widely in primary and secondary structures. This subfamily contains Cas6 family endoribonucleases typically found within type I CRISPR-Cas systems and similar proteins.	243
410982	cd21141	Cas6_III-like	Class 1 type III CRISPR-associated endoribonuclease Cas6. CRISPR-Cas (clustered regularly interspaced short palindromic repeats and CRISPR-associated proteins) adaptive immune systems defend microbes against foreign nucleic acids via RNA-guided endonucleases. These systems are divided into two classes: class 1 systems utilize multiple Cas proteins and CRISPR RNA (crRNA) to form an effector complex while class 2 systems employ a large, single effector with crRNA to mediate interference. Cas6 family endoribonucleases are metal-independent nucleases that catalyze RNA cleavage via a mechanism involving a 2'-3' cyclic intermediate. They share a common ferredoxin or RNA recognition motif (RRM) fold, and they recognize and excise CRISPR repeat RNAs that vary widely in primary and secondary structures. This subfamily contains Cas6 family endoribonucleases typically found within type III CRISPR-Cas systems and similar proteins.	251
411055	cd21142	Cas7fv	type I-F variant CRISPR-associated backbone protein Cas7 (Cas7fv). Cas7fv is one of the CRISPR-associated (Cas) proteins of type I-F variant CRISPR-Cas system. CRISPR (clustered regularly interspaced short palindromic repeats)-Cas modules are adaptive immune systems found in archaea and bacteria against foreign nucleic acids such as phages and plasmids via RNA-guided endonucleases. CRISPR-Cas systems are classified based on Cas protein content and arrangement in CRISPR-Cas loci into two main classes (1 and 2) and at least six types (I, II, III, IV, V, VI). Class 1 systems utilize multiple Cas proteins and CRISPR RNA (crRNA) to form an effector complex while class 2 systems employ a large, single effector with crRNA to mediate interference. Type I CRISPR-Cas systems are the most widespread in nature and the Cas protein composition of the employed Cascade interference complexes differs between seven subtypes (A-F, U). Type I-F variant (I-Fv) is a subtype that relies on a minimal set of five Cas proteins and has structural differences with I-F and I-E Cascades. Double strand DNA recruitment and recognition in the type I-Fv Cascade is facilitated from the major groove side by Cas5fv instead of the large subunit Cas8 and the finger domain of Cas7fv.	315
411056	cd21143	Cas5fv	type I-F variant CRISPR-associated protein Cas5 (Cas5fv). Cas5fv is one of the CRISPR-associated (Cas) proteins of type I-F variant CRISPR-Cas system. CRISPR (clustered regularly interspaced short palindromic repeats)-Cas modules are adaptive immune systems found in archaea and bacteria against foreign nucleic acids such as phages and plasmids via RNA-guided endonucleases. CRISPR-Cas systems are classified based on Cas protein content and arrangement in CRISPR-Cas loci into two main classes (1 and 2) and at least six types (I, II, III, IV, V, VI). Class 1 systems utilize multiple Cas proteins and CRISPR RNA (crRNA) to form an effector complex while class 2 systems employ a large, single effector with crRNA to mediate interference. Type I CRISPR-Cas systems are the most widespread in nature and the Cas protein composition of the employed Cascade interference complexes differs between seven subtypes (A-F, U). Type I-F variant (I-Fv) is a subtype that relies on a minimal set of five Cas proteins and has structural differences with I-F and I-E Cascades. Double strand DNA recruitment and recognition in the type I-Fv Cascade is facilitated from the major groove side by Cas5fv instead of the large subunit Cas8 and the finger domain of Cas7fv.	335
394908	cd21144	NendoU_XendoU-like	Nidoviral uridylate-specific endoribonuclease (NendoU) domain of coronavirus Nonstructural protein 15 (Nsp15), arterivirus Nsp11, torovirus endoribonuclease, Xenopus laevis endoribonuclease XendoU, and related proteins. Nidovirus endoribonucleases (NendoUs) and eukaryotic Xenopus laevis-like endoribonucleases (XendoUs) are uridylate-specific endoribonucleases which release a cleavage product containing a 2',3'-cyclic phosphate at the 3' terminal end. NendoUs include Nsp15 from coronaviruses and Nsp11 from arteriviruses, both of which may participate in the viral replication process and in the evasion of the host immune system. XendoU is involved in the processing of intron-encoded box C/D U16 small, nucleolar RNA. Except for turkey coronavirus (TCoV) Nsp15, Mn2+ is generally essential for the catalytic activity of coronavirus Nsp15. Mn2+ is dispensable, and to some extent inhibits the activity of arterivirus (Porcine Reproductive and Respiratory Syndrome virus) PRRSV Nsp11. XendoU also requires Mn2+. Coronavirus Nsp15 from Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV), human Coronavirus 229E (HCoV229E), and murine hepatitis virus (MHV) forms a functional hexamer while Porcine DeltaCoronavirus (PDCoV) Nsp15 has been shown to exist as a dimer and a monomer in solution. Nsp11 from the arterivirus PRRSV is a dimer.	112
409283	cd21146	Nip7_N_euk	N-terminal domain of eukaryotic 60S ribosome subunit biogenesis protein Nip7 and similar proteins. This N-terminal domain of various proteins co-occurs with a PUA (PseudoUridine synthase and Archaeosine transglycosylase) RNA binding domain. This model contains eukaryotic Nip7, a protein that was shown to be required for efficient biogenesis of the 60S ribosome subunit in Saccharomyces cerevisiae. Recently, it was demonstrated that human Nip7 is essential in the accurate processing of pre-rRNA. Also included is KD93, a human homolog of Nip7. Nip7 and its homologs share a two-domain architecture with the C-terminal PUA domain mediating interaction with RNA, suggesting that Nip7 is an adaptor protein with the C-terminal domain interacting with RNA targets and the N-terminal domain mediating interaction with protein targets.	87
409284	cd21147	RsmF_methylt_CTD1	RsmF rRNA methyltransferase first C-terminal domain. This model represents the first of two distinct C-terminal domains of the 16S rRNA methyltransferase RsmF and related RsmB/RsmF family ribosomal methyltransferases. It is necessary for stabilizing the N-terminal catalytic core (a SAM-dependent methyltransferase) and is related to the N-terminal domain of Nip7, a protein that was shown to be required for efficient biogenesis of the 60S ribosome subunit in Saccharomyces cerevisiae (human Nip7 is essential in the accurate processing of pre-rRNA). The second distinct C-terminal domain belongs to the PUA (PseudoUridine synthase and Archaeosine transglycosylase) RNA binding domain superfamily.	75
409290	cd21148	PUA_Cbf5	PUA RNA-binding domain of the archaeal pseudouridine synthase component Cbf5. The RNA-binding PUA (PseudoUridine synthase and Archaeosine transglycosylase) domain was detected in a number of proteins involved in RNA metabolism. Members of the archaeal and eukaryotic subfamily of pseudouridine synthases, including Cbf5 (dyskerin in humans) and similar proteins, are modules that assist in the binding and positioning (guide and/or substrate) of RNA to the pseudouridine synthase complex. Pseudouridine synthases are enzymes that are responsible for post-transcriptional modifications of RNAs by specifically isomerizing uracil residues. In Pyrococcus furiosus H/ACA ribonucleoprotein (RNP) assembly with a single-hairpin H/ACA RNA, the lower stem and the ACA motif of the guide RNA are anchored at the PUA domain of Cbf5. In addition, the N-terminal extension of Cbf5, which is a hot spot for dyskeratosis congenita (a rare genetic form of bone marrow failure) mutation, forms an extra structural layer on the PUA domain.	75
409291	cd21149	PUA_archaeosine_TGT	PUA RNA-binding domain of archaeosine tRNA-guanine transglycosylase. The RNA-binding PUA (PseudoUridine synthase and Archaeosine transglycosylase) domain was detected in a number of proteins involved in RNA metabolism. Members of this archaeosine tRNA-guanine transglycosylase (TGT) family are responsible for the exchange of a guanine residue in archaeal tRNAs with a preQ0 base (7-cyano-7-deazaguanine), which constitutes the initial step in archaeosine biosynthesis. Archaeosine is a modified RNA base specific to archaea (7-formamidino-7-deazaguanosine), found at position 15 in tRNAs. It has been shown that the PUA domain of archaeosine TGT is not required for its specificity for position 15.	75
409292	cd21150	PUA_NSun6-like	PUA RNA-binding domain of the SAM-dependent methyltransferase NSun6 and similar proteins. The RNA-binding PUA (PseudoUridine synthase and Archaeosine transglycosylase) domain was detected in a number of proteins involved in RNA metabolism. Members of this subfamily contain PUA domains that co-occur with SAM-dependent methyltransferase domains and may play roles as cytosine-C(5)-methyltransferases specific for tRNAs or rRNAs. Nsun6 binding to its tRNA substrates requires the presence of a 3'-CCA sequence, which is precisely recognized primarily through interactions with residues from the PUA domain, where the molecular surface of the PUA domain snugly fits onto each nucleotide residue of the CCA end. Human RNA:m5C methyltransferase NSun6 (hNSun6) plays a major role in bone metastasis and could be a valuable therapeutic target for bone metastasis and therapy-resistant tumors.	92
409293	cd21151	PUA_Nip7-like	PUA RNA binding domain of ribosome assembly factor Nip7 and similar proteins. The RNA-binding PUA (PseudoUridine synthase and Archaeosine transglycosylase) domain was detected in a number of proteins involved in RNA metabolism. This eukaryotic and archaeal subfamily contains the conserved protein Nip7 and similar proteins, which are involved in ribosome biogenesis, taking part in 27S pre-rRNA processing and in formation of 60S ribosomal subunit. Nip7 orthologs share a two-domain architecture with the C-terminal PUA domain mediating interaction with RNA, suggesting that Nip7 is an adaptor protein with the C-terminal domain interacting with RNA targets and the N-terminal domain mediating interaction with protein targets. Structural analyses of the RNA-interacting surfaces of Saccharomyces cerevisiae and Pyrococcus abyssi Nip7 orthologs indicate that, in the archaeal PUA domain, C-terminal positively charged residues (arginines and lysines) are involved in RNA interaction while equivalent positions in eukaryotic orthologs are occupied by mostly hydrophobic residues. Both proteins can bind specifically to polyuridine, and RNA interaction requires specific residues of the PUA domain as determined by site-directed mutagenesis.	78
409294	cd21152	PUA_TruB_bacterial	PUA RNA-binding domain of bacterial pseudouridine synthase TruB and similar proteins. The RNA-binding PUA (PseudoUridine synthase and Archaeosine transglycosylase) domain was detected in a number of proteins involved in RNA metabolism. Members of this bacterial subfamily of pseudouridine synthases, including TruB and similar proteins, are modules that assist in the binding and positioning (guide and/or substrate) of RNA to the pseudouridine synthase complex. Pseudouridine synthases are enzymes that are responsible for post-transcriptional modifications of RNAs by specifically isomerizing uracil residues. The pseudouridine synthase TruB (also called tRNA pseudouridylate synthase B or Psi55 synthase) is responsible for synthesis of pseudouridine from uracil-55 in the psi GC loop of elongator tRNAs.	60
409295	cd21153	PUA_RlmI	PUA RNA-binding domain of the SAM-dependent methyltransferase RlmI and related proteins. The RNA-binding PUA (PseudoUridine synthase and Archaeosine transglycosylase) domain was detected in a number of proteins involved in RNA metabolism. Members of this subfamily contain PUA domains that co-occur N-terminal to SAM-dependent methyltransferase domains and include Escherichia coli RlmI (rRNA large subunit methyltransferase gene I, also called YccW) and Thermus thermophilus methyltransferase RlmO, which are 5-methylcytosine methyltransferases (m5C MTases) that play a role in modifying 23S rRNA. This subfamily also includes Pyrococcus horikoshii PH1915 that may play a role as a 5-methyluridine MTase, and/or perform similar roles.	70
409296	cd21154	PUA_MJ1432-like	PUA RNA-binding domain of MJ1432, TA1423, PH0734, and similar proteins. The RNA-binding PUA (PseudoUridine synthase and Archaeosine transglycosylase) domain was detected in a number of proteins involved in RNA metabolism. Members of this mostly archaeal family have not been characterized functionally; they may bind to RNA. This family includes Pyrococcus horikoshii PH0734 where the N-terminal domain may modulate the binding target of the C-terminal PUA domain using its characteristic electropositive surface.	84
409297	cd21155	PUA_MCTS-1-like	PUA RNA-binding domain of malignant T cell-amplified sequence 1 and related proteins. The RNA-binding PUA (PseudoUridine synthase and Archaeosine transglycosylase) domain was detected in a number of proteins involved in RNA metabolism. Members of this eukaryotic family, labelled MCT-1 (malignant T cell-amplified sequence 1) or MCTS-1 (multiple copies T-cell lymphoma-1), contain a single PUA domain. They may play roles in the regulation of the cell cycle; human MCT-1 has been characterized for its oncogenic potential. MCT-1/MCTS1 expression is a new poor-prognosis marker in patients with aggressive breast cancers, and thus the MCT-1 pathway is a novel and promising therapeutic target for triple-negative breast cancer (TNBC).	97
409298	cd21156	PUA_eIF2d-like	PUA RNA-binding domain of eukaryotic translation initiation factor 2D and similar proteins. The RNA-binding PUA (PseudoUridine synthase and Archaeosine transglycosylase) domain was detected in a number of proteins involved in RNA metabolism. Most members of this eukaryotic translation initiation factor 2D (eIF2d)-like family of eukaryotic proteins also contain a domain homologous to the translation initiation factor eIF1/SUI1, and a short uncharacterized N-terminal domain. eIF2D may function as a cytosolic GTP-independent initiation factor which delivers Met-tRNA (and non-initiating tRNAs) to the 40S ribosomal subunit. The family member from Drosophila melanogaster has been named ligatin, and this alias has been adopted for other family members as well, which are not homologous to the vertebrate ligatin (LGTN) that is a trafficking receptor for phosphoglycoproteins.	82
409299	cd21157	PUA_G5K	PUA domain of gamma-glutamyl kinase, found in archaea, bacteria, and eukarya. Gamma glutamyl kinase (G5K) is an enzyme essential for the biosynthesis of L-proline; it catalyzes the transfer of a phosphate group to glutamate. The resulting glutamate 5-phosphate cyclizes spontaneously to form 5-oxoproline. The PUA (PseudoUridine synthase and Archaeosine transglycosylase) domain functions as an RNA binding domain in many other proteins; however, its role in G5K is not understood. It might play a role in modulating the enzymatic properties of bacterial G5Ks.	104
394909	cd21158	NendoU_nv	Nidoviral uridylate-specific endoribonuclease (NendoU) domain of coronavirus Nonstructural protein 15 (Nsp15), arterivirus Nsp11, torovirus endoribonuclease, and related proteins. Nidovirus endoribonucleases (NendoUs) are uridylate-specific endoribonucleases which release a cleavage product containing a 2',3'-cyclic phosphate at the 3' terminal end. NendoUs include Nsp15 from coronaviruses and Nsp11 from arteriviruses, both of which may participate in the viral replication process and in the evasion of the host immune system. This family also includes torovirus NendoUs. Except for turkey coronavirus (TCoV) Nsp15, Mn2+ is generally essential for the catalytic activity of coronavirus Nsp15. Mn2+ is dispensable, and to some extent inhibits the activity of arterivirus (Porcine Reproductive and Respiratory Syndrome virus) PRRSV Nsp11. Coronavirus Nsp15 from Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV), human Coronavirus 229E (HCoV229E), and murine hepatitis virus (MHV) form a functional hexamer while Porcine DeltaCoronavirus (PDCoV) Nsp15 has been shown to exist as a dimer and monomer in solution. Nsp11 from the arterivirus PRRSV is a dimer. NendoUs are distantly related to Xenopus laevis Mn(2+)-dependent uridylate-specific endoribonuclease (XendoU) which is involved in the processing of intron-encoded box C/D U16 small, nucleolar RNA.	134
394910	cd21159	XendoU	Xenopus laevis endoribonuclease XendoU, and related proteins. Xenopus laevis XendoU is a uridylate-specific endoribonuclease, which releases a cleavage product containing a 2',3'-cyclic phosphate at the 3' terminal end. XendoU is a monomer and requires Mn2+. It is involved in the processing of intron-encoded box C/D U16 small, nucleolar RNA. XendoU is distantly related to the Nidovirus uridylate-specific endoribonucleases (NendoUs) which include Nonstructural protein 15 (Nsp15) from coronaviruses, Nsp11 from arteriviruses, and torovirus endoribonuclease.	264
394911	cd21160	NendoU_av_Nsp11-like	Nidoviral uridylate-specific endoribonuclease (NendoU) domain of arterivirus PRRSV Nonstructural protein 11 (Nsp11), and related proteins. Nidovirus endoribonucleases (NendoUs) are uridylate-specific endoribonucleases, which release a cleavage product containing a 2',3'-cyclic phosphate at the 3' terminal end. NendoUs include Nsp15 from coronaviruses and Nsp11 from arteriviruses, both of which may participate in the viral replication process and in the evasion of the host immune system. Mn2+ is dispensable, and to some extent inhibits the activity of arterivirus (Porcine Reproductive and Respiratory Syndrome virus) PRRSV Nsp11. This Nsp11 exists as a dimer. NendoUs are distantly related to Xenopus laevis Mn(2+)-dependent uridylate-specific endoribonuclease (XendoU) which is involved in the processing of intron-encoded box C/D U16 small, nucleolar RNA.	120
394912	cd21161	NendoU_cv_Nsp15-like	Nidoviral uridylate-specific endoribonuclease (NendoU) domain of coronavirus Nonstructural Protein 15 (Nsp15) and related proteins. Nidovirus endoribonucleases (NendoUs) are uridylate-specific endoribonucleases, which release a cleavage product containing a 2',3'-cyclic phosphate at the 3' terminal end. NendoUs include Nsp15 from coronaviruses and Nsp11 from arteriviruses, both of which may participate in the viral replication process and in the evasion of the host immune system. Except for turkey coronavirus (TCoV) Nsp15, Mn2+ is generally essential for the catalytic activity of coronavirus Nsp15. Coronavirus Nsp15 from Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV), human Coronavirus 229E (HCoV229E), and murine hepatitis virus (MHV) form a functional hexamer while Porcine DeltaCoronavirus (PDCoV) Nsp15 has been shown to exist as a dimer and a monomer in solution. NendoUs are distantly related to Xenopus laevis Mn(2+)-dependent uridylate-specific endoribonuclease (XendoU) which is involved in the processing of intron-encoded box C/D U16 small, nucleolar RNA.	151
394913	cd21162	NendoU_tv_PToV-like	Nidoviral uridylate-specific endoribonuclease (NendoU) domain of Porcine torovirus (PToV) endoribonuclease and related proteins. Nidovirus endoribonucleases (NendoUs) are uridylate-specific endoribonucleases, which release a cleavage product containing a 2',3'-cyclic phosphate at the 3' terminal end. The Porcine torovirus (PToV) strain PToV-NPL/2013 NendoU domain is located at the N-terminus of the ORF1ab replicase polyprotein, between regions annotated as Nonstructural proteins 11 (Nsp11) and 13 (Nsp13). This subfamily belongs to a family which includes Nsp15 from coronaviruses and Nsp11 from arteriviruses, which may participate in the viral replication process and in the evasion of the host immune system. These vary in their requirement for Mn2+. Coronavirus Nsp15 generally form functional hexamers, with the exception of Porcine DeltaCoronavirus (PDCoV) Nsp15 which exists as a dimer and a monomer in solution. Arterivirus (Porcine Reproductive and Respiratory Syndrome virus) PRRSV Nsp11 is a dimer. NendoUs are distantly related to Xenopus laevis Mn(2+)-dependent uridylate-specific endoribonuclease (XendoU) which is involved in the processing of intron-encoded box C/D U16 small, nucleolar RNA.	133
394902	cd21163	M_cv_Nsp15-NTD_av_Nsp11-like	middle (M) domain of coronavirus Nonstructural protein 15 (Nsp15) and the N-terminal domain (NTD) of arterivirus Nsp11 and related proteins. Nidovirus endoribonucleases (NendoUs) are uridylate-specific endoribonucleases, which release a cleavage product containing a 2',3'-cyclic phosphate at the 3' terminal end. NendoUs include Nsp15 from coronaviruses and Nsp11 from arteriviruses, both of which may participate in the viral replication process and in the evasion of the host immune system. Coronavirus Nsp15 NendoUs have an N-terminal domain, a middle (M) domain, and a C-terminal catalytic (NendoU) domain. Arterivirus Nsp11 has an N-terminal domain (NTD) and a C-terminal catalytic (NendoU) domain. The NTD of Nsp11 superimposes onto the M-domain of coronavirus Nsp15. Coronavirus Nsp15 from Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV), human Coronavirus 229E (HCoV229E), and Murine Hepatitis Virus (MHV) form a functional hexamer. Oligomerization of Porcine DeltaCoronavirus (PDCoV) Nsp15 differs from that of other coronavirus members; it has been shown to exist as a dimer and a monomer in solution. Nsp11 from the arterivirus PRRSV functions as a dimer.	127
394903	cd21165	M_cv-Nsp15-like	middle domain of coronavirus Nonstructural protein 15 (Nsp15), and related proteins. Nidovirus endoribonucleases (NendoUs) are uridylate-specific endoribonucleases, which release a cleavage product containing a 2',3'-cyclic phosphate at the 3' terminal end. NendoUs include Nsp15 from coronaviruses and Nsp11 from arteriviruses, both of which may participate in the viral replication process and in the evasion of the host immune system. Coronavirus Nsp15 NendoUs have an N-terminal domain, a middle (M) domain and a C-terminal catalytic (NendoU) domain. Coronavirus Nsp15 from Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV), human Coronavirus 229E (HCoV229E), and Murine Hepatitis Virus (MHV) form a functional hexamer. Oligomerization of Porcine DeltaCoronavirus (PDCoV) Nsp15 differs from that of the other coronavirus members; it has been shown to exist as a dimer and a monomer in solution.	128
394904	cd21166	NTD_av_Nsp11-like	N-terminal domain (NTD) of arterivirus Nonstructural protein 11 (Nsp11), and related proteins. Nidovirus endoribonucleases (NendoUs) are uridylate-specific endoribonucleases, which release a cleavage product containing a 2',3'-cyclic phosphate at the 3' terminal end. NendoUs include Nsp15 from coronaviruses and Nsp11 from arteriviruses, both of which may participate in the viral replication process and in the evasion of the host immune system. Coronavirus Nsp15 NendoUs have an N-terminal domain, a middle (M) domain and a C-terminal catalytic (NendoU) domain.  Arterivirus Nsp11 has an N-terminal domain (NTD) and a C-terminal NendoU catalytic domain. The NTD of Nsp11 superimposes onto the M-domain of coronavirus Nsp15.  Coronavirus Nsp15 from Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV), human Coronavirus 229E (HCoV229E), and Murine Hepatitis Virus (MHV) form a functional hexamer. Oligomerization of Porcine DeltaCoronavirus (PDCoV) Nsp15 differs from that of the other coronaviruses; it has been shown to exist as a dimer and a monomer in solution. Nsp11 from the arterivirus PRRSV functions as a dimer. PRRSV Nsp11 has been shown to induce STAT2 degradation to inhibit interferon signaling; mutagenesis revealed that the amino acid residue K59 located at the NTD of Nsp11 is indispensable for inducing STAT2 reduction.	100
394905	cd21167	M_alpha_beta_cv_Nsp15-like	middle domain of alpha- and beta-coronavirus Nonstructural protein 15 (Nsp15), and related proteins. Nidovirus endoribonucleases (NendoUs) are uridylate-specific endoribonucleases, which release a cleavage product containing a 2',3'-cyclic phosphate at the 3' terminal end. NendoUs include Nsp15 from coronaviruses and Nsp11 from arteriviruses, both of which may participate in the viral replication process and in the evasion of the host immune system. Coronavirus Nsp15 NendoUs have an N-terminal domain, a middle (M) domain and a C-terminal catalytic (NendoU) domain. Coronavirus Nsp15 from Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV), human Coronavirus 229E (HCoV229E), and Murine Hepatitis Virus (MHV) form a functional hexamer. This middle domain harbors residues involved in hexamer formation and in trimer stability. Oligomerization of Porcine DeltaCoronavirus (PDCoV) Nsp15 differs from that of the other coronaviruses; it has been shown to exist as a dimer and a monomer in solution.	127
394906	cd21168	M_gcv_Nsp15-like	middle domain of gammacoronavirus Nonstructural protein 15 (Nsp15), and related proteins. Nidovirus endoribonucleases (NendoUs) are uridylate-specific endoribonucleases, which release a cleavage product containing a 2',3'-cyclic phosphate at the 3' terminal end. NendoUs include Nsp15 from coronaviruses and Nsp11 from arteriviruses, both of which may participate in the viral replication process and in the evasion of the host immune system. Coronavirus Nsp15 NendoUs have an N-terminal domain, a middle (M) domain and a C-terminal catalytic (NendoU) domain. Coronavirus Nsp15 from Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV), human Coronavirus 229E (HCoV229E), and Murine Hepatitis Virus (MHV) form a functional hexamer. This middle domain harbors residues involved in hexamer formation and in trimer stability. Oligomerization of Porcine DeltaCoronavirus (PDCoV) Nsp15 differs from that of the other coronaviruses; it has been shown to exist as a dimer and a monomer in solution.	123
394907	cd21169	M_dcv_Nsp15-like	middle domain of delta coronavirus Nonstructural protein 15 (Nsp15), and related proteins. Nidovirus endoribonucleases (NendoUs) are uridylate-specific endoribonucleases, which release a cleavage product containing a 2',3'-cyclic phosphate at the 3' terminal end. NendoUs include Nsp15 from coronaviruses and Nsp11 from arteriviruses, both of which may participate in the viral replication process and in the evasion of the host immune system. Coronavirus Nsp15 NendoUs have an N-terminal domain, a middle (M) domain and a C-terminal catalytic (NendoU) domain. Coronavirus Nsp15 from Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV), human Coronavirus 229E (HCoV229E), and Murine Hepatitis Virus (MHV) form a functional hexamer. Oligomerization of Porcine DeltaCoronavirus (PDCoV) Nsp15 differs from that of the other coronaviruses; it has been shown to exist as a dimer and a monomer in solution.	118
394899	cd21170	NTD_cv_Nsp15-like	N-terminal domain of coronavirus Nonstructural protein 15 (Nsp15) and related proteins. Coronavirus Nsp15 is a nidovirus endoribonuclease (NendoU). NendoUs are uridylate-specific endoribonucleases, which release a cleavage product containing a 2',3'-cyclic phosphate at the 3' terminal end. NendoUs include coronavirus Nsp15 and arterivirus Nsp11, both of which may participate in the viral replication process and in the evasion of the host immune system. This NTD structure (approximately 60 residues) present in coronavirus Nsp15, is missing in Nsp11. Coronavirus Nsp15 has an N-terminal domain, a middle (M) domain, and a C-terminal catalytic (NendoU) domain. Coronavirus Nsp15 from Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV), human Coronavirus 229E (HCoV229E), and Murine Hepatitis Virus (MHV) form a functional hexamer. Oligomerization of Porcine DeltaCoronavirus (PDCoV) Nsp15 differs from the Nsp15 of these alpha- and beta-coronavirus; it has been shown to exist as dimers and monomers in solution.	60
394900	cd21171	NTD_alpha_beta_cv_Nsp15-like	N-terminal domain of alpha- and beta-coronavirus Nonstructural protein 15 (Nsp15), and related proteins. Coronavirus Nsp15 is a nidovirus endoribonuclease (NendoU). NendoUs are uridylate-specific endoribonucleases, which release a cleavage product containing a 2',3'-cyclic phosphate at the 3' terminal end. NendoUs include coronavirus Nsp15 and arterivirus Nsp11, both of which may participate in the viral replication process and in the evasion of the host immune system. This small NTD structure, present in coronavirus Nsp15, is missing in Nsp11. Coronavirus Nsp15 has an N-terminal domain, a middle (M) domain, and a C-terminal catalytic (NendoU) domain. Nsp15 from Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV), human Coronavirus 229E (HCoV229E), and Murine Hepatitis Virus (MHV) form a functional hexamer. Residues in this N-terminal domain are important for hexamer (dimer of trimers) formation.	61
394901	cd21172	NTD_dcv_Nsp15-like	N-terminal domain of deltacoronavirus Nonstructural protein 15 (Nsp15), and related proteins. Coronavirus Nsp15 is a nidovirus endoribonuclease (NendoU). NendoUs are uridylate-specific endoribonucleases, which release a cleavage product containing a 2',3'-cyclic phosphate at the 3' terminal end. NendoUs include coronavirus Nsp15 and arterivirus Nsp11, both of which may participate in the viral replication process and in the evasion of the host immune system. This small NTD structure, present in coronavirus Nsp15, is missing in Nsp11. Coronavirus Nsp15 has an N-terminal domain, a middle (M) domain, and a C-terminal catalytic (NendoU) domain. Oligomerization of Porcine DeltaCoronavirus (PDCoV) Nsp15 differs from the Nsp15 of alpha- and beta-coronavirus, such as Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV) and Murine Hepatitis Virus (MHV) which form functional hexamers; PDCoV Nsp15 has been shown to exist as a dimer and a monomer in solution.	60
411057	cd21173	NucC-like	cyclic oligonucleotide-based anti-phage signaling system-associated NucC nuclease and similar proteins. Cyclic oligonucleotide-based anti-phage signaling system (CBASS)-associated NucC nuclease kills phage-infected cells through genome destruction. It is allosterically activated by a cyclic triadenylate (cA3) second messenger that is synthesized by CBASS upon infection. NucC is related to restriction endonucleases but it adopts a homotrimeric structure. Binding of cA3 causes two NucC homotrimers to assemble into a homohexamer, which brings together a pair of active sites to activate DNA cleavage. NucC has also been integrated into type III CRISPR/Cas systems as an accessory nuclease.	231
410621	cd21174	LPMO_auxiliary	lytic polysaccharide monooxygenase auxiliary activity protein. Many proteins in this superfamily are copper-dependent lytic polysaccharide monooxygenases (LPMOs) and include lytic polysaccharide monooxygenase auxiliary activity families 9 (AA9) and 10 (AA10). The substrate-binding surface of this family is a flat beta-sandwich fold.	136
410622	cd21175	LPMO_AA9	lytic polysaccharide monooxygenase (LPMO) auxiliary activity family 9 (AA9). AA9 proteins are copper-dependent lytic polysaccharide monooxygenases (LPMOs) involved in the cleavage of cellulose chains with oxidation of carbons C1 and/or C4 and C6. Activities include lytic cellulose monooxygenase (C1-hydroxylating) (EC 1.14.99.54) and lytic cellulose monooxygenase (C4-dehydrogenating) (EC 1.14.99.56). The family used to be called GH61 because weak endoglucanase activity had been demonstrated in some family members.	216
410623	cd21176	LPMO_auxiliary-like	fungal lytic polysaccharide monooxygenase (LPMO) auxiliary activity family protein. Proteins in this fungal family of copper-binding proteins may not function as lytic polysaccharide monooxygenases (LPMOs) or in specific binding of chitin and/or cellulose. A family member found in the ectomycorrhizal fungus Laccaria bicolor has been found to be located at the interface between tree rootlet cells and fungal hyphae. It does not perform oxidative cleavage of polysaccharides. Members of this family are related to LPMOs but have diverged to biological functions other than polysaccharide degradation.	121
410624	cd21177	LPMO_AA10	lytic polysaccharide monooxygenase (LPMO) auxiliary activity family 10 (AA10). AA10 proteins are copper-dependent lytic polysaccharide monooxygenases (LPMOs), which may act on chitin or cellulose. The family used to be called CBM33. Activities in this family include lytic cellulose monooxygenase (C1-hydroxylating) (EC 1.14.99.54), lytic cellulose monooxygenase (C4-dehydrogenating) (EC 1.14.99.56), lytic chitin monooxygenase (EC 1.14.99.53), and lytic xylan monooxygenase/xylan oxidase (glycosidic bond-cleaving) (EC 1.14.99.-). Also included are viral chitin-binding glycoproteins such as fusolin and spheroidin-like proteins.	180
410625	cd21178	Fusolin-like	fusolin and similar proteins. Fusolin is a protein found in spindles of insect poxviruses that resembles the lytic polysaccharide monooxygenases of chitinovorous bacteria and may function to disrupt the chitin-rich peritrophic matrix that protects insects against oral infections. Thus, it is a component of the virus occlusion bodies (which are large proteinaceous polyhedra) that protect the virus from the outside environment for extended periods until they are ingested by insect larvae.	227
411059	cd21179	LIC_1098-like	putative DNA adenine methyltransferase similar to Leptospira interrogans LIC_1098. This uncharacterized family is structurally similar to DNA adenine methyltransferases such as FokI, EcoRV, or DpnIIA.	280
409666	cd21180	GH2_GIPC	GIPC-homology 2 (GH2) domain found in the GIPC family. The GIPC family includes PDZ domain-containing proteins, GIPC1 (also called GAIP C-terminus-interacting protein, RGS-GAIP-interacting protein, RGS19-interacting protein 1, RGS19IP1, synectin, tax interaction protein 2, or TIP-2), GIPC2, and GIPC3, which may act as scaffold proteins linking heterotrimeric G-proteins to seven-transmembrane-type WNT receptor or to receptor tyrosine kinases. They might play key roles in carcinogenesis and embryogenesis through modulation of growth factor signaling and cell adhesion. GIPCs are proteins with a GIPC homology 1 (GH1) domain, a central PDZ domain and a GH2 domain. This model corresponds to the GH2 domain, which mediates the interaction with myosin VI and is involved in homodimerization in the autoinhibited state.	65
410548	cd21181	Tudor_SETDB1_rpt2	second Tudor domain found in SET domain bifurcated 1 (SETDB1) and similar proteins. SETDB1, also called ERG-associated protein with SET domain (ESET), histone H3-K9 methyltransferase 4, H3-K9-HMTase 4, or lysine N-methyltransferase 1E (KMT1E), acts as a histone-lysine N-methyltransferase that specifically trimethylates 'Lys-9' of histone H3 (H3K9me3). It mainly functions in euchromatin regions, thereby playing a central role in the silencing of euchromatic genes. It contains two Tudor domains. This model corresponds to the second one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	54
410549	cd21182	Tudor_SMN_SPF30-like	Tudor domain found in survival motor neuron protein (SMN), motor neuron-related-splicing factor 30 (SPF30), and similar proteins. This group contains SMN, SPF30, Tudor domain-containing protein 3 (TDRD3), DNA excision repair protein ERCC-6-like 2 (ERCC6L2), and similar proteins. SMN, also called component of gems 1, or Gemin-1, is part of a multimeric SMN complex that includes spliceosomal Sm core proteins and plays a catalyst role in the assembly of small nuclear ribonucleoproteins (snRNPs), the building blocks of the spliceosome. SPF30, also called 30 kDa splicing factor SMNrp, SMN-related protein, or survival motor neuron domain-containing protein 1 (SMNDC1), is an essential pre-mRNA splicing factor required for assembly of the U4/U5/U6 tri-small nuclear ribonucleoprotein into the spliceosome. TDRD3 is a scaffolding protein that specifically recognizes and binds dimethylarginine-containing proteins. ERCC6L2, also called DNA repair and recombination protein RAD26-like (RAD26L), may be involved in early DNA damage response. It regulates RNA Pol II-mediated transcription via its interaction with DNA-dependent protein kinase (DNA-PK) to resolve R loops and minimize transcription-associated genome instability. Members of this group contain a single Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	50
409032	cd21183	CH_FLN-like_rpt1	first calponin homology (CH) domain found in the filamin family. The filamin family includes filamin-A (FLN-A), filamin-B (FLN-B) and filamin-C (FLN-C). Filamins function to anchor various transmembrane proteins to the actin cytoskeleton. FLN-A is also called actin-binding protein 280 (ABP-280), alpha-filamin, endothelial actin-binding protein, filamin-1, or non-muscle filamin. It promotes orthogonal branching of actin filaments and links actin filaments to membrane glycoproteins. It also serves as a scaffold for a wide range of cytoplasmic signaling proteins. FLN-B is also called ABP-278, ABP-280 homolog, actin-binding-like protein, beta-filamin, filamin homolog 1 (Fh1), filamin-3, thyroid autoantigen, truncated actin-binding protein, or truncated ABP. It connects cell membrane constituents to the actin cytoskeleton and may also promote orthogonal branching of actin filaments as well as link actin filaments to membrane glycoproteins. FLN-C, also called FLNc, ABP-280-like protein, ABP-L, actin-binding-like protein, filamin-2, or gamma-filamin, is a muscle-specific filamin that plays a central role in muscle cells, probably by functioning as a large actin-cross-linking protein. It may be involved in reorganizing the actin cytoskeleton in response to signaling events, and may also display structural functions at the Z lines in muscle cells. FLN-C is critical for normal myogenesis and for maintaining the structural integrity of the muscle fibers. This family also includes Drosophila melanogaster protein jitterbug (Jbug), which is an actin-meshwork organizing protein containing three copies of the CH domain. Other members of this family contain two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs.	108
409033	cd21184	CH_FLN-like_rpt2	second calponin homology (CH) domain found in the filamin family. The filamin family includes filamin-A (FLN-A), filamin-B (FLN-B) and filamin-C (FLN-C). Filamins function to anchor various transmembrane proteins to the actin cytoskeleton. FLN-A is also called actin-binding protein 280 (ABP-280), alpha-filamin, endothelial actin-binding protein, filamin-1, or non-muscle filamin. It promotes orthogonal branching of actin filaments and links actin filaments to membrane glycoproteins. It also serves as a scaffold for a wide range of cytoplasmic signaling proteins. FLN-B is also called ABP-278, ABP-280 homolog, actin-binding-like protein, beta-filamin, filamin homolog 1 (Fh1), filamin-3, thyroid autoantigen, truncated actin-binding protein, or truncated ABP. It connects cell membrane constituents to the actin cytoskeleton and may also promote orthogonal branching of actin filaments as well as link actin filaments to membrane glycoproteins. FLN-C, also called FLNc, ABP-280-like protein, ABP-L, actin-binding-like protein, filamin-2, or gamma-filamin, is a muscle-specific filamin that plays a central role in muscle cells, probably by functioning as a large actin-cross-linking protein. It may be involved in reorganizing the actin cytoskeleton in response to signaling events, and may also display structural functions at the Z lines in muscle cells. FLN-C is critical for normal myogenesis and for maintaining the structural integrity of the muscle fibers. This family also includes Drosophila melanogaster protein jitterbug (Jbug), which is an actin-meshwork organizing protein containing three copies of the CH domain. Other members of this family contain two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs.	103
409034	cd21185	CH_jitterbug-like_rpt3	third calponin homology (CH) domain found in Drosophila melanogaster protein jitterbug and similar proteins. Protein jitterbug (Jbug) is an actin-meshwork organizing protein. It is required to maintain the shape and cell orientation of the Drosophila notum epithelium during flight muscle attachment to tendon cells. Jbug contains three copies of the CH domain. This model corresponds to the third CH domain. CH domains are actin filament (F-actin) binding motifs.	98
409035	cd21186	CH_DMD-like_rpt1	first calponin homology (CH) domain found in the dystrophin family. The dystrophin family includes dystrophin and its paralog, utrophin. Dystrophin, encoded by the DMD gene, is a large, submembrane cytoskeletal protein that is the main component of the dystrophin-glycoprotein complex (DGC) in skeletal muscles. It links the transmembrane DGC to the actin cytoskeleton through binding strongly to the cytoplasmic tail of beta-dystroglycan, the transmembrane subunit of a highly O-glycosylated cell-surface protein. Dystrophin is also involved in maintaining the structural integrity of cells, as well as in the formation of the blood-brain barrier (BBB). Utrophin, also called dystrophin-related protein 1 (DRP-1), is an autosomal dystrophin homolog that increases dystrophic muscle function and reduces pathology. It is broadly expressed at both the mRNA and protein levels, and occurs in the cerebrovascular endothelium. Utrophin forms the utrophin-glycoprotein complex (UGC) by interacting with dystroglycans (DGs) and sarcoglycan-dystroglycans, as well as sarcoglycan and sarcospan (SG-SSPN) subcomplexes. It may act as a scaffolding protein that stabilizes lipid microdomains and clusters mechanosensitive channel subunits, and links the F-actin cytoskeleton to the cell membrane via the associated glycoprotein complex. Members of this family contain two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs.	107
409036	cd21187	CH_DMD-like_rpt2	second calponin homology (CH) domain found in the dystrophin family. The dystrophin family includes dystrophin and its paralog, utrophin. Dystrophin, encoded by the DMD gene, is a large, submembrane cytoskeletal protein that is the main component of the dystrophin-glycoprotein complex (DGC) in skeletal muscles. It links the transmembrane DGC to the actin cytoskeleton through binding strongly to the cytoplasmic tail of beta-dystroglycan, the transmembrane subunit of a highly O-glycosylated cell-surface protein. Dystrophin is also involved in maintaining the structural integrity of cells, as well as in the formation of the blood-brain barrier (BBB). Utrophin, also called dystrophin-related protein 1 (DRP-1), is an autosomal dystrophin homolog that increases dystrophic muscle function and reduces pathology. It is broadly expressed at both the mRNA and protein levels, and occurs in the cerebrovascular endothelium. Utrophin forms the utrophin-glycoprotein complex (UGC) by interacting with dystroglycans (DGs) and sarcoglycan-dystroglycans, as well as sarcoglycan and sarcospan (SG-SSPN) subcomplexes. It may act as a scaffolding protein that stabilizes lipid microdomains and clusters mechanosensitive channel subunits, and links the F-actin cytoskeleton to the cell membrane via the associated glycoprotein complex. Members of this family contain two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs.	104
409037	cd21188	CH_PLEC-like_rpt1	first calponin homology (CH) domain found in the plectin/dystonin/MACF1 family. This family includes plectin, dystonin and microtubule-actin cross-linking factor 1, isoforms 1/2/3/5 (MACF1). Plectin, also called PCN, PLTN, hemidesmosomal protein 1 (HD1), or plectin-1, is a structural component of muscle. It interlinks intermediate filaments with microtubules and microfilaments, and anchors intermediate filaments to desmosomes or hemidesmosomes. It could also bind muscle proteins such as actin to membrane complexes in muscle. Dystonin, also called 230 kDa bullous pemphigoid antigen, 230/240 kDa bullous pemphigoid antigen, bullous pemphigoid antigen 1 (BPA or BPAG1), dystonia musculorum protein, or hemidesmosomal plaque protein, is a cytoskeletal linker protein that acts as an integrator of intermediate filaments, actin, and microtubule cytoskeleton networks. It is required for anchoring either intermediate filaments to the actin cytoskeleton in neural and muscle cells, or keratin-containing intermediate filaments to hemidesmosomes in epithelial cells. MACF1, also called 620 kDa actin-binding protein (ABP620), actin cross-linking family protein 7 (ACF7), macrophin-1, or trabeculin-alpha, is a large protein containing numerous spectrin and leucine-rich repeat (LRR) domains. It facilitates actin-microtubule interactions at the cell periphery and couples the microtubule network to cellular junctions. Members of this family contain two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs.	105
409038	cd21189	CH_PLEC-like_rpt2	second calponin homology (CH) domain found in the plectin/dystonin/MACF1 family. This family includes plectin, dystonin and microtubule-actin cross-linking factor 1, isoforms 1/2/3/5 (MACF1). Plectin, also called PCN, PLTN, hemidesmosomal protein 1 (HD1), or plectin-1, is a structural component of muscle. It interlinks intermediate filaments with microtubules and microfilaments, and anchors intermediate filaments to desmosomes or hemidesmosomes. It could also bind muscle proteins such as actin to membrane complexes in muscle. Dystonin, also called 230 kDa bullous pemphigoid antigen, 230/240 kDa bullous pemphigoid antigen, bullous pemphigoid antigen 1 (BPA or BPAG1), dystonia musculorum protein, or hemidesmosomal plaque protein, is a cytoskeletal linker protein that acts as an integrator of intermediate filaments, actin, and microtubule cytoskeleton networks. It is required for anchoring either intermediate filaments to the actin cytoskeleton in neural and muscle cells, or keratin-containing intermediate filaments to hemidesmosomes in epithelial cells. MACF1, also called 620 kDa actin-binding protein (ABP620), actin cross-linking family protein 7 (ACF7), macrophin-1, or trabeculin-alpha, is a large protein containing numerous spectrin and leucine-rich repeat (LRR) domains. It facilitates actin-microtubule interactions at the cell periphery and couples the microtubule network to cellular junctions. Members of this family contain two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs.	105
409039	cd21190	CH_SYNE-like_rpt1	first calponin homology (CH) domain found in the synaptic nuclear envelope protein family. The synaptic nuclear envelope (SYNE) family includes SYNE-1, -2 and calmin. SYNE-1 (also called nesprin-1, enaptin, KASH domain-containing protein 1, KASH1, myocyte nuclear envelope protein 1, MYNE-1, or nuclear envelope spectrin repeat protein 1) and SYNE-2 (also called nesprin-2, KASH domain-containing protein 2, KASH2, nuclear envelope spectrin repeat protein 2, nucleus and actin connecting element protein, or protein NUANCE) may act redundantly. They are multi-isomeric modular proteins which form a linking network between organelles and the actin cytoskeleton to maintain subcellular spatial organization. They also act as components of the LINC (LInker of Nucleoskeleton and Cytoskeleton) complex, which is involved in the connection between the nuclear lamina and the cytoskeleton. Calmin, also called calponin-like transmembrane domain protein, is a protein with calponin homology (CH) and transmembrane domains expressed in maturing spermatogenic cells. It may be involved in the development and/or maintenance of neuronal functions. Members of this family contain two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs.	113
409040	cd21191	CH_CLMN_rpt1	first calponin homology (CH) domain found in calmin and similar proteins. Calmin, also called calponin-like transmembrane domain protein, is a protein with calponin homology (CH) and transmembrane domains expressed in maturing spermatogenic cells. It may be involved in the development and/or maintenance of neuronal functions. Calmin contains two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs.	114
409041	cd21192	CH_SYNE-like_rpt2	second calponin homology (CH) domain found in the synaptic nuclear envelope protein (SYNE) family. The SYNE family includes SYNE-1, -2 and calmin. SYNE-1 (also called nesprin-1, enaptin, KASH domain-containing protein 1, KASH1, myocyte nuclear envelope protein 1, MYNE-1, or nuclear envelope spectrin repeat protein 1) and SYNE-2 (also called nesprin-2, KASH domain-containing protein 2, KASH2, nuclear envelope spectrin repeat protein 2, nucleus and actin connecting element protein, or protein NUANCE) may act redundantly. They are multi-isomeric modular proteins which form a linking network between organelles and the actin cytoskeleton to maintain subcellular spatial organization. They also act as components of the LINC (LInker of Nucleoskeleton and Cytoskeleton) complex, which is involved in the connection between the nuclear lamina and the cytoskeleton. Calmin, also called calponin-like transmembrane domain protein, is a protein with calponin homology (CH) and transmembrane domains expressed in maturing spermatogenic cells. It may be involved in the development and/or maintenance of neuronal functions. Members of this family contain two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs.	107
409042	cd21193	CH_beta_spectrin_rpt1	first calponin homology (CH) domain found in the beta spectrin family. The beta spectrin family includes beta-I, -II, -III, -IV and -V spectrins. Spectrin is an actin crosslinking and molecular scaffold protein that links the plasma membrane to the actin cytoskeleton, and functions in the determination of cell shape, arrangement of transmembrane proteins, and organization of organelles. It is composed of two antiparallel dimers of alpha- and beta- subunits. Beta-I spectrin, also called spectrin beta chain, erythrocytic (SPTB), may be involved in anaemia pathogenesis. Beta-II spectrin, also called spectrin beta chain, non-erythrocytic 1 (SPTBN1), or fodrin beta chain, is a component of fodrin, which is the general spectrin-like protein that seems to be involved in secretion. Fodrin interacts with calmodulin in a calcium-dependent manner and is thus a candidate for the calcium-dependent movement of the cytoskeleton at the membrane. Beta-IV spectrin is also called spectrin, non-erythroid beta chain 3 (SPTBN3) or spectrin beta chain, non-erythrocytic 4 (SPTBN4). Its mutation associates with congenital myopathy, neuropathy, and central deafness. Beta-III spectrin is also called spectrin beta chain, non-erythrocytic 2 (SPTBN2), or spinocerebellar ataxia 5 protein (SCA5). Beta-V spectrin, also called spectrin beta chain, non-erythrocytic 5 (SPTBN5), is a mammalian ortholog of Drosophila beta H spectrin. Beta-III and Beta-V spectrins may play crucial roles as longer actin-membrane cross-linkers or fulfill the need for greater extensible flexibility than can be provided by the other smaller conventional spectrins. Members of this family contain two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs.	116
409043	cd21194	CH_beta_spectrin_rpt2	second calponin homology (CH) domain found in the beta spectrin family. The beta spectrin family includes beta-I, -II, -III, -IV and -V spectrins. Spectrin is an actin crosslinking and molecular scaffold protein that links the plasma membrane to the actin cytoskeleton, and functions in the determination of cell shape, arrangement of transmembrane proteins, and organization of organelles. It is composed of two antiparallel dimers of alpha- and beta- subunits. Beta-I spectrin, also called spectrin beta chain, erythrocytic (SPTB), may be involved in anaemia pathogenesis. Beta-II spectrin, also called spectrin beta chain, non-erythrocytic 1 (SPTBN1), or fodrin beta chain, is a component of fodrin, which is the general spectrin-like protein that seems to be involved in secretion. Fodrin interacts with calmodulin in a calcium-dependent manner and is thus a candidate for the calcium-dependent movement of the cytoskeleton at the membrane. Beta-IV spectrin is also called spectrin, non-erythroid beta chain 3 (SPTBN3) or spectrin beta chain, non-erythrocytic 4 (SPTBN4). Its mutation associates with congenital myopathy, neuropathy, and central deafness. Beta-III spectrin is also called spectrin beta chain, non-erythrocytic 2 (SPTBN2), or spinocerebellar ataxia 5 protein (SCA5). Beta-V spectrin, also called spectrin beta chain, non-erythrocytic 5 (SPTBN5), is a mammalian ortholog of Drosophila beta H spectrin. Beta-III and Beta-V spectrins may play crucial roles as longer actin-membrane cross-linkers or fulfill the need for greater extensible flexibility than can be provided by the other smaller conventional spectrins. Members of this family contain two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs.	105
409044	cd21195	CH_MICAL2_3-like	calponin homology (CH) domain found in molecule interacting with CasL protein 2 (MICAL-2), MICAL-3, and similar proteins. Molecule interacting with CasL protein (MICAL) is a large, multidomain, cytosolic protein with a single LIM domain, a calponin homology (CH) domain and a flavoprotein monooxygenase (MO) domain. In Drosophila, MICAL is expressed in axons, interacts with the neuronal plexin A (PlexA) receptor and is required for Semaphorin 1a (Sema-1a)-PlexA-mediated repulsive axon guidance. The LIM and CH domains mediate interactions with the cytoskeleton, cytoskeletal adaptor proteins, and other signaling proteins. The flavoprotein MO is required for semaphorin-plexin repulsive axon guidance during axonal pathfinding in the Drosophila neuromuscular system. In addition, MICAL interacts with Rab13 and Rab8 to coordinate the assembly of tight junctions and adherens junctions in epithelial cells. Thus, MICAL is also called junctional Rab13-binding protein (JRAB). Members of this family, which includes MICAL-2, MICAL-3, and similar proteins, contain one CH domain. CH domains are actin filament (F-actin) binding motifs.	110
409045	cd21196	CH_MICAL1	calponin homology (CH) domain found in molecule interacting with CasL protein 1. MICAL-1, also called NEDD9-interacting protein with calponin homology and LIM domains, acts as a [F-actin]-monooxygenase that promotes depolymerization of F-actin by mediating oxidation of specific methionine residues on actin to form methionine-sulfoxide, resulting in actin filament disassembly and preventing repolymerization. In the absence of actin, it also functions as a NADPH oxidase producing H(2)O(2). MICAL-1 acts as a cytoskeletal regulator that connects NEDD9 to intermediate filaments. It also acts as a negative regulator of apoptosis via its interaction with STK38 and STK38L. MICAL-1 is a Rab effector protein that plays a role in vesicle trafficking. It contains a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs.	106
409046	cd21197	CH_MICALL	calponin homology (CH) domain found in the MICAL-like protein family. The MICAL-L family includes MICAL-L1 and MICAL-L2. MICAL-L1, also called molecule interacting with Rab13 (MIRab13), is a probable lipid-binding protein with higher affinity for phosphatidic acid, a lipid enriched in recycling endosome membranes. It is a tubular endosomal membrane hub that connects Rab35 and Arf6 with Rab8a. It may be involved in a late step of receptor-mediated endocytosis regulating endocytosed-EGF receptor trafficking. Alternatively, it may regulate slow endocytic recycling of endocytosed proteins back to the plasma membrane. MICAL-L1 may indirectly play a role in neurite outgrowth. MICAL-L2, also called junctional Rab13-binding protein (JRAB), or molecule interacting with CasL-like 2, acts as an effector of small Rab GTPases which is involved in junctional complexes assembly through the regulation of cell adhesion molecule transport to the plasma membrane, and actin cytoskeleton reorganization. It regulates the endocytic recycling of occludins, claudins, and E-cadherin to the plasma membrane and may thereby regulate the establishment of tight junctions and adherens junctions. Members of this family contain a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs.	105
409047	cd21198	CH_EHBP	calponin homology (CH) domain found in the EH domain-binding protein (EHBP) family. The EHBP family includes EHBP1 and EHBP1-like protein (EHBP1L1). EHBP1 is a regulator of endocytic recycling and may play a role in actin reorganization by linking clathrin-mediated endocytosis to the actin cytoskeleton. It may act as an effector of small GTPases, including RAB-10 (Rab10), and play a role in vesicle trafficking. EHBP1 is associated with aggressive prostate cancer and insulin-stimulated trafficking and cell migration. EHBP1L1 may also act as Rab effector protein and play a role in vesicle trafficking. It coordinates Rab8 and Bin1 to regulate apical-directed transport in polarized epithelial cells. Members of this family contain a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs.	105
409048	cd21199	CH_CYTS	calponin homology (CH) domain found in the cytospin family. The cytospin family includes cytospin-A and cytospin-B. Cytospin-A, also called renal carcinoma antigen NY-REN-22, sperm antigen with calponin homology and coiled-coil domains 1-like, or SPECC1-like (SPECC1L) protein, is involved in cytokinesis and spindle organization. It may play a role in actin cytoskeleton organization and microtubule stabilization and hence, is required for proper cell adhesion and migration. Cytospin-B, also called nuclear structure protein 5 (NSP5), sperm antigen HCMOGT-1, or sperm antigen with calponin homology and coiled-coil domains 1 (SPECC1), is a novel fusion partner to PDGFRB in juvenile myelomonocytic leukemia with translocation t(5;17)(q33;p11.2). Members of this family contain a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs.	112
409049	cd21200	CH_SMTN-like	calponin homology (CH) domain found in the smoothelin family. The smoothelin family includes smoothelin and smoothelin-like proteins. Smoothelins are actin-binding cytoskeletal proteins that are abundantly expressed in healthy visceral (smoothelin-A) and vascular (smoothelin-B) smooth muscle. SMTNL1, also called calponin homology-associated smooth muscle protein (CHASM), plays a role in the regulation of contractile properties of both striated and smooth muscles. It can bind to calmodulin and tropomyosin. When it is unphosphorylated, SMTNL1 may inhibit myosin dephosphorylation. SMTNL2 is highly expressed in skeletal muscle and could be associated with differentiating myocytes. Members of this family contain a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs.	107
409050	cd21201	CH_VAV	calponin homology (CH) domain found in VAV proteins. VAV proteins function both as cytoplasmic guanine nucleotide exchange factors (GEFs) for Rho GTPases and as scaffold proteins, and they play important roles in cell signaling by coupling cell surface receptors to various effector functions. They play key roles in processes that require cytoskeletal reorganization including immune synapse formation, phagocytosis, cell spreading, and platelet aggregation, among others. Vertebrates have three VAV proteins (VAV1, VAV2, and VAV3). VAV proteins contain several domains that enable their function: N-terminal calponin homology (CH), acidic, RhoGEF (also called Dbl-homologous or DH), Pleckstrin Homology (PH), C1 (zinc finger), SH2, and two SH3 domains. This model corresponds to the CH domain, an actin-binding domain which is present as a single copy in VAV proteins.	117
409051	cd21202	CH_PIX	calponin homology (CH) domain found in the Pak Interactive eXchange factor family. Pak Interactive eXchange factor (PIX) proteins are Rho guanine nucleotide exchange factors (GEFs), which activate small GTPases by exchanging bound GDP for free GTP. They act as GEFs for both Cdc42 and Rac1, and have been implicated in cell motility, adhesion, neurite outgrowth, and cell polarity. Vertebrates contain two proteins from the PIX family, alpha-PIX and beta-PIX. Alpha-PIX, also called Rho guanine nucleotide exchange factor 6 (ARHGEF6), is localized in dendritic spines where it regulates spine morphogenesis. It controls dendritic length and spine density in the hippocampus. Mutations in the ARHGEF6 gene cause X-linked intellectual disability in humans. Beta-PIX, also called Rho guanine nucleotide exchange factor 7 (ARHGEF7), plays important roles in regulating neuroendocrine exocytosis, focal adhesion maturation, cell migration, synaptic vesicle localization, and insulin secretion. Both alpha-PIX and beta-PIX contain a single copy of the CH domain at the N-terminus. CH domains are actin filament (F-actin) binding motifs.	114
409052	cd21203	CH_AtKIN14-like	calponin homology (CH) domain found in the Arabidopsis thaliana kinesin-like KIN-14 protein family. Kinesins are microtubule-dependent molecular motors that play important roles in intracellular transport and in cell division. This family includes a group of kinesin-like proteins belonging to the KIN-14 protein family. They all contain a single copy of the CH domain at the N-terminus. CH domains are actin filament (F-actin) binding motifs.	112
409053	cd21204	CH_GAS2-like	calponin homology (CH) domain found in the growth arrest-specific protein 2 family. The growth arrest-specific protein 2 (GAS-2) family includes GAS-2, and GAS-2 like proteins, GAS2L1-3. GAS-2 may play a role in apoptosis by acting as a cell death substrate for caspases. GAS2L1 (also called GAS2-related protein on chromosome 22 or growth arrest-specific protein 2-like 1) and GAS2L2 (also called GAS2-related protein on chromosome 17 or growth arrest-specific protein 2-like 2) may be involved in the cross-linking of microtubules and microfilaments. GAS2L3, also called GAS2-like protein 3, is a cytoskeletal linker protein that may promote and stabilize the formation of the actin and microtubule network. Members of this family contain a single copy of the CH domain at the N-terminal region. CH domains are actin filament (F-actin) binding motifs.	131
409054	cd21205	CH_LRCH	calponin homology (CH) domain found in the leucine-rich repeat and calponin homology domain-containing protein family. The leucine-rich repeat and calponin homology domain-containing protein (LRCH) family includes LRCH1-4. LRCH1, also called calponin homology domain-containing protein 1, or neuronal protein 81 (NP81), acts as a negative regulator of GTPase Cdc42 by sequestering Cdc42-guanine exchange factor DOCK8. LRCH2 may play a role in the organization of the cytoskeleton. LRCH3 is part of the DISP complex and may regulate the association of septins with actin and thereby regulate the actin cytoskeleton. LRCH4, also called leucine-rich repeat neuronal protein 4, or leucine-rich neuronal protein, acts as a novel Toll-like receptor (TLR) accessory protein that regulates the innate immune response. Members of this family contain a single copy of the CH domain at the C-terminus. CH domains are actin filament (F-actin) binding motifs.	107
409055	cd21206	CH_IQGAP	calponin homology (CH) domain found in the IQ motif containing GTPase activating protein family. Members of the IQ motif containing GTPase activating protein (IQGAP) family are associated with the Ras GTP-binding protein and act as essential regulators of cytoskeletal function. There are three known IQGAP family members: IQGAP1, IQGAP2, and IQGAP3. They are multi-domain molecules having a calponin-homology (CH) domain which binds F-actin, IQGAP-specific repeats, a single WW domain, four IQ motifs that mediate interactions with calmodulin, and a RasGAP related domain that binds active Rho family GTPases. IQGAP1 negatively regulates Ras family GTPases by stimulating their intrinsic GTPase activity. It lacks GAP activity. Both IQGAP1 and IQGAP2 specifically bind to Cdc42 and Rac1, but not to RhoA. Despite similarities to part of the sequence of RasGAP, neither IQGAP1 nor IQGAP2 interacts with Ras. IQGAP3 regulates the organization of the cytoskeleton under the regulation of Rac1 and Cdc42 in neuronal cells. The depletion of IQGAP3 is shown to impair neurite or axon outgrowth in neuronal cells with disorganized cytoskeleton.	118
409056	cd21207	CH_dMP20-like	calponin homology (CH) domain found in Drosophila melanogaster muscle-specific protein 20 (dMP20) and similar domains. This subfamily contains Drosophila melanogaster muscle-specific protein 20 (dMP20), Echinococcus granulosus myophilin, Dictyostelium discoideum Rac guanine nucleotide exchange factor B (also called Trix), and similar proteins. dMP20 is present only in the synchronous muscles of D. melanogaster. It may be involved in the system linking the nerve impulse with the contraction or the relaxation process. Trix is involved in the regulation of the late steps of the endocytic pathway. dMP20 contains a single copy of the CH domain, while Trix (triple CH-domain array exchange factor) contains three, two type 3 CH domains which are included in this model, and one type 1 CH domain that is not included in this subfamily, but is part of the superfamily. CH domains are actin filament (F-actin) binding motifs.	107
409057	cd21208	CH_LMO7-like	calponin homology (CH) domain found in LIM domain only protein 7 and similar proteins. This family includes LIM domain only protein 7 (LMO-7) and LIM and calponin homology domains-containing protein 1 (LIMCH1), and similar proteins. LMO-7, also called F-box only protein 20, or LOMP, is a transcription regulator for expression of many Emery-Dreifuss muscular dystrophy (EDMD)-relevant genes. It binds to alpha-actinin and AF6/afadin at adherens junctions for epithelial cell-cell adhesion. LIMCH1 acts as an actin stress fiber-associated protein that activates the non-muscle myosin IIa complex by promoting the phosphorylation of its regulatory subunit MRLC/MYL9. It positively regulates actin stress fiber assembly and stabilizes focal adhesions, and therefore negatively regulates cell spreading and cell migration. Members of this family contain a single copy of the CH domain at the N-terminus. CH domains are actin filament (F-actin) binding motifs.	119
409058	cd21209	CH_TAGLN-like	calponin homology (CH) domain found in the transgelin family. The transgelin (TAGLN) family includes transgelin, transgelin-2 and transgelin-3. Transgelin, also called 22 kDa actin-binding protein, protein WS3-10, or smooth muscle protein 22-alpha (SM22-alpha), acts as an actin cross-linking/gelling protein that may be involved in calcium interactions and in regulating contractile properties of the cell. Transgelin-2, also called epididymis tissue protein Li 7e, or SM22-alpha homolog, acts as an actin-binding protein that induces actin gelation and regulates actin cytoskeleton. It may participate in the development and progression of multiple cancers. Transgelin-3, also called neuronal protein 22 (NP22), or neuronal protein NP25, may have a role in alcohol-related adaptations and may mediate regulatory signal transduction pathways in neurons. Members of this family contain a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs.	119
409059	cd21210	CH_SCP1-like	calponin homology (CH) domain found in Saccharomyces cerevisiae transgelin (SCP1) and similar proteins. The family includes transgelins from Saccharomyces cerevisiae and Schizosaccharomyces pombe, which are also called SCP1 and STG1, respectively. Transgelin, also called calponin homolog 1, has actin-binding and actin-bundling activity. It stabilizes actin filaments against disassembly. Transgelin contains a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs.	101
409060	cd21211	CH_CNN	calponin homology (CH) domain found in the calponin family. Calponin is an actin filament-associated regulatory protein expressed in smooth muscle and many types of non-muscle cells. There are three calponin isoforms, calponin-1, -2, -3. All of them are actin-binding proteins with functions in inhibiting actin-activated myosin ATPase and stabilizing the actin cytoskeleton. Calponin-1 is specifically expressed in smooth muscle cells and plays a role in fine-tuning smooth muscle contractility. Calponin-2 is expressed in both smooth muscle and non-muscle cells and regulates multiple actin cytoskeleton-based functions. Calponin-3 is expressed in the brain and participates in actin cytoskeleton-based activities in embryonic development and myogenesis. Members of this family contain a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs.	108
409061	cd21212	CH_NAV2-like	calponin homology (CH) domain found in neuron navigator (NAV) 2, NAV3, and similar proteins. This family includes neuron navigator 2 (NAV2) and NAV3, both of which contain a single copy of the CH domain at the N-terminus. CH domains are actin filament (F-actin) binding motifs. NAV2, also called helicase APC down-regulated 1 (HELAD1), pore membrane and/or filament-interacting-like protein 2 (POMFIL2), retinoic acid inducible in neuroblastoma 1 (RAINB1), Steerin-2 (STEERIN2), or Unc-53 homolog 2 (unc53H2), possesses 3' to 5' helicase activity and exonuclease activity. It is involved in neuronal development, specifically in the development of different sensory organs. NAV3, also called pore membrane and/or filament-interacting-like protein 1 (POMFIL1), Steerin-3 (STEERIN3), or Unc-53 homolog 3 (unc53H3), may regulate IL2 production by T-cells. It may be involved in neuron regeneration.	105
409062	cd21213	CH_DIXDC1	calponin homology (CH) domain found in Dixin and similar proteins. Dixin, also called coiled-coil protein DIX1, coiled-coil-DIX1, or DIX domain-containing protein 1, is a positive effector of the Wnt signaling pathway. It activates WNT3A signaling via DVL2 and regulates JNK activation by AXIN1 and DVL2. Members of this family contain a single copy of the CH domain at the N-terminus. CH domains are actin filament (F-actin) binding motifs.	107
409063	cd21214	CH_ACTN_rpt1	first calponin homology (CH) domain found in the alpha-actinin family. The alpha-actinin (ACTN) family includes alpha-actinin-1, -2, -3, and -4. They are F-actin cross-linking proteins which are thought to anchor actin to a variety of intracellular structures. ACTN1 mutations cause congenital macrothrombocytopenia. ACTN2 mutations are associated with cardiomyopathies, as well as skeletal muscle disorder. ACTN3 is critical in anchoring the myofibrillar actin filaments and plays a key role in muscle contraction. ACTN4 is associated with cell motility and cancer invasion. It is probably involved in vesicular trafficking via its association with the CART complex, which is necessary for efficient transferrin receptor recycling but not for epidermal growth factor receptor (EGFR) degradation. Members of this family contain two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs.	105
409064	cd21215	CH_SpAIN1-like_rpt1	first calponin homology (CH) domain found in Schizosaccharomyces pombe alpha-actinin-like protein 1 and similar proteins. Schizosaccharomyces pombe alpha-actinin-like protein 1 (SpAIN1) binds to actin and is involved in actin-ring formation and organization. It plays a role in cytokinesis and is involved in septation. Members of this family contain two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs.	107
409065	cd21216	CH_ACTN_rpt2	second calponin homology (CH) domain found in the alpha-actinin family. The alpha-actinin (ACTN) family includes alpha-actinin-1, -2, -3, and -4. They are F-actin cross-linking proteins which are thought to anchor actin to a variety of intracellular structures. ACTN1 mutations cause congenital macrothrombocytopenia. ACTN2 mutations are associated with cardiomyopathies, as well as skeletal muscle disorder. ACTN3 is critical in anchoring the myofibrillar actin filaments and plays a key role in muscle contraction. ACTN4 is associated with cell motility and cancer invasion. It is probably involved in vesicular trafficking via its association with the CART complex, which is necessary for efficient transferrin receptor recycling but not for epidermal growth factor receptor (EGFR) degradation. Members of this family contain two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs.	115
409066	cd21217	CH_PLS_FIM_rpt1	first calponin homology (CH) domain found in the plastin/fimbrin family. This family includes plastin and fimbrin. Plastin has three isoforms, plastin-1, -2, and -3, which are all actin-bundling proteins. Plastin-1, also called intestine-specific plastin, or I-plastin, is an actin-bundling protein in the absence of calcium. Plastin-2, also called L-plastin, LC64P, or lymphocyte cytosolic protein 1 (LCP-1), plays a role in the activation of T-cells in response to costimulation through TCR/CD3 and CD2 or CD28. It modulates the cell surface expression of IL2RA/CD25 and CD69. Plastin-3, also called T-plastin, is found in intestinal microvilli, hair cell stereocilia, and fibroblast filopodia. It may play a role in the regulation of bone development. Fimbrin has been found in plants and fungi. Arabidopsis thaliana fimbrin (AtFIM) includes fimbrin-1, -2, -3, -4, and -5; they cross-link actin filaments (F-actin) in a calcium independent manner. They stabilize and prevent F-actin depolymerization mediated by profilin. They act as key regulators of actin cytoarchitecture, probably involved in cell cycle, cell division, cell elongation and cytoplasmic tractus. AtFIM5 is an actin bundling factor that is required for pollen germination and pollen tube growth. Fungal fimbrin binds to actin, and functionally associates with actin structures involved in the development and maintenance of cell polarity. Members of this family contain four copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs.	114
409067	cd21218	CH_PLS_FIM_rpt2	second calponin homology (CH) domain found in the plastin/fimbrin family. This family includes plastin and fimbrin. Plastin has three isoforms, plastin-1, -2, and -3, which are all actin-bundling proteins. Plastin-1, also called intestine-specific plastin, or I-plastin, is an actin-bundling protein in the absence of calcium. Plastin-2, also called L-plastin, LC64P, or lymphocyte cytosolic protein 1 (LCP-1), plays a role in the activation of T-cells in response to costimulation through TCR/CD3 and CD2 or CD28. It modulates the cell surface expression of IL2RA/CD25 and CD69. Plastin-3, also called T-plastin, is found in intestinal microvilli, hair cell stereocilia, and fibroblast filopodia. It may play a role in the regulation of bone development. Fimbrin has been found in plants and fungi. Arabidopsis thaliana fimbrin (AtFIM) includes fimbrin-1, -2, -3, -4, and -5; they cross-link actin filaments (F-actin) in a calcium independent manner. They stabilize and prevent F-actin depolymerization mediated by profilin. They act as key regulators of actin cytoarchitecture, probably involved in cell cycle, cell division, cell elongation and cytoplasmic tractus. AtFIM5 is an actin bundling factor that is required for pollen germination and pollen tube growth. Fungal fimbrin binds to actin, and functionally associates with actin structures involved in the development and maintenance of cell polarity. Members of this family contain four copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs.	114
409068	cd21219	CH_PLS_FIM_rpt3	third calponin homology (CH) domain found in the plastin/fimbrin family. This family includes plastin and fimbrin. Plastin has three isoforms, plastin-1, -2, and -3. Plastin-1, also called intestine-specific plastin, or I-plastin, is an actin-bundling protein in the absence of calcium. Plastin-2, also called L-plastin, or LC64P, or lymphocyte cytosolic protein 1 (LCP-1), is an actin-binding protein that plays a role in the activation of T-cells in response to costimulation through TCR/CD3 and CD2 or CD28. It modulates the cell surface expression of IL2RA/CD25 and CD69. Plastin-3, also called T-plastin, is an actin-bundling protein found in intestinal microvilli, hair cell stereocilia, and fibroblast filopodia. It may play a role in the regulation of bone development. Fimbrin has been found in plants and fungi. Arabidopsis thaliana fimbrin (AtFIM) includes fimbrin-1, -2, -3, -4, and -5, which cross-link actin filaments (F-actin) in a calcium independent manner. They stabilize and prevent F-actin depolymerization mediated by profilin. They act as key regulators of actin cytoarchitecture, probably involved in cell cycle, cell division, cell elongation and cytoplasmic tractus. AtFIM5 is an actin bundling factor that is required for pollen germination and pollen tube growth. Fungal fimbrin binds to actin, and functionally associates with actin structures involved in the development and maintenance of cell polarity. Members of this family contain four copies of the CH domain. This model corresponds to the third CH domain. CH domains are actin filament (F-actin) binding motifs.	113
409069	cd21220	CH_PLS_FIM_rpt4	fourth calponin homology (CH) domain found in the plastin/fimbrin family. This family includes plastin and fimbrin. Plastin has three isoforms, plastin-1, -2, and -3. Plastin-1, also called intestine-specific plastin, or I-plastin, is an actin-bundling protein in the absence of calcium. Plastin-2, also called L-plastin, or LC64P, or lymphocyte cytosolic protein 1 (LCP-1), is an actin-binding protein that plays a role in the activation of T-cells in response to costimulation through TCR/CD3 and CD2 or CD28. It modulates the cell surface expression of IL2RA/CD25 and CD69. Plastin-3, also called T-plastin, is an actin-bundling protein found in intestinal microvilli, hair cell stereocilia, and fibroblast filopodia. It may play a role in the regulation of bone development. Fimbrin has been found in plants and fungi. Arabidopsis thaliana fimbrin (AtFIM) includes fimbrin-1, -2, -3, -4, and -5, which cross-link actin filaments (F-actin) in a calcium independent manner. They stabilize and prevent F-actin depolymerization mediated by profilin. They act as key regulators of actin cytoarchitecture, probably involved in cell cycle, cell division, cell elongation and cytoplasmic tractus. AtFIM5 is an actin bundling factor that is required for pollen germination and pollen tube growth. Fungal fimbrin binds to actin, and functionally associates with actin structures involved in the development and maintenance of cell polarity. Members of this family contain four copies of the CH domain. This model corresponds to the fourth CH domain. CH domains are actin filament (F-actin) binding motifs.	105
409070	cd21221	CH_PARV_rpt1	first calponin homology (CH) domain found in the parvin family. The parvin family includes alpha-parvin, beta-parvin, and gamma-parvin. Alpha-parvin, also called actopaxin, calponin-like integrin-linked kinase-binding protein (CH-ILKBP), or matrix-remodeling-associated protein 2, plays a role in sarcomere organization and in smooth muscle cell contraction. It is required for normal development of the embryonic cardiovascular system, and for normal septation of the heart outflow tract. Beta-parvin, also called affixin, is an adapter protein that plays a role in integrin signaling via ILK and in activation of the GTPases Cdc42 and Rac1 by guanine exchange factors, such as ARHGEF6. Both alpha-parvin and beta-parvin are involved in the reorganization of the actin cytoskeleton and the formation of lamellipodia, and both play roles in cell adhesion, cell spreading, establishment or maintenance of cell polarity, and cell migration. Gamma-parvin probably plays a role in the regulation of cell adhesion and cytoskeleton organization. Members of this family contain two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs.	106
409071	cd21222	CH_PARV_rpt2	second calponin homology (CH) domain found in the parvin family. The parvin family includes alpha-parvin, beta-parvin, and gamma-parvin. Alpha-parvin, also called actopaxin, calponin-like integrin-linked kinase-binding protein (CH-ILKBP), or matrix-remodeling-associated protein 2, plays a role in sarcomere organization and in smooth muscle cell contraction. It is required for normal development of the embryonic cardiovascular system, and for normal septation of the heart outflow tract. Beta-parvin, also called affixin, is an adapter protein that plays a role in integrin signaling via ILK and in activation of the GTPases Cdc42 and Rac1 by guanine exchange factors, such as ARHGEF6. Both alpha-parvin and beta-parvin are involved in the reorganization of the actin cytoskeleton and the formation of lamellipodia, and both play roles in cell adhesion, cell spreading, establishment or maintenance of cell polarity, and cell migration. Gamma-parvin probably plays a role in the regulation of cell adhesion and cytoskeleton organization. Members of this family contain two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs.	121
409072	cd21223	CH_ASPM_rpt1	first calponin homology (CH) domain found in abnormal spindle-like microcephaly-associated protein (ASPM) and similar proteins. ASPM, also called abnormal spindle protein homolog, or Asp homolog, is involved in mitotic spindle regulation and coordination of mitotic processes. It may also have a preferential role in regulating neurogenesis. Members of this family contain two copies of the CH domain in the middle region. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs.	113
409073	cd21224	CH_ASPM_rpt2	second calponin homology (CH) domain found in abnormal spindle-like microcephaly-associated protein (ASPM) and similar proteins. ASPM, also called abnormal spindle protein homolog, or Asp homolog, is involved in mitotic spindle regulation and coordination of mitotic processes. It may also have a preferential role in regulating neurogenesis. Members of this family contain two copies of the CH domain in the middle region. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs.	138
409074	cd21225	CH_CTX_rpt1	first calponin homology (CH) domain found in cortexillin. Cortexillins are actin-bundling proteins that play a critical role in regulating cell morphology and actin cytoskeleton reorganization. They play a major role in cytokinesis and contain two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs.	111
409075	cd21226	CH_CTX_rpt2	second calponin homology (CH) domain found in cortexillin. Cortexillins are actin-bundling proteins that play a critical role in regulating cell morphology and actin cytoskeleton reorganization. They play a major role in cytokinesis and contain two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs.	103
409076	cd21227	CH_jitterbug-like_rpt1	first calponin homology (CH) domain found in Drosophila melanogaster protein jitterbug and similar proteins. Protein jitterbug (Jbug) is an actin-meshwork organizing protein. It is required to maintain the shape and cell orientation of the Drosophila notum epithelium during flight muscle attachment to tendon cells. Jbug contains three copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs.	109
409077	cd21228	CH_FLN_rpt1	first calponin homology (CH) domain found in filamins. The filamin family includes filamin-A (FLN-A), filamin-B (FLN-B) and filamin-C (FLN-C). Filamins function to anchor various transmembrane proteins to the actin cytoskeleton. FLN-A is also called actin-binding protein 280 (ABP-280), alpha-filamin, endothelial actin-binding protein, filamin-1, or non-muscle filamin. It promotes orthogonal branching of actin filaments and links actin filaments to membrane glycoproteins. It also serves as a scaffold for a wide range of cytoplasmic signaling proteins. FLN-B is also called ABP-278, ABP-280 homolog, actin-binding-like protein, beta-filamin, filamin homolog 1 (Fh1), filamin-3, thyroid autoantigen, truncated actin-binding protein, or truncated ABP. It connects cell membrane constituents to the actin cytoskeleton and may also promote orthogonal branching of actin filaments as well as link actin filaments to membrane glycoproteins. FLN-C, also called FLNc, ABP-280-like protein, ABP-L, actin-binding-like protein, filamin-2, or gamma-filamin, is a muscle-specific filamin that plays a central role in muscle cells, probably by functioning as a large actin-cross-linking protein. It may be involved in reorganizing the actin cytoskeleton in response to signaling events, and may also display structural functions at the Z lines in muscle cells. FLN-C is critical for normal myogenesis and for maintaining the structural integrity of the muscle fibers. Members of this family contain two copies of the CH domain. The model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs.	108
409078	cd21229	CH_jitterbug-like_rpt2	second calponin homology (CH) domain found in Drosophila melanogaster protein jitterbug and similar proteins. Protein jitterbug (Jbug) is an actin-meshwork organizing protein. It is required to maintain the shape and cell orientation of the Drosophila notum epithelium during flight muscle attachment to tendon cells. Jbug contains three copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs.	105
409079	cd21230	CH_FLN_rpt2	second calponin homology (CH) domain found in filamins. The filamin family includes filamin-A (FLN-A), filamin-B (FLN-B) and filamin-C (FLN-C). Filamins function to anchor various transmembrane proteins to the actin cytoskeleton. FLN-A is also called actin-binding protein 280 (ABP-280), alpha-filamin, endothelial actin-binding protein, filamin-1, or non-muscle filamin. It promotes orthogonal branching of actin filaments and links actin filaments to membrane glycoproteins. It also serves as a scaffold for a wide range of cytoplasmic signaling proteins. FLN-B is also called ABP-278, ABP-280 homolog, actin-binding-like protein, beta-filamin, filamin homolog 1 (Fh1), filamin-3, thyroid autoantigen, truncated actin-binding protein, or truncated ABP. It connects cell membrane constituents to the actin cytoskeleton and may also promote orthogonal branching of actin filaments as well as link actin filaments to membrane glycoproteins. FLN-C, also called FLNc, ABP-280-like protein, ABP-L, actin-binding-like protein, filamin-2, or gamma-filamin, is a muscle-specific filamin that plays a central role in muscle cells, probably by functioning as a large actin-cross-linking protein. It may be involved in reorganizing the actin cytoskeleton in response to signaling events, and may also display structural functions at the Z lines in muscle cells. FLN-C is critical for normal myogenesis and for maintaining the structural integrity of the muscle fibers. Members of this family contain two copies of the CH domain. The model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs.	103
409080	cd21231	CH_DMD_rpt1	first calponin homology (CH) domain found in dystrophin and similar proteins. Dystrophin, encoded by the DMD gene, is a large, submembrane cytoskeletal protein that is the main component of the dystrophin-glycoprotein complex (DGC) in skeletal muscles. It links the transmembrane DGC to the actin cytoskeleton through binding strongly to the cytoplasmic tail of beta-dystroglycan, the transmembrane subunit of a highly O-glycosylated cell-surface protein. It is involved in maintaining the structural integrity of cells, as well as in the formation of the blood-brain barrier (BBB). Mutations in dystrophin lead to Duchenne muscular dystrophy (DMD). Moreover, dystrophin deficiency is associated with abnormal cerebral diffusion and perfusion, as well as with acute Trypanosoma cruzi infection. The dystrophin subfamily has been characterized by a compact cluster of domains comprising four EF-hand-like motifs and a ZZ-domain, followed by a looser region with two coiled-coils. These domains are believed to be involved in protein-protein interactions. In addition, dystrophin contains two syntrophin binding sites (SBSs) and a long N-terminal extension that comprises two actin-binding calponin homology (CH) domains, approximately 24 spectrin repeats (SRs) and a WW domain. This model corresponds to the first CH domain.	111
409081	cd21232	CH_UTRN_rpt1	first calponin homology (CH) domain found in utrophin and similar proteins. Utrophin, also called dystrophin-related protein 1 (DRP-1), is an autosomal dystrophin homolog that increases dystrophic muscle function and reduces pathology. It is broadly expressed at both the mRNA and protein levels, and occurs in the cerebrovascular endothelium. Utrophin forms the utrophin-glycoprotein complex (UGC) by interacting with dystroglycans (DGs) and sarcoglycan-dystroglycans, as well as sarcoglycan and sarcospan (SG-SSPN) subcomplexes. It may act as a scaffolding protein that stabilizes lipid microdomains and clusters mechanosensitive channel subunits, and link the F-actin cytoskeleton to the cell membrane via the associated glycoprotein complex. Like dystrophin, utrophin has a compact cluster of domains comprising four EF-hand-like motifs and a ZZ-domain, followed by a looser region with two coiled-coils. These domains are believed to be involved in protein-protein interactions. In addition, it contains two syntrophin binding sites (SBSs) and a long N-terminal extension that comprises two actin-binding calponin homology (CH) domains, up to 24 spectrin repeats (SRs), and a WW domain. However, utrophin lacks the intrinsic microtubule binding activity of dystrophin SRs. This model corresponds to the first CH domain.	107
409082	cd21233	CH_DMD_rpt2	second calponin homology (CH) domain found in dystrophin and similar proteins. Dystrophin, encoded by the DMD gene, is a large, submembrane cytoskeletal protein that is the main component of the dystrophin-glycoprotein complex (DGC) in skeletal muscles. It links the transmembrane DGC to the actin cytoskeleton through binding strongly to the cytoplasmic tail of beta-dystroglycan, the transmembrane subunit of a highly O-glycosylated cell-surface protein. It is involved in maintaining the structural integrity of cells, as well as in the formation of the blood-brain barrier (BBB). Mutations in dystrophin lead to Duchenne muscular dystrophy (DMD). Moreover, dystrophin deficiency is associated with abnormal cerebral diffusion and perfusion, as well as with acute Trypanosoma cruzi infection. The dystrophin subfamily has been characterized by a compact cluster of domains comprising four EF-hand-like motifs and a ZZ-domain, followed by a looser region with two coiled-coils. These domains are believed to be involved in protein-protein interactions. In addition, dystrophin contains two syntrophin binding sites (SBSs) and a long N-terminal extension that comprises two actin-binding calponin homology (CH) domains, approximately 24 spectrin repeats (SRs) and a WW domain. The model corresponds to the second CH domain.	111
409083	cd21234	CH_UTRN_rpt2	second calponin homology (CH) domain found in utrophin and similar proteins. Utrophin, also called dystrophin-related protein 1 (DRP-1), is an autosomal dystrophin homolog that increases dystrophic muscle function and reduces pathology. It is broadly expressed at both the mRNA and protein levels, and occurs in the cerebrovascular endothelium. Utrophin forms the utrophin-glycoprotein complex (UGC) by interacting with dystroglycans (DGs) and sarcoglycan-dystroglycans, as well as sarcoglycan and sarcospan (SG-SSPN) subcomplexes. It may act as a scaffolding protein that stabilizes lipid microdomains and clusters mechanosensitive channel subunits, and link the F-actin cytoskeleton to the cell membrane via the associated glycoprotein complex. Like dystrophin, utrophin has a compact cluster of domains comprising four EF-hand-like motifs and a ZZ-domain, followed by a looser region with two coiled-coils. These domains are believed to be involved in protein-protein interactions. In addition, it contains two syntrophin binding sites (SBSs) and a long N-terminal extension that comprises two actin-binding calponin homology (CH) domains, up to 24 spectrin repeats (SRs), and a WW domain. However, utrophin lacks the intrinsic microtubule binding activity of dystrophin SRs. This model corresponds to the second CH domain.	104
409084	cd21235	CH_PLEC_rpt1	first calponin homology (CH) domain found in plectin and similar proteins. Plectin, also called PCN, PLTN, hemidesmosomal protein 1 (HD1), or plectin-1, is a structural component of muscle. It interlinks intermediate filaments with microtubules and microfilaments, and anchors intermediate filaments to desmosomes or hemidesmosomes. It can also bind muscle proteins such as actin to membrane complexes in muscle. Plectin contains two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs.	119
409085	cd21236	CH_DYST_rpt1	first calponin homology (CH) domain found in dystonin and similar proteins. Dystonin, also called 230 kDa bullous pemphigoid antigen, 230/240 kDa bullous pemphigoid antigen, bullous pemphigoid antigen 1 (BPA or BPAG1), dystonia musculorum protein, or hemidesmosomal plaque protein, is a cytoskeletal linker protein that acts as an integrator of intermediate filaments, actin, and microtubule cytoskeleton networks. It is required for anchoring either intermediate filaments to the actin cytoskeleton in neural and muscle cells, or keratin-containing intermediate filaments to hemidesmosomes in epithelial cells. Dystonin contains two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs.	128
409086	cd21237	CH_MACF1_rpt1	first calponin homology (CH) domain found in microtubule-actin cross-linking factor 1, isoforms 1/2/3/5 (MACF1) and similar proteins. MACF1, also called 620 kDa actin-binding protein (ABP620), actin cross-linking family protein 7 (ACF7), macrophin-1, or trabeculin-alpha, is a large protein containing numerous spectrin and leucine-rich repeat (LRR) domains. It facilitates actin-microtubule interactions at the cell periphery and couples the microtubule network to cellular junctions. MACF1 contains two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs.	118
409087	cd21238	CH_PLEC_rpt2	second calponin homology (CH) domain found in plectin and similar proteins. Plectin, also called PCN, PLTN, hemidesmosomal protein 1 (HD1), or plectin-1, is a structural component of muscle. It interlinks intermediate filaments with microtubules and microfilaments and anchors intermediate filaments to desmosomes or hemidesmosomes. It can also bind muscle proteins such as actin to membrane complexes in muscle. Plectin contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs.	106
409088	cd21239	CH_DYST_rpt2	second calponin homology (CH) domain found in dystonin and similar proteins. Dystonin, also called 230 kDa bullous pemphigoid antigen, 230/240 kDa bullous pemphigoid antigen, bullous pemphigoid antigen 1 (BPA or BPAG1), dystonia musculorum protein, or hemidesmosomal plaque protein, is a cytoskeletal linker protein that acts as an integrator of intermediate filaments, actin, and microtubule cytoskeleton networks. It is required for anchoring either intermediate filaments to the actin cytoskeleton in neural and muscle cells, or keratin-containing intermediate filaments to hemidesmosomes in epithelial cells. Dystonin contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs.	104
409089	cd21240	CH_MACF1_rpt2	second calponin homology (CH) domain found in microtubule-actin cross-linking factor 1, isoforms 1/2/3/5 (MACF1) and similar proteins. MACF1, also called 620 kDa actin-binding protein (ABP620), actin cross-linking family protein 7 (ACF7), macrophin-1, or trabeculin-alpha, is a large protein containing numerous spectrin and leucine-rich repeat (LRR) domains. It facilitates actin-microtubule interactions at the cell periphery and couples the microtubule network to cellular junctions. MACF1 contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs.	107
409090	cd21241	CH_SYNE1_rpt1	first calponin homology (CH) domain found in synaptic nuclear envelope protein 1 and similar proteins. Synaptic nuclear envelope protein 1 (SYNE-1), also called nesprin-1, enaptin, KASH domain-containing protein 1 (KASH1), myocyte nuclear envelope protein 1 (MYNE-1), or nuclear envelope spectrin repeat protein 1, is a multi-isomeric modular protein which forms a linking network between organelles and the actin cytoskeleton to maintain subcellular spatial organization. SYNE-1 also acts as a component of the LINC (LInker of Nucleoskeleton and Cytoskeleton) complex, which is involved in the connection between the nuclear lamina and the cytoskeleton. SYNE-1 contains two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs.	113
409091	cd21242	CH_SYNE2_rpt1	first calponin homology (CH) domain found in synaptic nuclear envelope protein 2. Synaptic nuclear envelope protein 2 (SYNE-2), also called nesprin-2, KASH domain-containing protein 2 (KASH2), nuclear envelope spectrin repeat protein 2, nucleus and actin connecting element protein, or protein NUANCE, is a multi-isomeric modular protein which forms a linking network between organelles and the actin cytoskeleton to maintain subcellular spatial organization. SYNE-2 also acts as a component of the LINC (LInker of Nucleoskeleton and Cytoskeleton) complex, which is involved in the connection between the nuclear lamina and the cytoskeleton. SYNE-2 contains two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs.	111
409092	cd21243	CH_SYNE1_rpt2	second calponin homology (CH) domain found in synaptic nuclear envelope protein 1 (SYNE-1) and similar proteins. SYNE-1, also called nesprin-1, enaptin, KASH domain-containing protein 1 (KASH1), myocyte nuclear envelope protein 1 (MYNE-1), or nuclear envelope spectrin repeat protein 1, is a multi-isomeric modular protein which forms a linking network between organelles and the actin cytoskeleton to maintain subcellular spatial organization. SYNE-1 also acts as a component of the LINC (LInker of Nucleoskeleton and Cytoskeleton) complex, which is involved in the connection between the nuclear lamina and the cytoskeleton. SYNE-1 contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs.	109
409093	cd21244	CH_SYNE2_rpt2	second calponin homology (CH) domain found in synaptic nuclear envelope protein 2 (SYNE-2) and similar proteins. SYNE-2, also called nesprin-2, KASH domain-containing protein 2 (KASH2), nuclear envelope spectrin repeat protein 2, nucleus and actin connecting element protein, or protein NUANCE, is a multi-isomeric modular protein which forms a linking network between organelles and the actin cytoskeleton to maintain subcellular spatial organization. SYNE-2 also acts as a component of the LINC (LInker of Nucleoskeleton and Cytoskeleton) complex, which is involved in the connection between the nuclear lamina and the cytoskeleton. SYNE-2 contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs.	109
409094	cd21245	CH_CLMN_rpt2	second calponin homology (CH) domain found in calmin and similar proteins. Calmin, also called calponin-like transmembrane domain protein, is a protein with calponin homology (CH) and transmembrane domains expressed in maturing spermatogenic cells. It may be involved in the development and/or maintenance of neuronal functions. Calmin contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs.	106
409095	cd21246	CH_SPTB-like_rpt1	first calponin homology (CH) domain found in the beta-I spectrin-like subfamily. The beta-I spectrin-like family includes beta-I, -II, -III and -IV spectrins. Spectrin is an actin crosslinking and molecular scaffold protein that links the plasma membrane to the actin cytoskeleton, and functions in the determination of cell shape, arrangement of transmembrane proteins, and organization of organelles. It is composed of two antiparallel dimers of alpha- and beta- subunits. Beta-I spectrin, also called spectrin beta chain, erythrocytic (SPTB), may be involved in anaemia pathogenesis. Beta-II spectrin, also called spectrin beta chain, non-erythrocytic 1 (SPTBN1), or fodrin beta chain, is a component of fodrin, which is the general spectrin-like protein that seems to be involved in secretion. Fodrin interacts with calmodulin in a calcium-dependent manner and is thus a candidate for the calcium-dependent movement of the cytoskeleton at the membrane. Beta-III spectrin, also called spectrin beta chain, non-erythrocytic 2 (SPTBN2), or spinocerebellar ataxia 5 protein (SCA5), may play a crucial role as a longer actin-membrane cross-linker or fulfill the need for greater extensible flexibility than can be provided by the other smaller conventional spectrins. Beta-IV spectrin is also called spectrin, non-erythroid beta chain 3 (SPTBN3) or spectrin beta chain, non-erythrocytic 4 (SPTBN4). Its mutation associates with congenital myopathy, neuropathy, and central deafness. Members of this subfamily contain two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs.	117
409096	cd21247	CH_SPTBN5_rpt1	first calponin homology (CH) domain found in spectrin beta chain, non-erythrocytic 5 (SPTBN5) and similar proteins. Spectrin is an actin crosslinking and molecular scaffold protein that links the plasma membrane to the actin cytoskeleton, and functions in the determination of cell shape, arrangement of transmembrane proteins, and organization of organelles. It is composed of two antiparallel dimers of alpha- and beta- subunits. SPTBN5, also called beta-V spectrin, is a mammalian ortholog of Drosophila beta H spectrin that may play a crucial role as a longer actin-membrane cross-linker or to fulfill the need for greater extensible flexibility than can be provided by the other smaller conventional spectrins. SPTBN5 contains two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs.	125
409097	cd21248	CH_SPTB_like_rpt2	second calponin homology (CH) domain found in the beta-I spectrin-like subfamily. The beta-I spectrin-like family includes beta-I, -II, -III and -IV spectrins. Spectrin is an actin crosslinking and molecular scaffold protein that links the plasma membrane to the actin cytoskeleton, and functions in the determination of cell shape, arrangement of transmembrane proteins, and organization of organelles. It is composed of two antiparallel dimers of alpha- and beta- subunits. Beta-I spectrin, also called spectrin beta chain, erythrocytic (SPTB), may be involved in anaemia pathogenesis. Beta-II spectrin, also called spectrin beta chain, non-erythrocytic 1 (SPTBN1), or fodrin beta chain, is a component of fodrin, which is the general spectrin-like protein that seems to be involved in secretion. Fodrin interacts with calmodulin in a calcium-dependent manner and is thus a candidate for the calcium-dependent movement of the cytoskeleton at the membrane. Beta-III spectrin, also called spectrin beta chain, non-erythrocytic 2 (SPTBN2), or spinocerebellar ataxia 5 protein (SCA5), may play a crucial role as a longer actin-membrane cross-linker or fulfill the need for greater extensible flexibility than can be provided by the other smaller conventional spectrins. Beta-IV spectrin is also called spectrin, non-erythroid beta chain 3 (SPTBN3) or spectrin beta chain, non-erythrocytic 4 (SPTBN4). Its mutation associates with congenital myopathy, neuropathy, and central deafness. Members of this subfamily contain two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs.	105
409098	cd21249	CH_SPTBN5_rpt2	second calponin homology (CH) domain found in spectrin beta chain, non-erythrocytic 5 (SPTBN5) and similar proteins. Spectrin is an actin crosslinking and molecular scaffold protein that links the plasma membrane to the actin cytoskeleton, and functions in the determination of cell shape, arrangement of transmembrane proteins, and organization of organelles. It is composed of two antiparallel dimers of alpha- and beta- subunits. SPTBN5, also called beta-V spectrin, is a mammalian ortholog of Drosophila beta H spectrin that may play a crucial role as a longer actin-membrane cross-linker or to fulfill the need for greater extensible flexibility than can be provided by the other smaller conventional spectrins. SPTBN5 contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs.	109
409099	cd21250	CH_MICAL2	calponin homology (CH) domain found in molecule interacting with CasL protein 2. MICAL-2 is a nuclear [F-actin]-monooxygenase that promotes depolymerization of F-actin by mediating oxidation of specific methionine residues on actin to form methionine-sulfoxide, resulting in actin filament disassembly and preventing repolymerization. In the absence of actin, it also functions as a NADPH oxidase producing H(2)O(2). MICAL-2 acts as a key regulator of the serum response factor (SRF) signaling pathway elicited by nerve growth factor and serum. It mediates oxidation and subsequent depolymerization of nuclear actin, leading to the increased MKL1/MRTF-A presence in the nucleus, promoting SRF:MKL1/MRTF-A-dependent gene transcription. MICAL-2 contains a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs.	110
409100	cd21251	CH_MICAL3	calponin homology (CH) domain found in molecule interacting with CasL protein 3. MICAL-3 is a [F-actin]-monooxygenase that promotes depolymerization of F-actin by mediating oxidation of specific methionine residues on actin to form methionine-sulfoxide, resulting in actin filament disassembly and preventing repolymerization. In the absence of actin, it also functions as a NADPH oxidase producing H(2)O(2). MICAL-3 seems to act as a Rab effector protein and plays a role in vesicle trafficking. It is involved in exocytic vesicle tethering and fusion. MICAL3 contains a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs.	111
409101	cd21252	CH_MICALL1	calponin homology (CH) domain found in MICAL-like protein 1. MICAL-like protein 1 (MICAL-L1), also called molecule interacting with Rab13 (MIRab13), is a probable lipid-binding protein with higher affinity for phosphatidic acid, a lipid enriched in recycling endosome membranes. It is a tubular endosomal membrane hub that connects Rab35 and Arf6 with Rab8a. It may be involved in a late step of receptor-mediated endocytosis regulating endocytosed-EGF receptor trafficking. Alternatively, it may regulate slow endocytic recycling of endocytosed proteins back to the plasma membrane. MICAL-L1 may indirectly play a role in neurite outgrowth. It contains a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs.	107
409102	cd21253	CH_MICALL2	calponin homology (CH) domain found in MICAL-like protein 2 and similar proteins. MICAL-like protein 2 (MICAL-L2), also called junctional Rab13-binding protein (JRAB), or molecule interacting with CasL-like 2, acts as an effector of small Rab GTPases which is involved in junctional complexes assembly through the regulation of cell adhesion molecule transport to the plasma membrane, and actin cytoskeleton reorganization. It regulates the endocytic recycling of occludins, claudins, and E-cadherin to the plasma membrane and may thereby regulate the establishment of tight junctions and adherens junctions. Members of this subfamily contain a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs.	106
409103	cd21254	CH_EHBP1	calponin homology (CH) domain found in EH domain-binding protein 1 and similar proteins. EHBP1 is a regulator of endocytic recycling and may play a role in actin reorganization by linking clathrin-mediated endocytosis to the actin cytoskeleton. It may act as an effector of small GTPases, including RAB-10 (Rab10), and play a role in vesicle trafficking. EHBP1 is associated with aggressive prostate cancer and insulin-stimulated trafficking and cell migration. Members of this subfamily contain a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs.	107
409104	cd21255	CH_EHBP1L1	calponin homology (CH) domain found in EH domain-binding protein 1-like protein 1 and similar proteins. EHBP1L1 may act as Rab effector protein and play a role in vesicle trafficking. It coordinates Rab8 and Bin1 to regulate apical-directed transport in polarized epithelial cells. Members of this subfamily contain a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs.	105
409105	cd21256	CH_CYTSA	calponin homology (CH) domain found in cytospin-A. Cytospin-A, also called renal carcinoma antigen NY-REN-22, or sperm antigen with calponin homology and coiled-coil domains 1-like, or SPECC1-like protein (SPECC1L), is involved in cytokinesis and spindle organization. It may play a role in actin cytoskeleton organization and microtubule stabilization and hence, is required for proper cell adhesion and migration. Cytospin-A contains a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs.	119
409106	cd21257	CH_CYTSB	calponin homology (CH) domain found in cytospin-B. Cytospin-B, also called nuclear structure protein 5 (NSP5), or sperm antigen HCMOGT-1, or sperm antigen with calponin homology and coiled-coil domains 1 (SPECC1), is a novel fusion Cytospin-B that contains a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs.	112
409107	cd21258	CH_SMTNA	calponin homology (CH) domain found in smoothelin-A and similar proteins. Smoothelins are actin-binding cytoskeletal proteins that are abundantly expressed in healthy visceral (smoothelin-A) and vascular (smoothelin-B) smooth muscle. This model corresponds to the single CH domain of smoothelin-A. CH domains are actin filament (F-actin) binding motifs.	111
409108	cd21259	CH_SMTNB	calponin homology (CH) domain found in smoothelin-B and similar proteins. Smoothelins are actin-binding cytoskeletal proteins that are abundantly expressed in healthy visceral (smoothelin-A) and vascular (smoothelin-B) smooth muscle. The human SMTN gene encodes smoothelin-A and smoothelin-B. This model corresponds to the single CH domain of smoothelin-B. CH domains are actin filament (F-actin) binding motifs.	112
409109	cd21260	CH_SMTNL1	calponin homology (CH) domain found in smoothelin-like protein 1. Smoothelin-like protein 1 (SMTNL1), also called calponin homology-associated smooth muscle protein (CHASM), plays a role in the regulation of contractile properties of both striated and smooth muscles. It can bind to calmodulin and tropomyosin. When it is unphosphorylated, SMTNL1 may inhibit myosin dephosphorylation. SMTNL1 contains a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs.	116
409110	cd21261	CH_SMTNL2	calponin homology (CH) domain found in smoothelin-like protein 2. Smoothelin-like protein 2 (SMTNL2) is highly expressed in skeletal muscle and could be associated with differentiating myocytes. It contains a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs.	107
409111	cd21262	CH_VAV1	calponin homology (CH) domain found in VAV1 protein. VAV1 is expressed predominantly in the hematopoietic system and it plays an important role in the development and activation of B and T cells. It is activated by tyrosine phosphorylation to function as a guanine nucleotide exchange factor (GEF) for Rho GTPases following cell surface receptor activation, triggering various effects such as cytoskeletal reorganization, transcription regulation, cell cycle progression, and calcium mobilization. It also serves as a scaffold protein and has been shown to interact with Ku70, Socs1, Janus kinase 2, SIAH2, S100B, Abl gene, ZAP-70, SLP76, and Syk, among others. VAV proteins contain several domains that enable their function: N-terminal calponin homology (CH), acidic, RhoGEF (also called Dbl-homologous or DH), Pleckstrin Homology (PH), C1 (zinc finger), SH2, and two SH3 domains. This model corresponds to the CH domain, an actin-binding domain which is present as a single copy in VAV1 protein.	120
409112	cd21263	CH_VAV2	calponin homology (CH) domain found in VAV2 protein and similar proteins. VAV2 is widely expressed and functions as a guanine nucleotide exchange factor (GEF) for RhoA, RhoB and RhoG and also activates Rac1 and Cdc42. It is implicated in many cellular and physiological functions including blood pressure control, eye development, neurite outgrowth and branching, EGFR endocytosis and degradation, and cell cluster morphology, among others. It has been reported to associate with Nek3. VAV proteins contain several domains that enable their function: N-terminal calponin homology (CH), acidic, RhoGEF (also called Dbl-homologous or DH), Pleckstrin Homology (PH), C1 (zinc finger), SH2, and two SH3 domains. This model corresponds to the CH domain, an actin-binding domain which is present as a single copy in VAV2 protein.	119
409113	cd21264	CH_VAV3	calponin homology (CH) domain found in VAV3 protein and similar proteins. VAV3 is ubiquitously expressed and functions as a phosphorylation-dependent guanine nucleotide exchange factor (GEF) for RhoA, RhoG, and Rac1. Its function has been implicated in the hematopoietic, bone, cerebellar, and cardiovascular systems. VAV3 is essential in axon guidance in neurons that control blood pressure and respiration. It is overexpressed in prostate cancer cells and it plays a role in regulating androgen receptor transcriptional activity. VAV proteins contain several domains that enable their function: N-terminal calponin homology (CH), acidic, RhoGEF (also called Dbl-homologous or DH), Pleckstrin Homology (PH), C1 (zinc finger), SH2, and two SH3 domains. This model corresponds to the CH domain, an actin-binding domain which is present as a single copy in VAV3 protein.	117
409114	cd21265	CH_alphaPIX	calponin homology (CH) domain found in alpha-Pak Interactive eXchange factor. Alpha-Pak Interactive eXchange factor (alpha-PIX), also called PAK-interacting exchange factor alpha, Rho guanine nucleotide exchange factor 6 (ARHGEF6), Rac/Cdc42 guanine nucleotide exchange factor 6, or Cool (Cloned out of Library)-2, activates small GTPases by exchanging bound GDP for free GTP. It acts as a GEF for both Cdc42 and Rac1, and is localized in dendritic spines where it regulates spine morphogenesis. It controls dendritic length and spine density in the hippocampus. Mutations in the ARHGEF6 gene cause X-linked intellectual disability in humans. Alpha-PIX contains a single copy of the CH domain at its N-terminus. CH domains are actin filament (F-actin) binding motifs.	117
409115	cd21266	CH_betaPIX	calponin homology (CH) domain found in beta-Pak Interactive eXchange factor. Beta-Pak Interactive eXchange factor (beta-PIX), also called PAK-interacting exchange factor beta, Rho guanine nucleotide exchange factor 7 (ARHGEF7), p85, or Cool (Cloned out of Library)-1, activates small GTPases by exchanging bound GDP for free GTP. It acts as a GEF for both Cdc42 and Rac1, and plays important roles in regulating neuroendocrine exocytosis, focal adhesion maturation, cell migration, synaptic vesicle localization, and insulin secretion. Beta-PIX contains a single copy of the CH domain at its N-terminus. CH domains are actin filament (F-actin) binding motifs.	112
409116	cd21267	CH_GAS2	calponin homology (CH) domain found in growth arrest-specific protein 2. Growth arrest-specific protein 2 (GAS-2) may play a role in apoptosis by acting as a cell death substrate for caspases. It contains a single copy of the CH domain at the N-terminal region. CH domains are actin filament (F-actin) binding motifs.	136
409117	cd21268	CH_GAS2L1_2	calponin homology (CH) domain found in GAS2-like protein 1 (GAS2L1), GAS2L2, and similar proteins. This subfamily includes GAS2L1 (also called GAS2-related protein on chromosome 22 or growth arrest-specific protein 2-like 1) and GAS2L2 (also called GAS2-related protein on chromosome 17 or growth arrest-specific protein 2-like 2). They may be involved in the cross-linking of microtubules and microfilaments. Members of this subfamily contain a single copy of the CH domain at the N-terminus. CH domains are actin filament (F-actin) binding motifs.	142
409118	cd21269	CH_GAS2L3	calponin homology (CH) domain found in growth arrest-specific protein 2-like 3. Growth arrest-specific protein 2-like 3 (GAS2L3), also called GAS2-like protein 3, is a cytoskeletal linker protein that may promote and stabilize the formation of the actin and microtubule network. It contains a single copy of the CH domain at the N-terminus. CH domains are actin filament (F-actin) binding motifs.	130
409119	cd21270	CH_LRCH1	calponin homology (CH) domain found in leucine-rich repeat and calponin homology domain-containing protein 1. Leucine-rich repeat and calponin homology domain-containing protein 1 (LRCH1), also called calponin homology domain-containing protein 1, or neuronal protein 81 (NP81), acts as a negative regulator of GTPase CDC42 by sequestering CDC42-guanine exchange factor DOCK8. LRCH1 contains a single copy of the CH domain at the C-terminus. CH domains are actin filament (F-actin) binding motifs.	112
409120	cd21271	CH_LRCH2	calponin homology (CH) domain found in leucine-rich repeat and calponin homology domain-containing protein 2. Leucine-rich repeat and calponin homology domain-containing protein 2 (LRCH2) may play a role in the organization of the cytoskeleton. It contains a single copy of the CH domain at the C-terminus. CH domains are actin filament (F-actin) binding motifs.	111
409121	cd21272	CH_LRCH3	calponin homology (CH) domain found in leucine-rich repeat and calponin homology domain-containing protein 3. Leucine-rich repeat and calponin homology domain-containing protein 3 (LRCH3) is part of the DISP (DOCK7-Induced Septin disPlacement) complex. It may regulate the association of septins with actin and thereby regulate the actin cytoskeleton. LRCH3 contains a single copy of the CH domain at the C-terminus. CH domains are actin filament (F-actin) binding motifs.	109
409122	cd21273	CH_LRCH4	calponin homology (CH) domain found in leucine-rich repeat and calponin homology domain-containing protein 4. Leucine-rich repeat and calponin homology domain-containing protein 4 (LRCH4), also called leucine-rich repeat neuronal protein 4, or leucine-rich neuronal protein, acts as a novel Toll-like receptor (TLR) accessory protein that regulates the innate immune response. LRCH4 contains a single copy of the CH domain at the C-terminus. CH domains are actin filament (F-actin) binding motifs.	109
409123	cd21274	CH_IQGAP1	calponin homology (CH) domain found in Ras GTPase-activating-like protein IQGAP1. IQ motif containing GTPase activating protein 1 (IQGAP1), also called p195, is a homodimeric protein that is widely expressed among vertebrate cell types from early embryogenesis. It plays a crucial role in regulating the dynamics and assembly of the actin cytoskeleton. It belongs to the IQGAP family, which consists of multi-domain proteins having a calponin-homology (CH) domain which binds F-actin, IQGAP-specific repeats, a single WW domain, four IQ motifs that mediate interactions with calmodulin, and a RasGAP related domain that binds active Rho family GTPases. IQGAP1 negatively regulates Ras family GTPases by stimulating their intrinsic GTPase activity. It lacks GAP activity. Both IQGAP1 and IQGAP2 specifically bind to Cdc42 and Rac1, but not to RhoA. Despite similarities to part of the sequence of RasGAP, neither IQGAP1 nor IQGAP2 interacts with Ras. IQGAP1 contains a single copy of the CH domain at the N-terminus.	154
409124	cd21275	CH_IQGAP2	calponin homology (CH) domain found in Ras GTPase-activating-like protein IQGAP2. IQ motif containing GTPase activating protein 2 (IQGAP2) is a member of the IQGAP family, which consists of multi-domain proteins having a calponin-homology (CH) domain which binds F-actin, IQGAP-specific repeats, a single WW domain, four IQ motifs that mediate interactions with calmodulin, and a RasGAP related domain that binds active Rho family GTPases. IQGAP2 binds to activated Cdc42 and Rac1 but does not seem to stimulate their GTPase activity. It associates with calmodulin. IQGAP2 contains a single copy of the CH domain at the N-terminus.	156
409125	cd21276	CH_IQGAP3	calponin homology (CH) domain found in Ras GTPase-activating-like protein IQGAP3. IQ motif containing GTPase activating protein 3 (IQGAP3) associates with Ras GTP-binding proteins. It regulates the organization of the cytoskeleton under the regulation of Rac1 and Cdc42 in neuronal cells. The depletion of IQGAP3 is shown to impair neurite or axon outgrowth in neuronal cells with disorganized cytoskeleton. It belongs to the IQGAP family, which consists of multi-domain proteins having a calponin-homology (CH) domain which binds F-actin, IQGAP-specific repeats, a single WW domain, four IQ motifs that mediate interactions with calmodulin, and a RasGAP related domain that binds active Rho family GTPases. IQGAP3 contains a single copy of the CH domain at the N-terminus.	152
409126	cd21277	CH_LMO7	calponin homology (CH) domain found in LIM domain only protein 7. LIM domain only protein 7 (LMO-7), also called F-box only protein 20, or LOMP, is a transcription regulator for expression of many Emery-Dreifuss muscular dystrophy (EDMD)-relevant genes. It binds to alpha-actinin and AF6/afadin at adherens junctions for epithelial cell-cell adhesion. It contains a single copy of the CH domain at the N-terminus. CH domains are actin filament (F-actin) binding motifs.	116
409127	cd21278	CH_LIMCH1	calponin homology (CH) domain found in LIM and calponin homology domains-containing protein 1. LIM and calponin homology domains-containing protein 1 (LIMCH1) acts as an actin stress fiber-associated protein that activates the non-muscle myosin IIa complex by promoting the phosphorylation of its regulatory subunit MRLC/MYL9. It positively regulates actin stress fiber assembly and stabilizes focal adhesions, and therefore negatively regulates cell spreading and cell migration. LIMCH1 contains a single copy of the CH domain at the N-terminus. CH domains are actin filament (F-actin) binding motifs.	118
409128	cd21279	CH_TAGLN	calponin homology (CH) domain found in transgelin. Transgelin, also called 22 kDa actin-binding protein, protein WS3-10, or smooth muscle protein 22-alpha (SM22-alpha), acts as an actin cross-linking/gelling protein that may be involved in calcium interactions and in regulating contractile properties of the cell. It may also contribute to replicative senescence. Transgelin contains a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs.	121
409129	cd21280	CH_TAGLN2	calponin homology (CH) domain found in transgelin-2. Transgelin-2, also called epididymis tissue protein Li 7e, or SM22-alpha homolog, acts as an actin-binding protein that induces actin gelation and regulates the actin cytoskeleton. It may participate in the development and progression of multiple cancers. It contains a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs.	137
409130	cd21281	CH_TAGLN3	calponin homology (CH) domain found in transgelin-3. Transgelin-3, also called neuronal protein 22 (NP22), or neuronal protein NP25, may have a role in alcohol-related adaptations and may mediate regulatory signal transduction pathways in neurons. It contains a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs.	119
409131	cd21282	CH_CNN1	calponin homology (CH) domain found in calponin-1 and similar proteins. Calponin-1 (CNN1), also called basic calponin, or smooth muscle calponin H1, is a thin filament-associated protein that is implicated in the regulation and modulation of smooth muscle contraction. It is capable of binding to actin, calmodulin, troponin C, and tropomyosin. Calponin-1 contains a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs.	108
409132	cd21283	CH_CNN2	calponin homology (CH) domain found in calponin-2. Calponin-2 (CNN2), also called neutral calponin, or smooth muscle calponin H2, is an actin cytoskeleton-associated regulatory protein that inhibits the activity of myosin-ATPase and cytoskeleton dynamics. It contains a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs.	109
409133	cd21284	CH_CNN3	calponin homology (CH) domain found in calponin-3. Calponin-3 (CNN3), also called acidic isoform calponin, is an F-actin-binding protein that is expressed in the brain and has been shown to control dendritic spine morphology, density, and plasticity by regulating actin cytoskeletal reorganization and dynamics. It contains a single copy of the CH domain. CH domains are actin filament (F-actin) binding motifs.	111
409134	cd21285	CH_NAV2	calponin homology (CH) domain found in neuron navigator 2. Neuron navigator 2 (NAV2), also called helicase APC down-regulated 1 (HELAD1), pore membrane and/or filament-interacting-like protein 2 (POMFIL2), retinoic acid inducible in neuroblastoma 1 (RAINB1), Steerin-2 (STEERIN2), or Unc-53 homolog 2 (unc53H2), possesses 3' to 5' helicase activity and exonuclease activity. It is involved in neuronal development, specifically in the development of different sensory organs. NAV2 contains a single copy of the CH domain at the N-terminus. CH domains are actin filament (F-actin) binding motifs.	121
409135	cd21286	CH_NAV3	calponin homology (CH) domain found in neuron navigator 3. Neuron navigator 3 (NAV3), also called pore membrane and/or filament-interacting-like protein 1 (POMFIL1), Steerin-3 (STEERIN3), or Unc-53 homolog 3 (unc53H3), may regulate IL2 production by T-cells. It may be involved in neuron regeneration. NAV3 contains a single copy of the CH domain at the N-terminus. CH domains are actin filament (F-actin) binding motifs.	105
409136	cd21287	CH_ACTN1_rpt2	second calponin homology (CH) domain found in alpha-actinin-1. Alpha-actinin-1 (ACTN1), also called alpha-actinin cytoskeletal isoform, or non-muscle alpha-actinin-1, is an F-actin cross-linking protein which is thought to anchor actin to a variety of intracellular structures. ACTN1 is a bundling protein. Its mutations cause congenital macrothrombocytopenia. It contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs.	124
409137	cd21288	CH_ACTN2_rpt2	second calponin homology (CH) domain found in alpha-actinin-2. Alpha-actinin-2 (ACTN2), also called alpha-actinin skeletal muscle isoform 2, is an F-actin cross-linking protein which is thought to anchor actin to a variety of intracellular structures. ACTN2 is a bundling protein. Its mutations are associated with cardiomyopathies, as well as skeletal muscle disorder. It contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs.	124
409138	cd21289	CH_ACTN3_rpt2	second calponin homology (CH) domain found in alpha-actinin-3. Alpha-actinin-3 (ACTN3), also called alpha-actinin skeletal muscle isoform 3, is an F-actin cross-linking protein which is thought to anchor actin to a variety of intracellular structures. ACTN3 is a bundling protein. It is critical in anchoring the myofibrillar actin filaments and plays a key role in muscle contraction. It contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs.	124
409139	cd21290	CH_ACTN4_rpt2	second calponin homology (CH) domain found in alpha-actinin-4. Alpha-actinin-4 (ACTN4), also called non-muscle alpha-actinin 4, is an F-actin cross-linking protein which is thought to anchor actin to a variety of intracellular structures. It is associated with cell motility and cancer invasion. ACTN4 is probably involved in vesicular trafficking via its association with the CART complex, which is necessary for efficient transferrin receptor recycling but not for epidermal growth factor receptor (EGFR) degradation. It contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs.	125
409140	cd21291	CH_SpAIN1-like_rpt2	second calponin homology (CH) domain found in Schizosaccharomyces pombe alpha-actinin-like protein 1 and similar proteins. Schizosaccharomyces pombe alpha-actinin-like protein 1 (SpAIN1) binds to actin and is involved in actin-ring formation and organization. It plays a role in cytokinesis and is involved in septation. Members of this family contain two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs.	115
409141	cd21292	CH_PLS_rpt1	first calponin homology (CH) domain found in the plastin family. The plastin family includes plastin-1, -2, and -3, which are all actin-bundling proteins. Plastin-1, also called intestine-specific plastin, or I-plastin, is an actin-bundling protein in the absence of calcium. Plastin-2, also called L-plastin, LC64P, or lymphocyte cytosolic protein 1 (LCP-1), plays a role in the activation of T-cells in response to costimulation through TCR/CD3 and CD2 or CD28. It modulates the cell surface expression of IL2RA/CD25 and CD69. Plastin-3, also called T-plastin, is found in intestinal microvilli, hair cell stereocilia, and fibroblast filopodia. It may play a role in the regulation of bone development. Members of this family contain four copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs.	145
409142	cd21293	CH_AtFIM_like_rpt1	first calponin homology (CH) domain found in the Arabidopsis thaliana fimbrin family. The Arabidopsis thaliana fimbrin (AtFIM) family includes fimbrin-1, -2, -3, -4, and -5, which cross-link actin filaments (F-actin) in a calcium independent manner. They stabilize and prevent F-actin depolymerization mediated by profilin. They act as key regulators of actin cytoarchitecture, and are probably involved in the cell cycle, cell division, cell elongation, and cytoplasmic tractus. AtFIM5 is an actin bundling factor that is required for pollen germination and pollen tube growth. Members of this family contain four copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs.	116
409143	cd21294	CH_FIMB_rpt1	first calponin homology (CH) domain found in Saccharomyces cerevisiae fimbrin and similar proteins. Fimbrin binds to actin, and functionally associates with actin structures involved in the development and maintenance of cell polarity. Members of this family contain four copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs.	125
409144	cd21295	CH_PLS_rpt2	second calponin homology (CH) domain found in the family of plastin. The plastin family includes plastin-1, -2, and -3. Plastin-1, also called intestine-specific plastin, or I-plastin, is an actin-bundling protein in the absence of calcium. Plastin-2, also called L-plastin, or LC64P, or lymphocyte cytosolic protein 1 (LCP-1), is an actin-binding protein that plays a role in the activation of T-cells in response to costimulation through TCR/CD3 and CD2 or CD28. It modulates the cell surface expression of IL2RA/CD25 and CD69. Plastin-3, also called T-plastin, is an actin-bundling protein found in intestinal microvilli, hair cell stereocilia, and fibroblast filopodia. It may play a role in the regulation of bone development. Members of this family contain four copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs.	113
409145	cd21296	CH_AtFIM_like_rpt2	second calponin homology (CH) domain found in the Arabidopsis thaliana fimbrin family. The Arabidopsis thaliana fimbrin (AtFIM) family includes fimbrin-1, -2, -3, -4, and -5, which cross-link actin filaments (F-actin) in a calcium independent manner. They stabilize and prevent F-actin depolymerization mediated by profilin. They act as key regulators of actin cytoarchitecture, and are probably involved in the cell cycle, cell division, cell elongation, and cytoplasmic tractus. AtFIM5 is an actin bundling factor that is required for pollen germination and pollen tube growth. Members of this family contain four copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs.	109
409146	cd21297	CH_FIMB_rpt2	second calponin homology (CH) domain found in Saccharomyces cerevisiae fimbrin and similar proteins. Fimbrin binds to actin, and functionally associates with actin structures involved in the development and maintenance of cell polarity. Members of this family contain four copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs.	109
409147	cd21298	CH_PLS_rpt3	third calponin homology (CH) domain found in the plastin family. The plastin family includes plastin-1, -2, and -3. Plastin-1, also called intestine-specific plastin, or I-plastin, is an actin-bundling protein in the absence of calcium. Plastin-2, also called L-plastin, or LC64P, or lymphocyte cytosolic protein 1 (LCP-1), is an actin-binding protein that plays a role in the activation of T-cells in response to costimulation through TCR/CD3 and CD2 or CD28. It modulates the cell surface expression of IL2RA/CD25 and CD69. Plastin-3, also called T-plastin, is an actin-bundling protein found in intestinal microvilli, hair cell stereocilia, and fibroblast filopodia. It may play a role in the regulation of bone development. Members of this family contain four copies of the CH domain. This model corresponds to the third CH domain. CH domains are actin filament (F-actin) binding motifs.	117
409148	cd21299	CH_AtFIM_like_rpt3	third calponin homology (CH) domain found in the Arabidopsis thaliana fimbrin family. The Arabidopsis thaliana fimbrin (AtFIM) family includes fimbrin-1, -2, -3, -4, and -5, which cross-link actin filaments (F-actin) in a calcium independent manner. They stabilize and prevent F-actin depolymerization mediated by profilin. They act as key regulators of actin cytoarchitecture, probably involved in cell cycle, cell division, cell elongation and cytoplasmic tractus. AtFIM5 is an actin bundling factor that is required for pollen germination and pollen tube growth. Members of this family contain four copies of the CH domain. This model corresponds to the third CH domain. CH domains are actin filament (F-actin) binding motifs.	114
409149	cd21300	CH_FIMB_rpt3	third calponin homology (CH) domain found in Saccharomyces cerevisiae fimbrin and similar proteins. Fimbrin binds to actin, and functionally associates with actin structures involved in the development and maintenance of cell polarity. Members of this family contain four copies of the CH domain. This model corresponds to the third CH domain. CH domains are actin filament (F-actin) binding motifs.	119
409150	cd21301	CH_PLS_rpt4	fourth calponin homology (CH) domain found in the plastin family. The plastin family includes plastin-1, -2, and -3. Plastin-1, also called intestine-specific plastin, or I-plastin, is an actin-bundling protein in the absence of calcium. Plastin-2, also called L-plastin, or LC64P, or lymphocyte cytosolic protein 1 (LCP-1), is an actin-binding protein that plays a role in the activation of T-cells in response to costimulation through TCR/CD3 and CD2 or CD28. It modulates the cell surface expression of IL2RA/CD25 and CD69. Plastin-3, also called T-plastin, is an actin-bundling protein found in intestinal microvilli, hair cell stereocilia, and fibroblast filopodia. It may play a role in the regulation of bone development. Members of this family contain four copies of the CH domain. This model corresponds to the fourth CH domain. CH domains are actin filament (F-actin) binding motifs.	107
409151	cd21302	CH_AtFIM_like_rpt4	fourth calponin homology (CH) domain found in the Arabidopsis thaliana fimbrin family. The Arabidopsis thaliana fimbrin (AtFIM) family includes fimbrin-1, -2, -3, -4, and -5, which cross-link actin filaments (F-actin) in a calcium independent manner. They stabilize and prevent F-actin depolymerization mediated by profilin. They act as key regulators of actin cytoarchitecture, probably involved in cell cycle, cell division, cell elongation and cytoplasmic tractus. AtFIM5 is an actin bundling factor that is required for pollen germination and pollen tube growth. Members of this family contain four copies of the CH domain. This model corresponds to the fourth CH domain. CH domains are actin filament (F-actin) binding motifs.	109
409152	cd21303	CH_FIMB_rpt4	fourth calponin homology (CH) domain found in Saccharomyces cerevisiae fimbrin and similar proteins. Fimbrin binds to actin, and functionally associates with actin structures involved in the development and maintenance of cell polarity. Members of this family contain four copies of the CH domain. This model corresponds to the fourth CH domain. CH domains are actin filament (F-actin) binding motifs.	108
409153	cd21304	CH_PARVA_B_rpt1	first calponin homology (CH) domain found in the alpha/beta parvin subfamily. The alpha/beta parvin subfamily includes alpha-parvin and beta-parvin. Alpha-parvin, also called actopaxin, calponin-like integrin-linked kinase-binding protein (CH-ILKBP), or matrix-remodeling-associated protein 2, plays a role in sarcomere organization and in smooth muscle cell contraction. It is required for normal development of the embryonic cardiovascular system, and for normal septation of the heart outflow tract. Beta-parvin, also called affixin, is an adapter protein that plays a role in integrin signaling via ILK and in activation of the GTPases Cdc42 and Rac1 by guanine exchange factors, such as ARHGEF6. Both alpha-parvin and beta-parvin are involved in the reorganization of the actin cytoskeleton and the formation of lamellipodia, and both play roles in cell adhesion, cell spreading, establishment or maintenance of cell polarity, and cell migration. Members of this subfamily contain two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs.	107
409154	cd21305	CH_PARVG_rpt1	first calponin homology (CH) domain found in gamma-parvin. Gamma-parvin probably plays a role in the regulation of cell adhesion and cytoskeleton organization. It contains two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs.	106
409155	cd21306	CH_PARVA_B_rpt2	second calponin homology (CH) domain found in the alpha/beta parvin subfamily. The alpha/beta parvin subfamily includes alpha-parvin and beta-parvin. Alpha-parvin, also called actopaxin, calponin-like integrin-linked kinase-binding protein (CH-ILKBP), or matrix-remodeling-associated protein 2, plays a role in sarcomere organization and in smooth muscle cell contraction. It is required for normal development of the embryonic cardiovascular system, and for normal septation of the heart outflow tract. Beta-parvin, also called affixin, is an adapter protein that plays a role in integrin signaling via ILK and in activation of the GTPases Cdc42 and Rac1 by guanine exchange factors, such as ARHGEF6. Both alpha-parvin and beta-parvin are involved in the reorganization of the actin cytoskeleton and the formation of lamellipodia, and both play roles in cell adhesion, cell spreading, establishment or maintenance of cell polarity, and cell migration. Members of this subfamily contain two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs.	121
409156	cd21307	CH_PARVG_rpt2	second calponin homology (CH) domain found in gamma-parvin. Gamma-parvin probably plays a role in the regulation of cell adhesion and cytoskeleton organization. It contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs.	122
409157	cd21308	CH_FLNA_rpt1	first calponin homology (CH) domain found in filamin-A (FLN-A) and similar proteins. Filamin-A (FLN-A) is also called actin-binding protein 280 (ABP-280), alpha-filamin, endothelial actin-binding protein, filamin-1, or non-muscle filamin. It promotes orthogonal branching of actin filaments and links actin filaments to membrane glycoproteins. It also anchors various transmembrane proteins to the actin cytoskeleton and serves as a scaffold for a wide range of cytoplasmic signaling proteins. FLN-A contains two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs.	129
409158	cd21309	CH_FLNB_rpt1	first calponin homology (CH) domain found in filamin-B (FLN-B) and similar proteins. Filamin-B (FLN-B) is also called ABP-278, ABP-280 homolog, actin-binding-like protein, beta-filamin, filamin homolog 1 (Fh1), filamin-3, thyroid autoantigen, truncated actin-binding protein, or truncated ABP. It connects cell membrane constituents to the actin cytoskeleton. It may promote orthogonal branching of actin filaments and links actin filaments to membrane glycoproteins. It anchors various transmembrane proteins to the actin cytoskeleton. FLN-B contains two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs.	131
409159	cd21310	CH_FLNC_rpt1	first calponin homology (CH) domain found in filamin-C (FLN-C) and similar proteins. Filamin-C (FLN-C), also called FLNc, ABP-280-like protein, ABP-L, actin-binding-like protein, filamin-2, or gamma-filamin, is a muscle-specific filamin that plays a central role in muscle cells, probably by functioning as a large actin-cross-linking protein. It may be involved in reorganizing the actin cytoskeleton in response to signaling events, and may also display structural functions at the Z lines in muscle cells. FLN-C is critical for normal myogenesis and for maintaining the structural integrity of the muscle fibers. FLN-C contains two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs.	125
409160	cd21311	CH_dFLNA-like_rpt1	first calponin homology (CH) domain found in Drosophila melanogaster filamin-A (dFLNA) and similar proteins. Drosophila melanogaster filamin-A (dFLNA or dFLN-A), also called actin-binding protein 280 (ABP-280) or filamin-1, is involved in germline ring canal formation. It may tether actin microfilaments within the ovarian ring canal to the cell membrane and contributes to actin microfilament organization. dFLNA contains two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs.	124
409161	cd21312	CH_FLNA_rpt2	second calponin homology (CH) domain found in filamin-A (FLN-A) and similar proteins. Filamin-A (FLN-A) is also called actin-binding protein 280 (ABP-280), alpha-filamin, endothelial actin-binding protein, filamin-1, or non-muscle filamin. It promotes orthogonal branching of actin filaments and links actin filaments to membrane glycoproteins. It also anchors various transmembrane proteins to the actin cytoskeleton and serves as a scaffold for a wide range of cytoplasmic signaling proteins. FLN-A contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs.	114
409162	cd21313	CH_FLNB_rpt2	second calponin homology (CH) domain found in filamin-B (FLN-B) and similar proteins. Filamin-B (FLN-B) is also called ABP-278, ABP-280 homolog, actin-binding-like protein, beta-filamin, filamin homolog 1 (Fh1), filamin-3, thyroid autoantigen, truncated actin-binding protein, or truncated ABP. It connects cell membrane constituents to the actin cytoskeleton. It may promote orthogonal branching of actin filaments and links actin filaments to membrane glycoproteins. It anchors various transmembrane proteins to the actin cytoskeleton. FLN-B contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs.	110
409163	cd21314	CH_FLNC_rpt2	second calponin homology (CH) domain found in filamin-C (FLN-C) and similar proteins. Filamin-C (FLN-C), also called FLNc, ABP-280-like protein, ABP-L, actin-binding-like protein, filamin-2, or gamma-filamin, is a muscle-specific filamin that plays a central role in muscle cells, probably by functioning as a large actin-cross-linking protein. It may be involved in reorganizing the actin cytoskeleton in response to signaling events, and may also display structural functions at the Z lines in muscle cells. FLN-C is critical for normal myogenesis and for maintaining the structural integrity of the muscle fibers. FLN-C contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs.	115
409164	cd21315	CH_dFLNA-like_rpt2	second calponin homology (CH) domain found in Drosophila melanogaster filamin-A (dFLNA) and similar proteins. Drosophila melanogaster filamin-A (dFLNA or dFLN-A), also called actin-binding protein 280 (ABP-280) or filamin-1, is involved in germline ring canal formation. It may tether actin microfilaments within the ovarian ring canal to the cell membrane and contributes to actin microfilament organization. dFLNA contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs.	118
409165	cd21316	CH_SPTBN1_rpt1	first calponin homology (CH) domain found in spectrin beta chain, non-erythrocytic 1 (SPTBN1) and similar proteins. Spectrin is an actin crosslinking and molecular scaffold protein that links the plasma membrane to the actin cytoskeleton, and functions in the determination of cell shape, arrangement of transmembrane proteins, and organization of organelles. It is composed of two antiparallel dimers of alpha- and beta- subunits. SPTBN1, also called beta-II spectrin, fodrin beta chain, or spectrin, non-erythroid beta chain 1, is also a component of fodrin, which is the general spectrin-like protein that seems to be involved in secretion. Fodrin interacts with calmodulin in a calcium-dependent manner and is thus a candidate for the calcium-dependent movement of the cytoskeleton at the membrane. SPTBN1 contains two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs.	154
409166	cd21317	CH_SPTBN2_rpt1	first calponin homology (CH) domain found in spectrin beta chain, non-erythrocytic 2 (SPTBN2) and similar proteins. Spectrin is an actin crosslinking and molecular scaffold protein that links the plasma membrane to the actin cytoskeleton, and functions in the determination of cell shape, arrangement of transmembrane proteins, and organization of organelles. It is composed of two antiparallel dimers of alpha- and beta- subunits. SPTBN2, also called beta-III spectrin, or spinocerebellar ataxia 5 protein (SCA5), probably plays an important role in the neuronal membrane skeleton. Mutations in SPTBN2 are associated with spinocerebellar ataxia type 5. SPTBN2 contains two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs.	132
409167	cd21318	CH_SPTBN4_rpt1	first calponin homology (CH) domain found in spectrin beta chain, non-erythrocytic 4 (SPTBN4) and similar proteins. Spectrin is an actin crosslinking and molecular scaffold protein that links the plasma membrane to the actin cytoskeleton, and functions in the determination of cell shape, arrangement of transmembrane proteins, and organization of organelles. It is composed of two antiparallel dimers of alpha- and beta- subunits. SPTBN4, also called beta-IV spectrin, or spectrin, non-erythroid beta chain 3 (SPTBN3), is a novel spectrin isolated as an interactor of the receptor tyrosine phosphatase-like protein ICA512. Its mutation associates with congenital myopathy, neuropathy, and central deafness. SPTBN4 contains two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs.	139
409168	cd21319	CH_SPTB_rpt2	second calponin homology (CH) domain found in spectrin beta chain, erythrocytic (SPTB) and similar proteins. Spectrin is an actin crosslinking and molecular scaffold protein that links the plasma membrane to the actin cytoskeleton, and functions in the determination of cell shape, arrangement of transmembrane proteins, and organization of organelles. It is composed of two antiparallel dimers of alpha- and beta- subunits. SPTB, also called beta-I spectrin, may be involved in anaemia pathogenesis. SPTB contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs.	112
409169	cd21320	CH_SPTBN1_rpt2	second calponin homology (CH) domain found in spectrin beta chain, non-erythrocytic 1 (SPTBN1) and similar proteins. Spectrin is an actin crosslinking and molecular scaffold protein that links the plasma membrane to the actin cytoskeleton, and functions in the determination of cell shape, arrangement of transmembrane proteins, and organization of organelles. It is composed of two antiparallel dimers of alpha- and beta- subunits. SPTBN1, also called beta-II spectrin, fodrin beta chain, or spectrin, non-erythroid beta chain 1, is also a component of fodrin, which is the general spectrin-like protein that seems to be involved in secretion. Fodrin interacts with calmodulin in a calcium-dependent manner and is thus a candidate for the calcium-dependent movement of the cytoskeleton at the membrane. SPTBN1 contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs.	108
409170	cd21321	CH_SPTBN2_rpt2	second calponin homology (CH) domain found in spectrin beta chain, non-erythrocytic 2 (SPTBN2) and similar proteins. Spectrin is an actin crosslinking and molecular scaffold protein that links the plasma membrane to the actin cytoskeleton, and functions in the determination of cell shape, arrangement of transmembrane proteins, and organization of organelles. It is composed of two antiparallel dimers of alpha- and beta- subunits. SPTBN2, also called beta-III spectrin, or spinocerebellar ataxia 5 protein (SCA5), probably plays an important role in the neuronal membrane skeleton. Mutations in SPTBN2 are associated with spinocerebellar ataxia type 5. SPTBN2 contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs.	119
409171	cd21322	CH_SPTBN4_rpt2	second calponin homology (CH) domain found in spectrin beta chain, non-erythrocytic 4 (SPTBN4) and similar proteins. Spectrin is an actin crosslinking and molecular scaffold protein that links the plasma membrane to the actin cytoskeleton, and functions in the determination of cell shape, arrangement of transmembrane proteins, and organization of organelles. It is composed of two antiparallel dimers of alpha- and beta- subunits. SPTBN4, also called beta-IV spectrin, or spectrin, non-erythroid beta chain 3 (SPTBN3), is a novel spectrin isolated as an interactor of the receptor tyrosine phosphatase-like protein ICA512. Its mutation associates with congenital myopathy, neuropathy, and central deafness. SPTBN4 contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs.	130
409172	cd21323	CH_PLS1_rpt1	first calponin homology (CH) domain found in plastin-1. Plastin-1, also called intestine-specific plastin, or I-plastin, is an actin-bundling protein in the absence of calcium. It contains four copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs.	145
409173	cd21324	CH_PLS2_rpt1	first calponin homology (CH) domain found in plastin-2. Plastin-2, also called L-plastin, or LC64P, or lymphocyte cytosolic protein 1 (LCP-1), is an actin-binding protein that plays a role in the activation of T-cells in response to costimulation through TCR/CD3 and CD2 or CD28. It modulates the cell surface expression of IL2RA/CD25 and CD69. Plastin-2 contains four copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs.	145
409174	cd21325	CH_PLS3_rpt1	first calponin homology (CH) domain found in plastin-3. Plastin-3, also called T-plastin, is an actin-bundling protein found in intestinal microvilli, hair cell stereocilia, and fibroblast filopodia. It may play a role in the regulation of bone development. Plastin-3 contains four copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs.	148
409175	cd21326	CH_PLS1_rpt2	second calponin homology (CH) domain found in plastin-1. Plastin-1, also called intestine-specific plastin, or I-plastin, is an actin-bundling protein in the absence of calcium. It contains four copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs.	121
409176	cd21327	CH_PLS2_rpt2	second calponin homology (CH) domain found in plastin-2. Plastin-2, also called L-plastin, or LC64P, or lymphocyte cytosolic protein 1 (LCP-1), is an actin-binding protein that plays a role in the activation of T-cells in response to costimulation through TCR/CD3 and CD2 or CD28. It modulates the cell surface expression of IL2RA/CD25 and CD69. Plastin-2 contains four copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs.	125
409177	cd21328	CH_PLS3_rpt2	second calponin homology (CH) domain found in plastin-3. Plastin-3, also called T-plastin, is an actin-bundling protein found in intestinal microvilli, hair cell stereocilia, and fibroblast filopodia. It may play a role in the regulation of bone development. Plastin-3 contains four copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs.	122
409178	cd21329	CH_PLS1_rpt3	third calponin homology (CH) domain found in plastin-1. Plastin-1, also called intestine-specific plastin, or I-plastin, is an actin-bundling protein in the absence of calcium. It contains four copies of the CH domain. This model corresponds to the third CH domain. CH domains are actin filament (F-actin) binding motifs.	118
409179	cd21330	CH_PLS2_rpt3	third calponin homology (CH) domain found in plastin-2. Plastin-2, also called L-plastin, or LC64P, or lymphocyte cytosolic protein 1 (LCP-1), is an actin-binding protein that plays a role in the activation of T-cells in response to costimulation through TCR/CD3 and CD2 or CD28. It modulates the cell surface expression of IL2RA/CD25 and CD69. Plastin-2 contains four copies of the CH domain. This model corresponds to the third CH domain. CH domains are actin filament (F-actin) binding motifs.	125
409180	cd21331	CH_PLS3_rpt3	third calponin homology (CH) domain found in plastin-3. Plastin-3, also called T-plastin, is an actin-bundling protein found in intestinal microvilli, hair cell stereocilia, and fibroblast filopodia. It may play a role in the regulation of bone development. Plastin-3 contains four copies of the CH domain. This model corresponds to the third CH domain. CH domains are actin filament (F-actin) binding motifs.	134
409181	cd21332	CH_PLS1_rpt4	fourth calponin homology (CH) domain found in plastin-1. Plastin-1, also called intestine-specific plastin, or I-plastin, is an actin-bundling protein in the absence of calcium. It contains four copies of the CH domain. This model corresponds to the fourth CH domain. CH domains are actin filament (F-actin) binding motifs.	115
409182	cd21333	CH_PLS2_rpt4	fourth calponin homology (CH) domain found in plastin-2. Plastin-2, also called L-plastin, or LC64P, or lymphocyte cytosolic protein 1 (LCP-1), is an actin-binding protein that plays a role in the activation of T-cells in response to costimulation through TCR/CD3 and CD2 or CD28. It modulates the cell surface expression of IL2RA/CD25 and CD69. Plastin-2 contains four copies of the CH domain. This model corresponds to the fourth CH domain. CH domains are actin filament (F-actin) binding motifs.	115
409183	cd21334	CH_PLS3_rpt4	fourth calponin homology (CH) domain found in plastin-3. Plastin-3, also called T-plastin, is an actin-bundling protein found in intestinal microvilli, hair cell stereocilia, and fibroblast filopodia. It may play a role in the regulation of bone development. Plastin-3 contains four copies of the CH domain. This model corresponds to the fourth CH domain. CH domains are actin filament (F-actin) binding motifs.	112
409184	cd21335	CH_PARVA_rpt1	first calponin homology (CH) domain found in alpha-parvin. Alpha-parvin, also called actopaxin, calponin-like integrin-linked kinase-binding protein (CH-ILKBP), or matrix-remodeling-associated protein 2, plays a role in sarcomere organization and in smooth muscle cell contraction. It is required for normal development of the embryonic cardiovascular system, and for normal septation of the heart outflow tract. It is also involved in the reorganization of the actin cytoskeleton, the formation of lamellipodia and ciliogenesis, as well as in the establishment of cell polarity, cell adhesion, cell spreading, and directed cell migration. Alpha-parvin contains two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs.	115
409185	cd21336	CH_PARVB_rpt1	first calponin homology (CH) domain found in beta-parvin. Beta-parvin, also called affixin, is an adapter protein that plays a role in integrin signaling via ILK and in activation of the GTPases Cdc42 and Rac1 by guanine exchange factors, such as ARHGEF6. It is involved in the reorganization of the actin cytoskeleton and the formation of lamellipodia and also plays a role in cell adhesion, cell spreading, establishment or maintenance of cell polarity, and cell migration. Beta-parvin contains two copies of the CH domain. This model corresponds to the first CH domain. CH domains are actin filament (F-actin) binding motifs.	106
409186	cd21337	CH_PARVA_rpt2	second calponin homology (CH) domain found in alpha-parvin. Alpha-parvin, also called actopaxin, calponin-like integrin-linked kinase-binding protein (CH-ILKBP), or matrix-remodeling-associated protein 2, plays a role in sarcomere organization and in smooth muscle cell contraction. It is required for normal development of the embryonic cardiovascular system, and for normal septation of the heart outflow tract. It is also involved in the reorganization of the actin cytoskeleton, the formation of lamellipodia and ciliogenesis, as well as in the establishment of cell polarity, cell adhesion, cell spreading, and directed cell migration. Alpha-parvin contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs.	129
409187	cd21338	CH_PARVB_rpt2	second calponin homology (CH) domain found in beta-parvin. Beta-parvin, also called affixin, is an adapter protein that plays a role in integrin signaling via ILK and in activation of the GTPases Cdc42 and Rac1 by guanine exchange factors, such as ARHGEF6. It is involved in the reorganization of the actin cytoskeleton and the formation of lamellipodia and also plays a role in cell adhesion, cell spreading, establishment or maintenance of cell polarity, and cell migration. Beta-parvin contains two copies of the CH domain. This model corresponds to the second CH domain. CH domains are actin filament (F-actin) binding motifs.	130
410336	cd21339	PPP2R3	serine/threonine protein phosphatase 2A regulatory subunit B". Heterotrimeric serine/threonine protein phosphatase 2A (PP2A) consists of scaffolding (A), catalytic (C), and variable (B, B', and B") subunits. The variable subunits dictate subcellular localization and substrate specificity of the PP2A holoenzyme. This family includes PP2A regulatory B" subunits alpha, beta and gamma, encoded by PPP2R3A, PPP2R3B and PPP2R3C, respectively. It also includes subunit delta encoded by PPP2R3D in mouse. These B-family regulatory subunits play various roles including regulation of cytoskeletal assembly, neuronal differentiation, mitogen-activated protein kinase signaling, and apoptosis. Subunits alpha and beta contain a two-domain elongated structure with two calcium EF-hands which mediate Ca2+-dependent changes in phosphatase activity.	259
411060	cd21340	PPP1R42	protein phosphatase 1 regulatory subunit 42. Protein phosphatase 1 regulatory subunit 42 (PPP1R42), also known as leucine-rich repeat-containing protein 67 (lrrc67) or testis leucine-rich repeat (TLRR) protein, plays a role in centrosome separation. PPP1R42 has been shown to interact with the well-conserved signaling protein phosphatase-1 (PP1), thereby increasing PP1's activity, which counters centrosome separation. Inhibition of PPP1R42 expression increases the number of centrosomes per cell while its depletion reduces the activity of PP1 leading to activation of NEK2, the kinase responsible for phosphorylation of centrosomal linker proteins promoting centrosome separation.	220
411061	cd21341	TTC8_N	N-terminal domain of tetratricopeptide repeat domain 8. Tetratricopeptide repeat domain 8 (TTC8), also known as BBS8, has been directly linked to Bardet-Biedl syndrome, an autosomal recessive ciliopathy characterized by retinal degeneration, renal failure, obesity, diabetes, male infertility, polydactyly and cognitive impairment. Mutations in BBS8 cause early vision loss. In addition to C-terminal tetratricopeptide repeats, TTC8 also contains an N-terminal domain of unknown function.	139
409247	cd21342	Syt1_2_N	N-terminal domain of synaptotagmin-1 and -2. The synaptotagmins are integral membrane proteins of synaptic vesicles thought to serve as Ca(2+) sensors in the process of vesicular trafficking and exocytosis. Calcium binding to synaptotagmin-1 participates in triggering neurotransmitter release at the synapse. In general, synaptotagmins contain 2 calcium binding C2 domains. Synaptotagmin-1 and -2 have an additional N-terminal domain that has been shown to bind to Botulinum neurotoxin B.	93
394805	cd21343	ZBD_UPF1_nv_SF1_Hel-like	Cys/His rich zinc-binding domain (CH/ZBD) of eukaryotic UPF1 helicase, nidovirus SF1 helicases including coronavirus Nsp13 and arterivirus Nsp10, and related proteins. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands, and are classified based on the arrangement of conserved motifs into six superfamilies. Members of this family belong to helicase superfamily 1 (SF1) and include nidoviral helicases such as Severe Acute Respiratory Syndrome coronavirus (SARS) non-structural protein 13 (SARS-Nsp13) and equine arteritis virus (EAV) Nsp10, as well as eukaryotic UPF1 helicase. The CH/ZBD has 3 zinc-finger (ZnF1-3) motifs. UPF1 participates in nonsense-mediated mRNA decay (NMD), a pathway which degrades transcripts with premature termination codons. The CH/ZBD of UPF1 interacts with UPF2, a factor also involved in NMD. SARS-Nsp13 is a component of the viral RNA synthesis replication and transcription complex (RTC). UPF1, SARS-Nsp13 and EAV Nsp10 are multidomain proteins; their other domains include a 1B regulatory domain and a SF1 helicase core. The SARS-Nsp13 CH/ZBD is indispensable for helicase activity and interacts with SARS-Nsp12. SARS-Nsp12 can enhance the helicase activity of SARS-Nsp13 and can interact with SARS-Nsp13 on the third zinc finger motif of the CH/ZBD.	72
394813	cd21344	1B_UPF1_nv_SF1_Hel-like	1B domain of eukaryotic UPF1 helicase, nidovirus SF1 helicases including coronavirus Nsp13 and arterivirus Nsp10, and related proteins. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. Members of this family belong to helicase superfamily 1 (SF1) and include nidoviral helicases such as Severe Acute Respiratory Syndrome coronavirus (SARS) non-structural protein 13 (SARS-Nsp13), Equine arteritis virus (EAV) Nsp10, and eukaryotic UPF1 RNA helicase. SARS-Nsp13 is a component of the viral RNA synthesis replication and transcription complex (RTC). UPF1 participates in nonsense-mediated mRNA decay (NMD), a pathway which degrades transcripts with premature termination codons. UPF1, EAV Nsp10 and SARS-Nsp13 are multidomain proteins with an N-terminal Cys/His rich zinc-binding domain (CH/ZBD), a 1B domain and a SF1 helicase core. The 1B domain is involved in nucleic acid substrate binding; the 1B domain of EAV Nsp10 undergoes large conformational change upon substrate binding, and together with the 1A and 2A domains of the helicase core form a channel that accommodates the single stranded nucleic acids.	86
410596	cd21369	cwf21	cwf21 domain. The cwf21 domain is involved in mRNA splicing; it binds directly to the spliceosomal protein Prp8. Mutations in the cwf21 domain prevent its binding to Prp8. The domain is composed of two alpha helices. Proteins containing the cwf21 domain include complexed with CEF1 protein 21 (CWC21) from budding yeast, complexed with cdc5 protein 21 (CWF21) from fission yeast, as well as their orthologs, serine/arginine repetitive matrix proteins (SRRM2 and SRRM3) from vertebrates. This domain family also includes U2-associated protein SR140 from Eumetazoa, protein RRC1, and similar proteins from plants.	48
410597	cd21370	cwf21_SR140	cwf21 domain found in U2-associated protein SR140 and similar proteins. SR140, also called U2 snRNP-associated SURP motif-containing protein, U2SURP, or 140 kDa Ser/Arg-rich domain protein, is a putative splicing factor mainly found in higher eukaryotes. Although it was initially identified as a 17S U2 snRNP-associated protein, the molecular and physiological function of SR140 remains unclear. This model represents the cwf21 domain of SR140 and similar proteins. The cwf21 domain is involved in mRNA splicing; it binds directly to the spliceosomal protein Prp8.	50
410598	cd21371	cwf21_RRC1-like	cwf21 domain found in Arabidopsis thaliana protein RRC1 and similar proteins. RRC1, also called reduced red-light responses in cry1cry2 background 1, is a SR-like splicing factor required for phytochrome B (phyB) signal transduction and involved in phyB-dependent alternative splicing. This subfamily also includes protein RRC1-like, which may also function as a SR-like splicing factor. SR family splicing factors are characterized by the presence of a domain rich in arginine and serine dipeptides, called the RS domain. This model represents the cwf21 domain of RRC1 and similar proteins. The cwf21 domain is involved in mRNA splicing; it binds directly to the spliceosomal protein Prp8.	50
410599	cd21372	cwf21_CWC21-like	cwf21 domain found in fungal complexed with CEF1 protein 21 (CWC21) and similar proteins. This subfamily includes complexed with CEF1 protein 21 (CWC21) from budding yeast, complexed with cdc5 protein 21 (CWF21) from fission yeast, as well as their orthologs, serine/arginine repetitive matrix proteins (SRRM2 and SRRM3) from vertebrates. Both CWC21 and CWF21 are pre-mRNA-splicing factors that may function at or prior to the first catalytic step of splicing at the catalytic center of the spliceosome, together with ISY1. SRRM2 is required for pre-mRNA splicing as a component of the spliceosome. SRRM3 may play a role in regulating breast cancer cell invasiveness. It may be involved in RYBP-mediated breast cancer progression. Members of this family contain a cwf21 domain at the N-terminus. The cwf21 domain is involved in mRNA splicing; it binds directly to the spliceosomal protein Prp8.	49
410600	cd21373	cwf21_SRRM2-like	cwf21 domain found in serine/arginine repetitive matrix proteins, SRRM2, SRRM3 and similar proteins. This subfamily includes SRRM2 and SRRM3, both of which contain a cwf21 domain at the N-terminus. SRRM2, also called 300 kDa nuclear matrix antigen, serine/arginine-rich splicing factor-related nuclear matrix protein of 300 kDa, SR-related nuclear matrix protein of 300 kDa, Ser/Arg-related nuclear matrix protein of 300 kDa, splicing coactivator subunit SRm300, or Tax-responsive enhancer element-binding protein 803 (TaxREB803), is required for pre-mRNA splicing as a component of the spliceosome. SRRM3 may play a role in regulating breast cancer cell invasiveness. It may be involved in RYBP-mediated breast cancer progression. The cwf21 domain is involved in mRNA splicing; it binds directly to the spliceosomal protein Prp8.	50
410601	cd21375	cwf21_SRRM2	cwf21 domain found in serine/arginine repetitive matrix protein 2. Serine/arginine repetitive matrix protein 2 (SRRM2) is also called 300 kDa nuclear matrix antigen, serine/arginine-rich splicing factor-related nuclear matrix protein of 300 kDa, SR-related nuclear matrix protein of 300 kDa, Ser/Arg-related nuclear matrix protein of 300 kDa, splicing coactivator subunit SRm300, or Tax-responsive enhancer element-binding protein 803 (TaxREB803). It is required for pre-mRNA splicing as a component of the spliceosome. It contains a cwf21 domain at the N-terminus. The cwf21 domain is involved in mRNA splicing; it binds directly to the spliceosomal protein Prp8.	64
410602	cd21376	cwf21_SRRM3	cwf21 domain found in serine/arginine repetitive matrix protein 3 and similar proteins. Serine/arginine repetitive matrix protein 3 (SRRM3) may play a role in regulating breast cancer cell invasiveness. It may also be involved in RYBP-mediated breast cancer progression. SRRM3 contains a cwf21 domain at the N-terminus. The cwf21 domain is involved in mRNA splicing; it binds directly to the spliceosomal protein Prp8.	68
411062	cd21378	eIF3E	eukaryotic translation initiation factor 3 subunit E. Eukaryotic translation initiation factor 3 subunit E (eIF3E, also called INT6) is a subunit of eIF3, the largest initiation factor. eIF3 is involved in many steps of initiation, including ribosomal recruitment, attachment to mRNA, and scanning. The mammalian eIF3 complex has 13 subunits. Six subunits, including subunit E, contain PCI domains (N-terminal helical repeats and a winged helix domain or WHD) that mediate PCI polymerization. The mammalian eIF3E subunit interacts with the eIF3C, eIF3D, eIF3L, and eIF3A subunits, as well as eIF4G and HERC2. It exhibits tumor suppressive or oncogenic functions depending on its expression level and/or tumor type; for example, decreased expression may cause breast cancer or non-small cell lung carcinoma while overexpression is correlated with colon cancer and glioblastoma. Decreased expression of eIF3E may also enable epithelial-mesenchymal transition (EMT), which is involved in adenomyosis by promoting cell invasion, and fibrogenesis by activating the TGF-beta1 signaling pathway.	416
412057	cd21382	RING0_parkin	RING finger-like zinc-binding domain 0 of parkin. Parkin, also called Parkinson juvenile disease protein 2, is a RBR (RING1-BRcat-Rcat)-type E3 ubiquitin-protein ligase that is associated with recessive early onset Parkinson's disease (PD), and exerts a protective effect against dopamine-induced alpha-synuclein-dependent cell toxicity. Mutations in the parkin gene cause autosomal recessive juvenile parkinsonism. Parkin functions within a multiprotein E3 ubiquitin ligase complex, catalyzing the covalent attachment of ubiquitin moieties onto substrate proteins. It is involved in regulating mitochondrial quality control. Its activation is a key regulatory event in the pathway to the clearance of depolarized or damaged mitochondria. Parkin contains an N-terminal ubiquitin-like domain, an acid linker, a RING finger-like domain 0 (RING0), and a C-terminal RBR domain that was previously known as RING-BetweenRING-RING domain or TRIAD [two RING fingers and a DRIL (double RING finger linked)] domain. This model represents RING0 of parkin.	84
410588	cd21383	GAT_GGA_Tom1-like	canonical GAT domain found in eukaryotic ADP-ribosylation factor (Arf)-binding proteins (GGAs), metazoan myb protein 1 (Tom1)-like proteins, and similar proteins. This model represents the canonical GAT (GGA and Tom1) domain found in GGAs from eukaryotes, Tom1-like proteins from metazoa, and LAS seventeen-binding protein 5 (Lsb5p)-like proteins from fungi. The canonical GAT domain is a monomeric three-helix bundle that binds ubiquitin. GGAs, also called Golgi-localized gamma-ear-containing Arf-binding proteins, belong to a family of ubiquitously expressed, monomeric, motif-binding cargo/clathrin adaptor proteins that regulate clathrin-mediated trafficking of cargo proteins from the trans-Golgi network (TGN) to endosomes. GGAs play important roles in ubiquitin-dependent sorting of cargo proteins both in biosynthetic and endocytic pathways. Tom1 and its related proteins, Tom1L1 and Tom1L2, form a protein family sharing an N-terminal VHS domain followed by a GAT domain. Tom1 family proteins bind to ubiquitin, ubiquitinated proteins, and Toll-interacting protein (Tollip) through their GAT domain. They associate neither with Arf GTPases through their GAT domain nor with acidic cluster-dileucine sequences through their VHS domain. In addition, Tom1 family proteins recruit clathrin onto endosomes through their C-terminal region. The C-terminal clathrin-binding regions of Tom1 and Tom1L2 are similar to each other, but distinguishable from that of Tom1L1.	80
410589	cd21384	GAT_STAM_Vps27-like	non-canonical GAT domain found in metazoan signal transducing adapter molecules (STAMs), fungal vacuolar protein sorting-associated protein 27 (Vps27), and similar proteins. This family includes several components of the ESCRT-0 complex, including STAMs, hepatocyte growth factor-regulated tyrosine kinase substrate (Hrs), as well as vacuolar protein sorting-associated protein 27 (Vps27) and class E vacuolar protein-sorting machinery protein Hse1 from fungi. The ESCRT-0 complex binds ubiquitin and acts as a sorting machinery that recognizes ubiquitinated receptors and transfers them for further sequential lysosomal sorting/trafficking processes. Members in this family contain a non-canonical GAT (GGA and Tom1) domain consisting of two helices. By contrast, a canonical GAT domain is a monomeric three-helix bundle that binds ubiquitin. Hrs together with STAM forms a Hrs/STAM core complex. Vps27, together with Hse1, forms a Vps27/Hse1 core complex. Those complexes consist of two intertwined non-canonical GAT domains, each consisting of two helices from one subunit, and one from the other subunit. The intertwined GAT heterodimer acts as a scaffold for binding of ubiquitinated cargo proteins and coordinating ubiquitination and deubiquitination reactions that regulate sorting.	79
410590	cd21385	GAT_Vps27	non-canonical GAT domain found in fungal vacuolar protein sorting-associated protein 27 (Vps27) and similar proteins. Vps27, also called Golgi retention defective protein 11 (GRD11), is a component of the ESCRT-0 complex which is the sorting receptor for ubiquitinated cargo proteins at the multivesicular body (MVB), and recruits ESCRT-I to the MVB outer membrane. It controls exit from the prevacuolar compartment (PVC) in both the forward direction to the vacuole and the return to the Golgi. Members of this family contain a non-canonical GAT (GGA and Tom1) domain consisting of two helices. A canonical GAT domain is a monomeric three-helix bundle that binds ubiquitin. Vps27, together with another GAT domain-containing protein Hse1, forms a Vps27/Hse1 core complex that consists of two intertwined non-canonical GAT domains, each consisting of two helices from one subunit, and one from the other subunit. The two GAT domains are connected by a two-stranded coiled-coil. The Vps27/Hse1 complex, an intertwined GAT heterodimer, is a scaffold for binding of ubiquitinated cargo proteins and coordinating ubiquitination and deubiquitination reactions that regulate sorting.	84
410591	cd21386	GAT_Hse1	non-canonical GAT domain found in fungal class E vacuolar protein-sorting machinery protein Hse1 and similar proteins. Hse1 is a component of the ESCRT-0 complex which is the sorting receptor for ubiquitinated cargo proteins at the multivesicular body (MVB), and recruits ESCRT-I to the MVB outer membrane. Members of this family contain a non-canonical GAT (GGA and Tom1) domain consisting of two helices. A canonical GAT domain is a monomeric three-helix bundle that binds ubiquitin. Hse1, together with another GAT domain-containing protein Vps27, forms a Vps27/Hse1 core complex that consists of two intertwined non-canonical GAT domains, each consisting of two helices from one subunit, and one from the other subunit. The two GAT domains are connected by a two-stranded coiled-coil. The Vps27/Hse1 complex, an intertwined GAT heterodimer, is a scaffold for binding of ubiquitinated cargo proteins and coordinating ubiquitination and deubiquitination reactions that regulate sorting.	84
410592	cd21387	GAT_Hrs	non-canonical GAT domain found in metazoan hepatocyte growth factor-regulated tyrosine kinase substrate (Hrs) and similar proteins. Hrs, also called protein pp110, is a tyrosine kinase substrate in growth factor-stimulated cells. It is involved in intracellular signal transduction mediated by cytokines and growth factors. Hrs is a component of the ESCRT-0 complex that binds ubiquitin and acts as sorting machinery that recognizes ubiquitinated receptors and transfers them for further sequential lysosomal sorting/trafficking processes. Members of this family contain a non-canonical GAT (GGA and Tom1) domain consisting of two helices. A canonical GAT domain is a monomeric three-helix bundle that binds ubiquitin. Hrs, together with another GAT domain-containing protein STAM, forms a Hrs/STAM core complex that consists of two intertwined GAT domains, each consisting of two helices from one subunit, and one from the other subunit. The two GAT domains are connected by a two-stranded coiled-coil. The Hrs/STAM complex, an intertwined GAT heterodimer, is a scaffold for binding of ubiquitinated cargo proteins and coordinating ubiquitination and deubiquitination reactions that regulate sorting.	96
410593	cd21388	GAT_STAM	non-canonical GAT domain found in metazoan signal transducing adapter molecules (STAMs) and similar proteins. STAMs are Hrs-binding proteins involved in intracellular signal transduction mediated by cytokines and growth factors. They are components of the ESCRT-0 complex that binds ubiquitin and acts as sorting machinery that recognizes ubiquitinated receptors and transfers them for further sequential lysosomal sorting/trafficking processes. Members of this family contain a non-canonical GAT (GGA and Tom1) domain consisting of two helices. A canonical GAT domain is a monomeric three-helix bundle that binds ubiquitin. STAM, together with another GAT domain-containing protein Hrs, forms a Hrs/STAM core complex that consists of two intertwined GAT domains, each consisting of two helices from one subunit, and one from the other subunit. The two GAT domains are connected by a two-stranded coiled-coil. The Hrs/STAM complex, an intertwined GAT heterodimer, is a scaffold for binding of ubiquitinated cargo proteins and coordinating ubiquitination and deubiquitination reactions that regulate sorting.	77
410594	cd21389	GAT_STAM1	non-canonical GAT domain found in signal transducing adapter molecule 1 (STAM-1) and similar proteins. STAM-1 is involved in intracellular signal transduction mediated by cytokines and growth factors. It may also play a role in T-cell development. STAM-1 is a component of the ESCRT-0 complex that binds ubiquitin and acts as sorting machinery that recognizes ubiquitinated receptors and transfers them for further sequential lysosomal sorting/trafficking processes. Members of this subfamily contain a non-canonical GAT (GGA and Tom1) domain consisting of two helices. A canonical GAT domain is a monomeric three-helix bundle that binds ubiquitin. STAM-1, together with another GAT domain-containing protein Hrs, forms a Hrs/STAM1 core complex that consists of two intertwined GAT domains, each consisting of two helices from one subunit, and one from the other subunit. The two GAT domains are connected by a two-stranded coiled-coil. The Hrs/STAM1 complex, an intertwined GAT heterodimer, is a scaffold for binding of ubiquitinated cargo proteins and coordinating ubiquitination and deubiquitination reactions that regulate sorting.	77
410595	cd21390	GAT_STAM2	non-canonical GAT domain found in signal transducing adapter molecule 2 (STAM-2) and similar proteins. STAM-2 is a Hrs-binding protein involved in intracellular signal transduction mediated by cytokines and growth factors. STAM-2 is a component of the ESCRT-0 complex that binds ubiquitin and acts as sorting machinery that recognizes ubiquitinated receptors and transfers them for further sequential lysosomal sorting/trafficking processes. Members of this family contain a non-canonical GAT (GGA and Tom1) domain consisting of two helices. A canonical GAT domain is a monomeric three-helix bundle that binds ubiquitin. STAM-2, together with another GAT domain-containing protein Hrs, forms a Hrs/STAM2 core complex that consists of two intertwined GAT domains, each consisting of two helices from one subunit, and one from the other subunit. The two GAT domains are connected by a two-stranded coiled-coil. The Hrs/STAM2 complex, an intertwined GAT heterodimer, is a scaffold for binding of ubiquitinated cargo proteins and coordinating ubiquitination and deubiquitination reactions that regulate sorting.	91
409621	cd21392	IgC2_CD160	Immunoglobulin Constant-2 domain of Cluster of Differentiation 160 (CD160). CD160 is expressed at the cell surface as a tightly disulfide-linked multimer and is tightly associated with peripheral blood NK cells and CD8 T lymphocytes with cytolytic effector activity. It is structurally similar to the B and T lymphocyte attenuator (BTLA), which appears to act as a negative regulator of T cell activation and growth. CD160 is a ligand for HVEM (herpes virus entry mediator), and is considered a proposed immune checkpoint inhibitor with anti-cancer activity along with anti-PD-1 antibodies. CD160 has also been proposed as a potential target in cases of human pathological ocular and tumor neoangiogenesis that do not respond or become resistant to existing antiangiogenic drugs.	119
411063	cd21393	sm_acid_XPC-like	small acidic domain of Xeroderma pigmentosum group C complementing protein and similar proteins. This model represents the small acidic domain of mammalian Xeroderma pigmentosum group C complementing protein (XPC), yeast Rad4, and similar proteins. XPC/Rad4 recruits transcription/repair factor IIH (TFIIH) to the nucleotide excision repair (NER) complex through interactions with its p62/Tfb1 and XPB/Ssl2 TFIIH subunits. Global genome repair (GGR), one of two NER initiation pathways in mammals, starts with DNA lesion detection by XPC. XPC is a structure specific DNA-binding factor that recognizes distortion of the damaged DNA double helix and recruits the TFIIH complex onto the lesion to open up the damaged DNA. The small acidic domain of XPC/Rad4 interacts with the pleckstrin homology (PH) domain of the p62/Tfb1 subunit of TFIIH.	42
412027	cd21396	GINS_B	beta-strand (B) domain of GINS complex proteins: Sld5, Psf1, Psf2, Psf3, Gins51 and Gins23. The GINS (named from the Japanese go-ichi-ni-san, meaning 5-1-2-3 for the Sld5, Psf1, Psf2, and Psf3 subunits) complex is involved in both the initiation and elongation stages of eukaryotic chromosome replication, with GINS being the component that most likely serves as the replicative helicase that unwinds duplex DNA ahead of the moving replication fork. This complex is found in eukaryotes and archaea, but not in bacteria. In eukaryotes, GINS is a tetrameric arrangement of four subunits Sld5, Psf1, Psf2 and Psf3, while in archaea, it consists of two different proteins named Gins51 and Gins23. The archaeal GINS complex can be either an alpha2beta2-type heterotetramer composed of Gins51 and Gins23, or a Gins51-only alpha4-type homotetramer. All GINS subunits are homologous and consist of two domains, called the alpha-helical (A) and beta-strand (B) domains. The A and B domains of Sld5/Psf1/Gins51 are permuted with respect to Psf2/Psf3/Gins23. The overall tetrameric assemblies of GINS are similar, but the relative locations of the C-terminal small domains are different with respect to the alpha-helical domain, resulting in different subunit contacts. However, the basic function of GINS in DNA replication is conserved across eukaryotes and archaea. This model represents the beta-strand domain (B-domain) of GINS complex proteins.	49
411064	cd21397	cc_ERCC-6_N	coiled-coil domain located near the N-terminus of human Excision Repair Cross Complementing 6 (ERCC-6) and related proteins. This model represents a coiled-coil domain located near the N-terminus of ERCC-6 and related proteins. ERCC-6 (also known as Cockayne syndrome group B, CSB) is a DNA-binding protein important in eukaryotic transcription-coupled repair (TCR). TCR is a well-conserved sub-pathway of nucleotide excision repair (NER) that preferentially removes DNA lesions from the template strand blocking translocation of RNA polymerase II (Pol II). In a model for TCR, the processing Pol II encounters the lesion on the transcribed DNA strand and stalls; it is then displaced by the TCR-initiation complex which includes ERCC-6, ERCC-8, UVSSA and USP7; TCR-specific factors then access the lesion for the DNA damage incision process. The N-terminal region, the ATPase domain and the C-terminal region of ERCC-6 all directly contribute to DNA association and catalytic activity. The ATPase domain functions in concert with either the N- or C-terminal region to mediate UV-induced chromatin association. The N-terminal region prevents ERCC-6 from stably associating with chromatin under normal growth conditions, and the C-terminal region of ERCC-6 promotes stable chromatin association in the presence of lesion-stalled transcription. In addition to this coiled-coil domain, the N-terminal region of ERCC-6 includes two lysine residues subject to SUMOylation, a nucleolar localization signal NoLS1, and a nuclear localization signal NLS1. ERCC-6 also includes a SWI/SNF-like ATPase domain, a nucleotide-binding domain and a ubiquitin-binding domain. This coiled-coil domain binds magnesium. This domain family does not include Saccharomyces cerevisiae RAD26, and Schizosaccharomyces pombe Rhp26.	77
394806	cd21399	ZBD_nv_SF1_Hel-like	Cys/His rich zinc-binding domain (CH/ZBD) of nidovirus helicases including coronavirus Nsp13 and arterivirus Nsp10, and related proteins. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. This nidovirus family includes Severe Acute Respiratory Syndrome coronavirus (SARS) non-structural protein 13 (SARS-Nsp13) and equine arteritis virus (EAV) Nsp10 helicase, and belongs to helicase superfamily 1 (SF1). The CH/ZBD has 3 zinc-finger (ZnF1-3) motifs. SARS-Nsp13 is a component of the viral RNA synthesis replication and transcription complex (RTC). The SARS-Nsp13 CH/ZBD is indispensable for helicase activity and interacts with SARS-Nsp12. SARS-Nsp12 can enhance the helicase activity of SARS-Nsp13 and can interact with SARS-Nsp13 on the third zinc finger motif of the CH/ZBD. SARS-Nsp13 and EAV Nsp10 are multidomain proteins; their other domains include a 1B regulatory domain and a SF1 helicase core.	71
394807	cd21400	ZBD_UPF1-like	Cys/His rich zinc-binding domain (CH/ZBD) of eukaryotic UPF1 RNA helicase and related proteins. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. UPF1 belongs to helicase superfamily 1 (SF1). The CH/ZBD has 3 zinc-finger (ZnF1-3) motifs. UPF1 participates in nonsense-mediated mRNA decay (NMD), a pathway which degrades transcripts with premature termination codons. The N-terminal CH/ZBD of UPF1 interacts with UPF2, a factor also involved in NMD. UPF1 has an N-terminal CH/ZBD, a 1B domain, and a SF1 helicase core.	120
394808	cd21401	ZBD_cv_Nsp13-like	Cys/His rich zinc-binding domain (CH/ZBD) of coronavirus SARS NSP13 helicase and related proteins. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. This coronavirus family includes Severe Acute Respiratory Syndrome coronavirus (SARS) non-structural protein 13 (SARS-Nsp13) and belongs to helicase superfamily 1 (SF1). The CH/ZBD has 3 zinc-finger (ZnF1-3) motifs. SARS-Nsp13 is a component of the viral RNA synthesis replication and transcription complex (RTC). The SARS-Nsp13 CH/ZBD is indispensable for helicase activity and interacts with SARS-Nsp12. SARS-Nsp12 can enhance the helicase activity of SARS-Nsp13 and can interact with SARS-Nsp13 on the third zinc finger motif of the CH/ZBD. SARS-Nsp13 has an N-terminal CH/ZBD, a stalk domain, a 1B regulatory domain, and SF1 helicase core.	95
394809	cd21402	ZBD_mv_SF1_Hel-like	Cys/His rich zinc-binding domain (CH/ZBD) of mesnidovirus SF1 helicase and related proteins. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. This mesnidovirus group includes the Bontag Baru virus (BBaV) replication helicase encoded on ORF1b and belongs to helicase superfamily 1 (SF1). The CH/ZBD has 3 zinc-finger (ZnF1-3) motifs. Members of this group belong to a family of nidoviral replication helicases which include SARS-Nsp13, a component of the viral RNA synthesis replication and transcription complex (RTC). The SARS-Nsp13 CH/ZBD is indispensable for helicase activity and interacts with SARS-Nsp12. SARS-Nsp12 can enhance the helicase activity of SARS-Nsp13 and can interact with SARS-Nsp13 on the third zinc finger motif of the CH/ZBD.	111
394810	cd21403	ZBD_tv_SF1_Hel-like	Cys/His rich zinc-binding domain (CH/ZBD) of tornidovirus SF1 helicase and related proteins. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. This tornidovirus group includes White bream virus (WBV) SF1 helicase encoded on ORF1b and belongs to helicase superfamily 1 (SF1). The CH/ZBD has 3 zinc-finger (ZnF1-3) motifs. Members of this family belong to a family of nidoviral replication helicases which include SARS-Nsp13, a component of the viral RNA synthesis replication and transcription complex (RTC). The SARS-Nsp13 CH/ZBD is indispensable for helicase activity and interacts with SARS-Nsp12. SARS-Nsp12 can enhance the helicase activity of SARS-Nsp13 and can interact with SARS-Nsp13 on the third zinc finger motif of the CH/ZBD.	95
394811	cd21404	ZBD_rv_SF1_Hel-like	Cys/His rich zinc-binding domain (CH/ZBD) of ronidovirus SF1 helicase and related proteins. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. This ronidovirus family includes Gill-associated virus (GAV) replication helicase encoded on ORF1 and belongs to helicase superfamily 1 (SF1). The CH/ZBD has 3 zinc-finger (ZnF1-3) motifs. Members of this family belong to a family of nidoviral replication helicases which include SARS-Nsp13, a component of the viral RNA synthesis replication and transcription complex (RTC). The SARS-Nsp13 CH/ZBD is indispensable for helicase activity and interacts with SARS-Nsp12. SARS-Nsp12 can enhance the helicase activity of SARS-Nsp13 and can interact with SARS-Nsp13 on the third zinc finger motif of the CH/ZBD.	105
394812	cd21405	ZBD_av_Nsp10-like	Cys/His rich zinc-binding domain (CH/ZBD) of arterivirus EAV Nsp10 helicase and related proteins. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. Members of this arnidovirus group belong to helicase superfamily 1 (SF1) and include arterivirus helicases such as Equine arteritis virus (EAV) Nsp10 helicase encoded on ORF1b. The CH/ZBD has 3 zinc-finger (ZnF1-3) motifs. Members of this family belong to a family of nidoviral replication helicases which include SARS-Nsp13, a component of the viral RNA synthesis replication and transcription complex (RTC). The SARS-Nsp13 CH/ZBD is indispensable for helicase activity and interacts with SARS-Nsp12. SARS-Nsp12 can enhance the helicase activity of SARS-Nsp13 and can interact with SARS-Nsp13 on the third zinc finger motif of the CH/ZBD.	62
394814	cd21406	1B_nv_SF1_Hel-like	1B domain of nidovirus helicases including coronavirus Nsp13 and arterivirus Nsp10, and related proteins. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. Members of this nidoviral family belong to helicase superfamily 1 (SF1) and include nidoviral helicases such as Severe Acute Respiratory Syndrome coronavirus (SARS) non-structural protein 13 (SARS-Nsp13) and Equine arteritis virus (EAV) Nsp10. SARS-Nsp13 is a component of the viral RNA synthesis replication and transcription complex (RTC). They belong to a larger SF1 helicase family which also includes eukaryotic UPF1-like helicases. UPF1, EAV Nsp10 and SARS-Nsp13 are multidomain proteins with an N-terminal Cys/His rich zinc-binding domain (CH/ZBD), a 1B domain and a SF1 helicase core. The 1B domain is involved in nucleic acid substrate binding; the 1B domain of EAV Nsp10 undergoes a large conformational change upon substrate binding and, together with the 1A and 2A domains of the helicase core, forms a channel that accommodates the single stranded nucleic acids.	79
394815	cd21407	1B_UPF1-like	1B domain of eukaryotic UPF1 RNA helicase and related proteins. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. UPF1 belongs to helicase superfamily 1 (SF1). It participates in nonsense-mediated mRNA decay (NMD), a pathway which degrades transcripts with premature termination codons. UPF1 is a multidomain protein; it includes an N-terminal Cys/His rich zinc-binding domain (CH/ZBD), a regulatory 1B domain, and a SF1 helicase core. The 1B domain is involved in nucleic acid substrate binding; the 1B domain of the related Equine arteritis virus (EAV) Nsp10 undergoes a large conformational change upon substrate binding and, together with the 1A and 2A domains of the helicase core, forms a channel that accommodates the single stranded nucleic acids.	90
394816	cd21408	1B_Sen1p-like	1B domain of Saccharomyces cerevisiae Sen1p RNA helicase and related proteins. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. Sen1p belongs to a UPF1-like family of helicase superfamily 1 (SF1). UPF1 participates in nonsense-mediated mRNA decay (NMD), a pathway which degrades transcripts with premature termination codons, whereas Sen1p plays a role in the termination of non-coding transcription. UPF1 is a multidomain protein; it includes an N-terminal Cys/His rich zinc-binding domain (CH/ZBD), a 1B regulatory domain, and a SF1 helicase core. Sen1p has a similar domain organization and helicase mechanism to UPF1. However, it has distinct structural features including a more elaborate topology of the 1B barrel domain, and a distinct function from UPF1, an ATPase-dependent ability of promoting transcription termination in vitro. The 1B domain is involved in nucleic acid substrate binding; the 1B domain of the related Equine arteritis virus (EAV) Nsp10 undergoes a large conformational change upon substrate binding and, together with the 1A and 2A domains of the helicase core, forms a channel that accommodates the single stranded nucleic acids.	106
394817	cd21409	1B_cv_Nsp13-like	1B domain of coronavirus SARS NSP13 helicase and related proteins. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. Members of this subfamily belong to helicase superfamily 1 (SF1) and include coronavirus helicases such as Severe Acute Respiratory Syndrome coronavirus (SARS) non-structural protein 13 (SARS-Nsp13). SARS-Nsp13 is a component of the viral RNA synthesis replication and transcription complex (RTC). SARS-Nsp13 is a multidomain protein; its other domains include an N-terminal Cys/His rich zinc-binding domain (CH/ZBD) and a SF1 helicase core. The 1B domain is involved in nucleic acid substrate binding; the 1B domain of the related Equine arteritis virus (EAV) Nsp10 undergoes a large conformational change upon substrate binding and, together with the 1A and 2A domains of the helicase core, forms a channel that accommodates the single stranded nucleic acids.	79
394818	cd21410	1B_av_Nsp10-like	1B domain of arterivirus EAV Nsp10 helicase and related proteins. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. Members of this subfamily belong to helicase superfamily 1 (SF1) and include arterivirus helicases such as Equine arteritis virus (EAV) Nsp10 helicase encoded on ORF1b. EAV Nsp10 is a multidomain protein; its other domains include an N-terminal Cys/His rich zinc-binding domain (CH/ZBD) and a SF1 helicase core. The 1B domain is involved in nucleic acid substrate binding; the 1B domain of EAV Nsp10 undergoes a large conformational change upon substrate binding and, together with the 1A and 2A domains of the helicase core, forms a channel that accommodates the single stranded nucleic acids.	49
411058	cd21411	NucC	cyclic oligonucleotide-based anti-phage signaling system-associated NucC nuclease. Cyclic oligonucleotide-based anti-phage signaling system (CBASS)-associated NucC nuclease kills phage-infected cells through genome destruction. It is allosterically activated by a cyclic triadenylate (cA3) second messenger that is synthesized by CBASS upon infection. NucC is related to restriction endonucleases but it adopts a homotrimeric structure. Binding of cA3 causes two NucC homotrimers to assemble into a homohexamer, which brings together a pair of active sites to activate DNA cleavage. NucC has also been integrated into type III CRISPR/Cas systems as an accessory nuclease.	225
394819	cd21413	unc_tv_SF1_Hel-like	uncharacterized domain which connects the Cys/His rich zinc-binding (ZBD) and linker to the first helicase domain of tornidovirus SF1 helicase and related proteins. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. Helicases in this family belong to helicase superfamily 1 (SF1) and include tornidovirus helicases such as Breda virus serotype 1 (BoTV-1) SF1 helicase encoded on ORF1b. They are related to the SF1 family nidoviral replication helicases which include Severe Acute Respiratory Syndrome coronavirus (SARS) non-structural protein 13 (SARS-Nsp13), a component of the viral RNA synthesis replication and transcription complex (RTC). SARS-Nsp13 is a multidomain protein; its other domains include an N-terminal Cys/His rich zinc-binding domain (ZBD) and a SF1 helicase core. The location of the uncharacterized domain represented in this tornidovirus group resembles that of the 1B domain in SARS-Nsp13 helicase; it connects the zinc-binding domain (ZBD) and linker to the first helicase domain.	79
394820	cd21414	unc_rv_SF1_Hel-like	uncharacterized domain which connects the Cys/His rich zinc-binding domain (ZBD) and linker to the first helicase domain of ronidovirus SF1 helicase and related proteins. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. Members of this family belong to helicase superfamily 1 (SF1) and include ronidovirus helicases such as Gill-associated virus (GAV) replication helicase encoded on ORF1b. They are related to the SF1 family nidoviral replication helicases which include Severe Acute Respiratory Syndrome coronavirus (SARS) non-structural protein 13 (SARS-Nsp13), a component of the viral RNA synthesis replication and transcription complex (RTC). SARS-Nsp13 is a multidomain protein; its other domains include an N-terminal Cys/His rich zinc-binding domain (CH/ZBD) and a SF1 helicase core. The location and orientation of the uncharacterized domain represented in this ronidovirus group resembles that of the 1B domain in SARS-Nsp13 helicase; it connects the Cys/His zinc-binding domain (ZBD) and linker to the first helicase domain.	56
411065	cd21416	HDC_protein	histidine decarboxylase (HDC) gene cluster protein. This model includes an uncharacterized histidine decarboxylase (HDC) gene cluster protein. Lactobacillus parabuchneri is one of the major causes of elevated histamine levels in cheese. Histamine positive strain FAM21731 from L. parabuchneri has been shown to contain a histidine decarboxylase gene cluster present on a genomic island, which seems to have undergone transfer within the same species as well as between other lactobacilli.	380
411066	cd21417	AvrRxo1	AvrRxo1, a type III effector with a polynucleotide kinase domain. This family contains AvrRxo1-ORF1 (also called AvrRxo1) which has been shown to be a type III-secreted virulence factor in Xanthomonas oryzae (Xoc) that causes bacterial leaf streak (BLS) disease in rice plants. AvrRxo1-ORF1 delivery in rice plant cells is recognized by disease resistance protein Rxo1, which triggers resistance to BLS disease. In the Xoc genome, AvrRxo1-ORF1 is adjacent to AvrRxo1-ORF2 (also called AvrRxo1-required chaperone, or Arc1) which appears to act as a molecular chaperone. AvrRxo1 has a T4 polynucleotide kinase (T4pnk) domain, while Arc1 has a kinase-binding domain with a structure that is atypical of effector-binding chaperones. AvrRxo1 and Arc1 comprise a toxin-antitoxin system similar to members of the zeta-epsilon family, with AvrRxo1 acting as the toxin.	330
411067	cd21418	Arc1	Arc1, AvrRxo1-required chaperone. This family contains AvrRxo1-ORF2 (also called AvrRxo1-required chaperone or Arc1) which appears to act as a molecular chaperone for AvrRxo1-ORF1 (also called AvrRxo1), a type III-secreted virulence factor in Xanthomonas oryzae (Xoc), a bacterium that causes leaf streak (BLS) disease in rice plants. AvrRxo1-ORF1 delivery in rice plant cells is recognized by disease resistance protein Rxo1, which triggers resistance to BLS disease. In the Xoc genome, the Arc1 gene is found adjacent to AvrRxo1; Arc1 functions to suppress the bacteriostatic activity of AvrRxo1-ORF1 in bacterial cells. Arc1 has a kinase-binding domain with a structure that is atypical of effector-binding chaperones, while AvrRxo1 has a T4 polynucleotide kinase (T4pnk) domain. AvrRxo1 and Arc1 comprise a toxin-antitoxin system similar to members of the zeta-epsilon family, with Arc1 acting as the antitoxin.	97
412058	cd21422	GatF	mitochondrial glutamyl-tRNA(Gln) amidotransferase subunit F. Glutamyl-tRNA(Gln) amidotransferase subunit F (GatF), also called Glu-AdT subunit F, is the connector subunit of yeast mitochondrial tRNA-dependent amidotransferase (AdT) that is also composed of the GatA and GatB subunits. GatA and GatB are well conserved among bacteria and eukaryota, but the GatF subunit is a fungi-specific ortholog of the GatC subunit found in all other known heterotrimeric AdTs. AdT allows the formation of correctly charged Gln-tRNA(Gln) through the transamidation of misacylated Glu-tRNA(Gln) in the mitochondria. The reaction takes place in the presence of glutamine and ATP through an activated gamma-phospho-Glu-tRNA(Gln). This model corresponds to the GatF subunit, which can be divided into two halves, the C-terminal GatC-like portion and the N-terminal appended domain (NTD).	128
410603	cd21435	SUN_cc1	coiled-coil domain 1 of SUN domain-containing proteins. SUN (Sad1 and UNC-84) proteins (SUN1 and SUN2) are components of the LINC (LInker of Nucleoskeleton and Cytoskeleton) complex which is involved in the connection between the nuclear lamina and the cytoskeleton. Besides the core SUN domain, SUN proteins contain two coiled-coil domains (CC1 and CC2), which act as intrinsic dynamic regulators controlling the activity of the SUN domain. The model corresponds to CC1 that functions as an activation segment to release CC2-mediated inhibition of the SUN domain.	55
410604	cd21438	SUN2_cc1	coiled-coil domain 1 of SUN domain-containing protein 2 and similar proteins. SUN domain-containing protein 2 (SUN2), also called protein unc-84 homolog B, Rab5-interacting protein (Rab5IP), or Sad1/unc-84 protein-like 2, is a component of the LINC (LInker of Nucleoskeleton and Cytoskeleton) complex which is involved in the connection between the nuclear lamina and the cytoskeleton. Besides the core SUN domain, SUN2 contains two coiled-coil domains (CC1 and CC2), which act as the intrinsic dynamic regulators for controlling the activity of the SUN domain. This model corresponds to CC1 that functions as an activation segment to release CC2-mediated inhibition of the SUN domain.	55
410605	cd21439	SUN1_cc1	coiled-coil domain 1 of SUN domain-containing protein 1 and similar proteins. SUN domain-containing protein 1 (SUN1), also called protein unc-84 homolog A, or Sad1/unc-84 protein-like 1, is a component of the LINC (LInker of Nucleoskeleton and Cytoskeleton) complex which is involved in the connection between the nuclear lamina and the cytoskeleton. Besides the core SUN domain, SUN1 contains two coiled-coil domains (CC1 and CC2), which act as the intrinsic dynamic regulators for controlling the activity of the SUN domain. This model corresponds to CC1 that may function as an activation segment to release CC2-mediated inhibition of the SUN domain.	55
410607	cd21440	KLF8_N	N-terminal domain of Kruppel-like factor 8. Kruppel-like factor 8 (also known as Krueppel-like transcription factor 8, KLF8) is a CACCC-box binding protein that associates with C-terminal Binding Protein (CtBP) and represses transcription. It plays an essential role in the regulation of the cell cycle, apoptosis, and differentiation. It has been identified as a key component of the transcription factor network that controls terminal differentiation during adipogenesis. It also plays an important role in the formation of several human tumors, including the promotion of tumorigenesis, invasion, and metastasis of colorectal cancer cells, and the progression of pancreatic cancer. KLF8 belongs to a family of proteins called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Although these factors bind to similar elements in vitro, they have distinct activities in vivo depending on their expression profile and the sequence of the N-terminal activation/repression domain, which differ between members. KLF8 contains an N-terminal repression domain that is related to that of KLF12.	169
410608	cd21441	KLF12_N	N-terminal domain of Kruppel-like factor 12. Kruppel-like factor 12 (also known as Krueppel-like transcription factor 12, KLF12) regulates, by transcriptionally repressing Nur77 expression, endometrial decidualization, which is a prerequisite for successful implantation and the establishment of pregnancy. It is involved in the maturation processes of kidney collecting ducts after birth, and is able to increase the promoter activity of the UT-A1 urea transporter promoter by binding to the CACCC motif. KLF12 has also been found to promote colorectal cancer growth and is also involved in the invasion and apoptosis of basal-like breast carcinoma. KLF12 belongs to a family of proteins called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Although these factors bind to similar elements in vitro, they have distinct activities in vivo depending on their expression profile and the sequence of the N-terminal activation/repression domain, which differ between members. KLF12 contains an N-terminal domain that is related to the N-terminal repression domain of KLF8.	197
410568	cd21442	SNARE_NTD_STX6-like	N-terminal domain of syntaxin-6 and similar proteins. The family includes soluble NSF attachment protein receptor (SNARE) proteins, syntaxin-6 (STX6) and syntaxin-10 (STX10), and their homologs found in fungi and plants, such as Tlg1p, AtSYP61, and similar proteins. STX6 is involved in intracellular vesicle trafficking. STX10, also called Syn10, is involved in vesicular transport from the late endosomes to the trans-Golgi network. Tlg1p, also called syntaxin TLG1, is a SNARE protein (of Qc type) involved in membrane fusion probably in retrograde traffic of cytosolic double-membrane vesicles derived from both, early and possibly late endosomes/PVC (prevacuolar compartment) back to the trans-Golgi network (TGN or late Golgi). It has been reported to function both as a (target membrane) t-SNARE and as a (vesicle) v-SNARE. AtSYP61, also called osmotic stress-sensitive mutant 1 (OSM1), is a vesicle trafficking syntaxin protein that functions in the secretory pathway. It is involved in osmotic stress tolerance and in abscisic acid (ABA) regulation of stomatal responses in Arabidopsis.	103
410569	cd21443	SNARE_NTD_STX6_STX10	N-terminal domain of syntaxin-6, syntaxin-10, and similar proteins. This subfamily includes two soluble NSF attachment protein receptor (SNARE) proteins, syntaxin-6 (STX6) and syntaxin-10 (STX10). STX6 is involved in intracellular vesicle trafficking. STX10, also called Syn10, is involved in vesicular transport from the late endosomes to the trans-Golgi network. This model corresponds to the N-terminal domain of STX6 and STX10, which is a regulatory domain named Habc.	103
410570	cd21444	SNARE_NTD_Tlg1p-like	N-terminal domain of t-SNARE affecting a late Golgi compartment protein 1 and similar proteins. t-SNARE affecting a late Golgi compartment protein 1 (Tlg1p), also called syntaxin TLG1, is a soluble NSF attachment protein receptor (SNARE) protein (of Qc type) involved in membrane fusion, probably in retrograde traffic of cytosolic double-membrane vesicles derived from both, early and possibly late endosomes/PVC (prevacuolar compartment) back to the trans-Golgi network (TGN or late Golgi). It has been reported to function both as a (target membrane) t-SNARE and as a (vesicle) v-SNARE. The model corresponds to the N-terminal domain of Tlg1p, which consists of a three-helix bundle.	93
410571	cd21445	SNARE_NTD_AtSYP61-like	N-terminal domain of Arabidopsis thaliana syntaxin-61 and similar proteins. Arabidopsis thaliana syntaxin-61 (AtSYP61), also called osmotic stress-sensitive mutant 1 (OSM1), is a vesicle trafficking syntaxin protein that functions in the secretory pathway. It is involved in osmotic stress tolerance and in abscisic acid (ABA) regulation of stomatal responses in Arabidopsis. This model corresponds to the N-terminal domain of AtSYP61, which shows high sequence similarity with the N-terminal domain of yeast Tlg1p, a soluble NSF attachment protein receptor (SNARE) protein (of Qc type) involved in membrane fusion.	99
410572	cd21446	SNARE_NTD_STX10	N-terminal domain of syntaxin-10. Syntaxin-10 (STX10), also called Syn10, is part of a soluble NSF attachment protein receptor (SNARE) complex involved in vesicular transport from the late endosomes to the trans-Golgi network, such as the transport of mannose 6-phosphate receptors from endosomes to the Golgi after delivering lysosomal enzymes to the endocytic pathway. This model corresponds to the N-terminal domain of STX10, which is a regulatory domain named Habc.	103
410573	cd21447	SNARE_NTD_STX6	N-terminal domain of syntaxin-6. Syntaxin-6 (STX6) is a component of a soluble NSF attachment protein receptor (SNARE) complex involved in intracellular vesicle trafficking and in the fusion of retrograde transport carriers with the trans-Golgi network (TGN). This model corresponds to N-terminal domain of STX6, which is a regulatory domain named Habc.	103
411996	cd21448	DLC-like_DYNLT1-like	dynein light chain (DLC)-like domain found in the dynein light chain Tctex-type 1 (DYNLT1) subfamily and similar proteins. The dynein light chain Tctex-type 1 (DYNLT1) subfamily includes two isoforms, DYNLT1 and DYNLT3, which contribute to the differential regulation of dynein cargo binding. They are non-catalytic accessory components of the cytoplasmic dynein 1 complex that are thought to be involved in linking dynein to cargos and to adapter proteins that regulate dynein function. The dynein complex contains either DYNLT1 or DYNLT3, but not both. The family also includes Schizosaccharomyces pombe dynein light chain Tctex-type (SpDlc1), Saccharomyces cerevisiae topoisomerase I damage affected protein 2 (TDA2) and similar proteins. SpDlc1 belongs to the 14-kDa Tctex-1 dynein light chain family. It acts as a non-catalytic accessory component of a dynein complex. It is required for regular oscillatory nuclear movement and efficient recombination during meiotic prophase in fission yeast. TDA2 is a novel protein of the endocytic machinery necessary for normal internalization of native cargo in yeast. It works independently of the dynein motor complex and microtubules.	98
411997	cd21449	DLC-like_SF	dynein light chain (DLC)-like domain superfamily. The superfamily corresponds to a class of proteins containing a dynein light chain (DLC)-like domain with anti-parallel beta-sheet packed against an alpha-helical hairpin. DLC-like domain-containing proteins include cytoplasmic dynein light chain DYNLL1 and DYNLL2, axonemal dynein light chain 4 (DNAL4), tegumental-allergen-like proteins (TALs), dynein light chain Tctex-type DYNLT1 and DYNLT3, as well as Tctex1 domain-containing proteins (TCTEX1D). Both DYNLL1 and DYNLL2 are non-catalytic accessory components of the cytoplasmic dynein 1 complex that are thought to be involved in linking dynein to cargos and to adapter proteins that regulate dynein function. DNAL4 is a force generating protein of respiratory cilia. TALs may be involved in the transport of vesicles within the tegumental cytoplasm, probably within dynein motor complexes. DYNLT1 and DYNLT3 contribute to the differential regulation of dynein cargo binding. They are non-catalytic accessory components of the cytoplasmic dynein 1 complex, which contains either DYNLT1 or DYNLT3, but not both. The TCTEX1D family includes TCTEX1D1-4. TCTEX1D1 is a genetic modifier of disease progression in Duchenne muscular dystrophy (DMD). TCTEX1D2 is required for proper retrograde ciliary transport. TCTEX1D3 may be an accessory component of axonemal dynein and cytoplasmic dynein 1. TCTEX1D4 is a novel protein phosphatase 1 (PPP1) interactor.	96
411998	cd21450	DLC-like_DYNLL1-like	dynein light chain (DLC)-like domain found in cytoplasmic dynein light chain 1 (DYNLL1), axonemal dynein light chain 4 (DNAL4), tegumental-allergen-like proteins (TALs) and similar proteins. The family includes cytoplasmic dynein light chain 1 (DYNLL1), DYNLL2, axonemal dynein light chain 4 (DNAL4), and tegumental-allergen-like proteins (TALs). DYNLL1, also called protein inhibitor of neuronal nitric oxide synthase (PIN), or 8 kDa dynein light chain (DLC8), or dynein light chain LC8-type 1 (DLC1), is one of several non-catalytic accessory components of the cytoplasmic dynein 1 complex that are thought to be involved in linking dynein to cargos and to adapter proteins that regulate dynein function. It acts as a motor for the intracellular retrograde motility of vesicles and organelles along microtubules. It may play a role in changing or maintaining the spatial distribution of cytoskeletal structures. DYNLL2, also called cytoplasmic dynein light chain 2, or 8 kDa dynein light chain b (DLC8b), or dynein light chain LC8-type 2 (DLC2), is one of several non-catalytic accessory components of the cytoplasmic dynein 1 complex that are thought to be involved in linking dynein to cargos and to adapter proteins that regulate dynein function. DNAL4 is a force generating protein of respiratory cilia. It produces force towards the minus ends of microtubules. TALs, also called tegument antigens, are characterized by two N-terminal EF-hand motifs and a C-terminal region resembling a dynein light chain (DLC)-like domain. They were mainly found in parasitic platyhelminth species. TALs are strongly associated with the tegument, a syncytial structure that forms the outer layer of the organism. They may be involved in the transport of vesicles within the tegumental cytoplasm, probably within dynein motor complexes.	68
411999	cd21451	DLC-like_TCTEX1D	dynein light chain (DLC)-like domain found in the Tctex1 domain-containing protein (TCTEX1D) family. The Tctex1 domain-containing protein (TCTEX1D) family includes TCTEX1D1-4. TCTEX1D1 is a genetic modifier of disease progression in Duchenne muscular dystrophy (DMD). It can interact with ZMYND10 that stabilizes intermediate chain proteins in the cytoplasmic pre-assembly of dynein arms. TCTEX1D2 is required for proper retrograde ciliary transport. It associates with short-rib polydactyly syndrome proteins, such as Wdr34, Wdr60, and other dynein complex 1 and 2 subunits, and is required for ciliogenesis. TCTEX1D2 is a negative regulator of GLUT4 translocation and glucose uptake. TCTEX1D3, also called T-complex testis-specific protein 3, or T-complex-associated testis-expressed protein 3 (Tcte-3), may be an accessory component of axonemal dynein and cytoplasmic dynein 1. TCTEX1D4, also called protein N22.1, or Tctex-2-beta, is a novel protein phosphatase 1 (PPP1) interactor. It also interacts with ENG/endoglin, TGFBR2, and TGFBR3. The distribution of TCTEX1D4 in testis suggests its involvement in distinct functions, such as TGFbeta signaling at the blood-testis barrier and acrosome cap formation. The model corresponds to the dynein light chain (DLC)-like domain of TCTEX1Ds.	101
412000	cd21452	DLC-like_DYNLL1_DYNLL2	dynein light chain (DLC)-like domain found in the cytoplasmic dynein light chain 1 (DYNLL1) family. The cytoplasmic dynein light chain 1 (DYNLL1) family includes DYNLL1 and DYNLL2. DYNLL1, also called protein inhibitor of neuronal nitric oxide synthase (PIN), or 8 kDa dynein light chain (DLC8), or dynein light chain LC8-type 1 (DLC1), acts as a motor for the intracellular retrograde motility of vesicles and organelles along microtubules. It may play a role in changing or maintaining the spatial distribution of cytoskeletal structures. Both DYNLL1 and DYNLL2 (also called 8 kDa dynein light chain b, or DLC8b, or dynein light chain LC8-type 2, or DLC2) are non-catalytic accessory components of the cytoplasmic dynein 1 complex that are thought to be involved in linking dynein to cargos and to adapter proteins that regulate dynein function. The model corresponds to the dynein light chain (DLC)-like domain of DYNLL1 and DYNLL2.	84
412001	cd21453	DLC-like_DNAL4	dynein light chain (DLC)-like domain found in axonemal dynein light chain 4 (DNAL4) and similar proteins. Axonemal dynein light chain 4 (DNAL4) is a force generating protein of respiratory cilia. It produces force towards the minus ends of microtubules. The model corresponds to the dynein light chain (DLC)-like domain of DNAL4.	83
412002	cd21454	DLC-like_TAL	dynein light chain (DLC)-like domain found in the family of tegumental-allergen-like proteins (TALs). Tegumental-allergen-like proteins (TALs), also called tegument antigens, are characterized by two N-terminal EF-hand motifs and a C-terminal region resembling a dynein light chain (DLC)-like domain. They were mainly found in parasitic platyhelminth species. TALs are strongly associated with the tegument, a syncytial structure that forms the outer layer of the organism. They may be involved in the transport of vesicles within the tegumental cytoplasm, probably within dynein motor complexes. The model corresponds to the dynein light chain (DLC)-like domain of TAL.	87
412003	cd21455	DLC-like_DYNLT1_DYNLT3	dynein light chain (DLC)-like domain found in the dynein light chain Tctex-type 1 (DYNLT1) family. The dynein light chain Tctex-type 1 (DYNLT1) family includes two isoforms, DYNLT1 and DYNLT3, which contribute to the differential regulation of dynein cargo binding. They are non-catalytic accessory components of the cytoplasmic dynein 1 complex that are thought to be involved in linking dynein to cargos and to adapter proteins that regulate dynein function. The dynein complex contains either DYNLT1 or DYNLT3, but not both. The model corresponds to the dynein light chain (DLC)-like domain of DYNLT1 and DYNLT3.	97
412004	cd21456	DLC-like_SpDlc1-like	dynein light chain (DLC)-like domain found in Schizosaccharomyces pombe dynein light chain Tctex-type (SpDlc1) and similar proteins. Schizosaccharomyces pombe dynein light chain 1 (SpDlc1) belongs to the 14-kDa Tctex-1 dynein light chain family. It acts as a non-catalytic accessory component of a dynein complex. It is required for regular oscillatory nuclear movement and efficient recombination during meiotic prophase in fission yeast. The model corresponds to the dynein light chain (DLC)-like domain found in SpDlc1 and similar proteins.	110
412005	cd21457	DLC-like_TDA2	dynein light chain (DLC)-like domain found in topoisomerase I damage affected protein 2 (TDA2) and similar proteins. Topoisomerase I damage affected protein 2 (TDA2) is a novel protein of the endocytic machinery necessary for normal internalization of native cargo in yeast. It works independently of the dynein motor complex and microtubules. The model corresponds to the dynein light chain (DLC)-like domain of TDA2.	108
412006	cd21458	DLC-like_TCTEX1D1	dynein light chain (DLC)-like domain found in Tctex1 domain-containing protein 1 (TCTEX1D1) and similar proteins. Tctex1 domain-containing protein 1 (TCTEX1D1) is a genetic modifier of disease progression in Duchenne muscular dystrophy (DMD). It can interact with ZMYND10 that stabilizes intermediate chain proteins in the cytoplasmic pre-assembly of dynein arms. The model corresponds to the dynein light chain (DLC)-like domain of TCTEX1D1.	104
412007	cd21459	DLC-like_TCTEX1D2	dynein light chain (DLC)-like domain found in Tctex1 domain-containing protein 2 (TCTEX1D2) and similar proteins. Tctex1 domain-containing protein 2 (TCTEX1D2) is required for proper retrograde ciliary transport. It associates with short-rib polydactyly syndrome proteins, such as Wdr34, Wdr60, and other dynein complex 1 and 2 subunits, and is required for ciliogenesis. TCTEX1D2 is a negative regulator of GLUT4 translocation and glucose uptake. The model corresponds to the dynein light chain (DLC)-like domain of TCTEX1D2.	104
412008	cd21460	DLC-like_TCTEX1D3	dynein light chain (DLC)-like domain found in Tctex1 domain-containing protein 3 (TCTEX1D3) and similar proteins. Tctex1 domain-containing protein 3 (TCTEX1D3), also called T-complex testis-specific protein 3, or T-complex-associated testis-expressed protein 3 (Tcte-3), may be an accessory component of axonemal dynein and cytoplasmic dynein 1. The model corresponds to the dynein light chain (DLC)-like domain of TCTEX1D3.	112
412009	cd21461	DLC-like_TCTEX1D4	dynein light chain (DLC)-like domain found in Tctex1 domain-containing protein 4 (TCTEX1D4) and similar proteins. Tctex1 domain-containing protein 4 (TCTEX1D4), also called protein N22.1, or Tctex-2-beta, is a novel protein phosphatase 1 (PPP1) interactor. It also interacts with ENG/endoglin, TGFBR2 and TGFBR3. The distribution of TCTEX1D4 in testis suggests its involvement in distinct functions, such as TGFbeta signaling at the blood-testis barrier and acrosome cap formation. The model corresponds to the dynein light chain (DLC)-like domain of TCTEX1D4.	114
412010	cd21462	DLC-like_DYNLT1	dynein light chain (DLC)-like domain found in dynein light chain Tctex-type 1 (DYNLT1) and similar proteins. Dynein light chain Tctex-type 1 (DYNLT1), also called TCTEL1, or TCTEX1, or protein CW-1, or T-complex testis-specific protein 1 homolog, is a non-catalytic accessory component of the cytoplasmic dynein 1 complex that is thought to be involved in linking dynein to cargos and to adapter proteins that regulate dynein function. It plays a role in neuronal morphogenesis. It is involved in intracellular targeting of D-type retrovirus gag polyproteins to the cytoplasmic assembly site. The model corresponds to the dynein light chain (DLC)-like domain of DYNLT1.	102
412011	cd21463	DLC-like_DYNLT3	dynein light chain (DLC)-like domain found in dynein light chain Tctex-type 3 (DYNLT3) and similar proteins. Dynein light chain Tctex-type 3 (DYNLT3), also called rp3, or protein 91/23, or T-complex-associated testis-expressed 1-like, is a non-catalytic accessory component of the cytoplasmic dynein 1 complex that is thought to be involved in linking dynein to cargos and to adapter proteins that regulate dynein function. It has a potential role in chromosome congression in human mitosis and is required for chromosome alignment during mouse oocyte meiotic maturation. The DYNLT3 light chain directly links cytoplasmic dynein to a spindle checkpoint protein, Bub3. The model corresponds to the dynein light chain (DLC)-like domain of DYNLT3.	97
410550	cd21464	7tm_GPR137	GPR137 family belonging to the seven-transmembrane G protein-coupled receptor superfamily. The GPR137 family includes GPR137A, GPR137B, and GPR137C, which are all orphan G protein-coupled receptors (GPCRs). GPR137A, also called GPR137 or transmembrane 7 superfamily member 1-like 1 protein (TM7SF1L1), is expressed in the central nervous system (CNS), endocrine gland, thymus, and lung. It is associated with different cancers including gastric cancer, pancreatic cancer, colon cancer, and malignant glioma. GPR137B, also called transmembrane 7 superfamily member 1 (TM7SF1), is a lysosome integral membrane protein that is strongly expressed in the heart, liver, kidney, and brain. It is associated with M2 macrophage polarization, and has been shown to perform a regulatory function in controlling dynamic Rag and mTORC1 localization and activity, as well as lysosome morphology. GPR137C, also called transmembrane 7 superfamily member 1-like 2 protein (TM7SF1L2), may be a key player in the prognosis of small cell lung cancer. GPCRs transmit physiological signals from the outside of the cell to the inside via G proteins. All GPCRs share a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	310
410574	cd21465	LPS_wlbK-like	Bordetella wlbK gene product domains involved in bacterial polysaccharide synthesis, and similar domains. This model includes gene wlbJ (also known as bplJ, bplK, wlbjK) product protein, one of 12 genes that is involved in lipopolysaccharide (LPS) synthesis. The lipopolysaccharides (LPS) of Bordetella species are pyrogenic, mitogenic, and toxic, and can activate and induce tumor necrosis factor production in macrophages, similar to endotoxins from other gram-negative bacteria. Also, while the family Enterobacteriaceae expresses smooth-type LPS, the Bordetella LPS molecules differ in chemical structure; B. bronchiseptica and B. parapertussis synthesize a long-chain polysaccharide consisting of a homopolymer of 2,3-dideoxy-2,3-diN-acetylgalactosaminuronic acid (2,3-diNAcGalA), known as O antigen, whereas B. pertussis does not and is therefore more similar to rough-type LPS. This substantial structural difference between the LPS molecules of the three main pathogenic bordetellae likely confers quite different surface properties on the different species. Gene characterization studies show that wlbJ and wlbK are two apparently separate genes in B. pertussis, but are fused into a single open reading frame in B. bronchiseptica and B. parapertussis. Studies show that mutations in wlbJK do not affect LPS biosynthesis but their function remains unclear.	154
394821	cd21466	Ubl2_cv_PLpro_N_Nsp3-like	second ubiquitin-like (Ubl) domain located N-terminal to the coronavirus SARS-CoV papain-like protease (PLpro) domain in the non-structural protein 3 (Nsp3) and related proteins. Severe acute respiratory syndrome coronavirus (SARS-CoV) non-structural proteins (Nsps) are encoded in ORF1a and ORF1b. Post infection, the SARS-CoV genomic RNA is released into the cytoplasm of the cell and translated into two long polyproteins (pps), pp1a and pp1ab. Papain-like protease (PLpro) is one of two SARS-CoV proteases which process these polyproteins; it cleaves pp1a at three sites, releasing Nsp1, Nsp2, and Nsp3. Nsp3 is a large multi-functional multi-domain protein which is an essential component of the replication/transcription complex (RTC). This ubiquitin-like (Ubl) domain (sometimes referred to as Ubl2, the second Ubl domain of Nsp3) is located N-terminal to the PLpro domain of Nsp3. In addition to being a protease, SARS-CoV PLpro is a deubiquitinating enzyme (DUB), and may be involved in subverting cellular ubiquitination machinery to facilitate viral replication. A number of cellular DUBs have a Ubl domain, where it may serve a regulatory function. The exact functional role of this Ubl domain is unclear.	54
394822	cd21467	Ubl1_cv_Nsp3_N-like	first ubiquitin-like (Ubl) domain located at the N-terminus of coronavirus SARS-CoV non-structural protein 3 (Nsp3) and related proteins. This ubiquitin-like (Ubl) domain (Ubl1) is found at the N-terminus of coronavirus Nsp3, a large multi-functional multi-domain protein which is an essential component of the replication/transcription complex (RTC). The functions of Ubl1 in CoVs are related to single-stranded RNA (ssRNA) binding and to interacting with the nucleocapsid (N) protein. SARS-CoV Ubl1 has been shown to bind ssRNA having AUA patterns, and since the 5'-UTR of the SARS-CoV genome has a number of AUA repeats, it may bind there. In mouse hepatitis virus (MHV), this Ubl1 domain binds the cognate N protein. Adjacent to Ubl1 is a Glu-rich acidic region (also referred to as hypervariable region, HVR); Ubl1 together with HVR has been called Nsp3a. Currently, the function of HVR in CoVs is unknown. This model corresponds to one of two Ubl domains in Nsp3; the other is located N-terminal to the papain-like protease (PLpro) and is not represented by this model.	89
410575	cd21468	LPS_wlbK_N-like	N-terminal domain of Bordetella wlbK gene product involved in bacterial polysaccharide synthesis, and similar domains. This model includes the N-terminal domain of the gene wlbJ (also known as bplJ, bplK, wlbjK) product protein, one of 12 genes that is involved in lipopolysaccharide (LPS) synthesis. The lipopolysaccharides (LPS) of Bordetella species are pyrogenic, mitogenic, and toxic, and can activate and induce tumor necrosis factor production in macrophages, similar to endotoxins from other gram-negative bacteria. Also, while the family Enterobacteriaceae expresses smooth-type LPS, the Bordetella LPS molecules differ in chemical structure; B. bronchiseptica and B. parapertussis synthesize a long-chain polysaccharide consisting of a homopolymer of 2,3-dideoxy-2,3-diN-acetylgalactosaminuronic acid (2,3-diNAcGalA), known as O antigen, whereas B. pertussis does not and is therefore more similar to rough-type LPS. This substantial structural difference between the LPS molecules of the three main pathogenic bordetellae likely confers quite different surface properties on the different species. Gene characterization studies show that wlbJ and wlbK are two apparently separate genes in B. pertussis, but are fused into a single open reading frame in B. bronchiseptica and B. parapertussis. Studies show that mutations in wlbJK do not affect LPS biosynthesis, but their function remains unclear.	154
410576	cd21469	LPS_wlbK_C-like	C-terminal domain of Bordetella wlbK gene product involved in bacterial polysaccharide synthesis, and similar domains. This model includes the C-terminal domain of the gene wlbJ (also known as bplJ, bplK, wlbjK) product protein, one of 12 genes that is involved in lipopolysaccharide (LPS) synthesis. The lipopolysaccharides (LPS) of Bordetella species are pyrogenic, mitogenic, and toxic, and can activate and induce tumor necrosis factor production in macrophages, similar to endotoxins from other gram-negative bacteria. Also, while the family Enterobacteriaceae expresses smooth-type LPS, the Bordetella LPS molecules differ in chemical structure; B. bronchiseptica and B. parapertussis synthesize a long-chain polysaccharide consisting of a homopolymer of 2,3-dideoxy-2,3-diN-acetylgalactosaminuronic acid (2,3-diNAcGalA), known as O antigen, whereas B. pertussis does not and is therefore more similar to rough-type LPS. This substantial structural difference between the LPS molecules of the three main pathogenic bordetellae likely confers quite different surface properties on the different species. Gene characterization studies show that wlbJ and wlbK are two apparently separate genes in B. pertussis, but are fused into a single open reading frame in B. bronchiseptica and B. parapertussis. Studies show that mutations in wlbJK do not affect LPS biosynthesis, but their function remains unclear.	154
394823	cd21470	CoV_Spike_S1_RBD	receptor-binding domain of the S1 subunit of coronavirus spike (S) proteins. This family contains the receptor-binding domain (RBD) of the S1 subunit of coronavirus (CoV) spike (S) proteins from three highly pathogenic human coronaviruses (CoVs), including Middle East respiratory syndrome coronavirus (MERS-CoV), Severe acute respiratory syndrome (SARS) coronavirus (SARS-CoV), and SARS coronavirus 2 (SARS-CoV-2), also known as a 2019 novel coronavirus (2019-nCoV), as well as S proteins from related coronaviruses. The CoV S protein is an envelope glycoprotein that plays the most important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains a hydrophobic fusion peptide and two heptad repeat regions. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-terminal domain (C-domain). Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs, including SARS-CoV-2, SARS-CoV and MERS-CoV use the C-domain to bind their receptors. MHV uses mouse carcinoembryonic antigen related cell adhesion molecule 1a (mCEACAM1a) as the receptor, and the receptors for SARS-CoV and MERS-CoV are human angiotensin-converting enzyme 2 (ACE2) and human dipeptidyl peptidase 4 (DPP4), respectively. Recent studies found that the RBD of SARS-CoV-2 S protein binds strongly to human and bat angiotensin-converting enzyme 2 (ACE2) receptors. Moreover, SARS-CoV-2 RBD exhibited significantly higher binding affinity to the ACE2 receptor than SARS-CoV RBD. Due to the key role of the S protein RBD in viral attachment, it is the major target for antibody-mediated neutralization. This model corresponds to the S1 subunit C-domain that serves as the RBD for most CoVs.	171
409193	cd21471	CrtC-like	carotenoid 1,2-hydratase and similar proteins. Carotenoid 1,2-hydratase (EC 4.2.1.131; CrtC; also known as acyclic carotenoid 1,2-hydratase, 1-hydroxyneurosporene hydratase, hydroxylycopene hydratase, hydroxyneurosporene synthase, lycopene hydratase, or neurosporene hydratase) is an enzyme with the systematic name lycopene hydro-lyase (1-hydroxy-1,2-dihydrolycopene-forming). It is involved in the biosynthesis of carotenoids such as lycopenes. It catalyzes the hydration of neurosporene to the corresponding hydroxylated carotenoids 1-HO-neurosporene and that of lycopene to 1-HO-lycopene. Studies suggest that CrtC may be bound to the membrane through an anchor so that a close distance to the substrate, which is synthesized in the cell membranes, is facilitated.	276
411068	cd21472	NocO-like	cyanobacterial NocO and similar proteins. This family includes many uncharacterized proteins similar to cyanobacterial NocO and NocN, which are involved in the synthesis of natural oxadiazines such as nocuolin A (NoA, exhibits anti-proliferative activity against human cancer cell lines). Members are also similar to cyanobacterial ColD and ColE, putative acyl halogenases involved in columbamide biosynthesis.	356
394836	cd21473	cv_Nsp4_TM	coronavirus non-structural protein 4 (Nsp4) transmembrane domain. Nsp4 may be involved in coronavirus-induced membrane remodeling. In order to assemble the replication-transcription complex (RTC), coronavirus induces the rearrangement of host endoplasmic reticulum (ER) membrane into double membrane vesicles (DMVs), zippered ER, or ER spherules. DMV formation has been observed in SARS-CoV cells overexpressing the three transmembrane-containing non-structural proteins of viral replicase polyprotein 1ab: Nsp3, Nsp4 and Nsp6. Together, Nsp3, Nsp4, and Nsp6 have the ability to induce the formation of DMVs that are similar to those seen in SARS-CoV-infected cells.	376
410551	cd21474	7tm_GPR137A	Integral membrane protein GPR137A, an orphan receptor member of the seven-transmembrane G protein-coupled receptor superfamily. GPR137A, also called GPR137 or transmembrane 7 superfamily member 1-like 1 protein (TM7SF1L1), is an orphan G protein-coupled receptor (GPCR) expressed in the central nervous system (CNS), endocrine gland, thymus, and lung. It is associated with different cancers including gastric cancer, pancreatic cancer, colon cancer, and malignant glioma. It is highly expressed in ovarian cancer and plays a pro-oncogenic role in the disease, promoting cell proliferation and metastasis through regulation of the PI3K/AKT pathway. GPCRs transmit physiological signals from the outside of the cell to the inside via G proteins. All GPCRs share a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	310
410552	cd21475	7tm_GPR137C	Integral membrane protein GPR137C, an orphan receptor member of the seven-transmembrane G protein-coupled receptor superfamily. GPR137C, also called transmembrane 7 superfamily member 1-like 2 protein (TM7SF1L2), is an orphan G protein-coupled receptor (GPCR) of unknown function. Bioinformatics analysis identified it as a likely key player in the prognosis of small cell lung cancer. GPCRs transmit physiological signals from the outside of the cell to the inside via G proteins. All GPCRs share a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	319
410553	cd21476	7tm_GPR137B	Integral membrane protein GPR137B, an orphan receptor member of the seven-transmembrane G protein-coupled receptor superfamily. GPR137B, also called transmembrane 7 superfamily member 1 (TM7SF1), is a lysosome integral membrane protein that is strongly expressed in the heart, liver, kidney, and brain. It is an orphan G protein-coupled receptor (GPCR) associated with M2 macrophage polarization, and has been shown to perform a regulatory function in controlling dynamic Rag and mTORC1 localization and activity, as well as lysosome morphology. It also plays a role in bone remodeling in mouse and zebrafish, functioning as a negative regulator of osteoclast activity essential for normal resorption and patterning of the skeleton. GPCRs transmit physiological signals from the outside of the cell to the inside via G proteins. All GPCRs share a common structural architecture comprising seven-transmembrane (TM) alpha-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes.	320
394824	cd21477	SARS-CoV-like_Spike_S1_RBD	receptor-binding domain of the S1 subunit of severe acute respiratory syndrome-related coronavirus Spike (S) protein and similar proteins. This subfamily contains the receptor-binding domain of the S1 subunit of coronavirus (CoV) spike (S) proteins from highly pathogenic human virus, severe acute respiratory syndrome (SARS) coronavirus (SARS-CoV), SARS coronavirus 2 (SARS-CoV-2), also known as 2019 novel coronavirus (2019-nCoV), and other SARS-like coronaviruses. The CoV S protein is an envelope glycoprotein that plays the most important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains a hydrophobic fusion peptide and two heptad repeat regions. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-terminal domain (C-domain). Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs, including SARS-CoV-2 and SARS-CoV use the C-domain to bind their receptors. Recent studies found that the receptor-binding domain (RBD) of SARS-CoV-2 S protein binds strongly to human and bat angiotensin-converting enzyme 2 (ACE2) receptors. Moreover, SARS-CoV-2 RBD exhibited significantly higher binding affinity to the ACE2 receptor than SARS-CoV RBD. Due to the key role of the S protein RBD in viral attachment, it is the major target for antibody-mediated neutralization. This model corresponds to the S1 subunit C-domain that serves as the RBD for SARS-CoV, SARS-CoV-2, and most CoVs.	205
394825	cd21478	HKU1-like_CoV_Spike_S1_RBD	receptor-binding domain of the S1 subunit of the Spike (S) protein from human coronavirus HKU1 and related coronaviruses. This family contains the receptor-binding domain (RBD) of the S1 subunit of the spike (S) protein from human coronavirus (CoV) HKU1, human coronavirus OC43 (HCoV-OC43), mouse hepatitis virus (MHV), porcine hemagglutinating encephalomyelitis virus (HEV), and other related coronaviruses. HKU1 is a human betacoronavirus that causes mild yet prevalent respiratory disease. HCoV-OC43 is of zoonotic origin and is endemic in the human population, causing mild respiratory tract infections and possible severe complications or fatalities in young children, the elderly, and immunocompromised individuals. MHV is the most common viral pathogen in contemporary laboratory mouse colonies manifesting as a primary infection in the upper respiratory tract. Porcine HEV is associated with acute outbreaks of wasting and encephalitis in nursing piglets from pig farms. These viruses are related to the zoonotic SARS and MERS betacoronaviruses, which have high fatality rates and pandemic potential. The CoV S protein is an envelope glycoprotein that plays the most important role in viral attachment, fusion and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains a hydrophobic fusion peptide and two heptad repeat regions. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-terminal domain (C-domain). Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBDs of MHV and HCoV-OC43 are located at the NTD, most CoVs use the C-domain to bind their receptors. Although a protein receptor has not yet been identified for HKU1, antibodies against the C-domain, but not those against the NTD, blocked HKU1 infection of cells, suggesting that the S1 C-domain is the primary HKU1 receptor-binding site. Due to the key role of the S protein RBD in viral attachment, it is the major target for antibody-mediated neutralization. This model corresponds to the S1 subunit C-domain that serves as the RBD for most CoVs.	223
394826	cd21479	MERS-like_CoV_Spike_S1_RBD	receptor-binding domain of the S1 subunit of the Spike (S) protein from Middle East respiratory syndrome coronavirus. This family contains the receptor-binding domain (RBD) of the S1 subunit of the spike (S) protein from the human coronavirus that causes Middle East Respiratory Syndrome (MERS-CoV) and related coronaviruses from animals. MERS-CoV causes severe pulmonary disease in humans. The CoV S protein is an envelope glycoprotein that plays the most important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains a hydrophobic fusion peptide and two heptad repeat regions. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-terminal domain (C-domain). Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs, including MERS-CoV, use the C-domain to bind their receptors. MERS-CoV uses human dipeptidyl peptidase 4 (DPP4), also called CD26, as its receptor. It binds DPP4 through the RBD of its S1 subunit and then fuses viral and host membranes through its S2 subunit. Due to the key role of the S protein RBD in viral attachment, it is the major target for antibody-mediated neutralization. This model corresponds to the S1 subunit C-domain that serves as the RBD for most CoVs including MERS-CoV.	216
394827	cd21480	SARS-CoV-2_Spike_S1_RBD	receptor-binding domain of the S1 subunit of severe acute respiratory syndrome coronavirus 2 Spike (S) protein. This group contains the receptor-binding domain of the S1 subunit of the spike (S) protein from highly pathogenic human virus, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), also known as 2019 novel coronavirus (2019-nCoV). The CoV S protein is an envelope glycoprotein that plays the most important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains a hydrophobic fusion peptide and two heptad repeat regions. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-terminal domain (C-domain). Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most other CoVs, including SARS-CoV-2, use the C-domain to bind their receptors. Recent studies found that the receptor-binding domain (RBD) of SARS-CoV-2 S protein binds strongly to human and bat angiotensin-converting enzyme 2 (ACE2) receptors. Moreover, SARS-CoV-2 RBD exhibited significantly higher binding affinity to the ACE2 receptor than SARS-CoV RBD. Due to the key role of the S protein RBD in viral attachment, it is the major target for antibody-mediated neutralization. This model corresponds to the S1 subunit C-domain that serves as the RBD for SARS-CoV-2 and most CoVs.	223
394828	cd21481	SARS-CoV_Spike_S1_RBD	receptor-binding domain of the S1 subunit of severe acute respiratory syndrome-related coronavirus Spike (S) protein. This group contains the receptor-binding domain of the S1 subunit of the spike (S) protein from severe acute respiratory syndrome-related coronavirus (SARS-CoV) and similar coronaviruses. The CoV S protein is an envelope glycoprotein that plays the most important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains a hydrophobic fusion peptide and two heptad repeat regions. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-terminal domain (C-domain). Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs, including SARS-CoV use the C-domain to bind their receptors. Due to the key role of the S protein RBD in viral attachment, it is the major target for antibody-mediated neutralization. This model corresponds to the S1 subunit C-domain that serves as the RBD for SARS-CoV and most CoVs.	222
394829	cd21482	HKU1_N5-like_CoV_Spike_S1_RBD	receptor-binding domain of the S1 subunit of the Spike (S) protein from human coronavirus HKU1, isolate N5 and isolate N2. This group contains the receptor-binding domain (RBD) of the S1 subunit of the spike (S) protein from human coronavirus (CoV) HKU1, isolates N5 and N2. HKU1 is a human betacoronavirus that causes mild yet prevalent respiratory disease, and is related to the zoonotic SARS and MERS betacoronaviruses, which have high fatality rates and pandemic potential. The CoV S protein is an envelope glycoprotein that plays the most important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains a hydrophobic fusion peptide and two heptad repeat regions. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-terminal domain (C-domain). Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs use the C-domain to bind their receptors. Although a protein receptor has not yet been identified for HKU1, antibodies against the C-domain, but not those against the NTD, blocked HKU1 infection of cells, suggesting that the S1 C-domain is the primary HKU1 receptor-binding site. Due to the key role of the S protein RBD in viral attachment, it is the major target for antibody-mediated neutralization. This model corresponds to the S1 subunit C-domain that serves as the RBD for most CoVs, and most likely, for HKU1.	304
394830	cd21483	HKU1_N1_CoV_Spike_S1_RBD	receptor-binding domain of the S1 subunit of the Spike (S) protein from human coronavirus HKU1, isolate N1. This group contains the receptor-binding domain (RBD) of the S1 subunit of the spike (S) protein from human coronavirus (CoV) HKU1, isolate N1. HKU1 is a human betacoronavirus that causes mild yet prevalent respiratory disease, and is related to the zoonotic SARS and MERS betacoronaviruses, which have high fatality rates and pandemic potential. The CoV S protein is an envelope glycoprotein that plays the most important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains a hydrophobic fusion peptide and two heptad repeat regions. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-terminal domain (C-domain). Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs use the C-domain to bind their receptors. Although a protein receptor has not yet been identified for HKU1, antibodies against the C-domain, but not those against the NTD, blocked HKU1 infection of cells, suggesting that the S1 C-domain is the primary HKU1 receptor-binding site. Due to the key role of the S protein RBD in viral attachment, it is the major target for antibody-mediated neutralization. This model corresponds to the S1 subunit C-domain that serves as the RBD for most CoVs, and most likely, for HKU1.	306
394831	cd21484	MHV-like_Spike_S1_RBD	receptor-binding domain of the S1 subunit of the Spike (S) protein from mouse hepatitis virus and other rodent coronaviruses. This group contains the receptor-binding domain (RBD) of the S1 subunit of the spike (S) protein from mouse hepatitis virus (MHV) and other rodent coronaviruses. MHV is the most common viral pathogen in contemporary laboratory mouse colonies manifesting as a primary infection in the upper respiratory tract. The CoV S protein is an envelope glycoprotein that plays the most important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains a hydrophobic fusion peptide and two heptad repeat regions. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-terminal domain (C-domain). Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). MHV uses mouse carcinoembryonic antigen related cell adhesion molecule 1a (mCEACAM1a) as the receptor; the RBD of MHV is located at the NTD. Most CoVs, such as SARS-CoV and MERS-CoV, use the C-domain to bind their receptors. Due to the key role of the S protein RBD in viral attachment, it is the major target for antibody-mediated neutralization. This model corresponds to the S1 subunit C-domain that serves as the RBD for most CoVs.	264
394832	cd21485	HCoV-OC43-like_Spike_S1_RBD	receptor-binding domain of the S1 subunit of the Spike (S) protein from human coronavirus OC43 and related proteins. This group contains the receptor-binding domain (RBD) of the S1 subunit of the spike (S) protein from several betacoronaviruses including human coronavirus OC43 (HCoV-OC43) and bovine respiratory coronavirus (BCoV), among others. HCoV-OC43 is of zoonotic origin and is endemic in the human population, causing mild respiratory tract infections and possible severe complications or fatalities in young children, the elderly, and immunocompromised individuals. These viruses are related to the zoonotic SARS and MERS betacoronaviruses, which have high fatality rates and pandemic potential. The CoV S protein is an envelope glycoprotein that plays the most important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains a hydrophobic fusion peptide and two heptad repeat regions. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-terminal domain (C-domain). Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs use the C-domain to bind their receptors. It has been reported that HCoV-OC43 uses 9-O-acetyl-sialic acid (9-O-Ac-Sia) as a receptor, which is terminally linked to oligosaccharides decorating glycoproteins and gangliosides at the host cell surface. HCoV-OC43 appears to bind 9-O-Ac-Sia at the NTD. Due to the key role of the S protein RBD in viral attachment, it is the major target for antibody-mediated neutralization. This model corresponds to the S1 subunit C-domain that serves as the RBD for most CoVs.	312
394833	cd21486	human_MERS-CoV_Spike_S1_RBD	receptor-binding domain of the S1 subunit of the Spike (S) protein from human Middle East respiratory syndrome coronavirus. This group contains the receptor-binding domain (RBD) of the S1 subunit of the spike (S) protein from the human coronavirus that causes Middle East Respiratory Syndrome (MERS-CoV). MERS-CoV causes severe pulmonary disease in humans. The CoV S protein is an envelope glycoprotein that plays the most important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains a hydrophobic fusion peptide and two heptad repeat regions. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-terminal domain (C-domain). Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs, including MERS-CoV, use the C-domain to bind their receptors. MERS-CoV uses human dipeptidyl peptidase 4 (DPP4), also called CD26, as its receptor. It binds DPP4 through the RBD of its S1 subunit and then fuses viral and host membranes through its S2 subunit. Due to the key role of the S protein RBD in viral attachment, it is the major target for antibody-mediated neutralization. This model corresponds to the S1 subunit C-domain that serves as the RBD for most CoVs including MERS-CoV.	219
394834	cd21487	bat_HKU4-like_Spike_S1_RBD	receptor-binding domain of the S1 subunit of the Spike (S) protein from Tylonycteris bat coronavirus HKU4 and similar proteins. This group contains the receptor-binding domain (RBD) of the S1 subunit of the spike (S) protein from Tylonycteris bat coronavirus HKU4 and other Middle East Respiratory Syndrome (MERS)-related coronaviruses. The CoV S protein is an envelope glycoprotein that plays the most important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains a hydrophobic fusion peptide and two heptad repeat regions. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-terminal domain (C-domain). Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs, including human MERS-CoV that is phylogenetically closely related to bat CoV HKU4 use the C-domain to bind their receptors. HKU4 is able to bind the MERS-CoV receptor, human dipeptidyl peptidase 4 (DPP4), also called CD26. Due to the key role of the S protein RBD in viral attachment, it is the major target for antibody-mediated neutralization. This model corresponds to the S1 subunit C-domain that serves as the RBD for most CoVs including MERS-CoV, and most likely, bat CoV HKU4.	219
411069	cd21494	TMEM168	Transmembrane (TMEM) protein family 168. This family includes transmembrane protein 168 (TMEM168) and similar proteins. TMEM168 is a multi-pass membrane protein predicted to contain nine transmembrane helices. Its expression has been implicated in several diseases. It is upregulated in glioblastoma multiforme (GBM) and small interfering RNA (siRNA)-TMEM168 has been shown to prevent viability of human GBM cell lines, induce cell cycle arrest (G0/G1 phase), and promote apoptosis through the suppression of the Wnt/beta-catenin pathway. A TMEM168 point mutation has also been associated with arrhythmogenesis in familial Brugada syndrome.	681
412059	cd21500	NleA-like	bacterial effector protein NleA, and similar proteins. This family includes non-locus of enterocyte effacement (non-LEE) encoded effector A (NleA), a bacterial effector protein injected by enteropathogenic and enterohemorrhagic Escherichia coli (EPEC and EHEC), both related strains capable of inducing severe gastrointestinal disease. These pathogens modulate cellular functions via the deployment of effector proteins in a type three secretion system (T3SS)-dependent manner. In response, the host Nod-like receptor pyrin domain containing (NLRP) inflammasome activates caspase-1 and releases IL-1beta. NleA plays a role in controlling the host immune response through targeting of Nod-like receptor 3 (NLRP3); it has been identified as the effector that can subdue IL-1beta secretion by inhibiting caspase-1 activation, thus inhibiting NLRP inflammasome activation. NleA interacts with NLRP3 via regions containing the PYD and LRR domains. NleA has also been shown to associate with non-ubiquitinated and ubiquitinated NLRP3 and to interrupt de-ubiquitination of NLRP3, which is a required process for inflammasome activation.	425
411070	cd21501	GtgE	Salmonella enterica effector protein GtgE. The Salmonella enterica GtgE effector protein contributes to the virulence of this pathogen by modulating trafficking of the Salmonella-containing vacuole. GtgE, which exclusively targets inactive Rab GTPases, has been identified as a cysteine protease with the typical Cys-His-Asp catalytic triad. It functions by cleaving the Rab-family GTPases Rab29, Rab32 and Rab38, thereby preventing the delivery of antimicrobial factors to the bacteria-containing vacuole. It has been shown to solely process the inactive GDP-bound GTPase Rab32. However, weak binding of GtgE to the peptide encompassing the Rab29 cleavage site suggests that the function of GtgE may be dependent on other factors, such as a protein partner or interactions with the Salmonella-containing vacuole (SCV) membrane.	174
411071	cd21502	vWA_BABAM1	Von-Willebrand factor A (vWA) domain found in BRISC and BRCA1-A complex member 1. BRISC and BRCA1 A complex member 1 (BABAM1) is also known as Mediator of RAP80 interactions and targeting subunit of 40 kDa (MERIT40), New component of the BRCA1-A complex (NBA1), HSPC142, or C19orf620. It is a core component of the BRCA1-A and BRISC complexes that function in DNA double-strand break repair and immune signaling, and contain the lysine-63 linkage-specific BRCC36 subunit that is functionalized by scaffold subunits Abraxas and ABRO1, respectively. BABAM1 interacts with Rap80, BRCC36, BRCC45, and Abraxas to form the BRCA1-A complex, a lysine-63-Ub specific deubiquitinating enzyme (DUB) which specifically recognizes lysine-63-linked ubiquitinated histones H2A and H2AX at DNA lesion sites, thereby targeting the BRCA1-BARD1 heterodimer to sites of DNA damage at double-strand breaks (DSBs). BRISC is a DUB complex containing three other subunits, BRCC36, ABRO1 and BRCC45. It specifically hydrolyzes lysine-63 polyubiquitin chains, and is involved in multiple biological processes, including IFN-mediated antiviral immune regulation and inflammatory reaction. BABAM1 likely serves as a scaffold protein by integrating other components to form a functional complex. Furthermore, BABAM1 has been shown to play a critical role in BRISC-mediated regulation of Tankyrase1 (TNKS1) function during spindle assembly; it directly binds to the ankyrin repeat cluster V (ARC-V) domain of TNKS1 via its RXXPEG motif. BABAM1 contains a Von-Willebrand factor A (vWA) domain that is distantly related to classical vWA domains.	216
409631	cd21503	ABC-2_lan_permease	lantibiotic immunity ABC transporter permease (also called ABC-2 transporter permease) subunit. This family contains lantibiotic ABC transporter permease subunits which are highly hydrophobic, integral membrane proteins, and part of the bacitracin ABC transport system that confers resistance to the Gram-positive bacteria in which this system operates, particularly to type-A lantibiotics. Lantibiotics are small peptides, produced by Gram-positive bacteria, which are ribosomally-synthesized as pre-peptides and act by disrupting membrane integrity. Genes encoding the lantibiotic ABC transporter subunits are highly organized in operons containing all the genes required for maturation, transport, immunity, and synthesis. For example, in Lactococcus lactis, the lantibiotic nisin is active against other Gram-positive bacteria via various modes of actions; however, its self-protection against the pore-forming nisin is mediated by the ABC transporter composed of NisF, NisE and NisG subunits. This family includes the Lactococcus lactis NisG permease subunit that transports nisin to the surface and expels it from the membrane. This family also includes the lantibiotic ABC transporter permease subunits EpiE, MutE, MutG, and SlvE. Self-protection of the epidermin-producing strain Staphylococcus epidermidis Tu3298 against the pore-forming lantibiotic epidermin is mediated by an ABC transporter composed of the EpiF, EpiE, and EpiG proteins. In the mutacin I-producing strain Streptococcus mutans CH43, self-immunity against mutacin I is mediated by proteins MutF, MutE, and MutG, while in salivaricin D-producing strain Streptococcus salivarius 5M6c, mediation is via ABC transporter proteins SlvF, SlvE, and SlvG.	221
410337	cd21504	PPP2R3A_B-like	serine/threonine protein phosphatase 2A regulatory subunit B" alpha and beta subunits, and similar proteins. Heterotrimeric serine/threonine protein phosphatase 2A (PP2A) consists of scaffolding (A), catalytic (C), and variable (B, B', and B") subunits. The variable subunits dictate subcellular localization and substrate specificity of the PP2A holoenzyme. These B-family regulatory subunits play various roles including regulation of cytoskeletal assembly, neuronal differentiation, mitogen-activated protein kinase signaling, and apoptosis. This subfamily includes protein phosphatase 2A regulatory subunit B'' subunits alpha and beta, encoded by PPP2R3A and PPP2R3B. It also includes subunit delta encoded by PPP2R3D in mouse. They contain two-domain elongated structures with two calcium EF-hands which mediate Ca2+-dependent changes in phosphatase activity.	274
410338	cd21505	PPP2R3C	serine/threonine protein phosphatase 2A regulatory subunit B" subunit gamma. Heterotrimeric serine/threonine protein phosphatase 2A (PP2A) consists of scaffolding (A), catalytic (C), and variable (B, B', and B") subunits. The variable subunits dictate subcellular localization and substrate specificity of the PP2A holoenzyme. This subfamily includes protein phosphatase subunit G5PR (also known as serine/threonine-protein phosphatase 2A regulatory subunit B'' subunit gamma, G4-1, G5pr, GDRM, SPGF36, or C14orf10) that is encoded by the PPP2R3C gene. It is involved in the control of the dynamic organization of the cortical cytoskeleton and plays an important role in the organization of interphase microtubule arrays in part through the regulation of nucleation geometry. G5PR is involved in the ontogeny of multiple organs and is especially critical for testis development and spermatogenesis. PPP2R3C gene variants cause syndromic 46,XY gonadal dysgenesis and impaired spermatogenesis in humans, and the gene is thus emerging as a potential therapeutic target for male infertility.	382
410339	cd21506	PPP2R3A	serine/threonine protein phosphatase 2A regulatory subunit B" subunit alpha. Heterotrimeric serine/threonine protein phosphatase 2A (PP2A) consists of scaffolding (A), catalytic (C), and variable (B, B', and B") subunits. The variable subunits dictate subcellular localization and substrate specificity of the PP2A holoenzyme. This group contains protein phosphatase subunit PR130 (also known as protein phosphatase 2A regulatory subunit B'' subunit alpha, PR72, or PPP2R3) that is encoded by the PPP2R3A gene. PR130 and PR72 subunits are derived from the same gene through differential splicing; they harbor specific N-terminal domains of different lengths that are encoded by alternatively spliced exons and have identical C-termini. The common C-terminus contains a two-domain elongated structure with two calcium EF-hands which mediate Ca2+-dependent changes in phosphatase activity. The PR130 subunit has been shown to interact with the LIM domain of lipoma-preferred partner (LPP) through a conserved Zn2+-finger-like motif in the N-terminus of PR130.	284
410340	cd21507	PPP2R3B	serine/threonine protein phosphatase 2A regulatory subunit B" subunit beta. Heterotrimeric serine/threonine protein phosphatase 2A (PP2A) consists of scaffolding (A), catalytic (C), and variable (B, B', and B") subunits. The variable subunits dictate subcellular localization and substrate specificity of the PP2A holoenzyme. This group contains protein phosphatase subunit PR70 (also known as protein phosphatase 2 regulatory subunit B'' subunit beta, PR48, NYREN8, PPP2R3L, or PPP2R3LY) that is encoded by the PPP2R3B gene. This substrate-recognizing subunit of PP2A has a two-domain elongated structure with two calcium EF-hands, each displaying different affinities to Ca2+. PPP2R3B/PR70 is a gonosomal melanoma tumor suppressor gene; PR70 decreased melanoma growth by negatively interfering with DNA replication and cell cycle progression through its role in stabilizing the cell division cycle 6 (CDC6)-chromatin licensing and DNA replication factor 1 (CDT1) interaction, which delays the firing of origins of DNA replication.	355
394835	cd21508	HEV_Spike_S1_RBD	receptor-binding domain of the S1 subunit of the Spike (S) protein from porcine hemagglutinating encephalomyelitis virus. This group contains the receptor-binding domain (RBD) of the S1 subunit of the spike (S) protein from porcine hemagglutinating encephalomyelitis virus (HEV), which is associated with acute outbreaks of wasting and encephalitis in nursing piglets from pig farms. Porcine HEV is related to the zoonotic SARS and MERS betacoronaviruses, which have high fatality rates and pandemic potential. The CoV S protein is an envelope glycoprotein that plays the most important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediate attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains a hydrophobic fusion peptide and two heptad repeat regions. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-terminal domain (C-domain). Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs use the C-domain to bind their receptors. The protein receptor for porcine HEV has not yet been identified. Due to the key role of the S protein RBD in viral attachment, it is the major target for antibody-mediated neutralization. This model corresponds to the S1 subunit C-domain that serves as the RBD for most CoVs.	298
411072	cd21510	agarase_cat	alpha-beta barrel catalytic domain of agarase, such as GH86-like endo-acting agarases identified in non-marine organisms. Typically, agarases (E.C. 3.2.1.81) are found in ocean-dwelling bacteria since agarose is a principal component of red algae cell wall polysaccharides. Agarose is a linear polymer of alternating D-galactose and 3,6-anhydro-L-galactopyranose. Endo-acting agarases, such as glycoside hydrolase 16 (GH16) and GH86, hydrolyze internal beta-1,4 linkages. A GH86-like endo-acting agarase of this protein family has been identified in the human intestinal bacterium Bacteroides uniformis. This acquired metabolic pathway, as demonstrated by the prevalence of agar-specific genetic clusters called polysaccharide utilization loci (PULs), varies considerably between human populations, being much more prevalent in a Japanese sample than in North American, European, or Chinese samples. Agarase activity was also identified in the non-marine bacterium Cellvibrio sp.	321
394864	cd21511	cv-alpha_beta_Nsp2-like	alpha- and betacoronavirus non-structural protein 2 (Nsp2), similar to SARS-CoV Nsp2 and HCoV-229E Nsp2, and related proteins. Coronavirus Nsps are encoded in ORF1a and ORF1b. Post infection, the genomic RNA is released into the cytoplasm of the cell and translated into two long polyproteins (pp), pp1a and pp1ab, which are then autoproteolytically cleaved by two viral proteases Nsp3 and Nsp5 into smaller subunits. Nsp2 is one of these subunits. This alpha- and betacoronavirus family includes alphacoronavirus human coronavirus 229E (HCoV-229E) Nsp2, betacoronavirus Severe acute respiratory syndrome coronavirus (SARS-CoV) Nsp2 and Murine hepatitis virus (MHV) Nsp2 (also known as p65). The functions of Nsp2 remain unclear. SARS-CoV Nsp2, rather than playing a role in viral replication, may be involved in altering the host cell environment; deletion of Nsp2 from the SARS-CoV genome results in only a modest reduction in viral titers. It has been shown to interact with two host proteins, prohibitin 1 (PHB1) and PHB2, which have been implicated in cellular functions, including cell-cycle progression, cell migration, cellular differentiation, apoptosis, and mitochondrial biogenesis. MHV Nsp2/p65, different from SARS-CoV Nsp2, may play an important role in the viral life cycle. This family may be distantly related to the gammacoronavirus Avian infectious bronchitis virus (IBV) Nsp2; IBV Nsp2 is a weak protein kinase R (PKR) antagonist, which may suggest that it plays a role in interfering with intracellular immunity.	399
394837	cd21512	cv_gamma-delta_Nsp2_IBV-like	gamma- and deltacoronavirus non-structural protein 2 (Nsp2), similar to IBV Nsp2 and related proteins. Coronavirus non-structural proteins (Nsps) are encoded in ORF1a and ORF1b. Post infection, the genomic RNA is released into the cytoplasm of the cell and translated into two long polyproteins (pp), pp1a and pp1ab, which are then autoproteolytically cleaved by two viral proteases Nsp3 and Nsp5 into smaller subunits. Nsp2 is one of these cleaved subunits. The functions of Nsp2 remain unclear. This gamma- and deltacoronavirus family includes Avian infectious bronchitis virus (IBV) Nsp2 which has been shown to be a weak protein kinase R (PKR) antagonist, which may suggest that it plays a role in interfering with intracellular immunity. This family may be distantly related to a family of alpha- and betacoronavirus Nsp2, which includes severe acute respiratory syndrome coronavirus (SARS-CoV) Nsp2, and Murine hepatitis virus (MHV) Nsp2 (also known as p65). SARS-CoV Nsp2, rather than playing a role in viral replication, may be involved in altering the host cell environment; deletion of Nsp2 from the SARS-CoV genome results in only a modest reduction in viral titers. It has been shown to interact with two host proteins, prohibitin 1 (PHB1) and PHB2, which have been implicated in cellular functions, including cell-cycle progression, cell migration, cellular differentiation, apoptosis, and mitochondrial biogenesis. MHV Nsp2/p65, different from SARS-CoV Nsp2, may play an important role in the viral life cycle.	365
394838	cd21513	SUD_C_DPUP_CoV_Nsp3	C-terminal SARS-Unique Domain (SUD) of betacoronavirus non-structural protein 3 (Nsp3). This family contains the SUD-C of Nsp3 from Severe Acute Respiratory Syndrome (SARS) coronavirus (CoV), Middle East respiratory syndrome-related (MERS) CoV, and Rousettus bat CoV HKU9, as well as the DPUP (domain preceding Ubl2 and PLP2) of murine hepatitis virus (MHV) Nsp3. Though structurally similar, there is little sequence similarity between these four domain subfamilies: SARS SUD-C, MERS SUD-C, HKU9 SUD-C, and MHV DPUP. Non-structural protein 3 (Nsp3) is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. Nsp3 of SARS coronavirus includes a SARS-unique domain (SUD) consisting of three globular domains separated by short linker peptide segments: SUD-N, SUD-M, and SUD-C. SUD-N and SUD-M are macro domains which bind G-quadruplexes (unusual nucleic-acid structures formed by consecutive guanosine nucleotides). The SUD-C domain adopts a frataxin-like fold and has structural similarity to DNA-binding domains of DNA-modifying enzymes. It binds to single-stranded RNA and recognizes purine bases more strongly than pyrimidine bases. SUD-C also regulates the RNA binding behavior of the SUD-M macrodomain. SUD-C is not as specific to SARS CoV Nsp3 as originally thought, and is conserved in the Nsp3s of all four lineages (A-D) of betacoronavirus.	71
394865	cd21514	cv_alpha_Nsp2_HCoV-229E-like	alphacoronavirus non-structural protein 2 (Nsp2), similar to HCoV-229E Nsp2 and related proteins. Coronavirus non-structural proteins (Nsps) are encoded in ORF1a and ORF1b. Post infection, the genomic RNA is released into the cytoplasm of the cell and translated into two long polyproteins (pp), pp1a and pp1ab, which are then autoproteolytically cleaved by two viral proteases Nsp3 and Nsp5 into smaller subunits. Nsp2 is one of these subunits. This subgroup includes alphacoronavirus human coronavirus 229E (HCoV-229E) Nsp2 and belongs to a family which includes betacoronavirus Severe acute respiratory syndrome coronavirus (SARS-CoV) Nsp2, and Murine hepatitis virus (MHV) Nsp2 (also known as p65). The functions of Nsp2 remain unclear. SARS-CoV Nsp2, rather than playing a role in viral replication, may be involved in altering the host cell environment; deletion of Nsp2 from the SARS-CoV genome results in only a modest reduction in viral titers. It has been shown to interact with two host proteins, prohibitin 1 (PHB1) and PHB2, which have been implicated in cellular functions, including cell-cycle progression, cell migration, cellular differentiation, apoptosis, and mitochondrial biogenesis. MHV Nsp2/p65, different from SARS-CoV Nsp2, may play an important role in the viral life cycle.	503
394866	cd21515	cv_beta_Nsp2_SARS_MHV-like	betacoronavirus non-structural protein 2 (Nsp2), similar to SARS-CoV Nsp2 and MHV Nsp2 (p65), and related proteins. Coronavirus non-structural proteins (Nsps) are encoded in ORF1a and ORF1b. Post infection, the genomic RNA is released into the cytoplasm of the cell and translated into two long polyproteins (pp), pp1a and pp1ab, which are then autoproteolytically cleaved by two viral proteases Nsp3 and Nsp5 into smaller subunits. Nsp2 is one of these subunits. This family includes Severe acute respiratory syndrome coronavirus (SARS-CoV) Nsp2 and Murine hepatitis virus (MHV) Nsp2 (also known as p65). The functions of Nsp2 remain unclear. SARS-CoV Nsp2, rather than playing a role in viral replication, may be involved in altering the host cell environment; deletion of Nsp2 from the SARS-CoV genome results in only a modest reduction in viral titers. It has been shown to interact with two host proteins, prohibitin 1 (PHB1) and PHB2, which have been implicated in cellular functions, including cell-cycle progression, cell migration, cellular differentiation, apoptosis, and mitochondrial biogenesis. MHV Nsp2/p65, different from SARS-CoV Nsp2, may play an important role in the viral life cycle.	584
394867	cd21516	cv_beta_Nsp2_SARS-like	betacoronavirus non-structural protein 2 (Nsp2) similar to SARS-CoV Nsp2, and related proteins from betacoronaviruses in the B lineage. Non-structural proteins (Nsps) from Severe acute respiratory syndrome coronavirus (SARS-CoV) and betacoronaviruses in the sarbecovirus subgenera (B lineage) are encoded in ORF1a and ORF1b. Post infection, the SARS-CoV genomic RNA is released into the cytoplasm of the cell and translated into two long polyproteins (pp), pp1a and pp1ab, which are then autoproteolytically cleaved by two viral proteases Nsp3 and Nsp5 into smaller subunits. Nsp2 is one of these subunits. The functions of Nsp2 remain unknown. Deletion of Nsp2 from the SARS-CoV genome results in only a modest reduction in viral titers. Rather than playing a role in viral replication, SARS-CoV Nsp2 may be involved in altering the host cell environment; it has been shown to interact with two host proteins, prohibitin 1 (PHB1) and PHB2 which have been implicated in cellular functions, including cell-cycle progression, cell migration, cellular differentiation, apoptosis, and mitochondrial biogenesis.	637
394868	cd21517	cv_beta_Nsp2_MERS-like	betacoronavirus non-structural protein 2 (Nsp2) similar to MERS-CoV Nsp2, and related proteins from betacoronaviruses in the C lineage. Coronavirus non-structural proteins (Nsps) are encoded in ORF1a and ORF1b. Post infection, the genomic RNA is released into the cytoplasm of the cell and translated into two long polyproteins (pp), pp1a and pp1ab, which are then autoproteolytically cleaved by two viral proteases Nsp3 and Nsp5 into smaller subunits. Nsp2 is one of these subunits. This subgroup includes Nsp2 from Middle East respiratory syndrome-related coronavirus (MERS-CoV) and betacoronaviruses in the merbecovirus subgenera (C lineage). It belongs to a family which includes Severe acute respiratory syndrome coronavirus (SARS-CoV) Nsp2, and Murine hepatitis virus (MHV) Nsp2 (also known as p65). The functions of Nsp2 remain unclear. SARS-CoV Nsp2, rather than playing a role in viral replication, may be involved in altering the host cell environment; deletion of Nsp2 from the SARS-CoV genome results in only a modest reduction in viral titers. It has been shown to interact with two host proteins, prohibitin 1 (PHB1) and PHB2, which have been implicated in cellular functions, including cell-cycle progression, cell migration, cellular differentiation, apoptosis, and mitochondrial biogenesis. MHV Nsp2/p65, different from SARS-CoV Nsp2, may play an important role in the viral life cycle.	660
394869	cd21518	cv_beta_Nsp2_HKU9-like	betacoronavirus non-structural protein 2 (Nsp2) similar to bat coronavirus HKU9 Nsp2, and related proteins from betacoronaviruses in the D lineage. Coronavirus non-structural proteins (Nsps) are encoded in ORF1a and ORF1b. Post infection, the genomic RNA is released into the cytoplasm of the cell and translated into two long polyproteins (pp), pp1a and pp1ab, which are then autoproteolytically cleaved by two viral proteases Nsp3 and Nsp5 into smaller subunits. Nsp2 is one of these subunits. This subgroup includes Nsp2 from Rousettus bat coronavirus HKU9 and betacoronaviruses in the nobecovirus subgenera (D lineage). It belongs to a family which includes Severe acute respiratory syndrome coronavirus (SARS-CoV) Nsp2, and Murine hepatitis virus (MHV) Nsp2 (also known as p65). The functions of Nsp2 remain unclear. SARS-CoV Nsp2, rather than playing a role in viral replication, may be involved in altering the host cell environment; deletion of Nsp2 from the SARS-CoV genome results in only a modest reduction in viral titers. It has been shown to interact with two host proteins, prohibitin 1 (PHB1) and PHB2, which have been implicated in cellular functions, including cell-cycle progression, cell migration, cellular differentiation, apoptosis, and mitochondrial biogenesis. MHV Nsp2/p65, different from SARS-CoV Nsp2, may play an important role in the viral life cycle.	597
394870	cd21519	cv_beta_Nsp2_MHV-like	betacoronavirus non-structural protein 2 (Nsp2) similar to MHV Nsp2/p65 and related proteins from betacoronaviruses in the A lineage. Coronavirus non-structural proteins (Nsps) are encoded in ORF1a and ORF1b. Post infection, the genomic RNA is released into the cytoplasm of the cell and translated into two long polyproteins (pp), pp1a and pp1ab, which are then autoproteolytically cleaved by two viral proteases Nsp3 and Nsp5 into smaller subunits. Nsp2 is one of these subunits. This subgroup includes Nsp2 from Murine hepatitis virus (MHV) and betacoronaviruses in the embecovirus subgenera (A lineage). It belongs to a family which includes Severe acute respiratory syndrome coronavirus (SARS-CoV) Nsp2. The functions of Nsp2 remain unclear. SARS-CoV Nsp2, rather than playing a role in viral replication, may be involved in altering the host cell environment; deletion of Nsp2 from the SARS-CoV genome results in only a modest reduction in viral titers, and it has been shown to interact with two host proteins, prohibitin 1 (PHB1) and PHB2 which have been implicated in cellular functions, including cell-cycle progression, cell migration, cellular differentiation, apoptosis, and mitochondrial biogenesis. MHV Nsp2, also known as p65, different from SARS-CoV Nsp2, may play an important role in the viral life cycle.	586
394839	cd21523	SUD_C_MERS-CoV_Nsp3	C-terminal SARS-Unique Domain (SUD) of non-structural protein 3 (Nsp3) from Middle East respiratory syndrome-related coronavirus and related betacoronaviruses in the C lineage. This subfamily contains the SUD-C of Middle East respiratory syndrome-related (MERS) coronavirus (CoV) Nsp3 and other Nsp3s from betacoronaviruses in the merbecovirus subgenera (C lineage), including several bat-CoVs such as Tylonycteris bat CoV HKU4, Pipistrellus bat CoV HKU5, and Hypsugo bat CoV HKU25. Non-structural protein 3 (Nsp3) is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. Nsp3 of SARS coronavirus includes a SARS-unique domain (SUD) consisting of three globular domains separated by short linker peptide segments: SUD-N, SUD-M, and SUD-C. SUD-N and SUD-M are macro domains which bind G-quadruplexes (unusual nucleic-acid structures formed by consecutive guanosine nucleotides). SUD is not as specific to SARS CoV as originally thought and is also found in MERS and related bat coronaviruses. Similar to SARS SUD-C, Tylonycteris bat-CoV HKU4 SUD-C (HKU4 C), a member of the MERS SUD-C group, also adopts a frataxin-like fold (DOI:10.1177/1934578X19849202) that has structural similarity to DNA-binding domains of DNA-modifying enzymes. However, there is little sequence similarity between the two domains. SARS SUD-C has been shown to bind to single-stranded RNA and recognize purine bases more strongly than pyrimidine bases; it also regulates the RNA binding behavior of the SARS SUD-M macrodomain. It is not known whether MERS SUD-C or HKU4 C functions in the same way. It has been suggested that HKU4 C engages in protein-protein interactions with HKU4 SUD-M.	76
394840	cd21524	DPUP_MHV_Nsp3	DPUP (domain preceding Ubl2 and PLP2) of non-structural protein 3 (Nsp3) from murine hepatitis virus and related betacoronaviruses in the A lineage. This subfamily contains the DPUP (domain preceding Ubl2 and PLP2) of murine hepatitis virus (MHV) non-structural protein 3 (Nsp3) and other Nsp3s from betacoronaviruses in the embecovirus subgenera (A lineage), including human CoV OC43, rabbit CoV HKU14 and porcine hemagglutinating encephalomyelitis virus (HEV), among others. Non-structural protein 3 (Nsp3) is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. MHV Nsp3 contains a DPUP that is located N-terminal to the ubiquitin-like domain 2 (Ubl2) and papain-like protease 2 (PLP2) catalytic domain. It is structurally similar to the Severe Acute Respiratory Syndrome (SARS) CoV unique domain C (SUD-C), adopting a frataxin-like fold that has structural similarity to DNA-binding domains of DNA-modifying enzymes. SUD-C is also located N-terminal to Ubl2 and PLP2 in SARS Nsp3, similar to the DPUP of MHV Nsp3; however, unlike DPUP, it is preceded by SUD-N and SUD-M macrodomains that are absent in MHV Nsp3. Though structurally similar, there is little sequence similarity between DPUP and SUD-C. SARS SUD-C has been shown to bind to single-stranded RNA and recognize purine bases more strongly than pyrimidine bases; it also regulates the RNA binding behavior of the SARS SUD-M macrodomain. It is not known whether DPUP functions in the same way.	75
394841	cd21525	SUD_C_SARS-CoV_Nsp3	C-terminal SARS-Unique Domain (SUD) of non-structural protein 3 (Nsp3) from Severe Acute Respiratory Syndrome coronavirus and related betacoronaviruses in the B lineage. This subfamily contains the SUD-C of Severe Acute Respiratory Syndrome (SARS) coronavirus (CoV) non-structural protein 3 (Nsp3) and other Nsp3s from betacoronaviruses in the sarbecovirus subgenera (B lineage), such as SARS-CoV-2 and related bat CoVs. Non-structural protein 3 (Nsp3) is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. Nsp3 of the Severe Acute Respiratory Syndrome (SARS) coronavirus includes a SARS-unique domain (SUD) consisting of three globular domains separated by short linker peptide segments: SUD-N, SUD-M, and SUD-C. SUD-N and SUD-M are macro domains which bind G-quadruplexes (unusual nucleic-acid structures formed by consecutive guanosine nucleotides). The SUD-C domain adopts a frataxin-like fold and has structural similarity to DNA-binding domains of DNA-modifying enzymes. It binds to single-stranded RNA and recognizes purine bases more strongly than pyrimidine bases. SUD-C also regulates the RNA binding behavior of the SUD-M macrodomain.	67
394843	cd21526	CoV_Nsp6	coronavirus non-structural protein 6. Coronaviruses (CoV) redirect and rearrange host cell membranes as part of the viral genome replication and transcription machinery; they induce the formation of double-membrane vesicles in infected cells. CoV non-structural protein 6 (Nsp6), a transmembrane-containing protein, together with Nsp3 and Nsp4, have the ability to induce double-membrane vesicles that are similar to those observed in severe acute respiratory syndrome (SARS) coronavirus-infected cells. By itself, Nsp6 can generate autophagosomes from the endoplasmic reticulum. Autophagosomes are normally generated as a cellular response to starvation to carry cellular organelles and long-lived proteins to lysosomes for degradation. Degradation through autophagy may provide an innate defense against virus infection, or conversely, autophagosomes can promote infection by facilitating the assembly of replicase proteins. In addition to initiating autophagosome formation, Nsp6 also limits autophagosome expansion regardless of how they were induced, i.e. whether they were induced directly by Nsp6, or indirectly by starvation or chemical inhibition of MTOR signaling. This may favor coronavirus infection by compromising the ability of autophagosomes to deliver viral components to lysosomes for degradation.	287
394949	cd21527	CoV_Spike_S1_NTD	N-terminal domain of the S1 subunit of coronavirus Spike (S) proteins. This family contains the N-terminal domain (NTD) of the S1 subunit of coronavirus (CoV) Spike (S) proteins from all four (A-D) lineages of betacoronaviruses, including three highly pathogenic human CoVs (HCoVs): Middle East respiratory syndrome (MERS)-related CoV, Severe acute respiratory syndrome (SARS) CoV, and SARS-CoV-2, also known as 2019 novel coronavirus (2019-nCoV) or COVID-19 virus. The CoV S protein is an envelope glycoprotein that plays the most important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediate attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains a hydrophobic fusion peptide and two heptad repeat regions. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-terminal domain (C-domain). Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). Most CoVs, including SARS-CoV-2, SARS-CoV and MERS-CoV, use the C-domain to bind their receptors. However, some CoVs from the A lineage use the NTD instead: mouse hepatitis virus (MHV) uses the NTD to bind its receptor, mouse carcinoembryonic antigen related cell adhesion molecule 1a (mCEACAM1a). Bovine CoV and HCoV-OC43, also from the A lineage, recognize a sugar moiety, 5-N-acetyl-9-O-acetylneuraminic acid (Neu5,9Ac2), on cell-surface glycoproteins or glycolipids; this binding is also through the S1 NTD. The S1 NTD has also been the target for neutralizing antibodies, including human antibody CDC2-A2, and murine antibodies G2 and 5F9, which target MERS-CoV NTD. In addition, the S1 NTD contributes to the Spike trimer interface.	268
394955	cd21528	CoV_Nsp14	nonstructural protein 14 of coronavirus. Nonstructural protein 14 (Nsp14) of coronavirus (CoV) plays an important role in viral replication and transcription. It consists of 2 domains with different enzymatic activities: an N-terminal exoribonuclease (ExoN) domain and a C-terminal cap (guanine-N7) methyltransferase (N7-MTase) domain. ExoN is important for proofreading and therefore, the prevention of lethal mutations. The association of Nsp14 with Nsp10 stimulates its ExoN activity; the complex hydrolyzes double-stranded RNA in a 3' to 5' direction as well as a single mismatched nucleotide at the 3'-end mimicking an erroneous replication product. The Nsp10/Nsp14 complex may function in a replicative mismatch repair mechanism. N7-MTase functions in mRNA capping. Nsp14 can methylate GTP, dGTP as well as cap analogs GpppG, GpppA and m7GpppG. The accumulation of m7GTP or Nsp14 has been found to interfere with protein translation of cellular mRNAs.	518
394849	cd21529	CoV_M	coronavirus Membrane (or Matrix) protein. This family contains the Membrane (M) protein of coronaviruses (CoVs) including three highly pathogenic human CoVs such as Middle East respiratory syndrome (MERS)-related CoV, severe acute respiratory syndrome (SARS) CoV, and SARS-CoV-2, also known as 2019 novel CoV (2019-nCoV) or COVID-19 virus. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the Orf1ab (a large polyprotein known as replicase/protease); all are required to produce a structurally complete viral particle. The M protein, a triple-spanning membrane protein, is the most abundant protein in the virion. It plays a central role in virion assembly and morphogenesis, and it defines the shape of the viral envelope. It is regarded as the central organizer of CoV assembly, interacting with all other major coronaviral structural proteins and turning cellular membranes into workshops where virus and host factors come together to make new virus particles. While homotypic interactions between the M proteins are the major driving force behind virion envelope formation, it needs to interact with other coronaviral structural proteins for complete virion formation. The interaction of the Spike protein with M is not required for the assembly process. However, binding of M to N protein stabilizes the nucleocapsid (N protein-RNA complex), as well as the internal core of virions, and thereby promotes completion of viral assembly. Thus, the M protein, and its interactions with other structural proteins, is necessary for the production and release of virus-like particles.	198
394890	cd21530	CoV_RdRp	coronavirus RNA-dependent RNA polymerase, also known as non-structural protein 12: responsible for replication and transcription of the viral RNA genome. This family contains the RNA-dependent RNA polymerase of alpha-, beta-, gamma-, delta-coronaviruses, including three highly pathogenic human coronaviruses (CoVs) such as Middle East respiratory syndrome (MERS)-related CoV, Severe acute respiratory syndrome (SARS) CoV, and SARS-CoV-2, also known as 2019 novel CoV (2019-nCoV) or COVID-19 virus. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. A key component, the RNA-dependent RNA polymerase (RdRp, also known as Nsp12), catalyzes the synthesis of viral RNA and thus plays a central role in the replication and transcription cycle of CoV, possibly interacting with its co-factors, Nsp7 and Nsp8. RdRp is therefore considered a primary target for nucleotide analog antiviral inhibitors such as remdesivir, which shows potential for the treatment of SARS-CoV-2 viral infections. The structure of SARS-CoV-2 Nsp12 contains an RdRp domain as well as a large N-terminal extension that adopts a nidovirus RdRp-associated nucleotidyltransferase (NiRAN) architecture. The RdRp domain displays a right hand with three functional subdomains, called fingers, palm, and thumb. All RdRps contain conserved polymerase motifs (A-G), located in the palm (A-E motifs) and finger (F-G) subdomains. All these motifs have been implicated in RdRp fidelity such as processes of correct incorporation and reorganization of nucleotides.	928
394858	cd21531	CoV_E	Coronavirus Envelope (E) small membrane protein. This family contains the Envelope (E) small membrane protein of betacoronaviruses, including the E proteins from three highly pathogenic human coronaviruses (CoVs) such as Middle East respiratory syndrome (MERS) CoV, Severe acute respiratory syndrome (SARS) CoV, and SARS-CoV-2, also known as 2019 novel CoV (2019-nCoV) or COVID-19 virus. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the Orf1ab (a large polyprotein known as replicase/protease); all are required to produce a structurally complete viral particle. The E protein is a small polypeptide (76-109 amino acids) that contains a single alpha-helical transmembrane domain. It plays a central role in virus morphogenesis and assembly. It acts as a viroporin and self-assembles in host membranes forming homopentameric protein-lipid pores that allow ion transport with poor selectivity. For some CoVs, such as mouse hepatitis virus (MHV) and SARS-CoV, deletion of the E gene did not completely abolish replication, but the virions were severely disabled from infecting new host cells with significantly reduced viral titers. In animal models, SARS-CoV lacking the E gene also showed significantly attenuated viral titers, likely due to its deficiency in suppressing host stress response and apoptosis induction. Moreover, the PDZ-binding motif (PBM) at the C-terminus of SARS-CoV E protein was shown to interact with a host PDZ protein called syntenin and lead to its relocation from nucleus to cytoplasm during SARS-CoV infection, thereby activating p38 kinase to induce the overexpression of inflammatory cytokines. Thus, the E protein is involved in both viral replication and pathogenesis during CoV infection.	58
394859	cd21532	HKU1-CoV-like_E	human coronavirus HKU1 Envelope small membrane protein and similar proteins. This group contains the Envelope (E) small membrane protein of human coronavirus HKU1 and related coronaviruses (CoVs) from rodents. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the Orf1ab (a large polyprotein known as replicase/protease); all are required to produce a structurally complete viral particle. The E protein is a small polypeptide (76-109 amino acids) that contains a single alpha-helical transmembrane domain. It plays a central role in virus morphogenesis and assembly. It acts as a viroporin and self-assembles in host membranes forming homopentameric protein-lipid pores that allow ion transport with poor selectivity. For some CoVs, such as mouse hepatitis virus (MHV) and SARS-CoV, deletion of the E gene did not completely abolish replication, but the virions were severely disabled from infecting new host cells with significantly reduced viral titers. In animal models, SARS-CoV lacking the E gene also showed significantly attenuated viral titers, likely due to its deficiency in suppressing host stress response and apoptosis induction. Moreover, the PDZ-binding motif (PBM) at the C-terminus of SARS-CoV E protein was shown to interact with a host PDZ protein called syntenin and lead to its relocation from nucleus to cytoplasm during SARS-CoV infection, thereby activating p38 kinase to induce the overexpression of inflammatory cytokines. Thus, the E protein is involved in both viral replication and pathogenesis during CoV infection.	74
394860	cd21533	MERS-CoV-like_E	Middle East respiratory syndrome-related coronavirus Envelope small membrane protein and similar proteins. This group contains the Envelope (E) small membrane protein of Middle East respiratory syndrome (MERS) coronavirus (CoV), as well as E proteins from related coronaviruses. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the Orf1ab (a large polyprotein known as replicase/protease); all are required to produce a structurally complete viral particle. The E protein is a small polypeptide (76-109 amino acids) that contains a single alpha-helical transmembrane domain. It plays a central role in virus morphogenesis and assembly. It acts as a viroporin and self-assembles in host membranes forming homopentameric protein-lipid pores that allow ion transport with poor selectivity. For some CoVs, such as mouse hepatitis virus (MHV) and SARS-CoV, deletion of the E gene did not completely abolish replication, but the virions were severely disabled from infecting new host cells with significantly reduced viral titers. In animal models, SARS-CoV lacking the E gene also showed significantly attenuated viral titers, likely due to its deficiency in suppressing host stress response and apoptosis induction. Moreover, the PDZ-binding motif (PBM) at the C-terminus of SARS-CoV E protein was shown to interact with a host PDZ protein called syntenin and lead to its relocation from nucleus to cytoplasm during SARS-CoV infection, thereby activating p38 kinase to induce the overexpression of inflammatory cytokines. Thus, the E protein is involved in both viral replication and pathogenesis during CoV infection.	80
394861	cd21534	SARS-CoV-like_E	Severe acute respiratory syndrome coronavirus Envelope small membrane protein and similar proteins. This group contains the Envelope (E) small membrane protein of Severe acute respiratory syndrome (SARS) coronavirus (CoV) and SARS-CoV-2, also known as 2019 novel coronavirus (2019-nCoV) or COVID-19 virus, as well as E proteins from related CoVs. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the Orf1ab (a large polyprotein known as replicase/protease); all are required to produce a structurally complete viral particle. The E protein is a small polypeptide (76-109 amino acids) that contains a single alpha-helical transmembrane domain. It plays a central role in virus morphogenesis and assembly. It acts as a viroporin and self-assembles in host membranes forming homopentameric protein-lipid pores that allow ion transport with poor selectivity. For some CoVs, such as mouse hepatitis virus (MHV) and SARS-CoV, deletion of the E gene did not completely abolish replication, but the virions were severely disabled from infecting new host cells with significantly reduced viral titers. In animal models, SARS-CoV lacking the E gene also showed significantly attenuated viral titers, likely due to its deficiency in suppressing host stress response and apoptosis induction. Moreover, the PDZ-binding motif (PBM) at the C-terminus of SARS-CoV E protein was shown to interact with a host PDZ protein called syntenin and lead to its relocation from nucleus to cytoplasm during SARS-CoV infection, thereby activating p38 kinase to induce the overexpression of inflammatory cytokines. Thus, the E protein is involved in both viral replication and pathogenesis during CoV infection.	62
394862	cd21536	SARS-CoV-2_E	Severe acute respiratory syndrome coronavirus 2 Envelope small membrane protein. This group contains the Envelope (E) small membrane protein of Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), also known as 2019 novel coronavirus (2019-nCoV) or COVID-19 virus. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the Orf1ab (a large polyprotein known as replicase/protease); all are required to produce a structurally complete viral particle. The E protein is a small polypeptide (76-109 amino acids) that contains a single alpha-helical transmembrane domain. It plays a central role in virus morphogenesis and assembly. It acts as a viroporin and self-assembles in host membranes forming homopentameric protein-lipid pores that allow ion transport with poor selectivity. For some CoVs, such as mouse hepatitis virus (MHV) and SARS-CoV, deletion of the E gene did not completely abolish replication, but the virions were severely disabled from infecting new host cells with significantly reduced viral titers. In animal models, SARS-CoV lacking the E gene also showed significantly attenuated viral titers, likely due to its deficiency in suppressing host stress response and apoptosis induction. Moreover, the PDZ-binding motif (PBM) at the C-terminus of SARS-CoV E protein was shown to interact with a host PDZ protein called syntenin and lead to its relocation from nucleus to cytoplasm during SARS-CoV infection, thereby activating p38 kinase to induce the overexpression of inflammatory cytokines. Thus, the E protein is involved in both viral replication and pathogenesis during CoV infection.	75
394842	cd21537	SUD_C_HKU9_CoV_Nsp3	C-terminal SARS-Unique Domain (SUD) of non-structural protein 3 (Nsp3) from Rousettus bat coronavirus HKU9 and related betacoronaviruses in the D lineage. This subfamily contains the SUD-C of Rousettus bat coronavirus (CoV) HKU9 non-structural protein 3 (Nsp3) and other Nsp3s from betacoronaviruses in the nobecovirus subgenera (D lineage). Non-structural protein 3 (Nsp3) is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. Nsp3 of SARS coronavirus includes a SARS-unique domain (SUD) consisting of three globular domains separated by short linker peptide segments: SUD-N, SUD-M, and SUD-C. SUD-N and SUD-M are macro domains which bind G-quadruplexes (unusual nucleic-acid structures formed by consecutive guanosine nucleotides). SUD is not as specific to SARS CoV as originally thought and is also found in Rousettus bat CoV HKU9 and related bat CoVs. Similar to SARS SUD-C, Rousettus bat CoV HKU9 SUD-C (HKU9 C), also adopts a frataxin-like fold that has structural similarity to DNA-binding domains of DNA-modifying enzymes. However, there is little sequence similarity between the two domains. SARS SUD-C has been shown to bind to single-stranded RNA and recognize purine bases more strongly than pyrimidine bases; it also regulates the RNA binding behavior of the SARS SUD-M macrodomain. It is not known whether HKU9 C functions in the same way.	73
394863	cd21554	CoV_N-NTD	N-terminal domain of nucleocapsid (N) protein of coronavirus. The coronavirus nucleocapsid (N) protein is a major structural and multifunctional protein. It plays an important role in the virus replication cycle, by forming a complex with the viral RNA through its N-terminal domain (N-NTD), which makes this domain an important drug target. It also interacts with the viral membrane protein during virion assembly and plays a critical role in enhancing the efficiency of virus transcription and assembly.	125
411073	cd21555	OmcS-like	C-type cytochrome OmcS and similar proteins. This family includes C-type outer membrane cytochrome S (OmcS) which plays an important role in extracellular electron transfer. OmcS can transfer electrons to insoluble Fe(3+) oxides as well as other extracellular electron acceptors, including Mn(4+) oxide and humic substances. Recent studies show that Geobacter sulfurreducens hexaheme cytochrome OmcS proteins can assemble into filaments, known as microbial nanowires, similar to type IV pili composed of PilA protein. The coordination of a histidine in one subunit with the iron in the heme of an adjacent subunit is an important stabilizing element. The capacity of these bacteria to transport electrons to remote electron acceptors via these protein nanowires is of interest because of the environmental and practical significance of these microbes in soil.	382
394881	cd21556	Macro_cv_SUD-N-M_Nsp3-like	SUD-N and SUD-M macrodomains of the SARS-Unique Domain (SUD) of SARS-CoV non-structural protein 3 and related macrodomains. This family includes two macrodomains referred to as the SUD-N (N-terminal subdomain) and SUD-M (middle SUD subdomain) of the SARS-unique domain (SUD) which bind G-quadruplexes (unusual nucleic-acid structures formed by consecutive guanosine nucleotides). It is found in non-structural protein 3 (Nsp3) of Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV) and highly related coronaviruses. SUD consists of three globular domains separated by short linker peptide segments: SUD-N, SUD-M, and SUD-C. Among these, SUD-N and SUD-M are macrodomains. SUD-N is specific to the Nsp3 of SARS and betacoronaviruses of the sarbecovirus subgenera (B lineage), while SUD-M is present in most Nsp3 proteins except the Nsp3 from betacoronaviruses of the embecovirus subgenera (A lineage). SUD-C adopts a frataxin-like fold, has structural similarity to DNA-binding domains of DNA-modifying enzymes, binds single-stranded RNA, and regulates the RNA binding behavior of the SUD-M macrodomain. SARS-CoV Nsp3 contains a third macrodomain (the X-domain) which is not included in this family. The X-domain may function as a module binding poly(ADP-ribose); however, SUD-N and SUD-M do not bind ADP-ribose, as the triple glycine sequence involved in its binding is not conserved in these.	109
394882	cd21557	Macro_X_Nsp3-like	X-domain of viral non-structural protein 3 and related macrodomains. The X-domain, a macrodomain, is found in riboviral non-structural protein 3 (Nsp3), including the Nsp3 of Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV) and other coronaviruses (alpha-, beta-, gamma-, and deltacoronavirus), among others. The SARS-CoV X-domain may function as a module binding poly(ADP-ribose), a metabolic product of NAD+ synthesized by poly(ADP-ribose) polymerase (PARP). The X-domain of Avian infectious bronchitis virus (IBV) strain Beaudette coronavirus does not bind ADP-ribose; the triple glycine sequence found in the X-domains of SARS-CoV and human coronavirus 229E (HCoV229E), which are involved in ADP-ribose binding, is not conserved in the IBV X-domain. SARS-CoV has two other macrodomains referred to as the SUD-N (N-terminal subdomain) and SUD-M (middle SUD subdomain) of the SARS-unique domain (SUD), which also do not bind ADP-ribose; these bind G-quadruplexes (unusual nucleic-acid structures formed by consecutive guanosine nucleotides). SARS-CoV SUD-N and SUD-M are not included in this group.	127
394844	cd21558	alphaCoV-Nsp6	alphacoronavirus non-structural protein 6. Coronaviruses (CoV) redirect and rearrange host cell membranes as part of the viral genome replication and transcription machinery; they induce the formation of double-membrane vesicles in infected cells. CoV non-structural protein 6 (Nsp6), a transmembrane-containing protein, together with Nsp3 and Nsp4, have the ability to induce double-membrane vesicles that are similar to those observed in severe acute respiratory syndrome (SARS) coronavirus-infected cells. By itself, Nsp6 can generate autophagosomes from the endoplasmic reticulum. Autophagosomes are normally generated as a cellular response to starvation to carry cellular organelles and long-lived proteins to lysosomes for degradation. Degradation through autophagy may provide an innate defense against virus infection, or conversely, autophagosomes can promote infection by facilitating the assembly of replicase proteins. In addition to initiating autophagosome formation, Nsp6 also limits autophagosome expansion regardless of how they were induced, i.e. whether they were induced directly by Nsp6, or indirectly by starvation or chemical inhibition of MTOR signaling. This may favor coronavirus infection by compromising the ability of autophagosomes to deliver viral components to lysosomes for degradation.	293
394845	cd21559	gammaCoV-Nsp6	gammacoronavirus non-structural protein 6. Coronaviruses (CoV) redirect and rearrange host cell membranes as part of the viral genome replication and transcription machinery; they induce the formation of double-membrane vesicles in infected cells. CoV non-structural protein 6 (Nsp6), a transmembrane-containing protein, together with Nsp3 and Nsp4, have the ability to induce double-membrane vesicles that are similar to those observed in severe acute respiratory syndrome (SARS) coronavirus-infected cells. By itself, Nsp6 can generate autophagosomes from the endoplasmic reticulum. Autophagosomes are normally generated as a cellular response to starvation to carry cellular organelles and long-lived proteins to lysosomes for degradation. Degradation through autophagy may provide an innate defense against virus infection, or conversely, autophagosomes can promote infection by facilitating the assembly of replicase proteins. In addition to initiating autophagosome formation, Nsp6 also limits autophagosome expansion regardless of how they were induced, i.e. whether they were induced directly by Nsp6, or indirectly by starvation or chemical inhibition of MTOR signaling. This may favor coronavirus infection by compromising the ability of autophagosomes to deliver viral components to lysosomes for degradation.	307
394846	cd21560	betaCoV-Nsp6	betacoronavirus non-structural protein 6. Coronaviruses (CoV) redirect and rearrange host cell membranes as part of the viral genome replication and transcription machinery; they induce the formation of double-membrane vesicles in infected cells. CoV non-structural protein 6 (Nsp6), a transmembrane-containing protein, together with Nsp3 and Nsp4, have the ability to induce double-membrane vesicles that are similar to those observed in severe acute respiratory syndrome (SARS) coronavirus-infected cells. By itself, Nsp6 can generate autophagosomes from the endoplasmic reticulum. Autophagosomes are normally generated as a cellular response to starvation to carry cellular organelles and long-lived proteins to lysosomes for degradation. Degradation through autophagy may provide an innate defense against virus infection, or conversely, autophagosomes can promote infection by facilitating the assembly of replicase proteins. In addition to initiating autophagosome formation, Nsp6 also limits autophagosome expansion regardless of how they were induced, i.e. whether they were induced directly by Nsp6, or indirectly by starvation or chemical inhibition of MTOR signaling. This may favor coronavirus infection by compromising the ability of autophagosomes to deliver viral components to lysosomes for degradation.	290
394847	cd21561	deltaCoV-Nsp6	deltacoronavirus non-structural protein 6. Coronaviruses (CoV) redirect and rearrange host cell membranes as part of the viral genome replication and transcription machinery; they induce the formation of double-membrane vesicles in infected cells. CoV non-structural protein 6 (Nsp6), a transmembrane-containing protein, together with Nsp3 and Nsp4, have the ability to induce double-membrane vesicles that are similar to those observed in severe acute respiratory syndrome (SARS) coronavirus-infected cells. By itself, Nsp6 can generate autophagosomes from the endoplasmic reticulum. Autophagosomes are normally generated as a cellular response to starvation to carry cellular organelles and long-lived proteins to lysosomes for degradation. Degradation through autophagy may provide an innate defense against virus infection, or conversely, autophagosomes can promote infection by facilitating the assembly of replicase proteins. In addition to initiating autophagosome formation, Nsp6 also limits autophagosome expansion regardless of how they were induced, i.e. whether they were induced directly by Nsp6, or indirectly by starvation or chemical inhibition of MTOR signaling. This may favor coronavirus infection by compromising the ability of autophagosomes to deliver viral components to lysosomes for degradation.	296
394883	cd21562	Macro_cv_SUD-N_Nsp3-like	SUD-N macrodomain of the SARS Unique Domain (SUD) of SARS-CoV non-structural protein 3 and related macrodomains. This subfamily includes the macrodomain referred to as SUD-N (N-terminal subdomain) of the SARS-unique domain (SUD) which binds G-quadruplexes (unusual nucleic-acid structures formed by consecutive guanosine nucleotides). It is found in the non-structural protein 3 (Nsp3) of Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV) and highly related coronaviruses. SUD consists of three globular domains separated by short linker peptide segments: SUD-N, SUD-M, and SUD-C. Among these, SUD-N and SUD-M are macrodomains: the SUD-M domain (not represented in this subfamily) is a related macrodomain which also binds G-quadruplexes. SUD-N is specific to the Nsp3 of SARS and betacoronaviruses of the sarbecovirus subgenera (B lineage), while SUD-M is present in most Nsp3 proteins except the Nsp3 from betacoronaviruses of the embecovirus subgenera (A lineage). SUD-C adopts a frataxin-like fold, has structural similarity to DNA-binding domains of DNA-modifying enzymes, binds single-stranded RNA, and regulates the RNA binding behavior of the SUD-M macrodomain. SARS-CoV Nsp3 contains a third macrodomain (the X-domain) which is also not represented in this subfamily. The X-domain may function as a module binding poly(ADP-ribose); however, SUD-N and SUD-M do not bind ADP-ribose, as the triple glycine sequence involved in its binding is not conserved in these.	126
394884	cd21563	Macro_cv_SUD-M_Nsp3-like	SUD-M macrodomain of the SARS Unique Domain (SUD) of SARS-CoV non-structural protein 3 and related macrodomains. This subfamily includes the macrodomain referred to as SUD-M (middle SUD subdomain) of the SARS-unique domain (SUD) which binds G-quadruplexes (unusual nucleic-acid structures formed by consecutive guanosine nucleotides). It is found in non-structural protein 3 (Nsp3) of Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV) and related coronaviruses. SUD consists of three globular domains separated by short linker peptide segments: SUD-N, SUD-M, and SUD-C. Among these, SUD-N and SUD-M are macrodomains: The SUD-N domain (not represented in this subfamily) is a related macrodomain which also binds G-quadruplexes. While SUD-N is specific to the Nsp3 of SARS and betacoronaviruses of the sarbecovirus subgenera (B lineage), SUD-M is present in most Nsp3 proteins except the Nsp3 from betacoronaviruses of the embecovirus subgenera (A lineage). SUD-M, despite its name, is not specific to SARS. SUD-C adopts a frataxin-like fold, has structural similarity to DNA-binding domains of DNA-modifying enzymes, binds single-stranded RNA, and regulates the RNA binding behavior of the SUD-M macrodomain. SARS-CoV Nsp3 contains a third macrodomain (the X-domain) which is also not represented in this subfamily. The X-domain may function as a module binding poly(ADP-ribose); however, SUD-N and SUD-M do not bind ADP-ribose, as the triple glycine sequence involved in its binding is not conserved in these.	120
394850	cd21564	alphaCoV_M	alphacoronavirus Membrane (or Matrix) protein. This subfamily contains the Membrane (M) protein of alphacoronaviruses including human coronaviruses (HCoVs), HCoV-229E and HCoV-NL63. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the Orf1ab (a large polyprotein known as replicase/protease); all are required to produce a structurally complete viral particle. The M protein, a triple-spanning membrane protein, is the most abundant protein in the virion. It plays a central role in virion assembly and morphogenesis, and it defines the shape of the viral envelope. It is regarded as the central organizer of CoV assembly, interacting with all other major coronaviral structural proteins and turning cellular membranes into workshops where virus and host factors come together to make new virus particles. While homotypic interactions between the M proteins are the major driving force behind virion envelope formation, it needs to interact with other coronaviral structural proteins for complete virion formation. The interaction of the Spike protein with M is not required for the assembly process. However, binding of M to N protein stabilizes the nucleocapsid (N protein-RNA complex), as well as the internal core of virions, and thereby promotes completion of viral assembly. Thus, the M protein, and its interactions with other structural proteins, is necessary for the production and release of virus-like particles.	218
394851	cd21565	betaCoV_M	betacoronavirus Membrane (or Matrix) protein. This subfamily contains the Membrane (M) protein of betacoronaviruses including the M proteins from three highly pathogenic human coronaviruses (CoVs) such as Middle East respiratory syndrome (MERS)-related CoV, severe acute respiratory syndrome (SARS) CoV, and SARS-CoV-2, also known as 2019 novel CoV (2019-nCoV) or COVID-19 virus. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the Orf1ab (a large polyprotein known as replicase/protease); all are required to produce a structurally complete viral particle. The M protein, a triple-spanning membrane protein, is the most abundant protein in the virion. It plays a central role in virion assembly and morphogenesis, and it defines the shape of the viral envelope. It is regarded as the central organizer of CoV assembly, interacting with all other major coronaviral structural proteins and turning cellular membranes into workshops where virus and host factors come together to make new virus particles. While homotypic interactions between the M proteins are the major driving force behind virion envelope formation, it needs to interact with other coronaviral structural proteins for complete virion formation. The interaction of the Spike protein with M is not required for the assembly process. However, binding of M to N protein stabilizes the nucleocapsid (N protein-RNA complex), as well as the internal core of virions, and thereby promotes completion of viral assembly. Thus, the M protein, and its interactions with other structural proteins, is necessary for the production and release of virus-like particles.	208
394852	cd21566	gammaCoV_M	gammacoronavirus Membrane (or Matrix) protein. This subfamily contains the Membrane (M) protein of gammacoronavirus including avian infectious bronchitis virus (IBV). There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the Orf1ab (a large polyprotein known as replicase/protease); all are required to produce a structurally complete viral particle. The M protein, a triple-spanning membrane protein, is the most abundant protein in the virion. It plays a central role in virion assembly and morphogenesis, and it defines the shape of the viral envelope. It is regarded as the central organizer of CoV assembly, interacting with all other major coronaviral structural proteins and turning cellular membranes into workshops where virus and host factors come together to make new virus particles. While homotypic interactions between the M proteins are the major driving force behind virion envelope formation, it needs to interact with other coronaviral structural proteins for complete virion formation. The interaction of the Spike protein with M is not required for the assembly process. However, binding of M to N protein stabilizes the nucleocapsid (N protein-RNA complex), as well as the internal core of virions, and thereby promotes completion of viral assembly. Thus, the M protein, and its interactions with other structural proteins, is necessary for the production and release of virus-like particles.	212
394853	cd21567	MERS-like-CoV_M	Membrane (or Matrix) protein from Middle East respiratory syndrome-related coronavirus and related betacoronaviruses in the C lineage. This group contains the Membrane (M) protein of Middle East respiratory syndrome (MERS)-related CoV, bat-CoV HKU5, and similar proteins from betacoronaviruses in the merbecovirus subgenera (C lineage). There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the Orf1ab (a large polyprotein known as replicase/protease); all are required to produce a structurally complete viral particle. The M protein, a triple-spanning membrane protein, is the most abundant protein in the virion. It plays a central role in virion assembly and morphogenesis, and it defines the shape of the viral envelope. It is regarded as the central organizer of CoV assembly, interacting with all other major coronaviral structural proteins and turning cellular membranes into workshops where virus and host factors come together to make new virus particles. While homotypic interactions between the M proteins are the major driving force behind virion envelope formation, it needs to interact with other coronaviral structural proteins for complete virion formation. The interaction of the Spike protein with M is not required for the assembly process. However, binding of M to N protein stabilizes the nucleocapsid (N protein-RNA complex), as well as the internal core of virions, and thereby promotes completion of viral assembly. Thus, the M protein, and its interactions with other structural proteins, is necessary for the production and release of virus-like particles.	216
394854	cd21568	HCoV-like_M	Membrane (or Matrix) protein from human coronavirus and related betacoronaviruses in the A lineage. This group contains the Membrane (M) protein of human coronaviruses (HCoVs), HCoV-OC43 and HCoV-HKU1, and similar proteins from betacoronaviruses in the embecovirus subgenera (A lineage). There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the Orf1ab (a large polyprotein known as replicase/protease); all are required to produce a structurally complete viral particle. The M protein, a triple-spanning membrane protein, is the most abundant protein in the virion. It plays a central role in virion assembly and morphogenesis, and it defines the shape of the viral envelope. It is regarded as the central organizer of CoV assembly, interacting with all other major coronaviral structural proteins and turning cellular membranes into workshops where virus and host factors come together to make new virus particles. While homotypic interactions between the M proteins are the major driving force behind virion envelope formation, it needs to interact with other coronaviral structural proteins for complete virion formation. The interaction of the Spike protein with M is not required for the assembly process. However, binding of M to N protein stabilizes the nucleocapsid (N protein-RNA complex), as well as the internal core of virions, and thereby promotes completion of viral assembly. Thus, the M protein, and its interactions with other structural proteins, is necessary for the production and release of virus-like particles.	220
394855	cd21569	SARS-like-CoV_M	Membrane (or Matrix) protein from Severe acute respiratory syndrome (SARS) coronavirus, SARS-CoV-2, and related betacoronaviruses in the B lineage. This group contains the Membrane (M) protein of Severe acute respiratory syndrome coronavirus (SARS-CoV), SARS-CoV-2 (also known as 2019 novel CoV (2019-nCoV) or COVID-19 virus), and related proteins from betacoronaviruses in the sarbecovirus subgenera (B lineage). There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the Orf1ab (a large polyprotein known as replicase/protease); all are required to produce a structurally complete viral particle. The M protein, a triple-spanning membrane protein, is the most abundant protein in the virion. It plays a central role in virion assembly and morphogenesis, and it defines the shape of the viral envelope. It is regarded as the central organizer of CoV assembly, interacting with all other major coronaviral structural proteins and turning cellular membranes into workshops where virus and host factors come together to make new virus particles. While homotypic interactions between the M proteins are the major driving force behind virion envelope formation, it needs to interact with other coronaviral structural proteins for complete virion formation. The interaction of the Spike protein with M is not required for the assembly process. However, binding of M to N protein stabilizes the nucleocapsid (N protein-RNA complex), as well as the internal core of virions, and thereby promotes completion of viral assembly. Thus, the M protein, and its interactions with other structural proteins, is necessary for the production and release of virus-like particles.	218
394856	cd21570	batCoV_HKU9-like_M	Membrane (or Matrix) protein from bat coronavirus HKU9 and related betacoronaviruses in the D lineage. This group contains the Membrane (M) protein of Rousettus bat coronavirus HKU9, and similar proteins from betacoronaviruses in the nobecovirus subgenera (D lineage). There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the Orf1ab (a large polyprotein known as replicase/protease); all are required to produce a structurally complete viral particle. The M protein, a triple-spanning membrane protein, is the most abundant protein in the virion. It plays a central role in virion assembly and morphogenesis, and it defines the shape of the viral envelope. It is regarded as the central organizer of CoV assembly, interacting with all other major coronaviral structural proteins and turning cellular membranes into workshops where virus and host factors come together to make new virus particles. While homotypic interactions between the M proteins are the major driving force behind virion envelope formation, it needs to interact with other coronaviral structural proteins for complete virion formation. The interaction of the Spike protein with M is not required for the assembly process. However, binding of M to N protein stabilizes the nucleocapsid (N protein-RNA complex), as well as the internal core of virions, and thereby promotes completion of viral assembly. Thus, the M protein, and its interactions with other structural proteins, is necessary for the production and release of virus-like particles.	217
409236	cd21571	KLF13_N	N-terminal domain of Kruppel-like factor 13. Kruppel-like factor 13 (KLF13; also known as Krueppel-like factor 13, RANTES factor of late activated T lymphocytes 1/RFLAT-1, or Fetal Kruppel-like factor-2/FKLF-2) is a protein that in humans is encoded by the KLF13 gene. It was originally cloned from fetal globin-expressing tissues, though it has also been cloned from bone marrow, striated muscles, and a subset of T cells where it is highly expressed. KLF13 plays a role in heart development and morphogenesis and is thought to play a role in obesity. It regulates the expression of the chemokine RANTES in T lymphocytes and has been shown to interact with CREB-binding protein, heat shock protein 47, and PCAF. KLF9, KLF10, KLF11, KLF13, KLF14, and KLF16 share a conserved alpha-helical motif AA/VXXL that mediates their binding to Sin3A and their activities as transcriptional repressors. KLF13 belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF13.	136
409241	cd21572	KLF10_N	N-terminal domain of Kruppel-like factor 10. Kruppel-like factor 10 (KLF10; also known as Krueppel-like factor 10; early growth response(EGR)-alpha/EGRA; TGFbeta inducible early gene-1/TIEG1) is a protein that in humans is encoded by the KLF10 gene. KLF10 was first identified in human osteoblasts and plays a role in mediating estrogen (E2) signaling in bone and skeletal homeostasis and a regulatory role in tumor formation and metastasis. It may also play a role in adipocyte differentiation and adipose tissue function. KLF9, KLF10, KLF11, KLF13, KLF14, and KLF16 share a conserved alpha-helical motif AA/VXXL that mediates their binding to Sin3A and their activities as transcriptional repressors. KLF10 belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF10.	245
409237	cd21573	KLF16_N	N-terminal domain of Kruppel-like factor 16. Kruppel-like factor 16 (KLF16; also known as Krueppel-like factor 16, Basic transcription element binding protein 4/BTEB4, or Novel Sp1-like zinc finger transcription factor 2/NSLP2) is a protein that in humans is encoded by the KLF16 gene. KLF16 functions as a transcription activator. It is thought to modulate dopaminergic transmission in the brain and also regulates the expression of several genes essential for metabolic and endocrine processes in sex steroid-sensitive uterine cells. KLF16 selectively binds three distinct KLF-binding sites (GC, CA, and BTE boxes). KLF9, KLF10, KLF11, KLF13, KLF14, and KLF16 share a conserved alpha-helical motif AA/VXXL that mediates their binding to Sin3A and their activities as transcriptional repressors. KLF16 belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF16.	125
410567	cd21574	KLF17_N	N-terminal domain of Kruppel-like factor 17. Kruppel-like factor 17 (KLF17), or Krueppel-like factor 17, is a protein that, in humans, is encoded by the KLF17 gene and acts as a tumor suppressor. It negatively regulates epithelial-mesenchymal transition and metastasis in breast cancer. KLF17 is thought to be the human ortholog of the mouse gene, zinc finger protein 393 (Zfp393), although it has diverged significantly. KLF17 can regulate gene transcription from CACCC-box elements. It belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF17.	286
410566	cd21575	KLF18_N	N-terminal domain of Kruppel-like factor 18. Kruppel-like factor 18 (KLF18), or Krueppel-like factor 18, is a product of a chromosomal neighbor of the KLF17 gene and is likely a product of its duplication. Phylogenetic analyses revealed that mammalian predicted KLF18 proteins and KLF17 proteins experienced elevated rates of evolution and are grouped with KLF1/KLF2/KLF4 and non-mammalian KLF17. KLF18 has been found in the human testis, though it was previously hypothesized to be a pseudogene in extant placental mammals. Mouse KLF18 expression data indicate that it may function in early embryonic development. It belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF18. Some KLF18 isoforms have duplicated N-terminal domains.	276
409238	cd21576	KLF14_N	N-terminal domain of Kruppel-like factor 14. Kruppel-like factor 14 (KLF14; also known as Krueppel-like factor 14 or basic transcription element-binding protein 5/BTEB5) is a protein that in humans is encoded by the KLF14 gene. KLF14 regulates the transcription of various genes, including TGFbetaRII (the type II receptor for TGFbeta). KLF14 is expressed in many tissues, lacks introns, and is subject to parent-specific expression. It also appears to be a master regulator of gene expression in adipose tissue. KLF14 is associated with coronary artery disease, hypercholesterolemia, and type 2 diabetes. KLF9, KLF10, KLF11, KLF13, KLF14, and KLF16 share a conserved alpha-helical motif AA/VXXL that mediates their binding to Sin3A and their activities as transcriptional repressors. KLF14 belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF14.	195
410554	cd21577	KLF3_N	N-terminal domain of Kruppel-like factor 3. Kruppel-like factor 3 (KLF3; also called Krueppel-like factor 3 and originally called Basic Kruppel-like Factor/BKLF) was the third member of the KLF family of zinc finger transcription factors to be discovered. KLF3 possesses a wide range of biological impacts on regulating apoptosis, differentiation, and proliferation in various tissues during the entire progression process. It has been proposed as a tumor suppressor in colorectal cancer. It appears to function predominantly as a repressor of transcription, turning genes off by recruiting the C-terminal Binding Protein co-repressors CtBP1 and CtBP2. CtBP docks onto a short motif (residues 61-65) in the N-terminus of KLF3, through the Proline-X-Aspartate-Leucine-Serine (PXDLS) motif. CtBP in turn recruits histone modifying enzymes to alter chromatin and repress gene expression. KLF3 belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF3.	214
409239	cd21578	KLF9_N	N-terminal domain of Kruppel-like factor 9. Kruppel-like factor 9 (KLF9; also known as Krueppel-like factor 9, or Basic Transcription Element Binding Protein 1/BTEB Protein 1) is a protein that in humans is encoded by the KLF9 gene. KLF9 is critical for the inhibition of growth and development of tumors. It is involved in cell differentiation of B cells, keratinocytes, and neurons. It is also a key transcriptional regulator for uterine endometrial cell proliferation, adhesion, and differentiation; these are processes essential for pregnancy success and are subverted during tumorigenesis. KLF9, KLF10, KLF11, KLF13, KLF14, and KLF16 share a conserved alpha-helical motif AA/VXXL that mediates their binding to Sin3A and their activities as transcriptional repressors. KLF9 belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF9.	142
410335	cd21579	KLF5_N	N-terminal domain of Kruppel-like factor 5. Kruppel-like factor 5 (KLF5; also known as Krueppel-like factor 5; intestinal enriched Kruppel-like factor/IKLF; basic transcription element binding protein 2/BTEB2) is a protein that in humans is encoded by the KLF5 gene. KLF5 is involved in numerous functions in eukaryotic cells, such as proliferation, migration, and differentiation. The loss of KLF5 expression is associated with tumors of the breast, cervix, endometrium, ovary, and prostate. KLF5 mediates the expression of several genes essential for proper cardiac structure and function, and plays a role in familial dilated cardiomyopathy. It functions as a transcriptional activator. KLF5 exhibits both transcriptional activation activity as well as trans-activating function. It belongs to a family of proteins called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF5.	273
410206	cd21580	KLF15_N	N-terminal domain of Kruppel-like factor 15. Kruppel-like factor 15 (KLF15; also known as Krueppel-like factor 15 or kidney-enriched Kruppel-like factor/KKLF) is a protein that in humans is encoded by the KLF15 gene. KLF15 plays a role in gluconeogenesis and adipogenesis, and may be a potential therapeutic target to reduce hepatitis B virus gene expression and viral replication, heart failure and aortic aneurysm formation, and endometrial cancer, breast cancer, and other diseases related to estrogen. It belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF15.	213
409227	cd21581	KLF1_N	N-terminal domain of Kruppel-like Factor 1. Kruppel-like Factor 1 (KLF1, also known as Krueppel-like factor 1 or Erythroid Kruppel-like Factor/EKLF) was the first Kruppel-like factor discovered. It was found to be vitally important for embryonic erythropoiesis in promoting the switch from fetal hemoglobin (Hemoglobin F) to adult hemoglobin (Hemoglobin A) gene expression by binding to highly conserved CACCC domains. EKLF ablation in mouse embryos produces a lethal anemic phenotype, causing death by embryonic day 14, and natural mutations lead to beta+ thalassemia in humans. However, expression of embryonic hemoglobin and fetal hemoglobin genes is normal in EKLF-deficient mice, suggesting other factors may be involved. KLF1 functions as a transcriptional activator. It belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF1, which is related to the N-terminal domains of KLF2 and KLF4.	278
409228	cd21582	KLF4_N	N-terminal domain of Kruppel-like factor 4. Kruppel-like factor 4 (KLF4; also known as Krueppel-like factor 4 or gut-enriched Kruppel-like factor/GKLF) is a protein that, in humans, is encoded by the KLF4 gene. Evidence suggests that KLF4 is a tumor suppressor in certain cancers, including colorectal cancer, gastric cancer, esophageal squamous cell carcinoma, intestinal cancer, prostate cancer, bladder cancer, and lung cancer. It may act as a tumor promoter where increased KLF4 expression has been reported, such as in oral squamous cell carcinoma and in primary breast ductal carcinoma. KLF4 is one of four key factors that are essential for inducing pluripotent stem cells. KLF4 is highly expressed in non-dividing cells and its overexpression induces cell cycle arrest. KLF proteins KLF1, KLF2, KLF4, KLF5, KLF6, and KLF7 are transcriptional activators. KLF4 belongs to a family of proteins called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF4, which is related to the N-terminal domains of KLF1 and KLF2.	335
409229	cd21583	KLF2_N	N-terminal domain of Kruppel-like factor 2. Kruppel-like Factor 2 (KLF2, also known as Krueppel-like factor 2 or lung Kruppel-like Factor/LKLF) is a protein that, in humans, is encoded by the KLF2 gene on chromosome 19. It has been implicated in a variety of biochemical processes in the human body, including lung development, embryonic erythropoiesis, epithelial integrity, T-cell viability, and adipogenesis. KLF proteins KLF1, KLF2, KLF4, KLF5, KLF6, and KLF7 are transcriptional activators. KLF2 belongs to a family of proteins called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF2, which is related to the N-terminal domains of KLF1 and KLF4.	299
409242	cd21584	KLF11_N	N-terminal domain of Kruppel-like factor 11. Kruppel-like factor 11 (KLF11; also known as Krueppel-like factor 11; Fetal Kruppel-like factor-1/FKLF-1; maturity-onset diabetes of the young 7/MODY7; TGFbeta Inducible Early Growth Response 2/TIEG2) is a protein that in humans is encoded by the KLF11 gene. KLF11 is involved in cell growth, apoptosis, cellular inflammation and differentiation, endometriosis, and cholesterol, prostaglandin, neurotransmitter, fat, and sugar metabolism. KLF9, KLF10, KLF11, KLF13, KLF14, and KLF16 share a conserved alpha-helical motif AA/VXXL that mediates their binding to Sin3A and their activities as transcriptional repressors. KLF11 belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF11.	217
409244	cd21585	KLF7_N	N-terminal domain of Kruppel-like factor 7. Kruppel-like factor 7 (KLF7; also known as Krueppel-like factor 7, or ubiquitous Kruppel-like factor/UKLF) is a protein which, in humans, is encoded by the KLF7 gene. KLF7 is involved in regulation of the development and function of the nervous system and adipose tissue, type 2 diabetes, blood diseases, as well as pluripotent cell maintenance. It functions as a transcriptional activator. It belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF7.	160
409245	cd21586	KLF6_N	N-terminal domain of Kruppel-like factor 6. Kruppel-like factor 6 (KLF6; also known as Krueppel-like factor 6, BCD1, CBA1, COPEB, CPBP, GBF, PAC1, ST12, or ZF9) is a protein that, in humans, is encoded by the KLF6 gene. KLF6 contributes to cell proliferation, differentiation, cell death, and signal transduction. Hepatocyte expression of KLF6 regulates hepatic fatty acid and glucose metabolism via transcriptional activation of liver glucokinase and post-transcriptional regulation of the nuclear receptor peroxisome proliferator activated receptor alpha (PPARa). KLF6-expression contributes to hepatic insulin resistance and the progression of non-alcoholic fatty liver disease (NAFLD) to non-alcoholic steatohepatitis (NASH) and NASH-cirrhosis. KLF6 also affects peroxisome proliferator activated receptor gamma (PPARgamma)-signaling in NAFLD. KLF6 has also been identified as a tumor suppressor gene that is inactivated or downregulated in different cancers, including prostate, colon, and hepatocellular carcinomas. KLF6 transactivates genes controlling cell proliferation, including p21, E-cadherin, and pituitary tumor-transforming gene 1 (PTTG1). KLF6 functions as a transcriptional activator. It belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF6.	198
394891	cd21587	gammaCoV_RdRp	gammacoronavirus RNA-dependent RNA polymerase, also known as non-structural protein 12: responsible for replication and transcription of the viral RNA genome. This subfamily contains the RNA-dependent RNA polymerase (RdRp) of gammacoronaviruses, including the RdRp of avian infectious bronchitis virus (IBV) and similar proteins. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. A key component, the RNA-dependent RNA polymerase (RdRp, also known as Nsp12), catalyzes the synthesis of viral RNA and thus plays a central role in the replication and transcription cycle of CoV, possibly interacting with its co-factors, Nsp7 and Nsp8. RdRp is therefore considered a primary target for nucleotide analog antiviral inhibitors such as remdesivir. Nsp12 contains a RdRp domain as well as a large N-terminal extension that adopts a nidovirus RdRp-associated nucleotidyltransferase (NiRAN) architecture. The RdRp domain displays a right hand with three functional subdomains, called fingers, palm, and thumb. All RdRps contain conserved polymerase motifs (A-G), located in the palm (A-E motifs) and finger (F-G) subdomains. All these motifs have been implicated in RdRp fidelity such as processes of correct incorporation and reorganization of nucleotides.	931
394892	cd21588	alphaCoV_RdRp	alphacoronavirus RNA-dependent RNA polymerase, also known as non-structural protein 12: responsible for replication and transcription of the viral RNA genome. This subfamily contains the RNA-dependent RNA polymerase (RdRp) of alphacoronaviruses, including human coronaviruses (HCoVs), HCoV-NL63, and HCoV-229E. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. A key component, the RNA-dependent RNA polymerase (RdRp, also known as Nsp12), catalyzes the synthesis of viral RNA and thus plays a central role in the replication and transcription cycle of CoV, possibly interacting with its co-factors, Nsp7 and Nsp8. RdRp is therefore considered a primary target for nucleotide analog antiviral inhibitors such as remdesivir. Nsp12 contains a RdRp domain as well as a large N-terminal extension that adopts a nidovirus RdRp-associated nucleotidyltransferase (NiRAN) architecture. The RdRp domain displays a right hand with three functional subdomains, called fingers, palm, and thumb. All RdRps contain conserved polymerase motifs (A-G), located in the palm (A-E motifs) and finger (F-G) subdomains. All these motifs have been implicated in RdRp fidelity such as processes of correct incorporation and reorganization of nucleotides.	924
394893	cd21589	betaCoV_RdRp	betacoronavirus RNA-dependent RNA polymerase, also known as non-structural protein 12: responsible for replication and transcription of the viral RNA genome. This subfamily contains the RNA-dependent RNA polymerase (RdRp) of betacoronaviruses, including the RdRps from three highly pathogenic human coronaviruses (CoVs) such as Middle East respiratory syndrome (MERS)-related CoV, Severe acute respiratory syndrome (SARS) CoV, and SARS-CoV-2, also known as 2019 novel CoV (2019-nCoV) or COVID-19 virus. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. A key component, the RNA-dependent RNA polymerase (RdRp, also known as Nsp12), catalyzes the synthesis of viral RNA and thus plays a central role in the replication and transcription cycle of CoV, possibly interacting with its co-factors, Nsp7 and Nsp8. RdRp is therefore considered a primary target for nucleotide analog antiviral inhibitors such as remdesivir, which shows potential for the treatment of SARS-CoV-2 viral infections. The structure of SARS-CoV-2 Nsp12 contains a RdRp domain as well as a large N-terminal extension that adopts a nidovirus RdRp-associated nucleotidyltransferase (NiRAN) architecture. The RdRp domain displays a right hand with three functional subdomains, called fingers, palm, and thumb. All RdRps contain conserved polymerase motifs (A-G), located in the palm (A-E motifs) and finger (F-G) subdomains. All these motifs have been implicated in RdRp fidelity such as processes of correct incorporation and reorganization of nucleotides.	929
394894	cd21590	deltaCoV_RdRp	deltacoronavirus RNA-dependent RNA polymerase, also known as non-structural protein 12: responsible for replication and transcription of the viral RNA genome. This subfamily contains the RNA-dependent RNA polymerase (RdRp) of deltacoronaviruses. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. A key component, the RNA-dependent RNA polymerase (RdRp, also known as Nsp12), catalyzes the synthesis of viral RNA and thus plays a central role in the replication and transcription cycle of CoV, possibly interacting with its co-factors, Nsp7 and Nsp8. RdRp is therefore considered a primary target for nucleotide analog antiviral inhibitors such as remdesivir, which has been shown to inhibit human endemic and zoonotic deltacoronaviruses with a highly divergent RdRp. Nsp12 contains a RdRp domain as well as a large N-terminal extension that adopts a nidovirus RdRp-associated nucleotidyltransferase (NiRAN) architecture. The RdRp domain displays a right hand with three functional subdomains, called fingers, palm, and thumb. All RdRps contain conserved polymerase motifs (A-G), located in the palm (A-E motifs) and finger (F-G) subdomains. All these motifs have been implicated in RdRp fidelity such as processes of correct incorporation and reorganization of nucleotides.	928
394895	cd21591	SARS-CoV-like_RdRp	Severe acute respiratory syndrome coronavirus RNA-dependent RNA polymerase, also known as non-structural protein 12, and similar proteins from betacoronaviruses in the B lineage: responsible for replication and transcription of the viral RNA genome. This group contains the RNA-dependent RNA polymerase (RdRp) of Severe acute respiratory syndrome coronavirus (SARS-CoV), SARS-CoV-2 (also known as 2019 novel CoV (2019-nCoV) or COVID-19 virus), and similar proteins from betacoronaviruses in the sarbecovirus subgenera (B lineage). CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. A key component, the RNA-dependent RNA polymerase (RdRp, also known as Nsp12), catalyzes the synthesis of viral RNA and thus plays a central role in the replication and transcription cycle of CoV, possibly interacting with its co-factors, Nsp7 and Nsp8. RdRp is therefore considered a primary target for nucleotide analog antiviral inhibitors such as remdesivir, which shows potential for the treatment of SARS-CoV-2 viral infections. The structure of SARS-CoV-2 Nsp12 contains a RdRp domain as well as a large N-terminal extension that adopts a nidovirus RdRp-associated nucleotidyltransferase (NiRAN) architecture. The RdRp domain displays a right hand with three functional subdomains, called fingers, palm, and thumb. All RdRps contain conserved polymerase motifs (A-G), located in the palm (A-E motifs) and finger (F-G) subdomains. All these motifs have been implicated in RdRp fidelity such as processes of correct incorporation and reorganization of nucleotides.	928
394896	cd21592	MERS-CoV-like_RdRp	Middle East respiratory syndrome-related coronavirus RNA-dependent RNA polymerase, also known as non-structural protein 12, and similar proteins from betacoronaviruses in the C lineage: responsible for replication and transcription of the viral RNA genome. This group contains the RNA-dependent RNA polymerase (RdRp) of Middle East respiratory syndrome (MERS)-related CoV, bat-CoV HKU5, and similar proteins from betacoronaviruses in the merbecovirus subgenera (C lineage). CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. A key component, the RNA-dependent RNA polymerase (RdRp, also known as Nsp12), catalyzes the synthesis of viral RNA and thus plays a central role in the replication and transcription cycle of CoV, possibly interacting with its co-factors, Nsp7 and Nsp8. RdRp is therefore considered a primary target for nucleotide analog antiviral inhibitors such as remdesivir, which has been shown to potently inhibit MERS RdRp. Nsp12 contains a RdRp domain as well as a large N-terminal extension that adopts a nidovirus RdRp-associated nucleotidyltransferase (NiRAN) architecture. The RdRp domain displays a right hand with three functional subdomains, called fingers, palm, and thumb. All RdRps contain conserved polymerase motifs (A-G), located in the palm (A-E motifs) and finger (F-G) subdomains. All these motifs have been implicated in RdRp fidelity such as processes of correct incorporation and reorganization of nucleotides.	931
394897	cd21593	HCoV_HKU1-like_RdRp	human coronavirus HKU1 RNA-dependent RNA polymerase, also known as non-structural protein 12, and similar proteins from betacoronaviruses in the A lineage: responsible for replication and transcription of the viral RNA genome. This group contains the RNA-dependent RNA polymerase (RdRp) of human coronavirus HKU1, murine hepatitis virus, and similar proteins from betacoronaviruses in the embecovirus subgenera (A lineage). CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. A key component, the RNA-dependent RNA polymerase (RdRp, also known as Nsp12), catalyzes the synthesis of viral RNA and thus plays a central role in the replication and transcription cycle of CoV, possibly interacting with its co-factors, Nsp7 and Nsp8. RdRp is therefore considered a primary target for nucleotide analog antiviral inhibitors such as remdesivir. Nsp12 contains a RdRp domain as well as a large N-terminal extension that adopts a nidovirus RdRp-associated nucleotidyltransferase (NiRAN) architecture. The RdRp domain displays a right hand with three functional subdomains, called fingers, palm, and thumb. All RdRps contain conserved polymerase motifs (A-G), located in the palm (A-E motifs) and finger (F-G) subdomains. All these motifs have been implicated in RdRp fidelity such as processes of correct incorporation and reorganization of nucleotides.	925
394857	cd21594	deltaCoV_M	deltacoronavirus Membrane (or Matrix) protein. This subfamily contains the Membrane (M) protein of deltacoronaviruses including porcine deltacoronavirus and Bulbul coronavirus HKU11. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the Orf1ab (a large polyprotein known as replicase/protease); all are required to produce a structurally complete viral particle. The M protein, a triple-spanning membrane protein, is the most abundant protein in the virion. It plays a central role in virion assembly and morphogenesis, and it defines the shape of the viral envelope. It is regarded as the central organizer of CoV assembly, interacting with all other major coronaviral structural proteins and turning cellular membranes into workshops where virus and host factors come together to make new virus particles. While homotypic interactions between the M proteins are the major driving force behind virion envelope formation, it needs to interact with other coronaviral structural proteins for complete virion formation. The interaction of the Spike protein with M is not required for the assembly process. However, binding of M to N protein stabilizes the nucleocapsid (N protein-RNA complex), as well as the internal core of virions, and thereby promotes completion of viral assembly. Thus, the M protein, and its interactions with other structural proteins, is necessary for the production and release of virus-like particles.	217
394954	cd21595	CoV_N-CTD	C-terminal domain of nucleocapsid (N) protein of coronavirus. The coronavirus nucleocapsid (N) protein is a major structural and multifunctional protein. It plays an important role in the virus replication cycle by forming a complex with the viral RNA. It also interacts with the viral membrane protein during virion assembly and plays a critical role in enhancing the efficiency of virus transcription and assembly. The C-terminal domain of the N protein (N-CTD) is involved in dimerization and is thus also called the dimerization domain.	95
394898	cd21596	batCoV-HKU9-like_RdRp	Bat coronavirus HKU9 RNA-dependent RNA polymerase, also known as non-structural protein 12, and similar proteins from betacoronaviruses in the D lineage: responsible for replication and transcription of the viral RNA genome. This group contains the RNA-dependent RNA polymerase (RdRp) of bat coronavirus HKU9 and similar proteins from betacoronaviruses in the nobecovirus subgenera (D lineage). CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. A key component, the RNA-dependent RNA polymerase (RdRp, also known as Nsp12), catalyzes the synthesis of viral RNA and thus plays a central role in the replication and transcription cycle of CoV, possibly interacting with its co-factors, Nsp7 and Nsp8. RdRp is therefore considered a primary target for nucleotide analog antiviral inhibitors such as remdesivir. Nsp12 contains a RdRp domain as well as a large N-terminal extension that adopts a nidovirus RdRp-associated nucleotidyltransferase (NiRAN) architecture. The RdRp domain displays a right hand with three functional subdomains, called fingers, palm, and thumb. All RdRps contain conserved polymerase motifs (A-G), located in the palm (A-E motifs) and finger (F-G) subdomains. All these motifs have been implicated in RdRp fidelity such as processes of correct incorporation and reorganization of nucleotides.	929
394948	cd21597	SARS-CoV-2_Orf10	Severe acute respiratory syndrome coronavirus 2 Orf10. This model represents the Orf10 protein of Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), also known as 2019 novel coronavirus (2019-nCoV). SARS-CoV-2 causes the disease called "coronavirus disease 2019" (COVID-19). Orf10 appears to have no homologous proteins in SARS-CoV and other coronaviruses. It has been suggested that the genome sequence currently annotated as orf10 may not have a protein coding function in SARS-CoV-2, and instead may act, itself or as a precursor of other RNAs, in the regulation of gene expression, replication, or modulating cellular antiviral pathways (DOI:10.1101/2020.03.05.976167).	36
394937	cd21598	ORF7b_SARS_bat-CoV-like	Severe Acute Respiratory Syndrome coronavirus structural accessory protein ORF7b and similar proteins from related betacoronaviruses in the B lineage. This family contains the structural accessory protein ORF7b, also called NS7b, of Severe Acute Respiratory Syndrome Coronaviruses (SARS-CoVs) from betacoronavirus lineage B, including SARS-CoV-2, also known as 2019-nCoV, and a bat coronavirus (BatCoV RaTG13), which was previously detected in Rhinolophus affinis from China's Yunnan province, as well as SARS-related virus from Rhinolophus bats in Europe and Kenya. ORF7b/NS7b from betacoronavirus in the B lineage are not related to NS7b proteins from other betacoronavirus lineages. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the ORF1ab (a large polyprotein known as replicase/protease); all are required to produce a structurally complete viral particle. In addition, SARS-CoV contains a number of open reading frames that code for a total of eight accessory proteins, namely ORFs 3a, 3b, 6, 7a, 7b, 8a, 8b, and 9b. These ORFs are specific for SARS-CoV and do not show significant homology to accessory proteins of other coronaviruses. The SARS-CoV ORF7b protein is a highly hydrophobic 43 amino acid protein which is homologous to an accessory but structural component of SARS-CoV virion. While ORF7b is packaged into virions, it is not required for the virus budding process, as gene 7 deletion viruses replicate efficiently in vitro and in vivo. Moreover, ORF7b possesses a transmembrane helical domain (TMD), spanning amino acid residues 9-29, which is necessary for its Golgi complex localization, as replacing it with the TMD from the human endoprotease furin results in aberrant localization.	40
410178	cd21599	RRM1_GNPTAB	RNA recognition motif 1 (RRM1) found in N-acetylglucosamine-1-phosphotransferase subunits alpha/beta (GNPTAB) and similar proteins. GNPTAB, also termed GlcNAc-1-phosphotransferase subunits alpha/beta, or stealth protein GNPTAB, or UDP-N-acetylglucosamine-1-phosphotransferase subunits alpha/beta, catalyzes the formation of mannose 6-phosphate (M6P) markers on high mannose type oligosaccharides in the Golgi apparatus. M6P residues are required to bind to the M6P receptors (MPR), which mediate the vesicular transport of lysosomal enzymes to the endosomal/prelysosomal compartment. The model corresponds to the RNA recognition motif 1 (RRM1) of GNPTAB. Its functional significance remains to be investigated.	90
410179	cd21600	RRM2_GNPTAB	RNA recognition motif 2 (RRM2) found in N-acetylglucosamine-1-phosphotransferase subunits alpha/beta (GNPTAB) and similar proteins. GNPTAB, also termed GlcNAc-1-phosphotransferase subunits alpha/beta, or stealth protein GNPTAB, or UDP-N-acetylglucosamine-1-phosphotransferase subunits alpha/beta, catalyzes the formation of mannose 6-phosphate (M6P) markers on high mannose type oligosaccharides in the Golgi apparatus. M6P residues are required to bind to the M6P receptors (MPR), which mediate the vesicular transport of lysosomal enzymes to the endosomal/prelysosomal compartment. The model corresponds to the RNA recognition motif 2 (RRM2) of GNPTAB. Its functional significance remains to be investigated.	77
410180	cd21601	RRM1_PES4_MIP6	RNA recognition motif 1 (RRM1) found in Saccharomyces cerevisiae protein PES4, protein MIP6 and similar proteins. The family includes PES4 (also called DNA polymerase epsilon suppressor 4) and MIP6 (also called MEX67-interacting protein 6), both of which are predicted RNA binding proteins that may act as regulators of late translation, protection, and mRNA localization. MIP6 acts as a novel factor for nuclear mRNA export and binds to both poly(A)+ RNA and nuclear pores. It interacts with MEX67. Members in this family contain four RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The model corresponds to the first RRM motif.	80
410181	cd21602	RRM2_PES4_MIP6	RNA recognition motif 2 (RRM2) found in Saccharomyces cerevisiae protein PES4, protein MIP6 and similar proteins. The family includes PES4 (also called DNA polymerase epsilon suppressor 4) and MIP6 (also called MEX67-interacting protein 6), both of which are predicted RNA binding proteins that may act as regulators of late translation, protection, and mRNA localization. MIP6 acts as a novel factor for nuclear mRNA export and binds to both poly(A)+ RNA and nuclear pores. It interacts with MEX67. Members in this family contain four RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The model corresponds to the second RRM motif.	76
410182	cd21603	RRM3_PES4_MIP6	RNA recognition motif 3 (RRM3) found in Saccharomyces cerevisiae protein PES4, protein MIP6 and similar proteins. The family includes PES4 (also called DNA polymerase epsilon suppressor 4) and MIP6 (also called MEX67-interacting protein 6), both of which are predicted RNA binding proteins that may act as regulators of late translation, protection, and mRNA localization. MIP6 acts as a novel factor for nuclear mRNA export and binds to both poly(A)+ RNA and nuclear pores. It interacts with MEX67. Members in this family contain four RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The model corresponds to the third RRM motif.	73
410183	cd21604	RRM4_PES4_MIP6	RNA recognition motif 4 (RRM4) found in Saccharomyces cerevisiae protein PES4, protein MIP6 and similar proteins. The family includes PES4 (also called DNA polymerase epsilon suppressor 4) and MIP6 (also called MEX67-interacting protein 6), both of which are predicted RNA binding proteins that may act as regulators of late translation, protection, and mRNA localization. MIP6 acts as a novel factor for nuclear mRNA export and binds to both poly(A)+ RNA and nuclear pores. It interacts with MEX67. Members in this family contain four RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The model corresponds to the fourth RRM motif.	79
410184	cd21605	RRM1_HRB1_GBP2	RNA recognition motif 1 (RRM1) found in Saccharomyces cerevisiae protein HRB1, G-strand-binding protein 2 (GBP2) and similar proteins. The family includes Saccharomyces cerevisiae protein HRB1 (also called protein TOM34) and GBP2, both of which are SR-like mRNA-binding proteins which shuttle from the nucleus to the cytoplasm when bound to the mature mRNA molecules. They act as quality control factors for spliced mRNAs. GBP2, also called RAP1 localization factor 6, is a single-strand telomeric DNA-binding protein that binds single-stranded telomeric sequences of the type (TG[1-3])n in vitro. It also binds to RNA. GBP2 influences the localization of RAP1 in the nuclei and plays a role in modulating telomere length. Members in this family contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The model corresponds to the first RRM motif.	77
410185	cd21606	RRM2_HRB1_GBP2	RNA recognition motif 2 (RRM2) found in Saccharomyces cerevisiae protein HRB1, G-strand-binding protein 2 (GBP2) and similar proteins. The family includes Saccharomyces cerevisiae protein HRB1 (also called protein TOM34) and GBP2, both of which are SR-like mRNA-binding proteins which shuttle from the nucleus to the cytoplasm when bound to the mature mRNA molecules. They act as quality control factors for spliced mRNAs. GBP2, also called RAP1 localization factor 6, is a single-strand telomeric DNA-binding protein that binds single-stranded telomeric sequences of the type (TG[1-3])n in vitro. It also binds to RNA. GBP2 influences the localization of RAP1 in the nuclei and plays a role in modulating telomere length. Members in this family contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The model corresponds to the second RRM motif.	75
410186	cd21607	RRM3_HRB1_GBP2	RNA recognition motif 3 (RRM3) found in Saccharomyces cerevisiae protein HRB1, G-strand-binding protein 2 (GBP2) and similar proteins. The family includes Saccharomyces cerevisiae protein HRB1 (also called protein TOM34) and GBP2, both of which are SR-like mRNA-binding proteins which shuttle from the nucleus to the cytoplasm when bound to the mature mRNA molecules. They act as quality control factors for spliced mRNAs. GBP2, also called RAP1 localization factor 6, is a single-strand telomeric DNA-binding protein that binds single-stranded telomeric sequences of the type (TG[1-3])n in vitro. It also binds to RNA. GBP2 influences the localization of RAP1 in the nuclei and plays a role in modulating telomere length. Members in this family contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The model corresponds to the third RRM motif.	79
410187	cd21608	RRM2_NsCP33_like	RNA recognition motif 2 (RRM2) found in Nicotiana sylvestris chloroplastic 33 kDa ribonucleoprotein (NsCP33) and similar proteins. The family includes NsCP33, Arabidopsis thaliana chloroplastic 31 kDa ribonucleoprotein (CP31A) and mitochondrial glycine-rich RNA-binding protein 2 (AtGR-RBP2). NsCP33 may be involved in splicing and/or processing of chloroplast RNA's. AtCP31A, also called RNA-binding protein 1/2/3 (AtRBP33), or RNA-binding protein CP31A, or RNA-binding protein RNP-T, or RNA-binding protein cp31, is required for specific RNA editing events in chloroplasts and stabilizes specific chloroplast mRNAs, as well as for normal chloroplast development under cold stress conditions by stabilizing transcripts of numerous mRNAs under these conditions. CP31A may modulate telomere replication through RNA binding domains. AtGR-RBP2, also called AtRBG2, or glycine-rich protein 2 (AtGRP2), or mitochondrial RNA-binding protein 1a (At-mRBP1a), plays a role in RNA transcription or processing during stress. It binds RNAs and DNAs sequence with a preference to single-stranded nucleic acids. AtGR-RBP2 displays strong affinity to poly(U) sequence. It exerts cold and freezing tolerance, probably by exhibiting an RNA chaperone activity during the cold and freezing adaptation process. Some members in this family contain two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The model corresponds to the second RRM motif.	76
410188	cd21609	RRM1_PSRP2_like	RNA recognition motif 1 (RRM1) found in chloroplastic plastid-specific 30S ribosomal protein 2 (PSRP-2) and similar proteins. PSRP-2, also called chloroplastic 30S ribosomal protein 2, or chloroplastic small ribosomal subunit protein cS22, is a component of the chloroplast ribosome (chloro-ribosome), a dedicated translation machinery responsible for the synthesis of chloroplast genome-encoded proteins, including proteins of the transcription and translation machinery and components of the photosynthetic apparatus. It binds single strand DNA (ssDNA) and RNA in vitro. It exhibits RNA chaperone activity and regulates negatively resistance responses to abiotic stresses during seed germination (e.g. salt, dehydration, and low temperature) and seedling growth (e.g. salt). The family also includes Nicotiana sylvestris chloroplastic 33 kDa ribonucleoprotein (NsCP33) and Arabidopsis thaliana chloroplastic 31 kDa ribonucleoprotein (AtCP31A). NsCP33 may be involved in splicing and/or processing of chloroplast RNA's. AtCP31A, also called RNA-binding protein 1/2/3 (AtRBP33), or RNA-binding protein CP31A, or RNA-binding protein RNP-T, or RNA-binding protein cp31, is required for specific RNA editing events in chloroplasts and stabilizes specific chloroplast mRNAs, as well as for normal chloroplast development under cold stress conditions by stabilizing transcripts of numerous mRNAs under these conditions. CP31A may modulate telomere replication through RNA binding domains. Members in this family contain two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The model corresponds to the first RRM motif.	80
410189	cd21610	RRM2_PSRP2	RNA recognition motif 2 (RRM2) found in chloroplastic plastid-specific 30S ribosomal protein 2 (PSRP-2) and similar proteins. PSRP-2, also called chloroplastic 30S ribosomal protein 2, or chloroplastic small ribosomal subunit protein cS22, is a component of the chloroplast ribosome (chloro-ribosome), a dedicated translation machinery responsible for the synthesis of chloroplast genome-encoded proteins, including proteins of the transcription and translation machinery and components of the photosynthetic apparatus. It binds single strand DNA (ssDNA) and RNA in vitro. It exhibits RNA chaperone activity and regulates negatively resistance responses to abiotic stresses during seed germination (e.g. salt, dehydration, and low temperature) and seedling growth (e.g. salt). PSRP-2 contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The model corresponds to the second RRM motif.	79
410190	cd21611	RRM_SpPof8_like	RNA recognition motif (RRM) found in Schizosaccharomyces pombe protein Pof8 and similar proteins. Pof8 is a La-related protein and a constitutive component of telomerase in fission yeast. It regulates telomerase assembly and poly(a)+TERRA expression in fission yeast. Members in this family contain an RNA recognition motif (RRM), also called RBD (RNA binding domain) or RNP (ribonucleoprotein domain).	82
410191	cd21612	RRM_AtRDRP1_like	RNA recognition motif (RRM) found in Arabidopsis thaliana RNA-dependent RNA polymerase 1 (AtRDRP1) and similar proteins. AtRDRP1, also called RNA-directed RNA polymerase 1, is an RNA-dependent direct polymerase involved in antiviral silencing. It is required for the biogenesis of viral secondary siRNAs, a process that follows the production of primary siRNAs derived from viral RNA replication. Members in this family contain an RNA recognition motif (RRM), also called RBD (RNA binding domain) or RNP (ribonucleoprotein domain).	67
410192	cd21613	RRM1_KSRP	RNA recognition motif 1 (RRM1) found in Kinetoplastid-Specific Ribosomal Protein (KSRP) and similar proteins. KSRP is an essential protein located at the solvent face of the 40S subunit, where it binds and stabilizes kinetoplastid-specific domains of rRNA, suggesting its role in ribosome integrity. It also interacts with the kinetoplastid-specific C-terminal region of protein eS6. KSRP contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The model corresponds to the first RRM motif.	71
410193	cd21614	RRM2_KSRP	RNA recognition motif 2 (RRM2) found in Kinetoplastid-Specific Ribosomal Protein (KSRP) and similar proteins. KSRP is an essential protein located at the solvent face of the 40S subunit, where it binds and stabilizes kinetoplastid-specific domains of rRNA, suggesting its role in ribosome integrity. It also interacts with the kinetoplastid-specific C-terminal region of protein eS6. KSRP contains two RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The model corresponds to the second RRM motif.	97
410194	cd21615	RRM_SNP1_like	RNA recognition motif (RRM) found in Saccharomyces cerevisiae U1 small nuclear ribonucleoprotein SNP1 and similar proteins. SNP1, also called U1 snRNP protein SNP1, or U1 small nuclear ribonucleoprotein 70 kDa homolog, or U1 70K, or U1 snRNP 70 kDa homolog, interacts with mRNA and is involved in nuclear mRNA splicing. It is a component of the spliceosome, where it is associated with snRNP U1 by binding stem loop I of U1 snRNA. Members in this family contain an N-terminal U1snRNP70 domain and an RNA recognition motif (RRM), also called RBD (RNA binding domain) or RNP (ribonucleoprotein domain).	118
410195	cd21616	RRM_ScJSN1_like	RNA recognition motif (RRM) found in Saccharomyces cerevisiae protein JSN1 and similar proteins. JSN1, also called Pumilio homology domain family member 1 (PUF1), is a member of the PUF family of proteins. It facilitates association of Arp2/3 complex to yeast mitochondria. It may play a role in mitosis, perhaps by affecting the stability of microtubules. Members in this family contain an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain).	118
410196	cd21617	RRM_TDRD10	RNA recognition motif (RRM) found in Tudor domain-containing protein 10 (TDRD10) and similar proteins. TDRD10 is widely expressed and localized both to the nucleus and cytoplasm and may play general roles like regulation of RNA metabolism. It contains a Tudor domain and an RNA recognition motif (RRM).	69
410197	cd21618	RRM_AtNSRA_like	RNA recognition motif (RRM) found in Arabidopsis thaliana nuclear speckle RNA-binding protein A (AtNSRA) and similar proteins. AtNSRA is an alternative splicing (AS) regulator that binds to specific mRNAs and modulates auxin effects on the transcriptome. It can be displaced from its targets upon binding to AS competitor long non-coding RNA (ASCO-RNA). Members in this family contain an RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain).	87
410198	cd21619	RRM1_Crp79	RNA recognition motif 1 (RRM1) found in Schizosaccharomyces pombe mRNA export factor Crp79 and similar proteins. Crp79, also called meiotic expression up-regulated protein 5 (Mug5), or polyadenylate-binding protein crp79, or PABP, or poly(A)-binding protein, is an auxiliary mRNA export factor that binds the poly(A) tail of mRNA and is involved in the export of mRNA from the nucleus to the cytoplasm. Members in this family contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The model corresponds to the first RRM motif.	78
410199	cd21620	RRM1_Mug28	RNA recognition motif 1 (RRM1) found in Schizosaccharomyces pombe meiotically up-regulated gene 28 protein (Mug28) and similar proteins. Mug28 is a meiosis-specific protein that regulates spore wall formation. Members in this family contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The model corresponds to the first RRM motif.	84
410200	cd21621	RRM2_Crp79_Mug28	RNA recognition motif 2 (RRM2) found in Schizosaccharomyces pombe mRNA export factor Crp79, meiotically up-regulated gene 28 protein (Mug28) and similar proteins. Crp79, also called meiotic expression up-regulated protein 5 (Mug5), or polyadenylate-binding protein crp79, or PABP, or poly(A)-binding protein, is an auxiliary mRNA export factor that binds the poly(A) tail of mRNA and is involved in the export of mRNA from the nucleus to the cytoplasm. Mug28 is a meiosis-specific protein that regulates spore wall formation. Members in this family contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The model corresponds to the second RRM motif.	74
410201	cd21622	RRM3_Crp79_Mug28	RNA recognition motif 3 (RRM3) found in Schizosaccharomyces pombe mRNA export factor Crp79, meiotically up-regulated gene 28 protein (Mug28) and similar proteins. Crp79, also called meiotic expression up-regulated protein 5 (Mug5), or polyadenylate-binding protein crp79, or PABP, or poly(A)-binding protein, is an auxiliary mRNA export factor that binds the poly(A) tail of mRNA and is involved in the export of mRNA from the nucleus to the cytoplasm. Mug28 is a meiosis-specific protein that regulates spore wall formation. Members in this family contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The model corresponds to the third RRM motif.	92
394938	cd21623	ORF7b_SARS-CoV-2	Structural accessory protein ORF7b of Severe Acute Respiratory Syndrome coronavirus 2 and similar proteins. This group contains the ORF7b, also called NS7b, of Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), also known as 2019-nCoV, and a bat coronavirus (BatCoV RaTG13), which was previously detected in Rhinolophus affinis from China's Yunnan province and showed high sequence identity to SARS-CoV-2. ORF7b/NS7b from betacoronavirus in the B lineage are not related to NS7b proteins from other betacoronavirus lineages. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the ORF1ab (a large polyprotein known as replicase/protease); all are required to produce a structurally complete viral particle. In addition, SARS coronavirus contains a number of open reading frames that code for a total of eight accessory proteins, namely ORFs 3a, 3b, 6, 7a, 7b, 8a, 8b, and 9b. These ORFs are specific for SARS-CoV and do not show significant homology to accessory proteins of other coronaviruses. The SARS-CoV ORF7b protein is a highly hydrophobic 43 amino acid protein which is homologous to an accessory but structural component of SARS-CoV virion. While ORF7b is packaged into virions, it is not required for the virus budding process, as gene 7 deletion viruses replicate efficiently in vitro and in vivo. Moreover, ORF7b possesses a transmembrane helical domain (TMD), spanning amino acid residues 9-29, which is necessary for its Golgi complex localization, as replacing it with the TMD from the human endoprotease furin results in aberrant localization.	43
394950	cd21624	SARS-CoV-like_Spike_S1_NTD	N-terminal domain of the S1 subunit of the Spike (S) protein from Severe acute respiratory syndrome coronavirus and related betacoronaviruses in the B lineage. This subfamily contains the N-terminal domain (NTD) of the S1 subunit of the Spike (S) proteins from betacoronaviruses in the sarbecovirus subgenera (B lineage), including the highly pathogenic human coronavirus (CoV), Severe acute respiratory syndrome (SARS) CoV, and SARS-CoV-2, also known as a 2019 novel coronavirus (2019-nCoV) or COVID-19 virus. The CoV S protein is an envelope glycoprotein that plays the most important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains a hydrophobic fusion peptide and two heptad repeat regions. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-terminal domain (C-domain). Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). Most CoVs, including SARS-CoV-2 and SARS-CoV, use the C-domain to bind their receptors. The S1 NTD contributes to the Spike trimer interface.	280
394951	cd21625	MHV-like_Spike_S1_NTD	N-terminal domain of the S1 subunit of the Spike (S) protein from murine hepatitis virus and related betacoronaviruses in the A lineage. This subfamily contains the N-terminal domain (NTD) of the S1 subunit of the Spike (S) proteins from betacoronaviruses in the embecovirus subgenera (A lineage), including murine hepatitis virus (MHV), human coronavirus (HCoV) HKU1 and OC43, and bovine CoV (BCoV). MHV is the most common viral pathogen in contemporary laboratory mouse colonies manifesting as a primary infection in the upper respiratory tract, while HCoV-HKU1 causes mild yet prevalent respiratory disease in humans. The CoV S protein is an envelope glycoprotein that plays the most important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains a hydrophobic fusion peptide and two heptad repeat regions. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-terminal domain (C-domain). Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While most CoVs, including SARS-CoV and MERS-CoV use the C-domain to bind their receptors, several CoVs in the A lineage use the NTD to bind their receptors. MHV binds its protein receptor, mouse carcinoembryonic antigen related cell adhesion molecule 1a (mCEACAM1a), through its S1 NTD. BCoV and HCoV-OC43 recognize a sugar moiety, 5-N-acetyl-9-O-acetylneuraminic acid (Neu5,9Ac2), on cell-surface glycoproteins or glycolipids; this binding is also through the S1 NTD. In addition, the S1 NTD contributes to the Spike trimer interface.	284
394952	cd21626	MERS-CoV-like_Spike_S1_NTD	N-terminal domain of the S1 subunit of the Spike (S) protein from Middle East respiratory syndrome-related coronavirus and related betacoronaviruses in the C lineage. This subfamily contains the N-terminal domain (NTD) of the S1 subunit of the Spike (S) proteins from betacoronaviruses in the merbecovirus subgenera (C lineage), including the highly pathogenic human coronavirus (CoV), Middle East respiratory syndrome (MERS)-related CoV, and related bat CoVs. The CoV S protein is an envelope glycoprotein that plays the most important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains a hydrophobic fusion peptide and two heptad repeat regions. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-terminal domain (C-domain). Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). Most CoVs, including MERS-CoV, use the C-domain to bind their receptors. Although MERS-CoV uses the C-domain to bind its receptor, neutralizing antibodies targeting MERS-CoV S1-NTD have been reported, including human antibody CDC2-A2, murine antibodies G2 and 5F9, and macaque antibodies FIB-H1 and JC57-13. G2 has been shown to strongly disrupt the attachment of MERS-CoV S to its receptor, dipeptidyl peptidase-4 (DPP4). In addition, the S1 NTD contributes to the Spike trimer interface.	328
394953	cd21627	batCoV-HKU9-like_Spike_S1_NTD	N-terminal domain of the S1 subunit of the Spike (S) protein from Rousettus bat coronavirus HKU9 and related betacoronaviruses in the D lineage. This subfamily contains the N-terminal domain (NTD) of the S1 subunit of the Spike (S) proteins from betacoronaviruses in the nobecovirus subgenera (D lineage), including Rousettus bat coronavirus HKU9 and related bat CoVs. The CoV S protein is an envelope glycoprotein that plays the most important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains a hydrophobic fusion peptide and two heptad repeat regions. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-terminal domain (C-domain). Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). Most CoVs, including SARS-CoV-2, SARS-CoV, and MERS-CoV, use the C-domain to bind their receptors. However, a CoV such as mouse hepatitis virus (MHV) uses the NTD to bind its receptor, mouse carcinoembryonic antigen related cell adhesion molecule 1a (mCEACAM1a). The S1 NTD contributes to the Spike trimer interface.	289
394930	cd21628	deltaCoV_NS7_NS7a	deltacoronavirus accessory protein NS7 and NS7a. This family includes the accessory protein NS7 found in deltacoronaviruses from the Buldecovirus subgenus, such as porcine coronavirus HKU15, and several avian coronaviruses found in sparrow, pigeon, quail and falcon, among others. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and replicase/protease polyproteins (ORF1ab); all are required to produce a structurally complete viral particle. In addition, CoV genomes also contain ORFs coding for accessory proteins that are specific for certain CoV lineages or for a particular CoV. In general, CoV accessory proteins are considered to be dispensable for viral replication; however, several accessory proteins have been shown to exhibit functions in virus-host interactions during CoV infection. Porcine deltacoronavirus (PDCoV) encodes three accessory proteins, NS6, NS7 and NS7a. NS7a is a 100 amino-acid polypeptide identical to the C-terminus of NS7; it remains unclear whether their functions are redundant. PDCoV HKU15, an emerging swine enteric coronavirus that causes diarrhea in neonatal piglets, has also been found in the respiratory tract of pigs and may be able to cause respiratory infections, thus possibly spreading through the respiratory route. NS7-specific mAbs that recognized cells transfected with an NS7 expression construct or infected with PDCoV also recognized NS7a, which is encoded by a separate subgenome mRNA with a non-canonical transcription regulatory sequence. The NS7 protein is extensively distributed in the mitochondria and may be involved in various cellular processes such as cytoskeleton networks and cell communication, metabolism, and protein biosynthesis. NS7-expressing and PDCoV-infected cells also show a substantial down-regulation of alpha-actinin-4.	195
394929	cd21629	NS6_deltaCoV	deltacoronavirus accessory protein NS6. This family includes the accessory protein NS6 from deltacoronaviruses such as porcine coronavirus HKU15, and several avian coronaviruses found in sparrow, pigeon, quail and falcon, among others. There are five essential genes in coronaviruses (CoVs) that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and replicase/protease polyproteins (ORF1ab); all are required to produce a structurally complete viral particle. In addition, CoV genomes also contain ORFs coding for accessory proteins that are specific for certain CoV lineages or for a particular CoV. In general, CoV accessory proteins are considered to be dispensable for viral replication; however, several accessory proteins have been shown to exhibit functions in virus-host interactions during CoV infection. Porcine deltacoronavirus (PDCoV) encodes three accessory proteins, NS6, NS7, and NS7a. During PDCoV infection, NS6 antagonizes RIG-I-like receptor (RLR)-mediated IFN-beta production to evade host innate immune defense; it interacts with RIG-I and MDA5 to impede their association with double-stranded RNA. This is an important finding towards novel therapeutic targets and may lead to the development of more effective vaccines against PDCoV infection.	91
409020	cd21631	RHH_CopG_NikR-like	ribbon-helix-helix domains of transcription repressor CopG, nickel responsive transcription factor NikR, and similar proteins. This family includes the ribbon-helix-helix (RHH) domains of transcriptional repressor CopG, nickel-responsive transcription factor NikR, several antitoxins such as Shewanella oneidensis CopA(SO), Burkholderia pseudomallei HicB, and Caulobacter crescentus ParD, and similar proteins. CopG, a homodimeric RHH protein of around 45 residues, constitutes one of the smallest natural transcriptional repressors characterized and is the prototype of a series of repressor proteins encoded by plasmids that exhibit a similar genetic structure at their leading strand initiation and control regions. It is involved in the control of plasmid copy number. NikR, which consists of the N-terminal DNA-binding RHH domain and the C-terminal metal-binding domain (MBD) with four nickel ions, regulates several genes; in Helicobacter pylori, NikR regulates the urease enzyme under extreme acidic conditions, and is involved in the intracellular physiology of nickel. Protein HicB is part of the HicAB toxin-antitoxin (TA) system, where the toxins are RNases, found in many bacteria. In Burkholderia pseudomallei, the HicAB system may play a role in disease by regulating the frequency of persister cells, while in Yersinia pestis HicB acts as an autoregulatory protein that inhibits HicA, which acts as an mRNase. In Escherichia coli, an excess of HicA has been shown to de-repress a HicB-DNA complex and restore transcription of HicB. The CopG family RHH domain, represented by this model, forms a homodimer and binds DNA.	42
394939	cd21635	ORF7b_SARS-CoV-like	Severe Acute Respiratory Syndrome coronavirus structural accessory protein ORF7b and related proteins. This group contains the ORF7b, also called NS7b, of Severe Acute Respiratory Syndrome coronaviruses (SARS-CoVs) and related betacoronaviruses identified in Chinese horseshoe bats, including bat SARS-like-CoV WIV1 and HKU3. ORF7b/NS7b from betacoronavirus in the B lineage are not related to NS7b proteins from other betacoronavirus lineages. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the ORF1ab (a large polyprotein known as replicase/protease); all are required to produce a structurally complete viral particle. In addition, SARS coronavirus contains a number of open reading frames that code for a total of eight accessory proteins, namely ORFs 3a, 3b, 6, 7a, 7b, 8a, 8b, and 9b. These ORFs are specific for SARS-CoV and do not show significant homology to accessory proteins of other coronaviruses. The SARS-CoV ORF7b protein is a highly hydrophobic 43 amino acid protein which is homologous to an accessory but structural component of SARS-CoV virion. While ORF7b is packaged into virions, it is not required for the virus budding process, as gene 7 deletion viruses replicate efficiently in vitro and in vivo. Moreover, ORF7b possesses a transmembrane helical domain (TMD), spanning amino acid residues 9-29, which is necessary for its Golgi complex localization, as replacing it with the TMD from the human endoprotease furin results in aberrant localization.	44
394931	cd21637	NS7_PDCoV	Porcine deltacoronavirus (PDCoV) accessory protein NS7. This group includes the accessory protein NS7 found in Porcine coronavirus HKU15. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and replicase/protease polyproteins (ORF1ab); all are required to produce a structurally complete viral particle. In addition, CoV genomes also contain ORFs coding for accessory proteins that are specific for certain CoV lineages or for a particular CoV. In general, CoV accessory proteins are considered to be dispensable for viral replication; however, several accessory proteins have been shown to exhibit functions in virus-host interactions during CoV infection. Porcine deltacoronavirus (PDCoV) encodes three accessory proteins, NS6, NS7 and NS7a. NS7a is a 100 amino-acid polypeptide identical to the C-terminus of NS7; it remains unclear whether their functions are redundant. PDCoV HKU15, an emerging swine enteric coronavirus that causes diarrhea in neonatal piglets, has also been found in the respiratory tract of pigs and may be able to cause respiratory infections, thus possibly spreading through the respiratory route. NS7-specific mAbs that recognized cells transfected with an NS7 expression construct or infected with PDCoV also recognized NS7a, which is encoded by a separate subgenome mRNA with a non-canonical transcription regulatory sequence. The NS7 protein is extensively distributed in the mitochondria and may be involved in various cellular processes such as cytoskeleton networks and cell communication, metabolism, and protein biosynthesis. NS7-expressing and PDCoV-infected cells also show a substantial down-regulation of alpha-actinin-4.	198
394932	cd21638	NS7a_deltaCoV_HKU16-like	accessory protein NS7a found in deltacoronavirus, including avian coronavirus HKU16 and related coronaviruses. This group includes the accessory protein NS7a from White-eye coronavirus HKU16, Falcon coronavirus UAE-HKU27, Houbara coronavirus UAE-HKU28 and Pigeon coronavirus UAE-HKU29, within the Buldecovirus subgenus of deltacoronaviruses (deltaCoVs). There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and replicase/protease polyproteins (ORF1ab); all are required to produce a structurally complete viral particle. In addition, CoV genomes also contain ORFs coding for accessory proteins that are specific for certain CoV lineages or for a particular CoV. In general, CoV accessory proteins are considered to be dispensable for viral replication; however, several accessory proteins have been shown to exhibit functions in virus-host interactions during CoV infection. In deltaCoVs, several avian species encode accessory protein NS7a, which is homologous to Porcine coronavirus (PDCoV) HKU15 accessory proteins NS7 and NS7a. PDCoV NS7a is a 100 amino-acid polypeptide identical to the C-terminus of NS7; it remains unclear whether their functions are redundant. The PDCoV NS7 protein is extensively distributed in the mitochondria and may be involved in various cellular processes such as cytoskeleton networks and cell communication, metabolism, and protein biosynthesis. NS7a proteins in this subfamily have yet to be characterized.	197
394933	cd21639	NS7a_deltaCoV_HKU30-like	accessory protein NS7a found in deltacoronavirus, including avian coronavirus HKU30 and related coronaviruses. This group includes the accessory protein NS7a from Quail deltacoronavirus (QdCoV) UAE-HKU30 and sparrow deltacoronavirus (SpCoV-HKU17) within the Buldecovirus subgenus of deltacoronaviruses. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and replicase/protease polyproteins (ORF1ab); all are required to produce a structurally complete viral particle. In addition, CoV genomes also contain ORFs coding for accessory proteins that are specific for certain CoV lineages or for a particular CoV. In general, CoV accessory proteins are considered to be dispensable for viral replication; however, several accessory proteins have been shown to exhibit functions in virus-host interactions during CoV infection. In deltaCoVs, several avian species encode accessory protein NS7a, which is homologous to Porcine coronavirus (PDCoV) HKU15 accessory proteins NS7 and NS7a. PDCoV NS7a is a 100 amino-acid polypeptide identical to the C-terminus of NS7; it remains unclear whether their functions are redundant. The PDCoV NS7 protein is extensively distributed in the mitochondria and may be involved in various cellular processes such as cytoskeleton networks and cell communication, metabolism, and protein biosynthesis. NS7a proteins in this subfamily have yet to be characterized. Phylogenetic analysis revealed that QdCoV UAE-HKU30 belongs to the same CoV species as porcine deltacoronavirus (PdCoV) HKU15 and sparrow deltacoronavirus (SpdCoV) HKU17 within Buldecovirus subgenus, suggesting transmission between avian and swine hosts.	198
394944	cd21640	ORF8-Ig_SARS-CoV-2-like	SARS-CoV-2 ORF8 immunoglobulin (Ig) domain protein and related proteins. This family includes the ORF8 immunoglobulin (Ig) domain protein of Severe acute respiratory syndrome (SARS) coronavirus 2 (SARS-CoV-2, also known as a 2019 novel coronavirus, 2019-nCoV) and related Sarbecovirus ORF8 proteins including those classified as type II, such as bat coronavirus Rf1 ORF8, and those classified as type III, such as Bat SARS coronavirus HKU3-1 ORF8. SARS-CoV-2 causes the disease called "coronavirus disease 2019" (COVID-19). SARS-CoV-2 ORF8 is a fast-evolving protein in SARS-related CoVs, and a potential pathogenicity factor which evolves rapidly to counter the immune response and facilitate the transmission between hosts (DOI:10.1101/2020.03.04.977736).	120
394945	cd21641	ORF8-Ig_SARS-CoV-2-like	SARS-CoV-2 ORF8 immunoglobulin (Ig) domain protein and related proteins. This subfamily includes the ORF8 immunoglobulin (Ig) domain protein of Severe acute respiratory syndrome (SARS) coronavirus 2 (SARS-CoV-2, also known as a 2019 novel coronavirus, 2019-nCoV) and related Sarbecovirus ORF8 proteins. SARS-CoV-2 causes the disease called "coronavirus disease 2019" (COVID-19). SARS-CoV-2 ORF8 (also known as ns8 and accessory protein 8) is a fast-evolving protein in SARS-related CoVs, and a potential pathogenicity factor which evolves rapidly to counter the immune response and facilitate the transmission between hosts (DOI:10.1101/2020.03.04.977736). It belongs to a family which includes Sarbecovirus ORF8 proteins classified as type II, such as bat coronavirus Rf1 ORF8, and those classified as type III, such as Bat SARS coronavirus HKU3-1 ORF8.	121
394946	cd21642	ORF8-Ig_Bat_SARS_CoV_Rf1_type-II-like	ORF8 immunoglobulin (Ig) domain protein of bat coronavirus Rf1, a type II ORF8, and related proteins. This subfamily includes the ORF8 immunoglobulin (Ig) domain proteins of bat coronavirus Rf1 (Bat SARS CoV Rf1) and Bat CoV 273/2005, which have been classified previously as type II ORF8 proteins. They belong to a family which includes the ORF8 immunoglobulin (Ig) domain protein of Severe acute respiratory syndrome (SARS) coronavirus 2 (SARS-CoV-2, also known as a 2019 novel coronavirus, 2019-nCoV) and other related Sarbecovirus ORF8's, such as Bat SARS coronavirus HKU3-1 ORF8 which has been classified previously as a type III ORF8. SARS-CoV-2 causes the disease called "coronavirus disease 2019" (COVID-19). SARS-CoV-2 ORF8 protein (also known as ns8 and accessory protein 8) is a fast-evolving protein in SARS-related CoVs, and a potential pathogenicity factor which evolves rapidly to counter the immune response and facilitate the transmission between hosts (DOI:10.1101/2020.03.04.977736). In most SARS-CoVs, ORF8 is split into overlapping ORF8a and ORF8b proteins; the N- and C-termini of SARS-CoV-2 ORF8 are similar to SARS-CoV ORF8a and ORF8b, respectively.	119
394947	cd21643	ORF8-Ig_bat_SARS-CoV_HKU3-1_type-III-like	ORF8 immunoglobulin (Ig) domain protein of bat SARS coronavirus HKU3-1 ORF8, a type III ORF8, and related proteins. This subfamily includes the ORF8 immunoglobulin (Ig) domain proteins of Bat SARS coronavirus HKU3-1 and Bat SARS-like coronavirus Rs3367, which have been classified previously as type III ORF8's. They belong to a family which includes the ORF8 immunoglobulin (Ig) domain protein of Severe acute respiratory syndrome (SARS) coronavirus 2 (SARS-CoV-2, also known as a 2019 novel coronavirus, 2019-nCoV) and other related Sarbecovirus ORF8's, such as bat coronavirus Rf1 (Bat SARS CoV Rf1) ORF8 which has been classified previously as a type II ORF8. SARS-CoV-2 causes the disease called "coronavirus disease 2019" (COVID-19). SARS-CoV-2 ORF8 protein (also known as ns8 and accessory protein 8) is a fast-evolving protein in SARS-related CoVs, and a potential pathogenicity factor which evolves rapidly to counter the immune response and facilitate the transmission between hosts (DOI:10.1101/2020.03.04.977736).	120
394940	cd21644	batCoV-HKU9_NS7b	NS7b protein from Rousettus bat coronavirus HKU9 and related betacoronaviruses in the D lineage. This model represents the NS7b protein of Rousettus bat coronavirus (CoV) HKU9 and related proteins from betacoronaviruses in the nobecovirus subgenera (D lineage). There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and replicase/protease polyproteins (ORF1ab); all are required to produce a structurally complete viral particle. In addition, CoV genomes also contain ORFs coding for accessory proteins that are specific for certain CoV lineages or for a particular CoV. In general, CoV accessory proteins are considered to be dispensable for viral replication; however, several accessory proteins have been shown to exhibit functions in virus-host interactions during CoV infection. The NS7b protein of lineage D betacoronavirus is an accessory protein whose function is unknown. It is not related to NS7b proteins from other betacoronavirus lineages.	178
394928	cd21645	MERS-CoV-like_ORF5	Non-structural protein ORF5 from Middle East respiratory syndrome-related coronavirus and related betacoronaviruses in the C lineage. This model represents the non-structural protein ORF5 from Middle East respiratory syndrome-related coronavirus (MERS-CoV) and similar proteins from betacoronaviruses in the merbecovirus subgenera (C lineage). ORF5 is also called non-structural protein 3d (NS3d) or accessory protein 3d in some bat merbecoviruses. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and replicase/protease polyproteins (ORF1ab); all are required to produce a structurally complete viral particle. In addition, CoV genomes also contain ORFs coding for accessory proteins that are specific for certain CoV lineages or for a particular CoV. In general, CoV accessory proteins are considered to be dispensable for viral replication; however, several accessory proteins have been shown to exhibit functions in virus-host interactions during CoV infection. MERS-CoV is a highly pathogenic respiratory virus with pathogenic mechanisms that may be driven by innate immune pathways. MERS-CoV ORF5 acts as an interferon antagonist and may play a role in circumventing the innate immunity of host cells. It is also implicated to play a role in the modulation of NF-kappaB-mediated inflammation. ORF5/NS3d from merbecovirus (betacoronavirus, lineage C) may not be related to ORF5 proteins from other lineages.	223
394885	cd21646	CoV_Nsp5_Mpro	coronavirus non-structural protein 5, also called Main protease (Mpro). This family contains the coronavirus (CoV) non-structural protein 5 (Nsp5) also called the Main protease (Mpro), or 3C-like protease (3CLpro). CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Mpro/Nsp5 is a key enzyme in this process, making it a high value target for the development of anti-coronavirus therapeutics. These enzymes belong to the MEROPS peptidase C30 family, where the active site residues His and Cys form a catalytic dyad. The structures of Mpro/Nsp5 consist of three domains with the first two containing anti-parallel beta barrels and the third consisting of an arrangement of alpha-helices. The catalytic residues are found in a cleft between the first two domains. Mpro/Nsp5 requires a Gln residue in the P1 position of the substrate and space for only small amino-acid residues such as Gly, Ala, or Ser in the P1' position; since there is no known human protease with a specificity for Gln at the cleavage site of the substrate, these viral proteases are suitable targets for the development of antiviral drugs.	292
394924	cd21647	ORF4b_NS3c-betaCoV	accessory protein ORF4b, also known as non-structural protein 3c (NS3c), of betacoronaviruses in the C lineage. This model represents the accessory protein 4b, ORF4b (also called NS3c protein) of Middle East respiratory syndrome (MERS)-related CoV and similar proteins from betacoronaviruses in the merbecovirus subgenera (C lineage), including Tylonycteris bat coronavirus HKU4 and Pipistrellus bat coronavirus HKU5. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and replicase/protease polyproteins (ORF1ab); all are required to produce a structurally complete viral particle. In addition, CoV genomes also contain ORFs coding for accessory proteins that are specific for certain CoV lineages or for a particular CoV. In general, CoV accessory proteins are considered to be dispensable for viral replication; however, several accessory proteins have been shown to exhibit functions in virus-host interactions during CoV infection. ORF4b/NS3c plays a role in the inhibition of host innate immunity by inhibiting the interaction between host IkappaB kinase epsilon (IKBKE or IKKE) and mitochondrial antiviral-signalling protein (MAVS). In turn, this inhibition prevents the production of host interferon beta. Additionally, it may also interfere with host antiviral response within the nucleus. The MERS-CoV ORF4b (also known as MERS-CoV 4b) has been shown to interfere with the NF-kappaB-dependent innate immune response during infection, as well as antagonizing the early antiviral alpha/beta interferon (IFN-alpha/beta) response, which may significantly contribute to MERS-CoV pathogenesis.	227
394922	cd21648	SARS-CoV-like_ORF3a	accessory protein ORF3a of severe acute respiratory syndrome-associated coronavirus and similar proteins from related betacoronavirus. This model represents the accessory protein ORF3a of Severe acute respiratory syndrome-associated coronavirus (SARS-CoV), SARS-COV-2 (also called 2019 novel coronavirus or 2019-nCoV), and related betacoronaviruses in the Sarbecovirus subgenus (B lineage). There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and replicase/protease polyproteins (ORF1ab); all are required to produce a structurally complete viral particle. In addition, CoV genomes also contain ORFs coding for accessory proteins that are specific for certain CoV lineages or for a particular CoV. In general, CoV accessory proteins are considered to be dispensable for viral replication; however, several accessory proteins have been shown to exhibit functions in virus-host interactions during CoV infection. SARS-CoV mRNA 3 encodes the distinct proteins ORF3a and ORF3b, which are translated in different reading frames. Accessory protein ORF3a, also called protein 3a and protein X1, is the largest ORF protein in SARS-CoV. It is also called accessory protein 3 or protein 3 in some bat coronaviruses. SARS-CoV ORF3a promotes membrane rearrangement and cell death; it induces vesicle formation and is necessary for SARS-CoV-induced Golgi fragmentation. It has also been found to activate NF-kappaB and the NLRP3 inflammasome by promoting TNF receptor-associated factor 3 (TRAF3)-dependent ubiquitination of p105 and ASC (apoptosis-associated speck-like protein containing a caspase recruitment domain). The cytoplasmic domain of SARS-CoV ORF3a, composed of amino acids at the C-terminal region, has sequence similarity to a calcium pump present in Plasmodium falciparum and has been shown to bind calcium in vitro.	269
394923	cd21649	SARS-CoV_ORF3b	accessory protein ORF3b of severe acute respiratory syndrome-associated coronavirus. This model represents the accessory protein ORF3b of Severe acute respiratory syndrome-associated coronavirus (SARS-CoV). There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and replicase/protease polyproteins (ORF1ab); all are required to produce a structurally complete viral particle. In addition, CoV genomes also contain ORFs coding for accessory proteins that are specific for certain CoV lineages or for a particular CoV. In general, CoV accessory proteins are considered to be dispensable for viral replication; however, several accessory proteins have been shown to exhibit functions in virus-host interactions during CoV infection. SARS-CoV mRNA 3 encodes the distinct proteins ORF3a and ORF3b, which are translated in different reading frames. SARS-CoV accessory protein ORF3b antagonizes interferon (IFN) function by modulating the activity of IFN regulatory factor 3 (IRF3). The IFN system functions as the first line of defense against viral infection in mammalian cells. Viral infection triggers a series of cellular events that lead to the production of IFN and several downstream antiviral genes, helping to establish an antiviral state. Viruses encode IFN antagonists to counteract the antiviral effects of IFN. SARS-CoV ORF3b, ORF6, and N proteins function as IFN antagonists. ORF3b inhibits both IFN synthesis and signaling. It localizes to the nucleus in transfected cells.	151
412060	cd21650	CrtA-like	spheroidene monooxygenase and similar proteins. Spheroidene monooxygenase (such as Rhodobacter sphaeroides monooxygenase CrtA) catalyzes the asymmetrical introduction of one keto group at the C-2 position of spheroidene and two keto groups at the C-2 and C-2' positions of spirilloxanthin in carotenoid pathways. Spectroscopic analysis suggests CrtA may have a 5-coordinated heme at its active site and that it may be a novel oxygenase and not a P450 enzyme.	225
394925	cd21651	ORF4b_MERS-CoV-like	accessory protein ORF4b, also known as non-structural protein 3c (NS3c) in Middle East respiratory syndrome (MERS)-related CoV. This model represents the accessory protein 4b, ORF4b (also called NS3c protein) of Middle East respiratory syndrome (MERS)-related CoV, as well as some bat coronaviruses. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and replicase/protease polyproteins (ORF1ab); all are required to produce a structurally complete viral particle. In addition, CoV genomes also contain ORFs coding for accessory proteins that are specific for certain CoV lineages or for a particular CoV. In general, CoV accessory proteins are considered to be dispensable for viral replication; however, several accessory proteins have been shown to exhibit functions in virus-host interactions during CoV infection. ORF4b/NS3c plays a role in the inhibition of host innate immunity by inhibiting the interaction between host IkappaB kinase epsilon (IKBKE or IKKE) and mitochondrial antiviral-signalling protein (MAVS). In turn, this inhibition prevents the production of host interferon beta. Additionally, it may also interfere with host antiviral response within the nucleus. The MERS-CoV ORF4b (also known as MERS-CoV 4b) has been shown to interfere with the NF-kappaB-dependent innate immune response during infection, as well as antagonizing the early antiviral alpha/beta interferon (IFN-alpha/beta) response, which may significantly contribute to MERS-CoV pathogenesis.	239
394926	cd21652	ORF4b_HKU4-CoV	accessory protein ORF4b, also known as non-structural protein 3c (NS3c), of Tylonycteris bat coronavirus HKU4 and similar proteins. This model represents the accessory protein 4b, ORF4b (also called NS3c protein) of Tylonycteris bat coronavirus HKU4 and related bat coronaviruses including Tylonycteris pachypus bat coronavirus HKU4-related. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and replicase/protease polyproteins (ORF1ab); all are required to produce a structurally complete viral particle. In addition, CoV genomes also contain ORFs coding for accessory proteins that are specific for certain CoV lineages or for a particular CoV. In general, CoV accessory proteins are considered to be dispensable for viral replication; however, several accessory proteins have been shown to exhibit functions in virus-host interactions during CoV infection. ORF4b/NS3c plays a role in the inhibition of host innate immunity by inhibiting the interaction between host IkappaB kinase epsilon (IKBKE or IKKE) and mitochondrial antiviral-signalling protein (MAVS). In turn, this inhibition prevents the production of host interferon beta. Additionally, it may also interfere with host antiviral response within the nucleus. ORF4b/NS3c proteins in this subgroup are similar to the MERS-CoV ORF4b (also known as MERS-CoV 4b) which has been shown to interfere with the NF-kappaB-dependent innate immune response during infection, as well as antagonizing the early antiviral alpha/beta interferon (IFN-alpha/beta) response, which may significantly contribute to MERS-CoV pathogenesis.	256
394927	cd21653	ORF4b_HKU5-CoV	accessory protein ORF4b, also known as non-structural protein 3c (NS3c), of Pipistrellus bat coronavirus HKU5 and similar proteins. This model represents the accessory protein 4b, ORF4b (also called NS3c protein) of Pipistrellus bat coronavirus HKU5 and related bat coronaviruses including Pipistrellus abramus bat coronavirus HKU5-related. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and replicase/protease polyproteins (ORF1ab); all are required to produce a structurally complete viral particle. In addition, CoV genomes also contain ORFs coding for accessory proteins that are specific for certain CoV lineages or for a particular CoV. In general, CoV accessory proteins are considered to be dispensable for viral replication; however, several accessory proteins have been shown to exhibit functions in virus-host interactions during CoV infection. ORF4b/NS3c plays a role in the inhibition of host innate immunity by inhibiting the interaction between host IkappaB kinase epsilon (IKBKE or IKKE) and mitochondrial antiviral-signalling protein (MAVS). In turn, this inhibition prevents the production of host interferon beta. Additionally, it may also interfere with host antiviral response within the nucleus. ORF4b/NS3c proteins in this subgroup are similar to the MERS-CoV ORF4b (also known as MERS-CoV 4b) which has been shown to interfere with the NF-kappaB-dependent innate immune response during infection, as well as antagonizing the early antiviral alpha/beta interferon (IFN-alpha/beta) response, which may significantly contribute to MERS-CoV pathogenesis.	249
394941	cd21654	embe-merbe_CoV_ORF8b_protein-I-like	MERS-CoV ORF8b, BECV protein I, and related Embecovirus and Merbecovirus proteins. This family includes the ORF8b accessory protein from Middle East respiratory syndrome-related coronavirus (MERS-CoV) and related merbecoviruses (C lineage), and protein I (also known as accessory protein N2) from bovine enteritic coronavirus-F15 strain (BECV-F15) and related Embecoviruses (A lineage). The gene encoding ORF8b is an internal ORF that is overlapped by the N (nucleocapsid) protein gene (ORF8a), and the gene encoding protein I is included in the N gene as an alternative ORF. ORF8b and protein I appear to have no homologous proteins in Sarbecovirus (lineage B), which includes Severe acute respiratory syndrome (SARS) coronavirus (SARS-CoV) and SARS-CoV-2 (2019 novel coronavirus, 2019-nCoV). There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and replicase/protease polyproteins (ORF1ab); all are required to produce a structurally complete viral particle. In addition, CoV genomes also contain ORFs coding for accessory proteins that are specific for certain CoV lineages or for a particular CoV. In general, CoV accessory proteins are considered dispensable for viral replication; however, several accessory proteins have been shown to exhibit functions in virus-host interactions during CoV infection. MERS-CoV ORF8b and BECV-F15 protein I are not essential for viral replication.	104
394956	cd21657	deltaCoV_Nsp14	nonstructural protein 14 of deltacoronavirus. Nonstructural protein 14 (Nsp14) of coronavirus (CoV) plays an important role in viral replication and transcription. It consists of 2 domains with different enzymatic activities: an N-terminal exoribonuclease (ExoN) domain and a C-terminal cap (guanine-N7) methyltransferase (N7-MTase) domain. ExoN is important for proofreading and therefore, the prevention of lethal mutations. The association of Nsp14 with Nsp10 stimulates its ExoN activity; the complex hydrolyzes double-stranded RNA in a 3' to 5' direction as well as a single mismatched nucleotide at the 3'-end mimicking an erroneous replication product. The Nsp10/Nsp14 complex may function in a replicative mismatch repair mechanism. N7-MTase functions in mRNA capping. Nsp14 can methylate GTP, dGTP as well as cap analogs GpppG, GpppA and m7GpppG. The accumulation of m7GTP or Nsp14 has been found to interfere with protein translation of cellular mRNAs.	508
394957	cd21658	gammaCoV_Nsp14	nonstructural protein 14 of gammacoronavirus. Nonstructural protein 14 (Nsp14) of coronavirus (CoV) plays an important role in viral replication and transcription. It consists of 2 domains with different enzymatic activities: an N-terminal exoribonuclease (ExoN) domain and a C-terminal cap (guanine-N7) methyltransferase (N7-MTase) domain. ExoN is important for proofreading and therefore, the prevention of lethal mutations. The association of Nsp14 with Nsp10 stimulates its ExoN activity; the complex hydrolyzes double-stranded RNA in a 3' to 5' direction as well as a single mismatched nucleotide at the 3'-end mimicking an erroneous replication product. The Nsp10/Nsp14 complex may function in a replicative mismatch repair mechanism. N7-MTase functions in mRNA capping. Nsp14 can methylate GTP, dGTP as well as cap analogs GpppG, GpppA and m7GpppG. The accumulation of m7GTP or Nsp14 has been found to interfere with protein translation of cellular mRNAs.	518
394958	cd21659	betaCoV_Nsp14	nonstructural protein 14 of betacoronavirus. Nonstructural protein 14 (Nsp14) of coronavirus (CoV) plays an important role in viral replication and transcription. It consists of 2 domains with different enzymatic activities: an N-terminal exoribonuclease (ExoN) domain and a C-terminal cap (guanine-N7) methyltransferase (N7-MTase) domain. ExoN is important for proofreading and therefore, the prevention of lethal mutations. The association of Nsp14 with Nsp10 stimulates its ExoN activity; the complex hydrolyzes double-stranded RNA in a 3' to 5' direction as well as a single mismatched nucleotide at the 3'-end mimicking an erroneous replication product. The Nsp10/Nsp14 complex may function in a replicative mismatch repair mechanism. N7-MTase functions in mRNA capping. Nsp14 can methylate GTP, dGTP as well as cap analogs GpppG, GpppA and m7GpppG. The accumulation of m7GTP or Nsp14 has been found to interfere with protein translation of cellular mRNAs.	519
394959	cd21660	alphaCoV_Nsp14	nonstructural protein 14 of alphacoronavirus. Nonstructural protein 14 (Nsp14) of coronavirus (CoV) plays an important role in viral replication and transcription. It consists of 2 domains with different enzymatic activities: an N-terminal exoribonuclease (ExoN) domain and a C-terminal cap (guanine-N7) methyltransferase (N7-MTase) domain. ExoN is important for proofreading and therefore, the prevention of lethal mutations. The association of Nsp14 with Nsp10 stimulates its ExoN activity; the complex hydrolyzes double-stranded RNA in a 3' to 5' direction as well as a single mismatched nucleotide at the 3'-end mimicking an erroneous replication product. The Nsp10/Nsp14 complex may function in a replicative mismatch repair mechanism. N7-MTase functions in mRNA capping. Nsp14 can methylate GTP, dGTP as well as cap analogs GpppG, GpppA and m7GpppG. The accumulation of m7GTP or Nsp14 has been found to interfere with protein translation of cellular mRNAs.	510
394942	cd21661	merbe_CoV_ORF8b-like	MERS-CoV ORF8b protein and related Merbecovirus proteins. This subfamily includes the ORF8b accessory protein from Middle East respiratory syndrome-related coronavirus (MERS-CoV) and related merbecoviruses (C lineage). The gene encoding ORF8b is an internal ORF that is overlapped by the N (nucleocapsid) protein gene (ORF8a). ORF8b appears to have no homologous proteins in Sarbecovirus (lineage B), which includes Severe acute respiratory syndrome (SARS) coronavirus (SARS-CoV) and SARS-CoV-2 (2019 novel coronavirus, 2019-nCoV). There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and replicase/protease polyproteins (ORF1ab); all are required to produce a structurally complete viral particle. In addition, CoV genomes also contain ORFs coding for accessory proteins that are specific for certain CoV lineages or for a particular CoV. In general, CoV accessory proteins are considered dispensable for viral replication; however, several accessory proteins have been shown to exhibit functions in virus-host interactions during CoV infection. MERS-CoV ORF8b is not essential for viral replication. It is related to protein I (also known as accessory protein N2) of bovine enteritic coronavirus-F15 strain (BECV-F15) and other related Embecoviruses; the gene encoding protein I is included in the N gene as an alternative ORF.	104
394943	cd21662	embe-CoV_Protein-I_like	BECV protein I and related Embecovirus proteins. This subfamily includes protein I (also known as accessory protein N2) from bovine enteritic coronavirus-F15 strain (BECV-F15) and related Embecoviruses (A lineage) including murine hepatitis virus. The gene encoding protein I is included in the N gene as an alternative ORF. Protein I appears to have no homologous proteins in Sarbecovirus lineage B, which includes Severe acute respiratory syndrome (SARS) coronavirus (SARS-CoV) and SARS-CoV-2 (2019 novel coronavirus, 2019-nCoV). There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and replicase/protease polyproteins (ORF1ab); all are required to produce a structurally complete viral particle. In addition, CoV genomes also contain ORFs coding for accessory proteins that are specific for certain CoV lineages or for a particular CoV. In general, CoV accessory proteins are considered dispensable for viral replication; however, several accessory proteins have been shown to exhibit functions in virus-host interactions during CoV infection. BECV-F15 protein I is not essential for viral replication. It is related to the ORF8b accessory protein of Middle East respiratory syndrome-related coronavirus (MERS-CoV) and other related merbecoviruses (C lineage); the gene encoding ORF8b is an internal ORF that is overlapped by the N (nucleocapsid) protein gene (ORF8a).	115
394934	cd21663	ORF7a_SARS-CoV-like	Severe Acute Respiratory Syndrome coronavirus (SARS-CoV) structural accessory protein ORF7a and similar proteins from related betacoronaviruses in the subgenera Sarbecovirus (B lineage). This family contains the structural accessory protein ORF7a, also called NS7a, of Severe Acute Respiratory Syndrome Coronaviruses (SARS-CoVs) from betacoronavirus subgenera Sarbecovirus (lineage B), including SARS-CoV-2, also known as 2019-nCoV, and a bat coronavirus (BatCoV RaTG13), which was previously detected in Rhinolophus affinis from China's Yunnan province, as well as SARS-related virus from Rhinolophus bats in Europe and Kenya. ORF7a/NS7a from betacoronavirus in the subgenera Sarbecovirus (lineage B) are not related to NS7a proteins from other coronavirus lineages. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the ORF1ab (a large polyprotein known as replicase/protease); all required to produce a structurally complete viral particle. In addition, SARS-CoV contains a number of open reading frames that code for a total of eight accessory proteins, namely ORFs 3a, 3b, 6, 7a, 7b, 8a, 8b, and 9b. These ORFs are specific for SARS-CoV and do not show significant homology to accessory proteins of other coronaviruses. Structurally, ORF7a possesses a distinctive immunoglobulin (Ig)-like domain which is related to extracellular metazoan Ig domains that are involved in adhesion, such as ICAM; it also contains a 15-amino acid signal peptide sequence at its N terminus, an 81-amino acid luminal domain, a 21-amino acid transmembrane domain, and a short C-terminal tail. Co-expression of SARS-CoV ORF7a with S, M, N and E proteins resulted in production of virus-like particles (VLPs) carrying ORF7a protein, indicating that ORF7a is a viral structural protein. Expression studies of ORF7a have shown that biological functions include induction of apoptosis through a caspase-dependent pathway, activation of the p38 mitogen-activated protein kinase signaling pathway, inhibition of host protein translation, and suppression of cell growth progression. These results collectively suggested that ORF7a protein may be involved in virus-host interactions.	83
394886	cd21665	alphaCoV_Nsp5_Mpro	alphacoronavirus non-structural protein 5, also called Main protease (Mpro). This subfamily contains the coronavirus (CoV) non-structural protein 5 (Nsp5) also called the Main protease (Mpro), or 3C-like protease (3CLpro), found in alphacoronaviruses. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Mpro/Nsp5 is a key enzyme in this process, making it a high value target for the development of anti-coronavirus therapeutics. These enzymes belong to the MEROPS peptidase C30 family, where the active site residues His and Cys form a catalytic dyad. The structures of Mpro/Nsp5 consist of three domains with the first two containing anti-parallel beta barrels and the third consisting of an arrangement of alpha-helices. The catalytic residues are found in a cleft between the first two domains. Mpro/Nsp5 requires a Gln residue in the P1 position of the substrate and space for only small amino-acid residues such as Gly, Ala, or Ser in the P1' position; since there is no known human protease with a specificity for Gln at the cleavage site of the substrate, these viral proteases are suitable targets for the development of antiviral drugs.	296
394887	cd21666	betaCoV_Nsp5_Mpro	betacoronavirus non-structural protein 5, also called Main protease (Mpro). This subfamily contains the coronavirus (CoV) non-structural protein 5 (Nsp5) also called the Main protease (Mpro), or 3C-like protease (3CLpro), found in betacoronaviruses. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Mpro/Nsp5 is a key enzyme in this process, making it a high value target for the development of anti-coronavirus therapeutics. These enzymes belong to the MEROPS peptidase C30 family, where the active site residues His and Cys form a catalytic dyad. The structures of Mpro/Nsp5 consist of three domains with the first two containing anti-parallel beta barrels and the third consisting of an arrangement of alpha-helices. The catalytic residues are found in a cleft between the first two domains. Mpro requires a Gln residue in the P1 position of the substrate and space for only small amino-acid residues such as Gly, Ala, or Ser in the P1' position; since there is no known human protease with a specificity for Gln at the cleavage site of the substrate, these viral proteases are suitable targets for the development of antiviral drugs.	297
394888	cd21667	gammaCoV_Nsp5_Mpro	gammacoronavirus non-structural protein 5, also called Main protease (Mpro). This subfamily contains the coronavirus (CoV) non-structural protein 5 (Nsp5) also called the Main protease (Mpro), or 3C-like protease (3CLpro), found in gammacoronaviruses. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Mpro/Nsp5 is a key enzyme in this process, making it a high value target for the development of anti-coronavirus therapeutics. These enzymes belong to the MEROPS peptidase C30 family, where the active site residues His and Cys form a catalytic dyad. The structures of Mpro/Nsp5 consist of three domains with the first two containing anti-parallel beta barrels and the third consisting of an arrangement of alpha-helices. The catalytic residues are found in a cleft between the first two domains. Mpro/Nsp5 requires a Gln residue in the P1 position of the substrate and space for only small amino-acid residues such as Gly, Ala, or Ser in the P1' position; since there is no known human protease with a specificity for Gln at the cleavage site of the substrate, these viral proteases are suitable targets for the development of antiviral drugs.	306
394889	cd21668	deltaCoV_Nsp5_Mpro	deltacoronavirus non-structural protein 5, also called Main protease (Mpro). This subfamily contains the coronavirus (CoV) non-structural protein 5 (Nsp5) also called the Main protease (Mpro), or 3C-like protease (3CLpro), found in deltacoronaviruses. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Mpro/Nsp5 is a key enzyme in this process, making it a high value target for the development of anti-coronavirus therapeutics. These enzymes belong to the MEROPS peptidase C30 family, where the active site residues His and Cys form a catalytic dyad. The structures of Mpro/Nsp5 consist of three domains with the first two containing anti-parallel beta barrels and the third consisting of an arrangement of alpha-helices. The catalytic residues are found in a cleft between the first two domains. Mpro/Nsp5 requires a Gln residue in the P1 position of the substrate and space for only small amino-acid residues such as Gly, Ala, or Ser in the P1' position; since there is no known human protease with a specificity for Gln at the cleavage site of the substrate, these viral proteases are suitable targets for the development of antiviral drugs.	302
394935	cd21684	ORF7a_SARS-CoV-2-like	Severe Acute Respiratory Syndrome coronavirus 2 (SARS-CoV-2) structural accessory protein ORF7a and a bat coronavirus (BatCoV RaTG13) from related betacoronaviruses in the subgenera Sarbecovirus (B lineage). This group contains the structural accessory protein ORF7a, also called NS7a, of Severe Acute Respiratory Syndrome Coronaviruses (SARS-CoV) from betacoronavirus subgenera Sarbecovirus (lineage B), including SARS-CoV-2, also known as 2019-nCoV, and a bat coronavirus (BatCoV RaTG13), which was previously detected in Rhinolophus affinis from China's Yunnan province. ORF7a/NS7a from betacoronavirus in the subgenera Sarbecovirus (B lineage) are not related to NS7a proteins from other coronavirus lineages. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the ORF1ab (a large polyprotein known as replicase/protease); all required to produce a structurally complete viral particle. In addition, SARS-CoV contains a number of open reading frames that code for a total of eight accessory proteins, namely ORFs 3a, 3b, 6, 7a, 7b, 8a, 8b, and 9b. These ORFs are specific for SARS-CoV and do not show significant homology to accessory proteins of other coronaviruses. Structurally, ORF7a possesses a distinctive immunoglobulin (Ig)-like domain which is related to extracellular metazoan Ig domains that are involved in adhesion, such as ICAM; it also contains a 15-aa signal peptide sequence at its N terminus, an 81-aa luminal domain, a 21-aa transmembrane domain, and a short C-terminal tail. Coexpression of SARS-CoV ORF7a with S, M, N, and E proteins resulted in production of virus-like particles (VLPs) carrying ORF7a protein, indicating that ORF7a is a viral structural protein. Expression studies of ORF7a have shown that biological functions include induction of apoptosis through a caspase-dependent pathway, activation of the p38 mitogen-activated protein kinase signaling pathway, inhibition of host protein translation, and suppression of cell growth progression. These results collectively suggested that ORF7a protein may be involved in virus-host interactions.	121
394936	cd21685	ORF7a_SARS-CoV-like	Severe Acute Respiratory Syndrome coronavirus (SARS-CoV) structural accessory protein ORF7a and similar proteins from betacoronaviruses in the subgenera Sarbecovirus (B lineage). This group contains the structural accessory protein ORF7a, also called NS7a, of Severe Acute Respiratory Syndrome Coronaviruses (SARS-CoVs) from betacoronavirus subgenera Sarbecovirus (lineage B). ORF7a/NS7a from betacoronavirus in the subgenera Sarbecovirus (B lineage) are not related to NS7a proteins from other coronavirus lineages. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the ORF1ab (a large polyprotein known as replicase/protease); all required to produce a structurally complete viral particle. In addition, SARS-CoV contains a number of open reading frames that code for a total of eight accessory proteins, namely ORFs 3a, 3b, 6, 7a, 7b, 8a, 8b, and 9b. These ORFs are specific for SARS-CoV and do not show significant homology to accessory proteins of other coronaviruses. Structurally, ORF7a possesses a distinctive immunoglobulin (Ig)-like domain which is related to extracellular metazoan Ig domains that are involved in adhesion, such as ICAM; it also contains a 15-aa signal peptide sequence at its N terminus, an 81-aa luminal domain, a 21-aa transmembrane domain, and a short C-terminal tail. Coexpression of SARS-CoV ORF7a with S, M, N, and E proteins resulted in production of virus-like particles (VLPs) carrying ORF7a protein, indicating that ORF7a is a viral structural protein. Expression studies of ORF7a have shown that biological functions include induction of apoptosis through a caspase-dependent pathway, activation of the p38 mitogen-activated protein kinase signaling pathway, inhibition of host protein translation, and suppression of cell growth progression. These results collectively suggested that ORF7a protein may be involved in virus-host interactions.	83
409657	cd21686	TM_Y_CoV_Nsp3_C	C-terminus of coronavirus non-structural protein 3, including transmembrane and Y domains. This model represents the C-terminus of non-structural protein 3 (Nsp3) from alpha-, beta-, gamma-, and deltacoronavirus, including highly pathogenic betacoronaviruses such as Severe acute respiratory syndrome-related coronavirus (SARS-CoV), SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV), and Middle East respiratory syndrome-related (MERS) CoV. This conserved C-terminus includes two transmembrane (TM) regions TM1 and TM2, an ectodomain (3Ecto) between the TM1 and TM2 that is glycosylated and located on the lumenal side of the ER, an amphipathic region (AH1) that is not membrane-spanning, and a large Y domain of approximately 370 residues. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. In SARS-CoV and murine hepatitis virus (MHV), the TM1, 3Ecto and TM2 domains are important for the papain-like protease (PL2pro) domain to process Nsp3-Nsp4 cleavage. It has also been shown that the interaction of 3Ecto with the lumenal loop of Nsp4 is essential for ER rearrangements in cells infected with SARS-CoV or MHV. The Y domain, located at the cytosolic side of the ER, consists of the Y1 and CoV-Y subdomains, which are conserved in nidovirus and coronavirus, respectively. Functional information about the Y domain is limited; it has been shown that Nsp3 binding to Nsp4 is less efficient without the Y domain.	476
409334	cd21687	TGEV-like_alphaCoV_Nsp1	non-structural protein 1 from transmissible gastroenteritis virus and similar alphacoronaviruses. This model represents the non-structural protein 1 (Nsp1) from transmissible gastroenteritis virus (TGEV) and similar alphacoronaviruses from the tegacovirus and minacovirus subgenera. CoVs utilize a multi-subunit replication/transcription machinery assembled from a set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins. Nsp1 is the N-terminal cleavage product released from the ORF1a polyprotein by the action of papain-like protease (PLpro). Though Nsp1s of alphaCoVs and betaCoVs share structural similarity, they show no significant sequence similarity and may be considered as genus-specific markers. Despite low sequence similarity, the Nsp1s of alphaCoVs and betaCoVs exhibit remarkably similar biological functions, and are involved in the regulation of both host and viral gene expression. CoV Nsp1 induces suppression of host gene expression and interferes with host immune response. It inhibits host gene expression in two ways: by targeting the translation and stability of cellular mRNAs, and by inhibiting mRNA translation and inducing an endonucleolytic RNA cleavage in the 5'-UTR of cellular mRNAs through its tight association with the 40S ribosomal subunit, a key component of the cellular translation machinery. Nsp1 is critical in regulating viral replication and gene expression, as shown by multiple lines of evidence, including: mutations in the Nsp1 coding region of the TGEV and murine hepatitis virus (MHV) genomes cause drastic reduction or elimination of infectious virus; bovine coronavirus (BCoV) Nsp1 is an RNA-binding protein that interacts with cis-acting replication elements in the 5'-UTR of the BCoV genome, implying its potential role in the regulation of viral translation or replication; and SARS-CoV Nsp1 enhances virus replication by binding to a stem-loop structure in the 5'-UTR of its genome.	104
409647	cd21688	CoV_PLPro	Coronavirus (CoV) papain-like protease (PLPro). This model represents the papain-like protease (PLPro) found in non-structural protein 3 (Nsp3) of alpha-, beta-, gamma-, and deltacoronavirus, including highly pathogenic betacoronaviruses such as Severe acute respiratory syndrome-related coronavirus (SARS-CoV), SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV), and Middle East respiratory syndrome-related (MERS) CoV. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. PLPro is a key enzyme in this process, making it a high value target for the development of anti-coronavirus therapeutics. PLPro, which belongs to the MEROPS peptidase C16 family, participates in the proteolytic processing of the N-terminal region of the replicase polyprotein; it can cleave Nsp1|Nsp2, Nsp2|Nsp3, and Nsp3|Nsp4 sites and its activity is dependent on zinc. Besides cleaving the polyproteins, PLPro also possesses related enzymatic activities that promote virus replication: deubiquitinating (DUB) and de-ISGylating activities. Both ubiquitin (Ub) and Ub-like interferon-stimulated gene product 15 (ISG15) are involved in preventing viral infection; coronaviruses utilize Ubl-conjugating pathways to counter the pro-inflammatory properties of Ubl-conjugated host proteins via the action of PLPro, which processes both 'Lys-48'- and 'Lys-63'-linked polyubiquitin chains from cellular substrates. The Nsp3 PLPro domain in many of these CoVs has also been shown to antagonize host innate immune induction of type I interferon by interacting with IRF3 and blocking its activation.	299
410205	cd21689	stalk_CoV_Nsp13-like	stalk domain of coronavirus Nsp13 helicase and related proteins. This model represents the stalk domain of coronavirus non-structural protein 13 (Nsp13) helicase, found in the Nsp13s of alpha-, beta-, gamma-, and deltacoronaviruses, including Severe Acute Respiratory Syndrome coronavirus (SARS-CoV), SARS-CoV-2 (also called 2019 novel CoV or 2019-nCoV), and Middle East respiratory syndrome coronavirus (MERS-CoV). Helicases are classified based on the arrangement of conserved motifs into six superfamilies; coronavirus helicases in this family belong to superfamily 1 (SF1). Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands. Nsp13 is a component of the viral RNA synthesis replication and transcription complex (RTC). It consists of an N-terminal ZBD (Cys/His rich zinc-binding domain), a stalk domain, a 1B regulatory domain, and SF1 helicase core. The stalk domain lies between the ZBD domain and the 1B domain; a short loop connects the ZBD to the stalk domain. The stalk domain is comprised of three tightly-interacting alpha-helices connected to the 1B domain, transferring the effect from the ZBD domain onto the helicase core domains. The ZBD and stalk domains are critical for the helicase activity of SARS-CoV Nsp13.	48
409667	cd21690	GH2_like	GIPC homology 2 (GH2) domain-like family. The GIPC (GAIP C-terminus-interacting protein) family of proteins mediate endocytosis by tethering cargo proteins to the motor myosin VI. This model represents the C-terminal GIPC homology 2 or GH2 domain (plus the linker to the PDZ domain located N-terminally of GH2), which mediates the interaction with myosin VI and is involved in homodimerization in the autoinhibited state. The family also includes DEAH box protein 8 (DHX8) and similar proteins. DHX8 (a human homolog of yeast Prp22), also called RNA helicase HRH1, is an ATP-dependent RNA helicase involved in pre-mRNA splicing as a component of the spliceosome. It facilitates nuclear export of spliced mRNA by releasing the RNA from the spliceosome. DHX8 contains a GH2-like domain at the N-terminus, which shows high sequence similarity with the GH2 domain found in GIPC proteins.	62
409668	cd21691	GH2-like_DHX8	GIPC-homology 2 (GH2)-like domain found in DEAH box protein 8 (DHX8) and similar proteins. DHX8 (a human homolog of yeast Prp22), also called RNA helicase HRH1, is an ATP-dependent RNA helicase involved in pre-mRNA splicing as a component of the spliceosome. It facilitates nuclear export of spliced mRNA by releasing the RNA from the spliceosome. This model corresponds to the GH2-like domain that shows high sequence similarity with the GH2 domain found in GIPC (GAIP C-terminus-interacting protein) family of proteins, which mediate endocytosis by tethering cargo proteins to the motor myosin VI.	68
412028	cd21692	GINS_B_Sld5	beta-strand (B) domain of GINS complex protein Sld5. Sld5 is a component of the GINS (named from the Japanese go-ichi-ni-san, meaning 5-1-2-3 for the Sld5, Psf1, Psf2, and Psf3 subunits) tetrameric protein complex, within which Sld5 interacts with Psf1 via its N-terminal A-domain, and with Psf2 through a combination of the A and B domains. In Drosophila, Sld5 is required for normal cell cycle progression and the maintenance of genomic integrity. GINS is a complex of four subunits (Sld5, Psf1, Psf2 and Psf3) that is involved in both initiation and elongation stages of eukaryotic chromosome replication. Besides being essential for the maintenance of genomic integrity, GINS plays a central role in coordinating DNA replication with cell cycle checkpoints and is involved in cell growth. The eukaryotic GINS subunits Sld5, Psf1, Psf2 and Psf3 are homologous, and homologs are also found in archaea; the complex is not found in bacteria. Each subunit of the complex consists of two domains called the alpha-helical (A) and beta-strand (B) domains. The A and B domains of Sld5/Psf1 are permuted with respect to Psf2/Psf3. This model represents the B-domain of GINS subunit Sld5.	55
412029	cd21693	GINS_B_Psf3	beta-strand (B) domain of GINS complex protein Psf3. Psf3 (partner of Sld5 3) is one of the proteins known to comprise the GINS (named from the Japanese go-ichi-ni-san, meaning 5-1-2-3 for the Sld5, Psf1, Psf2, and Psf3 subunits) complex, which is a macromolecular protein complex associated with DNA replication. Psf3 is dysregulated in cancer cells, and its overexpression may be related to tumor progression in some cancers including colon, breast, and lung cancers; its expression can be used as a prognostic indicator in some cancers. GINS is a complex of four subunits (Sld5, Psf1, Psf2 and Psf3) that is involved in both the initiation and elongation stages of eukaryotic chromosome replication. Besides being essential for the maintenance of genomic integrity, GINS plays a central role in coordinating DNA replication with cell cycle checkpoints and is involved in cell growth. The eukaryotic GINS subunits Sld5, Psf1, Psf2, and Psf3 are homologous, and homologs are also found in archaea; the complex is not found in bacteria. The four subunits of the complex consist of two domains each, called the alpha-helical (A) and beta-strand (B) domains. The A and B domains of Sld5/Psf1 are permuted with respect to Psf2/Psf3. This model represents the B-domain of GINS subunit Psf3.	64
412030	cd21694	GINS_B_Psf2	beta-strand (B) domain of GINS complex protein Psf2. Psf2 (partner of Sld5 2) is a component of GINS (named from the Japanese go-ichi-ni-san, meaning 5-1-2-3 for the Sld5, Psf1, Psf2, and Psf3 subunits) tetrameric protein complex and has been found to play important roles in normal eye development in Xenopus laevis and in ICL (interstrand crosslinks) repair. ICLs are toxic lesions that covalently attach opposite strands of DNA. GINS is a complex of four subunits (Sld5, Psf1, Psf2 and Psf3) and is involved in both the initiation and elongation stages of eukaryotic chromosome replication. Besides being essential for the maintenance of genomic integrity, GINS plays a central role in coordinating DNA replication with cell cycle checkpoints and is involved in cell growth. The eukaryotic GINS subunits Sld5, Psf1, Psf2, and Psf3 are homologous, and homologs are also found in archaea; the complex is not found in bacteria. The four subunits of the complex consist of two domains each, called the alpha-helical (A) and beta-strand (B) domains. The A and B domains of Sld5/Psf1 are permuted with respect to Psf2/Psf3. This model represents the B-domain of GINS subunit Psf2.	62
412031	cd21695	GINS_B_archaea_Gins51	beta-strand (B) domain of archaeal GINS complex protein Gins51. The GINS (named from the Japanese go-ichi-ni-san, meaning 5-1-2-3 for the Sld5, Psf1, Psf2, and Psf3 subunits) complex is involved in both initiation and elongation stages of eukaryotic chromosome replication, with GINS being the component that most likely serves as the replicative helicase that unwinds duplex DNA ahead of the moving replication fork. In archaeal DNA replication initiation, homo-hexameric MCM (mini-chromosome maintenance) unwinds the template double-stranded DNA to form the replication fork. MCM is activated by two proteins GINS and GAN (GINS-associated nuclease), which constitute the 'CMG' unwindosome complex together with the MCM core. While eukaryotic GINS complex is a tetrameric arrangement of four subunits Sld5, Psf1, Psf2 and Psf3, the archaeal complex consists of two different proteins, namely Gins51 and Gins23, and forms either an alpha2beta2-type heterotetramer composed of Gins51 and Gins23, or a Gins51-only alpha4-type homotetramer. The archaeal Gins51, as well as eukaryotic Sld5 and Psf1, have the alpha-helical (A) domain at the N-terminus and the beta-strand domain (B) at the C-terminus; this arrangement is called AB-type. Archaeal GINS contacts GAN by using the Gins51 B-domain as a hook, for the formation of the CMG helicase. The locations and contributions of the archaeal Gins subunit B domain to the tetramer formation imply the possibility that the archaeal and eukaryotic GINS complexes contribute to DNA unwinding reactions by significantly different mechanisms in terms of the atomic details. This model represents the B-domain of Gins51.	52
412032	cd21696	GINS_B_Psf1	beta-strand (B) domain of GINS complex protein Psf1. Psf1 (partner of Sld5 1) is a component of the GINS (named from the Japanese go-ichi-ni-san, meaning 5-1-2-3 for the Sld5, Psf1, Psf2, and Psf3 subunits) tetrameric protein complex, and is mainly expressed in highly proliferative tissues, such as blastocysts, adult bone marrow, and testis, in which the stem cell system is active. Psf1 has been reported to be a prognostic biomarker in breast cancer, prostate cancer, hepatocellular carcinoma, and non-small cell lung cancer (NSCLC) patients treated with surgery following preoperative chemotherapy or chemoradiotherapy. Loss of Psf1 causes embryonic lethality. GINS is a complex of four subunits (Sld5, Psf1, Psf2 and Psf3) and is involved in both the initiation and elongation stages of eukaryotic chromosome replication. Besides being essential for the maintenance of genomic integrity, GINS plays a central role in coordinating DNA replication with cell cycle checkpoints and is involved in cell growth. The eukaryotic GINS subunits Sld5, Psf1, Psf2 and Psf3 are homologous, and homologs are also found in archaea; the complex is not found in bacteria. The four subunits of the complex consist of two domains each, called the alpha-helical (A) and beta-strand (B) domains. The A and B domains of Sld5/Psf1 are permuted with respect to Psf2/Psf3. This model represents the B-domain of GINS subunit Psf1.	49
412033	cd21697	GINS_B_archaea_Gins23	beta-strand (B) domain of archaeal GINS complex protein Gins23. The GINS (named from the Japanese go-ichi-ni-san, meaning 5-1-2-3 for the Sld5, Psf1, Psf2, and Psf3 subunits) complex is involved in both initiation and elongation stages of eukaryotic chromosome replication, with GINS being the component that most likely serves as the replicative helicase that unwinds duplex DNA ahead of the moving replication fork. In archaeal DNA replication initiation, homo-hexameric MCM (mini-chromosome maintenance) unwinds the template double-stranded DNA to form the replication fork. MCM is activated by two proteins GINS and GAN (GINS-associated nuclease), which constitute the 'CMG' unwindosome complex together with the MCM core. While eukaryotic GINS complex is a tetrameric arrangement of four subunits Sld5, Psf1, Psf2 and Psf3, the archaeal complex consists of two different proteins, namely Gins51 and Gins23, and forms either an alpha2beta2-type heterotetramer composed of Gins51 and Gins23, or a Gins51-only alpha4-type homotetramer. The archaeal Gins23, as well as eukaryotic Psf2 and Psf3, have the alpha-helical (A) domain at the C-terminus and the beta-strand domain (B) at the N-terminus; this arrangement is called BA-type. The locations and contributions of the archaeal Gins subunit B domain to the tetramer formation imply the possibility that the archaeal and eukaryotic GINS complexes contribute to DNA unwinding reactions by significantly different mechanisms in terms of the atomic details. This model represents the B-domain of archaeal Gins23.	42
411955	cd21698	CoV_Spike_S1-S2_S2	S1/S2 cleavage region and the S2 fusion subunit of coronavirus spike (S) proteins. This model represents the S1/S2 cleavage region and the S2 subunit of the spike (S) glycoprotein from coronaviruses (CoVs), including three highly pathogenic human CoVs, Middle East respiratory syndrome coronavirus (MERS-CoV), Severe acute respiratory syndrome (SARS) coronavirus (SARS-CoV), and SARS coronavirus 2 (SARS-CoV-2), also known as a 2019 novel coronavirus (2019-nCoV). The CoV S protein is an envelope glycoprotein that plays a very important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediate attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains the coronavirus fusion machinery and is primarily alpha-helical. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-terminal domain (C-domain). S1 C-domain also contains two subdomains (SD-1 and SD-2), which connect S1 and S2. Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs, including SARS-CoV-2, SARS-CoV, and MERS-CoV use the C-domain to bind their receptors. The S2 subunit comprises the fusion peptide (FP), a second proteolytic site (S2'), followed by an internal fusion peptide (IFP), and two heptad-repeat domains (HR1 and HR2) preceding the transmembrane domain (TM). After binding of the S1 subunit RBD on the virion to its receptor on the target cell, the HR1 and HR2 domains interact with each other to form a six-helix bundle (6-HB) fusion core, bringing viral and cellular membranes into close proximity for fusion and infection. In order to catalyze the membrane fusion reaction, CoV S needs to be primed through cleavage at the S1/S2 and S2' sites. Notably, SARS-CoV-2 has a functional polybasic (furin) cleavage site through the insertion of PRRAR*SV (* indicates the cleavage site) at the S1/S2 interface, which is absent in SARS-CoV and other SARS-related CoVs. The S1/S2 cleavage region and the S2 fusion subunit play an essential role in viral entry by initiating fusion of the viral and cellular membranes.	523
411982	cd21699	JMTM_APP_like	juxtamembrane and transmembrane (JMTM) domain found in the amyloid-beta precursor protein (APP) family. The amyloid-beta precursor protein (APP) family includes amyloid-like proteins APLP-1 and APLP-2. APP (also called ABPP, APPI, Alzheimer disease (AD) amyloid protein, amyloid precursor protein, amyloid-beta A4 protein, cerebral vascular amyloid peptide (CVAP), PreA4, or protease nexin-II (PN-II)) functions as a cell surface receptor and performs physiological functions on the surface of neurons relevant to neurite growth, neuronal adhesion and axonogenesis. Amyloid-beta peptides are lipophilic metal chelators with metal-reducing activity; they bind transition metals such as copper, zinc and iron. APLP-1, also called APLP, may play a role in postsynaptic function. It couples to JIP signal transduction through C-terminal binding. APLP-1 may interact with cellular G-protein signaling pathways. It can regulate neurite outgrowth through binding to components of the extracellular matrix such as heparin and collagen I. APLP-2 (also called amyloid protein homolog (APPH), or CDEI box-binding protein (CDEBP)) may play a role in the regulation of hemostasis. Its soluble form may have inhibitory properties towards coagulation factors. APLP-2 may bind to the DNA 5'-GTCACATG-3'(CDEI box). It inhibits trypsin, chymotrypsin, plasmin, factor XIA, and plasma and glandular kallikrein. This model corresponds to the juxtamembrane and transmembrane (JMTM) domain of APP, which consists of the intact transmembrane (TM) domain with adjacent N-terminal juxtamembrane (JM) region. More than half of all familial APP mutations of Alzheimer's disease are seen in its JMTM domain region.	41
411983	cd21700	JMTM_Notch_APP	juxtamembrane and transmembrane (JMTM) domain found in Notch and APP family proteins. The substrates of gamma-secretase include amyloid precursor protein (APP) and the Notch receptor. APP, also called APPI, or Alzheimer disease amyloid protein (ABPP), or amyloid precursor protein, or amyloid-beta A4 protein, or cerebral vascular amyloid peptide (CVAP), or PreA4, or protease nexin-II (PN-II), functions as a cell surface receptor and performs physiological functions on the surface of neurons relevant to neurite growth, neuronal adhesion and axonogenesis. Notch proteins are a family of type-1 transmembrane proteins that form a core component of the Notch signaling pathway. They operate in a variety of different tissues and play a role in a variety of developmental processes by controlling cell fate decisions. Successive cleavage of the APP carboxyl-terminal fragment generates amyloid-beta (Abeta) peptides of varying lengths. Accumulation of Abeta peptides such as Abeta42 and Abeta43 leads to formation of amyloid plaques in the brain, a hallmark of Alzheimer's disease. Notch cleavage is involved in cell-fate determination during development and neurogenesis. The model corresponds to the juxtamembrane and transmembrane (JMTM) domain found in Notch and APP family proteins. It comprises a transmembrane helix (TM) with adjacent juxtamembrane (JM) regions. The JMTM domain is likely to be recognized by gamma-secretase in a similar fashion to both Notch and APP family proteins.	41
411984	cd21701	JMTM_Notch	juxtamembrane and transmembrane (JMTM) domain found in Notch protein family. Neurogenic locus notch homolog (Notch) proteins are a family of type-1 transmembrane proteins that form a core component of the Notch signaling pathway. They operate in a variety of different tissues and play a role in a variety of developmental processes by controlling cell fate decisions. The model corresponds to the juxtamembrane and transmembrane (JMTM) domain of Notch proteins, which comprises an extended coil, a transmembrane helix (TM), and a beta-strand.	85
411985	cd21702	JMTM_Notch1	juxtamembrane and transmembrane (JMTM) domain found in neurogenic locus notch homolog protein 1 (Notch1) and similar proteins. Neurogenic locus notch homolog protein 1 (Notch1), also called translocation-associated notch protein TAN-1, functions as a receptor for membrane-bound ligands Jagged-1 (JAG1), Jagged-2 (JAG2) and Delta-1 (DLL1) to regulate cell-fate determination. It affects the implementation of differentiation, proliferation and apoptotic programs. It is also involved in angiogenesis, and negatively regulates endothelial cell proliferation and migration and angiogenic sprouting. This model corresponds to the juxtamembrane and transmembrane (JMTM) domain of Notch1, which comprises an extended coil, a transmembrane helix (TM), and a beta-strand.	80
411986	cd21703	JMTM_Notch2	juxtamembrane and transmembrane (JMTM) domain found in neurogenic locus notch homolog protein 2 (Notch2) and similar proteins. Neurogenic locus notch homolog protein 2 (Notch2) functions as a receptor for membrane-bound ligands Jagged-1 (JAG1), Jagged-2 (JAG2) and Delta-1 (DLL1) to regulate cell-fate determination. Upon ligand activation through the released notch intracellular domain (NICD) it forms a transcriptional activator complex with RBPJ/RBPSUH and activates genes of the enhancer of split locus. Notch2 is involved in bone remodeling and homeostasis. In collaboration with RELA/p65, it enhances NFATc1 promoter activity and positively regulates RANKL-induced osteoclast differentiation. Notch2 positively regulates self-renewal of liver cancer cells. This model corresponds to the juxtamembrane and transmembrane (JMTM) domain of Notch2, which comprises an extended coil, a transmembrane helix (TM), and a beta-strand.	82
411987	cd21704	JMTM_Notch3	juxtamembrane and transmembrane (JMTM) domain found in neurogenic locus notch homolog protein 3 (Notch3) and similar proteins. Neurogenic locus notch homolog protein 3 (Notch3) functions as a receptor for membrane-bound ligands Jagged1, Jagged2 and Delta1 to regulate cell-fate determination. Upon ligand activation through the released notch intracellular domain (NICD) it forms a transcriptional activator complex with RBPJ/RBPSUH and activates genes of the enhancer of split locus. The model corresponds to the juxtamembrane and transmembrane (JMTM) domain of Notch3, which comprises an extended coil, a transmembrane helix (TM), and a beta-strand.	90
411988	cd21705	JMTM_Notch4	juxtamembrane and transmembrane (JMTM) domain found in neurogenic locus notch homolog protein 4 (Notch4) and similar proteins. Neurogenic locus notch homolog protein 4 (Notch4) functions as a receptor for membrane-bound ligands Jagged1, Jagged2 and Delta1 to regulate cell-fate determination. Upon ligand activation through the released notch intracellular domain (NICD) it forms a transcriptional activator complex with RBPJ/RBPSUH and activates genes of the enhancer of split locus. It affects the implementation of differentiation, proliferation and apoptotic programs. This model corresponds to the juxtamembrane and transmembrane (JMTM) domain of Notch4, which comprises an extended coil, a transmembrane helix (TM), and a beta-strand.	92
411989	cd21706	JMTM_dNotch	juxtamembrane and transmembrane (JMTM) domain found in Drosophila melanogaster neurogenic locus Notch protein (dNotch) and similar proteins. Drosophila melanogaster neurogenic locus Notch protein (dNotch) is an essential signaling protein which has a major role in many developmental processes. It functions as a receptor for membrane-bound ligands Delta and Serrate to regulate cell-fate determination. It regulates oogenesis, the differentiation of the ectoderm and the development of the central and peripheral nervous system, eye, wing disk, muscles and segmental appendages such as antennae and legs, through lateral inhibition or induction. It also regulates neuroblast self-renewal, identity and proliferation through the regulation of bHLH-O proteins; in larval brains, it is involved in the maintenance of type II neuroblast self-renewal and identity by suppressing erm expression together with pnt. It might also regulate dpn expression through the activation of the transcriptional regulator Su(H). This model corresponds to the juxtamembrane and transmembrane (JMTM) domain of dNotch, which comprises an extended coil, a transmembrane helix (TM), and a beta-strand.	90
411990	cd21707	JMTM_APP	juxtamembrane and transmembrane (JMTM) domain found in amyloid-beta precursor protein (APP) and similar proteins. Amyloid-beta precursor protein (APP), also called APPI, ABPP, Alzheimer disease amyloid protein, amyloid precursor protein, amyloid-beta A4 protein, cerebral vascular amyloid peptide (CVAP), PreA4, or protease nexin-II (PN-II), functions as a cell surface receptor and performs physiological functions on the surface of neurons relevant to neurite growth, neuronal adhesion and axonogenesis. Amyloid-beta peptides are lipophilic metal chelators with metal-reducing activity; they bind transient metals such as copper, zinc and iron. This model corresponds to juxtamembrane and transmembrane (JMTM) domain of APP, which consists of the intact transmembrane (TM) domain with adjacent N-terminal juxtamembrane (JM) region. More than half of all familial APP mutations of Alzheimer's disease are seen in its JMTM domain region.	40
411991	cd21708	JMTM_APLP1	juxtamembrane and transmembrane (JMTM) domain found in amyloid-like protein 1 (APLP-1) and similar proteins. Amyloid-like protein 1 (APLP-1), also called APLP, may play a role in postsynaptic function. It couples to JIP signal transduction through C-terminal binding. APLP-1 may interact with cellular G-protein signaling pathways. It can regulate neurite outgrowth through binding to components of the extracellular matrix such as heparin and collagen I. This model corresponds to the juxtamembrane and transmembrane (JMTM) domain of APLP-1, which consists of the intact transmembrane (TM) domain with adjacent N-terminal juxtamembrane (JM) region.	85
411992	cd21709	JMTM_APLP2	juxtamembrane and transmembrane (JMTM) domain found in amyloid-like protein 2 (APLP-2) and similar proteins. Amyloid-like protein 2 (APLP-2), also called amyloid protein homolog (APPH), or CDEI box-binding protein (CDEBP), may play a role in the regulation of hemostasis. Its soluble form may have inhibitory properties towards coagulation factors. APLP-2 may bind to the DNA 5'-GTCACATG-3'(CDEI box). It inhibits trypsin, chymotrypsin, plasmin, factor XIA, and plasma and glandular kallikrein. This model corresponds to juxtamembrane and transmembrane (JMTM) domain of APLP-2, which consists of the intact transmembrane (TM) domain with adjacent N-terminal juxtamembrane (JM) region.	81
409658	cd21710	TM_Y_gammaCoV_Nsp3_C	C-terminus of gammacoronavirus non-structural protein 3, including transmembrane and Y domains. This model represents the C-terminus of non-structural protein 3 (Nsp3) from gammacoronavirus, including Infectious bronchitis virus. This conserved C-terminus includes two transmembrane (TM) regions TM1 and TM2, an ectodomain (3Ecto) between the TM1 and TM2 that is glycosylated and located on the lumenal side of the ER, an amphipathic region (AH1) that is not membrane-spanning, and a large Y domain of approximately 370 residues. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. In the related betacoronaviruses, Severe acute respiratory syndrome-related coronavirus (SARS-CoV) and murine hepatitis virus (MHV), the TM1, 3Ecto and TM2 domains are important for the papain-like protease (PL2pro) domain to process Nsp3-Nsp4 cleavage. It has also been shown that the interaction of 3Ecto with the lumenal loop of Nsp4 is essential for ER rearrangements in cells infected with SARS-CoV or MHV. The Y domain, located at the cytosolic side of the ER, consists of the Y1 and CoV-Y subdomains, which are conserved in nidovirus and coronavirus, respectively. Functional information about the Y domain is limited; it has been shown that Nsp3 binding to Nsp4 is less efficient without the Y domain.	525
409659	cd21711	TM_Y_deltaCoV_Nsp3_C	C-terminus of deltacoronavirus non-structural protein 3, including transmembrane and Y domains. This model represents the C-terminus of non-structural protein 3 (Nsp3) from deltacoronavirus, including Magpie-robin coronavirus HKU18 and Bulbul coronavirus HKU11, among others. This conserved C-terminus includes two transmembrane (TM) regions TM1 and TM2, an ectodomain (3Ecto) between the TM1 and TM2 that is glycosylated and located on the lumenal side of the ER, an amphipathic region (AH1) that is not membrane-spanning, and a large Y domain of approximately 370 residues. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. In the related betacoronaviruses, Severe acute respiratory syndrome-related coronavirus (SARS-CoV) and murine hepatitis virus (MHV), the TM1, 3Ecto and TM2 domains are important for the papain-like protease (PL2pro) domain to process Nsp3-Nsp4 cleavage. It has also been shown that the interaction of 3Ecto with the lumenal loop of Nsp4 is essential for ER rearrangements in cells infected with SARS-CoV or MHV. The Y domain, located at the cytosolic side of the ER, consists of the Y1 and CoV-Y subdomains, which are conserved in nidovirus and coronavirus, respectively. Functional information about the Y domain is limited; it has been shown that Nsp3 binding to Nsp4 is less efficient without the Y domain.	490
409660	cd21712	TM_Y_alphaCoV_Nsp3_C	C-terminus of alphacoronavirus non-structural protein 3, including transmembrane and Y domains. This model represents the C-terminus of non-structural protein 3 (Nsp3) from alphacoronavirus, including Porcine epidemic diarrhea virus and Human coronavirus 229E, among others. This conserved C-terminus includes two transmembrane (TM) regions TM1 and TM2, an ectodomain (3Ecto) between the TM1 and TM2 that is glycosylated and located on the lumenal side of the ER, an amphipathic region (AH1) that is not membrane-spanning, and a large Y domain of approximately 370 residues. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. In the related betacoronaviruses, Severe acute respiratory syndrome-related coronavirus (SARS-CoV) and murine hepatitis virus (MHV), the TM1, 3Ecto and TM2 domains are important for the papain-like protease (PL2pro) domain to process Nsp3-Nsp4 cleavage. It has also been shown that the interaction of 3Ecto with the lumenal loop of Nsp4 is essential for ER rearrangements in cells infected with SARS-CoV or MHV. The Y domain, located at the cytosolic side of the ER, consists of the Y1 and CoV-Y subdomains, which are conserved in nidovirus and coronavirus, respectively. Functional information about the Y domain is limited; it has been shown that Nsp3 binding to Nsp4 is less efficient without the Y domain.	501
409661	cd21713	TM_Y_betaCoV_Nsp3_C	C-terminus of betacoronavirus non-structural protein 3, including transmembrane and Y domains. This model represents the C-terminus of non-structural protein 3 (Nsp3) from betacoronavirus, including highly pathogenic betacoronaviruses such as Severe acute respiratory syndrome-related coronavirus (SARS-CoV), SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV), and Middle East respiratory syndrome-related (MERS) CoV. This conserved C-terminus includes two transmembrane (TM) regions TM1 and TM2, an ectodomain (3Ecto) between the TM1 and TM2 that is glycosylated and located on the lumenal side of the ER, an amphipathic region (AH1) that is not membrane-spanning, and a large Y domain of approximately 370 residues. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. In SARS-CoV and murine hepatitis virus (MHV), the TM1, 3Ecto and TM2 domains are important for the papain-like protease (PL2pro) domain to process Nsp3-Nsp4 cleavage. It has also been shown that the interaction of 3Ecto with the lumenal loop of Nsp4 is essential for ER rearrangements in cells infected with SARS-CoV or MHV. The Y domain, located at the cytosolic side of the ER, consists of the Y1 and CoV-Y subdomains, which are conserved in nidovirus and coronavirus, respectively. Functional information about the Y domain is limited; it has been shown that Nsp3 binding to Nsp4 is less efficient without the Y domain.	545
409662	cd21714	TM_Y_MHV-like_Nsp3_C	C-terminus of non-structural protein 3, including transmembrane and Y domains, from murine hepatitis virus and betacoronavirus in the A lineage. This model represents the C-terminus of non-structural protein 3 (Nsp3) from betacoronavirus in the embecovirus subgenus (A lineage), including murine hepatitis virus (MHV) and Human coronavirus HKU1. This conserved C-terminus includes two transmembrane (TM) regions TM1 and TM2, an ectodomain (3Ecto) between the TM1 and TM2 that is glycosylated and located on the lumenal side of the ER, an amphipathic region (AH1) that is not membrane-spanning, and a large Y domain of approximately 370 residues. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. In MHV and the related Severe acute respiratory syndrome-related coronavirus (SARS-CoV), the TM1, 3Ecto and TM2 domains are important for the papain-like protease (PL2pro) domain to process Nsp3-Nsp4 cleavage. It has also been shown that the interaction of 3Ecto with the lumenal loop of Nsp4 is essential for ER rearrangements in cells infected with SARS-CoV or MHV. The Y domain, located at the cytosolic side of the ER, consists of the Y1 and CoV-Y subdomains, which are conserved in nidovirus and coronavirus, respectively. Functional information about the Y domain is limited; it has been shown that Nsp3 binding to Nsp4 is less efficient without the Y domain.	555
409663	cd21715	TM_Y_HKU9-like_Nsp3_C	C-terminus of non-structural protein 3, including transmembrane and Y domains, from Rousettus bat coronavirus HKU9 and betacoronavirus in the D lineage. This model represents the C-terminus of non-structural protein 3 (Nsp3) from betacoronavirus in the nobecovirus subgenus (D lineage), including Rousettus bat coronavirus HKU9. This conserved C-terminus includes two transmembrane (TM) regions TM1 and TM2, an ectodomain (3Ecto) between the TM1 and TM2 that is glycosylated and located on the lumenal side of the ER, an amphipathic region (AH1) that is not membrane-spanning, and a large Y domain of approximately 370 residues. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. In the related betacoronaviruses, Severe acute respiratory syndrome-related coronavirus (SARS-CoV) and murine hepatitis virus (MHV), the TM1, 3Ecto and TM2 domains are important for the papain-like protease (PL2pro) domain to process Nsp3-Nsp4 cleavage. It has also been shown that the interaction of 3Ecto with the lumenal loop of Nsp4 is essential for ER rearrangements in cells infected with SARS-CoV or MHV. The Y domain, located at the cytosolic side of the ER, consists of the Y1 and CoV-Y subdomains, which are conserved in nidovirus and coronavirus, respectively. Functional information about the Y domain is limited; it has been shown that Nsp3 binding to Nsp4 is less efficient without the Y domain.	526
409664	cd21716	TM_Y_MERS-CoV-like_Nsp3_C	C-terminus of non-structural protein 3, including transmembrane and Y domains, from Middle East respiratory syndrome-related coronavirus and betacoronavirus in the C lineage. This model represents the C-terminus of non-structural protein 3 (Nsp3) from betacoronavirus in the merbecovirus subgenus (C lineage), including Middle East respiratory syndrome-related coronavirus (MERS-CoV) and Tylonycteris bat coronavirus HKU4. This conserved C-terminus includes two transmembrane (TM) regions TM1 and TM2, an ectodomain (3Ecto) between the TM1 and TM2 that is glycosylated and located on the lumenal side of the ER, an amphipathic region (AH1) that is not membrane-spanning, and a large Y domain of approximately 370 residues. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. In the related betacoronaviruses, Severe acute respiratory syndrome-related coronavirus (SARS-CoV) and murine hepatitis virus (MHV), the TM1, 3Ecto and TM2 domains are important for the papain-like protease (PL2pro) domain to process Nsp3-Nsp4 cleavage. It has also been shown that the interaction of 3Ecto with the lumenal loop of Nsp4 is essential for ER rearrangements in cells infected with SARS-CoV or MHV. The Y domain, located at the cytosolic side of the ER, consists of the Y1 and CoV-Y subdomains, which are conserved in nidovirus and coronavirus, respectively. Functional information about the Y domain is limited; it has been shown that Nsp3 binding to Nsp4 is less efficient without the Y domain.	566
409665	cd21717	TM_Y_SARS-CoV-like_Nsp3_C	C-terminus of non-structural protein 3, including transmembrane and Y domains, from Severe acute respiratory syndrome-related coronavirus and betacoronavirus in the B lineage. This model represents the C-terminus of non-structural protein 3 (Nsp3) from betacoronavirus in the sarbecovirus subgenus (B lineage), including highly pathogenic human coronaviruses such as Severe acute respiratory syndrome-related coronavirus (SARS-CoV) and SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV). This conserved C-terminus includes two transmembrane (TM) regions TM1 and TM2, an ectodomain (3Ecto) between the TM1 and TM2 that is glycosylated and located on the lumenal side of the ER, an amphipathic region (AH1) that is not membrane-spanning, and a large Y domain of approximately 370 residues. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. In SARS-CoV and the related murine hepatitis virus (MHV), the TM1, 3Ecto and TM2 domains are important for the papain-like protease (PL2pro) domain to process Nsp3-Nsp4 cleavage. It has also been shown that the interaction of 3Ecto with the lumenal loop of Nsp4 is essential for ER rearrangements in cells infected with SARS-CoV or MHV. The Y domain, located at the cytosolic side of the ER, consists of the Y1 and CoV-Y subdomains, which are conserved in nidovirus and coronavirus, respectively. Functional information about the Y domain is limited; it has been shown that Nsp3 binding to Nsp4 is less efficient without the Y domain.	531
409652	cd21718	CoV_Nsp13-helicase	helicase domain of coronavirus non-structural protein 13. This model represents the helicase domain of non-structural protein 13 (Nsp13) from alpha-, beta-, gamma-, and deltacoronavirus, including pathogenic human viruses such as Severe acute respiratory syndrome coronavirus (SARS-CoV), SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV), and Middle East respiratory syndrome-related (MERS) CoV. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. CoV Nsp13 is a member of the helicase superfamily 1 (SF1); SF1 and SF2 helicases do not form toroidal structures, while SF3-6 helicases do. Nsp13 is a component of the viral RNA synthesis replication and transcription complex (RTC). It is a multidomain protein containing a Cys/His rich zinc-binding domain (CH/ZBD), a stalk domain, a 1B domain involved in nucleic acid substrate binding, and a SF1 helicase core.	341
409653	cd21720	gammaCoV_Nsp13-helicase	helicase domain of gammacoronavirus non-structural protein 13. This model represents the helicase domain of non-structural protein 13 (Nsp13) from gammacoronavirus, including Avian infectious bronchitis virus. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. Coronavirus (CoV) Nsp13 is a member of the helicase superfamily 1 (SF1); SF1 and SF2 helicases do not form toroidal structures, while SF3-6 helicases do. Nsp13 is a component of the viral RNA synthesis replication and transcription complex (RTC). It is a multidomain protein containing a Cys/His rich zinc-binding domain (CH/ZBD), a stalk domain, a 1B domain involved in nucleic acid substrate binding, and a SF1 helicase core.	343
409654	cd21721	deltaCoV_Nsp13-helicase	helicase domain of deltacoronavirus non-structural protein 13. This model represents the helicase domain of non-structural protein 13 (Nsp13) from deltacoronavirus, including Bulbul coronavirus (CoV) HKU11 and Common moorhen CoV HKU21. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. CoV Nsp13 is a member of the helicase superfamily 1 (SF1); SF1 and SF2 helicases do not form toroidal structures, while SF3-6 helicases do. Nsp13 is a component of the viral RNA synthesis replication and transcription complex (RTC). It is a multidomain protein containing a Cys/His rich zinc-binding domain (CH/ZBD), a stalk domain, a 1B domain involved in nucleic acid substrate binding, and a SF1 helicase core.	342
409655	cd21722	betaCoV_Nsp13-helicase	helicase domain of betacoronavirus non-structural protein 13. This model represents the helicase domain of non-structural protein 13 (Nsp13) from betacoronavirus, including pathogenic human viruses such as Severe acute respiratory syndrome coronavirus (SARS-CoV), SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV), and Middle East respiratory syndrome-related (MERS) CoV. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. CoV Nsp13 is a member of the helicase superfamily 1 (SF1); SF1 and SF2 helicases do not form toroidal structures, while SF3-6 helicases do. Nsp13 is a component of the viral RNA synthesis replication and transcription complex (RTC). It is a multidomain protein containing a Cys/His rich zinc-binding domain (CH/ZBD), a stalk domain, a 1B domain involved in nucleic acid substrate binding, and a SF1 helicase core.	340
409656	cd21723	alphaCoV_Nsp13-helicase	helicase domain of alphacoronavirus non-structural protein 13. This model represents the helicase domain of non-structural protein 13 (Nsp13) from alphacoronavirus, including Porcine epidemic diarrhea virus and Human coronavirus (CoV) NL63. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. CoV Nsp13 is a member of the helicase superfamily 1 (SF1); SF1 and SF2 helicases do not form toroidal structures, while SF3-6 helicases do. Nsp13 is a component of the viral RNA synthesis replication and transcription complex (RTC). It is a multidomain protein containing a Cys/His rich zinc-binding domain (CH/ZBD), a stalk domain, a 1B domain involved in nucleic acid substrate binding, and a SF1 helicase core.	340
409626	cd21727	betaCoV_Nsp3_betaSM	betacoronavirus-specific marker of betacoronavirus non-structural protein 3. This model represents the betacoronavirus-specific marker (betaSM), also called group 2-specific marker (G2M), of non-structural protein 3 (Nsp3) from betacoronavirus, including highly pathogenic human coronaviruses such as Severe acute respiratory syndrome-related coronavirus (SARS-CoV) and SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV). The betaSM/G2M is located C-terminal to the nucleic acid-binding (NAB) domain. This region is absent in alpha- and deltacoronavirus Nsp3; there is a gammacoronavirus-specific marker (gammaSM) at this position in gammacoronavirus Nsp3. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. Little is known about the betaSM/G2M domain; it is predicted to be non-enzymatic and may be an intrinsically disordered region. The betaSM/G2M domain is part of the predicted PLnc domain (made up of 385 amino acids) of SARS-CoV Nsp3 that may function as a replication/transcription scaffold, with interactions to Nsp5, Nsp12, Nsp13, Nsp14, and Nsp16.	125
409648	cd21731	alphaCoV_PLPro	alphacoronavirus papain-like protease. This model represents the papain-like protease (PLPro) found in non-structural protein 3 (Nsp3) of alphacoronavirus, including Swine acute diarrhea syndrome coronavirus (SADS-CoV) which causes severe diarrhea in piglets, and Human coronavirus 229E which infects humans and bats and causes the common cold. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. PLPro is a key enzyme in this process, making it a high value target for the development of anti-coronavirus therapeutics. PLPro, which belongs to the MEROPS peptidase C16 family, participates in the proteolytic processing of the N-terminal region of the replicase polyprotein; it can cleave Nsp1|Nsp2, Nsp2|Nsp3, and Nsp3|Nsp4 sites and its activity is dependent on zinc. Besides cleaving the polyproteins, PLPro also possesses a related enzymatic activity to promote virus replication: deubiquitinating (DUB) and de-ISGylating activities. Both, ubiquitin (Ub) and Ub-like interferon-stimulated gene product 15 (ISG15), are involved in preventing viral infection; coronaviruses utilize Ubl-conjugating pathways to counter the pro-inflammatory properties of Ubl-conjugated host proteins via the action of PLPro, which processes both 'Lys-48'- and 'Lys-63'-linked polyubiquitin chains from cellular substrates. The Nsp3 PLPro domain in SADS-CoV and many others has also been shown to antagonize host innate immune induction of type I interferon by interacting with IRF3 and blocking its activation.	289
409649	cd21732	betaCoV_PLPro	betacoronavirus papain-like protease. This model represents the papain-like protease (PLPro) found in non-structural protein 3 (Nsp3) of betacoronavirus, including highly pathogenic betacoronaviruses such as Severe acute respiratory syndrome-related coronavirus (SARS-CoV), SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV), and Middle East respiratory syndrome-related (MERS) CoV. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. PLPro is a key enzyme in this process, making it a high value target for the development of anti-coronavirus therapeutics. PLPro, which belongs to the MEROPS peptidase C16 family, participates in the proteolytic processing of the N-terminal region of the replicase polyprotein; it can cleave Nsp1|Nsp2, Nsp2|Nsp3, and Nsp3|Nsp4 sites and its activity is dependent on zinc. In SARS-CoV and murine hepatitis virus (MHV), the C-terminal non-structural protein 3 region spanning transmembrane regions TM1 and TM2 with 3Ecto domain in between, is important for the PL2pro domain to process Nsp3-Nsp4 cleavage. Besides cleaving the polyproteins, PLPro also possesses a related enzymatic activity to promote virus replication: deubiquitinating (DUB) and de-ISGylating activities. Both, ubiquitin (Ub) and Ub-like interferon-stimulated gene product 15 (ISG15), are involved in preventing viral infection; coronaviruses utilize Ubl-conjugating pathways to counter the pro-inflammatory properties of Ubl-conjugated host proteins via the action of PLPro, which processes both 'Lys-48'- and 'Lys-63'-linked polyubiquitin chains from cellular substrates. The Nsp3 PLPro domain of many of these CoVs has also been shown to antagonize host innate immune induction of type I interferon by interacting with IRF3 and blocking its activation. Interactions of SARS-CoV and MERS-CoV with antiviral interferon (IFN) responses of human cells are remarkably different; high-dose IFN treatment (type I and type III) shows MERS-CoV was substantially more IFN sensitive than SARS-CoV. This may be due to differences in the architecture of the oxyanion hole and of the S3 as well as the S5 specificity sites, despite the overall structures of SARS-CoV and MERS-CoV PLPro being similar.	304
409650	cd21733	gammaCoV_PLPro	gammacoronavirus papain-like protease. This model represents the papain-like protease (PLPro) found in non-structural protein 3 (Nsp3) of gammacoronavirus, including Avian coronavirus, Canada goose coronavirus, and Beluga whale coronavirus SW1. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. PLPro is a key enzyme in this process, making it a high value target for the development of anti-coronavirus therapeutics. PLPro, which belongs to the MEROPS peptidase C16 family, participates in the proteolytic processing of the N-terminal region of the replicase polyprotein; it can cleave Nsp1|Nsp2, Nsp2|Nsp3, and Nsp3|Nsp4 sites and its activity is dependent on zinc. Besides cleaving the polyproteins, PLPro also possesses a related enzymatic activity to promote virus replication: deubiquitinating (DUB) and de-ISGylating activities. Both, ubiquitin (Ub) and Ub-like interferon-stimulated gene product 15 (ISG15), are involved in preventing viral infection; coronaviruses utilize Ubl-conjugating pathways to counter the pro-inflammatory properties of Ubl-conjugated host proteins via the action of PLPro, which processes both 'Lys-48'- and 'Lys-63'-linked polyubiquitin chains from cellular substrates. The Nsp3 PLPro domain in several CoVs has also been shown to antagonize host innate immune induction of type I interferon by interacting with IRF3 and blocking its activation.	304
409651	cd21734	deltaCoV_PLPro	deltacoronavirus papain-like protease. This model represents the papain-like protease (PLPro) found in the non-structural protein 3 (Nsp3) region of deltacoronavirus, including Porcine deltacoronavirus, Bulbul coronavirus HKU11, and Common moorhen coronavirus HKU21. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. PLPro is a key enzyme in this process, making it a high value target for the development of anti-coronavirus therapeutics. PLPro, which belongs to the MEROPS peptidase C16 family, participates in the proteolytic processing of the N-terminal region of the replicase polyprotein; it can cleave Nsp1|Nsp2, Nsp2|Nsp3, and Nsp3|Nsp4 sites and its activity is dependent on zinc. Besides cleaving the polyproteins, PLPro also possesses a related enzymatic activity to promote virus replication: deubiquitinating (DUB) and de-ISGylating activities. Both, ubiquitin (Ub) and Ub-like interferon-stimulated gene product 15 (ISG15), are involved in preventing viral infection; coronaviruses utilize Ubl-conjugating pathways to counter the pro-inflammatory properties of Ubl-conjugated host proteins via the action of PLPro, which processes both 'Lys-48'- and 'Lys-63'-linked polyubiquitin chains from cellular substrates. The Nsp3 PLPro domain in many of these CoVs has also been shown to antagonize host innate immune induction of type I interferon by interacting with IRF3 and blocking its activation.	313
412023	cd21743	CTD_KDM2A_2B-like	C-terminal domain found in lysine-specific demethylase KDM2A, KDM2B, and similar proteins. This family includes lysine-specific demethylases KDM2A and KDM2B, as well as Drosophila melanogaster JmjC domain-containing histone demethylation protein 1 (Jhd1). KDM2A is a ubiquitously expressed histone H3 lysine 36 (H3K36) demethylase that has been implicated in gene silencing, cell cycle, cell growth, and cancer development. KDM2B is a ubiquitously expressed histone H3 lysine 4 (H3K4me2) or histone H3 lysine 36 (H3K36me2) demethylase that functions as a regulator of chemokine expression, cellular morphology, and the metabolome of fibroblasts. Jhd1, also called lysine (K)-specific demethylase 2 (KDM2), or [Histone-H3]-lysine-36 demethylase 1, is a histone demethylase (EC 1.14.11.27) that specifically demethylates 'Lys-36' of histone H3, thereby playing a central role in the histone code. Members in this family belong to the JmjC domain-containing histone demethylase family. They consist of two Jumonji domains (JmjN and JmjC), a CXXC zinc-finger domain, a plant homeodomain (PHD) finger, an F-box domain, followed by an antagonist of mitotic exit network protein 1 (AMN1) domain. This model corresponds to a small conserved region between the JmjC domain and the CXXC zinc-finger domain, which has been called the C-terminal domain by literature.	67
409643	cd21744	RBD_KIF20A-like	RAB6 binding domain (RBD) found in kinesin-like proteins KIF20A, KIF20B, and similar proteins. This family includes kinesin-like proteins KIF20A and KIF20B. KIF20A (also called GG10_2, mitotic kinesin-like protein 2 (MKlp2), Rab6-interacting kinesin-like protein, or rabkinesin-6) is a mitotic kinesin required for chromosome passenger complex (CPC)-mediated cytokinesis. Following phosphorylation by PLK1 (polo-like kinase 1), it is involved in recruitment of PLK1 to the central spindle. KIF20A interacts with guanosine triphosphate (GTP)-bound forms of RAB6A and RAB6B. It may act as a motor required for the retrograde RAB6 regulated transport of Golgi membranes and associated vesicles along microtubules. KIF20A has a microtubule plus-end-directed motility. KIF20B (also called cancer/testis antigen 90 (CT90), kinesin family member 20B, kinesin-related motor interacting with PIN1, or M-phase phosphoprotein 1 (MPP1)) is a plus-end-directed motor enzyme that is required for completion of cytokinesis. It is required for proper midbody organization and abscission in polarized cortical stem cells. KIF20B plays a role in the regulation of neuronal polarization by mediating the transport of specific cargoes. It participates in the mobilization of SHTN1 and in the accumulation of PIP3 in the growth cone of primary hippocampal neurons in a tubulin and actin-dependent manner. In the developing telencephalon, KIF20B cooperates with SHTN1 to promote both the transition from the multipolar to the bipolar stage and the radial migration of cortical neurons from the ventricular zone toward the superficial layer of the neocortex. This model corresponds to a conserved domain in the KIF20A subfamily, that shows RAB6 binding ability and has been called the RAB6 binding domain (RBD). KIF20A-RBD is a dimer composed of two parallel alpha helices that form a right-handed coiled-coil additionally stabilized by an inter-helical cysteine bridge.	56
409646	cd21759	CBD_MYO6-like	calmodulin binding domain found in unconventional myosin-VI and similar proteins. Myosins, which are actin-based motor molecules with ATPase activity, include unconventional myosins that serve in intracellular movements. Myosin-VI, also called unconventional myosin-6 (MYO6), is a reverse-direction motor protein that moves towards the minus-end of actin filaments. It is required for the structural integrity of the Golgi apparatus via the p53-dependent pro-survival pathway. Myosin-VI appears to be involved in a very early step of clathrin-mediated endocytosis in polarized epithelial cells. It modulates RNA polymerase II-dependent transcription. As part of the DISP (DOCK7-Induced Septin disPlacement) complex, Myosin-VI may regulate the association of septins with actin and thereby regulate the actin cytoskeleton. Myosin-VI is encoded by gene MYO6, the human homolog of the gene responsible for deafness in Snell's waltzer mice. It is mutated in autosomal dominant non-syndromic hearing loss. This family also includes Drosophila melanogaster unconventional myosin VI Jaguar (Jar; also called myosin heavy chain 95F (Mhc95F), or 95F MHC), which is a motor protein necessary for the morphogenesis of epithelial tissues during Drosophila development. Jar is required for basal protein targeting and correct spindle orientation in mitotic neuroblasts. It contributes to synaptic transmission and development at the Drosophila neuromuscular junction. Together with CLIP-190 (CAP-Gly domain-containing/cytoplasmic linker protein 190), Jar may coordinate the interaction between the actin and microtubule cytoskeleton. Jar may link endocytic vesicles to microtubules and possibly be involved in transport in the early embryo and in the dynamic process of dorsal closure; its function is believed to change during the life cycle. This model corresponds to the calmodulin (CaM) binding domain (CBD), which consists of three subdomains: a unique insert (Insert 2 or Ins2), an IQ motif, and a proximal tail domain (PTD, also known as lever arm extension or LAE).	149
409196	cd21762	WH2	Wiskott-Aldrich Syndrome Homology (WASP) region 2 (WH2 motif), and similar proteins. This family contains the Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) as well as thymosin-beta (Tbeta; also called beta-thymosin or betaT) domains that are small, widespread intrinsically disordered actin-binding peptides displaying significant sequence variability and different regulations of actin self-assembly in motile and morphogenetic processes. These WH2/betaT peptides are identified by a central consensus actin-binding motif LKKT/V flanked by variable N-terminal and C-terminal extensions; the betaT shares a more extended and conserved C-terminal half than WH2. These single or repeated domains are found in actin-binding proteins (ABPs) such as the hematopoietic-specific protein WASP, its ubiquitously expressed ortholog neural-WASP (N-WASP), WASP-interacting protein (WAS/WASL-interacting protein family members 1 and 2), and WASP-family verprolin homologous protein (WAVE/SCAR) isoforms: WAVE1, WAVE2, and WAVE3. Also included are the WH2 domains found in inverted formin FH2 domain-containing protein (INF2), Cordon bleu (Cobl) protein, vasodilator-stimulated phosphoprotein (VASP) homology protein and actobindin (found in amoebae). These ABPs are commonly multidomain proteins that contain signaling domains and structurally conserved actin-binding motifs, the most important being the WH2 domain motif through which they bind actin in order to direct the location, rate, and timing for actin assembly in the cell into different structures, such as filopodia, lamellipodia, stress fibers, and focal adhesions. The WH2 domain motif is one of the most abundant actin-binding motifs in Wiskott-Aldrich syndrome proteins (WASPs) where they activate Arp2/3-dependent actin nucleation and branching in response to signals mediated by Rho-family GTPases. The thymosin beta (Tbeta) domains in metazoans act in cells as major actin-sequestering peptides; their complex with monomeric ATP-actin (G-ATP-actin) cannot polymerize at either filament (F-actin) end.	22
409640	cd21764	CEN_USH1G_ANKS4B	central domain found in usher syndrome type-1G protein, ankyrin repeat and SAM domain-containing protein 4B, and similar proteins. The family includes usher syndrome type-1G protein (USH1G), ankyrin repeat and SAM domain-containing protein 4B (ANKS4B), and similar proteins. USH1G, also called scaffold protein containing ankyrin repeats and SAM domain (Sans), is an anchoring/scaffolding protein that is a part of the functional network formed by USH1C, USH1G, CDH23 and MYO7A, that mediates mechanotransduction in cochlear hair cells. It is required for normal development and maintenance of cochlear hair cell bundles, as well as for normal hearing. ANKS4B, also called Harmonin-interacting ankyrin repeat-containing protein (Harp), is highly expressed in intestine and is essential for intermicrovillar adhesion. As part of the intermicrovillar adhesion complex (IMAC), ANKS4B plays a role in epithelial brush border differentiation, controlling microvilli organization and length. It may be involved in cellular response to endoplasmic reticulum stress. Both USH1G and ANKS4B contain four N-terminal ANK repeats, a central region, and a sterile alpha motif (SAM) followed by a C-terminal type I PDZ binding motif (PBM). This model corresponds to the central region (CEN), which contains the conserved regions CEN1 and CEN2. CEN is directly responsible for USH1G binding to the MYO7A MyTH4-FERM tandem, as well as for ANKS4B binding to the N-terminal MyTH4-FERM-SH3 supramodule of MYO7B.	41
409636	cd21769	DEFL	defensin-like domain family. This family includes a group of defensin-like proteins, including Arabidopsis thaliana protein LURE 1.2 (AtLURE1.2) and protein LURE 1.6 (AtLURE1.6), Mesobuthus martensii neurotoxin BmBKTx1, Arabidopsis thaliana defensin-like protein 32 (AtDEF32), as well as bactericidal proteins such as defensins, sapecins, tenecins, phormicins, and lucifensins. They are characterized by a defensin-like (DEFL) domain, which adopts a structure characterized by a cysteine-stabilized alpha/beta scaffold. AtLURE1.2 (also called cysteine-rich peptide 810_1.2 or defensin-like protein 213) and AtLURE1.6 (also called cysteine-rich peptide 810_1.6 or defensin-like protein 215) are pollen tube attractants guiding pollen tubes to the ovular micropyle. BmBKTx1, also called potassium channel toxin alpha-KTx 19.1 or BmK37, is a selective inhibitor of high conductance calcium-activated potassium channels KCa1.1/KCNMA1. Bactericidal proteins are host defense peptides produced in response to injury and are mostly active against Gram-positive bacteria.	29
412024	cd21783	CTD_Jhd1-like	C-terminal domain found in Drosophila melanogaster JmjC domain-containing histone demethylation protein 1 and similar proteins. JmjC domain-containing histone demethylation protein 1 (Jhd1), also called lysine (K)-specific demethylase 2 (KDM2), or [Histone-H3]-lysine-36 demethylase 1, is a histone demethylase (EC 1.14.11.27) that specifically demethylates 'Lys-36' of histone H3, thereby playing a central role in the histone code. Jhd1 consists of two Jumonji domains (JmjN and JmjC), a CXXC zinc-finger domain, a plant homeodomain (PHD) finger, an F-box domain, followed by an antagonist of mitotic exit network protein 1 (AMN1) domain. This model corresponds to a small conserved region in Jhd1 between the JmjC domain and the CXXC zinc-finger domain, which has been called the C-terminal domain by literature.	67
412025	cd21784	CTD_KDM2A	C-terminal domain found in Lysine-specific demethylase 2A. Lysine-specific demethylase 2A (KDM2A) is also called CXXC-type zinc finger protein 8, F-box and leucine-rich repeat protein 11 (FBXL11), F-box protein FBL7, F-box protein Lilina, F-box/LRR-repeat protein 11, JmjC domain-containing histone demethylation protein 1A (Jhdm1a), or [Histone-H3]-lysine-36 demethylase 1A. It is a ubiquitously expressed histone H3 lysine 36 (H3K36) demethylase that has been implicated in gene silencing, cell cycle, cell growth, and cancer development. It acts as a key negative regulator of gluconeogenic gene expression and plays a critical role in the invasiveness, proliferation, and anchorage-independent growth of non-small cell lung cancer (NSCLC) cells, as well as in the osteo/dentinogenic differentiation of Mesenchymal stem cells (MSCs). KDM2A regulates rRNA transcription in response to starvation and functions as a negative regulator of NF-kappaB. It is a heterochromatin-associated and HP1-interacting protein that promotes Heterochromatin Protein 1 (HP1) localization to chromatin. It is specifically recruited to CpG islands to define a unique chromatin architecture, which requires direct and specific interaction with linker DNA. It also functions as a H3K4 demethylase that regulates cell proliferation through p15 (INK4B) and p27 (Kip1) in stem cells from apical papilla (SCAPs). KDM2A belongs to the JmjC domain-containing histone demethylase family. KDM2A consists of two Jumonji domains (JmjN and JmjC), a CXXC zinc-finger domain, a plant homeodomain (PHD) finger, an F-box domain, followed by an antagonist of mitotic exit network protein 1 (AMN1) domain. This model corresponds to a small conserved region in KDM2A between the JmjC domain and the CXXC zinc-finger domain, which has been called the C-terminal domain by literature.	68
412026	cd21785	CTD_KDM2B	C-terminal domain found in Lysine-specific demethylase 2B. Lysine-specific demethylase 2B (KDM2B) is also called Ndy1, CXXC-type zinc finger protein 2, F-box and leucine-rich (LRR) repeat protein 10 (FBXL10), F-box protein FBL10, JmjC domain-containing histone demethylation protein 1B (Jhdm1b), Jumonji domain-containing EMSY-interactor methyltransferase motif protein (protein JEMMA), or [Histone-H3]-lysine-36 demethylase 1B. It is a ubiquitously expressed histone H3 lysine 4 (H3K4me2) or histone H3 lysine 36 (H3K36me2) demethylase that functions as a regulator of chemokine expression, cellular morphology, and the metabolome of fibroblasts. It regulates the differentiation of Mesenchymal Stem Cells (MSCs) and has been implicated in cell cycle regulation by de-repressing cyclin-dependent kinase inhibitor 2B (CDKN2B or p15INK4B). It also plays a role in recruiting polycomb repressive complex 1 (PRC1) to CpG islands (CGIs) of developmental genes and regulates lysine 119 monoubiquitylation on H2A (H2AK119ub1) in embryonic stem cells (ESCs). KDM2B also acts as an oncogene that plays a critical role in leukemia development and maintenance. It consists of two Jumonji domains (JmjN and JmjC), a CXXC zinc-finger domain, a plant homeodomain (PHD) finger, an F-box domain, followed by an antagonist of mitotic exit network protein 1 (AMN1) domain. This model corresponds to a small conserved region in KDM2B between the JmjC domain and the CXXC zinc-finger domain, which has been called the C-terminal domain by literature.	67
409644	cd21786	RBD_KIF20B	RAB6 binding domain (RBD) found in kinesin-like protein KIF20B, and similar proteins. KIF20B (also called cancer/testis antigen 90 (CT90), kinesin family member 20B, kinesin-related motor interacting with PIN1, or M-phase phosphoprotein 1 (MPP1)) is a plus-end-directed motor enzyme that is required for completion of cytokinesis. It is required for proper midbody organization and abscission in polarized cortical stem cells. KIF20B plays a role in the regulation of neuronal polarization by mediating the transport of specific cargos. It participates in the mobilization of SHTN1 (shootin 1) and in the accumulation of PIP3 in the growth cone of primary hippocampal neurons in a tubulin and actin-dependent manner. In the developing telencephalon, KIF20B cooperates with SHTN1 to promote both the transition from the multipolar to the bipolar stage and the radial migration of cortical neurons from the ventricular zone toward the superficial layer of the neocortex. KIF20B acts as an oncogene for promoting bladder cancer cell proliferation, apoptosis inhibition, and carcinogenic progression. This model corresponds to a conserved region in KIF20B that shows some sequence similarity to the RAB6 binding domain (RBD) of KIF20A. KIF20A-RBD is a dimer composed of two parallel alpha helices that form a right-handed coiled-coil additionally stabilized by an inter-helical cysteine bridge.	56
409645	cd21787	RBD_KIF20A	RAB6 binding domain (RBD) found in kinesin-like protein KIF20A, and similar proteins. KIF20A, also called GG10_2, or mitotic kinesin-like protein 2 (MKlp2), or Rab6-interacting kinesin-like protein, or rabkinesin-6, is a mitotic kinesin required for chromosome passenger complex (CPC)-mediated cytokinesis. Following phosphorylation by PLK1, it is involved in recruitment of PLK1 (polo-like kinase 1) to the central spindle. KIF20A interacts with guanosine triphosphate (GTP)-bound forms of RAB6A and RAB6B. It may act as a motor required for the retrograde RAB6 regulated transport of Golgi membranes and associated vesicles along microtubules. KIF20A has a microtubule plus end-directed motility. This model corresponds to RAB6 binding domain (RBD) of KIF20A. KIF20A-RBD is a dimer composed of two parallel alpha helices that form a right-handed coiled-coil additionally stabilized by an inter-helical cysteine bridge.	56
409347	cd21795	betaCoV_Nsp3_NAB	nucleic acid binding domain of betacoronavirus non-structural protein 3. This model represents the nucleic acid binding (NAB) domain of non-structural protein 3 (Nsp3) from betacoronavirus including highly pathogenic human coronaviruses (CoVs) such as Severe acute respiratory syndrome-related coronavirus (SARS-CoV) and SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV). The NAB domain represents a new fold, with a parallel four-strand beta-sheet holding two alpha-helices of three and four turns that are oriented antiparallel to the beta-strands. NAB is a cytoplasmic domain located between the papain-like protease (PLPro) and betacoronavirus-specific marker (betaSM) domains of CoV Nsp3. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. The NAB domain both binds ssRNA and unwinds dsDNA. It prefers to bind ssRNA containing repeats of three consecutive guanines. A group of residues that form a positively charged patch on the protein surface of SARS-CoV Nsp3 NAB serves as the binding site of nucleic acids. This site is conserved in the NAB of Nsp3 from betacoronavirus in the sarbecovirus subgenus (B lineage), but may not be conserved in the Nsp3 NAB from betacoronaviruses in other lineages.	110
409335	cd21796	SARS-CoV-like_Nsp1_N	N-terminal domain of non-structural protein 1 from Severe acute respiratory syndrome-related coronavirus and betacoronavirus in the B lineage. This model represents the N-terminal domain of non-structural protein 1 (Nsp1) from betacoronaviruses in the sarbecovirus subgenus (B lineage), including highly pathogenic coronaviruses such as Severe acute respiratory syndrome-related coronavirus (SARS-CoV) and SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV). CoVs utilize a multi-subunit replication/transcription machinery assembled from a set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins. Nsp1 is the N-terminal cleavage product released from the ORF1a polyprotein by the action of papain-like protease (PLpro). Though Nsp1s of alphaCoVs and betaCoVs share structural similarity, they show no significant sequence similarity and may be considered as genus-specific markers. Despite low sequence similarity, the Nsp1s of alphaCoVs and betaCoVs exhibit remarkably similar biological functions, and are involved in the regulation of both host and viral gene expression. CoV Nsp1 induces suppression of host gene expression and interferes with host immune response. It inhibits host gene expression in two ways: by targeting the translation and stability of cellular mRNAs, and by inhibiting mRNA translation and inducing an endonucleolytic RNA cleavage in the 5'-UTR of cellular mRNAs through its tight association with the 40S ribosomal subunit, a key component of the cellular translation machinery. Nsp1 is critical in regulating viral replication and gene expression, as shown by multiple lines of evidence, including: mutations in the Nsp1 coding region of the transmissible gastroenteritis virus (TGEV) and murine hepatitis virus (MHV) genomes cause drastic reduction or elimination of infectious virus; bovine coronavirus (BCoV) Nsp1 is an RNA-binding protein that interacts with cis-acting replication elements in the 5'-UTR of the BCoV genome, implying its potential role in the regulation of viral translation or replication; and SARS-CoV Nsp1 enhances virus replication by binding to a stem-loop structure in the 5'-UTR of its genome.	115
409197	cd21799	WH2_Wa_Cobl	first Wiskott Aldrich syndrome homology region 2 (WH2 motif) repeat (called Wa) found in protein Cordon-Bleu (Cobl) and similar proteins. This family contains the first tandem Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2), called Wa, found in protein Cordon-Bleu (Cobl), a potent actin filament nucleator that plays an important role in the reorganization of the actin cytoskeleton. It regulates neuron morphogenesis and increases branching of axons and dendrites. It also modulates dendrite branching in Purkinje cells. Cobl binds to and sequesters actin monomers (G-actin). Cobl contains three tandem WH2 (or W) domains consisting of an N-terminal alpha helix and a C-terminal LRKV motif. The first two WH2 domains have the highest binding affinity for actin. They are functionally active in actin nucleation and polymerization. The model corresponds to the first WH2 domain.	33
409198	cd21800	WH2_Wb_Cobl	second Wiskott Aldrich syndrome homology region 2 (WH2 motif) repeat (called Wb) found in protein Cordon-Bleu (Cobl) and similar proteins. This family contains the second tandem Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2), called Wb, found in protein Cordon-Bleu (Cobl), a potent actin filament nucleator that plays an important role in the reorganization of the actin cytoskeleton. It regulates neuron morphogenesis and increases branching of axons and dendrites. It also modulates dendrite branching in Purkinje cells. Cobl binds to and sequesters actin monomers (G-actin). Cobl contains three tandem WH2 or W domains consisting of an N-terminal alpha helix and a C-terminal LRKV motif. The first two WH2 domains have the highest binding affinity for actin. They are functionally active in actin nucleation and polymerization. The model corresponds to the second WH2 domain.	44
409199	cd21801	WH2_Wc_Cobl	third Wiskott Aldrich syndrome homology region 2 (WH2 motif) repeat (called Wc) found in protein Cordon-Bleu (Cobl) and similar proteins. This family contains the third tandem Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2), called Wc, found in protein Cordon-Bleu (Cobl), a potent actin filament nucleator that plays an important role in the reorganization of the actin cytoskeleton. It regulates neuron morphogenesis and increases branching of axons and dendrites. It also modulates dendrite branching in Purkinje cells. Cobl binds to and sequesters actin monomers (G-actin). Cobl contains three tandem WH2 (or W) domains consisting of an N-terminal alpha helix and a C-terminal LRKV motif. The first two WH2 domains have the highest binding affinity for actin. They are functionally active in actin nucleation and polymerization. The model corresponds to the third WH2 domain.	26
409641	cd21802	CEN_ANKS4B	central domain found in ankyrin repeat and SAM domain-containing protein 4B. Ankyrin repeat and SAM domain-containing protein 4B (ANKS4B), also called Harmonin-interacting ankyrin repeat-containing protein (Harp), is highly expressed in intestine and is essential for intermicrovillar adhesion. As part of the intermicrovillar adhesion complex (IMAC), ANKS4B plays a role in epithelial brush border differentiation, controlling microvilli organization and length. It may be involved in cellular response to endoplasmic reticulum stress. ANKS4B consists of four N-terminal ANK repeats, a central region, and a sterile alpha motif (SAM) followed by a C-terminal type I PDZ binding motif (PBM). This model corresponds to the central region (CEN) of ANKS4B, which contains the conserved regions CEN1 and CEN2. CEN is directly responsible for binding to the N-terminal MyTH4-FERM-SH3 supramodule of MYO7B, with a mechanism highly analogous to the interaction between USH1G and MYO7A.	46
409642	cd21803	CEN_USH1G	central domain found in usher syndrome type-1G protein. Usher syndrome type-1G protein (USH1G), also called scaffold protein containing ankyrin repeats and SAM domain (Sans), is an anchoring/scaffolding protein that is part of the functional network formed by USH1C, USH1G, CDH23 and MYO7A, that mediates mechanotransduction in cochlear hair cells. It is required for normal development and maintenance of cochlear hair cell bundles, as well as for normal hearing. USH1G consists of four N-terminal ANK repeats, a central region, and a sterile alpha motif (SAM) followed by a C-terminal type I PDZ binding motif (PBM). This model corresponds to the central region (CEN) of USH1G, which contains the conserved regions CEN1 and CEN2. CEN is directly responsible for binding to the MYO7A MyTH4-FERM tandem.	57
409637	cd21804	DEFL_AtLURE1-like	defensin-like domain found in Arabidopsis thaliana proteins LURE 1.2, LURE 1.6, and similar proteins. This subfamily includes Arabidopsis thaliana (At) LURE1.2 (also called cysteine-rich peptide 810_1.2, CRP810_1.2, or defensin-like protein 213) and AtLURE1.6 (also called cysteine-rich peptide 810_1.6, CRP810_1.6, or defensin-like protein 215). They are pollen tube attractants guiding pollen tubes to the ovular micropyle. AtLURE1.2 specifically attracts pollen tubes from A. thaliana, but not those from A. lyrata. It triggers endocytosis of MDIS1 in the pollen tube tip. This model corresponds to the defensin-like (DEFL) domains of AtLURE1.2 and AtLURE1.6, which adopt a typical structure characterized by a cysteine-stabilized alpha/beta scaffold.	38
409638	cd21805	DEFL_BmBKTx1-like	defensin-like domain found in Mesobuthus martensii neurotoxin BmBKTx1 and similar proteins. BmBKTx1, also called potassium channel toxin alpha-KTx 19.1, or BmK37, is a selective inhibitor of high conductance calcium-activated potassium channels KCa1.1/KCNMA1. It belongs to a family of short-chain alpha-KTx toxins of the potassium channel (also called alpha-KTx19) and may be insect specific. This subfamily also includes Arabidopsis thaliana defensin-like protein 32 (AtDEF32). Its biological function remains unclear. This model corresponds to the defensin-like (DEFL) domain of BmBKTx1 and AtDEF32, which adopts a typical structure characterized by cysteine-stabilized alpha/beta scaffold.	39
409639	cd21806	DEFL_defensin-like	defensin-like domain found in bilateria defensins, sapecins, tenecins, phormicins, and lucifensins. This subfamily includes a group of bactericidal proteins, such as defensins, sapecins, tenecins, phormicins, and lucifensins from bilateria. They are host defense peptides produced in response to injury and mostly active against Gram-positive bacteria. This model corresponds to the defensin-like (DEFL) domain, which adopts a typical structure characterized by cysteine-stabilized alpha/beta scaffold.	38
409632	cd21807	ABC-2_lan_permease_MutE_EpiE-like	lantibiotic immunity ABC transporter MutE/EpiE family permease (also called ABC-2 transporter MutE/EpiE family permease) subunit. This subfamily includes lantibiotic ABC transporter permease subunits EpiE, MutE, SlvE and NisE, which are highly hydrophobic, integral membrane proteins, and part of the bacitracin ABC transport system that confers resistance to the Gram-positive bacteria in which this system operates, specifically to the lantibiotics mutacin, epidermin, nisin and salivaricin, respectively. Lantibiotics are small peptides, produced by Gram-positive bacteria, which are ribosomally-synthesized as pre-peptides and act by disrupting membrane integrity. Genes encoding the lantibiotic ABC transporter subunits are highly organized in operons containing all the genes required for maturation, transport, immunity, and synthesis. For example, in Staphylococcus epidermidis Tu3298, the lantibiotic epidermin is active against other Gram-positive bacteria via various modes of actions; however, its self-protection against the pore-forming epidermin is mediated by the ABC transporter immunity proteins composed of EpiF, EpiE and EpiG; the EpiE permease subunit transports epidermin to the surface and expels it from the membrane. This subfamily also includes the lantibiotic ABC transporter permease subunits MutE, SlvF, and NisE. Self-protection of the mutacin-producing strain Streptococcus mutans CH43 against the pore-forming lantibiotic mutacin is mediated by an ABC transporter composed of MutF, MutE and MutG. In salivaricin D-producing strain Streptococcus salivarius 5M6c, self-immunity against the intrinsically trypsin-resistant salivaricin is mediated via ABC transporter proteins SlvF, SlvE and SlvG, while in Lactococcus lactis, self-immunity against nisin is mediated by the ABC transporter NisFEG. The MutE, NisE and SlvF permease subunits transport mutacin, nisin and salivaricin, respectively, to the surface and expel them from the membrane.	234
409633	cd21808	ABC-2_lan_permease_MutG	lantibiotic immunity ABC transporter MutG family permease (also called ABC-2 transporter MutG family permease) subunit. This subfamily includes lantibiotic ABC transporter permease subunit MutG which is a highly hydrophobic, integral membrane protein, and part of the bacitracin ABC transport system that confers resistance to the Gram-positive bacteria in which this system operates, specifically to lantibiotic mutacin. Lantibiotics are small peptides, produced by Gram-positive bacteria, which are ribosomally-synthesized as pre-peptides and act by disrupting membrane integrity. Genes encoding the lantibiotic ABC transporter subunits are highly organized in operons containing all the genes required for maturation, transport, immunity, and synthesis. For example, in Streptococcus mutans CH43, the lantibiotic mutacin is active against other Gram-positive bacteria via various modes of actions; however, its self-protection against the pore-forming mutacin is mediated by the ABC transporter composed of MutF, MutE, and MutG. This subfamily includes the MutG permease subunit that transports mutacin to the surface and expels it from the membrane.	237
409634	cd21809	ABC-2_lan_permease-like	lantibiotic immunity ABC transporter permease (also called ABC-2 transporter permease) subunit and similar proteins. This subfamily contains lantibiotic ABC transporter permease subunits which are highly hydrophobic, integral membrane proteins, and part of the bacitracin ABC transport system that confers resistance to the Gram-positive bacteria in which this system operates, particularly to type-A lantibiotics. Lantibiotics are small peptides, produced by Gram-positive bacteria, which are ribosomally-synthesized as pre-peptides and act by disrupting membrane integrity. Genes encoding the lantibiotic ABC transporter subunits are highly organized in operons containing all the genes required for maturation, transport, immunity, and synthesis. For example, in Lactococcus lactis, the lantibiotic nisin is active against other Gram-positive bacteria via various modes of actions; however, its self-protection against the pore-forming nisin is mediated by the ABC transporter composed of NisF, NisE and NisG; the NisG permease subunit transports nisin to the surface and expels it from the membrane. This family includes mostly uncharacterized transport permease subunits that transport lantibiotics to the surface and expel them from the membrane.	235
409635	cd21810	ABC-2_lan_permease_NisG-like	lantibiotic immunity ABC transporter NisG family permease (also called ABC-2 transporter NisG family permease) subunit, and similar proteins. This subfamily contains lantibiotic ABC transporter permease subunits NisG and NsuG, which are highly hydrophobic, integral membrane proteins, and part of the bacitracin ABC transport system that confers resistance to the Gram-positive bacteria in which this system operates, particularly to the lantibiotic nisin. Lantibiotics are small peptides, produced by Gram-positive bacteria, which are ribosomally-synthesized as pre-peptides and act by disrupting membrane integrity. Genes encoding the lantibiotic ABC transporter subunits are highly organized in operons containing all the genes required for maturation, transport, immunity, and synthesis. In Lactococcus lactis and Streptococcus uberis, the lantibiotic nisin is active against other Gram-positive bacteria via various modes of actions; however, its self-protection against the pore-forming nisin is mediated by the ABC transporter composed of NisF, NisE and NisG. In Streptococcus uberis, similar proteins provide self-protection against the pore-forming lantibiotic nisin U. This subfamily contains the NisG and NsuG permease subunits that transport nisin to the surface and expel it from the membrane.	211
409251	cd21811	CoV_Nsp7	coronavirus non-structural protein 7. This model represents the non-structural protein 7 (Nsp7) of alpha-, beta-, gamma- and deltacoronaviruses, including highly pathogenic betacoronaviruses such as Severe acute respiratory syndrome-related coronavirus (SARS-CoV), SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV), and Middle East respiratory syndrome-related (MERS) CoV. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Upon processing of the Nsp7-10 region by protease M (Mpro), the released four small proteins Nsp7, Nsp8, Nsp9 and Nsp10 form functional complexes with CoV core enzymes and stimulate replication. Most importantly, a complex of Nsp7 with Nsp8 has been shown to activate and confer processivity to the RNA-synthesizing activity of Nsp12, the RNA-dependent RNA-polymerase (RdRp); in SARS-CoV, point mutations in the NSP7- or NSP8-coding region have been shown to delay virus growth. Nsp7 and Nsp8 cooperate in activating the primer-dependent activity of the Nsp12 RdRp such that the level of their association may constitute a limiting factor for obtaining a high RNA polymerase activity. The subsequent Nsp7/Nsp8/Nsp12 polymerase complex is then able to associate with an active bifunctional Nsp14, which includes N-terminal 3' to 5' exoribonuclease (ExoN) and C-terminal N7-guanine cap methyltransferase (N7-MTase) activities, thus representing a unique coronavirus Nsp assembly that incorporates RdRp, exoribonuclease, and N7-MTase activities. Interaction of Nsp7 with Nsp8 appears to be conserved across the coronavirus family, making these proteins interesting drug targets. Nsp7 has a 4-helical bundle conformation which is strongly affected by its interaction with Nsp8, especially where it concerns alpha-helix 4. SARS-CoV Nsp7 forms an 8:8 hexadecameric supercomplex with Nsp8 that adopts a hollow cylinder-like structure with a large central channel and positive electrostatic properties in the cylinder, while Feline infectious peritonitis virus Nsp7 forms a 2:1 heterotrimer with Nsp8. Regardless of their oligomeric structure, the Nsp7/Nsp8 complex functions as a noncanonical RNA polymerase capable of synthesizing RNA of up to template length.	83
409627	cd21812	MHV-like_Nsp3_betaSM	betacoronavirus-specific marker of non-structural protein 3 from murine hepatitis virus and betacoronavirus in the A lineage. This model represents the betacoronavirus-specific marker (betaSM), also called group 2-specific marker (G2M), of non-structural protein 3 (Nsp3) from betacoronavirus in the embecovirus subgenus (A lineage), including murine hepatitis virus (MHV) and Human coronavirus HKU1. The betaSM/G2M is located C-terminal to the nucleic acid-binding (NAB) domain. This region is absent in alpha- and deltacoronavirus Nsp3; there is a gammacoronavirus-specific marker (gammaSM) at this position in gammacoronavirus Nsp3. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. Little is known about the betaSM/G2M domain; it is predicted to be non-enzymatic and may be an intrinsically disordered region. The betaSM/G2M domain is part of the predicted PLnc domain (made up of 385 amino acids) of the related SARS-CoV Nsp3 that may function as a replication/transcription scaffold, with interactions to Nsp5, Nsp12, Nsp13, Nsp14, and Nsp16.	125
409628	cd21813	HKU9-like_Nsp3_betaSM	betacoronavirus-specific marker of non-structural protein 3 from Rousettus bat coronavirus HKU9 and betacoronavirus in the D lineage. This model represents the betacoronavirus-specific marker (betaSM), also called group 2-specific marker (G2M), of non-structural protein 3 (Nsp3) from betacoronavirus in the nobecovirus subgenus (D lineage), including Rousettus bat coronavirus HKU9. The betaSM/G2M is located C-terminal to the nucleic acid-binding (NAB) domain. This region is absent in alpha- and deltacoronavirus Nsp3; there is a gammacoronavirus-specific marker (gammaSM) at this position in gammacoronavirus Nsp3. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. Little is known about the betaSM/G2M domain; it is predicted to be non-enzymatic and may be an intrinsically disordered region. The betaSM/G2M domain is part of the predicted PLnc domain (made up of 385 amino acids) of the related SARS-CoV Nsp3 that may function as a replication/transcription scaffold, with interactions to Nsp5, Nsp12, Nsp13, Nsp14, and Nsp16.	135
409629	cd21814	SARS-CoV-like_Nsp3_betaSM	betacoronavirus-specific marker of non-structural protein 3 from Severe acute respiratory syndrome-related coronavirus and betacoronavirus in the B lineage. This model represents the betacoronavirus-specific marker (betaSM), also called group 2-specific marker (G2M), of non-structural protein 3 (Nsp3) from betacoronavirus in the sarbecovirus subgenus (B lineage), including highly pathogenic human coronaviruses such as Severe acute respiratory syndrome-related coronavirus (SARS-CoV) and SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV). The betaSM/G2M is located C-terminal to the nucleic acid-binding (NAB) domain. This region is absent in alpha- and deltacoronavirus Nsp3; there is a gammacoronavirus-specific marker (gammaSM) at this position in gammacoronavirus Nsp3. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. Little is known about the betaSM/G2M domain; it is predicted to be non-enzymatic and may be an intrinsically disordered region. The betaSM/G2M domain is part of the predicted PLnc domain (made up of 385 amino acids) of SARS-CoV Nsp3 that may function as a replication/transcription scaffold, with interactions to Nsp5, Nsp12, Nsp13, Nsp14, and Nsp16.	116
409630	cd21815	MERS-CoV-like_Nsp3_betaSM	betacoronavirus-specific marker of non-structural protein 3 from Middle East respiratory syndrome-related coronavirus and betacoronavirus in the C lineage. This model represents the betacoronavirus-specific marker (betaSM), also called group 2-specific marker (G2M), of non-structural protein 3 (Nsp3) from betacoronavirus in the merbecovirus subgenus (C lineage), including Middle East respiratory syndrome-related coronavirus (MERS-CoV) and Tylonycteris bat coronavirus HKU4. The betaSM/G2M is located C-terminal to the nucleic acid-binding (NAB) domain. This region is absent in alpha- and deltacoronavirus Nsp3; there is a gammacoronavirus-specific marker (gammaSM) at this position in gammacoronavirus Nsp3. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. Little is known about the betaSM/G2M domain; it is predicted to be non-enzymatic and may be an intrinsically disordered region. The betaSM/G2M domain is part of the predicted PLnc domain (made up of 385 amino acids) of the related SARS-CoV Nsp3 that may function as a replication/transcription scaffold, with interactions to Nsp5, Nsp12, Nsp13, Nsp14, and Nsp16.	124
409256	cd21816	CoV_Nsp8	Coronavirus non-structural protein 8. This model represents the non-structural protein 8 (Nsp8) of alpha-, beta-, gamma- and deltacoronaviruses, including highly pathogenic betacoronaviruses such as Severe acute respiratory syndrome-related coronavirus (SARS-CoV), SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV), and Middle East respiratory syndrome-related (MERS) CoV. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Upon processing of the Nsp7-10 region by protease M (Mpro), the released four small proteins Nsp7, Nsp8, Nsp9, and Nsp10 form functional complexes with CoV core enzymes and thereby stimulate replication. Most importantly, a complex of Nsp8 with Nsp7 has been shown to activate and confer processivity to the RNA-synthesizing activity of Nsp12, the RNA-dependent RNA-polymerase (RdRp); in SARS-CoV, point mutations in the genes encoding Nsp8 and Nsp7 have been shown to delay virus growth. Nsp8 and Nsp7 cooperate in activating the primer-dependent activity of the Nsp12 RdRp such that the level of their association may constitute a limiting factor for obtaining a high RNA polymerase activity. The subsequent Nsp7/Nsp8/Nsp12 polymerase complex is then able to associate with an active bifunctional Nsp14, which includes N-terminal 3' to 5' exoribonuclease (ExoN) and C-terminal N7-guanine cap methyltransferase (N7-MTase) activities, thus representing a unique coronavirus Nsp assembly that incorporates RdRp, exoribonuclease, and N7-MTase activities. Interaction of Nsp8 with Nsp7 appears to be conserved across the coronavirus family, making these proteins interesting drug targets. Nsp8 has a novel 'golf-club' fold composed of an N-terminal 'shaft' domain and a C-terminal 'head' domain. The shaft domain contains three helices, one of which is very long, while the head domain contains another three helices and seven beta-strands, forming an alpha/beta fold. SARS-CoV Nsp8 forms an 8:8 hexadecameric supercomplex with Nsp7 that adopts a hollow cylinder-like structure with a large central channel and positive electrostatic properties in the cylinder, while Feline infectious peritonitis virus Nsp8 forms a 1:2 heterotrimer with Nsp7. Regardless of their oligomeric structure, the Nsp7/Nsp8 complex functions as a noncanonical RNA polymerase capable of synthesizing RNA of up to the template length.	194
409622	cd21817	IgC1_CH1_IgEG	CH1 domain (first constant Ig domain of the heavy chain) in immunoglobulin heavy epsilon and gamma chain; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the first immunoglobulin constant-1 set domain of epsilon and gamma chains. It belongs to a family composed of the first immunoglobulin constant-1 set domain of alpha, delta, epsilon, gamma, and mu heavy chains. This domain is found on the Fab antigen-binding fragment. The basic structure of Ig molecules is a tetramer of two light chains and two heavy chains linked by disulfide bonds. There are two types of light chains: kappa and lambda; each is composed of a constant domain and a variable domain. There are five types of heavy chains: alpha, delta, epsilon, gamma, and mu, all consisting of a variable domain (VH) with three (alpha, delta and gamma) or four (epsilon and mu) constant domains (CH1 to CH4). Ig molecules are modular proteins, in which the variable and constant domains have clear, conserved sequence patterns. This group belongs to the C1-set of IgSF domains, which are classical Ig-like domains resembling the antibody constant domain. C1-set domains are found almost exclusively in molecules involved in the immune system, such as in immunoglobulin light and heavy chains, in the major histocompatibility complex (MHC) class I and II complex molecules, and in various T-cell receptors.	94
409623	cd21818	IgC1_CH1_IgA	CH1 domain (first constant Ig domain of the heavy chain) in immunoglobulin heavy alpha chain; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the first immunoglobulin constant-1 set domain of alpha chains. It belongs to a family composed of the first immunoglobulin constant-1 set domain of alpha, epsilon, gamma, and mu heavy chains. This domain is found on the Fab antigen-binding fragment. The basic structure of Ig molecules is a tetramer of two light chains and two heavy chains linked by disulfide bonds. There are two types of light chains: kappa and lambda; each is composed of a constant domain and a variable domain. There are five types of heavy chains: alpha, delta, epsilon, gamma, and mu, all consisting of a variable domain (VH) with three (alpha, delta and gamma) or four (epsilon and mu) constant domains (CH1 to CH4). Ig molecules are modular proteins, in which the variable and constant domains have clear, conserved sequence patterns. This group belongs to the C1-set of IgSF domains, which are classical Ig-like domains resembling the antibody constant domain. C1-set domains are found almost exclusively in molecules involved in the immune system, such as in immunoglobulin light and heavy chains, in the major histocompatibility complex (MHC) class I and II complex molecules, and in various T-cell receptors.	94
409624	cd21819	IgC1_CH1_IgM	CH1 domain (first constant Ig domain of the heavy chain) in immunoglobulin heavy mu chain; member of the C1-set of Ig superfamily (IgSF) domains. The members here are composed of the first immunoglobulin constant-1 set domain of mu chains. It belongs to a family composed of the first immunoglobulin constant-1 set domain of alpha, delta, epsilon, gamma, and mu heavy chains. This domain is found on the Fab antigen-binding fragment. The basic structure of Ig molecules is a tetramer of two light chains and two heavy chains linked by disulfide bonds. There are two types of light chains: kappa and lambda; each is composed of a constant domain and a variable domain. There are five types of heavy chains: alpha, delta, epsilon, gamma, and mu, all consisting of a variable domain (VH) with three (alpha, delta and gamma) or four (epsilon and mu) constant domains (CH1 to CH4). Ig molecules are modular proteins, in which the variable and constant domains have clear, conserved sequence patterns. This group belongs to the C1-set of IgSF domains, which are classical Ig-like domains resembling the antibody constant domain. C1-set domains are found almost exclusively in molecules involved in the immune system, such as in immunoglobulin light and heavy chains, in the major histocompatibility complex (MHC) class I and II complex molecules, and in various T-cell receptors.	95
409625	cd21820	IgC1_MHC_1b_Qa-1b	Class Ib major histocompatibility complex (MHC) immunoglobulin domain of Qa-1b; member of the C1-set of Ig superfamily (IgSF) domains. The non-classical mouse MHC class I (MHC-I) molecule Qa-1b is a non-polymorphic MHC molecule with an important function in innate immunity. It binds and presents signal peptides of classical MHC-I molecules at the cell surface and, as such, acts as an indirect sensor for the normal expression of MHC-I molecules. This signal peptide dominantly accommodated in the groove of Qa-1b is called Qdm, for Qa-1 determinant modifier, and its amino acid sequence AMAPRTLLL is highly conserved among mammalian species. The Qdm/Qa-1b complex serves as a ligand for the germ-line encoded heterodimeric CD94/NKG2A receptors expressed on natural killer (NK) cells and activated CD8+ T cells and transduces inhibitory signals to these lymphocytes. Thus, upon binding, Qa-1b signals NK cells not to engage in cell lysis. The molecular basis of Qa-1b function is unclear.	98
409352	cd21821	MavE	Dot/Icm type IV secretion system effector MavE. The Icm/Dot protein translocation apparatus is a type IVb secretion system, highly related to bacterial conjugative DNA transfer systems, and is important in establishing a replication vacuole. A complex of Icm/Dot proteins spans the bacterial envelope, allowing the transfer of proteins from the bacterial cytoplasm across membranes located in the target host eukaryotic cell. Icm/Dot-translocated substrates (IDTS) control construction of the replication compartment and have been shown to directly regulate membrane traffic associated with the movement of vesicles along steps in the early secretory system. Although the function of Legionella MavE is unknown, it has been shown to be an Icm/Dot-translocated substrate and is assumed to play a role in this type IV secretion system.	132
409348	cd21822	SARS-CoV-like_Nsp3_NAB	nucleic acid binding domain of non-structural protein 3 from Severe acute respiratory syndrome-related coronavirus and betacoronavirus in the B lineage. This model represents the nucleic acid binding (NAB) domain of non-structural protein 3 (Nsp3) from betacoronavirus in the sarbecovirus subgenus (B lineage) and hibecovirus subgenus, including highly pathogenic human coronaviruses (CoVs) such as Severe acute respiratory syndrome-related coronavirus (SARS-CoV) and SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV). The NAB domain represents a new fold, with a parallel four-strand beta-sheet holding two alpha-helices of three and four turns that are oriented antiparallel to the beta-strands. NAB is a cytoplasmic domain located between the papain-like protease (PLPro) and betacoronavirus-specific marker (betaSM) domains of CoV Nsp3. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. The NAB domain both binds ssRNA and unwinds dsDNA. It prefers to bind ssRNA containing repeats of three consecutive guanines. A group of residues that form a positively charged patch on the protein surface of SARS-CoV Nsp3 NAB serves as the binding site of nucleic acids. This site is conserved in the NAB of Nsp3 from betacoronavirus in the B lineage.	107
409349	cd21823	MERS-CoV-like_Nsp3_NAB	nucleic acid binding domain of non-structural protein 3 from Middle East respiratory syndrome-related coronavirus and betacoronavirus in the C lineage. This model represents the nucleic acid binding (NAB) domain of non-structural protein 3 (Nsp3) from betacoronavirus in the merbecovirus subgenus (C lineage), including Middle East respiratory syndrome-related coronavirus (MERS-CoV) and Tylonycteris bat coronavirus HKU4. The NAB domain represents a new fold, with a parallel four-strand beta-sheet holding two alpha-helices of three and four turns that are oriented antiparallel to the beta-strands. NAB is a cytoplasmic domain located between the papain-like protease (PLPro) and betacoronavirus-specific marker (betaSM) domains of CoV Nsp3. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. The NAB domain both binds ssRNA and unwinds dsDNA. It prefers to bind ssRNA containing repeats of three consecutive guanines. A group of residues that form a positively charged patch on the protein surface of SARS-CoV Nsp3 NAB serves as the binding site of nucleic acids. This site is conserved in the NAB of Nsp3 from betacoronavirus in the sarbecovirus subgenus (B lineage), and appears to be partially conserved in the Nsp3 NAB from betacoronaviruses in the C lineage.	123
409350	cd21824	MHV-like_Nsp3_NAB	nucleic acid binding domain of non-structural protein 3 from murine hepatitis virus and betacoronavirus in the A lineage. This model represents the nucleic acid binding (NAB) domain of non-structural protein 3 (Nsp3) from betacoronavirus in the embecovirus subgenus (A lineage), including murine hepatitis virus (MHV) and Human coronavirus HKU1. The NAB domain represents a new fold, with a parallel four-strand beta-sheet holding two alpha-helices of three and four turns that are oriented antiparallel to the beta-strands. NAB is a cytoplasmic domain located between the papain-like protease (PLPro) and betacoronavirus-specific marker (betaSM) domains of CoV Nsp3. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. The NAB domain both binds ssRNA and unwinds dsDNA. It prefers to bind ssRNA containing repeats of three consecutive guanines. A group of residues that form a positively charged patch on the protein surface of SARS-CoV Nsp3 NAB serves as the binding site of nucleic acids. This site is conserved in the NAB of Nsp3 from betacoronavirus in the sarbecovirus subgenus (B lineage), but is not conserved in the Nsp3 NAB from betacoronaviruses in the A lineage.	119
409351	cd21825	HKU9-like_Nsp3_NAB	nucleic acid binding domain of non-structural protein 3 from Rousettus bat coronavirus HKU9 and betacoronavirus in the D lineage. This model represents the nucleic acid binding (NAB) domain of non-structural protein 3 (Nsp3) from betacoronavirus in the nobecovirus subgenus (D lineage), including Rousettus bat coronavirus HKU9. The NAB domain represents a new fold, with a parallel four-strand beta-sheet holding two alpha-helices of three and four turns that are oriented antiparallel to the beta-strands. NAB is a cytoplasmic domain located between the papain-like protease (PLPro) and betacoronavirus-specific marker (betaSM) domains of CoV Nsp3. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. The NAB domain both binds ssRNA and unwinds dsDNA. It prefers to bind ssRNA containing repeats of three consecutive guanines. A group of residues that form a positively charged patch on the protein surface of SARS-CoV Nsp3 NAB serves as the binding site of nucleic acids. This site is conserved in the NAB of Nsp3 from betacoronavirus in the sarbecovirus subgenus (B lineage), but is not conserved in the Nsp3 NAB from betacoronaviruses in the D lineage.	117
409252	cd21826	alphaCoV_Nsp7	alphacoronavirus non-structural protein 7. This model represents the non-structural protein 7 (Nsp7) of alphacoronaviruses that include Feline infectious peritonitis virus (FCoV), Human coronavirus NL63 (HCoV-NL63), and Porcine transmissible gastroenteritis coronavirus (TGEV), among others. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Upon processing of the Nsp7-10 region by protease M (Mpro), the released four small proteins Nsp7, Nsp8, Nsp9 and Nsp10 form functional complexes with CoV core enzymes and stimulate replication. Most importantly, a complex of Nsp7 with Nsp8 has been shown to activate and confer processivity to the RNA-synthesizing activity of Nsp12, the RNA-dependent RNA-polymerase (RdRp); in SARS-CoV, point mutations in the NSP7- or NSP8-coding region have been shown to delay virus growth. Nsp7 and Nsp8 cooperate in activating the primer-dependent activity of the Nsp12 RdRp such that the level of their association may constitute a limiting factor for obtaining a high RNA polymerase activity. The subsequent Nsp7/Nsp8/Nsp12 polymerase complex is then able to associate with an active bifunctional Nsp14, which includes N-terminal 3' to 5' exoribonuclease (ExoN) and C-terminal N7-guanine cap methyltransferase (N7-MTase) activities, thus representing a unique coronavirus Nsp assembly that incorporates RdRp, exoribonuclease, and N7-MTase activities. Interaction of Nsp7 with Nsp8 appears to be conserved across the coronavirus family, making these proteins interesting drug targets. Nsp7 has a 4-helical bundle conformation which is strongly affected by its interaction with Nsp8, especially where it concerns alpha-helix 4. FCoV Nsp7 forms a 2:1 heterotrimer with Nsp8; the Nsp7/Nsp8 complex functions as a noncanonical RNA polymerase capable of synthesizing RNA of up to template length.	83
409253	cd21827	betaCoV_Nsp7	betacoronavirus non-structural protein 7. This model represents the non-structural protein 7 (Nsp7) of betacoronaviruses including the highly pathogenic Severe acute respiratory syndrome-related coronavirus (SARS-CoV), SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV), and Middle East respiratory syndrome-related (MERS) CoV. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Upon processing of the Nsp7-10 region by protease M (Mpro), the released four small proteins Nsp7, Nsp8, Nsp9 and Nsp10 form functional complexes with CoV core enzymes and stimulate replication. Most importantly, a complex of Nsp7 with Nsp8 has been shown to activate and confer processivity to the RNA-synthesizing activity of Nsp12, the RNA-dependent RNA-polymerase (RdRp); in SARS-CoV, point mutations in the NSP7- or NSP8-coding region have been shown to delay virus growth. Nsp7 and Nsp8 cooperate in activating the primer-dependent activity of the Nsp12 RdRp such that the level of their association may constitute a limiting factor for obtaining a high RNA polymerase activity. The subsequent Nsp7/Nsp8/Nsp12 polymerase complex is then able to associate with an active bifunctional Nsp14, which includes N-terminal 3' to 5' exoribonuclease (ExoN) and C-terminal N7-guanine cap methyltransferase (N7-MTase) activities, thus representing a unique coronavirus Nsp assembly that incorporates RdRp, exoribonuclease, and N7-MTase activities. Interaction of Nsp7 with Nsp8 appears to be conserved across the coronavirus family, making these proteins interesting drug targets. Nsp7 has a 4-helical bundle conformation which is strongly affected by its interaction with Nsp8, especially where it concerns alpha-helix 4. SARS-CoV Nsp7 forms an 8:8 hexadecameric supercomplex with Nsp8 that adopts a hollow cylinder-like structure with a large central channel and positive electrostatic properties in the cylinder; the Nsp7/Nsp8 complex functions as a noncanonical RNA polymerase capable of synthesizing RNA of up to template length.	83
409254	cd21828	gammaCoV_Nsp7	gammacoronavirus non-structural protein 7. This model represents the non-structural protein 7 (Nsp7) of gammacoronaviruses that include Avian infectious bronchitis virus (IBV) and Canada goose coronavirus (CGCoV), among others. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Upon processing of the Nsp7-10 region by protease M (Mpro), the released four small proteins Nsp7, Nsp8, Nsp9 and Nsp10 form functional complexes with CoV core enzymes and stimulate replication. Most importantly, a complex of Nsp7 with Nsp8 has been shown to activate and confer processivity to the RNA-synthesizing activity of Nsp12, the RNA-dependent RNA-polymerase (RdRp); in SARS-CoV, point mutations in the NSP7- or NSP8-coding region have been shown to delay virus growth. Nsp7 and Nsp8 cooperate in activating the primer-dependent activity of the Nsp12 RdRp such that the level of their association may constitute a limiting factor for obtaining a high RNA polymerase activity. The subsequent Nsp7/Nsp8/Nsp12 polymerase complex is then able to associate with an active bifunctional Nsp14, which includes N-terminal 3' to 5' exoribonuclease (ExoN) and C-terminal N7-guanine cap methyltransferase (N7-MTase) activities, thus representing a unique coronavirus Nsp assembly that incorporates RdRp, exoribonuclease, and N7-MTase activities. Interaction of Nsp7 with Nsp8 appears to be conserved across the coronavirus family, making these proteins interesting drug targets. Nsp7 has a 4-helical bundle conformation which is strongly affected by its interaction with Nsp8, especially where it concerns alpha-helix 4. SARS-CoV Nsp7 forms an 8:8 hexadecameric supercomplex with Nsp8 that adopts a hollow cylinder-like structure with a large central channel and positive electrostatic properties in the cylinder, while Feline infectious peritonitis virus Nsp7 forms a 2:1 heterotrimer with Nsp8. Regardless of their oligomeric structure, the Nsp7/Nsp8 complex functions as a noncanonical RNA polymerase capable of synthesizing RNA of up to template length.	83
409255	cd21829	deltaCoV_Nsp7	deltacoronavirus non-structural protein 7. This model represents the non-structural protein 7 (Nsp7) of deltacoronaviruses that include White-eye coronavirus HKU16 and Quail coronavirus UAE-HKU30, among others. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Upon processing of the Nsp7-10 region by protease M (Mpro), the released four small proteins Nsp7, Nsp8, Nsp9 and Nsp10 form functional complexes with CoV core enzymes and stimulate replication. Most importantly, a complex of Nsp7 with Nsp8 has been shown to activate and confer processivity to the RNA-synthesizing activity of Nsp12, the RNA-dependent RNA-polymerase (RdRp); in SARS-CoV, point mutations in the NSP7- or NSP8-coding region have been shown to delay virus growth. Nsp7 and Nsp8 cooperate in activating the primer-dependent activity of the Nsp12 RdRp such that the level of their association may constitute a limiting factor for obtaining a high RNA polymerase activity. The subsequent Nsp7/Nsp8/Nsp12 polymerase complex is then able to associate with an active bifunctional Nsp14, which includes N-terminal 3' to 5' exoribonuclease (ExoN) and C-terminal N7-guanine cap methyltransferase (N7-MTase) activities, thus representing a unique coronavirus Nsp assembly that incorporates RdRp, exoribonuclease, and N7-MTase activities. Interaction of Nsp7 with Nsp8 appears to be conserved across the coronavirus family, making these proteins interesting drug targets. Nsp7 has a 4-helical bundle conformation which is strongly affected by its interaction with Nsp8, especially where it concerns alpha-helix 4. SARS-CoV Nsp7 forms an 8:8 hexadecameric supercomplex with Nsp8 that adopts a hollow cylinder-like structure with a large central channel and positive electrostatic properties in the cylinder, while Feline infectious peritonitis virus Nsp7 forms a 2:1 heterotrimer with Nsp8. Regardless of their oligomeric structure, the Nsp7/Nsp8 complex functions as a noncanonical RNA polymerase capable of synthesizing RNA of up to template length.	96
409257	cd21830	alphaCoV_Nsp8	alphacoronavirus non-structural protein 8. This model represents the non-structural protein 8 (Nsp8) region of alphacoronaviruses that include Feline infectious peritonitis virus (FCoV), Human coronavirus NL63 (HCoV-NL63), and Porcine epidemic diarrhea coronavirus (PEDV), among others. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Upon processing of the Nsp7-10 region by protease M (Mpro), the released four small proteins Nsp7, Nsp8, Nsp9, and Nsp10 form functional complexes with CoV core enzymes and thereby stimulate replication. Most importantly, a complex of Nsp8 with Nsp7 has been shown to activate and confer processivity to the RNA-synthesizing activity of Nsp12, the RNA-dependent RNA-polymerase (RdRp); in SARS-CoV, point mutations in the genes encoding Nsp8 and Nsp7 have been shown to delay virus growth. Nsp8 and Nsp7 cooperate in activating the primer-dependent activity of the Nsp12 RdRp such that the level of their association may constitute a limiting factor for obtaining a high RNA polymerase activity. The subsequent Nsp7/Nsp8/Nsp12 polymerase complex is then able to associate with an active bifunctional Nsp14, which includes N-terminal 3' to 5' exoribonuclease (ExoN) and C-terminal N7-guanine cap methyltransferase (N7-MTase) activities, thus representing a unique coronavirus Nsp assembly that incorporates RdRp, exoribonuclease, and N7-MTase activities. Interaction of Nsp8 with Nsp7 appears to be conserved across the coronavirus family, making these proteins interesting drug targets. Nsp8 has a novel 'golf-club' fold composed of an N-terminal 'shaft' domain and a C-terminal 'head' domain. The shaft domain contains three helices, one of which is very long, while the head domain contains another three helices and seven beta-strands, forming an alpha/beta fold. FCoV Nsp8 forms a 1:2 heterotrimer with Nsp7; the Nsp7/Nsp8 complex functions as a noncanonical RNA polymerase capable of synthesizing RNA of up to the template length.	195
409258	cd21831	betaCoV_Nsp8	betacoronavirus non-structural protein 8. This model represents the non-structural protein 8 (Nsp8) the highly pathogenic betacoronaviruses that include Severe acute respiratory syndrome-related coronavirus (SARS-CoV), SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV), and Middle East respiratory syndrome-related (MERS) CoV. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Upon processing of the Nsp7-10 region by protease M (Mpro), the released four small proteins Nsp7, Nsp8, Nsp9, and Nsp10 form functional complexes with CoV core enzymes and thereby stimulate replication. Most importantly, a complex of Nsp8 with Nsp7 has been shown to activate and confer processivity to the RNA-synthesizing activity of Nsp12, the RNA-dependent RNA-polymerase (RdRp); in SARS-CoV, point mutations in the genes encoding Nsp8 and Nsp7 have been shown to delay virus growth. Nsp8 and Nsp7 cooperate in activating the primer-dependent activity of the Nsp12 RdRp such that the level of their association may constitute a limiting factor for obtaining a high RNA polymerase activity. The subsequent Nsp7/Nsp8/Nsp12 polymerase complex is then able to associate with an active bifunctional Nsp14, which includes N-terminal 3' to 5' exoribonuclease (ExoN) and C-terminal N7-guanine cap methyltransferase (N7-MTase) activities, thus representing a unique coronavirus Nsp assembly that incorporates RdRp, exoribonuclease, and N7-MTase activities. Interaction of Nsp8 with Nsp7 appears to be conserved across the coronavirus family, making these proteins interesting drug targets. Nsp8 has a novel 'golf-club' fold composed of an N-terminal 'shaft' domain and a C-terminal 'head' domain. 
The shaft domain contains three helices, one of which is very long, while the head domain contains another three helices and seven beta-strands, forming an alpha/beta fold. SARS-CoV Nsp8 forms an 8:8 hexadecameric supercomplex with Nsp7 that adopts a hollow cylinder-like structure with a large central channel and positive electrostatic properties in the cylinder; the Nsp7/Nsp8 complex functions as a noncanonical RNA polymerase capable of synthesizing RNA of up to the template length.	196
409259	cd21832	gammaCoV_Nsp8	gammacoronavirus non-structural protein 8. This model represents the non-structural protein 8 (Nsp8) region of gammacoronaviruses that include Avian infectious bronchitis virus (IBV) and Canada goose coronavirus (CGCoV), among others. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Upon processing of the Nsp7-10 region by protease M (Mpro), the released four small proteins Nsp7, Nsp8, Nsp9, and Nsp10 form functional complexes with CoV core enzymes and thereby stimulate replication. Most importantly, a complex of Nsp8 with Nsp7 has been shown to activate and confer processivity to the RNA-synthesizing activity of Nsp12, the RNA-dependent RNA-polymerase (RdRp); in SARS-CoV, point mutations in the genes encoding Nsp8 and Nsp7 have been shown to delay virus growth. Nsp8 and Nsp7 cooperate in activating the primer-dependent activity of the Nsp12 RdRp such that the level of their association may constitute a limiting factor for obtaining a high RNA polymerase activity. The subsequent Nsp7/Nsp8/Nsp12 polymerase complex is then able to associate with an active bifunctional Nsp14, which includes N-terminal 3' to 5' exoribonuclease (ExoN) and C-terminal N7-guanine cap methyltransferase (N7-MTase) activities, thus representing a unique coronavirus Nsp assembly that incorporates RdRp, exoribonuclease, and N7-MTase activities. Interaction of Nsp8 with Nsp7 appears to be conserved across the coronavirus family, making these proteins interesting drug targets. Nsp8 has a novel 'golf-club' fold composed of an N-terminal 'shaft' domain and a C-terminal 'head' domain. The shaft domain contains three helices, one of which is very long, while the head domain contains another three helices and seven beta-strands, forming an alpha/beta fold. 
SARS-CoV Nsp8 forms an 8:8 hexadecameric supercomplex with Nsp7 that adopts a hollow cylinder-like structure with a large central channel and positive electrostatic properties in the cylinder, while Feline infectious peritonitis virus Nsp8 forms a 1:2 heterotrimer with Nsp7. Regardless of their oligomeric structure, the Nsp7/Nsp8 complex functions as a noncanonical RNA polymerase capable of synthesizing RNA of up to the template length.	210
409260	cd21833	deltaCoV_Nsp8	deltacoronavirus non-structural protein 8. This model represents the non-structural protein 8 (Nsp8) region of deltacoronaviruses that include White-eye coronavirus HKU16 and Quail coronavirus UAE-HKU30, among others. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Upon processing of the Nsp7-10 region by protease M (Mpro), the released four small proteins Nsp7, Nsp8, Nsp9, and Nsp10 form functional complexes with CoV core enzymes and thereby stimulate replication. Most importantly, a complex of Nsp8 with Nsp7 has been shown to activate and confer processivity to the RNA-synthesizing activity of Nsp12, the RNA-dependent RNA-polymerase (RdRp); in SARS-CoV, point mutations in the genes encoding Nsp8 and Nsp7 have been shown to delay virus growth. Nsp8 and Nsp7 cooperate in activating the primer-dependent activity of the Nsp12 RdRp such that the level of their association may constitute a limiting factor for obtaining a high RNA polymerase activity. The subsequent Nsp7/Nsp8/Nsp12 polymerase complex is then able to associate with an active bifunctional Nsp14, which includes N-terminal 3' to 5' exoribonuclease (ExoN) and C-terminal N7-guanine cap methyltransferase (N7-MTase) activities, thus representing a unique coronavirus Nsp assembly that incorporates RdRp, exoribonuclease, and N7-MTase activities. Interaction of Nsp8 with Nsp7 appears to be conserved across the coronavirus family, making these proteins interesting drug targets. Nsp8 has a novel 'golf-club' fold composed of an N-terminal 'shaft' domain and a C-terminal 'head' domain. The shaft domain contains three helices, one of which is very long, while the head domain contains another three helices and seven beta-strands, forming an alpha/beta fold. 
SARS-CoV Nsp8 forms an 8:8 hexadecameric supercomplex with Nsp7 that adopts a hollow cylinder-like structure with a large central channel and positive electrostatic properties in the cylinder, while Feline infectious peritonitis virus Nsp8 forms a 1:2 heterotrimer with Nsp7. Regardless of their oligomeric structure, the Nsp7/Nsp8 complex functions as a noncanonical RNA polymerase capable of synthesizing RNA of up to the template length.	189
411711	cd21834	Hhal-like	Restriction endonuclease HhaI and similar endonucleases. HhaI is a type II restriction endonuclease that recognizes the symmetric sequence 5'-GCG|C-3' (| denotes the cleavage site) and produces fragments with 2-base, 3'-overhangs. This domain belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and HindIII.	261
409346	cd21835	SagF	streptolysin S-associated protein SagF. Streptolysin S-associated protein SagF is encoded by the sagF gene, which has been identified as a hemolytic activity-related gene in SEZ (Streptococcus equi ssp. zooepidemicus). The sagF gene is located in the same operon as the sagD gene in the SEZ genome, implying that it plays an important role in SEZ hemolytic activity and is an indispensable gene in the sag operon for streptolysin S (SLS) biosynthesis.	225
409345	cd21836	adhesin_CP	Neisseria gonorrhoeae Adhesin Complex Protein and similar proteins. This model contains adhesin complex protein found in Neisseria gonorrhoeae (Ng-ACP), the causative organism of the sexually transmitted disease gonorrhoea, and similar proteins. Studies have shown that Ng-ACP is conserved and expressed by over 50 gonococcal strains and that recombinant proteins induce antibodies in mice that killed the bacteria in vitro. Thus, recombinant Ng-ACP (rNg-ACP) is a potential vaccine candidate that induces antibodies that are bactericidal and prevent the gonococcus from inhibiting the lytic activity of an innate defense molecule. This protein is structurally similar to N. meningitidis adhesin complex protein as well as members of the MliC/PliC protein family of membrane-bound or periplasmic inhibitors of human C-type lysozyme (HL), suggesting that Ng-ACP may be located in the periplasm or phospholipid layer of the outer membrane.	93
409344	cd21837	AvrRps4-like	Pseudomonas syringae coiled-coil effector AvrRps4 C-terminal region, and regions in similar proteins. This model includes the C-terminal region of AvrRps4, a type III-secreted (T3S) effector protein originally identified in Pseudomonas syringae pv. pisi, a causal agent of bacterial blight in pea. AvrRps4 triggers RPS4 (resistance to P. syringae 4)-dependent immunity in resistant accessions of Arabidopsis. AvrRps4 is a bipartite effector, processed upon entry in planta by cleavage between two glycine residues, generating two protein parts, AvrRps4N and AvrRps4C. Mutation studies have shown that an electronegative surface patch in AvrRps4C is required for recognition by RPS4; mutations in this region have been shown to uncouple triggering of the hypersensitive response from disease resistance. The N-terminal part of AvrRps4 was previously assumed to only function in effector secretion into the host cell; however, in Arabidopsis, which uses a pair of resistance proteins, RRS1 and RPS4, both AvrRps4 parts are required for triggering resistance, and in fact, AvrRps4N on its own has some functions of an effector, implying that the fusion of the two AvrRps4 parts may have arisen to counteract plant defenses.	86
412061	cd21864	GTSE1_CTD	C-terminal domain of G2 and S phase-expressed protein 1. G2 and S phase-expressed protein 1 (GTSE-1), also called protein B99 homolog, is a cell cycle-regulated protein mainly localized in the cytoplasm and apparently associated with microtubules. It may be involved in p53-induced cell cycle arrest in G2/M phase by interfering with microtubule rearrangements that are required to enter mitosis. Overexpression of GTSE-1 delays G2/M phase progression. GTSE-1 is a clathrin adaptor protein; it is recruited to the spindle by clathrin, which stabilizes microtubules by inhibiting the microtubule depolymerase MCAK. This model corresponds to a conserved domain at the C-terminus of GTSE-1, which is required for clathrin binding and is only conserved in vertebrates.	56
409286	cd21868	CC1_SLMAP-like	first coiled-coil (CC1) domain found in Sarcolemmal membrane-associated protein and similar proteins. The family includes Sarcolemmal membrane-associated protein (SLMAP), its paralog TRAF3-interacting JNK-activating modulator (T3JAM), and similar proteins. SLMAP, also called Sarcolemmal membrane-associated protein, is a cardiac tail-anchored membrane protein that may play a role during myoblast fusion. T3JAM, also called TRAF3-interacting protein 3 (TRAF3IP3), is a novel protein that specifically interacts with TRAF3 and promotes the activation of JNK. It may function as an adapter molecule that regulates TRAF3-mediated JNK activation. SLMAP contains an N-terminal FHA domain, followed by four coiled-coil (CC) domains and a transmembrane domain. The model corresponds to the first CC (CC1) domain that is responsible for the binding of suppressor of IKBKE 1 (SIKE1).	38
409343	cd21871	VscT2	type III secretion system apparatus protein VscT2 in Vibrio species. This model contains Vibrio type III secretion system (T3SS) apparatus protein VscT2. Vibrios, which include over 100 species, are ubiquitous in marine and estuarine environments, and many species such as Vibrio cholerae, V. parahaemolyticus and V. mimicus, are pathogens for humans. VscT2 co-occurs with vscS2, vscN2, vscC2 and vscR2 which are all essential for T3SS secretion.	130
409325	cd21872	CoV_Nsp10	coronavirus non-structural protein 10. This model represents the non-structural protein 10 (Nsp10) of alpha-, beta-, gamma- and deltacoronaviruses, including highly pathogenic betacoronaviruses such as Severe acute respiratory syndrome-related coronavirus (SARS-CoV), SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV), and Middle East respiratory syndrome-related (MERS) CoV. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Upon processing of the Nsp7-10 region by protease M (Mpro), the released four small proteins Nsp7, Nsp8, Nsp9, and Nsp10 form functional complexes with CoV core enzymes and thereby stimulate replication. Coronaviruses cap their mRNAs; RNA cap methylation may involve at least three proteins: Nsp10, Nsp14, and Nsp16. Nsp10 serves as a cofactor for both Nsp14 and Nsp16. Nsp14 consists of 2 domains with different enzymatic activities: an N-terminal ExoN domain and a C-terminal cap (guanine-N7) methyltransferase (N7-MTase) domain. The association of Nsp10 with Nsp14 enhances Nsp14's exoribonuclease (ExoN) activity, and not its N7-MTase activity. ExoN is important for proofreading and therefore, the prevention of lethal mutations. The Nsp10/Nsp14 complex hydrolyzes double-stranded RNA in a 3' to 5' direction as well as a single mismatched nucleotide at the 3'-end, mimicking an erroneous replication product, and may function in a replicative mismatch repair mechanism. Nsp16 Cap-0 specific (nucleoside-2'-O-)-methyltransferase (2'OMTase) acts sequentially to Nsp14 MTase in RNA capping methylation, and methylates the RNA cap at the ribose 2'-O position; it catalyzes the conversion of the cap-0 structure on m7GpppA-RNA to a cap-1 structure. 
The association of Nsp10 with Nsp16 enhances Nsp16's 2'OMTase activity, possibly through enhanced RNA binding affinity. Additionally, transmissible gastroenteritis virus (TGEV) Nsp10, Nsp16, and their complex can interact with DLL4, which normally binds to Notch receptors; this interaction may disturb Notch signaling. Nsp10 also binds 2 zinc ions with high affinity.	131
409342	cd21873	Ugr_9a-1-like	includes Urticina grebelnyi Ugr 9a-1, Anemonia viridis Avd13a/b, Antheopsis maculata Amc1a peptide actitoxins. This model includes novel peptides isolated from sea anemone venom, including Urticina grebelnyi Ugr 9a-1 (also called pi-anemonetoxin (pi-AnmTX) Ugr 9a-1 or Pi-actitoxin-Ugr1a or Ugr 9-1), Antheopsis maculata Amc1a (also called delta-actitoxin-Amc1a, Delta-AITX-Amc1a, AnmTX Ama 9a-1 or peptide toxins Am-1) and Anemonia viridis Avd13b (also called U-actitoxin-Avd13b, AnmTX Avi 9a-1, or peptide toxin AV-2). These peptides belong to structural group 9a. Ugr 9a-1 has an uncommon beta-hairpin structure, stabilized by two S-S bridges. Its precursor protein appears to be processed in the following sequence: release of the signal peptide and of the propeptide, production of six identical 34-residue peptides by cleavage between Arg and Glu, release of four N-terminal and three C-terminal residues from each peptide and hydroxylation of each Pro in position 6 of the resulting 27-residue peptides. Ugr1a has been shown to produce a reversible inhibition effect on both the transient and the sustained current of human acid-sensing ion channel 3 (ASIC3) channels expressed in Xenopus laevis oocytes; it completely blocks the transient component and partially (48%) inhibits the amplitude of the sustained component. In mice, it significantly reversed inflammatory and acid-induced pain.	29
409336	cd21874	alpha_betaCoV_Nsp1	non-structural protein 1 from alpha- and betacoronavirus. This model represents the non-structural protein 1 (Nsp1) from alpha- and betacoronaviruses, including highly pathogenic betacoronaviruses such as Severe acute respiratory syndrome-related coronavirus (SARS-CoV), SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV), and Middle East respiratory syndrome-related (MERS) CoV. Gamma- and deltaCoVs do not have Nsp1. CoVs utilize a multi-subunit replication/transcription machinery assembled from a set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins. Nsp1 is the N-terminal cleavage product released from the ORF1a polyprotein by the action of papain-like protease (PLpro). Though Nsp1s of alphaCoVs and betaCoVs share structural similarity, they show no significant sequence similarity and may be considered as genus-specific markers. Despite low sequence similarity, the Nsp1s of alphaCoVs and betaCoVs exhibit remarkably similar biological functions, and are involved in the regulation of both host and viral gene expression. CoV Nsp1 induces suppression of host gene expression and interferes with host immune response. It inhibits host gene expression in two ways: by targeting the translation and stability of cellular mRNAs, and by inhibiting mRNA translation and inducing an endonucleolytic RNA cleavage in the 5'-UTR of cellular mRNAs through its tight association with the 40S ribosomal subunit, a key component of the cellular translation machinery. 
Nsp1 is critical in regulating viral replication and gene expression, as shown by multiple lines of evidence, including: mutations in the Nsp1 coding region of the transmissible gastroenteritis virus (TGEV) and murine hepatitis virus (MHV) genomes cause drastic reduction or elimination of infectious virus; bovine coronavirus (BCoV) Nsp1 is an RNA-binding protein that interacts with cis-acting replication elements in the 5'-UTR of the BCoV genome, implying its potential role in the regulation of viral translation or replication; and SARS-CoV Nsp1 enhances virus replication by binding to a stem-loop structure in the 5'-UTR of its genome.	103
409337	cd21875	PEDV-like_alphaCoV_Nsp1	non-structural protein 1 from porcine epidemic diarrhea virus and similar alphacoronaviruses. This model represents the non-structural protein 1 (Nsp1) from porcine epidemic diarrhea virus (PEDV) and similar alphacoronaviruses from several subgenera including pedacovirus, setracovirus, duvinacovirus, decacovirus, colacovirus, myotacovirus, minunacovirus, and rhinacovirus. CoVs utilize a multi-subunit replication/transcription machinery assembled from a set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins. Nsp1 is the N-terminal cleavage product released from the ORF1a polyprotein by the action of papain-like protease (PLpro). Though Nsp1s of alphaCoVs and betaCoVs share structural similarity, they show no significant sequence similarity and may be considered as genus-specific markers. Despite low sequence similarity, the Nsp1s of alphaCoVs and betaCoVs exhibit remarkably similar biological functions, and are involved in the regulation of both host and viral gene expression. CoV Nsp1 induces suppression of host gene expression and interferes with host immune response. It inhibits host gene expression in two ways: by targeting the translation and stability of cellular mRNAs, and by inhibiting mRNA translation and inducing an endonucleolytic RNA cleavage in the 5'-UTR of cellular mRNAs through its tight association with the 40S ribosomal subunit, a key component of the cellular translation machinery. 
Nsp1 is critical in regulating viral replication and gene expression, as shown by multiple lines of evidence, including: mutations in the Nsp1 coding region of the transmissible gastroenteritis virus (TGEV) and murine hepatitis virus (MHV) genomes cause drastic reduction or elimination of infectious virus; bovine coronavirus (BCoV) Nsp1 is an RNA-binding protein that interacts with cis-acting replication elements in the 5'-UTR of the BCoV genome, implying its potential role in the regulation of viral translation or replication; and SARS-CoV Nsp1 enhances virus replication by binding to a stem-loop structure in the 5'-UTR of its genome.	108
409338	cd21876	betaCoV_Nsp1	non-structural protein 1 from betacoronavirus. This model represents the non-structural protein 1 (Nsp1) from betacoronaviruses, including highly pathogenic coronaviruses such as Severe acute respiratory syndrome-related coronavirus (SARS-CoV), SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV), and Middle East respiratory syndrome-related (MERS) CoV. CoVs utilize a multi-subunit replication/transcription machinery assembled from a set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins. Nsp1 is the N-terminal cleavage product released from the ORF1a polyprotein by the action of papain-like protease (PLpro). Though Nsp1s of alphaCoVs and betaCoVs share structural similarity, they show no significant sequence similarity and may be considered as genus-specific markers. Despite low sequence similarity, the Nsp1s of alphaCoVs and betaCoVs exhibit remarkably similar biological functions, and are involved in the regulation of both host and viral gene expression. CoV Nsp1 induces suppression of host gene expression and interferes with host immune response. It inhibits host gene expression in two ways: by targeting the translation and stability of cellular mRNAs, and by inhibiting mRNA translation and inducing an endonucleolytic RNA cleavage in the 5'-UTR of cellular mRNAs through its tight association with the 40S ribosomal subunit, a key component of the cellular translation machinery. 
Nsp1 is critical in regulating viral replication and gene expression, as shown by multiple lines of evidence, including: mutations in the Nsp1 coding region of the transmissible gastroenteritis virus (TGEV) and murine hepatitis virus (MHV) genomes cause drastic reduction or elimination of infectious virus; bovine coronavirus (BCoV) Nsp1 is an RNA-binding protein that interacts with cis-acting replication elements in the 5'-UTR of the BCoV genome, implying its potential role in the regulation of viral translation or replication; and SARS-CoV Nsp1 enhances virus replication by binding to a stem-loop structure in the 5'-UTR of its genome.	114
409339	cd21877	HKU9-like_Nsp1	non-structural protein 1 from Rousettus bat coronavirus HKU9 and betacoronavirus in the D lineage. This model represents the non-structural protein 1 (Nsp1) from betacoronavirus in the nobecovirus subgenus (D lineage), including Rousettus bat coronavirus HKU9. CoVs utilize a multi-subunit replication/transcription machinery assembled from a set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins. Nsp1 is the N-terminal cleavage product released from the ORF1a polyprotein by the action of papain-like protease (PLpro). Though Nsp1s of alphaCoVs and betaCoVs share structural similarity, they show no significant sequence similarity and may be considered as genus-specific markers. Despite low sequence similarity, the Nsp1s of alphaCoVs and betaCoVs exhibit remarkably similar biological functions, and are involved in the regulation of both host and viral gene expression. CoV Nsp1 induces suppression of host gene expression and interferes with host immune response. It inhibits host gene expression in two ways: by targeting the translation and stability of cellular mRNAs, and by inhibiting mRNA translation and inducing an endonucleolytic RNA cleavage in the 5'-UTR of cellular mRNAs through its tight association with the 40S ribosomal subunit, a key component of the cellular translation machinery. 
Nsp1 is critical in regulating viral replication and gene expression, as shown by multiple lines of evidence, including: mutations in the Nsp1 coding region of the transmissible gastroenteritis virus (TGEV) and murine hepatitis virus (MHV) genomes cause drastic reduction or elimination of infectious virus; bovine coronavirus (BCoV) Nsp1 is an RNA-binding protein that interacts with cis-acting replication elements in the 5'-UTR of the BCoV genome, implying its potential role in the regulation of viral translation or replication; and SARS-CoV Nsp1 enhances virus replication by binding to a stem-loop structure in the 5'-UTR of its genome.	165
409340	cd21878	MERS-CoV-like_Nsp1	non-structural protein 1 from Middle East respiratory syndrome-related coronavirus and betacoronavirus in the C lineage. This model represents the non-structural protein 1 (Nsp1) from betacoronavirus in the merbecovirus subgenus (C lineage), including Middle East respiratory syndrome-related coronavirus (MERS-CoV) and Tylonycteris bat coronavirus HKU4. CoVs utilize a multi-subunit replication/transcription machinery assembled from a set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins. Nsp1 is the N-terminal cleavage product released from the ORF1a polyprotein by the action of papain-like protease (PLpro). Though Nsp1s of alphaCoVs and betaCoVs share structural similarity, they show no significant sequence similarity and may be considered as genus-specific markers. Despite low sequence similarity, the Nsp1s of alphaCoVs and betaCoVs exhibit remarkably similar biological functions, and are involved in the regulation of both host and viral gene expression. CoV Nsp1 induces suppression of host gene expression and interferes with host immune response. It inhibits host gene expression in two ways: by targeting the translation and stability of cellular mRNAs, and by inhibiting mRNA translation and inducing an endonucleolytic RNA cleavage in the 5'-UTR of cellular mRNAs through its tight association with the 40S ribosomal subunit, a key component of the cellular translation machinery. 
Nsp1 is critical in regulating viral replication and gene expression, as shown by multiple lines of evidence, including: mutations in the Nsp1 coding region of the transmissible gastroenteritis virus (TGEV) and murine hepatitis virus (MHV) genomes cause drastic reduction or elimination of infectious virus; bovine coronavirus (BCoV) Nsp1 is an RNA-binding protein that interacts with cis-acting replication elements in the 5'-UTR of the BCoV genome, implying its potential role in the regulation of viral translation or replication; and SARS-CoV Nsp1 enhances virus replication by binding to a stem-loop structure in the 5'-UTR of its genome.	170
409341	cd21879	MHV-like_Nsp1	non-structural protein 1 from murine hepatitis virus and betacoronavirus in the A lineage. This model represents the non-structural protein 1 (Nsp1) from betacoronavirus in the embecovirus subgenus (A lineage), including murine hepatitis virus (MHV), bovine coronavirus (BCoV) and Human coronavirus HKU1. CoVs utilize a multi-subunit replication/transcription machinery assembled from a set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins. Nsp1 is the N-terminal cleavage product released from the ORF1a polyprotein by the action of papain-like protease (PLpro). Though Nsp1s of alphaCoVs and betaCoVs share structural similarity, they show no significant sequence similarity and may be considered as genus-specific markers. Despite low sequence similarity, the Nsp1s of alphaCoVs and betaCoVs exhibit remarkably similar biological functions, and are involved in the regulation of both host and viral gene expression. CoV Nsp1 induces suppression of host gene expression and interferes with host immune response. It inhibits host gene expression in two ways: by targeting the translation and stability of cellular mRNAs, and by inhibiting mRNA translation and inducing an endonucleolytic RNA cleavage in the 5'-UTR of cellular mRNAs through its tight association with the 40S ribosomal subunit, a key component of the cellular translation machinery. 
Nsp1 is critical in regulating viral replication and gene expression, as shown by multiple lines of evidence, including: mutations in the Nsp1 coding region of the transmissible gastroenteritis virus (TGEV) and MHV genomes cause drastic reduction or elimination of infectious virus; BCoV Nsp1 is an RNA-binding protein that interacts with cis-acting replication elements in the 5'-UTR of the BCoV genome, implying its potential role in the regulation of viral translation or replication; and SARS-CoV Nsp1 enhances virus replication by binding to a stem-loop structure in the 5'-UTR of its genome.	236
409329	cd21881	CoV_Nsp9	coronavirus non-structural protein 9. This model represents the non-structural protein 9 (Nsp9) from coronaviruses, including highly pathogenic betacoronaviruses such as Severe acute respiratory syndrome-related coronavirus (SARS-CoV), SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV), and Middle East respiratory syndrome-related (MERS) CoV. CoVs utilize a multi-subunit replication/transcription machinery assembled from a set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins. All of these Nsps, except for Nsp1 and Nsp2, are considered essential for transcription, replication, and translation of the viral RNA. Nsp9, with Nsp7, Nsp8, and Nsp10, localizes within the replication complex. Nsp9 is an essential single-stranded RNA-binding protein for CoV replication; it shares structural similarity with the oligosaccharide-binding (OB) fold, which is characteristic of proteins that bind to ssDNA or ssRNA. Nsp9 requires dimerization for binding and orienting RNA for subsequent use by the replicase machinery. CoV Nsp9s have diverse forms of dimerization that promote their biological function, which may help elucidate the mechanism underlying CoV replication and contribute to the development of antiviral drugs. Generally, dimers are formed via interaction of the parallel alpha-helices containing the protein-protein interaction motif GXXXG at the C-terminus; additionally, the N-finger region may also play a critical role in dimerization as seen in porcine delta coronavirus (PDCoV) Nsp9. As a member of the replication complex, Nsp9 may not have a specific RNA-binding sequence but may act in conjunction with other Nsps as a processivity factor, as shown by mutation studies indicating that Nsp9 is a key ingredient that intimately engages other proteins in the replicase complex to mediate efficient virus transcription and replication.	111
411975	cd21882	TRPV	Transient Receptor Potential channel, Vanilloid subfamily (TRPV). The vanilloid TRP subfamily (TRPV), named after the vanilloid receptor 1 (TRPV1), consists of six members: four thermo-sensing channels (TRPV1, TRPV2, TRPV3, and TRPV4) and two Ca2+ selective channels (TRPV5 and TRPV6). The calcium-selective channels TRPV5 and TRPV6 can be heterotetramers and are important for general Ca2+ homeostasis. All four channels within the TRPV1-4 group show temperature-invoked currents when expressed in heterologous cell systems, ranging from activation at ~25C for TRPV4 to ~52C for TRPV2. The structure of TRPV shows the typical topology features of all Transient Receptor Potential (TRP) ion channel family members, such as six transmembrane regions, a short hydrophobic stretch between transmembrane segments 5 and 6 and large intracellular N- and C-terminal domains. The TRP family consists of membrane proteins that function as ion channels that communicate between the cell and its environment, by a vast array of physical or chemical stimuli, including radiation (in the form of temperature, infrared, or light) and pressure (osmotic or mechanical). TRP channels are formed by a tetrameric complex of channel subunits. Based on sequence identity, the mammalian TRP channel family is classified into six subfamilies, with significant sequence similarity within the transmembrane domains, but very low similarity in their N- and C-terminal cytoplasmic regions. The six subfamilies are named based on their first member: TRPC (canonical), TRPV (vanilloid), TRPM (melastatin), TRPA (ankyrin), TRPML (mucolipin), and TRPP (polycystic).	600
409330	cd21897	alphaCoV_Nsp9	alphacoronavirus non-structural protein 9. This model represents the non-structural protein 9 (Nsp9) of alphacoronaviruses, including Porcine epidemic diarrhea virus (PEDV), Porcine transmissible gastroenteritis coronavirus (TGEV), and Human coronavirus 229E. CoVs utilize a multi-subunit replication/transcription machinery assembled from a set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins. All of these Nsps, except for Nsp1 and Nsp2, are considered essential for transcription, replication, and translation of the viral RNA. Nsp9, with Nsp7, Nsp8, and Nsp10, localizes within the replication complex. Nsp9 is an essential single-stranded RNA-binding protein for coronavirus replication; it shares structural similarity with the oligosaccharide-binding (OB) fold, which is characteristic of proteins that bind to ssDNA or ssRNA. Nsp9 requires dimerization for binding and orienting RNA for subsequent use by the replicase machinery. CoV Nsp9s have diverse forms of dimerization that promote their biological function, which may help elucidate the mechanism underlying CoV replication and contribute to the development of antiviral drugs. Generally, dimers are formed via interaction of the parallel alpha-helices containing the protein-protein interaction motif GXXXG; additionally, the N-finger region may also play a critical role in dimerization as seen in porcine delta coronavirus (PDCoV) Nsp9. As a member of the replication complex, Nsp9 may not have a specific RNA-binding sequence but may act in conjunction with other Nsps as a processivity factor, as shown by mutation studies indicating that Nsp9 is a key ingredient that intimately engages other proteins in the replicase complex to mediate efficient virus transcription and replication.	108
409331	cd21898	betaCoV_Nsp9	betacoronavirus non-structural protein 9. This model represents the non-structural protein 9 (Nsp9) from betacoronaviruses including highly pathogenic Severe acute respiratory syndrome-related coronavirus (SARS-CoV), SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV), and Middle East respiratory syndrome-related (MERS) CoV. CoVs utilize a multi-subunit replication/transcription machinery assembled from a set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins. All of these Nsps, except for Nsp1 and Nsp2, are considered essential for transcription, replication, and translation of the viral RNA. Nsp9, with Nsp7, Nsp8, and Nsp10, localizes within the replication complex. Nsp9 is an essential single-stranded RNA-binding protein for coronavirus replication; it shares structural similarity to the oligosaccharide-binding (OB) fold, which is characteristic of proteins that bind to ssDNA or ssRNA. Nsp9 requires dimerization for binding and orienting RNA for subsequent use by the replicase machinery. CoV Nsp9s have diverse forms of dimerization that promote their biological function, which may help elucidate the mechanism underlying CoVs replication and contribute to the development of antiviral drugs. Generally, dimers are formed via interaction of the parallel alpha-helices containing the protein-protein interaction motif GXXXG; additionally, the N-finger region may also play a critical role in dimerization as seen in porcine delta coronavirus (PDCoV) Nsp9. As a member of the replication complex, Nsp9 may not have a specific RNA-binding sequence but may act in conjunction with other Nsps as a processivity factor, as shown by mutation studies indicating that Nsp9 is a key ingredient that intimately engages other proteins in the replicase complex to mediate efficient virus transcription and replication.	111
409332	cd21899	gammaCoV_Nsp9	gammacoronavirus non-structural protein 9. This model represents the non-structural protein 9 (Nsp9) from gammacoronaviruses such as Avian infectious bronchitis virus (IBV). CoVs utilize a multi-subunit replication/transcription machinery assembled from a set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins. All of these Nsps, except for Nsp1 and Nsp2, are considered essential for transcription, replication, and translation of the viral RNA. Nsp9, with Nsp7, Nsp8, and Nsp10, localizes within the replication complex. Nsp9 is an essential single-stranded RNA-binding protein for coronavirus replication; it shares structural similarity to the oligosaccharide-binding (OB) fold, which is characteristic of proteins that bind to ssDNA or ssRNA. Nsp9 requires dimerization for binding and orienting RNA for subsequent use by the replicase machinery. CoV Nsp9s have diverse forms of dimerization that promote their biological function, which may help elucidate the mechanism underlying CoVs replication and contribute to the development of antiviral drugs. Generally, dimers are formed via interaction of the parallel alpha-helices containing the protein-protein interaction motif GXXXG; additionally, the N-finger region may also play a critical role in dimerization as seen in porcine delta coronavirus (PDCoV) Nsp9. As a member of the replication complex, Nsp9 may not have a specific RNA-binding sequence but may act in conjunction with other Nsps as a processivity factor, as shown by mutation studies indicating that Nsp9 is a key ingredient that intimately engages other proteins in the replicase complex to mediate efficient virus transcription and replication.	113
409333	cd21900	deltaCoV_Nsp9	deltacoronavirus non-structural protein 9. This model represents the non-structural protein 9 (Nsp9) from deltacoronaviruses such as the Porcine delta coronavirus (PDCoV), also called Porcine coronavirus HKU15. CoVs utilize a multi-subunit replication/transcription machinery assembled from a set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins. All of these Nsps, except for Nsp1 and Nsp2, are considered essential for transcription, replication, and translation of the viral RNA. Nsp9, with Nsp7, Nsp8, and Nsp10, localizes within the replication complex. Nsp9 is an essential single-stranded RNA-binding protein for coronavirus replication; it shares structural similarity to the oligosaccharide-binding (OB) fold, which is characteristic of proteins that bind to ssDNA or ssRNA. Nsp9 requires dimerization for binding and orienting RNA for subsequent use by the replicase machinery. CoV Nsp9s have diverse forms of dimerization that promote their biological function, which may help elucidate the mechanism underlying CoVs replication and contribute to the development of antiviral drugs. Generally, dimers are formed via interaction of the parallel alpha-helices containing the protein-protein interaction motif GXXXG; additionally, the N-finger region may also play a critical role in dimerization as seen in porcine delta coronavirus (PDCoV) Nsp9. As a member of the replication complex, Nsp9 may not have a specific RNA-binding sequence but may act in conjunction with other Nsps as a processivity factor, as shown by mutation studies indicating that Nsp9 is a key ingredient that intimately engages other proteins in the replicase complex to mediate efficient virus transcription and replication.	109
409326	cd21901	alpha_betaCoV_Nsp10	alphacoronavirus and betacoronavirus non-structural protein 10. This model represents the non-structural protein 10 (Nsp10) of alpha- and betacoronaviruses, including highly pathogenic betacoronaviruses such as Severe acute respiratory syndrome-related coronavirus (SARS-CoV), SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV), Middle East respiratory syndrome-related (MERS) CoV, and alphacoronaviruses such as Human coronavirus 229E. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Upon processing of the Nsp7-10 region by protease M (Mpro), the released four small proteins Nsp7, Nsp8, Nsp9, and Nsp10 form functional complexes with CoV core enzymes and thereby stimulate replication. Coronaviruses cap their mRNAs; RNA cap methylation may involve at least three proteins: Nsp10, Nsp14, and Nsp16. Nsp10 serves as a cofactor for both Nsp14 and Nsp16. Nsp14 consists of 2 domains with different enzymatic activities: an N-terminal ExoN domain and a C-terminal cap (guanine-N7) methyltransferase (N7-MTase) domain. The association of Nsp10 with Nsp14 enhances Nsp14's exoribonuclease (ExoN) activity, and not its N7-MTase activity. ExoN is important for proofreading and therefore, the prevention of lethal mutations. The Nsp10/Nsp14 complex hydrolyzes double-stranded RNA in a 3' to 5' direction as well as a single mismatched nucleotide at the 3'-end, mimicking an erroneous replication product, and may function in a replicative mismatch repair mechanism. Nsp16 Cap-0 specific (nucleoside-2'-O-)-methyltransferase (2'OMTase) acts sequentially to Nsp14 MTase in RNA capping methylation, and methylates the RNA cap at the ribose 2'-O position; it catalyzes the conversion of the cap-0 structure on m7GpppA-RNA to a cap-1 structure. The association of Nsp10 with Nsp16 enhances Nsp16's 2'OMTase activity, possibly through enhanced RNA binding affinity. Additionally, transmissible gastroenteritis virus (TGEV) Nsp10, Nsp16 and their complex can interact with DLL4, which normally binds to Notch receptors; this interaction may disturb Notch signaling. Nsp10 also binds 2 zinc ions with high affinity.	130
409327	cd21902	gammaCoV_Nsp10	gammacoronavirus non-structural protein 10. This model represents the non-structural protein 10 (Nsp10) of gammacoronaviruses, including Infectious bronchitis virus (IBV) and Bottlenose dolphin coronavirus HKU22 (BdCoV HKU22). CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Upon processing of the Nsp7-10 region by protease M (Mpro), the released four small proteins Nsp7, Nsp8, Nsp9, and Nsp10 form functional complexes with CoV core enzymes and thereby stimulate replication. Coronaviruses cap their mRNAs; RNA cap methylation may involve at least three proteins: Nsp10, Nsp14, and Nsp16. Nsp10 serves as a cofactor for both Nsp14 and Nsp16. Nsp14 consists of 2 domains with different enzymatic activities: an N-terminal ExoN domain and a C-terminal cap (guanine-N7) methyltransferase (N7-MTase) domain. The association of Nsp10 with Nsp14 enhances Nsp14's exoribonuclease (ExoN) activity, and not its N7-MTase activity. ExoN is important for proofreading and therefore, the prevention of lethal mutations. The Nsp10/Nsp14 complex hydrolyzes double-stranded RNA in a 3' to 5' direction as well as a single mismatched nucleotide at the 3'-end, mimicking an erroneous replication product, and may function in a replicative mismatch repair mechanism. Nsp16 Cap-0 specific (nucleoside-2'-O-)-methyltransferase (2'OMTase) acts sequentially to Nsp14 MTase in RNA capping methylation and methylates the RNA cap at the ribose 2'-O position; it catalyzes the conversion of the cap-0 structure on m7GpppA-RNA to a cap-1 structure. The association of Nsp10 with Nsp16 enhances Nsp16's 2'OMTase activity, possibly through enhanced RNA binding affinity. Additionally, transmissible gastroenteritis virus (TGEV) Nsp10, Nsp16 and their complex can interact with DLL4, which normally binds to Notch receptors; this interaction may disturb Notch signaling. Nsp10 also binds 2 zinc ions with high affinity.	134
409328	cd21903	deltaCoV_Nsp10	deltacoronavirus non-structural protein 10. This model represents the non-structural protein 10 (Nsp10) of deltacoronaviruses, including Thrush coronavirus HKU12-600 and Wigeon coronavirus HKU20. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Upon processing of the Nsp7-10 region by protease M (Mpro), the released four small proteins Nsp7, Nsp8, Nsp9, and Nsp10 form functional complexes with CoV core enzymes and thereby stimulate replication. Coronaviruses cap their mRNAs; RNA cap methylation may involve at least three proteins: Nsp10, Nsp14, and Nsp16. Nsp10 serves as a cofactor for both Nsp14 and Nsp16. Nsp14 consists of 2 domains with different enzymatic activities: an N-terminal ExoN domain and a C-terminal cap (guanine-N7) methyltransferase (N7-MTase) domain. The association of Nsp10 with Nsp14 enhances Nsp14's exoribonuclease (ExoN) activity, and not its N7-MTase activity. ExoN is important for proofreading and therefore, the prevention of lethal mutations. The Nsp10/Nsp14 complex hydrolyzes double-stranded RNA in a 3' to 5' direction as well as a single mismatched nucleotide at the 3'-end, mimicking an erroneous replication product, and may function in a replicative mismatch repair mechanism. Nsp16 Cap-0 specific (nucleoside-2'-O-)-methyltransferase (2'OMTase) acts sequentially to Nsp14 MTase in RNA capping methylation and methylates the RNA cap at the ribose 2'-O position; it catalyzes the conversion of the cap-0 structure on m7GpppA-RNA to a cap-1 structure. The association of Nsp10 with Nsp16 enhances Nsp16's 2'OMTase activity, possibly through enhanced RNA binding affinity. Additionally, transmissible gastroenteritis virus (TGEV) Nsp10, Nsp16 and their complex can interact with DLL4, which normally binds to Notch receptors; this interaction may disturb Notch signaling. Nsp10 also binds 2 zinc ions with high affinity.	128
409324	cd21904	TtfA-like	Mycobacterial trehalose monomycolate transport factor A and similar proteins. TtfA (trehalose monomycolate transport factor A) plays a role in the transport of trehalose monomycolate across the inner membrane, potentially by forming a complex with the atypical lipid transporter MmpL3. Trehalose monomycolate is a component of the mycobacterial envelope. The core domain of TtfA shows strong structural similarity to class I type III secretion system (T3SS) chaperones, and TtfA may play other roles besides assisting in mycolate transport, given its phylogenetic distribution.	171
409300	cd21905	PUA_TruB_thermotogae	PUA RNA-binding domain of the thermotogae tRNA pseudouridine synthase B. The RNA-binding PUA (PseudoUridine synthase and Archaeosine transglycosylase) domain was detected in a number of proteins involved in RNA metabolism. Members of the thermotogae subfamily of pseudouridine synthases TruB are modules that assist in the binding and positioning (guide and/or substrate) of RNA to the pseudouridine synthase complex. Pseudouridine synthases are enzymes that are responsible for post-transcriptional modifications of RNAs by specifically isomerizing uracil residues. The pseudouridine synthase TruB (also called tRNA pseudouridylate synthase B or Psi55 synthase) is responsible for synthesis of pseudouridine from uracil-55 in the psi GC loop of elongator tRNAs.	78
409287	cd21911	CC1_SLMAP	first coiled-coil (CC1) domain found in Sarcolemmal membrane-associated protein. Sarcolemmal membrane-associated protein (SLMAP) is a cardiac tail-anchored membrane protein that may play a role during myoblast fusion. SLMAP contains an N-terminal FHA domain followed by four coiled-coil (CC) domains and a transmembrane domain. The model corresponds to the first CC (CC1) domain that is responsible for the binding of suppressor of IKBKE 1 (SIKE1).	63
409288	cd21912	CC1_T3JAM	first coiled-coil (CC1) domain found in TRAF3-interacting JNK-activating modulator. TRAF3-interacting JNK-activating modulator (T3JAM), also called TRAF3-interacting protein 3 (TRAF3IP3), is a novel protein that specifically interacts with TRAF3 and promotes the activation of JNK. It may function as an adapter molecule that regulates TRAF3-mediated JNK activation. The model corresponds to a conserved region that shows high sequence similarity with the first CC (CC1) domain of Sarcolemmal membrane-associated protein (SLMAP), which is responsible for the binding of suppressor of IKBKE 1 (SIKE1).	45
409285	cd21913	Nip7_N_arch	N-terminal domain of archaeal 60S ribosome subunit biogenesis protein Nip7. The N-terminal domain of archaeal 60S ribosome subunit biogenesis protein Nip7 co-occurs with a PUA (PseudoUridine synthase and Archaeosine transglycosylase) RNA binding domain. Nip7 is involved in ribosome biogenesis, taking part in 27S pre-rRNA processing and in formation of the 60S ribosomal subunit. Nip7 and its homologs share a two-domain architecture with the C-terminal PUA domain mediating interaction with RNA, suggesting that Nip7 is an adaptor protein with the C-terminal domain interacting with RNA targets and the N-terminal domain mediating interaction with protein targets.	85
409275	cd21927	ZIP_TSC22D-like	leucine zipper found in the TSC22 domain leucine zipper transcription factors, c-Myc-binding protein, and similar proteins. The family includes TGF-beta-stimulated clone-22 domain (TSC22D) leucine zipper transcription factors, TSC22D1-4, as well as c-Myc-binding protein (MycBP). TSC22D proteins have diverse physiological functions, including cell growth, development, homeostasis, and immune regulation. MycBP, also called associate of Myc 1 (AMY-1), is a novel c-Myc binding protein that may control the transcriptional activity of Myc. It stimulates the activation of E box-dependent transcription by Myc. Members of this family contain a conserved leucine zipper (ZIP) domain. Its first helix is not basic and does not contain the consensus sequence, NXX(A)(A)XX(C/S)R, found in most basic region/leucine zipper (bZIP) proteins. In the bZIP family of transcription factors, the leucine zipper acts as a dimerization domain and the upstream basic region as a DNA-binding domain. However, DNA-binding capability of TSC22D family proteins is not obvious, due to the lack of the basic region found in the original bZIP DNA-binding domains. Similar to bZIP, ZIP forms homo- and heterodimers, resulting in many dimers that may have different effects on transcription.	51
409272	cd21928	LGNbd_FRMPD1_D4-like	LGN tetratricopeptide repeat-binding domain found in FERM and PDZ domain-containing proteins FRMPD1, FRMPD4, and similar proteins. The family includes FRMPD1, FRMPD4, and similar proteins. FRMPD1, also called FERM domain-containing protein 2 (FRMD2), stabilizes membrane-bound GPSM1, and thereby promotes its interaction with GNAI1. It also acts as a regulatory binding partner of Activator of G-protein Signaling 3 (AGS3). FRMPD4, also called PDZ domain-containing protein 10 (PDZD10), PDZK10, or PSD-95-interacting regulator of spine morphogenesis (Preso), is a novel PSD-95-interacting FERM and PDZ domain-containing protein that regulates dendritic spine morphogenesis. It acts as a positive regulator of dendritic spine morphogenesis and density. It is required for the maintenance of excitatory synaptic transmission. It binds phosphatidylinositol 4,5-bisphosphate. This model corresponds to a conserved region in FRMPD1 and FRMPD4 that binds to tetratricopeptide (TPR) repeats present in the N-terminal domain of adaptor protein LGN. LGN plays a crucial role in mitotic spindle orientation and cell polarization via interaction with multiple targets including FRMPD1 and FRMPD4.	37
412018	cd21930	IPD_PPP1R12	inhibitory phosphorylation domain of protein phosphatase 1 regulatory subunit 12 (PPP1R12) family. The PPP1R12 family includes PPP1R12A/MYPT1, PPP1R12B/MYPT2, and PPP1R12C. PPP1R12A/MYPT1, also called myosin phosphatase target subunit 1, or protein phosphatase myosin-binding subunit, is a substrate for the asparaginyl hydroxylase factor inhibiting hypoxia-inducible factor (FIH). It acts as a key regulator of protein phosphatase 1C (PPP1C). It mediates binding to myosin. As part of the PPP1C complex, PPP1R12A/MYPT1 is involved in dephosphorylation of PLK1. It is capable of inhibiting HIF1A inhibitor (HIF1AN)-dependent suppression of HIF1A activity. PPP1R12B/MYPT2, also called myosin phosphatase target subunit 2, is the targeting subunit of smooth-muscle myosin phosphatase that regulates myosin phosphatase activity and augments Ca(2+) sensitivity of the contractile apparatus. PPP1R12C, also called protein phosphatase 1 myosin-binding subunit of 85 kDa (MBS85), protein phosphatase 1 myosin-binding subunit p85, or LENG3, regulates myosin phosphatase activity. All family members contain an inhibitory phosphorylation domain.	47
409267	cd21931	TD_EMAP-like	trimerization domain of the echinoderm microtubule-associated protein-like family. The echinoderm microtubule-associated protein (EMAP)-like (EML) family includes EMAP-1, EMAP-2, EMAP-3, and EMAP-4. EMAP-1, also called EMAL1, EMAPL or EMAPL1, modulates the assembly and organization of the microtubule cytoskeleton, and probably plays a role in regulating the orientation of the mitotic spindle and the orientation of the plane of cell division. It is required for normal proliferation of neuronal progenitor cells in the developing brain and for normal brain development. EMAP-2, also called EML2 or EMAPL2, is a tubulin binding protein that inhibits microtubule nucleation and growth, resulting in shorter microtubules. EMAP-3, also called EML3, is a nuclear microtubule-binding protein required for the correct alignment of chromosomes in metaphase. EMAP-4, also called EML4, EMAPL4, restrictedly overexpressed proliferation-associated protein, or Ropp 120, may modify the assembly dynamics of microtubules, such that microtubules are slightly longer, but more dynamic. This model corresponds to a conserved trimerization domain located at the N-terminus of EML family members.	44
409264	cd21932	MIU2_RNF168-like	second motif interacting with ubiquitin domain found in RING finger protein 168 and similar domains. The domain family includes motif interacting with ubiquitin (MIU) domains of RING finger proteins RNF168 and RNF169. RNF168 is an E3 ubiquitin-protein ligase that promotes noncanonical K27 ubiquitination to signal DNA damage. It, together with RNF8, functions as a DNA damage response (DDR) factor that promotes monoubiquitination of H2A/H2AX at K13/15, facilitates recruitment of repair factors p53-binding protein 1 (53BP1) or the RAP80-BRCA1 complex to sites of double-strand breaks (DSBs), and inhibits homologous recombination (HR) in cells deficient in the tumor suppressor BRCA1. RNF168 also promotes H2A neddylation, which antagonizes ubiquitylation of H2A and regulates DNA damage repair. RNF169 is an uncharacterized E3 ubiquitin-protein ligase paralogous to RNF168. It functions as a negative regulator of the DNA damage signaling cascade. RNF169 recognizes polyubiquitin structures but does not itself contribute to double-strand break (DSB)-induced chromatin ubiquitylation. It contributes to the regulation of the DSB repair pathway by competing with repair factors, 53BP1 and RAP80-BRCA1, for association with RNF168-modified chromatin, limiting the magnitude of the RNF8/RNF168-dependent signaling response to DSBs. RNF168 contains an N-terminal C3HC4-type RING-HC finger that catalyzes H2A-K15ub modification and interacts with H2A, and two MIU (motif interacting with ubiquitin) domains responsible for interaction with K63 linked poly-ubiquitin. RNF169 contains an N-terminal C3HC4-type RING-HC finger and a C-terminal MIU domain. This model corresponds to the second MIU (MIU2) domain of RNF168 and the C-terminal MIU domain of RNF169, which is responsible for bridging histone and ubiquitin surfaces.	42
409261	cd21933	TBK1_IKKE-like_C	C-terminal domain of non-canonical Inhibitor of kappa B kinases, IKK-E and TBK1, and similar proteins. Inhibitor of nuclear factor kappa-B kinase subunit epsilon (IKK-E or IKK-epsilon) and TANK-binding kinase 1 (TBK1) are non-canonical members of the IKK family. They have been characterized as activators of nuclear factor-kappaB (NF-kappaB), but they are not essential for NF-kappaB activation. They play critical roles in antiviral response via phosphorylation and activation of transcription factors IRF3, IRF7, STAT1, and STAT3. They are also involved in the survival, tumorigenesis, and development of various cancers. Both IKK-epsilon and TBK1 contain an N-terminal protein kinase domain followed by a ubiquitin-like (Ubl) domain, a coiled-coil domain 1 (CCD1), and a C-terminal elongated alpha-helical domain. The model corresponds to the C-terminal elongated alpha-helical domain. It is responsible for the binding of adaptor proteins, optineurin (OPTN) and NAP1, to TBK1.	43
409276	cd21936	ZIP_TSC22D	leucine zipper domain found in the TSC22 domain family of leucine zipper transcription factors. The TGF-beta-stimulated clone-22 domain (TSC22D) family includes TSC22D1-4 and similar proteins. They have diverse physiological functions, including cell growth, development, homeostasis, and immune regulation. All family members contain a conserved leucine zipper (ZIP) domain located at the C-terminus. Its first helix is not basic and does not contain the consensus sequence, NXX(A)(A)XX(C/S)R, found in most basic region/leucine zipper (bZIP) proteins. In the bZIP family of transcription factors, the leucine zipper acts as a dimerization domain and the upstream basic region as a DNA-binding domain. However, DNA-binding capability of TSC22D family proteins is not obvious, due to the lack of the basic region found in the original bZIP DNA-binding domains. Similar to bZIP, ZIP forms homo- and heterodimers, resulting in many dimers that may have different effects on transcription.	49
409277	cd21937	ZIP_MycBP-like	leucine zipper domain found in c-Myc-binding protein and similar proteins. MycBP, also called associate of Myc 1 (AMY-1), is a novel c-Myc binding protein that may control the transcriptional activity of Myc. It stimulates the activation of E box-dependent transcription by Myc. This model corresponds to the conserved region that shows high sequence similarity with the leucine zipper (ZIP) domain located at the C-terminus of TGF-beta-stimulated clone-22 domain (TSC22D) family transcription factors. The first helix of ZIP is not basic and does not contain the consensus sequence, NXX(A)(A)XX(C/S)R, found in most basic region/leucine zipper (bZIP) proteins. Thus, the DNA-binding capability of the ZIP domain is not obvious. Similar to bZIP, ZIP forms homo- and heterodimers, resulting in many dimers that may have different effects on transcription.	53
409278	cd21938	ZIP_TSC22D1	leucine zipper domain found in TSC22 domain family protein 1. TSC22 domain family protein 1 (TSC22D1) is also called cerebral protein 2, regulatory protein TSC-22, TGFB-stimulated clone 22, or transforming growth factor beta-1-induced transcript 4 protein (TGFB1I4). It is a transcriptional repressor that was reported to be present in both the cytoplasmic and the nuclear fraction. It is activated by transforming growth factor-beta1 and other growth factors of osteoblastic cells. TSC22D1 acts on the C-type natriuretic peptide (CNP) promoter. It enhances c-Myc-mediated activation of the telomerase reverse transcriptase (TERT) promoter. This model corresponds to the conserved leucine zipper (ZIP) domain located at the C-terminus of TSC22D1. Its first helix is not basic and does not contain the consensus sequence, NXX(A)(A)XX(C/S)R, found in most basic region/leucine zipper (bZIP) proteins. Thus, the DNA-binding capability of the ZIP domain is not obvious. Similar to bZIP, ZIP forms homo- and heterodimers, resulting in many dimers that may have different effects on transcription.	79
409279	cd21939	ZIP_TSC22D2	leucine zipper domain found in TSC22 domain family protein 2. TSC22 domain family protein 2 (TSC22D2), also called transforming growth factor beta-stimulated clone 22 domain family member 2, or TSC22-related-inducible leucine zipper protein 4 (TILZ4), may participate in the regulation of cell growth. It interacts with pyruvate kinase isoform M2 (PKM2) and WD repeat domain 77 (WDR77). The model corresponds to the conserved leucine zipper (ZIP) domain located at the C-terminus of TSC22D2. Its first helix is not basic and does not contain the consensus sequence, NXX(A)(A)XX(C/S)R, found in most basic region/leucine zipper (bZIP) proteins. Thus, the DNA-binding capability of the ZIP domain is not obvious. Similar to bZIP, ZIP forms homo- and heterodimers, resulting in many dimers that may have different effects on transcription.	63
409280	cd21940	ZIP_TSC22D3	leucine zipper domain found in TSC22 domain family protein 3. TSC22 domain family protein 3 (TSC22D3) is also called DSIP-immunoreactive peptide, protein DIP, delta sleep-inducing peptide immunoreactor, glucocorticoid-induced leucine zipper protein (GILZ), TSC-22-like protein, or TSC-22-related protein (TSC-22R). It protects T-cells from IL2 deprivation-induced apoptosis through the inhibition of FOXO3A transcriptional activity that leads to the down-regulation of the pro-apoptotic factor BCL2L11. In macrophages, it plays a role in the anti-inflammatory and immunosuppressive effects of glucocorticoids and IL10. In T-cells, it inhibits anti-CD3-induced NFKB1 nuclear translocation. TSC22D3 contains a leucine zipper motif, a Pro/Glu rich domain, and three potential phosphorylation sites. This model corresponds to the leucine zipper (ZIP) domain. Its first helix is not basic and does not contain the consensus sequence, NXX(A)(A)XX(C/S)R, found in most basic region/leucine zipper (bZIP) proteins. Thus, the DNA-binding capability of the ZIP domain is not obvious. Similar to bZIP, ZIP forms homo- and heterodimers, resulting in many dimers that may have different effects on transcription.	81
409281	cd21941	ZIP_TSC22D4	leucine zipper domain found in TSC22 domain family protein 4. TSC22 domain family protein 4 (TSC22D4), also called TSC22-related-inducible leucine zipper protein 2 (TILZ2), or Tsc-22-like protein THG-1, is a transcriptional repressor that acts as a molecular determinant of insulin signalling and glucose handling. It also functions in hepatic lipid handling by regulating hepatic very-low-density-lipoprotein (VLDL) release and lipogenic gene expression. This model corresponds to the conserved leucine zipper (ZIP) domain located at the C-terminus of TSC22D4. Its first helix is not basic and does not contain the consensus sequence, NXX(A)(A)XX(C/S)R, found in most basic region/leucine zipper (bZIP) proteins. Thus, the DNA-binding capability of the ZIP domain is not obvious. Similar to bZIP, ZIP forms homo- and heterodimers, resulting in many dimers that may have different effects on transcription.	74
409273	cd21942	LGNbd_FRMPD1	LGN tetratricopeptide repeat-binding domain found in FERM and PDZ domain-containing protein 1. FERM and PDZ domain-containing protein 1 (FRMPD1), also called FERM domain-containing protein 2 (FRMD2), stabilizes membrane-bound GPSM1, and thereby promotes its interaction with GNAI1. It also acts as a regulatory binding partner of Activator of G-protein Signaling 3 (AGS3). This model corresponds to a conserved region in FRMPD1 that binds to tetratricopeptide (TPR) repeats present in the N-terminal domain of adaptor protein LGN. LGN plays a crucial role in mitotic spindle orientation and cell polarization via interaction with multiple targets including FRMPD1.	38
409274	cd21943	LGNbd_FRMPD4	LGN tetratricopeptide repeat-binding domain found in FERM and PDZ domain-containing protein 4. FRMPD4, also called PDZ domain-containing protein 10 (PDZD10), PDZK10, or PSD-95-interacting regulator of spine morphogenesis (Preso), is a novel PSD-95-interacting FERM and PDZ domain protein that regulates dendritic spine morphogenesis. It acts as a positive regulator of dendritic spine morphogenesis and density. It is required for the maintenance of excitatory synaptic transmission. It binds phosphatidylinositol 4,5-bisphosphate. FRMPD4 contains WW, PDZ and FERM domains in the N-terminal region. This model corresponds to a conserved region in the C-terminal region of FRMPD4 that binds to tetratricopeptide (TPR) repeats present in the N-terminal domain of adaptor protein LGN. LGN plays a crucial role in mitotic spindle orientation and cell polarization via interaction with multiple targets including FRMPD4.	49
412019	cd21944	IPD_MYPT1	inhibitory phosphorylation domain of myosin phosphatase targeting subunit 1 (MYPT1). MYPT1, also called protein phosphatase 1 regulatory subunit 12A (PPP1R12A), myosin phosphatase target subunit 1, or protein phosphatase myosin-binding subunit, is the targeting subunit of smooth-muscle myosin phosphatase. It is a substrate for the asparaginyl hydroxylase factor inhibiting hypoxia-inducible factor (FIH). MYPT1 acts as a key regulator of protein phosphatase 1C (PPP1C). It mediates binding to myosin. As part of the PPP1C complex, MYPT1 is involved in dephosphorylation of the mitosis regulator polo-like kinase 1 (PLK1). It is capable of inhibiting HIF1A inhibitor (HIF1AN)-dependent suppression of HIF1A activity. This model corresponds to the inhibitory phosphorylation domain of MYPT1.	57
412020	cd21945	IPD_PPP1R12C	inhibitory phosphorylation domain of protein phosphatase 1 regulatory subunit 12C (PPP1R12C). PPP1R12C, also called protein phosphatase 1 myosin-binding subunit of 85 kDa (MBS85), protein phosphatase 1 myosin-binding subunit p85, or LENG3, regulates myosin phosphatase activity. This model corresponds to a conserved region of PPP1R12C, which shows high sequence similarity to the inhibitory phosphorylation domain of MYPT1.	54
412021	cd21946	IPD_MYPT2	inhibitory phosphorylation domain of myosin phosphatase targeting subunit 2 (MYPT2). MYPT2, also called protein phosphatase 1 regulatory subunit 12B (PPP1R12B), or myosin phosphatase target subunit 2, is the targeting subunit of smooth-muscle myosin phosphatase that regulates myosin phosphatase activity and augments Ca(2+) sensitivity of the contractile apparatus. This model corresponds to the inhibitory phosphorylation domain of MYPT2.	53
409268	cd21947	TD_EMAP1	trimerization domain of echinoderm microtubule-associated protein-like 1. Echinoderm microtubule-associated protein-like 1 (EMAP-1), also called EMAL1, EMAPL, or EMAPL1, modulates the assembly and organization of the microtubule cytoskeleton, and probably plays a role in regulating the orientation of the mitotic spindle and the orientation of the plane of cell division. It is required for normal proliferation of neuronal progenitor cells in the developing brain and for normal brain development. This model corresponds to a conserved region located at the N-terminus of EMAP-1, which shows high sequence similarity with the N-terminal trimerization domain of EMAP-4 and EMAP-2.	58
409269	cd21948	TD_EMAP2	trimerization domain of echinoderm microtubule-associated protein-like 2. Echinoderm microtubule-associated protein-like 2 (EMAP-2), also called EML2 or EMAPL2, is a tubulin binding protein that inhibits microtubule nucleation and growth, resulting in shorter microtubules. This model corresponds to the N-terminal trimerization domain of EMAP-2.	48
409270	cd21949	TD_EMAP3	trimerization domain of echinoderm microtubule-associated protein-like 3. Echinoderm microtubule-associated protein-like 3 (EMAP-3), also called EML3, is a nuclear microtubule-binding protein required for the correct alignment of chromosomes in metaphase. It may modify the assembly dynamics of microtubules, such that microtubules are slightly longer, but more dynamic. This model corresponds to a conserved region located at the N-terminus of EMAP-3, which shows high sequence similarity with the N-terminal trimerization domain of EMAP-2 and EMAP-4.	48
409271	cd21950	TD_EMAP4	trimerization domain of echinoderm microtubule-associated protein-like 4. Echinoderm microtubule-associated protein-like 4 (EMAP-4), also called EML4, EMAPL4, restrictedly overexpressed proliferation-associated protein, or Ropp 120, may modify the assembly dynamics of microtubules, such that microtubules are slightly longer, but more dynamic. This model corresponds to the N-terminal trimerization domain of EMAP-4.	59
409265	cd21951	MIU_RNF169_C	C-terminal motif interacting with ubiquitin domain found in RING finger protein 169. RING finger protein 169 (RNF169) is an uncharacterized E3 ubiquitin-protein ligase paralogous to RNF168. It functions as a negative regulator of the DNA damage signaling cascade. RNF169 recognizes polyubiquitin structures but does not itself contribute to double-strand break (DSB)-induced chromatin ubiquitylation. It contributes to the regulation of the DSB repair pathway by competing with repair factors, 53BP1 and RAP80-BRCA1, for association with RNF168-modified chromatin, limiting the magnitude of the RNF8/RNF168-dependent signaling response to DSBs. RNF169 contains an N-terminal C3HC4-type RING-HC finger and a C-terminal MIU (motif interacting with ubiquitin) domain. This model corresponds to the MIU domain of RNF169, which shows high sequence similarity with the second MIU (MIU2) domain of RNF168, and is responsible for bridging histone and ubiquitin surfaces.	54
409266	cd21952	MIU2_RNF168	second motif interacting with ubiquitin domain found in RING finger protein 168. RNF168 is an E3 ubiquitin-protein ligase that promotes noncanonical K27 ubiquitination to signal DNA damage. It, together with RNF8, functions as a DNA damage response (DDR) factor that promotes monoubiquitination of H2A/H2AX at K13/15, facilitates recruitment of repair factors p53-binding protein 1 (53BP1) or the RAP80-BRCA1 complex to sites of double-strand breaks (DSBs), and inhibits homologous recombination (HR) in cells deficient in the tumor suppressor BRCA1. RNF168 also promotes H2A neddylation, which antagonizes ubiquitylation of H2A and regulates DNA damage repair. Moreover, RNF168 forms a functional complex with RAD6A or RAD6B during the DNA damage response. RNF168 contains an N-terminal C3HC4-type RING-HC finger that catalyzes H2A-K15ub modification and interacts with H2A, and two MIU (motif interacting with ubiquitin) domains responsible for the interaction with K63 linked poly-ubiquitin. This model corresponds to the second MIU (MIU2) domain of RNF168. The first MIU belongs to a different domain family and is not included here.	51
409262	cd21953	IKKE_C	C-terminal domain of inhibitor of nuclear factor kappa-B kinase subunit epsilon. Inhibitor of nuclear factor kappa-B kinase subunit epsilon (IKK-E) (EC 2.7.11.10) is also called I-kappa-B kinase epsilon, IKK-epsilon, IkBKE, inducible I kappa-B kinase, or IKK-I. It is an interferon regulatory factor-activating kinase that is a non-canonical member of the IKK family. It is involved in cellular innate immunity by inducing type I interferons. It is induced by the activation of nuclear factor-kappaB (NF-kappaB). IKK-E has also been implicated in antiviral immune response in higher vertebrates. It acts as a crucial pro-survival factor in human T cell leukemia virus type 1 (HTLV-1)-transformed T lymphocytes. Moreover, IKK-E plays an essential role in tumor initiation and progression. It inhibits protein kinase C (PKC) to promote Fascin-dependent actin bundling. IKK-E contains an N-terminal protein kinase domain followed by a ubiquitin-like (Ubl) domain, a coiled-coil domain 1 (CCD1), and a C-terminal elongated helical domain. This model corresponds to the C-terminal elongated helical domain of IKK-E that shows high sequence similarity with the C-terminal domain of TBK1, which is responsible for binding to its adaptor proteins, optineurin (OPTN) and NAP1.	48
409263	cd21954	TBK1_C	C-terminal domain of TANK-binding kinase 1. TANK-binding kinase 1 (TBK1), also called T2K and NF-kB-activating kinase, is a serine/threonine-protein kinase that is widely expressed in most cell types and acts as an IkappaB kinase (IKK)-activating kinase responsible for NF-kB activation in response to growth factors. It plays a role in modulating inflammatory responses through the NF-kB pathway. TBK1 is also a major player in innate immune responses since it functions as a virus-activated kinase necessary for establishing an antiviral state. It phosphorylates IRF-3 and IRF-7, which are important transcription factors for inducing type I interferon during viral infection. TBK1 may also play roles in cell transformation and oncogenesis. In addition, it regulates optineurin (OPTN), an important autophagy receptor involved in several selective autophagy processes. TBK1 contains N-terminal serine/threonine protein kinase, ubiquitin-like (Ubl), coiled-coil domain 1 (CCD1), and C-terminal alpha-helical domains. This model corresponds to a small conserved elongated alpha-helical domain at the C-terminus of TBK1, which is responsible for the binding of its adaptor proteins such as OPTN and NAP1.	47
409250	cd21955	SARS-CoV_ORF9b	accessory protein 9b of severe acute respiratory syndrome-associated coronavirus and similar proteins. This model represents the accessory protein 9b (ORF9b) from Severe acute respiratory syndrome-associated coronavirus (SARS-CoV) and some related betacoronaviruses such as bat coronavirus. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and replicase/protease polyproteins (ORF1ab); all are required to produce a structurally complete viral particle. In addition, CoV genomes also contain ORFs coding for accessory proteins that are specific for certain CoV lineages or for a particular CoV. In general, CoV accessory proteins are considered to be dispensable for viral replication; however, several accessory proteins have been shown to exhibit functions in virus-host interactions during CoV infection. ORF9b is a product of an alternative open reading frame within the N gene from SARS coronavirus. It is a lipid-binding protein that has been shown to associate with intracellular vesicles in mammalian cells, consistent with a role in the assembly of the virion. ORF9b localizes to mitochondria and causes mitochondrial elongation by triggering ubiquitination and proteasomal degradation of dynamin-like protein 1, a host protein involved in mitochondrial fission. It also targets the mitochondrial-associated adaptor molecule MAVS signalosome to trigger the degradation of MAVS, TRAF3, and TRAF, which severely limits host cell interferon responses. There are slight differences in the genome organization of SARS-CoV-2 in different studies; not all SARS-CoV-2 isolates are reported as having ORF9b.	89
409248	cd21963	Syt1_N	N-terminal domain of synaptotagmin-1 (Syt1) and similar proteins. Syt1, also called synaptotagmin I (SytI), or p65, is a calcium sensor that participates in triggering neurotransmitter release at the synapse. It may have a regulatory role in the membrane interactions during trafficking of synaptic vesicles at the active zone of the synapse. Syt1 binds acidic phospholipids with a specificity that requires the presence of both an acidic head group and a diacyl backbone. A Ca(2+)-dependent interaction between synaptotagmin and putative receptors for activated protein kinase C has also been reported. It can bind to at least three additional proteins in a Ca(2+)-independent manner; these are neurexins, syntaxin and AP2. Syt1 also plays a role in dendrite formation by melanocytes. The model corresponds to the N-terminal domain of Syt1, which is a recognition domain responsible for the binding of botulinum neurotoxin B (BoNT B).	108
409249	cd21964	Syt2_N	N-terminal domain of synaptotagmin-2 (Syt2) and similar proteins. Syt2, also called synaptotagmin II (SytII), exhibits calcium-dependent phospholipid and inositol polyphosphate binding properties. It may have a regulatory role in the membrane interactions during trafficking of synaptic vesicles at the active zone of the synapse. It plays a role in dendrite formation by melanocytes. The model corresponds to the N-terminal domain of Syt2, which is a recognition domain responsible for the binding of botulinum neurotoxin B (BoNT B).	111
412012	cd21965	Zn-C2H2_CALCOCO1_TAX1BP1_like	autophagy receptor zinc finger-C2H2 domain found in calcium-binding and coiled-coil domain-containing proteins, TAX1BP1 and similar proteins. The family includes calcium-binding and coiled-coil domain-containing proteins (CALCOCO1 and CALCOCO2), TAX1BP1 and similar proteins. CALCOCO1, also called calphoglin, or coiled-coil coactivator protein, or Sarcoma antigen NY-SAR-3, functions as a coactivator for aryl hydrocarbon and nuclear receptors (NR). CALCOCO2, also called antigen nuclear dot 52 kDa protein, or nuclear domain 10 protein NDP52, or nuclear domain 10 protein 52, or nuclear dot protein 52, is a ubiquitin-binding autophagy receptor involved in the selective autophagic degradation of invading pathogens. TAX1BP1, also called TRAF6-binding protein (T6BP), is a novel ubiquitin-binding adaptor protein involved in the negative regulation of the NF-kappaB transcription factor, a key player in inflammatory responses, immunity and tumorigenesis. The family also includes Drosophila melanogaster Spindle-F (Spn-F) that is the central mediator of IK2 kinase-dependent dendrite pruning in Drosophila sensory neurons. This model corresponds to the C2H2-type zinc binding domain found in family members. It is a typical C2H2-type zinc finger which specifically recognizes mono-ubiquitin or poly-ubiquitin chain. The overall ubiquitin-binding mode utilizes the C-terminal alpha-helix to interact with the solvent-exposed surface of the central beta-sheet of ubiquitin, similar to that observed in the RABGEF1/Rabex-5 or POLN/Pol-eta zinc finger.	24
412013	cd21967	Zn-C2H2_CALCOCO1	C2H2-type zinc binding domain found in calcium-binding and coiled-coil domain-containing protein 1 (CALCOCO1) and similar proteins. CALCOCO1, also called calphoglin, or coiled-coil coactivator protein, or Sarcoma antigen NY-SAR-3, functions as a coactivator for aryl hydrocarbon and nuclear receptors (NR). It is recruited to promoters through its contact with the N-terminal basic helix-loop-helix-Per-Arnt-Sim (PAS) domain of transcription factors or coactivators, such as NCOA2. During ER-activation CALCOCO1 acts synergistically in combination with other NCOA2-binding proteins, such as EP300, CREBBP and CARM1. It is involved in the transcriptional activation of target genes in the Wnt/CTNNB1 pathway. It functions as a secondary coactivator in LEF1-mediated transcriptional activation via its interaction with CTNNB1. In association with CCAR1, CALCOCO1 enhances GATA1- and MED1-mediated transcriptional activation from the gamma-globin promoter during erythroid differentiation of K562 erythroleukemia cells. CALCOCO1 contains a C2H2-type zinc binding domain.	29
412014	cd21968	Zn-C2H2_CALCOCO2	C2H2-type zinc binding domain found in calcium-binding and coiled-coil domain-containing protein 2 (CALCOCO2) and similar proteins. CALCOCO2, also called antigen nuclear dot 52 kDa protein, or nuclear domain 10 protein NDP52, or nuclear domain 10 protein 52, or nuclear dot protein 52, is a xenophagy-specific receptor required for autophagy-mediated intracellular bacteria degradation. It acts as an effector protein of galectin-sensed membrane damage that restricts the proliferation of infecting pathogens such as Salmonella typhimurium upon entry into the cytosol by targeting LGALS8-associated bacteria for autophagy. It may play a role in ruffle formation and actin cytoskeleton organization and seems to negatively regulate constitutive secretion. CALCOCO2 contains a C2H2-type zinc binding domain.	27
412015	cd21969	Zn-C2H2_TAX1BP1_rpt1	first C2H2-type zinc binding domain found in tax1-binding protein 1 (TAX1BP1) and similar proteins. TAX1BP1, also called TRAF6-binding protein (T6BP), is a novel ubiquitin-binding adaptor protein involved in the negative regulation of the NF-kappaB transcription factor, a key player in inflammatory responses, immunity and tumorigenesis. It inhibits TNF-induced apoptosis by mediating the TNFAIP3 anti-apoptotic activity. It may also play a role in the pro-inflammatory cytokine IL-1 signaling cascade. TAX1BP1 is degraded by caspase-3-like family proteins upon TNF-induced apoptosis. TAX1BP1 contains two C2H2-type zinc binding domains; this model corresponds to the first one.	24
412016	cd21970	Zn-C2H2_TAX1BP1_rpt2	second C2H2-type zinc binding domain found in tax1-binding protein 1 (TAX1BP1) and similar proteins. TAX1BP1, also called TRAF6-binding protein (T6BP), is a novel ubiquitin-binding adaptor protein involved in the negative regulation of the NF-kappaB transcription factor, a key player in inflammatory responses, immunity and tumorigenesis. It inhibits TNF-induced apoptosis by mediating the TNFAIP3 anti-apoptotic activity. It may also play a role in the pro-inflammatory cytokine IL-1 signaling cascade. TAX1BP1 is degraded by caspase-3-like family proteins upon TNF-induced apoptosis. TAX1BP1 contains two C2H2-type zinc binding domains; this model corresponds to the second one.	27
412017	cd21971	Zn-C2H2_spn-F	C2H2-type zinc binding domain found in Drosophila melanogaster Spindle-F (Spn-F) and similar proteins. Spn-F is the central mediator of IK2 kinase-dependent dendrite pruning in Drosophila sensory neurons. It acts downstream of IKK-related kinase Ik2 in the same pathway for dendrite pruning. Spn-F is a coiled-coil protein containing a C2H2-type zinc binding domain.	30
409230	cd21972	KLF1_2_4_N	N-terminal domain of Kruppel-like factor (KLF) 1, KLF2, KLF4, and similar proteins. Kruppel/Krueppel-like transcription factors (KLFs) belong to a family of proteins called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the related N-terminal domains of KLF1, KLF2, KLF4, and similar proteins.	194
409246	cd21973	KLF6_7_N-like	N-terminal domain of Kruppel-like factor (KLF) 6, KLF7, and similar proteins. This subfamily is composed of Kruppel-like factor or Krueppel-like factor (KLF) 6, KLF7, and similar proteins, including KLF Luna, a Drosophila KLF6/KLF7. KLF6 contributes to cell proliferation, differentiation, cell death and signal transduction. Hepatocyte expression of KLF6 regulates hepatic fatty acid and glucose metabolism via transcriptional activation of liver glucokinase and post-transcriptional regulation of the nuclear receptor peroxisome proliferator activated receptor alpha (PPARa). KLF7 is involved in regulation of the development and function of the nervous system and adipose tissue, type 2 diabetes, blood diseases, as well as pluripotent cell maintenance. KLF Luna is maternally required for synchronized nuclear and centrosome cycles in the preblastoderm embryo. KLF6 and KLF7 are transcriptional activators. They belong to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the related N-terminal domains of KLF6, KLF7, and similar proteins.	138
409243	cd21974	KLF10_11_N	N-terminal domain of Kruppel-like factor (KLF) 10, KLF11, and similar proteins. This subfamily is composed of Kruppel-like factor or Krueppel-like factor (KLF) 10, KLF11, and similar proteins. KLF10 was first identified in human osteoblasts and plays a role in mediating estrogen (E2) signaling in bone and skeletal homeostasis and a regulatory role in tumor formation and metastasis. KLF11 is involved in cell growth, apoptosis, cellular inflammation and differentiation, endometriosis, and cholesterol, prostaglandin, neurotransmitter, fat, and sugar metabolism. KLF9, KLF10, KLF11, KLF13, KLF14, and KLF16 share a conserved alpha-helical motif AA/VXXL that mediates their binding to Sin3A and their activities as transcriptional repressors. KLF10/11 belong to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF10, KLF11, and similar proteins.	229
409240	cd21975	KLF9_13_N-like	Kruppel-like factor (KLF) 9, KLF13, KLF14, KLF16, and similar proteins. Kruppel/Krueppel-like transcription factors (KLFs) belong to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. KLF9, KLF10, KLF11, KLF13, KLF14, and KLF16 share a conserved alpha-helical motif AA/VXXL that mediates their binding to Sin3A and their activities as transcriptional repressors. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the related N-terminal domains of KLF9, KLF13, KLF14, KLF16, and similar proteins.	163
409235	cd21976	SARS-CoV_ORF9c	accessory protein ORF9c (also referred to as ORF14) from Severe acute respiratory syndrome-associated coronavirus and related coronaviruses. This model represents the accessory protein 9c (ORF9c, also referred to as ORF14/protein 14) from Sarbecoviruses including Severe acute respiratory syndrome-associated coronavirus (SARS-CoV), SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV), and Bat SARS-like coronaviruses. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and replicase/protease polyproteins (ORF1ab); all are required to produce a structurally complete viral particle. In addition, CoV genomes also contain ORFs coding for accessory proteins that are specific for certain CoV lineages or for a particular CoV. In general, CoV accessory proteins are considered to be dispensable for viral replication; however, several accessory proteins have been shown to exhibit functions in virus-host interactions during CoV infection. ORF9c/protein 14 is a product of an alternative open reading frame (ORF) within the N gene from SARS coronavirus. A study of the SARS-CoV2-human protein-protein interaction network (including cloning, tagging and expressing SARS-CoV2 proteins in human cells followed by affinity-purification mass spectrometry) uncovered ORF9c/protein 14 interactions, including those with Sigma receptors (implicated in lipid remodeling and ER stress response), mitochondrial electron transport (ECSIT, ACAD9, NDUFAF1, NDUFB9), GPI-anchor biosynthesis (GPAA1, GIPS), and with innate immune signaling proteins (NLRX1, F2RL1, NDFIP2). A preliminary study, using a computational and knowledge-based approach to investigate the interplay between host and SARS-CoV2 in various signaling pathways, supports that SARS-CoV2 ORF9c protein may perturb host antiviral inflammatory cytokine and interferon production pathways (DOI:10.1101/2020.05.06.050260). There are slight differences in the genome organization of SARS-CoV2 in different studies; not all SARS-CoV2 isolates are reported as having ORF9c/protein 14.	70
409233	cd22054	NAC_NACA	nascent polypeptide-associated complex (NAC), alpha subunit. The nascent polypeptide-associated complex (NAC) is a complex, conserved from archaea to human, that plays an important role in co-translational targeting of nascent polypeptides to the endoplasmic reticulum (ER). In eukaryotes, under physiological conditions, the complex is a stable heterodimer of the NAC alpha subunit and the NAC beta subunit, also known as basal transcription factor 3b (BTF3b). An imbalance of the relative concentrations has been observed in diseases, like Alzheimer's, AIDS, and ulcerative colitis. NAC alpha consists of a NAC domain, also present in BTF3, and a unique C-terminal ubiquitin-associated (UBA) domain.	48
409234	cd22055	NAC_BTF3	basal transcription factor BTF3. Basal transcription factor 3 (BTF3) plays an important role in the transcriptional regulation linked to growth and development in eukaryotes. In mammals, the BTF3 gene encodes two alternative splicing isoforms, BTF3a and BTF3b. The full length BTF3a protein excites transcription. The shortened BTF3b, which lacks the first 44 amino-terminal residues, is a component of the nascent polypeptide-associated complex (NAC), involved in regulating protein localization during translation. BTF3 is involved in oncogenesis; overexpression of BTF3 has been shown to be associated with a variety of malignancies such as cancer of the colon, pancreas, stomach, prostate and breast. It is upregulated in hypopharyngeal squamous cell carcinoma (HSCC) tumors correlating with lymph node metastasis and tumor promotion, thus indicating that BTF3 is a potential therapeutic target and prognostic biomarker for HSCC. BTF3 has also been implicated in the pathogenesis of osteosarcoma (OS), a malignant cancer that affects rapidly proliferating bones, and has a poor prognosis.	117
409231	cd22056	KLF1_2_4_N-like	N-terminal domain of Kruppel-like factors with similarity to the N-terminal domains of Kruppel-like factor (KLF)1, KLF2, and KLF4. Kruppel/Krueppel-like transcription factors (KLFs) belong to a family of proteins called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domains of an unknown subfamily of KLFs, predominantly found in fish, related to the N-terminal domains of KLF1, KLF2, and KLF4.	339
409200	cd22057	WH2_WAVE	Wiskott Aldrich syndrome homology region 2 (WH2 motif) found in Wiskott-Aldrich Syndrome Protein Family members 1 (WASP1 or WAVE1), 2 (WASP2 or WAVE2) and 3 (WASP3 or WAVE3). This family contains the Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) found in three Wiskott-Aldrich syndrome protein (WASP) family verprolin homologous protein (SCAR/WAVE) isoforms: WAVE1, WAVE2, and WAVE3. Members of this family activate actin related protein (Arp)2/3-dependent actin nucleation and branching in response to signals mediated by Rho-family GTPases. The domain structure of these proteins varies, reflecting different modes of regulation; however, they all share a common C-terminal WH2 region which constitutes the smallest fragment necessary for Arp2/3 activation. These proteins interact with actin via their WH2 domain.	28
409201	cd22058	WH2_N_WASP	first and second of two tandem Wiskott Aldrich syndrome homology region 2 (WH2 motif) repeats found in Neural Wiskott-Aldrich syndrome protein (N-WASP). This family contains both tandem Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) repeats found in the Neural Wiskott-Aldrich syndrome protein (N-WASP or Neural WASP); N-WASP contains two tandem WH2 domains. N-WASP integrates various extracellular signals to control actin dynamics and cytoskeletal reorganization through activation of the actin related protein (Arp)2/3 complex. It interacts with actin via the WH2 domain. N-WASP plays an important role in the deactivation or attenuation of B cell receptor signaling. N-WASP regulates filopodia formation and membrane invagination, as compared to WAVE proteins that serve as Rac1 effectors in the formation of lamellipodia. Filopodia are thin, actin-rich surface projections that are extended and maintained by N-WASP together with CDC42. N-WASP also plays a role in the nucleus by regulating gene transcription, probably by promoting nuclear actin polymerization. It binds to HSF1/HSTF1 and forms a complex on heat shock promoter elements (HSE) that negatively regulates HSP90 expression. It also plays a role in dendrite spine morphogenesis. Unphosphorylated N-WASP is preferentially localized in the nucleus and in the cytoplasm when phosphorylated; it is exported from the nucleus by a nuclear export signal (NES)-dependent mechanism to the cytoplasm.	23
409202	cd22059	WH2_BetaT	Wiskott Aldrich syndrome homology region 2 (WH2 motif) found in beta-Thymosin, and similar proteins. This family contains beta-thymosin (betaT; also called thymosin beta or Tbeta) domain which is similar to the Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2). Proteins in the beta-thymosin family are small peptides that act as actin monomer (G-actin) sequestering factors. They bind to G-actin into a 1:1 complex, rendering G-actin resistant to polymerization into filaments (F-actin). Thymosin beta 4 (Tbeta4 or TB4) and beta10 (Tbeta10) are minor variants of betaT that bind skeletal muscle actin and inhibit actin polymerization. Thymosin beta4 can also bind to polymerized F-actin. The roles of beta-thymosins also appear to extend beyond G-actin sequestration. Thymosin beta4 has also been linked to a number of additional biological events, including angiogenesis, wound healing, inflammation, and intracellular signaling through kinase activation. Research on thymosin beta10 in breast cancer cells has suggested a relationship with actin cytoskeletal remodeling and cell motility. In addition, thymosins beta4, beta10, and beta15 are highly expressed in several tumor cells, and these have been associated with a higher metastatic potential, possibly due to their function in cell proliferation.	34
409203	cd22060	WH2_MTSS1	Wiskott Aldrich syndrome homology region 2 (WH2 motif) found in Metastasis suppressor protein 1 (MTSS-1). This family contains the first tandem Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) found in metastasis suppressor protein 1 (MTSS1, also known as missing in metastasis or MIM). MTSS1 may be related to cancer progression or tumor metastasis in a variety of organ sites, most likely through an interaction with the actin cytoskeleton. It interacts with actin via its WH2 domain. MTSS1 is a novel potential metastasis suppressor gene in several types of human cancers; its expression is down-regulated in ovarian cancer, colorectal cancer, oesophageal cancer, prostate cancer and breast cancer, whereas it has also been observed to be up-regulated in hepato-cellular carcinoma and breast cancer.	31
409204	cd22061	WH2_INF2	Wiskott Aldrich syndrome homology region 2 (WH2 motif) found in Inverted formin-2 (INF2). This family contains the first tandem Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) found in inverted formin-2 (INF2, also known as HBEBP2-binding protein C). INF2 is a formin protein with the unique ability to accelerate both actin polymerization and depolymerization, the latter requiring severing of the filament. It interacts with actin at its formin homology 2 (FH2) domain, while the WH2 domain acts as the diaphanous autoregulatory domain (DAD) and binds to actin monomers. INF2 plays a role in mitochondrial fission and dorsal stress fiber formation. It accelerates actin nucleation and elongation by interacting with the fast-growing ends (barbed ends) of actin filaments, but also accelerates disassembly of actin through encircling and severing filaments. Mutations in INF2 lead to the kidney disease focal segmental glomerulosclerosis (FSGS) and the neurological disorder Charcot-Marie-Tooth disease (CMTD).	30
409205	cd22062	WH2_DdVASP-like	Wiskott Aldrich syndrome homology region 2 (WH2 motif) found in Dictyostelium discoideum Vasodilator-stimulated phosphoprotein (VASP) and similar proteins. This family contains the Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) found in Dictyostelium discoideum vasodilator-stimulated phosphoprotein (VASP) and similar proteins. VASP belongs to the Ena/VASP protein family whose members act as actin polymerases that drive the processive elongation of filament barbed ends in membrane protrusions or at the surface of bacterial pathogens. These actin-associated proteins are involved in a range of processes dependent on cytoskeleton remodeling and cell polarity such as lamellipodial and filopodial dynamics in migrating cells. VASP plays a crucial role in filopodia formation, cell-substratum adhesion, and proper chemotaxis. It nucleates and bundles actin filaments via oligomers that use their WH2 domains to effect both the tethering of actin filaments and their processive elongation in sites of active actin assembly.	31
409206	cd22063	WH2_Actobindin	Wiskott Aldrich syndrome homology region 2 (WH2 motif) found in Actobindin and similar proteins. This family contains the Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) found in actobindin, an actin-binding protein from amoeba. Actobindin is able to bind two actin monomers at high concentrations of G-actin. It inhibits actin polymerization by sequestering G-actin and stabilizing actin dimers, thus making it a more potent inhibitor of the early phase of actin polymerization than of F-actin elongation.	29
409207	cd22064	WH2_WAS_WASL	Wiskott Aldrich syndrome homology region 2 (WH2 motif) in WAS/WASL-interacting protein (WIP). This family contains the Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) found in WAS/WASL-interacting protein family (WIPF, also known as WASP-interacting protein or WIP). Human WIP protein is proline rich and has high sequence similarity to yeast protein verprolin (included in this model). WIP forms complexes with WASP/N-WASP and modulates their function in vivo. It is involved in the regulation of endocytosis and participates in several cellular processes, some of which are relevant in cancer and may be dependent on different oncogenic stimuli. WIP interacts directly with mammalian actin-binding protein-1 (mABP1) via the SH3 domain during platelet-derived growth factor (PDGF)-mediated dorsal ruffle formation. The WIP family includes members 1 (WAS/WASL-interacting protein family member 1, or WIPF1), 2 (WIPF2) and 3 (WIPF3). Aberrant expression of WIPF1 contributes to the invasion and metastasis of several malignancies such as breast cancer, glioma and colorectal cancer; it has been identified as an oncoprotein in human pancreatic ductal adenocarcinoma (PDAC) and is associated with poor survival. WIPF2 may be an important regulator of the actin cytoskeleton. WIPF2 binds to N-WASP, regulating actin dynamics close to the plasma membrane; N-WASP in turn controls the second phase of insulin secretion through the regulation of the Arp2/3 complex. WIPF3 and LIPA (lysosomal acid lipase A) are expressed in macrophages and are involved in pathological abdominal aortic aneurysm (AAA), a serious condition of the aorta. In yeast, verprolin is involved in cytoskeletal organization and cellular growth. It may exert its effects on the cytoskeleton directly, or indirectly via proline-binding proteins, such as profilin, or via proteins possessing SH3 domains.	29
409208	cd22065	WH2_Spire_1-2_r1	first tandem Wiskott-Aldrich Syndrome Homology (WASP) region 2 (WH2 motif) repeat of protein Spire homologs 1 and 2. This family contains the first tandem Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) domain in human Spire family proteins Spire-1 (also called Spir1) and Spire-2 (Spir2) and related proteins. Spire is an actin nucleator essential for establishing an actin mesh during oogenesis. It was first identified as a Drosophila maternal effect gene essential to establishment of both the anterior/posterior and dorsal/ventral body axes in developing oocytes and embryos. It has been found to sever filaments and sequester monomers in addition to nucleating new filaments; it remains associated with the slow-growing pointed end of the new filament. Spire is involved in intracellular vesicle transport along actin fibers, providing a novel link between actin cytoskeleton dynamics and intracellular transport. It is required for asymmetric spindle positioning and asymmetric cell division during oocyte meiosis. Spire contains four tandem WH2 domains. The mammalian genome encodes two Spire proteins, namely Spire-1 and Spire-2. This model contains WH2 domain 1 of human Spire-1 and Spire-2. Major expression of both spire genes has been detected during embryogenesis in the developing nervous system. In addition, spire1 expression is found in the fetal liver, while spire2 expression is seen in early stages of intestinal development. In adult tissues, the spire2 gene shows a rather broad expression pattern, which includes the epithelial cells of the digestive tract, testicular spermatocytes, and neuronal cells of the nervous system. In contrast, spire1 is mainly expressed in neuronal cells of the nervous system. Minor expression levels were detected in testis and spleen. Spire also acts in the nucleus where, together with Spire-1 and Spire-2, it promotes assembly of nuclear actin filaments in response to DNA damage in order to facilitate movement of chromatin and repair factors after DNA damage. High levels of spire1 expression are restricted to the nervous system, oocytes, and testis. Since function of Spire-1 and Spire-2 in oocyte maturation is redundant, spire1 mutant mice are fertile, overall brain anatomy is not altered, and visual and motor functions remain normal; however, detailed behavioral studies of the spire1 mutant mice unveiled a very specific and highly significant phenotype in terms of fear learning in male mice.	32
409209	cd22066	WH2_Spire	second, third, and fourth, tandem Wiskott-Aldrich Syndrome Homology (WASP) region 2 (WH2 motif) repeats of protein Spire. This family contains the Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) repeats 2-4 in human Spire (also called Spir),  Drosophila Spire, and related proteins. Spire is an actin nucleator essential for establishing an actin mesh during oogenesis. This WH2-containing actin nucleator was first identified as a Drosophila maternal effect gene essential to establishment of both the anterior/posterior and dorsal/ventral body axes in developing oocytes and embryos. It has been found to sever filaments and sequester monomers in addition to nucleating new filaments; it remains associated with the slow-growing pointed end of the new filament. Spire is involved in intracellular vesicle transport along actin fibers, providing a novel link between actin cytoskeleton dynamics and intracellular transport. It is required for asymmetric spindle positioning and asymmetric cell division during oocyte meiosis. Spire contains four tandem WH2 domains. Several spire gene family members have been identified, including paralogs Spire-1 (Spir1) and Spire-2 (Spir2) in higher eukaryotes. Spire acts in the nucleus where, together with Spire-1 and Spire-2, it promotes assembly of nuclear actin filaments in response to DNA damage in order to facilitate movement of chromatin and repair factors after DNA damage. Spire-1 and Spire-2 encode a modified Fab1/YOTB/Vac1/EEA1 (FYVE)-type zinc finger membrane-binding domain at their C-termini that promiscuously interacts with negatively charged lipids and the interaction of these proteins with additional factors may provide the specificity for its targeting to the correct subpopulation of vesicles.	22
409210	cd22067	WH2_DmSpire_r1-like	first tandem Wiskott-Aldrich Syndrome Homology (WASP) region 2 (WH2 motif) repeat found in Drosophila melanogaster Spire, and similar proteins. This family contains the first of four tandem repeats of Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) in Drosophila melanogaster Spire (also called Spir), an actin nucleator essential for establishing an actin mesh during oogenesis. Spire was first identified as a Drosophila maternal effect gene essential to establishment of both the anterior/posterior and dorsal/ventral body axes in developing oocytes and embryos. It has been found to sever filaments and sequester monomers in addition to nucleating new filaments; it remains associated with the slow-growing pointed end of the new filament. Spire promotes dissociation of the actin nucleator Cappuccino (Capu) from the barbed end of actin filaments. Spire is involved in intracellular vesicle transport along actin fibers, providing a link between actin cytoskeleton dynamics and intracellular transport. Drosophila Spire contains four tandem WH2 domains which appear to function by determining the size of filament nuclei according to the number of WH2 repeats, suggesting that the WH2 domains of Spire line up actin subunits along a filament strand of the actin double helix, thereby generating nuclei for actin assembly. This model contains the first tandem WH2 domain of Spire (also called Spir-A or WH2-A).	27
409211	cd22068	WH2_DmSpire_r3-like	third tandem Wiskott-Aldrich Syndrome Homology (WASP) region 2 (WH2 motif) repeat found in Drosophila melanogaster Spire, and similar proteins. This family contains the third of four tandem repeats of Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) in Drosophila melanogaster Spire (also called Spir), an actin nucleator essential for establishing an actin mesh during oogenesis. Spire was first identified as a Drosophila maternal effect gene essential to establishment of both the anterior/posterior and dorsal/ventral body axes in developing oocytes and embryos. It has been found to sever filaments and sequester monomers in addition to nucleating new filaments; it remains associated with the slow-growing pointed end of the new filament. Spire promotes dissociation of the actin nucleator Cappuccino (Capu) from the barbed end of actin filaments. Spire is involved in intracellular vesicle transport along actin fibers, providing a link between actin cytoskeleton dynamics and intracellular transport. Drosophila Spire contains four tandem WH2 domains which appear to function by determining the size of filament nuclei according to the number of WH2 repeats, suggesting that the WH2 domains of Spire line up actin subunits along a filament strand of the actin double helix, thereby generating nuclei for actin assembly. This model contains the third tandem WH2 domain of Spire (also called Spir-C or WH2-C), which plays a unique role whereby two critical residues have been identified for activity for binding to actin with positive cooperativity.	26
409212	cd22069	WH2_DmSpire_r4	fourth tandem Wiskott-Aldrich Syndrome Homology (WASP) region 2 (WH2 motif) repeat found in Drosophila melanogaster Spire, and similar proteins. This family contains the fourth of four tandem repeats of Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) in Drosophila melanogaster Spire (also called Spir), an actin nucleator essential for establishing an actin mesh during oogenesis. Spire was first identified as a Drosophila maternal effect gene essential to establishment of both the anterior/posterior and dorsal/ventral body axes in developing oocytes and embryos. It has been found to sever filaments and sequester monomers in addition to nucleating new filaments; it remains associated with the slow-growing pointed end of the new filament. Spire promotes dissociation of the actin nucleator Cappuccino (Capu) from the barbed end of actin filaments. Spire is involved in intracellular vesicle transport along actin fibers, providing a link between actin cytoskeleton dynamics and intracellular transport. Drosophila Spire contains four tandem WH2 domains which appear to function by determining the size of filament nuclei according to the number of WH2 repeats, suggesting that the WH2 domains of Spire line up actin subunits along a filament strand of the actin double helix, thereby generating nuclei for actin assembly. This model contains the fourth tandem WH2 domain of Spire (also called Spir-D or WH2-D).	29
409213	cd22070	WH2_Pan1-like	Wiskott-Aldrich Syndrome Homology (WASP) region 2 (WH2 motif) domain found in Actin cytoskeleton-regulatory complex protein Pan1, and similar proteins. This family contains the Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) found in actin cytoskeleton-regulatory complex protein Pan1, and similar proteins. Pan1 actin cytoskeleton-regulatory complex is a multi-domain scaffold that is required for the internalization of endosomes during actin-coupled endocytosis. It links the site of endocytosis to the cell membrane-associated actin cytoskeleton. Pan1 mediates uptake of external molecules and vacuolar degradation of plasma membrane proteins, may play a role in the proper organization of the cell membrane-associated actin cytoskeleton, and promotes its destabilization.	22
409214	cd22071	WH2_WAVE-1	Wiskott Aldrich syndrome homology region 2 (WH2 motif) found in Wiskott-Aldrich Syndrome Protein Family Member 1 (WASP1 or WAVE1 or WASF1 or SCAR1). This family contains the Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) found in the Wiskott-Aldrich syndrome protein (WASP) relative WAVE 1 (also called WASP-family verprolin homologous protein 1 or SCAR1 or WAVE1). WAVE1 is a downstream effector protein involved in the transmission of signals from tyrosine kinase receptors and small GTPases to the actin cytoskeleton. It regulates lamellipodia formation via a hetero-pentameric WAVE regulatory complex (WRC) with additional proteins in the cell (Sra1/Cyfip1, Nap1/Hem-2, Abi and HSPC300) that regulates actin filament reorganization via its interaction with the actin related protein (Arp)2/3 complex. The WRC is stimulated by the Rac GTPase binding to CYFIP protein, allowing the release of WAVE1 from the complex.  WAVE1 then binds and activates the Arp2/3 complex via its C-terminal domain. It interacts with actin via the WH2 domain. WAVE1 has been shown to be necessary for efficient transcriptional reprogramming in Xenopus oocytes and for normal development.	75
409215	cd22072	WH2_WAVE-2	Wiskott Aldrich syndrome homology region 2 (WH2 motif) found in Wiskott-Aldrich Syndrome Protein Family Member 2 (WASP2 or WAVE2 or WASF2 or SCAR2). This family contains the Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) found in the Wiskott-Aldrich syndrome protein (WASP) relative WAVE 2 (also called WASP-family verprolin homologous protein 2 or WASF2 or SCAR2 or WAVE2). WAVE2 is a downstream effector protein involved in the transmission of signals from tyrosine kinase receptors and small GTPases to the actin cytoskeleton. It participates in multiple processes related to actin dynamics, such as lamellipodia and filopodium formation, cell migration and protrusion, and embryogenesis. It regulates lamellipodia formation via a hetero-pentameric WAVE regulatory complex (WRC) with additional proteins in the cell (Sra1/Cyfip1, Nap1/Hem-2, Abi and HSPC300) that regulates actin filament reorganization via its interaction with the actin related protein (Arp)2/3 complex. The WRC is stimulated by the Rac GTPase, kinases and phosphatidylinositols, and binds and activates the Arp2/3 complex via WAVE2 C-terminal domain. It interacts with actin via the WH2 domain. WAVE2 can also be phosphorylated by MAPK and forms a complex with PKA that regulates membrane protrusion. In mouse oocyte, WAVE2 regulates meiotic spindle stability, peripheral positioning and polar body emission, probably via an actin-mediated pathway.	30
409216	cd22073	WH2_WAVE-3	Wiskott Aldrich syndrome homology region 2 (WH2 motif) found in Wiskott-Aldrich Syndrome Protein Family Member 3 (WASP-3 or WAVE3). This family contains the Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) found in the Wiskott-Aldrich syndrome protein (WASP) relative WAVE 3 (also called WASP-family verprolin homologous protein 3 or WASF3 or SCAR3 or WAVE3). WAVE3 is a downstream effector protein involved in the transmission of signals from tyrosine kinase receptors and small GTPases to the actin cytoskeleton. It plays a role in the regulation of cell morphology and cytoskeletal organization and is required in the control of cell shape. It forms a hetero-pentameric WAVE regulatory complex (WRC) with additional proteins in the cell (Sra1/Cyfip1, Nap1/Hem-2, Abi and HSPC300) that regulates actin filament reorganization via its interaction with the actin related protein (Arp)2/3 complex. The WRC is stimulated by the Rac GTPase, kinases and phosphatidylinositols, and binds and activates the Arp2/3 complex via WAVE3 C-terminal domain. It interacts with actin via the WH2 domain. This actin polymerization process is also involved in cancer cell invasion and metastasis. WASF3 has been shown to have a central role in cancer cell invasion and metastasis; elevated WAVE3 expression promotes metastasis in breast cancer and inactivation of WAVE3 in highly metastatic breast cancer cells has been shown to suppress invasion and metastasis. WAVE3 may also be pivotal in ovarian cancer cell motility, invasion and oncogenesis. In gastric cancer patients, WAVE3 expression correlates with poor outcome. In pancreatic cancer tissues, expression is prominently higher than in normal tissues and may be associated with lymphatic metastasis and poorly differentiated tumors; findings suggest that WAVE3 influences cell proliferation, migration and invasion via the AKT pathway.	66
409217	cd22074	WH2_N-WASP_r1	first tandem Wiskott Aldrich syndrome homology region 2 (WH2 motif) repeat found in human Neural Wiskott-Aldrich syndrome protein (N-WASP) and related domains. This subfamily includes the first tandem Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) found in human Neural Wiskott-Aldrich syndrome protein (N-WASP or Neural WASP) and related domains. N-WASP integrates various extracellular signals to control actin dynamics and cytoskeletal reorganization through activation of the actin related protein (Arp)2/3 complex. It interacts with actin via the WH2 domain. N-WASP plays an important role in the deactivation or attenuation of B cell receptor signaling. N-WASP regulates filopodia formation and membrane invagination, as compared to WAVE proteins that serve as Rac1 effectors in the formation of lamellipodia. Filopodia are thin, actin-rich surface projections that are extended and maintained by N-WASP together with CDC42. N-WASP also plays a role in the nucleus by regulating gene transcription, probably by promoting nuclear actin polymerization. It binds to HSF1/HSTF1 and forms a complex on heat shock promoter elements (HSE) that negatively regulates HSP90 expression. It also plays a role in dendrite spine morphogenesis. Unphosphorylated N-WASP is preferentially localized in the nucleus and in the cytoplasm when phosphorylated; it is exported from the nucleus by a nuclear export signal (NES)-dependent mechanism to the cytoplasm.	27
409218	cd22075	WH2_hN-WASP_r2_like	second tandem Wiskott Aldrich syndrome homology region 2 (WH2 motif) repeat found in human Neural Wiskott-Aldrich syndrome protein (N-WASP) and related domains. This subfamily includes the second tandem Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) found in human Neural Wiskott-Aldrich syndrome protein (N-WASP or Neural WASP). N-WASP integrates various extracellular signals to control actin dynamics and cytoskeletal reorganization through activation of the actin related protein (Arp)2/3 complex. It interacts with actin via the WH2 domain. N-WASP plays an important role in the deactivation or attenuation of B cell receptor signaling. N-WASP regulates filopodia formation and membrane invagination, as compared to WAVE proteins that serve as Rac1 effectors in the formation of lamellipodia. Filopodia are thin, actin-rich surface projections that are extended and maintained by N-WASP together with CDC42. N-WASP also plays a role in the nucleus by regulating gene transcription, probably by promoting nuclear actin polymerization. It binds to HSF1/HSTF1 and forms a complex on heat shock promoter elements (HSE) that negatively regulates HSP90 expression. It also plays a role in dendrite spine morphogenesis. Unphosphorylated N-WASP is preferentially localized in the nucleus and in the cytoplasm when phosphorylated; it is exported from the nucleus by a nuclear export signal (NES)-dependent mechanism to the cytoplasm. This subfamily includes both tandem WH2 domains of mouse  N-WASP.	25
409219	cd22076	WH2_WAS_WASL-1	Wiskott Aldrich syndrome homology region 2 (WH2 motif) in WAS/WASL-interacting protein family member 1. This family contains the Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) in WAS/WASL-interacting protein family (WIPF, also known as WASP-interacting protein or WIP) member 1 (WIPF1). WIPF1 is a ubiquitously expressed proline-rich multidomain protein and is a binding partner and chaperone of WASP. It stabilizes actin filaments and regulates actin organization and polymerization which are associated with cell migration and invasion. Mutations in the WIPF1 binding site of WASP or in WIPF1 itself cause Wiskott-Aldrich syndrome (WAS), a rare X-linked recessive disease characterized by eczema, thrombocytopenia, immune deficiency, and bloody diarrhea. Aberrant expression of WIPF1 contributes to the invasion and metastasis of several malignancies such as breast cancer, glioma and colorectal cancer; it has been identified as an oncoprotein in human pancreatic ductal adenocarcinoma (PDAC) and is associated with poor survival.	32
409220	cd22077	WH2_WAS_WASL-2_3	Wiskott Aldrich syndrome homology region 2 (WH2 motif) in WAS/WASL-interacting protein (WIP) family members 2 and 3. This family contains the Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) in WAS/WASL-interacting protein family (WIPF, also known as WASP-interacting protein or WIP) members 2 (WIPF2; also known as WIRE/WICH) and 3 (WIPF3). WIPF2 may be an important regulator of the actin cytoskeleton. It binds to N-WASP, regulating actin dynamics close to the plasma membrane; N-WASP in turn controls the second phase of insulin secretion through the regulation of the Arp2/3 complex. Pathogenic properties of Shigella flexneri, a causative agent of intestinal infections worldwide, rely on its ability to invade the human colon where it spreads from cell to cell; WIPF2 has been shown to promote this via its contribution to the efficiency of actin-based motility in the cytosol and the resolution of the membrane protrusions into vacuoles. WIPF3 and LIPA (lysosomal acid lipase A) are expressed in macrophages and are involved in pathological abdominal aortic aneurysm (AAA), a serious condition of the aorta.	30
409221	cd22078	WH2_Spire1_r2-like	second tandem Wiskott-Aldrich Syndrome Homology (WASP) region 2 (WH2 motif) repeat of protein Spire homolog 1 (Spir1), and related proteins. This family contains the second tandem Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) domain in human Spire family protein Spire-1 (also called Spir1) and related proteins. Spire is an actin nucleator essential for establishing an actin mesh during oogenesis. It was first identified as a Drosophila maternal effect gene essential to establishment of both the anterior/posterior and dorsal/ventral body axes in developing oocytes and embryos. It has been found to sever filaments and sequester monomers in addition to nucleating new filaments; it remains associated with the slow-growing pointed end of the new filament. Spire is involved in intracellular vesicle transport along actin fibers, providing a novel link between actin cytoskeleton dynamics and intracellular transport. It is required for asymmetric spindle positioning and asymmetric cell division during oocyte meiosis. Spire contains four tandem WH2 domains. The mammalian genome encodes two Spire proteins, namely Spire-1 (Spir1) and Spire-2 (Spir2). This model contains WH2 domain 2 of human Spire-1 protein. Major expression of both spire genes has been detected during embryogenesis in the developing nervous system. In addition, spire1 expression is found in the fetal liver, and in adult tissues, spire1 is mainly expressed in neuronal cells of the nervous system. Minor expression levels were detected in testis and spleen. Spire also acts in the nucleus where, together with Spire-1 and Spire-2, it promotes assembly of nuclear actin filaments in response to DNA damage in order to facilitate movement of chromatin and repair factors after DNA damage. High levels of spire1 expression are restricted to the nervous system, oocytes, and testis. Since function of Spire-1 and Spire-2 in oocyte maturation is redundant, spire1 mutant mice are fertile, overall brain anatomy is not altered, and visual and motor functions remain normal; however, detailed behavioral studies of the spire1 mutant mice unveiled a very specific and highly significant phenotype in terms of fear learning in male mice. This family also contains the second of four tandem repeats of WH2 in Drosophila melanogaster Spire (also called Spir), an actin nucleator essential for establishing an actin mesh during oogenesis.	32
409222	cd22079	WH2_Spire2_r2	second tandem Wiskott-Aldrich Syndrome Homology (WASP) region 2 (WH2 motif) repeat of protein Spire homolog 2. This family contains the second tandem Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) domain in human Spire family protein Spire-2 (also called Spir2) and related proteins. Spire is an actin nucleator essential for establishing an actin mesh during oogenesis. It was first identified as a Drosophila maternal effect gene essential to establishment of both the anterior/posterior and dorsal/ventral body axes in developing oocytes and embryos. It has been found to sever filaments and sequester monomers in addition to nucleating new filaments; it remains associated with the slow-growing pointed end of the new filament. Spire is involved in intracellular vesicle transport along actin fibers, providing a novel link between actin cytoskeleton dynamics and intracellular transport. It is required for asymmetric spindle positioning and asymmetric cell division during oocyte meiosis. Spire contains four tandem WH2 domains. The mammalian genome encodes two Spire proteins, namely Spire-1 (Spir1) and Spire-2 (Spir2). This model contains WH2 domain 2 of human Spire-2. Major expression of both spire genes has been detected during embryogenesis in the developing nervous system. In addition, spire2 expression is seen in early stages of intestinal development. In adult tissues, the spire2 gene shows a rather broad expression pattern, which includes the epithelial cells of the digestive tract, testicular spermatocytes, and neuronal cells of the nervous system. Spire also acts in the nucleus where, together with Spire-1 and Spire-2, it promotes assembly of nuclear actin filaments in response to DNA damage in order to facilitate movement of chromatin and repair factors after DNA damage.	30
409223	cd22080	WH2_Spire1_r4	fourth tandem Wiskott-Aldrich Syndrome Homology (WASP) region 2 (WH2 motif) repeat of protein Spire homolog 1. This family contains the fourth tandem Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) domain in human Spire family protein Spire-1 (also called Spir1) and related proteins. Spire is an actin nucleator essential for establishing an actin mesh during oogenesis. It was first identified as a Drosophila maternal effect gene essential to establishment of both the anterior/posterior and dorsal/ventral body axes in developing oocytes and embryos. It has been found to sever filaments and sequester monomers in addition to nucleating new filaments; it remains associated with the slow-growing pointed end of the new filament. Spire is involved in intracellular vesicle transport along actin fibers, providing a novel link between actin cytoskeleton dynamics and intracellular transport. It is required for asymmetric spindle positioning and asymmetric cell division during oocyte meiosis. Spire contains four tandem WH2 domains. The mammalian genome encodes two Spire proteins, namely Spire-1 (Spir1) and Spire-2 (Spir2). This model contains WH2 domain 4 of Spire-1 protein. Major expression of both spire genes has been detected during embryogenesis in the developing nervous system. In addition, spire1 expression is found in the fetal liver, and in adult tissues, spire1 is mainly expressed in neuronal cells of the nervous system. Minor expression levels were detected in testis and spleen. Spire also acts in the nucleus where, together with Spire-1 and Spire-2, it promotes assembly of nuclear actin filaments in response to DNA damage in order to facilitate movement of chromatin and repair factors after DNA damage. High levels of spire1 expression are restricted to the nervous system, oocytes, and testis. Since function of Spire-1 and Spire-2 in oocyte maturation is redundant, spire1 mutant mice are fertile, overall brain anatomy is not altered, and visual and motor functions remain normal; however, detailed behavioral studies of the spire1 mutant mice unveiled a very specific and highly significant phenotype in terms of fear learning in male mice.	24
409224	cd22081	WH2_Spire2_r4	fourth tandem Wiskott-Aldrich Syndrome Homology (WASP) region 2 (WH2 motif) repeat of protein Spire homolog 2. This family contains the fourth tandem Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) domain in human Spire family protein Spire-2 (also called Spir2) and related proteins. Spire is an actin nucleator essential for establishing an actin mesh during oogenesis. It was first identified as a Drosophila maternal effect gene essential to establishment of both the anterior/posterior and dorsal/ventral body axes in developing oocytes and embryos. It has been found to sever filaments and sequester monomers in addition to nucleating new filaments; it remains associated with the slow-growing pointed end of the new filament. Spire is involved in intracellular vesicle transport along actin fibers, providing a novel link between actin cytoskeleton dynamics and intracellular transport. It is required for asymmetric spindle positioning and asymmetric cell division during oocyte meiosis. Spire contains four tandem WH2 domains. The mammalian genome encodes two Spire proteins, namely Spire-1 (Spir1) and Spire-2 (Spir2). This model contains WH2 domain 4 of Spire-2. Major expression of both spire genes has been detected during embryogenesis in the developing nervous system. In addition, spire2 expression is seen in early stages of intestinal development. In adult tissues, the spire2 gene shows a rather broad expression pattern, which includes the epithelial cells of the digestive tract, testicular spermatocytes, and neuronal cells of the nervous system. Spire also acts in the nucleus where, together with Spire-1 and Spire-2, it promotes assembly of nuclear actin filaments in response to DNA damage in order to facilitate movement of chromatin and repair factors after DNA damage.	22
409195	cd22184	Af2093-like	Archaeoglobus fulgidus Af2093 and similar proteins. This family represents the uncharacterized protein Af2093, which has no known function. The three-dimensional fold of this protein family resembles that of PDDEXK nucleases, but it lacks the typical catalytic site.	245
409225	cd22185	WH2_hVASP-like	Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) of human Vasodilator-stimulated phosphoprotein and related proteins. This family contains the Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) of Ena/VASP family members including Protein enabled homolog (also known as Mena, mammalian enabled), VASP (vasodilator-stimulated phosphoprotein) and EVL (Ena-VASP-like or Enabled VASP or Ena/VASP). These are actin-associated proteins involved in a range of processes dependent on cytoskeleton remodeling and cell polarity such as axon guidance and lamellipodial and filopodial dynamics in migrating cells, platelet activation and cell migration. Ena/VASP proteins processively elongate F-actin barbed ends, promoting dissociation of barbed end assembly antagonists (uncapping).  WH2 domains are small, widespread intrinsically disordered actin-binding peptides displaying significant sequence variability and different regulations of actin self-assembly in motile and morphogenetic processes. WH2 domains are identified by a central consensus actin-binding motif LKKT/V flanked by variable N-terminal and C-terminal extensions.	27
409226	cd22186	WH2_Spire1-2_r3	third tandem Wiskott-Aldrich Syndrome Homology (WASP) region 2 (WH2 motif) repeat of protein Spire homologs 1 and 2. This family contains the third tandem Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) domain in human Spire family protein Spire-1 (also called Spir1) and Spire-2 (Spir2) and related proteins. Spire is an actin nucleator essential for establishing an actin mesh during oogenesis. It was first identified as a Drosophila maternal effect gene essential to establishment of both the anterior/posterior and dorsal/ventral body axes in developing oocytes and embryos. It has been found to sever filaments and sequester monomers in addition to nucleating new filaments; it remains associated with the slow-growing pointed end of the new filament. Spire is involved in intracellular vesicle transport along actin fibers, providing a novel link between actin cytoskeleton dynamics and intracellular transport. It is required for asymmetric spindle positioning and asymmetric cell division during oocyte meiosis. Spire contains four tandem WH2 domains. The mammalian genome encodes two Spire proteins, namely Spire-1 (Spir1) and Spire-2 (Spir2). This model contains WH2 domain 3 of human Spire-1 and Spire-2. Major expression of both spire genes has been detected during embryogenesis in the developing nervous system. In addition, spire1 expression is found in the fetal liver, while spire2 expression is seen in early stages of intestinal development. In adult tissues, the spire2 gene shows a rather broad expression pattern, which includes the epithelial cells of the digestive tract, testicular spermatocytes, and neuronal cells of the nervous system. In contrast, spire1 is mainly expressed in neuronal cells of the nervous system. Minor expression levels were detected in testis and spleen. Spire also acts in the nucleus where, together with Spire-1 and Spire-2, it promotes assembly of nuclear actin filaments in response to DNA damage in order to facilitate movement of chromatin and repair factors after DNA damage. High levels of spire1 expression are restricted to the nervous system, oocytes, and testis. Since the functions of Spire-1 and Spire-2 in oocyte maturation are redundant, spire1 mutant mice are fertile, overall brain anatomy is not altered, and visual and motor functions remain normal; however, detailed behavioral studies of the spire1 mutant mice unveiled a very specific and highly significant phenotype in terms of fear learning in male mice.	23
409194	cd22187	asqI-like	protein asqI and similar proteins. This family includes Aspergillus nidulans tyrosinase family protein asqI (aspoquinolone biosynthesis protein I) that is part of the gene cluster that mediates the biosynthesis of the aspoquinolone mycotoxins.	322
409192	cd22188	arch-AMO_C-like	subunit C of ammonia monooxygenase (AMO) from ammonia-oxidizing archaea, and related proteins. This model contains the subunit C of ammonia monooxygenase (AMO, EC 1.14.99.39) from ammonia-oxidizing archaea including Nitrososphaera viennensis gen. nov., sp. nov (also called Nitrososphaera viennensis EN76) that contains six variants (AmoC1-AmoC6) encoded by different genes. AMO catalyzes the conversion of ammonia to hydroxylamine. Nitrososphaera viennensis EN76 AMO is composed of four subunits: AmoA, AmoB, AmoX, and one of six variants of AmoC. The AMO subunit C belongs to a family which also includes subunit C of particulate methane monooxygenase (pMMO, also known as membrane-bound MMO, EC 1.14.18.3) from methanotrophic bacteria, and AMO from ammonia-oxidizing bacteria, which are not included in this model. Compared to its bacterial counterpart, archaeal AMO C subunit is significantly shorter at the N-terminal end.	181
409190	cd22189	PGAP4-like_fungal	uncharacterized fungal proteins similar to Post-GPI attachment to proteins factor 4. This subfamily contains uncharacterized fungal proteins with similarity to animal post-GPI attachment to proteins factor 4 (PGAP4), also known as post-GPI attachment to proteins GalNAc transferase 4 or transmembrane protein 246 (TMEM246). PGAP4 has been shown to be a Golgi-resident GPI-GalNAc transferase. Many eukaryotic proteins are anchored to the cell surface through glycolipid glycosylphosphatidylinositol (GPI). GPIs have a conserved core but exhibit diverse N-acetylgalactosamine (GalNAc) modifications. PGAP4 knockout cells lose GPI-GalNAc structures. PGAP4 is most likely involved in the initial steps of GPI-GalNAc biosynthesis. In contrast to other Golgi glycosyltransferases, it contains three transmembrane domains. Proteins from this subfamily contain the putative catalytic site of PGAP4 and may have similar activities.	375
409191	cd22190	PGAP4	Post-GPI attachment to proteins factor 4. Post-GPI attachment to proteins factor 4 (PGAP4), also known as post-GPI attachment to proteins GalNAc transferase 4 or transmembrane protein 246 (TMEM246), has been shown to be a Golgi-resident GPI-GalNAc transferase. Many eukaryotic proteins are anchored to the cell surface through glycolipid glycosylphosphatidylinositol (GPI). GPIs have a conserved core but exhibit diverse N-acetylgalactosamine (GalNAc) modifications. PGAP4 knockout cells lose GPI-GalNAc structures. PGAP4 is most likely involved in the initial steps of GPI-GalNAc biosynthesis. In contrast to other Golgi glycosyltransferases (GTs), it contains three transmembrane domains. Structural modeling suggests that PGAP4 adopts a GT-A fold split by an insertion of tandem transmembrane domains.	379
409004	cd22191	DPBB_RlpA_EXP_N-like	double-psi beta-barrel fold of RlpA, N-terminal domain of expansins, and similar domains. The double-psi beta-barrel (DPBB) fold is found in a divergent group of proteins, including endolytic peptidoglycan transglycosylase RlpA (rare lipoprotein A), EG45-like domain containing proteins, kiwellins, Streptomyces papain inhibitor (SPI), and the N-terminal domain of plant and bacterial expansins. RlpA may work in tandem with amidases to degrade peptidoglycan (PG) in the division septum and lateral wall to facilitate daughter cell separation. An EG45-like domain containing protein from Arabidopsis thaliana, called plant natriuretic peptide A (AtPNP-A), functions in cell volume regulation. Kiwellin proteins comprise a widespread family of plant-defense proteins that target pathogenic bacterial/fungal effectors that down-regulate plant defense responses. SPI is a stress protein produced under hyperthermal stress conditions that serves as a glutamine and lysine donor substrate for microbial transglutaminase (MTG, EC 2.3.2.13) from Streptomycetes. Some expansin family proteins display cell wall loosening activity and are involved in cell expansion and other developmental events during which cell wall modification occurs.	94
411976	cd22192	TRPV5-6	Transient Receptor Potential channel, Vanilloid subfamily (TRPV), types 5 and 6. TRPV5 and TRPV6 (TRPV5/6) are two homologous members within the vanilloid subfamily of the transient receptor potential (TRP) family. TRPV5 and TRPV6 show only 30-40% homology with other members of the TRP family and have unique properties that differentiate them from other TRP channels. They mediate calcium uptake in epithelia and their expression is dramatically increased in numerous types of cancer. The structure of TRPV5/6 shows the typical topology features of all TRP family members, such as six transmembrane regions, a short hydrophobic stretch between transmembrane segments 5 and 6, which is predicted to form the Ca2+ pore, and large intracellular N- and C-terminal domains. The N-terminal domain of TRPV5/6 contains three ankyrin repeats. This structural element is present in several proteins and plays a role in protein-protein interactions. The N- and C-terminal tails of TRPV5/6 each contain an internal PDZ motif which can function as part of a molecular scaffold via interaction with PDZ-domain containing proteins. A major difference between the properties of TRPV5 and TRPV6 is in their tissue distribution: TRPV5 is predominantly expressed in the distal convoluted tubules (DCT) and connecting tubules (CNT) of the kidney, with limited expression in extrarenal tissues. In contrast, TRPV6 has a broader expression pattern such as expression in the intestine, kidney, placenta, epididymis, exocrine tissues, and a few other tissues.	609
411977	cd22193	TRPV1-4	Transient Receptor Potential channel, Vanilloid subfamily (TRPV), types 1-4. TRPV1-4 are thermo-sensing channels that function directly in temperature-sensing and nociception; they share substantial structural and functional properties. Transient Receptor Potential (TRP) ion channels activated by temperature (thermo TRPs) are important molecular players in acute, inflammatory, and chronic pain states. So far, 11 TRP channels in mammalian cells have been identified as thermosensitive TRP (thermo-TRP) channels. TRPV1-4 channels are activated by different heat temperatures, for example, TRPV1 and TRPV2 are activated by high temperatures (>43C and >55C, respectively). TRPV1-4 belong to the vanilloid TRP subfamily (TRPV), named after the founding member vanilloid receptor 1 (TRPV1). The structure of TRPV shows the typical topology features of all TRP ion channel family members, such as six transmembrane regions, a short hydrophobic stretch between transmembrane segments 5 and 6 and large intracellular N- and C-terminal domains.	607
411978	cd22194	TRPV3	Transient Receptor Potential channel, Vanilloid subfamily (TRPV), type 3. TRPV3 is a temperature-sensitive Transient Receptor Potential (TRP) ion channel that is activated by warm temperatures, synthetic small-molecule chemicals, and natural compounds from plants. TRPV3 function is regulated by physiological factors such as extracellular divalent cations and acidic pH, intracellular adenosine triphosphate, membrane voltage, and arachidonic acid. It is expressed in both neuronal and non-neuronal tissues including epidermal keratinocytes, epithelial cells in the gut, endothelial cells in blood vessels, and neurons in dorsal root ganglia and CNS. TRPV3 null mice have abnormal hair morphogenesis and compromised skin barrier function. It may play roles in inflammatory skin disorders, such as itch and pain sensation. TRPV3 is also expressed by many neuronal and non-neuronal tissues, showing that TRPV3 might play roles in other unknown cellular and physiological functions. TRPV3 belongs to the vanilloid TRP subfamily (TRPV), named after the founding member vanilloid receptor 1 (TRPV1). The structure of TRPV shows the typical topology features of all TRP ion channel family members, such as six transmembrane regions, a short hydrophobic stretch between transmembrane segments 5 and 6 and large intracellular N- and C-terminal domains.	680
411979	cd22195	TRPV4	Transient Receptor Potential channel, Vanilloid subfamily (TRPV), type 4. TRPV4 is expressed broadly in neuronal and non-neuronal cells. It is activated by various stimuli, including hypo-osmolarity, warm temperature, and chemical ligands. TRPV4 acts in physiological functions such as osmoregulation and thermoregulation. It also has a role in mechanosensation in the vascular endothelium and urinary tract, and in cell barrier formation in vascular and epidermal tissues. Knockout mice studies suggested the functional importance of TRPV4 in the central nervous system, nociception, and bone formation. TRPV4 belongs to the vanilloid TRP subfamily (TRPV), named after the founding member vanilloid receptor 1 (TRPV1). The structure of TRPV shows the typical topology features of all Transient Receptor Potential (TRP) ion channel family members, such as six transmembrane regions, a short hydrophobic stretch between transmembrane segments 5 and 6 and large intracellular N- and C-terminal domains.	733
411980	cd22196	TRPV1	Transient Receptor Potential channel, Vanilloid subfamily (TRPV), type 1. Vanilloid receptor 1 (TRPV1), a capsaicin (vanilloid) receptor, is the founding member of the vanilloid TRP subfamily (TRPV). In humans, it is expressed in the brain, kidney, pancreas, testis, uterus, spleen, stomach, small intestine, lung and liver. TRPV1 has been implicated in thermo-sensation (heat), autonomic thermoregulation, nociception, food intake regulation, and multiple functions in the gastrointestinal (GI) tract. The receptor has also been implicated in growth cone guidance, long-term depression, endocannabinoid signaling and osmosensing in the central nervous system. TRPV1 is upregulated in several human pathological conditions including vulvodynia, GI inflammation, Crohn's disease and ulcerative colitis. TRPV1 knock-out mice exhibit impaired sensation to thermal-mechanical acute pain. The structure of TRPV shows the typical topology features of all Transient Receptor Potential (TRP) ion channel family members, such as six transmembrane regions, a short hydrophobic stretch between transmembrane segments 5 and 6 and large intracellular N- and C-terminal domains.	649
411981	cd22197	TRPV2	Transient Receptor Potential channel, Vanilloid subfamily (TRPV), type 2. TRPV2 is closely related to TRPV1, sharing high sequence identity (>50%), but TRPV2 shows a higher temperature threshold and sensitivity for activation than TRPV1. TRPV2 can be stimulated by ligands or lipids, and is involved in osmosensation and mechanosensation. TRPV2 is expressed in both neuronal and non-neuronal tissues, and it has been implicated in diverse physiological and pathophysiological processes, including cardiac-structure maintenance, innate immunity, and cancer. TRPV2 belongs to the vanilloid TRP subfamily (TRPV), named after the founding member vanilloid receptor 1 (TRPV1). The structure of TRPV shows the typical topology features of all Transient Receptor Potential (TRP) ion channel family members, such as six transmembrane regions, a short hydrophobic stretch between transmembrane segments 5 and 6 and large intracellular N- and C-terminal domains.	640
409188	cd22198	CH_MICAL_EHBP-like	calponin homology (CH) domain found in the MICAL and EHBP families. This group is composed of the molecule interacting with CasL protein (MICAL) and EH domain-binding protein (EHBP) families. MICAL is a large, multidomain, cytosolic protein with a single LIM domain, a calponin homology (CH) domain and a flavoprotein monooxygenase (MO) domain. In Drosophila, MICAL is expressed in axons, interacts with the neuronal plexin A (PlexA) receptor and is required for Semaphorin 1a (Sema-1a)-PlexA-mediated repulsive axon guidance. The LIM and CH domains mediate interactions with the cytoskeleton, cytoskeletal adaptor proteins, and other signaling proteins. The flavoprotein MO is required for semaphorin-plexin repulsive axon guidance during axonal pathfinding in the Drosophila neuromuscular system. The EHBP family includes EHBP1 and EHBP1-like protein (EHBP1L1). EHBP1 is a regulator of endocytic recycling and may play a role in actin reorganization by linking clathrin-mediated endocytosis to the actin cytoskeleton. It may act as an effector of small GTPases, including RAB-10 (Rab10), and play a role in vesicle trafficking. EHBP proteins contain a single CH domain. CH domains are actin filament (F-actin) binding motifs.	105
412062	cd22200	NRDE2_MID	MTR4-interacting domain (MID) found in nuclear exosome regulator NRDE2 and similar proteins. NRDE2 is a protein of the nuclear speckles that regulates RNA degradation and export from the nucleus through its interaction with MTREX, an essential factor directing various RNAs to exosomal degradation. NRDE2 negatively regulates exosome functions by inhibiting the RNA helicase MTR4 recruitment and exosome interaction. This model corresponds to the N-terminal MTR4-interacting domain (MID) of NRDE2.	99
412063	cd22201	cubilin_NTD	N-terminal domain of cubilin and similar proteins. Cubilin (CUBN, also called 460 kDa receptor, intestinal intrinsic factor receptor, intrinsic factor-cobalamin receptor, or intrinsic factor-vitamin B12 receptor) is an endocytic receptor which plays a role in lipoprotein, vitamin and iron metabolism by facilitating their uptake. It acts together with the 45-kDa transmembrane protein amnionless (AMN) to mediate endocytosis of the cobalamin (vitamin B12) binding intrinsic factor (CBLIF)-cobalamin complex. This model corresponds to the N-terminal domain of cubilin, which is responsible for the interaction with AMN. The cubilin interface with AMN is formed by the N-terminal strands of three cubilin chains.	129
409026	cd22204	H1_KCTD12-like	H1 domain found in potassium channel tetramerization domain-containing proteins. The H1 domain is found in potassium channel tetramerization domain-containing proteins such as KCTD8, KCTD12 (also called predominantly fetal expressed T1 domain/Pfetin), KCTD12b, and KCTD16. They serve as auxiliary gamma-aminobutyric acid type B (GABA-B) receptor subunits that constitute receptor subtypes with distinct functional properties. KCTD12 and -12b generate desensitizing receptor responses while KCTD8 and -16 generate largely non-desensitizing receptor responses. They control GABA-B signaling and  regulate the rise time and duration of G protein-coupled inwardly rectifying potassium (GIRK) currents, as well as enhance receptor expression levels. KCTD12 regulates agonist potency and kinetics of GABA-B receptor signaling. It promotes tumorigenesis by facilitating CDC25B/CDK1/Aurora A-dependent G2/M transition. KCTD16 interacts with amyloid beta precursor protein (APP), a type I transmembrane protein involved in a variety of cellular processes such as cell adhesion and axon guidance. Members of this family consist of an N-terminal BTB domain followed by a region called the H1 domain. The BTB domain mediates interaction with the receptor. The C-terminal H1 domain, which possesses a beta-propeller-like fold, engages in interactions with G-protein beta-gamma subunits and is responsible for desensitization. This model corresponds to the H1 domain.	118
412064	cd22207	pseudoGTPaseD_p190RhoGAP	pseudoGTPase domain found in the family of p190RhoGAP. This family includes two p190RhoGAP proteins, A and B, which are Rho family GTPase-activating proteins (GAPs) that act as key regulators of Rho GTPase signaling and are essential for actin cytoskeletal structure and contractility. Rho family is one of five Ras superfamily subgroups (Ras, Rho, Rab, Ran and Arf). Each contains five highly conserved sequence motifs, termed 'G-motifs', required for nucleotide-binding and catalytic activity. PseudoGTPases consist of a GTPase fold lacking one or more of these G motifs. This model corresponds to the GTPase-like domain called pseudoGTPase domain that is located at the middle region of p190RhoGAP proteins.	166
412067	cd22209	EMC10	ER membrane protein complex subunit 10 and similar proteins. Endoplasmic reticulum (ER) membrane protein complex subunit 10 (EMC10), also called hematopoietic signal peptide-containing membrane domain-containing protein 1 (HSM1) or INM02, is a bone marrow-derived angiogenic growth factor promoting angiogenesis and tissue repair in the heart after myocardial infarction. It stimulates cardiac endothelial cell migration and outgrowth via the activation of p38 MAPK, PAK, and MAPK2 signaling pathways. Yeast EMC10 is a non-essential component of the ER membrane protein complex (EMC), which may be involved in ER-associated degradation (ERAD) and proper assembly of multi-pass transmembrane proteins.	141
408999	cd22210	HD_XRCC4-like_N	N-terminal head domain found in the XRCC4 superfamily of proteins. The XRCC4 superfamily includes five families: XRCC4, XLF, PAXX, SAS6 and CCDC61. XRCC4 (X-ray repair cross-complementing protein 4), XLF (XRCC4-like factor) and PAXX (paralog of XRCC4 and XLF) play crucial roles in the non-homologous end-joining (NHEJ) DNA repair pathway. SAS6 (spindle assembly abnormal protein 6) and CCDC61 (coiled-coil domain-containing protein 61) have a centrosomal/centriolar function. Members of this superfamily have an N-terminal globular head domain, a centrally located coiled-coil, and a C-terminal low-complexity region. They form homodimers through two homodimerization domains: an N-terminal globular head domain and a parallel coiled-coil domain. In addition, some members such as XRCC4 and XLF form symmetric heterodimers that interact through their globular head domains at the opposite end of the homodimer interface, and may form XLF-XRCC4 filaments. This model corresponds to the N-terminal head domain of XRCC4 superfamily proteins.	115
411792	cd22211	HkD_SF	Hook domain-containing proteins superfamily. The Hook domain superfamily includes Hook adaptor proteins, Hook-related proteins and nuclear mitotic apparatus protein (NuMA). They share an N-terminal conserved globular Hook domain, which folds as a variant of the helical calponin homology (CH) domain with an extended alpha-helix. The Hook domain is responsible for binding microtubules. The Hook family includes microtubule-binding proteins, Hook1-3. Hook1 is required for spermatid differentiation. Hook2 contributes to the establishment and maintenance of centrosome function. Hook3 is an adaptor protein for microtubule-dependent intracellular vesicle and protein trafficking, and is involved in Golgi and endosome transport. Hook proteins are components of the FTS/Hook/FHIP complex (FHF complex), which may function to promote vesicle trafficking and/or fusion via the homotypic vesicular protein sorting (HOPS) complex. The Hook-related protein (HkRP) family includes Daple, Girdin and Gipie. Daple, also called Dvl-associating protein with a high frequency of leucine residues, or coiled-coil domain-containing protein 88C (CCDC88C), or Hook-related protein 2 (HkRP2), is a novel non-receptor guanine nucleotide exchange factor (GEF) required for activation of guanine nucleotide-binding proteins (G-proteins) during non-canonical Wnt signaling. Girdin, also called Akt phosphorylation enhancer (APE), or coiled-coil domain-containing protein 88A (CCDC88A), or G alpha-interacting vesicle-associated protein (GIV), or Girders of actin filament, or Hook-related protein 1 (HkRP1), is a bifunctional modulator of guanine nucleotide-binding proteins (G proteins). Gipie, also called GRP78-interacting protein induced by ER stress, or coiled-coil domain-containing protein 88B (CCDC88B), or brain leucine zipper domain-containing protein, or Hook-related protein 3 (HkRP3), is a novel actin cytoskeleton-binding protein and Akt substrate that regulates cell migratory responses in various biological contexts. NuMA, also called nuclear mitotic apparatus protein 1, or nuclear matrix protein-22 (NMP-22), or SP-H antigen, is a microtubule (MT)-binding protein that plays a role in the formation and maintenance of the spindle poles and the alignment and the segregation of chromosomes during mitotic cell division.	145
412068	cd22212	NDFIP-like	NEDD4 family-interacting protein. The NEDD4 (neural precursor cell expressed, developmentally down-regulated protein 4)-family interacting proteins (NDFIPs) are adaptor proteins that recruit NEDD4 E3 ligases to specific substrate proteins, which leads to the ubiquitylation and subsequent degradation of these proteins. They also act as activators of the E3 ligase activity by releasing NEDD4 ligase from its auto-inhibitory conformation. NDFIP1/2 have been shown to be involved in neural development by regulating the expression of the Robo1 receptor.	171
409027	cd22216	H1_KCTD12b	H1 domain found in potassium channel tetramerization domain-containing protein 12b. Potassium channel tetramerization domain-containing protein 12b (KCTD12b) is a BTB/POZ domain-containing protein that is an auxiliary subunit of gamma-aminobutyric acid type B (GABA-B) receptors associated with mood disorders. KCTD12b consists of an N-terminal BTB domain followed by a region called the H1 domain. The BTB domain mediates interaction with the receptor. The C-terminal H1 domain, which possesses a beta-propeller-like fold, engages in interactions with G-protein beta-gamma subunits and is responsible for desensitization. This model corresponds to the H1 domain.	118
409028	cd22217	H1_KCTD12	H1 domain found in potassium channel tetramerization domain-containing protein 12. Potassium channel tetramerization domain-containing protein 12 (KCTD12), also called predominantly fetal expressed T1 domain (Pfetin), is a BTB/POZ domain-containing protein that is an auxiliary subunit of gamma-aminobutyric acid type B (GABA-B) receptors associated with mood disorders. It regulates agonist potency and kinetics of GABA-B receptor signaling. It promotes tumorigenesis by facilitating CDC25B/CDK1/Aurora A-dependent G2/M transition. It also regulates colorectal cancer cell stemness through the ERK pathway. KCTD12 consists of an N-terminal BTB domain followed by a region called the H1 domain. The BTB domain mediates interaction with the receptor. The C-terminal H1 domain, which possesses a beta-propeller-like fold, engages in interactions with G-protein beta-gamma subunits and is responsible for desensitization. This model corresponds to the H1 domain.	119
409029	cd22218	H1_KCTD8	H1 domain found in potassium channel tetramerization domain-containing protein 8. Potassium channel tetramerization domain-containing protein 8  (KCTD8), a BTB/POZ domain-containing protein, is an auxiliary subunit of gamma-aminobutyric acid type B (GABA-B) receptors that determine the pharmacology and kinetics of the receptor response. It generates largely non-desensitizing receptor responses. KCTD8 consists of an N-terminal BTB domain followed by a region called the H1 domain. The BTB domain mediates interaction with the receptor. The C-terminal H1 domain, which possesses a beta-propeller-like fold, engages in interactions with G-protein beta-gamma subunits. In the related protein KCTD12, the H1 domain is also responsible for desensitization. This model corresponds to the H1 domain of KCTD8, which may not be involved in desensitization.	122
409030	cd22219	H1_KCTD16	H1 domain found in potassium channel tetramerization domain-containing protein 16. Potassium channel tetramerization domain-containing protein 16 (KCTD16) is a BTB/POZ domain-containing protein that is an auxiliary subunit of gamma-aminobutyric acid type B (GABA-B) receptors associated with mood disorders. It interacts with amyloid beta precursor protein (APP), a type I transmembrane protein involved in a variety of cellular processes such as cell adhesion and axon guidance. KCTD16 generates largely non-desensitizing receptor responses. It consists of an N-terminal BTB domain followed by a region called the H1 domain. The BTB domain mediates interaction with the receptor. The C-terminal H1 domain, which possesses a beta-propeller-like fold, engages in interactions with G-protein beta-gamma subunits. In the related protein KCTD12, the H1 domain is also responsible for desensitization. This model corresponds to the H1 domain of KCTD16, which may not be involved in desensitization.	121
412065	cd22220	pseudoGTPaseD_p190RhoGAP-B	pseudoGTPase domain found in p190RhoGAP-B and similar proteins. p190RhoGAP protein B (p190RhoGAP-B), also called ARHGAP5, or p190-B, or Rho-type GTPase-activating protein 5 (RHOGAP5), is a Rho family GTPase-activating protein (GAP) that acts as a key regulator of Rho GTPase signaling and is essential for actin cytoskeletal structure and contractility. This model corresponds to the GTPase-like domain called pseudoGTPase domain that is located at the middle region of p190RhoGAP-B. Rho family GTPase-activating proteins normally have five highly conserved sequence motifs, termed 'G-motifs', required for nucleotide-binding and catalytic activity. PseudoGTPases consist of a GTPase fold lacking one or more of these G motifs.	171
412066	cd22221	pseudoGTPaseD_p190RhoGAP-A	pseudoGTPase domain found in p190RhoGAP-A and similar proteins. p190RhoGAP protein A (p190RhoGAP-A), also called Rho GTPase-activating protein 35 (RHOGAP35), glucocorticoid receptor DNA-binding factor 1, or glucocorticoid receptor repression factor 1 (GRF-1), or Rho GAP p190A, or p190-A, is a Rho family GTPase-activating protein (GAP) that acts as a key regulator of Rho GTPase signaling and is essential for actin cytoskeletal structure and contractility. It binds several acidic phospholipids which inhibits the Rho GAP activity to promote the Rac GAP activity. This model corresponds to the GTPase-like domain called pseudoGTPase domain that is located at the middle region of p190RhoGAP-A. Rho family GTPase-activating proteins normally have five highly conserved sequence motifs, termed 'G-motifs', required for nucleotide-binding and catalytic activity. PseudoGTPases consist of a GTPase fold lacking one or more of these G motifs.	172
411793	cd22222	HkD_Hook	Hook domain found in Hook family of microtubule-binding proteins. The Hook family includes Hook1-3. Hook1 is a microtubule-binding protein required for spermatid differentiation. Hook2, also a microtubule-binding protein, contributes to the establishment and maintenance of centrosome function. It may function in the positioning or formation of aggresomes, which are pericentriolar accumulations of misfolded proteins, proteasomes and chaperones. Hook3 is an adaptor protein for microtubule-dependent intracellular vesicle and protein trafficking. It is involved in Golgi and endosome transport. It acts as a scaffold for the opposite-polarity microtubule-based motors cytoplasmic dynein-1 and the kinesin KIF1C. It may participate in the turnover of the endocytosed scavenger receptor. Hook proteins are components of the FTS/Hook/FHIP complex (FHF complex), which may function to promote vesicle trafficking and/or fusion via the homotypic vesicular protein sorting (HOPS) complex. Hook adaptor proteins share an N-terminal conserved globular Hook domain, which folds as a variant of the helical calponin homology (CH) domain, and contacts the helix alpha1 of dynein light intermediate chain 1 (LIC1) in a hydrophobic groove.	147
411794	cd22223	HkD_HkRP	Hook domain found in the Hook-related protein (HkRP) family. The HkRP family includes Daple, Girdin and Gipie. Daple, also called Dvl-associating protein with a high frequency of leucine residues, or coiled-coil domain-containing protein 88C (CCDC88C), or Hook-related protein 2 (HkRP2), is a novel non-receptor guanine nucleotide exchange factor (GEF) required for activation of guanine nucleotide-binding proteins (G-proteins) during non-canonical Wnt signaling. Girdin, also called Akt phosphorylation enhancer (APE), or coiled-coil domain-containing protein 88A (CCDC88A), or G alpha-interacting vesicle-associated protein (GIV), or Girders of actin filament, or Hook-related protein 1 (HkRP1), is a bifunctional modulator of guanine nucleotide-binding proteins (G proteins). It acts as a non-receptor guanine nucleotide exchange factor which binds to and activates guanine nucleotide-binding protein G(i) alpha subunits. It also acts as a guanine nucleotide dissociation inhibitor for guanine nucleotide-binding protein G(s) subunit alpha GNAS. In addition, Girdin plays an essential role in cell migration. Gipie, also called GRP78-interacting protein induced by ER stress, or coiled-coil domain-containing protein 88B (CCDC88B), or brain leucine zipper domain-containing protein, or Hook-related protein 3 (HkRP3), is a novel actin cytoskeleton-binding protein and Akt substrate that regulates cell migratory responses in various biological contexts. It acts as a positive regulator of T-cell maturation and inflammatory function. As a microtubule-binding protein, Gipie regulates lytic granule clustering and NK cell killing. All family members contain a conserved globular Hook domain which folds as a variant of the helical calponin homology (CH) domain.	149
411795	cd22224	HkD_NuMA	Hook domain found in nuclear mitotic apparatus protein (NuMA) and similar proteins. NuMA, also called nuclear mitotic apparatus protein 1, or nuclear matrix protein-22 (NMP-22), or SP-H antigen, is a microtubule (MT)-binding protein that plays a role in the formation and maintenance of the spindle poles, and the alignment and segregation of chromosomes during mitotic cell division. The model corresponds to the N-terminal conserved globular Hook domain of NuMA, which folds as a variant of the helical calponin homology (CH) domain. It directly binds dynein light intermediate chains LIC1 and LIC2 through a conserved hydrophobic patch shared among other Hook adaptors.	148
411796	cd22225	HkD_Hook1	Hook domain found in protein Hook 1 (Hook1) and similar proteins. Hook1 is a microtubule-binding protein required for spermatid differentiation. It is a component of the FTS/Hook/FHIP complex (FHF complex), which may function to promote vesicle trafficking and/or fusion via the homotypic vesicular protein sorting (HOPS) complex.	150
411797	cd22226	HkD_Hook3	Hook domain found in protein Hook 3 (Hook3) and similar proteins. Hook3 is an adaptor protein for microtubule-dependent intracellular vesicle and protein trafficking. It is involved in Golgi and endosome transport. It acts as a scaffold for the opposite-polarity microtubule-based motors cytoplasmic dynein-1 and the kinesin KIF1C. It may participate in the turnover of the endocytosed scavenger receptor. Hook3 is a component of the FTS/Hook/FHIP complex (FHF complex), which may function to promote vesicle trafficking and/or fusion via the homotypic vesicular protein sorting (HOPS) complex.	153
411798	cd22227	HkD_Hook2	Hook domain found in protein Hook 2 (Hook2) and similar proteins. Hook2 is a microtubule-binding protein that contributes to the establishment and maintenance of centrosome function. It may function in the positioning or formation of aggresomes, which are pericentriolar accumulations of misfolded proteins, proteasomes and chaperones. Hook2 is a component of the FTS/Hook/FHIP complex (FHF complex), which may function to promote vesicle trafficking and/or fusion via the homotypic vesicular protein sorting (HOPS) complex.	150
411799	cd22228	HkD_Daple	Hook domain found in Daple (Dvl-associating protein with a high frequency of leucine residues) and similar proteins. Protein Daple, also called coiled-coil domain-containing protein 88C (CCDC88C), or Hook-related protein 2 (HkRP2), is a novel non-receptor nucleotide exchange factor (GEF) required for activation of guanine nucleotide-binding proteins (G-proteins) during non-canonical Wnt signaling.	153
411800	cd22229	HkD_Girdin	Hook domain found in Girdin and similar proteins. Girdin, also called Akt phosphorylation enhancer (APE), or coiled-coil domain-containing protein 88A (CCDC88A), or G alpha-interacting vesicle-associated protein (GIV), or Girders of actin filament, or Hook-related protein 1 (HkRP1), is a bifunctional modulator of guanine nucleotide-binding proteins (G proteins). It acts as a non-receptor guanine nucleotide exchange factor which binds to and activates guanine nucleotide-binding protein G(i) alpha subunits. It also acts as a guanine nucleotide dissociation inhibitor for guanine nucleotide-binding protein G(s) subunit alpha GNAS. In addition, Girdin plays an essential role in cell migration.	156
411801	cd22230	HkD_Gipie	Hook domain found in Gipie (GRP78-interacting protein induced by ER stress) and similar proteins. Gipie, also called coiled-coil domain-containing protein 88B (CCDC88B), or brain leucine zipper domain-containing protein, or Hook-related protein 3 (HkRP3), is a novel actin cytoskeleton-binding protein and Akt substrate that regulates cell migratory responses in various biological contexts. It acts as a positive regulator of T-cell maturation and inflammatory function. As a microtubule-binding protein, Gipie regulates lytic granule clustering and NK cell killing.	170
409021	cd22231	RHH_NikR_HicB-like	ribbon-helix-helix domains of nickel responsive transcription factor NikR, antitoxins HicB, ParD, and MazE, and similar proteins. This family includes the N-terminal domain of NikR, C-terminal domains of antitoxins HicB and ParD, as well as antitoxin MazE, and similar proteins, all of which belong to the ribbon-helix-helix (RHH) family of transcription factors. NikR is a nickel-responsive transcription factor that consists of an N-terminal DNA-binding RHH domain and a C-terminal metal-binding domain (MBD) with four nickel ions. In Helicobacter pylori, which colonizes the gastric epithelium of humans leading to gastric ulcers and gastric cancers, NikR (HpNikR) regulates multiple genes. It regulates urease, which protects H. pylori from acidic shock at low pH, by converting urea to ammonia and bicarbonate. It also plays a complex role in the intracellular physiology of nickel; occupation of nickel-binding sites results in NikR binding to its operator in the nickel permease nikABCDE promoter. Thus, there is weaker repression of NikABCDE transcription at low intracellular free nickel concentrations while strong repression prevails at higher concentrations, which would be potentially toxic. Antitoxin HicB is part of the HicAB toxin-antitoxin (TA) system, where the toxins are RNases, found in many bacteria. In the pathogen Burkholderia pseudomallei, the HicAB system plays a role in regulating the frequency of persister cells and may therefore play a role in disease. Structural studies of Yersinia pestis HicB show that it acts as an autoregulatory protein and HicA acts as an mRNase. In Escherichia coli, an excess of HicA has been shown to de-repress a HicB-DNA complex and restore transcription of HicB. Similarly, Caulobacter crescentus ParD antitoxin neutralizes the effect of cognate ParE toxin. In Bacillus subtilis, during stress conditions, antitoxin MazE binds to toxin MazF, an mRNA interferase, and inactivates it and cleaves mRNAs in a sequence-specific manner, resulting in cellular growth arrest.	44
409022	cd22232	RHH_CopG_Cop6-like	ribbon-helix-helix family transcriptional repressor protein CopG, uncharacterized Cop6, and similar proteins. This family includes the ribbon-helix-helix (RHH) family transcriptional repressor CopG, which is involved in the control of plasmid copy number, as well as uncharacterized proteins such as Cop6, which is found in a small plasmid that has been identified in methicillin-resistant Staphylococcus aureus (MRSA). CopG, a homodimeric protein of around 45 residues, constitutes one of the smallest natural transcriptional repressors characterized and is the prototype of a series of repressor proteins encoded by plasmids that exhibit a similar genetic structure at their leading strand initiation and control regions. It binds to and represses the single Pcr promoter that directs the synthesis of a bicistronic mRNA for CopG and the RepB initiator of replication, thereby regulating its own synthesis and that of RepB.	45
409023	cd22233	RHH_CopAso-like	ribbon-helix-helix domain of Shewanella oneidensis type II antitoxin CopA(SO), and similar proteins. This family includes the N-terminal ribbon-helix-helix (RHH) domain of Shewanella oneidensis CopA(SO), a newly identified type II antitoxin, as well as the N-terminal RHH domain of Escherichia coli PutA flavoprotein, among other similar proteins, many of which are as yet uncharacterized. CopA(SO) is a typical RHH antitoxin that includes an ordered N-terminal domain (CopA(SO)-N) and a disordered C-terminal domain (CopA(SO)-C). Biophysical investigation indicates allosteric effects of CopA(SO)-N on CopA(SO)-C; DNA binding of CopA(SO)-N appears to induce CopA(SO)-C to fold and self-associate the C-terminal domain. The multifunctional E. coli proline utilization A (PutA) flavoprotein functions as a membrane-associated proline catabolic enzyme as well as a transcriptional repressor of the proline utilization genes putA and putP. The N-terminal domain of PutA is a transcriptional regulator with an RHH fold; structure studies show that it forms a homodimer to bind one DNA duplex. This family also includes orphan antitoxin ParD2, an antitoxin component of a non-functional type II toxin-antitoxin (TA system); it does not neutralize the effect of any of the RelE or ParE toxins.	44
409024	cd22234	RHH_MobB-like	ribbon-helix-helix domain of mobilization protein MobB and similar proteins. This subfamily includes Pseudomonas syringae mobilization protein MobB, and mostly archaeal uncharacterized CopG family proteins. These proteins have a typical ribbon-helix-helix (RHH), similar to plasmid-encoded transcriptional repressor CopG, the protein that is encoded by the promiscuous streptococcal plasmid pMV158 and is involved in the control of plasmid copy number.	44
409025	cd22235	RHH_CopG_archaea	ribbon-helix-helix domain of CopG family transcriptional regulators found in archaea. This subfamily includes the N-terminal ribbon-helix-helix (RHH) domain of putative transcriptional repressor CopG from archaea, and similar proteins. These uncharacterized proteins have a typical RHH, similar to plasmid-encoded transcriptional repressor CopG, the protein that is encoded by the promiscuous streptococcal plasmid pMV158 and is involved in the control of plasmid copy number.	43
412071	cd22238	AcrIF3	Anti-CRISPR type I subtype F3. AcrIF3 (also known as AcrF3) is an anti-CRISPR (Acr) protein that forms a homodimer and interacts directly with helicase-nuclease protein Cas3 and blocks its recruitment to the type I-F CRISPR-Cas surveillance complex (Csy). The type I-F Csy is a crRNA-guided surveillance complex, composed of a crRNA and nine Cas proteins (one Cas8f, one Cas5f, one Cas6f, and six Cas7f), which recruits a nuclease-helicase protein Cas3 for target degradation. Without Cas3 recruitment by the Csy-dsDNA complex, the CRISPR/Cas system is unable to efficiently destroy the invading DNA, resulting in escape from the immune response. CRISPR-Cas immune systems are used by certain prokaryotes and archaea to resist the invasion of foreign nucleic acids such as phages or plasmids. Anti-CRISPRs are small proteins which are the natural inhibitors for CRISPR-Cas systems; encoded on bacterial and archaeal viruses, they allow the virus to evade host CRISPR-Cas systems. The CRISPR-Cas-mediated adaptive immune response can be divided into three steps, including the acquisition of spacer derived from invading nucleic acids, crRNA processing, and target degradation. Theoretically, Acr proteins could suppress any step to disrupt the CRISPR-Cas system. Acr proteins are diverse with no common sequence or structural motif which inhibit a wide range of CRISPR-Cas systems with various inhibition mechanisms. CRISPR-Cas systems are divided into two classes (1 and 2) and six types (class 1: types I, III and IV; class 2: types II, V and VI). Class 1 systems utilize RNA-guided complexes consisting of multiple Cas proteins as the effector proteins to recognize and cleave target DNA. Type I CRISPR-Cas systems are the most widespread in nature, and the Cas protein composition of the employed CRISPR ribonucleoprotein (crRNP) complexes differs between seven subtypes (A to F, U). Acr families are named for their type and subtype which are numbered sequentially as they are discovered.	127
412072	cd22239	NPHP4	Nephrocystin-4. Nephrocystin-4 (NPHP4), also known as nephroretinin, is a component of the nephronophthisis (NPHP) module which is part of the transition zone (TZ) of the cilia. NPHP4 forms complexes with alpha-tubulin, NPHP1 and RPGRIP1. The interaction with NPHP1 is crucial for cell-cell and cell-matrix adhesion signaling. Mutations in NPHP4 have been shown to cause nephronophthisis (NPHP), an autosomal recessive cause of kidney failure and earlier stages of chronic kidney disease among adults.	904
412034	cd22240	akirin	akirin. Akirins are small, highly conserved eumetazoan nuclear proteins that play a role in immune response and tumorigenesis. It is believed that they act as a connector between a variety of transcription factors and major chromatin remodeling complexes. In vertebrates, there are two orthologs, Akirin1 and Akirin2.	147
412073	cd22241	AcrIF8	Anti-CRISPR type I subtype F8 (AcrIF8). AcrIF8 (also known as AcrF8) is an anti-CRISPR (Acr) protein that is positioned on the type I-F Csy spiral backbone surrounded by Cas5f, Cas7.4-7.6f, and Cas8f, and forms interactions with crRNA (CRISPR-RNA) to prevent the target DNA from binding to the Csy complex. The type I-F Csy complex is a crRNA-guided surveillance complex composed of a crRNA and nine Cas proteins (one Cas8f, one Cas5f, one Cas6f, and six Cas7f), which recruits a nuclease-helicase protein Cas3 for target degradation. CRISPR-Cas immune systems are used by certain prokaryotes and archaea to resist the invasion of foreign nucleic acids such as phages or plasmids. Anti-CRISPRs are small proteins which are the natural inhibitors for CRISPR-Cas systems; encoded on bacterial and archaeal viruses, they allow the virus to evade host CRISPR-Cas systems. The CRISPR-Cas-mediated adaptive immune response can be divided into three steps, including the acquisition of spacer derived from invading nucleic acids, crRNA processing, and target degradation. Theoretically, Acr proteins could suppress any step to disrupt the CRISPR-Cas system. Acr proteins are diverse with no common sequence or structural motif which inhibit a wide range of CRISPR-Cas systems with various inhibition mechanisms. CRISPR-Cas systems are divided into two classes (1 and 2) and six types (class 1: types I, III and IV; class 2: types II, V and VI). Class 1 systems utilize RNA-guided complexes consisting of multiple Cas proteins as the effector proteins to recognize and cleave target DNA. Type I CRISPR-Cas systems are the most widespread in nature, and the Cas protein composition of the employed CRISPR ribonucleoprotein (crRNP) complexes differs between seven subtypes (A to F, U). Acr families are named for their type and subtype which are numbered sequentially as they are discovered.	77
412035	cd22243	akirin-1	akirin-1. Akirins are small, highly conserved eumetazoan nuclear proteins that play a role in immune response and tumorigenesis. It is believed that they act as a connector between a variety of transcription factors and major chromatin remodeling complexes. Akirin-1 is one of the two orthologs in vertebrates that plays a role in immunity, myogenesis and meiosis.	188
412036	cd22244	akirin-2	akirin-2. Akirins are small, highly conserved eumetazoan nuclear proteins that play a role in immune response and tumorigenesis. It is believed that they act as a connector between a variety of transcription factors and major chromatin remodeling complexes. Akirin-2 is one of the two orthologs in vertebrates that plays a role in immunity, myogenesis, and brain- and limb-development. Akirin-2 is partly cytosolic. It has been shown to interact with nuclear importins and therefore may play a role in proper transport between nucleus and cytoplasm.	184
412074	cd22246	PI4KB_NTD	N-terminal domain of phosphatidylinositol 4-kinase beta. Phosphatidylinositol 4-kinase beta (PI4K-beta, PI4Kbeta or PI4KB), also called PtdIns 4-kinase beta, NPIK, PI4K92, or PI4KIII, catalyzes the phosphorylation of phosphatidylinositol (PI) to form phosphatidylinositol 4-phosphate (PI4P), in the first committed step in the production of the second messenger inositol-1,4,5-trisphosphate (PIP). It may regulate Golgi disintegration/reorganization during mitosis, possibly via its phosphorylation. PI4K-beta is critical for the maintenance of the Golgi and trans Golgi network (TGN) PI4P pools. It is recruited to membranes via its interaction with Golgi adaptor protein acyl-coenzyme A binding domain containing protein 3 (ACBD3). The ACBD3:PI4K-beta complex formation is essential for proper function of the Golgi. PI4K-beta also plays an essential role in Aichi virus RNA replication. It is recruited by ACBD3 at viral replication sites. This model corresponds to the N-terminal domain of PI4K-beta, which is responsible for interacting with ACBD3 by forming a complex with the Q domain.	65
410202	cd22248	Rcc_KIF21	regulatory coiled-coil domain found in the kinesin-like KIF21 family. The KIF21 family includes KIF21A and KIF21B. KIF21A (also called kinesin-like protein KIF2, or renal carcinoma antigen NY-REN-62) is a microtubule-binding motor protein involved in neuronal axonal transport. It works as a microtubule stabilizer that regulates axonal morphology, suppressing cortical microtubule dynamics in neurons. Mutations in KIF21A cause congenital fibrosis of the extraocular muscles type 1 (CFEOM1). In vitro, it has a plus-end directed motor activity. KIF21B is a plus-end directed microtubule-dependent motor protein which displays processive activity. It is involved in regulation of microtubule dynamics, synapse function, and neuronal morphology, including dendritic tree branching and spine formation. KIF21B plays a role in learning and memory. It is involved in the delivery of gamma-aminobutyric acid (GABA(A)) receptors to the cell surface. This model corresponds to the regulatory coiled-coil domain of KIF21A/KIF21B, which folds into an intramolecular antiparallel coiled-coil monomer in solution but crystallizes into a dimeric domain-swapped antiparallel coiled-coil.	81
409016	cd22249	UDM1_RNF168_RNF169-like	UDM1 (ubiquitin-dependent DSB recruitment module 1) found in RING finger proteins RNF168, RNF169 and similar proteins. This model represents the UDM1 (ubiquitin-dependent double-strand break [DSB] recruitment module 1) found in RING finger proteins, RNF168 and RNF169. RNF168 is an E3 ubiquitin-protein ligase that promotes non-canonical K27 ubiquitination to signal DNA damage. It functions, together with RNF8, as a DNA damage response (DDR) factor that promotes a series of ubiquitylation events on substrates such as H2A and H2AX. With H2AK13/15 ubiquitylation, it facilitates recruitment of repair factors p53-binding protein 1 (53BP1) or the RAP80-BRCA1 complex to sites of double-strand breaks (DSBs), and inhibits homologous recombination (HR) in cells deficient in the tumor suppressor BRCA1. RNF168 also promotes H2A neddylation, which antagonizes ubiquitylation of H2A and regulates DNA damage repair. In addition, RNF168 forms a functional complex with RAD6A or RAD6B during the DNA damage response. RNF169 is an uncharacterized E3 ubiquitin-protein ligase paralogous to RNF168. It functions as a negative regulator of the DNA damage signaling cascade. RNF169 recognizes polyubiquitin structures but does not itself contribute to double-strand break (DSB)-induced chromatin ubiquitylation. It contributes to the regulation of DSB repair pathway utilization via functionally competing with recruiting repair factors, 53BP1 and RAP80-BRCA1, for association with RNF168-modified chromatin, independent of its catalytic activity, limiting the magnitude of the RNF8/RNF168-dependent signaling response to DSBs. The UDM1 domain comprises LRM1 (LR motif 1), UMI (ubiquitin-interacting motif [UIM]- and MIU-related UBD) and MIU1 (motif interacting with ubiquitin 1). Mutations of Ub-interacting residues in UDM1 have little effect on the accumulation of RNF168 to DSB sites, suggesting that it may not be the main site of binding ubiquitylated and polyubiquitylated targets.	66
409019	cd22250	ROCK_SBD	Shroom-binding domain found in Rho-associated coiled-coil containing protein kinase. Rho-associated coiled-coil containing protein kinase (ROCK) is a serine/threonine kinase (STK) that catalyzes the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. It is also referred to as Rho-associated protein kinase or simply as Rho kinase. The ROCK subfamily consists of two isoforms, ROCK1 and ROCK2, which may be functionally redundant in some systems, but exhibit different tissue distributions. Rho-associated protein kinase 1 (ROCK1), also called renal carcinoma antigen NY-REN-35, Rho-associated, coiled-coil-containing protein kinase 1, ROCK-I, p160 ROCK-1, or p160ROCK, is preferentially expressed in the liver, lung, spleen, testes, and kidney. It mediates signaling from Rho to the actin cytoskeleton. It is implicated in the development of cardiac fibrosis, cardiomyocyte apoptosis, and hyperglycemia. Mice deficient in ROCK1 display eyelids open at birth (EOB) and omphalocele phenotypes due to the disorganization of actin filaments in the eyelids and the umbilical ring. Rho-associated protein kinase 2 (ROCK2), also called Rho kinase 2, Rho-associated, coiled-coil-containing protein kinase 2, ROCK-II, or p164 ROCK-2, is more prominent in brain and skeletal muscle. It is implicated in vascular and neurological disorders, such as hypertension and vasospasm of the coronary and cerebral arteries. Mice deficient in ROCK2 show intrauterine growth retardation and embryonic lethality because of placental dysfunction. ROCK subfamily proteins contain an N-terminal extension, a catalytic kinase domain, a coiled-coil (CC) region encompassing a Rho-binding domain (RBD), and a pleckstrin homology (PH) domain. ROCK is auto-inhibited by the RBD and PH domain interacting with the catalytic domain. It is activated via proteolytic cleavage, binding of lipids to the PH domain, or binding of GTP-bound RhoA to the CC region. More recently, the Shroom family of proteins has been identified as an additional regulator of ROCK. This model corresponds to the Shroom-binding domain (SBD) of ROCK, which forms a parallel coiled coil with the Shroom domain 2 (SD2) of Shroom.	75
412075	cd22252	PARP2_NTR	NTR (N-terminal region) domain of poly [ADP-ribose] polymerase 2 (PARP-2) and similar proteins. PARP-2 is also called ADP-ribosyltransferase diphtheria toxin-like 2 (ARTD2), DNA ADP-ribosyltransferase PARP2, NAD(+) ADP-ribosyltransferase 2 (ADPRT-2), poly[ADP-ribose] synthase 2 (pADPRT-2), or protein poly-ADP-ribosyltransferase PARP2. It is a poly-ADP-ribosyltransferase that mediates poly-ADP-ribosylation of proteins and plays a key role in DNA repair. It mainly mediates glutamate and aspartate ADP-ribosylation of target proteins. PARP-2 can also ADP-ribosylate DNA; it preferentially acts on 5'-terminal phosphates at DNA strand break termini in nicked duplex. This model corresponds to the NTR (N-terminal region) domain of PARP-2, which contains a nucleolar localization sequence (NoLS) and a putative nuclear localization signal (NLS). The NTR domain has a helical SAF-A/B, Acinus, and PIAS (SAP) domain fold and may participate in protein-protein interactions.	59
412076	cd22255	PPP1R3A_PBD	PP1C binding domain found in protein phosphatase 1 regulatory subunit 3A (PPP1R3A) and similar proteins. PPP1R3A, also called protein phosphatase 1 glycogen-associated regulatory subunit (PP1G), protein phosphatase type-1 glycogen targeting subunit, or RG1, acts as a glycogen-targeting subunit for PP1 that is essential for cell division, and participates in the regulation of glycogen metabolism, muscle contractility, and protein synthesis. PPP1R3A plays an important role in glycogen synthesis but is not essential for insulin activation of glycogen synthase. It interacts with the PPP1CC catalytic subunit of PP1 and associates with glycogen. This model corresponds to the protein phosphatase 1 catalytic subunit (PP1C) binding domain of PPP1R3A, which contains a RVxF PP1C-binding motif that mediates interactions with PP1C.	82
412077	cd22256	PrimPol_RBD	C-terminal RPA-binding domain (RBD) of DNA-directed primase/polymerase protein and similar proteins. DNA-directed primase/polymerase protein (PrimPol), also called coiled-coil domain-containing protein 111, is a DNA primase-polymerase required for the maintenance of genome integrity. It facilitates mitochondrial and nuclear replication fork progression by initiating de novo DNA synthesis using dNTPs and acting as an error-prone DNA polymerase able to bypass certain DNA lesions. PrimPol is regulated by single-stranded DNA binding proteins. This model corresponds to the C-terminal RPA-binding domain (RBD) of PrimPol, which interacts directly with the RPA70N domain of RPA70.	81
412078	cd22257	AcrIF6-like	Anti-CRISPR type I subtype F6 and related uncharacterized proteins. AcrIF6 (also known as AcrF6) is an anti-CRISPR (Acr) protein that blocks invader DNA access by binding in the junction region between Cas7.6f and Cas8f subunits of the type I-F CRISPR-Cas surveillance complex (Csy) to compete for foreign DNA binding. The type I-F Csy is a crRNA-guided surveillance complex, composed of a crRNA and nine Cas proteins (one Cas8f, one Cas5f, one Cas6f, and six Cas7f), which recruits a nuclease-helicase protein Cas3 for target degradation. AcrIF6 can function as an inhibitor of both the type I-E and I-F CRISPR-Cas systems. CRISPR-Cas immune systems are used by certain prokaryotes and archaea to resist the invasion of foreign nucleic acids such as phages or plasmids. Anti-CRISPRs are small proteins which are the natural inhibitors for CRISPR-Cas systems; encoded on bacterial and archaeal viruses, they allow the virus to evade host CRISPR-Cas systems. The CRISPR-Cas-mediated adaptive immune response can be divided into three steps, including the acquisition of spacer derived from invading nucleic acids, crRNA processing, and target degradation. Theoretically, Acr proteins could suppress any step to disrupt the CRISPR-Cas system. Acr proteins are diverse with no common sequence or structural motif which inhibit a wide range of CRISPR-Cas systems with various inhibition mechanisms. CRISPR-Cas systems are divided into two classes (1 and 2) and six types (class 1: types I, III and IV; class 2: types II, V and VI). Class 1 systems utilize RNA-guided complexes consisting of multiple Cas proteins as the effector proteins to recognize and cleave target DNA. Type I CRISPR-Cas systems are the most widespread in nature, and the Cas protein composition of the employed CRISPR ribonucleoprotein (crRNP) complexes differs between seven subtypes (A to F, U). Acr families are named for their type and subtype which are numbered sequentially as they are discovered.	82
412079	cd22259	AcrIF10	Anti-CRISPR type I subtype F10. AcrIF10 (also known as AcrF10) is an anti-CRISPR (Acr) protein which acts as a "DNA mimic protein" (DMP) that binds in the junction region between Cas7.6f and Cas8f subunits of the type I-F CRISPR-Cas surveillance complex (Csy) to inhibit foreign DNA binding to the CRISPR-Cas adaptive immune system. The key feature of DMPs is their DNA-like shape and charge distribution, and they affect the activity of DNA-binding proteins by occupying their DNA-binding domains. The type I-F Csy is a crRNA-guided surveillance complex composed of a crRNA and nine Cas proteins (one Cas8f, one Cas5f, one Cas6f, and six Cas7f), which recruits a nuclease-helicase protein Cas3 for target degradation. Without Cas3 recruitment by the Csy-dsDNA complex, the CRISPR/Cas system is unable to efficiently destroy the invading DNA, resulting in escape from the immune response. CRISPR-Cas immune systems are used by certain prokaryotes and archaea to resist the invasion of foreign nucleic acids such as phages or plasmids. Anti-CRISPRs are small proteins which are the natural inhibitors for CRISPR-Cas systems; encoded on bacterial and archaeal viruses, they allow the virus to evade host CRISPR-Cas systems. The CRISPR-Cas-mediated adaptive immune response can be divided into three steps, including the acquisition of spacer derived from invading nucleic acids, crRNA processing, and target degradation. Theoretically, Acr proteins could suppress any step to disrupt the CRISPR-Cas system. Acr proteins are diverse with no common sequence or structural motif which inhibit a wide range of CRISPR-Cas systems with various inhibition mechanisms. CRISPR-Cas systems are divided into two classes (1 and 2) and six types (class 1: types I, III and IV; class 2: types II, V and VI). Class 1 systems utilize RNA-guided complexes consisting of multiple Cas proteins as the effector proteins to recognize and cleave target DNA. Type I CRISPR-Cas systems are the most widespread in nature, and the Cas protein composition of the employed CRISPR ribonucleoprotein (crRNP) complexes differs between seven subtypes (A to F, U). Acr families are named for their type and subtype which are numbered sequentially as they are discovered.	94
410203	cd22262	Rcc_KIF21B	regulatory coiled-coil domain found in kinesin-like protein KIF21B. KIF21B is a plus-end directed microtubule-dependent motor protein which displays processive activity. It is involved in regulation of microtubule dynamics, synapse function, and neuronal morphology, including dendritic tree branching and spine formation. KIF21B plays a role in learning and memory. It is involved in the delivery of gamma-aminobutyric acid (GABA(A)) receptors to the cell surface. This model corresponds to a conserved region of KIF21B, which shows high sequence similarity to the regulatory coiled-coil domain of KIF21A.	82
410204	cd22263	Rcc_KIF21A	regulatory coiled-coil domain found in kinesin-like protein KIF21A. KIF21A, also called kinesin-like protein KIF2 or renal carcinoma antigen NY-REN-62, is a microtubule-binding motor protein involved in neuronal axonal transport. It works as a microtubule stabilizer that regulates axonal morphology, suppressing cortical microtubule dynamics in neurons. Mutations in KIF21A cause congenital fibrosis of the extraocular muscles type 1 (CFEOM1). In vitro, it has a plus-end directed motor activity. This model corresponds to the regulatory coiled-coil domain of KIF21A, which folds into an intramolecular antiparallel coiled-coil monomer in solution, but crystallizes into a dimeric domain-swapped antiparallel coiled-coil.	82
409017	cd22264	UDM1_RNF169	UDM1 (ubiquitin-dependent DSB recruitment module 1) domain found in RING finger protein 169. RING finger protein 169 (RNF169) is an uncharacterized E3 ubiquitin-protein ligase paralogous to RNF168. It functions as a negative regulator of the DNA damage signaling cascade. It recognizes polyubiquitin structures but does not itself contribute to double-strand break (DSB)-induced chromatin ubiquitylation. It contributes to the regulation of DSB repair pathway utilization via functionally competing with recruiting repair factors, 53BP1 and RAP80-BRCA1, for association with RNF168-modified chromatin independent of its catalytic activity, limiting the magnitude of the RNF8/RNF168-dependent signaling response to DSBs. This model corresponds to the UDM1 (ubiquitin-dependent double-strand break [DSB] recruitment module 1) domain of RNF169, which comprises LRM1 (LR motif 1), UMI (ubiquitin-interacting motif [UIM]- and MIU-related UBD) and MIU1 (motif interacting with ubiquitin 1). Mutations of Ub-interacting residues in UDM1 have little effect on the accumulation of the related RNF168 to DSB sites, suggesting that it may not be the main site of binding ubiquitylated and polyubiquitylated targets.	70
409018	cd22265	UDM1_RNF168	UDM1 (ubiquitin-dependent DSB recruitment module 1) domain found in RING finger protein 168. RING finger protein 168 (RNF168) is an E3 ubiquitin-protein ligase that promotes noncanonical K27 ubiquitination to signal DNA damage. Together with RNF8, RNF168 functions as a DNA damage response (DDR) factor that promotes a series of ubiquitylation events on substrates such as H2A and H2AX. With H2AK13/15 ubiquitylation, it facilitates recruitment of repair factors p53-binding protein 1 (53BP1) or the RAP80-BRCA1 complex to sites of double-strand breaks (DSBs), and inhibits homologous recombination (HR) in cells deficient in the tumor suppressor BRCA1. RNF168 also promotes H2A neddylation, which antagonizes ubiquitylation of H2A and regulates DNA damage repair. In addition, RNF168 forms a functional complex with RAD6A or RAD6B during the DNA damage response. This model corresponds to the UDM1 (ubiquitin-dependent double-strand break [DSB] recruitment module 1) domain of RNF168, which comprises LRM1 (LR motif 1), UMI (ubiquitin-interacting motif [UIM]- and MIU-related UBD) and MIU1 (motif interacting with ubiquitin 1). Mutations of Ub-interacting residues in UDM1 have little effect on the accumulation of RNF168 to DSB sites, suggesting that it may not be the main site of binding ubiquitylated and polyubiquitylated targets.	73
412080	cd22266	AcrIE1	Anti-CRISPR type I subtype E1. AcrIE1 (also known as AcrE1) is an anti-CRISPR (Acr) protein which binds as a homodimer to and inactivates the CRISPR-associated helicase/nuclease Cas3 protein. It has been shown that the C-terminal region of AcrIE1 is important for its inhibitory activity. AcrIE1 can convert the endogenous type I-E CRISPR system into a programmable transcriptional repressor. The type I-E Csy is a crRNA-guided surveillance complex, composed of a crRNA and eleven Cas proteins (one Cse1, two Cse2, one Cas5, six Cas7 and one Cas6e), which recruits a nuclease-helicase protein Cas3 for target degradation. CRISPR-Cas immune systems are used by certain prokaryotes and archaea to resist the invasion of foreign nucleic acids such as phages or plasmids. Anti-CRISPRs are small proteins which are the natural inhibitors for CRISPR-Cas systems; encoded on bacterial and archaeal viruses, they allow the virus to evade host CRISPR-Cas systems. The CRISPR-Cas-mediated adaptive immune response can be divided into three steps, including the acquisition of spacer derived from invading nucleic acids, crRNA processing, and target degradation. Theoretically, Acr proteins could suppress any step to disrupt the CRISPR-Cas system. Acr proteins are diverse with no common sequence or structural motif which inhibit a wide range of CRISPR-Cas systems with various inhibition mechanisms. CRISPR-Cas systems are divided into two classes (1 and 2) and six types (class 1: types I, III and IV; class 2: types II, V and VI). Class 1 systems utilize RNA-guided complexes consisting of multiple Cas proteins as the effector proteins to recognize and cleave target DNA. Type I CRISPR-Cas systems are the most widespread in nature, and the Cas protein composition of the employed CRISPR ribonucleoprotein (crRNP) complexes differs between seven subtypes (A to F, U). Acr families are named for their type and subtype which are numbered sequentially as they are discovered.	98
409005	cd22268	DPBB_RlpA-like	double-psi beta-barrel fold of endolytic peptidoglycan transglycosylase RlpA and similar proteins. Endolytic peptidoglycan transglycosylase RlpA (rare lipoprotein A, RlpA) is a lytic transglycosylase with a strong preference for naked glycan strands that lack stem peptides. It adopts a double-psi beta-barrel (DPBB) fold and is one of four SPOR-domain containing proteins in Escherichia coli (including FtsN, DedA and DamX) that bind peptidoglycan (PG) and are targeted to the septum during division. It directly interacts with the divisome protein FtsK in vitro, and deletion of the rlpA gene partially bypasses the requirement for functional FtsK, a large, multi-spanning membrane protein that facilitates double-stranded DNA translocation during division and sporulation in E. coli and Bacillus subtilis, respectively. In Pseudomonas aeruginosa, RlpA contributes to rod shape maintenance and daughter cell separation. The separation of daughter cells requires extensive PG remodeling. It has been suggested that amidases and RlpA work in tandem to degrade PG in the division septum and lateral wall to facilitate daughter cell separation.	93
409006	cd22269	DPBB_EG45-like	double-psi beta-barrel fold of EG45-like domain-containing proteins. This family contains plant EG45-like domain-containing proteins which show sequence similarity to expansins, and similar proteins. Citrus jambhiri EG45-like domain-containing protein was identified as a protein associated with citrus blight (CB), and is also called blight-associated protein p12 (CjBAp12) or plant natriuretic peptide (PNP). CjBAp12 does not display cell wall loosening activity of expansins. Arabidopsis thaliana EG45-like domain-containing protein 2, also called plant natriuretic peptide A (AtPNP-A), is a systemically mobile natriuretic peptide immunoanalog, recognized by antibodies against vertebrate atrial natriuretic peptides (ANPs), that functions in cell volume regulation. Thus, it has an important and systemic role in plant growth and homeostasis. Due to their similarity to the N-terminal domain of expansin and to endolytic peptidoglycan transglycosylase RlpA, EG45-like domain-containing proteins may adopt a double-psi beta-barrel fold.	106
409007	cd22270	DPBB_kiwellin-like	double-psi beta-barrel fold of the kiwellin family. Kiwellin (KWL) proteins comprise a widespread family of plant-defense proteins that target pathogenic bacterial/fungal effectors that down-regulate plant defense responses. They are part of a spatiotemporally coordinated, plant-wide defense response comprising KWL proteins with overlapping activities. Zea mays KWL1 specifically inhibits the enzymatic activity of the secreted chorismate mutase Cmu1, a virulence-promoting effector of the smut fungus Ustilago maydis. KWL proteins adopt a double-psi beta-barrel (DPBB) fold, which provides a versatile scaffold that can specifically counteract pathogen effectors such as Cmu1.	128
409008	cd22271	DPBB_EXP_N-like	N-terminal double-psi beta-barrel fold domain of the expansin family and similar domains. The plant expansin family consists of four subfamilies, alpha-expansin (EXPA), beta-expansin (EXPB), expansin-like A (EXLA), and expansin-like B (EXLB). EXPA and EXPB display cell wall loosening activity and are involved in cell expansion and other developmental events during which cell wall modification occurs. EXPA proteins function more efficiently on dicotyledonous cell walls, whereas EXPB proteins exhibit specificity for the cell walls of monocotyledons. Expansins also affect environmental stress responses. Expansin family proteins contain an N-terminal domain (D1) homologous to the catalytic domain of glycoside hydrolase family 45 (GH45) proteins but with no hydrolytic activity, and a C-terminal domain (D2) homologous to group-2 grass pollen allergens. This family also includes GH45 endoglucanases from mollusks. This model represents the N-terminal domain of expansins and similar proteins, which adopts a double-psi beta-barrel (DPBB) fold.	117
409009	cd22272	DPBB_EXLX1-like	N-terminal double-psi beta-barrel fold domain of bacterial expansins similar to Bacillus subtilis EXLX1. This subfamily is composed of bacterial expansins including Bacillus subtilis EXLX1, also called expansin-YoaJ. Similar to plant expansins, EXLX1 contains an N-terminal domain (D1) homologous to the catalytic domain of glycoside hydrolase family 45 (GH45) proteins but with no hydrolytic activity, and a C-terminal domain (D2) homologous to group-2 grass pollen allergens. It strongly binds to crystalline cellulose via D2, and weakly binds soluble cellooligosaccharides. Bacterial expansins, which are present in some plant pathogens, have the ability to loosen plant cell walls, but with weaker activity compared to plant expansins. They may have a role in plant-bacterial interactions. This model represents the N-terminal domain of EXLX1 and similar bacterial expansins, which adopts a double-psi beta-barrel (DPBB) fold.	101
409010	cd22273	DPBB_SPI-like	double-psi beta-barrel fold of Streptomyces papain inhibitor and similar proteins. Streptomyces papain inhibitor (SPI) adopts a rigid, thermo-resistant double-psi-beta-barrel (DPBB) fold that is stabilized by two cysteine bridges. SPI serves as a glutamine and lysine donor substrate for microbial transglutaminase (MTG, EC 2.3.2.13) from Streptomycetes, that is used to covalently and specifically link functional amines to glutamine donor sites of therapeutic proteins. SPI is a stress protein produced under hyperthermal stress conditions, and is able to inhibit the cysteine proteases, papain and bromelain, as well as the bovine serine protease trypsin.	101
409011	cd22274	DPBB_EXPA_N	N-terminal double-psi beta-barrel fold domain of the alpha-expansin subfamily. Alpha-expansins (EXPA, expansin-A) have cell wall loosening activity and are involved in cell expansion and other developmental events during which cell wall modification occurs. They also affect environmental stress responses. Arabidopsis thaliana EXPA1 is a cell wall modifying enzyme that controls the divisions marking lateral root initiation. Nicotiana tabacum EXPA4 positively regulates abiotic stress tolerance, and negatively regulates pathogen resistance. Wheat TaEXPA2 is involved in conferring cadmium tolerance. Alpha-expansins belong to the expansin family of proteins that contain an N-terminal domain (D1) homologous to the catalytic domain of glycoside hydrolase family 45 (GH45) proteins but with no hydrolytic activity, and a C-terminal domain (D2) homologous to group-2 grass pollen allergens. This model represents the N-terminal domain of alpha-expansins, which adopts a double-psi beta-barrel (DPBB) fold.	129
409012	cd22275	DPBB_EXPB_N	N-terminal double-psi beta-barrel fold domain of the beta-expansin subfamily. Beta-expansins (EXPB, expansin-B) have cell wall loosening activity and are involved in cell expansion and other developmental events during which cell wall modification occurs. They also affect environmental stress responses. The EXPB subfamily is known in the allergen literature as group-1 grass pollen allergens. EXPB of Bermuda, Johnson, and Para grass pollens, is a major cross-reactive allergen for allergic rhinitis patients in subtropical climate. EXPB1 induces extension and stress relaxation of grass cell walls. Wheat TaEXPB7-B is a beta-expansin gene involved in low-temperature stress and abscisic acid responses. Beta-expansins belong to the expansin family of proteins that contain an N-terminal domain (D1) homologous to the catalytic domain of glycoside hydrolase family 45 (GH45) proteins but with no hydrolytic activity, and a C-terminal domain (D2) homologous to group-2 grass pollen allergens. This model represents the N-terminal domain of beta-expansins, which adopts a double-psi beta-barrel (DPBB) fold.	122
409013	cd22276	DPBB_EXLA_N	N-terminal double-psi beta-barrel fold domain of the expansin-like A subfamily. Expansin-like A (EXLA) belongs to the plant expansin family that also includes alpha-expansin (EXPA), beta-expansin (EXPB), and expansin-like B (EXLB). Unlike EXPA and EXPB, EXLA proteins have not been shown to display cell wall loosening activity. EXLA2 is one of the three EXLA members in Arabidopsis. It lacks expansin activity, but contains a presumed cellulose-interacting domain. EXLA2 may function as a positive regulator of cell elongation in the dark-grown hypocotyl of Arabidopsis, possibly by interference with cellulose metabolism, deposition, or its organization. EXLA belongs to the expansin family of proteins that contain an N-terminal domain (D1) homologous to the catalytic domain of glycoside hydrolase family 45 (GH45) proteins but with no hydrolytic activity, and a C-terminal domain (D2) homologous to group-2 grass pollen allergens. This model represents the N-terminal domain of EXLA proteins, which adopts a double-psi beta-barrel (DPBB) fold.	129
409014	cd22277	DPBB_EXLB_N	N-terminal double-psi beta-barrel fold domain of the expansin-like B subfamily. Expansin-like B (EXLB) belongs to the plant expansin family that also includes alpha-expansin (EXPA), beta-expansin (EXPB), and expansin-like A (EXLA). Unlike EXPA and EXPB, EXLA proteins have not been shown to display cell wall loosening activity. Solanum tuberosum StEXLB6 showed differential expression under the treatments of abscisic acid (ABA), indoleacetic acid (IAA), and gibberellin acid 3 (GA3), as well as under drought and heat stresses, indicating that it is likely involved in potato stress resistance. Soybean GmEXLB1 improves phosphorus acquisition by regulating root elongation and architecture in Arabidopsis. EXLB belongs to the expansin family of proteins that contain an N-terminal domain (D1) homologous to the catalytic domain of glycoside hydrolase family 45 (GH45) proteins but with no hydrolytic activity, and a C-terminal domain (D2) homologous to group-2 grass pollen allergens. This model represents the N-terminal domain of EXLB proteins, which adopts a double-psi beta-barrel (DPBB) fold.	117
409015	cd22278	DPBB_GH45_endoglucanase	double-psi beta-barrel fold of glycoside hydrolase family 45 endoglucanase EG27II and similar proteins. This group is made up of endoglucanases from mollusks similar to Ampullaria crossean endoglucanase EG27II, a glycoside hydrolase family 45 (GH45) subfamily B protein. Endoglucanases (EC 3.2.1.4) catalyze the endohydrolysis of (1-4)-beta-D-glucosidic linkages in cellulose, lichenin, and cereal beta-D-glucans. Animal cellulases, such as endoglucanase EG27II, have great potential for industrial applications such as bioethanol production. GH45 endoglucanases from mollusks adopt a double-psi beta-barrel (DPBB) fold.	149
412081	cd22279	AcrIF1	Anti-CRISPR type I subtype F1 (AcrIF1). AcrIF1 (also known as AcrF1) is an anti-CRISPR (Acr) protein that targets type I-F Csy and blocks CRISPR-RNA (crRNA) and invader DNA hybridization. It has been shown that multiple copies of AcrIF1 bind to the CRISPR-Cas (Clustered Regularly Interspaced Short Palindromic Repeats-CRISPR associated protein) complex with different modes when working individually or cooperating with AcrIF2, which might exclude target DNA binding through different mechanisms. The type I-F Csy complex is a crRNA-guided surveillance complex composed of a crRNA and nine Cas proteins (one Cas8f, one Cas5f, one Cas6f, and six Cas7f), which recruits a nuclease-helicase protein Cas3 for target degradation. CRISPR-Cas immune systems are used by certain prokaryotes and archaea to resist the invasion of foreign nucleic acids such as phages or plasmids. Anti-CRISPRs are small proteins which are the natural inhibitors for CRISPR-Cas systems; encoded on bacterial and archaeal viruses, they allow the virus to evade host CRISPR-Cas systems. The CRISPR-Cas-mediated adaptive immune response can be divided into three steps: the acquisition of spacer derived from invading nucleic acids, crRNA processing, and target degradation. Theoretically, Acr proteins could suppress any step to disrupt the CRISPR-Cas system. Acr proteins are diverse with no common sequence or structural motif which inhibit a wide range of CRISPR-Cas systems with various inhibition mechanisms. CRISPR-Cas systems are divided into two classes (1 and 2) and six types (class 1: types I, III and IV; class 2: types II, V and VI). Class 1 systems utilize RNA-guided complexes consisting of multiple Cas proteins as the effector proteins to recognize and cleave target DNA. Type I CRISPR-Cas systems are the most widespread in nature, and the Cas protein composition of the employed CRISPR ribonucleoprotein (crRNP) complexes differs between seven subtypes (A to F, U). Acr families are named for their type and subtype which are numbered sequentially as they are discovered.	77
412082	cd22280	AcrIF2	Anti-CRISPR type I subtype F2. AcrIF2 (also known as AcrF2) is an anti-CRISPR (Acr) protein which functions as a double-stranded "DNA mimic protein" (DMP) that binds to the type I-F CRISPR-Cas surveillance complex (Csy) and excludes target DNA binding. The key feature of DMPs is their DNA-like shape and charge distribution, and they affect the activity of DNA-binding proteins by occupying their DNA-binding domains. Acidic residues on the surface of AcrIF2 mimic the negative charge distribution on the helical backbone of a DNA duplex. The type I-F Csy complex is a crRNA-guided surveillance complex, composed of a crRNA and nine Cas proteins (one Cas8f, one Cas5f, one Cas6f, and six Cas7f), which recruits a nuclease-helicase protein Cas3 for target degradation. CRISPR-Cas immune systems are used by certain prokaryotes and archaea to resist the invasion of foreign nucleic acids such as phages or plasmids. Anti-CRISPRs are small proteins which are the natural inhibitors for CRISPR-Cas systems; encoded on bacterial and archaeal viruses, they allow the virus to evade host CRISPR-Cas systems. The CRISPR-Cas-mediated adaptive immune response can be divided into three steps, including the acquisition of spacer derived from invading nucleic acids, crRNA processing, and target degradation. Theoretically, Acr proteins could suppress any step to disrupt the CRISPR-Cas system. Acr proteins are diverse with no common sequence or structural motif which inhibit a wide range of CRISPR-Cas systems with various inhibition mechanisms. CRISPR-Cas systems are divided into two classes (1 and 2) and six types (class 1: types I, III and IV; class 2: types II, V and VI). Class 1 systems utilize RNA-guided complexes consisting of multiple Cas proteins as the effector proteins to recognize and cleave target DNA. Type I CRISPR-Cas systems are the most widespread in nature, and the Cas protein composition of the employed CRISPR ribonucleoprotein (crRNP) complexes differs between seven subtypes (A to F, U). Acr families are named for their type and subtype which are numbered sequentially as they are discovered.	86
409000	cd22283	HD_XRCC4_N	N-terminal head domain found in X-ray repair cross-complementing protein 4 and similar proteins. X-ray repair cross-complementing protein 4 (XRCC4) is a DNA repair protein involved in DNA non-homologous end-joining (NHEJ), which is required for double-strand break repair and V(D)J recombination. The DNA ligase IV (LIG4)-XRCC4 complex is responsible for the ligation step of NHEJ, and XRCC4 enhances the joining activity of LIG4. Binding of the LIG4-XRCC4 complex to DNA ends is dependent on the assembly of the DNA-dependent protein kinase complex DNA-PK to these DNA ends. XRCC4 monomers are comprised of an N-terminal globular head domain, a centrally located coiled-coil, and a C-terminal region. These monomers homodimerize through two dimerization domains, the N-terminal globular head domains and long extended alpha-helical coiled-coil regions. In addition, XRCC4 and XLF form symmetric heterodimers that interact through their globular head domains at the opposite end of the homodimer interface, and may form XLF-XRCC4 filaments. This model corresponds to the N-terminal head domain of XRCC4, which is structurally related to other XRCC4-superfamily members, PAXX, XLF, SAS6, and CCDC61.	117
409001	cd22284	HD_CCDC61_N	N-terminal head domain found in coiled-coil domain-containing protein 61 and similar proteins. Coiled-coil domain-containing protein 61 (CCDC61), also known as variable flagellar number 3 (VFL3), is a centrosomal protein required for spindle assembly and precise chromosome alignments in mitosis. It is the human ortholog of proteins required for anchoring distinct sets of cytoskeletal fibers to centrioles in unicellular eukaryotes. CCDC61 monomers are comprised of an N-terminal globular head domain, a centrally located coiled-coil, and a C-terminal region. These monomers homodimerize through two homodimerization domains, the N-terminal globular head domains and long extended alpha-helical coiled-coil regions. These CCDC61 homodimers assemble into linear filaments. This model corresponds to the N-terminal head domain of CCDC61, which is structurally related to other XRCC4-superfamily members, XRCC4, XLF, SAS6, and PAXX.	135
409002	cd22285	HD_XLF_N	N-terminal head domain found in XRCC4-like factor and similar proteins. XRCC4-like factor (XLF), also known as non-homologous end-joining factor 1 (NHEJ1) or protein cernunnos, is involved in DNA nonhomologous end joining (NHEJ), which is required for double-strand break (DSB) repair and V(D)J recombination. It interacts with the XRCC4-DNA ligase IV complex to promote NHEJ. It may act in concert with XRCC6/XRCC5 (Ku) to stimulate XRCC4-mediated joining of blunt ends and several types of mismatched ends that are non-complementary or partially complementary. XLF binds DNA in a length-dependent manner. Similar to XRCC4, XLF monomers are comprised of an N-terminal globular head domain, a centrally located coiled-coil, and a C-terminal region. These monomers homodimerize through two dimerization domains, the N-terminal globular head domains and long extended alpha-helical coiled-coil regions. In addition, XLF and XRCC4 form symmetric heterodimers that interact through their globular head domains at the opposite end of the homodimer interface, and may form XLF-XRCC4 filaments. This model corresponds to the N-terminal head domain of XLF, which is structurally related to other XRCC4-superfamily members, XRCC4, PAXX, SAS6, and CCDC61.	109
409003	cd22286	HD_PAXX_N	N-terminal head domain found in paralog of XRCC4 and XLF, and similar proteins. Paralog of XRCC4 and XLF (PAXX), also called XRCC4-like small protein, is a paralog of X-ray repair cross-complementing protein 4 (XRCC4) and XRCC4-like factor (XLF). It is involved in non-homologous end joining (NHEJ), a major pathway to repair double-strand breaks (DSBs) in DNA. It may act as a scaffold required to stabilize the DSB-repair protein Ku heterodimer, composed of XRCC5/Ku80 and XRCC6/Ku70, at double-strand break sites in cells. It functions with XRCC4 and XLF to bring about DSB repair and cell survival in response to DSB-inducing agents. Similar to XRCC4 and XLF, PAXX monomers are comprised of an N-terminal globular head domain, a centrally located coiled-coil, and a C-terminal region. These monomers homodimerize through two homodimerization domains, the N-terminal globular head domains and long extended alpha-helical coiled-coil regions. This model corresponds to the N-terminal head domain of PAXX, which is structurally related to other XRCC4-superfamily members, XRCC4, XLF, SAS6, and CCDC61.	102
412083	cd22287	REV3L_RBD	REV7 binding domain found in protein reversionless 3-like (REV3L) and similar proteins. REV3L, also called REV3-like, or REV3, or DNA polymerase zeta catalytic subunit (POLZ), is the catalytic subunit of the DNA polymerase zeta complex, an error-prone polymerase specialized in translesion DNA synthesis (TLS). REV3L lacks an intrinsic 3'-5' exonuclease activity and thus has no proofreading function. The model corresponds to a conserved region that is responsible for the binding of REV7.	23
412084	cd22288	CWC27_CTD	C-terminal domain of spliceosome-associated protein CWC27 and similar proteins. CWC27, also called antigen NY-CO-10, or probable inactive peptidyl-prolyl cis-trans isomerase CWC27, or PPIase CWC27, or serologically defined colon cancer antigen 10, is part of the spliceosome and plays a role in pre-mRNA splicing. It is a probable inactive PPIase with no peptidyl-prolyl cis-trans isomerase activity. This model corresponds to the C-terminal domain of CWC27, which interacts with CWC22 MIF4G domain.	56
412085	cd22289	RecQL4_SLD2_NTD	N-terminal homeodomain-like domain of metazoan RecQ protein-like 4 (RecQL4), fungal DNA replication regulator SLD2 and similar proteins. RecQL4, also called ATP-dependent DNA helicase Q4, or DNA helicase, RecQ-like type 4 (RecQ4), or RTS, is a DNA-dependent ATPase that may modulate chromosome segregation. This family also includes fungal DNA replication regulator SLD2, also known as DNA replication and checkpoint protein 1 (DRC1), which functions with DPB11 to control DNA replication and the S-phase checkpoint. It is also required for the proper activation of RAD53 in response to DNA damage and replication blocks. This model corresponds to the N-terminal domain of RecQL4 and SLD2, which is a homeodomain-like DNA interaction motif.	49
412086	cd22290	cc_RasGRP1_C	C-terminal coiled-coil domain of RAS guanyl-releasing protein 1 (RasGRP1) and similar proteins. RasGRP1, also called calcium and DAG-regulated guanine nucleotide exchange factor II (CalDAG-GEFII), or Ras guanyl-releasing protein, acts as a calcium- and diacylglycerol (DAG)-regulated nucleotide exchange factor, specifically activating Ras through the exchange of bound GDP for GTP. This model corresponds to the C-terminal coiled-coil domain of RasGRP1, which mediates oligomerization.	55
412087	cd22291	cc_THAP11_C	C-terminal coiled-coil domain of THAP domain-containing protein 11. THAP domain-containing protein 11 (THAP11) is a cell cycle and cell growth regulator differentially expressed in cancer cells. It acts as a transcriptional repressor that plays a central role for embryogenesis and the pluripotency of embryonic stem (ES) cells. This model corresponds to the C-terminal coiled-coil domain of THAP11, which is involved in protein dimerization.	61
412088	cd22292	cc_Cep135_MBD	coiled-coil microtubule binding domain of centrosomal protein of 135 kDa (Cep135) and similar proteins. Cep135, also called centrosomal protein 4, is involved in early centriole assembly, duplication, biogenesis, and formation. It is required for the recruitment of CEP295 to the proximal end of new-born centrioles at the centriolar microtubule wall during early S phase in a PLK4-dependent manner. This model corresponds to a conserved coiled-coil domain of Cep135, which is critical for microtubule binding.	62
412089	cd22293	RBD_SHLD3_N	N-terminal REV7-binding domain of Shieldin complex subunit 3 (SHLD3) and similar proteins. SHLD3, also called REV7-interacting novel NHEJ regulator 1, or Shield complex subunit 3, is a component of the shieldin complex, which plays an important role in the repair of DNA double-stranded breaks (DSBs). During G1 and S phase of the cell cycle, the complex functions downstream of TP53BP1 to promote non-homologous end joining (NHEJ) and suppress DNA end resection. SHLD3 mediates various NHEJ-dependent processes including immunoglobulin class-switch recombination, and fusion of unprotected telomeres. The model corresponds to the N-terminal REV7-binding domain of SHLD3, which contains a REV7-binding FXPWFP motif.	61
412090	cd22294	MYO6_MIU_linker	MIU-linker domain found in unconventional myosin-VI. Myosins are actin-based motor molecules with ATPase activity. Unconventional myosins function in intracellular movements. Myosin-VI, also called unconventional myosin-6 (MYO6), is a reverse-direction motor protein that moves towards the minus-end of actin filaments. It is required for the structural integrity of the Golgi apparatus via the p53-dependent pro-survival pathway. It appears to be involved in a very early step of clathrin-mediated endocytosis in polarized epithelial cells. It modulates RNA polymerase II-dependent transcription. As part of the DISP complex, Myosin-VI may regulate the association of septins with actin and thereby regulate the actin cytoskeleton. Myosin-VI is encoded by the MYO6 gene, the human homologue of the gene responsible for deafness in Snell's waltzer mice. It is mutated in autosomal dominant nonsyndromic hearing loss. This model corresponds to a conserved region of myosin-VI, which consists of three helices: MIU (Motif Interacting with Ubiquitin), a common linker helix (linker-alpha1) and an isoform-specific helix (linker-alpha2).	69
411969	cd22295	cc_LAMB_C	C-terminal coiled-coil domain found in the laminin subunit beta (LAMB) family. The LAMB family contains four members, LAMB1-4. They are components of laminin, a complex glycoprotein consisting of three different polypeptide chains (alpha, beta, gamma). Binding to cells via a high affinity receptor, laminin is thought to mediate the attachment, migration, and organization of cells into tissues during embryonic development by interacting with other extracellular matrix components. This model corresponds to the C-terminal coiled-coil domain of LAMB, which may be involved in the integrin binding activity.	70
412091	cd22296	CBD_TRPV5_C	C-terminal CaM binding domain found in transient receptor potential cation channel subfamily V member 5 (TRPV5) and similar proteins. TRPV5, also called calcium transport protein 2 (CaT2), epithelial calcium channel 1 (ECaC1), or Osm-9-like TRP channel 3 (OTRPC3), is a constitutively active calcium selective cation channel that might be involved in Ca(2+) reabsorption in kidney and intestine. The channel is activated by low internal calcium levels, and the current exhibits an inward rectification. The model corresponds to the C-terminal calmodulin (CaM) binding domain of TRPV5, which contains several CaM binding sites in the N- and C-terminal tails. The binding of CaM to the C-terminal binding site is essential for the fast Ca2+-dependent inactivation of the channel.	73
412092	cd22297	PSMD4_RAZUL	RAZUL (Rpn10 AZUL-binding) domain of 26S proteasome non-ATPase regulatory subunit 4 (PSMD4) and similar proteins. PSMD4 is also called 26S proteasome regulatory subunit RPN10, 26S proteasome regulatory subunit S5A, antisecretory factor 1, AF, ASF, or multiubiquitin chain-binding protein (MCB1). It acts as a ubiquitin receptor subunit through ubiquitin-interacting motifs and selects ubiquitin-conjugates for destruction. It displays a preferred selectivity for longer polyubiquitin chains. PSMD4 is a component of the 26S proteasome, a multiprotein complex involved in the ATP-dependent degradation of ubiquitinated proteins. The proteasome participates in numerous cellular processes, including cell cycle progression, apoptosis, or DNA damage repair. The model corresponds to the C-terminal Rpn10 AZUL-binding domain (RAZUL) of PSMD4, which is responsible for binding the AZUL domain of E6AP/UBE3A. AZUL stands for amino-terminal zinc-binding domain of ubiquitin E3a ligase.	48
412093	cd22298	NuMA_LGNBD	LGN binding domain (LGNBD) of nuclear mitotic apparatus protein (NuMA) and similar proteins. NuMA, also called nuclear matrix protein-22 (NMP-22), nuclear mitotic apparatus protein 1 (NUMA1), or SP-H antigen, is a microtubule (MT)-binding protein that plays a role in the formation and maintenance of spindle poles and the alignment and segregation of chromosomes during mitotic cell division. It is involved in the establishment of mitotic spindle orientation during metaphase, and elongation during anaphase in a dynein-dynactin-dependent manner. NuMA, in complex with LGN, forms NuMA:LGN hetero-hexamers that promote spindle orientation. The model corresponds to the LGN binding domain (LGNBD) of NuMA. LGN (named for leu-gly-asn repeats) is also known as G protein signaling modulator 2.	56
411970	cd22299	cc_LAMB2_C	C-terminal coiled-coil domain found in laminin subunit beta-2 (LAMB2). LAMB2 is also called laminin B1s chain, laminin-11 subunit beta, laminin-14 subunit beta, laminin-15 subunit beta, laminin-3 subunit beta, laminin-4 subunit beta, laminin-7 subunit beta, laminin-9 subunit beta, S-laminin subunit beta, or S-LAM beta (LAMS). It is an important component of the interphotoreceptor matrix and plays a role in rod morphogenesis. It may also have an important function in the sarcolemmal basement membrane. Mutations of the LAMB2 gene mainly cause Pierson syndrome (microcoria-congenital nephrosis syndrome). LAMB2 is a component of laminin, a complex glycoprotein consisting of three different polypeptide chains (alpha, beta, gamma). Binding to cells via a high affinity receptor, laminin is thought to mediate the attachment, migration, and organization of cells into tissues during embryonic development by interacting with other extracellular matrix components. This model corresponds to the C-terminal coiled-coil domain of LAMB2, which may be involved in the integrin binding activity.	72
411971	cd22300	cc_LAMB1_C	C-terminal coiled-coil domain found in laminin subunit beta-1 (LAMB1). LAMB1 is also called laminin B1 chain, laminin-1 subunit beta, laminin-10 subunit beta, laminin-12 subunit beta, laminin-2 subunit beta, laminin-6 subunit beta, or laminin-8 subunit beta. It is a glycoprotein that is involved in the pathogenesis of neurodevelopmental disorders. It also plays a crucial role in both lung morphogenesis and physiological function. Mutations in LAMB1 are associated with Cobblestone brain malformation (COB) with variable muscular or ocular abnormalities. LAMB1 is a component of laminin, a complex glycoprotein consisting of three different polypeptide chains (alpha, beta, gamma). Binding to cells via a high affinity receptor, laminin is thought to mediate the attachment, migration, and organization of cells into tissues during embryonic development by interacting with other extracellular matrix components. This model corresponds to the C-terminal coiled-coil domain of LAMB1, which is involved in the integrin binding activity.	73
411972	cd22301	cc_LAMB4_C	C-terminal coiled-coil domain found in laminin subunit beta-4 (LAMB4). LAMB4, also called laminin beta-1-related protein, is a component of laminin, a complex glycoprotein consisting of three different polypeptide chains (alpha, beta, gamma). Binding to cells via a high affinity receptor, laminin is thought to mediate the attachment, migration, and organization of cells into tissues during embryonic development by interacting with other extracellular matrix components. Mutations or loss of LAMB4 may be features of gastric and colorectal cancers. Reduced LAMB4 levels may contribute to colonic dysmotility associated with diverticulitis. This model corresponds to the C-terminal coiled-coil domain of LAMB4, which may be involved in the integrin binding activity.	70
411973	cd22302	cc_DmLAMB1-like_C	C-terminal coiled-coil domain found in Drosophila melanogaster laminin subunit beta-1 (DmLAMB1) and similar proteins. DmLAMB1, also called LanB1, is a glycoprotein required for nidogen (Ndg) localization to the basement membrane. It is a component of laminin, a complex glycoprotein consisting of three different polypeptide chains (alpha, beta, gamma). Binding to cells via a high affinity receptor, laminin is thought to mediate the attachment, migration, and organization of cells into tissues during embryonic development by interacting with other extracellular matrix components. This model corresponds to the C-terminal coiled-coil domain of DmLAMB1, which may be involved in the integrin binding activity.	70
411974	cd22303	cc_LAMB3_C	C-terminal coiled-coil domain found in laminin subunit beta-3 (LAMB3). LAMB3 is also called epiligrin subunit beta, kalinin B1 chain, kalinin subunit beta, laminin B1k chain, laminin-5 subunit beta, or nicein subunit beta. It is a major component of the basement membrane in most adult tissues. Mutations in LAMB3 are associated with Herlitz junctional epidermolysis bullosa (H-JEB), a severe autosomal recessive disorder characterized by blister formation within the dermal-epidermal basement membrane. LAMB3 is a component of laminin, a complex glycoprotein consisting of three different polypeptide chains (alpha, beta, gamma). Binding to cells via a high affinity receptor, laminin is thought to mediate the attachment, migration, and organization of cells into tissues during embryonic development by interacting with other extracellular matrix components. This model corresponds to the C-terminal coiled-coil domain of LAMB3, which may be involved in the integrin binding activity.	71
408997	cd22304	VpdB_C	C-terminal fragment of effector protein VpdB. This model represents the C-terminal fragment of the effector protein VpdB that binds the Legionella pneumophila Dot/Icm type IVB coupling protein (T4CP) complex which includes IcmS, IcmW, and LvgA. These L. pneumophila proteins are known to selectively assist the export of a subclass of effectors. The effector protein VpdB, like other L. pneumophila effectors VpdA, VpdC and VpdD, is a homolog of phospholipase A (PLA) patatin-like enzymes. However, VpdB does not appear to be involved in phospholipid metabolism. The structure reveals interactions between LvgA and a linear motif in the C-terminus of VpdB. This binding interface of LvgA also interacts with the C-terminal region of three additional L. pneumophila effectors, SidH, SetA, and PieA.	126
412069	cd22305	NDFIP1	NEDD4 family-interacting protein 1. The NEDD4 (neural precursor cell expressed, developmentally down-regulated protein 4)-family interacting proteins (NDFIPs) are adaptor proteins that recruit NEDD4 E3 ligases to specific substrate proteins, which leads to the ubiquitylation and subsequent degradation of these proteins. They also act as activators of the E3 ligase activity by releasing NEDD4 ligase from its auto-inhibitory conformation. NDFIP1 has been shown to play a role in a variety of processes, including inflammation, immune signaling, and nuclear trafficking.	206
412070	cd22306	NDFIP2	NEDD4 family-interacting protein 2. The NEDD4 (neural precursor cell expressed, developmentally down-regulated protein 4)-family interacting proteins (NDFIPs) are adaptor proteins that recruit NEDD4 E3 ligases to specific substrate proteins, which leads to the ubiquitylation and subsequent degradation of these proteins. They also act as activators of the E3 ligase activity by releasing NEDD4 ligase from its auto-inhibitory conformation. NDFIP2 may play a role in protein trafficking.	229
412094	cd22307	Adgb_C_mid-like	C-terminal middle region of Androglobins (Adgbs) and related proteins; including permuted globin domain and IQ motif. Androglobin (Adgb, also known as Calpain-7-like protein, CAPN7L) is a large multidomain protein consisting of an N-terminal peptidase C2 family calpain-like domain, an IQ calmodulin-binding motif, and an internal, circularly permuted globin domain. The canonical secondary structure of hemoglobins is a 3-over-3 alpha-helical sandwich structure, where the eight alpha-helical segments are conventionally labeled A-H, according to their sequential order; Adgbs differ from this in having helices C-H followed by A-B. Adgbs and other phylogenetically ancient globins, such as neuroglobins and globin X, form hexacoordinated heme iron complexes. Globins contain various highly conserved residues of the heme pocket: including a Phe in the interhelical position CD1 (Phe CD1, first position in the loop between the helices C and D) that is packed against the heme, a His at the 7th position of the E-helix (His E7) that binds the heme iron distally, and a His at the 8th position of the F-helix (His F8) that binds the heme iron proximally. Unlike other hexacoordinated globins, Adgbs have an E7 Gln; their hexacoordination scheme is [Gln]-Fe-[His]. In mammals, Adgb is mainly expressed in the testes and may play an important role in spermatogenesis. Arthropod Adgbs have degenerate globin domains (DOI:10.3389/fgene.2020.00858). This model spans the permuted globin domain, the IQ motif, and a conserved region of about 200 amino acid residues located C-terminal to the globin domain; it does not include the N-terminal protease domain or the large uncharacterized C-terminal domain of approximately 500 residues.	416
411712	cd22308	Af1548-like	Archaeoglobus fulgidus Af1548 and similar putative endonucleases. This family belongs to a superfamily of PDDEXK nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	165
411713	cd22309	AgeI-like	restriction endonuclease AgeI and similar endonucleases. Type IIP restriction endonuclease AgeI recognizes a palindromic sequence 5'-A|CCGGT-3' and cuts it ('|' denotes the cleavage site) producing staggered DNA ends. It belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	205
411714	cd22310	BcnI-like	Restriction endonuclease BcnI and similar endonucleases. Restriction endonuclease BcnI cleaves duplex DNA containing the sequence 5'-CC|SGG-3' (S stands for C or G, | designates a cleavage position) to generate staggered products with single nucleotide 5'-overhangs. It belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	229
411715	cd22311	BglI-like	Restriction endonuclease BglI and similar endonucleases. BglI is a type II restriction endonuclease that recognizes the interrupted DNA sequence GCCNNNNNGGC and cleaves between the fourth and fifth unspecified base pair to produce overhanging ends; it belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	284
411716	cd22312	BglII-like	Restriction endonuclease BglII and similar endonucleases. Restriction endonuclease BglII cleaves duplex DNA containing the sequence 5'-A|GATCT-3' (| designates the cleavage position) to generate staggered products with four nucleotide 5'-overhangs. It belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	178
411717	cd22313	BsaWI-like	endonuclease BsaWI and similar endonucleases. The type II restriction endonuclease BsaWI recognizes a degenerated sequence 5'-W|CCGGW-3', where W stands for A or T and  '|' denotes the cleavage site. It belongs to a family of restriction endonucleases that recognize a conserved CCGG tetranucleotide in their target and form homodimers or homotetramers, requiring binding of one, two or three DNA targets for optimal catalytic activity. They are part of a yet larger superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many other restriction endonucleases, such as EcoRI, BamHI, and FokI.	276
411718	cd22314	Bse634I-like	Restriction endonuclease Bse634I and similar endonucleases. Bacillus stearothermophilus restriction endonuclease Bse634I recognizes the nucleotide sequence R|CCGGY (R = A or G, Y = T or C, with | designating the cleavage site)  and is an isoschisomer of Citrobacter freundii restriction endonuclease Cfr10I; it is active as a homotetramer and belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	281
411719	cd22315	BsoBI-like	Type II restriction endonuclease BsoBI and similar proteins. BsoBI is a thermophilic PDDEXK-family restriction endonuclease exhibiting both base-specific and degenerate recognition within the sequence C-Y-C-G-R-G. (R = A or G, Y = T or C) A conserved histidine has been proposed to act as a general base in the catalysis. BsoBI belongs to a wider superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	288
411720	cd22316	BspD6I-like	nicking endonuclease Nt.BspD6I and similar endonucleases. Heterodimeric type II restriction endonuclease nicking endonuclease BspD6I recognizes a pseudosymmetric DNA sequence (5'-GAGTC) and cuts both strands outside the recognition motif 4 nucleotides downstream. It forms the large subunit in a heterodimeric arrangement. This catalytic domain/subunit belongs to a superfamily of PDDEXK nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	345
411721	cd22317	BstYI-like	type II restriction endonuclease BstYI and similar proteins. BstYI is a thermophilic PDDExK-family restriction endonuclease with specificities that overlap those of BamHI and BglII; it cleaves the degenerate hexanucleotide R-G-A-T-C-Y (R = A or G, Y = T or C) and is part of a larger superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	188
411722	cd22318	DNA2_N-like	Nuclease domain of the nuclease/helicase DNA2 and related nucleases. The eukaryotic nuclease/helicase DNA2 processes double-strand breaks in DNA that have single-stranded ends/overhangs, as well as Okazaki fragments and stalled replication forks; it is therefore crucial for maintaining the integrity of the genome. The nuclease domain modeled here belongs to a superfamily of PDDEXK nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	234
411723	cd22319	DpnI-like	type II restriction endonuclease DpnI and similar proteins. This catalytic PD-(D/E)XK domain co-occurs with a C-terminal winged-helix DNA binding domain that is not included in the model. Both domains of R.DpnI bind DNA and are separately specific for the Gm6ATC sequences in Dam-methylated DNA. DpnI or Dam-replacing protein (DRP) is a restriction endonuclease flanked by pseudo-transposable small repeat elements. The replacement of Dam-methylase by DRP allows phase variation through slippage-like mechanisms in several pathogenic isolates of Neisseria meningitidis. Type II restriction endonuclease DpnI belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	174
411724	cd22320	Ecl18kI-like	Restriction endonuclease Ecl18kI and similar endonucleases. Restriction endonuclease Ecl18kI recognizes the sequence |CCNGG and cleaves it before the outer C (| designates the cleavage site) to generate 5 nt 5'-overhangs. It belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	262
411725	cd22321	EcoO109I-like	Restriction endonuclease EcoO109I and related endonucleases. EcoO109I is a type II restriction endonuclease that recognizes ds DNA with a seven-base pair motif of both degenerate and discontinuous sequence, RG|GNCCY (R = A or G, Y = T or C, with | designating the cleavage site), and generates 5'-overhangs; it belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	232
411726	cd22322	EcoRII-like	Restriction endonuclease EcoRII and similar endonucleases. Restriction endonuclease EcoRII recognizes the sequence 5'-CCWGG-3' (W stands for A or T); it requires binding of a second target site as an allosteric effector in order to be active. EcoRII belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	211
411727	cd22323	EcoRV-like	Restriction endonuclease EcoRV and similar endonucleases. Type II restriction endonuclease EcoRV recognizes the site 5'-GAT|ATC-3' (| denotes the cleavage site) and functions as a homodimer; it belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	221
411728	cd22324	Endonuclease_I	Endonuclease I and similar nucleases. Junction-resolving T7 endonuclease I is a nuclease that is selective for the structure of the four-way DNA junction; it belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	114
411729	cd22325	ERCC1_C-like	Central domain of ERCC1. ERCC1 is a subunit of the DNA structure-specific endonuclease XPF-ERCC1, which incises a damaged DNA strand on the 5' side of a lesion during nucleotide excision repair. It also plays roles in DNA interstrand crosslink repair and homologous recombination. The ERCC1 central domain modeled here interacts tightly with XPF and may be involved in binding to single-stranded DNA. It belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	128
411730	cd22326	FAN1-like	repair nuclease FAN1. This model characterizes a set of nucleases that resemble Holliday-junction resolving enzymes. They belong to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	652
411731	cd22327	FokI_nuclease-like	Nuclease domain of restriction endonuclease FokI and similar endonucleases. The type II restriction endonuclease FokI recognizes an asymmetric nucleotide sequence 5'-GGATG(N)9/13 and cleaves both DNA strands outside the recognition motif; its nuclease domain belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and HindIII.	161
411732	cd22328	Hef-like	Hef-like homing endonuclease and similar nucleases. Hef-like homing endonucleases include I-Bth0305I, which is encoded within a group I intron in the recA gene of a Bacillus thuringiensis bacteriophage and cleaves a DNA target in the uninterrupted recA gene at a position immediately adjacent to the intron insertion site. It belongs to a superfamily of PDDEXK nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	99
411733	cd22329	HincII-like	Restriction endonuclease HincII and similar endonucleases. Type II restriction endonuclease HincII cleaves double-stranded DNA 5'-GTY|RAC-3' (| denotes the cleavage site, Y stands for C or T, R stands for A or G) creating blunt ends. It belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	252
411734	cd22330	HindIII-like	Restriction endonuclease HindIII and similar endonucleases. The type II restriction endonuclease HindIII cleaves DNA at the palindromic sequence A|AGCTT (| denotes the cleavage site). It belongs to a superfamily of PDDEXK nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	289
411735	cd22331	HinP1I-like	Restriction endonuclease HinP1I and similar endonucleases. HinP1I is a type II restriction endonuclease that recognizes and cleaves a palindromic tetranucleotide sequence (G|CGC) in double-stranded DNA, producing 2 nt 5' overhanging ends; it belongs to the PDDEXK superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	243
411736	cd22332	HsdR_N	N-terminal domain of HsdR motor subunit of type I restriction-modification enzyme EcoR124I and similar systems. The N-terminal endonuclease-like domain of HsdR motor subunit of type I restriction-modification enzyme EcoR124I belongs to a wider superfamily of PDDEXK nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	226
411737	cd22333	LlaBIII_nuclease-like	nuclease domain of type ISB restriction-modification enzyme LlaBIII and similar nuclease domains. This N-terminal nuclease domain belongs to a superfamily of PDDEXK nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	149
411738	cd22334	MspI-like	Restriction endonuclease MspI and similar endonucleases. The type II restriction endonuclease MspI recognizes and cleaves the palindromic tetranucleotide sequence 5'-C|CGG (| denotes the cleavage site) leaving 2 base 5' overhangs. It belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	262
411739	cd22335	MspjI-like	Modification-dependent restriction endonuclease MspjI and similar endonucleases. MspJI recognizes 5-methylcytosine or 5-hydroxymethylcytosine as part of the motif CNN(G/A) and cleaves both strands at fixed distances (N(12)/N(16)) away from the modified cytosine at the 3'-side. It belongs to a superfamily of PDDEXK nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	185
411740	cd22336	MunI-like	restriction endonuclease MunI and similar proteins. MunI (EC 3.1.21.4) is a type II restriction enzyme that catalyzes the hydrolysis of DNA, recognizing the palindromic hexanucleotide sequence CAATTG (with the cleavage site after C-1), and is very similar to EcoRI. It belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	200
411741	cd22337	MvaI-like	Restriction endonuclease MvaI and similar endonucleases. Restriction endonuclease MvaI recognizes the sequence CC|WGG (W stands for A or T, '|' designates the cleavage site) and generates products with single nucleotide 5'-overhangs; it belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	239
411742	cd22338	NaeI-like	Restriction endonuclease NaeI and similar endonucleases. The type II restriction endonuclease NaeI recognizes and cleaves the DNA motif GCC|GGC (| denotes the cleavage site) and forms a covalent bond with the cleaved substrate. The enzyme binds two DNA recognition sites and only cleaves one DNA sequence. It belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	256
411743	cd22339	NciI-like	Restriction endonuclease NciI and similar endonucleases. NciI is a type II restriction endonuclease that recognizes and cleaves the sequence CC|SGG (S stands for C or G, | denotes the cleavage site). It belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	170
411744	cd22340	NgoMIV-like	Restriction endonuclease NgoMIV and similar endonucleases. Type II restriction endonuclease NgoMIV recognizes and cleaves the palindromic sequence 5'-G|CCGGC-3' (| denotes the cleavage site) to produce 4 bp 5' staggered ends. It is active as a homotetramer and belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	238
411745	cd22341	NucS-like	Mismatch restriction endonuclease NucS and similar nucleases. Archaeal mismatch restriction endonuclease NucS and its ortholog EndoMS specifically cleave dsDNA containing mismatched bases. They belong to a superfamily of PDDEXK nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	237
411746	cd22342	Pa4535-like	putative restriction endonuclease similar to Pseudomonas aeruginosa Pa4535. These proteins belong to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	195
411747	cd22343	PDDEXK_lambda_exonuclease-like	Uncharacterized nucleases similar to lambda phage exonuclease. This model characterizes a diverse set of nucleases such as alkaline exonuclease from Laribacter hongkongensis, lambda phage exonuclease, or a Cas4-like protein from the Mimivirus virophage resistance element system. They belong to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	182
411748	cd22344	PDDEXK_nuclease	uncharacterized PDDEXK nuclease may function as a restriction endonuclease. This family belongs to a superfamily of PDDEXK nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	220
411749	cd22345	PDDEXK_nuclease	uncharacterized PDDEXK nuclease may function as a restriction endonuclease. This family belongs to a superfamily of PDDEXK nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	201
411750	cd22346	PDDEXK_nuclease	uncharacterized PDDEXK nuclease may function as a restriction endonuclease. This family belongs to a superfamily of PDDEXK nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	221
411751	cd22347	PDDEXK_nuclease	uncharacterized PDDEXK nuclease may function as a restriction endonuclease. This family belongs to a superfamily of PDDEXK nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	203
411752	cd22348	PDDEXK_nuclease	uncharacterized PDDEXK nuclease may function as a restriction endonuclease. This family belongs to a superfamily of PDDEXK nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	218
411753	cd22349	PDDEXK_RNA_polymerase-like	Endonuclease domain of segmented negative-strand RNA virus (sNSV) polymerases. The N-terminal endonuclease domain of sNSV polymerases is essential for viral cap-dependent transcription; it has endonuclease activity and belongs to a superfamily of PDDEXK nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	182
411754	cd22350	PspGI-like	Restriction endonuclease PspGI and similar nucleases. PspGI is an isoschizomer of EcoRII; it recognizes and cleaves the DNA sequence 5'-|CCWGG-3' (| denotes the cleavage site, W stands for A or T). It belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	239
411755	cd22351	PvuII-like	Restriction endonuclease PvuII and similar nucleases. The type II restriction endonuclease PvuII recognizes and cleaves the DNA sequence 5'-CAG|CTG-3' leaving blunt ends (| denotes the cleavage site). It belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	151
411756	cd22352	RecB_C-like	C-terminal nuclease domain of exodeoxyribonuclease V subunit RecB and similar proteins. Exodeoxyribonuclease V subunit beta (RecB) is a helicase/nuclease that prepares dsDNA breaks (DSB) for recombinational DNA repair; it binds to DSBs and unwinds DNA via a rapid and highly processive ATP-dependent bidirectional helicase. The C-terminal PDDEXK nuclease domain belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	215
411757	cd22353	RecC_C-like	C-terminal nuclease-like domain of exodeoxyribonuclease V subunit RecC and similar proteins. Exodeoxyribonuclease V subunit gamma (RecC) is part of the RecBCD complex that processes DNA ends resulting from a double-strand break. Its C-terminal domain contacts the two separate strands of the DNA substrate and may be responsible for stabilizing RecD interactions with the complex. It belongs to a superfamily of PDDEXK nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	283
411758	cd22354	RecU-like	Holliday junction resolvase RecU (recombination protein U) and similar nucleases. Holliday junction (HJ) resolving enzyme RecU is involved in DNA repair and recombination. It belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	157
411759	cd22355	Sau3AI_C	C-terminal allosteric effector domain of the restriction endonuclease Sau3AI. Sau3AI is a type II restriction enzyme that recognizes the 5'-|GATC-3' sequence in double-strand DNA (| denotes the cleavage site). The C-terminal domain modeled here does not have catalytic activity; it functions as an allosteric effector domain that assists in DNA binding and cleavage. It belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	214
411760	cd22356	Sau3AI_N-like	N-terminal catalytic domain of type II restriction enzyme Sau3AI and similar endonucleases. Sau3AI is a type II restriction enzyme that recognizes the 5'-|GATC-3' sequence in double-strand DNA (| denotes the cleavage site). The N-terminal domain modeled here conveys the catalytic activity; it belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	188
411761	cd22357	SfsA-like	Sugar fermentation stimulation protein A and similar nucleases. Sugar fermentation stimulation protein A may bind to DNA in a non-specific manner and may act as a regulatory factor involved in the metabolism of sugars such as maltose. However, it contains a well-conserved PDDEXK nuclease active site and may have hydrolytic activity towards an unknown target. The putative catalytic domain belongs to a superfamily of PDDEXK nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	213
411762	cd22358	SfsA-like_archaeal	Sugar fermentation stimulation protein A and similar nucleases. Sugar fermentation stimulation protein A may bind to DNA in a non-specific manner and may act as a regulatory factor involved in the metabolism of sugars such as maltose. However, it contains a well-conserved PDDEXK nuclease active site and may have hydrolytic activity towards an unknown target. The putative catalytic domain belongs to a superfamily of PDDEXK nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	221
411763	cd22359	SfsA-like_bacterial	Sugar fermentation stimulation protein A and similar proteins. Sugar fermentation stimulation protein A may bind to DNA in a non-specific manner and may act as a regulatory factor involved in the metabolism of sugars such as maltose. However, it contains a well-conserved PDDEXK nuclease active site and may have hydrolytic activity towards an unknown target. The putative catalytic domain belongs to a superfamily of PDDEXK nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI. The N-terminus of SfsA resembles a DNA-binding OB-fold domain.	218
411764	cd22360	SgrAI-like	Restriction endonuclease SgrAI and similar nucleases. The type II restriction endonuclease SgrAI binds and cleaves the target sequence CR|CCGGYG (| denotes the cleavage site, R stands for a purine and Y stands for a pyrimidine). It belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	272
411765	cd22361	ThaI-like	type II restriction endonuclease subunit R of ThaI and similar endonucleases. The PD-(D/E)XK type II restriction endonuclease ThaI cuts the target sequence CG/CG with blunt ends. It belongs to a superfamily of PDDEXK nucleases that includes diverse members such as very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	200
411766	cd22362	TnsA_endonuclease-like	Transposon Tn7 transposition protein TnsA. TnsA is part of the Tn7 transposon mobile genetic element working together with TnsB, TnsC, and TnsD to facilitate insertion of the transposon. TnsA catalyzes cleavage at the transposon 5' ends, and TnsC is the activator of the composite TnsAB transposase. TnsA belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	234
411767	cd22363	tRNA-intron_lyase_C	catalytic C-terminal domain of the tRNA-intron lyase. This C-terminal catalytic domain of tRNA intron endonucleases cleaves pre-tRNA at the 5' and 3' splice sites to release the intron (EC:3.1.27.9). It belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	91
411768	cd22364	VC1899-like	putative nuclease domain found in Vibrio cholerae VC1899 and similar proteins. A putative nuclease domain found in Vibrio cholerae VC1899 and similar proteins belongs to a superfamily of PDDEXK nucleases that includes diverse members such as very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	377
411769	cd22365	VRR-NUC-like	Virus-type replication repair nuclease. This model characterizes a set of nucleases that resemble Holliday-junction resolving enzymes. They belong to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	92
411770	cd22366	XisH-like	Endonuclease XisH and similar nucleases. XisH functions as an endonuclease in the control of expression of nitrogen fixation genes of certain Anabaena and Nostoc species of cyanobacteria. Together with XisI, it controls the cell-type specificity of the excision of the fdxN element by the recombinase XisF. XisH belongs to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	133
411771	cd22367	XPF_ERCC4_MUS81-like	XPF family DNA repair endonuclease. XPF (Xeroderma Pigmentosum group F) DNA repair gene homologs are members of the XPF/Rad1/Mus81-dependent nuclease family, which specifically cleave branched structures generated during DNA repair, replication, and recombination, and they are essential for maintaining genome stability. They belong to a wider superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	123
411772	cd22368	YaeQ-like	Nucleases similar to Escherichia coli YaeQ. This model characterizes a diverse set of poorly characterized nucleases such as Escherichia coli YaeQ. They belong to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	174
411956	cd22369	alphaCoV_Spike_SD1-2_S1-S2_S2	SD-1 and SD-2 subdomains, the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) protein from alphacoronaviruses. This group contains the SD-1 and SD-2 subdomains of the S1 subunit C-terminal domain (C-domain), the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from alphacoronaviruses including human coronaviruses (HCoVs), HCoV-NL63, and HCoV-229E, and porcine coronaviruses, transmissible gastroenteritis virus (TGEV) and porcine epidemic diarrhea virus (PEDV), among others. The CoV S protein is an envelope glycoprotein that plays a very important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains the coronavirus fusion machinery and is primarily alpha-helical. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-domain. The S1 C-domain also contains two subdomains (SD-1 and SD-2), which connect the S1 and S2 subunits. Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs, including SARS-CoV-2, SARS-CoV and MERS-CoV use the C-domain to bind their receptors. The S2 subunit comprises the fusion peptide (FP), a second proteolytic site (S2'), followed by an internal fusion peptide (IFP), and two heptad-repeat domains (HR1 and HR2) preceding the transmembrane domain (TM). 
After binding of the S1 subunit RBD on the virion to its receptor on the target cell, the HR1 and HR2 domains interact with each other to form a six-helix bundle (6-HB) fusion core, bringing viral and cellular membranes into close proximity for fusion and infection. In order to catalyze the membrane fusion reaction, CoV S needs to be primed through cleavage at the S1/S2 and S2' sites. In the case of human-infecting coronaviruses such as SARS-CoV-2, HCoV-OC43, MERS-CoV, and HKU1, the spike protein contains an insertion of (R/K)-(2X)n-(R/K) (furin cleavage motif) at the S1/S2 site, which is absent in SARS-CoV and other SARS-related coronaviruses, as well as Rousettus bat coronavirus HKU9. The region modeled in this cd (SD-1 and SD-2, the S1/S2 cleavage region, and the S2 fusion subunit) plays an essential role in viral entry by initiating fusion of the viral and cellular membranes.	666
411957	cd22370	betaCoV_Spike_SD1-2_S1-S2_S2	SD-1 and SD-2 subdomains, the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from betacoronaviruses. This family contains the SD-1 and SD-2 subdomains of the S1 subunit C-terminal domain (C-domain), the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from betacoronaviruses, including three highly pathogenic human coronaviruses (CoVs), Middle East respiratory syndrome coronavirus (MERS-CoV), Severe acute respiratory syndrome (SARS) coronavirus (SARS-CoV), and SARS coronavirus 2 (SARS-CoV-2), also known as a 2019 novel coronavirus (2019-nCoV). The CoV S protein is an envelope glycoprotein that plays a very important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains the coronavirus fusion machinery and is primarily alpha-helical. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-domain. The S1 C-domain also contains two subdomains (SD-1 and SD-2), which connect the S1 and S2 subunits. Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs, including SARS-CoV-2, SARS-CoV and MERS-CoV use the C-domain to bind their receptors. 
The S2 subunit comprises the fusion peptide (FP), a second proteolytic site (S2'), followed by an internal fusion peptide (IFP) and two heptad-repeat domains (HR1 and HR2) preceding the transmembrane domain (TM). After binding of the S1 subunit RBD on the virion to its receptor on the target cell, the HR1 and HR2 domains interact with each other to form a six-helix bundle (6-HB) fusion core, bringing viral and cellular membranes into close proximity for fusion and infection. In order to catalyze the membrane fusion reaction, CoV S needs to be primed through cleavage at the S1/S2 and S2' sites. In the case of human-infecting coronaviruses such as SARS-CoV-2, HCoV-OC43, MERS-CoV, and HKU1, the spike protein contains an insertion of (R/K)-(2X)n-(R/K) (furin cleavage motif) at the S1/S2 site, which is absent in SARS-CoV and other SARS-related coronaviruses, as well as Rousettus bat coronavirus HKU9. The region modeled in this cd (SD-1 and SD-2, the S1/S2 cleavage region, and the S2 fusion subunit) plays an essential role in viral entry by initiating fusion of the viral and cellular membranes.	667
411958	cd22371	alphaCoV-HKU2-like_Spike_SD1-2_S1-S2_S2	SD-1 and SD-2 subdomains, the S1/S2 cleavage region, and the S2 fusion subunit of the CoV spike (S) glycoprotein from Rhinolophus bat coronavirus HKU2 and related alphacoronaviruses. This group contains the SD-1 and SD-2 subdomains of the S1 subunit C-terminal domain (C-domain), the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from Wencheng shrew coronavirus (WESV), Lucheng Rn rat coronavirus (LRNV), and two bat viruses (Rhinolophus bat coronavirus HKU2 and BtRf-AlphaCoV/YN2012). Members of this group form a distinct cluster that is separated from the other alphacoronaviruses. The CoV S protein is an envelope glycoprotein that plays a very important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains the coronavirus fusion machinery and is primarily alpha-helical. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-domain. The S1 C-domain also contains two subdomains (SD-1 and SD-2), which connect the S1 and S2 subunits. Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs, including SARS-CoV-2, SARS-CoV and MERS-CoV use the C-domain to bind their receptors. 
The S2 subunit comprises the fusion peptide (FP), a second proteolytic site (S2'), followed by an internal fusion peptide (IFP) and two heptad-repeat domains (HR1 and HR2) preceding the transmembrane domain (TM). After binding of the S1 subunit RBD on the virion to its receptor on the target cell, the HR1 and HR2 domains interact with each other to form a six-helix bundle (6-HB) fusion core, bringing viral and cellular membranes into close proximity for fusion and infection. In order to catalyze the membrane fusion reaction, CoV S needs to be primed through cleavage at the S1/S2 and S2' sites. In the case of human-infecting coronaviruses such as SARS-CoV-2, HCoV-OC43, MERS-CoV, and HCoV-HKU1, the spike protein contains an insertion of (R/K)-(2X)n-(R/K) (furin cleavage motif) at the S1/S2 site, which is absent in SARS-CoV and other SARS-related coronaviruses, as well as Ro-BatCoV HKU9. The region modeled in this cd (SD-1 and SD-2, the S1/S2 cleavage region, and the S2 fusion subunit) plays an essential role in viral entry by initiating fusion of the viral and cellular membranes.	686
411959	cd22372	gammaCoV_Spike_SD1-2_S1-S2_S2	SD-1 and SD-2 subdomains, the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from avian infectious bronchitis coronavirus (IBV) and related gammacoronaviruses. This group contains the SD-1 and SD-2 subdomains of the S1 subunit C-terminal domain (C-domain), the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from gammacoronaviruses, including avian infectious bronchitis virus, and Beluga whale coronavirus SW1 (whale-CoV SW1). The CoV S protein is an envelope glycoprotein that plays a very important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains the coronavirus fusion machinery and is primarily alpha-helical. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-domain. The S1 C-domain also contains two subdomains (SD-1 and SD-2), which connect the S1 and S2 subunits. Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs, including SARS-CoV-2, SARS-CoV and MERS-CoV use the C-domain to bind their receptors. The S2 subunit comprises the fusion peptide (FP), a second proteolytic site (S2'), followed by an internal fusion peptide (IFP) and two heptad-repeat domains (HR1 and HR2) preceding the transmembrane domain (TM). 
After binding of the S1 subunit RBD on the virion to its receptor on the target cell, the HR1 and HR2 domains interact with each other to form a six-helix bundle (6-HB) fusion core, bringing viral and cellular membranes into close proximity for fusion and infection. In order to catalyze the membrane fusion reaction, CoV S needs to be primed through cleavage at the S1/S2 and S2' sites. In the case of human-infecting coronaviruses such as SARS-CoV-2, HCoV-OC43, MERS-CoV, and HCoV-HKU1, the spike protein contains an insertion of (R/K)-(2X)n-(R/K) (furin cleavage motif) at the S1/S2 site, which is absent in SARS-CoV and other SARS-related coronaviruses, as well as Ro-BatCoV HKU9. The region modeled in this cd (SD-1 and SD-2, the S1/S2 cleavage region, and the S2 fusion subunit) plays an essential role in viral entry by initiating fusion of the viral and cellular membranes.	661
411960	cd22373	delta-PDCoV-like_Spike_SD1-2_S1-S2_S2	SD-1 and SD-2 subdomains, the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from porcine coronavirus HKU15, avian coronaviruses, and related deltacoronaviruses. This group contains the SD-1 and SD-2 subdomains of the S1 subunit C-terminal domain (C-domain), the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from porcine coronavirus PDCoV, and several avian coronaviruses such as quail deltacoronavirus (QdCoV) UAE-HKU30, white-eye coronavirus HKU16, common moorhen coronavirus HKU21, thrush CoV HKU12, and munia CoV HKU13, all from the Buldecovirus subgenus of deltacoronaviruses. The CoV S protein is an envelope glycoprotein that plays a very important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains the coronavirus fusion machinery and is primarily alpha-helical. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-domain. The S1 C-domain also contains two subdomains (SD-1 and SD-2), which connect the S1 and S2 subunits. Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs, including SARS-CoV-2, SARS-CoV and MERS-CoV use the C-domain to bind their receptors. 
The S2 subunit comprises the fusion peptide (FP), a second proteolytic site (S2'), followed by an internal fusion peptide (IFP) and two heptad-repeat domains (HR1 and HR2) preceding the transmembrane domain (TM). After binding of the S1 subunit RBD on the virion to its receptor on the target cell, the HR1 and HR2 domains interact with each other to form a six-helix bundle (6-HB) fusion core, bringing viral and cellular membranes into close proximity for fusion and infection. In order to catalyze the membrane fusion reaction, CoV S needs to be primed through cleavage at the S1/S2 and S2' sites. In the case of human-infecting coronaviruses such as SARS-CoV-2, HCoV-OC43, MERS-CoV, and HCoV-HKU1, the spike protein contains an insertion of (R/K)-(2X)n-(R/K) (furin cleavage motif) at the S1/S2 site, which is absent in SARS-CoV and other SARS-related coronaviruses, as well as Ro-BatCoV HKU9. The region modeled in this cd (SD-1 and SD-2, the S1/S2 cleavage region, and the S2 fusion subunit) plays an essential role in viral entry by initiating fusion of the viral and cellular membranes.	648
411961	cd22374	delta-PiCoV-like_Spike_SD1-2_S1-S2_S2	SD-1 and SD-2 subdomains, the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from Pigeon coronavirus UAE-HKU29, and related avian deltacoronaviruses. This group contains the SD-1 and SD-2 subdomains of the S1 subunit C-terminal domain (C-domain), the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from Pigeon coronavirus UAE-HKU29, and related avian deltacoronaviruses including Falcon coronavirus UAE-HKU27, Magpie-robin coronavirus HKU18, Sparrow coronavirus HKU17, and Night heron coronavirus HKU19. The CoV S protein is an envelope glycoprotein that plays a very important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains the coronavirus fusion machinery and is primarily alpha-helical. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-domain. The S1 C-domain also contains two subdomains (SD-1 and SD-2), which connect the S1 and S2 subunits. Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs, including SARS-CoV-2, SARS-CoV and MERS-CoV use the C-domain to bind their receptors. 
The S2 subunit comprises the fusion peptide (FP), a second proteolytic site (S2'), followed by an internal fusion peptide (IFP) and two heptad-repeat domains (HR1 and HR2) preceding the transmembrane domain (TM). After binding of the S1 subunit RBD on the virion to its receptor on the target cell, the HR1 and HR2 domains interact with each other to form a six-helix bundle (6-HB) fusion core, bringing viral and cellular membranes into close proximity for fusion and infection. In order to catalyze the membrane fusion reaction, CoV S needs to be primed through cleavage at the S1/S2 and S2' sites. In the case of human-infecting coronaviruses such as SARS-CoV-2, HCoV-OC43, MERS-CoV, and HCoV-HKU1, the spike protein contains an insertion of (R/K)-(2X)n-(R/K) (furin cleavage motif) at the S1/S2 site, which is absent in SARS-CoV and other SARS-related coronaviruses, as well as Ro-BatCoV HKU9. The region modeled in this cd (SD-1 and SD-2, the S1/S2 cleavage region, and the S2 fusion subunit) plays an essential role in viral entry by initiating fusion of the viral and cellular membranes.	739
411962	cd22375	HCoV-NL63-229E-like_Spike_SD1-2_S1-S2_S2	SD-1 and SD-2 subdomains, the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoproteins from HCoV-NL63, HCoV-229E, and related alphacoronavirus. This group contains the SD-1 and SD-2 subdomains of the S1 subunit C-terminal domain (C-domain), the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from alphacoronaviruses, including human coronaviruses (HCoVs), HCoV-NL63 and HCoV-229E. The CoV S protein is an envelope glycoprotein that plays a very important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains the coronavirus fusion machinery and is primarily alpha-helical. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-domain. The S1 C-domain also contains two subdomains (SD-1 and SD-2), which connect the S1 and S2 subunits. Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs, including SARS-CoV-2, SARS-CoV and MERS-CoV use the C-domain to bind their receptors. The S2 subunit comprises the fusion peptide (FP), a second proteolytic site (S2'), followed by an internal fusion peptide (IFP) and two heptad-repeat domains (HR1 and HR2) preceding the transmembrane domain (TM). 
After binding of the S1 subunit RBD on the virion to its receptor on the target cell, the HR1 and HR2 domains interact with each other to form a six-helix bundle (6-HB) fusion core, bringing viral and cellular membranes into close proximity for fusion and infection. In order to catalyze the membrane fusion reaction, CoV S needs to be primed through cleavage at the S1/S2 and S2' sites. In the case of human-infecting coronaviruses such as SARS-CoV-2, HCoV-OC43, MERS-CoV, and HKU1, the spike protein contains an insertion of (R/K)-(2X)n-(R/K) (furin cleavage motif) at the S1/S2 site, which is absent in SARS-CoV and other SARS-related coronaviruses, as well as Rousettus bat coronavirus HKU9. The region modeled in this cd (SD-1 and SD-2, the S1/S2 cleavage region, and the S2 fusion subunit) plays an essential role in viral entry by initiating fusion of the viral and cellular membranes.	677
411963	cd22376	PDEV-like_Spike_SD1-2_S1-2_S2	SD-1 and SD-2 subdomains, the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from Porcine epidemic diarrhea virus and related alphacoronavirus. This group contains the SD-1 and SD-2 subdomains of the S1 subunit C-terminal domain (C-domain), the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from alphacoronaviruses, including porcine epidemic diarrhea virus (PEDV), Scotophilus bat coronavirus, and swine enteric coronavirus, among others. The CoV S protein is an envelope glycoprotein that plays a very important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains the coronavirus fusion machinery and is primarily alpha-helical. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-domain. The S1 C-domain also contains two subdomains (SD-1 and SD-2), which connect the S1 and S2 subunits. Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs, including SARS-CoV-2, SARS-CoV and MERS-CoV use the C-domain to bind their receptors. The S2 subunit comprises the fusion peptide (FP), a second proteolytic site (S2'), followed by an internal fusion peptide (IFP) and two heptad-repeat domains (HR1 and HR2) preceding the transmembrane domain (TM). 
After binding of the S1 subunit RBD on the virion to its receptor on the target cell, the HR1 and HR2 domains interact with each other to form a six-helix bundle (6-HB) fusion core, bringing viral and cellular membranes into close proximity for fusion and infection. In order to catalyze the membrane fusion reaction, CoV S needs to be primed through cleavage at the S1/S2 and S2' sites. In the case of human-infecting coronaviruses such as SARS-CoV-2, HCoV-OC43, MERS-CoV, and HKU1, the spike protein contains an insertion of (R/K)-(2X)n-(R/K) (furin cleavage motif) at the S1/S2 site, which is absent in SARS-CoV and other SARS-related coronaviruses, as well as Rousettus bat coronavirus HKU9. The region modeled in this cd (SD-1 and SD-2, the S1/S2 cleavage region, and the S2 fusion subunit) plays an essential role in viral entry by initiating fusion of the viral and cellular membranes.	673
411964	cd22377	TGEV-like_Spike_SD1-2_S1-S2_S2	SD-1 and SD-2 subdomains, the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from transmissible gastroenteritis virus and related alphacoronaviruses. This group contains the SD-1 and SD-2 subdomains of the S1 subunit C-terminal domain (C-domain), the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from porcine transmissible gastroenteritis virus (TGEV), canine coronavirus (CCoV), and feline coronavirus (FCoV). They display greater than 96% sequence identity and have been grouped in the same species, alphacoronavirus 1, within the Alphacoronavirus genus. The CoV S protein is an envelope glycoprotein that plays a very important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains the coronavirus fusion machinery and is primarily alpha-helical. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-domain. The S1 C-domain also contains two subdomains (SD-1 and SD-2), which connect the S1 and S2 subunits. Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs, including SARS-CoV-2, SARS-CoV and MERS-CoV use the C-domain to bind their receptors. 
The S2 subunit comprises the fusion peptide (FP), a second proteolytic site (S2'), followed by an internal fusion peptide (IFP) and two heptad-repeat domains (HR1 and HR2) preceding the transmembrane domain (TM). After binding of the S1 subunit RBD on the virion to its receptor on the target cell, the HR1 and HR2 domains interact with each other to form a six-helix bundle (6-HB) fusion core, bringing viral and cellular membranes into close proximity for fusion and infection. In order to catalyze the membrane fusion reaction, CoV S needs to be primed through cleavage at the S1/S2 and S2' sites. In the case of human-infecting coronaviruses such as SARS-CoV-2, HCoV-OC43, MERS-CoV, and HKU1, the spike protein contains an insertion of (R/K)-(2X)n-(R/K) (furin cleavage motif) at the S1/S2 site, which is absent in SARS-CoV and other SARS-related coronaviruses, as well as Rousettus bat coronavirus HKU9. The region modeled in this cd (SD-1 and SD-2, the S1/S2 cleavage region, and the S2 fusion subunit) plays an essential role in viral entry by initiating fusion of the viral and cellular membranes.	751
411965	cd22378	SARS-CoV-like_Spike_SD1-2_S1-S2_S2	SD-1 and SD-2 subdomains, the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from SARS-CoV-2 (COVID-19) and related betacoronaviruses in the B lineage. This group contains the SD-1 and SD-2 subdomains of the S1 subunit C-terminal domain (C-domain), the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from betacoronaviruses in the sarbecovirus subgenus (B lineage), including highly pathogenic human CoVs such as Severe acute respiratory syndrome (SARS) coronavirus (SARS-CoV), and SARS-CoV-2 (also known as 2019 novel coronavirus or 2019-nCoV). The CoV S protein is an envelope glycoprotein that plays a very important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediate attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains the coronavirus fusion machinery and is primarily alpha-helical. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-domain. The S1 C-domain also contains two subdomains (SD-1 and SD-2), which connect the S1 and S2 subunits. Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs, including SARS-CoV-2, SARS-CoV, and MERS-CoV, use the C-domain to bind their receptors. 
The S2 subunit comprises the fusion peptide (FP), a second proteolytic site (S2'), followed by an internal fusion peptide (IFP) and two heptad-repeat domains (HR1 and HR2) preceding the transmembrane domain (TM). After binding of the S1 subunit RBD on the virion to its receptor on the target cell, the HR1 and HR2 domains interact with each other to form a six-helix bundle (6-HB) fusion core, bringing viral and cellular membranes into close proximity for fusion and infection. In order to catalyze the membrane fusion reaction, CoV S needs to be primed through cleavage at the S1/S2 and S2' sites. Notably, SARS-CoV-2 has a functional polybasic (furin) cleavage site through the insertion of PRRAR*SV (* indicates the cleavage site) at the S1/S2 interface, which is absent in SARS-CoV and other SARS-related coronaviruses. The region modeled in this cd (SD-1 and SD-2, the S1/S2 cleavage region, and the S2 fusion subunit) plays an essential role in viral entry by initiating fusion of the viral and cellular membranes.	662
411966	cd22379	MERS-CoV-like_Spike_SD1-2_S1-S2_S2	SD-1 and SD-2 subdomains, the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from Middle East respiratory syndrome coronavirus and related betacoronaviruses in the C lineage. This group contains the SD-1 and SD-2 subdomains of the S1 subunit C-terminal domain (C-domain), the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from betacoronaviruses in the merbecovirus subgenus (C lineage), including Middle East respiratory syndrome coronavirus (MERS-CoV). The CoV S protein is an envelope glycoprotein that plays a very important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediate attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains the coronavirus fusion machinery and is primarily alpha-helical. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-domain. The S1 C-domain also contains two subdomains (SD-1 and SD-2), which connect the S1 and S2 subunits. Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs, including SARS-CoV-2, SARS-CoV, and MERS-CoV, use the C-domain to bind their receptors. The S2 subunit comprises the fusion peptide (FP), a second proteolytic site (S2'), followed by an internal fusion peptide (IFP) and two heptad-repeat domains (HR1 and HR2) preceding the transmembrane domain (TM). 
After binding of the S1 subunit RBD on the virion to its receptor on the target cell, the HR1 and HR2 domains interact with each other to form a six-helix bundle (6-HB) fusion core, bringing viral and cellular membranes into close proximity for fusion and infection. In order to catalyze the membrane fusion reaction, CoV S needs to be primed through cleavage at the S1/S2 and S2' sites. In the case of human-infecting coronaviruses such as SARS-CoV-2, HCoV-OC43, MERS-CoV, and HCoV-HKU1, the spike protein contains an insertion of (R/K)-(2X)n-(R/K) (furin cleavage motif) at the S1/S2 site, which is absent in SARS-CoV and other SARS-related coronaviruses, as well as Rousettus bat coronavirus HKU9. The region modeled in this cd (SD-1 and SD-2, the S1/S2 cleavage region, and the S2 fusion subunit) plays an essential role in viral entry by initiating fusion of the viral and cellular membranes.	682
411967	cd22380	HKU1-CoV-like_Spike_SD1-2_S1-S2_S2	SD-1 and SD-2 subdomains, the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from human HKU1 and OC43 coronaviruses and related betacoronaviruses in the A lineage. This group contains the SD-1 and SD-2 subdomains of the S1 subunit C-terminal domain (C-domain), the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from betacoronaviruses in the embecovirus subgenus (A lineage), including highly pathogenic human coronaviruses (CoVs), HKU1 and OC43 CoVs, as well as murine hepatitis virus (MHV). The CoV S protein is an envelope glycoprotein that plays a very important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains the coronavirus fusion machinery and is primarily alpha-helical. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-domain. The S1 C-domain also contains two subdomains (SD-1 and SD-2), which connect the S1 and S2 subunits. Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of MHV is located at the NTD, most CoVs, including SARS-CoV-2, SARS-CoV and MERS-CoV use the C-domain to bind their receptors. The S2 subunit comprises the fusion peptide (FP), a second proteolytic site (S2'), followed by an internal fusion peptide (IFP) and two heptad-repeat domains (HR1 and HR2) preceding the transmembrane domain (TM). 
After binding of the S1 subunit RBD on the virion to its receptor on the target cell, the HR1 and HR2 domains interact with each other to form a six-helix bundle (6-HB) fusion core, bringing viral and cellular membranes into close proximity for fusion and infection. In order to catalyze the membrane fusion reaction, CoV S needs to be primed through cleavage at the S1/S2 and S2' sites. In the case of human-infecting coronaviruses such as SARS-CoV-2, HCoV-OC43, MERS-CoV, and HCoV-HKU1, the spike protein contains an insertion of (R/K)-(2X)n-(R/K) (furin cleavage motif) at the S1/S2 site, which is absent in SARS-CoV and other SARS-related coronaviruses, as well as Rousettus bat coronavirus HKU9. The region modeled in this cd (SD-1 and SD-2, the S1/S2 cleavage region, and the S2 fusion subunit) plays an essential role in viral entry by initiating fusion of the viral and cellular membranes.	663
411968	cd22381	bat-HKU9-CoV-like_Spike_SD1-2_S1-S2_S2	SD-1 and SD-2 subdomains, the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from Rousettus bat coronavirus HKU9 and related betacoronaviruses in the D lineage. This group contains the SD-1 and SD-2 subdomains of the S1 subunit C-terminal domain (C-domain), the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from betacoronaviruses in the nobecovirus subgenus (D lineage), including Rousettus bat coronavirus HKU9 (Ro-BatCoV HKU9). The CoV S protein is an envelope glycoprotein that plays a very important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains the coronavirus fusion machinery and is primarily alpha-helical. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-domain. The S1 C-domain also contains two subdomains (SD-1 and SD-2), which connect the S1 and S2 subunits. Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs, including SARS-CoV-2, SARS-CoV and MERS-CoV use the C-domain to bind their receptors. The S2 subunit comprises the fusion peptide (FP), a second proteolytic site (S2'), followed by an internal fusion peptide (IFP) and two heptad-repeat domains (HR1 and HR2) preceding the transmembrane domain (TM). 
After binding of the S1 subunit RBD on the virion to its receptor on the target cell, the HR1 and HR2 domains interact with each other to form a six-helix bundle (6-HB) fusion core, bringing viral and cellular membranes into close proximity for fusion and infection. In order to catalyze the membrane fusion reaction, CoV S needs to be primed through cleavage at the S1/S2 and S2' sites. In the case of human-infecting coronaviruses such as SARS-CoV-2, HCoV-OC43, MERS-CoV, and HCoV-HKU1, the spike protein contains an insertion of (R/K)-(2X)n-(R/K) (furin cleavage motif) at the S1/S2 site, which is absent in SARS-CoV and other SARS-related coronaviruses, as well as Ro-BatCoV HKU9. The region modeled in this cd (SD-1 and SD-2, the S1/S2 cleavage region, and the S2 fusion subunit) plays an essential role in viral entry by initiating fusion of the viral and cellular membranes.	731
411810	cd22382	KH-I_SF1	type I K homology (KH) RNA-binding domain found in splicing factor 1 (SF1) and similar proteins. SF1, also called branch point-binding protein, or BBP, or transcription factor ZFM1, or zinc finger gene in MEN1 locus, or zinc finger protein 162, is necessary for the ATP-dependent first step of spliceosome assembly. It binds to the intron branch point sequence (BPS) 5'-UACUAAC-3' of the pre-mRNA. It may act as a transcription repressor.	93
411811	cd22383	KH-I_Hqk_like	type I K homology (KH) RNA-binding domain found in protein quaking (Hqk) family. The Hqk family includes Hqk and protein held out wings (how) found in Drosophila. Hqk, also called HqkI, is an RNA-binding protein that plays a central role in myelinization. It binds to the 5'-NACUAAY-N(1,20)-UAAY-3' RNA core sequence and regulates target mRNA stability. It acts by regulating pre-mRNA splicing, mRNA export and protein translation. Hqk is a regulator of oligodendrocyte differentiation and maturation in the brain that may play a role in myelin and oligodendrocyte dysfunction in schizophrenia. How, also called KH domain protein KH93F, or protein muscle-specific, or protein Struthio, or protein wings held out (who), or Quaking-related 93F (qkr93F), is an RNA-binding protein involved in the control of muscular and cardiac activity. It is required for integrin-mediated cell-adhesion in wing blade. It plays essential roles during embryogenesis, in late stages of somatic muscle development, for myotube migration and during metamorphosis for muscle reorganization.	101
411812	cd22384	KH-I_KHDRBS	type I K homology (KH) RNA-binding domain found in the KH domain-containing, RNA-binding, signal transduction-associated protein (KHDRBS) family. The KHDRBS family includes three members, KHDRBS1-3. KHDRBS1, also called GAP-associated tyrosine phosphoprotein p62, or Src-associated in mitosis 68 kDa protein, or Sam68, or p21 Ras GTPase-activating protein-associated p62, or p68, is an RNA-binding protein that plays a role in the regulation of alternative splicing and influences mRNA splice site selection and exon inclusion. It binds to RNA containing 5'-[AU]UAA-3' as a bipartite motif spaced by more than 15 nucleotides. It also binds poly(A). KHDRBS1 acts as a putative regulator of mRNA stability and/or translation rates and mediates mRNA nuclear export. It is recruited and tyrosine phosphorylated by several receptor systems, for example the T-cell, leptin and insulin receptors. KHDRBS2, also called Sam68-like mammalian protein 1, or SLM-1, is an RNA-binding protein that plays a role in the regulation of alternative splicing and influences mRNA splice site selection and exon inclusion. It binds both poly(A) and poly(U) homopolymers. KHDRBS2 may function as an adapter protein for Src kinases during mitosis. KHDRBS3, also called RNA-binding protein T-Star, or Sam68-like mammalian protein 2, or SLM-2, or Sam68-like phosphotyrosine protein, is an RNA-binding protein that plays a role in the regulation of alternative splicing and influences mRNA splice site selection and exon inclusion. It binds optimally to RNA containing 5'-[AU]UAA-3' as a bipartite motif spaced by more than 15 nucleotides. It also binds poly(A). KHDRBS3 may play a role as a negative regulator of cell growth.	102
411813	cd22385	KH-I_KHDC4_rpt1	first type I K homology (KH) RNA-binding domain found in KH homology domain-containing protein 4 (KHDC4) and similar proteins. KHDC4, also called Brings lots of money 7 (Blom7), or pre-mRNA splicing factor protein KHDC4, is an RNA-binding protein involved in pre-mRNA splicing. It interacts with the PRP19C/Prp19 complex/NTC/Nineteen complex which is part of the spliceosome. KHDC4 preferentially binds RNA with A/C-rich sequences and poly-C stretches. KHDC4 contains two type I K homology (KH) RNA-binding domains. The model corresponds to the first one. The KH1 domain is a divergent KH domain that lacks the RNA-binding GXXG motif.	84
411814	cd22386	KH-I_KHDC4_rpt2	second type I K homology (KH) RNA-binding domain found in KH homology domain-containing protein 4 (KHDC4) and similar proteins. KHDC4, also called Brings lots of money 7 (Blom7), or pre-mRNA splicing factor protein KHDC4, is an RNA-binding protein involved in pre-mRNA splicing. It interacts with the PRP19C/Prp19 complex/NTC/Nineteen complex which is part of the spliceosome. KHDC4 preferentially binds RNA with A/C-rich sequences and poly-C stretches. KHDC4 contains two type I K homology (KH) RNA-binding domains. The model corresponds to the second one.	102
411815	cd22387	KH-I_DDX46_like	type I K homology (KH) RNA-binding domain found in the family of DEAD box protein 46 (DDX46). The DDX46 family includes DEAD box protein 46 (DDX46), fungal pre-mRNA-processing ATP-dependent RNA helicase PRP5, Arabidopsis thaliana DEAD-box ATP-dependent RNA helicase RH42 and similar proteins. DDX46, also called PRP5 homolog, is an ATP-dependent RNA helicase that plays an essential role in splicing, either prior to, or during splicing A complex formation. It inhibits antiviral innate responses by entrapping selected antiviral transcripts in the nucleus. It is also involved in the development of several tumors. PRP5 is an ATP-dependent RNA helicase involved in spliceosome assembly and in nuclear splicing. It catalyzes an ATP-dependent conformational change of U2 snRNP. PRP5 interacts with the U2 snRNP and HSH155. RH42, also called DEAD-box RNA helicase RCF1, or REGULATOR OF CBF GENE EXPRESSION 1, is a helicase required for pre-mRNA splicing, cold-responsive gene regulation and cold tolerance. Members in this family contain a divergent KH domain that lacks the RNA-binding GXXG motif.	82
411816	cd22388	KH-I_N4BP1_like_rpt2	second type I K homology (KH) RNA-binding domain found in the family of NEDD4-binding protein 1 (N4BP1). The N4BP1 family includes N4BP1, NYN domain and retroviral integrase catalytic domain-containing protein (NYNRIN) and KH and NYN domain-containing protein (KHNYN). These proteins are probably of retroviral origin. N4BP1 interacts with and is a substrate of NEDD4 ubiquitin ligase (neural precursor cell expressed, developmentally downregulated 4, E3 ubiquitin protein ligase). It is also an inhibitor of the E3 ubiquitin-protein ligase ITCH, a NEDD4 structurally related E3. N4BP1 acts by interacting with the second WW domain of ITCH, thereby competing with ITCH's substrates and impairing their ubiquitination. NYNRIN, also known as CGIN1/Cousin of GIN1, may contribute to retroviral resistance in mammals by regulating the ubiquitination of viral proteins. KHNYN acts as a novel cofactor for zinc finger antiviral protein (ZAP) to target CpG-containing retroviral RNA for degradation. Members of this family contain two type I K homology (KH) RNA-binding domains. The model corresponds to the second one.	63
411817	cd22389	KH-I_Dim2p_like_rpt1	first type I K homology (KH) RNA-binding domain found in Pyrococcus horikoshii Dim2p and similar proteins. The family includes a group of conserved KH domain-containing proteins mainly from archaea, such as Dim2p homologues from Pyrococcus horikoshii and Aeropyrum pernix. Dim2p acts as a preribosomal RNA processing factor that has been identified as an essential protein for the maturation of 40S ribosomal subunit in Saccharomyces cerevisiae. It is required for the cleavage at processing site A2 to generate the pre-20S rRNA and for the dimethylation of the 18S rRNA by 18S rRNA dimethyltransferase, Dim1p. Dim2p contains two K-homology (KH) RNA-binding domains. The model corresponds to the first one.	70
411818	cd22390	KH-I_Dim2p_like_rpt2	second type I K homology (KH) RNA-binding domain found in Pyrococcus horikoshii Dim2p and similar proteins. The family includes a group of conserved KH domain-containing proteins mainly from archaea, such as Dim2p homologues from Pyrococcus horikoshii and Aeropyrum pernix. Dim2p acts as a preribosomal RNA processing factor that has been identified as an essential protein for the maturation of 40S ribosomal subunit in Saccharomyces cerevisiae. It is required for the cleavage at processing site A2 to generate the pre-20S rRNA and for the dimethylation of the 18S rRNA by 18S rRNA dimethyltransferase, Dim1p. Dim2p contains two K-homology (KH) RNA-binding domains. The model corresponds to the second one.	96
411819	cd22391	KH-I_PNO1_rpt1	first type I K homology (KH) RNA-binding domain found in partner of NOB1 (PNO1) and similar proteins. PNO1 is an RNA-binding protein that acts as a ribosome assembly factor and plays an important role in ribosome biogenesis. It positively regulates dimethylation of two adjacent adenosines in the loop of a conserved hairpin near the 3'-end of 18S rRNA. PNO1 contains two K-homology (KH) RNA-binding domains. The model corresponds to the first one. The KH1 domain is a divergent KH domain that lacks the RNA-binding GXXG motif.	80
411820	cd22392	KH-I_PNO1_rpt2	second type I K homology (KH) RNA-binding domain found in partner of NOB1 (PNO1) and similar proteins. PNO1 is an RNA-binding protein that acts as a ribosome assembly factor and plays an important role in ribosome biogenesis. It positively regulates dimethylation of two adjacent adenosines in the loop of a conserved hairpin near the 3'-end of 18S rRNA. PNO1 contains two K-homology (KH) RNA-binding domains. The model corresponds to the second one.	96
411821	cd22393	KH-I_KRR1_rpt1	first type I K homology (KH) RNA-binding domain found in KRR1 small subunit processome component and similar proteins. KRR1, also called HIV-1 Rev-binding protein 2, or KRR-R motif-containing protein 1, or Rev-interacting protein 1, or Rip-1, or ribosomal RNA assembly protein KRR1, is a ribosomal assembly factor required for 40S ribosome biogenesis. It is involved in nucleolar processing of pre-18S ribosomal RNA and ribosome assembly. KRR1 contains two K-homology (KH) RNA-binding domains. The model corresponds to the first one. The KH1 domain is a divergent KH domain that lacks the RNA-binding GXXG motif and is involved in binding another assembly factor, Kri1.	83
411822	cd22394	KH-I_KRR1_rpt2	second type I K homology (KH) RNA-binding domain found in KRR1 small subunit processome component and similar proteins. KRR1, also called HIV-1 Rev-binding protein 2, or KRR-R motif-containing protein 1, or Rev-interacting protein 1, or Rip-1, or ribosomal RNA assembly protein KRR1, is a nucleolar protein required for 40S ribosome biogenesis. It is involved in nucleolar processing of pre-18S ribosomal RNA and ribosome assembly. KRR1 contains two K-homology (KH) RNA-binding domains. The model corresponds to the second one.	93
411823	cd22395	KH-I_AKAP1	type I K homology (KH) RNA-binding domain found in mitochondrial A-kinase anchor protein 1 (AKAP1) and similar proteins. AKAP1, also called A-kinase anchor protein 149 kDa, or AKAP 149, or dual specificity A-kinase-anchoring protein 1, or D-AKAP-1, or protein kinase A-anchoring protein 1 (PRKA1), or spermatid A-kinase anchor protein 84, or S-AKAP84, is a novel developmentally regulated A kinase anchor protein of male germ cells. It binds to type I and II regulatory subunits of protein kinase A and anchors them to the cytoplasmic face of the mitochondrial outer membrane.	68
411824	cd22396	KH-I_FUBP_rpt1	first type I K homology (KH) RNA-binding domain found in the FUBP family RNA/DNA-binding proteins. The far upstream element-binding protein (FUBP) family includes FUBP1-3. FUBP1, also called FBP, or FUSE-binding protein 1, or DNA helicase V, or DH V, binds RNA and single-stranded DNA (ssDNA) and may act both as activator and repressor of transcription. It regulates MYC expression by binding to a single-stranded far-upstream element (FUSE) upstream of the MYC promoter. FUBP2, also called FUSE-binding protein 2, or KH type-splicing regulatory protein (KSRP), or p75, is a single-strand nucleic acid binding protein implicated in a variety of cellular processes, including splicing in the nucleus, mRNA decay, maturation of miRNA, and transcriptional control of proto-oncogenes such as c-myc. It regulates the stability and/or translatability of many mRNA species, encoding immune-relevant proteins, either by binding to AU-rich elements (AREs) of mRNA 3'UTR or by facilitating miRNA biogenesis to target mRNA. FUBP3, also called FUSE-binding protein 3, or MARTA2, was previously shown to mediate dendritic targeting of MAP2 mRNA in neurons. It may interact with single-stranded DNA from the far-upstream element (FUSE) and activate gene expression. It is required for beta-actin mRNA localization. It also interacts with fibroblast growth factor 9 (FGF9) 3'-UTR UG repeats and positively controls FGF9 expression through increasing translation of FGF9 mRNA. FUBP proteins contain four K-homology (KH) RNA-binding domains. The model corresponds to the first one.	68
411825	cd22397	KH-I_FUBP_rpt2	second type I K homology (KH) RNA-binding domain found in the FUBP family RNA/DNA-binding proteins. The far upstream element-binding protein (FUBP) family includes FUBP1-3. FUBP1, also called FBP, or FUSE-binding protein 1, or DNA helicase V, or DH V, binds RNA and single-stranded DNA (ssDNA) and may act both as activator and repressor of transcription. It regulates MYC expression by binding to a single-stranded far-upstream element (FUSE) upstream of the MYC promoter. FUBP2, also called FUSE-binding protein 2, or KH type-splicing regulatory protein (KSRP), or p75, is a single-strand nucleic acid binding protein implicated in a variety of cellular processes, including splicing in the nucleus, mRNA decay, maturation of miRNA, and transcriptional control of proto-oncogenes such as c-myc. It regulates the stability and/or translatability of many mRNA species, encoding immune-relevant proteins, either by binding to AU-rich elements (AREs) of mRNA 3'UTR or by facilitating miRNA biogenesis to target mRNA. FUBP3, also called FUSE-binding protein 3, or MARTA2, was previously shown to mediate dendritic targeting of MAP2 mRNA in neurons. It may interact with single-stranded DNA from the far-upstream element (FUSE) and activate gene expression. It is required for beta-actin mRNA localization. It also interacts with fibroblast growth factor 9 (FGF9) 3'-UTR UG repeats and positively controls FGF9 expression through increasing translation of FGF9 mRNA. FUBP proteins contain four K-homology (KH) RNA-binding domains. The model corresponds to the second one.	69
411826	cd22398	KH-I_FUBP_rpt3	third type I K homology (KH) RNA-binding domain found in the FUBP family RNA/DNA-binding proteins. The far upstream element-binding protein (FUBP) family includes FUBP1-3. FUBP1, also called FBP, or FUSE-binding protein 1, or DNA helicase V, or DH V, binds RNA and single-stranded DNA (ssDNA) and may act both as activator and repressor of transcription. It regulates MYC expression by binding to a single-stranded far-upstream element (FUSE) upstream of the MYC promoter. FUBP2, also called FUSE-binding protein 2, or KH type-splicing regulatory protein (KSRP), or p75, is a single-strand nucleic acid binding protein implicated in a variety of cellular processes, including splicing in the nucleus, mRNA decay, maturation of miRNA, and transcriptional control of proto-oncogenes such as c-myc. It regulates the stability and/or translatability of many mRNA species, encoding immune-relevant proteins, either by binding to AU-rich elements (AREs) of mRNA 3'UTR or by facilitating miRNA biogenesis to target mRNA. FUBP3, also called FUSE-binding protein 3, or MARTA2, was previously shown to mediate dendritic targeting of MAP2 mRNA in neurons. It may interact with single-stranded DNA from the far-upstream element (FUSE) and activate gene expression. It is required for beta-actin mRNA localization. It also interacts with fibroblast growth factor 9 (FGF9) 3'-UTR UG repeats and positively controls FGF9 expression through increasing translation of FGF9 mRNA. FUBP proteins contain four K-homology (KH) RNA-binding domains. The model corresponds to the third one.	67
411827	cd22399	KH-I_FUBP_rpt4	fourth type I K homology (KH) RNA-binding domain found in the FUBP family RNA/DNA-binding proteins. The far upstream element-binding protein (FUBP) family includes FUBP1-3. FUBP1, also called FBP, or FUSE-binding protein 1, or DNA helicase V, or DH V, binds RNA and single-stranded DNA (ssDNA) and may act both as activator and repressor of transcription. It regulates MYC expression by binding to a single-stranded far-upstream element (FUSE) upstream of the MYC promoter. FUBP2, also called FUSE-binding protein 2, or KH type-splicing regulatory protein (KSRP), or p75, is a single-strand nucleic acid binding protein implicated in a variety of cellular processes, including splicing in the nucleus, mRNA decay, maturation of miRNA, and transcriptional control of proto-oncogenes such as c-myc. It regulates the stability and/or translatability of many mRNA species, encoding immune-relevant proteins, either by binding to AU-rich elements (AREs) of mRNA 3'UTR or by facilitating miRNA biogenesis to target mRNA. FUBP3, also called FUSE-binding protein 3, or MARTA2, was previously shown to mediate dendritic targeting of MAP2 mRNA in neurons. It may interact with single-stranded DNA from the far-upstream element (FUSE) and activate gene expression. It is required for beta-actin mRNA localization. It also interacts with fibroblast growth factor 9 (FGF9) 3'-UTR UG repeats and positively controls FGF9 expression through increasing translation of FGF9 mRNA. FUBP proteins contain four K-homology (KH) RNA-binding domains. The model corresponds to the fourth one.	67
411828	cd22400	KH-I_IGF2BP_rpt1	first type I K homology (KH) RNA-binding domain found in the insulin-like growth factor 2 mRNA-binding protein (IGF2BP) family. The IGF2BP family includes three members: IGF2BP1/IMP-1/CRD-BP/VICKZ1, IGF2BP2/IMP-2/VICKZ2, and IGF2BP3/IMP-3/VICKZ3, which are RNA-binding factors that recruit target transcripts to cytoplasmic protein-RNA complexes (mRNPs). They function by binding to the 5' UTR of the insulin-like growth factor 2 (IGF2) mRNA and regulating IGF2 translation. IGF2BP proteins contain four K-homology (KH) RNA-binding domains which are important in RNA binding and are known to be involved in RNA synthesis and metabolism. The model corresponds to the first one.	68
411829	cd22401	KH-I_IGF2BP_rpt2	second type I K homology (KH) RNA-binding domain found in the insulin-like growth factor 2 mRNA-binding protein (IGF2BP) family. The IGF2BP family includes three members: IGF2BP1/IMP-1/CRD-BP/VICKZ1, IGF2BP2/IMP-2/VICKZ2, and IGF2BP3/IMP-3/VICKZ3, which are RNA-binding factors that recruit target transcripts to cytoplasmic protein-RNA complexes (mRNPs). They function by binding to the 5' UTR of the insulin-like growth factor 2 (IGF2) mRNA and regulating IGF2 translation. IGF2BP proteins contain four K-homology (KH) RNA-binding domains which are important in RNA binding and are known to be involved in RNA synthesis and metabolism. The model corresponds to the second one.	72
411830	cd22402	KH-I_IGF2BP_rpt3	third type I K homology (KH) RNA-binding domain found in the insulin-like growth factor 2 mRNA-binding protein (IGF2BP) family. The IGF2BP family includes three members: IGF2BP1/IMP-1/CRD-BP/VICKZ1, IGF2BP2/IMP-2/VICKZ2, and IGF2BP3/IMP-3/VICKZ3, which are RNA-binding factors that recruit target transcripts to cytoplasmic protein-RNA complexes (mRNPs). They function by binding to the 5' UTR of the insulin-like growth factor 2 (IGF2) mRNA and regulating IGF2 translation. IGF2BP proteins contain four K-homology (KH) RNA-binding domains which are important in RNA binding and are known to be involved in RNA synthesis and metabolism. The model corresponds to the third one.	66
411831	cd22403	KH-I_IGF2BP_rpt4	fourth type I K homology (KH) RNA-binding domain found in the insulin-like growth factor 2 mRNA-binding protein (IGF2BP) family. The IGF2BP family includes three members: IGF2BP1/IMP-1/CRD-BP/VICKZ1, IGF2BP2/IMP-2/VICKZ2, and IGF2BP3/IMP-3/VICKZ3, which are RNA-binding factors that recruit target transcripts to cytoplasmic protein-RNA complexes (mRNPs). They function by binding to the 5' UTR of the insulin-like growth factor 2 (IGF2) mRNA and regulating IGF2 translation. IGF2BP proteins contain four K-homology (KH) RNA-binding domains which are important in RNA binding and are known to be involved in RNA synthesis and metabolism. The model corresponds to the fourth one.	66
411832	cd22404	KH-I_MASK	type I K homology (KH) RNA-binding domain found in Mask family proteins. The Mask family includes Drosophila melanogaster ankyrin repeat and KH domain-containing protein Mask, and its mammalian homologues Mask1/ANKHD1 and Mask2/ANKRD17. Mask, also called multiple ankyrin repeat single KH domain-containing protein, is a large ankyrin repeat and KH domain-containing protein involved in Drosophila receptor tyrosine kinase signaling. It acts as a mediator of receptor tyrosine kinase (RTK) signaling and may act either downstream of MAPK or transduce signaling through a parallel branch of the RTK pathway. Mask is required for the development and organization of indirect flight muscle sarcomeres by regulating the formation of M line and H zone and the correct assembly of thick and thin filaments in the sarcomere. Mask1/ANKHD1, also called HIV-1 Vpr-binding ankyrin repeat protein, or multiple ankyrin repeats single KH domain, or Hmask, is highly expressed in various cancer tissues and is involved in cancer progression, including proliferation and invasion. Mask2/ANKRD17, also called ankyrin repeat protein 17, or gene trap ankyrin repeat protein (GTAR), or serologically defined breast cancer antigen NY-BR-16, is a ubiquitously expressed ankyrin factor essential for the vascular integrity during embryogenesis. It may be directly involved in the DNA replication process and play pivotal roles in cell cycle and DNA regulation. It is also involved in innate immune defense against bacteria and viruses.	71
411833	cd22405	KH-I_Vigilin_rpt1	first type I K homology (KH) RNA-binding domain found in vigilin and similar proteins. Vigilin, also called high density lipoprotein-binding protein, or HDL-binding protein, is a ubiquitous and highly conserved RNA-binding protein that shuttles between nucleus and cytoplasm presumably in contact with RNA molecules. It may be involved in chromosome partitioning at mitosis, facilitating translation and tRNA transport, and control of mRNA metabolism, including estrogen-mediated stabilization of vitellogenin mRNA. Vigilin is up-regulated by cholesterol loading of cells and functions to protect cells from over-accumulation of cholesterol. It may play a role in cell sterol metabolism. Disruption of human vigilin impairs chromosome condensation and segregation. Vigilin has a unique structure of 14-15 consecutively arranged, but non-identical K-homology (KH) domains which apparently mediate RNA-protein binding. The model corresponds to the first one.	69
411834	cd22406	KH-I_Vigilin_rpt2	second type I K homology (KH) RNA-binding domain found in vigilin and similar proteins. Vigilin, also called high density lipoprotein-binding protein, or HDL-binding protein, is a ubiquitous and highly conserved RNA-binding protein that shuttles between nucleus and cytoplasm presumably in contact with RNA molecules. It may be involved in chromosome partitioning at mitosis, facilitating translation and tRNA transport, and control of mRNA metabolism, including estrogen-mediated stabilization of vitellogenin mRNA. Vigilin is up-regulated by cholesterol loading of cells and functions to protect cells from over-accumulation of cholesterol. It may play a role in cell sterol metabolism. Disruption of human vigilin impairs chromosome condensation and segregation. Vigilin has a unique structure of 14-15 consecutively arranged, but non-identical K-homology (KH) domains which apparently mediate RNA-protein binding. The model corresponds to the second one.	75
411835	cd22407	KH-I_Vigilin_rpt3	third type I K homology (KH) RNA-binding domain found in vigilin and similar proteins. Vigilin, also called high density lipoprotein-binding protein, or HDL-binding protein, is a ubiquitous and highly conserved RNA-binding protein that shuttles between nucleus and cytoplasm presumably in contact with RNA molecules. It may be involved in chromosome partitioning at mitosis, facilitating translation and tRNA transport, and control of mRNA metabolism, including estrogen-mediated stabilization of vitellogenin mRNA. Vigilin is up-regulated by cholesterol loading of cells and functions to protect cells from over-accumulation of cholesterol. It may play a role in cell sterol metabolism. Disruption of human vigilin impairs chromosome condensation and segregation. Vigilin has a unique structure of 14-15 consecutively arranged, but non-identical K-homology (KH) domains which apparently mediate RNA-protein binding. The model corresponds to the third one.	62
411836	cd22408	KH-I_Vigilin_rpt4	fourth type I K homology (KH) RNA-binding domain found in vigilin and similar proteins. Vigilin, also called high density lipoprotein-binding protein, or HDL-binding protein, is a ubiquitous and highly conserved RNA-binding protein that shuttles between nucleus and cytoplasm presumably in contact with RNA molecules. It may be involved in chromosome partitioning at mitosis, facilitating translation and tRNA transport, and control of mRNA metabolism, including estrogen-mediated stabilization of vitellogenin mRNA. Vigilin is up-regulated by cholesterol loading of cells and functions to protect cells from over-accumulation of cholesterol. It may play a role in cell sterol metabolism. Disruption of human vigilin impairs chromosome condensation and segregation. Vigilin has a unique structure of 14-15 consecutively arranged, but non-identical K-homology (KH) domains which apparently mediate RNA-protein binding. The model corresponds to the fourth one.	62
411837	cd22409	KH-I_Vigilin_rpt5	fifth type I K homology (KH) RNA-binding domain found in vigilin and similar proteins. Vigilin, also called high density lipoprotein-binding protein, or HDL-binding protein, is a ubiquitous and highly conserved RNA-binding protein that shuttles between nucleus and cytoplasm presumably in contact with RNA molecules. It may be involved in chromosome partitioning at mitosis, facilitating translation and tRNA transport, and control of mRNA metabolism, including estrogen-mediated stabilization of vitellogenin mRNA. Vigilin is up-regulated by cholesterol loading of cells and functions to protect cells from over-accumulation of cholesterol. It may play a role in cell sterol metabolism. Disruption of human vigilin impairs chromosome condensation and segregation. Vigilin has a unique structure of 14-15 consecutively arranged, but non-identical K-homology (KH) domains which apparently mediate RNA-protein binding. The model corresponds to the fifth one.	70
411838	cd22410	KH-I_Vigilin_rpt7	seventh type I K homology (KH) RNA-binding domain found in vigilin and similar proteins. Vigilin, also called high density lipoprotein-binding protein, or HDL-binding protein, is a ubiquitous and highly conserved RNA-binding protein that shuttles between nucleus and cytoplasm presumably in contact with RNA molecules. It may be involved in chromosome partitioning at mitosis, facilitating translation and tRNA transport, and control of mRNA metabolism, including estrogen-mediated stabilization of vitellogenin mRNA. Vigilin is up-regulated by cholesterol loading of cells and functions to protect cells from over-accumulation of cholesterol. It may play a role in cell sterol metabolism. Disruption of human vigilin impairs chromosome condensation and segregation. Vigilin has a unique structure of 14-15 consecutively arranged, but non-identical K-homology (KH) domains which apparently mediate RNA-protein binding. The model corresponds to the seventh one.	67
411839	cd22411	KH-I_Vigilin_rpt8	eighth type I K homology (KH) RNA-binding domain found in vigilin and similar proteins. Vigilin, also called high density lipoprotein-binding protein, or HDL-binding protein, is a ubiquitous and highly conserved RNA-binding protein that shuttles between nucleus and cytoplasm presumably in contact with RNA molecules. It may be involved in chromosome partitioning at mitosis, facilitating translation and tRNA transport, and control of mRNA metabolism, including estrogen-mediated stabilization of vitellogenin mRNA. Vigilin is up-regulated by cholesterol loading of cells and functions to protect cells from over-accumulation of cholesterol. It may play a role in cell sterol metabolism. Disruption of human vigilin impairs chromosome condensation and segregation. Vigilin has a unique structure of 14-15 consecutively arranged, but non-identical K-homology (KH) domains which apparently mediate RNA-protein binding. The model corresponds to the eighth one.	62
411840	cd22412	KH-I_Vigilin_rpt9	ninth type I K homology (KH) RNA-binding domain found in vigilin and similar proteins. Vigilin, also called high density lipoprotein-binding protein, or HDL-binding protein, is a ubiquitous and highly conserved RNA-binding protein that shuttles between nucleus and cytoplasm presumably in contact with RNA molecules. It may be involved in chromosome partitioning at mitosis, facilitating translation and tRNA transport, and control of mRNA metabolism, including estrogen-mediated stabilization of vitellogenin mRNA. Vigilin is up-regulated by cholesterol loading of cells and functions to protect cells from over-accumulation of cholesterol. It may play a role in cell sterol metabolism. Disruption of human vigilin impairs chromosome condensation and segregation. Vigilin has a unique structure of 14-15 consecutively arranged, but non-identical K-homology (KH) domains which apparently mediate RNA-protein binding. The model corresponds to the ninth one.	70
411841	cd22413	KH-I_Vigilin_rpt10	tenth type I K homology (KH) RNA-binding domain found in vigilin and similar proteins. Vigilin, also called high density lipoprotein-binding protein, or HDL-binding protein, is a ubiquitous and highly conserved RNA-binding protein that shuttles between nucleus and cytoplasm presumably in contact with RNA molecules. It may be involved in chromosome partitioning at mitosis, facilitating translation and tRNA transport, and control of mRNA metabolism, including estrogen-mediated stabilization of vitellogenin mRNA. Vigilin is up-regulated by cholesterol loading of cells and functions to protect cells from over-accumulation of cholesterol. It may play a role in cell sterol metabolism. Disruption of human vigilin impairs chromosome condensation and segregation. Vigilin has a unique structure of 14-15 consecutively arranged, but non-identical K-homology (KH) domains which apparently mediate RNA-protein binding. The model corresponds to the tenth one.	66
411842	cd22414	KH-I_Vigilin_rpt11	eleventh type I K homology (KH) RNA-binding domain found in vigilin and similar proteins. Vigilin, also called high density lipoprotein-binding protein, or HDL-binding protein, is a ubiquitous and highly conserved RNA-binding protein that shuttles between nucleus and cytoplasm presumably in contact with RNA molecules. It may be involved in chromosome partitioning at mitosis, facilitating translation and tRNA transport, and control of mRNA metabolism, including estrogen-mediated stabilization of vitellogenin mRNA. Vigilin is up-regulated by cholesterol loading of cells and functions to protect cells from over-accumulation of cholesterol. It may play a role in cell sterol metabolism. Disruption of human vigilin impairs chromosome condensation and segregation. Vigilin has a unique structure of 14-15 consecutively arranged, but non-identical K-homology (KH) domains which apparently mediate RNA-protein binding. The model corresponds to the eleventh one.	66
411843	cd22415	KH-I_Vigilin_rpt12	twelfth type I K homology (KH) RNA-binding domain found in vigilin and similar proteins. Vigilin, also called high density lipoprotein-binding protein, or HDL-binding protein, is a ubiquitous and highly conserved RNA-binding protein that shuttles between nucleus and cytoplasm presumably in contact with RNA molecules. It may be involved in chromosome partitioning at mitosis, facilitating translation and tRNA transport, and control of mRNA metabolism, including estrogen-mediated stabilization of vitellogenin mRNA. Vigilin is up-regulated by cholesterol loading of cells and functions to protect cells from over-accumulation of cholesterol. It may play a role in cell sterol metabolism. Disruption of human vigilin impairs chromosome condensation and segregation. Vigilin has a unique structure of 14-15 consecutively arranged, but non-identical K-homology (KH) domains which apparently mediate RNA-protein binding. The model corresponds to the twelfth one.	92
411844	cd22416	KH-I_Vigilin_rpt13	thirteenth type I K homology (KH) RNA-binding domain found in vigilin and similar proteins. Vigilin, also called high density lipoprotein-binding protein, or HDL-binding protein, is a ubiquitous and highly conserved RNA-binding protein that shuttles between nucleus and cytoplasm presumably in contact with RNA molecules. It may be involved in chromosome partitioning at mitosis, facilitating translation and tRNA transport, and control of mRNA metabolism, including estrogen-mediated stabilization of vitellogenin mRNA. Vigilin is up-regulated by cholesterol loading of cells and functions to protect cells from over-accumulation of cholesterol. It may play a role in cell sterol metabolism. Disruption of human vigilin impairs chromosome condensation and segregation. Vigilin has a unique structure of 14-15 consecutively arranged, but non-identical K-homology (KH) domains which apparently mediate RNA-protein binding. The model corresponds to the thirteenth one.	78
411845	cd22417	KH-I_Vigilin_rpt14	fourteenth type I K homology (KH) RNA-binding domain found in vigilin and similar proteins. Vigilin, also called high density lipoprotein-binding protein, or HDL-binding protein, is a ubiquitous and highly conserved RNA-binding protein that shuttles between nucleus and cytoplasm presumably in contact with RNA molecules. It may be involved in chromosome partitioning at mitosis, facilitating translation and tRNA transport, and control of mRNA metabolism, including estrogen-mediated stabilization of vitellogenin mRNA. Vigilin is up-regulated by cholesterol loading of cells and functions to protect cells from over-accumulation of cholesterol. It may play a role in cell sterol metabolism. Disruption of human vigilin impairs chromosome condensation and segregation. Vigilin has a unique structure of 14-15 consecutively arranged, but non-identical K-homology (KH) domains which apparently mediate RNA-protein binding. The model corresponds to the fourteenth one.	72
411846	cd22418	KH-I_Vigilin_rpt15	fifteenth type I K homology (KH) RNA-binding domain found in vigilin and similar proteins. Vigilin, also called high density lipoprotein-binding protein, or HDL-binding protein, is a ubiquitous and highly conserved RNA-binding protein that shuttles between nucleus and cytoplasm presumably in contact with RNA molecules. It may be involved in chromosome partitioning at mitosis, facilitating translation and tRNA transport, and control of mRNA metabolism, including estrogen-mediated stabilization of vitellogenin mRNA. Vigilin is up-regulated by cholesterol loading of cells and functions to protect cells from over-accumulation of cholesterol. It may play a role in cell sterol metabolism. Disruption of human vigilin impairs chromosome condensation and segregation. Vigilin has a unique structure of 14-15 consecutively arranged, but non-identical K-homology (KH) domains which apparently mediate RNA-protein binding. The model corresponds to the fifteenth one.	69
411847	cd22419	KH-I_ASCC1	type I K homology (KH) RNA-binding domain found in activating signal cointegrator 1 complex subunit 1 (ASCC1) and similar proteins. ASCC1, also called ASC-1 complex subunit p50, or Trip4 complex subunit p50, plays a role in DNA damage repair as component of the ASCC complex. It is part of the ASC-1 complex that enhances NF-kappa-B, SRF and AP1 transactivation. In cells responding to gastrin-activated paracrine signals, it is involved in the induction of SERPINB2 expression by gastrin. ASCC1 may also play a role in the development of neuromuscular junction.	66
411848	cd22420	KH-I_BICC1_rpt1	first type I K homology (KH) RNA-binding domain found in protein bicaudal C homolog 1 (BICC1) and similar proteins. BICC1, also called Bic-C, is a mammalian homologue of Drosophila Bicaudal-C (dBic-C). BICC1 functions as an RNA-binding protein that represses the translation of selected mRNAs to control development. It regulates gene expression and modulates cell proliferation and apoptosis. BICC1 is a negative regulator of Wnt signaling. Increased levels of BICC1 may be associated with depression. Besides, BICC1 is a genetic determinant of osteoblastogenesis and bone mineral density. BICC1 contains three K-homology (KH) RNA-binding domains. The model corresponds to the first one.	81
411849	cd22421	KH-I_BICC1_rpt2	second type I K homology (KH) RNA-binding domain found in protein bicaudal C homolog 1 (BICC1) and similar proteins. BICC1, also called Bic-C, is a mammalian homologue of Drosophila Bicaudal-C (dBic-C). BICC1 functions as an RNA-binding protein that represses the translation of selected mRNAs to control development. It regulates gene expression and modulates cell proliferation and apoptosis. BICC1 is a negative regulator of Wnt signaling. Increased levels of BICC1 may be associated with depression. Besides, BICC1 is a genetic determinant of osteoblastogenesis and bone mineral density. BICC1 contains three K-homology (KH) RNA-binding domains. The model corresponds to the second one.	70
411850	cd22422	KH-I_BICC1_rpt3	third type I K homology (KH) RNA-binding domain found in protein bicaudal C homolog 1 (BICC1) and similar proteins. BICC1, also called Bic-C, is a mammalian homologue of Drosophila Bicaudal-C (dBic-C). BICC1 functions as an RNA-binding protein that represses the translation of selected mRNAs to control development. It regulates gene expression and modulates cell proliferation and apoptosis. BICC1 is a negative regulator of Wnt signaling. Increased levels of BICC1 may be associated with depression. Besides, BICC1 is a genetic determinant of osteoblastogenesis and bone mineral density. BICC1 contains three K-homology (KH) RNA-binding domains. The model corresponds to the third one.	67
411851	cd22423	KH-I_MEX3_rpt1	first type I K homology (KH) RNA-binding domain found in the family of MEX-3 RNA-binding proteins. The MEX-3 protein family includes four members, MEX3A/RKHD4, MEX3B/RKHD3/RNF195, MEX3C/RKHD2/RNF194, and MEX3D/RKHD1/RNF193/TINO. They are homologs of Caenorhabditis elegans MEX-3 protein, a translational regulator that specifies the posterior blastomere identity in the early embryo and contributes to the maintenance of the germline totipotency. Mex-3 proteins are RNA-binding phosphoproteins involved in post-transcriptional regulatory mechanisms. They are characterized by containing two K-homology (KH) RNA-binding domains and a C-terminal RING finger. They bind RNA through their KH domains and shuttle between the nucleus and the cytoplasm via the CRM1-dependent export pathway. The model corresponds to the first KH domain.	73
411852	cd22424	KH-I_MEX3_rpt2	second type I K homology (KH) RNA-binding domain found in the family of MEX-3 RNA-binding proteins. The MEX-3 protein family includes four members, MEX3A/RKHD4, MEX3B/RKHD3/RNF195, MEX3C/RKHD2/RNF194, and MEX3D/RKHD1/RNF193/TINO. They are homologs of Caenorhabditis elegans MEX-3 protein, a translational regulator that specifies the posterior blastomere identity in the early embryo and contributes to the maintenance of the germline totipotency. Mex-3 proteins are RNA-binding phosphoproteins involved in post-transcriptional regulatory mechanisms. They are characterized by containing two K-homology (KH) RNA-binding domains and a C-terminal RING finger. They bind RNA through their KH domains and shuttle between the nucleus and the cytoplasm via the CRM1-dependent export pathway. The model corresponds to the second KH domain.	72
411853	cd22425	KH_I_FMR1_FXR_rpt1	first type I K homology (KH) RNA-binding domain found in a family of fragile X mental retardation protein (FMR1) and fragile X related (FXR) proteins. The FMR1/FXR family includes FMR1 (also known as FMRP) and its two homologues, fragile X related 1 (FXR1) and 2 (FXR2). They are involved in translational regulation, particularly in neuronal cells and play an important role in the regulation of glutamate-mediated neuronal activity and plasticity. Each of these three proteins can form heteromers with the others, and each can also form homomers. Lack of expression of FMR1 results in mental retardation and macroorchidism. FXR1 and FXR2 may play important roles in the function of FMR1 and in the pathogenesis of the Fragile X Mental Retardation Syndrome. Members of this family contain three K-homology (KH) RNA-binding domains. The model corresponds to the first one.	77
411854	cd22426	KH_I_FMR1_FXR_rpt2	second type I K homology (KH) RNA-binding domain found in a family of fragile X mental retardation protein (FMR1) and fragile X related (FXR) proteins. The FMR1/FXR family includes FMR1 (also known as FMRP) and its two homologues, fragile X related 1 (FXR1) and 2 (FXR2). They are involved in translational regulation, particularly in neuronal cells and play an important role in the regulation of glutamate-mediated neuronal activity and plasticity. Each of these three proteins can form heteromers with the others, and each can also form homomers. Lack of expression of FMR1 results in mental retardation and macroorchidism. FXR1 and FXR2 may play important roles in the function of FMR1 and in the pathogenesis of the Fragile X Mental Retardation Syndrome. Members of this family contain three K-homology (KH) RNA-binding domains. The model corresponds to the second one.	63
411855	cd22427	KH_I_FMR1_FXR_rpt3	third type I K homology (KH) RNA-binding domain found in a family of fragile X mental retardation protein (FMR1) and fragile X related (FXR) proteins. The FMR1/FXR family includes FMR1 (also known as FMRP) and its two homologues, fragile X related 1 (FXR1) and 2 (FXR2). They are involved in translational regulation, particularly in neuronal cells and play an important role in the regulation of glutamate-mediated neuronal activity and plasticity. Each of these three proteins can form heteromers with the others, and each can also form homomers. Lack of expression of FMR1 results in mental retardation and macroorchidism. FXR1 and FXR2 may play important roles in the function of FMR1 and in the pathogenesis of the Fragile X Mental Retardation Syndrome. Members of this family contain three K-homology (KH) RNA-binding domains. The model corresponds to the third one.	79
411856	cd22428	KH-I_TDRKH_rpt1	first type I K homology (KH) RNA-binding domain found in tudor and KH domain-containing protein (TDRKH) and similar proteins. TDRKH, also called tudor domain-containing protein 2 (TDRD2), is a mitochondria-anchored RNA-binding protein that is required for spermatogenesis and involved in piRNA biogenesis. It specifically recruits MIWI, but not MILI, to engage the piRNA pathway. TDRKH contains two K-homology (KH) RNA-binding domains and one tudor domain, which are involved in binding to RNA or single-strand DNA. The model corresponds to the first one.	74
411857	cd22429	KH-I_TDRKH_rpt2	second type I K homology (KH) RNA-binding domain found in tudor and KH domain-containing protein (TDRKH) and similar proteins. TDRKH, also called tudor domain-containing protein 2 (TDRD2), is a mitochondria-anchored RNA-binding protein that is required for spermatogenesis and involved in piRNA biogenesis. It specifically recruits MIWI, but not MILI, to engage the piRNA pathway. TDRKH contains two K-homology (KH) RNA-binding domains and one tudor domain, which are involved in binding to RNA or single-strand DNA. The model corresponds to the second one.	82
411858	cd22430	KH-I_DDX43_DDX53	type I K homology (KH) RNA-binding domain found in DEAD box protein 43 (DDX43), DEAD box protein 53 (DDX53) and similar proteins. DDX43 (also called cancer/testis antigen 13, or DEAD box protein HAGE, or helical antigen) displays tumor-specific expression. Diseases associated with DDX43 include rheumatoid lung disease. DDX53 (also called cancer-associated gene protein, or cancer/testis antigen 26, or DEAD box protein CAGE) shows high expression level in various tumors and is involved in anti-cancer drug resistance. Both DDX43 and DDX53 are members of the DEAD-box helicases, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation.	66
411859	cd22431	KH-I_RNaseY	type I K homology (KH) RNA-binding domain found in ribonuclease Y (RNase Y) and similar proteins. RNase Y is an endoribonuclease that initiates mRNA decay. It initiates the decay of all SAM-dependent riboswitches, such as yitJ riboswitch. RNase Y is involved in processing of the gapA operon mRNA and it cleaves between cggR and gapA. It is also the decay-initiating endonuclease for rpsO mRNA. It plays a role in degradation of type I toxin-antitoxin system bsrG/SR4 RNAs and also a minor role in degradation of type I toxin-antitoxin system bsrE/SR5 RNAs.	79
411860	cd22432	KH-I_HNRNPK_rpt1	first type I K homology (KH) RNA-binding domain found in heterogeneous nuclear ribonucleoprotein K (hnRNP K) and similar proteins. hnRNP K, also called transformation up-regulated nuclear protein (TUNP), is a pre-mRNA binding protein that binds tenaciously to poly(C) sequences. It may be involved in the nuclear metabolism of hnRNAs, particularly for pre-mRNAs that contain cytidine-rich sequences. It can also bind poly(C) single-stranded DNA. hnRNP K plays an important role in p53/TP53 response to DNA damage, acting at the level of both transcription activation and repression. hnRNP K contains three K-homology (KH) RNA-binding domains. The model corresponds to the first one.	64
411861	cd22433	KH-I_HNRNPK_rpt2	second type I K homology (KH) RNA-binding domain found in heterogeneous nuclear ribonucleoprotein K (hnRNP K) and similar proteins. hnRNP K, also called transformation up-regulated nuclear protein (TUNP), is a pre-mRNA binding protein that binds tenaciously to poly(C) sequences. It may be involved in the nuclear metabolism of hnRNAs, particularly for pre-mRNAs that contain cytidine-rich sequences. It can also bind poly(C) single-stranded DNA. hnRNP K plays an important role in p53/TP53 response to DNA damage, acting at the level of both transcription activation and repression. hnRNP K contains three K-homology (KH) RNA-binding domains. The model corresponds to the second one.	70
411862	cd22434	KH-I_HNRNPK_rpt3	third type I K homology (KH) RNA-binding domain found in heterogeneous nuclear ribonucleoprotein K (hnRNP K) and similar proteins. hnRNP K, also called transformation up-regulated nuclear protein (TUNP), is a pre-mRNA binding protein that binds tenaciously to poly(C) sequences. It may be involved in the nuclear metabolism of hnRNAs, particularly for pre-mRNAs that contain cytidine-rich sequences. It can also bind poly(C) single-stranded DNA. hnRNP K plays an important role in p53/TP53 response to DNA damage, acting at the level of both transcription activation and repression. hnRNP K contains three K-homology (KH) RNA-binding domains. The model corresponds to the third one.	74
411863	cd22435	KH-I_NOVA_rpt1	first type I K homology (KH) RNA-binding domain found in the family of neuro-oncological ventral antigen (Nova). The family includes two related neuronal RNA-binding proteins, Nova-1 and Nova-2. Nova-1, also called onconeural ventral antigen 1, or paraneoplastic Ri antigen, or ventral neuron-specific protein 1, may regulate RNA splicing or metabolism in a specific subset of developing neurons. It interacts with RNA containing repeats of the YCAY sequence. It is a brain-enriched splicing factor regulating neuronal alternative splicing. Nova-1 is involved in neurological disorders and carcinogenesis. Nova-2, also called astrocytic NOVA1-like RNA-binding protein, is a neuronal RNA-binding protein expressed in a broader central nervous system (CNS) distribution than Nova-1. It functions in neuronal RNA metabolism. NOVA family proteins contain three K-homology (KH) RNA-binding domains. The model corresponds to the first one.	73
411864	cd22436	KH-I_NOVA_rpt2	second type I K homology (KH) RNA-binding domain found in the family of neuro-oncological ventral antigen (Nova). The family includes two related neuronal RNA-binding proteins, Nova-1 and Nova-2. Nova-1, also called onconeural ventral antigen 1, or paraneoplastic Ri antigen, or ventral neuron-specific protein 1, may regulate RNA splicing or metabolism in a specific subset of developing neurons. It interacts with RNA containing repeats of the YCAY sequence. It is a brain-enriched splicing factor regulating neuronal alternative splicing. Nova-1 is involved in neurological disorders and carcinogenesis. Nova-2, also called astrocytic NOVA1-like RNA-binding protein, is a neuronal RNA-binding protein expressed in a broader central nervous system (CNS) distribution than Nova-1. It functions in neuronal RNA metabolism. NOVA family proteins contain three K-homology (KH) RNA-binding domains. The model corresponds to the second one.	70
411865	cd22437	KH-I_BTR1_rpt2	second type I K homology (KH) RNA-binding domain found in Arabidopsis thaliana protein BTR1 and similar proteins. BTR1, also called Binding to ToMV RNA 1, is a negative regulator of tomato mosaic virus (ToMV) multiplication but has no effect on the multiplication of cucumber mosaic virus (CMV). BTR1 contains three K-homology (KH) RNA-binding domains. The model corresponds to the second one.	69
411866	cd22438	KH-I_PCBP_rpt1	first type I K homology (KH) RNA-binding domain found in the family of poly(C)-binding proteins (PCBPs). The PCBP family, also known as hnRNP E family, comprises four members, PCBP1-4, which are RNA-binding proteins that interact in a sequence-specific manner with single-stranded poly(C) sequences. They are mainly involved in various posttranscriptional regulations, including mRNA stabilization or translational activation/silencing. Besides, PCBPs may share iron chaperone activity. PCBPs contain three K-homology (KH) RNA-binding domains. The model corresponds to the first one.	67
411867	cd22439	KH-I_PCBP_rpt3	third type I K homology (KH) RNA-binding domain found in the family of poly(C)-binding proteins (PCBPs). The PCBP family, also known as hnRNP E family, comprises four members, PCBP1-4, which are RNA-binding proteins that interact in a sequence-specific manner with single-stranded poly(C) sequences. They are mainly involved in various posttranscriptional regulations, including mRNA stabilization or translational activation/silencing. Besides, PCBPs may share iron chaperone activity. PCBPs contain three K-homology (KH) RNA-binding domains. The model corresponds to the third one.	68
411868	cd22440	KH-I_KHDC1_like	type I K homology (KH) RNA-binding domain found in KHDC1-like family. The KHDC1-like family corresponds to a group of structurally related proteins characterized by an atypical RNA-binding KH domain. They are unique to eutherian mammals and specifically expressed in oocytes and/or embryonic stem cells. Family members include KH homology domain-containing protein 1 (KHDC1), KHDC1-like protein (KHDC1L), KHDC3-like protein (KHDC3L, also called ES cell-associated transcript 1 protein or ECAT1), developmental pluripotency-associated 5 protein (DPPA5, also called embryonal stem cell-specific gene 1 protein or ESG-1), Oocyte-expressed protein (OOEP, also called KH homology domain-containing protein 2 or KHDC2, or Oocyte- and embryo-specific protein 19 or OEP19). KHDC3L is essential for human oocyte maturation and pre-implantation development of the resulting embryos. DPPA5 is involved in the maintenance of embryonic stem (ES) cell pluripotency. OOEP plays an essential role for zygotes to progress beyond the first embryonic cell divisions.	68
411869	cd22441	KH-I_CeGLD3_rpt1	first type I K homology (KH) RNA-binding domain found in Caenorhabditis elegans defective in germ line development protein 3 (CeGLD-3) and similar proteins. CeGLD-3, also called germline development defective 3, is a Bicaudal-C (Bic-C) homolog that is involved in the translational control of germline-specific mRNAs during embryogenesis. It interacts with the cytoplasmic poly(A)-polymerase GLD-2. The two proteins cooperate to recognize target mRNAs and convert them into a polyadenylated, translationally active state. CeGLD-3 contains four K-homology (KH) RNA-binding domains, which are divergent KH domains that lack the RNA-binding GXXG motif. The model corresponds to the first one.	71
411870	cd22442	KH-I_CeGLD3_rpt2	second type I K homology (KH) RNA-binding domain found in Caenorhabditis elegans defective in germ line development protein 3 (CeGLD-3) and similar proteins. CeGLD-3, also called germline development defective 3, is a Bicaudal-C (Bic-C) homolog that is involved in the translational control of germline-specific mRNAs during embryogenesis. It interacts with the cytoplasmic poly(A)-polymerase GLD-2. The two proteins cooperate to recognize target mRNAs and convert them into a polyadenylated, translationally active state. CeGLD-3 contains four K-homology (KH) RNA-binding domains, which are divergent KH domains that lack the RNA-binding GXXG motif. The model corresponds to the second one.	73
411871	cd22443	KH-I_CeGLD3_rpt3	third type I K homology (KH) RNA-binding domain found in Caenorhabditis elegans defective in germ line development protein 3 (CeGLD-3) and similar proteins. CeGLD-3, also called germline development defective 3, is a Bicaudal-C (Bic-C) homolog that is involved in the translational control of germline-specific mRNAs during embryogenesis. It interacts with the cytoplasmic poly(A)-polymerase GLD-2. The two proteins cooperate to recognize target mRNAs and convert them into a polyadenylated, translationally active state. CeGLD-3 contains four K-homology (KH) RNA-binding domains, which are divergent KH domains that lack the RNA-binding GXXG motif. The model corresponds to the third one.	74
411872	cd22444	KH-I_CeGLD3_rpt4	fourth type I K homology (KH) RNA-binding domain found in Caenorhabditis elegans defective in germ line development protein 3 (CeGLD-3) and similar proteins. CeGLD-3, also called germline development defective 3, is a Bicaudal-C (Bic-C) homolog that is involved in the translational control of germline-specific mRNAs during embryogenesis. It interacts with the cytoplasmic poly(A)-polymerase GLD-2. The two proteins cooperate to recognize target mRNAs and convert them into a polyadenylated, translationally active state. CeGLD-3 contains four K-homology (KH) RNA-binding domains, which are divergent KH domains that lack the RNA-binding GXXG motif. The model corresponds to the fourth one.	77
411873	cd22445	KH-I_Rrp4_Rrp40	type I K homology (KH) RNA-binding domain found in exosome complex components Rrp4, Rrp40 and similar proteins. The family includes two ribosomal RNA-processing proteins, Rrp4 and Rrp40. They are non-catalytic components of the RNA exosome complex which has 3'-->5' exoribonuclease activity and participates in a multitude of cellular RNA processing and degradation events. Eukaryotic Rrp4 and Rrp40 contain a divergent KH domain that lacks the RNA-binding GXXG motif.	78
411874	cd22446	KH-I_ScSCP160_rpt1	first type I K homology (KH) RNA-binding domain found in Saccharomyces cerevisiae Protein SCP160 and similar proteins. SCP160, also called protein HX, is a new yeast protein associated with the nuclear membrane and the endoplasmic reticulum. It is involved in the control of mitotic chromosome transmission. It is required during cell division for faithful partitioning of the ER-nuclear envelope membranes which enclose the duplicated chromosomes in yeast. SCP160 contains seven K-homology (KH) RNA-binding domains. The model corresponds to the first one.	86
411875	cd22447	KH-I_ScSCP160_rpt2	second type I K homology (KH) RNA-binding domain found in Saccharomyces cerevisiae Protein SCP160 and similar proteins. SCP160, also called protein HX, is a new yeast protein associated with the nuclear membrane and the endoplasmic reticulum. It is involved in the control of mitotic chromosome transmission. It is required during cell division for faithful partitioning of the ER-nuclear envelope membranes which enclose the duplicated chromosomes in yeast. SCP160 contains seven K-homology (KH) RNA-binding domains. The model corresponds to the second one.	80
411876	cd22448	KH-I_ScSCP160_rpt3	third type I K homology (KH) RNA-binding domain found in Saccharomyces cerevisiae Protein SCP160 and similar proteins. SCP160, also called protein HX, is a new yeast protein associated with the nuclear membrane and the endoplasmic reticulum. It is involved in the control of mitotic chromosome transmission. It is required during cell division for faithful partitioning of the ER-nuclear envelope membranes which enclose the duplicated chromosomes in yeast. SCP160 contains seven K-homology (KH) RNA-binding domains. The model corresponds to the third one.	81
411877	cd22449	KH-I_ScSCP160_rpt4	fourth type I K homology (KH) RNA-binding domain found in Saccharomyces cerevisiae Protein SCP160 and similar proteins. SCP160, also called protein HX, is a new yeast protein associated with the nuclear membrane and the endoplasmic reticulum. It is involved in the control of mitotic chromosome transmission. It is required during cell division for faithful partitioning of the ER-nuclear envelope membranes which enclose the duplicated chromosomes in yeast. SCP160 contains seven K-homology (KH) RNA-binding domains. The model corresponds to the fourth one.	70
411878	cd22450	KH-I_ScSCP160_rpt5	fifth type I K homology (KH) RNA-binding domain found in Saccharomyces cerevisiae Protein SCP160 and similar proteins. SCP160, also called protein HX, is a new yeast protein associated with the nuclear membrane and the endoplasmic reticulum. It is involved in the control of mitotic chromosome transmission. It is required during cell division for faithful partitioning of the ER-nuclear envelope membranes which enclose the duplicated chromosomes in yeast. SCP160 contains seven K-homology (KH) RNA-binding domains. The model corresponds to the fifth one.	80
411879	cd22451	KH-I_ScSCP160_rpt6	sixth type I K homology (KH) RNA-binding domain found in Saccharomyces cerevisiae Protein SCP160 and similar proteins. SCP160, also called protein HX, is a new yeast protein associated with the nuclear membrane and the endoplasmic reticulum. It is involved in the control of mitotic chromosome transmission. It is required during cell division for faithful partitioning of the ER-nuclear envelope membranes which enclose the duplicated chromosomes in yeast. SCP160 contains seven K-homology (KH) RNA-binding domains. The model corresponds to the sixth one.	69
411880	cd22452	KH-I_ScSCP160_rpt7	seventh type I K homology (KH) RNA-binding domain found in Saccharomyces cerevisiae Protein SCP160 and similar proteins. SCP160, also called protein HX, is a new yeast protein associated with the nuclear membrane and the endoplasmic reticulum. It is involved in the control of mitotic chromosome transmission. It is required during cell division for faithful partitioning of the ER-nuclear envelope membranes which enclose the duplicated chromosomes in yeast. SCP160 contains seven K-homology (KH) RNA-binding domains. The model corresponds to the seventh one.	65
411881	cd22453	KH-I_MUG60_like	type I K homology (KH) RNA-binding domain found in Schizosaccharomyces pombe meiotically up-regulated gene 60 protein (MUG60) and similar proteins. MUG60 is a KH domain-containing protein that has a role in meiosis. The family also contains Saccharomyces cerevisiae KH domain-containing protein YLL032C.	72
411882	cd22454	KH-I_Mextli_like	type I K homology (KH) RNA-binding domain found in Drosophila melanogaster eukaryotic translation initiation factor 4E-binding protein Mextli and similar proteins. Mextli is a novel eukaryotic translation initiation factor 4E-binding protein that promotes translation in Drosophila melanogaster.	71
411883	cd22455	KH-I_Rnc1_rpt1	first type I K homology (KH) RNA-binding domain found in Schizosaccharomyces pombe RNA-binding protein Rnc1 and similar proteins. Rnc1, also called RNA-binding protein that suppresses calcineurin deletion 1, is an RNA-binding protein that acts as an important regulator of the posttranscriptional expression of the MAPK phosphatase Pmp1 in fission yeast. It binds and stabilizes pmp1 mRNA and hence acts as a negative regulator of pmk1 signaling. Overexpression of Rnc1 suppresses the Cl(-) sensitivity of calcineurin deletion. The nuclear export of Rnc1 requires mRNA-binding ability and the mRNA export factor Rae1. Rnc1 contains three K-homology (KH) RNA-binding domains. The model corresponds to the first one.	70
411884	cd22456	KH-I_Rnc1_rpt2	second type I K homology (KH) RNA-binding domain found in Schizosaccharomyces pombe RNA-binding protein Rnc1 and similar proteins. Rnc1, also called RNA-binding protein that suppresses calcineurin deletion 1, is an RNA-binding protein that acts as an important regulator of the posttranscriptional expression of the MAPK phosphatase Pmp1 in fission yeast. It binds and stabilizes pmp1 mRNA and hence acts as a negative regulator of pmk1 signaling. Overexpression of Rnc1 suppresses the Cl(-) sensitivity of calcineurin deletion. The nuclear export of Rnc1 requires mRNA-binding ability and the mRNA export factor Rae1. Rnc1 contains three K-homology (KH) RNA-binding domains. The model corresponds to the second one.	69
411885	cd22457	KH-I_Rnc1_rpt3	third type I K homology (KH) RNA-binding domain found in Schizosaccharomyces pombe RNA-binding protein Rnc1 and similar proteins. Rnc1, also called RNA-binding protein that suppresses calcineurin deletion 1, is an RNA-binding protein that acts as an important regulator of the posttranscriptional expression of the MAPK phosphatase Pmp1 in fission yeast. It binds and stabilizes pmp1 mRNA and hence acts as a negative regulator of pmk1 signaling. Overexpression of Rnc1 suppresses the Cl(-) sensitivity of calcineurin deletion. The nuclear export of Rnc1 requires mRNA-binding ability and the mRNA export factor Rae1. Rnc1 contains three K-homology (KH) RNA-binding domains. The model corresponds to the third one.	64
411886	cd22458	KH-I_MER1_like	type I K homology (KH) RNA-binding domain found in Saccharomyces cerevisiae meiotic recombination 1 protein (MER1) and similar proteins. MER1 is required for chromosome pairing and genetic recombination. It may function to bring the axial elements of the synaptonemal complex corresponding to homologous chromosomes together by initiating recombination. MER1 might be responsible for regulating the MER2 gene and/or gene product.	65
411887	cd22459	KH-I_PEPPER_rpt1_like	first type I K homology (KH) RNA-binding domain found in Arabidopsis thaliana RNA-binding KH domain-containing protein PEPPER and similar proteins. The family includes a group of plant RNA-binding KH domain-containing proteins, such as PEPPER, flowering locus K homology domain protein (FLK), RNA-binding KH domain-containing protein RCF3 and KH domain-containing protein HEN4. PEPPER regulates vegetative and gynoecium development. It acts as a positive regulator of the central floral repressor FLOWERING LOCUS C. In concert with HUA2, PEPPER antagonizes FLK by positively regulating FLC probably at transcriptional and post-transcriptional levels, and thus acts as a negative regulator of flowering. FLK, also called flowering locus KH domain protein, positively regulates flowering by repressing FLC expression and post-transcriptional modification. PEPPER and FLK contain three K-homology (KH) RNA-binding domains. RCF3, also called protein ENHANCED STRESS RESPONSE 1 (ESR1), or protein HIGH OSMOTIC STRESS GENE EXPRESSION 5 (HOS5), or protein REGULATOR OF CBF GENE EXPRESSION 3, or protein SHINY 1 (SHI1), acts as a negative regulator of osmotic stress-induced gene expression. It is involved in the regulation of thermotolerance responses under heat stress. It functions as an upstream regulator of heat stress transcription factor (HSF) genes. HEN4, also called protein HUA ENHANCER 4, plays a role in floral reproductive organ identity in the third whorl and floral determinacy specification by specifically promoting the processing of AGAMOUS (AG) pre-mRNA. It functions in association with HUA1 and HUA2. RCF3 and HEN4 contain five KH RNA-binding domains. The model corresponds to the KH1 domain of PEPPER and FLK, as well as KH1 and KH3 domains of RCF3 and HEN4.	69
411888	cd22460	KH-I_PEPPER_rpt2_like	second type I K homology (KH) RNA-binding domain found in Arabidopsis thaliana RNA-binding KH domain-containing protein PEPPER and similar proteins. The family includes a group of plant RNA-binding KH domain-containing proteins, such as PEPPER, flowering locus K homology domain protein (FLK), RNA-binding KH domain-containing protein RCF3 and KH domain-containing protein HEN4. PEPPER regulates vegetative and gynoecium development. It acts as a positive regulator of the central floral repressor FLOWERING LOCUS C. In concert with HUA2, PEPPER antagonizes FLK by positively regulating FLC probably at transcriptional and post-transcriptional levels, and thus acts as a negative regulator of flowering. FLK, also called flowering locus KH domain protein, positively regulates flowering by repressing FLC expression and post-transcriptional modification. PEPPER and FLK contain three K-homology (KH) RNA-binding domains. RCF3, also called protein ENHANCED STRESS RESPONSE 1 (ESR1), or protein HIGH OSMOTIC STRESS GENE EXPRESSION 5 (HOS5), or protein REGULATOR OF CBF GENE EXPRESSION 3, or protein SHINY 1 (SHI1), acts as a negative regulator of osmotic stress-induced gene expression. It is involved in the regulation of thermotolerance responses under heat stress. It functions as an upstream regulator of heat stress transcription factor (HSF) genes. HEN4, also called protein HUA ENHANCER 4, plays a role in floral reproductive organ identity in the third whorl and floral determinacy specification by specifically promoting the processing of AGAMOUS (AG) pre-mRNA. It functions in association with HUA1 and HUA2. RCF3 and HEN4 contain five KH RNA-binding domains. The model corresponds to the KH2 domain of PEPPER and FLK, as well as KH2 and KH4 domains of RCF3 and HEN4.	73
411889	cd22461	KH-I_PEPPER_like_rpt3	third type I K homology (KH) RNA-binding domain found in Arabidopsis thaliana RNA-binding KH domain-containing protein PEPPER and similar proteins. The family includes a group of plant RNA-binding KH domain-containing proteins, such as PEPPER and flowering locus K homology domain protein (FLK). PEPPER regulates vegetative and gynoecium development. It acts as a positive regulator of the central floral repressor FLOWERING LOCUS C. In concert with HUA2, PEPPER antagonizes FLK by positively regulating FLC probably at transcriptional and post-transcriptional levels, and thus acts as a negative regulator of flowering. FLK, also called flowering locus KH domain protein, positively regulates flowering by repressing FLC expression and post-transcriptional modification. PEPPER and FLK contain three K-homology (KH) RNA-binding domains. The model corresponds to the KH3 domain of PEPPER and FLK.	69
411890	cd22462	KH-I_HEN4_like_rpt5	fifth type I K homology (KH) RNA-binding domain found in Arabidopsis thaliana KH domain-containing protein HEN4 and similar proteins. HEN4, also called protein HUA ENHANCER 4, plays a role in floral reproductive organ identity in the third whorl and floral determinacy specification by specifically promoting the processing of AGAMOUS (AG) pre-mRNA. It functions in association with HUA1 and HUA2. HEN4 contains five K-homology (KH) RNA-binding domains. The model corresponds to the KH5 domain of HEN4.	66
411891	cd22463	KH-I_RCF3_like_rpt5	fifth type I K homology (KH) RNA-binding domain found in Arabidopsis thaliana RNA-binding KH domain-containing protein RCF3 and similar proteins. RCF3, also called protein ENHANCED STRESS RESPONSE 1 (ESR1), or protein HIGH OSMOTIC STRESS GENE EXPRESSION 5 (HOS5), or protein REGULATOR OF CBF GENE EXPRESSION 3, or protein SHINY 1 (SHI1), acts as a negative regulator of osmotic stress-induced gene expression. It is involved in the regulation of thermotolerance responses under heat stress. It functions as an upstream regulator of heat stress transcription factor (HSF) genes. HEN4, also called protein HUA ENHANCER 4, plays a role in floral reproductive organ identity in the third whorl and floral determinacy specification by specifically promoting the processing of AGAMOUS (AG) pre-mRNA. It functions in association with HUA1 and HUA2. RCF3 contains five K-homology (KH) RNA-binding domains. The model corresponds to the KH5 domain of RCF3.	71
411892	cd22464	KH-I_AtC3H36_like	type I K homology (KH) RNA-binding domain found in Arabidopsis thaliana zinc finger CCCH domain-containing proteins AtC3H36, AtC3H52 and similar proteins. The family corresponds to a group of plant CCCH family zinc finger proteins, such as AtC3H36 and AtC3H52, which contain one K homology (KH) RNA-binding domain. They may play important roles in RNA processing, as RNA-binding proteins do in animals. They may also have an effective role in stress tolerance.	66
411893	cd22465	KH-I_Hqk	type I K homology (KH) RNA-binding domain found in protein quaking (Hqk) and similar proteins. Hqk, also called HqkI, is an RNA-binding protein that plays a central role in myelinization. It binds to the 5'-NACUAAY-N(1,20)-UAAY-3' RNA core sequence and regulates target mRNA stability. It acts by regulating pre-mRNA splicing, mRNA export and protein translation. Hqk is a regulator of oligodendrocyte differentiation and maturation in the brain that may play a role in myelin and oligodendrocyte dysfunction in schizophrenia.	103
411894	cd22466	KH-I_HOW	type I K homology (KH) RNA-binding domain found in Drosophila protein held out wings (how) and similar proteins. How, also called KH domain protein KH93F, or protein muscle-specific, or protein Struthio, or protein wings held out (who), or Quaking-related 93F (qkr93F), is an RNA-binding protein involved in the control of muscular and cardiac activity. It is required for integrin-mediated cell-adhesion in wing blade. It plays essential roles during embryogenesis, in late stages of somatic muscle development, for myotube migration and during metamorphosis for muscle reorganization.	105
411895	cd22467	KH-I_SPIN1_like	type I K homology (KH) RNA-binding domain found in Oryza sativa SPL11-interacting protein 1 (SPIN1) and similar proteins. SPIN1 is a K homology domain protein negatively regulated and ubiquitinated by the E3 ubiquitin ligase SPL11. It is involved in flowering time control in rice. SPIN1 binds DNA and RNA in vitro.	101
411896	cd22468	KH-I_KHDRBS1	type I K homology (KH) RNA-binding domain found in KH domain-containing, RNA-binding, signal transduction-associated protein 1 (KHDRBS1) and similar proteins. KHDRBS1, also called GAP-associated tyrosine phosphoprotein p62, or Src-associated in mitosis 68 kDa protein, or Sam68, or p21 Ras GTPase-activating protein-associated p62, or p68, is an RNA-binding protein that plays a role in the regulation of alternative splicing and influences mRNA splice site selection and exon inclusion. It binds to RNA containing 5'-[AU]UAA-3' as a bipartite motif spaced by more than 15 nucleotides. It also binds poly(A). KHDRBS1 acts as a putative regulator of mRNA stability and/or translation rates and mediates mRNA nuclear export. It is recruited and tyrosine phosphorylated by several receptor systems, for example the T-cell, leptin and insulin receptors.	106
411897	cd22469	KH-I_KHDRBS2	type I K homology (KH) RNA-binding domain found in KH domain-containing, RNA-binding, signal transduction-associated protein 2 (KHDRBS2) and similar proteins. KHDRBS2, also called Sam68-like mammalian protein 1, or SLM-1, is an RNA-binding protein that plays a role in the regulation of alternative splicing and influences mRNA splice site selection and exon inclusion. It binds both poly(A) and poly(U) homopolymers. KHDRBS2 may function as an adapter protein for Src kinases during mitosis.	118
411898	cd22470	KH-I_KHDRBS3	type I K homology (KH) RNA-binding domain found in KH domain-containing, RNA-binding, signal transduction-associated protein 3 (KHDRBS3) and similar proteins. KHDRBS3, also called RNA-binding protein T-Star, or Sam68-like mammalian protein 2, or SLM-2, or Sam68-like phosphotyrosine protein, is an RNA-binding protein that plays a role in the regulation of alternative splicing and influences mRNA splice site selection and exon inclusion. It binds optimally to RNA containing 5'-[AU]UAA-3' as a bipartite motif spaced by more than 15 nucleotides. It also binds poly(A). KHDRBS3 may play a role as a negative regulator of cell growth.	113
411899	cd22471	KH-I_RIK_like_rpt1	first type I K homology (KH) RNA-binding domain found in Arabidopsis thaliana protein RIK and similar proteins. RIK, also called rough sheath 2-interacting KH domain protein, or RS2-interacting KH domain protein, is an RNA-binding protein that acts together with RS2/AS1 in the recruitment of HIRA. RIK contains two type I K homology (KH) RNA-binding domains. The model corresponds to the first one. The KH1 domain is a divergent KH domain that lacks the RNA-binding GXXG motif.	91
411900	cd22472	KH-I_RIK_like_rpt2	second type I K homology (KH) RNA-binding domain found in Arabidopsis thaliana protein RIK and similar proteins. RIK, also called rough sheath 2-interacting KH domain protein, or RS2-interacting KH domain protein, is an RNA-binding protein that acts together with RS2/AS1 in the recruitment of HIRA. RIK contains two type I K homology (KH) RNA-binding domains. The model corresponds to the second one.	96
411901	cd22473	KH-I_DDX46	type I K homology (KH) RNA-binding domain found in DEAD box protein 46 (DDX46) and similar proteins. DDX46, also called PRP5 homolog, is an ATP-dependent RNA helicase that plays an essential role in splicing, either prior to, or during splicing A complex formation. It inhibits antiviral innate responses by entrapping selected antiviral transcripts in the nucleus. It is also involved in the development of several tumors. Members in this subfamily contain a divergent KH domain that lacks the RNA-binding GXXG motif.	103
411902	cd22474	KH-I_PRP5_like	type I K homology (KH) RNA-binding domain found in fungal pre-mRNA-processing ATP-dependent RNA helicase PRP5 and similar proteins. PRP5 is an ATP-dependent RNA helicase involved in spliceosome assembly and in nuclear splicing. It catalyzes an ATP-dependent conformational change of U2 snRNP. PRP5 interacts with the U2 snRNP and HSH155. Members in this subfamily contain a divergent KH domain that lacks the RNA-binding GXXG motif.	89
411903	cd22475	KH-I_AtRH42_like	type I K homology (KH) RNA-binding domain found in Arabidopsis thaliana DEAD-box ATP-dependent RNA helicase RH42 and similar proteins. RH42, also called DEAD-box RNA helicase RCF1, or REGULATOR OF CBF GENE EXPRESSION 1, is a helicase required for pre-mRNA splicing, cold-responsive gene regulation and cold tolerance. Members in this subfamily contain a divergent KH domain that lacks the RNA-binding GXXG motif.	102
411904	cd22476	KH-I_N4BP1	type I K homology (KH) RNA-binding domain found in NEDD4-binding protein 1 (N4BP1) and similar proteins. N4BP1 interacts with and is a substrate of NEDD4 ubiquitin ligase (neural precursor cell expressed, developmentally downregulated 4, E3 ubiquitin protein ligase). It is also an inhibitor of the E3 ubiquitin-protein ligase ITCH, a NEDD4 structurally related E3. N4BP1 acts by interacting with the second WW domain of ITCH, thereby competing with ITCH's substrates and impairing their ubiquitination.	68
411905	cd22477	KH-I_NYNRIN_like	type I K homology (KH) RNA-binding domain found in the subfamily of NYN domain and retroviral integrase catalytic domain-containing protein (NYNRIN). The NYNRIN subfamily includes NYNRIN and KH and NYN domain-containing protein (KHNYN). NYNRIN, also known as CGIN1/Cousin of GIN1, may contribute to retroviral resistance in mammals by regulating the ubiquitination of viral proteins. KHNYN acts as a novel cofactor for zinc finger antiviral protein (ZAP) to target CpG-containing retroviral RNA for degradation.	66
411906	cd22478	KH-I_FUBP1_rpt1	first type I K homology (KH) RNA-binding domain found in far upstream element-binding protein 1 (FUBP1) and similar proteins. FUBP1, also called FBP, or FUSE-binding protein 1, or DNA helicase V, or DH V, binds RNA and single-stranded DNA (ssDNA) and may act both as activator and repressor of transcription. It regulates MYC expression by binding to a single-stranded far-upstream element (FUSE) upstream of the MYC promoter. FUBP1 contains four K-homology (KH) RNA-binding domains. The model corresponds to the first one.	75
411907	cd22479	KH-I_FUBP2_rpt1	first type I K homology (KH) RNA-binding domain found in far upstream element-binding protein 2 (FUBP2) and similar proteins. FUBP2, also called FUSE-binding protein 2, or KH type-splicing regulatory protein (KSRP), or p75, is a single-strand nucleic acid binding protein implicated in a variety of cellular processes, including splicing in the nucleus, mRNA decay, maturation of miRNA, and transcriptional control of proto-oncogenes such as c-myc. It regulates the stability and/or translatability of many mRNA species, encoding immune-relevant proteins, either by binding to AU-rich elements (AREs) of mRNA 3'UTR or by facilitating miRNA biogenesis to target mRNA. FUBP2 contains four K-homology (KH) RNA-binding domains. The model corresponds to the first one.	71
411908	cd22480	KH-I_FUBP3_rpt1	first type I K homology (KH) RNA-binding domain found in far upstream element-binding protein 3 (FUBP3) and similar proteins. FUBP3, also called FUSE-binding protein 3, or MARTA2, was previously shown to mediate dendritic targeting of MAP2 mRNA in neurons. It may interact with single-stranded DNA from the far-upstream element (FUSE) and activate gene expression. It is required for beta-actin mRNA localization. It also interacts with fibroblast growth factor 9 (FGF9) 3'-UTR UG repeats and positively controls FGF9 expression through increasing translation of FGF9 mRNA. FUBP3 contains four K-homology (KH) RNA-binding domains. The model corresponds to the first one.	71
411909	cd22481	KH-I_FUBP1_rpt2	second type I K homology (KH) RNA-binding domain found in far upstream element-binding protein 1 (FUBP1) and similar proteins. FUBP1, also called FBP, or FUSE-binding protein 1, or DNA helicase V, or DH V, binds RNA and single-stranded DNA (ssDNA) and may act both as activator and repressor of transcription. It regulates MYC expression by binding to a single-stranded far-upstream element (FUSE) upstream of the MYC promoter. FUBP1 contains four K-homology (KH) RNA-binding domains. The model corresponds to the second one.	71
411910	cd22482	KH-I_FUBP2_rpt2	second type I K homology (KH) RNA-binding domain found in far upstream element-binding protein 2 (FUBP2) and similar proteins. FUBP2, also called FUSE-binding protein 2, or KH type-splicing regulatory protein (KSRP), or p75, is a single-strand nucleic acid binding protein implicated in a variety of cellular processes, including splicing in the nucleus, mRNA decay, maturation of miRNA, and transcriptional control of proto-oncogenes such as c-myc. It regulates the stability and/or translatability of many mRNA species, encoding immune-relevant proteins, either by binding to AU-rich elements (AREs) of mRNA 3'UTR or by facilitating miRNA biogenesis to target mRNA. FUBP2 contains four K-homology (KH) RNA-binding domains. The model corresponds to the second one.	73
411911	cd22483	KH-I_FUBP3_rpt2	second type I K homology (KH) RNA-binding domain found in far upstream element-binding protein 3 (FUBP3) and similar proteins. FUBP3, also called FUSE-binding protein 3, or MARTA2, was previously shown to mediate dendritic targeting of MAP2 mRNA in neurons. It may interact with single-stranded DNA from the far-upstream element (FUSE) and activate gene expression. It is required for beta-actin mRNA localization. It also interacts with fibroblast growth factor 9 (FGF9) 3'-UTR UG repeats and positively controls FGF9 expression through increasing translation of FGF9 mRNA. FUBP3 contains four K-homology (KH) RNA-binding domains. The model corresponds to the second one.	83
411912	cd22484	KH-I_FUBP1_rpt3	third type I K homology (KH) RNA-binding domain found in far upstream element-binding protein 1 (FUBP1) and similar proteins. FUBP1, also called FBP, or FUSE-binding protein 1, or DNA helicase V, or DH V, binds RNA and single-stranded DNA (ssDNA) and may act both as activator and repressor of transcription. It regulates MYC expression by binding to a single-stranded far-upstream element (FUSE) upstream of the MYC promoter. FUBP1 contains four K-homology (KH) RNA-binding domains. The model corresponds to the third one.	68
411913	cd22485	KH-I_FUBP2_rpt3	third type I K homology (KH) RNA-binding domain found in far upstream element-binding protein 2 (FUBP2) and similar proteins. FUBP2, also called FUSE-binding protein 2, or KH type-splicing regulatory protein (KSRP), or p75, is a single-strand nucleic acid binding protein implicated in a variety of cellular processes, including splicing in the nucleus, mRNA decay, maturation of miRNA, and transcriptional control of proto-oncogenes such as c-myc. It regulates the stability and/or translatability of many mRNA species, encoding immune-relevant proteins, either by binding to AU-rich elements (AREs) of mRNA 3'UTR or by facilitating miRNA biogenesis to target mRNA. FUBP2 contains four K-homology (KH) RNA-binding domains. The model corresponds to the third one.	68
411914	cd22486	KH-I_FUBP3_rpt3	third type I K homology (KH) RNA-binding domain found in far upstream element-binding protein 3 (FUBP3) and similar proteins. FUBP3, also called FUSE-binding protein 3, or MARTA2, was previously shown to mediate dendritic targeting of MAP2 mRNA in neurons. It may interact with single-stranded DNA from the far-upstream element (FUSE) and activate gene expression. It is required for beta-actin mRNA localization. It also interacts with fibroblast growth factor 9 (FGF9) 3'-UTR UG repeats and positively controls FGF9 expression through increasing translation of FGF9 mRNA. FUBP3 contains four K-homology (KH) RNA-binding domains. The model corresponds to the third one.	70
411915	cd22487	KH-I_FUBP1_rpt4	fourth type I K homology (KH) RNA-binding domain found in far upstream element-binding protein 1 (FUBP1) and similar proteins. FUBP1, also called FBP, or FUSE-binding protein 1, or DNA helicase V, or DH V, binds RNA and single-stranded DNA (ssDNA) and may act both as activator and repressor of transcription. It regulates MYC expression by binding to a single-stranded far-upstream element (FUSE) upstream of the MYC promoter. FUBP1 contains four K-homology (KH) RNA-binding domains. The model corresponds to the fourth one.	72
411916	cd22488	KH-I_FUBP2_rpt4	fourth type I K homology (KH) RNA-binding domain found in far upstream element-binding protein 2 (FUBP2) and similar proteins. FUBP2, also called FUSE-binding protein 2, or KH type-splicing regulatory protein (KSRP), or p75, is a single-strand nucleic acid binding protein implicated in a variety of cellular processes, including splicing in the nucleus, mRNA decay, maturation of miRNA, and transcriptional control of proto-oncogenes such as c-myc. It regulates the stability and/or translatability of many mRNA species, encoding immune-relevant proteins, either by binding to AU-rich elements (AREs) of mRNA 3'UTR or by facilitating miRNA biogenesis to target mRNA. FUBP2 contains four K-homology (KH) RNA-binding domains. The model corresponds to the fourth one.	69
411917	cd22489	KH-I_FUBP3_rpt4	fourth type I K homology (KH) RNA-binding domain found in far upstream element-binding protein 3 (FUBP3) and similar proteins. FUBP3, also called FUSE-binding protein 3, or MARTA2, was previously shown to mediate dendritic targeting of MAP2 mRNA in neurons. It may interact with single-stranded DNA from the far-upstream element (FUSE) and activate gene expression. It is required for beta-actin mRNA localization. It also interacts with fibroblast growth factor 9 (FGF9) 3'-UTR UG repeats and positively controls FGF9 expression through increasing translation of FGF9 mRNA. FUBP3 contains four K-homology (KH) RNA-binding domains. The model corresponds to the fourth one.	69
411918	cd22490	KH-I_IGF2BP1_rpt1	first type I K homology (KH) RNA-binding domain found in insulin-like growth factor 2 mRNA-binding protein 1 (IGF2BP1) and similar proteins. IGF2BP1, also called IGF2 mRNA-binding protein 1 (IMP-1), or coding region determinant-binding protein (CRD-BP), or IGF-II mRNA-binding protein 1, or VICKZ family member 1 (VICKZ1), or zipcode-binding protein 1 (ZBP-1), is an RNA-binding factor that recruits target transcripts to cytoplasmic protein-RNA complexes (mRNPs). It functions by binding to the 5' UTR of the insulin-like growth factor 2 (IGF2) mRNA and regulating IGF2 translation. It regulates localized beta-actin/ACTB mRNA translation, a crucial process for cell polarity, cell migration and neurite outgrowth. IGF2BP1 can form homodimers and heterodimers with IGF2BP1 and IGF2BP3. It contains four K-homology (KH) RNA-binding domains which are important in RNA binding and are known to be involved in RNA synthesis and metabolism. The model corresponds to the first one.	76
411919	cd22491	KH-I_IGF2BP2_rpt1	first type I K homology (KH) RNA-binding domain found in insulin-like growth factor 2 mRNA-binding protein 2 (IGF2BP2) and similar proteins. IGF2BP2, also called IGF2 mRNA-binding protein 2 (IMP-2), or hepatocellular carcinoma autoantigen p62, or IGF-II mRNA-binding protein 2, or VICKZ family member 2 (VICKZ2), is an RNA-binding factor that recruits target transcripts to cytoplasmic protein-RNA complexes (mRNPs). It functions by binding to the 5' UTR of the insulin-like growth factor 2 (IGF2) mRNA and regulating IGF2 translation. It also binds to beta-actin/ACTB and MYC transcripts. IGF2BP2 can form homooligomers and heterooligomers with IGF2BP1 and IGF2BP3 in an RNA-dependent manner. It contains four K-homology (KH) RNA-binding domains which are important in RNA binding and are known to be involved in RNA synthesis and metabolism. The model corresponds to the first one.	74
411920	cd22492	KH-I_IGF2BP3_rpt1	first type I K homology (KH) RNA-binding domain found in insulin-like growth factor 2 mRNA-binding protein 3 (IGF2BP3) and similar proteins. IGF2BP3, also called IGF2 mRNA-binding protein 3 (IMP-3), or hepatocellular carcinoma autoantigen p62, or IGF-II mRNA-binding protein 3, or VICKZ family member 3 (VICKZ3), or KH domain-containing protein overexpressed in cancer, or KOC, is primarily found in the nucleolus, where it can bind to the 5' UTR of the insulin-like growth factor II leader 3 mRNA and may repress translation of insulin-like growth factor II during late development. It acts as an RNA-binding factor that may recruit target transcripts to cytoplasmic protein-RNA complexes (mRNPs). It also modulates the rate and location at which target transcripts encounter the translational apparatus and shields them from endonuclease attacks or microRNA-mediated degradation. IGF2BP3 binds to the 3'-UTR of CD44 mRNA and stabilizes it, hence promotes cell adhesion and invadopodia formation in cancer cells. It also binds to beta-actin/ACTB and MYC transcripts. IGF2BP3 can form homooligomers and heterooligomers with IGF2BP1 and IGF2BP2 in an RNA-dependent manner. IGF2BP3 contains four K-homology (KH) RNA-binding domains which are important in RNA binding and are known to be involved in RNA synthesis and metabolism. The model corresponds to the first one.	76
411921	cd22493	KH-I_IGF2BP1_rpt2	second type I K homology (KH) RNA-binding domain found in insulin-like growth factor 2 mRNA-binding protein 1 (IGF2BP1) and similar proteins. IGF2BP1, also called IGF2 mRNA-binding protein 1 (IMP-1), or coding region determinant-binding protein (CRD-BP), or IGF-II mRNA-binding protein 1, or VICKZ family member 1 (VICKZ1), or zipcode-binding protein 1 (ZBP-1), is an RNA-binding factor that recruits target transcripts to cytoplasmic protein-RNA complexes (mRNPs). It functions by binding to the 5' UTR of the insulin-like growth factor 2 (IGF2) mRNA and regulating IGF2 translation. It regulates localized beta-actin/ACTB mRNA translation, a crucial process for cell polarity, cell migration and neurite outgrowth. IGF2BP1 can form homodimers and heterodimers with IGF2BP1 and IGF2BP3. It contains four K-homology (KH) RNA-binding domains which are important in RNA binding and are known to be involved in RNA synthesis and metabolism. The model corresponds to the second one.	97
411922	cd22494	KH-I_IGF2BP2_rpt2	second type I K homology (KH) RNA-binding domain found in insulin-like growth factor 2 mRNA-binding protein 2 (IGF2BP2) and similar proteins. IGF2BP2, also called IGF2 mRNA-binding protein 2 (IMP-2), or hepatocellular carcinoma autoantigen p62, or IGF-II mRNA-binding protein 2, or VICKZ family member 2 (VICKZ2), is an RNA-binding factor that recruits target transcripts to cytoplasmic protein-RNA complexes (mRNPs). It functions by binding to the 5' UTR of the insulin-like growth factor 2 (IGF2) mRNA and regulating IGF2 translation. It also binds to beta-actin/ACTB and MYC transcripts. IGF2BP2 can form homooligomers and heterooligomers with IGF2BP1 and IGF2BP3 in an RNA-dependent manner. It contains four K-homology (KH) RNA-binding domains which are important in RNA binding and are known to be involved in RNA synthesis and metabolism. The model corresponds to the second one.	77
411923	cd22495	KH-I_IGF2BP3_rpt2	second type I K homology (KH) RNA-binding domain found in insulin-like growth factor 2 mRNA-binding protein 3 (IGF2BP3) and similar proteins. IGF2BP3, also called IGF2 mRNA-binding protein 3 (IMP-3), or hepatocellular carcinoma autoantigen p62, or IGF-II mRNA-binding protein 3, or VICKZ family member 3 (VICKZ3), or KH domain-containing protein overexpressed in cancer, or KOC, is primarily found in the nucleolus, where it can bind to the 5' UTR of the insulin-like growth factor II leader 3 mRNA and may repress translation of insulin-like growth factor II during late development. It acts as an RNA-binding factor that may recruit target transcripts to cytoplasmic protein-RNA complexes (mRNPs). It also modulates the rate and location at which target transcripts encounter the translational apparatus and shields them from endonuclease attacks or microRNA-mediated degradation. IGF2BP3 binds to the 3'-UTR of CD44 mRNA and stabilizes it, hence promotes cell adhesion and invadopodia formation in cancer cells. It also binds to beta-actin/ACTB and MYC transcripts. IGF2BP3 can form homooligomers and heterooligomers with IGF2BP1 and IGF2BP2 in an RNA-dependent manner. IGF2BP3 contains four K-homology (KH) RNA-binding domains which are important in RNA binding and are known to be involved in RNA synthesis and metabolism. The model corresponds to the second one.	77
411924	cd22496	KH-I_IGF2BP1_rpt3	third type I K homology (KH) RNA-binding domain found in insulin-like growth factor 2 mRNA-binding protein 1 (IGF2BP1) and similar proteins. IGF2BP1, also called IGF2 mRNA-binding protein 1 (IMP-1), or coding region determinant-binding protein (CRD-BP), or IGF-II mRNA-binding protein 1, or VICKZ family member 1 (VICKZ1), or zipcode-binding protein 1 (ZBP-1), is an RNA-binding factor that recruits target transcripts to cytoplasmic protein-RNA complexes (mRNPs). It functions by binding to the 5' UTR of the insulin-like growth factor 2 (IGF2) mRNA and regulating IGF2 translation. It regulates localized beta-actin/ACTB mRNA translation, a crucial process for cell polarity, cell migration and neurite outgrowth. IGF2BP1 can form homodimers and heterodimers with IGF2BP1 and IGF2BP3. It contains four K-homology (KH) RNA-binding domains which are important in RNA binding and are known to be involved in RNA synthesis and metabolism. The model corresponds to the third one.	76
411925	cd22497	KH-I_IGF2BP2_rpt3	third type I K homology (KH) RNA-binding domain found in insulin-like growth factor 2 mRNA-binding protein 2 (IGF2BP2) and similar proteins. IGF2BP2, also called IGF2 mRNA-binding protein 2 (IMP-2), or hepatocellular carcinoma autoantigen p62, or IGF-II mRNA-binding protein 2, or VICKZ family member 2 (VICKZ2), is an RNA-binding factor that recruits target transcripts to cytoplasmic protein-RNA complexes (mRNPs). It functions by binding to the 5' UTR of the insulin-like growth factor 2 (IGF2) mRNA and regulating IGF2 translation. It also binds to beta-actin/ACTB and MYC transcripts. IGF2BP2 can form homooligomers and heterooligomers with IGF2BP1 and IGF2BP3 in an RNA-dependent manner. It contains four K-homology (KH) RNA-binding domains which are important in RNA binding and are known to be involved in RNA synthesis and metabolism. The model corresponds to the third one.	77
411926	cd22498	KH-I_IGF2BP3_rpt3	third type I K homology (KH) RNA-binding domain found in insulin-like growth factor 2 mRNA-binding protein 3 (IGF2BP3) and similar proteins. IGF2BP3, also called IGF2 mRNA-binding protein 3 (IMP-3), or hepatocellular carcinoma autoantigen p62, or IGF-II mRNA-binding protein 3, or VICKZ family member 3 (VICKZ3), or KH domain-containing protein overexpressed in cancer, or KOC, is primarily found in the nucleolus, where it can bind to the 5' UTR of the insulin-like growth factor II leader 3 mRNA and may repress translation of insulin-like growth factor II during late development. It acts as an RNA-binding factor that may recruit target transcripts to cytoplasmic protein-RNA complexes (mRNPs). It also modulates the rate and location at which target transcripts encounter the translational apparatus and shields them from endonuclease attacks or microRNA-mediated degradation. IGF2BP3 binds to the 3'-UTR of CD44 mRNA and stabilizes it, hence promotes cell adhesion and invadopodia formation in cancer cells. It also binds to beta-actin/ACTB and MYC transcripts. IGF2BP3 can form homooligomers and heterooligomers with IGF2BP1 and IGF2BP2 in an RNA-dependent manner. IGF2BP3 contains four K-homology (KH) RNA-binding domains which are important in RNA binding and are known to be involved in RNA synthesis and metabolism. The model corresponds to the third one.	78
411927	cd22499	KH-I_IGF2BP1_rpt4	fourth type I K homology (KH) RNA-binding domain found in insulin-like growth factor 2 mRNA-binding protein 1 (IGF2BP1) and similar proteins. IGF2BP1, also called IGF2 mRNA-binding protein 1 (IMP-1), or coding region determinant-binding protein (CRD-BP), or IGF-II mRNA-binding protein 1, or VICKZ family member 1 (VICKZ1), or zipcode-binding protein 1 (ZBP-1), is an RNA-binding factor that recruits target transcripts to cytoplasmic protein-RNA complexes (mRNPs). It functions by binding to the 5' UTR of the insulin-like growth factor 2 (IGF2) mRNA and regulating IGF2 translation. It regulates localized beta-actin/ACTB mRNA translation, a crucial process for cell polarity, cell migration and neurite outgrowth. IGF2BP1 can form homodimers and heterodimers with IGF2BP1 and IGF2BP3. It contains four K-homology (KH) RNA-binding domains which are important in RNA binding and are known to be involved in RNA synthesis and metabolism. The model corresponds to the fourth one.	76
411928	cd22500	KH-I_IGF2BP2_rpt4	fourth type I K homology (KH) RNA-binding domain found in insulin-like growth factor 2 mRNA-binding protein 2 (IGF2BP2) and similar proteins. IGF2BP2, also called IGF2 mRNA-binding protein 2 (IMP-2), or hepatocellular carcinoma autoantigen p62, or IGF-II mRNA-binding protein 2, or VICKZ family member 2 (VICKZ2), is an RNA-binding factor that recruits target transcripts to cytoplasmic protein-RNA complexes (mRNPs). It functions by binding to the 5' UTR of the insulin-like growth factor 2 (IGF2) mRNA and regulating IGF2 translation. It also binds to beta-actin/ACTB and MYC transcripts. IGF2BP2 can form homooligomers and heterooligomers with IGF2BP1 and IGF2BP3 in an RNA-dependent manner. It contains four K-homology (KH) RNA-binding domains which are important in RNA binding and are known to be involved in RNA synthesis and metabolism. The model corresponds to the fourth one.	78
411929	cd22501	KH-I_IGF2BP3_rpt4	fourth type I K homology (KH) RNA-binding domain found in insulin-like growth factor 2 mRNA-binding protein 3 (IGF2BP3) and similar proteins. IGF2BP3, also called IGF2 mRNA-binding protein 3 (IMP-3), or hepatocellular carcinoma autoantigen p62, or IGF-II mRNA-binding protein 3, or VICKZ family member 3 (VICKZ3), or KH domain-containing protein overexpressed in cancer, or KOC, is primarily found in the nucleolus, where it can bind to the 5' UTR of the insulin-like growth factor II leader 3 mRNA and may repress translation of insulin-like growth factor II during late development. It acts as an RNA-binding factor that may recruit target transcripts to cytoplasmic protein-RNA complexes (mRNPs). It also modulates the rate and location at which target transcripts encounter the translational apparatus and shields them from endonuclease attacks or microRNA-mediated degradation. IGF2BP3 binds to the 3'-UTR of CD44 mRNA and stabilizes it, hence promotes cell adhesion and invadopodia formation in cancer cells. It also binds to beta-actin/ACTB and MYC transcripts. IGF2BP3 can form homooligomers and heterooligomers with IGF2BP1 and IGF2BP2 in an RNA-dependent manner. IGF2BP3 contains four K-homology (KH) RNA-binding domains which are important in RNA binding and are known to be involved in RNA synthesis and metabolism. The model corresponds to the fourth one.	66
411930	cd22502	KH-I_ANKRD17	type I K homology (KH) RNA-binding domain found in ankyrin repeat domain-containing protein 17 (ANKRD17) and similar proteins. ANKRD17, also called ankyrin repeat protein 17, or gene trap ankyrin repeat protein (GTAR), or serologically defined breast cancer antigen NY-BR-16, is a ubiquitously expressed ankyrin factor essential for the vascular integrity during embryogenesis. It may be directly involved in the DNA replication process and play pivotal roles in cell cycle and DNA regulation. It is also involved in innate immune defense against bacteria and viruses.	71
411931	cd22503	KH-I_ANKHD1	type I K homology (KH) RNA-binding domain found in ankyrin repeat and KH domain-containing protein 1 (ANKHD1) and similar proteins. ANKHD1, also called HIV-1 Vpr-binding ankyrin repeat protein, or multiple ankyrin repeats single KH domain, or Hmask, is highly expressed in various cancer tissues and is involved in cancer progression, including proliferation and invasion. It acts as a scaffolding protein that may be associated with the abnormal phenotype of leukemia cells. It may have a role in MM cell proliferation and cell cycle progression by regulating expression of p21. It also regulates cell cycle progression and proliferation in multiple myeloma cells. ANKHD1 is a component of the Hippo signaling pathway. It functions as a positive regulator of YAP1 and promotes cell growth and cell cycle progression through Cyclin A upregulation in prostate cancer cells.	83
411932	cd22504	KH_I_FXR1_rpt1	first type I K homology (KH) RNA-binding domain found in fragile X mental retardation syndrome-related protein 1 (FXR1) and similar proteins. FXR1 is an RNA-binding protein required for embryonic and postnatal development of muscle tissue. It may regulate intracellular transport and local translation of certain mRNAs. FXR1 protein may be present in amyloid form in brain of different species of mammals. It may regulate memory and emotions. FXR1 contains three K-homology (KH) RNA-binding domains. The model corresponds to the first one. Members in this subfamily contain a divergent KH domain that lacks the RNA-binding GXXG motif.	77
411933	cd22505	KH_I_FXR2_rpt1	first type I K homology (KH) RNA-binding domain found in fragile X mental retardation syndrome-related protein 2 (FXR2) and similar proteins. FXR2, also known as FMR1L2, is an RNA-binding protein that plays a role in central nervous system function. It specifically regulates hippocampal neurogenesis by reducing the stability of Noggin mRNA. FXR2 contains three K-homology (KH) RNA-binding domains. The model corresponds to the first one. Members in this subfamily contain a divergent KH domain that lacks the RNA-binding GXXG motif.	77
411934	cd22506	KH_I_FMR1_rpt1	first type I K homology (KH) RNA-binding domain found in fragile X mental retardation protein 1 (FMR1) and similar proteins. FMR1, also called FMRP, or synaptic functional regulator FMR1, is a multifunctional polyribosome-associated RNA-binding protein that plays a central role in neuronal development and synaptic plasticity through the regulation of alternative mRNA splicing, mRNA stability, mRNA dendritic transport and postsynaptic local protein synthesis of a subset of mRNAs. It also plays a role in the alternative splicing of its own mRNA. FMR1 contains three K-homology (KH) RNA-binding domains. The model corresponds to the first one. Members in this subfamily contain a divergent KH domain that lacks the RNA-binding GXXG motif.	77
411935	cd22507	KH_I_FXR1_rpt2	second type I K homology (KH) RNA-binding domain found in fragile X mental retardation syndrome-related protein 1 (FXR1) and similar proteins. FXR1 is an RNA-binding protein required for embryonic and postnatal development of muscle tissue. It may regulate intracellular transport and local translation of certain mRNAs. FXR1 protein may be present in amyloid form in brain of different species of mammals. It may regulate memory and emotions. FXR1 contains three K-homology (KH) RNA-binding domains. The model corresponds to the second one.	63
411936	cd22508	KH_I_FXR2_rpt2	second type I K homology (KH) RNA-binding domain found in fragile X mental retardation syndrome-related protein 2 (FXR2) and similar proteins. FXR2, also known as FMR1L2, is an RNA-binding protein that plays a role in central nervous system function. It specifically regulates hippocampal neurogenesis by reducing the stability of Noggin mRNA. FXR2 contains three K-homology (KH) RNA-binding domains. The model corresponds to the second one.	63
411937	cd22509	KH_I_FMR1_rpt2	second type I K homology (KH) RNA-binding domain found in fragile X mental retardation protein 1 (FMR1) and similar proteins. FMR1, also called FMRP, or synaptic functional regulator FMR1, is a multifunctional polyribosome-associated RNA-binding protein that plays a central role in neuronal development and synaptic plasticity through the regulation of alternative mRNA splicing, mRNA stability, mRNA dendritic transport and postsynaptic local protein synthesis of a subset of mRNAs. It also plays a role in the alternative splicing of its own mRNA. FMR1 contains three K-homology (KH) RNA-binding domains. The model corresponds to the second one.	63
411938	cd22510	KH_I_FXR1_rpt3	third type I K homology (KH) RNA-binding domain found in fragile X mental retardation syndrome-related protein 1 (FXR1) and similar proteins. FXR1 is an RNA-binding protein required for embryonic and postnatal development of muscle tissue. It may regulate intracellular transport and local translation of certain mRNAs. FXR1 protein may be present in amyloid form in brain of different species of mammals. It may regulate memory and emotions. FXR1 contains three K-homology (KH) RNA-binding domains. The model corresponds to the third one.	78
411939	cd22511	KH_I_FXR2_rpt3	third type I K homology (KH) RNA-binding domain found in fragile X mental retardation syndrome-related protein 2 (FXR2) and similar proteins. FXR2, also known as FMR1L2, is an RNA-binding protein that plays a role in central nervous system function. It specifically regulates hippocampal neurogenesis by reducing the stability of Noggin mRNA. FXR2 contains three K-homology (KH) RNA-binding domains. The model corresponds to the third one.	78
411940	cd22512	KH_I_FMR1_rpt3	third type I K homology (KH) RNA-binding domain found in fragile X mental retardation protein 1 (FMR1) and similar proteins. FMR1, also called FMRP, or synaptic functional regulator FMR1, is a multifunctional polyribosome-associated RNA-binding protein that plays a central role in neuronal development and synaptic plasticity through the regulation of alternative mRNA splicing, mRNA stability, mRNA dendritic transport and postsynaptic local protein synthesis of a subset of mRNAs. It also plays a role in the alternative splicing of its own mRNA. FMR1 contains three K-homology (KH) RNA-binding domains. The model corresponds to the third one.	78
411941	cd22513	KH-I_BTR1_rpt1	first type I K homology (KH) RNA-binding domain found in Arabidopsis thaliana protein BTR1 and similar proteins. BTR1, also called Binding to ToMV RNA 1, is a negative regulator of tomato mosaic virus (ToMV) multiplication but has no effect on the multiplication of cucumber mosaic virus (CMV). BTR1 contains three K-homology (KH) RNA-binding domains. The model corresponds to the first one.	73
411942	cd22514	KH-I_BTR1_rpt3	third type I K homology (KH) RNA-binding domain found in Arabidopsis thaliana protein BTR1 and similar proteins. BTR1, also called Binding to ToMV RNA 1, is a negative regulator of tomato mosaic virus (ToMV) multiplication but has no effect on the multiplication of cucumber mosaic virus (CMV). BTR1 contains three K-homology (KH) RNA-binding domains. The model corresponds to the third one.	71
411943	cd22515	KH-I_PCBP1_2_rpt1	first type I K homology (KH) RNA-binding domain found in poly(rC)-binding protein 1 (PCBP1) and similar proteins. The family includes PCBP1 (also called alpha-CP1, or heterogeneous nuclear ribonucleoprotein E1, or hnRNP E1, or nucleic acid-binding protein SUB2.3) and PCBP2 (also called alpha-CP2, or heterogeneous nuclear ribonucleoprotein E2, or hnRNP E2). They are single-stranded nucleic acid binding proteins that bind preferentially to oligo dC. They act as iron chaperones for ferritin. In case of infection by poliovirus, PCBP1 plays a role in initiation of viral RNA replication in concert with the viral protein 3CD. PCBP2 is a major cellular poly(rC)-binding protein. It also binds poly(rU). PCBP2 negatively regulates cellular antiviral responses mediated by MAVS signaling. It acts as an adapter between MAVS and the E3 ubiquitin ligase ITCH, therefore triggering MAVS ubiquitination and degradation. PCBP2 forms a metabolon with the heme oxygenase 1/cytochrome P450 reductase complex for heme catabolism and iron transfer. Both PCBP1 and PCBP2 contain three K-homology (KH) RNA-binding domains. The model corresponds to the first one.	70
411944	cd22516	KH-I_PCBP3_rpt1	first type I K homology (KH) RNA-binding domain found in poly(rC)-binding protein 3 (PCBP3) and similar proteins. PCBP3, also called alpha-CP3, or PCBP3-overlapping transcript, or PCBP3-overlapping transcript 1, or heterogeneous nuclear ribonucleoprotein E3, or hnRNP E3, is a single-stranded nucleic acid binding protein that binds preferentially to oligo dC. It can function as a repressor dependent on binding to single-strand and double-stranded poly(C) sequences. PCBP3 contains three K-homology (KH) RNA-binding domains. The model corresponds to the first one.	77
411945	cd22517	KH-I_PCBP4_rpt1	first type I K homology (KH) RNA-binding domain found in poly(rC)-binding protein 4 (PCBP4) and similar proteins. PCBP4, also called alpha-CP4, or heterogeneous nuclear ribonucleoprotein E4, or hnRNP E4, is a single-stranded nucleic acid binding protein that binds preferentially to oligo dC. It regulates both basal and stress-induced p21 expression through binding p21 3'-UTR and modulating p21 mRNA stability. It also plays a role in the cell cycle and is implicated in lung tumor suppression. PCBP4 contains three K-homology (KH) RNA-binding domains. The model corresponds to the first one.	70
411946	cd22518	KH-I_PCBP1_2_rpt2	second type I K homology (KH) RNA-binding domain found in poly(rC)-binding protein 1 (PCBP1) and similar proteins. The family includes PCBP1 (also called alpha-CP1, or heterogeneous nuclear ribonucleoprotein E1, or hnRNP E1, or nucleic acid-binding protein SUB2.3) and PCBP2 (also called alpha-CP2, or heterogeneous nuclear ribonucleoprotein E2, or hnRNP E2). They are single-stranded nucleic acid binding proteins that bind preferentially to oligo dC. They act as iron chaperones for ferritin. In case of infection by poliovirus, PCBP1 plays a role in initiation of viral RNA replication in concert with the viral protein 3CD. PCBP2 is a major cellular poly(rC)-binding protein. It also binds poly(rU). PCBP2 negatively regulates cellular antiviral responses mediated by MAVS signaling. It acts as an adapter between MAVS and the E3 ubiquitin ligase ITCH, therefore triggering MAVS ubiquitination and degradation. PCBP2 forms a metabolon with the heme oxygenase 1/cytochrome P450 reductase complex for heme catabolism and iron transfer. Both PCBP1 and PCBP2 contain three K-homology (KH) RNA-binding domains. The model corresponds to the second one.	78
411947	cd22519	KH-I_PCBP3_rpt2	second type I K homology (KH) RNA-binding domain found in poly(rC)-binding protein 3 (PCBP3) and similar proteins. PCBP3, also called alpha-CP3, or PCBP3-overlapping transcript, or PCBP3-overlapping transcript 1, or heterogeneous nuclear ribonucleoprotein E3, or hnRNP E3, is a single-stranded nucleic acid binding protein that binds preferentially to oligo dC. It can function as a repressor dependent on binding to single-strand and double-stranded poly(C) sequences. PCBP3 contains three K-homology (KH) RNA-binding domains. The model corresponds to the second one.	79
411948	cd22520	KH-I_PCBP4_rpt2	second type I K homology (KH) RNA-binding domain found in poly(rC)-binding protein 4 (PCBP4) and similar proteins. PCBP4, also called alpha-CP4, or heterogeneous nuclear ribonucleoprotein E4, or hnRNP E4, is a single-stranded nucleic acid binding protein that binds preferentially to oligo dC. It regulates both basal and stress-induced p21 expression through binding p21 3'-UTR and modulating p21 mRNA stability. It also plays a role in the cell cycle and is implicated in lung tumor suppression. PCBP4 contains three K-homology (KH) RNA-binding domains. The model corresponds to the second one.	72
411949	cd22521	KH-I_PCBP1_2_rpt3	third type I K homology (KH) RNA-binding domain found in poly(rC)-binding protein 1 (PCBP1) and similar proteins. The family includes PCBP1 (also called alpha-CP1, or heterogeneous nuclear ribonucleoprotein E1, or hnRNP E1, or nucleic acid-binding protein SUB2.3) and PCBP2 (also called alpha-CP2, or heterogeneous nuclear ribonucleoprotein E2, or hnRNP E2). They are single-stranded nucleic acid binding proteins that bind preferentially to oligo dC. They act as iron chaperones for ferritin. In case of infection by poliovirus, PCBP1 plays a role in initiation of viral RNA replication in concert with the viral protein 3CD. PCBP2 is a major cellular poly(rC)-binding protein. It also binds poly(rU). PCBP2 negatively regulates cellular antiviral responses mediated by MAVS signaling. It acts as an adapter between MAVS and the E3 ubiquitin ligase ITCH, therefore triggering MAVS ubiquitination and degradation. PCBP2 forms a metabolon with the heme oxygenase 1/cytochrome P450 reductase complex for heme catabolism and iron transfer. Both PCBP1 and PCBP2 contain three K-homology (KH) RNA-binding domains. The model corresponds to the third one.	76
411950	cd22522	KH-I_PCBP3_rpt3	third type I K homology (KH) RNA-binding domain found in poly(rC)-binding protein 3 (PCBP3) and similar proteins. PCBP3, also called alpha-CP3, or PCBP3-overlapping transcript, or PCBP3-overlapping transcript 1, or heterogeneous nuclear ribonucleoprotein E3, or hnRNP E3, is a single-stranded nucleic acid binding protein that binds preferentially to oligo dC. It can function as a repressor dependent on binding to single-strand and double-stranded poly(C) sequences. PCBP3 contains three K-homology (KH) RNA-binding domains. The model corresponds to the third one.	75
411951	cd22523	KH-I_PCBP4_rpt3	third type I K homology (KH) RNA-binding domain found in poly(rC)-binding protein 4 (PCBP4) and similar proteins. PCBP4, also called alpha-CP4, or heterogeneous nuclear ribonucleoprotein E4, or hnRNP E4, is a single-stranded nucleic acid binding protein that binds preferentially to oligo dC. It regulates both basal and stress-induced p21 expression through binding p21 3'-UTR and modulating p21 mRNA stability. It also plays a role in the cell cycle and is implicated in lung tumor suppression. PCBP4 contains three K-homology (KH) RNA-binding domains. The model corresponds to the third one.	68
411952	cd22524	KH-I_Rrp4_prokar	type I K homology (KH) RNA-binding domain found in exosome complex component Rrp4 mainly from archaea. The subfamily corresponds to ribosomal RNA-processing protein 4 (Rrp4) mainly from archaea. It is a non-catalytic component of the exosome, which is a phosphorolytic 3'-5' exoribonuclease complex involved in RNA degradation and processing. Rrp4 increases the RNA binding and the efficiency of RNA degradation and confers strong poly(A) specificity to the exosome.	82
411953	cd22525	KH-I_Rrp4_eukar	type I K homology (KH) RNA-binding domain found in exosome complex component Rrp4 from eukaryotes. The subfamily corresponds to ribosomal RNA-processing protein 4 (Rrp4) mainly from eukaryotes. Rrp4, also called exosome component 2 (EXOSC2), or ribosomal RNA-processing protein 4, is a non-catalytic component of the RNA exosome complex which has 3'-->5' exoribonuclease activity and participates in a multitude of cellular RNA processing and degradation events. Mutations in the EXOSC2 gene are associated with a novel syndrome characterized by retinitis pigmentosa, progressive hearing loss, premature aging, short stature, mild intellectual disability and distinctive gestalt. Members in this subfamily contain a divergent KH domain that lacks the RNA-binding GXXG motif.	123
411954	cd22526	KH-I_Rrp40	type I K homology (KH) RNA-binding domain found in exosome complex component Rrp40 and similar proteins. Rrp40, also called exosome component 3 (EXOSC3), or ribosomal RNA-processing protein 40, is a non-catalytic component of the RNA exosome complex which has 3'-->5' exoribonuclease activity and participates in a multitude of cellular RNA processing and degradation events. Mutations of the EXOSC3 gene are associated with neurological diseases. Members in this subfamily contain a divergent KH domain that lacks the RNA-binding GXXG motif.	78
412022	cd22527	IPD_PPP1R12A-like	inhibitory phosphorylation domain of protein phosphatase 1 regulatory subunit 12A-like, and similar proteins. Protein phosphatase 1 regulatory subunit 12A-like (PPP1R12A-like) is a homolog of MYPT1, also called protein phosphatase 1 regulatory subunit 12A (PPP1R12A), myosin phosphatase target subunit 1, or protein phosphatase myosin-binding subunit. MYPT1 is the targeting subunit of smooth-muscle myosin phosphatase. It is a substrate for the asparaginyl hydroxylase factor inhibiting hypoxia-inducible factor (FIH). MYPT1 acts as a key regulator of protein phosphatase 1C (PPP1C). It mediates binding to myosin. As part of the PPP1C complex, MYPT1 is involved in dephosphorylation of the mitosis regulator polo-like kinase 1 (PLK1). It is capable of inhibiting HIF1A inhibitor (HIF1AN)-dependent suppression of HIF1A activity. This model corresponds to the inhibitory phosphorylation domain of PPP1R12A-like protein.	50
412095	cd22528	av_Nsp3_ER-remodelling	intracellular membrane remodeller motif of arterivirus non-structural protein 3 (Nsp3). This domain is present in subunit Nsp3 of RNA-arteriviruses, such as porcine arterivirus PRRSV and equine arterivirus EAV. Nsp3 proteins are localized to the ER and appear to be essential for formation of double-membrane vesicles that originate from the ER during the life-cycle of the virus. Arterivirus Nsp3 is a predicted tetra-spanning transmembrane protein containing four transmembrane helices, with the N- and C-termini of the protein residing in the cytoplasm. It contains a cluster of four highly conserved cysteine residues that are predicted to reside in the first luminal domain of the protein. These conserved cysteines play a key role in the formation of double-membrane vesicles (DMVs); mutagenesis of each completely blocked DMV formation.	57
411786	cd22529	KH-II_NusA_rpt2	second type II K-homology (KH) RNA-binding domain found in transcription termination/antitermination protein NusA and similar proteins. NusA, also called N utilization substance protein A or transcription termination/antitermination L factor, is an essential multifunctional transcription elongation factor that participates in both transcription termination and antitermination. The antitermination function of NusA plays an important role in the expression of ribosomal rrn operons. During transcription of many other genes, NusA-induced RNA polymerase pausing provides a mechanism for synchronizing transcription and translation. In prokaryotes, the N-terminal RNA polymerase-binding domain (NTD) is connected through a flexible hinge helix to three globular domains, the S1 domain and two K-homology domains, KH1 and KH2. The K-homology (KH) domains of NusA belong to the type II KH RNA-binding domain superfamily. This model corresponds to the second KH domain of NusA and similar proteins.	61
411787	cd22530	KH-II_NusA_arch_rpt1	first type II K-homology (KH) RNA-binding domain found in archaeal probable transcription termination protein NusA and similar proteins. NusA, also called N utilization substance protein A, is an essential multifunctional transcription elongation factor that is universally conserved among prokaryotes and archaea. It participates in both transcription termination and antitermination. NusA homologs consisting of only the two type II K-homology (KH) domains are widely conserved in archaea. Although their function remains unclear, it has been found that Aeropyrum pernix NusA strongly binds to a certain CU-rich sequence near a termination signal. Archaeal NusA may have retained some functions of bacterial NusA, including ssRNA-binding ability. This model corresponds to the first KH domain of NusA found mainly in archaea.	69
411788	cd22531	KH-II_NusA_arch_rpt2	second type II K-homology (KH) RNA-binding domain found in archaeal probable transcription termination protein NusA and similar proteins. NusA, also called N utilization substance protein A, is an essential multifunctional transcription elongation factor that is universally conserved among prokaryotes and archaea. It participates in both transcription termination and antitermination. NusA homologs consisting of only the two type II K-homology (KH) domains are widely conserved in archaea. Although their function remains unclear, it has been found that Aeropyrum pernix NusA strongly binds to a certain CU-rich sequence near a termination signal. Archaeal NusA may have retained some functions of bacterial NusA, including ssRNA-binding ability. This model corresponds to the second KH domain of NusA mainly found in archaea.	67
411789	cd22532	KH-II_CPSF_arch_rpt1	first type II K-homology (KH) RNA-binding domain found in archaeal cleavage and polyadenylation specificity factor (CPSF) and similar proteins. The archaeal CPSFs are predicted to be metal-dependent RNases belonging to the beta-CASP family, a subgroup of enzymes within the metallo-beta-lactamase fold. Within the CPSF family, all archaeal genomes contain one member with two N-terminal type II K-homology (KH) domains and one without. This family includes the CPSF homologs from archaea possessing N-terminal KH domains. This model corresponds to the first KH domain, which is a non-canonical type II KH domain that does not contain the signature motif GXXG (where X represents any amino acid).	62
411790	cd22533	KH-II_YlqC-like	type II K-homology (KH) RNA-binding domain found in Bacillus subtilis UPF0109 protein YlqC and similar proteins. The family includes a group of uncharacterized proteins which show sequence similarity to Bacillus subtilis UPF0109 protein YlqC. They are mainly found in bacteria and contain only one canonical type II K-homology (KH) domain that has the signature motif GXXG (where X represents any amino acid).	75
411791	cd22534	KH-II_Era	type II K-homology (KH) RNA-binding domain found in GTPase Era and similar proteins. GTPase Era, also called ERA or GTP-binding protein Era, is an essential GTPase that binds both GDP and GTP, with nucleotide exchange occurring on the order of seconds whereas hydrolysis occurs on the order of minutes. It plays a role in numerous processes, including cell cycle regulation, energy metabolism, acting as a chaperone for 16S rRNA processing, and 30S ribosomal subunit biogenesis. Its presence in the 30S subunit may prevent translation initiation. GTPase Era may also be critical for maintaining cell growth and cell division rates. Members of this family contain only one canonical type II K-homology (KH) domain that has the signature motif GXXG (where X represents any amino acid).	87
411773	cd22536	SP4_N	N-terminal domain of transcription factor Specificity Protein (SP) 4. Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. Human SP4 is a risk gene of multiple psychiatric disorders including schizophrenia, bipolar disorder, and major depression. SP4 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. SP1-4 have similar N-terminal transactivation domains characterized by glutamine-rich regions, which, in most cases, have adjacent serine/threonine-rich regions. This model represents the N-terminal domain of SP4.	623
411774	cd22537	SP3_N	N-terminal domain of transcription factor Specificity Protein (SP) 3. Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP1 and SP3 can interact with and recruit a large number of proteins including the transcription initiation complex, histone modifying enzymes, and chromatin remodeling complexes, which strongly suggests that SP1 and SP3 are important transcription factors in remodeling chromatin and the regulation of gene expression. SP3 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. SP1-4 have similar N-terminal transactivation domains characterized by glutamine-rich regions, which, in most cases, have adjacent serine/threonine-rich regions. This model represents the N-terminal domain of SP3.	574
411690	cd22538	SP8_N	N-terminal domain of transcription factor Specificity Protein (SP) 8. Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP8 is crucial for limb outgrowth and neuropore closure. It is expressed during embryogenesis in the forming apical ectodermal ridge, restricted regions of the central nervous system, and tail bud. SP8 and SP9 are two closely related transcription factors that mediate FGF10 signaling, which in turn regulates FGF8 expression, which is essential for normal limb development. Both SP8 and SP9 have been found in vertebrates, but only SP8 is present in invertebrates. SP8 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. This model represents the N-terminal domain of SP8.	303
411775	cd22539	SP1_N	N-terminal domain of transcription factor Specificity Protein (SP) 1. Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP1 has been shown to interact with a variety of proteins including myogenin, SMAD3, SUMO1, SF1, TAL1, and UBC. Some 12,000 SP1 binding sites are found in the human genome. SP1 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. SP1-4 have similar N-terminal transactivation domains characterized by glutamine-rich regions, which, in most cases, have adjacent serine/threonine-rich regions. This model represents the N-terminal domain of SP1.	433
411776	cd22540	SP2_N	N-terminal domain of transcription factor Specificity Protein (SP) 2. Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP2 contains the least conserved DNA-binding domain within the SP subfamily of proteins, and its DNA sequence specificity differs from the other SP proteins. It localizes primarily within subnuclear foci associated with the nuclear matrix, and can activate, or in some cases, repress expression from different promoters. The transcription factor SP2 serves as a paradigm for indirect genomic binding. It does not require its DNA-binding domain for genomic DNA binding and occupies target promoters independently of whether they contain a cognate DNA-binding motif. SP2 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. SP1-4 have similar N-terminal transactivation domains characterized by glutamine-rich regions, which, in most cases, have adjacent serine/threonine-rich regions. This model represents the N-terminal domain of SP2.	511
412096	cd22541	SP5_N	N-terminal domain of transcription factor Specificity Protein (SP) 5. Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in the chicken, and 11 SPs in teleost fish), but arthropods only have 3 SPs. All of them contain clade SP5, which plays a potential role in human cancers and was found in several human tumors including hepatocellular carcinoma, gastric cancer, and colon cancer. Leukemia inhibitory factor/Stat3 and Wnt/beta-catenin signaling pathways converge on SP5 to promote mouse embryonic stem cell self-renewal. SP5 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. This model represents the N-terminal domain of SP5.	143
411691	cd22542	SP7_N	N-terminal domain of transcription factor Specificity Protein (SP) 7. Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP7, also called Osterix (Osx) in humans, is highly conserved among bone-forming vertebrates. It plays a major role, along with Runx2 and Dlx5, in driving the differentiation of mesenchymal precursor cells into osteoblasts and eventually osteocytes. SP7 also plays a regulatory role by inhibiting chondrocyte differentiation, maintaining the balance between differentiation of mesenchymal precursor cells into ossified bone or cartilage. Mutations of this gene have been associated with multiple dysfunctional bone phenotypes in vertebrates. SP7 is thought to play a role in diseases such as osteogenesis imperfecta. SP7 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. This model represents the N-terminal domain of SP7.	297
411692	cd22543	SP6-9_N	N-terminal domains of transcription factor Specificity Proteins (SP) 6-9, and similar proteins. Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in the chicken, and 11 SPs in teleost fish), but arthropods only have 3 SPs. SPs belong to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. This model represents the related N-terminal domains of SP6-SP9, and similar proteins.	162
411693	cd22544	SP6_N	N-terminal domain of transcription factor Specificity Protein (SP) 6. Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP6, also known as epiprofin, shows a specific expression pattern in hair follicles and the apical ectodermal ridge (AER) of the developing limbs. SP6 null mice are nude and show defects in skin, teeth, limbs (syndactyly and oligodactyly), and lung alveoli. SP6 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. This model represents the N-terminal domain of SP6.	245
411777	cd22545	SP1-4_N	N-terminal domain of transcription factor Specificity Proteins (SP) 1-4. Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in chicken, and 11 SPs in teleost fish), but arthropods only have 3 SPs. SPs belong to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. SP1-4 have similar N-terminal transactivation domains characterized by glutamine-rich regions, which, in most cases, have adjacent serine/threonine-rich regions. This model represents the N-terminal domain of SP1-4.	82
412097	cd22546	AcrIE2	Anti-CRISPR type I subtype E2. AcrIE2 (also known as AcrE2) is a phage anti-CRISPR (Acr) protein that has been shown to mediate inhibition of the type I-E CRISPR-Cas system of Pseudomonas aeruginosa. AcrIE2 was discovered via a guilt-by-association (GBA) approach, which was based on the strong co-occurrence and clustering of acr and anti-CRISPR associated (aca) genes through proximity and homology searches. AcrIE2 was then confirmed functionally to be a type I-E Acr. These anti-CRISPR gene clusters all contain a conserved putative promoter region at their 5' end and a conserved aca gene at their 3' end. Type I-E and I-F acr genes are located at the same position in the genomes of a large group of related phages, and they are found in a variety of combinations and arrangements. The type I-E CRISPR-Cas system Csy is a crRNA-guided surveillance complex, composed of a crRNA and eleven Cas proteins (one Cse1, two Cse2, one Cas5, six Cas7 and one Cas6e), which recruits a nuclease-helicase protein Cas3 for target degradation. CRISPR-Cas immune systems are used by certain prokaryotes and archaea to resist the invasion of foreign nucleic acids such as phages or plasmids. Anti-CRISPRs are small proteins which are the natural inhibitors for CRISPR-Cas systems; encoded on bacterial and archaeal viruses, they allow the virus to evade host CRISPR-Cas systems. The CRISPR-Cas-mediated adaptive immune response can be divided into three steps: the acquisition of spacers derived from invading nucleic acids, crRNA processing, and target degradation. Theoretically, Acr proteins could suppress any step to disrupt the CRISPR-Cas system. Acr proteins are diverse, with no common sequence or structural motif, and inhibit a wide range of CRISPR-Cas systems through various inhibition mechanisms. CRISPR-Cas systems are divided into two classes (1 and 2) and six types (class 1: types I, III and IV; class 2: types II, V and VI). Class 1 systems utilize RNA-guided complexes consisting of multiple Cas proteins as the effector proteins to recognize and cleave target DNA. Type I CRISPR-Cas systems are the most widespread in nature, and the Cas protein composition of the employed CRISPR ribonucleoprotein (crRNP) complexes differs between seven subtypes (A to F, U). Acr families are named for their type and subtype which are numbered sequentially as they are discovered.	84
411694	cd22547	SP6-9-like_N	N-terminal domain of invertebrate transcription factor Specificity Proteins (SP) similar to SP6, SP8 and SP9. Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP6, also known as epiprofin, shows a specific expression pattern in hair follicles and the apical ectodermal ridge (AER) of the developing limbs. SP6 null mice are nude and show defects in skin, teeth, limbs (syndactyly and oligodactyly), and lung alveoli. SP9 plays a role in limb outgrowth. It is expressed during embryogenesis in the forming AER, restricted regions of the central nervous system, and tail bud. SP8 and SP9 are two closely related transcription factors that mediate FGF10 signaling, which in turn regulates FGF8 expression, which is essential for normal limb development. Both SP8 and SP9 have been found in vertebrates, but only SP8 is present in invertebrates. SPs belong to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. This model represents the N-terminal domain of invertebrate SPs similar to SP6, SP8, and SP9.	219
411695	cd22549	SP9_N	N-terminal domain of transcription factor Specificity Protein (SP) 9 and similar proteins. Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP9 plays a role in limb outgrowth. It is expressed during embryogenesis in the forming apical ectodermal ridge, restricted regions of the central nervous system, and tail bud. SP8 and SP9 are two closely related transcription factors that mediate FGF10 signaling, which in turn regulates FGF8 expression, which is essential for normal limb development. Both SP8 and SP9 have been found in vertebrates, but only SP8 is present in invertebrates. SP9 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. This model represents the N-terminal domain of SP9.	299
412098	cd22551	AcrIE3	Anti-CRISPR type I subtype E3 (AcrIE3). AcrIE3 (also known as AcrE3) is an anti-CRISPR (Acr) protein that was discovered via a guilt-by-association (GBA) approach, which is based on the strong co-occurrence and clustering of acr and anti-CRISPR associated (aca) genes through proximity and homology searches. These anti-CRISPR gene clusters all contain a conserved putative promoter region at their 5' end and a conserved aca gene at their 3' end. Type I-E and I-F acr genes are located at the same position in the genomes of a large group of related phages, and they are found in a variety of combinations and arrangements. Functional assays confirmed AcrIE3 as a Type I-E Acr protein. AcrIE3 associates with the Cascade complex to hinder DNA binding, via an unknown mechanism. The type I-E CRISPR-Cas system Csy is a crRNA-guided surveillance complex, composed of a crRNA and eleven Cas proteins (one Cse1, two Cse2, one Cas5, six Cas7 and one Cas6e), which recruits a nuclease-helicase protein Cas3 for target degradation. CRISPR-Cas immune systems are used by certain prokaryotes and archaea to resist the invasion of foreign nucleic acids such as phages or plasmids. Anti-CRISPRs are small proteins which are the natural inhibitors for CRISPR-Cas systems; encoded on bacterial and archaeal viruses, they allow the virus to evade host CRISPR-Cas systems. The CRISPR-Cas-mediated adaptive immune response can be divided into three steps: the acquisition of spacers derived from invading nucleic acids, crRNA processing, and target degradation. Theoretically, Acr proteins could suppress any step to disrupt the CRISPR-Cas system. Acr proteins are diverse, with no common sequence or structural motif, and inhibit a wide range of CRISPR-Cas systems through various inhibition mechanisms. CRISPR-Cas systems are divided into two classes (1 and 2) and six types (class 1: types I, III and IV; class 2: types II, V and VI). Class 1 systems utilize RNA-guided complexes consisting of multiple Cas proteins as the effector proteins to recognize and cleave target DNA. Type I CRISPR-Cas systems are the most widespread in nature, and the Cas protein composition of the employed CRISPR ribonucleoprotein (crRNP) complexes differs between seven subtypes (A to F, U). Acr families are named for their type and subtype which are numbered sequentially as they are discovered.	68
412099	cd22552	AcrIE4	Anti-CRISPR type I subtype E4. AcrIE4, also known as AcrE4, anti-CRISPR protein 31 or ACR3112-31, is a phage anti-CRISPR (Acr) protein that has been shown to mediate inhibition of the type I-E CRISPR-Cas system of Pseudomonas aeruginosa. AcrIE4 was discovered via a guilt-by-association (GBA) approach, which was based on the strong co-occurrence and clustering of acr and anti-CRISPR associated (aca) genes through proximity and homology searches. AcrIE4 was then confirmed functionally to be a type I-E Acr. These anti-CRISPR gene clusters all contain a conserved putative promoter region at their 5' end and a conserved aca gene at their 3' end. Type I-E and I-F acr genes are located at the same position in the genomes of a large group of related phages, and they are found in a variety of combinations and arrangements. The type I-E Cascade complex is a crRNA-guided surveillance complex, composed of a crRNA and eleven Cas proteins (one Cse1, two Cse2, one Cas5, six Cas7 and one Cas6e), which recruits a nuclease-helicase protein Cas3 for target degradation. CRISPR-Cas immune systems are used by certain prokaryotes and archaea to resist the invasion of foreign nucleic acids such as phages or plasmids. Anti-CRISPRs are small proteins which are the natural inhibitors for CRISPR-Cas systems; encoded on bacterial and archaeal viruses, they allow the virus to evade host CRISPR-Cas systems. The CRISPR-Cas-mediated adaptive immune response can be divided into three steps, including the acquisition of spacer derived from invading nucleic acids, crRNA processing, and target degradation. Theoretically, Acr proteins could suppress any step to disrupt the CRISPR-Cas system. Acr proteins are diverse with no common sequence or structural motif, and they inhibit a wide range of CRISPR-Cas systems with various inhibition mechanisms. CRISPR-Cas systems are divided into two classes (1 and 2) and six types (class 1: types I, III and IV; class 2: types II, V and VI). 
Class 1 systems utilize RNA-guided complexes consisting of multiple Cas proteins as the effector proteins to recognize and cleave target DNA. Type I CRISPR-Cas systems are the most widespread in nature, and the Cas protein composition of the employed CRISPR ribonucleoprotein (crRNP) complexes differs between seven subtypes (A to F, U). Acr families are named for their type and subtype which are numbered sequentially as they are discovered.	52
411778	cd22553	SP1-4_arthropods_N	N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods. Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in the chicken, and 11 SPs in teleost fish), but arthropods only have 3 SPs. One SP is clade SP1-4, which is expressed ubiquitously throughout development. SP1-4 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. This model represents the N-terminal domain of SP1-4 from arthropods.	384
412100	cd22554	Slr4-like	S (surface)-layer proteins similar to Pseudoalteromonas tunicata Slr4. Pseudoalteromonas tunicata D2 Slr4 (also known as EAR28894 protein) is an S-layer protein and the dominant protein within P. tunicata pellicle biofilm components. S-layers are self-assembling, paracrystalline proteinaceous lattices that form an interface between the cell and its extracellular environment; purified P. tunicata Slr4 protein is able to form square (p4 symmetry) paracrystalline lattices. Slr4 may protect cells and biofilm matrix components against stressors such as attack by viruses, bacteria or eukaryotes. The Slr4 family is widely distributed in gammaproteobacteria, including species of Pseudoalteromonas and Vibrio, and is found exclusively in marine metagenomes. It may play an important role in marine microbial physiology and ecology.	400
412101	cd22555	Lpg2603_kinase	Legionella pneumophila Dom/Icm type IV secretion system effector Lpg2603 kinase domain. This model contains the kinase domain of the type IV secretion system effector (T4SS) Lpg2603, an atypical kinase from the bacterial pathogen Legionella pneumophila. Lpg2603 is a remote member of the protein kinase superfamily having structural similarity, but notable differences in primary amino acid sequence. Studies show that Lpg2603 is an active protein kinase that requires the eukaryote-specific host signaling molecule inositol hexakisphosphate (IP6) for activity; IP6 binding rearranges the active site to allow for ATP binding and catalysis. The C-terminal domain of Lpg2603 is a PI4P-binding domain.	291
412102	cd22556	AcrIE5	Anti-CRISPR type I subtype E5. AcrIE5 (also known as AcrE5) is an anti-CRISPR (Acr) protein that was discovered via the guilt-by-association (GBA) approach, which is based on the strong co-occurrence and clustering of acr and anti-CRISPR associated (aca) genes through proximity and homology searches. These anti-CRISPR gene clusters all contain a conserved putative promoter region at their 5' end and a conserved aca gene at their 3' end. Type I-E and I-F acr genes are located at the same position in the genomes of a large group of related phages, and they are found in a variety of combinations and arrangements. The type I-E Cascade complex is a crRNA-guided surveillance complex, composed of a crRNA and eleven Cas proteins (one Cse1, two Cse2, one Cas5, six Cas7 and one Cas6e), which recruits a nuclease-helicase protein Cas3 for target degradation. CRISPR-Cas immune systems are used by certain prokaryotes and archaea to resist the invasion of foreign nucleic acids such as phages or plasmids. Anti-CRISPRs are small proteins which are the natural inhibitors for CRISPR-Cas systems; encoded on bacterial and archaeal viruses, they allow the virus to evade host CRISPR-Cas systems. The CRISPR-Cas-mediated adaptive immune response can be divided into three steps, including the acquisition of spacer derived from invading nucleic acids, crRNA processing, and target degradation. Theoretically, Acr proteins could suppress any step to disrupt the CRISPR-Cas system. Acr proteins are diverse with no common sequence or structural motif, and they inhibit a wide range of CRISPR-Cas systems with various inhibition mechanisms. CRISPR-Cas systems are divided into two classes (1 and 2) and six types (class 1: types I, III and IV; class 2: types II, V and VI). Class 1 systems utilize RNA-guided complexes consisting of multiple Cas proteins as the effector proteins to recognize and cleave target DNA. 
Type I CRISPR-Cas systems are the most widespread in nature, and the Cas protein composition of the employed CRISPR ribonucleoprotein (crRNP) complexes differs between seven subtypes (A to F, U). Acr families are named for their type and subtype which are numbered sequentially as they are discovered.	65
412103	cd22557	AcrIF4	Anti-CRISPR type I subtype F4. AcrIF4 is a phage anti-CRISPR (Acr) protein that has been shown to associate with the type I-F Cascade surveillance complex (type I-F Csy complex) to inhibit DNA binding. AcrIF4 binds the Csy complex with affinities that are orders of magnitude weaker than Acr proteins like AcrIF1 and AcrIF2. The type I-F Csy complex is a crRNA-guided surveillance complex composed of a crRNA and nine Cas proteins (one Cas8f, one Cas5f, one Cas6f, and six Cas7f), which recruits a nuclease-helicase protein Cas3 for target degradation. CRISPR-Cas immune systems are used by certain prokaryotes and archaea to resist the invasion of foreign nucleic acids such as phages or plasmids. Anti-CRISPRs are small proteins which are the natural inhibitors for CRISPR-Cas systems; encoded on bacterial and archaeal viruses, they allow the virus to evade host CRISPR-Cas systems. The CRISPR-Cas-mediated adaptive immune response can be divided into three steps, including the acquisition of spacer derived from invading nucleic acids, crRNA processing, and target degradation. Theoretically, Acr proteins could suppress any step to disrupt the CRISPR-Cas system. Acr proteins are diverse with no common sequence or structural motif, which inhibit a wide range of CRISPR-Cas systems using various inhibition mechanisms. Weak and strong Acr-phages often cooperate to overcome CRISPR resistance, with a first phage blocking the host CRISPR-Cas immune system to allow a second Acr-phage to successfully replicate which leads to epidemiological tipping points. CRISPR-Cas systems are divided into two classes (1 and 2) and six types (class 1: types I, III and IV; class 2: types II, V and VI). Class 1 systems utilize RNA-guided complexes consisting of multiple Cas proteins as the effector proteins to recognize and cleave target DNA. 
Type I CRISPR-Cas systems are the most widespread in nature, and the Cas protein composition of the employed CRISPR ribonucleoprotein (crRNP) complexes differs between seven subtypes (A to F, U). Acr families are named for their type and subtype which are numbered sequentially as they are discovered.	95
411697	cd22558	RETR_RHD	N-terminal reticulon-homology domain of Reticulophagy regulators and similar proteins. This subfamily includes Reticulophagy regulators 1-3. Reticulophagy regulator 1 (RETREG1/FAM134B) is an endoplasmic reticulum (ER)-anchored autophagy receptor that regulates the size and shape of the ER. It regulates turnover of the ER by selective phagocytosis, mediating ER delivery into lysosomes through sequestration into autophagosomes. It promotes membrane remodeling and ER scission through its membrane bending activity, and targets the fragments into autophagosomes by interacting with ATG8 family modifier proteins such as MAP1LC3A, MAP1LC3B, GABARAP, GABARAPL1 and GABARAPL2. RETREG2/FAM134A and RETREG3/FAM134C have been shown to interact with ATG8 family modifier proteins MAP1LC3A, MAP1LC3B, GABARAP, and GABARAPL1. Members of this subfamily contain an N-terminal reticulon-homology domain (RHD) that shows sequence similarity to ADP-ribosylation factor-like 6 binding factor 1 (Arl6IP1 or Arl6ip-1), an ER protein that has an important role in cell conduction and material transport. The RHD may function in inducing membrane curvature.	192
411698	cd22559	Arl6IP1	ADP-ribosylation factor-like protein 6-interacting protein 1. ADP-ribosylation factor-like 6 binding factor 1 (Arl6IP1 or Arl6ip-1), also called apoptotic regulator in the membrane of the endoplasmic reticulum (ARMER), is an endoplasmic reticulum (ER) protein that has an important role in cell conduction and material transport. Arl6IP1, a tetraspan membrane protein, is an anti-apoptotic protein specific to multicellular organisms, and is a potential player in shaping the ER tubules in mammalian cells. In neurons, Arl6IP1 has been associated with the regulation of glutamate, a major excitatory neurotransmitter in excitatory synapses. In Drosophila, knockdown of the Arl6IP1 gene leads to progressive motor deficit. An Arl6IP1 variant has also been associated with hereditary spastic paraplegia (HSP), motor and sensory polyneuropathy, and acromutilation. Arl6IP1 shows some sequence similarity to the reticulon-homology domain (RHD) of reticulophagy regulators, which may function in inducing membrane curvature.	167
411699	cd22560	RETR1_RHD	N-terminal reticulon-homology domain of Reticulophagy regulator 1. Reticulophagy regulator 1 (RETR1 or RETREG1), also called reticulophagy receptor 1 or FAM134B (family with sequence similarity 134, member B), is an endoplasmic reticulum (ER)-anchored autophagy receptor that regulates the size and shape of the ER. It regulates turnover of the ER by selective phagocytosis, mediating ER delivery into lysosomes through sequestration into autophagosomes. It promotes membrane remodeling and ER scission through its membrane bending activity, and targets the fragments into autophagosomes by interacting with ATG8 family modifier proteins such as MAP1LC3A, MAP1LC3B, GABARAP, GABARAPL1 and GABARAPL2. Loss of function of FAM134B is associated with diseases and cancer, including hereditary sensory and autonomic neuropathy type IIB (HSAN IIB), colorectal adenocarcinoma, and oesophageal squamous cell carcinoma, and other progressive neuronal degenerative diseases. FAM134B is also implicated in the suppression of viral replication during Ebola, Dengue, Zika, and West Nile viral infections. RETREG1/FAM134B contains an N-terminal reticulon-homology domain (RHD) that shows sequence similarity to ADP-ribosylation factor-like 6 binding factor 1 (Arl6IP1 or Arl6ip-1), an ER protein that has an important role in cell conduction and material transport. The RHD may function in inducing membrane curvature.	198
411700	cd22561	RETR2_RHD	N-terminal reticulon-homology domain of Reticulophagy regulator 2. Reticulophagy regulator 2 (RETR2 or RETREG2), also called FAM134A (family with sequence similarity 134, member A), C2orf17, or MAG2, interacts with ATG8 family modifier proteins MAP1LC3A, MAP1LC3B, GABARAP, and GABARAPL1. RETREG2/FAM134A contains an N-terminal reticulon-homology domain (RHD) that shows sequence similarity to ADP-ribosylation factor-like 6 binding factor 1 (Arl6IP1 or Arl6ip-1), an endoplasmic reticulum protein that has an important role in cell conduction and material transport. The RHD may function in inducing membrane curvature.	199
411701	cd22562	RETR3_RHD	N-terminal reticulon-homology domain of Reticulophagy regulator 3. Reticulophagy regulator 3 (RETR3 or RETREG3), also called FAM134C (family with sequence similarity 134, member C), mediates NRF1-enhanced neurite outgrowth. It interacts with ATG8 family modifier proteins MAP1LC3A, MAP1LC3B, GABARAP, and GABARAPL1. RETREG3/FAM134C contains an N-terminal reticulon-homology domain (RHD) that shows sequence similarity to ADP-ribosylation factor-like 6 binding factor 1 (Arl6IP1 or Arl6ip-1), an endoplasmic reticulum protein that has an important role in cell conduction and material transport. The RHD may function in inducing membrane curvature.	192
412104	cd22563	CclA	carnocyclin A. Carnocyclin A (CclA) is a potent ribosomally synthesized antimicrobial peptide, originally found in Carnobacterium maltaromaticum UAL307, that displays a broad spectrum of activity against numerous Gram-positive organisms. An amide bond links the N and C termini of this circular bacteriocin, giving it stability and structural integrity. CclA interacts with lipid bilayers in a voltage-dependent manner and forms anion selective pores that preferentially bind halide anions. The ABC transporter CclEFGH facilitates the production of CclA.	51
412105	cd22564	AcrIF5	Anti-CRISPR type I subtype F5. AcrIF5, also known as AcrF5, is a phage anti-CRISPR (Acr) protein that has been shown to mediate inhibition of the type I-F CRISPR-Cas system of Pseudomonas aeruginosa. AcrIF5 is a weak anti-CRISPR and its gene always co-occurs with the AcrIF3 gene; however, the AcrIF3 gene often occurs in the absence of the AcrIF5 gene. The type I-F Csy complex is a crRNA-guided surveillance complex composed of a crRNA and nine Cas proteins (one Cas8f, one Cas5f, one Cas6f, and six Cas7f), which recruits a nuclease-helicase protein Cas3 for target degradation. CRISPR-Cas immune systems are used by certain prokaryotes and archaea to resist the invasion of foreign nucleic acids such as phages or plasmids. Anti-CRISPRs are small proteins which are the natural inhibitors for CRISPR-Cas systems; encoded on bacterial and archaeal viruses, they allow the virus to evade host CRISPR-Cas systems. The CRISPR-Cas-mediated adaptive immune response can be divided into three steps, including the acquisition of spacer derived from invading nucleic acids, crRNA processing, and target degradation. Theoretically, Acr proteins could suppress any step to disrupt the CRISPR-Cas system. Acr proteins are diverse with no common sequence or structural motif, which inhibit a wide range of CRISPR-Cas systems via various inhibition mechanisms. Weak and strong Acr-phages often cooperate to overcome CRISPR resistance, with a first phage blocking the host CRISPR-Cas immune system to allow a second Acr-phage to successfully replicate which leads to epidemiological tipping points. CRISPR-Cas systems are divided into two classes (1 and 2) and six types (class 1: types I, III and IV; class 2: types II, V and VI). Class 1 systems utilize RNA-guided complexes consisting of multiple Cas proteins as the effector proteins to recognize and cleave target DNA. 
Type I CRISPR-Cas systems are the most widespread in nature, and the Cas protein composition of the employed CRISPR ribonucleoprotein (crRNP) complexes differs between seven subtypes (A to F, U). Acr families are named for their type and subtype which are numbered sequentially as they are discovered.	79
412106	cd22565	AcrIF7	Anti-CRISPR type I subtype F7. AcrIF7 (also known as AcrF7) is an anti-CRISPR (Acr) protein that was discovered via the guilt-by association (GBA) approach, which is based on the strong co-occurrence and clustering of acr and anti-CRISPR associated (aca) genes through proximity and homology searches. Functional assays show that AcrIF7 in Pseudomonas aeruginosa prophages strongly inhibits the type I-F CRISPR-Cas system. It has been classified as a broad-range type I-F Acr, as it is able to block the type I-F system of both P. aeruginosa and Pectobacterium atrosepticum. AcrIF7 targets the Cas8f subunit of the Csy complex and may compete for the same binding interface with AcrIF2. Extensive mutagenic analyses revealed that AcrIF7 associated with the highly conserved dsDNA binding site of Cas8f, primarily via electrostatic interactions. The type I-F Csy complex is a crRNA-guided surveillance complex composed of a crRNA and nine Cas proteins (one Cas8f, one Cas5f, one Cas6f, and six Cas7f), which recruits a nuclease-helicase protein Cas3 for target degradation. CRISPR-Cas immune systems are used by certain prokaryotes and archaea to resist the invasion of foreign nucleic acids such as phages or plasmids. Anti-CRISPRs are small proteins which are the natural inhibitors for CRISPR-Cas systems; encoded on bacterial and archaeal viruses, they allow the virus to evade host CRISPR-Cas systems. The CRISPR-Cas-mediated adaptive immune response can be divided into three steps, including the acquisition of spacer derived from invading nucleic acids, crRNA processing, and target degradation. Theoretically, Acr proteins could suppress any step to disrupt the CRISPR-Cas system. Acr proteins are diverse with no common sequence or structural motif, and they inhibit a wide range of CRISPR-Cas systems with various inhibition mechanisms. 
CRISPR-Cas systems are divided into two classes (1 and 2) and six types (class 1: types I, III and IV; class 2: types II, V and VI). Class 1 systems utilize RNA-guided complexes consisting of multiple Cas proteins as the effector proteins to recognize and cleave target DNA. Type I CRISPR-Cas systems are the most widespread in nature, and the Cas protein composition of the employed CRISPR ribonucleoprotein (crRNP) complexes differs between seven subtypes (A to F, U). Acr families are named for their type and subtype which are numbered sequentially as they are discovered.	69
412107	cd22566	MrpH-like	Mannose-resistant Proteus-like fimbriae (MR/P) tip adhesin (MrpH) and similar proteins. This model contains mannose-resistant Proteus-like fimbriae (MR/P) tip adhesin (MrpH) found in Proteus mirabilis, a Gram-negative uropathogen and a major causative agent in catheter-associated urinary tract infections. MrpH is required for MR/P-dependent adherence to surfaces. While MR/P belongs to a well-known class of adhesive fimbriae encoded by the chaperone-usher pathway, MrpH has a markedly different structure compared with other tip-located adhesins in this family. It is a novel class of metal-binding adhesin that requires zinc to mediate biofilm formation.	131
412108	cl00011	PLAT	N/A. This is a family of plant seed-specific proteins identified in Arabidopsis thaliana (Mouse-ear cress). ATS3 (Arabidopsis thaliana seed gene 3) is expressed in a pattern similar to the Arabidopsis seed storage protein genes.	0
412109	cl00012	alpha_CA	N/A. Carbonic anhydrase alpha, isozyme IX. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionarily distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Alpha CAs are strictly monomeric enzymes. The zinc ion is complexed by three histidine residues. This sub-family comprises the membrane protein CA IX. CA IX is functionally implicated in tumor growth and survival. CA IX is mainly present in solid tumors and its expression in normal tissues is limited to the mucosa of the alimentary tract. CA IX is a transmembrane protein with two extracellular domains: a carbonic anhydrase domain and a proteoglycan-like segment mediating cell-cell adhesion. There is evidence for an involvement of the MAPK pathway in the regulation of CA9 expression.	0
412110	cl00013	Lyase_I_like	Lyase class I_like superfamily: contains the lyase class I family, histidine ammonia-lyase and phenylalanine ammonia-lyase, which catalyze similar beta-elimination reactions. This domain is found at the C-terminus of argininosuccinate lyase.	0
412111	cl00014	SORL	N/A. This domain is found in the sulfur oxidation protein SoxY. It is closely related to the Desulfoferrodoxin family pfam01880. Dissimilatory oxidation of thiosulfate is carried out by the ubiquitous sulfur-oxidizing (Sox) multi-enzyme system. In this system, SoxY plays a key role, functioning as the sulfur substrate-binding protein that offers its sulfur substrate, which is covalently bound to a conserved C-terminal cysteine, to another oxidizing Sox enzyme. The structure of this domain shows an Ig-like fold.	0
412112	cl00015	nt_trans	nucleotidyl transferase superfamily. This domain is found as the C-terminal portion of some HIGH_NTase1 proteins. The exact function is not known.	0
412113	cl00016	Cyt_c_Oxidase_Vb	N/A. cytochrome c oxidase subunit Vb	0
412114	cl00017	Cyt_c_Oxidase_VIa	N/A. cytochrome c oxidase subunit VI protein	0
350864	cl00018	DSRD	N/A. Most members of this family are small (approximately 36 amino acids) proteins that form homodimeric complexes. Each subunit contains a high-spin iron atom tetrahedrally bound to four cysteinyl sulphur atoms. This family has a similar fold to the rubredoxin metal binding domain. It is also found as the N-terminal domain of desulfoferrodoxin (see pfam01880).	0
412115	cl00019	Macro_SF	macrodomain superfamily. This domain is an ADP-ribose binding module. It is found in a number of yeast proteins.	0
412116	cl00020	GAT_1	Type 1 glutamine amidotransferase (GATase1)-like domain. This family captures members that are not found in pfam00310, pfam07685 and pfam13230.	0
412117	cl00021	PTS_IIB_man	N/A. Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Man family is unique in several respects among PTS permease families. It is the only PTS family in which members possess a IID protein. It is the only PTS family in which the IIB constituent is phosphorylated on a histidyl rather than a cysteyl residue. Its permease members exhibit broad specificity for a range of sugars, rather than being specific for just one or a few sugars. The mannose permease of E. coli, for example, can transport and phosphorylate glucose, mannose, fructose, glucosamine, N-acetylglucosamine, and other sugars. Other members of this family can transport sorbose, fructose and N-acetylglucosamine. This family is specific for the IIB components of this family of PTS transporters. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS]	0
412118	cl00022	YbaK_like	N/A. This domain is found either on its own or in association with the tRNA synthetase class II core domain (pfam00587). It is involved in the tRNA editing of mis-charged tRNAs including Cys-tRNA(Pro), Cys-tRNA(Cys), Ala-tRNA(Pro). The structure of this domain shows a novel fold.	0
412119	cl00025	PTS_IIA_man	N/A. phosphotransferase mannose-specific family component IIA; Provisional	0
412120	cl00030	CH_SF	calponin homology (CH) domain superfamily. This group is composed of the molecule interacting with CasL protein (MICAL) and EH domain-binding protein (EHBP) families. MICAL is a large, multidomain, cytosolic protein with a single LIM domain, a calponin homology (CH) domain and a flavoprotein monooxygenase (MO) domain. In Drosophila, MICAL is expressed in axons, interacts with the neuronal A (PlexA) receptor and is required for Semaphorin 1a (Sema-1a)-PlexA-mediated repulsive axon guidance. The LIM and CH domains mediate interactions with the cytoskeleton, cytoskeletal adaptor proteins, and other signaling proteins. The flavoprotein MO is required for semaphorin-plexin repulsive axon guidance during axonal pathfinding in the Drosophila neuromuscular system. The EHBP family includes EHBP1 and EHBP1-like protein (EHBP1L1). EHBP1 is a regulator of endocytic recycling and may play a role in actin reorganization by linking clathrin-mediated endocytosis to the actin cytoskeleton. It may act as an effector of small GTPases, including RAB-10 (Rab10), and play a role in vesicle trafficking. EHBP proteins contain a single CH domain. CH domains are actin filament (F-actin) binding motifs.	0
412121	cl00031	ALBUMIN	N/A. Albumin domain, contains five or six internal disulphide bonds; albuminoid superfamily includes alpha-fetoprotein which binds various cations, fatty acids and bilirubin; vitamin D-binding protein which binds to vitamin D, its metabolites, and fatty acids; alpha-albumin which binds water, cations (such as Ca2+, Na+ and K+), fatty acids, hormones, bilirubin and drugs; and afamin of which little is known; these belong to a multigene family with highly conserved intron/exon organization and encoded protein structures; evolutionary comparisons strongly support vitamin D-binding protein as the original gene in this group with subsequent local duplications generating the remaining genes in the cluster	0
412122	cl00032	ANATO	N/A. C3a, C4a and C5a anaphylatoxins are protein fragments generated enzymatically in serum during activation of complement molecules C3, C4, and C5. They induce smooth muscle contraction. These fragments are homologous to a three-fold repeat in fibulins.	0
412123	cl00033	AP2	N/A. This 60 amino acid residue domain can bind to DNA and is found in transcription factor proteins.	0
412124	cl00034	Bbox_SF	B-box-type zinc finger superfamily. This group is composed of uncharacterized proteins containing a zinc finger B-box domain and a DUF2009 domain, and similar zinc finger B-box domain-containing proteins. The B-box motif shows high sequence similarity with B-Box-type 1 zinc finger found in tripartite motif-containing proteins (TRIMs). The type 1 B-box (Bbox1) zinc finger is characterized by a C6H2 zinc-binding consensus motif.	0
412125	cl00035	BIR	N/A. BIR stands for 'Baculovirus Inhibitor of apoptosis protein Repeat'. It is found repeated in inhibitor of apoptosis proteins (IAPs), and in fact it is also known as IAP repeat. These domains characteristically have a number of invariant residues, including 3 conserved cysteines and one conserved histidine that coordinate a zinc ion. They are usually made up of 4-5 alpha helices and a three-stranded beta-sheet. BIR is also found in other proteins known as BIR-domain-containing proteins (BIRPs), such as Survivin.	0
412126	cl00038	BRCT	C-terminal domain of the breast cancer suppressor protein (BRCA1) and related domains. This is the fifth BRCT domain of regulator of Ty1 transposition protein 107 (RTT107). It is involved in binding phosphorylated histone H2A.	0
412127	cl00040	C1	protein kinase C conserved region 1 (C1 domain) superfamily. PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their regulatory domain. aPKCs only require phosphatidylserine (PS) for activation. They contain a C2-like region, instead of a calcium-binding (C2) region found in classical PKCs, in their regulatory domain. There are two aPKC isoforms, zeta and iota. aPKCs are involved in many cellular functions including proliferation, migration, apoptosis, polarity maintenance and cytoskeletal regulation. They also play a critical role in the regulation of glucose metabolism and in the pathogenesis of type 2 diabetes. PKC-zeta plays a critical role in activating the glucose transport response. It is activated by glucose, insulin, and exercise through diverse pathways. PKC-zeta also plays a central role in maintaining cell polarity in yeast and mammalian cells. In addition, it affects actin remodeling in muscle cells. Members of this family contain C1 domain found in aPKC isoform zeta. The C1 domain is a cysteine-rich zinc binding domain that does not bind DNA nor possess structural similarity to conventional zinc finger domains; it contains two separate Zn(2+)-binding sites.	0
412128	cl00042	CASc	N/A. Members of this family are asparaginyl peptidases. The blood fluke parasite Schistosoma mansoni has at least five Clan CA cysteine peptidases in its digestive tract including cathepsins B (2 isoforms), C, F and L. All have been recombinantly expressed as active enzymes, albeit in various stages of activation. In addition, a Clan CD peptidase, termed asparaginyl endopeptidase or 'legumain' has been identified. This has formerly been characterized as a 'haemoglobinase', but this term is probably incorrect. Two cDNAs have been described for Schistosoma mansoni legumain; one encodes an active enzyme whereas the active site cysteine residue encoded by the second cDNA is substituted by an asparagine residue. Both forms have been recombinantly expressed.	0
412129	cl00046	ChtBD3	Chitin/cellulose binding domains of chitinase and related enzymes. This short domain is found in many different glycosyl hydrolase enzymes and is presumed to have a carbohydrate binding function. The domain has six aromatic groups that may be important for binding.	0
412130	cl00047	CAP_ED	N/A. Catabolite gene activator protein (CAP) is a prokaryotic homologue of eukaryotic cNMP-binding domains, present in ion channels, and cNMP-dependent kinases.	0
412131	cl00049	CUB	N/A. This is a family of hypothetical C. elegans proteins. The aligned region has no known function nor do any of the proteins which possess it. However, this domain is related to the CUB domain.	0
412132	cl00051	CysPc	N/A. Calpain-like thiol protease family (peptidase family C2). Calcium activated neutral protease (large subunit).	0
412133	cl00054	DSRM_SF	double-stranded RNA binding motif (DSRM) superfamily. A C-terminal domain in human dead end protein 1 (DND1_HUMAN) homologous to double strand RNA binding domains (PF00035, PF00333)	0
412134	cl00055	MH1	N-terminal Mad Homology 1 (MH1) domain. The MH1 (MAD homology 1) domain is found at the amino terminus of MAD related proteins such as Smads. This domain is separated from the MH2 domain by a non-conserved linker region. The crystal structure of the MH1 domain shows that a highly conserved 11 residue beta hairpin is used to bind the DNA consensus sequence GNCN in the major groove, shown to be vital for the transcriptional activation of target genes. Not all examples of MH1 can bind to DNA however. Smad2 cannot bind DNA and has a large insertion within the hairpin that presumably abolishes DNA binding. A basic helix (H2) in MH1 with the nuclear localization signal KKLKK has been shown to be essential for Smad3 nuclear import. Smads also use the MH1 domain to interact with transcription factors such as Jun, TFE3, Sp1, and Runx.	0
412135	cl00056	MH2	C-terminal Mad Homology 2 (MH2) domain. This is the MH2 (MAD homology 2) domain found at the carboxy terminus of MAD related proteins such as Smads. This domain is separated from the MH1 domain by a non-conserved linker region. The MH2 domain mediates interaction with a wide variety of proteins and provides specificity and selectivity to Smad function and also is critical for mediating interactions in Smad oligomers. Unlike MH1, MH2 does not bind DNA. The well-studied MH2 domain of Smad4 is composed of five alpha helices and three loops enclosing a beta sandwich. Smads are involved in the propagation of TGF-beta signals by direct association with the TGF-beta receptor kinase which phosphorylates the last two Ser of a conserved 'SSXS' motif located at the C-terminus of MH2.	0
412136	cl00057	vWFA	N/A. This is an uncharacterized domain found in eukaryotes and viruses.	0
412137	cl00060	FGF	N/A. Fibroblast growth factors are a family of proteins involved in growth and differentiation in a wide range of contexts. They are found in a wide range of organisms, from nematodes to humans. Most share an internal core region of high similarity, conserved residues in which are involved in binding with their receptors. On binding, they cause dimerization of their tyrosine kinase receptors leading to intracellular signalling. There are currently four known tyrosine kinase receptors for fibroblast growth factors. These receptors can each bind several different members of this family. Members of this family have a beta trefoil structure. Most have N-terminal signal peptides and are secreted. A few lack signal sequences but are secreted anyway; still others also lack the signal peptide but are found on the cell surface and within the extracellular matrix. A third group remain intracellular. They have central roles in development, regulating cell proliferation, migration and differentiation. On the other hand, they are important in tissue repair following injury in adult organisms.	0
412138	cl00061	FH_FOX	Forkhead (FH) domain found in Forkhead box (FOX) family of transcription factors and similar proteins. FOXP4, also called Forkhead-related protein-like A, is a transcriptional repressor that represses lung-specific expression. It is not required for T cell development, but is necessary for normal T cell cytokine recall responses to antigen following pathogenic infection. The FH domain is a winged helix DNA-binding domain. FOX transcription factors recognize the core sequence 5'-(A/C)AA(C/T)A-3'.	0
412139	cl00062	FHA	N/A. Yop-YscD-cpl is the cytoplasmic domain of Yop proteins like YscD from Proteobacteria. YscD forms part of the inner membrane component of the bacterial type III secretion injectosome apparatus.	0
412140	cl00063	FN1	N/A. One of three types of internal repeat within the plasma protein, fibronectin. Found also in coagulation factor XII, HGF activator and tissue-type plasminogen activator. In t-PA and fibronectin, this domain type contributes to fibrin-binding.	0
412141	cl00064	ZnMc	N/A. This is a family of uncharacterized proteins that carry the highly characteristic met-zincin motif HExxHxxGxxH, the extended zinc-binding domain of metallopeptidases.	0
350886	cl00066	FU	N/A. Furin-like repeats. Cysteine rich region. Exact function of the domain is not known. Furin is a serine-kinase dependent proprotein processor. Other members of this family include endoproteases and cell surface receptors.	0
412142	cl00068	GAL4	N/A. Gal4 is a positive regulator for the gene expression of the galactose-induced genes of S. cerevisiae. It is present only in fungi.	0
412143	cl00069	GGL	N/A. G-protein gamma like domains (GGL) are found in the gamma subunit of the heterotrimeric G protein complex and in regulators of G protein signaling (RGS) proteins. It is also found fused to an inactive Galpha in the Dictyostelium protein gbqA. G-gamma likely shares a common origin with the helical N-terminal unit of G-beta. All organisms that possess a G-beta possess a G-gamma.	0
412144	cl00071	GLECT	N/A. This family contains galactoside binding lectins. The family also includes enzymes such as human eosinophil lysophospholipase (EC:3.1.1.5).	0
412145	cl00072	GYF	N/A. The GYF domain is named because of the presence of Gly-Tyr-Phe residues. The GYF domain is a proline-binding domain in CD2-binding protein.	0
412146	cl00073	H15	N/A. Linker histone H1 is an essential component of chromatin structure. H1 links nucleosomes into higher order structures. Histone H1 is replaced by histone H5 in some cell types.	0
412147	cl00075	HATPase	Histidine kinase-like ATPase domain. This family represents the structurally related ATPase domains of histidine kinase, DNA gyrase B and HSP90.	0
412148	cl00081	bHLH_SF	basic Helix Loop Helix (bHLH) domain superfamily. DEC2, also termed Class E basic helix-loop-helix protein 41 (bHLHe41), or Class B basic helix-loop-helix protein 3 (bHLHb3), or enhancer-of-split and hairy-related protein 1 (SHARP-1), is a bHLH-O transcriptional repressor involved in the regulation of the circadian rhythm by negatively regulating the activity of the clock genes and clock-controlled genes.	0
412149	cl00082	HMG-box	N/A. This short 71 residue domain is an HMG-box domain. HMG-box domains mediate re-modelling of chromatin-structure. Mammalian HMG-box proteins are of two types: those that are non-sequence-specific DNA-binding proteins with two HMG-box domains and a long highly acidic C-tail; and a diverse group of sequence-specific transcription factor-proteins with either a single HMG-box or up to six copies, and no acidic C-tail.	0
412150	cl00083	HNHc	N/A. WHH is a predicted nuclease of the HNH/ENDO VII superfamily of the treble clef fold. The name is derived from the conserved motif WHH. It is found in bacterial polymorphic toxin systems and functions as a toxin module. WHH is the shortest version of HNH nuclease families. Like AHH and LHH, the WHH nuclease contains 4 conserved histidines of which the first one is predicted to bind a metal-ion and other three ones are involved in activation of water molecule for hydrolysis.	0
412151	cl00084	homeodomain	N/A. This is a homeobox transcription factor KN domain conserved from fungi to human and plants. They were first identified as TALE homeobox genes in eukaryotes, (including KNOX and MEIS genes). They have been recently classified.	0
412152	cl00085	FReD	N/A. Domain present at the C-termini of fibrinogen beta and gamma chains, and a variety of fibrinogen-related proteins, including tenascin and Drosophila scabrous.	0
412153	cl00086	HPT	N/A. The histidine-containing phosphotransfer (HPt) domain is a novel protein module with an active histidine residue that mediates phosphotransfer reactions in the two-component signaling systems. A multistep phosphorelay involving the HPt domain has been suggested for these signaling pathways. The crystal structure of the HPt domain of the anaerobic sensor kinase ArcB has been determined. The domain consists of six alpha helices containing a four-helix bundle-folding. The pattern of sequence similarity of the HPt domains of ArcB and components in other signaling systems can be interpreted in light of the three-dimensional structure and supports the conclusion that the HPt domains have a common structural motif both in prokaryotes and eukaryotes. In S. cerevisiae ypd1p this domain has been shown to contain a binding surface for Ssk1p (response regulator receiver domain containing protein pfam00072).	0
412154	cl00087	HR1	Protein kinase C-related kinase homology region 1 (HR1) domain that binds Rho family small GTPases. The HR1 repeat was first described as a three times repeated homology region of the N-terminal non-catalytic part of protein kinase PRK1(PKN). The first two of these repeats were later shown to bind the small G protein rho known to activate PKN in its GTP-bound form. Similar rho-binding domains also occur in a number of other protein kinases and in the rho-binding proteins rhophilin and rhotekin. Recently, the structure of the N-terminal HR1 repeat complexed with RhoA has been determined by X-ray crystallography. It forms an antiparallel coiled-coil fold termed an ACC finger.	0
412155	cl00089	NUC	N/A. A family of bacterial and eukaryotic endonucleases share the following characteristics: they act on both DNA and RNA, cleave double-stranded and single-stranded nucleic acids and require a divalent ion such as magnesium for their activity. A histidine has been shown to be essential for the activity of the Serratia marcescens nuclease. This residue is located in a conserved region which also contains an aspartic acid residue that could be implicated in the binding of the divalent ion.	0
412156	cl00092	IFab	N/A. Interferons produce antiviral and antiproliferative responses in cells. They are classified into five groups, all of them related but gamma-interferon.	0
412157	cl00094	IL1	N/A. This family includes interleukin-1 and interleukin-18.	0
412158	cl00096	IRF	N/A. This family of transcription factors are important in the regulation of interferons in response to infection by virus and in the regulation of interferon-inducible genes. Three of the five conserved tryptophan residues bind to DNA.	0
412159	cl00097	KAZAL_FS	N/A. Usually indicative of serine protease inhibitors. However, kazal-like domains are also seen in the extracellular part of agrins, which are not known to be protease inhibitors. Kazal domains often occur in tandem arrays. Small alpha+beta fold containing three disulphides.	0
412160	cl00098	KH-I	K homology (KH) RNA-binding domain, type I. Rrp40, also called exosome component 3 (EXOSC3), or ribosomal RNA-processing protein 40, is a non-catalytic component of the RNA exosome complex which has 3'-->5' exoribonuclease activity and participates in a multitude of cellular RNA processing and degradation events. Mutations of EXOSC3 gene are associated with neurological diseases. Members in this subfamily contain a divergent KH domain that lacks the RNA-binding GXXG motif.	0
412161	cl00100	KR	N/A. Kringle domains have been found in plasminogen, hepatocyte growth factors, prothrombin, and apolipoprotein A. Structure is disulfide-rich, nearly all-beta.	0
412162	cl00101	KU	N/A. Indicative of a protease inhibitor, usually a serine protease inhibitor. Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Certain family members are similar to the tick anticoagulant peptide (TAP). This is a highly selective inhibitor of factor Xa in the blood coagulation pathways. TAP molecules are highly dipolar, and are arranged to form a twisted two-stranded antiparallel beta-sheet followed by an alpha helix.	0
412163	cl00103	Trefoil	N/A. Proposed role in renewal and pathology of mucous epithelia.	0
412164	cl00104	LDLa	N/A. Cysteine-rich repeat in the low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. The N-terminal type A repeats in LDL receptor bind the lipoproteins. Other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement. Mutations in the LDL receptor gene cause familial hypercholesterolemia.	0
412165	cl00105	LMWP	Low molecular weight phosphatase family. Arsenate reductase plays an important role in the reduction of intracellular arsenate to arsenite, an important step in arsenic detoxification. The reduction involves three different thiolate nucleophiles. In arsenate reductases of the LMWP family, reduction can be coupled with thioredoxin (Trx)/thioredoxin reductase (TrxR) or glutathione (GSH)/glutaredoxin (Grx).	0
412166	cl00109	MADS	N/A. SRF-like/Type I subfamily of MADS (MCM1, Agamous, Deficiens, and SRF (serum response factor)) box family of eukaryotic transcriptional regulators. Binds DNA and exists as hetero- and homo-dimers. Differs from the MEF-like/Type II subgroup mainly in position of the alpha 2 helix responsible for the dimerization interface. Important in homeotic regulation in plants and in immediate-early development in animals. Also found in fungi.	0
412167	cl00110	MBD	N/A. MBDa is a second MBD domain of Methyl-CpG-binding domain proteins. Region implicated in binding the RbAp46/48 (retinoblastoma protein-associated protein) homolog p55, which is one of the components of the MBD2-NuRD complex. The MBD2-NuRD complex is a nucleosome remodelling and deacetylation complex.	0
412168	cl00111	PAH	N/A. Pancreatic hormone is a regulator of pancreatic and gastrointestinal functions.	0
412169	cl00112	PAN_APPLE	N/A. The PAN domain contains a conserved core of three disulphide bridges. In some members of the family there is an additional fourth disulphide bridge that links the N and C termini of the domain. The domain is found in diverse proteins, in some they mediate protein-protein interactions, in others they mediate protein-carbohydrate interactions.	0
412170	cl00113	CRIB	N/A. Small domains that bind Cdc42p- and/or Rho-like small GTPases. Also known as the Cdc42/Rac interactive binding (CRIB).	0
412171	cl00116	PDGF	N/A. Platelet-derived growth factor is a potent activator for cells of mesenchymal origin. PDGF-A and PDGF-B form AA and BB homodimers and an AB heterodimer. Members of the VEGF family are homologues of PDGF.	0
412172	cl00117	PDZ	N/A. This domain is the PDZ domain of tricorn protease.	0
412173	cl00120	PP2Cc	N/A. Protein phosphatase 2C is a Mn++ or Mg++ dependent protein serine/threonine phosphatase.	0
412174	cl00123	PROF	N/A. Binds actin monomers, membrane polyphosphoinositides and poly-L-proline.	0
412175	cl00125	RHOD	N/A. Rhodanese has an internal duplication. This Pfam represents a single copy of this duplicated domain. The domain is found as a single copy in other proteins, including phosphatases and ubiquitin C-terminal hydrolases.	0
412176	cl00128	RNase_A	N/A. Ribonucleases. Members include pancreatic RNAase A and angiogenins. Structure is an alpha+beta fold -- long curved beta sheet and three helices.	0
412177	cl00130	PseudoU_synth	Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi). TruD is responsible for synthesis of pseudouridine from uracil-13 in transfer RNAs. The structure of TruD reveals an overall V-shaped molecule which contains an RNA-binding cleft.	0
412178	cl00133	CAP	CAP (cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins) domain family. This is a large family of cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins (CAP) that are found in a wide range of organisms, including prokaryotes and non-vertebrate eukaryotes. The nine subfamilies of the mammalian CAP 'super'family include: the human glioma pathogenesis-related 1 (GLIPR1), Golgi associated pathogenesis related-1 (GAPR1) proteins, peptidase inhibitor 15 (PI15), peptidase inhibitor 16 (PI16), cysteine-rich secretory proteins (CRISPs), CRISP LCCL domain containing 1 (CRISPLD1), CRISP LCCL domain containing 2 (CRISPLD2), mannose receptor like and the R3H domain containing like proteins. Members are most often secreted and have an extracellular endocrine or paracrine function and are involved in processes including the regulation of extracellular matrix and branching morphogenesis, potentially as either proteases or protease inhibitors; in ion channel regulation in fertility; as tumor suppressor or pro-oncogenic genes in tissues including the prostate; and in cell-cell adhesion during fertilisation. The overall protein structural conservation within the CAP 'super'family results in fundamentally similar functions for the CAP domain in all members, yet the diversity outside of this core region dramatically alters the target specificity and, thus, the biological consequences. The Ca++-chelating function would fit with the various signalling processes (e.g. the CRISP proteins) that members of this family are involved in, and also the sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how the cysteine-rich venom protein helothermine blocks the Ca++ transporting ryanodine receptors.	0
412179	cl00134	Chemokine	N/A. Includes a number of secreted growth factors and interferons involved in mitogenic, chemotactic, and inflammatory activity. Structure contains two highly conserved disulfide bonds.	0
412180	cl00136	Sec7	N/A. The Sec7 domain is a guanine-nucleotide-exchange-factor (GEF) for the pfam00025 family.	0
412181	cl00140	SNc	N/A. Present in all three domains of cellular life. Four copies in the transcriptional coactivator p100: these, however, appear to lack the active site residues of Staphylococcal nuclease. Positions 14 (Asp-21), 34 (Arg-35), 39 (Asp-40), 42 (Glu-43) and 110 (Arg-87) [SNase numbering in parentheses] are thought to be involved in substrate-binding and catalysis.	0
412182	cl00144	Tar_Tsr_LBD	ligand binding domain of Tar- and Tsr-related chemoreceptors. This family is a four helix bundle that operates as a ubiquitous sensory module in prokaryotic signal-transduction. The 4HB_MCP is always found between two predicted transmembrane helices indicating that it detects only extracellular signals. In many cases the domain is associated with a cytoplasmic HAMP domain suggesting that most proteins carrying the bundle might share the mechanism of transmembrane signalling which is well-characterized in E. coli chemoreceptors.	0
412183	cl00145	T-box	DNA-binding domain of the T-box transcription factor family. Fungi incertae sedis refers to a fungal taxonomic group where its broader relationships are unknown or undefined. The T-box family is an ancient group that appears to play a critical role in development in all animal species. These genes were uncovered on the basis of similarity to the DNA binding domain of murine Brachyury (T) gene product, the defining feature of the family.  Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development and conserved expression patterns, most of the known genes in all species being expressed in mesoderm or mesoderm precursors.	0
412184	cl00146	TFIIS_I	N/A. Mediator is a large complex of up to 33 proteins that is conserved from plants to fungi to humans - the number and representation of individual subunits varying with species. It is arranged into four different sections, a core, a head, a tail and a kinase-activity part, and the number of subunits within each of these is what varies with species. Overall, Mediator regulates the transcriptional activity of RNA polymerase II but it would appear that each of the four different sections has a slightly different function. Mediator exists in two major forms in human cells: a smaller form that interacts strongly with pol II and activates transcription, and a large form that does not interact strongly with pol II and does not directly activate transcription. Notably, the 'small' and 'large' Mediator complexes differ in their subunit composition: the Med26 subunit preferentially associates with the small, active complex, whereas cdk8, cyclin C, Med12 and Med13 associate with the large Mediator complex. This family includes the C terminal region of a number of eukaryotic hypothetical proteins which are homologous to the Saccharomyces cerevisiae protein IWS1. IWS1 is known to be a Pol II transcription elongation factor and interacts with Spt6 and Spt5.	0
412185	cl00147	TNF	N/A. Family of cytokines that form homotrimeric or heterotrimeric complexes. TNF mediates mature T-cell receptor-induced apoptosis through the p75 TNF receptor.	0
412186	cl00150	TY	N/A. Thyroglobulin type 1 repeats are thought to be involved in the control of proteolytic degradation. The domain usually contains six conserved cysteines. These form three disulphide bridges. Cysteine 1 pairs with 2, 3 with 4 and 5 with 6.	0
412187	cl00154	UBCc	N/A. A member of the E2/UBC superfamily of proteins found in several bacteria. The active site residues are similar to the eukaryotic E2 proteins but lack the conserved asparagine. Members of this family are usually fused to an E1 domain at the C-terminus. The protein is usually in the gene neighborhood of a gene encoding a member of the pol-beta nucleotidyltransferase superfamily. Many of the operons in this family are in ICE-like mobile elements and plasmids.	0
412188	cl00156	WAP	N/A. WAP belongs to the group of Elafin or elastase-specific inhibitors.	0
412189	cl00157	WW	N/A. The WW domain is a protein module with two highly conserved tryptophans that binds proline-rich peptide motifs in vitro.	0
412190	cl00159	fer2	N/A. The 2Fe-2S ferredoxin family have a general core structure consisting of beta(2)-alpha-beta(2) which is a beta-grasp type fold. The domain is around one hundred amino acids with four conserved cysteine residues to which the 2Fe-2S cluster is ligated. This cluster appears within sarcosine oxidase proteins.	0
412191	cl00160	LbetaH	N/A. This family of proteins includes the characterized NeuD sialic acid O-acetyltransferase enzymes from E. coli and Streptococcus agalactiae (group B strep). These two are quite closely related to one another, so extension of this annotation to other members of the family is unsupported without additional independent evidence. The neuD gene is often observed in close proximity to the neuABC genes for the biosynthesis of CMP-N-acetylneuraminic acid (CMP-sialic acid), and NeuD sequences from these organisms were used to construct the seed for this model. Nevertheless, there are numerous instances of sequences identified by this model which are observed in a different genomic context (although almost universally in exopolysaccharide biosynthesis-related loci), as well as in genomes for which the biosynthesis of sialic acid (SA) is undemonstrated. Even in the cases where the association with SA biosynthesis is strong, it is unclear in the literature whether the biological substrate is SA itself, CMP-SA, or a polymer containing SA. Similarly, it is unclear to what extent the enzyme has a preference for acetylation at the 7, 8 or 9 positions. In the absence of evidence of association with SA, members of this family may be involved with the acetylation of differing sugar substrates, or possibly the delivery of alternative acyl groups. The closest related sequences to this family (and those used to root the phylogenetic tree constructed to create this model) are believed to be succinyltransferases involved in lysine biosynthesis. These proteins contain repeats of the bacterial transferase hexapeptide (pfam00132), although often these do not register above the trusted cutoff.	0
412192	cl00162	PTS_IIA_glc	N/A. These are part of the PTS Glucose-Glucoside (Glc) SuperFamily. The Glc family includes permeases specific for glucose, N-acetylglucosamine and a large variety of a- and b-glucosides. However, not all b-glucoside PTS permeases are in this class, as the cellobiose (Cel) b-glucoside PTS permease is in the Lac family (TC #4.A.3). The IIA, IIB and IIC domains of all of the permeases listed below are demonstrably homologous. These permeases show limited sequence similarity with members of the Fru family (TC #4.A.2). Several of the PTS permeases in the Glc family lack their own IIA domains and instead use the glucose IIA protein (IIAglc or Crr). Most of these permeases have the B and C domains linked together in a single polypeptide chain, and a cysteyl residue in the IIB domain is phosphorylated by direct phosphoryl transfer from IIAglc(his~P). Those permeases which lack a IIA domain include the maltose (Mal), arbutin-salicin-cellobiose (ASC), trehalose (Tre), putative glucoside (Glv) and sucrose (Scr) permeases of E. coli. Most, but not all Scr permeases of other bacteria also lack a IIA domain. The three-dimensional structures of the IIA and IIB domains of the E. coli glucose permease have been elucidated. IIAglc has a complex b-sandwich structure while IIBglc is a split ab-sandwich with a topology unrelated to the split ab-sandwich structure of HPr. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS]	0
412193	cl00163	PTS_IIA_fru	N/A. 4.A.2 The PTS Fructose-Mannitol (Fru) Family Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Fru family is a large and complex family which includes several sequenced fructose and mannitol-specific permeases as well as several putative PTS permeases of unknown specificities. The fructose permeases of this family phosphorylate fructose on the 1-position. Those of family 4.6 phosphorylate fructose on the 6-position. The Fru family PTS systems typically have 3 domains, IIA, IIB and IIC, which may be found as 1 or more proteins. The fructose and mannitol transporters form separate phylogenetic clusters in this family. This model is specific for the IIA domain of the fructose PTS transporters. Also similar to the Enzyme IIA Fru subunits of the PTS, but included in TIGR01419 rather than this model, is enzyme IIA Ntr (nitrogen), also called PtsN, found in E. coli and other organisms, which may play a solely regulatory role. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS]	0
412194	cl00165	Calpain_III	N/A. The function of the domain III and I are currently unknown. Domain II is a cysteine protease and domain IV is a calcium binding domain. Calpains are believed to participate in intracellular signaling pathways mediated by calcium ions.	0
412195	cl00166	PTS_IIA_lac	N/A. The bacterial phosphoenolpyruvate: sugar phosphotransferase system (PTS) is a multi-protein system involved in the regulation of a variety of metabolic and transcriptional processes. The lactose/cellobiose-specific family are one of four structurally and functionally distinct group IIA PTS system enzymes. This family of proteins normally function as a homotrimer, stabilized by a centrally located metal ion. Separation into subunits is thought to occur after phosphorylation.	0
412196	cl00169	Mog1	N/A. Segregation of nuclear and cytoplasmic processes facilitates regulation of many eukaryotic cellular functions such as gene expression and cell cycle progression. Trafficking through the nuclear pore requires a number of highly conserved soluble factors that escort macromolecular substrates into and out of the nucleus. The Mog1 protein has been shown to interact with RanGTP which stimulates guanine nucleotide release, suggesting Mog1 regulates the nuclear transport functions of Ran. The human homolog of Mog1 is thought to be alternatively spliced.	0
412197	cl00170	eu-GS	N/A. This model represents the eukaryotic glutathione synthetase, which shows little resemblance to the analogous enzyme of Gram-negative bacteria (TIGR01380). In the Kinetoplastida, trypanothione replaces glutathione, but can be made from glutathione; a sequence from Leishmania is not included in the seed, is highly divergent, and therefore scores between the trusted and noise cutoffs.	0
412198	cl00173	VIP2	N/A. Members of this family, which are predominantly found in anthrax toxin lethal factor, adopt a structure consisting of a core of antiparallel beta sheets and alpha helices. They form a long deep groove within the protein that anchors the 16-residue N-terminal tail of MAPKK-2 before cleavage. It has been noted that this domain resembles the ADP-ribosylating toxin from Bacillus cereus, but the active site has been modified to augment substrate recognition.	0
412199	cl00175	alpha-crystallin-Hsps_p23-like	alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins. The CS and CHORD (pfam04968) are fused into a single polypeptide chain in metazoans but are found in separate proteins in plants; this is thought to be indicative of an interaction between CS and CHORD. It has been suggested that the CS domain is a binding module for HSP90, implying that CS domain-containing proteins are involved in recruiting heat shock proteins to multiprotein assemblies. Two CS domains are found at the N-terminus of Ubiquitin carboxyl-terminal hydrolase 19 (USP19), these domains may play a role in the interaction of USP19 with cellular inhibitor of apoptosis 2.	0
412200	cl00178	Ecotin	Protease Inhibitor Ecotin; homodimeric protease inhibitor. Ecotin is a broad range serine protease inhibitor, which forms homodimers. The C-terminal region contains the dimerization motif. Interestingly, the binding sites show a fluidity of protein contacts derived from ecotin's innate flexibility in fitting itself to proteases.	0
412201	cl00179	AlgLyase	N/A. This is the N-terminal domain of heparinase II/III proteins. It is a toroid-like domain.	0
412202	cl00180	RabGEF	N/A. Nucleotide exchange factor for Rab-like small GTPases (RabGEF), Mss4 type; RabGEF positively regulates the function of Rab GTPase by promoting exchange of GDP for GTP; members of the Rab subfamily of Ras GTPases are important in vesicular transport;	0
412203	cl00182	Mth938-like	N/A. This is a large family of uncharacterized proteins found in all domains of life. The structure shows a novel fold with three beta sheets. A dimeric form is found in the crystal structure. It was suggested that the cleft in between the two monomers might bind nucleic acid.	0
412204	cl00184	CAS_like	N/A. This family consists of various bacterial proteins pertaining to the non-haem Fe(II)-dependent oxygenase family. Exact function is unknown, but a putative role includes involvement in the control of utilisation of gamma-aminobutyric acid.	0
381844	cl00185	PL_Passenger_AT	N/A. Pertactin-like passenger domains (virulence factors), C-terminal, subgroup 2, of autotransporter proteins of the type V secretion system of Gram-negative bacteria. This subgroup includes the passenger domains of the nonprotease autotransporters, Ag43, AIDA-1 and IcsA, as well as, the less characterized ShdA, MisL, and BapA autotransporters.	0
412205	cl00186	nidG2	N/A. Nidogen, an invariant component of basement membranes, is a multifunctional protein that interacts with most other major basement membrane proteins. The G2 fragment or (G2F domain) contains binding sites for collagen IV and perlecan. The structure is composed of an 11-stranded beta-barrel with a central helix. This domain is structurally related to that of green fluorescent protein pfam01353. A large surface patch on the beta-barrel is conserved in all metazoan nidogens.	0
412206	cl00188	BPI	N/A. The N and C terminal domains of the LBP/BPI/CETP family are structurally similar.	0
412207	cl00189	YlxR	N/A. Ylxr homologs; group of conserved hypothetical bacterial proteins of unknown function; structure revealed putative RNA binding cleft; proteins are encoded by an operon that includes other proteins involved in transcription and/or translation	0
412208	cl00192	ribokinase_pfkB_like	N/A. This enzyme EC:2.7.4.7 is part of the Thiamine pyrophosphate (TPP) synthesis pathway; TPP is an essential cofactor for many enzymes.	0
412209	cl00193	cytochrome_b_C	N/A. cytochrome b6-f complex subunit IV; Provisional	0
412210	cl00194	EF1B	N/A. This family is the guanine nucleotide exchange domain of EF-1 beta and EF-1 delta chains.	0
412211	cl00195	SIR2	N/A. This family of proteins are related to the sirtuins.	0
412212	cl00196	plant_peroxidase_like	Heme-dependent peroxidases similar to plant peroxidases. As catalase, this enzyme catalyzes the dismutation of two molecules of hydrogen peroxide to dioxygen and two molecules of water. As a peroxidase, it uses hydrogen peroxide to oxidize donor compounds and produce water. KatG from E. coli is a homotetramer with two non-covalently associated iron protoheme IX groups per tetramer, but the ortholog from Synechococcus sp. is a homodimer with one protoheme. Important sites (numbered according to E. coli KatG) include heme ligands His-106 and His-267 and active site Trp-318. Note that the translation PID:g296476 from accession X71420 from Rhodobacter capsulatus B10 contains extensive frameshift differences from the rest of the orthologous family. [Cellular processes, Detoxification]	0
412213	cl00197	cyclophilin	N/A. The peptidyl-prolyl cis-trans isomerases, also known as cyclophilins, share this domain of about 109 amino acids. Cyclophilins have been found in all organisms studied so far and catalyze peptidyl-prolyl isomerisation during which the peptide bond preceding proline (the peptidyl-prolyl bond) is stabilized in the cis conformation. Mammalian cyclophilin A (CypA) is a major cellular target for the immunosuppressive drug cyclosporin A (CsA). Other roles for cyclophilins may include chaperone and cell signalling function.	0
412214	cl00198	Phosphoglycerate_kinase	N/A. phosphoglycerate kinase; Provisional	0
412215	cl00199	SO_family_Moco	N/A. This domain is found in a variety of oxidoreductases. This domain binds to a molybdopterin cofactor. Xanthine dehydrogenases, that also bind molybdopterin, have essentially no similarity.	0
412216	cl00200	MIP	N/A. MIP (Major Intrinsic Protein) family proteins exhibit essentially two distinct types of channel properties: (1) specific water transport by the aquaporins, and (2) small neutral solutes transport, such as glycerol by the glycerol facilitators.	0
412217	cl00202	rubredoxin_like	N/A. Rubredoxin; nonheme iron binding domains containing a [Fe(SCys)4] center. Rubredoxins are small nonheme iron proteins. The iron atom is coordinated by four cysteine residues (Fe(S-Cys)4), but iron can also be replaced by cobalt, nickel or zinc. They are believed to be involved in electron transfer.	0
412218	cl00203	Ribosomal_L30_like	N/A. This family includes prokaryotic L30 and eukaryotic L7.	0
412219	cl00204	PFK	N/A. Members of this family that are characterized, save one, are phosphofructokinases dependent on pyrophosphate (EC 2.7.1.90) rather than ATP (EC 2.7.1.11). The exception is one of three phosphofructokinases from Streptomyces coelicolor. Family members are both bacterial and archaeal. [Energy metabolism, Glycolysis/gluconeogenesis]	0
412220	cl00205	HMG-CoA_reductase	Hydroxymethylglutaryl-coenzyme A (HMG-CoA) reductase (HMGR). The HMG-CoA reductases catalyze the conversion of HMG-CoA to mevalonate, which is the rate-limiting step in the synthesis of isoprenoids like cholesterol. Probably because of the critical role of this enzyme in cholesterol homeostasis, mammalian HMG-CoA reductase is heavily regulated at the transcriptional, translational, and post-translational levels.	0
412221	cl00206	PTS-HPr_like	N/A. The HPr family are bacterial proteins (or domains of proteins) which function in phosphoryl transfer system (PTS) systems. They include energy-coupling components which catalyze sugar uptake via a group translocation mechanism. The functions of most of these proteins are not known, but they presumably function in PTS-related regulatory capacities. All seed members are stand-alone HPr proteins, although the model also recognizes HPr domains of PTS fusion proteins. This family includes the related NPr protein. [Signal transduction, PTS]	0
412222	cl00207	HMA	N/A. This model describes an apparently copper-specific subfamily of the metal-binding domain HMA (pfam00403). Closely related sequences outside this model include mercury resistance proteins and repeated domains of eukaryotic copper transport proteins. Members of this family are strictly prokaryotic. The model identifies both small proteins consisting of just this domain and N-terminal regions of cation (probably copper) transporting ATPases. [Transport and binding proteins, Cations and iron carrying compounds]	0
412223	cl00208	RNase_T2	N/A. Ribonuclease T2 (RNase T2) is a widespread family of secreted RNases found in every organism examined thus far.  This family includes RNase Rh, RNase MC1, RNase LE, and self-incompatibility RNases (S-RNases).  Plant T2 RNases are expressed during leaf senescence in order to scavenge phosphate from ribonucleotides. They are also expressed in response to wounding or pathogen invasion. S-RNases are thought to prevent self-fertilization by acting as selective cytotoxins of "self" pollen. Generally, RNases have two distinct binding sites: the primary site (B1 site) and the subsite (B2 site), for nucleotides located at the 5'- and 3'- terminal ends of the scissile bond, respectively. This CD includes the prokaryotic RNase T2 family members.	0
412224	cl00210	Isoprenoid_Biosyn_C1	Isoprenoid Biosynthesis enzymes, Class 1. It has been suggested that this gene family be designated tps (for terpene synthase). It has been split into six subgroups on the basis of phylogeny, called tpsa-tpsf. tpsa includes vetispiridiene synthase, 5-epi-aristolochene synthase, and (+)-delta-cadinene synthase. tpsb includes (-)-limonene synthase. tpsc includes kaurene synthase A. tpsd includes taxadiene synthase, pinene synthase, and myrcene synthase. tpse includes kaurene synthase B. tpsf includes linalool synthase.	0
412225	cl00211	Heme_Cu_Oxidase_III_like	N/A. Heme-copper oxidase subunit III subfamily.  Heme-copper oxidases are transmembrane protein complexes in the respiratory chains of prokaryotes and mitochondria which couple the reduction of molecular oxygen to water to proton pumping across the membrane. The heme-copper oxidase superfamily is diverse in terms of electron donors, subunit composition, and heme types.  This superfamily includes cytochrome c and ubiquinol oxidases.  Bacterial oxidases typically contain 3 or 4 subunits in contrast to the 13 subunit bovine cytochrome c oxidase (CcO). Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Subunits I, II and III of ubiquinol oxidase are homologous to the corresponding subunits in CcO.  Although not required for catalytic activity, subunit III is believed to play a role in assembly of the multimer complex. Rhodobacter CcO subunit III stabilizes the integrity of the binuclear center in subunit I.  It has been proposed that Archaea acquired heme-copper oxidases through gene transfer from Gram-positive bacteria.	0
412226	cl00212	microbial_RNases	N/A. This enzyme hydrolyzes RNA and oligoribonucleotides.	0
412227	cl00213	DNA_BRE_C	DNA breaking-rejoining enzymes, C-terminal catalytic domain. catalyzes cleavage and ligation of DNA.	0
412228	cl00214	Aldolase_II	N/A. This family includes class II aldolases and adducins which have not been ascribed any enzymatic function.	0
412229	cl00215	Aconitase_swivel	N/A. This family represents the N-terminal region of several bacterial Aconitate hydratase 2 proteins and is found in conjunction with pfam00330.	0
412230	cl00216	L-asparaginase_like	Bacterial L-asparaginases and related enzymes. This is the N-terminal domain of this enzyme.	0
412231	cl00217	pyrophosphatase	N/A. inorganic pyrophosphatase; Provisional	0
412232	cl00219	Pterin_binding	N/A. This family includes a variety of pterin binding enzymes that all adopt a TIM barrel fold. The family includes dihydropteroate synthase EC:2.5.1.15 as well as a group of methyltransferase enzymes including methyltetrahydrofolate, corrinoid iron-sulfur protein methyltransferase (MeTr) that catalyzes a key step in the Wood-Ljungdahl pathway of carbon dioxide fixation. It transfers the N5-methyl group from methyltetrahydrofolate (CH3-H4folate) to a cob(I)amide centre in another protein, the corrinoid iron-sulfur protein. MeTr is a member of a family of proteins that includes methionine synthase and methanogenic enzymes that activate the methyl group of methyltetra-hydromethano(or -sarcino)pterin.	0
381872	cl00220	cysteine_hydrolases	N/A. This family are hydrolase enzymes.	0
412233	cl00221	ACBP	N/A. acyl CoA binding protein; Provisional	0
412234	cl00222	Lyz-like	lysozyme-like domains. This family is related to the SLT domain pfam01464.	0
412235	cl00223	NusB_Sun	N/A. Members of this family of Mycoplasma hypothetical proteins adopt a helical structure, with one central alpha-helix surrounded by five others, in a NusB-like fold. Their function has not, as yet, been determined.	0
412236	cl00224	PLPDE_IV	N/A. The D-amino acid transferases (D-AAT) are required by bacteria to catalyze the synthesis of D-glutamic acid and D-alanine, which are essential constituents of bacterial cell wall and are the building blocks for other D-amino acids. Despite the difference in the structure of the substrates, D-AATs and L-AATs have strong similarity.	0
412237	cl00226	nuc_hydro	N/A. A family of proteins in Rhodopirellula baltica that are predicted to be secreted. Also, a member has been identified in Caulobacter crescentus. These proteins may be related to pfam01156.	0
412238	cl00227	PEBP	PhosphatidylEthanolamine-Binding Protein (PEBP) domain. putative kinase inhibitor protein; Provisional	0
412239	cl00228	HIT_like	N/A. This family consists of several scavenger mRNA decapping enzymes (DcpS) and is the C-terminal region. DcpS is a scavenger pyrophosphatase that hydrolyzes the residual cap structure following 3' to 5' decay of an mRNA. The association of DcpS with 3' to 5' exonuclease exosome components suggests that these two activities are linked and there is a coupled exonucleolytic decay-dependent decapping pathway. The C-terminal domain contains a histidine triad (HIT) sequence with three histidines separated by hydrophobic residues. The central histidine within the DcpS HIT motif is critical for decapping activity and defines the HIT motif as a new mRNA decapping domain, making DcpS the first member of the HIT family of proteins with a defined biological function.	0
412240	cl00229	eIF1_SUI1_like	Eukaryotic initiation factor 1 and related proteins. This protein family shows weak but suggestive similarity to translation initiation factor SUI1 and its prokaryotic homologs.	0
412241	cl00230	Cis_IPPS	Cis (Z)-Isoprenyl Diphosphate Synthases. Previously known as uncharacterized protein family UPF0015, a single member of this family has been identified as an undecaprenyl diphosphate synthase.	0
412242	cl00231	SAICAR_synt	5-aminoimidazole-4-(N-succinylcarboxamide) ribonucleotide (SAICAR) synthase. Also known as Phosphoribosylaminoimidazole-succinocarboxamide synthase.	0
412243	cl00232	Ribosomal_L19e	N/A. Ribosomal protein L19e, archaeal.  L19e is found in the large ribosomal subunit of eukaryotes and archaea. L19e is distinct from the ribosomal subunit L19, which is found in prokaryotes. It consists of two small globular domains connected by an extended segment. It is located toward the surface of the large subunit, with one exposed end involved in forming the intersubunit bridge with the small subunit.  The other exposed end is involved in forming the translocon binding site, along with L22, L23, L24, L29, and L31e subunits.	0
412244	cl00233	HPPK	N/A. This model describes the folate biosynthesis enzyme 2-amino-4-hydroxy-6-hydroxymethyldihydropteridine pyrophosphokinase. Alternate names include 6-hydroxymethyl-7,8-dihydropterin diphosphokinase and 7,8-dihydro-6-hydroxymethylpterin pyrophosphokinase (HPPK). The extreme C-terminal region, of typically eight to thirty residues, is not included in the model. This enzyme may be found as a fusion protein with other enzymes of folate biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Folic acid]	0
412245	cl00234	Pep_deformylase	N/A. Peptide deformylase (EC 3.5.1.88), also called polypeptide deformylase, is a metalloenzyme that uses water to release formate from the N-terminal formyl-L-methionine of bacterial and chloroplast peptides. This enzyme should not be confused with formylmethionine deformylase (EC 3.5.1.31) which is active on free N-formyl methionine and has been reported from rat intestine. [Protein fate, Protein modification and repair]	0
412246	cl00235	4Oxalocrotonate_Tautomerase	N/A. This family includes the enzyme 4-oxalocrotonate tautomerase, which catalyzes the ketonisation of 2-hydroxymuconate to 2-oxo-3-hexenedioate.	0
412247	cl00236	Hsp33	N/A. Hsp33 is a molecular chaperone, distinguished from all other known chaperones by its mode of functional regulation. Its activity is redox regulated. Hsp33 is a cytoplasmically localized protein with highly reactive cysteines that respond quickly to changes in the redox environment. Oxidising conditions like H2O2 cause disulfide bonds to form in Hsp33, a process that leads to the activation of its chaperone function.	0
412248	cl00237	Peptidase_C15	N/A. PgaPase_1 is a family of functionally diverse Caenorhabditis proteins. The family is homologous to the cysteine-peptidases, but lack of a strictly conserved Glu-Cys-His catalytic triad or pGlu binding site implies that it has other functions that could have resulted in a change in reaction-specificity or even of catalytic activity.	0
412249	cl00238	Frataxin	N/A. This family contains proteins that have a domain related to the globular C-terminus of Frataxin, the protein that is mutated in Friedreich's ataxia. This domain is found in a family of bacterial proteins. The function of this domain is currently unknown. It has been suggested that this family is involved in iron transport.	0
412250	cl00239	GXGXG	N/A. This domain is found in glutamate synthase, tungsten formylmethanofuran dehydrogenase subunit c (FwdC) and molybdenum formylmethanofuran dehydrogenase subunit c (FmdC). A repeated G-XX-G-XXX-G motif is seen in the alignment.	0
412251	cl00240	RRF	N/A. The ribosome recycling factor (RRF / ribosome release factor) dissociates the ribosome from the mRNA after termination of translation, and is essential for bacterial growth. Thus ribosomes are "recycled" and ready for another round of protein synthesis.	0
412252	cl00241	IF6	N/A. This family includes eukaryotic translation initiation factor 6 as well as presumed archaebacterial homologs.	0
412253	cl00242	MoaC	N/A. Members of this family are involved in molybdenum cofactor biosynthesis. However their molecular function is not known.	0
412254	cl00245	MGS-like	N/A. This domain composes the whole protein of methylglyoxal synthetase and the domain is also found in Carbamoyl phosphate synthetase (CPS) where it forms a regulatory domain that binds to the allosteric effector ornithine. This family also includes inosicase. The known structures in this family show a common phosphate binding site.	0
412255	cl00246	MTHFR	N/A. This family includes the 5,10-methylenetetrahydrofolate reductase EC:1.7.99.5 from bacteria and methylenetetrahydrofolate reductase EC: 1.5.1.20 from eukaryotes. The structure for this domain is known to be a TIM barrel.	0
412256	cl00247	MCR_gamma	N/A. Methyl-coenzyme M reductase (MCR) is the enzyme responsible for microbial formation of methane. It is a hexamer composed of 2 alpha (pfam02249), 2 beta (pfam02241), and 2 gamma (this family) subunits with two identical nickel porphinoid active sites.	0
412257	cl00248	OMPLA	N/A. Phospholipase A1 is a bacterial outer membrane bound acyl hydrolase with a broad substrate specificity EC:3.1.1.32. It has been proposed that Ser164 is the active site for Escherichia coli phospholipase A1.	0
412258	cl00249	MCH	N/A. Methenyl tetrahydromethanopterin cyclohydrolase EC:3.5.4.27 is involved in methanogenesis in bacteria and archaea, producing methane from carbon monoxide or carbon dioxide.	0
412259	cl00250	RaiA	N/A. This Pfam family contains the sigma-54 modulation protein family and the S30AE family of ribosomal proteins which includes the light-repressed protein (lrtA).	0
412260	cl00251	Translocase_SecB	N/A. This family consists of preprotein translocase subunit SecB. SecB is required for the normal export of envelope proteins out of the cell cytoplasm.	0
412261	cl00252	NifX_NifB	N/A. This family contains several NIF (B, Y and X) proteins which are iron-molybdenum cofactors (FeMo-co) in the dinitrogenase enzyme which catalyzes the reduction of dinitrogen to ammonium. Dinitrogenase is a hetero-tetrameric (alpha(2)beta(2)) enzyme which contains the iron-molybdenum cofactor (FeMo-co) at its active site.	0
412262	cl00253	Dtyr_deacylase	N/A. This family comprises several D-Tyr-tRNA(Tyr) deacylase proteins. Cell growth inhibition by several d-amino acids can be explained by an in vivo production of d-aminoacyl-tRNA molecules. Escherichia coli and yeast cells express an enzyme, d-Tyr-tRNA(Tyr) deacylase, capable of recycling such d-aminoacyl-tRNA molecules into free tRNA and d-amino acid. Accordingly, upon inactivation of the genes of the above deacylases, the toxicity of d-amino acids increases. Orthologues of the deacylase are found in many cells.	0
412263	cl00254	NOS_oxygenase	N/A. Nitric oxide synthase (NOS) eukaryotic oxygenase domain. NOS produces nitric oxide (NO) by catalyzing a five-electron heme-based oxidation of a guanidine nitrogen of L-arginine to L-citrulline via two successive monooxygenation reactions producing N(omega)-hydroxy-L-arginine (NHA) as an intermediate. In mammals, there are three distinct NOS isozymes: neuronal (nNOS or NOS-1), cytokine-inducible (iNOS or NOS-2) and endothelial (eNOS or NOS-3). Nitric oxide synthases are homodimers. In eukaryotes, each monomer has an N-terminal oxygenase domain, which binds to the substrate L-Arg, zinc, and to the cofactors heme and 5,6,7,8-(6R)-tetrahydrobiopterin (BH4). Eukaryotic NOS's also have a C-terminal electron supplying reductase region, which is homologous to cytochrome P450 reductase and binds NADH, FAD and FMN.	0
412264	cl00256	CheW_like	N/A. CheW proteins are part of the chemotaxis signaling mechanism in bacteria. CheW interacts with the methyl accepting chemotaxis proteins (MCPs) and relays signals to CheY, which affects flagellar rotation. This family includes CheW and other related proteins that are involved in chemotaxis. The CheW-like regulatory domain in CheA binds to CheW, suggesting that these domains can interact with each other.	0
412265	cl00257	HU_IHF	DNA sequence specific (IHF) and non-specific (HU) domains. This model describes a set of proteins related to but longer than DNA-binding protein HU. Its distinctive domain architecture compared to HU and related histone-like DNA-binding proteins justifies the designation as superfamily. Members include, so far, one from Bacteroides fragilis, a gut bacterium, and ten from Porphyromonas gingivalis, an oral anaerobe. [DNA metabolism, Chromosome-associated proteins]	0
412266	cl00258	RIBOc	N/A. Members of this family are involved in rDNA transcription and rRNA processing. They probably also cleave a stem-loop structure at the 3' end of U2 snRNA to ensure formation of the correct U2 3' end; they are involved in polyadenylation-independent transcription termination. Some members may be mitochondrial ribosomal protein subunit L15, others may be 60S ribosomal protein L3.	0
412267	cl00259	Sm_like	Sm and related proteins. This SM domain is found in Ataxin-2.	0
412268	cl00261	PLPDE_III	Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzymes. These pyridoxal-dependent decarboxylases act on ornithine, lysine, arginine and related substrates. This domain has a TIM barrel fold.	0
412269	cl00262	TroA-like	N/A. This family includes bacterial periplasmic binding proteins, several of which are involved in iron transport.	0
412270	cl00263	TFold	N/A. The QueF monomer is made up of two ferredoxin-like domains aligned together with their beta-sheets that have additional embellishments. This subunit is composed of a three-stranded beta-sheet and two alpha-helices. QueF reduces a nitrile bond to a primary amine. The two monomer units together create suitable substrate-binding pockets.	0
412271	cl00264	Ferritin_like	Ferritin-like superfamily of diiron-containing four-helix-bundle proteins. This domain has a ferritin-like fold.	0
412272	cl00266	HGTP_anticodon	N/A. This is an HGTP_anticodon binding domain, found largely on Gcn2 proteins which bind tRNA to down regulate translation in certain stress situations.	0
412273	cl00268	class_II_aaRS-like_core	N/A. This is a family of class II aminoacyl-tRNA synthetase-like and ATP phosphoribosyltransferase regulatory subunits.	0
412274	cl00269	cytidine_deaminase-like	N/A. A member of the nucleic acid/nucleotide deaminase superfamily prototyped by Bdellovibrio Bd3614. They are typified by a distinct N-terminal globular domain. The Bdellovibrio version occurs in a predicted operon with a 23S rRNA G2445-modifying methylase suggesting that it might be involved in RNA editing.	0
412275	cl00271	PI3Ka	N/A. PIK domain is conserved in all PI3 and PI4-kinases. Its role is unclear but it has been suggested to be involved in substrate presentation.	0
412276	cl00274	ML	N/A. This domain is distantly similar to pfam02221 and conserves its pattern of conserved cysteines. This suggests that this domain may be involved in lipid binding.	0
412277	cl00275	Heme_Cu_Oxidase_I	N/A. Cytochrome C oxidase subunit I.  Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes.  It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane.  The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Only subunits I and II are essential for function, but subunit III, which is also conserved, may play a role in assembly or oxygen delivery to the active site. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Subunit I contains a heme-copper binuclear center (the active site where O2 is reduced to water) formed by a high-spin heme (heme a3) and a copper ion (CuB).  It also contains a low-spin heme (heme a), believed to participate in the transfer of electrons to the binuclear center.  For every reduction of an O2 molecule, eight protons are taken from the inside aqueous compartment and four electrons are taken from cytochrome c on the opposite side of the membrane.  The four electrons and four of the protons are used in the reduction of O2; the four remaining protons are pumped across the membrane.  This charge separation of four charges contributes to the electrochemical gradient used for ATP synthesis. Two proton channels, the D-pathway and K-pathway, leading to the binuclear center have been identified in subunit I.  A well-defined pathway for the transfer of pumped protons beyond the binuclear center has not been identified. Electrons are transferred from cytochrome c (the electron donor) to heme a via the CuA binuclear site in subunit II, and directly from heme a to the binuclear center.	0
412278	cl00276	Maf_Ham1	N/A. Maf is a putative inhibitor of septum formation in eukaryotes, bacteria, and archaea.	0
412279	cl00278	CCC1_like	CCC1-related family of proteins. This family includes the vacuolar Fe2+/Mn2+ uptake transporter, Ccc1 and the vacuolar iron transporter VIT1.	0
412280	cl00279	APP_MetAP	N/A. This family contains metallopeptidases. It also contains non-peptidase homologs such as the N terminal domain of Spt16 which is a histone H3-H4 binding module.	0
412281	cl00281	metallo-dependent_hydrolases	N/A. These proteins are amidohydrolases that are related to pfam01979.	0
412282	cl00282	cbb3_Oxidase_CcoQ	N/A. This family consists of several Cbb3-type cytochrome oxidase components (FixQ/CcoQ). FixQ is found in nitrogen fixing bacteria. Since nitrogen fixation is an energy-consuming process, effective symbioses depend on operation of a respiratory chain with a high affinity for O2, closely coupled to ATP production. This requirement is fulfilled by a special three-subunit terminal oxidase (cytochrome terminal oxidase cbb3), which was first identified in Bradyrhizobium japonicum as the product of the fixNOQP operon.	0
412283	cl00283	ADP_ribosyl	N/A. Members of this family, which are found in prokaryotic exotoxin A, catalyze the transfer of ADP ribose from nicotinamide adenine dinucleotide (NAD) to elongation factor-2 in eukaryotic cells, with subsequent inhibition of protein synthesis.	0
412284	cl00285	Aconitase	Aconitase catalytic domain; Aconitase catalyzes the reversible isomerization of citrate and isocitrate as part of the TCA cycle. Family of hypothetical proteins.	0
260328	cl00288	EPT_RTPC-like	N/A. EPSP synthase domain. 3-phosphoshikimate 1-carboxyvinyltransferase (5-enolpyruvylshikimate-3-phosphate synthase) (EC 2.5.1.19) catalyses the reaction between shikimate-3-phosphate (S3P) and phosphoenolpyruvate (PEP) to form 5-enolpyruvylshikimate-3-phosphate (EPSP), an intermediate in the shikimate pathway leading to aromatic amino acid biosynthesis. The reaction is phosphoenolpyruvate + 3-phosphoshikimate = phosphate + 5-O-(1-carboxyvinyl)-3-phosphoshikimate. It is found in bacteria and plants but not animals. The enzyme is the target of the widely used herbicide glyphosate, which has been shown to occupy the active site. In bacteria and plants, it is a single domain protein, while in fungi, the domain is found as part of a multidomain protein with functions that are all part of the shikimate pathway.	0
412285	cl00289	FIG	N/A. This family represents the N-terminus of this protein family.	0
412286	cl00292	AANH_like	N/A. NAD synthase (EC:6.3.5.1) is involved in the de novo synthesis of NAD and is induced by stress factors such as heat shock and glucose limitation.	0
412287	cl00293	B12-binding_like	N/A. This domain tends to occur to the N-terminus of the pfam04055 domain in hypothetical bacterial proteins.	0
412288	cl00295	ZZ	N/A. Zinc finger present in dystrophin, CBP/p300. ZZ in dystrophin binds calmodulin. Putative zinc finger; binding not yet shown. Four to six cysteine residues in its sequence are responsible for coordinating zinc ions, to reinforce the structure.	0
412289	cl00296	Peptidase_C39_like	N/A. BtrH_N is the N-terminus of the acyl carrier protein:aminoglycoside acyltransferase BtrH. Alternatively it can be referred to as butirosin biosynthesis protein H. BtrH transfers the unique (S)-4-amino-2-hydroxybutyrate (AHBA) side chain, which protects the antibiotic butirosin from several common resistance mechanisms. Butirosin, an aminoglycoside antibiotic produced by Bacillus circulans, exhibits improved antibiotic properties over its parent molecule and retains bactericidal activity toward many aminoglycoside-resistant strains. Butirosin is unique in carrying the AHBA side-chain. BtrH transfers the AHBA from the acyl carrier protein BtrI to the parent aminoglycoside ribostamycin as a gamma-glutamylated dipeptide.	0
412290	cl00297	R3H	N/A. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the domain is predicted to be binding ssDNA.	0
412291	cl00299	MIT	N/A. The MIT domain forms an asymmetric three-helix bundle and binds ESCRT-III (endosomal sorting complexes required for transport) substrates.	0
412292	cl00301	PAZ	N/A. This domain is named PAZ after the proteins Piwi, Argonaut and Zwille. This domain is found in two families of proteins that are involved in post-transcriptional gene silencing. These are the Piwi family and the Dicer family, that includes the Carpel factory protein. The function of the domains is unknown but has been suggested to mediate complex formation between proteins of the Piwi and Dicer families by hetero-dimerization. The three-dimensional structure of this domain has been solved. The PAZ domain is composed of two subdomains. One subdomain is similar to the OB fold, albeit with a different topology. The OB-fold is well known as a single-stranded nucleic acid binding fold. The second subdomain is composed of a beta-hairpin followed by an alpha-helix. The PAZ domain shows low-affinity nucleic acid binding and appears to interact with the 3' ends of single-stranded regions of RNA in the cleft between the two subdomains. PAZ can bind the characteristic two-base 3' overhangs of siRNAs, indicating that although PAZ may not be a primary nucleic acid binding site in Dicer or RISC, it may contribute to the specific and productive incorporation of siRNAs and miRNAs into the RNAi pathway.	0
412293	cl00303	PNP_UDP_1	Phosphorylase superfamily. This family consists of several purine nucleoside permeases from both bacteria and fungi.	0
412294	cl00304	TP_methylase	S-AdoMet-dependent tetrapyrrole methylases. This family uses S-AdoMet in the methylation of diverse substrates. This family includes a related group of bacterial proteins of unknown function. This family includes the methylase Diphthine synthase.	0
412295	cl00305	Sua5_yciO_yrdC	Telomere recombination. This domain is found in NodU from Rhizobium, CmcH from Nocardia lactamdurans and the bifunctional carbamoyltransferase TobZ from Streptoalloteichus tenebrarius. NodU a Rhizobium nodulation protein involved in the synthesis of nodulation factors has 6-O-carbamoyltransferase-like activity. CmcH is involved in cephamycin (antibiotic) biosynthesis and has 3-hydroxymethylcephem carbamoyltransferase activity, EC:2.1.3.7 catalyzing the reaction: Carbamoyl phosphate + 3-hydroxymethylceph-3-EM-4-carboxylate <=> phosphate + 3-carbamoyloxymethylcephem. TobZ functions as an ATP carbamoyltransferase and tobramycin carbamoyltransferase. These proteins contain two domains, this is the smaller, C-terminal, domain.	0
412296	cl00307	Thiamine_BP	Thiamine-binding protein. This protein has been crystallized in both Methanobacterium thermoautotrophicum and yeast, but its function remains unknown. Both crystal structures showed sulfate ions bound at the interface of two dimers to form a tetramer. [Unknown function, General]	0
412297	cl00309	PRTases_typeI	Phosphoribosyl transferase (PRT)-type I domain. This PRTase family, and C-terminal TRSP domain, are related to OPRTases, and are predicted to use Orotate as substrate. These genes are found in the biosynthetic operon associated with the Ter stress-response operon and are predicted to be involved in the biosynthesis of a ribo-nucleoside involved in stress response.	0
412298	cl00310	AIRC	AIR carboxylase. Phosphoribosylaminoimidazole carboxylase is a fusion protein in plants and fungi, but consists of two non-interacting proteins in bacteria, PurK and PurE. This model represents PurK, an N5-CAIR mutase. [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis]	0
412299	cl00311	UbiD	3-octaprenyl-4-hydroxybenzoate carboxy-lyase. Members of this protein family are putative decarboxylases involved in a late stage of the alternative pathway for menaquinone, via futalosine, as in Streptomyces coelicolor and Helicobacter pylori. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone]	0
412300	cl00312	Ribosomal_S12_like	N/A. This protein is known as S12 in bacteria and archaea and S23 in eukaryotes.	0
412301	cl00313	uS7	Ribosomal protein S7. This family contains ribosomal protein S7 from prokaryotes and S5 from eukaryotes.	0
412302	cl00314	Ribosomal_S10	Ribosomal protein S10p/S20e. This model describes the archaeal ribosomal protein uS10 and its equivalents (previously called S20) in eukaryotes. [Protein synthesis, Ribosomal proteins: synthesis and modification]	0
412303	cl00315	RPS2	N/A. This model describes the ribosomal protein of the cytosol and of Archaea, homologous to S2 of bacteria. It is designated typically as Sa in eukaryotes and Sa or S2 in the archaea. TIGR01011 describes the related protein of organelles and bacteria. [Protein synthesis, Ribosomal proteins: synthesis and modification]	0
412304	cl00317	Lumazine_synthase-like	lumazine synthase and riboflavin synthase; involved in the riboflavin (vitamin B2) biosynthetic pathway. This family includes the beta chain of 6,7-dimethyl-8-ribityllumazine synthase EC:2.5.1.9, an enzyme involved in riboflavin biosynthesis. The family also includes a subfamily of distant archaebacterial proteins that may also have the same function. The family contains a number of different subsets including a family of proteins comprising archaeal lumazine and riboflavin synthases, type I lumazine synthases, and the eubacterial type II lumazine synthases. It has been established that lumazine synthase catalyzes the penultimate step in the biosynthesis of riboflavin in plants and microorganisms. The type I lumazine synthases are active in pentameric or icosahedral quaternary assemblies, whereas the type II are decameric. Brucella, a bacterial genus that causes brucellosis, and other Rhizobiales have an atypical riboflavin metabolic pathway. Brucella spp code for both a type-I and a type-II lumazine synthase, and it has been shown that at least one of these two has to be present in order for Brucella to be viable, showing that in the case of Brucella flavin metabolism is implicated in bacterial virulence.	0
412305	cl00318	YjeF_N	YjeF-related protein N-terminus. The protein region corresponding to this model shows no clear homology to any protein of known function. This model is built on yeast protein YNL200C and the N-terminal regions of E. coli yjeF and its orthologs in various species. The C-terminal region of yjeF and its orthologs shows similarity to hydroxyethylthiazole kinase (thiM) and other enzymes involved in thiamine biosynthesis. Yeast YKL151C and B. subtilis yxkO match the yjeF C-terminal domain but lack this region. [Unknown function, General]	0
412306	cl00319	Gn_AT_II	N/A. This domain is a class-II glutamine amidotransferase domain found in a variety of enzymes such as asparagine synthetase and glutamine-fructose-6-phosphate transaminase.	0
412307	cl00320	tRNA_bindingDomain	N/A. This domain is found in prokaryotic methionyl-tRNA synthetases, prokaryotic phenylalanyl tRNA synthetases, the yeast GU4 nucleic-binding protein (G4p1 or p42, ARC1), human tyrosyl-tRNA synthetase, and endothelial-monocyte activating polypeptide II. G4p1 binds specifically to tRNA and forms a complex with methionyl-tRNA synthetases. In human tyrosyl-tRNA synthetase this domain may direct tRNA to the active site of the enzyme. This domain may perform a common function in tRNA aminoacylation.	0
412308	cl00322	Ribosomal_L1	N/A. This family includes prokaryotic L1 and eukaryotic L10.	0
412309	cl00323	Chorismate_synthase	Chorismate synthase, the enzyme catalyzing the final step of the shikimate pathway. Homotetramer (noted in E. coli) suggests reason for good conservation. [Amino acid biosynthesis, Aromatic amino acid family]	0
351027	cl00324	RplC	Ribosomal protein L3 [Translation, ribosomal structure and biogenesis]. This model describes exclusively the archaeal class of ribosomal protein L3. A separate model (TIGR03625) describes the bacterial/organelle form, and both belong to pfam00297. Eukaryotic proteins are excluded from this model. [Protein synthesis, Ribosomal proteins: synthesis and modification]	0
412310	cl00325	Ribosomal_L4	Ribosomal protein L4/L1 family. Members of this protein family are ribosomal protein L4. This model recognizes bacterial and most organellar forms, but excludes homologs from the eukaryotic cytoplasm and from archaea. [Protein synthesis, Ribosomal proteins: synthesis and modification]	0
412311	cl00326	Ribosomal_L23	Ribosomal protein L23. This model describes the archaeal ribosomal protein L23P and rigorously excludes the bacterial counterpart L23. In order to capture every known instance of archaeal L23P, the trusted cutoff is set lower than a few of the highest scoring eukaryotic cytosolic ribosomal counterparts. [Protein synthesis, Ribosomal proteins: synthesis and modification]	0
412312	cl00327	Ribosomal_L22	N/A. This family includes L22 from prokaryotes and chloroplasts and L17 from eukaryotes.	0
412313	cl00328	Ribosomal_L14	Ribosomal protein L14p/L23e. Part of the 50S ribosomal subunit. Forms a cluster with proteins L3 and L24e, part of which may contact the 16S rRNA in 2 intersubunit bridges.	0
412314	cl00330	Ribosomal_S8	Ribosomal protein S8. 30S ribosomal protein S8; Validated	0
412315	cl00331	Ribosomal_S13	Ribosomal protein S13/S18. This model describes bacterial ribosomal protein S13, to the exclusion of the homologous archaeal S13P and eukaryotic ribosomal protein S18. This model identifies some (but not all) instances of chloroplast and mitochondrial S13, which is of bacterial type. [Protein synthesis, Ribosomal proteins: synthesis and modification]	0
320911	cl00332	Ribosomal_S11	Ribosomal protein S11. This model describes the bacterial 30S ribosomal protein S11. Cutoffs are set such that the model excludes archaeal and eukaryotic ribosomal proteins, but many chloroplast and mitochondrial equivalents of S11 are detected. [Protein synthesis, Ribosomal proteins: synthesis and modification]	0
412316	cl00333	Ribosomal_L13	N/A. 60S ribosomal protein L13a; Provisional	0
412317	cl00334	Ribosomal_S9	Ribosomal protein S9/S16. ribosomal protein S9	0
351036	cl00335	NDPk	N/A. Nucleoside diphosphate kinase homolog 5 (NDP kinase homolog 5, NDPk5, NM23-H5; Inhibitor of p53-induced apoptosis-beta, IPIA-beta): In human, mRNA for NDPk5 is almost exclusively found in testis, especially in the flagella of spermatids and spermatozoa, in association with axoneme microtubules, and may play a role in spermatogenesis by increasing the ability of late-stage spermatids to eliminate reactive oxygen species.  It belongs to the nm23 Group II genes and appears to differ from the other human NDPks in that it lacks two important catalytic site residues, and thus does not appear to possess NDP kinase activity. NDPk5 confers protection from cell death by Bax and alters the cellular levels of several antioxidant enzymes, including glutathione peroxidase 5 (Gpx5).	0
412318	cl00336	DHBP_synthase	3,4-dihydroxy-2-butanone 4-phosphate synthase. Several members of the family are bifunctional, involving both ribA and ribB function. In these cases, ribA tends to be on the C-terminal end of the protein and ribB tends to be on the N-terminal. [Biosynthesis of cofactors, prosthetic groups, and carriers, Riboflavin, FMN, and FAD]	0
412319	cl00337	PT_UbiA	UbiA family of prenyltransferases (PTases). A fairly deep split separates this polyprenyltransferase subfamily from the set of mitochondrial and proteobacterial 4-hydroxybenzoate polyprenyltransferases, described in TIGR01474. Protoheme IX farnesyltransferase (heme O synthase) (TIGR01473) is more distantly related. Because no species appears to have both this protein and a member of TIGR01474, it is likely that this model represents 4-hydroxybenzoate polyprenyltransferase, a critical enzyme of ubiquinone biosynthesis, in the Archaea, Gram-positive bacteria, Aquifex aeolicus, the Chlamydias, etc. [Biosynthesis of cofactors, prosthetic groups, and carriers, Menaquinone and ubiquinone]	0
412320	cl00338	ALAD_PBGS	N/A. Porphobilinogen synthase (PBGS), which is also called delta-aminolevulinic acid dehydratase (ALAD), catalyzes the condensation of two 5-aminolevulinic acid (ALA) molecules to form the pyrrole porphobilinogen (PBG), which is the second step in the biosynthesis of tetrapyrroles, such as heme, vitamin B12 and chlorophyll. This reaction involves the formation of a Schiff base link between the substrate and the enzyme. PBGSs are metalloenzymes, some of which have a second, allosteric metal binding site, beside the metal ion binding site in their active site. Although PBGS is a family of homologous enzymes, its metal ion utilization at catalytic site varies between zinc and magnesium and/or potassium. PBGS can be classified into two groups based on differences in their active site metal binding site. The eukaryotic PBGSs represented by this model, which contain a cysteine-rich zinc binding motif (DXCXCX(Y/F)X3G(H/Q)CG), require zinc for their activity, they do not contain an additional allosteric metal binding site and do not bind magnesium.	0
412321	cl00339	SugarP_isomerase	N/A. This family contains several enzymes which take part in pathways involving acetyl-CoA. Acetyl-CoA hydrolase EC:3.1.2.1 catalyzes the formation of acetate from acetyl-CoA, CoA transferase (CAT1) EC:2.8.3.- produces succinyl-CoA, and acetate-CoA transferase EC:2.8.3.8 utilizes acyl-CoA and acetate to form acetyl-CoA.	0
412322	cl00340	ILVD_EDD	Dehydratase family. This protein, dihydroxy-acid dehydratase, catalyzes the fourth step in valine and isoleucine biosynthesis. It contains a catalytically essential [4Fe-4S] cluster. This model generates scores of up to 150 bits vs. 6-phosphogluconate dehydratase, a homologous enzyme. [Amino acid biosynthesis, Pyruvate family]	0
412323	cl00341	IGPD	Imidazoleglycerol-phosphate dehydratase. imidazoleglycerol-phosphate dehydratase; Provisional	0
294246	cl00342	Trp-synth-beta_II	N/A. Members of this family include SbnA, a protein of the staphyloferrin B biosynthesis operon of Staphylococcus aureus. SbnA and SbnB together appear to synthesize 2,3-diaminopropionate, a precursor of certain siderophores and other secondary metabolites. SbnA is a pyridoxal phosphate-dependent enzyme. [Cellular processes, Biosynthesis of natural products]	0
412324	cl00344	PRA-CH	Phosphoribosyl-AMP cyclohydrolase. phosphoribosyl-AMP cyclohydrolase; Reviewed	0
351044	cl00348	GCD2	Translation initiation factor 2B subunit, eIF-2B alpha/beta/delta family [Translation, ribosomal structure and biogenesis]. This model, eIF-2B_rel, describes half of a superfamily, where the other half consists of eukaryotic translation initiation factor 2B (eIF-2B) subunits alpha, beta, and delta. It is unclear whether the eIF-2B_rel set is monophyletic, or whether they are all more closely related to each other than to any eIF-2B subunit because the eIF-2B clade is highly derived. Members of this branch of the family are all uncharacterized with respect to function and are found in the Archaea, Bacteria, and Eukarya, although a number are described as putative translation initiation factor components. Proteins found by eIF-2B_rel include at least three clades, including a set of uncharacterized eukaryotic proteins, a set found in some but not all Archaea, and a set universal so far among the Archaea and closely related to several uncharacterized bacterial proteins. [Unknown function, General]	0
412325	cl00349	S15_NS1_EPRS_RNA-bind	N/A. 40S ribosomal protein S15; Provisional	0
412326	cl00350	Ribosomal_S19	Ribosomal protein S19. This model represents eukaryotic ribosomal protein uS19 (previously S15) and its archaeal equivalent. It excludes bacterial and organellar ribosomal protein S19. The nomenclature for the archaeal members is unresolved and given variously as S19 (after the more distant bacterial homologs) or S15. [Protein synthesis, Ribosomal proteins: synthesis and modification]	0
412327	cl00351	Ribosomal_S17	Ribosomal protein S17. This model describes the bacterial ribosomal small subunit protein S17, while excluding cytosolic eukaryotic homologs and archaeal homologs. The model finds many, but not all, chloroplast and mitochondrial counterparts to bacterial S17. [Protein synthesis, Ribosomal proteins: synthesis and modification]	0
412328	cl00352	PTH	N/A. Chloroplast RNA splicing 2 (CRS2) is a nuclear-encoded protein required for the splicing of group II introns in the chloroplast. CRS2 forms stable complexes with two CRS2-associated factors, CAF1 and CAF2, which are required for the splicing of distinct subsets of CRS2-dependent introns. CRS2 is closely related to bacterial peptidyl-tRNA hydrolases (PTH).	0
412329	cl00353	Ribosomal_L16_L10e	N/A. This model describes bacterial and organellar ribosomal protein L16. The homologous protein of the eukaryotic cytosol is designated L10 [Protein synthesis, Ribosomal proteins: synthesis and modification]	0
412330	cl00354	KOW	KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese). Ribosomal_L26 is a family of the 50S and the 60S ribosomal proteins from eukaryotes - L26 - and archaea - L25.	0
412331	cl00355	Ribosomal_S14	Ribosomal protein S14p/S29e. 30S ribosomal protein S14P; Reviewed	0
412332	cl00356	Ribosomal_L17	Ribosomal protein L17. Eubacterial and mitochondrial. The mitochondrial form, from yeast, contains an additional 110 amino acids C-terminal to the region found by this model. [Protein synthesis, Ribosomal proteins: synthesis and modification]	0
412333	cl00359	Ribosomal_L27	Ribosomal L27 protein. Eubacterial, chloroplast, and mitochondrial. Mitochondrial members have an additional C-terminal domain. [Protein synthesis, Ribosomal proteins: synthesis and modification]	0
412334	cl00360	5-FTHF_cyc-lig	5-formyltetrahydrofolate cyclo-ligase family. This enzyme, 5,10-methenyltetrahydrofolate synthetase, is also called 5-formyltetrahydrofolate cycloligase. Function of bacterial proteins in this family was inferred originally from the known activity of eukaryotic homologs. Recently, activity was shown explicitly for the member from Mycoplasma pneumoniae. Members of this family from alpha- and gamma-proteobacteria, designated ygfA, are often found in an operon with 6S structural RNA, and show a similar pattern of high expression during stationary phase. The function may be to deplete folate to slow 1-carbon biosynthetic metabolism. [Central intermediary metabolism, One-carbon metabolism]	0
412335	cl00361	Transcrip_reg	Transcriptional regulator. This model describes a minimally characterized protein family, restricted to bacteria except for some eukaryotic sequences that have possible transit peptides. YebC from E. coli is crystallized, and PA0964 from Pseudomonas aeruginosa has been shown to be a sequence-specific DNA-binding regulatory protein. In silico analysis suggests a role in Holliday junction resolution. [Regulatory functions, DNA interactions]	0
412336	cl00365	F1-ATPase_gamma	mitochondrial ATP synthase gamma subunit. A small number of taxonomically diverse prokaryotic species, including Methanosarcina barkeri, have what appears to be a second ATP synthase, in addition to the normal F1F0 ATPase in bacteria and A1A0 ATPase in archaea. These enzymes use ion gradients to synthesize ATP, and in principle may run in either direction. This model represents the F1 gamma subunit of this apparent second ATP synthase.	0
412337	cl00366	PMSR	Peptide methionine sulfoxide reductase. methionine sulfoxide reductase A; Provisional	0
412338	cl00367	Ribosomal_L28	Ribosomal L28 family. This model describes bacterial and chloroplast forms of the 50S ribosomal protein L28, a polypeptide about 60 amino acids in length. Mitochondrial homologs differ substantially in architecture (e.g. SP|P36525 from Saccharomyces cerevisiae, which is 258 amino acids long) and are not included. [Protein synthesis, Ribosomal proteins: synthesis and modification]	0
412339	cl00368	Ribosomal_S16	Ribosomal protein S16. This model describes ribosomal S16 of bacteria and organelles. [Protein synthesis, Ribosomal proteins: synthesis and modification]	0
412340	cl00370	Ribosomal_L34	Ribosomal protein L34. 50S ribosomal protein L34; Reviewed	0
412341	cl00373	Ribosomal_S18	Ribosomal protein S18. This ribosomal small subunit protein is found in all eubacteria so far, as well as in chloroplasts. YER050C from Saccharomyces cerevisiae and a related protein from Caenorhabditis elegans appear to be homologous and may represent mitochondrial forms. The trusted cutoff is set high enough that these two candidate S18 proteins are not categorized automatically. [Protein synthesis, Ribosomal proteins: synthesis and modification]	0
412342	cl00376	Ribosomal_L10_P0	N/A. 60S acidic ribosomal protein P0; Provisional	0
412343	cl00377	Ribosomal_L31	Ribosomal protein L31. This family consists exclusively of bacterial (and organellar) 50S ribosomal protein L31. In some species, such as Bacillus subtilis, this protein exists in two forms (RpmE and YtiA), one of which (RpmE) contains a pair of motifs, CXC and CXXC, for binding zinc. [Protein synthesis, Ribosomal proteins: synthesis and modification]	0
412344	cl00379	Ribosomal_L18_L5e	N/A. This family includes the large subunit ribosomal proteins from bacteria, archaea, the mitochondria and the chloroplast. It does not include the 60S L18 or L5 proteins from Metazoa.	0
412345	cl00380	Ribosomal_L36	Ribosomal protein L36. 50S ribosomal protein L36; Validated	0
412346	cl00381	PNPOx/FlaRed_like	Pyridoxine 5'-phosphate (PNP) oxidase-like and flavin reductase-like proteins. Pyridoxamine 5'-phosphate oxidase is a FMN flavoprotein that catalyzes the oxidation of pyridoxamine-5-P (PMP) and pyridoxine-5-P (PNP) to pyridoxal-5-P (PLP). This entry contains several pyridoxamine 5'-phosphate oxidases, and related proteins.	0
412347	cl00382	Ribosomal_L21p	Ribosomal prokaryotic L21 protein. 50S ribosomal protein L21; Validated	0
412348	cl00383	Ribosomal_L33	Ribosomal protein L33. This model describes bacterial ribosomal protein L33 and its chloroplast and mitochondrial equivalents. [Protein synthesis, Ribosomal proteins: synthesis and modification]	0
412349	cl00384	Ribosomal_S20p	Ribosomal protein S20. ribosomal protein S20	0
412350	cl00386	BolA	BolA-like protein. transcriptional regulator BolA; Provisional	0
412351	cl00388	Thioredoxin_like	Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold. Thioredoxins are small enzymes that participate in redox reactions, via the reversible oxidation of an active centre disulfide bond.	0
412352	cl00389	SIS	N/A. SIS (Sugar ISomerase) domains are found in many phosphosugar isomerases and phosphosugar binding proteins. SIS domains are also found in proteins that regulate the expression of genes involved in synthesis of phosphosugars.	0
412353	cl00391	beta_CA	N/A. This family includes carbonic anhydrases as well as a family of non-functional homologs related to YbcF.	0
412354	cl00392	Ribosomal_L35p	Ribosomal protein L35. This ribosomal protein is found in bacteria and organelles only. It is not closely related to any eukaryotic or archaeal ribosomal protein. [Protein synthesis, Ribosomal proteins: synthesis and modification]	0
412355	cl00393	Ribosomal_L20	Ribosomal protein L20. ribosomal protein L20	0
412356	cl00394	HupF_HypC	HupF/HypC family. This protein is suggested to act as a chaperone for a hydrogenase large subunit, holding the precursor form before metallocenter nickel incorporation. [SS 12/31/03] More recently proposed additional function is to shuttle the iron atom that has been liganded at the HypC/HypD complex to the precursor of the large hydrogenase (HycE) subunit. Added metallochaperone and protein mod GO terms. [Protein fate, Protein folding and stabilization, Protein fate, Protein modification and repair]	0
412357	cl00395	FMT_core	Formyltransferase, catalytic core domain. This family includes the following members. Glycinamide ribonucleotide transformylase catalyzes the third step in de novo purine biosynthesis, the transfer of a formyl group to 5'-phosphoribosylglycinamide. Formyltetrahydrofolate deformylase produces formate from formyl- tetrahydrofolate. Methionyl-tRNA formyltransferase transfers a formyl group onto the amino terminus of the acyl moiety of the methionyl aminoacyl-tRNA. Inclusion of the following members is supported by PSI-blast. HOXX_BRAJA (P31907) contains a related domain of unknown function. PRTH_PORGI (P46071) contains a related domain of unknown function. Y09P_MYCTU (Q50721) contains a related domain of unknown function.	0
412358	cl00399	MoaE	N/A. This family contains the MoaE protein that is involved in biosynthesis of molybdopterin. Molybdopterin, the universal component of the pterin molybdenum cofactors, contains a dithiolene group serving to bind Mo. Addition of the dithiolene sulfurs to a molybdopterin precursor requires the activity of the converting factor. Converting factor contains the MoaE and MoaD proteins.	0
412359	cl00400	Fe-S_biosyn	Iron-sulphur cluster biosynthesis. Proteins in this subfamily appear to be associated with the process of FeS-cluster assembly. The HesB proteins are associated with the nif gene cluster and the Rhizobium gene IscN has been shown to be required for nitrogen fixation. Nitrogenase includes multiple FeS clusters and many genes for their assembly. The E. coli SufA protein is associated with SufS, a NifS homolog and SufD which are involved in the FeS cluster assembly of the FhnF protein. The Azotobacter protein IscA (homologs of which are also found in E. coli) is associated with IscS, another NifS homolog and IscU, a nifU homolog as well as other factors consistent with a role in FeS cluster chemistry. A homolog from Geobacter contains a selenocysteine in place of an otherwise invariant cysteine, further suggesting a role in redox chemistry. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	0
412360	cl00402	UPF0054	Uncharacterized protein family UPF0054. This metalloprotein family is represented by a single member sequence only in nearly every bacterium. Crystallography demonstrated metal-binding activity, possibly to nickel. It is predicted to be a metallohydrolase, and more recently it was shown that mutants have a ribosomal RNA processing defect. [Protein synthesis, Other]	0
412361	cl00406	Ribosomal_L19	Ribosomal protein L19. 50S ribosomal protein L19; Provisional	0
412362	cl00407	tRNA_m1G_MT	tRNA (Guanine-1)-methyltransferase. tRNA (guanine-N(1)-)-methyltransferase; Reviewed	0
412363	cl00410	G3P_acyltransf	Glycerol-3-phosphate acyltransferase. This model represents the full length of acylphosphate:glycerol 3-phosphate acyltransferase, an integral membrane protein about 200 amino acids in length, called PlsY in Streptococcus pneumoniae, YneS in Bacillus subtilis, and YgiH in E. coli. It is found in a single copy in a large number of bacteria, including the Mycoplasmas but not Mycobacteria or spirochetes, for example. Its partner is PlsX (see TIGR00182), and the pair can replace PlsB for synthesizing 1-acylglycerol-3-phosphate. [Fatty acid and phospholipid metabolism, Biosynthesis]	0
412364	cl00412	P-II	Nitrogen regulatory protein P-II. This family of proteins with unknown function appears to be restricted to Proteobacteria.	0
412365	cl00413	ATP-synt_A	ATP synthase A chain. Bacterial forms should be designated ATP synthase, F0 subunit A; eukaryotic (chloroplast and mitochondrial) forms should be designated ATP synthase, F0 subunit 6. The F1/F0 ATP synthase is a multisubunit, membrane associated enzyme found in bacteria, mitochondria and chloroplasts. This enzyme is principally involved in the synthesis of ATP from ADP and inorganic phosphate by coupling the energy derived from the proton electrochemical gradient across the biological membrane. A brief description of this multisubunit enzyme complex: F1 and F0 represent two major clusters of subunits. Individual subunits in each of these clusters are named differently in prokaryotes and in organelles, e.g., mitochondria and chloroplasts. The bacterial equivalent of subunit 6 is named subunit 'A'. It has been shown that protons are conducted through this subunit. Typically, deprotonation and reprotonation of the acidic amino acid side-chains are implicated in the process. [Energy metabolism, ATP-proton motive force interconversion]	0
412366	cl00414	bS6	Bacterial ribosomal protein S6. bS6 is one of the components of the small subunit of the prokaryotic ribosome, a ribonucleoprotein organelle that decodes the genetic information in messenger RNA and forms peptide bonds to synthesize the corresponding polypeptides. Mitochondrial and chloroplastic ribosomes are similar to bacterial ribosomes. Ribosomes consist of a large and a small subunit, which assemble during the initiation stage of protein synthesis. Prokaryotic ribosomes consist of three molecules of RNA and more than 50 proteins. The small subunits of bacterial and eukaryotic ribosomes have the same overall shapes (with structural elements described as head, body, platform, beak and shoulder). The bacterial ribosomal protein S6 is important for the assembly of the central domain of the small subunit via heterodimerization with ribosomal protein S18.	0
412367	cl00415	CobS	Cobalamin-5-phosphate synthase. cobalamin synthase; Reviewed	0
412368	cl00416	CS_ACL-C_CCL	N/A. This is the long, C-terminal part of the enzyme.	0
412369	cl00420	NadA	Quinolinate synthetase A protein. This protein, termed NadA, plays a role in the synthesis of pyridine, a precursor to NAD. The quinolinate synthetase complex consists of A protein (this protein) and B protein. B protein converts L-aspartate to iminoaspartate, an unstable reaction product which in the absence of A protein is spontaneously hydrolyzed to form oxaloacetate. The A protein, NadA, converts iminoaspartate to quinolinate. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridine nucleotides]	0
412370	cl00424	UPF0014	Uncharacterized protein family (UPF0014). [Hypothetical proteins, Conserved]	0
412371	cl00425	CofD_YvcK	Family of CofD-like proteins and proteins related to YvcK. Members of this family are distantly related to CofD, the enzyme LPPG:FO 2-phospho-L-lactate transferase, involved in coenzyme F420 biosynthesis. This family appears to belong to a biosynthesis cassette of unknown function.	0
412372	cl00426	YbjQ_1	Putative heavy-metal-binding. hypothetical protein; Provisional	0
412373	cl00427	TM_PBP2	N/A. The alignments cover the most conserved region of the proteins, which is thought to be located in a cytoplasmic loop between two transmembrane domains. The members of this family have a variable number of transmembrane helices.	0
412374	cl00429	SNARE_assoc	SNARE associated Golgi protein. This is a family of SNARE associated Golgi proteins. The yeast member of this family localizes with the t-SNARE Tlg2.	0
412375	cl00431	Pmp3	Proteolipid membrane potential modulator. Pmp3 is an evolutionarily conserved proteolipid in the plasma membrane which, in S. pombe, is transcriptionally regulated by the Spc1 stress MAPK (mitogen-activated protein kinases) pathway. It functions to modulate the membrane potential, particularly to resist high cellular cation concentration. In eukaryotic organisms, stress-activated mitogen-activated protein kinases play crucial roles in transmitting environmental signals that will regulate gene expression for allowing the cell to adapt to cellular stress. Pmp3-like proteins are highly conserved in bacteria, yeast, nematode and plants.	0
412376	cl00436	SirA_YedF_YeeD	N/A. Members of this family of hypothetical bacterial proteins have no known function.	0
412377	cl00437	Zip	ZIP Zinc transporter. The Zinc (Zn2+)-Iron (Fe2+) Permease (ZIP) Family (TC 2.A.5). Members of the ZIP family consist of proteins with eight putative transmembrane spanners. They are derived from animals, plants and yeast. They comprise a diverse family, with several paralogues in any one organism (e.g., at least five in Caenorhabditis elegans, at least five in Arabidopsis thaliana and two in Saccharomyces cerevisiae). The two S. cerevisiae proteins, Zrt1 and Zrt2, both probably transport Zn2+ with high specificity, but Zrt1 transports Zn2+ with ten-fold higher affinity than Zrt2. Some members of the ZIP family have been shown to transport Zn2+ while others transport Fe2+, and at least one transports a range of metal ions. The energy source for transport has not been characterized, but these systems probably function as secondary carriers. [Transport and binding proteins, Cations and iron carrying compounds]	0
412378	cl00438	FMN_red	NADPH-dependent FMN reductase. This is a family of flavodoxins. Flavodoxins are electron transfer proteins that carry a molecule of non-covalently bound FMN.	0
412379	cl00439	UPF0047	Uncharacterized protein family UPF0047. Members of this protein family have been studied extensively by crystallography. Members from several different species have been shown to have sufficient thiamin phosphate synthase activity (EC 2.5.1.3) to complement thiE mutants. However, it is presumed that this is a secondary activity, and the primary function of this enzyme remains unknown. [Unknown function, Enzymes of unknown specificity]	0
412380	cl00445	Iso_dh	Isocitrate/isopropylmalate dehydrogenase. Tartrate dehydrogenase catalyzes the oxidation of both meso- and (+)-tartrate as well as a D-malate. These enzymes are closely related to the 3-isopropylmalate and isohomocitrate dehydrogenases found in TIGR00169 and TIGR02088, respectively. [Energy metabolism, Other]	0
412381	cl00447	Nudix_Hydrolase	N/A. This domain family consists of uncharacterized proteins around 175 residues in length and is mainly found in various Streptomyces species. The function of this family is unknown. This family is related to the NUDIX hydrolases.	0
412382	cl00448	SurE	Survival protein SurE. This protein family originally was named SurE because of its role in stationary phase survival in Escherichia coli. In E. coli, surE is next to pcm, an L-isoaspartyl protein repair methyltransferase that is also required for stationary phase survival. Recent work shows that viewing SurE as an acid phosphatase (3.1.3.2) is not accurate. Rather, SurE in E. coli, Thermotoga maritima, and Pyrobaculum aerophilum acts strictly on nucleoside 5'- and 3'-monophosphates. E. coli SurE is Recommended cutoffs are 15 for homology, 40 for probable orthology, and 200 for orthology with full-length homology. [Cellular processes, Adaptations to atypical conditions]	0
412383	cl00451	MoCF_BD	N/A. This domain is found in a variety of proteins involved in biosynthesis of molybdopterin cofactor. The domain is presumed to bind molybdopterin. The structure of this domain is known, and it forms an alpha/beta structure. In the known structure of Gephyrin this domain mediates trimerisation.	0
412384	cl00452	AAK	N/A. This family includes kinases that phosphorylate a variety of amino acid substrates, as well as uridylate kinase and carbamate kinase. This family includes: Aspartokinase EC:2.7.2.4. Acetylglutamate kinase EC:2.7.2.8. Glutamate 5-kinase EC:2.7.2.11. Uridylate kinase EC:2.7.4.-. Carbamate kinase EC:2.7.2.2.	0
412385	cl00453	CDP-OH_P_transf	CDP-alcohol phosphatidyltransferase. Alternate names: phosphatidylglycerophosphate synthase; glycerophosphate phosphatidyltransferase; PGP synthase. A number of related enzymes are quite similar in both sequence and catalytic activity, including Saccharomyces cerevisiae YDL142c, now known to be a cardiolipin synthase. There may be problems with incorrect transitive annotation of near homologs as authentic CDP-diacylglycerol--glycerol-3-phosphate 3-phosphatidyltransferase. [Fatty acid and phospholipid metabolism, Biosynthesis]	0
412386	cl00454	TM_PBP1_branched-chain-AA_like	N/A. This is a large family mainly comprising high-affinity branched-chain amino acid transporter proteins such as E. coli LivH and LivM, both of which form the LIV-I transport system. Also found within this family are proteins from the galactose transport system permease and a ribose transport system.	0
382020	cl00456	SLC5-6-like_sbd	Solute carrier families 5 and 6-like; solute binding domain. This transmembrane region is found in many amino acid transporters including UNC-47 and MTR. UNC-47 encodes a vesicular amino butyric acid (GABA) transporter, (VGAT). UNC-47 is predicted to have 10 transmembrane domains. MTR is a N system amino acid transporter system protein involved in methyltryptophan resistance. Other members of this family include proline transporters and amino acid permeases.	0
412387	cl00457	Ribonuclease_P	Ribonuclease P. ribonuclease P; Reviewed	0
412388	cl00458	Peptidase_A8	Signal peptidase (SPase) II. Alternate name: lipoprotein signal peptidase [Protein fate, Protein and peptide secretion and trafficking]	0
412389	cl00459	MIT_CorA-like	metal ion transporter CorA-like divalent cation transporter superfamily. The CorA transport system is the primary Mg2+ influx system of Salmonella typhimurium and Escherichia coli. CorA is virtually ubiquitous in the Bacteria and Archaea. There are also eukaryotic relatives of this protein. The family includes the MRS2 protein from yeast that is thought to be an RNA splicing protein. However its membership of this family suggests that its effect on splicing is due to altered magnesium levels in the cell.	0
412390	cl00460	CMD	Carboxymuconolactone decarboxylase family. PA26 is a p53-inducible protein. Its function is unknown. It has similarity to pfam04636 in its N-terminus.	0
412391	cl00463	CbiQ	Cobalt transport protein. This model represents the CbiQ component of the cobalt-specific ECF-type. CbiQ is now recognized as the T component of energy-coupling factor (ECF)-type transporters. The S component confers specificity (CbiM-N for cobalt systems), while CbiO is the ABC-family ATPase. In general, proteins found by this model reside next to the other putative subunits of the complex, identified as CbiN, CbiO, or CbiM. Note that the designation of cobalt transporter has been spread excessively among ECF system transporters with many other specificities. [Transport and binding proteins, Cations and iron carrying compounds]	0
412392	cl00464	URO-D_CIMS_like	N/A. The N-terminal domain and C-terminal domains of cobalamin-independent synthases together define a catalytic cleft in the enzyme. The N-terminal domain is thought to bind the substrate, in particular, the negatively charged polyglutamate chain. The N-terminal domain is also thought to stabilize a loop from the C-terminal domain.	0
294317	cl00465	AI-2E_transport	AI-2E family transporter. Three lines of evidence show this protein to be involved in sporulation. First, it is under control of a sporulation-specific sigma factor, sigma-E. Second, mutation leads to a sporulation defect. Third, it is found in exactly those genomes whose bacteria are capable of sporulation, except for being absent in Clostridium acetobutylicum ATCC824. This protein has extensive hydrophobic regions and is likely an integral membrane protein. [Cellular processes, Sporulation and germination]	0
412393	cl00466	ATP-synt_C	ATP synthase subunit C. F0F1 ATP synthase subunit C; Provisional	0
412394	cl00467	Ntn_hydrolase	N/A. This family includes several hydrolases which cleave carbon-nitrogen bonds, other than peptide bonds, in linear amides. These include choloylglycine hydrolase (conjugated bile acid hydrolase, CBAH) EC:3.5.1.24, penicillin acylase EC:3.5.1.11 and acid ceramidase EC:3.5.1.23. This domain forms the alpha-subunit for members from vertebrate species, see family NAAA-beta, pfam15508.	0
412395	cl00469	NADHdh	NADH dehydrogenase. NADH dehydrogenase subunit 1; Provisional	0
412396	cl00470	AKR_SF	Aldo-keto reductase (AKR) superfamily. This family includes a number of K+ ion channel beta chain regulatory domains - these are reported to have oxidoreductase activity.	0
412397	cl00473	BI-1-like	BAX inhibitor (BI)-1/YccA-like protein family. The Bax-inhibitor-1 region of the receptor molecules is conserved from bacteria to humans.	0
412398	cl00474	PAP2_like	N/A. This family is closely related to the C-terminal region of PAP2.	0
320993	cl00475	FTR1	Iron permease FTR1 family. A characterized member from yeast acts as oxidase-coupled high affinity iron transporter. Note that the apparent member from E. coli K12-MG1655 has a frameshift by homology with member sequences from other species. [Unknown function, General]	0
412399	cl00477	H2MP	N/A. The family consists of hydrogenase maturation proteases. In E. coli HypI the hydrogenase maturation protease is involved in processing of HypE the large subunit of hydrogenases 3, by cleavage of its C-terminal.	0
412400	cl00478	LGT	Prolipoprotein diacylglyceryl transferase. The conversion of lipoprotein precursors into lipoproteins consists of three steps. First, the enzyme described by this model transfers a diacylglyceryl moiety from phosphatidylglycerol to the side chain of a Cys that will become the new N-terminus. Second, the signal peptide is removed by signal peptidase II. Finally, the free amino group of the new N-terminal Cys is acylated by apolipoprotein N-acyltransferase. [Protein fate, Protein modification and repair]	0
412401	cl00480	RraA-like	Aldolase/RraA. hypothetical protein; Validated	0
412402	cl00481	SecE	SecE/Sec61-gamma subunits of protein translocation complex. This model represents exclusively the bacterial (and some organellar) SecE protein. SecE is part of the core heterotrimer, SecYEG, of the Sec preprotein translocase system. Other components are the ATPase SecA, a cytosolic chaperone SecB, and an accessory complex of SecDF and YajC. [Protein fate, Protein and peptide secretion and trafficking]	0
412403	cl00482	SmpB	Small protein B (SmpB) is a component of the trans-translation system in prokaryotes for releasing stalled ribosome from damaged messenger RNAs. This model describes the SsrA-binding protein, also called tmRNA binding protein, small protein B, and SmpB. The small, stable RNA SsrA (also called tmRNA or 10Sa RNA) recognizes stalled ribosomes such as occur during translation from message that lacks a stop codon. It becomes charged with Ala like a tRNA, then acts as mRNA to resume translation started with the defective mRNA. The short C-terminal peptide tag added by the SsrA system marks the abortively translated protein for degradation. SmpB binds SsrA after its aminoacylation but before the coupling of the Ala to the nascent polypeptide chain and is an essential part of the SsrA peptide tagging system. SmpB has been associated with the survival of bacterial pathogens in conditions of stress. It is universal in the first 100 sequenced bacterial genomes. [Protein synthesis, Other]	0
412404	cl00483	UDG-like	uracil-DNA glycosylases (UDG) and related enzymes. This family consists of uncharacterized proteins around 230 residues in length and is mainly found in various Listeria species. The function of this family is unknown.	0
412405	cl00485	LacAB_rpiB	Ribose/Galactose Isomerase. This family is a member of the RpiB/LacA/LacB subfamily (TIGR00689) but lies outside the RpiB equivalog (TIGR01120) which is also a member of that subfamily. Ribose 5-phosphate isomerase is an essential enzyme of the pentose phosphate pathway; a pathway that appears to be present in the actinobacteria. The only candidates for ribose 5-phosphate isomerase in the Actinobacteria are members of this family.	0
412406	cl00489	60KD_IMP	60Kd inner membrane protein. This model describes full-length from some species, and the C-terminal region only from other species, of the YidC/Oxa1 family of proteins. This domain appears to be universal among bacteria (although absent from Archaea). The well-characterized YidC protein from Escherichia coli and its close homologs contain a large N-terminal periplasmic domain in addition to the region modeled here. [Protein fate, Protein and peptide secretion and trafficking]	0
412407	cl00490	EEP	Exonuclease-Endonuclease-Phosphatase (EEP) domain superfamily. This domain represents the endonuclease region of retrotransposons from a range of bacteria, archaea and eukaryotes. These are enzymes largely from class EC:2.7.7.49.	0
412408	cl00492	Oxidored_q2	NADH-ubiquinone/plastoquinone oxidoreductase chain 4L. [Transport and binding proteins, Cations and iron carrying compounds]	0
412409	cl00493	trimeric_dUTPase	Trimeric dUTP diphosphatases. dUTPase hydrolyzes dUTP to dUMP and pyrophosphate.	0
412410	cl00494	YbaB_DNA_bd	YbaB/EbfC DNA-binding family. The function of this protein is unknown, but it has been expressed and crystallized. Its gene nearly always occurs next to recR and/or dnaX. It is restricted to Bacteria and the plant Arabidopsis. The plant form contains an additional N-terminal region that may serve as a transit peptide and shows a close relationship to the cyanobacterial member, suggesting that it is a chloroplast protein. Members of this family are found in a single copy per bacterial genome, but are broadly distributed. A member is present even in the minimal gene complement of Mycoplasma genitalium. [Unknown function, General]	0
412411	cl00495	Glu-tRNAGln	Glu-tRNAGln amidotransferase C subunit. This model represents a small family related to GatC, the third subunit of an enzyme for completing the charging of tRNA(Gln) by amidating the Glu-tRNA(Gln). The few known archaea that contain a member of this family appear to produce Asn-tRNA(Asn) by an analogous amidotransferase reaction. This protein is proposed to substitute for GatC in the charging of both tRNAs.	0
321006	cl00497	CxxCxxCC	Putative zinc- or iron-chelating domain. This family of proteins contains 8 conserved cysteines. It has in the past been annotated as being one of the complex of proteins of the flagellar Fli complex. However this was due to a mis-annotation of the original Salmonella LT2 Genbank entry of 'fliB'. With all its conserved cysteines it is possibly a domain that chelates iron or zinc ions.	0
412412	cl00500	ACPS	4'-phosphopantetheinyl transferase superfamily. This model models a domain active in transferring the phosphopantetheine prosthetic group to its attachment site on enzymes and carrier proteins. Many members of this family are small proteins that act on the acyl carrier protein involved in fatty acid biosynthesis. Some members are domains of larger proteins involved in specialized pathways for the synthesis of unusual molecules including polyketides, atypical fatty acids, and antibiotics. [Protein fate, Protein modification and repair]	0
321008	cl00504	Cytochrom_C_asm	Cytochrome C assembly protein. Members of this protein family represent one of two essential proteins of system II for c-type cytochrome biogenesis. Additional proteins tend to be part of the system but can be replaced by chemical reductants such as dithiothreitol. This protein is designated CcsB in Bordetella pertussis and some other bacteria, resC in Bacillus (where there is additional N-terminal sequence), and CcsA in chloroplast. We use the CcsB designation here. Member sequences show regions of strong sequence conservation and variable-length, poorly conserved regions in between; sparsely filled columns were removed from the seed alignment prior to model construction. [Energy metabolism, Electron transport, Protein fate, Protein modification and repair]	0
412413	cl00505	DHQase_II	N/A. 3-dehydroquinate dehydratase; Reviewed	0
412414	cl00506	Haemolytic	Haemolytic domain. This model describes a family, YidD, of small, non-essential proteins now suggested to improve YidC-dependent inner membrane protein insertion. A related protein is found in the temperate phage HP1 of Haemophilus influenzae. Annotation of some members of this family as hemolysins appears to represent propagation from an unpublished GenBank submission, L36462, attributed to Aeromonas hydrophila but a close match to E. coli. [Hypothetical proteins, Conserved]	0
412415	cl00508	YGGT	YGGT family. This family consists of a repeat found in conserved hypothetical integral membrane proteins. The function of this region and the proteins which possess it is unknown.	0
412416	cl00509	hot_dog	N/A. This is the dehydratase domain of polyketide synthases. Structural analysis shows these DH domains are double hotdogs in which the active site contains a histidine from the N-terminal hotdog and an aspartate from the C-terminal hotdog. Studies have uncovered that a substrate tunnel formed between the DH domains may be essential for loading substrates and unloading products.	0
412417	cl00510	MlaE	Permease MlaE. This model describes a subfamily of ABC transporter permease subunits. One member of this family has been associated with the toluene tolerance phenotype of Pseudomonas putida, another with L-glutamate transport, another with maintenance of lipid asymmetry. Many bacterial species have one or two members. The Mycobacteria have large paralogous families included in the DUF140 family but excluded from this subfamily based on extreme divergence at the amino end and on phylogenetic and UPGMA trees on the more conserved regions. [Hypothetical proteins, Conserved]	0
294347	cl00511	FTSW_RODA_SPOVE	Cell cycle protein. This family consists of FtsW, an integral membrane protein with ten transmembrane segments. In general, it is one of two paralogs involved in peptidoglycan biosynthesis, the other being RodA, and is essential for cell division. All members of the seed alignment for this model are encoded in operons for the biosynthesis of UDP-N-acetylmuramoyl-pentapeptide, a precursor of murein (peptidoglycan). The FtsW designation is not used in endospore-forming bacteria (e.g. Bacillus subtilis), where the member of this family is designated SpoVE and three or more RodA/FtsW/SpoVE family paralogs are present. SpoVE acts in spore cortex formation and is dispensable for growth. Biological roles for FtsW in cell division include recruitment of penicillin-binding protein 3 to the division site. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan, Cellular processes, Cell division]	0
412418	cl00512	LpxC	UDP-3-O-acyl N-acetylglycosamine deacetylase. UDP-3-O-(R-3-hydroxymyristoyl)-GlcNAc deacetylase from E. coli, LpxC, was previously designated EnvA. This enzyme is involved in lipid-A precursor biosynthesis. It is essential for cell viability. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	0
412419	cl00514	Nitro_FMN_reductase	nitroreductase family protein. The nitroreductase family comprises a group of FMN- or FAD-dependent and NAD(P)H-dependent enzymes able to metabolize nitrosubstituted compounds.	0
412420	cl00518	Asp_Glu_race	Asp/Glu/Hydantoin racemase. This family consists of several bacterial and archaeal AroM proteins. In Escherichia coli the aroM gene is cotranscribed with aroL. The function of this family is unknown.	0
412421	cl00519	RsfS	Ribosomal silencing factor during starvation. This model describes a widely distributed family of bacterial proteins related to iojap from plants. It includes RsfS(YbeB) from E. coli. The gene iojap is a pattern-striping gene in maize, reflecting a chloroplast development defect in some cells. The conserved function of this protein is to silence ribosomes by binding the ribosomal large subunit and impairing joining with the small subunit in response to nutrient stress. Note that RsfS (starvation) is an author-endorsed change from the published symbol RsfA, which conflicted with previously published gene symbols. [Protein synthesis, Translation factors]	0
412422	cl00521	TatC	Sec-independent protein translocase protein (TatC). This model represents the TatC translocase component of the Sec-independent protein translocation system. This system is responsible for translocation of folded proteins, often with bound cofactors across the periplasmic membrane. A related model (TIGR01912) represents the archaeal clade of this family. TatC is often found in a gene cluster with the two other components of the system, TatA/E (TIGR01411) and TatB (TIGR01410). A model also exists for the Twin-arginine signal sequence (TIGR01409). [Protein fate, Protein and peptide secretion and trafficking]	0
412423	cl00522	GTP_cyclohydro2	N/A. GTP cyclohydrolase II catalyzes the first committed step in the biosynthesis of riboflavin.	0
412424	cl00523	Queuosine_synth	Queuosine biosynthesis protein. This model describes the enzyme for S-adenosylmethionine:tRNA ribosyltransferase-isomerase (QueA). QueA synthesizes Queuosine which is usually in the first position of the anticodon of tRNAs specific for asparagine, aspartate, histidine, and tyrosine. [Protein synthesis, tRNA and rRNA base modification]	0
412425	cl00526	DAGK_IM_like	Integral membrane diacylglycerol kinase and similar enzymes. This bacterial family of homo-trimeric integral membrane enzyme domains catalyzes the ATP-dependent phosphorylation of undecaprenol to undecaprenyl phosphate. They sit N-terminally to phosphatase domains that are members of the type 2 phosphatidic acid phosphatase superfamily, and the function of members of this domain architecture was determined to be undecaprenyl pyrophosphate phosphatases. The bi-functional enzymes might generate undecaprenyl phosphate via two mechanisms - the phosphorylation of undecaprenol or the cleavage of the terminal phosphate group of undecaprenyl pyrophosphate.	0
412426	cl00528	IscU_like	Iron-sulfur cluster scaffold-like proteins. This domain is found in NifU in combination with pfam01106. This domain is found in isolation in several bacterial species. The nif genes are responsible for nitrogen fixation. However this domain is found in bacteria that do not fix nitrogen, so it may have a broader significance in the cell than nitrogen fixation. These proteins appear to be scaffold proteins for iron-sulfur clusters.	0
412427	cl00529	Ribosomal_S21	Ribosomal protein S21. 30S ribosomal protein S21; Reviewed	0
412428	cl00530	UreD	UreD urease accessory protein. UreD is a urease accessory protein. Urease pfam00449 hydrolyzes urea into ammonia and carbamic acid. UreD is involved in activation of the urease enzyme via the UreD-UreF-UreG-urease complex and is required for urease nickel metallocenter assembly. See also UreF pfam01730, UreG pfam01495.	0
412429	cl00532	Urease_gamma	N/A. Urease is a nickel-binding enzyme that catalyzes the hydrolysis of urea to carbon dioxide and ammonia.	0
412430	cl00533	Urease_beta	N/A. This subunit is known as alpha in Helicobacter.	0
412431	cl00535	Oxidored_q4	NADH-ubiquinone/plastoquinone oxidoreductase, chain 3. NADH dehydrogenase subunit A; Validated	0
412432	cl00537	ExbD	Biopolymer transport protein ExbD/TolR. The model describes the inner membrane protein TolR, part of the TolR/TolQ complex that transduces energy from the proton-motive force, through TolA, to an outer membrane complex made up of TolB and Pal (peptidoglycan-associated lipoprotein). The complex is required to maintain outer membrane integrity, and defects may cause a defect in the import of some organic compounds in addition to the resulting morphologic. While several gene pairs homologous to tolR and tolQ may be found in a single genome, the scope of this model is set to favor finding only bona fide TolR, supported by operon structure as well as by score. [Transport and binding proteins, Other, Cellular processes, Pathogenesis]	0
412433	cl00538	MinE	Septum formation topological specificity factor MinE. cell division topological specificity factor MinE; Provisional	0
412434	cl00540	Asp_decarbox	Aspartate alpha-decarboxylase or L-aspartate 1-decarboxylase, a pyruvoyl group-dependent decarboxylase in beta-alanine production. Decarboxylation of aspartate is the major route of beta-alanine production in bacteria, and is catalyzed by the enzyme aspartate decarboxylase EC:4.1.1.11 which requires a pyruvoyl group for its activity. It is synthesized initially as a proenzyme which is then proteolytically cleaved to an alpha (C-terminal) and beta (N-terminal) subunit and a pyruvoyl group. This family contains both chains of aspartate decarboxylase.	0
412435	cl00541	PNPsynthase	N/A. Members of this family belong to the PdxJ family that catalyzes the condensation of 1-deoxy-d-xylulose-5-phosphate (DXP) and 1-amino-3-oxo-4-(phosphohydroxy)propan-2-one to form pyridoxine 5'-phosphate (PNP). This reaction is involved in de novo synthesis of pyridoxine (vitamin B6) and pyridoxal phosphate.	0
412436	cl00542	RBFA	Ribosome-binding factor A. ribosome-binding factor A; Provisional	0
412437	cl00546	POR	Pyruvate ferredoxin/flavodoxin oxidoreductase. This model represents the beta subunit of indolepyruvate ferredoxin oxidoreductase, an alpha(2)/beta(2) tetramer, as found in Pyrococcus furiosus and Methanobacterium thermoautotrophicum. Cofactors for the tetramer include TPP, 4Fe4S, and 3Fe-4S. It shows considerable sequence similarity to subunits of several other ketoacid oxidoreductases.	0
294369	cl00547	Branch_AA_trans	Branched-chain amino acid transport protein. The Branched Chain Amino Acid:Cation Symporter (LIVCS) Family (TC 2.A.26). Characterized members of this family transport all three of the branched chain aliphatic amino acids (leucine (L), isoleucine (I) and valine (V)). They function by a Na+ or H+ symport mechanism and display 12 putative transmembrane helical spanners. [Transport and binding proteins, Amino acids, peptides and amines]	0
412438	cl00548	Na_Ala_symp	Sodium:alanine symporter family. The Alanine or Glycine: Cation Symporter (AGCS) Family (TC 2.A.25). Members of the AGCS family transport alanine and/or glycine in symport with Na+ and/or H+.	0
412439	cl00549	ABC_membrane	ABC transporter transmembrane region. This family represents a unit of six transmembrane helices.	0
412440	cl00551	Acylphosphatase	Acylphosphatase. acylphosphatase; Provisional	0
412441	cl00552	UPF0146	Uncharacterized protein family (UPF0146). hypothetical protein; Provisional	0
412442	cl00553	DNase-RNase	Bifunctional nuclease. This family is a bifunctional nuclease, with both DNase and RNase activity. It forms a wedge-shaped dimer, with each monomer being triangular in shape. A large groove at the thick end of the wedge contains a possible active site.	0
412443	cl00554	Inos-1-P_synth	Myo-inositol-1-phosphate synthase. This is a family of myo-inositol-1-phosphate synthases. Inositol-1-phosphate synthase catalyzes the conversion of glucose-6-phosphate to inositol-1-phosphate, which is then dephosphorylated to inositol. Inositol phosphates play an important role in signal transduction.	0
412444	cl00555	SAF	Domains similar to fish antifreeze type III protein. ChapFlgA is a family similar to the SAF family, and includes chaperones for flagellar basal-body proteins and pilus-assembly proteins, FlgA, RcpB and CpaB. ChapFlgA is necessary for the formation of the P-ring of the flagellum, FlgI, which sits in the peptidoglycan layer of the outer membrane of the bacterium. FlgA plays an auxiliary role in P-ring assembly.	0
412445	cl00558	Abi	CAAX protease self-immunity. The CAAX prenyl protease, in eukaryotes, catalyzes three covalent modifications, including cleavage and acylation, at the C-terminus of certain proteins in a process connected to protein sorting. This family describes a bacterial protein family homologous to one domain of the CAAX-processing enzyme. Members of this protein family are found in genomes that carry a predicted protein sorting system, PEP-CTERM/exosortase, usually in the vicinity of the EpsH homolog that is the hallmark of the system. The function of this protein is unknown, but it may relate to protein modification. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	0
412446	cl00559	PgpA	Phosphatidylglycerophosphatase A; a bacterial membrane-associated enzyme involved in lipid metabolism. This family represents a family of bacterial phosphatidylglycerophosphatases (EC:3.1.3.27), known as PgpA. It appears that bacteria possess several phosphatidylglycerophosphatases, and thus, PgpA is not essential in Escherichia coli.	0
412447	cl00561	CobD_Cbib	CobD/Cbib protein. AmpE is a family of bacterial regulatory proteins. AmpE in conjunction with AmpD sense the effect of beta-lactam on peptidoglycan synthesis and relay this signal to AmpR. AmpR regulates the production of beta-lactamase.	0
412448	cl00562	Cyt_bd_oxida_I	Cytochrome bd terminal oxidase subunit I. cytochrome bd-II oxidase subunit 1; Provisional	0
294381	cl00565	LysE	LysE type translocator. [Transport and binding proteins, Amino acids, peptides and amines]	0
412449	cl00567	Colicin_V	Colicin V production protein. colicin V production protein; Provisional	0
412450	cl00568	MotA_ExbB	MotA/TolQ/ExbB proton channel family. The MotA protein, along with its partner MotB, comprise the stator complex of the bacterial flagellar motor. MotAB span the cytoplasmic membrane and undergo conformational changes powered by the translocation of protons. These conformational changes in turn are communicated to the rotor assembly, producing torque. This model represents one family of MotA proteins which are often not identified by the "transporter, MotA/TolQ/ExbB proton channel family" model, pfam01618.	0
412451	cl00569	BCCT	BCCT, betaine/carnitine/choline family transporter. putative transporter; Provisional	0
412452	cl00570	AzlC	AzlC protein. Overexpression of this gene results in resistance to a leucine analog, 4-azaleucine. The protein has 5 potential transmembrane motifs. It has been inferred, but not experimentally demonstrated, to be part of a branched-chain amino acid transport system. Commonly found in association with azlD. [Transport and binding proteins, Amino acids, peptides and amines]	0
412453	cl00572	SpoIIM	Stage II sporulation protein M. A comparative genome analysis of all sequenced genomes of shows a number of proteins conserved strictly among the endospore-forming subset of the Firmicutes. This predicted integral membrane protein is designated stage II sporulation protein M. [Cellular processes, Sporulation and germination]	0
412454	cl00573	SDF	Sodium:dicarboxylate symporter family. C4-dicarboxylate transporter DctA; Reviewed	0
412455	cl00574	Asp23	Asp23 family, cell envelope-related function. The alkaline shock protein Asp23 was identified as an alkaline shock protein that was expressed in a sigmaB-dependent manner in Staphylococcus aureus. Following an alkaline shock Asp23 accumulates in the soluble protein fraction of the S. aureus cell. Asp23 is one of the most abundant proteins in the cytosolic protein fraction of stationary S. aureus cells, with a copy-number of >25000 per cell. A second Asp23-family protein, AmaP, which is encoded within the asp23-operon, is required to localize Asp23 to the cell membrane. The overall function for the family is thus a cell envelope-related one in Gram-positive bacteria.	0
412456	cl00581	LytR_cpsA_psr	Cell envelope-related transcriptional attenuator domain. This model describes a domain of unknown function that is found in the predicted extracellular domain of a number of putative membrane-bound proteins. One of these proteins is psr, described as a penicillin binding protein 5 (PBP-5) synthesis repressor. Another is Bacillus subtilis LytR, described as a transcriptional attenuator of itself and the LytABC operon, where LytC is N-acetylmuramoyl-L-alanine amidase. A third is CpsA, a putative regulatory protein involved in exocellular polysaccharide biosynthesis. Besides the region of strong similarity represented by this model, these proteins share the property of having a short putative N-terminal cytoplasmic domain and transmembrane domain forming a signal-anchor. [Regulatory functions, Other]	0
412457	cl00583	PhaG_MnhG_YufB	Na+/H+ antiporter subunit. putative monovalent cation/H+ antiporter subunit G; Reviewed	0
412458	cl00584	CutA1	CutA1 divalent ion tolerance protein. Several gene loci with a possible involvement in cellular tolerance to copper have been identified. One such locus in eubacteria and archaebacteria, cutA, is thought to be involved in cellular tolerance to a wide variety of divalent cations other than copper. The cutA locus consists of two operons, of one and two genes. The CutA1 protein is a cytoplasmic protein, encoded by the single-gene operon and has been linked to divalent cation tolerance. It has no recognized structural motifs. This family also contains putative proteins from eukaryotes (human and Drosophila).	0
412459	cl00585	RNA_binding	RNA binding. hypothetical protein; Provisional	0
412460	cl00588	CarD_CdnL_TRCF	CarD-like/TRCF domain. CarD is a Myxococcus xanthus protein required for the activation of light- and starvation-inducible genes. This family includes the presumed N-terminal domain. CarD interacts with the zinc-binding protein CarG, to form a complex that regulates multiple processes in Myxococcus xanthus. This family also includes a domain to the N-terminal side of the DEAD helicase of TRCF proteins. TRCF displaces RNA polymerase stalled at a lesion, binds to the damage recognition protein UvrA, and increases the template strand repair rate during transcription. This domain is involved in binding to the stalled RNA polymerase.	0
412461	cl00591	FlaG	FlaG protein. flagellar protein FlaG; Provisional	0
412462	cl00593	FliP	FliP family. type III secretion system protein YscR; Provisional	0
412463	cl00596	LrgB	LrgB-like family. Members of this small but broadly distributed (Gram-positive, Gram-negative, and Archaeal) family appear to have multiple transmembrane segments. The function is unknown. A homolog, LrgB of Staphylococcus aureus, in the same small superfamily but in an outgroup to this subfamily, is regulated by LytSR and is suggested to act as a murein hydrolase. Of the three paralogous proteins in B. subtilis, one is a full length member of this family, one lacks the C-terminal 60 residues and has an additional 128 N-terminal residues but branches within the family in a phylogenetic tree, and one is closely related to LrgB and part of the outgroup. [Hypothetical proteins, Conserved]	0
412464	cl00597	Rnf-Nqr	Rnf-Nqr subunit, membrane protein. electron transport complex RsxE subunit; Provisional	0
351169	cl00598	SMC_ScpA	Segregation and condensation protein ScpA. segregation and condensation protein A; Reviewed	0
412465	cl00599	Extradiol_Dioxygenase_3B_like	Subunit B of Class III Extradiol ring-cleavage dioxygenases. This family contains members from all branches of life. The molecular function of this protein is unknown, but Memo (mediator of ErbB2-driven cell motility) a human protein is included in this family. It has been suggested that Memo controls cell migration by relaying extracellular chemotactic signals to the microtubule cytoskeleton.	0
412466	cl00600	Ribosomal_L7Ae	Ribosomal protein L7Ae/L30e/S12e/Gadd45 family. This RNA binding Pelota domain is at the C-terminus of a PRTase family. These PRTase+Pelota genes are found in the biosynthetic operon associated with the Ter stress-response operon and are predicted to be involved in the biosynthesis of a ribo-nucleoside involved in stress response.	0
412467	cl00603	DmpA_OAT	N/A. Members of the ArgJ family catalyze the first EC:2.3.1.1 and fifth steps EC:2.3.1.35 in arginine biosynthesis.	0
412468	cl00604	STAS	Sulphate Transporter and Anti-Sigma factor antagonist domain found in the C-terminal region of sulphate transporters as well as in bacterial and archaeal proteins involved in the regulation of sigma factors. The STAS (after Sulphate Transporter and AntiSigma factor antagonist) domain is found in the C-terminal region of Sulphate transporters and bacterial antisigma factor antagonists. It has been suggested that this domain may have a general NTP binding function.	0
412469	cl00605	RNase_P_Rpp14	Rpp14/Pop5 family. ribonuclease P protein component 2; Provisional	0
412470	cl00606	Archease	Archease protein family (MTH1598/TM1083). This archease family of proteins, has two SHS2 domains, with one inserted into another. It is predicted to be an enzyme. It is predicted to act as a chaperone in DNA/RNA metabolism.	0
412471	cl00607	PUA	PUA domain. This uncharacterized domain is found in a number of enzymes and uncharacterized proteins, often at the C-terminus. It is found in some but not all members of a family of related tRNA-guanine transglycosylases (tgt), which exchange a guanine base for some modified base without breaking the phosphodiester backbone of the tRNA. It is also found in rRNA pseudouridine synthase, another enzyme of RNA base modification not otherwise homologous to tgt. It is found, again at the C-terminus, in two putative glutamate 5-kinases. It is also found in a family of small, uncharacterized archaeal proteins consisting mostly of this domain.	0
412472	cl00608	LrgA	LrgA family. hypothetical protein; Provisional	0
412473	cl00610	Ribosomal_S17e	Ribosomal S17. 40S ribosomal protein S17; Provisional	0
412474	cl00611	Methyltrans_RNA	RNA methyltransferase. This family is likely to be an S-adenosyl-L-methionine (SAM)-dependent RNA methyltransferase. It is responsible for N1-methylation of pseudouridine 54 in archaeal tRNAs.	0
412475	cl00612	SMC_ScpB	Segregation and condensation complex subunit ScpB. segregation and condensation protein B; Reviewed	0
412476	cl00613	ATP-synt_D	ATP synthase subunit D. V-type ATP synthase subunit D; Provisional	0
412477	cl00614	ADP_ribosyl_GH	ADP-ribosylglycohydrolase. Members of this family are the enzyme ADP-ribosyl-[dinitrogen reductase] hydrolase (EC 3.2.2.24), better known as Dinitrogenase Reductase Activating Glycohydrolase, DRAG. This enzyme reverses a regulatory inactivation of dinitrogen reductase caused by the action of NAD(+)--dinitrogen-reductase ADP-D-ribosyltransferase (EC 2.4.2.37) (DRAT). This enzyme is restricted to nitrogen-fixing bacteria and belongs to the larger family of ADP-ribosylglycohydrolases described by pfam03747. [Central intermediary metabolism, Nitrogen fixation]	0
294412	cl00615	Membrane-FADS-like	N/A. Beta-carotene hydroxylase (CrtR), the carotenoid zeaxanthin biosynthetic enzyme catalyzes the addition of hydroxyl groups to the beta-ionone rings of beta-carotene to form zeaxanthin and is found in bacteria and red algae. Carotenoids are important natural pigments; zeaxanthin and lutein are the only dietary carotenoids that accumulate in the macular region of the retina and lens. It is proposed that these carotenoids protect ocular tissues against photooxidative damage. CrtR does not show overall amino acid sequence similarity to the beta-carotene hydroxylases similar to CrtZ, an astaxanthin biosynthetic beta-carotene hydroxylase. However, CrtR does show sequence similarity to the green alga, Haematococcus pluvialis, beta-carotene ketolase (CrtW), which converts beta-carotene to canthaxanthin. Sequences of the CrtR_beta-carotene-hydroxylase domain family, as well as, the CrtW_beta-carotene-ketolase domain family appear to be structurally related to membrane fatty acid desaturases and alkane hydroxylases. They all share in common extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. Comparison of these sequences also reveals three regions of conserved histidine cluster motifs that contain eight histidine residues: HXXXH, HXXHH, and HXXHH. These histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within homologs, stearoyl CoA desaturase and alkane hydroxylase.	0
412478	cl00616	DUF177	Uncharacterized ACR, COG1399. This family is nearly universally conserved in bacteria and plants except the Chlorophyceae algae. Thus far, mutational analysis in bacteria has not established a function. In contrast, mutants have embryo lethal phenotypes in maize and Arabidopsis. In maize, the mutant embryos arrest at an early transition stage. It has been suggested that family members specifically affect 23S rRNA accumulation in plastids as well as bacteria.	0
412479	cl00617	SRP19	SRP19 protein. signal recognition particle protein Srp19; Provisional	0
412480	cl00618	Creatininase	Creatinine amidohydrolase. Members of this family are creatininase (EC 3.5.2.10), an amidohydrolase that interconverts creatinine + H(2)O with creatine. It should not be confused with creatinase (EC 3.5.3.3), which hydrolyzes creatine to sarcosine plus urea. [Central intermediary metabolism, Nitrogen metabolism]	0
412481	cl00620	DUF763	Protein of unknown function (DUF763). This family consists of several uncharacterized bacterial and archaeal proteins of unknown function.	0
412482	cl00622	Csm2_III-A	CRISPR/Cas system-associated protein Csm2. Clusters of short DNA repeats with non-homologous spacers, which are found at regular intervals in the genomes of phylogenetically distinct prokaryotic species, comprise a family with recognisable features. This family is known as CRISPR (short for Clustered Regularly Interspaced Short Palindromic Repeats). A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-associated) proteins. This entry represents Csm2 Type III-A, a family of Cas proteins also known as TM1810/Csm2.	0
412483	cl00625	BioW	6-carboxyhexanoate--CoA ligase. 6-carboxyhexanoate--CoA ligase; Provisional	0
412484	cl00627	DUF192	Uncharacterized ACR, COG1430. hypothetical protein; Provisional	0
412485	cl00628	Piwi-like	N/A. This domain is found in the protein Piwi and its relatives. The function of this domain is the dsRNA guided hydrolysis of ssRNA. Determination of the crystal structure of Argonaute reveals that PIWI is an RNase H domain, and identifies Argonaute as Slicer, the enzyme that cleaves mRNA in the RNAi RISC complex. In addition, Mg+2 dependence and production of 3'-OH and 5' phosphate products are shared characteristics of RNaseH and RISC. The PIWI domain core has a tertiary structure belonging to the RNase H family of enzymes. RNase H fold proteins all have a five-stranded mixed beta-sheet surrounded by helices. By analogy to RNase H enzymes which cleave single-stranded RNA guided by the DNA strand in an RNA/DNA hybrid, the PIWI domain can be inferred to cleave single-stranded RNA, for example mRNA, guided by double stranded siRNA.	0
412486	cl00630	YdcF-like	N/A. This large family of proteins contains several highly conserved charged amino acids, suggesting this may be an enzymatic domain (Bateman A pers. obs). The family includes SanA, which is involved in Vancomycin resistance. This protein may be involved in murein synthesis.	0
412487	cl00632	ATP-synt_F	ATP synthase (F/14-kDa) subunit. V-type ATP synthase subunit F; Provisional	0
412488	cl00635	Ntn_Asparaginase_2_like	L-Asparaginase type 2-like enzymes of the NTN-hydrolase superfamily. The wider family of Asparaginase 2-like enzymes includes Glycosylasparaginase, Taspase 1, and  L-Asparaginase type 2. Glycosylasparaginase catalyzes the hydrolysis of the glycosylamide bond of asparagine-linked glycoprotein. Taspase1 catalyzes the cleavage of the Mix Lineage Leukemia (MLL) nuclear protein and transcription factor TFIIA. L-Asparaginase type 2 hydrolyzes L-asparagine to L-aspartate and ammonia. The proenzymes of this family undergo autoproteolytic cleavage before a threonine to generate alpha and beta subunits. The threonine becomes the N-terminal residue of the beta subunit and is the catalytic residue.	0
412489	cl00638	RNA_pol_Rpb4	RNA polymerase Rpb4. This family includes the Rpb4 protein. This family also includes C17 (aka CGRP-RCP), an essential subunit of RNA polymerase III. C17 forms a subcomplex with C25 which is likely to be the counterpart of subcomplex Rpb4/7 in Pol II.	0
412490	cl00640	DHQS	3-dehydroquinate synthase II (EC 1.4.1.24). 3-Dehydroquinate synthase II was isolated from the archaeon Methanocaldococcus jannaschii and plays a key role in an alternative pathway for the biosynthesis of 3-dehydroquinate (DHQ), an intermediate of the canonical pathway for the biosynthesis of aromatic amino acids. The enzyme catalyzes a two-step reaction - an oxidative deamination, followed by cyclization. The enzyme converts 2-amino-3,7-dideoxy-D-threo-hept-6-ulosonate to 3-dehydroquinate.	0
412491	cl00641	Cas4_I-A_I-B_I-C_I-D_II-B	CRISPR/Cas system-associated protein Cas4. Members of this family belong to the PD-(D/E)XK nuclease superfamily	0
412492	cl00642	GCHY-1	Type I GTP cyclohydrolase folE2. GTP cyclohydrolase; Provisional	0
412493	cl00644	F420_ligase	F420-0:Gamma-glutamyl ligase. This protein family is related to CofE, a gamma-glutamyl ligase of coenzyme F420 biosynthesis. However, it occurs in a different gamma-glutamyl ligase context, polyglutamylated tetrahydrofolate biosynthesis-like regions in two widely separated lineages that both occur as intracellular bacteria - Chlamydia and Wolbachia.	0
412494	cl00647	SfsA	Sugar fermentation stimulation protein. probable regulatory factor involved in maltose metabolism contains a putative DNA binding domain. Isolated as a gene which enabled E.coli strain MK2001 to use maltose. [Energy metabolism, Sugars, Regulatory functions, Other]	0
412495	cl00649	DsbB	Disulfide bond formation protein DsbB. disulfide bond formation protein B; Provisional	0
412496	cl00650	Cu-oxidase_4	Multi-copper polyphenol oxidoreductase laccase. PSI-BLAST converges on members of this family of uncharacterized bacterial proteins and shows no significant similarity to any characterized protein. No completed genome to date has two members. Members of the family have been crystallized but the function is unknown. [Unknown function, General]	0
412497	cl00652	DUF501	Protein of unknown function (DUF501). Family of uncharacterized bacterial proteins.	0
412498	cl00653	Endonuclease_V	Endonuclease_V, a DNA repair enzyme that initiates repair of nitrosative deaminated purine bases. This domain is found in the C subunits of the bacterial and archaeal UvrABC system which catalyzes nucleotide excision repair in a multi-step process. UvrC catalyzes the first incision on the fourth or fifth phosphodiester bond 3' and on the eighth phosphodiester bond 5' from the damage that is to be excised. The domain described here is found to the N-terminus of a helix hairpin helix (pfam00633) motif and also co-occurs with the pfam01541 catalytic domain which is found at the N-terminus of the same proteins.	0
412499	cl00654	FliS	flagellar export chaperone FliS. FliS is coded for by the FliD operon and is transcribed in conjunction with FliD and FliT, however this protein has no known function.	0
412500	cl00656	Cas1_I-II-III	CRISPR/Cas system-associated protein Cas1. Clustered regularly interspaced short palindromic repeats (CRISPRs) are a family of DNA direct repeats found in many prokaryotic genomes. This family of proteins corresponds to Cas1, a CRISPR-associated protein. Cas1 may be involved in linking DNA segments to CRISPR.	0
412501	cl00659	FdhD-NarQ	FdhD/NarQ family. FdhD in E. coli and NarQ in B. subtilis are required for the activity of formate dehydrogenase. The gene name in B. subtilis reflects the requirement of the neighboring gene narA for nitrate assimilation, for which NarQ is not required. In some species, the gene is associated not with a known formate dehydrogenase but with a related putative molybdopterin-binding oxidoreductase. A reasonable hypothesis is that this protein helps prepare a required cofactor for assembly into the holoenzyme. [Energy metabolism, Anaerobic, Energy metabolism, Electron transport]	0
412502	cl00660	vATP-synt_AC39	ATP synthase (C/AC39) subunit. The A1/A0 ATP synthase is homologous to the V-type (V1/V0, vacuolar) ATPase, but functions in the ATP synthetic direction as does the F1/F0 ATPase of bacteria. The C subunit is part of the hydrophilic A1 "stalk" complex (AhaABCDEFG), which is the site of ATP generation and is coupled to the membrane-embedded proton translocating A0 complex.	0
412503	cl00661	DUF504	Protein of unknown function (DUF504). hypothetical protein; Provisional	0
412504	cl00662	RNA_bind_2	Predicted RNA-binding protein. Members of this family of bacterial proteins are thought to have RNA-binding properties, however, their exact function has not, as yet, been defined.	0
412505	cl00663	CRS1_YhbY	CRS1 / YhbY (CRM) domain. GFP fused to a single-domain CRM protein from maize localises to the nucleolus, suggesting that an analogous activity may have been retained in plants. A CRM domain containing protein in plant chloroplasts has been shown to function in group I and II intron splicing. In vitro experiments with an isolated maize CRM domain have shown it to have RNA binding activity. These and other results suggest that the CRM domain evolved in the context of ribosome function prior to the divergence of Archaea and Bacteria, that this function has been maintained in extant prokaryotes, and that the domain was recruited to serve as an RNA binding module during the evolution of plant genomes. YhbY has a fold similar to that of the C-terminal domain of translation initiation factor 3 (IF3C), which binds to 16S rRNA in the 30S ribosome.	0
412506	cl00666	CinA	Competence-damaged protein. CinA is a DNA damage- or competence-inducible protein that is polycistronic with recA in a number of species. Several bacterial species have a protein consisting largely of the C-terminal domain of CinA but lacking the N-terminal domain, including nicotinamide mononucleotide (NMN) deamidase (3.5.1.42) proteins PncC in Shewanella oneidensis and ygaD in E. coli. [DNA metabolism, DNA replication, recombination, and repair]	0
412507	cl00667	DUF309	Domain of unknown function (DUF309). This domain is found in eubacterial and archaebacterial proteins of unknown function. The proteins contain a motif HXXXEXX(W/Y) where X can be any amino acid. This motif is likely to be functionally important and may be involved in metal binding.	0
412508	cl00668	Hydantoinase_A	Hydantoinase/oxoprolinase. This protein family was identified, by the method of partial phylogenetic profiling, as related to the use of tetrahydromethanopterin (H4MPT) as a C-1 carrier. Characteristic markers of the H4MPT-linked C1 transfer pathway include formylmethanofuran dehydrogenase subunits, methenyltetrahydromethanopterin cyclohydrolase, etc. Tetrahydromethanopterin, a tetrahydrofolate analog, occurs in methanogenic archaea, bacterial methanotrophs, planctomycetes, and a few other lineages. [Central intermediary metabolism, One-carbon metabolism]	0
412509	cl00669	DUF503	Protein of unknown function (DUF503). Family of hypothetical bacterial proteins.	0
412510	cl00670	CsrA	Global regulator protein family. Modulates the expression of genes in the glycogen biosynthesis and gluconeogenesis pathways by accelerating the 5'-to-3' degradation of these transcripts through selective RNA binding. The N-terminal end of the sequence (AA 11-45) contains the KH motif which is characteristic of a set of RNA-binding proteins. [Energy metabolism, Glycolysis/gluconeogenesis, Regulatory functions, RNA interactions]	0
412511	cl00671	Ribosomal_L40e	Ribosomal L40e family. 50S ribosomal protein L40e; Provisional	0
412512	cl00672	DrsE	DsrE/DsrF-like family. DsrE is a small soluble protein involved in intracellular sulfur reduction. The family also includes YrkE proteins.	0
412513	cl00674	LUD_dom	LUD domain. This entry represents a domain found in lactate utilization proteins B (LutB) and C (LutC), as well as several uncharacterized proteins. LutB and LutC are encoded by the conserved LutABC operon in bacteria. They are involved in lactate utilization and are implicated in the oxidative conversion of L-lactate into pyruvate.	0
412514	cl00676	DUF4040	Domain of unknown function (DUF4040). Possible subunit of Na+/H+ antiporter. Predicted integral membrane protein, usually four transmembrane regions in this domain. Often found in bacterial NADH dehydrogenase subunit.	0
412515	cl00681	FliL	Flagellar basal body-associated protein FliL. flagellar basal body-associated protein FliL; Reviewed	0
412516	cl00682	Alba	Alba. The nuclear RNase P of Saccharomyces cerevisiae is made up of at least nine protein subunits; Pop1, Pop3, Pop4, Pop5, Pop6, Pop7, Pop8, Rpr2 and Rpp1. Many of these subunits seem to be present also in the RNase MRP, with the exception of Rpr2 (Rpp21) which is unique to RNase P. Human nuclear RNase P and MRP appear to contain at least 10 protein subunits, Rpp14, Rpp20, Rpp21, Rpp25, Rpp29, Rpp30, Rpp38, Rpp40, hPop1 and hPop5, although there is recent evidence that not all of these subunits are shared between P and MRP. Archaeal RNase P has at least four protein subunits homologous to eukaryotic RNase P/MRP proteins. In the yeast RNase P, Pop6 and Pop7 (the Rpp20 homolog) interact with each other and they are both interaction partners of Pop4; in the human MRP Rpp25 and Rpp20 interact with each other and Rpp25 binds to Rpp29 (Pop4).	0
412517	cl00683	FlbD	Flagellar protein (FlbD). This family consists of several bacterial FlbD flagellar proteins. The exact function of this family is unknown.	0
412518	cl00685	Grp1_Fun34_YaaH	GPR1/FUN34/yaaH family. Proteins of this family are acetate transporters, which usually have 6 transmembrane regions. The homologue in E. coli is YaaH.	0
412519	cl00686	NfeD	NfeD-like C-terminal, partner-binding. NfeD-like proteins are widely distributed throughout prokaryotes and are frequently associated with genes encoding stomatin-like proteins (slipins). There appear to be three major groups: an ancestral group with only an N-terminal serine protease domain and this C-terminal beta sheet-rich domain which is structurally very similar to the OB-fold domain, associated with its neighboring slipin cluster; a second major group with an additional middle, membrane-spanning domain, associated in some species with eoslipin and in others with yqfA; a final 'artificial' group which unites truncated forms lacking the protease region and associated with their ancestral gene partner, either yqfA or eoslipin. This NfeD C-terminal domain appears to be the major one for relating to the associated protein. NfeD homologs are clearly reliant on their conserved gene neighbor which is assumed to be necessary for function, either through direct physical interaction or by functioning in the same pathway, possibly involved with lipid rafts.	0
412520	cl00687	AdoMet_dc	S-adenosylmethionine decarboxylase. Members of this protein family are the single chain precursor of the S-adenosylmethionine decarboxylase as found in Escherichia coli. This form shows a substantially different architecture from the form shared by the Archaea, Bacillus, and many other species (TIGR03330). It shows little or no similarity to the form found in eukaryotes (TIGR00535). [Central intermediary metabolism, Polyamine biosynthesis]	0
412521	cl00688	UPF0086	Domain of unknown function UPF0086. ribonuclease P protein component 1; Validated	0
412522	cl00689	TYW3	Methyltransferase TYW3. hypothetical protein; Provisional	0
412523	cl00693	CM_2	Chorismate mutase type II. This model represents the plant and yeast (plastidic) chorismate mutase. These CM's are distinct from other forms by the presence of an extended regulatory domain. [Amino acid biosynthesis, Aromatic amino acid family]	0
412524	cl00698	CGI-121	Kinase binding protein CGI-121. CGI-121 has been shown to bind to the p53-related protein kinase (PRPK). PRPK is a novel protein kinase which binds to and induces phosphorylation of the tumor suppressor protein p53. CGI-121 is part of a conserved protein complex, KEOPS. The KEOPS complex is involved in telomere uncapping and telomere elongation. Interestingly, this family also includes archaeal homologs, formerly in the DUF509 family. A structure for these proteins has been solved by structural genomics.	0
412525	cl00700	Peptidase_S66	LD-Carboxypeptidase, a serine protease, includes microcin C7 self immunity protein. Muramoyl-tetrapeptide carboxypeptidase hydrolyzes a peptide bond between a di-basic amino acid and the C-terminal D-alanine in the tetrapeptide moiety in peptidoglycan. This cleaves the bond between an L- and a D-amino acid. The function of this activity is in murein recycling. This family also includes the microcin c7 self-immunity protein. This family corresponds to Merops family S66.	0
294462	cl00701	Lactate_perm	L-lactate permease. L-lactate permease; Provisional	0
412526	cl00706	Ribosomal_L44	Ribosomal protein L44. 60S ribosomal protein L36a; Provisional	0
412527	cl00711	Glyco_hydro_77	4-alpha-glucanotransferase. 4-alpha-glucanotransferase; Provisional	0
412528	cl00712	RNA_pol_N	RNA polymerases N / 8 kDa subunit. DNA-directed RNA polymerase subunit N; Provisional	0
412529	cl00713	Auto_anti-p27	Sjogren's syndrome/scleroderma autoantigen 1 (Autoantigen p27). hypothetical protein; Validated	0
412530	cl00716	tRNA_deacylase	D-aminoacyl-tRNA deacylase. hypothetical protein; Provisional	0
412531	cl00718	TOPRIM	N/A. The toprim domain is found in a wide variety of enzymes involved in nucleic acid manipulation.	0
412532	cl00720	DUF296	Domain of unknown function found in archaea, bacteria, and plants. This putative domain is found in proteins that contain AT-hook motifs pfam02178, which strongly suggests a DNA-binding function for the proteins as a whole. There are three highly conserved histidine residues, eg at 117, 119 and 133 in Reut_B5223, which should be a structurally conserved metal-binding unit, based on structural comparison with known metal-binding structures. The proteins should work as trimers.	0
294470	cl00721	DDE_Tnp_IS1	IS1 transposase. Transposase proteins are necessary for efficient DNA transposition. This family represents bacterial IS1 transposases.	0
412533	cl00723	YajQ_like	Proteins similar to Escherichia coli YajQ. Family of uncharacterized proteins.	0
260590	cl00724	DUF2226	Uncharacterized protein conserved in archaea (DUF2226). This domain, found in various hypothetical archaeal proteins, has no known function.	0
412534	cl00727	DUF188	Uncharacterized BCR, YaiI/YqxD family COG1671. 	0
412535	cl00728	EVE	EVE domain. hypothetical protein; Provisional	0
412536	cl00731	DUF179	Uncharacterized ACR, COG1678. 	0
412537	cl00732	Arch_flagellin	Archaebacterial flagellin. flagellin; Validated	0
412538	cl00733	DUF523	Protein of unknown function (DUF523). Family of uncharacterized bacterial proteins.	0
412539	cl00734	Bac_export_1	Bacterial export proteins, family 1. flagellar biosynthesis protein FliR; Reviewed	0
412540	cl00735	AzlD	Branched-chain amino acid transport protein (AzlD). This family consists of a number of bacterial and archaeal branched-chain amino acid transport proteins. AzlD is known to be involved in conferring resistance to 4-azaleucine although its exact role is uncertain.	0
382173	cl00738	MBOAT	MBOAT, membrane-bound O-acyltransferase family. Members of this protein family are DltB, part of a four-gene operon for D-alanyl-lipoteichoic acid biosynthesis that is present in the vast majority of low-GC Gram-positive organisms. This protein may be involved in transport of D-alanine across the plasma membrane. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan]	0
412541	cl00739	UPF0147	Uncharacterized protein family (UPF0147). hypothetical protein; Provisional	0
412542	cl00740	FliW	FliW protein. flagellar assembly protein FliW; Provisional	0
412543	cl00742	LemA	LemA family. The members of this family are related to the LemA protein. LemA contains an amino terminal predicted transmembrane helix. It has been predicted that the small amino terminus is extracellular. The exact molecular function of this protein is uncertain.	0
412544	cl00746	RDD	RDD family. This family of proteins contains three highly conserved amino acids: one arginine and two aspartates, hence the name of RDD family. This region contains two predicted transmembrane regions. The arginine occurs at the N-terminus of the first helix and the first aspartate occurs in the middle of this helix. The molecular function of this region is unknown. However, this region may be involved in transport of an as yet unknown set of ligands (Bateman A pers. obs.).	0
412545	cl00748	Ribosomal_L32_L32e	N/A. This family includes ribosomal protein L32 from eukaryotes and archaebacteria.	0
412546	cl00749	UPF0066	Escherichia coli YaeB and related proteins. This protein has been characterized by crystallography in complex with S-Adenosylmethionine, making it a probable S-adenosylmethionine-dependent methyltransferase. Analysis in EcoGene links this protein to the enzyme characterization mapped to the tsaA gene in Escherichia coli. [Unknown function, Enzymes of unknown specificity]	0
412547	cl00750	Exonuc_VII_S	Exonuclease VII small subunit. This protein is the small subunit for exodeoxyribonuclease VII. Exodeoxyribonuclease VII is made of a complex of four small subunits to one large subunit. The complex degrades single-stranded DNA into large acid-insoluble oligonucleotides. These nucleotides are then degraded further into acid-soluble oligonucleotides. [DNA metabolism, Degradation of DNA]	0
412548	cl00751	DUF155	Uncharacterized ACR, YagE family COG1723. 	0
412549	cl00752	HicA_toxin	HicA toxin of bacterial toxin-antitoxin. HicA_toxin is a bacterial family of toxins that act as mRNA interferases. The antitoxin that neutralizes this is family HicB, pfam15919.	0
412550	cl00753	DUF327	Protein of unknown function (DUF327). The proteins in this family are around 140-170 residues in length. The proteins contain many conserved residues, with the most conserved motifs found in the central and C-terminal region. The function of these proteins is unknown.	0
412551	cl00755	zf-dskA_traR	Prokaryotic dksA/traR C4-type zinc finger. Members of this predicted regulatory protein are found only in endospore-forming members of the Firmicutes group of bacteria, and in nearly every such species; Clostridium perfringens seems to be an exception. The member from Bacillus subtilis, the model system for the study of the sporulation program, has been designated both yteA and yzwB. Some (but not all) members of this family show a strong sequence match to Pfam family pfam01258 the C4-type zinc finger protein, DksA/TraR family, but only one of the four key Cys residues is conserved. All members of this protein family share an additional C-terminal domain. Smaller proteins from the proteobacteria with just the N-terminal domain, including DksA and DksA2 are RNA polymerase-binding regulatory proteins even if the Zn-binding site is not conserved. [Unknown function, General]	0
412552	cl00756	Vut_1	Putative vitamin uptake transporter. All known members of this family are proteins of 210-250 amino acids in length. Conserved regions of hydrophobicity suggest that all members of the family are integral membrane proteins. [Hypothetical proteins, Conserved]	0
242072	cl00757	UPF0060	Uncharacterized BCR, YnfA/UPF0060 family. 	0
412553	cl00759	UPF0058	Uncharacterized protein family UPF0058. This archaebacterial protein has no known function.	0
412554	cl00762	VAPB_antitox	Putative antitoxin. hypothetical protein; Provisional	0
412555	cl00764	EMG1	EMG1/NEP1 methyltransferase. Members of this family are essential for 40S ribosomal biogenesis. The structure of EMG1 has revealed that it is a novel member of the superfamily of alpha/beta knot fold methyltransferases.	0
412556	cl00767	OsmC	OsmC-like protein. pfam02566, OsmC-like protein, contains several deeply split clades of homologous proteins. The clade modeled here includes the protein OsmC, or osmotically induced protein C. The member from Thermus thermophilus was shown to have hydroperoxide peroxidase activity. In many species, this protein is induced by stress and helps resist oxidative stress. [Cellular processes, Detoxification]	0
412557	cl00768	CitG	ATP:dephospho-CoA triphosphoribosyl transferase. This protein acts in cofactor biosynthesis, preparing the coenzyme A derivative that becomes attached to the malonate decarboxylase acyl carrier protein (or delta subunit). The closely related protein CitG of citrate lyase produces the same molecule, but the two families are nonetheless readily separated. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	0
382191	cl00769	DUF531	Protein of unknown function (DUF531). Family of hypothetical archaeal proteins.	0
412558	cl00770	PSP1	PSP1 C-terminal conserved region. This region is present in both eukaryotes and eubacteria. The yeast PSP1 protein is involved in suppressing mutations in the DNA polymerase alpha subunit in yeast.	0
412559	cl00772	TctA	Tripartite tricarboxylate transporter TctA family. This family, formerly known as DUF112, is a family of bacterial and archaeal tripartite tricarboxylate transporters of the extracytoplasmic solute binding receptor-dependent transporter group of families, distinct from the ABC and TRAP-T families. TctA is part of the tripartite TctABC system which, as characterized in S. typhimurium, is a secondary carrier that depends for activity on the extracytoplasmic tricarboxylate-binding receptor TctC as well as two integral membrane proteins, TctA and TctB. Complete three-component systems are found only in bacteria. TctA is a large transmembrane protein with up to 12 predicted membrane spanning regions in bacteria and up to 11 such in archaea, with the N-terminal within the cytoplasm. TctA is thought to be a permease, and in most other bacteria functions without TctB and TctC molecules.	0
412560	cl00774	Fae	Formaldehyde-activating enzyme (Fae). This family consists of formaldehyde-activating enzyme, or the corresponding domain of longer, bifunctional proteins. It links formaldehyde to the C1 carrier tetrahydromethanopterin (H4MPT), an analog of tetrahydrofolate, and is common among species with H4MPT. The ribulose monophosphate (RuMP) pathway, which removes the toxic metabolite formaldehyde by assimilation, runs in the opposite direction in some species to produce ribulose 5-phosphate for nucleotide biosynthesis, leaving formaldehyde as an additional metabolite. In these species, formaldehyde activating enzyme may occur as a fusion protein with D-arabino 3-hexulose 6-phosphate formaldehyde lyase from the RuMP pathway.	0
412561	cl00775	SepF	Cell division protein SepF. SepF accumulates at the cell division site in an FtsZ-dependent manner and is required for proper septum formation. Mutants are viable but the formation of the septum is much slower and occurs with a very abnormal morphology. This family also includes archaeal related proteins of unknown function.	0
412562	cl00777	DUF72	Protein of unknown function DUF72. hypothetical protein; Provisional	0
412563	cl00779	NQR2_RnfD_RnfE	NQR2, RnfD, RnfE family. Na(+)-translocating NADH-quinone reductase subunit B; Provisional	0
412564	cl00780	Kinase-PPPase	Kinase/pyrophosphorylase. This family of regulatory proteins has ADP-dependent kinase and inorganic phosphate-dependent pyrophosphorylase activity.	0
412565	cl00781	DUF389	Domain of unknown function (DUF389). This conserved hypothetical protein is found so far only in three archaeal genomes and in Streptomyces coelicolor. It shares a hydrophobic uncharacterized domain (see TIGR00271) of about 180 residues with several eubacterial proteins, including the much longer protein sll1151 of Synechocystis PCC6803. [Hypothetical proteins, Conserved]	0
412566	cl00782	ComA	(2R)-phospho-3-sulfolactate synthase (ComA). This model finds the ComA (Coenzyme M biosynthesis A) protein, phosphosulfolactate synthase, in methanogenic archaea. The ComABC pathway is one of at least two pathways to the intermediate sulfopyruvate. Coenzyme M occurs rarely and sporadically outside of the archaea, as for epoxide metabolism in Xanthobacter autotrophicus Py2, but candidate phosphosulfolactate synthases from that and other species fall below the cutoff and outside the scope of this model. This model deliberately is narrower in scope than pfam02679. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other, Energy metabolism, Methanogenesis]	0
412567	cl00784	DUF554	Protein of unknown function (DUF554). Family of uncharacterized prokaryotic proteins. Multiple predicted transmembrane regions suggest that the region is membrane associated.	0
412568	cl00787	Ribosomal_L25_TL5_CTC	Ribosomal L25/TL5/CTC N-terminal 5S rRNA binding domain. Ribosomal protein L25 is an RNA binding protein, that binds 5S rRNA. This family includes Ctc from B. subtilis, which is induced by stress.	0
294511	cl00788	MttA_Hcf106	mttA/Hcf106 family. This model distinguishes TatA/E from the related TatB, but does not distinguish TatA from TatE. The Tat (twin-arginine translocation) system is a Sec-independent exporter for folded proteins, often with a redox cofactor already bound, across the bacterial inner membrane. Functionally equivalent systems are found in the chloroplast and some in archaeal species. The signal peptide recognized by the Tat system is modeled by TIGR01409. [Protein fate, Protein and peptide secretion and trafficking]	0
412569	cl00789	PurS	Phosphoribosylformylglycinamidine (FGAM) synthase. phosphoribosylformylglycinamidine synthase subunit PurS; Reviewed	0
412570	cl00793	DUF92	Integral membrane protein DUF92. [Hypothetical proteins, Conserved]	0
412571	cl00795	Fumerase_C	Fumarase C-terminus. L(+)-tartrate dehydratase subunit beta; Validated	0
412572	cl00796	Adenosine_kin	Adenosine specific kinase. The structure of a member of this family from the hyperthermophilic archaeon Pyrobaculum aerophilum contains a modified histidine residue which is interpreted as stable phosphorylation. In vitro binding studies confirmed that adenosine and AMP but not ADP or ATP bind to the protein.	0
412573	cl00797	DUF356	Protein of unknown function (DUF356). Members of this family are around 120 amino acids in length and are found in some archaebacteria. The function of this family is unknown. However it contains a conserved motif IHPPAH that may be involved in its function.	0
412574	cl00798	DUF357	Protein of unknown function (DUF357). Members of this family are short (less than 100 amino acid) proteins found in archaebacteria. The function of these proteins is unknown.	0
321175	cl00799	UPF0128	Uncharacterized protein family (UPF0128). hypothetical protein; Provisional	0
412575	cl00800	DUF116	Protein of unknown function DUF116. This archaebacterial protein has no known function. The protein contains seven conserved cysteines and may also be an integral membrane protein.	0
412576	cl00802	LuxS	S-Ribosylhomocysteinase (LuxS). This family consists of the LuxS protein involved in autoinducer AI2 synthesis and its hypothetical relatives. S-ribosylhomocysteinase (LuxS) catalyzes the cleavage of the thioether bond in S-ribosylhomocysteine (SRH) to produce homocysteine and 4,5-dihydroxy-2,3-pentanedione (DPD), the precursor of type II bacterial quorum sensing molecule.	0
412577	cl00803	Cas7_I	CRISPR/Cas system-associated RAMP superfamily protein Cas7. This group of families is one of several protein families that are always found associated with prokaryotic CRISPRs, themselves a family of clustered regularly interspaced short palindromic repeats, DNA repeats found in nearly half of all bacterial and archaeal genomes. These DNA repeat regions have a remarkably regular structure: unique sequences of constant size, called spacers, sit between each pair of repeats. It has been shown that the CRISPRs are virus-derived sequences acquired by the host to enable them to resist viral infection. The Cas proteins from the host use the CRISPRs to mediate an antiviral response. After transcription of the CRISPR, a complex of Cas proteins termed Cascade cleaves a CRISPR RNA precursor in each repeat and retains the cleavage products containing the virus-derived sequence. Assisted by the helicase Cas3, these mature CRISPR RNAs then serve as small guide RNAs that enable Cascade to interfere with virus proliferation. Cas5 contains an endonuclease motif, whose inactivation leads to loss of resistance, even in the presence of phage-derived spacers. This family used to be known as DUF73. DevR appears to be a negative auto-regulator within the system.	0
412578	cl00805	UPF0179	Uncharacterized protein family (UPF0179). hypothetical protein; Provisional	0
412579	cl00806	YajC	Preprotein translocase subunit. While this protein is part of the preprotein translocase in Escherichia coli, it is not essential for viability or protein secretion. The N-terminus region contains a predicted membrane-spanning region followed by a region consisting almost entirely of residues with charged (acidic, basic, or zwitterionic) side chains. This small protein is about 100 residues in length, and is restricted to bacteria; however, this protein is absent from some lineages, including spirochetes and Mycoplasmas. [Protein fate, Protein and peptide secretion and trafficking]	0
412580	cl00807	MNHE	Na+/H+ ion antiporter subunit. putative monovalent cation/H+ antiporter subunit E; Reviewed	0
412581	cl00808	CbiZ	Adenosylcobinamide amidohydrolase. This prokaryotic protein family includes CbiZ which converts adenosylcobinamide (AdoCbi) to adenosylcobyric acid (AdoCby), an intermediate of the de novo coenzyme B12 biosynthetic route.	0
412582	cl00809	RbsD_FucU	RbsD / FucU transport protein family. L-fucose mutarotase; Provisional	0
412583	cl00810	CheD	CheD chemotactic sensory transduction. chemoreceptor glutamine deamidase CheD; Provisional	0
412584	cl00811	DUF167	Uncharacterized ACR, YggU family COG1872. hypothetical protein; Validated	0
412585	cl00814	Cyclase	Putative cyclase. One of several pathways of tryptophan degradation is as follows: tryptophan 2,3-dioxygenase (1.13.11.11) uses O2 to convert Trp to L-formylkynurenine. Arylformamidase (3.5.1.9) hydrolyzes the product to L-kynurenine and formate. Kynureninase (3.7.1.3) hydrolyzes L-kynurenine to anthranilate plus alanine. Members of the seed alignment for this model are bacterial predicted metal-dependent hydrolases. All are supported as arylformamidase (3.5.1.9) by an operon structure in which kynureninase and/or tryptophan 2,3-dioxygenase genes are adjacent. The members from Bacillus cereus, Pseudomonas aeruginosa and Ralstonia metallidurans were characterized. An example from Pseudomonas fluorescens is given the gene symbol qbsH instead of kynB because of its role in quinolobactin biosynthesis, which begins with tryptophan. All members of this family should be arylformamidase (3.5.1.9). [Energy metabolism, Amino acids and amines]	0
412586	cl00816	OAD_beta	Na+-transporting oxaloacetate decarboxylase beta subunit. Malonate decarboxylase can be a soluble enzyme, or a sodium ion-translocating with additional membrane-bound components. Members of this protein family are integral membrane proteins required to couple decarboxylation to sodium ion export. This family belongs to a broader family, TIGR01109 of sodium ion-translocating decarboxylase beta subunits. [Transport and binding proteins, Cations and iron carrying compounds]	0
412587	cl00817	MM_CoA_mutase	N/A. The enzyme methylmalonyl-CoA mutase is a member of a class of enzymes that uses coenzyme B12 (adenosylcobalamin) as a cofactor. The enzyme induces the formation of an adenosyl radical from the cofactor. This radical then initiates a free-radical rearrangement of its substrate, succinyl-CoA, to methylmalonyl-CoA.	0
412588	cl00818	DUF555	Protein of unknown function (DUF555). hypothetical protein; Provisional	0
412589	cl00820	DUF211	Uncharacterized ArCR, COG1888. 	0
412590	cl00821	Ribosomal_S3Ae	Ribosomal S3Ae family. 30S ribosomal protein S3Ae; Validated	0
412591	cl00822	4HFCP_synth	4-HFC-P synthase. (5-formylfuran-3-yl)methyl phosphate synthase, also known as 4-HFC-P synthase, is involved in the production of methanofuran. This family has a classical TIM-barrel structure whose biological unit is a homohexamer.	0
412592	cl00824	HEPN	HEPN domain. 	0
412593	cl00826	DS	Deoxyhypusine synthase. Deoxyhypusine synthase is responsible for the first step in creating hypusine. Hypusine is a modified amino acid found in eukaryotes and in archaea in their respective forms of initiation factor 5A. Its presence is confirmed in archaeal genera Pyrococcus, Sulfolobus, Halobacterium, and Haloferax, but in an older report was not detected in Methanococcus voltae (J Biol Chem 1987 Dec 5;262(34):16585-9). This family of apparent orthologs has an unusual UPGMA difference tree, in which the members from the archaea M. jannaschii and P. horikoshii cluster with the known eukaryotic deoxyhypusine synthases. Separated by a fairly deep branch, although still strongly related, is a small cluster of proteins from Methanobacterium thermoautotrophicum and Archaeoglobus fulgidus, the latter of which has two. [Protein fate, Protein modification and repair]	0
412594	cl00828	CbiD	CbiD. This protein has been shown by cloning into E. coli to be required for cobalamin biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	0
412595	cl00829	UxaC	Glucuronate isomerase. glucuronate isomerase; Reviewed	0
294541	cl00830	DUF401	Protein of unknown function (DUF401). This protein is predicted to have 10 transmembrane regions. Members of this family are found so far in the Archaea (Archaeoglobus fulgidus and Pyrococcus horikoshii) and in a bacterial thermophile, Thermotoga maritima. In Pyrococcus, the gene is located between nadA and nadB, two components of an enzyme involved in de novo synthesis of NAD. By PSI-BLAST, this family shows similarity (but not necessarily homology) to gluconate permease and other transport proteins. [Hypothetical proteins, Conserved]	0
412596	cl00831	FlpD	Methyl-viologen-reducing hydrogenase, delta subunit. This family consists of methyl-viologen-reducing hydrogenase, delta subunit / heterodisulphide reductase. No specific functions have been assigned to this subunit. The aligned region corresponds to almost the entire delta chain sequence and contains 4 conserved cysteine residues. However, in two Archaeoglobus sequences this region corresponds to only the C-terminus of these proteins.	0
412597	cl00832	DUF359	Protein of unknown function (DUF359). hypothetical protein; Provisional	0
412598	cl00838	FeoA	FeoA domain. This domain also occurs at the C-terminus in related proteins. The transporter Feo is composed of three proteins: FeoA a small, soluble SH3-domain protein probably located in the cytosol; FeoB, a large protein with a cytosolic N-terminal G-protein domain and a C-terminal integral inner-membrane domain containing two 'Gate' motifs which likely functions as the Fe2+ permease; and FeoC, a small protein apparently functioning as an [Fe-S]-dependent transcriptional repressor. Feo allows the bacterial cell to acquire iron from its environment.	0
412599	cl00840	MTD	methylene-5,6,7,8-tetrahydromethanopterin dehydrogenase. This enzyme family is involved in formation of methane from carbon dioxide EC:1.5.99.9. The enzyme requires coenzyme F420.	0
412600	cl00841	Gly_kinase	Glycerate kinase family. The only characterized member of this family so far is the glycerate kinase GlxK (EC 2.7.1.31) of E. coli. This enzyme acts after glyoxylate carboligase and 2-hydroxy-3-oxopropionate reductase (tartronate semialdehyde reductase) in the conversion of glyoxylate to 3-phosphoglycerate (the D-glycerate pathway) as a part of allantoin degradation. [Energy metabolism, Other]	0
412601	cl00842	CbiN	Cobalt transport protein component CbiN. This model describes the cobalt transporter in bacteria and its equivalents in archaea. It principally functions in the ion uptake mechanism. It is a multisubunit transporter with two integral membrane proteins and two closely associated cytoplasmic subunits. This transporter belongs to the ABC transporter superfamily (ABC stands for ATP-Binding Cassette). This superfamily includes two groups, one which catalyzes the uptake of small molecules, including ions, from the external milieu and the other which is engaged in the efflux of small molecular weight compounds and ions from within the cell. Energy derived from the hydrolysis of ATP drives both uptake and efflux. [Transport and binding proteins, Cations and iron carrying compounds]	0
412602	cl00845	DUF473	Protein of unknown function (DUF473). Family of uncharacterized Archaeal proteins.	0
412603	cl00846	CsoR-like_DUF156	Transcriptional regulators CsoR (copper-sensitive operon repressor), RcnR, and FrmR, and related domains; this domain superfamily was previously known as DUF156. This is a family of metal-sensitive repressors, involved in resistance to metal ions. Members of this family bind copper, nickel or cobalt ions via conserved cysteine and histidine residues. In the absence of metal ions, these proteins bind to promoter regions and repress transcription. When bound to metal ions they are unable to bind DNA, leading to transcriptional derepression.	0
412604	cl00847	PAC2	PAC2 family. This model represents one out of two closely related orthologous sets of proteins that, so far, are found only in but are universal among the Archaea. This ortholog set includes MJ1210 from Methanococcus jannaschii and AF0525 from Archaeoglobus fulgidus while excluding MJ0106 and AF1251. [Hypothetical proteins, Conserved]	0
412605	cl00848	Y1_Tnp	Transposase IS200 like. Most IS200/IS605 family insertion sequences encode both this transposase, TnpA, about 130 amino acids long, and a larger accessory protein, TnpB, that may act as a methyltransferase.	0
412606	cl00849	PvlArgDC	Pyruvoyl-dependent arginine decarboxylase (PvlArgDC). pyruvoyl-dependent arginine decarboxylase; Provisional	0
412607	cl00850	Phage_holin_4_2	Mycobacterial 4 TMS phage holin, superfamily IV. These proteins are predicted transmembrane proteins with probably four transmembrane spans. The 1.E.40 is represented by the mycobacterial 4 phage holin, but it also contains many cyanobacterial, proteobacterial and firmicute proteins. Holins are encoded within the genomes of Gram-positive and Gram-negative bacteria as well as in those of the bacteriophage of these organisms. The primary function of holins appears to be transport of murein hydrolases across the cytoplasmic membrane to the cell wall where these enzymes hydrolyze the cell wall polymer as a prelude to cell lysis. When chromosomally encoded the enzymes are therefore autolysins. Holins may also facilitate leakage of electrolytes and nutrients from the cell cytoplasm, thereby promoting cell death. Some may catalyze export of nucleases.	0
412608	cl00851	Fumerase	Fumarate hydratase (Fumerase). A number of Fe-S cluster-containing hydro-lyases share a conserved motif, including argininosuccinate lyase, adenylosuccinate lyase, aspartase, class I fumarate hydratase (fumarase), and tartrate dehydratase (see PROSITE:PDOC00147). This model represents a subset of closely related proteins or modules, including the E. coli tartrate dehydratase alpha chain and the N-terminal region of the class I fumarase (where the C-terminal region is homologous to the tartrate dehydratase beta chain). The activity of archaeal proteins in this subfamily has not been established.	0
412609	cl00857	DUF63	Membrane protein of unknown function DUF63. Proteins found in Archaebacteria of unknown function. These proteins are probably transmembrane proteins.	0
412610	cl00858	BacA	Bacitracin resistance protein BacA. undecaprenyl pyrophosphate phosphatase; Reviewed	0
412611	cl00860	MscL	Large-conductance mechanosensitive channel, MscL. Protein encodes a channel which opens in response to a membrane stretch force. Probably serves as an osmotic gauge. Carboxy terminus tends to be more divergent across species with a high degree of sequence conservation found at the N-terminus. [Cellular processes, Adaptations to atypical conditions]	0
412612	cl00861	RNaseH_like	Ribonuclease H-like. RNaseH_like is a family of uncharacterized eubacterial proteins that are distant homologs of Ribonuclease H-like. The family maintains all the core secondary structure elements of the RNase H-like fold and shares several conserved, presumably active site residues with RNase HI. This finding suggests that it functions as a nuclease.	0
412613	cl00862	FBPase_3	Fructose-1,6-bisphosphatase. This is a family of bacterial and archaeal fructose-1,6-bisphosphatases (FBPases). FBPase catalyzes the hydrolysis of D-fructose-1,6-bisphosphate (FBP) to D-fructose-6-phosphate (F6P) and orthophosphate and is an essential regulatory enzyme in the gluconeogenic pathway.	0
412614	cl00864	PspC	PspC domain. This family includes Phage shock protein C (PspC) that is thought to be a transcriptional regulator. The presumed domain is 60 amino acid residues in length.	0
412615	cl00865	CT_A_B	Carboxyltransferase domain, subdomain A and B. This domain represents subunit 2 of allophanate hydrolase (AHS2).	0
412616	cl00866	NTPase_I-T	Protein of unknown function DUF84. [Purines, pyrimidines, nucleosides, and nucleotides, Other]	0
412617	cl00867	Bac_export_3	Bacterial export proteins, family 3. flagellar biosynthesis protein FliQ; Reviewed	0
412618	cl00868	YdjM	LexA-binding, inner membrane-associated putative hydrolase. YdjM is a family of putative LexA-binding proteins. Members are predicted to be membrane-bound metal-dependent hydrolases that may be acting as phospholipases. It is a member of the SOS network, that rescues cells from UV and other DNA-damage. Expression of YdjM is regulated by LexA.	0
412619	cl00871	ThiP_synth	Thiamine-phosphate synthase. This family is thiamine-phosphate synthase, and it belongs to the SCOP phosphomethylpyrimidine kinase C-terminal domain-like family. Vitamin B1 (thiamine pyrophosphate) is involved in several microbial metabolic functions. Thiamine biosynthesis is accomplished by joining two intermediate molecules that are synthesized separately, HMP-PP and HET-P. In the archaeon Natrialba magadii, ThiE and ThiN, are known to join HMP-PP ( hydroxymethylpyrimidine pyrophosphate) and HET-P (hydroxyethylthiazole phosphate) to generate thiamine phosphate. Whereas ThiE in Natrialba magadii is a mono-functional protein, ThiN exists as a C-terminal domain in a ThiDN fusion protein - examples of all three forms, from various prokaryotes, are found in this family.	0
412620	cl00872	DUF190	Uncharacterized ACR, COG1993. 	0
412621	cl00873	PdxA	Pyridoxal phosphate biosynthetic protein PdxA. This model represents PdxA, an NAD+-dependent 4-hydroxythreonine 4-phosphate dehydrogenase (EC 1.1.1.262) active in pyridoxal phosphate biosynthesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridoxine]	0
412622	cl00874	DNA_RNApol_7kD	DNA directed RNA polymerase, 7 kDa subunit. DNA-directed RNA polymerase subunit P; Provisional	0
412623	cl00876	Ribosomal_S27	Ribosomal protein S27a. 30S ribosomal protein S27ae; Validated	0
412624	cl00877	MazE_antitoxin	Antidote-toxin recognition MazE, bacterial antitoxin. PrlF_antitoxin is a family of bacterial antitoxins that neutralizes the toxin YhaV. PrlF is labile and forms a homodimer that then binds to the YhaV toxin thereby neutralising its ribonuclease activity. Alone, it can also act as a transcription factor. The YhaV/PrlF complex binds the prlF-yhaV operon, probably regulating its expression negatively. Over-expression of PrlF leads to increased doubling time.	0
412625	cl00878	Ribosomal_S24e	Ribosomal protein S24e. 40S ribosomal protein S24; Provisional	0
294572	cl00880	Ribosomal_S8e_like	Eukaryotic/archaeal ribosomal protein S8e and similar proteins. 40S ribosomal protein S8-like; Provisional	0
412626	cl00881	SQR_QFR_TM	N/A. Fumarate reductase is a membrane-bound flavoenzyme consisting of four subunits, A-D. A and B comprise the membrane-extrinsic catalytic domain and C and D link the catalytic centers to the electron-transport chain. This family consists of the 13kD hydrophobic subunit D.	0
412627	cl00883	RNA_pol_Rpb5_C	RNA polymerase Rpb5, C-terminal domain. DNA-directed RNA polymerase subunit H; Reviewed	0
412628	cl00884	AIM24	Mitochondrial biogenesis AIM24. [Hypothetical proteins, Conserved]	0
412629	cl00886	Robl_LC7	Roadblock/LC7 domain. This family includes proteins that are about 100 amino acids long and have been shown to be related. Members of this family of proteins are associated with both flagellar outer arm dynein and Drosophila and rat brain cytoplasmic dynein. It is proposed that roadblock/LC7 family members may modulate specific dynein functions. This family also includes Golgi-associated MP1 adapter protein and MglB from Myxococcus xanthus, a protein involved in gliding motility. However the family also includes members from non-motile bacteria such as Streptomyces coelicolor, suggesting that the protein may play a structural or regulatory role.	0
412630	cl00887	Rpr2	RNAse P Rpr2/Rpp21/SNM1 subunit domain. ribonuclease P protein component 4; Validated	0
412631	cl00890	DUF366	Domain of unknown function (DUF366). Archaeal domain of unknown function.	0
412632	cl00891	Cu-Zn_Superoxide_Dismutase	N/A. superoxide dismutases (SODs) catalyze the conversion of superoxide radicals to hydrogen peroxide and molecular oxygen. Three evolutionarily distinct families of SODs are known, of which the copper/zinc-binding family is one. Defects in the human SOD1 gene cause familial amyotrophic lateral sclerosis (Lou Gehrig's disease). Structure is an eight-stranded beta sandwich, similar to the immunoglobulin fold.	0
412633	cl00892	DUF131	Protein of unknown function DUF131. The member of this family from Pyrococcus horikoshii scores only 13.91 bits, largely because it is at least 15 residues shorter than other members of this family of small proteins and is penalized for not matching to the N-terminal section of the model. Cutoff scores are set so this hit is between noise and trusted cutoffs. [Hypothetical proteins, Conserved]	0
412634	cl00893	DUF368	Domain of unknown function (DUF368). Predicted transmembrane domain of unknown function. Family members have between 6 and 9 predicted transmembrane segments.	0
412635	cl00894	DUF169	Uncharacterized ArCR, COG2043. 	0
412636	cl00895	2-ph_phosp	2-phosphosulpholactate phosphatase. 2-phosphosulfolactate phosphatase catalyzes the sulfonation of phosphoenolpyruvate to form 2-phospho-3-sulfolactate, the second step in coenzyme M biosynthesis. Coenzyme M is the terminal methyl carrier in methanogenesis. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other, Energy metabolism, Methanogenesis]	0
294584	cl00897	Ribosomal_S27e	Ribosomal protein S27. 40S ribosomal protein S27; Provisional	0
412637	cl00898	DUF370	Domain of unknown function (DUF370). hypothetical protein; Provisional	0
412638	cl00900	Ldh_2	Malate/L-lactate dehydrogenase. This enzyme converts ureidoglycolate to oxalureate in the non-urea-forming catabolism of allantoin (GenProp0687). The pathway has been characterized in E. coli and is observed in the genomes of Enterococcus faecalis and Bacillus licheniformis.	0
412639	cl00903	KdpA	Potassium-transporting ATPase A subunit. Kdp is a high affinity ATP-driven K+ transport system in Escherichia coli. It is composed of three membrane-bound subunits, KdpA, KdpB and KdpC and one small peptide, KdpF. KdpA is the K+-transporting subunit of this complex. During assembly of the complex, KdpA and KdpC bind to each other. This interaction is thought to stabilize the complex [medline:9858692]. Data indicates that KdpC might connect the KdpA, the K+-transporting subunit, to KdpB, the ATP-hydrolyzing (energy providing) subunit [medline:9858692]. [Transport and binding proteins, Cations and iron carrying compounds]	0
412640	cl00907	Glutaminase	Glutaminase. This family describes the enzyme glutaminase, from a larger family that includes serine-dependent beta-lactamases and penicillin-binding proteins. Many bacteria have two isozymes. This model is based on selected known glutaminases and their homologs within prokaryotes, with the exclusion of highly-derived (long branch) and architecturally varied homologs, so as to achieve conservative assignments. A sharp drop in scores occurs below 250, and cutoffs are set accordingly. The enzyme converts glutamine to glutamate, with the release of ammonia. Members tend to be described as glutaminase A (glsA), where B (glsB) is unknown and may not be homologous (as in Rhizobium etli). Some species have two isozymes that may both be designated A (GlsA1 and GlsA2). [Energy metabolism, Amino acids and amines]	0
412641	cl00909	Ribosomal_L24e_L24	N/A. MYM-type zinc fingers were identified in MYM family proteins. Human protein ZMYM3 is involved in a chromosomal translocation and may be responsible for X-linked retardation in XQ13.1. ZMYM2 is also involved in disease. In myeloproliferative disorders it is fused to FGF receptor 1; in atypical myeloproliferative disorders it is rearranged. Members of the family generally are involved in development. This Zn-finger domain functions as a transcriptional trans-activator of late vaccinia viral genes, and orthologues are also found in all nucleocytoplasmic large DNA viruses, NCLDV. This domain is also found fused to the C termini of recombinases from certain prokaryotic transposons.	0
412642	cl00911	AMMECR1	AMMECR1. Members of this protein family belong to the same domain family as AMMECR1, a mammalian protein named for AMME - Alport syndrome, Mental Retardation, Midface hypoplasia, and Elliptocytosis. Members of the present family occur as part of a three gene system with a homolog of the mammalian protein Memo (Mediator of ErbB2-driven cell MOtility), and an uncharacterized radical SAM enzyme.	0
412643	cl00912	MmgE_PrpD	MmgE/PrpD family. Members of this family are bacterial proteins known or predicted to act as 2-methylcitrate dehydratase, an enzyme involved in the methylcitrate cycle of propionate catabolism. A related clade of archaeal proteins that may or may not be functionally equivalent is reserved for a future model and is excluded from this family. The PrpD enzyme of E. coli is responsible for the minor aconitase activity (AcnC) not accounted for by AcnA and AcnB.	0
412644	cl00913	CbiC	Precorrin-8X methylmutase. precorrin-8X methylmutase; Reviewed	0
412645	cl00914	DUF61	Protein of unknown function DUF61. hypothetical protein; Provisional	0
412646	cl00915	SpoVG	SpoVG. Stage V sporulation protein G. Essential for sporulation and specific to stage V sporulation in Bacillus megaterium and subtilis. In B. subtilis, expression decreases after 30-60 minutes of cold shock.	0
412647	cl00916	DUF371	Domain of unknown function (DUF371). Archaeal domain of unknown function.	0
412648	cl00920	Cob_adeno_trans	Cobalamin adenosyltransferase. This model represents an ATP:cob(I)alamin adenosyltransferase family corresponding to the N-terminal half of Salmonella PduO, a 1,2-propanediol utilization protein that probably is bifunctional. PduO represents one of at least three families of ATP:corrinoid adenosyltransferase: others are CobA (which partially complements PduO) and EutT. It was not clear originally whether ATP:cob(I)alamin adenosyltransferase activity resides in the N-terminal region of PduO, modeled here, but this has now become clear from the characterization of MeaD from Methylobacterium extorquens. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	0
412649	cl00921	Ribosomal_L31e	Eukaryotic/archaeal ribosomal protein L31. 50S ribosomal protein L31e; Reviewed	0
412650	cl00922	CbiJ	Precorrin-6x reductase CbiJ/CobK. This enzyme catalyzes a step in cobalamin biosynthesis. It has been identified experimentally in Pseudomonas denitrificans and has been shown to be part of cobalamin biosynthetic operons in several other species. This enzyme was found to be a monomer by gel filtration. [Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin]	0
412651	cl00927	Form_Nir_trans	Formate/nitrite transporter. FocA (formate channel A) forms a pentameric formate-selective channel through the plasma membrane. The focA gene is largely restricted to Proteobacteria and occurs adjacent to genes for pyruvate formate lyase (PFL) and the PFL activase, a radical SAM protein. FocA is homologous to a nitrite transport protein, NirC. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	0
412652	cl00928	dsDNA_bind	Double-stranded DNA-binding domain. This domain is believed to bind double-stranded DNA of 20 bases length.	0
412653	cl00929	PIG-L	GlcNAc-PI de-N-acetylase. Members of this protein family are BshB1 (YpjG), an enzyme of bacillithiol biosynthesis; either BshB1 or BshB2 (YojG) must be present, and often both are present. Bacillithiol is a low-molecular-weight thiol, an analog of glutathione and mycothiol, and is found largely in the Firmicutes. [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs]	0
412654	cl00931	Ribosomal_S6e	Ribosomal protein S6e. 30S ribosomal protein S6e; Validated	0
412655	cl00932	Ribosomal_L37e	Ribosomal protein L37e. 60S ribosomal protein L37; Provisional	0
412656	cl00933	ClpS	ATP-dependent Clp protease adaptor protein ClpS. ATP-dependent Clp protease adaptor protein ClpS; Reviewed	0
412657	cl00934	CDH	CDP-diacylglycerol pyrophosphatase. CDP-diacylglycerol pyrophosphatase; Provisional	0
412658	cl00935	Brix	Brix domain. ribosomal biogenesis protein; Validated	0
412659	cl00936	RecX	RecX family. recombination regulator RecX; Provisional	0
412660	cl00937	Ribosomal_L21e	Ribosomal protein L21e. 50S ribosomal protein L21e; Reviewed	0
412661	cl00938	Rieske	N/A. The rieske domain has a [2Fe-2S] centre. Two conserved cysteines coordinate one Fe ion, while the other Fe ion is coordinated by two conserved histidines. In hyperthermophilic archaea there is a SKTPCX(2-3)C motif at the C-terminus. The cysteines in this motif form a disulphide bridge, which stabilizes the protein.	0
412662	cl00941	FeS_assembly_P	Iron-sulfur cluster assembly protein. The function is unknown for this protein family, but members are found almost always in operons for the SUF system of iron-sulfur cluster biosynthesis. The SUF system is present elsewhere on the chromosome for those few species where SUF genes are not adjacent. This family shares this property of association with the SUF system with a related family, TIGR02945. TIGR02945 consists largely of a DUF59 domain (see pfam01883), while this protein is about double the length, with a unique N-terminal domain and DUF59 C-terminal domain. A location immediately downstream of the cysteine desulfurase gene sufS in many contexts suggests the gene symbol sufT. Note that some other homologs of this family and of TIGR02945, but no actual members of this family, are found in operons associated with phenylacetic acid (or other ring-hydroxylating) degradation pathways. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	0
412663	cl00942	PCD_DCoH	N/A. Pterin 4 alpha carbinolamine dehydratase is also known as DCoH (dimerization cofactor of hepatocyte nuclear factor 1-alpha).	0
412664	cl00943	DUF378	Domain of unknown function (DUF378). Predicted transmembrane domain of unknown function. The majority of the family have two predicted transmembrane regions.	0
412665	cl00944	KdpC	K+-transporting ATPase, c chain. potassium-transporting ATPase subunit C; Provisional	0
412666	cl00945	Ribosomal_L18A	Ribosomal proteins 50S-L18Ae/60S-L20/60S-L18A. 50S ribosomal protein LX; Validated	0
412667	cl00946	zf-like	Cysteine-rich small domain. Probable metal-binding domain.	0
412668	cl00947	L-fuc_L-ara-isomerases	N/A. L-Arabinose isomerase (AI) catalyzes the isomerization of L-arabinose to L-ribulose, the first reaction in its conversion into D-xylulose-5-phosphate, an intermediate in the pentose phosphate pathway, which allows L-arabinose to be used as a carbon source. AI can also convert D-galactose to D-tagatose at elevated temperatures in the presence of divalent metal ions. D-tagatose, rarely found in nature, is of commercial interest as a low-calorie sugar substitute.	0
412669	cl00949	Acetyltransf_2	N-acetyltransferase. N-hydroxyarylamine O-acetyltransferase; Provisional	0
412670	cl00951	SufE	Fe-S metabolism associated domain. Members of this protein family are CsdE, formerly called YgdK. This protein, found as a paralog to SufE in Escherichia coli, Yersinia pestis, Photorhabdus luminescens, and related species, works together and physically interacts with CsdA (a paralog of SufS). CsdA has cysteine desulfurase activity that is enhanced by this protein (CsdE), in which Cys-61 (numbered as in E. coli) is a sulfur acceptor site. This gene pair, although involved in FeS cluster biosynthesis, is not found next to other such genes as are its paralogs from the Suf or Isc systems. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	0
412671	cl00952	Ribosomal_L39	Ribosomal L39 protein. 50S ribosomal protein L39e; Validated	0
412672	cl00954	GCS2	Glutamate-cysteine ligase family 2(GCS2). Family of bacterial glutamate-cysteine ligases (EC:6.3.2.2) that carry out the first step of the glutathione biosynthesis pathway.	0
412673	cl00955	Ribosomal_L34e	Ribosomal protein L34e. 60S ribosomal protein L34; Provisional	0
412674	cl00957	Translin-like	Translin and translin-associated factor-X (TRAX). Members of this family include Translin, which interacts with DNA and forms a ring around the DNA. This family also includes human TSNAX, which was found to interact with translin in a yeast two-hybrid screen.	0
412675	cl00958	Nitrate_red_del	Nitrate reductase delta subunit. Type II members of the DMSO reductase family are heterotrimeric proteins with bis(molybdopterin guanine dinucleotide)Mo, iron-sulfur, and heme b prosthetic groups bound by the alpha, beta, and gamma subunits respectively. Members of this protein family are not part of the mature protein, although they are the product of a fourth clustered gene. Proteins in this family are interpreted as a chaperone, analogous to NarJ of nitrate reductases.	0
412676	cl00959	Nitrate_red_gam	Nitrate reductase gamma subunit. Involved in anaerobic respiration, the gene product catalyzes the reaction (reduced acceptor + NO3- = acceptor + nitrite). Another possible role_id for this gene product is in nitrogen fixation (Role_id:160). [Energy metabolism, Anaerobic]	0
412677	cl00960	Fic	Fic/DOC family. The characterized member of this family is the death-on-curing (DOC) protein of phage P1. It is part of a two protein operon with prevents-host-death (phd) that forms an addiction module. DOC lacks homology to analogous addiction module post-segregational killing proteins involved in plasmid maintenance. These modules work as a combination of a long lived poison (e.g. this protein) and a more abundant but shorter lived antidote. Members of this family have a well-conserved central motif HxFx[ND][AG]NKR. A similar region, with K replaced by G, is found in the huntingtin interacting protein (HYPE) family. [Unknown function, General]	0
412678	cl00969	Ribosomal_S19e	Ribosomal protein S19e. 30S ribosomal protein S19e; Provisional	0
412679	cl00970	DUF996	Protein of unknown function (DUF996). Family of uncharacterized bacterial and archaeal proteins.	0
412680	cl00973	AbiEii	Nucleotidyl transferase AbiEii toxin, Type IV TA system. This family was recently identified as belonging to the nucleotidyltransferase superfamily. AbiEii is the cognate toxin of the type IV toxin-antitoxin 'innate immunity' bacterial abortive infection (Abi) system that protects bacteria from the spread of a phage infection. The Abi system is activated upon infection with phage to abort the cell thus preventing the spread of phage through viral replication. There are some 20 or more Abis, and they are predominantly plasmid-encoded lactococcal systems. TA, toxin-antitoxin, systems on plasmids function by killing cells that lose the plasmid upon division. AbiE phage resistance systems function as novel Type IV TAs and are widespread in bacteria and archaea. The cognate antitoxin is pfam13338.	0
412681	cl00977	Nop10p	Nucleolar RNA-binding protein, Nop10p family. Nop10p is a nucleolar protein that is specifically associated with H/ACA snoRNAs. It is essential for normal 18S rRNA production and rRNA pseudouridylation by the ribonucleoprotein particles containing H/ACA snoRNAs (H/ACA snoRNPs). Nop10p is probably necessary for the stability of these RNPs.	0
412682	cl00978	Transgly_assoc	Transglycosylase associated protein. hypothetical protein; Provisional	0
412683	cl00979	DUF402	Protein of unknown function (DUF402). hypothetical protein; Provisional	0
412684	cl00983	Indigoidine_A	Indigoidine synthase A like protein. Indigoidine is a blue pigment synthesized by Erwinia chrysanthemi implicated in pathogenicity and protection from oxidative stress. IdgA is involved in indigoidine biosynthesis, but its specific function is unknown. The recommended name for this protein is now pseudouridine-5'-phosphate glycosidase.	0
382321	cl00984	TM2	TM2 domain. TM2 domain-containing protein	0
412685	cl00987	GrpB	GrpB protein. This family has been suggested to belong to the nucleotidyltransferase superfamily. It occurs at the C-terminus of dephospho-CoA kinase (CoaE) in a number of cases, where it plays a role in the proper folding of the enzyme.	0
412686	cl00989	DUF420	Protein of unknown function (DUF420). Predicted membrane protein with four transmembrane helices.	0
412687	cl00990	DUF421	Protein of unknown function (DUF421). YDFR family	0
412688	cl00991	Caroten_synth	Carotenoid biosynthesis protein. The representative member of this family is CruF, a C50 carotenoid 2',3'-hydratase involved in the synthesis of the C50 carotenoid bacterioruberin in the halophilic archaeon Haloarcula japonica.	0
412689	cl00993	Zn-ribbon_8	Zinc ribbon domain. This family consists of several hypothetical bacterial proteins of around 150 residues in length. The function of this family is unknown.	0
412690	cl00994	CcmE	CcmE. cytochrome c-type biogenesis protein CcmE; Reviewed	0
412691	cl00995	PemK_toxin	PemK-like, MazF-like toxin of type II toxin-antitoxin system. PemK is a growth inhibitor in E. coli known to bind to the promoter region of the Pem operon, auto-regulating synthesis. This family represents the toxin molecule of a typical bacterial toxin-antitoxin system pairing. The family includes a number of different toxins, such as MazF, Kid, PemK, ChpA, ChpB and ChpAK.	0
412692	cl00997	Glyco_hydro_114	Glycoside-hydrolase family GH114. Original assignment of this protein family as cysteinyl-tRNA synthetase is controversial, supported by but challenged by and by subsequent discovery of the actual mechanism for synthesizing Cys-tRNA in species where a direct Cys--tRNA ligase was not found. Lingering legacy annotations of members of this family probably should be removed. Evidence against the role includes a signal peptide. This family has been renamed "extracellular protein" to facilitate correction. Members of this family occur in Deinococcus radiodurans (bacterial) and Methanococcus jannaschii (archaeal). A number of homologous but more distantly related proteins are annotated as alpha-1,4 polygalactosaminidases. The function remains unknown. [Unknown function, General]	0
412693	cl00998	NTP_transf_9	Domain of unknown function (DUF427). This domain contains a beta-tent fold.	0
412694	cl00999	YCII	YCII-related domain. YciI-like protein; Reviewed	0
412695	cl01001	YceI	YceI-like domain. E. coli YceI is a base-induced periplasmic protein. The recent structure of a member of this family shows that it binds to polyisoprenoid. The structure consists of an extended, eight-stranded, antiparallel beta-barrel that resembles the lipocalin fold.	0
412696	cl01002	DUF808	Protein of unknown function (DUF808). hypothetical protein; Provisional	0
412697	cl01005	SpoVS	Stage V sporulation protein S (SpoVS). In Bacillus subtilis this protein interferes with sporulation at an early stage and this inhibitory effect is overcome by SpoIIB and SpoVG. SpoVS seems to play a positive role in allowing progression beyond stage V of sporulation. Null mutations in the spoVS gene block sporulation at stage V, impairing the development of heat resistance and coat assembly.	0
412698	cl01007	DAP_dppA	Peptidase M55, D-aminopeptidase dipeptide-binding protein family. Bacillus subtilis DppA is a binuclear zinc-dependent, D-specific aminopeptidase. The structure reveals that DppA is a new example of a 'self-compartmentalising protease', a family of proteolytic complexes. Proteasomes are the most extensively studied representatives of this family. The DppA enzyme is composed of identical 30 kDa subunits organized in a decamer with 52 point-group symmetry. A 20 A wide channel runs through the complex, giving access to a central chamber holding the active sites. The structure shows DppA to be a prototype of a new family of metalloaminopeptidases characterized by the SXDXEG key sequence. The only known substrates are D-ala-D-ala and D-ala-gly-gly.	0
412699	cl01008	DUF423	Protein of unknown function (DUF423). hypothetical protein; Provisional	0
412700	cl01011	HupE_UreJ_2	HupE / UreJ protein. This family of proteins are hydrogenase / urease accessory proteins. The alignment contains many conserved histidines that are likely to be involved in nickel binding. The members usually have five membrane-spanning regions.	0
412701	cl01012	Big_5	Bacterial Ig-like domain. CopC is a bacterial blue copper protein that binds 1 atom of copper per protein molecule. Along with CopA, CopC mediates copper resistance by sequestration of copper in the periplasm.	0
412702	cl01015	FUN14	FUN14 family. This family of short proteins are found in eukaryotes and some archaea. Although the function of these proteins is not known they may contain transmembrane helices.	0
412703	cl01017	DUF2227	Uncharacterized metal-binding protein (DUF2227). Members of this family of hypothetical bacterial proteins possess metal binding properties; however, their exact function has not, as yet, been determined.	0
412704	cl01020	ASCH	N/A. The search results from NCBI sequence alignment indicate a conserved domain belonging to the ASCH superfamily. Dali search results show that the protein is structurally similar to the PUA domain, suggesting it may be involved in RNA recognition. It has been reported that the deletion of PUA genes results in impaired growth (RluD) and competitive disadvantage (TruB) in Escherichia coli. Suggestions have been put forward that, apart from their usual catalytic role, certain PUS enzymes (e.g. TruB) may also act as chaperones for RNA folding. The interface interaction indicates that the biomolecule of protein NP_809782.1 should be a dimer.	0
382342	cl01021	DUF424	Protein of unknown function (DUF424). This is a family of uncharacterized proteins.	0
412705	cl01024	Sm_multidrug_ex	Putative small multi-drug export protein. This family contains a small number of putative small multi-drug export proteins.	0
412706	cl01027	DUF432	Protein of unknown function (DUF432). Archaeal protein of unknown function.	0
412707	cl01030	DUF433	Protein of unknown function (DUF433). 	0
412708	cl01031	DUF86	Protein of unknown function DUF86. The function of members of this family is unknown.	0
412709	cl01033	Ribosomal_L35Ae	Ribosomal protein L35Ae. 60S ribosomal protein L35a; Provisional	0
412710	cl01034	DUF2304	Uncharacterized conserved protein (DUF2304). Members of this family of hypothetical archaeal proteins have no known function.	0
412711	cl01041	DUF441	Protein of unknown function (DUF441). Predicted to be an integral membrane protein.	0
412712	cl01047	DUF386	Domain of unknown function (DUF386). This family consists of conserved hypothetical proteins, about 150 amino acids in length. Members with limited information include YhcH, a possible sugar isomerase of sialic acid catabolism, and YjgK. [Unknown function, General]	0
412713	cl01048	Barstar_like	N/A. Barstar_SaI14_like contains sequences that are similar to SaI14, an RNAase inhibitor, which are members of the Barstar family. Barstar is an intracellular inhibitor of barnase, an extracellular ribonuclease of Bacillus amyloliquefaciens. Barstar binds tightly to the barnase active site and sterically blocks it thus inhibiting its potentially lethal RNase activity inside the cell. The sequences in this subfamily are mostly uncharacterized, but believed to have a similar function and role.	0
412714	cl01049	Zn_peptidase_2	Putative neutral zinc metallopeptidase. Zinc metallopeptidase zinc binding regions have been predicted in some family members by a pattern match (Prosite:PS00142), of the characteristic HEXXH motif.	0
412715	cl01051	Antibiotic_NAT	Aminoglycoside 3-N-acetyltransferase. This family consists of bacterial aminoglycoside 3-N-acetyltransferases (EC:2.3.1.81), which catalyze the reaction: Acetyl-CoA + a 2-deoxystreptamine antibiotic <=> CoA + N3'-acetyl-2-deoxystreptamine antibiotic. The enzyme can use a range of antibiotics with 2-deoxystreptamine rings as acceptor for its acetyltransferase activity; this inactivates and confers resistance to gentamicin, kanamycin, tobramycin, neomycin and apramycin amongst others.	0
412716	cl01052	FlgM	Anti-sigma-28 factor, FlgM. FlgM interacts with and inhibits the alternative sigma factor sigma(28) FliA. The C-terminus of FlgM contains the sigma(28)-binding domain.	0
412717	cl01053	SGNH_hydrolase	N/A. This domain is mainly found in uncharacterized proteins around 290 residues in length and is mainly found in various Bacteroides species. It has a curved central beta sheet flanked by helices. Distant homolog analysis showed it has a similarity with the GDSL-like Lipase/Acylhydrolase family. The function of this domain is still unknown.	0
412718	cl01054	HAMP	Histidine kinase, Adenylyl cyclase, Methyl-accepting protein, and Phosphatase (HAMP) domain. HAMP is a signaling domain which occurs in a wide variety of signaling proteins, many of which are bacterial. The HAMP domain consists of two alpha helices connected by an extended linker. The structure of the Af1503 HAMP dimer from Archaeoglobus fulgidus has been solved using nuclear magnetic resonance, revealing a parallel four-helix bundle; this structure has been confirmed by cross-linking analysis of HAMP domains from the Escherichia coli aerotaxis receptor Aer. It has been suggested that the four-helix arrangement can rotate between the unusually packed conformation observed in the NMR structure and a canonical coiled-coil arrangement. Such rotation may coincide with signal transduction, but a common mechanism by which HAMP domains relay a variety of input signals has yet to be established.	0
412719	cl01059	Adenine_glyco	Methyladenine glycosylase. All proteins in this family are alkylation DNA glycosylases that function in base excision repair. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	0
242278	cl01062	DUF452	Protein of unknown function (DUF452). 	0
412720	cl01063	DUF454	Protein of unknown function (DUF454). Predicted membrane protein.	0
412721	cl01066	Trm112p	Trm112p-like protein. The function of this family is uncertain. The bacterial members are about 60-70 amino acids in length and the eukaryotic examples are about 120 amino acids in length. The C-terminus contains the strongest conservation. Trm112p is required for tRNA methylation in S. cerevisiae and is found in complexes with 2 tRNA methylases (TRM9 and TRM11) also with putative methyltransferase YDR140W. The zinc-finger protein Ynr046w is plurifunctional and a component of the eRF1 methyltransferase in yeast. The crystal structure of Ynr046w has been determined to 1.7 A resolution. It comprises a zinc-binding domain built from both the N- and C-terminal sequences and an inserted domain, absent from bacterial and archaeal orthologs of the protein, composed of three alpha-helices.	0
412722	cl01067	Dyp_perox	Dyp-type peroxidase family. A defined member of this superfamily is Dyp, a dye-decolorizing peroxidase that lacks a typical heme-binding region. A distinct, uncharacterized branch (TIGR01412) of this superfamily has a typical twin-arginine dependent signal sequence characteristic of exported proteins with bound redox cofactors.	0
412723	cl01069	DUF456	Protein of unknown function (DUF456). This family is a putative membrane protein that contains glycine zipper motifs.	0
412724	cl01070	DUF465	Protein of unknown function (DUF465). hypothetical protein; Provisional	0
412725	cl01071	PCuAC	Copper chaperone PCu(A)C. PCu(A)C is a periplasmic copper chaperone. Its role may be to capture and transfer copper to two other copper chaperones, PrrC and Cox11, which in turn deliver Cu(I) to cytochrome c oxidase.	0
412726	cl01073	MlaA	MlaA lipoprotein. MlaA is a component of the Mla pathway, an ABC transport system that functions to maintain the asymmetry of the outer membrane. MlaA is required for the intercellular spreading of Shigella flexneri. It is attached to the outer membrane by a lipid anchor.	0
412727	cl01074	MlaC	MlaC protein. The genomes containing members of this family share the machinery for the biosynthesis of hopanoid lipids. Furthermore, the genes of this family are usually located proximal to other components of this biological process. The proteins are members of the pfam05494 family of putative transporters known as "toluene tolerance protein Ttg2D", although it is unlikely that the members included here have anything to do with toluene per-se.	0
412728	cl01075	Cons_hypoth698	Conserved hypothetical protein 698. Members of this family are found so far only in one archaeal species, Archaeoglobus fulgidus, and in two related bacterial species, Haemophilus influenzae and Escherichia coli. It has 9 GES predicted transmembrane regions at conserved locations in all members. These proteins have a molecular weight of approximately 35 to 38 kDa. [Hypothetical proteins, Conserved]	0
412729	cl01076	Peptidase_M78	IrrE N-terminal-like domain. This entry includes the catalytic domain of the protein ImmA, which is a metallopeptidase containing an HEXXH zinc-binding motif from peptidase family M78. ImmA is encoded on a conjugative transposon. Conjugating bacteria are able to transfer conjugative transposons that can, for example, confer resistance to antibiotics. The transposon is integrated into the chromosome, but during conjugation excises itself and then moves to the recipient bacterium and re-integrates into its chromosome. Typically a conjugative transposon encodes only the proteins required for this activity and the proteins that regulate it. During exponential growth, the ICEBs1 transposon of Bacillus subtilis is inactivated by the immunity repressor protein ImmR, which is encoded by the transposon and represses the genes for excision and transfer. Cleavage of ImmR relaxes repression and allows transfer of the transposon. ImmA has been shown to be essential for the cleavage of ImmR. This domain is also found in metalloprotease IrrE, a central regulator of DNA damage repair in Deinococcaceae, HTH-type transcriptional regulators RamB and PrpC.	0
412730	cl01077	SIMPL	Protein of unknown function (DUF541). oxidative stress defense protein; Provisional	0
412731	cl01078	UPF0114	Uncharacterized protein family, UPF0114. hypothetical protein; Provisional	0
412732	cl01080	Prp-like	ribosomal-processing cysteine protease Prp and similar proteins. This is a family of cysteine proteases that are found to cleave the N-terminal extension of ribosomal subunit L27 in eubacteria. Proteins in this family are distinguished by a pair of invariant histidine and cysteine residues with conserved spacing that form the classic catalytic dyad of a cysteine protease.	0
412733	cl01081	FMN_bind	FMN-binding domain. This model represents the NqrC subunit of the six-protein, Na(+)-pumping NADH-quinone reductase of a number of marine and pathogenic Gram-negative bacteria. This oxidoreductase complex functions primarily as a sodium ion pump. [Transport and binding proteins, Cations and iron carrying compounds]	0
412734	cl01082	Sel_put	Selenoprotein, putative. This family is named KCU-star because nearly all member proteins end with tripeptide lysine-cysteine-selenocysteine, followed immediately by a stop codon (represented by an asterisk, or star). Members occur primarily in species of Helicobacter (although not Helicobacter pylori, in which selenocysteine incorporation capability has been lost) and Campylobacter. This small family belongs to the larger YbdD/YjiX (DUF466) family described by Pfam model PF04328.	0
412735	cl01085	UPF0175	Uncharacterized protein family (UPF0175). This family contains small proteins of unknown function.	0
294686	cl01087	MreD	rod shape-determining protein MreD. Members of this protein family are the MreD protein of bacterial cell shape determination. Most rod-shaped bacteria depend on MreB and RodA to achieve either a rod shape or some other non-spherical morphology such as coil or stalk formation. MreD is encoded in an operon with MreB, and often with RodA and PBP-2 as well. It is highly hydrophobic (therefore somewhat low-complexity) and highly divergent, and therefore sometimes tricky to discover by homology, but this model finds most examples. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan]	0
412736	cl01090	SlyX	SlyX. hypothetical protein; Provisional	0
412737	cl01093	Fer2_BFD	BFD-like [2Fe-2S] binding domain. bacterioferritin-associated ferredoxin; Provisional	0
412738	cl01097	DUF489	Protein of unknown function (DUF489). Protein of unknown function, cotranscribed with purB in Escherichia coli, but with function unrelated to purine biosynthesis.	0
412739	cl01098	IspA	Intracellular septation protein A. intracellular septation protein A; Reviewed	0
412740	cl01101	DsrC	DsrC like protein. Members of this protein family may be described as TusE, a partner to TusBCD in a sulfur relay system for 2-thiouridine biosynthesis, a tRNA base modification process. Other members are DsrC, a functionally similar protein in species where the sulfur relay system exists primarily for sulfur metabolism rather than tRNA base modification. Some members of this family are known explicitly as the gamma subunit of sulfite reductases.	0
412741	cl01102	DUF493	Protein of unknown function (DUF493). hypothetical protein; Provisional	0
412742	cl01103	DUF494	Protein of unknown function (DUF494). hypothetical protein; Validated	0
412743	cl01104	Iron_traffic	Bacterial Fe(2+) trafficking. oxidative damage protection protein; Provisional	0
412744	cl01106	DNA_pol3_chi	DNA polymerase III chi subunit, HolC. DNA polymerase III subunit chi; Validated	0
412745	cl01107	DUF502	Protein of unknown function (DUF502). Predicted to be an integral membrane protein.	0
412746	cl01108	BrnT_toxin	Ribonuclease toxin, BrnT, of type II toxin-antitoxin system. BrnT is a ribonuclease toxin of a type II toxin-antitoxin system that exhibits a RelE-like fold. The antitoxin that neutralizes this toxin is pfam14384. BrnT is found in bacteria, archaea, bacteriophage, and plasmids. BrnT-BrnA forms a 2:2 tetrameric complex and autoregulates its own expression, which is induced by a number of different environmental stresses. Expression of BrnT alone results in cessation of bacterial growth which can be rescued after subsequent expression of BrnA.	0
412747	cl01109	SYLF	The SYLF domain (also called DUF500), a novel lipid-binding module. Ysc84 is a family of Las17-binding proteins found in metazoa. Together, Las17 and Ysc84 are essential for proper polymerization of actin; Ysc84 is able to bind to and stabilize the actin dimer presented by Las17 and thereby promote polymerization. An active actin cytoskeleton is necessary for adequate endocytosis. (pfam00018), or a FYVE zinc finger (pfam01363).	0
412748	cl01110	Sdh5	Flavinator of succinate dehydrogenase. This family includes the highly conserved mitochondrial and bacterial proteins Sdh5/SDHAF2/SdhE. Both yeast and human Sdh5/SDHAF2 interact with the catalytic subunit of the succinate dehydrogenase (SDH) complex, a component of both the electron transport chain and the tricarboxylic acid cycle. Sdh5 is required for SDH-dependent respiration and for Sdh1 flavination (incorporation of the flavin adenine dinucleotide cofactor). Mutational inactivation of Sdh5 confers tumor susceptibility in humans. Bacterial homologs of Sdh5, termed SdhE, are functionally conserved, being required for the flavinylation of SdhA and succinate dehydrogenase activity. Like Sdh5, SdhE interacts with SdhA. Furthermore, SdhE was characterized as a FAD co-factor chaperone that directly binds FAD to facilitate the flavinylation of SdhA. Phylogenetic analysis demonstrates that SdhE/Sdh5 proteins evolved only once in an ancestral alpha-proteobacteria prior to the evolution of the mitochondria and now remain in subsequent descendants including eukaryotic mitochondria and the alpha, beta and gamma proteobacteria. This family was previously annotated in Pfam as being a divergent TPR repeat but structural evidence has indicated this is not true. The E. coli protein YgfY also acts as the antitoxin to the membrane-bound toxin family CptA (pfam13166), whose E. coli member, YgfX, is expressed from the same operon as YgfY.	0
412749	cl01112	DUF507	Protein of unknown function (DUF507). Bacterial protein of unknown function.	0
412750	cl01115	BMFP	Membrane fusogenic activity. BMFP consists of two structural domains, a coiled-coil C-terminal domain via which the protein self-associates as a trimer, and an N-terminal domain disordered at neutral pH but adopting an amphipathic alpha-helical structure in the presence of phospholipid vesicles, high ionic strength, acidic pH or SDS. BMFP interacts with phospholipid vesicles through the predicted amphipathic alpha-helix induced in the N-terminal half of the protein and promotes aggregation and fusion of vesicles in vitro.	0
412751	cl01118	ThrE	Putative threonine/serine exporter. ThrE_2 is a family of membrane proteins involved in the export of threonine and serine. L-threonine and L-serine are both substrates for the exporter. The exporter exhibits nine to ten predicted transmembrane-spanning helices with long charged C and N termini and an amphipathic helix present within the N-terminus. L-Threonine can be made by the amino acid-producing bacterium Corynebacterium glutamicum, but the potential for amino acid formation can be considerably improved by reducing its intracellular degradation into glycine and increasing its export by this exporter. Members of the family are found in Bacteria, Archaea, and the fungal kingdoms, and the family can exist either as a single long polypeptide chain or as two short polypeptides. All family members show an extended hydrophilic N-terminal domain with weak sequence similarity to portions of hydrolases (proteases, peptidases, and glycosidases); since this region is cytoplasmic, it may be generating the transport substrate, which may imply that threonine is not the primary substrate and that ThrE has a subsidiary function.	0
412752	cl01119	DUF525	ApaG domain. Co2+/Mg2+ efflux protein ApaG; Reviewed	0
412753	cl01120	SspB	Stringent starvation protein B. ClpXP protease specificity-enhancing factor; Provisional	0
412754	cl01122	RdgC	Putative exonuclease, RdgC. recombination associated protein; Reviewed	0
412755	cl01123	Fe-S_assembly	Iron-sulphur cluster assembly. hypothetical protein; Provisional	0
412756	cl01125	LptE	Lipopolysaccharide-assembly. LPS-assembly lipoprotein RlpB; Provisional	0
412757	cl01126	EI24	Etoposide-induced protein 2.4 (EI24). putative sulfate transport protein CysZ; Validated	0
412758	cl01128	DUF535	Protein of unknown function (DUF535). Family member Shigella flexneri VirK is a virulence protein required for the expression, or correct membrane localization, of IcsA (VirG) on the bacterial cell surface. This family also includes Pasteurella haemolytica lapB, which is thought to be membrane-associated.	0
412759	cl01129	NqrM	(Na+)-NQR maturation NqrM. The NqrM gene is often found adjacent to the nqr operons that encode (Na+)-NQR subunits. It is involved in the maturation of (Na+) translocating NADH:quinone oxidoreductase in proteobacteria. The four conserved Cys residues found in NqrM are required for (Na+)- NQR maturation and may serve as ligands for a metal ion or metal cluster used to build up the (Na+)-NQR molecule.	0
412760	cl01131	HlyC	RTX toxin acyltransferase family. Members of this family are enzymes (EC:2.3.1.-) involved in fatty acylation of the protoxins (HlyA) at lysine residues, thereby converting them to the active toxin. Acyl-acyl carrier protein (ACP) is the essential acyl donor. This family shows a number of conserved residues that are possible candidates for participation in acyl transfer. Site-directed mutagenesis of the single conserved histidine residue in Escherichia coli HlyC resulted in complete inactivation of the enzyme.	0
412761	cl01132	FA_hydroxylase	Fatty acid hydroxylase superfamily. beta-carotene hydroxylase	0
412762	cl01133	Na_H_Exchanger	Sodium/hydrogen exchanger family. This family contains a number of bacterial Na+/H+ antiporter 1 proteins. These are integral membrane proteins that catalyze the exchange of H+ for Na+ in a manner that is highly dependent on the pH.	0
412763	cl01135	ABC_trans_aux	ABC-type transport auxiliary lipoprotein component. ABC_trans_aux is a family of bacterial proteins that act as auxiliaries to the ABC-transporter in the gamma-hexachlorocyclohexane uptake permease system in Sphingobium japonicum. Gamma-hexachlorocyclohexane, or Lindane, can be used as the sole source of carbon in S. japonicum in aerobic conditions. Lindane is an insecticide.	0
412764	cl01136	DUF393	Protein of unknown function, DUF393. Members of this family have two highly conserved cysteine residues near their N-terminus. The function of these proteins is unknown.	0
412765	cl01137	YfbU	YfbU domain. This presumed domain is about 160 residues long. It is found in archaebacteria and eubacteria. In Corynebacterium glutamicum Ycg4L it is associated with a helix-turn-helix domain. This suggests that this may be a ligand binding domain.	0
412766	cl01139	Cofac_haem_bdg	Haem-binding uptake, Tiki superfamily, ChaN. This is a family of putative bacterial lipoproteins necessary for the uptake of haem-iron. The structure of UniProtKB:Q0PBW2, Structure 2g5g, comprises a large parallel beta-sheet with flanking alpha-helices and a smaller domain consisting of alpha-helices. Two cofacial haem groups (~3.5 Angstrom apart with an inter-iron distance of 4.4 Angstrom) bind in a pocket formed by a dimer of two ChaN monomers.	0
412767	cl01143	H2O2_YaaD	Peroxide stress protein YaaA. YaaA is a key element of the stress response to H2O2. It acts by reducing intracellular iron levels after peroxide stress, thereby attenuating the Fenton reaction and the DNA damage that this would cause. The molecular mechanism of action is not known.	0
412768	cl01144	YacG	DNA gyrase inhibitor YacG. zinc-binding protein; Provisional	0
412769	cl01146	ZapA	Cell division protein ZapA. cell division protein ZapA; Provisional	0
412770	cl01147	YjgA-like	uncharacterized proteins similar to Escherichia coli YjgA. This family of bacterial proteins has no known function.	0
412771	cl01148	FxsA	FxsA cytoplasmic membrane protein. phage T7 F exclusion suppressor FxsA; Reviewed	0
412772	cl01153	NapB	Nitrate reductase cytochrome c-type subunit (NapB). The napB gene encodes a dihaem cytochrome c, the small subunit of a heterodimeric periplasmic nitrate reductase.	0
294724	cl01162	DUF417	Protein of unknown function, DUF417. This family of uncharacterized proteins appears to be restricted to proteobacteria.	0
412773	cl01163	NapD	NapD protein. Uncharacterized protein involved in formation of periplasmic nitrate reductase.	0
412774	cl01164	Slp	Outer membrane lipoprotein Slp family. Slp superfamily members are present in the Gram-negative gamma proteobacteria Escherichia coli (which also contains a close paralog), Haemophilus influenzae, Pasteurella multocida, and Vibrio cholerae. The known members of the family to date share a motif LX[GA]C near the N-terminus, which is compatible with the possibility that the protein is modified into a lipoprotein with Cys as the new N-terminus. Slp from Escherichia coli is known to be a lipoprotein of the outer membrane and to be expressed in response to carbon starvation. [Cell envelope, Other]	0
412775	cl01166	DUF416	Protein of unknown function (DUF416). This is a bacterial protein family of unknown function. Proteins in this family adopt an alpha helical structure. Genome context analysis has suggested a high probability of a functional association with histidine kinases, which implicates proteins in this family to play a role in signalling (information from TOPSAN 2Q9R).	0
412776	cl01171	RelB	RelB antitoxin. Plasmids may be maintained stably in bacterial populations through the action of addiction modules, in which a toxin and antidote are encoded in a cassette on the plasmid. In any daughter cell that lacks the plasmid, the toxin persists and is lethal after the antidote protein is depleted. Toxin/antitoxin pairs are also found on main chromosomes, and likely represent selfish DNA. Sequences in the seed for this alignment all were found adjacent to toxin genes. The resulting model appears to describe a narrower set of proteins than pfam04221, although many in the scope of this model are not obviously paired with toxin proteins. Several toxin/antitoxin pairs may occur in a single species. [Cellular processes, Toxin production and resistance, Mobile and extrachromosomal element functions, Other]	0
412777	cl01172	YihI	Der GTPase activator (YihI). YihI activates the GTPase activity of Der, a 50S ribosomal subunit stability factor. The stimulation is specific to Der as YihI does not stimulate the GTPase activity of Era or ObgE. The interaction of YihI with Der requires only the C-terminal 78 amino acids of YihI. A yihI deletion mutant is viable and shows a shorter lag period, but the same post-lag growth rate as a wild-type strain. yihI is expressed during the lag period. Overexpression of yihI inhibits cell growth and biogenesis of the 50S ribosomal subunit. YihI is an unusual, highly hydrophilic protein with an uneven distribution of charged residues, resulting in an N-terminal region with high pI and a C-terminal region with low pI.	0
412778	cl01173	UPF0149	Uncharacterized protein family (UPF0149). This family resembles pfam03695 (version pfam03695.3), uncharacterised protein family UPF0149, but is broader in scope and includes additional proteins. It includes E. coli proteins YgfB and YecA. The function of this family of proteins is unknown. The crystal structure is known for the member from Haemophilus influenzae (Ygfb, HI0817). [Unknown function, General]	0
412779	cl01175	DUF1414	Protein of unknown function (DUF1414). hypothetical protein; Provisional	0
412780	cl01178	RseC_MucC	Positive regulator of sigma(E), RseC/MucC. This bacterial family of integral membrane proteins represents a positive regulator of the sigma(E) transcription factor, namely RseC/MucC. The sigma(E) transcription factor is up-regulated by cell envelope protein misfolding, and regulates the expression of genes that are collectively termed ECF (devoted to Extra-Cellular Functions). In Pseudomonas aeruginosa, de-repression of sigma(E) is associated with the alginate-overproducing phenotype characteristic of chronic respiratory tract colonisation in cystic fibrosis patients. The mechanism by which RseC/MucC positively regulates the sigma(E) transcription factor is unknown. RseC is also thought to have a role in thiamine biosynthesis in Salmonella typhimurium. In addition, this family also includes an N-terminal part of RnfF, a Rhodobacter capsulatus protein, of unknown function, that is essential for nitrogen fixation. This protein also contains an ApbE domain pfam02424, which is itself involved in thiamine biosynthesis.	0
412781	cl01179	CcmH	Cytochrome C biogenesis protein. [Energy metabolism, Electron transport]	0
412782	cl01180	UPF0270	Uncharacterized protein family (UPF0270). hypothetical protein; Provisional	0
382420	cl01181	DctQ	Tripartite ATP-independent periplasmic transporters, DctQ component. 2,3-diketo-L-gulonate TRAP transporter small permease protein YiaM; Provisional	0
412783	cl01183	DUF412	Protein of unknown function, DUF412. hypothetical protein; Provisional	0
412784	cl01184	SirB	Invasion gene expression up-regulator, SirB. SirB up-regulates Salmonella typhimurium invasion gene transcription. It is, however, not essential for the expression of these genes. Its function is unknown.	0
412785	cl01187	DUF446	tRNA pseudouridine synthase C. This family is suggested to be the catalytic domain of tRNA pseudouridine synthase C by association. The structure has been solved for one member, as Structure 2HGK, which by inference is designated in this way.	0
412786	cl01190	EpmC	Elongation factor P hydroxylase. This family catalyzes the final step in the elongation factor P modification pathway. It hydroxylates Lys-34 of elongation factor P. Members of this family have a conserved HEXXH motif, suggesting they are putative peptidases of zincin fold.	0
412787	cl01193	DUF463	YcjX-like family, DUF463. These proteins possess a P-loop motif.	0
412788	cl01203	ACP_PD	Acyl carrier protein phosphodiesterase. YajB, now renamed acpH, encodes an ACP hydrolase that converts holo-ACP to apo-ACP by hydrolytic cleavage of the phosphopantetheine prosthetic group from ACP.	0
412789	cl01204	COX4_pro	Prokaryotic Cytochrome C oxidase subunit IV. This family (QoxD) encodes subunit IV of the aa3-type quinone oxidase, one of several bacterial terminal oxidases. This complex couples oxidation of reduced quinones with the reduction of molecular oxygen to water and the pumping of protons to form a proton gradient utilized for ATP production. aa3-type oxidases contain two heme a cofactors as well as copper atoms in the active site. [Energy metabolism, Electron transport]	0
412790	cl01209	DUF480	Protein of unknown function, DUF480. hypothetical protein; Provisional	0
412791	cl01213	DUF481	Protein of unknown function, DUF481. This family includes several proteins of uncharacterized function.	0
412792	cl01215	DUF1315	Protein of unknown function (DUF1315). This family consists of several bacterial proteins of around 90 residues in length. The function of this family is unknown.	0
412793	cl01217	YebG	YebG protein. DNA damage-inducible protein YebG; Provisional	0
412794	cl01219	CheZ	Chemotaxis phosphatase, CheZ. chemotaxis regulator CheZ; Provisional	0
412795	cl01221	DTW	DTW domain. This presumed domain is found in bacterial and eukaryotic proteins. Its function is unknown. The domain contains multiple conserved motifs including a DTXW motif that this domain has been named after.	0
412796	cl01222	T2SSM	Type II secretion system (T2SS), protein M. This family of membrane proteins consists of Type II secretion system protein M sequences from several Gram-negative (diderm) bacteria. The precise function of these proteins is unknown, though in Vibrio cholerae, the T2SM (EpsM) protein interacts with the T2SL (EpsL) protein, and also forms homodimers.	0
412797	cl01223	DUF1249	Protein of unknown function (DUF1249). This family consists of several hypothetical bacterial proteins of around 150 residues in length. The function of this family is unknown.	0
412798	cl01224	DUF805	Protein of unknown function (DUF805). This family consists of several bacterial proteins of unknown function.	0
412799	cl01225	SCP2	SCP-2 sterol transfer family. This domain is found at the C-terminus of alkyl sulfatases. Together with the N-terminal catalytic domain, this domain forms a hydrophobic chute and may recruit hydrophobic substrates.	0
412800	cl01226	T6SS_HCP	Type VI secretion system effector, Hcp. This family includes Hcp1 (hemolysin coregulated protein 1), an exported, homohexameric ring-forming virulence protein from Pseudomonas aeruginosa. Hcp1 lacks a conventional signal sequence and is instead exported by means of the type VI secretion system, encoded by a pathogenicity cluster of a class previously designated IAHP (IcmF-associated homologous protein). Homologs of Hcp1, in this protein family, are found in various bacteria of which most but not all are known pathogens. Pathogens may have multiple members of this family, with three to ten in Erwinia carotovora, Yersinia pestis, uropathogenic Escherichia coli, and the insect pathogen Photorhabdus luminescens. [Cellular processes, Pathogenesis]	0
382439	cl01230	Chor_lyase	Chorismate lyase. This is a family of uncharacterized proteins.	0
412801	cl01231	DUF485	Protein of unknown function, DUF485. This family includes several putative integral membrane proteins.	0
412802	cl01234	PilO	Pilus assembly protein, PilO. The T2SMb family is conserved in Proteobacteria and Actinobacteria, and differs from the T2SM proteins in Vibrio spp. (pfam04612).	0
412803	cl01236	DMT_6	Putative member of DMT superfamily (DUF486). This family contains several proteins of uncharacterized function. The family is represented in the Transport classification database as 2.A.7.34, though the exact nature of what is transported is not known.	0
412804	cl01237	DUF469	Protein with unknown function (DUF469). hypothetical protein; Provisional	0
412805	cl01240	CtaG_Cox11	Cytochrome c oxidase assembly protein CtaG/Cox11. cytochrome C oxidase assembly protein; Provisional	0
412806	cl01244	arom_aa_hydroxylase	N/A. This family includes phenylalanine-4-hydroxylase, the phenylketonuria disease protein.	0
412807	cl01245	META	META domain. Small domain family found in proteins of unknown function. Some are secreted and implicated in motility in bacteria. Also occurs in Leishmania spp. as an essential gene. Over-expression in L.amazonensis increases virulence. A pair of cysteine residues show correlated conservation, suggesting that they form a disulphide bond.	0
382447	cl01246	DUF488	Protein of unknown function, DUF488. This family includes several proteins of uncharacterized function.	0
412808	cl01247	FliO	Flagellar biosynthesis protein, FliO. This short protein found in flagellar biosynthesis operons contains a highly hydrophobic N-terminal sequence followed generally by two basic amino acids. This region is reminiscent of but distinct from the twin-arginine translocation signal sequence. Some instances of this gene have been named "FliZ" but phylogenetic tree building supports a single FliO family.	0
412809	cl01248	EutH	Ethanolamine utilisation protein, EutH. ethanolamine utilization protein EutH; Provisional	0
412810	cl01249	Haem_degrading	Haem-degrading. hypothetical protein; Provisional	0
412811	cl01250	Ureidogly_lyase	Ureidoglycolate lyase. Ureidoglycolate lyase (EC:4.3.2.3) is one of the enzymes that acts upon ureidoglycolate, an intermediate of purine catabolism, releasing urea. The enzyme has in the past been wrongly assigned to EC:3.5.3.19, enzymes which release ammonia from ureidoglycolate.	0
412812	cl01251	OHCU_decarbox	OHCU decarboxylase. Previously thought to only proceed spontaneously, the decarboxylation of 2-oxo-4-hydroxy-4-carboxy-5-ureidoimidazoline (OHCU) has recently been shown to be catalyzed by this enzyme in Mus musculus. Homologs of this enzyme are found adjacent to and fused with uricase in a number of prokaryotes and are represented by this model. This model is a separate (but related) clade from that represented by TIGR3164. This model places a second homolog in streptomyces species which (are not in the vicinity of other urate catabolism associated genes) below the trusted cutoff.	0
412813	cl01252	UPF0167	Uncharacterized protein family (UPF0167). The proteins in this family are about 200 amino acids long and each contain 3 CXXC motifs.	0
412814	cl01253	FixS	Cytochrome oxidase maturation protein cbb3-type. CcoS from Rhodobacter capsulatus has been shown essential for incorporation of redox-active prosthetic groups (heme, Cu) into cytochrome cbb(3) oxidase. FixS of Bradyrhizobium japonicum appears to have the same function. Members of this family are found so far in organisms with a cbb3-type cytochrome oxidase, including Neisseria meningitidis, Helicobacter pylori, Campylobacter jejuni, Caulobacter crescentus, Bradyrhizobium japonicum, and Rhodobacter capsulatus. [Energy metabolism, Electron transport, Protein fate, Protein modification and repair]	0
412815	cl01255	DAGK_cat	Diacylglycerol kinase catalytic domain. Members of this family include ATP-NAD kinases EC:2.7.1.23, which catalyzes the phosphorylation of NAD to NADP utilising ATP and other nucleoside triphosphates as well as inorganic polyphosphate as a source of phosphorus. Also includes NADH kinases EC:2.7.1.86.	0
412816	cl01256	NMN_transporter	Nicotinamide mononucleotide transporter. The PnuC protein of E. coli is a membrane protein responsible for nicotinamide mononucleotide transport, subject to regulation by interaction with the NadR (also called NadI) protein (see TIGR01526). This model defines a region corresponding to most of the length of PnuC, found primarily in pathogens. The extreme N- and C-terminal regions are poorly conserved and not included in the alignment and model. [Transport and binding proteins, Other, Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridine nucleotides]	0
412817	cl01257	DUF2061	Predicted membrane protein (DUF2061). This domain, found in various prokaryotic proteins, has no known function.	0
412818	cl01258	NnrS	NnrS protein. This family consists of several bacterial NnrS like proteins. NnrS is a putative heme-Cu protein (NnrS) and a member of the short-chain dehydrogenase family. Expression of nnrS is dependent on the transcriptional regulator NnrR, which also regulates expression of genes required for the reduction of nitrite to nitrous oxide, including nirK and nor. NnrS is a haem- and copper-containing membrane protein. Genes encoding putative orthologues of NnrS are sometimes but not always found in bacteria encoding nitrite and/or nitric oxide reductase.	0
412819	cl01260	PilZ	PilZ domain. This domain is related to Type IV pilus assembly protein PilZ (pfam07238). It is found in at least 12 copies in Myxococcus xanthus DK 1622.	0
412820	cl01261	DUF2062	Uncharacterized protein conserved in bacteria (DUF2062). Members of this family are uncharacterized proteins, usually encoded by a gene adjacent to a member of family TIGR03545, which is also uncharacterized.	0
412821	cl01262	DUF2063	Putative DNA-binding domain. This family represents the N-terminal part of a Neisseria protein, UniProtKB:Q5F5I0, Structure 3dee. It runs from residues 31-117 as a helical bundle with 4 main helices. From genomic context and the fold of the C-terminal part, it is suggested that this protein is involved in transcriptional regulation.	0
412822	cl01264	PsiE	Phosphate-starvation-inducible E. phosphate-starvation-inducible protein PsiE; Provisional	0
412823	cl01267	Peptidase_M90	Glucose-regulated metallo-peptidase M90. DgsA anti-repressor MtfA; Provisional	0
412824	cl01275	DUF2065	Uncharacterized protein conserved in bacteria (DUF2065). This domain, found in various prokaryotic proteins, has no known function.	0
412825	cl01279	MbtH	MbtH-like protein. This domain is found in the MbtH protein as well as at the N-terminus of the antibiotic synthesis protein NIKP1. This domain is about 70 amino acids long and contains 3 fully conserved tryptophan residues. Many of the members of this family are found in known antibiotic synthesis gene clusters.	0
412826	cl01280	Chlor_dismutase	Chlorite dismutase. putative heme peroxidase; Provisional	0
412827	cl01281	rhaM	L-rhamnose mutarotase. Members of this protein family are rhamnose mutarotase from Escherichia coli, previously designated YiiL as an uncharacterized protein, and close homologs also associated with rhamnose dissimilation operons in other bacterial genomes. Mutarotase is a term for an epimerase that changes optical activity. This enzyme was shown experimentally to interconvert alpha and beta stereoisomers of the pyranose form of L-rhamnose. The crystal structure of this small (104 amino acid) protein shows a locally asymmetric dimer with active site residues of His, Tyr, and Trp. [Energy metabolism, Sugars]	0
412828	cl01282	TRAM	TRAM domain. This small domain has no known function. However it may perform a nucleic acid binding role (Bateman A. unpublished observation).	0
412829	cl01284	DUF1722	Protein of unknown function (DUF1722). hypothetical protein; Provisional	0
412830	cl01285	Gar1	Gar1/Naf1 RNA binding region. H/ACA RNA-protein complex component Gar1; Reviewed	0
412831	cl01287	AE_Prim_S_like	N/A. Members of this family adopt a structure consisting of a core of antiparallel beta sheets. They are found in various bacterial hypothetical proteins, and have been shown to harbour both primase and polymerase activities.	0
382472	cl01288	DUF2067	Uncharacterized protein conserved in archaea (DUF2067). This domain, found in various archaeal proteins, has no known function.	0
412832	cl01294	Baseplate_J	Baseplate J-like protein. This family consists of a large, conserved hypothetical protein in phage tail-like regions of at least six bacterial genomes: Gloeobacter violaceus PCC 7421, Geobacter sulfurreducens PCA, Streptomyces coelicolor A3(2), Streptomyces avermitilis MA-4680, Mesorhizobium loti, and Myxococcus xanthus. The C-terminal region is identified by the broader model pfam04865 as related to baseplate protein J from phage P2, but that relationship is not observed directly. [Mobile and extrachromosomal element functions, Prophage functions]	0
382474	cl01298	Glyco_transf_25	N/A. Members of this family belong to Glycosyltransferase family 25. This is a family of glycosyltransferases involved in lipopolysaccharide (LPS) biosynthesis. These enzymes catalyze the transfer of various sugars onto the growing LPS chain during its biosynthesis.	0
412833	cl01299	DUF2069	Predicted membrane protein (DUF2069). This domain, found in various prokaryotes, has no known function.	0
412834	cl01301	DUF1415	Protein of unknown function (DUF1415). This family consists of several hypothetical bacterial proteins of around 180 residues in length. The function of this family is unknown.	0
412835	cl01304	DUF1289	Protein of unknown function (DUF1289). This family consists of a number of hypothetical bacterial proteins. The aligned region spans around 56 residues and contains 4 highly conserved cysteine residues towards the N-terminus. The function of this family is unknown. Structural modelling suggests this domain may bind nucleic acids.	0
412836	cl01308	CHASE4	CHASE4 domain. CHASE4. This is an extracellular sensory domain, which is present in various classes of transmembrane receptors that are parts of signal transduction pathways in prokaryotes. Specifically, CHASE4 domains are found in histidine kinases in Archaea and in predicted diguanylate cyclases/phosphodiesterases in Bacteria. Environmental factors that are recognized by CHASE4 domains are not known at this time.	0
412837	cl01311	DUF1294	Protein of unknown function (DUF1294). This family includes a number of hypothetical bacterial and archaeal proteins of unknown function.	0
412838	cl01312	Sbt_1	Na+-dependent bicarbonate transporter superfamily. Family of bacterial proteins that are likely to be part of the Na(+)-dependent bicarbonate transporter (sbt) family. Members carry 10TMS in a 5+5 duplicated structure. The loop between helices 5 and 6 in Synechocystis PCC6803 is likely to be the location for regulatory mechanisms governing the activation of the transporter.	0
412839	cl01314	RecU	Recombination protein U. Holliday junction-specific endonuclease; Reviewed	0
412840	cl01315	TANGO2	Transport and Golgi organisation 2. In eukaryotes this family is predicted to play a role in protein secretion and Golgi organisation. In plants this family includes Solanum habrochaites Cwp, which is involved in water permeability in the cuticles of fruit. Mouse Tango2 has been found to be expressed during early embryogenesis in mice. This protein contains a conserved NRDE motif. This gene has been characterized in Drosophila melanogaster and named as transport and Golgi organisation 2, hence the name Tango2.	0
412841	cl01317	Cmr5_III-B	CRISPR/Cas system-associated protein Cmr5. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This family, represented by TM1791.1 of Thermotoga maritima, is found in both archaeal and bacterial species.	0
412842	cl01318	DUF1232	Protein of unknown function (DUF1232). This family represents a conserved region of approximately 60 residues within a number of hypothetical bacterial and archaeal proteins of unknown function.	0
412843	cl01319	HARE-HTH	HB1, ASXL, restriction endonuclease HTH domain. Members of this family are the RNA polymerase delta subunit, as found in the Firmicutes and the Mollicutes. All members of the seed alignment have an extended C-terminal low-complexity region, consisting largely of Asp and Glu, that is not included in the model. Proteins giving borderline scores should be checked to confirm a similar acidic C-terminal domain. [Transcription, DNA-dependent RNA polymerase]	0
412844	cl01321	SURF1	N/A. SURF1 superfamily. Surf1/Shy1 has been implicated in the posttranslational steps of the biogenesis of the mitochondrially-encoded Cox1 subunit of cytochrome c oxidase (complex IV). Cytochrome c oxidase (complex IV), the terminal electron-transferring complex of the respiratory chain, is an assemblage of nuclear and mitochondrially-encoded subunits. Its assembly is mediated by nuclear encoded assembly factors, one of which is Surf1/Shy1. Mutations in human Surf1 are a major cause of Leigh syndrome, a severe neurodegenerative disorder.	0
412845	cl01327	DUF1684	Protein of unknown function (DUF1684). The sequences featured in this family are found in hypothetical archaeal and bacterial proteins of unknown function. The region in question is approximately 200 amino acids long.	0
412846	cl01328	Dodecin	Dodecin. Dodecin is a flavin-binding protein, found in several bacteria and few archaea and represents a stand-alone version of the SHS2 domain. It most closely resembles the SHS2 domains of FtsA and Rpb7p, and represents a single domain small-molecule binding form.	0
412847	cl01329	DUF2071	Uncharacterized conserved protein (COG2071). This conserved protein (similar to YgjF), found in various prokaryotes, has no known function.	0
412848	cl01330	IMP_cyclohyd	IMP cyclohydrolase-like protein. This model represents IMP cyclohydrolase, the final step in the biosynthesis of inosine monophosphate (IMP) in archaea. In bacteria this step is catalyzed by a bifunctional enzyme (purH).	0
412849	cl01332	DUF2073	Uncharacterized protein conserved in archaea (DUF2073). This archaeal protein has no known function.	0
412850	cl01339	DUF1805	Domain of unknown function (DUF1805). This domain is found in bacteria and archaea and has an N terminal tetramerisation region that is composed of beta sheets.	0
412851	cl01342	Peptidase_A22B	Signal peptide peptidase. Mutations in presenilin-1 are a major cause of early onset Alzheimer's disease. It has been found that presenilin-1 binds to beta-catenin in-vivo. This family also contains SPE proteins from C.elegans.	0
412852	cl01346	PaaA_PaaC	Phenylacetic acid catabolic protein. Members of this protein family are BoxB, the B subunit of benzoyl-CoA oxygenase. This oxygen-requiring enzyme acts in an aerobic pathway of benzoate catabolism via coenzyme A ligation. [Energy metabolism, Other]	0
412853	cl01349	YqcI_YcgG	YqcI/YcgG family. This family of proteins are functionally uncharacterized. The family include YqcI and YcgG from B. subtilis. The alignment contains a conserved FPC motif at the N-terminus and CPF at the C-terminus.	0
412854	cl01350	FTCD_C	Formiminotransferase-cyclodeaminase. Members of this family are thought to be Formiminotransferase-cyclodeaminase enzymes EC:4.3.1.4. This domain is found in the C-terminus of the bifunctional animal members of the family.	0
412855	cl01351	Glyco_hydro_8	Glycosyl hydrolases family 8. 	0
412856	cl01356	DUF1508	Domain of unknown function (DUF1508). This family represents a series of bacterial domains of unknown function of around 50 residues in length. Members of this family are often found as tandem repeats and in some cases represent the whole protein. All member proteins are described as being hypothetical.	0
412857	cl01359	OpcA_G6PD_assem	Glucose-6-phosphate dehydrogenase subunit. Members of this family are found in various prokaryotic OpcA and glucose-6-phosphate dehydrogenase proteins. The exact function of the domain is, as yet, unknown.	0
412858	cl01360	Pilin_N	Archaeal Type IV pilin, N-terminal. This entry represents the N-terminal domain of archaeal pilins, which play important roles in surface adhesion and twitching motility. This domain contains a conserved N-terminal hydrophobic motif.	0
412859	cl01365	ZinT	ZinT (YodA) periplasmic lipocalin-like zinc-recruitment. zinc/cadmium-binding protein; Provisional	0
412860	cl01368	GyrI-like	GyrI-like small molecule binding domain. This family contains Cass2 from Vibrio cholerae, an integron-associated protein that has been shown to bind cationic drug compounds with submicromolar affinity. Cass2 has been proposed to be representative of a larger family of independent effector-binding proteins associated with lateral gene transfer within Vibrio and other closely-related species.	0
412861	cl01369	CHASE	CHASE domain. Predicted to be a ligand binding domain.	0
412862	cl01370	DotU	Type VI secretion system protein DotU. At least two families of proteins, often encoded by adjacent genes, show sequence similarity due to homology between type IV secretion systems and type VI secretion systems. One is the IcmF family (TIGR03348). The other is the family described by this model. Members include DotU from the Legionella pneumophila type IV secretion system. Many of the members of this protein family from type VI secretion systems have an additional C-terminal domain with OmpA/MotB homology. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	0
412863	cl01371	PaaB	Phenylacetic acid degradation B. Phenylacetate-CoA oxygenase is comprised of a five gene complex responsible for the hydroxylation of phenylacetate-CoA (PA-CoA) as the second catabolic step in phenylacetic acid (PA) degradation. Although the exact function of this enzyme has not been determined, it has been shown to be required for phenylacetic acid degradation and has been proposed to function in a multicomponent oxygenase acting on phenylacetate-CoA. [Energy metabolism, Other]	0
412864	cl01377	Iron_transport	Fe2+ transport protein. This is a bacterial family of periplasmic proteins that are thought to function in high-affinity Fe2+ transport.	0
412865	cl01378	LicD	LicD family. The LICD family of proteins show high sequence similarity and are involved in phosphorylcholine metabolism. There is evidence to show that LicD2 mutants have a reduced ability to take up choline, have decreased ability to adhere to host cells and are less virulent. These proteins are part of the nucleotidyltransferase superfamily.	0
412866	cl01379	TSPO_MBR	Translocator protein (TSPO)/peripheral-type benzodiazepine receptor (MBR) family. Tryptophan-rich sensory protein (TspO) is an integral membrane protein that acts as a negative regulator of the expression of specific photosynthesis genes in response to oxygen/light. It is involved in the efflux of porphyrin intermediates from the cell. This reduces the activity of coproporphyrinogen III oxidase, which is thought to lead to the accumulation of a putative repressor molecule that inhibits the expression of specific photosynthesis genes. Several conserved aromatic residues are necessary for TspO function: they are thought to be involved in binding porphyrin intermediates. The rat mitochondrial peripheral benzodiazepine receptor (MBR) was shown to not only retain its structure within a bacterial outer membrane, but also to be able to functionally substitute for TspO in TspO- mutants, and to act in a similar manner to TspO in its in situ location: the outer mitochondrial membrane. The biological significance of MBR remains unclear, however. It is thought to be involved in a variety of cellular functions, including cholesterol transport in steroidogenic tissues.	0
412867	cl01380	DUF1440	Protein of unknown function (DUF1440). This family contains a number of bacterial proteins of unknown function approximately 180 residues long. These are possibly integral membrane proteins.	0
412868	cl01381	zinc_ribbon_13	Nucleic-acid-binding protein containing Zn-ribbon domain (DUF2082). This domain, found in various hypothetical prokaryotic proteins, as well as some Zn-ribbon nucleic-acid-binding proteins has no known function.	0
412869	cl01382	PAD	Phenolic Acid Decarboxylase. This family consists of several bacterial phenolic acid decarboxylase proteins. Phenolic acids, also called substituted cinnamic acids, are important lignin-related aromatic acids and natural constituents of plant cell walls. These acids (particularly ferulic, p-coumaric, and caffeic acids) bind the complex lignin polymer to the hemicellulose and cellulose in plants. The Phenolic acid decarboxylase (PAD) gene (pad) is transcriptionally regulated by p-coumaric, ferulic, or caffeic acid; these three acids are the three substrates of PAD.	0
412870	cl01385	DUF1244	Protein of unknown function (DUF1244). This family consists of several short bacterial proteins of around 100 residues in length. The function of this family is unknown.	0
412871	cl01386	2HCT	2-hydroxycarboxylate transporter family. These proteins are members of the Citrate:Cation Symporter (CCS) Family (TC 2.A.24). These proteins have 12 GES predicted transmembrane regions. Most members of the CCS family catalyze citrate uptake with either Na+ or H+ as the cotransported cation. However, one member is specific for L-malate and probably functions by a proton symport mechanism. [Unclassified, Role category not yet assigned]	0
412872	cl01387	DUF3299	Protein of unknown function (DUF3299). This is a family of bacterial proteins of unknown function.	0
412873	cl01389	Phage_sheath_1	Phage tail sheath protein subtilisin-like domain. major tail sheath protein; Provisional	0
412874	cl01390	Phage_tube	Phage tail tube protein FII. major tail tube protein; Provisional	0
412875	cl01391	Phage_P2_GpU	Phage P2 GpU. This family consists of several bacterial and phage proteins of around 130 residues in length which seem to be related to the bacteriophage P2 GpU protein, which is thought to be involved in tail assembly.	0
412876	cl01393	DUF952	Protein of unknown function (DUF952). This family consists of several hypothetical bacterial and plant proteins of unknown function.	0
412877	cl01397	DUF1349	Protein of unknown function (DUF1349). This family consists of several hypothetical bacterial proteins but contains one sequence from Saccharomyces cerevisiae. Members of this family are typically around 200 residues in length. The function of this family is unknown.	0
412878	cl01402	T6SS_VipA	Type VI secretion system, VipA, VC_A0107 or Hcp2. Work by Mougous, et al. (2006), describes IAHP-related loci as a type VI secretion system. This protein family is associated with type VI secretion loci, although not treated explicitly by Mougous, et al. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	0
412879	cl01403	GPW_gp25	Gene 25-like lysozyme. Some members in this family of proteins are annotated as phage related (xkdS); however, currently there is no known function.	0
412880	cl01404	T6SS_TssG	Type VI secretion, TssG. Work by Mougous, et al. (2006), describes IAHP-related loci as a type VI secretion system. This protein family is associated with type VI secretion loci, although not treated explicitly by Mougous, et al. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	0
412881	cl01405	T6SS-SciN	Type VI secretion lipoprotein, VasD, EvfM, TssJ, VC_A0113. Work by Mougous, et al. (2006), describes IAHP-related loci as a type VI secretion system. This protein family is associated with type VI secretion loci, although not treated explicitly by Mougous, et al. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	0
412882	cl01406	T6SS_VasE	Bacterial Type VI secretion, VC_A0110, EvfL, ImpJ, VasE. Work by Mougous, et al. (2006), describes IAHP-related loci as a type VI secretion system. This protein family is associated with type VI secretion loci, although not treated explicitly by Mougous, et al. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	0
412883	cl01407	Rdx	Rdx family. This model represents a domain found in both bacteria and animals, including animal proteins SelT, SelW, and SelH, all of which are selenoproteins. In a CXXC motif near the N-terminus of the domain, selenocysteine may replace the second Cys. Proteins with this domain may include an insert of about 70 amino acids. This model is broader than the current SelW model pfam05169 in Pfam.	0
412884	cl01408	AAL_decarboxy	Alpha-acetolactate decarboxylase. Pyruvate can be fermented to 2,3-butanediol. It is first converted to alpha-acetolactate by alpha-acetolactate synthase, then decarboxylated to acetoin by this enzyme. Acetoin can be reduced in some species to 2,3-butanediol by acetoin reductase. [Energy metabolism, Fermentation]	0
412885	cl01409	DUF2219	Uncharacterized protein conserved in bacteria (DUF2219). This domain, found in various hypothetical bacterial proteins, has no known function.	0
412886	cl01410	DUF2387	Probable metal-binding protein (DUF2387). Members of this family are small proteins, about 70 residues in length, with a basic triplet near the N-terminus and a probable metal-binding motif CPXCX(18)CXXC. Members are found in various Proteobacteria.	0
412887	cl01411	QSregVF_b	Putative quorum-sensing-regulated virulence factor. QSregVF_b is a family of short Pseudomonas proteins that are potential virulence factors. The structure of UniProtKB:Q9HY15 a secreted protein has been solved and deposited as Structure 3npd, from pfam13652. It is predicted that these two adjacent proteins form a single transcriptional unit based on the prediction that together they interact with their adjacent protein PotD, which is the putrescine-binding periplasmic protein in the polyamine uptake system comprising PotABCD. These two adjacent proteins are predicted to be quorum-sensing-regulated virulence factors.	0
412888	cl01412	Alpha-L-AF_C	Alpha-L-arabinofuranosidase C-terminal domain. This entry represents the C terminus (approximately 200 residues) of bacterial and eukaryotic alpha-L-arabinofuranosidase. This catalyses the hydrolysis of non-reducing terminal alpha-L-arabinofuranosidic linkages in L-arabinose-containing polysaccharides.	0
412889	cl01414	DUF971	Protein of unknown function (DUF971). This family consists of several short bacterial proteins and one sequence from Oryza sativa. The function of this family is unknown.	0
412890	cl01416	Fimbrial	Fimbrial protein. FimA is a family of Gram-negative fimbrial component A proteins that form part of the pili. There are usually up to 1000 copies of this subunit in one pilus that form a helically wound rod onto which the tip fibrillum (FimF, FimG, FimH) is attached. Pilus subunits are translocated from the cytoplasm to the periplasm via the general secretory pathway SecYEG.	0
412891	cl01417	Nuc-transf	Predicted nucleotidyltransferase. hypothetical protein; Provisional	0
412892	cl01419	DUF1284	Protein of unknown function (DUF1284). This family consists of several hypothetical bacterial and archaeal proteins of around 130 residues in length. The function of this family is unknown, although it is thought that they may be iron-sulphur binding proteins.	0
412893	cl01421	DUF1211	Protein of unknown function (DUF1211). This family represents a conserved region within a number of hypothetical proteins of unknown function found in eukaryotes, bacteria and archaea. These may possibly be integral membrane proteins.	0
412894	cl01424	DUF2218	Uncharacterized protein conserved in bacteria (DUF2218). This domain, found in various hypothetical bacterial proteins, has no known function.	0
412895	cl01425	Glycolipid_bind	Putative glycolipid-binding. This family has a novel fold known as a spiral beta-roll, consisting of a 15-stranded beta sheet wrapped around a single alpha helix. It forms dimers. It has some structural similarity to the E. coli lipoprotein localization factors LolA and LolB. Its structure suggests that it may have a role in glycolipid binding. Its genomic context supports a role in glycolipid metabolism.	0
412896	cl01427	DUF2214	Predicted membrane protein (DUF2214). This domain, found in various hypothetical bacterial proteins, has no known function.	0
412897	cl01430	AntA	AntA/AntB antirepressor. In E. coli the two proteins AntA and AntB have 62% amino acid identities near their N termini. AntA appears to be encoded by a truncated and divergent copy of AntB. The two proteins are homologous to putative antirepressors found in numerous bacteriophages, such as the hypothetical antirepressor protein encoded by the gene LO142 of the bacteriophage 933W.	0
412898	cl01432	DUF779	Protein of unknown function (DUF779). This family consists of several bacterial proteins of unknown function.	0
412899	cl01435	NTP_transf_6	Nucleotidyltransferase. This family consists of several hypothetical bacterial proteins of unknown function. This family was recently identified as belonging to the nucleotidyltransferase superfamily.	0
412900	cl01438	zf-AN1	AN1-like Zinc finger. Zinc finger at the C-terminus of An1, a ubiquitin-like protein in Xenopus laevis.	0
412901	cl01439	3D_domain	3D domain, named for 3 conserved aspartate residues, is found in mltA-like lytic transglycosylases and numerous other contexts. This short presumed domain contains three conserved aspartate residues, hence the name 3D. It has been shown to be part of the catalytic double psi beta barrel domain of MltA.	0
412902	cl01440	TOBE_2	TOBE domain. The TOBE domain (Transport-associated OB) always occurs as a dimer as the C-terminal strand of each domain is supplied by the partner. Probably involved in the recognition of small ligands such as molybdenum and sulfate. Found in ABC transporters immediately after the ATPase domain.	0
412903	cl01441	DUF5655	Domain of unknown function (DUF5655). This family of proteins is found in bacteria, archaea, eukaryotes and viruses. Proteins in this family are typically between 122 and 304 amino acids in length.	0
412904	cl01445	DUF4065	Protein of unknown function (DUF4065). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea and viruses. Proteins in this family are typically between 155 and 202 amino acids in length.	0
412905	cl01449	DUF2240	Uncharacterized protein conserved in archaea (DUF2240). This domain, found in various hypothetical archaeal proteins, has no known function.	0
412906	cl01453	DUF1275	Protein of unknown function (DUF1275). This family consists of several hypothetical bacterial proteins of around 200 residues in length. The function of this family is unknown although most members have 6 TM regions, and may be putative permeases.	0
412907	cl01454	PhnG	Phosphonate metabolism protein PhnG. PhnH is a component of the C-P lyase system (GenProp0232) for the catabolism of phosphonate compounds. The specific function of this component is unknown. This model is based on pfam06754.2, and has been broadened to include sequences missed by that model which are clearly true positive hits based on genome context.	0
412908	cl01455	PhnH	Bacterial phosphonate metabolism protein (PhnH). PhnH is a component of the C-P lyase system (GenProp0232) for the catabolism of phosphonate compounds. The specific function of this component is unknown. This model is based on pfam05845.2, and has been broadened to include sequences missed by that model which are clearly true positive hits based on genome context.	0
412909	cl01456	PhnI	Bacterial phosphonate metabolism protein (PhnI). This family consists of several Proteobacterial phosphonate metabolism protein (PhnI) sequences. Bacteria that use phosphonates as a phosphorus source must be able to break the stable carbon-phosphorus bond. In Escherichia coli phosphonates are broken down by a C-P lyase that has a broad substrate specificity. The genes for phosphonate uptake and degradation in E. coli are organized in an operon of 14 genes, named phnC to phnP. Three gene products (PhnC, PhnD and PhnE) comprise a binding protein-dependent phosphonate transporter, which also transports phosphate, phosphite, and certain phosphate esters such as phosphoserine; two gene products (PhnF and PhnO) may have a role in gene regulation; and nine gene products (PhnG, PhnH, PhnI, PhnJ, PhnK, PhnL, PhnM, PhnN, and PhnP) probably comprise a membrane-associated C-P lyase enzyme complex.	0
412910	cl01457	PhnJ	Phosphonate metabolism protein PhnJ. This family consists of several bacterial phosphonate metabolism (PhnJ) sequences. The exact role that PhnJ plays in phosphonate utilisation is unknown.	0
412911	cl01458	OAD_gamma	Oxaloacetate decarboxylase, gamma chain. This model finds the subfamily of distantly related, low complexity, hydrophobic small subunits of several related sodium ion-pumping decarboxylases. These include oxaloacetate decarboxylase gamma subunit and methylmalonyl-CoA decarboxylase delta subunit. Most sequences scoring between the noise and trusted cutoffs are eukaryotic sodium channel proteins.	0
412912	cl01461	DUF2239	Uncharacterized protein conserved in bacteria (DUF2239). This domain, found in various hypothetical bacterial proteins, has no known function.	0
412913	cl01462	ANT	Phage antirepressor protein KilAC domain. This domain was called the KilAC domain by Iyer and colleagues.	0
412914	cl01464	DUF2238	Predicted membrane protein (DUF2238). hypothetical protein; Provisional	0
412915	cl01465	Cas7_I-C	CRISPR/Cas system-associated RAMP superfamily protein Cas7. CRISPR-associated protein Cas7 is one of the components of the type I-B cascade-like antiviral defense complex. In Haloferax volcanii, Cas5, Cas6 and Cas7 form a small complex that aids the stability of CRISPR-derived RNA.	0
412916	cl01467	DUF2237	Uncharacterized protein conserved in bacteria (DUF2237). This domain, found in various hypothetical bacterial proteins, has no known function.	0
412917	cl01472	DUF2236	Uncharacterized protein conserved in bacteria (DUF2236). This domain, found in various hypothetical bacterial proteins, has no known function. This family contains a highly conserved arginine and histidine that may be active site residues for an as yet unknown catalytic activity.	0
412918	cl01474	DUF1989	Domain of unknown function (DUF1989). A number of bacteria degrade urea as a nitrogen source by the urea carboxylase/allophanate hydrolase pathway, which uses biotin and consumes ATP, rather than by means of the nickel-dependent enzyme urease. This model represents one of a pair of homologous, tandem uncharacterized genes found together with the urea carboxylase and allophanate hydrolase genes.	0
412919	cl01480	DUF2235	Uncharacterized alpha/beta hydrolase domain (DUF2235). This domain, found in various hypothetical bacterial proteins, has no known function.	0
412920	cl01481	DDE_Tnp_IS1595	ISXO2-like transposase domain. Most transposases of this family of transposases, IS1595, have an additional short N-terminal domain with a pair of CxxC motifs.	0
412921	cl01482	CpxP_like	CpxP component of the bacterial Cpx-two-component system and related proteins. This is a metal-binding protein which is involved in resistance to heavy-metal ions. The protein forms a four-helix hooked hairpin, consisting of two long alpha helices each flanked by a shorter alpha helix. It binds a metal ion in a type-2 like centre. It contains two copies of an LTXXQ motif.	0
412922	cl01483	Com_YlbF	Control of competence regulator ComK, YlbF/YmcA. YlbF is a family of short Gram-positive and archaeal proteins that includes both YlbF and YmcA which may interact synergistically. The family is necessary for correct biofilm formation, as null mutants of ymcA and ylbF fail to form pellicles at air-liquid interfaces and grow on solid media as smooth, undifferentiated colonies. During development, YmcA, YlbF and YaaT, family PSPI, pfam04468, interact directly with one another forming a stable ternary complex, in vitro. All three proteins are required for competence, sporulation and the formation of biofilms. The YmcA-YlbF-YaaT complex affects the phosphotransfer between Spo0F and Spo0B, thus accelerating the production of Spo0A~P. The three processes of biofilm formation, mature spore formation and competence all require the active, phosphorylated form of Spo0A, as Spo0A-P.	0
412923	cl01487	DUF1007	Protein of unknown function (DUF1007). Family of conserved bacterial proteins with unknown function.	0
412924	cl01491	NYN_YacP	YacP-like NYN domain. This family consists of bacterial proteins related to YacP. This family is uncharacterized functionally, but it has been suggested that these proteins are nucleases due to them containing a NYN domain. NYN (for N4BP1, YacP-like Nuclease) domains were discovered by Anantharaman and Aravind. Based on gene neighborhoods it was suggested that the bacterial YacP proteins interact with the Ribonuclease III and TrmH methylase in a processome complex that catalyzes the maturation of rRNA and tRNA.	0
412925	cl01492	DUF1980	Domain of unknown function (DUF1980). Members of this occur in gene pairs with members of pfam03773. The N-terminal region contains several predicted transmembrane helix regions while the few invariant residues (G, CxxD, and W) occur in the C-terminal region.	0
412926	cl01498	CitX	Apo-citrate lyase phosphoribosyl-dephospho-CoA transferase. 2'-(5''-triphosphoribosyl)-3'-dephospho-CoA:apo-citrate lyase; Reviewed	0
412927	cl01500	VirB8	VirB8 protein. conjugal transfer protein TrbF; Provisional	0
412928	cl01501	VirB3	Type IV secretory pathway, VirB3-like protein. type IV secretion system protein VirB3; Provisional	0
382568	cl01503	TrbL	TrbL/VirB6 plasmid conjugal transfer protein. conjugal transfer protein TrbL; Provisional	0
412929	cl01505	YhhN	YhhN family. The members of this family are similar to the hypothetical protein yhhN expressed by E. coli. Many are annotated as possible transmembrane proteins, and in fact they all have a high proportion of hydrophobic residues. A human member of this family, formerly known as TMEM86B, is a lysoplasmalogenase that catalyzes the hydrolysis of the vinyl ether bond of lysoplasmalogen. Putative conserved active site residues have been proposed for the YhhN family.	0
412930	cl01506	EII-Sor	PTS system sorbose-specific iic component. Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Man (PTS splinter group) family is unique in several respects among PTS permease families. It is the only PTS family in which members possess a IID protein. It is the only PTS family in which the IIB constituent is phosphorylated on a histidyl rather than a cysteyl residue. Its permease members exhibit broad specificity for a range of sugars, rather than being specific for just one or a few sugars. The mannose permease of E. coli, for example, can transport and phosphorylate glucose, mannose, fructose, glucosamine, N-acetylglucosamine, and other sugars. Other members of this family can transport sorbose, fructose and N-acetylglucosamine. This family is specific for the sorbose-specific IIC subunits of this family of PTS transporters. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS]	0
412931	cl01507	EIID-AGA	PTS system mannose/fructose/sorbose family IID component. Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Man family is unique in several respects among PTS permease families. It is the only PTS family in which members possess a IID protein. It is the only PTS family in which the IIB constituent is phosphorylated on a histidyl rather than a cysteyl residue. Its permease members exhibit broad specificity for a range of sugars, rather than being specific for just one or a few sugars. The mannose permease of E. coli, for example, can transport and phosphorylate glucose, mannose, fructose, glucosamine, N-acetylglucosamine, and other sugars. Other members of this family can transport sorbose, fructose and N-acetylglucosamine. This family is specific for the IID subunits of this family of PTS transporters. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS]	0
412932	cl01508	KduI	KduI/IolB family. Members of this protein family, 5-deoxy-glucuronate isomerase (iolB), represent one of eight enzymes in a pathway converting myo-inositol to acetyl-CoA. [Energy metabolism, Sugars]	0
412933	cl01509	ChuX_HutX	Haem utilisation ChuX/HutX. The Yersinia enterocolitica O:8 periplasmic binding-protein-dependent transport system consisted of four proteins: the periplasmic haemin-binding protein HemT, the haemin permease protein HemU, the ATP-binding hydrophilic protein HemV and the haemin-degrading protein HemS (this family). The structure for HemS has been solved and consists of a tandem repeat of this domain.	0
412934	cl01511	AstB	Succinylarginine dihydrolase. Members of this family are succinylarginine dihydrolase (EC 3.5.3.23), the second of five enzymes in the arginine succinyltransferase (AST) pathway. [Energy metabolism, Amino acids and amines]	0
412935	cl01513	Terminase_2	Terminase small subunit. Packaging of double-stranded viral DNA concatemers requires interaction of the prohead with virus DNA. This process is mediated by a phage-encoded DNA recognition and terminase protein. The terminase enzymes described so far, which are hetero-oligomers composed of a small and a large subunit, do not have a significant level of sequence homology. The small terminase subunit is thought to form a nucleoprotein structure that helps to position the terminase large subunit at the packaging initiation site.	0
412936	cl01515	EII-GUT	PTS system enzyme II sorbitol-specific factor. Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Gut family consists only of glucitol-specific transporters, but these occur both in Gram-negative and Gram-positive bacteria. The E. coli system consists of a IIA protein, a IIC protein and a IIBC protein. This family is specific for the IIC component. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS]	0
412937	cl01516	PTSIIA_gutA	PTS system glucitol/sorbitol-specific IIA component. Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. This family consists only of glucitol-specific transporters, which occur both in Gram-negative and Gram-positive bacteria. The system in E. coli consists of a IIA protein and a IIBC protein. This family is specific for the IIA component. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS]	0
412938	cl01519	DUF1287	Domain of unknown function (DUF1287). This family consists of several hypothetical bacterial proteins of around 200 residues in length. The function of this family is unknown. This family is related to pfam00877.	0
412939	cl01520	DUF817	Protein of unknown function (DUF817). This family consists of several bacterial proteins of unknown function.	0
321544	cl01521	Peptidase_S78	Caudovirus prohead serine protease. This model describes the prohead protease of HK97 and related phage. It is generally encoded next to the gene for the capsid protein that it processes, and in some cases may be fused to it. This family does not show similarity to the prohead protease of phage T4 (see pfam03420). [Mobile and extrachromosomal element functions, Prophage functions, Protein fate, Other]	0
412940	cl01522	FGase	N-formylglutamate amidohydrolase. In some species, histidine is converted via urocanate and then formimino-L-glutamate to glutamate in four steps, where the fourth step is conversion of N-formimino-L-glutamate to L-glutamate and formamide. In others, that pathway from formimino-L-glutamate may differ, with the next enzyme being formiminoglutamate hydrolase (HutF) yielding N-formyl-L-glutamate. This model represents the enzyme N-formylglutamate deformylase, also called N-formylglutamate amidohydrolase, which then produces glutamate. [Energy metabolism, Amino acids and amines]	0
412941	cl01525	Terminase_4	Phage terminase, small subunit. This model describes a distinct family of phage (and integrated prophage) putative terminase small subunit. Members tend to be adjacent to the phage terminase large subunit gene. [Mobile and extrachromosomal element functions, Prophage functions]	0
412942	cl01526	DUF934	Bacterial protein of unknown function (DUF934). This family consists of several bacterial proteins of unknown function. One of the members of this family BMEI1764 is thought to be an oxidoreductase.	0
412943	cl01528	DUF937	Bacterial protein of unknown function (DUF937). hypothetical protein; Provisional	0
412944	cl01529	GH99_GH71_like	Glycoside hydrolase families 71, 99, and related domains. This domain, around 350 residues, is mainly found in some uncharacterized proteins from bacteroides to human. Some proteins in this family, annotated as endo-alpha-mannosidases cleave mannoside linkages internally within an N-linked glycan chain, short circuiting the classical N-glycan biosynthetic pathway. This domain reveals a (beta-alpha)(8) barrel fold in which the catalytic centre is present in a long substrate-binding groove, consistent with cleavage within the N-glycan chain, providing a foundation upon which to develop new enzyme inhibitors targeting the hijacking of N-glycan synthesis in viral disease and cancer.	0
412945	cl01530	LprI	Lysozyme inhibitor LprI. This family consists of several bacterial proteins of around 120 residues in length. Members of this family contain four highly conserved cysteine residues. Family members include lipoprotein LprI from Mycobacterium, which binds to and inhibits macrophage lysozyme, which may aid bacterial survival.	0
412946	cl01531	DUF1376	Protein of unknown function (DUF1376). This family consists of several hypothetical bacterial proteins of around 95 residues in length. The function of this family is unknown.	0
412947	cl01532	HutD	HutD. HutD from Pseudomonas fluorescens SBW25 is a component of the histidine uptake and utilisation operon. HutD is operonic with the well characterized repressor protein HutC. Genetic analysis using transcriptional fusions (lacZ) and deletion mutants shows that hutD is necessary to maintain fitness in environments replete with histidine. Evidence outlined by Zhang & Rainey (2007) suggests that HutD functions as a governor that sets an upper bound on the level of hut operon transcription. The mechanistic basis is unknown, but in silico molecular docking studies based on the crystal structure of PA5104 (HutD from Pseudomonas aeruginosa) show that urocanate (the first breakdown product of histidine) docks with the active site of HutD.	0
412948	cl01533	DUF1304	Protein of unknown function (DUF1304). This family consists of several hypothetical bacterial proteins of around 120 residues in length. The function of this family is unknown.	0
412949	cl01534	NDUFA12	NADH ubiquinone oxidoreductase subunit NDUFA12. NADH:ubiquinone oxidoreductase 18 kDa subunit; Provisional	0
412950	cl01535	TPM_phosphatase	TPM domain. This family was first named TPM domain after its founding proteins: TLP18.3, Psb32 and MOLO-1. In Arabidopsis, this domain is called the thylakoid acid phosphatase (TAP) domain and has a Rossmann-like fold. In plants, the family resides in the thylakoid lumen attached to the outer membrane of the chloroplast/plastid. It is active in the photosystem II.	0
412951	cl01538	Peptidase_M74	Penicillin-insensitive murein endopeptidase. penicillin-insensitive murein endopeptidase; Reviewed	0
412952	cl01539	LapA_dom	Lipopolysaccharide assembly protein A domain. This family includes a domain found in lipopolysaccharide assembly protein A (LapA). LapA functions along with LapB in the assembly of lipopolysaccharide (LPS). Domains in this family are also found in some uncharacterized bacterial proteins.	0
412953	cl01542	DUF2313	Uncharacterized protein conserved in bacteria (DUF2313). Members of this family of proteins comprise various hypothetical and putative bacteriophage tail proteins.	0
412954	cl01544	Bestrophin	Bestrophin, RFP-TM, chloride channel. Bestrophin is a 68-kDa basolateral plasma membrane protein expressed in retinal pigment epithelial cells (RPE). It is encoded by the VMD2 gene, which is mutated in Best macular dystrophy, a disease characterized by a depressed light peak in the electrooculogram. VMD2 encodes a 585-amino acid protein with an approximate mass of 68 kDa which has been designated bestrophin. Bestrophin shares homology with the Caenorhabditis elegans RFP gene family, named for the presence of a conserved arginine (R), phenylalanine (F), proline (P), amino acid sequence motif. Bestrophin is a plasma membrane protein, localized to the basolateral surface of RPE cells consistent with a role for bestrophin in the generation or regulation of the EOG light peak. Bestrophin and other RFP family members represent a new class of chloride channels, indicating a direct role for bestrophin in generating the light peak. The VMD2 gene underlying Best disease was shown to represent the first human member of the RFP-TM protein family. More than 97% of the disease-causing mutations are located in the N-terminal RFP-TM domain implying important functional properties. The bestrophins are four-pass transmembrane chloride-channel proteins, and the RFP-TM or bestrophin domain extends from the N-terminus through approximately 350 amino acids and contains all of the TM domains as well as nearly all reported disease causing mutations. Interestingly, the RFP motif is not conserved evolutionarily back beyond Metazoa, neither is it in plant members.	0
412955	cl01545	DUF1853	Domain of unknown function (DUF1853). This family of proteins is functionally uncharacterized.	0
412956	cl01546	Cytochrom_B562	Cytochrome b562. cytochrome b562; Provisional	0
412957	cl01547	DUF1318	Protein of unknown function (DUF1318). This family consists of several bacterial proteins of around 100 residues in length and is often known as YdbL. The function of this family is unknown.	0
412958	cl01548	YccV-like	Hemimethylated DNA-binding protein YccV like. YccV is a hemimethylated DNA binding protein which has been shown to regulate dnaA gene expression. The structure of one of the hypothetical proteins in this family has been solved and it forms a beta sheet structure with a terminating alpha helix.	0
412959	cl01551	DUF2170	Uncharacterized protein conserved in bacteria (DUF2170). This domain, found in various hypothetical prokaryotic proteins, has no known function.	0
412960	cl01553	GFA	Glutathione-dependent formaldehyde-activating enzyme. glutathione-dependent formaldehyde-activating enzyme; Provisional	0
412961	cl01557	DUF1697	Protein of unknown function (DUF1697). This family contains many hypothetical bacterial proteins.	0
412962	cl01558	DUF2171	Uncharacterized protein conserved in bacteria (DUF2171). This domain, found in various hypothetical prokaryotic proteins, has no known function.	0
412963	cl01561	DUF924	Bacterial protein of unknown function (DUF924). This family consists of several hypothetical bacterial proteins of unknown function. Structurally, this family resembles TPR-like repeats.	0
412964	cl01562	DOPA_dioxygen	Dopa 4,5-dioxygenase family. This family of proteins are related to a DOPA 4,5-dioxygenase that is involved in synthesis of betalain. DOPA-dioxygenase is the key enzyme involved in betalain biosynthesis. It converts 3,4-dihydroxyphenylalanine to betalamic acid, a yellow chromophore.	0
412965	cl01565	zf-TFIIB	Transcription factor zinc-finger. 	0
412966	cl01566	YjhX_toxin	Putative toxin of bacterial toxin-antitoxin pair. hypothetical protein; Provisional	0
412967	cl01567	DUF1993	Domain of unknown function (DUF1993). This family of proteins is functionally uncharacterized.	0
412968	cl01570	DUF2085	Predicted membrane protein (DUF2085). This domain, found in various hypothetical prokaryotic proteins, has no known function.	0
412969	cl01573	DUF969	Protein of unknown function (DUF969). Family of uncharacterized bacterial membrane proteins.	0
412970	cl01575	DUF599	Protein of unknown function, DUF599. This family includes several uncharacterized proteins.	0
412971	cl01577	MMP_TTHA0227_like	Minimal MMP-like domain found in Thermus thermophilus TTHA0227, Acidothermus cellulolyticus ACEL2062 and similar proteins. This family of proteins has a conserved HEXXH motif, suggesting they are putative peptidases of zincin fold. The structure of this family is a minimal version of the metalloprotease fold (Structure 3E11).	0
412972	cl01581	WGR	WGR domain. This domain is found in a variety of polyA polymerases as well as the E. coli molybdate metabolism regulator and other proteins of unknown function. I have called this domain WGR after the most conserved central motif of the domain. The domain is found in isolation in proteins such as Rhizobium radiobacter Ych and is between 70 and 80 residues in length. I propose that this may be a nucleic acid binding domain.	0
412973	cl01583	TrbC	TrbC/VIRB2 family. conjugal transfer protein TrbC; Provisional	0
412974	cl01585	Flp_Fap	Flp/Fap pilin component. 	0
412975	cl01587	DUF1290	Protein of unknown function (DUF1290). This family consists of several bacterial small basic proteins of around 100 residues in length. The function of this family is unknown.	0
412976	cl01589	DUF2087	Uncharacterized protein conserved in bacteria (DUF2087). This domain, found in various hypothetical prokaryotic proteins and transcriptional activators, has no known function. Structural modelling suggests this domain may bind nucleic acids.	0
412977	cl01590	DUF2382	Domain of unknown function (DUF2382). This model describes an uncharacterized domain, sometimes found in association with a PRC-barrel domain (pfam05239, which is also found in rRNA processing protein RimM and in a photosynthetic reaction center complex protein). This domain is found in proteins from Bacillus subtilis, Deinococcus radiodurans, Nostoc sp. PCC 7120, Myxococcus xanthus, and several other species. The function is not known.	0
412978	cl01595	DUF1385	Protein of unknown function (DUF1385). This family contains a number of hypothetical bacterial proteins of unknown function approximately 300 residues in length. Some family members are predicted to be metal-dependent.	0
412979	cl01596	Spore_YtfJ	Sporulation protein YtfJ (Spore_YtfJ). Members of this protein family, exemplified by YtfJ of Bacillus subtilis, are encoded by bacterial genomes if and only if the species is capable of endospore formation. YtfJ was confirmed in spores of Bacillus subtilis; it appears to be expressed in the forespore under control of SigF. [Cellular processes, Sporulation and germination]	0
412980	cl01598	DUF1343	Protein of unknown function (DUF1343). This family consists of several hypothetical bacterial proteins of around 400 residues in length. The function of this family is unknown.	0
412981	cl01600	DUF1963	Domain of unknown function (DUF1963). This domain is found in a set of hypothetical bacterial proteins. Its exact function has not, as yet, been described.	0
412982	cl01604	MliC	Membrane-bound lysozyme-inhibitor of c-type lysozyme. lysozyme inhibitor; Provisional	0
412983	cl01608	DUF1292	Protein of unknown function (DUF1292). hypothetical protein; Provisional	0
412984	cl01610	Cytochrom_C_2	Cytochrome C'. 	0
412985	cl01611	DUF2094	Uncharacterized protein conserved in bacteria (DUF2094). Members of this protein family are found exclusively, although not universally, in bacterial species that possess a type VI secretion system. Genes are found in type VI secretion-associated gene clusters. The specific function is unknown. This model represents the rather well-conserved amino-terminal domain of a protein family in which carboxy-terminal regions, when present, show little conservation. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	0
412986	cl01614	DUF997	Protein of unknown function (DUF997). hypothetical protein; Provisional	0
412987	cl01617	ExoD	Exopolysaccharide synthesis, ExoD. Among the bacterial genes required for nodule invasion are the exo genes. These genes are involved in the production of an extracellular polysaccharide. Mutations in the exoD result in altered exopolysaccharide production and defects in nodule invasion.	0
412988	cl01626	Rod-binding	Rod binding protein. peptidoglycan hydrolase; Reviewed	0
412989	cl01627	LAB_N	Lipid A Biosynthesis N-terminal domain. This family is found at the N-terminus of a group of Chlamydial Lipid A biosynthesis proteins. It is also found by itself in a family of proteins of unknown function.	0
412990	cl01628	DUF1919	Domain of unknown function (DUF1919). This domain has no known function. It is found in various hypothetical and putative bacterial proteins.	0
412991	cl01629	TPP_enzymes	N/A. This family contains 1-deoxyxylulose-5-phosphate synthase (DXP synthase), an enzyme which catalyzes the thiamine pyrophosphate-dependent acyloin condensation reaction between carbon atoms 2 and 3 of pyruvate and glyceraldehyde 3-phosphate, to yield 1-deoxy-D-xylulose-5-phosphate, a precursor in the biosynthetic pathway to isoprenoids, thiamine (vitamin B1), and pyridoxol (vitamin B6).	0
382632	cl01632	DUF2095	Uncharacterized protein conserved in archaea (DUF2095). This domain, found in various hypothetical prokaryotic proteins, has no known function.	0
412992	cl01633	DUF5611	Domain of unknown function (DUF5611). This is a domain of unknown function. Studies of the TA0095 gene product indicate that this 96-residue hypothetical protein from Thermoplasma acidophilum is a member of the COG4004 orthologous group of unknown function found in Archaea bacteria. The structure displays an alpha/beta two-layer sandwich architecture formed by three alpha-helices and five beta-strands. Furthermore, structural homologs indicate that the TA0095 structure belongs to the TBP-like fold.	0
412993	cl01636	DUF749	Domain of unknown function (DUF749). Archaeal domain of unknown function. This domain has been solved as part of a structural genomics project and comprises segregated helical and anti-parallel beta sheet regions.	0
412994	cl01637	DUF2096	Uncharacterized protein conserved in archaea (DUF2096). This domain, found in various hypothetical prokaryotic proteins, has no known function.	0
412995	cl01638	DUF1786	Putative pyruvate formate-lyase activating enzyme (DUF1786). This family is annotated as pyruvate formate-lyase activating enzyme (EC:1.97.1.4) in UniProt. It is not clear where this annotation comes from.	0
412996	cl01639	DUF2097	Uncharacterized protein conserved in archaea (DUF2097). This domain, found in various hypothetical prokaryotic proteins, has no known function.	0
412997	cl01640	DUF2098	Uncharacterized protein conserved in archaea (DUF2098). This domain, found in various hypothetical prokaryotic proteins, has no known function.	0
412998	cl01641	UPF0254	Uncharacterized protein family (UPF0254). hypothetical protein; Provisional	0
412999	cl01642	DUF1188	Protein of unknown function (DUF1188). This family consists of several hypothetical archaeal proteins of around 260 residues in length which seem to be specific to Methanobacterium, Methanococcus and Methanopyrus species. The function of this family is unknown.	0
413000	cl01645	DUF2099	Uncharacterized protein conserved in archaea (DUF2099). Members of this protein family, to date, are found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it.	0
261026	cl01648	DUF2101	Predicted membrane protein (DUF2101). This domain, found in various archaeal and bacterial proteins, has no known function.	0
413001	cl01650	DUF2102	Uncharacterized protein conserved in archaea (DUF2102). Members of this protein family, to date, are found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it. [Energy metabolism, Methanogenesis]	0
413002	cl01651	DUF2103	Predicted metal-binding protein (DUF2103). This domain, found in various putative metal binding prokaryotic proteins, has no known function.	0
413003	cl01653	DUF1894	Domain of unknown function (DUF1894). Members of this family have an important role in methanogenesis. They assume an alpha-beta globular structure consisting of six beta-strands and three alpha-helices forming the secondary structural topological arrangement of alpha1-beta1-alpha2-beta2-beta3-beta4-beta5-beta6-alpha3.	0
413004	cl01655	DUF2104	Predicted membrane protein (DUF2104). This domain, found in various hypothetical archaeal proteins, has no known function.	0
413005	cl01656	DUF2105	Predicted membrane protein (DUF2105). This domain, found in various hypothetical archaeal proteins, has no known function.	0
413006	cl01659	DUF2108	Predicted membrane protein (DUF2108). This domain, found in various hypothetical archaeal proteins, has no known function.	0
413007	cl01662	NiFe_hyd_3_EhaA	NiFe-hydrogenase-type-3 Eha complex subunit A. Energy-converting [NiFe] hydrogenases are membrane-bound enzymes with a six-subunit core: the large and small hydrogenase subunits, plus two hydrophilic proteins and two integral membrane proteins. Their large and small subunits show little sequence similarity to other [NiFe] hydrogenases, except for key conserved residues coordinating the active site and [FeS] cluster. Energy-converting [NiFe] hydrogenases function as ion pumps, catalyzing the reduction of ferredoxin with H2 driven by the proton-motive force or the sodium-ion-motive force. Eha and Ehb hydrogenases contain extra subunits in addition to those shared by other energy-converting [NiFe] hydrogenases (or [NiFe]-hydrogenase-3-type). Eha contains a 6[4Fe-4S] polyferredoxin, a 10[4Fe-4S] polyferredoxin, ten other predicted integral membrane proteins (EhaA, EhaB, EhaC, EhaD, EhaE, EhaF, EhaG, EhaI, EhaK, EhaL) and four hydrophobic subunits (EhaM, EhaR, EhS, EhT). Eha and Ehb catalyze the reduction of low-potential redox carriers (e.g. ferredoxins or polyferredoxins), which then might function as electron donors to oxidoreductases. Based on sequence similarity and genome context analysis, other organisms such as Methanopyrus kandleri, Methanocaldococcus jannaschii, and Methanothermobacter marburgensis also encode Eha-like [NiFe]-hydrogenase-3-type complexes and have very similar eha operon structure. This domain family can be found on the small membrane proteins that are predicted to be the EhaA trans-membrane subunits of multisubunit membrane-bound [NiFe]-hydrogenase Eha complexes.	0
413008	cl01665	DUF1512	Protein of unknown function (DUF1512). This family consists of several archaeal proteins of around 370 residues in length. The function of this family is unknown.	0
413009	cl01666	AGOG	N-glycosylase/DNA lyase. N-glycosylase/DNA lyase; Provisional	0
413010	cl01667	DUF2111	Uncharacterized protein conserved in archaea (DUF2111). This domain, found in various hypothetical archaeal proteins, has no known function.	0
413011	cl01669	DUF2112	Uncharacterized protein conserved in archaea (DUF2112). Members of this protein family, to date, are found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it. [Energy metabolism, Methanogenesis]	0
413012	cl01670	DUF2113	Uncharacterized protein conserved in archaea (DUF2113). Members of this protein family, to date, are found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it. [Energy metabolism, Methanogenesis]	0
413013	cl01673	MCR_D	Methyl-coenzyme M reductase operon protein D. Members of this protein family are protein D, a non-structural protein, of the operon for methyl coenzyme M reductase, also called coenzyme-B sulfoethylthiotransferase (EC 2.8.4.1). That enzyme, with alpha, beta, and gamma subunits, catalyzes the last step in methanogenesis; it has several modified sites, so accessory proteins are expected. Several methanogens encode two such enzymes, designated I and II; this model does not separate the isozymes. Proteins in this family are expressed at much lower levels than the methyl-coenzyme M reductase itself and have been shown to form at least transient associations with it. The precise function is unknown. [Energy metabolism, Methanogenesis]	0
413014	cl01674	MCR_C	Methyl-coenzyme M reductase operon protein C. Members of this protein family, to date, are found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it.	0
413015	cl01675	MtrE	Tetrahydromethanopterin S-methyltransferase, subunit E. tetrahydromethanopterin S-methyltransferase subunit E; Provisional	0
413016	cl01676	MtrD	Tetrahydromethanopterin S-methyltransferase, subunit D. This model describes N5-methyltetrahydromethanopterin: coenzyme M methyltransferase subunit D in methanogenic archaea. This methyltransferase is a membrane-associated enzyme complex that uses a methyl-transfer reaction to drive a sodium-ion pump. The Archaea have evolved energy-yielding pathways marked by one-carbon biochemistry featuring novel cofactors and enzymes. This transferase is involved in the transfer of the methyl group from N5-methyltetrahydromethanopterin to coenzyme M. In an accompanying reaction, methane is produced by two-electron reduction of the methyl moiety in methyl-coenzyme M by another enzyme, methyl-coenzyme M reductase. [Transport and binding proteins, Cations and iron carrying compounds, Energy metabolism, Methanogenesis]	0
413017	cl01677	MtrC	Tetrahydromethanopterin S-methyltransferase, subunit C. tetrahydromethanopterin S-methyltransferase subunit C; Provisional	0
413018	cl01678	MtrB	Tetrahydromethanopterin S-methyltransferase subunit B. Members of this protein family are the MtrB protein of the tetrahydromethanopterin S-methyltransferase complex. This system is universal in archaeal methanogens. [Energy metabolism, Methanogenesis]	0
413019	cl01680	DUF2114	Uncharacterized protein conserved in archaea (DUF2114). Members of this protein family, to date, are found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it. [Energy metabolism, Methanogenesis]	0
413020	cl01681	DUF2115	Uncharacterized protein conserved in archaea (DUF2115). hypothetical protein; Provisional	0
413021	cl01683	DUF2116	Uncharacterized protein containing a Zn-ribbon (DUF2116). This domain, found in various hypothetical archaeal proteins, has no known function. Structural modelling suggests this domain may bind nucleic acids.	0
382661	cl01684	DUF2118	Uncharacterized protein conserved in archaea (DUF2118). This domain, found in various hypothetical archaeal proteins, has no known function.	0
413022	cl01685	DUF2119	Uncharacterized protein conserved in archaea (DUF2119). This domain, found in various hypothetical archaeal proteins, has no known function.	0
413023	cl01686	Nit_Regul_Hom	Uncharacterized protein, homolog of nitrogen regulatory protein PII. This domain, found in various hypothetical archaeal proteins, has no known function. It is distantly similar to the nitrogen regulatory protein PII.	0
413024	cl01687	DUF2120	Uncharacterized protein conserved in archaea (DUF2120). This domain, found in various hypothetical archaeal proteins, has no known function.	0
413025	cl01688	NADHdeh_related	NADH dehydrogenase I, subunit N related protein. This family comprises a set of NADH dehydrogenase I, subunit N related proteins found in archaea. Their exact function has not yet been determined.	0
413026	cl01691	DUF1890	Domain of unknown function (DUF1890). This domain is found in a set of hypothetical archaeal proteins.	0
413027	cl01695	DUF2124	Uncharacterized protein conserved in archaea (DUF2124). This domain, found in various hypothetical archaeal proteins, has no known function.	0
413028	cl01709	PBP2_NikA_DppA_OppA_like	The substrate-binding domain of an ABC-type nickel/oligopeptide-like import system contains the type 2 periplasmic binding fold. The borders of this family are based on the PDBSum definitions of the domain edges for Salmonella typhimurium oppA.	0
413029	cl01713	Gamma_PGA_hydro	Poly-gamma-glutamate hydrolase. This family consists of a number of bacterial and phage proteins that function as gamma-PGA hydrolase enzymes. Structurally, the proteins in this family adopt an open alpha/beta mixed core structure with a seven-stranded parallel/anti-parallel beta-sheet. This structure shows similarity to mammalian carboxypeptidase A and related enzymes.	0
382669	cl01720	Phage_Nu1	Phage DNA packaging protein Nu1. Terminase, the DNA packaging enzyme of bacteriophage lambda, is a heteromultimer composed of subunits Nu1 and A. The smaller Nu1 terminase subunit has a low-affinity ATPase stimulated by non-specific DNA.	0
413030	cl01722	DUF896	Bacterial protein of unknown function (DUF896). hypothetical protein; Provisional	0
382671	cl01728	DUF2232	Predicted membrane protein (DUF2232). This family of bacterial proteins are multi-pass membrane proteins with up to 10 (2 x 4/5) transmembrane regions. The exact function of this potential pore molecule is not known, but in many instances it is associated with ABC-transporter-like domains, implying that it is part of a secretion system that uses energy.	0
413031	cl01729	VKOR	Vitamin K epoxide reductase (VKOR) family. Vitamin K epoxide reductase (VKOR) recycles reduced vitamin K, which is used subsequently as a co-factor in the gamma-carboxylation of glutamic acid residues in blood coagulation enzymes. VKORC1 is a member of a large family of predicted enzymes that are present in vertebrates, Drosophila, plants, bacteria and archaea. Four cysteine residues and one residue, which is either serine or threonine, are identified as likely active-site residues. In some plant and bacterial homologs the VKORC1 homologous domain is fused with domains of the thioredoxin family of oxidoreductases.	0
413032	cl01730	DUF2231	Predicted membrane protein (DUF2231). This domain, found in various hypothetical bacterial proteins, has no known function.	0
413033	cl01731	DICT	Sensory domain in DIguanylate Cyclases and Two-component system. DICT is a sensory domain found associated with GGDEF, EAL, HD-GYP, STAS, and two component systems (histidine-kinase type). It assumes an alpha+beta fold with a 4-stranded beta-sheet and might have a role in light response (Natural history of sensor domains in bacterial signaling systems by Aravind L, LM Iyer, Anantharaman V, from 'Sensory Mechanisms in Bacteria: Molecular Aspects of Signal Recognition.' Caister Academic Press. 2010) - see (http://de.scribd.com/doc/28576661/Bacterial-Signaling-Chapter)	0
413034	cl01732	CHASE2	CHASE2 domain. Specifically, CHASE2 domains are found in histidine kinases, adenylate cyclases, serine/threonine kinases and predicted diguanylate cyclases/phosphodiesterases. Environmental factors that are recognised by CHASE2 domains are not known at this time.	0
413035	cl01733	DUF2345	Uncharacterized protein conserved in bacteria (DUF2345). Members of this family are found in various bacterial hypothetical proteins, as well as Rhs element Vgr proteins.	0
413036	cl01736	PelG	Putative exopolysaccharide Exporter (EPS-E). PelG is a family of putative exopolysaccharide transporters like PelG. Most members carry twelve transmembrane regions. The family also contains fusion proteins with glycosyl transferase group 1, which are putative flippase transporters.	0
382678	cl01737	McrBC	McrBC 5-methylcytosine restriction system component. 5-methylcytosine-specific restriction enzyme subunit McrC; Provisional	0
413037	cl01738	DUF898	Bacterial protein of unknown function (DUF898). This family consists of several bacterial proteins of unknown function. Some of the family members are described as putative membrane proteins.	0
413038	cl01741	DUF1634	Protein of unknown function (DUF1634). This family contains many hypothetical bacterial and archaeal proteins. A few members of this family are annotated as being putative transmembrane proteins, and the region in question in fact contains many hydrophobic residues.	0
413039	cl01742	DGC	DGC domain. This domain appears to be a zinc binding domain from the conservation of four potential chelating cysteines. The domain is named after a conserved central motif. The function of this domain is unknown.	0
413040	cl01743	GYD	GYD domain. This protein is found in a range of bacteria. It is usually less than 100 amino acids in length. The function of the protein is unknown. It may belong to the dimeric alpha/beta barrel superfamily.	0
413041	cl01744	Chrome_Resist	Chromate resistance exported protein. Members of this family of bacterial proteins, are involved in the reduction of chromate accumulation and are essential for chromate resistance.	0
413042	cl01747	SMI1_KNR4	SMI1 / KNR4 family (SUKH-1). Members of this family are related to the SMI1/KNR4-like or SUKH superfamily of proteins.	0
413043	cl01749	UPF0160	Uncharacterized protein family (UPF0160). This family of proteins contains a large number of metal binding residues. The patterns are suggestive of a phosphoesterase function. The conserved DHH motif may mean this family is related to pfam01368.	0
413044	cl01751	ASRT	Anabaena sensory rhodopsin transducer. The family of bacterial Anabaena sensory rhodopsin transducers are likely to bind sugars or related metabolites. The entire protein is comprised of a single globular domain with an eight-stranded beta-sandwich fold. There are a few characteristics which define this beta-sandwich fold as being distinct from other so-named folds, and these are: 1) a well conserved tryptophan, usually following a polar residue, present at the start of the first strand; this tryptophan appears to be central to a hydrophobic interaction required to hold the two beta-sheets of the sandwich together, and 2) a nearly absolutely conserved asparagine located at the end of the second beta-strand, that hydrogen bonds with the backbone carbonyls of the residues 2 and 4 positions downstream from it, thereby stabilizing the characteristic tight turn between strands 2 and 3 of the structure.	0
413045	cl01752	DUF2264	Uncharacterized protein conserved in bacteria (DUF2264). Members of this family of hypothetical bacterial proteins have no known function.	0
413046	cl01753	DUF1345	Protein of unknown function (DUF1345). This family consists of several hypothetical bacterial proteins of around 230 residues in length. The function of this family is unknown.	0
413047	cl01754	LtrA	Bacterial low temperature requirement A protein (LtrA). This family consists of several bacteria specific low temperature requirement A (LtrA) protein sequences which have been found to be essential for growth at low temperatures in Listeria monocytogenes.	0
413048	cl01755	DUF1802	Domain of unknown function (DUF1802). The function of this family is unknown. This region is found associated with pfam04471, suggesting that these proteins could be part of a restriction modification system.	0
413049	cl01757	DUF2262	Uncharacterized protein conserved in bacteria (DUF2262). This domain, found in various hypothetical bacterial proteins, has no known function.	0
413050	cl01759	YiaAB	yiaA/B two helix domain. This domain consists of two transmembrane helices and a conserved linking section.	0
413051	cl01762	EutC	Ethanolamine ammonia-lyase light chain (EutC). This family consists of several bacterial ethanolamine ammonia-lyase light chain (EutC) EC:4.3.1.7 sequences. Ethanolamine ammonia-lyase is a bacterial enzyme that catalyzes the adenosylcobalamin-dependent conversion of certain vicinal amino alcohols to oxo compounds and ammonia.	0
413052	cl01763	DUF2247	Uncharacterized protein conserved in bacteria (DUF2247). This domain, found in various hypothetical bacterial proteins, has no known function.	0
413053	cl01767	SoxD	Sarcosine oxidase, delta subunit family. This model describes the delta subunit of a family of known and putative heterotetrameric sarcosine oxidases. Five operons of such oxidases are found in Mesorhizobium loti and three in Agrobacterium tumefaciens, a high enough copy number to suggest that not all members share the same function. The model is designated as subfamily rather than equivalog for this reason. Sarcosine oxidase catalyzes the oxidative demethylation of sarcosine to glycine. The reaction converts tetrahydrofolate to 5,10-methylene-tetrahydrofolate. The enzyme is known in monomeric and heterotetrameric (alpha,beta,gamma,delta) forms. [Energy metabolism, Amino acids and amines]	0
413054	cl01768	Phenol_MetA_deg	Putative MetA-pathway of phenol degradation. 	0
413055	cl01769	NosL	NosL. NosL is one of the accessory proteins of the nos (nitrous oxide reductase) gene cluster. NosL is a monomeric protein of 18,540 MW that specifically and stoichiometrically binds Cu(I). The copper ion in NosL is ligated by a Cys residue, and one Met and one His are thought to serve as the other ligands. It is possible that NosL is a copper chaperone involved in metallo-centre assembly.	0
413056	cl01770	DUF2251	Uncharacterized protein conserved in bacteria (DUF2251). Members of this family of hypothetical bacterial proteins have no known function.	0
413057	cl01771	DUF1427	Protein of unknown function (DUF1427). This model describes an uncharacterized small, hydrophobic protein of about 50 amino acids, found between the xapB and xapR genes of the E. coli xanthosine utilization system, and homologous regions in other small proteins, such as the N-terminal region of DUF1427 (pfam07235). We name this domain XapX, as it comprises the full length of the protein encoded between the genes for the well-studied XapB and XapR proteins. [Unknown function, General]	0
413058	cl01775	RHH_4	Ribbon-helix-helix domain. This short bacterial protein contains a ribbon-helix-helix domain that is likely to be DNA-binding.	0
413059	cl01781	DUF4212	Domain of unknown function (DUF4212). Members of this family are highly hydrophobic bacterial proteins of about 90 amino acids in length. Members usually are found immediately upstream (sometimes fused to) a member of the solute:sodium symporter family, and therefore are a putative sodium:solute symporter small subunit. Members tend to be found in aquatic species, especially those from marine or other high salt environments. [Transport and binding proteins, Unknown substrate]	0
413060	cl01783	DUF2243	Predicted membrane protein (DUF2243). This domain, found in various hypothetical bacterial proteins, has no known function.	0
413061	cl01784	DUF1361	Protein of unknown function (DUF1361). This family consists of several hypothetical bacterial proteins of around 200 residues in length. The function of this family is unknown although some members are annotated as being putative integral membrane proteins.	0
413062	cl01785	DUF2127	Predicted membrane protein (DUF2127). This domain, found in various hypothetical prokaryotic and archaeal proteins, has no known function.	0
413063	cl01786	DUF1062	Protein of unknown function (DUF1062). This family consists of several hypothetical bacterial proteins of unknown function.	0
413064	cl01787	DUF1643	Protein of unknown function (DUF1643). The members of this family are all sequences found within hypothetical proteins expressed by various bacterial species. The region concerned is approximately 150 residues long.	0
413065	cl01788	DUF2255	Uncharacterized protein conserved in bacteria (DUF2255). Members of this family of hypothetical bacterial proteins have no known function.	0
413066	cl01790	DUF1445	Protein of unknown function (DUF1445). This family represents a conserved region approximately 150 residues long within a number of hypothetical bacterial and eukaryotic proteins of unknown function.	0
413067	cl01792	DUF2256	Uncharacterized protein conserved in bacteria (DUF2256). Members of this family of hypothetical bacterial proteins have no known function.	0
413068	cl01794	2OG-Fe_Oxy_2	2OG-Fe dioxygenase. This family contains 2-oxoglutarate (2OG) and Fe-dependent dioxygenases. It includes L-isoleucine dioxygenase (IDO).	0
413069	cl01797	DUF2258	Uncharacterized protein conserved in archaea (DUF2258). Members of this family of hypothetical archaeal proteins have no known function. Structural modelling suggests this domain may bind nucleic acids.	0
413070	cl01798	DUF1405	Protein of unknown function (DUF1405). This family consists of several bacterial and related archaeal proteins of around 180 residues in length. The function of this family is unknown.	0
413071	cl01799	Ribosomal_L13e	Ribosomal protein L13e. 60S ribosomal protein L13; Provisional	0
413072	cl01800	DUF1122	Protein of unknown function (DUF1122). This family consists of several hypothetical archaeal and bacterial proteins of unknown function.	0
413073	cl01805	DUF2316	Uncharacterized protein conserved in bacteria (DUF2316). Members of this family of hypothetical bacterial proteins have no known function.	0
413074	cl01807	DUF1517	Protein of unknown function (DUF1517). This family consists of several hypothetical glycine rich plant and bacterial proteins of around 300 residues in length. The function of this family is unknown.	0
413075	cl01811	DUF2325	Uncharacterized protein conserved in bacteria (DUF2325). Members of this family of hypothetical bacterial proteins have no known function.	0
413076	cl01813	DUF799	Putative bacterial lipoprotein (DUF799). This family consists of several bacterial proteins of unknown function. Some of the family members are described as putative lipoproteins.	0
413077	cl01815	DUF1018	Protein of unknown function (DUF1018). This family consists of several bacterial and phage proteins of unknown function.	0
413078	cl01817	Tail_P2_I	Phage tail protein (Tail_P2_I). This model describes a region of sequence similarity shared by a number of uncharacterized proteins in bacterial genomes, including Geobacter sulfurreducens PCA, Mesorhizobium loti, Streptomyces coelicolor A3(2), Gloeobacter violaceus PCC 7421, and Myxococcus xanthus. In all cases, the genomic region resembles a phage tail region, based on tentative identifications of neighboring genes. A region of this domain resembles a region of TIGR01634, another phage tail protein model. [Mobile and extrachromosomal element functions, Prophage functions]	0
413079	cl01818	DUF1320	Protein of unknown function (DUF1320). This family consists of both hypothetical bacterial and phage proteins of around 145 residues in length. The function of this family is unknown.	0
413080	cl01820	DUF2322	Uncharacterized protein conserved in bacteria (DUF2322). Members of this family of hypothetical bacterial proteins have no known function.	0
413081	cl01821	zf-CHCC	Zinc-finger domain. This is a short zinc-finger domain conserved from fungi to humans. It is Cx8Hx14Cx2C.	0
413082	cl01823	DUF2331	Uncharacterized protein conserved in bacteria (DUF2331). This model describes a conserved protein that typically is encoded next to the gene efp for translation elongation factor P.	0
413083	cl01825	Phage_Mu_Gam	Bacteriophage Mu Gam like protein. This family consists of bacterial and phage Gam proteins. The gam gene of bacteriophage Mu encodes a protein which protects linear double stranded DNA from exonuclease degradation in vitro and in vivo.	0
413084	cl01826	Mu-like_gpT	Mu-like prophage major head subunit gpT. Members of this family of proteins comprise various caudoviral prophage proteins, including the Mu-like prophage major head subunit gpT.	0
382727	cl01827	DUF2330	Uncharacterized protein conserved in bacteria (DUF2330). Members of this family of hypothetical bacterial proteins have no known function.	0
413085	cl01829	UT	Urea transporter. Members of this protein family are bacterial urea transporters, found not only in species that contain urease, but adjacent to the urease operon. It was characterized in Yersinia pseudotuberculosis. Members are homologous to eukaryotic members of solute carrier family 14, a family that includes urea transporters, and to bacterial proteins in species with no detectable urea degradation system. [Transport and binding proteins, Other]	0
413086	cl01831	DUF1003	Protein of unknown function (DUF1003). This family consists of several hypothetical bacterial proteins of unknown function.	0
413087	cl01834	PSK_trans_fac	Rv0623-like transcription factor. This entry represents the Rv0623-like family of transcription factors associated with the PSK operon.	0
413088	cl01837	DUF2332	Uncharacterized protein conserved in bacteria (DUF2332). Members of this family of hypothetical bacterial proteins have no known function.	0
413089	cl01841	DUF1499	Protein of unknown function (DUF1499). This family consists of several hypothetical bacterial and plant proteins of around 125 residues in length. The function of this family is unknown.	0
413090	cl01842	Asparaginase_II	L-asparaginase II. This family consists of several bacterial L-asparaginase II proteins. L-asparaginase (EC:3.5.1.1) catalyzes the hydrolysis of L-asparagine to L-aspartate and ammonium. Rhizobium etli possesses two asparaginases: asparaginase I, which is thermostable and constitutive, and asparaginase II, which is thermolabile, induced by asparagine and repressed by the carbon source.	0
413091	cl01843	RuBisCO_small_like	N/A. Ribulose bisphosphate carboxylase/oxygenase (Rubisco), small subunit. Rubisco is a bifunctional enzyme catalyzes the initial steps of two opposing metabolic pathways: photosynthetic carbon fixation and the competing process of photorespiration. Rubisco Form I, present in plants and green algae, is composed of eight large and eight small subunits. The nearly identical small subunits are encoded by a family of nuclear genes. After translation, the small subunits are translocated across the chloroplast membrane, where an N-terminal signal peptide is cleaved off. While the large subunits contain the catalytic activities, it has been shown that the small subunits are important for catalysis by enhancing the catalytic rate through inducing conformational changes in the large subunits.	0
413092	cl01844	CreD	Inner membrane protein CreD. This family consists of several bacterial CreD or Cet inner membrane proteins. Dominant mutations of the cet gene of Escherichia coli result in tolerance to colicin E2 and increased amounts of an inner membrane protein with an Mr of 42,000. The cet gene is shown to be in the same operon as the phoM gene, which is required in a phoR background for expression of the structural gene for alkaline phosphatase, phoA. Although the Cet protein is not required for phoA expression, it has been suggested that the Cet protein has an enhancing effect on the transcription of phoA.	0
413093	cl01845	DUF1778	Protein of unknown function (DUF1778). This is a family of uncharacterized proteins. The structure of one of the hypothetical proteins in this family has been solved and it forms a helix structure which may form interactions with DNA.	0
382737	cl01848	NapE	Periplasmic nitrate reductase protein NapE. NapE, homologous to TorE (TIGR02972), is a membrane protein of unknown function that is part of the periplasmic nitrate reductase system; it may be part of the enzyme complex. The periplasmic nitrate reductase allows for nitrate respiration in anaerobic conditions. [Energy metabolism, Anaerobic, Energy metabolism, Electron transport]	0
413094	cl01850	CtsR	Firmicute transcriptional repressor of class III stress genes (CtsR). This family consists of several Firmicute transcriptional repressor of class III stress genes (CtsR) proteins. CtsR of L. monocytogenes negatively regulates the clpC, clpP and clpE genes belonging to the CtsR regulon.	0
413095	cl01852	VEG	Biofilm formation stimulator VEG. VEG is a family that is highly conserved among Gram-positive bacteria. It stimulates biofilm formation through inducing transcription of the tapA-sipW-tasA operon. The products of this operon are responsible for production of the amyloid fibre (TasA) component of the biofilm. Veg or a Veg-induced protein acts as an antirepressor of SinR - part of the major overall biofilm transcriptional control system - to regulate and stimulate biofilm formation. Veg is transcribed at high levels during both exponential growth and sporulation.	0
242748	cl01853	YabA	Regulator of replication initiation timing  [Replication, recombination and repair]. 	0
413096	cl01857	DUF965	Bacterial protein of unknown function (DUF965). IreB (EF1202) was characterized in Enterococcus faecalis as a small protein, well-conserved in the Firmicutes. It belongs to a system that includes the Ser/Thr protein kinase IreK, and phosphatase IreP, undergoes phosphorylation on threonine residues, and is involved in regulating cephalosporin resistance. This family was previously named DUF965 by Pfam model pfam06135	0
413097	cl01860	DUF436	Protein of unknown function (DUF436). hypothetical protein; Provisional	0
413098	cl01862	DUF1461	Protein of unknown function (DUF1461). This model represents a family of highly hydrophobic, uncharacterized predicted integral membrane proteins found almost entirely in low-GC Gram-positive bacteria, although a member is also found in the early-branching bacterium Aquifex aeolicus.	0
413099	cl01864	DUF951	Bacterial protein of unknown function (DUF951). This family consists of several short hypothetical bacterial proteins of unknown function. Structural modelling suggests this domain may bind nucleic acids.	0
413100	cl01867	DUF4176	Domain of unknown function (DUF4176). 	0
413101	cl01868	YukC	WXG100 protein secretion system (Wss), protein YukC. Members of this family are associated with type VII secretion of WXG100 family targets in the Firmicutes, but not in the Actinobacteria. This protein is designated YukC in Bacillus subtilis and EssB in Staphylococcus aureus. [Protein fate, Protein and peptide secretion and trafficking]	0
413102	cl01870	DUF1934	Domain of unknown function (DUF1934). Members of this family are found in a set of hypothetical bacterial proteins. Their precise function has not, as yet, been defined.	0
413103	cl01873	AgrB	Accessory gene regulator B. The accessory gene regulator (agr) of Staphylococcus aureus is the central regulatory system that controls the gene expression for a large set of virulence factors. The agr locus consists of two transcripts: RNAII and RNAIII. RNAII encodes four genes (agrA, B, C, and D) whose gene products assemble a quorum sensing system. At low cell density, the agr genes are continuously expressed at basal levels. A signal molecule, autoinducing peptide (AIP), produced and secreted by the bacteria, accumulates outside of the cells. When the cell density increases and the AIP concentration reaches a threshold, it activates the agr response, i.e. activation of secreted protein gene expression and subsequent repression of cell wall-associated protein genes. AgrB and AgrD are essential for the production of the autoinducing peptide which functions as a signal for quorum sensing. AgrB is a transmembrane protein. AgrB is involved in the proteolytic processing of AgrD and may have both proteolytic enzyme activity and a transporter facilitating the export of the processed AgrD peptide.	0
413104	cl01877	17kDa_Anti_2	17 kDa outer membrane surface antigen. This is a bacterial domain of 17 kDa common-antigen proteins.	0
382749	cl01879	DUF962	Protein of unknown function (DUF962). This family consists of several eukaryotic and prokaryotic proteins of unknown function. The yeast protein YGL010W has been found to be non-essential for cell growth.	0
413105	cl01880	SulA	Cell division inhibitor SulA. All proteins in this family for which the functions are known are cell division inhibitors. In E. coli, SulA is one of the SOS regulated genes. [DNA metabolism, DNA replication, recombination, and repair]	0
413106	cl01885	RusA	Endodeoxyribonuclease RusA. endodeoxyribonuclease RUS; Reviewed	0
413107	cl01886	Omptin	Omptin family. The omptin family is a family of serine proteases.	0
413108	cl01887	ChaB	ChaB. This family of proteins contain a conserved 60 residue region. This protein is known as ChaB in E. coli and is found next to ChaA which is a cation transporter protein. ChaB may regulate ChaA function in some way.	0
413109	cl01888	DUF883	Bacterial protein of unknown function (DUF883). hypothetical protein; Provisional	0
413110	cl01889	EutN_CcmL	Ethanolamine utilisation protein and carboxysome structural protein domain family. The crystal structure of EutN contains a central five-stranded beta-barrel, with an alpha-helix at the open end of this barrel (Structure 2HD3). The structure also contains three additional beta-strands, which help the formation of a tight hexamer, with a hole in the center. This suggests that EutN forms a pore, with an opening of 26 Angstrom in diameter on one face and 14 Angstrom on the other face. EutN is involved in the cobalamin-dependent degradation of ethanolamine.	0
413111	cl01890	GutM	Glucitol operon activator protein (GutM). This family consists of several glucitol operon activator (GutM) proteins. Expression of the glucitol (gut) operon in Escherichia coli is regulated by an unusual, complex system which consists of an activator (encoded by the gutM gene) and a repressor (encoded by the gutR gene) in addition to the cAMP-CRP complex (CRP, cAMP receptor protein). Synthesis of the mRNA, which initiates at the promoter specific to the gutR gene, occurs within the gutM gene. Expressional control of the gut operon appears to occur as a consequence of the antagonistic action of the products of the autogenously regulated gutM and gutR genes.	0
413112	cl01891	AceK	Isocitrate dehydrogenase kinase/phosphatase (AceK). bifunctional isocitrate dehydrogenase kinase/phosphatase protein; Validated	0
413113	cl01892	ZapD	Cell division protein. Cell division protein ZapD enhances FtsZ-ring assembly. It directly interacts with FtsZ and promotes bundling of FtsZ protofilaments, with a reduction in FtsZ GTPase activity.	0
413114	cl01894	VF530	DNA-binding protein VF530. VF530 contains a unique four-helix motif that shows some similarity to the C-terminal double-stranded DNA (dsDNA) binding domain of RecA, as well as other nucleic acid binding domains.	0
413115	cl01907	YscJ_FliF	Secretory protein of YscJ/FliF family. All members of this protein family are predicted lipoproteins with a conserved Cys near the N-terminus for cleavage and modification, and are part of known or predicted type III secretion systems. Members are found in both plant and animal pathogens, including the obligately intracellular chlamydial species and (non-pathogenic) root nodule bacteria. The most closely related proteins outside this family are examples of the flagellar M-ring protein FliF. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	0
295085	cl01908	Phage_tail_L	Phage minor tail protein L. This model detects members of the family of phage lambda minor tail protein L. This model was built as a fragment model to allow detection of fragmentary sequences, as might be found in cryptic prophage regions. [Mobile and extrachromosomal element functions, Prophage functions]	0
413116	cl01912	HigB_toxin	HigB_toxin, RelE-like toxic component of a toxin-antitoxin system. HigB_toxin is a family of RelE-like prokaryotic proteins that function as mRNA interferases. HigB cleaves translated mRNA only, and cleavage depends on translation of the target RNAs. HigB belongs to the RelE super-family of RNases. The toxin-antitoxin gene-pair is induced by environmental stress factors.	0
413117	cl01913	YaeQ	YaeQ protein. This family consists of several hypothetical bacterial proteins of around 180 residues in length which are often known as YaeQ. YaeQ is homologous to RfaH, a specialized transcription elongation protein. YaeQ is known to compensate for loss of RfaH function.	0
382764	cl01916	DUF2138	Uncharacterized protein conserved in bacteria (DUF2138). hypothetical protein; Provisional	0
413118	cl01917	DUF956	Domain of unknown function (DUF956). Family of bacterial sequences with undetermined function.	0
413119	cl01919	ADC	Acetoacetate decarboxylase (ADC). Members of this family are MppR, one of three enzymes involved in synthesizing enduracididine, a non-proteinogenic amino acid used in non-ribosomal peptide synthases to make natural products such as enduracidin from Streptomyces fungicidicus ATCC 21013. MppR belongs to the acetoacetate decarboxylase-like superfamily. MppR catalyzes an aldol condensation and a dehydration, not a decarboxylation.	0
413120	cl01925	DUF2139	Uncharacterized protein conserved in archaea (DUF2139). This domain, found in various hypothetical archaeal proteins, has no known function.	0
413121	cl01930	DUF2141	Uncharacterized protein conserved in bacteria (DUF2141). This domain, found in various hypothetical prokaryotic proteins, has no known function.	0
413122	cl01935	DUF2391	Putative integral membrane protein (DUF2391). Members of this family are found in Nostoc sp. PCC 7120, Agrobacterium tumefaciens, Sinorhizobium meliloti, and Gloeobacter violaceus in a conserved two-gene neighborhood. This family, as defined, includes some members of COG4711 but is narrower and strictly bacterial. Members appear to span the membrane seven times. [Cell envelope, Other]	0
382770	cl01936	Rad52_Rad22	Rad52/22 family double-strand break repair protein. All proteins in this family for which functions are known are involved in recombination and recombination repair. Their exact biochemical activity is not yet known. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	0
413123	cl01938	DUF2167	Protein of unknown function (DUF2167). This domain, found in various hypothetical membrane-anchored prokaryotic proteins, has no known function.	0
413124	cl01940	Phage_min_tail	Phage minor tail protein. This family consists of a series of phage minor tail proteins and related sequences from several bacterial species.	0
413125	cl01943	ABC_cobalt	ABC-type cobalt transport system, permease component. Members of this family of prokaryotic proteins include various hypothetical proteins as well as ABC-type cobalt transport systems.	0
295099	cl01945	Lambda_tail_I	Bacteriophage lambda tail assembly protein I. This family consists of tail assembly proteins from lambdoid and T1 phages and related prophages, e.g. the tail assembly protein I (TAPI). Members of this family contain a core ubiquitin fold domain. The exact function of TAPI is not clear but it is not incorporated into the mature tail. Gene neighborhoods reveal that TAPI co-occurs with genes encoding the host-specificity protein TapJ, and TapK, which contains a JAB metallopeptidase fused to an NlpC/P60 peptidase. It is proposed that the TAPI protein is processed by the peptidase domains of TapK.	0
382774	cl01947	MT-A70	MT-A70. MT-A70 is the S-adenosylmethionine-binding subunit of human mRNA:m6A methyl-transferase (MTase), an enzyme that sequence-specifically methylates adenines in pre-mRNAs.	0
413126	cl01949	DUF1653	Protein of unknown function (DUF1653). This is a family of hypothetical bacterial proteins of unknown function.	0
413127	cl01950	DUF1850	Domain of unknown function (DUF1850). This family of proteins are functionally uncharacterized. Some members of this family appear to be misannotated as RocC, an amino acid transporter from B. subtilis.	0
413128	cl01951	DUF2147	Uncharacterized protein conserved in bacteria (DUF2147). This domain, found in various hypothetical prokaryotic proteins, has no known function.	0
413129	cl01953	ArdA	Antirestriction protein (ArdA). This family consists of several bacterial antirestriction (ArdA) proteins. ArdA functions in bacterial conjugation to allow an unmodified plasmid to evade restriction in the recipient bacterium and yet acquire cognate modification.	0
413130	cl01958	Endonuc_Holl	Endonuclease related to archaeal Holliday junction resolvase. This domain is found in various predicted bacterial endonucleases which are distantly related to archaeal Holliday junction resolvases.	0
413131	cl01959	DUF1616	Protein of unknown function (DUF1616). This is a family of sequences from hypothetical archaeal proteins. The region in question is approximately 330 amino acid residues long.	0
413132	cl01960	DUF2149	Uncharacterized conserved protein (DUF2149). This domain, found in various hypothetical prokaryotic proteins, has no known function.	0
413133	cl01962	MTH865	MTH865-like family. This domain has an EF-hand like fold.	0
413134	cl01963	DUF2150	Uncharacterized protein conserved in archaea (DUF2150). This domain, found in various hypothetical archaeal proteins, has no known function.	0
382784	cl01966	DUF2153	Uncharacterized protein conserved in archaea (DUF2153). This domain, found in various hypothetical archaeal proteins, has no known function.	0
413135	cl01969	DUF1990	Domain of unknown function (DUF1990). This family of proteins are functionally uncharacterized.	0
413136	cl01970	DUF2155	Uncharacterized protein conserved in bacteria (DUF2155). This domain, found in various hypothetical prokaryotic proteins, has no known function.	0
413137	cl01971	VanZ	VanZ like family. This family contains several examples of the VanZ protein, but also contains examples of phosphotransbutyrylases.	0
413138	cl01972	DUF948	Bacterial protein of unknown function (DUF948). This family consists of bacterial sequences several of which are thought to be general stress proteins.	0
413139	cl01973	Hpre_diP_synt_I	Heptaprenyl diphosphate synthase component I. This family contains component I of bacterial heptaprenyl diphosphate synthase (EC:2.5.1.30) (approximately 170 residues long). This is one of the two dissociable subunits that form the enzyme, both of which are required for the catalysis of the biosynthesis of the side chain of menaquinone-7.	0
413140	cl01974	GPDPase_memb	Membrane domain of glycerophosphoryl diester phosphodiesterase. Members of this family comprise the membrane domain of the prokaryotic enzyme glycerophosphoryl diester phosphodiesterase.	0
413141	cl01977	FeThRed_B	Ferredoxin thioredoxin reductase catalytic beta chain. ferredoxin thioreductase subunit beta; Validated	0
413142	cl01978	DUF1269	Protein of unknown function (DUF1269). This family consists of several bacterial and archaeal proteins of around 200 residues in length. The function of this family is unknown. The family carries a repeated glycine-zipper sequence motif, GxxxGxxxG, where the x following the G is frequently found to be an alanine. As glycine-zippers occur in membrane proteins, this family is likely to be found spanning a membrane.	0
413143	cl01981	DUF1307	Protein of unknown function (DUF1307). This family consists of several hypothetical bacterial proteins of around 150 residues in length. Some family members are described as putative lipoproteins but the function of the family is unknown.	0
413144	cl01982	BMC	Bacterial Micro-Compartment (BMC) domain. Bacterial microcompartments are primitive organelles composed entirely of protein subunits. The prototypical bacterial microcompartment is the carboxysome, a protein shell for sequestering carbon fixation reactions. These proteins form hexameric structures.	0
413145	cl01983	DUF986	Protein of unknown function (DUF986). hypothetical protein; Provisional	0
413146	cl01985	MepB	MepB protein. MepB is a functionally uncharacterized protein in the mepRAB gene cluster of Staphylococcus aureus.	0
413147	cl01986	DUF1048	Protein of unknown function (DUF1048). This family consists of several hypothetical bacterial proteins of unknown function.	0
413148	cl01988	Abi_2	Abi-like protein. This family, found in various bacterial species, contains sequences that are similar to the Abi group of proteins, which are involved in bacteriophage resistance mediated by abortive infection in Lactococcus species. The proteins are thought to have helix-turn-helix motifs, found in many DNA-binding proteins, allowing them to perform their function.	0
413149	cl01989	Phage_holin_4_1	Bacteriophage holin family. This model describes one of the many mutually dissimilar families of holins, phage proteins that act together with lytic enzymes in bacterial lysis. This family includes, besides phage holins, the protein TcdE/UtxA involved in toxin secretion in Clostridium difficile and related species. [Protein fate, Protein and peptide secretion and trafficking, Mobile and extrachromosomal element functions, Prophage functions]	0
413150	cl01990	DUF2162	Predicted transporter (DUF2162). Members of this family of bacterial proteins are thought to be membrane transporters, but their exact function has not, as yet, been elucidated.	0
413151	cl01991	DUF1622	Protein of unknown function (DUF1622). This is a family of 14 highly conserved sequences, from hypothetical proteins expressed by both bacterial and archaeal species.	0
413152	cl01992	MIase	Muconolactone delta-isomerase. Members of this protein family are muconolactone delta-isomerase (EC 5.3.3.4), the CatC protein of the ortho cleavage pathway for metabolizing aromatic compounds by way of catechol. [Energy metabolism, Other]	0
413153	cl01993	Ribosomal_S26e	Ribosomal protein S26e. ribosomal protein S26; Provisional	0
382803	cl01994	DUF2173	Uncharacterized conserved protein (DUF2173). This domain, found in various hypothetical prokaryotic proteins, has no known function.	0
413154	cl02005	WXG100	Proteins of 100 residues with WXG. T7SS_ESX-EspC is a family of exported virulence proteins from largely Actinobacteria and a few Firmicutes, Gram-positive bacteria. It is exported in conjunction with EspA as an interacting pair.	0
413155	cl02008	2-oxoacid_dh	2-oxoacid dehydrogenases acyltransferase (catalytic domain). Chloramphenicol acetyltransferase (CAT) catalyzes the acetyl-CoA dependent acetylation of chloramphenicol (Cm), an antibiotic which inhibits prokaryotic peptidyltransferase activity. Acetylation of Cm by CAT inactivates the antibiotic. A histidine residue, located in the C-terminal section of the enzyme, plays a central role in its catalytic mechanism. There is a second family of CAT, evolutionarily unrelated to the main family described above. These CAT belong to the bacterial hexapeptide-repeat containing-transferases family. The crystal structure of the type III enzyme from Escherichia coli with chloramphenicol bound has been determined. CAT is a trimer of identical subunits (monomer Mr 25,000) and the trimeric structure is stabilised by a number of hydrogen bonds, some of which result in the extension of a beta-sheet across the subunit interface. Chloramphenicol binds in a deep pocket located at the boundary between adjacent subunits of the trimer, such that the majority of residues forming the binding pocket belong to one subunit while the catalytically essential histidine belongs to the adjacent subunit. His195 is appropriately positioned to act as a general base catalyst in the reaction, and the required tautomeric stabilisation is provided by an unusual interaction with a main-chain carbonyl oxygen.	0
413156	cl02009	DUF1453	Protein of unknown function (DUF1453). This family consists of several hypothetical bacterial proteins of around 150 residues in length. The function of this family is unknown. Members of this family seem to be found exclusively in the Order Bacillales.	0
382807	cl02010	DUF2175	Uncharacterized protein conserved in archaea (DUF2175). This domain, found in various hypothetical archaeal proteins, has no known function.	0
382808	cl02011	DUF1444	Protein of unknown function (DUF1444). hypothetical protein; Provisional	0
413157	cl02014	DUF2177	Predicted membrane protein (DUF2177). This domain, found in various hypothetical bacterial proteins, has no known function.	0
413158	cl02015	YycI	YycH protein. This domain is exclusively found in YycI proteins in the low GC content Gram positive species. These two domains share the same structural fold with domains two and three of YycH pfam07435. Both YycH and YycI are always found as a pair on the chromosome, downstream of the essential histidine kinase YycG. Additionally, both proteins share a function in regulating the YycG kinase with which they appear to form a ternary complex. Lastly, the two proteins always contain an N-terminal transmembrane helix and are localized to the periplasmic space as shown by PhoA fusion studies.	0
413159	cl02016	DUF2178	Predicted membrane protein (DUF2178). This domain, found in various hypothetical archaeal proteins, has no known function.	0
413160	cl02017	DUF2180	Uncharacterized protein conserved in archaea (DUF2180). This domain, found in various hypothetical archaeal proteins, has no known function. A few of the family members contain a zinc finger domain.	0
413161	cl02019	DUF2185	Protein of unknown function (DUF2185). This domain, found in various hypothetical bacterial proteins, has no known function.	0
413162	cl02022	MecA	Negative regulator of genetic competence (MecA). This family contains several bacterial MecA proteins. The development of competence in Bacillus subtilis is regulated by growth conditions and several regulatory genes. In complex media competence development is poor, and there is little or no expression of late competence genes. Mec mutations permit competence development and late competence gene expression in complex media, bypassing the requirements for many of the competence regulatory genes. The mecA gene product acts negatively in the development of competence. Null mutations in mecA allow expression of a late competence gene comG, under conditions where it is not normally expressed, including in complex media and in cells mutant for several competence regulatory genes. Overexpression of MecA inhibits comG transcription.	0
413163	cl02025	Glm_e	N/A. This family consists of several methylaspartate mutase E chain proteins (EC:5.4.99.1). Glutamate mutase catalyzes the first step in the fermentation of glutamate by Clostridium tetanomorphum. This is an unusual isomerisation in which L-glutamate is converted to threo-beta-methyl L-aspartate.	0
413164	cl02034	DUF2193	Uncharacterized protein conserved in archaea (DUF2193). This domain, found in various hypothetical archaeal proteins, has no known function.	0
413165	cl02037	DUF1847	Protein of unknown function (DUF1847). This family of proteins are functionally uncharacterized. They contain 4 N-terminal cysteines that may form a zinc binding domain.	0
413166	cl02038	Elf1	Transcription elongation factor Elf1 like. putative transcription elongation factor Elf1; Provisional	0
413167	cl02039	YbgT_YccB	Membrane bound YbgT-like protein. This model describes a very small (as short as 33 amino acids) protein of unknown function, essentially always found in an operon with CydAB, subunits of the cytochrome d terminal oxidase. It begins with an aromatic motif MWYFXW and appears to contain a membrane-spanning helix. This protein appears to be restricted to the Proteobacteria and exist in a single copy only. We suggest it may be a membrane subunit of the terminal oxidase. The family is named after the E. coli member YbgT (SP|P56100). This model excludes the apparently related protein YccB (SP|P24244). [Energy metabolism, Electron transport]	0
413168	cl02041	Cyt-b5	Cytochrome b5-like Heme/Steroid binding domain. This family includes heme binding domains from a diverse range of proteins. This family also includes proteins that bind to steroids. The family includes progesterone receptors. Many members of this subfamily are membrane anchored by an N-terminal transmembrane alpha helix. This family also includes a domain in some chitin synthases. There is no known ligand for this domain in the chitin synthases.	0
413169	cl02042	DUF2195	Uncharacterized protein conserved in bacteria (DUF2195). This domain, found in various hypothetical bacterial proteins, has no known function.	0
413170	cl02043	LOR	LURP-one-related. Scramblase is palmitoylated and contains a potential protein kinase C phosphorylation site. Scramblase exhibits Ca2+-activated phospholipid scrambling activity in vitro. There are also possible SH3 and WW binding motifs. Scramblase is involved in the redistribution of phospholipids after cell activation or injury.	0
413171	cl02044	DUF2196	Uncharacterized conserved protein (DUF2196). A pair of adjacent genes, ablAB (acetyl-beta-lysine biosynthesis) encodes lysine 2,3-aminomutase and beta-lysine acetyltransferase in methanogenic archaea. Homologous pairs, possibly with identical function, occur in a wide range of species, including Bacillus subtilis. This model describes a conserved hypothetical protein, small in size, with a phylogenetic distribution moderately well correlated to that of the acetyltransferase family. This protein family is also described as DUF2196 and COG4895. The function is unknown. [Hypothetical proteins, Conserved]	0
413172	cl02047	DUF2200	Uncharacterized protein conserved in bacteria (DUF2200). This domain, found in various hypothetical bacterial proteins, has no known function.	0
413173	cl02048	DUF2199	Uncharacterized protein conserved in bacteria (DUF2199). This domain, found in various hypothetical bacterial proteins, has no known function.	0
413174	cl02049	PhageMetallopep	Putative phage metallopeptidase. This entry represents a probable metallopeptidase found in a variety of phage and bacterial proteomes.	0
413175	cl02050	Ribosomal_S25	S25 ribosomal protein. 30S ribosomal protein S25e; Provisional	0
413176	cl02055	Dehydratase_SU	Dehydratase small subunit. This family contains the small subunit of the trimeric diol dehydratases and glycerol dehydratases. These enzymes are produced by some enterobacteria in response to growth substances.	0
413177	cl02056	DUF2203	Uncharacterized conserved protein (DUF2203). This domain, found in various hypothetical bacterial proteins, has no known function.	0
413178	cl02059	Halogen_Hydrol	5-bromo-4-chloroindolyl phosphate hydrolysis protein. Members of this family of prokaryotic proteins mediate the hydrolysis of 5-bromo-4-chloroindolyl phosphate bonds.	0
413179	cl02063	DUF2209	Uncharacterized protein conserved in archaea (DUF2209). This domain, found in various hypothetical archaeal proteins, has no known function.	0
413180	cl02066	GDYXXLXY	GDYXXLXY protein. This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 171 and 199 amino acids in length. It contains a conserved GDYXXLXY motif.	0
413181	cl02071	DUF1109	Protein of unknown function (DUF1109). This family consists of several hypothetical bacterial proteins of unknown function.	0
413182	cl02073	DUF3422	Protein of unknown function (DUF3422). This family of proteins are functionally uncharacterized. This protein is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 426 and 444 amino acids in length.	0
413183	cl02074	DUF2000	Protein of unknown function (DUF2000). This is a family of proteins of unknown function. The structure of one of the proteins in this family has been shown to adopt an alpha beta fold.	0
413184	cl02079	YtxH	YtxH-like protein. This family of proteins is found in bacteria. Proteins in this family are typically between 100 and 143 amino acids in length. The N-terminal region is the most conserved. Proteins in this family are functionally uncharacterized.	0
382833	cl02087	DUF1834	Domain of unknown function (DUF1834). This family of proteins are functionally uncharacterized. One member is the Gp37 protein from the FluMu prophage.	0
295164	cl02088	Phage_tail_X	Phage Tail Protein X. This domain is found in a family of phage tail proteins. Visual analysis suggests that it is related to pfam01476 (personal obs: C Yeats). The functional annotation of family members further confirms this hypothesis.	0
413185	cl02089	Phage_tail_S	Phage virion morphogenesis family. This model describes protein S of phage P2, suggested experimentally to act in tail completion and stable head joining, and related proteins from a number of phage. [Mobile and extrachromosomal element functions, Prophage functions]	0
413186	cl02091	Glyco_transf_15	Glycolipid 2-alpha-mannosyltransferase. This is a family of alpha-1,2 mannosyl-transferases involved in N-linked and O-linked glycosylation of proteins. Some of the enzymes in this family have been shown to be involved in O- and N-linked glycan modifications in the Golgi.	0
413187	cl02092	Clat_adaptor_s	Clathrin adaptor complex small chain. 	0
413188	cl02093	Coq4	Coenzyme Q (ubiquinone) biosynthesis protein Coq4. Coq4p was shown to peripherally associate with the matrix face of the mitochondrial inner membrane. The putative mitochondrial-targeting sequence present at the amino-terminus of the polypeptide efficiently imported it to mitochondria. The function of Coq4p is unknown, although its presence is required to maintain a steady-state level of Coq7p, another component of the Q biosynthetic pathway. The overall structure of Coq4 is alpha helical and shows resemblance to haemoglobin/myoglobin (information from TOPSAN).	0
413189	cl02095	CDC50	LEM3 (ligand-effect modulator 3) family / CDC50 family. Members of this family have been predicted to contain transmembrane helices. The family member LEM3 is a ligand-effect modulator, mutation of which increases glucocorticoid receptor activity in response to dexamethasone and also confers increased activity on other intracellular receptors including the progesterone, oestrogen and mineralocorticoid receptors. LEM3 is thought to affect a downstream step in the glucocorticoid receptor pathway. Factors that modulate ligand responsiveness are likely to contribute to the context-specific actions of the glucocorticoid receptor in mammalian cells. The products of genes YNR048w, YNL323w, and YCR094w (CDC50) show redundancy of function and are involved in regulation of transcription via CDC39. CDC39 (also known as NOT1) is normally a negative regulator of transcription either by affecting the general RNA polymerase II machinery or by altering chromatin structure. One function of CDC39 is to block activation of the mating response pathway in the absence of pheromone, and mutation causes arrest in G1 by activation of the pathway. It may be that the cold-sensitive arrest in G1 noticed in CDC50 mutants may be due to inactivation of CDC39. The effects of LEM3 on glucocorticoid receptor activity may also be due to effects on transcription via CDC39.	0
413190	cl02096	Gti1_Pac2	Gti1/Pac2 family. In S. pombe the gti1 protein promotes the onset of gluconate uptake upon glucose starvation. In S. pombe the Pac2 protein controls the onset of sexual development, by inhibiting the expression of ste11, in a pathway that is independent of the cAMP cascade.	0
413191	cl02098	14-3-3	14-3-3 domain. This 14-3-3 domain family includes proteins in Caenorhabditis elegans, the silkworm (Bombyx mori) as well as barley (Hordeum vulgare). In C. elegans, 14-3-3 proteins are SIR-2.1 binding partners which induce transcriptional activation of DAF-16 during stress and are required for the life-span extension conferred by extra copies of sir-2.1. In B. mori, the 14-3-3 proteins are expressed widely in larval and adult tissues, including the brain, fat body, Malpighian tube, silk gland, midgut, testis, ovary, antenna, and pheromone gland, and interact with the N-terminal fragment of Hsp60, suggesting that 14-3-3 (a molecular adaptor) and Hsp60 (a molecular chaperone) work together to achieve a wide range of cellular functions in B. mori. In barley aleurone cells, 14-3-3 proteins and members of the ABF transcription factor family have a regulatory function in the gibberellic acid (GA) pathway since the balance of GA and abscisic acid (ABA) is a determining factor during transition of embryogenesis and seed germination. 14-3-3 is an essential part of 14-3-3 proteins, a ubiquitous class of regulatory, phosphoserine/threonine-binding proteins found in all eukaryotic cells, including yeast, protozoa and mammalian cells.	0
413192	cl02099	CK_II_beta	Casein kinase II regulatory subunit. Casein kinase II subunit beta; Provisional	0
413193	cl02102	S10_plectin	Plectin/S10 domain. 40S ribosomal protein S10; Provisional	0
413194	cl02103	Maf1	Maf1 regulator. Maf1 is a negative regulator of RNA polymerase III. It targets the initiation factor TFIIIB.	0
413195	cl02104	Ribosomal_L36e	Ribosomal protein L36e. 60S ribosomal protein L36; Provisional	0
413196	cl02106	IF4E	Eukaryotic initiation factor 4E. translation initiation factor E4; Provisional	0
413197	cl02107	Evr1_Alr	Erv1 / Alr family. Biogenesis of Fe/S clusters involves a number of essential mitochondrial proteins. Erv1p of Saccharomyces cerevisiae mitochondria is required for the maturation of Fe/S proteins in the cytosol. The ALR (augmenter of liver regeneration) represents a mammalian orthologue of yeast Erv1p. Both Erv1p and full-length ALR are located in the mitochondrial intermembrane space and are thought to operate downstream of the mitochondrial ABC transporter.	0
413198	cl02109	GWT1	GWT1. Glycosylphosphatidylinositol (GPI) is a conserved post-translational modification to anchor cell surface proteins to plasma membrane in eukaryotes. GWT1 is involved in GPI anchor biosynthesis; it is required for inositol acylation in yeast.	0
413199	cl02110	Pho88	Phosphate transport (Pho88). Members of this family of proteins are involved in regulating inorganic phosphate transport, as well as telomere length regulation and maintenance.	0
413200	cl02111	PCI	PCI domain. Also called the PCI (Proteasome, COP9, Initiation factor 3) domain. Unknown function.	0
413201	cl02113	Vac_ImportDeg	Vacuolar import and degradation protein. Members of this family are involved in the negative regulation of gluconeogenesis. They are required for both proteosome-dependent and vacuolar catabolite degradation of fructose-1,6-bisphosphatase (FBPase), where they probably regulate FBPase targeting from the FBPase-containing vesicles to the vacuole.	0
413202	cl02117	ORMDL	ORMDL family. Evidence suggests that ORMDLs are involved in protein folding in the ER. Orm proteins have been identified as negative regulators of sphingolipid synthesis that form a conserved complex with serine palmitoyltransferase, the first and rate-limiting enzyme in sphingolipid production. This novel and conserved protein complex has been termed the SPOTS complex (serine palmitoyltransferase, Orm1/2, Tsc3, and Sac1).	0
413203	cl02120	HAT_KAT11	Histone acetylation protein. Histone acetylation is required in many cellular processes including transcription, DNA repair, and chromatin assembly. This family contains the fungal KAT11 protein (previously known as RTT109) which is required for H3K56 acetylation. Loss of KAT11 results in the loss of H3K56 acetylation, both on bulk histone and on chromatin. KAT11 and H3K56 acetylation appear to correlate with actively transcribed genes and associate with the elongating form of Pol II in yeast. This family also incorporates the p300/CBP histone acetyltransferase domain which has different catalytic properties and cofactor regulation to KAT11.	0
413204	cl02121	Med31	SOH1. The family consists of Saccharomyces cerevisiae SOH1 homologs. SOH1 is responsible for the repression of temperature sensitive growth of the HPR1 mutant and has been found to be a component of the RNA polymerase II transcription complex. SOH1 not only interacts with factors involved in DNA repair, but transcription as well. Thus, the SOH1 protein may serve to couple these two processes.	0
413205	cl02122	TFIIF_beta	Transcription initiation factor IIF, beta subunit. Accurate transcription in vivo requires at least six general transcription initiation factors, in addition to RNA polymerase II. Transcription initiation factor IIF (TFIIF) is a tetramer of two beta subunits associated with two alpha subunits, which interacts directly with RNA polymerase II. The beta subunit of TFIIF is required for recruitment of RNA polymerase II onto the promoter.	0
413206	cl02125	Med6	MED6 mediator sub complex component. Component of RNA polymerase II holoenzyme and mediator sub complex.	0
413207	cl02127	RNA_pol_Rpc34	RNA polymerase Rpc34 subunit. Subunit specific to RNA Pol III, the tRNA specific polymerase. The C34 subunit of yeast RNA Pol III is part of a subcomplex of three subunits which have no counterpart in the other two nuclear RNA polymerases. This subunit interacts with TFIIIB70 and therefore participates in Pol III recruitment.	0
413208	cl02129	ParBc	ParB-like nuclease domain. This domain is probably distantly related to pfam02195, suggesting that these uncharacterized proteins have a nuclease function.	0
413209	cl02130	Got1	Got1/Sft2-like family. Traffic through the yeast Golgi complex depends on a member of the syntaxin family of SNARE proteins, Sed5, present in early Golgi cisternae. Got1 is thought to facilitate Sed5-dependent fusion events. This is a family of sequences derived from eukaryotic proteins. They are similar to a region of a SNARE-like protein required for traffic through the Golgi complex, SFT2 protein. This is a conserved protein with four putative transmembrane helices, thought to be involved in vesicular transport in later Golgi compartments.	0
413210	cl02137	PRA1	PRA1 family protein. This family includes the PRA1 (Prenylated rab acceptor) protein which is a Rab guanine dissociation inhibitor (GDI) displacement factor. This family also includes the glutamate transporter EAAC1 interacting protein GTRAP3-18.	0
413211	cl02138	G10	G10 protein. 	0
413212	cl02144	TLD	TLD. This domain is predicted to be an enzyme and is often found associated with pfam01476. Its structure consists of a beta-sandwich surrounded by two helices and two one-turn helices.	0
382862	cl02148	APC10-like	APC10-like DOC1 domains in E3 ubiquitin ligases that mediate substrate ubiquitination. This model represents the APC10/DOC1-like domain present in the uncharacterized Zinc finger ZZ-type and EF-hand domain-containing protein 1 (ZZEF1) of Mus musculus. Members of this family contain EF-hand, APC10, CUB, and zinc finger ZZ-type domains. ZZEF1-like APC10 domains are homologous to the APC10 subunit/DOC1 domains present in E3 ubiquitin ligases, which mediate substrate ubiquitination (or ubiquitylation), and are components of the ubiquitin-26S proteasome pathway for selective proteolytic degradation.	0
413213	cl02150	TAF10	The TATA Binding Protein (TBP) Associated Factor 10. The TATA Binding Protein (TBP) Associated Factor 10 (TAF 10) is one of several TAFs that bind TBP and are involved in forming the Transcription Factor IID (TFIID) complex. TFIID is one of the seven General Transcription Factors (GTF) (TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH) that are involved in accurate initiation of transcription by RNA polymerase II in eukaryotes. TFIID plays an important role in the recognition of promoter DNA and the assembly of the preinitiation complex. The TFIID complex is composed of the TBP and at least 13 TAFs. TAFs are named after their electrophoretic mobility in polyacrylamide gels in different species. Several hypotheses are proposed for TAF functions, such as serving as activator-binding sites, being involved in core-promoter recognition, or performing an essential catalytic activity. Each TAF - with the help of a specific activator - is required only for the expression of a subset of genes, and TAFs are not universally involved in transcription as the GTFs are. TAF10 regulates genes that are important for cell cycle progression and cell morphology. A lack of TAF10 leads to cell cycle arrest and cell death by apoptosis in mouse. In both yeast and human cells, TAFs have been found as components of other complexes besides TFIID. TAF10 is part of other transcription regulatory multiprotein complexes (e.g., SAGA, TBP-free TAF-containing complex [TFTC], STAGA, and PCAF/GCN5). Several TAFs interact via histone-fold motifs. The histone fold (HFD) is the interaction motif involved in heterodimerization of the core histones and their assembly into nucleosome octamer. The minimal HFD contains three alpha-helices linked by two loops. The HFD is found in core histones, TAFs and many other transcription factors. Five HF-containing TAF pairs have been described in TFIID: TAF6-TAF9, TAF4-TAF12, TAF11-TAF13, TAF8-TAF10 and TAF3-TAF10.	0
413214	cl02153	TFIIE_beta_winged_helix	TFIIE_beta_winged_helix domain, located at the central core region of TFIIE beta, with double-stranded DNA binding activity. General transcription factor TFIIE consists of two subunits, TFIIE alpha pfam02002 and TFIIE beta. TFIIE beta has been found to bind to the region where the promoter starts to open to be single-stranded upon transcription initiation by RNA polymerase II. The structure of the DNA binding core region has been solved and has a winged helix fold.	0
413215	cl02154	YL1_C	YL1 nuclear protein C-terminal domain. This domain is found in proteins of the YL1 family. These proteins have been shown to be DNA-binding and may be transcription factors. This domain is also found in proteins that are not YL1 proteins.	0
413216	cl02155	ER_lumen_recept	ER lumen protein retaining receptor. 	0
413217	cl02156	PTPLA	Protein tyrosine phosphatase-like protein, PTPLA. 3-hydroxyacyl-CoA dehydratase subunit of elongase	0
413218	cl02160	Rcd1	Cell differentiation family, Rcd1-like. Two of the members in this family have been characterized as being involved in regulation of Ste11 regulated sex genes. Mammalian Rcd1 is a novel transcriptional cofactor that mediates retinoic acid-induced cell differentiation.	0
413219	cl02161	Ssu72	Ssu72-like protein. The highly conserved and essential protein Ssu72 has intrinsic phosphatase activity and plays an essential role in the transcription cycle. Ssu72 was originally identified in a yeast genetic screen as enhancer of a defect caused by a mutation in the transcription initiation factor TFIIB. It binds to TFIIB and is also involved in mRNA elongation. Ssu72 is further involved in both poly(A) dependent and independent termination. It is a subunit of the yeast cleavage and polyadenylation factor (CPF), which is part of the machinery for mRNA 3'-end formation. Ssu72 is also essential for transcription termination of snRNAs.	0
413220	cl02162	Fip1	Fip1 motif. This short motif is about 40 amino acids in length. It is found in the Fip1 protein, a component of a yeast pre-mRNA polyadenylation factor that directly interacts with poly(A) polymerase. This region of Fip1 is needed for the interaction with the Th1 subunit of the complex and for specific polyadenylation of the cleaved mRNA precursor.	0
413221	cl02164	Utp11	Utp11 protein. This protein is found to be part of a large ribonucleoprotein complex containing the U3 snoRNA. Depletion of the Utp proteins impedes production of the 18S rRNA, indicating that they are part of the active pre-rRNA processing complex. This large RNP complex has been termed the small subunit (SSU) processome.	0
413222	cl02165	CBFB_NFYA	CCAAT-binding transcription factor (CBF-B/NF-YA) subunit B. 	0
413223	cl02166	RRS1	Ribosome biogenesis regulatory protein (RRS1). This family consists of several eukaryotic ribosome biogenesis regulatory (RRS1) proteins. RRS1 is a nuclear protein that is essential for the maturation of 25 S rRNA and the 60 S ribosomal subunit assembly in Saccharomyces cerevisiae.	0
413224	cl02170	Sec62	Translocation protein Sec62. Members of the NSCC2 family have been sequenced from various yeast, fungal and animal species including Saccharomyces cerevisiae, Drosophila melanogaster and Homo sapiens. These proteins are the Sec62 proteins, believed to be associated with the Sec61 and Sec63 constituents of the general protein secretory systems of yeast microsomes. They are also the non-selective cation (NS) channels of the mammalian cytoplasmic membrane. The yeast Sec62 protein has been shown to be essential for cell growth. The mammalian NS channel proteins have been implicated in platelet-derived growth factor (PDGF) dependent single channel current in fibroblasts. These channels are essentially closed in serum deprived tissue-culture cells and are specifically opened by exposure to PDGF. These channels are reported to exhibit equal selectivity for Na+, K+ and Cs+ with low permeability to Ca2+, and no permeability to anions. [Transport and binding proteins, Amino acids, peptides and amines]	0
413225	cl02172	Per1	Per1-like family. PER1 is required for GPI-phospholipase A2 activity and is involved in lipid remodelling of GPI-anchored proteins. PER1 is part of the CREST superfamily.	0
242920	cl02174	TAF13	The TATA Binding Protein (TBP) Associated Factor 13 (TAF13) is one of several TAFs that bind TBP and is involved in forming Transcription Factor IID (TFIID) complex. This family includes the Spt3 yeast transcription factors and the 18kD subunit from human transcription initiation factor IID (TFIID-18). Determination of the crystal structure reveals an atypical histone fold	0
413226	cl02175	Rer1	Rer1 family. RER1 family proteins are involved in the retrieval of some endoplasmic reticulum membrane proteins from the early Golgi compartment. The C-terminus of yeast Rer1p interacts with a coatomer complex.	0
413227	cl02176	TAF11	TATA Binding Protein (TBP) Associated Factor 11 (TAF11) is one of several TAFs that bind TBP and is involved in forming Transcription Factor IID (TFIID) complex. The general transcription factor, TFIID, consists of the TATA-binding protein (TBP) associated with a series of TBP-associated factors (TAFs) that together participate in the assembly of the transcription preinitiation complex. The conserved region is found at the C-terminal of most member proteins. The crystal structure of hTAFII28 with hTAFII18 shows that this region is involved in the binding of these two subunits. The conserved region contains four alpha helices and three loops arranged as in histone H3.	0
413228	cl02183	Zincin_2	Zincin-like metallopeptidase. A phylogenetic tree of the DUF2342 family (TIGR03624) consists of two major branches. One of these branches, modeled here, is found almost entirely in coenzyme F420 biosynthesizing species of the Actinobacterial, Chloroflexi and Archaeal lineages. The few organisms having genes within this family and lacking F420 biosynthesis may either have an undiscovered F420 transporter, or may represent F420-to-FMN revertants. This family includes a Chloroflexus aurantiacus protein whose crystal structure has been determined (PDB:3CMN_A). This has been annotated as a putative hydrolase, but the support for that assertion is untraceable. There is no cofactor present in the structure.	0
413229	cl02185	DUF1093	Protein of unknown function (DUF1093). This model represents a family of small (about 115 amino acids) uncharacterized proteins with N-terminal signal sequences, found exclusively in Gram-positive organisms. Most genomes that have any members of this family have at least two members. [Hypothetical proteins, Conserved]	0
413230	cl02186	Plus-3	Plus-3 domain. Plus3 domains occur in the Saccharomyces cerevisiae Rtf1p protein, which interacts with Spt6p, and in parsley CIP, which interacts with the bZIP protein CPRF1.	0
413231	cl02188	CcdA	Post-segregation antitoxin CcdA. This family consists of several Enterobacterial post-segregation antitoxin CcdA proteins. The F plasmid-carried bacterial toxin, the CcdB protein, is known to act on DNA gyrase in two different ways. CcdB poisons the gyrase-DNA complex, blocking the passage of polymerases and leading to double-strand breakage of the DNA. Alternatively, in cells that overexpress CcdB, the A subunit of DNA gyrase (GyrA) has been found as an inactive complex with CcdB. Both poisoning and inactivation can be prevented and reversed in the presence of the F plasmid-encoded antidote, the CcdA protein.	0
413232	cl02193	VirB5_like	VirB5 protein family. Based on Bacteroides thetaiotaomicron gene BT_4772, a putative uncharacterized protein. As seen in gene expression experiments (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE2231), it appears to be upregulated in the presence of host or vs when in culture.	0
413233	cl02197	MmcB-like	DNA repair protein MmcB-like. This family includes Caulobacter MmcB (CCNA_03580), which is involved in DNA repair. It has been proposed to be an endonuclease that creates the substrate for translesion synthesis.	0
413234	cl02206	DUF1312	N-Utilization Substance G (NusG) N terminal (NGN) insert and Lin0431 are part of DUF1312. This domain is found in some NusG proteins where it forms domain II. However most NusG proteins are missing this domain. In other cases this domain is found in isolation. The function of this domain is unknown.	0
413235	cl02207	IalB	Invasion associated locus B (IalB) protein. This family consists of several invasion associated locus B (IalB) proteins and related sequences. IalB is known to be a major virulence factor in Bartonella bacilliformis where it was shown to have a direct role in human erythrocyte parasitism. IalB is upregulated in response to environmental cues signaling vector-to-host transmission. Such environmental cues would include, but not be limited to, temperature, pH, oxidative stress, and haemin limitation. It is also thought that IalB would aid B. bacilliformis survival under stress-inducing environmental conditions. The role of this protein in other bacterial species is unknown.	0
413236	cl02210	DUF2335	Predicted membrane protein (DUF2335). Members of this family of hypothetical bacterial proteins have no known function.	0
413237	cl02211	DUF983	Protein of unknown function (DUF983). hypothetical protein; Provisional	0
413238	cl02212	DUF2169	Uncharacterized protein conserved in bacteria (DUF2169). This domain, found in various hypothetical prokaryotic proteins, has no known function.	0
413239	cl02216	Terminase_6C	Terminase RNaseH-like domain. This model represents the C-terminal region of a set of phage proteins typically about 400-500 amino acids in length, although some members are considerably shorter. An article on Methanobacterium phage Psi-M2 calls the member from that phage, ORF9, a putative large terminase subunit, and ORF8 a candidate terminase small subunit. Most proteins in this family have an apparent P-loop nucleotide-binding sequence toward the N-terminus. [Mobile and extrachromosomal element functions, Prophage functions]	0
413240	cl02219	Bap31	B-cell receptor-associated protein 31-like. Bap31 is a polytopic integral protein of the endoplasmic reticulum membrane and a substrate of caspase-8. Bap31 is cleaved within its cytosolic domain, generating pro-apoptotic p20 Bap31.	0
413241	cl02228	ATP12	ATP12 chaperone protein. Mitochondrial F1-ATPase is an oligomeric enzyme composed of five distinct subunit polypeptides. The alpha and beta subunits make up the bulk of protein mass of F1. In Saccharomyces cerevisiae both subunits are synthesized as precursors with amino-terminal targeting signals that are removed upon translocation of the proteins to the matrix compartment. These proteins include examples from eukaryotes and bacteria and may have chaperone activity, being involved in F1 ATPase complex assembly.	0
382891	cl02232	DUF2306	Predicted membrane protein (DUF2306). Members of this family of hypothetical bacterial proteins have no known function.	0
413242	cl02233	NTP_transf_8	Nucleotidyltransferase. This is a family of bacterial proteins that have a nucleotidyltransferase fold. The fold-prediction is backed up by conservation of three highly characteristic sequence motifs found in all other nucleotidyl transferases: i) pDhDhhh(h/p), where p is a polar residue and h is a hydrophobic residue; ii) upstream of the first, a GG/S; iii) a conserved D/E in a hydrophobic surround. In the proposed classification of nucleotidyltransferases, this is a group XVIII NTP-transferase. Many of these sequences were classified in the COG database as COG5397. The exact function is not known.	0
382893	cl02234	DUF2286	Uncharacterized protein conserved in archaea (DUF2286). Members of this family of hypothetical archaeal proteins have no known function.	0
413243	cl02235	DUF1134	Protein of unknown function (DUF1134). This family consists of several hypothetical bacterial proteins of unknown function.	0
413244	cl02241	DUF2301	Uncharacterized integral membrane protein (DUF2301). This domain, found in various hypothetical bacterial proteins, has no known function.	0
413245	cl02246	DUF2285	Uncharacterized conserved protein (DUF2285). This domain, found in various hypothetical bacterial proteins, has no known function.	0
413246	cl02247	Rop-like	Rop-like. This family contains several uncharacterized bacterial proteins. These proteins are found in nitrogen fixation operons so are likely to play some role in this process. They consist of two alpha helices which are joined by a four residue linker. The helices form an antiparallel bundle and cross towards their termini. They are likely to form a rod-like dimer. They have structural similarity to the regulatory protein Rop, pfam01815.	0
413247	cl02248	DUF2284	Predicted metal-binding protein (DUF2284). Members of this family of metal-binding hypothetical bacterial proteins have no known function.	0
413248	cl02250	DUF2298	Uncharacterized membrane protein (DUF2298). Members of this highly hydrophobic probable integral membrane family belong to two classes. In one, a single copy of the region covered by this model represents essentially the full length of a strongly hydrophobic protein of about 700 to 900 residues (variable because of long inserts in some). The domain architecture of the other class consists of an additional N-terminal region, two copies of the region represented by this model, and three to four repeats of TPR, or tetratricopeptide repeat. The unusual species range includes several Archaea, several Chloroflexi, and Clostridium phytofermentans. An unusual motif YYYxG is present, and we suggest the name Chlor_Arch_YYY protein. The function is unknown.	0
413249	cl02251	DUF2283	Protein of unknown function (DUF2283). Members of this family of hypothetical bacterial proteins have no known function.	0
413250	cl02253	SCPU	Spore Coat Protein U domain. This domain is found in a bacterial family of spore coat proteins, as well as a family of secreted pili proteins involved in motility and biofilm formation.	0
413251	cl02259	YibE_F	YibE/F-like protein. The sequences featured in this family are similar to two proteins expressed by Lactococcus lactis, YibE and YibF. Most of the members of this family are annotated as being putative membrane proteins, and in fact the sequences contain a high proportion of hydrophobic residues.	0
413252	cl02261	DUF2299	Uncharacterized conserved protein (DUF2299). Members of this family of hypothetical bacterial proteins have no known function.	0
413253	cl02262	Tm-1-like	ATP-binding domain found in plant Tm-1-like (Tm-1L) and similar proteins. hypothetical protein; Provisional	0
413254	cl02266	CbtA	Probable cobalt transporter subunit (CbtA). This model represents a family of proteins which have been proposed to act as cobalt transporters acting in concert with vitamin B12 biosynthesis systems. Evidence for this assignment includes 1) prediction of five trans-membrane segments, 2) positional gene linkage with known B12 biosynthesis genes, 3) upstream proximity of B12 transcriptional regulatory sites, 4) the absence of other known cobalt import systems and 5) the obligate co-localization with a small protein (CbtB) having a single additional trans-membrane segment and a C-terminal histidine-rich motif likely to be a metal-binding site.	0
413255	cl02268	DUF2460	Conserved hypothetical protein 2217 (DUF2460). This model represents a family of conserved hypothetical proteins. It is usually (but not always) found in apparent phage-derived regions of bacterial chromosomes. [Mobile and extrachromosomal element functions, Prophage functions]	0
413256	cl02273	HlyU	Transcriptional activator HlyU. This domain, found in various hypothetical prokaryotic proteins, has no known function. One of the sequences in this family corresponds to the transcriptional activator HlyU, indicating a possible similar role in other members.	0
413257	cl02275	RcnB	Nickel/cobalt transporter regulator. RcnB is a family of Proteobacteria proteins. RcnB is required for maintaining metal ion homeostasis, in conjunction with the efflux pump RcnA, family NicO, pfam03824.	0
413258	cl02278	DUF2164	Uncharacterized conserved protein (DUF2164). This domain, found in various hypothetical prokaryotic proteins, has no known function.	0
413259	cl02284	DUF1059	Protein of unknown function (DUF1059). This family consists of several short hypothetical archaeal proteins of unknown function.	0
413260	cl02289	DUF2190	Uncharacterized conserved protein (DUF2190). This domain, found in various hypothetical prokaryotic proteins, as well as in some putative RecA/RadA recombinases, has no known function.	0
413261	cl02290	DUF2165	Predicted small integral membrane protein (DUF2165). This domain, found in various hypothetical prokaryotic proteins, has no known function.	0
413262	cl02291	DUF2189	Predicted integral membrane protein (DUF2189). Members of this family are found in various hypothetical prokaryotic proteins, as well as putative cytochrome c oxidases. Their exact function has not, as yet, been established.	0
413263	cl02292	Crr6	Chlororespiratory reduction 6. The protein Slr1097 and its functionally equivalent cyanobacterial homologs are required for proper maturation of NdhI, a subunit of NADPH dehydrogenase complexes, so that NDH-1 complexes can assemble properly. The related protein in the model plant species Arabidopsis thaliana is known as CRR6 (chlororespiratory reduction 6).	0
413264	cl02293	DUF2158	Uncharacterized small protein (DUF2158). Members of this family of prokaryotic proteins have no known function.	0
413265	cl02294	DUF2160	Predicted small integral membrane protein (DUF2160). The members of this family of hypothetical prokaryotic proteins have no known function. It is thought that they are transmembrane proteins, but their function has not been inferred yet.	0
413266	cl02296	DUF1036	Protein of unknown function (DUF1036). This family consists of several hypothetical bacterial proteins of unknown function.	0
413267	cl02298	DUF2161	Putative PD-(D/E)XK phosphodiesterase (DUF2161). This family of proteins is functionally uncharacterized. This family of proteins is found in prokaryotes. Advanced homology-detection methods supported with superfamily-wide domain architecture and horizontal gene transfer analyses have established this family to be a member of the PD-(D/E)XK superfamily.	0
413268	cl02302	DUF2244	Integral membrane protein (DUF2244). This domain, found in various bacterial hypothetical and putative membrane proteins, has no known function.	0
413269	cl02303	DUF736	Protein of unknown function (DUF736). This family consists of several uncharacterized bacterial proteins of unknown function.	0
413270	cl02309	DUF2259	Predicted secreted protein (DUF2259). Members of this family of hypothetical bacterial proteins have no known function.	0
413271	cl02310	Glyco_hydro_81	Glycosyl hydrolase family 81. Family of eukaryotic beta-1,3-glucanases. Within the Aspergillus fumigatus protein ENGL1, two perfectly conserved Glu residues (E550 or E554) have been proposed as putative nucleophiles of the active site of the Engl1 endoglucanase, while the proton donor would be D475. The endo-beta-1,3-glucanase activity is essential for efficient spore release.	0
413272	cl02314	DUF2267	Uncharacterized conserved protein (DUF2267). This domain, found in various hypothetical bacterial proteins, has no known function.	0
242981	cl02317	DUF819	Protein of unknown function (DUF819). This family contains proteins of unknown function from archaeal, bacterial and plant species.	0
413273	cl02318	DUF1694	Protein of unknown function (DUF1694). This family contains many hypothetical proteins.	0
413274	cl02319	DUF1428	Protein of unknown function (DUF1428). This family consists of several hypothetical bacterial and one archaeal sequence of around 120 residues in length. The function of this family is unknown. The structure of this family shows it to be part of the Dimeric-alpha-beta-barrel superfamily. Many members are annotated as being RNA signal recognition particle 4.5S RNA, but this could not be verified.	0
413275	cl02324	DUF721	Protein of unknown function (DUF721). hypothetical protein; Provisional	0
413276	cl02325	Inhibitor_I42	Chagasin family peptidase inhibitor I42. Chagasin is a cysteine peptidase inhibitor which forms a beta barrel structure.	0
413277	cl02331	Intg_mem_TP0381	Integral membrane protein (intg_mem_TP0381). This model represents a family of hydrophobic proteins with seven predicted transmembrane alpha helices. Members are found in Bacillus subtilis (ywaF), TP0381 from Treponema pallidum (TP0381), Streptococcus pyogenes, Rhodococcus erythropolis, etc.	0
413278	cl02333	Bac_rhodopsin	Bacteriorhodopsin-like protein. The bacterial opsins are retinal-binding proteins that provide light-dependent ion transport and sensory functions to a family of halophilic bacteria. They are integral membrane proteins believed to contain seven transmembrane (TM) domains, the last of which contains the attachment point for retinal (a conserved lysine).	0
413279	cl02335	DUF2269	Predicted integral membrane protein (DUF2269). Members of this family of bacterial hypothetical integral membrane proteins have no known function.	0
413280	cl02337	DUF2270	Predicted integral membrane protein (DUF2270). This domain, found in various hypothetical bacterial proteins, has no known function.	0
413281	cl02338	DUF2303	Uncharacterized conserved protein (DUF2303). Members of this family of hypothetical bacterial proteins have no known function.	0
413282	cl02341	Sec66	Preprotein translocase subunit Sec66. Members of this family of proteins are a component of the heterotetrameric Sec62/63 complex composed of SEC62, SEC63, SEC66 and SEC72. The Sec62/63 complex associates with the Sec61 complex to form the Sec complex. Sec66 is involved in SRP-independent post-translational translocation across the endoplasmic reticulum and functions together with the Sec61 complex and KAR2 in a channel-forming translocon complex. Furthermore, Sec66 is also required for growth at elevated temperatures.	0
413283	cl02344	Phage_holin_1	Bacteriophage holin. Phage proteins for bacterial lysis typically include a membrane-disrupting protein, or holin, and one or more cell wall degrading enzymes that reach the cell wall because of holin action. Holins are found in a large number of mutually non-homologous families. [Mobile and extrachromosomal element functions, Prophage functions]	0
413284	cl02346	Tmemb_14	Transmembrane proteins 14C. This family of short membrane proteins is as yet uncharacterized.	0
413285	cl02349	DUF2277	Uncharacterized conserved protein (DUF2277). Members of this family of hypothetical bacterial proteins have no known function.	0
413286	cl02351	NifT	NifT/FixU protein. This largely uncharacterized protein family is assigned a role in nitrogen fixation by two criteria. First, its gene occurs, generally, among genes essential for expression of active nitrogenase. Second, its phylogenetic profile closely matches that of nitrogen-fixing bacteria. However, mutational studies in Klebsiella pneumoniae failed to demonstrate any phenotype for deletion or overexpression of the protein.	0
413287	cl02353	DUF2280	Uncharacterized conserved protein (DUF2280). Members of this family of hypothetical bacterial proteins have no known function.	0
382939	cl02355	DUF2281	Protein of unknown function (DUF2281). Members of this family of hypothetical bacterial proteins have no known function.	0
413288	cl02356	CGGC	CGGC domain. The domain has many conserved cysteines and histidines suggestive of a zinc binding function.	0
413289	cl02360	Mor	Mor transcription activator family. Mor (Middle operon regulator) is a sequence specific DNA binding protein. It mediates transcription activation through its interactions with the C-terminal domains of the alpha and sigma subunits of bacterial RNA polymerase. The N terminal region of Mor is the dimerization region, and the C terminal contains a helix-turn-helix motif which binds DNA.	0
413290	cl02363	CusF_Ec	Copper binding periplasmic protein CusF. periplasmic copper-binding protein; Provisional	0
413291	cl02366	DUF2282	Predicted integral membrane protein (DUF2282). Members of this family of hypothetical bacterial proteins and putative signal peptide proteins have no known function.	0
413292	cl02369	DUF624	Protein of unknown function, DUF624. This family includes several uncharacterized bacterial proteins.	0
413293	cl02370	DUF1810	Protein of unknown function (DUF1810). This is a family of uncharacterized proteins. The structure of one of the members in this family has been solved and it adopts a mainly alpha helical structure.	0
413294	cl02371	DUF2292	Uncharacterized small protein (DUF2292). OscA (organosulfur compound A) is a small protein, about 60 amino acids in length, in the DUF2292 family. As characterized in Pseudomonas corrugata, OscA is required during sulfur starvation for obtaining it from organosulfur compounds. The pathway is required to remediate oxidative stress from chromate, so oscA was discovered by the loss of high resistance to chromate in Pseudomonas corrugata 28 when the gene is insertionally inactivated. The oscA gene tends to be found near sulfate transporter genes.	0
413295	cl02373	DUF2293	Uncharacterized conserved protein (DUF2293). This domain, found in various hypothetical bacterial proteins, has no known function.	0
413296	cl02374	DUF2461	Conserved hypothetical protein (DUF2461). Members of this family are widely (though sparsely) distributed bacterial proteins about 230 residues in length. All members have a motif RxxRDxRFxxx[DN]KxxY. The function of this protein family is unknown. In several fungi, this model identifies a conserved region of a longer protein. Therefore, it may be incorrect to speculate that all members share a common function.	0
413297	cl02375	DUF1326	Protein of unknown function (DUF1326). This family consists of several hypothetical bacterial proteins which seem to be found exclusively in Rhizobium and Ralstonia species. Members of this family are typically around 210 residues in length and contain 5 highly conserved cysteine residues at their N-terminus. The function of this family is unknown.	0
413298	cl02376	DUF2390	Protein of unknown function (DUF2390). Members of this family are bacterial hypothetical proteins, about 160 amino acids in length, found in various Proteobacteria, including members of the genera Pseudomonas and Vibrio. The C-terminal region is poorly conserved and is not included in the model. [Hypothetical proteins, Conserved]	0
413299	cl02380	DUF2310	Zn-ribbon-containing, possibly nucleic-acid-binding protein (DUF2310). Members of this family of proteobacterial zinc ribbon proteins are thought to bind to nucleic acids, however their exact function has not as yet been defined.	0
413300	cl02381	Tim17	Tim17/Tim22/Tim23/Pmp24 family. mitochondrial import inner membrane translocase subunit tim17; Provisional	0
413301	cl02384	NOT2_3_5	NOT2 / NOT3 / NOT5 family. NOT1, NOT2, NOT3, NOT4 and NOT5 form a nuclear complex that negatively regulates the basal and activated transcription of many genes. This family includes NOT2, NOT3 and NOT5.	0
413302	cl02390	DUF2294	Uncharacterized conserved protein (DUF2294). Members of this family of hypothetical bacterial proteins have no known function.	0
413303	cl02395	DUF2291	Predicted periplasmic lipoprotein (DUF2291). Members of this family of hypothetical bacterial proteins have no known function.	0
382955	cl02396	DUF2290	Uncharacterized conserved protein (DUF2290). Members of this family of hypothetical bacterial proteins have no known function.	0
413304	cl02398	Host_attach	Protein required for attachment to host cells. Bacterial family of the archaeo-eukaryotic release factor superfamily. Likely to play roles in biological conflicts or regulation under stress conditions at the ribosome.	0
413305	cl02399	DUF2288	Protein of unknown function (DUF2288). Members of this family of hypothetical bacterial proteins have no known function.	0
413306	cl02406	DUF2274	Protein of unknown function (DUF2274). Members of this family of hypothetical bacterial proteins have no known function.	0
413307	cl02411	RES	RES domain. This presumed protein contains 3 highly conserved polar groups that could form an active site. These are an arginine, glutamate and serine, hence the RES domain. RES is widely distributed in bacteria and is about 150 residues in length.	0
413308	cl02412	Rep_1	Replication protein. Replication proteins (rep) are involved in plasmid replication. The Rep protein binds to the plasmid DNA and nicks it at the double strand origin (dso) of replication. The 3'-hydroxyl end created is extended by the host DNA replicase, and the 5' end is displaced during synthesis. At the end of one replication round, Rep introduces a second single stranded break at the dso and ligates the ssDNA extremities generating one double-stranded plasmid and one circular ssDNA form. Complementary strand synthesis of the circular ssDNA is usually initiated at the single-stranded origin by the host RNA polymerase.	0
413309	cl02415	DUF922	Bacterial protein of unknown function (DUF922). This family of proteins has a conserved HEXXH motif, suggesting they are putative peptidases of zincin fold.	0
413310	cl02417	Myelin_PLP	Myelin proteolipid protein (PLP or lipophilin). 	0
413311	cl02418	Hormone_5	Neurohypophysial hormones, C-terminal Domain. Vasopressin/oxytocin gene family.	0
413312	cl02419	Notch	LNR domain. The Notch protein is essential for the proper differentiation of the Drosophila ectoderm. This protein contains 3 NL domains.	0
413313	cl02422	HRM	Hormone receptor domain. This extracellular domain contains four conserved cysteines that probably form disulphide bridges. The domain is found in a variety of hormone receptors. It may be a ligand binding domain.	0
413314	cl02423	LRRNT	Leucine rich repeat N-terminal domain. Leucine Rich Repeats pfam00560 are short sequence motifs present in a number of proteins with diverse functions and cellular locations. Leucine Rich Repeats are often flanked by cysteine rich domains. This domain is often found at the N-terminus of tandem leucine rich repeats.	0
413315	cl02425	Osteopontin	Osteopontin. Osteopontin is an acidic phosphorylated glycoprotein of about 40 kDa which is abundant in the mineral matrix of bones and which binds tightly to hydroxyapatite. It is suggested that osteopontin might function as a cell attachment factor and could play a key role in the adhesion of osteoclasts to the mineral matrix of bone.	0
413316	cl02426	DIX	DIX domain. Domain of unknown function.	0
413317	cl02428	Ependymin	Ependymin. Ependymins are the predominant proteins in the cerebrospinal fluid (CSF) of teleost fish. They have been implicated in the neurochemistry of memory and neuronal regeneration. They are glycoproteins of about 200 amino acids that can bind calcium. Four cysteines are conserved that probably form disulfide bonds.	0
413318	cl02432	CLECT	C-type lectin (CTL)/C-type lectin-like (CTLD) domain. This family includes both long and short form C-type	0
413319	cl02434	CNH	CNH domain. Domain found in NIK1-like kinase, mouse citron and yeast ROM1, ROM2. Unpublished observations.	0
413320	cl02436	COLFI	Fibrillar collagen C-terminal domain. Found at C-termini of fibrillar collagens: Ephydatia muelleri procollagen EMF1alpha, vertebrate collagens alpha(1)III, alpha(1)II, alpha(2)V etc.	0
413321	cl02440	DAGK_acc	Diacylglycerol kinase accessory domain. Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. DAG can be produced from the hydrolysis of phosphatidylinositol 4,5-bisphosphate (PIP2) by a phosphoinositide-specific phospholipase C and by the degradation of phosphatidylcholine (PC) by a phospholipase C or the concerted actions of phospholipase D and phosphatidate phosphohydrolase. This domain might either be an accessory domain or else contribute to the catalytic domain. Bacterial homologues are known.	0
413322	cl02442	DEP	N/A. The DEP domain is responsible for mediating intracellular protein targeting and regulation of protein stability in the cell. The DEP domain is present in a number of signaling molecules, including Regulator of G protein Signaling (RGS) proteins, and has been implicated in membrane targeting. New findings in yeast, however, demonstrate a major role for a DEP domain in mediating the interaction of an RGS protein to the C-terminal tail of a GPCR, thus placing RGS in close proximity with its substrate G protein alpha subunit.	0
351761	cl02446	MATH	N/A. This motif has been called the Meprin And TRAF-Homology (MATH) domain. This domain is hugely expanded in the nematode C. elegans.	0
413323	cl02447	CRD_FZ	CRD_domain cysteine-rich domain, also known as Fz (frizzled) domain. Also known as the CRD (cysteine rich domain), the C6 box in MuSK receptor. This domain of unknown function has been independently identified by several groups. The domain contains 10 conserved cysteines.	0
413324	cl02448	Hormone_6	Glycoprotein hormone. Also called gonadotropins. Glycoprotein hormones consist of two glycosylated chains (alpha and beta) of similar topology.	0
413325	cl02449	Gla	Vitamin K-dependent carboxylation/gamma-carboxyglutamic (GLA) domain. A hyaluronan-binding domain found in proteins associated with the extracellular matrix, cell adhesion and cell migration.	0
413326	cl02451	Hydrophobin	Fungal hydrophobin. 	0
321941	cl02453	IlGF_like	N/A. Superfamily includes insulins; relaxins; insulin-like growth factor; and bombyxin. All are secreted regulatory hormones. Disulfide rich, all-alpha fold. Alignment includes B chain, linker (which is processed out of the final product), and A chain.	0
413327	cl02465	BTK	BTK motif. Zinc-binding motif containing conserved cysteines and a histidine. Always found C-terminal to PH domains (but not all PH domains are followed by BTK motifs). The crystal structure shows this motif packs against the PH domain. The PH+Btk module pair has been called the Tec homology (TH) region.	0
413328	cl02467	C4	C-terminal tandem repeated domain in type 4 procollagen. Duplicated domain in C-terminus of type 4 collagens. Mutations in alpha-5 collagen IV are associated with X-linked Alport syndrome.	0
413329	cl02471	HX	N/A. Hemopexin is a heme-binding protein that transports heme to the liver. Hemopexin-like repeats occur in vitronectin and some matrix metallopeptidases family (matrixins). The HX repeats of some matrixins bind tissue inhibitor of metallopeptidases (TIMPs).	0
413330	cl02472	IGFBP	Insulin-like growth factor binding protein. High affinity binding partners of insulin-like growth factors.	0
413331	cl02473	IL6	Interleukin-6/G-CSF/MGF family. GCSF is a family of higher eukaryotic granulocyte colony-stimulating factor proteins. Granulocyte colony-stimulating factors are cytokines that are involved in haematopoiesis. They control the production, differentiation and function of white blood cell granulocytes. GCSF binds to the extracellular Ig-like and CRH domain of its receptor GCSFR, thereby triggering the receptor to homodimerize. Homodimerization results in activation of Janus tyrosine kinase-signal transducers and other activators of transcription (JAK-STAT)-type signalling cascades.	0
413332	cl02475	LIM	LIM is a small protein-protein interaction domain, containing two zinc fingers. This family represents two copies of the LIM structural domain.	0
413333	cl02480	MyTH4	MyTH4 domain. Domain present twice in myosin-VIIa, and also present in 3 other myosins.	0
413334	cl02481	NGF	Nerve growth factor family. NGF is important for the development and maintenance of the sympathetic and sensory nervous systems.	0
413335	cl02483	PI3K_p85B	PI3-kinase family, p85-binding domain. Region of p110 PI3K that binds the p85 subunit.	0
413336	cl02484	PI3K_rbd	PI3-kinase family, ras-binding domain. Certain members of the PI3K family possess Ras-binding domains in their N-termini. These regions show some similarity (although not highly significant similarity) to Ras-binding RA domains (unpublished observation).	0
413337	cl02485	RasGEF	N/A. Guanine nucleotide exchange factor for Ras-like small GTPases.	0
413338	cl02488	SPEC	N/A. Spectrin repeat-domains are found in several proteins involved in cytoskeletal structure. These include spectrin, alpha-actinin and dystrophin. The sequence repeat used in this family is taken from the structural repeat in reference. The spectrin domain-repeat forms a three helix bundle. The second helix is interrupted by proline in some sequences. The repeats are defined by a characteristic tryptophan (W) residue at position 17 in helix A and a leucine (L) at 2 residues from the carboxyl end of helix C. Although the domain occurs in multiple repeats along sequences, the domains are actually stable on their own, i.e. they act, biophysically, like domains rather than repeats that only function when aggregated.	0
413339	cl02491	VHP	Villin headpiece domain. 	0
413340	cl02494	SapA	Saposin A-type domain. Present as four and three degenerate copies, respectively, in prosaposin and surfactant protein B. Single copies in acid sphingomyelinase, NK-lysin amoebapores and granulysin. Putative phospholipid membrane binding domains.	0
382990	cl02495	RabGAP-TBC	Rab-GTPase-TBC domain. Widespread domain present in Gyp6 and Gyp7, thereby giving rise to the notion that it performs a GTP-activator activity on Rab-like GTPases.	0
413341	cl02501	IL10	Interleukin 10. Interleukin-22 is distantly related to interleukin (IL)-10, and is produced by activated T cells. IL-22 is a ligand for CRF2-4, a member of the class II cytokine receptor family.	0
413342	cl02505	PTN_MK_N	PTN/MK heparin-binding protein family, N-terminal domain. Heparin-binding domain family.	0
413343	cl02506	SAA	Serum amyloid A protein. Serum amyloid A proteins are induced during the acute-phase response. Secondary amyloidosis is characterised by the extracellular accumulation in tissues of SAA proteins. SAA proteins are apolipoproteins.	0
413344	cl02507	SEA	SEA domain. Proposed function of regulating or binding carbohydrate sidechains.	0
413345	cl02508	Somatomedin_B	Somatomedin B domain. Somatomedin-B is a peptide, proteolytically excised from vitronectin, that is a growth hormone-dependent serum factor with protease-inhibiting activity.	0
413346	cl02509	SRCR_2	Scavenger receptor cysteine-rich domain. Members of this family form an extracellular domain of the serine protease hepsin. They are formed primarily by three elements of regular secondary structure: a 12-residue alpha helix, a twisted five-stranded antiparallel beta sheet, and a second, two-stranded, antiparallel sheet. The two beta-sheets lie at roughly right angles to each other, with the helix nestled between the two, adopting an SRCR fold. The exact function of this domain has not been identified, though it may serve to orient the protease domain or place it in the vicinity of its substrate.	0
413347	cl02510	TGF_beta	Transforming growth factor beta like domain. Family members are active as disulphide-linked homo- or heterodimers. TGFB is a multifunctional peptide that controls proliferation, differentiation, and other functions in many cell types.	0
413348	cl02511	GH64-TLP-SF	glycoside hydrolase family 64 (beta-1,3-glucanases which produce specific pentasaccharide oligomers) and thaumatin-like proteins. Family 64 glycoside hydrolases have beta-1,3-glucanase activity.	0
413349	cl02512	NTR_like	N/A. Sequence similarity between netrin UNC-6 and C345C complement protein family members, and hence the existence of the UNC-6 module, was first reported in. Subsequently, many additional members of the family were identified on the basis of sequence similarity between the C-terminal domains of netrins, complement proteins C3, C4, C5, secreted frizzled-related proteins, and type I pro-collagen C-proteinase enhancer proteins (PCOLCEs), which are homologous with the N-terminal domains of tissue inhibitors of metalloproteinases (TIMPs). The TIMPs are classified as a separate family in Pfam (pfam00965). This expanded domain family has been named as the NTR module.	0
413350	cl02516	VWD	von Willebrand factor type D domain. Von Willebrand factor contains several type D domains: D1 and D2 are present within the N-terminal propeptide whereas the remaining D domains are required for multimerisation.	0
413351	cl02517	ZU5	ZU5 domain. Domain of unknown function.	0
413352	cl02518	BTB	BTB/POZ domain. In voltage-gated K+ channels this domain is responsible for subfamily-specific assembly of alpha-subunits into functional tetrameric channels. In KCTD1 this domain functions as a transcriptional repressor. It also mediates homomultimerisation of KCTD1 and interaction of KCTD1 with the transcription factor AP-2-alpha.	0
413353	cl02520	REM	N/A. A subset of guanine nucleotide exchange factors for Ras-like small GTPases appears to possess this motif/domain N-terminal to the RasGef (Cdc25-like) domain.	0
413354	cl02521	CBM_1	Fungal cellulose binding domain. Small four-cysteine cellulose-binding domain of fungi	0
413355	cl02522	Calx-beta	Calx-beta domain. Domain in Na-Ca exchangers and integrin subunit beta4 (and some cyanobacterial proteins)	0
413356	cl02524	GAS2	Growth-Arrest-Specific Protein 2 Domain. GROWTH-ARREST-SPECIFIC PROTEIN 2 Domain	0
413357	cl02526	Peptidase_S41	C-terminal processing peptidase family S41. tail specific protease	0
413358	cl02528	Crystall	Beta/Gamma crystallin. Beta/gamma crystallins	0
413359	cl02529	Ank	Ankyrin repeat. Ankyrins are multifunctional adaptors that link specific proteins to the membrane-associated, spectrin-actin cytoskeleton. This repeat-domain is a 'membrane-binding' domain of up to 24 repeated units, and it mediates most of the protein's binding activities.	0
413360	cl02533	SOCS	N/A. The SOCS box acts as a bridge between specific substrate-binding domains and more generic proteins that comprise a large family of E3 ubiquitin protein ligases.	0
413361	cl02535	F-box-like	F-box-like. This domain is approximately 50 amino acids long, and is usually found in the N-terminal half of a variety of proteins. Two motifs that are commonly found associated with the F-box domain are the leucine rich repeats (LRRs; pfam00560 and pfam07723) and the WD repeat (pfam00400). The F-box domain has a role in mediating protein-protein interactions in a variety of contexts, such as polyubiquitination, transcription elongation, centromere binding and translational repression.	0
413362	cl02536	SAND	SAND domain. The DNA binding activity of two proteins has been mapped to the SAND domain. The conserved KDWK motif is necessary for DNA binding, and it appears to be important for dimerization. This region is also found in the putative transcription factor RegA from the multicellular green alga Volvox carteri. This region of RegA is known as the VARL domain.	0
413363	cl02539	BAG	BAG domain. BAG domains, present in Bcl-2-associated athanogene 1 and silencer of death domains	0
413364	cl02541	CIDE_N	N/A. This domain is found in CAD nuclease and ICAD, the inhibitor of CAD nuclease. The two proteins interact through this domain.	0
413365	cl02542	DnaJ	N/A. DnaJ domains (J-domains) are associated with hsp70 heat-shock system and it is thought that this domain mediates the interaction. DnaJ-domain is therefore part of a chaperone (protein folding) system. The T-antigens, although not in Prosite are confirmed as DnaJ containing domains from literature.	0
413366	cl02544	VHS_ENTH_ANTH	VHS, ENTH and ANTH domain superfamily. The C-terminal domain kinase (CTDK-1), is a three-subunit complex comprised of Ctk1, Ctk2, and Ctk3, that plays a key role in regulation of transcription and translation and in coordinating these two processes. Both Ctk2 and Ctk3 are regulated at the level of protein turnover, and are unstable proteins processed through a ubiquitin-proteasome pathway. Their physical interaction is required to protect both subunits from degradation, and both Ctk2 and Ctk3 are required for Ctk1 CTD kinase activation. The mammalian P-TEFb is mirrored by the combined complexes in yeast of the CTDK1 and the Bur1/2.	0
413367	cl02546	Granulin	Granulin. 	0
413368	cl02548	Laminin_B	Laminin B (Domain IV). 	0
413369	cl02549	OLF	Olfactomedin-like domain. 	0
351799	cl02553	Peptidase_C19	N/A. A subfamily of peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome.	0
413370	cl02554	PWWP	N/A. The PWWP domain is named after a conserved Pro-Trp-Trp-Pro motif. The domain binds to Histone-4 methylated at lysine-20, H4K20me, suggesting that it is a methyl-lysine recognition motif. Removal of two conserved aromatic residues in a hydrophobic cavity created by this domain within the full-length protein, Pdp1, abolishes the interaction of the protein with H4K20me3. In fission yeast, Set9 is the sole enzyme that catalyzes all three states of H4K20me, and Set9-mediated H4K20me is required for efficient recruitment of checkpoint protein Crb2 to sites of DNA damage. The methylation of H4K20 is involved in a diverse array of cellular processes, such as organising higher-order chromatin, maintaining genome stability, and regulating cell-cycle progression.	0
413371	cl02556	Bromodomain	N/A. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine.	0
413372	cl02557	DM	DM DNA binding domain. The DM domain is named after dsx and mab-3. dsx contains a single amino-terminal DM domain, whereas mab-3 contains two amino-terminal domains. The DM domain has a pattern of conserved zinc chelating residues C2H2C4. The dsx DM domain has been shown to dimerize and bind palindromic DNA.	0
413373	cl02558	GED	Dynamin GTPase effector domain. 	0
413374	cl02559	GPS	GPCR proteolysis site, GPS, motif. Present in latrophilin/CL-1, sea urchin REJ and polycystin.	0
413375	cl02562	PWI	PWI domain. 	0
413376	cl02563	PX_domain	The Phox Homology domain, a phosphoinositide binding module. PX domains bind to phosphoinositides.	0
413377	cl02564	PXA	PXA domain. unpubl. observations	0
413378	cl02565	RGS	Regulator of G protein signaling (RGS) domain superfamily. Members of this family adopt a structure consisting of twelve helices that fold into a compact domain that contains the overall structural scaffold observed in other RGS proteins and three additional helical elements that pack closely to it. Helices 1-9 comprise the RGS (pfam00615) fold, in which helices 4-7 form a classic antiparallel bundle adjacent to the other helices. Like other RGS structures, helices 7 and 8 span the length of the folded domain and form essentially one continuous helix with a kink in the middle. Helices 10-12 form an apparently stable C-terminal extension of the structural domain, and although other RGS proteins lack this structure, these elements are intimately associated with the rest of the structural framework by hydrophobic interactions. Members of the family bind to active G-alpha proteins, promoting GTP hydrolysis by the alpha subunit of heterotrimeric G proteins, thereby inactivating the G protein and rapidly switching off G protein-coupled receptor signalling pathways.	0
413379	cl02566	SET	SET domain. Putative methyl transferase, based on outlier plant homologues	0
413380	cl02568	WSC	WSC domain. Domain present in WSC proteins, polycystin and fungal exoglucanase	0
413381	cl02569	RasGAP	Ras GTPase Activating Domain. This family features the C-terminal regions of various plexins. Plexins are receptors for semaphorins, and plexin signalling is important in path finding and patterning of both neurons and developing blood vessels. The cytoplasmic region, which has been called a SEX domain in some members of this family, is involved in downstream signalling pathways, by interaction with proteins such as Rac1, RhoD, Rnd1 and other plexins. This domain acts as a RasGAP domain.	0
413382	cl02570	RhoGAP	N/A. GTPase activator proteins towards Rho/Rac/Cdc42-like small GTPases.	0
413383	cl02571	RhoGEF	N/A. Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases. Also called Dbl-homologous (DH) domain. It appears that pfam00169 domains invariably occur C-terminal to RhoGEF/DH domains.	0
413384	cl02573	Tudor_SF	Tudor domain superfamily. This group contains SMN, SPF30, Tudor domain-containing protein 3 (TDRD3), DNA excision repair protein ERCC-6-like 2 (ERCC6L2), and similar proteins. SMN, also called component of gems 1, or Gemin-1, is part of a multimeric SMN complex that includes spliceosomal Sm core proteins and plays a catalyst role in the assembly of small nuclear ribonucleoproteins (snRNPs), the building blocks of the spliceosome. SPF30, also called 30 kDa splicing factor SMNrp, SMN-related protein, or survival motor neuron domain-containing protein 1 (SMNDC1), is an essential pre-mRNA splicing factor required for assembly of the U4/U5/U6 tri-small nuclear ribonucleoprotein into the spliceosome. TDRD3 is a scaffolding protein that specifically recognizes and binds dimethylarginine-containing proteins. ERCC6L2, also called DNA repair and recombination protein RAD26-like (RAD26L), may be involved in early DNA damage response. It regulates RNA Pol II-mediated transcription via its interaction with DNA-dependent protein kinase (DNA-PK) to resolve R loops and minimize transcription-associated genome instability. Members of this group contain a single Tudor domain. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.	0
413385	cl02574	Annexin	Annexin. This family of annexins also includes giardin that has been shown to function as an annexin.	0
413386	cl02575	Bcl-2_like	N/A. (BH1, BH2, (BH3 (one helix only)) and not BH4(one helix only)). Involved in apoptosis regulation	0
413387	cl02578	HRDC	HRDC domain. RecQ helicases unwind DNA in an ATP-dependent manner. Sgs1 has a HRDC (helicase and RNaseD C-terminal) domain which modulates the helicase function via auxiliary contacts to DNA.	0
413388	cl02581	KRAB_A-box	KRAB (Kruppel-associated box) domain -A box. The KRAB domain (or Kruppel-associated box) is present in about a third of zinc finger proteins containing C2H2 fingers. The KRAB domain is found to be involved in protein-protein interactions. The KRAB domain is generally encoded by two exons. The regions coded by the two exons are known as KRAB-A and KRAB-B. The A box plays an important role in repression by binding to corepressors, while the B box is thought to enhance this repression brought about by the A box. KRAB-containing proteins are thought to have critical functions in cell proliferation and differentiation, apoptosis and neoplastic transformation.	0
413389	cl02594	DD_R_PKA	Dimerization/Docking domain of the Regulatory subunit of cAMP-dependent protein kinase and similar domains. cAMP-dependent protein kinase (PKA) is a serine/threonine kinase (STK), catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The inactive PKA holoenzyme is a heterotetramer composed of two phosphorylated and active catalytic subunits with a dimer of regulatory (R) subunits. Activation is achieved through the binding of the important second messenger cAMP to the R subunits, which leads to the dissociation of PKA into the R dimer and two active subunits. There are two classes of R subunits, RI and RII; each exists as two isoforms (alpha and beta) from distinct genes. These functionally non-redundant R isoforms allow for specificity in PKA signaling. RII subunits contain a phosphorylation site in their inhibitory site and are both substrates and inhibitors. RIIbeta plays an important role in adipocytes and neuronal tissues. Mice deficient with RIIbeta have small fat cells, and are resistant to obesity, diet-induced diabetes, and alcohol-induced motor defects. The R subunit contains an N-terminal dimerization/docking (D/D) domain, a linker with an inhibitory sequence, and two c-AMP binding domains. The D/D domain dimerizes to form a four-helix bundle that serves as a docking site for A-kinase-anchoring proteins (AKAPs), which facilitates the localization of PKA to specific sites in the cell. PKA is present ubiquitously in cells and interacts with many different downstream targets. It plays a role in the regulation of diverse processes such as growth, development, memory, metabolism, gene expression, immunity, and lipolysis.	0
413390	cl02596	NR_DBD_like	DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. In nearly all cases, this is the DNA binding domain of a nuclear hormone receptor. The alignment contains two Zinc finger domains that are too dissimilar to be aligned with each other.	0
413391	cl02598	Copper-fist	Copper fist DNA binding domain. The domain is named for its resemblance to a fist. It can be found in some fungal transcription factors. These proteins activate the transcription of the metallothionein gene in response to copper. Metallothionein maintains copper levels in yeast. The copper fist domain is similar in structure to metallothionein itself, and on copper binding undergoes a large conformational change, which allows DNA binding.	0
413392	cl02599	Ets	Ets-domain. variation of the helix-turn-helix motif	0
413393	cl02600	HTH_MerR-SF	Helix-Turn-Helix DNA binding domain of transcription regulators from the MerR superfamily. This domain is a DNA-binding helix-turn-helix domain.	0
413394	cl02601	PSI	Plexin repeat. A cysteine rich repeat found in several different extracellular receptors. The function of the repeat is unknown. Three copies of the repeat are found in Plexin. Two copies of the repeat are found in mahogany protein. A related C. elegans protein contains four copies of the repeat. The Met receptor contains a single copy of the repeat. The Pfam alignment shows 6 conserved cysteine residues that may form three conserved disulphide bridges, whereas shows 8 conserved cysteines. The pattern of conservation suggests that cysteines 5 and 7 (that are not absolutely conserved) form a disulphide bridge (Personal observation. A Bateman).	0
383044	cl02602	STE	STE like transcription factor. 	0
413395	cl02603	TEA	TEA/ATTS domain family. 	0
413396	cl02605	SCAN	SCAN oligomerization domain. The SCAN domain (named after SRE-ZBP, CTfin51, AW-1 and Number 18 cDNA) is found in several pfam00096 proteins. The domain has been shown to be able to mediate homo- and hetero-oligomerization.	0
413397	cl02608	BAH	N/A. This domain has been called BAH (Bromo adjacent homology) domain and has also been called ELM1 and BAM (Bromo adjacent motif) domain. The function of this domain is unknown but may be involved in protein-protein interaction.	0
413398	cl02609	Zn-ribbon	C-terminal zinc ribbon domain of RNA polymerase intrinsic transcript cleavage subunit. TFIIS is a zinc-containing transcription factor. It has been shown in vitro to have distinct biochemical activities, including binding to RNA polymerases, stimulation of transcript elongation, and activation of a nascent RNA cleavage activity in the RNA polymerase II (Pol II) elongation complex. TFIIS consists of three domains. Domain II and III are sufficient for all known TFIIS activities. Domain III is a zinc ribbon that is separated from domain II by a long linker and is indispensable for TFIIS function. The TFIIS homologs, subunits A12.2, B9, and C11, of Pol I, II, and III respectively, are required for RNA cleavage by the polymerases. In a single organism, there are tissue-specific TFIIS related proteins.	0
413399	cl02610	FF	FF domain. RhoGAP-FF1 is the FF domain of the Rho GTPase activating proteins (GAPs). These are the key proteins that make the switch between the active guanosine-triphosphate-bound form of Rho guanosine triphosphatases (GTPases) and the inactive guanosine-diphosphate-bound form. Rho guanosine triphosphatases (GTPases) are a family of proteins with key roles in the regulation of actin cytoskeleton dynamics. The RhoGAP-FF1 region contains the FF domain that has been implicated in binding to the transcription factor TFII-I; and phosphorylation of Tyr308 within the first FF domain inhibits this interaction. The RhoGAPFF1 domain constitutes the first solved structure of an FF domain that lacks the first of the two highly conserved Phe residues, but the substitution of Phe by Tyr does not affect the domain fold.	0
413400	cl02611	G-patch	G-patch domain. Yeast Spp2, a G-patch protein and spliceosome component, interacts with the ATP-dependent DExH-box splicing factor Prp2. As this interaction involves the G-patch sequence in Spp2 and is required for the recruitment of Prp2 to the spliceosome before the first catalytic step of splicing, it is proposed that Spp2 might be an accessory factor that confers spliceosome specificity on Prp2.	0
413401	cl02612	Link_Domain	N/A. Link_domain_KIAA0527_like; this domain is found in the human protein KIAA0527. Sequence-wise, it is highly similar to the link domain. The link domain is a hyaluronan-binding (HA) domain. KIAA0527 contains a single link module. The KIAA0527 gene was originally cloned from human brain tissue.	0
413402	cl02614	SPRY	SPRY domain. SPRY Domain is named from SPla and the RYanodine Receptor. Domain of unknown function. Distant homologs are domains in butyrophilin/marenostrin/pyrin homologs.	0
413403	cl02616	MACPF	MAC/Perforin domain. Membrane attack complex/ Perforin (MACPF) Superfamily; Provisional	0
413404	cl02617	Sorb	Sorbin homologous domain. First found in the peptide hormone sorbin and later in the ponsin/ArgBP2/vinexin family of proteins.	0
413405	cl02619	Smr	Smr domain. This family includes the Smr (Small MutS Related) proteins, and the C-terminal region of the MutS2 protein. It has been suggested that this domain interacts with the MutS1 protein in the case of Smr proteins and with the N-terminal MutS related region of MutS2. This domain exhibits nicking endonuclease activity that might have a role in mismatch repair or genetic recombination. It shows no significant double strand cleavage or exonuclease activity. The full-length human NEDD4-binding protein 2 also has the polynucleotide kinase activity.	0
413406	cl02620	SAD_SRA	SAD/SRA domain. Domain of unknown function in SET domain containing proteins and in Deinococcus radiodurans DRA1533.	0
413407	cl02621	TGF_beta_GS	Transforming growth factor beta type I GS-motif. An approx. 30 amino acid motif that precedes the kinase domain in types I and II TGF beta receptors. Mutation of two or more of the serines or threonines in the TTSGSGSG of TGF-beta type I receptor impairs phosphorylation and signaling activity.	0
413408	cl02622	Pre-SET	Pre-SET motif. A Cys-rich putative Zn2+-binding domain that occurs N-terminal to some SET domains. Function is unknown. Unpublished.	0
413409	cl02623	WIF	WIF domain. Occurs as extracellular domain in metazoan Ryk receptor tyrosine kinases. C. elegans Ryk is required for cell-cuticle recognition. WIF-1 binds to Wnt and inhibits its activity.	0
413410	cl02626	DNA_pol_A	Family A polymerase primarily fills DNA gaps that arise during DNA repair, recombination and replication. Family A polymerase functions primarily to fill DNA gaps that arise during DNA repair, recombination and replication. DNA-dependent DNA polymerases can be classified in six main groups based upon phylogenetic relationships with E. coli polymerase I (class A), E. coli polymerase II (class B), E. coli polymerase III (class C), euryarchaeota polymerase II (class D), human polymerase beta (class X), E. coli UmuC/DinB and eukaryotic RAP 30/Xeroderma pigmentosum variant (class Y). Family A polymerases are found primarily in organisms related to prokaryotes and include prokaryotic DNA polymerase I, mitochondrial polymerase delta, and several bacteriophage polymerases including those from odd-numbered phage (T3, T5, and T7). Prokaryotic Pol Is have two functional domains located on the same polypeptide; a 5'-3' polymerase and 5'-3' exonuclease. Pol I uses its 5' nuclease activity to remove the ribonucleotide portion of newly synthesized Okazaki fragments and DNA polymerase activity to fill in the resulting gap. A combination of phylogenomic and signature sequence-based (or phonetic) approaches is used to understand the evolutionary relationships among bacteria. DNA polymerase I is one of the conserved proteins that is used to search for protein signatures. The structure of these polymerases resembles in overall morphology a cupped human right hand, with fingers (which bind an incoming nucleotide and interact with the single-stranded template), palm (which harbors the catalytic amino acid residues and also binds an incoming dNTP) and thumb (which binds double-stranded DNA) subdomains.	0
413411	cl02628	XPG_N	XPG N-terminal domain. domain in nucleases	0
383060	cl02629	CBM_14	Chitin binding Peritrophin-A domain. This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains.	0
413412	cl02632	PRP4	pre-mRNA processing factor 4 (PRP4) like. This small domain is found on PRP4 ribonucleoproteins. PRP4 is a U4/U6 small nuclear ribonucleoprotein that is involved in pre-mRNA processing.	0
413413	cl02633	ARID	ARID/BRIGHT DNA binding domain. Members of the recently discovered ARID (AT-rich interaction domain) family of DNA-binding proteins are found in fungi and invertebrate and vertebrate metazoans. ARID-encoding genes are involved in a variety of biological processes including embryonic development, cell lineage gene regulation and cell cycle control. Although the specific roles of this domain and of ARID-containing proteins in transcriptional regulation are yet to be elucidated, they include both positive and negative transcriptional regulation and a likely involvement in the modification of chromatin structure. The basic structure of the ARID domain appears to be a series of six alpha-helices separated by beta-strands, loops, or turns, but the structured region may extend to an additional helix at either or both ends of the basic six. Based on primary sequence homology, they can be partitioned into three structural classes: Minimal ARID proteins that consist of a core domain formed by six alpha helices; ARID proteins that supplement the core domain with an N-terminal alpha-helix; and Extended-ARID proteins, which contain the core domain and additional alpha-helices at their N- and C-termini.	0
413414	cl02637	TFIIS_M	Transcription factor S-II (TFIIS), central domain. Transcription elongation by RNA polymerase II is regulated by the general elongation factor TFIIS. This factor stimulates RNA polymerase II to transcribe through regions of DNA that promote the formation of stalled ternary complexes. TFIIS is composed of three structural domains, termed I, II, and III. The two C-terminal domains (II and III), this domain and pfam01096 are required for transcription activity.	0
413415	cl02638	Hairy_orange	Hairy Orange. This domain confers specificity among members of the Hairy/E(SPL) family.	0
413416	cl02640	SAP	SAP domain. The SAP (after SAF-A/B, Acinus and PIAS) motif is a putative DNA/RNA binding domain found in diverse nuclear and cytoplasmic proteins.	0
413417	cl02642	PABP	Poly-adenylate binding protein, unique domain. Involved in homodimerisation (either directly or indirectly)	0
413418	cl02643	PSP	PSP. Proline rich domain found in numerous spliceosome associated proteins.	0
413419	cl02648	NIDO	Nidogen-like. This is a nidogen-like domain (NIDO) domain and is an extracellular domain found in nidogen and hypothetical proteins of unknown function.	0
413420	cl02649	LEM	LEM (Lap2/Emerin/Man1) domain found in emerin, lamina-associated polypeptide 2 (LAP2), inner nuclear membrane protein Man1 and similar proteins. The LEM domain is 50 residues long and is composed of two parallel alpha helices. This domain is found in inner nuclear membrane proteins. It is called the LEM domain after LAP2, Emerin, and Man1.	0
413421	cl02650	FYRN	F/Y-rich N-terminus. is sometimes closely juxtaposed with the C-terminal region (FYRC), but sometimes is far distant. Unknown function, but occurs frequently in chromatin-associated proteins.	0
413422	cl02651	FYRC	F/Y rich C-terminus. is sometimes closely juxtaposed with the N-terminal region (FYRN), but sometimes is far distant. Unknown function, but occurs frequently in chromatin-associated proteins.	0
413423	cl02652	MIF4G	MIF4G domain. Also occurs in NMD2p and CBP80. The domain is rich in alpha-helices and may contain multiple alpha-helical repeats. In eIF4G, this domain binds eIF4A, eIF3, RNA and DNA. Ponting (TiBS) "Novel eIF4G domain homologues" (in press)	0
413424	cl02653	MA3	MA3 domain. Highly alpha-helical. May contain repeats and/or regions similar to MIF4G domains Ponting (TIBS) "Novel eIF4G domain homologues" in press	0
413425	cl02656	zf-RanBP	Zn-finger in Ran binding protein and others. Zinc finger domain in Ran-binding proteins (RanBPs), and other proteins. In RanBPs, this domain binds RanGDP.	0
413426	cl02658	TAFH	NHR1 homology to TAF. Domain in Drosophila nervy, CBFA2T1, human TAF105, human TAF130, and Drosophila TAF110. Also known as nervy homology region 1 (NHR1).	0
295419	cl02659	z-alpha	Adenosine deaminase z-alpha domain. Helix-turn-helix-containing domain. Also known as Zab.	0
413427	cl02660	zf-TAZ	TAZ zinc finger. The TAZ2 domain of CBP binds to other transcription factors such as the p53 tumor suppressor protein, E1A oncoprotein, MyoD, and GATA-1. The zinc coordinating motif that is necessary for binding to target DNA sequences consists of HCCC.	0
413428	cl02661	A_deamin	Adenosine-deaminase (editase) domain. Adenosine deaminases acting on RNA (ADARs) can deaminate adenosine to form inosine. In long double-stranded RNA, this process is non-specific; it occurs site-specifically in RNA transcripts. The former is important in defense against viruses, whereas the latter may affect splicing or untranslated regions. They are primarily nuclear proteins, but a longer isoform of ADAR1 is found predominantly in the cytoplasm. ADARs are derived from the Tad1-like tRNA deaminases that are present across eukaryotes. These in turn belong to the nucleotide/nucleic acid deaminase superfamily and are characterized by a distinct insert between the two conserved cysteines that are involved in binding zinc.	0
413429	cl02662	SEP	SEP domain. The SEP domain is named after Saccharomyces cerevisiae Shp1, Drosophila melanogaster eyes closed gene (eyc), and vertebrate p47. In p47, the SEP domain has been shown to bind to and inhibit the cysteine protease cathepsin L. Most SEP domains are succeeded closely by a UBX domain.	0
413430	cl02663	Fasciclin	Fasciclin domain. This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria.	0
413431	cl02666	KU	N/A. The Ku heterodimer (composed of Ku70 and Ku80) contributes to genomic integrity through its ability to bind DNA double-strand breaks and facilitate repair by the non-homologous end-joining pathway. This is the central DNA-binding beta-barrel domain. This domain is found in both the Ku70 and Ku80 proteins that form a DNA binding heterodimer.	0
413432	cl02672	L27	L27 domain. The L27 domain is a protein interaction module that exists in a large family of scaffold proteins, functioning as an organisation centre of large protein assemblies required for the establishment and maintenance of cell polarity. L27 domains form specific heterotetrameric complexes, in which each domain contains three alpha-helices.	0
413433	cl02674	DDT	DDT domain. The DDT domain is named after (DNA binding homeobox and Different Transcription factors) and is approximately 60 residues in length. Along with the WHIM motifs, it comprises an entirely alpha helical module found in diverse eukaryotic chromatin proteins. Based on the structure of Ioc3, this module is inferred to interact with nucleosomal linker DNA and the SLIDE domain of ISWI proteins. The resulting complex forms a protein ruler that measures out the spacing between two adjacent nucleosomes. In particular, the DDT domain, in combination with the WHIM1 and WHIM2 motifs form the SLIDE domain binding pocket.	0
295427	cl02675	DZF	DZF domain. The function of this domain is unknown. It is often found associated with pfam00098 or pfam00035. This domain has been predicted to belong to the nucleotidyltransferase superfamily.	0
413434	cl02676	HSA	HSA. This domain is predicted to bind DNA and is often found associated with helicases.	0
413435	cl02677	POX	Associated with HOX. The function of this domain is unknown. It is often found in plant proteins associated with pfam00046.	0
413436	cl02684	zf-DBF	DBF zinc finger. This domain is predicted to bind metal ions and is often found associated with pfam00533 and pfam02178. It was first identified in the Drosophila chiffon gene product, and is associated with initiation of DNA replication.	0
413437	cl02686	PRY	SPRY-associated domain. SPRY and PRY domains occur on PYRIN proteins. Their function is not known.	0
413438	cl02687	RWD	RWD domain. This domain was identified in WD40 repeat proteins and Ring finger domain proteins. The function of this domain is unknown. GCN2 is the alpha-subunit of the only translation initiation factor (eIF2 alpha) kinase that appears in all eukaryotes. Its function requires an interaction with GCN1 via the domain at its N-terminus, which is termed the RWD domain after three major RWD-containing proteins: RING finger-containing proteins, WD-repeat-containing proteins, and yeast DEAD (DEXD)-like helicases. The structure forms an alpha + beta sandwich fold consisting of two layers: a four-stranded antiparallel beta-sheet, and three side-by-side alpha-helices.	0
413439	cl02688	BRK	BRK domain. The function of this domain is unknown. It is often found associated with helicases and transcription factors.	0
413440	cl02689	RUN	RUN domain. This domain is present in several proteins that are linked to the functions of GTPases in the Rap and Rab families. They could hence play important roles in multiple Ras-like GTPase signalling pathways. The domain comprises six conserved regions, which in some proteins have considerable insertions between them. The domain core is thought to take up a predominantly alpha fold, with basic amino acids in regions A and D possibly playing a functional role in interactions with Ras GTPases.	0
413441	cl02694	LCCL	LCCL domain. Rxt3 has been shown in yeast to be required for histone deacetylation.	0
413442	cl02699	VIT	Vault protein inter-alpha-trypsin domain. Inter-alpha-trypsin inhibitors (ITIs) consist of one light chain and a variable set of heavy chains. ITIs play a role in extracellular matrix (ECM) stabilisation and tumor metastasis as well as in plasma protease inhibition. The vault protein inter-alpha-trypsin (VIT) domain described here is found to the N-terminus of a von Willebrand factor type A domain (pfam00092) in ITI heavy chains (ITIHs) and their precursors.	0
413443	cl02701	Kelch_1	Kelch motif. The kelch motif was initially discovered in Kelch. In this protein there are six copies of the motif. It has been shown that Drosophila ring canal kelch protein is related to Galactose Oxidase for which a structure has been solved. The kelch motif forms a beta sheet. Several of these sheets associate to form a beta propeller structure as found in pfam00064, pfam00400 and pfam00415.	0
413444	cl02703	zf-BED	BED zinc finger. DNA-binding domain in chromatin-boundary-element-binding proteins and transposases	0
413445	cl02704	EphR_LBD	Ligand Binding Domain of Ephrin Receptors. The Eph receptors, which bind to ephrins pfam00812 are a large family of receptor tyrosine kinases. This family represents the amino terminal domain which binds the ephrin ligand.	0
413446	cl02706	Malt_amylase_C	Maltogenic Amylase, C-terminal domain. This presumed domain is functionally uncharacterized. This domain is found in bacteria. This domain is about 110 amino acids in length. This domain is found associated with pfam00128, pfam02922.	0
413447	cl02708	Big_2	Bacterial Ig-like domain (group 2). This family consists of bacterial domains with an Ig-like fold. Members of this family are found in bacterial and phage surface proteins such as intimins.	0
413448	cl02712	PGRP	N/A. This family includes zinc amidases that have N-acetylmuramoyl-L-alanine amidase activity EC:3.5.1.28. This enzyme domain cleaves the amide bond between N-acetylmuramoyl and L-amino acids in bacterial cell walls (preferentially: D-lactyl-L-Ala). The structure is known for the bacteriophage T7 structure and shows that two of the conserved histidines are zinc binding.	0
413449	cl02713	MurNAc-LAA	N/A. This family contains the bacterial stage II sporulation protein P (SpoIIP) (approximately 350 residues long). It has been shown that a block in polar cytokinesis in Bacillus subtilis is mediated partly by transcription of spoIID, spoIIM and spoIIP. This inhibition of polar division is involved in the locking in of asymmetry after the formation of a polar septum during sporulation. Engulfment in Bacillus subtilis is mediated by two complementary systems: the first includes the proteins SpoIID, SpoIIM and SpoIIP (DMP) which carry out the engulfment, and the second includes the SpoIIQ-SpoIIIAGH (Q-AH) zipper, that recruits other proteins to the septum in a second-phase of the engulfment. The course of events follows as the incorporation firstly of SpoIIB into the septum during division to serve directly or indirectly as a landmark for localising SpoIIM and then SpoIIP and SpoIID to the septum. SpoIIP and SpoIID interact together to form part of the DMP complex. SpoIIP itself has been identified as an autolysin with peptidoglycan hydrolase activity.	0
413450	cl02715	Surp	Surp module. domain present in regulators which are responsible for pre-mRNA splicing processes	0
413451	cl02716	RNA_pol_Rpb8	RNA polymerase Rpb8. RNA_pol_RpbG is a family of archaeal and fungal subunit G of DNA-directed RNA polymerase.	0
295446	cl02717	RNA_POL_M_15KD	RNA polymerases M/15 Kd subunit. 	0
413452	cl02720	PB1	N/A. Phox and Bem1p domain, present in many eukaryotic cytoplasmic signalling proteins. The domain adopts a beta-grasp fold, similar to that found in ubiquitin and Ras-binding domains. A motif, variously termed OPR, PC and AID, represents the most conserved region of the majority of PB1 domains, and is necessary for PB1 domain function. This function is the formation of PB1 domain heterodimers, although not all PB1 domain pairs associate.	0
413453	cl02729	WWE	WWE domain. The WWE domain is named after three of its conserved residues and is predicted to mediate specific protein-protein interactions in ubiquitin and ADP ribose conjugation systems.	0
413454	cl02731	CLIP	Regulatory CLIP domain of proteinases. Present in horseshoe crab proclotting enzyme N-terminal domain, Drosophila Easter and silkworm prophenoloxidase-activating enzyme.	0
413455	cl02735	DM13	Electron transfer DM13. The DM13 domain is a component of a novel electron-transfer system potentially involved in oxidative modification of animal cell-surface proteins. It contains a nearly absolutely conserved cysteine, which could be involved in a redox reaction, either as a naked thiol group or through binding a prosthetic group like heme.	0
413456	cl02739	THAP	THAP domain. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes.	0
413457	cl02748	zf-CDGSH	Iron-binding zinc finger CDGSH type. The CDGSH-type zinc finger domain binds iron rather than zinc as a redox-active pH-labile 2Fe-2S cluster. The conserved sequence C-X-C-X2-(S/T)-X3-P-X-C-D-G-(S/A/T)-H is a defining feature of this family. The domain is oriented towards the cytoplasm and is tethered to the mitochondrial membrane by a more N-terminal domain found in higher vertebrates, MitoNEET_N, pfam10660. The domain forms a uniquely folded homo-dimer and spans the outer mitochondrial membrane, orienting the iron-binding residues towards the cytoplasm.	0
413458	cl02754	zf-LITAF-like	LITAF-like zinc ribbon domain. Members of this family display a conserved zinc ribbon structure with the motif C-XX-C- separated from the more C-terminal HX-C(P)X-C-X4-G-R motif by a variable region of usually 25-30 (hydrophobic) residues. Although it belongs to one of the zinc finger's fold groups (zinc ribbon), this particular domain was first identified in LPS-induced tumor necrosis alpha factor (LITAF) which is produced in mammalian cells after being challenged with lipopolysaccharide (LPS). The hydrophobic region probably inserts into the membrane rather than traversing it. Such an insertion brings together the N- and C-terminal C-XX-C motifs to form a compact Zn2+-binding structure.	0
413459	cl02755	LAM	LA motif RNA-binding domain. This presumed domain is found at the N-terminus of La RNA-binding proteins as well as other proteins. The function of this region is uncertain.	0
413460	cl02758	AMOP	AMOP domain. This domain may have a role in cell adhesion. It is called the AMOP domain after Adhesion associated domain in MUC4 and Other Proteins. This domain is extracellular and contains a number of cysteines that probably form disulphide bridges.	0
413461	cl02759	TRAM_LAG1_CLN8	TLC domain. Protein domain with at least 5 transmembrane alpha-helices. Lag1p and Lac1p are essential for acyl-CoA-dependent ceramide synthesis, TRAM is a subunit of the translocon and the CLN8 gene is mutated in Northern epilepsy syndrome. The family may possess multiple functions such as lipid trafficking, metabolism, or sensing. Trh homologues possess additional homeobox domains.	0
413462	cl02760	NEAT	NEAr Transport domain, a component of cell surface proteins. NEAT domains are heme and/or hemoprotein-binding modules highly conserved in secondary structure. They have roles in hemoprotein binding, heme extraction and heme transfer	0
413463	cl02763	ChW	Clostridial hydrophobic W. A novel extracellular macromolecular system has been proposed based on the proteins containing ChW repeats. ChW stands for Clostridial hydrophobic with conserved W (tryptophan). This repeat was originally described in Clostridium acetobutylicum but is also found in other Gram-positive bacteria including Enterococcus faecalis, Streptococcus agalactiae and Streptomyces coelicolor.	0
383112	cl02765	zf-WRNIP1_ubi	Werner helicase-interacting protein 1 ubiquitin-binding domain. Yeast Rad18p functions with Rad5p in error-free post-replicative DNA repair. This zinc finger is likely to bind nucleic-acids.	0
413464	cl02766	NGN	N-Utilization Substance G (NusG) N-terminal (NGN) domain Superfamily. Spt5p and prokaryotic NusG are shown to contain a novel 'NGN' domain. The combined NGN and KOW motif regions of Spt5 form the binding domain with Spt4. Spt5 complexes with Spt4 as a 1:1 heterodimer and this Spt5-Spt4 complex regulates early transcription elongation by RNA polymerase II and has an imputed role in pre-mRNA processing via its physical association with mRNA capping enzymes. The Schizosaccharomyces pombe core Spt5-Spt4 complex is a heterodimer bearing a trypsin-resistant Spt4-binding domain within the Spt5 subunit.	0
413465	cl02768	PASTA	N/A. This domain is found at the C termini of several Penicillin-binding proteins and bacterial serine/threonine kinases. It binds the beta-lactam stem, which implicates it in sensing D-alanyl-D-alanine - the PBP transpeptidase substrate. It is a small globular fold consisting of 3 beta-sheets and an alpha-helix. The name PASTA is derived from PBP and Serine/Threonine kinase Associated domain.	0
413466	cl02770	CFEM	CFEM domain. This fungal specific cysteine rich domain is found in some proteins with proposed roles in fungal pathogenesis. The structure of the CFEM domain containing protein 'Surface antigen protein 2' from Candida albicans has been solved.	0
413467	cl02772	BSD	BSD domain. This domain contains a distinctive -FW- motif. It is found in a family of eukaryotic transcription factors as well as a set of proteins of unknown function.	0
275778	cl02773	HTTM	Horizontally Transferred TransMembrane Domain. Members of this protein family resemble SdpB (Sporulation Delaying Protein B), an integral membrane protein associated with production of the cannibalism peptide SdpC in Bacillus subtilis. Similar proteins are found in Myxococcus xanthus.	0
413468	cl02774	Topoisomer_IB_N	N/A. Topoisomerase I promotes the relaxation of DNA superhelical tension by introducing a transient single-stranded break in duplex DNA and is vital for the processes of replication, transcription, and recombination. This family may be more than one structural domain.	0
413469	cl02775	Oxidoreductase_nitrogenase	N/A. Members of this protein family, to date, are found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it. This metal cluster-binding family is related to nitrogenase structural protein NifD and accessory protein NifE, among others. [Energy metabolism, Methanogenesis]	0
413470	cl02776	GST_C_family	C-terminal, alpha helical domain of the Glutathione S-transferase family. Leishmania major and Trypanosoma cruzi glutathione-S-transferase (GST) has undergone gene duplication, diversification, and gene fusion leading to a four-domain enzyme which contains two repeats of a GST N-terminal domain followed by a GST C-terminal domain.	0
351886	cl02777	chaperonin_like	N/A. This family consists of GroEL, the larger subunit of the GroEL/GroES cytosolic chaperonin. It is found in bacteria, organelles derived from bacteria, and occasionally in the Archaea. The bacterial GroEL/GroES group I chaperonin is replaced by a group II chaperonin, usually called the thermosome in the Archaeota and CCT (chaperone-containing TCP) in the Eukaryota. GroEL, thermosome subunits, and CCT subunits all fall under the scope of pfam00118. [Protein fate, Protein folding and stabilization]	0
413471	cl02779	TRFH	N/A. Telomere repeat binding factor (TRF) family proteins are important for the regulation of telomere stability. The two related human TRF proteins hTRF1 and hTRF2 form homodimers and bind directly to telomeric TTAGGG repeats via the myb DNA binding domain pfam00249 at the carboxy terminus. TRF1 is implicated in telomere length regulation and TRF2 in telomere protection. Other telomere complex associated proteins are recruited through their interaction with either TRF1 or TRF2. The fission yeast protein Taz1p (telomere-associated in Schizosaccharomyces pombe) has similarity to both hTRF1 and hTRF2 and may perform the dual functions of TRF1 and TRF2 at fission yeast telomeres. This domain is composed of multiple alpha helices arranged in a solenoid conformation similar to TPR repeats. The fungal members have now also been found to carry two double strand telomeric repeat binding factors.	0
413472	cl02780	MIT_C	N/A. MIT_C is the C-terminal domain of MIT-containing proteins, pfam04212. It contains an unanticipated phospholipase d fold (PLD fold) that binds avidly to phosphoinositide-containing membranes. It is conserved in eukaryotes, though not fungi and plants, and some bacteria.	0
351888	cl02781	tetraspanin_LEL	N/A. Tetraspanin, extracellular domain or large extracellular loop (LEL), oculospanin_like family. Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. Tetraspanins are involved in diverse processes and their various functions may relate to their ability to act as molecular facilitators. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web". This subfamily contains sequences similar to oculospanin, which is found to be expressed in retinal pigment epithelium, iris, ciliary body, and retinal ganglion cells.	0
413473	cl02782	ERp29c	N/A. ERp29 is a ubiquitously expressed endoplasmic reticulum protein found in mammals. ERp29 is comprised of two domains. This domain, the C-terminal domain, has an all helical fold. ERp29 is thought to form part of the thyroglobulin folding complex.	0
413474	cl02783	TopoII_MutL_Trans	N/A. Members of this family adopt a structure consisting of a four-stranded beta-sheet backed by three alpha-helices, the last of which is over 50 amino acids long and extends from the body of the protein by several turns. This domain has been proposed to mediate intersubunit communication by structurally transducing signals from the ATP binding and hydrolysis domains to the DNA binding and cleavage domains of the gyrase holoenzyme.	0
413475	cl02784	Chelatase_Class_II	N/A. The function of CbiX is uncertain, however it is found in cobalamin biosynthesis operons and so may have a related function. Some CbiX proteins contain a striking histidine-rich region at their C-terminus, which suggests that it might be involved in metal chelation.	0
413476	cl02785	Elongation_Factor_C	N/A. This domain includes the carboxyl terminal regions of Elongation factor G, elongation factor 2 and some tetracycline resistance proteins and adopts a ferredoxin-like fold.	0
413477	cl02786	Translation_factor_III	Domain III of Elongation factor (EF) Tu (EF-TU) and related proteins. Members of this family, which are found in the initiation factors eIF2 and EF-Tu, adopt a structure consisting of a beta barrel with Greek key topology. They are required for formation of the ternary complex with GTP and initiator tRNA.	0
413478	cl02787	Translation_Factor_II_like	Domain II of Elongation factor Tu (EF-Tu)-like proteins. Elongation factor Tu consists of several structural domains, and this is usually the fourth.	0
413479	cl02788	Ser_Recombinase	N/A. The N-terminal domain of the resolvase family (this family) contains the active site and the dimer interface. The extended arm at the C-terminus of this domain connects to the C-terminal helix-turn-helix domain of resolvase - see pfam02796.	0
413480	cl02789	EFG_like_IV	N/A. This domain is found in elongation factor G, elongation factor 2 and some tetracycline resistance proteins and adopts a ribosomal protein S5 domain 2-like fold.	0
413481	cl02792	Cyt_c_Oxidase_IV	N/A. Cytochrome c oxidase, a 13 sub-unit complex, EC:1.9.3.1 is the terminal oxidase in the mitochondrial electron transport chain. This family is composed of cytochrome c oxidase subunit IV. The Dictyostelium member of this family is called COX VI. The yeast protein MTC3 appears to be the yeast COX IV subunit.	0
413482	cl02793	Cyt_c_Oxidase_Va	N/A. Cytochrome c oxidase, a 13 sub-unit complex, EC:1.9.3.1 is the terminal oxidase in the mitochondrial electron transport chain. This family is composed of cytochrome c oxidase subunit Va.	0
413483	cl02794	Cyt_c_Oxidase_VIb	N/A. Cytochrome c oxidase, a 13 sub-unit complex, EC:1.9.3.1 is the terminal oxidase in the mitochondrial electron transport chain. This family is composed of the potentially heme-binding subunit IVb of the oxidase.	0
413484	cl02795	Cyt_c_Oxidase_VIc	N/A. Cytochrome c oxidase, a 13 sub-unit complex, EC:1.9.3.1 is the terminal oxidase in the mitochondrial electron transport chain. This family is composed of cytochrome c oxidase subunit VIc.	0
413485	cl02796	Cyt_c_Oxidase_VIIa	N/A. Cytochrome c oxidase, a 13 sub-unit complex, is the terminal oxidase in the mitochondrial electron transport chain. This family also contains both heart and liver isoforms of cytochrome c oxidase subunit VIIa.	0
413486	cl02797	Cyt_c_Oxidase_VIIc	N/A. Cytochrome c oxidase, a 13 sub-unit complex, EC:1.9.3.1 is the terminal oxidase in the mitochondrial electron transport chain. This family is composed of cytochrome c oxidase subunit VIIc. The yeast member of this family is called COX VIII.	0
413488	cl02806	Laminin_N	Laminin N-terminal (Domain VI). N-terminal domain of laminins and laminin-related proteins such as Unc-6/netrins.	0
413489	cl02808	RT_like	N/A. This family includes viral RNA dependent RNA polymerase enzymes from hepatitis C virus and various plant viruses.	0
413493	cl02823	phosphagen_kinases	Phosphagen (guanidino) kinases. The substrate binding site is located in the cleft between N and C-terminal domains, but most of the catalytic residues are found in the larger C-terminal domain.	0
413500	cl02844	Arrestin_C	Arrestin (or S-antigen), C-terminal domain. Ig-like beta-sandwich fold. Scop reports duplication with N-terminal domain. Arrestins comprise a family of closely-related proteins that includes beta-arrestin-1 and -2, which regulate the function of beta-adrenergic receptors by binding to their phosphorylated forms, impairing their capacity to activate G(S) proteins; Cone photoreceptors C-arrestin (arrestin-X), which could bind to phosphorylated red/green opsins; and Drosophila phosrestins I and II, which undergo light-induced phosphorylation, and probably play a role in photoreceptor transduction.	0
413509	cl02872	DHQ_Fe-ADH	Dehydroquinate synthase-like (DHQ-like) and iron-containing alcohol dehydrogenases (Fe-ADH). The 3-dehydroquinate synthase EC:4.6.1.3 domain is present in isolation in various bacterial 3-dehydroquinate synthases and also present as a domain in the pentafunctional AROM polypeptide. 3-dehydroquinate (DHQ) synthase catalyzes the formation of dehydroquinate (DHQ) and orthophosphate from 3-deoxy-D-arabino heptulosonic 7 phosphate. This reaction is part of the shikimate pathway which is involved in the biosynthesis of aromatic amino acids.	0
413511	cl02879	Chloroa_b-bind	Chlorophyll A-B binding protein. photosystem II light-harvesting-Chl-binding protein  Lhcb6 (CP24); Provisional	0
413513	cl02885	Ebola_HIV-1-like_HR1-HR2	heptad repeat 1-heptad repeat 2 region (ectodomain) of the transmembrane subunit of various endogenous retroviruses (ERVs) and infectious retroviruses, including Ebola virus and human immunodeficiency virus type 1 (HIV-1). This family includes envelope protein from a variety of retroviruses. It includes the GP41 subunit of the envelope protein complex from human and simian immunodeficiency viruses (HIV and SIV) which mediate membrane fusion during viral entry. The family also includes bovine immunodeficiency virus, feline immunodeficiency virus and Equine infectious anaemia (EIAV). The family also includes the Gp36 protein from mouse mammary tumor virus (MMTV) and human endogenous retroviruses (HERVs).	0
295537	cl02891	E7	E7 protein, Early protein. E7 protein; Provisional	0
295552	cl02915	Voltage_gated_ClC	N/A. ClC-6-like chloride channel proteins. This CD includes ClC-6, ClC-7 and ClC-B, C, D in plants. Proteins in this family are ubiquitous in eukaryotes and their functions are unclear. They are expressed in intracellular organelle membranes. This family belongs to the ClC superfamily of chloride ion channels, which share the unique double-barreled architecture and voltage-dependent gating mechanism. The gating is conferred by the permeating anion itself, acting as the gating charge. Members of the ClC chloride ion channel superfamily perform a variety of functions including cellular excitability regulation, cell volume regulation, membrane potential stabilization, acidification of intracellular organelles, signal transduction, and transepithelial transport in animals.	0
413523	cl02916	POLO_box	Polo-box domain (PBD), a C-terminal tandemly repeated region of polo-like kinases. The polo-like Ser/Thr kinases (Plk1, Plk2/Snk, Plk3/Prk/Fnk, Plk4/Sak, and the inactive kinase Plk5) play various roles in cytokinesis and mitosis. At their C-terminus, they contain a tandemly repeated polo-box domain (in the case of Plk4, a tandem repeat of cryptic PBDs is found in the middle of the protein followed by a C-terminal single repeat), which appears to be involved in autoinhibition and in mediating the subcellular localization. The latter may be controlled via interactions between the polo-box domain and phospho-peptide motifs. The phosphopeptide binding site is formed at the interface between the two tandemly repeated PBDs. The PBDs of Plk4/Sak appear unique in participating in homodimer interactions, though it is not clear whether and how they interact with phosphopeptides.	0
413528	cl02928	TGFb_propeptide	TGF-beta propeptide. The DNRLRE domain, with a length of about 160 amino acids, appears typically in large, repetitive surface proteins of bacteria and archaea, sometimes repeated several times. It occurs, notably, three times in the C-terminal region of the enzyme disaggregatase from the archaeal species Methanosarcina mazei, each time with the motif DNRLRE, for which the domain is named.  Archaeal proteins within this family are described particularly well by the currently more narrowly defined Pfam model, PF06848. Note that the catalytic region of disaggregatase, in the N-terminal portion of the protein, is modeled by a different HMM, PF08480.	0
413529	cl02929	Cation_ATPase_C	Cation transporting ATPase, C-terminus. PhoLip_ATPase_C is found at the C-terminus of a number of phospholipid-translocating ATPases. It is found in higher eukaryotes.	0
413530	cl02930	Cation_ATPase_N	Cation transporter/ATPase, N-terminus. This entry represents the conserved N-terminal region found in several classes of cation-transporting P-type ATPases, including those that transport H+, Na+, Ca2+, Na+/K+, and H+/K+. In the H+/K+- and Na+/K+-exchange P-ATPases, this domain is found in the catalytic alpha chain. In gastric H+/K+-ATPases, this domain undergoes reversible sequential phosphorylation inducing conformational changes that may be important for regulating the function of these ATPases.	0
413532	cl02948	GH20_hexosaminidase	N/A. This family consists of several uncharacterized proteins found in various Bacteroides and Chloroflexus species. The function of this family is unknown.	0
413533	cl02954	Gas_vesicle	Gas vesicle protein. 	0
413536	cl02959	Glyco_hydro_9	Glycosyl hydrolase family 9. endoglucanase	0
413539	cl02977	Ribosomal_L15e	Ribosomal L15. 50S ribosomal protein L15e; Validated	0
413546	cl02990	ASC	Amiloride-sensitive sodium channel. The Epithelial Na+ Channel (ENaC) Family (TC 1.A.06). The ENaC family consists of sodium channels from animals and has no recognizable homologues in other eukaryotes or bacteria. The vertebrate ENaC proteins from epithelial cells cluster tightly together on the phylogenetic tree: voltage-insensitive ENaC homologues are also found in the brain. Eleven sequenced C. elegans proteins, including the degenerins, are distantly related to the vertebrate proteins as well as to each other. At least some of these proteins form part of a mechano-transducing complex for touch sensitivity. Other members of the ENaC family, the acid-sensing ion channels, ASIC1-3, are homo- or hetero-oligomeric neuronal H+-gated channels that mediate pain sensation in response to tissue acidosis. The homologous Helix aspersa (FMRF-amide)-activated Na+ channel is the first peptide neurotransmitter-gated ionotropic receptor to be sequenced. Mammalian ENaC is important for the maintenance of Na+ balance and the regulation of blood pressure. Three homologous ENaC subunits, a, b and g, have been shown to assemble to form the highly Na+-selective channel. This model is designed from the vertebrate members of the ENaC family. [Transport and binding proteins, Cations and iron carrying compounds]	0
413547	cl02993	P2X_receptor	ATP P2X receptor. ATP-gated Cation Channel (ACC) Family (TC 1.A.7). Members of the ACC family (also called P2X receptors) respond to ATP, a functional neurotransmitter released by exocytosis from many types of neurons. These channels, which function at neuron-neuron and neuron-smooth muscle junctions, may play roles in the control of blood pressure and pain sensation. They may also function in lymphocyte and platelet physiology. They are found only in animals. ACC channels are probably hetero- or homomultimers and transport small monovalent cations (Me+). Some also transport Ca2+; a few also transport small metabolites. [Transport and binding proteins, Cations and iron carrying compounds]	0
413550	cl03000	Innexin	Innexin. viral innexin-like protein; Provisional	0
413552	cl03008	ATP-synt_8	ATP synthase protein 8. ATP synthase F0 subunit 8; Provisional	0
413553	cl03012	Ammonium_transp	Ammonium Transporter Family. Members of this protein family are a well-conserved subclass of putative ammonium transporters, belonging to the much broader set of ammonium/methylammonium transporters described by TIGR00836. Species with this transporter tend to be marine bacteria. Partial phylogenetic profiling (PPP) picks a member of this protein family as the single best-scoring protein vs. a reference profile for the marine environment Genome Property for a large number of different query genomes. This finding by PPP suggests that this transporter family represents an important adaptation to the marine environment.	0
413555	cl03019	VSV_P-protein-C_like	C-terminal domain of Vesicular stomatitis Indiana virus phosphoprotein and related proteins. This family includes the C-terminal domain of the P protein of plant viruses belonging to the Rhabdoviridae animal family such as Vesicular stomatitis Indiana virus (VSV). The family Rhabdoviridae belongs to the order Mononegavirales which are nonsegmented negative-stranded RNA viruses (NNVs). The genomes of NNVs are encapsidated by their nucleocapsid (N) proteins to form N-RNA complexes which serve as a template for transcription and replication. The C-terminus of P protein binds nucleocapsid. P protein plays multiple roles in transcription and translation, which include acting as a chaperone of nascent nucleoprotein (N), and as a cofactor of the viral polymerase (L) where P forms a two-subunit polymerase with a large catalytic subunit (L) and stabilizes the polymerase on its template of N-RNA.	0
413558	cl03026	CBM_3	Cellulose binding domain. 	0
413563	cl03042	MHC_II_beta	Class II histocompatibility antigen, beta domain. Class II MHC glycoproteins are expressed on the surface of antigen-presenting cells (APC), including macrophages, dendritic cells and B cells. MHC II proteins present peptide antigens that originate extracellularly from foreign bodies such as bacteria. Proteins from the pathogen are degraded into peptide fragments within the APC, which sequesters these fragments into the endosome so they can bind to MHC class II proteins, before being transported to the cell surface. MHC class II receptors display antigens for recognition by helper T cells (stimulate development of B cell clones) and inflammatory T cells (cause the release of lymphokines that attract other cells to site of infection).	0
413568	cl03055	DNA_gyraseB_C	DNA gyrase B subunit, carboxyl terminus. TOPRIM_C is found as the C-terminal extension of the TOPRIM domain, pfam01751 in metazoa.	0
413569	cl03056	CPSase_sm_chain	Carbamoyl-phosphate synthase small chain, CPSase domain. The carbamoyl-phosphate synthase domain is in the amino terminus of protein. Carbamoyl-phosphate synthase catalyses the ATP-dependent synthesis of carbamyl-phosphate from glutamine or ammonia and bicarbonate. This important enzyme initiates both the urea cycle and the biosynthesis of arginine and/or pyrimidines. The carbamoyl-phosphate synthase (CPS) enzyme in prokaryotes is a heterodimer of a small and large chain. The small chain promotes the hydrolysis of glutamine to ammonia, which is used by the large chain to synthesise carbamoyl phosphate. The small chain has a GATase domain in the carboxyl terminus.	0
413570	cl03058	MHC_II_alpha	Class II histocompatibility antigen, alpha domain. Class II MHC glycoproteins are expressed on the surface of antigen-presenting cells (APC), including macrophages, dendritic cells and B cells. MHC II proteins present peptide antigens that originate extracellularly from foreign bodies such as bacteria. Proteins from the pathogen are degraded into peptide fragments within the APC, which sequesters these fragments into the endosome so they can bind to MHC class II proteins, before being transported to the cell surface. MHC class II receptors display antigens for recognition by helper T cells (stimulate development of B cell clones) and inflammatory T cells (cause the release of lymphokines that attract other cells to site of infection).	0
351936	cl03065	Flavi_M	Flavivirus envelope glycoprotein M. Flaviviruses are small enveloped viruses with virions comprised of 3 proteins called C, M and E. The envelope glycoprotein M is made as a precursor, called prM. The precursor portion of the protein is the signal peptide for the proteins entry into the membrane. prM is cleaved to form M in a late-stage cleavage event. Associated with this cleavage is a change in the infectivity and fusion activity of the virus.	0
413575	cl03075	GrpE	nucleotide exchange factor GrpE. heat shock protein GrpE; Provisional	0
413580	cl03088	MobM_relaxase	relaxase domain of MobM and similar proteins. With some plasmids, recombination can occur in a site specific manner that is independent of RecA. In such cases, the recombination event requires another protein called Pre. Pre is a plasmid recombination enzyme. This protein is also known as Mob (conjugative mobilisation).	0
413585	cl03093	Defensin_2	Arthropod defensin. The actinodefensin family is named (here) as an Actinomyces-specific branch of the (otherwise eukaryotic) arthropod defensin family described by Pfam model PF01097.	0
413588	cl03104	CKS	Cyclin-dependent kinase regulatory subunit. cyclin-dependent kinases regulatory subunit; Provisional	0
413590	cl03107	ETX_MTX2	Clostridium epsilon toxin ETX/Bacillus mosquitocidal toxin MTX2. This family represents the pore forming lobe of aerolysin.	0
413592	cl03113	Peptidase_U32	Peptidase family U32. putative protease; Provisional	0
413593	cl03114	RNase_PH	RNase PH-like 3'-5' exoribonucleases. This family includes 3'-5' exoribonucleases. Ribonuclease PH contains a single copy of this domain, and removes nucleotide residues following the -CCA terminus of tRNA. Polyribonucleotide nucleotidyltransferase (PNPase) contains two tandem copies of the domain. PNPase is involved in mRNA degradation in a 3'-5' direction. The exosome is a 3'-5' exoribonuclease complex that is required for 3' processing of the 5.8S rRNA. Three of its five protein components contain a copy of this domain. A hypothetical protein from S. pombe appears to belong to an uncharacterized subfamily. This subfamily is found in both eukaryotes and archaebacteria.	0
413596	cl03119	FpgNei_N	N-terminal domain of Fpg (formamidopyrimidine-DNA glycosylase, MutM)_Nei (endonuclease VIII) base-excision repair DNA glycosylases. Formamidopyrimidine-DNA glycosylase (Fpg) is a DNA repair enzyme that excises oxidized purines from damaged DNA. This family is the N-terminal domain, which contains eight beta-strands forming a beta-sandwich with two alpha-helices parallel to its edges.	0
413597	cl03120	ELO	GNS1/SUR4 family. fatty acid elongase; Provisional	0
413601	cl03129	T2SSN	Type II secretion system (T2SS), protein N. Members of this family are the N (or GspN) protein of type II secretion systems (T2SS) as found in Leptospira, Geobacter, Myxococcus, and several other genera. Sequence similarity to GspN as found in, say, Gammaproteobacteria (see pfam01203) is extremely remote. [Protein fate, Protein and peptide secretion and trafficking]	0
413603	cl03131	Dynein_light	Dynein light chain type 1. dynein light chain; Provisional	0
413607	cl03141	Ribosomal_S7e	Ribosomal protein S7e. 40S ribosomal protein S7; Provisional	0
413613	cl03152	TbpB_B_D	C-lobe and N-lobe beta barrels of Tf-binding protein B. HpuA is a family of Neisseria spp proteins from the hpuAB operon, which are putative porphyrin transporters.	0
413614	cl03164	Col_Im_like	inhibitory immunity (Im) protein of colicin (Col) deoxyribonuclease (DNase) and pyocins. This family contains inhibitory immunity (Im) proteins that bind to colicin endonucleases (DNases) or pyocins with very high affinity and specificity; this is critical for the neutralization of endogenous DNase catalytic activity and for protection against exogenous DNase bacteriocins. The DNase colicin family (ColE2, ColE7, ColE8 and ColE9) in E. coli, and pyocin family (S1, S2, S3 and AP41) in P. aeruginosa, are potent bacteriocins where the immunity proteins (Ims) protect the colicin/pyocin producing (i.e. colicinogenic) bacteria by binding and inactivating colicin nucleases. The binding affinities between cognate and non-cognate nucleases by Im proteins can vary up to 10 orders of magnitude.	0
351959	cl03170	CheB_like	methylesterase CheB domain family. This family contains the methylesterase CheB (EC 3.1.1.61; also known as CheB methylesterase, chemotaxis-specific methylesterase, methyl-accepting chemotaxis protein methyl-esterase, or protein methyl-esterase) domain, a phosphorylation-activated response regulator involved in reversible modification of bacterial chemotaxis receptors, fused with a CheR domain as well as other domains. Signaling output of the chemotaxis receptors is modulated by CheB and methyltransferase CheR by controlling the level of receptor methylation. cheB and cheR are typically found in the same operon. However, CheB and CheR are fused in multi-domain proteins in this subgroup. The CheR protein/domain includes an all-alpha N-terminal domain and an S-adenosylmethionine-dependent methyltransferase C-terminal domain. Reversible methylation of transmembrane chemoreceptors plays an important role in ligand-dependent signaling and cellular adaptation in bacterial chemotaxis. Phosphorylated CheB catalyzes deamidation of specific glutamine residues in the cytoplasmic region of the chemoreceptors and demethylation of specific methyl glutamate residues introduced into the chemoreceptors by CheR.	0
413619	cl03179	PARP_regulatory	Poly A polymerase regulatory subunit. poly(A) polymerase small subunit; Provisional	0
413620	cl03181	Peptidase_C25_N	Peptidase C25 family N-terminal domain, found in Arg-gingipain (Rgp), Lys-gingipain (Kgp) and related proteins. Domains in this subgroup are uncharacterized members of the Peptidase family C25 N-terminal domain family. Peptidases family C25 are a unique class of cysteine proteases, exemplified by gingipain, which is produced by Porphyromonas gingivalis. P. gingivalis is one of the primary gram-negative pathogens that causes periodontitis, a disease that is also associated with other diseases such as diabetes and cardiovascular disease. Gingipains are a group of extracellular Arg- and Lys-specific proteinases called Arg-gingipain (Rgp) and Lys-gingipain (Kgp); RgpA and RgpB are homologous Arg-specific gingipains encoded by two closely related genes, rgpA and rgpB, while Lys-specific gingipain is encoded by the single kgp gene. Mutant studies have shown that, among the large quantities of proteolytic enzymes produced by P. gingivalis, these three proteases are major virulence factors of this bacterium. All three genes encode an N-terminal pre-pro fragment, followed by the protease domain; however, rgpA and kgp also encode additional C-terminal HA (hemagglutinin/adhesion) subunits which consist of several sequence-related adhesion domains. Although unique, their cysteine protease active site residues (His and Cys) forming the catalytic dyad are well-conserved, cleaving the C-terminal peptide bond with Arg or Lys residues. Gingipains are evolutionarily related to other highly specific proteases including caspases, clostripain, legumains, and separase. Gingipains function by dysregulating host defense and inflammatory responses, and degrading host proteins, e.g. tissue, cells, matrix, plasma and immunological proteins. They are proposed to enhance gingival crevicular fluid (GCF) production through activation of the kallikrein/kinin pathways, thus increasing vascular permeability and causing gingival inflammation, a distinctive feature of periodontitis. RgpA and RgpB are also able to cleave and activate coagulation factors IX and X in order to activate prothrombin to produce thrombin, which in turn increases production of GCF. The gingipains also play a pivotal role in the survival of P. gingivalis in the host by attacking the host defense system through cleavage of several immunological molecules, while at the same time evading the host-immune response by dysregulating the cytokine network.	0
413625	cl03191	CpcD	CpcD/allophycocyanin linker domain. 	0
413628	cl03205	Jacalin_like	Jacalin-like lectin domain. This beta-prism fold lectin is the C-terminal domain of the Vibrio cholerae cytolytic pore-forming toxin hemolysin. It binds to N-glycans with a heptasaccharide GlcNAc4Man3 core (NGA2).	0
413635	cl03224	Porin3	Eukaryotic porin family that forms channels in the mitochondrial outer membrane. MDM10 is a family of eukaryotic proteins that forms a subunit of the SAM complex for biogenesis of beta-barrel proteins, though not porins, into the outer mitochondrial membrane.	0
413636	cl03225	GRIP	GRIP domain. The GRIP (golgin-97, RanBP2alpha,Imh1p and p230/golgin-245) domain is found in many large coiled-coil proteins. It has been shown to be sufficient for targeting to the Golgi. The GRIP domain contains a completely conserved tyrosine residue. At least some of these domains have been shown to bind to GTPase Arl1.	0
413637	cl03230	DAHP_synth_2	Class-II DAHP synthetase family. phospho-2-dehydro-3-deoxyheptonate aldolase	0
413643	cl03253	SAM_decarbox	Adenosylmethionine decarboxylase. This enzyme is a key regulatory enzyme of the polyamine synthetic pathway. This protein is a pyruvoyl-dependent enzyme. The proenzyme is cleaved at a Ser residue that becomes a pyruvoyl group active site. [Central intermediary metabolism, Polyamine biosynthesis]	0
383302	cl03283	Allergen_V_VI	Group V, VI major allergens from grass, including Phlp 5, Phlp 6, Pha a 5 and Lol p 5. This family contains grass pollen proteins of group V. Phleum pratense pollen allergen Phl p 5b has been shown to possess ribonuclease activity.	0
413657	cl03302	Glyco_hydro_12	Glycosyl hydrolase family 12. hypothetical protein; Provisional	0
413658	cl03304	Plasmid_parti	Putative plasmid partition protein. This family consists of conserved hypothetical proteins from Borrelia burgdorferi, the Lyme disease spirochaete, some of which are putative plasmid partition proteins.	0
413667	cl03348	Ribosomal_L22e	Ribosomal L22e protein family. 60S ribosomal protein L22; Provisional	0
413668	cl03350	Ribosomal_L28e	Ribosomal L28e protein family. 60S ribosomal protein L28; Provisional	0
413669	cl03352	Ribosomal_L38e	Ribosomal L38e protein family. 60S ribosomal protein L38; Provisional	0
413671	cl03356	DcrB	DcrB. This family consists of the 23 kDa subunit of oxygen evolving system of photosystem II or PsbP from various plants (where it is encoded by the nuclear genome) and Cyanobacteria. The 23 KDa PsbP protein is required for PSII to be fully operational in vivo, it increases the affinity of the water oxidation site for Cl- and provides the conditions required for high affinity binding of Ca2+.	0
413673	cl03371	Peptidase_G1_like	Peptidases of the G1 family and homologs that might lack peptidase activity. This family of proteins is found in bacteria. Proteins in this family are typically between 236 and 351 amino acids in length. The member from Bacillus subtilis, UniProtKB:O05411, is named YrpD.	0
413677	cl03379	Myo5-like_CBD	Cargo binding domain of myosin 5 and similar proteins. The DIL domain has no known function.	0
413678	cl03381	pVHL	von Hippel-Lindau (pVHL) tumor suppressor protein. VHL forms a ternary complex with the elonginB and elonginC proteins. This complex binds Cul2, which then is involved in regulation of vascular endothelial growth factor mRNA.	0
413680	cl03398	DUF111	Protein of unknown function DUF111. Members of this family are found in the Archaea and in several different bacterial lineages. The function is unknown and the genomic context is not well conserved. [Hypothetical proteins, Conserved]	0
413681	cl03400	DUF137	Protein of unknown function DUF137. This family of archaeal proteins has no known function.	0
413688	cl03420	Gallidermin	Gallidermin. Mutacins are lantibiotics in the epidermin/gallidermin/nisin family, found in the biofilm-forming dental caries pathogen Streptococcus mutans. Named members of the family include mutacin I and mutacin 1140. This HMM separates the mutacins (MutA) from paralog MutA' encoded nearby, which lacks mutacin activity.	0
413692	cl03428	MAS20	MAS20 protein import receptor. [Transport and binding proteins, Amino acids, peptides and amines]	0
413697	cl03445	V35-RBD_P-protein-C_like	C-terminal RNA-binding domain (RBD) of Ebola virus VP35 phosphoprotein and related proteins. This family includes the C-terminal RNA-binding domain (RBD) of the P protein of viruses belonging to the Filoviridae family, such as Ebola virus or Marburg virus. VP35-RBD contains two subdomains: an alpha-helical subdomain and a beta-sheet subdomain. Virus infection typically activates host innate immunity, including the interferon (IFN) signaling pathway; VP35-RBD binds double-stranded RNA (dsRNA) inhibiting IFN-alpha/beta signaling. The family Filoviridae belongs to the order Mononegavirales which are nonsegmented negative-stranded RNA viruses (NNVs). The genomes of NNVs are encapsidated by their nucleocapsid (N) proteins to form N-RNA complexes which serve as a template for transcription and replication. The C-terminus of P protein binds nucleocapsid. P protein plays multiple roles in transcription and translation, which include acting as a chaperone of nascent nucleoprotein (N), and as a cofactor of the viral polymerase (L) where P forms a two-subunit polymerase with a large catalytic subunit (L) and stabilizes the polymerase on its template of N-RNA.	0
413700	cl03449	M35_like	Peptidase M35 family. This is the catalytic region of aspzincins, a group of lysine-specific metallo-endopeptidases in the MEROPS:M35 family. They exhibit the following active-site architecture. The active site is composed of two helices and a loop region and includes the HExxH and GTxDxxYG motifs. In UniProt:P81054, His117, His121 and Asp130 coordinate to the catalytic zinc ligands. An electrostatically negative region composed of Asp154 and Glu157 attracts a positively charged Lys side chain of a substrate in a specific manner.	0
351997	cl03493	Alpha_TIF	Alpha trans-inducing protein (Alpha-TIF). Alpha-TIF (VP16) from Herpes Simplex virus is an essential tegument protein involved in the transcriptional activation of viral immediate early (IE) promoters (alpha genes) during the lytic phase of viral infection. VP16 associates with cellular transcription factors to enhance transcription rates, including the general transcription factor TFIIB and the transcriptional coactivator PC4. The N-terminal residues of VP16 confer specificity for the IE genes, while the C-terminal residues are responsible for transcriptional activation. Within the C-terminal region are two activation regions that can independently and cooperatively activate transcription. VP16 forms a transcriptional regulatory complex with two cellular proteins, the POU-domain transcription factor Oct-1 and the cell-proliferation factor HCF-1. VP16 is an alpha/beta protein with an unusual fold. Other transcription factors may have a similar topology.	0
413718	cl03503	Fe_hyd_SSU	Iron hydrogenase small subunit. Many microorganisms, such as methanogenic, acetogenic, nitrogen-fixing, photosynthetic, or sulphate-reducing bacteria, metabolise hydrogen. Hydrogen activation is mediated by a family of enzymes, termed hydrogenases, which either provide these organisms with reducing power from hydrogen oxidation, or act as electron sinks. There are two hydrogenase families that differ functionally from each other: NiFe hydrogenases tend to be more involved in hydrogen oxidation, while Iron-only FeFe (Fe only) hydrogenases in hydrogen production. Fe only hydrogenases show a common core structure, which contains a moiety, deeply buried inside the protein, with an Fe-Fe dinuclear centre, nonproteic bridging, terminal CO and CN- ligands attached to each of the iron atoms, and a dithio moiety, which also bridges the two iron atoms and has been tentatively assigned as a di(thiomethyl)amine. This common core also harbours three [4Fe-4S] iron-sulphur clusters. In FeFe hydrogenases, as in NiFe hydrogenases, the set of iron-sulphur clusters is dispersed regularly between the dinuclear Fe-Fe centre and the molecular surface. These clusters are distant by about 1.2 nm from each other but the [4Fe-4S] cluster closest to the dinuclear centre is covalently bound to one of the iron atoms through a thiolate bridging ligand. The moiety including the dinuclear centre, the thiolate bridging ligand, and the proximal [4Fe-4S] cluster is known as the H-cluster. A channel, lined with hydrophobic amino acid side chains, nearly connects the dinuclear centre and the molecular surface. Furthermore hydrogen-bonded water molecule sites have been identified at the interior and at the surface of the protein. The small subunit is comprised of alternating random coil and alpha helical structures that encompass the large subunit in a novel protein fold.	0
413722	cl03508	TFIIA_gamma_N	Gamma subunit of transcription initiation factor IIA, N-terminal helical domain. Accurate transcription in vivo requires at least six general transcription initiation factors, in addition to RNA polymerase II. Transcription initiation factor IIA (TFIIA) is a multimeric protein which facilitates the binding of TFIID to the TATA box. The N-terminal domain of the gamma subunit is a 4 helix bundle.	0
413727	cl03519	UreI_AmiS_like	UreI/AmiS family, proton-gated urea channel and putative amide transporters. This family includes UreI and proton gated urea channel as well as putative amide transporters.	0
295899	cl03540	HDC	Histidine carboxylase PI chain. This enzyme converts histidine to histamine in a single step by catalyzing the release of CO2. This type is synthesized as an inactive single chain precursor, then cleaved into two chains. The Ser at the new N-terminus at the cleavage site is converted to a pyruvoyl group essential for activity. This type of histidine decarboxylase is known so far only in some Gram-positive bacteria, where it may play a role in amino acid catabolism. There is also a pyridoxal phosphate type histidine decarboxylase, as found in human, where histamine is a biologically active amine. [Energy metabolism, Amino acids and amines]	0
413743	cl03554	Decorin_bind	Decorin binding protein. This family consists of decorin binding proteins from Borrelia. The decorin binding protein of Borrelia burgdorferi, the Lyme disease spirochete, adheres to the proteoglycan decorin found on collagen fibers.	0
413747	cl03563	MraZ	protein domain of unknown function (UPF0040) includes MraZ. This small 70 amino acid domain is found duplicated in a family of bacterial proteins. These proteins may be DNA-binding transcription factors (Pers. comm. A Andreeva & A Murzin). It is likely, due to the similarity of fold, that this family acts as a bacterial antitoxin like the MazE antitoxin family.	0
413748	cl03567	Ycf4	Ycf4. photosystem I assembly protein Ycf4; Provisional	0
186578	cl03578	MerT	MerT mercuric transport protein. putative mercuric transport protein; Provisional	0
413752	cl03585	PSI_PsaE	Photosystem I reaction centre subunit IV / PsaE. photosystem I reaction center subunit IV; Provisional	0
413755	cl03589	Chalcone_3	Chalcone isomerase-like. Chalcone-flavanone isomerase is a plant enzyme responsible for the isomerisation of chalcone to naringenin, 4',5,7-trihydroxyflavanone, a key step in the biosynthesis of flavonoids.	0
413762	cl03620	DUF5011	Domain of unknown function (DUF5011). This domain is known as the HYR (Hyalin Repeat) domain, after the protein hyalin that is composed exclusively of this repeat. This domain probably corresponds to a new superfamily in the immunoglobulin fold. The function of this domain is uncertain; it may be involved in cell adhesion.	0
383422	cl03627	PSI_PsaF	Photosystem I reaction centre subunit III. photosystem I reaction center subunit III; Provisional	0
413768	cl03639	PsaD	PsaD. photosystem I reaction center subunit II; Provisional	0
413769	cl03646	SRAP	SOS response associated peptidase (SRAP). hypothetical protein; Provisional	0
413770	cl03649	HemD	N/A. This family consists of uroporphyrinogen-III synthase HemD EC:4.2.1.75 also known as Hydroxymethylbilane hydrolyase (cyclizing) from eukaryotes, bacteria and archaea. This enzyme catalyzes the reaction: Hydroxymethylbilane <=> uroporphyrinogen-III + H(2)O. Some members of this family are multi-functional proteins possessing other enzyme activities related to porphyrin biosynthesis, such as HemD with pfam00590, however the aligned region corresponds with the uroporphyrinogen-III synthase EC:4.2.1.75 activity only. Uroporphyrinogen-III synthase is the fourth enzyme in the heme pathway. Mutant forms of the Uroporphyrinogen-III synthase gene cause congenital erythropoietic porphyria in humans a recessive inborn error of metabolism also known as Gunther disease.	0
413771	cl03651	PsaL	Photosystem I reaction centre subunit XI. photosystem I reaction center protein subunit XI; Provisional	0
413772	cl03656	PS_Dcarbxylase	Phosphatidylserine decarboxylase. Phosphatidylserine decarboxylase is synthesized as a single chain precursor. Generation of the pyruvoyl active site from a Ser is coupled to cleavage of a Gly-Ser bond between the larger (beta) and smaller (alpha) chains. It is an integral membrane protein. A closely related family, possibly also active as phosphatidylserine decarboxylase, falls under model TIGR00164. [Fatty acid and phospholipid metabolism, Biosynthesis]	0
413785	cl03715	Mago_nashi	Mago nashi proteins, integral members of the exon junction complex. This family was originally identified in Drosophila and called mago nashi, it is a strict maternal effect, grandchildless-like, gene. The human homolog has been shown to interact with an RNA binding protein. An RNAi knockout of the C. elegans homolog causes masculinization of the germ line (Mog phenotype) hermaphrodites, suggesting it is involved in hermaphrodite germ-line sex determination. Mago nashi has been found to be part of the exon-exon junction complex that binds 20 nucleotides upstream of exon-exon junctions.	0
413789	cl03728	Alpha_kinase	Alpha-kinase family. This family is a novel family of eukaryotic protein kinase catalytic domains, which have no detectable similarity to conventional kinases. The family contains myosin heavy chain kinases and Elongation Factor-2 kinase and a bifunctional ion channel. This family is known as the alpha-kinase family. The structure of the kinase domain revealed unexpected similarity to eukaryotic protein kinases in the catalytic core as well as to metabolic enzymes with ATP-grasp domains.	0
413793	cl03741	Glyco_hydro_20b	Glycosyl hydrolase family 20, domain 2. Alpha-glucuronidases, components of an ensemble of enzymes central to the recycling of photosynthetic biomass, remove the alpha-1,2 linked 4-O-methyl glucuronic acid from xylans. This family represents the N-terminal region of alpha-glucuronidase. The N-terminal domain forms a two-layer sandwich, each layer being formed by a beta sheet of five strands. A further two helices form part of the interface with the central, catalytic, module (pfam07488).	0
413798	cl03749	STAT_int	STAT protein, protein interaction domain. STAT proteins (Signal Transducers and Activators of Transcription) are a family of transcription factors that are specifically activated to regulate gene transcription when cells encounter cytokines and growth factors. STAT proteins also include an SH2 domain.	0
413801	cl03758	SRP54_N	SRP54-type protein, helical bundle domain. This entry represents the N-terminal helical bundle domain of the 54 kDa SRP54 component, a GTP-binding protein that interacts with the signal sequence when it emerges from the ribosome. SRP54 of the signal recognition particle has a three-domain structure: an N-terminal helical bundle domain, a GTPase domain, and the M-domain that binds the 7s RNA and also binds the signal sequence. The extreme C-terminal region is glycine-rich and lower in complexity and poorly conserved between species.	0
413802	cl03759	Alpha_adaptinC2	Adaptin C-terminal domain. Adaptins are components of the adaptor complexes which link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. Gamma-adaptin is a subunit of the golgi adaptor. Alpha adaptin is a heterotetramer that regulates clathrin-bud formation. The carboxyl-terminal appendage of the alpha subunit regulates translocation of endocytic accessory proteins to the bud site. This Ig-fold domain is found in alpha, beta and gamma adaptins and consists of a beta-sandwich containing 7 strands in 2 beta-sheets in a greek-key topology. The adaptor appendage contains an additional N-terminal strand.	0
413803	cl03763	CaMBD	Calmodulin binding domain. Small-conductance Ca2+-activated K+ channels (SK channels) are independent of voltage and gated solely by intracellular Ca2+. These membrane channels are heteromeric complexes that comprise pore-forming alpha-subunits and the Ca2+-binding protein calmodulin (CaM). CaM binds to the SK channel through the CaM-binding domain (CaMBD), which is located in an intracellular region of the alpha-subunit immediately carboxy-terminal to the pore. Channel opening is triggered when Ca2+ binds the EF hands in the N-lobe of CaM. The structure of this domain complexed with CaM is known. This domain forms an elongated dimer with a CaM molecule bound at each end; each CaM wraps around three alpha-helices, two from one CaMBD subunit and one from the other.	0
413808	cl03779	Enterotoxin_a	Heat-labile enterotoxin alpha chain. 	0
413817	cl03803	BAF	Barrier to autointegration factor. Barrier-to-autointegration factor (BAF) is an essential protein that is highly conserved in metazoan evolution, and which may act as a DNA-bridging protein. BAF binds directly to double-stranded DNA, to transcription activators, and to inner nuclear membrane proteins, including lamin A filament proteins that anchor nuclear-pore complexes in place, and nuclear LEM-domain proteins that bind to laminins filaments and chromatin. New findings suggest that BAF has structural roles in nuclear assembly and chromatin organization, represses gene expression and might interlink chromatin structure, nuclear architecture and gene regulation in metazoans. BAF can be exploited by retroviruses to act as a host component of pre-integration complexes, which promote the integration of the retroviral DNA into the host chromosome by preventing autointegration of retroviral DNA. BAF might contribute to the assembly or activity of retroviral pre-integration complexes through direct binding to the retroviral proteins p55 Gag and matrix, as well as to DNA.	0
413821	cl03812	Me-amine-dh_L	Methylamine dehydrogenase, L chain. This family consists of the light chain of methylamine dehydrogenase, a periplasmic enzyme. This subunit contains a tryptophan tryptophylquinone (TTQ) prosthetic group derived from Trp-114 and Trp-165 of the precursor, numbered according to the sequence from Paracoccus denitrificans. The enzyme forms a complex with the type I blue copper protein amicyanin and cytochrome. Electron transfer proceeds from TTQ to the copper and then to the heme group of the cytochrome. [Energy metabolism, Amino acids and amines]	0
413824	cl03816	FokI_N	N-terminal DNA recognition domain of restriction endonuclease FokI and similar proteins. Restriction endonuclease FokI (EC3.1.21.4), also called R.FokI, or endonuclease FokI, is a type IIS restriction enzyme that requires only divalent metals (such as Mg2+ or Mn2+) as cofactors to catalyze the hydrolysis of DNA. FokI recognizes the double-stranded sequence 5'-GGATG-3'/3'-CATCC-5' and cleaves 14 bases after G-1 and 13 bases before C-1, respectively. It contains an N-terminal DNA recognition domain and a C-terminal endonuclease domain. This model describes the DNA recognition domain. The family also includes endonuclease StsI, a type IIS restriction endonuclease found in Streptococcus sanguinis 54. It recognizes the same sequence as FokI but cleaves at different positions.	0
413828	cl03831	HlyIII	Haemolysin-III related. This family includes proteins from pathogenic and non-pathogenic bacteria, Homo sapiens and Drosophila. In Bacillus cereus, a pathogen, it has been shown to function as a channel-forming cytolysin. The human protein is expressed preferentially in mature macrophages, consistent with a cytolytic role.	0
413829	cl03835	RABV_P-protein-C_like	C-terminal domain of Rabies virus phosphoprotein and related proteins. This family includes the M1 phosphoprotein non-structural RNA polymerase alpha subunit, which is thought to be a component of the active polymerase, and may be involved in template binding.	0
413833	cl03849	PSS	Phosphatidyl serine synthase. CDP-diacylglycerol-serine O-phosphatidyltransferase	0
413836	cl03855	CemA	CemA family. proton extrusion protein PcxA; Provisional	0
383496	cl03860	ComC	COMC family. Members of this family are BlpC, a peptide pheromone that stimulates production of BLP (bacteriocin-like peptides) family class II bacteriocins. BlpC peptides fall within the broader family of PF03047, a homology family of pheromone/bacteriocin precursors that is also restricted to Streptococcus. The PF03047 HMM runs only a few residues past the GlyGly precursor peptide cleavage site, and thus does not distinguish BlpC from other pheromone precursors, such as ComC.	0
413838	cl03870	NPL	Nucleoplasmin-like domain. Nucleoplasmins are also known as chromatin decondensation proteins. They bind to core histones and transfer DNA to them in a reaction that requires ATP. This is thought to play a role in the assembly of regular nucleosomal arrays.	0
413845	cl03888	PTPA	N/A. Phosphotyrosyl phosphatase activator (PTPA) proteins stimulate the phosphotyrosyl phosphatase (PTPase) activity of the dimeric form of protein phosphatase 2A (PP2A). PTPase activity in PP2A (in vitro) is relatively low when compared to the better recognized phosphoserine/threonine protein phosphorylase activity. The specific biological role of PTPA is unknown. Basal expression of PTPA depends on the activity of a ubiquitous transcription factor, Yin Yang 1 (YY1). The tumor suppressor protein p53 can inhibit PTPA expression through an unknown mechanism that negatively controls YY1.	0
413846	cl03892	WRKY	WRKY DNA -binding domain. The WRKY domain is a DNA binding domain found in one or two copies in a superfamily of plant transcription factors. These transcription factors are involved in the regulation of various physiological programs that are unique to plants, including pathogen defense, senescence and trichome development. The domain is a 60 amino acid region that is defined by the conserved amino acid sequence WRKYGQK at its N-terminal end, together with a novel zinc-finger-like motif. It binds specifically to the DNA sequence motif (T)(T)TGAC(C/T), which is known as the W box. The invariant TGAC core is essential for function and WRKY binding.	0
413851	cl03904	CAT_RBD	CAT RNA binding domain. This RNA binding domain is found at the amino terminus of transcriptional antitermination proteins such as BglG, SacY and LicT. These proteins control the expression of sugar metabolising operons in Gram+ and Gram- bacteria. This domain has been called the CAT (Co-AntiTerminator) domain. It binds as a dimer to a short Ribonucleotidic Anti-Terminator (RAT) hairpin, each monomer interacting symmetrically with both strands of the RAT hairpin. In the full-length protein, CAT is followed by two phosphorylatable PTS regulation domains that modulate the RNA binding activity of CAT. Upon activation, the dimeric proteins bind to RAT targets in the nascent mRNA, thereby preventing abortive dissociation of the RNA polymerase from the DNA template.	0
413852	cl03905	EXS	EXS family. We have named this region the EXS family after (ERD1, XPR1, and SYG1). This family includes C-terminus portions from the SYG1 G-protein associated signal transduction protein from Saccharomyces cerevisiae, and sequences that are thought to be murine leukaemia virus (MLV) receptors (XPR1). N-terminus portions from these proteins are aligned in the SPX pfam03105 family. The previously noted similarity between SYG1 and MLV receptors over their whole sequences is thus borne out in pfam03105 and this family. While the N-termini aligned in pfam03105 are thought to be involved in signal transduction, the role of the C-terminus sequences aligned in this family is not known. This region of similarity contains several predicted transmembrane helices. This family also includes the ERD1 (ERD: ER retention defective) yeast proteins. ERD1 proteins are involved in the localization of endogenous endoplasmic reticulum (ER) proteins. erd1 null mutants secrete such proteins even though they possess the C-terminal HDEL ER lumen localization label sequence. In addition, null mutants also exhibit defects in the Golgi-dependent processing of several glycoproteins, which led to the suggestion that the sorting of luminal ER proteins actually occurs in the Golgi, with subsequent return of these proteins to the ER via `salvage' vesicles.	0
413853	cl03906	GAT_SF	GAT domain found in eukaryotic GGAs, metazoan Tom1-like proteins, metazoan STAMs, fungal Vps27, and similar proteins. STAM-2 is a Hrs-binding protein involved in intracellular signal transduction mediated by cytokines and growth factors. STAM-2 is a component of the ESCRT-0 complex that binds ubiquitin and acts as sorting machinery that recognizes ubiquitinated receptors and transfers them for further sequential lysosomal sorting/trafficking processes. Members of this family contain a non-canonical GAT (GGA and Tom1) domain consisting of two helices. A canonical GAT domain is a monomeric three-helix bundle that binds to ubiquitin. STAM-2, together with another GAT domain-containing protein Hrs, forms a Hrs/STAM2 core complex that consists of two intertwined GAT domains, each consisting of two helices from one subunit, and one from the other subunit. The two GAT domains are connected by a two-stranded coiled-coil. The Hrs/STAM2 complex, an intertwined GAT heterodimer, is a scaffold for binding of ubiquitinated cargo proteins and coordinating ubiquitination and deubiquitination reactions that regulate sorting.	0
413854	cl03910	AnfG_VnfG	Vanadium/alternative nitrogenase delta subunit. Nitrogenase is the enzyme of biological nitrogen fixation. The most wide-spread and most efficient nitrogenase contains a molybdenum cofactor. This protein family, VnfG, represents the delta subunit of the V-containing (vanadium) alternative nitrogenase. It is homologous to AnfG, the delta subunit of the Fe-only nitrogenase. [Central intermediary metabolism, Nitrogen fixation]	0
413859	cl03918	CHB_HEX	Putative carbohydrate binding domain. This domain represents the N terminal domain in chitobiases and beta-hexosaminidases EC:3.2.1.52. It is composed of a beta sandwich structure that is similar in structure to the cellulose binding domain of cellulase from Cellulomonas fimi. This suggests that this may be a carbohydrate binding domain.	0
413860	cl03922	V-ATPase_G	Vacuolar (H+)-ATPase G subunit. This model describes the vacuolar ATP synthase G subunit in eukaryotes and includes members from diverse groups e.g., fungi, plants, parasites etc. V-ATPases are multi-subunit enzymes composed of two functional domains: A transmembrane Vo domain and a peripheral catalytic domain V1. The G subunit is one of the subunits of the catalytic domain. V-ATPases are responsible for the acidification of endosomes and lysosomes, which are part of the central vacuolar system. [Energy metabolism, ATP-proton motive force interconversion]	0
413861	cl03923	BURP	BURP domain. It was named after the proteins in which it was first identified: the BNM2 clone-derived protein from Brassica napus; USPs and USP-like proteins; RD22 from Arabidopsis thaliana; and PG1beta from Lycopersicon esculentum. This domain is around 230 amino acid residues long. It possesses the following conserved features: two phenylalanine residues at its N-terminus; two cysteine residues; and four repeated cysteine-histidine motifs, arranged as: CH-X(10)-CH-X(25-27)-CH-X(25-26)-CH, where X can be any amino acid. The function of this domain is unknown.	0
413866	cl03934	MerC	MerC mercury resistance protein. 	0
413867	cl03935	NifW	Nitrogen fixation protein NifW. Nitrogenase is a complex metalloenzyme composed of two proteins designated the Fe-protein and the MoFe-protein. Apart from these two proteins, a number of accessory proteins are essential for the maturation and assembly of nitrogenase. Even though experimental evidence suggests that these accessory proteins are required for nitrogenase activity, the exact roles played by many of these proteins in the functions of nitrogenase are unclear. Using yeast two-hybrid screening it has been shown that NifW can interact with itself as well as NifZ.	0
413868	cl03936	MEV_P-protein-C_like	C-terminal domain of Measles virus phosphoprotein and related proteins. Paramyxoviridae P genes are able to generate more than one product, using alternative reading frames and RNA editing. The P gene encodes the structural phosphoprotein P. In addition, it encodes several non-structural proteins present in the infected cell but not in the virus particle. This family includes phosphoprotein P and the non-structural phosphoprotein V from different paramyxoviruses. Phosphoprotein P is essential for the activity of the RNA polymerase complex which it forms with another subunit, L pfam00946. Although all the catalytic activities of the polymerase are associated with the L subunit, its function requires specific interactions with phosphoprotein P. The P and V phosphoproteins are amino co-terminal, but diverge at their C-termini. This difference is generated by an RNA-editing mechanism in which one or two non-templated G residues are inserted into P-gene-derived mRNA. In measles virus and Sendai virus, one G residue is inserted and the edited transcript encodes the V protein. In mumps, simian virus type 5 and Newcastle disease virus, two G residues are inserted, and the edited transcript codes for the P protein. Being phosphoproteins, both P and V are rich in serine and threonine residues over their whole lengths. In addition, the V proteins are rich in cysteine residues at the C-termini. This C-terminal region of the P phosphoprotein is likely to be the nucleocapsid-binding domain, and is found to be intrinsically disordered and thus liable to induced folding.	0
413874	cl03951	CDC37_N	Cdc37 N terminal kinase binding. Cdc37 is a molecular chaperone required for the activity of numerous eukaryotic protein kinases. This domain corresponds to the N terminal domain which binds predominantly to protein kinases and is found N terminal to the Hsp (heat shock protein) 90-binding domain. Expression of a construct consisting of only the N-terminal domain of Schizosaccharomyces pombe Cdc37 results in cellular viability. This indicates that interactions with the cochaperone Hsp90 may not be essential for Cdc37 function.	0
296151	cl03953	ESAG1	ESAG protein. expression site-associated gene (ESAG); Provisional	0
383536	cl03956	PSI_PsaH	Photosystem I reaction centre subunit VI. photosystem I reaction centre subunit VI; Provisional	0
413879	cl03973	DUF269	Protein of unknown function, DUF269. Members of this protein family, called DUF269 by pfam03270, are strictly limited to nitrogen-fixing species, although not universal among them. The gene typically is found next to the nifX gene (see TIGRFAMs model TIGR02663). [Central intermediary metabolism, Nitrogen fixation]	0
296175	cl03994	Trp_dioxygenase	Tryptophan 2,3-dioxygenase. Members of this family are tryptophan 2,3-dioxygenase, as confirmed by several experimental characterizations, and by conserved operon structure for many of the other members. This enzyme represents the first of a two-step degradation to L-kynurenine, and a three-step pathway (via kynurenine) to anthranilate plus alanine. [Energy metabolism, Amino acids and amines]	0
413887	cl04000	Cornichon	Cornichon protein. predicted protein; Provisional	0
383563	cl04057	DUF1256	Protein of unknown function (DUF1256). This model describes a tetrameric protease that makes the rate-limiting first cut in the small, acid-soluble spore proteins (SASP) of Bacillus subtilis and related species. The enzyme lacks clear homology to other known proteases. It processes its own amino end before becoming active to cleave SASPs. [Protein fate, Degradation of proteins, peptides, and glycopeptides, Cellular processes, Sporulation and germination]	0
383566	cl04069	EspA	EspA-like secreted protein. EspA is the prototypical member of this family. EspA, together with EspB, EspD and Tir are exported by a type III secretion system. These proteins are essential for attaching and effacing lesion formation. EspA is a structural protein and a major component of a large, transiently expressed, filamentous surface organelle which forms a direct link between the bacterium and the host cell.	0
413906	cl04084	dDENN	dDENN domain. The dDENN domain is part of the tripartite DENN domain. It is always found downstream of the DENN domain itself, which is found in a variety of signalling proteins involved in Rab-mediated processes or regulation of MAPKs signalling pathways. The DENN domain is always encircled on both sides by more divergent domains, called uDENN (for upstream DENN) and dDENN (for downstream DENN). The function of the DENN domain remains to date unclear, although it appears to represent a good candidate for a GTP/GDP exchange activity.	0
413907	cl04085	uDENN	uDENN domain. The uDENN domain is part of the tripartite DENN domain. It is always found upstream of the DENN domain itself, which is found in a variety of signalling proteins involved in Rab-mediated processes or regulation of MAPKs signalling pathways. The DENN domain is always encircled on both sides by more divergent domains, called uDENN (for upstream DENN) and dDENN (for downstream DENN). The function of the DENN domain remains to date unclear, although it appears to represent a good candidate for a GTP/GDP exchange activity.	0
413909	cl04088	TRCF	TRCF domain. A lesion in the template strand blocks the RNA polymerase complex (RNAP). The RNAP-DNA-RNA complex is specifically recognised by the transcription-repair-coupling factor (TRCF) which releases RNAP and the truncated transcript.	0
413914	cl04104	Arg_tRNA_synt_N	Arginyl tRNA synthetase N terminal domain. This domain is found at the amino terminus of Arginyl tRNA synthetase, also called additional domain 1 (Add-1). It is about 140 residues long and it has been suggested that this domain will be involved in tRNA recognition.	0
413916	cl04109	Methyltransf_7	SAM dependent carboxyl methyltransferase. indole-3-acetate carboxyl methyltransferase	0
413918	cl04114	Channel_Tsx	Nucleoside-specific channel-forming protein, Tsx. This family of proteins is functionally uncharacterized. This family is found in various Bacteroides species. Proteins in this family are around 235 amino acids in length.	0
413923	cl04129	Invas_SpaK	Invasion protein B family. type III secretion system chaperone SpaK; Provisional	0
413931	cl04142	VRP3	Salmonella virulence-associated 28kDa protein. 	0
413933	cl04145	Peptidase_C58	Yersinia/Haemophilus virulence surface antigen. The model represents a cysteine protease domain found in proteins of bacteria that include plant pathogens (Pseudomonas syringae), root nodule bacteria, and intracellular pathogens (e.g. Yersinia pestis, Haemophilus ducreyi, Pasteurella multocida, Chlamydia trachomatis) of animal hosts. The domain features a catalytic triad of Cys, His, and Asp. Sequences can be extremely divergent outside of a few well-conserved motifs, and additional members may exist that are detected by this model. YopT, a virulence effector protein of Yersinia pestis, cleaves and releases host cell Rho GTPases from the membrane, thereby disrupting the actin cytoskeleton. Members of the family from pathogenic bacteria are likely to be pathogenesis factors. [Cellular processes, Pathogenesis]	0
413940	cl04176	TDT	The Tellurite-resistance/Dicarboxylate Transporter (TDT) family. This family of transporters has ten alpha helical transmembrane segments. The structure of a bacterial homolog of SLAC1 shows it to have a trimeric arrangement. The pore is composed of five helices with a conserved Phe residue involved in gating. One homolog, Mae1 from the yeast Schizosaccharomyces pombe, functions as a malate uptake transporter; another, Ssu1 from Saccharomyces cerevisiae and other fungi including Aspergillus fumigatus, is characterized as a sulfite efflux pump; and TehA from Escherichia coli is identified as a tellurite resistance protein by virtue of its association in the tehA/tehB operon. In plants, this family is found in the stomatal guard cells functioning as an anion-transporting pore. Many homologs are incorrectly annotated as tellurite resistance or dicarboxylate transporter (TDT) proteins.	0
413956	cl04214	UPF0180	Uncharacterized protein family (UPF0180). hypothetical protein; Provisional	0
413958	cl04219	LPG_synthase_TM	Lysylphosphatidylglycerol synthase TM region. This family of hydrophobic proteins is observed in two distinct contexts. It is primarily found in the presence of genes for the biosynthesis and elaboration of hopene where we assign the gene symbol HpnL. In a subset of the genomes containing HpnL a second, often plasmid-encoded, homolog is observed in a context implying the biosynthesis of 2-aminoethylphosphonate head-group containing lipids.	0
413960	cl04227	CBM41_pullulanase	Family 41 Carbohydrate-Binding Module from pullulanase-like enzymes. Domain is found in pullulanase (carbohydrate de-branching) proteins. It is found at either the N- or the C-terminus of the alpha-amylase active site region. This domain contains several conserved aromatic residues that are suggestive of a carbohydrate binding function.	0
413978	cl04270	Glyco_transf_WecG_TagA	N/A. putative UDP-N-acetyl-D-mannosaminuronic acid transferase; Provisional	0
413979	cl04271	IBN_N	Importin-beta N-terminal domain. Members of the importin-beta (karyopherin-beta) family can bind and transport cargo by themselves, or can form heterodimers with importin-alpha. As part of a heterodimer, importin-beta mediates interactions with the pore complex, while importin-alpha acts as an adaptor protein to bind the nuclear localisation signal (NLS) on the cargo through the classical NLS import of proteins. Importin-beta is a helicoidal molecule constructed from 19 HEAT repeats. Many nuclear pore proteins contain FG sequence repeats that can bind to HEAT repeats within importins, which is important for importin-beta mediated transport.	0
413980	cl04273	MadL	Malonate transporter MadL subunit. The MSS family includes the monobasic malonate:Na+ symporter of Malonomonas rubra. It consists of two integral membrane proteins, MadL and MadM. The transporter is believed to catalyze the electroneutral reversible uptake of H+-malonate with one Na+, and both subunits have been shown to be essential for activity. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	0
413981	cl04274	MadM	Malonate/sodium symporter MadM subunit. The MSS family includes the monobasic malonate:Na+ symporter of Malonomonas rubra. It consists of two integral membrane proteins, MadL and MadM. The transporter is believed to catalyze the electroneutral reversible uptake of H+-malonate with one Na+, and both subunits have been shown to be essential for activity. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	0
413982	cl04275	Mtc	Tricarboxylate carrier. The MTC family consists of a limited number of homologues, all from eukaryotes. A single member of the family has been functionally characterized, the tricarboxylate carrier from rat liver mitochondria. The rat liver mitochondrial tricarboxylate carrier has been reported to transport citrate, cis-aconitate, threo-D-isocitrate, D- and L-tartrate, malate, succinate and phosphoenolpyruvate. It presumably functions by a proton symport mechanism. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	0
413983	cl04276	Mtp	Golgi 4-transmembrane spanning transporter. The proteins of the MET family have 4 TMS regions and are located in late endosomal or lysosomal membranes. Substrates of the mouse MTP transporter include thymidine, both nucleoside and nucleobase analogues, antibiotics, anthracyclines, ionophores and steroid hormones. MET transporters may be involved in the subcellular compartmentation of steroid hormones and other compounds. Drug sensitivity by mouse MET was regulated by compounds that inhibit lysosomal function, interface with intracellular cholesterol transport, or modulate the multidrug resistance phenotype of mammalian cells. Thus, MET family members may compartmentalize diverse hydrophobic molecules, thereby affecting cellular drug sensitivity, nucleoside/nucleobase availability and steroid hormone responses. [Transport and binding proteins, Unknown substrate]	0
413987	cl04283	Rad10	Binding domain of DNA repair protein Ercc1 (rad10/Swi10). All proteins in this family for which functions are known are components in a multiprotein endonuclease complex (usually made up of Rad1 and Rad10 homologs). This complex is used primarily for nucleotide excision repair but also for some aspects of recombination repair. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	0
413988	cl04285	RecT	RecT family. This model represents the phage recombination protein Bet from a number of phage, including phage lambda. All members of this family are found in phage genomes or in putative prophage regions of bacterial genomes. [Mobile and extrachromosomal element functions, Prophage functions]	0
413989	cl04289	Tfb2	Transcription factor Tfb2. All proteins in this family are part of the TFIIH complex which is involved in the initiation of transcription and nucleotide excision repair.This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	0
413991	cl04295	CG-1	CG-1 domain. The domains contain a predicted bipartite NLS and are named after a partial cDNA clone isolated from parsley encoding a sequence-specific DNA-binding protein. CG-1 domains are associated with CAMTA proteins (for CAlModulin-binding Transcription Activator) that are transcription factors containing a calmodulin-binding domain and ankyrin (ANK) motifs.	0
413992	cl04297	ANTAR	ANTAR domain. The majority of the domain consists of a coiled-coil.	0
413993	cl04298	SpoVAC_SpoVAEB	SpoVAC/SpoVAEB sporulation membrane protein. This model describes stage V sporulation protein AE, a paralog of stage V sporulation protein AC. Both are proteins found to be present in a species if and only if that species is one of the Firmicutes capable of endospore formation, as of the time of the publication of the genome of Carboxydothermus hydrogenoformans. Mutants in spoVAE have a stage V sporulation defect. [Cellular processes, Sporulation and germination]	0
413996	cl04309	RNAP_Rpb7_N_like	N/A. Rpb7 binds to Rpb4 to form a heterodimer. This complex is thought to interact with the nascent RNA strand during RNA polymerase II elongation. This family includes the homologs from RNA polymerase I and III. In RNA polymerase I, Rpa43 is at least one of the subunits contacted by the transcription factor TIF-IA. The N-terminus of Rpb7p/Rpc25p/MJ0397 has a SHS2 domain that is involved in protein-protein interaction.	0
414000	cl04326	Psb28	Psb28 protein. Members of this protein family are the Psb28 protein of photosystem II. Two different protein families, apparently without homology between them, have been designated PsbW. Cyanobacterial proteins previously designated PsbW are members of the family described here. However, while members of the plant PsbW family are not found (so far) in Cyanobacteria, members of the present family do occur in plants. We therefore support the alternative designation that has emerged for this protein family, Psb28, rather than PsbW. [Energy metabolism, Photosynthesis]	0
414008	cl04352	Borrelia_REV	Borrelia burgdorferi REV protein. This family consists of several REV proteins from Borrelia burgdorferi (Lyme disease spirochete). The function of REV is unknown, although it is known that the gene is induced during the ingestion of host blood, suggesting a role in the metabolic activation of borreliae to adapt to physiological stimuli.	0
414018	cl04375	PMEI_like	pectin methylesterase inhibitor and related proteins. This domain inhibits pectin methylesterases (PMEs) and invertases through formation of a non-covalent 1:1 complex. It has been implicated in the regulation of fruit development, carbohydrate metabolism and cell wall extension. It may also be involved in inhibiting microbial pathogen PMEs. It has been observed that it is often expressed as a large inactive preprotein. It is also found at the N-termini of PMEs predicted from DNA sequences (personal obs:C Yeats), suggesting that both PMEs and their inhibitor are expressed as a single polyprotein and subsequently processed. It has two disulphide bridges and is mainly alpha-helical.	0
414028	cl04394	BRICHOS	BRICHOS domain. Its exact function is unknown; roles that have been proposed for the domain, which is about 100 amino acids long, include (a) targeting of the protein to the secretory pathway, (b) intramolecular chaperone-like function, and (c) assisting the specialised intracellular protease processing system. This C-terminal domain is embedded in the endoplasmic reticulum lumen, and binds to the N-terminal, transmembrane, SP_C, pfam08999 provided that it is in non-helical conformation. Thus the Brichos domain of proSP-C is a chaperone that induces alpha-helix formation of an aggregation-prone TM region.	0
414036	cl04407	Dopey_N	Dopey, N-terminal. DopA is the founding member of the Dopey family and is required for correct cell morphology and spatiotemporal organisation of multicellular structures in the filamentous fungus Aspergillus nidulans. DopA homologs are found in mammals. S. cerevisiae DOP1 is essential for viability and affects cellular morphogenesis.	0
414057	cl04451	EIIC-GAT	PTS system sugar-specific permease component. PTS system ascorbate-specific transporter subunit IIC; Reviewed	0
414059	cl04460	DUF434	Protein of unknown function (DUF434). 	0
414060	cl04466	P-mevalo_kinase	Phosphomevalonate kinase. This enzyme is part of the mevalonate pathway, one of two alternative pathways for the biosynthesis of IPP. In an example of nonorthologous gene displacement, two different types of phosphomevalonate kinase are found. One is this type, found in animals. The other is the ERG8 type, found in plants and fungi (TIGR01219) and in Gram-positive bacteria (TIGR01220). [Central intermediary metabolism, Other]	0
414061	cl04467	DUF443	Protein of unknown function (DUF443). Members of this family of proteins, with average length of 210, have no invariant residues but five predicted transmembrane segments. Strangely, most members occur in groups of consecutive paralogous genes. A striking example is a set of eleven encoded consecutively, head-to-tail, in Staphylococcus aureus strain COL.	0
414062	cl04468	Tic22	Tic22-like family. Two families of proteins are involved in the chloroplast envelope import apparatus. They are the three proteins of the outer membrane (TOC) and four proteins in the inner membrane (TIC). This family is specific for the Tic22 protein. [Transport and binding proteins, Amino acids, peptides and amines]	0
414067	cl04498	LytTR	LytTr DNA-binding domain. This domain is found in a variety of bacterial transcriptional regulators. The domain binds to a specific DNA sequence pattern.	0
414081	cl04524	Fimbrial_CS1	CS1 type fimbrial major subunit. Fimbriae, also known as pili, form filaments radiating from the surface of the bacterium to a length of 0.5-1.5 micrometres. They enable the cell to colonise host epithelia. This family constitutes the major subunits of CS1 like pili, including CS2 and CFA1 from Escherichia coli, and also the Cable type II pilin major subunit from Burkholderia cepacia. The major subunit of CS1 pili is called CooA. Periplasmic CooA is mostly complexed with the assembly protein CooB. In addition, a small pool of CooA multimers, and CooA-CooD complexes exists, but the functional significance is unknown. A member of this family has also been identified in Salmonella typhi and Salmonella enterica.	0
414087	cl04545	Phage_rep_O	Bacteriophage replication protein O. This model represents the N-terminal region of the phage lambda replication protein O and homologous regions of other phage proteins. [DNA metabolism, DNA replication, recombination, and repair, Mobile and extrachromosomal element functions, Prophage functions]	0
383772	cl04571	MARVEL	Membrane-associating domain. This family of plant proteins contains a domain that may have a catalytic activity. It has a conserved arginine and aspartate that could form an active site. These proteins are predicted to contain 3 or 4 transmembrane helices.	0
414110	cl04601	SPC22	Signal peptidase subunit. signal peptidase; Provisional	0
414119	cl04635	F1-ATPase_epsilon	eukaryotic mitochondrial ATP synthase epsilon subunit. This family constitutes the mitochondrial ATP synthase epsilon subunit. This is not to be confused with the bacterial epsilon subunit, which is homologous to the mitochondrial delta subunit (pfam00401 and pfam02823). The epsilon subunit is located in the extrinsic membrane section F1, which is the catalytic site of ATP synthesis. The epsilon subunit was not well ordered in the crystal structure of bovine F1, but it is known to be located in the stalk region of F1. E subunit is thought to be involved in the regulation of ATP synthase, since a null mutation increased oligomycin sensitivity and decreased inhibition by inhibitor protein IF1.	0
414121	cl04640	DUF600	Protein of unknown function, DUF600. This model represents a tandem array of 10 proteins in Staphylococcus aureus and the C-terminal region of one protein each in Bacillus subtilis and Bacillus halodurans.	0
414123	cl04653	TAF7	TATA Binding Protein (TBP) Associated Factor 7 (TAF7) is one of several TAFs that bind TBP and is involved in forming Transcription Factor IID (TFIID) complex. The general transcription factor, TFIID, consists of the TATA-binding protein (TBP) associated with a series of TBP-associated factors (TAFs) that together participate in the assembly of the transcription preinitiation complex. TAFII55 binds to TAFII250 and inhibits its acetyltransferase activity. The exact role of TAFII55 is currently unknown. The conserved region is situated towards the N-terminus of the protein.	0
414126	cl04660	PGAP4-like	Post-GPI attachment to proteins factor 4 and similar proteins. Post-GPI attachment to proteins factor 4 (PGAP4), also known as post-GPI attachment to proteins GalNAc transferase 4 or transmembrane protein 246 (TMEM246), has been shown to be a Golgi-resident GPI-GalNAc transferase. Many eukaryotic proteins are anchored to the cell surface through glycolipid glycosylphosphatidylinositol (GPI). GPIs have a conserved core but exhibit diverse N-acetylgalactosamine (GalNAc) modifications. PGAP4 knockout cells lose GPI-GalNAc structures. PGAP4 is most likely involved in the initial steps of GPI-GalNAc biosynthesis. In contrast to other Golgi glycotransferases (GTs), it contains three transmembrane domains. Structural modeling suggests that PGAP4 adopts a GT-A fold split by an insertion of tandem transmembrane domains.	0
383809	cl04661	Polysacc_synt_4	Polysaccharide biosynthesis. This model represents an uncharacterized domain found in both Arabidopsis thaliana (at least 10 copies) and Oryza sativa. Most member proteins have only a short stretch of sequence N-terminal to this domain, but one has a long N-terminal extension that includes a protein kinase domain (pfam00069).	0
414146	cl04704	PDDEXK_6	PDDEXK-like family of unknown function. This model represents a domain found toward the C-terminus of a number of uncharacterized plant proteins. The domain is strongly conserved (greater than 30 % sequence identity between most pairs of members) but flanked by highly divergent regions including stretches of low-complexity sequence.	0
383830	cl04705	GRDA	Glycine reductase complex selenoprotein A. putative glycine/sarcosine/betaine reductase complex protein A; Provisional	0
322775	cl04707	PsbR	Photosystem II 10 kDa polypeptide PsbR. photosystem II subunit R; Provisional	0
414149	cl04722	PLAC8	PLAC8 family. This model describes an uncharacterized domain of about 100 residues. It is common in plants but found also in Homo sapiens, Dictyostelium, and Leishmania; at least 12 distinct members are found in Arabidopsis. Most members of this family contain more than 10 per cent Cys, but no Cys residue is invariant across the family.	0
414152	cl04729	DUF617	Protein of unknown function, DUF617. This model represents a region of about 170 amino acids found at the C-terminus of a family of plant proteins. These proteins typically have additional highly divergent N-terminal regions rich in low complexity sequence. PSI-BLAST reveals no clear similarity to any characterized protein. At least 12 distinct members are found in Arabidopsis thaliana.	0
414155	cl04737	ZF-HD_dimer	ZF-HD protein dimerization region. This model describes a 54-residue domain found in the N-terminal region of plant proteins, the vast majority of which contain a ZF-HD class homeobox domain toward the C-terminus. The region between the two domains typically is rich in low complexity sequence. The companion ZF-HD homeobox domain is described in model TIGR01565.	0
414182	cl04793	PSRP-3_Ycf65	Plastid and cyanobacterial ribosomal protein (PSRP-3 / Ycf65). putative ribosomal protein 3; Validated	0
414184	cl04796	Ovate	Transcriptional repressor, ovate. This model describes an uncharacterized domain of about 70 residues found exclusively in plants, generally toward the C-terminus of proteins of 200 to 350 amino acids in length. At least 14 such proteins are found in Arabidopsis thaliana. Other regions of these proteins tend to consist largely of low-complexity sequence.	0
414191	cl04813	EIN3	Ethylene insensitive 3. ETHYLENE-INSENSITIVE3-like3 protein; Provisional	0
414195	cl04829	pMMO-AMO_C	subunit C of particulate methane monooxygenase (pMMO, also known as membrane-bound MMO) from methanotrophic bacteria, and of ammonia monooxygenase (AMO) from ammonia-oxidizing bacteria, and related proteins. This model contains the subunit C of ammonia monooxygenase (AMO, EC 1.14.99.39) from ammonia-oxidizing archaea including Nitrososphaera viennensis gen. nov., sp. nov (also called Nitrososphaera viennensis EN76) that contains six variants (AmoC1-AmoC6) encoded by different genes. AMO catalyzes the conversion of ammonia to hydroxylamine. Nitrososphaera viennensis EN76 AMO is composed of four subunits: AmoA, AmoB, AmoX, and one of six variants of AmoC. The AMO subunit C belongs to a family which also includes subunit C of particulate methane monooxygenase (pMMO, also known as membrane-bound MMO, EC 1.14.18.3) from methanotrophic bacteria, and AMO from ammonia-oxidizing bacteria, which are not included in this model. Compared to its bacterial counterpart, archaeal AMO C subunit is significantly shorter at the N-terminal end.	0
414197	cl04831	MbeD_MobD	MbeD/MobD like. MbeD, as found in the ColE1 plasmid, was originally described as a plasmid mobilization protein. Later, it was shown that MbeD additionally was responsible for a plasmid entry exclusion phenotype that had previously been ascribed to products of the exc1 and exc2 genes.	0
414207	cl04850	Wzy_C	O-Antigen ligase. This family of proteins is suggested to transport inorganic carbon (HCO3-), based on the phenotype of a mutant of IctB in Synechococcus sp. strain PCC 7942. Bicarbonate uptake is used by many photosynthetic organisms including cyanobacteria. These organisms are able to concentrate CO2/HCO3- against a greater than ten-fold concentration gradient. Cyanobacteria may have several such carriers operating with different efficiencies. Note that homology to various O-antigen ligases, with possible implications for mutant cell envelope structure, might allow alternatives to the interpretation of IctB as a bicarbonate transport protein. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	0
414210	cl04855	BLUF	Sensors of blue-light using FAD. The BLUF domain has been shown to bind FAD in the AppA protein. AppA is involved in the repression of photosynthesis genes in response to blue-light.	0
414213	cl04868	Phage_holin_2_1	Bacteriophage P21 holin S. Phage_holin_2_4 is a family of small hydrophobic phage proteins called holins with one transmembrane domain. Holins are produced by double-stranded DNA bacteriophages that use an endolysin-holin strategy to achieve lysis of their hosts. The endolysins are peptidoglycan-degrading enzymes that are usually accumulated in the cytosol until access to the cell wall substrate is provided by the holin membrane lesion.	0
414231	cl04902	Agouti	Agouti protein. The agouti protein regulates pigmentation in the mouse hair follicle producing a black hair with a subapical yellow band. A highly homologous protein agouti signal protein (ASIP) is present in humans and is expressed at highest levels in adipose tissue where it may play a role in energy homeostasis and possibly human pigmentation.	0
383919	cl04907	L51_S25_CI-B8	Mitochondrial ribosomal protein L51 / S25 / CI-B8 domain. Proteins containing this domain are located in the mitochondrion and include ribosomal proteins L51 and S25. This domain is also found in mitochondrial NADH-ubiquinone oxidoreductase B8 subunit (CI-B8). It is not known whether all members of this family form part of the NADH-ubiquinone oxidoreductase and whether they are also all ribosomal proteins.	0
414244	cl04947	Phage_cap_P2	Phage major capsid protein, P2 family. This model family represents the major capsid protein component of the heads (capsids) of bacteriophage P2 and related phage. This model represents one of several analogous families lacking detectable sequence similarity. The gene encoding this component is typically located in an operon encoding the small and large terminase subunits, the portal protein and the prohead or maturation protease. [Mobile and extrachromosomal element functions, Prophage functions]	0
414247	cl04955	LanC_like	Cyclases involved in the biosynthesis of lantibiotics, and similar proteins. This presumed domain is functionally uncharacterized. This domain family is found in bacteria and archaea, and is approximately 380 amino acids in length. The family is found in association with pfam05147. This domain may be involved in synthesis of a lantibiotic compound.	0
414249	cl04961	DSS1_Sem1	proteasome complex subunit DSS1/Sem1. This family contains the breast cancer tumor suppressor BRCA2-interacting protein DSS1 and its homolog SEM1, both of which are short acidic proteins. DSS1 has been shown to be a conserved component of the Rae1 mediated mRNA export pathway in Schizosaccharomyces pombe.	0
414262	cl04993	Spiralin	Spiralin. Spiralin is the major lipoprotein in multiple species of Spiroplasma, a relative of the Mycoplasmas.	0
414266	cl05000	CHASE3	CHASE3 domain. CHASE3 is an extracellular sensory domain, which is present in various classes of transmembrane receptors that are parts of signal transduction pathways in bacteria. Specifically, CHASE3 domains are found in histidine kinases, adenylate cyclases, methyl-accepting chemotaxis proteins and predicted diguanylate cyclases/phosphodiesterases. Environmental factors that are recognized by CHASE3 domains are not known at this time.	0
414269	cl05005	TAF4	TATA Binding Protein (TBP) Associated Factor 4 (TAF4) is one of several TAFs that bind TBP and is involved in forming Transcription Factor IID (TFIID) complex. This region of similarity is found in Transcription initiation factor TFIID component TAF4.	0
414271	cl05012	FlhD	Flagellar transcriptional activator (FlhD). This family consists of several bacterial flagellar transcriptional activator (FlhD) proteins. FlhD combines with FlhC to form a regulatory complex in E. coli; this complex has been shown to be a global regulator involved in many cellular processes as well as a flagellar transcriptional activator.	0
414275	cl05017	UPF0203	Uncharacterized protein family (UPF0203). Uncharacterized protein At4g33100; Provisional	0
414280	cl05036	FlhC	Flagellar transcriptional activator (FlhC). This family consists of several bacterial flagellar transcriptional activator (FlhC) proteins. FlhC combines with FlhD to form a regulatory complex in E. coli; this complex has been shown to be a global regulator involved in many cellular processes as well as a flagellar transcriptional activator.	0
414285	cl05060	TraE	TraE protein. TraE is a component of type IV secretion systems involved in conjugative transfer of plasmid DNA. The function of the TraE protein is unknown.	0
414292	cl05087	Complex1_LYR	Complex 1 protein (LYR family). This is a family of proteins carrying the LYR motif of family Complex1_LYR, pfam05347, likely to be involved in Fe-S cluster biogenesis in mitochondria.	0
414296	cl05094	Phage_attach	Phage Head-Tail Attachment. Members of this short family are putative ATP-binding sugar transporter-like proteins.	0
414308	cl05125	FliT	Flagellar protein FliT. This family contains several bacterial flagellar FliT proteins. The flagellar proteins FlgN and FliT have been proposed to act as substrate specific export chaperones, facilitating incorporation of the enterobacterial hook-associated axial proteins (HAPs) FlgK/FlgL and FliD into the growing flagellum. In Salmonella typhimurium flgN and fliT mutants, the export of target HAPs is reduced, concomitant with loss of unincorporated flagellin into the surrounding medium.	0
414309	cl05126	PqqD	Coenzyme PQQ synthesis protein D (PqqD). Members of this protein family show distant homology to PqqD, and belong to a three-gene cassette that includes the HPr kinase related protein family of TIGR04352. The roles of the cassette and of this protein are unknown.	0
414310	cl05127	DUF4779	Domain of unknown function (DUF4779). This family consists of several histidine-rich protein II and III sequences from Plasmodium falciparum.	0
414315	cl05142	GUN4	porphyrin-binding protein domain GUN4. In Arabidopsis, GUN4 is required for the functioning of the plastid mediated repression of nuclear transcription that is involved in controlling the levels of magnesium-protoporphyrin IX. GUN4 binds the product and substrate of Mg-chelatase, an enzyme that produces Mg-Proto, and activates Mg-chelatase. GUN4 is thought to participate in plastid-to-nucleus signaling by regulating magnesium-protoporphyrin IX synthesis or trafficking.	0
414328	cl05182	PsaN	Photosystem I reaction centre subunit N (PSAN or PSI-N). photosystem I reaction center subunit N; Provisional	0
414353	cl05250	Peptidase_U57	YabG peptidase U57. Members of this family are the protein YabG, demonstrated for Bacillus subtilis to be an endopeptidase able to release N-terminal peptides from a number of sporulation proteins, including CotT, CotF, and SpoIVA. It appears to be expressed under control of sigma-K. [Cellular processes, Sporulation and germination]	0
414361	cl05275	Prolamin_like	Prolamin-like. putative protein; Provisional	0
414366	cl05282	Borrelia_P13	Borrelia membrane protein P13. This family consists of P13 proteins from Borrelia species. P13 is a 13kDa integral membrane protein which is post-translationally processed at both ends and modified by an unknown mechanism.	0
414371	cl05296	DUF802	Domain of unknown function (DUF802). Proteins of this subfamily are putative H+ channel proteins, but it has been reported that they are also involved in anti-phage defense.	0
414374	cl05301	MLKL_NTD	N-terminal domain of mixed lineage kinase domain-like protein (MLKL) and similar proteins. This family consists of several broad-spectrum mildew resistance proteins from Arabidopsis thaliana. Plant disease resistance (R) genes control the recognition of specific pathogens and activate subsequent defense responses. The Arabidopsis thaliana locus Resistance To Powdery Mildew 8 (RPW8) contains two naturally polymorphic, dominant R genes, RPW8.1 and RPW8.2, which individually control resistance to a broad range of powdery mildew pathogens. They induce localized, salicylic acid-dependent defenses similar to those induced by R genes that control specific resistance. Apparently, broad-spectrum resistance mediated by RPW8 uses the same mechanisms as specific resistance.	0
414376	cl05307	DUF814	Domain of unknown function (DUF814). NFACT-R RNA binding family found in bacteria fused to the ThiI domain as a variant of the canonical tRNA 4-thiouridylation pathway.	0
296991	cl05376	AfaD	Enterobacteria AfaD invasin protein. fimbrial adhesin protein SefD; Provisional	0
414398	cl05390	Tcp11	T-complex protein 11. This family consists of several eukaryotic T-complex protein 11 (Tcp11) related sequences. Tcp11 is only expressed in fertile adult mammalian testes and is thought to be important in sperm function and fertility. The family also contains the yeast Sok1 protein which is known to suppress cyclic AMP-dependent protein kinase mutants.	0
414410	cl05417	PLA2_like	N/A. This family consists of several phospholipase A2 like proteins mostly from insects.	0
414415	cl05426	YopD	YopD protein. One SctB and four SctE subunits, located at the tip of the type III secretion system (T3SS) injectosome, combine to form the translocon (translocator pore) in the membrane of targeted cells. Species-specific names for this highly variable component of T3SS include YopD, EspB, IpaC, SipC, etc.	0
414417	cl05433	ARPC4	ARP2/3 complex 20 kDa subunit (ARPC4). ARP2/3 complex subunit; Provisional	0
414418	cl05434	TraX	TraX protein. conjugal transfer protein TrbP; Provisional	0
414419	cl05436	Haemagg_act	haemagglutination activity domain. This model represents a conserved domain found near the N-terminus of a number of large, repetitive bacterial proteins, including many proteins of over 2500 amino acids. Members generally have a signal sequence, then an intervening region, then the region described by this model. Following this region, proteins typically have regions rich in repeats but may show no homology between the repeats of one member and the repeats of another. A number of the members of this family have been designated adhesins, filamentous haemagglutinins, heme/hemopexin-binding protein, etc.	0
414420	cl05442	Dam	DNA N-6-adenine-methyltransferase (Dam). This model is a fragment-mode model for a phage-borne DNA N-6-adenine-methyltransferase. [Mobile and extrachromosomal element functions, Prophage functions, DNA metabolism, Restriction/modification]	0
414424	cl05457	pyocin_knob	knob domain of R1 and R2 pyocins and similar domains. This family consists of several uncharacterized proteins from the Siphoviruses as well as one bacterial sequence. Some of the members of this family are described as putative minor structural proteins.	0
414425	cl05460	Excalibur	Excalibur calcium-binding domain. Extracellular Ca2+-dependent nuclease YokF from Bacillus subtilis and several other surface-exposed proteins from diverse bacteria are encoded in the genomes in two paralogous forms that differ by a ~45 amino acid fragment, which comprises a novel conserved domain. Sequence analysis of this domain revealed a conserved DxDxDGxxCE motif, which is strikingly similar to the Ca2+-binding loop of the calmodulin-like EF-hand domains, suggesting an evolutionary relationship between them. Functions of many of the other proteins in which the novel domain, named Excalibur (extracellular calcium-binding region), is found, as well as a structural model of its conserved motif are consistent with the notion that the Excalibur domain binds calcium. This domain is but one more example of the diversity of structural contexts surrounding the EF-hand-like calcium-binding loop in bacteria. This loop is thus more widespread than hitherto recognised and the evolution of EF-hand-like domains is probably more complex than previously appreciated.	0
414436	cl05484	VipB	Type VI secretion protein, EvpB/VC_A0108, tail sheath. Work by Mougous, et al. (2006), describes IAHP-related loci as a type VI secretion system. This protein family is associated with type VI secretion loci, although not treated explicitly by Mougous, et al. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	0
414440	cl05512	DUF903	Bacterial protein of unknown function (DUF903). Members of this family are exclusively lipoproteins of small size, including YgdI and YgdR from E. coli K-12.	0
414442	cl05524	Cir_Bir_Yir	Plasmodium variant antigen protein Cir/Yir/Bir. This model represents a large paralogous family of variant antigens from several Plasmodium species (P. yoelii, P. berghei and P. chabaudi). The seed was generated from a list of ORFs in P. yoelii containing a paralogous domain as defined by an algorithm implemented at TIFR. The list was aligned and reduced to six sequences approximating the most divergent clades present in the data set. The model only hits genes previously characterized as yir, bir, or cir genes above the trusted cutoff. In between trusted and noise is one gene from P. vivax (vir25) which has been characterized as a distant relative of the yir/bir/cir family. The vir family appears to be present in 600-1000 copies per haploid genome and is preferentially located in the sub-telomeric regions of the chromosomes. The genomic data for yoelii is consistent with this observation. It is not believed that there are any orthologs of this family in P. falciparum.	0
414443	cl05528	AlkA_N	AlkA N-terminal domain. The presence of 8-oxoguanine residues in DNA can give rise to G-C to T-A transversion mutations. This enzyme is found in archaeal, bacterial and eukaryotic species, and is specifically responsible for the process which leads to the removal of 8-oxoguanine residues. It has DNA glycosylase activity (EC:3.2.2.23) and DNA lyase activity (EC:4.2.99.18). The region featured in this family is the N-terminal domain, which is organized into a single copy of a TBP-like fold. The domain contributes residues to the 8-oxoguanine binding pocket.	0
414450	cl05556	Apyrase	Apyrase. apyrase Superfamily; Provisional	0
414461	cl05580	TraH	Conjugative relaxosome accessory transposon protein. conjugal transfer pilus assembly protein TraH; Provisional	0
414474	cl05618	tify	tify domain. Although previously known as the Zim domain this is now called the tify domain after its most conserved amino acids. TIFY proteins can be further classified into two groups depending on the presence (group I) or absence (group II) of a C2C2-GATA domain. Functional annotation of these proteins is still poor, but several screens revealed a link between TIFY proteins of group II and jasmonic acid-related stress response.	0
297128	cl05636	Phage_tail_T	Minor tail protein T. This model represents a translation of the T gene in phage lambda and related phage. A translational frameshift from the upstream gene G into the frame of T produces a minor protein gpG-T, essential in tail assembly but not found in the mature virion. [Mobile and extrachromosomal element functions, Prophage functions]	0
414486	cl05663	fn3_6	Fibronectin type-III domain. Fn3_5 is an fn3-like domain which is frequently found as the first of three on streptococcal C5a peptidase (SCP), a highly specific protease and adhesin/invasin. The family is found in conjunction with pfam00082, pfam02225 and pfam00746.	0
414489	cl05674	PET	PET (Prickle Espinas Testin) domain is involved in protein-protein interactions. This domain is suggested to be involved in protein-protein interactions. The family is found in conjunction with pfam00412.	0
414494	cl05686	P_gingi_FimA	Major fimbrial subunit protein (FimA). A family of uncharacterized proteins around 300 residues in length and found in various Bacteroides species. The function of this family is unknown.	0
384206	cl05704	Allene_ox_cyc	Allene oxide cyclase. allene oxide cyclase	0
414518	cl05741	AGTRAP	Angiotensin II, type I receptor-associated protein (AGTRAP). This family consists of several angiotensin II, type I receptor-associated protein (AGTRAP) sequences. AGTRAP is known to interact specifically with the C-terminal cytoplasmic region of the angiotensin II type 1 (AT(1)) receptor to regulate different aspects of AT(1) receptor physiology. The function of this family is unclear.	0
414520	cl05743	RAP	Receptor-associated protein (RAP). The alpha-2-macroglobulin receptor-associated protein (RAP) is an intracellular glycoprotein that binds to the alpha-2-macroglobulin receptor and other members of the low density lipoprotein receptor family. The protein inhibits binding of all currently known ligands of these receptors. The N-terminal domain is predominately alpha helical. Two different studies have provided conflicting domain boundaries.	0
414525	cl05752	HdeA	HdeA/HdeB family. HdeA (hns-dependent expression protein A) is a single domain alpha-helical protein localized in the periplasmic space. HdeA is involved in acid resistance essential for infectivity of enteric bacterial pathogens. Functional studies demonstrate that HdeA is activated by a dimer-to-monomer transition at acidic pH, leading to suppression of aggregation by acid-denatured proteins. The gene encoding HdeA was initially identified as part of an operon regulated by the nucleoid protein H-NS. This family also contains HdeB.	0
414526	cl05753	TraD	Conjugal transfer protein TraD. This family contains bacterial TraD conjugal transfer proteins. Mutations in the TraD gene result in loss of transfer.	0
414533	cl05762	SATase_N	Serine acetyltransferase, N-terminal. The N-terminal domain of serine acetyltransferase has a sequence that is conserved in plants and bacteria.	0
186667	cl05775	SEF14_adhesin	SEF14-like adhesin. fimbrial protein SefA; Provisional	0
414547	cl05797	SMC_hinge	SMC proteins Flexible Hinge Domain. This entry represents the hinge region of the SMC (Structural Maintenance of Chromosomes) family of proteins. The hinge region is responsible for formation of the DNA interacting dimer. It is also possible that the precise structure of it is an essential determinant of the specificity of the DNA-protein interaction.	0
414556	cl05813	Disulph_isomer	Disulphide isomerase. This protein family is one of several observed in species that express bacillithiol, an analog of glutathione and mycothiol. Rather than being involved in bacillithiol biosynthesis, members are likely to act in bacillithiol-dependent processes. A suggested term is bacilliredoxin (a glutaredoxin-like thiol-dependent oxidoreductase), and a suggested role of YphP is de-bacillithiolation - removing bacillithiol that became linked to protein thiols under oxidative stress. An older description of YphP as a disulphide isomerase therefore may be wrong.	0
414561	cl05827	IpaD	Invasion plasmid antigen IpaD. These proteins are found within type III secretion operons and have been shown to be secreted by that system.	0
414582	cl05878	TraK	TraK protein. This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 276 and 307 amino acids in length.	0
414583	cl05880	CRA	Circumsporozoite-related antigen (CRA). circumsporozoite-related antigen; Provisional	0
414601	cl05946	PspB	Phage shock protein B. This model describes the PspB protein of the psp (phage shock protein) operon, as found in Escherichia coli and many related species. Expression of a phage protein called secretin protein IV, and a number of other stresses including ethanol, heat shock, and defects in protein secretion trigger sigma-54-dependent expression of the phage shock regulon. PspB is both a regulator and an effector protein of the phage shock response. [Cellular processes, Adaptations to atypical conditions]	0
414606	cl05964	zf-C4_ClpX	ClpX C4-type zinc finger. The ClpX heat shock protein of Escherichia coli is a member of the universally conserved Hsp100 family of proteins, and possesses a putative zinc finger motif of the C4 type. This presumed zinc binding domain is found at the N-terminus of the ClpX protein. ClpX is an ATPase which functions both as a substrate specificity component of the ClpXP protease and as a molecular chaperone. The molecular function of this domain is now known.	0
414610	cl05973	FAM20_C_like	C-terminal putative kinase domain of FAM20 (family with sequence similarity 20), Drosophila Four-jointed (Fj), and related proteins. Fam20C represents the C-terminus of eukaryotic secreted Golgi casein kinase proteins. Fam20C is the Golgi casein kinase that phosphorylates secretory pathway proteins within Ser-x-Glu/pSer motifs. Mutations in Fam20C cause Raine syndrome, an autosomal recessive osteosclerotic bone dysplasia.	0
414643	cl06067	TraU	TraU protein. Members of this protein family are found in genomic regions associated with conjugative transfer and integrated TOL-like plasmids. The specific function is unknown. [Mobile and extrachromosomal element functions, Plasmid functions]	0
414649	cl06082	ACP	Malonate decarboxylase delta subunit (MdcD). citrate lyase subunit gamma; Provisional	0
414651	cl06088	DUF1257	Protein of unknown function (DUF1257). Ycf35; Provisional	0
297413	cl06106	Phage_TAC_2	Bacteriophage lambda tail assembly chaperone, TAC, protein G. This model describes a family of bacteriophage proteins including G of phage lambda. This protein has been described as undergoing a translational frameshift at a Gly-Lys dipeptide near the C-terminus of protein G from phage lambda, with about 4 % efficiency, to produce tail assembly protein G-T. The Lys of the Gly-Lys pair is the conserved second-to-last residue of seed alignment for this family. [Mobile and extrachromosomal element functions, Prophage functions]	0
414667	cl06123	DHR2_DOCK	Dock Homology Region 2, a GEF domain, of Dedicator of Cytokinesis proteins. This family represents a conserved region within a number of eukaryotic dedicator of cytokinesis proteins. These are potential guanine nucleotide exchange factors, which activate some small GTPases by exchanging bound GDP for free GTP. This region interacts with RAC1 and ELMO1.	0
414670	cl06143	RELM	resistin-like molecule (RELM) hormone family. This family consists of several mammalian resistin proteins. Resistin is a 12.5-kDa cysteine-rich secreted polypeptide first reported from rodent adipocytes. It belongs to a multigene family termed RELMs or FIZZ proteins. Plasma resistin levels are significantly increased in both genetically susceptible and high-fat-diet-induced obese mice. Immunoneutralisation of resistin improves hyperglycemia and insulin resistance in high-fat-diet-induced obese mice, while administration of recombinant resistin impairs glucose tolerance and insulin action in normal mice. It has been demonstrated that increases in circulating resistin levels markedly stimulate glucose production in the presence of fixed physiological insulin levels, whereas insulin suppressed resistin expression. It has been suggested that resistin could be a link between obesity and type 2 diabetes.	0
414682	cl06181	PagP	Antimicrobial peptide resistance and lipid A acylation protein PagP. This family consists of several bacterial antimicrobial peptide resistance and lipid A acylation (PagP) proteins. The bacterial outer membrane enzyme PagP transfers a palmitate chain from a phospholipid to lipid A. In a number of pathogenic Gram-negative bacteria, PagP confers resistance to certain cationic antimicrobial peptides produced during the host innate immune response.	0
414684	cl06188	Orc3	Origin recognition complex subunit 3. This family represents the N-terminus (approximately 300 residues) of subunit 3 of the eukaryotic origin recognition complex (ORC). Origin recognition complex (ORC) is composed of six subunits that are essential for cell viability. They collectively bind to the autonomously replicating sequence (ARS) in a sequence-specific manner and lead to the chromatin loading of other replication factors that are essential for initiation of DNA replication.	0
414692	cl06211	KDGP_aldolase	KDGP aldolase. Members of this family of relatively uncommon proteins are found in both Gram-positive (e.g. Enterococcus faecalis) and Gram-negative (e.g. Aeromonas hydrophila) bacteria, as part of a cluster of conserved proteins. The function is unknown. [Hypothetical proteins, Conserved]	0
414703	cl06243	PsbW	Photosystem II reaction centre W protein (PsbW). photosystem II reaction centre W protein (PsbW); Provisional	0
414716	cl06278	TraL	TraL protein. This protein is part of the type IV secretion system for conjugative plasmid transfer. The function of the TraL protein is unknown. [Cellular processes, Conjugation]	0
414734	cl06336	Commd	N/A. The leucine-rich, 70-85 amino acid long COMM domain is predicted to form a beta-sheet and an extreme C-terminal alpha-helix. The COMM domain containing proteins are about 200 residues in length and passed the C-terminal COMM domain.	0
414736	cl06338	CDI_inhibitor_EC869_like	Inhibitor of the contact-dependent growth inhibition (CDI) system of Escherichia coli EC869, and related proteins. CdiI immunity proteins function as part of the bacterial contact-dependent growth inhibition (CDI) system. CDI is mediated by the CdiB-CdiA two-partner secretion system. Each CdiA protein exhibits a distinct growth inhibition activity, which resides in the polymorphic C-terminal region (CdiA-CT). Cells with the CDI system also express a CdiI immunity protein that blocks the activity of cognate CdiA-CT, thereby protecting the cell from autoinhibition. In many CDI systems the cdiBAI genes are followed by orphan cdiA-CT/cdiI modules, suggesting that these modules are exchanged between the CDI systems of different bacteria.	0
414738	cl06345	DUF1439	Protein of unknown function (DUF1439). lipoprotein; Provisional	0
414742	cl06353	BCHF	2-vinyl bacteriochlorophyllide hydratase (BCHF). This model represents the enzyme responsible for the first step in the modification of the ring A vinyl group of chlorophyllide a which (in part) distinguishes chlorophyll from bacteriochlorophyll. This enzyme is apparently absent from cyanobacteria (which do not use bacteriochlorophyll). [Energy metabolism, Photosynthesis]	0
414755	cl06401	Amastin	Amastin surface glycoprotein. amastin surface glycoprotein; Provisional	0
414757	cl06405	Syd	Syd, a SecY-interacting protein. This family contains a number of bacterial Syd proteins approximately 180 residues long. It has been suggested that Syd is loosely associated with the cytoplasmic surface of the cytoplasmic membrane, and that interaction with SecY may be involved in this membrane association. Operon analysis showed that Syd protein may function as immunity protein in bacterial toxin systems.	0
297589	cl06408	UP_III_II	Uroplakin IIIb, IIIa and II. This family contains uroplakin II, which is approximately 180 residues long and seems to be restricted to mammals. Uroplakin II is an integral membrane protein, and is one of the components of the apical plaques of mammalian urothelium formed by the asymmetric unit membrane - this is believed to play a role in strengthening the urothelial apical surface to prevent the cells from rupturing during bladder distension.	0
414773	cl06460	CblD	CblD like pilus biogenesis initiator. This family consists of several minor pilin proteins including CblD from Burkholderia cepacia which is known to be the initiator of pilus biogenesis. The family also contains a variety of Enterobacterial minor pilin proteins.	0
414774	cl06461	YycH_N_like	N-terminal domain of YycH and structurally similar proteins conserved in Firmicutes. This family consists of several uncharacterized proteins around 160 residues in length and is mainly found in various Clostridium species. The function of this family is unknown.	0
414779	cl06472	HycH	Formate hydrogenlyase maturation protein HycH. formate hydrogenlyase maturation protein HycH; Provisional	0
414780	cl06473	CHRD	CHRD domain. CHRD (after SWISS-PROT abbreviation for chordin) is a novel domain identified in chordin, an inhibitor of bone morphogenetic proteins. This family includes bacterial homologs. It is anticipated to have an immunoglobulin-like beta-barrel structure based on limited similarity to superoxide dismutases but, as yet, no clear functional prediction can be made. Its most conserved feature is a GE[I/L]RCG[V/I/L] motif towards its C-terminal end. Most bacterial proteins in this family have only one CHRD domain, whereas it is found repeated in many eukaryotic proteins such as human chordin and Drosophila SOG.	0
141938	cl06484	Agglutinin	Agglutinin domain. Although its biological function is unknown, it has a high binding specificity for the methyl-glycoside of the T-antigen, found linked to serine or threonine residues of cell surface glycoproteins. The protein is comprised of a homodimer, with each homodimer consisting of two beta-trefoil domains.	0
414794	cl06505	Rho_N	Rho termination factor, N-terminal domain. The Rho termination factor disengages newly transcribed RNA from its DNA template at certain, specific transcripts. It is thought that two copies of Rho bind to RNA and that Rho functions as a hexamer of protomers. This domain is found to the N-terminus of the RNA binding domain.	0
414795	cl06508	MANEC	MANEC domain. This domain, comprising 8 conserved cysteines, is found in the N terminus of higher multicellular animal membrane and extracellular proteins. It is postulated that this domain may play a role in the formation of protein complexes involving various protease activators and inhibitors. It is possible that some of the cysteine residues in the MANSC domain form structurally important disulfide bridges. All of the MANSC-containing proteins contain predicted transmembrane regions and signal peptides. It has been proposed that the MANSC domain in HAI-1 might function through binding with hepatocyte growth factor activator and matriptase.	0
414799	cl06515	DUF1525	Protein of unknown function (DUF1525). Members of this protein belong to extended genomic regions that appear to be spread by conjugative transfer. [Mobile and extrachromosomal element functions, Plasmid functions]	0
414804	cl06527	HWE_HK	HWE histidine kinase. This is the dimerization and phosphoacceptor domain of a sub-family of histidine kinases. It shares sequence similarity with pfam00512 and pfam07536. It is usually found adjacent to a C-terminal ATPase domain (pfam02518). This domain is found in a wide range of Bacteria and also several Archaea.	0
414824	cl06567	BatA	Aerotolerance regulator N-terminal. This model represents a prokaryotic N-terminal region of about 80 amino acids. The predicted membrane topology by TMHMM puts the N-terminus outside and spans the membrane twice, with a cytosolic region of about 25 amino acids between the two transmembrane regions. Member proteins tend to be between 600 and 1000 amino acids in length. [Hypothetical proteins, Domain]	0
414828	cl06591	DUF1573	Protein of unknown function (DUF1573). This HMM describes a repeat domain just over 100 amino acids long and usually found in tandem copies. Members appear to be extracellular proteins that have some C-terminal anchoring domain, such as type IX secretion (T9SS) or PEP-CTERM.	0
414837	cl06641	Fea1	Low iron-inducible periplasmic protein. In Chlamydomonas reinhardtii, the gene encoding Fe-assimilating protein 1 is induced by iron deficiency. In green algae, this protein is periplasmic. The two paralogues FEA1 and FEA2 are the major proteins secreted by iron-deficient Chlamydomonas reinhardtii, and both are up-regulated in response to iron deficiency. FEA1 but not FEA2 is up-regulated by high CO2 concentration. Both FEA1 and FEA2 are secreted into the periplasmic space and genetic evidence confirms that their association with the cell is required for growth in low iron.	0
414853	cl06673	Extradiol_Dioxygenase_3A_like	Subunit A of Class III extradiol dioxygenases. This is a family of aromatic ring opening dioxygenases which catalyze the ring-opening reaction of protocatechuate and related compounds.	0
384654	cl06725	TraC	TraC-like protein. conjugal transfer protein TraC; Provisional	0
414890	cl06756	Nif11	Nif11 domain. This model describes a conserved, fairly long (about 65 residue) leader peptide region for a family of putative ribosomal natural products (RNP) of small size. Members of the seed alignment tend to have the Gly-Gly motif as the last two residues of the matched region. This is a cleavage site for a combination processing/export ABC transporter with a peptidase domain. Members include the prochlorosins, lantipeptides from Prochlorococcus. [Cellular processes, Biosynthesis of natural products]	0
414893	cl06766	YabP	YabP family. Members of this protein family are the YabP protein of the bacterial sporulation program, as found in Bacillus subtilis, Clostridium tetani, and other spore-forming members of the Firmicutes. In Bacillus subtilis, a yabP single mutant appears to sporulate and germinate normally, but yabP is in an operon with yabQ (essential for formation of the spore cortex), is near-universal among endospore-forming bacteria, and is found nowhere else. It is likely, therefore, that YabP does have a function in sporulation or germination, one that is either unappreciated or partially redundant with that of another protein. [Cellular processes, Sporulation and germination]	0
414904	cl06793	PRKCSH	Glucosidase II beta subunit-like protein. The sequences found in this family are similar to a region found in the beta-subunit of glucosidase II, which is also known as protein kinase C substrate 80K-H (PRKCSH). The enzyme catalyzes the sequential removal of two alpha-1,3-linked glucose residues in the second step of N-linked oligosaccharide processing. The beta subunit is required for the solubility and stability of the heterodimeric enzyme, and is involved in retaining the enzyme within the endoplasmic reticulum. The beta-subunit confers substrate specificity for di- and monoglucosylated glycans on the glucose-trimming activity of the alpha-subunit.	0
352591	cl06838	C1_4	TFIIH C1-like domain. The carboxyl-terminal region of TFIIH is essential for transcription activity. This region binds three zinc atoms through two independent domains. The first contains a C4 zinc finger motif, whereas the second is characterised by a CX(2)CX(2-4)FCADCD motif. The solution structure of the second C-terminal domain revealed homology with the regulatory domain of protein kinase C.	0
414922	cl06842	X8	X8 domain. The X8 domain, which may be involved in carbohydrate binding, is found in an Olive pollen antigen as well as at the C terminus of family 17 glycosyl hydrolases. It contains 6 conserved cysteine residues which presumably form three disulfide bridges.	0
414924	cl06844	SRR1	SRR1. Protein SENSITIVITY TO RED LIGHT REDUCED 1; Provisional	0
414930	cl06858	DUF1704	Domain of unknown function (DUF1704). Members of this family include a possible metal-binding motif HEXXXH and, nearby, a perfectly conserved motif QEGLA. All members belong to the Proteobacteria, including Agrobacterium tumefaciens and several species of Vibrio and Pseudomonas, and are found in only one copy per chromosome (Vibrio vulnificus, with two chromosomes, has two). The function is unknown.	0
414932	cl06868	FNR_like	N/A. This model describes an NADPH-dependent sulfite reductase flavoprotein subunit. Most members of this family are found in Cys biosynthesis gene clusters. The closest homologs below the trusted cutoff are designated as subunits of nitrate reductase.	0
414933	cl06870	SpoU_sub_bind	RNA 2'-O ribose methyltransferase substrate binding. This region is found in some members of the SpoU-type rRNA methylase family (pfam00588).	0
414945	cl06893	UME	UME (NUC010) domain. Characteristic domain in UVSP PI-3 kinase, MEI-41 and ESR-1. Found in nucleolar proteins. Associated with FAT, FATC, PI3_PI4_kinase modules.	0
414955	cl06904	eNOPS_SF	NOPS domain, including C-terminal helical extension region, in the p54nrb/PSF/PSP1 family. This domain is found at the C-terminus of NONA and PSP1 proteins adjacent to 1 or 2 pfam00076 domains.	0
414963	cl06920	dimerization2	dimerization domain. This domain is found at the N-terminus of a variety of plant O-methyltransferases. It has been shown to mediate dimerization of these proteins.	0
414968	cl06939	Antimicrobial17	Alpha/beta enterocin family. This family consists of the alpha and beta enterocins and lactococcin G peptides. These peptides have some antimicrobial properties; they inhibit the growth of Enterococcus spp. and a few other Gram-positive bacteria. These peptides act as pore-forming toxins that create cell membrane channels through a barrel-stave mechanism and thus produce an ionic imbalance in the cell. This family of antimicrobial peptides belongs to the class II group of bacteriocins.	0
414973	cl06949	SspH	Small acid-soluble spore protein H family. This model is derived from pfam08141 but has been expanded to include in the seed corresponding proteins from three species of Clostridium. Members of this family should occur only in endospore-forming bacteria, typically with two members per genome, but may be absent from the genomes of some endospore-forming bacteria. SspH (previously designated YfjU) was shown to be expressed specifically in spores of Bacillus subtilis. [Cellular processes, Sporulation and germination]	0
414974	cl06950	AARP2CN	AARP2CN (NUC121) domain. This domain is the central domain of AARP2. It is weakly similar to the GTP-binding domain of elongation factor TU.	0
414976	cl06954	BP28CT	BP28CT (NUC211) domain. This C-terminal domain is found in BAP28-like nucleolar proteins.	0
414979	cl06957	BING4CT	BING4CT (NUC141) domain. This C terminal domain is found in the BING4 family of nucleolar WD40 repeat proteins.	0
414982	cl06960	GUCT	RNA-binding GUCT domain found in the RNA helicase II/Gu protein family. This is the C terminal domain found in the RNA helicase II / Gu protein family.	0
415001	cl06998	LEM_like	LEM-like domain of lamina-associated polypeptide 2 (LAP2) and similar proteins. Short protein of 49 amino acids isolated from bovine spleen cells. Thymopoietins (TMPOs) are a group of ubiquitously expressed nuclear proteins. They are suggested to play an important role in nuclear envelope organisation and cell cycle control.	0
384799	cl07019	SHR3_chaperone	ER membrane protein SH3. This family of proteins are membrane localised chaperones that are required for correct plasma membrane localisation of amino acid permeases (AAPs). Shr3 prevents AAPs proteins from aggregating and assists in their correct folding. In the absence of Shr3, AAPs are retained in the ER.	0
415011	cl07020	CW_7	CW_7 repeat. This domain was originally found in the C-terminal moiety of the Cpl-7 lysozyme encoded by the Streptococcus pneumoniae bacteriophage Cp-7. It is assumed that these repeats represent cell wall binding motifs although no direct evidence has been obtained so far.	0
415016	cl07029	SPT2	SPT2 chromatin protein. This entry includes the Saccharomyces cerevisiae protein SPT2 which is a chromatin protein involved in transcriptional regulation.	0
384806	cl07034	dCache_2	Cache domain. Members include the animal dihydropyridine-sensitive voltage-gated Ca2+ channel; alpha-2delta subunit, and various bacterial chemotaxis receptors. The name Cache comes from CAlcium channels and CHEmotaxis receptors. This domain consists of an N-terminal part with three predicted strands and an alpha-helix, and a C-terminal part with a strand dyad followed by a relatively unstructured region. The N-terminal portion of the (unpermuted) Cache domain contains three predicted strands that could form a sheet analogous to that present in the core of the PAS domain structure. Cache domains are particularly widespread in bacteria, with Vibrio cholerae. The animal calcium channel alpha-2delta subunits might have acquired a part of their extracellular domains from a bacterial source. The Cache domain appears to have arisen from the GAF-PAS fold despite their divergent functions.	0
415021	cl07053	Sin3_corepress	Sin3 family co-repressor. This domain is found on transcriptional regulators. It forms interactions with histone deacetylases.	0
415023	cl07055	Bac_DnaA_C	N/A. Could be involved in DNA-binding.	0
415024	cl07060	NPCBM	NPCBM/NEW2 domain. This novel putative carbohydrate binding module (NPCBM) domain is found at the N-terminus of glycosyl hydrolase family 98 proteins. This domain has also been called the NEW2 domain (Naumoff DG. Phylogenetic analysis of alpha-galactosidases of the GH27 family. Molecular Biology (Engl Transl). (2004)38:388-399.)	0
415026	cl07066	Mad3_BUB1_I	Mad3/BUB1 homology region 1. Proteins containing this domain are checkpoint proteins involved in cell division. This region has been shown to be essential for the binding of BUB1 and MAD3 to CDC20p.	0
415031	cl07072	COG4	COG4 transport protein. This region is found in yeast oligomeric golgi complex component 4 which is involved in ER to Golgi and intra Golgi transport.	0
415039	cl07097	oligo_HPY	Oligopeptide/dipeptide transporter, C-terminal region. This model represents a domain found in the C-terminal regions of oligopeptide ABC transporter ATP binding proteins, immediately following the ATP-binding domain (pfam00005). All characterized members appear able to be involved in the transport of oligopeptides or dipeptides. Some are important for sporulation or antibiotic resistance. Some dipeptide transporters also act on the heme precursor delta-aminolevulinic acid. [Transport and binding proteins, Amino acids, peptides and amines]	0
415068	cl07159	Rib	Rib/alpha-like repeat. Sequences in this family are tandem repeats of about 79 amino acids, present in up to 14 copies in a protein and highly identical, even at the DNA level, within each protein. Sequences with these repeats are found in the Rib and alpha surface antigens of group B Streptococcus, Esp of Enterococcus faecalis, and related proteins of Lactobacillus. The repeat lacks Cys residues. Most members of this protein family also have the cell wall anchor motif LPXTG shared by many staphylococcal and streptococcal surface antigens.	0
384882	cl07218	Ad_cyc_g-alpha	Adenylate cyclase G-alpha binding domain. This fungal domain is found in adenylate cyclase and interacts with the alpha subunit of heterotrimeric G proteins.	0
415102	cl07247	CDC37_C	Cdc37 C terminal domain. Cdc37 is a protein required for the activity of numerous eukaryotic protein kinases. This domain corresponds to the C terminal domain whose function is unclear. It is found C terminal to the Hsp90 chaperone (heat shock protein 90) binding domain pfam08565 and the N terminal kinase binding domain of Cdc37.	0
415103	cl07248	CDC37_M	Cdc37 Hsp90 binding domain. Cdc37 is a molecular chaperone required for the activity of numerous eukaryotic protein kinases. This domain corresponds to the Hsp90 chaperone (heat shock protein 90) binding domain of Cdc37. It is found between the N terminal Cdc37 domain which is predominantly involved in kinase binding, and the C terminal domain of Cdc37 whose function is unclear.	0
415127	cl07283	Tudor_3	DNA repair protein Crb2 Tudor domain. In Saccharomyces cerevisiae, Rad9 is a key adaptor protein in DNA damage checkpoint pathways. DNA damage induces Rad9 phosphorylation, and Rad53 specifically associates with this region of Rad9, when phosphorylated, via Rad53 pfam00498 domains. This region is structurally composed of a pair of TUDOR domains.	0
415133	cl07291	RNase_H2-C	Ribonuclease H2-C is a subunit of the eukaryotic RNase H complex which cleaves RNA-DNA hybrids. This entry represents the non-catalytic subunit of RNase H2, which in S. cerevisiae is Ylr154p/Rnh203p. Whereas bacterial and archaeal RNases H2 are active as single polypeptides, the Saccharomyces cerevisiae homolog, Rnh2Ap, when expressed in Escherichia coli, fails to produce an active RNase H2. For RNase H2 activity three proteins are required [Rnh2Ap (Rnh201p), Ydr279p (Rnh202p) and Ylr154p (Rnh203p)]. Deletion of any one of the proteins or mutations in the catalytic site in Rnh2A leads to loss of RNase H2 activity. RNase H2 is an endonuclease that specifically degrades the RNA of RNA:DNA hybrids. It participates in DNA replication, possibly by mediating the removal of lagging-strand Okazaki fragment RNA primers during DNA replication.	0
415160	cl07336	MutL_C	MutL C terminal dimerization domain. MutL and MutS are key components of the DNA repair machinery that corrects replication errors. MutS recognises mispaired or unpaired bases in a DNA duplex and in the presence of ATP, recruits MutL to form a DNA signaling complex for repair. The N terminal region of MutL contains the ATPase domain and the C terminal is involved in dimerisation.	0
415174	cl07362	PriCT_1	Primase C terminal 1 (PriCT-1). This alpha helical domain is found at the C terminal of primases.	0
415175	cl07364	Nfu_N	Scaffold protein Nfu/NifU N terminal. This domain is found at the N terminus of NifU and NifU related proteins, and in the human Nfu protein. Both of these proteins are thought to be involved in the assembly of iron-sulphur clusters.	0
415183	cl07381	BCS1_N	BCS1 N terminal. This domain is found at the N terminal of the mitochondrial ATPase BCS1. It encodes the import and intramitochondrial sorting for the protein.	0
415185	cl07383	C8	C8 domain. Not all of the conserved cysteines have been included in the alignment model. It is found in disease-related proteins including von Willebrand factor, Alpha tectorin, Zonadhesin and Mucin.	0
415187	cl07391	Cadherin_pro	Cadherin prodomain like. Cadherins are a family of proteins that mediate calcium dependent cell-cell adhesion. They are activated through cleavage of a prosequence in the late Golgi. This domain corresponds to the folded region of the prosequence, and is termed the prodomain. The prodomain shows structural resemblance to the cadherin domain, but lacks all the features known to be important for cadherin-cadherin interactions.	0
415188	cl07392	GT-D	Glycosyltransferase GT-D fold. Members of this protein family are putative glycosyltransferases. Some members are found close to genes for the accessory secretory (SecA2) system, and are suggested by Partial Phylogenetic Profiling to correlate with SecA2 systems. Glycosylation, therefore, may occur in the cytosol prior to secretion.	0
384993	cl07396	CRM1_C	CRM1 C terminal. CRM1 (also known as Exportin1) mediates the nuclear export of proteins bearing a leucine-rich nuclear export signal (NES). CRM1 forms a complex with the NES containing protein and the small GTPase Ran. This region forms an alpha helical structure formed by six helical hairpin motifs that are structurally similar to the HEAT repeat, but share little sequence similarity to the HEAT repeat.	0
415190	cl07397	SoxZ	Sulphur oxidation protein SoxZ. SoxZ forms a heterodimer with SoxY, the subunit that forms a covalent bond with a sulfur moiety during thiosulfate oxidation to sulfate. Note that virtually all proteins that have a SoxY domain fused to a SoxZ domain are functionally distinct and not involved in thiosulfate oxidation.	0
415194	cl07405	DP_DD	Dimerization domain of DP. DP forms a heterodimer with E2F and regulates genes involved in cell cycle progression. The transcriptional activity of E2F is inhibited by the retinoblastoma protein which binds to the E2F-DP heterodimer and negatively regulates the G1-S transition.	0
415195	cl07406	c-SKI_SMAD_bind	c-SKI Smad4 binding domain. c-SKI is an oncoprotein that inhibits TGF-beta signaling through interaction with Smad proteins. This domain binds to Smad4.	0
415200	cl07418	HIRAN	HIRAN domain. HIRAN is found as a standalone protein in several bacteria and prophages, or fused to other catalytic domains, such as a nuclease of the restriction endonuclease fold and TDP1-like DNA phosphoesterases, in the eukaryotes. It has been predicted that this protein functions as a DNA-binding domain that probably recognises features associated with damaged DNA or stalled replication forks.	0
415202	cl07420	ydhR	Putative mono-oxygenase ydhR. putative monooxygenase; Provisional	0
415207	cl07428	Ivy	Inhibitor of vertebrate lysozyme (Ivy). C-lysozyme inhibitor; Provisional	0
415209	cl07433	Serine_rich_CAS	Serine rich Four helix bundle domain of CAS (Crk-Associated Substrate) scaffolding proteins; a protein interaction module. This is a serine rich domain that is found in the docking protein p130(cas) (Crk-associated substrate). This domain folds into a four helix bundle which is associated with protein-protein interactions.	0
415215	cl07443	Cdt1_m	The middle winged helix fold of replication licensing factor Cdt1 binds geminin to inhibit binding of the MCM complex to origins of replication and DNA. CDT1 is a component of the replication licensing system and promotes the loading of the mini-chromosome maintenance complex onto chromatin. Geminin is an inhibitor of CDT1 and prevents inappropriate re-initiation of replication on an already fired origin. This region of CDT1 binds to Geminin.	0
385023	cl07446	SymE_toxin	Toxin SymE, type I toxin-antitoxin system. endoribonuclease SymE; Provisional	0
415228	cl07460	FRG	FRG domain. This presumed domain contains a conserved N-terminal (F/Y)RG motif. It is functionally uncharacterized.	0
415229	cl07462	XisI-like	XisI is FdxN element excision controlling factor protein. The fdxN element, along with two other DNA elements, is excised from the chromosome during heterocyst differentiation in cyanobacteria. The xisH as well as the xisF and xisI genes are required.	0
415230	cl07463	DndE	DNA sulphur modification protein DndE. This model describes the DndE protein encoded by an operon associated with a sulfur-containing modification to DNA. The operon is sporadically distributed in bacteria, much like some restriction enzyme operons. DndE is a putative carboxylase homologous to NCAIR synthetases. [DNA metabolism, Restriction/modification]	0
415232	cl07469	QLQ	QLQ. QLQ is found at the N-terminus of SWI2/SNF2 protein, which has been shown to be involved in protein-protein interactions. QLQ has been postulated to be involved in mediating protein interactions.	0
415241	cl07481	DUF1845	Domain of unknown function (DUF1845). Members of this protein family, such as PFL4669, are found in integrating conjugative elements (ICE) of the PFGI-1 class as in Pseudomonas fluorescens.	0
415251	cl07494	F_actin_bind	F-actin binding. FABD is the F-actin binding domain of Bcr-Abl and its cellular counterpart c-Abl. The Bcr-Abl tyrosine kinase causes different forms of leukemia in humans. Depending on its position within the cell, Bcr-Abl differentially affects cellular growth. The FABD forms a compact left-handed four-helix bundle in solution.	0
415258	cl07510	CsoSCA	Carboxysome Shell Carbonic Anhydrase. This model describes a carboxysome shell protein that proves to be a novel class, designated epsilon, of carbonic anhydrase. It tends to be encoded near genes for RuBisCo and for other carboxysome shell proteins. [Central intermediary metabolism, One-carbon metabolism]	0
415270	cl07523	LciA-like	lactococcin A immunity protein (LciA) and similar proteins. Gram-positive lactobacilli produce bacteriocins to kill closely-related competitor species. To protect themselves from the bacteriocidal activity of this molecule they co-express an immunity protein (for discussion of this operon see Bacteriocin_IIc pfam10439). The immunity protein structure is a soluble, cytoplasmic, antiparallel four alpha-helical globular bundle with a fifth, more flexible and more divergent C-terminal helical hair-pin. The C-terminal hair-pin recognizes the C-terminus of the producer bacteriocin and this interaction is sufficient to dis-orient the bacteriocin within the membrane and close up the permeabilising pore that on its own the bacteriocin creates. These immunity proteins interact in the same way with other bacteriocins, family Bacteriocin_II, pfam01721. Since many enterococci can produce more than one bacteriocin it seems likely that the whole operon can be carried on transferable plasmids.	0
415275	cl07531	DUF1874	Domain of unknown function (DUF1874). DNA binding protein	0
415286	cl07557	QH-AmDH_gamma	Quinohemoprotein amine dehydrogenase, gamma subunit. Members of this family contain a cross-linked, proteinous quinone cofactor, cysteine tryptophylquinone, which is required for catalysis of the oxidative deamination of a wide range of aliphatic and aromatic amines. The domain assumes a globular secondary structure, with two short alpha-helices having many turns and bends.	0
415293	cl07574	YopH_N	YopH, N-terminal. The type III secretion system (T3SS) protein called LcrQ in Yersinia pseudotuberculosis and YscM1 in Yersinia enterocolitica is a post-transcriptional regulator of T3SS effector gene expression. Successful chaperone-dependent export by the T3SS allows the translation of T3SS effector proteins to proceed.	0
298396	cl07585	T3SS_needle_reg	YopR, type III needle-polymerization regulator. Members of this family are type III secretion system effectors, named differently in different species and designated YopR (Yersinia outer protein R), encoded by the YscH (Yersinia secretion H) gene. This Yops protein is unusual in that it is released extracellularly rather than injected directly into the target cell as are most Yops. [Cellular processes, Pathogenesis]	0
415300	cl07590	NCBD_CREBBP-p300_like	Nuclear Coactivator Binding Domain (NCBD) of CREB (cyclic AMP response element binding protein) binding protein (CREBBP, also known as CBP) and its paralog p300. The Creb binding domain assumes a structure comprising three alpha-helices which pack in a bundle, exposing a hydrophobic groove between alpha-1 and alpha-3 within which complementary domains found in the protein 'activator for thyroid hormone and retinoid receptors' (ACTR) can dock. Docking of these domains is required for the recruitment of RNA polymerase II and the basal transcription machinery.	0
415310	cl07609	Sod_Ni	Nickel-containing superoxide dismutase. This superoxide dismutase uses nickel, rather than iron, manganese, copper, or zinc. Its gene is always accompanied by a gene for a required protease.	0
415315	cl07618	B2-adapt-app_C	Beta2-adaptin appendage, C-terminal sub-domain. Members of this family adopt a structure consisting of a 5 stranded beta-sheet, flanked by one alpha helix on the outer side, and by two alpha helices on the inner side. This domain is required for binding to clathrin, and its subsequent polymerisation. Furthermore, a hydrophobic patch present in the domain also binds to a subset of D-phi-F/W motif-containing proteins that are bound by the alpha-adaptin appendage domain (epsin, AP180, eps15).	0
352774	cl07621	EFh_DMD_DYTN_DTN	EF-hand-like motif found in the dystrophin/dystrobrevin/dystrotelin family. Beta-dystrobrevin, also termed dystrobrevin beta (DTN-B), is a dystrophin-related protein that is restricted to non-muscle tissues and is abundantly expressed in brain, lung, kidney, and liver. It may be involved in regulating chromatin dynamics, possibly playing a role in neuronal differentiation, through the interactions with the high mobility group HMG20 proteins iBRAF/HMG20a and BRAF35/HMG20b. It also binds to and represses the promoter of synapsin I, a neuronal differentiation gene. Moreover, beta-dystrobrevin functions as a kinesin-binding receptor involved in brain development via the association with the extracellular matrix components pancortins. Furthermore, beta-dystrobrevin binds directly to dystrophin and is a cytoplasmic component of the dystrophin-associated glycoprotein complex, a multimeric protein complex that links the extracellular matrix to the cortical actin cytoskeleton and acts as a scaffold for signaling proteins such as protein kinase A. Absence of alpha- and beta-dystrobrevin causes cerebellar synaptic defects and abnormal motor behavior. Beta-dystrobrevin has a compact cluster of domains comprising four EF-hand-like motifs and a ZZ-domain, followed by a looser region with two coiled-coils. These domains are believed to be involved in protein-protein interactions. In addition, beta-dystrobrevin contains two syntrophin binding sites (SBSs).	0
415341	cl07672	GH15_N	Glycoside hydrolase family 15, N-terminal domain. Members of this family, which are uniquely found in bacterial and archaeal glucoamylases and glucodextranases, adopt a structure consisting of 17 antiparallel beta-strands. These beta-strands are divided into two beta-sheets, and one of the beta-sheets is wrapped by an extended polypeptide, which appears to stabilize the domain. Members of this family are mainly concerned with catalytic activity, hydrolysing alpha-1,6-glucosidic linkages of dextran to release beta-D-glucose from the non-reducing end via an inverting reaction mechanism.	0
385167	cl07688	FimH_man-bind	Mannose binding  domain of FimH and related proteins. Members of this family adopt a secondary structure consisting of a beta sandwich, with nine strands arranged in two sheets in a Greek key topology. They are predominantly found in bacterial mannose-specific adhesins, since they are capable of binding to D-mannose.	0
415353	cl07696	PepX_N	X-Prolyl dipeptidyl aminopeptidase PepX, N-terminal. This N-terminal domain adopts a secondary structure consisting of a helical bundle of eight alpha helices and three beta strands, with the last alpha helix connecting to the first strand of the catalytic domain. The first strand of the N-terminus also forms a small parallel beta sheet with strand five of the catalytic domain. This domain mediates dimerisation of the protein, with two proline residues present in the domain being critical for interaction.	0
415374	cl07747	Aha1_N	Activator of Hsp90 ATPase, N-terminal. This domain is predominantly found in the protein 'Activator of Hsp90 ATPase', it adopts a secondary structure consisting of an N-terminal alpha-helix leading into a four-stranded meandering antiparallel beta-sheet, followed by a C-terminal alpha-helix. The two helices are packed together, with the beta-sheet curving around them. They bind to the molecular chaperone HSP82 and stimulate its ATPase activity.	0
415389	cl07779	DUF1967	Domain of unknown function (DUF1967). CgtA (see model TIGR02729) is a broadly conserved member of the obg family of GTPases associated with ribosome maturation. This model represents a unique C-terminal domain found in some but not all sequences of CgtA. This region is preceded, and may be followed, by a region of low-complexity sequence.	0
385237	cl07828	zf-H2C2	His(2)-Cys(2) zinc finger. This is a family of probably DNA-binding zinc-fingers found on Gag-Pol polyproteins from mouse retroviruses. Added to clan to resolve overlaps with zf-H2C2, but neither are true members.	0
324147	cl07831	growth_hormone_like	Somatotropin/prolactin hormone family. Prolactin is primarily responsible for stimulating milk production and breast development in mammals. Aside from roles in reproduction, various functions have been attributed to prolactin, more than for other pituitary gland hormones combined. These are roles in growth and development, metamorphosis, metabolism of lipids, carbohydrates, and steroids, brain biochemistry and even immunoregulation, among others. Most of these roles are poorly understood, but it has become clear that many prolactin-like hormones are actually produced in the placenta and not the pituitary.	0
415414	cl07834	C6	C6 domain. It is presumed to be an extracellular domain. The C6 domain contains six conserved cysteine residues in most copies of the domain. However some copies of the domain are missing cysteine residues 1 and 3 suggesting that these form a disulphide bridge.	0
385243	cl07847	RGP	Reversibly glycosylated polypeptide. reversibly glycosylated polypeptide; Provisional	0
415418	cl07849	Acetyltransf_14	YopJ Serine/Threonine acetyltransferase. The Yersinia effector YopJ inhibits the innate immune response by blocking MAP kinase and NFkappaB signaling pathways. YopJ is a serine/threonine acetyltransferase which regulates signalling pathways by blocking phosphorylation. Specifically, YopJ has been shown to block phosphorylation of active site residues. It has also been shown that YopJ acetyltransferase is activated by eukaryotic host cell inositol hexakisphosphate. This family was previously incorrectly annotated in Pfam as being a peptidase family.	0
415428	cl07863	Phasin	Poly(hydroxyalcanoate) granule associated protein (phasin). This model describes a domain found in some proteins associated with polyhydroxyalkanoate (PHA) granules in a subset of species that have PHA inclusion granules. Included are two tandem proteins of Pseudomonas oleovorans, PhaI and PhaF, and their homologs in related species. PhaF proteins have a low-complexity C-terminal region with repeats similar to AAAKP. [Fatty acid and phospholipid metabolism, Biosynthesis]	0
415433	cl07870	EppA_BapA	Exported protein precursor (EppA/BapA). This family consists of a number of exported protein precursor (EppA and BapA) sequences which seem to be specific to Borrelia burgdorferi (Lyme disease spirochete). bapA gene sequences are quite stable but the encoded proteins do not provoke a strong immune response in most individuals. Conversely, EppA proteins are much more antigenic but are more variable in sequence. It is thought that BapA and EppA play important roles during the Borrelia burgdorferi infectious cycle.	0
415435	cl07874	zf-AD	Zinc-finger associated domain (zf-AD). The zf-AD domain, also known as ZAD, forms an atypical treble-cleft-like zinc co-ordinating fold. The zf-AD domain is thought to be involved in mediating dimer formation, but does not bind to DNA.	0
415437	cl07879	DnaG_DnaB_bind	DNA primase DnaG DnaB-binding. DnaG_DnaB_bind defines a domain of primase required for functional interaction with DnaB that attracts primase to the replication fork. DnaG_DnaB_bind is responsible for the interaction between DnaG and DnaB.	0
415439	cl07883	CAMSAP_CKK	Microtubule-binding calmodulin-regulated spectrin-associated. This is the C-terminal domain of a family of eumetazoan proteins collectively defined as calmodulin-regulated spectrin-associated, or CAMSAP, proteins. CAMSAP proteins carry an N-terminal region that includes the CH domain, a central region including a predicted coiled-coil and this C-terminal, or CKK, domain - defined as being present in CAMSAP, KIAA1078 and KIAA1543. The C-terminal domain is the part of the CAMSAP proteins that binds to microtubules. The domain appears to act by producing inhibition of neurite extension, probably by blocking microtubule function. CKK represents a domain that has evolved with the metazoa. The structure of a murine hypothetical protein from RIKEN cDNA has shown the domain to adopt a mainly beta barrel structure with an associated alpha-helical hairpin.	0
415443	cl07889	Pro-peptidase_S53	Activation domain of S53 peptidases. Members of this family are found in various subtilase propeptides, and adopt a ferredoxin-like fold, with an alpha+beta sandwich. Cleavage of the domain results in activation of the peptide.	0
415444	cl07890	AAI_LTSS	N/A. This domain has a four-helix bundle structure. It contains four disulfide bonds, of which three function to keep the C- and N-terminal parts of the molecule in place.	0
415445	cl07905	DUF3237	Protein of unknown function (DUF3237). hypothetical protein; Provisional	0
415446	cl07906	DUF2585	Protein of unknown function (DUF2585). hypothetical protein; Provisional	0
415447	cl07918	Virul_fac_BrkB	Virulence factor BrkB. Initial identification of members of this protein family was based on characterization of the yihY gene product as ribonuclease BN in Escherichia coli. This identification has been withdrawn, as the group now finds the homolog in E. coli of RNase Z is the true ribonuclease BN rather than a strict functional equivalent of RNase Z. Members of this subfamily include the largely uncharacterized BrkB (Bordetella resist killing by serum B) from Bordetella pertussis. Some members have an additional C-terminal domain. Paralogs from E. coli (yhjD) and Mycobacterium tuberculosis (Rv3335c) are part of a smaller, related subfamily that form their own cluster. [Unknown function, General]	0
415448	cl07929	Glyco_transf_56	4-alpha-L-fucosyltransferase glycosyl transferase group 56. This family contains the bacterial enzyme 4-alpha-L-fucosyltransferase (Fuc4NAc transferase) (EC 2.4.1.-) (approximately 360 residues long). This catalyzes the synthesis of Fuc4NAc-ManNAcA-GlcNAc-PP-Und (lipid III) as part of the biosynthetic pathway of enterobacterial common antigen (ECA), a polysaccharide comprised of the trisaccharide repeat unit Fuc4NAc-ManNAcA-GlcNAc.	0
415449	cl07930	Fe_bilin_red	Ferredoxin-dependent bilin reductase. phycocyanobilin:ferredoxin oxidoreductase; Validated	0
415450	cl07940	SSPI	Small, acid-soluble spore protein I. small acid-soluble spore protein SspI; Provisional	0
385277	cl07943	SspO	Small acid-soluble spore protein O family. This model represents a minor (low-abundance) spore protein, designated SspO. It is found in a very limited subset of the already small group of endospore-forming bacteria, but these species include Oceanobacillus iheyensis, Geobacillus kaustophilus, Bacillus subtilis, B. halodurans, and B. cereus. This protein was previously called CotK. [Cellular processes, Sporulation and germination]	0
415451	cl07944	HutP	Histidine Utilizing Protein, the hut operon positive regulatory protein. The HutP protein family regulates the expression of Bacillus 'hut' structural genes by an anti-termination complex, which recognizes three UAG triplet units, separated by four non-conserved nucleotides on the RNA terminator region. L-histidine and Mg2+ ions are also required. These proteins exhibit the structural elements of alpha/beta proteins, arranged in the order: alpha-alpha-beta-alpha-alpha-beta-beta-beta in the primary structure, and the four antiparallel beta-strands form a beta-sheet in the order beta1-beta2-beta3-beta4, with two alpha-helices each on the front (alpha1 and alpha2) and at the back (alpha3 and alpha4) of the beta-sheet.	0
186720	cl07951	PRK03830	N/A. This protein family is restricted to a subset of endospore-forming bacteria such as Bacillus subtilis, all of which are in the Firmicutes (low-GC Gram-positive) lineage. Although previously designated tlp (thioredoxin-like protein), the B. subtilis protein was shown to be a minor small acid-soluble spore protein SASP, unique to spores. The motif E[VIL]XDE near the C-terminus probably represents a germination protease cleavage site. [Cellular processes, Sporulation and germination]	0
415452	cl07980	FHIPEP	FHIPEP family. Members of this family are closely homologous to the flagellar biosynthesis protein FlhA (TIGR01398) and should all participate in type III secretion systems. Examples include InvA (Salmonella enterica), LcrD (Yersinia enterocolitica), HrcV (Xanthomonas), etc. Type III secretion systems resemble flagellar biogenesis systems, and may share the property of translocating special classes of peptides through the membrane. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	0
415453	cl07986	PDGLE	PDGLE domain. cobalt transport protein CbiN; Validated	0
415454	cl08022	Spore_III_AB	Stage III sporulation protein AB (spore_III_AB). stage III sporulation protein SpoAB; Provisional	0
415455	cl08031	ThiC_Rad_SAM	Radical SAM ThiC family. Members of this protein family closely resemble ThiC, an enzyme that performs a complex rearrangement during thiamin biosynthesis, but instead occur as one of two adjacent additional paralogs to bona fide ThiC, in a conserved gene neighborhood with a pair of B12 binding domain/radical SAM domain proteins. Members of the ThiC family are non-canonical radical SAM enzymes, using a C-terminal Cys-rich motif to ligand a 4Fe-4S cluster that cleaves S-adenosylmethionine (SAM), but that sequence region does not belong to pfam04055.	0
415456	cl08044	Cse1_I-E	CRISPR/Cas system-associated protein Cse1. Clusters of short DNA repeats with non-homologous spacers, which are found at regular intervals in the genomes of phylogenetically distinct prokaryotic species, comprise a family with recognisable features. This family is known as CRISPR (short for Clustered, Regularly Interspaced Short Palindromic Repeats). A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This entry, represented by CT1972 from Chlorobaculum tepidum, is found in the CRISPR/Cas subtype Ecoli regions of many bacteria (most of which are mesophiles), and not in Archaea. It is designated Cse1.	0
415457	cl08047	Lar_restr_allev	Restriction alleviation protein Lar. Restriction alleviation proteins provide a countermeasure to host cell restriction enzyme defense against foreign DNA such as phage or plasmids. This family consists of homologs to the phage antirestriction protein Lar, and most members belong to phage genomes or prophage regions of bacterial genomes. [Mobile and extrachromosomal element functions, Prophage functions, DNA metabolism, Restriction/modification]	0
415458	cl08066	YafO_toxin	Toxin YafO, type II toxin-antitoxin system. YafO is a toxin which inhibits protein synthesis. It acts as a ribosome-dependent mRNA interferase. It forms part of a type II toxin-antitoxin system, where the YafN protein acts as an antitoxin. This domain forms complexes with yafN antitoxins containing pfam02604.	0
415459	cl08082	CsgF	Type VIII secretion system (T8SS), CsgF protein. The extracellular nucleation-precipitation (ENP) pathway or Type VIII secretion system (T8SS) in Gram-negative (diderm) bacteria is responsible for the secretion and assembly of prepilins for fimbriae biogenesis, the prototypical curli. Besides the T2SS that can be involved in the assembly of prototypical Type 4 pilus, the T4SS that can be involved in the biogenesis of the prototypical pilus T, the T3SS involved in the assembly of the injectisome and the T7SS involved in the formation of the prototypical Type 1 pilus, the T8SS differs in that fibre-growth occurs extracellularly. The curli, also called thin aggregative fimbriae (Tafi), are the only fimbriae dependent on the T8SS. Tafi were first identified in Salmonella spp. and the controlling operon termed agf; however subsequent isolation of the homologous operon in E. coli led to its being called csg. In the absence of extracellular polysaccharides Tafi appear curled, although when expressed with such polysaccharides their morphology appears as a tangled amorphous matrix. CsgF is one of three putative curli assembly factors appearing to act as a nucleator protein. Unlike eukaryotic amyloid formation, curli biogenesis is a productive pathway requiring a specific assembly machinery.	0
298624	cl08090	DUF2541	Protein of unknown function (DUF2541). This family of proteins with unknown function appears to be restricted to Enterobacteriaceae. All proteins are annotated as YaaI precursor however currently no function is known.	0
415460	cl08095	HycA_repressor	Transcriptional repressor of hyc and hyp operons. This family is conserved in Proteobacteria. It is likely to be the transcriptional repressor molecule for the hyc and hyp operons, which express, amongst others, the protein HycA. This protein may be harnessed for the reduction of technetium oxide, an unwelcome product of radio-nucleotide bioaccumulation. HycA produces formate hydrogenlyase, one of the key proteins necessary for metal compound reduction.	0
415461	cl08096	DUF1992	Domain of unknown function (DUF1992). hypothetical protein; Provisional	0
415462	cl08107	DUF5329	Family of unknown function (DUF5329). hypothetical protein; Provisional	0
415463	cl08110	DUF2756	Protein of unknown function (DUF2756). Some members in this family of proteins are annotated yhhA however currently no function is known. The family appears to be restricted to Enterobacteriaceae.	0
415464	cl08115	CsgE	Curli assembly protein CsgE. Curli are a class of highly aggregated surface fibers that are part of a complex extracellular matrix. They promote biofilm formation in addition to other activities. CsgE is a non-structural protein involved in curli biogenesis. CsgE forms an outer membrane complex with the curli assembly proteins CsgG and CsgF.	0
415465	cl08119	YgbA_NO	Nitrous oxide-stimulated promoter. The function of ygaB is not known but it is a promoter that is stimulated by the presence of nitrous oxide. It is regulated by the gene-product of the bacterial nsrR gene.	0
415466	cl08122	NiFe-hyd_HybE	[NiFe]-hydrogenase assembly, chaperone, HybE. Members of this family are chaperones for the assembly of [NiFe] hydrogenases, in the family of HybE, which is specific for hydrogenase-2 of Escherichia coli. Members often have an additional N-terminal rubredoxin domain.	0
415467	cl08125	DUF3811	YjbD family (DUF3811). hypothetical protein; Provisional	0
415468	cl08136	SecD-TM1	SecD export protein N-terminal TM region. This domain appears to be the first transmembrane region of the SecD export protein. SecD is directly involved in protein secretion and important for the release of proteins that have been translocated across the cytoplasmic membrane.	0
415469	cl08141	Lipoprotein_20	YfhG lipoprotein. This family includes the YfhG protein from E. coli. Members of this family have an N-terminal lipoprotein attachment site. The members of this family are functionally uncharacterized.	0
415470	cl08147	RcsF	RcsF lipoprotein. The RcsF lipoprotein is a component of the Rcs signaling system. It activates the Rcs system by transmitting signals from the cell surface to the histidine kinase RcsC.	0
415471	cl08171	HtrL_YibB	Bacterial protein of unknown function (HtrL_YibB). hypothetical protein; Provisional	0
415472	cl08177	DUF2625	Protein of unknown function DUF2625. hypothetical protein; Provisional	0
415473	cl08186	DUF3251	Protein of unknown function (DUF3251). hypothetical protein; Provisional	0
415474	cl08187	Cas2_I-E	CRISPR/Cas system-associated protein Cas2. This entry represents a minor branch of the Cas2 family of CRISPR-associated protein which are found in IPR003799. Cas proteins are found adjacent to a characteristic short, palindromic repeat cluster termed CRISPR, a probable mobile DNA element.	0
415475	cl08197	DUF2501	Protein of unknown function (DUF2501). hypothetical protein; Provisional	0
415476	cl08212	Tipalpha	TNF-alpha-Inducing protein of Helicobacter. tumor necrosis factor alpha-inducing protein; Reviewed	0
415477	cl08220	Photo_RC	D1, D2 subunits of photosystem II  (PSII); M, L subunits of bacterial photosynthetic reaction center. This model describes the photosynthetic reaction center M subunit in non-oxygenic photosynthetic bacteria. Reaction center is an integral membrane pigment-protein that carries out light-driven electron transfer reactions. At the core of reaction center is a collection of light-harvesting cofactors and closely associated polypeptides. The core protein complex is made of L, M and H subunits. The common cofactors include bacteriochlorophyll, bacteriopheophytins, ubiquinone and non-heme ferrous iron. The net result of electron transfer reactions is the establishment of proton electrochemical gradient and production of reducing equivalents in form of NADH. Ultimately the process results in the reduction of CO2 to carbohydrates (C6H12O6). In non-oxygenic organisms, the electron donor is some organic acid and not water. Much of our current functional understanding of photosynthesis comes from the structural determination, spectroscopic studies and mutational analysis on the reaction center of Rhodobacter sphaeroides. [Energy metabolism, Electron transport, Energy metabolism, Photosynthesis]	0
415478	cl08223	PSII	Photosystem II protein. photosystem II 47 kDa protein	0
415479	cl08224	PsaA_PsaB	Photosystem I psaA/psaB protein. photosystem I P700 chlorophyll a apoprotein A1; Provisional	0
415480	cl08232	RuBisCO_large	Ribulose bisphosphate carboxylase large chain. The C-terminal domain of RuBisCO large chain is the catalytic domain adopting a TIM barrel fold.	0
415481	cl08246	MHC_I	Class I Histocompatibility antigen, domains alpha 1 and 2. Members of this family are known as retinoic-acid-inducible proteins. They are ligands for the activating immunoreceptor NKG2D, which is widely expressed on natural killer cells, T cells, and macrophages.	0
415483	cl08255	Na_K-ATPase	Sodium / potassium ATPase beta chain. This model describes the Na+/K+ ATPase beta subunit in eukaryotes. Na+/K+ ATPase (also called the sodium-potassium pump) is intimately associated with the plasma membrane. It couples the energy released by the hydrolysis of ATP to extrude 3 Na+ ions, with the concomitant uptake of 2 K+ ions, against their ionic gradients. [Transport and binding proteins, Cations and iron carrying compounds]	0
415486	cl08263	TBP_TLF	N/A. archaeal TATA box binding protein (TBP): TBPs are transcription factors present in archaea and eukaryotes, that recognize promoters and initiate transcription. TBP has been shown to be an essential component of three different transcription initiation complexes: SL1, TFIID and TFIIIB, directing transcription by RNA polymerases I, II and III, respectively. TBP binds directly to the TATA box promoter element, where it nucleates polymerase assembly, thus defining the transcription start site. TBP's binding in the minor groove induces a dramatic DNA bending while its own structure barely changes. The conserved core domain of TBP, which binds to the TATA box, has a bipartite structure, with intramolecular symmetry generating a saddle-shaped structure that sits astride the DNA.	0
415487	cl08267	ISOPREN_C2_like	N/A. Proteins similar to alpha2-macroglobulin (alpha (2)-M). This group also contains the pregnancy zone protein (PZP).  Alpha(2)-M and PZP are broadly specific proteinase inhibitors. Alpha (2)-M is a major carrier protein in serum. The structural thioester of alpha (2)-M, is involved in the immobilization and entrapment of proteases.  PZP is a trace protein in the plasma of non-pregnant females and males which is elevated in pregnancy. Alpha (2)-M and PZ bind to placental protein-14 and may modulate its activity in T-cell growth and cytokine production contributing to fetal survival. It has been suggested that thioester bond cleavage promotes the binding of PZ and alpha (2)-M to the CD91 receptor clearing them from circulation.	0
415488	cl08270	Peptidase_S10	Serine carboxypeptidase. serine carboxypeptidase (CBP1); Provisional	0
415490	cl08275	RHD-n	N-terminal sub-domain of the Rel homology domain (RHD). Proteins containing the Rel homology domain (RHD) are eukaryotic transcription factors. The RHD is composed of two structural domains. This is the N-terminal DNA-binding domain that is similar to that found in P53. The C-terminal domain has an immunoglobulin-like fold (See pfam16179) that functions as a dimerization domain.	0
415491	cl08282	Acyl_transf_1	Acyl transferase domain. SAT is the N-terminal starter unit:ACP transacylase of the aflatoxin biosynthesis pathway. SAT selects the hexanoyl starter unit from a pair of specialized fungal fatty acid synthase subunits (HexA/HexB) and transfers it onto the polyketide synthase A acyl-carrier protein to prime polyketide chain elongation. The family is found in association with pfam02801, pfam00109, pfam00550, pfam00975, pfam00698.	0
415497	cl08291	TCTP	Translationally controlled tumor protein. translationally controlled tumor-like  protein; Provisional	0
415499	cl08298	NAP	Nucleosome assembly protein (NAP). (NAP-L) nucleosome assembly protein -L; Provisional	0
415500	cl08299	LAGLIDADG_3	LAGLIDADG-like domain. This domain is found within the sporulation regulator WhiA. It is a LAGLIDADG superfamily like domain.	0
415501	cl08302	EFh	N/A. S-100A10_like: S-100A10 domain found in proteins similar to S100A10. S100A10 is a member of the S100 family of EF-hand superfamily of calcium-binding proteins. Note that the S-100 hierarchy, to which this S-100A1_like group belongs, contains only S-100 EF-hand domains, other EF-hands have been modeled separately. S100 proteins are expressed exclusively in vertebrates, and are implicated in intracellular and extracellular regulatory activities. A unique feature of S100A10 is that it contains mutation in both of the calcium binding sites, making it calcium insensitive. S100A10 has been detected in brain, heart, gastrointestinal tract, kidney, liver, lung, spleen, testes, epidermis, aorta, and thymus. Structural data supports the homo- and hetero-dimeric as well as hetero-tetrameric nature of the protein. S100A10 has multiple binding partners in its calcium free state and is therefore involved in many diverse biological functions.	0
415503	cl08306	Peptidase_C12	Cysteine peptidase C12 contains ubiquitin carboxyl-terminal hydrolase (UCH) families L1, L3, L5 and BAP1. This ubiquitin C-terminal hydrolase (UCH) family includes UCH37 (also known as UCH-L5) and BRCA1-associated protein-1 (BAP1). They contain a UCH catalytic domain as well as an additional C-terminal extension which plays a role in protein-protein interactions. UCH37 is responsible for ubiquitin (Ub) isopeptidase activity in the 19S proteasome regulatory complex; it disassembles Lys48-linked poly-ubiquitin from the distal end of the chain. It is also associated with the human Ino80 chromatin-remodeling complex (hINO80) in the nucleus and can be activated through transient association of hINO80 with hRpn13 that is bound to the 19S regulatory particle or the proteasome. UCH37 possibly plays a role in oncogenesis; it competes with Smad ubiquitination regulatory factor 2 (Smurf2, ubiquitin ligase) in binding concurrently to Smad7 in order to deubiquitinate the activated type I transforming growth factor beta (TGF-beta) receptor, thus rescuing it from proteasomal degradation. BAP1 binds to the wild-type BRCA1 RING finger domain, localized in the nucleus. In addition to the UCH catalytic domain, BAP1 contains a UCH37-like domain (ULD), binding domains for BRCA1 and BARD1, which form a tumor suppressor heterodimeric complex, and a binding domain for HCFC1, which interacts with histone-modifying complexes during cell division. The full-length human BRCA1 is a ubiquitin ligase. However, BAP1 does not appear to function in the deubiquitination of autoubiquitinated BRCA1. BAP1 exhibits tumor suppressor activity in cancer cells, and gene mutations have been reported in a small number of breast and lung cancer samples. In metastasis of uveal melanoma, the most common primary cancer of the eye, inactivating somatic mutations have been identified in the gene encoding BAP1 on chromosome 3p21.1. These mutations include several that cause premature protein termination as well as affect its UCH domain, thus implicating loss of BAP1 and suggesting that the BAP1 pathway may be a valuable therapeutic target.	0
415505	cl08315	CAP_GLY	CAP-Gly domain. A conserved motif, CAP-Gly, has been identified in a number of CAPs, including CLIP-170 and dynactins. The crystal structure of Caenorhabditis elegans F53F4.3 protein CAP-Gly domain was recently solved. The domain contains three beta-strands. The most conserved sequence, GKNDG, is located in two consecutive sharp turns on the surface, forming the entrance to a groove.	0
415507	cl08320	Pollen_allerg_1	Pollen allergen. pollen allergen group 3; Provisional	0
415511	cl08346	Rib_hydrolase	ADP-ribosyl cyclase, also known as cyclic ADP-ribose hydrolase or CD38. ADP-ribosyl cyclase EC:3.2.2.5 (also known as cyclic ADP-ribose hydrolase or CD38) synthesizes cyclic-ADP ribose, a second messenger for glucose-induced insulin secretion.	0
415514	cl08354	AFOR_N	Aldehyde ferredoxin oxidoreductase, N-terminal domain. Enzymes of the aldehyde ferredoxin oxidoreductase (AOR) family contain a tungsten cofactor and a 4Fe4S cluster and catalyse the interconversion of aldehydes to carboxylates. This family includes AOR, formaldehyde ferredoxin oxidoreductase (FOR), and glyceraldehyde-3-phosphate ferredoxin oxidoreductase (GAPOR), all isolated from hyperthermophilic archaea; carboxylic acid reductase found in clostridia; and hydroxycarboxylate viologen oxidoreductase from Proteus vulgaris, the sole member of the AOR family containing molybdenum. GAPOR may be involved in glycolysis, but the functions of the other proteins are not yet clear. AOR has been proposed to be the primary enzyme responsible for oxidising the aldehydes that are produced by the 2-keto acid oxidoreductases.	0
415515	cl08356	TFIIA_gamma_C	Gamma subunit of transcription initiation factor IIA, C-terminal domain. Accurate transcription in vivo requires at least six general transcription initiation factors, in addition to RNA polymerase II. Transcription initiation factor IIA (TFIIA) is a multimeric protein which facilitates the binding of TFIID to the TATA box. The C-terminal domain of the gamma subunit is a 12 stranded beta-barrel.	0
415519	cl08380	CDC48_2	Cell division protein 48 (CDC48), domain 2. This domain has a double psi-beta barrel fold and includes VCP-like ATPase and N-ethylmaleimide sensitive fusion protein N-terminal domains. Both the VAT and NSF N-terminal functional domains consist of two structural domains of which this is at the C-terminus. The VAT-N domain found in AAA ATPases is a 185-residue substrate recognition domain.	0
415523	cl08398	mltA_B_like	Domain B insert of mltA_like lytic transglycosylases. This beta barrel domain is found inserted in MltA, a murein-degrading transglycosylase enzyme. This domain may be involved in peptidoglycan binding.	0
415528	cl08409	Gln-synt_N	Glutamine synthetase, beta-Grasp domain. 	0
415530	cl08418	TAF5_NTD2	TAF5_NTD2 is the second conserved N-terminal region of TATA Binding Protein (TBP) Associated Factor 5 (TAF5), involved in forming Transcription Factor IID (TFIID). This region is an all-alpha domain associated with the WD40 helical bundle of the TAF5 subunit of transcription factor TFIID. The domain has distant structural similarity to RNA polymerase II CTD interacting factors. It contains several conserved clefts that are likely to be critical for TFIID complex assembly. The TAF5 subunit is present twice in the TFIID complex and is critical for the function and assembly of the complex, and the NTD2 and N-terminal domain is crucial for homodimerization.	0
415534	cl08424	OBF_DNA_ligase_family	The Oligonucleotide/oligosaccharide binding (OB)-fold domain is a DNA-binding module that is part of the catalytic core unit of ATP dependent DNA ligases. This domain has an OB-like fold, but does not appear to be related to pfam03120. It is found at the C-terminus of the ATP dependent DNA ligase domain pfam01068.	0
415536	cl08426	AMPKBI	5'-AMP-activated protein kinase beta subunit, interaction domain. This region is found in the beta subunit of the 5'-AMP-activated protein kinase complex, and its yeast homologues Sip1, Sip2 and Gal83, which are found in the SNF1 kinase complex. This region is sufficient for interaction of this subunit with the kinase complex, but is not solely responsible for the interaction, and the interaction partner is not known. The isoamylase N-terminal domain is sometimes found in proteins belonging to this family.	0
415540	cl08444	CesT	Tir chaperone protein (CesT) family. Members of this family include YscB of Yersinia and functionally equivalent (but differently named) proteins from type III secretion systems of other pathogens that affect animal cells. YscB acts, along with SycN (TIGR02503), as a chaperone for YopN, a key part of a complex that regulates type III secretion so it responds to contact with the eukaryotic target cell.	0
415542	cl08447	DUF1214	Protein of unknown function (DUF1214). This family represents the C-terminal region of several hypothetical proteins of unknown function. Family members are mostly bacterial, but a few are also found in eukaryotes and archaea.	0
415548	cl08459	PA14	PA14 domain. The GLEYA domain is related to lectin-like binding domains found in the S. cerevisiae Flo proteins and the C. glabrata Epa proteins. It is a carbohydrate-binding domain that is found in fungal adhesins (also referred to as agglutinins or flocculins). Adhesins with a GLEYA domain possess a typical N-terminal signal peptide and a domain of conserved sequence repeats, but lack glycosylphosphatidylinositol (GPI) anchor attachment signals. They contain a conserved motif G(M/L)(E/A/N/Q)YA, hence the name GLEYA. Based on sequence homology, it is suggested that the GLEYA domain would predominantly contain beta sheets. The GLEYA domain is also found in S. pombe putative cell agglutination protein fta5, thought to be a kinetochore protein (Sim4 complex subunit); however, no direct evidence for kinetochore association has been found. Furthermore, a global protein localization study in S. pombe identified it as a secreted protein localized to the Golgi complex.	0
415550	cl08468	Leukocidin	Leukocidin/Hemolysin toxin family. This family of cytolytic pore-forming proteins includes alpha toxin and leukocidin F and S subunits from Staphylococcus aureus, hemolysin II of Bacillus cereus, and related toxins. [Cellular processes, Toxin production and resistance]	0
415552	cl08475	PIG-X	PIG-X / PBN1. Mammalian PIG-X and yeast PBN1 are essential components of glycosylphosphatidylinositol-mannosyltransferase I. These enzymes are involved in the transfer of sugar molecules.	0
415556	cl08488	ANAPC2	Anaphase promoting complex (APC) subunit 2. The anaphase promoting complex or cyclosome (APC2) is an E3 ubiquitin ligase which is part of the SCF family of ubiquitin ligases. Ubiquitin ligases catalyse the transfer of ubiquitin from the ubiquitin conjugating enzyme (E2), to the substrate protein.	0
415560	cl08497	Cas6_I-E	CRISPR/Cas system-associated RAMP superfamily protein Cas6e. This domain forms an anti-parallel beta strand structure with flanking alpha helical regions.	0
415561	cl08500	YtxC	YtxC-like family. This uncharacterized protein is part of a panel of proteins conserved in all known endospore-forming Firmicutes (low-GC Gram-positive bacteria), including Carboxydothermus hydrogenoformans, and nowhere else. [Cellular processes, Sporulation and germination]	0
415571	cl08520	Cdc6_C	Winged-helix domain of essential DNA replication protein Cell division control protein (Cdc6), which mediates DNA binding. The C terminal domain of CDC6 assumes a winged helix fold, with a five alpha-helical bundle (alpha15-alpha19) structure, backed on one side by three beta strands (beta6-beta8). It has been shown that this domain acts as a DNA-localization factor, however its exact function is, as yet, unknown. Putative functions include: (1) mediation of protein-protein interactions and (2) regulation of nucleotide binding and hydrolysis. Mutagenesis studies have shown that this domain is essential for appropriate Cdc6 activity.	0
415579	cl08531	ProRS-C_1	Prolyl-tRNA synthetase, C-terminal. Members of this family are predominantly found in prokaryotic prolyl-tRNA synthetase. They contain a zinc binding site, and adopt a structure consisting of alpha helices and antiparallel beta sheets arranged in 2 layers, in a beta-alpha-beta-alpha-beta motif.	0
415584	cl09098	Sortase	Sortase domain. The founder member of this family is S.aureus sortase, a transpeptidase that attaches surface proteins by the threonine of an LPXTG motif to the cell wall.	0
415585	cl09109	NTF2_like	N/A. This family contains a large number of proteins that share the SnoaL fold.	0
415586	cl09111	Prefoldin	N/A. This family comprises several prefoldin subunits. The biogenesis of the cytoskeletal proteins actin and tubulin involves interaction of nascent chains of each of the two proteins with the oligomeric protein prefoldin (PFD) and their subsequent transfer to the cytosolic chaperonin CCT (chaperonin containing TCP-1). Electron microscopy shows that eukaryotic PFD, which has a similar structure to its archaeal counterpart, interacts with unfolded actin along the tips of its projecting arms. In its PFD-bound state, actin seems to acquire a conformation similar to that adopted when it is bound to CCT.	0
415587	cl09113	cpn10	N/A. This family contains GroES and Gp31-like chaperonins. Gp31 is a functional co-chaperonin that is required for the folding and assembly of Gp23, a major capsid protein, during phage morphogenesis.	0
415588	cl09114	CRCB	CrcB-like protein, Camphor Resistance (CrcB). camphor resistance protein CrcB; Provisional	0
415589	cl09115	Ribosomal_L32p	Ribosomal L32p protein family. This protein describes bacterial ribosomal protein L32. The noise cutoff is set low enough to include the equivalent protein from mitochondria and chloroplasts. No related proteins from the Archaea nor from the eukaryotic cytosol are detected by this model. This model is a fragment model; the putative L32 of some species shows similarity only toward the N-terminus. [Protein synthesis, Ribosomal proteins: synthesis and modification]	0
415590	cl09123	SecG	Preprotein translocase SecG subunit. This family of proteins forms a complex with SecY and SecE. SecA then recruits the SecYEG complex to form an active protein translocation channel. [Protein fate, Protein and peptide secretion and trafficking]	0
415591	cl09125	ResB	ResB-like family. c-type cytochrome biogenesis protein; Validated	0
415592	cl09134	NurA	NurA domain. This family includes NurA a nuclease exhibiting both single-stranded endonuclease activity and 5'-3' exonuclease activity on single-stranded and double-stranded DNA from the hyperthermophilic archaeon Sulfolobus acidocaldarius.	0
415593	cl09139	FliE	Flagellar hook-basal body complex protein FliE. fliE is a component of the flagellar hook-basal body complex located possibly at (MS-ring)-rod junction. [Cellular processes, Chemotaxis and motility]	0
415594	cl09141	ACT	ACT domains are commonly involved in specifically binding an amino acid or other small ligand leading to regulation of the enzyme. The ACT domain is a structural motif of 70-90 amino acids that functions in the control of metabolism, solute transport and signal transduction. They are thus found in a variety of different proteins in a variety of different arrangements. In mammalian phenylalanine hydroxylase the domain forms no contacts but promotes an allosteric effect despite the apparent lack of ligand binding.	0
415595	cl09153	PhdYeFM_antitox	Antitoxin Phd_YefM, type II toxin-antitoxin system. This model recognizes a region of about 55 amino acids toward the N-terminal end of bacterial proteins of about 85 amino acids in length. The best-characterized member is prevent-host-death (phd) of bacteriophage P1, the antidote partner of death-on-curing (doc) (TIGR01550) in an addiction module. Addiction modules prevent plasmid curing by killing the host cell as the longer-lived killing protein persists while the gene for the shorter-lived antidote is lost. Note, however, that relatively few members of this family appear to be plasmid or phage-encoded. Also, there is little overlap, except for phage P1 itself, of species with this family and with the doc family. [Cellular processes, Toxin production and resistance, Mobile and extrachromosomal element functions, Other]	0
415596	cl09154	MrpF_PhaF	Multiple resistance and pH regulation protein F (MrpF / PhaF). putative monovalent cation/H+ antiporter subunit F; Reviewed	0
415597	cl09159	Imelysin-like	imelysin also called Peptidase M75. The imelysin peptidase was first identified in Pseudomonas aeruginosa. The active site residues have not been identified. However, His201 and Glu204 are completely conserved in the family and occur in an HXXE motif that is also found in family M14.	0
415598	cl09170	ATP-synt_I	ATP synthase I chain. F0F1 ATP synthase subunit I; Validated	0
415599	cl09173	Caa3_CtaG	Cytochrome c oxidase caa3 assembly factor (Caa3_CtaG). Members of this family are the CtaG protein required for assembly of active cytochrome c oxidase of the caa3 type, as in Bacillus subtilis.	0
415600	cl09176	FlgN	FlgN protein. This family includes the FlgN protein and export chaperone involved in flagellar synthesis.	0
415601	cl09182	DUF1009	Protein of unknown function (DUF1009). Family of uncharacterized bacterial proteins.	0
415602	cl09190	MAPEG	MAPEG family. This family has been called MAPEG (Membrane Associated Proteins in Eicosanoid and Glutathione metabolism). It includes proteins such as Prostaglandin E synthase. This enzyme catalyzes the synthesis of PGE2 from PGH2 (produced by cyclooxygenase from arachidonic acid). Because of structural similarities in the active sites of FLAP, LTC4 synthase and PGE synthase, substrates for each enzyme can compete with one another and modulate synthetic activity.	0
415603	cl09194	Sec61_beta	Sec61beta family. This family consists of homologs of Sec61beta - a component of the Sec61/SecYEG protein secretory system. The domain is found in eukaryotes and archaea and is possibly homologous to the bacterial SecG. It consists of a single putative transmembrane helix, preceded by a short stretch containing various charged residues; this arrangement may help determine orientation in the cell membrane.	0
415604	cl09208	Tim44	Tim44-like domain. Mba1 is an inner membrane protein that is part of the mitochondrial protein export machinery. It binds to the large subunit of mitochondrial ribosomes and cooperates with the C-terminal ribosome-binding domain of Oxa1, which is a central component of the insertion machinery of the inner membrane. In the absence of both Mba1 and the C-terminus of Oxa1, mitochondrial translation products fail to be properly inserted into the inner membrane and serve as substrates of the matrix chaperone Hsp70. It is proposed that Mba1 functions as a ribosome receptor that cooperates with Oxa1 in the positioning of the ribosome exit site to the insertion machinery of the inner membrane.	0
415605	cl09210	ROF	Modulator of Rho-dependent transcription termination (ROF). Rho-binding antiterminator; Provisional	0
415606	cl09211	Tagatose_6_P_K	Tagatose 6 phosphate kinase. Aldolases specific for D-tagatose-bisphosphate occur in distinct pathways in Escherichia coli and other bacteria, one for the degradation of galactitol (formerly dulcitol) and one for degradation of N-acetyl-galactosamine and D-galactosamine. This family represents a protein of both systems that behaves as a non-catalytic subunit of D-tagatose-bisphosphate aldolase, required both for full activity and for good stability of the aldolase. Note that members of this protein family appear in public databases annotated as putative tagatose 6-phosphate kinases, possibly in error. [Energy metabolism, Sugars]	0
385434	cl09219	DUF2208	Predicted membrane protein (DUF2208). This domain, found in various hypothetical archaeal proteins, has no known function.	0
415607	cl09232	YqaJ	YqaJ-like viral recombinase domain. This family includes various alkaline exonucleases from members of the herpesviridae. Alkaline exonuclease appears to have an important role in the replication of herpes simplex virus.	0
415608	cl09238	CY	N/A. SQAPI, aspartic acid inhibitor first isolated from squash, inhibits a wide range of aspartic proteinases. This particular family of PAAPIs (proteinaceous aspartic acid inhibitors) seems to have evolved quite recently from an ancestral cystatin. Structurally it consists of a four-stranded anti-parallel beta-sheet gripping an alpha-helix in much the same manner that a hand grips a tennis racket. The unstructured N-terminus and the loop connecting beta-strands 1 and 2 are important for pepsin inhibition, but the loop connecting strands 3 and 4 is not.	0
415612	cl09299	TSA	Type specific antigen. This protein is the immunodominant major cell surface protein of Orientia tsutsugamushi, known as "56-kDa type-specific antigen" or TSA56. It should not be confused with unrelated proteins TSA47 (a serine protease) or TSA22. An ortholog is found in Orientia chuto, and included in the seed alignment.	0
415618	cl09326	MATE_like	Multidrug and toxic compound extrusion family and similar proteins. Deletion of the mviN virulence gene in Salmonella enterica serovar Typhimurium greatly reduces virulence in a mouse model of typhoid-like disease. Open reading frames encoding homologs of MviN have since been identified in a variety of bacteria, including pathogens and non-pathogens and plant-symbionts. In the nitrogen-fixing symbiont Rhizobium tropici, mviN is required for motility. The MviN protein is predicted to be membrane-associated.	0
415635	cl09429	VirE2	VirE2. This family consists of several VirE2 proteins which seem to be specific to Agrobacterium tumefaciens and Rhizobium etli. VirE2 is known to interact, via its C-terminus, with VirD4. Agrobacterium tumefaciens transfers oncogenic DNA and effector proteins to plant cells during the course of infection. Substrate translocation across the bacterial cell envelope is mediated by a type IV secretion (TFS) system composed of the VirB proteins, as well as VirD4, a member of a large family of inner membrane proteins implicated in the coupling of DNA transfer intermediates to the secretion machine. VirE2 is therefore thought to be a protein substrate of a type IV secretion system which is recruited to a member of the coupling protein superfamily.	0
415642	cl09462	Coagulase	Staphylococcus aureus coagulase. The von Willebrand factor binding protein Vwb, like its paralog staphylocoagulase, is a coagulase and a virulence factor. It induces clotting, not by being an enzyme, but by activating prothrombin to generate fibrin.	0
415644	cl09506	catalase_like	Catalase-like heme-binding proteins and protein domains. Hydrogen peroxide is produced as a consequence of oxidative cellular metabolism and can be converted to the highly reactive hydroxyl radical via transition metals, this radical being able to damage a wide variety of molecules within a cell, leading to oxidative stress and cell death. Catalases act to neutralise hydrogen peroxide toxicity, and are produced by all aerobic organisms ranging from bacteria to man. Most catalases are mono-functional, haem-containing enzymes, although there are also bifunctional haem-containing peroxidase/catalases that are closely related to plant peroxidases, and non-haem, manganese-containing catalases that are found in bacteria.	0
415645	cl09511	FERM_B-lobe	FERM domain B-lobe. This domain is the central structural domain of the FERM domain.	0
415658	cl09607	Gly_reductase	Glycine/sarcosine/betaine reductase component B subunits. Members of this family are PrdD, encoded in the proline reductase gene cluster. Members are closely homologous to PrdA, which cleaves during maturation to create two of the subunits of the proline reductase complex, one of which has a Cys-derived pyruvoyl active site.	0
415659	cl09608	Cas7_I-E	CRISPR/Cas system-associated RAMP superfamily protein Cas7. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This family is represented by CT1975 of Chlorobium tepidum.	0
415664	cl09615	E1_UFD	Ubiquitin fold domain. This presumed domain found at the C terminus of Ubiquitin-activating enzyme e1 proteins is functionally uncharacterised.	0
415673	cl09633	NIL	NIL domain. This domain is likely to act as a substrate binding domain. The domain has been named after a conserved sequence in some members of the family.	0
415677	cl09641	T3SS_needle_F	Type III secretion needle MxiH, YscF, SsaG, EprI, PscF, EscF. type III secretion system needle protein SsaG; Provisional	0
415680	cl09645	Ftsk_gamma	Ftsk gamma domain. Mutated proteins with substitutions in the FtsK gamma DNA-recognition helix are impaired in DNA binding.	0
415681	cl09647	SARS-CoV_ORF9b	accessory protein 9b of severe acute respiratory syndrome-associated coronavirus and similar proteins. This is a family of proteins found in SARS coronavirus. The protein has a novel fold which forms a dimeric tent-like beta structure with an amphipathic surface, and a central hydrophobic cavity that binds lipid molecules. This cavity is likely to be involved in membrane attachment.	0
415684	cl09653	Btz	CASC3/Barentsz eIF4AIII binding. This domain is found on CASC3 (cancer susceptibility candidate gene 3 protein) which is also known as Barentsz (Btz). CASC3 is a component of the EJC (exon junction complex) which is a complex that is involved in post-transcriptional regulation of mRNA in metazoa. The complex is formed by the association of four proteins (eIF4AIII, Barentsz, Mago, and Y14), mRNA, and ATP. This domain wraps around eIF4AIII and stacks against the 5' nucleotide.	0
415700	cl09680	eIF3E	eukaryotic translation initiation factor 3 subunit E. This is the N terminal domain of subunit 6 translation initiation factor eIF3.	0
385541	cl09697	Saf-Nte_pilin	Saf-pilin pilus formation protein. Saf-pilin pilus formation protein SafA; Provisional	0
415716	cl09710	Type_III_YscX	Type III secretion system YscX (type_III_YscX). Members of this family are encoded within bacterial type III secretion gene clusters. Among all species with type III secretion, those with this protein are found among those that target animal rather than plant cells. The member of this family in Yersinia was shown by mutation to be required for type III secretion of Yops effector proteins and therefore is believed to be part of the secretion machinery. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	0
415717	cl09714	Flg_new	Listeria-Bacteroides repeat domain (List_Bact_rpt). This model describes a conserved core region, about 43 residues in length, of at least two families of tandem repeats. These include 78-residue repeats from 2 to 15 in number, in some proteins of Bacteroides forsythus ATCC 43037, and 70-residue repeats in families of internalins of Listeria species. Single copies are found in proteins of Fibrobacter succinogenes, Geobacter sulfurreducens, and a few bacteria. [Unknown function, General]	0
385547	cl09716	OrgA_MxiK	Bacterial type III secretion apparatus protein (OrgA_MxiK). This gene is found in type III secretion operons and has been shown to be essential for the invasion phenotype in Salmonella and a component of the secretion apparatus. The protein is known as OrgA in Salmonella due to its oxygen-dependent expression pattern in which low-oxygen levels up-regulate the gene. In Shigella the gene is called MxiK and has been shown to be essential for the proper assembly of the secretion needle complex.	0
415718	cl09719	Cse2_I-E	CRISPR/Cas system-associated protein Cse2. Clusters of short DNA repeats with non-homologous spacers, which are found at regular intervals in the genomes of phylogenetically distinct prokaryotic species, comprise a family with recognisable features. This family is known as CRISPR (short for Clustered, Regularly Interspaced Short Palindromic Repeats). A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This family of proteins, represented by CT1973 from Chlorobaculum tepidum, is encoded by genes found in the CRISPR/Cas subtype Ecoli regions of many bacteria (most of which are mesophiles), and not in Archaea. It is designated Cse2.	0
415719	cl09723	CbtB	Probable cobalt transporter subunit (CbtB). This model represents a family of proteins which have been proposed to act as cobalt transporters acting in concert with vitamin B12 biosynthesis systems. Evidence for this assignment includes 1) prediction of a single trans-membrane segment and a C-terminal histidine-rich motif likely to be a metal-binding site, 2) positional gene linkage with known B12 biosynthesis genes, 3) upstream proximity of B12 transcriptional regulatory sites, 4) the absence of other known cobalt import systems and 5) the obligate co-localization with a protein (CbtA) predicted to have five additional trans-membrane segments.	0
415720	cl09726	DUF2389	Tryptophan-rich protein (DUF2389). Members of this family are small hypothetical proteins of 60 to 100 residues from Cyanobacteria and some Proteobacteria. Prochlorococcus marinus strains have two members, other species one only. Interestingly, of the eight most conserved residues, four are aromatic and three are invariant tryptophans. It appears all species that encode this protein can synthesize tryptophan de novo.	0
415726	cl09741	Hypoth_Ymh	Protein of unknown function (Hypoth_ymh). This family consists of a relatively rare (~ 8 occurrences per 200 genomes) prokaryotic protein family. Genes for members appear to be associated variously with phage and plasmid regions, restriction system loci, transposons, and housekeeping genes. The function is unknown. [Hypothetical proteins, Domain]	0
415728	cl09743	RNA_lig_T4_1	RNA ligase. RNA ligase A; Provisional	0
415733	cl09752	Phg_2220_C	Conserved phage C-terminus (Phg_2220_C). This model represents the conserved C-terminal domain of a family of proteins found exclusively in bacteriophage and in bacterial prophage regions. The functions of this domain and the proteins containing it are unknown. [Mobile and extrachromosomal element functions, Prophage functions]	0
415734	cl09754	ATPase_gene1	Putative F0F1-ATPase subunit Ca2+/Mg2+ transporter. This model represents a protein found encoded in F1F0-ATPase operons in several genomes, including Methanosarcina barkeri (archaeal) and Chlorobium tepidum (bacterial). It is a small protein (about 100 amino acids) with long hydrophobic stretches and is presumed to be a subunit of the enzyme. [Energy metabolism, ATP-proton motive force interconversion]	0
415735	cl09771	Spore_III_AE	Stage III sporulation protein AE (spore_III_AE). A comparative genome analysis of all sequenced genomes shows a number of proteins conserved strictly among the endospore-forming subset of the Firmicutes. This protein, a member of this panel, is found in a spore formation operon and is designated stage III sporulation protein AE. [Cellular processes, Sporulation and germination]	0
415736	cl09775	Spore_II_R	Stage II sporulation protein R (spore_II_R). A comparative genome analysis of all sequenced genomes shows a number of proteins conserved strictly among the endospore-forming subset of the Firmicutes. This protein, a member of this panel, is designated stage II sporulation protein R. [Cellular processes, Sporulation and germination]	0
415739	cl09782	Cas6-I-III	CRISPR/Cas system-associated RAMP superfamily protein Cas6. The Cas6 Crispr family of proteins averaging 140 residues are characterized by having a GhGxxxxxGhG motif, where h indicates a hydrophobic residue, at the C-terminus. The CRISPR-Cas system is possibly a mechanism of defense against invading pathogens and plasmids that functions analogously to the RNA interference (RNAi) systems in eukaryotes.	0
415740	cl09783	Spore_YunB	Sporulation protein YunB (Spo_YunB). A comparative genome analysis of all sequenced genomes shows a number of proteins conserved strictly among the endospore-forming subset of the Firmicutes. Mutation of this sigma E-regulated gene, designated yunB, has been shown to cause a sporulation defect. [Cellular processes, Sporulation and germination]	0
415749	cl09801	Spore_YabQ	Spore cortex protein YabQ (Spore_YabQ). YabQ, a protein predicted to span the membrane several times, is found in exactly those genomes whose species perform sporulation in the style of Bacillus subtilis, Clostridium tetani, and others of the Firmicutes. Mutation of this sigma(E)-dependent gene blocks development of the spore cortex. The length of the C-terminal region, including some hydrophobic regions, is rather variable between members. [Cellular processes, Sporulation and germination]	0
415750	cl09807	Lin0512_fam	Conserved hypothetical protein (Lin0512_fam). This family consists of few members, broadly distributed. It occurs so far in several Firmicutes (twice in Oceanobacillus), one Cyanobacterium, one alpha Proteobacterium, and (with a long prefix) in plants. The function is unknown. The alignment includes a perfectly conserved motif GxGxDxHG near the N-terminus. [Hypothetical proteins, Conserved]	0
415751	cl09810	DUF2031	Protein of unknown function (DUF2031). This model represents a paralogous family of Plasmodium yoelii genes preferentially located in the subtelomeric regions of the chromosomes. There are no obvious homologs to these genes in any other organism.	0
415756	cl09819	DUF2459	Protein of unknown function (DUF2459). This conserved hypothetical protein of unknown function is found in several Proteobacteria. Its function is unknown and its genome context is not well-conserved. It is found amid urease genes in at least one species. [Hypothetical proteins, Conserved]	0
385588	cl09820	PhaP_Bmeg	Polyhydroxyalkanoic acid inclusion protein (PhaP_Bmeg). This model describes a protein found in polyhydroxyalkanoic acid (PHA) gene regions and incorporated into PHA inclusions in Bacillus cereus and Bacillus megaterium. The role of the protein may include amino acid storage (see McCool,G.J. and Cannon,M.C, 1999).	0
415757	cl09821	Fib_succ_major	Fibrobacter succinogenes major domain (Fib_succ_major). This domain of about 175 to 200 amino acids is found, in from one to five copies, in over 50 proteins in Fibrobacter succinogenes S85, an obligate anaerobe of the rumen. Many members of this family have an apparent lipoprotein signal sequence. Conserved cysteine residues, suggestive of disulfide bond formation, are also consistent with an extracytoplasmic location for this domain. This domain can also be found in small numbers of proteins in Chlorobium tepidum and Bacteroides thetaiotaomicron. [Cell envelope, Other]	0
415758	cl09823	Trep_Strep	Hypothetical bacterial integral membrane protein (Trep_Strep). This family consists of strongly hydrophobic proteins about 190 amino acids in length with a strongly basic motif near the C-terminus. It is found in rather few species, but in paralogous families of 12 members in the oral pathogenic spirochaete Treponema denticola and 2 in Streptococcus pneumoniae R6. [Transport and binding proteins, Unknown substrate]	0
415760	cl09826	Alph_Pro_TM	Putative transmembrane protein (Alph_Pro_TM). This family consists of predicted transmembrane proteins of about 270 amino acids. Members are found, so far, only among the Alphaproteobacteria and only once in each genome.	0
324500	cl09827	Csb2_I-U	CRISPR/Cas system-associated protein Csb2. This entry represents a rare CRISPR-associated protein. So far, members are found in Geobacter sulfurreducens and in two unpublished genomes: Gemmata obscuriglobus and Actinomyces naeslundii. CRISPR-associated proteins typically are found near CRISPR repeats and other CRISPR-associated proteins, have low levels of sequence identity, have sequence relationships that suggest lateral transfer, and show some sequence similarity to DNA-active proteins such as helicases and repair proteins.	0
415761	cl09829	Csy1_I-F	CRISPR/Cas system-associated protein Csy1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is a widespread family of prokaryotic direct repeats with spacers of unique sequence between consecutive repeats. This entry, typified by YPO2465 of Yersinia pestis, is a CRISPR-associated (Cas) entry strictly associated with the Ypest subtype of CRISPR/Cas locus. It is designated Csy1, for CRISPR/Cas Subtype Ypest protein 1.	0
415762	cl09832	Csy3_I-F	CRISPR/Cas system-associated RAMP superfamily protein Csy3. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is a widespread family of prokaryotic direct repeats with spacers of unique sequence between consecutive repeats. This entry, typified by YPO2463 of Yersinia pestis, is a CRISPR-associated (Cas) entry strictly associated with the Ypest subtype of CRISPR/Cas locus. It is designated Csy3, for CRISPR/Cas Subtype Ypest protein 3.	0
415763	cl09834	Csb1_I-U	CRISPR/Cas system-associated protein Csb1. This entry is found in CRISPR-associated (cas) proteins in the genomes of Geobacter sulfurreducens PCA and Desulfotalea psychrophila LSv54 (both Desulfobacterales from the Deltaproteobacteria), Gemmata obscuriglobus (a Planctomycete), and Actinomyces naeslundii MG1 (Actinobacteria).	0
415764	cl09835	Cas6_I-F	CRISPR/Cas system-associated RAMP superfamily protein Cas6f. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is a widespread family of prokaryotic direct repeats with spacers of unique sequence between consecutive repeats. This protein family, typified by YPO2462 of Yersinia pestis, is a CRISPR-associated (Cas) family strictly associated with the Ypest subtype of CRISPR/Cas locus. It is designated Csy4, for CRISPR/Cas Subtype Ypest protein 4.	0
415765	cl09837	Csx3_III-U	CRISPR/Cas system-associated protein Csx3. This entry is encoded in CRISPR-associated (cas) gene clusters, near CRISPR repeats, in the genomes of several different thermophiles: Archaeoglobus fulgidus (archaeal), Aquifex aeolicus (Aquificae), Dictyoglomus thermophilum (Dictyoglomi), and a thermophilic Synechococcus (Cyanobacteria). It is not yet assigned to a specific CRISPR/cas subtype (hence the x designation csx3).	0
299071	cl09838	LcrR	Type III secretion system regulator (LcrR). This protein is found in type III secretion operons and has been characterized in Yersinia as a regulator of the Low-Calcium Response (LCR). [Protein fate, Protein and peptide secretion and trafficking]	0
385597	cl09839	Csx1_III-U	CRISPR/Cas system-associated protein Csx1. Members of this minor CRISPR-associated (Cas) protein family are encoded in cas gene clusters in Vibrio vulnificus YJ016, Nitrosomonas europaea ATCC 19718, Mannheimia succiniciproducens MBEL55E, and Verrucomicrobium spinosum.	0
415777	cl09859	YopX	YopX protein. This model represents an uncharacterized, well-conserved family of proteins found in bacteriophage and prophage regions of Gram-positive bacteria. [Mobile and extrachromosomal element functions, Prophage functions, Hypothetical proteins, Conserved]	0
415781	cl09864	CHZ	Histone chaperone domain CHZ. This domain is highly conserved from yeasts to humans and is part of the chaperone protein HIRIP3 in vertebrates which interacts with the H3.3 chaperone HIRA, implicated in histone replacement during transcription. N- and C- termini of Chz family members are relatively divergent but do contain similar acidic stretches rich in Glu/Asp residues, characteristic of all histone chaperones.	0
415782	cl09865	PHA_gran_rgn	Putative polyhydroxyalkanoic acid system protein (PHA_gran_rgn). All members of this family are encoded by genes located among polyhydroxyalkanoic acid (PHA) biosynthesis and utilization genes, including proteins found at the surface of PHA granules. Examples so far are found in the Pseudomonadales, Xanthomonadales, and Vibrionales, all of which belong to the Gammaproteobacteria.	0
415783	cl09868	DUF2396	Protein of unknown function (DUF2396). Members of this family of conserved hypothetical proteins are found, so far, only in the Cyanobacteria. Members are about 170 amino acids long and share a motif CxxCx(14)CxxH near the amino end. [Hypothetical proteins, Conserved]	0
415784	cl09869	Nitr_red_assoc	Conserved nitrate reductase-associated protein (Nitr_red_assoc). Most members of this protein family are found in the Cyanobacteria, and these mostly near nitrate reductase genes and molybdopterin biosynthesis genes. We note that molybdopterin guanine dinucleotide is a cofactor for nitrate reductase. This protein is sometimes annotated as nitrate reductase-associated protein. Its function is unknown.	0
385616	cl09872	Cas8a2_I-A	CRISPR/Cas system-associated protein Csa8a2. Clusters of short DNA repeats with nonhomologous spacers, which are found at regular intervals in the genomes of phylogenetically distinct prokaryotic species, comprise a family with recognisable features. This family is known as CRISPR (short for Clustered, Regularly Interspaced Short Palindromic Repeats). A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This entry describes archaeal proteins encoded in cas gene regions.	0
415785	cl09873	Csm6_III-A	CRISPR/Cas system-associated protein Csm6. Clusters of short DNA repeats with nonhomologous spacers, which are found at regular intervals in the genomes of phylogenetically distinct prokaryotic species, comprise a family with recognisable features. This family is known as CRISPR (short for Clustered, Regularly Interspaced Short Palindromic Repeats). A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins.	0
415786	cl09875	DUF2398	Protein of unknown function (DUF2398). Members of this protein family belong to a conserved four-gene neighborhood found sporadically in a phylogenetically broad range of bacteria: Nocardia farcinica, Symbiobacterium thermophilum, and Streptomyces avermitilis (Actinobacteria), Geobacillus kaustophilus (Firmicutes), Azoarcus sp. EbN1 and Ralstonia solanacearum (Betaproteobacteria). [Hypothetical proteins, Conserved]	0
415788	cl09881	Spore_GerQ	Spore coat protein (Spore_GerQ). Members of this protein family are the spore coat protein GerQ of endospore-forming Firmicutes (low GC Gram-positive bacteria). This protein is cross-linked by a spore coat-associated transglutaminase. [Cellular processes, Sporulation and germination]	0
415789	cl09883	TrbC_Ftype	Type-F conjugative transfer system pilin assembly protein. conjugal transfer pilus assembly protein TrbC; Provisional	0
415790	cl09884	DUF2400	Protein of unknown function (DUF2400). Members of this uncharacterized protein family are found sporadically, so far only among spirochetes, epsilon and delta proteobacteria, and Bacteroides. The function is unknown and its gene neighborhoods show little conservation. [Hypothetical proteins, Conserved]	0
415791	cl09889	Phage_rep_org_N	N-terminal phage replisome organizer (Phage_rep_org_N). This model represents the N-terminal domain of a small family of phage proteins. The protein contains a region of low-complexity sequence that reflects DNA direct repeats able to function as an origin of phage replication. The region covered by this model is N-terminal to the low-complexity region. [Mobile and extrachromosomal element functions, Prophage functions]	0
415792	cl09890	Phage_holin_6_1	Bacteriophage holin of superfamily 6 (Holin_LLH). This model represents a putative phage holin from a number of phage and prophage regions of Gram-positive bacteria. Like other holins, it is small (about 100 amino acids) with stretches of hydrophobic sequence and is encoded adjacent to lytic enzymes. [Mobile and extrachromosomal element functions, Prophage functions]	0
415793	cl09891	Lactococcin_972	Bacteriocin (Lactococcin_972). This model represents bacteriocins related to lactococcin 972. Members tend to be found in association with a seven transmembrane putative immunity protein. [Cellular processes, Toxin production and resistance]	0
299108	cl09901	Gcw_chp	Bacterial protein of unknown function (Gcw_chp). This model represents a conserved hypothetical protein about 240 residues in length found so far in Proteobacteria including Shewanella oneidensis, Ralstonia solanacearum, and Colwellia psychrerythraea, usually as part of a paralogous family. The function is unknown.	0
415797	cl09903	Porph_ging	Protein of unknown function (Porph_ging). This protein family was first noted as a paralogous set in Porphyromonas gingivalis, but it is more widely distributed among the Bacteroidetes. The protein family is now renamed GLPGLI after its best-conserved motif.	0
415798	cl09906	Csa5_I-A	CRISPR/Cas system-associated protein Csa5. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This entry represents a minor family of Cas proteins found in various species of Sulfolobus and Pyrococcus (all archaeal). It is found with two different CRISPR loci in Sulfolobus solfataricus.	0
299112	cl09907	Cas8a2_I-A	CRISPR/Cas system-associated protein Csa8a2. CRISPR loci appear to be mobile elements with a wide host range. This entry represents a protein that tends to be found near CRISPR repeats. The species range for this family, so far, is exclusively archaeal. It is found so far in only four different species, and includes two tandem genes in Pyrococcus furiosus DSM 3638. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins.	0
299113	cl09912	Trep_dent_lipo	Treponema clustered lipoprotein (Trep_dent_lipo). This model represents a family of six predicted lipoproteins from a region of about 20 tandemly arranged genes in the Treponema denticola genome. Two other neighboring genes share the lipoprotein signal peptide region but do not show more extensive homology. The function of this locus is unknown.	0
415799	cl09913	Csn2_like	CRISPR/Cas system-associated protein Csn2. Cas_St_Csn2 is a family of Csn2 CRISPR-associated (Cas) proteins found in Firmicutes, largely Streptococcus and Enterococcus. CRISPR-associated (Cas) proteins are the main executioners of the process whereby prokaryotes acquire immunity against foreign genetic material. Cas allow short segments of this DNA, called spacer, to become incorporated into chromosomal loci as clustered regularly interspaced short palindromic repeats or CRISPRs; the resulting encoded RNAs are then processed into small fragments that guide the silencing of the invading genetic elements. Thus Cas are involved in the acquisition of new spacers. This family of St_Csn2 is longer than the canonical Csn2, pfam09711 through the addition of a large C-terminal domain. The central domain present in both families appears to be a channel that selectively interacts with dsDNA.	0
385628	cl09914	PHA_synth_III_E	Poly(R)-hydroxyalkanoic acid synthase subunit (PHA_synth_III_E). This model represents the PhaE subunit of the heterodimeric class (class III) of polymerase for poly(R)-hydroxyalkanoic acids (PHAs), carbon and energy storage polymers of many bacteria. The most common PHA is polyhydroxybutyrate but about 150 different constituent hydroxyalkanoic acids (HAs) have been identified in various species. This model must be designated subfamily to indicate the heterogeneity of PHAs. [Cellular processes, Adaptations to atypical conditions, Fatty acid and phospholipid metabolism, Biosynthesis]	0
415800	cl09915	A_thal_3526	Plant protein 1589 of unknown function (A_thal_3526). This model represents an uncharacterized plant-specific domain 57 residues in length. It is found toward the N-terminus of most proteins that contain it. Examples include at least 10 proteins from Arabidopsis thaliana and at least one from Oryza sativa.	0
415801	cl09916	Plasmod_dom_1	Plasmodium protein of unknown function (Plasmod_dom_1). hypothetical protein; Provisional	0
415802	cl09917	ETRAMP	Malarial early transcribed membrane protein (ETRAMP). This model describes a family of proteins from the malaria parasite Plasmodium falciparum, several of which have been shown to be expressed specifically in the ring stage as well as the rodent parasite Plasmodium yoelii. A homolog from Plasmodium chabaudi was localized to the parasitophorous vacuole membrane. Members have an initial hydrophobic, Phe/Tyr-rich stretch long enough to span the membrane, a highly charged region rich in Lys, a second putative transmembrane region, and a second highly charged, low complexity sequence region. Some members have up to 100 residues of additional C-terminal sequence. These genes have been shown to be found in the sub-telomeric regions of both P. falciparum and P. yoelii chromosomes.	0
415803	cl09918	CPW_WPC	Plasmodium falciparum domain of unknown function (CPW_WPC). The domain can be found in tandem repeats, and is known so far only in Plasmodium falciparum. It is named for motifs of CPxxW and (less well conserved) WPC. Its function is unknown.	0
415804	cl09920	C_GCAxxG_C_C	Putative redox-active protein (C_GCAxxG_C_C). This model represents a putative redox-active protein of about 140 residues, with four perfectly conserved Cys residues. It includes a CGAXXG motif. Most members are found within one or two loci of transporter or oxidoreductase genes. A member from Geobacter sulfurreducens, located in a molybdenum transporter operon, has a TAT (twin-arginine translocation) signal sequence for Sec-independent transport across the plasma membrane, a hallmark of bound prosthetic groups such as FeS clusters.	0
415805	cl09921	Unstab_antitox	Putative addiction module component. Members of this family are bacterial proteins, typically are about 75 amino acids long, always found as part of a pair (at least) of two small genes. The other in the pair always belongs to a subfamily of the larger family pfam05016 (although not necessarily scoring above the designated cutoff), which contains plasmid stabilization proteins. It is likely that this protein and its pfam05016 member partner comprise some form of addiction module, although these gene pairs usually are found on the bacterial main chromosome. [Mobile and extrachromosomal element functions, Other]	0
415806	cl09927	S1_like	N/A. This domain is found at the N-terminus of RsgA domains. It has an OB fold.	0
415807	cl09928	Molybdopterin-Binding	N/A. This model describes a subset of formate dehydrogenase alpha chains found mainly in proteobacteria but also in Aquifex. The alpha chain contains domains for molybdopterin dinucleotide binding and molybdopterin oxidoreductase (pfam01568 and pfam00384, respectively). The holo-enzyme also contains beta and gamma subunits of 32 and 20 kDa. The enzyme catalyzes the oxidation of formate (produced from pyruvate during anaerobic growth) to carbon dioxide with the concomitant release of two electrons and two protons. The electrons are utilized mainly in the nitrate respiration by nitrate reductase. In E. coli and Salmonella, there are two forms of the formate dehydrogenase, one induced by nitrate which is strictly anaerobic (fdn), and one induced during the transition from aerobic to anaerobic growth (fdo). This subunit is one of only three proteins in E. coli which contain selenocysteine. This model is well-defined, with a large, unpopulated trusted/noise gap. [Energy metabolism, Anaerobic, Energy metabolism, Electron transport]	0
415808	cl09929	MopB_CT	N/A. This domain is found in various molybdopterin-containing oxidoreductases and tungsten formylmethanofuran dehydrogenase subunit d (FwdD) and molybdenum formylmethanofuran dehydrogenase subunit (FmdD); where the domain constitutes almost the entire subunit. The formylmethanofuran dehydrogenase catalyzes the first step in methane formation from CO2 in methanogenic archaea and has a molybdopterin dinucleotide cofactor. This domain corresponds to the C-terminal domain IV in dimethyl sulfoxide (DMSO) reductase which interacts with the 2-amino pyrimidone ring of both molybdopterin guanine dinucleotide molecules.	0
415809	cl09930	RPA_2b-aaRSs_OBF_like	Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold. Replication protein A contains two OB domains in its DNA binding region. This is the second of the OB domains.	0
415810	cl09932	Acyl-CoA_dh_N	Acyl-CoA dehydrogenase, N-terminal domain. Acyl-coenzyme A oxidase consists of three domains. An N-terminal alpha-helical domain, a beta sheet domain (pfam02770) and a C-terminal catalytic domain (pfam01756). This entry represents the N-terminal alpha-helical domain.	0
415811	cl09933	ACAD	Acyl-CoA dehydrogenase. C-terminal domain of Acyl-CoA dehydrogenase is an all-alpha, four helical up-and-down bundle.	0
415812	cl09936	PP-binding	Phosphopantetheine attachment site. acyl carrier protein; Provisional	0
415813	cl09938	cond_enzymes	N/A. This domain is found on 3-Oxoacyl-[acyl-carrier-protein (ACP)] synthase III EC:2.3.1.180, the enzyme responsible for initiating the chain of reactions of the fatty acid synthase in plants and bacteria.	0
415814	cl09940	S4	N/A. This domain is found at the C-terminus of fungal tyrosyl-tRNA synthetases. It binds to group I introns.	0
415815	cl09943	Ribosomal_L29_HIP	N/A. This family represents the N-terminal region (approximately 8 residues) of the eukaryotic mitochondrial 39-S ribosomal protein L47 (MRP-L47). Mitochondrial ribosomal proteins (MRPs) are the counterparts of the cytoplasmic ribosomal proteins, in that they fulfil similar functions in protein biosynthesis. However, they are distinct in number, features and primary structure.	0
415816	cl09951	FN2	N/A. One of three types of internal repeat within the plasma protein, fibronectin. Also occurs in coagulation factor XII, 2 type IV collagenases, PDC-109, and cation-independent mannose-6-phosphate and secretory phospholipase A2 receptors. In fibronectin, PDC-109, and the collagenases, this domain contributes to collagen-binding function.	0
415817	cl09954	DUF202	Domain of unknown function (DUF202). This family consists of hypothetical proteins some of which are putative membrane proteins. No functional information or experimental verification of function is known. This domain is around 100 amino acids long.	0
415818	cl09957	zf-UBP	Zn-finger in ubiquitin-hydrolases and other protein. 	0
415819	cl09961	DUF1027	Protein of unknown function (DUF1027). This family consists of several hypothetical bacterial proteins of unknown function.	0
415820	cl09962	DUF771	Domain of unknown function (DUF771). Family of uncharacterized ORFs found in Bacteriophage and Lactococcus lactis.	0
415822	cl10011	Periplasmic_Binding_Protein_type1	Type 1 periplasmic binding fold superfamily. This family includes a diverse range of periplasmic binding proteins.	0
415823	cl10012	DnaQ_like_exo	DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily. This is a highly divergent 3' exoribonuclease family. The proteins constitute a typical RNase fold, where the active site residues form a magnesium catalytic centre. The protein of the solved structure readily cleaves 3' overhangs in a time-dependent manner. It is similar to DEDD-type RNases and is an unusual ATP-binding protein that binds ATP and dATP. It forms a dimer in solution and both protomers in the asymmetric unit bind a magnesium ion through Asp-6 in UniProtKB:P9WJ73.	0
415824	cl10013	Glycosyltransferase_GTB-type	glycosyltransferase family 1 and related proteins with GTB topology. Asp1, along with SecY2, SecA2, and other proteins forms part of the accessory secretory protein system. The system is involved in the export of serine-rich glycoproteins important for virulence in a number of Gram-positive species, including Streptococcus gordonii and Staphylococcus aureus. This protein family is assigned to transport rather than glycosylation function, but the specific molecular role is unknown. Asp1 is predicted to be cytosolic.	0
415825	cl10014	PTS_IIB	N/A. The bacterial phosphoenolpyruvate: sugar phosphotransferase system (PTS) is a multi-protein system involved in the regulation of a variety of metabolic and transcriptional processes. The lactose/cellobiose-specific family are one of four structurally and functionally distinct group IIB PTS system cytoplasmic enzymes. The fold of IIB cellobiose shows similar structure to mammalian tyrosine phosphatases. This family also contains the fructose specific IIB subunit.	0
415826	cl10015	YjgF_YER057c_UK114_family	N/A. YjgF_Endoribonuc is a putative endoribonuclease. The structure is of beta-alpha-beta-alpha-beta(2) domains common both to bacterial chorismate mutase and to members of the YjgF family. These proteins form trimers with a three-fold symmetry with three closely-packed beta-sheets. The YjgF family is a large, widely distributed family of proteins of unknown biochemical function that are highly conserved among eubacteria, archaea and eukaryotes.	0
415827	cl10017	Tubulin_FtsZ_Cetz-like	Tubulin protein family of FtsZ and CetZ-like. Many of the residues conserved in Tubulin, pfam00091, are also highly conserved in this family.	0
415828	cl10019	PurM-like	N/A. This family includes Hydrogen expression/formation protein HypE, AIR synthases EC:6.3.3.1, FGAM synthase EC:6.3.5.3 and selenide, water dikinase EC:2.7.9.3. The function of the C-terminal domain of AIR synthase is unclear, but the cleft formed between N and C domains is postulated as a sulphate binding site.	0
415829	cl10020	S2P-M50	N/A. This is a family of bacterial and plant peptidases in the same family as MEROPS:M50B.	0
415830	cl10022	ABM	Antibiotic biosynthesis monooxygenase. The function of this family is unknown, but it is upregulated in response to salt stress in Populus balsamifera. It is also found at the C-terminus of a fructose 1,6-bisphosphate aldolase from Hydrogenophilus thermoluteolus. Arthrobacter nicotinovorans ORF106 is found in the pA01 plasmid, which encodes genes for molybdopterin uptake and degradation of plant alkaloid nicotine. The structure of one has been solved and the domain forms an a/b barrel dimer. Although there is a clear duplication within the domain it is not obviously detectable in the sequence.	0
353046	cl10023	POLBc	N/A. DNA polymerase subunit B; Provisional	0
415831	cl10029	Histidinol_dh	N/A. histidinol dehydrogenase; Reviewed	0
415832	cl10030	MECDP_synthase	N/A. The ygbB protein is a putative enzyme of deoxy-xylulose pathway (terpenoid biosynthesis).	0
415833	cl10031	DUF1190	Protein of unknown function (DUF1190). hypothetical protein	0
415834	cl10037	AroH	N/A. Chorismate mutase EC:5.4.99.5 catalyzes the conversion of chorismate to prephenate in the pathway of tyrosine and phenylalanine biosynthesis. This enzyme is negatively regulated by tyrosine, tryptophan and phenylalanine.	0
415835	cl10043	hemP	Hemin uptake protein hemP. hypothetical protein; Provisional	0
415836	cl10045	tRNA_int_endo	tRNA intron endonuclease, catalytic C-terminal domain. tRNA-splicing endonuclease subunit beta; Reviewed	0
415837	cl10048	TonB_C	Gram-negative bacterial TonB protein C-terminal. This family contains TonB members that are not captured by pfam03544.	0
415838	cl10072	Phage_Mu_F	Phage Mu protein F like protein. Family of related phage minor capsid proteins.	0
415839	cl10080	RPE65	Retinal pigment epithelial membrane protein. 9-cis-epoxycarotenoid dioxygenase	0
415840	cl10125	DUF3461	Protein of unknown function (DUF3461). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are about 130 amino acids in length. This protein has two conserved sequence motifs: KFK and HLE.	0
415841	cl10143	DUF5431	Family of unknown function (DUF5431). modulator of post-segregation killing protein; Provisional	0
415842	cl10149	TraW_N	Sex factor F TraW protein N terminal. This protein is an essential component of the F-type conjugative transfer system for plasmid DNA transfer and has been shown to be localized to the periplasm.	0
415843	cl10177	DUF5455	Family of unknown function (DUF5455). minor coat protein	0
353058	cl10198	DUF5466	Family of unknown function (DUF5466). hypothetical protein	0
353059	cl10201	O_Spanin_T7	outer-membrane spanin sub-unit. phage lambda Rz1-like protein	0
415844	cl10205	Tube	Tail tubular protein. tail tubular protein A	0
385669	cl10212	DUF5476	Family of unknown function (DUF5476). hypothetical protein	0
353062	cl10214	TA_inhibitor	Inhibitor of toxin/antitoxin system (Gp4.5). hypothetical protein	0
353063	cl10215	DUF5471	Family of unknown function (DUF5471). hypothetical protein	0
353064	cl10223	DUF5480	Family of unknown function (DUF5480). hypothetical protein	0
353065	cl10228	p6	Histone-like Protein p6. dsDNA binding protein	0
415845	cl10256	YecR	YecR-like lipoprotein. hypothetical protein	0
353066	cl10264	DUF5493	Family of unknown function (DUF5493). hypothetical protein	0
353067	cl10269	DUF5517	Family of unknown function (DUF5517). hypothetical protein	0
353068	cl10273	DUF5489	Family of unknown function (DUF5489). hypothetical protein	0
385670	cl10291	DUF2523	Protein of unknown function (DUF2523). putative minor coat protein	0
353069	cl10305	Gp17	Superinfection exclusion protein, bacteriophage P22. hypothetical protein	0
353070	cl10308	Phi29_Phage_SSB	Phage Single-stranded DNA-binding protein. hypothetical protein	0
385671	cl10335	Phage_TAC_12	Phage tail assembly chaperone protein, TAC. hypothetical protein	0
415846	cl10351	Phage_gp49_66	Phage protein (N4 Gp49/phage Sf6 gene 66) family. hypothetical protein	0
415847	cl10447	GH18_chitinase-like	N/A. This DUF is likely to be a form of glycosyl hydrolase from CAZy family 18, possibly chitinase 18. This would have the EC number of EC:3.2.1.14.	0
415848	cl10448	GH25_muramidase	N/A. This domain is found in a set of uncharacterized hypothetical bacterial proteins.	0
415849	cl10459	Peptidases_S8_S53	Peptidase domain in the S8 and S53 families. Subtilases are a family of serine proteases. They appear to have independently and convergently evolved an Asp/Ser/His catalytic triad, like that found in the trypsin serine proteases (see pfam00089). Structure is an alpha/beta fold containing a 7-stranded parallel beta sheet, order 2314567.	0
415851	cl10465	Peptidase_S24_S26	N/A. The C-terminal domain of the CI repressor functions in oligomer formation.	0
415853	cl10468	TerC	Integral membrane protein TerC family. Predicted to be an integral membrane protein with multiple membrane spans.	0
415854	cl10470	Rick_17kDa_Anti	Glycine zipper 2TM domain. hypothetical protein; Provisional	0
415855	cl10471	LU	N/A. UPAR_LY6_2 is a family of higher eukaryotic proteins expressed in neurons. It modulates nicotinic acetylcholine receptors by selectively increasing Ca2+-influx through this ion channel. The family carries an LU protein domain, about 80 amino acids long, characterized by a conserved pattern of 10 cysteine residues. The family is a positive feedback regulator of Wnt/beta-catenin signalling, e.g. for patterning of the mesoderm and neuroectoderm in zebrafish gastrulation, where Lypd6 is GPI-anchored to the plasma membrane and interacts with the Wnt receptor Frizzled8 and the co-receptor Lrp6.	0
415856	cl10479	DUF413	Protein of unknown function, DUF. hypothetical protein; Provisional	0
415857	cl10480	DUF2157	Predicted membrane protein (DUF2157). This domain, found in various hypothetical prokaryotic proteins, has no known function.	0
415858	cl10492	DUF596	Protein of unknown function, DUF596. This family contains several uncharacterized proteins.	0
415859	cl10501	DUF1223	Protein of unknown function (DUF1223). This family consists of several hypothetical proteins of around 250 residues in length which are found in both plants and bacteria. The function of this family is unknown. Structurally it lies in the Thioredoxin-like superfamily.	0
415860	cl10502	lipocalin_FABP	lipocalin/cytosolic fatty acid-binding protein family. This domain forms a beta barrel structure but the function is unknown. The GO annotation for this protein indicates that the protein has a function in nematode larval development and has a positive regulation on growth rate.	0
415861	cl10503	DUF1737	Domain of unknown function (DUF1737). This domain of unknown function is found at the N-terminus of bacterial and viral hypothetical proteins.	0
299184	cl10504	DUF975	Protein of unknown function (DUF975). Family of uncharacterized bacterial proteins.	0
415862	cl10507	Disintegrin	Disintegrin. Snake disintegrins inhibit the binding of ligands to integrin receptors. They contain a 'RGD' sequence, identical to the recognition site of many adhesion proteins. Molecules containing both disintegrin and metalloprotease domains are known as ADAMs.	0
415863	cl10509	PAW	PNGase C-terminal domain, mannose-binding module PAW. present in several copies in proteins with unknown function in C. elegans	0
415864	cl10511	Beach	N/A. The BEACH domain was described in the BEIGE protein (D1035670) and in the highly homologous CHS protein. The BEACH domain is usually followed by a series of WD repeats. The function of the BEACH domain is unknown.	0
415869	cl10557	Dak1	Dak1 domain. Two types of dihydroxyacetone kinase (glycerone kinase) are described. In yeast and a few bacteria, e.g. Citrobacter freundii, the enzyme is a single chain that uses ATP as phosphoryl donor and is designated EC 2.7.1.29. By contrast, E. coli and many other bacterial species have a multisubunit form (EC 2.7.1.-) with a phosphoprotein donor related to PTS transport proteins. This family represents the DhaK subunit of the latter type of dihydroxyacetone kinase, but it specifically excludes the DhaK paralog DhaK2 (TIGR02362) found in the same operon as DhaK and DhaK in the Firmicutes.	0
415871	cl10571	GT_MraY-like	N/A. phospho-N-acetylmuramoyl-pentapeptide-transferase; Provisional	0
385701	cl10591	Bro-N	BRO family, N-terminal domain. This family includes the N-terminus of baculovirus BRO and ALI motif proteins. The function of BRO proteins is unknown. It has been suggested that BRO-A and BRO-C are DNA binding proteins that influence host DNA replication and/or transcription. This Pfam domain does not include the characteristic invariant alanine, leucine, isoleucine motif of the ALI proteins.	0
415878	cl10615	sm_acid_XPC-like	small acidic domain of Xeroderma pigmentosum group C complementing protein and similar proteins. This model represents the small acidic domain of mammalian Xeroderma pigmentosum group C complementing protein (XPC), yeast Rad4, and similar proteins. XPC/Rad4 recruits transcription/repair factor IIH (TFIIH) to the nucleotide excision repair (NER) complex through interactions with its p62/Tfb1 and XPB/Ssl2 TFIIH subunits. Global genome repair (GGR), one of two NER initiation pathways in mammals, starts with DNA lesion detection by XPC. XPC is a structure specific DNA-binding factor that recognizes distortion of the damaged DNA double helix and recruits the TFIIH complex onto the lesion to open up the damaged DNA. The small acidic domain of XPC/Rad4 interacts with the pleckstrin homology (PH) domain of the p62/Tfb1 subunit of TFIIH.	0
415890	cl10701	FIST	FIST N domain. The FIST N domain is a novel sensory domain, which is present in signal transduction proteins from Bacteria, Archaea and Eukarya. Chromosomal proximity of FIST-encoding genes to those coding for proteins involved in amino acid metabolism and transport suggest that FIST domains bind small ligands, such as amino acids.	0
415894	cl10713	Phage_pRha	Phage regulatory protein Rha (Phage_pRha). Members of this protein family are found in temperate phage and bacterial prophage regions. Members include the product of the rha gene of the lambdoid phage phi-80, a late operon gene. The presence of this gene interferes with infection of bacterial strains that lack integration host factor (IHF), which regulates the rha gene. It is suggested that pRha is a phage regulatory protein. [Mobile and extrachromosomal element functions, Prophage functions]	0
415896	cl10717	CactinC_cactus	Cactus-binding C-terminus of cactin protein. SF3A2 is one of the components of the SF3a splicing factor complex of the mature U2 snRNP (small nuclear ribonucleoprotein particle). In yeast, SF3a shows a bifurcated assembly structure of three subunits, Prp9 (subunit 3), Prp11 (subunit 2) and Prp21 (subunit 1), with Prp21 wrapping around Prp11.	0
415902	cl10727	E3_UFM1_ligase	E3 UFM1-protein ligase 1. E3 UFM1-protein ligase 1 homolog; Provisional	0
415925	cl10767	AD	Anticodon-binding domain. This domain of approximately 100 residues is conserved from plants to humans. It is frequently found in association with Lsm domain-containing proteins.	0
415965	cl10870	NDFIP-like	NEDD4 family-interacting protein. The NEDD4 (neural precursor cell expressed, developmentally down-regulated protein 4)-family interacting proteins (NDFIPs) are adaptor proteins that recruit NEDD4 E3 ligases to specific substrate proteins, which leads to the ubiquitylation and subsequent degradation of these proteins. They also act as activators of the E3 ligase activity by releasing NEDD4 ligase from its auto-inhibitory conformation. NDFIP2 may play a role in protein trafficking.	0
415977	cl10889	Cir_N	N-terminal domain of CBF1 interacting co-repressor CIR. This is a 45 residue conserved region at the N-terminal end of a family of proteins referred to as CIRs (CBF1-interacting co-repressors). CBF1 (centromere-binding factor 1) acts as a transcription factor that causes repression by binding specifically to GTGGGAA motifs in responsive promoters, and it requires CIR as a co-repressor. CIR binds to histone deacetylase and to SAP30 and serves as a linker between CBF1 and the histone deacetylase complex.	0
415994	cl10918	Cg6151-P	Uncharacterized conserved protein CG6151-P. This is a family of small, less than 200 residue long, proteins which are named as CG6151-P proteins that are conserved from fungi to humans. The function is unknown. The fungal members have a characteristic ICP sequence motif. Some members are annotated as putative clathrin-coated vesicle protein but this could not be defined.	0
416027	cl10970	AP_MHD_Cterm	C-terminal domain of adaptor protein (AP) complexes medium mu subunits and its homologs (MHD). The muniscins are a family of endocytic adaptors that is conserved from yeast to humans. This C-terminal domain is structurally similar to mu homology domains, and is the region of the muniscin proteins involved in the interactions with the endocytic adaptor-scaffold proteins Ede1-eps15. This interaction influences muniscin localization. The muniscins provide a combined adaptor-membrane-tubulation activity that is important for regulating endocytosis.	0
416063	cl11037	EKR	Domain of unknown function. EKR is a short, 33 residue, domain found in bacterial and some lower eukaryotic species which lies between a POR (pyruvate ferredoxin/flavodoxin oxidoreductase) and the 4Fe-4S binding domain Fer4. It contains a characteristic EKR sequence motif. The exact function of this domain is not known.	0
416081	cl11062	BHD_1	Rad4 beta-hairpin domain 1. This short domain is found in the Rad4 protein. This domain binds to DNA.	0
416082	cl11063	BHD_2	Rad4 beta-hairpin domain 2. This short domain is found in the Rad4 protein. This domain binds to DNA.	0
416083	cl11065	TAF8	TATA Binding Protein (TBP) Associated Factor 8. This is the C-terminal, Delta, part of the TAF8 protein. The N-terminal is generally the histone fold domain, Bromo_TP (pfam07524). TAF8 is one of the key subunits of the transcription factor for pol II, TFIID. TAF8 is one of the several general cofactors which are typically involved in gene activation to bring about the communication between gene-specific transcription factors and components of the general transcription machinery.	0
416092	cl11081	dermokine	dermokine. This region has been called the argonaute hook. It has been shown to bind to the Piwi domain pfam02171 of Argonaute proteins.	0
416130	cl11158	BEN	BEN domain. hypothetical protein; Provisional	0
416136	cl11171	Dev_Cell_Death	Development and cell death domain. The domain is shared by several proteins in the Arabidopsis and the rice genomes, which otherwise show a different protein architecture. Biological studies indicate a role of these proteins in phytohormone response, embryo development and programmed cell death by pathogens or ozone.	0
416146	cl11186	Cullin_Nedd8	Cullin protein neddylation domain. This is the neddylation site of cullin proteins which are a family of structurally related proteins containing an evolutionarily conserved cullin domain. With the exception of APC2, each member of the cullin family is modified by Nedd8 and several cullins function in Ubiquitin-dependent proteolysis, a process in which the 26S proteasome recognises and subsequently degrades a target protein tagged with K48-linked poly-ubiquitin chains. Cullins are molecular scaffolds responsible for assembling the ROC1/Rbx1 RING-based E3 ubiquitin ligases, of which several play a direct role in tumorigenesis. Nedd8/Rub1 is a small ubiquitin-like protein, which was originally found to be conjugated to Cdc53, a cullin component of the SCF (Skp1-Cdc53/CUL1-F-box protein) E3 Ub ligase complex in Saccharomyces cerevisiae, and Nedd8 modification has now emerged as a regulatory pathway of fundamental importance for cell cycle control and for embryogenesis in metazoans. The only identified Nedd8 substrates are cullins. Neddylation results in covalent conjugation of a Nedd8 moiety onto a conserved cullin lysine residue.	0
416154	cl11198	zinc_ribbon_2	zinc-ribbon domain. This family consists of a single zinc ribbon domain, ie half of a pair as in family DZR. pfam12773.	0
416181	cl11253	Germane	Sporulation and spore germination. The GerMN domain is a region of approximately 100 residues that is found, duplicated, in the Bacillus GerM protein and is implicated in both sporulation and spore germination. The domain is found in a number of different bacterial species both alone and in association with other domains such as Amidase_3 pfam01520 Gmad1 and Gmad2. It is predicted to have a novel alpha-beta fold.	0
299614	cl11266	EssA	WXG100 protein secretion system (Wss), protein EssA. Members of this family are associated with type VII secretion of WXG100 family targets in the Firmicutes, but not in the Actinobacteria. This highly divergent protein family consists largely of a central region of highly polar low-complexity sequence containing occasional LF motifs in weak repeats about 17 residues in length, flanked by hydrophobic N- and C-terminal regions. [Protein fate, Protein and peptide secretion and trafficking]	0
416196	cl11278	DUF2492	Protein of unknown function (DUF2492). This model describes a family of small cytosolic proteins, about 80 amino acids in length, in which the eight invariant residues include three His residues and two Cys residues. Two pairs of these invariant residues occur in motifs HxH (where x is A or G) and CxH, both of which suggest metal-binding activity. This protein family was identified by searching with a phylogenetic profile based on an anaerobic sulfatase-maturase enzyme, which contains multiple 4Fe-4S clusters. The linkages by phylogenetic profiling and by iron-sulfur cluster-related motifs together suggest this protein may be an accessory protein to certain maturases in sulfatase/maturase systems.	0
416236	cl11367	EspA_EspE	EspA/EspE family. This family of mycobacterial proteins are uncharacterized.	0
416244	cl11377	NADH-u_ox-rdase	NADH-ubiquinone oxidoreductase complex I, 21 kDa subunit. complex I subunit	0
416253	cl11393	Peptidase_M14_like	M14 family of metallocarboxypeptidases and related proteins. This is the peptidase domain of a D,L-carboxypeptidase. The active site residues are Arg86, Glu222 and the metal ligands, in the peptidase domain, are Gln46, Glu49 and His128 in UniProtKB:O25708. The protein binds many zinc ions and a calcium ion and there are other metal binding sites. The catalytic activity is the release of m-Dpm from the peptide muramyl-Ala-gamma-D-Glu-m-Dpm; this is probably the precursor of the cell wall cross-linking peptide.	0
416254	cl11394	Glyco_tranf_GTA_type	Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold. Members of this family of prokaryotic proteins include putative glucosyltransferases, which are involved in bacterial capsule biosynthesis.	0
416255	cl11395	Pkinase_C	Protein kinase C terminal domain. 	0
416256	cl11396	Patatin_and_cPLA2	Patatins and Phospholipases. This family consists of various patatin glycoproteins from plants. The patatin protein accounts for up to 40% of the total soluble protein in potato tubers. Patatin is a storage protein but it also has the enzymatic activity of lipid acyl hydrolase, catalyzing the cleavage of fatty acids from membrane lipids. Members of this family have been found also in vertebrates.	0
416257	cl11397	NR_LBD	The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators. This all helical domain is involved in binding the hormone in these receptors.	0
416258	cl11399	HP	Histidine phosphatase domain found in a functionally diverse set of proteins, mostly phosphatases; contains a His residue which is phosphorylated during the reaction. The histidine phosphatase superfamily is so named because catalysis centers on a conserved His residue that is transiently phosphorylated during the catalytic cycle. Other conserved residues contribute to a 'phosphate pocket' and interact with the phospho group of substrate before, during and after its transfer to the His residue. Structure and sequence analyses show that different families contribute different additional residues to the 'phosphate pocket' and, more surprisingly, differ in the position, in sequence and in three dimensions, of a catalytically essential acidic residue. The superfamily may be divided into two main branches. The smaller branch 2 contains predominantly eukaryotic proteins. The catalytic functions in members include phytase, glucose-1-phosphatase and multiple inositol polyphosphate phosphatase. The in vivo roles of the mammalian acid phosphatases in branch 2 are not fully understood, although activity against lysophosphatidic acid and tyrosine-phosphorylated proteins has been demonstrated.	0
416259	cl11403	pepsin_retropepsin_like	Cellular and retroviral pepsin-like aspartate proteases. The N- and C-termini of the members of this family are jointly necessary for creating the catalytic pocket necessary for cleaving xylanase. Phytopathogens produce xylanase that destroys plant cells, so its destruction through proteolysis is vital for plant-survival.	0
416260	cl11404	Biotinyl_lipoyl_domains	N/A. HlyD_D4 is the long alpha-hairpin domain in the centre of CusB or HlyD proteins. CusB and HlyD proteins are membrane fusion proteins of the CusCFBA copper efflux system in E.coli and related bacteria. Efflux systems of this resistance-nodulation-division group - RND - have been developed to excrete poisonous metal ions, and in E.coli the only one that deals with silver and copper is the CusA transporter. The transporter CusA works in conjunction with a periplasmic component that is a membrane fusion protein, eg CusB, and an outer-membrane channel component CusC in a CusABC complex driven by import of protons. HlyD_D4 is thought to interact with the alpha-helical tunnels of the corresponding outer-membrane channels, ie the periplasmic domain of CusC.	0
416261	cl11409	RNAP_RPB11_RPB3	RPB11 and RPB3 subunits of RNA polymerase. The two eukaryotic subunits Rpb3 and Rpb11 dimerize to form a platform onto which the other subunits of the RNA polymerase assemble (D/L in archaea). The prokaryotic equivalent of the Rpb3/Rpb11 platform is the alpha-alpha dimer. The dimerization domain of the alpha subunit/Rpb3 is interrupted by an insert domain (pfam01000). Some of the alpha subunits also contain iron-sulphur binding domains (pfam00037). Rpb11 is found as a continuous domain. Members of this family include: alpha subunit from eubacteria, alpha subunits from chloroplasts, Rpb3 subunits from eukaryotes, Rpb11 subunits from eukaryotes, RpoD subunits from archaeal spp, and RpoL subunits from archaeal spp. Many of the members of this family carry only the N-terminal region of Rpb11.	0
416262	cl11410	TPP_enzyme_PYR	Pyrimidine (PYR) binding domain of thiamine pyrophosphate (TPP)-dependent enzymes. Bacterial enzyme splits fructose-6-P and/or xylulose-5-P with the aid of inorganic phosphate into either acetyl-P and erythrose-4-P and/or acetyl-P and glyceraldehyde-3-P EC:4.1.2.9, EC:4.1.2.22. This family is distantly related to transketolases e.g. pfam02779.	0
416263	cl11421	FAA_hydrolase	Fumarylacetoacetate (FAA) hydrolase family. This bacterial family of proteins has no known function.	0
416264	cl11423	VirB9_CagX_TrbG	VirB9/CagX/TrbG, a component of the type IV secretion system. This family includes type IV secretion system CagX conjugation protein. Other members of this family are involved in conjugal transfer to plant cells of T-DNA.	0
416265	cl11424	nitrilase	Nitrilase superfamily, including nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes. This family contains hydrolases that break carbon-nitrogen bonds. The family includes: Nitrilase EC:3.5.5.1, Aliphatic amidase EC:3.5.1.4, Biotinidase EC:3.5.1.12, Beta-ureidopropionase EC:3.5.1.6. Nitrilase-related proteins generally have a conserved E-K-C catalytic triad, and are multimeric alpha-beta-beta-alpha sandwich proteins.	0
416266	cl11425	PSI_PSAK	Photosystem I psaG / psaK. Members of this protein family are the PsaK of the photosystem I reaction center. Photosystems I and II occur together in the same sets of organisms. Photosystem I uses light energy to transfer electrons from plastocyanin to ferredoxin, while photosystem II uses light energy to split water and releases molecular oxygen. [Energy metabolism, Photosynthesis]	0
416267	cl11433	DivIC	Septum formation initiator. In Escherichia coli, nine gene products are known to be essential for assembly of the division septum. One of these, FtsL, is a bitopic membrane protein whose precise function is not understood. It has been proposed that FtsL interacts with the DivIC protein pfam04977, however this interaction may be indirect.	0
353252	cl11434	AlkD_like	A new structural DNA glycosylase. This domain represents a new and uncharacterized structural superfamily of DNA glycosylases that form an alpha-alpha superhelix fold and do not belong to the identified five structural DNA glycosylase superfamilies (UDG, AAG/MNPG, MutM/Fpg and helix-hairpin-helix). DNA glycosylases removing alkylated base residues have been identified in all organisms investigated and may be universally present in nature. DNA glycosylases catalyze the first step in the Base Excision Repair (BER) pathway by cleaving damaged DNA bases within double strand DNA to produce an abasic site. The resulting abasic site is further processed by AP endonuclease, phosphodiesterase, DNA polymerases, and DNA ligase functions to restore the DNA to an undamaged state. All glycosylases examined to date utilize a similar strategy for binding DNA and base flipping despite their structural diversity. The known structures for members of this family, AlkC and AlkD from Bacillus cereus, are distant homologues and are composed of six variant HEAT (Huntington/Elongation/A subunit/Target of rapamycin) repeats. HEAT motifs are ~45-amino acid sequences that form antiparallel alpha-helices, which are packed by a conserved hydrophobic interface and are tandemly repeated to form superhelical alpha-structures. AlkD and AlkC are specific for removal of 3-methyladenine (3mA) and 7-methylguanine (7mG) from the DNA by base excision repair. Homologues of AlkC and AlkD were also identified in other organisms.	0
416268	cl11435	DMB-PRT_CobT	Nicotinate-nucleotide-dimethylbenzimidazole phosphoribosyltransferase (DMB-PRT), also called CobT. This family of proteins represents the nicotinate-nucleotide-dimethylbenzimidazole phosphoribosyltransferase (NN:DBI PRT) enzymes involved in dimethylbenzimidazole synthesis. This function is essential to de novo cobalamin (vitamin B12) production in bacteria. Nicotinate mononucleotide (NaMN):5,6-dimethylbenzimidazole (DMB) phosphoribosyltransferase (CobT) from Salmonella enterica plays a central role in the synthesis of alpha-ribazole-5'-phosphate, an intermediate for the lower ligand of cobalamin.	0
416269	cl11436	DNA_III_psi	DNA polymerase III psi subunit. This small subunit of the DNA polymerase III holoenzyme in E. coli and related species appears to have a narrow taxonomic distribution. It is not found so far outside the gamma subdivision proteobacteria. [DNA metabolism, DNA replication, recombination, and repair]	0
416270	cl11437	DUF2057	Uncharacterized protein conserved in bacteria (DUF2057). hypothetical protein; Provisional	0
416271	cl11440	AstA	Arginine N-succinyltransferase beta subunit. In some bacteria, including Pseudomonas aeruginosa, the astB gene (arginine N-succinyltransferase) is replaced by tandem paralogs that form a heterodimer. This heterodimer from P. aeruginosa is characterized as arginine and ornithine N-2 succinyltransferase (AOST). Members of this protein family represent the less widespread paralog, designated AruI, or arginine/ornithine succinyltransferase, alpha subunit.	0
416272	cl11442	Cas2_I_II_III	CRISPR/Cas system-associated protein Cas2. Members of this family of bacterial proteins comprise various hypothetical proteins, as well as CRISPR (clustered regularly interspaced short palindromic repeats) associated proteins, conferring resistance to infection by certain bacteriophages.	0
416273	cl11443	Cas6	Class 1 CRISPR-associated endoribonuclease Cas6. CRISPR-Cas (clustered regularly interspaced short palindromic repeats and CRISPR-associated proteins) adaptive immune systems defend microbes against foreign nucleic acids via RNA-guided endonucleases. These systems are divided into two classes: class 1 systems utilize multiple Cas proteins and CRISPR RNA (crRNA) to form an effector complex while class 2 systems employ a large, single effector with crRNA to mediate interference. Cas6 family endoribonucleases are metal-independent nucleases that catalyze RNA cleavage via a mechanism involving a 2'-3' cyclic intermediate. They share a common ferredoxin or RNA recognition motif (RRM) fold, and they recognize and excise CRISPR repeat RNAs that vary widely in primary and secondary structures. This subfamily contains Cas6 family endoribonucleases typically found within type III CRISPR-Cas systems and similar proteins.	0
416274	cl11449	DUF406	Protein of unknown function (DUF406). These small proteins are approximately 100 amino acids in length and appear to be found only in gamma proteobacteria. The function of this protein family is unknown. [Hypothetical proteins, Conserved]	0
416275	cl11450	MtlR	Mannitol repressor. putative DNA-binding transcriptional regulator; Provisional	0
416276	cl11451	Cyd_oper_YbgE	Cyd operon protein YbgE (Cyd_oper_YbgE). hypothetical protein; Provisional	0
416277	cl11452	H_PPase	Inorganic H+ pyrophosphatase. This model describes proton pyrophosphatases from eukaryotes (predominantly plants), archaea and bacteria. It is an integral membrane protein and is suggested to have about 15 membrane spanning domains. Proton translocating inorganic pyrophosphatase, like H(+)-ATPase, acidifies the vacuoles and is pivotal to the vacuolar secondary active transport systems in plants. [Transport and binding proteins, Cations and iron carrying compounds]	0
416278	cl11454	FlaF	Flagellar protein FlaF. flagellar biosynthesis regulatory protein FlaF; Reviewed	0
416279	cl11455	FlbT	Flagellar protein FlbT. flagellar biosynthesis repressor FlbT; Reviewed	0
416280	cl11456	DUF1375	Protein of unknown function (DUF1375). hypothetical protein; Provisional	0
416281	cl11457	Secretoglobin	N/A. Uteroglobin is a homodimer of two identical 70 amino acid polypeptides linked by two disulphide bridges. The precise role of uteroglobin has still to be elucidated.	0
386124	cl11461	Phage_H_T_join	Phage head-tail joining protein. This family describes a small protein of about 100 amino acids found in bacteriophage and in bacterial prophage regions. Examples include gp9 of phage HK022 and gp16 of phage SPP1. This minor structural protein is suggested to be a head-tail adaptor protein (although the source of this annotation was not traced during construction of this model). [Mobile and extrachromosomal element functions, Prophage functions]	0
416282	cl11463	Phage_TTP_12	Lambda phage tail tube protein, TTP. characterized members are major tail tube proteins from various phages, including lactococcal temperate bacteriophage TP901-1.	0
416283	cl11466	STI	N/A. Soybean trypsin inhibitor (Kunitz) family of protease inhibitors. Inhibit proteases by binding with high affinity to their active sites. Trefoil fold, common to interleukins and fibroblast growth factors.	0
416284	cl11468	KicB	MukF winged-helix domain. The kicA and kicB genes are found upstream of mukB. It has been suggested that the kicB gene encodes a killing factor and the kicA gene codes for a protein that suppresses the killing function of the kicB gene product. It was also demonstrated that KicA and KicB can function as a post-segregational killing system, when the genes are transferred from the E. coli chromosome onto a plasmid.	0
416285	cl11470	SeqA	SeqA protein C-terminal domain. The binding of SeqA protein to hemimethylated GATC sequences is important in the negative modulation of chromosomal initiation at oriC, and in the formation of SeqA foci necessary for Escherichia coli chromosome segregation. SeqA tetramers are able to aggregate or multimerize in a reversible, concentration-dependent manner. Apart from its function in the control of DNA replication, SeqA may also be a specific transcription factor.	0
416286	cl11471	MukE	bacterial condensin complex subunit MukE. Bacterial protein involved in chromosome partitioning, MukE	0
416287	cl11472	DUF440	Protein of unknown function, DUF440. dsDNA-mimic protein; Reviewed	0
416288	cl11473	DUF1043	Protein of unknown function (DUF1043). This family consists of several hypothetical bacterial proteins of unknown function.	0
416289	cl11474	UPF0231	Uncharacterized protein family (UPF0231). hypothetical protein; Provisional	0
416290	cl11475	CcmD	Heme exporter protein D (CcmD). The model for this protein family describes a small, hydrophobic, and only moderately well-conserved protein, tricky to identify accurately for all of these reasons. However, members are found as part of large operons involved in heme export across the inner membrane for assembly of c-type cytochromes in a large number of bacteria. The gray zone between the trusted cutoff (13.0) and noise cutoff (4.75) includes both low-scoring examples and false-positive matches to hydrophobic domains of longer proteins.	0
416291	cl11478	Rsd_AlgQ	Regulator of RNA polymerase sigma(70) subunit, Rsd/AlgQ. This family includes bacterial transcriptional regulators that are thought to act through an interaction with the conserved region 4 of the sigma(70) subunit of RNA polymerase. The Pseudomonas aeruginosa homolog, AlgQ, positively regulates virulence gene expression and is associated with the mucoid phenotype observed in Pseudomonas aeruginosa isolates from cystic fibrosis patients.	0
416292	cl11479	SMP_2	Bacterial virulence factor haemolysin. Members of this family of bacterial proteins are membrane proteins that effect the expression of haemolysin under anaerobic conditions.	0
416293	cl11481	DUF1145	Protein of unknown function (DUF1145). This family consists of several hypothetical bacterial proteins of unknown function.	0
416294	cl11483	PriC	Primosomal replication protein priC. primosomal replication protein N''; Provisional	0
416295	cl11485	YozE_SAM_like	YozE SAM-like fold. hypothetical protein; Provisional	0
416296	cl11488	DUF1450	Protein of unknown function (DUF1450). hypothetical protein; Provisional	0
416297	cl11491	Phasin_2	Phasin protein. Members of this protein family are encoded in polyhydroxyalkanoic acid storage system regions in Vibrio, Photobacterium profundum SS9, Acinetobacter sp., Aeromonas hydrophila, and several species of Vibrio. Members appear distantly related to the phasin family proteins modeled by TIGR01841 and TIGR01985.	0
416298	cl11492	DUF1447	Protein of unknown function (DUF1447). hypothetical protein; Provisional	0
416299	cl11493	PQQ_DH_like	PQQ-dependent dehydrogenases and related proteins. This protein family has a phylogenetic distribution very similar to that of coenzyme PQQ biosynthesis enzymes, as shown by partial phylogenetic profiling. Members of this family have several predicted transmembrane helices in the N-terminal region, and include the quinoprotein glucose dehydrogenase (EC 1.1.5.2) of Escherichia coli and the quinate/shikimate dehydrogenase of Acinetobacter sp. ADP1 (EC 1.1.99.25). Sequences closely related except for the absence of the N-terminal hydrophobic region, scoring in the gray zone between the trusted and noise cutoffs, include PQQ-dependent glycerol (EC 1.1.99.22) and other polyol (sugar alcohol) dehydrogenases.	0
386143	cl11495	IncFII_repA	IncFII RepA protein family. replication protein; Provisional	0
299749	cl11500	Phage_Treg	Lactococcus bacteriophage putative transcription regulator. putative transcription regulator; Provisional	0
416300	cl11501	HHA	Haemolysin expression modulating protein. This family consists of haemolysin expression modulating protein (HHA) homologs. YmoA and Hha are highly similar bacterial proteins downregulating gene expression in Yersinia enterocolitica and Escherichia coli, respectively.	0
416301	cl11502	Ter	DNA replication terminus site-binding protein (Ter protein). DNA replication terminus site-binding protein; Provisional	0
299752	cl11503	TraA	TraA. conjugal transfer pilin subunit TraA; Provisional	0
416302	cl11505	Sif	Sif protein. This family consists of several SifA and SifB and SseJ proteins which seem to be specific to the Salmonella species. SifA, SifB and SseJ have been demonstrated to localize to the Salmonella-containing vacuole (SCV) and to Salmonella-induced filaments (Sifs). Trafficking of SseJ and SifB away from the SCV requires the SPI-2 effector SifA. SseJ trafficking away from the SCV along Sifs is unnecessary for its virulence function.	0
416303	cl11506	CrgA	Cell division protein CrgA. putative septation inhibitor protein; Reviewed	0
416304	cl11507	DUF1471	Protein of unknown function (DUF1471). hypothetical protein; Provisional	0
299755	cl11508	NUMOD1	NUMOD1 domain. Repeat of unknown function, but possibly DNA-binding via helix-turn-helix motif (Ponting, unpublished).	0
299756	cl11513	Chlamy_scaf	Chlamydia-phage Chp2 scaffold (Chlamy_scaf). minor capsid protein	0
416305	cl11515	TrbI_Ftype	Type-F conjugative transfer system protein (TrbI_Ftype). This protein is an essential component of the F-type conjugative transfer system for plasmid DNA transfer and has been shown to be localized to the periplasm.	0
386150	cl11516	TraQ	Type-F conjugative transfer system pilin chaperone (TraQ). conjugal transfer pilin chaperone TraQ; Provisional	0
416306	cl11518	IL4	Interleukin 4. Interleukins-4 and -13 are cytokines involved in inflammatory and immune responses. IL-4 stimulates B and T cells.	0
416307	cl11519	DENN	DENN (AEX-3) domain. The DENN domain is found in a variety of signalling proteins involved in Rab-mediated processes or regulation of MAPKs signalling pathways. The DENN domain is always encircled on both sides by more divergent domains, called uDENN (for upstream DENN) and dDENN (for downstream DENN). The function of the DENN domain remains to date unclear, although it appears to represent a good candidate for a GTP/GDP exchange activity.	0
416308	cl11522	Tom22	Mitochondrial import receptor subunit Tom22. The mitochondrial protein translocase (MPT) family, which brings nuclearly encoded preproteins into mitochondria, is very complex with 19 currently identified protein constituents. These proteins include several chaperone proteins, four proteins of the outer membrane translocase (Tom) import receptor, five proteins of the Tom channel complex, five proteins of the inner membrane translocase (Tim) and three "motor" proteins. This family is specific for the Tom22 proteins. [Transport and binding proteins, Amino acids, peptides and amines]	0
175307	cl11526	Phage_connector	Phage Connector (GP10). putative upper collar protein	0
386154	cl11530	DUF104	Protein of unknown function DUF104. This family includes short archaebacterial proteins of unknown function. Archaeoglobus fulgidus has twelve copies of this protein, with several being clustered together in the genome.	0
416309	cl11531	ZapB	Cell division protein ZapB. septal ring assembly protein ZapB; Provisional	0
416310	cl11533	DUF1013	Protein of unknown function (DUF1013). Family of uncharacterized proteins found in Proteobacteria.	0
416311	cl11538	ssDNA-exonuc_C	Single-strand DNA-specific exonuclease, C terminal domain. Members of this set of prokaryotic domains are found in a set of single-strand DNA-specific exonucleases, including RecJ. Their exact function has not, as yet, been determined.	0
299766	cl11540	Mu-like_Com	Mu-like prophage protein Com. Members of this family of proteins comprise the translational regulator of mom.	0
299767	cl11541	CoiA	Competence protein CoiA-like family. Many of the members of this family are described as transcription factors. CoiA falls within a competence-specific operon in Streptococcus. CoiA is an uncharacterized protein.	0
416312	cl11542	EcsB	Bacterial ABC transporter protein EcsB. This family consists of several bacterial ABC transporter proteins which are homologous to the EcsB protein of Bacillus subtilis. EcsB is thought to encode a hydrophobic protein with six membrane-spanning helices in a pattern found in other hydrophobic components of ABC transporters.	0
416313	cl11545	DUF1820	Domain of unknown function (DUF1820). This family includes small functionally uncharacterized proteins around 100 amino acids in length.	0
386160	cl11547	HTH_43	Winged helix-turn helix. This family, found in various hypothetical prokaryotic proteins, is a probable winged helix DNA-binding domain.	0
416314	cl11548	DUF2140	Uncharacterized protein conserved in bacteria (DUF2140). This domain, found in various hypothetical prokaryotic proteins, has no known function.	0
416315	cl11550	DUF1797	Protein of unknown function (DUF1797). This is a domain of unknown function. It forms a central anti-parallel beta sheet with flanking alpha helical regions.	0
416316	cl11551	DUF1149	Protein of unknown function (DUF1149). This family consists of several hypothetical bacterial proteins of unknown function.	0
386164	cl11552	DUF1462	Protein of unknown function (DUF1462). This family consists of several hypothetical bacterial proteins of around 100 residues in length. The function of this family is unknown.	0
416317	cl11555	DUF1129	Protein of unknown function (DUF1129). This family consists of several hypothetical bacterial proteins of unknown function.	0
416318	cl11560	ComK	ComK protein. This family consists of several bacterial ComK proteins. The ComK protein of Bacillus subtilis positively regulates the transcription of several late competence genes as well as comK itself. It has been found that ClpX plays an important role in the regulation of ComK at the post-transcriptional level.	0
416319	cl11562	DUF1465	Protein of unknown function (DUF1465). This family consists of several hypothetical bacterial proteins of around 180 residues in length. The function of this family is unknown.	0
416320	cl11564	GcrA	GcrA cell cycle regulator. GcrA is a master cell cycle regulator that, together with CtrA (see pfam00072 and pfam00486), is involved in controlling cell cycle progression and asymmetric polar morphogenesis. During this process, there are temporal and spatial variations in the concentrations of GcrA and CtrA. The variation in concentration produces time and space dependent transcriptional regulation of modular functions that implement cell-cycle processes. More specifically, GcrA acts as an activator of components of the replisome and the segregation machinery.	0
416321	cl11568	DUF1491	Protein of unknown function (DUF1491). This family consists of several bacterial proteins of around 115 residues in length. Members of this family seem to be found exclusively in the Class Alphaproteobacteria. The function of this family is unknown.	0
416322	cl11569	DUF1467	Protein of unknown function (DUF1467). This family consists of several bacterial proteins of around 90 residues in length. The function of this family is unknown.	0
416323	cl11570	DUF1489	Protein of unknown function (DUF1489). This family consists of several hypothetical bacterial proteins of around 150 residues in length. Members of this family seem to be found exclusively in the Class Alphaproteobacteria. The function of this family is unknown.	0
416324	cl11571	DUF1476	Domain of unknown function (DUF1476). This family consists of several hypothetical bacterial proteins of around 100 residues in length. Members of this family are found in Bradyrhizobium, Rhizobium, Brucella and Caulobacter species. The function of this family is unknown.	0
386171	cl11574	DUF2279	Predicted periplasmic lipoprotein (DUF2279). This domain, found in various hypothetical bacterial proteins, has no known function.	0
416325	cl11576	DUF1398	Protein of unknown function (DUF1398). This family consists of several hypothetical Enterobacterial proteins of around 130 residues in length. Members of this family seem to be found exclusively in Escherichia coli and Salmonella species. The function of this family is unknown.	0
416326	cl11577	DUF1150	Protein of unknown function (DUF1150). This family consists of several hypothetical bacterial proteins of unknown function.	0
416327	cl11580	DUF2002	Protein of unknown function (DUF2002). hypothetical protein; Provisional	0
416328	cl11584	DUF2591	Protein of unknown function (DUF2591). hypothetical protein	0
299787	cl11585	Phage_X	Phage X family. gene X product; Reviewed	0
416329	cl11586	snake_toxin	N/A. This family predominantly includes venomous neurotoxins and cytotoxins from snakes, but also structurally similar (non-snake) toxin-like proteins (TOLIPs) such as Lymphocyte antigen 6D and Ly6/PLAUR domain-containing protein. Snake toxins are short proteins with a compact, disulphide-rich structure. TOLIPs have similar structural features (abundance of spaced cysteine residues, a high frequency of charged residues, a signal peptide for secretion and a compact structure) but are not associated with a venom gland or poisonous function. They are endogenous animal proteins that are not restricted to poisonous animals.	0
416330	cl11589	Knot1	N/A. Knottins, representing plant lectins/antimicrobial peptides, plant proteinase/amylase inhibitors, plant gamma-thionins and arthropod defensins.	0
416331	cl11592	zf-CCCH	Zinc finger C-x8-C-x5-C-x3-H type (and similar). 	0
416332	cl11594	SSI	Subtilisin inhibitor-like. 	0
416333	cl11600	PBP_GOBP	PBP/GOBP family. The olfactory receptors of terrestrial animals exist in an aqueous environment, yet detect odorants that are primarily hydrophobic. The aqueous solubility of hydrophobic odorants is thought to be greatly enhanced via odorant binding proteins which exist in the extracellular fluid surrounding the odorant receptors. This family is composed of pheromone binding proteins (PBP), which are male-specific and associate with pheromone-sensitive neurons and general-odorant binding proteins (GOBP).	0
416334	cl11602	IL7	Interleukin 7/9 family. IL-7 is a cytokine that acts as a growth factor for early lymphoid cells of both B- and T-cell lineages. IL-9 is a multifunctional cytokine; although it was originally described as a T-cell growth factor, its function in T-cell response remains unclear.	0
416335	cl11603	Basic	Myogenic Basic domain. This basic domain is found in the MyoD family of muscle specific proteins that control muscle development. The bHLH region of the MyoD family includes the basic domain and the Helix-loop-helix (HLH) motif. The bHLH region mediates specific DNA binding, with 12 residues of the basic domain involved in DNA binding. The basic domain forms an extended alpha helix in the structure.	0
386183	cl11607	7TM_GPCR_Srab	Serpentine type 7TM GPCR receptor class ab chemoreceptor. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srb is part of the Sra superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'.	0
299794	cl11610	Phage_G	Major spike protein (G protein). major spike protein	0
416336	cl11612	DUF243	Domain of unknown function (DUF243). This family of uncharacterized proteins is only found in fly proteins. It is found associated with YLP motifs pfam02757 in some proteins.	0
416337	cl11614	Peptidase_S77	Prohead core protein serine protease. prohead core scaffolding protein and protease	0
416338	cl11619	SPK	Domain of unknown function (DUF545). Family of uncharacterized C. elegans proteins. The region represented by this family is found to be repeated up to four times in some proteins.	0
386187	cl11622	Phage_endo_I	Phage endonuclease I. endonuclease I	0
416339	cl11625	Podovirus_Gp16	Podovirus DNA encapsidation protein (Gp16). DNA encapsidation protein	0
416340	cl11627	HlyE	Haemolysin E (HlyE). This family consists of several enterobacterial haemolysin (HlyE) proteins. Hemolysin E (HlyE) is a novel pore-forming toxin of Escherichia coli, Salmonella typhi, and Shigella flexneri. HlyE is unrelated to the well characterized pore-forming E. coli hemolysins of the RTX family, haemolysin A (HlyA), and the enterohaemolysin encoded by the plasmid borne ehxA gene of E. coli O157. However, it is evident that expression of HlyE in the absence of the RTX toxins is sufficient to give a hemolytic phenotype in E. coli. HlyE is a protein of 34 kDa that is expressed during anaerobic growth of E. coli. Anaerobic expression is controlled by the transcription factor, FNR, such that, upon ingestion and entry into the anaerobic mammalian intestine, HlyE is produced and may then contribute to the colonisation of the host.	0
416341	cl11629	KdgM	Oligogalacturonate-specific porin protein (KdgM). This family consists of several bacterial proteins which are homologous to the oligogalacturonate-specific porin protein KdgM from Erwinia chrysanthemi. The phytopathogenic Gram-negative bacteria Erwinia chrysanthemi secretes pectinases, which are able to degrade the pectic polymers of plant cell walls, and uses the degradation products as a carbon source for growth. KdgM is a major outer membrane protein, whose synthesis is strongly induced in the presence of pectic derivatives. KdgM behaves like a voltage-dependent porin that is slightly selective for anions and that exhibits fast block in the presence of trigalacturonate. In contrast to most porins, KdgM seems to be monomeric.	0
416342	cl11630	DinI	DinI-like family. DNA damage-inducible protein I; Provisional	0
271727	cl11632	DUF1035	Protein of unknown function (DUF1035). structural protein V1; Reviewed	0
416343	cl11636	SecM	Secretion monitor precursor protein (SecM). This family consists of several bacterial Secretion monitor precursor (SecM) proteins. SecM is known to regulate SecA expression. The eubacterial protein secretion machinery consists of a number of soluble and membrane associated components. One critical element is SecA ATPase, which acts as a molecular motor to promote protein secretion at translocation sites that consist of SecYE, the SecA receptor, and SecG and SecDFyajC proteins, which regulate SecA membrane cycling.	0
416344	cl11637	Mth_Ecto	N/A. This family represents the N-terminal region of the Drosophila specific Methuselah protein. Drosophila Methuselah (Mth) mutants have a 35% increase in average lifespan and increased resistance to several forms of stress, including heat, starvation, and oxidative damage. The protein affected by this mutation is related to G protein-coupled receptors of the secretin receptor family. Mth, like secretin receptor family members, has a large N-terminal ectodomain, which may constitute the ligand binding site. This family is found in conjunction with pfam00002.	0
416345	cl11643	WzyE	WzyE protein, O-antigen assembly polymerase. This family consists of several WzyE proteins which appear to be specific to Enterobacteria. Members of this family are described as putative ECA polymerases, but this has been found to be incorrect. The function of this family is unknown. The family is a transmembrane family with up to 11 TM regions, and is necessary for the assembly of O-antigen lipopolysaccharide.	0
299806	cl11645	DUF1293	Protein of unknown function (DUF1293). hypothetical protein	0
416346	cl11647	MalM	Maltose operon periplasmic protein precursor (MalM). This family consists of several maltose operon periplasmic protein precursor (MalM) sequences. The function of this family is unknown.	0
386195	cl11648	DUF1418	Protein of unknown function (DUF1418). hypothetical protein; Provisional	0
416347	cl11650	DUF1431	Protein of unknown function (DUF1431). This family contains a number of Drosophila melanogaster proteins of unknown function. These contain several conserved cysteine residues.	0
264457	cl11652	TraP	TraP protein. conjugal transfer protein TraP; Provisional	0
416348	cl11653	Crl	Transcriptional regulator Crl. This family contains the bacterial transcriptional regulator Crl (approximately 130 residues long). This is a transcriptional regulator of the csgA curlin subunit gene for curli fibers that are found on the surface of certain bacteria.	0
416349	cl11654	DUF1516	Protein of unknown function (DUF1516). hypothetical protein; Provisional	0
159607	cl11655	PRE_C2HC	Associated with zinc fingers. The function of this domain is unknown and it is often found associated with pfam00096.	0
416350	cl11656	FCD	FCD domain. This family contains sequences that are similar to the fatty acid metabolism regulator protein (FadR). This functions as a dimer, with each monomer being composed of an N-terminal DNA-binding domain and a regulatory C-terminal domain. A linker comprising two short alpha helices joins the two domains. In the C-terminal domain, an antiparallel array of six alpha helices forms a barrel-like structure, while a seventh alpha helix forms a 'lid' at the end closest to the N-terminal domain. This structure was found to be similar to that of the C-terminal domain of the Tet repressor. Long-chain acyl-CoA thioesters interact directly and reversibly with the C-terminal domain, and this interaction affects the structure and therefore the DNA binding properties of the N-terminal domain.	0
416351	cl11657	DM4_12	DM4/DM12 family. This family contains sequences derived from hypothetical proteins expressed by two insect species, D. melanogaster and A. gambiae. The region in question is approximately 115 amino acid residues long and contains four highly- conserved cysteine residues.	0
416352	cl11660	PAN_3	PAN-like domain. 	0
416353	cl11665	7TM_GPCR_Srh	Serpentine type 7TM GPCR chemoreceptor Srh. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Sri is part of the Str superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'.	0
416354	cl11672	DUF2509	Protein of unknown function (DUF2509). This family is conserved in Proteobacteria. The function is not known but many of the members are annotated as protein YgdB.	0
416355	cl11677	FliX	Class II flagellar assembly regulator. flagellar assembly regulator FliX; Reviewed	0
416356	cl11681	Anti-adapt_IraP	Sigma-S stabilisation anti-adaptor protein. This family is conserved in Enterobacteriaceae. It is one of a series of proteins, expressed by these bacteria in response to stress, that help to regulate Sigma-S, the stationary phase sigma factor of Escherichia coli and Salmonella. IraP is essential for Sigma-S stabilisation in some but not all starvation conditions.	0
416357	cl11685	CdhC	CO dehydrogenase/acetyl-CoA synthase complex beta subunit. acetyl-CoA decarbonylase/synthase complex subunit beta; Reviewed	0
416358	cl11698	Lipoprotein_22	Uncharacterized lipoprotein family. hypothetical protein; Provisional	0
275955	cl11748	LysW	Lysine biosynthesis protein LysW. This very small, poorly characterized protein has been shown essential in Thermus thermophilus for an unusual pathway of Lys biosynthesis from aspartate by way of alpha-aminoadipate (AAA) rather than diaminopimelate. It is found also in Deinococcus radiodurans and Pyrococcus horikoshii, which appear to share the AAA pathway. [Amino acid biosynthesis, Aspartate family]	0
416362	cl11777	zinc_ribbon_4	zinc-ribbon domain. This family consists of a single zinc ribbon domain, ie half of a pair as in family DZR, pfam12773.	0
416363	cl11797	Flagellar_put	Putative flagellar. Members of this family are found in a subset of bacterial flagellar operons, generally between genes designated flgD and flgE, in species as diverse as Bacillus halodurans and various other Firmicutes, Geobacter sulfurreducens, and Bdellovibrio bacteriovorus. The specific molecular function is unknown. [Cellular processes, Chemotaxis and motility]	0
416364	cl11819	LigD_N	DNA polymerase Ligase (LigD). Most sequences in this family are the 3'-phosphoesterase domain of a multidomain, multifunctional DNA ligase, LigD, involved, along with bacterial Ku protein, in non-homologous end joining, the less common of two general mechanisms of repairing double-stranded breaks in DNA sequences. LigD is variable in architecture, as it lacks this domain in Bacillus subtilis, is permuted in Mycobacterium tuberculosis, and occasionally is encoded by tandem ORFs rather than as a multifunctional protein. In a few species (Dehalococcoides ethenogenes and the archaeal genus Methanosarcina), sequences corresponding to the ligase and polymerase domains of LigD are not found, and the role of this protein is unclear. [DNA metabolism, DNA replication, recombination, and repair]	0
416365	cl11827	DUF3485	Protein of unknown function (DUF3485). In Methylobacillus sp strain 12S, EpsI is encoded immediately downstream of the multiple-membrane-spanning putative transporter EpsH, and is predicted to be a periplasmic protein involved in, but not required for, expression of the exopolysaccharide methanolan. In a number of other species, protein homologous to EpsI is encoded either next to EpsH or, more often, combined in a fused gene. We have proposed renaming EpsH, or the EpsHI fusion protein, to exosortase, based on its phylogenetic association with the PEP-CTERM proposed protein targeting signal. [Transport and binding proteins, Unknown substrate]	0
416366	cl11840	DUF3289	Protein of unknown function (DUF3289). Members of this protein family have been found in several species of gammaproteobacteria, including Yersinia pestis and Y. pseudotuberculosis, Xylella fastidiosa, and Escherichia coli UTI89. As many as five members can be found in a single genome. The function is unknown. [Hypothetical proteins, Conserved]	0
416367	cl11841	PSII_Pbs27	Photosystem II Pbs27. Members of this family are the Psb27 protein of the cyanobacterial photosynthetic supracomplex, photosystem II. Although most protein components of both cyanobacterial and chloroplast versions of photosystem II are closely related and described together by single models, this family is strictly bacterial. Some uncharacterized proteins with highly divergent sequences, from Arabidopsis, score between trusted and noise cutoffs for this model but are not at this time assigned as functionally equivalent photosystem II proteins. [Energy metabolism, Photosynthesis]	0
416368	cl11843	DUF3623	Protein of unknown function (DUF3623). This uncharacterized protein family was identified, by the method of partial phylogenetic profiling, as having a matching phylogenetic distribution to that of the photosynthetic reaction center of the alpha-proteobacterial type. It is nearly always encoded near other photosynthesis-related genes, including puhA. [Energy metabolism, Photosynthesis]	0
416369	cl11853	Couple_hipA	HipA N-terminal domain. Although Pfam models pfam07805 and pfam07804 currently are called HipA-like N-terminal domain and HipA-like C-terminal domain, respectively, those models hit the central and C-terminal regions of E. coli HipA but not the N-terminal region. This model hits the N-terminal region of HipA and its homologs, and also identifies proteins that lack match regions for pfam07804 and pfam07805.	0
275987	cl11864	Csf2_U	CRISPR/Cas system-associated RAMP superfamily protein Csf2. Members of this family show up near CRISPR repeats in Acidithiobacillus ferrooxidans ATCC 23270, Azoarcus sp. EbN1, and Rhodoferax ferrireducens DSM 15236. In the latter two species, the CRISPR/cas locus is found on a plasmid. This family is one of several characteristic of a type of CRISPR-associated (cas) gene cluster we designate Aferr after A. ferrooxidans, where it is both chromosomal and the only type of cas gene cluster found. The gene is designated csf2 (CRISPR/cas Subtype as in A. ferrooxidans protein 2), as it lies second closest to the repeats.	0
187964	cl11865	Csf3_U	CRISPR/Cas system-associated RAMP superfamily protein Csf3. Members of this family show up near CRISPR repeats in Acidithiobacillus ferrooxidans ATCC 23270, Azoarcus sp. EbN1, and Rhodoferax ferrireducens DSM 15236. In the latter two species, the CRISPR/cas locus is found on a plasmid. This family is one of several characteristic of a type of CRISPR-associated (cas) gene cluster we designate Aferr after A. ferrooxidans, where it is both chromosomal and the only type of cas gene cluster found. The gene is designated csf3 (CRISPR/cas Subtype as in A. ferrooxidans protein 3), as it lies third closest to the repeats.	0
416370	cl11869	Csc2_I-D	CRISPR/Cas system-associated protein Csc2. The Csc2 Crispr family of proteins forms a core RNA recognition motif-like domain, flanked by three peripheral insertion domains: a lid domain, a Zinc-binding domain and a helical domain. The CRISPR-Cas system is possibly a mechanism of defence against invading pathogens and plasmids that functions analogously to the RNA interference (RNAi) systems in eukaryotes.	0
416371	cl11871	AtpR	N-ATPase, AtpR subunit. Members of this protein family are uncharacterized, highly hydrophobic proteins encoded in the middle of apparent F1/F0 ATPase operons. We note, however, that this protein is both broadly and sparsely distributed. It is found in only about two percent of microbial genomes sequenced, with the first ten examples found coming from the Euryarchaeota, Chlorobia, Betaproteobacteria, Deltaproteobacteria, and Planctomycetes. In most of these species, the surrounding operon appears to represent a second F1/F0 ATPase system, and the member proteins belong to subfamilies with the same phylogenetic distribution as the current protein family.	0
416372	cl11879	VasI	Type VI secretion system VasI, EvfG, VC_A0118. Members of this protein family, including VC_A0118 from Vibrio cholerae El Tor N16961, are restricted to a subset of bacteria with the type VI secretion system, and are encoded among the type VI-associated pathogenicity islands. However, many species with type VI secretion lack a member of this family. This lack suggests that members of this family may be targets rather than components of the type VI secretion system.	0
416373	cl11880	T6SS_VasJ	Type VI secretion, EvfE, EvfF, ImpA, BimE, VC_A0119, VasJ. This protein family is one of two related families in type VI secretion systems that contain an ImpA-related N-terminal domain (pfam06812). [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	0
416374	cl11881	CBP_BcsG	Cellulose biosynthesis protein BcsG. This protein was identified by the partial phylogenetic profiling algorithm () as part of the system for cellulose biosynthesis in bacteria, and in fact is found in cellulose biosynthesis gene regions. The protein was designated YhjU in Salmonella enteritidis, where disruption of its gene disrupts cellulose biosynthesis and biofilm formation (). [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	0
416375	cl11883	VPLPA-CTERM	VPLPA-CTERM protein sorting domain. PepA was described in Zoogloea resiniphila as a PEP-CTERM protein regulated by the PrsK/PrsR two-component system. Knocking out that system blocks flocculation, after which expression of recombinant PepA can restore flocculation.	0
416376	cl11887	DUF3738	Protein of unknown function (DUF3738). Bacterial reference strains encoding members of this protein family are all isolated from soil. These include 39 members from Solibacter usitatus Ellin6076, 27 from Acidobacterium sp. MP5ACTX8 (both Acidobacteria), and four from Pedosphaera parvula Ellin514 (Verrucomicrobia). The family is well-diversified, with few pairs showing greater than 50 % pairwise identity. A few members are fused to Peptidase_M56 domains (see pfam05569), to Sigma70_r2 domains (see pfam04542), or have a duplication of this domain.	0
416377	cl11888	crt_membr	carotene biosynthesis associated membrane protein. Proteins of this family are involved in the initiation of core alpha-(1,6) mannan biosynthesis of lipomannan (LM-A) and multi-mannosylated polymer (LM-B), extending triacylatedphosphatidyl-myo-inositol dimannoside (Ac1PIM2) and mannosylated glycolipid, 1,2-di-O-C16/C18:1-(alpha-D-mannopyranosyl)-(1->4)-(alpha-D-glucopyranosyluronic acid)-(1->3)-glycerol (Man1GlcAGroAc2), respectively.	0
416378	cl11889	Lycopene_cyc	Lycopene cyclase. This domain is often repeated twice within the same polypeptide, as is observed in Archaea, Thermus, Sphingobacteria and Fungi. In the fungal sequences, this tandem domain pair is observed as the N-terminal half of a bifunctional protein, where it has been characterized as a lycopene beta-cyclase and the C-terminal half is a phytoene synthetase. In Myxococcus and Actinobacterial genomes this domain appears as a single polypeptide, tandemly repeated and usually in a genomic context consistent with a role in carotenoid biosynthesis. It is unclear whether any of the sequences in this family truly encode lycopene epsilon cyclases. However a number are annotated as such. The domain is generally hydrophobic with a number of predicted membrane spanning segments and contains a distinctive motif (hPhEEhhhhhh). In certain sequences one of either the proline or glutamates may vary, but always one of the tandem pair appear to match this canonical sequence exactly.	0
187967	cl11892	Cas8c_I-C	CRISPR/Cas system-associated protein Cas8c. Members of this family are found among cas (CRISPR-Associated) genes close to CRISPR repeats in Leptospira interrogans (a spirochete), Myxococcus xanthus (a delta-proteobacterium), and Lyngbya sp. PCC 8106 (a cyanobacterium). It is found with other cas genes in Anabaena variabilis ATCC 29413. In Lyngbya sp., the protein is split into two tandem genes. This model corresponds to the N-terminal region or upstream gene; the C-terminal region is described by TIGR03486. CRISPR/cas systems are associated with prokaryotic acquired resistance to phage and other exogenous DNA.	0
187968	cl11893	Cas8c'_I-D	CRISPR/Cas system-associated protein Cas8c'. Members of this family are found among cas (CRISPR-Associated) genes close to CRISPR repeats in Leptospira interrogans (a spirochete), Myxococcus xanthus (a delta-proteobacterium), and Lyngbya sp. PCC 8106 (a cyanobacterium). It is found with other cas genes in Anabaena variabilis ATCC 29413. In Lyngbya sp., the protein is split into two tandem genes. This model corresponds to the C-terminal region or downstream gene; the N-terminal region is modeled by TIGR03485. CRISPR/cas systems are associated with prokaryotic acquired resistance to phage and other exogenous DNA.	0
187969	cl11894	Csp2_I-U	CRISPR/Cas system-associated protein Cas8c. Members of this protein family are cas, or CRISPR-associated, proteins. The two sequences in the alignment seed are found within cas gene clusters that are adjacent to CRISPR DNA repeats in two members of the order Bacteroidales, Porphyromonas gingivalis W83 and Bacteroides forsythus ATCC 43037. This cas protein family is unique to the Pging (Porphyromonas gingivalis) subtype.	0
187970	cl11895	Cas5_I	CRISPR/Cas system-associated RAMP superfamily protein Cas5. Members of this protein family are cas, or CRISPR-associated, proteins. The two sequences in the alignment seed are found within cas gene clusters that are adjacent to CRISPR DNA repeats in two members of the order Bacteroidales, Porphyromonas gingivalis W83 and Bacteroides forsythus ATCC 43037. This cas protein family is unique to the Pgingi (Porphyromonas gingivalis) subtype, but shows some sequence similarity to genes of the Cas5 type (see TIGR02593).	0
416379	cl11905	GldH_lipo	GldH lipoprotein. Members of this protein family are predicted lipoproteins, exclusive to the Bacteroidetes phylum (previously Cytophaga-Flavobacteria-Bacteroides). Members include GldH, a protein linked to a type of rapid surface gliding motility found in certain Bacteroidetes, such as Flavobacterium johnsoniae and Cytophaga hutchinsonii. Gliding motility appears closely linked to chitin utilization in the model species Flavobacterium johnsoniae. Not all Bacteroidetes with members of this protein family may have gliding motility. [Cellular processes, Chemotaxis and motility]	0
416380	cl11917	DUF4312	Domain of unknown function (DUF4312). Members of this family of small (about 100 amino acid), relatively rare proteins are found in both Gram-positive (e.g. Enterococcus faecalis) and Gram-negative (e.g. Aeromonas hydrophila) bacteria, as part of a cluster of conserved proteins. The function is unknown. [Hypothetical proteins, Conserved]	0
416381	cl11918	DUF4310	Domain of unknown function (DUF4310). Members of this family of relatively rare proteins are found in both Gram-positive (e.g. Enterococcus faecalis) and Gram-negative (e.g. Aeromonas hydrophila) bacteria, as part of a cluster of conserved proteins. The function is unknown.	0
416382	cl11919	DUF4311	Domain of unknown function (DUF4311). Members of this family of relatively rare proteins are found in both Gram-positive (e.g. Enterococcus faecalis) and Gram-negative (e.g. Aeromonas hydrophila) bacteria, as part of a cluster of conserved proteins. The function is unknown. [Unknown function, General]	0
416383	cl11923	DUF2805	Protein of unknown function (DUF2805). This model describes an uncharacterized bacterial protein family. Members average about 90 amino acids in length with several well-conserved uncommon amino acids (Trp, Met). The majority of species are marine bacteria. Few species have more than one copy, but Vibrio cholerae El Tor N16961 has three identical copies. [Hypothetical proteins, Conserved]	0
416384	cl11925	VioE	Violacein biosynthetic enzyme VioE. This enzyme catalyzes the third step in violacein biosynthesis from a pair of Trp residues, as in Chromobacterium violaceum, but the first step that distinguishes that pathway from staurosporine (an indolocarbazole antibiotic) biosynthesis. [Cellular processes, Toxin production and resistance]	0
416385	cl11927	DUF5801	Domain of unknown function (DUF5801). This model represents a domain of about 143 amino acids that may occur singly or in up to 23 tandem repeats in very large proteins in the genus Vibrio, and in related species such as Legionella pneumophila, Photobacterium profundum, Rhodopseudomonas palustris, Shewanella pealeana, and Aeromonas hydrophila. Proteins with these domains represent a subset of a broader set of proteins with a particular signal for type 1 secretion, consisting of several glycine-rich repeats modeled by pfam00353, followed by a C-terminal domain modeled by TIGR03661. Proteins with this domain tend to share several properties with the RtxA (Repeats in Toxin) protein of Vibrio cholerae, including a large size often containing tandemly repeated domains and a C-terminal signal for type 1 secretion. [Cellular processes, Pathogenesis]	0
353324	cl11943	Activator-TraM	Transcriptional activator TraM. conjugal transfer protein TraM; Provisional	0
416386	cl11960	Ig	Immunoglobulin domain. The non-classical mouse MHC class I (MHC-I) molecule Qa-1b is a non-polymorphic MHC molecule with an important function in innate immunity. It binds and presents signal peptides of classical MHC-I molecules at the cell surface and, as such, acts as an indirect sensor for the normal expression of MHC-I molecules. This signal peptide dominantly accommodated in the groove of Qa-1b is called Qdm, for Qa-1 determinant modifier, and its amino acid sequence AMAPRTLLL is highly conserved among mammalian species. The Qdm/Qa-1b complex serves as a ligand for the germ-line encoded heterodimeric CD94/NKG2A receptors expressed on natural killer (NK) cells and activated CD8+ T cells and transduces inhibitory signals to these lymphocytes. Thus, upon binding, Qa-1b signals NK cells not to engage in cell lysis. The molecular basis of Qa-1b function is unclear.	0
416387	cl11961	ALDH-SF	NAD(P)+-dependent aldehyde dehydrogenase superfamily. This family consists of several bacterial Acyl-CoA reductase (LuxC) proteins. The channelling of fatty acids into the fatty aldehyde substrate for the bacterial bioluminescence reaction is catalyzed by a fatty acid reductase multienzyme complex, which channels fatty acids through the thioesterase (LuxD), synthetase (LuxE) and reductase (LuxC) components.	0
416388	cl11964	CYTH-like_Pase	CYTH-like (also known as triphosphate tunnel metalloenzyme (TTM)-like) Phosphatases. This presumed domain is found in the yeast vacuolar transport chaperone proteins VTC2, VTC3 and VTC4. This domain is also found in a variety of bacterial proteins.	0
416389	cl11965	terB_like	tellurium resistance terB-like protein. This family contains the TerB tellurite resistance proteins from a number of bacteria.	0
416390	cl11966	NT_Pol-beta-like	Nucleotidyltransferase (NT) domain of DNA polymerase beta and similar proteins. This family is likely to be an uncharacterized group of nucleotidyltransferases.	0
416391	cl11967	Nucleotidyl_cyc_III	Class III nucleotidyl cyclases. This domain is found linked to a wide range of non-homologous domains in a variety of bacteria. It has been shown to be homologous to the adenylyl cyclase catalytic domain and has diguanylate cyclase activity. This observation correlates with the functional information available on two GGDEF-containing proteins, namely diguanylate cyclase and phosphodiesterase A of Acetobacter xylinum, both of which regulate the turnover of cyclic diguanosine monophosphate. In the WspR protein of Pseudomonas aeruginosa, the GGDEF domain acts as a diguanylate cyclase, Structure 3bre, when the whole molecule appears to form a tetramer consisting of two symmetrically-related dimers representing a biological unit. The active site is the GGD/EF motif, buried in the structure, and the cyclic dimeric guanosine monophosphate (c-di-GMP) binds to the inhibitory-motif RxxD on the surface. The enzyme thus catalyzes the cyclisation of two guanosine triphosphate (GTP) molecules to one c-di-GMP molecule.	0
416392	cl11968	harmonin_N_like	N-terminal protein-binding module of harmonin and similar domains, also known as HHD (harmonin homology domain). CCM2_HHD is a folded-helical region of a family of vertebral proteins, mutations in which cause cerebral cavernous malformations (CCMs). These malformations are congenital vascular anomalies of the central nervous system that can result in haemorrhagic stroke, seizures, recurrent headaches, and focal neurologic deficits. This domain is structurally homologous to the N-terminal domain of harmonin, so it is named the CCM2 harmonin-homology domain or CCM2_HHD. This protein is often called Malcavernin.	0
416393	cl11970	PriL	Archaeal/eukaryotic core primase: Large subunit, PriL. DNA primase is the polymerase that synthesizes small RNA primers for the Okazaki fragments made during discontinuous DNA replication. DNA primase is a heterodimer of two subunits, the small subunit Pri1 (48 kDa in yeast), and the large subunit Pri2 (58 kDa in the yeast S. cerevisiae). The large subunit of DNA primase forms interactions with the small subunit and the structure implicates that it is not directly involved in catalysis, but plays roles in correctly positioning the primase/DNA complex, and in the transfer of RNA to DNA polymerase.	0
416394	cl11971	PPK2	Polyphosphate kinase 2 (PPK2). Members of this protein family belong to the polyphosphate kinase 2 (PPK2) family, which is not related in sequence to PPK1. While PPK1 tends to act in the biosynthesis of polyphosphate, or poly(P), members of the PPK2 family tend to use the terminal phosphate of poly(P) to regenerate ATP or GTP from the corresponding nucleoside diphosphate, or ADP from AMP as is the case with polyphosphate:AMP phosphotransferase (PAP). Members of this protein family most likely transfer the terminal phosphate between poly(P) and some nucleotide, but it is not clear which. [Central intermediary metabolism, Phosphorus compounds]	0
416395	cl11976	SNF	Sodium:neurotransmitter symporter family. These are twelve xTM-containing region transporters.	0
416396	cl11978	DUF2333	Uncharacterized protein conserved in bacteria (DUF2333). Members of this family of hypothetical bacterial proteins have no known function.	0
416397	cl11979	Lon_2	Putative ATP-dependent Lon protease. putative ATP-dependent protease	0
416398	cl11982	RHS_repeat	RHS Repeat. This model describes two tandem copies of a 21-residue extracellular repeat found in Gram-negative, Gram-positive, and animal proteins. The repeat is named for a YD dipeptide, the most strongly conserved motif of the repeat. These repeats appear in general to be involved in binding carbohydrate; the chicken teneurin-1 YD-repeat region has been shown to bind heparin.	0
416399	cl12004	Cas8c_I-C	CRISPR/Cas system-associated protein Cas8c. CRISPR loci appear to be mobile elements with a wide host range. This entry represents proteins that tend to be found near CRISPR repeats. The species range, so far, is exclusively bacterial and mesophilic, although CRISPR loci are particularly common among the archaea and thermophilic bacteria. Clusters of short DNA repeats with nonhomologous spacers, which are found at regular intervals in the genomes of phylogenetically distinct prokaryotic species, comprise a family with recognisable features. This family is known as CRISPR (short for Clustered, Regularly Interspaced Short Palindromic Repeats). A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins.	0
416400	cl12007	Cby_like	Chibby, a nuclear inhibitor of Wnt/beta-catenin mediated transcription, and similar proteins. This family includes the eukaryotic chibby proteins. These proteins inhibit the wingless/Wnt pathway by binding to beta-catenin and inhibiting beta-catenin-mediated transcriptional activation. Chibby is Japanese for small, and is named after the RNAi phenotype seen in Drosophila.	0
299861	cl12008	FANCE_c-term	Fanconi anemia complementation group E protein, C-terminal domain. Fanconi Anaemia (FA) is a cancer predisposition disorder. In response to DNA damage, the FA core complex monoubiquitinates the downstream FANCD2 protein. The protein FANCE has an important role in DNA repair as it is the FANCD2-binding protein in the FA core complex so it represents the link between the FA core complex and FANCD2. The sequence shown is the C terminal domain of the protein which consists predominantly of helices and does not contain any beta-strand. The fold of the polypeptide is a continuous right-handed solenoidal pattern from the N terminal to the C terminal end.	0
416401	cl12009	HmuY_like	Bacterial proteins similar to Porphyromonas gingivalis HmuY and the C-terminal domain of PARMER_03218. HmuY is a novel heme-binding protein that recruits heme from host carriers and delivers it to its cognate outer-membrane transporter, the TonB-dependent receptor HmuR. This family of proteins is found in bacteria. Proteins in this family are typically between 214 and 278 amino acids in length.	0
416402	cl12013	BAR	The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature. BAR_12 is the BAR coiled-coil domain at the N-terminus of APPL or adaptor protein containing PH domain, PTB domain, and leucine zipper motif proteins in higher eukaryotes. This BAR domain contains four helices whereas the other classical BAR domains contain only three helices. The first three helices form an antiparallel coiled-coil, while the fourth helix is unique to APPL1. BAR domains take part in many varied biological processes such as fission of synaptic vesicles, endocytosis, regulation of the actin cytoskeleton, transcriptional repression, cell-cell fusion, apoptosis, secretory vesicle fusion, and tissue differentiation.	0
416404	cl12015	Adenylation_DNA_ligase_like	Adenylation domain of proteins similar to ATP-dependent polynucleotide ligases. PNKP_ligase is a classical ligase nucleotidyltransferase module of bacteria. PNKP (polynucleotide 5'-kinase/3'-phosphatase) is the end-healing and end-sealing component of an RNA-repair system present in diverse bacteria from ten different phyla. RNA breakage by site-specific 'ribotoxins' is an ancient mechanism by which microbes respond to cellular stress and distinguish self from non-self. Ribotoxins are trans-esterifying endonucleases that generate 5'-OH and 2',3' cyclic phosphate termini. Repair of this type of RNA damage is feasible via sequential enzymatic end-healing and end-sealing steps.	0
416406	cl12018	Peptidase_M48	Peptidase family M48. heat shock protein HtpX; Provisional	0
416407	cl12020	Anticodon_Ia_like	Anticodon-binding domain of class Ia aminoacyl tRNA synthetases and similar domains. This DALR domain is found in cysteinyl-tRNA-synthetases.	0
416408	cl12022	Ribosomal_L27A	Ribosomal proteins 50S-L15, 50S-L18e, 60S-L27A. [Protein synthesis, Ribosomal proteins: synthesis and modification]	0
416410	cl12033	Spt4	Transcription elongation factor Spt4. This family consists of several eukaryotic transcription elongation Spt4 proteins as well as archaebacterial RpoE2. Three transcription-elongation factors Spt4, Spt5, and Spt6 are conserved among eukaryotes and are essential for transcription via the modulation of chromatin structure. Spt4 and Spt5 are tightly associated in a complex, while the physical association of the Spt4-Spt5 complex with Spt6 is considerably weaker. It has been demonstrated that Spt4, Spt5, and Spt6 play roles in transcription elongation in both yeast and humans including a role in activation by Tat. It is known that Spt4, Spt5, and Spt6 are general transcription-elongation factors, controlling transcription both positively and negatively in important regulatory and developmental roles. RpoE2 is one of 13 subunits in the archaeal RNA polymerase. These proteins contain a C4-type zinc finger, and the structure has been solved in. The structure reveals that Spt4-Spt5 binding is governed by an acid-dipole interaction between Spt5 and Spt4, and the complex binds to and travels along the elongating RNA polymerase. The Spt4-Spt5 complex is likely to be an ancient, core component of the transcription elongation machinery.	0
416412	cl12045	Ubiq_cyt_C_chap	Ubiquinol-cytochrome C chaperone. 	0
416413	cl12046	DUF429	Protein of unknown function (DUF429). 	0
416414	cl12049	gp6_gp15_like	Head-Tail Connector Proteins gp6 and gp15, and similar proteins. Some members in this family of proteins with unknown function are annotated as YqbG; however, this cannot be confirmed. Currently the proteins have no known function.	0
353348	cl12054	Terminase_3	Phage terminase large subunit. This model detects members of a highly divergent family of the large subunit of phage terminase. All members are encoded by phage genomes or within prophage regions of bacterial genomes. This is a distinct family from pfam03354. [Mobile and extrachromosomal element functions, Prophage functions]	0
416417	cl12057	YdfA_immunity	SigmaW regulon antibacterial. hypothetical protein; Provisional	0
386259	cl12064	DUF697	Domain of unknown function (DUF697). hypothetical protein; Provisional	0
416421	cl12072	HypD	Hydrogenase formation hypA family. HypD is involved in the hyp operon which is needed for the activity of the three hydrogenase isoenzymes in Escherichia coli. HypD is one of the genes needed for formation of these enzymes. This protein has been found in gram-negative and gram-positive bacteria and Archaea. [Protein fate, Protein modification and repair]	0
416422	cl12074	YjgP_YjgQ	Predicted permease YjgP/YjgQ family. Members of this family are LptG, one of two homologous, tandem-encoded permease genes of an export ATP transporter for lipopolysaccharide (LPS) assembly in most Gram-negative bacteria. The other permease subunit is LptF (TIGR04407). [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides, Transport and binding proteins, Other]	0
416423	cl12076	THUMP	THUMP domain, predicted to bind RNA. The THUMP domain is named after thiouridine synthases, methylases and PSUSs. The THUMP domain consists of about 110 amino acid residues. The structure of ThiI reveals that the THUMP has a fold unlike that of previously characterized RNA-binding domains. It is predicted that this domain is an RNA-binding domain. The THUMP domain probably functions by delivering a variety of RNA modification enzymes to their targets.	0
416424	cl12077	Methyltrn_RNA_3	Putative RNA methyltransferase. This family has a TIM barrel-like fold with a deep C-terminal trefoil knot. The arrangement of its hydrophilic and hydrophobic surfaces is opposite to that of the classic TIM barrel proteins. It is likely to bind RNA, and may function as a methyltransferase.	0
416425	cl12078	p450	Cytochrome P450. Members of this subfamily are cytochrome P450 enzymes that occur next to tRNA-dependent cyclodipeptide synthases. This group does NOT include CYP121 (Rv2275) from Mycobacterium tuberculosis, adjacent to the cyclodityrosine synthetase Rv2276.	0
416426	cl12079	DUF373	Domain of unknown function (DUF373). Archaeal domain of unknown function. Predicted to be an integral membrane protein with six transmembrane regions.	0
416427	cl12080	Pcc1	Transcription factor Pcc1. KEOPS complex Pcc1-like subunit; Provisional	0
416428	cl12096	Iron_permease	Low affinity iron permease. 	0
416429	cl12097	DUF1772	Domain of unknown function (DUF1772). This domain is of unknown function.	0
416430	cl12098	DUF927	Domain of unknown function (DUF927). Family of bacterial proteins of unknown function. The C-terminal half of this family contains a P-loop motif. The N-terminal domain appears to have a unique fold, which contains three helices and two strands. Structural analyses show that helicases containing this domain form a hexameric ring with a positively charged central pore threading a single DNA strand through, suggestive of a replicative function for this helicase.	0
416431	cl12101	CrtC	CrtC N-terminal lipocalin domain. This family contains the members of the old Pfam family DUF2006. Structural characterization of family member NE1406 (from DUF2006 now merged into this family) has revealed a lipocalin-like fold with domain duplication.	0
416432	cl12104	Dehydratase_LU	N/A. This family contains the large subunit of the trimeric diol dehydratases and glycerol dehydratases. These enzymes are produced by some enterobacteria in response to growth substances.	0
416433	cl12113	HSF_DNA-bind	HSF-type DNA-binding. 	0
416434	cl12114	HMG14_17	HMG14 and HMG17. 	0
416435	cl12115	HTH_Tnp_Tc5	Tc5 transposase DNA-binding domain. 	0
416436	cl12116	DUSP	DUSP domain. The DUSP (domain present in ubiquitin-specific protease) domain is found at the N-terminus of Ubiquitin-specific proteases. The structure of this domain has been solved. Its tripod-like structure consists of a 3-fold alpha-helical bundle supporting a triple-stranded anti-parallel beta-sheet.	0
416437	cl12117	JHBP	Haemolymph juvenile hormone binding protein (JHBP). The juvenile hormone exerts pleiotropic functions during insect life cycles and its binding proteins regulate these functions.	0
416438	cl12118	LEA_2	Late embryogenesis abundant protein. uncharacterized protein; Provisional	0
416439	cl12124	HK97-gp10_like	Bacteriophage HK97-gp10, putative tail-component. This model represents an uncharacterized, highly divergent bacteriophage family. The family includes gp10 from HK022 and HK97. It appears related to TIGR01635, a phage morphogenesis family believed to be involved in tail completion. [Mobile and extrachromosomal element functions, Prophage functions]	0
416440	cl12127	Csy2_I-F	CRISPR/Cas system-associated RAMP superfamily protein Csy2. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is a widespread family of prokaryotic direct repeats with spacers of unique sequence between consecutive repeats. This entry, typified by YPO2464 of Yersinia pestis, is a CRISPR-associated (Cas) entry strictly associated with the Ypest subtype of CRISPR/Cas locus. It is designated Csy2, for CRISPR/Cas Subtype Ypest protein 2.	0
416441	cl12129	DUF932	Domain of unknown function (DUF932). Members of this uncharacterized protein family are found in various Mycobacterium phage genomes, in Streptomyces coelicolor plasmid SCP1, and in bacterial genomes near various markers that suggest lateral gene transfer. The function is unknown. [Mobile and extrachromosomal element functions, Other]	0
416442	cl12130	Bacteriocin_IId	Bacteriocin class IId cyclical uberolysin-like. Circular bacteriocins are antibiotic proteins made by ribosomal translation of a precursor molecule, followed by cleavage and circularization. Members of this subclass of the circular bacteriocins include circularin A from Clostridium beijerinckii, bacteriocin AS-48 from Enterococcus faecalis, uberolysin from Streptococcus uberis, and carnocyclin A from Carnobacterium maltaromaticum. The mature circularized peptides average about 70 amino acids in size. [Cellular processes, Toxin production and resistance]	0
416443	cl12133	CbiG_C	Cobalamin synthesis G C-terminus. cobalamin biosynthesis protein CbiG; Provisional	0
416444	cl12138	ThylakoidFormat	Thylakoid formation protein. Thf1-like protein; Reviewed	0
386285	cl12141	Tweety_N	N-terminal domain of the protein encoded by the Drosophila tweety gene and related proteins, a family of chloride ion channels. The tweety (tty) gene has not been characterized at the protein level. However, it is thought to form a membrane protein with five potential membrane-spanning regions. A number of potential functions have been suggested in.	0
353372	cl12219	PRK15003	cytochrome d ubiquinol oxidase subunit II. part of a two component cytochrome D terminal complex. Terminal reaction in the aerobic respiratory chain. [Energy metabolism, Electron transport]	0
416450	cl12235	Phage_int_SAM_1	Phage integrase, N-terminal SAM-like domain. FliZ is involved in the regulation of flagellar assembly and possibly also the down-regulation of the motile phenotype. FliZ interacts with the flagellar translational activator FlhCD complex.	0
353374	cl12236	VirB7	Outer membrane lipoprotein virB7. type IV secretion system lipoprotein VirB7; Provisional	0
416452	cl12246	OrfB_IS605	Probable transposase. This family includes IS891, IS1136 and IS1341. DUF1225, pfam06774, has now been merged into this family.	0
416455	cl12263	CytoC_RC	Cytochrome C subunit of the bacterial photosynthetic reaction center. Photosynthesis in purple bacteria is dependent on light-induced electron transfer in the reaction centre (RC), coupled to the uptake of protons from the cytoplasm. The RC contains a cytochrome molecule which re-reduces the oxidized electron donor.	0
416457	cl12283	IPK	Inositol polyphosphate kinase. inositol polyphosphate multikinase	0
416473	cl12345	DUF1425	Putative periplasmic lipoprotein. This family consists of several hypothetical bacterial proteins of around 125 residues in length. Several members of this family are described as putative lipoproteins and are often known as YcfL. The function of this family is unknown.	0
416479	cl12363	PrgH	Type III secretion system protein PrgH-EprH (PrgH). In Salmonella, this gene is part of a four-gene operon PrgHIJK and in general is found in type III secretion operons. PrgH has been shown to be required for secretion, as well as being a structural component of the needle complex. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	0
416485	cl12377	Brr6_like_C_C	Di-sulfide bridge nucleocytoplasmic transport domain. Brr6_like_C_C is the highly conserved C-terminal region of a group of proteins found in fungi. It carries four highly conserved cysteine residues. It is suggested that members of the family interact with each other via di-sulfide bridges to form a complex which is involved in nucleocytoplasmic transport.	0
416532	cl12494	TM6SF1-like	transmembrane 6 superfamily member 1, member 2, and similar proteins. This is a eukaryotic family of uncharacterized proteins. Some of the proteins in this family are annotated as membrane proteins.	0
386428	cl12560	DUF2806	Protein of unknown function (DUF2806). Members of this protein family are conserved hypothetical proteins with a limited species distribution within the Gammaproteobacteria. It is common in the genera Vibrio and Shewanella, and in this resembles the C-terminal domain and putative protein sorting motif TIGR03501. This model, by design, does not extend to all homologs, but rather represents a particular clade.	0
416600	cl12603	YqgB	Virulence promoting factor. YqgB encodes adaptive factors that act in synergy with vqfZ, enabling the bacteria to cope with the physical environment in vivo, facilitating colonisation of the host.	0
416617	cl12633	DUF2859	Protein of unknown function (DUF2859). This model describes a protein family exemplified by PFL_4695 of Pseudomonas fluorescens Pf-5. Full-length proteins in this family show some architectural variety, but this model represents a conserved domain. Most or all member proteins belong to laterally transferred chromosomal islands called integrative conjugative elements, or ICE.	0
416638	cl12689	DUF2909	Protein of unknown function (DUF2909). Members of the seed alignment for this family are small (average length 68 residues), strictly bacterial, and extremely hydrophobic. Pfam model PF04588 (HIG_1_N) includes both eukaryotic proteins, including a protein from the fish Gillichthys mirabilis, and the members of this family. Similarity between those eukaryotic proteins and the members of this model may represent convergent evolution related to the similar composition of their transmembrane alpha-helical regions, rather than a common origin or common function.	0
416688	cl12752	EccE	Putative type VII ESX secretion system translocon, EccE. This model represents the transmembrane protein EccB of the actinobacterial flavor of type VII secretion systems. Species such as Mycobacterium tuberculosis have several instances of this system per genome, designated EccE1, EccE2, etc. This model represents a conserved core region, and many members have 200 or more additional C-terminal residues. [Protein fate, Protein and peptide secretion and trafficking]	0
416692	cl12757	DUF2993	Protein of unknown function (DUF2993). This family of proteins with unknown function appears to be restricted to Cyanobacteria.	0
416695	cl12761	Chs5_N	N-terminal dimerization domain of Chs5 and similar proteins. This domain is found at the N-terminus of fungal chitin biosynthesis protein CHS5. It functions as a dimerization domain.	0
416746	cl12832	DUF3090	Protein of unknown function (DUF3090). The conserved hypothetical protein described here occurs as part of the trio of uncharacterized proteins common in the Actinobacteria.	0
416797	cl12902	DUF3168	Protein of unknown function (DUF3168). This family of proteins has no known function but is likely to be a component of bacteriophage.	0
416815	cl12928	IcmL	inner membrane protein IcmL/DotI. IcmL contains two amphipathic beta-sheet regions, required for the pore-forming ability which may be related to the transfer of this protein into a host cell membrane. The icmL gene shows significant similarity to plasmid genes involved in conjugation however IcmL is thought to be required for macrophage killing. It is unknown whether conjugation plays a role in macrophage killing. This is a family of DotI/IcmL proteins of type IVb secretion systems, that reside in the inner-membrane. It carries a single transmembrane helix in the N-terminal conserved region, has an extra-periplasmic domain, and is conserved in all T4BSSs including I-type conjugation systems (TraM). DotI/IcmL (and DotJ) may form an inner membrane complex that associates with the core complex.	0
416837	cl12968	DUF2895	Protein of unknown function (DUF2895). Members of this protein family are found occasionally on plasmids such as the Pseudomonas putida TOL plasmid pWWO_p085. Usually, however, they are found on the bacterial main chromosome in regions flanked by markers of conjugative transfer and/or transposition. The function is unknown. [Mobile and extrachromosomal element functions, Plasmid functions]	0
416842	cl12973	ArsP_2	Putative, 10TM heavy-metal exporter. Most proteins of this family have 8 transmembrane domains with two 4 transmembrane halves separated by a hydrophilic loop of variable sizes. It has been reported that some proteins of this family are involved in arsenate/arsenite resistance.	0
325694	cl13040	SeleniumBinding	Selenium binding protein. This model describes a homopentameric selenium-binding protein with a suggested role in selenium transport and delivery to selenophosphate synthase, the SelD protein. This protein family is closely related to pfam01906, but is shorter because of several deleted regions. It is restricted to the archaeal genus Methanococcus.	0
416885	cl13041	CopK	Copper resistance protein K. CopK is a periplasmic dimeric protein which is strongly up-regulated in the presence of copper, leading to a high periplasmic accumulation. CopK has two different binding sites for Cu(I), each with a different affinity for the metal. Binding of the first Cu(I) ion induces a conformational change of CopK which involves dissociation of the dimeric apo-protein. Binding of a second Cu(I) further increases the plasticity of the protein. CopK has features that are common with functionally related proteins such as a structure consisting of an all-beta fold and a methionine-rich Cu(I) binding site.	0
416900	cl13066	Omp28	Outer membrane protein Omp28. The Omp28 family of lipoproteins is named for a founding member described in Porphyromonas gingivalis, where it has been shown across many strains to be an expressed surface antigen. All members of the family are predicted lipoproteins.	0
416918	cl13107	TSPcc	Coiled coil region of thrombospondin. This family of proteins represents the five-stranded coiled-coil domain of cartilage oligomeric matrix protein (COMP). This region has a binding site between two internal rings formed by Leu37 and Thr40.	0
416923	cl13117	DUF4352	Domain of unknown function (DUF4352). Members of this family are putative lipoproteins that fall into the Antigen MPT63/MPB63 (immunoprotective extracellular protein) superfamily.	0
416931	cl13131	rap1_RCT	C-terminal domain of RAP1 recruits proteins to telomeres. This family of proteins represents the C-terminal domain of the protein Rap-1, which plays a distinct role in silencing at the silent mating-type loci and telomeres. The Rap-1 C-terminus adopts an all-helical fold. Rap1 carries out its function by recruiting the Sir3 and Sir4 proteins to chromatin via its C terminal domain. Rap1 is otherwise known as TRF2-interacting protein, as it is one of the six subunit components of the Shelterin complex. Shelterin protects telomere ends from attack by DNA-repair mechanisms. Model doesn't capture Sch. pombe as it cuts this sequence into two.	0
416935	cl13137	LcnG-beta	Lactococcin G-beta. This HMM was built to improve on Pfam model PF11632, which in version PF11632.8 had a two-member seed. It includes 12 residues of leader peptide and GlyGly cleavage motif (see TIGR01847), and has a shorter but more broadly conserved core peptide region. Characterized member proteins include lactococcin G and enterocin 1071B.	0
416942	cl13152	RLR_C_like	C-terminal domain of Retinoic acid-inducible gene (RIG)-I-like Receptors, Cereblon (CRBN), and similar protein domains. This family of proteins represents the regulatory domain RD of RIG-I, a protein which initiates a signalling cascade that provides essential antiviral protection for the host. The RD domain binds viral RNA, activating the RIG-I ATPase by RNA-dependent dimerization. The structure of RD contains a zinc-binding domain and is thought to confer ligand specificity.	0
416963	cl13193	CrtA-like	spheroidene monooxygenase and similar proteins. This bacterial family of proteins has no known function.	0
416974	cl13209	CPSF73-100_C	Pre-mRNA 3'-end-processing endonuclease polyadenylation factor C-term. The exact function of this domain is not known.	0
416993	cl13241	Bacteriocin_IIi	Aureocin-like type II bacteriocin. Members of this family include leaderless, unmodified class IId bacteriocins such as lacticin Q, BacSp222, and the founding member aureocin A53.	0
416997	cl13247	Candida_ALS_N	Cell-wall agglutinin N-terminal ligand-sugar binding. This is likely to be the sugar or ligand binding domain of the yeast alpha-agglutinins.	0
417110	cl13432	DUF3487	Protein of unknown function (DUF3487). Members of this protein family are found occasionally on plasmids. Usually, however, they are found on the bacterial main chromosome in regions flanked by markers of conjugative transfer and/or transposition. [Mobile and extrachromosomal element functions, Plasmid functions]	0
417120	cl13446	Telomerase_RBD	Telomerase ribonucleoprotein complex - RNA binding domain. Telomeres in most organisms are comprised of tandem simple sequence repeats. The total length of telomeric repeat sequence at each chromosome end is determined in a balance of sequence loss and sequence addition. One major influence on telomere length is the enzyme telomerase. It is a reverse transcriptase that adds these simple sequence repeats to chromosome ends by copying a template sequence within the RNA component of the enzyme. The RNA binding domain of telomerase - TRBD - is made up of twelve alpha helices and two short beta sheets. How telomerase and associated regulatory factors physically interact and function with each other to maintain appropriate telomere length is poorly understood. It is known however that TRBD is involved in formation of the holoenzyme (which performs the telomere extension) in addition to recognition and binding of RNA.	0
417131	cl13463	FAT-like_CAS_C	C-terminal FAT-like Four helix bundle domain, also called DUF3513, of CAS (Crk-Associated Substrate) scaffolding proteins; a protein interaction module. This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 192 and 218 amino acids in length. This domain is found associated with pfam00018 and pfam08824. This domain has a conserved QPP sequence motif.	0
417171	cl13524	DUF3573	Protein of unknown function (DUF3573). LbtU, from Legionella pneumophila, a novel TonB-independent siderophore uptake outer membrane protein from a species that lacks TonB, is the founding member of a class of porins that may be involved generally in siderophore-mediated iron acquisition.	0
417288	cl13718	TryThrA_C	Tryptophan-Threonine-rich plasmodium antigen C terminal. tryptophan/threonine-rich antigen superfamily; Provisional	0
417301	cl13749	eIF3G	eIF3G domain found in eukaryotic translation initiation factor 3 subunit G (eIF-3G) and similar proteins. This domain family is found in eukaryotes, and is approximately 130 amino acids in length. The family is found in association with pfam00076. This family is subunit G of the eukaryotic translation initiation factor 3. Subunit G is required for eIF3 integrity.	0
417311	cl13764	ASH	Abnormal spindle-like microcephaly-assoc'd, ASPM-SPD-2-Hydin. TMEM131_like is a family of bacterial, plant and other metazoa transmembrane proteins. Many of the members are multi-pass transmembrane proteins.	0
417412	cl13934	Inhibitor_I10	Serine endopeptidase inhibitors. Members of the microviridin/marinostatin are ribosomally translated peptides whose post-translational processing converts them into tricyclic depsipeptides that serve as serine proteinase inhibitors. A single precursor usually has one core peptide region near the C-terminus, with a nearly invariant TxKYPSD motif, but may instead have two or three repeats of the core region.	0
417445	cl13983	DUF3774	Wound-induced protein. hypothetical protein; Provisional	0
276033	cl13994	DUF326	Cysteine-rich 4 helical bundle widely conserved in bacteria. Members of this family average about 150 amino acids in length, beginning with a twin-arginine translocation signal sequence, then a His-rich spacer region, followed by a ~105-residue region in which thirteen positions are nearly invariant Cys residues. CDD (Conserved Domain Database) assigns members of this family to clan cl13994, the DUF326 superfamily, based on homology to PA2107 from Pseudomonas aeruginosa. PA2107 is a cysteine-rich four helical bundle protein, with solved structure PDB:3KAW.	0
417454	cl13995	MPP_superfamily	metallophosphatase superfamily, metallophosphatase domain. Members of this family are part of the Calcineurin-like phosphoesterase superfamily.	0
417455	cl13996	MPN	Mpr1p, Pad1p N-terminal (MPN) domains. These are metalloenzymes that function as the ubiquitin isopeptidase/ deubiquitinase in the ubiquitin-based signaling and protein turnover pathways in eukaryotes. Prokaryotic JAB domains are predicted to have a similar role in their cognates of the ubiquitin modification pathway. The domain is widely found in bacteria, archaea and phages where they are present in several gene contexts in addition to those that correspond to the prokaryotic cognates of the eukaryotic Ub pathway. Other contexts in which JAB domains are present include gene neighbor associations with ubiquitin fold domains in cysteine and siderophore biosynthesis, and phage tail morphogenesis, where they are shown or predicted to process the associated ubiquitin. A distinct family, the RadC-like JAB domains are widespread in bacteria and are predicted to function as nucleases. In halophilic archaea the JAB domain shows strong gene-neighborhood associations with a nucleotidyltransferase suggesting a role in nucleotide metabolism.	0
417456	cl13999	rhv_like	N/A. CAUTION: This alignment is very weak. It can not be generated by clustalw. If a representative set is used for a seed, many so-called members are not recognized. The family should probably be split up into sub-families. Capsid proteins of picornaviruses. Picornaviruses are non-enveloped plus-strand ssRNA animal viruses with icosahedral capsids. They include rhinovirus (common cold) and poliovirus. Common structure is an 8-stranded beta sandwich. Variations (one or two extra strands) occur.	0
417457	cl14009	DUF5837	Family of unknown function (DUF5837). This model represents a conserved N-terminal region shared by microcyclamide and patellamide bacteriocin precursors. These bacteriocin precursors are associated with heterocyclization. Related precursors are found in family TIGR04446.	0
417458	cl14014	Sec-ASP3	Accessory Sec secretory system ASP3. This protein is designated Asp3 because, along with SecY2, SecA2, and other proteins it is part of the accessory Sec system. The system is involved in the export of serine-rich glycoproteins important for virulence in a number of Gram-positive species, including Streptococcus gordonii and Staphylococcus aureus. This protein family is assigned to transport rather than glycosylation function, but the specific molecular role is unknown. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	0
417459	cl14015	Asp2	Accessory Sec system GspB-transporter. This protein is designated Asp2 because, along with SecY2, SecA2, and other proteins it is part of the accessory secretory protein system. The system is involved in the export of serine-rich glycoproteins important for virulence in a number of Gram-positive species, including Streptococcus gordonii and Staphylococcus aureus. This protein family is assigned to transport rather than glycosylation function, but the specific molecular role is unknown. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	0
417460	cl14019	Prok-E2_D	Prokaryotic E2 family D. A novel genetic system characterized by six major proteins, including a ParB homolog and a ThiF homolog, is designated PRTRC, or ParB-Related, ThiF-Related Cassette. This protein family is designated protein B.	0
417461	cl14020	Prok_Ub	Prokaryotic Ubiquitin. A novel genetic system characterized by six major proteins, including a ParB homolog and a ThiF homolog, is designated PRTRC, or ParB-Related, ThiF-Related Cassette. It is often found on plasmids. This protein family is designated PRTRC system protein C.	0
417462	cl14023	DUF4400	Domain of unknown function (DUF4400). Members of this protein family are found occasionally on plasmids such as the Pseudomonas putida TOL plasmid pWWO_p085. Usually, however, they are found on the bacterial main chromosome in regions flanked by markers of conjugative transfer and/or transposition. [Mobile and extrachromosomal element functions, Plasmid functions]	0
417463	cl14026	BCD	Beta-carotene 15,15'-dioxygenase. This integral membrane protein family includes Brp (bacterio-opsin related protein) and Blh (Brp-like protein). Bacteriorhodopsin is a light-driven proton pump with a covalently bound retinal cofactor that appears to be derived from beta-carotene. Blh has been shown to cleave beta-carotene to produce two all-trans retinal molecules. Mammalian enzymes with similar enzymatic function are not multiple membrane spanning proteins and are not homologous.	0
417464	cl14057	BPL_LplA_LipB	biotin-lipoate ligase family. This family includes biotin protein ligase, lipoate-protein ligase A and B. Biotin is covalently attached at the active site of certain enzymes that transfer carbon dioxide from bicarbonate to organic acids to form cellular metabolites. Biotin protein ligase (BPL) is the enzyme responsible for attaching biotin to a specific lysine at the active site of biotin enzymes. Each organism probably has only one BPL. Biotin attachment is a two step reaction that results in the formation of an amide linkage between the carboxyl group of biotin and the epsilon-amino group of the modified lysine. Lipoate-protein ligase A (LPLA) catalyzes the formation of an amide linkage between lipoic acid and a specific lysine residue in lipoate dependent enzymes. The unusual biosynthesis pathway of lipoic acid is mechanistically intertwined with attachment of the cofactor.	0
417465	cl14058	lectin_L-type	legume lectins. Lectins are structurally diverse proteins that bind to specific carbohydrates. This family includes the VIP36 and ERGIC-53 lectins. These two proteins were the first recognized members of a family of animal lectins similar (19-24%) to the leguminous plant lectins. The alignment for this family aligns residues lying towards the N-terminus, where the similarity of VIP36 and ERGIC-53 is greatest. However, while Fiedler and Simons identified these proteins as a new family of animal lectins, our alignment also includes yeast sequences. ERGIC-53 is a 53kD protein, localized to the intermediate region between the endoplasmic reticulum and the Golgi apparatus (ER-Golgi-Intermediate Compartment, ERGIC). It was identified as a calcium-dependent, mannose-specific lectin. Its dysfunction has been associated with combined factors V and VIII deficiency OMIM:227300 OMIM:601567, suggesting an important and substrate-specific role for ERGIC-53 in the glycoprotein-secreting pathway.	0
417466	cl14106	RIFIN	Rifin. This model represents the rifin branch of the rifin/stevor family (pfam02009) of predicted variant surface antigens as found in Plasmodium falciparum. This model is based on a set of rifin sequences kindly provided by Matt Berriman from the Sanger Center. This is a global model and assesses a penalty for incomplete sequence. Additional fragmentary sequences may be found with the fragment model and a cutoff of 20 bits.	0
246618	cl14192	Phage_T7_Capsid	Phage T7 capsid assembly protein. capsid assembly protein	0
353808	cl14340	B277	Family of unknown function. hypothetical protein	0
417467	cl14348	T4-gp15_tss	T4-like virus Myoviridae tail sheath stabilizer. tail sheath stabilizer and completion protein; Provisional	0
417468	cl14362	DUF5856	Family of unknown function (DUF5856). hypothetical protein; Provisional	0
353809	cl14364	Gp67	Gene product 67. prohead core protein; Provisional	0
353810	cl14502	E7R	Viral Protein E7. putative myristoylated protein; Provisional	0
353811	cl14561	An_peroxidase_like	Animal heme peroxidases and related proteins. Peroxidasin is a secreted heme peroxidase which is involved in hydrogen peroxide metabolism and peroxidative reactions in the cardiovascular system. The domain co-occurs with extracellular matrix domains and may play a role in the formation of the extracellular matrix.	0
417469	cl14571	Tocopherol_cycl	Tocopherol cyclase. tocopherol cyclase	0
417470	cl14578	GrlR	T3SS negative regulator,GrlR. negative regulator GrlR; Provisional	0
417471	cl14603	C2	C2 domain. The Dock180/Dock1 and Zizimin proteins are atypical GTP/GDP exchange factors for the small GTPases Rac and Cdc42 and are implicated in cell migration and phagocytosis. Across all Dock180 proteins, two regions are conserved: C-terminus termed CZH2 or DHR2 (or the Dedicator of cytokinesis) whereas CZH1/DHR1 contain a new family of the C2 domain.	0
417472	cl14605	DUF619-like	DUF619 domain of various N-acetylglutamate Kinases and N-acetylglutamate Synthases. This is the C-terminal NAT or N-acetyltransferase domain of bifunctional N-acetylglutamate synthase/kinases. It catalyzes the first two steps in arginine biosynthesis. This domain contains the putative NAGS - N-acetylglutamate synthase - active site. It is found at the C-terminus of Neurospora crassa acetylglutamate synthase - amino-acid acetyltransferase, EC: 2.3.1.1. It is also found C-terminal to the amino acid kinase region (pfam00696) in some fungal acetylglutamate kinase enzymes. It stabilizes the yeast NAGK, N-acetyl-L-glutamate kinase, slows catalysis and modulates feed-back inhibition by arginine. This domain is found to be the N-acetyltransferase (NAT) domain, and it has a typical GCN5-related NAT fold and a site that catalyzes NAG synthesis which is located >25 Angstrom away from the L-arginine binding site in the N-terminal domain pfam00696.	0
417473	cl14606	Reeler_cohesin_like	Domains similar to the eukaryotic reeler domain and bacterial cohesins. Domain found in bacteria with undetermined function. Its structure has been determined and is an immunoglobulin-like fold.	0
387361	cl14607	OPT	OPT oligopeptide transporter protein. This protein represents a small family of integral membrane proteins from Gram-negative bacteria, a Gram-positive bacteria, and an archaeal species. Members of this family contain 15 to 18 GES predicted transmembrane regions, and this family has extensive homology to a family of yeast tetrapeptide transporters, including isp4 (Schizosaccharomyces pombe) and Opt1 (Candida albicans). EspB, an apparent equivalog from Myxococcus xanthus, shares an operon with a two component system regulatory protein, and is required for the normal timing of sporulation after the aggregation of cells. This is consistent with a role in transporting oligopeptides as signals across the membrane. [Transport and binding proteins, Amino acids, peptides and amines]	0
417474	cl14608	P53	P53 DNA-binding domain. Members of this family of DNA-binding domains are found in the transcription factor CEP-1. They adopt a beta sandwich structure, with nine strands in two beta-sheets, in a Greek-key topology.	0
417475	cl14615	PI-PLCc_GDPD_SF	Catalytic domain of phosphoinositide-specific phospholipase C-like phosphodiesterases superfamily. PI-PLC-C1 is a family of calcium 2+-dependent phosphatidylinositol-specific phospholipase C1 enzymes from bacteria and fungi. The enzyme classification number is EC:3.1.4.11. This enzyme is involved in part of the myo-inositol phosphate metabolic pathway.	0
417477	cl14631	Cdt1_c	The C-terminal fold of replication licensing factor Cdt1 is essential for Cdt1 activity and directly interacts with MCM2-7 helicase. This is the C-terminal domain of DNA replication factor Cdt1. This domain binds the MCM complex.	0
417478	cl14632	VOC	vicinal oxygen chelate (VOC) family. This domain is one of two barrel-shaped regions that together form the active enzyme, 4-hydroxyphenylpyruvic acid dioxygenase, EC:1.13.11.27. As can be deduced from the disposition of the various Glyoxalase families, _2, _3 and _4 in Pfam, pfam00903, pfam12681, pfam13468, pfam13669, these two regions are similar to be indicative of a gene-duplication event. At the individual sequence level slight differences in conformation have given rise to slightly different functions. In the case of UniProt:P80064, 4-hydroxyphenylpyruvic acid dioxygenase catalyzes the formation of homogentisate from 4-hydroxyphenylpyruvate, and the pyruvate part of the HPPD substrate (4-hydroxyphenylpyruvate), derived from L-tyrosine, and the O2 molecule occupy the three free coordination sites of the catalytic iron atom in the C-terminal domain. In plants and photosynthetic bacteria, the tyrosine degradation pathway is crucial because homogentisate, a tyrosine degradation product, is a precursor for the biosynthesis of photosynthetic pigments, such as quinones or tocopherols.	0
417479	cl14633	DD	Death Domain Superfamily of protein-protein interaction domains. In the probable ATP-dependent RNA helicase DDX58 this CARD domain is found near the N-terminus and interacts with the C-terminal domain.	0
417480	cl14643	SRPBCC	START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC (SRPBCC) ligand-binding domain superfamily. This domain is found on aromatic hydroxylating enzymes such as 2-oxo-1,2-dihydroquinoline 8-monooxygenase from Pseudomonas putida and carbazole 1,9a-dioxygenase from Janthinobacterium. These enzymes are homotrimers and are distantly related to the typical oxygenase. This domain is found C terminal to the Rieske domain which binds an iron-sulphur cluster.	0
417481	cl14647	GH43_62_32_68_117_130	Glycosyl hydrolase families: GH43, GH62, GH32, GH68, GH117, GH130. The glycosyl hydrolase family 43 contains members that are arabinanases. Arabinanases hydrolyze the alpha-1,5-linked L-arabinofuranoside backbone of plant cell wall arabinans. The structure of arabinanase Arb43A from Cellvibrio japonicus reveals a five-bladed beta-propeller fold. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	0
417482	cl14648	Aldose_epim	aldose 1-epimerase superfamily. Members of this protein family act as galactose mutarotase (D-galactose 1-epimerase) and participate in the Leloir pathway for galactose/glucose interconversion. All members of the seed alignment for this model are found in gene clusters with other enzymes of the Leloir pathway. This enzyme family belongs to the aldose 1-epimerase family, described by pfam01263. However, the enzyme described as aldose 1-epimerase itself (EC 5.1.3.3) is called broadly specific for D-glucose, L-arabinose, D-xylose, D-galactose, maltose and lactose. The restricted genome context for genes in this family suggests members should act primarily on D-galactose.	0
417483	cl14649	BRO1_Alix_like	Protein-interacting Bro1-like domain of mammalian Alix and related domains. This domain is found in a number of proteins including Rhophilin and BRO1. It is known to have a role in endosomal targeting. ESCRT-III subunit Snf7 binds to a conserved hydrophobic patch in the BRO1 domain that is required for protein complex formation and for the protein-sorting function of BRO1.	0
417484	cl14651	RNA_pol_Rpb6	RNA polymerase Rpb6. DNA-directed RNA polymerase subunit omega; Reviewed	0
417485	cl14653	KdgT	2-keto-3-deoxygluconate permease. This family includes the characterized 2-Keto-3-Deoxygluconate transporters from Bacillus subtilis and Erwinia chrysanthemi. There are homologs of this protein found in both gram-positive and gram-negative bacteria. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	0
353824	cl14654	V_Alix_like	Protein-interacting V-domain of mammalian Alix and related domains. This domain family is comprised of uncharacterized plant proteins. It belongs to the V_Alix_like superfamily which includes the V-shaped (V) domains of Bro1 and Rim20 (also known as PalA) from Saccharomyces cerevisiae, mammalian Alix (apoptosis-linked gene-2 interacting protein X), (His-Domain) type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), and related domains. Alix, also known as apoptosis-linked gene-2 interacting protein 1 (AIP1), participates in membrane remodeling processes during the budding of enveloped viruses, vesicle budding inside late endosomal multivesicular bodies (MVBs), and the abscission reactions of mammalian cell division. It also functions in apoptosis. HD-PTP functions in cell migration and endosomal trafficking, Bro1 in endosomal trafficking, and Rim20 in the response to the external pH via the Rim101 pathway. Alix, HD-PTP, Bro1, and Rim20 all interact with the ESCRT (Endosomal Sorting Complexes Required for Transport) system. The mammalian Alix V-domain (belonging to a different family) contains a binding site, partially conserved in the superfamily, for the retroviral late assembly (L) domain YPXnL motif. The Alix V-domain is also a dimerization domain. In addition to this V-domain, members of the V_Alix_Rim20_Bro1_like superfamily also have an N-terminal Bro1-like domain, which binds components of the ESCRT-III complex. The Bro1-like domains of Alix and HD-PTP can also bind to human immunodeficiency virus type 1 (HIV-1) nucleocapsid. Many members of the V_Alix_like superfamily also have a proline-rich region (PRR).	0
187418	cl14664	PRK15266	N/A. subtilase cytotoxin subunit B-like protein; Provisional	0
417486	cl14670	CP12	CP12 domain. CP12 gene family protein; Provisional	0
417487	cl14673	DUF1266	Protein of unknown function (DUF1266). hypothetical protein; Provisional	0
417488	cl14674	DctA-YdbH	Dicarboxylate transport. In certain bacterial families this protein is expressed from the ydbH gene, and there is a suggestion that this is a form of DctA or dicarboxylate transport protein. Dicarboxylate transport proteins are found in aerobic bacteria which grow on succinate or other C4-dicarboxylates.	0
417489	cl14675	PorP_SprF	Type IX secretion system membrane protein PorP/SprF. This model describes a protein family unique to, and greatly expanded in, the Bacteroidetes. Species in this lineage include several, such as Cytophaga hutchinsonii and Flavobacterium johnsoniae, that have type IX secretion systems (T9SS) and exhibit a poorly understood rapid gliding phenotype. Several members of this protein family are found in operons with other genes whose loss leads to a loss of this motility.	0
387379	cl14676	PgaD	PgaD-like protein. Members of this protein family are PgaD, essential to the production of poly-beta-1,6-N-acetyl-D-glucosamine (PGA). This cytoplasmic membrane protein appears to be an auxiliary subunit to the PGA synthase, PgaC (TIGR03937). [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	0
387380	cl14695	C166	Family of unknown function. hypothetical protein; Provisional	0
301338	cl14701	SopD	Salmonella outer protein D. SopD is a type III virulence effector protein whose structure consists of 38% alpha-helix and 26% beta-strand.	0
353831	cl14716	UL16	Viral unique long protein 16. tegument protein UL16; Provisional	0
417491	cl14728	DUF4922	Domain of unknown function (DUF4922). GDP-L-galactose-hexose-1-phosphate guanyltransferase; Provisional	0
387383	cl14744	small_mem_YnhF	YnhF family membrane protein; Validated. Members of this protein family are small membrane proteins, about 29 amino acids in length. YnhF from E. coli was shown to have an intact fMet residue at the N-terminus and to be chloroform-soluble. The previously generated narrow cluster PRK14756 includes some members of this family.	0
417492	cl14758	T3SS_basalb_I	Type III secretion basal body protein I, YscI, HrpB, PscI. T3SS_basalb_I represents a family of Gram-negative type III secretion basal body proteins I. It is the inner rod protein of the secreted needle. YscI is suggested to form a rod that allows substrate passage across the inner membrane of the needle protein YscF through it.	0
301342	cl14772	PDU_like	Putative propanediol utilisation. Members of this family are PduM, a protein essential for forming functional microcompartments in which a trimeric B12-dependent enzyme acts as a dehydratase for 1,2-propanediol (Salmonella enterica) or glycerol (Lactobacillus reuteri).	0
417493	cl14778	DnaJ-X	X-domain of DnaJ-containing. RESA-like protein; Provisional	0
417494	cl14782	RNase_H_like	Ribonuclease H-like superfamily, including RNase H, HI, HII, HIII, and RNase-like domain IV of spliceosomal protein Prp8. This domain is found in plants and appears to be part of a retrotransposon.	0
417495	cl14783	DOMON_like	Domon-like ligand-binding domains. CBM9_2 is a family of putative endoxylanase-like proteins that belong to the Carbohydrate-binding family 9.	0
417496	cl14785	FMT_C_like	Carboxy-terminal domain of Formyltransferase and similar domains. Methylpurine-DNA glycosylase is a base excision-repair protein. It is responsible for the hydrolysis of the deoxyribose N-glycosidic bond, excising 3-methyladenine and 3-methylguanine from damaged DNA.	0
276063	cl14805	Csx14_I-U	CRISPR/Cas system-associated protein Csx14. This model describes a CRISPR-associated (cas) protein unique to the Dpsyc subtype (named for Desulfotalea psychrophila), a variant type I-C subtype, although not universal to that subtype. Members of this family occur in CRISPR loci of Geobacter sulfurreducens PCA, Gemmata obscuriglobus UQM 2246, Rhodospirillum centenum SW, Planctomyces limnophilus DSM 3776, and Methylosinus trichosporium OB3b.	0
417497	cl14807	ACE1-Sec16-like	Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16. Sec16 is a multi-domain vesicle coat protein. The C-terminal region is the part that binds to Sec23, a COPII vesicle coat protein. This association is part of the transport vesicle coat structure.	0
417498	cl14813	GluZincin	Gluzincin Peptidase family (thermolysin-like proteinases, TLPs) which includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins). Glycyl aminopeptidase is an unusual peptidase in that it has a preference for substrates with an N-terminal glycine or alanine. These proteins are found in Bacteria and in Archaea.	0
417499	cl14817	DUF1858	Domain of unknown function (DUF1858). Members of this protein family resemble the domain of unknown function DUF1858 described by pfam08984, but all members contain an apparent redox-active disulfide. In at least one member protein, a cysteine in the CXXC motif is substituted by a selenocysteine. Most member proteins consist of this domain only, but a few members are fused to or adjacent to members of the hybrid-cluster (prismane) family or the nitrite/sulfite reductase family. [Energy metabolism, Electron transport]	0
417500	cl14828	Lant_dehydr_C	Lantibiotic biosynthesis dehydratase C-term. This domain occurs within longer proteins that contain lantibiotic dehydratase domains (see pfam04737 and pfam04738), and as single-domain proteins in bacteriocin biosynthesis genomic contexts. Three named genes in this family, SioK in Streptomyces sioyaensis, TsrD in Streptomyces laurentii, and NosD in Streptomyces actuosus, all occur in regions associated with thiopeptide biosynthesis. [Cellular processes, Toxin production and resistance]	0
301355	cl14830	Mersacidin	Two-component Enterococcus faecalis cytolysin (EFC). This model recognizes a number of type 2 lantibiotic-type bacteriocins, related to but distinct from the family that includes lichenicidin and mersacidin. Sequence similarity among members consists largely of a 20-residue block of conserved sequence that covers most of the leader peptide region, absent from the mature lantibiotic. This is followed by a region with characteristic composition for lantibiotic precursor regions, rich in Ser and Thr and including a near-invariant Cys near or at the C-terminus, involved in cyclization. Members of this family typically are shorter than 70 amino acids. [Cellular processes, Toxin production and resistance]	0
417501	cl14834	TSCPD	TSCPD domain. This model describes a family of conserved hypothetical proteins of small size, typically ~85 residues, with four invariant Cys residues. This small protein is distantly homologous to a C-terminal domain found in proteins identified by N-terminal homology as ribonucleotide reductases. The rare and sporadic distribution of this protein family falls mostly within the subset of bacterial genomes containing the uncharacterized radical SAM protein modeled by TIGR03904. [Unknown function, General]	0
417502	cl14836	DUF4130	Domain of unknown function (DUF4130). This model represents a conserved hypothetical protein that almost invariably pairs with an uncharacterized radical SAM protein. The pair occurs in about twenty percent of completed prokaryotic genomes. About forty percent of the members of this family occur as fusion proteins, where the C-terminal domain belongs to the uracil-DNA glycosylase family, a DNA repair family (because uracil in DNA is deamidated cytosine). The linkage by gene clustering and correlated species distribution to a radical SAM protein, and by gene fusion to a DNA repair protein family, suggests a role in DNA modification and/or repair.	0
417503	cl14844	DUF3817	Domain of unknown function (DUF3817). This model describes a strictly bacterial integral membrane domain of about 85 residues in length. It occurs in proteins that on rare occasions are fused to transporter domains such as the major facilitator superfamily domain. Of three invariant residues, two occur as a His-Gly dipeptide in the middle of three predicted transmembrane helices. [Unknown function, General]	0
417504	cl14852	WYL	WYL domain. Members of this protein family belong to CRISPR-associated (Cas) gene clusters. The majority of members are Cyanobacterial.	0
417505	cl14855	Caps_synth_CapC	Capsule biosynthesis CapC. Of four genes commonly found to be involved in biosynthesis and export of poly-gamma-glutamate, pgsB(capB) and pgsC(capC) are found to be involved in the synthesis per se. Members of this family are designated PgsC, covering both cases in which the poly-gamma-glutamate is secreted and those in which it is retained to form capsular material. PgsC binds tightly to PgsB, which has been shown to have poly-gamma-glutamate activity. [Cell envelope, Other]	0
417506	cl14857	SdpA	Sporulation delaying protein SdpA. Members of this protein family resemble SdpA (Sporulation Delaying Protein A), a protein associated with production and export of the cannibalism peptide SdpC in Bacillus subtilis. Similar proteins are found in Myxococcus xanthus, Stigmatella aurantiaca DW4/3-1, Streptomyces sp. ACTE, etc.	0
276089	cl14861	AZL_007950_fam	AZL_007950 family protein. The first characterized methanobactin is made from a ribosomal precursor in Methylosinus trichosporium OB3b. Two additional species with homologous precursor peptides (family TIGR04071) are Azospirillum sp. B510 and Gluconacetobacter sp. SXCC-1. This model describes a clique of related sequences, domain or full-length, that occurs always and only next to a methanobactin precursor of the Mb-OB3b type. The model excludes several Pseudomonas proteins whose function is unknown, which likewise are in model TIGR04061, but which diverge toward the C-terminus.	0
417507	cl14869	SPASM	Iron-sulfur cluster-binding domain. This domain contains regions binding additional 4Fe4S clusters found in various radical SAM proteins C-terminal to the domain described by model pfam04055. Radical SAM enzymes with this domain tend to be involved in protein modification, including anaerobic sulfatase maturation proteins, a quinohemoprotein amine dehydrogenase biogenesis protein, the Pep1357-cyclizing radical SAM enzyme, and various bacteriocin biosynthesis proteins. The motif CxxCxxxxxCxxxC is nearly invariant for members of this family, although PqqE has a variant form. We name this domain SPASM for Subtilosin, PQQ, Anaerobic Sulfatase, and Mycofactocin.	0
276097	cl14874	Luminal_IRE1_like	The Luminal domain, a dimerization domain, of Inositol-requiring protein 1-like proteins. The Luminal domain is a dimerization domain present in Inositol-requiring protein 1 (IRE1), a serine/threonine protein kinase (STK) and a type I transmembrane protein that is localized in the endoplasmic reticulum (ER). IRE1, also called Endoplasmic reticulum (ER)-to-nucleus signaling protein (or ERN), is a kinase receptor that also contains an endoribonuclease domain in the cytoplasmic side. It plays roles in the signaling of the unfolded protein response (UPR), which is activated when protein misfolding is detected in the ER in order to decrease the synthesis of new proteins and increase the capacity of the ER to cope with the stress. IRE1 acts as an ER stress sensor and is the oldest and most conserved component of the UPR in eukaryotes. During ER stress, IRE1 dimerizes through its luminal domain and forms oligomers, allowing the kinase domain to undergo trans-autophosphorylation. This leads to a conformational change that stimulates its endoribonuclease activity and results in the cleavage of its mRNA substrate, HAC1 in yeast and Xbp1 in metazoans, promoting a splicing event that enables translation into a transcription factor which activates the UPR. Mammals contain two IRE1 proteins, IRE1alpha (or ERN1) and IRE1beta (or ERN2). IRE1alpha is expressed in all cells and tissues while IRE1beta is found only in intestinal epithelial cells.	0
417508	cl14876	Zinc_peptidase_like	Zinc peptidases M18, M20, M28, and M42. This domain consists of 4 beta strands and two alpha helices which make up the dimerization surface of members of the M20 family of peptidases. This family includes a range of zinc metallopeptidases belonging to several families in the peptidase classification. Family M20 are Glutamate carboxypeptidases. Peptidase family M25 contains X-His dipeptidases.	0
417509	cl14879	LabA_like_C	C-terminal domain of LabA_like proteins. A predicted RNA-binding domain found in insect Oskar and vertebrate TDRD5/TDRD7 proteins that nucleate or organize structurally related ribonucleoprotein (RNP) complexes, the polar granule and nuage, is poorly understood. The domain adopts the winged helix-turn-helix fold and binds RNA with a potential specificity for dsRNA. In eukaryotes this domain is often combined in the same polypeptide with protein-protein- or lipid-interaction domains that might play a role in anchoring these proteins to specific cytoskeletal structures. Thus, proteins with this domain might have a key role in the recognition and localization of dsRNA, including miRNAs, rasiRNAs and piRNAs hybridized to their targets. In other cases, this domain is fused to ubiquitin-binding, E3 ligase and ubiquitin-like domains indicating a previously under-appreciated role for ubiquitination in regulating the assembly and stability of nuage-like RNP complexes. Both bacteria and eukaryotes encode a conserved family of proteins that combines this predicted RNA-binding domain with a previously uncharacterized RNase domain belonging to the superfamily that includes the 5'->3' nucleases, PIN and NYN domains.	0
417510	cl14880	CBM6-CBM35-CBM36_like	Carbohydrate Binding Module 6 (CBM6) and CBM35_like superfamily. CBM_26 is a family of bacterial carbohydrate-binding modules frequently found at the C-terminus of enzymes. The combination is not unusual as the CBMs function to bring the relevant polysaccharide into close proximity to the active site.	0
417512	cl14897	HcyBio	Homocysteine biosynthesis enzyme, sulfur-incorporation. This presumed domain is about 360 residues long. The function of this domain is unknown. It is found in some proteins that have two C-terminal CBS pfam00571 domains. There are also proteins that contain two inserted Fe4S domains near the C-terminal end of the domain. The Methanothermobacter thermautotrophicus gene MTH_855 product has been misannotated as an inosine monophosphate dehydrogenase based on the similarity to the CBS domains. Based on genetic analyses in the methanogen Methanosarcina acetivorans, this family is a key component of the metabolic network for sulfide assimilation and trafficking in methanogens. It is essential to a novel, O-acetylhomoserine sulfhydrylase-independent pathway for homocysteine biosynthesis, and may catalyze sulfur incorporation into the side chain of an as yet unidentified amino acid precursor. The DUF39-CBS and DUF39-ferredoxin architectures repeatedly occur together in the genomes of methanogenic Archaea, suggesting they may be of diverged function. This is consistent with a phylogenetic reconstruction of the DUF39 family, which clearly distinguishes the CBS-associated and ferredoxin-associated DUF39s.	0
417513	cl14898	DUF1175	Protein of unknown function (DUF1175). This family consists of several hypothetical bacterial proteins of around 210 residues in length. The function of this family is unknown.	0
417514	cl14901	DDE_Tnp_Tn3	Tn3 transposase DDE domain. This family includes transposases of Tn3, Tn21, Tn1721, Tn2501, Tn3926 transposons from E-coli. The specific binding of the Tn3 transposase to DNA has been demonstrated. Sequence analysis has suggested that the invariant triad of Asp689, Asp765, Glu895 (numbering as in Tn3) may correspond to the D-D-35-E motif previously implicated in the catalysis of numerous transposases.	0
417515	cl14905	DUF1091	Protein of unknown function (DUF1091). This is a family of uncharacterized proteins. Based on its distant similarity to pfam02221 and conserved pattern of cysteine residues it is possible that these domains are also lipid binding.	0
417516	cl14906	AKAP_110	A-kinase anchor protein 110 kDa (AKAP 110). This family consists of several mammalian protein kinase A anchoring protein 3 (PRKA3) or A-kinase anchor protein 110 kDa (AKAP 110) sequences. Agents that increase intracellular cAMP are potent stimulators of sperm motility. Anchoring inhibitor peptides, designed to disrupt the interaction of the cAMP-dependent protein kinase A (PKA) with A kinase-anchoring proteins (AKAPs), are potent inhibitors of sperm motility. PKA anchoring is a key biochemical mechanism controlling motility. AKAP110 shares compartments with both RI and RII isoforms of PKA and may function as a regulator of both motility- and head-associated functions such as capacitation and the acrosome reaction.	0
301371	cl14909	GspL_C	GspL periplasmic domain. GspL-like protein; Provisional	0
417540	cl15003	TcpC_C	C-terminal domain of conjugative transposon protein TcpC. This family of proteins are annotated as conjugative transposon protein TcpC. The transfer clostridial plasmid (tcp) locus is part of some conjugative antibiotic resistance and virulence plasmids. TcpC was one of five genes whose products had low-level sequence identity to Tn916 proteins, having similarity to ORF13 homologs from Tn916, Tn5397, and CW459tet. This family of proteins is found in bacteria. Proteins in this family are typically between 302 and 351 amino acids in length.	0
417587	cl15079	PDS5	Sister chromatid cohesion protein PDS5. This HEAT repeat is found most frequently in sister chromatid cohesion proteins such as Nipped-B. HEAT repeats are found tandemly repeated in many proteins, and they appear to serve as flexible scaffolding on which other components can assemble.	0
417599	cl15102	CLU-central	An uncharacterized central domain of CLU mitochondrial proteins. Translation initiation factor eIF3 is a multi-subunit protein complex required for initiation of protein biosynthesis in eukaryotic cells. The complex promotes ribosome dissociation, the binding of the initiator methionyl-tRNA to the 40 S ribosomal subunit, and mRNA recruitment to the ribosome. The protein product from TIF31 genes in yeast is p135 which associates with the eIF3 but does not seem to be necessary for protein translation initiation.	0
417633	cl15166	RRP7_like	RRP7 domain ribosomal RNA-processing protein 7 (Rrp7p), ribosomal RNA-processing protein 7 homolog A (Rrp7A), and similar proteins. RRP7 is an essential protein in yeast that is involved in pre-rRNA processing and ribosome assembly. It is speculated to be required for correct assembly of rpS27 into the pre-ribosomal particle.	0
417677	cl15232	BACON	Bacteroidetes-Associated Carbohydrate-binding (putative) Often N-terminal (BACON) domain. This family represents a distinct class of BACON domains found in crAss-like phages, the most common viral family in the human gut, in which they are found in tail fiber genes. This suggests they may play a role in phage-host interactions.	0
417680	cl15236	PliI_like	Periplasmic lysozyme inhibitor, I-type (PliI) and similar proteins. Aeromonas hydrophila PliI is a dimeric periplasmic protein that enables bacteria to resist permeabilization of the outer membrane by the bactericidal action of lysozyme. PliI may be a direct inhibitor of lysozyme that inserts a conserved loop into the active site of type I (invertebrate) lysozymes.	0
417681	cl15237	Deltex_C	Domain found at the C-terminus of deltex-like. This is the C-terminal domains found in members of the Deltex family of proteins which comprises five members (DTX1, 2, 3, 4, and 3L). This conserved C-terminal region of about 150 residues of the Deltex family, is preceded by a RING E3 ligase domain in four of the members. Crystal structure of the Deltex C-terminal (DTC) domain reveals a fold composed of a central beta-sheet lined with two long parallel alpha-helices.	0
417682	cl15239	PLDc_SF	Catalytic domain of phospholipase D superfamily proteins. TrmB is an alpha-glucoside sensing transcriptional regulator. The protein is the transcriptional repressor for gene cluster encoding trehalose/maltose ABC transporter in T.litoralis and P.furiosus. TrmB has lost its DNA binding domain but retained its sugar recognition site. A nonreducing glucosyl residue is shared by all substrates bound to TrmB which suggests that it is a common recognition motif.	0
197448	cl15240	Reelin_subrepeat_like	Tandem repeat subunit of reelin and related proteins. Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the C-terminal subrepeat, which directly contacts the N-terminal subrepeat and the EGF domain in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1).	0
417683	cl15242	BfiI_C_EcoRII_N_B3	DNA binding domains of BfiI, EcoRII and plant B3 proteins. The N-terminal effector-binding domain of the Restriction Endonuclease EcoRII has a DNA recognition fold, allowing for binding to 5'-CCWGG sequences. It assumes a structure composed of an eight-stranded beta-sheet with the strands in the order of b2, b5, b4, b3, b7, b6, b1 and b8. They are mostly antiparallel to each other except that b3 is parallel to b7. Alternatively, it may also be viewed as consisting of two mini beta-sheets of four antiparallel beta-strands, sheet I from beta-strands b2, b5, b4, b3 and sheet II from strands b7, b6, b1, b8, folded into an open mixed beta-barrel with a novel topology. Sheet I has a simple Greek key motif while sheet II does not.	0
417684	cl15243	HemeO-like	heme oxygenase. The CADD, Chlamydia protein associating with death domains, crystal structure reveals a dimer of seven-helical bundles. Each bundle contains a di-iron centre adjacent to an internal cavity that forms an active site similar to that of methane mono-oxygenase hydrolase.	0
417685	cl15254	UBAN	polyubiquitin binding domain of NEMO and related proteins. CC2-LZ is a leucine-zipper domain associated with the CC2 coiled-coil region of NF-kappa-B essential modulator, NEMO. It plays a regulatory role, along with the very C-terminal zinc-finger; it contains a ubiquitin-binding domain (UBD) and represents one region that contributes to NEMO oligomerization. NEMO itself is an integral part of the IkappaB kinase complex and serves as a molecular switch via which the NF-kappaB signalling pathway is regulated.	0
417686	cl15255	SH2	Src homology 2 (SH2) domain. Cbl is an adaptor protein that binds EGF receptors (or other tyrosine kinases) and SH3 domains, functioning as a negative regulator of many signaling pathways. The N-terminal domain is evolutionarily conserved, and is known to bind to phosphorylated tyrosine residues. The so called N-terminal domain is actually 3 structural domains, of which this is the C-terminal SH2 domain.	0
417687	cl15257	GIY-YIG_SF	GIY-YIG nuclease domain superfamily. This domain was identified by Iyer and colleagues.	0
417688	cl15262	PUB	PNGase/UBA or UBX (PUB) domain of p97 adaptor proteins. The PUB (also known as PUG) domain is found in peptide N-glycanase where it functions as a AAA ATPase binding domain. This domain is also found on other proteins linked to the ubiquitin-proteasome system.	0
417689	cl15265	YjbR	YjbR. YjbR has a CyaY-like fold.	0
387591	cl15268	V4R	V4R domain. This model represents the component of bacteriochlorophyll synthetase responsible for reduction of the B-ring pendant ethylene (4-vinyl) group. It appears that this step must precede the reduction of ring D, at least by the "dark" protochlorophyllide reductase enzymes BchN, BchB and BchL. This family appears to be present in photosynthetic bacteria except for the cyanobacterial clade. Cyanobacteria must use a non-orthologous gene to carry out this required step for the biosynthesis of both bacteriochlorophyll and chlorophyll. [Biosynthesis of cofactors, prosthetic groups, and carriers, Chlorophyll and bacteriochlorphyll]	0
417690	cl15270	FinO_conjug_rep	N/A. This family includes ProQ, which is required for full activation of the osmoprotectant transporter, ProP, in Escherichia coli. This family includes several bacterial fertility inhibition (FINO) proteins. The conjugative transfer of F-like plasmids is repressed by FinO, an RNA binding protein. FinO interacts with the F-plasmid encoded traJ mRNA and its antisense RNA, FinP, stabilizing FinP against endonucleolytic degradation and facilitating sense-antisense RNA recognition. ProQ operates as an RNA-chaperone, binding RNA and bringing about both RNA strand-exchange and RNA duplexing. This suggests that in fact it does not regulate ProP transcription but rather regulates ProP translation through activity as an RNA-binding protein.	0
353932	cl15276	Phage_GPA	Bacteriophage replication gene A protein (GPA). DNA replication initiation protein gpA	0
417691	cl15278	TSP_1	Thrombospondin type 1 domain. Type 1 repeats in thrombospondin-1 bind and activate TGF-beta.	0
387594	cl15288	DUF2378	Protein of unknown function (DUF2378). This family consists of a set of at least 17 paralogous proteins in Myxococcus xanthus DK 1622. Members are about 200 amino acids in length. No other homologs are known; the function is unknown.	0
387595	cl15289	DUF2380	Predicted lipoprotein of unknown function (DUF2380). This family consists of at least 9 paralogs in Myxococcus xanthus, a member of the Deltaproteobacteria. One appears truncated toward the N-terminus; the others are predicted lipoproteins. The function is unknown.	0
417692	cl15307	TPKR_C2	Tyrosine-protein kinase receptor C2 Ig-like domain. In the tyrosine-protein kinase receptor NTRK1 this domain interacts with beta-nerve growth factor NGF.	0
417694	cl15347	CBM20	N/A. Novamyl (also known as acarviose transferase, ATase, maltogenic alpha-amylase, glucan 1,4-alpha-maltohydrolase, and AcbD), C-terminal CBM20 (carbohydrate-binding module, family 20) domain. Novamyl has a five-domain structure similar to that of cyclodextrin glucanotransferase (CGTase). Novamyl has a substrate-binding surface with an open groove which can accommodate both cyclodextrins and linear substrates. The CBM20 domain is found in a large number of starch degrading enzymes including alpha-amylase, beta-amylase, glucoamylase, and CGTase (cyclodextrin glucanotransferase). CBM20 is also present in proteins that have a regulatory role in starch metabolism in plants (e.g. alpha-amylase) or glycogen metabolism in mammals (e.g. laforin). CBM20 folds as an antiparallel beta-barrel structure with two starch binding sites. These two sites are thought to differ functionally with site 1 acting as the initial starch recognition site and site 2 involved in the specific recognition of appropriate regions of starch.	0
417695	cl15354	CBS_pair_SF	Two tandem repeats of the cystathionine beta-synthase (CBS pair) domains superfamily. CBS domains are small intracellular modules that pair together to form a stable globular domain. This family represents a single CBS domain. Pairs of these domains have been termed a Bateman domain. CBS domains have been shown to bind ligands with an adenosyl group such as AMP, ATP and S-AdoMet. CBS domains are found attached to a wide range of other protein domains suggesting that CBS domains may play a regulatory role making proteins sensitive to adenosyl carrying ligands. The region containing the CBS domains in Cystathionine-beta synthase is involved in regulation by S-AdoMet. CBS domain pairs from AMPK bind AMP or ATP. The CBS domains from IMPDH and the chloride channel CLC2 bind ATP.	0
417696	cl15368	RNase_Ire1_like	RNase domain (also known as the kinase extension nuclease domain) of Ire1 and RNase L. This domain is an endoribonuclease. Specifically it cleaves an intron from Hac1 mRNA in humans, which causes it to be much more efficiently translated.	0
417697	cl15371	NIF3	NIF3 (NGG1p interacting factor 3). The characterization of this family of uncharacterized proteins as orthologous is tentative. Members are found in all three domains of life. Several members (from Bacillus subtilis, Listeria monocytogenes, and Mycobacterium tuberculosis - all classified as Firmicutes within the Eubacteria) share a long insert relative to other members. [Unknown function, General]	0
417698	cl15373	PATR	Passenger-associated-transport-repeat. This model represents a core 32-residue region of a class of bacterial protein repeat found in one to 30 copies per protein. Most proteins with a copy of this repeat have domains associated with membrane autotransporters (pfam03797, TIGR01414). The repeats occur with a periodicity of 60 to 100 residues. A pattern of sequence conservation is that every second residue is well-conserved across most of the domain. pfam05594 is based on a longer, much more poorly conserved multiple sequence alignment and hits some of the same proteins as this model with some overlap between the hit regions of the two models. It describes these repeats as likely to have a beta-helical structure. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	0
417699	cl15383	IDH	Monomeric isocitrate dehydrogenase. The monomeric type of isocitrate dehydrogenase has been found so far in a small number of species, including Azotobacter vinelandii, Corynebacterium glutamicum, Rhodomicrobium vannielii, and Neisseria meningitidis. It is NADP-specific. [Energy metabolism, TCA cycle]	0
417700	cl15384	DUF5131	Protein of unknown function (DUF5131). Members of this family are the upstream member (A) of a pair of tandem-encoded radical SAM enzymes. Most of these radical SAM gene pairs have an additional upstream regulatory gene in the MarR family. Examples of high sequence identity (over 96 percent) from cassettes in several Treponema species of the oral cavity to those in multiple Firmicutes in the gut microbiome suggest recent lateral gene transfer, as might be expected for antibiotic resistance genes. The function is unknown.	0
417701	cl15385	MTTB	Trimethylamine methyltransferase (MTTB). This model represents a distinct subfamily of pfam06253. All members here are trimethylamine:corrinoid methyltransferases that contain a critical pyrrolysine residue incorporated during translation via a special tRNA for a TAG (amber) codon. Known members so far are from the genus Methanosarcina. It is one of a suite of three non-homologous enzymes with a critical UAG-encoded pyrrolysine residue in these species (along with dimethylamine methyltransferase and monomethylamine methyltransferase). It demethylates trimethylamine, leaving dimethylamine, and methylates the prosthetic group of its small cognate corrinoid protein, MttC. The methyl group is then transferred by methylcorrinoid:coenzyme M methyltransferase to coenzyme M. Note that the pyrrolysine residue is variously translated as K or X, or as a stop codon that truncates the sequence.	0
417702	cl15397	DUF89	Protein of unknown function DUF89. This family has no known function.	0
301612	cl15401	12TM_1	Membrane protein of 12 TMs. This family carries twelve transmembrane regions. It does not have any characteristic nucleotide-binding-domains of the GxSGSGKST type, so it may not be an ATP-binding cassette transporter. However, it may well be a transporter of some description. ABC transporters always have two nucleotide binding domains; this has two unusual conserved sequence-motifs: 'KDhKxhhR' and 'LxxLP'.	0
417703	cl15406	DUF2088	Domain of unknown function (DUF2088). LarA from Lactobacillus plantarum is a nickel-dependent lactate racemase and the founding member of a family of isomerases that depend on a nicotinic acid-derived nickel pincer cofactor. While it is not yet clear which homologs of LarA act preferentially on lactate, this model identifies one clade of architecturally similar proteins from among a broader set of LarA homologs. Note that the crystal structure 4NAR, on deposit at PDB but not associated with any publication, represents a protein from Thermotoga maritima that falls outside the scope of this family and that is annotated in PDB as a putative uronate isomerase.	0
417704	cl15407	DUF1614	Protein of unknown function (DUF1614). This is a family of sequences coming from hypothetical proteins found in both bacterial and archaeal species.	0
417705	cl15411	SpecificRecomb	Site-specific recombinase. Members of this family of bacterial proteins are found in various putative site-specific recombinase transmembrane proteins.	0
417706	cl15413	AAA_assoc_C	C-terminal AAA-associated domain. This had been thought to be an ATPase domain of ABC-transporter proteins. However, only one member has any trans-membrane regions. It is associated with an upstream ATP-binding cassette family, pfam00005.	0
417707	cl15414	V-ATPase_C	Subunit C of vacuolar H+-ATPase (V-ATPase). This family contains subunit C of vacuolar H+-ATPase (V-ATPase), a protein that plays a crucial role in the vacuolar system of eukaryotic cells. The main function of V-ATPase is to generate a proton-motive force at the expense of ATP and to cause limited acidification in the internal space (lumen) of several organelles of the vacuolar system. V-ATPases are multi-subunit protein complexes made up of two distinct structures: a peripheral catalytic sector (V1) and a hydrophobic membrane sector (V0) responsible for driving protons; subunit C is one of five polypeptides composing V1. The key function of the C subunit is intimately involved in the reversible dissociation of the V1 and V0 structures. It has also been identified as a mediator of the acidic microenvironment of tumors which it controls by proton extrusion to the extracellular medium. The acidic environment causes tissue damage, activates destructive enzymes in the extracellular matrix, and acquires metastatic cell phenotypes.	0
417708	cl15415	Sec1	Sec1 family. 	0
417709	cl15422	DUF3419	Protein of unknown function (DUF3419). This family of proteins are functionally uncharacterized. This protein is found in bacteria and eukaryotes. Proteins in this family are typically between 398 and 802 amino acids in length.	0
417710	cl15424	PEP_hydrolase	Phosphoenolpyruvate hydrolase-like. This domain has a TIM barrel fold related to IGPS and to phosphoenolpyruvate mutase/aldolase/carboxylase.	0
417711	cl15430	Nucleoside_tran	Nucleoside transporter. This is a family of proteins from the CLN3 gene. A mis-sense mutation of glutamic acid (E) to lysine (K) at position 295 in the human protein has been implicated in Juvenile neuronal ceroid lipofuscinosis (Batten disease). Batten disease is characterized by the accumulation of autofluorescent material in the lysosomes of most cells. Members of this family are transmembrane proteins functional in pre-vacuolar compartments. The protein in Sch.pombe is found to be localized to the vacuolar membrane, and a lack of functional protein clearly affects the size and pH of the vacuole. Thus the protein is necessary for vacuolar homeostasis. It is important for localization of late endosomal/lysosomal compartments, and it interacts with motor components driving both plus and minus end microtubular trafficking: tubulin, dynactin, dynein and kinesin-2.	0
417712	cl15435	DUF1045	Protein of unknown function (DUF1045). This family of proteins is observed in the vicinity of other characterized genes involved in the catabolism of phosphonates via the C-P lyase system (GenProp0232); its function is unknown. These proteins are members of the somewhat broader pfam06299 model "Protein of unknown function (DUF1045)" which contains proteins found in a different genomic context as well.	0
417713	cl15439	BTG	BTG family. The tob/btg1 is a family of proteins that inhibit cell proliferation.	0
387618	cl15442	DUF2381	Protein of unknown function (DUF2381). This family consists of at least 8 paralogs in Myxococcus xanthus, a member of the Deltaproteobacteria. The function is unknown.	0
417714	cl15454	HrpJ	HrpJ-like domain. This protein is found in type III secretion operons and, in Yersinia, is localized to the cell surface and is involved in the Low-Calcium Response (LCR), possibly by sensing the calcium concentration. In Salmonella, the gene is known as InvE and is believed to perform an essential role in the secretion process and interacts with the proteins SipBCD and SicA.//Altered name to reflect regulatory role. Added GO and role IDs . Negative regulation of type III secretion in Y pestis is mediated in part by a multiprotein complex that has been proposed to act as a physical impediment to type III secretion by blocking the entrance to the secretion apparatus prior to contact with mammalian cells. This complex is composed of YopN, its heterodimeric secretion chaperone SycN-YscB, and TyeA. 3[SS 6/3/05] [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	0
417715	cl15456	ADAM_CR	ADAM cysteine-rich. ADAMs are membrane-anchored proteases that proteolytically modify cell surface and extracellular matrix (ECM) in order to alter cell behaviour. It has been shown that the cysteine-rich domain of ADAM13 regulates the protein's metalloprotease activity.	0
417716	cl15462	T6SS_TssF	Type VI secretion system, TssF. This protein family is associated with type VI secretion in a number of pathogenic bacteria. Mutation is associated with impaired virulence, such as impaired infection of plants by Rhizobium leguminosarum.	0
417717	cl15463	Pup_ligase	Pup-ligase protein. This protein family is paralogous to (and distinct from) the PafA (proteasome accessory factor) first described in Mycobacterium tuberculosis (see TIGR03686). Members of both this family and TIGR03686 itself tend to cluster with each other, with the ubiquitin analog Pup (TIGR03687) associated with targeting to the proteasome, and with proteasome subunits themselves. [Protein fate, Degradation of proteins, peptides, and glycopeptides]	0
276111	cl15465	CsaX_III-U	CRISPR/Cas system-associated protein CsaX. This family comprises a minor CRISPR-associated protein family. It occurs only in the context of the (strictly archaeal) Apern subtype of CRISPR/Cas system, and is further restricted to the Sulfolobales, including Metallosphaera sedula DSM 5348 and multiple species of the genus Sulfolobus.	0
417718	cl15473	NA37	37-kD nucleoid-associated bacterial protein. nucleoid-associated protein NdpA; Validated	0
417719	cl15483	Dymeclin	Dyggve-Melchior-Clausen syndrome protein. Hid1 (high-temperature-induced dauer-formation protein 1) represents proteins of approximately 800 residues long and is conserved from fungi to humans. It contains up to seven potential transmembrane domains separated by regions of low complexity. Functionally it might be involved in vesicle secretion or be an inter-cellular signalling protein or be a novel insulin receptor.	0
417750	cl15674	IPT	N/A. The Rel homology domain (RHD) is composed of two structural domains, an N-terminal DNA_binding domain (pfam00554) and a C-terminal dimerization domain. This is the dimerization domain.	0
417751	cl15675	RGL4_N	N-terminal catalytic domain of rhamnogalacturonan lyase, a family 4 polysaccharide lyase. Members of this family are found in fungi, bacteria and wood-eating arthropods. The domain is found at the N-terminus of rhamnogalacturonase B, a member of the polysaccharide lyase family 4. The domain adopts a structure consisting of a beta super-sandwich, with eighteen strands in two beta-sheets. The three domains of the whole protein rhamnogalacturonan lyase (RGL4), are involved in the degradation of rhamnogalacturonan-I, RG-I, an important pectic plant cell-wall polysaccharide. The active-site residues are a lysine at position 169 in UniProtKB:Q00019 and a histidine at 229; Lys169 is likely to be a proton abstractor, His229 a proton donor in the mechanism. The substrate is a disaccharide, and RGL4, in contrast to other rhamnogalacturonan hydrolases, cleaves the alpha-1,4 linkages of RG-I between Rha and GalUA through a beta-elimination resulting in a double bond in the nonreducing GalUA residue, and is thus classified as a polysaccharide lyase (PL).	0
417753	cl15685	Wzt_C-like	C-Terminal domain of O-antigenic polysaccharide transporter protein Wzt and related proteins. This domain is found at the C-terminus of the Wzt protein. The crystal structure of C-Wzt(O9a) reveals a beta sandwich with an immunoglobulin-like topology that contains the O-antigenic polysaccharide binding pocket. This domain is often associated with the ABC-transporter domain.	0
417754	cl15687	RGL4_C	C-terminal domain of rhamnogalacturonan lyase, a family 4 polysaccharide lyase. CBM-like is domain III of rhamnogalacturonan lyase (RG-lyase). The full-length protein specifically recognizes and cleaves alpha-1,4 glycosidic bonds between l-rhamnose and d-galacturonic acids in the backbone of rhamnogalacturonan-I, a major component of the plant cell wall polysaccharide, pectin. This domain possesses a jelly roll beta-sandwich fold structurally homologous to carbohydrate binding modules (CBMs), and it carries two sulfate ions and a hexa-coordinated calcium ion.	0
417755	cl15688	anti-TRAP	anti-TRAP (AT) protein specific to Bacilli. In Bacillus subtilis and related bacteria, AT binds to the TRAP protein, (tryptophan-activated trp RNA-binding attenuation protein), effectively disrupting interaction of TRAP with mRNAs. Upon binding of tryptophan, TRAP (which forms a complex of 11 identical subunits) interacts with a specific location in the leader RNA and blocks translation of the tryptophan biosynthetic operon. AT, in turn, recognizes the tryptophan-activated TRAP complex and prevents RNA binding. AT is expressed in response to high levels of uncharged tryptophan tRNA. AT contains a zinc-binding motif that closely resembles the zinc-binding motifs in the zinc-finger region of DnaJ/Hsp40. AT has been shown to form homo-dodecameric assemblies, and can actually do that in two different relative orientations, resulting in two different dodecamers. Recent data suggest that the trimeric form of AT may be the biologically relevant active complex.	0
417756	cl15692	CE4_SF	Catalytic NodB homology domain of the carbohydrate esterase 4 superfamily. This domain, found in various hypothetical bacterial proteins, has no known function.	0
417757	cl15693	Sema	The Sema domain, a protein interacting module, of semaphorins and plexins. The Sema domain occurs in semaphorins, which are a large family of secreted and transmembrane proteins, some of which function as repellent signals during axon guidance. Sema domains also occur in the hepatocyte growth factor receptor and plexin-A3.	0
417758	cl15694	Exosortase_EpsH	Transmembrane exosortase (Exosortase_EpsH). This model represents the most conserved region of the multitransmembrane protein family of exosortases and archaeosortases. The region includes nearly invariant motifs at the ends of three predicted transmembrane helices on the extracytoplasmic face: a Cys (often Cys-Xaa-Gly), Asn-Xaa-Xaa-Arg, and His. This model is much broader than the bacterial exosortase model (TIGR02602), and has an intended scope similar to (or broader than) pfam09721.	0
417759	cl15697	ADF_gelsolin	Actin depolymerization factor/cofilin- and gelsolin-like domains. Severs actin filaments and binds to actin monomers.	0
417760	cl15705	DUF563	Protein of unknown function (DUF563). Family of uncharacterized proteins.	0
417762	cl15731	PGF-CTERM	PGF-CTERM motif. This model describes a strictly archaeal putative protein-sorting motif, PGF-CTERM. It is the (predicted) recognition sequence for an exosortase homolog, archaeosortase (TIGR04125). In some archaea, up to fifty proteins have this domain as their C-terminal region, usually preceded by a Thr-rich region likely to be heavily glycosylated. The removal of this sorting signal may be associated with a C-terminal prenyl group modification in the halobacterial major cell surface glycoprotein, an S-layer protein.	0
417763	cl15733	YyzF	YyzF-like protein. Members of this protein family occur exclusively in the Firmicutes, in at least 50 different species. Members average about 55 residues in length, and four of the five invariant or nearly invariant residues occur in motifs CxxH and CxxC. The function is unknown.	0
417764	cl15739	Bacteroid_pep	Ribosomally synthesized peptide in Bacteroidetes. This model describes a rare family of small putative polypeptides, including three encoded in tandem in Sphingobacterium spiritivorum ATCC 33300, in the vicinity of a TIGR04085 protein. This pairing is conserved in Chryseobacterium gleum ATCC 35910, Kordia algicida OT-1, and other species. TIGR04085 describes a C-terminal additional 4Fe4S-binding domain in PqqE and other radical SAM enzymes that seems to be a marker for peptide modification, and the family modeled here is a candidate modified peptide precursor.	0
417765	cl15749	IPTL-CTERM	IPTL-CTERM motif. This model describes a variant form of the PEP-CTERM C-terminal protein-sorting domain, with a consensus motif IPTL replacing the more typical VPEP. A majority of these sequences have a WG (Trp-Gly) motif at positions 7-8 of the domain. Species with multiple (up to 15) copies of this domain include Acidovorax citrulli, Acidovorax delafieldii 2AN, Delftia acidovorans SPH-1, and gamma proteobacterium NOR5-3.	0
417766	cl15753	CollagenBindB	Repeat unit of collagen-binding protein domain B. GramPos_pilinD3 is one of the major backbone units of Gram-positive pili, such as those from S.pneumoniae. There are three major pilin subunits that form the polymeric backbone of the pilin from S. pneumoniae, constructed of three transthyretin-like, CnaB, domains along with a crucial N-terminal domain, D1. The three Cna-B like domains are stabilized by internal Lys-Asn isopeptide bonds. Gram-positive pili are formed from a single chain of covalently linked subunit proteins (pilins), usually comprising an adhesin at the distal tip, a major pilin that forms the polymer shaft and a minor pilin that mediates cell wall anchoring at the base.	0
417767	cl15755	SAM_superfamily	SAM (Sterile alpha motif ). The fungal Ste50p SAM domain consists of five helices, which form a compact, globular fold. It is required for mediation of homodimerization and heterodimerization (and in some cases oligomerization) of the protein.	0
417769	cl15774	Hemerythrin-like	Hemerythrin family. Iteration of the HHE family found it to be related to Hemerythrin. It also demonstrated that what has been described as a single domain in fact consists of two cation binding domains. Members of this family occur all across nature and are involved in a variety of processes. For instance, in Nereis diversicolor hemerythrin binds Cadmium so as to protect the organism from toxicity. However Hemerythrin is classically described as Oxygen-binding through two attached Fe2+ ions. And the bacterial NorA is a regulator of response to NO, which suggests yet another set-up for its metal ligands. In Staphylococcus aureus the iron-sulfur cluster repair protein ScdA has been noted to be important when the organism switches to living in environments with low oxygen concentrations; perhaps this protein acts as an oxygen store or scavenger.	0
417770	cl15781	K_trans	K+ potassium transporter. potassium transporter; Provisional	0
417771	cl15787	SEC14	N/A. This family includes divergent members of the CRAL-TRIO domain family. This family includes ECM25 that contains a divergent CRAL-TRIO domain identified by Gallego and colleagues.	0
417772	cl15796	Phage_GPD	Phage late control gene D protein (GPD). tail protein; Provisional	0
387679	cl15806	antisig_RsrA	mycothiol system anti-sigma-R factor. This group of anti-sigma factors are associated in an apparent operon with a family of sigma-70 family sigma factors (TIGR02947). They appear by homology, tree building, bidirectional best hits and one-to-a-genome distribution, to represent a conserved family. This family is restricted to the Actinobacteria. [Transcription, Transcription factors]	0
417774	cl15816	CheC	CheC-like family. CheX is very closely related to the CheC chemotaxis phosphatase, but it dimerizes in a different way, via a continuous beta sheet between the subunits. CheC and CheX both dephosphorylate CheY, although CheC requires binding of CheD to achieve the activity of CheX. The ability of bacteria to modulate their swimming behaviour in the presence of external chemicals (nutrients and repellents) is one of the most rudimentary behavioural responses known, but the individual components are very sensitively tuned.	0
326649	cl15819	MqsA	antitoxin MqsA for MqsR toxin. The YokU-like protein family includes the B. subtilis YokU protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 90 amino acids in length. There are two conserved CXXC sequence motifs. This is likely to be a family of bacterial antitoxins, as the sequence bears remote homology to the RelE fold family.	0
417776	cl15824	LPP20	LPP20 lipoprotein. This family contains the LPP20 lipoprotein, which is a non-essential class of lipoprotein.	0
417777	cl15825	YscW	Type III secretion system lipoprotein chaperone (YscW). This family of proteins is found within type III secretion operons. The protein has been characterized as a chaperone for the outer membrane pore component YscC (TIGR02516). YscW is a lipoprotein which is itself localized to the outer membrane and, it is believed, facilitates the oligomerization and localization of YscC.	0
417778	cl15827	BKACE	beta-keto acid cleavage enzyme. BKACE, beta-keto acid cleavage enzyme, plays a role in lysine degradation. In certain instances it catalyzes the conversion of 3-keto-5-aminohexanoate and acetyl-CoA into acetoacetate and 3-aminobutyryl-CoA. The family is found to have at least 14 slightly different potential new enzymatic activities, all of which can therefore be designated as beta-keto acid cleavage enzymes.	0
387686	cl15828	DUF308	Short repeat of unknown function (DUF308). Family of short repeats that occurs in a limited number of membrane proteins. It may divide further in short repeats of around 7-10 residues of the pattern G-#-X(2)-#(2)-X (#=hydrophobic).	0
417779	cl15830	DsbC	Disulphide bond corrector protein DsbC. DsbC rearranges incorrect disulphide bonds during oxidative protein folding. It is activated by the N-terminal domain of DsbD, a transmembrane electron transporter. DsbD binds to a DsbC dimer and selectively activates it using electrons from the cytoplasm.	0
417780	cl15834	YbjN	Putative bacterial sensory transduction regulator. YbjN is a putative sensory transduction regulator protein found in Proteobacteria. As it is a multi-copy suppressor of the coenzyme A-associated temperature sensitivity in temperature-sensitive mutant strains of Escherichia coli the suggestion is that it both helps CoA-A1 and possibly works as a general stabilizer for some other unstable proteins. This family was expanded to subsume other related families: DUF1790, DUF1821 and DUF2596.	0
417781	cl15838	Phage_GPO	Phage capsid scaffolding protein (GPO) serine peptidase. capsid-scaffolding protein; Provisional	0
417782	cl15839	ShK	ShK domain-like. ShK toxin domain	0
417783	cl15840	JmjN	jmjN domain. To date, this domain always co-occurs with the JmjC domain (although the reverse is not true).	0
417784	cl15841	SelR	SelR domain. This model describes a domain found in PilB, a protein important for pilin expression, N-terminal to a domain coextensive with the known peptide methionine sulfoxide reductase (MsrA), a protein repair enzyme, of E. coli. Among the early completed genomes, this module is found if and only if MsrA is also found, whether N-terminal to MsrA (as for Helicobacter pylori), C-terminal (as for Treponema pallidum), or in a separate polypeptide. Although the function of this region is not clear, an auxiliary function to MsrA is suggested. [Protein fate, Protein modification and repair, Cellular processes, Adaptations to atypical conditions]	0
326659	cl15846	Phage_F	Capsid protein (F protein). major capsid protein	0
417785	cl15848	ESSS	ESSS subunit of NADH:ubiquinone oxidoreductase (complex I). complex I subunit	0
417786	cl15851	BcsB	Bacterial cellulose synthase subunit. This family includes bacterial proteins involved in cellulose synthesis. Cellulose synthesis has been identified in several bacteria. In Agrobacterium tumefaciens, for instance, cellulose has a pathogenic role: it allows the bacteria to bind tightly to their host plant cells. While several enzymatic steps are involved in cellulose synthesis, potentially the only step unique to this pathway is that catalyzed by cellulose synthase. This enzyme is a multi subunit complex. This family encodes a subunit that is thought to bind the positive effector cyclic di-GMP. This subunit is found in several different bacterial cellulose synthase enzymes. The first recognized sequence for this subunit is BcsB. In the AcsII cellulose synthase, this subunit and the subunit corresponding to BcsA are found in the same protein. Indeed, this alignment only includes the C-terminal half of the AcsAII synthase, which corresponds to BcsB.	0
387695	cl15855	Flg_bbr_C	Flagellar basal body rod FlgEFG protein C-terminal. Members of this protein are FlgF, one of several homologous flagellar basal-body rod proteins in bacteria. [Cellular processes, Chemotaxis and motility]	0
417789	cl15893	MgtC	MgtC family. This family consists of uncharacterized proteins around 220 residues in length and is mainly found in various Bacteroides species. The function of this protein is unknown.	0
387700	cl15935	TIC20	Chloroplast import apparatus Tic20-like. Two families of proteins are involved in the chloroplast envelope import apparatus. They are the three proteins of the outer membrane (TOC) and four proteins in the inner membrane (TIC). This family is specific for the Tic20 protein. [Transport and binding proteins, Amino acids, peptides and amines]	0
276137	cl15945	PRK09822	N/A. Members of this family are WaaZ, or Kdo-III transferase. This enzyme, present in some strains of E. coli and its allies but not others, performs a non-stoichiometric addition of a third 3-deoxy-D-manno-oct-2-ulosonic acid (KDO-III) onto some fraction of KDO-II in the lipopolysaccharide (LPS) inner core. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	0
387731	cl16047	Fur_reg_FbpB	Fur-regulated basic protein B. This model describes FbpB (Fur-regulated basic protein B), one of three paralogous small proteins recognized by Pfam model PF13040 in Bacillus subtilis.	0
417904	cl16231	DUF4089	Protein of unknown function (DUF4089). HpxX is a small protein of unknown function, about 60 residues in length, encoded in the set of four genes, hpxWXYZ, that belong to the oxalurate metabolism portion of a complete pathway for hypoxanthine (hpx) utilization, as in Klebsiella pneumoniae.	0
417917	cl16254	PDDEXK_3	PD-(D/E)XK nuclease superfamily. Members of this protein family average about 130 residues in length and include an almost perfectly conserved motif GxxExxY. Members occur in a wide range of prokaryotes, including Proteobacteria, Verrucomicrobia, Cyanobacteria, Bacteroidetes, Archaea, etc.	0
417918	cl16257	Alginate_exp	Alginate export. Proteins of this HMM family are primarily identified in sulfate-reducing Desulfovibrio, but this HMM may also hit proteins from other Gram-negative bacteria. Porins of this family form transmembrane pores for the passive transport of small molecules across the outer membranes of Gram-negative bacteria.	0
417923	cl16268	LytR_C	LytR cell envelope-related transcriptional attenuator. Cei (cell envelope integrity), as described for the founding member Rv2700 from Mycobacterium tuberculosis, is a transmembrane protein with an extracellular LytR_C domain. It lacks any DNA-binding domain and is not a transcriptional regulator. It shares homology to C-terminal regions present in some members of the LytR-CpsA-Psr family, a family in which some characterized members transfer teichoic acids from carriers to mature peptidoglycan.	0
417930	cl16279	SH3_8	SH3-like domain. The GW domain of Listeria belongs to the clan of SH3-like domains. A similar but broader model (PF13457) occurs in Pfam. The GW domain occurs as repeats on surface proteins of the cell-invading pathogenic bacterium Listeria monocytogenes, and is involved in binding to glycosaminoglycans. Members of this family include the GW-type internalin InlB and several paralogs.	0
417943	cl16298	GH113-like	Glycoside hydrolase family 113 beta-mannosidase and similar proteins. This domain is found in the gene transfer agent protein. An unusual system of genetic exchange exists in the purple nonsulfur bacterium Rhodobacter capsulatus. DNA transmission is mediated by a small bacteriophage-like particle called the gene transfer agent (GTA) that transfers random 4.5-kb segments of the producing cell's genome to recipient cells, where allelic replacement occurs. The genes involved in this process appear to be found widely in bacteria. According to the SUPERFAMILY database this domain has a TIM barrel fold.	0
417973	cl16352	zf-3CxxC	Zinc-binding domain. This is a family with several pairs of CxxC motifs possibly representing a multiple zinc-binding region. Only one pair of cysteines is associated with a highly conserved histidine residue.	0
417986	cl16365	TraF_2	F plasmid transfer operon, TraF, protein. This is a family of unknown function mainly found in bacteria.	0
418019	cl16409	GH31_N	N-terminal domain of glycosyl hydrolase family 31 (GH31). This family is found N-terminal to glycosyl-hydrolase domains, and appears to be similar to the galactose mutarotase superfamily.	0
418021	cl16414	DUF4185	Domain of unknown function (DUF4185). This small family of proteins is functionally uncharacterized. This family is found in bacteroides. Proteins in this family are typically around 440 amino acids in length.	0
418048	cl16452	Peptidase_S74_CIMCD	Peptidase S74 family, C-terminal intramolecular chaperone domain of Escherichia coli phage K1F endosialidase and related proteins. This is the very C-terminal, chaperone, domain of the bacteriophage protein endosialidase. It releases itself, via the serine-lysine dyad at the N-terminus, from the remainder of the end-tail-spike. Cleavage occurs after the threonine which is the final residue of the End-tail-spike family, pfam12219. The endosialidase protein forms homotrimeric molecules in bacteriophages. The catalytic dyad allows this portion of the molecule to be cleaved from the more N-terminal region such that the latter can fold and presumably bind to DNA.	0
418112	cl16538	DUF4231	Protein of unknown function (DUF4231). The SLATT domain contains two transmembrane helices. SLATT domains are generally predicted to function as pore-forming effectors in a class of conflict systems which are reliant on the production of second messenger nucleotide or nucleotide derivatives. SLATT domains are predicted to initiate cell suicide responses upon their activation. This SLATT family is always N-terminally fused to the SLATT_1 family, and is typically operonically linked to either inactive TIR domains or SLOG domains which could act as regulators of the SLATT channels.  The SLATT domain defined here (170 residues long) is similar to the DUF4231 domain (105 residues long) described in Pfam model PF14015.	0
418122	cl16549	DUF4242	Protein of unknown function (DUF4242). Members of the SCO4226 family belong to the larger family of DUF4242 domain-containing proteins, described by Pfam model PF14026. SCO4226 itself was shown to dimerize and bind four nickel atoms per homodimer.	0
418271	cl16759	RAMA	Restriction Enzyme Adenine Methylase Associated. This domain family is found in bacteria and archaea, and is approximately 60 amino acids in length. There are two completely conserved residues (G and W) that may be functionally important.	0
418281	cl16774	AlgX_N_like	N-terminal catalytic domain of putative alginate O-acetyltransferase and similar proteins. ALGX is a family found in bacteria. The domain demonstrates catalytic activity similar to that of the SGNH hydrolase-like domain, with the typical Ser-His-Asp triad found in this enzyme. Alginate is an exopolysaccharide that contributes to biofilm formation. ALGX is secreted into the biofilm and is responsible for the acetylation of biofilm polymers that help protect them from host destruction.	0
418317	cl16818	PrcB_C	PrcB C-terminal. This domain is found at the C-terminus of Treponema denticola PrcB. PrcB interacts with the PrtP protease (dentilisin) and is required for the stability of the protease complex.	0
418358	cl16881	CdiA-CT_Ec-like	C-terminal (CT) domain of the contact-dependent growth inhibition (CDI) system (CdiA-CT) protein CdiA of Escherichia coli STEC_O31, and similar proteins. This is a bacterial virion of EndoU nuclease. It is found at C-terminal region of polymorphic toxin proteins.	0
418372	cl16901	DUF4425	Uncharacterized protein conserved in Bacteroidetes. A small family of bacterial proteins, found in several Bacteroides species. Structure determination (NMR and X-ray) shows an immunoglobulin beta-barrel fold. Multiple homologs have been found in human gut metagenomics data sets. Structural experimentation shows it to share features with two well-established protein architectures in the SCOP database, i.e., C2 (calcium/lipid-binding domain) of the Pfam PF00168 and PLAT/LH2 (lipase/lipoxygenase domain) of the Pfam PF01477. The C2 and PLAT/LH2 domains bind Ca2+ in their functions of targeting proteins to cell-membranes; this domain is also shown to bind Ca2+ as well as to be a novel fold.	0
418375	cl16905	alpha_DG_C	C-terminal domain of alpha dystroglycan. This is the second N-terminal domain found in alpha-Dystroglycan (DG). The murine skeletal muscle N-terminal alpha-DG region contains two autonomous domains; the first identified as an Ig-like and the second resembling ribosomal RNA-binding proteins. This domain is similar to the small subunit ribosomal protein S6 of Thermus thermophilus (S6 domain). It is suggested that the S6 domain may be of functional relevance for LARGE (like-acetylglucosaminyltransferase) recognition along the alpha-DG maturation pathway.	0
418376	cl16912	MDR	Medium chain reductase/dehydrogenase (MDR)/zinc-dependent alcohol dehydrogenase-like family. Members of this family are putative quinone oxidoreductases that belong to the broader superfamily (modeled by Pfam pfam00107) of zinc-dependent alcohol (of medium chain length) dehydrogenases and quinone oxidoreductases. The alignment shows no motif of conserved Cys residues as are found in zinc-binding members of the superfamily, and members are likely to be quinone oxidoreductases instead. A member of this family in Homo sapiens, PIG3, is induced by p53 but is otherwise uncharacterized. [Unknown function, Enzymes of unknown specificity]	0
418377	cl16914	O-FucT_like	GDP-fucose protein O-fucosyltransferase and related proteins. The nodulation genes of Rhizobia are regulated by the nodD gene product in response to host-produced flavonoids and appear to encode enzymes involved in the production of a lipo-chitose signal molecule required for infection and nodule formation. NodZ is required for the addition of a 2-O-methylfucose residue to the terminal reducing N-acetylglucosamine of the nodulation signal. This substitution is essential for the biological activity of this molecule. Mutations in nodZ result in defective nodulation. nodZ represents a unique nodulation gene that is not under the control of NodD and yet is essential for the synthesis of an active nodulation signal.	0
418378	cl16915	ZnPC_S1P1	Zinc dependent phospholipase C/S1-P1 nuclease. This domain of unknown function contains several highly conserved histidines.	0
418379	cl16916	ChtBD1	Hevein or type 1 chitin binding domain. Hevein or type 1 chitin binding domain (ChtBD1), a lectin domain found in proteins from plants and fungi that bind N-acetylglucosamine, plant endochitinases, wound-induced proteins such as hevein, a major IgE-binding allergen in natural rubber latex, and the alpha subunit of Kluyveromyces lactis killer toxin. This domain is involved in the recognition and/or binding of chitin subunits; it typically occurs N-terminal to glycosyl hydrolase domains in chitinases, together with other carbohydrate-binding domains, or by itself in tandem-repeat arrangements.	0
418380	cl16919	CRAL_TRIO_N	CRAL/TRIO, N-terminal domain. This all-alpha domain is found to the N-terminus of pfam00650.	0
418381	cl16921	eIF2D_N_like	N-terminal domain of eIF2D, malignant T cell-amplified sequence 1 and related proteins. Members of this family are found in a set of hypothetical Archaeal proteins. Their exact function has not, as yet, been defined.	0
418382	cl16934	Axin_TNKS_binding	Tankyrase binding N-terminal segment of axin. This is the N-terminal domain tankyrase binding domain of Axin-1.	0
418383	cl16936	SATB1_N	N-terminal domain of SATB1 and similar proteins. ULD is an N-terminal oligomerization domain of SATB or special AT-rich sequence-binding proteins. SATBs are global chromatin organizers and regulators of gene expression that are essential for T-cell development, breast cancer tumor growth and metastasis. SATBs assemble into a tetramer via the ULD domain, and the tetramerisation of SATBs is essential for recognising specific DNA sequences (such as multiple AT-rich DNA fragments). Thus, SATBs may regulate gene expression directly by binding to various promoters and upstream regions and thereby influencing promoter activity.	0
418384	cl16937	Ndc10	Ndc10 component of the yeast centromere-binding factor 3. NDC10_II is the second of five domains on the Kluyveromyces lactis Ndc10 protein. Each subunit of the Ndc10 dimer binds a separate fragment of DNA, suggesting that Ndc10 stabilizes a DNA loop at the centromere.	0
388367	cl16938	ThermoDBP	Thermoproteales single-stranded DNA-binding (SSB) domain. This domain is found in the N-terminal of ThermoDBP, a single stranded DNA binding protein found in Thermoproteus tenax. ThermoDBP binds specifically to ssDNA with low sequence specificity. This domain is responsible for ssDNA binding. Conserved motif 'LIYWIRSDR' is located at the C-terminal end of the domain and is thought to participate in ssDNA binding.	0
418385	cl16939	RTT106_N	histone chaperone RTT106, regulator of Ty1 transposition protein 106; N-terminal homodimerization domain. This is the N-terminal domain of Rtt106 in Saccharomyces cerevisiae. Rtt106 is a histone chaperone that contributes to the deposition of newly synthesized acetylated Histone 3 Lysine 56 (H3K56ac) carrying H3-H4 complex on replicating DNA. The N-terminal domain of Rtt106 homodimerizes and interacts with H3-H4 independently of acetylation.	0
418386	cl16941	NTP-PPase	Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain superfamily. This family of short proteins are distantly related to the MazG enzyme. This suggests that these proteins are enzymes that catalyze a related reaction.	0
418387	cl16946	Actino_peptide	Ribosomally synthesized peptide in actinomycetes. A ribosomally synthesized peptide related to microviridin and marinostatin, usually in the gene neighborhood of one or more RimK-like ATP-grasp. The gene-context suggests that it is further modified by the ATP-grasp. The peptide is predicted to function in a defensive or developmental role, or as an antibiotic.	0
418388	cl16948	FctA	Spy0128-like isopeptide containing domain. This model describes a domain that occurs once in the major pilin of Streptococcus pyogenes, Spy0128, but in higher copy numbers in other streptococcal proteins. The domain occurs nine times in a surface-anchored protein of Bifidobacterium longum. All members of this family have LPXTG-type sortase target sequences. The S. pyogenes major pilin has been shown to undergo isopeptide bond cross-linking, mediated by sortases, that are critical to maintaining pilus structural integrity. One such Lys-to-Asn isopeptide bond is to a near-invariant Asn near the C-terminal end of this domain (column 81 of the seed alignment). A Glu in the S. pyogenes major pilin (column 25 of the seed alignment), invariant as Glu or Gln, is described as catalytic for isopeptide bond formation.	0
276145	cl16949	MAST_ArtA_sort	MAST domain. Members of this protein family are exclusive to archaea, probably all of which have S-layer surface protein arrays. All member proteins have an N-terminal signal sequence. The majority of known members belong to codirectional tandem arrays in the genus Methanosarcina (nine in M. barkeri str. Fusaro). Nearly all members have an additional 50 residues, (trimmed from the seed alignment for this model), consisting of low-complexity sequence rich in E,N,Q,T,S, and P, followed by a variant (PAF) form of the PGF-CTERM putative archaeal surface glycoprotein sorting signal. The coined name, sarcinarray family protein, evokes the predicted archaeal surface layer localization, the taxonomic bias of known members, and the tandem organization of most members.	0
418389	cl16968	CFSR	Collagen-flanked surface repeat. This model describes a repeat sequence that occurs primarily in LPXTG-anchored Streptococcus surface proteins, although it does occur elsewhere. It can comprise a major fraction of the length of repeat proteins that exceed 2000 residues in length.	0
418390	cl16979	ser_adhes_Nterm	serine-rich repeat adhesion glycoprotein AST domain. Lacb_SerRich_Nt describes a Lactobacillus-restricted N-terminal non-repetitive sequence region shared by proteins with extensive serine-rich repeat regions, all likely to function as adhesins. This region contains a variant form of the KxYKxGKxW motif (see TIGR03715) followed by a region related to serine-rich glycoprotein adhesins of the Streptococci.	0
418391	cl16982	Antigen_C	Cell surface antigen C-terminus. This domain has a conserved Lys (position 3 in seed alignment) and Asn at 177 that form an intramolecular isopeptide bond. The Asp (or Glu) at position 59	0
418392	cl17006	VbhA_like	VbhA antitoxin and related proteins. VbhT is a bacterial Fic protein of the mammalian pathogen B. schoenbuchensis. It is composed of an N-terminal FIC domain and a C-terminal BID domain. FIC domains are known to catalyse adenylylation (also called AMPylation). This entry represents VbhA, an antitoxin that binds the FIC domain (filamentation induced by cyclic AMP) of VbhT and inhibits its activity. It inhibits the adenylylation activity of VbhT by positioning close to the putative ATP-binding site, hence competing with ATP binding.	0
418393	cl17007	COE_DBD	Collier/Olf/Early B-cell factor (EBF) DNA Binding Domain. COE_DBD is the amino-terminal DNA binding domain of the COE protein family. The COE transcription factor is a regulator of development in several organs and tissues that contain the DBD domain as well as IPT/TIG (immunoglobulin-like, Plexins, transcription factors/transcription factor immunoglobulin) and basic helix-loop-helix (bHLH) domains. COE has four members in mammals (COE1-4) with high sequence similarity at the amino-terminal region. COE_DBD requires a zinc ion to bind DNA and contains a zinc finger motif (H-X(3)-C-X(2)-C-X(5)-C) termed the zinc knuckle. COE is homo- or heterodimerized through the bHLH domain to bind DNA. COE1-4 each has a variant due to alternative splicing. However, this alternative splicing does not occur at the DBD domain.	0
388374	cl17010	TTHB210-like	Hypothetical protein TTHB210, a sigma(E)-regulated gene product found in Thermus thermophilus, and similar proteins. This domain is found in TTHB210 protein present in Thermus thermophilus. TTHB210 is a Sigma-E factor regulated gene product that forms a homodecamer. This domain is chain G and can be classified with chains A, C, E and I based on its folds.	0
418394	cl17011	Arginase_HDAC	Arginase-like and histone-like hydrolases. Histones can be reversibly acetylated on several lysine residues. Regulation of transcription is caused in part by this mechanism. Histone deacetylases catalyze the removal of the acetyl group. Histone deacetylases are related to other proteins.	0
418395	cl17012	GINS_A	Alpha-helical domain of GINS complex proteins; Sld5, Psf1, Psf2 and Psf3. The eukaryotic GINS complex is essential for the initiation and elongation phases of DNA replication. It consists of four paralogous protein subunits (Sld5, Psf1, Psf2 and Psf3), all of which are included in this family. The GINS complex is conserved from yeast to humans, and has been shown in human to bind directly to DNA primase.	0
418396	cl17013	W2	C-terminal domain of eIF4-gamma/eIF5/eIF2b-epsilon. This domain of unknown function is found at the C-terminus of several translation initiation factors.	0
418397	cl17014	eIF-5_eIF-2B	Domain found in IF2B/IF5. translation initiation factor IF-2 subunit beta; Validated	0
354298	cl17015	HRI1_like	Tandem repeat domain of HRI1 and related proteins. Saccharomyces cerevisiae Hri1p (Hrr25-interacting protein 1, YLR301w) is a non-essential gene product named for its interaction with the yeast protein kinase Hrr25p. It has also been characterized as an interaction partner for Sec72p, but does not seem to be required for protein translocation into the ER. It may be a cytosolic protein. Hri1p contains a tandem repeat of a structural unit that forms a beta-barrel with structural similarity to nitrobindin. This C-terminal repeat is missing several strands and forms an incomplete barrel.	0
418398	cl17018	FANC	Fanconi anemia ID complex proteins FANCI and FANCD2. The Fanconi anaemia protein FancD2 is a nuclease necessary for the repair of DNA interstrand-crosslinks.	0
327373	cl17028	hemoglobin_linker_C	Globular domain of extracellular hemoglobin linker. This domain is found in linker subunits of the erythrocruorin respiratory complex in annelid worms.	0
418400	cl17033	SOAR	STIM1 Orai1-activating region. SOAR is the Orai1-activating region of STIM1, where STIM1 are calcium sensors in the endoplasmic reticulum. As the store of calcium is depleted the calcium sensor in the ER activates Orai1, a Ca2+-release-activated Ca2+ (CRAC) channel, in the plasma membrane. The SOAR region, which runs from residues 340-443 on UniProtKB:Q13586, forms a dimer, and is essential for oligomerization of the whole of STIM1.	0
418401	cl17036	SH3	Src Homology 3 domain superfamily. This domain is the 70 C-terminal residues of ADAP - Adhesion and de-granulation promoting adapter protein. It shows homology to SH3 domains; however, conserved residues of the fold are absent. It thus represents an altered SH3 domain fold. An N-terminal, amphipathic, helix makes extensive contacts to residues of the regular SH3 domain fold thereby creating a composite surface with unusual surface properties. The domain can no longer bind conventional proline-rich peptides. There are key phosphorylation sites within the two hSH3 domains and it would appear that binding at these sites does not materially affect the folding of these regions although the equilibrium towards the unfolded state may be slightly altered. The binding partners of the hSH3 domains are still unknown.	0
418402	cl17037	NBD_sugar-kinase_HSP70_actin	Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily. FtsA is essential for bacterial cell division, and co-localizes to the septal ring with FtsZ. It has been suggested that the interaction of FtsA-FtsZ has arisen through coevolution in different bacterial strains. The FtsA protein contains two structurally related actin-like ATPase domains which are also structurally related to the ATPase domains of HSP70 (see PF00012). FtsA has a SHS2 domain PF02491 inserted in to the RnaseH fold PF02491.	0
354301	cl17041	helicase_insert_domain	helicase_insert_domain. The endoribonuclease Dicer plays a central role in RNA interference by breaking down RNA molecules into fragments of about 22 nucleotides (miRNAs and siRNAs). Loading of RNA onto Dicer and the enzymatic cleavage are supported by dsRNA-binding proteins, including trans-activation response (TAR) RNA-binding protein (TRBP) or protein activator of PKR (PACT). Together with Argonaute, this constitutes the RNA-induced silencing complex (RISC) which functions to load the small RNA fragments onto Argonaute. The Partner-binding domain of Dicer is responsible for interactions with the dsRNA-binding proteins. This helical domain can be found inserted in a subset of SF2-type DEAD-box related helicases.	0
418403	cl17042	Polysacc_deac_2	Divergent polysaccharide deacetylase. This family is divergently related to pfam01522 (personal obs:Yeats C).	0
418404	cl17044	DD_cGKI	Dimerization/Docking domain of Cyclic GMP-dependent Protein Kinase I. PKcGMP_CC is the N-terminal coiled-coil, dimerization, domain of cGMP-protein kinases.	0
277498	cl17045	TM_EGFR-like	Transmembrane domain of the Epidermal Growth Factor Receptor family of Protein Tyrosine Kinases. ErbB3 (HER3) is a member of the EGFR (HER, ErbB) subfamily of proteins, which are receptor PTKs (RTKs) containing an extracellular EGF-related ligand-binding region, a transmembrane (TM) helix, and a cytoplasmic region with a tyr kinase domain and a regulatory C-terminal tail. ErbB receptors are activated by ligand-induced dimerization, leading to the phosphorylation of tyr residues in the C-terminal tail, which serve as binding sites for downstream signaling molecules. ErbB3 contains an impaired tyr kinase domain, which lacks crucial residues for catalytic activity against exogenous substrates but is still able to bind ATP and autophosphorylate. ErbB3 binds the neuregulin ligands, NRG1 and NRG2, and it relies on its heterodimerization partners for activity following ligand binding. The ErbB2-ErbB3 heterodimer constitutes a high affinity co-receptor capable of potent mitogenic signaling. The TM domain not only serves as a membrane anchor, but also plays an important role in receptor dimerization and optimal activation. Mutations in the TM domain of ErbB receptors have been associated with increased breast cancer risk. ErbB3 participates in a signaling pathway involved in the proliferation, survival, adhesion, and motility of tumor cells.	0
418407	cl17065	Cthe_2751_like	Uncharacterized protein domain similar to Clostridium thermocellum 2751. Cthe_2751 has been found to form homodimers. Based on structural similarity to other families, a role in processing nucleic acids was suggested, though interactions with DNA could not be demonstrated.	0
418408	cl17067	GH94N_like	N-terminal domain of glycoside hydrolase family 94 and related domains. This is a family of bacterial proteins of unknown function.	0
418409	cl17068	AFD_class_I	Adenylate forming domain, Class I superfamily. This is a small domain that is found C terminal to pfam00501. It has a central beta sheet core that is flanked by alpha helices.	0
418410	cl17070	AMPKA_C_like	C-terminal regulatory domain of 5'-AMP-activated protein kinase (AMPK) alpha subunit and similar domains. This domain is found at the C-terminus of several fungal kinases.	0
418411	cl17077	Caudo_TAP	Caudovirales tail fibre assembly protein, lambda gpK. Phage_tail_APC is a family of general phage tail assembly chaperone proteins from double-stranded DNA viruses with no RNA stage, many of which are unclassified.	0
418413	cl17090	Yos9_DD	C-terminal dimerization domain (DD) of Saccharomyces cerevisiae Yos9 and related proteins. This is the dimerization domain (DD) found in Yos9 proteins in yeast. Structural analysis revealed that this domain contributes to self association of Yos9. The overall fold of the domain can be classified as an alpha-beta-roll architecture, comprising two alpha-helices and seven beta-strands.	0
418414	cl17091	Rev1_C	C-terminal domain of the Y-family polymerase Rev1. This is the C-terminal domain of DNA repair protein REV1. It interacts with REV7, POLN, POLK and POLI.	0
418415	cl17092	STING_C	C-terminal domain of STING. Transmembrane protein 173, also known as stimulator of interferon genes protein (STING), is a transmembrane adaptor protein which is involved in innate immune signalling processes. It induces expression of type I interferons (IFN-alpha and IFN-beta) via the NF-kappa-B and IRF3 pathways in response to non-self cytosolic RNA and dsDNA.	0
418416	cl17095	Bacova_04320_like	Uncharacterized proteins similar to Bacteroides ovatus 4320. A large family of (predicted) secreted proteins with unknown functions from human gut and oral cavity. Typically forms an N-terminal domain with an FMN binding domain at the C-terminus. The experimentally determined 3D structure of this domain shows a variant of a TATA box binding-like fold, but no detectable sequence similarity to other proteins with this fold.	0
418417	cl17096	gal11_coact	gal11 coactivator domain. This is the activator-binding domain (ABD1) found in Gal11/med15 proteins. Structural analysis indicates that it binds to the central activator domain (cAD) of Gcn4. Mutations in the Gal11-ABD1 W196 residue abolish the binding to Gcn4 cAD.	0
418418	cl17100	DIP1984-like	DIP1984 family protein and similar proteins. Members of this family, including the Corynebacterium diphtheriae protein DIP1984, which has a solved crystal structure, are uncharacterized with respect to function. Some members of this family previously have been annotated, incorrectly, as septolysin. This model was constructed to overrule and correct such errors. Note that septolysin O, and other members of the family of cholesterol-dependent cytolysins such as listeriolysin O (WP_003722731.1), are unrelated.	0
302613	cl17109	HopAB_KID	Kinase-interacting domains of the HopAB family of Type III Effector proteins. AvrPtoB_bdg is a binding region on a family of bacterial plant pathogenic proteins. Type III effector proteins are injected into plants by bacteria when they are under attack, eg Pseudomonas syringae when attacking tomato. AvrPtoB is one such effector that suppresses the plants' PAMP-triggered innate immunity. PAMPs are pathogen/microbe-associated molecular patterns that are detected as non-self by a host. AvrPtoB suppresses this response by binding to BAK1, a kinase that acts with several pattern recognition receptors to activate defense signalling. AvrPtoB_bdg is the region of AvrPtoB that binds to BAK1 thereby preventing its kinase activity after the perception of flagellin.	0
418419	cl17110	Erythro_esteras	Erythromycin esterase. This family includes erythromycin esterase enzymes that confer resistance to the erythromycin antibiotic.	0
418420	cl17112	AnfO_nitrog	Iron only nitrogenase protein AnfO (AnfO_nitrog). Members of this protein family, called Anf1 in Rhodobacter capsulatus and AnfO in Azotobacter vinelandii, are found only in species with the Fe-only nitrogenase and are encoded immediately downstream of the structural genes in the above named species.	0
271795	cl17113	DUF2833	Protein of unknown function (DUF2833). internal virion protein A	0
418423	cl17157	Alt_A1	Alternaria alternata allergen Alt a 1. AltA1 is a family of fungal allergens. It shows a unique beta-barrel comprising 11 beta-strands. There is structural evidence for the location of IgE antibody-binding epitopes. The crystal structure will allow efforts to promote immunotherapy for patients allergic to Alternaria species.	0
418424	cl17160	BPSL1549	Burkholderia Lethal Factor 1. This family includes members such as BLF1 (Burkholderia lethal factor 1) also known as BPSL1549. BLF1 is a potent toxin from Burkholderia pseudomallei causing melioidosis. BLF1 interacts with the human translation factor eIF4A causing deamidation of Gln339 to Glu. Thereby, reducing endogenous host cell protein synthesis and triggering increased stress granule formation, which is associated with translational blocks. Structural analysis of BLF1 revealed an alpha/beta fold comprising a sandwich of two mixed beta-sheets surrounded by loops and alpha-helices, where the beta-sheet core of the catalytic pocket is structurally similar to that of the deamidase domain of CNF1 pfam05785.	0
388404	cl17163	CarS	Antirepressor CarS. This is an SH3 domain found in antirepressor proteins such as CarS from Myxococcus xanthus. CarS antirepressor recognizes and neutralizes its cognate repressors to turn on a photo-inducible promoter. CarS physically interacts with the MerR-type winged-helix DNA-binding domain of these repressors leading to activation of carB operon. Structural studies of CarS from M. xanthus reveal a beta-barrel fold akin to that in SH3 domains. However, it diverges from the typical SH3 domain fold in the lengths and conformations of the connecting loops. Functional analysis reveals that the SH3 domain-like fold in the antirepressor CarS mimics operator DNA in sequestering the repressor DNA recognition helix to activate transcription.	0
418425	cl17165	SKA2	Spindle and kinetochore-associated protein 2. Spindle and kinetochore-associated protein 2 (SKA2) interacts with the N-termini of SKA1 and SKA3 and forms the Ska complex. This is a microtubule binding complex required for chromosome segregation.	0
418426	cl17166	MMACHC-like	Methylmalonic aciduria and homocystinuria type C protein and similar proteins. MMACHC, also called CblC, is involved in the intracellular processing of vitamin B12 by catalyzing two reactions: the reductive decyanation of cyanocobalamin in the presence of a flavoprotein oxidoreductase and the dealkylation of alkylcobalamins through the nucleophilic displacement of the alkyl group by glutathione. Mutations in MMACHC cause combined methylmalonic acidemia/aciduria and homocystinuria (CblC type), the most common inherited disorder of cobalamin metabolism. The structure of MMACHC reveals it to be the most divergent member of the NADPH-dependent flavin reductase family that can use FMN or FAD to catalyze reductive decyanation; it is also the first enzyme with glutathione transferase (GST) activity that is unrelated to the GST superfamily in structure and sequence.	0
418427	cl17169	RRM_SF	RNA recognition motif (RRM) superfamily. Crp79, also called meiotic expression up-regulated protein 5 (Mug5), or polyadenylate-binding protein crp79, or PABP, or poly(A)-binding protein, is an auxiliary mRNA export factor that binds the poly(A) tail of mRNA and is involved in the export of mRNA from the nucleus to the cytoplasm. Mug28 is a meiosis-specific protein that regulates spore wall formation. Members in this family contain three RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains). The model corresponds to the three RRM motif.	0
418428	cl17171	PH-like	Pleckstrin homology-like domain. SIN1_PH is a pleckstrin-homology domain found at the C-terminus of SIN1. It is conserved from yeast to humans. PH-domains are involved in intracellular signalling or as constituents of the cytoskeleton. SIN1 (SAPK-interacting protein 1) plays an essential role in signal transduction, and the PH domain is involved in lipid and membrane binding.	0
418429	cl17172	ADH_N	Alcohol dehydrogenase GroES-like domain. N-terminal region of oxidoreductase and prostaglandin reductase and alcohol dehydrogenase.	0
418430	cl17173	AdoMet_MTases	N/A. This family appears to have methyltransferase activity.	0
418431	cl17182	NAT_SF	N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate. This family of GCN5-related N-acetyl-transferases bind both CoA and acetyl-CoA. They are characterized by a highly conserved glycine, a cysteine residue in the acetyl-CoA binding site near the acetyl group, their small size compared with other GNATs and a lack of an obvious substrate-binding site. It is proposed that they transfer an acetyl group from acetyl-CoA to one or more unidentified aliphatic amines via an acetyl (cysteine) enzyme intermediate. The substrate might be another macromolecule.	0
418432	cl17185	LPLAT	Lysophospholipid acyltransferases (LPLATs) of glycerophospholipid biosynthesis. This family contains proteins with N-acetyltransferase functions.	0
418433	cl17190	NK	N/A. This family includes enzymes related to cytidylate kinase.	0
418434	cl17194	Oxidored_q6	NADH ubiquinone oxidoreductase, 20 Kd subunit. This model describes the B chain of complexes that resemble NADH-quinone oxidoreductases. The electron acceptor is a quinone, ubiquinone, in mitochondria and most bacteria, including Escherichia coli, where the recommended gene symbol is nuoB. The quinone is plastoquinone in Synechocystis (where the chain is designated K) and in chloroplast, where NADH may be replaced by NADPH. In the methanogenic archaeal genus Methanosarcina, NADH is replaced by F420H2. [Energy metabolism, Electron transport]	0
418435	cl17210	OSCP	ATP synthase delta (OSCP) subunit. F0F1 ATP synthase subunit delta; Provisional	0
418436	cl17212	PTA_PTB	Phosphate acetyl/butaryl transferase. The plsX gene is part of the bacterial fab gene cluster which encodes several key fatty acid biosynthetic enzymes. The exact function of the plsX protein in fatty acid synthesis is unknown.	0
418437	cl17225	DAHP_synth_1	DAHP synthetase I family. NeuB is the prokaryotic N-acetylneuraminic acid (Neu5Ac) synthase. It catalyzes the direct formation of Neu5Ac (the most common sialic acid) by condensation of phosphoenolpyruvate (PEP) and N-acetylmannosamine (ManNAc). This reaction has only been observed in prokaryotes; eukaryotes synthesize the 9-phosphate form, Neu5Ac-9-P, and utilize ManNAc-6-P instead of ManNAc. Such eukaryotic enzymes are not present in this family. This family also contains SpsE spore coat polysaccharide biosynthesis proteins.	0
418438	cl17238	RING_Ubox	The superfamily of RING finger (Really Interesting New Gene) domain and U-box domain. This is a family of primate-specific Ret finger protein-like (RFPL) zinc-fingers of the C3HC4 type. Ret finger protein-like proteins are primate-specific target genes of Pax6, a key transcription factor for pancreas, eye and neocortex development. This domain is likely to be DNA-binding. This zinc-finger domain together with the RDM domain, pfam11002, forms a large zinc-finger structure of the RING/U-Box superfamily. RING-containing proteins are known to exert an E3 ubiquitin protein ligase activity with the zinc-finger structure being mandatory for binding to the E2 ubiquitin-conjugating enzyme.	0
418439	cl17255	CPSase_L_D2	Carbamoyl-phosphate synthase L chain, ATP binding domain. A member of the ATP-grasp fold predicted to be involved in the modification/biosynthesis of spore-wall and capsular proteins.	0
418440	cl17279	DHFR	N/A. The function of this domain is not known, but it is thought to be involved in riboflavin biosynthesis. This domain is found in the C-terminus of RibD/RibG, in combination with pfam00383, as well as in isolation in some archaebacterial proteins. This family appears to be related to pfam00186.	0
418441	cl17319	PIN_5	PINc domain ribonuclease. hypothetical protein; Provisional	0
418442	cl17340	Glyco_hydro_100	Alkaline and neutral invertase. beta-fructofuranosidase	0
418443	cl17346	Trehalase	Trehalase. This is a family of eukaryotic enzymes belonging to glycosyl hydrolase family 63. They catalyze the specific cleavage of the non-reducing terminal glucose residue from Glc(3)Man(9)GlcNAc(2). Mannosyl oligosaccharide glucosidase EC:3.2.1.106 is the first enzyme in the N-linked oligosaccharide processing pathway. This family represents the C-terminal catalytic domain.	0
418444	cl17362	Transglut_core	Transglutaminase-like superfamily. This peptidase has the catalytic triad C-H-D at the C-terminal end, a triad similar to that in thiol proteases and animal transglutaminases. It catalyzes the in vitro lysis of M. marburgensis cells under reducing conditions and exhibits characteristics of metal-activated peptidases.	0
302641	cl17365	TrkH	Cation transport protein. This family consists of various cation transport proteins (Trk) and V-type sodium ATP synthase subunit J or translocating ATPase J EC:3.6.1.34. These proteins are involved in active sodium up-take utilising ATP in the process. TrkH a member of the family from E. coli is a hydrophobic membrane protein and determines the specificity and kinetics of cation transport by the TrK system in E. coli.	0
418445	cl17398	YtfJ_HI0045	Bacterial protein of unknown function (YtfJ_HI0045). This model represents sequences from gamma proteobacteria that are related to the E. coli protein, YtfJ.	0
418446	cl17448	CP_ATPgrasp_1	A circularly permuted ATPgrasp. Circularly permuted ATP-grasp prototyped by Roseiflexus RoseRS_2616 that is associated in gene neighborhoods with a GCS2-like COOH-NH2 ligase, alpha/beta hydrolase fold peptidase, GAT-II -like amidohydrolase, and M20 peptidase. Members of this family are predicted to be involved in the biosynthesis of small peptides.	0
418447	cl17486	Sipho_tail	Phage tail protein. This model represents the best-conserved region of about 125 amino acids, toward the N-terminus, of a family of proteins from temperate phage of a number of Gram-positive bacteria. These phage proteins range in length from 230 to 525 amino acids. [Mobile and extrachromosomal element functions, Prophage functions]	0
418448	cl17505	CamS_repeat	Repeat domain of CamS sex pheromone cAM373 precursor and related proteins. This family includes CamS, from which Staphylococcus aureus sex pheromone staph-cAM373 is processed.	0
248060	cl17506	LDT_IgD_like	IgD-like repeat domain of mycobacterial L,D-transpeptidases. Immunoglobulin-like domain found in actinobacterial L,D-transpeptidases, including Mycobacterium tuberculosis LdtMt2, which is a non-classical transpeptidase that generates 3->3 transpeptide linkages. LdtMt2 is associated with virulence and resistance to amoxicillin. This domain may occur in a tandem-repeat arrangement and is found N-terminal to the catalytic L,D-transpeptidase domain; this model represents the repeat adjacent to the catalytic domain.	0
248061	cl17507	LbR-like	Left-handed beta-roll, including virulence factors and various other proteins. This group contains the collagen-binding domain of the virulence factor YadA, an adhesion protein of several Yersinia species, and related cell surface proteins, including Moraxella catarrhalis UspA-like proteins. The collagen-binding portion is found in the hydrophobic N-terminal region. YadA forms a matrix on the bacterial outer membrane, which mediates binding to collagen and epithelial cells. YadA inhibits the complement-activating pathway with the coating of the cell surface with factor H, which impedes C3b molecules. These domains form a left-handed beta roll made up of a series of short repeated elements. UspA1 and UspA2 are part of a class of pathogenicity factors that act as cell surface adhesion molecules, in which N-terminal head and neck domains extend from the bacterial outer membrane. The UspA1 head domain of Moraxella catarrhalis is formed from trimeric left-handed parallel beta-helices of 14-16 amino acid repeats. The UspA1 head domain connects to a neck region of large extended, charged loops that may be ligand binding, which is in turn connected to an extended coiled coil domain that tethers the head and neck region to the cell surface via a transmembrane region.	0
418449	cl17515	FeS	Putative Fe-S cluster. This family includes a domain with four conserved cysteines that probably form an Fe-S redox cluster.	0
418450	cl17537	gp32	gp32 DNA binding protein like. single-stranded DNA binding protein; Provisional	0
418451	cl17559	Amido_AtzD_TrzD	Amidohydrolase ring-opening protein (Amido_AtzD_TrzD). Members of this family are ring-opening amidohydrolases, including cyanuric acid amidohydrolase (EC 3.5.2.15) (AtzD and TrzD) and barbiturase. Note that barbiturase does not act as defined for EC 3.5.2.1 (barbiturate + water = malonate + urea) but rather catalyzes the ring-opening of barbituric acid to ureidomalonic acid (see Soong, et al., ).	0
418452	cl17562	Spore_III_AF	Stage III sporulation protein AF (Spore_III_AF). This family represents the stage III sporulation protein AF of the bacterial endospore formation program, which exists in some but not all members of the Firmicutes (formerly called low-GC Gram-positives). The C-terminal region of this protein is poorly conserved, so only the N-terminal region, which includes two predicted transmembrane domains, is included in the seed alignment. [Cellular processes, Sporulation and germination]	0
418453	cl17592	TfoX_N	TfoX N-terminal domain. TfoX may play a key role in the development of genetic competence by regulating the expression of late competence-specific genes. This family corresponds to the N-terminal presumed domain of TfoX. The domain is found as an isolated domain in some proteins suggesting this is an autonomous domain.	0
302653	cl17685	Phytase	Phytase. Phytase is a secreted enzyme which hydrolyzes phytate to release inorganic phosphate. This family appears to represent a novel enzyme that shows phytase activity and has been shown to have a six-bladed propeller folding architecture.	0
418454	cl17687	5_nucleotid	5' nucleotidase family. This model includes a 5'-nucleotidase specific for purines (IMP and GMP). These enzymes are members of the Haloacid Dehalogenase (HAD) superfamily. HAD members are recognized by three short motifs {hhhhDxDx(T/V)}, {hhhh(T/S)}, and either {hhhh(D/E)(D/E)x(3-4)(G/N)} or {hhhh(G/N)(D/E)x(3-4)(D/E)} (where "h" stands for a hydrophobic residue). Crystal structures of many HAD enzymes have verified PSI-PRED predictions of secondary structural elements which show each of the "hhhh" sequences of the motifs as part of beta sheets. This subfamily of enzymes is part of "Subfamily I" of the HAD superfamily by virtue of a "cap" domain in between motifs 1 and 2. This subfamily's cap domain has a different predicted secondary structure than all other known HAD enzymes and thus has been designated "subfamily IG". This domain appears to consist of a mixed alpha/beta fold. A Pfam model (pfam05761) detects an identical range of sequences above the trusted cutoff, but does not model the N-terminal motif 1 region. A TIGRFAMs model (TIGR01993) represents a (putative) family of _pyrimidine_ 5'-nucleotidases which are also subfamily I HAD's, which should not be confused with the current model.	0
388436	cl17690	DUF2204	Nucleotidyl transferase of unknown function (DUF2204). This domain, found in various hypothetical archaeal proteins, has no known function. However, this family was identified as belonging to the nucleotidyltransferase superfamily.	0
418455	cl17703	Dehydratase_MU	Dehydratase medium subunit. This family contains the medium subunit of the trimeric diol dehydratases and glycerol dehydratases. These enzymes are produced by some enterobacteria in response to growth substances.	0
418456	cl17705	MBT	mbt repeat. Present in Drosophila Scm, l(3)mbt, and vertebrate SCML2. These proteins are involved in transcriptional regulation.	0
418457	cl17713	NnrU	NnrU protein. This family consists of several plant and bacterial NnrU proteins. NnrU is thought to be involved in the reduction of nitric oxide. The exact function of NnrU is unclear. It is thought however that NnrU and perhaps NnrT are required for expression of both nirK and nor.	0
418458	cl17715	Coat_F	Coat F domain. The Coat F proteins contribute to the Bacillales spore coat. This domain occurs multiple times in the genomes in which it is found.	0
418459	cl17718	DUF2384	Protein of unknown function (DUF2384). Proteins in this family are found almost exclusively in the Proteobacteria, but also in Gloeobacter violaceus PCC 7421, a cyanobacterium. This family was proposed by Makarova, et al. (2009) to be the antitoxin component of a new class of type 2 toxin-antitoxin system, or addiction module. [Cellular processes, Other]	0
418460	cl17720	Aminopep	Putative aminopeptidase. This family of bacterial proteins has a conserved HEXXH motif, suggesting that members are putative peptidases of zincin fold.	0
327433	cl17735	VWC	von Willebrand factor type C domain. This cysteine rich domain occurs along side the TIL pfam01826 domain and is likely to be a distantly related relative.	0
418461	cl17774	SAM_adeno_trans	S-adenosyl-l-methionine hydroxide adenosyltransferase. Members of this family are fluorinase (adenosyl-fluoride synthase, EC 2.5.1.63), an enzyme involved in the first committed step in the biosynthesis of at least two different organofluorine compounds. Few organofluorine natural products are known. Related enzymes include chlorinases (EC 2.5.1.94) that lack fluorinase activity, although a fluorinase may show chlorinase activity. [Cellular processes, Biosynthesis of natural products]	0
418462	cl17781	Chromate_transp	Chromate transporter. Members of this family probably act as chromate transporters. Members of this family are found in both bacteria and archaebacteria. The proteins are composed of one or two copies of this region. The alignment contains two conserved motifs, FGG and PGP.	0
302666	cl17795	ArsP_1	Predicted permease. This family of integral membrane proteins are predicted to be permeases of unknown specificity.	0
388445	cl17805	DUF483	Protein of unknown function (DUF483). Family of uncharacterized prokaryotic proteins.	0
418463	cl17812	Phage_base_V	Type VI secretion system, phage-baseplate injector. This family consists of Bacteriophage Mu Gp45 related proteins from both phages and bacteria. The function of this family is unknown although it has been suggested that family members may be involved in baseplate assembly.	0
354343	cl17816	OprB	Carbohydrate-selective porin, OprB family. 	0
418464	cl17823	MASE1	MASE1. Predicted integral membrane sensory domain found in histidine kinases, diguanylate cyclases and other bacterial signaling proteins. This entry also includes members of the 8 transmembrane UhpB type (8TMR-UT) domain family.	0
418465	cl17829	DUF917	Protein of unknown function (DUF917). This family consists of hypothetical bacterial and archaeal proteins of unknown function.	0
418466	cl17838	DUF1365	Protein of unknown function (DUF1365). This family consists of several bacterial and plant proteins of around 250 residues in length. The function of this family is unknown.	0
418468	cl17850	Trp_oprn_chp	Tryptophan-associated transmembrane protein (Trp_oprn_chp). Members of this family are predicted transmembrane proteins with four membrane-spanning helices. Members are found in the Actinobacteria (Mycobacterium, Corynebacterium, Streptomyces), always associated with genes for tryptophan biosynthesis.	0
418469	cl17851	DUF2100	Uncharacterized protein conserved in archaea (DUF2100). This domain, found in various hypothetical archaeal proteins, has no known function.	0
418470	cl17852	DUF2121	Uncharacterized protein conserved in archaea (DUF2121). This domain, found in various hypothetical archaeal proteins, has no known function.	0
418471	cl17857	DUF2278	Uncharacterized conserved protein (DUF2278). Members of this family of hypothetical bacterial proteins have no known function.	0
418472	cl17862	CBP_GIL	GGDEF I-site like or GIL domain. This protein, called BcsE (bacterial cellulose synthase E) or YhjS, is required for cellulose biosynthesis in Salmonella enteritidis. Its role in this process across multiple bacterial species is implied by the partial phylogenetic profiling algorithm. Members are found in the vicinity of other cellulose biosynthesis genes. The model does not include a much less well-conserved N-terminal region about 150 amino acids in length for most members. Solano, et al. suggest this protein acts as a protease. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	0
418473	cl17874	DDE_5	DDE superfamily endonuclease. This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction.	0
248453	cl17899	HyfE	Hydrogenase-4 membrane subunit HyfE [Energy production and conversion]. hydrogenase 4 membrane subunit; Provisional	0
418474	cl17916	BF2867_like	Tandemly repeated domain found in Bacteroides fragilis Nctc 9343 BF2867 and related proteins. This family of proteins is found in bacteria. Proteins in this family are typically between 348 and 360 amino acids in length. Analysis of structural comparisons shows this family to be part of the FimbA (CL0450) superfamily of adhesin components or fimbrillins.	0
418475	cl17974	STI1	STI1 domain. This entry corresponds to the STI1 domain that is found in two copies in the Sti1 protein.	0
302697	cl18310	NHL	NHL repeat unit of beta-propeller proteins. This domain occurs in tandem repeats, as many as 13, in proteins from Bdellovibrio bacteriovorus, Azotobacter vinelandii, Geobacter sulfurreducens, Pirellula sp. 1, Myxococcus xanthus, and others, many of which are Deltaproteobacteria. The periodicity of the repeat ranges from about 57 to 61 amino acids, and a core region of about 54 is represented by this model and seed alignment.	0
418507	cl18921	Bvu_2165_C_like	The C-terminal domain of uncharacterized bacterial proteins. A C-terminal domain in a large family of (predicted) secreted proteins with unknown functions from human gut bacteroides.	0
418508	cl18929	TIN2_N	N-terminal domain of TRF-interacting nuclear factor 2; shelterin complex protein of telomeres. This is the N-terminus of TERF1-interacting nuclear factor 2. It is required for the formation of the shelterin complex. The shelterin complex is involved in the protection and maintenance of telomeres.	0
418509	cl18942	MqsR	Motility quorum-sensing regulator (MqsR). MqsR_toxin is a family of bacterial toxins that act as an mRNA interferase. MqsR is the gene most highly upregulated in E. coli persister cells and it plays an essential role in biofilm regulation and cell signalling. It forms part of a bacterial toxin-antitoxin TA system, and as expected for a TA system, the expression of the MqsR toxin leads to growth arrest, while co-expression with its antitoxin, MqsA, rescues the growth arrest phenotype. In addition, MqsR associates with MqsA to form a tight, non-toxic complex and both MqsA alone and the MqsR:MqsA2:MqsR complex bind and regulate the mqsR promoter. The structure of MqsR shows that it is a member of the RelE/YoeB family of bacterial RNases that are structurally and functionally characterized bacterial toxins.	0
418510	cl18945	AAT_I	N/A. These proteins catalyze the reversible transfer of an amino group from the amino acid substrate to an acceptor alpha-keto acid. They require pyridoxal 5'-phosphate (PLP) as a cofactor to catalyze this reaction. Trans-amination reactions are of central importance in amino acid metabolism and in links to carbohydrate and fat metabolism. This class of aminotransferases acts as dimers in a head-to-tail configuration.	0
418511	cl18951	Amidase	Amidase. Members of this protein family are aminohydrolases related to, but distinct from, glutamyl-tRNA(Gln) amidotransferase subunit A. The best characterized member is the biuret hydrolase of Pseudomonas sp. ADP, which hydrolyzes ammonia from the three-nitrogen compound biuret to yield allophanate. Allophanate is also an intermediate in urea degradation by the urea carboxylase/allophanate hydrolase pathway, an alternative to urease. [Unknown function, Enzymes of unknown specificity]	0
418512	cl18957	TerD_like	Uncharacterized proteins involved in stress response, similar to tellurium resistance terD. The TerD domain is found in TerD family proteins that include the paralogous TerD, TerA, TerE, TerF and TerZ proteins. It is found in a stress response operon with TerB and TerC. TerD has a maximum of two calcium-binding sites depending on the conservation of aspartates. It has various fusions to nuclease domains, RNA binding domains, ubiquitin related domains, and metal binding domains. The ter gene products lie at the centre of membrane-linked metal recognition complexes with regulatory ramifications encompassing phosphorylation-dependent signal transduction, RNA-dependent regulation, biosynthesis of nucleoside-like metabolites and DNA processing linked to novel pathways.	0
418513	cl18961	MltG_like	proteins similar to Escherichia coli YceG/mltG may function as endolytic murein transglycosylases. This family of proteins is found in bacteria. Proteins in this family are typically between 332 and 389 amino acids in length. This family was previously incorrectly annotated and named as aminodeoxychorismate lyase. The structure of YceG was solved by X-ray crystallography.	0
418514	cl18962	Radical_SAM	N/A. Radical SAM proteins catalyze diverse reactions, including unusual methylations, isomerisation, sulphur insertion, ring formation, anaerobic oxidation and protein radical formation.	0
276221	cl18967	Csx17_I-U	CRISPR/Cas system-associated protein Csx17. Members of this protein family are found exclusively in CRISPR-associated (cas) type I system gene clusters of the Dpsyc subtype. Markers for that type include a variant form of cas3 (model TIGR02621) and the GSU0054-like protein family (model TIGR02165). This family occurs in less than half of known Dpsyc clusters.	0
418515	cl18968	RNase_H2-B	Ribonuclease H2-B is a subunit of the eukaryotic RNase H complex which cleaves RNA-DNA hybrids. RNases H are enzymes that specifically hydrolyze RNA when annealed to a complementary DNA and are present in all living organisms. In yeast RNase H2 is composed of a complex of three proteins (Rnh2Ap, Ydr279p and Ylr154p), this family represents the homologs of Ydr279p. It is not known whether non yeast proteins in this family fulfil the same function.	0
276222	cl19000	Cas10_III	CRISPR/Cas system-associated protein Cas10. Members of this uncommon, sporadically distributed protein family are large (>900 amino acids) and strictly associated, so far, with CRISPR-associated (Cas) gene clusters. Nearby Cas genes always include members of the RAMP superfamily and the six-gene CRISPR-associated RAMP module. Species in which it is found, so far, include three archaea (Methanosarcina mazei, M. barkeri and Methanobacterium thermoautotrophicum) and two bacteria (Thermodesulfovibrio yellowstonii DSM 11347 and Sulfurihydrogenibium azorense).	0
276223	cl19002	Csf1_U	CRISPR/Cas system-associated protein Csf1. Members of this family show up near CRISPR repeats in Acidithiobacillus ferrooxidans ATCC 23270, Azoarcus sp. EbN1, and Rhodoferax ferrireducens DSM 15236. In the latter two species, the CRISPR/cas locus is found on a plasmid. This family is one of several characteristic of a type of CRISPR-associated (cas) gene cluster we designate Aferr after A. ferrooxidans, where it is both chromosomal and the only type of cas gene cluster found. The gene is designated csf1 (CRISPR/cas Subtype as in A. ferrooxidans protein 1), as it lies closest to the repeats.	0
276224	cl19005	Csc1_I-D	CRISPR/Cas system-associated protein Csc1. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is a widespread family of prokaryotic direct repeats with spacers of unique sequence between consecutive repeats. This protein family is a CRISPR-associated (Cas) family strictly associated with the Cyano subtype of CRISPR/Cas locus, found in several species of Cyanobacteria and several archaeal species. This family is designated Csc1 for CRISPR/Cas Subtype Cyano protein 1, as it is often the first gene upstream of the core cas genes, cas3-cas4-cas1-cas2.	0
276225	cl19006	Cas10d_I-D	CRISPR/Cas system-associated protein Cas10d. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is a widespread family of prokaryotic direct repeats with spacers of unique sequence between consecutive repeats. This protein family is a CRISPR-associated (Cas) family strictly associated with the Cyano subtype of CRISPR/Cas locus, found in several species of Cyanobacteria and several archaeal species. This family is designated Csc3 for CRISPR/Cas Subtype Cyano protein 3, as it is often the third gene upstream of the core cas genes, cas3-cas4-cas1-cas2.	0
418516	cl19028	Csm6_III-A	CRISPR/Cas system-associated protein Csm6. This entry represents a conserved region of about 150 amino acids found in at least five archaeal and three bacterial species. These species all contain CRISPRs (Clustered Regularly Interspaced Short Palindromic Repeats). In six of eight species, the protein is encoded in the vicinity of a CRISPR/Cas locus.	0
418517	cl19029	Csx16_III-U	CRISPR/Cas system-associated protein Csx16. This entry represents a conserved region of about 95 amino acids found exclusively in species with CRISPRs (Clustered Regularly Interspaced Short Palindromic Repeats). In all bacterial species that contain this entry, the genes encoding the proteins are in the midst of a cluster of cas (CRISPR-associated) genes.	0
418518	cl19051	ST7	Suppression of tumorigenicity 7. The ST7 (for suppression of tumorigenicity 7) protein is thought to be a tumor suppressor gene. The molecular function of this protein is uncertain.	0
418519	cl19054	SDH_N_domain	Saccharopine dehydrogenase N-terminal domain. Lysine-oxoglutarate reductase/Saccharopine dehydrogenase (LOR/SDH) is a bifunctional enzyme. This conserved region is commonly found immediately N-terminal to Saccharop_dh (pfam03435) in eukaryotes.	0
418520	cl19078	REC	phosphoacceptor receiver (REC) domain of response regulators (RRs) and pseudo response regulators (PRRs). TadZ_N is the N-terminal region of the Flp pilus assembly protein TadZ, which carries an AAA, ATPase domain immediately downstream, AAA_31, pfam13614. The domain is an example of a signal-transduction-response receiver. It is localized to the cytoplasmic side of the inner bacterial cell-membrane, contacting also with both tadA and RcpC.	0
418521	cl19096	Flavin_utilizing_monoxygenases	N/A. Members of this family are F420-binding enzymes with a proven functional N-terminal twin-arginine translocation (TAT) signal. Members are homologous to the cytosolic F420-dependent glucose-6-phosphate dehydrogenase but do not share the same function.	0
418522	cl19097	TS_Pyrimidine_HMase	N/A. This is a family of proteins that are flavin-dependent thymidylate synthases.	0
418523	cl19102	Fer4_9	4Fe-4S dicluster domain. Domain II of the enzyme dihydropyrimidine dehydrogenase binds FAD. Dihydropyrimidine dehydrogenase catalyzes the first and rate-limiting step of pyrimidine degradation by converting pyrimidines to the corresponding 5,6-dihydro compounds. This domain carries two Fe4-S4 clusters.	0
418524	cl19105	Sina	N/A. The seven in absentia (sina) gene was first identified in Drosophila. The Drosophila Sina protein is essential for the determination of the R7 pathway in photoreceptor cell development: the loss of functional Sina results in the transformation of the R7 precursor cell to a non-neuronal cell type. The Sina protein contains an N-terminal RING finger domain pfam00097. Through this domain, Sina binds E2 ubiquitin-conjugating enzymes (UbcD1). Sina also interacts with Tramtrack (TTK88) via PHYL. Tramtrack is a transcriptional repressor that blocks photoreceptor determination, while PHYL down-regulates the activity of TTK88. In turn, the activity of PHYL requires the activation of the Sevenless receptor tyrosine kinase, a process essential for R7 determination. It is thought that thus Sina targets TTK88 for degradation, therefore promoting the R7 pathway. Murine and human homologs of Sina have also been identified. The human homolog Siah-1 also binds E2 enzymes (UbcH5) and through a series of physical interactions, targets beta-catenin for ubiquitin degradation. Siah-1 expression is enhanced by p53, itself promoted by DNA damage. Thus this pathway links DNA damage to beta-catenin degradation. Sina proteins, therefore, physically interact with a variety of proteins. The N-terminal RING finger domain that binds ubiquitin conjugating enzymes is described in pfam00097, and does not form part of the alignment for this family. The remainder C-terminal part is involved in interactions with other proteins, and is included in this alignment. In addition to the Drosophila protein and mammalian homologs, whose similarity was noted previously, this family also includes putative homologs from Caenorhabditis elegans, Arabidopsis thaliana.	0
418525	cl19107	SPFH_like	core domain of the SPFH (stomatin, prohibitin, flotillin, and HflK/C) superfamily. This domain is found in the Major Vault Protein and has been called the shoulder domain. This family includes two bacterial proteins, suggesting that some bacteria may possess vault particles.	0
418526	cl19111	Sir4p-SID_like	The SID domain of Saccharomyces cerevisiae silent information regulator 4, a Sir2p interaction domain; and related domains. This is the Sir2 interaction domain (SID domain) of silent information regulator 4 (Sir4).	0
418527	cl19112	CBM29_CBM65	family 29 and family 65 carbohydrate binding modules. This domain is found in the non-catalytic carbohydrate binding module 65B (CMB65B) present in Eubacterium cellulosolvens. CBMs are present in plant cell wall degrading enzymes and are responsible for targeting, which enhances catalysis. CBM65s display higher affinity for oligosaccharides, such as cellohexaose, and particularly polysaccharides than cellotetraose, which fully occupies the core component of the substrate binding cleft. The concave surface presented by beta-sheet 2 comprises the beta-glucan binding site in CBM65s. C6 of all the backbone glucose moieties makes extensive hydrophobic interactions with the surface tryptophans of CBM65s. Three out of the four surface Trp are highly conserved. The conserved metal ion site typical of CBMs is absent in this CBM65 family.	0
418528	cl19114	RNAP_largest_subunit_N	Largest subunit of RNA polymerase (RNAP), N-terminal domain. RNA polymerases catalyze the DNA dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 3, represents the pore domain. The 3' end of RNA is positioned close to this domain. The pore delimited by this domain is thought to act as a channel through which nucleotides enter the active site and/or where the 3' end of the RNA may be extruded during back-tracking.	0
418529	cl19115	Cupredoxin	Cupredoxin superfamily. This family represents the N-terminal non-catalytic domain of protein-arginine deiminase. This domain has a cupredoxin-like fold.	0
418530	cl19120	SMBP_like	Small metal-binding protein conserved in proteobacteria. This histidine-rich protein binds metal ions.	0
418531	cl19121	ABBA-PTs	ABBA-type aromatic prenyltransferases (PTases). This family of proteins represents tryptophan dimethylallyltransferase (EC:2.5.1.34), which catalyzes the first step of ergot alkaloid biosynthesis. Ergot alkaloids, which are produced by endophyte fungi, can enhance plant host fitness, but also cause livestock toxicosis to host plants. This protein is found in bacteria and eukaryotes. Proteins in this family are typically between 390 and 465 amino acids in length.	0
418532	cl19122	PANDER_like	Domains similar to the Pancreatic-derived factor. ILEI is a family of proteins found in vertebrates. It is heavily involved in the process of the transition from epithelial to mesenchymal tissue - EMT - during all of embryonic development, cancer progression, metastasis, and chronic inflammation/fibrosis. ILEI is upregulated exclusively at the level of translation, and abnormal ILEI expression, ie cytoplasmic over-expression instead of vesicular localization, is associated with EMT in human cancerous tissue. In order to induce and maintain the EMT of hepatocytes in a TGF-beta-independent fashion ILEI needs the cooperation of oncogenic Ras.	0
418533	cl19123	lytB_ispH	4-hydroxy-3-methylbut-2-enyl diphosphate reductase. The mevalonate-independent 2-C-methyl-D-erythritol 4-phosphate (MEP) pathway for isoprenoid biosynthesis is essential in many eubacteria, plants, and the malaria parasite. The LytB gene is involved in the trunk line of the MEP pathway.	0
418534	cl19148	Molybdop_Fe4S4	Molybdopterin oxidoreductase Fe4S4 domain. The molybdopterin oxidoreductase Fe4S4 domain is found in a number of reductase/dehydrogenase families, which include the periplasmic nitrate reductase precursor and the formate dehydrogenase alpha chain.	0
418535	cl19167	Bac_export_2	FlhB HrpN YscU SpaS Family. type III secretion system protein HrcU; Validated	0
418537	cl19182	FlgH	Flagellar L-ring protein. flagellar basal body L-ring protein; Reviewed	0
418538	cl19186	Amidinotransf	Amidinotransferase. Peptidyl-arginine deiminase (PAD) enzymes catalyze the deimination of the guanidino group from carboxy-terminal arginine residues of various peptides to produce ammonia. PAD from Porphyromonas gingivalis (PPAD) appears to be evolutionarily unrelated to mammalian PAD (pfam03068), which is a metalloenzyme. PPAD is thought to belong to the same superfamily as aminotransferase and arginine deiminase, and to form an alpha/beta propeller structure. This family has previously been named PPADH (Porphyromonas peptidyl-arginine deiminase homologs). The predicted catalytic residues in PPAD are Asp130, Asp187, His236, Asp238 and Cys351. These are absolutely conserved with the exception of Asp187 which is absent in two family members. PPAD is also able to catalyze the deimination of free L-arginine, but has primarily peptidyl-arginine specificity. It may have a FMN cofactor.	0
418539	cl19188	PL-6	Polysaccharide Lyase Family 6. This family includes chondroitinases. These enzymes cleave the glycosaminoglycan dermatan sulfate.	0
418540	cl19190	Flavoprotein	Flavoprotein. phosphopantothenoylcysteine decarboxylase; Validated	0
418541	cl19192	LolA_fold-like	family containing periplasmic molecular chaperone LolA, the outer membrane lipoprotein receptor LolB and the periplasmic protein RseB. This domain, found in various hypothetical prokaryotic proteins, has no known function.	0
418542	cl19194	Phage_portal	Phage portal protein. This protein forms a hole, or portal, that enables DNA passage during packaging and ejection. It also forms the junction between the phage capsid and the tail proteins.	0
418543	cl19197	Complex1_30kDa	Respiratory-chain NADH dehydrogenase, 30 Kd subunit. This model describes the C subunit of the NADH dehydrogenase complex I in bacteria, as well as many instances of the corresponding mitochondrial subunit (NADH dehydrogenase subunit 9) and of the F420H2 dehydrogenase in Methanosarcina. Complex I contains subunits designated A-N. This C subunit often occurs as a fusion protein with the D subunit. This model excludes the NAD(P)H and plastoquinone-dependent form of chloroplasts and [Energy metabolism, Electron transport]	0
418544	cl19201	HypA	Hydrogenase/urease nickel incorporation, metallochaperone, hypA. CXXC-~12X-CXXC and genetically seems a regulatory protein. In H. pylori, a hypA mutant abolished hydrogenase activity and decreased urease activity. Nickel supplementation in media restored urease activity and partial hydrogenase activity. HypA is probably involved in inserting Ni into enzymes. [Protein fate, Protein modification and repair]	0
418545	cl19212	PTH2_family	N/A. Peptidyl-tRNA hydrolases are enzymes that release tRNAs from peptidyl-tRNA during translation.	0
418546	cl19215	CoA_transf_3	CoA-transferase family III. Members of this protein family belong by homology to the family of CoA transferases. However, the characterized member from Chloroflexus aurantiacus appears to perform an intramolecular transfer, making it an isomerase. The enzyme converts mesaconyl-C1-CoA to mesaconyl-C4-CoA as part of the bicyclic 3-hydroxyproprionate pathway for carbon fixation.	0
418547	cl19217	SBF	Sodium Bile acid symporter family. These family members are 7TM putative membrane transporter proteins. The family is similar to the SBF family of bile-acid symporters, pfam01758.	0
418548	cl19219	Bactofilin	Polymer-forming cytoskeletal. Members of this family include FapA (flagellar assembly protein A), found in Vibrio vulnificus. The synthesis of flagella allows bacteria to respond to chemotaxis by facilitating motility. Studies examining the role of FapA show that the loss or delocalization of FapA results in a complete failure of the flagellar biosynthesis and motility in response to glucose mediated chemotaxis. The polar localization of FapA is required for flagellar synthesis, and dephosphorylated EIIAGlc (Glucose-permease IIA component) inhibited the polar localization of FapA through direct interaction.	0
418549	cl19223	G_glu_transpept	Gamma-glutamyltranspeptidase. gamma-glutamyltranspeptidase; Reviewed	0
418550	cl19224	TGT	Queuine tRNA-ribosyltransferase. queuine tRNA-ribosyltransferase; Provisional	0
418551	cl19237	DUF45	Protein of unknown function DUF45. This family represents a domain found in eukaryotes and prokaryotes. The domain contains a characteristic motif of the zinc metallopeptidases. This family includes the bacterial SprT protein.	0
418552	cl19248	CHAT	CHAT domain. These proteins appear to be related to peptidases in peptidase clan CD that includes the caspases. This domain has been termed the CHAT domain for Caspase HetF Associated with Tprs. This family has been identified as a sister group to the separins.	0
418553	cl19251	zf-ZPR1	ZPR1 zinc-finger domain. An orthologous protein found once in each of the completed archaeal genomes corresponds to a zinc finger-containing domain repeated as the N-terminal and C-terminal halves of the mouse protein ZPR1. ZPR1 is an experimentally proven zinc-binding protein that binds the tyrosine kinase domain of the epidermal growth factor receptor (EGFR); binding is inhibited by EGF stimulation and tyrosine phosphorylation, and activation by EGF is followed by some redistribution of ZPR1 to the nucleus. By analogy, other proteins with the ZPR1 zinc finger domain may be regulatory proteins that sense protein phosphorylation state and/or participate in signal transduction.	0
418554	cl19252	MreC	rod shape-determining protein MreC. rod shape-determining protein MreC; Provisional	0
418555	cl19253	YcaO	YcaO cyclodehydratase, ATP-ad Mg2+-binding. Members of this protein family include enzymes related to SagD, previously referred to as a scaffold or docking protein involved in the biosynthesis of streptolysin S in Streptococcus pyogenes from the protoxin polypeptide (product of the sagA gene). Newer evidence describes an enzymatic activity, an ATP-dependent cyclodehydration reaction, previously ascribed to the SagC component. This protein family serves as a marker for widely distributed prokaryotic systems for making a general class of heterocycle-containing bacteriocins.	0
418556	cl19280	FlgI	Flagellar P-ring protein. flagellar basal body P-ring protein; Provisional	0
418557	cl19284	Ribosomal_L37ae	Ribosomal L37ae protein family. This model finds eukaryotic ribosomal protein eL43 (previously L37a) and its archaeal orthologs. The nomenclature is tricky because eukaryotes have proteins called both L37 and L37a. [Protein synthesis, Ribosomal proteins: synthesis and modification]	0
418558	cl19285	SmpA_OmlA	SmpA / OmlA family. Structure 3D4E shared structural similarity to beta-lactamase inhibitory proteins (BLIP) which already include 1XXM, 1S0W, 1JTG, 2G2U, 2G2W, 2B5R, and 3due. All of these structures are involved in beta-lactamase inhibitor complexes. (REF http://www.topsan.org/Proteins/JCSG/3d4e)	0
302813	cl19288	RhaT	L-rhamnose-proton symport protein (RhaT). Members of this family fall into the drug/metabolite transporter (dmt) superfamily. They carry 10xTM domains arranged as 5+5. Although these two sets may originally have arisen by gene-duplication the divergence now is such that the two halves are no longer homologous.	0
418559	cl19294	ApbE	ApbE family. thiamine biosynthesis lipoprotein ApbE; Provisional	0
418560	cl19297	Dor1	Dor1-like family. This Sec5 family of eukaryotic proteins conserved is not representing the Sec5-Ral binding site.	0
418561	cl19308	MdoG	Periplasmic glucan biosynthesis protein, MdoG. glucan biosynthesis protein G; Provisional	0
418562	cl19310	CT_C_D	Carboxyltransferase domain, subdomain C and D. This domain represents subunit 1 of allophanate hydrolase (AHS1).	0
418563	cl19311	Urocanase	Urocanase Rossmann-like domain. urocanate hydratase; Provisional	0
418564	cl19312	FdhE	Protein involved in formate dehydrogenase formation. formate dehydrogenase accessory protein FdhE; Provisional	0
418565	cl19356	PmbA_TldD	Putative modulator of DNA gyrase. peptidase PmbA; Provisional	0
418566	cl19360	DegV	Uncharacterized protein, DegV family COG1307. This is the kinase domain of the dihydroxyacetone kinase family.	0
418567	cl19362	Coprogen_oxidas	Coproporphyrinogen III oxidase. coproporphyrinogen-III oxidase	0
418568	cl19374	Diphthamide_syn	Putative diphthamide synthesis protein. Members of this family are the archaeal protein Dph2, members of the universal archaeal protein family designated arCOG04112. The chemical function of this protein is analogous to the radical SAM family (pfam04055), although the sequence is not homologous. The chemistry involves [4Fe-4S]-aided formation of a 3-amino-3-carboxypropyl radical rather than the canonical 5'-deoxyadenosyl radical of the radical SAM family.	0
418569	cl19388	COX15-CtaA	Cytochrome oxidase assembly protein. cytochrome c oxidase assembly protein; Provisional	0
418570	cl19398	Rep_3	Initiator Replication protein. Members of this family of bacterial proteins are single-stranded DNA binding proteins that are involved in DNA replication, repair and recombination.	0
418571	cl19401	FUSC_2	Fusaric acid resistance protein-like. This family consists of bacterial proteins with three transmembrane regions that are purported to be aromatic acid exporters.	0
327549	cl19409	Cad	Cadmium resistance transporter. These proteins are members of the Cadmium Resistance (CadD) Family (TC 2.A.77). To date, this family of proteins has only been found in Gram-positive bacteria. The CadD family includes several closely related Staphylococcal proteins reported to function in cadmium resistance. Members are predicted to span the membrane five times; the mechanism of resistance is believed to be export but has also been suggested to be binding and sequestration in the membrane. Closely related but outside the scope of this model is another staphylococcal protein that has been reported to possibly function in quaternary ammonium ion export. Still more distant are other members of the broader LysE family (see Vrljic. et al, ). [Transport and binding proteins, Amino acids, peptides and amines]	0
327550	cl19414	Glt_symporter	Sodium/glutamate symporter. [Transport and binding proteins, Amino acids, peptides and amines]	0
418573	cl19416	GRDB	Glycine/sarcosine/betaine reductase selenoprotein B (GRDB). Members of this family form the PrdB subunit, usually a selenoprotein, in the D-proline reductase complex. The usual pathway is conversion of L-proline to D-proline by a racemase, then use of D-proline as an electron acceptor coupled to ATP generation under anaerobic conditions.	0
418574	cl19417	FYDLN_acid	Protein of unknown function (FYDLN_acid). Members of this family are bacterial proteins with a conserved motif [KR]FYDLN, sometimes flanked by a pair of CXXC motifs, followed by a long region of low complexity sequence in which roughly half the residues are Asp and Glu, including multiple runs of five or more acidic residues. The function of members of this family is unknown.	0
418575	cl19418	PrpF	PrpF protein. The 2-methylcitrate cycle is one of at least five degradation pathways for propionate via propionyl-CoA. Degradation of propionate toward pyruvate consumes oxaloacetate and releases succinate. Oxidation of succinate back into oxaloacetate by the TCA cycle makes the 2-methylcitrate pathway a cycle. This family consists of PrpF, an incompletely characterized protein that appears to be an essential accessory protein for the Fe/S-dependent 2-methylisocitrate dehydratase AcnD (TIGR02333). This protein is related to but distinct from FldA (part of pfam04303), a putative fluorene degradation protein of Sphingomonas sp. LB126. [Energy metabolism, Fermentation]	0
418576	cl19419	DUF2263	Uncharacterized protein conserved in bacteria (DUF2263). Members of this uncharacterized protein family are found in Streptomyces, Nostoc sp. PCC 7120, Clostridium acetobutylicum, Lactobacillus johnsonii NCC 533, Deinococcus radiodurans, and Pirellula sp. for a broad but sparse phylogenetic distribution that at least suggests lateral gene transfer.	0
418577	cl19420	Spore_IV_A	Stage IV sporulation protein A (spore_IV_A). A comparative genome analysis of all sequenced genomes of shows a number of proteins conserved strictly among the endospore-forming subset of the Firmicutes. This protein, a member of this panel, is designated stage IV sporulation protein A. It acts in the mother cell compartment and plays a role in spore coat morphogenesis. [Cellular processes, Sporulation and germination]	0
302859	cl19421	RHSP	Retrotransposon hot spot protein. This model describes full-length and part-length members of the RHS (retrotransposon hot spot) family in Trypanosoma brucei and Trypanosoma cruzi. Members of this family are frequently interrupted by non-LTR retrotransposons inserted at exactly the same relative position.	0
267777	cl19424	GCH_III	GTP cyclohydrolase III. GTP cyclohydrolase (GCH) III from Methanocaldococcus jannaschii catalyzes the conversion of GTP to 2-amino-5-formylamino-6-ribosylamino-4(3H)-pyrimidinone 5'-phosphate (FAPy). The reaction requires two bound magnesium ions for the catalysis and is activated by monovalent cations such as potassium and ammonium. The enzyme is a tetramer of identical subunits; each monomer is composed of an N- and a C-terminal domain that adopt nearly superimposable structures, suggesting that the protein has arisen by gene duplication. The family is found in archaea and bacteria.	0
388564	cl19428	TrbH	Conjugal transfer protein TrbH. conjugal transfer protein TrbH; Provisional	0
418578	cl19470	PTS_2-RNA	RNA 2'-phosphotransferase, Tpt1 / KptA family. RNA 2'-phosphotransferase; Reviewed	0
418579	cl19471	DUF945	Bacterial protein of unknown function (DUF945). hypothetical protein; Provisional	0
418580	cl19472	Lipoprotein_16	Uncharacterized lipoprotein. hypothetical protein; Provisional	0
418581	cl19473	DUF1846	Domain of unknown function (DUF1846). hypothetical protein; Provisional	0
418582	cl19474	TraV	Type IV conjugative transfer system lipoprotein (TraV). The TraV protein is a component of conjugative type IV secretion systems. TraV is an outer membrane lipoprotein and is believed to interact with the secretin TraK. The alignment contains three conserved cysteines in the N-terminal half.	0
418583	cl19475	TraN	Type-IV conjugative transfer system mating pair stabilisation. TraN is a large cysteine-rich outer membrane protein involved in the mating-pair stabilization (adhesin) component of the F-type conjugative plasmid transfer system. TraN is believed to interact with the core type IV secretion system apparatus through the TraV protein.	0
418584	cl19477	Sulf_transp	Sulphur transport. For 79 of the first 80 reference genomes in which a member of this protein family, YedE, is found, a selenium utilization system is found, spread over a broad taxonomic range (Firmicutes, spirochetes, delta-proteobacteria, Fusobacteria, Bacteroides, etc.). This family is less widespread than YedF, also involved in selenium metabolism.	0
354426	cl19481	LON	Found in ATP-dependent protease La (LON). N-terminal domain of the ATP-dependent protease La (LON), present also in other bacterial ORFs.	0
388572	cl19482	Peptidase_M8	Leishmanolysin. Glycoprotein GP63 (leishmanolysin); Provisional	0
418585	cl19485	AMA-1	Apical membrane antigen 1. apical membrane antigen 1; Provisional	0
418586	cl19499	UPF0061	Uncharacterized ACR, YdiU/UPF0061 family. 	0
418587	cl19501	Mut7-C	Mut7-C RNAse domain. RNAse domain of the PIN fold with an inserted Zinc Ribbon at the C-terminus.	0
354429	cl19503	TadB	Flp pilus assembly protein TadB  [Intracellular trafficking, secretion, and vesicular transport, Extracellular structures]. 	0
418588	cl19504	SpoVR	SpoVR like protein. SpoVR family protein; Provisional	0
418589	cl19505	CreA	CreA protein. This family consists of several bacterial CreA proteins, the function of which is unknown.	0
418590	cl19506	RraB	Regulator of ribonuclease activity B. This family of proteins regulate mRNA abundance by binding to RNaseE and inhibiting its endonucleolytic activity. A subset of these proteins are predicted to function as immunity proteins.	0
418591	cl19507	Lipoprotein_18	NlpB/DapX lipoprotein. This family consists of a number of bacterial lipoproteins often known as NlpB or DapX. This lipoprotein is detected in outer membrane vesicles in Escherichia coli and appears to be nonessential.	0
418592	cl19509	Fibrillarin_2	Fibrillarin-like archaeal protein. Members of this protein family are HmdC, whose gene regularly occurs in the context of genes for HmdA (5,10-methenyltetrahydromethanopterin hydrogenase) and the radical SAM protein HmdB involved in biosynthesis of the HmdA cofactor. Bioinformatics suggests this protein, a homolog of eukaryotic fibrillarin, may be involved in biosynthesis of the guanylyl pyridinol cofactor in HmdA. [Protein fate, Protein modification and repair, Energy metabolism, Methanogenesis]	0
418593	cl19510	EutB	Ethanolamine ammonia lyase large subunit (EutB). This family consists of several bacterial ethanolamine ammonia lyase large subunit (EutB) proteins (EC:4.3.1.7). Ethanolamine ammonia-lyase is a bacterial enzyme that catalyzes the adenosylcobalamin-dependent conversion of certain vicinal amino alcohols to oxo compounds and ammonia. The enzyme is a heterodimer composed of subunits of Mr approximately 55,000 (EutB) and 35,000 (EutC).	0
418594	cl19511	BshC	Bacillithiol biosynthesis BshC. Members of this protein family are BshC, an enzyme required for bacillithiol biosynthesis and described as a cysteine-adding enzyme. Bacillithiol is a low-molecular-weight thiol, an analog of glutathione and mycothiol, and is found largely in the Firmicutes. [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs]	0
418595	cl19519	FKBP_C	FKBP-type peptidyl-prolyl cis-trans isomerase. This family consists of uncharacterized proteins around 200 residues in length and is mainly found in various Bacteroides species. Distant homology prediction algorithms consistently suggest a homology between this family and FKBP-type peptidyl-prolyl cis-trans isomerases (PF00254), but this relation is as yet not confirmed. The function of this family is unknown.	0
418596	cl19522	PK_C	Pyruvate kinase, alpha/beta domain. As well as being found in pyruvate kinase this family is found as an isolated domain in some bacterial proteins.	0
327577	cl19527	SWIM	SWIM zinc finger. This domain is found in bacterial, archaeal and eukaryotic proteins. It is predicted to be organized into two N-terminal beta-strands and a C-terminal alpha helix, thus possibly adopting a fold similar to that of the C2H2 zinc finger (pfam00096). SWIM is thought to be a versatile domain that can interact with DNA or proteins in different contexts.	0
418597	cl19531	Phage_prot_Gp6	Phage portal protein, SPP1 Gp6-like. This model represents one of several distantly related families of phage portal protein. This protein forms a hole, or portal, that enables DNA passage during packaging and ejection. It also forms the junction between the phage head (capsid) and the tail proteins. It functions as a dodecamer of a single polypeptide of average mol. wt. of 40-90 KDa. [Mobile and extrachromosomal element functions, Prophage functions]	0
418599	cl19541	Head-tail_con	Bacteriophage head to tail connecting protein. hypothetical protein	0
418600	cl19543	Metallothio	Metallothionein. This is a family of eukaryotic metallothioneins.	0
302911	cl19548	DUF515	Protein of unknown function (DUF515). Family of hypothetical Archaeal proteins.	0
418601	cl19549	DUF505	Protein of unknown function (DUF505). Family of uncharacterized prokaryotic proteins.	0
418602	cl19551	Monooxygenase_B	Monooxygenase subunit B protein. Both ammonia oxidizers such as Nitrosomonas europaea and methanotrophs (obligate methane oxidizers) such as Methylococcus capsulatus each can grow only on their own characteristic substrate. However, both groups have the ability to oxidize both substrates, and so the relevant enzymes must be named here according to their ability to oxidize both. The protein family represented here reflects subunit B of both the particulate methane monooxygenase of methylotrophs and the ammonia monooxygenase of nitrifying bacteria.	0
418603	cl19557	DUF1016	Protein of unknown function (DUF1016). Family of uncharacterized proteins found in viruses, archaea and bacteria.	0
418604	cl19561	DNA_circ_N	DNA circularisation protein N-terminus. This family represents the N-terminus (approximately 100 residues) of a number of phage DNA circularisation proteins.	0
418605	cl19562	PAS_5	PAS domain. This family contains a number of hypothetical bacterial proteins of unknown function approximately 200 residues long. This region is distantly similar to other PAS domains.	0
418606	cl19566	RE_Alw26IDE	Type II restriction endonuclease (RE_Alw26IDE). Members of this family are type II restriction endonucleases of the Alw26I/Eco31I/Esp3I family. Characterized specificities of three members are GGTCTC, CGTCTC, and the shared subsequence GTCTC. [DNA metabolism, Restriction/modification]	0
418607	cl19567	DSL	Delta serrate ligand. 	0
418608	cl19568	wnt	wnt family. Wnt genes have been identified in vertebrates and invertebrates but not in plants, unicellular eukaryotes or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families.	0
388595	cl19569	VPS9	Vacuolar sorting protein 9 (VPS9) domain. Domain present in yeast vacuolar sorting protein 9 and other proteins.	0
418609	cl19573	Pyr_excise	Pyrimidine dimer DNA glycosylase. Members of this protein family are found in a small number of taxonomically well separated species, yet are strongly conserved, suggesting lateral gene transfer. Members are found in Treponema denticola, Clostridium acetobutylicum, and several of the Firmicutes. The function of this protein is unknown. [Hypothetical proteins, Conserved]	0
418610	cl19574	Salt_tol_Pase	Glucosylglycerol-phosphate phosphatase (Salt_tol_Pase). Proteins in this family are glucosylglycerol-phosphate phosphatase, with the gene symbol stpA (Salt Tolerance Protein A). A motif characteristic of acid phosphatases is found, but otherwise this family shows little sequence similarity to other phosphatases. This enzyme acts on the glucosylglycerol phosphate, product of glucosylglycerol phosphate synthase and immediate precursor of the osmoprotectant glucosylglycerol.	0
418611	cl19575	HrpB4	Bacterial type III secretion protein (HrpB4). This family of genes are always found in type III secretion operons in a limited number of species including Burkholderia, Xanthomonas and Ralstonia.	0
327591	cl19576	HrpB1_HrpK	Bacterial type III secretion protein (HrpB1_HrpK). This gene is found within type III secretion operons in a limited range of species including Xanthomonas, Ralstonia and Burkholderia.	0
418612	cl19579	Peptidase_U4	Sporulation factor SpoIIGA. Members of this protein family are the stage II sporulation protein SpoIIGA. This protein acts as an activating protease for Sigma-E, one of several specialized sigma factors of the sporulation process in Bacillus subtilis and related endospore-forming bacteria. [Cellular processes, Sporulation and germination]	0
354443	cl19580	pip_yhgE_Nterm	YhgE/Pip N-terminal domain. Members of this family are associated with type VII secretion of WXG100 family targets in the Firmicutes, but not in the Actinobacteria. This model represents the conserved N-terminal domain.	0
267934	cl19581	COG4008	Predicted metal-binding transcription factor, methanogenesis marker domain 9 [Transcription]. A gene for a protein that contains a copy of this domain, to date, is found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it. A 69-amino acid core region of this 110-amino acid domain contains eight invariant Cys residues, including two copies of a motif [WFY]CCxxKPC. These motifs could be consistent with predicted metal-binding transcription factor as was suggested for the COG4008 family. Some members of this family have an additional N-terminal domain of about 250 amino acids from the nifR3 family of predicted TIM-barrel proteins.	0
418613	cl19585	Caud_tail_N	Caudoviral major tail protein N-terminus. tail protein	0
418614	cl19592	Zn_ribbon_recom	Recombinase zinc beta ribbon domain. This is a viral family of phage zinc-binding transcriptional activators, which also contains cryptic members in some bacterial genomes. The P4 phage delta protein contains two such domains attached covalently, while the P2 phage Ogr proteins possess one domain but function as dimers. All the members of this family have the following consensus sequence: C-X(2)-C-X(3)-A-(X)2-R-X(15)-C-X(4)-C-X(3)-F. This family also includes zinc fingers in recombinase proteins.	0
418615	cl19596	Peptidase_M29	Thermophilic metalloprotease (M29). 	0
302938	cl19597	SPAN	Surface presentation of antigens protein. Surface presentation of antigens protein (SPAN), also known as invasion protein invJ, is a Salmonella secretory pathway protein involved in presentation of determinants required for mammalian host cell invasion.	0
302951	cl19613	IpgD	Enterobacterial virulence protein IpgD. This family consists of several enterobacterial IpgD like virulence factor proteins. In the Gram-negative pathogen Shigella flexneri, the virulence factor IpgD is translocated directly into eukaryotic cells and acts as a potent inositol 4-phosphatase that specifically dephosphorylates phosphatidylinositol 4,5-bisphosphate [PtdIns(4,5)P(2)] into phosphatidylinositol 5-monophosphate [PtdIns(5)P] that then accumulates. Transformation of PtdIns(4,5)P(2) into PtdIns(5)P by IpgD is responsible for dramatic morphological changes of the host cell, leading to a decrease in membrane tether force associated with membrane blebbing and actin filament remodelling.	0
418616	cl19614	Phage_term_smal	Phage small terminase subunit. terminase endonuclease subunit; Provisional	0
418617	cl19619	FBPase_2	Firmicute fructose-1,6-bisphosphatase. This family consists of several bacterial fructose-1,6-bisphosphatase proteins (EC:3.1.3.11) which seem to be specific to phylum Firmicutes. Fructose-1,6-bisphosphatase (FBPase) is a well known enzyme involved in gluconeogenesis. This family does not seem to be structurally related to pfam00316.	0
418618	cl19620	Acetone_carb_G	Acetone carboxylase gamma subunit. Acetone carboxylase is the key enzyme of bacterial acetone metabolism, catalyzing the condensation of acetone and CO(2) to form acetoacetate.	0
418619	cl19622	Plasmid_RAQPRD	Plasmid protein of unknown function (Plasmid_RAQPRD). This model represents a small family of proteins about 100 amino acids in length, including a predicted signal sequence and a perfectly conserved motif RAQPRD towards the C-terminus. Members are found in the Pseudomonas putida TOL plasmid pWW0 and in cryptic plasmid regions of Salmonella enterica subsp. enterica serovar Typhi and Pseudomonas syringae DC3000. The function is unknown. [Mobile and extrachromosomal element functions, Plasmid functions]	0
418620	cl19623	ArsR	ArsR transcriptional regulator. Members of this family of archaeal proteins are conserved transcriptional regulators belonging to the ArsR family.	0
418621	cl19625	DUF2314	Uncharacterized protein conserved in bacteria (DUF2314). This domain is found in various bacterial hypothetical proteins, as well as putative ankyrin repeat proteins. The exact function of the domains comprising this family has not, as yet, been determined.	0
418622	cl19626	DUF2321	Uncharacterized protein conserved in bacteria (DUF2321). Members of this family of hypothetical bacterial proteins have no known function.	0
418623	cl19627	GlnD_UR_UTase	GlnD PII-uridylyltransferase. This domain is found associated with presumed nucleotidyltransferase domains and seems to be distantly related to other helical substrate binding domains.	0
418624	cl19633	DUF2799	Protein of unknown function (DUF2799). lipoprotein; Provisional	0
418625	cl19646	DUF4138	Domain of unknown function (DUF4138). Members of this family are the TraN protein encoded by transfer region genes of conjugative transposons of Bacteroides. The family is related to conjugative transfer proteins VirB9 and TrbG of Agrobacterium Ti plasmids. [Cellular processes, DNA transformation]	0
388610	cl19720	DUF4370	Domain of unknown function (DUF4370). Uncharacterized protein At1g47420	0
418626	cl19721	CAAD	CAAD domains of cyanobacterial aminoacyl-tRNA synthetase. photosystem I P subunit (PSI-P)	0
418627	cl19726	DUF2884	Protein of unknown function (DUF2884). hypothetical protein; Provisional	0
418628	cl19727	DUF1451	Zinc-ribbon containing domain. This family consists of several hypothetical bacterial proteins of around 160 residues in length. Members of this family contain four highly conserved cysteine residues toward the C-terminal region of the protein.	0
418629	cl19728	TraT	Enterobacterial TraT complement resistance protein. The traT gene is one of the F factor transfer genes and encodes an outer membrane protein which is involved in interactions between an Escherichia coli and its surroundings.	0
354453	cl19729	COG2888	Predicted RNA-binding protein involved in translation, contains  Zn-ribbon domain, DUF1610 family [General function prediction only]. putative Zn-ribbon RNA-binding protein; Provisional	0
388615	cl19730	Wzz	Chain length determinant protein. This family includes proteins involved in lipopolysaccharide (lps) biosynthesis. This family comprises the whole length of chain length determinant protein (or wzz protein) that confers a modal distribution of chain length on the O-antigen component of lps. This region is also found as part of bacterial tyrosine kinases.	0
418630	cl19731	SipA	Salmonella invasion protein A. Salmonella invasion protein A is an actin-binding protein that contributes to host cytoskeletal rearrangements by stimulating actin polymerization and counteracting F-actin destabilizing proteins. Members of this family possess an all-helical fold consisting of eight alpha-helices arranged so that six long, amphipathic helices form a compact fold that surrounds a final, predominantly hydrophobic helix in the middle of the molecule.	0
418631	cl19736	NlpE	NlpE N-terminal domain. This family represents a bacterial outer membrane lipoprotein that is necessary for signalling by the Cpx pathway. This pathway responds to cell envelope disturbances and increases the expression of periplasmic protein folding and degradation factors. While the molecular function of the NlpE protein is unknown, it may be involved in detecting bacterial adhesion to abiotic surfaces. In Escherichia coli and Salmonella typhi, NlpE is also known to confer copper tolerance in copper-sensitive strains of Escherichia coli, and may be involved in copper efflux and delivery of copper to copper-dependent enzymes.	0
418632	cl19737	DUF979	Protein of unknown function (DUF979). This family consists of several putative bacterial membrane proteins. The function of this family is unclear.	0
418633	cl19739	DUF1131	Protein of unknown function (DUF1131). RpoE-regulated lipoprotein; Provisional	0
418634	cl19740	DUF1272	Protein of unknown function (DUF1272). This family consists of several hypothetical bacterial proteins of around 80 residues in length. This family contains a number of conserved cysteine residues and its function is unknown.	0
418636	cl19744	zf-UBR	Putative zinc finger in N-recognin (UBR box). Domain is involved in recognition of N-end rule substrates in yeast Ubr1p	0
418637	cl19745	Ins145_P3_rec	Inositol 1,4,5-trisphosphate/ryanodine receptor. This domain corresponds to the ligand binding region on inositol 1,4,5-trisphosphate receptor, and the N terminal region of the ryanodine receptor. Both receptors are involved in Ca2+ release. They can couple to the activation of neurotransmitter-gated receptors and voltage-gated Ca2+ channels on the plasma membrane, thus allowing the endoplasmic reticulum to discriminate between different types of neuronal activity.	0
418638	cl19746	GDNF	GDNF/GAS1 domain. This cysteine rich domain is found in multiple copies in GDNF and GAS1 proteins. GDNF and neurturin (NTN) receptors are potent survival factors for sympathetic, sensory and central nervous system neurons. GDNF and neurturin promote neuronal survival by signaling through similar multicomponent receptors that consist of a common receptor tyrosine kinase and a member of a GPI-linked family of receptors that determines ligand specificity.	0
418639	cl19747	BetaGal_dom2	Beta-galactosidase, domain 2. This is the second domain of the five-domain beta-galactosidase enzyme that altogether catalyses the hydrolysis of beta(1-3) and beta(1-4) galactosyl bonds in oligosaccharides as well as the inverse reaction of enzymatic condensation and trans-glycosylation. This domain is made up of 16 antiparallel beta-strands and an alpha-helix at its C terminus. The fold of this domain appears to be unique. In addition, the last seven strands of the domain form a subdomain with an immunoglobulin-like (I-type Ig) fold in which the first strand is divided between the two beta-sheets. In penicillin spp this strand is interrupted by a 12-residue insertion which forms an additional edge-strand to the second beta-sheet of the sub-domain. The remainder of the second domain forms a series of beta-hairpins at its N terminus, four strands of which are contiguous with part of the Ig-like sub-domain, forming in total a seven-stranded antiparallel beta-sheet. This domain is associated with family Glyco_hydro_35, which is N-terminal to it, but itself has no metazoan members.	0
388627	cl19751	bPH_4	Bacterial PH domain. This family of proteins with unknown function appear to be related to bacterial PH domains. This family was formerly known as DUF2679.	0
418642	cl19752	DUF2145	Uncharacterized protein conserved in bacteria (DUF2145). This domain, found in various hypothetical prokaryotic proteins, has no known function.	0
418643	cl19753	DUF2154	Cell wall-active antibiotics response 4TMS YvqF. 	0
276272	cl19755	heavy_Cys_CGP	heavy-Cys/CGP-CTERM domain protein. In this domain of about 50 residues, eight of twelve invariant residues are Cys. Proteins with this domain tend to have N-terminal signal sequences, suggesting an extracytoplasmic location for this domain.	0
418644	cl19756	I_LWEQ	I/LWEQ domain. Thought to possess an F-actin binding function.	0
418645	cl19758	FH2	Formin Homology 2 Domain. FH proteins control rearrangements of the actin cytoskeleton, especially in the context of cytokinesis and cell polarisation. Members of this family have been found to interact with Rho-GTPases, profilin and other actin-associated proteins. These interactions are mediated by the proline-rich FH1 domain, usually located in front of FH2 (but not listed in SMART). Despite this cytosolic function, vertebrate formins have been assigned functions within the nucleus. A set of Formin-Binding Proteins (FBPs) has been shown to bind FH1 with their WW domain.	0
418646	cl19760	IBR	IBR domain, a half RING-finger domain. The domain occurs between pairs of RING fingers.	0
418647	cl19763	BOP1NT	BOP1NT (NUC169) domain. This N terminal domain is found in BOP1-like WD40 proteins.	0
418648	cl19764	COG6	Conserved oligomeric complex COG6. COG6 is a component of the conserved oligomeric golgi complex, which is composed of eight different subunits and is required for normal golgi morphology and localisation.	0
418649	cl19765	mTERF	mTERF. MOC1-like protein; Provisional	0
418650	cl19816	Hydantoinase_B	Hydantoinase B/oxoprolinase. This family includes N-methylhydaintoinase B which converts hydantoin to N-carbamyl-amino acids, and 5-oxoprolinase EC:3.5.2.9 which catalyzes the formation of L-glutamate from 5-oxo-L-proline. These enzymes are part of the oxoprolinase family and are related to pfam01968.	0
418651	cl19817	UreF	UreF. This family consists of the Urease accessory protein UreF. The urease enzyme (urea amidohydrolase) hydrolyzes urea into ammonia and carbamic acid. UreF is proposed to modulate the activation process of urease by eliminating the binding of nickel ions to noncarbamylated protein.	0
418652	cl19818	DUF106	Integral membrane protein DUF106. This archaebacterial protein family has no known function. Members are predicted to be integral membrane proteins.	0
388639	cl19819	DUF499	Protein of unknown function (DUF499). Family of uncharacterized hypothetical prokaryotic proteins.	0
418653	cl19820	DUF530	Protein of unknown function (DUF530). Family of hypothetical archaeal proteins.	0
388641	cl19821	DUF166	Domain of unknown function. This family catalyzes the synthesis of thymidine monophosphate (dTMP) from deoxyuridine monophosphate (dUMP). The physiological co-substrate has not yet been identified. Previous designation of this family as being thymidylate synthase from one paper, PMID:10436953, has been shown to be erroneous. The proteins are uncharacterized.	0
418654	cl19822	DUF362	Domain of unknown function (DUF362). Domain that is sometimes present in iron-sulphur proteins.	0
418655	cl19823	GtrA	GtrA-like protein. Members of this family are predicted to be integral membrane proteins with three or four transmembrane spans. They are involved in the synthesis of cell surface polysaccharides. The GtrA family are a subset of this family. GtrA is predicted to be an integral membrane protein with 4 transmembrane spans. It is involved in O antigen modification by Shigella flexneri bacteriophage X (SfX), but does not determine the specificity of glucosylation. Its function remains unknown, but it may play a role in translocation of undecaprenyl phosphate linked glucose (UndP-Glc) across the cytoplasmic membrane. Another member of this family is a DTDP-glucose-4-keto-6-deoxy-D-glucose reductase, which catalyzes the conversion of dTDP-4-keto-6-deoxy-D-glucose to dTDP-D-fucose, which is involved in the biosynthesis of the serotype-specific polysaccharide antigen of Actinobacillus actinomycetemcomitans Y4 (serotype b). This family also includes the teichoic acid glycosylation protein, GtcA, which is a serotype-specific protein in some Listeria innocua and monocytogenes strains. Its exact function is not known, but it is essential for decoration of cell wall teichoic acids with glucose and galactose.	0
418656	cl19824	Alpha-E	A predicted alpha-helical domain with a conserved ER motif. An uncharacterized alpha helical domain containing a highly conserved ER motif and typically found as a tandem duplication. Contextual analysis suggests that it functions in a distinct peptide synthesis/modification system comprising of a transglutaminase, a peptidase of the NTN-hydrolase superfamily, an active and inactive circularly permuted ATP-grasp domains and a transglutaminase fused N-terminal to a circularly permuted COOH-NH2 ligase domain.	0
418657	cl19825	Zn_peptidase	Putative neutral zinc metallopeptidase. Members of this family have a predicted zinc binding motif characteristic of neutral zinc metallopeptidases (Prosite:PDOC00129).	0
418658	cl19826	FmdA_AmdA	Acetamidase/Formamidase family. This family includes amidohydrolases of formamide EC:3.5.1.49 and acetamide. Methylophilus methylotrophus FmdA forms a homotrimer suggesting all the members of this family also do.	0
418659	cl19828	DUF2309	Uncharacterized protein conserved in bacteria (DUF2309). Members of this family of hypothetical bacterial proteins have no known function.	0
418660	cl19829	DUF333	Domain of unknown function (DUF333). This small domain of about 70 residues is found in a number of bacterial proteins. It is found at the N-terminus of the AF_1947 protein. The proteins containing this domain are uncharacterized.	0
418661	cl19830	PilN	Fimbrial assembly protein (PilN). 	0
418662	cl19831	PilP	Pilus assembly protein, PilP. The PilP family are periplasmic proteins involved in the biogenesis of type IV pili.	0
418663	cl19832	DIT1_PvcA	Pyoverdine/dityrosine biosynthesis protein. DIT1 is involved in synthesising dityrosine. Dityrosine is a sporulation-specific component of the yeast ascospore wall that is essential for the resistance of the spores to adverse environmental conditions. Pyoverdine biosynthesis protein PvcA is involved in the biosynthesis of pyoverdine, a cyclized isocyano derivative of tyrosine. It has a modified Rossmann fold.	0
418664	cl19833	HTH_42	Winged helix DNA-binding domain. This family contains two copies of a winged helix domain.	0
418665	cl19834	DUF2066	Uncharacterized protein conserved in bacteria (DUF2066). This domain, found in various prokaryotic proteins, has no known function.	0
418666	cl19836	DUF2072	Zn-ribbon containing protein. This archaeal protein has no known function.	0
418667	cl19837	DUF790	Protein of unknown function (DUF790). This family consists of several hypothetical archaeal proteins of unknown function.	0
418668	cl19838	DUF4190	Domain of unknown function (DUF4190). Family of uncharacterized proteins found in bacteria and archaea.	0
418669	cl19839	TfuA	TfuA-like protein. This family consists of a group of sequences that are similar to a region of TfuA protein. This protein is involved in the production of trifolitoxin (TFX), a gene-encoded, post-translationally modified peptide antibiotic. The role of TfuA in TFX synthesis is unknown, and it may be involved in other cellular processes.	0
418670	cl19841	Glyco_hydro_125	Metal-independent alpha-mannosidase (GH125). This family, which contains bacterial and fungal glycoside hydrolases, is also known as GH125. They function as metal-independent alpha-mannosidases, with specificity for alpha-1,6-linked non-reducing terminal mannose residues. Structurally this family is part of the 6 hairpin glycosidase superfamily.	0
418671	cl19842	DUF2213	Uncharacterized protein conserved in bacteria (DUF2213). Members of this family of bacterial proteins comprise various hypothetical and phage-related proteins. The exact function of these proteins has not, as yet, been determined.	0
418672	cl19843	DUF871	Bacterial protein of unknown function (DUF871). This family consists of several conserved hypothetical proteins from bacteria and archaea. The function of this family is unknown.	0
418673	cl19844	Metal_hydrol	Predicted metal-dependent hydrolase. Members of this family of proteins comprise various bacterial transition metal-dependent hydrolases.	0
418674	cl19845	NAGPA	Phosphodiester glycosidase. This is a family conserved from bacteria to humans. The structure of a member from Bacteroides has been crystallized and modelled onto the luminal region of the human member of the family, the transmembrane glycoprotein N-acetylglucosamine-1-phosphodiester alpha-N-acetylglucosaminidase. There is some conservation of potentially functional residues, implying that in the bacterial members this family acts in some way as a phosphodiester glycosidase. The human protein is also present, so the eukaryotic members are likely to be catalyzing the second step in the formation of the mannose 6-phosphate targeting signal on lysosomal enzyme oligosaccharides.	0
418675	cl19846	DGOK	2-keto-3-deoxy-galactonokinase. 2-keto-3-deoxy-galactonokinase EC:2.7.1.58 catalyzes the second step in D-galactonate degradation.	0
418676	cl19847	DUF1285	Protein of unknown function (DUF1285). This family consists of several hypothetical bacterial proteins of around 200 residues in length. The function of this family is unknown. The structures revealed a conserved core with domain duplication and a superficial similarity of the C-terminal domain to pleckstrin homology-like folds. The conservation of the domain-interface indicates a potential binding site that is likely to involve a nucleotide-based ligand, with genome-context and gene-fusion analyses additionally supporting a role for this family in signal transduction, possibly during oxidative stress.	0
418677	cl19849	DUF881	Bacterial protein of unknown function (DUF881). This family consists of a series of hypothetical bacterial proteins. One of the family members YlxW from Bacillus subtilis is thought to be involved in cell division and sporulation.	0
418678	cl19850	Virulence_RhuM	Virulence protein RhuM family. There are currently no experimental data for members of this group or their homologs. However, these proteins are implicated in virulence/pathogenicity because RhuM is encoded in the SPI-3 pathogenicity island in Salmonella typhimurium.	0
418679	cl19851	DUF1152	Protein of unknown function (DUF1152). This family consists of several hypothetical archaeal proteins of unknown function.	0
418680	cl19852	DUF2110	Uncharacterized protein conserved in archaea (DUF2110). This domain, found in various hypothetical archaeal proteins, has no known function.	0
418681	cl19853	DUF2117	Uncharacterized protein conserved in archaea (DUF2117). This domain, found in various hypothetical archaeal proteins, has no known function.	0
418682	cl19854	DUF1002	Protein of unknown function (DUF1002). This protein family has no known function. Its members are about 300 amino acids in length. It has so far been detected in Firmicute bacteria and some archaebacteria.	0
418683	cl19855	DUF1501	Protein of unknown function (DUF1501). This family contains a number of hypothetical bacterial proteins of unknown function approximately 400 residues long.	0
418684	cl19857	DUF2126	Putative amidoligase enzyme (DUF2126). Members of this family of bacterial domains are predominantly found in transglutaminase and transglutaminase-like proteins. Their exact function is, as yet, unknown, but they are likely to act as amidoligase enzymes. Proteins in this family are found in conserved gene neighborhoods encoding a glutamine amidotransferase-like thiol peptidase (in proteobacteria) or an Aig2 family cyclotransferase protein (in firmicutes).	0
418685	cl19858	DUF1015	Protein of unknown function (DUF1015). Family of proteins with unknown function found in archaea and bacteria.	0
418686	cl19860	DUF2252	Uncharacterized protein conserved in bacteria (DUF2252). This domain, found in various hypothetical bacterial proteins, has no known function.	0
418687	cl19861	zf-CHY	CHY zinc finger. This family of domains are likely to bind to zinc ions. They contain many conserved cysteine and histidine residues. We have named this domain after the N-terminal motif CXHY. This domain can be found in isolation in some proteins, but is also often associated with pfam00097. One of the proteins in this family is a mitochondrial intermembrane space protein called Hot13. This protein is involved in the assembly of small TIM complexes.	0
418688	cl19863	DUF935	Protein of unknown function (DUF935). This family consists of several bacterial proteins of unknown function as well as the Bacteriophage Mu gp29 protein.	0
418689	cl19864	Mu-like_Pro	Mu-like prophage I protein. Members of this family of proteins comprise various viral Mu-like prophage I proteins.	0
418690	cl19866	SrfB	Virulence factor SrfB. This family includes homologs of SrfB. SsrAB is a two-component regulatory system encoded within the Salmonella pathogenicity island SPI-2. Among the products of genes activated by SsrAB within epithelial and macrophage cells is Salmonella typhimurium srfB. Homologs are found in several other proteobacteria.	0
418691	cl19867	Virul_Fac	Putative bacterial virulence factor. Members of this family of prokaryotic proteins include various putative virulence factor effector proteins. Their exact function is, as yet, unknown.	0
418692	cl19868	DUF1054	Protein of unknown function (DUF1054). This family consists of several hypothetical bacterial proteins of unknown function.	0
418693	cl19870	DUF2135	Uncharacterized protein conserved in bacteria (DUF2135). This domain, found in various hypothetical prokaryotic proteins, has no known function.	0
303050	cl19871	DUF1646	Protein of unknown function (DUF1646). Some of the members of this family are hypothetical bacterial and archaeal proteins, but others are annotated as being cation transporters expressed by the archaebacterium Methanosarcina mazei.	0
418694	cl19872	DUF885	Bacterial protein of unknown function (DUF885). This family consists of several hypothetical bacterial proteins several of which are putative membrane proteins.	0
418695	cl19873	DUF2179	Uncharacterized protein conserved in bacteria (DUF2179). hypothetical protein; Provisional	0
418696	cl19874	DUF2183	Uncharacterized conserved protein (DUF2183). This domain, found in various hypothetical bacterial proteins, has no known function.	0
388683	cl19875	AbiEi_2	Transcriptional regulator, AbiEi antitoxin, Type IV TA system. AbiEi_2 is the cognate antitoxin of the type IV toxin-antitoxin 'innate immunity' bacterial abortive infection (Abi) system that protects bacteria from the spread of a phage infection. The Abi system is activated upon infection with phage to abort the cell thus preventing the spread of phage through viral replication. There are some 20 or more Abis, and they are predominantly plasmid-encoded lactococcal systems. TA, toxin-antitoxin, systems on plasmids function by killing cells that lose the plasmid upon division. AbiE phage resistance systems function as novel Type IV TAs and are widespread in bacteria and archaea. The cognate antitoxin is pfam13338.	0
388684	cl19876	DUF2192	Uncharacterized protein conserved in archaea (DUF2192). This domain, found in various hypothetical archaeal proteins, has no known function.	0
418697	cl19878	DUF2207	Predicted membrane protein (DUF2207). The majority of the proteins with a domain as described by this model have an extreme C-terminal sequence that consists of extremely low-complexity sequence, rich in Ser or in Gly interspersed with Cys. That C-terminal region resembles ribosomal natural product precursors, although there is no evidence that C-terminal regions of these proteins undergo any modification or have any such function.	0
418698	cl19879	PocR	Sensory domain found in PocR. PocR, a ligand binding domain, has a novel variant of the PAS-like Fold. Evidence suggests that it binds small hydrocarbon derivatives such as 1,3-propanediol. In (Natural history of sensor domains in bacterial signaling systems by Aravind L, LM Iyer, Anantharaman V, from 'Sensory Mechanisms in Bacteria: Molecular Aspects of Signal Recognition.' Caister Academic Press. 2010) - see (http://de.scribd.com/doc/28576661/Bacterial-Signaling-Chapter)	0
418699	cl19880	ROS_MUCR	ROS/MUCR transcriptional regulator protein. This family consists of several ROS/MUCR transcriptional regulator proteins. The ros chromosomal gene is present in octopine and nopaline strains of Agrobacterium tumefaciens as well as in Rhizobium meliloti. This gene encodes a 15.5-kDa protein that specifically represses the virC and virD operons in the virulence region of the Ti plasmid and is necessary for succinoglycan production. Sinorhizobium meliloti can produce two types of acidic exopolysaccharides, succinoglycan and galactoglucan, that are interchangeable for infection of alfalfa nodules. MucR from Sinorhizobium meliloti acts as a transcriptional repressor that blocks the expression of the exp genes responsible for galactoglucan production therefore allowing the exclusive production of succinoglycan.	0
418700	cl19881	YEATS	YEATS family. We have named this family the YEATS family, after `YNK7', `ENL', `AF-9', and `TFIIF small subunit'. This family also contains the GAS41 protein. All these proteins are thought to have a transcription stimulatory activity	0
418701	cl19882	TB2_DP1_HVA22	TB2/DP1, HVA22 family. This family includes members from a wide variety of eukaryotes. It includes the TB2/DP1 (deleted in polyposis) protein, which in humans is deleted in severe forms of familial adenomatous polyposis, an autosomal dominant oncological inherited disease. The family also includes the plant protein of known similarity to TB2/DP1, the HVA22 abscisic acid-induced protein, which is thought to be a regulatory protein.	0
418702	cl19883	ERO1	Endoplasmic Reticulum Oxidoreductin 1 (ERO1). Members of this family are required for the formation of disulphide bonds in the ER.	0
418703	cl19885	Sec6	Exocyst complex component Sec6. Sec6 is a component of the multiprotein exocyst complex. Sec6 interacts with Sec8, Sec10 and Exo70. These exocyst proteins localize to regions of active exocytosis (at the growing ends of interphase cells and in the medial region of cells undergoing cytokinesis) in an F-actin-dependent and exocytosis-independent manner.	0
418704	cl19886	Cnd2	Condensin complex subunit 2. This family consists of several Barren protein homologs from several eukaryotic organisms. In Drosophila Barren (barr) is required for sister-chromatid segregation in mitosis. barr encodes a novel protein that is present in proliferating cells and has homologs in yeast and human. Mitotic defects in barr embryos become apparent during cycle 16, resulting in a loss of PNS and CNS neurons. Centromeres move apart at the metaphase-anaphase transition and Cyclin B is degraded, but sister chromatids remain connected, resulting in chromatin bridging. Barren protein localizes to chromatin throughout mitosis. Colocalization and biochemical experiments indicate that Barren associates with Topoisomerase II throughout mitosis and alters the activity of Topoisomerase II. It has been suggested that this association is required for proper chromosomal segregation by facilitating the decatenation of chromatids at anaphase. This family forms one of the three non-structural maintenance of chromosomes (SMC) subunits of the mitotic condensation complex along with Cnd1 and Cnd3.	0
418705	cl19887	TFCD_C	Tubulin folding cofactor D C terminal. This domain family is found in eukaryotes, and is typically between 182 and 199 amino acids in length. The family is found in association with pfam02985. There is a single completely conserved residue R that may be functionally important. Tubulin folding cofactor D does not co-polymerize with microtubules either in vivo or in vitro, but instead modulates microtubule dynamics by sequestering beta-tubulin from GTP-bound alphabeta-heterodimers in microtubules.	0
418706	cl19888	2H-phosphodiest	Domain of unknown function (DUF1868). This group of 2H-phosphodiesterases comprises a single family typified by the protein mlr3352 from M.loti. Members are also present in various alpha-proteobacteria, Synechocystis, Streptococcus and Chilo iridescent virus. The presence of a member of this predominantly bacterial group in a large eukaryotic DNA virus represents a potential case of horizontal transfer from a bacterial source into a virus. Several proteins of bacterial origin have been noticed in the insect viruses (L.M.Iyer, E.V.Koonin and L.Aravind, unpublished observations) and these appear to have been acquired from endo-symbiotic or parasitic bacteria that share the same host cells with the viruses. Presence of 2H proteins in the proteomes of large DNA viruses (e.g. T4 57B protein and the Fowl-pox virus FPV025) may point to some role for these proteins in regulating the viral tRNA metabolism. Each member of this family contains an internal duplication, each of which contains an HXTX motif that defines the family.	0
418707	cl19890	DHHC	DHHC palmitoyltransferase. This entry refers to the DHHC domain, found in DHHC proteins which are palmitoyltransferases. Palmitoylation or, more specifically S-acylation, plays important roles in the regulation of protein localization, stability, and activity. It is a post-translational protein modification that involves the attachment of palmitic acid to Cys residues through a thioester linkage. Protein acyltransferases (PATs), also known as palmitoyltransferases, catalyze this reaction by transferring the palmitoyl group from palmitoyl-CoA to the thiol group of Cys residues. They are characterized by the presence of a 50-residue-long domain called the DHHC domain, which in most but not all cases is also cysteine-rich and gets its name from a highly conserved DHHC signature tetrapeptide (Asp-His-His-Cys). The Cys residue within the DHHC domain forms a stable acyl intermediate and transfers the acyl chain to the Cys residues of a target protein. Some proteins containing a DHHC domain include Drosophila DNZ1 protein, Mouse Abl-philin 2 (Aph2) protein, Mammalian ZDHHC9, Yeast ankyrin repeat-containing protein AKR1, Yeast Erf2 protein, and Arabidopsis thaliana tip growth defective 1.	0
418708	cl19894	Pilus_CpaD	Pilus biogenesis CpaD protein (pilus_cpaD). This family consists of a pilus biogenesis protein, CpaD, from Caulobacter, and homologs in other bacteria, including three in the root nodule bacterium Bradyrhizobium japonicum. The molecular function is not known. [Cell envelope, Surface structures]	0
418709	cl19895	DUF2182	Predicted metal-binding integral membrane protein (DUF2182). This domain, found in various hypothetical bacterial membrane proteins having predicted metal-binding properties, has no known function.	0
418710	cl19897	DUF2428	Putative death-receptor fusion protein (DUF2428). This is a family of proteins conserved from plants to humans. The function is not known. Several members have been annotated as being HEAT repeat-containing proteins while others are designated as death-receptor interacting proteins, but neither of these could be confirmed.	0
418711	cl19898	Noc2	Noc2p family. At least one member, Noc2p from yeast, is required for a late step in 60S subunit export from the nucleus. It has also been shown to co-precipitate with Nug1p, a nuclear GTPase also required for ribosome nucleus export. This family was formerly known as UPF0120.	0
418712	cl19900	HECT_2	HECT-like Ubiquitin-conjugating enzyme (E2)-binding. HECT_2 is a family of UbcH10-binding proteins.	0
418713	cl19902	protein_MS5	Protein MS5. This model describes a paralogous family of hypothetical proteins in Arabidopsis thaliana. No homologs are detected from other species. Length heterogeneity within the family is attributable partly to a 21-residue repeat present in from zero to three tandem copies. The central region of the repeat resembles the pattern [VIF][FY][QK]GX[LM]P[DEK]XXXDDAL.	0
418714	cl19905	TrwC	TrwC relaxase. This domain is in the N-terminal (relaxase) region of TrwC, a relaxase-helicase that acts in plasmid R388 conjugation. The relaxase domain has DNA cleavage and strand transfer activities. Plasmid transfer protein TraI is also a member of this domain family. Members of this family on bacterial chromosomes typically are found near other genes typical of conjugative plasmids and appear to mark integrated plasmids. [Mobile and extrachromosomal element functions, Plasmid functions]	0
418715	cl19906	Spore_GerAC	Spore germination B3/ GerAC like, C-terminal. Members of this protein family are restricted to endospore-forming members of the Firmicutes lineage of bacteria, including the genera Bacillus, Clostridium, Thermoanaerobacter, Carboxydothermus, etc. Members are nearly all predicted lipoproteins and belong to probable transport operons, some of which have been characterized as crucial to germination in response to alanine. Members typically have gene symbols gerKC, gerAC, gerYC, etc. [Transport and binding proteins, Amino acids, peptides and amines, Cellular processes, Sporulation and germination]	0
418716	cl19907	ImpA_N	ImpA, N-terminal, type VI secretion system. This protein family is one of two related families in type VI secretion systems that contain an ImpA-related N-terminal domain (pfam06812).	0
388705	cl19908	alpha-hel2	Alpha-helical domain 2. A novel genetic system characterized by seven (usually) major proteins, including a ParB homolog and a ThiF homolog, is commonly found on plasmids or in bacterial chromosomal regions near phage, plasmid, or transposon markers. It is most common among the beta Proteobacteria. We designate the system PRTRC, or ParB-Related, ThiF-Related Cassette. This protein family is designated protein F. It is the most divergent of the families.	0
418717	cl19911	CBM_4_9	Carbohydrate binding domain. This family represents a duplicated conserved region found in a number of uncharacterized plant proteins, potentially in the stem. There is a conserved CGP sequence motif.	0
418718	cl19912	DNA_pol3_delta	DNA polymerase III, delta subunit. hypothetical protein; Provisional	0
418719	cl19913	Peptidase_U49	Peptidase U49. phage exclusion protein Lit; Provisional	0
418720	cl19916	ISG65-75	Invariant surface glycoprotein. 65 kDa invariant surface glycoprotein; Provisional	0
418721	cl19922	FAD_binding_4	FAD binding domain. This family consists of various enzymes that use FAD as a co-factor, most of the enzymes are similar to oxygen oxidoreductase. One of the enzymes Vanillyl-alcohol oxidase (VAO) has a solved structure, the alignment includes the FAD binding site, called the PP-loop, between residues 99-110. The FAD molecule is covalently bound in the known structure, however the residue that links to the FAD is not in the alignment. VAO catalyzes the oxidation of a wide variety of substrates, ranging from aromatic amines to 4-alkylphenols. Other members of this family include D-lactate dehydrogenase, this enzyme catalyzes the conversion of D-lactate to pyruvate using FAD as a co-factor; mitomycin radical oxidase, this enzyme oxidizes the reduced form of mitomycins and is involved in mitomycin resistance. This family includes MurB, a UDP-N-acetylenolpyruvoylglucosamine reductase enzyme EC:1.1.1.158. This enzyme is involved in the biosynthesis of peptidoglycan.	0
418724	cl19929	A2M_N	MG2 domain. This family includes a region of the alpha-2-macroglobulin family.	0
418725	cl19932	OTU	OTU-like cysteine protease. This family of proteins conserved from plants to humans is a highly specific ubiquitin iso-peptidase that removes ubiquitin from proteins. The modification of cellular proteins by ubiquitin (Ub) is an important event that underlies protein stability and function in eukaryote being a dynamic and reversible process. Otubain carries several key conserved domains: (i) the OTU (ovarian tumor domain) in which there is an active cysteine protease triad (ii) a nuclear localization signal, (iii) a Ub interaction motif (UIM)-like motif phi-xx-A-xxxs-xx-Ac (where phi indicates an aromatic amino acid, x indicates any amino acid and Ac indicates an acidic amino acid), (iv) a Ub-associated (UBA)-like domain and (v) the LxxLL motif.	0
418726	cl19935	Gp_dh_C	Glyceraldehyde 3-phosphate dehydrogenase, C-terminal domain. This Pfam entry contains the following members: N-acetyl-glutamine semialdehyde dehydrogenase (AgrC) Aspartate-semialdehyde dehydrogenase.	0
418729	cl19950	Sec3_C	Exocyst complex component Sec3. Vps52 complexes with Vps53 and Vps54 to form a multi-subunit complex involved in regulating membrane trafficking events.	0
418730	cl19952	Gly_transf_sug	Glycosyltransferase sugar-binding region containing DXD motif. This domain represents the N-terminal glycosyltransferase from a set of toxins found in some bacteria. This domain in TcdB glycosylates the host RhoA protein.	0
418731	cl19976	7tm_7	7tm Chemosensory receptor. In Drosophila, taste is perceived by gustatory neurons located in sensilla distributed on several different appendages throughout the body of the animal. This family represents the taste receptor sensitive to trehalose.	0
418746	cl20010	Consortin_C	Consortin C-terminus. This family of proteins is found in eukaryotes. Proteins in this family are typically between 129 and 161 amino acids in length.	0
418788	cl20168	Kazal_3	Kazal-type serine protease inhibitor domain. Kazal domain found in factor I-like modules (FIMs) region on the carboxyl-terminal of complement component C7 proteins. Complement component C7 is a subunit of the membrane attack complex (MAC), a fundamental machinery in the mammalian innate immunity. KAZAL domains are common in serine protease inhibitors.	0
418789	cl20183	DUF4810	Domain of unknown function (DUF4810). This family of proteins is found in bacteria. Proteins in this family are typically between 117 and 134 amino acids in length. There is a conserved PES sequence motif. It is a putative lipoprotein.	0
418790	cl20192	DUF4915	Domain of unknown function (DUF4915). This protein family is uncharacterized. A number of motifs are conserved perfectly among all member sequences. The function of this protein is unknown. [Hypothetical proteins, Conserved]	0
418792	cl20210	LTD	Lamin Tail Domain. This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 392 and 433 amino acids in length. There is a conserved NNS sequence motif.	0
418796	cl20221	NSP3_rotavirus	rotavirus non-structural protein 3 (NSP3). This family consists of rotaviral non-structural RNA binding protein 34 (NS34 or NSP3). The NSP3 protein has been shown to bind viral RNA. The NSP3 protein consists of 3 conserved functional domains; a basic region which binds ssRNA, a region containing heptapeptide repeats mediating oligomerization and a leucine zipper motif. NSP3 may play a central role in replication and assembly of genomic RNA structures. Rotaviruses have a dsRNA genome and are a major cause of acute gastroenteritis in the young of many species. The rotavirus non-structural protein NSP3 is a sequence-specific RNA binding protein that binds the nonpolyadenylated 3' end of the rotavirus mRNAs. NSP3 also interacts with the translation initiation factor eIF4GI and competes with the poly(A) binding protein.	0
418798	cl20224	zf-MYND	MYND finger. zf-C6H2 is an unusual zinc-finger similar to zf-MYND, pfam01753. This zinc-finger is found at the N-terminus of Pfam families Exo_endo_phos pfam03372 and Peptidase_M24 pfam00557. The domain is missing in prokaryotic methionine aminopeptidases, and is a unique type of zinc-finger domain. It consists of a C2-C2 zinc-finger motif similar to the RING finger family followed by a C2H2 motif similar to zinc-fingers involved in RNA-binding. In yeast the domain chelates zinc in a 2:1 ratio. The domain is found in yeast, plants and mammals. The domain is necessary for the association of the methionine aminopeptidase with the ribosome and the normal processing of the peptidase.	0
418800	cl20226	TIL	trypsin inhibitor-like cysteine rich domain. This family contains trypsin inhibitors as well as a domain found in many extracellular proteins. The domain typically contains ten cysteine residues that form five disulphide bonds. The cysteine residues that form the disulphide bonds are 1-7, 2-6, 3-5, 4-10 and 8-9.	0
418844	cl20325	UPF0236	Uncharacterized protein family (UPF0236). 	0
418849	cl20331	SseB	SseB protein N-terminal domain. Members of this family occur almost exclusively in the genus Streptomyces, in the context of type VII secretion systems (T7SS). Several paralogs may accompany a single T7SS. A few members of this family are large proteins with additional domains that add or remove ADP-ribosylations, suggesting that all family members may have effector activity as well, and that the longer members of the family are multifunctional effector proteins.	0
418856	cl20343	Glyco_hydro_38C	Glycosyl hydrolases family 38 C-terminal domain. This family consists of Glycosyl hydrolase family 38 proteins around 700 residues in length and is mainly found in various Clostridium and Rhizobium species. The function of this family is unknown.	0
418913	cl20439	RNase_Zc3h12a	Zc3h12a-like Ribonuclease NYN domain. PRORPs (protein-only RNase P) are a class of RNA processing enzymes that catalyze maturation of the 5' end of precursor tRNAs in Eukaryotes. Arabidopsis thaliana contains PRORP enzymes (PRORP1, PRORP2 and PRORP3) where PRORP1 localizes to mitochondria as well as chloroplasts, while PRORP2 and PRORP3 are found in the nucleus. In humans and most other metazoans, mt-RNase P is composed of three protein subunits (mitochondrial RNase P proteins 1-3; MRPP1-3), homologs to the Arabidopsis thaliana PRORP1-3. This domain corresponds to the metallonuclease domain of PRORPs. PRORP1 has 22% sequence identity to the human homolog MRPP3. PRORP1 crystal structure shows a V-shaped tripartite structure with a C-terminal metallonuclease domain of the NYN (N4BL1, YacP-like nuclease) family, with a typical and functional two-metal-ion catalytic site that has conserved aspartate residues.	0
418935	cl20473	ANAPC5	Anaphase-promoting complex subunit 5. Apc5 is a subunit of the anaphase-promoting complex/cyclosome (APC/C) which is a multi-subunit ubiquitin ligase that mediates the proteolysis of cell cycle proteins in mitosis and G1. Apc5 binds the poly(A) binding protein (PABP), which directly binds the internal ribosome entry site (IRES) of growth factor 2 mRNA. PABP was found to enhance IRES-mediated translation, whereas Apc5 over-expression counteracted this effect. In addition to its association with the APC/C complex, Apc5 binds much heavier complexes and co-sediments with the ribosomal fraction. The N-terminus of Afi1 serves to stabilize the union between Apc4 and Apc5, both of which lie towards the bottom-front of the APC. This model represents the Tetratricopeptide repeat (TPR)-like motif region of Apc5.	0
418984	cl20541	EFG_III-like	Domain III of Elongation factor G (EF-G) and related proteins. This domain is found in Elongation Factor G. It shares a similar structure with domain V (pfam00679).	0
418994	cl20555	TetR_C_29	Tetracyclin repressor-like, C-terminal domain. This family comprises proteins that belong to the TetR family of transcriptional regulators. This family features the C-terminal region of these sequences, which does not include the N-terminal helix-turn-helix.	0
419068	cl20644	DUF4976	Domain of unknown function (DUF4976). This family around 100 residues locates in the C-terminal of some uncharacterized proteins in various Bacteroides and Prevotella species. The function of this family remains unknown.	0
419202	cl20817	GBP_C	Guanylate-binding protein, C-terminal domain. IFT20 is subunit 20 of the intraflagellar transport complex B. The intraflagellar transport complex assembles and maintains eukaryotic cilia and flagella. IFT20 is localized to the Golgi complex and is anchored there by the Golgi polypeptide, GMAP210, whereas all other subunits except IFT172 localize to cilia and the peri-basal body or centrosomal region at the base of cilia. IFT20 accompanies Golgi-derived vesicles to the point of exocytosis near the basal bodies where the other IFT polypeptides are present, and where the intact IFT particle is assembled in association with the inner surface of the cell membrane. Passage of the IFT complex then follows, through the flagellar pore recognition site at the transition region, into the ciliary compartment. There also appears to be a role of intraflagellar transport (IFT) polypeptides in the formation of the immune synapse in non ciliated cells. The flagellum, in addition to being a sensory and motile organelle, is also a secretory organelle. A number of IFT components are expressed in haematopoietic cells, which have no cilia, indicating an unexpected role of IFT proteins in immune synapse-assembly and intracellular membrane trafficking in T lymphocytes; this suggests that the immune synapse could represent the functional homolog of the primary cilium in these cells.	0
419207	cl20823	DCAF15-NTD	N-terminal domain of DDB1- and CUL4-associated factor 15. DCAFs, Ddb1- and Cul4-associated factors, are substrate receptors for the Cul4-Ddb1 Ubiquitin Ligase. There are 18 different factors, the majority of which are WD40-repeat-proteins.	0
389307	cl20914	humanin	humanin and similar peptides. This family of proteins is found exclusively in humans. Humanin is a short anti-apoptotic peptide that interacts with Bax.	0
419594	cl21329	nt01cx_1156_like	Uncharacterized proteins conserved in Clostridia. This family of uncharacterized proteins from Clostridia and Bacilli classes has an unusual structure of three beta propeller repeats that do not form a barrel, as in well known 6-, 7- etc beta propeller barrels, but instead are stacked in a three-layer beta-sheet sandwich. The function of all the proteins from this family is unknown.	0
419595	cl21330	CdiA-CT_Ecl_RNase-like	C-terminal (CT) domain of the contact-dependent growth inhibition (CDI) system (CdiA-CT) protein CdiA of Enterobacter cloacae, and similar proteins. Bacterial genomes and plasmids encode a variety of peptide and protein toxins that mediate inter-bacterial competition. Bacteriocins are diffusible proteins that parasitize cell-envelope proteins to enter and kill bacteria. Contact-dependent growth inhibition (CDI) is one mechanism of inter-bacterial competition. Novel Toxin 21 (alternatively 16S rRNA endonuclease CdiA) belongs to a family of prokaryotic polymorphic toxin systems implicated in intra-specific conflicts. This RNase toxin found in bacterial polymorphic toxin systems, is proposed to adopt the BECR (Barnase-EndoU-ColicinE5/D-RelE) fold, with two conserved lysine residues and [DS]xDxxxH, RxG[ST] and RxxD motifs. In bacterial polymorphic toxin systems, the toxin is usually exported by the type 2, type 4, type 5 or type 7 secretion systems. This is also referred to as the E. cloacae CdiAC. The CdiAC proteins carry a variety of sequence-diverse C-terminal domains, which represent a collection of distinct toxins. Many CdiA-CT toxins have nuclease activities. In accord with the structural homology, CdiA-CT cleaves 16S rRNA at the same site as colicin E3 and this nuclease activity is responsible for growth inhibition.	0
419639	cl21406	CdiA-CT_Ec_tRNase	C-terminal (CT) domain of the contact-dependent growth inhibition (CDI) system protein CdiA (CdiA-CT) of Escherichia coli 563, and similar proteins. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses an all alpha-helical fold and conserved aspartate and glutamate residues, and K[DE] and [DN]HxxE motifs. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 5 or type 7 secretion system.	0
419665	cl21453	PKc_like	Protein Kinases, catalytic domain. The KIND domain (kinase non-catalytic C-lobe domain) evolved from a catalytic protein kinase fold and functions as an interaction domain. In SPIRE1 (protein spire homolog 1) this domain interacts with FMN2 (formin-2).	0
419666	cl21454	NADB_Rossmann	Rossmann-fold NAD(P)(+)-binding proteins. This entry is the rossmann domain found in the Xanthine dehydrogenase accessory protein.	0
419667	cl21456	Periplasmic_Binding_Protein_Type_2	Type 2 periplasmic binding fold superfamily. This domain is often found in association with the helix-turn-helix domain HTH_41 (pfam14502). It includes YhfZ proteins from Escherichia coli and Shigella flexneri.	0
419668	cl21457	TIM	TIM-like beta/alpha barrel domains. This domain includes the enzyme Phosphoenolpyruvate phosphomutase (EC:5.4.2.9). This protein has been characterized as catalyzing the formation of a carbon-phosphorus bond by converting phosphoenolpyruvate (PEP) to phosphonopyruvate (P-Pyr). This enzyme has a TIM barrel fold.	0
419669	cl21459	HTH	Helix-turn-helix domains. This winged helix-turn-helix domain contains an extended C-terminal alpha helix which is responsible for dimerization of this domain.	0
419670	cl21460	HAD_like	Haloacid Dehalogenase-like Hydrolases. This family is part of the HAD superfamily.	0
419671	cl21461	Globin-like	Globin-like protein superfamily. This family includes protoglobin from Methanosarcina acetivorans C2A. It is also found near the N-terminus of the Haem-based aerotactic transducer HemAT in Bacillus subtilis. It is part of the haemoglobin superfamily. Protoglobin has specific loops and an amino-terminal extension which leads to the burying of the haem within the matrix of the protein. Protoglobin-specific apolar tunnels allow the access of O2, CO and NO to the haem distal site. In HemAT it acts as an oxygen sensor domain.	0
419672	cl21462	bZIP	Basic leucine zipper (bZIP) domain of bZIP transcription factors: a DNA-binding and dimerization domain. This domain is found at the C-terminus of ABC transporters. It has a coiled coil structure with an atypical 3(10)-helix in the alpha-hairpin region. It is involved in DNA_binding.	0
419673	cl21463	UBA_like_SF	UBA domain-like superfamily. EDD, the ER ubiquitin ligase from the HECT ligases, contains an N-terminal ubiquitin-associated domain which binds ubiquitin. Ubiquitin is recognized by helices alpha-1 and -3 in the UBA domain. EDD is involved in DNA damage repair pathways and binds to mono-ubiquitinated proteins.	0
419674	cl21467	Cytochrom_C	Cytochrome c. This domain is a heme binding cytochrome known as cytochrome c550, or cytochrome c549, or PsbV.	0
419675	cl21469	HDc	N/A. HD domains are metal dependent phosphohydrolases.	0
419676	cl21470	Peptidase_M14NE-CP-C_like	Peptidase associated domain: C-terminal domain of M14 N/E carboxypeptidase; putative folding, regulation, or interaction domain. This is the N-terminal of Calcineurin-like phosphoesterases. It is around 150 residues in length from various Bacteroides species. The function of this family is unknown.	0
419677	cl21471	RAMP_I_III	CRISPR/Cas system-associated RAMP superfamily protein. CRISPR is a term for Clustered Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR associated) proteins. This highly divergent family, found in at least ten different archaeal and bacterial species, is represented by TM1793 from Thermotoga maritima.	0
419678	cl21473	ArsB_NhaD_permease	N/A. CitMHS is a family of putative citrate transporters, belonging to the Na+/H+ antiporter NhaD-like permease superfamily.	0
419679	cl21474	ABC2_membrane	ABC-2 type transporter. This is the N-terminal region of 7tm proteins. The function is not known.	0
419680	cl21478	ATP-synt_B	ATP synthase B/B' CF(0). This family corresponds to subunit 8 (YMF19) of the F0 complex of plant and algae mitochondrial F-ATPases (EC:3.6.1.34).	0
419681	cl21479	Cas5_I	CRISPR/Cas system-associated RAMP superfamily protein Cas5. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This small Cas family is represented by CT1134 of Chlorobium tepidum.	0
419682	cl21481	malate_synt	N/A. This family of TIM-Barrel fold C-C bond lyase is related to citrate-lyase. These genes are found in the biosynthetic operon, with other enzymatic domains, associated with the Ter stress response operon and are predicted to be involved in the biosynthesis of a ribo-nucleoside involved in stress response.	0
419683	cl21482	RuvC_like	Crossover junction endodeoxyribonuclease RuvC and similar proteins. This is the YqgF-like domain of the bacterial Tex protein, which is involved in transcriptional processes.	0
419684	cl21484	Oxidored_q3	NADH-ubiquinone/plastoquinone oxidoreductase chain 6. NADH dehydrogenase subunit 6; Provisional	0
419685	cl21486	Ketoacyl-synt_C	Beta-ketoacyl synthase, C-terminal domain. This domain is found on 3-Oxoacyl-[acyl-carrier-protein (ACP)] synthase III EC:2.3.1.41, the enzyme responsible for initiating the chain of reactions of the fatty acid synthase in plants and bacteria.	0
419686	cl21487	OM_channels	N/A. This family includes proteins annotated as TonB dependent receptors. But it is also likely to contain other membrane beta barrel proteins of other functions.	0
419687	cl21488	ECF_trnsprt	ECF transporter, substrate-specific component. Members of this protein family have been assigned as thiamine transporters by a phylogenetic analysis of families of genes regulated by the THI element, a broadly conserved RNA secondary structure element through which thiamine pyrophosphate (TPP) levels can regulate transcription of many genes related to thiamine transport, salvage, and de novo biosynthesis. Species with this protein always lack the ThiBPQ ABC transporter. In some species (e.g. Streptococcus mutans and Streptococcus pyogenes), yuaJ is the only THI-regulated gene. Evidence from Bacillus cereus indicates thiamine uptake is coupled to proton translocation.	0
419688	cl21491	Transpeptidase	Penicillin binding protein transpeptidase domain. This family is closely related to Beta-lactamase, pfam00144, the serine beta-lactamase-like superfamily, which contains the distantly related pfam00905 and PF00768 D-alanyl-D-alanine carboxypeptidase.	0
419689	cl21492	PTS_EIIC	Phosphotransferase system, EIIC. The bacterial phosphoenolpyruvate: sugar phosphotransferase system (PTS) is a multi-protein system involved in the regulation of a variety of metabolic and transcriptional processes. The sugar-specific permease of the PTS consists of three domains (IIA, IIB and IIC). The IIC domain catalyzes the transfer of a phosphoryl group from IIB to the sugar substrate.	0
419690	cl21493	Complex1_49kDa	Respiratory-chain NADH dehydrogenase, 49 Kd subunit. This model represents that clade of F420-dependent hydrogenases (FRH) beta subunits found exclusively and universally in methanogenic archaea. This protein is a member of the Nickel-dependent hydrogenase superfamily represented by Pfam model, pfam00374.	0
419691	cl21494	Abhydrolase	alpha/beta hydrolases. This family consists of several chlorophyllase and chlorophyllase-2 (EC:3.1.1.14) enzymes. Chlorophyllase (Chlase) is the first enzyme involved in chlorophyll (Chl) degradation and catalyzes the hydrolysis of an ester bond to yield chlorophyllide and phytol. The family includes both plant and Amphioxus members.	0
419692	cl21495	Acyl_transf_3	Acyltransferase family. This domain is found in various hypothetical and OpgC prokaryotic proteins. It is likely to act as an acyltransferase enzyme.	0
419693	cl21496	2OG-FeII_Oxy	2OG-Fe(II) oxygenase superfamily. This family has structural similarity to the 2OG-Fe(II) oxygenase superfamily.	0
419694	cl21497	PAAR_like	proline-alanine-alanine-arginine (PAAR) repeat superfamily. This motif is found usually in pairs in a family of bacterial membrane proteins. It is also found as a triplet of tandem repeats comprising the entire length in another family of hypothetical proteins.	0
419695	cl21498	SANT	N/A. This domain, approximately 90 residues, is mainly found in DNA methyltransferase 1-associated protein 1 (DAMP1) that plays an important role in development and maintenance of genome integrity in various mammalian species. It mainly consists of tandem repeats of three alpha-helices that are arranged in a helix-turn-helix motif and shows a structural similarity with SANT domain and Myb DNA-binding domain, indicating it contains a putative DNA binding site.	0
354841	cl21499	SPX	Domain found in Syg1, Pho81, XPR1, and related proteins. This region has been named the SPX domain after (Syg1, Pho81 and XPR1). The domain is found at the amino terminus of a variety of proteins. The N-termini of several proteins involved in the regulation of phosphate transport, including the putative phosphate level sensors Pho81 from Saccharomyces cerevisiae and NUC-2 from Neurospora crassa, are also members of this family. The yeast protein Gde1/Ypl110c is similar to both, NUC-2 and Pho81, in sharing their multi-domain architecture, which includes the SPX N-terminal domain followed by several ankyrin repeats and a C-terminal glycerophosphodiester phosphodiesterase domain (GDPD). Gde1 hydrolyzes intracellular glycerophosphocholine into glycerolphosphate and choline, and plays a role in the utilization of glycerophosphocholine as a source for phosphate.	0
419696	cl21502	CTP_transf_1	Cytidylyltransferase family. CDP-archaeol synthase functions in the archaeal lipid biosynthetic pathway. It catalyzes the transfer of the nucleotide to its specific archaeal lipid substrate, leading to the formation of a CDP-activated precursor (CDP-archaeol) to which polar head groups are attached. Bacterial members of this family are uncharacterized.	0
419697	cl21503	ParE_toxin	ParE toxin of type II toxin-antitoxin system, parDE. YafQ is a family of bacterial toxin ribonucleases of type II toxin-antitoxin systems. The E.coli gene is expressed from the dinB operon. The cognate antitoxin for the E. coli protein is DinJ, in family RelB_antitoxin, pfam02604.	0
419698	cl21504	EGF_CA	N/A. This short domain on coagulation enzyme factor Xa is found to be the target for a potent inhibitor of coagulation, TAK-442.	0
419699	cl21506	DinB_2	DinB superfamily. This domain is found in MSMEG_5817 gene product from M. smegmatis. It has been shown to be vital for mycobacterial survival within host macrophages. Crystal structure revealed a Rossmann-like fold alpha/beta two-layer sandwich forming a highly hydrophobic interface cavity and with high structural homology to the SCP family. Hence, it has been suggested that this domain may be involved in the interaction of apolar ligands through its hydrophobic cavity. Alanine-scanning mutagenesis of the hydrophobic cavity of MSMEG_5817 protein demonstrated that the conserved Val82 residue plays an important role in ligand binding.	0
419700	cl21508	Ribosomal_P1_P2_L12p	N/A. This family includes archaebacterial L12, eukaryotic P0, P1 and P2.	0
389780	cl21509	ApoLp-III_like	Apolipophorin-III and similar insect proteins. This family consists of several insect apolipoprotein-III sequences. Exchangeable apolipoproteins constitute a functionally important family of proteins that play critical roles in lipid transport and lipoprotein metabolism. Apolipophorin III (apoLp-III) is a prototypical exchangeable apolipoprotein found in many insect species that functions in transport of diacylglycerol (DAG) from the fat body lipid storage depot to flight muscles in the adult life stage.	0
419701	cl21511	PEMT	Phospholipid methyltransferase. This family contains a number of bacterial and eukaryotic proteins of unknown function that are approximately 300 residues long.	0
419702	cl21513	NrfD	Polysulphide reductase, NrfD. The terminal electron transfer enzyme Me2SO reductase of Escherichia coli is a heterotrimeric enzyme composed of a membrane extrinsic catalytic dimer (DmsAB) and a membrane intrinsic polytopic anchor subunit (DmsC).	0
419703	cl21514	TauE	Sulfite exporter TauE/SafE. High affinity nickel transporters involved in the incorporation of nickel into H2-uptake hydrogenase and urease enzymes. Essential for the expression of catalytically active hydrogenase and urease. Ion uptake is dependent on proton motive force. HoxN in Alcaligenes eutrophus is thought to be an integral membrane protein with seven transmembrane helices. The family also includes a cobalt transporter.	0
419704	cl21515	GAF	GAF domain. SpoVT_C is the C-terminal part of the stage V sporulation protein T, a transcription factor involved in endospore formation in Gram-positive bacteria such as Bacillus subtilis. Sporulation is induced by conditions of environmental stress to protect the genome. SpoVT behaves as a tetramer that shows an overall significant distortion mediated by electrostatic interactions. Two monomers dimerize via the highly charged N-terminal AbrB-like domains, family pfam04014, to form swapped-hairpin beta-barrels. These asymmetric dimers then form tetramers through the formation of mixed helix bundles between their C-terminal domains. The C-termini themselves fold as GAF (cGMP-specific and cGMP-stimulated phosphodiesterases, Anabaena adenylate cyclases, and Escherichia coli FhlA) domains.	0
419705	cl21516	Csx1_III-U	CRISPR/Cas system-associated protein Csx1. CRISPR is a term for Clustered Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR associated) proteins. The family describes Cas proteins of about 400 residues that include the motif [VIL]-D-x-[ST]-H-[GS]. The CRISPR and associated proteins are thought to be involved in the evolution of host resistance. The exact molecular function of this family is currently unknown.	0
419706	cl21519	Cas8a1_I-A	CRISPR/Cas system-associated protein Cas8a1. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This entry describes a conserved region of about 65 amino acids from an otherwise highly divergent protein found in a minority of CRISPR-associated protein regions. This region features two motifs of CXXC.	0
419707	cl21521	PEPcase	Phosphoenolpyruvate carboxylase. This family of phosphoenolpyruvate carboxylases is based on sequences not picked up by the model for PEPcase, PF00311. Most of the family members are from Archaea.	0
419708	cl21522	FN3	N/A. This fibronectin type III domain is found in fungal chitin biosynthesis protein CHS5 where, together with the neighboring BRCT domain (pfam00533), it binds to the Arf1 GTPase.	0
276343	cl21524	PRK13923	N/A. In a subset of endospore-forming members of the Firmicutes, members of this protein family are found, several to a genome. Two very strongly conserved sequence regions are separated by a highly variable linker region. Much of the linker region was excised from the seed alignment for this model. A characterized member is the prespore-specific transcription RsfA from Bacillus subtilis, previously called YwfN, which is controlled by sigma factor F and seems to fine-tune expression of some genes in the sigma-F regulon. A paralog in Bacillus subtilis is designated YlbO. [Regulatory functions, DNA interactions, Cellular processes, Sporulation and germination]	0
419709	cl21525	LysM	Lysin Motif is a small domain involved in binding peptidoglycan. The LysM (lysin motif) domain is about 40 residues long. It is found in a variety of enzymes involved in bacterial cell wall degradation. This domain may have a general peptidoglycan binding function. The structure of this domain is known.	0
419710	cl21526	TolB_N	TolB amino-terminal domain. This is a family of Gram-negative bacterial outer membrane lipoproteins. LpoB is required for the function of the major peptidoglycan synthase enzyme PBP1B. It interacts with PBP1B protein via the UvrB-like non-catalytic domain on that protein. LpoB has a 54-aa-long flexible N-terminal stretch followed by a globular domain with similarity to the N-terminal domain of the prevalent periplasmic protein TolB. The long, flexible N-terminal region of LpoB enables it to span the periplasm and reach its docking site in PBP1B. Peptidoglycan is the essential polymer within the sacculus that surrounds the cytoplasmic membrane of bacteria.	0
419711	cl21527	DoxX	DoxX. This family of uncharacterized proteins are related to DoxX pfam07681.	0
419712	cl21528	Lipocalin	Lipocalin / cytosolic fatty-acid binding protein family. Lipocalins are transporters for small hydrophobic molecules, such as lipids, steroid hormones, bilins, and retinoids. The family also encompasses the enzyme prostaglandin D synthase (EC:5.3.99.2).	0
277547	cl21530	Dockerin_like	Dockerin repeat domains and domains resembling dockerin repeats. Bacterial cohesin domains bind to a complementary protein domain named dockerin, and this interaction is required for the formation of the cellulosome, a cellulose-degrading complex. The cellulosome consists of scaffoldin, a noncatalytic scaffolding polypeptide, that comprises repeating cohesion modules and a single carbohydrate-binding module (CBM). Specific calcium-dependent interactions between cohesins and dockerins appear to be essential for cellulosome assembly. This subfamily represents type I dockerins, which are responsible for anchoring a variety of enzymatic domains to the complex.	0
419713	cl21531	Sialidase	sialidases/neuraminidases. This family of proteins contains BNR-like repeats suggesting these proteins may act as sialidases.	0
419714	cl21532	NADAR	Escherichia coli swarming motility protein YbiA and related proteins. This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate.	0
419715	cl21533	Cas8a1_I-A	CRISPR/Cas system-associated protein Cas8a1. Clusters of short DNA repeats with non-homologous spacers, which are found at regular intervals in the genomes of phylogenetically distinct prokaryotic species, comprise a family with recognisable features. This family is known as CRISPR (short for Clustered, Regularly Interspaced Short Palindromic Repeats). A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This minor cas protein is found in at least five prokaryotic genomes: Methanosarcina mazei, Sulfurihydrogenibium azorense, Thermotoga maritima, Carboxydothermus hydrogenoformans, and Dictyoglomus thermophilum, the first of which is archaeal while the rest are bacterial.	0
419716	cl21534	NLPC_P60	NlpC/P60 family. Amidase_YiiX is a family of permuted papain-like amidases. It has amidase specificity for the amide bond between a lipid and an amino acid (or peptide). From the structure, a tetramer, each monomer is made up of a layered alpha-beta fold with a central, 6-stranded, antiparallel beta-sheet that is protected by helices on either side. The catalytic Cys154 in UniProtKB:Q74NK7, Structure 3kw0, is located on the N-terminus of helix alphaF. The two additional helices located above Cys154 contribute to the formation of the active site, where the lysine ligand is bound.	0
419717	cl21536	Rhomboid	Rhomboid family. The endoplasmic reticulum (ER) of the yeast Saccharomyces cerevisiae contains a proteolytic system able to selectively degrade misfolded lumenal secretory proteins. For examination of the components involved in this degradation process, mutants were isolated. They could be divided into four complementation groups. The mutations led to stabilisation of two different substrates for this process. The mutant classes were called 'der' for 'degradation in the ER'. DER1 was cloned by complementation of the der1-2 mutation. The DER1 gene codes for a novel, hydrophobic protein that is localized to the ER. Deletion of DER1 abolished degradation of the substrate proteins. The function of the Der1 protein seems to be specifically required for the degradation process associated with the ER. Interestingly, this family seems distantly related to the Rhomboid family of membrane peptidases, suggesting that this family may also mediate degradation of misfolded proteins (Bateman A pers. obs.).	0
419718	cl21538	TRAPPC_bet3-like	Bet3-like domains of TRAPP. TRAPP plays a key role in the targeting and/or fusion of ER-to-Golgi transport vesicles with their acceptor compartment. TRAPP is a large multimeric protein that contains at least 10 subunits. This family contains many TRAPP family proteins. The Bet3 subunit is one of the better characterized TRAPP proteins and has a dimeric structure with hydrophobic channels. The channel entrances are located on a putative membrane-interacting surface that is distinctively flat, wide and decorated with positively charged residues. Bet3 is proposed to localize TRAPP to the Golgi.	0
419719	cl21539	DnaJ_zf	Zinc finger domain of DnaJ and HSP40. The central cysteine-rich (CR) domain of DnaJ proteins contains four repeats of the motif CXXCXGXG where X is any amino acid. The isolated cysteine rich domain folds in zinc dependent fashion. Each set of two repeats binds one unit of zinc. Although this domain has been implicated in substrate binding, no evidence of specific interaction between the isolated DNAJ cysteine rich domain and various hydrophobic peptides has been found.	0
419720	cl21540	CopD	Copper resistance protein D. It appears this conserved hypothetical integral membrane protein is found only in gram negative bacteria. Completed genomes that include a member of this family include Rickettsia prowazekii, Synechocystis sp. PCC6803, and Helicobacter pylori. These proteins have 3 (Helicobacter pylori) to 5 (Synechocystis sp. PCC 6803) GES predicted transmembrane regions. Most members have 4 GES predicted transmembrane regions. [Hypothetical proteins, Conserved]	0
419721	cl21541	OstA	OstA-like protein. This is a family of OstA-like proteins that are related to pfam03968.	0
419722	cl21542	EthD	EthD domain. MmlI is a short, approx 115 residue, protein of two alpha helices and four beta strands. It is involved in the catabolism of methyl-substituted aromatics via a modified oxo-adipate pathway in bacteria. The enzyme appears to be monomeric in some species and tetrameric in others. The known structure shows two copies of the protein form a dimeric alpha beta barrel.	0
419723	cl21543	MMPL	MMPL family. Sterol regulatory element-binding proteins (SREBPs) are membrane-bound transcription factors that promote lipid synthesis in animal cells. They are embedded in the membranes of the endoplasmic reticulum (ER) in a helical hairpin orientation and are released from the ER by a two-step proteolytic process. Proteolysis begins when the SREBPs are cleaved at Site-1, which is located at a leucine residue in the middle of the hydrophobic loop in the lumen of the ER. Upon proteolytic processing SREBP can activate the expression of genes involved in cholesterol biosynthesis and uptake. SCAP stimulates cleavage of SREBPs via fusion of their two C-termini. This domain is the transmembrane region that traverses the membrane eight times and is the sterol-sensing domain of the cleavage protein. WD40 domains are found towards the C-terminus.	0
419724	cl21544	FlgD_ig	FlgD Ig-like domain. The function of this C-terminal domain is not known; there are several conserved tryptophan and asparagine residues.	0
419725	cl21545	GHB_like	Glycoprotein hormone beta chain homologues. This family contains several mammalian sclerostin (SOST) proteins. SOST is thought to suppress bone formation. Mutations of the SOST gene lead to sclerosteosis, a progressive sclerosing bone dysplasia with an autosomal recessive mode of inheritance. Radiologically, it is characterized by a generalized hyperostosis and sclerosis leading to a markedly thickened and sclerotic skull, with mandible, ribs, clavicles and all long bones also being affected. Due to narrowing of the foramina of the cranial nerves, facial nerve palsy, hearing loss and atrophy of the optic nerves can occur. Sclerosteosis is clinically and radiologically very similar to van Buchem disease, mainly differentiated by hand malformations and a large stature in sclerosteosis patients.	0
419726	cl21549	rve	Integrase core domain. This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction.	0
419727	cl21551	Sulfotransfer_3	Sulfotransferase family. Members of this family are essential for the biosynthesis of sulpholipid-1 in prokaryotes. They adopt a structure that belongs to the sulphotransferase superfamily, consisting of a single domain with a core four-stranded parallel beta-sheet flanked by alpha-helices.	0
419728	cl21552	TPK	Thiamine pyrophosphokinase. Family of thiamin pyrophosphokinase (EC:2.7.6.2). Thiamin pyrophosphokinase (TPK) catalyzes the transfer of a pyrophosphate group from ATP to vitamin B1 (thiamin) to form the coenzyme thiamin pyrophosphate (TPP). Thus, TPK is important for the formation of a coenzyme required for central metabolic functions. The structure of thiamin pyrophosphokinase suggests that the enzyme may operate by a mechanism of pyrophosphoryl transfer similar to those described for pyrophosphokinases functioning in nucleotide biosynthesis.	0
419730	cl21556	DUF2184	Uncharacterized protein conserved in bacteria (DUF2184). The Linocin_M18 is found in eubacteria and archaea. These proteins, referred to as encapsulins, form nanocompartments within the bacterium which contain ferritin-like proteins or peroxidases, enzymes involved in oxidative-stress response. These enzymes are targeted to the interior of encapsulins via unique C-terminal extensions.	0
419731	cl21557	Yip1	Yip1 domain. YIF1 (Yip1 interacting factor) is an integral membrane protein that is required for membrane fusion of ER derived vesicles. It also plays a role in the biogenesis of ER derived COPII transport vesicles.	0
419732	cl21559	HGD-D	2-hydroxyglutaryl-CoA dehydratase, D-component. Members of this family include various bacterial hypothetical proteins, as well as CoA enzyme activases. The exact function of this domain has not, as yet, been defined.	0
419733	cl21560	Ion_trans_2	Ion channel. This family includes the two membrane helix type ion channels found in bacteria.	0
419734	cl21562	DDE_Tnp_4	DDE superfamily endonuclease. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contains three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction.	0
419736	cl21565	LIF_OSM	LIF / OSM family. OSM, Oncostatin M	0
354875	cl21566	Sedlin_N	Sedlin, N-terminal conserved region. Sybindin is a physiological syndecan-2 ligand on dendritic spines, the small protrusions on the surface of dendrites that receive the vast majority of excitatory synapses.	0
419737	cl21567	Cyclophil_like	Cyclophilin-like. This is a family of bacterial and archaeal proteins, the structure for one of whose members has been characterized. Structure 3kop probably adopts a new hexameric form compared to previous structures. The putative active site is near the domain interface. 3kop is most closely related, structurally, to Structure 1zx8, where the potential active site is located near residues E51 and Y53 (conserved in 1zx8). Beyond the two residues above, the other residues are not conserved. Also, the shape of the active site differs from that of 1zx8. Structure 1zx8 belongs to family DUF369, pfam04126, which is part of the cyclophilin-like clan.	0
419738	cl21568	SurA_N_3	SurA N-terminal domain. This domain is found at the N-terminus of the chaperone SurA. It is a helical domain of unknown function. The C-terminus of the SurA protein folds back and forms part of this domain also but is not included in the current alignment.	0
419739	cl21569	Ribosomal_S30	Ribosomal protein S30. 40S ribosomal protein S30; Provisional	0
419740	cl21570	Csx1_III-U	CRISPR/Cas system-associated protein Csx1. Members of this family are found, exclusively in the vicinity of CRISPR repeats and other CRISPR-associated (cas) genes, in Methanothermobacter thermautotrophicus (Methanobacterium thermoformicicum), Thermus thermophilus (Deinococcus-Thermus), Chloroflexus aurantiacus (Chloroflexi), and Thermomicrobium roseum (Thermomicrobia).	0
419741	cl21572	GatB_Yqey	GatB domain. The function of this domain found in the YqeY protein is uncertain.	0
419742	cl21573	B3_4	B3/4 domain. This domain is found in the tRNA(Ile) lysidine synthetase (TilS) protein.	0
419744	cl21578	PAS	N/A. The MEKHLA domain shares similarity with the PAS domain and is found in the 3' end of plant HD-ZIP III homeobox genes, and bacterial proteins.	0
419745	cl21579	CBM_2	Cellulose binding domain. This domain is found at the C terminal of cellulases, and in vitro binding studies have shown that it binds to crystalline cellulose.	0
419746	cl21581	FixH	FixH. This family consists of several Rhizobium FixH like proteins. It has been suggested that the four proteins FixG, FixH, FixI, and FixS may participate in a membrane-bound complex coupling the FixI cation pump with a redox process catalyzed by FixG.	0
419747	cl21583	DUF4142	Domain of unknown function (DUF4142). Domain found in small family of bacterial secreted proteins with no known function. Also found in Paramecium bursaria chlorella virus 1. This domain is short and found in one or two copies. The domain has a conserved HH motif that may be functionally important. This domain belongs to the ferritin superfamily. It contains two sequence similar repeats each of which is composed of two alpha helices.	0
419748	cl21584	Tryp_SPc	N/A. This family represents the catalytic domain of alpha-lytic protease (alpha-LP) and its closely-related homologs. Alpha-lytic protease (EC 3.4.21.12; also called alpha-lytic endopeptidase), originally isolated from the myxobacterium Lysobacter enzymogenes, belongs to the MEROPS peptidase family S1, subfamily S1E (streptogrisin A subfamily). It is synthesized as a pro-enzyme, thus having two domains; the N-terminal pro-domain acts as a foldase, required transiently for the correct folding of the protease domain, and also acts as a potent inhibitor of the mature enzyme, while the C-terminal domain catalyzes the cleavage of peptide bonds. Members of the alpha-lytic protease subfamily include Nocardiopsis alba protease (NAPase), a secreted chymotrypsin from the alkaliphile Cellulomonas bogoriensis, streptogrisins (SPG-A, SPG-B, SPG-C, and SPG-D), and Thermobifida fusca protease A (TFPA). These serine proteases have characteristic kinetic stability, exhibited by their extremely slow unfolding kinetics. The active site, characteristic of serine proteases, contains the catalytic triad consisting of serine acting as a nucleophile, aspartate as an electrophile, and histidine as a base, all required for activity. This model represents the C-terminal catalytic domain of alpha-lytic proteases.	0
419749	cl21588	Snf7	Snf7. SNF-7-like protein; Provisional	0
389828	cl21589	Relaxase	Relaxase/Mobilisation nuclease domain. Relaxases/mobilisation proteins are required for the horizontal transfer of genetic information contained on plasmids that occurs during bacterial conjugation. The relaxase, in conjunction with several auxiliary proteins, forms the relaxation complex or relaxosome. Relaxases nick duplex DNA in a specific manner by catalyzing trans-esterification.	0
419750	cl21590	PMT_2	Dolichyl-phosphate-mannose-protein mannosyltransferase. This family is conserved in bacteria. The function is not known.	0
419751	cl21591	PRCH	N/A. The PRC-barrel is an all beta barrel domain found in photosystem reaction centre subunit H of the purple bacteria and RNA metabolism proteins of the RimM group. PRC-barrels are approximately 80 residues long, and found widely represented in bacteria, archaea and plants. This domain is also present at the carboxyl terminus of the pan-bacterial protein RimM, which is involved in ribosomal maturation and processing of 16S rRNA. A family of small proteins conserved in all known euryarchaea are composed entirely of a single stand-alone copy of the domain.	0
419752	cl21592	DUF998	Protein of unknown function (DUF998). Family of conserved archaeal proteins.	0
419753	cl21594	Gate	Nucleoside recognition. Members of this protein family are found exclusively in Firmicutes (low-GC Gram-positive bacteria) and are known from studies in Bacillus subtilis to be part of the sigma-E regulon. Mutation leads to a sporulation defect, confirming that members of this protein family, YlbJ, are sporulation proteins. This protein appears to be universal among endospore-forming bacteria, but is encoded by a pair of ORFs distant from each other in Symbiobacterium thermophilum IAM14863. [Cellular processes, Sporulation and germination]	0
419754	cl21598	PMP22_Claudin	PMP-22/EMP/MP20/Claudin family. Members of this family are claudins, that form tight junctions between cells.	0
419755	cl21600	DUF302_like	Domains similar to DUF302 and the N-terminal domains found in some bacterial RNAses. RnlA_toxin is an RNase LS and a putative toxin of a bacterial toxin-antitoxin pair. Toxin-antitoxin systems consist of a stable toxin and an unstable antitoxin. In this case, a novel type II system, RnlA is the stable toxin that causes inhibition of cell growth and rapidly degrades T4 late mRNAs to prevent their expression, and this is neutralized by the activity of the unstable antitoxin RnlB.	0
419756	cl21601	zf-CHC2	CHC2 zinc finger. This region represents the zinc binding domain. It is found in the N-terminal region of the bacteriophage P4 alpha protein, which is a multifunctional protein with origin recognition, helicase and primase activities.	0
419757	cl21602	DUF1073	Protein of unknown function (DUF1073). This model describes an uncharacterized family of proteins found in prophage regions of a number of bacterial genomes, including Haemophilus influenzae, Xylella fastidiosa, Salmonella typhi, and Enterococcus faecalis. Distantly related proteins can be found in the prophage-bearing plasmids of Borrelia burgdorferi. [Mobile and extrachromosomal element functions, Prophage functions]	0
419758	cl21606	GH3	GH3 auxin-responsive promoter. indole-3-acetic acid-amido synthetase	0
419759	cl21608	Galactosyl_T	Galactosyltransferase. This family includes a conserved region found in several uncharacterized plant proteins.	0
419760	cl21610	PQ-loop	PQ loop repeat. This family includes proteins such as Drosophila saliva, MtN3 involved in root nodule development and a protein involved in activation and expression of recombination activation genes (RAGs). Although the molecular function of these proteins is unknown, they are almost certainly transmembrane proteins. This family contains a region of two transmembrane helices that is found in two copies in most members of the family. This family also contains specific sugar efflux transporters that are essential for the maintenance of animal blood glucose levels, plant nectar production, and plant seed and pollen development. In many organisms it mediates glucose transport; in Arabidopsis it is necessary for pollen viability; and two of the rice homologs are specifically exploited by bacterial pathogens for virulence by means of direct binding of a bacterial effector to the SWEET promoter.	0
419761	cl21612	PolyA_pol	Poly A polymerase head domain. hypothetical protein	0
419762	cl21614	YkuD_like	L,D-transpeptidases/carboxypeptidases similar to Bacillus YkuD. This family is related to pfam03734.	0
419763	cl21616	DUF4870	Domain of unknown function (DUF4870). 	0
389842	cl21617	Terminase_GpA	Phage terminase large subunit (GpA). This family consists of several phage terminase large subunit proteins as well as related sequences from several bacterial species. The DNA packaging enzyme of bacteriophage lambda, terminase, is a heteromultimer composed of a small subunit, gpNu1, and a large subunit, gpA, products of the Nu1 and A genes, respectively. Terminase is involved in the site-specific binding and cutting of the DNA in the initial stages of packaging. It is now known that gpA is actively involved in late stages of packaging, including DNA translocation, and that this enzyme contains separate functional domains for its early and late packaging activities.	0
354892	cl21618	Peptidase_M11	Gametolysin peptidase M11. This model describes a metalloproteinase domain, with a characteristic HExxH motif. Examples of this domain are found in proteins in the family of immune inhibitor A, which cleaves antibacterial peptides, and in other, only distantly related proteases. This model is built to be broader and more inclusive than pfam05547.	0
419765	cl21622	PepSY	Peptidase propeptide and YPEB domain. This region is likely to have a protease inhibitory function (personal obs:C Yeats). The name is derived from Peptidase & Bacillus subtilis YPEB.	0
419766	cl21623	ALO	D-arabinono-1,4-lactone oxidase. The substrate-binding domain found in Cholesterol oxidase is composed of an eight-stranded mixed beta-pleated sheet and six alpha-helices. This domain is positioned over the isoalloxazine ring system of the FAD cofactor bound by FAD_binding_4 (PF:PF01565) and forms the roof of the active site cavity, allowing for catalysis of oxidation and isomerisation of cholesterol to cholest-4-en-3-one.	0
304472	cl21625	Capsule_synth	Capsule polysaccharide biosynthesis protein. This family includes export proteins involved in capsule polysaccharide biosynthesis, such as KpsS and LipB.	0
419767	cl21627	DRTGG	DRTGG domain. This family represents the N-terminal region of Hpr Serine/threonine kinase PtsK. This kinase is the sensor in a multicomponent phospho-relay system in control of carbon catabolic repression in bacteria. This kinase is unusual in that it recognizes the tertiary structure of its target and is a member of a novel family unrelated to any previously described protein phosphorylating enzymes. X-ray analysis of the full-length crystalline enzyme from Staphylococcus xylosus at a resolution of 1.95 A shows the enzyme to consist of two clearly separated domains that are assembled in a hexameric structure resembling a three-bladed propeller. The blades are formed by two N-terminal domains each, and the compact central hub assembles the C-terminal kinase domains.	0
419768	cl21628	POTRA	Surface antigen variable number repeat. FtsQ/DivIB bacterial division proteins (pfam03799) contain an N-terminal POTRA domain (for polypeptide-transport-associated domain). This is found in different types of proteins, usually associated with a transmembrane beta-barrel. FtsQ/DivIB may have chaperone-like roles, which has also been postulated for the POTRA domain in other contexts.	0
419769	cl21633	TruB-C_2	Pseudouridine synthase II TruB, C-terminal. The C terminal domain of tRNA Pseudouridine synthase II adopts a PUA (pfam01472) fold, with a four-stranded mixed beta-sheet flanked by one alpha-helix on each side. It allows for binding of the enzyme to RNA, as well as stabilisation of the RNA molecule.	0
419770	cl21636	AsmA_2	AsmA-like C-terminal region. This family is similar to the C-terminal of the AsmA protein of E. coli.	0
272058	cl21638	LodA_like	L-lysine epsilon-oxidase from Marinomonas mediterranea and similar proteins. L-lysine epsilon-oxidase is responsible for oxidative deamination of L-lysine, producing L-2-aminoadipate-6-semialdehyde. Hydrogen peroxide is a side-product of this enzymatic reaction, which requires the cofactor CTQ (cysteine tryptophylquinone). CTQ most likely forms a Schiff base with the free amino acid substrate. The protein is also called marinocine, for its broad-spectrum antibacterial activity; the latter is most likely caused by hydrogen peroxide synthesis. The dimerization interface observed in the available 3D structure does not seem to be conserved. Homologs of LodA have been detected in various gram-negative bacteria, and they appear to be associated with the formation of biofilms.	0
419771	cl21639	GH_101_like	Endo-a-N-acetylgalactosaminidase and related glycosyl hydrolases. Virulence of pathogenic organisms such as the Gram-positive Streptococcus pneumoniae is largely determined by the ability to degrade host glycoproteins and to metabolize the resultant carbohydrates. This family is the enzymatic region, EC:3.2.1.97, of the cell surface proteins that specifically cleave Gal-beta-1,3-GalNAc-alpha-Ser/Thr (T-antigen, galacto-N-biose), the core 1 type O-linked glycan common to mucin glycoproteins. This reaction is exemplified by the S. pneumoniae protein Endo-alpha-N-acetylgalactosaminidase, where Asp764 is the catalytic nucleophile-base and Glu796 the catalytic proton donor.	0
419772	cl21640	PUFD_like	PCGF Ub-like fold discriminator and related domains. PUFD is the minimal domain at the C-terminus of BCORL (BCL6 corepressor) that is needed for binding and giving specificity to some of the PCGF proteins, polycomb-group RING finger homologs. PUFD binds to the RAWUL (RING finger- and WD40-associated ubiquitin-like) domain of the particular PCGF PCGF1, pfam16207. Polycomb group proteins form repressive complexes (PRC) that mediate epigenetic modifications of histones. In humans there are many different PCGF homologs whose functions all vary, but the direct binding partner of PCGF1 is BCOR. BCOR has emerged as an important player in development and health.	0
419773	cl21642	Pentapeptide	Pentapeptide repeats (8 copies). These repeats are found in many cyanobacterial proteins. The repeats were first identified in hglK. The function of these repeats is unknown. The structure of this repeat has been predicted to be a beta-helix. The repeat can be approximately described as A(D/N)LXX, where X can be any amino acid.	0
419774	cl21648	Coa1	Cytochrome oxidase complex assembly protein 1. TIM21 interacts with the outer mitochondrial TOM complex and promotes the insertion of proteins into the inner mitochondrial membrane.	0
419775	cl21649	GFO_IDH_MocA_C	Oxidoreductase family, C-terminal alpha/beta domain. This is the C terminal of a family of putative oxidoreductases.	0
419776	cl21652	Peptidase_C11	Clostripain family. Clostripain is a cysteine protease characterized from Clostridium histolyticum, and also known from Clostridium perfringens. It is a heterodimer processed from a single precursor polypeptide, specific for Arg-|-Xaa peptide bonds. The older term alpha-clostripain refers to the most active, most reduced form, rather than to the product of one of several different genes. Clostripain belongs to the peptidase family C11, or clostripain family (see pfam03415). [Protein fate, Degradation of proteins, peptides, and glycopeptides, Cellular processes, Pathogenesis]	0
419777	cl21655	AMO	Ammonia monooxygenase. Ammonia oxidizers such as Nitrosomonas europaea and methanotrophs (obligate methane oxidizers) such as Methylococcus capsulatus can each grow only on their own characteristic substrate. However, both groups have the ability to oxidize both substrates, and so the relevant enzymes must be named here according to their ability to oxidize both. The protein family represented here reflects subunit A of both the particulate methane monooxygenase of methylotrophs and the ammonia monooxygenase of nitrifying bacteria.	0
276368	cl21656	Silic_transp	Silicon transporter. Marine diatoms such as Cylindrotheca fusiformis encode at least six silicon transport protein homologues which exhibit similar size and topology. One characterized member of the family (Sit1) functions in the energy-dependent uptake of either Silicic acid [Si(OH)4] or Silicate [Si(OH)3O-] by a Na+ symport mechanism. The system is found in marine diatoms which make their "glass houses" out of silicon. [Transport and binding proteins, Other]	0
328841	cl21657	Phage_TTP_1	Phage tail tube protein. This model describes a set of proteins that share low levels of sequence similarity but similar lengths and similar patterns of charged, hydrophobic, and Gly/Pro residues. All members (except one attributed to mouse embryo cDNA) belong to phage of Gram-positive bacteria. Several are identified as phage major tail proteins. Some members of this family have additional C-terminal regions of about 100 residues not included in this model. [Mobile and extrachromosomal element functions, Prophage functions]	0
419778	cl21658	NinB	NinB protein. hypothetical protein; Provisional	0
419779	cl21662	GH7_CBH_EG	Glycosyl hydrolase family 7. Glycosyl hydrolase family 7 contains eukaryotic endoglucanases (EGs) and cellobiohydrolases (CBHs) that hydrolyze glycosidic bonds using a double-displacement mechanism. This leads to a net retention of the conformation at the anomeric carbon. Both enzymes work synergistically in the degradation of cellulose, which is the main component of the plant cell wall and is composed of beta-1,4 linked glycosyl units. EG cleaves the beta-1,4 linkages of cellulose and CBH cleaves off cellobiose disaccharide units from the reducing end of the chain. In general, the O-glycosyl hydrolases are a widespread group of enzymes that hydrolyze the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A glycosyl hydrolase classification system based on sequence similarity has led to the definition of more than 95 different families including glycoside hydrolase family 7.	0
272086	cl21666	PHA02004	N/A. major capsid protein	0
419780	cl21672	DIOX_N	non-haem dioxygenase in morphine synthesis N-terminal. flavanone-3-hydroxylase; Provisional	0
419781	cl21673	DUF3828	Protein of unknown function (DUF3828). putative lipoprotein; Provisional	0
419782	cl21675	OprD	outer membrane porin, OprD family. This family consists of Campylobacter major outer membrane proteins. The major outer membrane protein (MOMP), a putative porin and a multifunction surface protein of Campylobacter jejuni, may play an important role in the adaptation of the organism to various host environments.	0
389861	cl21676	VRP1	Salmonella virulence plasmid 28.1kDa A protein. virulence protein SpvA; Provisional	0
419783	cl21677	NHase_beta	Nitrile hydratase beta subunit. Members of this protein family are the beta subunit of nitrile hydratase. The alpha subunit is represented by model TIGR01323. While nitrile hydratase is given the specific EC number 4.2.1.84, nitriles are a class of compounds, and one genome may carry more than one nitrile hydratase. The enzyme occurs in both non-heme iron and non-corrin cobalt forms. [Energy metabolism, Amino acids and amines]	0
419784	cl21678	Lon_C	Lon protease (S16) C-terminal proteolytic domain. The Lon serine proteases must hydrolyze ATP to degrade protein substrates. In Escherichia coli, these proteases are involved in turnover of intracellular proteins, including abnormal proteins following heat-shock. The active site for protease activity resides in a C-terminal domain. The Lon proteases are classified as family S16 in Merops.	0
419785	cl21680	DUF3276	Protein of unknown function (DUF3276). This bacterial family of proteins has no known function.	0
389865	cl21681	STN	Secretin and TonB N-terminus short domain. This is a short domain found at the N-terminus of the Secretins of the bacterial type II/III secretory system as well as the TonB-dependent receptor proteins. These proteins are involved in TonB-dependent active uptake of selective substrates.	0
272102	cl21682	CBM_10	Cellulose or protein binding domain. This domain is found in two distinct sets of proteins with different functions. Those found in aerobic bacteria bind cellulose (or other carbohydrates); but in anaerobic fungi they are protein binding domains, referred to as dockerin domains or docking domains. They are believed to be responsible for the assembly of a multiprotein cellulase/hemicellulase complex, similar to the cellulosome found in certain anaerobic bacteria.	0
354906	cl21683	TFIIA_alpha_beta_like	Precursor of TFIIA alpha and beta subunits and similar proteins. Transcription factor II A (TFIIA) is one of the general transcription factors for RNA polymerase II. TFIIA increases the affinity of TATA-binding protein (TBP) for DNA in order to assemble the initiation complex. TFIIA also functions as an activator during development and differentiation, and is involved in transcription from TATA-less promoters. TFIIA is composed of more than one subunit in various organisms. Mammalian TFIIA large subunits (TFIIA alpha and beta) and the smaller subunit (TFIIA gamma) form a heterotrimer. TFIIA alpha and beta are encoded by a single gene (TFIIA_alpha_beta), its protein product is post-translationally processed and cleaved. TOA1 and TOA2 are the two subunits of Yeast TFIIA which correspond to Mammalian TFIIA_alpha_beta and TFIIA gamma, respectively. TOA1 and TOA2 form a heterodimeric protein complex. TFIIA_alpha_beta alone is sufficient for transcription in early embryogenesis, but the cleaved forms, TFIIA alpha and TFIIA beta, represent the vast majority of TFIIA in most differentiated cells. The exact functional differences between cleaved and uncleaved forms are not yet clear. This model also contains paralogs of the canonical TFIIA_alpha_beta, such as the human ALF, which may be involved in gametogenesis and early embryogenesis (and is also subject to proteolytic cleavage).	0
419786	cl21684	DUF4397	Domain of unknown function (DUF4397). AlgF is essential for the addition of O-acetyl groups to alginate, an extracellular polysaccharide. The presence of O-acetyl groups plays an important role in the ability of the polymer to act as a virulence factor.	0
419787	cl21686	RecO_N	Recombination protein O N terminal. This entry contains members that are not captured by pfam11967.	0
389868	cl21687	Orc6_mid	Middle domain of the origin recognition complex subunit 6. This family consists of several eukaryotic origin recognition complex subunit 6 (ORC6) proteins. Despite differences in their structure and sequences among eukaryotic replicators, ORC is a conserved feature of replication initiation in all eukaryotes. ORC-related genes have been identified in organisms ranging from S. pombe to plants to humans. All DNA replication initiation is driven by a single conserved eukaryotic initiator complex termed the origin recognition complex (ORC). The ORC is a six-protein complex. The function of ORC is reviewed in.	0
419788	cl21688	DUF1743	Domain of unknown function (DUF1743). The first twenty-nine completed genomes with a member of this protein family include twenty-eight archaeal methanogens and one other related archaeon, Ferroglobus placidus DSM 10642. The exact function is unknown, but the protein likely belongs to a system usually tightly linked to methanogenesis.	0
419789	cl21693	CDC48_N	Cell division protein 48 (CDC48), N-terminal domain. This domain has a double psi-beta barrel fold and includes VCP-like ATPase and N-ethylmaleimide sensitive fusion protein N-terminal domains. Both the VAT and NSF N-terminal functional domains consist of two structural domains of which this is at the N-terminus. The VAT-N domain found in AAA ATPases is a 185-residue substrate recognition domain.	0
419790	cl21695	Tn7_Tnp_TnsA_N	TnsA endonuclease N terminal. head completion protein; Provisional	0
419791	cl21700	Glyco_hydro_26	Glycosyl hydrolase family 26. 	0
419792	cl21701	PC4	Transcriptional Coactivator p15 (PC4). p15 has a bipartite structure composed of an amino-terminal regulatory domain and a carboxy-terminal cryptic DNA-binding domain. The DNA-binding activity of the carboxy-terminal is disguised by the amino-terminal p15 domain. Activity is controlled by protein kinases that target the regulatory domain.	0
304504	cl21702	DUF1700	Protein of unknown function (DUF1700). This family contains many hypothetical bacterial proteins and putative membrane proteins.	0
419793	cl21703	Peptidase_A24	Type IV leader peptidase family. Peptidase A24, or the prepilin peptidase as it is also known, processes the N-terminus of the prepilins. The processing is essential for the correct formation of the pseudopili of type IV bacterial protein secretion. The enzyme is found across eubacteria and archaea.	0
419794	cl21704	zf-CSL	CSL zinc finger. This is a zinc binding motif which contains four cysteine residues which chelate zinc. This domain is often found associated with a pfam00226 domain. This domain is named after the conserved motif of the final cysteine.	0
419795	cl21705	Arv1	Arv1-like family. Arv1 is a transmembrane protein with potential zinc-binding motifs. ARV1 is a novel mediator of eukaryotic sterol homeostasis.	0
419796	cl21707	Phage_holin_3_6	Putative Actinobacterial Holin-X, holin superfamily III. Phage_holin_3_6 is a family of small hydrophobic proteins with two or three transmembrane domains of the Hol-X family. Holin proteins are produced by double-stranded DNA bacteriophages that use an endolysin-holin strategy to achieve lysis of their hosts. The endolysins are peptidoglycan-degrading enzymes that are usually accumulated in the cytosol until access to the cell wall substrate is provided by the holin membrane lesion.	0
419797	cl21709	COQ9	COQ9. This uncharacterized protein is found in a number of Alphaproteobacteria and, with N-terminal regions long enough to be transit peptides, in eukaryotes. This phylogeny suggests mitochondrial derivation. In several Alphaproteobacteria, the gene for this protein is encoded divergently from rpsU, the gene for ribosomal protein S21. S21 is unusual in being encoded outside the usual long ribosomal protein operons, but rather in contexts that suggest regulation of the initiation of protein translation. [Unknown function, General]	0
389879	cl21710	SASP_gamma	Small, acid-soluble spore protein, gamma-type. This model represents a family of small, glutamine and asparagine-rich peptides that store amino acids in the spores of Bacillus subtilis and related bacteria. Most members of the family have two copies of the spore protease (GPR) cleavage motif, typically EFASE in this family, separating three low-complexity repeats. [Cellular processes, Sporulation and germination]	0
419798	cl21712	T2SSC	Type II secretion system protein C. Members of this protein family are found in type IV pilus biogenesis loci and include proteins designated PilP. [Cell envelope, Surface structures]	0
389881	cl21715	TrbM	TrbM. conjugal transfer protein TrbM; Provisional	0
419799	cl21716	PapD_N	Pili and flagellar-assembly chaperone, PapD N-terminal domain. C2 domain-like beta-sandwich fold. This domain is the n-terminal part of the PapD chaperone protein for pilus and flagellar assembly.	0
419800	cl21721	IDO	Indoleamine 2,3-dioxygenase. This domain has no known function. It is found in various hypothetical and conserved domain proteins.	0
419801	cl21722	DnaJ_C	C-terminal substrate binding domain of DnaJ and HSP40. This family consists of the C terminal region of the DnaJ protein. It is always found associated with pfam00226 and pfam00684. DnaJ is a chaperone associated with the Hsp70 heat-shock system involved in protein folding and renaturation after stress. The two C-terminal domains CTDI and CTDII, both incorporated in this family are necessary for maintaining the J-domains in their specific relative positions. Structural analysis of Structure 1nlt shows that PF00684 is nested within this DnaJ C-terminal region.	0
419802	cl21724	GAG_Lyase	N/A. This family consists of a group of secreted bacterial lyase enzymes EC:4.2.2.1 capable of acting on hyaluronan and chondroitin in the extracellular matrix of host tissues, contributing to the invasive capacity of the pathogen.	0
419803	cl21727	VATPase_H	N/A. The yeast Saccharomyces cerevisiae vacuolar H+-ATPase (V-ATPase) is a multisubunit complex responsible for acidifying organelles. It functions as an ATP dependent proton pump that transports protons across a lipid bilayer. This domain corresponds to the N terminal domain of the H subunit of V-ATPase. The N-terminal domain is required for the activation of the complex whereas the C-terminal domain is required for coupling ATP hydrolysis to proton translocation.	0
419804	cl21728	CIA30	Complex I intermediate-associated protein 30 (CIA30). This protein is associated with mitochondrial Complex I intermediate-associated protein 30 (CIA30) in human and mouse. The family is also present in Schizosaccharomyces pombe which does not contain the NADH dehydrogenase component of complex I, or many of the other essential subunits. This means it is possible that this family of protein may not be directly involved in oxidative phosphorylation.	0
304525	cl21731	DUF677	Protein of unknown function (DUF677). This family consists of several plant proteins and includes BYPASS1, which is required for normal root and shoot development. This protein prevents constitutive production of a root mobile carotenoid-derived signaling compound that is capable of arresting shoot and leaf development.	0
419805	cl21735	Lung_7-TM_R	Lung seven transmembrane receptor. This region of 270 amino acids is the seven transmembrane alpha-helical domains included within five GPCRRHODOPSN4 motifs of a G-protein-coupled-receptor (GPCR) protein, conserved from nematodes to humans. GPCRs are integral membrane receptors whose intracellular actions are mediated by signalling pathways involving G proteins and downstream secondary messengers.	0
419806	cl21736	TAF6C	C-terminal domain of TATA Binding Protein (TBP) Associated Factor 6 (TAF6). TAF6_C is the C-terminal domain of the TAF6 subunit of the general transcription factor TFIID. The crystal structure reveals the presence of five conserved HEAT repeats. This region is necessary for the complexing together of the subunits TAF5, TAF6 and TAF9.	0
419808	cl21738	Alginate_lyase2	Alginate lyase. This family includes heparin lyase I, EC:4.2.2.7. Heparin lyase I depolymerizes heparin by cleaving the glycosidic linkage next to an iduronic acid moiety. The structure of heparin lyase I consists of a beta-jelly roll domain with a long, deep substrate-binding groove and an unusual thumb domain containing many basic residues extending from the main body of the enzyme. This family also includes glucuronan lyase, EC:4.2.2.14. The structure of glucuronan lyase is a beta-jelly roll.	0
419809	cl21742	Cas10_III	CRISPR/Cas system-associated protein Cas10. This domain family is found in bacteria and archaea, and is typically between 101 and 138 amino acids in length. The proteins in this family are frequently annotated as CRISPR-associated proteins however there is little accompanying literature to confirm this.	0
419810	cl21745	DUF4188	Domain of unknown function (DUF4188). This family includes aldoxime dehydratase, EC:4.99.1.5. This is a haem-containing enzyme, which catalyzes the dehydration of aldoximes to their corresponding nitrile. It also includes phenylacetaldoxime dehydratase, EC:4.99.1.7. This haem-containing enzyme catalyzes the dehydration of Z-phenylacetaldoxime to phenylacetonitrile. The enzyme forms an elliptic beta barrel, composed of eight beta-strands, flanked by alpha-helices.	0
419811	cl21747	SusD	starch binding outer membrane protein SusD. SusD is a secreted polysaccharide-binding protein with an N-terminal lipid moiety that allows it to associate with the outer membrane. SusD probably mediates xyloglucan-binding prior to xyloglucan transport in the periplasm for degradation. This domain is found N-terminal to pfam07980.	0
419813	cl21750	Fis1	Mitochondrial Fission Protein Fis1, cytosolic domain. The mitochondrial fission protein Fis1 consists of two tetratricopeptide repeats. This domain is the C-terminal tetratricopeptide repeat	0
419819	cl22409	BslA_like	Bacterial immunoglobulin-like hydrophobin BslA and similar proteins. This family includes members such as BslA (previously called YuaB). Secreted BslA from Bacillus subtilis has been shown to form surface layers around the biofilm, self-assembling at interfaces of B. subtilis biofilms and forming an elastic film. Structural analysis revealed that BslA consists of an Ig-type fold with the addition of an unusual, extremely hydrophobic cap region. The hydrophobic cap exhibits physicochemical properties similar to the hydrophobic surface found in fungal hydrophobins; thus, BslA is defined as a member of a class of bacterially produced hydrophobins.	0
419820	cl22411	E2F_DD	Dimerization domain of E2F transcription factors. This is the coiled coil (CC) - marked box (MB) domain of E2F transcription factors. This domain forms a heterodimer with the corresponding domain of the DP transcription factor; the heterodimer binds the C-terminus of retinoblastoma protein.	0
419821	cl22413	ADAM17_MPD	Membrane-proximal domain of a disintegrin and metalloprotease 17 (ADAM17). ADAM17_MPD is the membrane-proximal domain of a family of disintegrin and metalloproteinase domain-containing protein 17 found in metazoan species. ADAM17 is a major sheddase that is responsible for the regulation of a wide range of biological processes, such as cellular differentiation, regeneration, and cancer progression. This MPD region acts as the sheddase switch. PDI, or protein-disulfide isomerase, interacts with ADAM17 to down-regulate its enzymatic activity. The interaction is directly with the MPD, the region of dimerization and substrate recognition, where it catalyzes an isomerisation of disulfide bridges within the thioredoxin motif CXXC. This isomerisation results in a major structural change between an active, open state and an inactive, closed state of the MPD. This change is thought to act as a molecular switch, allowing a global reorientation of the extracellular domains in ADAM17 and regulating its shedding activity.	0
419822	cl22414	Lmo2686_like	Uncharacterized hexameric protein conserved in Bacilli. This is a domain of unknown function mostly found in firmicutes.	0
419823	cl22415	ESP	Exocrine gland-secreting peptide 1 (ESP1) and similar pheromones. ESP is a family of largely rodent exocrine gland-secreting peptides that are produced by the male extraorbital lacrimal gland to be secreted into the tear fluid. Other mice including females detect these peptides through receptors in the vomeronasal organ, and the receptors report information on mouse-strain, sex and species. The peptides are short, all carrying an N-terminal signal-peptide to indicate they are for secretion which accounts for much of the common conservation.	0
419824	cl22417	bt3222_like	Uncharacterized proteins similar to Bacteroides thetaiotaomicron bt3222. A small family of uncharacterized proteins around 310 residues in length and found in various Bacteroides species. The function of this family is unknown.	0
419825	cl22418	CttA_X	X module of the carbohydrate-binding protein CttA and similar proteins. This is the N-terminal domain of cellulose-binding protein CttA present in Ruminococcus flavefaciens. CttA mediates attachment of the bacterial substrate via two carbohydrate-binding modules. The domain is known as the X-module and lacks a true hydrophobic core. Unlike the X-modules in other types of CohE-XDoc complexes it does not contribute to the binding surface. This X-module appears to serve as an extended spacer, which separates the cellulose-binding modules at the N terminus of CttA and the bacterial cell wall. The domain does not share structural similarity with other known X-modules from cellulolytic bacteria but does show similarity to G5-1 module of StrH from S. pneumoniae.	0
419826	cl22419	LepB	Legionella Rab1-specific GAP LepB. This is a subdomain of a Rab GTPase-activating protein (GAP) effector from Legionella pneumophila. This GAP modulates Rab enzymes that act as molecular switches in regulating vesicular transport in eukaryotic cells. This N-terminal subdomain belongs to the GAP domain of the protein. The catalytic arginine finger (Arg444) is located within this sub-domain and it is the only arginine residue required for GAP activity.	0
419827	cl22420	Hip_N	N-terminal dimerization domain of the Hsp70-interacting protein (Hip) and similar proteins. This is the N-terminal domain, known as HipN, found in Hsp70-interacting protein (Hip) present in Rattus norvegicus. Hip cooperates with the chaperone Hsp70 in protein folding and prevention of aggregation and may delay substrate release by slowing ADP dissociation from Hsp70. HipN is responsible for N-terminal homo-dimerization which is necessary so that the Hip dimer can interact with Hsp70 molecules.	0
277561	cl22421	ZIP_Gal4p-like	Leucine zipper Dimerization domain of Gal4p-like transcription factors. Sip4p binds to carbon source-responsive element (CSRE) motifs and activates transcription of target genes under conditions of glucose deprivation. Its function is modulated through phosphorylation by SNF1 protein kinase, a protein essential for expression of glucose-repressed genes in response to glucose deprivation. Sip4p is a member of the Gal4p family of transcriptional activators which contain an N-terminal DNA-binding domain with a Zn2Cys6 binuclear cluster that interact with CCG triplets and a leucine zipper-like heptad repeat that dimerizes. Dimerization allows binding of targets which contain two CCG motifs oriented in an inverted (CGG-CCG), direct (CCG-CCG), or everted (CCG-CGG) manner.	0
419828	cl22422	SRP68-RBD	RNA-binding domain of signal recognition particle subunit 68. SRP68 is a family that is part of the SRP or signal recognition particle complex. This complex, consisting of six proteins and a 7SL-RNA is necessary for guiding the emerging proteins designed for the membrane towards the translocation pore. SRP68 forms a stable heterodimer with SRP72, a protein with a TPR repeat. Specific RNA-binding of SRP68 is mediated by the N-terminal domain of approximately 200 residues of this family.	0
419829	cl22423	NBR1_like	Functionally uncharacterized domain in neighbor of Brca1 Gene 1 and related proteins. Domain present between positions 365-485 in the human next to BRCA1 gene 1 protein Q14596 (NBR1_HUMAN) Distant homology and fold prediction analysis suggests this domain has an immunoglobulin like fold and is distantly homologous to domains involved in cell adhesion such as CARDB (PF07705). JCSG construct was crystalized confirming the domain boundaries	0
304554	cl22428	E1_enzyme_family	N/A. Members of the HesA/MoeB/ThiF family of proteins (pfam00899) include a number of members encoded in the midst of thiamine biosynthetic operons. This mix of known and putative ThiF proteins shows a deep split in phylogenetic trees, with the Escherichia coli ThiF and the E. coli MoeB proteins seemingly more closely related than E. coli ThiF and Campylobacter (for example) ThiF. This model represents the more widely distributed clade of ThiF proteins such as that found in E. coli. [Biosynthesis of cofactors, prosthetic groups, and carriers, Thiamine]	0
419830	cl22429	HHH_5	Helix-hairpin-helix domain. The HHH domain is a short DNA-binding domain.	0
419831	cl22433	H3TH_StructSpec-5&apos;-nucleases	H3TH domains of structure-specific 5' nucleases (or flap endonuclease-1-like) involved in DNA replication, repair, and recombination. Exonuclease-1 (EXO1) is involved in multiple, eukaryotic DNA metabolic pathways, including DNA replication processes (5' flap DNA endonuclease activity and double stranded DNA 5'-exonuclease activity), DNA repair processes (DNA mismatch repair (MMR) and post-replication repair (PRR), recombination, and telomere integrity. EXO1 functions in the MMS2 error-free branch of the PRR pathway in the maintenance and repair of stalled replication forks. Studies also suggest that EXO1 plays both structural and catalytic roles during MMR-mediated mutation avoidance. Members of this subgroup include the H3TH (helix-3-turn-helix) domains of EXO1 and other similar eukaryotic 5' nucleases. These nucleases contain a PIN (PilT N terminus) domain with a helical arch/clamp region/I domain (not included here) and inserted within the PIN domain is an atypical helix-hairpin-helix-2 (HhH2)-like region. This atypical HhH2 region, the H3TH domain, has an extended loop with at least three turns between the first two helices, and only three of the four helices appear to be conserved. Both the H3TH domain and the helical arch/clamp region are involved in DNA binding. Studies suggest that a glycine-rich loop in the H3TH domain contacts the phosphate backbone of the template strand in the downstream DNA duplex. These nucleases have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (Mg2+ or Mn2+) required for nuclease activity. The first metal binding site is composed entirely of Asp/Glu residues from the PIN domain, whereas, the second metal binding site is composed generally of two Asp residues from the PIN domain and one Asp residue from the H3TH domain. Together with the helical arch and network of amino acids interacting with metal binding ions, the H3TH region defines a positively charged active-site DNA-binding groove in structure-specific 5' nucleases. EXO1 nucleases also have C-terminal Mlh1- and Msh2-binding domains which allow interaction with MMR and PRR proteins, respectively.	0
419832	cl22434	Hint	N/A. This short domain is a conserved region of intein-containing proteins from lower eukaryotes.	0
419833	cl22435	TPP_enzyme_M	Thiamine pyrophosphate enzyme, central domain. TPP_enzyme_M_2 is the middle domain of thiamine pyrophosphate in sequences not captured by pfam00205. This enzyme is necessary for the first step of the biosynthesis of menaquinone, or vitamin K2, an important cofactor in electron transport in bacteria.	0
419834	cl22448	Inhibitor_I29	Cathepsin propeptide inhibitor domain (I29). This domain is found at the N-terminus of some C1 peptidases such as Cathepsin L where it acts as a propeptide. There are also a number of proteins that are composed solely of multiple copies of this domain such as the peptidase inhibitor salarin. This family is classified as I29 by MEROPS. Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact direct with proteinases using a simple noncovalent lock and key mechanism; while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties.	0
419835	cl22450	RtcB	RNA-splicing ligase RtcB, repairs tRNA damage [Translation, ribosomal structure and biogenesis]. Members of this family are related to RctB. RctB is a protein of known structure but unknown function that often is encoded near RNA cyclase and therefore is suggested to be a tRNA or mRNA processing enzyme. This family of RctB-like proteins is encoded upstream of, and apparently is translationally coupled to, the putative peptide chain release factor RF-H (TIGR03072), product of the prfH gene. Note that a large deletion at the junction between this gene and the prfH gene in Escherichia coli K-12 marks both as probable pseudogenes. [Protein synthesis, Other]	0
419836	cl22451	ASF1_hist_chap	ASF1 like histone chaperone. This family includes the yeast and human ASF1 protein. These proteins have histone chaperone activity. ASF1 participates in both the replication-dependent and replication-independent pathways. The three-dimensional structure has been determined as a compact immunoglobulin-like beta sandwich fold topped by three helical linkers.	0
419837	cl22454	Arm	Armadillo/beta-catenin-like repeat. The HEAT repeat family is related to armadillo/beta-catenin-like repeats (see pfam00514). These EZ repeats are found in subunits of cyanobacterial phycocyanin lyase and other proteins and probably carry out a scaffolding role.	0
419838	cl22470	AXH	Ataxin-1 and HBP1 module (AXH). unknown function	0
419839	cl22471	MtrF	Tetrahydromethanopterin S-methyltransferase, F subunit (MtrF). tetrahydromethanopterin S-methyltransferase subunit F; Provisional	0
419840	cl22482	Sec63	Sec63 Brl domain. This domain was named after the yeast Sec63 (or NPL1) (also known as the Brl domain) protein in which it was found. This protein is required for assembly of functional endoplasmic reticulum translocons. Other yeast proteins containing this domain include pre-mRNA splicing helicase BRR2, HFM1 protein and putative helicases.	0
419842	cl22495	Gp23	Major capsid protein Gp23. capsid vertex protein; Provisional	0
419843	cl22503	DUF2385	Protein of unknown function (DUF2385). Members of this uncharacterized protein family are found in a number of alphaProteobacteria, including root nodule bacteria, Brucella suis, Caulobacter crescentus, and Rhodopseudomonas palustris. Conserved residues include two well-separated cysteines, suggesting a disulfide bond. The function is unknown.	0
419844	cl22520	UPF0181	Uncharacterized protein family (UPF0181). This family contains small proteins of about 50 amino acids of unknown function. The family includes YoaH.	0
304575	cl22532	Carbam_trans_N	Carbamoyltransferase N-terminus. This family describes a protein family, YeaZ, now associated with the threonylcarbamoyl adenosine (t6A) tRNA modification. Members of this family may occur as fusions with ygjD (previously gcp) or the ribosomal protein N-acetyltransferase rimI, and is frequently encoded next to rimI. [Protein synthesis, tRNA and rRNA base modification]	0
419845	cl22542	P22_CoatProtein	P22 coat protein - gene protein 5. This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and viruses. Proteins in this family are typically between 369 and 424 amino acids in length. There is a single completely conserved residue G that may be functionally important.	0
419847	cl22548	DUF3792	Protein of unknown function (DUF3792). Members of this family of strongly hydrophobic putative transmembrane protein average about 125 amino acids in length and occur mostly, but not exclusively, in the Firmicutes. Members are quite diverse in sequence. The function is unknown.	0
419848	cl22555	CRF	Corticotropin-releasing factor family. 	0
304582	cl22557	SPAM	Salmonella surface presentation of antigen gene type M protein. type III secretion system protein SpaM; Provisional	0
304583	cl22571	VirC2	VirC2 protein. This family consists of several VirC2 proteins which seem to be found exclusively in Agrobacterium species and Rhizobium etli. VirC2 is known to be involved in virulence in Agrobacterium species but its exact function is unclear.	0
419849	cl22626	YSIRK_signal	YSIRK type signal peptide. Many surface proteins found in Streptococcus, Staphylococcus, and related lineages share apparently homologous signal sequences. A motif resembling [YF]SIRKxxxGxxS[VIA] appears at the start of the transmembrane domain. The GxxS motif appears perfectly conserved, suggesting a specific function and not just homology. There is a strong correlation between proteins carrying this region at the N-terminus and those carrying the Gram-positive anchor domain with the LPXTG sortase processing site at the C-terminus.	0
419850	cl22628	YcgL	YcgL domain. This family of proteins, formerly called DUF709, includes the E. coli gene ycgL. Homologs of YcgL are found in gammaproteobacteria. The structure of this protein shows a novel alpha/beta/alpha sandwich structure.	0
419851	cl22629	Cyt_c_Oxidase_VIIb	N/A. Cytochrome C oxidase chain VIIb. Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane.  The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. The VIIb subunit is found only in eukaryotes and its specific function remains unclear. A rare polymorphism of the CcO VIIb gene may be associated with the high risk of nasopharyngeal carcinoma in a Cantonese family.	0
419852	cl22636	DNA_pol3_theta	DNA polymerase III, theta subunit. This family of proteins with unknown function appears to be restricted to Proteobacteria.	0
276635	cl22681	wcaD	N/A. This membrane protein is believed to function as the colanic acid repeating unit polymerase (in an analogous fashion to wzy proteins in O-antigen polymerization).	0
276638	cl22684	wcaM	N/A. This protein of uncharacterized function is the final gene in the conserved colanic acid biosynthesis cluster observed in Enterobacteriaceae.	0
419854	cl22701	Phage_lysis	Bacteriophage Rz lysis protein. phage lambda Rz-like lysis protein	0
419856	cl22733	DUF1800	Protein of unknown function (DUF1800). This is a family of large bacterial proteins of unknown function.	0
419858	cl22759	T2SS_PulS_OutS	Type II secretion system pilotin lipoprotein (PulS_OutS). This family comprises lipoproteins from four gamma proteobacterial species: PulS protein of Klebsiella pneumoniae, the OutS protein of Erwinia chrysanthemi and Pectobacterium chrysanthemi, and the functionally uncharacterized E. coli protein EtpO. PulS and OutS have been shown to interact with and facilitate insertion of secretins into the outer membrane, suggesting a chaperone-like, or piloting function for members of this family. [Transport and binding proteins, Amino acids, peptides and amines]	0
419860	cl22765	SP_1775_like	Uncharacterized protein conserved in Streptococci. This family of Firmicute sequences has members that are annotated as ribose-phosphate pyrophosphokinase; however there is no evidence for this attribution. Member proteins are all shorter than 100 residues in length.	0
276720	cl22766	mycoplas_M_dom	IgG-blocking virulence domain. Members of this family, including MG_281 of Mycoplasma genitalium, bind conserved regions of the IgG light chain sequences, blocking IgG's normal function of antigen-specific binding. It is therefore an important virulence protein. Members of this family are found also in Mycoplasma pneumoniae, M. penetrans, M. gallisepticum, and M. iowae. Model TIGR04524 describes a region within this protein that is shared by many additional Mycoplasma and Ureaplasma proteins. [Cellular processes, Pathogenesis]	0
276722	cl22768	TIGR04562	TIGR04562 family protein. Members of this family are bacterial proteins, roughly 400 amino acids in length. Most members belong to the Deltaproteobacteria. All members of the Myxococcales, an order within the Deltaproteobacteria, have a member. The arrangement of conserved residues into invariant motifs suggests enzymatic activity. The function is unknown.	0
419861	cl22808	DUF5840	Family of unknown function (DUF5840). Members of this protein family occur primarily in Cyanobacteria. They average about 50 residues in length and are the ribosomally translated precursors of peptide natural products whose modifications include cleavage, cyclization, and prenylation. Sequences are well-conserved in the N-terminal region. They are nearly invariant over the last eight residues, but hypervariable just before that stretch. A related family, often in a similar genome context, is TIGR03678.	0
419862	cl22817	DUF4842	Domain of unknown function (DUF4842). This domain is abundant in the Leptospira, in Bacteroides, and in Vibrio (three widely separated lineages). Most members have plausible lipoprotein signal peptides, including lipoprotein LruC from Leptospira interrogans and BACOVA_00967, from Bacteroides ovatus, with a solved crystal structure. Note that the C-terminal region of pfam13448 (length 83) matches the N-terminal region of some members of this domain (length 243).	0
419863	cl22825	HEPN_AbiV	AbiV. This family includes AbiV (abortive infection system V) from Lactococcus lactis, a phage resistance protein that causes certain phage infections to fail to lead to successful phage replication. Abortive infection mechanisms differ greatly. AbiV interacts directly with the protein SaV in phage p2 and blocks translation of phage proteins.	0
419864	cl22834	CDPS	Cyclodipeptide synthase. Members of this family take two aminoacylated tRNA molecules and produce a cyclic dipeptide with two peptide bonds. This enzyme therefore produces a type of nonribosomal peptide, but by a mechanism entirely different from the typical non-ribosomal peptide synthase (NRPS) that relies on adenylation to activate amino acids. Three characterized members of this family are the cyclodityrosine synthase of Mycobacterium tuberculosis (an essential gene), a cyclo(L-Phe-L-Leu) synthase from Streptomyces noursei involved in natural product biosynthesis, and cyclodileucine synthase YvmC from Bacillus licheniformis. Many cyclodipeptide synthases are found next to a cytochrome P450 that further modifies the product.	0
419865	cl22837	Peptidase_Mx1	Putative zinc-binding metallo-peptidase. Members of this family are lipoproteins with the typical zinc metallohydrolase HExxH motif and additional similarities to a better-documented zinc peptidase family, pfam06167. The seed alignment begins immediately after the lipoprotein motif Cys residue. Up to five members of this protein family occur per genome, in the context of certain gene pairs related to RagA and RagB, or to SusC and SusD. Those gene pairs, like the present family, are restricted to the Bacteroidetes, may number up to 100 pairs per genome, and are linked to TonB-dependent uptake of biopolymer-derived nutrients such as glycans. A possible function for this lipoprotein is to hydrolyse larger molecules to prepare substrates for import and utilization. [Unknown function, Enzymes of unknown specificity]	0
419866	cl22850	RGL11	Rhamnogalacturonan lyase of the polysaccharide lyase family 11. This is the beta-sheet domain found in rhamnogalacturonan (RG) lyases, which are responsible for an initial cleavage of the RG type I (RG-I) region of plant cell wall pectin. Polysaccharide lyase family 11 carrying this domain, such as YesW (EC:4.2.2.23) and YesX (EC:4.2.2.24), cleave glycoside bonds between rhamnose and galacturonic acid residues in RG-I through a beta-elimination reaction. Other family members carrying this domain are hemagglutinin A, lysine gingipain (Kgp) and Chitinase C (EC:3.2.1.14).	0
419867	cl22851	PHD_SF	PHD finger superfamily. PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3.	0
419868	cl22853	Motor_domain	Myosin and Kinesin motor domain. Myosin motor domain of cardiac muscle, beta myosin heavy chain 7b (also called KIAA1512, dJ756N5.1, MYH14, MHC14). MYH7B is a slow-twitch myosin. Mutations in this gene result in one form of autosomal dominant hearing impairment. Multiple transcript variants encoding different isoforms have been found for this gene. Class II myosins, also called conventional myosins, are the myosin type responsible for producing actomyosin contraction in metazoan muscle and non-muscle cells. Myosin II contains two heavy chains made up of the head (N-terminal) and tail (C-terminal) domains with a coiled-coil morphology that holds the two heavy chains together. The intermediate neck domain is the region creating the angle between the head and tail. It also contains 4 light chains which bind the heavy chains in the "neck" region between the head and tail. The head domain is a molecular motor, which utilizes ATP hydrolysis to generate directed movement toward the plus end along actin filaments. Class-II myosins are regulated by phosphorylation of the myosin light chain or by binding of Ca2+. A cyclical interaction between myosin and actin provides the driving force. Upon ATP binding, the myosin head dissociates from an actin filament. ATP hydrolysis causes the head to pivot and associate with a new actin subunit. The release of Pi causes the head to pivot and move the filament (power stroke). Release of ADP completes the cycle. CyMoBase classifications were used to confirm and identify the myosins in this hierarchy.	0
419869	cl22854	HTH_XRE	N/A. YdaS_antitoxin is a family of putative bacterial antitoxins, neutralising the toxin YdaT, family pfam06254.	0
419870	cl22855	TNFRSF	Tumor necrosis factor receptor superfamily (TNFRSF). This family of proteins is found in eukaryotes. Proteins in this family are typically between 129 and 184 amino acids in length. This is the stn_TNFRSF12A_TNFR domain from the tumor necrosis factor receptor. The function of this domain is unknown.	0
419871	cl22856	SNARE	SNARE motif. This entry is of a family of proteins all approximately 300 residues in length. The proteins have a single C-terminal trans-membrane domain and a SNARE [soluble NSF (N-ethylmaleimide-sensitive fusion protein) attachment protein receptor] domain of approximately 60 residues. The SNARE domains are essential for membrane fusion and are conserved from yeasts to humans. Use1 is one of the three protein subunits that make up the SNARE complex and it is specifically required for Golgi-endoplasmic reticulum retrograde transport.	0
419872	cl22860	PEPCK_HprK	N/A. This family represents the C terminal kinase domain of Hpr Serine/threonine kinase PtsK. This kinase is the sensor in a multicomponent phosphorelay system in control of carbon catabolic repression in bacteria. This kinase is unusual in that it recognizes the tertiary structure of its target and is a member of a novel family unrelated to any previously described protein phosphorylating enzymes. X-ray analysis of the full-length crystalline enzyme from Staphylococcus xylosus at a resolution of 1.95 A shows the enzyme to consist of two clearly separated domains that are assembled in a hexameric structure resembling a three-bladed propeller.	0
419873	cl22861	LamG	N/A. This domain belongs to the Concanavalin A-like lectin/glucanases superfamily.	0
354965	cl22863	Str_synth	Strictosidine synthase. This family consists of arylesterases (Also known as serum paraoxonase) EC:3.1.1.2. These enzymes hydrolyze organophosphorus esters such as paraoxon and are found in the liver and blood. They confer resistance to organophosphate toxicity. Human arylesterase (PON1) is associated with HDL and may protect against LDL oxidation.	0
419874	cl22867	Sigma70_r4	N/A. Region 4 of sigma-70 like sigma-factors are involved in binding to the -35 promoter element via a helix-turn-helix motif.	0
419876	cl22877	Autotransporter	Autotransporter beta-domain. Secretion of protein products occurs by a number of different pathways in bacteria. One of these pathways known as the type IV pathway was first described for the IgA1 protease. The protein component that mediates secretion through the outer membrane is contained within the secreted protein itself, hence the proteins secreted in this way are called autotransporters. This family corresponds to the presumed integral membrane beta-barrel domain that transports the protein. This domain is found at the C-terminus of the proteins it occurs in. The N-terminus contains the variable passenger domain that is translocated across the membrane. Once the passenger domain is exported it is cleaved auto-catalytically in some proteins, in others a different peptidase is used and in some cases no cleavage occurs.	0
419877	cl22881	DNA_processg_A	DNA recombination-mediator protein A. This family consists of several hypothetical bacterial proteins of around 180 residues in length. The function of this family is unknown.	0
419878	cl22882	S-methyl_trans	Homocysteine S-methyltransferase. homocysteine S-methyltransferase	0
419879	cl22885	TAF9	TATA Binding Protein (TBP) Associated Factor 9 (TAF9) is one of several TAFs that bind TBP and is involved in forming Transcription Factor IID (TFIID) complex. This domain is predicted to bind DNA and is often found associated with pfam00439 and in transcription factors. It has a histone-like fold.	0
419880	cl22886	GGCT_like	N/A. GGACT, gamma-glutamylamine cyclotransferase, is a ubiquitous enzyme found in bacteria, plants, and metazoans from Dictyostelium through to humans. It converts gamma-glutamylamines to free amines and 5-oxoproline.	0
328943	cl22894	Mem_trans	Membrane transport protein. [Transport and binding proteins, Other]	0
419882	cl22895	HTH_8	Bacterial regulatory protein, Fis family. 	0
419883	cl22897	TPR_1	Tetratricopeptide repeat. This Pfam entry includes outlying Tetratricopeptide-like repeats (TPR) that are not matched by pfam00515.	0
419884	cl22899	UTRA	UTRA domain. It has a similar fold to HutC/FarR-like bacterial transcription factors of the GntR family. It is believed to modulate activity of bacterial transcription factors in response to binding small molecules.	0
419885	cl22901	RHH_1	Ribbon-helix-helix protein, copG family. ParD is the antitoxin of a bacterial toxin-antitoxin gene pair. The cognate toxin is ParE in, pfam05016. The family contains several related antitoxins from Cyanobacteria, Proteobacteria and Actinobacteria. Antitoxins of this class carry an N-terminal ribbon-helix-helix domain, RHH, that is highly conserved across all type II bacterial antitoxins, which dimerizes with the RHH domain of a second VapB molecule. A hinge section follows the RHH, with an additional pair of flexible alpha helices at the C-terminus. This C-terminus is the toxin-binding region of the dimer, and so is specific to the cognate toxin, whereas the RHH domain has the specific function of lying across the RNA-binding groove of the toxin dimer and inactivating the active-site - a more general function of all type II antitoxins.	0
419886	cl22902	MdcG	Phosphoribosyl-dephospho-CoA transferase MdcG. Malonate decarboxylase, like citrate lyase, has a unique acyl carrier protein subunit with a prosthetic group derived from, and distinct from, coenzyme A. Members of this protein family are the phosphoribosyl-dephospho-CoA transferase specific to the malonate decarboxylase system. This enzyme can also be designated holo-ACP synthase (2.7.7.61). The corresponding component of the citrate lyase system, CitX, shows little or no sequence similarity to this family. [Energy metabolism, Other]	0
419887	cl22903	Arrestin_N	Arrestin (or S-antigen), N-terminal domain. Vacuolar protein sorting-associated protein (Vps) 26 is one of around 50 proteins involved in protein trafficking. In particular, Vps26 assembles into a retromer complex with at least four other proteins Vps5, Vps17, Vps29 and Vps35. This family also contains Down syndrome critical region 3/A.	0
419888	cl22904	CARDB	CARDB. The English-language version of the first reference can be found on pages 388-399 of the above. This domain has been named NEW3 but its actual function is not known. It is found on proteins which are bacterial galactosidases. The domain is associated with the NPCBM family, pfam08305, a novel putative carbohydrate binding module found at the N-terminus of glycosyl hydrolases.	0
389966	cl22907	zf-U1	U1 zinc finger. Family of C2H2-type zinc fingers, present in matrin, U1 small nuclear ribonucleoprotein C and other RNA-binding proteins.	0
419889	cl22912	CsbD	CsbD-like. hypothetical protein; Provisional	0
419890	cl22913	DUF1255	Protein of unknown function (DUF1255). This family consists of several conserved hypothetical bacterial proteins of around 95 residues in length. The function of this family is unknown	0
419891	cl22917	PNTB	NAD(P) transhydrogenase beta subunit. This family corresponds to the beta subunit of NADP transhydrogenase in prokaryotes, and either the protein N- or C terminal in eukaryotes. The domain is often found in conjunction with pfam01262. Pyridine nucleotide transhydrogenase catalyzes the reduction of NAD+ to NADPH. A complete loss of activity occurs upon mutation of Gly314 in E. coli.	0
419892	cl22918	AnmK	Anhydro-N-acetylmuramic acid kinase. anhydro-N-acetylmuramic acid kinase; Reviewed	0
419893	cl22919	Tfb4	Transcription factor Tfb4. All proteins in this family are part of the TFIIH complex which is involved in the initiation of transcription and nucleotide excision repair. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	0
419895	cl22923	DUF2268	Predicted Zn-dependent protease (DUF2268). This domain, found in various hypothetical bacterial proteins, as well as predicted zinc dependent proteases, has no known function.	0
419896	cl22924	7TM_GPCR_Str	Serpentine type 7TM GPCR chemoreceptor Str. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srd is part of the larger Str superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'.	0
419897	cl22925	SLBB	SLBB domain. This family consists of the C-terminal domain of several bacterial Na(+)-translocating NADH-quinone reductase subunit A (NQRA) proteins. The Na(+)-translocating NADH: ubiquinone oxidoreductase (Na(+)-NQR) generates an electrochemical Na(+) potential driven by aerobic respiration.	0
389977	cl22931	TAF12	TATA Binding Protein (TBP) Associated Factor 12. The TATA Binding Protein (TBP) Associated Factor 12 (TAF12; also known as TAF2J or TAFII20) is one of several TAFs that bind TBP and is involved in forming the Transcription Factor IID (TFIID) complex. TFIID is one of several General Transcription Factors (GTFs), which also include TFIIA, TFIIB, TFIIE, TFIIF and TFIIH, that are involved in the accurate initiation of transcription by RNA polymerase II in eukaryotes. TFIID plays an important role in the recognition of promoter DNA and in the assembly of the pre-initiation complex (PIC). The TFIID complex is composed of the TBP and at least 13 TAFs which specifically interact with a variety of core promoter DNA sequences. TAFs are named after their electrophoretic mobility in polyacrylamide gels in different species. A unified and systematic nomenclature has been adopted for the pol II TAFs to show the relationship between TAF orthologs and paralogs. Several hypotheses are proposed for TAFs function such as serving as activator-binding sites, core-promoter recognition, or a role in essential catalytic activity. These TAFs, with the help of specific activators, are required only for expression of a subset of genes and are not universally involved for transcription as are GTFs. In yeast and human cells, TAFs have been found as components of other complexes besides TFIID. Several TAFs interact via histone-fold (HFD) motifs; the HFD is the interaction motif involved in heterodimerization of the core histones and their assembly into nucleosome octamers. The minimal HFD contains three alpha-helices linked by two loops and is found in core histones, TAFs and many other transcription factors. TFIID has a histone octamer-like substructure. TAF12 interacts with TAF4 and makes a novel histone-like heterodimer that binds DNA and has a core promoter function of a subset of genes. It is important for RAS-induced transformation properties of human colorectal cancer cells; its levels are increased in the cells harboring the RAS mutation. Also, TAF12 interacts with activating transcription factor 7 (ATF7) and contributes to the hypersensitivity of osteoclast (OCL) precursors to 1,25-dihydroxyvitamin D3 (1,25-(OH)2D3; also known as calcitriol) in Paget's disease (PD), a disorder of the bone remodeling process, in which the body absorbs old bone and forms abnormal new bone.	0
419899	cl22933	Spo0M	Spo0M protein. This family consists of several bacterial Spo0M proteins which are thought to control sporulation in Bacillus subtilis. Spo0M exerts certain negative effects on sporulation and its gene expression is controlled by sigmaH.	0
389979	cl22934	DUF1699	Protein of unknown function (DUF1699). This family contains many archaeal proteins which have very conserved sequences.	0
419900	cl22935	GP11	GP11 baseplate wedge protein. baseplate wedge subunit and tail pin; Provisional	0
419901	cl22936	DUF2089	Protein of unknown function (DUF2089). This domain, found in various hypothetical prokaryotic proteins, has no known function. This domain is a zinc-ribbon.	0
304651	cl22942	TOMM_pelo	NHLP leader peptide domain. This model recognizes a number of type 2 lantibiotic-type bacteriocins, including mersacidin and lichenicidin. Members often are found as gene pairs encoding two-chain bacteriocins. Maturation is accomplished, at least in part, by a LanM-type enzyme (TIGR03897). This model describes only the leader peptide region. [Cellular processes, Toxin production and resistance]	0
389982	cl22943	DUF3693	Phage related protein. hypothetical protein	0
304653	cl22944	COG5510	Predicted small secreted protein  [Function unknown]. 	0
419903	cl22948	FeoC	FeoC like transcriptional regulator. This family contains several transcriptional regulators, including FeoC, which contain a HTH motif. FeoC acts as a [Fe-S] dependent transcriptional repressor.	0
419905	cl22951	Vps51	Vps51/Vps67. The COG complex, the peripheral membrane oligomeric protein complex involved in intra-Golgi protein trafficking, consists of eight subunits arranged in two lobes bridged by Cog1. Cog5 is in the smaller, B lobe, bound in with Cog6-8, and is itself bound to Cog1 as well as, strongly, to Cog7.	0
419906	cl22952	Pou	Pou domain - N-terminal to homeobox domain. 	0
419907	cl22953	RCC_reductase	Red chlorophyll catabolite reductase (RCC reductase). red chlorophyll catabolite reductase	0
419908	cl22958	Agenet	Agenet domain. Domain in plant sequences with possible chromatin-associated functions.	0
419909	cl22959	VRR_NUC	VRR-NUC domain. It is associated with members of the PD-(D/E)XK nuclease superfamily, which include the type III restriction modification enzymes, for example StyLTI.	0
419910	cl22960	T4_gp9_10	Bacteriophage T4 gp9/10-like protein. baseplate wedge tail fiber connector; Provisional	0
419911	cl22961	RNA_Me_trans	Predicted SAM-dependent RNA methyltransferase. This family of proteins are predicted to be alpha/beta-knot SAM-dependent RNA methyltransferases.	0
277679	cl22964	COG3905	Predicted transcriptional regulator  [Transcription]. 	0
419912	cl22966	DUF1330	Domain of unknown function (DUF1330). This family consists of several hypothetical bacterial proteins of around 90 residues in length. The function of this family is unknown.	0
419913	cl22970	Sulfotransfer_2	Sulfotransferase family. This family consists of several mammalian galactose-3-O-sulfotransferase proteins. Gal-3-O-sulfotransferase is thought to play a critical role in 3'-sulfation of N-acetyllactosamine in both O- and N-glycans.	0
419914	cl22974	HpaP	Type III secretion protein (HpaP). This family of genes is always found in type III secretion operons, although its function in the processes of secretion and virulence is unclear. Hpa stands for Hrp-associated gene, where Hrp stands for hypersensitivity response and virulence.	0
419917	cl22978	rve_3	Integrase core domain. 	0
277695	cl22980	Csx12	CRISPR/Cas system-associated protein Cas9. Members of this family of CRISPR-associated (cas) protein are found, so far, in CRISPR/cas loci in Wolinella succinogenes DSM 1740, Legionella pneumophila str. Paris, and Francisella tularensis, where the last probably is an example of a degenerate CRISPR locus, having neither repeats nor a functional Cas1. The characteristic repeat length is 37 base pairs and period is about 72. One region of this large protein shows sequence similarity to pfam01844, HNH endonuclease.	0
419933	cl23554	DUF968	Protein of unknown function (DUF968). REF is a family of P1-like phage RecA-dependent nucleases. It does not appear to act as a positive RecA regulator. It is a new kind of enzyme, a RecA-dependent nuclease.	0
355006	cl23634	EFh_HEF	EF-hand, calcium binding motif, found in the hexa-EF hand proteins family. CBN, the product of the cbn gene, is a Drosophila homolog to vertebrate neuronal six EF-hand calcium binding proteins. It is expressed through most of ontogenesis with a selective distribution in the nervous system and in a few small adult thoracic muscles. Its precise biological role remains unclear. CBN contains six EF-hand motifs, but some of them may not bind calcium ions due to the lack of key residues.	0
419953	cl23654	DUF4322	Domain of unknown function (DUF4322). This family contains transposases from the insertion element ISH3, and related transposases from other mobile elements with similar transposases. This model reproduces the classification from ISFinder except for ISC1439B-like transposases, since those are extremely different.	0
419954	cl23655	DUF4343	Domain of unknown function (DUF4343). Family of ATP-grasp enzymes belonging to the R2K clade, wherein one of the absolutely-conserved lysine residues has migrated to the RAGYNA domain which is a part of the core ATP-grasp module. This family is predicted to catalyze peptide ligation reactions on protein substrates in biological conflict contexts, probably between bacteriophages and their hosts.	0
419960	cl23716	metallo-hydrolase-like_MBL-fold	mainly hydrolytic enzymes and related proteins which carry out various biological functions; MBL-fold metallohydrolase domain. This family is part of the metallo-beta-lactamase superfamily.	0
419961	cl23717	crotonase-like	N/A. This family contains a diverse set of enzymes including: enoyl-CoA hydratase, naphthoate synthase, carnitate racemase, 3-hydroxybutyryl-CoA dehydratase and dodecanoyl-CoA delta-isomerase. This family differs from pfam00378 in the structure of its C-terminus.	0
419962	cl23718	ALP_like	alkaline phosphatases and sulfatases. This family is a member of the Alkaline phosphatase clan.	0
419963	cl23719	NAD_binding_1	Oxidoreductase NAD-binding domain. Xanthine dehydrogenases, that also bind FAD/NAD, have essentially no similarity.	0
419964	cl23720	RILP-like	Rab interacting lysosomal protein-like 1 and 2 (Rilpl1 and Rilpl2). CEP290 and similar centrosomal proteins carry a number of coiled-coil regions, and this is the fifth along the length of the protein. It is thought that the proteins are involved in cilia biosynthesis.	0
419965	cl23721	AP2Ec	N/A. This family consists of several bacterial L-rhamnose isomerase proteins (EC:5.3.1.14).	0
419966	cl23723	Cytochrome_b_N	N/A. This presumed domain is functionally uncharacterized. This domain family is found in bacteria and archaea, and is approximately 50 amino acids in length. There are two conserved histidines that may be functionally important. This family is N-terminally truncated compared to other members of the clan.	0
419967	cl23724	PHP	Polymerase and Histidinol Phosphatase domain. This protein is part of the RNase P complex that is involved in tRNA maturation.	0
419968	cl23725	Glyco_hydro	Glycosyl hydrolases. A family of putative cellulases.	0
419971	cl23728	ATP_bind_2	P-loop ATPase protein family. This family contains an ATP-binding site and could be an ATPase (personal obs:C Yeats).	0
419972	cl23729	SdiA-regulated	SdiA-regulated. This family represents a conserved region approximately within a number of hypothetical bacterial proteins that may be regulated by SdiA, a member of the LuxR family of transcriptional regulators. Some family members contain the pfam01436 repeat.	0
419973	cl23730	F5_F8_type_C	F5/8 type C domain. This family, of around 200 residues, is located at the C-terminus of some uncharacterized proteins in various Bacteroides and Parabacteroides species. The function of this family remains unknown.	0
419975	cl23733	FliJ	Flagellar FliJ protein. 	0
419976	cl23735	H4	N/A. This family includes archaebacterial histones and histone like transcription factors from eukaryotes.	0
419978	cl23739	HCP_like	N/A. This family includes both hybrid-cluster proteins and the beta chain of carbon monoxide dehydrogenase. The hybrid-cluster proteins contain two Fe/S centers - a [4Fe-4S] cubane cluster, and a hybrid [4Fe-2S-2O] cluster. The physiological role of this protein is as yet unknown, although a role in nitrate/nitrite respiration has been suggested. The prismane protein from Escherichia coli was shown to contain hydroxylamine reductase activity (NH2OH + 2e + 2 H+ -> NH3 + H2O). This activity is rather low. Hydroxylamine reductase activity was also found in CO-dehydrogenase in which the active site Ni was replaced by Fe. The CO dehydrogenase contains a Ni-3Fe-2S-3O centre.	0
419981	cl23744	Peptidase_C1	N/A. This family is closely related to the Peptidase_C1 family pfam00112, containing several prokaryotic and eukaryotic aminopeptidases and bleomycin hydrolases.	0
419983	cl23746	Xan_ur_permease	Permease family. MFS_MOT1 is a family of molybdenate transporters. Molybdenum is an essential element that is taken up into the cell in the oxyanion molybdate. Molybdenum is used in the form of molybdopterin-cofactor, which participates in the active site of enzymes involved in key reactions of carbon, nitrogen, and sulfur metabolism.	0
419984	cl23747	UPF0182	Uncharacterized protein family (UPF0182). hypothetical protein; Provisional	0
419985	cl23748	DUF3585	Protein of unknown function (DUF3585). This family consists of several eukaryotic proteins. Suppressor of IKBKE 1 (SIKE) is a physiological suppressor of IKK-epsilon and TBK1, which are two IKK-related kinases involved in virus- and TLR3-triggered activation of interferon regulatory factor 3 (IRF-3). Other members of this family are circulating cathodic antigen (CCA), found in Schistosoma mansoni (Blood fluke), and FGFR1 oncogene partner 2, which may be involved in wound healing pathway.	0
419986	cl23749	TIR_2	TIR domain. This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and viruses. Proteins in this family are typically between 98 and 145 amino acids in length.	0
419987	cl23750	vATP-synt_E	ATP synthase (E/31 kDa) subunit. V-type ATP synthase subunit E; Provisional	0
329056	cl23751	Plasmodium_Vir	Plasmodium vivax Vir protein. variable surface protein Vir32; Provisional	0
419988	cl23752	Cytochrom_C3	Heme-binding domain of the class III cytochrome C family and related proteins. This family includes cytochromes c7 and c7-type. In cytochromes c7 all three haems are bis-His co-ordinated. In c7-type the last haem is His-Met co-ordinated.	0
419990	cl23754	EamA	EamA-like transporter family. This region is found in proteins related to Plasmodium falciparum chloroquine resistance transporter (CRT).	0
304914	cl23757	OCRE	OCRE domain. RBM5 is also called protein G15, H37, putative tumor suppressor LUCA15, or renal carcinoma antigen NY-REN-9. It is a known modulator of apoptosis. It acts as a tumor suppressor or an RNA splicing factor. RBM5 shows high sequence similarity to RNA-binding protein 6 (RBM6 or NY-LU-12 or g16 or DEF-3). Both of them specifically bind poly(G) RNA. RBM5 contains two N-terminal RNA recognition motifs (RRMs), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), an OCtamer REpeat (OCRE) domain, two C2H2-type zinc fingers, a nuclear localization signal, and a G-patch/D111 domain.	0
419993	cl23759	GT1	GT1, myb-like, SANT family. This presumed domain appears to be related to other Myb/SANT like DNA binding domains. This family is greatly expanded in arthropods and higher eukaryotes.	0
329062	cl23762	TK	Thymidine kinase. thymidine kinase; Provisional	0
419994	cl23766	fungal_TF_MHR	fungal transcription factor regulatory middle homology region. Cep3 is one of the major components of the CBF3. It dimerizes and in so doing forms a large central channel that is large enough to accommodate duplex B-form DNA. The dimerization region is followed by a linker to the zinc-finger domain at the C-terminus. The CBF3 complex is an essential core component of the budding yeast kinetochore and is required for the centromeric localization of all other kinetochore proteins. Cep3 is the only component with DNA-binding properties.	0
419995	cl23768	ENDO3c	N/A. This family contains a diverse range of structurally related DNA repair proteins. The superfamily is called the HhH-GPD family after its hallmark Helix-hairpin-helix and Gly/Pro rich loop followed by a conserved aspartate. This includes endonuclease III, EC:4.2.99.18 and MutY an A/G-specific adenine glycosylase, both have a C terminal 4Fe-4S cluster. The family also includes 8-oxoguanine DNA glycosylases. The methyl-CPG binding protein MBD4 also contains a related domain that is a thymine DNA glycosylase. The family also includes DNA-3-methyladenine glycosylase II EC:3.2.2.21 and other members of the AlkA family.	0
419996	cl23770	FliH	Flagellar assembly protein FliH. This family consists of several nodulation protein NolV sequences from different Rhizobium species. The function of this family is unclear.	0
419997	cl23771	Big_1	Bacterial Ig-like domain (group 1). This family consists of bacterial domains with an Ig-like fold.	0
304931	cl23774	TAF	TATA box binding protein associated factor (TAF). TAFs (TATA box binding protein associated factors) are part of the transcription initiation factor TFIID multimeric protein complex. TFIID is composed of the TATA box binding protein (TBP) and a number of TAFs. The TAFs provide binding sites for many different transcriptional activators and co-activators that modulate transcription initiation by Pol II. TAF proteins adopt a histone-like fold.	0
355042	cl23776	EFP_modif_epmB	EF-P beta-lysylation protein EpmB. Members of this family are arginine 2,3-aminomutase, a radical SAM enzyme more closely related to lysine 2,3-aminomutase than to glutamate 2,3-aminomutase. The enzyme makes L-beta-arginine, sometimes in the context of antibiotic biosynthesis (blasticidin S, mildiomycin, etc). Activity is proven in Streptomyces griseochromogenes, which makes blasticidin S.	0
304934	cl23777	PRK01005	N/A. V-type ATP synthase subunit E; Provisional	0
420000	cl23778	LpxK	Tetraacyldisaccharide-1-P 4'-kinase. Also called lipid-A 4'-kinase. This essential gene encodes an enzyme in the pathway of lipid A biosynthesis in Gram-negative organisms. A single copy of this protein is found in Gram-negative bacteria. PSI-BLAST converges on this set of apparent orthologs without identifying any other homologs. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	0
420001	cl23779	MethyltransfD12	D12 class N6 adenine-specific DNA methyltransferase. All proteins in this family for which functions are known are DNA-adenine methyltransferases. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). The DNA adenine methylase (dam) of E. coli and related species is instrumental in distinguishing the newly synthesized strand during DNA replication for methylation-directed mismatch repair. This family includes several phage methylases and a number of different restriction enzyme chromosomal site-specific modification systems. [DNA metabolism, DNA replication, recombination, and repair]	0
420002	cl23780	PepSY_TM	PepSY-associated TM region. This is a family of bacterial proteins with three PepSY-like TM regions.	0
420003	cl23781	Fascin	N/A. This family consists of several eukaryotic fascin or singed proteins. The fascins are a structurally unique and evolutionarily conserved group of actin cross-linking proteins. Fascins function in the organisation of two major forms of actin-based structures: dynamic, cortical cell protrusions and cytoplasmic microfilament bundles. The cortical structures, which include filopodia, spikes, lamellipodial ribs, oocyte microvilli and the dendrites of dendritic cells, have roles in cell-matrix adhesion, cell interactions and cell migration, whereas the cytoplasmic actin bundles appear to participate in cell architecture. Dictyostelium hisactophilin, another actin-binding protein, is a submembranous pH sensor that signals slight changes of the H+ concentration to actin by inducing actin polymerization and binding to microfilaments only at pH values below seven. Members of this family are histidine rich, typically contain the repeated motif of HHXH.	0
420004	cl23783	PsbQ	Oxygen evolving enhancer protein 3 (PsbQ). This protein through the member sll1638 from Synechocystis sp. PCC 6803, was shown to be part of the cyanobacteria photosystem II. It is homologous to (but quite diverged from) the chloroplast PsbQ protein, called oxygen-evolving enhancer protein 3 (OEE3). We designate this cyanobacteria protein PsbQ by homology. [Energy metabolism, Photosynthesis]	0
420005	cl23784	RICIN	N/A. This family of serine protease inhibitors has a beta-trefoil fold and inhibits trypsin and chymotrypsin.	0
304945	cl23788	Met_repressor_MetJ	N/A. Met Repressor, MetJ.  MetJ is a bacterial regulatory protein that uses S-adenosylmethionine (SAM) as a corepressor to regulate the production of Methionine.  MetJ binds arrays of two to five adjacent copies of an eight base-pair 'metbox' sequence.  MetJ forms sufficiently strong interactions with the sugar-phosphate backbone to accommodate sequence variation in natural operators. However, it is very sensitive to particular base changes in the operator. MetJ exists as a homodimer.	0
355048	cl23789	PLN02481	N/A. spermidine hydroxycinnamoyl transferase; Provisional	0
420009	cl23790	Auxin_inducible	Auxin responsive protein. uncharacterized protein; Provisional	0
420010	cl23791	UPF0154	Uncharacterized protein family (UPF0154). hypothetical protein; Provisional	0
420011	cl23792	DUF2129	Uncharacterized protein conserved in bacteria (DUF2129). hypothetical protein; Provisional	0
420012	cl23793	PsbH	Photosystem II 10 kDa phosphoprotein. photosystem II reaction center protein H; Provisional	0
420013	cl23795	Mntp	Putative manganese efflux pump. This protein family was identified, at the time of the publication of the Carboxydothermus hydrogenoformans genome, as having a phylogenetic profile that exactly matches the subset of the Firmicutes capable of forming endospores. The species include Bacillus anthracis, Clostridium tetani, Thermoanaerobacter tengcongensis, Geobacillus kaustophilus, etc. This protein, previously named YtaF, is therefore a putative sporulation protein. [Cellular processes, Sporulation and germination]	0
390093	cl23796	DUF1120	Protein of unknown function (DUF1120). hypothetical protein; Provisional	0
420014	cl23797	LPP	Lipoprotein leucine-zipper. This leucine-zipper is found in the enterobacterial outer membrane lipoprotein LPP. It is likely that this domain oligomerizes and is involved in protein-protein interactions. As such it is a bundle of alpha-helical coiled-coils, which are known to play key roles in mediating specific protein-protein interactions in molecular recognition and the assembly of multi-protein complexes.	0
420015	cl23798	CBM53	Starch/carbohydrate-binding module (family 53). CBM26 is a carbohydrate-binding module that binds starch.	0
420016	cl23799	MgtE_N	MgtE intracellular N domain. This is the N-terminal domain of the flagellar rotor protein FliG.	0
420017	cl23800	Creatinase_N	Creatinase/Prolidase N-terminal domain. This domain is structurally very similar to the creatinase N-terminal domain (pfam01321). However, little or no sequence similarity exists between the two families.	0
420019	cl23802	Peptidase_C48	Ulp1 protease family, C-terminal catalytic domain. Protease specific for SMALL UBIQUITIN-RELATED MODIFIER (SUMO); Provisional	0
420020	cl23804	CAF1	CAF1 family ribonuclease. The major pathways of mRNA turnover in eukaryotes initiate with shortening of the polyA tail. CAF1 encodes a critical component of the major cytoplasmic deadenylase in yeast. Caf1p is required for normal mRNA deadenylation in vivo and localizes to the cytoplasm. Caf1p copurifies with a Ccr4p-dependent polyA-specific exonuclease activity. Some members of this family include an inserted RNA binding domain pfam01424. This family of proteins is related to other exonucleases pfam00929 (Bateman A pers. obs.). The crystal structure of Saccharomyces cerevisiae Pop2 has been resolved at 2.3 Angstrom resolution.	0
420021	cl23805	ABC_transp_aux	ABC-type uncharacterized transport system. Members of this protein family are exclusive to the Bacteroidetes phylum (previously Cytophaga-Flavobacteria-Bacteroides). GldG is a protein linked to a type of rapid surface gliding motility found in certain Bacteroidetes, such as Flavobacterium johnsoniae and Cytophaga hutchinsonii. Knockouts of GldG abolish the gliding phenotype. GldG, along with GldA and GldF are believed to compose an ABC transporter and are observed as an operon. Gliding motility appears closely linked to chitin utilization in the model species Flavobacterium johnsoniae. Bacteroidetes with members of this protein family appear to have all of the genes associated with gliding motility.	0
420022	cl23808	TetR_C_11	Bacterial transcriptional repressor C-terminal. This family comprises the C-terminal domain of transcriptional regulators of the TetR family. It includes the AefR transcriptional regulator from P. syringae. It is found in association with pfam00440.	0
420024	cl23811	Glucos_trans_II	Glucosyl transferase GtrII. O-antigen conversion protein C	0
420028	cl23815	Glyco_hydro_30	Glycosyl hydrolase family 30 TIM-barrel domain. 	0
304973	cl23816	CSF2	N/A. GM-CSF stimulates the development of and the cytotoxic activity of white blood cells.	0
420029	cl23817	DUF1146	Protein of unknown function (DUF1146). Members of this protein family are small, typically about 80 residues in length, and are highly hydrophobic. The gene is found so far only in a subset of the Firmicutes in association with genes of the ATP synthase F1 complex or NADH-quinone oxidoreductase. This family includes ywzB from Bacillus subtilis; pfam06612 describes the same family as Protein of unknown function DUF1146.	0
304975	cl23818	COG4020	Uncharacterized protein [Function unknown]. Members of this protein family, to date, are found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it. [Energy metabolism, Methanogenesis]	0
420030	cl23820	PSI_PsaJ	Photosystem I reaction centre subunit IX / PsaJ. photosystem I reaction center subunit IX; Provisional	0
420031	cl23821	GSP_synth	Glutathionylspermidine synthase preATP-grasp. glutathionylspermidine synthase domain-containing protein	0
420032	cl23822	TCP	TCP family transcription factor. Protein TCP2; Provisional	0
420033	cl23823	RALF	Rapid ALkalinization Factor (RALF). rapid alkalinization factor 23-like protein; Provisional	0
420034	cl23824	PetG	Cytochrome B6-F complex subunit 5. cytochrome b6-f complex subunit PetG; Reviewed	0
420035	cl23825	UPF0223	Uncharacterized protein family (UPF0223). hypothetical protein; Provisional	0
420036	cl23826	Phageshock_PspG	Phage shock protein G (Phageshock_PspG). This protein previously was designated yjbO in E. coli. It is found only in genomes that have the phage shock operon (psp), but only rarely is encoded near other psp genes. The psp regulon is upregulated in response to a number of stress conditions, including ethanol, expression of the filamentous phage secretin protein IV and other secretins, and heat shock. [Cellular processes, Adaptations to atypical conditions]	0
420037	cl23827	Tra_M	TraM mediates signalling between transferosome and relaxosome. The TraM protein is an essential part of the DNA transfer machinery of the conjugative resistance plasmid R1 (IncFII). On the basis of mutational analyses, it was shown that the essential transfer protein TraM has at least two functions. First, a functional TraM protein was found to be required for normal levels of transfer gene expression. Second, experimental evidence was obtained that TraM stimulates efficient site-specific single-stranded DNA cleavage at the oriT, in vivo. Furthermore, a specific interaction of the cytoplasmic TraM protein with the membrane protein TraD was demonstrated, suggesting that the TraM protein creates a physical link between the relaxosomal nucleoprotein complex and the membrane-bound DNA transfer apparatus.	0
420038	cl23828	RMF	Ribosome modulation factor. ribosome modulation factor; Provisional	0
304986	cl23829	PRK15383	type III secretion system effector arginine glycosyltransferase. 	0
355063	cl23830	H2B	Histone H2B. histone H2B; Provisional	0
304988	cl23831	Csm4_III-A	CRISPR/Cas system-associated RAMP superfamily protein Csm4. CRISPR is a term for Clustered Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR associated) proteins. Members of this cas gene family are found in the mtube subtype of CRISPR/cas locus and designated Csm4, for CRISPR/cas Subtype Mtube, protein 4.	0
304989	cl23832	photo_TT_lyase	spore photoproduct lyase. This uncharacterized radical SAM domain protein occurs rarely and sporadically in species that include select Alphaproteobacteria and Actinobacteria, and in Deinococcus deserti VCD115. It is a distant but full-length homolog to the Bacillus subtilis spore photoproduct lyase (spl), which monomerizes thymine dimers created as DNA damage by uv radiation.	0
355064	cl23833	NrfA	Formate-dependent nitrite reductase, periplasmic cytochrome c552 subunit [Inorganic ion transport and metabolism]. Members of this protein family are cytochrome c552, a component of cytochrome c nitrite reductase, which is known more formally as nitrite reductase (cytochrome; ammonia-forming) (EC 1.7.2.2). Nitrate can be reduced by several enzymes. EC 1.7.2.2 reduces nitrite all the way to ammonia, rather than to ammonium hydroxide (nitrite reductase (NAD(P)H), EC 1.7.1.4) or nitric oxide (nitrite reductase (NO-forming), EC 1.7.2.1). Some examples of EC 1.7.2.2 occur in a seven gene system that enables formate-dependent nitrite reduction, but is also found in simpler contexts. Members of this protein family, however, belong to the formate-dependent system. [Energy metabolism, Electron transport]	0
329112	cl23835	Glyco_transf_90	Glycosyl transferase family 90. This family of glycosyl transferases are specifically (mannosyl) glucuronoxylomannan/galactoxylomannan -beta 1,2-xylosyltransferases, EC:2.4.2.-.	0
420039	cl23836	DUF262	Protein of unknown function DUF262. 	0
420040	cl23837	HicB_lk_antitox	HicB_like antitoxin of bacterial toxin-antitoxin system. This family consists of several bacterial HicB related proteins. The function of HicB is unknown although it is thought to be involved in pilus formation. It has been speculated that HicB performs a function antagonistic to that of pili and yet is necessary for invasion of certain niches.	0
420041	cl23838	PrsW-protease	Protease prsW family. PrsW, an intramembrane protease, cleaves the anti-sigma factor RsiW, which regulates the activity of the ECF-type sigma factor SigW.	0
420042	cl23839	DUF496	Protein of unknown function (DUF496). 	0
420043	cl23840	AGE	N/A. This family contains a number of eukaryotic and bacterial N-acylglucosamine 2-epimerase (GlcNAc 2-epimerase) enzymes (EC:5.3.1.8) approximately 500 residues long. This converts N-acyl-D-glucosamine to N-acyl-D-mannosamine.	0
420044	cl23841	DUF411	Protein of unknown function, DUF. The function of the members of this bacterial protein family is unknown. Some members may be involved in conferring cation resistance.	0
420045	cl23842	MatP	MatP N-terminal domain. This family, many of whose members are YcbG, organizes the macrodomain Ter of the chromosome of bacteria such as E coli. In these bacteria, insulated macrodomains influence the segregation of sister chromatids and the mobility of chromosomal DNA. Organisation of the Terminus region (Ter) into a macrodomain relies on the presence of a 13 bp motif called matS repeated 23 times in the 800-kb-long domain. MatS sites are the main targets in the E. coli chromosome of YcbG or MatP (macrodomain Ter protein). MatP accumulates in the cell as a discrete focus that co-localizes with the Ter macrodomain. The effects of MatP inactivation reveal its role as the main organizer of the Ter macrodomain: in the absence of MatP, DNA is less compacted, the mobility of markers is increased, and segregation of the Ter macrodomain occurs early in the cell cycle. A specific organisational system is required in the Terminus region for bacterial chromosome management during the cell cycle. This entry represents the N-terminal domain of MatP.	0
420046	cl23843	DUF839	Bacterial protein of unknown function (DUF839). This family consists of several bacterial proteins of unknown function that contain predicted beta-propeller repeats.	0
305001	cl23844	Ble	Predicted trehalose synthase [Carbohydrate transport and metabolism]. Three pathways for the biosynthesis of trehalose, an osmoprotectant that in some species is also a precursor of certain cell wall glycolipids. Trehalose synthase, TreS, can interconvert maltose and trehalose, but while the equilibrium may favor trehalose, physiological concentrations of trehalose may be much greater than that of maltose and TreS may act largely in its degradation. This model describes a domain found only as a C-terminal fusion to TreS proteins. The most closely related proteins outside this family, Pep2 of Streptomyces coelicolor and Mak1 of Actinoplanes missouriensis, have known maltokinase activity. We suggest this domain acts as a maltokinase and helps drive conversion of trehalose to maltose. [Energy metabolism, Biosynthesis and degradation of polysaccharides]	0
420047	cl23845	DUF2312	Uncharacterized protein conserved in bacteria (DUF2312). hypothetical protein; Provisional	0
420048	cl23846	PhosphMutase	2,3-bisphosphoglycerate-independent phosphoglycerate mutase. Members of this family are found in various bacterial 2,3-bisphosphoglycerate-independent phosphoglycerate mutase enzymes, which catalyze the interconversion of 2-phosphoglycerate and 3-phosphoglycerate in the reaction: [2-phospho-D-glycerate + 2,3-diphosphoglycerate = 3-phospho-D-glycerate + 2,3-diphosphoglycerate].	0
420049	cl23847	UPF0262	Uncharacterized protein family (UPF0262). hypothetical protein; Provisional	0
420050	cl23848	DUF1801	Domain of unknown function (DU1801). This large family of bacterial proteins is uncharacterized. They contain a presumed domain about 110 amino acids in length.	0
420051	cl23849	TrbI	Bacterial conjugation TrbI-like protein. Proteins in this entry are designated TraM and are found in a proposed transfer region of a class of conjugative transposon found in the Bacteroides lineage.	0
420052	cl23850	Plug	TonB-dependent Receptor Plug Domain. This model describes a 31-residue signature region of the SusC/RagA family of outer membrane proteins from the Bacteroidetes. While many TonB-dependent outer membrane receptors are associated with siderophore import, this family seems to include generalized nutrient receptors that may convey fairly large oligomers of protein or carbohydrate. This family occurs in high copy numbers in the most abundant species of the human gut microbiome.	0
420053	cl23851	MHB	Haemophore, haem-binding. Members of this family, including Rv0203 from Mycobacterium tuberculosis, are secreted heme-binding proteins used in heme acquisition. Such proteins are called hemophores. Members have a cleavable N-terminal signal peptide, and a mature region just over 100 amino acids long with a pair of invariant Cys residues. An unrelated hemophore, HasA, occurs in Gram-negative pathogens such as Yersinia pestis. [Transport and binding proteins, Other]	0
420054	cl23853	Condensation	Condensation domain. This family contains a number of alcohol acetyltransferase (EC:2.3.1.84) enzymes approximately 500 residues long found in both bacteria and metazoa. These catalyze the esterification of isoamyl alcohol by acetyl coenzyme A.	0
420055	cl23855	FGE-sulfatase	Sulfatase-modifying factor enzyme 1. This model represents a signature C-terminal region of a distinct clade in the EgtB subfamily, other members of which participate in ergothioneine biosynthesis	0
305014	cl23857	DUF1565	Protein of unknown function (DUF1565). This model represents a tandem pair of an approximately 22-amino acid (each) repeat homologous to the beta-strand repeats that stack in a right-handed parallel beta-helix in the periplasmic C-5 mannuronan epimerase, AlgA, of Pseudomonas aeruginosa. A homology domain consisting of a longer tandem array of these repeats is described in the SMART database as CASH (SM00722), and is found in many carbohydrate-binding proteins and sugar hydrolases. A single repeat is represented by SM00710. This TIGRFAMs model represents a flavor of the parallel beta-helix-forming repeat based on prokaryotic sequences only in its seed alignment, although it also finds many eukaryotic sequences.	0
420056	cl23858	PSCyt1	Planctomycete cytochrome C. This domain contains a potential haem-binding motif, CXXCH. This family is found in association with pfam00034 and pfam03150.	0
420057	cl23859	Peptidase_M10_C	Peptidase M10 serralysin C terminal. This family consists of a number of bacteria specific domains which are found in haemolysin-type calcium binding proteins. This family is found in conjunction with pfam00353 and is often found in multiple copies.	0
420060	cl23862	PmrD	Polymyxin resistance protein PmrD. anti-adapter protein IraM; Provisional	0
420061	cl23863	BrnA_antitoxin	BrnA antitoxin of type II toxin-antitoxin system. CopG antitoxin is a member of a type II toxin-antitoxin system family found in bacteria and archaea. Most antitoxins encoded by the relBE and parDE loci belong to the MetJ/Arc/CopG family of dimeric proteins which bind DNA through N-terminal ribbon-helix-helix (RHH) motifs. The toxin for CopG proteins falls into the family BrnT_toxin, pfam04365.	0
420068	cl23870	PsbX	Photosystem II reaction centre X protein (PsbX). photosystem II protein X; Reviewed	0
305028	cl23871	DUF2560	Protein of unknown function (DUF2560). hypothetical protein	0
420069	cl23872	A_amylase_inhib	Alpha amylase inhibitor. Alpha amylase inhibitor inhibits mammalian alpha-amylases specifically, by forming a tight stoichiometric 1:1 complex with alpha-amylase. The inhibitor has no action on plant and microbial alpha amylases.	0
305030	cl23873	Spider_toxin	Spider neurotoxins including agatoxin, purotoxin and ctenitoxin. This family of spider neurotoxins are thought to be calcium ion channel inhibitors.	0
305031	cl23874	DUF1187	Protein of unknown function (DUF1187). hypothetical protein; Provisional	0
420070	cl23875	MvaI_BcnI	MvaI/BcnI restriction endonuclease family. This family includes the LlaMI (recognizes and cleaves CC^NGG) restriction endonuclease.	0
420071	cl23876	ToxGAP	N/A. GTPase-activating protein (GAP) domain found in bacterial cytotoxins, ExoS, SptP, and YopE. Part of protein secretion system; stimulates Rac1- dependent cytoskeletal changes that promote bacterial internalization.	0
390151	cl23877	Lysin-Sp18	N/A. Egg lysin creates a hole in the envelope of the egg thereby allowing the sperm to pass through the envelope and fuse with the egg.	0
420072	cl23878	C1q	C1q domain. Globular domain found in many collagens and eponymously in complement C1q. When part of full length proteins these domains form a 'bouquet' due to the multimerization of heterotrimers. The C1q fold is similar to that of tumour necrosis factor.	0
390153	cl23879	IL2	Interleukin 2. Interleukin-2 is a cytokine produced by T-helper cells in response to antigenic or mitogenic stimulation. This protein is required for T-cell proliferation and other activities crucial to the regulation of the immune response.	0
420073	cl23880	HALZ	Homeobox associated leucine zipper. 	0
420074	cl23881	GIT_SHD	Spa2 homology domain (SHD) of GIT. Helical motif in the GIT family of ADP-ribosylation factor GTPase-activating proteins, and in yeast Spa2p and Sph1p (CPP; unpublished results). In p95-APP1 the N-terminal GIT motif might be involved in binding PIX.	0
420075	cl23882	Holin_SPP1	SPP1 phage holin. This model represents one of more than 30 families of phage proteins, all lacking detectable homology with each other, known or believed to act as holins. Holins act in cell lysis by bacteriophage. Members of this family are found in phage PBSX and phage SPP1, among others. [Mobile and extrachromosomal element functions, Prophage functions]	0
305040	cl23883	DUF722	Protein of unknown function (DUF722). This model represents a family of phage proteins, including RinA, a transcriptional activator in staphylococcal phage phi 11. This family shows similarity to ArpU, a phage-related putative autolysin regulator, and to some sporulation-specific sigma factors. [Mobile and extrachromosomal element functions, Prophage functions, Regulatory functions, DNA interactions]	0
420076	cl23884	PRESAN	Plasmodium RESA N-terminal. This model represents a conserved sequence region of about 60 amino acids found in over 40 predicted proteins of Plasmodium falciparum. It is not found elsewhere, including closely related species such as Plasmodium yoelii. No member of this family is characterized.	0
420077	cl23885	NTase_sub_bind	Nucleotidyltransferase substrate binding protein like. The member of this family from Haemophilus influenzae, HI0074, has been shown by crystal structure to resemble nucleotidyltransferase substrate binding proteins. It forms a complex with HI0073, encoded by the adjacent gene and containing a nucleotidyltransferase nucleotide binding domain (pfam01909).	0
420078	cl23886	SWM_repeat	Putative flagellar system-associated repeat. This domain appears in 29 copies in a large (>10000 amino acid) protein in Synechococcus sp. WH8102 associated with a novel flagellar system, as one of three different repeats. Similar domains are found in two different large (<3500) proteins of Synechocystis PCC6803.	0
420079	cl23887	DUF4349	Domain of unknown function (DUF4349). This model describes a protein, PhaR, localized to polyhydroxyalkanoic acid (PHA) inclusion granules in Bacillus cereus and related species. PhaR is required for PHA biosynthesis along with PhaC and may be a regulatory subunit.	0
390161	cl23888	Gmx_para_CXXCG	Protein of unknown function (Gmx_para_CXXCG). This family consists of at least 10 paralogous proteins from Myxococcus xanthus that lack detectable sequence similarity to any other protein family. An imperfectly conserved CXXCG motif, a probable binding site, appears twice in the multiple sequence alignment.	0
390162	cl23889	Dimeth_Pyl	Dimethylamine methyltransferase (Dimeth_PyL). This family consists of dimethylamine methyltransferases from the genus Methanosarcina. It is found in three nearly identical copies in each of M. acetivorans, M. barkeri, and M. mazei. It is one of a suite of three non-homologous enzymes with a critical UAG-encoded pyrrolysine residue in these species (along with trimethylamine methyltransferase and monomethylamine methyltransferase). It demethylates dimethylamine, leaving monomethylamine, and methylates the prosthetic group of the small corrinoid protein MtbC. The methyl group is then transferred by methylcorrinoid:coenzyme M methyltransferase to coenzyme M. Note that the pyrrolysine residue is variously translated as K or X, or as a stop codon that truncates the sequence.	0
329155	cl23890	Bac_small_YrzI	Probable sporulation protein (Bac_small_yrzI). Members of this family are very small proteins, about 47 residues each, in the genus Bacillus. Single members are found in Bacillus subtilis and Bacillus halodurans, but arrays of six in tandem in Bacillus cereus and Bacillus anthracis. An EIxxE motif present in most members of this family resembles cleavage sites by the germination protease GPR in a number of small, acid-soluble spore proteins (SASP). A role in sporulation is possible.	0
420080	cl23891	Type_III_YscG	Bacterial type II secretion system chaperone protein (type_III_yscG). YscG is a molecular chaperone for YscE, where both are part of the type III secretion system that in Yersinia is designated Ysc (Yersinia secretion). The secretion system delivers effector proteins, designated Yops (Yersinia outer proteins) in Yersinia. This family consists of YscG of Yersinia, and functionally equivalent type III secretion machinery protein in other species: AscG in Aeromonas, LscG in Photorhabdus luminescens, etc. [Protein fate, Protein folding and stabilization, Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	0
420081	cl23892	TyeA	TyeA. Members of this family include both small proteins, about 90 amino acids, in which this model covers the whole, and longer proteins of about 360 residues which match in the C-terminal region. The longer proteins (HrpJ) have N-terminal regions that match pfam07201. Members of this family belong to bacterial type III secretion systems, and include TyeA from the well-studied Yersinia systems. TyeA appears involved in calcium-responsive regulation of the delivery of type III effectors.	0
390164	cl23893	HrpB2	Bacterial type III secretion protein (HrpB2). This family of genes is found in type III secretion operons in a narrow group of species including Xanthomonas, Burkholderia and Ralstonia.	0
305052	cl23895	LcrG	LcrG protein. This protein is found in type III secretion operons, along with LcrR, H and V. Also known as PcrG in Pseudomonas, the protein is believed to make a 1:1 complex with PcrV (LcrV). Mutants of LcrG cause premature secretion of effector proteins into the medium.	0
390165	cl23896	PGPGW	Putative transmembrane protein (PGPGW). Members of this family are Actinobacterial putative proteins of about 150 amino acids in length with three apparent transmembrane helices and an unusual motif with consensus sequence PGPGW. [Hypothetical proteins, Conserved]	0
305054	cl23897	Phenyl_P_gamma	Phenylphosphate carboxylase gamma subunit (Phenyl_P_gamma). Members of this protein family are the gamma subunit of phenylphosphate carboxylase. Phenol (hydroxybenzene) is converted to phenylphosphate, then para-carboxylated by this four-subunit enzyme, with the release of phosphate, to 4-hydroxybenzoate. The enzyme contains neither biotin nor thiamin pyrophosphate. The gamma subunit has no known homologs.	0
420082	cl23898	SpoIIIAC	Stage III sporulation protein AC/AD protein family. Members of this family are the uncharacterized protein SpoIIIAD, part of the spoIIIA operon that acts at sporulation stage III as part of a cascade of events leading to endospore formation. Note that the start sites of members of this family as annotated tend to be variable; quite a few members have apparent homologous protein-coding regions continuing upstream of the first available start codon. The length of the alignment and the scoring cutoff thresholds for the model have been set to try to detect all valid members of the family, even if annotation of the start site begins too far downstream. [Cellular processes, Sporulation and germination]	0
329159	cl23899	Spore_YpjB	Sporulation protein YpjB (SpoYpjB). Members of this protein family, YpjB, are restricted to a subset of endospore-forming bacteria, including Bacillus species but not Clostridium or some others. In Bacillus subtilis, ypjB was found to be part of the sigma-E regulon, where sigma-E is a sporulation sigma factor that regulates expression in the mother cell compartment. Null mutants of ypjB show a sporulation defect. This protein family is not, however, a part of the endospore formation minimal gene set. [Cellular processes, Sporulation and germination]	0
305057	cl23900	DUF2375	Protein of unknown function (DUF2375). Two members of this family are found in Colwellia psychrerythraea 34H and one each in various other species of Colwellia and Shewanella. One member from C. psychrerythraea is of special interest because it is preceded by the same cis-regulatory site as a number of genes that have the PEP-CTERM domain described by TIGR02595. [Hypothetical proteins, Conserved]	0
390167	cl23901	DUF3938	Protein of unknown function (DUF3938). hypothetical protein; Provisional	0
329161	cl23902	DUF2689	Protein of unknown function (DUF2689). conjugal transfer protein TrbD; Provisional	0
329162	cl23903	TrbE	Conjugal transfer protein TrbE. conjugal transfer protein TrbE; Provisional	0
420083	cl23904	DNA_Packaging_2	DNA packaging protein. DNA packaging protein, small subunit	0
390169	cl23905	DUF2824	Protein of unknown function (DUF2824). head assembly protein	0
420084	cl23906	UvsW	ATP-dependant DNA helicase UvsW. hypothetical protein; Provisional	0
305065	cl23908	RBP-H	Head domain of virus receptor-binding proteins (RBP). Caudo_bapla_RBP is a family of proteins expressed from ORF18 of the Lactococcus P2-like phage. This is one of three protein species, shoulders, neck, and head, that form the phage tail base-plate. In the overall structure this head domain exists as six trimers, and is necessary for specific recognition of the receptors at the host cell surface. Siphoviridae are the P2-like Caudovirales of Lactococcus. This family now includes DUF1914. Family Baseplate, pfam16774, is the ORF15 or shoulder component of the base-plate complex.	0
420085	cl23910	Replic_Relax	Replication-relaxation. putative internal core protein; Provisional	0
420086	cl23911	MSP	Manganese-stabilizing protein / photosystem II polypeptide. photosystem II oxygen-evolving enhancer protein 1; Provisional	0
420087	cl23912	GlgS	Glycogen synthesis protein. Members of this family are involved in glycogen synthesis in Enterobacteria. The structure of the polypeptide chain comprises a bundle of two parallel amphipathic helices, alpha-1 and alpha-3, and a short hydrophobic helix alpha-2 sandwiched between them.	0
420088	cl23913	DUF2614	Zinc-ribbon containing domain. hypothetical protein; Provisional	0
420089	cl23914	UPF0257	Uncharacterized protein family (UPF0257). 	0
329168	cl23915	SspK	Small acid-soluble spore protein K family. This protein family is restricted to a subset of endospore-forming bacteria such as Bacillus subtilis, all of which are in the Firmicutes (low-GC Gram-positive) lineage. It is a minor SASP (small, acid-soluble spore protein) designated SspK. [Cellular processes, Sporulation and germination]	0
420090	cl23916	UPF0253	Uncharacterized protein family (UPF0253). hypothetical protein; Provisional	0
420091	cl23918	SspP	Small acid-soluble spore protein P family. This family consists of the small acid-soluble spore proteins (SASP) P type (sspP). sspP is expressed only in the forespore compartment of the sporulating cell. sspP is also expressed under sigma-G control from the same promoter as sspO. Mutations deleting sspP cause no discernible effect on sporulation, spore properties or spore germination.	0
329170	cl23919	Ribosomal_S22	30S ribosomal protein subunit S22 family. This family consists of the 30S ribosomal proteins subunit S22 polypeptides. This polypeptide is 47 amino acids in length and has a molecular weight of about 5 kDa. The S22 subunit is a component of the stationary-phase-specific ribosomal protein and is assembled in the ribosomal particles in the stationary phase. This subunit along with other stationary-phase-specific ribosomal proteins result in compositional changes of ribosomes during the stationary phase. The significance of this change is not clear as yet.	0
329171	cl23920	Tafi-CsgC	Thin aggregative fimbriae synthesis protein. curli assembly protein CsgC; Provisional	0
420092	cl23921	YccJ	YccJ-like protein. hypothetical protein; Provisional	0
420093	cl23922	DUF2559	Protein of unknown function (DUF2559). hypothetical protein; Provisional	0
420094	cl23924	DUF2767	Protein of unknown function (DUF2767). This family of proteins with unknown function appears to be restricted to Enterobacteriaceae.	0
420095	cl23925	YedD	YedD-like protein. lipoprotein; Provisional	0
390178	cl23926	DUF2594	Protein of unknown function (DUF2594). This family of proteins with unknown function appears to be restricted to Enterobacteriaceae.	0
420096	cl23927	DUF2583	Protein of unknown function (DUF2583). Some members in this family of proteins are annotated as YchH however currently no function is known.	0
420097	cl23928	MsyB	MsyB protein. secY/secA suppressor protein; Provisional	0
420098	cl23929	YejG	YejG-like protein. hypothetical protein; Provisional	0
420099	cl23930	BssS	BssS protein family. The BssS protein family is a group of proteins that are involved in regulation of biofilm formation. Proteins in this family are approximately 80 amino acids in length.	0
420100	cl23931	Peptidase_S48	Peptidase family S48. heterocyst differentiation control protein; Reviewed	0
420101	cl23932	DUF1283	Protein of unknown function (DUF1283). This family consists of several hypothetical proteins of around 115 residues in length which seem to be specific to Enterobacteria. The function of the family is unknown.	0
420102	cl23933	UPF0370	Uncharacterized protein family (UPF0370). hypothetical protein; Provisional	0
420103	cl23934	YebF	YebF-like protein. hypothetical protein; Provisional	0
390185	cl23935	PsiA	PsiA protein. plasmid SOS inhibition protein A; Provisional	0
305093	cl23936	PTZ00202	N/A. tuzin-like protein; Provisional	0
305094	cl23937	exosort_Gpos	exosortase family protein XrtG. Members of this protein family, ArtF, belong to the archaeosortase/exosortase family, in which many members associate with specific protein C-terminal putative protein sorting domains (exosortase A with PEP-CTERM, archaeosortase A with PGF-CTERM, etc.). This subgroup is observed in Thermococcus gammatolerans EJ3 and Thermococcus sp. AM4, but the gene neighborhood is not conserved. The cognate sequence to ArtF is unknown, but should not be ICGP-CTERM (model TIGR04288), found also in many Pyrococcus species that lack any archaeosortase family member.	0
420104	cl23938	GA	GA module. The protein G-related albumin-binding (GA) module is composed of three alpha helices. This module is found in a range of bacterial cell surface proteins. The GA module from the Peptostreptococcus magnus albumin-binding protein (PAB) shows a strong affinity for albumin.	0
420105	cl23940	PYST-C1	Plasmodium yoelii subtelomeric region (PYST-C1). This model represents the N-terminal domain of a paralogous family of Plasmodium yoelii genes preferentially located in the subtelomeric regions of the chromosomes. There are no obvious homologs to these genes in any other organism. The C-terminal portions of the genes which contain this domain are divergent and some contain other yoelii-specific paralogous domains such as PYST-C2 (TIGR01604).	0
420106	cl23941	ChpXY	CO2 hydration protein (ChpXY). This small family of proteins includes paralogs ChpX and ChpY in Synechococcus sp. PCC7942 and other cyanobacteria, associated with distinct NAD(P)H dehydrogenase complexes. These proteins collectively enable light-dependent CO2 hydration and CO2 uptake; loss of both blocks growth at low CO2 concentrations. [Energy metabolism, Photosynthesis]	0
305099	cl23942	Paramecium_SA	Paramecium surface antigen domain. This domain is a cysteine rich extracellular repeat found in surface antigens of Paramecium. The domain contains 8 cysteine residues.	0
420107	cl23944	CaM_binding	Plant calmodulin-binding domain. The sequences featured in this family are found repeated in a number of plant calmodulin-binding proteins, and are thought to constitute the calmodulin-binding domains. Binding of the proteins to calmodulin depends on the presence of calcium ions. These proteins are thought to be involved in various processes, such as plant defence responses and stolonisation or tuberization.	0
420108	cl23945	FragX_IP	Cytoplasmic Fragile-X interacting family. Protein PIR; Provisional	0
305103	cl23946	PHA00148	N/A. putative lower collar protein	0
390190	cl23947	DUF3653	Phage protein. putative transcription regulator	0
420109	cl23948	Peptidase_S80	Bacteriophage T4-like capsid assembly protein (Gp20). portal vertex protein; Provisional	0
420110	cl23949	Late_protein_L1	L1 (late) protein. major capsid L1 protein; Provisional	0
305108	cl23951	Pox_vIL-18BP	Orthopoxvirus interleukin 18 binding protein. IL-18 binding protein; Provisional	0
420111	cl23953	DUF212	Divergent PAP2 family. This family is related to the pfam01569 family (personal obs: C Yeats).	0
420112	cl23954	DHNA	Dihydroneopterin aldolase. 	0
305112	cl23955	COG2122	Uncharacterized protein, UPF0280 family, ApbE superfamily [Function unknown]. hypothetical protein; Provisional	0
420113	cl23956	YitT_membrane	Uncharacterized 5xTM membrane BCR, YitT family COG1284. This is probably a bacterial ABC transporter permease (personal obs:Yeats C).	0
420114	cl23957	Lys_export	Lysine exporter LysO. Members of this family contain a conserved core of four predicted transmembrane segments. Some members have an additional pair of N-terminal transmembrane helices. This family includes lysine exporter LysO (YbjE) from E. coli.	0
420115	cl23958	DUF1040	Protein of unknown function (DUF1040). This family consists of several bacterial YihD proteins of unknown function.	0
390198	cl23959	DUF1495	Winged helix DNA-binding domain (DUF1495). This family consists of several hypothetical archaeal proteins of around 110 residues in length. The structure of this domain possesses a winged helix DNA-binding domain suggesting these proteins are bacterial transcription factors.	0
420116	cl23960	DUF4097	Putative adhesin. This bacterial family of proteins shows structural similarity to other pectin lyase families. Although structures from this family align with acetyl-transferases, there is no conservation of catalytic residues found. It is likely that the function is one of cell-adhesion. In Structure 3jx8, it is interesting to note that the sequence contains several well defined sequence repeats, centred around GSG motifs defining the tight beta turn between the two sheets of the super-helix; there are 8 such repeats in the C-terminal half of the protein, which could be grouped into 4 repeats of two. It seems likely that this family belongs to the superfamily of trimeric auto-transporter adhesins (TAAs), which are important virulence factors in Gram-negative pathogens. In the case of Parabacteroides distasonis, which is a component of the normal distal human gut microbiota, TAA-like complexes probably modulate adherence to the host (information derived from TOPSAN).	0
420117	cl23961	DUF2106	Predicted membrane protein (DUF2106). This domain, found in various hypothetical archaeal proteins, has no known function.	0
420118	cl23962	DUF1959	Domain of unknown function (DUF1959). This domain is found in a set of uncharacterized Archaeal hypothetical proteins. Its function has not, as yet, been described.	0
420119	cl23964	DUF2324	Putative membrane peptidase family (DUF2324). This domain, found in various hypothetical bacterial proteins, has no known function. This family appears to be related to the prenyl protease 2 family pfam02517, suggesting this family may be peptidases.	0
420120	cl23965	DUF910	Bacterial protein of unknown function (DUF910). This family consists of several short bacterial proteins of unknown function.	0
420121	cl23966	Phage_TAC_7	Phage tail assembly chaperone proteins, E, or 41 or 14. This is a family of various Myoviridae bacteriophage tail assembly chaperone, or TAC, proteins.	0
420122	cl23967	DUF1033	Protein of unknown function (DUF1033). This family consists of several hypothetical bacterial proteins. Many of the sequences in this family are annotated as putative DNA binding proteins but the function of this family is unknown.	0
420123	cl23968	DUF1128	Protein of unknown function (DUF1128). This family consists of several short, hypothetical bacterial proteins of unknown function.	0
420124	cl23969	DUF2187	Uncharacterized protein conserved in bacteria (DUF2187). This domain, found in various hypothetical bacterial proteins, has no known function.	0
390205	cl23970	DUF2188	Uncharacterized protein conserved in bacteria (DUF2188). This domain, found in various hypothetical bacterial proteins, has no known function.	0
329205	cl23971	DUF2197	Uncharacterized protein conserved in bacteria (DUF2197). This domain, found in various hypothetical bacterial proteins, has no known function.	0
420125	cl23972	DUF2198	Uncharacterized protein conserved in bacteria (DUF2198). This domain, found in various hypothetical bacterial proteins, has no known function.	0
420126	cl23973	Glycoamylase	Putative glucoamylase. The structure of UniProt:Q5LIB7 has an alpha/alpha toroid fold and is similar structurally to a number of glucoamylases. Most of these structural homologs are glucoamylases, involved in breaking down complex sugars (e.g. starch). The biologically relevant state is likely to be monomeric. The putative active site is located at the centre of the toroid with a well defined large cavity.	0
420127	cl23974	zinc_ribbon_10	Predicted integral membrane zinc-ribbon metal-binding protein. This domain, found in various hypothetical bacterial and eukaryotic metal-binding proteins, is probably a zinc-ribbon.	0
420128	cl23975	DUF1127	Domain of unknown function (DUF1127). This family is found in several hypothetical bacterial proteins. In some cases it represents the C-terminal region whereas in others it represents the whole sequence.	0
420129	cl23976	DUF1189	Protein of unknown function (DUF1189). This family consists of several hypothetical bacterial proteins of around 260 residues in length. The function of this family is unknown.	0
305135	cl23978	UPF0259	Uncharacterized protein family (UPF0259). hypothetical protein; Provisional	0
355112	cl23979	UspB	Universal stress protein B (UspB). universal stress protein UspB; Provisional	0
420130	cl23980	MarB	MarB protein. The MarB protein is found in the multiple antibiotic resistance (mar) locus in Escherichia coli. The MarB protein is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 70 amino acids in length. There is a conserved GSDKSD sequence motif.	0
305138	cl23981	PyrBI_leader	PyrBI operon leader peptide. This family consists of the pyrBI operon leader peptides. The expression of the pyrBI operon, which encodes the subunits of the pyrimidine biosynthetic enzyme aspartate transcarbamylase, is regulated primarily through a UTP-sensitive transcriptional attenuation control mechanism. In this mechanism, the concentration of UTP determines the extent of coupling between transcription and translation within the pyrBI leader region, hence determining the level of rho-independent transcriptional termination at an attenuator preceding the pyrB gene.	0
420131	cl23982	DUF3561	Protein of unknown function (DUF3561). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are about 110 amino acids in length.	0
420132	cl23983	DUF1422	Protein of unknown function (DUF1422). This family consists of several hypothetical bacterial proteins of around 120 residues in length. The function of this family is unknown.	0
420133	cl23984	PsiB	Plasmid SOS inhibition protein (PsiB). This family consists of several plasmid SOS inhibition protein (PsiB) sequences.	0
420134	cl23985	Endostatin-like	N/A. NC10 stands for Non-helical region 10 and is taken from COL15A1. A mutation in this region in COL18A1 is associated with an increased risk of prostate cancer. This domain is cleaved from the precursor and forms endostatin. Endostatin is a key tumor suppressor and has been used highly successfully to treat cancer. It is a potent angiogenesis inhibitor. Endostatin also binds a zinc ion near the N-terminus; this is likely to be of structural rather than functional importance according to.	0
420135	cl23986	DAXX_helical_bundle	Helical bundle domain of the death-domain associated protein (DAXX). The Daxx protein (also known as the Fas-binding protein) is thought to play a role in apoptosis. Daxx forms a complex with Axin. Remodelling of the family to a short domain based on the Structure 2kzs structure gives a more representative family. DAXX is a scaffold protein shown to play diverse roles in transcription and cell cycle regulation. This N-terminal domain folds into a left-handed four-helix bundle (H1, H2, H4, H5) that binds to the N-terminal residues of the tumor-suppressor Rassf1C.	0
329218	cl23987	FBA_1	F-box associated. This model describes a large family of plant domains, with several hundred members in Arabidopsis thaliana. Most examples are found C-terminal to an F-box (pfam00646), a 60 amino acid motif involved in ubiquitination of target proteins to mark them for degradation. Two-hybrid experiments support the idea that most members are interchangeable F-box subunits of SCF E3 complexes. Some members have two copies of this domain.	0
390217	cl23989	Gram_pos_anchor	LPXTG cell wall anchor motif. This model describes the LPXTG motif-containing region found at the C-terminus of many surface proteins of Streptococcus and Streptomyces species. Cleavage between the Thr and Gly by sortase or a related enzyme leads to covalent anchoring at the new C-terminal Thr to the cell wall. Hits that do not lie at the C-terminus or are not found in Gram-positive bacteria are probably false-positive. A common feature of the proteins containing this domain appears to be a high proportion of charged and zwitterionic residues immediately upstream of the LPXTG motif. This model differs from other descriptions of the LPXTG region by including a portion of that upstream charged region. [Cell envelope, Other]	0
420136	cl23990	S-layer	S-layer protein. This model represents a sequence region found tandemly duplicated in two proven archaeal S-layer glycoproteins, MA0829 from Methanosarcina acetivorans C2A and MM1976 from Methanosarcina mazei Go1, as well as in several paralogs of those S-layer proteins from both species. Members of the family show regions of local similarity to another known family of archaeal S-layer proteins described by model TIGR01564. Some members of this family, including the proven S-layer proteins, have the archaeosortase A target motif, PGF-CTERM (TIGR04126), at the protein C-terminus. [Cell envelope, Surface structures]	0
420137	cl23991	Phage_holin_3_1	Phage holin family (Lysis protein S). This model represents one of a large number of mutually dissimilar families of phage holins. Holins act against the host cell membrane to allow lytic enzymes of the phage to reach the bacterial cell wall. This family includes the product of the S gene of phage lambda. [Mobile and extrachromosomal element functions, Prophage functions]	0
420138	cl23992	Wx5_PLAF3D7	Protein of unknown function (Wx5_PLAF3D7). This model represents a family of at least four proteins in Plasmodium falciparum. An interesting feature is five perfectly conserved Trp residues.	0
420139	cl23994	Prophage_tail	Prophage endopeptidase tail. This model represents the conserved N-terminal region, typically from about residue 25 to about residue 350, of a family of uncharacterized phage proteins 500 to 1700 residues in length. [Mobile and extrachromosomal element functions, Prophage functions]	0
420140	cl23995	Phage_XkdX	Phage uncharacterized protein (Phage_XkdX). This model represents a family of small (about 50 amino acid) phage proteins, found in at least 12 different phage and prophage regions of Gram-positive bacteria. In a number of these phage, the gene for this protein is found near the holin and endolysin genes. [Mobile and extrachromosomal element functions, Prophage functions]	0
420141	cl23996	DUF576	Csa1 family. Members of this family are predicted lipoproteins (mostly), found in Staphylococcus aureus in several different tandem clusters in pathogenicity islands. Members are also found, clustered, in Staphylococcus epidermidis.	0
420142	cl23997	Phage_TAC_6	Phage tail assembly chaperone protein, TAC. This model describes a family of proteins found exclusively in phage or in prophage regions of bacterial genomes, including the phage-like Rhodobacter capsulatus gene transfer agent, which packages DNA. [Mobile and extrachromosomal element functions, Prophage functions]	0
420143	cl23998	Phage_T4_gp19	T4-like virus tail tube protein gp19. This family consists of uncharacterized proteins. All members so far represent bacterial genes found in apparent phage or otherwise laterally transferred regions of the chromosome. Tentatively identified neighboring proteins tend to be phage tail region proteins. In some species, including Photorhabdus luminescens TTO1, several members of this family may be encoded near each other.	0
420144	cl23999	DUF2388	Protein of unknown function (DUF2388). This family consists of small hypothetical proteins, about 100 amino acids in length. The family includes five members (three in tandem) in Pseudomonas aeruginosa PAO1, and also in Pseudomonas putida KT2440, four in Pseudomonas syringae DC3000, and single members in several other Proteobacteria. The function is unknown.	0
420145	cl24000	Pec_lyase	Pectic acid lyase. Members of this family are isozymes of pectate lyase (EC 4.2.2.2), also called polygalacturonic transeliminase and alpha-1,4-D-endopolygalacturonic acid lyase. [Energy metabolism, Biosynthesis and degradation of polysaccharides]	0
420146	cl24001	Flg_hook	Flagellar hook-length control protein FliK. Members of this family include YscP of the Yersinia type III secretion system and equivalent proteins in other animal pathogen bacterial type III secretion systems. The model describes the conserved C-terminal region. N-terminal regions are poorly conserved and variable in length with some low-complexity sequence.	0
305159	cl24002	Dot_icm_IcmQ	Dot/Icm secretion system protein (dot_icm_IcmQ). Members of this protein family are the IcmQ component of Dot/Icm secretion systems, as found in obligate intracellular pathogens Legionella pneumophila and Coxiella burnetii. While this system resembles type IV secretion systems and has been called a form of type IV, the literature now seems to favor calling this the Dot/Icm system. This protein was shown to be essential for translocation.	0
420147	cl24004	CBP_BcsF	Cellulose biosynthesis protein BcsF. Members of this protein family are found invariably together with genes of bacterial cellulose biosynthesis, and are presumed to be involved in the process. Members average about 63 amino acids in length and are uncharacterized. The gene has been designated both YhjT and BcsF (bacterial cellulose synthesis F). [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	0
420148	cl24005	DUF2570	Protein of unknown function (DUF2570). Members of this protein family are phage lysis regulatory protein, including the well-studied protein LysB (lysis protein B) of Enterobacteria phage P2. For members of this family, genes are found in phage or in prophage regions of bacterial genomes, typically near a phage lysozyme or phage holin.	0
420149	cl24006	YidC_periplas	YidC periplasmic domain. Essentially all bacteria have a member of the YidC family, whose C-terminal domain is modeled by TIGR03592. The two copies found in endospore-forming bacteria such as Bacillus subtilis appear redundant during vegetative growth, although the member designated spoIIIJ (stage III sporulation protein J) has a distinct role in spore formation. YidC, its mitochondrial homolog Oxa1, and its chloroplast homolog direct insertion into the bacterial/organellar inner (or only) membrane. This model describes an N-terminal sequence region, including a large periplasmic domain lacking in YidC members from Gram-positive species. The multifunctional YidC protein acts both with and independently of the Sec system. [Protein fate, Protein and peptide secretion and trafficking]	0
420150	cl24007	DUF2976	Protein of unknown function (DUF2976). Members of this protein family are found occasionally on plasmids such as the Pseudomonas putida TOL plasmid pWWO_p085. Usually, however, they are found on the bacterial main chromosome in a region flanked by markers of conjugative transfer and/or transposition. [Mobile and extrachromosomal element functions, Plasmid functions]	0
420151	cl24008	CtnDOT_TraJ	homologs of TraJ from Bacteroides conjugative transposon. Members of this protein family are designated TraM and are found in a proposed transfer region of a class of conjugative transposon found in the Bacteroides lineage. This family is related to conjugation system proteins in the Proteobacteria, including TrbL of Agrobacterium Ti plasmids and VirB6. [Cellular processes, DNA transformation]	0
390233	cl24009	Parvo_NS1	Parvovirus non-structural protein NS1. This protein is a DNA helicase that is required for initiation of viral DNA replication. This protein forms a complex with the E2 protein pfam00508.	0
420152	cl24013	B_lectin	N/A. These proteins include mannose-specific lectins from plants as well as bacteriocins from bacteria.	0
420153	cl24015	DDE_Tnp_ISL3	Transposase. This domain was identified by Babu and colleagues.	0
305174	cl24017	PCNA	N/A. N-terminal and C-terminal domains of PCNA are topologically identical. Three PCNA molecules are tightly associated to form a closed ring encircling duplex DNA.	0
420154	cl24018	DUF4143	Domain of unknown function (DUF4143). This domain is almost always found C-terminal to an ATPase core family.	0
420155	cl24019	CSN8_PSD8_EIF3K	CSN8/PSMD8/EIF3K family. This family includes diverse proteins involved in large complexes. The alignment contains one highly conserved negatively charged residue and one highly conserved positively charged residue that are probably important for the function of these proteins. The family includes the yeast nuclear export factor Sac3, and mammalian GANP/MCM3-associated proteins, which facilitate the nuclear localization of MCM3, a protein that associates with chromatin in the G1 phase of the cell-cycle.	0
420156	cl24020	Pyocin_S	S-type Pyocin. The C-terminal region of colicin-like bacteriocins is either a pore-forming or an endonuclease-like domain. Cloacin and Pyocins have similar structures and activities to the colicins from E coli and the klebicins from Klebsiella spp. Colicins E5 and D cleave the anticodon loops of distinct tRNAs of Escherichia coli both in vivo and in vitro. The full-length molecule has an N-terminal translocation domain and a middle, double alpha-helical region which is receptor-binding.	0
420157	cl24021	NPR3	Nitrogen Permease regulator of amino acid transport activity 3. This family of regulators are involved in post-translational control of nitrogen permease.	0
420159	cl24023	Cyclase_polyket	Polyketide synthesis cyclase. Members of this family have only been identified in species of the Streptomyces genus. Two family members are known to be part of gene clusters involved in the synthesis of polyketide-based spore pigments, homologous to clusters involved in the synthesis of polyketide antibiotics. The function of this protein is unknown, but it has been speculated to contain a NAD(P) binding site. Many of these proteins contain two copies of this presumed domain.	0
420163	cl24030	SUR7	SUR7/PalI family. During the mating process of yeast cells, two Ca2+ influx pathways become activated. The resulting elevation of cytosolic free Ca2+ activates downstream signaling factors that promote long term survival of unmated cells. Fig1 is a regulator of the low affinity Ca2+ influx system (LACS), and is also required for efficient membrane fusion during yeast mating.	0
420164	cl24032	QCR10	Ubiquinol-cytochrome-c reductase complex subunit (QCR10). The ubiquinol-cytochrome C reductase complex (cytochrome bc1 complex) is an essential component of the mitochondrial cellular respiratory chain. This family represents the 6.4kD protein, which may be closely linked to the iron-sulphur protein in the complex and function as an iron-sulphur protein binding factor.	0
420165	cl24033	zf-NADH-PPase	NADH pyrophosphatase zinc ribbon domain. This domain occurs at the N-terminus of several Nudix (Nucleoside Diphosphate linked to X) hydrolases.	0
420167	cl24037	MRP-S25	Mitochondrial ribosomal protein S25. MRP-S23 is one of the proteins that makes up the 55S ribosome in eukaryotes from nematodes to humans. It does not appear to carry any common motifs, either RNA binding or ribosomal protein motifs. All of the mammalian MRPs are encoded in nuclear genes that are evolving more rapidly than those encoding cytoplasmic ribosomal proteins. The MRPs are imported into mitochondria where they assemble coordinately with mitochondrially transcribed rRNAs into ribosomes that are responsible for translating the 13 mRNAs for essential proteins of the oxidative phosphorylation system. MRP-S23 is significantly up-regulated in uterine cancer cells.	0
420168	cl24038	SNAP	Soluble N-ethylmaleimide-sensitive factor (NSF) Attachment Protein family. Neuromuscular junction formation relies upon the clustering of acetylcholine receptors and other proteins in the muscle membrane. Rapsyn is a peripheral membrane protein that is selectively concentrated at the neuromuscular junction and is essential for the formation of synaptic acetylcholine receptor aggregates. Acetylcholine receptors fail to aggregate beneath nerve terminals in mice where rapsyn has been knocked out. The N-terminal six amino acids of rapsyn are its myristoylation site, and myristoylation is necessary for the targeting of the protein to the membrane.	0
420169	cl24040	AbiEi_4	Transcriptional regulator, AbiEi antitoxin. AbiEi_3 is the cognate antitoxin of the type IV toxin-antitoxin 'innate immunity' bacterial abortive infection (Abi) system that protects bacteria from the spread of a phage infection. The Abi system is activated upon infection with phage to abort the cell thus preventing the spread of phage through viral replication. There are some 20 or more Abis, and they are predominantly plasmid-encoded lactococcal systems. TA, toxin-antitoxin, systems on plasmids function by killing cells that lose the plasmid upon division. AbiE phage resistance systems function as novel Type IV TAs and are widespread in bacteria and archaea. The cognate antitoxin is pfam13338.	0
420173	cl24047	DUF4173	Domain of unknown function (DUF4173). Members of this family are annotated as putative inner membrane proteins.	0
420177	cl24051	BBS2_Mid	Ciliary BBSome complex subunit 2, middle region. Members of this family are annotated as being integrin-alpha FG-GAP repeat-containing protein 2.	0
420179	cl24054	CTC1	CST, telomere maintenance, complex subunit CTC1. CTC1 is one of the three components of the CST complex that assists Shelterin to protect the ends of telomeres from attack by DNA-repair mechanisms. This family largely represents sequences from plants species.	0
420186	cl24062	Phage_holin_3_3	LydA holin phage, holin superfamily III. Phage_holin_6_2 is a family of holins classified as 1.E.20 in the TC database. The hol gene (PRF9) product (117 aas) of Pseudomonas aeruginosa PAO1 exhibits a hydrophobicity profile similar to holins of P2 and phiCTX phages with two peaks of hydrophobicity that might correspond to either one or two TMSs. Hol functions in conjunction with the lytic enzyme, Lys, a glycosyl hydrolase that breaks-up the murein in the bacterial cell-wall, causing lysis of the cell and hence entry of phage particles. Several members are annotated as pyocin R2_PP when encoded on the chromosome.	0
305223	cl24066	SA1633_like	Uncharacterized protein family conserved in Staphylococci. This family consists of uncharacterized proteins around 190 residues in length and is mainly found in various Staphylococcus species. The function of this family is unknown.	0
390276	cl24077	Zn_ribbon_17	Zinc-ribbon, C4HC2 type. This family is found at the C-terminus of WD40 repeat structures in eukaryotes.	0
420196	cl24079	Helo_like_N	Fungal N-terminal domain of STAND proteins. This is a family of fungal N-terminal domains that appear at the N-terminus of P-loop NTPases, NACHT-NTPases and Ankyrin or WD repeat proteins. The exact function is not known.	0
420197	cl24084	DUF5669	Family of unknown function (DUF5669). Members of this family are found, so far, only in the Gammaproteobacteria. The function is unknown. The location on the chromosome usually is not far from housekeeping genes rather than in what is clearly, say, a prophage region. Some members have been annotated in public databases as DNA-binding protein inhibitor Id-2-related protein, putative transcriptional regulator, or hypothetical DNA binding protein. [Hypothetical proteins, Conserved]	0
420257	cl24259	Transglut_core3	Transglutaminase-like superfamily. This family includes uncharacterized proteins that are related to the transglutaminase like domain pfam01841.	0
420385	cl24410	CASIMO1	Cancer Associated Small Integral Membrane Open reading frame 1 (CASIMO1). This family of proteins is found in eukaryotes. Proteins in this family are typically between 68 and 91 amino acids in length. Members are single-pass membrane proteins.	0
420631	cl24758	DUF5128	6-bladed beta-propeller. This family consists of uncharacterized proteins around 400 residues in length and is mainly found in various Bacteroides species, such as Bacteroides fragilis and Bacteroides sp. The function of this family is unknown.	0
420752	cl24939	T2SSB	Type II secretion system protein B. GspB (general secretory pathway B) occurs in type II secretion systems (T2SS) and is viewed as an accessory protein, a factor involved in the assembly process rather than integral to the completed T2SS apparatus.	0
420758	cl24946	FlgT_N	Flagellar assembly protein T, N-terminal domain. Members of this family are lipoprotein LipL46, as described in Leptospira interrogans serovar Copenhageni str. Fiocruz L1-130 but found broadly in the genus Leptospira. Close homologs that are not lipoproteins by sequence are likely defective in their reported coding region.	0
420886	cl25130	DotD	DotD protein. Members of this family are the lipoprotein DotD from type IVB secretion systems, which are also called Dot/Icm secretion systems. DotD is related to conjugal transfer protein TraH as that term is used in IncI1 plasmid transfer regions.	0
420941	cl25222	MaAIMP_sms	Putative methionine and alanine importer, small subunit. MetS, as described in the Gram-positive bacterium Corynebacterium glutamicum, is the small subunit of MetPS, an NSS (Neurotransmitter:Sodium Symporter) transporter involved in methionine and alanine import. While MetS itself is small, only 60 amino acids, homologs in gamma proteobacteria such as Vibrio sp., similarly found next to an NSS transporter large subunit, may be barely half that length and consist almost entirely of a predicted hydrophobic region that would localize to within the plasma membrane.	0
391092	cl25225	Porin_7	Putative general bacterial porin. Members of this family are outer membrane beta-barrel proteins that facilitate passive transport from the extracellular milieu into the periplasm. Known members are limited to the genus Acinetobacter, and the name, Omp33-36, reflects variability of this protein across the lineage. Note that this HMM previously was named CarO in error. Both this protein and CarO affect carbapenem transport across the outer membrane and thus carbapenem susceptibility or resistance.	0
420976	cl25298	QVR	Sleepless protein. This is a highly conserved domain found in various Platyhelminthes. Its function is currently unknown, with some of the sequences annotated as being Palmitoyltransferase. This highly conserved domain is located at the amino terminus, next to a DHHC domain (named for its signature tetrapeptide Asp-His-His-Cys).	0
330171	cl25349	EFh_SPARC_EC	EF-hand, extracellular calcium-binding (EC) motif, found in secreted protein acidic and rich in cysteine (SPARC)-like proteins. SMOC-2, also termed SPARC-related modular calcium-binding protein 2, or smooth muscle-associated protein 2 (SMAP-2), is a ubiquitously expressed matricellular protein that enhances the response to angiogenic growth factors, mediate cell adhesion, keratinocyte migration, and metastasis. It is also associated with vitiligo and craniofacial and dental defects. Moreover, SMOC-2 acts as an Arf1 GTPase-activating protein (GAP) that interacts with clathrin heavy chain (CHC) and clathrin assembly protein CALM and functions in the retrograde, early endosome/trans-Golgi network (TGN) pathway in a clathrin- and AP-1-dependent manner. It also contributes to mitogenesis via activation of integrin-linked kinase (ILK). SMOC-2 contains a follistatin-like (FS) domain, two thyroglobulin-like (TY) domains, a novel domain, which is found only in the homologous SMOC-1, and an extracellular calcium-binding (EC) domain with two EF-hand calcium-binding motifs.	0
355382	cl25352	EFh_PEF	The penta-EF hand (PEF) family. CAPN2, also termed millimolar-calpain (m-calpain), or calpain-2 catalytic subunit, or calcium-activated neutral proteinase 2 (CANP 2), or calpain large polypeptide L2, or calpain-2 large subunit, is a ubiquitously expressed 80-kDa Ca2+-dependent intracellular cysteine protease that contains a short N-terminal anchor helix, followed by a calpain cysteine protease (CysPc) domain, a C2-domain-like (C2L) domain, and a C-terminal Ca2+-binding penta-EF-hand (PEF) domain. The catalytic subunit CAPN2 in complex with a regulatory subunit encoded by CAPNS1 forms an m-calpain heterodimer. CAPN2 acts as the key protease responsible for N-methyl-d-aspartic acid (NMDA)-induced cytoplasmic polyadenylation element-binding protein 3 (CPEB3) degradation in neurons. It cleaves several components of the focal adhesion complex, such as FAK and talin, triggering disassembly of the complex at the rear of the cell. The stimulation of CAPN2 activity is required for Golgi antiapoptotic proteins (GAAPs) to promote cleavage of FA kinase (FAK), cell spreading, and enhanced migration. calpain 2 is also involved in the onset of glial differentiation. It regulates proliferation, survival, migration, and tumorigenesis of breast cancer cells through a PP2A-Akt-FoxO-p27(Kip1) signaling cascade. Its expression is associated with response to platinum based chemotherapy, progression-free and overall survival in ovarian cancer. Moreover, CAPN2 may play a role in fundamental mitotic functions, such as the maintenance of sister chromatid cohesion. The activation of CAPN2 plays an essential role in hippocampal synaptic plasticity and in learning and memory. In the eye, CAPN2, together with a lens-specific variant of CAPN3, is responsible for proteolytic cleavages of alpha and beta-crystallin. Overactivated alpha and beta-crystallin can lead to cataract formation. Sometimes, CAPN2 compensates for loss of CAPN1, and both calpain isoforms are involved in AngII-induced aortic aneurysm formation. The main phosphorylation sites in m-calpain are Ser50 and Ser369/Thr370.	0
330175	cl25354	EFh_CREC	EF-hand, calcium binding motif, found in CREC-EF hand family. RCN-3, also termed EF-hand calcium-binding protein RLP49, is a putative six EF-hand Ca2+-binding protein that contains five RXXR (X is any amino acid) motifs and a C-terminal ER retrieval signal His-Asp-Glu-Leu (HDEL) tetrapeptide. The RXXR motif represents the target sequence of subtilisin-like proprotein convertases (SPCs). RCN-3 is specifically bound to the paired basic amino-acid-cleaving enzyme-4 (PACE4) precursor protein and plays an important role in the biosynthesis of PACE4.	0
330177	cl25356	EFh_parvalbumin_like	EF-hand, calcium binding motif, found in parvalbumin-like EF-hand family. Beta-parvalbumin, also termed Oncomodulin-1 (OM), is a small calcium-binding protein that is expressed in hepatomas, as well as in the blastocyst and the cytotrophoblasts of the placenta. It is also found to be expressed in the cochlear outer hair cells of the organ of Corti and frequently expressed in neoplasms. Mammalian beta-parvalbumin is secreted by activated macrophages and neutrophils. It may function as a tissue-specific Ca2+-dependent regulatory protein, and may also serve as a specialized cytosolic Ca2+ buffer. Beta-parvalbumin acts as a potent growth-promoting signal between the innate immune system and neurons in vivo. It has high and specific affinity for its receptor on retinal ganglion cells (RGC) and functions as the principal mediator of optic nerve regeneration. It exerts its effects in a cyclic adenosine monophosphate (cAMP)-dependent manner and can further elevate intracellular cAMP levels. Moreover, beta-parvalbumin is associated with efferent function and outer hair cell electromotility, and can identify different hair cell types in the mammalian inner ear. Beta-parvalbumin is characterized by the presence of three consecutive EF-hand motifs (helix-loop-helix) called AB, CD, and EF, but only CD and EF can chelate metal ions, such as Ca2+ and Mg2+. The EF site displays a high-affinity for Ca2+/Mg2+, and the CD site is a low-affinity Ca2+-specific site. In addition, beta-parvalbumin is distinguished from other parvalbumins by its unusually low isoelectric point (pI = 3.1) and sequence eccentricities (e.g., Y57-L58-D59 instead of F57-I58-E59).	0
355383	cl25360	DMSOR_beta-like	Beta subunit of the DMSO Reductase (DMSOR) family. This family consists of the small beta iron-sulfur (FeS) subunit of the DMSO Reductase (DMSOR) family. Members of this family also contain a large, periplasmic molybdenum-containing alpha subunit and may have a small gamma subunit as well.  Examples of heterodimeric members with alpha and beta subunits include arsenite oxidase, and tungsten-containing formate dehydrogenase (FDH-T) while   heterotrimeric members containing alpha, beta, and gamma subunits include formate dehydrogenase-N (FDH-N), and nitrate reductase (NarGHI).  The beta subunit contains four Fe4/S4 and/or Fe3/S4 clusters which transfer the electrons from the alpha subunit to a hydrophobic integral membrane protein, presumably a cytochrome containing two b-type heme groups. The reducing equivalents are then transferred to menaquinone, which finally reduces the electron-accepting enzyme system.	0
355384	cl25362	YitT_C_like	C-terminal domain of Bacillus subtilis YitT and similar protein domains. This domain, characteristic of various bacterial proteins, has no known function. It has been given the designation DUF2179 and is similar to the C-terminus of the Bacillus subtilis membrane protein.	0
355385	cl25364	beta_Kdo_transferase	beta-3-deoxy-D-manno-oct-2-ulosonic acid (Kdo)-transferase. KpsS is a beta-3-deoxy-D-manno-oct-2-ulosonic acid (Kdo)-transferase. It is part of the ATP-binding cassette transporter dependent capsular polysaccharides (CPSs) synthesis pathway, one of two CPS synthesis pathways present in Escherichia coli. The poly-Kdo linker is thought to be the common feature of CPSs synthesized via this pathway. CPSs are high-molecular-mass cell-surface polysaccharides that are important virulence factors for many pathogenic bacteria.	0
355386	cl25366	ChuX-like	heme utilization protein ChuX and similar proteins. This family contains the C-terminal domain of heme degrading enzyme HemS, and similar proteins, including PhuS, ChuS, ShuS, and HmuS in proteobacteria.  Despite low sequence identity between the N- and C-terminal halves, these segments represent a structural duplication, with each terminal half having similar fold to single domains of ChuX. HemS shares homology with both, heme degrading enzymes and heme trafficking enzymes. Heme is an iron source for pathogenic microorganisms to enable multiplication and survival within hosts they invade and therefore heme degrading enzyme activity is required for the release of iron from heme after its transportation into the cytoplasm. N- and C-terminal halves of ChuS are each a functional heme oxygenase (HO). The mode of heme coordination by ChuS has been shown to be distinct, whereby the heme is stabilized mostly by residues from the C-terminal domain, assisted by a distant arginine from the N-terminal domain. ChuS can use ascorbic acid or cytochrome P450 reductase-NADPH as electron sources for heme oxygenation. Shigella dysenteriae ShuS promotes utilization of heme as an iron source and protects against heme toxicity by physically sequestering DNA. Heme transporter protein PhuS in Pseudomonas aeruginosa is unique among this family since it contains three histidines in the heme-binding pocket, compared with only one in ChuX.	0
355389	cl25402	iSH2_PI3K_IA_R	Inter-Src homology 2 (iSH2) helical domain of Class IA Phosphoinositide 3-kinase Regulatory subunits. PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives. They play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation, and apoptosis. They are classified according to their substrate specificity, regulation, and domain structure. Class IA PI3Ks are heterodimers of a p110 catalytic (C) subunit and a p85-related regulatory (R) subunit. The R subunit down-regulates PI3K basal activity, stabilizes the C subunit, and plays a role in the activation downstream of tyrosine kinases. All R subunits contain two SH2 domains that flank an intervening helical domain (iSH2), which binds to the N-terminal adaptor-binding domain (ABD) of the catalytic subunit. p85beta, also called PIK3R2, contains N-terminal SH3 and GAP domains. It is expressed ubiquitously but at lower levels than p85alpha. Its expression is increased in breast and colon cancer, correlates with tumor progression, and enhanced invasion. During viral infection, the viral nonstructural (NS1) protein binds p85beta specifically, which leads to PI3K activation and the promotion of viral replication. Mice deficient with PIK3R2 develop normally and exhibit moderate metabolic and immunological defects.	0
421006	cl25407	ClassIIa_HDAC_Gln-rich-N	Glutamine-rich N-terminal helical domain of various Class IIa histone deacetylases (HDAC4, HDAC5 and HDAC9). This domain is found in eukaryotes, and is approximately 90 amino acids in length. The family is found in association with pfam00850. The domain forms an alpha helix which complexes to form a tetramer. The glutamine rich domains have many intra- and inter-helical interactions which are thought to be involved in reversible assembly and disassembly of proteins. The domain is part of histone deacetylase 4 (HDAC4) which removes acetyl groups from histones. This restores their positive charge to allow stronger DNA binding thus restricting transcriptional activity.	0
355393	cl25421	Lnt	Apolipoprotein N-acyltransferase [Cell wall/membrane/envelope biogenesis]. apolipoprotein N-acyltransferase; Reviewed	0
421007	cl25432	RseA_N	N-terminal domain of RseA. Sigma-E is important for the induction of proteins involved in heat shock response. RseA binds sigma-E via its N-terminal domain, sequestering sigma-E and preventing transcription from heat-shock promoters. The C-terminal domain is located in the periplasm, and may interact with other proteins that signal periplasmic stress.	0
421008	cl25434	MukF_C	bacterial condensin complex subunit MukF, C-terminal domain. This presumed domain is found at the C-terminus of the MukF protein.	0
355396	cl25446	PRK15055	anaerobic sulfite reductase subunit AsrA. Members of this protein family include the A subunit, one of three subunits, of the anaerobic sulfite reductase of Salmonella, and close homologs from various Clostridium species, where the three-gene neighborhood is preserved. Two such gene clusters are found in Clostridium perfringens, but it may be that these sets of genes correspond to the distinct assimilatory and dissimilatory forms as seen in Clostridium pasteurianum. Note that any one of these enzymes may have secondary substrates such as NH2OH, SeO3(2-), and SO3(2-). Heterologous expression of the anaerobic sulfite reductase of Salmonella confers on Escherichia coli the ability to produce hydrogen sulfide gas from sulfite. [Central intermediary metabolism, Sulfur metabolism]	0
330301	cl25480	FA58C	N/A. Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes.	0
421067	cl25556	tRNA-synt_2_TM	Transmembrane region of lysyl-tRNA synthetase. tRNA-synt_2_TM is a family from the N-terminal region of tRNA-synthase-2, with 6xTMs. The presence of this region indicates that the protein is anchored in the membrane. The family is found in Actinobacteria.	0
421069	cl25561	TadE	TadE-like protein. The members of this family are similar to a region of the protein product of the bacterial tadE locus. In various bacterial species, the tad locus is closely linked to flp-like genes, which encode proteins required for the production of pili involved in adherence to surfaces. It is thought that the tad loci encode proteins that act to assemble or export an Flp pilus in various bacteria. All tad loci but TadA have putative transmembrane regions, and in fact the region in question in this family has a high proportion of hydrophobic amino acid residues.	0
421071	cl25563	Mal_decarbox_Al	Malonate decarboxylase, alpha subunit, transporter. This model describes malonate decarboxylase alpha subunit, from both the water-soluble form as found in Klebsiella pneumoniae and the form coupled to sodium ion pumping in Malonomonas rubra. Malonate decarboxylase Na+ pump is the paradigm of the family of Na+ transport decarboxylases. Essentially, it couples the energy derived from decarboxylation of a carboxylic acid substrate to move Na+ ion across the bilayer. Functional malonate decarboxylase is a multi-subunit protein. The alpha subunit enzymatically performs the transfer of malonate (substrate) to an acyl carrier protein subunit for subsequent decarboxylation, hence the name: acetyl-S-acyl carrier protein:malonate carrier protein-SH transferase. [Transport and binding proteins, Cations and iron carrying compounds, Energy metabolism, Other]	0
391257	cl25564	OFeT_1	Ferrous iron uptake permease, iron-lead transporter. OFeT_1 is a family of conserved archaeal membrane proteins that are putative oxidase-dependent Fe2+ transporters.	0
391260	cl25584	RNA_pol_inhib	RNA polymerase inhibitor. inhibitor of host bacterial RNA polymerase	0
421073	cl25585	Stomagen	Stomagen. stomagen; Provisional	0
421079	cl25598	TRF2_RBM	RAP1 binding motif of telomere repeat binding factor. This domain, found in telomeric repeat-binding factor 2, binds to the C-terminus of repressor activator protein 1 (RAP1) (telomeric repeat-binding factor 2-interacting protein 1).	0
391269	cl25608	MSL2_CXC	DNA-binding cysteine-rich domain of male-specific lethal 2 and related proteins. MSL2-CXC is an autonomously folded domain that binds three zinc ions. It lies on the E3 ubiquitin-protein ligase MSL2 in eukaryotes. The CXC domain critically contributes to the DNA-binding activity of MSL2. It carries 9 invariant cysteines within about a 50 residue region.	0
421085	cl25616	Apocytochr_F_N	Apocytochrome F, N-terminal. cytochrome f	0
421098	cl25646	zn-ribbon_14	Zinc-ribbon. [Hypothetical proteins, Conserved]	0
355433	cl25655	Patched	Patched family. The transmembrane protein Patched is a receptor for the morphogene Sonic Hedgehog. This protein associates with the smoothened protein to transduce hedgehog signals.	0
421105	cl25664	YaiA	YaiA protein. hypothetical protein; Provisional	0
421106	cl25669	DUF1735	Domain of unknown function (DUF1735). This family consists of uncharacterized proteins around 340 residues in length and is mainly found in various Bacteroides and Prevotella species. The function of this protein is unknown.	0
421107	cl25673	CcmF_C	Cytochrome c-type biogenesis protein CcmF C-terminal. Members of this protein family closely resemble the CcmF protein of the CcmABCDEFGH system, or system I, for c-type cytochrome biogenesis (GenProp0678). Members are found, as a rule, next to closely related paralogs of CcmG and CcmH and always located near other genes associated with the cytochrome c nitrite reductase enzyme complex. As a rule, members are found in species that also encode bona fide members of the CcmF, CcmG, and CcmH families.	0
421111	cl25682	DUF4912	Domain of unknown function (DUF4912). This family consists of uncharacterized proteins around 160 residues in length and is mainly found in various Clostridium species. The function of this family is unknown.	0
421112	cl25683	DUF4910	Domain of unknown function (DUF4910). This domain, found in various hypothetical prokaryotic proteins, has no known function. An aminopeptidase domain is conserved within the family, but its relevance has not been established yet. Rebuilding from Structure 3kt9 shows this is an inserted (nested domain within the amino-peptidase). The function of this small domain is not known.	0
330507	cl25686	COG5135	Uncharacterized protein [Function unknown]. Members of the PPOX family (see pfam01243) may contain either FMN or F420 as cofactor. This subfamily described here is widespread in Cyanobacteria and plants, and is named for alr4036 from Nostoc sp. PCC 7120. The family consists mostly of proteins from species that lack the capability to synthesize F420, so it is probable that all members bind FMN rather than F420. [Unknown function, Enzymes of unknown specificity]	0
355439	cl25705	COG4745	Predicted membrane-bound mannosyltransferase  [General function prediction only]. Members of this protein family, uncommon and rather sporadically distributed, are found almost always in the same genomes as members of family TIGR03662, and frequently as a nearby gene. Members show some N-terminal sequence similarity with pfam02366, dolichyl-phosphate-mannose-protein mannosyltransferase. The few invariant residues in this family, found toward the N-terminus, include a dipeptide DE, a tripeptide HGP, and two different Arg residues. Up to three members may be found in a genome. The function is unknown.	0
421146	cl25771	ComGF	Putative Competence protein ComGF. ComGF is a family of putative bacterial competence proteins.	0
421190	cl25857	RmuC	RmuC family. DNA recombination protein RmuC; Provisional	0
421217	cl25912	Bacuni_01323_like	Uncharacterized protein conserved in Bacteroidetes. Large family of predicted secreted proteins mostly from CFG group, but also from Burkholderia, Pseudomonas and Streptomyces. Function of these proteins is not known. A 3D structure of a representative of this family from Bacteroides uniformis was solved by JCSG and deposited to PDB as 4ghb. There is some overlap with RHS-repeat (PF05593) family despite lack of obvious repeats in the structure.	0
330753	cl25932	Peptidase_S21	Assemblin (Peptidase family S21). hypothetical protein; Provisional	0
355468	cl25954	PRK06718	NAD(P)-binding protein. precorrin-2 dehydrogenase; Validated	0
355469	cl25961	POLXc	DNA polymerase X family. includes vertebrate polymerase beta and terminal deoxynucleotidyltransferases	0
421244	cl25973	RNA_pol	DNA-dependent RNA polymerase. T3/T7-like RNA polymerase	0
421249	cl25995	TM_EphA1	Transmembrane domain of Ephrin Receptor A1 Protein Tyrosine Kinase. Epha2_TM represents the left-handed dimer transmembrane domain of the EphA2 receptor. This domain oligomerizes and is important for the active signalling process.	0
421254	cl26030	Topo_C_assoc	C-terminal topoisomerase domain. DNA Topoisomerase I (eukaryota), DNA topoisomerase V, Vaccinia virus topoisomerase, Variola virus topoisomerase, Shope fibroma virus topoisomeras	0
421256	cl26041	Gly_rich_SFCGS	Glycine-rich SFCGS. Members of this family of small (about 120 amino acid), relatively rare proteins are found in both Gram-positive (e.g. Enterococcus faecalis) and Gram-negative (e.g. Aeromonas hydrophila) bacteria, as part of a cluster of conserved proteins. The function is unknown. [Hypothetical proteins, Conserved]	0
421257	cl26042	Arylsulfotrans	Arylsulfotransferase (ASST). This family consists of several bacterial Arylsulfotransferase proteins. Arylsulfotransferase (ASST) transfers a sulfate group from phenolic sulfate esters to a phenolic acceptor substrate.	0
421260	cl26057	YdfZ	YdfZ protein. This small protein has a very limited distribution, being found so far only among some gamma-Proteobacteria. The member from Escherichia coli was shown to bind selenium in the absence of a working SelD-dependent selenium incorporation system. Note that while the E. coli member contains a single Cys residue, a likely selenium binding site, some other members of this protein family contain two Cys residues or none. [Unknown function, General]	0
421261	cl26058	MgrB	MgrB protein. The MgrB protein is a short lipoprotein. The mgrB gene has a Mg2+ responsive promoter. Deletion of mgrB results in a potent increase in PhoP-regulated transcription. The PhoQ/PhoP signaling system responds to low magnesium and the presence of certain cationic antimicrobial peptides. Over-expression of mgrB decreased transcription at both high and low concentrations of magnesium. Localization and bacterial two-hybrid studies suggest that MgrB resides in the inner-membrane and interacts directly with PhoQ. This domain family is found in bacteria, and is approximately 40 amino acids in length. There are two conserved sequence motifs: CDQ and GIC.	0
421262	cl26078	Clathrin	Region in Clathrin and VPS. Each region is about 140 amino acids long. The regions are composed of multiple alpha helical repeats. They occur in the arm region of the Clathrin heavy chain.	0
421264	cl26089	DDE_Tnp_IS240	DDE domain. This DDE domain is found in a wide variety of transposases including those found in IS240, IS26, IS6100 and IS26.	0
421266	cl26118	LPAM_2	Prokaryotic lipoprotein-attachment site. In prokaryotes, membrane lipoproteins are synthesized with a precursor signal peptide, which is cleaved by a specific lipoprotein signal peptidase (signal peptidase II). The peptidase recognizes a conserved sequence and cuts upstream of a cysteine residue to which a glyceride-fatty acid lipid is attached.	0
421268	cl26132	DUF3029	Protein of unknown function (DUF3029). Members of this family are homologs to enzymes known to undergo activation by a radical SAM protein to create an active site glycyl radical. This family appears to be activated by the YjjW radical SAM protein, usually encoded by an adjacent gene. [Unknown function, Enzymes of unknown specificity]	0
421271	cl26153	Glyco_trans_1_3	Glycosyl transferase family 1. This model represents nearly the full length of MJ1255 from Methanococcus jannaschii and of an unpublished protein from Vibrio cholerae, as well as the C-terminal half of a protein from Methanobacterium thermoautotrophicum. A small region (~50 amino acids) within the domain appears related to a family of sugar transferases. [Hypothetical proteins, Conserved]	0
421272	cl26156	Acetyltransf_8	Acetyltransferase (GNAT) domain. AlcB is the conserved 45 residue region of one of the proteins of a complex which mediates alcaligin biosynthesis in Bordetella and aerobactin biosynthesis in E. coli and other bacteria. The protein appears to catalyse N-acylation of the hydroxylamine group in N-hydroxyputrescine with succinyl CoA - an activated mono-thioester derivative of succinic acid that is an intermediate in the Krebs cycle.	0
421273	cl26163	FUSC	Fusaric acid resistance protein family. This family includes a conserved region found in two proteins associated with fusaric acid resistance, FusC from Burkholderia cepacia and fdt-2 from Klebsiella oxytoca. These proteins are likely to be membrane transporter proteins.	0
391485	cl26168	Dehalogenase	Reductive dehalogenase subunit. This model represents a family of corrin and 8-iron Fe-S cluster-containing reductive dehalogenases found primarily in halorespiring microorganisms such as Dehalococcoides ethenogenes, which contains as many as 17 enzymes of this type with varying substrate ranges. One example of a characterized species is the tetrachloroethene reductive dehalogenase (1.97.1.8) which also acts on trichloroethene converting it to dichloroethene.	0
421277	cl26199	Transglut_core2	Transglutaminase-like superfamily. 	0
421279	cl26235	RecQL4_SLD2_NTD	N-terminal homeodomain-like domain of metazoan RecQ protein-like 4 (RecQL4), fungal DNA replication regulator SLD2 and similar proteins. Genome duplication is precisely regulated by cyclin-dependent kinases CDKs, which bring about the onset of S phase by activating replication origins and then prevent re-licensing of origins until mitosis is completed. The optimum sequence motif for CDK phosphorylation is S/T-P-K/R-K/R, and Drc1-Sld2 is found to have at least 11 potential phosphorylation sites. Drc1 is required for DNA synthesis and S-M replication checkpoint control. Drc1 associates with Cdc2 and is phosphorylated at the onset of S phase when Cdc2 is activated. Thus Cdc2 promotes DNA replication by phosphorylating Drc1 and regulating its association with Cut5. Sld2 and Sld3 represent the minimal set of S-CDK substrates required for DNA replication.	0
421281	cl26244	VSG_B	Trypanosomal VSG domain. variant surface glycoprotein; Provisional	0
421283	cl26253	SCIFF	Six-cysteine peptide SCIFF. Members of this protein family are essentially universal in the class Clostridia and therefore highly abundant in the human gut microbiome. This short peptide is designated SCIFF, for Six Cysteines in Forty-Five residues. It is a presumed ribosomal natural product precursor, always found associated with a yet-uncharacterized radical SAM protein, family TIGR03974, that resembles other peptide modification radical SAM enzymes and is designated SCIFF radical SAM maturase.	0
421288	cl26307	FUSC-like	FUSC-like inner membrane protein yccS. This model represents two clades of putative transmembrane proteins including the E. coli YccS and YhfK proteins. The YccS hypothetical equivalog (TIGR01666) is found in beta and gamma proteobacteria, while the smaller YhfK group is only found in E. coli, Salmonella and Yersinia. TMHMM on the 19 hits to this model shows a consensus of 11 transmembrane helices separated into two clusters, an N-terminal cluster of 6 and a central cluster of 5. This would indicate two non-membrane domains one on each side of the membrane	0
421293	cl26348	T2SSppdC	Type II secretion prepilin peptidase dependent protein C. 	0
391509	cl26362	Peptidase_M73	Camelysin metallo-endopeptidase. This model describes a protein N-terminal domain found regularly in proteins encoded near a variant form of signal peptidase I such as the SipW protein of Bacillus subtilis. Many though not all members are homologs of camelysin (a casein-cleaving metalloprotease) and TasA (CotN), a metalloprotease that is secreted, along with extracellular polysaccharide (EPS), to be the major protein constituent of the Bacillus subtilis biofilm matrix. Sequencing from several known TasA/CotN proteins shows the cleavage location to be near the center of the alignment and typical of type I signal peptidases, with small residues at -3 and -1. This domain, therefore, appears to be a special subclass of signal peptide.	0
421296	cl26390	Beta-TrCP_D	D domain of beta-TrCP. This domain is found in eukaryotes, and is approximately 40 amino acids in length. It is found associated with F-box domain, WD domain. The protein that contains this domain functions as a ubiquitin ligase. Ubiquitination is required to direct proteins towards the proteasome for degradation. This protein is part of the WD40 class of F box proteins. The D domain of these F box proteins is involved in mediating the dimerisation of the protein. Dimerisation is necessary to polyubiquitinate substrates so this D domain is vital in directing substrates towards the proteasome for degradation.	0
421299	cl26405	MgsA_C	MgsA AAA+ ATPase C terminal. recombination factor protein RarA; Provisional	0
421304	cl26420	IAT_beta	Inverse autotransporter, beta-domain. This is a family of beta-barrel porin-like outer membrane proteins from enteropathogenic Gram-negative bacteria. Intimins and invasins are virulence factors produced by pathogenic Gram-negative bacteria. They carry C-terminal extracellular passenger domains that are involved in adhesion to host cells and N-terminal beta domains that are embedded in the outer membrane. This family represents the beta-barrel porin-like domain in the outer membrane that can be found in intimins, invasins and some inverse autotransporters.	0
421305	cl26423	DUF3421	Protein of unknown function (DUF3421). This family of proteins is functionally uncharacterized. This protein is found in bacteria and eukaryotes. Proteins in this family are typically between 119 and 296 amino acids in length.	0
421307	cl26442	Cytochrome_cB	Cytochrome c bacterial. Members of this protein family are multiheme cytochrome c proteins of Methanosarcina acetivorans C2A and several other archaeal methanogens. All members have N-terminal signal peptides and are presumed to act in electron transfer reactions associated with methanogenesis. Putative heme-binding motifs include five (or six) CXXCH motifs, a CXXXCH motif, and a CXXXXCH motif. These proteins show multiple regions of local homology, in the same order, with multiheme cytochrome c proteins such as octaheme tetrathionate reductase from Shewanella.	0
421308	cl26443	CbiG_mid	Cobalamin biosynthesis central region. Members of this family are involved in cobalamin synthesis. Synechocystis sp. cbiH represents a fusion between cbiH and cbiG. As other multi-functional proteins involved in cobalamin biosynthesis catalyze adjacent steps in the pathway, including CysG, CobL (CbiET), CobIJ and CobA-HemD, it is therefore possible that CbiG catalyzes a reaction step adjacent to CbiH. In the anaerobic pathway such a step could be the formation of a gamma lactone, which is thought to help to mediate the anaerobic ring contraction process.	0
421310	cl26453	DUF3262	Protein of unknown function (DUF3262). Members of this family of small, hydrophobic proteins are found occasionally on plasmids such as the Pseudomonas putida TOL (toluene catabolic) plasmid pWWO_p085. Usually, however, they are found on the bacterial main chromosome in regions flanked by markers of conjugative transfer and/or transposition. [Mobile and extrachromosomal element functions, Plasmid functions]	0
331275	cl26454	VirionAssem_T7	Bacteriophage T7 virion assembly protein. tail assembly protein	0
421311	cl26457	T2SSJ	Type II secretion system (T2SS), protein J. The T2SJ proteins are pseudopilins, which are targeted to the membrane in E. coli. T2SJ forms a complex with T2SI (pfam02501) and T2SK (pfam03934) which is part of the Type II secretion apparatus involved in the translocation of proteins across the outer membrane in E. coli. The T2SK-I-J complex has quasihelical characteristics.	0
421313	cl26461	DUF3236	Protein of unknown function (DUF3236). This family of proteins with unknown function appears to be restricted to Methanobacteria.	0
331288	cl26467	MerE	MerE protein. putative mercury resistance protein; Provisional	0
421316	cl26468	Scaffolding_pro	Phi29 scaffolding protein. scaffolding protein	0
331295	cl26474	DUF2675	Protein of unknown function (DUF2675). host protein H-NS-interacting protein	0
421319	cl26475	Phage_gp53	Base plate wedge protein 53. baseplate wedge subunit; Provisional	0
421320	cl26479	EutA	Ethanolamine utilisation protein EutA. This family consists of several bacterial EutA ethanolamine utilisation proteins. The EutA protein is thought to protect the lyase (EutBC) from inhibition by CNB12.	0
421321	cl26481	M11L	Apoptosis regulator M11L like. Hypothetical protein; Provisional	0
391537	cl26482	T4_tail_cap	Tail-tube assembly protein. baseplate subunit; Provisional	0
421322	cl26486	YfdX	YfdX protein. hypothetical protein; Provisional	0
331308	cl26487	DUF2745	Protein of unknown function (DUF2745). host dGTPase inhibitor	0
331309	cl26488	DUF2718	Protein of unknown function (DUF2718). Hypothetical protein; Provisional	0
331311	cl26490	YliH	Biofilm formation protein (YliH/bssR). YliH is induced in biofilms and is involved in repression of motility in the biofilms. YliH is also known as bssR (regulator of biofilm through signal secretion).	0
421323	cl26491	DSRB	Dextransucrase DSRB. DSRB is a novel dextransucrase which produces a dextran different from the typical dextran, as it contains (1-6) and (1-2) linkages, when this strain is grown in the presence of sucrose.	0
331313	cl26492	CedA	Cell division activator CedA. CedA is made up of four antiparallel beta-strands and an alpha-helix. It activates cell division by inhibiting chromosome over-replication. This is mediated by binding to dsDNA via the beta-sheet.	0
421324	cl26494	DUF2498	Protein of unknown function (DUF2498). hypothetical protein; Provisional	0
421325	cl26496	DUF2496	Protein of unknown function (DUF2496). hypothetical protein; Provisional	0
331319	cl26498	BDM	Putative biofilm-dependent modulation protein. biofilm-dependent modulation protein; Provisional	0
421326	cl26499	DUF4198	Domain of unknown function (DUF4198). This family was previously misannotated in Pfam as NikM.	0
421327	cl26521	AKAP7_NLS	AKAP7 2'5' RNA ligase-like domain. unknown protein; Provisional	0
421338	cl26560	TIR-like	Predicted nucleotide-binding protein containing TIR-like domain. Members of this family of bacterial nucleotide-binding proteins contain a TIR-like domain. Their exact function has not, as yet, been defined.	0
421339	cl26561	DUF2344	Uncharacterized protein conserved in bacteria (DUF2344). This model describes an uncharacterized protein encoded adjacent to, or as a fusion protein with, an uncharacterized radical SAM protein.	0
421340	cl26564	DUF2338	Uncharacterized protein conserved in bacteria (DUF2338). Members of this family of hypothetical bacterial proteins have no known function.	0
421341	cl26565	DUF2336	Uncharacterized protein conserved in bacteria (DUF2336). Members of this family of hypothetical bacterial proteins have no known function.	0
421342	cl26566	HPTransfase	Histidine phosphotransferase C-terminal domain. HPTransfase is a family of essential histidine phosphotransferases. It controls the activity of the master bacterial cell-cycle regulator CtrA through phosphorylation. It behaves as a homodimer by adopting the domain architecture of the intracellular part of class I histidine kinases. Each subunit consists of two distinct domains: an N-terminal helical hairpin domain and a C-terminal [alpha]/[beta] domain. The two N-terminal domains are adjacent within the dimer, forming a four-helix bundle. The C-terminal domain adopts an atypical Bergerat ATP-binding fold.	0
421343	cl26569	DUF2318	Predicted membrane protein (DUF2318). Members of this family of hypothetical bacterial proteins have no known function.	0
421344	cl26571	DUF2273	Small integral membrane protein (DUF2273). Members of this family of hypothetical bacterial proteins have no known function.	0
421345	cl26572	Methyltransf_33	Histidine-specific methyltransferase, SAM-dependent. This model represents an uncharacterized domain of about 300 amino acids with homology to S-adenosylmethionine-dependent methyltransferases. Proteins with this domain are exclusively fungal. A few, such as EasF from Neotyphodium lolii, are associated with the biosynthesis of ergot alkaloids, a class of fungal secondary metabolites. EasF may, in fact, be the AdoMet:dimethylallyltryptophan N-methyltransferase, the enzyme that follows tryptophan dimethylallyltransferase (DMATS) in ergot alkaloid biosynthesis. Several other members of this family, including mug158 (meiotically up-regulated gene 158 protein) from Schizosaccharomyces pombe, contain an additional uncharacterized domain DUF323 (pfam03781).	0
421346	cl26573	DUF2254	Predicted membrane protein (DUF2254). Members of this family of bacterial proteins comprise various hypothetical and putative membrane proteins. Their exact function has not, as yet, been defined.	0
421347	cl26577	DUF2225	Uncharacterized protein conserved in bacteria (DUF2225). This domain, found in various hypothetical bacterial proteins, has no known function.	0
421348	cl26580	DUF2206	Predicted membrane protein (DUF2206). This domain, found in various hypothetical archaeal proteins, has no known function.	0
421349	cl26582	VapB_antitoxin	Bacterial antitoxin of type II TA system, VapB. VapB is the antitoxin of a bacterial toxin-antitoxin gene pair. The cognate toxin is VapC, pfam05016. The family contains several related antitoxins from Cyanobacteria and Actinobacterial families. Antitoxins of this class carry an N-terminal ribbon-helix-helix domain, RHH, that is highly conserved across all type II bacterial antitoxins, which dimerizes with the RHH domain of a second VapB molecule. A hinge section follows the RHH, with an additional pair of flexible alpha helices at the C-terminus. This C-terminus is the Toxin-binding region of the dimer, and so is specific to the cognate toxin, whereas the RHH domain has the specific function of lying across the RNA-binding groove of the toxin dimer and inactivating the active-site - a more general function of all antitoxins.	0
421350	cl26583	DUF2163	Uncharacterized conserved protein (DUF2163). This domain, found in various hypothetical prokaryotic proteins, has no known function.	0
421351	cl26584	DUF2148	Uncharacterized protein containing a ferredoxin domain (DUF2148). This domain, found in various hypothetical bacterial proteins containing a ferredoxin domain, has no known function.	0
421352	cl26586	DUF2125	Uncharacterized protein conserved in bacteria (DUF2125). This domain, found in various hypothetical prokaryotic proteins, has no known function.	0
421353	cl26587	DUF2109	Predicted membrane protein (DUF2109). This domain, found in various hypothetical archaeal proteins, has no known function.	0
421354	cl26588	DUF2107	Predicted membrane protein (DUF2107). This domain, found in various hypothetical archaeal proteins, has no known function.	0
421355	cl26589	DUF2093	Uncharacterized protein conserved in bacteria (DUF2093). This domain, found in various hypothetical prokaryotic proteins, has no known function.	0
421356	cl26591	DUF2080	Putative transposon-encoded protein (DUF2080). Members of this family appear restricted to the archaea. They tend to be encoded upstream of predicted transposase genes within insertion sequences such as ISNagr11, ISHca1, ISH36, etc. The widespread distribution suggests this protein may be more than a mere passenger gene and may participate in some transposase-associated function. See PF09853, COG3466, and arCOG03884 for alternative (currently narrow) treatments of this family.	0
421357	cl26592	SHOCT	Short C-terminal domain. 	0
421358	cl26593	DUF2076	Uncharacterized protein conserved in bacteria (DUF2076). This domain, found in various hypothetical prokaryotic proteins, has no known function. The domain, however, is found in various periplasmic ligand-binding sensor proteins.	0
421359	cl26595	DUF2070	Predicted membrane protein (DUF2070). This is a family of Archaeal 7-TM proteins. There are 6 closely assembled TM-regions at the N-terminus followed by a long intracellular, from residues 220-590, highly conserved region, of unknown function, terminating with one more TM-region. The short 25 residue section between TMs 5 and 6 might lie on the outer surface of the membrane and be acting as a receptor (from TMHMM).	0
421360	cl26596	DUF2059	Uncharacterized protein conserved in bacteria (DUF2059). This domain, found in various prokaryotic proteins, has no known function.	0
421361	cl26597	DUF2058	Uncharacterized protein conserved in bacteria (DUF2058). This domain, found in various prokaryotic proteins, has no known function.	0
421362	cl26599	Beta_propel	Beta propeller domain. Members of this family comprise secreted bacterial proteins containing a C-terminal beta-propeller domain distantly related to WD-40 repeats. Jpred secondary-structure prediction shows the family to be a series of 4 short beta-strands, characteristic of beta-propeller families.	0
421366	cl26616	DUF2397	Protein of unknown function (DUF2397). Members of this protein family belong to a conserved four-gene neighborhood found sporadically in a phylogenetically broad range of bacteria: Nocardia farcinica, Symbiobacterium thermophilum, and Streptomyces avermitilis (Actinobacteria), Geobacillus kaustophilus (Firmicutes), Azoarcus sp. EbN1 and Ralstonia solanacearum (Betaproteobacteria). [Hypothetical proteins, Conserved]	0
421367	cl26619	Myco_arth_vir_N	Mycoplasma virulence signal region (Myco_arth_vir_N). This model represents the N-terminal region, including a probable signal sequence or signal anchor which in most instances has four consecutive Lys residues before the hydrophobic stretch, of a family of large, virulence-associated proteins in Mycoplasma arthritidis and smaller proteins in Mycoplasma capricolum.	0
421368	cl26629	Phageshock_PspD	Phage shock protein PspD (Phageshock_PspD). Members of this family are phage shock protein PspD, found in a minority of bacteria that carry the defining genes of the phage shock regulon (pspA, pspB, pspC, and pspF). It is found in Escherichia coli, Yersinia pestis, and closely related species, where it is part of the phage shock operon. It is known to be expressed but its function is unknown. [Cellular processes, Adaptations to atypical conditions]	0
421369	cl26630	Spore_YhcN_YlaJ	Sporulation lipoprotein YhcN/YlaJ (Spore_YhcN_YlaJ). YhcN and YlaJ are predicted lipoproteins that have been detected as spore proteins but not vegetative proteins in Bacillus subtilis. Both appear to be expressed under control of the RNA polymerase sigma-G factor. The YlaJ-like members of this family have a low-complexity, strongly acidic 40-residue C-terminal domain that is not included in the seed alignment for this model. A portion of the low-complexity region between the lipoprotein signal sequence and the main conserved region of the protein family was also excised from the seed alignment. [Cellular processes, Sporulation and germination]	0
391585	cl26631	DUF2379	Protein of unknown function (DUF2379). This family consists of at least eight paralogs in Myxococcus xanthus and six in Stigmatella aurantiaca DW4/3-1, both members of the Myxococcales order within the Deltaproteobacteria. The function is unknown. Some member proteins consist of two copies of the domain. This domain is hereby named DUSAM, DUplication in Stigmatella And Myxococcus.	0
331453	cl26632	Ehrlichia_rpt	Ehrlichia tandem repeat (Ehrlichia_rpt). This model represents 77 residues of an 80 amino acid (240 nucleotide) tandem repeat, found in a variable number of copies in an immunodominant outer membrane protein of Ehrlichia chaffeensis, a tick-borne obligate intracellular pathogen.	0
421374	cl26647	Lact-deh-memb	D-lactate dehydrogenase, membrane binding. D-lactate dehydrogenase; Provisional	0
421375	cl26651	AcetDehyd-dimer	Prokaryotic acetaldehyde dehydrogenase, dimerization. acetaldehyde dehydrogenase; Validated	0
421376	cl26652	FOLN	Follistatin/Osteonectin-like EGF domain. Follistatin-N-terminal domain-like, EGF-like. Region distinct from the kazal-like sequence	0
421378	cl26682	GshA	Glutamate-cysteine ligase. This family consists of a rare family of glutamate--cysteine ligases, demonstrated first in Thiobacillus ferrooxidans and present in a few other Proteobacteria. It is the first of two enzymes for glutathione biosynthesis. It is also called gamma-glutamylcysteine synthetase. [Biosynthesis of cofactors, prosthetic groups, and carriers, Glutathione and analogs]	0
421379	cl26684	IDEAL	IDEAL domain. It is found at the C-terminus of proteins in the UPF0302 family. It is named after the sequence of the most conserved region in some members.	0
421381	cl26695	Ca_chan_IQ	Voltage gated calcium channel IQ domain. Voltage gated calcium channels control cellular calcium entry in response to changes in membrane potential. The isoleucine-glutamine (IQ) motif in the voltage gated calcium channel IQ domain interacts with hydrophobic pockets of Ca2+/calmodulin. The interaction regulates two self-regulatory calcium-dependent feedback mechanisms: calcium-dependent inactivation (CDI) and calcium-dependent facilitation (CDF).	0
421382	cl26696	CotH	CotH kinase protein. Members of this family include the spore coat protein H (cotH). This protein is an atypical protein kinase that phosphorylates CotB and CotG.	0
421384	cl26712	U3_assoc_6	U3 small nucleolar RNA-associated protein 6. This is a family of U3 nucleolar RNA-associated proteins which are involved in nucleolar processing of pre-18S ribosomal RNA.	0
331548	cl26727	aMBF1	Archaeal ribosome-binding protein aMBF1, putative translation factor, contains Zn-ribbon and HTH domains [Translation, ribosomal structure and biogenesis]. [Hypothetical proteins, Conserved]	0
421385	cl26728	STAG	STAG domain. STAG domain proteins are subunits of cohesin complex - a protein complex required for sister chromatid cohesion in eukaryotes. The STAG domain is present in Schizosaccharomyces pombe mitotic cohesin Psc3, and the meiosis specific cohesin Rec11. Many organisms express a meiosis-specific STAG protein, for example, mice and humans have a meiosis specific variant called STAG3, although budding yeast does not have a meiosis specific version.	0
391601	cl26729	LisH	LisH. Alpha-helical motif present in Lis1, treacle, Nopp140, some katanin p60 subunits, muskelin, tonneau, LEUNIG and numerous WD40 repeat-containing proteins. It is suggested that LisH motifs contribute to the regulation of microtubule dynamics, either by mediating dimerisation, or else by binding cytoplasmic dynein heavy chain or microtubules directly.	0
421386	cl26730	Amelin	Ameloblastin precursor (Amelin). This family consists of several mammalian Ameloblastin precursor (Amelin) proteins. Matrix proteins of tooth enamel consist mainly of amelogenin but also of non-amelogenin proteins, which, although their volumetric percentage is low, have an important role in enamel mineralisation. One of the non-amelogenin proteins is ameloblastin, also known as amelin and sheathlin. Ameloblastin (AMBN) is one of the enamel sheath proteins which is thought to have a role in determining the prismatic structure of growing enamel crystals.	0
421388	cl26737	SpoIID	Stage II sporulation protein. Stage II sporulation protein D (SpoIID) is a protein of the endospore formation program in a number of lineages in the Firmicutes (low-GC Gram-positive bacteria). It is expressed in the mother cell compartment, under control of Sigma-E. SpoIID, along with SpoIIM and SpoIIP, is one of three major proteins involved in engulfment of the forespore by the mother cell. [Cellular processes, Sporulation and germination]	0
421389	cl26744	TPT	Triose-phosphate Transporter family. The 6-8 TMS Triose-phosphate Transporter (TPT) Family (TC 2.A.7.9). Functionally characterized members of the TPT family are derived from the inner envelope membranes of chloroplasts and nongreen plastids of plants. However, homologues are also present in yeast. Saccharomyces cerevisiae has three functionally uncharacterized TPT paralogues encoded within its genome. Under normal physiological conditions, chloroplast TPTs mediate a strict antiport of substrates, frequently exchanging an organic three carbon compound phosphate ester for inorganic phosphate (Pi). Normally, a triose-phosphate, 3-phosphoglycerate, or another phosphorylated C3 compound made in the chloroplast during photosynthesis, exits the organelle into the cytoplasm of the plant cell in exchange for Pi. However, experiments with reconstituted translocator in artificial membranes indicate that transport can also occur by a channel-like uniport mechanism with up to 10-fold higher transport rates. Channel opening may be induced by a membrane potential of large magnitude and/or by high substrate concentrations. Nongreen plastid and chloroplast carriers, such as those from maize endosperm and root membranes, mediate transport of C3 compounds phosphorylated at carbon atom 2, particularly phosphoenolpyruvate, in exchange for Pi. These are the phosphoenolpyruvate:Pi antiporters (PPT). Glucose-6-P has also been shown to be a substrate of some plastid translocators (GPT). The three types of proteins (TPT, PPT and GPT) are divergent in sequence as well as substrate specificity, but their substrate specificities overlap. [Hypothetical proteins, Conserved]	0
421394	cl26765	FBD	FBD. This region is found in F-box (pfam00646) and other domain containing plant proteins; it is repeated in two family members. Its precise function is unknown, but it is thought to be associated with nuclear processes. In fact, several family members are annotated as being similar to transcription factors.	0
421396	cl26778	TED	Thioester domain. This model describes a domain of about 40 residues with an invariant TQ dipeptide in an almost invariant TQxA[VI]W motif. This domain occurs in surface-expressed proteins of Gram-positive bacteria, many of which are anchored by LPXTG-containing sortase target domains. Numerous members of this family have domains pfam05738 (Cna protein B-type domain) and pfam08341 (fibronectin-binding protein signal sequence).	0
421397	cl26779	YicC_N	YicC-like family, N-terminal region. The apparent ortholog from Aquifex aeolicus as reported is split into two consecutive reading frames. [Hypothetical proteins, Conserved]	0
391612	cl26797	SpoV	Stage V sporulation protein family. Members of this family are SpoVM (stage V sporulation protein M).	0
391614	cl26808	PqqA	PqqA family. This model describes a very small protein, coenzyme PQQ biosynthesis protein A, which is smaller than 25 amino acids in many species. It is proposed to serve as a peptide precursor of coenzyme pyrrolo-quinoline-quinone (PQQ), with Glu and Tyr of a conserved motif Glu-Xxx-Xxx-Xxx-Tyr becoming part of the product. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	0
421399	cl26817	GSDH	Glucose / Sorbosone dehydrogenase. PQQ, or pyrroloquinoline-quinone, serves as a cofactor for a number of sugar and alcohol dehydrogenases in a limited number of bacterial species. Most characterized PQQ-dependent enzymes have multiple repeats of a sequence region described by pfam01011 (PQQ enzyme repeat), but this protein family is unusual in lacking that repeat. Below the noise cutoff are related proteins mostly from species that lack PQQ biosynthesis.	0
355577	cl26832	PLN03020	N/A. putative Low-temperature-induced protein; Provisional	0
421400	cl26841	PHB_acc_N	PHB/PHA accumulation regulator DNA-binding domain. Poly-B-hydroxyalkanoates are lipidlike carbon/energy storage polymers found in granular inclusions. PhaR is a regulatory protein found in general near other proteins associated with polyhydroxyalkanoate (PHA) granule biosynthesis and utilization. It is found to be a DNA-binding homotetramer that is also capable of binding short chain hydroxyalkanoic acids and PHA granules. PhaR may regulate the expression of itself, of the phasins that coat granules, and of enzymes that direct carbon flux into polymers stored in granules. The C-terminal region is poorly conserved in this family and is not part of this model.//GO terms added 12/6/04 [SS] [Fatty acid and phospholipid metabolism, Biosynthesis, Regulatory functions, DNA interactions]	0
421401	cl26842	DUF1656	Protein of unknown function (DUF1656). efflux system membrane protein; Provisional	0
421402	cl26844	DUF1641	Protein of unknown function (DUF1641). Archaeal and bacterial hypothetical proteins are found in this family, with the region in question being approximately 40 residues long.	0
421403	cl26845	FTCD_N	Formiminotransferase domain, N-terminal subdomain. This model represents the tetrahydrofolate (THF) dependent glutamate formiminotransferase involved in the histidine utilization pathway. This enzyme interconverts L-glutamate and N-formimino-L-glutamate. The enzyme is bifunctional as it also catalyzes the cyclodeaminase reaction on N-formimino-THF, converting it to 5,10-methenyl-THF and releasing ammonia - part of the process of regenerating THF. This model covers enzymes from metazoa as well as gram-positive bacteria and archaea. In humans, deficiency of this enzyme results in a disease phenotype. The crystal structure of the enzyme has been studied in the context of the catalytic mechanism. [Energy metabolism, Amino acids and amines]	0
421404	cl26877	Band_3_cyto	Band 3 cytoplasmic domain. The Anion Exchanger (AE) Family (TC 2.A.31). Characterized protein members of the AE family are found only in animals. They preferentially catalyze anion exchange (antiport) reactions, typically acting as HCO3-:Cl- antiporters, but also transporting a range of other inorganic and organic anions. Additionally, renal Na+:HCO3- cotransporters have been found to be members of the AE family. They catalyze the reabsorption of HCO3- in the renal proximal tubule. [Transport and binding proteins, Anions]	0
421405	cl26884	TraI_2	Putative helicase. Members of this protein family are the TraI putative relaxases required for transfer by a subclass of integrating conjugative elements (ICE) as found in Pseudomonas fluorescens Pf-5, and understood from study of two related ICE, SXT and R391. This model represents the N-terminal domain. Note that no homology is detected to the similarly named TraI relaxase of the F plasmid.	0
421406	cl26890	Collar	Phage Tail Collar Domain. This region is occasionally found in conjunction with pfam03335. Most of the family appear to be phage tail proteins; however some appear to be involved in other processes. For instance a member from Rhizobium leguminosarum may be involved in plant-microbe interactions. A related protein MrpB is involved in the pathogenicity of Microcystis aeruginosa. The finding of this family in a structural component of the phage tail fibre baseplate suggests that its function is structural rather than enzymatic. Structural studies show this region consists of a helix and a loop and three beta-strands. This alignment does not catch the third strand as it is separated from the rest of the structure by around 100 residues. This strand is conserved in homologs but the intervening sequence is not. Much of the function of phage T4 appears to reside in this intervening region. In the tertiary structure of the phage baseplate this domain forms part of the 'collar'. The domain may bind SO4, however the residues accredited with this vary between the PDB file and the Swiss-Prot entry. The long unconserved region may be due to domain swapping in and out of a loop or reflective of rapid evolution.	0
421407	cl26892	PsaM	Photosystem I protein M (PsaM). Members of this protein family are PsaM, which is subunit XII of the photosystem I reaction center. This protein is found in both the Cyanobacteria and the chloroplasts of plants, but is absent from non-oxygenic photosynthetic bacteria such as Rhodobacter sphaeroides. Species that contain photosystem I also contain photosystem II, which splits water and releases molecular oxygen. The seed alignment for this model includes sequences from pfam07465 and additional sequences, as from Prochlorococcus. [Energy metabolism, Photosynthesis]	0
421408	cl26894	BofA	SigmaK-factor processing regulatory protein BofA. Members of this protein family are found only in endospore-forming bacteria, such as Bacillus subtilis and Clostridium tetani. Among such bacteria, it appears only Symbiobacterium thermophilum lacks a member of this family. The protein, designated BofA, is an integral membrane protein that regulates the proteolytic activation of the RNA polymerase sigma factor K. [Cellular processes, Sporulation and germination]	0
421409	cl26898	DUF1507	Protein of unknown function (DUF1507). hypothetical protein; Provisional	0
355590	cl26909	COG4652	Uncharacterized protein [Function unknown]. This model represents a family of integral membrane proteins, most of which are about 650 residues in size and predicted to span the membrane seven times. Nearly half of the members of this family are found in association with a member of the lactococcin 972 family of bacteriocins (TIGR01653). Others may be associated with uncharacterized proteins that may also act as bacteriocins. Although this protein is suggested to be an immunity protein, and the bacteriocin is suggested to be exported by a Sec-dependent process, the role of this protein is unclear. [Cellular processes, Toxin production and resistance]	0
421410	cl26914	Neuralized	Neuralized. This family contains a conserved region approximately 60 residues long within eukaryotic neuralized and neuralized-like proteins. Neuralized belongs to a group of ubiquitin ligases and is required in a subset of Notch pathway-mediated cell fate decisions during development of the Drosophila nervous system. Some family members contain multiple copies of this region.	0
331736	cl26915	Sugar_transport	Sugar transport protein. This is a family of bacterial sugar transporters approximately 300 residues long. Members include glucose uptake proteins, ribose transport proteins, and several putative and hypothetical membrane proteins probably involved in sugar transport across bacterial membranes. These members are transmembrane proteins which are usually 5+5 duplications. This model recognizes a set of five TMs,	0
421411	cl26917	SKA1_N	Spindle and kinetochore-associated protein 1, N-terminal domain. Spindle and kinetochore-associated protein 1 (SKA1) is a component of the SKA1 complex (consists of Ska1, Ska2, and Ska3/Rama1), a microtubule-binding subcomplex of the outer kinetochore that is essential for proper chromosome segregation.	0
421412	cl26923	DUF1328	Protein of unknown function (DUF1328). hypothetical protein; Provisional	0
421413	cl26935	zf-LSD1	LSD1 zinc finger. This model describes a putative zinc finger domain found in three closely spaced copies in Arabidopsis protein LSD1 and in two copies in other proteins from the same species. The motif resembles CxxCRxxLMYxxGASxVxCxxC	0
421414	cl26937	YqfD	Putative stage IV sporulation protein YqfD. YqfD is part of the sigma-E regulon in the sporulation program of endospore-forming Gram-positive bacteria. Mutation results in a sporulation defect in Bacillus subtilis. Members are found in all currently known endospore-forming bacteria, including the genera Bacillus, Symbiobacterium, Carboxydothermus, Clostridium, and Thermoanaerobacter. [Cellular processes, Sporulation and germination]	0
421415	cl26941	DUF1246	Protein of unknown function (DUF1246). This family represents the N-terminus of a number of hypothetical archaeal proteins of unknown function. This family is structurally related to the PreATP-grasp domain.	0
391639	cl26952	BC10	Bladder cancer-related protein BC10. hypothetical protein	0
421416	cl26955	DUF1192	Protein of unknown function (DUF1192). This family consists of several short, hypothetical, bacterial proteins of around 60 residues in length. The function of this family is unknown.	0
391642	cl26958	DUF1156	Protein of unknown function (DUF1156). This family represents a conserved region within hypothetical prokaryotic and archaeal proteins of unknown function. Structural modelling suggests this domain may bind nucleic acids.	0
421418	cl26962	DUF1126	DUF1126 PH-like domain. The structure of this domain shows that it has a PH-like fold.	0
421419	cl26964	ABC_trans_CmpB	Putative ABC-transporter type IV. CmpB is a family of membrane proteins that are likely to be part of a two-component type IV ABC-transporter system. Families can transport multiple drugs including ethidium and fluoroquinolones. UniProtKB:Q83XH0 is a member of TCDB family 3.A.1.121.4.	0
421420	cl26967	Orthopox_A49R	Orthopoxvirus A49R protein. hypothetical protein; Provisional	0
421421	cl26969	Vitellogenin_N	Lipoprotein amino terminal region. This family contains regions from: Vitellogenin, Microsomal triglyceride transfer protein and apolipoprotein B-100. These proteins are all involved in lipid transport. This family contains the LV1n chain from lipovitellin, that contains two structural domains.	0
391647	cl26972	API3	N/A. Pepsin inhibitor-3 consisting of two domains, each comprising an antiparallel beta-sheet flanked by an alpha-helix. In the enzyme-inhibitor complex, the N-terminal beta-strand of PI-3 pairs with one strand of the active site flap region of pepsin. The two domains are tandem repeats of sequence, and has therefore been termed repeated domain.	0
421422	cl26975	pbsY	N/A. photosystem II protein Y; Provisional	0
421423	cl26977	Mito_fiss_Elm1	Mitochondrial fission ELM1. In plants, this family is involved in mitochondrial fission. It binds to dynamin-related proteins and plays a role in their relocation from the cytosol to mitochondrial fission sites. Its function in bacteria is unknown.	0
421424	cl26979	Usg	Usg-like family. Family of bacterial proteins, referred to as Usg. Usg is found in the same operon as trpF, trpB, and trpA and is expressed in a coupled transcription-translation system.	0
331802	cl26981	Terminase_1	Phage Terminase. The majority of the members of this family are bacteriophage proteins, several of which are thought to be terminase large subunit proteins. There are also a number of bacterial proteins of unknown function.	0
421425	cl26984	PTAC	Phosphate propanoyltransferase. This family includes phosphotransacylases (PTACs) required for the degradation of 1,2-propanediol (1,2-PD).	0
391652	cl26985	TLP-20	N/A. This family consists of several Nucleopolyhedrovirus telokin-like protein-20 (TLP20) sequences. The function of this family is unknown but TLP20 is known to share some antigenic similarities to the smooth muscle protein telokin although the amino acid sequence shows no homologies to telokin.	0
421430	cl26996	TelA	Toxic anion resistance protein (TelA). This family consists of several prokaryotic TelA like proteins. TelA and KlA are associated with tellurite resistance and plasmid fertility inhibition.	0
331826	cl27005	VirD1	T-DNA border endonuclease VirD1. This family consists of several T-DNA border endonuclease VirD1 proteins which appear to be found exclusively in Agrobacterium species. Agrobacterium, a plant pathogen, is capable of stably transforming the plant cell with a segment of its own DNA called T-DNA (transferred DNA). This process depends, among others, on the specialized bacterial virulence proteins VirD1 and VirD2 that excise the T-DNA from its adjacent sequences. VirD1 is thought to interact with VirD2 in this process.	0
421431	cl27008	HECTc	N/A. The name HECT comes from Homologous to the E6-AP Carboxyl Terminus.	0
391659	cl27011	Pup	Pup-like protein. Members of this protein family are Pup, a small protein whose ligation to target proteins steers them toward degradation. This protein family occurs in a number of bacteria, especially Actinobacteria such as Mycobacterium tuberculosis, that possess an archaeal-type proteasome. All members of this protein family known during model construction end with the C-terminal motif [FY][VI]QKGG[QE]. Ligation is thought to occur between the C-terminal COOH of Pup and an epsilon-amino group of a Lys on the target protein. The N-terminal half of this protein is poorly conserved and not represented in the seed alignment. [Protein fate, Degradation of proteins, peptides, and glycopeptides]	0
421432	cl27012	HIGH_NTase1	HIGH Nucleotidyl Transferase. This family consists of HIGH Nucleotidyl Transferases	0
421433	cl27023	TraY	TraY domain. This family consists of several enterobacterial TraY proteins. TraY is involved in bacterial conjugation where it is required for efficient nick formation in the F plasmid. These proteins have a ribbon-helix-helix fold and are likely to be DNA-binding proteins.	0
421434	cl27025	Herpes_UL69	Herpesvirus transcriptional regulator family. multifunctional expression regulator; Provisional	0
391663	cl27031	Phi-29_GP3	Phi-29 DNA terminal protein GP3. terminal protein	0
421436	cl27037	FlaC_arch	Flagella accessory protein C (FlaC). Although archaeal flagella appear superficially similar to those of bacteria, they are quite distinct. In several archaea, the flagellin genes are followed immediately by the flagellar accessory genes flaCDEFGHIJ. The gene products may have a role in translocation, secretion, or assembly of the flagellum. FlaC is a protein whose exact role is unknown but it has been shown to be membrane-associated (by immuno-blotting fractionated cells).	0
421437	cl27040	PilS	PilS N terminal. This family consists of several bundlin proteins from E. coli. Bundlin is a type IV pilin protein that is the only known structural component of enteropathogenic Escherichia coli bundle-forming pili (BFP). BFP play a role in virulence, antigenicity, autoaggregation, and localized adherence to epithelial cells. These proteins contain an N-terminal methylation motif.	0
421440	cl27046	CopB	Copper resistance protein B precursor (CopB). This family consists of several bacterial copper resistance proteins. Copper is essential and serves as cofactor for more than 30 enzymes yet a surplus of copper is toxic and leads to radical formation and oxidation of biomolecules. Therefore, copper homeostasis is a key requisite for every organism. CopB serves to extrude copper when it approaches toxic levels.	0
421441	cl27047	CHAD	CHAD domain. It has conserved histidines that may chelate metals.	0
421443	cl27068	AbrB	Transition state regulatory protein AbrB. The model describes a hydrophobic sequence region that is duplicated to form the AbrB protein of Escherichia coli (not to be confused with a Bacillus subtilis protein with the same gene symbol). In some species, notably the Cyanobacteria and Thermus thermophilus, proteins consist of a single copy rather than two copies. The member from Pseudomonas putida, PP_1415, was suggested to be an ammonia monooxygenase characteristic of heterotrophic nitrifiers, based on an experimental indication of such activity in the organism and a glimmer of local sequence similarity between parts of P. putida protein and an instance of the AmoA protein from Nitrosomonas europaea (; we do not believe the sequence similarity to be meaningful. The member from E. coli (b0715, ybgN) appears to be the largely uncharacterized AbrB (aidB regulator) protein of E. coli cited in Volkert, et al. (PMID 8002588), although we did not manage to trace the origin of association of the article to the sequence.	0
421444	cl27073	T7SS_ESX1_EccB	Type VII secretion system ESX-1, transport TM domain B. This model represents the transmembrane protein EccB of the actinobacterial flavor of type VII secretion systems. Species such as Mycobacterium tuberculosis have several instances of this system per genome, designated EccB1, EccB2, etc. This model does not identify functionally related proteins in the Firmicutes such as Staphylococcus aureus and Bacillus anthracis. [Protein fate, Protein and peptide secretion and trafficking]	0
331896	cl27075	Holin_BlyA	holin, BlyA family. This family represents a BlyA, a small holin found in Borrelia circular plasmids that prove to be temperate phage. This protein was previously proposed to be an hemolysin. BlyA is small (67 residues) and contains two largely hydrophobic helices and a highly charged C-terminus. [Mobile and extrachromosomal element functions, Prophage functions]	0
421445	cl27076	Glu_cyclase_2	Glutamine cyclotransferase. This family of enzymes EC:2.3.2.5 catalyze the cyclization of free L-glutamine and N-terminal glutaminyl residues in proteins to pyroglutamate (5-oxoproline) and pyroglutamyl residues respectively. This family includes plant and bacterial enzymes and seems unrelated to the mammalian enzymes.	0
421447	cl27082	Phage_capsid	Phage capsid family. This model family represents the major capsid protein component of the heads (capsids) of bacteriophage HK97, phi-105, P27, and related phage. This model represents one of several analogous families lacking detectable sequence similarity. The gene encoding this component is typically located in an operon encoding the small and large terminase subunits, the portal protein and the prohead or maturation protease. [Mobile and extrachromosomal element functions, Prophage functions]	0
421448	cl27083	Menin	Scaffolding protein menin encoded by the MEN1 gene. MEN1, the gene responsible for multiple endocrine neoplasia type 1, is a tumor suppressor gene that encodes a protein called Menin which may be an atypical GTPase stimulated by nm23.	0
421450	cl27099	DltD	DltD protein. Members of this protein family are DltD, part of the DltABCD system widely distributed in the Firmicutes for D-alanylation of lipoteichoic acids. The most common form of LTA, as in Staphylococcus aureus, has a backbone of polyglycerolphosphate.	0
421451	cl27100	Glu_synthase	Conserved region in glutamate synthase. This family represents a region of the glutamate synthase protein. This region is expressed as a separate subunit in the glutamate synthase alpha subunit from archaebacteria, or part of a large multidomain enzyme in other organisms. The aligned region of these proteins contains a putative FMN binding site and Fe-S cluster.	0
421452	cl27103	SseC	Secretion system effector C (SseC) like family. SseC is a secreted protein that forms a complex together with SecB and SecD on the surface of Salmonella. All these proteins are secreted by the type III secretion system. Many mucosal pathogens use type III secretion systems for the injection of effector proteins into target cells. SecB, SseC and SecD are inserted into the target cell membrane, where they form a small pore or translocon. In addition to SseC, this family includes the bacterial secreted proteins PopB, PepB, YopB and EspD which are thought to be directly involved in pore formation, and type III secretion system translocon.	0
421453	cl27104	Mak16	Mak16 protein C-terminal region. Protein MAK16 homolog; Provisional	0
421455	cl27109	Herpes_UL49_2	Herpesvirus UL49 tegument protein. tegument protein VP22; Provisional	0
331932	cl27111	Herpes_ORF11	Herpesvirus dUTPase protein. hypothetical protein; Provisional	0
421457	cl27115	IKI3	IKI3 family. Members of this family are components of the elongator multi-subunit component of a novel RNA polymerase II holoenzyme for transcriptional elongation. This region contains WD40 like repeats.	0
421458	cl27123	Arch_fla_DE	Archaeal flagella protein. Family of archaeal flaD and flaE proteins. Conserved region found at N-terminus of flaE but towards the C-terminus of flaD.	0
421460	cl27132	Herpes_UL17	Herpesvirus UL17 protein. UL17 tegument protein; Provisional	0
421461	cl27156	FrhB_FdhB_C	Coenzyme F420 hydrogenase/dehydrogenase, beta subunit C-terminus. Coenzyme F420 hydrogenase (EC:1.12.99.1) reduces the low-potential two-electron acceptor coenzyme F420. This family contains the C termini of F420 hydrogenase and dehydrogenase beta subunits. The N-terminus of Methanobacterium formicicum formate dehydrogenase beta chain (EC:1.2.1.2) is also a member of this family. This region is often found in association with the 4Fe-4S binding domain, fer4 (pfam00037).	0
421462	cl27158	FrhB_FdhB_N	Coenzyme F420 hydrogenase/dehydrogenase, beta subunit N-term. coenzyme F420-reducing hydrogenase subunit beta; Validated	0
421463	cl27166	ATE_N	Arginine-tRNA-protein transferase, N-terminus. arginyl-tRNA-protein transferase; Provisional	0
421464	cl27167	HemX	HemX, putative uroporphyrinogen-III C-methyltransferase. putative uroporphyrinogen III C-methyltransferase; Provisional	0
421465	cl27170	DUF460	Protein of unknown function (DUF460). Archaeal protein of unknown function.	0
421466	cl27175	CheF-arch	Chemotaxis signal transduction system protein F from archaea. This is a family of proteins that are archaea-specific components of the bacterial-like chemotaxis signal transduction system of archaea. In H. salinarum, the CheF proteins interact with the chemotaxis proteins CheY, CheD and CheC2 as well as the flagella-accessory proteins FlaCE and FlaD, and are essential for any tactic response. CheF probably functions at the interface between the bacterial-like chemotaxis signal transduction system and the archaeal flagellar apparatus.	0
391698	cl27177	Phospholamban	Phospholamban. This model represents the short (52 residue) transmembrane phosphoprotein phospholamban. Phospholamban, in its unphosphorylated form, inhibits SERCA2, the cardiac sarcoplasmic reticulum Ca-ATPase.	0
421467	cl27188	Mre11_DNA_bind	Mre11 DNA-binding presumed domain. All proteins in this family for which functions are known are subunits of a nuclease complex made up of multiple proteins including MRE11 and RAD50 homologs. The functions of this nuclease complex include recombinational repair and non-homologous end joining. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). The proteins in this family are distantly related to proteins in the SbcCD complex of bacteria. [DNA metabolism, DNA replication, recombination, and repair]	0
421470	cl27198	ORC2	Origin recognition complex subunit 2. All DNA replication initiation is driven by a single conserved eukaryotic initiator complex termed the origin recognition complex (ORC). The ORC is a six protein complex. The function of ORC is reviewed in.	0
421471	cl27200	Ribo_biogen_C	Ribosome biogenesis protein, C-terminal. This family represents the C-terminal domain of some putative ribosome biogenesis proteins in archaea. It has also been identified in the eukaryotic protein Tsr3, which is involved in ribosomal RNA biogenesis.	0
421472	cl27202	Mpp10	Mpp10 protein. This family includes proteins related to Mpp10 (M phase phosphoprotein 10). The U3 small nucleolar ribonucleoprotein (snoRNP) is required for three cleavage events that generate the mature 18S rRNA from the pre-rRNA. In Saccharomyces cerevisiae, depletion of Mpp10, a U3 snoRNP-specific protein, halts 18S rRNA production and impairs cleavage at the three U3 snoRNP-dependent sites.	0
421474	cl27208	PRCH	Photosynthetic reaction centre, H-chain N-terminal region. This model describes the photosynthetic reaction center H subunit in non-oxygenic photosynthetic bacteria. The reaction center is an integral membrane pigment-protein that carries out light-driven electron transfer reactions. At the core of reaction center is a collection of light-harvesting cofactors and closely associated polypeptides. The core protein complex is made of L, M and H subunits. The common cofactors include bacteriochlorophyll, bacteriopheophytins, ubiquinone and non-heme ferrous iron. The net result of electron transfer reactions is the establishment of proton electrochemical gradient and production of reducing equivalents in the form of NADH. Ultimately, the process results in the reduction of CO2 to carbohydrates (C6H12O6). In non-oxygenic organisms, the electron donor is an organic acid rather than water. Much of our current functional understanding of photosynthesis comes from the structural determination and spectroscopic studies on the reaction center of Rhodobacter sphaeroides. [Energy metabolism, Electron transport, Energy metabolism, Photosynthesis]	0
421475	cl27223	CBF	CBF/Mak21 family. 	0
421476	cl27226	DUF331	Domain of unknown function. Members of this family are uncharacterized proteins from a number of bacterial species. The proteins range in size from 50-70 residues.	0
421477	cl27236	VMO-I	N/A. VOMI binds tightly to ovomucin fibrils of the egg yolk membrane. The structure that consists of three beta-sheets forming Greek key motifs, which are related by an internal pseudo three-fold symmetry. Furthermore, the structure of VOMI has strong similarity to the structure of the delta-endotoxin, as well as a carbohydrate-binding site in the top region of the common fold.	0
421478	cl27240	PetN	PetN. cytochrome b6/f complex subunit VIII	0
421479	cl27241	YccF	Inner membrane component domain. Domain occurs as one or more copies in bacterial and eukaryotic proteins. These are membrane proteins of four TM regions, two appearing in each of the two copies when both are present. Many of the latter members also carry the sodium/calcium exchanger protein family pfam01699, which have multipass membrane regions.	0
421480	cl27251	EIIBC-GUT_N	Sorbitol phosphotransferase enzyme II N-terminus. Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Gut family consists only of glucitol-specific permeases, but these occur both in Gram-negative and Gram-positive bacteria. E. coli consists of IIA protein, a IIC protein and a IIBC protein. This family is specific for the IIBC component. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids, Signal transduction, PTS]	0
355631	cl27253	Alc	Allantoicase  [Nucleotide transport and metabolism]. Members of this family are the enzyme allantoicase (EC 3.5.3.4), also called allantoate amidinohydrolase. This enzyme hydrolyzes allantoate to (S)-ureidoglycolate and urea; it can also degrade (R)-ureidoglycolate to glyoxylate and urea. Allantoinase (EC 3.5.2.5) hydrolyzes (S)-allantoin (a xanthine metabolite, via urate) to allantoate. Allantoate can then be degraded either by this enzyme, allantoicase, or by allantoate deiminase (EC 3.5.3.9). Members of the seed alignment for this model were taken from BRENDA. Proteins in this family contain two copies of the allantoicase repeat (pfam03561). A different but similarly named enzyme, allantoate amidohydrolase (EC 3.5.3.9), simultaneously breaks down the urea to ammonia and carbon dioxide. [Purines, pyrimidines, nucleosides, and nucleotides, Other, Energy metabolism, Other]	0
355632	cl27261	NrdR	Transcriptional regulator NrdR, contains Zn-ribbon and ATP-cone domains [Transcription]. Members of this almost entirely bacterial family contain an ATP cone domain (pfam03477). There is never more than one member per genome. Common gene symbols given include nrdR, ybaD, ribX and ytcG. The member from Streptomyces coelicolor is found upstream in the operon of the class II oxygen-independent ribonucleotide reductase gene nrdJ and was shown to repress nrdJ expression. Many members of this family are found near genes for riboflavin biosynthesis in Gram-negative bacteria, suggesting a role in that pathway. However, a phylogenetic profiling study associates members of this family with the presence of a palindromic signal with consensus acaCwAtATaTwGtgt, termed the NrdR-box, an upstream element for most operons for ribonucleotide reductase of all three classes in bacterial genomes. [Regulatory functions, DNA interactions]	0
421481	cl27262	MOSC	MOSC domain. 6-N-hydroxylaminopurine resistance protein; Provisional	0
421482	cl27268	UPF0126	UPF0126 domain. hypothetical protein; Provisional	0
391716	cl27276	RNA_replicase_B	RNA replicase, beta-chain. RNA replicase, beta subunit	0
391717	cl27281	PapB	Adhesin biosynthesis transcription regulatory protein. fimbriae biosynthesis regulatory protein; Provisional	0
421483	cl27283	SDH_alpha	Serine dehydratase alpha chain. This enzyme is also called serine deaminase and L-serine dehydratase 1. L-serine ammonia-lyase converts serine into pyruvate in the gluconeogenesis pathway from serine. This enzyme is comprised of a single chain in Escherichia coli, Mycobacterium tuberculosis, and several other species, but has separate alpha and beta chains in Bacillus subtilis and related species. The beta and alpha chains are homologous to the N-terminal and C-terminal regions, respectively, but are rather deeply branched in a UPGMA tree. This enzyme requires iron and dithiothreitol for activation in vitro, and is a predicted 4Fe-4S protein. Escherichia coli and Pseudomonas aeruginosa have two copies of this protein. [Energy metabolism, Amino acids and amines, Energy metabolism, Glycolysis/gluconeogenesis]	0
421484	cl27287	Glf	UDP-galactopyranose mutase [Cell wall/membrane/envelope biogenesis]. This enzyme is involved in the conversion of UDP-GALP into UDP-GALF through a 2-keto intermediate. It contains FAD as a cofactor. The gene is known as glf, ceoA, and rfbD. It is known experimentally in E. coli, Mycobacterium tuberculosis, and Klebsiella pneumoniae. [Cell envelope, Biosynthesis and degradation of surface polysaccharides and lipopolysaccharides]	0
421485	cl27291	LUC7	LUC7 N_terminus. This family contains the N terminal region of several LUC7 protein homologs and only contains eukaryotic proteins. LUC7 has been shown to be a U1 snRNA associated protein with a role in splice site recognition. The family also contains human and mouse LUC7 like (LUC7L) proteins and human cisplatin resistance-associated overexpressed protein (CROP).	0
421486	cl27293	UFD1	Ubiquitin fusion degradation protein UFD1. Post-translational ubiquitin-protein conjugates are recognized for degradation by the ubiquitin fusion degradation (UFD) pathway. Several proteins involved in this pathway have been identified. This family includes UFD1, a 40kD protein that is essential for vegetative cell viability. The human UFD1 gene is expressed at high levels during embryogenesis, especially in the eyes and in the inner ear primordia and is thought to be important in the determination of ectoderm-derived structures, including neural crest cells. In addition, this gene is deleted in the CATCH-22 (cardiac defects, abnormal facies, thymic hypoplasia, cleft palate and hypocalcaemia with deletions on chromosome 22) syndrome. This clinical syndrome is associated with a variety of developmental defects, all characterized by microdeletions on 22q11.2. Two such developmental defects are the DiGeorge syndrome OMIM:188400, and the velo-cardio-facial syndrome OMIM:145410. Several of the abnormalities associated with these conditions are thought to be due to defective neural crest cell differentiation.	0
391729	cl27371	GYR	GYR motif. The GYR motif is found in several drosophila proteins. Its function is unknown, however the presence of completely conserved tyrosine residues may suggest it could be a substrate for tyrosine kinases.	0
355644	cl27375	PyrI	Aspartate carbamoyltransferase, regulatory subunit [Nucleotide transport and metabolism]. aspartate carbamoyltransferase regulatory subunit; Reviewed	0
421492	cl27388	Herpes_UL31	Herpesvirus UL31-like protein. nuclear egress lamina protein UL31; Provisional	0
391731	cl27389	Pap_E4	E4 protein. E4 protein; Provisional	0
421493	cl27397	WhiA_N	WhiA N-terminal LAGLIDADG-like domain. This family describes a DNA-binding protein widely conserved in Gram-positive bacteria, and occasionally occurring elsewhere, such as in Thermotoga. It is associated with cell division, and in sporulating organisms with sporulation. [Cellular processes, Cell division]	0
421494	cl27405	RecO_C	Recombination protein O C terminal. All proteins in this family for which functions are known are DNA binding proteins that are involved in the initiation of recombination or recombinational repair. [DNA metabolism, DNA replication, recombination, and repair]	0
421495	cl27409	PsbK	Photosystem II 4 kDa reaction centre component. Photosystem II reaction center protein K; Provisional	0
332231	cl27410	PsbI	Photosystem II reaction centre I protein (PSII 4.8 kDa protein). photosystem II protein I	0
421496	cl27417	Rep_trans	Replication initiation factor. DNA replication initiation protein	0
421497	cl27418	Branch	Core-2/I-Branching enzyme. acetylglucosaminyltransferase  family protein; Provisional	0
421498	cl27421	DisA_N	DisA bacterial checkpoint controller nucleotide-binding. These proteins have no detectable global or local homology to any protein of known function. Members are restricted to the bacteria and found broadly in lineages other than the Proteobacteria. [Hypothetical proteins, Conserved]	0
421499	cl27429	PsbL	PsbL protein. photosystem II protein L	0
421500	cl27440	Cyt_c_Oxidase_VIII	N/A. Cytochrome c oxidase, a 13 sub-unit complex, EC:1.9.3.1 is the terminal oxidase in the mitochondrial electron transport chain. This family is composed of cytochrome c oxidase subunit VIII.	0
421501	cl27443	Orthopox_35kD	35kD major secreted virus protein. chemokine binding protein; Provisional	0
421502	cl27447	WSN	Domain of unknown function. 	0
421503	cl27448	GoLoco	GoLoco motif. GEF specific for Galpha_i proteins	0
421504	cl27450	BH4	Bcl-2 homology region 4. 	0
421508	cl27463	TrpBP	Tryptophan RNA-binding attenuator protein. 	0
391748	cl27467	NMU	Neuromedin U. Neuromedin U (NmU) is a vertebrate peptide which stimulates uterine smooth muscle contraction and causes selective vasoconstriction. Like most other active peptides, it is proteolytically processed from a larger precursor protein. The mature peptides are 8 (NmU-8) to 25 (NmU-25) residues long and C-terminally amidated. The sequence of the C-terminal extremity of NmU is extremely well conserved in mammals, birds and amphibians.	0
421509	cl27474	AdoMet_Synthase	S-adenosylmethionine synthetase (AdoMet synthetase). This family consists of several archaebacterial S-adenosylmethionine synthetase (AdoMet synthetase or MAT) (EC 2.5.1.6). S-Adenosylmethionine (AdoMet) occupies a central role in the metabolism of all cells. The biological roles of AdoMet include acting as the primary methyl group donor, as a precursor to the polyamines, and as a progenitor of a 5'-deoxyadenosyl radical. S-Adenosylmethionine synthetase catalyzes the only known route of AdoMet biosynthesis. The synthetic process occurs in a unique reaction in which the complete triphosphate chain is displaced from ATP and a sulfonium ion formed. MATs from various organisms contain ~400-amino acid polypeptide chains.	0
421510	cl27476	Ribosomal_L14e	Ribosomal protein L14. 60S ribosomal protein L14; Provisional	0
421511	cl27487	HOK_GEF	Hok/gef family. small toxic polypeptide; Provisional	0
332309	cl27488	Bax	Uncharacterized FlgJ-related protein [General function prediction only]. 	0
421512	cl27498	PsbJ	PsbJ. photosystem II reaction center protein J; Provisional	0
421513	cl27501	Ribosomal_L29e	Ribosomal L29e protein family. 60S ribosomal protein L29; Provisional	0
421514	cl27502	Folate_carrier	Reduced folate carrier. The Reduced Folate Carrier (RFC) Family (TC 2.A.48) Members of the RFC family mediate the uptake of folate, reduced folate, derivatives of reduced folate and the drug, methotrexate. Proteins of the RFC family are so-far restricted to animals. RFC proteins possess 12 putative transmembrane a-helical spanners (TMSs) and evidence for a 12 TMS topology has been published for the human RFC. The RFC transporters appear to transport reduced folate by an energy-dependent, pH-dependent, Na+-independent mechanism. Folate:H+ symport, folate:OH- antiport and folate:anion antiport mechanisms have been proposed, but the energetic mechanism is not well defined. [Transport and binding proteins, Carbohydrates, organic alcohols, and acids]	0
421515	cl27506	zf-A20	A20-like zinc finger. A20- (an inhibitor of cell death)-like zinc fingers. The zinc finger mediates self-association in A20. These fingers also mediate IL-1-induced NF-kappaB activation.	0
421516	cl27507	Ycf9	YCF9. PsbZ is a core protein of photosystem II in thylakoid-containing Cyanobacteria and plant chloroplasts. The original Chlamydomonas gene symbol, ycf9, is a synonym. PsbZ controls the interaction of the reaction center core with the light-harvesting antenna. [Energy metabolism, Photosynthesis]	0
332333	cl27512	Adeno_PIX	Adenovirus hexon-associated protein (IX). capsid protein IX,hexon associated protein IX; Provisional	0
421517	cl27532	B56	Protein phosphatase 2A regulatory B subunit (B56 family). serine/threonine protein phosphatase 2A; Provisional	0
421518	cl27544	Herpes_glycop_D	Herpesvirus glycoprotein D/GG/GX domain. envelope glycoprotein D; Provisional	0
421520	cl27556	Col_cuticle_N	Nematode cuticle collagen N-terminal domain. The function of this domain is unknown. It is found in the N-terminal region of nematode cuticle collagens. Cuticle is a tough elastic structure secreted by hypodermal cells and is primarily composed of collagen proteins.	0
421521	cl27557	P_proprotein	Proprotein convertase P-domain. A unique feature of the eukaryotic subtilisin-like proprotein convertases is the presence of an additional highly conserved sequence of approximately 150 residues (P domain) located immediately downstream of the catalytic domain.	0
421523	cl27575	PHO4	Phosphate transporter family. This family includes PHO-4 from Neurospora crassa which is a Na(+)-phosphate symporter. This family also contains the leukaemia virus receptor.	0
421524	cl27585	Adenylate_cycl	Adenylate cyclase, class-I. 	0
421525	cl27586	Thymosin	Thymosin beta-4 family. 	0
421526	cl27588	Parathyroid	Parathyroid hormone family. 	0
421527	cl27628	Gastrin	Gastrin/cholecystokinin family. This family gathers small proteins of about 100-130 amino acids that act as hormones, among them gastrin, cholecystokinin and preprocaerulein which stimulate gastric, biliary, and pancreatic secretion and smooth muscle contraction.	0
421528	cl27631	Nebulin	Nebulin repeat. Tandem arrays of these repeats are known to bind actin.	0
421529	cl27632	Transposase_mut	Transposase, Mutator family. Members of this family belong to the branch of the IS256-like family of transposases that includes the founding member. It excludes the IS1249 group.	0
332455	cl27634	STNV	N/A. STNV domain; satellite tobacco necrosis virus (STNV) are small plant viruses which are completely dependent on the presence of a specific helper virus, TNV, for their replication;  60 identical subunits, this domain is one of them; form an icosahedral shell around a single RNA molecule. Half of the RNA codes for the coat protein with the other half being non-coding. The STNV domain has a "Swiss roll" Greek key topology with its two 4-stranded antiparallel beta sheets	0
332467	cl27646	Polyhedrin	Polyhedrin. polyhedrin; Provisional	0
332469	cl27648	Polyoma_coat	Polyomavirus coat protein. Major capsid protein VP1; Provisional	0
421530	cl27652	Plectin	Plectin repeat. This family includes repeats from plectin, desmoplakin, envoplakin and bullous pemphigoid antigen.	0
421531	cl27657	Motile_Sperm	MSP (Major sperm protein) domain. Major sperm proteins are involved in sperm motility. These proteins oligomerize to form filaments. This family contains many other proteins.	0
421532	cl27659	Filamin	Filamin/ABP280 repeat. These form a rod-like structure in the actin-binding cytoskeleton protein, filamin. The C-terminal repeats of filamin bind beta1-integrin (CD29).	0
421533	cl27660	MAM	N/A. An extracellular domain found in many receptors. The MAM domain along with the associated Ig domain in type IIB receptor protein tyrosine phosphatases forms a structural unit (termed MIg) with a seamless interdomain interface. It plays a major role in homodimerization of the phosphatase ectoprotein and in cell adhesion. MAM is a beta-sandwich consisting of two five-stranded antiparallel beta-sheets rotated away from each other by approx 25 degrees, and plays a similar role in meprin metalloproteinases.	0
332494	cl27673	E6	Early Protein (E6). E6 protein; Provisional	0
355670	cl27691	AcrR	DNA-binding transcriptional regulator, AcrR family [Transcription]. transcriptional regulator BetI; Validated	0
421536	cl27706	Prion	Prion/Doppel alpha-helical domain. The prion protein is a major component of scrapie-associated fibrils in Creutzfeldt-Jakob disease, kuru, Gerstmann-Straussler syndrome and bovine spongiform encephalopathy.	0
421537	cl27713	Defensin_1	Mammalian defensin. Cysteine-rich domains that lyse bacteria, fungi and enveloped viruses by forming multimeric membrane-spanning channels.	0
391777	cl27714	Endothelin	Endothelin family. endothelin precursor; Provisional	0
355673	cl27727	BBI	N/A. Bowman-Birk type proteinase inhibitor (BBI); family of plant serine protease inhibitors that block trypsin or chymotrypsin. They are either single-headed (one reactive site, one inactive site, present mainly in monocotyledonous seeds) or double-headed (two reactive sites, present mainly in dicotyledonous seeds).	0
421538	cl27728	Calc_CGRP_IAPP	Calcitonin / CGRP / IAPP family. This family is formed by calcitonin, the calcitonin gene-related peptide, and amylin. They are short polypeptide hormones.	0
421539	cl27729	ANP	Atrial natriuretic peptide. Atrial natriuretic peptides are vertebrate hormones important in the overall control of cardiovascular homeostasis and sodium and water balance in general.	0
421541	cl27746	Hormone_2	Peptide hormone. This family contains glucagon, GIP, secretin and VIP.	0
421542	cl27758	Zona_pellucida	Zona pellucida-like domain. ZP proteins are responsible for sperm adhesion to the zona pellucida. ZP domains are also present in multidomain transmembrane proteins such as glycoprotein GP2, uromodulin and TGF-beta receptor type III (betaglycan).	0
332591	cl27770	Tail_VII	Inovirus G7P protein. minor coat protein	0
332592	cl27771	RPS31	Ribosomal protein S31e. Members of this protein are the lineage-specific bacterial ribosomal small subunit protein bTHX (previously THX), originally shown to exist in the genus Thermus. The protein is conserved for the first 26 amino acids, past which some members continue with additional sequence, often repetitive or low-complexity. This model also finds eukaryotic organelle forms, which have additional N-terminal transit peptides. [Protein synthesis, Ribosomal proteins: synthesis and modification]	0
421544	cl27778	Phage_clamp_A	Bacteriophage clamp loader A subunit. clamp loader small subunit; Provisional	0
421545	cl27779	DMP12	Putative DNA mimic protein DMP12. This is a family of DNA-mimic proteins expressed by Neisseria species. In its monomeric form DMP12 interacts with the Neisseria dimeric form of the bacterial histone-like protein HU. HU proteins promote the assembly of higher-order DNA-protein structures. The interaction between DMP12 and HU protein may be instrumental in controlling the stability of the nucleoid in Neisseria as DMP12 prevents Neisseria HU protein from being digested by trypsin.	0
421546	cl27780	UL141	Herpes-like virus membrane glycoprotein UL141. UL14 tegument protein; Provisional	0
332602	cl27781	EAGR_box	Enriched in aromatic and glycine Residues box. The EAGR box (Enriched in Aromatic and Glycine Residues) is found in three different proteins of the Mycoplasma genitalium terminal organelle, which acts in both cytadherence and gliding motility. The presence of this domain in a genome predicts the Mycoplasma-type terminal organelle structure, gliding motility, and cytadherence. The EAGR box may occur from one to nine times in a protein.	0
355678	cl27782	Yop-YscD_ppl	Inner membrane component of T3SS, periplasmic domain. Yop-YscD-ppl is the periplasmic domain of Yop proteins like YscD from Proteobacteria. YscD forms part of the inner membrane component of the bacterial type III secretion injectosome apparatus.	0
421547	cl27783	Cas9_REC	REC lobe of CRISPR-associated endonuclease Cas9. CRISPR loci appear to be mobile elements with a wide host range. This model represents a protein found only in CRISPR-containing species, near other CRISPR-associated proteins (cas), as part of the NMENI subtype of CRISPR/Cas locus. The species range so far for this protein is animal pathogens and commensals only.	0
332607	cl27786	FPRL1_inhibitor	Formyl peptide receptor-like 1 inhibitory protein. formyl peptide receptor-like 1 inhibitory protein; Reviewed	0
332610	cl27789	YmcE_antitoxin	Putative antitoxin of bacterial toxin-antitoxin system. YmcE_antitoxin is the putative antitoxin for the supposed bacterial toxin GnsA, UniProtKB:P0AC92, family pfam08178.	0
421548	cl27796	Tox-PLDMTX	Dermonecrotoxin of the Papain-like fold. A papain fold toxin domain found in bacterial polymorphic toxin systems.	0
421549	cl27802	MerR_2	MerR HTH family regulatory protein. 	0
332632	cl27811	G-7-MTase	mRNA (guanine-7-)methyltransferase (G-7-MTase). This model represents a common C-terminal region shared by paramyxovirus-like RNA-dependent RNA polymerases (see pfam00946). Polymerase proteins described by these two models are often called L protein (large polymerase protein). Capping of mRNA requires RNA triphosphatase and guanylyl transferase activities, demonstrated for the rinderpest virus L protein and at least partially localized to the region of this model.	0
332633	cl27812	EF-hand_4	Cytoskeletal-regulatory complex EF hand. Pair of EF hand motifs that recognise proteins containing Asn-Pro-Phe (NPF) sequences.	0
332642	cl27821	T4_baseplate	T4 bacteriophage base plate protein. baseplate hub assembly protein; Provisional	0
332643	cl27822	Protein_K	Bacteriophage protein K. protein K	0
332644	cl27823	Sulf_coat_C	Sulfolobus virus coat protein C terminal. coat protein	0
332645	cl27824	VirE1	Single-strand DNA-binding protein. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length. There is a conserved IELE sequence motif. VirE1 is an acidic chaperone protein which binds to VirE2, a ssDNA binding protein. These proteins are virulence factors of the plant pathogens Agrobacteria. VirE1 competes for the ssDNA binding site of VirE2.	0
332646	cl27825	VirArc_Nuclease	Viral/Archaeal nuclease. hypothetical protein	0
332649	cl27828	T4_neck-protein	Virus neck protein. neck protein; Provisional	0
421551	cl27829	NTPase_P4	ATPase P4 of dsRNA bacteriophage phi-12. packaging NTPase P4	0
332651	cl27830	DUF3130	Protein of unknown function (DUF3130). Members of this protein family are similar in length and sequence (although remotely) to the WXG100 family of type VII secretion system (T7SS) targets, described by family TIGR03930. Phylogenetic profiling shows that members of this family are similarly restricted to species with T7SS, marking this family as a related set of T7SS effectors. Members include SACOL2603 from Staphylococcus aureus subsp. aureus COL. Oddly, members of family pfam10824 (DUF2580), which appears also to be related, seem not to be tied to T7SS.	0
332652	cl27831	Phage_DsbA	Transcriptional regulator DsbA. double-stranded DNA binding protein; Provisional	0
332653	cl27832	DUF2830	Protein of unknown function (DUF2830). lysis protein	0
391790	cl27833	Phage_glycop_gL	Viral glycoprotein L. hypothetical protein; Provisional	0
332655	cl27834	UL11	Membrane-associated tegument protein. tegument protein UL11; Provisional	0
332656	cl27835	DNA_Packaging	Terminase DNA packaging enzyme. small terminase protein; Provisional	0
421552	cl27836	DUF2810	Protein of unknown function (DUF2810). This is a bacterial family of uncharacterized proteins.	0
332658	cl27837	DUF2685	Protein of unknown function (DUF2685). hypothetical protein; Provisional	0
355682	cl27838	DUF2701	Protein of unknown function (DUF2701). putative transmembrane protein; Provisional	0
332660	cl27839	DUF2649	Protein of unknown function (DUF2649). hypothetical protein	0
332661	cl27840	DUF2654	Protein of unknown function (DUF2654). hypothetical protein; Provisional	0
332662	cl27841	DUF2733	Protein of unknown function (DUF2733). Alkaline exonuclease; Provisional	0
391791	cl27842	YbaJ	Biofilm formation regulator YbaJ. YbaJ regulates biofilm formation. It also has an important role in the regulation of motility in the biofilm. YbaJ functions in increasing conjugation, aggregation and decreasing the motility, resulting in an increase of biofilm	0
332664	cl27843	Phage_holin_2_2	Phage holin T7 family, holin superfamily II. type II holin	0
332665	cl27844	RepB-RCR_reg	Replication regulatory protein RepB. This is a family of proteins which regulate replication of rolling circle replication (RCR) plasmids that have a double-strand replication origin (dso). Regulation of replication of RCR plasmids occurs mainly at initiation of leading strand synthesis at the dso, such that Rep protein concentration controls plasmid replication.	0
355684	cl27850	Glyco_hydro_65m	Glycosyl hydrolase family 65 central catalytic domain. maltose phosphorylase; Provisional	0
332674	cl27853	GSu_C4xC__C2xCH	Geobacter CxxxxCH...CXXCH motif (GSu_C4xC__C2xCH). This domain occurs from three to eight times in eight different proteins of Geobacter sulfurreducens. The final CXXCH motif matches ProSite motif PS00190, the cytochrome c family heme-binding site signature, suggesting	0
332675	cl27854	IpaC_SipC	Salmonella-Shigella invasin protein C (IpaC_SipC). This model represents a family of proteins associated with bacterial type III secretion systems, which are injection machines for virulence factors into host cell cytoplasm. Characterized members of this protein family are known to be secreted and are described as invasins, including IpaC from Shigella flexneri (SP:P18012) and SipC from Salmonella typhimurium (GB:AAA75170.1). Members may be referred to as invasins, pathogenicity island effectors, and cell invasion proteins. [Cellular processes, Pathogenesis]	0
332676	cl27855	Spore_SspJ	Small spore protein J (Spore_SspJ). New small, acid-soluble proteins unique to spores of Bacillus subtilis [Cellular processes, Sporulation and germination]	0
391792	cl27856	DUF2374	Protein of unknown function (Duf2374). This very small protein (about 46 amino acids) consists largely of a single predicted membrane-spanning region. It is found in Photobacterium profundum SS9 and in three species of Vibrio, always near periplasmic nitrate reductase genes, but far from the periplasmic nitrate reductase genes in Aeromonas hydrophila ATCC7966. [Hypothetical proteins, Conserved]	0
421553	cl27860	Pfg27	Pfg27. gamete antigen 27/25-like protein; Provisional	0
421554	cl27861	Phage-Gp8	Bacteriophage T4, Gp8. baseplate wedge subunit; Provisional	0
421555	cl27870	T3SS_needle_E	Type III secretion system, cytoplasmic E component of needle. Members of this family are found exclusively in type III secretion apparatus gene clusters in bacteria. Those bacteria with a protein from this family tend to target animal cells, as does Yersinia pestis. This protein is small (about 70 amino acids) and not well characterized. [Cellular processes, Pathogenesis]	0
332699	cl27878	Flu_M1_C	Influenza Matrix protein (M1) C-terminal domain. This region is thought to be a second domain of the M1 matrix protein.	0
391795	cl27882	Phage_1_1	Bacteriophage 1.1 Protein. hypothetical protein	0
332704	cl27883	SspN	Small acid-soluble spore protein N family. acid-soluble spore protein N; Provisional	0
332705	cl27884	TetM_leader	Tetracycline resistance determinant leader peptide. tetracycline resistance determinant leader peptide; Provisional	0
332706	cl27885	Leu_leader	Leucine operon leader peptide. leu operon leader peptide; Provisional	0
391796	cl27886	Tna_leader	Tryptophanase operon leader peptide. tryptophanase leader peptide; Provisional	0
391797	cl27888	PaaX_C	PaaX-like protein C-terminal domain. This transcriptional regulator is always found in association with operons believed to be involved in the degradation of phenylacetic acid. The gene product has been shown to bind to the promoter sites and repress their transcription. [Regulatory functions, DNA interactions]	0
332711	cl27890	Chaperone_III	Type III secretion chaperone domain. Type III secretion chaperones are involved in delivering virulence effector proteins from bacterial pathogens directly into eukaryotic cells. The chaperones may prevent aggregation and degradation of their substrates, may target the effector to the secretion apparatus, and may ensure a secretion-competent unfolded conformation of their specific substrate. One member of this family, SigE, forms homodimers in crystal. The monomers have a novel fold with an alpha-beta(3)-alpha-beta(2)-alpha topology.	0
332717	cl27896	Herpes_UL37_2	Betaherpesvirus immediate-early glycoprotein UL37. UL37 tegument protein; Provisional	0
332720	cl27899	Orthopox_B11R	Orthopoxvirus B11R protein. hypothetical protein; Provisional	0
332721	cl27900	DUF1314	Protein of unknown function (DUF1314). circ protein; Provisional	0
421556	cl27901	GlpM	GlpM protein. This family consists of several bacterial GlpM membrane proteins. GlpM is a hydrophobic protein containing 109 amino acids. It is thought that GlpM may play a role in alginate biosynthesis in Pseudomonas aeruginosa.	0
332723	cl27902	DUF1235	Protein of unknown function (DUF1235). hypothetical protein; Provisional	0
332726	cl27905	DUF1231	Protein of unknown function (DUF1231). hypothetical protein; Provisional	0
332734	cl27913	DUF1181	Protein of unknown function (DUF1181). hypothetical protein; Provisional	0
332736	cl27915	Orthopox_F6	Orthopoxvirus F6 protein. hypothetical protein; Provisional	0
391798	cl27921	DUF1039	Protein of unknown function (DUF1039). type III secretion system protein SsaH; Provisional	0
332743	cl27922	DUF1029	Protein of unknown function (DUF1029). ORF091 IMV membrane protein; Provisional	0
332744	cl27923	Orthopox_A5L	Orthopoxvirus A5L protein-like. virion core protein; Provisional	0
332748	cl27927	Chordopox_E11	Chordopoxvirus E11 protein. putative virion core protein; Provisional	0
332749	cl27928	Chordopox_G3	Chordopoxvirus G3 protein. hypothetical protein; Provisional	0
421558	cl27929	Pox_A30L_A26L	Orthopoxvirus A26L/A30L protein. A-type inclusion protein; Provisional	0
332751	cl27930	Chordopox_A30L	Chordopoxvirus A30L protein. ORF107 virion morphogenesis; Provisional	0
421559	cl27931	RING_CBP-p300	atypical RING domain found in CREB-binding protein and p300 histone acetyltransferases. This domain of unknown function is found in several transcriptional co-activators including the CREB-binding protein, which is an acetyltransferase that acetylates histones, giving a specific tag for transcriptional activation. This short domain is found to the C-terminus of bromodomains. The 40 residue domain contains four conserved cysteines suggesting that it may be stabilized by a zinc ion. In CREB this domain is to the N-terminus of another zinc binding PHD domain.	0
332753	cl27932	Herpes_U5	Herpesvirus U5-like family. hypothetical protein; Provisional	0
332754	cl27933	Chordopox_A35R	Chordopoxvirus A35R protein. hypothetical protein; Provisional	0
421560	cl27937	PSII_Ycf12	Photosystem II complex subunit Ycf12. Ycf12; Provisional	0
332759	cl27938	Chordopox_A33R	Chordopoxvirus A33R protein. EEV glycoprotein; Provisional	0
332760	cl27939	Chordopox_A13L	Chordopoxvirus A13L protein. IMV membrane protein; Provisional	0
391800	cl27940	AgrD	Staphylococcal AgrD protein. Members of this family of short peptides are precursors to thiolactone (unless Cys is replaced by Ser) cyclic autoinducer peptides, used in quorum-sensing systems in Gram-positive bacteria. The best characterized is the AgrD precursor, processed by the AgrB protein. Nearby proteins regularly encountered include a histidine kinase and a response regulator. This model is related to pfam05931 but is newer and currently broader in scope.	0
332762	cl27941	Orthopox_F8	Orthopoxvirus F8 protein. Hypothetical protein; Provisional	0
332763	cl27942	Chordopox_RPO7	Chordopoxvirus DNA-directed RNA polymerase 7 kDa polypeptide (RPO7). DNA-dependent RNA polymerase subunit; Provisional	0
332764	cl27943	DUF848	Gammaherpesvirus protein of unknown function (DUF848). hypothetical protein; Provisional	0
332765	cl27944	Orthopox_F7	Orthopoxvirus F7 protein. hypothetical protein; Provisional	0
332766	cl27945	Herpes_BLRF2	Herpesvirus BLRF2 protein. hypothetical protein; Provisional	0
332768	cl27947	Chordopox_G2	Chordopoxvirus protein G2. transcriptional elongation factor; Provisional	0
332769	cl27948	Herpes_heli_pri	Herpesvirus helicase-primase complex component. helicase-primase primase subunit; Provisional	0
355689	cl27949	Pox_A14	Poxvirus virion envelope protein A14. ORF090 IMV phosphorylated membrane protein; Provisional	0
421561	cl27954	SpvD	Salmonella plasmid virulence protein SpvD. This family consists of several SpvD plasmid virulence proteins from different Salmonella species. The structure of the protein from Salmonella typhimurium has been solved and shows a papain-like fold, with a predicted catalytic triad of Cys73, His162 and Asp182. The protein has been shown to have deubiquitinating-like activity, releasing aminoluciferin (AML) from Ub-AML.	0
332777	cl27956	Pox_G7	Poxvirus G7-like. putative virion core protein; Provisional	0
421563	cl27957	Phi-29_GP4	Phi-29-like late genes activator (early protein GP4). transcriptional regulator	0
332779	cl27958	Pox_ser-thr_kin	Poxvirus serine/threonine protein kinase. Ser/Thr kinase; Provisional	0
332782	cl27961	Pox_A21	Poxvirus A21 Protein. hypothetical protein; Provisional	0
332785	cl27964	Pox_A3L	Poxvirus A3L Protein. virus redox protein; Provisional	0
332788	cl27967	DUF705	Protein of unknown function (DUF705). This model represents a family of viral proteins of unknown function. These proteins are members, however, of the IIIC (TIGR01681) subfamily of the haloacid dehalogenase (HAD) superfamily of aspartate nucleophile hydrolases. All characterized members of the III subfamilies (IIIA, TIGR01662; IIIB, pfam03767) are phosphatases, including MDP-1, a member of subfamily IIIC (TIGR01681). No member of this subfamily is characterized with respect to particular function. All of the active site residues characteristic of HAD-superfamily phosphatases are present in subfamily IIIC. These proteins also include an N-terminal domain (ca. 125 aas) that is unique to this clade.	0
332791	cl27970	Baculo_p47	Baculovirus P47 protein. viral transcription regulator p47; Provisional	0
332792	cl27971	LEF-9	Late expression factor 9 (LEF-9). late expression factor 9; Provisional	0
332793	cl27972	DUF678	Protein of unknown function (DUF678). hypothetical protein; Provisional	0
332794	cl27973	Herpes_UL43	Herpesvirus UL43 protein. UL43 envelope protein; Provisional	0
332795	cl27974	Pox_A11	Poxvirus A11 Protein. hypothetical protein; Provisional	0
391803	cl27976	PIF3	Per os infectivity factor 3. per os infectivity factor 3; Provisional	0
332800	cl27979	LEF-8	Late expression factor 8 (LEF-8). DNA-directed RNA polymerase subunit beta-like protein; Provisional	0
332801	cl27980	DUF655	Protein of unknown function (DUF655). This family includes several uncharacterized archaeal proteins. This protein appears to contain two HHH motifs.	0
332802	cl27981	Pox_M2	Poxvirus M2 protein. hypothetical protein; Provisional	0
332804	cl27983	Pox_L5	Poxvirus L5 protein family. ORF051 putative membrane protein; Provisional	0
332806	cl27985	Pox_E10	E10-like protein conserved region. sulfhydryl oxidase; Provisional	0
391804	cl27986	Herpes_BBRF1	BRRF1-like protein. hypothetical protein; Provisional	0
332808	cl27987	Pox_H7	Late protein H7. hypothetical protein; Provisional	0
332809	cl27988	Pox_F17	DNA-binding 11 kDa phosphoprotein. ORF017 DNA-binding phosphoprotein; Provisional	0
332810	cl27989	InvH	InvH outer membrane lipoprotein. This family represents the Salmonella outer membrane lipoprotein InvH. The molecular function of this protein is unknown, but it is required for the localization to outer membrane of InvG, which is involved in a type III secretion apparatus mediating host cell invasion.	0
391805	cl27990	Agro_virD5	Agrobacterium VirD5 protein. The virD operon in Agrobacterium encodes a site-specific endonuclease, and a number of other poorly characterised products. This family represents the VirD5 protein.	0
355692	cl27992	Pox_I5	Poxvirus protein I5. putative IMV membrane protein; Provisional	0
332814	cl27993	Pox_F16	Poxvirus F16 protein. hypothetical protein; Provisional	0
332816	cl27995	Microvir_H	Microvirus H protein (pilot protein). minor spike protein	0
332817	cl27996	Herpes_BTRF1	Herpesvirus BTRF1 protein conserved region. hypothetical protein; Provisional	0
332818	cl27997	Pox_I3	Poxvirus I3 ssDNA-binding protein. DNA-binding phosphoprotein; Provisional	0
332819	cl27998	Pox_E6	Pox virus E6 protein. Hypothetical protein; Provisional	0
355693	cl27999	Herpes_pp85	Herpesvirus phosphoprotein 85 (HHV6-7 U14/HCMV UL25). DNA packaging tegument protein UL25; Provisional	0
332821	cl28000	Pox_G5	Poxvirus G5 protein. Hypothetical protein; Provisional	0
332822	cl28001	Pox_F15	Poxvirus protein F15. hypothetical protein; Provisional	0
332823	cl28002	Herpes_UL55	Herpesvirus UL55 protein. nuclear protein UL55; Provisional	0
332824	cl28003	Herpes_U44	Herpes virus U44 protein. tegument protein; Provisional	0
332825	cl28004	Microvir_lysis	Microvirus lysis protein (E), C-terminus. cell lysis protein	0
332826	cl28005	Pox_VP8_L4R	Poxvirus nucleic acid binding protein VP8/L4R. DNA-binding virion core protein; Provisional	0
355694	cl28006	PHA02695	N/A. hypothetical protein; Provisional	0
332830	cl28009	Poxvirus_B22R	Poxvirus B22R protein. hypothetical protein; Provisional	0
421565	cl28016	Hema_HEFG	Hemagglutinin domain of haemagglutinin-esterase-fusion glycoprotein. 	0
332838	cl28017	Herpes_UL37_1	Herpesvirus UL37 tegument protein. UL37 tegument protein; Provisional	0
332842	cl28021	Phage_mat-A	Phage maturation protein. maturation protein	0
332856	cl28035	Herpes_UL33	Herpesvirus UL33-like protein. DNA packaging protein UL33; Provisional	0
355695	cl28037	PHA03163	N/A. hypothetical protein; Provisional	0
332861	cl28040	Peptidase_C37	Southampton virus-type processing peptidase. Corresponds to Merops family C37. Norwalk-like viruses (NLVs), including the Southampton virus, cause acute non-bacterial gastroenteritis in humans. The NLV genome encodes three open reading frames (ORFs). ORF1 encodes a polyprotein, which is processed by the viral protease into six proteins.	0
332864	cl28043	Peptidase_M44	Metallopeptidase from vaccinia pox. putative metalloprotease; Provisional	0
332865	cl28044	Pox_E8	Poxvirus E8 protein. putative membrane protein; Provisional	0
355696	cl28045	Herpes_UL46	Herpesvirus UL46 protein. tegument protein VP11/12; Provisional	0
332867	cl28046	Pox_LP_H2	Viral late protein H2. putative viral membrane protein; Provisional	0
332868	cl28047	Pox_L3_FP4	Poxvirus L3/FP4 protein. hypothetical protein; Provisional	0
332869	cl28048	Pox_F12L	Poxvirus F12L protein. EEV maturation protein; Provisional	0
421566	cl28050	Herpes_VP19C	Herpesvirus capsid shell protein VP19C. Capsid triplex subunit 1; Provisional	0
355697	cl28051	PHA03144	N/A. helicase-primase primase subunit; Provisional	0
332874	cl28053	Pox_I1	Poxvirus protein I1. putative DNA-binding virion core protein; Provisional	0
332875	cl28054	Pox_Ag35	Pox virus Ag35 surface protein. late transcription factor VLTF-4; Provisional	0
332877	cl28056	Herpes_UL21	Herpesvirus UL21. tegument protein UL21; Provisional	0
332878	cl28057	Pox_P35	Poxvirus P35 protein. ORF059 IMV protein VP55; Provisional	0
391808	cl28058	DNA_pol_B_2	DNA polymerase type B, organellar and viral. DNA polymerase; Provisional	0
332886	cl28065	Herpes_UL79	UL79 family. hypothetical protein; Provisional	0
355700	cl28066	Herpes_UL16	Herpesvirus UL16/UL94 family. tegument protein UL16; Provisional	0
355701	cl28067	Herpes_UL87	Herpesvirus UL87 family. hypothetical protein; Provisional	0
332889	cl28068	Pox_G9-A16	Pox virus entry-fusion-complex G9/A16. poxvirus myristoylprotein; Provisional	0
332891	cl28070	gpD	Bacteriophage scaffolding protein D. external scaffolding protein	0
332896	cl28075	Flavi_E_C	Immunoglobulin-like domain III (C-terminal domain) of Flavivirus envelope glycoprotein E. The C-terminal domain (domain III) of Flavivirus glycoprotein E appears to be involved in low-affinity interactions with negatively charged glycoaminoglycans on the host cell surface. Domain III may also play a role in interactions with alpha-v-beta-3 integrins in West Nile virus, Japanese encephalitis virus, and Dengue virus. The interface between domain I and domain III appears to be destabilized by the low-pH environment of the endosome, and domain III may play a vital role in the conformational changes of envelope glycoprotein E that follow the clathrin-mediated endocytosis of viral particles and are a prerequisite to membrane fusion.	0
332900	cl28079	Herpes_Helicase	Helicase. helicase-primase subunit BBLF4; Provisional	0
421567	cl28085	US2	US2 family. virion protein US2; Provisional	0
421568	cl28086	PsbN	Photosystem II reaction centre N protein (psbN). photosystem II protein N	0
421569	cl28088	L1R_F9L	Lipid membrane protein of large eukaryotic DNA viruses. S-S bond formation pathway protein; Provisional	0
355704	cl28089	TrkG	Trk-type K+ transport system, membrane component [Inorganic ion transport and metabolism]. The proteins of the Trk family are derived from Gram-negative and Gram-positive bacteria, yeast and wheat. The proteins of E. coli K12 TrkH and TrkG as well as several yeast proteins have been functionally characterized. The E. coli TrkH and TrkG proteins are complexed to two peripheral membrane proteins, TrkA, an NAD-binding protein, and TrkE, an ATP-binding protein. This complex forms the potassium uptake system. [Transport and binding proteins, Cations and iron carrying compounds]	0
332926	cl28105	Vac_Fusion	Chordopoxvirus multifunctional envelope protein A27. ORF104 fusion protein; Provisional	0
332927	cl28106	Phage_B	Scaffold protein B. internal scaffolding protein	0
332933	cl28112	PhoU_div	Protein of unknown function DUF47. An apparent homolog with a suggested function is Pit accessory protein from Sinorhizobium meliloti, which may be involved in phosphate (Pi) transport. [Hypothetical proteins, Conserved]	0
332935	cl28114	MatK_N	MatK/TrnK amino terminal region. maturase K	0
421570	cl28115	Levi_coat	Levivirus coat protein. coat protein	0
421571	cl28116	Translat_reg	Bacteriophage translational regulator. translation repressor protein; Provisional	0
332939	cl28118	Cytomega_gL	Cytomegalovirus glycoprotein L. envelope glycoprotein L; Provisional	0
332940	cl28119	Polyoma_agno	Polyomavirus agnoprotein. agnoprotein; Provisional	0
332942	cl28121	Herpes_UL7	Herpesvirus UL7 like. UL7 tegument protein; Provisional	0
332943	cl28122	Herpes_env	Herpesvirus putative major envelope glycoprotein. DNA packaging protein UL32; Provisional	0
332947	cl28126	Fibritin_C	Fibritin C-terminal region. fibritin; Provisional	0
332949	cl28128	Herpes_glycop	Herpesvirus glycoprotein M. envelope glycoprotein M; Provisional	0
421572	cl28129	Herpes_UL25	Herpesvirus UL25 family. DNA packaging tegument protein UL25; Provisional	0
421573	cl28134	PsbT	Photosystem II reaction centre T protein. photosystem II protein T	0
391813	cl28136	MMTV_SAg	Mouse mammary tumor virus superantigen. hypothetical protein; Provisional	0
355705	cl28141	psaI	N/A. photosystem I subunit VIII; Validated	0
421574	cl28142	Viral_DNA_bp	ssDNA binding protein. single-stranded DNA binding protein; Provisional	0
332973	cl28153	Late_protein_L2	Late Protein L2. major capsid L1 protein; Provisional	0
355706	cl28158	psbF	N/A. photosystem II protein VI	0
355708	cl28191	MSEP-CTERM	MSEP-CTERM protein. Members of this subfamily average about 850 amino acids in length, ending with a variant form of PEP-CTERM sorting signal. Members have a VIT (vault protein inter-alpha-trypsin inhibitor heavy chain) domain (pfam08487). Other bacterial subfamilies of VIT domain proteins have members with either GlyGly-CTERM or LPXTG C-terminal sorting signals. Members of this subfamily occur only in context next to a protein sorting/processing enzyme, exosortase N (XrtN). These subsystems occur both among the Bacteroidetes and in the spirochete genus Leptospira.	0
333016	cl28196	PchG	Oxidoreductase (NAD-binding), involved in siderophore biosynthesis  [Inorganic ion transport and metabolism]. This reductase is found associated with gene clusters for the biosynthesis of various non-ribosomal peptide derived natural products in which cysteine is cyclized to a thiazoline ring containing an imide double bond. Examples include yersiniabactin (irp3/YbtU, GP|21959262) and pyochelin (PchG, GP|4325022).	0
333077	cl28257	SpsG	Spore coat polysaccharide biosynthesis protein SpsG, predicted glycosyltransferase  [Cell wall/membrane/envelope biogenesis]. This protein is found in association with enzymes involved in the biosynthesis of pseudaminic acid, a component of polysaccharide in certain Pseudomonas strains as well as a modification of flagellin in Campylobacter and Helicobacter. The role of this protein is unclear, although it may participate in N-acetylation in conjunction with, or in the absence of PseH (TIGR03585) as it often scores above the trusted cutoff to pfam00583 representing a family of acetyltransferases.	0
421575	cl28269	CitF	Citrate lyase, alpha subunit (CitF). This is a model of the alpha subunit of the holoenzyme citrate lyase (EC 4.1.3.6) composed of alpha (EC 2.8.3.10), beta (EC 4.1.3.34), and acyl carrier protein subunits in a stoichiometric relationship of 6:6:6. Citrate lyase is an enzyme which converts citrate to oxaloacetate. In bacteria, this reaction is involved in citrate fermentation. The alpha subunit catalyzes the reaction Acetyl-CoA + citrate = acetate + (3S)-citryl-CoA. The seed contains an experimentally characterized member from Lactococcus lactis subsp. lactis. The model covers both Gram positive and Gram negative bacteria. It is quite robust with queries scoring either quite well or quite poorly against the model. There are currently no hits in between the noise cutoff and trusted cutoff. [Energy metabolism, Fermentation]	0
333203	cl28383	Pus10	tRNA U54 and U55 pseudouridine synthase Pus10 [Translation, ribosomal structure and biogenesis]. Members of this family show twilight-zone similarity to several predicted RNA pseudouridine synthases. All trusted members of this family are archaeal. Several eukaryotic homologs lack N-terminal homology including two CXXC motifs. [Hypothetical proteins, Conserved]	0
421576	cl28438	Phage_coatGP8	Phage major coat protein, Gp8. Class I phage major coat protein Gp8 or B. The coat protein is largely alpha-helix with a slight curve.	0
333259	cl28439	Pox_I6	Poxvirus I6-like family. Hypothetical protein; Provisional	0
391818	cl28444	Polyoma_coat2	Polyomavirus coat protein. VP3; Provisional	0
421577	cl28445	PlantTI	N/A. Plant trypsin inhibitors such as squash trypsin inhibitor. Plant proteinase inhibitors play important roles in natural plant defense. Proteinase inhibitors from squash seeds form a uniform family of small proteins cross-linked with three disulfide bridges.	0
333282	cl28462	pnk	bifunctional NADP phosphatase/NAD kinase. NAD kinase	0
333312	cl28492	PheB	ACT domain-containing protein  [General function prediction only]. 	0
333347	cl28527	HYS2	Archaeal DNA polymerase II, small subunit/DNA polymerase delta, subunit B [Replication, recombination and repair]. 	0
333351	cl28531	ENDO3c	Thermostable 8-oxoguanine DNA glycosylase  [Replication, recombination and repair, Defense mechanisms]. N-glycosylase/DNA lyase; Provisional	0
333356	cl28536	COG5412	Phage-related protein  [Mobilome: prophages, transposons]. membrane protein P6	0
355712	cl28539	PRK14982	N/A. This enzyme, found in cyanobacteria, reduces a long-chain (mainly C16 or C18) fatty acyl ACP ester to its corresponding fatty aldehyde, releasing the acyl carrier protein (ACP). NADPH or NADH is the reductant for this reaction. This enzyme may be distantly related to the short-chain dehydrogenase or reductase (SDR) family (pfam00106). The purpose of this reaction is in the first step of alkane biosynthesis (GenProp0942). [Central intermediary metabolism, Other]	0
333370	cl28550	PilV	Tfp pilus assembly protein PilV [Cell motility, Extracellular structures]. Pilus systems categorized as type IV pilins differ greatly from one another, with some showing greater similarity to type II or type III secretion systems than to each other. Members of this protein family represent the PilV protein of type IV pilus systems as found in Pseudomonas aeruginosa PAO1, Pseudomonas syringae DC3000, Neisseria meningitidis MC58, Xylella fastidiosa 9a5c, etc. [Cell envelope, Surface structures, Protein fate, Protein modification and repair]	0
333390	cl28570	CoxE	Uncharacterized conserved protein, contains von Willebrand factor type A (vWA) domain   [Function unknown]. 	0
421578	cl28577	MVD1	Mevalonate pyrophosphate decarboxylase  [Lipid transport and metabolism]. diphosphomevalonate decarboxylase	0
355713	cl28581	PRK15430	EamA family transporter RarD. This uncharacterized protein is predicted to have many membrane-spanning domains. [Transport and binding proteins, Unknown substrate]	0
333416	cl28596	NhaC	Na+/H+ antiporter NhaC [Energy production and conversion]. A single member of the NhaC family, a protein from Bacillus firmus, has been functionally characterized. It is involved in pH homeostasis and sodium extrusion. Members of the NhaC family are found in both Gram-negative bacteria and Gram-positive bacteria. Intriguingly, archaeal homolog ArcD (just outside boundaries of family) has been identified as an arginine/ornithine antiporter. [Transport and binding proteins, Cations and iron carrying compounds]	0
333419	cl28599	COG1318	Predicted transcriptional regulator [Transcription]. This model describes a common domain shared by two different families of proteins, each of which occurs regularly next to its corresponding partner family, a probable regulatory with homology to KaiC. By implication, this protein family likely is also involved in sensory transduction and/or regulation.	0
421579	cl28607	PHA02840	N/A. hypothetical protein; Provisional	0
333429	cl28609	PHA03178	N/A. UL43 envelope protein; Provisional	0
391819	cl28610	Herpes_HEPA	Herpesvirus DNA helicase/primase complex associated protein. hypothetical protein; Provisional	0
333432	cl28612	PHA03128	N/A. dUTPase; Provisional	0
333435	cl28615	PHA02681	N/A. putative IMV membrane protein; Provisional	0
333436	cl28616	PHA02670	N/A. GM-CSF/IL-2 inhibition factor; Provisional	0
333439	cl28619	PHA03415	N/A. virion protein; Provisional	0
333447	cl28627	PLN00046	N/A. Members of this family are the PsaO protein of photosystem I. This protein is found in chloroplasts but not in Cyanobacteria.	0
421580	cl28628	Reoviridae_Vp9	Reoviridae VP9. This model, broader than related pfam08978, describes proteins VP9 in Coltivirus, and proteins with various designations in the seadornavirus group: VP9 in Banna virus, VP10 in Liao ning virus, and VP11 in Kadipiro virus.	0
333450	cl28630	PRK15358	type III secretion systems effector SseF. pathogenicity island 2 effector protein SseG; Provisional	0
333451	cl28631	PRK15355	N/A. This model represents the conserved C-terminal domain of a protein conserved across species in the bacterial type III secretion apparatus. This protein is designated YscI (Yop proteins translocation protein I) in Yersinia and HrpB (hypersensitivity response and pathogenicity protein B) in plant pathogens such as Pseudomonas syringae. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	0
333457	cl28637	PHA03175	N/A. US22 family homolog; Provisional	0
333459	cl28639	Orthopox_F14	Orthopoxvirus F14 protein. hypothetical protein; Provisional	0
333460	cl28640	PHA02693	N/A. Hypothetical protein; Provisional	0
333461	cl28641	19	N/A. baseplate subunit; Provisional	0
391821	cl28642	NAD4L	NADH dehydrogenase subunit 4L (NAD4L). NADH dehydrogenase subunit 4L; Provisional	0
333464	cl28644	PRK09781	N/A. hypothetical protein; Provisional	0
391822	cl28645	CHIPS	Chemotaxis-inhibiting protein CHIPS. chemotaxis-inhibiting protein CHIPS; Reviewed	0
333468	cl28648	Pox_TAP	Viral Trans-Activator Protein. late transcription factor VLTF-1; Provisional	0
333469	cl28649	PHA03043	N/A. putative virulence factor; Provisional	0
333472	cl28652	PHA02837	N/A. Toll/IL-receptor-like protein; Provisional	0
333473	cl28653	PHA02836	N/A. hypothetical protein; Provisional	0
333476	cl28656	PHA02725	N/A. hypothetical protein; Provisional	0
355716	cl28658	DUF1406	Protein of unknown function (DUF1406). hypothetical protein; Provisional	0
333479	cl28659	End_beta_propel	Catalytic beta propeller domain of bacteriophage endosialidase. This domain family is found in bacteria and viruses, and is approximately 80 amino acids in length. This domain is the beta barrel domain of bacteriophage endosialidase which represents one of the two sialic acid binding sites of the enzyme. The domain is nested in the beta propeller domain of the endosialidase enzyme. The endosialidase protein complexes to form homotrimeric molecules.	0
333481	cl28661	DUF2717	Protein of unknown function (DUF2717). hypothetical protein	0
333482	cl28662	RepA1_leader	Tap RepA1 leader peptide. This protein is a translated leader peptide that acts in the regulation of the expression of the plasmid replication protein RepA in incF2 group plasmids. [Mobile and extrachromosomal element functions, Plasmid functions]	0
333484	cl28664	Amb_V_allergen	Amb V Allergen. Amb V is an Ambrosia sp (ragweed) pollen allergen. Amb t V has been shown to contain a C-terminal helix as the major T cell epitope. Free sulphhydryl groups also play a major role in the T cell recognition of cross-reactivity T cell epitopes within these related allergens.	0
333485	cl28665	PapG_CBD	N/A. PapG, the adhesin of the P-pili, is situated at the tip and is only a minor component of the whole pilus structure. A two-domain structure has been postulated for PapG; a carbohydrate binding N-terminus (this domain) and chaperone binding C-terminus. The carbohydrate-binding domain interacts with the receptor glycan.	0
421582	cl28728	TIGR02687	TIGR02687 family protein. Members of this family are uncharacterized proteins sporadically distributed in bacteria and archaea, about 880 amino acids in length. This protein is repeatedly found upstream of another uncharacterized protein of about 470 amino acids in length, modeled by TIGR02688.	0
421593	cl28752	Penicillinase_R	Penicillinase repressor. The penicillinase repressor negatively regulates expression of the penicillinase gene. The N-terminal region of this protein is involved in operator recognition, while the C-terminal is responsible for dimerization of the protein.	0
421665	cl28849	CrtC-like	carotenoid 1,2-hydratase and similar proteins. This family includes Aspergillus nidulans tyrosinase family protein asqI (aspoquinolone biosynthesis protein I) that is part of the gene cluster that mediates the biosynthesis of the aspoquinolone mycotoxins.	0
421686	cl28876	PCSK9_C-CRD	proprotein convertase subtilisin/kexin type 9, C-terminal cysteine-rich domain (CRD). This entry represents a subdomain found in the C-terminal cysteine/histidine-rich domain (CRD) of PCSK9 (also known as neural apoptosis-regulated convertase, NARC-1). PCSK9 has been shown to regulate circulating LDL-R levels by controlling LDL-R degradation. Furthermore, numerous mutations in the PCSK9 gene have been identified and associated with hypercholesterolemia (gain of function) or hypocholesterolemia (loss of function). The fully folded CRD, shows structural similarity to the resistin homotrimer, a small cytokine associated with obesity and diabetes. The C-terminal domain from PCSK9 consists of three, three-stranded beta-subdomains arranged in a pseudothreefold, and each of the subdomains in the CRD of PCSK9 consists of three structurally conserved disulfide bonds.	0
333703	cl28883	PA	N/A. PA_M28_1_3: Protease-associated (PA) domain, peptidase family M28, subfamily-1, subgroup 3. A subgroup of PA-domain containing proteins belonging to the peptidase family M28. Family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. The PA domain is an insert domain in a diverse fraction of proteases. The significance of the PA domain to many of the proteins in which it is inserted is undetermined. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. Proteins into which the PA domain is inserted include the following members of the peptidase family M28: i) prostate-specific membrane antigen (PSMA), ii) yeast aminopeptidase Y, and iii) human TfR (transferrin receptor)1 and human TfR2. The proteins listed above belong to other subgroups; relatively little is known about proteins in this subgroup.	0
421687	cl28889	CA_like	Cadherin repeat-like domain. This domain is found in a range of enzymes that act on branched substrates - isoamylase, pullulanase and branching enzyme. This family also contains the beta subunit of 5' AMP activated kinase.	0
333710	cl28890	FYVE_like_SF	FYVE domain like superfamily. Protein piccolo, also termed aczonin, is a neuron-specific presynaptic active zone scaffolding protein that mainly interacts with a detergent-resistant cytoskeletal-like subcellular fraction and is involved in the organization of the interplay between neurotransmitter vesicles, the cytoskeleton, and the plasma membrane at synaptic active zones. It binds profilin, an actin-binding protein implicated in actin cytoskeletal dynamics. It also functions as a presynaptic low-affinity Ca2+ sensor and has been implicated in Ca2+ regulation of neurotransmitter release. Piccolo is a multi-domain protein containing two N-terminal FYVE zinc fingers, a polyproline tract, and a PDZ domain and two C-terminal C2 domains. This family corresponds to the second FYVE domain, which resembles a FYVE-related domain that is structurally similar to the canonical FYVE domains but lacks the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif.	0
421688	cl28891	ParB_N_Srx	ParB N-terminal domain and sulfiredoxin protein-related families. This is a family of bacterial proteins likely to be necessary for binding to DNA and recognising the modification sites. Members are found in bacteria, archaea and on viral plasmids, and are typically between 354 and 474 amino acids in length. There is a conserved DGQHR sequence motif.	0
333712	cl28892	CNF1_CheD_YfiH-like	cytotoxic necrotizing factor 1 (CNF1), chemotaxis protein CheD and YfiH (DUF152) are distant homologs. This subfamily contains Rho-activating toxins cytotoxic necrotizing factor 1 (CNF1) and dermonecrotic toxin (DNT) from Bordetella species, as well as Burkholderia Lethal Factor 1 (BLF1, also known as BPSL1549), and similar proteins. CNF1 causes alteration of the host cell actin cytoskeleton and promotes bacterial invasion of blood-brain barrier endothelial cells. E. coli CNF1 constitutively activates host small G proteins such as RhoA and Cdc42 by deamidating a glutamine residue essential for GTP hydrolysis. DNT stimulates the assembly of actin stress fibers and focal adhesions by deamidation/polyamination of a specific glutamine of the small GTPase Rho. CNF1 and DNT are A-B toxins composed of an N-terminal receptor-binding (B) domain and a C-terminal enzymatically active (A) domain; their homology is restricted to the catalytic domains at the C termini of the toxins, suggesting that they share a similar molecular mechanism. BLF1, a toxin that inhibits helicase activity of translation factor eIF4A, is similar to the catalytic domain of Escherichia coli CNF1 (CNF1-C); although CNF1-C and BLF1 show little sequence identity, the active sites have the conserved LSGC (Leu, Ser, Gly, Cys) motif.	0
333713	cl28893	CpcS_T	S- and T-type phycobiliprotein (PBP) lyases. This family contains the S-type phycobiliprotein (PBP) lyase (denoted CpcS/CpcU or CpeS/CpeU). PBP lyases are employed by cyanobacteria, red algae, cryptophytes and glaucophytes for light-harvesting. Pigmentation of light-harvesting phycobiliproteins of cyanobacteria and cryptophytes requires covalent attachment of open-chain tetrapyrrole chromophores, the phycobilins, to the apoproteins. PBP lyases mediate this covalent attachment of phycobilin chromophores to apo-PBPs and also ensure the correct binding of the chromophore with regard to the specific attachment site and stereospecificity. The S-type lyase is distantly related to CpcT and similarly adopts a beta-barrel structure with a modified lipocalin fold. Many members of the CpcS/CpcU family ligate phycocyanobilin (PCB) to a specific cysteine residue in the beta-subunits of phycocyanin (CpcB) or phycoerythrocyanin (PecB) and to a related cysteine residue in the alpha and beta subunits of allophycocyanin (AP); they are typically given the designation of "CpcS" or "CpcU". Other members which attach phycoerythrobilin (PEB) to the beta-subunit of phycoerythrin (PE) are given the designation "CpeS" or "CpeU". In Guillardia theta, a Cryptophyte, which has adopted phycoerythrobilin (PEB) biosynthesis from cyanobacteria, phycobiliprotein lyase has been shown to provide structural requirements for the transfer of this chromophore to the specific cysteine residue of the apophycobiliprotein.	0
333715	cl28895	EFh_PI-PLC	EF-hand motif found in eukaryotic phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11) isozymes. PRIP-2, also termed phospholipase C-L2, or phospholipase C-epsilon-2 (PLC-epsilon-2), or inactive phospholipase C-like protein 2 (PLC-L2), is a novel inositol 1,4,5-trisphosphate (InsP3) binding protein that exhibits a relatively ubiquitous expression. It functions as a novel negative regulator of B-cell receptor (BCR) signaling and immune responses. PRIP-2 has a primary structure and domain architecture, incorporating a pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core domain with highly conserved X- and Y-regions split by a linker sequence, and a C-terminal C2 domain, similar to phosphoinositide-specific phospholipases C (PI-PLC, EC 3.1.4.11)-delta isoforms. Due to replacement of critical catalytic residues, PRIP-2 does not have PLC enzymatic activity.	0
333716	cl28896	EFh_MICU	EF-hand, calcium binding motif, found in mitochondrial calcium uptake proteins MICU1, MICU2, MICU3, and similar proteins. MICU3, also termed EF-hand domain-containing family member A2 (EFHA2), is a paralog of MICU1 and notably found in the central nervous system (CNS) and skeletal muscle. At present, the precise molecular function of MICU3 remains unclear. It likely has a role in mitochondrial calcium handling. MICU3 contains an N-terminal mitochondrial targeting sequence (MTS) as well as two evolutionarily conserved canonical Ca2+-binding EF-hands separated by a long stretch of residues predicted to form alpha-helices.	0
421689	cl28897	7tm_GPCRs	seven-transmembrane G protein-coupled receptor superfamily. The viral group I rhodopsins includes Phaeocystis globosa virus 12T divergent type-1 DTS-motif rhodopsin (VirRDTS), a green light-absorbing proton pump that has a structure similar to that of bacteriorhodopsin (BR) and transfers light energy in a manner that substantially changes medium pH when expressed in a cell. Members of this group are considered homologs of proteorhodopsins (PRs), which are blue-light absorbing and green-light absorbing proteins acting as light-driven proton pumps that play a major role in supplying light energy for phototropic marine microorganisms, by a mechanism similar to that of bacteriorhodopsin. Viral proteorhodopsins are predicted to function as sensory rhodopsins that could affect signaling, for example, phototaxis in the infected protists, perhaps stimulating relocation of the infected protists to areas that are rich in nutrients required for virus reproduction. PRs belong to the microbial rhodopsin family, also known as type 1 rhodopsins, which also comprise the light-driven inward chloride pump halorhodopsin (HR), the light-gated cation channel channelrhodopsin (ChR), the light-sensor activating transmembrane transducer protein sensory rhodopsin II (SRII), the light-sensor activating soluble transducer protein Anabaena sensory rhodopsin (ASR), and the other light-driven proton pumps such as bacteriorhodopsin (BR). While microbial (type 1) and animal (type 2) rhodopsins have no sequence similarity with each other, they share a common architecture consisting of seven-transmembrane alpha-helices (TM) connected by extracellular loops and intracellular loops. Both types of rhodopsins consist of opsin and a covalently attached retinal (the aldehyde of vitamin A), a photoreactive chromophore, via a protonated Schiff base linkage to an amino group of lysine in the middle of the seventh transmembrane helix (TM7). Upon the absorption of light, microbial rhodopsins undergo light-induced photoisomerization of all-trans retinal into the 13-cis isomer, whereas the photoisomerization of 11-cis retinal to all-trans isomer occurs in the animal rhodopsins. While animal visual rhodopsins are activated by light to catalyze GDP/GTP exchange in the alpha subunit of the retinal G protein transducin (Gt), microbial rhodopsins do not activate G proteins, but instead can function as light-dependent ion pumps, cation channels, and sensors.	0
333718	cl28898	Peptidase_M48_M56	Peptidases M48 (Ste24 endopeptidase or htpX homolog) and M56 (in MecR1 and BlaR1), integral membrane metallopeptidases. This family contains peptidase family M48 subfamily A-like CaaX prenyl protease 1, most of which are uncharacterized. Some of these contain tetratricopeptide (TPR) repeats at the C-terminus. Proteins in this family contain the zinc metalloprotease motif (HEXXH), likely exposed on the cytoplasmic side. They are thought to be possibly associated with the endoplasmic reticulum (ER), regardless of whether their genes possess the conventional signal motif (KKXX) in the C-terminal. These proteins putatively remove the C-terminal three residues of farnesylated proteins proteolytically.	0
421690	cl28899	DEAD-like_helicase_N	N-terminal helicase domain of the DEAD-box helicase superfamily. This domain is found at the C-terminus of DEAD-box helicases.	0
355777	cl28901	MCM	MCM helicase family. archaeal MCM proteins form a homohexameric ring homologous to the eukaryotic Mcm2-7 helicase and also function as the replicative helicase at the replication fork	0
355778	cl28902	ARID	ARID/BRIGHT DNA binding domain family. ARID5B, also called MRF1-like protein or modulator recognition factor 2 (MRF-2), is a DNA-binding protein that directly interacts with plant homeodomain (PHD) finger 2 (PHF2) to form a protein kinase A (PKA)-dependent PHF2-ARID5B histone H3K9Me2 demethylase complex, which is a signal-sensing modulator of histone methylation and gene transcription. It also functions as a transcriptional co-regulator for the transcription factor sex determining region Y (SRY)-box protein 9 (Sox9) and promotes chondrogenesis through histone modification. Moreover, ARID5B is highly expressed in the cardiovascular system and may play essential roles in the phenotypic change of smooth muscle cells (SMCs) through its regulation of SMC differentiation. Its polymorphism has been associated with risk for pediatric acute lymphoblastic leukemia (ALL). ARID5B contains an AT-rich DNA-interacting domain (ARID, also known as BRIGHT), which can bind both the major and minor grooves of its target sequences.	0
421692	cl28903	BACK	BACK (BTB and C-terminal Kelch) domain. This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation).	0
421693	cl28904	PTP_DSP_cys	cys-based protein tyrosine phosphatase and dual-specificity phosphatase superfamily. This family is closely related to the pfam00102 and pfam00782 families.	0
421694	cl28905	PIN_SF	PIN (PilT N terminus) domain: Superfamily. This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 153 and 170 amino acids in length. There is a single completely conserved residue D that may be functionally important.	0
355783	cl28907	ArfGap	GTPase-activating protein (GAP) for the ADP ribosylation factors (ARFs). The ArfGAP domain and FG repeat-containing proteins (AFGF) subfamily of Arf GTPase-activating proteins consists of the two structurally-related members: AGFG1 and AGFG2. AGFG2 is a member of the HIV-1 Rev binding protein (HRB) family and contains one Arf-GAP zinc finger domain, several Phe-Gly (FG) motifs, and four Asn-Pro-Phe (NPF) motifs. AGFG2 interacts with Eps15 homology (EH) domains and plays a role in the Rev export pathway, which mediates the nucleocytoplasmic transfer of proteins and RNAs. In humans, the presence of the FG repeat motifs (11 in AGFG1 and 7 in AGFG2) are thought to be required for these proteins to act as HIV-1 Rev cofactors. Hence, AGFG promotes movement of Rev-responsive element-containing RNAs from the nuclear periphery to the cytoplasm, which is an essential step for HIV-1 replication.	0
421695	cl28910	MFS	Major Facilitator Superfamily. This is a family of transport proteins. Members of this family include a protein responsible for the secretion of the ferric chelator, enterobactin, and a protein involved in antibiotic resistance.	0
355788	cl28912	LGIC_ECD	extracellular domain (ECD) of Cys-loop neurotransmitter-gated ion channels (also known as ligand-gated ion channel (LGIC)). This family contains extracellular domain (ECD) of the rho subunit 3 of type-A gamma-aminobutyric acid receptor (GABAAR), encoded by the GABRR3 gene which maps to a different chromosome to that of GABRR1 and GABRR2. While close proximity of the rho1 and rho2 subunit genes suggests that they emerged via a local duplication event, GABRR3 may have arisen by duplication of a GABRR1/GABRR2 progenitor. This subunit homo-oligomerizes to form GABAA-rho receptors (formerly classified as GABA-rho or GABAc receptor), but does not co-assemble with any of the classical GABAAR subunits. In humans, some individuals contain a variant that is predicted to inactivate this gene product.	0
421697	cl28914	CD_CSD	CHROMO (CHRomatin Organization Modifier) domains and chromo shadow domains. This is a novel knotted tudor domain which is required for binding to RNA. The knot influences the loop conformation of the helical turn Ht2 - residues 61-63 - that is located at the side opposite the knot in the tudor domain-chromodomain; stabilisation of Ht2 is essential for RNA binding.	0
355792	cl28916	HipA-like	serine/threonine-protein kinases similar to HipA and CtkA. This family contains type II toxin-antitoxin (TA) system HipA family toxins similar to Shewanella oneidensis HipA, a serine/threonine-protein kinase that phosphorylates Glu-tRNA-ligase (GltX), preventing it from being charged, leading to an increase in uncharged tRNA(Glu). This induces amino acid starvation and the stringent response via RelA/SpoT and increased (p)ppGpp levels, which inhibits replication, transcription, translation and cell wall synthesis, reducing growth and leading to persistence and multidrug resistance. HipA is the toxin component of the HipA-HipB TA module that is a major factor in persistence and biofilm formation; its toxic effect is neutralized by its cognate antitoxin HipB. HipA, with HipB, acts as a corepressor for transcription of the hipBA promoter. In the Shewanella oneidensis HipAB:DNA promoter complex, HipB forms a dimer that binds the duplex operator DNA, with each HipB monomer interacting with separate HipA monomers. The HipAB component of the complex is composed of two HipA and two HipB subunits.	0
355793	cl28917	Alpha_kinase	Alpha kinase family. The alpha kinase family is a novel family of eukaryotic protein kinase catalytic domains, which have no detectable similarity to conventional serine/threonine protein kinases. The family contains myosin heavy chain kinases, elongation factor-2 kinases, and bifunctional ion channel kinases. These kinases are implicated in a large variety of cellular processes such as protein translation, Mg2+/Ca2+ homeostasis, intracellular transport, cell migration, adhesion, and proliferation. The alpha-kinase family was named after the unique mode of substrate recognition by its initial members, the Dictyostelium heavy chain kinases, which targeted protein sequences that adopt an alpha-helical conformation. More recently, alpha-kinases were found to also target residues in non-helical regions.	0
355795	cl28919	ING	Inhibitor of growth (ING) domain family. The ING family includes three yeast orthologs, chromatin modification-related protein YNG1 (Yng1p), YNG2 (Yng2p), and transcriptional regulatory protein PHO23 (Pho23p). Yng1p, also termed ING1 homolog 1, is one of the components of the NuA3 histone acetyltransferase (HAT) complex. Yng2p, also termed ESA1-associated factor 4, or ING1 homolog 2, is a subunit of the NuA4 HAT complex. It plays a critical role in intra-S-phase DNA damage response. Pho23p is part of Rpd3/Sin3 histone deacetylase (HDAC) complex. It is required for the normal function of Rpd3 in the silencing of rDNA, telomeric, and mating-type loci. Yng1p and Pho23p inhibit p53-dependent transcription. In contrast, Yng2p has the opposite effect. The related mammalian ING proteins act as readers and writers of the histone epigenetic code, affecting DNA damage response, chromatin remodeling, cellular senescence, differentiation, cell cycle regulation and apoptosis. They may have a general role in mediating the cellular response to genotoxic stress through binding to and regulating the activities of histone acetyltransferase (HAT) and histone deacetylase (HDAC) chromatin remodeling complexes. All ING proteins contain an N-terminal leucine zipper-like (LZL) motif-containing ING domain that binds unmodified H3 tails, and a well-characterized C-terminal plant homeodomain (PHD)-type zinc-finger domain, binding with lysine 4-tri-methylated histone H3 (H3K4me3). Although these two regions can bind histones independently, together they increase the apparent association of the ING for the H3 tail.	0
355796	cl28920	STAT_DBD	DNA-binding domain of Signal Transducer and Activator of Transcription (STAT). This family consists of the DNA-binding domain (DBD) of the STAT6 proteins (Signal Transducer and Activator of Transcription 6, or Signal Transduction And Transcription 6). The DNA binding domain has an Ig-like fold. STAT6 is essential for the functional responses of T helper 2 (Th2) lymphocyte mediated by interleukins IL-4 and IL-13. STAT6 almost exclusively mediates the expression of genes activated by these cytokines; IL-4 signaling regulates the expression of genes involved in immune and anti-inflammatory responses. Abnormal production of IL-4 and IL-13 play important roles in the pathogenesis of asthma where upregulation of the Th2 response mediated by IL-4/IL-13 is a main characteristic. STAT6 has a unique extended transactivation domain, not found in other STATs, through which it recruits p300/CBP and NCoA-1, two coactivators needed for transcriptional activation by IL-4. STAT6 activation is linked to Kaposi's sarcoma-associated herpesvirus (KSHV)-associated cancers such as primary effusion lymphoma, a cancerous proliferation of B cells.  Studies show that Meningeal solitary fibrous tumor (SFT) and hemangiopericytoma (HPC) represent a histopathologic spectrum linked by STAT6 nuclear expression and recurrent somatic fusions of the two genes, NGFI-A-binding protein 2 (NAB2) and STAT6 (NAB2-STAT6), similar to their soft tissue counterparts. It is associated with local recurrence and late distance metastasis of brain tumors to extracranial sites.	0
355797	cl28921	STAT_CCD	Coiled-coil domain of Signal Transducer and Activator of Transcription (STAT), also called alpha domain. This family consists of the coiled-coil (alpha) domain of the STAT6 proteins (Signal Transducer and Activator of Transcription 6, or Signal Transduction And Transcription 6). Similar to STAT3 and STAT5, the coiled-coil domain (CCD) of STAT6 is required for constitutive nuclear localization signals (NLS) function; small deletions within the CCD can abrogate nuclear import. Studies show that the CCD binds to the importin-alpha3 NLS adapter in most cells. STAT6 is essential for the functional responses of T helper 2 (Th2) lymphocyte mediated by interleukins IL-4 and IL-13. STAT6 almost exclusively mediates the expression of genes activated by these cytokines; IL-4 signaling regulates the expression of genes involved in immune and anti-inflammatory responses. Abnormal production of IL-4 and IL-13 play important roles in the pathogenesis of asthma where upregulation of the Th2 response mediated by IL-4/IL-13 is a main characteristic. STAT6 has a unique extended transactivation domain, not found in other STATs, through which it recruits p300/CBP and NCoA-1, two coactivators needed for transcriptional activation by IL-4. STAT6 activation is linked to Kaposi's sarcoma-associated herpesvirus (KSHV)-associated cancers such as primary effusion lymphoma, a cancerous proliferation of B cells. Studies show that Meningeal solitary fibrous tumor (SFT) and hemangiopericytoma (HPC) represent a histopathologic spectrum linked by STAT6 nuclear expression and recurrent somatic fusions of the two genes, NGFI-A-binding protein 2 (NAB2) and STAT6 (NAB2-STAT6), similar to their soft tissue counterparts. It is associated with local recurrence and late distance metastasis of brain tumors to extracranial sites.	0
421700	cl28922	Ubiquitin_like_fold	Beta-grasp ubiquitin-like fold. This domain is the binding/interacting region of several protein kinases, such as the Schizosaccharomyces pombe Byr2. Byr2 is a Ser/Thr-specific protein kinase acting as mediator of signals for sexual differentiation in S. pombe by initiating a MAPK module, which is a highly conserved element in eukaryotes. Byr2 is activated by interacting with Ras, which then translocates the molecule to the plasma membrane. Ras proteins are key elements in intracellular signaling and are involved in a variety of vital processes such as DNA transcription, growth control, and differentiation. They function like molecular switches cycling between GTP-bound 'on' and GDP-bound 'off' states.	0
421701	cl28923	PIPKc	Phosphatidylinositol phosphate kinase (PIPK) catalytic domain family. This family contains a region from the common kinase core found in the type I phosphatidylinositol-4-phosphate 5-kinase (PIP5K) family as described in. The family consists of various type I, II and III PIP5K enzymes. PIP5K catalyzes the formation of phosphoinositol-4,5-bisphosphate via the phosphorylation of phosphatidylinositol-4-phosphate, a precursor in the phosphoinositide signaling pathway.	0
421702	cl28926	23S_rRNA_IVP	23S rRNA-intervening sequence protein. This family describes a protein of unknown function whose structure is a bundle of four long alpha helices. Some of the first members of this family were found encoded in the (atypically large) intervening sequence (IVS) of Leptospira 23S RNA, a region often present in the rRNA gene and removed during rRNA processing without re-ligation. However, this location is not conserved, and naming this protein as a 23S RNA protein is both confusing and inaccurate.	0
355803	cl28927	Avd_IVP_like	proteins similar to the diversity-generating retroelement protein bAvd. A family of functionally uncharacterized bacterial proteins, some of which are encoded by an atypically large intervening sequence present within some 23S rRNA genes. The distantly related bAvd protein, which also forms a homopentamer of four-helix bundles, has been suggested to interact with nucleic acids and a reverse transcriptase.	0
355804	cl28928	VirB8_like	virulence protein VirB8. This family includes the conjugal transfer protein family TrbF, a family of proteins known to be involved in conjugal transfer. The TrbF protein is thought to compose part of the pilus required for transfer. This domain is similar to the type IV secretion system (T4ASS) component VirB8 and possibly has a similar fold to the nuclear transport factor-2 (NTF-2)-like superfamily.	0
421703	cl28929	VirB10_like	VirB10 and similar proteins form part of core complex in Type IV secretion system (T4SS). This family contains DotG/IcmE (VirB10 homolog) and a component of the type IV secretion system (T4SS), and similar proteins. The Dot/Icm system is a T4SS found in the pathogens Legionella and Coxiella and the conjugative apparatus of IncI plasmids; T4SS is employed by pathogenic bacteria to export virulence DNAs and/or proteins directly from the bacterial cytoplasm into the host cell. Similar to T4SS VirB/D components, the Legionella Dot/Icm secretion apparatus contains a critical five-protein sub-assembly that forms the membrane-spanning 'core-complex' (CC), around which all other components assemble. This transmembrane connection is mediated by protein dimer pairs consisting of two inner membrane proteins, DotF and DotG, each independently associating with DotH/DotC/DotD in the outer membrane.	0
421705	cl28933	Riboflavin_synthase_like	Riboflavin synthase and similar proteins. This domain binds to derivatives of lumazine in some proteins. Some proteins have lost the residues involved in binding lumazine.	0
421711	cl28984	E_set	Early set domain associated with the catalytic domain of sugar utilizing enzymes at either the N or C terminus. AMPK1_CBM is a family found in close association with AMPKBI pfam04739. The surface of AMPK1_CBM reveals a carbohydrate-binding pocket.	0
421712	cl28996	PolY	Y-family of DNA polymerases. These proteins are involved in UV protection.	0
391963	cl29010	DNA_alkylation	DNA alkylation repair enzyme. Proteins in this family are predicted to be DNA alkylation repair enzymes. The structure of a hypothetical protein in this family shows it to adopt a supercoiled alpha helical structure.	0
355888	cl29012	RNAP_largest_subunit_C	Largest subunit of RNA polymerase (RNAP), C-terminal domain. Archaeal RNA polymerase (RNAP), like bacterial RNAP, is a large multi-subunit complex responsible for the synthesis of all RNAs in the cell. The relative positioning of the RNAP core is highly conserved between archaeal RNAP and the three classes of eukaryotic RNAPs. In archaea, the largest subunit is split into two polypeptides, A' and A'', which are encoded by separate genes in an operon. Sequence alignments reveal that the archaeal A'' subunit corresponds to the C-terminal one-third of the RNAPII largest subunit (Rpb1). In subunit A'', several loops in the jaw domain are shorter. The RNAPII Rpb1 interacts with the second-largest subunit (Rpb2) to form the DNA entry and RNA exit channels in addition to the catalytic center of RNA synthesis.	0
355925	cl29049	DUF5409	Family of unknown function (DUF5409). hypothetical protein; Provisional	0
421719	cl29051	Bac_rhamnosid6H	Bacterial alpha-L-rhamnosidase 6 hairpin glycosidase domain. This family includes human glycogen branching enzyme AGL. This enzyme contains a number of distinct catalytic activities. It has been shown for the yeast homolog GDB1 that mutations in this region disrupt the enzymes Amylo-alpha-1,6-glucosidase (EC:3.2.1.33).	0
355929	cl29053	ArenaCapSnatch	Arenavirus cap snatching domain. This model describes a shared signature region from an RNA endonuclease region associated with cap-snatching for mRNA production by RNA viruses. This domain usually is part of a multifunctional protein, the L protein responsible for RNA-dependent RNA polymerase activity. Cap-snatching is a viral alternative to synthesizing a eukaryotic-like mRNA cap itself.	0
421721	cl29069	MASE4	Membrane-associated sensor, integral membrane domain. MASE3 (Membrane-Associated SEnsor) is an integral membrane sensor domain of unknown specificity found in histidine kinases, diguanylate cyclases and protein phosphatases in various bacteria and archaea.	0
421723	cl29075	ComP_DUS	Type IV minor pilin ComP, DNA uptake sequence receptor. ComP-DUS is the DNA-uptake sequence receptor of pathogenic Proteobacteria. ComP is a type IV minor pilin, one of three minor (low abundance) pilins in pathogenic Neisseria species (with PilV and PilX). These modulate Tfp-mediated properties without affecting Tfp biogenesis. ComP plays a prominent role in competence at the level of DNA uptake. ComP is exposed on the surface of Neisseria filaments, and it is this that recognizes homotypic DNA through genus-specific DNA uptake sequence (DUS) motifs.	0
421724	cl29080	Glyco_hydro_32C	Glycosyl hydrolases family 32 C terminal. This family consists of uncharacterized proteins around 500 residues in length and is mainly found in various Bacteroides species. Several proteins in this family are annotated as Glycosyl hydrolases, but the function of this protein is unknown.	0
421725	cl29083	Spaetzle	Spaetzle. This family of proteins are nerve growth factor-like ligands required in the pathway that establishes the dorsal-ventral pattern of the embryo. They form a cystine knot structure.	0
421727	cl29093	SynN	N/A. This domain includes syntaxin-like domains including from the Vam3p protein.	0
421729	cl29100	MutL	MutL protein. This small family includes, so far, an uncharacterized protein from E. coli O157:H7 and GlmL from Clostridium tetanomorphum and Clostridium cochlearium. GlmL is located between the genes for the two subunits, epsilon (GlmE) and sigma (GlmS), of the coenzyme-B12-dependent glutamate mutase (methylaspartate mutase), the first enzyme in a pathway of glutamate fermentation. Members show significant sequence similarity to the hydantoinase branch of the hydantoinase/oxoprolinase family (pfam01968).	0
421730	cl29105	FLgD_tudor	FlgD Tudor-like domain. flagellar basal body rod modification protein; Reviewed	0
421731	cl29107	Ligase_CoA	CoA-ligase. This domain contains the catalytic domain from Succinyl-CoA ligase alpha subunit and other related enzymes. A conserved histidine is involved in phosphoryl transfer.	0
421732	cl29110	HSDR_N	Type I restriction enzyme R protein N-terminus (HSDR_N). This family consists of a number of N terminal regions found in type I restriction enzyme R (HSDR) proteins. Restriction and modification (R/M) systems are found in a wide variety of prokaryotes and are thought to protect the host bacterium from the uptake of foreign DNA. Type I restriction and modification systems are encoded by three genes: hsdR, hsdM, and hsdS. The three polypeptides, HsdR, HsdM, and HsdS, often assemble to give an enzyme (R2M2S1) that modifies hemimethylated DNA and restricts unmethylated DNA.	0
421733	cl29114	BamD	BamD lipoprotein, a component of the beta-barrel assembly machinery. BamD, also called YfiO, is part of the beta-barrel assembly machinery (BAM), which is essential for the folding and insertion of outer membrane proteins (OMPs) in the OM of Gram-negative bacteria. Transmembrane OMPs carry out important functions including nutrient and waste management, cell adhesion, and structural roles. The BAM complex is composed of the beta-barrel OMP BamA (also called Omp85/YaeT) and four lipoproteins BamBCDE. BamD is the only BAM lipoprotein required for viability. Both BamA and BamD are broadly distributed in Gram-negative bacteria, and may constitute the core of the BAM complex. BamD contains five Tetratricopeptide repeats (TPRs). The three TPRs at the N-terminus may participate in interaction with substrates, while the two TPRs in the C-terminus may be involved in binding with other BAM components.	0
391992	cl29116	Cytochrome_C554	Cytochrome c554 and c-prime. This domain carries up to seven CxxCH repeated sequence motifs, characteristic of multi-haem cytochromes.	0
421736	cl29122	Rotamase_2	PPIC-type PPIASE domain. Rotamases increase the rate of protein folding by catalyzing the interconversion of cis-proline and trans-proline.	0
421737	cl29123	Phage_int_SAM_5	Phage integrase SAM-like domain. This domain is found in a variety of phage integrase proteins.	0
421739	cl29132	GspH	Type II transport protein GspH. GspH is involved in bacterial type II export systems. Like all pilins, GspH has an N-terminal alpha helix. This helix is followed by nine beta strands forming two beta sheets, one of five antiparallel strands and one of four antiparallel strands. GspH is a minor pseudopilin; it is expressed much less than other pseudopilins in the type II secretion pilus (major pilins). The function and localization of minor pseudopilins are still to be fully unraveled. It has been suggested that some minor pseudopilins may assemble either into the base or the tip of pili, or both. They function as initiators or regulators of pilus biogenesis and dynamics, and/or as adaptors between various pseudopilin components and other members of the T2SS.	0
421740	cl29134	Nup188	Nucleoporin subcomplex protein binding to Pom34. This is a family of eukaryotic nucleoporins of several different sizes. All of them are long and form the scaffold of the nuclear pore complex (NPC). Nup192 in particular modulates the permeability of the central channel of the NPC.	0
421741	cl29139	Beta-Casp	Beta-Casp domain. The beta-CASP domain is found C terminal to the beta-lactamase domain in pre-mRNA 3'-end-processing endonuclease. The active site of this enzyme is located at the interface of these two domains.	0
421742	cl29140	DUF2815	Protein of unknown function (DUF2815). single-stranded DNA-binding protein	0
356017	cl29141	PLN02918	N/A. This model is similar to Pyridox_oxidase from Pfam but is designed to find only true pyridoxamine-phosphate oxidase and to ignore the related protein PhzG involved in phenazine biosynthesis. This protein from E. coli was characterized as a homodimer with two FMN per dimer. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridoxine]	0
421744	cl29146	NADH_4Fe-4S	NADH-ubiquinone oxidoreductase-F iron-sulfur binding region. 	0
421745	cl29147	NADH-G_4Fe-4S_3	NADH-ubiquinone oxidoreductase-G iron-sulfur binding region. 	0
421746	cl29148	Proteasome_A_N	Proteasome subunit A N-terminal signature. This domain is conserved in the A subunits of the proteasome complex proteins.	0
421747	cl29154	ClpB_D2-small	C-terminal, D2-small domain, of ClpB protein. This is the C-terminal domain of ClpB protein, referred to as the D2-small domain, and is a mixed alpha-beta structure. Compared with the D1-small domain (included in AAA) it lacks the long coiled-coil insertion, and instead of helix C4 contains a beta-strand (e3) that is part of a three stranded beta-pleated sheet. In Thermophilus the whole protein forms a hexamer with the D1-small and D2-small domains located on the outside of the hexamer, with the long coiled-coil being exposed on the surface. The D2-small domain is essential for oligomerisation, forming a tight interface with the D2-large domain of a neighbouring subunit and thereby providing enough binding energy to stabilise the functional assembly. The domain is associated with two Clp_N at the N-terminus as well as AAA and AAA_2.	0
421748	cl29165	BHD_3	Rad4 beta-hairpin domain 3. This short domain is found in the Rad4 protein. This domain binds to DNA.	0
421749	cl29167	PhageMin_Tail	Phage-related minor tail protein. This model represents a reasonably well conserved core region of a family of phage tail proteins. The member from phage TP901-1 was characterized as a tail length tape measure protein in that a shortened form of the protein leads to phage with proportionately shorter tails. [Mobile and extrachromosomal element functions, Prophage functions]	0
421750	cl29174	MethyTransf_Reg	Predicted methyltransferase regulatory domain. Members of this family of domains are found in various prokaryotic methyltransferases, where they regulate the activity of the methyltransferase domain.	0
421751	cl29183	RQC	RQC domain. The RQC domain, found only in RecQ family enzymes, is a high affinity G4 DNA binding domain.	0
421754	cl29203	Alpha-mann_mid	Alpha mannosidase middle domain. Members of this entry belong to the glycosyl hydrolase family 38. This domain, which is found in the central region, adopts a structure consisting of three alpha helices, in an immunoglobulin/albumin-binding domain-like fold. The domain is predominantly found in the enzyme alpha-mannosidase.	0
421756	cl29215	LeuA_dimer	LeuA allosteric (dimerization) domain. This is the C-terminal regulatory (R) domain of alpha-isopropylmalate synthase, which catalyses the first committed step in the leucine biosynthetic pathway. This domain is an internally duplicated structure with a novel fold. It comprises two similar units that are arranged such that the two alpha-helices pack together in the centre, crossing at an angle of 34 degrees, sandwiched between the two three-stranded, antiparallel beta-sheets. The overall domain is thus constructed as a beta-alpha-beta three-layer sandwich.	0
356100	cl29224	PulG	Type II secretory pathway, pseudopilin PulG [Cell motility, Intracellular trafficking, secretion, and vesicular transport, Extracellular structures]. This model represents GspG, protein G of the main terminal branch of the general secretion pathway, also called type II secretion. It transports folded proteins across the bacterial outer membrane and is widely distributed in Gram-negative pathogens. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	0
421759	cl29226	Cadherin	Cadherin domain. Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. Cadherin domains occur as repeats in the extracellular regions which are thought to mediate cell-cell contact when bound to calcium.	0
421761	cl29229	Sel1	Sel1 repeat. These represent a subfamily of TPR (tetratricopeptide repeat) sequences.	0
421762	cl29235	HisG_C	HisG, C-terminal domain. This domain corresponds to the C-terminal third of the HisG protein. It is absent in many lineages.	0
421763	cl29236	tRNA_SAD	Threonyl and Alanyl tRNA synthetase second additional domain. The catalytically active form of threonyl/alanyl tRNA synthetase is a dimer. Within the tRNA synthetase class II dimer, the bound tRNA interacts with both monomers making specific interactions with the catalytic domain, the C-terminal domain, and this SAD domain (the second additional domain). The second additional domain is comprised of a pair of perpendicularly orientated antiparallel beta sheets, of four and three strands, respectively, that surround a central alpha helix that forms the core of the domain.	0
421764	cl29237	PBP5_C	Penicillin-binding protein 5, C-terminal domain. Penicillin-binding protein 5 expressed by E. coli functions as a D-alanyl-D-alanine carboxypeptidase. It is composed of two domains that are oriented at approximately right angles to each other. The N-terminal domain (pfam00768) is the catalytic domain. The C-terminal domain featured in this family is organized into a sandwich of two anti-parallel beta-sheets, and has a relatively hydrophobic surface as compared to the N-terminal domain. Its precise function is unknown; it may mediate interactions with other cell wall-synthesising enzymes, thus allowing the protein to be recruited to areas of active cell wall synthesis. It may also function as a linker domain that positions the active site in the catalytic domain closer to the peptidoglycan layer, to allow it to interact with cell wall peptides.	0
421765	cl29239	PYNP_C	Pyrimidine nucleoside phosphorylase C-terminal domain. This domain is found at the C-terminal end of the large alpha/beta domain making up various pyrimidine nucleoside phosphorylases. It has slightly different conformations in different members of this family. For example, in pyrimidine nucleoside phosphorylase (PYNP) there is an added three-stranded anti-parallel beta sheet as compared to other members of the family, such as E. coli thymidine phosphorylase (TP). The domain contains an alpha/ beta hammerhead fold and residues in this domain seem to be important in formation of the homodimer.	0
421766	cl29240	Alpha-amyl_C2	Alpha-amylase C-terminal beta-sheet domain. This entry represents the beta-sheet domain that is found in several alpha-amylases, usually at the C-terminus. This domain is organised as a five-stranded anti-parallel beta-sheet.	0
392030	cl29241	CobW_C	Cobalamin synthesis protein cobW C-terminal domain. CobW proteins are generally found proximal to the trimeric cobaltochelatase subunit CobN, which is essential for vitamin B12 (cobalamin) biosynthesis. They contain a P-loop nucleotide-binding loop in the N-terminal domain and a histidine-rich region in the C-terminal portion suggesting a role in metal binding, possibly as an intermediary between the cobalt transport and chelation systems. CobW might be involved in cobalt reduction leading to cobalt(I) corrinoids. This entry represents the C-terminal domain found in CobW, as well as in P47K, a Pseudomonas chlororaphis protein needed for nitrile hydratase expression.	0
421767	cl29255	DnaB_2	Replication initiation and membrane attachment. This model represents the conserved domain of DnaD, part of Bacillus subtilis replication restart primosome, and of a number of phage-associated proteins. Members, both chromosomal or phage-associated, are found in the Bacillus/Clostridium group of Gram-positive bacteria. [DNA metabolism, DNA replication, recombination, and repair, Mobile and extrachromosomal element functions, Prophage functions]	0
421768	cl29256	Curlin_rpt	Curlin associated repeat. This family consists of several bacterial repeats of around 30 residues in length. These repeats are often found in multiple copies in the curlin proteins CsgA and CsgB. Curli fibers are thin aggregative surface fibers, connected with adhesion, which bind laminin, fibronectin, plasminogen, human contact phase proteins, and major histocompatibility complex (MHC) class I molecules. Curli fibers are coded for by the csg gene cluster, which is comprised of two divergently transcribed operons. One operon encodes the csgB, csgA, and csgC genes, while the other encodes csgD, csgE, csgF, and csgG. The assembly of the fibers is unique and involves extracellular self-assembly of the curlin subunit (CsgA), dependent on a specific nucleator protein (CsgB). CsgD is a transcriptional activator essential for expression of the two curli fibre operons, and CsgG is an outer membrane lipoprotein involved in extracellular stabilisation of CsgA and CsgB.	0
421769	cl29260	BATS	Biotin and Thiamin Synthesis associated domain. Biotin synthase (BioB) catalyses the last step of the biotin biosynthetic pathway. The reaction consists in the introduction of a sulphur atom into dethiobiotin. BioB functions as a homodimer. Thiamin synthesis is a complex process involving at least six gene products (ThiFSGH, ThiI and ThiJ). Two of the proteins required for the biosynthesis of the thiazole moiety of thiamine (vitamin B(1)) are ThiG and ThiH (this entry) and form a heterodimer. Both of these reactions are thought to involve the binding of co-factors, and both function as dimers. This domain therefore may be involved in co-factor binding or dimerisation.	0
421770	cl29262	Hyd_WA	Propeller. Probable beta-propeller.	0
421771	cl29265	TnpB_IS66	IS66 Orf2 like protein. The IS66 family insertion sequence element encodes a DDE transposase TnpC, and two accessory proteins, TnpA and TnpB. It has been assumed that the TnpA, TnpB, and TnpC proteins are produced independently in appropriate amounts and form a complex, which acts as a transposase to promote the transposition of an IS66 family element.	0
421774	cl29279	MutS_III	MutS domain III. This domain is found in proteins of the MutS family (DNA mismatch repair proteins) and is found associated with pfam01624, pfam05188, pfam05192 and pfam00488. The mutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair; other members of the family include the eukaryotic MSH 1, 2, 3, 4, 5 and 6 proteins. These have various roles in DNA repair and recombination. Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein. The aligned region corresponds in part with globular domain IV, which is involved in DNA binding, in Thermus aquaticus MutS as characterized in.	0
421778	cl29298	DapB_C	Dihydrodipicolinate reductase, C-terminus. dihydrodipicolinate reductase; Provisional	0
421780	cl29306	Bac_GDH	Bacterial NAD-glutamate dehydrogenase. glutamate dehydrogenase 2; Provisional	0
421781	cl29307	BON	BON domain. This domain is found in a family of osmotic shock protection proteins. It is also found in some Secretins and a group of potential haemolysins. Its likely function is attachment to phospholipid membranes.	0
392048	cl29316	Transposase_31	Putative transposase, YhgA-like. This family of putative transposases includes the YhgA sequence from Escherichia coli and several prokaryotic homologs.	0
421784	cl29317	GcpE	GcpE protein. In a variety of organisms, including plants and several eubacteria, isoprenoids are synthesized by the mevalonate-independent 2-C-methyl-D-erythritol 4-phosphate (MEP) pathway. Although different enzymes of this pathway have been described, the terminal biosynthetic steps of the MEP pathway have not been fully elucidated. GcpE gene of Escherichia coli is involved in this pathway.	0
421785	cl29319	HA2	Helicase associated domain (HA2). This presumed domain is about 90 amino acid residues in length. It is found in a diverse set of RNA helicases. Its function is unknown; however, it seems likely to be involved in nucleic acid binding.	0
392051	cl29320	KilA-N	KilA-N domain. N1R/p28-like protein; Provisional	0
421786	cl29321	ZipA	N/A. This family represents the ZipA C-terminal domain. ZipA is involved in septum formation in bacterial cell division. Its C-terminal domain binds FtsZ, a major component of the bacterial septal ring. The structure of this domain is an alpha-beta fold with three alpha helices and a beta sheet of six antiparallel beta strands. The major loops protruding from the beta sheet surface are thought to form a binding site for FtsZ.	0
421787	cl29323	TPK_B1_binding	Thiamin pyrophosphokinase, vitamin B1 binding domain. Thiamin pyrophosphokinase (TPK) catalyzes the transfer of a pyrophosphate group from ATP to vitamin B1 (thiamin) to form the coenzyme thiamin pyrophosphate (TPP). Thus, TPK is important for the formation of a coenzyme required for central metabolic functions. The structure of thiamin pyrophosphokinase suggests that the enzyme may operate by a mechanism of pyrophosphoryl transfer similar to those described for pyrophosphokinases functioning in nucleotide biosynthesis.	0
421788	cl29327	DFP	DNA / pantothenate metabolism flavoprotein. phosphopantothenate--cysteine ligase; Validated	0
421791	cl29338	Ribosomal_L11	N/A. The N-terminal domain of Ribosomal protein L11 adopts an alpha/beta fold and is followed by the RNA binding C-terminal domain.	0
421792	cl29344	PhnA	PhnA domain. 	0
421793	cl29347	FtsQ	Cell division protein FtsQ. cell division protein FtsQ; Provisional	0
421794	cl29357	BK_channel_a	Calcium-activated BK potassium channel alpha subunit. This family represents a short region in the middle of largely plant proteins that belong to the TCDB:1.A.1.23.2 family of the voltage-gated ion channel superfamily, eg UniProtKB:Q5H8A6, Q5H8A5 and Q4VY51.	0
421795	cl29360	B5	tRNA synthetase B5 domain. This domain is found in phenylalanine-tRNA synthetase beta subunits.	0
421796	cl29361	CorC_HlyC	Transporter associated domain. This small domain is found at the C-terminus of a family of proteins that also contain the DUF21 domain and two CBS domains; it is also found at the C-terminus of some Na+/H+ antiporters. This domain is also found in CorC, which is involved in magnesium and cobalt efflux. The function of this domain is uncertain, but it might be involved in modulating transport of ion substrates.	0
421797	cl29362	CO_deh_flav_C	CO dehydrogenase flavoprotein C-terminal domain. 	0
421798	cl29364	DPBB_1	Lytic transglycolase. Putative EG45-like domain containing protein 1; Provisional	0
421799	cl29368	LPMO_10	Lytic polysaccharide mono-oxygenase, cellulose-degrading. spherodin-like protein; Provisional	0
421800	cl29370	FAR_C	C-terminal domain of fatty acyl CoA reductases. This family represents the C-terminal region of the male sterility protein in a number of arabidopsis and drosophila. A sequence-related jojoba acyl CoA reductase is also included.	0
421801	cl29371	Transposase_21	Transposase family tnp2. This family represents a conserved region approximately 260 residues long within a number of hypothetical proteins of unknown function that seem to be specific to C. elegans. Note that this family contains a number of conserved cysteine and histidine residues.	0
421804	cl29392	UreE	N/A. UreE is a urease accessory protein. Urease pfam00449 hydrolyzes urea into ammonia and carbamic acid.	0
421805	cl29395	CPSase_L_D3	Carbamoyl-phosphate synthetase large chain, oligomerization domain. Carbamoyl-phosphate synthase catalyses the ATP-dependent synthesis of carbamyl-phosphate from glutamine or ammonia and bicarbonate. The carbamoyl-phosphate synthase (CPS) enzyme in prokaryotes is a heterodimer of a small and large chain.	0
421806	cl29396	Biotin_carb_C	Biotin carboxylase C-terminal domain. Biotin carboxylase is a component of the acetyl-CoA carboxylase multi-component enzyme which catalyses the first committed step in fatty acid synthesis in animals, plants and bacteria. Most of the active site residues reported in reference are in this C-terminal domain.	0
421809	cl29414	RPEL	RPEL repeat. The RPEL repeat is named after four conserved amino acids it contains. The RPEL motif binds to actin.	0
421811	cl29417	Ald_Xan_dh_C2	Molybdopterin-binding domain of aldehyde dehydrogenase. xanthine dehydrogenase subunit XdhA; Provisional	0
421812	cl29418	Dak2	DAK2 domain. Two types of dihydroxyacetone kinase (glycerone kinase) are described. In yeast and a few bacteria, e.g. Citrobacter freundii, the enzyme is a single chain that uses ATP as phosphoryl donor and is designated EC 2.7.1.29. By contrast, E. coli and many other bacterial species have a multisubunit form (EC 2.7.1.-) with a phosphoprotein donor related to PTS transport proteins. This family represents the subunit homologous to the E. coli YcgS subunit.	0
421813	cl29420	ERCC4	ERCC4 domain. This entry represents a structural motif found in several DNA repair nucleases, such as Rad1/Mus81/XPF endonucleases, and in ATP-dependent helicases. The XPF/Rad1/Mus81-dependent nuclease family specifically cleaves branched structures generated during DNA repair, replication, and recombination, and is essential for maintaining genome stability. The nuclease domain architecture exhibits remarkable similarity to those of restriction endonucleases.	0
421814	cl29422	B12-binding_2	B12 binding domain. Cobalamin-dependent methionine synthase is a large modular protein that catalyses methyl transfer from methyltetrahydrofolate (CH3-H4folate) to homocysteine. During the catalytic cycle, it supports three distinct methyl transfer reactions, each involving the cobalamin (vitamin B12) cofactor and a substrate bound to its own functional unit. The cobalamin cofactor plays an essential role in this reaction, accepting the methyl group from CH3-H4folate to form methylcob(III)alamin, and in turn donating the methyl group to homocysteine to generate methionine and cob(I)alamin. Methionine synthase is a large enzyme composed of four structurally and functionally distinct modules: the first two modules bind homocysteine and CH3-H4folate, the third module binds the cobalamin cofactor and the C-terminal module binds S-adenosylmethionine. The cobalamin-binding module is composed of two structurally distinct domains: a 4-helical bundle cap domain (residues 651-740 in the Escherichia coli enzyme) and an alpha/beta B12-binding domain (residues 741-896). The 4-helical bundle forms a cap over the alpha/beta domain, which acts to shield the methyl ligand of cobalamin from solvent. Furthermore, in the conversion to the active conformation of this enzyme, the 4-helical cap rotates to allow the cobalamin cofactor to bind the activation domain. The alpha/beta domain is a common cobalamin-binding motif, whereas the 4-helical bundle domain with its methyl cap is a distinctive feature of methionine synthases.	0
421816	cl29427	PhzC-PhzF	Phenazine biosynthesis-like protein. CntK (cobalt and nickel transport system protein K) is a histidine racemase that performs the first step in the biosynthesis of staphylopine, a metallophore involved in the import of multiple divalent cations. It was first characterized in Staphylococcus aureus.	0
421817	cl29429	Cyanase_C	N/A. Cyanate lyase (also known as cyanase) EC:4.2.1.104 is responsible for the hydrolysis of cyanate, allowing organisms that possess the enzyme to overcome the toxicity of environmental cyanate. This enzyme is composed of two domains, an N-terminal helix-turn-helix and this structurally unique C-terminal domain.	0
421818	cl29433	Bac_transf	Bacterial sugar transferase. This Pfam family represents a conserved region from a number of different bacterial sugar transferases, involved in diverse biosynthesis pathways.	0
421825	cl29459	Glucosaminidase	Mannosyl-glycoprotein endo-beta-N-acetylglucosaminidase. Eubacterial enzymes distantly related to eukaryotic lysozymes.	0
421827	cl29462	AICARFT_IMPCHas	AICARFT/IMPCHase bienzyme. This is a family of bifunctional enzymes catalysing the last two steps in de novo purine biosynthesis. The bifunctional enzyme is found in both prokaryotes and eukaryotes. The second-to-last step is catalysed by 5-aminoimidazole-4-carboxamide ribonucleotide formyltransferase (AICARFT); this enzyme catalyses the formylation of AICAR with 10-formyl-tetrahydrofolate to yield FAICAR and tetrahydrofolate. The last step is catalysed by IMP (inosine monophosphate) cyclohydrolase (IMPCHase), cyclizing FAICAR (5-formylaminoimidazole-4-carboxamide ribonucleotide) to IMP.	0
421829	cl29468	RimM	RimM N-terminal domain. 16S rRNA-processing protein RimM; Provisional	0
421830	cl29469	MgtE	Divalent cation transporter. This region is the integral membrane part of the eubacterial MgtE family of magnesium transporters. Related regions are found also in archaebacterial and eukaryotic proteins. All the archaebacterial and eukaryotic examples have two copies of the region. This suggests that the eubacterial examples may act as dimers. Members of this family probably transport Mg2+ or other divalent cations into the cell. The alignment contains two highly conserved aspartates that may be involved in cation binding (Bateman A unpubl.)	0
421832	cl29474	Flavokinase	Riboflavin kinase. Riboflavin is converted into catalytically active cofactors (FAD and FMN) by the actions of riboflavin kinase, which converts it into FMN, and FAD synthetase, which adenylates FMN to FAD. Eukaryotes usually have two separate enzymes, while most prokaryotes have a single bifunctional protein that can carry out both catalyses, although exceptions occur in both cases. While eukaryotic monofunctional riboflavin kinase is orthologous to the bifunctional prokaryotic enzyme, the monofunctional FAD synthetase differs from its prokaryotic counterpart, and is instead related to the PAPS-reductase family. The bacterial FAD synthetase that is part of the bifunctional enzyme has remote similarity to nucleotidyl transferases and, hence, it may be involved in the adenylylation reaction of FAD synthetases. This entry represents riboflavin kinase, which occurs as part of a bifunctional enzyme or a stand-alone enzyme.	0
421834	cl29482	UPF0051	Uncharacterized protein family (UPF0051). This protein, SufD, forms a cytosolic complex SufBCD. This complex enhances the cysteine desulfurase of SufSE. The system, together with SufA, is believed to act in iron-sulfur cluster formation during oxidative stress. SufB and SufD are homologous. Note that SufC belongs to the family of ABC transporter ATP binding proteins, so this protein, encoded by an adjacent gene, has often been annotated as a transporter component. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	0
421835	cl29486	IlvC	Acetohydroxy acid isomeroreductase, catalytic domain. ketol-acid reductoisomerase; Provisional	0
421837	cl29490	Porphobil_deam	Porphobilinogen deaminase, dipyromethane cofactor binding domain. porphobilinogen deaminase; Provisional	0
421841	cl29503	GreA_GreB	Transcription elongation factor, GreA/GreB, C-term. This domain has an FKBP-like fold.	0
421844	cl29515	NifU	NifU-like domain. This is an alignment of the carboxy-terminal domain. This is the only common region between the NifU protein from nitrogen-fixing bacteria and rhodobacterial species. The biochemical function of NifU is unknown.	0
421845	cl29525	FliMN_C	Type III flagellar switch regulator (C-ring) FliN C-term. flagellar motor switch protein; Validated	0
421846	cl29526	SecA_PP_bind	SecA preprotein cross-linking domain. The SecA ATPase is involved in the insertion and retraction of preproteins through the plasma membrane. This domain has been found to cross-link to preproteins, thought to indicate a role in preprotein binding. The pre-protein cross-linking domain is comprised of two sub domains that are inserted within the ATPase domain.	0
421847	cl29527	ATase	N/A. This domain is a 3 helical bundle.	0
421849	cl29529	Transgly	Transglycosylase. This family is one of the transglycosylases involved in the late stages of peptidoglycan biosynthesis. Members tend to be small, about 240 amino acids in length, and consist almost entirely of a domain described by pfam00912 for transglycosylases. Species with this protein will have several other transglycosylases as well. All species with this protein are Proteobacteria that produce murein (peptidoglycan). [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan]	0
421850	cl29530	EF_TS	Elongation factor TS. elongation factor Ts	0
421851	cl29532	Peptidase_M17	N/A. The two associated zinc ions and the active site are entirely enclosed within the C-terminal catalytic domain in leucine aminopeptidase.	0
421852	cl29533	XPG_I	XPG I-region. domain in nucleases	0
356411	cl29535	HNS	Domain in histone-like proteins of HNS family. 	0
421854	cl29546	Pumilio	Pumilio-family RNA binding domain. Puf repeats (aka PUM-HD, Pumilio homology domain) are necessary and sufficient for sequence specific RNA binding in fly Pumilio and worm FBF-1 and FBF-2. Both proteins function as translational repressors in early embryonic development by binding sequences in the 3' UTR of target mRNAs (e.g. the nanos response element (NRE) in fly Hunchback mRNA, or the point mutation element (PME) in worm fem-3 mRNA). Other proteins that contain Puf domains are also plausible RNA binding proteins. Puf domains usually occur as a tandem repeat of 8 domains. The Pfam model does not necessarily recognize all 8 repeats in all sequences; some sequences appear to have 5 or 6 repeats on initial analysis, but further analysis suggests the presence of additional divergent repeats. Structures of PUF repeat proteins show they consist of a two helix structure.	0
421855	cl29549	beta_clamp	N/A. A dimer of the beta subunit of DNA polymerase III forms a ring which encircles duplex DNA. Each monomer contains three domains of identical topology and DNA clamp fold.	0
421859	cl29555	IQ	IQ calmodulin-binding motif. Short calmodulin-binding motif containing conserved Ile and Gln residues.	0
421860	cl29556	Glycos_transf_3	Glycosyl transferase family, a/b domain. anthranilate phosphoribosyltransferase; Provisional	0
356436	cl29560	SNc	Staphylococcal nuclease homologues. 	0
421861	cl29561	EAL	N/A. This domain is found in diverse bacterial signaling proteins. It is called EAL after its conserved residues. The EAL domain is a good candidate for a diguanylate phosphodiesterase function. The domain contains many conserved acidic residues that could participate in metal binding and might form the phosphodiesterase active site.	0
421862	cl29562	Ribosomal_L7_L12	N/A. Ribosomal protein L7/L12. Ribosomal protein L7/L12 refers to the large ribosomal subunit proteins L7 and L12, which are identical except that L7 is acetylated at the N terminus. It is a component of the L7/L12 stalk, which is located at the surface of the ribosome. The stalk base consists of a portion of the 23S rRNA and ribosomal proteins L11 and L10. An extended C-terminal helix of L10 provides the binding site for L7/L12. L7/L12 consists of two domains joined by a flexible hinge, with the helical N-terminal domain (NTD) forming pairs of homodimers that bind to the extended helix of L10. It is the only multimeric ribosomal component, with either four or six copies per ribosome that occur as two or three dimers bound to the L10 helix. L7/L12 is the only ribosomal protein that does not interact directly with rRNA, but instead has indirect interactions through L10. The globular C-terminal domains of L7/L12 are highly mobile. They are exposed to the cytoplasm and contain binding sites for other molecules. Initiation factors, elongation factors, and release factors are known to interact with the L7/L12 stalk during their GTP-dependent cycles. The binding site for the factors EF-Tu and EF-G comprises L7/L12, L10, L11, the L11-binding region of 23S rRNA, and the sarcin-ricin loop of 23S rRNA. Removal of L7/L12 has minimal effect on factor binding and it has been proposed that L7/L12 induces the catalytically active conformation of EF-Tu and EF-G, thereby stimulating the GTPase activity of both factors. In eukaryotes, the proteins that perform the equivalent function to L7/L12 are called P1 and P2, which do not share sequence similarity with L7/L12. However, a bacterial L7/L12 homolog is found in some eukaryotes, in mitochondria and chloroplasts. In archaea, the protein equivalent to L7/L12 is called aL12 or L12p, but it is closer in sequence to P1 and P2 than to L7/L12.	0
421864	cl29579	HisKA	N/A. dimerization and phospho-acceptor domain of histidine kinases.	0
421865	cl29584	RF-1	RF-1 domain. This domain is found in peptide chain release factors such as RF-1 and RF-2, and a number of smaller proteins of unknown function. This domain contains the peptidyl-tRNA hydrolase activity. The domain contains a highly conserved motif GGQ, where the glutamine is thought to coordinate the water that mediates the hydrolysis.	0
421866	cl29593	WD40	N/A. Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.	0
421867	cl29594	PI-PLC-Y	Phosphatidylinositol-specific phospholipase C, Y domain. Phosphoinositide-specific phospholipases C. These enzymes contain 2 regions (X and Y) which together form a TIM barrel-like structure containing the active site residues. Phospholipase C enzymes (PI-PLC) act as signal transducers that generate two second messengers, inositol-1,4,5-trisphosphate and diacylglycerol. The bacterial enzyme appears to be a homologue of the mammalian PLCs.	0
421868	cl29595	PTS_IIB_glc	N/A. PTS_IIB, PTS system, glucose/sucrose specific IIB subunit. The bacterial phosphoenolpyruvate: sugar phosphotransferase system (PTS) is a multi-protein system involved in the regulation of a variety of metabolic and transcriptional processes. This family is one of four structurally and functionally distinct group IIB PTS system cytoplasmic enzymes, necessary for the uptake of carbohydrates across the cytoplasmic membrane and their phosphorylation	0
421870	cl29608	ZnF_GATA	N/A. This domain uses four cysteine residues to coordinate a zinc ion. This domain binds to DNA. Two GATA zinc fingers are found in the GATA transcription factors. However there are several proteins which only contain a single copy of the domain.	0
421877	cl29653	Pilin	Pilin (bacterial filament). Proteins with only the short N-terminal methylation site are not separated from the noise. The Prosite pattern detects those better.	0
421878	cl29654	CCP	Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. A missense mutation in seventh CCP domain causes deficiency of the b subunit of factor XIII.	0
421879	cl29663	exosort_XrtP	exosortase P. Members of the exosortase S family occur in the high GC Gram-positive order Micrococcales (a branch of the Actinobacteria), in genera such as Arthrobacter, Microbacterium, Curtobacterium, and Paenarthrobacter.	0
421880	cl29666	FusC_FusB	Fusidic acid resistance protein (FusC/FusB). FBP_C is a family from the C terminal end of fibronectin-binding proteins. It forms an extended four-cysteine zinc-finger with a unique structural fold. Fibronectin-binding proteins bind to elongation factor G - EF-G, which is mediated by the zinc-finger binding to the C-terminus of EF-G. FBPs release ribosomes by competing with them for EF-G.	0
421881	cl29674	Retrotrans_gag	Retrotransposon gag protein. This family consists of uncharacterized proteins around 110 residues in length and is mainly found in various mammalia species. LDOC1, a member of this family and a novel MZF-1-interacting protein, inhibits NF-kappaB activation and relates with cancer and some other diseases. But the specific function of this family is still unknown.	0
421882	cl29684	Citrate_bind	ATP citrate lyase citrate-binding. ATP citrate (pro-S)-lyase	0
421883	cl29685	LisH_2	LisH. Fibroblast growth factor receptor 1 (FGFR1) oncogene partner (FOP) is a centrosomal protein that is involved in anchoring microtubules to subcellular structures. This domain includes a Lis-homology motif. It forms an alpha helical bundle and is involved in dimerization.	0
421885	cl29690	PSII_BNR	Photosynthesis system II assembly factor YCF48. YCF48 is one of several assembly factors of the photosynthesis system II. The photosynthesis system II occurs in Cyanobacteria that are Gram-negative bacteria performing oxygenic photosynthesis. One of the three membranes surrounding these bacteria is the inner thylakoid membrane (TM) system that is localized within the cell and houses the large pigment-protein complexes of the photosynthetic electron transfer chain, i.e. Photosystem (PS) II, PSI, the cytochrome b6f complex, and the ATP synthase. YCF48 is necessary for efficient assembly and repair of the PSII. YCF48 is found predominantly in the thylakoid membrane. It is a BNR repeat protein.	0
421886	cl29693	Defensin_beta_2	Beta defensin. Big defensins are antimicrobial peptides. They consist of a hydrophobic N-terminal half, which is active against Gram-positive bacteria, and a cationic C-terminal half, which is active against Gram-negative bacteria. The C-terminal half adopts a beta-defensin-like structure.	0
392159	cl29696	Peptidase_S30	Potyvirus P1 protease. This family is the P1 protein of the Potyviridae polyproteins that is a serine peptidase at the N-terminus. The catalytic triad in the genome polyprotein of ssRNA positive-strand Brome streak mosaic rymovirus, is His-311, Asp-322 and Ser-355.	0
421888	cl29705	CHB_HEX_C_1	Chitobiase/beta-hexosaminidase C-terminal domain. 	0
421891	cl29710	PPR	PPR repeat. This family matches additional variants of the PPR repeat that were not captured by the model for pfam01535. The exact function is not known.	0
421893	cl29718	Rhomboid_N	Cytoplasmic N-terminal domain of rhomboid serine protease. This is the N-terminal domain of rhomboid protease.	0
421894	cl29726	Pectate_lyase22	Oligogalacturonate lyase. Members of this protein family are the TolB periplasmic protein of Gram-negative bacteria. TolB is part of the Tol-Pal (peptidoglycan-associated lipoprotein) multiprotein complex, comprising five envelope proteins, TolQ, TolR, TolA, TolB and Pal, which form two complexes. The TolQ, TolR and TolA inner-membrane proteins interact via their transmembrane domains. The {beta}-propeller domain of the periplasmic protein TolB is responsible for its interaction with Pal. TolB also interacts with the outer-membrane peptidoglycan-associated proteins Lpp and OmpA. TolA undergoes a conformational change in response to changes in the proton-motive force, and interacts with Pal in an energy-dependent manner. The C-terminal periplasmic domain of TolA also interacts with the N-terminal domain of TolB. The Tol-PAL system is required for bacterial outer membrane integrity. E. coli TolB is involved in the tonB-independent uptake of group A colicins (colicins A, E1, E2, E3 and K), and is necessary for the colicins to reach their respective targets after initial binding to the bacteria. It is also involved in uptake of filamentous DNA. Study of its structure suggest that the TolB protein might be involved in the recycling of peptidoglycan or in its covalent linking with lipoproteins. The Tol-Pal system is also implicated in pathogenesis of E. coli, Haemophilus ducreyi , Salmonella enterica and Vibrio cholerae, but the mechanism(s) is unclear. [Transport and binding proteins, Other, Cellular processes, Pathogenesis]	0
421897	cl29742	Tiny_TM_bacill	Protein of unknown function (Tiny_TM_bacill). This model represents a family of hypothetical proteins, half of which are 40 residues or less in length. Members are found only in spore-forming species. A Gly-rich variable region is followed by a strongly conserved, highly hydrophobic region, predicted to form a transmembrane helix, ending with an invariant Gly. The consensus for this stretch is FALLVVFILLIIV. [Hypothetical proteins, Conserved]	0
421898	cl29743	AbiEi_1	AbiEi antitoxin C-terminal domain. AbiEi_1 is the cognate antitoxin of the type IV toxin-antitoxin 'innate immunity' bacterial abortive infection (Abi) system that protects bacteria from the spread of a phage infection. The Abi system is activated upon infection with phage to abort the cell thus preventing the spread of phage through viral replication. There are some 20 or more Abis, and they are predominantly plasmid-encoded lactococcal systems. TA, toxin-antitoxin, systems on plasmids function by killing cells that lose the plasmid upon division. AbiE phage resistance systems function as novel Type IV TAs and are widespread in bacteria and archaea. The cognate antitoxin is pfam13338.	0
421899	cl29745	Phospholip_A2_3	Prokaryotic phospholipase A2. This family consists of several group XII secretory phospholipase A2 precursor (PLA2G12) (EC:3.1.1.4) proteins. Group XII and group V PLA(2)s are thought to participate in helper T cell immune response through release of immediate second signals and generation of downstream eicosanoids.	0
392178	cl29746	D5_N	D5 N terminal like. This domain is found in D5 proteins of DNA viruses and in bacteriophage P4 DNA primases.	0
421901	cl29748	ApbA_C	Ketopantoate reductase PanE/ApbA C terminal. This is a family of 2-dehydropantoate 2-reductases also known as ketopantoate reductases, EC:1.1.1.169. The reaction catalyzed by this enzyme is: (R)-pantoate + NADP(+) <=> 2-dehydropantoate + NADPH. ApbA catalyzes the NADPH reduction of ketopantoic acid to pantoic acid in the alternative pyrimidine biosynthetic (APB) pathway. ApbA and PanE are allelic. ApbA, the ketopantoate reductase enzyme, is required for the synthesis of thiamine via the APB biosynthetic pathway.	0
421903	cl29757	TBCC	Tubulin binding cofactor C. Members of this family are involved in the folding pathway of tubulins and form a beta helix structure.	0
421904	cl29758	Gryzun	Gryzun, putative trafficking through Golgi. Members of this family are involved in Golgi trafficking.	0
421906	cl29762	PP2C_C	Protein serine/threonine phosphatase 2C, C-terminal domain. Protein phosphatase 2c; Provisional	0
421908	cl29764	Cathelicidins	Cathelicidin. This family represents a conserved region approximately 60 residues long within secreted phosphoprotein 24 (Spp-24), which seems to be restricted to vertebrates. This is a non-collagenous protein found in bone that is related in sequence to the cystatin family of thiol protease inhibitors. This suggests that Spp-24 could function to modulate the thiol protease activities known to be involved in bone turnover. It is also possible that the intact form of Spp-24 found in bone could be a precursor to a biologically active peptide that coordinates an aspect of bone turnover.	0
421909	cl29765	s48_45	Sexual stage antigen s48/45 domain. This family contains sexual stage s48/45 antigens from Plasmodium (approximately 450 residues long). These are surface proteins expressed by Plasmodium male and female gametes that have been shown to play a conserved and important role in fertilisation.	0
421910	cl29766	NDUF_B4	NADH-ubiquinone oxidoreductase B15 subunit (NDUFB4). complex I subunit	0
392191	cl29769	Poxvirus	dsDNA Poxvirus. putative alpha-amanitin-sensitive protein; Provisional	0
356648	cl29772	PRK10015	N/A. putative oxidoreductase FixC; Provisional	0
392192	cl29774	Herpes_UL73	UL73 viral envelope glycoprotein. UL49.5 protein consists of 98 amino acids with a calculated molecular mass of 10,155 Da. It contains putative signal peptide and transmembrane domains but lacks a consensus sequence for N glycosylation. UL49.5 protein is an O-glycosylated structural component of the viral envelope.	0
392193	cl29777	NPV_P10	Nucleopolyhedrovirus P10 protein. fibrous body protein; Provisional	0
421911	cl29782	AlaDh_PNT_N	Alanine dehydrogenase/PNT, N-terminal domain. Alanine dehydrogenase catalyzes the NAD-dependent reversible reductive amination of pyruvate into alanine.	0
421915	cl29790	Herpes_U34	Herpesvirus virion protein U34. nuclear egress membrane protein UL34; Provisional	0
421916	cl29795	MtrA	Tetrahydromethanopterin S-methyltransferase, subunit A. methyltransferase; Provisional	0
421917	cl29798	RNA_pol_Rpb5_N	RNA polymerase Rpb5, N-terminal domain. DNA-directed RNA polymerase II subunit family protein; Provisional	0
421920	cl29801	Competence	Competence protein. The related model ComEC_Rec2 (TIGR00361) describes a set of proteins of ~ 700-800 residues, one each from a number of different species, of which most can become competent for natural transformation with exogenous DNA. The best-studied examples are ComEC from Bacillus subtilis and Rec-2 from Haemophilus influenzae, where the protein appears to form part of the DNA import structure. This model represents a region found in full-length ComEC/Rec2 and shorter homologs of unknown function from a large number of additional bacterial species, most of which are not known to become competent for transformation (an exception is Helicobacter pylori). [Unknown function, General]	0
421923	cl29820	FDX-ACB	Ferredoxin-fold anticodon binding domain. This is the anticodon binding domain found in some phenylalanyl tRNA synthetases. The domain has a ferredoxin fold, consisting of an alpha+beta sandwich with anti-parallel beta-sheets (beta-alpha-beta x2).	0
421925	cl29826	F1-ATPase_delta	mitochondrial ATP synthase delta subunit. Part of the ATP synthase CF(1). These subunits are part of the head unit of the ATP synthase. The subunit is called epsilon in bacteria and delta in mitochondria. In bacteria the delta (D) subunit is equivalent to the mitochondrial Oligomycin sensitive subunit, OSCP (pfam00213).	0
421928	cl29842	COLIPASE	N/A. SCOP reports duplication of common fold with Colipase N-terminal domain.	0
421931	cl29847	CobN_like	CobN subunit of cobaltochelatase, bchH and chlH subunits of magnesium chelatases, and similar proteins. This family contains a domain common to the cobN protein and to magnesium protoporphyrin chelatase. CobN is implicated in the conversion of hydrogenobyrinic acid a,c-diamide to cobyrinic acid. Magnesium protoporphyrin chelatase is involved in chlorophyll biosynthesis.	0
421932	cl29848	MlaD	MlaD protein. Members of this protein family are the MlaD (maintenance of Lipid Asymmetry D) protein of an ABC transport system that seems to remove phospholipid from the outer leaflet of the Gram-negative bacterial outer membrane (OM), leaving only lipopolysaccharide in the outer leaflet. The Mla locus has long been associated with toluene tolerance, consistent with the proposed role in retrograde transport of phospholipid and therefore with maintaining the integrity of the OM as a protective barrier.	0
392220	cl29860	Papo_T_antigen	T-antigen specific domain. Small T antigen; Reviewed	0
421935	cl29862	CTP-dep_RFKase	Domain of unknown function DUF120. riboflavin kinase; Provisional	0
421936	cl29864	tRNA-synt_1f	tRNA synthetases class I (K). lysyl-tRNA synthetase; Reviewed	0
421937	cl29867	Peptidase_S7	Peptidase S7, Flavivirus NS3 serine protease. This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas. It appears to be related to the superfamily of trypsin peptidases and so may have a peptidase function.	0
421939	cl29874	LIGANc	N/A. DNA ligases catalyze the crucial step of joining the breaks in duplex DNA during DNA replication, repair and recombination, utilising either ATP or NAD(+) as a cofactor. This domain is the catalytic adenylation domain. The NAD+ group is covalently attached to this domain at the lysine in the KXDG motif of this domain. This enzyme-adenylate intermediate is an important feature of the proposed catalytic mechanism.	0
356751	cl29875	Herpes_UL24	Herpes virus proteins UL24 and UL76. nuclear protein UL24; Provisional	0
421941	cl29885	ArfGap	Putative GTPase activating protein for Arf. Putative zinc fingers with GTPase activating proteins (GAPs) towards the small GTPase, Arf. The GAP of ARD1 stimulates GTPase hydrolysis for ARD1 but not ARFs.	0
421942	cl29886	DUF11	Domain of unknown function DUF11. This model represents the conserved region of about 53 amino acids shared between regions, usually repeated, of proteins from a small number of phylogenetically distant prokaryotes. Examples include a 132-residue region found repeated in three of the five longest proteins of Bacillus anthracis, a 131-residue repeat in a cell wall-anchored protein of Enterococcus faecalis, and a 120-residue repeat in Methanobacterium thermoautotrophicum. A similar region is found in some Chlamydial outer membrane proteins.	0
421943	cl29887	Apocytochr_F_C	Apocytochrome F, C-terminal. apocytochrome f; Reviewed	0
421944	cl29893	Pectinesterase	Pectinesterase. pectinesterase family protein	0
421945	cl29894	AsnC_trans_reg	Lrp/AsnC ligand binding domain. AsnC: an autogenously regulated activator of asparagine synthetase A transcription in Escherichia coli	0
421946	cl29895	Glyco_hydro_3	Glycosyl hydrolase family 3 N terminal domain. beta-hexosaminidase; Provisional	0
421949	cl29917	RNA_pol_B_RPB2	N/A. RNA polymerases catalyze the DNA dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain represents the hybrid binding domain and the wall domain. The hybrid binding domain binds the nascent RNA strand / template DNA strand in the Pol II transcription elongation complex. This domain contains the important structural motifs, switch 3 and the flap loop and binds an active site metal ion. This domain is also involved in binding to Rpb1 and Rpb3. Many of the bacterial members contain large insertions within this domain, a region known as dispensable region 2 (DRII).	0
421951	cl29920	Chorismate_bind	chorismate binding enzyme. Members of this family, aminodeoxychorismate synthase, component I (PabB), were designated para-aminobenzoate synthase component I until it was recognized that PabC, a lyase, completes the pathway of PABA synthesis. This family is closely related to anthranilate synthase component I (trpE), and both act on chorismate. The clade of PabB enzymes represented by this model includes sequences from Gram-positive and alpha and gamma Proteobacteria as well as Chlorobium, Nostoc, Fusobacterium and Arabidopsis. A closely related clade of fungal PabB enzymes is identified by TIGR01823, while another bacterial clade of potential PabB enzymes is more closely related to TrpE (TIGR01824). [Biosynthesis of cofactors, prosthetic groups, and carriers, Folic acid]	0
421954	cl29940	RbcS	Ribulose-1,5-bisphosphate carboxylase small subunit. ribulose-bisphosphate carboxylase small chain	0
421955	cl29941	Viral_coat	Viral coat protein (S domain). The capsid or coat protein of this family is expressed in Nodaviridae, that are ssRNA positive-strand viruses, with no DNA stage. These viruses are the causative agents of viral nervous necrosis in marine fish.	0
356823	cl29947	VirDNA-topo-I_N	Viral DNA topoisomerase I, N-terminal. Members of this family are predominantly found in viral DNA topoisomerase, and assume a beta(2)-alpha-beta-alpha-beta(2) fold, with a left-handed crossover between strands beta2 and beta3.	0
421957	cl29957	PepX_C	X-Pro dipeptidyl-peptidase C-terminal non-catalytic domain. This domain is found at the C-terminus of cocaine esterase CocE, several glutaryl-7-ACA acylases, and the putative diester hydrolase NonD of Streptomyces griseus (all hydrolases). The domain, which is a beta sandwich, is also found in serine peptidases belonging to MEROPS peptidase family S15: Xaa-Pro dipeptidyl-peptidases. Members of this entry, that are not characterised as peptidases, show extensive low-level similarity to the Xaa-Pro dipeptidyl-peptidases.	0
356835	cl29959	SBP_bac_10	Protein of unknown function (DUF1559). This model describes a region of ~16 residues found typically about 30 residues away from the C-terminus of large numbers of proteins in the Planctomycetes, Lentisphaerae, and Verrucomicrobia, on proteins with a prepilin-type N-terminal cleavage/methylation domain (see TIGR02532). The motif H-X(9)-D-G is nearly invariant. Single genomes may encode over 200 such proteins.	0
421958	cl29960	SopE_GEF	SopE GEF domain. type III secretion protein BopE; Provisional	0
421959	cl29970	Phage_antitermQ	Phage antitermination protein Q. This family consists of a number of hypothetical proteins from Escherichia coli O157:H7 and Salmonella typhi. The function of this family is unknown.	0
356847	cl29971	Chordopox_L2	Chordopoxvirus L2 protein. hypothetical protein; Provisional	0
356850	cl29974	NinE	NINE Protein. prophage protein NinE; Provisional	0
392246	cl29975	Herpes_UL1	Herpesvirus glycoprotein L. envelope glycoprotein L; Provisional	0
356852	cl29976	minC	septum site-determining protein MinC. septum formation inhibitor; Reviewed	0
356854	cl29978	Pox_F11	Poxvirus F11 protein. hypothetical protein; Provisional	0
421960	cl29988	CutC	CutC family. copper homeostasis protein CutC; Provisional	0
356870	cl29994	Herpes_ICP4_C	Herpesvirus ICP4-like protein C-terminal region. transcriptional regulator ICP4; Provisional	0
356910	cl30034	DNA_pack_N	Probable DNA packing protein, N-terminus. DNA packaging terminase subunit 1; Provisional	0
356911	cl30035	DNA_pack_C	Probable DNA packing protein, C-terminus. DNA packaging terminase subunit 1; Provisional	0
356913	cl30037	Herpes_gE	Alphaherpesvirus glycoprotein E. envelope glycoprotein E; Provisional	0
356923	cl30047	Marek_A	Marek's disease glycoprotein A. envelope glycoprotein C; Provisional	0
421965	cl30049	Herpes_V23	Herpesvirus VP23 like capsid protein. Capsid triplex subunit 2; Provisional	0
356926	cl30050	PHA03259	N/A. Capsid triplex subunit 2; Provisional	0
356927	cl30051	Herpes_gI	Alphaherpesvirus glycoprotein I. envelope glycoprotein I; Provisional	0
421970	cl30079	OmpA_C-like	Peptidoglycan binding domains similar to the C-terminal domain of outer-membrane protein OmpA. The Pfam entry also includes MotB and related proteins which are not included in the Prosite family.	0
421973	cl30086	Ribosomal_L6	Ribosomal protein L6. Members of this protein family are the archaeal form of ribosomal protein uL6 (previously L9 in yeast and human). The top-scoring proteins not selected by this model are eukaryotic cytosolic uL6. Bacterial ribosomal protein L6 scores lower and is described by a distinct model. [Protein synthesis, Ribosomal proteins: synthesis and modification]	0
392263	cl30117	AA_permease	Amino acid permease. 	0
421976	cl30226	VpdB_C	C-terminal fragment of effector protein VpdB. Members of this family include the enzyme myo-inosose-2 dehydratase, product of the gene iolE, as found in inositol utilization cassettes in many species. [Energy metabolism, Sugars]	0
357125	cl30249	Fe_III_red_FhuF	siderophore-iron reductase FhuF. Members of this protein family are 2Fe-2S cluster binding proteins, found regularly in the context of siderophore transporters. Members are distantly related to FhuF from E. coli, a ferric iron reductase linked to removal of iron from hydroxamate-type siderophores. [Energy metabolism, Electron transport, Transport and binding proteins, Cations and iron carrying compounds]	0
357126	cl30250	cyclo_dehyd_2	bacteriocin biosynthesis cyclodehydratase domain. Members of this protein family are found in a three-gene operon in Bacillus anthracis and related Bacillus species, where the other two genes are clearly identified with maturation of a putative thiazole-containing bacteriocin precursor. While there is no detectable pairwise sequence similarity between members of this family and the proposed cyclodehydratases such as SagC of Streptococcus pyogenes (see family TIGR03603), both families show similarity through PSI-BLAST to ThiF, a protein involved in biosynthesis of the thiazole moiety for thiamine biosynthesis. This family, therefore, may contribute to cyclodehydratase function in heterocycle-containing bacteriocin biosyntheses. In Bacillus licheniformis ATCC 14580, the bacteriocin precursor gene is adjacent to the gene for this protein. [Cellular processes, Toxin production and resistance]	0
357136	cl30260	PRK10992	iron-sulfur cluster repair protein YtfE. Members of this protein family, designated variously as YtfE, NorA, DrnN, and NipC, are di-iron proteins involved in the repair of iron-sulfur clusters. Previously assigned names reflect pleiotropic effects of damage from NO or other oxidative stress when this protein is mutated. The suggested name now is RIC, for Repair of Iron Centers. [Biosynthesis of cofactors, prosthetic groups, and carriers, Other]	0
357139	cl30263	PRK14127	cell division regulator GpsB. This model describes a domain found in Bacillus subtilis cell division initiation protein DivIVA, and homologs, toward the N-terminus. It is also found as a repeated domain in certain other proteins, including family TIGR03543.	0
357161	cl30285	Csf4_U	CRISPR/Cas system-associated DinG family helicase Csf4. Members of this family show up near CRISPR repeats in Acidithiobacillus ferrooxidans ATCC 23270, Azoarcus sp. EbN1, and Rhodoferax ferrireducens DSM 15236. In the latter two species, the CRISPR/cas locus is found on a plasmid. This family is one of several characteristic of a type of CRISPR-associated (cas) gene cluster we designate Aferr after A. ferrooxidans, where it is both chromosomal and the only type of cas gene cluster found. The gene is designated csf4 (CRISPR/cas Subtype as in A. ferrooxidans protein 1), as it lies farthest (fourth closest) from the repeats in the A. ferrooxidans genome.	0
357162	cl30286	PLN00090	N/A. Members of this protein family are the photosystem II reaction center M protein, product of the psbM gene, in Cyanobacteria and their derived organelles in plants. This model resembles pfam05151 but has cutoffs set to avoid false-positive matches to similar (not necessarily homologous) sequences in species that are not photosynthetic. [Energy metabolism, Photosynthesis]	0
421978	cl30289	CccA	Cytochrome c, mono- and diheme variants  [Energy production and conversion]. Cytochrome 579, as described originally in Leptospirillum from acid mine drainage, is an abundant red cytochrome that acts as an electron transfer protein involved in Fe(II) oxidation.	0
357168	cl30292	PRK15331	type III secretion system translocator chaperone SicA. Genes in this family are found in type III secretion operons. LcrH, from Yersinia, is believed to have a regulatory function in the low-calcium response of the secretion system. The same protein is also known as SycD (SYC = Specific Yop Chaperone) for its chaperone role. In Pseudomonas, where the homolog is known as PcrH, the chaperone role has been demonstrated and the regulatory role appears to be absent. SycD/LcrH contains three central tetratricopeptide-like repeats that are predicted to fold into an all-alpha-helical array.	0
357184	cl30308	LolE	ABC-type transport system, involved in lipoprotein release, permease component [Cell wall/membrane/envelope biogenesis]. This model describes the LolC protein, and its paralog LolE found in some species. These proteins are homologous to permease proteins of ABC transporters. In some species, two paralogs occur, designated LolC and LolE. In others, a single form is found and tends to be designated LolC. [Protein fate, Protein and peptide secretion and trafficking]	0
357216	cl30340	tolC	N/A. Members of this model are outer membrane proteins from the TolC subfamily within the RND (Resistance-Nodulation-cell Division) efflux systems. These proteins, unlike the NodT subfamily, appear not to be lipoproteins. All are believed to participate in type I protein secretion, an ABC transporter system for protein secretion without cleavage of a signal sequence, although they may, like TolC, participate also in the efflux of smaller molecules as well. This family includes the well-documented examples TolC (E. coli), PrtF (Erwinia), and AprF (Pseudomonas aeruginosa). [Protein fate, Protein and peptide secretion and trafficking, Transport and binding proteins, Porins]	0
357259	cl30383	flgB	N/A. flagellar basal body rod protein FlgB; Reviewed	0
392266	cl30459	Cyt_C5_DNA_methylase	N/A. All proteins in this family for which functions are known are DNA-cytosine methyltransferases. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	0
357347	cl30471	SORL	Desulfoferrodoxin, superoxide reductase-like (SORL) domain  [Energy production and conversion]. The short N-terminal domain contains four conserved Cys for binding of a ferric iron atom, and is homologous to the small protein desulforedoxin; this domain may also be responsible for dimerization. The remainder of the molecule binds a ferrous iron atom and is similar to neelaredoxin, a monomeric blue non-heme iron protein. The homolog from Treponema pallidum scores between the trusted cutoff for orthology and the noise cutoff. Although essentially a full length homolog, it lacks three of the four Cys residues in the N-terminal domain; the domain may have lost ferric binding ability but may have some conserved structural role such as dimerization, or some new function. This protein is described in some articles as rubredoxin oxidoreductase (rbo), and its gene shares an operon with the rubredoxin gene in Desulfovibrio vulgaris Hildenborough. [Energy metabolism, Electron transport]	0
357355	cl30479	PRK15103	membrane integrity-associated transporter subunit PqiA. This family consists of uncharacterized predicted integral membrane proteins found, so far, only in the Proteobacteria. Of two members in E. coli, one is induced by paraquat and is designated PqiA, paraquat-inducible protein A. [Unknown function, General]	0
357372	cl30496	GIDA	Glucose inhibited division protein A. GidA, the longer of two forms of GidA-related proteins, appears to be present in all complete eubacterial genomes so far, as well as Saccharomyces cerevisiae. A subset of these organisms have a closely related protein. GidA is absent in the Archaea. It appears to act with MnmE, in an alpha2/beta2 heterotetramer, in the 5-carboxymethylaminomethyl modification of uridine 34 in certain tRNAs. The shorter, related protein, previously called gid or gidA(S), is now called TrmFO (see model TIGR00137). [Protein synthesis, tRNA and rRNA base modification]	0
357415	cl30539	PLN02852	N/A. adrenodoxin reductase; Provisional	0
357420	cl30544	TOP1Ac	N/A. Bacterial DNA topoisomerase I and III, Eukaryotic DNA topoisomerase III, reverse gyrase alpha subunit	0
421979	cl30545	PKD	N/A. This domain was first identified in the Polycystic kidney disease protein PKD1. This domain has been predicted to contain an Ig-like fold.	0
357422	cl30546	H2A	N/A. histone H2A; Provisional	0
357435	cl30559	PriA	Primosomal protein N' (replication factor Y) - superfamily II helicase [Replication, recombination and repair]. 	0
357450	cl30574	PRK13596	NADH-quinone oxidoreductase subunit NuoF. NADH dehydrogenase [ubiquinone] flavoprotein 1; Provisional	0
421980	cl30589	ThiD2	ThiD2 family. This domain functions as a ThiD protein and is called the ThiD2 family. The domain is associated with the ThiE domain in some proteins.	0
357482	cl30606	KdpD	K+-sensing histidine kinase KdpD [Signal transduction mechanisms]. sensor protein KdpD; Provisional	0
357489	cl30613	LacZ	Beta-galactosidase/beta-glucuronidase [Carbohydrate transport and metabolism]. beta-D-glucuronidase; Provisional	0
357493	cl30617	flgD	flagellar hook assembly protein FlgD. flagellar basal body rod modification protein; Provisional	0
421981	cl30663	DUF5710	Domain of unknown function (DUF5710). DNA polymerase III subunit epsilon; Validated	0
421982	cl30664	FAD_binding_2	FAD binding domain. L-aspartate oxidase is the B protein, NadB, of the quinolinate synthetase complex. Quinolinate synthetase makes a precursor of the pyridine nucleotide portion of NAD. This model identifies proteins that cluster as L-aspartate oxidase (a flavoprotein difficult to separate from the set of closely related flavoprotein subunits of succinate dehydrogenase and fumarate reductase) by both UPGMA and neighbor-joining trees. The most distant protein accepted as an L-aspartate oxidase (NadB), that from Pyrococcus horikoshii, not only clusters with other NadB but is just one gene away from NadA. [Biosynthesis of cofactors, prosthetic groups, and carriers, Pyridine nucleotides]	0
357541	cl30665	NuoL	NADH:ubiquinone oxidoreductase subunit 5 (chain L)/Multisubunit Na+/H+ antiporter, MnhA subunit [Energy production and conversion, Inorganic ion transport and metabolism]. NADH dehydrogenase subunit 5; Validated	0
357544	cl30668	PRK14692	flagellar hook-associated protein FlgL. flagellar hook-associated protein FlgL; Validated	0
357559	cl30683	PRK08241	RNA polymerase subunit sigma-70. RNA polymerase sigma factor SigJ; Provisional	0
357561	cl30685	PRK07502	prephenate/arogenate dehydrogenase family protein. 	0
421983	cl30686	PLN02487	N/A. phytoene desaturase	0
357568	cl30692	UbiH	2-polyprenyl-6-methoxyphenol hydroxylase and related FAD-dependent oxidoreductases [Coenzyme transport and metabolism, Energy production and conversion]. hypothetical protein; Provisional	0
357572	cl30696	TFIIE	Transcription initiation factor IIE. 	0
357587	cl30711	PRK04233	N/A. hypothetical protein; Provisional	0
392272	cl30717	InfB	Translation initiation factor IF-2, a GTPase [Translation, ribosomal structure and biogenesis]. This model describes archaeal and eukaryotic orthologs of bacterial IF-2. Like IF-2, it helps convey the initiator tRNA to the ribosome, although the initiator is N-formyl-Met in bacteria and Met here. This protein is not closely related to the subunits of eIF-2 of eukaryotes, which is also involved in the initiation of translation. The aIF-2 of Methanococcus jannaschii contains a large intein interrupting a region of very strongly conserved sequence very near the amino end; the alignment generated by this model does not correctly align the sequences from Methanococcus jannaschii and Pyrococcus horikoshii in this region. [Protein synthesis, Translation factors]	0
421984	cl30729	tRNA_bind_4	tRNA-binding domain. seryl-tRNA synthetase; Provisional	0
357607	cl30731	PRK00694	N/A. 4-hydroxy-3-methylbut-2-en-1-yl diphosphate synthase; Provisional	0
357616	cl30740	COG2810	Predicted type IV restriction endonuclease  [Defense mechanisms]. 	0
392274	cl30759	PelA	Stalled ribosome rescue protein Dom34, pelota family  [Translation, ribosomal structure and biogenesis]. Directs the termination of nascent peptide synthesis (translation) in response to the termination codons UAA, UAG and UGA. This model identifies both archaeal (aRF1) and eukaryotic (eRF1) of the protein. Also known as translation termination factor 1. [Protein synthesis, Translation factors]	0
357654	cl30778	MltE	Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) [Cell wall/membrane/envelope biogenesis]. lytic murein transglycosylase; Provisional	0
357662	cl30786	FusA	Translation elongation factor EF-G, a GTPase [Translation, ribosomal structure and biogenesis]. elongation factor 2; Provisional	0
392275	cl30787	PhrB	Deoxyribodipyrimidine photolyase [Replication, recombination and repair]. This model describes a narrow clade of cyanobacterial deoxyribodipyrimidine photo-lyase. This group, in contrast to several closely related proteins, uses a chromophore that, in other lineages is modified further to become coenzyme F420. This chromophore is called 8-HDF in most articles on the DNA photolyase and FO in most literature on coenzyme F420. [DNA metabolism, DNA replication, recombination, and repair]	0
357667	cl30791	FieF	Divalent metal cation (Fe/Co/Zn/Cd) transporter [Inorganic ion transport and metabolism]. 	0
357684	cl30808	PHA00368	N/A. virion protein; Provisional	0
357685	cl30809	PLN03107	N/A. eukaryotic initiation factor 5a; Provisional	0
357701	cl30825	PLN02893	N/A. cellulose synthase-like protein	0
357716	cl30840	PRK09940	N/A. transcriptional regulator SirC; Provisional	0
421985	cl30861	PRK09915	MdtP family multidrug efflux transporter outer membrane subunit. multidrug resistance outer membrane protein MdtQ; Provisional	0
357750	cl30874	PHA02699	N/A. Hypothetical protein; Provisional	0
357752	cl30876	PHA03055	N/A. ORF033 IMV membrane protein; Provisional	0
357759	cl30883	PHA02984	N/A. hypothetical protein; Provisional	0
357760	cl30884	PHA02861	N/A. hypothetical protein; Provisional	0
357761	cl30885	PHA02818	N/A. hypothetical protein; Provisional	0
392276	cl30941	CheC_CheX_FliY	CheC/CheX/FliY (CXY) family phosphatases. This family contains class III CheC proteins, present chiefly in the archaeal class Halobacteria. Sequence analysis shows that class III CheC proteins are structurally and functionally similar to class I CheCs, and not to CheX, despite the fact that both class III CheCs and CheX lack the first of the two phosphatase active sites of class I CheCs, and retain the second active site. Mutation analysis shows that the second active site is more important for function than the first one, suggesting that class III proteins arose by loss of the unnecessary first active site through mutational shift. All chemotactic archaea have a CheC homologue.	0
421990	cl31489	Mtd_N	Major tropism determinant N-terminal domain. major tropism determinant	0
421991	cl32029	DnaT	DnaT DNA-binding domain. This domain is found in E. coli primosomal protein 1 (PP1); the PP1 domain (residues 84-153) can bind to different types of ssDNA, which is fundamental for binding its physiological substrates. Functional analysis indicates that both the N- and C-termini are essential for the cooperative effect in binding ssDNA. The ssDNA-bound complex displays a spiral filament assembly that is adopted by many proteins that are involved in DNA replication, such as DnaA, RecA and PriB. This domain is similar to pfam08585 except that it contains an extra loop at the N-terminus (84-99). Structural analysis indicates that this extra loop might be essential for the stabilisation of the three-helix bundle.	0
421994	cl36576	RX-CC_like	Coiled-coil domain of the potato virus X resistance protein and similar proteins. This entry represents the N-terminal domain found in many plant resistance proteins. This domain has been predicted to be a coiled-coil; however, the structure shows that it adopts a four helical bundle fold.	0
421995	cl36727	NaPi_cotrn_rel	Na/Pi-cotransporter. Proteins of this family belong to the Phosphate:Na+ Symporter (PNaS) superfamily.	0
422182	cl37813	DUF2791	P-loop Domain of unknown function (DUF2791). BrxD is an ATP-binding protein found in types 2 and 6 of BREX (bacteriophage exclusion) phage resistance systems.	0
422465	cl38185	BTP	Chlorhexidine efflux transporter. PACE (proteobacterial antimicrobial compound efflux) transporters are single component proton-coupled efflux pumps that help confer resistance to a number of biocides and antibiotics. The family has also been named PCE (proteobacterial chlorhexidine efflux). Members of this subfamily of the PACE transporters, distinct from the AceI-like branch, include several whose expression is increased by exposure to chlorhexidine and/or help confer increased resistance to it.	0
422524	cl38252	YbbR	YbbR domain. The members of this family are all hypothetical bacterial proteins of unknown function, and are similar to the YbbR protein expressed by Bacillus subtilis. One member is annotated as an uncharacterized secreted protein, whereas another member is described as a hypothetical protein in the 5' region of the def gene of Thermus thermophilus, which encodes a deformylase, but no further information was found in either case. This region is found repeated up to four times in many members of this family.	0
422701	cl38447	DUF4277	Domain of unknown function (DUF4277). Members of this protein family are DDE type transposases encoded by the IS1634 family elements, which were first identified and characterized in Mycoplasma mycoides.	0
422835	cl38594	DUF5357	Family of unknown function (DUF5357). Proteins of this family are components of cyanobacterial septal junctions (microplasmodesmata) in heterocyst-forming cyanobacteria.	0
422937	cl38795	CnrY	anti-sigma factor CnrY. This family is found in alpha and beta proteobacteria. Family members include anti-sigma factor CnrY from Cupriavidus metallidurans. Sigma factors are multi-domain sub-units of bacterial RNA polymerase (RNAP) that play critical roles in transcription initiation, including the recognition and opening of promoters as well as the initial steps in RNA synthesis. They also control a wide variety of adaptive responses such as morphological development and the management of stress. A recurring theme in sigma factor control is their sequestration by anti-sigma factors that occlude their RNAP-binding determinants. CnrH, controls cobalt and nickel resistance in Cupriavidus metallidurans. CnrH is regulated by a complex of two transmembrane proteins: the periplasmic sensor CnrX and the anti-sigma CnrY. At rest, CnrH is sequestered by CnrY whose 45-residue-long cytosolic domain is one of the shortest anti-sigma domains. Upon Ni(II) or Co(II) ions detection by CnrX in the periplasm, CnrH is released between CnrH and the cytosolic domain of CnrY (CnrYc). The CnrH/CnrYC complex displays an unexpected structural similarity to the anti-sigma NepR in complex with its antagonist PhyR, whereas NepR shares no sequence similarity with CnrY. Crystal structure of CnrH/CnrY shows that CnrYC residues 3-19 are folded as a well-defined alpha-helix. The peptide further extends along the hydrophobic groove of sigma 2 with no canonical structure except for a short helical turn spanning residues 24-28. CnrY has a hydrophobic knob made of V4, W7 and L8 side chains protruding into sigma 4 hydrophobic pocket and contributing to the interface. In vivo investigation of CnrY function pinpoints part of the hydrophobic knob as a hotspot in CnrH inhibitory binding.	0
422948	cl38891	SARS-CoV_ORF9c	accessory protein ORF9c (also referred to as ORF14) from Severe acute respiratory syndrome-associated coronavirus and related coronaviruses. This is a family of unknown function found in SARS coronavirus.	0
422950	cl38901	T3SC_I-like	class I type III secretion system (T3SS) chaperones and similar proteins. TtfA (trehalose monomycolate transport factor A) plays a role in the transport of trehalose monomycolate across the inner membrane, potentially by forming a complex with the atypical lipid transporter MmpL3. Trehalose monomycolate is a component of the mycobacterial envelope. The core domain of TtfA shows strong structural similarity to class I type III secretion system (T3SS) chaperones, and TtfA may play other roles besides assisting in mycolate transport, given its phylogenetic distribution.	0
365778	cl38902	YEATS	YEATS domain family, chromatin reader proteins. YEATS domain containing proteins, which include Transcription initiation factor TFIID subunits 14 and 14b of Arabidopsis, shown to be part of the TFIID general transcriptional regulator complex in a two-hybrid screen.  DNA regulation by chromatin through histone post-translational modification and other mechanisms involves complexes with writer, eraser and reader functions. YEATS domains act as readers of the chromatin state, and stimulate transcriptional activity, through preferential interactions with crotonylated lysines on histones.   The YEATS family is named for several family members: 'YNK7', 'ENL', 'AF-9', and 'TFIIF small subunit', and also contains the GAS41 protein.	0
365779	cl38903	RMtype1_S_TRD-CR_like	Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR) and similar domains. The recognition sequences of Campylobacter jejuni RM 2232 S subunit (S.Cje2232P) and Shewanella baltica OS223 S subunit (S.Sba223ORF389P) are undetermined. The restriction-modification (RM) system S subunit consists of two variable target recognition domains (TRD1 and 2) and two conserved regions (CR1 and CR2) which separate the TRDs. The TRDs each bind to different specific sequences in the DNA. RM systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one enzyme complex composed of one DNA specificity (S) subunit (this family), two modification (M) subunits and two restriction (R) subunits. This model contains both TRD1-CR1 and TRD2-CR2. Also included in this subfamily is the C-terminal TRD-CR-like sequence-recognition domain of Microcystis aeruginosa putative type I N6-adenine DNA methyltransferase M subunit (M.Mae7806ORF3969P). The recognition sequence of M.Mae7806ORF3969P is undetermined.	0
365780	cl38904	AldB-like	proteins similar to alpha-acetolactate dehydrogenase. alpha-acetolactate decarboxylase (AldB, E.C. 4.1.1.5) converts acetolactate ((2S)-2-hydroxy-2-methyl-3-oxobutanoate) into acetoin ((3R)-3-hydroxybutan-2-one) and CO(2). Acetoin may be secreted by the cells, perhaps in order to control the internal pH. AldB may function as a regulator in valine and leucine biosynthesis and in catalyzing the second step of the 2,3-butanediol pathway. The structure of this domain displays an alpha-beta-beta-alpha four layer topology, with an HxHxxxxxxxxxxH motif  (x could be any residue) that coordinates a zinc ion.	0
365781	cl38905	longin-like	Longin-like domains. Trafficking protein particle complex subunit 4 (TRAPPC4), also known as synbindin or TRS23, has been identified as a component of the transport protein particle (TRAPP), required for tethering endoplasmic reticulum (ER)-derived vesicles to Golgi membranes and for Golgi traffic.	0
365782	cl38906	ATP-synt_Fo_Vo_Ao_c	ATP synthase, membrane-bound Fo/Vo/Ao complexes, subunit c. This family includes subunit c of F-ATP synthase (also called ATP synthase F(o) sector subunit c, F-type ATPase subunit c, or F-ATPase subunit c) and similar proteins. It is a proton-translocating subunit of the ATP synthase encoded by gene atpE.	0
422951	cl38907	SWIB-MDM2	SWIB/MDM2 domain family. This family includes the SWIB domain and the MDM2 domain. The p53-associated protein (MDM2) is an inhibitor of the p53 tumor suppressor gene binding the transactivation domain and down regulating the ability of p53 to activate transcription. This family contains the p53 binding domain of MDM2.	0
365784	cl38908	BTB_POZ	BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain superfamily. ZBTB42 is a transcriptional repressor that specifically binds DNA and probably acts by recruiting chromatin remodeling multiprotein complexes. It is enriched in skeletal muscles, especially at the neuromuscular junction. A ZBTB42 mutation has been identified to define a novel lethal congenital contracture syndrome (LCCS6), a lethal autosomal recessive form of arthrogryposis multiplex congenita (AMC). ZBTB42 contains a BTB/POZ domain, a common protein-protein interaction motif of about 100 amino acids.	0
365785	cl38909	ATP-synt_F1_V1_A1_AB_FliI_N	ATP synthase, alpha/beta subunits of F1/V1/A1 complex, flagellum-specific ATPase FliI, N-terminal domain. The alpha (A) subunit of the V1/A1 complexes of V/A-type ATP synthases, N-terminal domain.  The V- and A-type family of ATPases are composed of two linked multi-subunit complexes: the V1 or A1 complex contain three copies each of the alpha and beta subunits that form the soluble catalytic core, which is involved in ATP synthesis/hydrolysis, and the Vo or Ao complex that forms the membrane-embedded proton pore. The A-ATP synthase (AoA1-ATPase) is found in archaea and functions like F-ATP synthase. Structurally, however, the A-ATP synthase is more closely related to the V-ATP synthase (vacuolar VoV1-ATPase), which is a proton-translocating ATPase responsible for acidification of eukaryotic intracellular compartments and for ATP synthesis in archaea and some eubacteria. Collectively, the V- and A-type synthases can function in both ATP synthesis and hydrolysis modes.	0
365786	cl38910	ATP-synt_F1_V1_A1_AB_FliI_C	ATP synthase, alpha/beta subunits of F1/V1/A1 complex, flagellum-specific ATPase FliI, C-terminal domain. The C-terminal domain of the flagellum-specific ATPase/type III secretory pathway virulence-related protein. This group of ATPases are responsible for the export of flagellum and virulence-related proteins. The flagellum-specific ATPase FliI is the soluble export component that drives flagellar protein export, and it shows extensive similarity to the alpha and beta subunits of FoF1-ATP synthase. Although they both are proton driven rotary molecular devices, the main function of the bacterial flagellar motor is to rotate the flagellar filament for cell motility. Intracellular pathogens such as Salmonella and Chlamydia also have proteins which are similar to the flagellar-specific ATPase, but function in the secretion of virulence-related proteins via the type III secretory pathway.	0
365787	cl38911	LGIC_TM	transmembrane domain of Cys-loop neurotransmitter-gated ion channels. This family contains transmembrane (TM) domain of zinc-activated ligand-gated ion channel (ZAC). The transmembrane region consists of four transmembrane-spanning alpha-helical segments (M1-M4) that are linked by loops. The intracellular loop that links M1 and M2 determines the ion selectivity of the channel. ZAC displays low sequence similarity to other members in the superfamily, with closest matches to the human serotonin 5-HT3 receptor (5-HT3R) subunits 5-HT3A and 5-HT3B, and nAChR alpha7 subunits that exhibit approximately 15% amino acid sequence identity to ZAC. Expression of ZAC has been detected in human fetal whole brain, spinal cord, pancreas, placenta, prostate, thyroid, trachea, and stomach, as well as in adult hippocampus, striatum, amygdala, and thalamus. ZAC forms an ion channel gated by Zn2+, Cu2+, and H+, and is non-selectively permeable to monovalent cations. However, the role of ZAC in Zn2+, Cu2+, and H+ signaling is as yet unknown.	0
422952	cl38912	SPOUT_MTase	SPOUT superfamily of SAM-dependent RNA methyltransferases. This family has a Rossmanoid fold, with a deep trefoil knot in its C-terminal region. It has structural similarity to RNA methyltransferases, and is likely to function as an S-adenosyl-L-methionine (SAM)-dependent RNA 2'-O methyltransferase.	0
365789	cl38913	ABC_6TM_exporters	Six-transmembrane helical domain of the ATP-binding cassette transporters. ATP-binding cassette sub-family B member 9 is also known as transporter associated with antigen processing, TAP-like protein, TAPL, and ABCB9. It is a half transporter that forms a homodimeric lysosomal peptide transport complex. It belongs to the ABC_6TM_TAP_ABCB8_10_like subgroup of the ABC_6TM exporter family. The ABC_6TM exporter family represents the six transmembrane (TM) helices typically found in the ATP-binding cassette (ABC) transporters that function as exporters, which contain 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds and various types of lipids. In addition to ABC exporters, ABC transporters include two classes of ABC importers, classified depending on details of their architecture and mechanism. Only the ABC exporters are included in the ABC_6TM exporter family. ABC transporters typically consist of two transmembrane domains (TMDs) and two nucleotide-binding domains (NBDs). The sequences and structures of the TMDs are quite varied between the different types of transporters, suggesting chemical diversity of the translocated substrates, whereas NBDs are conserved among all ABC transporters. The two NBDs together bind and hydrolyze ATP, thereby providing the driving force for transport, while the TMDs participate in substrate recognition and translocation across the lipid membrane. However, some ABC genes are organized as half-transporters, which must form either homodimers or heterodimers to form a functional unit.	0
365790	cl38914	NP-I	nucleoside phosphorylase-I family. This subfamily includes both bacterial and plant 5'-methylthioadenosine/S-adenosylhomocysteine (MTA/SAH) nucleosidases (MTANs), as well as futalosine nucleosidase and adenosylhopane nucleosidase. Bacterial MTANs show comparable efficiency in hydrolyzing MTA and SAH, while plant enzymes are highly specific for MTA and are unable to metabolize SAH or show significantly reduced activity towards SAH. MTAN is involved in methionine and S-adenosyl-methionine recycling, polyamine biosynthesis, and bacterial quorum sensing. This subfamily belongs to the nucleoside phosphorylase-I (NP-I) family, whose members accept a range of purine nucleosides as well as the pyrimidine nucleoside uridine. The NP-I family includes phosphorolytic nucleosidases, such as purine nucleoside phosphorylase (PNPs, EC 2.4.2.1), uridine phosphorylase (UP, EC 2.4.2.3), and 5'-deoxy-5'-methylthioadenosine phosphorylase (MTAP, EC 2.4.2.28), and hydrolytic nucleosidases, such as AMP nucleosidase (AMN, EC 3.2.2.4), and 5'-methylthioadenosine/S-adenosylhomocysteine (MTA/SAH) nucleosidase (MTAN, EC 3.2.2.16). The NP-I family is distinct from nucleoside phosphorylase-II, which belongs to a different structural family.	0
365791	cl38915	DEAD-like_helicase_C	C-terminal helicase domain of the DEAD-like helicases. ATP-dependent DNA helicase RecG plays a critical role in recombination and DNA repair. RecG helps process Holliday junction intermediates to mature products by catalyzing branch migration. It is a DEAD-like helicase belonging to superfamily (SF)2, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Similar to SF1 helicases, SF2 helicases do not form toroidal structures like SF3-6 helicases. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. This model describes the C-terminal domain, also called HelicC.	0
365792	cl38916	HK_sensor	Sensor domains of Histidine Kinase receptors. Histidine kinase (HK) receptors are part of two-component systems (TCS) in bacteria that play a critical role for sensing and adapting to environmental changes. Typically, HK receptors contain an extracellular sensing domain flanked by two transmembrane helices, an intracellular dimerization histidine phosphorylation domain (DHp), and a C-terminal kinase domain, with many variations on this theme. HK receptors in this family contain double PDC (PhoQ/DcuS/CitA) sensor domains. Signals detected by the sensor domain are transmitted through DHp to the kinase domain, resulting in the phosphorylation of a conserved histidine residue in DHp; phosphotransfer to a conserved aspartate in its cognate response regulator (RR) follows, which leads to the activation of genes for downstream cellular responses. The HK family includes not just histidine kinase receptors but also sensors for chemotaxis proteins and diguanylate cyclase receptors, implying a combinatorial molecular evolution.	0
422953	cl38917	Tiki_TraB-like	diverse proteins related to the Tiki and TraB protease domains. pAD1 is a haemolysin/bacteriocin plasmid originally identified in Enterococcus faecalis DS16. It encodes a mating response to a peptide sex pheromone, cAD1, secreted by recipient bacteria. Once the plasmid pAD1 is acquired, production of the pheromone ceases--a trait related in part to a determinant designated traB. However a related protein is found in C. elegans, suggesting that members of the TraB family have some more general function. This family also includes the bacterial GumN protein. The family has a conserved GXXH motif close to the N-terminus, a conserved glutamate and a conserved arginine that may be catalytic. The family also includes a second conserved GXXH motif near the C-terminus. This family also contains the Tiki proteins that regulate Wnt signalling.	0
422954	cl38918	Peptidase_M15	Metalloproteases including zinc D-Ala-D-Ala carboxypeptidase, L-Ala-D-Glu peptidase, L,D-carboxypeptidase, bacteriophage endolysins, and related proteins. This family resembles VanY, pfam02557, which is part of the peptidase M15 family.	0
365795	cl38919	AfaD_SafA-like	AfaD-like family of invasins. This subfamily is composed of Yersinia pestis PsaA, Yersinia enterocolitica MyfA, and similar proteins. PsaA and MyfA are the major subunits of pH 6 antigen (Psa) and Myf fimbrial homopolymers. Psa and Myf specifically recognize beta1-3- or beta1-4-linked galactose in glycosphingolipids, but while Psa also binds phosphatidylcholine, Myf does not. Psa has acquired a tyrosine-rich surface that enables it to bind to phosphatidylcholine and mediate adhesion of Y. pestis/pseudotuberculosis to alveolar cells. Myf has specialized as a carbohydrate-binding adhesin, facilitating the attachment of Y. enterocolitica to intestinal cells. During fimbria/pili assembly, polymerization occurs when the N-terminal extension (NTE) of one monomer is inserted into an adjacent monomer, providing the final beta strand or G-strand, to complete the Ig-like fold, in a mechanism called the donor-strand complementation (DSC) or donor-strand exchange (DSE).	0
365796	cl38920	T3SS_Flik_C_like	C-terminal domain of type III secretion proteins FliK, HrpP, YscP, and similar domains. The flagellar hook-length control protein FliK is a soluble cytoplasmic protein that is secreted during flagellar formation. It controls hook elongation by two successive events: by determining hook length and by stopping the supply of hook protein. It contains an N-terminal domain that determines hook length and a C-terminal domain that is responsible for switching secretion from the hook protein to that of the filament protein, by interacting with FlhB, the switchable secretion gate.	0
365797	cl38921	HLD_clamp	helical lid domain of clamp loader-like AAA+ proteins. Replication factor C (RFC) is a five-protein clamp loader complex that forms a stable ATP-dependent complex with the sliding clamp, PCNA, which binds specifically to primed DNA.  RFC subunits belong to the clamp loader clade of the AAA+ superfamily.	0
422955	cl38923	PIN_Mut7-C-like	PIN domain at the C-terminus of Caenorhabditis elegans exonuclease Mut-7 and related domains. This is a domain of unknown function found in potential toxin-antitoxin system component.	0
393294	cl38924	Wnt	Wnt domain found in the WNT signaling gene family, also called Wingless-type mouse mammary tumor virus (MMTV) integration site family. Wnt-10b, also called protein Wnt-12, specifically activates canonical Wnt/beta-catenin signaling and thus triggers beta-catenin/LEF/TCF-mediated transcriptional programs. It is involved in signaling networks controlling stemness, pluripotency and cell fate decisions. Wnt-10b is unique and plays an important role in differentiation of epithelial cells in the hair follicle. Wnt genes have been identified in vertebrates and invertebrates, but not in plants, unicellular eukaryotes, or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families. Wnt signaling orchestrates and influences a myriad of cellular processes, such as cell proliferation, differentiation, tumorigenesis, apoptosis, and participation in immune defense during microbial infection.	0
393295	cl38925	TenA_PqqC-like	TenA-like proteins including TenA_C and TenA_E proteins, as well as pyrroloquinoline quinone (PQQ) synthesis protein C. This family contains proteins with similarity to TenA, and includes bacterial coenzyme pyrroloquinoline quinone (PQQ) synthesis protein C or PQQC proteins. PQQ is the prosthetic group of several bacterial enzymes, including methanol dehydrogenase of methylotrophs and the glucose dehydrogenase of a number of bacteria. PQQC catalyzes the last step of PQQ biogenesis which involves a ring closure and an eight-electron oxidation of the substrate [3a-(2-amino-2-carboxyethyl)-4,5-dioxo-4,5,6,7,8,9-hexahydroquinoline-7,9-dicarboxylic acid (AHQQ)]. The exact molecular function of members of this family is unclear. Also belonging to this family is Chlamydia protein CADD (Chlamydia protein Associating with Death Domains), a redox protein toxin unique to Chlamydia species, which modulates host cell apoptosis; its redox activity and death domain binding ability may be required for this biological activity. CADD may have a role in folate metabolism.	0
422956	cl38926	serpin	SERine Proteinase INhibitors (serpin) family. Structure is a multi-domain fold containing a bundle of helices and a beta sandwich.	0
393297	cl38927	Peptidase_M90-like	M90 peptidase is a zinc-metallopeptidase. This subfamily contains uncharacterized M90 peptidase-like domains, similar to the Mlc Titration Factor A (MtfA) peptidase from Escherichia coli, also known as the YeeI gene product, which is involved in the control of the glucose-phosphotransferase sensory and regulatory system by inactivation of the repressor Mlc (making large colonies). E. coli MtfA has been shown to have aminopeptidase activity with the presence of a single zinc ion in the active site ligated by two histidines in an HEXXH motif. MtfA is related to the catalytic domain of the anthrax lethal factor and the Mop protein involved in the virulence of Vibrio cholerae; although sequence similarity is low, conservation is observed in the overall structure as well as in the residues around the active site.	0
422958	cl38930	AmyAc_family	Alpha amylase catalytic domain family. This family, around 100 residues long, is located at the C-terminus of some uncharacterized proteins in various Bacteroides and Prevotella species. The function of this family remains unknown.	0
422963	cl38936	P-loop_NTPase	P-loop containing Nucleoside Triphosphate Hydrolases. NVL exists in two forms with N-terminal extensions of different lengths in mammalian cells. NVL has two alternatively spliced isoforms, a short form, NVL1, and a long form, NVL2. NVL2, the major species, is mainly present in the nucleolus, whereas NVL1 is nucleoplasmic. Each has an N-terminal domain, followed by two tandem ATPase domains; this subfamily includes the first of the two ATPase domains. NVL2 is involved in the biogenesis of the 60S ribosome subunit by associating specifically with ribosome protein L5 and modulating the function of DOB1. NVL2 is also required for telomerase assembly and the regulation of telomerase activity, and is involved in pre-rRNA processing. The role of NVL1 is unclear. This RecA-like_NVL_r1-like subfamily belongs to the RecA-like NTPase family which includes the NTP binding domain of F1 and V1 H(+)ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. The RecA-like NTPase family also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion.	0
422965	cl38938	RNR_PFL	Ribonucleotide reductase and Pyruvate formate lyase. The proteins in this family are functionally uncharacterized. The proteins are around 450 amino acids long. It is likely that this family represents a group of glycerol-3-phosphate dehydrogenases.	0
422966	cl38939	phosphohexomutase	N/A. The MMP1680 protein from Methanococcus maripaludis has been characterized as the archaeal protein responsible for the second step of UDP-GlcNAc biosynthesis. This GlmM protein catalyzes the conversion of glucosamine-6-phosphate to glucosamine-1-phosphate. The first-characterized bacterial GlmM protein is modeled by TIGR01455. These two families are members of the larger phosphoglucomutase/phosphomannomutase family (characterized by three domains: pfam02878, pfam02879 and pfam02880), but are not nearest neighbors to each other. This model also includes a number of sequences from non-archaea in the Bacteroides, Chlorobi, Chloroflexi, Planctomycetes and Spirochaetes lineages. Evidence supporting their inclusion in this equivalog as having the same activity comes from genomic context and phylogenetic profiling. A large number of these organisms are known to produce exo-polysaccharide and yet only appeared to contain the GlmS enzyme of the GlmSMU pathway for UDP-GlcNAc biosynthesis (GenProp0750). In some organisms including Leptospira, this archaeal GlmM is found adjacent to the GlmS as well as a putative GlmU non-orthologous homolog. Phylogenetic profiling of the GlmS-only pattern using PPP identifies members of this archaeal GlmM family as the highest-scoring result. [Central intermediary metabolism, Amino sugars]	0
393315	cl38945	BrxE_fam	BrxE family protein. Members of this family are BrxE, a protein of unknown function that is found in type 6 BREX systems of phage defense.	0
422967	cl38947	Spa1_C	Lantibiotic immunity protein Spa1 C-terminal domain. This HMM describes a domain that occurs twice in the nisin lantibiotic self-immunity lipoprotein NisI, and once in the subtilin lantibiotic self-immunity lipoprotein SpaI, and once or twice in numerous other known or putative lantibiotic resistance lipoproteins.	0
393319	cl38949	T6SS_TagK_dom	TagK family protein C-terminal domain. Members of this family have full-length homology to SciF, a type VI secretion system (T6SS) protein from Salmonella typhimurium  island SPI-6. Homologs occur in some but not all T6SS loci, and the broader family is now called TagK.	0
393320	cl38950	PriX	Primase X. In most archaea, the eukaryotic-type DNA primase has catalytic subunit PriS and a regulatory subunit PriL. The proteins in this family are PriX, an essential second noncatalytic subunit found in a subset of the archaea.	0
422969	cl38951	BB_PF	Beta barrel Pore-forming domain. Members of this family are secreted in a water-soluble pro-toxin form, but undergo cleavage and oligomerization to form beta-barrel pore. The founding member of the family is monalysin from Pseudomonas entomophila. This family is built narrowly, and therefore excludes a set of pore-forming proteins (not necessarily toxins) from a eukaryote, Dictyostelium. Analogous (but perhaps not homologous) beta-type pore-forming toxins include aerolysin and leukocidin.	0
422972	cl38962	retention_LapA	retention module-containing protein. Members of this family are lipoprotein LipL45, as described in Leptospira interrogans serovar Copenhageni str. Fiocruz L1-130 but found broadly in the genus Leptospira. Close homologs that are not lipoproteins by sequence are likely defective in their reported coding region.	0
422973	cl38966	LD_cluster2	SLOG cluster2. Family in the SLOG superfamily, observed to associate with a predicted effector protein containing one enzymatically active and inactive copy of the TIR domain.	0
422978	cl38972	TetR_C_30	Tetracyclin repressor-like, C-terminal domain. Members of this family are found in various prokaryotic transcriptional regulator proteins. Their exact function has not, as yet, been identified.	0
422979	cl38973	TetR_C_24	Tetracyclin repressor-like, C-terminal domain. This is the C-terminal domain present in putative TetR transcriptional regulators.	0
422980	cl38975	TetR_C_15	Tetracyclin repressor-like, C-terminal domain. TetR family regulators are involved in the transcriptional control of multidrug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity. The TetR proteins identified in multiple genera of bacteria and archaea share a common helix-turn-helix (HTH) structure in their DNA-binding domain. However, TetR proteins can work in different ways: they can bind a target operator directly to exert their effect (e.g. TetR binds the Tet(A) gene to repress it in the absence of tetracycline), or they can be involved in complex regulatory cascades in which the TetR protein can either be modulated by another regulator or TetR can trigger the cellular response. TetR regulates the expression of the membrane-associated tetracycline resistance protein, TetA, which exports the tetracycline antibiotic out of the cell before it can attach to the ribosomes and inhibit protein synthesis. TetR blocks transcription from the genes encoding both TetA and TetR in the absence of antibiotic. The C-terminal domain is multi-helical and is interlocked in the homodimer with the helix-turn-helix (HTH) DNA-binding domain. This entry represents the C-terminal domain present in the TetR transcriptional repressor sco1712 proteins from Streptomyces coelicolor, which act as a regulator of antibiotic production.	0
422988	cl38983	Ig_mannosidase	Ig-fold domain. This domain can be found in 2 glycoside hydrolase subfamily of beta-glucosaminidases (EC:3.2.1.165) such as CsxA, from Amycolatopsis orientalis that has exo-beta-D-glucosaminidase (exo-chitosanase) activity. It has an immunoglobulin-like topology.	0
422989	cl38984	sCache_3_3	Single cache domain 3. Cache_3 is the periplasmic sensor domains of sensor histidine kinase of E. coli DcuS. This domain forms one of the components of the two-component signalling system that allows bacteria to adapt to changing environments. The ability of bacteria to monitor and adapt to their environment is crucial to their survival, and two-component signal transduction systems mediate most of these adaptive responses. One component is a histidine kinase sensor - this domain - most commonly part of a homodimeric transmembrane sensor protein, and the second component is a cytoplasmic response regulator. The two components interact in tandem through a phospho-transfer cascade.	0
422991	cl38987	2_5_RNA_ligase2	2'-5' RNA ligase superfamily. Members of this family are bacterial and archaeal RNA ligases that are able to ligate tRNA half molecules containing 2',3'-cyclic phosphate and 5' hydroxyl termini to products containing the 2',5' phosphodiester linkage. Each member of this family contains an internal duplication, each of which contains an HXTX motif that defines the family. The structure of a related protein is known. They belong to the 2H phosphoesterase superfamily. They share a common active site, characterized by two conserved histidines, with vertebrate myelin-associated 2',3' phosphodiesterases, plant Arabidopsis thaliana CPDases and several bacteria and virus proteins.	0
422992	cl38988	WHG	WHG domain. This presumed domain is around 80 amino acids in length. It is found to the C-terminus of a DNA-binding helix-turn-helix domain. This domain may be involved in binding to an as yet unknown ligand that allows a transcriptional regulation response to that molecule. The domain is named WHG after three conserved residues near the C-terminus of the domain.	0
422993	cl38990	EndIII_4Fe-2S	Iron-sulfur binding domain of endonuclease III. Escherichia coli endonuclease III (EC 4.2.99.18) is a DNA repair enzyme that acts both as a DNA N-glycosylase, removing oxidized pyrimidines from DNA, and as an apurinic/apyrimidinic (AP) endonuclease, introducing a single-strand nick at the site from which the damaged base was removed. Endonuclease III is an iron-sulfur protein that binds a single 4Fe-4S cluster. The 4Fe-4S cluster does not seem to be important for catalytic activity, but is probably involved in the proper positioning of the enzyme along the DNA strand. The 4Fe-4S cluster is bound by four cysteines which are all located in a 17 amino acid region at the C-terminal end of endonuclease III. A similar region is also present in the central section of mutY and in the C-terminus of ORF-10 and of the Micrococcus UV endonuclease.	0
422995	cl38996	T2SSK	Type II secretion system (T2SS), protein K. Members of this family are involved in the Type II protein secretion system. The T2SK family includes proteins such as ExeK, PulK, OutX and XcpX.	0
422996	cl38998	MCR_alpha_N	Methyl-coenzyme M reductase alpha subunit, N-terminal domain. Members of this protein family are the alpha subunit of methyl coenzyme M reductase, also called coenzyme-B sulfoethylthiotransferase (EC 2.8.4.1). This enzyme, with alpha, beta, and gamma subunits, catalyzes the last step in methanogenesis. Several methanogens encode two such enzymes, designated I and II; this model does not separate the isozymes. [Energy metabolism, Methanogenesis]	0
422999	cl39004	HrcA	HrcA protein C terminal domain. HrcA represses the class I heat shock operons groE and dnaK; overproduction prevents induction of these operons by heat shock while deletion allows constitutive expression even at low temperatures. In Bacillus subtilis, hrcA is the first gene of the dnaK operon and so is itself a heat shock gene. [Regulatory functions, DNA interactions]	0
423004	cl39012	baeRF_family10	Bacterial archaeo-eukaryotic release factor family 10. Bacterial family of the archaeo-eukaryotic release factor superfamily. Likely to play roles in biological conflicts or regulation under stress conditions at the ribosome. This family contains a well-conserved 'FP' motif in the catalytic loop.	0
423005	cl39013	PBECR3	phage-Barnase-EndoU-ColicinE5/D-RelE like nuclease3. A predicted endoRNase of the Barnase-EndoU-ColicinE5/D-RelE like nuclease fold found in polyvalent proteins of phages. The predicted active site contains conserved arginine and threonine residues.	0
423006	cl39014	CxC2	CxC2 like cysteine cluster associated with KDZ transposases. A predicted Zinc chelating domain present N-terminal to the KDZ transposase domain.	0
423007	cl39015	LRR_RI	N/A. Leucine-rich repeats are composed of a beta-alpha unit. This repeat unit is found as capping unit (N- or C- terminal of the repeat region) of Ribonuclease Inhibitors.	0
423013	cl39021	aGPT-Pplase2	Alpha-glutamyl/putrescinyl thymine pyrophosphorylase clade 2. An alpha helical domain related to the alpha-helical DNA glycosylases, predicted to catalyze the in situ synthesis of hypermodified bases such as alpha-glutamyl, putrescinyl thymine, 5-(2-aminoethoxy)methyluridine or 5-(2-aminoethyl)uridine. The enzyme is predicted to utilize a high-energy pyrophosphate DNA base intermediate which is subject to a nucleophilic attack by the modifying moiety. Mainly found in bacterial mobile operons.	0
423014	cl39022	SNAD1	Secreted Novel AID/APOBEC-like Deaminase 1. A family of secreted AID/APOBEC like deaminases found in ray-finned fishes.	0
423015	cl39023	HEPN_RiboL-PSP	RiboL-PSP-HEPN. HEPN-like nuclease. MAE_28990 In operon with a ParB nuclease and DNA methylase genes. MAE_18760-like HEPN found fused to HEPN/RES-NTD1, HEPN/Toprim-NTD1, Schlafen and a novel beta rich domain. In operon with ParA/Soj ATPase of SIMIBI-type GTPase fold.	0
423016	cl39024	BclA_C	BclA C-terminal domain. This model often occurs at the C-terminus, and companion model N_to_GlyXaaXaa (NF033172) at the N-terminus, of proteins that in between consist largely of variable numbers of Gly-Xaa-Xaa repeats, reminiscent of collagen repeats.  Member proteins observed have been found so far only in Gram-positive bacteria.  This domain contains a motif IPxTG near its C-terminus, suggesting it is processed by some form of sortase.	0
423019	cl39027	RuvC_1	RuvC nuclease domain. This is a nuclease (NUC) domain found in Cpf1, an RNA-guided endonuclease of a type V CRISPR-Cas system. Structural and functional analysis indicate that this domain is involved in DNA cleavage.	0
423021	cl39030	zf_CCCH_4	Zinc finger domain. This short zinc binding domain has the pattern of three cysteines and one histidine to coordinate the zinc ion. This domain is found in a wide variety of proteins such as E3 ligases.	0
423022	cl39031	mCpol	minimal CRISPR polymerase domain. The mCpol domain (minimal CRISPR polymerase) is named for its homology relationship to the catalytic domain of the CRISPR polymerases (often called Cmr2 or Cas10). It is predicted to generate cyclic nucleotides, potentially sensed by CARF domains which in turn activate various effector domains including HEPN RNases. CARF sensors and effectors are found in conserved genome contexts. It is part of a broader class of conflict systems reliant on the production of second messenger nucleotide or nucleotide derivatives. The putative function of the mCpol domain implies that CRISPR polymerases of the type III CRISPR/Cas systems have a nucleotide synthetase functional role.	0
423023	cl39032	LSDAT_euk	SLOG in TRPM. Family in the SLOG superfamily, fused to or operonically associating with SLATT domain in diverse prokaryotes. Predicted to function as ligand sensor in conjunction with the SLATT transmembrane domain.	0
423024	cl39033	ISP1_C	ISP1 C-terminal. This is the C-terminal domain of ISP3 protein, which plays a role in asexual daughter cell formation, for example in T. gondii. The domain consists of a seven-stranded antiparallel beta-sandwich bordered on one end by an interstrand loop (open end) and capped at the other end by an amphipathic C-terminal helix (closed end). The loop between beta 5 and beta 6 is extended and variable. The domain adopts a pleckstrin homology (PH) fold, despite having negligible sequence similarity. PH domains are often found in proteins that support protein-lipid interactions and play a role in mediating membrane localization through IP binding. However, the phospholipid-binding properties of PH domains are not conserved in ISP3. Unlike PH domains, ISP3 is cysteine rich. The cysteine-rich nature of the ISP3s and the number of surface-exposed cysteines may result in redox instability and may also facilitate higher order multimerization. There are no disulfide bonds in ISP3, unlike in ISP1. It is worth noting that ISP1 and ISP3 share low sequence identity but contain the same secondary core elements.	0
423026	cl39035	Agglutinin_C	Agglutinin C-terminal. This is the C-terminal domain of the beta chain found in Polyporus squamosus lectin protein (PSL). PSL binds specifically to glycans terminating with the sequence: Neu5Ac.alpha2-6Gal.beta. The C-terminal domain is not involved in the binding to the Neu5Ac.alpha2-6Gal.beta. The C-terminal domain is characterized by a central five-stranded beta-sheet that is flanked by three alpha-helices and topped by a short strand. It shows high fold similarity to its closest relative, the Gal.alpha1-3Gal-binding agglutinin from the mushroom Marasmius oreades agglutinin (MOA).	0
423032	cl39045	MukF_N	bacterial condensin complex subunit MukF, N-terminal domain. The kicA and kicB genes are found upstream of mukB. It has been suggested that the kicB gene encodes a killing factor and the kicA gene codes for a protein that suppresses the killing function of the kicB gene product. It was also demonstrated that KicA and KicB can function as a post-segregational killing system, when the genes are transferred from the E. coli chromosome onto a plasmid.	0
423033	cl39046	UAE_UbL	Ubiquitin/SUMO-activating enzyme ubiquitin-like domain. E1 and E2 enzymes play a central role in ubiquitin and ubiquitin-like protein transfer cascades. This is an E2 binding domain that is found on NEDD8 activating E1 enzyme. The domain resembles ubiquitin, and recruits the catalytic core of the E2 enzyme Ubc12 in a similar manner to that in which ubiquitin interacts with ubiquitin binding domains.	0
423034	cl39049	DUF2300	Predicted secreted protein (DUF2300). This domain, found in various bacterial hypothetical and putative signal peptide proteins, has no known function.	0
423035	cl39051	ATG27	Autophagy-related protein 27. This family includes both Cation-dependent and cation independent mannose-6-phosphate receptors.	0
423037	cl39057	MS_channel	Mechanosensitive ion channel. Two members of this protein family of M. jannaschii have been functionally characterized. Both proteins form mechanosensitive (MS) ion channels upon reconstitution into liposomes and functional examination by the patch-clamp technique. Therefore members of this family are likely to also be MS channel proteins.	0
423038	cl39058	RNB	RNB domain. This family consists of an exoribonuclease, ribonuclease R, also called VacB. It is one of the eight exoribonucleases reported in E. coli and is broadly distributed throughout the bacteria. In E. coli, double mutants of this protein and polynucleotide phosphorylase are not viable. Scoring between trusted and noise cutoffs to the model are shorter, divergent forms from the Chlamydiae, and divergent forms from the Campylobacterales (including Helicobacter pylori) and Leptospira interrogans. [Transcription, Degradation of RNA]	0
393429	cl39059	Proton_antipo_M	Proton-conducting membrane transporter. This model describes the 14th (based on E. coli) structural gene, N, of bacterial and chloroplast energy-transducing NADH (or NADPH) dehydrogenases. This model does not describe any subunit of the mitochondrial complex I (for which the subunit composition is very different), nor NADH dehydrogenases that are not coupled to ion transport. The Enzyme Commission designation 1.6.5.3, for NADH dehydrogenase (ubiquinone), is applied broadly, perhaps unfortunately, even if the quinone is menaquinone (Thermus, Mycobacterium) or plastoquinone (chloroplast). For chloroplast members, the name NADH-plastoquinone oxidoreductase is used for the complex and this protein is designated as subunit 2 or B. This model also includes a subunit of a related complex in the archaeal methanogen, Methanosarcina mazei, in which F420H2 replaces NADH and 2-hydroxyphenazine replaces the quinone. [Energy metabolism, Electron transport]	0
423039	cl39061	OTCace	Aspartate/ornithine carbamoyltransferase, Asp/Orn binding domain. Members of this family are putrescine carbamoyltransferase (EC 2.1.3.6). There is some overlapping specificity with ornithine carbamoyltransferase (EC 2.1.3.3). The gene regularly is found next to agmatine deiminase and a carbamate kinase, suggesting a conserved catabolic agmatine deiminase pathway. [Energy metabolism, Amino acids and amines]	0
393432	cl39062	DUF5494	Family of unknown function (DUF5494). hypothetical protein	0
393433	cl39063	DUF5461	Family of unknown function (DUF5461). hypothetical protein	0
393434	cl39064	GerPE	Spore germination protein GerPE. Members of this family are required for formation of functionally normal spores. They may be involved in the establishment of spore coat structure or permeability.	0
423040	cl39066	GCV_T	Aminomethyltransferase folate-binding domain. This is a family of glycine cleavage T-proteins, part of the glycine cleavage multienzyme complex (GCV) found in bacteria and the mitochondria of eukaryotes. GCV catalyzes the catabolism of glycine in eukaryotes. The T-protein is an aminomethyl transferase.	0
393437	cl39067	Phage_T7_tail	Phage T7 tail fibre protein. hypothetical protein; Provisional	0
423042	cl39070	Ldl_recept_b	Low-density lipoprotein receptor repeat class B. Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin.	0
393443	cl39073	GapA	Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase [Carbohydrate transport and metabolism]. This model describes the type II glyceraldehyde-3-phosphate dehydrogenases which are limited to archaea. These enzymes catalyze the interconversion of 1,3-diphosphoglycerate and glyceraldehyde-3-phosphate, a central step in glycolysis and gluconeogenesis. In archaea, either NAD or NADP may be utilized as the cofactor. The class I GAPDH's from bacteria and eukaryotes are covered by TIGR01534. All of the members of the seed are characterized. See, for instance. This model is very solid, there are no species falling between trusted and noise at this time. The closest relatives scoring in the noise are the class I GAPDH's.	0
393445	cl39075	IlvB	Acetolactate synthase large subunit or other thiamine pyrophosphate-requiring enzyme [Amino acid transport and metabolism, Coenzyme transport and metabolism]. Two groups of proteins form acetolactate from two molecules of pyruvate. The type of acetolactate synthase described in this model also catalyzes the formation of acetohydroxybutyrate from pyruvate and 2-oxobutyrate, an early step in the branched chain amino acid biosynthesis; it is therefore also termed acetohydroxyacid synthase. In bacteria, this catalytic chain is associated with a smaller regulatory chain in an alpha2/beta2 heterotetramer. Acetolactate synthase is a thiamine pyrophosphate enzyme. In this type, FAD and Mg++ are also found. Several isozymes of this enzyme are found in E. coli K12, one of which contains a frameshift in the large subunit gene and is not expressed. [Amino acid biosynthesis, Pyruvate family]	0
423043	cl39076	Pyruvate_Kinase	N/A. This domain is actually a small beta-barrel domain nested within a larger TIM barrel. The active site is found in a cleft between the two domains.	0
393447	cl39077	PulD	Type II secretory pathway component GspD/PulD (secretin) [Intracellular trafficking, secretion, and vesicular transport]. A number of proteins homologous to the type IV pilus secretin PilQ (TIGR02515) are involved in type IV pilus formation, competence for transformation, type III secretion, and type II secretion (also called the main terminal branch of the general secretion pathway). The clade described by this model contains the outer membrane pore proteins of bacterial type III secretion systems, typified by YscC for animal pathogens and HrcC for plant pathogens. [Protein fate, Protein and peptide secretion and trafficking, Cellular processes, Pathogenesis]	0
393448	cl39078	SecA	Preprotein translocase subunit SecA (ATPase, RNA helicase) [Intracellular trafficking, secretion, and vesicular transport]. Members of this family are SecA2, part of a Sec-like preprotein translocase called accessory Sec. This SecA2 family is characteristic of Listeria species.	0
393449	cl39079	OadA1	Pyruvate/oxaloacetate carboxyltransferase [Energy production and conversion]. This model describes the bacterial oxaloacetate decarboxylase alpha subunit and its equivalents in archaea. The oxaloacetate decarboxylase Na+ pump is the paradigm of the family of Na+ transport decarboxylases that are present in bacteria and archaea. It is a multi-subunit enzyme consisting of a peripheral alpha-subunit and integral membrane subunits beta and gamma. The energy released by the decarboxylation reaction of oxaloacetate is coupled to Na+ ion pumping across the membrane. [Transport and binding proteins, Cations and iron carrying compounds, Energy metabolism, Other]	0
393450	cl39080	DnaX	DNA polymerase III, gamma/tau subunits [Replication, recombination and repair]. This model represents the well-conserved first ~365 amino acids of the translation of the dnaX gene. The full-length product of the dnaX gene in the model bacterium E. coli is the DNA polymerase III tau subunit. A translational frameshift leads to early termination and a truncated protein subunit gamma, about 1/3 shorter than tau and present in roughly equal amounts. This frameshift mechanism is not necessarily universal for species with DNA polymerase III but appears conserved in the extreme thermophile Thermus thermophilus. [DNA metabolism, DNA replication, recombination, and repair]	0
393451	cl39081	MotB	Flagellar motor protein MotB [Cell motility]. flagellar motor protein MotB; Reviewed	0
393452	cl39082	PycA	Pyruvate carboxylase  [Energy production and conversion]. Members of this family are ATP-dependent urea carboxylase, including characterized members from Oleomonas sagaranensis (alpha class Proteobacterium) and yeasts such as Saccharomyces cerevisiae. The allophanate hydrolase domain of the yeast enzyme is not included in this model and is represented by an adjacent gene in Oleomonas sagaranensis. The fusion of urea carboxylase and allophanate hydrolase is designated urea amidolyase. The enzyme from Oleomonas sagaranensis was shown to be highly active on acetamide and formamide as well as urea. [Central intermediary metabolism, Nitrogen metabolism]	0
393456	cl39086	FadR	DNA-binding transcriptional regulator, FadR family [Transcription]. transcriptional regulator NanR; Provisional	0
393457	cl39087	MurB	UDP-N-acetylenolpyruvoylglucosamine reductase [Cell wall/membrane/envelope biogenesis]. This model describes MurB, UDP-N-acetylenolpyruvoylglucosamine reductase, which is also called UDP-N-acetylmuramate dehydrogenase. It is part of the pathway for the biosynthesis of the UDP-N-acetylmuramoyl-pentapeptide that is a precursor of bacterial peptidoglycan. [Cell envelope, Biosynthesis and degradation of murein sacculus and peptidoglycan]	0
393460	cl39090	ComEA	DNA uptake protein ComE and related DNA-binding proteins [Replication, recombination and repair]. This model describes the ComEA protein in bacteria. The comE locus is obligatory for bacterial cell competence, the process of internalizing exogenously added DNA. Lesions in the locus have been variously described as causing competence-related phenotypes and impairment of competence, suggesting an intimate functional role in bacterial transformation. [Cellular processes, DNA transformation]	0
393461	cl39091	SdhA	Succinate dehydrogenase/fumarate reductase, flavoprotein subunit [Energy production and conversion]. This model represents the succinate dehydrogenase flavoprotein subunit as found in Gram-negative bacteria, mitochondria, and some Archaea. Mitochondrial forms interact with ubiquinone and are designated EC 1.3.5.1, but can be degraded to 1.3.99.1. Some isozymes in E. coli and other species run primarily in the opposite direction and are designated fumarate reductase. [Energy metabolism, Aerobic, Energy metabolism, Anaerobic, Energy metabolism, TCA cycle]	0
393462	cl39092	SucA	2-oxoglutarate dehydrogenase complex, dehydrogenase (E1) component, and related enzymes [Energy production and conversion]. 2-oxoglutarate dehydrogenase E1 component; Reviewed	0
423044	cl39093	Pyr_redox_2	Pyridine nucleotide-disulphide oxidoreductase. Members of this protein family include N-terminal sequence regions of (probable) bifunctional proteins whose C-terminal sequences are SelD, or selenide,water dikinase, the selenium donor protein necessary for selenium incorporation into protein (as selenocysteine), tRNA (as 2-selenouridine), or both. However, some members of this family occur in species that do not show selenium incorporation, and the function of this protein family is unknown.	0
423045	cl39094	Ank_2	Ankyrin repeats (3 copies). ankyrin repeat protein; Provisional	0
393467	cl39097	MCP_signal	Methyl-accepting chemotaxis protein (MCP), signaling domain. This domain is thought to transduce the signal to CheA since it is highly conserved in very diverse MCPs.	0
393468	cl39098	TrpE	Anthranilate/para-aminobenzoate synthases component I [Amino acid transport and metabolism, Coenzyme transport and metabolism]. Members of this protein family are salicylate synthases, bifunctional enzymes that make salicylate, in two steps, from chorismate. Members are homologous to anthranilate synthase component I from Trp biosynthesis. Members typically are found in gene regions associated with siderophore or other secondary metabolite biosynthesis.	0
423046	cl39166	CshA_repeat	Surface adhesin CshA repetitive domain. Many proteins with this repeat are LPXTG-anchored surface proteins of Firmicutes species, but the repeat occurs more broadly. Members include CshA from Streptococcus gordonii.	0
423165	cl39327	Bep_C_terminal	BID domain of Bartonella effector protein (Bep). The BID domain (Bartonella intracellular delivery domain) is recognized by the type IV secretion system (T4SS) virB (not trw) of Bartonella and related taxa (e.g. Ochrobactrum), and is found in T4SS effector proteins such as BepA, BepB, BepC, etc. Multiple copies of the domain may be found in a single protein.	0
423228	cl39406	RING0_parkin	RING finger-like zinc-binding domain 0 of parkin. This is a RING zinc finger domain found in parkin proteins. Parkin consists of a ubiquitin-like (Ubl) domain and a 60-amino acid linker followed by this domain RING0 and three additional zinc finger domains characteristic of the RBR family. RING0 binds two coordinated zinc atoms at each extremity of the domain with a hairpin. Deletion of RING0 massively derepressed parkin activity, supporting the role of RING0 in autoinhibition; point mutations in RING0 (Phe146 to Ala) or RING2 (Phe463 to Ala) both increased parkin activity. The REP (repressor element of parkin) and RING0 domains play a preeminent role in repressing parkin ligase activity through their interactions with RING1 and RING2, respectively.	0
423331	cl39524	SLATT_fungal	SMODS and SLOG-associating 2TM effector domain. The SLATT domain contains two transmembrane helices. SLATT domains are generally predicted to function in bacteria as pore-forming effectors in a class of conflict systems which are reliant on the production of second messenger nucleotide or nucleotide derivatives. SLATT domains are predicted to initiate cell suicide responses upon their activation. The role of this fungal family is not yet understood, although the expansion of the family in many fungal lineages points to a potential role in conflict.	0
423341	cl39535	SLATT_5	SMODS and SLOG-associating 2TM effector domain family 5. The SLATT domain contains two transmembrane helices. SLATT domains are generally predicted to function as pore-forming effectors in a class of conflict systems which are reliant on the production of second messenger nucleotide or nucleotide derivatives. SLATT domains are predicted to initiate cell suicide responses upon their activation. This SLATT family contains an additional C-terminal alpha-helix, and strictly associates with a reverse transcriptase domain, part of a predicted retroelement with diversity-generating potential.	0
423350	cl39545	SLATT_1	SMODS and SLOG-associating 2TM effector domain 1. The SLATT domain contains two transmembrane helices. SLATT domains are generally predicted to function as pore-forming effectors in a class of conflict systems which are reliant on the production of second messenger nucleotide or nucleotide derivatives. SLATT domains are predicted to initiate cell suicide responses upon their activation. This SLATT family is often C-terminally fused to the SLATT_3 family, and is typically operonically linked to either inactive TIR domains or SLOG domains which could act as regulators of the SLATT channels. In relatively rare instances, it is genomically linked as a standalone domain to the RelA/SpoT nucleotide synthetase and the predicted NA37/YejK sensor domain.	0
423351	cl39546	SLATT_2	SMODS and SLOG-associating 2TM effector domain 2. The SLATT domain contains two transmembrane helices. SLATT domains are generally predicted to function as pore-forming effectors in a class of conflict systems which are reliant on the production of second messenger nucleotide or nucleotide derivatives. SLATT domains are predicted to initiate cell suicide responses upon their activation. This SLATT family is the only prokaryotic SLATT family to exist as a standalone domain, with no as-yet discernable genome associations.	0
423385	cl39583	CdiI_ECL-like	inhibitor (or immunity protein) of the contact-dependent growth inhibition (CDI) system of Enterobacter cloacae, and similar proteins. This is the N-terminal domain of Contact-dependent growth inhibition immunity (CdiI) proteins present in Enterobacter cloacae. CdiI proteins neutralize CdiA-CT toxins to protect toxin-producing cells from auto-inhibition. Structural homology searches reveal that Enterobacter cloacae's CdiI is most similar to the Whirly family of single-stranded DNA-binding protein.	0
423424	cl39636	T3SS_ExsE	Type III secretion system ExsE. ExsE, through protein-protein interaction, serves in a regulatory cascade that modulates the role of ExsA, a transcriptional activator of Pseudomonas aeruginosa's type III secretion system (T3SS) regulon. ExsE itself is a substrate for translocation (i.e. removal) by the T3SS system, providing feedback that modulates expression of secretion system genes. Homologs found in multiple species of Aeromonas and Photorhabdus may be functionally equivalent. Note that VP1702 from Vibrio parahaemolyticus, given the same gene symbol and ascribed an equivalent function, appears unrelated in sequence.	0
423545	cl39768	CdiA-CT_Yk_RNaseA-like	C-terminal (CT) domain of the contact-dependent growth inhibition (CDI) system protein CdiA (CdiA-CT) of Yersinia kristensenii, and similar proteins. Contact-dependent growth inhibition (CDI) is an important mechanism of inter-bacterial competition found in many Gram-negative pathogens. CDI+ cells express cell-surface CdiA proteins that bind neighboring bacteria and deliver C-terminal toxin domains (CdiA-CT) to inhibit target-cell growth. Structure analysis of CdiA-CT shows that it adopts the same fold (with two beta-sheets forming an overall kidney shape) as angiogenin and other RNase A paralogs, but the toxin does not share sequence similarity with these nucleases and lacks the characteristic disulfide bonds of the superfamily. Furthermore, structural comparison analysis identified human angiogenin, Rana pipiens protein P-30 (onconase) and mouse pancreatic ribonuclease (RNase 1) as the closest structural homologs of CdiA-CT.	0
423675	cl39912	CdiI_Ykris-like	inhibitor (or immunity protein) of the contact-dependent growth inhibition (CDI) system of Yersinia kristensenii, and similar proteins. Contact-dependent growth inhibition (CDI) is an important mechanism of inter-bacterial competition found in many Gram-negative pathogens. CDI+ cells express cell-surface CdiA proteins that bind neighboring bacteria and deliver C-terminal toxin domains (CdiA-CT) to inhibit target-cell growth. CDI+ bacteria also produce CdiI immunity proteins, which specifically neutralize cognate CdiA-CT toxins to prevent self-inhibition. Structure analysis of CdiI immunity protein from Yersinia kristensenii shows that it is composed of eight alpha-helices packed together to form a nearly spherical structure with weak structural homology to a putative TetR family transcriptional repressor. The CdiI protein fits into the curved cavity of the CdiA-CTYkris toxin domain where it most likely neutralizes toxin activity by blocking access to RNA substrates. This domain is mostly found in gammaproteobacteria.	0
423697	cl39935	CdiI_EC536-like	inhibitor (or immunity protein) of the contact-dependent growth inhibition (CDI) system of Escherichia coli 536, and similar proteins. Contact-dependent growth inhibition (CDI) is a widespread mechanism of bacterial competition. CDI+ bacteria deliver the toxic C-terminal region of contact-dependent inhibition A proteins (CdiA-CT) into neighboring target bacteria and produce CDI immunity proteins (CdiI) which bind CdiA-CT domains and neutralize their toxic activity to protect against self-inhibition. CdiI immunity proteins are also variable and only neutralize their cognate CdiA-CT toxins. Structure analysis of CdiI from Escherichia coli 536 (EC536) shows that it is composed of a single domain and that it blocks the interaction with substrate, strongly suggesting that the immunity protein occludes the nuclease active site.	0
423705	cl39943	CdiI_Ecoli_Nm-like	inhibitor (or immunity protein) of the contact-dependent growth inhibition (CDI) system of Escherichia coli STEC_O31, Neisseria meningitidis MC58, and similar proteins. CdiI proteins, including the founding member from Escherichia coli strain STEC_O31, serve as immunity proteins for the toxic tRNA-cleaving ribonuclease toxin CdiA. The system confers contact-dependent inhibition (cdi) between different strains of bacteria.	0
423742	cl39980	Pallilysin	Pallilysin beta barrel domain. In contrast to pallilysin itself (a bifunctional adhesin and protease), members of the pallilysin-related adhesin family average twice the length, lack the HEXXH motif essential to pallilysin's metalloprotease activity, and are likely to function in virulence only as an adhesin. Typical members of this family include TDE0840 from Treponema denticola and BB0038 from Borrelia burgdorferi, which share less than 20% pairwise amino acid sequence identity.	0
423769	cl40010	NTD_TDP-43	N-terminal domain of transactive response DNA-binding protein 43. This domain can be found at the N-terminal region of transactive response DNA-binding protein 43 kDa (TDP-43), an RNA transporting and processing protein whose aberrant aggregates are implicated in neurodegenerative diseases. TDP-43 N-terminal domain has been shown to play an important role in the aggregation of TDP-43 monomers and its loss of function affects the RNA metabolic levels. Secondary structure of the N-terminal domain consists of six beta-strands and it resembles axin 1.	0
423820	cl40066	Heliorhodopsin	Heliorhodopsin. This HMM represents heliorhodopsins, a group of phylogenetically distinct microbial rhodopsins, which play an important role in absorbing and transferring light energy for numerous biological processes in bacteria. Heliorhodopsin was initially identified and characterized in a Gram-positive actinobacterium based on functional metagenomics and photochemical approaches. Heliorhodopsins have seven transmembrane domains and exhibit biological functions similar to those of other microbial rhodopsins; however, heliorhodopsins form a distinct cluster based on phylogenetic analyses. Most microbial rhodopsins are hit by the Pfam HMM PF01036, which does not hit heliorhodopsins.	0
423877	cl40125	IFTase	inulin fructotransferase. This region contains a right-handed parallel beta helix repeat unit found in Inulin fructotransferase. This Pfam entry includes sequences not found by pfam13229.	0
423902	cl40150	RnlA-toxin_C	RNase LS, bacterial toxin C terminal. RnaseLS-like HEPN.	0
424006	cl40259	Rimk_N	RimK PreATP-grasp domain. Members of this family are proteins of unknown function, regularly found in a conserved gene neighborhood that also includes two uncharacterized radical SAM proteins. The protein family is named for a founding member from the Salmonella enterica model strain LT2, although the system is rare in the Proteobacteria and relatively common in Streptomyces and related taxa.	0
424030	cl40283	SAVED	SMODS-associated and fused to various effectors sensor domain. The SAVED domain is predicted to function as a sensor domain, sensing nucleotides or nucleotide derivatives generated by SMODS and other nucleotide synthetase domains. The sensing of ligands by SAVED is predicted to activate effectors deployed by a class of conflict systems which are reliant on the production and sensing of the nucleotide second messengers.	0
424038	cl40291	SLATT_6	SMODS and SLOG-associating 2TM effector domain 6. The SLATT domain contains two transmembrane helices. SLATT domains are generally predicted to function as pore-forming effectors in a class of conflict systems which are reliant on the production of second messenger nucleotide or nucleotide derivatives. SLATT domains are predicted to initiate cell suicide responses upon their activation. This SLATT family associates with a SMODS nucleotide synthetase domain fused to the predicted AGS-C sensor domain. It is sometimes further coupled to R-M systems.	0
424041	cl40294	SLATT_4	SMODS and SLOG-associating 2TM effector domain family 4. The SLATT domain contains two transmembrane helices. SLATT domains are generally predicted to function as pore-forming effectors in a class of conflict systems which are reliant on the production of second messenger nucleotide or nucleotide derivatives. SLATT domains are predicted to initiate cell suicide responses upon their activation. This SLATT family is often coupled to the SMODS nucleotide synthetase and is sometimes further embedded in other conflict systems like CRISPR/Cas or R-M systems.	0
424056	cl40356	DUF5841	Family of unknown function (DUF5841). Members of this family have leader sequences like bacteriocins (see TIGR01847), but characterized examples function as signaling peptides that induce production of a nearby encoded bacteriocin, rather than as bacteriocins themselves. The founding member of this family is enterocin induction factor EntF.	0
394792	cl40422	M34_peptidase	Peptidase family M34 includes the C-terminal catalytic domain of anthrax lethal factor (ATLF), the protective antigen-binding domains of ATLF and edema factor, and Pro-Pro endopeptidase. This subfamily includes the C-terminal catalytic domain of anthrax toxin lethal factor (ATLF; EC 3.4.24.83). ATLF and edema factor are enzyme components of anthrax toxin and are carried into the cell by a third component, the protective antigen (PA). ATLF is secreted by Bacillus anthracis to promote disease virulence through disruption of host signaling pathways. ATLF belongs to peptidase family M34 and has the hallmark metalloprotease HEXXH motif, where the two His residues bind a single zinc atom, and the Glu has a catalytic role. ATLF is a highly selective protease whose major substrates are mitogen-activated protein kinase kinases (MKKs). MKKs are cleaved by ATLF near their N-termini, removing the docking sequence for the downstream cognate mitogen-activated protein kinase. Preferred amino acids around the cleavage site can be denoted BBBBxHxH, in which B denotes Arg or Lys, H denotes a hydrophobic amino acid, and x is any amino acid. At its N-terminus, ATLF has a related PABD domain which lacks the hallmark metalloprotease motif HEXXH. This subfamily also includes Bacillus thuringiensis Vip2Ac-like_2 which belongs to the Vip family of proteins that are secreted during the vegetative growth phase.	0
424065	cl40423	cupin_RmlC-like	RmlC-like cupin superfamily. Breaks down dimethylsulfoniopropionate (DMSP) into acrylate and dimethyl sulfide.	0
394794	cl40424	USP25_USP28_C-like	carboxyl-terminal domain of ubiquitin-specific protease 25 (USP25) and 28 (USP28), and similar domains. This family contains the C-terminal domain of ubiquitin-specific protease USP28, a deubiquitinase (DUB), which shares high similarity with USP25 but varies in cellular function; USP28 is known for its tumor-promoting role while USP25 is a regulator of the innate immune system and may play a role in tumorigenesis. USP28 stabilizes c-MYC and other nuclear proteins, and USP25 regulates inflammatory TRAF signaling. These two closely related DUBs contain an N-terminal domain harboring a Ub-associated domain (UBA) and two Ub-interacting motifs (UIMs), a central catalytic USP domain, and a C-terminal region of unknown function and variable size due to alternative splicing. In general, USP catalytic domains are around 350 amino acids in length; however, in USP25 and 28, the catalytic domains span around 550 amino acids due to a large, conserved insertion at a common insertion point called USP25/28 catalytic domain inserted domain (UCID). This C-terminal region has been implicated in substrate binding for USP28 and harbors the splicing site for isoform-specific sequences. Structure studies suggest that the C-terminal domain forms an independent entity.	0
394795	cl40425	C_NRPS-like	Condensation domain of nonribosomal peptide synthetases (NRPSs). Condensation (C) domains of nonribosomal peptide synthetases (NRPSs) catalyze peptide bond formation within (usually) large multi-modular enzymatic complexes. Hybrid PKS/NRPS create polymers containing both polyketide and amide linkages. C-domains typically have a conserved HHxxxD motif at the active site; mutations in this motif can abolish or diminish condensation activity. Members of this subfamily have the typical C-domain HHxxxD motif. PksJ is involved in some intermediate steps for the synthesis of the antibiotic polyketide bacillaene which is important in secondary metabolism. NRPS can use a large variety of acyl monomers (approximately 500 different possible monomer substrates as opposed to the 20 standard amino acids in ribosomal protein synthesis) to construct bioactive secondary metabolites of 2 to 18 units long (with various activities such as antibiotic, antifungal, antitumor and immunosuppression). There are various subtypes of C-domains such as the LCL-type which catalyzes peptide bond formation between two L-amino acids, the DCL-type which links an L-amino acid to the D-amino acid at the end of a growing peptide, starter C-domains which acylate the first amino acid with a beta-hydroxy carboxylic acid, and heterocyclization (Cyc) domains which catalyze both peptide bond formation and cyclization of Cys, Ser, or Thr residues. Typically, an NRPS module consists of an adenylation domain, a peptidyl carrier protein (PCP) domain (also known as thiolation (T) domain) and a C-domain. NRPS modules may also include specialized domains such as the terminal-module thioesterase (Te) domain that releases the product via hydrolysis or macrocyclization and any of various C-domain family members such as the epimerization (E) domain, the ester-bond forming C-domain, dual E/C (epimerization and condensation) domains, and the X-domain.	0
394796	cl40426	Cas13b	Class 2 type VI-B CRISPR-associated RNA-guided ribonuclease Cas13b. CRISPR-Cas (clustered regularly interspaced short palindromic repeats and CRISPR-associated proteins) adaptive immune systems defend microbes against foreign nucleic acids via RNA-guided endonucleases. These systems are divided into two classes: class 1 systems utilize multiple Cas proteins and CRISPR RNA (crRNA) to form an effector complex while class 2 systems employ a large, single effector with crRNA to mediate interference. Class 2 type VI CRISPR-Cas13 systems use a single enzyme to target RNA using a programmable crRNA guide and are divided into four subtypes based on the identity of the Cas13 protein (Cas13a-d). The Cas13 proteins are capable of both pre-crRNA processing and target RNA cleavage, which protect the host from phage attacks. Once bound to a target RNA, their non-specific RNase activity is activated. Cas13b has many distinctive features compared to the other Cas13 proteins, including the lack of significant sequence similarity, disparate crRNA repeat region, and double-sided protospacer flanking sequence (PFS)-dependent target RNA cleavage.	0
394797	cl40427	Tet_JBP	oxygenase domain of ten-eleven translocation (TET) enzymes, J-binding proteins (JBPs), and similar proteins. J binding protein (JBP) 1 and JBP2 catalyze the first step of base J biosynthesis: the hydroxylation of thymine in DNA to form 5-hydroxymethyluracil (hmU). Base J (beta-d-glucopyranosyloxymethyluracil) is a hyper-modified DNA base found in the DNA of kinetoplastids (Trypanosoma brucei, Trypanosoma cruzi, and Leishmania). JBP1 and JBP2 each contain a J-DNA binding domain and this oxygenase domain. They belong to the TET/JBP family of dioxygenases that require Fe2+ and alpha-ketoglutarate (also known as 2-oxoglutarate) for activity.	0
394798	cl40428	phospholamban_like	phospholamban, sarcolipin, and sarcolamban family of bioactive peptides. Invertebrate sarcolamban A (SCLA) belongs to a family of bioactive peptides which includes invertebrate sarcolamban B (SCLB), and vertebrate phospholamban (PLN) and sarcolipin (SLN). SCLA and SCLB are encoded within a single putative noncoding transcript, pncr003:2L; PLN and SLN are each encoded within a single exon of a spliced transcript. PLN is chiefly expressed in the cardiac muscle, while SLN is expressed in the atria of the heart and embryonic slow-type skeletal muscle; SCL is found in cardiac and somatic muscle of Drosophila melanogaster. PLN and SLN are each a single-pass transmembrane alpha-helix that interacts directly with the sarcoplasmic reticulum (SR) calcium pump (SERCA) and lowers its affinity for Ca2+, thereby decreasing the rate of Ca2+ reuptake into the SR from the sarcoplasm. In the heart, PLN and SLN inhibit the activity of SERCA2a isoform and function as important regulators of cardiac contractility and disease. SCLA and SCLB are each predicted to form a single-pass transmembrane helix, localize to the SR with the SR calcium pump (Ca-P60A), and dampen its activity.	0
394799	cl40429	Complex1_LYR_SF	LYR (leucine-tyrosine-arginine) motif found in Complex1_LYR-like superfamily. This group contains uncharacterized LYR motif-containing proteins belonging to the Complex1_LYR-like superfamily that consists of proteins of diverse functions that are exclusively found in eukaryotes and contain the conserved tripeptide 'LYR' close to the N-terminus.	0
394800	cl40430	Stannin_family	Stannin family includes vertebrate Stannin and insect Hemotin. Stannin (SNN) is a monotopic membrane protein containing an N-terminal single transmembrane helix that transverses the lipid bilayer, an unstructured linker which includes a conserved CXC metal-binding motif and a putative 14-3-3zeta binding site, and a C-terminal distorted cytoplasmic helix. It binds and antagonizes 14-3-3zeta and is required for endosomal maturation. It has also been identified as the specific marker for neuronal cell apoptosis induced by trimethyltin (TMT) intoxication. TMT is one of the most toxic organotin (or alkyltin) compounds, and is known to selectively inflict injury to specific regions of the brain.	0
394801	cl40431	PFM_aerolysin_family	pore-forming module of aerolysin-type beta-barrel pore-forming proteins. Members of this group belong to the aerolysin family of beta-pore-forming proteins (beta-PFPs). PFPs are generally secreted as water-soluble monomers, which upon binding to target lipid membranes, oligomerize and form transmembrane pores harmful to cells. Beta-PFPs form pores by transmembrane beta-barrels. Aerolysin-type beta-PFPs are believed to use an amphipathic beta-hairpin to form the beta-barrel, are found in all kingdoms of life and many are bacterial toxins. In addition to having a role in microbial infection, they have potential as biotechnological sensors and delivery systems. They share a similar monomeric architecture, with a variable membrane-binding domain and a structurally conserved pore-forming region. A significant portion of the monomeric subunit structure is re-organized to form the pore. Oligomers formed by members of the aerolysin family include: hepta- (aerolysin), octa- (Dln1), and nonameric oligomers (lysenin and monalysin).	0
394802	cl40432	SET	SET (Su(var)3-9, Enhancer-of-zeste, Trithorax) domain superfamily. This subfamily contains fission yeast Schizosaccharomyces pombe H3K9 methyltransferase Clr4 (also known as Suv39h), the sole homolog of the mammalian SUV39H1 and SUV39H2 enzymes, that has a critical role in preventing aberrant heterochromatin formation. It is known to di- and tri-methylate Lys-9 of histone H3, a central heterochromatic histone modification, with its specificity profile most similar to that of the human SUV39H2 homolog.	0
394803	cl40433	Fer2_BFD-like	[2Fe-2S]-binding domain of bacterioferritin-associated ferredoxin (BFD) and related proteins. The BFD-like [2Fe-2S]-binding domain is found in a variety of other proteins including bacterioferritin-associated ferredoxin (BFD), the large subunit of NADH-dependent nitrite reductase, and Cu+ chaperone CopZ. It comprises a helix-turn-helix fold, and binds an [2Fe-2S] cluster via 4 highly-conserved Cys residues, found in loops between the alpha-helices. For the class of proteins having a BFD-like [2Fe-2S]-binding domain, the Cys residues are organized in a unique C-X2-C-X31-35-C-X2-9-C-arrangement. [2Fe-2S] clusters are sulfide-linked diiron centers, a primary role for which is electron transport.	0
394804	cl40434	TGF_beta_SF	transforming growth factor beta (TGF-beta) like domain found in TGF-beta superfamily. The family includes INHBC and INHBE. INHBC, also termed activin beta-C chain, might play important roles in carcinogenesis. It may function as a negative regulator of liver growth. INHBE, also termed activin beta-E chain, is a possible insulin resistance-associated hepatokine with hepatic gene expression that positively correlated with insulin resistance and body mass index in humans. It also acts as a possible new marker for drug-induced endoplasmic reticulum stress.	0
424066	cl40435	CTD_KDM2A_2B-like	C-terminal domain found in lysine-specific demethylase KDM2A, KDM2B, and similar proteins. Lysine-specific demethylase 2B (KDM2B) is also called Ndy1, CXXC-type zinc finger protein 2, F-box and leucine-rich (LRR) repeat protein 10 (FBXL10), F-box protein FBL10, JmjC domain-containing histone demethylation protein 1B (Jhdm1b), Jumonji domain-containing EMSY-interactor methyltransferase motif protein (protein JEMMA), or [Histone-H3]-lysine-36 demethylase 1B. It is a ubiquitously expressed histone H3 lysine 4 (H3K4me2) or histone H3 lysine 36 (H3K36me2) demethylase that functions as a regulator of chemokine expression, cellular morphology, and the metabolome of fibroblasts. It regulates the differentiation of Mesenchymal Stem Cells (MSCs) and has been implicated in cell cycle regulation by de-repressing cyclin-dependent kinase inhibitor 2B (CDKN2B or p15INK4B). It also plays a role in recruiting polycomb repressive complex 1 (PRC1) to CpG islands (CGIs) of developmental genes and regulates lysine 119 monoubiquitylation on H2A (H2AK119ub1) in embryonic stem cells (ESCs). KDM2B also acts as an oncogene that plays a critical role in leukemia development and maintenance. It consists of two Jumonji domains (JmjN and JmjC), a CXXC zinc-finger domain, a plant homeodomain (PHD) finger, an F-box domain, followed by an antagonist of mitotic exit network protein 1 (AMN1) domain. This model corresponds to a small conserved region in KDM2B between the JmjC domain and the CXXC zinc-finger domain, which has been called the C-terminal domain by literature.	0
424067	cl40436	IPD_PPP1R12	inhibitory phosphorylation domain of protein phosphatase 1 regulatory subunit 12 (PPP1R12) family. Protein phosphatase 1 regulatory subunit 12A-like (PPP1R12A-like) is a homolog of MYPT1, also called protein phosphatase 1 regulatory subunit 12A (PPP1R12A), myosin phosphatase target subunit 1, or protein phosphatase myosin-binding subunit. MYPT1 is the targeting subunit of smooth-muscle myosin phosphatase. It is a substrate for the asparaginyl hydroxylase factor inhibiting hypoxia-inducible factor (FIH). MYPT1 acts as a key regulator of protein phosphatase 1C (PPP1C). It mediates binding to myosin. As part of the PPP1C complex, MYPT1 is involved in dephosphorylation of the mitosis regulator polo-like kinase 1 (PLK1). It is capable of inhibiting HIF1A inhibitor (HIF1AN)-dependent suppression of HIF1A activity. This model corresponds to the inhibitory phosphorylation domain of PPP1R12A-like protein.	0
424068	cl40437	TRPV	Transient Receptor Potential channel, Vanilloid subfamily (TRPV). TRPV2 is closely related to TRPV1, sharing high sequence identity (>50%), but TRPV2 shows a higher temperature threshold and sensitivity for activation than TRPV1. TRPV2 can be stimulated by ligands or lipids, and is involved in osmosensation and mechanosensation. TRPV2 is expressed in both neuronal and non-neuronal tissues, and it has been implicated in diverse physiological and pathophysiological processes, including cardiac-structure maintenance, innate immunity, and cancer. TRPV2 belongs to the vanilloid TRP subfamily (TRPV), named after the founding member vanilloid receptor 1 (TRPV1). The structure of TRPV shows the typical topology features of all Transient Receptor Potential (TRP) ion channel family members, such as six transmembrane regions, a short hydrophobic stretch between transmembrane segments 5 and 6 and large intracellular N- and C-terminal domains.	0
424069	cl40438	cc_LAMB_C	C-terminal coiled-coil domain found in the laminin subunit beta (LAMB) family. LAMB3 is also called epiligrin subunit beta, kalinin B1 chain, kalinin subunit beta, laminin B1k chain, laminin-5 subunit beta, or nicein subunit beta. It is a major component of the basement membrane in most adult tissues. Mutations in LAMB3 are associated with Herlitz junctional epidermolysis bullosa (H-JEB), a severe autosomal recessive disorder characterized by blister formation within the dermal-epidermal basement membrane. LAMB3 is a component of laminin, a complex glycoprotein consisting of three different polypeptide chains (alpha, beta, gamma). Binding to cells via a high affinity receptor, laminin is thought to mediate the attachment, migration, and organization of cells into tissues during embryonic development by interacting with other extracellular matrix components. This model corresponds to the C-terminal coiled-coil domain of LAMB3, which may be involved in the integrin binding activity.	0
424070	cl40439	CoV_Spike_S1-S2_S2	S1/S2 cleavage region and the S2 fusion subunit of coronavirus spike (S) proteins. This group contains the SD-1 and SD-2 subdomains of the S1 subunit C-terminal domain (C-domain), the S1/S2 cleavage region, and the S2 fusion subunit of the spike (S) glycoprotein from betacoronaviruses in the nobecovirus subgenus (D lineage), including Rousettus bat coronavirus HKU9 (Ro-BatCoV HKU9). The CoV S protein is an envelope glycoprotein that plays a very important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediate attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains the coronavirus fusion machinery and is primarily alpha-helical. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-domain. The S1 C-domain also contains two subdomains (SD-1 and SD-2), which connect the S1 and S2 subunits. Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs, including SARS-CoV-2, SARS-CoV and MERS-CoV use the C-domain to bind their receptors. The S2 subunit comprises the fusion peptide (FP), a second proteolytic site (S2'), followed by an internal fusion peptide (IFP) and two heptad-repeat domains (HR1 and HR2) preceding the transmembrane domain (TM). After binding of the S1 subunit RBD on the virion to its receptor on the target cell, the HR1 and HR2 domains interact with each other to form a six-helix bundle (6-HB) fusion core, bringing viral and cellular membranes into close proximity for fusion and infection. In order to catalyze the membrane fusion reaction, CoV S needs to be primed through cleavage at the S1/S2 and S2' sites. In the case of human-infecting coronaviruses such as SARS-CoV-2, HCoV-OC43, MERS-CoV, and HCoV-HKU1, the spike protein contains an insertion of (R/K)-(2X)n-(R/K) (furin cleavage motif) at the S1/S2 site, which is absent in SARS-CoV and other SARS-related coronaviruses, as well as Ro-BatCoV HKU9. The region modeled in this cd (SD-1 and SD-2, the S1/S2 cleavage region, and the S2 fusion subunit) plays an essential role in viral entry by initiating fusion of the viral and cellular membranes.	0
424071	cl40440	PDDEXK_nuclease-like	PDDEXK family nucleases. This model characterizes a diverse set of poorly characterized nucleases such as Escherichia coli YaeQ. They belong to a superfamily of nucleases including very short patch repair (Vsr) endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI.	0
424072	cl40441	5TM_YidC_Oxa1_Alb3	Five transmembrane core domain of YidC/Oxa1/Alb3 protein family of insertases. This group is composed of the bacterial and chloroplastic members of the YidC/Oxa1/Alb3 protein family of insertases, including bacterial YidC, and chloroplastic ALBINO3 (Alb3) and Alb3-like proteins such as ALBINO3-like protein 1 (also called Alb4). Membrane protein insertase YidC, also called foldase YidC or membrane integrase YidC, facilitates proper folding, insertion, and assembly of inner membrane proteins and complexes. Depending on the nature of the substrate, YidC functions in a Sec-independent (YidC only) or a Sec-dependent manner as part of a complex containing YidC, the SecYEG channel, and SecDFYajC. YidC from Gram-negative bacteria contains an extra transmembrane segment (TM1) at the N-terminus and a large periplasmic domain, located between TM1 and TM2, that adopts a beta-super sandwich fold that is found in sugar-binding proteins such as galactose mutarotase. Alb3 and Alb3-like proteins are required for the post-translational insertion of the light-harvesting chlorophyll-binding proteins (LHCPs) into the chloroplast thylakoid membrane. Alb3 acts independently and may also function cooperatively with the thylakoid cpSecYE translocase to insert proteins co-translationally into the thylakoid membrane, similar to bacterial YidC that can function with the SecYEG translocase. YidC/Oxa1/Alb3 family insertases contain a core domain of five transmembrane (5TM) segments that is essential to insertase function.	0
424073	cl40442	Peptidase_C80	peptidase C80 family. This peptidase C80 family includes the cysteine-binding domain (CPD) of several large, repetitive bacterial exoproteins involved in heme utilization or adhesion and many typically having CPD repeats as well as regions rich in repeats. Many members of this family have been designated adhesins or filamentous haemagglutinins. The CPD contains the characteristic Asp/Cys/His residues found in Clostridium toxin B active site.	0
424075	cl40444	DBD_XPA-like	DNA-binding domain found in DNA repair protein complementing XP-A cells (XPA), yeast DNA repair protein RAD14 and similar proteins. ZNT9, also known as solute carrier family 30 member 9 (SLC30A9), may act as a zinc transporter involved in intracellular zinc homeostasis and may also play a role as nuclear receptor coactivator.	0
424078	cl40447	cyt_P460_fam	Cytochrome P460 family. Cytochrome (cyt) P460 is a small soluble periplasmic protein that binds the c-type heme cofactor, heme P460, named for its characteristic ferrous Soret peak maximum at 460 nm, which has the distinction of being the only known heme in biology to withdraw electrons from an iron coordinated substrate. M. capsulatus (Bath) cytochrome P460, encoded by the cytL gene, catalyzes the oxidation of hydroxylamine (NH2OH) to form nitrous oxide (N2O) under anaerobic conditions. Similar to Nitrosomonas europaea cyt P460, it is defined by an unusual porphyrin (heme)-lysine cross link. This subfamily belongs to a family, called the cytochrome P460 family, of small mono-heme c-type cytochromes that are predominantly of beta-sheet structure, as opposed to the four elongate, tightly-packed alpha-helices of the widely distributed cytochromes c' of photoheterotrophic and denitrifying bacteria.	0
424079	cl40448	NURR	NURR (N-terminal unit for RNA recognition) domain. hnRNPR is a highly conserved RNA-binding protein that belongs to the heterogeneous nuclear ribonucleoprotein (hnRNP) family. hnRNP plays an important role in processing of precursor mRNA in the nucleus. hnRNPR acts as a general positive regulator of MHC class I expression. The model corresponds to the NURR domain of hnRNPR.	0
424080	cl40449	DHD_Ski_Sno_Dac	Dachshund-homology domain found in the Ski/Sno/Dac family of transcriptional regulators. Ski-like protein, also known as SKIL, Ski-related oncogene (Sno), or Ski-related protein, is the ski proto-oncogene homolog. It may have regulatory roles in cell division or differentiation in response to extracellular signals. The Dachshund-homology domain (DHD), also known as the N-terminal Ski/Sno/Dac domain, adopts a mixed alpha/beta structure containing a helix-turn-helix motif, similar to features found in the forkhead/winged-helix family of DNA binding proteins. It contains a conserved CLPQ motif and can bind co-factors. Its structure suggests that it may also bind DNA.	0
424081	cl40450	KLF8_12_N	N-terminal domain of Kruppel-like factor (KLF) 8, KLF12, and similar proteins. Kruppel-like factor 12 (also known as Krueppel-like transcription factor 12, KLF12) regulates, by transcriptionally repressing Nur77 expression, endometrial decidualization, which is a prerequisite for successful implantation and the establishment of pregnancy. It is involved in the maturation processes of kidney collecting ducts after birth, and is able to increase the promoter activity of the UT-A1 urea transporter promoter by binding to the CACCC motif. KLF12 has also been found to promote colorectal cancer growth and is also involved in the invasion and apoptosis of basal-like breast carcinoma. KLF12 belongs to a family of proteins called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Although these factors bind to similar elements in vitro, they have distinct activities in vivo depending on their expression profile and the sequence of the N-terminal activation/repression domain, which differ between members. KLF12 contains an N-terminal domain that is related to the N-terminal repression domain of KLF8.	0
424082	cl40451	cwf21	cwf21 domain. Serine/arginine repetitive matrix protein 3 (SRRM3) may play a role in regulating breast cancer cell invasiveness. It may also be involved in RYBP-mediated breast cancer progression. SRRM3 contains a cwf21 domain at the N-terminus. The cwf21 domain is involved in mRNA splicing; it binds directly to the spliceosomal protein Prp8.	0
424083	cl40452	SNARE_NTD_STX6-like	N-terminal domain of syntaxin-6 and similar proteins. Syntaxin-6 (STX6) is a component of a soluble NSF attachment protein receptor (SNARE) complex involved in intracellular vesicle trafficking and in the fusion of retrograde transport carriers with the trans-Golgi network (TGN). This model corresponds to the N-terminal domain of STX6, which is a regulatory domain named Habc.	0
424084	cl40453	FXYD	phenylalanine-X-tyrosine-aspartate (FXYD) family. The FXYD domain-containing ion transport regulator 12 (FXYD12) mRNA is mainly distributed in kidneys and intestines of fish. In co-immunoprecipitation experiments, FXYD12 was shown to associate with the Na(+)/K(+)-ATPase (NKA) alpha-subunit in the intestines of two closely related medakas, Oryzias dancena and O. latipes. These results suggest that FXYD12 may play a role in modulating NKA activity in the intestines following salinity changes in the maintenance of internal homeostasis.	0
424085	cl40454	CYCLIN_SF	Cyclin box fold superfamily. CCNO is specifically required for generation of multiciliated cells, possibly by promoting a cell cycle state compatible with centriole amplification and maturation. It acts downstream of MCIDAS (multiciliate differentiation and DNA synthesis associated cell cycle protein) to promote mother centriole amplification and maturation in preparation for apical docking. CCNO is involved in the activation of cyclin-dependent kinase 2. CCNO contains two cyclin boxes. This model corresponds to the second one. The cyclin box is a protein binding domain.	0
424087	cl40456	TM_Y_CoV_Nsp3_C	C-terminus of coronavirus non-structural protein 3, including transmembrane and Y domains. This model represents the C-terminus of non-structural protein 3 (Nsp3) from betacoronavirus in the sarbecovirus subgenus (B lineage), including highly pathogenic human coronaviruses such as Severe acute respiratory syndrome-related coronavirus (SARS-CoV) and SARS-CoV2 (also called 2019 novel CoV or 2019-nCoV). This conserved C-terminus includes two transmembrane (TM) regions TM1 and TM2, an ectodomain (3Ecto) between the TM1 and TM2 that is glycosylated and located on the lumenal side of the ER, an amphipathic region (AH1) that is not membrane-spanning, and a large Y domain of approximately 370 residues. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. In SARS-CoV and the related murine hepatitis virus (MHV), the TM1, 3Ecto and TM2 domains are important for the papain-like protease (PL2pro) domain to process Nsp3-Nsp4 cleavage. It has also been shown that the interaction of 3Ecto with the lumenal loop of Nsp4 is essential for ER rearrangements in cells infected with SARS-CoV or MHV. The Y domain, located at the cytosolic side of the ER, consists of the Y1 and CoV-Y subdomains, which are conserved in nidovirus and coronavirus, respectively. Functional information about the Y domain is limited; it has been shown that Nsp3 binding to Nsp4 is less efficient without the Y domain.	0
424088	cl40457	CoV_PLPro	Coronavirus (CoV) papain-like protease (PLPro). This model represents the papain-like protease (PLPro) found in the non-structural protein 3 (Nsp3) region of deltacoronavirus, including Porcine deltacoronavirus, Bulbul coronavirus HKU11, and Common moorhen coronavirus HKU21. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. PLPro is a key enzyme in this process, making it a high value target for the development of anti-coronavirus therapeutics. PLPro, which belongs to the MEROPS peptidase C16 family, participates in the proteolytic processing of the N-terminal region of the replicase polyprotein; it can cleave Nsp1|Nsp2, Nsp2|Nsp3, and Nsp3|Nsp4 sites and its activity is dependent on zinc. Besides cleaving the polyproteins, PLPro also possesses a related enzymatic activity to promote virus replication: deubiquitinating (DUB) and de-ISGylating activities. Both ubiquitin (Ub) and Ub-like interferon-stimulated gene product 15 (ISG15) are involved in preventing viral infection; coronaviruses utilize Ubl-conjugating pathways to counter the pro-inflammatory properties of Ubl-conjugated host proteins via the action of PLPro, which processes both 'Lys-48'- and 'Lys-63'-linked polyubiquitin chains from cellular substrates. The Nsp3 PLPro domain in many of these CoVs has also been shown to antagonize host innate immune induction of type I interferon by interacting with IRF3 and blocking its activation.	0
424089	cl40458	betaCoV_Nsp3_NAB	nucleic acid binding domain of betacoronavirus non-structural protein 3. This model represents the nucleic acid binding (NAB) domain of non-structural protein 3 (Nsp3) from betacoronavirus in the nobecovirus subgenus (D lineage), including Rousettus bat coronavirus HKU9. The NAB domain represents a new fold, with a parallel four-strand beta-sheet holding two alpha-helices of three and four turns that are oriented antiparallel to the beta-strands. NAB is a cytoplasmic domain located between the papain-like protease (PLPro) and betacoronavirus-specific marker (betaSM) domains of CoV Nsp3. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. The NAB domain both binds ssRNA and unwinds dsDNA. It prefers to bind ssRNA containing repeats of three consecutive guanines. A group of residues that form a positively charged patch on the protein surface of SARS-CoV Nsp3 NAB serves as the binding site of nucleic acids. This site is conserved in the NAB of Nsp3 from betacoronavirus in the sarbecovirus subgenus (B lineage), but is not conserved in the Nsp3 NAB from betacoronaviruses in the D lineage.	0
424090	cl40459	CoV_Nsp9	coronavirus non-structural protein 9. This model represents the non-structural protein 9 (Nsp9) from deltacoronaviruses such as the Porcine delta coronavirus (PDCoV) Porcine coronavirus HKU15. CoVs utilize a multi-subunit replication/transcription machinery assembled from a set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins. All of these Nsps, except for Nsp1 and Nsp2, are considered essential for transcription, replication, and translation of the viral RNA. Nsp9, with Nsp7, Nsp8, and Nsp10, localizes within the replication complex. Nsp9 is an essential single-stranded RNA-binding protein for coronavirus replication; it shares structural similarity to the oligosaccharide-binding (OB) fold, which is characteristic of proteins that bind to ssDNA or ssRNA. Nsp9 requires dimerization for binding and orienting RNA for subsequent use by the replicase machinery. CoV Nsp9s have diverse forms of dimerization that promote their biological function, which may help elucidate the mechanism underlying CoV replication and contribute to the development of antiviral drugs. Generally, dimers are formed via interaction of the parallel alpha-helices containing the protein-protein interaction motif GXXXG; additionally, the N-finger region may also play a critical role in dimerization as seen in porcine delta coronavirus (PDCoV) Nsp9. As a member of the replication complex, Nsp9 may not have a specific RNA-binding sequence but may act in conjunction with other Nsps as a processivity factor, as shown by mutation studies indicating that Nsp9 is a key ingredient that intimately engages other proteins in the replicase complex to mediate efficient virus transcription and replication.	0
424091	cl40460	CoV_Nsp10	coronavirus non-structural protein 10. This model represents the non-structural protein 10 (Nsp10) of deltacoronaviruses, including Thrush coronavirus HKU12-600 and Wigeon coronavirus HKU20. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Upon processing of the Nsp7-10 region by protease M (Mpro), the released four small proteins Nsp7, Nsp8, Nsp9, and Nsp10 form functional complexes with CoV core enzymes and thereby stimulate replication. Coronaviruses cap their mRNAs; RNA cap methylation may involve at least three proteins: Nsp10, Nsp14, and Nsp16. Nsp10 serves as a cofactor for both Nsp14 and Nsp16. Nsp14 consists of 2 domains with different enzymatic activities: an N-terminal ExoN domain and a C-terminal cap (guanine-N7) methyltransferase (N7-MTase) domain. The association of Nsp10 with Nsp14 enhances Nsp14's exoribonuclease (ExoN) activity, and not its N7-MTase activity. ExoN is important for proofreading and therefore, the prevention of lethal mutations. The Nsp10/Nsp14 complex hydrolyzes double-stranded RNA in a 3' to 5' direction as well as a single mismatched nucleotide at the 3'-end, mimicking an erroneous replication product, and may function in a replicative mismatch repair mechanism. Nsp16 Cap-0 specific (nucleoside-2'-O-)-methyltransferase (2'OMTase) acts sequentially to Nsp14 MTase in RNA capping methylation and methylates the RNA cap at the ribose 2'-O position; it catalyzes the conversion of the cap-0 structure on m7GpppA-RNA to a cap-1 structure. The association of Nsp10 with Nsp16 enhances Nsp16's 2'OMTase activity, possibly through enhanced RNA binding affinity. Additionally, transmissible gastroenteritis virus (TGEV) Nsp10, Nsp16 and their complex can interact with DLL4, which normally binds to Notch receptors; this interaction may disturb Notch signaling. Nsp10 also binds 2 zinc ions with high affinity.	0
424092	cl40461	ZIP_TSC22D-like	leucine zipper found in the TSC22 domain leucine zipper transcription factors, c-Myc-binding protein, and similar proteins. TSC22 domain family protein 4 (TSC22D4), also called TSC22-related-inducible leucine zipper protein 2 (TILZ2), or Tsc-22-like protein THG-1, is a transcriptional repressor that acts as a molecular determinant of insulin signalling and glucose handling. It also functions in hepatic lipid handling by regulating hepatic very-low-density-lipoprotein (VLDL) release and lipogenic gene expression. This model corresponds to the conserved leucine zipper (ZIP) domain located at the C-terminus of TSC22D4. Its first helix is not basic and does not contain the consensus sequence, NXX(A)(A)XX(C/S)R, found in most basic region/leucine zipper (bZIP) proteins. Thus, the DNA-binding capability of the ZIP domain is not obvious. Similar to bZIP, ZIP forms homo- and heterodimers, resulting in many dimers that may have different effects on transcription.	0
424093	cl40462	CoV_Nsp8	Coronavirus non-structural protein 8. This model represents the non-structural protein 8 (Nsp8) region of deltacoronaviruses that include White-eye coronavirus HKU16 and Quail coronavirus UAE-HKU30, among others. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Upon processing of the Nsp7-10 region by protease M (Mpro), the released four small proteins Nsp7, Nsp8, Nsp9, and Nsp10 form functional complexes with CoV core enzymes and thereby stimulate replication. Most importantly, a complex of Nsp8 with Nsp7 has been shown to activate and confer processivity to the RNA-synthesizing activity of Nsp12, the RNA-dependent RNA-polymerase (RdRp); in SARS-CoV, point mutations in the genes encoding Nsp8 and Nsp7 have been shown to delay virus growth. Nsp8 and Nsp7 cooperate in activating the primer-dependent activity of the Nsp12 RdRp such that the level of their association may constitute a limiting factor for obtaining a high RNA polymerase activity. The subsequent Nsp7/Nsp8/Nsp12 polymerase complex is then able to associate with an active bifunctional Nsp14, which includes N-terminal 3' to 5' exoribonuclease (ExoN) and C-terminal N7-guanine cap methyltransferase (N7-MTase) activities, thus representing a unique coronavirus Nsp assembly that incorporates RdRp, exoribonuclease, and N7-MTase activities. Interaction of Nsp8 with Nsp7 appears to be conserved across the coronavirus family, making these proteins interesting drug targets. Nsp8 has a novel 'golf-club' fold composed of an N-terminal 'shaft' domain and a C-terminal 'head' domain. The shaft domain contains three helices, one of which is very long, while the head domain contains another three helices and seven beta-strands, forming an alpha/beta fold. SARS-CoV Nsp8 forms an 8:8 hexadecameric supercomplex with Nsp7 that adopts a hollow cylinder-like structure with a large central channel and positive electrostatic properties in the cylinder, while Feline infectious peritonitis virus Nsp8 forms a 1:2 heterotrimer with Nsp7. Regardless of their oligomeric structure, the Nsp7/Nsp8 complex functions as a noncanonical RNA polymerase capable of synthesizing RNA of up to the template length.	0
424094	cl40463	NAC	NAC domain. Basal transcription factor 3 (BTF3) plays an important role in the transcriptional regulation linked to growth and development in eukaryotes. In mammals, the BTF3 gene encodes two alternative splicing isoforms, BTF3a and BTF3b. The full length BTF3a protein excites transcription. The shortened BTF3b, which lacks the first 44 amino-terminal extension, is a component of the nascent polypeptide-associated complex (NAC), involved in regulating protein localization during translation. BTF3 is involved in oncogenesis; overexpression of BTF3 has been shown to be associated with a variety of malignancies such as cancer of the colon, pancreas, stomach, prostate and breast. It is upregulated in hypopharyngeal squamous cell carcinoma (HSCC) tumors correlating with lymph node metastasis and tumor promotion, thus indicating that BTF3 is a potential therapeutic target and prognostic biomarker for HSCC. BTF3 has also been implicated in the pathogenesis of osteosarcoma (OS), a malignant cancer that affects rapidly proliferating bones, and has a poor prognosis.	0
424095	cl40464	CoV_Nsp14	nonstructural protein 14 of coronavirus. Nonstructural protein 14 (Nsp14) of coronavirus (CoV) plays an important role in viral replication and transcription. It consists of 2 domains with different enzymatic activities: an N-terminal exoribonuclease (ExoN) domain and a C-terminal cap (guanine-N7) methyltransferase (N7-MTase) domain. ExoN is important for proofreading and therefore, the prevention of lethal mutations. The association of Nsp14 with Nsp10 stimulates its ExoN activity; the complex hydrolyzes double-stranded RNA in a 3' to 5' direction as well as a single mismatched nucleotide at the 3'-end mimicking an erroneous replication product. The Nsp10/Nsp14 complex may function in a replicative mismatch repair mechanism. N7-MTase functions in mRNA capping. Nsp14 can methylate GTP, dGTP as well as cap analogs GpppG, GpppA and m7GpppG. The accumulation of m7GTP or Nsp14 has been found to interfere with protein translation of cellular mRNAs.	0
424096	cl40465	CoV_Spike_S1_NTD	N-terminal domain of the S1 subunit of coronavirus Spike (S) proteins. The N-terminal domain of the coronavirus spike glycoprotein functions as a receptor binding domain. It binds carcinoembryonic antigen-related cell adhesion molecule 1.	0
424097	cl40466	ORF8-Ig_SARS-CoV-2-like	SARS-CoV-2 ORF8 immunoglobulin (Ig) domain protein and related proteins. This subfamily includes the ORF8 immunoglobulin (Ig) domain proteins of Bat SARS coronavirus HKU3-1 and Bat SARS-like coronavirus Rs3367, which have been classified previously as type III ORF8's. They belong to a family which includes the ORF8 immunoglobulin (Ig) domain protein of Severe acute respiratory syndrome (SARS) coronavirus 2 (SARS-CoV-2, also known as a 2019 novel coronavirus, 2019-nCoV) and other related Sarbecovirus ORF8's, such as bat coronavirus Rf1 (Bat SARS CoV Rf1) ORF8 which has been classified previously as a type II ORF8. SARS-CoV-2 causes the disease called "coronavirus disease 2019" (COVID-19). SARS-CoV-2 ORF8 protein (also known as ns8 and accessory protein 8) is a fast-evolving protein in SARS-related CoVs, and a potential pathogenicity factor which evolves rapidly to counter the immune response and facilitate the transmission between hosts (DOI:10.1101/2020.03.04.977736).	0
424098	cl40467	embe-merbe_CoV_ORF8b_protein-I-like	MERS-CoV ORF8b, BECV protein I, and related Embecovirus and Merbecovirus proteins. This subfamily includes protein I (also known as accessory protein N2) from bovine enteritic coronavirus-F15 strain (BECV-F15) and related Embecoviruses (A lineage) including murine hepatitis virus. The gene encoding protein I is included in the N gene as an alternative ORF. Protein I appears to have no homologous proteins in Sarbecovirus lineage B, which includes Severe acute respiratory syndrome (SARS) coronavirus (SARS-CoV) and SARS-CoV-2 (2019 novel coronavirus, 2019-nCoV). There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and replicase/protease polyproteins (ORF1ab); all are required to produce a structurally complete viral particle. In addition, CoV genomes also contain ORFs coding for accessory proteins that are specific for certain CoV lineages or for a particular CoV. In general, CoV accessory proteins are considered dispensable for viral replication; however, several accessory proteins have been shown to exhibit functions in virus-host interactions during CoV infection. BECV-F15 protein I is not essential for viral replication. It is related to the ORF8b accessory protein of Middle East respiratory syndrome-related coronavirus (MERS-CoV) and other related merbecoviruses (C lineage); the gene encoding ORF8b is an internal ORF that is overlapped by the N (nucleocapsid) protein gene (ORF8a).	0
424099	cl40468	ORF7a_SARS-CoV-like	Severe Acute Respiratory Syndrome coronavirus (SARS-CoV) structural accessory protein ORF7a and similar proteins from related betacoronaviruses in the subgenera Sarbecovirus (B lineage). The structure of the coronavirus X4 protein (also known as 7a and U122) shows similarities to the immunoglobulin-like fold and suggests a binding activity to integrin I domains. In SARS-CoV-infected cells, the X4 protein is expressed and retained intra-cellularly within the Golgi network. X4 has been implicated to function during the replication cycle of SARS-CoV.	0
424100	cl40469	NTD_cv_Nsp15-like	N-terminal domain of coronavirus Nonstructural protein 15 (Nsp15) and related proteins. This is the N-terminal domain of the coronavirus nonstructural protein 15 (NSP15), which is encoded by ORF1a/1ab and proteolytically released from the pp1a/1ab polyprotein. NSP15 is a nidoviral RNA uridylate-specific endoribonuclease (NendoU) carrying a C-terminal catalytic domain belonging to the EndoU family. The SARS-CoV-2 NendoU monomers assemble into a double-ring hexamer, generated by a dimer of trimers. The hexamer is stabilized by the interactions of the N-terminal oligomerization domain.	0
424101	cl40470	CoV_RdRp	coronavirus RNA-dependent RNA polymerase, also known as non-structural protein 12: responsible for replication and transcription of the viral RNA genome. This group contains the RNA-dependent RNA polymerase (RdRp) of bat coronavirus HKU9 and similar proteins from betacoronaviruses in the nobecovirus subgenera (D lineage). CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. A key component, the RNA-dependent RNA polymerase (RdRp, also known as Nsp12), catalyzes the synthesis of viral RNA and thus plays a central role in the replication and transcription cycle of CoV, possibly interacting with its co-factors, Nsp7 and Nsp8. RdRp is therefore considered a primary target for nucleotide analog antiviral inhibitors such as remdesivir. Nsp12 contains a RdRp domain as well as a large N-terminal extension that adopts a nidovirus RdRp-associated nucleotidyltransferase (NiRAN) architecture. The RdRp domain displays a right hand with three functional subdomains, called fingers, palm, and thumb. All RdRps contain conserved polymerase motifs (A-G), located in the palm (A-E motifs) and finger (F-G) subdomains. All these motifs have been implicated in RdRp fidelity such as processes of correct incorporation and reorganization of nucleotides.	0
424102	cl40471	CoV_Nsp5_Mpro	coronavirus non-structural protein 5, also called Main protease (Mpro). This subfamily contains the coronavirus (CoV) non-structural protein 5 (Nsp5) also called the Main protease (Mpro), or 3C-like protease (3CLpro), found in deltacoronaviruses. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Mpro/Nsp5 is a key enzyme in this process, making it a high value target for the development of anti-coronavirus therapeutics. These enzymes belong to the MEROPS peptidase C30 family, where the active site residues His and Cys form a catalytic dyad. The structures of Mpro/Nsp5 consist of three domains with the first two containing anti-parallel beta barrels and the third consisting of an arrangement of alpha-helices. The catalytic residues are found in a cleft between the first two domains. Mpro/Nsp5 requires a Gln residue in the P1 position of the substrate and space for only small amino-acid residues such as Gly, Ala, or Ser in the P1' position; since there is no known human protease with a specificity for Gln at the cleavage site of the substrate, these viral proteases are suitable targets for the development of antiviral drugs.	0
424104	cl40473	cv-alpha_beta_Nsp2-like	alpha- and betacoronavirus non-structural protein 2 (Nsp2), similar to SARS-CoV Nsp2 and HCoV-229E Nsp2, and related proteins. Coronavirus non-structural proteins (Nsps) are encoded in ORF1a and ORF1b. Post infection, the genomic RNA is released into the cytoplasm of the cell and translated into two long polyproteins (pp), pp1a and pp1ab, which are then autoproteolytically cleaved by two viral proteases Nsp3 and Nsp5 into smaller subunits. Nsp2 is one of these subunits. This subgroup includes Nsp2 from Murine hepatitis virus (MHV) and betacoronaviruses in the embecovirus subgenera (A lineage). It belongs to a family which includes Severe acute respiratory syndrome coronavirus (SARS-CoV) Nsp2. The functions of Nsp2 remain unclear. SARS-CoV Nsp2, rather than playing a role in viral replication, may be involved in altering the host cell environment; deletion of Nsp2 from the SARS-CoV genome results in only a modest reduction in viral titers, and it has been shown to interact with two host proteins, prohibitin 1 (PHB1) and PHB2 which have been implicated in cellular functions, including cell-cycle progression, cell migration, cellular differentiation, apoptosis, and mitochondrial biogenesis. MHV Nsp2, also known as p65, different from SARS-CoV Nsp2, may play an important role in the viral life cycle.	0
424105	cl40474	CoV_E	Coronavirus Envelope (E) small membrane protein. This group contains the Envelope (E) small membrane protein of Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), also known as 2019 novel coronavirus (2019-nCoV) or COVID-19 virus. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the Orf1ab (a large polyprotein known as replicase/protease); all are required to produce a structurally complete viral particle. The E protein is a small polypeptide (76-109 amino acids) that contains a single alpha-helical transmembrane domain. It plays a central role in virus morphogenesis and assembly. It acts as a viroporin and self-assembles in host membranes forming homopentameric protein-lipid pores that allow ion transport with poor selectivity. For some CoVs, such as mouse hepatitis virus (MHV) and SARS-CoV, deletion of the E gene did not completely abolish replication, but the virions were severely disabled from infecting new host cells with significantly reduced viral titers. In animal models, SARS-CoV lacking the E gene also showed significantly attenuated viral titers, likely due to its deficiency in suppressing host stress response and apoptosis induction. Moreover, the PDZ-binding motif (PBM) at the C-terminus of SARS-CoV E protein was shown to interact with a host PDZ protein called syntenin and lead to its relocation from nucleus to cytoplasm during SARS-CoV infection, thereby activating p38 kinase to induce the overexpression of inflammatory cytokines. Thus, the E protein is involved in both viral replication and pathogenesis during CoV infection.	0
424106	cl40475	CoV_M	coronavirus Membrane (or Matrix) protein. This subfamily contains the Membrane (M) protein of deltacoronaviruses including porcine deltacoronavirus and Bulbul coronavirus HKU11. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the Orf1ab (a large polyprotein known as replicase/protease); all are required to produce a structurally complete viral particle. The M protein, a triple-spanning membrane protein, is the most abundant protein in the virion. It plays a central role in virion assembly and morphogenesis, and it defines the shape of the viral envelope. It is regarded as the central organizer of CoV assembly, interacting with all other major coronaviral structural proteins and turning cellular membranes into workshops where virus and host factors come together to make new virus particles. While homotypic interactions between the M proteins are the major driving force behind virion envelope formation, it needs to interact with other coronaviral structural proteins for complete virion formation. The interaction of the Spike protein with M is not required for the assembly process. However, binding of M to N protein stabilizes the nucleocapsid (N protein-RNA complex), as well as the internal core of virions, and thereby promotes completion of viral assembly. Thus, the M protein, and its interactions with other structural proteins, is necessary for the production and release of virus-like particles.	0
424108	cl40477	CoV_Nsp6	coronavirus non-structural protein 6. Coronaviruses (CoV) redirect and rearrange host cell membranes as part of the viral genome replication and transcription machinery; they induce the formation of double-membrane vesicles in infected cells. CoV non-structural protein 6 (Nsp6), a transmembrane-containing protein, together with Nsp3 and Nsp4, have the ability to induce double-membrane vesicles that are similar to those observed in severe acute respiratory syndrome (SARS) coronavirus-infected cells. By itself, Nsp6 can generate autophagosomes from the endoplasmic reticulum. Autophagosomes are normally generated as a cellular response to starvation to carry cellular organelles and long-lived proteins to lysosomes for degradation. Degradation through autophagy may provide an innate defense against virus infection, or conversely, autophagosomes can promote infection by facilitating the assembly of replicase proteins. In addition to initiating autophagosome formation, Nsp6 also limits autophagosome expansion regardless of how they were induced, i.e. whether they were induced directly by Nsp6, or indirectly by starvation or chemical inhibition of MTOR signaling. This may favor coronavirus infection by compromising the ability of autophagosomes to deliver viral components to lysosomes for degradation.	0
424109	cl40478	CoV_Spike_S1_RBD	receptor-binding domain of the S1 subunit of coronavirus spike (S) proteins. This group contains the receptor-binding domain (RBD) of the S1 subunit of the spike (S) protein from porcine hemagglutinating encephalomyelitis virus (HEV), which is associated with acute outbreaks of wasting and encephalitis in nursing piglets from pig farms. Porcine HEV is related to the zoonotic SARS and MERS betacoronaviruses, which have high fatality rates and pandemic potential. The CoV S protein is an envelope glycoprotein that plays the most important role in viral attachment, fusion, and entry into host cells, and serves as a major target for the development of neutralizing antibodies, inhibitors of viral entry, and vaccines. It is synthesized as a precursor protein that is cleaved into an N-terminal S1 subunit (~700 amino acids) and a C-terminal S2 subunit (~600 amino acids) that mediates attachment and membrane fusion, respectively. Three S1/S2 heterodimers assemble to form a trimer spike protruding from the viral envelope. The S1 subunit contains a receptor-binding domain (RBD), while the S2 subunit contains a hydrophobic fusion peptide and two heptad repeat regions. S1 contains two structurally independent domains, the N-terminal domain (NTD) and the C-terminal domain (C-domain). Depending on the virus, either the NTD or the C-domain can serve as the receptor-binding domain (RBD). While the RBD of mouse hepatitis virus (MHV) is located at the NTD, most CoVs use the C-domain to bind their receptors. The protein receptor for porcine HEV has not yet been identified. Due to the key role of the S protein RBD in viral attachment, it is the major target for antibody-mediated neutralization. This model corresponds to the S1 subunit C-domain that serves as the RBD for most CoVs.	0
424111	cl40480	enolase_like	N/A. Mandelate racemase (MR)-like subfamily of the enolase superfamily, subgroup 4. Enzymes of this subgroup share three conserved carboxylate ligands for the essential divalent metal ion (usually Mg2+), two aspartates and a glutamate, and conserved catalytic residues, a Lys-X-Lys motif and a conserved histidine-aspartate dyad. This subgroup's function is unknown.	0
424112	cl40481	YlqF_related_GTPase	Circularly permuted YlqF-related GTPases. This is the C-terminal helicase domain of ERCC3, RAD25 and XPB helicases.	0
424114	cl40483	ICL_KPHMT	N/A. Ketopantoate hydroxymethyltransferase (KPHMT) is the first enzyme in the pantothenate biosynthesis pathway. Ketopantoate hydroxymethyltransferase (KPHMT) catalyzes the first committed step in the biosynthesis of pantothenate (vitamin B5), which is a precursor to coenzyme A and is required for penicillin biosynthesis.	0
424116	cl40485	AcrIF10	Anti-CRISPR type I subtype F10. Members of the AcrF10 family of anti-CRISPR proteins have been found in phage from various Vibrio, Shewanella, and their relatives. AcrF10 is considered a DNA mimic protein.	0
424117	cl40486	HopBF1	type III secretion system (T3SS) effector HopBF1 from Ewingella americana, and similar proteins. HopBF1, found in plant pathogens such as Pseudomonas syringae and in the human pathogen Ewingella americana, is a type III secretion system effector that acts as a protein kinase. It phosphorylates the eukaryotic chaperone HSP90 on a serine residue, inhibiting its ATPase activity. The inhibition interferes with the proper folding of client proteins of HSP90 that are important to resistance to bacterial infection.	0
424126	cl40495	CueP_fam	CueP family metal-binding protein. This narrowly built model for CueP includes periplasmic proteins from Salmonella enterica, in which it contributes to an increased tolerance to copper, and from various other Gram-negative bacteria. It does not include CueP lipoproteins from species such as Corynebacterium diphtheriae.	0
424133	cl40502	phiSA1p31	phiSA1p31 domain. This uncharacterized protein family occurs in Streptomyces and related species. Some members have insertions of long stretches of low-complexity sequences.	0
424149	cl40518	heterocyst_HetZ	heterocyst differentiation protein HetZ. Members of this family are cyanobacterial proteins distantly related to heterocyst differentiation protein HetZ, which also has a much more closely related set of paralogs in heterocyst-forming species.	0
424152	cl40521	PEPxxWA-CTERM	PEPxxWA-CTERM sorting domain. Members of this family are PEP-CTERM proteins, that is, surface proteins of Gram-negative organisms that carry a short C-terminal region used to help target proteins to their proper cellular location, hold them in position for post-translational modifications that might need to occur (such as glycosylation), and which is eventually removed by exosortase as the protein is ligated to something else. In this family the most conspicuous feature other than the PEP-CTERM sorting signal (with variants that include PEP, PAP, PTP, and SEP) is a pair of Cys residues about 6 amino acids apart from each other. The second Cys occurs in the middle of a run of amino acids that are all either small (Gly, Ser, Ala) or else Asn. The local context suggests the Cys occurs at a turn at the end of a structural feature such as alpha-helix or beta-strand, rather than in the middle of one. The word "cistern" was assigned to suggest the proposed Cys-turn feature.	0
424154	cl40523	EboA_domain	EboA domain-containing protein. This HMM describes a narrow, cyanobacterial-only clade of members of the EboA (eustigmatophyte/bacterial operon A) family. Members of this family appear required for transport of certain secondary metabolite precursors to the periplasm, including (but not limited to) precursors of scytonemin. More than half the members of this clade belong to scytonemin producers.	0
424159	cl40528	porH_1	PorH family porin. Proteins of this HMM family form major outer membrane hetero-oligomeric pores on the cell wall of Corynebacterium with PorA family porins.	0
424160	cl40529	opr_proin_2	Opr family porin. Proteins hit by this HMM model are members of the Opr family porins, which are mainly found in Pseudomonas and other Gram-negative bacteria with different substrates.	0
424162	cl40531	DotA_TraY	conjugal transfer/type IV secretion protein DotA/TraY. This HMM distinguishes DotA of type IVB secretion systems from TraY as the term is used in the conjugal transfer systems of IncI1 family plasmids.	0
424173	cl40542	NprX_fam	NprX family peptide pheromone. NprX, also called NprRB, belongs to the NprR-NprX quorum-sensing system in Bacillus. The mature form of the peptide pheromone is the SKPDIVG heptapeptide.	0
424178	cl40547	gliding_CglD	adventurous gliding motility lipoprotein CglD. CglC (cell contact-dependent gliding (or conditional gliding) motility protein C), also called adventurous gliding motility protein AgmO, is found in delta-proteobacterial species that exhibit a taxonomically restricted form of gliding motility.	0
424188	cl40557	PorV_fam	PorV/PorQ family protein. PorV, as characterized in oral pathogen Porphyromonas gingivalis, is a component of the type IX secretion system (T9SS) needed to process a subset of T9SS substrates. PorV is a paralog of PorQ.	0
424196	cl40565	staphy_B_SbnF	staphyloferrin B biosynthesis protein SbnF. SbnC, related to siderophore biosynthesis protein IucA and IucC, is encoded in Staphylococcus aureus in the sbnABCDEFGHI locus responsible for the biosynthesis of staphyloferrin B, a carboxylate-type siderophore. SbnC is found in many species of Staphylococcus.	0
424201	cl40570	Ca_tandemer	Ca2+-stabilized adhesin repeat. This variant form of the Ig-like domain occurs as a repeat in a number of large adhesins, including a 1.5-MDa ice-binding adhesin, the Marinomonas primoryensis antifreeze protein.	0
424204	cl40573	DUF5670	Family of unknown function (DUF5670). Members of this family are very small (about 45 amino acids) and highly hydrophobic, suggesting a presence in the membrane, and have a broad phylogenetic distribution. The member protein lmo0937, from the pathogen Listeria monocytogenes, is described as up-regulated when the bacterium is in the mouse spleen, suggesting a role in stress response.	0
424205	cl40574	DUF5309	Family of unknown function (DUF5309). A founding member of this family, AKO59007.1, was identified as the major head protein in Brucella phage 02_19 during a comparison of Brucella phage genomes. The N-terminal half appears to be the better conserved region with fewer insertions and deletions.	0
424212	cl40581	Fuzzy	protein fuzzy and homologs. This entry represents a longin-like domain found in Fuz and related proteins. This entry is specific to the second Longin domain of the HerMon (Hermansky-Pudlak syndrome and MON1-CCZ1) family, including protein sequences of FUZ, MON1 and HPS1 families. The Mon1/Ccz1 complex (MC1) is the GDP/GTP exchange factor (GEF) for the Rab GTPase Ypt7/Rab7 during vesicular trafficking. The Hps1/Hps4 complex (BLOC-3) is a Rab32 and Rab38 GEF and is required for biogenesis of melanosomes and platelet dense granules. Inturned (INTU) and Fuzzy (FUZ) proteins interact as members of the ciliogenesis and planar polarity effector (CPLANE) complex that controls recruitment of intraflagellar transport machinery to the basal body of primary cilia.	0
424216	cl40585	dRRM_Rrp7p	deviant RNA recognition motif (dRRM) in yeast ribosomal RNA-processing protein 7 (Rrp7p) and similar proteins. This domain corresponds to the N-terminal RNA-binding domain found in the Rrp7 protein. It has an RRM-like fold with a circular permutation.	0
424219	cl40588	cv_Nsp4_TM	coronavirus non-structural protein 4 (Nsp4) transmembrane domain. This is the N-terminal domain of the coronavirus nonstructural protein 4 (NSP4). NSP4 is encoded by ORF1a/1ab and proteolytically released from the pp1a/1ab polyprotein. NSP4 is a membrane-spanning protein which is thought to anchor the viral replication-transcription complex to modified endoplasmic reticulum membranes. This N-terminal region represents the membrane spanning region, covering four transmembrane regions.	0
424225	cl40594	AAA_10	AAA-like domain. This entry represents the P-loop domain found in the TraG conjugation protein.	0
424235	cl40604	baeRF_family3	Bacterial archaeo-eukaryotic release factor family 3. Bacterial family of the archaeo-eukaryotic release factor superfamily. Likely to play roles in biological conflicts or regulation under stress conditions at the ribosome.	0
424236	cl40605	Hsm3_like	Hsm3 is a yeast Proteasome chaperone of the 19S regulatory particle and related proteins. Hsm3 is a proteasome-dedicated chaperone that forms a base precursor, Hsm3-Rpt1-Rpt2-Rpn1. Hsm3 consists of 23 alpha-helices forming 11 repeats similar to the HEAT repeats. This entry includes the first 5 repeats at the N-terminal.	0
424239	cl40608	MDD_C	Mevalonate 5-diphosphate decarboxylase C-terminal domain. This enzyme catalyzes the last step in the synthesis of isopentenyl diphosphate (IPP) in the mevalonate pathway. Alternate names: mevalonate diphosphate decarboxylase; pyrophosphomevalonate decarboxylase [Central intermediary metabolism, Other]	0
424240	cl40609	SAM_DrpA	DNA processing protein A sterile alpha motif domain. This is the N-terminal domain found in DNA processing protein A (DprA) present in Streptococcus pneumoniae. DprA has recently been discovered to be a transformation-dedicated RecA loader. Transformation is believed to play a major role in genetic plasticity. This domain is known as the sterile alpha motif (SAM) domain. DprAs are able to form a type of dimer through SAM-SAM interactions, also known as N/N interactions.	0
424241	cl40610	FtsX_ECD	FtsX extracellular domain. FtsX is an integral membrane protein encoded in the same operon as signal recognition particle docking protein FtsY and FtsE. It belongs to a family of predicted permeases and may play a role in the insertion of proteins required for potassium transport, cell division, and other activities. FtsE is a hydrophilic nucleotide-binding protein that associates with the inner membrane by means of association with FtsX. [Cellular processes, Cell division, Protein fate, Protein and peptide secretion and trafficking]	0
424243	cl40612	YlmH_RBD	Putative RNA-binding domain in YlmH. This domain adopts an RRM like fold and is found in the B. subtilis YlmH cell division protein.	0
424244	cl40613	HTH_ParB	HTH domain found in ParB protein. This family may include an HTH domain.	0
424245	cl40614	Usher	Outer membrane usher protein. This is the presumed beta barrel domain from the usher-like TcfC family of proteins.	0
424246	cl40615	KIX_2	KIX domain. CBP and P300 bind to the CREB via a domain known as KIX. The KIX domain of CBP also binds to transactivation domains of other nuclear factors including Myb and Jun.	0
424248	cl40617	zf-TRAF	TRAF-type zinc finger. 	0
424250	cl40619	TetR_C_8	Transcriptional regulator C-terminal region. The seed alignment for this family was built from a set of closely related uncharacterized proteins associated with operons for the type of bacterial dihydroxyacetone kinase that transfers PEP-derived phosphate from a phosphoprotein, as in phosphotransferase system transport, rather than from ATP. Members have a TetR transcriptional regulator domain (pfam00440) at the N-terminus and sequence homology throughout.	0
424251	cl40620	DUF4175	Domain of unknown function (DUF4175). Members of this family are long (~850 residue) bacterial proteins from the alpha Proteobacteria. Each has 2-3 predicted transmembrane helices near the N-terminus and a long C-terminal region that includes stretches of Gln/Gly-rich low complexity sequence, predicted by TMHMM to be outside the membrane. In Bradyrhizobium japonicum, two tandem reading frames are together homologous to the single members found in other species; the cutoff scores are set low enough that the longer scores above the trusted cutoff and the shorter above the noise cutoff for this model.	0
424252	cl40621	HTH_33	Winged helix-turn helix. Transposase proteins are necessary for efficient DNA transposition. This family includes insertion sequences from Synechocystis PCC 6803 three of which are characterized as homologous to bacterial IS5- and IS4- and to several members of the IS630-Tc1-mariner superfamily.	0
424253	cl40622	BatD	Oxygen tolerance. This family consists of several eukaryotic translocon-associated protein beta (TRAPB) or signal sequence receptor beta subunit (SSR-beta) proteins. The normal translocation of nascent polypeptides into the lumen of the endoplasmic reticulum (ER) is thought to be aided in part by a translocon-associated protein (TRAP) complex consisting of 4 protein subunits. The association of mature proteins with the ER and Golgi, or other intracellular locales, such as lysosomes, depends on the initial targeting of the nascent polypeptide to the ER membrane. A similar scenario must also exist for proteins destined for secretion.	0
424254	cl40623	tRNA_int_end_N2	tRNA-splicing endonuclease subunit sen54 N-term. tRNA-splicing endonuclease subunit alpha; Reviewed	0
424255	cl40624	zf-C2H2_jaz	Zinc-finger double-stranded RNA-binding. This is a zinc-finger domain with the CxxCx(12)Hx(6)H motif, found in multiple copies in a wide range of proteins from plants to metazoans. Some member proteins, particularly those from plants, are annotated as being RNA-binding.	0
424257	cl40626	Helicase_IV_N	DNA helicase IV / RNA helicase N terminal. DNA helicase IV; Provisional	0
424258	cl40627	HpaB_N	4-hydroxyphenylacetate 3-hydroxylase N terminal. This gene for this monooxygenase is found within apparent operons for the degradation of 4-hydroxyphenylacetic acid in Shigella, Photorhabdus and Pasteurella. The family represented by this model is narrowly limited to gammaproteobacteria to exclude other aromatic hydroxylases involved in various secondary metabolic pathways. Generally, this enzyme acts with the assistance of a small flavin reductase domain protein (HpaC) to provide the cycle the flavin reductant for the reaction. This family of sequences is a member of a larger subfamily of monooxygenases (pfam03241).	0
424259	cl40628	SARS-CoV-like_ORF3a	accessory protein ORF3a of severe acute respiratory syndrome-associated coronavirus and similar proteins from related betacoronavirus. APA3_viroporin is a pro-apoptosis-inducing protein. It localizes to the endoplasmic reticulum (ER)-Golgi compartment. The Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV) causes apoptosis of infected cells, and this is one of the culprits. Multi-pass membrane protein that forms a homotetrameric potassium-sensitive ion channel called a viroporin whose activity causes ER-stress to the host cell.	0
424260	cl40629	DisA-linker	DisA bacterial checkpoint controller linker region. DNA integrity scanning protein DisA; Provisional	0
424261	cl40630	RskA	Anti-sigma-K factor rskA. This domain, formerly known as DUF2337, is the anti-sigma-K factor, RskA. In Mycobacterium tuberculosis the protein positively regulates expression of the antigenic proteins MPB70 and MPB83.	0
424262	cl40631	SPA	Stabilization of polarity axis. Members of this family of hypothetical proteins have no known function.	0
424263	cl40632	Avl9	Transport protein Avl9. This domain occurs at the N-terminus of Afi1, an Arf3p-interacting protein necessary for vesicle trafficking in yeast. This domain is the interacting region of the protein which binds to Arf3, the highly conserved small GTPases (ADP-ribosylation factors). Afi1 is distributed asymmetrically at the plasma membrane and is required for polarized distribution of Arf3 but not of an Arf3 guanine nucleotide-exchange factor, Yel1p. However, Afi1 is not required for targeting of Arf3 or Yel1p to the plasma membrane. Afi1 functions as an Arf3 polarization-specific adapter and participates in development of polarity. Although Arf3 is the homolog of human Arf6 it does not function in the same way, not being necessary for endocytosis or for mating factor receptor internalization. In the S phase, however, it is concentrated at the plasma membrane of the emerging bud. Because of its polarized localization and its critical function in the normal budding pattern of yeast, Arf3 is probably a regulator of vesicle trafficking, which is important for polarized growth.	0
424264	cl40633	Potass_KdpF	F subunit of K+-transporting ATPase (Potass_KdpF). This model describes a very small integral membrane peptide KdpF, a subunit of the K(+)-translocating Kdp complex. It is found upstream of the KdpA subunit (TIGR00680). Because of its very small size and highly hydrophobic character, it is sometimes missed in genome annotation. [Transport and binding proteins, Cations and iron carrying compounds]	0
424265	cl40634	Docking	Erythronolide synthase docking. Polyketide synthase (PKS) catalyzes the biosynthesis of polyketides, which are structurally and functionally diverse natural products in microorganisms and plants. Type I modular PKSs are the large, multifunctional enzymes responsible for the production of a diverse family of structurally rich and often biologically active natural products. The efficiency of acyl transfer at the interfaces of the individual PKS proteins is thought to be governed by helical regions, termed docking domains (dd), located at the C-terminus of the upstream and N-terminus of the downstream polypeptide chains. This entry represents the N-terminal coiled-coil domain found in PikAIV (module 6) proteins from the Pik PKS system in bacteria. This N-terminal PKS docking domain (KS-side docking domain, KSdd) exhibits a coiled-coil motif and the dimer presents a small hydrophobic patch, sometimes flanked by charged residues, as a narrow binding groove where the ACPdd terminal helix can bind.	0
424267	cl40636	FAR1	FAR1 DNA-binding domain. AFT (activator of iron transcription) is an iron regulated transcriptional activator that regulates the expression of genes involved in iron homeostasis. This family includes the paralogous pair of transcription factors AFT1 and AFT2.	0
424268	cl40637	HDOD	HDOD domain. 	0
424269	cl40638	TetR_C_3	YcdC-like protein, C-terminal region. This protein is observed in operons extremely similar to that characterized in E. coli K-12 responsible for the import and catabolism of pyrimidines, primarily uracil. This protein is a member of the TetR family of transcriptional regulators defined by the N-terminal model pfam00440 and the C-terminal model pfam08362 (YcdC-like protein, C-terminal region).	0
424271	cl40640	A-2_8-polyST	Alpha-2,8-polysialyltransferase (POLYST). This family features glycosyltransferases belonging to glycosyltransferase family 52, which have alpha-2,3- sialyltransferase (EC:4.2.99.4) and alpha-glucosyltransferase (EC 2.4.1.-) activity. For example, beta-galactoside alpha-2,3- sialyltransferase expressed by Neisseria meningitidis is a member of this family and is involved in a step of lipooligosaccharide biosynthesis requiring sialic acid transfer; these lipooligosaccharides are thought to be important in the process of pathogenesis. This family includes several bacterial lipooligosaccharide sialyltransferases similar to the Haemophilus ducreyi LST protein. Haemophilus ducreyi is the cause of the sexually transmitted disease chancroid and produces a lipooligosaccharide (LOS) containing a terminal sialyl N-acetyllactosamine trisaccharide.	0
424272	cl40641	TraG_N	TraG-like protein, N-terminal region. conjugal transfer mating pair stabilization protein TraG; Provisional	0
424281	cl40650	MAT1	CDK-activating kinase assembly factor MAT1. MAT1 is an assembly/targeting factor for cyclin-dependent kinase-activating kinase (CAK), which interacts with the transcription factor TFIIH. The domain found to the N-terminal side of this domain is a C3HC4 RING finger.	0
424282	cl40651	FhuF	Ferric iron reductase protein FhuF, involved in iron transport [Inorganic ion transport and metabolism]. ferric iron reductase involved in ferric hydroxamate transport; Provisional	0
424284	cl40653	FbpA	Fibronectin-binding protein A N-terminus (FbpA). This family consists of the N-terminal region of the prokaryotic fibronectin-binding protein. Fibronectin binding is considered to be an important virulence factor in streptococcal infections. Fibronectin is a dimeric glycoprotein that is present in a soluble form in plasma and extracellular fluids; it is also present in a fibrillar form on cell surfaces. Both the soluble and cellular forms of fibronectin may be incorporated into the extracellular tissue matrix. While fibronectin has critical roles in eukaryotic cellular processes, such as adhesion, migration and differentiation, it is also a substrate for the attachment of bacteria. The binding of pathogenic Streptococcus pyogenes and Staphylococcus aureus to epithelial cells via fibronectin facilitates their internalisation and systemic spread within the host.	0
424285	cl40654	Phage_Coat_B	Phage Coat protein B. CoatB is a single filamentous bacteriophage alpha helix of approximately 44 residues. It is likely to assemble into a complex of 35 monomers in a Catherine-wheel like formation. It is the major coat protein of the virion.	0
424286	cl40655	Legionella_OMP	Legionella pneumophila major outer membrane protein precursor. This is a family of putative beta barrel porin-7 BBP7 proteins identified initially in Rhodopirellula baltica.	0
424287	cl40656	NMD3	NMD3 family. The NMD3 protein is involved in nonsense mediated mRNA decay. This amino terminal region contains four conserved CXXC motifs that could be metal binding. NMD3 is involved in export of the 60S ribosomal subunit, which is mediated by the adapter protein Nmd3p in a Crm1p-dependent pathway.	0
424288	cl40657	DUF572	Family of unknown function (DUF572). Family of eukaryotic proteins with undetermined function.	0
424289	cl40658	DUF512	Protein of unknown function (DUF512). Members of this protein family are predicted radical SAM enzymes of unknown function, apparently restricted to and universal across the Cyanobacteria. The high trusted cutoff score for this model, 700 bits, excludes homologs from other lineages. This exclusion seems justified because a significant number of sequence positions are simultaneously unique to and invariant across the Cyanobacteria, suggesting a specialized, conserved function, perhaps related to photosynthesis. A distantly related protein family, TIGR03278, is universal in and restricted to archaeal methanogens, and may be linked to methanogenesis.	0
424291	cl40660	G3P_antiterm	Glycerol-3-phosphate responsive antiterminator. Intracellular glycerol is usually converted to glycerol-3-phosphate in an ATP-requiring phosphorylation reaction catalyzed by glycerol kinase (GlpK). Glycerol-3-phosphate activates the antiterminator GlpP.	0
424293	cl40662	DUF445	Protein of unknown function (DUF445). Predicted to be a membrane protein.	0
424295	cl40664	Exonuc_V_gamma	Exodeoxyribonuclease V, gamma subunit. This model describes the gamma subunit of exodeoxyribonuclease V. Species containing this protein should also have the alpha (TIGR01447) and beta (TIGR00609) subunits. Candidates from Borrelia and from the Chlamydias differ dramatically and score between trusted and noise cutoffs. [DNA metabolism, DNA replication, recombination, and repair]	0
424296	cl40665	Class_IIIsignal	Class III signal peptide. This family of archaeal proteins contains an amino terminal motif QXSXEXXXL that has been suggested to be part of a class III signal sequence, with the Q being the +1 residue of the signal peptidase cleavage site. Two members of this family are cleaved by a type IV pilin-like signal peptidase.	0
424297	cl40666	Autophagy_act_C	Autophagocytosis associated protein, active-site domain. Autophagocytosis is a starvation-induced process responsible for transport of cytoplasmic proteins to the vacuole. The small C-terminal domain is likely to be a distinct binding region for the stability of the autophagosome complex. It carries a highly characteristic conserved FLKF sequence motif.	0
424298	cl40667	RplB	Ribosomal protein L2 [Translation, ribosomal structure and biogenesis]. This model distinguishes bacterial and organellar ribosomal protein L2 from its counterparts in the archaea and in the eukaryotic cytosol. Plant mitochondrial examples tend to have long, variable inserts. [Protein synthesis, Ribosomal proteins: synthesis and modification]	0
424301	cl40670	PCRF	PCRF domain. This is a conserved region of approx. 125 residues of one of the proteins that makes up the small subunit of the mitochondrial ribosome. In Saccharomyces cerevisiae the protein is MRP-S24 whereas in humans it is MRP-S28. The human mitochondrial ribosome has 29 distinct proteins in the small subunit and these have homologs in, for example, Drosophila melanogaster, Caenorhabditis elegans, and in the genomes of several fungi.	0
424304	cl40673	Fe_dep_repr_C	Iron dependent repressor, metal binding and dimerization domain. This family includes the Diphtheria toxin repressor.	0
424306	cl40675	Herpes_glycop_H	Herpesvirus glycoprotein H main domain. envelope glycoprotein H; Provisional	0
424307	cl40676	Epimerase_2	UDP-N-acetylglucosamine 2-epimerase. This family consists of UDP-N-acetylglucosamine 2-epimerases (EC:5.1.3.14); this enzyme catalyzes the production of UDP-ManNAc from UDP-GlcNAc. Note that some of the enzymes in this family are bifunctional; in these instances Pfam matches only the N-terminal half of the protein, suggesting that the additional C-terminal part (when compared to mono-functional members of this family) is responsible for the UDP-N-acetylmannosamine kinase activity of these enzymes. This hypothesis is further supported by the assumption that the C-terminal part of rat Gne is the kinase domain.	0
424310	cl40679	MCR_beta	Methyl-coenzyme M reductase beta subunit, C-terminal domain. Members of this protein family are the beta subunit of methyl coenzyme M reductase, also called coenzyme-B sulfoethylthiotransferase (EC 2.8.4.1). This enzyme, with alpha, beta, and gamma subunits, catalyzes the last step in methanogenesis. Several methanogens encode two such enzymes, designated I and II; this model does not separate the isozymes. [Energy metabolism, Methanogenesis]	0
424313	cl40682	DUF4271	Domain of unknown function (DUF4271). This family includes O-antigen polysaccharide polymerases. These enzymes link O-units via a glycosidic linkage to form a long O-antigen. These enzymes vary in specificity and sequence.	0
424314	cl40683	Pro_dh	Proline dehydrogenase. proline dehydrogenase	0
424315	cl40684	PPTA	Protein prenyltransferase alpha subunit repeat. Both farnesyltransferase (FT) and geranylgeranyltransferase 1 (GGT1) recognize a CaaX motif on their substrates where 'a' stands for preferably aliphatic residues, whereas GGT2 recognizes a completely different motif. Important substrates for FT include, amongst others, many members of the Ras superfamily. GGT1 substrates include some of the other small GTPases and GGT2 substrates include the Rab family.	0
424316	cl40685	DusA	tRNA-dihydrouridine synthase [Translation, ribosomal structure and biogenesis]. 	0
424318	cl40687	Carboxyl_trans	Carboxyl transferase domain. All of the members in this family are biotin dependent carboxylases. The carboxyl transferase domain carries out the following reaction; transcarboxylation from biotin to an acceptor molecule. There are two recognized types of carboxyl transferase. One of them uses acyl-CoA and the other uses 2-oxoacid as the acceptor molecule of carbon dioxide. All of the members in this family utilize acyl-CoA as the acceptor molecule.	0
424319	cl40688	MvaT_DBD	DNA-binding domain of the bacterial xenogeneic silencer MvaT. MvaT is a xenogeneic silencer conserved in Pseudomonas which assists in distinguishing foreign from self DNA. It prefers binding to flexible DNA segments with multiple TpA steps, and forms nucleoprotein filaments through cooperative polymerization.	0
424320	cl40689	TOP4c	N/A. DNA topoisomerase II medium subunit; Provisional	0
424325	cl40694	BglB	Beta-glucosidase/6-phospho-beta-glucosidase/beta-galactosidase [Carbohydrate transport and metabolism]. 6-phospho-beta-glucosidase; Reviewed	0
424326	cl40695	Ribosomal_S4	Ribosomal protein S4/S9 N-terminal domain. 30S ribosomal protein S4; Validated	0
424327	cl40696	SARS-CoV_ORF3b	accessory protein ORF3b of severe acute respiratory syndrome-associated coronavirus. This family of proteins is found in viruses. Proteins in this family are typically between 32 and 154 amino acids in length. This family contains the SARS coronavirus 3b protein which is predominantly localized in the nucleolus, and induces G0/G1 arrest and apoptosis in transfected cells.	0
424328	cl40697	cv_gamma-delta_Nsp2_IBV-like	gamma- and deltacoronavirus non-structural protein 2 (Nsp2), similar to IBV Nsp2 and related proteins. This is the N-terminal domain found in Replicase polyprotein 1a (also known as non-structural protein 2a-Nsp2a). Family members are found in Gammacoronaviruses.	0
424330	cl40699	AAA_13	AAA domain. This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. This family includes the PrrC protein that is thought to be the active component of the anticodon nuclease.	0
424334	cl40703	Trypan_PARP	Procyclic acidic repetitive protein (PARP). The SPATA3 family of proteins is expressed significantly in testis and faintly in epididymis in the ten tissues of testis, ovary, spleen, kidney, lung, heart, brain, epididymis, liver and skeletal muscle in mouse. Members are not expressed in the eight other tissues. This suggests that SPATA3 plays potential roles in spermatogenesis cell apoptosis or spermatogenesis.	0
424335	cl40704	Ycf1	Ycf1. Ycf1; Provisional	0
424336	cl40705	RAP-1	Rhoptry-associated protein 1 (RAP-1). rhoptry-associated protein; Provisional	0
424338	cl40707	EpmB	L-lysine 2,3-aminomutase (EF-P beta-lysylation pathway) [Amino acid transport and metabolism]. This model represents essentially the whole of E. coli YjeK and of some of its apparent orthologs. YodO in Bacillus subtilis, a family member which is a longer protein by an additional 100 residues, is characterized as a lysine 2,3-aminomutase with iron, sulphide and pyridoxal 5'-phosphate groups. The homolog MJ0634 from M. jannaschii is preceded by nearly 200 C-terminal residues. This family shows similarity to molybdenum cofactor biosynthesis protein MoaA and related proteins. Note that the E. coli homolog was expressed in E. coli and purified and found not to display lysine 2,3-aminomutase activity. Active site residues are found in the 100 residue extension in B. subtilis. Name changed to KamA family protein. [Cellular processes, Adaptations to atypical conditions]	0
424339	cl40708	MviM	Predicted dehydrogenase [General function prediction only]. All members of the seed alignment for this model are known or predicted inositol 2-dehydrogenase sequences co-clustered with other enzymes for catabolism of myo-inositol or closely related compounds. Inositol 2-dehydrogenase catalyzes the first step in inositol catabolism. Members of this family may vary somewhat in their ranges of acceptable substrates and some may act on analogs to myo-inositol rather than myo-inositol per se. [Energy metabolism, Sugars]	0
424340	cl40709	cyano_w_EgtBD	hercynine metabolism protein. Members of this protein family resemble TIGR04375 and, more distantly, phage shock protein A (PspA). Members are restricted to the Cyanobacteria.	0
424341	cl40710	PflX	Uncharacterized Fe-S protein PflX, radical SAM superfamily [General function prediction only]. Members of this protein family are uncharacterized radical SAM enzymes that occur in a prokaryotic three-gene system along with homologs of mammalian proteins Memo (Mediator of ErbB2-driven cell MOtility) and AMMERCR1 (Alport syndrome, Mental Retardation, Midface hypoplasia, and Elliptocytosis). Among radical SAM enzymes that have been experimentally characterized, the most closely related in sequence include activases of pyruvate formate-lyase and of benzylsuccinate synthase.	0
424349	cl40718	mnmC	bifunctional tRNA (5-methylaminomethyl-2-thiouridine)(34)-methyltransferase MnmD/FAD-dependent 5-carboxymethylaminomethyl-2-thiouridine(34) oxidoreductase MnmC. In Escherichia coli, the protein previously designated YfcK is now identified as the bifunctional enzyme MnmC. It acts, following the action of the heterotetramer of GidA and MnmE, in the modification of U-34 of certain tRNA to 5-methylaminomethyl-2-thiouridine (mnm5s2U). In other bacteria, the corresponding proteins are usually but not always found as a single polypeptide chain, but occasionally as the product of tandem genes. This model represents the C-terminal region of the multifunctional protein. [Protein synthesis, tRNA and rRNA base modification]	0
424350	cl40719	PRK06078	N/A. In general, members of this protein family are designated pyrimidine-nucleoside phosphorylase, enzyme family EC 2.4.2.2, as in Bacillus subtilis, and more narrowly as the enzyme family EC 2.4.2.4, thymidine phosphorylase (alternate name: pyrimidine phosphorylase), as in Escherichia coli. The set of proteins encompassed by this model is designated subfamily rather than equivalog for this reason; the protein name from this model should be used when TIGR02643 does not score above trusted cutoff. [Purines, pyrimidines, nucleosides, and nucleotides, Other]	0
424351	cl40720	PRK15033	tricarballylate utilization 4Fe-4S protein TcuB. This model identifies proteins of two distinct names which may or may not have two distinct functions. CitB has been identified in salmonella and E. coli as the signal transduction component of a two-component system for citrate in which CitA acts as a citrate transporter. CobZ is essential for cobalamin biosynthesis (by knockout of the R. capsulatus gene) and is complemented by the characterized precorrin 3B synthase CobG. The enzyme has been shown to contain flavin, heme and Fe-S cluster cofactors and is believed to require dioxygen as a substrate. This model identifies the C-terminal domain of the R. capsulatus CobZ, which, in most other species exists as a separate gene adjacent to CobZ.	0
424352	cl40721	CysI	sulfite reductase (NADPH) hemoprotein, beta-component. Distantly related to the iron-sulfur hemoprotein of sulfite reductase (NADPH) found in Proteobacteria and Eubacteria, sulfite reductase (ferredoxin) is a cyanobacterial and plant monomeric enzyme that also catalyzes the reduction of sulfite to sulfide. [Central intermediary metabolism, Sulfur metabolism]	0
424353	cl40722	COG2605	Predicted kinase related to galactokinase and mevalonate kinase  [General function prediction only]. This model represents the shikimate kinase (SK) gene found in archaea which is only distantly related to homoserine kinase (thrB) and not at all to the bacterial SK enzyme. The SK from M. jannaschii has been overexpressed in E. coli and characterized. SK catalyzes the fifth step of the biosynthesis of chorismate from D-erythrose-4-phosphate and phosphoenolpyruvate. [Amino acid biosynthesis, Aromatic amino acid family]	0
424354	cl40723	ArgC	N-acetyl-gamma-glutamylphosphate reductase [Amino acid transport and metabolism]. This model represents the more common of two related families of N-acetyl-gamma-glutamyl-phosphate reductase, an enzyme catalyzing the third step of Arg biosynthesis from Glu. The two families differ by phylogeny, similarity clustering, and the gap architecture in a multiple sequence alignment. Bacterial members of this family tend to be found within Arg biosynthesis operons. [Amino acid biosynthesis, Glutamate family]	0
424355	cl40724	PpsA	Phosphoenolpyruvate synthase/pyruvate phosphate dikinase [Carbohydrate transport and metabolism]. Also called pyruvate,water dikinase and PEP synthase. The member from Methanococcus jannaschii contains a large intein. This enzyme generates phosphoenolpyruvate (PEP) from pyruvate, hydrolyzing ATP to AMP and releasing inorganic phosphate in the process. The enzyme shows extensive homology to other enzymes that use PEP as substrate or product. This enzyme may provide PEP for gluconeogenesis, for PTS-type carbohydrate transport systems, or for other processes. [Energy metabolism, Glycolysis/gluconeogenesis]	0
424356	cl40725	ERG8	Phosphomevalonate kinase  [Lipid transport and metabolism]. This enzyme is part of the mevalonate pathway, one of two alternative pathways for the biosynthesis of IPP. In an example of nonorthologous gene displacement, two different types of phosphomevalonate kinase are found - the animal type and this ERG8 type. This model represents plant and fungal forms of the ERG8 type of phosphomevalonate kinase. [Central intermediary metabolism, Other]	0
424357	cl40726	COG2936	Predicted acyl esterase [General function prediction only]. This model represents a protein subfamily that includes the cocaine esterase CocE, several glutaryl-7-ACA acylases, and the putative diester hydrolase NonD of Streptomyces griseus (all hydrolases). This family shows extensive, low-level similarity to a family of xaa-pro dipeptidyl-peptidases, and local similarity by PSI-BLAST to many other hydrolases. [Unknown function, Enzymes of unknown specificity]	0
424358	cl40727	CagE_TrbE_VirB	CagE, TrbE, VirB family, component of type IV transporter system. Type IV secretion systems are found in Gram-negative pathogens. They export proteins, DNA, or complexes in different systems and are related to plasmid conjugation systems. This model represents related ATPases that include VirB4 in Agrobacterium tumefaciens (DNA export) CagE in Helicobacter pylori (protein export) and plasmid TraB (conjugation).	0
424359	cl40728	cax	calcium/proton exchanger (cax). The Ca2+:Cation Antiporter (CaCA) Family (TC 2.A.19). Proteins of the CaCA family are found ubiquitously, having been identified in animals, plants, yeast, archaea and widely divergent bacteria. All of the characterized animal proteins catalyze Ca2+:Na+ exchange although some also transport K+. The NCX1 plasma membrane protein exchanges 3 Na+ for 1 Ca2+. The E. coli ChaA protein catalyzes Ca2+:H+ antiport but may also catalyze Na+:H+ antiport. All remaining well-characterized members of the family catalyze Ca2+:H+ exchange. This model is generated from the calcium ion/proton exchangers of the CaCA family. [Transport and binding proteins, Cations and iron carrying compounds]	0
424361	cl40730	GlcD	FAD/FMN-containing dehydrogenase [Energy production and conversion]. This protein, the glycolate oxidase GlcD subunit, is similar in sequence to that of several D-lactate dehydrogenases, including that of E. coli. The glycolate oxidase has been found to have some D-lactate dehydrogenase activity. [Energy metabolism, Other]	0
424362	cl40731	PLN02677	N/A. mevalonate kinase; Provisional	0
424363	cl40732	AA_permease_2	Amino acid permease. inner membrane transporter YjeM; Provisional	0
424364	cl40733	EutJ	Ethanolamine utilization protein EutJ, possible chaperonin [Amino acid transport and metabolism]. ethanolamine utilization protein EutJ; Provisional	0
424366	cl40735	RecB	ATP-dependent exoDNAse (exonuclease V) beta subunit (contains helicase and exonuclease domains) [Replication, recombination and repair]. The RecBCD holoenzyme is a multifunctional nuclease with potent ATP-dependent exodeoxyribonuclease activity. Ejection of RecD, as occurs at chi recombinational hotspots, cripples exonuclease activity in favor of recombinagenic helicase activity. All proteins in this family for which functions are known are DNA-DNA helicases that are used as part of an exonuclease-helicase complex (made up of RecBCD homologs) that function to generate substrates for the initiation of recombination and recombinational repair. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]	0
424367	cl40736	FeoB	Fe2+ transport system protein B [Inorganic ion transport and metabolism]. FeoB (773 amino acids in E. coli), a cytoplasmic membrane protein required for iron(II) uptake, is encoded in an operon with FeoA (75 amino acids), which is also required, and is regulated by Fur. There appear to be two copies in Archaeoglobus fulgidus and Clostridium acetobutylicum. [Transport and binding proteins, Cations and iron carrying compounds]	0
424368	cl40737	PurT	Formate-dependent phosphoribosylglycinamide formyltransferase (GAR transformylase) [Nucleotide transport and metabolism]. This enzyme is an alternative to PurN (TIGR00639) [Purines, pyrimidines, nucleosides, and nucleotides, Purine ribonucleotide biosynthesis]	0
424369	cl40738	fliF	flagellar basal body M-ring protein FliF. flagellar MS-ring protein; Reviewed	0
424370	cl40739	TOP2c	TopoisomeraseII. This model describes the common type II DNA topoisomerase (DNA gyrase). Two apparently independently arising families, one in the Proteobacteria and one in Gram-positive lineages, are both designated topoisomerase IV. Proteins scoring above the noise cutoff for this model and below the trusted cutoff for topoisomerase IV models probably should be designated GyrB. [DNA metabolism, DNA replication, recombination, and repair]	0
424371	cl40740	DnaG	DNA primase (bacterial type) [Replication, recombination and repair]. DNA primase; Provisional	0
424372	cl40741	DAO	FAD dependent oxidoreductase. Members of this protein family are the A subunit, product of the glpA gene, of a three-subunit, membrane-anchored, FAD-dependent anaerobic glycerol-3-phosphate dehydrogenase. [Energy metabolism, Anaerobic]	0
424373	cl40742	PhaC	Poly(3-hydroxyalkanoate) synthetase  [Lipid transport and metabolism]. This model represents the class II subfamily of poly(R)-hydroxyalkanoate synthases, which polymerizes hydroxyacyl-CoAs, typically with six to fourteen carbons in the hydroxyacyl backbone into aliphatic esters termed poly(R)-hydroxyalkanoic acids. These polymers accumulate as carbon and energy storage inclusions in many species and can amount to 90 percent of the dry weight of cell. [Fatty acid and phospholipid metabolism, Biosynthesis]	0
424374	cl40743	HemY	Uncharacterized conserved protein HemY, contains two TPR repeats [Function unknown]. Members of this protein family are uncharacterized tetratricopeptide repeat (TPR) proteins invariably found in heme biosynthesis gene clusters. The absence of any invariant residues other than Ala argues against this protein serving as an enzyme per se. The gene symbol hemY assigned in E. coli is unfortunate in that an unrelated protein, protoporphyrinogen oxidase (HemG in E. coli) is designated HemY in Bacillus subtilis. [Unknown function, General]	0
424376	cl40745	SufI	Multicopper oxidase with three cupredoxin domains (includes cell division protein FtsP and spore coat protein CotA)   [Cell cycle control, cell division, chromosome partitioning, Inorganic ion transport and metabolism, Cell wall/membrane/envelope biogenes. This family consists of copper-type nitrite reductase. It reduces nitrite to nitric oxide, the first step in denitrification. [Central intermediary metabolism, Nitrogen metabolism]	0
424377	cl40746	SepRS	O-phosphoseryl-tRNA(Cys) synthetase [Translation, ribosomal structure and biogenesis]. O-phosphoseryl-tRNA synthetase; Reviewed	0
424378	cl40747	MraZ	MraZ, DNA-binding transcriptional regulator and inhibitor of RsmH methyltransferase activity [Translation, ribosomal structure and biogenesis]. Members of this family contain two tandem copies of a domain described by pfam02381. This protein often is found with other genes of the dcw (division cell wall) gene cluster, including mraW, ftsI, murE, murF, ftsW, murG, etc. Recent work shows MraW in E. coli binds an upstream region with three tandem GTGGG repeats separated by 5bp spacers. We find similar sites in other species. [Cellular processes, Cell division, Regulatory functions, DNA interactions]	0
424379	cl40748	MauG	Cytochrome c peroxidase [Posttranslational modification, protein turnover, chaperones]. This model describes a subfamily of di-heme proteins related to the di-heme cytochrome c peroxidase and to MauG (methylamine utilization G), an enzyme that performs a tryptophan tryptophylquinone modification to the methylamine dehydrogenase light chain.	0
424380	cl40749	COG1712	Predicted dinucleotide-utilizing enzyme  [General function prediction only]. putative L-aspartate dehydrogenase; Provisional	0
424381	cl40750	PcnB	tRNA nucleotidyltransferase/poly(A) polymerase [Translation, ribosomal structure and biogenesis]. 	0
424382	cl40751	LysR	DNA-binding transcriptional regulator, LysR family [Transcription]. This group of sequences represents a number of related clades with numerous examples of members adjacent to operons for the degradation of 2-aminoethylphosphonate (AEP) in Pseudomonas, Ralstonia, Bordetella and Burkholderia species. These are transcriptional regulators of the LysR family which contain a helix-turn-helix (HTH) domain (pfam00126) and a periplasmic substrate-binding protein-like domain (pfam03466). [Regulatory functions, DNA interactions]	0
424383	cl40752	GltD	NADPH-dependent glutamate synthase beta chain or related oxidoreductase [Amino acid transport and metabolism, General function prediction only]. dihydropyrimidine dehydrogenase subunit A; Provisional	0
424384	cl40753	PRK10668	N/A. putative transcriptional regulator; Provisional	0
424385	cl40754	Efp	Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) [Translation, ribosomal structure and biogenesis]. function: involved in peptide bond synthesis. stimulate efficient translation and peptide-bond synthesis on native or reconstituted 70S ribosomes in vitro. probably functions indirectly by altering the affinity of the ribosome for aminoacyl-tRNA, thus increasing their reactivity as acceptors for peptidyl transferase (by similarity). The trusted cutoff of this model is set high enough to exclude members of TIGR02178, an EFP-like protein of certain Gammaproteobacteria. [Protein synthesis, Translation factors]	0
425345	cl41714	ZBD_UPF1_nv_SF1_Hel-like	Cys/His rich zinc-binding domain (CH/ZBD) of eukaryotic UPF1 helicase, nidovirus SF1 helicases including coronavirus Nsp13 and arterivirus Nsp10, and related proteins. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. Members of this arnidovirus group belong to helicase superfamily 1 (SF1) and include arterivirus helicases such as Equine arteritis virus (EAV) Nsp10 helicase encoded on ORF1b. The CH/ZBD has 3 zinc-finger (ZnF1-3) motifs. Members of this family belong to a family of nidoviral replication helicases which include SARS-Nsp13, a component of the viral RNA synthesis replication and transcription complex (RTC). The SARS-Nsp13 CH/ZBD is indispensable for helicase activity and interacts with SARS-Nsp12. SARS-Nsp12 can enhance the helicase activity of SARS-Nsp13 and can interact with SARS-Nsp13 on the third zinc finger motif of the CH/ZBD.	0
425346	cl41715	1B_UPF1_nv_SF1_Hel-like	1B domain of eukaryotic UPF1 helicase, nidovirus SF1 helicases including coronavirus Nsp13 and arterivirus Nsp10, and related proteins. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. Members of this subfamily belong to helicase superfamily 1 (SF1) and include arterivirus helicases such as Equine arteritis virus (EAV) Nsp10 helicase encoded on ORF1b. EAV Nsp10 is a multidomain protein; its other domains include an N-terminal Cys/His rich zinc-binding domain (CH/ZBD) and a SF1 helicase core. The 1B domain is involved in nucleic acid substrate binding; the 1B domain of EAV Nsp10 undergoes large conformational change upon substrate binding, and together with the 1A and 2A domains of the helicase core form a channel that accommodates the single stranded nucleic acids.	0
425347	cl41716	SUD_C_DPUP_CoV_Nsp3	C-terminal SARS-Unique Domain (SUD) of betacoronavirus non-structural protein 3 (Nsp3). This subfamily contains the SUD-C of Rousettus bat coronavirus (CoV) HKU9 non-structural protein 3 (Nsp3) and other Nsp3s from betacoronaviruses in the nobecovirus subgenera (D lineage). Non-structural protein 3 (Nsp3) is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. Nsp3 of SARS coronavirus includes a SARS-unique domain (SUD) consisting of three globular domains separated by short linker peptide segments: SUD-N, SUD-M, and SUD-C. SUD-N and SUD-M are macro domains which bind G-quadruplexes (unusual nucleic-acid structures formed by consecutive guanosine nucleotides). SUD is not as specific to SARS CoV as originally thought and is also found in Rousettus bat CoV HKU9 and related bat CoVs. Similar to SARS SUD-C, Rousettus bat CoV HKU9 SUD-C (HKU9 C), also adopts a frataxin-like fold that has structural similarity to DNA-binding domains of DNA-modifying enzymes. However, there is little sequence similarity between the two domains. SARS SUD-C has been shown to bind to single-stranded RNA and recognize purine bases more strongly than pyrimidine bases; it also regulates the RNA binding behavior of the SARS SUD-M macrodomain. It is not known whether HKU9 C functions in the same way.	0
425348	cl41717	M_cv_Nsp15-NTD_av_Nsp11-like	middle (M) domain of coronavirus Nonstructural protein 15 (Nsp15) and the N-terminal domain (NTD) of arterivirus Nsp11 and related proteins. Nidovirus endoribonucleases (NendoUs) are uridylate-specific endoribonucleases, which release a cleavage product containing a 2',3'-cyclic phosphate at the 3' terminal end. NendoUs include Nsp15 from coronaviruses and Nsp11 from arteriviruses, both of which may participate in the viral replication process and in the evasion of the host immune system. Coronavirus Nsp15 NendoUs have an N-terminal domain, a middle (M) domain and a C-terminal catalytic (NendoU) domain. Coronavirus Nsp15 from Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV), human Coronavirus 229E (HCoV229E), and Murine Hepatitis Virus (MHV) form a functional hexamer. Oligomerization of Porcine DeltaCoronavirus (PDCoV) Nsp15 differs from that of the other coronaviruses; it has been shown to exist as a dimer and a monomer in solution.	0
425349	cl41718	NendoU_XendoU-like	Nidoviral uridylate-specific endoribonuclease (NendoU) domain of coronavirus Nonstructural protein 15 (Nsp15), arterivirus Nsp11, torovirus endoribonuclease, Xenopus laevis endoribonuclease XendoU, and related proteins. Nidovirus endoribonucleases (NendoUs) are uridylate-specific endoribonucleases, which release a cleavage product containing a 2',3'-cyclic phosphate at the 3' terminal end. The Porcine torovirus (PToV) strain PToV-NPL/2013 NendoU domain is located at the N-terminus of the ORF1ab replicase polyprotein, between regions annotated as Nonstructural proteins 11 (Nsp11) and 13 (Nsp13). This subfamily belongs to a family which includes Nsp15 from coronaviruses and Nsp11 from arteriviruses, which may participate in the viral replication process and in the evasion of the host immune system. These vary in their requirement for Mn2+. Coronavirus Nsp15 generally form functional hexamers, with the exception of Porcine DeltaCoronavirus (PDCoV) Nsp15 which exists as a dimer and a monomer in solution. Arterivirus (Porcine Reproductive and Respiratory Syndrome virus) PRRSV Nsp11 is a dimer. NendoUs are distantly related to Xenopus laevis Mn(2+)-dependent uridylate-specific endoribonuclease (XendoU) which is involved in the processing of intron-encoded box C/D U16 small, nucleolar RNA.	0
425350	cl41719	capping_2-OMTase_viral	viral Cap-0 specific (nucleoside-2'-O-)-methyltransferase. Cap-0 specific (nucleoside-2'-O-)-methyltransferase (2'OMTase) catalyzes the methylation of Cap-0 (m7GpppNp) at the 2'-hydroxyl of the ribose of the first nucleotide, using S-adenosyl-L-methionine (AdoMet) as the methyl donor. This reaction is the fourth and last step in mRNA capping, the creation of the stabilizing five-prime cap (5' cap) on mRNA. Nidovirales, a family of ss(+)RNA viruses, cap their mRNAs. For one member, Coronavirus, the 2'OMTase activity is located in the nonstructural protein 16 (NSP16). For others, the 2'OMTase activity may be located in replicase polyprotein 1ab.	0
425351	cl41720	ORF4b_NS3c-betaCoV	accessory protein ORF4b, also known as non-structural protein 3c (NS3c), of betacoronaviruses in the C lineage. This model represents the accessory protein 4b, ORF4b (also called NS3c protein) of Pipistrellus bat coronavirus HKU5 and related bat coronaviruses including Pipistrellus abramus bat coronavirus HKU5-related. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and replicase/protease polyproteins (ORF1ab); all are required to produce a structurally complete viral particle. In addition, CoV genomes also contain ORFs coding for accessory proteins that are specific for certain CoV lineages or for a particular CoV. In general, CoV accessory proteins are considered to be dispensable for viral replication; however, several accessory proteins have been shown to exhibit functions in virus-host interactions during CoV infection. ORF4b/NS3c plays a role in the inhibition of host innate immunity by inhibiting the interaction between host IkappaB kinase epsilon (IKBKE or IKKE) and mitochondrial antiviral-signalling protein (MAVS). In turn, this inhibition prevents the production of host interferon beta. Additionally, it may also interfere with host antiviral response within the nucleus. ORF4b/NS3c proteins in this subgroup are similar to the MERS-CoV ORF4b (also known as MERS-CoV 4b) which has been shown to interfere with the NF-kappaB-dependent innate immune response during infection, as well as antagonizing the early antiviral alpha/beta interferon (IFN-alpha/beta) response, which may significantly contribute to MERS-CoV pathogenesis.	0
425352	cl41721	deltaCoV_NS7_NS7a	deltacoronavirus accessory protein NS7 and NS7a. This group includes the accessory protein NS7a from Quail deltacoronavirus (QdCoV) UAE-HKU30 and sparrow deltacoronavirus (SpCoV-HKU17) within the Buldecovirus subgenus of deltacoronaviruses. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and replicase/protease polyproteins (ORF1ab); all are required to produce a structurally complete viral particle. In addition, CoV genomes also contain ORFs coding for accessory proteins that are specific for certain CoV lineages or for a particular CoV. In general, CoV accessory proteins are considered to be dispensable for viral replication; however, several accessory proteins have been shown to exhibit functions in virus-host interactions during CoV infection. In deltaCoVs, several avian species encode accessory protein NS7a, which is homologous to Porcine coronavirus (PDCoV) HKU15 accessory proteins NS7 and NS7a. PDCoV NS7a is a 100 amino-acid polypeptide identical to the C-terminus of NS7; it remains unclear whether their functions are redundant. The PDCoV NS7 protein is extensively distributed in the mitochondria and may be involved in various cellular processes such as cytoskeleton networks and cell communication, metabolism, and protein biosynthesis. NS7a proteins in this subfamily have yet to be characterized. Phylogenetic analysis revealed that QdCoV UAE-HKU30 belongs to the same CoV species as porcine deltacoronavirus (PdCoV) HKU15 and sparrow deltacoronavirus (SpdCoV) HKU17 within Buldecovirus subgenus, suggesting transmission between avian and swine hosts.	0
425353	cl41722	ORF7b_SARS_bat-CoV-like	Severe Acute Respiratory Syndrome coronavirus structural accessory protein ORF7b and similar proteins from related betacoronaviruses in the B lineage. This group contains the ORF7b, also called NS7b, of Severe Acute Respiratory Syndrome coronaviruses (SARS-CoVs) and related betacoronaviruses identified in Chinese horseshoe bats, including bat SARS-like-CoV WIV1 and HKU3. ORF7b/NS7b from betacoronavirus in the B lineage are not related to NS7b proteins from other betacoronavirus lineages. There are five essential genes in CoVs that result in the following gene products: Spike (S) protein, Membrane (M) glycoprotein, Nucleocapsid (N), Envelope (E) protein, and the ORF1ab (a large polyprotein known as replicase/protease); all are required to produce a structurally complete viral particle. In addition, SARS coronavirus contains a number of open reading frames that code for a total of eight accessory proteins, namely ORFs 3a, 3b, 6, 7a, 7b, 8a, 8b, and 9b. These ORFs are specific for SARS-CoV and do not show significant homology to accessory proteins of other coronaviruses. The SARS-CoV ORF7b protein is a highly hydrophobic 43 amino acid protein which is homologous to an accessory but structural component of SARS-CoV virion. While ORF7b is packaged into virions, it is not required for the virus budding process, as gene 7 deletion viruses replicate efficiently in vitro and in vivo. Moreover, ORF7b possesses a transmembrane helical domain (TMD), between amino acid residues 9-29, which is necessary for its Golgi complex localization, as replacing it with the TMD from the human endoprotease furin results in aberrant localization.	0
425354	cl41723	HD_XRCC4-like_N	N-terminal head domain found in the XRCC4 superfamily of proteins. Paralog of XRCC4 and XLF (PAXX), also called XRCC4-like small protein, is a paralog of X-ray repair cross-complementing protein 4 (XRCC4) and XRCC4-like factor (XLF). It is involved in non-homologous end joining (NHEJ), a major pathway to repair double-strand breaks (DSBs) in DNA. It may act as a scaffold required to stabilize the DSB-repair protein Ku heterodimer, composed of XRCC5/Ku80 and XRCC6/Ku70, at double-strand break sites in cells. It functions with XRCC4 and XLF to bring about DSB repair and cell survival in response to DSB-inducing agents. Similar to XRCC4 and XLF, PAXX monomers are comprised of an N-terminal globular head domain, a centrally located coiled-coil, and a C-terminal region. These monomers homodimerize through two homodimerization domains, the N-terminal globular head domains and long extended alpha-helical coiled-coil regions. This model corresponds to the N-terminal head domain of PAXX, which is structurally related to other XRCC4-superfamily members, XRCC4, XLF, SAS6, and CCDC61.	0
425355	cl41724	DPBB_RlpA_EXP_N-like	double-psi beta-barrel fold of RlpA, N-terminal domain of expansins, and similar domains. This group is made up of endoglucanases from mollusks similar to Ampullaria crossean endoglucanase EG27II, a glycoside hydrolase family 45 (GH45) subfamily B protein. Endoglucanases (EC 3.2.1.4) catalyze the endohydrolysis of (1-4)-beta-D-glucosidic linkages in cellulose, lichenin, and cereal beta-D-glucans. Animal cellulases, such as endoglucanase EG27II, have great potential for industrial applications such as bioethanol production. GH45 endoglucanases from mollusks adopt a double-psi beta-barrel (DPBB) fold.	0
425356	cl41725	UDM1_RNF168_RNF169-like	UDM1 (ubiquitin-dependent DSB recruitment module 1) found in RING finger proteins RNF168, RNF169 and similar proteins. RING finger protein 168 (RNF168) is an E3 ubiquitin-protein ligase that promotes noncanonical K27 ubiquitination to signal DNA damage. Together with RNF8, RNF168 functions as a DNA damage response (DDR) factor that promotes a series of ubiquitylation events on substrates such as H2A and H2AX. With H2AK13/15 ubiquitylation, it facilitates recruitment of repair factors p53-binding protein 1 (53BP1) or the RAP80-BRCA1 complex to sites of double-strand breaks (DSBs), and inhibits homologous recombination (HR) in cells deficient in the tumor suppressor BRCA1. RNF168 also promotes H2A neddylation, which antagonizes ubiquitylation of H2A and regulates DNA damage repair. In addition, RNF168 forms a functional complex with RAD6A or RAD6B during the DNA damage response. This model corresponds to the UDM1 (ubiquitin-dependent double-strand break [DSB] recruitment module 1) domain of RNF168, which comprises LRM1 (LR motif 1), UMI (ubiquitin-interacting motif [UIM]- and MIU-related UBD) and MIU1 (motif interacting with ubiquitin 1). Mutations of Ub-interacting residues in UDM1 have little effect on the accumulation of RNF168 to DSB sites, suggesting that it may not be the main site of binding ubiquitylated and polyubiquitylated targets.	0
425357	cl41726	RHH_CopG_NikR-like	ribbon-helix-helix domains of transcription repressor CopG, nickel responsive transcription factor NikR, and similar proteins. This subfamily includes the N-terminal ribbon-helix-helix (RHH) domain of putative transcriptional repressor CopG from archaea, and similar proteins. These uncharacterized proteins have a typical RHH, similar to plasmid-encoded transcriptional repressor CopG, the protein that is encoded by the promiscuous streptococcal plasmid pMV158 and is involved in the control of plasmid copy number.	0
425358	cl41727	H1_KCTD12-like	H1 domain found in potassium channel tetramerization domain-containing proteins. Potassium channel tetramerization domain-containing protein 16 (KCTD16) is a BTB/POZ domain-containing protein that is an auxiliary subunit of gamma-aminobutyric acid type B (GABA-) receptors associated with mood disorders. It interacts with amyloid beta precursor protein (APP), a type I transmembrane protein involved in a variety of cellular processes such as cell adhesion and axon guidance. KCTD16 generates largely non-desensitizing receptor responses. It consists of an N-terminal BTB domain followed by a region called the H1 domain. The BTB domain mediates interaction with the receptor. The C-terminal H1 domain, which possesses a beta-propeller-like fold, engages in interactions with G-protein beta-gamma subunits. In the related protein KCTD12, the H1 domain is also responsible for desensitization. This model corresponds to the H1 domain of KCTD16, which may not be involved in desensitization.	0
425359	cl41728	WH2	Wiskott-Aldrich Syndrome Homology (WASP) region 2 (WH2 motif), and similar proteins. This family contains the third tandem Wiskott-Aldrich syndrome protein (WASP)-homology domain 2 (WH2) domain in human Spire family protein Spire-1 (also called Spir1) and Spire-2 (Spir2) and related proteins. Spire is an actin nucleator essential for establishing an actin mesh during oogenesis. It was first identified as a Drosophila maternal effect gene essential to establishment of both the anterior/posterior and dorsal/ventral body axes in developing oocytes and embryos. It has been found to sever filaments and sequester monomers in addition to nucleating new filaments; it remains associated with the slow-growing pointed end of the new filament. Spire is involved in intracellular vesicle transport along actin fibers, providing a novel link between actin cytoskeleton dynamics and intracellular transport. It is required for asymmetric spindle positioning and asymmetric cell division during oocyte meiosis. Spire contains four tandem WH2 domains. The mammalian genome encodes two Spire proteins, namely Spire-1 (Spir1) and Spire-2 (Spir2). This model contains WH2 domain 3 of human Spire-1 and Spire-2. Major expression of both spire genes has been detected during embryogenesis in the developing nervous system. In addition, spire1 expression is found in the fetal liver, while spire2 expression is seen in early stages of intestinal development. In adult tissues, the spire2 gene shows a rather broad expression pattern, which includes the epithelial cells of the digestive tract, testicular spermatocytes, and neuronal cells of the nervous system. In contrast, spire1 is mainly expressed in neuronal cells of the nervous system. Minor expression levels were detected in testis and spleen. Spire also acts in the nucleus where, together with Spire-1 and Spire-2, it promotes assembly of nuclear actin filaments in response to DNA damage in order to facilitate movement of chromatin and repair factors after DNA damage. High levels of spire1 expression are restricted to the nervous system, oocytes, and testis. Since function of Spire-1 and Spire-2 in oocyte maturation is redundant, spire1 mutant mice are fertile, overall brain anatomy is not altered, and visual and motor functions remain normal; however, detailed behavioral studies of the spire1 mutant mice unveiled a very specific and highly significant phenotype in terms of fear learning in male mice.	0
425360	cl41729	KLF1_2_4_N	N-terminal domain of Kruppel-like factor (KLF) 1, KLF2, KLF4, and similar proteins. Kruppel/Krueppel-like transcription factors (KLFs) belong to a family of proteins called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domains of an unknown subfamily of KLFs, predominantly found in fish, related to the N-terminal domains of KLF1, KLF2, and KLF4.	0
425361	cl41730	KLF9_13_N-like	Kruppel-like factor (KLF) 9, KLF13, KLF14, KLF16, and similar proteins. Kruppel-like factor 9 (KLF9; also known as Krueppel-like factor 9, or Basic Transcription Element Binding Protein 1/BTEB Protein 1) is a protein that in humans is encoded by the KLF9 gene. KLF9 is critical for the inhibition of growth and development of tumors. It is involved in cell differentiation of B cells, keratinocytes, and neurons. It is also a key transcriptional regulator for uterine endometrial cell proliferation, adhesion, and differentiation; these are processes essential for pregnancy success and are subverted during tumorigenesis. KLF9, KLF10, KLF11, KLF13, KLF14, and KLF16 share a conserved alpha-helical motif AA/VXXL that mediates their binding to Sin3A and their activities as transcriptional repressors. KLF9 belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF9.	0
425362	cl41731	KLF10_11_N	N-terminal domain of Kruppel-like factor (KLF) 10, KLF11, and similar proteins. Kruppel-like factor 11 (KLF11; also known as Krueppel-like factor 11; Fetal Kruppel-like factor-1/FKLF-1; maturity-onset diabetes of the young 7/MODY7; TGFbeta Inducible Early Growth Response 2/TIEG2) is a protein that in humans is encoded by the KLF11 gene. KLF11 is involved in cell growth, apoptosis, cellular inflammation and differentiation, endometriosis, and cholesterol, prostaglandin, neurotransmitter, fat, and sugar metabolism. KLF9, KLF10, KLF11, KLF13, KLF14, and KLF16 share a conserved alpha-helical motif AA/VXXL that mediates their binding to Sin3A and their activities as transcriptional repressors. KLF11 belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF11.	0
425363	cl41732	KLF6_7_N-like	N-terminal domain of Kruppel-like factor (KLF) 6, KLF7, and similar proteins. Kruppel-like factor 6 (KLF6; also known as Krueppel-like factor 6, BCD1, CBA1, COPEB, CPBP, GBF, PAC1, ST12, or ZF9) is a protein that, in humans, is encoded by the KLF6 gene. KLF6 contributes to cell proliferation, differentiation, cell death, and signal transduction. Hepatocyte expression of KLF6 regulates hepatic fatty acid and glucose metabolism via transcriptional activation of liver glucokinase and post-transcriptional regulation of the nuclear receptor peroxisome proliferator activated receptor alpha (PPARalpha). KLF6 expression contributes to hepatic insulin resistance and the progression of non-alcoholic fatty liver disease (NAFLD) to non-alcoholic steatohepatitis (NASH) and NASH-cirrhosis. KLF6 also affects peroxisome proliferator activated receptor gamma (PPARgamma)-signaling in NAFLD. KLF6 has also been identified as a tumor suppressor gene that is inactivated or downregulated in different cancers, including prostate, colon, and hepatocellular carcinomas. KLF6 transactivates genes controlling cell proliferation, including p21, E-cadherin, and pituitary tumor-transforming gene 1 (PTTG1). KLF6 functions as a transcriptional activator. It belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF6.	0
425364	cl41733	Syt1_2_N	N-terminal domain of synaptotagmin-1 and -2. Syt2, also called synaptotagmin II (SytII), exhibits calcium-dependent phospholipid and inositol polyphosphate binding properties. It may have a regulatory role in the membrane interactions during trafficking of synaptic vesicles at the active zone of the synapse. It plays a role in dendrite formation by melanocytes. The model corresponds to N-terminal domain of Syt2, which is a recognition domain responsible for the binding of botulinum neurotoxin B (BoNT B).	0
425365	cl41734	CoV_Nsp7	coronavirus non-structural protein 7. This model represents the non-structural protein 7 (Nsp7) of deltacoronaviruses that include White-eye coronavirus HKU16 and Quail coronavirus UAE-HKU30, among others. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Upon processing of the Nsp7-10 region by protease M (Mpro), the released four small proteins Nsp7, Nsp8, Nsp9 and Nsp10 form functional complexes with CoV core enzymes and stimulate replication. Most importantly, a complex of Nsp7 with Nsp8 has been shown to activate and confer processivity to the RNA-synthesizing activity of Nsp12, the RNA-dependent RNA-polymerase (RdRp); in SARS-CoV, point mutations in the NSP7- or NSP8-coding region have been shown to delay virus growth. Nsp7 and Nsp8 cooperate in activating the primer-dependent activity of the Nsp12 RdRp such that the level of their association may constitute a limiting factor for obtaining a high RNA polymerase activity. The subsequent Nsp7/Nsp8/Nsp12 polymerase complex is then able to associate with an active bifunctional Nsp14, which includes N-terminal 3' to 5' exoribonuclease (ExoN) and C-terminal N7-guanine cap methyltransferase (N7-MTase) activities, thus representing a unique coronavirus Nsp assembly that incorporates RdRp, exoribonuclease, and N7-MTase activities. Interaction of Nsp7 with Nsp8 appears to be conserved across the coronavirus family, making these proteins interesting drug targets. Nsp7 has a 4-helical bundle conformation which is strongly affected by its interaction with Nsp8, especially where it concerns alpha-helix 4. SARS-CoV Nsp7 forms an 8:8 hexadecameric supercomplex with Nsp8 that adopts a hollow cylinder-like structure with a large central channel and positive electrostatic properties in the cylinder, while Feline infectious peritonitis virus Nsp7 forms a 2:1 heterotrimer with Nsp8. Regardless of their oligomeric structure, the Nsp7/Nsp8 complex functions as a noncanonical RNA polymerase capable of synthesizing RNA of up to template length.	0
425366	cl41735	TBK1_IKKE-like_C	C-terminal domain of non-canonical Inhibitor of kappa B kinases, IKK-E and TBK1, and similar proteins. TANK-binding kinase 1 (TBK1), also called T2K and NF-kB-activating kinase, is a serine/threonine-protein kinase that is widely expressed in most cell types and acts as an IkappaB kinase (IKK)-activating kinase responsible for NF-kB activation in response to growth factors. It plays a role in modulating inflammatory responses through the NF-kB pathway. TBK1 is also a major player in innate immune responses since it functions as a virus-activated kinase necessary for establishing an antiviral state. It phosphorylates IRF-3 and IRF-7, which are important transcription factors for inducing type I interferon during viral infection. TBK1 may also play roles in cell transformation and oncogenesis. In addition, it regulates optineurin (OPTN), an important autophagy receptor involved in several selective autophagy processes. TBK1 contains N-terminal serine/threonine protein kinase, ubiquitin-like (Ubl), coiled-coil domain 1 (CCD1), and C-terminal alpha-helical domains. This model corresponds to a small conserved elongated alpha-helical domain at the C-terminus of TBK1, which is responsible for the binding of its adaptor proteins such as OPTN and NAP1.	0
425367	cl41736	MIU2_RNF168-like	second motif interacting with ubiquitin domain found in RING finger protein 168 and similar domains. RNF168 is an E3 ubiquitin-protein ligase that promotes noncanonical K27 ubiquitination to signal DNA damage. It, together with RNF8, functions as a DNA damage response (DDR) factor that promotes monoubiquitination of H2A/H2AX at K13/15, facilitates recruitment of repair factors p53-binding protein 1 (53BP1) or the RAP80-BRCA1 complex to sites of double-strand breaks (DSBs), and inhibits homologous recombination (HR) in cells deficient in the tumor suppressor BRCA1. RNF168 also promotes H2A neddylation, which antagonizes ubiquitylation of H2A and regulates DNA damage repair. Moreover, RNF168 forms a functional complex with RAD6A or RAD6B during the DNA damage response. RNF168 contains an N-terminal C3HC4-type RING-HC finger that catalyzes H2A-K15ub modification and interacts with H2A, and two MIU (motif interacting with ubiquitin) domains responsible for the interaction with K63 linked poly-ubiquitin. This model corresponds to the second MIU (MIU2) domain of RNF168. The first MIU belongs to a different domain family and is not included here.	0
425368	cl41737	TD_EMAP-like	trimerization domain of the echinoderm microtubule-associated protein-like family. Echinoderm microtubule-associated protein-like 4 (EMAP-4), also called EML4, EMAPL4, restrictedly overexpressed proliferation-associated protein, or Ropp 120, may modify the assembly dynamics of microtubules, such that microtubules are slightly longer, but more dynamic. This model corresponds to the N-terminal trimerization domain of EMAP-4.	0
425369	cl41738	LGNbd_FRMPD1_D4-like	LGN tetratricopeptide repeat-binding domain found in FERM and PDZ domain-containing proteins FRMPD1, FRMPD4, and similar proteins. FRMPD4, also called PDZ domain-containing protein 10 (PDZD10), PDZK10, or PSD-95-interacting regulator of spine morphogenesis (Preso), is a novel PSD-95-interacting FERM and PDZ domain protein that regulates dendritic spine morphogenesis. It acts as a positive regulator of dendritic spine morphogenesis and density. It is required for the maintenance of excitatory synaptic transmission. It binds phosphatidylinositol 4,5-bisphosphate. FRMPD4 contains WW, PDZ and FERM domains in the N-terminal region. This model corresponds to a conserved region in the C-terminal region of FRMPD4 that binds to tetratricopeptide (TPR) repeats present in the N-terminal domain of adaptor protein LGN. LGN plays a crucial role in mitotic spindle orientation and cell polarization via interaction with multiple targets including FRMPD4.	0
425370	cl41739	Nip7_N-like	N-terminal domain of Nip7 and similar proteins. The N-terminal domain of archaeal 60S ribosome subunit biogenesis protein Nip7 co-occurs with a PUA (PseudoUridine synthase and Archaeosine transglycosylase) RNA binding domain. Nip7 is involved in ribosome biogenesis, taking part in 27S pre-rRNA processing and in formation of the 60S ribosomal subunit. Nip7 and its homologs share a two-domain architecture with the C-terminal PUA domain mediating interaction with RNA, suggesting that Nip7 is an adaptor protein with the C-terminal domain interacting with RNA targets and the N-terminal domain mediating interaction with protein targets.	0
425371	cl41740	CC1_SLMAP-like	first coiled-coil (CC1) domain found in Sarcolemmal membrane-associated protein and similar proteins. TRAF3-interacting JNK-activating modulator (T3JAM), also called TRAF3-interacting protein 3 (TRAF3IP3), is a novel protein that specifically interacts with TRAF3 and promotes the activation of JNK. It may function as an adapter molecule that regulates TRAF3-mediated JNK activation. The model corresponds to a conserved region that shows high sequence similarity with the first CC (CC1) domain of Sarcolemmal membrane-associated protein (SLMAP), which is responsible for the binding of suppressor of IKBKE 1 (SIKE1).	0
425372	cl41741	PUA	PUA RNA binding domain. The RNA-binding PUA (PseudoUridine synthase and Archaeosine transglycosylase) domain was detected in a number of proteins involved in RNA metabolism. Members of the thermotogae subfamily of pseudouridine synthases TruB are modules that assist in the binding and positioning (guide and/or substrate) of RNA to the pseudouridine synthase complex. Pseudouridine synthases are enzymes that are responsible for post-transcriptional modifications of RNAs by specifically isomerizing uracil residues. The pseudouridine synthase TruB (also called tRNA pseudouridylate synthase B or Psi55 synthase) is responsible for synthesis of pseudouridine from uracil-55 in the psi GC loop of elongator tRNAs.	0
425373	cl41742	alpha_betaCoV_Nsp1	non-structural protein 1 from alpha- and betacoronavirus. This model represents the non-structural protein 1 (Nsp1) from betacoronavirus in the embecovirus subgenus (A lineage), including murine hepatitis virus (MHV), bovine coronavirus (BCoV) and Human coronavirus HKU1. CoVs utilize a multi-subunit replication/transcription machinery assembled from a set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins. Nsp1 is the N-terminal cleavage product released from the ORF1a polyprotein by the action of papain-like protease (PLpro). Though Nsp1s of alphaCoVs and betaCoVs share structural similarity, they show no significant sequence similarity and may be considered as genus-specific markers. Despite low sequence similarity, the Nsp1s of alphaCoVs and betaCoVs exhibit remarkably similar biological functions, and are involved in the regulation of both host and viral gene expression. CoV Nsp1 induces suppression of host gene expression and interferes with host immune response. It inhibits host gene expression in two ways: by targeting the translation and stability of cellular mRNAs, and by inhibiting mRNA translation and inducing an endonucleolytic RNA cleavage in the 5'-UTR of cellular mRNAs through its tight association with the 40S ribosomal subunit, a key component of the cellular translation machinery. Nsp1 is critical in regulating viral replication and gene expression, as shown by multiple lines of evidence, including: mutations in the Nsp1 coding region of the transmissible gastroenteritis virus (TGEV) and MHV genomes cause drastic reduction or elimination of infectious virus; BCoV Nsp1 is an RNA-binding protein that interacts with cis-acting replication elements in the 5'-UTR of the BCoV genome, implying its potential role in the regulation of viral translation or replication; and SARS-CoV Nsp1 enhances virus replication by binding to a stem-loop structure in the 5'-UTR of its genome.	0
425374	cl41743	betaCoV_Nsp3_betaSM	betacoronavirus-specific marker of betacoronavirus non-structural protein 3. This model represents the betacoronavirus-specific marker (betaSM), also called group 2-specific marker (G2M), of non-structural protein 3 (Nsp3) from betacoronavirus in the merbecovirus subgenus (C lineage), including Middle East respiratory syndrome-related coronavirus (MERS-CoV) and Tylonycteris bat coronavirus HKU4. The betaSM/G2M is located C-terminal to the nucleic acid-binding (NAB) domain. This region is absent in alpha- and deltacoronavirus Nsp3; there is a gammacoronavirus-specific marker (gammaSM) at this position in gammacoronavirus Nsp3. Nsp3 is a large multi-functional multi-domain protein that is an essential component of the replication/transcription complex (RTC), which carries out RNA synthesis, RNA processing, and interference with the host cell innate immune system. Little is known about the betaSM/G2M domain; it is predicted to be non-enzymatic and may be an intrinsically disordered region. The betaSM/G2M domain is part of the predicted PLnc domain (made up of 385 amino acids) of the related SARS-CoV Nsp3 that may function as a replication/transcription scaffold, with interactions to Nsp5, Nsp12, Nsp13, Nsp14, and Nsp16.	0
425375	cl41744	ABC-2_lan_permease	lantibiotic immunity ABC transporter permease (also called ABC-2 transporter permease) subunit. This subfamily contains lantibiotic ABC transporter permease subunits NisG and NsuG, which are highly hydrophobic, integral membrane proteins, and part of the bacitracin ABC transport system that confers resistance to the Gram-positive bacteria in which this system operates, particularly to the lantibiotic nisin. Lantibiotics are small peptides, produced by Gram-positive bacteria, which are ribosomally-synthesized as pre-peptides and act by disrupting membrane integrity. Genes encoding the lantibiotic ABC transporter subunits are highly organized in operons containing all the genes required for maturation, transport, immunity, and synthesis. In Lactococcus lactis and Streptococcus uberis, the lantibiotic nisin is active against other Gram-positive bacteria via various modes of actions; however, its self-protection against the pore-forming nisin is mediated by the ABC transporter composed of NisF, NisE and NisG. In Streptococcus uberis, similar proteins provide self-protection against the pore-forming lantibiotic nisin U. This subfamily contains the NisG and NsuG permease subunits that transport nisin to the surface and expel it from the membrane.	0
425376	cl41745	DEFL	defensin-like domain family. This subfamily includes a group of bactericidal proteins, such as defensins, sapecins, tenecins, phormicins, and lucifensins from bilateria. They are host defense peptides produced in response to injury and mostly active against Gram-positive bacteria. This model corresponds to the defensin-like (DEFL) domain, which adopts a typical structure characterized by cysteine-stabilized alpha/beta scaffold.	0
425377	cl41746	CEN_USH1G_ANKS4B	central domain found in usher syndrome type-1G protein, ankyrin repeat and SAM domain-containing protein 4B, and similar proteins. Usher syndrome type-1G protein (USH1G), also called scaffold protein containing ankyrin repeats and SAM domain (Sans), is an anchoring/scaffolding protein that is part of the functional network formed by USH1C, USH1G, CDH23 and MYO7A, that mediates mechanotransduction in cochlear hair cells. It is required for normal development and maintenance of cochlear hair cell bundles, as well as for normal hearing. USH1G consists of four N-terminal ANK repeats, a central region, and a sterile alpha motif (SAM) followed by a C-terminal type I PDZ binding motif (PBM). This model corresponds to the central region (CEN) of USH1G, which contains the conserved regions CEN1 and CEN2. CEN is directly responsible for binding to the MYO7A MyTH4-FERM tandem.	0
425378	cl41747	RBD_KIF20A-like	RAB6 binding domain (RBD) found in kinesin-like proteins KIF20A, KIF20B, and similar proteins. KIF20A, also called GG10_2, or mitotic kinesin-like protein 2 (MKlp2), or Rab6-interacting kinesin-like protein, or rabkinesin-6, is a mitotic kinesin required for chromosome passenger complex (CPC)-mediated cytokinesis. Following phosphorylation by PLK1, it is involved in recruitment of PLK1 (polo-like kinase 1) to the central spindle. KIF20A interacts with guanosine triphosphate (GTP)-bound forms of RAB6A and RAB6B. It may act as a motor required for the retrograde RAB6 regulated transport of Golgi membranes and associated vesicles along microtubules. KIF20A has a microtubule plus end-directed motility. This model corresponds to RAB6 binding domain (RBD) of KIF20A. KIF20A-RBD is a dimer composed of two parallel alpha helices that form a right-handed coiled-coil additionally stabilized by an inter-helical cysteine bridge.	0
425379	cl41748	CoV_Nsp13-helicase	helicase domain of coronavirus non-structural protein 13. This model represents the helicase domain of non-structural protein 13 (Nsp13) from alphacoronavirus, including Porcine epidemic diarrhea virus and Human coronavirus (CoV) NL63. Helicases catalyze NTP-dependent unwinding of nucleic acid duplexes into single strands and are classified based on the arrangement of conserved motifs into six superfamilies. CoV Nsp13 is a member of the helicase superfamily 1 (SF1); SF1 and SF2 helicases do not form toroidal structures, while SF3-6 helicases do. Nsp13 is a component of the viral RNA synthesis replication and transcription complex (RTC). It is a multidomain protein containing a Cys/His rich zinc-binding domain (CH/ZBD), a stalk domain, a 1B domain involved in nucleic acid substrate binding, and a SF1 helicase core.	0
425380	cl41749	GH2_like	GIPC homology 2 (GH2) domain-like family. DHX8 (a human homolog of yeast Prp22), also called RNA helicase HRH1, is an ATP-dependent RNA helicase involved in pre-mRNA splicing as a component of the spliceosome. It facilitates nuclear export of spliced mRNA by releasing the RNA from the spliceosome. This model corresponds to the GH2-like domain that shows high sequence similarity with the GH2 domain found in GIPC (GAIP C-terminus-interacting protein) family of proteins, which mediate endocytosis by tethering cargo proteins to the motor myosin VI.	0
425381	cl41750	Rcc_KIF21	regulatory coiled-coil domain found in the kinesin-like KIF21 family. KIF21A, also called kinesin-like protein KIF2 or renal carcinoma antigen NY-REN-62, is a microtubule-binding motor protein involved in neuronal axonal transport. It works as a microtubule stabilizer that regulates axonal morphology, suppressing cortical microtubule dynamics in neurons. Mutations in KIF21A cause congenital fibrosis of the extraocular muscles type 1 (CFEOM1). In vitro, it has a plus-end directed motor activity. This model corresponds to the regulatory coiled-coil domain of KIF21A, which folds into an intramolecular antiparallel coiled-coil monomer in solution, but crystallizes into a dimeric domain-swapped antiparallel coiled-coil.	0
425382	cl41751	PPP2R3	serine/threonine protein phosphatase 2A regulatory subunit B". Heterotrimeric serine/threonine protein phosphatase 2A (PP2A) consists of scaffolding (A), catalytic (C), and variable (B, B', and B") subunits. The variable subunits dictate subcellular localization and substrate specificity of the PP2A holoenzyme. This group contains protein phosphatase subunit PR70 (also known as protein phosphatase 2 regulatory subunit B'' subunit beta, PR48, NYREN8, PPP2R3L, or PPP2R3LY) that is encoded by the PPP2R3B gene. This substrate-recognizing subunit of PP2A has a two-domain elongated structure with two calcium EF-hands, each displaying different affinities to Ca2+. PPP2R3B/PR70 is a gonosomal melanoma tumor suppressor gene; PR70 decreased melanoma growth by negatively interfering with DNA replication and cell cycle progression through its role in stabilizing the cell division cycle 6 (CDC6)-chromatin licensing and DNA replication factor 1 (CDT1) interaction, which delays the firing of origins of DNA replication.	0
425383	cl41752	LPS_wlbK-like	Bordetella wlbK gene product domains involved in bacterial polysaccharide synthesis, and similar domains. This model includes the C-terminal domain of the gene wlbJ (also known as bplJ, bplK, wlbjK) product protein, one of 12 genes that is involved in lipopolysaccharide (LPS) synthesis. The lipopolysaccharides (LPS) of Bordetella species are pyrogenic, mitogenic, and toxic, and can activate and induce tumor necrosis factor production in macrophages, similar to endotoxins from other gram-negative bacteria. Also, while the family Enterobacteriaceae expresses smooth-type LPS, the Bordetella LPS molecules differ in chemical structure; B. bronchiseptica and B. parapertussis synthesize a long-chain polysaccharide consisting of a homopolymer of 2,3-dideoxy-2,3-diN-acetylgalactosaminuronic acid (2,3-diNAcGalA), known as O antigen, whereas B. pertussis does not and is therefore more similar to rough-type LPS. This substantial structural difference between the LPS molecules of the three main pathogenic bordetellae likely confers quite different surface properties on the different species. Gene characterization studies show that wlbJ and wlbK are two apparently separate genes in B. pertussis, but are fused into a single open reading frame in B. bronchiseptica and B. parapertussishu. Studies show that mutations in wlbJK do not affect LPS biosynthesis but their function remains unclear.	0
425384	cl41753	SUN_cc1	coiled-coil domain 1 of SUN domain-containing proteins. SUN domain-containing protein 1 (SUN1), also called protein unc-84 homolog A, or Sad1/unc-84 protein-like 1, is a component of the LINC (LInker of Nucleoskeleton and Cytoskeleton) complex which is involved in the connection between the nuclear lamina and the cytoskeleton. Besides the core SUN domain, SUN1 contains two coiled-coil domains (CC1 and CC2), which act as the intrinsic dynamic regulators for controlling the activity of the SUN domain. This model corresponds to CC1 that may function as an activation segment to release CC2-mediated inhibition of the SUN domain.	0
425385	cl41754	SPASM	Iron-sulfur cluster-binding SPASM domain. Butirosin biosynthesis protein N (BtrN), also called S-adenosyl-L-methionine-dependent 2-deoxy-scyllo-inosamine dehydrogenase (EC 1.1.99.38), is a radical S-adenosylmethionine (SAM) enzyme that catalyzes the two-electron oxidation of 2-deoxy-scyllo-inosamine (DOIA) to amino-dideoxy-scyllo-inosose (amino-DOI) in the biosynthetic pathway of the aminoglycoside antibiotic butirosin. Radical SAM enzymes are characterized by a conserved CxxxCxxC motif, which coordinates the conserved iron-sulfur cluster that is involved in the reductive cleavage of SAM and generates a 5'-deoxyadenosyl radical, which in turn abstracts a hydrogen from the appropriately positioned carbon atom of the substrate. Radical SAM enzymes with a C-terminal SPASM domain contain at least one other iron-sulfur cluster. BtrN contains one auxiliary 4Fe-4S cluster.	0
425386	cl41755	LPMO_auxiliary	lytic polysaccharide monooxygenase auxiliary activity protein. Fusolin is a protein found in spindles of insect poxviruses that resembles the lytic polysaccharide monooxygenases of chitinovorous bacteria and may function to disrupt the chitin-rich peritrophic matrix that protects insects against oral infections. Thus, it is a component of the virus occlusion bodies (which are large proteinaceous polyhedra) that protect the virus from the outside environment for extended periods until they are ingested by insect larvae.	0
425387	cl41756	PoNe	Polymorphic Nuclease effector (PoNe) domain is a deoxyribonuclease. The DNase toxin domain called PoNe (Polymorphic Nuclease effector) belongs to a diverse superfamily of PD-(D/E)xK phosphodiesterases, and is associated with several toxin delivery systems including type V, type VI, and type VII. PoNe toxicity is antagonized by cognate immunity proteins (PoNi) containing DUF1911 and DUF1910 domains. This subfamily contains proteins with PoNe domains that typically co-occur with N-terminal domains such as a TANFOR domain which contains uncharacterized single or repeat domains that co-occur with fibronectin type III domains, or a pre-toxin HINT domain, a member of the HINT superfamily of proteases usually found N-terminal to the toxin module in polymorphic toxin systems; the HINT domain is predicted to function in releasing the toxin domain by autoproteolysis.	0
425388	cl41757	cytochrome_P450	cytochrome P450 (CYP) superfamily. Cytochrome P450 family 4, subfamily V, polypeptide 2 (CYP4V2) is the most characterized member of the CYP4V subfamily. It is a selective omega-hydroxylase of saturated, medium-chain fatty acids, such as laurate, myristate and palmitate, with high catalytic efficiency toward myristate. Polymorphisms in the CYP4V2 gene cause Bietti's crystalline corneoretinal dystrophy (BCD), a recessive degenerative retinopathy that is characterized clinically by a progressive decline in central vision, night blindness, and constriction of the visual field. The CYP4V subfamily belongs to the large cytochrome P450 (P450, CYP) superfamily of heme-containing proteins that catalyze a variety of oxidative reactions of a large number of structurally different endogenous and exogenous compounds in organisms from all major domains of life. CYPs bind their diverse ligands in a buried, hydrophobic active site, which is accessed through a substrate access channel formed by two flexible helices and their connecting loop.	0
425389	cl41758	peptidase_C58-like	C58 peptidase domain and similar domains. This family includes the C58 peptidase domain of Pseudomonas syringae HopN1 peptidase, a type III secretion system effector that can suppress plant cell death events in both compatible and incompatible interactions. HopN1's proteolytic activity is dependent upon the invariant C/H/D residues conserved in the C58/YopT family peptidase domain.	0
425390	cl41759	polyA_pol_NCLDV	RNA polyadenylate polymerase of nucleocytoplasmic large DNA viruses. Poly(A) polymerases (PAPs) catalyze the attachment of adenylates to the 3' ends of messenger RNA and other RNAs, forming poly(A) tails. PAP acts as a nucleic acid template-independent NMP-transferase, preferentially utilizing a single species of NTP, namely ATP. The polyadenylation state of an mRNA may correlate with the efficiency of its translation. The catalytic subunit of NCLDV PAPs contains two topologically identical subdomains with a nucleotidyltransferase fold, suggesting that an ancestral duplication was at the origin of these viral PAPs.	0
425391	cl41760	XPF_nuclease-like	nuclease domain of XPF/MUS81 family proteins. Budding yeast Mms4, also known as Eme1 in other organisms, is a putative transcriptional (co)activator that protects Saccharomyces cerevisiae cells from endogenous and environmental DNA damage. It interacts with MUS81 to form a DNA structure-specific endonuclease with substrate preference for branched DNA structures with a 5'-end at the branch nick. Typical substrates include 3'-flap structures, D-loops, replication forks with regressed leading strands and nicked Holliday junctions. The nuclease domain of Mms4 lacks the catalytic motif.	0
425392	cl41761	FIX-like	Found in type sIX effector (FIX) domain of unknown function. The Found in type sIX effector (FIX) domain is found N-terminal to known toxin domains and is genetically and functionally linked to type VI secretion system (T6SS), a widespread mechanism used by Gram-negative bacteria to antagonize neighboring cells. In Vibrio parahaemolyticus, it also co-occurs with C-terminal nuclease toxin PoNe (Polymorphic Nuclease effector) which is associated with several toxin delivery systems including type V, type VI, and type VII. In this subfamily, members contain a FIX domain that generally co-occurs with the C-terminal Ntox15 (Novel toxin 15), a predicted RNase toxin that possesses a conserved HxxD motif, as well as with domains such as DNA/RNA non-specific endonuclease, RhsA domain regions with extended RHS repeats, or DUF4112. Some members also contain an N-terminal PAAR-like (i.e., DUF4280) domain.	0
425393	cl41762	MIX	Marker for type sIX effectors domain. This subfamily contains the MIX (Marker for type sIX effectors) V clan (MIX V) domain. MIX is a marker of type VI secretion system (T6SS) effectors carrying polymorphic C-terminal toxins. Predicted antibacterial activities of the C-terminal toxin domains of Vibrionaceae MIX V effectors include peptidase, peptidoglycan hydrolase, nuclease and pore-forming. Also included in this clan is VPR01S_11_01570, encoded by V. proteolyticus, that carries a CNF1 (cytotoxic necrotizing factor 1) toxin domain and modulates the actin cytoskeleton of eukaryotic phagocytic cells. Some members contain DUF2235, which is predicted as a phospholipase domain. Members of the MIX V clan are shared between marine bacteria via horizontal gene transfer, thereby enhancing their bacterial competitive fitness. Notably, many toxins identified as T6SS effectors do not contain a recognizable delivery domain or signal, suggesting that additional delivery domains may exist.	0
425394	cl41763	ELD_TRPML	extracytosolic/lumenal domain (ELD) found in transient receptor potential channel mucolipins (TRPMLs). TRPML3, also called mucolipin-3 (ML3), acts as Ca(2+)-permeable cation channel with inwardly rectifying activity. It mediates release of Ca(2+) from endosomes to the cytoplasm, contributes to endosomal acidification and is involved in the regulation of membrane trafficking and fusion in the endosomal pathway. The model corresponds to extracytosolic/lumenal domain (ELD), a linker located between the first two transmembrane segments (S1 and S2) of TRPML3. It forms a tight tetramer that is crucial for full-length TRPML3 assembly and localization.	0
425395	cl41764	CdiA-CT_Ec_Kp-like	C-terminal (CT) domain of the contact-dependent growth inhibition (CDI) system (CdiA-CT) of Escherichia coli and Klebsiella pneumoniae CdiA, and similar proteins. This family includes the C-terminal (CT) domain of bacterial CdiA, an effector protein involved in contact-dependent growth inhibition (CDI), a mechanism of inter-bacterial competition. The large CdiA effector protein carries a C-terminal toxin domain (CdiA-CT) which is delivered to neighboring bacteria to inhibit target-cell growth. Many of the domains in this family are associated with RHS repeats N-terminal to the domain. The exact biochemical function of this CdiA-CT is as yet unknown. CDI(+) bacteria also produce a CDI immunity protein (CdiI) to specifically neutralize the CdiA-CT toxins to prevent auto-inhibition. This CdiA-CT binds its cognate CdiI with high affinity.	0
425396	cl41765	EVE-like	EVE and YTH domains belong to the PUA superfamily. Individual members of the YTH family have been shown to selectively remove transcripts of meiosis-specific genes expressed in mitotic cells. In general, eukaryotic YTH-family members may be involved in similar mechanisms to suppress gene regulation during gametogenesis or in other forms of silencing. The YTH domain is a novel RNA-binding domain that has been shown to bind to short, degenerate, single-stranded RNA motifs that loosely follow a consensus sequence. It belongs to the larger PUA superfamily.	0
425397	cl41766	YidC_peri	periplasmic beta-super sandwich fold domain of membrane protein insertase YidC from Gram-negative bacteria and similar domains. This subfamily is composed of Escherichia coli YidC and similar proteins. Membrane protein insertase YidC, also called foldase YidC or membrane integrase YidC, facilitates proper folding, insertion, and assembly of inner membrane proteins and complexes. Depending on the nature of the substrate, YidC functions in a Sec-independent (YidC only) or a Sec-dependent manner as part of a complex containing YidC, the SecYEG channel, and SecDFYajC. YidC belongs to the YidC/Oxa1/Alb3 protein family of insertases that contain a core domain of five transmembrane (TM) segments that is essential to insertase function. In addition to this core transmembrane domain, YidC from Gram-negative bacteria contain an extra transmembrane segment (TM1) at the N-terminus and a large periplasmic domain, located between TM1 and TM2, that adopts a beta-super sandwich fold that is found in sugar-binding proteins such as galactose mutarotase. This periplasmic domain may have a role in protein assembly: a region of YidC that binds to SecF maps to one edge of the beta-super sandwich.	0
425398	cl41767	Rab11BD_RAB3IP_like	Rab11 binding domain of Rab-3A-interacting protein (RAB3IP), Rab-3A-interacting-like protein 1 (RAB3IL1) and similar proteins. RAB3IL1, also called guanine nucleotide exchange factor for Rab-3A (GRAB), or Rab3A-interacting-like protein 1, or Rabin3-like 1, acts as a guanine nucleotide exchange factor (GEF) which promotes the exchange of GDP to GTP, converting inactive GDP-bound Rab proteins into their active GTP-bound form. As a dual Rab-binding protein, RAB3IL1 could potentially link Rab3 and Rab11 and/or Rab8 and Rab11-mediated intracellular trafficking processes. It may activate RAB3A, a GTPase that regulates synaptic vesicle exocytosis. It may also activate RAB8A and RAB8B. In addition, RAB3IL1 interacts with InsP6K1 and plays a role for InsP7 in vesicle exocytosis. The model corresponds to the Rab11a/Rab11b-binding region of RAB3IL1, which lies within its carboxy-terminus, a region distinct from its GEF domain and Rab3a-binding region.	0
425399	cl41768	WH_NTD_SMARCB1_like	N-terminal winged helix DNA-binding domain found in SMARCB1, PHF10 and similar proteins. SMARCB1, also termed BRG1-associated factor 47 (BAF47), or integrase interactor 1 protein (INI1), or SNF5, or SNF5L1, is a core component of the BAF (hSWI/SNF) complex, an ATP-dependent chromatin-remodeling complex that plays important roles in cell proliferation and differentiation, in cellular antiviral activities and inhibition of tumor formation. The model corresponds to the N-terminal winged helix DNA binding domain of SMARCB1, which is structurally related to the SKI/SNO/DAC domain that is found in a number of metazoan chromatin-associated proteins.	0
425400	cl41769	toxin_MLD_like	membrane localization domain (MLD) of Vibrio MARTX, Pasteurella PMT, clostridial glycosylating cytotoxins, toxin effectors BteA (Bordetella T3SS effector A) and related proteins. This family includes the MLD located in the N-terminal minimal membrane-binding segment of BteA (residues 1-131, BteA131), which has also been referred to as the lipid raft targeting (LRT) domain/motif. BteA is a type III secretion system (T3SS) effector protein from Bordetella pertussis, a bacterial respiratory pathogen and the causative agent of whooping cough. The BteA131 segment is multifunctional: in addition to targeting phosphatidylinositol (PI)-rich microdomains in the host membrane, it binds its cognate chaperone BtcA. The MLD adopts a four-helix bundle structure, with a positively charged surface that targets phosphatidylinositol 4,5-bisphosphate (PIP2) in the host membrane via critical arginine and lysine residues. A flexible region preceding the BteA helical bundle contains the characteristic beta-motif required for binding BtcA. This domain has significant sequence similarity to the N-terminal domain of effectors and the endo-domain of RTX-type toxins from Photorhabdus luminescens. This family includes the N-terminal domain of Photorhabdus laumondii Photox toxin; little is known about the N-terminus of Photox, but its C-terminus is an actin-targeting ADP-ribosyltransferase.	0
425401	cl41770	NucC-like	cyclic oligonucleotide-based anti-phage signaling system-associated NucC nuclease and similar proteins. Cyclic oligonucleotide-based anti-phage signaling system (CBASS)-associated NucC nuclease kills phage-infected cells through genome destruction. It is allosterically activated by a cyclic triadenylate (cA3) second messenger that is synthesized by CBASS upon infection. NucC is related to restriction endonucleases but it adopts a homotrimeric structure. Binding of cA3 causes two NucC homotrimers to assemble into a homohexamer, which brings together a pair of active sites to activate DNA cleavage. NucC has also been integrated into type III CRISPR/Cas systems as an accessory nuclease.	0
425402	cl41771	SP6-9_N	N-terminal domains of transcription factor Specificity Proteins (SP) 6-9, and similar proteins. Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP9 plays a role in limb outgrowth. It is expressed during embryogenesis in the forming apical ectodermal ridge, restricted regions of the central nervous system, and tail bud. SP8 and SP9 are two closely related transcription factors that mediate FGF10 signaling, which in turn regulates FGF8 expression which is essential for normal limb development. Both SP8 and SP9 have been found in vertebrates, but only SP8 is present in invertebrates. SP9 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. This model represents the N-terminal domain of SP9.	0
425403	cl41772	Arl6IP1_RETR3-like	ADP-ribosylation factor-like protein 6-interacting protein 1, Reticulophagy regulator 3, and  similar proteins. Reticulophagy regulator 3 (RETR3 or RETREG3), also called FAM134C (family with sequence similarity 134, member C), mediates NRF1-enhanced neurite outgrowth. It interacts with ATG8 family modifier proteins MAP1LC3A, MAP1LC3B, GABARAP, and GABARAPL1. RETREG3/FAM134C contains an N-terminal reticulon-homology domain (RHD) that shows sequence similarity to ADP-ribosylation factor-like 6 binding factor 1 (Arl6IP1 or Arl6ip-1), an endoplasmic reticulum protein that has an important role in cell conduction and material transport. The RHD may function in inducing membrane curvature.	0
425404	cl41773	SP1-4_N	N-terminal domain of transcription factor Specificity Proteins (SP) 1-4. Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in the chicken, and 11 SPs in teleost fish), but arthropods only have 3 SPs. One SP is clade SP1-4, which is expressed ubiquitously throughout development. SP1-4 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. This model represents the N-terminal domain of SP1-4 from arthropods.	0
425405	cl41774	HkD_SF	Hook domain-containing proteins superfamily. Gipie, also called coiled-coil domain-containing protein 88B (CCDC88B), or brain leucine zipper domain-containing protein, or Hook-related protein 3 (HkRP3), is a novel actin cytoskeleton-binding protein and Akt substrate that regulates cell migratory responses in various biological contexts. It acts as a positive regulator of T-cell maturation and inflammatory function. As a microtubule-binding protein, Gipie regulates lytic granule clustering and NK cell killing.	0
425406	cl41775	JMTM_Notch_APP	juxtamembrane and transmembrane (JMTM) domain found in Notch and APP family proteins. Amyloid-like protein 2 (APLP-2), also called amyloid protein homolog (APPH), or CDEI box-binding protein (CDEBP), may play a role in the regulation of hemostasis. Its soluble form may have inhibitory properties towards coagulation factors. APLP-2 may bind to the DNA 5'-GTCACATG-3'(CDEI box). It inhibits trypsin, chymotrypsin, plasmin, factor XIA, and plasma and glandular kallikrein. This model corresponds to juxtamembrane and transmembrane (JMTM) domain of APLP-2, which consists of the intact transmembrane (TM) domain with adjacent N-terminal juxtamembrane (JM) region.	0
425407	cl41776	DLC-like_SF	dynein light chain (DLC)-like domain superfamily. Dynein light chain Tctex-type 3 (DYNLT3), also called rp3, or protein 91/23, or T-complex-associated testis-expressed 1-like, is a non-catalytic accessory component of the cytoplasmic dynein 1 complex that are thought to be involved in linking dynein to cargos and to adapter proteins that regulate dynein function. It has a potential role in chromosome congression in human mitosis and is required for chromosome alignment during mouse oocyte meiotic maturation. The DYNLT3 light chain directly links cytoplasmic dynein to a spindle checkpoint protein, Bub3. The model corresponds to the dynein light chain (DLC)-like domain of DYNLT3.	0
425408	cl41777	Zn-C2H2_CALCOCO1_TAX1BP1_like	autophagy receptor zinc finger-C2H2 domain found in calcium-binding and coiled-coil domain-containing proteins, TAX1BP1 and similar proteins. Spn-F is the central mediator of IK2 kinase-dependent dendrite pruning in Drosophila sensory neurons. It acts downstream of IKK-related kinase Ik2 in the same pathway for dendrite pruning. Spn-F is a coiled-coil protein containing a C2H2-type zinc binding domain.	0
425409	cl41778	GINS_B	beta-strand (B) domain of GINS complex proteins: Sld5, Psf1, Psf2, Psf3, Gins51 and Gins23. The GINS (named from the Japanese go-ichi-ni-san, meaning 5-1-2-3 for the Sld5, Psf1, Psf2, and Psf3 subunits) complex is involved in both initiation and elongation stages of eukaryotic chromosome replication, with GINS being the component that most likely serves as the replicative helicase that unwinds duplex DNA ahead of the moving replication fork. In archaeal DNA replication initiation, homo-hexameric MCM (mini-chromosome maintenance) unwinds the template double-stranded DNA to form the replication fork. MCM is activated by two proteins GINS and GAN (GINS-associated nuclease), which constitute the 'CMG' unwindosome complex together with the MCM core. While eukaryotic GINS complex is a tetrameric arrangement of four subunits Sld5, Psf1, Psf2 and Psf3, the archaeal complex consists of two different proteins, namely Gins51 and Gins23, and forms either an alpha2beta2-type heterotetramer composed of Gins51 and Gins23, or a Gins51-only alpha4-type homotetramer. The archaeal Gins23, as well as eukaryotic Psf2 and Psf3, have the alpha-helical (A) domain at the C-terminus and the beta-strand domain (B) at the N-terminus; this arrangement is called BAtype. The locations and contributions of the archaeal Gins subunit B domain to the tetramer formation, imply the possibility that the archaeal and eukaryotic GINS complexes contribute to DNA unwinding reactions by significantly different mechanisms in terms of the atomic details. This model represents the B-domain of archaeal Gins23.	0
425410	cl41779	akirin	akirin. Akirins are small, highly conserved eumetazoan nuclear proteins that play a role in immune response and tumorigenesis. It is believed that they act as a connector between a variety of transcription factors and major chromatin remodeling complexes. Akirin-2 is one of the two orthologs in vertebrates that plays a role in immunity, myogenesis, and brain- and limb-development. Akirin-2 is partly cytosolic. It has been shown to interact with nuclear importins and therefore may play a role in proper transport between nucleus and cytoplasm.	0
425411	cl41780	CDI_toxin_Bp_tRNase-like	C-terminal (CT) domain of the contact-dependent growth inhibition (CDI) system (CdiA-CT) of Burkholderia pseudomallei, and similar proteins. CDI toxins are expressed by Gram-negative bacteria as part of a mechanism to inhibit the growth of neighboring cells. This model represents the C-terminal (CT) toxin domain of CdiA effector proteins. CdiA secretion is dependent on the outer membrane protein CdiB. Upon binding to a receptor on the surface of target bacteria, the CDI toxin is delivered via the C-terminal domain. A wide variety of C-terminal toxin domains appear to exist; this particular model contains the C-terminal (CT) toxin domains that are similar to Burkholderia pseudomallei E479 and 1026b CdiA toxins, both of which are tRNases.	0
425412	cl41781	pseudoGTPaseD_p190RhoGAP	pseudoGTPase domain found in the family of p190RhoGAP. p190RhoGAP protein A (p190RhoGAP-A), also called Rho GTPase-activating protein 35(RHOGAP35), glucocorticoid receptor DNA-binding factor 1, or glucocorticoid receptor repression factor 1 (GRF-1), or Rho GAP p190A, or p190-A, is a Rho family GTPase-activating protein (GAP) that acts as a key regulator of Rho GTPase signaling and is essential for actin cytoskeletal structure and contractility. It binds several acidic phospholipids which inhibits the Rho GAP activity to promote the Rac GAP activity. This model corresponds to the GTPase-like domain called pseudoGTPase domain that is located at the middle region of p190RhoGAP-A. Rho family GTPase-activating proteins normally have five highly conserved sequence motifs, termed 'G-motifs', required for nucleotide-binding and catalytic activity. PseudoGTPases would consist of a GTPase fold lacking one or more of these G motifs.	0
394960	pfam00001	7tm_1	7 transmembrane receptor (rhodopsin family). This family contains, amongst other G-protein-coupled receptors (GPCRs), members of the opsin family, which have been considered to be typical members of the rhodopsin superfamily. They share several motifs, mainly the seven transmembrane helices, GPCRs of the rhodopsin superfamily. All opsins bind a chromophore, such as 11-cis-retinal. The function of most opsins other than the photoisomerases is split into two steps: light absorption and G-protein activation. Photoisomerases, on the other hand, are not coupled to G-proteins - they are thought to generate and supply the chromophore that is used by visual opsins.	256
394961	pfam00002	7tm_2	7 transmembrane receptor (Secretin family). This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognized. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling.	245
394962	pfam00003	7tm_3	7 transmembrane sweet-taste receptor of 3 GPCR. This is a domain of seven transmembrane regions that forms the C-terminus of some subclass 3 G-protein-coupled receptors. It is often associated with a downstream cysteine-rich linker domain, NCD3G pfam07562, which is the human sweet-taste receptor, and the N-terminal domain, ANF_receptor pfam01094. The seven TM regions assemble in such a way as to produce a docking pocket into which such molecules as cyclamate and lactisole have been found to bind and consequently confer the taste of sweetness.	236
394963	pfam00004	AAA	ATPase family associated with various cellular activities (AAA). AAA family proteins often perform chaperone-like functions that assist in the assembly, operation, or disassembly of protein complexes.	130
394964	pfam00005	ABC_tran	ABC transporter. ABC transporters form a large family of proteins responsible for translocation of a variety of compounds across biological membranes. ABC transporters are the largest family of proteins in many completely sequenced bacteria. ABC transporters are composed of two copies of this domain and two copies of a transmembrane domain pfam00664. These four domains may belong to a single polypeptide as in CFTR, or belong in different polypeptide chains.	150
394965	pfam00006	ATP-synt_ab	ATP synthase alpha/beta family, nucleotide-binding domain. This entry includes the ATP synthase alpha and beta subunits, the ATP synthase associated with flagella and the termination factor Rho.	212
394966	pfam00007	Cys_knot	Cystine-knot domain. The family comprises glycoprotein hormones and the C-terminal domain of various extracellular proteins. It is believed to be involved in disulfide-linked dimerization.	105
394967	pfam00008	EGF	EGF-like domain. There is no clear separation between noise and signal. pfam00053 is very similar, but has 8 instead of 6 conserved cysteines. Includes some cytokine receptors. The EGF domain misses the N-terminus regions of the Ca2+ binding EGF domains (this is the main reason of discrepancy between swiss-prot domain start/end and Pfam). The family is hard to model due to many similar but different sub-types of EGF domains. Pfam certainly misses a number of EGF domains.	31
394968	pfam00009	GTP_EFTU	Elongation factor Tu GTP binding domain. This domain contains a P-loop motif, also found in several other families such as pfam00071, pfam00025 and pfam00063. Elongation factor Tu consists of three structural domains, this plus two C-terminal beta barrel domains.	187
394969	pfam00010	HLH	Helix-loop-helix DNA-binding domain. 	53
365807	pfam00011	HSP20	Hsp20/alpha crystallin family. Not only do small heat-shock-proteins occur in eukaryotes and prokaryotes but they have also now been shown to occur in cyanobacterial phages as well as their bacterial hosts.	100
394970	pfam00012	HSP70	Hsp70 protein. Hsp70 chaperones help to fold many proteins. Hsp70 assisted folding involves repeated cycles of substrate binding and release. Hsp70 activity is ATP dependent. Hsp70 proteins are made up of two regions: the amino terminus is the ATPase domain and the carboxyl terminus is the substrate binding region.	598
394971	pfam00013	KH_1	KH domain. KH motifs bind RNA in vitro. Autoantibodies to Nova, a KH domain protein, cause paraneoplastic opsoclonus ataxia.	65
394972	pfam00014	Kunitz_BPTI	Kunitz/Bovine pancreatic trypsin inhibitor domain. Indicative of a protease inhibitor, usually a serine protease inhibitor. Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Certain family members are similar to the tick anticoagulant peptide (TAP). This is a highly selective inhibitor of factor Xa in the blood coagulation pathways. TAP molecules are highly dipolar, and are arranged to form a twisted two-stranded antiparallel beta-sheet followed by an alpha helix.	53
333767	pfam00015	MCPsignal	Methyl-accepting chemotaxis protein (MCP) signalling domain. This domain is thought to transduce the signal to CheA since it is highly conserved in very diverse MCPs.	172
394973	pfam00016	RuBisCO_large	Ribulose bisphosphate carboxylase large chain, catalytic domain. The C-terminal domain of RuBisCO large chain is the catalytic domain adopting a TIM barrel fold.	292
394974	pfam00017	SH2	SH2 domain. 	77
394975	pfam00018	SH3_1	SH3 domain. SH3 (Src homology 3) domains are often indicative of a protein involved in signal transduction related to cytoskeletal organisation. First described in the Src cytoplasmic tyrosine kinase. The structure is a partly opened beta barrel.	47
394976	pfam00019	TGF_beta	Transforming growth factor beta like domain. 	100
394977	pfam00020	TNFR_c6	TNFR/NGFR cysteine-rich region. 	38
394978	pfam00021	UPAR_LY6	u-PAR/Ly-6 domain. This extracellular disulphide bond rich domain is related to pfam00087.	77
394979	pfam00022	Actin	Actin. 	407
394980	pfam00023	Ank	Ankyrin repeat. Ankyrins are multifunctional adaptors that link specific proteins to the membrane-associated, spectrin-actin cytoskeleton. This repeat-domain is a 'membrane-binding' domain of up to 24 repeated units, and it mediates most of the protein's binding activities. Repeats 13-24 are especially active, with known sites of interaction for the Na/K ATPase, Cl/HCO(3) anion exchanger, voltage-gated sodium channel, clathrin heavy chain and L1 family cell adhesion molecules. The ANK repeats are found to form a contiguous spiral stack such that ion transporters like the anion exchanger associate in a large central cavity formed by the ANK repeat spiral, while clathrin and cell adhesion molecules associate with specific regions outside this cavity.	33
394981	pfam00024	PAN_1	PAN domain. The PAN domain contains a conserved core of three disulphide bridges. In some members of the family there is an additional fourth disulphide bridge that links the N and C termini of the domain. The domain is found in diverse proteins, in some they mediate protein-protein interactions, in others they mediate protein-carbohydrate interactions.	77
394982	pfam00025	Arf	ADP-ribosylation factor family. Pfam combines a number of different Prosite families together.	174
394983	pfam00026	Asp	Eukaryotic aspartyl protease. Aspartyl (acid) proteases include pepsins, cathepsins, and renins. Two-domain structure, probably arising from ancestral duplication. This family does not include the retroviral nor retrotransposon proteases (pfam00077), which are much smaller and appear to be homologous to a single domain of the eukaryotic asp proteases.	313
394984	pfam00027	cNMP_binding	Cyclic nucleotide-binding domain. 	89
394985	pfam00028	Cadherin	Cadherin domain. 	92
394986	pfam00029	Connexin	Connexin. Connexin proteins form gap-junctions between cells. They carry four transmembrane regions, hence why this family now includes Connexin_CCC, which represented the second pair of TMs.	222
394987	pfam00030	Crystall	Beta/Gamma crystallin. The alignment comprises two Greek key motifs since the similarity between them is very low.	82
394988	pfam00031	Cystatin	Cystatin domain. Very diverse family. Attempts to define separate sub-families failed. Typically, either the N-terminal or C-terminal end is very divergent. But splitting into two domains would make very short families. pfam00666 is related to this family but members have not been included.	92
394989	pfam00032	Cytochrom_B_C	Cytochrome b(C-terminal)/b6/petD. 	101
306530	pfam00033	Cytochrome_B	Cytochrome b/b6/petB. 	189
394990	pfam00034	Cytochrom_C	Cytochrome c. The Pfam entry does not include all Prosite members. The cytochrome 556 and cytochrome c' families are not included. All these are now in a new clan together. The C-terminus of DUF989, pfam06181, has now been merged into this family.	89
394991	pfam00035	dsrm	Double-stranded RNA binding motif. Sequences gathered for seed by HMM_iterative_training. Putative motif shared by proteins that bind to dsRNA. At least some DSRM proteins seem to bind to specific RNA targets. Exemplified by Staufen, which is involved in localization of at least five different mRNAs in the early Drosophila embryo. Also by interferon-induced protein kinase in humans, which is part of the cellular response to dsRNA.	66
394992	pfam00036	EF-hand_1	EF hand. The EF-hands can be divided into two classes: signalling proteins and buffering/transport proteins. The first group is the largest and includes the most well-known members of the family such as calmodulin, troponin C and S100B. These proteins typically undergo a calcium-dependent conformational change which opens a target binding site. The latter group is represented by calbindin D9k and do not undergo calcium dependent conformational changes.	28
394993	pfam00037	Fer4	4Fe-4S binding domain. Superfamily includes proteins containing domains which bind to iron-sulfur clusters. Members include bacterial ferredoxins, various dehydrogenases, and various reductases. Structure of the domain is an alpha-antiparallel beta sandwich.	24
365827	pfam00038	Filament	Intermediate filament protein. 	313
394994	pfam00039	fn1	Fibronectin type I domain. 	40
394995	pfam00040	fn2	Fibronectin type II domain. 	42
394996	pfam00041	fn3	Fibronectin type III domain. 	85
394997	pfam00042	Globin	Globin. 	109
394998	pfam00043	GST_C	Glutathione S-transferase, C-terminal domain. GST conjugates reduced glutathione to a variety of targets including S-crystallin from squid, the eukaryotic elongation factor 1-gamma, the HSP26 family of stress-related proteins and auxin-regulated proteins in plants. Stringent starvation proteins in E. coli are also included in the alignment but are not known to have GST activity. The glutathione molecule binds in a cleft between N and C-terminal domains. The catalytically important residues are proposed to reside in the N-terminal domain. In plants, GSTs are encoded by a large gene family (48 GST genes in Arabidopsis) and can be divided into the phi, tau, theta, zeta, and lambda classes.	93
394999	pfam00044	Gp_dh_N	Glyceraldehyde 3-phosphate dehydrogenase, NAD binding domain. GAPDH is a tetrameric NAD-binding enzyme involved in glycolysis and glyconeogenesis. N-terminal domain is a Rossmann NAD(P) binding fold.	101
395000	pfam00045	Hemopexin	Hemopexin. Hemopexin is a heme-binding protein that transports heme to the liver. Hemopexin-like repeats occur in vitronectin and some members of the matrix metallopeptidase family (matrixins). The HX repeats of some matrixins bind tissue inhibitor of metallopeptidases (TIMPs).	44
395001	pfam00046	Homeobox	Homeobox domain. 	55
395002	pfam00047	ig	Immunoglobulin domain. Members of the immunoglobulin superfamily are found in hundreds of proteins of different functions. Examples include antibodies, the giant muscle kinase titin and receptor tyrosine kinases. Immunoglobulin-like domains may be involved in protein-protein and protein-ligand interactions.	86
395003	pfam00048	IL8	Small cytokines (intercrine/chemokine), interleukin-8 like. Includes a number of secreted growth factors and interferons involved in mitogenic, chemotactic, and inflammatory activity. Structure contains two highly conserved disulfide bonds.	60
306545	pfam00049	Insulin	Insulin/IGF/Relaxin family. Superfamily includes insulins; relaxins; insulin-like growth factor; and bombyxin. All are secreted regulatory hormones. Disulfide rich, all-alpha fold. Alignment includes B chain, linker (which is processed out of the final product), and A chain.	77
395004	pfam00050	Kazal_1	Kazal-type serine protease inhibitor domain. Usually indicative of serine protease inhibitors. However, kazal-like domains are also seen in the extracellular part of agrins, which are not known to be protease inhibitors. Kazal domains often occur in tandem arrays. Small alpha+beta fold containing three disulphides. Alignment also includes a single domain from transporters in the OATP/PGT family.	49
395005	pfam00051	Kringle	Kringle domain. Kringle domains have been found in plasminogen, hepatocyte growth factors, prothrombin, and apolipoprotein A. Structure is disulfide-rich, nearly all-beta.	79
395006	pfam00052	Laminin_B	Laminin B (Domain IV). 	136
395007	pfam00053	Laminin_EGF	Laminin EGF domain. This family is like pfam00008 but has 8 conserved cysteines instead of six.	49
395008	pfam00054	Laminin_G_1	Laminin G domain. 	131
395009	pfam00055	Laminin_N	Laminin N-terminal (Domain VI). 	231
395010	pfam00056	Ldh_1_N	lactate/malate dehydrogenase, NAD binding domain. L-lactate dehydrogenases are metabolic enzymes which catalyze the conversion of L-lactate to pyruvate, the last step in anaerobic glycolysis. L-2-hydroxyisocaproate dehydrogenases are also members of the family. Malate dehydrogenases catalyze the interconversion of malate to oxaloacetate. The enzyme participates in the citric acid cycle. L-lactate dehydrogenase is also found as a lens crystallin in bird and crocodile eyes. N-terminus (this family) is a Rossmann NAD-binding fold. C-terminus is an unusual alpha+beta fold.	141
395011	pfam00057	Ldl_recept_a	Low-density lipoprotein receptor domain class A. 	37
395012	pfam00058	Ldl_recept_b	Low-density lipoprotein receptor repeat class B. This domain is also known as the YWTD motif after the most conserved region of the repeat. The YWTD repeat is found in multiple tandem repeats and has been predicted to form a beta-propeller structure.	41
395013	pfam00059	Lectin_C	Lectin C-type domain. This family includes both long and short form C-type	104
395014	pfam00060	Lig_chan	Ligand-gated ion channel. This family includes the four transmembrane regions of the ionotropic glutamate receptors and NMDA receptors.	266
395015	pfam00061	Lipocalin	Lipocalin / cytosolic fatty-acid binding protein family. Lipocalins are transporters for small hydrophobic molecules, such as lipids, steroid hormones, bilins, and retinoids. The family also encompasses the enzyme prostaglandin D synthase (EC:5.3.99.2). Alignment subsumes both the lipocalin and fatty acid binding protein signatures from PROSITE. This is supported on structural and functional grounds. The structure is an eight-stranded beta barrel.	143
395016	pfam00062	Lys	C-type lysozyme/alpha-lactalbumin family. Alpha-lactalbumin is the regulatory subunit of lactose synthase, changing the substrate specificity of galactosyltransferase from N-acetylglucosamine to glucose. C-type lysozymes are secreted bacteriolytic enzymes that cleave the peptidoglycan of bacterial cell walls. Structure is a multi-domain, mixed alpha and beta fold, containing four conserved disulfide bonds.	123
395017	pfam00063	Myosin_head	Myosin head (motor domain). 	674
395018	pfam00064	Neur	Neuraminidase. Neuraminidases cleave sialic acid residues from glycoproteins. Belong to the sialidase family - but this alignment does not generalize to the other sialidases. Structure is a 6-sheet beta propeller.	334
395019	pfam00066	Notch	LNR domain. The LNR (Lin-12/Notch repeat) domain is found in three tandem copies in Notch related proteins. The structure of the domain has been determined by NMR and was shown to contain three disulphide bonds and coordinate a calcium ion. Three repeats are also found in the PAPP-A peptidase.	30
395020	pfam00067	p450	Cytochrome P450. Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures.	461
395021	pfam00068	Phospholip_A2_1	Phospholipase A2. Phospholipase A2 releases fatty acids from the second carbon group of glycerol. Perhaps the best known members are secreted snake venoms, but also found in secreted pancreatic and membrane-associated forms. Structure is all-alpha, with two core disulfide-linked helices and a calcium-binding loop. This alignment represents the major family of PLA2s. A second minor family, defined by the honeybee venom PLA2 Structure 1POC and related sequences from Gila monsters (Heloderma), is not recognized. This minor family conserves the core helix pair but is substantially different elsewhere. The PROSITE pattern PA2_HIS, specific to the first core helix, recognizes both families.	107
395022	pfam00069	Pkinase	Protein kinase domain. 	217
395023	pfam00070	Pyr_redox	Pyridine nucleotide-disulphide oxidoreductase. This family includes both class I and class II oxidoreductases and also NADH oxidases and peroxidases. This domain is actually a small NADH binding domain within a larger FAD binding domain.	80
395024	pfam00071	Ras	Ras family. Includes sub-families Ras, Rab, Rac, Ral, Ran, Rap, Ypt1 and more. Shares P-loop motif with GTP_EFTU, arf and myosin_head. See pfam00009, pfam00025, pfam00063. As regards Rab GTPases, these are important regulators of vesicle formation, motility and fusion. They share a fold in common with all Ras GTPases: this is a six-stranded beta-sheet surrounded by five alpha-helices.	162
395025	pfam00072	Response_reg	Response regulator receiver domain. This domain receives the signal from the sensor partner in bacterial two-component systems. It is usually found N-terminal to a DNA binding effector domain.	111
395026	pfam00073	Rhv	picornavirus capsid protein. CAUTION: This alignment is very weak. It can not be generated by clustalw. If a representative set is used for a seed, many so-called members are not recognized. The family should probably be split up into sub-families. Capsid proteins of picornaviruses. Picornaviruses are non-enveloped plus-strand ssRNA animal viruses with icosahedral capsids. They include rhinovirus (common cold) and poliovirus. Common structure is an 8-stranded beta sandwich. Variations (one or two extra strands) occur.	170
395027	pfam00074	RnaseA	Pancreatic ribonuclease. Ribonucleases. Members include pancreatic RNAase A and angiogenins. Structure is an alpha+beta fold -- long curved beta sheet and three helices.	121
395028	pfam00075	RNase_H	RNase H. RNase H digests the RNA strand of an RNA/DNA hybrid. Important enzyme in retroviral replication cycle, and often found as a domain associated with reverse transcriptases. Structure is a mixed alpha+beta fold with three a/b/a layers.	141
395029	pfam00076	RRM_1	RNA recognition motif. (a.k.a. RRM, RBD, or RNP domain). The RRM motif is probably diagnostic of an RNA binding protein. RRMs are found in a variety of RNA binding proteins, including various hnRNP proteins, proteins implicated in regulation of alternative splicing, and protein components of snRNPs. The motif also appears in a few single stranded DNA binding proteins. The RRM structure consists of four strands and two helices arranged in an alpha/beta sandwich, with a third helix present during RNA binding in some cases. The C-terminal beta strand (4th strand) and final helix are hard to align and have been omitted in the SEED alignment. The LA proteins have an N terminal rrm which is included in the seed. There is a second region towards the C-terminus that has some features characteristic of a rrm but does not appear to have the important structural core of a rrm. The LA proteins are one of the main autoantigens in Systemic lupus erythematosus (SLE), an autoimmune disease.	70
395030	pfam00077	RVP	Retroviral aspartyl protease. Single domain aspartyl proteases from retroviruses, retrotransposons, and badnaviruses (plant dsDNA viruses). These proteases are generally part of a larger polyprotein; usually pol, more rarely gag. Retroviral proteases appear to be homologous to a single domain of the two-domain eukaryotic aspartyl proteases such as pepsins, cathepsins, and renins (pfam00026).	99
395031	pfam00078	RVT_1	Reverse transcriptase (RNA-dependent DNA polymerase). A reverse transcriptase gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. Reverse transcriptases occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses.	189
395032	pfam00079	Serpin	Serpin (serine protease inhibitor). Structure is a multi-domain fold containing a bundle of helices and a beta sandwich.	367
395033	pfam00080	Sod_Cu	Copper/zinc superoxide dismutase (SODC). Superoxide dismutases (SODs) catalyze the conversion of superoxide radicals to hydrogen peroxide and molecular oxygen. Three evolutionarily distinct families of SODs are known, of which the copper/zinc-binding family is one. Defects in the human SOD1 gene cause familial amyotrophic lateral sclerosis (Lou Gehrig's disease). Structure is an eight-stranded beta sandwich, similar to the immunoglobulin fold.	137
395034	pfam00081	Sod_Fe_N	Iron/manganese superoxide dismutases, alpha-hairpin domain. Superoxide dismutases (SODs) catalyze the conversion of superoxide radicals to hydrogen peroxide and molecular oxygen. Three evolutionarily distinct families of SODs are known, of which the Mn/Fe-binding family is one. In humans, there is a cytoplasmic Cu/Zn SOD, and a mitochondrial Mn/Fe SOD. N-terminal domain is a long alpha antiparallel hairpin. A small fragment of YTRE_LEPBI matches well - sequencing error?	82
395035	pfam00082	Peptidase_S8	Subtilase family. Subtilases are a family of serine proteases. They appear to have independently and convergently evolved an Asp/Ser/His catalytic triad, like that found in the trypsin serine proteases (see pfam00089). Structure is an alpha/beta fold containing a 7-stranded parallel beta sheet, order 2314567.	287
395036	pfam00083	Sugar_tr	Sugar (and other) transporter. 	452
395037	pfam00084	Sushi	Sushi repeat (SCR repeat). 	56
395038	pfam00085	Thioredoxin	Thioredoxin. Thioredoxins are small enzymes that participate in redox reactions, via the reversible oxidation of an active centre disulfide bond. Some members with only the active site are not separated from the noise.	103
395039	pfam00086	Thyroglobulin_1	Thyroglobulin type-1 repeat. Thyroglobulin type 1 repeats are thought to be involved in the control of proteolytic degradation. The domain usually contains six conserved cysteines. These form three disulphide bridges. Cysteine 1 pairs with 2, 3 with 4, and 5 with 6.	66
395040	pfam00087	Toxin_TOLIP	Snake toxin and toxin-like protein. This family predominantly includes venomous neurotoxins and cytotoxins from snakes, but also structurally similar (non-snake) toxin-like proteins (TOLIPs) such as Lymphocyte antigen 6D and Ly6/PLAUR domain-containing protein. Snake toxins are short proteins with a compact, disulphide-rich structure. TOLIPs have similar structural features (abundance of spaced cysteine residues, a high frequency of charged residues, a signal peptide for secretion and a compact structure) but are not associated with a venom gland or poisonous function. They are endogenous animal proteins that are not restricted to poisonous animals.	70
395041	pfam00088	Trefoil	Trefoil (P-type) domain. 	43
395042	pfam00089	Trypsin	Trypsin. 	219
395043	pfam00090	TSP_1	Thrombospondin type 1 domain. 	49
395044	pfam00091	Tubulin	Tubulin/FtsZ family, GTPase domain. This family includes the tubulin alpha, beta and gamma chains, as well as the bacterial FtsZ family of proteins. Members of this family are involved in polymer formation. FtsZ is the polymer-forming protein of bacterial cell division. It is part of a ring in the middle of the dividing cell that is required for constriction of cell membrane and cell envelope to yield two daughter cells. FtsZ and tubulin are GTPases. FtsZ can polymerize into tubes, sheets, and rings in vitro and is ubiquitous in eubacteria and archaea. Tubulin is the major component of microtubules.	190
395045	pfam00092	VWA	von Willebrand factor type A domain. 	174
278520	pfam00093	VWC	von Willebrand factor type C domain. The high cutoff was used to prevent overlap with pfam00094.	57
395046	pfam00094	VWD	von Willebrand factor type D domain. Luciferin monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods.	155
395047	pfam00095	WAP	WAP-type (Whey Acidic Protein) 'four-disulfide core'. WAP belongs to the group of Elafin or elastase-specific inhibitors.	42
395048	pfam00096	zf-C2H2	Zinc finger, C2H2 type. The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger. #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter.	23
395049	pfam00097	zf-C3HC4	Zinc finger, C3HC4 type (RING finger). The C3HC4 type zinc-finger (RING finger) is a cysteine-rich domain of 40 to 60 residues that coordinates two zinc ions, and has the consensus sequence: C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-C-X2-C-X(4-48)-C-X2-C where X is any amino acid. Many proteins containing a RING finger play a key role in the ubiquitination pathway.	40
395050	pfam00098	zf-CCHC	Zinc knuckle. The zinc knuckle is a zinc binding motif composed of the following CX2CX4HX4C where X can be any amino acid. The motifs are mostly from retroviral gag proteins (nucleocapsid). Prototype structure is from HIV. Also contains members involved in eukaryotic gene regulation, such as C. elegans GLH-1. Structure is an 18-residue zinc finger.	18
395051	pfam00100	Zona_pellucida	Zona pellucida-like domain. 	247
395052	pfam00101	RuBisCO_small	Ribulose bisphosphate carboxylase, small chain. 	97
395053	pfam00102	Y_phosphatase	Protein-tyrosine phosphatase. 	234
306585	pfam00103	Hormone_1	Somatotropin hormone family. 	214
395054	pfam00104	Hormone_recep	Ligand-binding domain of nuclear hormone receptor. This all helical domain is involved in binding the hormone in these receptors.	208
395055	pfam00105	zf-C4	Zinc finger, C4 type (two domains). In nearly all cases, this is the DNA binding domain of a nuclear hormone receptor. The alignment contains two Zinc finger domains that are too dissimilar to be aligned with each other.	68
395056	pfam00106	adh_short	short chain dehydrogenase. This family contains a wide variety of dehydrogenases.	195
395057	pfam00107	ADH_zinc_N	Zinc-binding dehydrogenase. 	129
395058	pfam00108	Thiolase_N	Thiolase, N-terminal domain. Thiolase is reported to be structurally related to beta-ketoacyl synthase (pfam00109), and also chalcone synthase.	260
395059	pfam00109	ketoacyl-synt	Beta-ketoacyl synthase, N-terminal domain. The structure of beta-ketoacyl synthase is similar to that of the thiolase family (pfam00108) and also chalcone synthase. The active site of beta-ketoacyl synthase is located between the N and C-terminal domains. The N-terminal domain contains most of the structures involved in dimer formation and also the active site cysteine.	249
395060	pfam00110	wnt	wnt family. Wnt genes have been identified in vertebrates and invertebrates but not in plants, unicellular eukaryotes or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families.	304
395061	pfam00111	Fer2	2Fe-2S iron-sulfur cluster binding domain. 	77
395062	pfam00112	Peptidase_C1	Papain family cysteine protease. 	212
395063	pfam00113	Enolase_C	Enolase, C-terminal TIM barrel domain. 	296
395064	pfam00114	Pilin	Pilin (bacterial filament). Proteins with only the short N-terminal methylation site are not separated from the noise. The Prosite pattern detects those better.	108
395065	pfam00115	COX1	Cytochrome C and Quinol oxidase polypeptide I. 	434
395066	pfam00116	COX2	Cytochrome C oxidase subunit II, periplasmic domain. 	120
395067	pfam00117	GATase	Glutamine amidotransferase class-I. 	188
395068	pfam00118	Cpn60_TCP1	TCP-1/cpn60 chaperonin family. This family includes members from the HSP60 chaperone family and the TCP-1 (T-complex protein) family.	489
395069	pfam00119	ATP-synt_A	ATP synthase A chain. 	209
395070	pfam00120	Gln-synt_C	Glutamine synthetase, catalytic domain. 	343
395071	pfam00121	TIM	Triosephosphate isomerase. 	243
395072	pfam00122	E1-E2_ATPase	E1-E2 ATPase. 	181
395073	pfam00123	Hormone_2	Peptide hormone. This family contains glucagon, GIP, secretin and VIP.	28
395074	pfam00124	Photo_RC	Photosynthetic reaction centre protein. 	258
333859	pfam00125	Histone	Core histone H2A/H2B/H3/H4. 	127
395075	pfam00126	HTH_1	Bacterial regulatory helix-turn-helix protein, lysR family. 	60
395076	pfam00127	Copper-bind	Copper binding proteins, plastocyanin/azurin family. 	99
395077	pfam00128	Alpha-amylase	Alpha amylase, catalytic domain. Alpha amylase is classified as family 13 of the glycosyl hydrolases. The structure is an 8 stranded alpha/beta barrel containing the active site, interrupted by a ~70 a.a. calcium-binding domain protruding between beta strand 3 and alpha helix 3, and a carboxyl-terminal Greek key beta-barrel domain.	334
395078	pfam00129	MHC_I	Class I Histocompatibility antigen, domains alpha 1 and 2. 	179
395079	pfam00130	C1_1	Phorbol esters/diacylglycerol binding domain (C1 domain). This domain is also known as the Protein kinase C conserved region 1 (C1) domain.	53
395080	pfam00131	Metallothio	Metallothionein. 	65
395081	pfam00132	Hexapep	Bacterial transferase hexapeptide (six repeats). 	30
395082	pfam00133	tRNA-synt_1	tRNA synthetases class I (I, L, M and V). Other tRNA synthetase sub-families are too dissimilar to be included.	603
395083	pfam00134	Cyclin_N	Cyclin, N-terminal domain. Cyclins regulate cyclin dependent kinases (CDKs). Cyclin-0 (CCNO) is a Uracil-DNA glycosylase that is related to other cyclins. Cyclins contain two domains of similar all-alpha fold, of which this family corresponds with the N-terminal domain.	127
395084	pfam00135	COesterase	Carboxylesterase family. 	513
395085	pfam00136	DNA_pol_B	DNA polymerase family B. This region of DNA polymerase B appears to consist of more than one structural domain, possibly including elongation, DNA-binding and dNTP binding activities.	439
395086	pfam00137	ATP-synt_C	ATP synthase subunit C. 	60
395087	pfam00139	Lectin_legB	Legume lectin domain. 	244
395088	pfam00140	Sigma70_r1_2	Sigma-70 factor, region 1.2. 	33
395089	pfam00141	peroxidase	Peroxidase. 	187
395090	pfam00142	Fer4_NifH	4Fe-4S iron sulfur cluster binding proteins, NifH/frxC family. 	271
395091	pfam00143	Interferon	Interferon alpha/beta domain. 	160
395092	pfam00144	Beta-lactamase	Beta-lactamase. This family appears to be distantly related to pfam00905 and PF00768 D-alanyl-D-alanine carboxypeptidase.	327
395093	pfam00145	DNA_methylase	C-5 cytosine-specific DNA methylase. 	324
395094	pfam00146	NADHdh	NADH dehydrogenase. 	302
395095	pfam00147	Fibrinogen_C	Fibrinogen beta and gamma chains, C-terminal globular domain. 	221
395096	pfam00148	Oxidored_nitro	Nitrogenase component 1 type Oxidoreductase. 	398
395097	pfam00149	Metallophos	Calcineurin-like phosphoesterase. This family includes a diverse range of phosphoesterases, including protein phosphoserine phosphatases, nucleotidases, sphingomyelin phosphodiesterases and 2'-3' cAMP phosphodiesterases as well as nucleases such as bacterial SbcD or yeast MRE11. The most conserved regions in this superfamily centre around the metal chelating residues.	191
395098	pfam00150	Cellulase	Cellulase (glycosyl hydrolase family 5). 	272
395099	pfam00151	Lipase	Lipase. 	336
395100	pfam00152	tRNA-synt_2	tRNA synthetases class II (D, K and N). 	315
395101	pfam00153	Mito_carr	Mitochondrial carrier protein. 	96
395102	pfam00154	RecA	recA bacterial DNA recombination protein. RecA is a DNA-dependent ATPase and functions in DNA repair systems. RecA protein catalyzes an ATP-dependent DNA strand-exchange reaction that is the central step in the repair of dsDNA breaks by homologous recombination.	262
395103	pfam00155	Aminotran_1_2	Aminotransferase class I and II. 	351
395104	pfam00156	Pribosyltran	Phosphoribosyl transferase domain. This family includes a range of diverse phosphoribosyl transferase enzymes. This family includes: Adenine phosphoribosyl-transferase EC:2.4.2.7, Hypoxanthine-guanine-xanthine phosphoribosyl-transferase, Hypoxanthine phosphoribosyl-transferase EC:2.4.2.8. Ribose-phosphate pyrophosphokinase i EC:2.7.6.1. Amidophosphoribosyltransferase EC:2.4.2.14. Orotate phosphoribosyl-transferase EC:2.4.2.10, Uracil phosphoribosyl-transferase EC:2.4.2.9, Xanthine-guanine phosphoribosyl-transferase EC:2.4.2.22. In Arabidopsis, at the very N-terminus of this domain is the P-Loop NTPase domain.	150
395105	pfam00157	Pou	Pou domain - N-terminal to homeobox domain. 	69
395106	pfam00158	Sigma54_activat	Sigma-54 interaction domain. 	168
395107	pfam00159	Hormone_3	Pancreatic hormone peptide. 	35
395108	pfam00160	Pro_isomerase	Cyclophilin type peptidyl-prolyl cis-trans isomerase/CLD. The peptidyl-prolyl cis-trans isomerases, also known as cyclophilins, share this domain of about 109 amino acids. Cyclophilins have been found in all organisms studied so far and catalyze peptidyl-prolyl isomerisation during which the peptide bond preceding proline (the peptidyl-prolyl bond) is stabilized in the cis conformation. Mammalian cyclophilin A (CypA) is a major cellular target for the immunosuppressive drug cyclosporin A (CsA). Other roles for cyclophilins may include chaperone and cell signalling function.	150
395109	pfam00161	RIP	Ribosome inactivating protein. 	198
395110	pfam00162	PGK	Phosphoglycerate kinase. 	370
395111	pfam00163	Ribosomal_S4	Ribosomal protein S4/S9 N-terminal domain. This family includes small ribosomal subunit S9 from prokaryotes and S16 from metazoans. This domain is predicted to bind to ribosomal RNA. This domain is composed of four helices in the known structure. However, the domain is discontinuous in sequence and the alignment for this family contains only the first three helices.	87
395112	pfam00164	Ribosom_S12_S23	Ribosomal protein S12/S23. This protein is known as S12 in bacteria and archaea and S23 in eukaryotes.	114
395113	pfam00165	HTH_AraC	Bacterial regulatory helix-turn-helix proteins, AraC family. In the absence of arabinose, the N-terminal arm of AraC binds to the DNA binding domain (pfam00165) and helps to hold the two DNA binding domains in a relative orientation that favours DNA looping. In the presence of arabinose, the arms bind over the arabinose on the dimerization domain, thus freeing the DNA-binding domains. The freed DNA-binding domains are then able to assume a conformation suitable for binding to the adjacent DNA sites that are utilized when AraC activates transcription, and hence AraC ceases looping the DNA when arabinose is added.	42
395114	pfam00166	Cpn10	Chaperonin 10 Kd subunit. This family contains GroES and Gp31-like chaperonins. Gp31 is a functional co-chaperonin that is required for the folding and assembly of Gp23, a major capsid protein, during phage morphogenesis.	92
395115	pfam00167	FGF	Fibroblast growth factor. Fibroblast growth factors are a family of proteins involved in growth and differentiation in a wide range of contexts. They are found in a wide range of organisms, from nematodes to humans. Most share an internal core region of high similarity, conserved residues in which are involved in binding with their receptors. On binding, they cause dimerization of their tyrosine kinase receptors leading to intracellular signalling. There are currently four known tyrosine kinase receptors for fibroblast growth factors. These receptors can each bind several different members of this family. Members of this family have a beta trefoil structure. Most have N-terminal signal peptides and are secreted. A few lack signal sequences but are secreted anyway; still others also lack the signal peptide but are found on the cell surface and within the extracellular matrix. A third group remain intracellular. They have central roles in development, regulating cell proliferation, migration and differentiation. On the other hand, they are important in tissue repair following injury in adult organisms.	124
395116	pfam00168	C2	C2 domain. 	104
395117	pfam00169	PH	PH domain. PH stands for pleckstrin homology.	105
395118	pfam00170	bZIP_1	bZIP transcription factor. The Pfam entry includes the basic region and the leucine zipper region.	60
395119	pfam00171	Aldedh	Aldehyde dehydrogenase family. This family of dehydrogenases acts on aldehyde substrates. Members use NADP as a cofactor. The family includes the following members: The prototypical members are the aldehyde dehydrogenases EC:1.2.1.3. Succinate-semialdehyde dehydrogenase EC:1.2.1.16. Lactaldehyde dehydrogenase EC:1.2.1.22. Benzaldehyde dehydrogenase EC:1.2.1.28. Methylmalonate-semialdehyde dehydrogenase EC:1.2.1.27. Glyceraldehyde-3-phosphate dehydrogenase EC:1.2.1.9. Delta-1-pyrroline-5-carboxylate dehydrogenase EC:1.5.1.12. Acetaldehyde dehydrogenase EC:1.2.1.10. Glutamate-5-semialdehyde dehydrogenase EC:1.2.1.41. This family also includes omega crystallin, an eye lens protein from squid and octopus that has little aldehyde dehydrogenase activity.	458
395120	pfam00172	Zn_clus	Fungal Zn(2)-Cys(6) binuclear cluster domain. 	39
395121	pfam00173	Cyt-b5	Cytochrome b5-like Heme/Steroid binding domain. This family includes heme binding domains from a diverse range of proteins. This family also includes proteins that bind to steroids. The family includes progesterone receptors. Many members of this subfamily are membrane anchored by an N-terminal transmembrane alpha helix. This family also includes a domain in some chitin synthases. There is no known ligand for this domain in the chitin synthases.	74
395122	pfam00174	Oxidored_molyb	Oxidoreductase molybdopterin binding domain. This domain is found in a variety of oxidoreductases. This domain binds to a molybdopterin cofactor. Xanthine dehydrogenases, that also bind molybdopterin, have essentially no similarity.	168
395123	pfam00175	NAD_binding_1	Oxidoreductase NAD-binding domain. Xanthine dehydrogenases, that also bind FAD/NAD, have essentially no similarity.	109
395124	pfam00176	SNF2_N	SNF2 family N-terminal domain. This domain is found in proteins involved in a variety of processes including transcription regulation (e.g., SNF2, STH1, brahma, MOT1), DNA repair (e.g., ERCC6, RAD16, RAD5), DNA recombination (e.g., RAD54), and chromatin unwinding (e.g., ISWI) as well as a variety of other proteins with little functional information (e.g., lodestar, ETL1).	305
395125	pfam00177	Ribosomal_S7	Ribosomal protein S7p/S5e. This family contains ribosomal protein S7 from prokaryotes and S5 from eukaryotes.	132
395126	pfam00178	Ets	Ets-domain. 	80
395127	pfam00179	UQ_con	Ubiquitin-conjugating enzyme. Proteins destined for proteasome-mediated degradation may be ubiquitinated. Ubiquitination follows conjugation of ubiquitin to a conserved cysteine residue of UBC homologs. TSG101 is one of several UBC homologs that lacks this active site cysteine.	139
395128	pfam00180	Iso_dh	Isocitrate/isopropylmalate dehydrogenase. 	349
395129	pfam00181	Ribosomal_L2	Ribosomal Proteins L2, RNA binding domain. 	77
395130	pfam00182	Glyco_hydro_19	Chitinase class I. 	232
395131	pfam00183	HSP90	Hsp90 protein. 	515
395132	pfam00184	Hormone_5	Neurohypophysial hormones, C-terminal Domain. N-terminal Domain is in hormone5.	79
395133	pfam00185	OTCace	Aspartate/ornithine carbamoyltransferase, Asp/Orn binding domain. 	156
395134	pfam00186	DHFR_1	Dihydrofolate reductase. 	159
395135	pfam00187	Chitin_bind_1	Chitin recognition protein. 	36
395136	pfam00188	CAP	Cysteine-rich secretory protein family. This is a large family of cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins (CAP) that are found in a wide range of organisms, including prokaryotes and non-vertebrate eukaryotes. The nine subfamilies of the mammalian CAP 'super'family include: the human glioma pathogenesis-related 1 (GLIPR1), Golgi associated pathogenesis related-1 (GAPR1) proteins, peptidase inhibitor 15 (PI15), peptidase inhibitor 16 (PI16), cysteine-rich secretory proteins (CRISPs), CRISP LCCL domain containing 1 (CRISPLD1), CRISP LCCL domain containing 2 (CRISPLD2), mannose receptor like and the R3H domain containing like proteins. Members are most often secreted and have an extracellular endocrine or paracrine function and are involved in processes including the regulation of extracellular matrix and branching morphogenesis, potentially as either proteases or protease inhibitors; in ion channel regulation in fertility; as tumor suppressor or pro-oncogenic genes in tissues including the prostate; and in cell-cell adhesion during fertilisation. The overall protein structural conservation within the CAP 'super'family results in fundamentally similar functions for the CAP domain in all members, yet the diversity outside of this core region dramatically alters the target specificity and, thus, the biological consequences. The Ca++-chelating function would fit with the various signalling processes (e.g. the CRISP proteins) that members of this family are involved in, and also the sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how the cysteine-rich venom protein helothermine blocks the Ca++ transporting ryanodine receptors.	117
395137	pfam00189	Ribosomal_S3_C	Ribosomal protein S3, C-terminal domain. This family contains a central domain pfam00013, hence the amino and carboxyl terminal domains are stored separately. This is a minimal carboxyl-terminal domain. Some are much longer.	83
395138	pfam00190	Cupin_1	Cupin. This family represents the conserved barrel domain of the 'cupin' superfamily ('cupa' is the Latin term for a small barrel). This family contains 11S and 7S plant seed storage proteins, and germins. Plant seed storage proteins provide the major nitrogen source for the developing plant.	151
395139	pfam00191	Annexin	Annexin. This family of annexins also includes giardin that has been shown to function as an annexin.	66
395140	pfam00193	Xlink	Extracellular link domain. 	93
395141	pfam00194	Carb_anhydrase	Eukaryotic-type carbonic anhydrase. 	232
395142	pfam00195	Chal_sti_synt_N	Chalcone and stilbene synthases, N-terminal domain. The C-terminal domain of Chalcone synthase is reported to be structurally similar to domains in thiolase and beta-ketoacyl synthase. The differences in activity are accounted for by differences in this N-terminal domain.	225
395143	pfam00196	GerE	Bacterial regulatory proteins, luxR family. 	57
395144	pfam00197	Kunitz_legume	Trypsin and protease inhibitor. 	174
395145	pfam00198	2-oxoacid_dh	2-oxoacid dehydrogenases acyltransferase (catalytic domain). These proteins contain one to three copies of a lipoyl binding domain followed by the catalytic domain.	212
395146	pfam00199	Catalase	Catalase. 	383
395147	pfam00200	Disintegrin	Disintegrin. 	75
278624	pfam00201	UDPGT	UDP-glucuronosyl and UDP-glucosyl transferase. 	499
395148	pfam00202	Aminotran_3	Aminotransferase class-III. 	397
395149	pfam00203	Ribosomal_S19	Ribosomal protein S19. 	80
395150	pfam00204	DNA_gyraseB	DNA gyrase B. This family represents the second domain of DNA gyrase B which has a ribosomal S5 domain 2-like fold. This family is structurally related to PF01119.	173
395151	pfam00205	TPP_enzyme_M	Thiamine pyrophosphate enzyme, central domain. The central domain of TPP enzymes contains a 2-fold Rossmann fold.	137
395152	pfam00206	Lyase_1	Lyase. 	312
395153	pfam00207	A2M	Alpha-2-macroglobulin family. This family includes the C-terminal region of the alpha-2-macroglobulin family.	91
395154	pfam00208	ELFV_dehydrog	Glutamate/Leucine/Phenylalanine/Valine dehydrogenase. 	240
395155	pfam00209	SNF	Sodium:neurotransmitter symporter family. These are twelve xTM-containing region transporters.	517
395156	pfam00210	Ferritin	Ferritin-like domain. This family contains ferritins and other ferritin-like proteins such as members of the DPS family and bacterioferritins.	141
306677	pfam00211	Guanylate_cyc	Adenylate and Guanylate cyclase catalytic domain. 	183
395157	pfam00212	ANP	Atrial natriuretic peptide. 	31
395158	pfam00213	OSCP	ATP synthase delta (OSCP) subunit. The ATP D subunit from E. coli is the same as the OSCP subunit, which this family represents. The ATP D subunit from metazoa is found in family pfam00401.	171
395159	pfam00214	Calc_CGRP_IAPP	Calcitonin / CGRP / IAPP family. 	124
395160	pfam00215	OMPdecase	Orotidine 5'-phosphate decarboxylase / HUMPS family. This family includes Orotidine 5'-phosphate decarboxylase enzymes EC:4.1.1.23 that are involved in the final step of pyrimidine biosynthesis. The family also includes enzymes such as hexulose-6-phosphate synthase. This family appears to be distantly related to pfam00834.	215
395161	pfam00216	Bac_DNA_binding	Bacterial DNA-binding protein. 	88
395162	pfam00217	ATP-gua_Ptrans	ATP:guanido phosphotransferase, C-terminal catalytic domain. The substrate binding site is located in the cleft between N and C-terminal domains, but most of the catalytic residues are found in the larger C-terminal domain.	212
395163	pfam00218	IGPS	Indole-3-glycerol phosphate synthase. 	252
395164	pfam00219	IGFBP	Insulin-like growth factor binding protein. 	53
395165	pfam00220	Hormone_4	Neurohypophysial hormones, N-terminal Domain. The C-terminal domain is in Hormone_5 (pfam00184).	9
395166	pfam00221	Lyase_aromatic	Aromatic amino acid lyase. This family includes proteins with phenylalanine ammonia-lyase, EC:4.3.1.24, histidine ammonia-lyase, EC:4.3.1.3, and tyrosine aminomutase, EC:5.4.3.6, activities.	465
395167	pfam00223	PsaA_PsaB	Photosystem I psaA/psaB protein. 	712
395168	pfam00224	PK	Pyruvate kinase, barrel domain. This domain is actually a small beta-barrel domain nested within a larger TIM barrel. The active site is found in a cleft between the two domains.	348
395169	pfam00225	Kinesin	Kinesin motor domain. 	326
395170	pfam00226	DnaJ	DnaJ domain. DnaJ domains (J-domains) are associated with the hsp70 heat-shock system, and this domain is thought to mediate the interaction. The DnaJ domain is therefore part of a chaperone (protein folding) system. The T-antigens, although not in Prosite, are confirmed in the literature to contain DnaJ domains.	63
395171	pfam00227	Proteasome	Proteasome subunit. The proteasome is a multisubunit structure that degrades proteins. Protein degradation is an essential component of regulation because proteins can become misfolded, damaged, or unnecessary. Proteasomes and their homologs vary greatly in complexity: from HslV (heat shock locus v), which is encoded by 1 gene in bacteria, to the eukaryotic 20S proteasome, which is encoded by more than 14 genes. Recently, two novel groups of bacterial proteasomes were proposed. The first is Anbu, which is sparsely distributed among cyanobacteria and proteobacteria. The second is called beta-proteobacteria proteasome homolog (BPH).	188
395172	pfam00228	Bowman-Birk_leg	Bowman-Birk serine protease inhibitor family. 	24
395173	pfam00229	TNF	TNF (Tumor Necrosis Factor) family. 	127
395174	pfam00230	MIP	Major intrinsic protein. MIP (Major Intrinsic Protein) family proteins exhibit essentially two distinct types of channel properties: (1) specific water transport by the aquaporins, and (2) small neutral solutes transport, such as glycerol by the glycerol facilitators.	223
395175	pfam00231	ATP-synt	ATP synthase. 	285
395176	pfam00232	Glyco_hydro_1	Glycosyl hydrolase family 1. 	453
395177	pfam00233	PDEase_I	3'5'-cyclic nucleotide phosphodiesterase. 	236
395178	pfam00234	Tryp_alpha_amyl	Protease inhibitor/seed storage/LTP family. This family is composed of trypsin-alpha amylase inhibitors, seed storage proteins and lipid transfer proteins from plants.	75
395179	pfam00235	Profilin	Profilin. 	124
395180	pfam00236	Hormone_6	Glycoprotein hormone. 	89
395181	pfam00237	Ribosomal_L22	Ribosomal protein L22p/L17e. This family includes L22 from prokaryotes and chloroplasts and L17 from eukaryotes.	104
395182	pfam00238	Ribosomal_L14	Ribosomal protein L14p/L23e. 	119
395183	pfam00239	Resolvase	Resolvase, N terminal domain. The N-terminal domain of the resolvase family (this family) contains the active site and the dimer interface. The extended arm at the C-terminus of this domain connects to the C-terminal helix-turn-helix domain of resolvase - see pfam02796.	144
395184	pfam00240	ubiquitin	Ubiquitin family. This family contains a number of ubiquitin-like proteins: SUMO (smt3 homolog), Nedd8, Elongin B, Rub1, and Parkin. A number of them are thought to carry a distinctive five-residue motif termed the proteasome-interacting motif (PIM), which may have a biologically significant role in protein delivery to proteasomes and recruitment of proteasomes to transcription sites.	72
395185	pfam00241	Cofilin_ADF	Cofilin/tropomyosin-type actin-binding protein. Severs actin filaments and binds to actin monomers.	124
365972	pfam00242	DNA_pol_viral_N	DNA polymerase (viral) N-terminal domain. 	401
395186	pfam00243	NGF	Nerve growth factor family. 	111
395187	pfam00244	14-3-3	14-3-3 protein. 	221
395188	pfam00245	Alk_phosphatase	Alkaline phosphatase. 	418
395189	pfam00246	Peptidase_M14	Zinc carboxypeptidase. 	287
395190	pfam00248	Aldo_ket_red	Aldo/keto reductase family. This family includes a number of K+ ion channel beta chain regulatory domains - these are reported to have oxidoreductase activity.	290
395191	pfam00249	Myb_DNA-binding	Myb-like DNA-binding domain. This family contains the DNA binding domains from Myb proteins, as well as the SANT domain family.	46
395192	pfam00250	Forkhead	Forkhead domain. 	86
395193	pfam00251	Glyco_hydro_32N	Glycosyl hydrolases family 32 N-terminal domain. This domain corresponds to the N-terminal domain of glycosyl hydrolase family 32 which forms a five bladed beta propeller structure.	308
395194	pfam00252	Ribosomal_L16	Ribosomal protein L16p/L10e. 	132
395195	pfam00253	Ribosomal_S14	Ribosomal protein S14p/S29e. This family includes both ribosomal S14 from prokaryotes and S29 from eukaryotes.	54
395196	pfam00254	FKBP_C	FKBP-type peptidyl-prolyl cis-trans isomerase. 	94
395197	pfam00255	GSHPx	Glutathione peroxidase. 	108
395198	pfam00257	Dehydrin	Dehydrin. 	144
395199	pfam00258	Flavodoxin_1	Flavodoxin. 	141
365984	pfam00260	Protamine_P1	Protamine P1. 	49
395200	pfam00261	Tropomyosin	Tropomyosin. Tropomyosin is an alpha-helical protein that forms a coiled-coil structure of 2 parallel helices containing 2 sets of 7 alternating actin binding sites. The protein is best known for its role in regulating the interaction between actin and myosin in muscle contraction, but is also involved in the organisation and dynamics of the cytoskeleton in non-muscle cells. There are multiple cell-specific isoforms, expressed by alternative promoters and alternative RNA processing of at least four genes. Muscle isoforms of tropomyosin are characterized by having 284 amino acid residues and a highly conserved N-terminal region, whereas non-muscle forms are generally smaller and are heterogeneous in their N-terminal region.	235
395201	pfam00262	Calreticulin	Calreticulin family. 	369
395202	pfam00263	Secretin	Bacterial type II and III secretion system protein. 	161
395203	pfam00264	Tyrosinase	Common central domain of tyrosinase. This family also contains polyphenol oxidases and some hemocyanins. Binds two copper ions via two sets of three histidines. This family is related to pfam00372.	209
306721	pfam00265	TK	Thymidine kinase. 	176
395204	pfam00266	Aminotran_5	Aminotransferase class-V. This domain is found in amino transferases, and other enzymes including cysteine desulphurase EC:4.4.1.-.	368
395205	pfam00267	Porin_1	Gram-negative porin. 	335
395206	pfam00268	Ribonuc_red_sm	Ribonucleotide reductase, small chain. 	276
395207	pfam00269	SASP	Small, acid-soluble spore proteins, alpha/beta type. 	58
395208	pfam00270	DEAD	DEAD/DEAH box helicase. Members of this family include the DEAD and DEAH box helicases. Helicases are involved in unwinding nucleic acids. The DEAD box helicases are involved in various aspects of RNA metabolism, including nuclear transcription, pre mRNA splicing, ribosome biogenesis, nucleocytoplasmic transport, translation, RNA decay and organellar gene expression.	165
395209	pfam00271	Helicase_C	Helicase conserved C-terminal domain. The Prosite family is restricted to DEAD/H helicases, whereas this domain family is found in a wide variety of helicases and helicase related proteins. It may be that this is not an autonomously folding unit, but an integral part of the helicase.	109
395210	pfam00272	Cecropin	Cecropin family. 	30
395211	pfam00273	Serum_albumin	Serum albumin family. 	176
395212	pfam00274	Glycolytic	Fructose-bisphosphate aldolase class-I. 	349
395213	pfam00275	EPSP_synthase	EPSP synthase (3-phosphoshikimate 1-carboxyvinyltransferase). 	415
395214	pfam00276	Ribosomal_L23	Ribosomal protein L23. 	85
395215	pfam00277	SAA	Serum amyloid A protein. 	101
395216	pfam00278	Orn_DAP_Arg_deC	Pyridoxal-dependent decarboxylase, C-terminal sheet domain. These pyridoxal-dependent decarboxylases act on ornithine, lysine, arginine and related substrates.	346
395217	pfam00280	potato_inhibit	Potato inhibitor I family. 	64
395218	pfam00281	Ribosomal_L5	Ribosomal protein L5. 	56
395219	pfam00282	Pyridoxal_deC	Pyridoxal-dependent decarboxylase conserved domain. 	373
395220	pfam00283	Cytochrom_B559	Cytochrome b559, alpha (gene psbE) and beta (gene psbF) subunits. 	29
395221	pfam00284	Cytochrom_B559a	Lumenal portion of Cytochrome b559, alpha (gene psbE) subunit. This family is the lumenal portion of the cytochrome b559 alpha chain; matches to this family should be accompanied by a match to the pfam00283 family. The Prosite pattern matches the transmembrane region of the cytochrome b559 alpha and beta subunits.	38
395222	pfam00285	Citrate_synt	Citrate synthase, C-terminal domain. This is the long, C-terminal part of the enzyme.	358
395223	pfam00286	Flexi_CP	Viral coat protein. Family includes coat proteins from Potexviruses and carlaviruses.	138
395224	pfam00287	Na_K-ATPase	Sodium / potassium ATPase beta chain. 	272
395225	pfam00288	GHMP_kinases_N	GHMP kinases N terminal domain. This family includes homoserine kinases, galactokinases and mevalonate kinases.	64
395226	pfam00289	Biotin_carb_N	Biotin carboxylase, N-terminal domain. This domain is structurally related to the PreATP-grasp domain. The family contains the N-terminus of biotin carboxylase enzymes, and propionyl-CoA carboxylase A chain.	109
395227	pfam00290	Trp_syntA	Tryptophan synthase alpha chain. 	258
395228	pfam00291	PALP	Pyridoxal-phosphate dependent enzyme. Members of this family are all pyridoxal-phosphate dependent enzymes. This family includes: serine dehydratase EC:4.2.1.13 P20132, threonine dehydratase EC:4.2.1.16, tryptophan synthase beta chain EC:4.2.1.20, threonine synthase EC:4.2.99.2, cysteine synthase EC:4.2.99.8 P11096, cystathionine beta-synthase EC:4.2.1.22, 1-aminocyclopropane-1-carboxylate deaminase EC:4.1.99.4.	295
366005	pfam00292	PAX	'Paired box' domain. 	125
395229	pfam00293	NUDIX	NUDIX domain. 	132
395230	pfam00294	PfkB	pfkB family carbohydrate kinase. This family includes a variety of carbohydrate and pyrimidine kinases.	289
395231	pfam00295	Glyco_hydro_28	Glycosyl hydrolases family 28. Glycosyl hydrolase family 28 includes polygalacturonase EC:3.2.1.15 as well as rhamnogalacturonase A (RGase A), EC:3.2.1.-. These enzymes are important in cell wall metabolism.	321
395232	pfam00296	Bac_luciferase	Luciferase-like monooxygenase. 	314
395233	pfam00297	Ribosomal_L3	Ribosomal protein L3. 	369
395234	pfam00298	Ribosomal_L11	Ribosomal protein L11, RNA binding domain. 	69
395235	pfam00299	Squash	Squash family serine protease inhibitor. 	29
395236	pfam00300	His_Phos_1	Histidine phosphatase superfamily (branch 1). The histidine phosphatase superfamily is so named because catalysis centers on a conserved His residue that is transiently phosphorylated during the catalytic cycle. Other conserved residues contribute to a 'phosphate pocket' and interact with the phospho group of substrate before, during and after its transfer to the His residue. Structure and sequence analyses show that different families contribute different additional residues to the 'phosphate pocket' and, more surprisingly, differ in the position, in sequence and in three dimensions, of a catalytically essential acidic residue. The superfamily may be divided into two main branches. The larger branch 1 contains a wide variety of catalytic functions, the best known being fructose 2,6-bisphosphatase (found in a bifunctional protein with 2-phosphofructokinase) and cofactor-dependent phosphoglycerate mutase. The latter is an unusual example of a mutase activity in the superfamily: the vast majority of members appear to be phosphatases. The bacterial regulatory protein phosphatase SixA is also in branch 1 and has a minimal, and possible ancestral-like structure, lacking the large domain insertions that contribute to binding of small molecules in branch 1 members.	194
395237	pfam00301	Rubredoxin	Rubredoxin. 	47
395238	pfam00302	CAT	Chloramphenicol acetyltransferase. 	203
395239	pfam00303	Thymidylat_synt	Thymidylate synthase. This is a family of proteins that are flavin-dependent thymidylate synthases.	262
395240	pfam00304	Gamma-thionin	Gamma-thionin family. 	44
395241	pfam00305	Lipoxygenase	Lipoxygenase. 	672
395242	pfam00306	ATP-synt_ab_C	ATP synthase alpha/beta chain, C terminal domain. 	126
395243	pfam00307	CH	Calponin homology (CH) domain. The CH domain is found in both cytoskeletal proteins and signal transduction proteins. The CH domain is involved in actin binding in some members of the family. However in calponins there is evidence that the CH domain is not involved in its actin binding activity. Most member proteins have from two to four copies of the CH domain, however some proteins such as calponin have only a single copy.	109
278724	pfam00308	Bac_DnaA	Bacterial dnaA protein. 	219
395244	pfam00309	Sigma54_AID	Sigma-54 factor, Activator interacting domain (AID). The sigma-54 holoenzyme is an enhancer dependent form of the RNA polymerase. The AID is necessary for activator interaction. In addition, the AID also inhibits transcription initiation in the sigma-54 holoenzyme prior to interaction with the activator.	39
395245	pfam00310	GATase_2	Glutamine amidotransferases class-II. 	420
395246	pfam00311	PEPcase	Phosphoenolpyruvate carboxylase. 	794
395247	pfam00312	Ribosomal_S15	Ribosomal protein S15. 	76
278729	pfam00313	CSD	'Cold-shock' DNA-binding domain. 	66
395248	pfam00314	Thaumatin	Thaumatin family. 	211
395249	pfam00316	FBPase	Fructose-1-6-bisphosphatase, N-terminal domain. This family represents the N-terminus of this protein family.	189
395250	pfam00317	Ribonuc_red_lgN	Ribonucleotide reductase, all-alpha domain. 	79
395251	pfam00318	Ribosomal_S2	Ribosomal protein S2. 	216
395252	pfam00319	SRF-TF	SRF-type transcription factor (DNA-binding and dimerization domain). 	47
395253	pfam00320	GATA	GATA zinc finger. This domain uses four cysteine residues to coordinate a zinc ion. This domain binds to DNA. Two GATA zinc fingers are found in the GATA transcription factors. However there are several proteins which only contain a single copy of the domain.	36
395254	pfam00321	Thionin	Plant thionin. 	46
366026	pfam00322	Endothelin	Endothelin family. 	29
395255	pfam00323	Defensin_1	Mammalian defensin. 	29
366028	pfam00324	AA_permease	Amino acid permease. 	467
395256	pfam00325	Crp	Bacterial regulatory proteins, crp family. 	32
395257	pfam00326	Peptidase_S9	Prolyl oligopeptidase family. 	213
395258	pfam00327	Ribosomal_L30	Ribosomal protein L30p/L7e. This family includes prokaryotic L30 and eukaryotic L7.	51
395259	pfam00328	His_Phos_2	Histidine phosphatase superfamily (branch 2). The histidine phosphatase superfamily is so named because catalysis centers on a conserved His residue that is transiently phosphorylated during the catalytic cycle. Other conserved residues contribute to a 'phosphate pocket' and interact with the phospho group of substrate before, during and after its transfer to the His residue. Structure and sequence analyses show that different families contribute different additional residues to the 'phosphate pocket' and, more surprisingly, differ in the position, in sequence and in three dimensions, of a catalytically essential acidic residue. The superfamily may be divided into two main branches. The smaller branch 2 contains predominantly eukaryotic proteins. The catalytic functions in members include phytase, glucose-1-phosphatase and multiple inositol polyphosphate phosphatase. The in vivo roles of the mammalian acid phosphatases in branch 2 are not fully understood, although activity against lysophosphatidic acid and tyrosine-phosphorylated proteins has been demonstrated.	356
395260	pfam00329	Complex1_30kDa	Respiratory-chain NADH dehydrogenase, 30 Kd subunit. 	120
395261	pfam00330	Aconitase	Aconitase family (aconitate hydratase). 	458
395262	pfam00331	Glyco_hydro_10	Glycosyl hydrolase family 10. 	310
366033	pfam00332	Glyco_hydro_17	Glycosyl hydrolases family 17. 	309
395263	pfam00333	Ribosomal_S5	Ribosomal protein S5, N-terminal domain. 	65
395264	pfam00334	NDK	Nucleoside diphosphate kinase. 	135
395265	pfam00335	Tetraspannin	Tetraspanin family. 	221
366036	pfam00336	DNA_pol_viral_C	DNA polymerase (viral) C-terminal domain. 	241
395266	pfam00337	Gal-bind_lectin	Galactoside-binding lectin. This family contains galactoside binding lectins. The family also includes enzymes such as human eosinophil lysophospholipase (EC:3.1.1.5).	131
395267	pfam00338	Ribosomal_S10	Ribosomal protein S10p/S20e. This family includes small ribosomal subunit S10 from prokaryotes and S20 from eukaryotes.	98
395268	pfam00339	Arrestin_N	Arrestin (or S-antigen), N-terminal domain. Ig-like beta-sandwich fold. Scop reports duplication with C-terminal domain.	146
395269	pfam00340	IL1	Interleukin-1 / 18. This family includes interleukin-1 and interleukin-18.	119
395270	pfam00341	PDGF	PDGF/VEGF domain. 	82
395271	pfam00342	PGI	Phosphoglucose isomerase. Phosphoglucose isomerase catalyzes the interconversion of glucose-6-phosphate and fructose-6-phosphate.	487
395272	pfam00343	Phosphorylase	Carbohydrate phosphorylase. The members of this family catalyze the formation of glucose 1-phosphate from one of the following polyglucoses; glycogen, starch, glucan or maltodextrin.	661
395273	pfam00344	SecY	SecY translocase. 	313
395274	pfam00345	PapD_N	Pili and flagellar-assembly chaperone, PapD N-terminal domain. C2 domain-like beta-sandwich fold. This domain is the N-terminal part of the PapD chaperone protein for pilus and flagellar assembly.	121
395275	pfam00346	Complex1_49kDa	Respiratory-chain NADH dehydrogenase, 49 Kd subunit. 	270
395276	pfam00347	Ribosomal_L6	Ribosomal protein L6. 	78
395277	pfam00348	polyprenyl_synt	Polyprenyl synthetase. 	252
395278	pfam00349	Hexokinase_1	Hexokinase. Hexokinase (EC:2.7.1.1) contains two structurally similar domains represented by this family and pfam03727. Some members of the family have two copies of each of these domains.	194
395279	pfam00350	Dynamin_N	Dynamin family. 	168
395280	pfam00351	Biopterin_H	Biopterin-dependent aromatic amino acid hydroxylase. This family includes phenylalanine-4-hydroxylase, the phenylketonuria disease protein.	331
395281	pfam00352	TBP	Transcription factor TFIID (or TATA-binding protein, TBP). 	83
395282	pfam00353	HemolysinCabind	Haemolysin-type calcium-binding repeat (2 copies). 	36
278768	pfam00354	Pentaxin	Pentaxin family. Pentaxins are also known as pentraxins.	194
395283	pfam00355	Rieske	Rieske [2Fe-2S] domain. The rieske domain has a [2Fe-2S] centre. Two conserved cysteines coordinate one Fe ion, while the other Fe ion is coordinated by two conserved histidines. In hyperthermophilic archaea there is a SKTPCX(2-3)C motif at the C-terminus. The cysteines in this motif form a disulphide bridge, which stabilizes the protein.	88
306791	pfam00356	LacI	Bacterial regulatory proteins, lacI family. 	46
395284	pfam00357	Integrin_alpha	Integrin alpha cytoplasmic region. This family contains the short intracellular region of integrin alpha chains.	15
395285	pfam00358	PTS_EIIA_1	Phosphoenolpyruvate-dependent sugar phosphotransferase system, EIIA 1. 	122
395286	pfam00359	PTS_EIIA_2	Phosphoenolpyruvate-dependent sugar phosphotransferase system, EIIA 2. 	139
395287	pfam00360	PHY	Phytochrome region. Phytochromes are red/far-red photochromic biliprotein photoreceptors which regulate plant development. They are widely represented in both photosynthetic and non-photosynthetic bacteria and are known in a variety of fungi. Although sequence similarities are low, this domain is structurally related to pfam01590, which is generally located immediately N-terminal to this domain. Compared with pfam01590, this domain carries an additional tongue-like hairpin loop between the fifth beta-sheet and the sixth alpha-helix which functions to seal the chromophore pocket and stabilize the photoactivated far-red-absorbing state (Pfr). The tongue carries a conserved PRxSF motif, from which an arginine finger points into the chromophore pocket close to ring D forming a salt bridge with a conserved aspartate residue.	182
366050	pfam00361	Proton_antipo_M	Proton-conducting membrane transporter. This is a family of membrane transporters that includes some 7 of potentially 14-16 TM regions. In many instances the family forms part of complex I that catalyzes the transfer of two electrons from NADH to ubiquinone in a reaction that is associated with proton translocation across the membrane, and in this context is a combination predominantly of subunits 2, 4, 5, 14, L, M and N. In many bacterial species these proteins are probable stand-alone transporters not coupled with oxidoreduction. The family in total represents homologs across the phyla.	291
395288	pfam00362	Integrin_beta	Integrin beta chain VWA domain. Integrins have been found in animals and their homologs have also been found in cyanobacteria, probably due to horizontal gene transfer. This domain corresponds to the integrin beta VWA domain.	248
395289	pfam00363	Casein	Casein. 	87
395290	pfam00364	Biotin_lipoyl	Biotin-requiring enzyme. This family covers two Prosite entries, the conserved lysine residue binds biotin in one group and lipoic acid in the other. Note that the HMM does not currently recognize the Glycine cleavage system H proteins.	73
395291	pfam00365	PFK	Phosphofructokinase. 	271
395292	pfam00366	Ribosomal_S17	Ribosomal protein S17. 	68
395293	pfam00367	PTS_EIIB	phosphotransferase system, EIIB. 	34
395294	pfam00368	HMG-CoA_red	Hydroxymethylglutaryl-coenzyme A reductase. The HMG-CoA reductases catalyze the conversion of HMG-CoA to mevalonate, which is the rate-limiting step in the synthesis of isoprenoids like cholesterol. Probably because of the critical role of this enzyme in cholesterol homeostasis, mammalian HMG-CoA reductase is heavily regulated at the transcriptional, translational, and post-translational levels.	368
395295	pfam00370	FGGY_N	FGGY family of carbohydrate kinases, N-terminal domain. This domain adopts a ribonuclease H-like fold and is structurally related to the C-terminal domain.	245
395296	pfam00372	Hemocyanin_M	Hemocyanin, copper containing domain. This family includes arthropod hemocyanins and insect larval storage proteins.	268
395297	pfam00373	FERM_M	FERM central domain. This domain is the central structural domain of the FERM domain.	116
395298	pfam00374	NiFeSe_Hases	Nickel-dependent hydrogenase. 	495
395299	pfam00375	SDF	Sodium:dicarboxylate symporter family. 	388
395300	pfam00376	MerR	MerR family regulatory protein. 	38
395301	pfam00377	Prion	Prion/Doppel alpha-helical domain. The prion protein is thought to be the infectious agent that causes transmissible spongiform encephalopathies, such as scrapie and BSE. It is thought that the prion protein can exist in two different forms: one is the normal cellular protein, and the other is the infectious form which can change the normal prion protein into the infectious form. It has been found that the prion alpha-helical domain is also found in the Doppel protein.	116
395302	pfam00378	ECH_1	Enoyl-CoA hydratase/isomerase. This family contains a diverse set of enzymes including: enoyl-CoA hydratase, naphthoate synthase, carnitine racemase, 3-hydroxybutyryl-CoA dehydratase and dodecanoyl-CoA delta-isomerase.	251
395303	pfam00379	Chitin_bind_4	Insect cuticle protein. Many insect cuticular proteins include a 35-36 amino acid motif known as the R&R consensus. The extensive conservation of this region led to the suggestion that it functions to bind chitin. Provocatively, it has no sequence similarity to the well-known cysteine-containing chitin-binding domain found in chitinases and some peritrophic membrane proteins. Chitin binding has been shown experimentally for this region. Thus arthropods have two distinct classes of chitin binding proteins, those with the chitin-binding domain found in lectins, chitinases and peritrophic membranes (cysCBD) and those with the cuticular protein chitin-binding domain (non-cysCBD).	52
395304	pfam00380	Ribosomal_S9	Ribosomal protein S9/S16. This family includes small ribosomal subunit S9 from prokaryotes and S16 from eukaryotes.	121
395305	pfam00381	PTS-HPr	PTS HPr component phosphorylation site. 	79
395306	pfam00382	TFIIB	Transcription factor TFIIB repeat. 	71
395307	pfam00383	dCMP_cyt_deam_1	Cytidine and deoxycytidylate deaminase zinc-binding region. 	100
395308	pfam00384	Molybdopterin	Molybdopterin oxidoreductase. 	359
395309	pfam00385	Chromo	Chromo (CHRomatin Organisation MOdifier) domain. 	52
395310	pfam00386	C1q	C1q domain. C1q is a subunit of the C1 enzyme complex that activates the serum complement system.	126
395311	pfam00387	PI-PLC-Y	Phosphatidylinositol-specific phospholipase C, Y domain. This associates with pfam00388 to form a single structural unit.	114
395312	pfam00388	PI-PLC-X	Phosphatidylinositol-specific phospholipase C, X domain. This associates with pfam00387 to form a single structural unit.	142
395313	pfam00389	2-Hacid_dh	D-isomer specific 2-hydroxyacid dehydrogenase, catalytic domain. This family represents the largest portion of the catalytic domain of 2-hydroxyacid dehydrogenases as the NAD binding domain is inserted within the structural domain.	312
395314	pfam00390	malic	Malic enzyme, N-terminal domain. 	182
395315	pfam00391	PEP-utilizers	PEP-utilising enzyme, mobile domain. This domain is a "swivelling" beta/beta/alpha domain which is thought to be mobile in all proteins known to contain it.	73
306822	pfam00392	GntR	Bacterial regulatory proteins, gntR family. This family of regulatory proteins consists of the N-terminal HTH region of GntR-like bacterial transcription factors. At the C-terminus there is usually an effector-binding/oligomerization domain. The GntR-like proteins include the following sub-families: MocR, YtrR, FadR, AraR, HutC and PlmA, DevA, DasR. Many of these proteins have been shown experimentally to be autoregulatory, enabling the prediction of operator sites and the discovery of cis/trans relationships. The DasR regulator has been shown to be a global regulator of primary metabolism and development in Streptomyces coelicolor.	64
395316	pfam00393	6PGD	6-phosphogluconate dehydrogenase, C-terminal domain. This family represents the C-terminal all-alpha domain of 6-phosphogluconate dehydrogenase. The domain contains two structural repeats of 5 helices each.	290
395317	pfam00394	Cu-oxidase	Multicopper oxidase. Many of the proteins in this family contain multiple similar copies of this plastocyanin-like domain.	146
395318	pfam00395	SLH	S-layer homology domain. 	42
395319	pfam00396	Granulin	Granulin. 	42
395320	pfam00397	WW	WW domain. The WW domain is a protein module with two highly conserved tryptophans that binds proline-rich peptide motifs in vitro.	30
395321	pfam00398	RrnaAD	Ribosomal RNA adenine dimethylase. 	263
395322	pfam00399	PIR	Yeast PIR protein repeat. 	17
395323	pfam00400	WD40	WD domain, G-beta repeat. 	39
395324	pfam00401	ATP-synt_DE	ATP synthase, Delta/Epsilon chain, long alpha-helix domain. Part of the ATP synthase CF(1). These subunits are part of the head unit of the ATP synthase. This subunit is called epsilon in bacteria and delta in mitochondria. In bacteria the delta (D) subunit is equivalent to the mitochondrial Oligomycin sensitive subunit, OSCP (pfam00213).	45
395325	pfam00402	Calponin	Calponin family repeat. 	25
395326	pfam00403	HMA	Heavy-metal-associated domain. 	58
395327	pfam00404	Dockerin_1	Dockerin type I repeat. The dockerin repeat is the binding partner of the cohesin domain pfam00963. The cohesin-dockerin interaction is the crucial interaction for complex formation in the cellulosome. The dockerin repeats, each bearing homology to the EF-hand calcium-binding loop bind calcium.	56
395328	pfam00405	Transferrin	Transferrin. 	328
395329	pfam00406	ADK	Adenylate kinase. 	184
395330	pfam00407	Bet_v_1	Pathogenesis-related protein Bet v I family. This family is named after Bet v 1, the major birch pollen allergen. This protein belongs to family 10 of plant pathogenesis-related proteins (PR-10), cytoplasmic proteins of 15-17 kd that are wide-spread among dicotyledonous plants. In recent years, a number of diverse plant proteins with low sequence similarity to Bet v 1 were identified. A classification by sequence similarity yielded several subfamilies related to PR-10: - Pathogenesis-related proteins PR-10: These proteins were identified as major tree pollen allergens in birch and related species (hazel, alder), as plant food allergens expressed in high levels in fruits, vegetables and seeds (apple, celery, hazelnut), and as pathogenesis-related proteins whose expression is induced by pathogen infection, wounding, or abiotic stress. Hyp-1, an enzyme involved in the synthesis of the bioactive naphthodianthrone hypericin in St. John's wort (Hypericum perforatum) also belongs to this family. Most of these proteins were found in dicotyledonous plants. In addition, related sequences were identified in monocots and conifers. - Cytokinin-specific binding proteins: These legume proteins bind cytokinin plant hormones. - (S)-Norcoclaurine synthases are enzymes catalyzing the condensation of dopamine and 4-hydroxyphenylacetaldehyde to (S)-norcoclaurine, the first committed step in the biosynthesis of benzylisoquinoline alkaloids such as morphine. - Major latex proteins and ripening-related proteins are proteins of unknown biological function that were first discovered in the latex of opium poppy (Papaver somniferum) and later found to be upregulated during ripening of fruits such as strawberry and cucumber. The occurrence of Bet v 1-related proteins is confined to seed plants with the exception of a cytokinin-binding protein from the moss Physcomitrella patens.	149
395331	pfam00408	PGM_PMM_IV	Phosphoglucomutase/phosphomannomutase, C-terminal domain. 	71
395332	pfam00410	Ribosomal_S8	Ribosomal protein S8. 	127
306838	pfam00411	Ribosomal_S11	Ribosomal protein S11. 	109
395333	pfam00412	LIM	LIM domain. This family represents two copies of the LIM structural domain.	57
395334	pfam00413	Peptidase_M10	Matrixin. The members of this family are enzymes that cleave peptides. These proteases require zinc for catalysis.	159
366084	pfam00414	MAP1B_neuraxin	Neuraxin and MAP1B repeat. 	17
395335	pfam00415	RCC1	Regulator of chromosome condensation (RCC1) repeat. 	50
395336	pfam00416	Ribosomal_S13	Ribosomal protein S13/S18. This family includes ribosomal protein S13 from prokaryotes and S18 from eukaryotes.	108
395337	pfam00418	Tubulin-binding	Tau and MAP protein, tubulin-binding repeat. This family includes the vertebrate proteins MAP2, MAP4 and Tau, as well as other animal homologs. MAP4 is present in many tissues but is usually absent from neurons; MAP2 and Tau are mainly neuronal. Members of this family have the ability to bind to and stabilize microtubules. As a result, they are involved in neuronal migration, supporting dendrite elongation, and regulating microtubules during mitotic metaphase. Note that Tau is involved in neurofibrillary tangle formation in Alzheimer's disease and some other dementias. This family features a C-terminal microtubule binding repeat that contains a conserved KXGS motif.	31
395338	pfam00419	Fimbrial	Fimbrial protein. 	149
395339	pfam00420	Oxidored_q2	NADH-ubiquinone/plastoquinone oxidoreductase chain 4L. 	95
395340	pfam00421	PSII	Photosystem II protein. 	496
278832	pfam00423	HN	Haemagglutinin-neuraminidase. 	539
366091	pfam00424	REV	REV protein (anti-repression trans-activator protein). 	91
395341	pfam00425	Chorismate_bind	chorismate binding enzyme. This family includes the catalytic regions of the chorismate binding enzymes anthranilate synthase, isochorismate synthase, aminodeoxychorismate synthase and para-aminobenzoate synthase.	255
395342	pfam00426	VP4_haemagglut	Outer Capsid protein VP4 (Hemagglutinin) Concanavalin-like domain. This entry represents the N-terminal concanavalin-like domain from the VP4 protein of rotavirus C.	193
395343	pfam00427	PBS_linker_poly	Phycobilisome Linker polypeptide. 	125
395344	pfam00428	Ribosomal_60s	60s Acidic ribosomal protein. This family includes archaebacterial L12, eukaryotic P0, P1 and P2.	87
306850	pfam00429	TLV_coat	ENV polyprotein (coat polyprotein). 	560
366095	pfam00430	ATP-synt_B	ATP synthase B/B' CF(0). Part of the CF(0) (base unit) of the ATP synthase. The base unit is thought to translocate protons through the membrane (inner membrane in mitochondria, thylakoid membrane in plants, cytoplasmic membrane in bacteria). The B subunits are thought to interact with the stalk of the CF(1) subunits. This domain should not be confused with the ab CF(1) proteins (in the head of the ATP synthase) which are found in pfam00006.	131
395345	pfam00431	CUB	CUB domain. 	110
395346	pfam00432	Prenyltrans	Prenyltransferase and squalene oxidase repeat. 	44
395347	pfam00433	Pkinase_C	Protein kinase C terminal domain. 	42
278842	pfam00434	VP7	Glycoprotein VP7. 	336
395348	pfam00435	Spectrin	Spectrin repeat. Spectrin repeat-domains are found in several proteins involved in cytoskeletal structure. These include spectrin, alpha-actinin and dystrophin. The sequence repeat used in this family is taken from the structural repeat in reference. The spectrin domain repeat forms a three helix bundle. The second helix is interrupted by proline in some sequences. The repeats are defined by a characteristic tryptophan (W) residue at position 17 in helix A and a leucine (L) at 2 residues from the carboxyl end of helix C. Although the domain occurs in multiple repeats along sequences, the domains are actually stable on their own, i.e. they act, biophysically, like domains rather than repeats that only function when aggregated.	105
395349	pfam00436	SSB	Single-strand binding protein family. This family includes single stranded binding proteins and also the primosomal replication protein N (PriB). PriB forms a complex with PriA, PriC and ssDNA.	103
395350	pfam00437	T2SSE	Type II/IV secretion system protein. This family contains both type II and type IV pathway secretion proteins from bacteria. VirB11 ATPase is a subunit of the Agrobacterium tumefaciens transfer DNA (T-DNA) transfer system, a type IV secretion pathway required for delivery of T-DNA and effector proteins to plant cells during infection.	271
395351	pfam00438	S-AdoMet_synt_N	S-adenosylmethionine synthetase, N-terminal domain. The three domains of S-adenosylmethionine synthetase have the same alpha+beta fold.	98
395352	pfam00439	Bromodomain	Bromodomain. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine.	84
395353	pfam00440	TetR_N	Bacterial regulatory proteins, tetR family. 	47
395354	pfam00441	Acyl-CoA_dh_1	Acyl-CoA dehydrogenase, C-terminal domain. C-terminal domain of Acyl-CoA dehydrogenase is an all-alpha, four helical up-and-down bundle.	149
395355	pfam00443	UCH	Ubiquitin carboxyl-terminal hydrolase. 	306
395356	pfam00444	Ribosomal_L36	Ribosomal protein L36. 	38
395357	pfam00445	Ribonuclease_T2	Ribonuclease T2 family. 	183
278853	pfam00446	GnRH	Gonadotropin-releasing hormone. 	10
395358	pfam00447	HSF_DNA-bind	HSF-type DNA-binding. 	99
395359	pfam00448	SRP54	SRP54-type protein, GTPase domain. This family includes relatives of the G-domain of the SRP54 family of proteins.	193
395360	pfam00449	Urease_alpha	Urease alpha-subunit, N-terminal domain. The N-terminal domain is a composite domain and plays a major trimer stabilizing role by contacting the catalytic domain of the symmetry related alpha-subunit.	120
395361	pfam00450	Peptidase_S10	Serine carboxypeptidase. 	415
278858	pfam00451	Toxin_2	Scorpion short toxin, BmKK2. Members of this family, which are found in various scorpion toxins, confer potassium channel blocking activity.	32
395362	pfam00452	Bcl-2	Apoptosis regulator proteins, Bcl-2 family. 	100
395363	pfam00453	Ribosomal_L20	Ribosomal protein L20. 	104
395364	pfam00454	PI3_PI4_kinase	Phosphatidylinositol 3- and 4-kinase. Some members of this family probably do not have lipid kinase activity and are protein kinases.	241
395365	pfam00455	DeoRC	DeoR C terminal sensor domain. The sensor domains of the DeoR are catalytically inactive versions of the ISOCOT fold, but retain the substrate binding site. DeoRC senses diverse sugar derivatives such as deoxyribose nucleoside (DeoR), tagatose phosphate (LacR), galactosamine (AgaR), myo-inositol (Bacillus IolR) and L-ascorbate (UlaR).	160
395366	pfam00456	Transketolase_N	Transketolase, thiamine diphosphate binding domain. This family includes transketolase enzymes EC:2.2.1.1. and also partially matches to 2-oxoisovalerate dehydrogenase beta subunit EC:1.2.4.4. Both these enzymes utilize thiamine pyrophosphate as a cofactor, suggesting there may be common aspects in their mechanism of catalysis.	334
395367	pfam00457	Glyco_hydro_11	Glycosyl hydrolases family 11. 	175
395368	pfam00458	WHEP-TRS	WHEP-TRS domain. 	49
395369	pfam00459	Inositol_P	Inositol monophosphatase family. 	270
395370	pfam00460	Flg_bb_rod	Flagella basal body rod protein. 	30
395371	pfam00462	Glutaredoxin	Glutaredoxin. 	60
278869	pfam00463	ICL	Isocitrate lyase family. 	526
395372	pfam00464	SHMT	Serine hydroxymethyltransferase. 	399
395373	pfam00465	Fe-ADH	Iron-containing alcohol dehydrogenase. 	362
395374	pfam00466	Ribosomal_L10	Ribosomal protein L10. 	99
395375	pfam00467	KOW	KOW motif. This family has been extended to coincide with ref. The KOW (Kyprides, Ouzounis, Woese) motif is found in a variety of ribosomal proteins and NusG.	32
395376	pfam00468	Ribosomal_L34	Ribosomal protein L34. 	42
366118	pfam00469	F-protein	Negative factor, (F-Protein) or Nef. Nef protein accelerates virulent progression of AIDS by its interaction with cellular proteins involved in signal transduction and host cell activation. Nef has been shown to bind specifically to a subset of the Src kinase family.	220
395377	pfam00471	Ribosomal_L33	Ribosomal protein L33. 	46
395378	pfam00472	RF-1	RF-1 domain. This domain is found in peptide chain release factors such as RF-1 and RF-2, and a number of smaller proteins of unknown function. This domain contains the peptidyl-tRNA hydrolase activity. The domain contains a highly conserved motif GGQ, where the glutamine is thought to coordinate the water that mediates the hydrolysis.	111
395379	pfam00473	CRF	Corticotropin-releasing factor family. 	38
109527	pfam00474	SSF	Sodium:solute symporter family. 	406
395380	pfam00475	IGPD	Imidazoleglycerol-phosphate dehydratase. 	140
395381	pfam00476	DNA_pol_A	DNA polymerase family A. 	372
306882	pfam00477	LEA_5	Small hydrophilic plant seed protein. 	109
395382	pfam00478	IMPDH	IMP dehydrogenase / GMP reductase domain. This family is involved in biosynthesis of guanosine nucleotide. Members of this family contain a TIM barrel structure. In the inosine monophosphate dehydrogenases 2 CBS domains pfam00571 are inserted in the TIM barrel. This family is a member of the common phosphate binding site TIM barrel family.	418
395383	pfam00479	G6PD_N	Glucose-6-phosphate dehydrogenase, NAD binding domain. 	178
395384	pfam00480	ROK	ROK family. 	292
395385	pfam00481	PP2C	Protein phosphatase 2C. Protein phosphatase 2C is a Mn++ or Mg++ dependent protein serine/threonine phosphatase.	252
395386	pfam00482	T2SSF	Type II secretion system (T2SS), protein F. The original family covered both the regions found by the current model. The splitting of the family has allowed the related FlaJ_arch (archaeal FlaJ family) to be merged with it. Proteins with this domain form a platform for the machinery of the Type II secretion system, as well as the Type 4 pili and the archaeal flagella. This domain seems to show some similarity to PF00664 but this may just be due to similarities in the TM helices (personal obs: C Yeats).	115
395387	pfam00483	NTP_transferase	Nucleotidyl transferase. This family includes a wide range of enzymes which transfer nucleotides onto phosphosugars.	243
395388	pfam00484	Pro_CA	Carbonic anhydrase. This family includes carbonic anhydrases as well as a family of non-functional homologs related to YbcF.	156
395389	pfam00485	PRK	Phosphoribulokinase / Uridine kinase family. This family matches three types of P-loop containing kinases: phosphoribulokinases, uridine kinases and bacterial pantothenate kinases (CoaA). Arabidopsis and other organisms have a dual uridine kinase/uracil phosphoribosyltransferase protein where the N-terminal region consists of a UK domain and the C-terminal region of a UPRT domain.	197
395390	pfam00486	Trans_reg_C	Transcriptional regulatory protein, C terminal. 	75
395391	pfam00487	FA_desaturase	Fatty acid desaturase. 	253
395392	pfam00488	MutS_V	MutS domain V. This domain is found in proteins of the MutS family (DNA mismatch repair proteins) and is found associated with pfam01624, pfam05188, pfam05192 and pfam05190. The mutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair; other members of the family include the eukaryotic MSH 1, 2, 3, 4, 5 and 6 proteins. These have various roles in DNA repair and recombination. Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein. The aligned region corresponds with domain V of Thermus aquaticus MutS, which contains a Walker A motif, and is structurally similar to the ATPase domain of ABC transporters.	188
395393	pfam00489	IL6	Interleukin-6/G-CSF/MGF family. 	184
395394	pfam00490	ALAD	Delta-aminolevulinic acid dehydratase. 	315
395395	pfam00491	Arginase	Arginase family. 	271
395396	pfam00493	MCM	MCM2/3/5 family. 	224
395397	pfam00494	SQS_PSY	Squalene/phytoene synthase. 	260
395398	pfam00496	SBP_bac_5	Bacterial extracellular solute-binding proteins, family 5 Middle. The borders of this family are based on the PDBSum definitions of the domain edges for Salmonella typhimurium oppA.	369
395399	pfam00497	SBP_bac_3	Bacterial extracellular solute-binding proteins, family 3. 	225
395400	pfam00498	FHA	FHA domain. The FHA (Forkhead-associated) domain is a phosphopeptide binding motif.	66
395401	pfam00499	Oxidored_q3	NADH-ubiquinone/plastoquinone oxidoreductase chain 6. 	143
395402	pfam00500	Late_protein_L1	L1 (late) protein. 	456
395403	pfam00501	AMP-binding	AMP-binding enzyme. 	411
395404	pfam00502	Phycobilisome	Phycobilisome protein. 	155
395405	pfam00503	G-alpha	G-protein alpha subunit. G proteins couple receptors of extracellular signals to intracellular signaling pathways. The G protein alpha subunit binds guanyl nucleotide and is a weak GTPase. A set of residues that are unique to G-alpha as compared to its ancestor the Arf-like family form a ring of residues centered on the nucleotide binding site. A Ggamma is found fused to an inactive Galpha in the Dictyostelium protein gbqA.	316
395406	pfam00504	Chloroa_b-bind	Chlorophyll A-B binding protein. 	157
395407	pfam00505	HMG_box	HMG (high mobility group) box. 	68
278907	pfam00506	Flu_NP	Influenza virus nucleoprotein. 	520
395408	pfam00507	Oxidored_q4	NADH-ubiquinone/plastoquinone oxidoreductase, chain 3. 	94
278909	pfam00508	PPV_E2_N	E2 (early) protein, N terminal. 	197
395409	pfam00509	Hemagglutinin	Haemagglutinin. Haemagglutinin from influenza virus causes membrane fusion of the viral membrane with the host membrane. Fusion occurs after the host cell internalizes the virus by endocytosis. The drop of pH causes release of a hydrophobic fusion peptide and a large conformational change leading to membrane fusion.	504
395410	pfam00510	COX3	Cytochrome c oxidase subunit III. 	258
395411	pfam00511	PPV_E2_C	E2 (early) protein, C terminal. 	80
395412	pfam00512	HisKA	His Kinase A (phospho-acceptor) domain. Dimerization and phospho-acceptor domain of histidine kinases.	64
278914	pfam00513	Late_protein_L2	Late Protein L2. 	514
395413	pfam00514	Arm	Armadillo/beta-catenin-like repeat. Approx. 40 amino acid repeat. Tandem repeats form super-helix of helices that is proposed to mediate interaction of beta-catenin with its ligands. CAUTION: This family does not contain all known armadillo repeats.	41
395414	pfam00515	TPR_1	Tetratricopeptide repeat. 	34
278917	pfam00516	GP120	Envelope glycoprotein GP120. The entry of HIV requires interaction of viral GP120 with CD4 and a chemokine receptor on the cell surface.	525
395415	pfam00517	GP41	Retroviral envelope protein. This family includes envelope protein from a variety of retroviruses. It includes the GP41 subunit of the envelope protein complex from human and simian immunodeficiency viruses (HIV and SIV) which mediate membrane fusion during viral entry. The family also includes bovine immunodeficiency virus, feline immunodeficiency virus and Equine infectious anaemia (EIAV). The family also includes the Gp36 protein from mouse mammary tumor virus (MMTV) and human endogenous retroviruses (HERVs).	197
306907	pfam00518	E6	Early Protein (E6). 	108
278920	pfam00519	PPV_E1_C	Papillomavirus helicase. This protein is a DNA helicase that is required for initiation of viral DNA replication. This protein forms a complex with the E2 protein pfam00508.	432
395416	pfam00520	Ion_trans	Ion transport protein. This family contains sodium, potassium and calcium ion channels. This family is 6 transmembrane helices in which the last two helices flank a loop which determines ion selectivity. In some sub-families (e.g. Na channels) the domain is repeated four times, whereas in others (e.g. K channels) the protein forms as a tetramer in the membrane.	238
395417	pfam00521	DNA_topoisoIV	DNA gyrase/topoisomerase IV, subunit A. 	430
278923	pfam00522	VPR	VPR/VPX protein. 	83
395418	pfam00523	Fusion_gly	Fusion glycoprotein F0. 	473
278925	pfam00524	PPV_E1_N	E1 Protein, N terminal domain. 	121
395419	pfam00525	Crystallin	Alpha crystallin A chain, N terminal. 	53
306912	pfam00526	Dicty_CTDC	Dictyostelium (slime mold) repeat. 	23
278928	pfam00527	E7	E7 protein, Early protein. 	92
334128	pfam00528	BPD_transp_1	Binding-protein-dependent transport system inner membrane component. The alignments cover the most conserved region of the proteins, which is thought to be located in a cytoplasmic loop between two transmembrane domains. The members of this family have a variable number of transmembrane helices.	183
395420	pfam00529	HlyD	HlyD membrane-fusion protein of T1SS. HlyD is a component of the prototypical alpha-haemolysin (HlyA) bacterial type I secretion system, along with the other components HlyB and TolC. HlyD and HlyB are inner-membrane proteins and specific components of the transport apparatus of alpha-haemolysin. HlyD is anchored in the cytoplasmic membrane by a single transmembrane domain and has a large periplasmic domain within the carboxy-terminal 100 amino acids. HlyB and HlyD form a stable complex that binds the recombinant protein bearing a C-terminal HlyA signal sequence and ATP in the cytoplasm. HlyD, HlyB and TolC combine to form the three-component ABC transporter complex that forms a trans-membrane channel or pore through which HlyA can be transferred directly to the extracellular medium. Cutinase has been shown to be transported effectively through this pore.	322
395421	pfam00530	SRCR	Scavenger receptor cysteine-rich domain. These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions.	98
395422	pfam00531	Death	Death domain. 	83
395423	pfam00532	Peripla_BP_1	Periplasmic binding proteins and sugar binding domain of LacI family. This family includes the periplasmic binding proteins, and the LacI family transcriptional regulators. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The LacI family of proteins consist of transcriptional regulators related to the lac repressor. In this case, generally the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain (pfam00356).	281
395424	pfam00533	BRCT	BRCA1 C-terminus (BRCT) domain. The BRCT domain is found predominantly in proteins involved in cell cycle checkpoint functions responsive to DNA damage. The BRCT domain of XRCC1 forms a homodimer in the crystal structure. This suggests that pairs of BRCT domains associate as homo- or heterodimers. BRCT domains are often found as tandem-repeat pairs. Structures of the BRCA1 BRCT domains revealed a basis for a widely utilized head-to-tail BRCT-BRCT oligomerization mode. This conserved tandem BRCT architecture facilitates formation of the canonical BRCT phospho-peptide interaction cleft at a groove between the BRCT domains. Disease associated missense and nonsense mutations in the BRCA1 BRCT domains disrupt peptide binding by directly occluding this peptide binding groove, or by disrupting key conserved BRCT core folding determinants.	78
395425	pfam00534	Glycos_transf_1	Glycosyl transferases group 1. Mutations in this domain of PIGA lead to disease (Paroxysmal Nocturnal haemoglobinuria). Members of this family transfer activated sugars to a variety of substrates, including glycogen, Fructose-6-phosphate and lipopolysaccharides. Members of this family transfer UDP, ADP, GDP or CMP linked sugars. The eukaryotic glycogen synthases may be distant members of this family.	158
395426	pfam00535	Glycos_transf_2	Glycosyl transferase family 2. Diverse family, transferring sugar from UDP-glucose, UDP-N-acetyl-galactosamine, GDP-mannose or CDP-abequose, to a range of substrates including cellulose, dolichol phosphate and teichoic acids.	164
395427	pfam00536	SAM_1	SAM domain (Sterile alpha motif). It has been suggested that SAM is an evolutionarily conserved protein binding domain that is involved in the regulation of numerous developmental processes in diverse eukaryotes. The SAM domain can potentially function as a protein interaction module through its ability to homo- and heterooligomerize with other SAM domains.	64
395428	pfam00537	Toxin_3	Scorpion toxin-like domain. This family contains both neurotoxins and plant defensins. The mustard trypsin inhibitor, MTI-2, is a plant defensin. It is a potent inhibitor of trypsin with no activity towards chymotrypsin. MTI-2 is toxic for Lepidopteran insects, but has low activity against aphids. Brazzein is a plant defensin-like protein. It is a pH-stable, heat-stable and intensely sweet protein. The scorpion toxin (a neurotoxin) binds to sodium channels and inhibits the activation mechanisms of the channels, thereby blocking neuronal transmission.	55
395429	pfam00538	Linker_histone	linker histone H1 and H5 family. Linker histone H1 is an essential component of chromatin structure. H1 links nucleosomes into higher order structures. Histone H1 is replaced by histone H5 in some cell types.	73
306921	pfam00539	Tat	Transactivating regulatory protein (Tat). The retroviral Tat protein binds to the Tar RNA. This activates transcriptional initiation and elongation from the LTR promoter. Binding is mediated by an arginine rich region.	64
249943	pfam00540	Gag_p17	gag gene protein p17 (matrix protein). The matrix protein forms an icosahedral shell associated with the inner membrane of the mature immunodeficiency virus.	140
306922	pfam00541	Adeno_knob	Adenoviral fibre protein (knob domain). Specific attachment of adenovirus is achieved through interactions between host-cell receptors and the adenovirus fibre protein and is mediated by the globular carboxy-terminal domain of the adenovirus fibre protein, termed the carboxy-terminal knob domain.	178
395430	pfam00542	Ribosomal_L12	Ribosomal protein L7/L12 C-terminal domain. 	67
395431	pfam00543	P-II	Nitrogen regulatory protein P-II. P-II modulates the activity of glutamine synthetase.	102
366158	pfam00544	Pec_lyase_C	Pectate lyase. This enzyme forms a right handed beta helix structure. Pectate lyase is an enzyme involved in the maceration and soft rotting of plant tissue.	211
395432	pfam00545	Ribonuclease	ribonuclease. This enzyme hydrolyzes RNA and oligoribonucleotides.	81
395433	pfam00547	Urease_gamma	Urease, gamma subunit. Urease is a nickel-binding enzyme that catalyzes the hydrolysis of urea to carbon dioxide and ammonia.	99
278947	pfam00548	Peptidase_C3	3C cysteine protease (picornain 3C). Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease.	174
395434	pfam00549	Ligase_CoA	CoA-ligase. This family includes the CoA ligases Succinyl-CoA synthetase alpha and beta chains, malate CoA ligase and ATP-citrate lyase. Some members of the family utilize ATP others use GTP.	128
395435	pfam00550	PP-binding	Phosphopantetheine attachment site. A 4'-phosphopantetheine prosthetic group is attached through a serine. This prosthetic group acts as a 'swinging arm' for the attachment of activated fatty acid and amino-acid groups. This domain forms a four helix bundle. This family includes members not included in Prosite. The inclusion of these members is supported by sequence analysis and functional evidence. The related domain of the anguibactin system regulator AngR has the attachment serine replaced by an alanine.	62
395436	pfam00551	Formyl_trans_N	Formyl transferase. This family includes the following members. Glycinamide ribonucleotide transformylase catalyzes the third step in de novo purine biosynthesis, the transfer of a formyl group to 5'-phosphoribosylglycinamide. Formyltetrahydrofolate deformylase produces formate from formyl-tetrahydrofolate. Methionyl-tRNA formyltransferase transfers a formyl group onto the amino terminus of the acyl moiety of the methionyl aminoacyl-tRNA. Inclusion of the following members is supported by PSI-blast. HOXX_BRAJA (P31907) contains a related domain of unknown function. PRTH_PORGI (P46071) contains a related domain of unknown function. Y09P_MYCTU (Q50721) contains a related domain of unknown function.	181
395437	pfam00552	IN_DBD_C	Integrase DNA binding domain. Integrase mediates integration of a DNA copy of the viral genome into the host chromosome. Integrase is composed of three domains. The amino-terminal domain is a zinc binding domain. The central domain is the catalytic domain pfam00665. This domain is the carboxyl terminal domain that is a non-specific DNA binding domain.	45
395438	pfam00553	CBM_2	Cellulose binding domain. Two tryptophan residues are involved in cellulose binding. Cellulose binding domain found in bacteria.	101
395439	pfam00554	RHD_DNA_bind	Rel homology DNA-binding domain. Proteins containing the Rel homology domain (RHD) are eukaryotic transcription factors. The RHD is composed of two structural domains. This is the N-terminal DNA-binding domain that is similar to that found in P53. The C-terminal domain has an immunoglobulin-like fold (See pfam16179) that functions as a dimerization domain.	169
395440	pfam00555	Endotoxin_M	delta endotoxin. This family contains insecticidal toxins produced by Bacillus species of bacteria. During spore formation the bacteria produce crystals of this protein. When an insect ingests these proteins they are activated by proteolytic cleavage. The N-terminus is cleaved in all of the proteins and a C terminal extension is cleaved in some members. Once activated the endotoxin binds to the gut epithelium and causes cell lysis leading to death. This activated region of the delta endotoxin is composed of three structural domains. The N-terminal helical domain is involved in membrane insertion and pore formation. The second and third domains are involved in receptor binding.	204
395441	pfam00556	LHC	Antenna complex alpha/beta subunit. 	39
395442	pfam00557	Peptidase_M24	Metallopeptidase family M24. This family contains metallopeptidases. It also contains non-peptidase homologs such as the N terminal domain of Spt16 which is a histone H3-H4 binding module.	206
109608	pfam00558	Vpu	Vpu protein. The Vpu protein contains an N-terminal transmembrane spanning region and a C-terminal cytoplasmic region. The HIV-1 Vpu protein stimulates virus production by enhancing the release of viral particles from infected cells. The VPU protein binds specifically to CD4.	81
278957	pfam00559	Vif	Retroviral Vif (Viral infectivity) protein. Human immunodeficiency virus type 1 (HIV-1) Vif is required for productive infection of T lymphocytes and macrophages. Virions produced in the absence of Vif have abnormal core morphology and those produced in primary T cells carry immature core proteins and low levels of mature capsid.	200
395443	pfam00560	LRR_1	Leucine Rich Repeat. CAUTION: This Pfam may not find all Leucine Rich Repeats in a protein. Leucine Rich Repeats are short sequence motifs present in a number of proteins with diverse functions and cellular locations. These repeats are usually involved in protein-protein interactions. Each Leucine Rich Repeat is composed of a beta-alpha unit. These units form elongated non-globular structures. Leucine Rich Repeats are often flanked by cysteine rich domains.	23
395444	pfam00561	Abhydrolase_1	alpha/beta hydrolase fold. This catalytic domain is found in a very wide range of enzymes.	245
395445	pfam00562	RNA_pol_Rpb2_6	RNA polymerase Rpb2, domain 6. RNA polymerases catalyze the DNA dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain represents the hybrid binding domain and the wall domain. The hybrid binding domain binds the nascent RNA strand / template DNA strand in the Pol II transcription elongation complex. This domain contains the important structural motifs, switch 3 and the flap loop and binds an active site metal ion. This domain is also involved in binding to Rpb1 and Rpb3. Many of the bacterial members contain large insertions within this domain, a region known as dispensable region 2 (DRII).	369
395446	pfam00563	EAL	EAL domain. This domain is found in diverse bacterial signaling proteins. It is called EAL after its conserved residues. The EAL domain is a good candidate for a diguanylate phosphodiesterase function. The domain contains many conserved acidic residues that could participate in metal binding and might form the phosphodiesterase active site.	236
395447	pfam00564	PB1	PB1 domain. 	84
395448	pfam00565	SNase	Staphylococcal nuclease homolog. Present in all three domains of cellular life. Four copies in the transcriptional coactivator p100: these, however, appear to lack the active site residues of Staphylococcal nuclease. Positions 14 (Asp-21), 34 (Arg-35), 39 (Asp-40), 42 (Glu-43) and 110 (Arg-87) [SNase numbering in parentheses] are thought to be involved in substrate-binding and catalysis.	106
366170	pfam00566	RabGAP-TBC	Rab-GTPase-TBC domain. Identification of a TBC domain in GYP6_YEAST and GYP7_YEAST, which are GTPase activator proteins of yeast Ypt6 and Ypt7, implies that these domains are GTPase activator proteins of Rab-like small GTPases.	180
395449	pfam00567	TUDOR	Tudor domain. 	117
395450	pfam00568	WH1	WH1 domain. WASp Homology domain 1 (WH1) domain. WASP is the protein that is defective in Wiskott-Aldrich syndrome (WAS). The majority of point mutations occur within the amino-terminal WH1 domain. The metabotropic glutamate receptors mGluR1alpha and mGluR5 bind a protein called homer, which is a WH1 domain homolog. A subset of WH1 domains has been termed an "EVH1" domain and appears to bind a polyproline motif.	111
395451	pfam00569	ZZ	Zinc finger, ZZ type. Zinc finger present in dystrophin, CBP/p300. ZZ in dystrophin binds calmodulin. Putative zinc finger; binding not yet shown. Four to six cysteine residues in its sequence are responsible for coordinating zinc ions, to reinforce the structure.	45
395452	pfam00570	HRDC	HRDC domain. The HRDC (Helicase and RNase D C-terminal) domain has a putative role in nucleic acid binding. Mutations in the HRDC domain cause human disease. It is interesting to note that the RecQ helicase in Deinococcus radiodurans has three tandem HRDC domains.	68
395453	pfam00571	CBS	CBS domain. CBS domains are small intracellular modules that pair together to form a stable globular domain. This family represents a single CBS domain. Pairs of these domains have been termed a Bateman domain. CBS domains have been shown to bind ligands with an adenosyl group such as AMP, ATP and S-AdoMet. CBS domains are found attached to a wide range of other protein domains suggesting that CBS domains may play a regulatory role making proteins sensitive to adenosyl carrying ligands. The region containing the CBS domains in Cystathionine-beta synthase is involved in regulation by S-AdoMet. CBS domain pairs from AMPK bind AMP or ATP. The CBS domains from IMPDH and the chloride channel CLC2 bind ATP.	57
395454	pfam00572	Ribosomal_L13	Ribosomal protein L13. 	119
395455	pfam00573	Ribosomal_L4	Ribosomal protein L4/L1 family. This family includes Ribosomal L4/L1 from eukaryotes and archaebacteria and L4 from eubacteria. L4 from yeast has been shown to bind rRNA.	190
395456	pfam00574	CLP_protease	Clp protease. The Clp protease has an active site catalytic triad. In E. coli Clp protease, ser-111, his-136 and asp-185 form the catalytic triad. Some members have lost active site residues and are therefore inactive, some contain one or two large insertions.	175
395457	pfam00575	S1	S1 RNA binding domain. The S1 domain occurs in a wide range of RNA associated proteins. It is structurally similar to cold shock protein which binds nucleic acids. The S1 domain has an OB-fold structure.	74
395458	pfam00576	Transthyretin	HIUase/Transthyretin family. This family includes transthyretin that is a thyroid hormone-binding protein that transports thyroxine from the bloodstream to the brain. However, most of the sequences listed in this family do not bind thyroid hormones. They are actually enzymes of the purine catabolism that catalyze the conversion of 5-hydroxyisourate (HIU) to OHCU. HIU hydrolysis is the original function of the family and is conserved from bacteria to mammals; transthyretins arose by gene duplications in the vertebrate lineage. HIUases are distinguished in the alignment from the conserved C-terminal YRGS sequence.	110
395459	pfam00577	Usher	Outer membrane usher protein. In Gram-negative bacteria the biogenesis of fimbriae (or pili) requires a two-component assembly and transport system which is composed of a periplasmic chaperone and an outer membrane protein which has been termed a molecular 'usher'. The usher protein is rather large (from 86 to 100 Kd) and seems to be mainly composed of membrane-spanning beta-sheets, a structure reminiscent of porins. Although the degree of sequence similarity of these proteins is not very high they share a number of characteristics. One of these is the presence of two pairs of cysteines, the first one located in the N-terminal part and the second at the C-terminal extremity that are probably involved in disulphide bonds. The best conserved region is located in the central part of these proteins.	551
395460	pfam00578	AhpC-TSA	AhpC/TSA family. This family contains proteins related to alkyl hydroperoxide reductase (AhpC) and thiol specific antioxidant (TSA).	124
395461	pfam00579	tRNA-synt_1b	tRNA synthetases class I (W and Y). 	292
395462	pfam00580	UvrD-helicase	UvrD/REP helicase N-terminal domain. The Rep family helicases are composed of four structural domains. The Rep family function as dimers. REP helicases catalyze ATP dependent unwinding of double stranded DNA to single stranded DNA. Some members have large insertions near to the carboxy-terminus relative to other members of the family.	267
395463	pfam00581	Rhodanese	Rhodanese-like domain. Rhodanese has an internal duplication. This Pfam represents a single copy of this duplicated domain. The domain is found as a single copy in other proteins, including phosphatases and ubiquitin C-terminal hydrolases.	92
395464	pfam00582	Usp	Universal stress protein family. The universal stress protein UspA is a small cytoplasmic bacterial protein whose expression is enhanced when the cell is exposed to stress agents. UspA enhances the rate of cell survival during prolonged exposure to such conditions, and may provide a general "stress endurance" activity. The crystal structure of Haemophilus influenzae UspA reveals an alpha/beta fold similar to that of the Methanococcus jannaschii MJ0577 protein, which binds ATP, though UspA lacks ATP-binding activity.	140
395465	pfam00583	Acetyltransf_1	Acetyltransferase (GNAT) family. This family contains proteins with N-acetyltransferase functions such as Elp3-related proteins.	116
395466	pfam00584	SecE	SecE/Sec61-gamma subunits of protein translocation complex. SecE is part of the SecYEG complex in bacteria which translocates proteins from the cytoplasm. In eukaryotes the complex, made from Sec61-gamma and Sec61-alpha translocates protein from the cytoplasm to the ER. Archaea have a similar complex.	55
395467	pfam00585	Thr_dehydrat_C	C-terminal regulatory domain of Threonine dehydratase. Threonine dehydratases pfam00291 all contain a carboxy terminal region. This region may have a regulatory role. Some members contain two copies of this region. This family is homologous to the pfam01842 domain.	91
395468	pfam00586	AIRS	AIR synthase related protein, N-terminal domain. This family includes Hydrogen expression/formation protein HypE, AIR synthases EC:6.3.3.1, FGAM synthase EC:6.3.5.3 and selenide, water dikinase EC:2.7.9.3. The N-terminal domain of AIR synthase forms the dimer interface of the protein, and is suggested as a putative ATP binding domain.	104
395469	pfam00587	tRNA-synt_2b	tRNA synthetase class II core domain (G, H, P, S and T). tRNA-synt_2b is a family of largely threonyl-tRNA members.	181
395470	pfam00588	SpoU_methylase	SpoU rRNA Methylase family. This family of proteins probably use S-AdoMet.	141
395471	pfam00589	Phage_integrase	Phage integrase family. Members of this family cleave DNA substrates by a series of staggered cuts, during which the protein becomes covalently linked to the DNA through a catalytic tyrosine residue at the carboxy end of the alignment. The catalytic site residues in CRE recombinase are Arg-173, His-289, Arg-292 and Tyr-324.	169
395472	pfam00590	TP_methylase	Tetrapyrrole (Corrin/Porphyrin) Methylases. This family uses S-AdoMet in the methylation of diverse substrates. This family includes a related group of bacterial proteins of unknown function. This family includes the methylase Diphthine synthase.	209
395473	pfam00591	Glycos_transf_3	Glycosyl transferase family, a/b domain. This family includes anthranilate phosphoribosyltransferase (TrpD) and thymidine phosphorylase. All these proteins can transfer a phosphorylated ribose substrate.	251
395474	pfam00593	TonB_dep_Rec	TonB dependent receptor. This model now only covers the conserved part of the barrel structure.	475
395475	pfam00594	Gla	Vitamin K-dependent carboxylation/gamma-carboxyglutamic (GLA) domain. This domain is responsible for the high-affinity binding of calcium ions. This domain contains post-translational modifications of many glutamate residues by Vitamin K-dependent carboxylation to form gamma-carboxyglutamate (Gla).	41
395476	pfam00595	PDZ	PDZ domain (Also known as DHR or GLGF). PDZ domains are found in diverse signaling proteins.	81
395477	pfam00596	Aldolase_II	Class II Aldolase and Adducin N-terminal domain. This family includes class II aldolases and adducins which have not been ascribed any enzymatic function.	175
395478	pfam00598	Flu_M1	Influenza Matrix protein (M1). This protein forms a continuous shell on the inner side of the lipid bilayer, but its function is unclear.	156
278994	pfam00599	Flu_M2	Influenza Matrix protein (M2). This protein spans the viral membrane with an extracellular amino-terminus and a cytoplasmic carboxy-terminus.	97
366187	pfam00600	Flu_NS1	Influenza non-structural protein (NS1). NS1 is a homodimeric RNA-binding protein that is required for viral replication. NS1 binds polyA tails of mRNA keeping them in the nucleus. NS1 inhibits pre-mRNA splicing by tightly binding to a specific stem-bulge of U6 snRNA.	217
278996	pfam00601	Flu_NS2	Influenza non-structural protein (NS2). NS2 may play a role in promoting normal replication of the genomic RNAs by preventing the replication of short-length RNA species.	108
395479	pfam00602	Flu_PB1	Influenza RNA-dependent RNA polymerase subunit PB1. Two GTP binding sites exist in this protein.	732
395480	pfam00603	Flu_PA	Influenza RNA-dependent RNA polymerase subunit PA. 	694
395481	pfam00604	Flu_PB2	Influenza RNA-dependent RNA polymerase subunit PB2. PB2 can bind 5' end cap structure of RNA.	754
395482	pfam00605	IRF	Interferon regulatory factor transcription factor. This family of transcription factors is important in the regulation of interferons in response to infection by virus and in the regulation of interferon-inducible genes. Three of the five conserved tryptophan residues bind to DNA.	106
334169	pfam00606	Glycoprotein_B	Herpesvirus Glycoprotein B ectodomain. This domain corresponds to the ectodomain of glycoprotein B according to ECOD.	222
395483	pfam00607	Gag_p24	gag gene protein p24 (core nucleocapsid protein). p24 forms the inner protein layer of the nucleocapsid. ELISA testing for p24 is the most commonly used method to demonstrate virus replication both in vivo and in vitro.	195
306963	pfam00608	Adeno_shaft	Adenoviral fibre protein (repeat/shaft region). There is no separation between signal and noise. Specific attachment of adenovirus is achieved through interactions between host-cell receptors and the adenovirus fibre protein and is mediated by the globular carboxy-terminal domain of the adenovirus fibre protein, rather than the 'shaft' region represented by this family. The alignment of this family contains two copies of a fifteen residue repeat found in the 'shaft' region of adenoviral fibre proteins.	34
395484	pfam00609	DAGK_acc	Diacylglycerol kinase accessory domain. Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. This domain is assumed to be an accessory domain: its function is unknown.	158
395485	pfam00610	DEP	Domain found in Dishevelled, Egl-10, and Pleckstrin (DEP). The DEP domain is responsible for mediating intracellular protein targeting and regulation of protein stability in the cell. The DEP domain is present in a number of signaling molecules, including Regulator of G protein Signaling (RGS) proteins, and has been implicated in membrane targeting. New findings in yeast, however, demonstrate a major role for a DEP domain in mediating the interaction of an RGS protein to the C-terminal tail of a GPCR, thus placing RGS in close proximity with its substrate G protein alpha subunit.	71
395486	pfam00611	FCH	Fes/CIP4, and EFC/F-BAR homology domain. Alignment extended from. Highly alpha-helical. The cytosolic endocytic adaptor proteins in fungi carry this domain at the N-terminus; several of these have been referred to as muniscin proteins. These N-terminal BAR, N-BAR, and EFC/F-BAR domains are found in proteins that regulate membrane trafficking events by inducing membrane tubulation. The domain dimerizes into a curved structure that binds to liposomes and either senses or induces the curvature of the membrane bilayer to cause biophysical changes to the shape of the bilayer; it also thereby recruits other trafficking factors, such as the GTPase dynamin. Most EFC/F-BAR domain-family members localize to actin-rich structures.	77
395487	pfam00612	IQ	IQ calmodulin-binding motif. Calmodulin-binding motif.	21
395488	pfam00613	PI3Ka	Phosphoinositide 3-kinase family, accessory domain (PIK domain). PIK domain is conserved in all PI3 and PI4-kinases. Its role is unclear but it has been suggested to be involved in substrate presentation.	185
395489	pfam00614	PLDc	Phospholipase D Active site motif. Phosphatidylcholine-hydrolysing phospholipase D (PLD) isoforms are activated by ADP-ribosylation factors (ARFs). PLD produces phosphatidic acid from phosphatidylcholine, which may be essential for the formation of certain types of transport vesicles or may be constitutive vesicular transport to signal transduction pathways. PC-hydrolysing PLD is a homolog of cardiolipin synthase, phosphatidylserine synthase, bacterial PLDs, and viral proteins. Each of these appears to possess a domain duplication which is apparent by the presence of two motifs containing well-conserved histidine, lysine, and/or asparagine residues which may contribute to the active site. aspartic acid. An E. coli endonuclease (nuc) and similar proteins appear to be PLD homologs but possess only one of these motifs. The profile contained here represents only the putative active site regions, since an accurate multiple alignment of the repeat units has not been achieved.	28
395490	pfam00615	RGS	Regulator of G protein signaling domain. RGS family members are GTPase-activating proteins for heterotrimeric G-protein alpha-subunits.	117
395491	pfam00616	RasGAP	GTPase-activator protein for Ras-like GTPase. All alpha-helical domain that accelerates the GTPase activity of Ras, thereby "switching" it into an "off" position.	206
395492	pfam00617	RasGEF	RasGEF domain. Guanine nucleotide exchange factor for Ras-like small GTPases.	179
395493	pfam00618	RasGEF_N	RasGEF N-terminal motif. A subset of guanine nucleotide exchange factors for Ras-like small GTPases appears to possess this motif/domain N-terminal to the RasGef (Cdc25-like) domain.	104
395494	pfam00619	CARD	Caspase recruitment domain. Motif contained in proteins involved in apoptotic signaling. Predicted to possess a DEATH (pfam00531) domain-like fold.	85
395495	pfam00620	RhoGAP	RhoGAP domain. GTPase activator proteins towards Rho/Rac/Cdc42-like small GTPases.	152
395496	pfam00621	RhoGEF	RhoGEF domain. Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases. Also called Dbl-homologous (DH) domain. It appears that pfam00169 domains invariably occur C-terminal to RhoGEF/DH domains.	177
395497	pfam00622	SPRY	SPRY domain. SPRY Domain is named from SPla and the RYanodine Receptor. Domain of unknown function. Distant homologs are domains in butyrophilin/marenostrin/pyrin homologs.	121
395498	pfam00623	RNA_pol_Rpb1_2	RNA polymerase Rpb1, domain 2. RNA polymerases catalyze the DNA dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 2, contains the active site. The invariant motif -NADFDGD- binds the active site magnesium ion.	166
395499	pfam00624	Flocculin	Flocculin repeat. This short repeat is rich in serine and threonine residues.	39
395500	pfam00625	Guanylate_kin	Guanylate kinase. 	182
395501	pfam00626	Gelsolin	Gelsolin repeat. 	76
395502	pfam00627	UBA	UBA/TS-N domain. This small domain is composed of three alpha helices. This family includes the previously defined UBA and TS-N domains. The UBA-domain (ubiquitin associated domain) is a novel sequence motif found in several proteins having connections to ubiquitin and the ubiquitination pathway. The structure of the UBA domain consists of a compact three helix bundle. This domain is found at the N-terminus of EF-TS hence the name TS-N. The structure of EF-TS is known and this domain is implicated in its interaction with EF-TU. The domain has been found in non EF-TS proteins such as alpha-NAC and MJ0280.	37
395503	pfam00628	PHD	PHD-finger. PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3.	51
395504	pfam00629	MAM	MAM domain, meprin/A5/mu. An extracellular domain found in many receptors. The MAM domain along with the associated Ig domain in type IIB receptor protein tyrosine phosphatases forms a structural unit (termed MIg) with a seamless interdomain interface. It plays a major role in homodimerization of the phosphatase ectoprotein and in cell adhesion. MAM is a beta-sandwich consisting of two five-stranded antiparallel beta-sheets rotated away from each other by approx 25 degrees, and plays a similar role in meprin metalloproteinases.	160
395505	pfam00630	Filamin	Filamin/ABP280 repeat. 	89
395506	pfam00631	G-gamma	GGL domain. G-protein gamma like domains (GGL) are found in the gamma subunit of the heterotrimeric G protein complex and in regulators of G protein signaling (RGS) proteins. It is also found fused to an inactive Galpha in the Dictyostelium protein gbqA. G-gamma likely shares a common origin with the helical N-terminal unit of G-beta. All organisms that possess a G-beta possess a G-gamma.	67
395507	pfam00632	HECT	HECT-domain (ubiquitin-transferase). The name HECT comes from Homologous to the E6-AP Carboxyl Terminus.	300
395508	pfam00633	HHH	Helix-hairpin-helix motif. The helix-hairpin-helix DNA-binding motif is found to be duplicated in the central domain of RuvA. The HhH domain of DisA, a bacterial checkpoint control protein, is a DNA-binding domain.	30
395509	pfam00634	BRCA2	BRCA2 repeat. The alignment covers only the most conserved region of the repeat.	31
395510	pfam00635	Motile_Sperm	MSP (Major sperm protein) domain. Major sperm proteins are involved in sperm motility. These proteins oligomerize to form filaments. This family contains many other proteins.	109
395511	pfam00636	Ribonuclease_3	Ribonuclease III domain. 	101
395512	pfam00637	Clathrin	Region in Clathrin and VPS. Each region is about 140 amino acids long. The regions are composed of multiple alpha helical repeats. They occur in the arm region of the Clathrin heavy chain.	137
395513	pfam00638	Ran_BP1	RanBP1 domain. 	122
395514	pfam00639	Rotamase	PPIC-type PPIASE domain. Rotamases increase the rate of protein folding by catalyzing the interconversion of cis-proline and trans-proline.	96
395515	pfam00640	PID	Phosphotyrosine interaction domain (PTB/PID). 	133
395516	pfam00641	zf-RanBP	Zn-finger in Ran binding protein and others. 	30
395517	pfam00642	zf-CCCH	Zinc finger C-x8-C-x5-C-x3-H type (and similar). 	27
395518	pfam00643	zf-B_box	B-box zinc finger. 	42
395519	pfam00644	PARP	Poly(ADP-ribose) polymerase catalytic domain. Poly(ADP-ribose) polymerase catalyzes the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active.	195
395520	pfam00645	zf-PARP	Poly(ADP-ribose) polymerase and DNA-Ligase Zn-finger region. Poly(ADP-ribose) polymerase is an important regulatory component of the cellular response to DNA damage. The amino-terminal region of Poly(ADP-ribose) polymerase consists of two PARP-type zinc fingers. This region acts as a DNA nick sensor.	87
395521	pfam00646	F-box	F-box domain. This domain is approximately 50 amino acids long, and is usually found in the N-terminal half of a variety of proteins. Two motifs that are commonly found associated with the F-box domain are the leucine rich repeats (LRRs; pfam00560 and pfam07723) and the WD repeat (pfam00400). The F-box domain has a role in mediating protein-protein interactions in a variety of contexts, such as polyubiquitination, transcription elongation, centromere binding and translational repression.	45
395522	pfam00647	EF1G	Elongation factor 1 gamma, conserved domain. 	105
395523	pfam00648	Peptidase_C2	Calpain family cysteine protease. 	296
395524	pfam00649	Copper-fist	Copper fist DNA binding domain. 	38
395525	pfam00650	CRAL_TRIO	CRAL/TRIO domain. 	152
395526	pfam00651	BTB	BTB/POZ domain. The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerization and in some instances heteromeric dimerization. The structure of the dimerized PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN.	107
395527	pfam00652	Ricin_B_lectin	Ricin-type beta-trefoil lectin domain. 	126
395528	pfam00653	BIR	Inhibitor of Apoptosis domain. BIR stands for 'Baculovirus Inhibitor of apoptosis protein Repeat'. It is found repeated in inhibitor of apoptosis proteins (IAPs), and in fact it is also known as IAP repeat. These domains characteristically have a number of invariant residues, including 3 conserved cysteines and one conserved histidine that coordinate a zinc ion. They are usually made up of 4-5 alpha helices and a three-stranded beta-sheet. BIR is also found in other proteins known as BIR-domain-containing proteins (BIRPs), such as Survivin.	67
395529	pfam00654	Voltage_CLC	Voltage gated chloride channel. This family of ion channels contains 10 or 12 transmembrane helices. Each protein forms a single pore. It has been shown that some members of this family form homodimers. In terms of primary structure, they are unrelated to known cation channels or other types of anion channels. Three ClC subfamilies are found in animals. ClC-1 is involved in setting and restoring the resting membrane potential of skeletal muscle, while other channels play important parts in solute concentration mechanisms in the kidney. These proteins contain two pfam00571 domains.	344
395530	pfam00656	Peptidase_C14	Caspase domain. 	232
395531	pfam00657	Lipase_GDSL	GDSL-like Lipase/Acylhydrolase. 	224
395532	pfam00658	PABP	Poly-adenylate binding protein, unique domain. The region featured in this family is found towards the C-terminus of poly(A)-binding proteins (PABPs). These are eukaryotic proteins that, through their binding of the 3' poly(A) tail on mRNA, have very important roles in the pathways of gene expression. They seem to provide a scaffold on which other proteins can bind and mediate processes such as export, translation and turnover of the transcripts. Moreover, they may act as antagonists to the binding of factors that allow mRNA degradation, regulating mRNA longevity. PABPs are also involved in nuclear transport. PABPs interact with poly(A) tails via RNA-recognition motifs (pfam00076). Note that the PABP C-terminal region is also found in members of the hyperplastic discs protein (HYD) family of ubiquitin ligases that contain HECT domains - these are also included in this family.	65
395533	pfam00659	POLO_box	POLO box duplicated region. 	66
395534	pfam00660	SRP1_TIP1	Seripauperin and TIP1 family. 	98
395535	pfam00661	Matrix	Viral matrix protein. Found in Morbillivirus and paramyxovirus, pneumovirus.	340
395536	pfam00662	Proton_antipo_N	NADH-Ubiquinone oxidoreductase (complex I), chain 5 N-terminus. This sub-family represents an amino terminal extension of pfam00361. Only NADH-Ubiquinone chain 5 and eubacterial chain L are in this family. This sub-family is part of complex I which catalyzes the transfer of two electrons from NADH to ubiquinone in a reaction that is associated with proton translocation across the membrane.	58
395537	pfam00664	ABC_membrane	ABC transporter transmembrane region. This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions.	274
395538	pfam00665	rve	Integrase core domain. Integrase mediates integration of a DNA copy of the viral genome into the host chromosome. Integrase is composed of three domains. The amino-terminal domain is a zinc binding domain pfam02022. This domain is the central catalytic domain. The carboxyl terminal domain is a non-specific DNA binding domain pfam00552. The catalytic domain acts as an endonuclease when two nucleotides are removed from the 3' ends of the blunt-ended viral DNA made by reverse transcription. This domain also catalyzes the DNA strand transfer reaction of the 3' ends of the viral DNA to the 5' ends of the integration site.	99
395539	pfam00666	Cathelicidins	Cathelicidin. A novel protein family, showing a conserved proregion and a variable carboxyl-terminal antimicrobial domain. This region shows similarity to cystatins.	101
395540	pfam00667	FAD_binding_1	FAD binding domain. This domain is found in sulfite reductase, NADPH cytochrome P450 reductase, Nitric oxide synthase and methionine synthase reductase.	219
395541	pfam00668	Condensation	Condensation domain. This domain is found in many multi-domain enzymes which synthesize peptide antibiotics. This domain catalyzes a condensation reaction to form peptide bonds in non-ribosomal peptide biosynthesis. It is usually found to the carboxy side of a phosphopantetheine binding domain (pfam00550). It has been shown that mutations in the HHXXXDG motif abolish activity suggesting this is part of the active site.	454
395542	pfam00669	Flagellin_N	Bacterial flagellin N-terminal helical region. Flagellins polymerize to form bacterial flagella. This family includes flagellins and hook associated protein 3. Structurally this family forms an extended helix that interacts with pfam00700.	139
395543	pfam00670	AdoHcyase_NAD	S-adenosyl-L-homocysteine hydrolase, NAD binding domain. 	162
395544	pfam00672	HAMP	HAMP domain. 	53
395545	pfam00673	Ribosomal_L5_C	ribosomal L5P family C-terminus. This region is found associated with pfam00281.	94
395546	pfam00674	DUP	DUP family. This family consists of several yeast proteins of unknown functions. Swiss-prot annotates these as belonging to the DUP family. Several members of this family contain an internal duplication of this region.	96
395547	pfam00675	Peptidase_M16	Insulinase (Peptidase family M16). 	149
395548	pfam00676	E1_dh	Dehydrogenase E1 component. This family uses thiamine pyrophosphate as a cofactor. This family includes pyruvate dehydrogenase, 2-oxoglutarate dehydrogenase and 2-oxoisovalerate dehydrogenase.	300
395549	pfam00677	Lum_binding	Lumazine binding domain. This domain binds to derivatives of lumazine in some proteins. Some proteins have lost the residues involved in binding lumazine.	83
395550	pfam00679	EFG_C	Elongation factor G C-terminus. This domain includes the carboxyl terminal regions of Elongation factor G, elongation factor 2 and some tetracycline resistance proteins and adopts a ferredoxin-like fold.	88
395551	pfam00680	RdRP_1	RNA dependent RNA polymerase. 	453
395552	pfam00681	Plectin	Plectin repeat. This family includes repeats from plectin, desmoplakin, envoplakin and bullous pemphigoid antigen.	40
395553	pfam00682	HMGL-like	HMGL-like. This family contains a diverse set of enzymes. These include various aldolases and a region of pyruvate carboxylase.	264
395554	pfam00683	TB	TB domain. This domain is also known as the 8 cysteine domain. This family includes the hybrid domains. This cysteine rich repeat is found in TGF binding protein and fibrillin.	42
395555	pfam00684	DnaJ_CXXCXGXG	DnaJ central domain. The central cysteine-rich (CR) domain of DnaJ proteins contains four repeats of the motif CXXCXGXG where X is any amino acid. The isolated cysteine rich domain folds in zinc dependent fashion. Each set of two repeats binds one unit of zinc. Although this domain has been implicated in substrate binding, no evidence of specific interaction between the isolated DNAJ cysteine rich domain and various hydrophobic peptides has been found.	61
395556	pfam00685	Sulfotransfer_1	Sulfotransferase domain. 	253
395557	pfam00686	CBM_20	Starch binding domain. 	95
395558	pfam00687	Ribosomal_L1	Ribosomal protein L1p/L10e family. This family includes prokaryotic L1 and eukaryotic L10.	197
395559	pfam00688	TGFb_propeptide	TGF-beta propeptide. This propeptide is known as latency associated peptide (LAP) in TGF-beta. LAP is a homodimer which is disulfide linked to TGF-beta binding protein.	229
376368	pfam00689	Cation_ATPase_C	Cation transporting ATPase, C-terminus. Members of this family are involved in Na+/K+, H+/K+, Ca++ and Mg++ transport. This family represents 5 transmembrane helices.	175
395560	pfam00690	Cation_ATPase_N	Cation transporter/ATPase, N-terminus. Members of this family are involved in Na+/K+, H+/K+, Ca++ and Mg++ transport.	69
395561	pfam00691	OmpA	OmpA family. The Pfam entry also includes MotB and related proteins which are not included in the Prosite family.	94
395562	pfam00692	dUTPase	dUTPase. dUTPase hydrolyzes dUTP to dUMP and pyrophosphate.	129
395563	pfam00693	Herpes_TK	Thymidine kinase from herpesvirus. 	280
395564	pfam00694	Aconitase_C	Aconitase C-terminal domain. Members of this family usually also match to pfam00330. This domain undergoes conformational change in the enzyme mechanism.	131
366252	pfam00695	vMSA	Major surface antigen from hepadnavirus. 	394
395565	pfam00696	AA_kinase	Amino acid kinase family. This family includes kinases that phosphorylate a variety of amino acid substrates, as well as uridylate kinase and carbamate kinase. This family includes: Aspartokinase EC:2.7.2.4. Acetylglutamate kinase EC:2.7.2.8. Glutamate 5-kinase EC:2.7.2.11. Uridylate kinase EC:2.7.4.-. Carbamate kinase EC:2.7.2.2.	232
395566	pfam00697	PRAI	N-(5'phosphoribosyl)anthranilate (PRA) isomerase. 	193
395567	pfam00698	Acyl_transf_1	Acyl transferase domain. 	319
395568	pfam00699	Urease_beta	Urease beta subunit. This subunit is known as alpha in Helicobacter.	98
395569	pfam00700	Flagellin_C	Bacterial flagellin C-terminal helical region. Flagellins polymerize to form bacterial flagella. There is some similarity between this family and pfam00669, particularly the motif NRFXSXIXXL. It has been suggested that these two regions associate and this is shown to be correct as structurally this family forms an extended helix that interacts with pfam00669.	86
395570	pfam00701	DHDPS	Dihydrodipicolinate synthetase family. This family has a TIM barrel structure.	289
395571	pfam00702	Hydrolase	haloacid dehalogenase-like hydrolase. This family is structurally different from the alpha/beta hydrolase family (pfam00561). This family includes L-2-haloacid dehalogenase, epoxide hydrolases and phosphatases. The structure of the family consists of two domains. One is an inserted four helix bundle, which is the least well conserved region of the alignment, between residues 16 and 96 of Pseudomonas sp. (S)-2-haloacid dehalogenase 1. The rest of the fold is composed of the core alpha/beta domain. Those members with the characteristic DxD triad at the N-terminus are probably phosphatidylglycerolphosphate (PGP) phosphatases involved in cardiolipin biosynthesis in the mitochondria.	191
395572	pfam00703	Glyco_hydro_2	Glycosyl hydrolases family 2. This family contains beta-galactosidase, beta-mannosidase and beta-glucuronidase activities.	106
395573	pfam00704	Glyco_hydro_18	Glycosyl hydrolases family 18. 	307
395574	pfam00705	PCNA_N	Proliferating cell nuclear antigen, N-terminal domain. N-terminal and C-terminal domains of PCNA are topologically identical. Three PCNA molecules are tightly associated to form a closed ring encircling duplex DNA.	125
109750	pfam00706	Toxin_4	Anemone neurotoxin. 	43
395575	pfam00707	IF3_C	Translation initiation factor IF-3, C-terminal domain. 	86
395576	pfam00708	Acylphosphatase	Acylphosphatase. 	85
395577	pfam00709	Adenylsucc_synt	Adenylosuccinate synthetase. 	418
395578	pfam00710	Asparaginase	Asparaginase, N-terminal. This is the N-terminal domain of this enzyme.	188
395579	pfam00711	Defensin_beta	Beta defensin. The beta defensins are antimicrobial peptides implicated in the resistance of epithelial surfaces to microbial colonisation.	36
395580	pfam00712	DNA_pol3_beta	DNA polymerase III beta subunit, N-terminal domain. A dimer of the beta subunit of DNA polymerase III forms a ring which encircles duplex DNA. Each monomer contains three domains of identical topology and DNA clamp fold.	121
307040	pfam00713	Hirudin	Hirudin. 	64
395581	pfam00714	IFN-gamma	Interferon gamma. 	138
366262	pfam00715	IL2	Interleukin 2. 	144
279105	pfam00716	Peptidase_S21	Assemblin (Peptidase family S21). 	336
395582	pfam00717	Peptidase_S24	Peptidase S24-like. 	116
307044	pfam00718	Polyoma_coat	Polyomavirus coat protein. 	293
395583	pfam00719	Pyrophosphatase	Inorganic pyrophosphatase. 	154
395584	pfam00720	SSI	Subtilisin inhibitor-like. 	92
307047	pfam00721	TMV_coat	Virus coat protein (TMV like). This family contains coat proteins from tobamoviruses, hordeiviruses, Tobraviruses, Furoviruses and Potyviruses.	163
395585	pfam00722	Glyco_hydro_16	Glycosyl hydrolases family 16. 	168
395586	pfam00723	Glyco_hydro_15	Glycosyl hydrolases family 15. In higher organisms this family is represented by phosphorylase kinase subunits.	417
395587	pfam00724	Oxidored_FMN	NADH:flavin oxidoreductase / NADH oxidase family. 	341
395588	pfam00725	3HCDH	3-hydroxyacyl-CoA dehydrogenase, C-terminal domain. This family also includes lambda crystallin. Some proteins include two copies of this domain.	97
334228	pfam00726	IL10	Interleukin 10. 	170
395589	pfam00727	IL4	Interleukin 4. 	116
395590	pfam00728	Glyco_hydro_20	Glycosyl hydrolase family 20, catalytic domain. This domain has a TIM barrel fold.	345
395591	pfam00729	Viral_coat	Viral coat protein (S domain). 	204
395592	pfam00730	HhH-GPD	HhH-GPD superfamily base excision DNA repair protein. This family contains a diverse range of structurally related DNA repair proteins. The superfamily is called the HhH-GPD family after its hallmark Helix-hairpin-helix and Gly/Pro rich loop followed by a conserved aspartate. This includes endonuclease III, EC:4.2.99.18 and MutY an A/G-specific adenine glycosylase, both have a C terminal 4Fe-4S cluster. The family also includes 8-oxoguanine DNA glycosylases. The methyl-CPG binding protein MBD4 also contains a related domain that is a thymine DNA glycosylase. The family also includes DNA-3-methyladenine glycosylase II EC:3.2.2.21 and other members of the AlkA family.	142
395593	pfam00731	AIRC	AIR carboxylase. Members of this family catalyze the decarboxylation of 1-(5-phosphoribosyl)-5-amino-4-imidazole-carboxylate (AIR). This family catalyzes the sixth step of de novo purine biosynthesis. Some members of this family contain two copies of this domain.	147
366272	pfam00732	GMC_oxred_N	GMC oxidoreductase. This family of proteins bind FAD as a cofactor.	218
395594	pfam00733	Asn_synthase	Asparagine synthase. This family is always found associated with pfam00310. Members of this family catalyze the conversion of aspartate to asparagine.	279
395595	pfam00734	CBM_1	Fungal cellulose binding domain. 	29
395596	pfam00735	Septin	Septin. Members of this family include CDC3, CDC10, CDC11 and CDC12/Septin. Members of this family bind GTP. As regards the septins, these are polypeptides of 30-65kDa with three characteristic GTPase motifs (G-1, G-3 and G-4) that are similar to those of the Ras family. The G-4 motif is strictly conserved with a unique septin consensus of AKAD. Most septins are thought to have at least one coiled-coil region, which in some cases is necessary for intermolecular interactions that allow septins to polymerize to form rod-shaped complexes. In turn, these are arranged into tandem arrays to form filaments. They are multifunctional proteins, with roles in cytokinesis, sporulation, germ cell development, exocytosis and apoptosis.	272
395597	pfam00736	EF1_GNE	EF-1 guanine nucleotide exchange domain. This family is the guanine nucleotide exchange domain of EF-1 beta and EF-1 delta chains.	83
395598	pfam00737	PsbH	Photosystem II 10 kDa phosphoprotein. This protein is phosphorylated in a light dependent reaction.	52
307060	pfam00738	Polyhedrin	Polyhedrin. These proteins are found in occlusion bodies in various viruses. The polyhedrin protein protects the virus.	232
109783	pfam00739	X	Trans-activation protein X. This protein is found in hepadnaviruses where it is indispensable for replication.	142
395599	pfam00740	Parvo_coat	Parvovirus coat protein VP2. This protein, together with VP1 forms a capsomer. Both of these proteins are formed from the same transcript using alternative splicing. As a result, VP1 and VP2 differ only in the N-terminal region of VP1. VP2 is involved in packaging the viral DNA.	518
395600	pfam00741	Gas_vesicle	Gas vesicle protein. 	39
395601	pfam00742	Homoserine_dh	Homoserine dehydrogenase. 	178
395602	pfam00743	FMO-like	Flavin-binding monooxygenase-like. This family includes FMO proteins, cyclohexanone mono-oxygenase and a number of different mono-oxygenases.	531
395603	pfam00745	GlutR_dimer	Glutamyl-tRNAGlu reductase, dimerization domain. 	94
366278	pfam00746	Gram_pos_anchor	LPXTG cell wall anchor motif. 	43
395604	pfam00747	Viral_DNA_bp	ssDNA binding protein. This protein is found in herpesviruses and is needed for replication.	1120
395605	pfam00748	Calpain_inhib	Calpain inhibitor. This region is found multiple times in calpain inhibitor proteins.	130
395606	pfam00749	tRNA-synt_1c	tRNA synthetases class I (E and Q), catalytic domain. Other tRNA synthetase sub-families are too dissimilar to be included. This family includes only glutamyl and glutaminyl tRNA synthetases. In some organisms, a single glutamyl-tRNA synthetase aminoacylates both tRNA(Glu) and tRNA(Gln).	314
395607	pfam00750	tRNA-synt_1d	tRNA synthetases class I (R). Other tRNA synthetase sub-families are too dissimilar to be included. This family includes only arginyl tRNA synthetase.	348
395608	pfam00751	DM	DM DNA binding domain. The DM domain is named after dsx and mab-3. dsx contains a single amino-terminal DM domain, whereas mab-3 contains two amino-terminal domains. The DM domain has a pattern of conserved zinc chelating residues C2H2C4. The dsx DM domain has been shown to dimerize and bind palindromic DNA.	47
395609	pfam00752	XPG_N	XPG N-terminal domain. 	100
395610	pfam00753	Lactamase_B	Metallo-beta-lactamase superfamily. 	196
395611	pfam00754	F5_F8_type_C	F5/8 type C domain. This domain is also known as the discoidin (DS) domain family.	127
395612	pfam00755	Carn_acyltransf	Choline/Carnitine o-acyltransferase. 	578
395613	pfam00756	Esterase	Putative esterase. This family contains Esterase D. However it is not clear if all members of the family have the same function. This family is related to the pfam00135 family.	246
395614	pfam00757	Furin-like	Furin-like cysteine rich region. 	143
395615	pfam00758	EPO_TPO	Erythropoietin/thrombopoietin. 	160
395616	pfam00759	Glyco_hydro_9	Glycosyl hydrolase family 9. 	374
395617	pfam00760	Cucumo_coat	Cucumovirus coat protein. 	175
366290	pfam00761	Polyoma_coat2	Polyomavirus coat protein. 	322
395618	pfam00762	Ferrochelatase	Ferrochelatase. 	315
395619	pfam00763	THF_DHG_CYH	Tetrahydrofolate dehydrogenase/cyclohydrolase, catalytic domain. 	117
279148	pfam00764	Arginosuc_synth	Arginosuccinate synthase. This family contains a PP-loop motif.	386
366292	pfam00765	Autoind_synth	Autoinducer synthase. 	182
395620	pfam00766	ETF_alpha	Electron transfer flavoprotein FAD-binding domain. This domain is found at the C-terminus of the electron transfer flavoprotein alpha chain and binds to FAD. The fold consists of a five-stranded parallel beta sheet as the core of the domain, flanked by alternating helices. A small part of this domain is donated by the beta chain.	83
279151	pfam00767	Poty_coat	Potyvirus coat protein. 	243
395621	pfam00768	Peptidase_S11	D-alanyl-D-alanine carboxypeptidase. 	236
395622	pfam00769	ERM	Ezrin/radixin/moesin family. This family of proteins contain a band 4.1 domain (pfam00373), at their amino terminus. This family represents the rest of these proteins.	244
279154	pfam00770	Peptidase_C5	Adenovirus endoprotease. This family of adenovirus thiol endoproteases specifically cleave Gly-Ala peptides in viral precursor peptides.	179
395623	pfam00771	FHIPEP	FHIPEP family. 	657
395624	pfam00772	DnaB	DnaB-like helicase N terminal domain. The hexameric helicase DnaB unwinds the DNA duplex at the Escherichia coli chromosome replication fork. Although the mechanism by which DnaB both couples ATP hydrolysis to translocation along DNA and denatures the duplex is unknown, a change in the quaternary structure of the protein involving dimerization of the N-terminal domain has been observed and may occur during the enzymatic cycle. This N-terminal domain is required both for interaction with other proteins in the primosome and for DnaB helicase activity.	103
395625	pfam00773	RNB	RNB domain. This domain is the catalytic domain of ribonuclease II.	317
395626	pfam00775	Dioxygenase_C	Dioxygenase. 	182
395627	pfam00777	Glyco_transf_29	Glycosyltransferase family 29 (sialyltransferase). Members of this family belong to glycosyltransferase family 29.	267
395628	pfam00778	DIX	DIX domain. The DIX domain is present in Dishevelled and axin. This domain is involved in homo- and hetero-oligomerization. It is involved in the homo-oligomerization of mouse axin. The axin DIX domain also interacts with the dishevelled DIX domain. The DIX domain has also been called the DAX domain.	79
395629	pfam00779	BTK	BTK motif. Zinc-binding motif containing conserved cysteines and a histidine. Always found C-terminal to PH domains. The crystal structure shows this motif packs against the PH domain. The PH+Btk module pair has been called the Tec homology (TH) region.	29
395630	pfam00780	CNH	CNH domain. Domain found in NIK1-like kinase, mouse citron and yeast ROM1, ROM2. Unpublished observations.	260
395631	pfam00781	DAGK_cat	Diacylglycerol kinase catalytic domain. Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. The catalytic domain is assumed from the finding of bacterial homologs. YegS is the Escherichia coli protein in this family whose crystal structure reveals an active site in the inter-domain cleft formed by four conserved sequence motifs, revealing a novel metal-binding site. The residues of this site are conserved across the family.	125
395632	pfam00782	DSPc	Dual specificity phosphatase, catalytic domain. Ser/Thr and Tyr protein phosphatases. The enzyme's tertiary fold is highly similar to that of tyrosine-specific phosphatases, except for a "recognition" region.	127
395633	pfam00784	MyTH4	MyTH4 domain. Domain in myosin and kinesin tails, present twice in myosin-VIIa, and also present in 3 other myosins.	104
395634	pfam00786	PBD	P21-Rho-binding domain. Small domains that bind Cdc42p- and/or Rho-like small GTPases. Also known as the Cdc42/Rac interactive binding (CRIB).	59
395635	pfam00787	PX	PX domain. PX domains bind to phosphoinositides.	84
395636	pfam00788	RA	Ras association (RalGDS/AF-6) domain. RasGTP effectors (in cases of AF6, canoe and RalGDS); putative RasGTP effectors in other cases. Recent evidence (not yet in MEDLINE) shows that some RA domains do NOT bind RasGTP. Predicted structure similar to that determined, and that of the RasGTP-binding domain of Raf kinase.	93
395637	pfam00789	UBX	UBX domain. This domain is present in ubiquitin-regulatory proteins and is a general Cdc48-interacting module.	80
395638	pfam00790	VHS	VHS domain. Domain present in VPS-27, Hrs and STAM.	136
395639	pfam00791	ZU5	ZU5 domain. Domain present in ZO-1 and Unc5-like netrin receptors. Domain of unknown function.	98
395640	pfam00792	PI3K_C2	Phosphoinositide 3-kinase C2. Phosphoinositide 3-kinase region postulated to contain a C2 domain. Outlier of pfam00168 family.	136
395641	pfam00793	DAHP_synth_1	DAHP synthetase I family. Members of this family catalyze the first step in aromatic amino acid biosynthesis from chorismate. E. coli has three related synthetases, which are inhibited by different aromatic amino acids. This family also includes KDSA which has very similar catalytic activity but is involved in the first step of lipopolysaccharide biosynthesis. The enzyme is also part of the shikimate pathway, EC:2.5.1.54.	271
395642	pfam00794	PI3K_rbd	PI3-kinase family, ras-binding domain. Certain members of the PI3K family possess Ras-binding domains in their N-termini. These regions show some similarity (although not highly significant similarity) to Ras-binding pfam00788 domains (unpublished observation).	106
395643	pfam00795	CN_hydrolase	Carbon-nitrogen hydrolase. This family contains hydrolases that break carbon-nitrogen bonds. The family includes: Nitrilase EC:3.5.5.1, Aliphatic amidase EC:3.5.1.4, Biotinidase EC:3.5.1.12, Beta-ureidopropionase EC:3.5.1.6. Nitrilase-related proteins generally have a conserved E-K-C catalytic triad, and are multimeric alpha-beta-beta-alpha sandwich proteins.	257
279175	pfam00796	PSI_8	Photosystem I reaction centre subunit VIII. 	24
395644	pfam00797	Acetyltransf_2	N-acetyltransferase. Arylamine N-acetyltransferase (NAT) is a cytosolic enzyme of approximately 30kDa. It facilitates the transfer of an acetyl group from Acetyl Coenzyme A on to a wide range of arylamine, N-hydroxyarylamines and hydrazines. Acetylation of these compounds generally results in inactivation. NAT is found in many species from Mycobacteria (M. tuberculosis, M. smegmatis etc) to man. It was the first enzyme to be observed to have polymorphic activity amongst human individuals. NAT is responsible for the inactivation of Isoniazid (a drug used to treat Tuberculosis) in humans. The NAT protein has also been shown to be involved in the breakdown of folic acid.	240
279177	pfam00798	Arena_glycoprot	Arenavirus glycoprotein. 	483
366313	pfam00799	Gemini_AL1	Geminivirus Rep catalytic domain. The AL1 gene encodes the replication initiator protein (Rep) of geminiviruses, which is a replicon-specific initiator enzyme and is an essential component of the replisome. For geminivirus Rep protein, this N-terminal region is crucial for origin recognition, DNA cleavage and nucleotidyl transfer.	113
395645	pfam00800	PDT	Prephenate dehydratase. This protein is involved in Phenylalanine biosynthesis. This protein catalyzes the decarboxylation of prephenate to phenylpyruvate.	181
395646	pfam00801	PKD	PKD domain. This domain was first identified in the Polycystic kidney disease protein PKD1. This domain has been predicted to contain an Ig-like fold.	70
144411	pfam00802	Glycoprotein_G	Pneumovirus attachment glycoprotein G. This family includes attachment proteins from respiratory syncytial virus. Glycoprotein G has not been shown to have any neuraminidase or hemagglutinin activity. The amino terminus is thought to be cytoplasmic, and the carboxyl terminus extracellular. The extracellular region contains four completely conserved cysteine residues.	263
279181	pfam00803	3A	3A/RNA2 movement protein family. This family includes movement proteins from various viruses. The 3A protein is found in bromoviruses and Cucumoviruses. The genome of these viruses contains 3 RNA segments. The third segment (RNA 3) encodes two proteins, the coat protein and the 3A protein. The function of the 3A protein is uncertain, but it has been shown to be involved in cell-to-cell movement of the virus. The family also includes movement proteins from Dianthoviruses.	225
395647	pfam00804	Syntaxin	Syntaxin. Syntaxins are the prototype family of SNARE proteins. They usually consist of three main regions - a C-terminal transmembrane region, a central SNARE domain which is characteristic of and conserved in all syntaxins (pfam05739), and an N-terminal domain that is featured in this entry. This domain varies between syntaxin isoforms; in syntaxin 1A it is found as three alpha-helices with a left-handed twist. It may fold back on the SNARE domain to allow the molecule to adopt a 'closed' configuration that prevents formation of the core fusion complex - it thus has an auto-inhibitory role. The function of syntaxins is determined by their localization. They are involved in neuronal exocytosis, ER-Golgi transport and Golgi-endosome transport, for example. They also interact with other proteins as well as those involved in SNARE complexes. These include vesicle coat proteins, Rab GTPases, and tethering factors.	200
395648	pfam00805	Pentapeptide	Pentapeptide repeats (8 copies). These repeats are found in many cyanobacterial proteins. The repeats were first identified in hglK. The function of these repeats is unknown. The structure of this repeat has been predicted to be a beta-helix. The repeat can be approximately described as A(D/N)LXX, where X can be any amino acid.	40
395649	pfam00806	PUF	Pumilio-family RNA binding repeat. Puf repeats (aka PUM-HD, Pumilio homology domain) are necessary and sufficient for sequence specific RNA binding in fly Pumilio and worm FBF-1 and FBF-2. Both proteins function as translational repressors in early embryonic development by binding sequences in the 3' UTR of target mRNAs (e.g. the nanos response element (NRE) in fly Hunchback mRNA, or the point mutation element (PME) in worm fem-3 mRNA). Other proteins that contain Puf domains are also plausible RNA binding proteins. Puf domains usually occur as a tandem repeat of 8 domains. The Pfam model does not necessarily recognize all 8 repeats in all sequences; some sequences appear to have 5 or 6 repeats on initial analysis, but further analysis suggests the presence of additional divergent repeats. Structures of PUF repeat proteins show they consist of a two helix structure.	35
366317	pfam00807	Apidaecin	Apidaecin. These antibacterial peptides are found in bees. These heat-stable, non-helical peptides are active against a wide range of plant-associated bacteria and some human pathogens. The Pfam alignment includes the propeptide and apidaecin sequence.	30
395650	pfam00808	CBFD_NFYB_HMF	Histone-like transcription factor (CBF/NF-Y) and archaeal histone. This family includes archaebacterial histones and histone like transcription factors from eukaryotes.	65
395651	pfam00809	Pterin_bind	Pterin binding enzyme. This family includes a variety of pterin binding enzymes that all adopt a TIM barrel fold. The family includes dihydropteroate synthase EC:2.5.1.15 as well as a group of methyltransferase enzymes including methyltetrahydrofolate, corrinoid iron-sulfur protein methyltransferase (MeTr) that catalyzes a key step in the Wood-Ljungdahl pathway of carbon dioxide fixation. It transfers the N5-methyl group from methyltetrahydrofolate (CH3-H4folate) to a cob(I)amide centre in another protein, the corrinoid iron-sulfur protein. MeTr is a member of a family of proteins that includes methionine synthase and methanogenic enzymes that activate the methyl group of methyltetra-hydromethano(or -sarcino)pterin.	243
395652	pfam00810	ER_lumen_recept	ER lumen protein retaining receptor. 	143
395653	pfam00811	Ependymin	Ependymin. 	124
395654	pfam00812	Ephrin	Ephrin. 	137
395655	pfam00813	FliP	FliP family. 	191
395656	pfam00814	Peptidase_M22	Glycoprotease family. The Peptidase M22 proteins are part of the HSP70-actin superfamily. The region represented here is an insert into the fold and is not found in the rest of the family (beyond the Peptidase M22 family). Included in this family are the Rhizobial NodU proteins and the HypF regulator. This region also contains the histidine dyad believed to coordinate the metal ion and hence provide catalytic activity. Interestingly the histidines are not well conserved, and there is a lack of experimental evidence to support peptidase activity as a general property of this family. There also appear to be instances of this domain outside of the HSP70-actin superfamily.	272
395657	pfam00815	Histidinol_dh	Histidinol dehydrogenase. 	410
395658	pfam00816	Histone_HNS	H-NS histone family. 	91
395659	pfam00817	IMS	impB/mucB/samB family. These proteins are involved in UV protection.	148
307113	pfam00818	Ice_nucleation	Ice nucleation protein repeat. 	15
109859	pfam00819	Myotoxins	Myotoxin, crotamine. Crotamine is a family of cationic peptides expressed by the venom gland of, for example, Crotalus durissus terrificus. It acts as a cell-penetrating peptide (CPP) and as a potent voltage-gated potassium channel (Kv) inhibitor.	42
395660	pfam00820	Lipoprotein_1	Borrelia lipoprotein. This family of lipoproteins is found in Borrelia spirochetes. The function of these proteins is uncertain.	259
395661	pfam00821	PEPCK_C	Phosphoenolpyruvate carboxykinase C-terminal P-loop domain. This enzyme catalyzes the formation of phosphoenolpyruvate by decarboxylation of oxaloacetate.	358
395662	pfam00822	PMP22_Claudin	PMP-22/EMP/MP20/Claudin family. 	162
395663	pfam00823	PPE	PPE family. This family is named after a PPE motif near the amino terminus of the domain. The PPE family of proteins all contain an amino-terminal region of about 180 amino acids. The carboxyl termini of this family are variable, and on the basis of this region the proteins fall into at least three groups. The MPTR subgroup has tandem copies of a motif NXGXGNXG. The second subgroup contains a conserved motif at about position 350. The third group are only related in the amino terminal region. The function of these proteins is uncertain but it has been suggested that they may be related to antigenic variation of Mycobacterium tuberculosis.	158
395664	pfam00825	Ribonuclease_P	Ribonuclease P. 	107
395665	pfam00827	Ribosomal_L15e	Ribosomal L15. 	191
395666	pfam00828	Ribosomal_L27A	Ribosomal proteins 50S-L15, 50S-L18e, 60S-L27A. This family includes higher eukaryotic ribosomal 60S L27A, archaeal 50S L18e, prokaryotic 50S L15, fungal mitochondrial L10, plant L27A, mitochondrial L15 and chloroplast L18-3 proteins.	127
395667	pfam00829	Ribosomal_L21p	Ribosomal prokaryotic L21 protein. 	100
395668	pfam00830	Ribosomal_L28	Ribosomal L28 family. The ribosomal L28 family includes L28 proteins from bacteria and chloroplasts. The L24 protein from yeast also contains a region of similarity to prokaryotic L28 proteins. L24 from yeast is also found in the large ribosomal subunit.	58
395669	pfam00831	Ribosomal_L29	Ribosomal L29 protein. 	56
395670	pfam00832	Ribosomal_L39	Ribosomal L39 protein. 	42
395671	pfam00833	Ribosomal_S17e	Ribosomal S17. 	122
395672	pfam00834	Ribul_P_3_epim	Ribulose-phosphate 3 epimerase family. This enzyme catalyzes the conversion of D-ribulose 5-phosphate into D-xylulose 5-phosphate.	198
395673	pfam00835	SNAP-25	SNAP-25 family. SNAP-25 (synaptosome-associated protein 25 kDa) proteins are components of SNARE complexes. Members of this family contain a cluster of cysteine residues that can be palmitoylated for membrane attachment.	55
395674	pfam00836	Stathmin	Stathmin family. The Stathmin family of proteins play an important role in the regulation of the microtubule cytoskeleton. They regulate microtubule dynamics by promoting depolymerization of microtubules and/or preventing polymerization of tubulin heterodimers.	136
279210	pfam00837	T4_deiodinase	Iodothyronine deiodinase. Iodothyronine deiodinase converts thyroxine (T4) to 3,5,3'-triiodothyronine (T3).	237
395675	pfam00838	TCTP	Translationally controlled tumor protein. 	165
395676	pfam00839	Cys_rich_FGFR	Cysteine rich repeat. This cysteine rich repeat contains four cysteines. It is found in multiple copies in a protein that binds to fibroblast growth factors. The repeat is also found in MG160 and E-selectin ligand (ESL-1).	58
395677	pfam00840	Glyco_hydro_7	Glycosyl hydrolase family 7. 	434
395678	pfam00841	Protamine_P2	Sperm histone P2. This protein, also known as protamine P2, can substitute for histones in the chromatin of sperm. The alignment contains both the sequence of the mature P2 protein and its propeptide.	89
395679	pfam00842	Ala_racemase_C	Alanine racemase, C-terminal domain. 	126
334282	pfam00843	Arena_nucleocap	Arenavirus nucleocapsid N-terminal domain. This N-terminal domain folds into a novel structure with a deep cavity for binding the m7GpppN cap structure that is required for viral RNA transcription.	334
307130	pfam00844	Gemini_coat	Geminivirus coat protein/nuclear export factor BR1 family. It has been shown that the 104 N-terminal amino acids of the maize streak virus coat protein bind DNA non-specifically. This family also includes various geminivirus movement proteins that are nuclear export factors or shuttles. One member, BR1, facilitates the export of both ds and ss DNA from the nucleus.	244
307131	pfam00845	Gemini_BL1	Geminivirus BL1 movement protein. Geminiviruses encode two movement proteins that are essential for systemic infection of their host but dispensable for replication and encapsidation.	276
279218	pfam00846	Hanta_nucleocap	Hantavirus nucleocapsid protein. 	429
395680	pfam00847	AP2	AP2 domain. This 60 amino acid residue domain can bind to DNA and is found in transcription factor proteins.	52
395681	pfam00848	Ring_hydroxyl_A	Ring hydroxylating alpha subunit (catalytic domain). This family is the catalytic domain of aromatic-ring-hydroxylating dioxygenase systems. The active site contains a non-heme ferrous ion coordinated by three ligands.	210
376401	pfam00849	PseudoU_synth_2	RNA pseudouridylate synthase. Members of this family are involved in modifying bases in RNA molecules. They carry out the conversion of uracil bases to pseudouridine. This family includes RluD, a pseudouridylate synthase that converts specific uracils to pseudouridine in 23S rRNA. RluA from E. coli converts bases in both rRNA and tRNA.	151
395682	pfam00850	Hist_deacetyl	Histone deacetylase domain. Histones can be reversibly acetylated on several lysine residues. Regulation of transcription is caused in part by this mechanism. Histone deacetylases catalyze the removal of the acetyl group. Histone deacetylases are related to other proteins.	297
279223	pfam00851	Peptidase_C6	Helper component proteinase. This protein is found in genome polyproteins of potyviruses.	440
395683	pfam00852	Glyco_transf_10	Glycosyltransferase family 10 (fucosyltransferase) C-term. This is the C-terminal domain of a family of fucosyltransferases. This enzyme transfers fucose from GDP-Fucose to GlcNAc in an alpha1,3 linkage. This family is known as glycosyltransferase family 10. The C-terminal domain is the likely binding-region for ADP (manuscript in publication).	173
395684	pfam00853	Runt	Runt domain. 	127
395685	pfam00854	PTR2	POT family. The POT (proton-dependent oligopeptide transport) family all appear to be proton dependent transporters.	392
395686	pfam00855	PWWP	PWWP domain. The PWWP domain is named after a conserved Pro-Trp-Trp-Pro motif. The domain binds to Histone-4 methylated at lysine-20, H4K20me, suggesting that it is a methyl-lysine recognition motif. Removal of two conserved aromatic residues in a hydrophobic cavity created by this domain within the full-length protein, Pdp1, abolishes the interaction of the protein with H4K20me3. In fission yeast, Set9 is the sole enzyme that catalyzes all three states of H4K20me, and Set9-mediated H4K20me is required for efficient recruitment of checkpoint protein Crb2 to sites of DNA damage. The methylation of H4K20 is involved in a diverse array of cellular processes, such as organising higher-order chromatin, maintaining genome stability, and regulating cell-cycle progression.	95
395687	pfam00856	SET	SET domain. SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure.	115
376404	pfam00857	Isochorismatase	Isochorismatase family. This family are hydrolase enzymes.	173
395688	pfam00858	ASC	Amiloride-sensitive sodium channel. 	405
395689	pfam00859	CTF_NFI	CTF/NF-I family transcription modulation region. 	292
395690	pfam00860	Xan_ur_permease	Permease family. This family includes permeases for diverse substrates such as xanthine, uracil, and vitamin C. However many members of this family are functionally uncharacterized and may transport other substrates. Members of this family have ten predicted transmembrane helices.	389
395691	pfam00861	Ribosomal_L18p	Ribosomal L18 of archaea, bacteria, mitoch. and chloroplast. This family includes the large subunit ribosomal proteins from bacteria, archaea, the mitochondria and the chloroplast. It does not include the 60S L18 or L5 proteins from Metazoa.	116
395692	pfam00862	Sucrose_synth	Sucrose synthase. Sucrose synthases catalyze the synthesis of sucrose from UDP-glucose and fructose. This family includes the bulk of the sucrose synthase protein. However the carboxyl terminal region of the sucrose synthases belongs to the glycosyl transferase family pfam00534.	540
279235	pfam00863	Peptidase_C4	Peptidase family C4. This peptidase is present in the nuclear inclusion protein of potyviruses.	243
395693	pfam00864	P2X_receptor	ATP P2X receptor. 	363
395694	pfam00865	Osteopontin	Osteopontin. 	291
395695	pfam00866	Ring_hydroxyl_B	Ring hydroxylating beta subunit. This subunit has a similar structure to NTF-2 and scytalone dehydratase.	144
395696	pfam00867	XPG_I	XPG I-region. 	90
395697	pfam00868	Transglut_N	Transglutaminase family. 	117
395698	pfam00869	Flavi_glycoprot	Flavivirus glycoprotein, central and dimerization domains. 	300
395699	pfam00870	P53	P53 DNA-binding domain. This family contains one anomalous member, viz: Zea mays (Q6JAD8). This sequence is identical to human P53 and would appear to be a human contaminant within the Zea mays sampling effort.	191
395700	pfam00871	Acetate_kinase	Acetokinase family. This family includes acetate kinase, butyrate kinase and 2-methylpropanoate kinase.	387
307151	pfam00872	Transposase_mut	Transposase, Mutator family. 	380
395701	pfam00873	ACR_tran	AcrB/AcrD/AcrF family. Members of this family are integral membrane proteins. Some are involved in drug resistance. AcrB cooperates with a membrane fusion protein, AcrA, and an outer membrane channel TolC. The structure shows the AcrB forms a homotrimer.	1021
395702	pfam00874	PRD	PRD domain. The PRD domain (for PTS Regulation Domain), is the phosphorylatable regulatory domain found in bacterial transcriptional antiterminator such as BglG, SacY and LicT, as well as in activators such as MtlR and LevR. The PRD is phosphorylated on one or two conserved histidine residues. PRD-containing proteins are involved in the regulation of catabolic operons in Gram+ and Gram- bacteria and are often characterized by a short N-terminal effector domain that binds to either RNA (CAT-RBD for antiterminators pfam03123) or DNA (for activators), and a duplicated PRD module which is phosphorylated by the sugar phosphotransferase system (PTS) in response to the availability of carbon source. The phosphorylations modify the conformation and stability of the dimeric proteins and thereby the RNA- or DNA-binding activity of the effector domain. The structure of the LicT PRD domains has been solved in both the active (Structure 1h99) and inactive state (Structure 1tlv), revealing massive structural rearrangements upon activation.	90
395703	pfam00875	DNA_photolyase	DNA photolyase. This domain binds a light harvesting cofactor.	164
395704	pfam00876	Innexin	Innexin. This family includes the Drosophila proteins Ogre and shaking-B, and the C. elegans proteins Unc-7 and Unc-9. Members of this family are integral membrane proteins which are involved in the formation of gap junctions. This family has been named the Innexins.	330
395705	pfam00877	NLPC_P60	NlpC/P60 family. The function of this domain is unknown. It is found in several lipoproteins.	105
395706	pfam00878	CIMR	Cation-independent mannose-6-phosphate receptor repeat. The cation-independent mannose-6-phosphate receptor contains 15 copies of a repeat.	145
395707	pfam00879	Defensin_propep	Defensin propeptide. 	51
395708	pfam00880	Nebulin	Nebulin repeat. 	28
395709	pfam00881	Nitroreductase	Nitroreductase family. The nitroreductase family comprises a group of FMN- or FAD-dependent and NAD(P)H-dependent enzymes able to metabolize nitrosubstituted compounds.	168
395710	pfam00882	Zn_dep_PLPC	Zinc dependent phospholipase C. 	174
395711	pfam00883	Peptidase_M17	Cytosol aminopeptidase family, catalytic domain. The two associated zinc ions and the active site are entirely enclosed within the C-terminal catalytic domain in leucine aminopeptidase.	303
395712	pfam00884	Sulfatase	Sulfatase. 	298
395713	pfam00885	DMRL_synthase	6,7-dimethyl-8-ribityllumazine synthase. This family includes the beta chain of 6,7-dimethyl-8-ribityllumazine synthase EC:2.5.1.9, an enzyme involved in riboflavin biosynthesis. The family also includes a subfamily of distant archaebacterial proteins that may also have the same function. The family contains a number of different subsets including a family of proteins comprising archaeal lumazine and riboflavin synthases, type I lumazine synthases, and the eubacterial type II lumazine synthases. It has been established that lumazine synthase catalyzes the penultimate step in the biosynthesis of riboflavin in plants and microorganisms. The type I lumazine synthases are active in pentameric or icosahedral quaternary assemblies, whereas the type II are decameric. Brucella, a bacterial genus that causes brucellosis, and other Rhizobiales have an atypical riboflavin metabolic pathway. Brucella spp code for both a type-I and a type-II lumazine synthase, and it has been shown that at least one of these two has to be present in order for Brucella to be viable, showing that in the case of Brucella flavin metabolism is implicated in bacterial virulence.	134
395714	pfam00886	Ribosomal_S16	Ribosomal protein S16. 	61
395715	pfam00887	ACBP	Acyl CoA binding protein. 	81
395716	pfam00888	Cullin	Cullin family. 	610
395717	pfam00889	EF_TS	Elongation factor TS. 	204
395718	pfam00890	FAD_binding_2	FAD binding domain. This family includes members that bind FAD. This family includes the flavoprotein subunits from succinate and fumarate dehydrogenase, aspartate oxidase and the alpha subunit of adenylylsulphate reductase.	398
395719	pfam00891	Methyltransf_2	O-methyltransferase. This family includes a range of O-methyltransferases. These enzymes utilize S-adenosyl methionine.	208
307170	pfam00892	EamA	EamA-like transporter family. This family includes many hypothetical membrane proteins of unknown function. Many of the proteins contain two copies of the aligned region. The family used to be known as DUF6. Members of this family usually carry 5+5 transmembrane domains, and this domain attempts to model five of these.	136
279265	pfam00893	Multi_Drug_Res	Small Multidrug Resistance protein. This family is the Small Multidrug Resistance (SMR) family. Several members have been shown to export a range of toxins, including ethidium bromide and quaternary ammonium compounds, through coupling with proton influx.	93
279266	pfam00894	Luteo_coat	Luteovirus coat protein. 	138
395720	pfam00895	ATP-synt_8	ATP synthase protein 8. 	54
279268	pfam00897	Orbi_VP7	Orbivirus inner capsid protein VP7. In BTV, 260 trimers of VP7 are found in the core. The major proteins of the core are VP7 and VP3. VP7 forms an outer layer around VP3.	348
307171	pfam00898	Orbi_VP2	Orbivirus outer capsid protein VP2. VP2 acts as an anchor for VP1 and VP3. VP2 contains a non-specific DNA and RNA binding domain in the N-terminus.	946
395721	pfam00899	ThiF	ThiF family. This domain is found in ubiquitin activating E1 family and members of the bacterial ThiF/MoeB/HesA family. It is repeated in Ubiquitin-activating enzyme E1.	243
395722	pfam00900	Ribosomal_S4e	Ribosomal family S4e. 	75
279272	pfam00901	Orbi_VP5	Orbivirus outer capsid protein VP5. Cryoelectron microscopy indicates that VP5 is a trimer, implying that there are 360 copies of VP5 per virion.	507
395723	pfam00902	TatC	Sec-independent protein translocase protein (TatC). The bacterial Tat system has a remarkable ability to transport folded proteins, even enzyme complexes, across the cytoplasmic membrane. It is structurally and mechanistically similar to the Delta pH-driven thylakoidal protein import pathway. A functional Tat system or Delta pH-dependent pathway requires three integral membrane proteins: TatA/Tha4, TatB/Hcf106 and TatC/cpTatC. The TatC protein is essential for the function of both pathways. It might be involved in twin-arginine signal peptide recognition, protein translocation and proton translocation. Sequence analysis predicts that TatC contains six transmembrane helices (TMHs), and experimental data confirmed that N- and C-termini of TatC or cpTatC are exposed to the cytoplasmic or stromal face of the membrane. The cytoplasmic N-terminus and the first cytoplasmic loop region of the Escherichia coli TatC protein are essential for protein export. At least two TatC molecules co-exist within each Tat translocon.	210
395724	pfam00903	Glyoxalase	Glyoxalase/Bleomycin resistance protein/Dioxygenase superfamily. 	121
395725	pfam00904	Involucrin	Involucrin repeat. 	9
395726	pfam00905	Transpeptidase	Penicillin binding protein transpeptidase domain. The active site serine is conserved in all members of this family.	292
279277	pfam00906	Hepatitis_core	Hepatitis core antigen. The core antigen of hepatitis viruses possesses a carboxyl terminus rich in arginine. On this basis it was predicted that the core antigen would bind DNA. There is some experimental evidence to support this.	273
395727	pfam00907	T-box	T-box. The T-box encodes a 180 amino acid domain that binds to DNA. Genes encoding T-box proteins are found in a wide range of animals, but not in other kingdoms such as plants. Family members are all thought to bind to the DNA consensus sequence TCACACCT. They are found exclusively in the nucleus, and perform DNA-binding and transcriptional activation/repression roles. They are generally required for development of the specific tissues they are expressed in, and mutations in T-box genes are implicated in human conditions such as DiGeorge syndrome and X-linked cleft palate, which feature malformations.	182
395728	pfam00908	dTDP_sugar_isom	dTDP-4-dehydrorhamnose 3,5-epimerase. This family catalyzes the isomerisation of dTDP-4-dehydro-6-deoxy-D-glucose with dTDP-4-dehydro-6-deoxy-L-mannose. The EC number of this enzyme is 5.1.3.13.	164
395729	pfam00909	Ammonium_transp	Ammonium Transporter Family. 	399
395730	pfam00910	RNA_helicase	RNA helicase. This family includes RNA helicases thought to be involved in duplex unwinding during viral RNA replication. Members of this family are found in a variety of single stranded RNA viruses.	101
395731	pfam00912	Transgly	Transglycosylase. The penicillin-binding proteins are bifunctional proteins consisting of transglycosylase and transpeptidase in the N- and C-terminus respectively. The transglycosylase domain catalyzes the polymerization of murein glycan chains.	177
395732	pfam00913	Trypan_glycop	Trypanosome variant surface glycoprotein (A-type). The trypanosome parasite expresses these proteins to evade the immune response. This family includes a variety of surface proteins such as Trypanosoma brucei VSGs such as expression site associated gene (ESAG) 6 and 7.	367
366366	pfam00915	Calici_coat	Calicivirus coat protein. 	290
395733	pfam00916	Sulfate_transp	Sulfate permease family. This family of integral membrane proteins are known as the Sulfate Permease (SulP) family. SulP is a large family found in all domains of life. Although sulfate is a commonly transported ion there are many other activities in this family. See the TCDB description for a comprehensive summary.	379
334312	pfam00917	MATH	MATH domain. This motif has been called the Meprin And TRAF-Homology (MATH) domain. This domain is hugely expanded in the nematode C. elegans.	113
395734	pfam00918	Gastrin	Gastrin/cholecystokinin family. 	126
395735	pfam00919	UPF0004	Uncharacterized protein family UPF0004. This family is the N terminal half of the Prosite family. The C-terminal half has been shown to be related to MiaB proteins. This domain is nearly always found in conjunction with pfam04055 and pfam01938 although its function is uncertain.	98
395736	pfam00920	ILVD_EDD	Dehydratase family. 	518
376417	pfam00921	Lipoprotein_2	Borrelia lipoprotein. This family of lipoproteins is found in Borrelia spirochetes. The function of these proteins is uncertain.	299
366369	pfam00922	Phosphoprotein	Vesiculovirus phosphoprotein. 	204
395737	pfam00923	TAL_FSA	Transaldolase/Fructose-6-phosphate aldolase. Transaldolase (TAL) is an enzyme of the pentose phosphate pathway (PPP) found almost ubiquitously in the three domains of life (Archaea, Bacteria, and Eukarya). TAL shares a high degree of structural similarity and sequence identity with fructose-6-phosphate aldolase (FSA). They both belong to the class I aldolase family. Their protein structures have been revealed.	226
395738	pfam00924	MS_channel	Mechanosensitive ion channel. Two members of this protein family of M. jannaschii have been functionally characterized. Both proteins form mechanosensitive (MS) ion channels upon reconstitution into liposomes and functional examination by the patch-clamp technique. Therefore this family are likely to also be MS channel proteins.	201
395739	pfam00925	GTP_cyclohydro2	GTP cyclohydrolase II. GTP cyclohydrolase II catalyzes the first committed step in the biosynthesis of riboflavin.	123
395740	pfam00926	DHBP_synthase	3,4-dihydroxy-2-butanone 4-phosphate synthase. 3,4-Dihydroxy-2-butanone 4-phosphate is biosynthesized from ribulose 5-phosphate and serves as the biosynthetic precursor for the xylene ring of riboflavin. Sometimes found as a bifunctional enzyme with pfam00925.	191
395741	pfam00927	Transglut_C	Transglutaminase family, C-terminal ig like domain. 	106
395742	pfam00928	Adap_comp_sub	Adaptor complexes medium subunit family. This family also contains members which are coatomer subunits.	259
395743	pfam00929	RNase_T	Exonuclease. This family includes a variety of exonuclease proteins, such as ribonuclease T and the epsilon subunit of DNA polymerase III.	164
395744	pfam00930	DPPIV_N	Dipeptidyl peptidase IV (DPP IV) N-terminal region. This family is an alignment of the region to the N-terminal side of the active site. The Prosite motif does not correspond to this Pfam entry.	352
395745	pfam00931	NB-ARC	NB-ARC domain. 	245
395746	pfam00932	LTD	Lamin Tail Domain. The lamin-tail domain (LTD), which has an immunoglobulin (Ig) fold, is found in Nuclear Lamins, Chlo1887 from Chloroflexus, and several bacterial proteins where it occurs with membrane associated hydrolases of the metallo-beta-lactamase, synaptojanin, and calcineurin-like phosphoesterase superfamilies.	106
395747	pfam00933	Glyco_hydro_3	Glycosyl hydrolase family 3 N terminal domain. 	316
395748	pfam00934	PE	PE family. This family is named after a PE motif near to the amino terminus of the domain. The PE family of proteins all contain an amino-terminal region of about 110 amino acids. The carboxyl termini of this family are variable and fall into several classes. The largest class of PE proteins is the highly repetitive PGRS class which have a high glycine content. The function of these proteins is uncertain but it has been suggested that they may be related to antigenic variation of Mycobacterium tuberculosis.	91
395749	pfam00935	Ribosomal_L44	Ribosomal protein L44. 	76
395750	pfam00936	BMC	BMC domain. Bacterial microcompartments are primitive organelles composed entirely of protein subunits. The prototypical bacterial microcompartment is the carboxysome, a protein shell for sequestering carbon fixation reactions. These proteins form hexameric structures.	74
395751	pfam00937	Corona_nucleoca	Coronavirus nucleocapsid protein. 	346
279306	pfam00938	Lipoprotein_3	Lipoprotein. This family of lipoproteins is Mycoplasma specific.	85
279307	pfam00939	Na_sulph_symp	Sodium:sulfate symporter transmembrane region. There are also some members in this family that do not match the Prosite motif, and belong to the subfamily SODIT1.	472
395752	pfam00940	RNA_pol	DNA-dependent RNA polymerase. This is a family of single chain RNA polymerases.	411
395753	pfam00941	FAD_binding_5	FAD binding domain in molybdopterin dehydrogenase. 	170
395754	pfam00942	CBM_3	Cellulose binding domain. 	82
279311	pfam00943	Alpha_E2_glycop	Alphavirus E2 glycoprotein. E2 forms a heterodimer with E1. The virus spikes are made up of 80 trimers of these heterodimers (sindbis virus).	403
366379	pfam00944	Peptidase_S3	Alphavirus core protein. Also known as coat protein C and capsid protein C. This makes the literature very confusing. Alphaviruses consist of a nucleoprotein core, a lipid membrane which envelopes the core, and glycoprotein spikes protruding from the lipid membrane.	156
395755	pfam00945	Rhabdo_ncap	Rhabdovirus nucleocapsid protein. The Nucleocapsid (N) Protein is said to have a "tight" structure. The carboxyl end of the N-terminal domain possesses an RNA binding domain. Sequence alignments show 2 regions of reasonable conservation, approx. 64-103 and 201-329. A whole functional protein is required for encapsidation to take place.	409
395756	pfam00946	Mononeg_RNA_pol	Mononegavirales RNA dependent RNA polymerase. Members of the Mononegavirales including the Paramyxoviridae, like other non-segmented negative strand RNA viruses, have an RNA-dependent RNA polymerase composed of two subunits, a large protein L and a phosphoprotein P. This is a protein family of the L protein. The L protein confers the RNA polymerase activity on the complex. The P protein acts as a transcription factor.	1042
395757	pfam00947	Pico_P2A	Picornavirus core protein 2A. This protein is a protease, involved in cleavage of the polyprotein.	127
279316	pfam00948	Flavi_NS1	Flavivirus non-structural Protein NS1. The NS1 protein is well conserved amongst the flaviviruses. It contains 12 cysteines, and undergoes glycosylation in a similar manner to other NS proteins. Mutational analysis has strongly implied a role for NS1 in the early stages of RNA replication.	360
395758	pfam00949	Peptidase_S7	Peptidase S7, Flavivirus NS3 serine protease. The viral genome is a positive strand RNA that encodes a single polyprotein precursor. Processing of the polyprotein precursor into mature proteins is carried out by the host signal peptidase and by NS3 serine protease, which requires NS2B (pfam01002) as a cofactor.	129
334323	pfam00950	ABC-3	ABC 3 transport family. 	258
109986	pfam00951	Arteri_Gl	Arterivirus GL envelope glycoprotein. Arteriviruses encode 4 envelope proteins, Gl, Gs, M and N. Gl envelope protein is encoded in ORF5, and is 30-45 kDa in size. Gl is heterogeneously glycosylated with N-acetyllactosamine in a cell-type-specific manner. The Gl glycoprotein expresses the neutralisation determinants.	179
395759	pfam00952	Bunya_nucleocap	Bunyavirus nucleocapsid (N) protein. The bunyaviruses are enveloped viruses with a genome consisting of 3 ssRNA segments (called L, M and S). The nucleocapsid protein is encoded on the small (S) genomic RNA. The N protein is the major component of the nucleocapsids. This protein is thought to interact with the L protein, virus RNA and/or other N proteins.	229
395760	pfam00953	Glycos_transf_4	Glycosyl transferase family 4. 	160
395761	pfam00954	S_locus_glycop	S-locus glycoprotein domain. In Brassicaceae, self-incompatible plants have a self/non-self recognition system. This is sporophytically controlled by multiple alleles at a single locus (S). S-locus glycoproteins, as well as S-receptor kinases, are in linkage with the S-alleles. This region is inferred to be a domain due to it having other domains adjacent to it.	111
395762	pfam00955	HCO3_cotransp	HCO3- transporter family. This family contains Band 3 anion exchange proteins that exchange Cl-/HCO3-. This family also includes cotransporters of Na+/HCO3-.	502
395763	pfam00956	NAP	Nucleosome assembly protein (NAP). NAP proteins are involved in moving histones into the nucleus, nucleosome assembly and chromatin fluidity. They affect the transcription of many genes.	258
395764	pfam00957	Synaptobrevin	Synaptobrevin. 	89
395765	pfam00958	GMP_synt_C	GMP synthase C terminal domain. GMP synthetase is a glutamine amidotransferase from the de novo purine biosynthetic pathway. This family is the C-terminal domain specific to the GMP synthases EC:6.3.5.2. In prokaryotes this domain mediates dimerization. Eukaryotic GMP synthases are monomers. This domain in eukaryotes includes several large insertions that may form globular domains.	92
395766	pfam00959	Phage_lysozyme	Phage lysozyme. This family includes lambda phage lysozyme and E. coli endolysin.	107
366388	pfam00960	Neocarzinostat	Neocarzinostatin family. 	110
395767	pfam00961	LAGLIDADG_1	LAGLIDADG endonuclease. 	101
395768	pfam00962	A_deaminase	Adenosine/AMP deaminase. 	327
395769	pfam00963	Cohesin	Cohesin domain. Cohesin domains interact with a complementary domain, termed the dockerin domain. The cohesin-dockerin interaction is the crucial interaction for complex formation in the cellulosome.	139
395770	pfam00964	Elicitin	Elicitin. Elicitins form a novel class of plant necrotic proteins which are secreted by Phytophthora and Pythium fungi, parasites of many economically important crops. These proteins induce leaf necrosis in infected plants and elicit an incompatible hypersensitive-like reaction, leading to the development of a systemic acquired resistance against a range of fungal and bacterial plant pathogens.	88
395771	pfam00965	TIMP	Tissue inhibitor of metalloproteinase. Members of this family are common in extracellular regions of vertebrate species.	183
395772	pfam00967	Barwin	Barwin family. 	116
395773	pfam00969	MHC_II_beta	Class II histocompatibility antigen, beta domain. 	75
395774	pfam00970	FAD_binding_6	Oxidoreductase FAD-binding domain. 	99
250265	pfam00971	EIAV_GP90	EIAV coat protein, gp90. Equine infectious anaemia virus (EIAV). EIAV belongs to the family Retroviridae. EIAV gp90 is hypervariable in the carboxyl-end region and more stable in the amino-end region. This variability is a pathogenicity factor that allows the evasion of the host's immune response.	385
366396	pfam00972	Flavi_NS5	Flavivirus RNA-directed RNA polymerase. Flaviviruses produce a polyprotein from the ssRNA genome. This protein is also known as NS5. This RNA-directed RNA polymerase possesses a number of short regions and motifs homologous to other RNA-directed RNA polymerases.	644
395775	pfam00973	Paramyxo_ncap	Paramyxovirus nucleocapsid protein. The nucleocapsid protein is referred to as NP. NP is the major structural component of the nucleocapsid. The protein is approx. 58 kDa. 2600 NP molecules go to tightly encapsidate the RNA. NP interacts with several other viral encoded proteins, all of which are involved in controlling replication. {NP-NP, NP-P, NP-(PL), and NP-V}.	524
279337	pfam00974	Rhabdo_glycop	Rhabdovirus spike glycoprotein. Frequently abbreviated to G protein. The glycoprotein spike is made up of a trimer of G proteins. Channel formed by glycoprotein spike is thought to function in a similar manner to Influenza virus M2 protein channel, thus allowing a signal to pass across the viral membrane to signal for viral uncoating.	502
395776	pfam00975	Thioesterase	Thioesterase domain. Peptide synthetases are involved in the non-ribosomal synthesis of peptide antibiotics. Next to the operons encoding these enzymes, in almost all cases, are genes that encode proteins that have similarity to the type II fatty acid thioesterases of vertebrates. There are also modules within the peptide synthetases that also share this similarity. With respect to antibiotic production, thioesterases are required for the addition of the last amino acid to the peptide antibiotic, thereby forming a cyclic antibiotic. Thioesterases (non-integrated) have molecular masses of 25-29 kDa.	223
395777	pfam00976	ACTH_domain	Corticotropin ACTH domain. 	19
395778	pfam00977	His_biosynth	Histidine biosynthesis protein. Proteins involved in steps 4 and 6 of the histidine biosynthesis pathway are contained in this family. Histidine is formed by several complex and distinct biochemical reactions catalyzed by eight enzymes. The enzymes in this Pfam entry are called His6 and His7 in eukaryotes and HisA and HisF in prokaryotes. The structure of HisA is known to be a TIM barrel fold. In some archaeal HisA proteins the TIM barrel is composed of two tandem repeats of a half barrel. This family belong to the common phosphate binding site TIM barrel family.	230
395779	pfam00978	RdRP_2	RNA dependent RNA polymerase. This family may represent an RNA dependent RNA polymerase. The family also contains the following proteins: 2A protein from bromoviruses, putative RNA dependent RNA polymerase from tobamoviruses, and non-structural polyprotein from togaviruses.	440
144537	pfam00979	Reovirus_cap	Reovirus outer capsid protein, Sigma 3. Sigma 3 is the major outer capsid protein of reovirus. Sigma 3 is encoded by genome segment 4. Sigma 3 binds to double stranded RNA and associates with polypeptide u1 and its cleavage product u1C to form the outer shell of the virion. The Sigma 3 protein possesses a zinc-finger motif and an RNA-binding domain in the N and C termini respectively. This protein is also thought to play a role in pathogenesis.	367
395780	pfam00980	Rota_Capsid_VP6	Rotavirus major capsid protein VP6. Rotaviruses consist of three concentric protein shells. The intermediate (middle) protein layer consists of 260 trimers of VP6. VP6 is the most abundant protein in the virion. VP6 is also involved in virion assembly, and possesses the ability to interact with VP2, VP4 and VP7.	396
144538	pfam00981	Rota_NS53	Rotavirus RNA-binding Protein 53 (NS53). This protein is also known as NSP1. NS53 is encoded by gene 5. It is made in low levels in the infected cells and is a component of early replication. The protein is known to accumulate on the cytoskeleton of the infected cell. NS53 is an RNA binding protein that contains a characteristic cysteine rich region.	488
395781	pfam00982	Glyco_transf_20	Glycosyltransferase family 20. Members of this family belong to glycosyl transferase family 20. OtsA (Trehalose-6-phosphate synthase) is homologous to regions in the subunits of yeast trehalose-6-phosphate synthase/phosphate complex.	470
395782	pfam00983	Tymo_coat	Tymovirus coat protein. 	179
395783	pfam00984	UDPG_MGDP_dh	UDP-glucose/GDP-mannose dehydrogenase family, central domain. The UDP-glucose/GDP-mannose dehydrogenases are a small group of enzymes which possess the ability to catalyze the NAD-dependent 2-fold oxidation of an alcohol to an acid without the release of an aldehyde intermediate.	94
144541	pfam00985	MSA_2	Merozoite Surface Antigen 2 (MSA-2) family. 	171
395784	pfam00986	DNA_gyraseB_C	DNA gyrase B subunit, carboxyl terminus. The amino terminus of eukaryotic and prokaryotic DNA topoisomerase II are similar, but they have a different carboxyl terminus. The amino-terminal portion of the DNA gyrase B protein is thought to catalyze the ATP-dependent super-coiling of DNA. See pfam00204. The carboxyl-terminal end supports the complexation with the DNA gyrase A protein and the ATP-independent relaxation. This family also contains Topoisomerase IV. This is a bacterial enzyme that is closely related to DNA gyrase.	63
395785	pfam00988	CPSase_sm_chain	Carbamoyl-phosphate synthase small chain, CPSase domain. The carbamoyl-phosphate synthase domain is in the amino terminus of protein. Carbamoyl-phosphate synthase catalyzes the ATP-dependent synthesis of carbamyl-phosphate from glutamine or ammonia and bicarbonate. This important enzyme initiates both the urea cycle and the biosynthesis of arginine and/or pyrimidines. The carbamoyl-phosphate synthase (CPS) enzyme in prokaryotes is a heterodimer of a small and large chain. The small chain promotes the hydrolysis of glutamine to ammonia, which is used by the large chain to synthesize carbamoyl phosphate. See pfam00289. The small chain has a GATase domain in the carboxyl terminus. See pfam00117.	128
395786	pfam00989	PAS	PAS fold. The PAS fold corresponds to the structural domain that has previously been defined as PAS and PAC motifs. The PAS fold appears in archaea, eubacteria and eukarya.	113
395787	pfam00990	GGDEF	Diguanylate cyclase, GGDEF domain. This domain is found linked to a wide range of non-homologous domains in a variety of bacteria. It has been shown to be homologous to the adenylyl cyclase catalytic domain and has diguanylate cyclase activity. This observation correlates with the functional information available on two GGDEF-containing proteins, namely diguanylate cyclase and phosphodiesterase A of Acetobacter xylinum, both of which regulate the turnover of cyclic diguanosine monophosphate. In the WspR protein of Pseudomonas aeruginosa, the GGDEF domain acts as a diguanylate cyclase, Structure 3bre, when the whole molecule appears to form a tetramer consisting of two symmetrically-related dimers representing a biological unit. The active site is the GGD/EF motif, buried in the structure, and the cyclic dimeric guanosine monophosphate (c-di-GMP) binds to the inhibitory-motif RxxD on the surface. The enzyme thus catalyzes the cyclisation of two guanosine triphosphate (GTP) molecules to one c-di-GMP molecule.	160
395788	pfam00992	Troponin	Troponin. Troponin (Tn) contains three subunits, Ca2+ binding (TnC), inhibitory (TnI), and tropomyosin binding (TnT). This Pfam contains members of the TnT subunit. Troponin is a complex of three proteins, Ca2+ binding (TnC), inhibitory (TnI), and tropomyosin binding (TnT). The troponin complex regulates Ca++ induced muscle contraction. This family includes troponin T and troponin I. Troponin I binds to actin and troponin T binds to tropomyosin.	132
395789	pfam00993	MHC_II_alpha	Class II histocompatibility antigen, alpha domain. 	81
395790	pfam00994	MoCF_biosynth	Probable molybdopterin binding domain. This domain is found in a variety of proteins involved in biosynthesis of molybdopterin cofactor. The domain is presumed to bind molybdopterin. The structure of this domain is known, and it forms an alpha/beta structure. In the known structure of Gephyrin this domain mediates trimerisation.	143
395791	pfam00995	Sec1	Sec1 family. 	510
395792	pfam00996	GDI	GDP dissociation inhibitor. 	436
395793	pfam00997	Casein_kappa	Kappa casein. Kappa-casein is a mammalian milk protein involved in a number of important physiological processes. In the gut, the ingested protein is split into an insoluble peptide (para kappa-casein) and a soluble hydrophilic glycopeptide (caseinomacropeptide). Caseinomacropeptide is responsible for increased efficiency of digestion, prevention of neonate hypersensitivity to ingested proteins, and inhibition of gastric pathogens.	160
395794	pfam00998	RdRP_3	Viral RNA dependent RNA polymerase. This family includes viral RNA dependent RNA polymerase enzymes from hepatitis C virus and various plant viruses.	486
395795	pfam00999	Na_H_Exchanger	Sodium/hydrogen exchanger family. Na/H antiporters are key transporters in maintaining the pH of actively metabolising cells. The molecular mechanisms of antiport are unclear. These antiporters contain 10-12 transmembrane regions (M) at the amino-terminus and a large cytoplasmic region at the carboxyl terminus. The transmembrane regions M3-M12 share identity with other members of the family. The M6 and M7 regions are highly conserved. Thus, this is thought to be the region that is involved in the transport of sodium and hydrogen ions. The cytoplasmic region has little similarity throughout the family.	377
395796	pfam01000	RNA_pol_A_bac	RNA polymerase Rpb3/RpoA insert domain. Members of this family include: alpha subunit from eubacteria, alpha subunits from chloroplasts, Rpb3 subunits from eukaryotes, and RpoD subunits from archaea.	117
110032	pfam01001	HCV_NS4b	Hepatitis C virus non-structural protein NS4b. No precise function has been assigned to NS4b. However, it is known that NS4b interacts with NS4a and NS3 to form a large replicase complex to direct the viral RNA replication.	192
279357	pfam01002	Flavi_NS2B	Flavivirus non-structural protein NS2B. Flaviviruses encode a single polyprotein. This is cleaved into three structural and seven non-structural proteins. All, but two, are cleaved by the NS2B-NS3 protease complex.	127
366413	pfam01003	Flavi_capsid	Flavivirus capsid protein C. Flaviviruses are small enveloped viruses with virions comprised of 3 proteins called C, M and E. Multiple copies of the C protein form the nucleocapsid, which contains the ssRNA molecule.	117
307237	pfam01004	Flavi_M	Flavivirus envelope glycoprotein M. Flaviviruses are small enveloped viruses with virions comprised of 3 proteins called C, M and E. The envelope glycoprotein M is made as a precursor, called prM. The precursor portion of the protein is the signal peptide for the protein's entry into the membrane. prM is cleaved to form M in a late-stage cleavage event. Associated with this cleavage is a change in the infectivity and fusion activity of the virus.	74
279359	pfam01005	Flavi_NS2A	Flavivirus non-structural protein NS2A. NS2A is a hydrophobic protein about 25 kDa in size. NS2A is cleaved from NS1 by a membrane bound host protease. NS2A has been found to associate with the dsRNA within the vesicle packages. It has also been found that NS2A associates with the known replicase components and so NS2A has been postulated to be part of this replicase complex.	215
366414	pfam01006	HCV_NS4a	Hepatitis C virus non-structural protein NS4a. NS4a forms an integral part of the NS3 serine protease, as it is required in a number of cases as a cofactor of cleavage. It has also been reported that NS4a interacts with NS4b and NS3 to form a multi-subunit replicase complex.	55
395797	pfam01007	IRK	Inward rectifier potassium channel. 	141
395798	pfam01008	IF-2B	Initiation factor 2 subunit family. This family includes initiation factor 2B alpha, beta and delta subunits from eukaryotes, initiation factor 2B subunits 1 and 2 from archaebacteria and some proteins of unknown function from prokaryotes. Initiation factor 2 binds to Met-tRNA, GTP and the small ribosomal subunit. Members of this family have also been characterized as 5-methylthioribose- 1-phosphate isomerases, an enzyme of the methionine salvage pathway. The crystal structure of Ypr118w, a non-essential, low-copy number gene product from Saccharomyces cerevisiae, reveals a dimeric protein with two domains and a putative active site cleft.	281
366417	pfam01010	Proton_antipo_C	NADH-dehydrogenase subunit F, TMs, (complex I) C-terminus. This sub-family represents a carboxyl terminal extension of pfam00361. It includes subunit 5 from chloroplasts, and bacterial subunit L. This sub-family is part of complex I which catalyzes the transfer of two electrons from NADH to ubiquinone in a reaction that is associated with proton translocation across the membrane. This family is largely a few TM regions of the F subunit of NADH-Ubiquinone oxidoreductase from plants. The TMs form part of the anti-porter subunit.	244
395799	pfam01011	PQQ	PQQ enzyme repeat. This family represents a single repeat of a beta propeller. This propeller has been found in several enzymes which utilize pyrrolo-quinoline quinone as a prosthetic group.	36
395800	pfam01012	ETF	Electron transfer flavoprotein domain. This family includes the homologous domain shared between the alpha and beta subunits of the electron transfer flavoprotein.	178
395801	pfam01014	Uricase	Uricase. 	128
395802	pfam01015	Ribosomal_S3Ae	Ribosomal S3Ae family. 	191
395803	pfam01016	Ribosomal_L27	Ribosomal L27 protein. 	77
395804	pfam01017	STAT_alpha	STAT protein, all-alpha domain. STAT proteins (Signal Transducers and Activators of Transcription) are a family of transcription factors that are specifically activated to regulate gene transcription when cells encounter cytokines and growth factors. STAT proteins also include an SH2 domain pfam00017.	171
395805	pfam01018	GTP1_OBG	GTP1/OBG. The N-terminal domain of the GTPase OBG has the OBG fold, which is formed by three glycine-rich regions inserted into a small 8-stranded beta-sandwich; these regions form six left-handed collagen-like helices packed and H-bonded together.	155
395806	pfam01019	G_glu_transpept	Gamma-glutamyltranspeptidase. 	498
395807	pfam01020	Ribosomal_L40e	Ribosomal L40e family. Bovine L40 has been identified as a secondary RNA binding protein. L40 is fused to a ubiquitin protein.	48
395808	pfam01021	TYA	TYA transposon protein. Ty are yeast transposons. A 5.7kb transcript codes for p3, a fusion protein of TYA and TYB. The TYA protein is analogous to the gag protein of retroviruses. TYA is cleaved to form a 46kd protein which can form mature virion like particles.	98
395809	pfam01022	HTH_5	Bacterial regulatory protein, arsR family. Members of this family contain a DNA binding 'helix-turn-helix' motif. This family includes other proteins which are not included in the Prosite definition.	47
395810	pfam01023	S_100	S-100/ICaBP type calcium binding domain. The S-100 domain is a subfamily of the EF-hand calcium binding proteins.	43
395811	pfam01024	Colicin	Colicin pore forming domain. 	183
395812	pfam01025	GrpE	GrpE. 	164
395813	pfam01026	TatD_DNase	TatD related DNase. This family of proteins are related to a large superfamily of metalloenzymes. TatD, a member of this family has been shown experimentally to be a DNase enzyme.	253
395814	pfam01027	Bax1-I	Inhibitor of apoptosis-promoting Bax1. Programmed cell-death involves a set of Bcl-2 family proteins, some of which inhibit apoptosis (Bcl-2 and Bcl-XL) and some of which promote it (Bax and Bak). Human Bax inhibitor, BI-1, is an evolutionarily conserved integral membrane protein containing multiple membrane-spanning segments predominantly localized to intracellular membranes. It has 6-7 membrane-spanning domains. The C termini of the mammalian BI-1 proteins are comprised of basic amino acids resembling some nuclear targeting sequences, but otherwise the predicted proteins lack motifs that suggest a function. As plant BI-1 appears to localize predominantly to the ER, we hypothesized that plant BI-1 could also regulate cell death triggered by ER stress. BI-1 appears to exert its effect through an interaction with calmodulin. The budding yeast member of this family has been found unexpectedly to encode a BH3 domain-containing protein (Ybh3p) that regulates the mitochondrial pathway of apoptosis in a phylogenetically conserved manner. Examination of the crystal structure of a bacterial member of this family shows that these proteins mediate a calcium leak across the membrane that is pH-dependent. Calcium homoeostasis balances passive calcium leak with active calcium uptake. The structure exists in a pore-closed and pore-open conformation, at pHs of 8 and 6 respectively, and the pore can be opened by intracrystalline transition; together these findings suggest that pH controls the conformational transition.	206
395815	pfam01028	Topoisom_I	Eukaryotic DNA topoisomerase I, catalytic core. Topoisomerase I promotes the relaxation of DNA superhelical tension by introducing a transient single-stranded break in duplex DNA and is vital for the processes of replication, transcription, and recombination.	198
395816	pfam01029	NusB	NusB family. The NusB protein is involved in the regulation of rRNA biosynthesis by transcriptional antitermination.	132
395817	pfam01030	Recep_L_domain	Receptor L domain. The L domains from these receptors make up the bilobal ligand binding site. Each L domain consists of a single-stranded right hand beta-helix. This Pfam entry is missing the first 50 amino acid residues of the domain.	113
395818	pfam01031	Dynamin_M	Dynamin central region. This region lies between the GTPase domain, see pfam00350, and the pleckstrin homology (PH) domain, see pfam00169.	244
395819	pfam01032	FecCD	FecCD transport family. This is a sub-family of bacterial binding protein-dependent transport systems family. This Pfam entry contains the inner components of this multicomponent transport system.	311
395820	pfam01033	Somatomedin_B	Somatomedin B domain. 	40
395821	pfam01034	Syndecan	Syndecan domain. Syndecans are transmembrane heparan sulfate proteoglycans which are implicated in the binding of extracellular matrix components and growth factors.	61
395822	pfam01035	DNA_binding_1	6-O-methylguanine DNA methyltransferase, DNA binding domain. This domain is a 3 helical bundle.	81
395823	pfam01036	Bac_rhodopsin	Bacteriorhodopsin-like protein. The bacterial opsins are retinal-binding proteins that provide light-dependent ion transport and sensory functions to a family of halophilic bacteria. They are integral membrane proteins believed to contain seven transmembrane (TM) domains, the last of which contains the attachment point for retinal (a conserved lysine). This family also includes distantly related proteins that do not contain the retinal binding lysine and so cannot function as opsins.	223
395824	pfam01037	AsnC_trans_reg	Lrp/AsnC ligand binding domain. The l-leucine-responsive regulatory protein (Lrp/AsnC) family is a family of similar bacterial transcription regulatory proteins. The family is named after two E. coli proteins involved in regulating amino acid metabolism. This entry corresponds to the usually C-terminal regulatory ligand binding domain. Structurally this domain has a dimeric alpha/beta barrel fold.	73
395825	pfam01039	Carboxyl_trans	Carboxyl transferase domain. All of the members in this family are biotin dependent carboxylases. The carboxyl transferase domain carries out the following reaction; transcarboxylation from biotin to an acceptor molecule. There are two recognized types of carboxyl transferase. One of them uses acyl-CoA and the other uses 2-oxoacid as the acceptor molecule of carbon dioxide. All of the members in this family utilize acyl-CoA as the acceptor molecule.	491
395826	pfam01040	UbiA	UbiA prenyltransferase family. 	247
395827	pfam01041	DegT_DnrJ_EryC1	DegT/DnrJ/EryC1/StrS aminotransferase family. The members of this family are probably all pyridoxal-phosphate-dependent aminotransferase enzymes with a variety of molecular functions. The family includes StsA, StsC and StsS. The aminotransferase activity was demonstrated for purified StsC protein as the L-glutamine:scyllo-inosose aminotransferase EC:2.6.1.50, which catalyzes the first amino transfer in the biosynthesis of the streptidine subunit of streptomycin.	360
395828	pfam01042	Ribonuc_L-PSP	Endoribonuclease L-PSP. Endoribonuclease active on single-stranded mRNA. Inhibits protein synthesis by cleavage of mRNA. Previously thought to inhibit protein synthesis initiation. This protein may also be involved in the regulation of purine biosynthesis. YjgF (renamed RidA) family members are enamine/imine deaminases. They hydrolyze reactive intermediates released by PLP-dependent enzymes, including threonine dehydratase. YjgF also prevents inhibition of transaminase B (IlvE) in Salmonella.	117
395829	pfam01043	SecA_PP_bind	SecA preprotein cross-linking domain. The SecA ATPase is involved in the insertion and retraction of preproteins through the plasma membrane. This domain has been found to cross-link to preproteins, thought to indicate a role in preprotein binding. The pre-protein cross-linking domain is comprised of two sub domains that are inserted within the ATPase domain.	107
395830	pfam01044	Vinculin	Vinculin family. 	791
395831	pfam01047	MarR	MarR family. The Mar proteins are involved in the multiple antibiotic resistance, a non-specific resistance system. The expression of the mar operon is controlled by a repressor, MarR. A large number of compounds induce transcription of the mar operon. This is thought to be due to the compound binding to MarR, and the resulting complex stops MarR binding to the DNA. With the MarR repression lost, transcription of the operon proceeds. The structure of MarR is known and shows MarR as a dimer with each subunit containing a winged-helix DNA binding motif.	59
395832	pfam01048	PNP_UDP_1	Phosphorylase superfamily. Members of this family include: purine nucleoside phosphorylase (PNP) Uridine phosphorylase (UdRPase) 5'-methylthioadenosine phosphorylase (MTA phosphorylase)	231
395833	pfam01049	Cadherin_C	Cadherin cytoplasmic region. Cadherins are vital in cell-cell adhesion during tissue differentiation. Cadherins are linked to the cytoskeleton by catenins. Catenins bind to the cytoplasmic tail of the cadherin. Cadherins cluster to form foci of homophilic binding units. A key determinant of the strength of the binding that is mediated by cadherins is the juxtamembrane region of the cadherin. This region induces clustering and also binds to the protein p120ctn.	148
395834	pfam01050	MannoseP_isomer	Mannose-6-phosphate isomerase. All of the members of this Pfam entry belong to family 2 of the mannose-6-phosphate isomerases. The type II phosphomannose isomerases are bifunctional enzymes. This Pfam entry covers the isomerase domain. The guanosine diphospho-D-mannose pyrophosphorylase domain is in another Pfam entry, see pfam00483.	151
395835	pfam01051	Rep_3	Initiator Replication protein. This protein is an initiator of plasmid replication. RepB possesses nicking-closing (topoisomerase I) like activity. It is also able to perform a strand transfer reaction on ssDNA that contains its target. This family also includes RepA which is an E.coli protein involved in plasmid replication. The RepA protein binds to DNA repeats that flank the repA gene.	218
395836	pfam01052	FliMN_C	Type III flagellar switch regulator (C-ring) FliN C-term. This family includes the C-terminal region of flagellar motor switch proteins FliN and FliM. It is associated with family FliM, pfam02154 and family FliN_N pfam16973.	66
395837	pfam01053	Cys_Met_Meta_PP	Cys/Met metabolism PLP-dependent enzyme. This family includes enzymes involved in cysteine and methionine metabolism. The following are members: Cystathionine gamma-lyase, Cystathionine gamma-synthase, Cystathionine beta-lyase, Methionine gamma-lyase, OAH/OAS sulfhydrylase, O-succinylhomoserine sulfhydrylase. All of these members participate in slightly different reactions. All these enzymes use PLP (pyridoxal-5'-phosphate) as a cofactor.	376
366439	pfam01054	MMTV_SAg	Mouse mammary tumor virus superantigen. The mouse mammary tumor virus (MMTV) is a milk-transmitted type B retrovirus. The superantigen (SAg) is encoded by the long terminal repeat. The SAgs are also called PR73.	184
395838	pfam01055	Glyco_hydro_31	Glycosyl hydrolases family 31. Glycosyl hydrolases are key enzymes of carbohydrate metabolism. Family 31 comprises enzymes that are, or are similar to, alpha-galactosidases.	442
395839	pfam01056	Myc_N	Myc amino-terminal region. The myc family belongs to the basic helix-loop-helix leucine zipper class of transcription factors, see pfam00010. Myc forms a heterodimer with Max, and this complex regulates cell growth through direct activation of genes involved in cell replication. Mutations in the C-terminal 20 residues of this domain cause unique changes in the induction of apoptosis, transformation, and G2 arrest.	346
366441	pfam01057	Parvo_NS1	Parvovirus non-structural protein NS1. This family also contains the NS2 protein. Parvoviruses encode two non-structural proteins, NS1 and NS2. The mRNA for NS2 contains the coding sequence for the first 87 amino acids of NS1, then by an alternative splicing mechanism mRNA from a different reading frame, encoding the last 78 amino acids, makes up the full length of the NS2 mRNA. NS1, is the major non-structural protein. It is essential for DNA replication. It is an 83-kDa nuclear phosphoprotein. It has DNA helicase and ATPase activity.	271
395840	pfam01058	Oxidored_q6	NADH ubiquinone oxidoreductase, 20 Kd subunit. 	124
366442	pfam01059	Oxidored_q5_N	NADH-ubiquinone oxidoreductase chain 4, amino terminus. 	110
395841	pfam01060	TTR-52	Transthyretin-like family. TTR-52 was called family 2, and has weak similarity to transthyretin (formerly called pre-albumin) which transports thyroid hormones. The specific function of this protein is as a bridging molecule in apoptosis cross-linking dying cells to phagocytes. TTR-52 bridges by cross-linking surface-exposed phosphatidylserine (PtdSer) on apoptotic cells to the CED-1 receptor, a transmembrane receptor, on phagocytes. TTR-52 has an open beta-barrel-like structure.	79
395842	pfam01061	ABC2_membrane	ABC-2 type transporter. 	204
395843	pfam01062	Bestrophin	Bestrophin, RFP-TM, chloride channel. Bestrophin is a 68-kDa basolateral plasma membrane protein expressed in retinal pigment epithelial cells (RPE). It is encoded by the VMD2 gene, which is mutated in Best macular dystrophy, a disease characterized by a depressed light peak in the electrooculogram. VMD2 encodes a 585-amino acid protein with an approximate mass of 68 kDa which has been designated bestrophin. Bestrophin shares homology with the Caenorhabditis elegans RFP gene family, named for the presence of a conserved arginine (R), phenylalanine (F), proline (P), amino acid sequence motif. Bestrophin is a plasma membrane protein, localized to the basolateral surface of RPE cells consistent with a role for bestrophin in the generation or regulation of the EOG light peak. Bestrophin and other RFP family members represent a new class of chloride channels, indicating a direct role for bestrophin in generating the light peak. The VMD2 gene underlying Best disease was shown to represent the first human member of the RFP-TM protein family. More than 97% of the disease-causing mutations are located in the N-terminal RFP-TM domain implying important functional properties. The bestrophins are four-pass transmembrane chloride-channel proteins, and the RFP-TM or bestrophin domain extends from the N-terminus through approximately 350 amino acids and contains all of the TM domains as well as nearly all reported disease causing mutations. Interestingly, the RFP motif is not conserved evolutionarily back beyond Metazoa, neither is it in plant members.	286
395844	pfam01063	Aminotran_4	Amino-transferase class IV. The D-amino acid transferases (D-AAT) are required by bacteria to catalyze the synthesis of D-glutamic acid and D-alanine, which are essential constituents of bacterial cell wall and are the building blocks for other D-amino acids. Despite the difference in the structure of the substrates, D-AATs and L-AATs have strong similarity.	221
395845	pfam01064	Activin_recp	Activin types I and II receptor domain. This Pfam entry consists of both TGF-beta receptor types. This is an alignment of the hydrophilic cysteine-rich ligand-binding domains. Both receptor types (type I and II) possess a 9 amino acid cysteine box, with the consensus CCX{4-5}CN. The type I receptors also possess 7 extracellular residues preceding the cysteine box.	77
395846	pfam01065	Adeno_hexon	Hexon, adenovirus major coat protein, N-terminal domain. Hexon is the major coat protein from adenovirus type 2. Hexon forms a homo-trimer. The 240 copies of the hexon trimer are organized so that 12 lie on each of the 20 facets. The central 9 hexons in a facet are cemented together by 12 copies of polypeptide IX. The penton complex, formed by the peripentonal hexons and base hexon (holding in place a fibre), lie at each of the 12 vertices. The N and C-terminal domains adopt the same PNGase F-like fold although they are significantly different in length.	586
395847	pfam01066	CDP-OH_P_transf	CDP-alcohol phosphatidyltransferase. All of these members have the ability to catalyze the displacement of CMP from a CDP-alcohol by a second alcohol with formation of a phosphodiester bond and concomitant breaking of a phosphoric anhydride bond.	65
395848	pfam01067	Calpain_III	Calpain large subunit, domain III. The function of the domain III and I are currently unknown. Domain II is a cysteine protease and domain IV is a calcium binding domain. Calpains are believed to participate in intracellular signaling pathways mediated by calcium ions.	135
395849	pfam01068	DNA_ligase_A_M	ATP dependent DNA ligase domain. This domain belongs to a more diverse superfamily, including pfam01331 and pfam01653.	203
395850	pfam01070	FMN_dh	FMN-dependent dehydrogenase. 	350
395851	pfam01071	GARS_A	Phosphoribosylglycinamide synthetase, ATP-grasp (A) domain. Phosphoribosylglycinamide synthetase catalyzes the second step in the de novo biosynthesis of purine. The reaction catalyzed by Phosphoribosylglycinamide synthetase is the ATP-dependent addition of 5-phosphoribosylamine to glycine to form 5'-phosphoribosylglycinamide. This domain is related to the ATP-grasp domain of biotin carboxylase/carbamoyl phosphate synthetase (see pfam02786).	194
366449	pfam01073	3Beta_HSD	3-beta hydroxysteroid dehydrogenase/isomerase family. The enzyme 3 beta-hydroxysteroid dehydrogenase/5-ene-4-ene isomerase (3 beta-HSD) catalyzes the oxidation and isomerisation of 5-ene-3 beta-hydroxypregnene and 5-ene-hydroxyandrostene steroid precursors into the corresponding 4-ene-ketosteroids necessary for the formation of all classes of steroid hormones.	279
395852	pfam01074	Glyco_hydro_38	Glycosyl hydrolases family 38 N-terminal domain. Glycosyl hydrolases are key enzymes of carbohydrate metabolism.	271
395853	pfam01075	Glyco_transf_9	Glycosyltransferase family 9 (heptosyltransferase). Members of this family belong to glycosyltransferase family 9. Lipopolysaccharide is a major component of the outer leaflet of the outer membrane in Gram-negative bacteria. It is composed of three domains; lipid A, Core oligosaccharide and the O-antigen. All of these enzymes transfer heptose to the lipopolysaccharide core.	247
395854	pfam01076	Mob_Pre	Plasmid recombination enzyme. With some plasmids, recombination can occur in a site specific manner that is independent of RecA. In such cases, the recombination event requires another protein called Pre. Pre is a plasmid recombination enzyme. This protein is also known as Mob (conjugative mobilisation).	195
395855	pfam01077	NIR_SIR	Nitrite and sulphite reductase 4Fe-4S domain. Sulphite and nitrite reductases are vital in the biosynthetic assimilation of sulphur and nitrogen, respectively. They are also both important for the dissimilation of oxidized anions for energy transduction.	153
307292	pfam01078	Mg_chelatase	Magnesium chelatase, subunit ChlI. Magnesium-chelatase is a three-component enzyme that catalyzes the insertion of Mg2+ into protoporphyrin IX. This is the first unique step in the synthesis of (bacterio)chlorophyll. Due to this, it is thought that Mg-chelatase has an important role in channelling intermediates into the (bacterio)chlorophyll branch in response to conditions suitable for photosynthetic growth. ChlI and BchD have molecular weight between 38-42 kDa.	207
395856	pfam01079	Hint	Hint module. This is an alignment of the Hint module in the Hedgehog proteins. It does not include any Inteins which also possess the Hint module.	211
395857	pfam01080	Presenilin	Presenilin. Mutations in presenilin-1 are a major cause of early onset Alzheimer's disease. It has been found that presenilin-1 binds to beta-catenin in-vivo. This family also contains SPE proteins from C.elegans.	387
395858	pfam01081	Aldolase	KDPG and KHG aldolase. This family includes the following members: 4-hydroxy-2-oxoglutarate aldolase (KHG-aldolase) Phospho-2-dehydro-3-deoxygluconate aldolase (KDPG-aldolase)	196
395859	pfam01082	Cu2_monooxygen	Copper type II ascorbate-dependent monooxygenase, N-terminal domain. The N and C-terminal domains of members of this family adopt the same PNGase F-like fold.	130
395860	pfam01083	Cutinase	Cutinase. 	173
395861	pfam01084	Ribosomal_S18	Ribosomal protein S18. 	52
395862	pfam01085	HH_signal	Hedgehog amino-terminal signalling domain. For the carboxyl Hint module, see pfam01079. Hedgehog is a family of secreted signal molecules required for embryonic cell differentiation.	146
395863	pfam01086	Clathrin_lg_ch	Clathrin light chain. 	168
395864	pfam01087	GalP_UDP_transf	Galactose-1-phosphate uridyl transferase, N-terminal domain. SCOP reports fold duplication with C-terminal domain. Both involved in Zn and Fe binding.	182
395865	pfam01088	Peptidase_C12	Ubiquitin carboxyl-terminal hydrolase, family 1. 	205
395866	pfam01090	Ribosomal_S19e	Ribosomal protein S19e. 	137
395867	pfam01091	PTN_MK_C	PTN/MK heparin-binding protein family, C-terminal domain. 	61
395868	pfam01092	Ribosomal_S6e	Ribosomal protein S6e. 	124
395869	pfam01093	Clusterin	Clusterin. 	417
395870	pfam01094	ANF_receptor	Receptor family ligand binding region. This family includes extracellular ligand binding domains of a wide range of receptors. This family also includes the bacterial amino acid binding proteins of known structure.	349
395871	pfam01095	Pectinesterase	Pectinesterase. 	298
395872	pfam01096	TFIIS_C	Transcription factor S-II (TFIIS). 	39
395873	pfam01097	Defensin_2	Arthropod defensin. 	34
279444	pfam01098	FTSW_RODA_SPOVE	Cell cycle protein. This entry includes the following members; FtsW, RodA, SpoVE	359
395874	pfam01099	Uteroglobin	Uteroglobin family. Uteroglobin is a homodimer of two identical 70 amino acid polypeptides linked by two disulphide bridges. The precise role of uteroglobin has still to be elucidated.	90
395875	pfam01101	HMG14_17	HMG14 and HMG17. 	90
395876	pfam01102	Glycophorin_A	Glycophorin A. 	113
395877	pfam01103	Bac_surface_Ag	Surface antigen. This entry includes the following surface antigens; D15 antigen from H.influenzae, OMA87 from P.multocida, OMP85 from N.meningitidis and N.gonorrhoeae. The family also includes a number of eukaryotic proteins that are members of the UPF0140 family. There also appears to be a relationship to pfam03865 (personal obs: C Yeats). In eukaryotes, it appears that these proteins are not surface antigens; S. cerevisiae YNL026W (SAM50) is an essential component of the Sorting and Assembly Machinery (SAM) of the mitochondrial outer membrane. The protein was localized to the mitochondria.	323
279449	pfam01104	Bunya_NS-S	Bunyavirus non-structural protein NS-s. The NS-s protein is encoded by the S RNA. This segment also encodes for the N protein. These two proteins are encoded by overlapping reading frames.	91
395878	pfam01105	EMP24_GP25L	emp24/gp25L/p24 family/GOLD. Members of this family are implicated in bringing cargo forward from the ER and binding to coat proteins by their cytoplasmic domains. This domain corresponds closely to the beta-strand rich GOLD domain. The GOLD domain is always found combined with lipid- or membrane-association domains.	181
395879	pfam01106	NifU	NifU-like domain. This is an alignment of the carboxy-terminal domain. This is the only common region between the NifU protein from nitrogen-fixing bacteria and rhodobacterial species. The biochemical function of NifU is unknown.	67
279452	pfam01107	MP	Viral movement protein (MP). This family includes a variety of movement proteins (MPs). The MP is necessary for the initial cell-to-cell movement during the early stages of a viral infection. This movement is active, and it is known that the MP interacts with the plasmodesmata and possesses the ability to bind to RNA to achieve its role. This family also includes virus movement proteins from the caulimovirus family. It has been suggested in cauliflower mosaic virus that these proteins mediate viral movement by modifying plasmodesmata and forming tubules in the channel that can accommodate the virus particles. The family contains a conserved DXR motif that is probably functionally important.	191
395880	pfam01108	Tissue_fac	Tissue factor. This family is found in metazoa, and is very similar to the fibronectin type III domain. The family is found in cytokine receptors, interleukin and interferon receptors and coagulation factor III proteins. It occurs multiple times, as does fn3, family pfam00041.	107
144630	pfam01109	GM_CSF	Granulocyte-macrophage colony-stimulating factor. 	122
395881	pfam01110	CNTF	Ciliary neurotrophic factor. 	187
395882	pfam01111	CKS	Cyclin-dependent kinase regulatory subunit. 	66
395883	pfam01112	Asparaginase_2	Asparaginase. 	304
395884	pfam01113	DapB_N	Dihydrodipicolinate reductase, N-terminus. Dihydrodipicolinate reductase (DapB) reduces the alpha,beta-unsaturated cyclic imine, dihydro-dipicolinate. This reaction is the second committed step in the biosynthesis of L-lysine and its precursor meso-diaminopimelate, which are critical for both protein and cell wall biosynthesis. The N-terminal domain of DapB binds the dinucleotide NADPH.	121
279458	pfam01114	Colipase	Colipase, N-terminal domain. SCOP reports duplication of common fold with Colipase C-terminal domain.	40
395885	pfam01115	F_actin_cap_B	F-actin capping protein, beta subunit. 	230
395886	pfam01116	F_bP_aldolase	Fructose-bisphosphate aldolase class-II. 	277
366474	pfam01117	Aerolysin	Aerolysin toxin. This family represents the pore forming lobe of aerolysin.	359
395887	pfam01118	Semialdhyde_dh	Semialdehyde dehydrogenase, NAD binding domain. This Pfam entry contains the following members: N-acetyl-glutamine semialdehyde dehydrogenase (AgrC) Aspartate-semialdehyde dehydrogenase	119
395888	pfam01119	DNA_mis_repair	DNA mismatch repair protein, C-terminal domain. This family represents the C-terminal domain of the mutL/hexB/PMS1 family. This domain has a ribosomal S5 domain 2-like fold.	117
395889	pfam01120	Alpha_L_fucos	Alpha-L-fucosidase. 	333
395890	pfam01121	CoaE	Dephospho-CoA kinase. This family catalyzes the phosphorylation of the 3'-hydroxyl group of dephosphocoenzyme A to form Coenzyme A EC:2.7.1.24. This enzyme uses ATP in its reaction.	179
395891	pfam01122	Cobalamin_bind	Eukaryotic cobalamin-binding protein. 	300
395892	pfam01123	Stap_Strp_toxin	Staphylococcal/Streptococcal toxin, OB-fold domain. 	79
395893	pfam01124	MAPEG	MAPEG family. This family has been called MAPEG (Membrane Associated Proteins in Eicosanoid and Glutathione metabolism). It includes proteins such as Prostaglandin E synthase. This enzyme catalyzes the synthesis of PGE2 from PGH2 (produced by cyclooxygenase from arachidonic acid). Because of structural similarities in the active sites of FLAP, LTC4 synthase and PGE synthase, substrates for each enzyme can compete with one another and modulate synthetic activity.	127
395894	pfam01125	G10	G10 protein. 	146
395895	pfam01126	Heme_oxygenase	Heme oxygenase. 	204
395896	pfam01127	Sdh_cyt	Succinate dehydrogenase/Fumarate reductase transmembrane subunit. This family includes a transmembrane protein from both the Succinate dehydrogenase and Fumarate reductase complexes.	122
395897	pfam01128	IspD	2-C-methyl-D-erythritol 4-phosphate cytidylyltransferase. Members of this family are enzymes which catalyze the formation of 4-diphosphocytidyl-2-C-methyl-D-erythritol from cytidine triphosphate and 2-C-methyl-D-erythritol 4-phosphate (MEP).	219
279473	pfam01129	ART	NAD:arginine ADP-ribosyltransferase. 	222
395898	pfam01130	CD36	CD36 family. The CD36 family is thought to be a novel class of scavenger receptors. There is also evidence suggesting a possible role in signal transduction. CD36 is involved in cell adhesion.	453
395899	pfam01131	Topoisom_bac	DNA topoisomerase. This subfamily of topoisomerase is divided on the basis that these enzymes preferentially relax negatively supercoiled DNA, form a 5' phospho-tyrosine linkage in the enzyme-DNA covalent intermediate and have high affinity for single stranded DNA.	409
395900	pfam01132	EFP	Elongation factor P (EF-P) OB domain. 	54
395901	pfam01133	ER	Enhancer of rudimentary. Enhancer of rudimentary is a protein of unknown function that is highly conserved in plants and animals. This protein is found to be an enhancer of the rudimentary gene.	98
250388	pfam01134	GIDA	Glucose inhibited division protein A. 	391
395902	pfam01135	PCMT	Protein-L-isoaspartate(D-aspartate) O-methyltransferase (PCMT). 	205
395903	pfam01136	Peptidase_U32	Peptidase family U32. 	233
395904	pfam01137	RTC	RNA 3'-terminal phosphate cyclase. RNA cyclases are a family of RNA-modifying enzymes that are conserved in all cellular organisms. They catalyze the ATP-dependent conversion of the 3'-phosphate to the 2',3'-cyclic phosphodiester at the end of RNA, in a reaction involving formation of the covalent AMP-cyclase intermediate. The structure of RTC demonstrates that RTCs comprise two domains. The larger domain contains an insert domain of approximately 100 amino acids.	324
395905	pfam01138	RNase_PH	3' exoribonuclease family, domain 1. This family includes 3'-5' exoribonucleases. Ribonuclease PH contains a single copy of this domain, and removes nucleotide residues following the -CCA terminus of tRNA. Polyribonucleotide nucleotidyltransferase (PNPase) contains two tandem copies of the domain. PNPase is involved in mRNA degradation in a 3'-5' direction. The exosome is a 3'-5' exoribonuclease complex that is required for 3' processing of the 5.8S rRNA. Three of its five protein components contain a copy of this domain. A hypothetical protein from S. pombe appears to belong to an uncharacterized subfamily. This subfamily is found in both eukaryotes and archaebacteria.	129
395906	pfam01139	RtcB	tRNA-splicing ligase RtcB. This family of RNA ligases (EC:6.5.1.3) join 2',3'-cyclic phosphate and 5'-OH ends. They catalyze the splicing of tRNA and may also participate in tRNA repair and recovery from stress-induced RNA damage.	415
395907	pfam01140	Gag_MA	Matrix protein (MA), p15. The matrix protein, p15, is encoded by the gag gene. MA is involved in pathogenicity.	126
279483	pfam01141	Gag_p12	Gag polyprotein, inner coat protein p12. The retroviral p12 is a virion structural protein. p12 is proline rich. The function carried out by p12 in assembly and replication is unknown. p12 is associated with pathogenicity of the virus.	85
395908	pfam01142	TruD	tRNA pseudouridine synthase D (TruD). TruD is responsible for synthesis of pseudouridine from uracil-13 in transfer RNAs. The structure of TruD reveals an overall V-shaped molecule which contains an RNA-binding cleft.	415
395909	pfam01144	CoA_trans	Coenzyme A transferase. 	216
395910	pfam01145	Band_7	SPFH domain / Band 7 family. This family has been called SPFH, Band 7 or PHB domain. Recent phylogenetic analysis has shown this domain to be a slipin or Stomatin-like integral membrane domain conserved from protozoa to mammals.	176
395911	pfam01146	Caveolin	Caveolin. All three known Caveolin forms have the FEDVIAEP caveolin 'signature motif' within their hydrophilic N-terminal domain. Caveolin 2 (Cav-2) is co-localized and co-expressed with Cav-1/VIP21, forms heterodimers with it and needs Cav-1 for proper membrane localization. Cav-3 has greater protein sequence similarity to Cav-1 than to Cav-2. Caveolins are involved in cellular processes including vesicular transport, cholesterol homeostasis, signal transduction, and tumor suppression.	131
395912	pfam01147	Crust_neurohorm	Crustacean CHH/MIH/GIH neurohormone family. 	67
395913	pfam01148	CTP_transf_1	Cytidylyltransferase family. The members of this family are integral membrane protein cytidylyltransferases. The family includes phosphatidate cytidylyltransferase EC:2.7.7.41 as well as Sec59 from yeast. Sec59 is a dolichol kinase EC:2.7.1.108.	264
395914	pfam01149	Fapy_DNA_glyco	Formamidopyrimidine-DNA glycosylase N-terminal domain. Formamidopyrimidine-DNA glycosylase (Fpg) is a DNA repair enzyme that excises oxidized purines from damaged DNA. This family is the N-terminal domain, which contains eight beta-strands forming a beta-sandwich with two alpha-helices parallel to its edges.	118
395915	pfam01150	GDA1_CD39	GDA1/CD39 (nucleoside phosphatase) family. 	423
395916	pfam01151	ELO	GNS1/SUR4 family. Members of this family are involved in long chain fatty acid elongation systems that produce the 26-carbon precursors for ceramide and sphingolipid synthesis. Predicted to be integral membrane proteins, in eukaryotes they are probably located on the endoplasmic reticulum. Yeast ELO3 affects plasma membrane H+-ATPase activity, and may act on a glucose-signaling pathway that controls the expression of several genes that are transcriptionally regulated by glucose such as PMA1.	244
395917	pfam01152	Bac_globin	Bacterial-like globin. This family of heme binding proteins is found mainly in bacteria. However, members can also be found in some protozoa and plants.	121
395918	pfam01153	Glypican	Glypican. 	554
307348	pfam01154	HMG_CoA_synt_N	Hydroxymethylglutaryl-coenzyme A synthase N terminal. 	173
395919	pfam01155	HypA	Hydrogenase/urease nickel incorporation, metallochaperone, hypA. HypA is a metallochaperone that binds nickel to bring it safely to its target. The targets for HypA are the nickel-containing enzymes [Ni,Fe]-hydrogenase and urease. The nickel coordinates with four nitrogens within the protein. The four conserved cysteines towards the C-terminus bind one zinc moiety probably to stabilize the protein fold.	111
395920	pfam01156	IU_nuc_hydro	Inosine-uridine preferring nucleoside hydrolase. 	255
395921	pfam01157	Ribosomal_L21e	Ribosomal protein L21e. 	100
395922	pfam01158	Ribosomal_L36e	Ribosomal protein L36e. 	96
395923	pfam01159	Ribosomal_L6e	Ribosomal protein L6e. 	109
395924	pfam01160	Opiods_neuropep	Vertebrate endogenous opioids neuropeptide. 	47
395925	pfam01161	PBP	Phosphatidylethanolamine-binding protein. 	140
395926	pfam01163	RIO1	RIO1 family. This is a family of atypical serine kinases which are found in archaea, bacteria and eukaryotes. Activity of Rio1 is vital in Saccharomyces cerevisiae for the processing of ribosomal RNA, as well as for proper cell cycle progression and chromosome maintenance. The structure of RIO1 has been determined.	184
395927	pfam01165	Ribosomal_S21	Ribosomal protein S21. 	51
395928	pfam01166	TSC22	TSC-22/dip/bun family. 	57
395929	pfam01167	Tub	Tub family. 	250
395930	pfam01168	Ala_racemase_N	Alanine racemase, N-terminal domain. 	220
395931	pfam01169	UPF0016	Uncharacterized protein family UPF0016. This family contains integral membrane proteins of unknown function. Most members of the family contain two copies of a region that contains an EXGD motif. Each of these regions contains three predicted transmembrane regions. It has been suggested that these proteins are calcium transporters.	75
395932	pfam01170	UPF0020	Putative RNA methylase family UPF0020. This domain is probably a methylase. It is associated with the THUMP domain that also occurs with RNA modification domains.	184
395933	pfam01171	ATP_bind_3	PP-loop family. This family of proteins belongs to the PP-loop superfamily.	178
395934	pfam01172	SBDS	Shwachman-Bodian-Diamond syndrome (SBDS) protein. This family is highly conserved in species ranging from archaea to vertebrates and plants. The family contains several Shwachman-Bodian-Diamond syndrome (SBDS) proteins from both mouse and humans. Shwachman-Diamond syndrome is an autosomal recessive disorder with clinical features that include pancreatic exocrine insufficiency, haematological dysfunction and skeletal abnormalities. It is characterized by bone marrow failure and leukemia predisposition. Members of this family play a role in RNA metabolism. In yeast these proteins have been shown to be critical for the release and recycling of the nucleolar shuttling factor Tif6 from pre-60S ribosomes, a key step in 60S maturation and translational activation of ribosomes. This data links defective late 60S subunit maturation to an inherited bone marrow failure syndrome associated with leukemia predisposition.	82
334414	pfam01174	SNO	SNO glutamine amidotransferase family. This family and its amidotransferase domain was first described in. It is predicted that members of this family are involved in the pyridoxine biosynthetic pathway, based on the proximity and co-regulation of the corresponding genes and physical interaction between the members of pfam01174 and pfam01680.	188
395935	pfam01175	Urocanase	Urocanase Rossmann-like domain. 	209
395936	pfam01176	eIF-1a	Translation initiation factor 1A / IF-1. This family includes both the eukaryotic translation factor eIF-1A and the bacterial translation initiation factor IF-1.	62
395937	pfam01177	Asp_Glu_race	Asp/Glu/Hydantoin racemase. This family contains aspartate racemase, maleate isomerases EC:5.2.1.1, glutamate racemase, hydantoin racemase and arylmalonate decarboxylase EC:4.1.1.76.	210
395938	pfam01179	Cu_amine_oxid	Copper amine oxidase, enzyme domain. Copper amine oxidases are a ubiquitous and novel group of quinoenzymes that catalyze the oxidative deamination of primary amines to the corresponding aldehydes, with concomitant reduction of molecular oxygen to hydrogen peroxide. The enzymes are dimers of identical 70-90 kDa subunits, each of which contains a single copper ion and a covalently bound cofactor formed by the post-translational modification of a tyrosine side chain to 2,4,5-trihydroxyphenylalanine quinone (TPQ). This family corresponds to the catalytic domain of the enzyme.	405
395939	pfam01180	DHO_dh	Dihydroorotate dehydrogenase. 	291
395940	pfam01182	Glucosamine_iso	Glucosamine-6-phosphate isomerases/6-phosphogluconolactonase. 	221
395941	pfam01183	Glyco_hydro_25	Glycosyl hydrolases family 25. 	180
395942	pfam01184	Grp1_Fun34_YaaH	GPR1/FUN34/yaaH family. The Ady2 protein is required for acetate in Saccharomyces cerevisiae, and is probably an acetate transporter. A homolog in Yarrowia lipolytica (GPR1) has a role in acetic acid sensitivity.	207
395943	pfam01185	Hydrophobin	Fungal hydrophobin. 	79
395944	pfam01186	Lysyl_oxidase	Lysyl oxidase. 	200
395945	pfam01187	MIF	Macrophage migration inhibitory factor (MIF). 	114
395946	pfam01189	Methyltr_RsmB-F	16S rRNA methyltransferase RsmB/F. This is the catalytic core of this SAM-dependent 16S ribosomal methyltransferase RsmB/F enzyme. There is a catalytic cysteine residue at 180 in UniProtKB:Q5SII2, with another highly conserved cysteine at residue 230. It methylates the C(5) position of cytosine 2870 (m5C2870) in 25S rRNA.	199
395947	pfam01190	Pollen_Ole_e_I	Pollen proteins Ole e I like. 	94
395948	pfam01191	RNA_pol_Rpb5_C	RNA polymerase Rpb5, C-terminal domain. The assembly domain of Rpb5. The archaeal equivalent to this domain is subunit H. Subunit H lacks the N-terminal domain.	72
395949	pfam01192	RNA_pol_Rpb6	RNA polymerase Rpb6. Rpb6 is an essential subunit in the eukaryotic polymerases Pol I, II and III. This family also contains the bacterial equivalent to Rpb6, the omega subunit. Rpb6 and omega are structurally conserved and both function in polymerase assembly.	53
395950	pfam01193	RNA_pol_L	RNA polymerase Rpb3/Rpb11 dimerization domain. The two eukaryotic subunits Rpb3 and Rpb11 dimerize to form a platform onto which the other subunits of the RNA polymerase assemble (D/L in archaea). The prokaryotic equivalent of the Rpb3/Rpb11 platform is the alpha-alpha dimer. The dimerization domain of the alpha subunit/Rpb3 is interrupted by an insert domain (pfam01000). Some of the alpha subunits also contain iron-sulphur binding domains (pfam00037). Rpb11 is found as a continuous domain. Members of this family include: alpha subunit from eubacteria, alpha subunits from chloroplasts, Rpb3 subunits from eukaryotes, Rpb11 subunits from eukaryotes, RpoD subunits from archaeal spp, and RpoL subunits from archaeal spp.	191
395951	pfam01194	RNA_pol_N	RNA polymerases N / 8 kDa subunit. 	59
395952	pfam01195	Pept_tRNA_hydro	Peptidyl-tRNA hydrolase. 	177
395953	pfam01196	Ribosomal_L17	Ribosomal protein L17. 	97
395954	pfam01197	Ribosomal_L31	Ribosomal protein L31. 	65
395955	pfam01198	Ribosomal_L31e	Ribosomal protein L31e. 	82
395956	pfam01199	Ribosomal_L34e	Ribosomal protein L34e. 	94
395957	pfam01200	Ribosomal_S28e	Ribosomal protein S28e. 	64
395958	pfam01201	Ribosomal_S8e	Ribosomal protein S8e. 	126
395959	pfam01202	SKI	Shikimate kinase. 	158
395960	pfam01203	T2SSN	Type II secretion system (T2SS), protein N. Members of the T2SN family are involved in the Type II protein secretion system. The precise function of these proteins is unknown.	207
395961	pfam01204	Trehalase	Trehalase. Trehalase (EC:3.2.1.28) is known to recycle trehalose to glucose. Trehalose is a physiological hallmark of heat-shock response in yeast and protects proteins and membranes against a variety of stresses. This family is found in conjunction with pfam07492 in fungi.	509
395962	pfam01205	UPF0029	Uncharacterized protein family UPF0029. 	103
395963	pfam01206	TusA	Sulfurtransferase TusA. This family includes the TusA sulfurtransferases.	65
395964	pfam01207	Dus	Dihydrouridine synthase (Dus). Members of this family catalyze the reduction of the 5,6-double bond of a uridine residue on tRNA. Dihydrouridine modification of tRNA is widely observed in prokaryotes and eukaryotes, and also in some archaea. Most dihydrouridines are found in the D loop of tRNAs. The role of dihydrouridine in tRNA is currently unknown, but may increase conformational flexibility of the tRNA. It is likely that different family members have different substrate specificities, which may overlap. Dus 1 from Saccharomyces cerevisiae acts on pre-tRNA-Phe, while Dus 2 acts on pre-tRNA-Tyr and pre-tRNA-Leu. Dus 1 is active as a single subunit, requiring NADPH or NADH, and is stimulated by the presence of FAD. Some family members may be targeted to the mitochondria and even have a role in mitochondria.	310
395965	pfam01208	URO-D	Uroporphyrinogen decarboxylase (URO-D). 	344
395966	pfam01209	Ubie_methyltran	ubiE/COQ5 methyltransferase family. 	228
395967	pfam01210	NAD_Gly3P_dh_N	NAD-dependent glycerol-3-phosphate dehydrogenase N-terminus. NAD-dependent glycerol-3-phosphate dehydrogenase (GPDH) catalyzes the interconversion of dihydroxyacetone phosphate and L-glycerol-3-phosphate. This family represents the N-terminal NAD-binding domain.	158
395968	pfam01212	Beta_elim_lyase	Beta-eliminating lyase. 	283
395969	pfam01213	CAP_N	Adenylate cyclase associated (CAP) N terminal. 	70
395970	pfam01214	CK_II_beta	Casein kinase II regulatory subunit. 	182
395971	pfam01215	COX5B	Cytochrome c oxidase subunit Vb. 	125
395972	pfam01216	Calsequestrin	Calsequestrin. 	350
395973	pfam01217	Clat_adaptor_s	Clathrin adaptor complex small chain. 	142
395974	pfam01218	Coprogen_oxidas	Coproporphyrinogen III oxidase. 	292
395975	pfam01219	DAGK_prokar	Prokaryotic diacylglycerol kinase. 	102
395976	pfam01220	DHquinase_II	Dehydroquinase class II. 	138
395977	pfam01221	Dynein_light	Dynein light chain type 1. 	83
250456	pfam01222	ERG4_ERG24	Ergosterol biosynthesis ERG4/ERG24 family. 	429
395978	pfam01223	Endonuclease_NS	DNA/RNA non-specific endonuclease. 	219
395979	pfam01225	Mur_ligase	Mur ligase family, catalytic domain. This family contains a number of related ligase enzymes which have EC numbers 6.3.2.*. This family includes: MurC, MurD, MurE, MurF, Mpl, and FolC. MurC, MurD, MurE and MurF catalyze consecutive steps in the synthesis of peptidoglycan. Peptidoglycan consists of a sheet of two sugar derivatives, with one of these N-acetylmuramic acid attaching to a small pentapeptide. The pentapeptide is made of L-alanine, D-glutamic acid, Meso-diaminopimelic acid and D-alanyl alanine. The peptide moiety is synthesized by successively adding these amino acids to UDP-N-acetylmuramic acid. MurC transfers the L-alanine, MurD transfers the D-glutamate, MurE transfers the diaminopimelic acid, and MurF transfers the D-alanyl alanine. This family also includes Folylpolyglutamate synthase that transfers glutamate to folylpolyglutamate.	84
395980	pfam01226	Form_Nir_trans	Formate/nitrite transporter. 	244
395981	pfam01227	GTP_cyclohydroI	GTP cyclohydrolase I. This family includes GTP cyclohydrolase enzymes and a family of related bacterial proteins.	176
395982	pfam01228	Gly_radical	Glycine radical. 	106
395983	pfam01229	Glyco_hydro_39	Glycosyl hydrolases family 39. 	490
395984	pfam01230	HIT	HIT domain. 	98
395985	pfam01231	IDO	Indoleamine 2,3-dioxygenase. 	410
395986	pfam01232	Mannitol_dh	Mannitol dehydrogenase Rossmann domain. 	151
395987	pfam01233	NMT	Myristoyl-CoA:protein N-myristoyltransferase, N-terminal domain. The N and C-terminal domains of NMT are structurally similar, each adopting an acyl-CoA N-acyltransferase-like fold.	158
395988	pfam01234	NNMT_PNMT_TEMT	NNMT/PNMT/TEMT family. 	261
395989	pfam01235	Na_Ala_symp	Sodium:alanine symporter family. 	385
395990	pfam01237	Oxysterol_BP	Oxysterol-binding protein. 	362
395991	pfam01238	PMI_typeI	Phosphomannose isomerase type I. This is a family of Phosphomannose isomerase type I enzymes (EC 5.3.1.8).	373
395992	pfam01239	PPTA	Protein prenyltransferase alpha subunit repeat. Both farnesyltransferase (FT) and geranylgeranyltransferase 1 (GGT1) recognize a CaaX motif on their substrates where 'a' stands for preferably aliphatic residues, whereas GGT2 recognizes a completely different motif. Important substrates for FT include, amongst others, many members of the Ras superfamily. GGT1 substrates include some of the other small GTPases and GGT2 substrates include the Rab family.	32
395993	pfam01241	PSI_PSAK	Photosystem I psaG / psaK. 	69
395994	pfam01242	PTPS	6-pyruvoyl tetrahydropterin synthase. 6-Pyruvoyl tetrahydrobiopterin synthase catalyzes the conversion of dihydroneopterin triphosphate to 6-pyruvoyl tetrahydropterin, the second of three enzymatic steps in the synthesis of tetrahydrobiopterin from GTP. The functional enzyme is a hexamer of identical subunits.	121
395995	pfam01243	Putative_PNPOx	Pyridoxamine 5'-phosphate oxidase. Family of domains with putative PNPOx function. Family members were predicted to encode pyridoxamine 5'-phosphate oxidase, based on sequence similarity. However, there is no experimental data to validate the predicted activity and purified proteins, such as yeast YLR456W and its paralogs, do not possess this activity, nor do they bind to flavin mononucleotide (FMN). To date, the only time functional oxidase activity has been experimentally demonstrated is when the sequences contain both pfam01243 and pfam10590. Moreover, some of the family members that contain both domains have been shown to be involved in phenazine biosynthesis. While some molecular function has been experimentally validated for the proteins containing both domains, the role performed by each domain on its own is unknown.	88
395996	pfam01244	Peptidase_M19	Membrane dipeptidase (Peptidase family M19). 	317
395997	pfam01245	Ribosomal_L19	Ribosomal protein L19. 	108
395998	pfam01246	Ribosomal_L24e	Ribosomal protein L24e. 	63
395999	pfam01247	Ribosomal_L35Ae	Ribosomal protein L35Ae. 	94
396000	pfam01248	Ribosomal_L7Ae	Ribosomal protein L7Ae/L30e/S12e/Gadd45 family. This family includes: Ribosomal L7A from metazoa, Ribosomal L8-A and L8-B from fungi, 30S ribosomal protein HS6 from archaebacteria, 40S ribosomal protein S12 from eukaryotes, Ribosomal protein L30 from eukaryotes and archaebacteria. Gadd45 and MyD118.	95
396001	pfam01249	Ribosomal_S21e	Ribosomal protein S21e. 	79
396002	pfam01250	Ribosomal_S6	Ribosomal protein S6. 	88
396003	pfam01251	Ribosomal_S7e	Ribosomal protein S7e. 	184
396004	pfam01252	Peptidase_A8	Signal peptidase (SPase) II. 	140
396005	pfam01253	SUI1	Translation initiation factor SUI1. 	77
366540	pfam01254	TP2	Nuclear transition protein 2. 	133
396006	pfam01255	Prenyltransf	Putative undecaprenyl diphosphate synthase. Previously known as uncharacterized protein family UPF0015, a single member of this family has been identified as an undecaprenyl diphosphate synthase.	220
396007	pfam01256	Carb_kinase	Carbohydrate kinase. This family is related to pfam02110 and pfam00294 implying that it also is a carbohydrate kinase. (personal obs Yeats C).	242
396008	pfam01257	2Fe-2S_thioredx	Thioredoxin-like [2Fe-2S] ferredoxin. 	145
396009	pfam01258	zf-dskA_traR	Prokaryotic dksA/traR C4-type zinc finger. 	36
396010	pfam01259	SAICAR_synt	SAICAR synthetase. Also known as Phosphoribosylaminoimidazole-succinocarboxamide synthase.	225
396011	pfam01261	AP_endonuc_2	Xylose isomerase-like TIM barrel. This TIM alpha/beta barrel structure is found in xylose isomerase and in endonuclease IV (EC:3.1.21.2). This domain is also found in the N termini of bacterial myo-inositol catabolism proteins. These are involved in the myo-inositol catabolism pathway, and are required for growth on myo-inositol in Rhizobium leguminosarum bv. viciae.	248
396012	pfam01262	AlaDh_PNT_C	Alanine dehydrogenase/PNT, C-terminal domain. This family now also contains the lysine 2-oxoglutarate reductases.	214
396013	pfam01263	Aldose_epim	Aldose 1-epimerase. 	300
396014	pfam01264	Chorismate_synt	Chorismate synthase. 	344
396015	pfam01265	Cyto_heme_lyase	Cytochrome c/c1 heme lyase. 	291
396016	pfam01266	DAO	FAD dependent oxidoreductase. This family includes various FAD dependent oxidoreductases: Glycerol-3-phosphate dehydrogenase EC:1.1.99.5, Sarcosine oxidase beta subunit EC:1.5.3.1, D-alanine oxidase EC:1.4.99.1, D-aspartate oxidase EC:1.4.3.1.	339
396017	pfam01267	F-actin_cap_A	F-actin capping protein alpha subunit. 	265
396018	pfam01268	FTHFS	Formate--tetrahydrofolate ligase. 	555
396019	pfam01269	Fibrillarin	Fibrillarin. 	227
396020	pfam01270	Glyco_hydro_8	Glycosyl hydrolases family 8. 	321
279595	pfam01271	Granin	Granin (chromogranin or secretogranin). 	584
396021	pfam01272	GreA_GreB	Transcription elongation factor, GreA/GreB, C-term. This domain has an FKBP-like fold.	77
396022	pfam01273	LBP_BPI_CETP	LBP / BPI / CETP family, N-terminal domain. The N and C terminal domains of the LBP/BPI/CETP family are structurally similar.	164
396023	pfam01274	Malate_synthase	Malate synthase. 	523
396024	pfam01275	Myelin_PLP	Myelin proteolipid protein (PLP or lipophilin). 	233
396025	pfam01276	OKR_DC_1	Orn/Lys/Arg decarboxylase, major domain. 	417
396026	pfam01277	Oleosin	Oleosin. 	113
396027	pfam01278	Omptin	Omptin family. The omptin family is a family of serine proteases.	282
396028	pfam01279	Parathyroid	Parathyroid hormone family. 	106
396029	pfam01280	Ribosomal_L19e	Ribosomal protein L19e. 	143
396030	pfam01281	Ribosomal_L9_N	Ribosomal protein L9, N-terminal domain. 	46
396031	pfam01282	Ribosomal_S24e	Ribosomal protein S24e. 	78
396032	pfam01283	Ribosomal_S26e	Ribosomal protein S26e. 	105
366555	pfam01284	MARVEL	Membrane-associating domain. MARVEL domain-containing proteins are often found in lipid-associating proteins - such as Occludin and MAL family proteins. It may be part of the machinery of membrane apposition events, such as transport vesicle biogenesis.	136
396033	pfam01285	TEA	TEA/ATTS domain family. 	68
396034	pfam01286	XPA_N	XPA protein N-terminal. 	32
396035	pfam01287	eIF-5a	Eukaryotic elongation factor 5A hypusine, DNA-binding OB fold. eIF5A, previously thought to be an initiation factor, has been shown to be required for peptide chain elongation in yeast.	69
396036	pfam01288	HPPK	7,8-dihydro-6-hydroxymethylpterin-pyrophosphokinase (HPPK). 	128
396037	pfam01289	Thiol_cytolysin	Thiol-activated cytolysin. 	354
396038	pfam01290	Thymosin	Thymosin beta-4 family. 	39
396039	pfam01291	LIF_OSM	LIF / OSM family. 	162
396040	pfam01292	Ni_hydr_CYTB	Prokaryotic cytochrome b561. This family includes cytochrome b561 and related proteins, in addition to the nickel-dependent hydrogenases b-type cytochrome subunit. Cytochrome b561 is a secretory vesicle-specific electron transport protein. It is an integral membrane protein, that binds two heme groups non-covalently. This is a prokaryotic family. Members of the 'eukaryotic cytochrome b561' family can be found in pfam03188.	180
396041	pfam01293	PEPCK_ATP	Phosphoenolpyruvate carboxykinase. 	465
396042	pfam01294	Ribosomal_L13e	Ribosomal protein L13e. 	180
396043	pfam01295	Adenylate_cycl	Adenylate cyclase, class-I. 	601
366564	pfam01296	Galanin	Galanin. 	29
396044	pfam01297	ZnuA	Zinc-uptake complex component A periplasmic. ZnuA includes periplasmic solute binding proteins such as TroA that interacts with an ATP-binding cassette transport system in Treponema pallidum. ZnuA is part of the bacterial zinc-uptake complex ZnuABC, whose components are the following families, ZinT, pfam09223, pfam00950, pfam00005, all of which are regulated by the transcription-regulator family FUR, pfam01475. ZinT acts as a Zn2+-buffering protein that delivers Zn2+ to ZnuA (TroA), a high-affinity zinc-uptake protein. In Gram-negative bacteria the ZnuABC transporter system ensures an adequate import of zinc in Zn2+-poor environments, such as those encountered by pathogens within the infected host.	268
396045	pfam01298	TbpB_B_D	C-lobe and N-lobe beta barrels of Tf-binding protein B. Bacterial lipoproteins represent a large group of specialized membrane proteins that perform a variety of functions including maintenance and stabilization of the cell envelope, protein targeting and transit to the outer membrane, membrane biogenesis, and cell adherence. Pathogenic Gram-negative bacteria within the Neisseriaceae and Pasteurellaceae families rely on a specialized uptake system, characterized by an essential surface receptor complex that acquires iron from host transferrin (Tf) and transports the iron across the outer membrane. They have an iron uptake system composed of surface exposed lipoprotein, Tf-binding protein B (TbpB), and an integral outer-membrane protein, Tf-binding protein A (TbpA), that together function to extract iron from the host iron binding glycoprotein (Tf). TbpB is a bilobed (N and C lobe) lipid-anchored protein with each lobe consisting of an eight-stranded beta barrel flanked by a handle domain made up of four (N lobe) or eight (C lobe) beta strands. TbpB extends from the outer membrane surface by virtue of an N-terminal peptide region that is anchored to the outer membrane by fatty acyl chains on the N-terminal cysteine and is involved in the initial capture of iron-loaded Tf. This domain family is found in C and N lobe eight stranded beta barrel region of TbpB proteins. The eight-stranded barrel domains in N and C lobe draw comparisons to eight-stranded beta barrel outer-membrane protein W (OmpW). However, in the barrel domains of TbpB the hydrophobic residues line the inner surface of the beta barrels to create a stable hydrophobic core.	125
396046	pfam01299	Lamp	Lysosome-associated membrane glycoprotein (Lamp). 	148
396047	pfam01300	Sua5_yciO_yrdC	Telomere recombination. This domain has been shown to bind preferentially to dsRNA. The domain is found in SUA5 as well as HypF and YrdC. It has also been shown to be required for telomere recombination in yeast.	178
396048	pfam01301	Glyco_hydro_35	Glycosyl hydrolases family 35. 	316
396049	pfam01302	CAP_GLY	CAP-Gly domain. Cytoskeleton-associated proteins (CAPs) are involved in the organisation of microtubules and transportation of vesicles and organelles along the cytoskeletal network. A conserved motif, CAP-Gly, has been identified in a number of CAPs, including CLIP-170 and dynactins. The crystal structure of Caenorhabditis elegans F53F4.3 protein CAP-Gly domain was recently solved. The domain contains three beta-strands. The most conserved sequence, GKNDG, is located in two consecutive sharp turns on the surface, forming the entrance to a groove.	65
366569	pfam01303	Egg_lysin	Egg lysin (Sperm-lysin). Egg lysin creates a hole in the envelope of the egg thereby allowing the sperm to pass through the envelope and fuse with the egg.	121
279626	pfam01304	Gas_vesicle_C	Gas vesicles protein GVPc repeated domain. 	33
396050	pfam01306	LacY_symp	LacY proton/sugar symporter. This family is closely related to the sugar transporter family.	413
279627	pfam01307	Plant_vir_prot	Plant viral movement protein. This family includes several known plant viral movement proteins from a number of different ssRNA plant virus families including potexviruses, hordeiviruses and carlaviruses.	100
279628	pfam01308	Chlam_OMP	Chlamydia major outer membrane protein. The major outer membrane protein of Chlamydia contains four symmetrically spaced variable domains (VDs I to IV). This protein is believed to be an integral part of pathogenesis, possibly adhesion. Along with the lipopolysaccharide, the major outer membrane protein (MOMP) makes up the surface of the elementary body cell. The MOMP is the protein used to determine the different serotypes.	397
279629	pfam01309	EAV_GS	Equine arteritis virus small envelope glycoprotein. Equine arteritis virus small envelope glycoprotein (Gs) is a class I transmembrane protein which adopts a number of different conformations.	196
366571	pfam01310	Adeno_PVIII	Adenovirus hexon associated protein, protein VIII. See pfam01065. This family represents Hexon.	216
396051	pfam01311	Bac_export_1	Bacterial export proteins, family 1. This family includes the following members: FliR, MopE, SsaT, YopT, Hrp, HrcT and SpaR. All of these members export proteins, that do not possess signal peptides, through the membrane. Although the proteins that these exporters move may be different, the exporters are thought to function in similar ways.	229
396052	pfam01312	Bac_export_2	FlhB HrpN YscU SpaS Family. This family includes the following members: FlhB, HrpN, YscU, SpaS, HrcU, SsaU and YopU. All of these proteins export peptides using the type III secretion system. The peptides exported are quite diverse.	338
396053	pfam01313	Bac_export_3	Bacterial export proteins, family 3. This family includes the following members: FliQ, MopD, HrcS, Hrp, YopS and SpaQ. All of these members export proteins, that do not possess signal peptides, through the membrane. Although the proteins that these exporters move may be different, the exporters are thought to function in similar ways.	72
396054	pfam01314	AFOR_C	Aldehyde ferredoxin oxidoreductase, domains 2 & 3. Aldehyde ferredoxin oxidoreductase (AOR) catalyzes the reversible oxidation of aldehydes to their corresponding carboxylic acids with their accompanying reduction of the redox protein ferredoxin. This family is composed of two structural domains that bind the tungsten cofactor via DXXGL(C/D) motifs. In addition to maintaining specific binding interactions with the cofactor, another role for domains 2 and 3 may be to regulate substrate access to AOR.	388
396055	pfam01315	Ald_Xan_dh_C	Aldehyde oxidase and xanthine dehydrogenase, a/b hammerhead domain. 	107
396056	pfam01316	Arg_repressor	Arginine repressor, DNA binding domain. 	69
279637	pfam01318	Bromo_coat	Bromovirus coat protein. 	187
396057	pfam01320	Colicin_Pyocin	Colicin immunity protein / pyocin immunity protein. 	82
396058	pfam01321	Creatinase_N	Creatinase/Prolidase N-terminal domain. This family includes the N-terminal non-catalytic domains from creatinase and prolidase. The exact function of this domain is uncertain.	127
396059	pfam01322	Cytochrom_C_2	Cytochrome C'. 	117
396060	pfam01323	DSBA	DSBA-like thioredoxin domain. This family contains a diverse set of proteins with a thioredoxin-like structure pfam00085. This family also includes 2-hydroxychromene-2-carboxylate (HCCA) isomerase enzymes, which catalyze one step in prokaryotic polyaromatic hydrocarbon (PAH) catabolic pathways. This family also contains members with functions other than HCCA isomerisation, such as Kappa family GSTs, whose similarity to HCCA isomerases was not previously recognized. Some members have been annotated as dioxygenases, dehydrogenases, or putative glycerol-3-phosphate transfer proteins, but are most likely HCCA isomerase enzymes.	192
396061	pfam01324	Diphtheria_R	Diphtheria toxin, R domain. C-terminal receptor binding (R) domain - binds to cell surface receptor, permitting the toxin to enter the cell by receptor mediated endocytosis.	167
396062	pfam01325	Fe_dep_repress	Iron dependent repressor, N-terminal DNA binding domain. This family includes the Diphtheria toxin repressor. DNA binding is through a helix-turn-helix motif.	57
396063	pfam01326	PPDK_N	Pyruvate phosphate dikinase, PEP/pyruvate binding domain. This enzyme catalyzes the reversible conversion of ATP to AMP, pyrophosphate and phosphoenolpyruvate (PEP).	328
396064	pfam01327	Pep_deformylase	Polypeptide deformylase. 	153
396065	pfam01328	Peroxidase_2	Peroxidase, family 2. The peroxidases in this family do not have similarity to other peroxidases.	186
396066	pfam01329	Pterin_4a	Pterin 4 alpha carbinolamine dehydratase. Pterin 4 alpha carbinolamine dehydratase is also known as DCoH (dimerization cofactor of hepatocyte nuclear factor 1-alpha).	88
396067	pfam01330	RuvA_N	RuvA N terminal domain. The N terminal domain of RuvA has an OB-fold structure. This domain forms the RuvA tetramer contacts.	61
396068	pfam01331	mRNA_cap_enzyme	mRNA capping enzyme, catalytic domain. This family represents the ATP binding catalytic domain of the mRNA capping enzyme.	194
396069	pfam01333	Apocytochr_F_C	Apocytochrome F, C-terminal. This is a sub-family of cytochrome C. See pfam00034.	115
396070	pfam01335	DED	Death effector domain. 	82
396071	pfam01336	tRNA_anti-codon	OB-fold nucleic acid binding domain. This family contains OB-fold domains that bind to nucleic acids. The family includes the anti-codon binding domain of lysyl, aspartyl, and asparaginyl -tRNA synthetases (see pfam00152). Aminoacyl-tRNA synthetases catalyze the addition of an amino acid to the appropriate tRNA molecule EC:6.1.1.-. This family also includes part of RecG helicase involved in DNA repair. Replication factor A is a hetero-trimeric complex, that contains a subunit in this family. This domain is also found at the C-terminus of bacterial DNA polymerase III alpha chain.	75
396072	pfam01337	Barstar	Barstar (barnase inhibitor). 	82
396073	pfam01338	Bac_thur_toxin	Bacillus thuringiensis toxin. 	230
396074	pfam01339	CheB_methylest	CheB methylesterase. 	177
279656	pfam01340	MetJ	Met Apo-repressor, MetJ. 	97
396075	pfam01341	Glyco_hydro_6	Glycosyl hydrolases family 6. 	293
396076	pfam01342	SAND	SAND domain. The DNA binding activity of two proteins has been mapped to the SAND domain. The conserved KDWK motif is necessary for DNA binding, and it appears to be important for dimerization. This region is also found in the putative transcription factor RegA from the multicellular green alga Volvox carteri. This region of RegA is known as the VARL domain.	75
396077	pfam01343	Peptidase_S49	Peptidase family S49. 	154
396078	pfam01344	Kelch_1	Kelch motif. The kelch motif was initially discovered in Kelch. In this protein there are six copies of the motif. It has been shown that the ring canal kelch protein is related to Galactose Oxidase for which a structure has been solved. The kelch motif forms a beta sheet. Several of these sheets associate to form a beta propeller structure as found in pfam00064, pfam00400 and pfam00415.	46
396079	pfam01345	DUF11	Domain of unknown function DUF11. A domain of unknown function found in multiple copies in several archaebacterial proteins. Conserved N-terminal lysine and C-terminal asparagine with central asp/glu suggests that many of these domains may contain an isopeptide bond.	114
396080	pfam01346	FKBP_N	Domain amino terminal to FKBP-type peptidyl-prolyl isomerase. This family is only found at the amino terminus of pfam00254. This domain is of unknown function.	97
396081	pfam01347	Vitellogenin_N	Lipoprotein amino terminal region. This family contains regions from: Vitellogenin, Microsomal triglyceride transfer protein and apolipoprotein B-100. These proteins are all involved in lipid transport. This family contains the LV1n chain from lipovitellin, that contains two structural domains.	582
279664	pfam01348	Intron_maturas2	Type II intron maturase. Group II introns use intron-encoded reverse transcriptase, maturase and DNA endonuclease activities for site-specific insertion into DNA. Although this type of intron is self splicing in vitro they require a maturase protein for splicing in vivo. It has been shown that a specific region of the aI2 intron is needed for the maturase function. This region was found to be conserved in group II introns and called domain X.	140
279665	pfam01349	Flavi_NS4B	Flavivirus non-structural protein NS4B. Flaviviruses encode a single polyprotein. This is cleaved into three structural and seven non-structural proteins. The NS4B protein is small and poorly conserved among the Flaviviruses. NS4B contains multiple hydrophobic potential membrane spanning regions. NS4B may form membrane components of the viral replication complex and could be involved in membrane localization of NS3 and pfam00972.	248
279666	pfam01350	Flavi_NS4A	Flavivirus non-structural protein NS4A. Flaviviruses encode a single polyprotein. This is cleaved into three structural and seven non-structural proteins. The NS4A protein is small and poorly conserved among the Flaviviruses. NS4A contains multiple hydrophobic potential membrane spanning regions. NS4A has only been found in cells infected by Kunjin virus.	144
396082	pfam01351	RNase_HII	Ribonuclease HII. 	199
396083	pfam01352	KRAB	KRAB box. The KRAB domain (or Kruppel-associated box) is present in about a third of zinc finger proteins containing C2H2 fingers. The KRAB domain is found to be involved in protein-protein interactions. The KRAB domain is generally encoded by two exons. The regions coded by the two exons are known as KRAB-A and KRAB-B. The A box plays an important role in repression by binding to corepressors, while the B box is thought to enhance this repression brought about by the A box. KRAB-containing proteins are thought to have critical functions in cell proliferation and differentiation, apoptosis and neoplastic transformation.	42
396084	pfam01353	GFP	Green fluorescent protein. 	212
396085	pfam01355	HIPIP	High potential iron-sulfur protein. 	66
396086	pfam01356	A_amylase_inhib	Alpha amylase inhibitor. 	68
396087	pfam01357	Pollen_allerg_1	Pollen allergen. This family contains allergens lol PI, PII and PIII from Lolium perenne.	75
396088	pfam01358	PARP_regulatory	Poly A polymerase regulatory subunit. 	292
396089	pfam01359	Transposase_1	Transposase (partial DDE domain). This family includes the mariner transposase.	80
396090	pfam01361	Tautomerase	Tautomerase enzyme. This family includes the enzyme 4-oxalocrotonate tautomerase, which catalyzes the ketonisation of 2-hydroxymuconate to 2-oxo-3-hexenedioate.	60
396091	pfam01363	FYVE	FYVE zinc finger. The FYVE zinc finger is named after four proteins that it has been found in: Fab1, YOTB/ZK632.12, Vac1, and EEA1. The FYVE finger has been shown to bind two Zn++ ions. The FYVE finger has eight potential zinc coordinating cysteine positions. Many members of this family also include two histidines in a motif R+HHC+XCG, where + represents a charged residue and X any residue. We have included members which do not conserve these histidine residues but are clearly related.	68
396092	pfam01364	Peptidase_C25	Peptidase family C25. 	343
396093	pfam01365	RYDR_ITPR	RIH domain. The RIH (RyR and IP3R Homology) domain is an extracellular domain from two types of calcium channels. This region is found in the ryanodine receptor and the inositol-1,4,5-trisphosphate receptor. This domain may form a binding site for IP3.	201
279677	pfam01366	PRTP	Herpesvirus processing and transport protein. The members of this family are associated with capsid intermediates during packaging of the virus.	653
396094	pfam01367	5_3_exonuc	5'-3' exonuclease, C-terminal SAM fold. 	93
396095	pfam01368	DHH	DHH family. It is predicted that this family of proteins all perform a phosphoesterase function. It includes the single stranded DNA exonuclease RecJ.	100
396096	pfam01369	Sec7	Sec7 domain. The Sec7 domain is a guanine-nucleotide-exchange-factor (GEF) for the pfam00025 family.	183
396097	pfam01370	Epimerase	NAD dependent epimerase/dehydratase family. This family of proteins utilize NAD as a cofactor. The proteins in this family use nucleotide-sugar substrates for a variety of chemical reactions.	238
396098	pfam01371	Trp_repressor	Trp repressor protein. This protein binds to tryptophan and represses transcription of the Trp operon.	86
279683	pfam01372	Melittin	Melittin. 	26
366599	pfam01373	Glyco_hydro_14	Glycosyl hydrolase family 14. This family are beta amylases.	402
396099	pfam01374	Glyco_hydro_46	Glycosyl hydrolase family 46. This family are chitosanase enzymes.	210
366600	pfam01375	Enterotoxin_a	Heat-labile enterotoxin alpha chain. 	258
396100	pfam01376	Enterotoxin_b	Heat-labile enterotoxin beta chain. 	102
396101	pfam01378	IgG_binding_B	B domain. This domain is found as a tandem repeat in Streptococcal cell surface proteins, such as the IgG binding protein G.	68
396102	pfam01379	Porphobil_deam	Porphobilinogen deaminase, dipyromethane cofactor binding domain. 	203
396103	pfam01380	SIS	SIS domain. SIS (Sugar ISomerase) domains are found in many phosphosugar isomerases and phosphosugar binding proteins. SIS domains are also found in proteins that regulate the expression of genes involved in synthesis of phosphosugars. Presumably the SIS domains bind to the end-product of the pathway.	131
396104	pfam01381	HTH_3	Helix-turn-helix. This large family of DNA binding helix-turn helix proteins includes Cro and CI. Within Neisseria gonorrhoeae NGO_0477, the full protein fold incorporates a helix-turn-helix motif, but the function of this member is unlikely to be that of a DNA-binding regulator, the function of most other members, so is not necessarily characteristic of the whole family.	55
396105	pfam01382	Avidin	Avidin family. 	114
396106	pfam01383	CpcD	CpcD/allophycocyanin linker domain. 	55
396107	pfam01384	PHO4	Phosphate transporter family. This family includes PHO-4 from Neurospora crassa which is a Na(+)-phosphate symporter. This family also contains the leukaemia virus receptor.	316
396108	pfam01385	OrfB_IS605	Probable transposase. This family includes IS891, IS1136 and IS1341. DUF1225, pfam06774, has now been merged into this family.	120
396109	pfam01386	Ribosomal_L25p	Ribosomal L25p family. Ribosomal protein L25 is an RNA binding protein, that binds 5S rRNA. This family includes Ctc from B. subtilis, which is induced by stress.	87
396110	pfam01387	Synuclein	Synuclein. There are three types of synucleins in humans, these are called alpha, beta and gamma. Alpha synuclein has been found mutated in families with autosomal dominant Parkinson's disease. A peptide of alpha synuclein has also been found in amyloid plaques in Alzheimer's patients.	133
396111	pfam01388	ARID	ARID/BRIGHT DNA binding domain. This domain is known as ARID for AT-Rich Interaction Domain, and also known as the BRIGHT domain.	87
396112	pfam01389	OmpA_membrane	OmpA-like transmembrane domain. The structure of OmpA transmembrane domain shows that it consists of an eight stranded beta barrel. This family includes some other distantly related outer membrane proteins with low scores.	177
396113	pfam01390	SEA	SEA domain. Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA). Proposed function of regulating or binding carbohydrate side chains. Recently a proteolytic activity has been shown for a SEA domain.	106
396114	pfam01391	Collagen	Collagen triple helix repeat (20 copies). Members of this family belong to the collagen superfamily. Collagens are generally extracellular structural proteins involved in formation of connective tissue structure. The alignment contains 20 copies of the G-X-Y repeat that forms a triple helix. The first position of the repeat is glycine, the second and third positions can be any residue but are frequently proline and hydroxy-proline. Collagens are post translationally modified by proline hydroxylase to form the hydroxy-proline residues. Defective hydroxylation is the cause of scurvy. Some members of the collagen superfamily are not involved in connective tissue structure but share the same triple helical structure. The family includes bacterial collagen-like triple-helix repeat proteins.	57
396115	pfam01392	Fz	Fz domain. Also known as the CRD (cysteine rich domain), the C6 box in MuSK receptor. This domain of unknown function has been independently identified by several groups. The domain contains 10 conserved cysteines.	107
396116	pfam01393	Chromo_shadow	Chromo shadow domain. This domain is distantly related to pfam00385. This domain is always found in association with a chromo domain.	53
396117	pfam01394	Clathrin_propel	Clathrin propeller repeat. Clathrin is the scaffold protein of the basket-like coat that surrounds coated vesicles. The soluble assembly unit, a triskelion, contains three heavy chains and three light chains in an extended three-legged structure. Each leg contains one heavy and one light chain. The N-terminus of the heavy chain is known as the globular domain, and is composed of seven repeats which form a beta propeller.	37
396118	pfam01395	PBP_GOBP	PBP/GOBP family. The olfactory receptors of terrestrial animals exist in an aqueous environment, yet detect odorants that are primarily hydrophobic. The aqueous solubility of hydrophobic odorants is thought to be greatly enhanced via odorant binding proteins which exist in the extracellular fluid surrounding the odorant receptors. This family is composed of pheromone binding proteins (PBP), which are male-specific and associate with pheromone-sensitive neurons and general-odorant binding proteins (GOBP).	110
307520	pfam01396	zf-C4_Topoisom	Topoisomerase DNA binding C4 zinc finger. 	39
396119	pfam01397	Terpene_synth	Terpene synthase, N-terminal domain. It has been suggested that this gene family be designated tps (for terpene synthase). It has been split into six subgroups on the basis of phylogeny, called tpsa-tpsf. tpsa includes vetispiridiene synthase, 5-epi- aristolochene synthase, and (+)-delta-cadinene synthase. tpsb includes (-)-limonene synthase. tpsc includes kaurene synthase A. tpsd includes taxadiene synthase, pinene synthase, and myrcene synthase. tpse includes kaurene synthase B. tpsf includes linalool synthase.	190
396120	pfam01398	JAB	JAB1/Mov34/MPN/PAD-1 ubiquitin protease. Members of this family are found in proteasome regulatory subunits, eukaryotic initiation factor 3 (eIF3) subunits and regulators of transcription factors. This family is also known as the MPN domain and PAD-1-like domain, JABP1 domain or JAMM domain. These are metalloenzymes that function as the ubiquitin isopeptidase/ deubiquitinase in the ubiquitin-based signalling and protein turnover pathways in eukaryotes. Versions of the domain in prokaryotic cognates of the ubiquitin-modification pathway are shown to have a similar role, and the archaeal protein from Haloferax volcanii is found to cleave ubiquitin-like small archaeal modifier proteins (SAMP1/2) from protein conjugates.	117
396121	pfam01399	PCI	PCI domain. This domain has also been called the PINT motif (Proteasome, Int-6, Nip-1 and TRIP-15).	105
396122	pfam01400	Astacin	Astacin (Peptidase family M12A). The members of this family are enzymes that cleave peptides. These proteases require zinc for catalysis. Members of this family contain two conserved disulphide bridges, these are joined 1-4 and 2-3. Members of this family have an amino terminal propeptide which is cleaved to give the active protease domain. All other linked domains are found to the carboxyl terminus of this domain. This family includes: Astacin, a digestive enzyme from Crayfish. Meprin, a multiple domain membrane component that is constructed from a homologous alpha and beta chain. Proteins involved in morphogenesis and Tolloid from drosophila.	192
396123	pfam01401	Peptidase_M2	Angiotensin-converting enzyme. Members of this family are dipeptidyl carboxydipeptidases (cleave carboxyl dipeptides) and most notably convert angiotensin I to angiotensin II. Many members of this family contain a tandem duplication of the 600 amino acid peptidase domain, both of these are catalytically active. Most members are secreted membrane bound ectoenzymes.	581
396124	pfam01402	RHH_1	Ribbon-helix-helix protein, copG family. The structure of this protein repressor, which is the shortest reported to date and the first isolated from a plasmid, has a homodimeric ribbon-helix-helix arrangement. The helix-turn-helix-like structure is involved in dimerization and not DNA binding as might have been expected.	39
396125	pfam01403	Sema	Sema domain. The Sema domain occurs in semaphorins, which are a large family of secreted and transmembrane proteins, some of which function as repellent signals during axon guidance. Sema domains also occur in the hepatocyte growth factor receptor and plexin-A3.	406
396126	pfam01404	Ephrin_lbd	Ephrin receptor ligand binding domain. The Eph receptors, which bind to ephrins pfam00812 are a large family of receptor tyrosine kinases. This family represents the amino terminal domain which binds the ephrin ligand.	178
396127	pfam01405	PsbT	Photosystem II reaction centre T protein. The exact function of this protein is unknown. It probably consists of a single transmembrane spanning helix. The Chlamydomonas reinhardtii psbT protein appears to be (i) a novel photosystem II subunit and (ii) required for maintaining optimal photosystem II activity under adverse growth conditions.	29
396128	pfam01406	tRNA-synt_1e	tRNA synthetases class I (C) catalytic domain. This family includes only cysteinyl tRNA synthetases.	301
279715	pfam01407	Gemini_AL3	Geminivirus AL3 protein. Geminiviruses are small, ssDNA-containing plant viruses. Geminiviruses contain three ORFs (designated AL1, AL2, and AL3) that overlap and are specified by multiple polycistronic mRNAs. The AL3 protein comprises approximately 0.05% of the cellular proteins and is present in the soluble and organelle fractions. AL3 may form oligomers. Immunoprecipitation of AL3 in a baculovirus expression system extracts expressing both AL1 pfam00799 and AL3 showed that the two proteins also complex with each other. The AL3 protein is involved in viral replication.	119
396129	pfam01408	GFO_IDH_MocA	Oxidoreductase family, NAD-binding Rossmann fold. This family of enzymes utilize NADP or NAD. This family is called the GFO/IDH/MOCA family in swiss-prot.	120
396130	pfam01409	tRNA-synt_2d	tRNA synthetases class II core domain (F). Other tRNA synthetase sub-families are too dissimilar to be included. This family includes only phenylalanyl-tRNA synthetases. This is the core catalytic domain.	245
396131	pfam01410	COLFI	Fibrillar collagen C-terminal domain. Found at C-termini of fibrillar collagens: Ephydatia muelleri procollagen EMF1 alpha, vertebrate collagens alpha(1)III, alpha(1)II, alpha(2)V etc.	233
279719	pfam01411	tRNA-synt_2c	tRNA synthetases class II (A). Other tRNA synthetase sub-families are too dissimilar to be included. This family includes only alanyl-tRNA synthetases.	548
396132	pfam01412	ArfGap	Putative GTPase activating protein for Arf. Putative zinc fingers with GTPase activating proteins (GAPs) towards the small GTPase, Arf. The GAP of ARD1 stimulates GTPase hydrolysis for ARD1 but not ARFs.	117
396133	pfam01413	C4	C-terminal tandem repeated domain in type 4 procollagen. Duplicated domain in C-terminus of type 4 collagens. Mutations in alpha-5 collagen IV are associated with X-linked Alport syndrome.	109
396134	pfam01414	DSL	Delta serrate ligand. 	63
396135	pfam01415	IL7	Interleukin 7/9 family. IL-7 is a cytokine that acts as a growth factor for early lymphoid cells of both B- and T-cell lineages. IL-9 is a multi-functional cytokine; although it was originally described as a T-cell growth factor, its function in T-cell response remains unclear.	152
396136	pfam01416	PseudoU_synth_1	tRNA pseudouridine synthase. Involved in the formation of pseudouridine at the anticodon stem and loop of transfer-RNAs. Pseudouridine is an isomer of uridine (5-(beta-D-ribofuranosyl) uracil), and is the most abundant modified nucleoside found in all cellular RNAs. The TruA-like proteins also exhibit a conserved sequence with a strictly conserved aspartic acid, likely involved in catalysis.	108
396137	pfam01417	ENTH	ENTH domain. The ENTH (Epsin N-terminal homology) domain is found in proteins involved in endocytosis and cytoskeletal machinery. The function of the ENTH domain is unknown.	124
334531	pfam01418	HTH_6	Helix-turn-helix domain, rpiR family. This domain contains a helix-turn-helix motif. The best characterized member of this family is RpiR, a regulator of the expression of rpiB gene.	77
396138	pfam01419	Jacalin	Jacalin-like lectin domain. Proteins containing this domain are lectins. It is found in 1 to 6 copies in these proteins. The domain is also found in the animal prostatic spermine-binding protein.	134
396139	pfam01420	Methylase_S	Type I restriction modification DNA specificity domain. This domain is also known as the target recognition domain (TRD). Restriction-modification (R-M) systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity subunit (this family), two modification (M) subunits and two restriction (R) subunits.	167
396140	pfam01421	Reprolysin	Reprolysin (M12B) family zinc metalloprotease. The members of this family are enzymes that cleave peptides. These proteases require zinc for catalysis. Members of this family are also known as adamalysins. Most members of this family are snake venom endopeptidases, but there are also some mammalian proteins and fertilin. Fertilin and closely related proteins appear to not have some active site residues and may not be active enzymes.	199
396141	pfam01422	zf-NF-X1	NF-X1 type zinc finger. This domain is presumed to be a zinc binding domain. The following pattern describes the zinc finger. C-X(1-6)-H-X-C-X3-C(H/C)-X(3-4)-(H/C)-X(1-10)-C Where X can be any amino acid, and numbers in brackets indicate the number of residues. Two positions can be either his or cys. The zinc fingers in NFX1 bind to DNA.	19
396142	pfam01423	LSM	LSM domain. The LSM domain contains Sm proteins as well as other related LSM (Like Sm) proteins. The U1, U2, U4/U6, and U5 small nuclear ribonucleoprotein particles (snRNPs) involved in pre-mRNA splicing contain seven Sm proteins (B/B', D1, D2, D3, E, F and G) in common, which assemble around the Sm site present in four of the major spliceosomal small nuclear RNAs. The U6 snRNP binds to the LSM (Like Sm) proteins. Sm proteins are also found in archaebacteria, which do not have any splicing apparatus suggesting a more general role for Sm proteins. All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. This family also includes the bacterial Hfq (host factor Q) proteins. Hfq are also RNA-binding proteins, that form hexameric rings.	66
396143	pfam01424	R3H	R3H domain. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the domain is predicted to be binding ssDNA.	60
396144	pfam01425	Amidase	Amidase. 	442
396145	pfam01426	BAH	BAH domain. This domain has been called BAH (Bromo adjacent homology) domain and has also been called ELM1 and BAM (Bromo adjacent motif) domain. The function of this domain is unknown but may be involved in protein-protein interaction.	120
279735	pfam01427	Peptidase_M15	D-ala-D-ala dipeptidase. 	199
396146	pfam01428	zf-AN1	AN1-like Zinc finger. Zinc finger at the C-terminus of An1, a ubiquitin-like protein in Xenopus laevis. The following pattern describes the zinc finger. C-X2-C-X(9-12)-C-X(1-2)-C-X4-C-X2-H-X5-H-X-C Where X can be any amino acid, and numbers in brackets indicate the number of residues.	37
396147	pfam01429	MBD	Methyl-CpG binding domain. The Methyl-CpG binding domain (MBD) binds to DNA that contains one or more symmetrically methylated CpGs. DNA methylation in animals is associated with alterations in chromatin structure and silencing of gene expression. MBD has negligible non-specific affinity for DNA. In vitro foot-printing with MeCP2 showed the MBD can protect a 12 nucleotide region surrounding a methyl CpG pair. MBDs are found in several Methyl-CpG binding proteins and also DNA demethylase.	76
396148	pfam01430	HSP33	Hsp33 protein. Hsp33 is a molecular chaperone, distinguished from all other known chaperones by its mode of functional regulation. Its activity is redox regulated. Hsp33 is a cytoplasmically localized protein with highly reactive cysteines that respond quickly to changes in the redox environment. Oxidising conditions like H2O2 cause disulfide bonds to form in Hsp33, a process that leads to the activation of its chaperone function.	277
279739	pfam01431	Peptidase_M13	Peptidase family M13. Mammalian enzymes are typically type-II membrane anchored enzymes which are known, or believed to activate or inactivate oligopeptide (pro)-hormones such as opioid peptides. The family also contains a bacterial member believed to be involved with milk protein cleavage.	205
396149	pfam01432	Peptidase_M3	Peptidase family M3. This is the Thimet oligopeptidase family, a large family of mammalian and bacterial oligopeptidases that cleave medium sized peptides. The group also contains mitochondrial intermediate peptidase which is encoded by nuclear DNA but functions within the mitochondria to remove the leader sequence.	450
396150	pfam01433	Peptidase_M1	Peptidase family M1 domain. Members of this family are aminopeptidases. The members differ widely in specificity, hydrolysing acidic, basic or neutral N-terminal residues. This family includes leukotriene-A4 hydrolase, this enzyme also has an aminopeptidase activity.	220
396151	pfam01434	Peptidase_M41	Peptidase family M41. 	190
396152	pfam01435	Peptidase_M48	Peptidase family M48. Peptidase_M48 is the largely extracellular catalytic region of CAAX prenyl protease homologs such as Human FACE-1 protease. These are metallopeptidases, with the characteristic HExxH motif giving the two histidine-zinc-ligands and an adjacent glutamate on the next helix being the third. The whole molecule folds to form a deep groove/cleft into which the substrate can fit.	198
396153	pfam01436	NHL	NHL repeat. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in Bos taurus PAM; proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localized to the repeats. The E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats.	28
396154	pfam01437	PSI	Plexin repeat. A cysteine rich repeat found in several different extracellular receptors. The function of the repeat is unknown. Three copies of the repeat are found in Plexin. Two copies of the repeat are found in mahogany protein. A related C. elegans protein contains four copies of the repeat. The Met receptor contains a single copy of the repeat. The Pfam alignment shows 6 conserved cysteine residues that may form three conserved disulphide bridges, whereas shows 8 conserved cysteines. The pattern of conservation suggests that cysteines 5 and 7 (that are not absolutely conserved) form a disulphide bridge (Personal observation. A Bateman).	52
396155	pfam01439	Metallothio_2	Metallothionein. Members of this family are metallothioneins. These proteins are cysteine rich proteins that bind to heavy metals. Members of this family appear to be closest to Class II metallothioneins, see pfam00131.	80
279747	pfam01440	Gemini_AL2	Geminivirus AL2 protein. Geminiviruses are small, ssDNA-containing plant viruses. Geminiviruses contain three ORFs (designated AL1, AL2, and AL3) that overlap and are specified by multiple polycistronic mRNAs. The AL2 gene product transactivates expression of TGMV coat protein gene, and BR1 movement protein.	126
396156	pfam01441	Lipoprotein_6	Lipoprotein. Members of this family are lipoproteins that are probably involved in evasion of the host immune system by pathogens.	170
396157	pfam01442	Apolipoprotein	Apolipoprotein A1/A4/E domain. These proteins contain several 22 residue repeats which form a pair of alpha helices. This family includes: Apolipoprotein A-I. Apolipoprotein A-IV. Apolipoprotein E.	170
366646	pfam01443	Viral_helicase1	Viral (Superfamily 1) RNA helicase. Helicase and NTPase activities have been demonstrated for this family. This helicase has multiple roles at different stages of viral RNA replication, as dissected by mutational analysis.	227
279751	pfam01445	SH	Viral small hydrophobic protein. The SH (small hydrophobic) protein is a membrane protein of uncertain function.	57
396158	pfam01446	Rep_1	Replication protein. Replication proteins (rep) are involved in plasmid replication. The Rep protein binds to the plasmid DNA and nicks it at the double strand origin (dso) of replication. The 3'-hydroxyl end created is extended by the host DNA replicase, and the 5' end is displaced during synthesis. At the end of one replication round, Rep introduces a second single stranded break at the dso and ligates the ssDNA extremities generating one double-stranded plasmid and one circular ssDNA form. Complementary strand synthesis of the circular ssDNA is usually initiated at the single-stranded origin by the host RNA polymerase.	248
396159	pfam01447	Peptidase_M4	Thermolysin metallopeptidase, catalytic domain. 	147
396160	pfam01448	ELM2	ELM2 domain. The ELM2 (Egl-27 and MTA1 homology 2) domain is a small domain of unknown function. It is found in the MTA1 protein that is part of the NuRD complex. The domain is usually found to the N-terminus of a myb-like DNA binding domain pfam00249. ELM2 is also found associated with an ARID DNA binding domain pfam01388 in ARID1. This suggests that ELM2 may also be involved in DNA binding, or perhaps is a protein-protein interaction domain.	53
396161	pfam01450	IlvC	Acetohydroxy acid isomeroreductase, catalytic domain. Acetohydroxy acid isomeroreductase catalyzes the conversion of acetohydroxy acids into dihydroxy valerates. This reaction is the second in the synthetic pathway of the essential branched side chain amino acids valine and isoleucine.	138
396162	pfam01451	LMWPc	Low molecular weight phosphotyrosine protein phosphatase. 	142
279757	pfam01452	Rota_NSP4	Rotavirus non structural protein. This protein has been called NSP4, NSP5, NS28, and NCVP5. The final steps in the assembly of rotavirus occur in the lumen of the endoplasmic reticulum (ER). Targeting of the immature inner capsid particle (ICP) to this compartment is mediated by the cytoplasmic tail of NSP4, located in the ER membrane.	173
396163	pfam01453	B_lectin	D-mannose binding lectin. These proteins include mannose-specific lectins from plants as well as bacteriocins from bacteria.	105
396164	pfam01454	MAGE	MAGE family. The MAGE (melanoma antigen-encoding gene) family are expressed in a wide variety of tumors but not in normal cells, with the exception of the male germ cells, placenta, and, possibly, cells of the developing embryo. The cellular function of this family is unknown. This family also contains the yeast protein, Nse3. The Nse3 protein is part of the Smc5-6 complex. Nse3 has been demonstrated to be important for meiosis.	202
396165	pfam01455	HupF_HypC	HupF/HypC family. 	65
250634	pfam01456	Mucin	Mucin-like glycoprotein. This family of trypanosomal proteins resemble vertebrate mucins. The protein consists of three regions. The N and C termini are conserved between all members of the family, whereas the central region is not well conserved and contains a large number of threonine residues which can be glycosylated. Indirect evidence suggested that these genes might encode the core protein of parasite mucins, glycoproteins that were proposed to be involved in the interaction with, and invasion of, mammalian host cells. This family contains an N-terminal signal peptide.	143
366652	pfam01457	Peptidase_M8	Leishmanolysin. 	529
396166	pfam01458	UPF0051	Uncharacterized protein family (UPF0051). 	218
396167	pfam01459	Porin_3	Eukaryotic porin. 	270
396168	pfam01462	LRRNT	Leucine rich repeat N-terminal domain. Leucine Rich Repeats pfam00560 are short sequence motifs present in a number of proteins with diverse functions and cellular locations. Leucine Rich Repeats are often flanked by cysteine rich domains. This domain is often found at the N-terminus of tandem leucine rich repeats.	28
279765	pfam01463	LRRCT	Leucine rich repeat C-terminal domain. Leucine Rich Repeats pfam00560 are short sequence motifs present in a number of proteins with diverse functions and cellular locations. Leucine Rich Repeats are often flanked by cysteine rich domains. This domain is often found at the C-terminus of tandem leucine rich repeats.	26
396169	pfam01464	SLT	Transglycosylase SLT domain. This family is distantly related to pfam00062. Members are found in phages, type II, type III and type IV secretion systems.	114
396170	pfam01465	GRIP	GRIP domain. The GRIP (golgin-97, RanBP2alpha,Imh1p and p230/golgin-245) domain is found in many large coiled-coil proteins. It has been shown to be sufficient for targeting to the Golgi. The GRIP domain contains a completely conserved tyrosine residue. At least some of these domains have been shown to bind to GTPase Arl1.	43
396171	pfam01466	Skp1	Skp1 family, dimerization domain. 	48
396172	pfam01467	CTP_transf_like	Cytidylyltransferase-like. This family includes: Cholinephosphate cytidylyltransferase; glycerol-3-phosphate cytidylyltransferase. It also includes putative adenylyltransferases, and FAD synthases.	134
396173	pfam01468	GA	GA module. The GA (protein G-related Albumin-binding) module is composed of three alpha helices. This module is found in a range of bacterial cell surface proteins. The GA module from peptostreptococcal albumin-binding protein shows a strong affinity for albumin.	55
279771	pfam01469	Pentapeptide_2	Pentapeptide repeats (8 copies). These repeats are found in many mycobacterial proteins. These repeats are most common in the pfam00823 family of proteins, where they are found in the MPTR subfamily of PPE proteins. The function of these repeats is unknown. The repeat can be approximately described as XNXGX, where X can be any amino acid. These repeats are similar to pfam00805, however it is not clear if these two families are structurally related.	39
396174	pfam01470	Peptidase_C15	Pyroglutamyl peptidase. 	201
396175	pfam01471	PG_binding_1	Putative peptidoglycan binding domain. This domain is composed of three alpha helices. This domain is found at the N or C-terminus of a variety of enzymes involved in bacterial cell wall degradation. This domain may have a general peptidoglycan binding function. This family is found N-terminal to the catalytic domain of matrixins. The domain is found to bind peptidoglycan experimentally.	57
396176	pfam01472	PUA	PUA domain. The PUA domain named after Pseudouridine synthase and Archaeosine transglycosylase, was detected in archaeal and eukaryotic pseudouridine synthases, archaeal archaeosine synthases, a family of predicted ATPases that may be involved in RNA modification, a family of predicted archaeal and bacterial rRNA methylases. Additionally, the PUA domain was detected in a family of eukaryotic proteins that also contain a domain homologous to the translation initiation factor eIF1/SUI1; these proteins may comprise a novel type of translation factors. Unexpectedly, the PUA domain was detected also in bacterial and yeast glutamate kinases; this is compatible with the demonstrated role of these enzymes in the regulation of the expression of other genes. It is predicted that the PUA domain is an RNA binding domain.	74
366661	pfam01473	CW_binding_1	Putative cell wall binding repeat. These repeats are characterized by conserved aromatic residues and glycines, and are found in multiple tandem copies in a number of proteins. The CW repeat is 20 amino acid residues long. The exact domain boundaries may not be correct. It has been suggested that these repeats in Streptococcus phage Cp-1 lysozyme might be responsible for the specific recognition of choline-containing cell walls. Similar but longer repeats are found in the glucosyltransferases and glucan-binding proteins of oral streptococci and shown to be involved in glucan binding as well as in the related dextransucrases of Leuconostoc mesenteroides. Repeats also occur in toxins of Clostridium difficile and other clostridia, though the ligands are not always known.	19
396177	pfam01474	DAHP_synth_2	Class-II DAHP synthetase family. Members of this family are aldolase enzymes that catalyze the first step of the shikimate pathway.	437
396178	pfam01475	FUR	Ferric uptake regulator family. This family includes metal ion uptake regulator proteins that bind to the operator DNA and control transcription of metal ion-responsive genes. This family is also known as the FUR family.	120
396179	pfam01476	LysM	LysM domain. The LysM (lysin motif) domain is about 40 residues long. It is found in a variety of enzymes involved in bacterial cell wall degradation. This domain may have a general peptidoglycan binding function. The structure of this domain is known.	43
396180	pfam01477	PLAT	PLAT/LH2 domain. This domain is found in a variety of membrane or lipid associated proteins. It is called the PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology) domain. The known structure of pancreatic lipase shows this domain binds to procolipase pfam01114, which mediates membrane association. So it appears possible that this domain mediates membrane attachment via other protein binding partners. The structure of this domain is known for many members of the family and is composed of a beta sandwich.	115
396181	pfam01478	Peptidase_A24	Type IV leader peptidase family. Peptidase A24, or the prepilin peptidase as it is also known, processes the N-terminus of the prepilins. The processing is essential for the correct formation of the pseudopili of type IV bacterial protein secretion. The enzyme is found across eubacteria and archaea.	101
396182	pfam01479	S4	S4 domain. The S4 domain is a small domain consisting of 60-65 amino acid residues that was detected in the bacterial ribosomal protein S4, eukaryotic ribosomal S9, two families of pseudouridine synthases, a novel family of predicted RNA methylases, a yeast protein containing a pseudouridine synthetase and a deaminase domain, bacterial tyrosyl-tRNA synthetases, and a number of uncharacterized, small proteins that may be involved in translation regulation. The S4 domain probably mediates binding to RNA.	48
396183	pfam01480	PWI	PWI domain. 	70
396184	pfam01481	Arteri_nucleo	Arterivirus nucleocapsid protein. 	116
396185	pfam01483	P_proprotein	Proprotein convertase P-domain. A unique feature of the eukaryotic subtilisin-like proprotein convertases is the presence of an additional highly conserved sequence of approximately 150 residues (P domain) located immediately downstream of the catalytic domain.	86
396186	pfam01484	Col_cuticle_N	Nematode cuticle collagen N-terminal domain. The function of this domain is unknown. It is found in the N-terminal region of nematode cuticle collagens, see pfam01391. Cuticle is a tough elastic structure secreted by hypodermal cells and is primarily composed of collagen proteins.	48
396187	pfam01485	IBR	IBR domain, a half RING-finger domain. The IBR (In Between Ring fingers) domain is often found to occur between pairs of ring fingers (pfam00097). This domain has also been called the C6HC domain and DRIL (for double RING finger linked) domain. Proteins that contain two Ring fingers and an IBR domain (these proteins are also termed RBR family proteins) are thought to exist in all eukaryotic organisms. RBR family members play roles in protein quality control and can indirectly regulate transcription. Evidence suggests that RBR proteins are often parts of cullin-containing ubiquitin ligase complexes. The ubiquitin ligase Parkin is an RBR family protein whose mutations are involved in forms of familial Parkinson's disease.	59
396188	pfam01486	K-box	K-box region. The K-box region is commonly found associated with SRF-type transcription factors see pfam00319. The K-box is a possible coiled-coil structure. Possible role in multimer formation.	91
396189	pfam01487	DHquinase_I	Type I 3-dehydroquinase. Type I 3-dehydroquinase, (3-dehydroquinate dehydratase or DHQase.) catalyzes the cis-dehydration of 3-dehydroquinate via a covalent imine intermediate giving dehydroshikimate. Dehydroquinase functions in the shikimate pathway which is involved in the biosynthesis of aromatic amino acids. Type II 3-dehydroquinase catalyzes the trans-dehydration of 3-dehydroshikimate see pfam01220.	227
396190	pfam01488	Shikimate_DH	Shikimate / quinate 5-dehydrogenase. This family contains both shikimate and quinate dehydrogenases. Shikimate 5-dehydrogenase catalyzes the conversion of shikimate to 5-dehydroshikimate. This reaction is part of the shikimate pathway which is involved in the biosynthesis of aromatic amino acids. Quinate 5-dehydrogenase catalyzes the conversion of quinate to 5-dehydroquinate. This reaction is part of the quinate pathway where quinic acid is exploited as a source of carbon in prokaryotes and microbial eukaryotes. Both the shikimate and quinate pathways share two common pathway metabolites 3-dehydroquinate and dehydroshikimate.	136
279788	pfam01490	Aa_trans	Transmembrane amino acid transporter protein. This transmembrane region is found in many amino acid transporters including UNC-47 and MTR. UNC-47 encodes a vesicular amino butyric acid (GABA) transporter, (VGAT). UNC-47 is predicted to have 10 transmembrane domains. MTR is a N system amino acid transporter system protein involved in methyltryptophan resistance. Other members of this family include proline transporters and amino acid permeases.	410
396191	pfam01491	Frataxin_Cyay	Frataxin-like domain. This family contains proteins that have a domain related to the globular C-terminus of Frataxin the protein that is mutated in Friedreich's ataxia. This domain is found in a family of bacterial proteins. The function of this domain is currently unknown. It has been suggested that this family is involved in iron transport.	104
279790	pfam01492	Gemini_C4	Geminivirus C4 protein. This family consists of the N terminal region of geminivirus C4 or AC4 proteins. In Tomato yellow leaf curl geminivirus (TYLCV) the C4 protein is necessary for efficient spreading of the virus in tomato plants.	84
396192	pfam01493	GXGXG	GXGXG motif. This domain is found in glutamate synthase, tungsten formylmethanofuran dehydrogenase subunit c (FwdC) and molybdenum formylmethanofuran dehydrogenase subunit c (FmdC). A repeated G-XX-G-XXX-G motif is seen in the alignment.	190
396193	pfam01494	FAD_binding_3	FAD binding domain. This domain is involved in FAD binding in a number of enzymes.	348
396194	pfam01496	V_ATPase_I	V-type ATPase 116kDa subunit family. This family consists of the 116kDa V-type ATPase (vacuolar (H+)-ATPases) subunits, as well as V-type ATP synthase subunit i. The V-type ATPases family are proton pumps that acidify intracellular compartments in eukaryotic cells, for example yeast central vacuoles, clathrin-coated and synaptic vesicles. They have important roles in membrane trafficking processes. The 116kDa subunit (subunit a) in the V-type ATPase is part of the V0 functional domain responsible for proton transport. The a subunit is a transmembrane glycoprotein with multiple putative transmembrane helices; it has a hydrophilic amino terminal and a hydrophobic carboxy terminal. It has roles in proton transport and assembly of the V-type ATPase complex. This subunit is encoded by two homologous genes in yeast, VPH1 and STV1.	756
396195	pfam01497	Peripla_BP_2	Periplasmic binding protein. This family includes bacterial periplasmic binding proteins. Several of which are involved in iron transport.	234
366677	pfam01498	HTH_Tnp_Tc3_2	Transposase. Transposase proteins are necessary for efficient DNA transposition. This family includes the amino-terminal region of Tc1, Tc1A, Tc1B and Tc2B transposases of C.elegans. The region encompasses the specific DNA binding and second DNA recognition domains as well as an amino-terminal region of the catalytic domain of Tc3 as described in. Tc3 is a member of the Tc1/mariner family of transposable elements.	72
396196	pfam01499	Herpes_UL25	Herpesvirus UL25 family. The herpesvirus UL25 gene product is a virion component involved in virus penetration and capsid assembly. The product of the UL25 gene is required for packaging but not cleavage of replicated viral DNA. This family includes a number of herpesvirus proteins: EHV-1 36, EBV BVRF1, HCMV UL77, ILTV ORF2, and VZV gene 34.	541
366678	pfam01500	Keratin_B2	Keratin, high sulfur B2 protein. High sulfur proteins are cysteine-rich proteins synthesized during the differentiation of hair matrix cells, and form hair fibers in association with hair keratin intermediate filaments. This family has been divided up into four regions, with the second region containing 8 copies of a short repeat. This family is also known as B2 or KAP1.	161
279798	pfam01501	Glyco_transf_8	Glycosyl transferase family 8. This family includes enzymes that transfer sugar residues to acceptor molecules. Members of this family are involved in lipopolysaccharide biosynthesis and glycogen synthesis. This family includes lipopolysaccharide galactosyltransferase, lipopolysaccharide glucosyltransferase 1, and glycogenin glucosyltransferase.	252
396197	pfam01502	PRA-CH	Phosphoribosyl-AMP cyclohydrolase. This enzyme catalyzes the third step in the histidine biosynthetic pathway. It requires Zn ions for activity.	74
396198	pfam01503	PRA-PH	Phosphoribosyl-ATP pyrophosphohydrolase. This enzyme catalyzes the second step in the histidine biosynthetic pathway.	83
396199	pfam01504	PIP5K	Phosphatidylinositol-4-phosphate 5-Kinase. This family contains a region from the common kinase core found in the type I phosphatidylinositol-4-phosphate 5-kinase (PIP5K) family. The family consists of various type I, II and III PIP5K enzymes. PIP5K catalyzes the formation of phosphoinositol-4,5-bisphosphate via the phosphorylation of phosphatidylinositol-4-phosphate, a precursor in the phosphoinositide signaling pathway.	284
396200	pfam01505	Vault	Major Vault Protein repeat. The vault is a ubiquitous and highly conserved ribonucleoprotein particle of approximately 13 MDa of unknown function. This family corresponds to a repeat found in the amino terminal half of the major vault protein.	41
366682	pfam01506	HCV_NS5a	Hepatitis C virus non-structural 5a protein membrane anchor. The molecular function of the non-structural 5a protein is uncertain. The NS5a protein is phosphorylated when expressed in mammalian cells. It is thought to interact with the ds RNA dependent (interferon inducible) kinase PKR. The N-terminal region of the NS5a protein has been used in the construction of the alignment for this family. The C-terminal region has not been included because it is too heterogeneous.	23
396201	pfam01507	PAPS_reduct	Phosphoadenosine phosphosulfate reductase family. This domain is found in phosphoadenosine phosphosulfate (PAPS) reductase enzymes or PAPS sulfotransferase. PAPS reductase is part of the adenine nucleotide alpha hydrolases superfamily also including N type ATP PPases and ATP sulphurylases. The enzyme uses thioredoxin as an electron donor for the reduction of PAPS to phospho-adenosine-phosphate (PAP). It is also found in NodP nodulation protein P from Rhizobium which has ATP sulfurylase activity (sulfate adenylate transferase).	173
279805	pfam01508	Paramecium_SA	Paramecium surface antigen domain. This domain is a cysteine rich extracellular repeat found in surface antigens of Paramecium. The domain contains 8 cysteine residues.	62
396202	pfam01509	TruB_N	TruB family pseudouridylate synthase (N terminal domain). Members of this family are involved in modifying bases in RNA molecules. They carry out the conversion of uracil bases to pseudouridine. This family includes TruB, a pseudouridylate synthase that specifically converts uracil 55 to pseudouridine in most tRNAs. This family also includes Cbf5p that modifies rRNA.	148
396203	pfam01510	Amidase_2	N-acetylmuramoyl-L-alanine amidase. This family includes zinc amidases that have N-acetylmuramoyl-L-alanine amidase activity EC:3.5.1.28. This enzyme domain cleaves the amide bond between N-acetylmuramoyl and L-amino acids in bacterial cell walls (preferentially: D-lactyl-L-Ala). The structure is known for the bacteriophage T7 structure and shows that two of the conserved histidines are zinc binding.	121
396204	pfam01512	Complex1_51K	Respiratory-chain NADH dehydrogenase 51 Kd subunit. 	150
396205	pfam01513	NAD_kinase	ATP-NAD kinase. Members of this family include ATP-NAD kinases EC:2.7.1.23, which catalyzes the phosphorylation of NAD to NADP utilising ATP and other nucleoside triphosphates as well as inorganic polyphosphate as a source of phosphorus. Also includes NADH kinases EC:2.7.1.86.	285
396206	pfam01514	YscJ_FliF	Secretory protein of YscJ/FliF family. This family includes proteins that are related to the YscJ lipoprotein, and the amino terminus of FliF, the flagellar M-ring protein. The members of the YscJ family are thought to be involved in secretion of several proteins. The FliF protein ring is thought to be part of the export apparatus for flagellar proteins, based on the similarity to YscJ proteins.	179
396207	pfam01515	PTA_PTB	Phosphate acetyl/butyryl transferase. This family contains both phosphate acetyltransferase and phosphate butyryltransferase. These enzymes catalyze the transfer of an acetyl or butyryl group to orthophosphate.	318
279812	pfam01516	Orbi_VP6	Orbivirus helicase VP6. The VP6 protein a minor protein in the core of the virion is probably the viral helicase.	324
279813	pfam01517	HDV_ag	Hepatitis delta virus delta antigen. The hepatitis delta virus (HDV) encodes a single protein, the hepatitis delta antigen (HDAg). The central region of this protein has been shown to bind RNA. Several interactions are also mediated by a coiled-coil region at the N-terminus of the protein.	195
250679	pfam01518	PolyG_pol	Sigma NS protein. This viral protein has a poly(C)-dependent poly(G) polymerase activity.	366
396208	pfam01519	DUF16	Protein of unknown function DUF16. The function of this protein is unknown. It appears to only occur in Mycoplasma pneumoniae. The crystal structure revealed that this domain is composed of two separated homotrimeric coiled-coils.	95
396209	pfam01520	Amidase_3	N-acetylmuramoyl-L-alanine amidase. This enzyme domain cleaves the amide bond between N-acetylmuramoyl and L-amino acids in bacterial cell walls.	172
396210	pfam01521	Fe-S_biosyn	Iron-sulphur cluster biosynthesis. This family is involved in iron-sulphur cluster biosynthesis. Its members include proteins that are involved in nitrogen fixation such as the HesB and HesB-like proteins.	111
396211	pfam01522	Polysacc_deac_1	Polysaccharide deacetylase. This domain is found in polysaccharide deacetylase. This family of polysaccharide deacetylases includes NodB (nodulation protein B from Rhizobium) which is a chitooligosaccharide deacetylase. It also includes chitin deacetylase from yeast, and endoxylanases which hydrolyze glucosidic bonds in xylan.	124
396212	pfam01523	PmbA_TldD	Putative modulator of DNA gyrase. tldD and pmbA were found to suppress mutations in letD and inhibitor of DNA gyrase. Therefore it has been hypothesized that the TldD and PmbA proteins modulate the activity of DNA gyrase. It has also been suggested that PmbA may be involved in secretion.	156
366689	pfam01524	Gemini_V2	Geminivirus V2 protein. Disruption of the V2 gene in Tomato yellow leaf curl virus (TYLCV) stopped its ability to systemically infect tomato plants, suggesting that the V2 gene product is required for successful infection of the host.	78
144935	pfam01525	Rota_NS26	Rotavirus NS26. Gene 11 product is a non-structural phosphoprotein designated as NS26.	212
396213	pfam01526	DDE_Tnp_Tn3	Tn3 transposase DDE domain. This family includes transposases of the Tn3, Tn21, Tn1721, Tn2501 and Tn3926 transposons from E. coli. The specific binding of the Tn3 transposase to DNA has been demonstrated. Sequence analysis has suggested that the invariant triad of Asp689, Asp765, Glu895 (numbering as in Tn3) may correspond to the D-D-35-E motif previously implicated in the catalysis of numerous transposases.	389
396214	pfam01527	HTH_Tnp_1	Transposase. Transposase proteins are necessary for efficient DNA transposition. This family consists of various E. coli insertion elements and other bacterial transposases some of which are members of the IS3 family.	75
279822	pfam01528	Herpes_glycop	Herpesvirus glycoprotein M. The herpesvirus glycoprotein M (gM) is an integral membrane protein predicted to contain 8 transmembrane segments. Glycoprotein M is not essential for viral replication.	373
396215	pfam01529	DHHC	DHHC palmitoyltransferase. This entry refers to the DHHC domain, found in DHHC proteins which are palmitoyltransferases. Palmitoylation or, more specifically S-acylation, plays important roles in the regulation of protein localization, stability, and activity. It is a post-translational protein modification that involves the attachment of palmitic acid to Cys residues through a thioester linkage. Protein acyltransferases (PATs), also known as palmitoyltransferases, catalyze this reaction by transferring the palmitoyl group from palmitoyl-CoA to the thiol group of Cys residues. They are characterized by the presence of a 50-residue-long domain called the DHHC domain, which in most but not all cases is also cysteine-rich and gets its name from a highly conserved DHHC signature tetrapeptide (Asp-His-His-Cys). The Cys residue within the DHHC domain forms a stable acyl intermediate and transfers the acyl chain to the Cys residues of a target protein. Some proteins containing a DHHC domain include Drosophila DNZ1 protein, Mouse Abl-philin 2 (Aph2) protein, Mammalian ZDHHC9, Yeast ankyrin repeat-containing protein AKR1, Yeast Erf2 protein, and Arabidopsis thaliana tip growth defective 1.	132
396216	pfam01530	zf-C2HC	Zinc finger, C2HC type. This is a DNA binding zinc finger domain.	29
250689	pfam01531	Glyco_transf_11	Glycosyl transferase family 11. This family contains several fucosyl transferase enzymes.	298
396217	pfam01532	Glyco_hydro_47	Glycosyl hydrolase family 47. Members of this family are alpha-mannosidases that catalyze the hydrolysis of the terminal 1,2-linked alpha-D-mannose residues in the oligo-mannose oligosaccharide Man(9)(GlcNAc)(2).	453
396218	pfam01533	Tospo_nucleocap	Tospovirus nucleocapsid protein. The tospovirus genome consists of three linear ssRNA segments, denoted L, M and S complexed with the nucleocapsid protein. The S RNA encodes the nucleocapsid protein and another non-structural protein.	246
396219	pfam01534	Frizzled	Frizzled/Smoothened family membrane region. This family contains the membrane spanning region of frizzled and smoothened receptors. This membrane region is predicted to contain seven transmembrane alpha helices. Proteins related to Drosophila frizzled are receptors for Wnt (mediating the beta-catenin signalling pathway), but also the planar cell polarity (PCP) pathway and the Wnt/calcium pathway. The predominantly alpha-helical Cys-rich ligand-binding region (CRD) of Frizzled is both necessary and sufficient for Wnt binding. The smoothened receptor mediates hedgehog signalling.	321
366695	pfam01535	PPR	PPR repeat. This repeat has no known function. It is about 35 amino acids long and found in up to 18 copies in some proteins. This family appears to be greatly expanded in plants. This repeat occurs in PET309 that may be involved in RNA stabilisation. This domain occurs in crp1 that is involved in RNA processing. This repeat is associated with a predicted plant protein that has a domain organisation similar to the human BRCA1 protein. The repeat has been called PPR.	31
396220	pfam01536	SAM_decarbox	Adenosylmethionine decarboxylase. This is a family of S-adenosylmethionine decarboxylase (SAMDC) proenzymes. In the biosynthesis of polyamines SAMDC produces decarboxylated S-adenosylmethionine, which serves as the aminopropyl moiety necessary for spermidine and spermine biosynthesis from putrescine. The Pfam alignment contains both the alpha and beta chains that are cleaved to form the active enzyme.	332
396221	pfam01537	Herpes_glycop_D	Herpesvirus glycoprotein D/GG/GX domain. This domain is found in several herpesvirus glycoproteins. This family includes glycoprotein D (gD or gIV), which is common to herpes simplex virus types 1 and 2, as well as equine herpes, bovine herpes and Marek's disease virus. Glycoprotein D has been found on the viral envelope and the plasma membrane of infected cells, and gD immunisation can produce an immune response to bovine herpes virus (BHV-1). This response is stronger than that of the other major glycoproteins gB (gI) and gC (gIII) in BHV-1. Glycoprotein G (gG) is one of the seven external glycoproteins of HSV1 and HSV2. This family also contains the glycoprotein GX (gX), initially identified in Pseudorabies virus.	118
366698	pfam01538	HCV_NS2	Hepatitis C virus non-structural protein NS2. The viral genome is translated into a single polyprotein of about 3000 amino acids. Generation of the mature non-structural proteins relies on the activity of viral proteases. Cleavage at the NS2/NS3 junction is accomplished by a metal-dependent autoprotease encoded within NS2 and the N-terminus of NS3.	195
110536	pfam01539	HCV_env	Hepatitis C virus envelope glycoprotein E1. 	190
110537	pfam01540	Lipoprotein_7	Adhesin lipoprotein. This family consists of the p50 and variable adherence-associated antigen (Vaa) adhesins from Mycoplasma hominis. M. hominis is a mycoplasma associated with human urogenital diseases, pneumonia, and septic arthritis. An adhesin is a cell surface molecule that mediates adhesion to other cells or to the surrounding surface or substrate. The Vaa antigen is a 50-kDa surface lipoprotein that has four tandem repetitive DNA sequences encoding a periodic peptide structure, and is highly immunogenic in the human host. p50 is also a 50-kDa lipoprotein, having three repeats A,B and C, that may be a tetramer of 191-kDa in its native environment.	353
396222	pfam01541	GIY-YIG	GIY-YIG catalytic domain. This domain called GIY-YIG is found in the amino terminal region of excinuclease abc subunit c (uvrC), bacteriophage T4 endonucleases segA, segB, segC, segD and segE; it is also found in putative endonucleases encoded by group I introns of fungi and phage. The structure of I-TevI a GIY-YIG endonuclease, reveals a novel alpha/beta-fold with a central three-stranded antiparallel beta-sheet flanked by three helices. The most conserved and putative catalytic residues are located on a shallow, concave surface and include a metal coordination site.	78
279832	pfam01542	HCV_core	Hepatitis C virus core protein. The viral core protein forms the internal viral coat that encapsidates the genomic RNA and is enveloped in a host cell-derived lipid membrane. The core protein has been shown, by yeast two-hybrid assay to interact with cellular DEAD box helicases. The N-terminus of the core protein is involved in transcriptional repression.	75
144947	pfam01543	HCV_capsid	Hepatitis C virus capsid protein. 	121
396223	pfam01544	CorA	CorA-like Mg2+ transporter protein. The CorA transport system is the primary Mg2+ influx system of Salmonella typhimurium and Escherichia coli. CorA is virtually ubiquitous in the Bacteria and Archaea. There are also eukaryotic relatives of this protein. The family includes the MRS2 protein from yeast that is thought to be an RNA splicing protein. However its membership of this family suggests that its effect on splicing is due to altered magnesium levels in the cell.	292
396224	pfam01545	Cation_efflux	Cation efflux family. Members of this family are integral membrane proteins, that are found to increase tolerance to divalent metal ions such as cadmium, zinc, and cobalt. These proteins are thought to be efflux pumps that remove these ions from cells.	189
396225	pfam01546	Peptidase_M20	Peptidase family M20/M25/M40. This family includes a range of zinc metallopeptidases belonging to several families in the peptidase classification. Family M20 are Glutamate carboxypeptidases. Peptidase family M25 contains X-His dipeptidases.	315
396226	pfam01547	SBP_bac_1	Bacterial extracellular solute-binding protein. This family also includes the bacterial extracellular solute-binding protein family POTD/POTF.	294
396227	pfam01548	DEDD_Tnp_IS110	Transposase. Transposase proteins are necessary for efficient DNA transposition. This family includes an amino-terminal region of the pilin gene inverting protein (PIVML) and of members of the IS111A/IS1328/IS1533 family of transposases. The C-terminus is represented by family pfam02371.	155
396228	pfam01549	ShK	ShK domain-like. This domain is found in several C. elegans proteins. The domain is 30 amino acids long and rich in cysteine residues. There are 6 conserved cysteine positions in the domain that form three disulphide bridges. The domain is found in the potassium channel inhibitor ShK in sea anemone.	37
396229	pfam01551	Peptidase_M23	Peptidase family M23. Members of this family are zinc metallopeptidases with a range of specificities. The peptidase family M23 is included in this family, these are Gly-Gly endopeptidases. Peptidase family M23 are also endopeptidases. This family also includes some bacterial lipoproteins for which no proteolytic activity has been demonstrated. This family also includes leukocyte cell-derived chemotaxin 2 (LECT2) proteins. LECT2 is a liver-specific protein which is thought to be linked to hepatocyte growth although the exact function of this protein is unknown.	96
279840	pfam01552	Pico_P2B	Picornavirus 2B protein. Poliovirus infection leads to drastic alterations in membrane permeability late during infection. Proteins 2B and 2BC enhance membrane permeability.	101
366704	pfam01553	Acyltransferase	Acyltransferase. This family contains acyltransferases involved in phospholipid biosynthesis and other proteins of unknown function. This family also includes tafazzin, the Barth syndrome gene.	131
334587	pfam01554	MatE	MatE. The MatE domain	161
396230	pfam01555	N6_N4_Mtase	DNA methylase. Members of this family are DNA methylases. The family contains both N-4 cytosine-specific DNA methylases and N-6 Adenine-specific DNA methylases.	221
396231	pfam01556	DnaJ_C	DnaJ C terminal domain. This family consists of the C terminal region of the DnaJ protein. It is always found associated with pfam00226 and pfam00684. DnaJ is a chaperone associated with the Hsp70 heat-shock system involved in protein folding and renaturation after stress. The two C-terminal domains CTDI and CTDII, both incorporated in this family are necessary for maintaining the J-domains in their specific relative positions. Structural analysis of Structure 1nlt shows that PF00684 is nested within this DnaJ C-terminal region.	130
396232	pfam01557	FAA_hydrolase	Fumarylacetoacetate (FAA) hydrolase family. This family consists of fumarylacetoacetate (FAA) hydrolase, or fumarylacetoacetate hydrolase (FAH) and it also includes HHDD isomerase/OPET decarboxylase from E. coli strain W. FAA is the last enzyme in the tyrosine catabolic pathway, it hydrolyzes fumarylacetoacetate into fumarate and acetoacetate which then join the citric acid cycle. Mutations in FAA cause type I tyrosinemia in humans this is an inherited disorder mainly affecting the liver leading to liver cirrhosis, hepatocellular carcinoma, renal tubular damages and neurologic crises amongst other symptoms. The enzymatic defect causes the toxic accumulation of phenylalanine/tyrosine catabolites. The E. coli W enzyme HHDD isomerase/OPET decarboxylase contains two copies of this domain and functions in fourth and fifth steps of the homoprotocatechuate pathway; here it decarboxylates OPET to HHDD and isomerizes this to OHED. The final products of this pathway are pyruvic acid and succinic semialdehyde. This family also includes various hydratases and 4-oxalocrotonate decarboxylases which are involved in the bacterial meta-cleavage pathways for degradation of aromatic compounds. 2-hydroxypentadienoic acid hydratase encoded by mhpD in E. coli is involved in the phenylpropionic acid pathway of E. coli and catalyzes the conversion of 2-hydroxy pentadienoate to 4-hydroxy-2-keto-pentanoate and uses a Mn2+ co-factor. OHED hydratase encoded by hpcG in E. coli is involved in the homoprotocatechuic acid (HPC) catabolism. XylI in P. putida is a 4-Oxalocrotonate decarboxylase.	211
396233	pfam01558	POR	Pyruvate ferredoxin/flavodoxin oxidoreductase. This family includes a region of the large protein pyruvate-flavodoxin oxidoreductase and the whole pyruvate ferredoxin oxidoreductase gamma subunit protein. It is not known whether the gamma subunit has a catalytic or regulatory role. Pyruvate oxidoreductase (POR) catalyzes the final step in the fermentation of carbohydrates in anaerobic microorganisms. This involves the oxidative decarboxylation of pyruvate with the participation of thiamine followed by the transfer of an acetyl moiety to coenzyme A for the synthesis of acetyl-CoA. The family also includes pyruvate flavodoxin oxidoreductase as encoded by the nifJ gene in cyanobacterium which is required for growth on molecular nitrogen when iron is limited.	173
366705	pfam01559	Zein	Zein seed storage protein. Zeins are seed storage proteins. They are unusually rich in glutamine, proline, alanine, and leucine residues and their sequences show a series of tandem repeats.	244
110557	pfam01560	HCV_NS1	Hepatitis C virus non-structural protein E2/NS1. The hypervariable region of the E2/NS1 region of hepatitis C virus varies greatly between viral isolates. E2 is thought to encode a structurally unconstrained envelope protein.	344
396234	pfam01561	Hanta_G2	Hantavirus glycoprotein G2. The medium (M) genome segment of hantaviruses (family Bunyaviridae) encodes the two virion glycoproteins, G1 and G2, as a precursor protein in the complementary sense RNA.	457
396235	pfam01562	Pep_M12B_propep	Reprolysin family propeptide. This region is the propeptide for members of peptidase family M12B. The propeptide contains a sequence motif similar to the "cysteine switch" of the matrixins. This motif is found at the C-terminus of the alignment but is not well aligned.	129
396236	pfam01563	Alpha_E3_glycop	Alphavirus E3 glycoprotein. This protein is found in some alphaviruses as a virion associated spike protein.	59
396237	pfam01564	Spermine_synth	Spermine/spermidine synthase domain. Spermine and spermidine are polyamines. This family includes spermidine synthase that catalyzes the fifth (last) step in the biosynthesis of spermidine from arginine, and spermine synthase.	183
396238	pfam01565	FAD_binding_4	FAD binding domain. This family consists of various enzymes that use FAD as a co-factor; most of the enzymes are similar to oxygen oxidoreductase. One of the enzymes, vanillyl-alcohol oxidase (VAO), has a solved structure; the alignment includes the FAD binding site, called the PP-loop, between residues 99-110. The FAD molecule is covalently bound in the known structure; however, the residue that links to the FAD is not in the alignment. VAO catalyzes the oxidation of a wide variety of substrates, ranging from aromatic amines to 4-alkylphenols. Other members of this family include D-lactate dehydrogenase, which catalyzes the conversion of D-lactate to pyruvate using FAD as a co-factor, and mitomycin radical oxidase, which oxidizes the reduced form of mitomycins and is involved in mitomycin resistance. This family includes MurB, a UDP-N-acetylenolpyruvoylglucosamine reductase enzyme EC:1.1.1.158. This enzyme is involved in the biosynthesis of peptidoglycan.	139
396239	pfam01566	Nramp	Natural resistance-associated macrophage protein. The natural resistance-associated macrophage protein (NRAMP) family consists of Nramp1, Nramp2, and yeast proteins Smf1 and Smf2. The NRAMP family is a novel family of functionally related proteins defined by a conserved hydrophobic core of ten transmembrane domains. This family of membrane proteins are divalent cation transporters. Nramp1 is an integral membrane protein expressed exclusively in cells of the immune system and is recruited to the membrane of a phagosome upon phagocytosis. By controlling divalent cation concentrations Nramp1 may regulate the interphagosomal replication of bacteria. Mutations in Nramp1 may genetically predispose an individual to susceptibility to diseases including leprosy and tuberculosis; conversely, this might provide protection from rheumatoid arthritis. Nramp2 is a multiple divalent cation transporter for Fe2+, Mn2+ and Zn2+ amongst others; it is expressed at high levels in the intestine and is a major transferrin-independent iron uptake system in mammals. The yeast proteins Smf1 and Smf2 may also transport divalent cations.	357
279853	pfam01567	Hanta_G1	Hantavirus glycoprotein G1. The medium (M) genome segment of hantaviruses (family Bunyaviridae) encodes the two virion glycoproteins, G1 and G2, as a precursor protein in the complementary sense RNA.	523
396240	pfam01568	Molydop_binding	Molydopterin dinucleotide binding domain. This domain is found in various molybdopterin - containing oxidoreductases and tungsten formylmethanofuran dehydrogenase subunit d (FwdD) and molybdenum formylmethanofuran dehydrogenase subunit (FmdD); where the domain constitutes almost the entire subunit. The formylmethanofuran dehydrogenase catalyzes the first step in methane formation from CO2 in methanogenic archaea and has a molybdopterin dinucleotide cofactor. This domain corresponds to the C-terminal domain IV in dimethyl sulfoxide (DMSO)reductase which interacts with the 2-amino pyrimidone ring of both molybdopterin guanine dinucleotide molecules.	110
396241	pfam01569	PAP2	PAP2 superfamily. This family includes the enzyme type 2 phosphatidic acid phosphatase (PAP2), Glucose-6-phosphatase EC:3.1.3.9, Phosphatidylglycerophosphatase B EC:3.1.3.27 and bacterial acid phosphatase EC:3.1.3.2. The family also includes a variety of haloperoxidases that function by oxidising halides in the presence of hydrogen peroxide to form the corresponding hypohalous acids.	123
366710	pfam01570	Flavi_propep	Flavivirus polyprotein propeptide. The flaviviruses are small enveloped animal viruses containing a single positive strand genomic RNA. The genome encodes one large ORF, a polyprotein which undergoes proteolytic processing into mature viral peptide chains. This family consists of a propeptide region of approximately 90 amino acid length.	78
396242	pfam01571	GCV_T	Aminomethyltransferase folate-binding domain. This is a family of glycine cleavage T-proteins, part of the glycine cleavage multienzyme complex (GCV) found in bacteria and the mitochondria of eukaryotes. GCV catalyzes the catabolism of glycine in eukaryotes. The T-protein is an aminomethyl transferase.	255
279858	pfam01573	Bromo_MP	Bromovirus movement protein. 	283
396243	pfam01575	MaoC_dehydratas	MaoC like domain. The maoC gene is part of an operon with maoA which is involved in the synthesis of monoamine oxidase. The MaoC protein is found to share similarity with a wide variety of enzymes; estradiol 17 beta-dehydrogenase 4, peroxisomal hydratase-dehydrogenase-epimerase, fatty acid synthase beta subunit. Several bacterial proteins that are composed solely of this domain have (R)-specific enoyl-CoA hydratase activity. This domain is also present in the NodN nodulation protein N.	123
396244	pfam01576	Myosin_tail_1	Myosin tail. The myosin molecule is a multi-subunit complex made up of two heavy chains and four light chains; it is a fundamental contractile protein found in all eukaryote cell types. This family consists of the coiled-coil myosin heavy chain tail region. The coiled-coil is composed of the tail from two molecules of myosin. These can then assemble into the macromolecular thick filament. The coiled-coil region provides the structural backbone of the thick filament.	1081
250716	pfam01577	Peptidase_S30	Potyvirus P1 protease. The Potyviridae family consists of positive-strand RNA viruses with a genome encoding a polyprotein. Members include zucchini yellow mosaic virus and turnip mosaic virus, which cause considerable losses of crops worldwide. This family consists of a C-terminus region from various plant potyvirus P1 proteins (found at the N-terminus of the polyprotein). The C-terminus of P1 is a serine-type protease responsible for autocatalytic cleavage between P1 and the helper component protease pfam00851. The entire P1 protein may be involved in virus-host interactions.	245
307628	pfam01578	Cytochrom_C_asm	Cytochrome C assembly protein. This family consists of various proteins involved in cytochrome c assembly from mitochondria and bacteria; CycK from Rhizobium, CcmC from E. coli and Paracoccus denitrificans and orf240 from wheat mitochondria. The members of this family are probably integral membrane proteins with six predicted transmembrane helices. It has been proposed that members of this family comprise a membrane component of an ABC (ATP binding cassette) transporter complex. It is also proposed that this transporter is necessary for transport of some component needed for cytochrome c assembly. One member CycK contains a putative heme-binding motif, orf240 also contains a putative heme-binding motif and is a proposed ABC transporter with c-type heme as its proposed substrate. However it seems unlikely that all members of this family transport heme nor c-type apocytochromes because CcmC in the putative CcmABC transporter transports neither. CcmF forms a working module with CcmH and CcmI, CcmFHI, and itself is unlikely to bind haem directly.	211
396245	pfam01579	DUF19	Domain of unknown function (DUF19). This presumed domain has no known function. It is found in one or two copies in several Caenorhabditis elegans proteins. It is roughly 130 amino acids long. The domain contains 12 conserved cysteines which suggests that the domain is an extracellular domain and that these cysteines form six intradomain disulphide bridges. The GO annotation for this protein indicates that it has a function in nematode larval development and has a positive regulation of growth rate.	155
279863	pfam01580	FtsK_SpoIIIE	FtsK/SpoIIIE family. FtsK has extensive sequence similarity to a wide variety of proteins from prokaryotes and plasmids, termed the FtsK/SpoIIIE family. This domain contains a putative ATP binding P-loop motif. It is found in the FtsK cell division protein from E. coli and the stage III sporulation protein E SpoIIIE which has roles in regulation of prespore specific gene expression in B. subtilis. A mutation in FtsK causes a temperature sensitive block in cell division and it is involved in peptidoglycan synthesis or modification. The SpoIIIE protein is implicated in intercellular chromosomal DNA transfer.	219
110576	pfam01581	FARP	FMRFamide related peptide family. The neuroactive peptide Phe-Met-Arg-Phe-NH2 (FMRF-amide) has a variety of effects on both mammalian and invertebrate tissues.	11
396246	pfam01582	TIR	TIR domain. The Toll/interleukin-1 receptor (TIR) homology domain is an intracellular signalling domain found in MyD88, interleukin 1 receptor and the Toll receptor. It contains three highly-conserved regions, and mediates protein-protein interactions between the Toll-like receptors (TLRs) and signal-transduction components. TIR-like motifs are also found in plant proteins thought to be involved in resistance to disease. When activated, TIR domains recruit cytoplasmic adaptor proteins MyD88 and TOLLIP (Toll interacting protein). In turn, these associate with various kinases to set off signalling cascades.	165
396247	pfam01583	APS_kinase	Adenylylsulphate kinase. Enzyme that catalyzes the phosphorylation of adenylylsulphate to 3'-phosphoadenylylsulfate. This domain contains an ATP binding P-loop motif.	154
396248	pfam01584	CheW	CheW-like domain. CheW proteins are part of the chemotaxis signaling mechanism in bacteria. CheW interacts with the methyl accepting chemotaxis proteins (MCPs) and relays signals to CheY, which affects flagellar rotation. This family includes CheW and other related proteins that are involved in chemotaxis. The CheW-like regulatory domain in CheA binds to CheW, suggesting that these domains can interact with each other.	137
396249	pfam01585	G-patch	G-patch domain. This domain is found in a number of RNA binding proteins, and is also found in proteins that contain RNA binding domains. This suggests that this domain may have an RNA binding function. This domain has seven highly conserved glycines.	45
396250	pfam01586	Basic	Myogenic Basic domain. This basic domain is found in the MyoD family of muscle specific proteins that control muscle development. The bHLH region of the MyoD family includes the basic domain and the Helix-loop-helix (HLH) motif. The bHLH region mediates specific DNA binding, with 12 residues of the basic domain involved in DNA binding. The basic domain forms an extended alpha helix in the structure.	81
396251	pfam01588	tRNA_bind	Putative tRNA binding domain. This domain is found in prokaryotic methionyl-tRNA synthetases, prokaryotic phenylalanyl tRNA synthetases the yeast GU4 nucleic-binding protein (G4p1 or p42, ARC1), human tyrosyl-tRNA synthetase, and endothelial-monocyte activating polypeptide II. G4p1 binds specifically to tRNA form a complex with methionyl-tRNA synthetases. In human tyrosyl-tRNA synthetase this domain may direct tRNA to the active site of the enzyme. This domain may perform a common function in tRNA aminoacylation.	96
279870	pfam01589	Alpha_E1_glycop	Alphavirus E1 glycoprotein. E1 forms a heterodimer with E2 pfam00943. The virus spikes are made up of 80 trimers of these heterodimers (sindbis virus).	504
396252	pfam01590	GAF	GAF domain. This domain is present in cGMP-specific phosphodiesterases, adenylyl and guanylyl cyclases, phytochromes, FhlA and NifA. Adenylyl and guanylyl cyclases catalyze ATP and GTP to the second messengers cAMP and cGMP, respectively, these products up-regulating catalytic activity by binding to the regulatory GAF domain(s). The opposite hydrolysis reaction is catalyzed by phosphodiesterase. cGMP-dependent 3',5'-cyclic phosphodiesterase catalyzes the conversion of guanosine 3',5'-cyclic phosphate to guanosine 5'-phosphate. Here too, cGMP regulates catalytic activity by GAF-domain binding. Phytochromes are regulatory photoreceptors in plants and bacteria which exist in two thermally-stable states that are reversibly inter-convertible by light: the Pr state absorbs maximally in the red region of the spectrum, while the Pfr state absorbs maximally in the far-red region. This domain is also found in FhlA (formate hydrogen lyase transcriptional activator) and NifA, a transcriptional activator which is required for activation of most Nif operons which are directly involved in nitrogen fixation. NifA interacts with sigma-54.	133
396253	pfam01591	6PF2K	6-phosphofructo-2-kinase. This enzyme occurs as a bifunctional enzyme with fructose-2,6-bisphosphatase. The bifunctional enzyme catalyzes both the synthesis and degradation of fructose-2,6-bisphosphate, a potent regulator of glycolysis. This enzyme contains a P-loop motif.	223
396254	pfam01592	NifU_N	NifU-like N terminal domain. This domain is found in NifU in combination with pfam01106. This domain is found in isolation in several bacterial species. The nif genes are responsible for nitrogen fixation. However this domain is found in bacteria that do not fix nitrogen, so it may have a broader significance in the cell than nitrogen fixation. These proteins appear to be scaffold proteins for iron-sulfur clusters.	124
396255	pfam01593	Amino_oxidase	Flavin containing amine oxidoreductase. This family consists of various amine oxidases, including maize polyamine oxidase (PAO) and various flavin containing monoamine oxidases (MAO). The aligned region includes the flavin binding site of these enzymes. The family also contains phytoene dehydrogenases and related enzymes. In vertebrates MAO plays an important role regulating the intracellular levels of amines via their oxidation; these include various neurotransmitters, neurotoxins and trace amines. In lower eukaryotes such as aspergillus and in bacteria the main role of amine oxidases is to provide a source of ammonium. PAOs in plants, bacteria and protozoa oxidize spermidine and spermine to an aminobutyral, diaminopropane and hydrogen peroxide and are involved in the catabolism of polyamines. Other members of this family include tryptophan 2-monooxygenase, putrescine oxidase, corticosteroid binding proteins and antibacterial glycoproteins.	446
279875	pfam01594	AI-2E_transport	AI-2E family transporter. This family includes four different proteins from E. coli alone. One of them, YdgG or TqsA, has been shown to mediate transport of the quorum-sensing signal autoinducer 2 (AI-2). It is not clear if TqsA enhances secretion of AI-2 or inhibits AI-2 uptake. By altering the intracellular concentration of AI-2, TqsA affects gene expression in biofilms and biofilm formation. TqsA belongs to the AI-2 exporter (AI-2E) superfamily.	327
396256	pfam01595	DUF21	Domain of unknown function DUF21. This transmembrane region has no known function. Many of the sequences in this family are annotated as hemolysins, however this is due to a similarity to Brachyspira hyodysenteriae hemolysin C that does not contain this domain. This domain is found in the N-terminus of the proteins adjacent to two intracellular CBS domains pfam00571.	182
396257	pfam01596	Methyltransf_3	O-methyltransferase. Members of this family are O-methyltransferases. The family includes catechol o-methyltransferase, caffeoyl-CoA O-methyltransferase and a family of bacterial O-methyltransferases that may be involved in antibiotic production.	203
396258	pfam01597	GCV_H	Glycine cleavage H-protein. This is a family of glycine cleavage H-proteins, part of the glycine cleavage multienzyme complex (GCV) found in bacteria and the mitochondria of eukaryotes. GCV catalyzes the catabolism of glycine in eukaryotes. A lipoyl group is attached to a completely conserved lysine residue. The H protein shuttles the methylamine group of glycine from the P protein to the T protein.	122
396259	pfam01599	Ribosomal_S27	Ribosomal protein S27a. This family of ribosomal proteins consists mainly of the 40S ribosomal protein S27a which is synthesized as a C-terminal extension of ubiquitin (CEP). The S27a domain comprises the C-terminal half of the protein. The synthesis of ribosomal proteins as extensions of ubiquitin promotes their incorporation into nascent ribosomes by a transient metabolic stabilisation and is required for efficient ribosome biogenesis. The ribosomal extension protein S27a contains a basic region that is proposed to form a zinc finger; its fusion gene is proposed as a mechanism to maintain a fixed ratio between ubiquitin, necessary for degrading proteins, and ribosomes, a source of proteins.	43
396260	pfam01600	Corona_S1	Coronavirus S1 glycoprotein. The coronavirus spike glycoprotein forms the characteristic 'corona' after which the group is named. The Spike glycoprotein is translated as a large polypeptide that is subsequently cleaved to S1 and S2 pfam01601.	407
396261	pfam01601	Corona_S2	Coronavirus S2 glycoprotein. The coronavirus spike glycoprotein forms the characteristic 'corona' after which the group is named. The Spike glycoprotein is translated as a large polypeptide that is subsequently cleaved to S1 pfam01600 and S2.	485
396262	pfam01602	Adaptin_N	Adaptin N terminal region. This family consists of the N terminal region of various alpha, beta and gamma subunits of the AP-1, AP-2 and AP-3 adaptor protein complexes. The adaptor protein (AP) complexes are involved in the formation of clathrin-coated pits and vesicles. The N-terminal region of the various adaptor proteins (APs) is constant by comparison to the C-terminal which is variable within members of the AP-2 family; and it has been proposed that this constant region interacts with another uniform component of the coated vesicles.	523
396263	pfam01603	B56	Protein phosphatase 2A regulatory B subunit (B56 family). Protein phosphatase 2A (PP2A) is a major intracellular protein phosphatase that regulates multiple aspects of cell growth and metabolism. The ability of this widely distributed heterotrimeric enzyme to act on a diverse array of substrates is largely controlled by the nature of its regulatory B subunit. There are multiple families of B subunits (See also pfam01240), this family is called the B56 family.	404
250739	pfam01606	Arteri_env	Arterivirus envelope protein. This family consists of viral envelope proteins from the arterivirus genus; this includes porcine reproductive and respiratory virus (PRRSV) envelope protein GP3 and lactate dehydrogenase elevating virus (LDV) structural glycoprotein. Arteriviruses consist of positive ssRNA and do not have a DNA stage.	211
366726	pfam01607	CBM_14	Chitin binding Peritrophin-A domain. This domain is called the Peritrophin-A domain and is found in chitin binding proteins, particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains.	53
396264	pfam01608	I_LWEQ	I/LWEQ domain. I/LWEQ domains bind to actin. It has been shown that the I/LWEQ domains from mouse talin and yeast Sla2p interact with F-actin. I/LWEQ domains can be placed into four major groups based on sequence similarity: (1) Metazoan talin; (2) Dictyostelium TalA/TalB and SLA110; (3) metazoan Hip1p and (4) yeast Sla2p. The domain has four conserved blocks, the name of the domain is derived from the initial conserved amino acid of each of the four blocks.	140
376573	pfam01609	DDE_Tnp_1	Transposase DDE domain. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contains three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction. This family contains transposases for IS4, IS421, IS5377, IS427, IS402, IS1355, IS5, which was originally isolated in bacteriophage lambda.	196
396265	pfam01610	DDE_Tnp_ISL3	Transposase. Transposase proteins are necessary for efficient DNA transposition. Contains transposases for IS204, IS1001, IS1096 and IS1165.	238
279888	pfam01611	Filo_glycop	Filovirus glycoprotein. This family includes an extracellular region from the envelope glycoprotein of Ebola and Marburg viruses. This region is also produced as a separate transcript that gives rise to a non-structural, secreted glycoprotein, which is produced in large amounts and has an unknown function. Processing of this protein may be involved in viral pathogenicity.	395
396266	pfam01612	DNA_pol_A_exo1	3'-5' exonuclease. This domain is responsible for the 3'-5' exonuclease proofreading activity of E. coli DNA polymerase I (polI) and other enzymes; it catalyzes the hydrolysis of unpaired or mismatched nucleotides. This domain consists of the amino-terminal half of the Klenow fragment in E. coli polI; it is also found in the Werner syndrome helicase (WRN), focus forming activity 1 protein (FFA-1) and ribonuclease D (RNase D). Werner syndrome is a human genetic disorder causing premature aging; the WRN protein has helicase activity in the 3'-5' direction. The FFA-1 protein is required for formation of replication foci and also has helicase activity; it is a homolog of the WRN protein. RNase D is a 3'-5' exonuclease involved in tRNA processing. Also found in this family is the autoantigen PM/Scl thought to be involved in polymyositis-scleroderma overlap syndrome.	173
396267	pfam01613	Flavin_Reduct	Flavin reductase like domain. This is a flavin reductase family consisting of enzymes known to be flavin reductases as well as various oxidoreductase and monooxygenase components. VlmR is a flavin reductase that functions in a two-component enzyme system to provide isobutylamine N-hydroxylase with reduced flavin and may be involved in the synthesis of valanimycin. SnaC is a flavin reductase that provides reduced flavin for the oxidation of pristinamycin IIB to pristinamycin IIA as catalyzed by SnaA, SnaB heterodimer. This flavin reductase region characterized by enzymes of the family is present in the C-terminus of potential FMN proteins from Synechocystis sp. suggesting it is a flavin reductase domain.	145
396268	pfam01614	IclR	Bacterial transcriptional regulator. This family of bacterial transcriptional regulators includes the glycerol operon regulatory protein and acetate operon repressor both of which are members of the iclR family. These proteins have a Helix-Turn-Helix motif at the N-terminus. However this family covers the C-terminal region that may bind to the regulatory substrate (unpublished observation, Bateman A.).	129
279892	pfam01616	Orbi_NS3	Orbivirus NS3. The function of this Orbivirus non structural protein is uncertain. However it may play a role in release of the virus from infected cells.	193
366729	pfam01617	Surface_Ag_2	Surface antigen. This family includes a number of bacterial surface antigens expressed on the surface of pathogens.	247
396269	pfam01618	MotA_ExbB	MotA/TolQ/ExbB proton channel family. This family groups together integral membrane proteins that appear to be involved in translocation of proteins across a membrane. These proteins are probably proton channels. MotA is an essential component of the flagellar motor that uses a proton gradient to generate rotational motion in the flagellum. ExbB is part of the TonB-dependent transduction complex. The TonB complex uses the proton gradient across the inner bacterial membrane to transport large molecules across the outer bacterial membrane.	126
396270	pfam01619	Pro_dh	Proline dehydrogenase. 	300
366730	pfam01620	Pollen_allerg_2	Ribonuclease (pollen allergen). This family contains grass pollen proteins of group V. Phleum pratense pollen allergen Phl p 5b has been shown to possess ribonuclease activity.	155
279897	pfam01621	Fusion_gly_K	Cell fusion glycoprotein K. This protein is probably an integral membrane bound glycoprotein that is involved in viral fusion with the host cell.	339
250753	pfam01623	Carla_C4	Carlavirus putative nucleic acid binding protein. This family of carlavirus nucleic acid binding proteins includes a motif for a potential C-4 type zinc finger this has four highly conserved cysteine residues and is a conserved feature of the carlaviruses 3' terminal ORF. These proteins may function as viral transcriptional regulators. The carlavirus family includes garlic latent virus and potato virus S and M, these viruses are positive strand, ssRNA with no DNA stage.	91
396271	pfam01624	MutS_I	MutS domain I. This domain is found in proteins of the MutS family (DNA mismatch repair proteins) and is found associated with pfam00488, pfam05188, pfam05192 and pfam05190. The MutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair; other members of the family include the eukaryotic MSH 1, 2, 3, 4, 5 and 6 proteins. These have various roles in DNA repair and recombination. Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein. The aligned region corresponds with globular domain I, which is involved in DNA binding, in Thermus aquaticus MutS.	113
396272	pfam01625	PMSR	Peptide methionine sulfoxide reductase. This enzyme repairs damaged proteins. Methionine sulfoxide in proteins is reduced to methionine.	153
396273	pfam01627	Hpt	Hpt domain. The histidine-containing phosphotransfer (HPt) domain is a novel protein module with an active histidine residue that mediates phosphotransfer reactions in the two-component signaling systems. A multistep phosphorelay involving the HPt domain has been suggested for these signaling pathways. The crystal structure of the HPt domain of the anaerobic sensor kinase ArcB has been determined. The domain consists of six alpha helices containing a four-helix bundle-folding. The pattern of sequence similarity of the HPt domains of ArcB and components in other signaling systems can be interpreted in light of the three-dimensional structure and supports the conclusion that the HPt domains have a common structural motif both in prokaryotes and eukaryotes. In S. cerevisiae ypd1p this domain has been shown to contain a binding surface for Ssk1p (response regulator receiver domain containing protein pfam00072).	84
396274	pfam01628	HrcA	HrcA protein C terminal domain. HrcA is found to negatively regulate the transcription of heat shock genes. HrcA contains an amino terminal helix-turn-helix domain, however this corresponds to the carboxy terminal domain.	221
396275	pfam01629	DUF22	Domain of unknown function DUF22. This domain is found in 1 to 3 copies in archaebacterial proteins. The function of the domain is unknown. This family appears to be expanded in Archaeoglobus fulgidus.	106
396276	pfam01630	Glyco_hydro_56	Hyaluronidase. 	327
396277	pfam01632	Ribosomal_L35p	Ribosomal protein L35. 	60
396278	pfam01633	Choline_kinase	Choline/ethanolamine kinase. Choline kinase catalyzes the committed step in the synthesis of phosphatidylcholine by the CDP-choline pathway. This alignment covers the protein kinase portion of the protein. The divergence of this family makes it very difficult to create a model that specifically predicts choline/ethanolamine kinases only. However if pfam01633 is also present then it is definitely a member of this family.	211
396279	pfam01634	HisG	ATP phosphoribosyltransferase. 	157
396280	pfam01635	Corona_M	Coronavirus M matrix/glycoprotein. This family consists of various coronavirus matrix proteins which are transmembrane glycoproteins. The M protein, or E1 glycoprotein, is implicated in virus assembly. The E1 viral membrane protein is required for formation of the viral envelope and is transported via the Golgi complex.	208
396281	pfam01636	APH	Phosphotransferase enzyme family. This family consists of bacterial antibiotic resistance proteins, which confer resistance to various aminoglycosides they include: aminoglycoside 3'-phosphotransferase or kanamycin kinase / neomycin-kanamycin phosphotransferase and streptomycin 3''-kinase or streptomycin 3''-phosphotransferase. The aminoglycoside phosphotransferases inactivate aminoglycoside antibiotics via phosphorylation. This family also includes homoserine kinase. This family is related to fructosamine kinase pfam03881.	239
376582	pfam01637	ATPase_2	ATPase domain predominantly from Archaea. This family contains a conserved P-loop motif that is involved in binding ATP. There are eukaryote members as well as archaeal members in this family.	222
396282	pfam01638	HxlR	HxlR-like helix-turn-helix. HxlR, a member of this family, is a DNA-binding protein that acts as a positive regulator of the formaldehyde-inducible hxlAB operon in Bacillus subtilis.	90
279910	pfam01639	v110	Viral family 110. This family of viral proteins is known as the 110 family. The function of members of this family is unknown. The family contains a central cysteine rich region with eight conserved cysteines. Some members of the family contain two copies of the cysteine rich region.	102
396283	pfam01640	Peptidase_C10	Peptidase C10 family. This family represents just the active peptide part of these proteins. Residues 1-120 are not part of the model as they form the pro-peptide, which before cleavage blocks the active site from the substrate. The catalytic residues of histidine and cysteine are brought close together at the active site by the folding of the active peptide.	187
396284	pfam01641	SelR	SelR domain. Methionine sulfoxide reduction is an important process, by which cells regulate biological processes and cope with oxidative stress. MsrA, a protein involved in the reduction of methionine sulfoxides in proteins, has been known for four decades and has been extensively characterized with respect to structure and function. However, recent studies revealed that MsrA is only specific for methionine-S-sulfoxides. Because oxidized methionines occur in a mixture of R and S isomers in vivo, it was unclear how stereo-specific MsrA could be responsible for the reduction of all protein methionine sulfoxides. It appears that a second methionine sulfoxide reductase, SelR, evolved that is specific for methionine-R-sulfoxides, the activity that is different but complementary to that of MsrA. Thus, these proteins, working together, could reduce both stereoisomers of methionine sulfoxide. This domain is found both in SelR proteins and fused with the peptide methionine sulfoxide reductase enzymatic domain pfam01625. The domain has two conserved cysteine and histidines. The domain binds both selenium and zinc. The final cysteine is found to be replaced by the rare amino acid selenocysteine in some members of the family. This family has methionine-R-sulfoxide reductase activity.	120
396285	pfam01642	MM_CoA_mutase	Methylmalonyl-CoA mutase. The enzyme methylmalonyl-CoA mutase is a member of a class of enzymes that uses coenzyme B12 (adenosylcobalamin) as a cofactor. The enzyme induces the formation of an adenosyl radical from the cofactor. This radical then initiates a free-radical rearrangement of its substrate, succinyl-CoA, to methylmalonyl-CoA.	510
366738	pfam01643	Acyl-ACP_TE	Acyl-ACP thioesterase. This family consists of various acyl-acyl carrier protein (ACP) thioesterases (TE); these terminate fatty acyl group extension by hydrolysing an acyl group on a fatty acid.	248
396286	pfam01644	Chitin_synth_1	Chitin synthase. This region is found commonly in chitin synthases classes I, II and III. Chitin, a linear homopolymer of GlcNAc residues, is an important component of the cell wall of fungi and is synthesized on the cytoplasmic surface of the cell membrane by membrane bound chitin synthases.	163
396287	pfam01645	Glu_synthase	Conserved region in glutamate synthase. This family represents a region of the glutamate synthase protein. This region is expressed as a separate subunit in the glutamate synthase alpha subunit from archaebacteria, or part of a large multidomain enzyme in other organisms. The aligned region of these proteins contains a putative FMN binding site and Fe-S cluster.	367
307668	pfam01646	Herpes_UL24	Herpes virus proteins UL24 and UL76. This family consists of various herpes virus proteins; the gene 20 product, U49 protein, UL24 and UL76 proteins and BXRF1. The UL24 gene (product of the 24th ORF) is not essential for virus replication, and mutants with lesions in UL24 show a reduced ability to replicate in tissue culture and have reduced thymidine kinase activity, as the UL24 gene overlaps with thymidine kinase. The family of proteins is involved in viral production, latency, and reactivation. Protein UL76 presents as globular aggresomes in the nuclei of transiently transfected cells. Bioinformatic analyses predict that UL76 has a propensity for aggregation and targets cellular proteins implicated in protein folding and ubiquitin-proteasome systems. UL76 interacts with the VWA domain of S5a, the 26S proteasome non-ATPase regulatory subunit 4 (or PSMD4, or Rpn10), forming a complex in the late phase of infection.	176
396288	pfam01648	ACPS	4'-phosphopantetheinyl transferase superfamily. Members of this family transfer the 4'-phosphopantetheine (4'-PP) moiety from coenzyme A (CoA) to the invariant serine of pfam00550. This post-translational modification renders holo-ACP capable of acyl group activation via thioesterification of the cysteamine thiol of 4'-PP. This superfamily consists of two subtypes: The ACPS type and the Sfp type. The structure of the Sfp type is known, which shows the active site accommodates a magnesium ion. The most highly conserved regions of the alignment are involved in binding the magnesium ion.	111
396289	pfam01649	Ribosomal_S20p	Ribosomal protein S20. Bacterial ribosomal protein S20 interacts with 16S rRNA.	76
396290	pfam01650	Peptidase_C13	Peptidase C13 family. Members of this family are asparaginyl peptidases. The blood fluke parasite Schistosoma mansoni has at least five Clan CA cysteine peptidases in its digestive tract including cathepsins B (2 isoforms), C, F and L. All have been recombinantly expressed as active enzymes, albeit in various stages of activation. In addition, a Clan CD peptidase, termed asparaginyl endopeptidase or 'legumain' has been identified. This has formerly been characterized as a 'haemoglobinase', but this term is probably incorrect. Two cDNAs have been described for Schistosoma mansoni legumain; one encodes an active enzyme whereas the active site cysteine residue encoded by the second cDNA is substituted by an asparagine residue. Both forms have been recombinantly expressed.	257
396291	pfam01652	IF4E	Eukaryotic initiation factor 4E. 	158
396292	pfam01653	DNA_ligase_aden	NAD-dependent DNA ligase adenylation domain. DNA ligases catalyze the crucial step of joining the breaks in duplex DNA during DNA replication, repair and recombination, utilising either ATP or NAD(+) as a cofactor. This domain is the catalytic adenylation domain. The NAD+ group is covalently attached to this domain at the lysine in the KXDG motif of this domain. This enzyme-adenylate intermediate is an important feature of the proposed catalytic mechanism.	318
396293	pfam01654	Cyt_bd_oxida_I	Cytochrome bd terminal oxidase subunit I. This family are the alternative oxidases found in many bacteria which oxidize ubiquinol and reduce oxygen as part of the electron transport chain. This family is the subunit I of the oxidase. E. coli has two copies of the oxidase, bo and bd', both of which are represented here. In some nitrogen fixing bacteria, e.g. Klebsiella pneumoniae, this oxidase is responsible for removing oxygen in microaerobic conditions, making the oxidase required for nitrogen fixation. This subunit binds a single b-haem, through ligands at His186 and Met393 (using SW:P11026 numbering). In addition His19 is a ligand for the haem b found in subunit II.	426
396294	pfam01655	Ribosomal_L32e	Ribosomal protein L32. This family includes ribosomal protein L32 from eukaryotes and archaebacteria.	108
396295	pfam01656	CbiA	CobQ/CobB/MinD/ParA nucleotide binding domain. This family consists of various cobyrinic acid a,c-diamide synthases. These include CbiA and CbiP from S.typhimurium, and CobQ from R. capsulatus. These amidases catalyze amidations to various side chains of hydrogenobyrinic acid or cobyrinic acid a,c-diamide in the biosynthesis of cobalamin (vitamin B12) from uroporphyrinogen III. Vitamin B12 is an important cofactor and an essential nutrient for many plants and animals and is primarily produced by bacteria. The family also contains dethiobiotin synthetases as well as the plasmid partitioning proteins of the MinD/ParA family.	224
396296	pfam01657	Stress-antifung	Salt stress response/antifungal. This domain is often found in association with the kinase domains pfam00069 or pfam07714. In many proteins it is duplicated. It contains six conserved cysteines which are involved in disulphide bridges. It has a role in salt stress response and has antifungal activity.	95
396297	pfam01658	Inos-1-P_synth	Myo-inositol-1-phosphate synthase. This is a family of myo-inositol-1-phosphate synthases. Inositol-1-phosphate synthase catalyzes the conversion of glucose-6-phosphate to inositol-1-phosphate, which is then dephosphorylated to inositol. Inositol phosphates play an important role in signal transduction.	104
279928	pfam01659	Luteo_Vpg	Luteovirus putative VPg genome linked protein. This family consists of several putative genome linked proteins. The genomic RNA of luteoviruses is linked to virally encoded genome proteins (VPg). Open reading frame 4 is thought to encode the VPg in Soybean dwarf luteovirus. Luteoviruses have isometric capsids that contain a positive strand ssRNA genome; they have no DNA stage during their replication.	105
396298	pfam01660	Vmethyltransf	Viral methyltransferase. This RNA methyltransferase domain is found in a wide range of ssRNA viruses, including Hordei-, Tobra-, Tobamo-, Bromo-, Clostero- and Caliciviruses. This methyltransferase is involved in mRNA capping. Capping of mRNA enhances its stability. This usually occurs in the nucleus. Therefore, many viruses that replicate in the cytoplasm encode their own. This is a specific guanine-7-methyltransferase domain involved in viral mRNA cap0 synthesis. Specificity for guanine 7 position is shown by NMR in and in vivo role in cap synthesis. Based on secondary structure prediction, the basic fold is believed to be similar to the common AdoMet-dependent methyltransferase fold. A curious feature of this methyltransferase domain is that it together with flanking sequences seems to have guanylyltransferase activity coupled to the methyltransferase activity. The domain is found throughout the so-called Alphavirus superfamily, (including alphaviruses and several other groups). It forms the defining, unique feature of this superfamily.	308
396299	pfam01661	Macro	Macro domain. This domain is an ADP-ribose binding module. It is found in a number of otherwise unrelated proteins. It is found at the C-terminus of the macro-H2A histone protein. This domain is found in the non-structural proteins of several types of ssRNA viruses such as NSP3 from alphaviruses. This domain is also found on its own in a family of proteins from bacteria, archaebacteria and eukaryotes.	118
396300	pfam01663	Phosphodiest	Type I phosphodiesterase / nucleotide pyrophosphatase. This family consists of phosphodiesterases, including human plasma-cell membrane glycoprotein PC-1 / alkaline phosphodiesterase i / nucleotide pyrophosphatase (nppase). These enzymes catalyze the cleavage of phosphodiester and phosphosulfate bonds in NAD, deoxynucleotides and nucleotide sugars. Also in this family is ATX, an autotaxin, a tumor cell motility-stimulating protein which exhibits type I phosphodiesterase activity. The alignment encompasses the active site. Also present within this family is a 60-kDa Ca2+-ATPase from F. odoratum.	343
366748	pfam01664	Reo_sigma1	Reovirus viral attachment protein sigma 1. This family consists of the reovirus sigma 1 hemagglutinin, cell attachment protein. This glycoprotein is a minor capsid protein and also determines the serotype-specific humoral immune response. Sigma 1 consists of a fibrous tail and a globular head. The head has important roles in the cell attachment function of sigma 1 and is a determinant of the type-specific humoral immune response. Reovirus is part of the orthoreovirus group of the reoviridae, which have a dsRNA genome. Also present in this family is bacteriophage SF6 Lysozyme.	216
279933	pfam01665	Rota_NSP3	Rotavirus non-structural protein NSP3. This family consists of rotaviral non-structural RNA binding protein 34 (NS34 or NSP3). The NSP3 protein has been shown to bind viral RNA. The NSP3 protein consists of 3 conserved functional domains; a basic region which binds ssRNA, a region containing heptapeptide repeats mediating oligomerization and a leucine zipper motif. NSP3 may play a central role in replication and assembly of genomic RNA structures. Rotaviruses have a dsRNA genome and are a major cause of acute gastroenteritis in the young of many species. The rotavirus non-structural protein NSP3 is a sequence-specific RNA binding protein that binds the nonpolyadenylated 3' end of the rotavirus mRNAs. NSP3 also interacts with the translation initiation factor eIF4GI and competes with the poly(A) binding protein.	311
366749	pfam01666	DX	DX module. This domain has no known function. It is found in several C. elegans proteins. The domain contains 6 conserved cysteines that probably form three disulphide bridges.	76
279935	pfam01667	Ribosomal_S27e	Ribosomal protein S27. 	55
396301	pfam01668	SmpB	SmpB protein. 	143
396302	pfam01669	Myelin_MBP	Myelin basic protein. 	156
396303	pfam01670	Glyco_hydro_12	Glycosyl hydrolase family 12. 	207
279939	pfam01671	ASFV_360	African swine fever virus multigene family 360 protein. The multigene family 360 proteins are found within the African swine fever virus (ASF) genome, which consists of dsDNA and has similar structural features to the poxviruses. The biological function of this family is not known, although African swine fever virus protein MGF 360-9L is a major structural protein.	215
376591	pfam01672	Plasmid_parti	Putative plasmid partition protein. This family consists of conserved hypothetical proteins from Borrelia burgdorferi the lyme disease spirochaete, some of which are putative plasmid partition proteins.	85
279941	pfam01673	Herpes_env	Herpesvirus putative major envelope glycoprotein. This family consists of probable major envelope glycoproteins from members of the herpesviridae including herpes simplex virus, human cytomegalovirus and varicella-zoster virus. Members of the herpesviridae have a dsDNA genome and do not have an RNA stage during their replication.	526
396304	pfam01674	Lipase_2	Lipase (class 2). This family consists of hypothetical C. elegans proteins and lipases. Lipases or triacylglycerol acylhydrolases hydrolyze ester bonds in triacylglycerol giving diacylglycerol, monoacylglycerol, glycerol and free fatty acids. Lipase EstA is an extracellular lipase from B. subtilis 168.	218
396305	pfam01676	Metalloenzyme	Metalloenzyme superfamily. This family includes phosphopentomutase and 2,3-bisphosphoglycerate-independent phosphoglycerate mutase. This family is also related to pfam00245. The alignment contains the most conserved residues that are probably involved in metal binding and catalysis.	410
279943	pfam01677	Herpes_UL7	Herpesvirus UL7 like. This family consists of various functionally undefined proteins from the herpesviridae and UL7 from bovine herpes virus. UL7 is not essential for virus replication in cell culture, and is found localized in the cytoplasm of infected cells accumulated around the nucleus but could not be detected in purified virions. Members of the herpesviridae have a dsDNA genome and do not have an RNA stage during their replication.	213
396306	pfam01678	DAP_epimerase	Diaminopimelate epimerase. Diaminopimelate epimerase contains two domains of the same alpha/beta fold, both contained in this family.	119
396307	pfam01679	Pmp3	Proteolipid membrane potential modulator. Pmp3 is an evolutionarily conserved proteolipid in the plasma membrane which, in S. pombe, is transcriptionally regulated by the Spc1 stress MAPK (mitogen-activated protein kinases) pathway. It functions to modulate the membrane potential, particularly to resist high cellular cation concentration. In eukaryotic organisms, stress-activated mitogen-activated protein kinases play crucial roles in transmitting environmental signals that will regulate gene expression for allowing the cell to adapt to cellular stress. Pmp3-like proteins are highly conserved in bacteria, yeast, nematode and plants.	49
396308	pfam01680	SOR_SNZ	SOR/SNZ family. Members of this family are enzymes involved in a new pathway of pyridoxine/pyridoxal 5-phosphate biosynthesis. This family was formerly known as UPF0019.	206
396309	pfam01681	C6	C6 domain. This domain of unknown function is found in a hypothetical C. elegans protein. It is presumed to be an extracellular domain. The C6 domain contains six conserved cysteine residues in most copies of the domain. However some copies of the domain are missing cysteine residues 1 and 3 suggesting that these form a disulphide bridge.	90
396310	pfam01682	DB	DB module. This domain has no known function. It is found in several C. elegans proteins. The domain contains 12 conserved cysteines that probably form six disulphide bridges. This domain is found associated with ig pfam00047 and fn3 pfam00041 domains, as well as in some lipases pfam00657.	97
396311	pfam01683	EB	EB module. This domain has no known function. It is found in several C. elegans proteins. The domain contains 8 conserved cysteines that probably form four disulphide bridges. This domain is found associated with kunitz domains pfam00014.	52
366757	pfam01684	ET	ET module. This domain has no known function. It is found in several C. elegans proteins. The domain contains 8-10 conserved cysteines that probably form 4-5 disulphide bridges. By inspection of the conservation of cysteines it looks like cysteines 1,2,3,4,9 and 10 are always present and that sometimes the pair 5 and 8 or the pair 6 and 7 are missing. This suggests that cysteines 5/8 and 6/7 make disulphide bridges.	78
396312	pfam01686	Adeno_Penton_B	Adenovirus penton base protein. This family consists of various adenovirus penton base proteins, from both the Mastadenoviradae having mammalian hosts and the Aviadenoviradae having avian hosts. The penton base is a major structural protein forming part of the penton, which consists of a base and a fibre; the pentons hold a morphologically prominent position at the vertex capsomer in the adenovirus particle. In mammalian adenovirus there is only one tail on each base whereas in avian adenovirus there are two.	450
396313	pfam01687	Flavokinase	Riboflavin kinase. This family represents the C-terminal region of the bifunctional riboflavin biosynthesis protein known as RibC in Bacillus subtilis. The RibC protein from Bacillus subtilis has both flavokinase and flavin adenine dinucleotide synthetase (FAD-synthetase) activities. RibC plays an essential role in the flavin metabolism. This domain is thought to have kinase activity.	122
279953	pfam01688	Herpes_gI	Alphaherpesvirus glycoprotein I. This family consists of glycoprotein I from various members of the alphaherpesvirinae; these include herpesvirus, varicella-zoster virus and pseudorabies virus. Glycoprotein I (gI) is important during natural infection; mutants lacking gI produce smaller lesions at the site of infection and show reduced neuronal spread. gI forms a heterodimeric complex with gE; this complex displays Fc receptor activity (binds to the Fc region of immunoglobulin). Glycoproteins are also important in the production of virus-neutralising antibodies and cell mediated immunity. The alphaherpesvirinae have a dsDNA genome and have no RNA stage during viral replication.	155
279954	pfam01690	PLRV_ORF5	Potato leaf roll virus readthrough protein. This family consists mainly of the potato leaf roll virus readthrough protein. This is generated via a readthrough of open reading frame 3 a coat protein allowing transcription of open reading frame 5 to give an extended coat protein with a large c-terminal addition or read through domain. The readthrough protein is thought to play a role in the circulative aphid transmission of potato leaf roll virus. Also in the family is open reading frame 6 from beet western yellows virus and potato leaf roll virus both luteovirus and an unknown protein from cucurbit aphid-borne yellows virus a closterovirus.	524
279955	pfam01691	Adeno_E1B_19K	Adenovirus E1B 19K protein / small t-antigen. This family consists of adenovirus E1B 19K protein or small t-antigen. The E1B 19K protein inhibits E1A induced apoptosis and hence prolongs the viability of the host cell. It can also inhibit apoptosis mediated by tumor necrosis factor alpha and Fas antigen. E1B 19K blocks apoptosis by interacting with and inhibiting the p53-inducible and death-promoting Bax protein. The E1B region of adenovirus encodes two proteins E1B 19K the small t-antigen as found in this family and E1B 55K the large t-antigen which is not found in this family; both of these proteins inhibit E1A induced apoptosis.	135
279956	pfam01692	Paramyxo_C	Paramyxovirus non-structural protein C. This family consists of the C proteins (C', C, Y1, Y2) found in Paramyxovirinae; human parainfluenza, and sendai virus. The C proteins affect viral RNA synthesis, having both a positive and negative effect during the course of infection. Paramyxoviruses have a negative strand ssRNA genome of 15.3kb from which six mRNAs are transcribed; five of these are monocistronic. The P/C mRNA is polycistronic and has two overlapping open reading frames P and C; C encodes the nested C proteins C', C, Y1 and Y2.	204
396314	pfam01693	Cauli_VI	Caulimovirus viroplasmin. This family consists of various caulimovirus viroplasmin proteins. The viroplasmin protein is encoded by gene VI and is the main component of viral inclusion bodies or viroplasms. Inclusions are the site of viral assembly, DNA synthesis and accumulation. Two domains exist within gene VI corresponding approximately to the 5' third and middle third of gene VI, these influence systemic infection in a light-dependent manner.	44
396315	pfam01694	Rhomboid	Rhomboid family. This family contains integral membrane proteins that are related to Drosophila rhomboid protein. Members of this family are found in bacteria and eukaryotes. Rhomboid promotes the cleavage of the membrane-anchored TGF-alpha-like growth factor Spitz, allowing it to activate the Drosophila EGF receptor. Analysis has shown that Rhomboid-1 is an intramembrane serine protease (EC:3.4.21.105). Parasite-encoded rhomboid enzymes are also important for invasion of host cells by Toxoplasma and the malaria parasite.	145
396316	pfam01695	IstB_IS21	IstB-like ATP binding protein. This protein contains an ATP/GTP binding P-loop motif. It is found associated with IS21 family insertion sequences. The function of this protein is unknown, but it may perform a transposase function.	238
366761	pfam01696	Adeno_E1B_55K	Adenovirus E1B 55K protein / large t-antigen. This family consists of adenovirus E1B 55K protein or large t-antigen. E1B 55K binds p53 the tumor suppressor protein converting it from a transcriptional activator which responds to damaged DNA into an unregulated repressor of genes with a p53 binding site. This protects the virus against p53 induced host antiviral responses and prevents apoptosis as induced by the adenovirus E1A protein. The E1B region of adenovirus encodes two proteins E1B 55K the large t-antigen as found in this family and E1B 19K pfam01691 the small t-antigen which is not found in this family; both of these proteins inhibit E1A induced apoptosis. This family shows distant similarities to the pectate lyase superfamily.	387
396317	pfam01697	Glyco_transf_92	Glycosyltransferase family 92. Members of this family act as galactosyltransferases, belonging to glycosyltransferase family 92. The aligned region contains several conserved cysteine residues and several charged residues that may be catalytic residues. This is supported by the inclusion of this family in the GT-A glycosyl transferase superfamily.	250
396318	pfam01698	LFY_SAM	Floricaula / Leafy protein SAM domain. This family consists of various plant development proteins which are homologs of floricaula (FLO) and Leafy (LFY) proteins which are floral meristem identity proteins. Mutations in the sequences of these proteins affect flower and leaf development. LFY proteins have been shown to bind semi-palindromic 19-bp DNA elements through their highly conserved C-terminal DBD. In addition to its well-characterized DBD, LFY possesses a second conserved domain at its amino terminus (LFY-N). This entry represents the SAM domain found in the N-terminal region of LFY proteins in plants. Crystallographic structure determination of LFY-N shows that LFY-N is a Sterile Alpha Motif (SAM) domain that mediates LFY oligomerization. It allows LFY to bind to regions lacking high-affinity LFYbs (LFY-binding sites) and confers on LFY the ability to access closed chromatin regions. Experiments carried out in plants revealed that altering the capacity of LFY to oligomerize compromised its floral function and drastically reduced its genome-wide DNA binding. SAM oligomerization has been suggested to have a profound effect on a TF binding landscape by promoting cooperative binding of LFY to DNA, as was proposed for other oligomeric TFs, and it gives LFY access to closed chromatin regions that are notably refractory to TF binding. It has also been suggested that the biochemical properties of the SAM domain are evolutionarily conserved in all plant species.	80
396319	pfam01699	Na_Ca_ex	Sodium/calcium exchanger protein. This is a family of sodium/calcium exchanger integral membrane proteins. This family covers the integral membrane regions of the proteins. Sodium/calcium exchangers regulate intracellular Ca2+ concentrations in many cells; cardiac myocytes, epithelial cells, neurons, retinal rod photoreceptors and smooth muscle cells. Ca2+ is moved into or out of the cytosol depending on Na+ concentration. In humans and rats there are 3 isoforms; NCX1, NCX2 and NCX3.	149
279964	pfam01700	Orbi_VP3	Orbivirus VP3 (T2) protein. The orbivirus VP3 protein is part of the virus core and makes a 'subcore' shell made up of 120 copies of the 100K protein. VP3 particles can also bind RNA and are fundamental in the early stages of viral core formation. Also found in the family is structural core protein VP2 from broadhaven virus which is similar to VP3 in bluetongue virus. Orbivirus are part of the larger reoviridae which have a dsRNA genome of 10-12 linear segments; orbivirus found in this family include bluetongue virus and epizootic hemorrhagic disease virus.	888
396320	pfam01701	PSI_PsaJ	Photosystem I reaction centre subunit IX / PsaJ. This family consists of the photosystem I reaction centre subunit IX or PsaJ from various organisms including Synechocystis sp. (strain pcc 6803), Pinus thunbergii (green pine) and Zea mays (maize). PsaJ is a small 4.4kDa, chloroplastal encoded, hydrophobic subunit of the photosystem I reaction complex; its function is not yet fully understood. PsaJ can be cross-linked to PsaF and has a single predicted transmembrane domain; it has a proposed role in maintaining PsaF in the correct orientation to allow for fast electron transfer from soluble donor proteins to P700+.	37
396321	pfam01702	TGT	Queuine tRNA-ribosyltransferase. This is a family of queuine tRNA-ribosyltransferases EC:2.4.2.29, also known as tRNA-guanine transglycosylase and guanine insertion enzyme. Queuine tRNA-ribosyltransferase modifies tRNAs for asparagine, aspartic acid, histidine and tyrosine with queuine. It catalyzes the exchange of guanine-34 at the wobble position with 7-aminomethyl-7-deazaguanine, and the addition of a cyclopentenediol moiety to 7-aminomethyl-7-deazaguanine-34 tRNA; giving a hypermodified base queuine in the wobble position. The aligned region contains a zinc binding motif C-x-C-x2-C-x29-H, and important tRNA and 7-aminomethyl-7-deazaguanine binding residues.	358
396322	pfam01704	UDPGP	UTP--glucose-1-phosphate uridylyltransferase. This family consists of UTP--glucose-1-phosphate uridylyltransferases, EC:2.7.7.9. Also known as UDP-glucose pyrophosphorylase (UDPGP) and Glucose-1-phosphate uridylyltransferase. UTP--glucose-1-phosphate uridylyltransferase catalyzes the interconversion of MgUTP + glucose-1-phosphate and UDP-glucose + MgPPi. UDP-glucose is an important intermediate in mammalian carbohydrate interconversion involved in various metabolic roles depending on tissue type. In Dictyostelium (slime mold) mutants in this enzyme abort the development cycle. Also within the family is UDP-N-acetylglucosamine or AGX1 and two hypothetical proteins from Borrelia burgdorferi the lyme disease spirochaete.	412
396323	pfam01705	CX	CX module. This domain has no known function. It is found in several C. elegans proteins. The domain contains 6 conserved cysteines that probably form three disulphide bridges.	59
396324	pfam01706	FliG_C	FliG C-terminal domain. FliG is a component of the flagellar rotor, present in about 25 copies per flagellum. This domain functions specifically in motor rotation.	108
279970	pfam01707	Peptidase_C9	Peptidase family C9. 	202
366768	pfam01708	Gemini_mov	Geminivirus putative movement protein. This family consists of putative movement proteins from Maize streak and wheat dwarf virus.	92
396325	pfam01709	Transcrip_reg	Transcriptional regulator. This is a family of transcriptional regulators. In mammals, it activates the transcription of mitochondrially-encoded COX1. In bacteria, it negatively regulates the quorum-sensing response regulator by binding to its promoter region.	235
279973	pfam01710	HTH_Tnp_IS630	Transposase. Transposase proteins are necessary for efficient DNA transposition. This family includes insertion sequences from Synechocystis PCC 6803 three of which are characterized as homologous to bacterial IS5- and IS4- and to several members of the IS630-Tc1-mariner superfamily.	119
396326	pfam01712	dNK	Deoxynucleoside kinase. This family consists of various deoxynucleoside kinases cytidine EC:2.7.1.74, guanosine EC:2.7.1.113, adenosine EC:2.7.1.76 and thymidine kinase EC:2.7.1.21 (which also phosphorylates deoxyuridine and deoxycytosine). These enzymes catalyze the production of deoxynucleotide 5'-monophosphate from a deoxynucleoside, using ATP and yielding ADP in the process.	201
396327	pfam01713	Smr	Smr domain. This family includes the Smr (Small MutS Related) proteins, and the C-terminal region of the MutS2 protein. It has been suggested that this domain interacts with the MutS1 protein in the case of Smr proteins and with the N-terminal MutS related region of MutS2. This domain exhibits nicking endonuclease activity that might have a role in mismatch repair or genetic recombination. It shows no significant double strand cleavage or exonuclease activity. The full-length human NEDD4-binding protein 2 also has the polynucleotide kinase activity.	78
396328	pfam01715	IPPT	IPP transferase. This is a family of IPP transferases EC:2.5.1.8 also known as tRNA delta(2)-isopentenylpyrophosphate transferase. These enzymes modify both cytoplasmic and mitochondrial tRNAs at A(37) to give isopentenyl A(37).	244
396329	pfam01716	MSP	Manganese-stabilizing protein / photosystem II polypeptide. This family consists of the 33 KDa photosystem II polypeptide from the oxygen evolving complex (OEC) of plants and cyanobacteria. The protein is also known as the manganese-stabilizing protein as it is associated with the manganese complex of the OEC and may provide the ligands for the complex.	242
366771	pfam01717	Meth_synt_2	Cobalamin-independent synthase, Catalytic domain. This is a family of vitamin-B12 independent methionine synthases or 5-methyltetrahydropteroyltriglutamate--homocysteine methyltransferases, EC:2.1.1.14 from bacteria and plants. Plants are the only higher eukaryotes that have the required enzymes for methionine synthesis. This enzyme catalyzes the last step in the production of methionine by transferring a methyl group from 5-methyltetrahydrofolate to homocysteine. The aligned region makes up the carboxy region of the approximately 750 amino acid protein except in some hypothetical archaeal proteins present in the family, where this region corresponds to the entire length. This domain contains the catalytic residues of the enzyme.	323
396330	pfam01718	Orbi_NS1	Orbivirus non-structural protein NS1, or hydrophobic tubular protein. This family consists of orbivirus non-structural protein NS1, or hydrophobic tubular protein. NS1 has no specific function in virus replication, it is however thought to play a role in transport of mature virus particles from virus inclusion bodies to the cell membrane. Orbivirus are part of the larger reoviridae which have a dsRNA genome of at least 10 segments encoding at least 10 viral proteins; orbivirus found in this family include bluetongue virus, and African horsesickness virus.	548
396331	pfam01719	Rep_2	Plasmid replication protein. This family consists of various bacterial plasmid replication (Rep) proteins. These proteins are essential for replication of plasmids; the Rep proteins are topoisomerases that nick the positive strand at the plus origin of replication and also at the single-strand conversion sequence.	181
366773	pfam01721	Bacteriocin_II	Class II bacteriocin. The bacteriocins are small peptides that inhibit the growth of various bacteria. Bacteriocins of lactic acid bacteria may inhibit their target cells by permeabilising the cell membrane.	33
396332	pfam01722	BolA	BolA-like protein. This family consists of the morphoprotein BolA from E. coli and its various homologs. In E. coli over expression of this protein causes round morphology and may be involved in switching the cell between elongation and septation systems during cell division. The expression of BolA is growth rate regulated and is induced during the transition into the stationary phase. BolA is also induced by stress during early stages of growth and may have a general role in stress response. It has also been suggested that BolA can induce the transcription of penicillin binding proteins 6 and 5.	75
279983	pfam01723	Chorion_1	Chorion protein. This family consists of the chorion superfamily proteins classes A, B, CA, CB and high-cysteine HCB from silk, gypsy and polyphemus moths. The chorion proteins make up the moth's egg shell, a complex extracellular structure.	169
396333	pfam01724	DUF29	Domain of unknown function DUF29. This family consists of various hypothetical proteins from cyanobacteria, none of which are functionally described. The aligned region is approximately 120-140 amino acids long corresponding to almost the entire length of the proteins in the family. Structure 3fcn is a small protein that has a novel all-alpha fold. The N-terminal helical hairpin is likely to function as a dimerization module. This protein is a member of PFam family PF01724. The function of this protein is unknown. One protein sequence contains a fusion of this protein and a DnaB domain, suggesting a possible role in DNA helicase activity (hypothetical). Dali hits have low Z and high rmsd, suggesting probably only topological similarities (not functional relevance) (details derived from TOPSAN). The family has several highly conserved sequence motifs, including YD/ExD, DxxNVxEEIE, and CPY/F/W, as well as conserved tryptophans.	138
396334	pfam01725	Ham1p_like	Ham1 family. This family consists of the HAM1 protein and hypothetical archaeal, bacterial and C. elegans proteins. HAM1 controls 6-N-hydroxylaminopurine (HAP) sensitivity and mutagenesis in S. cerevisiae. The HAM1 protein protects the cell from HAP, either on the level of deoxynucleoside triphosphate or the DNA level by a yet unidentified set of reactions.	184
396335	pfam01726	LexA_DNA_bind	LexA DNA binding domain. This is the DNA binding domain of the LexA SOS regulon repressor which prevents expression of DNA repair proteins. The aligned region contains a variant form of the helix-turn-helix DNA binding motif. This domain is found associated with pfam00717 the auto-proteolytic domain of LexA EC:3.4.21.88.	63
396336	pfam01728	FtsJ	FtsJ-like methyltransferase. This family consists of FtsJ from various bacterial and archaeal sources. FtsJ is a methyltransferase, but actually has no effect on cell division. FtsJ's substrate is the 23S rRNA. The 1.5 A crystal structure of FtsJ in complex with its cofactor S-adenosylmethionine revealed that FtsJ has a methyltransferase fold. This family also includes the N-terminus of flaviviral NS5 protein. It has been hypothesized that the N-terminal domain of NS5 is a methyltransferase involved in viral RNA capping.	179
396337	pfam01729	QRPTase_C	Quinolinate phosphoribosyl transferase, C-terminal domain. Quinolinate phosphoribosyl transferase (QPRTase) or nicotinate-nucleotide pyrophosphorylase EC:2.4.2.19 is involved in the de novo synthesis of NAD in both prokaryotes and eukaryotes. It catalyzes the reaction of quinolinic acid with 5-phosphoribosyl-1-pyrophosphate (PRPP) in the presence of Mg2+ to give rise to nicotinic acid mononucleotide (NaMN), pyrophosphate and carbon dioxide. The QA substrate is bound between the C-terminal domain of one subunit, and the N-terminal domain of the other. The C-terminal domain has a 7 beta-stranded TIM barrel-like fold.	169
396338	pfam01730	UreF	UreF. This family consists of the Urease accessory protein UreF. The urease enzyme (urea amidohydrolase) hydrolyzes urea into ammonia and carbamic acid. UreF is proposed to modulate the activation process of urease by eliminating the binding of nickel ions to noncarbamylated protein.	145
334656	pfam01731	Arylesterase	Arylesterase. This family consists of arylesterases (Also known as serum paraoxonase) EC:3.1.1.2. These enzymes hydrolyze organophosphorus esters such as paraoxon and are found in the liver and blood. They confer resistance to organophosphate toxicity. Human arylesterase (PON1) is associated with HDL and may protect against LDL oxidation.	86
396339	pfam01732	DUF31	Putative peptidase (DUF31). This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas. It appears to be related to the superfamily of trypsin peptidases and so may have a peptidase function.	351
396340	pfam01733	Nucleoside_tran	Nucleoside transporter. This is a family of nucleoside transporters. In mammalian cells nucleoside transporters transport nucleoside across the plasma membrane and are essential for nucleotide synthesis via the salvage pathways for cells that lack their own de novo synthesis pathways. Also in this family is mouse and human nucleolar protein HNP36, a protein of unknown function; although it has been hypothesized to be a plasma membrane nucleoside transporter.	286
396341	pfam01734	Patatin	Patatin-like phospholipase. This family consists of various patatin glycoproteins from plants. The patatin protein accounts for up to 40% of the total soluble protein in potato tubers. Patatin is a storage protein but it also has the enzymatic activity of lipid acyl hydrolase, catalyzing the cleavage of fatty acids from membrane lipids. Members of this family have been found also in vertebrates.	190
366778	pfam01735	PLA2_B	Lysophospholipase catalytic domain. This family consists of Lysophospholipase / phospholipase B EC:3.1.1.5 and cytosolic phospholipase A2 EC:3.1.4 which also has a C2 domain pfam00168. Phospholipase B enzymes catalyze the release of fatty acids from lysophospholipids and are capable in vitro of hydrolysing all phospholipids extractable from yeast cells. Cytosolic phospholipase A2 associates with natural membranes in response to physiological increases in Ca2+ and selectively hydrolyzes arachidonyl phospholipids; the aligned region corresponds to the carboxy-terminal Ca2+-independent catalytic domain of the protein as discussed in.	490
279993	pfam01736	Polyoma_agno	Polyomavirus agnoprotein. This family consists of the DNA binding protein or agnoprotein from various polyomaviruses. This protein is highly basic and can bind single stranded and double stranded DNA. Mutations in the agnoprotein produce smaller viral plaques, hence its function is not essential for growth in tissue culture cells but something has slowed in the normal replication cycle. There is also evidence suggesting that the agnogene and agnoprotein act as regulators of structural protein synthesis.	62
396342	pfam01737	Ycf9	YCF9. This family consists of the hypothetical protein product of the YCF9 gene from chloroplasts and cyanobacteria. These proteins have no known function.	56
396343	pfam01738	DLH	Dienelactone hydrolase family. 	213
396344	pfam01739	CheR	CheR methyltransferase, SAM binding domain. CheR proteins are part of the chemotaxis signaling mechanism in bacteria. CheR methylates the chemotaxis receptor at specific glutamate residues. CheR is an S-adenosylmethionine-dependent methyltransferase - the C-terminal domain (this one) binds SAM.	190
396345	pfam01740	STAS	STAS domain. The STAS (after Sulphate Transporter and AntiSigma factor antagonist) domain is found in the C terminal region of Sulphate transporters and bacterial antisigma factor antagonists. It has been suggested that this domain may have a general NTP binding function.	106
396346	pfam01741	MscL	Large-conductance mechanosensitive channel, MscL. 	130
396347	pfam01742	Peptidase_M27	Clostridial neurotoxin zinc protease. These toxins are zinc proteases that block neurotransmitter release by proteolytic cleavage of synaptic proteins such as synaptobrevins, syntaxin and SNAP-25.	420
396348	pfam01743	PolyA_pol	Poly A polymerase head domain. This family includes nucleic acid independent RNA polymerases, such as Poly(A) polymerase, which adds the poly (A) tail to mRNA EC:2.7.7.19. This family also includes the tRNA nucleotidyltransferase that adds the CCA to the 3' of the tRNA EC:2.7.7.25. This family is part of the nucleotidyltransferase superfamily.	126
396349	pfam01744	GLTT	GLTT repeat (6 copies). This short repeat of unknown function is found in multiple copies in several C. elegans proteins. The repeat is five residues long and consists of XGLTT where X can be any amino acid.	28
366786	pfam01745	IPT	Isopentenyl transferase. Isopentenyl transferase / dimethylallyl transferase synthesizes isopentenyladenosine 5'-monophosphate, a cytokinin that induces shoot formation on host plants infected with the Ti plasmid.	232
396350	pfam01746	tRNA_m1G_MT	tRNA (Guanine-1)-methyltransferase. This is a family of tRNA (Guanine-1)-methyltransferases EC:2.1.1.31. In E.coli K12 this enzyme catalyzes the conversion of a guanosine residue to N1-methylguanine in position 37, next to the anticodon, in tRNA.	182
396351	pfam01747	ATP-sulfurylase	ATP-sulfurylase. This domain is the catalytic domain of ATP-sulfurylase or sulfate adenylyltransferase EC:2.7.7.4 some of which are part of a bifunctional polypeptide chain associated with adenosyl phosphosulphate (APS) kinase pfam01583. Both enzymes are required for PAPS (phosphoadenosine-phosphosulfate) synthesis from inorganic sulphate. ATP sulfurylase catalyzes the synthesis of adenosine-phosphosulfate APS from ATP and inorganic sulphate.	213
396352	pfam01749	IBB	Importin beta binding domain. This family consists of the importin alpha (karyopherin alpha), importin beta (karyopherin beta) binding domain. The domain mediates formation of the importin alpha beta complex; required for classical NLS import of proteins into the nucleus, through the nuclear pore complex and across the nuclear envelope. Also in the alignment is the NLS of importin alpha which overlaps with the IBB domain.	79
396353	pfam01750	HycI	Hydrogenase maturation protease. The family consists of hydrogenase maturation proteases. In E. coli, HycI, the hydrogenase maturation protease, is involved in processing of HycE, the large subunit of hydrogenase 3, by cleavage of its C-terminus.	130
396354	pfam01751	Toprim	Toprim domain. This is a conserved region from DNA primase. This corresponds to the Toprim domain common to DnaG primases, topoisomerases, OLD family nucleases and RecR proteins. Both DnaG motifs IV and V are present in the alignment; the DxD (V) motif may be involved in Mg2+ binding and mutations to the conserved glutamate (IV) completely abolish DnaG type primase activity. DNA primase EC:2.7.7.6 is a nucleotidyltransferase; it synthesizes the oligoribonucleotide primers required for DNA replication on the lagging strand of the replication fork; it can also prime the leading strand and has been implicated in cell division. This family also includes the atypical archaeal A subunit from type II DNA topoisomerases. Type II DNA topoisomerases catalyze the relaxation of DNA supercoiling by causing transient double strand breaks.	93
396355	pfam01752	Peptidase_M9	Collagenase. This family of enzymes break down collagens.	285
396356	pfam01753	zf-MYND	MYND finger. 	39
396357	pfam01754	zf-A20	A20-like zinc finger. The A20 Zn-finger of bovine/human Rabex5/rabGEF1 is a Ubiquitin Binding Domain. The zinc finger mediates self-association in A20. These fingers also mediate IL-1-induced NF-kappa B activation.	23
366794	pfam01755	Glyco_transf_25	Glycosyltransferase family 25 (LPS biosynthesis protein). Members of this family belong to Glycosyltransferase family 25. This is a family of glycosyltransferases involved in lipopolysaccharide (LPS) biosynthesis. These enzymes catalyze the transfer of various sugars onto the growing LPS chain during its biosynthesis.	200
396358	pfam01756	ACOX	Acyl-CoA oxidase. This is a family of Acyl-CoA oxidases EC:1.3.3.6. Acyl-CoA oxidase converts acyl-CoA into trans-2-enoyl-CoA.	179
376607	pfam01757	Acyl_transf_3	Acyltransferase family. This family includes a range of acyltransferase enzymes. This domain is found in many as yet uncharacterized C. elegans proteins and it is approximately 300 amino acids long.	330
366796	pfam01758	SBF	Sodium Bile acid symporter family. This family consists of Na+/bile acid co-transporters. These transmembrane proteins function in the liver in the uptake of bile acids from portal blood plasma a process mediated by the co-transport of Na+. Also in the family is ARC3 from S. cerevisiae - this is a putative transmembrane protein involved in resistance to arsenic compounds.	191
396359	pfam01759	NTR	UNC-6/NTR/C345C module. Sequence similarity between netrin UNC-6 and C345C complement protein family members, and hence the existence of the UNC-6 module, was first reported in. Subsequently, many additional members of the family were identified on the basis of sequence similarity between the C-terminal domains of netrins, complement proteins C3, C4, C5, secreted frizzled-related proteins, and type I pro-collagen C-proteinase enhancer proteins (PCOLCEs), which are homologous with the N-terminal domains of tissue inhibitors of metalloproteinases (TIMPs). The TIMPs are classified as a separate family in Pfam (pfam00965). This expanded domain family has been named as the NTR module.	106
396360	pfam01761	DHQ_synthase	3-dehydroquinate synthase. The 3-dehydroquinate synthase EC:4.6.1.3 domain is present in isolation in various bacterial 3-dehydroquinate synthases and also present as a domain in the pentafunctional AROM polypeptide. 3-dehydroquinate (DHQ) synthase catalyzes the formation of dehydroquinate (DHQ) and orthophosphate from 3-deoxy-D-arabino heptulosonic 7 phosphate. This reaction is part of the shikimate pathway which is involved in the biosynthesis of aromatic amino acids.	258
250845	pfam01762	Galactosyl_T	Galactosyltransferase. This family includes the galactosyltransferases UDP-galactose:2-acetamido-2-deoxy-D-glucose3beta-galactosyltransferase and UDP-Gal:beta-GlcNAc beta 1,3-galactosyltransferase. Specific galactosyltransferases transfer galactose to GlcNAc terminal chains in the synthesis of the lacto-series oligosaccharides types 1 and 2.	196
396361	pfam01763	Herpes_UL6	Herpesvirus UL6 like. This family consists of various proteins from the herpesviridae that are similar to herpes simplex virus type I UL6 virion protein. UL6 is essential for cleavage and packaging of the viral genome.	556
396362	pfam01764	Lipase_3	Lipase (class 3). 	139
396363	pfam01765	RRF	Ribosome recycling factor. The ribosome recycling factor (RRF / ribosome release factor) dissociates the ribosome from the mRNA after termination of translation, and is essential for bacterial growth. Thus ribosomes are "recycled" and ready for another round of protein synthesis.	163
280020	pfam01766	Birna_VP2	Birnavirus VP2 protein. VP2 is the major structural protein of birnaviruses. The large RNA segment of birnaviruses codes for a polyprotein (N-VP2-VP4-VP3-C).	440
280021	pfam01767	Birna_VP3	Birnavirus VP3 protein. VP3 is a minor structural component of the virus. The large RNA segment of birnaviruses codes for a polyprotein (N-VP2-VP4-VP3-C).	227
280022	pfam01768	Birna_VP4	Birnavirus VP4 protein. VP4 is a viral protease. The large RNA segment of birnaviruses codes for a polyprotein (N-VP2-VP4-VP3-C).	259
396364	pfam01769	MgtE	Divalent cation transporter. This region is the integral membrane part of the eubacterial MgtE family of magnesium transporters. Related regions are found also in archaebacterial and eukaryotic proteins. All the archaebacterial and eukaryotic examples have two copies of the region. This suggests that the eubacterial examples may act as dimers. Members of this family probably transport Mg2+ or other divalent cations into the cell. The alignment contains two highly conserved aspartates that may be involved in cation binding (Bateman A unpubl.)	122
396365	pfam01770	Folate_carrier	Reduced folate carrier. The reduced folate carrier (a transmembrane glycoprotein) transports reduced folate into mammalian cells via the carrier mediated mechanism (as opposed to the receptor mediated mechanism) it also transports cytotoxic folate analogues used in chemotherapy, such as methotrexate (MTX). Mammalian cells have an absolute requirement for exogenous folates which are needed for growth, and biosynthesis of macromolecules.	412
396366	pfam01771	Herpes_alk_exo	Herpesvirus alkaline exonuclease. This family includes various alkaline exonucleases from members of the herpesviridae. Alkaline exonuclease appears to have an important role in the replication of herpes simplex virus.	460
396367	pfam01773	Nucleos_tra2_N	Na+ dependent nucleoside transporter N-terminus. This family consists of nucleoside transport proteins. Rat Slc28a2 is a purine-specific Na+-nucleoside cotransporter localized to the bile canalicular membrane. Rat Slc28a1 is a Na+-dependent nucleoside transporter selective for pyrimidine nucleosides and adenosine; it also transports the anti-viral nucleoside analogues AZT and ddC. This alignment covers the N-terminus of this family.	73
396368	pfam01774	UreD	UreD urease accessory protein. UreD is a urease accessory protein. Urease pfam00449 hydrolyzes urea into ammonia and carbamic acid. UreD is involved in activation of the urease enzyme via the UreD-UreF-UreG-urease complex and is required for urease nickel metallocenter assembly. See also UreF pfam01730, UreG pfam01495.	164
396369	pfam01775	Ribosomal_L18A	Ribosomal proteins 50S-L18Ae/60S-L20/60S-L18A. This family includes: archaeal 50S ribosomal protein L18Ae, often referred to as L20e or LX; fungal 60S ribosomal protein L20; and higher eukaryote 60S ribosomal protein L18A.	60
396370	pfam01776	Ribosomal_L22e	Ribosomal L22e protein family. 	99
396371	pfam01777	Ribosomal_L27e	Ribosomal L27e protein family. The N-terminal region of the eukaryotic ribosomal L27 has the KOW motif. C-terminal region is represented by this family.	85
396372	pfam01778	Ribosomal_L28e	Ribosomal L28e protein family. 	114
396373	pfam01779	Ribosomal_L29e	Ribosomal L29e protein family. 	39
396374	pfam01780	Ribosomal_L37ae	Ribosomal L37ae protein family. This ribosomal protein is found in archaebacteria and eukaryotes. It contains four conserved cysteine residues that may bind to zinc.	85
396375	pfam01781	Ribosomal_L38e	Ribosomal L38e protein family. 	67
396376	pfam01782	RimM	RimM N-terminal domain. The RimM protein is essential for efficient processing of 16S rRNA. The RimM protein was shown to have affinity for free ribosomal 30S subunits but not for 30S subunits in the 70S ribosomes. This N-terminal domain is found associated with a PRC-barrel domain.	84
396377	pfam01783	Ribosomal_L32p	Ribosomal L32p protein family. 	56
396378	pfam01784	NIF3	NIF3 (NGG1p interacting factor 3). This family contains several NIF3 (NGG1p interacting factor 3) protein homologs. NIF3 interacts with the yeast transcriptional coactivator NGG1p which is part of the ADA complex, the exact function of this interaction is unknown.	240
250863	pfam01785	Closter_coat	Closterovirus coat protein. This family consists of coat proteins from closteroviruses, members of the Closteroviridae. The viral coat protein encapsulates and protects the viral genome. Both the large cp1 and smaller cp2 coat protein originate from the same primary transcript. Members of the Closteroviridae include Sugar beet yellow virus and Grapevine leafroll-associated virus; closteroviruses have a positive strand ssRNA genome with no DNA stage during replication.	188
396379	pfam01786	AOX	Alternative oxidase. The alternative oxidase is used as a second terminal oxidase in the mitochondria; electrons are transferred directly from reduced ubiquinol to oxygen, forming water. This is not coupled to ATP synthesis and is not inhibited by cyanide; this pathway is a single step process. In rice the transcript levels of the alternative oxidase are increased by low temperature.	218
396380	pfam01787	Ilar_coat	Ilarvirus coat protein. This family consists of various coat proteins from the ilarviruses, part of the Bromoviridae; members include apple mosaic virus and prune dwarf virus. The ilarvirus coat protein is required to initiate replication of the viral genome in host plants. Members of the Bromoviridae have a positive strand ssRNA genome with no DNA stage in their replication.	204
396381	pfam01788	PsbJ	PsbJ. This family consists of the photosystem II reaction centre protein PsbJ from plants and Cyanobacteria. In Synechocystis sp. PCC 6803 PsbJ regulates the number of photosystem II centers in thylakoid membranes, it is a predicted 4kDa protein with one membrane spanning domain.	38
396382	pfam01789	PsbP	PsbP. This family consists of the 23 kDa subunit of oxygen evolving system of photosystem II or PsbP from various plants (where it is encoded by the nuclear genome) and Cyanobacteria. The 23 KDa PsbP protein is required for PSII to be fully operational in vivo, it increases the affinity of the water oxidation site for Cl- and provides the conditions required for high affinity binding of Ca2+.	155
396383	pfam01790	LGT	Prolipoprotein diacylglyceryl transferase. 	238
396384	pfam01791	DeoC	DeoC/LacD family aldolase. This family includes diverse aldolase enzymes. This family includes the enzyme deoxyribose-phosphate aldolase EC:4.1.2.4, which is involved in nucleotide metabolism. The family also includes a group of related bacterial proteins of unknown function. The family also includes tagatose 1,6-diphosphate aldolase (EC:4.1.2.40), which is part of the tagatose-6-phosphate pathway of galactose-6-phosphate degradation.	235
396385	pfam01793	Glyco_transf_15	Glycolipid 2-alpha-mannosyltransferase. This is a family of alpha-1,2 mannosyl-transferases involved in N-linked and O-linked glycosylation of proteins. Some of the enzymes in this family have been shown to be involved in O- and N-linked glycan modifications in the Golgi.	313
396386	pfam01794	Ferric_reduct	Ferric reductase like transmembrane component. This family includes a common region in the transmembrane proteins mammalian cytochrome B-245 heavy chain (gp91-phox), ferric reductase transmembrane component in yeast and respiratory burst oxidase from mouse-ear cress. This may be a family of flavocytochromes capable of moving electrons across the plasma membrane. The Frp1 protein from S. pombe is a ferric reductase component and is required for cell surface ferric reductase activity; mutants in frp1 are deficient in ferric iron uptake. Cytochrome B-245 heavy chain is a FAD-dependent dehydrogenase; it also has electron transferase activity which reduces molecular oxygen to superoxide anion, a precursor in the production of microbicidal oxidants. Mutations in the sequence of cytochrome B-245 heavy chain (gp91-phox) lead to the X-linked chronic granulomatous disease. The bactericidal ability of phagocytic cells is reduced and is characterized by the absence of a functional plasma membrane associated NADPH oxidase. The chronic granulomatous disease gene codes for the beta chain of cytochrome B-245 and cytochrome B-245 is missing from patients with the disease.	117
396387	pfam01795	Methyltransf_5	MraW methylase family. Members of this family are probably SAM dependent methyltransferases based on Escherichia coli RsmH. This family appears to be related to pfam01596.	309
396388	pfam01796	OB_aCoA_assoc	DUF35 OB-fold domain, acyl-CoA-associated. The structure of a DUF35 representative reveals two long N-terminal helices followed by a rubredoxin-like zinc ribbon domain and a C-terminal OB fold domain represented in this entry. OB-folds are frequently found to bind nucleic acids suggesting this domain might bind to DNA or RNA (Topsan http://www.topsan.org/). Genomic context shows it to be adjacent to acyl-CoA transferase (http://www.microbesonline.org/).	65
396389	pfam01797	Y1_Tnp	Transposase IS200 like. Transposases are needed for efficient transposition of the insertion sequence or transposon DNA. This family includes transposases for IS200 from E. coli.	119
396390	pfam01798	Nop	snoRNA binding domain, fibrillarin. This family consists of various Pre RNA processing ribonucleoproteins. The function of the aligned region is unknown however it may be a common RNA or snoRNA or Nop1p binding domain. Nop5p (Nop58p) from yeast is the protein component of a ribonucleoprotein required for pre-18s rRNA processing and is suggested to function with Nop1p in a snoRNA complex. Nop56p and Nop5p interact with Nop1p and are required for ribosome biogenesis. Prp31p is required for pre-mRNA splicing in S. cerevisiae. Fibrillarin, or Nop, is the catalytic subunit responsible for the methyl transfer reaction of the site-specific 2'-O-methylation of ribosomal and spliceosomal RNA.	229
396391	pfam01799	Fer2_2	[2Fe-2S] binding domain. 	73
280049	pfam01801	Cytomega_gL	Cytomegalovirus glycoprotein L. Glycoprotein L from cytomegalovirus serves as a chaperone for the correct folding and surface expression of glycoprotein H (gH). Glycoprotein L is a member of the heterotrimeric gCIII complex of glycoprotein which also includes gH and gO and has an essential role in viral fusion.	211
396392	pfam01802	Herpes_V23	Herpesvirus VP23 like capsid protein. This family consists of various capsid proteins from members of the herpesviridae. The capsid protein VP23 in herpes simplex virus forms a triplex together with VP19C; these fit between and link together adjacent capsomers as formed by VP5 and VP26. VP3 along with the scaffolding proteins helps to form normal capsids by defining the curvature of the shell and size of the particle.	294
396393	pfam01803	LIM_bind	LIM-domain binding protein. The LIM-domain binding protein, binds to the LIM domain pfam00412 of LIM homeodomain proteins which are transcriptional regulators of development. Nuclear LIM interactor (NLI) / LIM domain-binding protein 1 (LDB1) is located in the nuclei of neuronal cells during development, it is co-expressed with Isl1 in early motor neuron differentiation and has a suggested role in the Isl1 dependent development of motor neurons. It is suggested that these proteins act synergistically to enhance transcriptional efficiency by acting as co-factors for LIM homeodomain and Otx class transcription factors both of which have essential roles in development. The Drosophila protein Chip is required for segmentation and activity of a remote wing margin enhancer. Chip is a ubiquitous chromosomal factor required for normal expression of diverse genes at many stages of development. It is suggested that Chip cooperates with different LIM domain proteins and other factors to structurally support remote enhancer-promoter interactions.	242
396394	pfam01804	Penicil_amidase	Penicillin amidase. Penicillin amidase or penicillin acylase EC:3.5.1.11 catalyzes the hydrolysis of benzylpenicillin to phenylacetic acid and 6-aminopenicillanic acid (6-APA), a key intermediate in the synthesis of penicillins. Also in the family are cephalosporin acylase and aculeacin A acylase, which are involved in the synthesis of related peptide antibiotics.	626
396395	pfam01805	Surp	Surp module. This domain is also known as the SWAP domain. SWAP stands for Suppressor-of-White-APricot. It has been suggested that these domains may be RNA binding.	52
280054	pfam01806	Paramyxo_P	Paramyxovirinae P phosphoprotein C-terminal region. The subfamily Paramyxovirinae of the family Paramyxoviridae now contains as main genera the rubulaviruses, avulaviruses, respiroviruses, henipaviruses and morbilliviruses. Protein P is the best characterized, structurally, of the replicative complex of N, P and L proteins and consists of two functionally distinct moieties, an N-terminal PNT, and a C-terminal PCT. The P protein is an essential part of the viral RNA polymerase complex formed from the P and L proteins. P protein plays a crucial role in the enzyme by positioning L onto the N/RNA template through an interaction with the C-terminal domain of N. Without P, L is not functional. The C-terminal part of P (PCT) is only functional as an oligomer and forms with L the polymerase complex. PNT is poorly conserved and unstructured in solution while PCT contains the oligomerization domain (PMD) that folds as a homotetrameric coiled coil (40) containing the L binding region and a C-terminal partially folded domain, PX (residues 474 to 568), identified as the nucleocapsid binding site. Interestingly, PX is also expressed as an independent polypeptide in infected cells. PX has a C-subdomain (residues 516 to 568) that consists of three alpha-helices arranged in an antiparallel triple-helical bundle linked to an unfolded flexible N-subdomain (residues 474 to 515).	248
280055	pfam01807	zf-CHC2	CHC2 zinc finger. This domain is principally involved in DNA binding in DNA primases.	95
396396	pfam01808	AICARFT_IMPCHas	AICARFT/IMPCHase bienzyme. This is a family of bifunctional enzymes catalyzing the last two steps in de novo purine biosynthesis. The bifunctional enzyme is found in both prokaryotes and eukaryotes. The second last step is catalyzed by 5-aminoimidazole-4-carboxamide ribonucleotide formyltransferase EC:2.1.2.3 (AICARFT), this enzyme catalyzes the formylation of AICAR with 10-formyl-tetrahydrofolate to yield FAICAR and tetrahydrofolate. This is catalyzed by a pair of C-terminal deaminase fold domains in the protein, where the active site is formed by the dimeric interface of two monomeric units. The last step is catalyzed by the N-terminal IMP (Inosine monophosphate) cyclohydrolase domain EC:3.5.4.10 (IMPCHase), cyclizing FAICAR (5-formylaminoimidazole-4-carboxamide ribonucleotide) to IMP.	308
396397	pfam01809	Haemolytic	Haemolytic domain. This domain has haemolytic activity. It is found in short (73-103 amino acid) proteins and contains three conserved cysteine residues.	67
280058	pfam01810	LysE	LysE type translocator. This family consists of various hypothetical proteins and an l-lysine exporter LysE from Corynebacterium glutamicum which is proposed to be the first of a novel family of translocators. LysE exports l-lysine from the cell into the surrounding medium and is predicted to span the membrane six times. The physiological function of the exporter is to excrete excess l-Lysine as a result of natural flux imbalances or peptide hydrolysis; and also after artificial deregulation of l-Lysine biosynthesis as used by the biotechnology industry for the production of l-lysine.	193
396398	pfam01812	5-FTHF_cyc-lig	5-formyltetrahydrofolate cyclo-ligase family. 5-formyltetrahydrofolate cyclo-ligase or methenyl-THF synthetase EC:6.3.3.2 catalyzes the interchange of 5-formyltetrahydrofolate (5-FTHF) to 5-10-methenyltetrahydrofolate, this requires ATP and Mg2+. 5-FTHF is used in chemotherapy where it is clinically known as Leucovorin.	186
396399	pfam01813	ATP-synt_D	ATP synthase subunit D. This is a family of subunit D from various ATP synthases including V-type H+ transporting and Na+ dependent. Subunit D is suggested to be an integral part of the catalytic sector of the V-ATPase.	194
396400	pfam01814	Hemerythrin	Hemerythrin HHE cation binding domain. Iteration of the HHE family found it to be related to Hemerythrin. It also demonstrated that what has been described as a single domain in fact consists of two cation binding domains. Members of this family occur all across nature and are involved in a variety of processes. For instance, in Nereis diversicolor hemerythrin binds Cadmium so as to protect the organism from toxicity. However Hemerythrin is classically described as Oxygen-binding through two attached Fe2+ ions. And the bacterial NorA is a regulator of response to NO, which suggests yet another set-up for its metal ligands. In Staphylococcus aureus the iron-sulfur cluster repair protein ScdA has been noted to be important when the organism switches to living in environments with low oxygen concentrations; perhaps this protein acts as an oxygen store or scavenger.	128
307776	pfam01815	Rop	Rop protein. 	57
250888	pfam01816	LRV	Leucine rich repeat variant. The function of this repeat is unknown. It has an unusual structure of two helices. One is an alpha helix, the other is the much rarer 3-10 helix.	26
396401	pfam01817	CM_2	Chorismate mutase type II. Chorismate mutase EC:5.4.99.5 catalyzes the conversion of chorismate to prephenate in the pathway of tyrosine and phenylalanine biosynthesis. This enzyme is negatively regulated by tyrosine, tryptophan and phenylalanine.	79
396402	pfam01818	Translat_reg	Bacteriophage translational regulator. The translational regulator protein regA is encoded by the T4 bacteriophage and binds to a region of messenger RNA (mRNA) that includes the initiator codon. RegA is unusual in that it represses the translation of about 35 early T4 mRNAs but does not affect nearly 200 other mRNAs.	122
396403	pfam01819	Levi_coat	Levivirus coat protein. The Levivirus coat protein forms the bacteriophage coat that encapsidates the viral RNA. 180 copies of this protein form the virion shell. The MS2 bacteriophage coat protein controls two distinct processes: sequence-specific RNA encapsidation and repression of replicase translation, by binding to an RNA stem-loop structure of 19 nucleotides containing the initiation codon of the replicase gene. The binding of a coat protein dimer to this hairpin shuts off synthesis of the viral replicase, switching the viral replication cycle to virion assembly rather than continued replication.	132
396404	pfam01820	Dala_Dala_lig_N	D-ala D-ala ligase N-terminus. This family represents the N-terminal region of the D-alanine--D-alanine ligase enzyme EC:6.3.2.4 which is thought to be involved in substrate binding. D-Alanine is one of the central molecules of the cross-linking step of peptidoglycan assembly. There are three enzymes involved in the D-alanine branch of peptidoglycan biosynthesis: the pyridoxal phosphate-dependent D-alanine racemase (Alr), the ATP-dependent D-alanine:D-alanine ligase (Ddl), and the ATP-dependent D-alanine:D-alanine-adding enzyme (MurF). This domain is structurally related to the PreATP-grasp domain.	118
396405	pfam01821	ANATO	Anaphylotoxin-like domain. C3a, C4a and C5a anaphylatoxins are protein fragments generated enzymatically in serum during activation of complement molecules C3, C4, and C5. They induce smooth muscle contraction. These fragments are homologous to a three-fold repeat in fibulins.	36
396406	pfam01822	WSC	WSC domain. This domain may be involved in carbohydrate binding.	80
396407	pfam01823	MACPF	MAC/Perforin domain. The membrane-attack complex (MAC) of the complement system forms transmembrane channels. These channels disrupt the phospholipid bilayer of target cells, leading to cell lysis and death. A number of proteins participate in the assembly of the MAC. Freshly activated C5b binds to C6 to form a C5b-6 complex, then to C7 forming the C5b-7 complex. The C5b-7 complex binds to C8, which is composed of three chains (alpha, beta, and gamma), thus forming the C5b-8 complex. C5b-8 subsequently binds to C9 and acts as a catalyst in the polymerization of C9. Active MAC has a subunit composition of C5b-C6-C7-C8-C9{n}. Perforin is a protein found in cytolytic T-cells and killer cells. In the presence of calcium, perforin polymerizes into transmembrane tubules and is capable of lysing, non-specifically, a variety of target cells. There are a number of regions of similarity in the sequences of complement components C6, C7, C8-alpha, C8-beta, C9 and perforin. The X-ray crystal structure of a MACPF domain reveals that it shares a common fold with bacterial cholesterol dependent cytolysins (pfam01289) such as perfringolysin O. Three key pieces of evidence suggest that MACPF domains and CDCs are homologous: functional similarity (pore formation), conservation of three glycine residues at a hinge in both families and conservation of a complex core fold.	210
280070	pfam01824	MatK_N	MatK/TrnK amino terminal region. The function of this region is unknown.	331
396408	pfam01825	GPS	GPCR proteolysis site, GPS, motif. The GPS motif is found in GPCRs, and is the site for auto-proteolysis, so is thus named, GPS. The GPS motif is a conserved sequence of ~40 amino acids containing canonical cysteine and tryptophan residues, and is the most highly conserved part of the domain. In most, if not all, cell-adhesion GPCRs these undergo autoproteolysis in the GPS between a conserved aliphatic residue (usually a leucine) and a threonine, serine, or cysteine residue. In higher eukaryotes this motif is found embedded in the C-terminal beta-stranded part of a GAIN domain - GPCR-Autoproteolysis INducing (GAIN). The GAIN-GPS domain adopts a fold in which the GPS motif, at the C-terminus, forms five beta-strands that are tightly integrated into the overall GAIN domain. The GPS motif, evolutionarily conserved from tetrahymena to mammals, is the only extracellular domain shared by all human cell-adhesion GPCRs and PKD proteins, and is the locus of multiple human disease mutations. The GAIN-GPS domain is both necessary and sufficient functionally for autoproteolysis, suggesting an autoproteolytic mechanism whereby the overall GAIN domain fine-tunes the chemical environment in the GPS to catalyze peptide bond hydrolysis. In the cell-adhesion GPCRs and PKD proteins, the GPS motif is always located at the end of their long N-terminal extracellular regions, immediately before the first transmembrane helix of the respective protein.	46
396409	pfam01826	TIL	Trypsin Inhibitor like cysteine rich domain. This family contains trypsin inhibitors as well as a domain found in many extracellular proteins. The domain typically contains ten cysteine residues that form five disulphide bonds. The cysteine residues that form the disulphide bonds are 1-7, 2-6, 3-5, 4-10 and 8-9.	55
396410	pfam01827	FTH	FTH domain. This presumed domain is likely to be a protein-protein interaction module. It is found in many proteins from C. elegans. The domain is found associated with the F-box pfam00646. This domain is named FTH after FOG-2 homology domain.	141
396411	pfam01828	Peptidase_A4	Peptidase A4 family. 	206
396412	pfam01829	Peptidase_A6	Peptidase A6 family. 	314
280076	pfam01830	Peptidase_C7	Peptidase C7 family. 	243
280077	pfam01831	Peptidase_C16	Peptidase C16 family. 	249
396413	pfam01832	Glucosaminidase	Mannosyl-glycoprotein endo-beta-N-acetylglucosaminidase. This family includes Mannosyl-glycoprotein endo-beta-N-acetylglucosaminidase EC:3.2.1.96, as well as the flagellar protein J, which has been shown to hydrolyze peptidoglycan.	91
396414	pfam01833	TIG	IPT/TIG domain. This family consists of a domain that has an immunoglobulin like fold. These domains are found in cell surface receptors such as Met and Ron as well as in intracellular transcription factors where it is involved in DNA binding. CAUTION: This family does not currently recognize a significant number of members.	85
396415	pfam01834	XRCC1_N	XRCC1 N terminal domain. 	148
396416	pfam01835	A2M_N	MG2 domain. This is the MG2 (macroglobulin) domain of alpha-2-macroglobulin.	96
396417	pfam01837	HcyBio	Homocysteine biosynthesis enzyme, sulfur-incorporation. This presumed domain is about 360 residues long. The function of this domain is unknown. It is found in some proteins that have two C-terminal CBS pfam00571 domains. There are also proteins that contain two inserted Fe4S domains near the C-terminal end of the domain. The Methanothermobacter thermautotrophicus gene MTH_855 product has been misannotated as an inosine monophosphate dehydrogenase based on the similarity to the CBS domains. Based on genetic analyses in the methanogen Methanosarcina acetivorans, this family is a key component of the metabolic network for sulfide assimilation and trafficking in methanogens. It is essential to a novel, O-acetylhomoserine sulfhydrylase-independent pathway for homocysteine biosynthesis, and may catalyze sulfur incorporation into the side chain of an as yet unidentified amino acid precursor. The DUF39-CBS and DUF39-ferredoxin architectures repeatedly occur together in the genomes of methanogenic Archaea, suggesting they may be of diverged function. This is consistent with a phylogenetic reconstruction of the DUF39 family, which clearly distinguishes the CBS-associated and ferredoxin-associated DUF39s.	350
396418	pfam01839	FG-GAP	FG-GAP repeat. This family contains the extracellular repeat that is found in up to seven copies in alpha integrins. This repeat has been predicted to fold into a beta propeller structure. The repeat is called the FG-GAP repeat after two conserved motifs in the repeat. The FG-GAP repeats are found in the N-terminus of integrin alpha chains, a region that has been shown to be important for ligand binding. A putative Ca2+ binding motif is found in some of the repeats.	36
396419	pfam01840	TCL1_MTCP1	TCL1/MTCP1 family. Two related oncogenes, TCL-1 and MTCP-1, are overexpressed in T cell prolymphocytic leukaemias as a result of chromosomal rearrangements that involve the translocation of one T cell receptor gene to either chromosome 14q32 or Xq28. This family contains two repeated motifs that form a single globular domain.	118
376628	pfam01841	Transglut_core	Transglutaminase-like superfamily. This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologs of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease.	108
396420	pfam01842	ACT	ACT domain. This family of domains generally have a regulatory role. ACT domains are linked to a wide range of metabolic enzymes that are regulated by amino acid concentration. Pairs of ACT domains bind specifically to a particular amino acid leading to regulation of the linked enzyme. The ACT domain is found in: D-3-phosphoglycerate dehydrogenase EC:1.1.1.95, which is inhibited by serine. Aspartokinase EC:2.7.2.4, which is regulated by lysine. Acetolactate synthase small regulatory subunit, which is inhibited by valine. Phenylalanine-4-hydroxylase EC:1.14.16.1, which is regulated by phenylalanine. Prephenate dehydrogenase EC:4.2.1.51. Formyltetrahydrofolate deformylase EC:3.5.1.10, which is activated by methionine and inhibited by glycine. GTP pyrophosphokinase EC:2.7.6.5.	66
396421	pfam01843	DIL	DIL domain. The DIL domain has no known function.	103
396422	pfam01844	HNH	HNH endonuclease. His-Asn-His (HNH) proteins are a very common family of small nucleic acid-binding proteins that are generally associated with endonuclease activity.	47
396423	pfam01845	CcdB	CcdB protein. 	99
396424	pfam01846	FF	FF domain. This domain has been predicted to be involved in protein-protein interaction. This domain was recently shown to bind the hyperphosphorylated C-terminal repeat domain of RNA polymerase II, confirming its role in protein-protein interactions.	50
396425	pfam01847	VHL	VHL beta domain. VHL forms a ternary complex with the elonginB and elonginC proteins. This complex binds Cul2, which then is involved in regulation of vascular endothelial growth factor mRNA.	82
396426	pfam01848	HOK_GEF	Hok/gef family. 	42
396427	pfam01849	NAC	NAC domain. 	54
396428	pfam01850	PIN	PIN domain. 	121
396429	pfam01851	PC_rep	Proteasome/cyclosome repeat. 	35
396430	pfam01852	START	START domain. 	205
396431	pfam01853	MOZ_SAS	MOZ/SAS family. This region of these proteins has been suggested to be homologous to acetyltransferases.	179
396432	pfam01855	POR_N	Pyruvate flavodoxin/ferredoxin oxidoreductase, thiamine diP-bdg. This family includes the N terminal structural domain of the pyruvate ferredoxin oxidoreductase. This domain binds thiamine diphosphate, and along with domains II and IV, is involved in inter subunit contacts. The family also includes pyruvate flavodoxin oxidoreductase as encoded by the nifJ gene in cyanobacterium which is required for growth on molecular nitrogen when iron is limited.	230
280099	pfam01856	HP_OMP	Helicobacter outer membrane protein. This family seems confined to Helicobacter. It is predicted to be an outer membrane protein based on its pattern of alternating hydrophobic amino acids similar to porins.	154
396433	pfam01857	RB_B	Retinoblastoma-associated protein B domain. The crystal structure of the Rb pocket bound to a nine-residue E7 peptide containing the LxCxE motif, shared by other Rb-binding viral and cellular proteins, shows that the LxCxE peptide binds a highly conserved groove on the B domain. The B domain has a cyclin fold.	131
396434	pfam01858	RB_A	Retinoblastoma-associated protein A domain. This domain has the cyclin fold as predicted.	195
396435	pfam01861	DUF43	Protein of unknown function DUF43. This family includes archaebacterial proteins of unknown function. All the members are 350-400 amino acids long.	243
396436	pfam01862	PvlArgDC	Pyruvoyl-dependent arginine decarboxylase (PvlArgDC). Methanococcus jannaschii contains homologs of most genes required for spermidine polyamine biosynthesis. Yet genomes from neither this organism nor any other euryarchaeon have orthologues of the pyridoxal 5'-phosphate-dependent ornithine or arginine decarboxylase genes, required to produce putrescine. Instead, these organisms have a new class of arginine decarboxylase (PvlArgDC) formed by the self-cleavage of a proenzyme into a 5-kDa subunit and a 12-kDa subunit that contains a reactive pyruvoyl group. Although this extremely thermostable enzyme has no significant sequence similarity to previously characterized proteins, conserved active site residues are similar to those of the pyruvoyl-dependent histidine decarboxylase enzyme, and its subunits form a similar (alpha-beta)(3) complex. Homologs of PvlArgDC are found in several bacterial genomes, including those of Chlamydia spp., which have no agmatine ureohydrolase enzyme to convert agmatine (decarboxylated arginine) into putrescine. In these intracellular pathogens, PvlArgDC may function analogously to pyruvoyl-dependent histidine decarboxylase; the cells are proposed to import arginine and export agmatine, increasing the pH and affecting the host cell's metabolism. Phylogenetic analysis of PvlArgDC proteins suggests that this gene has been recruited from the euryarchaeal polyamine biosynthetic pathway to function as a degradative enzyme in bacteria.	162
396437	pfam01863	DUF45	Protein of unknown function DUF45. This protein has no known function. Members are found in some archaebacteria, as well as Helicobacter pylori. The proteins are 190-240 amino acids long, with the C-terminus being the most conserved region, containing three conserved histidines. This motif is similar to that found in Zinc proteases, suggesting that this family may also be proteases.	207
280105	pfam01864	CarS-like	CDP-archaeol synthase. CDP-archaeol synthase functions in the archaeal lipid biosynthetic pathway. It catalyzes the transfer of the nucleotide to its specific archaeal lipid substrate, leading to the formation of a CDP-activated precursor (CDP-archaeol) to which polar head groups are attached. Bacterial members of this family are uncharacterized.	175
280106	pfam01865	PhoU_div	Protein of unknown function DUF47. This family includes prokaryotic proteins of unknown function, as well as a protein annotated as the pit accessory protein from Sinorhizobium meliloti. However, the function of this protein is also unknown (Pit stands for Phosphate transport). It is probably distantly related to pfam01895 (personal obs:Yeats C).	214
396438	pfam01866	Diphthamide_syn	Putative diphthamide synthesis protein. Diphthamide_syn, diphthamide synthase, catalyzes the last amidation step of diphthamide biosynthesis using ammonium and ATP. Human DPH1 is a candidate tumor suppressor gene. DPH2 from yeast, which confers resistance to diphtheria toxin has been found to be involved in diphthamide synthesis. Diphtheria toxin inhibits eukaryotic protein synthesis by ADP-ribosylating diphthamide, a post-translationally modified histidine residue present in EF2. Diphthamide synthase is evolutionarily conserved in eukaryotes. Diphthamide is a post-translationally modified histidine residue found on archaeal and eukaryotic translation elongation factor 2 (eEF-2).	302
396439	pfam01867	Cas_Cas1	CRISPR associated protein Cas1. Clustered regularly interspaced short palindromic repeats (CRISPRs) are a family of DNA direct repeats found in many prokaryotic genomes. This family of proteins corresponds to Cas1, a CRISPR-associated protein. Cas1 may be involved in linking DNA segments to CRISPR.	283
396440	pfam01868	UPF0086	Domain of unknown function UPF0086. This family consists of several archaeal and eukaryotic proteins. The archaeal proteins are found to be expressed within ribosomal operons and several of the sequences are described as ribonuclease P protein subunit p29 proteins.	83
396441	pfam01869	BcrAD_BadFG	BadF/BadG/BcrA/BcrD ATPase family. This family includes the BadF and BadG proteins that are two subunits of Benzoyl-CoA reductase, that may be involved in ATP hydrolysis. The family also includes an activase subunit from the enzyme 2-hydroxyglutaryl-CoA dehydratase. Aquifex aeolicus aq_278 contains two copies of this region suggesting that the family may structurally dimerize. This family appears to be related to pfam00370.	271
396442	pfam01870	Hjc	Archaeal holliday junction resolvase (hjc). This family of archaebacterial proteins are holliday junction resolvases (hjc gene). The Holliday junction is an essential intermediate of homologous recombination. This protein is the archaeal equivalent of RuvC but is not sequence similar.	91
396443	pfam01871	AMMECR1	AMMECR1. This family consists of several AMMECR1 as well as several uncharacterized proteins. The contiguous gene deletion syndrome AMME is characterized by Alport syndrome, midface hypoplasia, mental retardation and elliptocytosis and is caused by a deletion in Xq22.3, comprising several genes including COL4A5, FACL4 and AMMECR1. This family contains sequences from several eukaryotic species as well as archaebacteria and it has been suggested that the AMMECR1 protein may have a basic cellular function, potentially in either the transcription, replication, repair or translation machinery.	167
396444	pfam01872	RibD_C	RibD C-terminal domain. The function of this domain is not known, but it is thought to be involved in riboflavin biosynthesis. This domain is found in the C-terminus of RibD/RibG, in combination with pfam00383, as well as in isolation in some archaebacterial proteins. This family appears to be related to pfam00186.	196
396445	pfam01873	eIF-5_eIF-2B	Domain found in IF2B/IF5. This family includes the N-terminus of eIF-5, and the C-terminus of eIF-2 beta. This region corresponds to the whole of the archaebacterial eIF-2 beta homolog. The region contains a putative zinc binding C4 finger.	115
396446	pfam01874	CitG	ATP:dephospho-CoA triphosphoribosyl transferase. The citG gene is found in a gene cluster with citrate lyase subunits. The function of the CitG protein was elucidated as ATP:dephospho-CoA triphosphoribosyl transferase.	258
280116	pfam01875	Memo	Memo-like protein. This family contains members from all branches of life. The molecular function of this protein is unknown, but Memo (mediator of ErbB2-driven cell motility) a human protein is included in this family. It has been suggested that Memo controls cell migration by relaying extracellular chemotactic signals to the microtubule cytoskeleton.	271
396447	pfam01876	RNase_P_p30	RNase P subunit p30. This protein is part of the RNase P complex that is involved in tRNA maturation.	214
396448	pfam01877	RNA_binding	RNA binding. PH1010 is composed of five alpha-helices (1-5) and eight beta-strands (1-8) with the following topology: beta-1, alpha-1, beta-2, beta-3, alpha-2, alpha-3, beta-4, beta-5, alpha-4, beta-6, alpha-5, beta-7, beta-8. The first six beta-strands (1-6) form a slightly twisted antiparallel beta-sheet and face five alpha-helices on one side. The last two beta-strands form an antiparallel beta-sheet in the C-terminus. PH1010 forms a characteristic homodimer structure in the crystal. Dimerization of the molecule is crucial for function. The structure resembles that of some ribosomal proteins such as the 50S ribosomal protein L5. Although the structure resembles that of the RRM-type RNA-binding domain of the ribosomal L5 protein, the residues involved in RNA-binding in the L5 protein are not conserved in this family. Despite this, these proteins bind to double-stranded RNA in a non-sequence specific manner.	113
396449	pfam01878	EVE	EVE domain. This domain was formerly known as DUF55. Crystal structures have shown that this domain is part of the PUA superfamily. This domain has been named EVE and is thought to be RNA-binding.	146
396450	pfam01880	Desulfoferrodox	Desulfoferrodoxin. Desulfoferrodoxins contain two types of iron: an Fe-S4 site very similar to that found in desulforedoxin from Desulfovibrio gigas and an octahedral coordinated high-spin ferrous site most probably with nitrogen/oxygen-containing ligands. Due to this rather unusual combination of active centers, this novel protein is named desulfoferrodoxin.	97
396451	pfam01881	Cas_Cas6	CRISPR associated protein Cas6. This group of families is one of several protein families that are always found associated with prokaryotic CRISPRs, themselves a family of clustered regularly interspaced short palindromic repeats, DNA repeats found in nearly half of all bacterial and archaeal genomes. These DNA repeat regions have a remarkably regular structure: unique sequences of constant size, called spacers, sit between each pair of repeats. It has been shown that the CRISPRs are virus-derived sequences acquired by the host to enable them to resist viral infection. The Cas proteins from the host use the CRISPRs to mediate an antiviral response. After transcription of the CRISPR, a complex of Cas proteins termed Cascade cleaves a CRISPR RNA precursor in each repeat and retains the cleavage products containing the virus-derived sequence. Assisted by the helicase Cas3, these mature CRISPR RNAs then serve as small guide RNAs that enable Cascade to interfere with virus proliferation. Cas5 contains an endonuclease motif, whose inactivation leads to loss of resistance, even in the presence of phage-derived spacers.	108
396452	pfam01882	DUF58	Protein of unknown function DUF58. This family of prokaryotic proteins have no known function. Caldicellulosiruptor saccharolyticus PepX, a protein of unknown function in the family, has been misannotated as alpha-dextrin 6-glucanohydrolase.	86
396453	pfam01883	FeS_assembly_P	Iron-sulfur cluster assembly protein. This family has an alpha/beta topology, with 13 conserved hydrophobic residues at its core and a putative active site containing a highly conserved cysteine. Members of this family are involved in a range of physiological functions. The family includes PaaJ (PhaH) from Pseudomonas putida. PaaJ forms a complex with PaaG (PhaF), PaaI (PhaG) and PaaK (PhaI), which hydroxylates phenylacetic acid to 2-hydroxyphenylacetic acid. It also includes PaaD from Escherichia coli, a member of a multicomponent oxygenase involved in phenylacetyl-CoA hydroxylation. Furthermore, several members of this family are shown to be involved in iron-sulfur (FeS) cluster assembly. Iron-sulfur (FeS) clusters are inorganic co-factors that are able to transfer electrons and act as catalysts. They are involved in diverse cellular processes including cellular respiration, DNA replication and repair, antibiotic resistance, and dinitrogen fixation. The biogenesis of such clusters from elemental iron and sulfur is an enzymatic process that requires a set of specialized proteins. Proteins containing this domain include the chloroplast protein HCF101 (high chlorophyll fluorescence 101), which has been described as an essential and specific factor for assembly of [4Fe-4S]-cluster-containing protein complexes such as the membrane complex Photosystem I (PSI) and the heterodimeric FTR (ferredoxin-thioredoxin reductase) complex and is involved in the assembly of [4Fe-4S] clusters and their transfer to apoproteins. The mature HCF101 protein contains an N-terminal DUF59 domain as well as eight cysteine residues along the sequence. All cysteine residues are conserved among higher plants, but of the two cysteine residues located in the DUF59 domain only Cys128 is highly conserved and is present in the highly conserved P-loop domain of the plant HCF101 (CKGGVGKS). SufT protein from Staphylococcus aureus is composed of DUF59 solely and is shown to be involved in the maturation of FeS proteins. Given all this data, it is hypothesized that DUF59 might play a role in FeS cluster assembly.	72
396454	pfam01884	PcrB	PcrB family. This family contains proteins that are related to PcrB. The function of these proteins is unknown.	226
396455	pfam01885	PTS_2-RNA	RNA 2'-phosphotransferase, Tpt1 / KptA family. Tpt1 catalyzes the last step of tRNA splicing in yeast. It transfers the splice junction 2'-phosphate from ligated tRNA to NAD, to produce ADP-ribose 1"-2"-cyclic phosphate. This is presumed to be followed by a transesterification step to release the RNA. The first step of this reaction is similar to that catalyzed by some bacterial toxins. E. coli KptA and mouse Tpt1 are likely to use the same reaction mechanism.	172
396456	pfam01886	DUF61	Protein of unknown function DUF61. Protein found in Archaebacteria. These proteins have no known function.	121
396457	pfam01887	SAM_adeno_trans	S-adenosyl-l-methionine hydroxide adenosyltransferase. This is a family of proteins, previously known as DUF62, found in archaebacteria and bacteria. The structure of proteins in this family is similar to that of a bacterial fluorinating enzyme. S-adenosyl-l-methionine hydroxide adenosyltransferases utilize a rigorously conserved amino acid side chain triad (Asp-Arg-His) which may have a role in activating water to hydroxide ion.	217
396458	pfam01888	CbiD	CbiD. CbiD is essential for cobalamin biosynthesis in both S. typhimurium and B. megaterium, but no functional role has been ascribed to the protein. The CbiD protein has a putative S-AdoMet binding site. It is possible that CbiD might have the same role as CobF in undertaking the C-1 methylation and deacylation reactions required during the ring contraction process.	258
396459	pfam01889	DUF63	Membrane protein of unknown function DUF63. Proteins found in Archaebacteria of unknown function. These proteins are probably transmembrane proteins.	270
396460	pfam01890	CbiG_C	Cobalamin synthesis G C-terminus. Members of this family are involved in cobalamin synthesis. The protein encoded by Synechocystis sp. cbiH represents a fusion between cbiH and cbiG. As other multi-functional proteins involved in cobalamin biosynthesis catalyze adjacent steps in the pathway, including CysG, CobL (CbiET), CobIJ and CobA-HemD, it is therefore possible that CbiG catalyzes a reaction step adjacent to CbiH. In the anaerobic pathway such a step could be the formation of a gamma lactone, which is thought to help to mediate the anaerobic ring contraction process. Within the cobalamin synthesis pathway CbiG catalyzes both the opening of the lactone ring and the extrusion of the two-carbon fragment of cobalt-precorrin-5A from C-20 and its associated methyl group (deacylation) to give cobalt-precorrin-5B. This family is the C-terminal region, and the mid- and N-terminal parts are conserved independently in other families.	120
396461	pfam01891	CbiM	Cobalt uptake substrate-specific transmembrane region. This family of proteins forms part of the cobalt-transport complex in prokaryotes, CbiMNQO. CbiMNQO and NikMNQO are the most widespread groups of microbial transporters for cobalt and nickel ions and are unusual uptake systems as they consist of, e.g., two transmembrane components (CbiM and CbiQ), a small membrane-bound component (CbiN) and an ATP-binding protein (CbiO) but no extracytoplasmic solute-binding protein. Similar components constitute the nickel transporters with some variability in the small membrane-bound component, either NikN or NikL, which are not similar to CbiN at the sequence level. CbiM is the substrate-specific component of the complex and is a seven-transmembrane protein. The CbiMNQO and NikMNQO systems form part of the coenzyme B12 biosynthesis pathway. The NikM protein is pfam10670.	202
396462	pfam01893	UPF0058	Uncharacterized protein family UPF0058. This archaebacterial protein has no known function.	86
396463	pfam01894	UPF0047	Uncharacterized protein family UPF0047. This family has no known function. The alignment contains a conserved aspartate and histidine that may be functionally important.	116
396464	pfam01895	PhoU	PhoU domain. This family contains phosphate regulatory proteins including PhoU. PhoU proteins are known to play a role in the regulation of phosphate uptake. The PhoU domain is composed of a three helix bundle. The PhoU protein contains two copies of this domain. The domain binds to an iron cluster via its conserved E/DXXXD motif.	87
396465	pfam01896	DNA_primase_S	DNA primase small subunit. DNA primase synthesizes the RNA primers for the Okazaki fragments in lagging strand DNA synthesis. DNA primase is a heterodimer of large and small subunits. This family also includes baculovirus late expression factor 1 or LEF-1 proteins. Baculovirus LEF-1 is a DNA primase enzyme. The family also contains many bacterial DNA primases.	158
396466	pfam01899	MNHE	Na+/H+ ion antiporter subunit. Subunit of a Na+/H+ Prokaryotic antiporter complex.	150
396467	pfam01900	RNase_P_Rpp14	Rpp14/Pop5 family. tRNA processing enzyme ribonuclease P (RNase P) consists of an RNA molecule associated with at least eight protein subunits, hPop1, Rpp14, Rpp20, Rpp25, Rpp29, Rpp30, Rpp38, and Rpp40. This protein is known as Pop5 in eukaryotes.	102
396468	pfam01901	O_anti_polymase	Putative O-antigen polymerase. Archaebacterial proteins of unknown function. Members of this family may be transmembrane proteins. These are potentially O-antigen assembly enzymes, with up to 11 transmembrane regions.	337
280139	pfam01902	Diphthami_syn_2	Diphthamide synthase. Diphthamide_syn, diphthamide synthase, catalyzes the last amidation step of diphthamide biosynthesis using ammonium and ATP. Diphthamide synthase is evolutionarily conserved in eukaryotes. Diphthamide is a post-translationally modified histidine residue found on archaeal and eukaryotic translation elongation factor 2 (eEF-2). In some members of this family this domain is associated with pfam01042. The enzyme classification is EC:6.3.1.14.	219
396469	pfam01903	CbiX	CbiX. The function of CbiX is uncertain, however it is found in cobalamin biosynthesis operons and so may have a related function. Some CbiX proteins contain a striking histidine-rich region at their C-terminus, which suggests that it might be involved in metal chelation.	106
396470	pfam01904	DUF72	Protein of unknown function DUF72. The function of this family is unknown.	219
396471	pfam01905	DevR	CRISPR-associated negative auto-regulator DevR/Csa2. This group of families is one of several protein families that are always found associated with prokaryotic CRISPRs, themselves a family of clustered regularly interspaced short palindromic repeats, DNA repeats found in nearly half of all bacterial and archaeal genomes. These DNA repeat regions have a remarkably regular structure: unique sequences of constant size, called spacers, sit between each pair of repeats. It has been shown that the CRISPRs are virus-derived sequences acquired by the host to enable them to resist viral infection. The Cas proteins from the host use the CRISPRs to mediate an antiviral response. After transcription of the CRISPR, a complex of Cas proteins termed Cascade cleaves a CRISPR RNA precursor in each repeat and retains the cleavage products containing the virus-derived sequence. Assisted by the helicase Cas3, these mature CRISPR RNAs then serve as small guide RNAs that enable Cascade to interfere with virus proliferation. Cas5 contains an endonuclease motif, whose inactivation leads to loss of resistance, even in the presence of phage-derived spacers. This family used to be known as DUF73. DevR appears to be a negative auto-regulator within the system.	268
396472	pfam01906	YbjQ_1	Putative heavy-metal-binding. From comparative structural analysis, this family is likely to be a heavy-metal binding domain. The domain oligomerizes as a pentamer. The domain is about 100 amino acids long and is found in prokaryotes.	99
396473	pfam01907	Ribosomal_L37e	Ribosomal protein L37e. This family includes ribosomal protein L37 from eukaryotes and archaebacteria. The family contains many conserved cysteines and histidines suggesting that this protein may bind to zinc.	53
396474	pfam01909	NTP_transf_2	Nucleotidyltransferase domain. Members of this family belong to a large family of nucleotidyltransferases. This family includes kanamycin nucleotidyltransferase (KNTase) which is a plasmid-coded enzyme responsible for some types of bacterial resistance to aminoglycosides. KNTase inactivates antibiotics by catalyzing the addition of a nucleotidyl group onto the drug.	91
396475	pfam01910	Thiamine_BP	Thiamine-binding protein. The crystal structure of two of these members shows that this domain has a ferredoxin like fold and is likely to exist as at least homodimers. Sulphate ions are located at the dimer interfaces, which are thought to confer additional stability. Although the function of this domain remains to be identified, its structure suggests a role in protein-protein interactions possibly regulated by the binding of small-molecule ligands. Solution of the structure of the hyperthermophilic anaerobic Thermotoga maritima sequence, UniProtKB:Q9WYV6, shows that this has a beta-alpha-beta-beta-alpha-beta ferredoxin-like fold and assembles as a homotetramer. It was possible to identify a pocket in each monomer that bound an unidentified ligand. It was also found that it bound charged thiamine though not hydroxymethyl pyrimidine. It is proposed that it is transporting charged thiamine around the cytoplasm. Under oxidative conditions this bacterium is under stress, and the transcriptional unit within which this protein is expressed is up-regulated in these conditions, suggesting that the chelation of cytoplasmic thiamine is part of the response mechanism to such oxidative stress, which is mediated by this family.	92
396476	pfam01912	eIF-6	eIF-6 family. This family includes eukaryotic translation initiation factor 6 as well as presumed archaebacterial homologs.	196
396477	pfam01913	FTR	Formylmethanofuran-tetrahydromethanopterin formyltransferase. This enzyme EC:2.3.1.101 is involved in archaebacteria in the formation of methane from carbon dioxide. N-terminal distal lobe of alpha+beta ferredoxin-like fold. SCOP reports fold duplication with C-terminal proximal lobe.	144
280149	pfam01914	MarC	MarC family integral membrane protein. Integral membrane protein family that includes the protein MarC. MarC was thought to be a multiple antibiotic resistance protein. Nevertheless, a study has shown that MarC is not involved in multiple antibiotic resistance. The function of this family is unclear.	203
396478	pfam01915	Glyco_hydro_3_C	Glycosyl hydrolase family 3 C-terminal domain. This domain is involved in catalysis and may be involved in binding beta-glucan. This domain is found associated with pfam00933.	216
396479	pfam01916	DS	Deoxyhypusine synthase. Eukaryotic initiation factor 5A (eIF-5A) contains an unusual amino acid, hypusine [N epsilon-(4-aminobutyl-2-hydroxy)lysine]. The first step in the post-translational formation of hypusine is catalyzed by the enzyme deoxyhypusine synthase (DS) EC:1.1.1.249. The modified version of eIF-5A, and DS, are required for eukaryotic cell proliferation.	284
396480	pfam01917	Arch_flagellin	Archaebacterial flagellin. Members of this family are the proteins that form the flagella in archaebacteria.	160
396481	pfam01918	Alba	Alba. Alba is a novel chromosomal protein that coats archaeal DNA without compacting it.	66
396482	pfam01920	Prefoldin_2	Prefoldin subunit. This family includes prefoldin subunits that are not detected by pfam02996.	102
396483	pfam01921	tRNA-synt_1f	tRNA synthetases class I (K). This family includes only lysyl tRNA synthetases from prokaryotes.	357
396484	pfam01922	SRP19	SRP19 protein. The signal recognition particle (SRP) binds to the signal peptide of proteins as they are being translated. The binding of the SRP halts translation and the complex is then transported to the endoplasmic reticulum's cytoplasmic surface. The SRP then aids translocation of the protein through the ER membrane. The SRP is a ribonucleoprotein that is composed of a small RNA and several proteins. One of these proteins is the SRP19 protein (Sec65 in yeast).	94
396485	pfam01923	Cob_adeno_trans	Cobalamin adenosyltransferase. This family contains the gene products of PduO and EutT, which are both cobalamin adenosyltransferases. PduO is a protein with ATP:cob(I)alamin adenosyltransferase activity. The main role of this protein is the conversion of inactive cobalamins to AdoCbl for 1,2-propanediol degradation. The EutT enzyme appears to be an adenosyl transferase, converting CNB12 to AdoB12.	163
396486	pfam01924	HypD	Hydrogenase formation hypA family. HypD is involved in hydrogenase formation. It contains many possible metal binding residues, which may bind to nickel. Transposon Tn5 insertions into hypD resulted in R. leguminosarum mutants that lacked any hydrogenase activity in symbiosis with peas.	352
396487	pfam01925	TauE	Sulfite exporter TauE/SafE. This is a family of integral membrane proteins where the alignment appears to contain two duplicated modules of three transmembrane helices. The proteins are involved in the transport of anions across the cytoplasmic membrane during taurine metabolism as an exporter of sulfoacetate. This family used to be known as DUF81.	235
396488	pfam01926	MMR_HSR1	50S ribosome-binding GTPase. The full-length GTPase protein is required for the complete activity of the protein of interacting with the 50S ribosome and binding of both adenine and guanine nucleotides, with a preference for guanine nucleotide.	113
396489	pfam01927	Mut7-C	Mut7-C RNAse domain. RNAse domain of the PIN fold with an inserted Zinc Ribbon at the C-terminus.	145
396490	pfam01928	CYTH	CYTH domain. These sequences are functionally identified as members of the adenylate cyclase family, which catalyzes the conversion of ATP to 3',5'-cyclic AMP and pyrophosphate. Six distinct non-homologous classes of AC have been identified. The structures of three classes of adenylyl cyclases have been solved.	172
396491	pfam01929	Ribosomal_L14e	Ribosomal protein L14. This family includes the eukaryotic ribosomal protein L14.	75
396492	pfam01930	Cas_Cas4	Domain of unknown function DUF83. This domain has no known function. The domain contains three conserved cysteines at its C-terminus.	162
396493	pfam01931	NTPase_I-T	Protein of unknown function DUF84. NTPase_I-T is a family of NTPases with supreme activity against ITP and XTP. Active site analysis and structure comparison of YjjX strongly suggested that it is an NTP binding protein with nucleoside triphosphatase activity. YjjX exhibits a mixed alpha-beta fold.	163
396494	pfam01933	UPF0052	Uncharacterized protein family UPF0052. 	249
396495	pfam01934	DUF86	Protein of unknown function DUF86. The function of members of this family is unknown.	120
376671	pfam01935	DUF87	Domain of unknown function DUF87. The function of this prokaryotic domain is unknown. It contains several conserved aspartates and histidines that could be metal ligands.	220
376672	pfam01936	NYN	NYN domain. These domains are found in the eukaryotic proteins typified by the Nedd4-binding protein 1 and the bacterial YacP-like proteins (Nedd4-BP1, YacP nucleases; NYN domains). The NYN domain shares a common protein fold with two other previously characterized groups of nucleases, namely the PIN (PilT N-terminal) and FLAP/5' --> 3' exonuclease superfamilies. These proteins share a common set of 4 acidic conserved residues that are predicted to constitute their active site. Based on the conservation of the acidic residues and structural elements Aravind and colleagues suggest that PIN and NYN domains are likely to bind only a single metal ion, unlike the FLAP/5' --> 3' exonuclease superfamily, which binds two metal ions. Based on conserved gene neighborhoods Aravind and colleagues infer that the bacterial members are likely to be components of the processome/degradosome that process tRNAs or ribosomal RNAs.	137
396496	pfam01937	DUF89	Protein of unknown function DUF89. This family has no known function.	303
396497	pfam01938	TRAM	TRAM domain. This small domain has no known function. However it may perform a nucleic acid binding role (Bateman A. unpublished observation).	59
280172	pfam01939	NucS	Endonuclease NucS. Endonuclease NucS cleaves both 3' and 5' ssDNA extremities of branched DNA structures and it binds to ssDNA.	229
396498	pfam01940	DUF92	Integral membrane protein DUF92. Members of this family have several predicted transmembrane helices. The function of these prokaryotic proteins is unknown.	238
396499	pfam01941	AdoMet_Synthase	S-adenosylmethionine synthetase (AdoMet synthetase). This family consists of several archaebacterial S-adenosylmethionine synthetases (AdoMet synthetase or MAT) (EC 2.5.1.6). S-Adenosylmethionine (AdoMet) occupies a central role in the metabolism of all cells. The biological roles of AdoMet include acting as the primary methyl group donor, as a precursor to the polyamines, and as a progenitor of a 5'-deoxyadenosyl radical. S-Adenosylmethionine synthetase catalyzes the only known route of AdoMet biosynthesis. The synthetic process occurs in a unique reaction in which the complete triphosphate chain is displaced from ATP and a sulfonium ion formed. MATs from various organisms contain ~400-amino acid polypeptide chains.	394
280175	pfam01943	Polysacc_synt	Polysaccharide biosynthesis protein. Members of this family are integral membrane proteins. Many members of the family are implicated in production of polysaccharide. The family includes RfbX part of the O antigen biosynthesis operon. The family includes SpoVB from Bacillus subtilis, which is involved in spore cortex biosynthesis.	273
396500	pfam01944	SpoIIM	Stage II sporulation protein M. SpoIIM is one of four stage II sporulation proteins that are necessary for the forespore inside the mother-cell to be properly internalized through the breakdown of peptidoglycans trapped between the membranes of the septum separating the forespore and the mother-cell. The four proteins working in sequence are SpoIIB, pfam05036, SpoIIM, SpoIIP, pfam07454, and finally SpoIID, pfam08486. D, M and P are in a complex with each other and the complex assembles in a hierarchical manner such that M, which serves as a membrane anchor, recruits P to the septum and P, in turn, recruits D to the septum.	172
396501	pfam01946	Thi4	Thi4 family. This family includes a putative thiamine biosynthetic enzyme.	230
366866	pfam01947	DUF98	Protein of unknown function (DUF98). This is a family of uncharacterized proteins.	149
396502	pfam01948	PyrI	Aspartate carbamoyltransferase regulatory chain, allosteric domain. The regulatory chain is involved in allosteric regulation of aspartate carbamoyltransferase. The N-terminal domain has ferredoxin-like fold, and provides the regulatory chain dimerization interface.	92
396503	pfam01949	DUF99	Protein of unknown function DUF99. The function of this archaebacterial protein family is unknown.	173
396504	pfam01950	FBPase_3	Fructose-1,6-bisphosphatase. This is a family of bacterial and archaeal fructose-1,6-bisphosphatases (FBPases). FBPase catalyzes the hydrolysis of D-fructose-1,6-bisphosphate (FBP) to D-fructose-6-phosphate (F6P) and orthophosphate and is an essential regulatory enzyme in the gluconeogenic pathway.	357
396505	pfam01951	Archease	Archease protein family (MTH1598/TM1083). This archease family of proteins, has two SHS2 domains, with one inserted into another. It is predicted to be an enzyme. It is predicted to act as a chaperone in DNA/RNA metabolism.	136
376681	pfam01954	DUF104	Protein of unknown function DUF104. This family includes short archaebacterial proteins of unknown function. Archaeoglobus fulgidus has twelve copies of this protein, with several being clustered together in the genome.	56
396506	pfam01955	CbiZ	Adenosylcobinamide amidohydrolase. This prokaryotic protein family includes CbiZ which converts adenosylcobinamide (AdoCbi) to adenosylcobyric acid (AdoCby), an intermediate of the de novo coenzyme B12 biosynthetic route.	193
396507	pfam01956	DUF106	Integral membrane protein DUF106. This archaebacterial protein family has no known function. Members are predicted to be integral membrane proteins.	169
396508	pfam01957	NfeD	NfeD-like C-terminal, partner-binding. NfeD-like proteins are widely distributed throughout prokaryotes and are frequently associated with genes encoding stomatin-like proteins (slipins). There appear to be three major groups: an ancestral group with only an N-terminal serine protease domain and this C-terminal beta sheet-rich domain which is structurally very similar to the OB-fold domain, associated with its neighboring slipin cluster; a second major group with an additional middle, membrane-spanning domain, associated in some species with eoslipin and in others with yqfA; a final 'artificial' group which unites truncated forms lacking the protease region and associated with their ancestral gene partner, either yqfA or eoslipin. This NfeD C-terminal domain appears to be the major one for relating to the associated protein. NfeD homologs are clearly reliant on their conserved gene neighbor which is assumed to be necessary for function, either through direct physical interaction or by functioning in the same pathway, possibly involved with lipid rafts.	90
396509	pfam01958	DUF108	Domain of unknown function DUF108. This family has no known function. It is found to compose the complete protein in archaebacteria and a single domain in a large C. elegans protein.	89
396510	pfam01959	DHQS	3-dehydroquinate synthase II (EC 1.4.1.24). 3-Dehydroquinate synthase II was isolated from the archaeon Methanocaldococcus jannaschii and plays a key role in an alternative pathway for the biosynthesis of 3-dehydroquinate (DHQ), an intermediate of the canonical pathway for the biosynthesis of aromatic amino acids. The enzyme catalyzes a two-step reaction - an oxidative deamination, followed by cyclization. The enzyme converts 2-amino-3,7-dideoxy-D-threo-hept-6-ulosonate to 3-dehydroquinate.	347
396511	pfam01960	ArgJ	ArgJ family. Members of the ArgJ family catalyze the first EC:2.3.1.1 and fifth steps EC:2.3.1.35 in arginine biosynthesis.	373
396512	pfam01963	TraB	TraB family. pAD1 is a haemolysin/bacteriocin plasmid originally identified in Enterococcus faecalis DS16. It encodes a mating response to a peptide sex pheromone, cAD1, secreted by recipient bacteria. Once the plasmid pAD1 is acquired, production of the pheromone ceases--a trait related in part to a determinant designated traB. However a related protein is found in C. elegans, suggesting that members of the TraB family have some more general function. This family also includes the bacterial GumN protein. The family has a conserved GXXH motif close to the N-terminus, a conserved glutamate and a conserved arginine that may be catalytic. The family also includes a second conserved GXXH motif near the C-terminus. This family also contains the Tiki proteins that regulate Wnt signalling.	260
396513	pfam01964	ThiC_Rad_SAM	Radical SAM ThiC family. ThiC is found within the thiamine biosynthesis operon. ThiC is involved in pyrimidine biosynthesis. ThiC participates in the formation of 4-Amino-5-hydroxymethyl-2-methylpyrimidine from AIR, an intermediate in the de novo pyrimidine biosynthesis. ThiC is a member of the radical SAM superfamily.	418
396514	pfam01965	DJ-1_PfpI	DJ-1/PfpI family. The family includes the protease PfpI. This domain is also found in transcriptional regulators.	165
396515	pfam01966	HD	HD domain. HD domains are metal dependent phosphohydrolases.	110
396516	pfam01967	MoaC	MoaC family. Members of this family are involved in molybdenum cofactor biosynthesis. However their molecular function is not known.	136
396517	pfam01968	Hydantoinase_A	Hydantoinase/oxoprolinase. This family includes the enzymes hydantoinase and oxoprolinase EC:3.5.2.9. Both reactions involve the hydrolysis of 5-membered rings via hydrolysis of their internal imide bonds.	288
396518	pfam01969	DUF111	Protein of unknown function DUF111. This prokaryotic family has no known function.	380
396519	pfam01970	TctA	Tripartite tricarboxylate transporter TctA family. This family, formerly known as DUF112, is a family of bacterial and archaeal tripartite tricarboxylate transporters of the extracytoplasmic solute binding receptor-dependent transporter group of families, distinct from the ABC and TRAP-T families. TctA is part of the tripartite TctABC system which, as characterized in S. typhimurium, is a secondary carrier that depends for activity on the extracytoplasmic tricarboxylate-binding receptor TctC as well as two integral membrane proteins, TctA and TctB. Complete three-component systems are found only in bacteria. TctA is a large transmembrane protein with up to 12 predicted membrane spanning regions in bacteria and up to 11 such in archaea, with the N-terminal within the cytoplasm. TctA is thought to be a permease, and in most other bacteria functions without TctB and TctC molecules.	415
110924	pfam01972	SDH_sah	Serine dehydrogenase proteinase. This family of archaebacterial proteins, formerly known as DUF114, has been found to be a serine dehydrogenase proteinase distantly related to ClpP proteinases that belong to the serine proteinase superfamily. The family has a catalytic triad of Ser, Asp, His residues, which shows an altered residue ordering compared with the ClpP proteinases but similar to that of the carboxypeptidase clan.	286
396520	pfam01973	MAF_flag10	Protein of unknown function DUF115. This family of archaebacterial proteins has no known function.	171
396521	pfam01974	tRNA_int_endo	tRNA intron endonuclease, catalytic C-terminal domain. Members of this family cleave pre tRNA at the 5' and 3' splice sites to release the intron EC:3.1.27.9.	85
396522	pfam01975	SurE	Survival protein SurE. E. coli cells with the surE gene disrupted are found to survive poorly in stationary phase. It is suggested that SurE may be involved in stress response. Yeast also contains a member of the family. Yarrowia lipolytica PHO2 can complement a mutation in acid phosphatase, suggesting that members of this family could be phosphatases.	187
396523	pfam01976	DUF116	Protein of unknown function DUF116. This archaebacterial protein has no known function. The protein contains seven conserved cysteines and may also be an integral membrane protein.	152
396524	pfam01977	UbiD	3-octaprenyl-4-hydroxybenzoate carboxy-lyase. This family has been characterized as 3-octaprenyl-4-hydroxybenzoate carboxy-lyase enzymes. This enzyme catalyzes the third reaction in ubiquinone biosynthesis. For optimal activity the carboxy-lyase was shown to require Mn2+.	400
396525	pfam01978	TrmB	Sugar-specific transcriptional regulator TrmB. One member of this family, TrmB, has been shown to be a sugar-specific transcriptional regulator of the trehalose/maltose ABC transporter in Thermococcus litoralis.	67
396526	pfam01979	Amidohydro_1	Amidohydrolase family. This family of enzymes is a large metal dependent hydrolase superfamily. The family includes Adenine deaminase EC:3.5.4.2 that hydrolyzes adenine to form hypoxanthine and ammonia. The adenine deaminase reaction is important for adenine utilisation as a purine and also as a nitrogen source. This family also includes dihydroorotase and N-acetylglucosamine-6-phosphate deacetylases, EC:3.5.1.25. These enzymes catalyze the reaction N-acetyl-D-glucosamine 6-phosphate + H2O <=> D-glucosamine 6-phosphate + acetate. This family includes the catalytic domain of urease alpha subunit. Dihydroorotases (EC:3.5.2.3) are also included.	335
396527	pfam01980	UPF0066	Uncharacterized protein family UPF0066. 	116
396528	pfam01981	PTH2	Peptidyl-tRNA hydrolase PTH2. Peptidyl-tRNA hydrolases are enzymes that release tRNAs from peptidyl-tRNA during translation.	115
396529	pfam01982	CTP-dep_RFKase	Domain of unknown function DUF120. This domain is a CTP-dependent riboflavin kinase (RFK), found in archaea, that catalyzes the phosphorylation of riboflavin to form flavin mononucleotide in riboflavin biosynthesis EC:2.7.1.26. Its structure resembles a RIFT barrel, structurally similar to but topologically distinct from bacterial and eukaryotic examples. The N-terminal is a winged helix-turn-helix DNA-binding domain, and the C-terminal half is most similar in sequence to a group of cradle-loop barrels. Archaeoglobus fulgidus RibK has this domain attached to pfam00325.	121
251014	pfam01983	CofC	Guanylyl transferase CofC like. Coenzyme F420 is a hydride carrier cofactor that functions during methanogenesis. This family of proteins represents CofC, a nucleotidyl transferase that is involved in coenzyme F420 biosynthesis. CofC has been shown to catalyze the formation of lactyl-2-diphospho-5'-guanosine from 2-phospho-L-lactate and GTP.	217
396530	pfam01984	dsDNA_bind	Double-stranded DNA-binding domain. This domain is believed to bind double-stranded DNA of 20 bases length.	107
396531	pfam01985	CRS1_YhbY	CRS1 / YhbY (CRM) domain. Escherichia coli YhbY is associated with pre-50S ribosomal subunits, which implies a function in ribosome assembly. GFP fused to a single-domain CRM protein from maize localizes to the nucleolus, suggesting that an analogous activity may have been retained in plants. A CRM domain containing protein in plant chloroplasts has been shown to function in group I and II intron splicing. In vitro experiments with an isolated maize CRM domain have shown it to have RNA binding activity. These and other results suggest that the CRM domain evolved in the context of ribosome function prior to the divergence of Archaea and Bacteria, that this function has been maintained in extant prokaryotes, and that the domain was recruited to serve as an RNA binding module during the evolution of plant genomes. YhbY has a fold similar to that of the C-terminal domain of translation initiation factor 3 (IF3C), which binds to 16S rRNA in the 30S ribosome.	81
396532	pfam01986	DUF123	Domain of unknown function DUF123. This archaebacterial domain has no known function. It is attached to an endonuclease domain in Methanocaldococcus jannaschii endonuclease III (nth). The domain contains several conserved cysteines and histidines. This suggests that the domain may be a zinc binding nucleic acid interaction domain (Bateman A unpubl.).	96
396533	pfam01987	AIM24	Mitochondrial biogenesis AIM24. In eukaryotes, this domain is involved in mitochondrial biogenesis. Its function in prokaryotes is unknown.	206
396534	pfam01988	VIT1	VIT family. This family includes the vacuolar Fe2+/Mn2+ uptake transporter, Ccc1 and the vacuolar iron transporter VIT1.	212
396535	pfam01989	DUF126	Protein of unknown function DUF126. This archaebacterial protein family has no known function.	75
396536	pfam01990	ATP-synt_F	ATP synthase (F/14-kDa) subunit. This family includes 14-kDa subunit from vATPases, which is in the peripheral catalytic part of the complex. The family also includes archaebacterial ATP synthase subunit F.	91
396537	pfam01991	vATP-synt_E	ATP synthase (E/31 kDa) subunit. This family includes the vacuolar ATP synthase E subunit, as well as the archaebacterial ATP synthase E subunit.	199
396538	pfam01992	vATP-synt_AC39	ATP synthase (C/AC39) subunit. This family includes the AC39 subunit from vacuolar ATP synthase, and the C subunit from archaebacterial ATP synthase. The family also includes subunit C from the Sodium transporting ATP synthase from Enterococcus hirae.	333
396539	pfam01993	MTD	methylene-5,6,7,8-tetrahydromethanopterin dehydrogenase. This enzyme family is involved in formation of methane from carbon dioxide EC:1.5.99.9. The enzyme requires coenzyme F420.	274
396540	pfam01994	Trm56	tRNA ribose 2'-O-methyltransferase, aTrm56. This family is an aTrm56 that catalyzes the 2'-O-methylation of the cytidine residue in archaeal tRNA, using S-adenosyl-L-methionine. Biochemical assays showed that aTrm56 forms a dimer and prefers the L-shaped tRNA to the lambda form as its substrate. aTrm56 consists of the SPOUT domain, which contains the characteristic deep trefoil knot for AdoMet binding, and a unique C-terminal beta-hairpin.	119
396541	pfam01995	DUF128	Domain of unknown function DUF128. This archaebacterial protein family has no known function. The domain is found duplicated in Methanothermobacter thermautotrophicus MTH_1569. Many of these are attached to an N-terminal winged helix domain suggesting these are transcriptional regulators and that this domain has a ligand binding function.	238
396542	pfam01996	F420_ligase	F420-0:Gamma-glutamyl ligase. F420-0:Gamma-glutamyl ligase (EC:6.3.2.-) is an enzyme involved in F420 biosynthesis pathway. It catalyzes the GTP-dependent successive addition of multiple gamma-linked L-glutamates to the L-lactyl phosphodiester of 7,8-didemethyl-8-hydroxy-5-deazariboflavin (F420-0). This reaction produces polyglutamated F420 derivatives. GTP + F420-0 + n L-glutamate -> GDP + phosphate + F420-n	216
396543	pfam01997	Translin	Translin family. Members of this family include Translin, which interacts with DNA and forms a ring around the DNA. This family also includes human TSNAX, which was found to interact with translin in a yeast two-hybrid screen.	196
396544	pfam01998	DUF131	Protein of unknown function DUF131. This archaebacterial protein family has no known function. The proteins are predicted to contain two transmembrane helices.	62
280223	pfam02001	DUF134	Protein of unknown function DUF134. This family of archaeal proteins has no known function.	98
280224	pfam02002	TFIIE_alpha	TFIIE alpha subunit. The general transcription factor TFIIE has an essential role in eukaryotic transcription initiation together with RNA polymerase II and other general factors. Human TFIIE consists of two subunits, TFIIE-alpha and TFIIE-beta, and joins the pre-initiation complex after RNA polymerase II and TFIIF. This family consists of the conserved amino terminal region of eukaryotic TFIIE-alpha and proteins from archaebacteria that are presumed to be TFIIE-alpha subunits, such as Archaeoglobus fulgidus tfe.	105
396545	pfam02005	TRM	N2,N2-dimethylguanosine tRNA methyltransferase. This enzyme EC:2.1.1.32 uses S-AdoMet to methylate tRNA. The TRM1 gene of Saccharomyces cerevisiae is necessary for the N2,N2-dimethylguanosine modification of both mitochondrial and cytoplasmic tRNAs. The enzyme is found in both eukaryotes and archaebacteria.	375
396546	pfam02006	DUF137	Protein of unknown function DUF137. This family of archaeal proteins has no known function.	176
280227	pfam02007	MtrH	Tetrahydromethanopterin S-methyltransferase MtrH subunit. The enzyme tetrahydromethanopterin S-methyltransferase EC:2.1.1.86 is composed of eight subunits. The enzyme is a membrane-associated enzyme complex which catalyzes an energy-conserving, sodium-ion-translocating step in methanogenesis from hydrogen and carbon dioxide.	299
366873	pfam02008	zf-CXXC	CXXC zinc finger domain. This domain contains eight conserved cysteine residues that bind to two zinc ions. The CXXC domain is found in a variety of chromatin-associated proteins. This domain binds to nonmethyl-CpG dinucleotides. The domain is characterized by two repeats, and shows a peculiar internal duplication in which the second unit is inserted into the first one. Each of these units is characterized by four conserved cysteines, displaying a CXXCXXCX(n)C motif that chelates a Zn2+ ion. The DNA binding interface has been identified by NMR. In eukaryotes, the CXXC domain is found in stramenopiles, plants and metazoans. Plants possess a mono-CXXC domain that is present in distinct chromatin proteins. Structural comparisons show that the mono-CXXC is homologous to the structural-zinc binding domain of medium chain dehydrogenases.	48
396547	pfam02009	RIFIN	Rifin. Plasmodium falciparum is the causative agent of deadly malaria disease. It encodes repetitive interspersed families of polypeptides (RIFINs), which are expressed on the surface of infected erythrocytes. All RIFIN sequences contain the PEXEL motif (a pentameric sequence RxLxE/Q/D, known as the Plasmodium export element) required for correct export and surface expression or host-targeting (HT) signal which plays a central role in the export of proteins into the host cell. It has been reported that PEXEL is preferably located 15-20 amino acids downstream of an N-terminal hydrophobic signal sequence. The RIFIN protein family can be divided into A and B types based on the presence or absence of a 25 amino acid motif located approximately 66 amino acids downstream of the PEXEL motif, with A- and B-types serving different roles in distinct parasite stages. The specific type B RIFIN variant (PF13_0006) is expressed on the surface of free merozoites, internally in developing gametocytes and on the surface of gametes at the point of emerging from activated, mature stage V gametocytes, while type A RIFINs are expressed on the infected erythrocyte surface, potentially contributing to the antigenic variation capacity of the parasite.	326
366875	pfam02010	REJ	REJ domain. The REJ (Receptor for Egg Jelly) domain is found in PKD1 and the sperm receptor for egg jelly. The function of this domain is unknown. The domain is 600 amino acids long so is probably composed of multiple structural domains. There are six completely conserved cysteine residues that may form disulphide bridges. This region contains tandem PKD-like domains.	448
396548	pfam02011	Glyco_hydro_48	Glycosyl hydrolase family 48. Members of this family are endoglucanase EC:3.2.1.4 and exoglucanase EC:3.2.1.91 enzymes that cleave cellulose or related substrate.	620
396549	pfam02012	BNR	BNR/Asp-box repeat. Members of this family contain multiple BNR (bacterial neuraminidase repeat) repeats or Asp-boxes. The repeats are short, however the repeats are never found closer than 40 residues together suggesting that the repeat is structurally longer. These repeats are found in many glycosyl hydrolases as well as other extracellular proteins of unknown function.	12
251036	pfam02013	CBM_10	Cellulose or protein binding domain. This domain is found in two distinct sets of proteins with different functions. Those found in aerobic bacteria bind cellulose (or other carbohydrates); but in anaerobic fungi they are protein binding domains, referred to as dockerin domains or docking domains. They are believed to be responsible for the assembly of a multiprotein cellulase/hemicellulase complex, similar to the cellulosome found in certain anaerobic bacteria.	36
396550	pfam02014	Reeler	Reeler domain. 	129
280233	pfam02015	Glyco_hydro_45	Glycosyl hydrolase family 45. 	214
396551	pfam02016	Peptidase_S66	LD-carboxypeptidase. Muramoyl-tetrapeptide carboxypeptidase hydrolyzes a peptide bond between a di-basic amino acid and the C-terminal D-alanine in the tetrapeptide moiety in peptidoglycan. This cleaves the bond between an L- and a D-amino acid. The function of this activity is in murein recycling. This family also includes the microcin c7 self-immunity protein. This family corresponds to Merops family S66.	119
396552	pfam02017	CIDE-N	CIDE-N domain. This domain is found in CAD nuclease and ICAD, the inhibitor of CAD nuclease. The two proteins interact through this domain.	75
396553	pfam02018	CBM_4_9	Carbohydrate binding domain. This family includes diverse carbohydrate binding domains.	134
396554	pfam02019	WIF	WIF domain. The WIF domain is found in the RYK tyrosine kinase receptors and WIF the Wnt-inhibitory-factor. The domain is extracellular and contains two conserved cysteines that may form a disulphide bridge. This domain is Wnt binding in WIF, and it has been suggested that RYK may also bind to Wnt. The WIF domain is a member of the immunoglobulin superfamily, and it comprises nine beta-strands and two alpha-helices, with two of the beta-strands (6 and 9) interrupted by four and six residues of irregular secondary structure, respectively. Considering that the activity of Wnts depends on the presence of a palmitoylated cysteine residue in their amino-terminal polypeptide segment, Wnt proteins are lipid-modified and can act as stem cell growth factors, it is likely that the WIF domain recognizes and binds to Wnts that have been activated by palmitoylation and that the recognition of palmitoylated Wnts by WIF-1 is effected by its WIF domain rather than by its EGF domains. A strong binding affinity for palmitoylated cysteine residues would further explain the remarkably high affinity of human WIF-1 not only for mammalian Wnts, but also for Wnts from Xenopus and Drosophila.	126
396555	pfam02020	W2	eIF4-gamma/eIF5/eIF2-epsilon. This domain of unknown function is found at the C-terminus of several translation initiation factors.	78
396556	pfam02021	UPF0102	Uncharacterized protein family UPF0102. The function of this family is unknown.	92
396557	pfam02022	Integrase_Zn	Integrase Zinc binding domain. Integrase mediates integration of a DNA copy of the viral genome into the host chromosome. Integrase is composed of three domains. This domain is the amino-terminal zinc binding domain. The central domain is the catalytic domain pfam00665. The carboxyl terminal domain is a DNA binding domain pfam00552.	37
396558	pfam02023	SCAN	SCAN domain. The SCAN domain (named after SRE-ZBP, CTfin51, AW-1 and Number 18 cDNA) is found in several pfam00096 proteins. The domain has been shown to be able to mediate homo- and hetero-oligomerization.	87
396559	pfam02024	Leptin	Leptin. 	142
396560	pfam02025	IL5	Interleukin 5. 	112
396561	pfam02026	RyR	RyR domain. This domain is called RyR for Ryanodine receptor. The domain is found in four copies in the ryanodine receptor. The function of this domain is unknown.	91
366884	pfam02027	RolB_RolC	RolB/RolC glucosidase family. This family of proteins includes RolB and RolC. RolC releases cytokinins from glucoside conjugates, whereas RolB hydrolyzes indole glucosides.	184
396562	pfam02028	BCCT	BCCT, betaine/carnitine/choline family transporter. 	484
396563	pfam02029	Caldesmon	Caldesmon. 	474
307931	pfam02030	Lipoprotein_8	Hypothetical lipoprotein (MG045 family). This family includes hypothetical lipoproteins. The amino-terminal part of this protein is related to pfam01547, a family of solute binding proteins. This suggests this family also has a solute binding function.	493
280248	pfam02031	Peptidase_M7	Streptomyces extracellular neutral proteinase (M7) family. 	133
396564	pfam02033	RBFA	Ribosome-binding factor A. 	104
396565	pfam02035	Coagulin	Coagulin. 	172
396566	pfam02036	SCP2	SCP-2 sterol transfer family. This domain is involved in binding sterols. It is found in the SCP2 protein, as well as the C-terminus of the enzyme estradiol 17 beta-dehydrogenase EC:1.1.1.62. The UNC-24 protein contains an SPFH domain pfam01145.	99
396567	pfam02037	SAP	SAP domain. The SAP (after SAF-A/B, Acinus and PIAS) motif is a putative DNA/RNA binding domain found in diverse nuclear and cytoplasmic proteins.	35
396568	pfam02038	ATP1G1_PLM_MAT8	ATP1G1/PLM/MAT8 family. 	46
307937	pfam02040	ArsB	Arsenical pump membrane protein. 	423
396569	pfam02041	Auxin_BP	Auxin binding protein. 	164
396570	pfam02042	RWP-RK	RWP-RK domain. This domain is named RWP-RK after a conserved motif at the C-terminus of the presumed domain. The domain is found in algal minus dominance proteins as well as plant proteins involved in nitrogen-controlled development.	49
396571	pfam02043	Bac_chlorC	Bacteriochlorophyll C binding protein. 	80
280257	pfam02044	Bombesin	Bombesin-like peptide. 	14
396572	pfam02045	CBFB_NFYA	CCAAT-binding transcription factor (CBF-B/NF-YA) subunit B. 	56
396573	pfam02046	COX6A	Cytochrome c oxidase subunit VIa. 	112
307942	pfam02048	Enterotoxin_ST	Heat-stable enterotoxin ST. This family consists of the heat stable enterotoxin ST from Escherichia coli. ST is a small peptide of 18 or 19 amino acid residues produced by enterotoxigenic E. coli and is one of the causes of acute diarrhoea in infants and travellers in developing countries. ST triggers a biological response by binding to a membrane-associated guanylyl cyclase C which is located on intestinal epithelial cell membranes.	54
396574	pfam02049	FliE	Flagellar hook-basal body complex protein FliE. 	89
396575	pfam02050	FliJ	Flagellar FliJ protein. 	123
110996	pfam02052	Gallidermin	Gallidermin. 	52
366894	pfam02053	Gene66	Gene 66 (IR5) protein. 	209
307945	pfam02055	Glyco_hydro_30	Glycosyl hydrolase family 30 TIM-barrel domain. 	348
396576	pfam02056	Glyco_hydro_4	Family 4 glycosyl hydrolase. 	183
396577	pfam02057	Glyco_hydro_59	Glycosyl hydrolase family 59. 	293
396578	pfam02058	Guanylin	Guanylin precursor. 	86
396579	pfam02059	IL3	Interleukin-3. 	110
396580	pfam02060	ISK_Channel	Slow voltage-gated potassium channel. 	122
280268	pfam02061	Lambda_CIII	Lambda Phage CIII. The CIII protein from bacteriophage lambda is an inhibitor of the FtsH peptidase.	42
366900	pfam02063	MARCKS	MARCKS family. 	281
396581	pfam02064	MAS20	MAS20 protein import receptor. 	132
307952	pfam02065	Melibiase	Melibiase. Glycoside hydrolase families GH27, GH31 and GH36 form the glycoside hydrolase clan GH-D. Glycoside hydrolase family 36 can be split into 11 families, GH36A to GH36K. This family includes enzymes from GH36A-B and GH36D-K and from GH27.	347
280272	pfam02066	Metallothio_11	Metallothionein family 11. 	54
280273	pfam02067	Metallothio_5	Metallothionein family 5. 	41
396582	pfam02068	Metallothio_PEC	Plant PEC family metallothionein. 	75
396583	pfam02069	Metallothio_Pro	Prokaryotic metallothionein. 	51
366904	pfam02070	NMU	Neuromedin U. 	25
366905	pfam02071	NSF	Aromatic-di-Alanine (AdAR) repeat. This repeat is found in NSF attachment proteins. Its structure is similar to that found in TPR repeats pfam00515.	12
396584	pfam02072	Orexin	Prepro-orexin. 	129
396585	pfam02073	Peptidase_M29	Thermophilic metalloprotease (M29). 	405
396586	pfam02074	Peptidase_M32	Carboxypeptidase Taq (M32) metallopeptidase. 	489
366907	pfam02075	RuvC	Crossover junction endodeoxyribonuclease RuvC. This entry includes endodeoxyribonucleases found in bacteria, such as RuvC. RuvC is a small protein of about 20 kD. It requires and binds a magnesium ion. The structure of E. coli RuvC is a 3-layer alpha-beta sandwich containing a 5-stranded beta-sheet sandwiched between 5 alpha-helices. The Escherichia coli RuvC gene is involved in DNA repair and in the late step of RecE and RecF pathway recombination. RuvC protein (EC:3.1.22.4) cleaves cruciform junctions, which are formed by the extrusion of inverted repeat sequences from a super-coiled plasmid and which are structurally analogous to Holliday junctions, by introducing nicks into strands with the same polarity. The nicks leave a 5'terminal phosphate and a 3'terminal hydroxyl group which are ligated by E. coli or Bacteriophage T4 DNA ligases. Analysis of the cleavage sites suggests that DNA topology rather than a particular sequence determines the cleavage site. RuvC protein also cleaves Holliday junctions that are formed between gapped circular and linear duplex DNA by the function of RecA protein. The active form of RuvC protein is a dimer. This is mechanistically suited for an endonuclease involved in swapping DNA strands at the crossover junctions. It is inferred that RuvC protein is an endonuclease that resolves Holliday structures in vivo.	148
396587	pfam02076	STE3	Pheromone A receptor. 	292
111019	pfam02077	SURF4	SURF4 family. 	267
396588	pfam02078	Synapsin	Synapsin, N-terminal domain. This family is structurally related to the PreATP-grasp domain.	100
366910	pfam02079	TP1	Nuclear transition protein 1. 	51
396589	pfam02080	TrkA_C	TrkA-C domain. This domain is often found next to the pfam02254 domain. The exact function of this domain is unknown. It has been suggested that it may bind an unidentified ligand. The domain is predicted to adopt an all beta structure.	70
396590	pfam02081	TrpBP	Tryptophan RNA-binding attenuator protein. 	68
396591	pfam02082	Rrf2	Transcriptional regulator. This family is related to pfam001022 and other transcription regulation families (personal obs: Yeats C).	131
280288	pfam02083	Urotensin_II	Urotensin II. 	12
251078	pfam02084	Bindin	Bindin. 	239
396592	pfam02085	Cytochrom_CIII	Class III cytochrome C family. 	103
396593	pfam02086	MethyltransfD12	D12 class N6 adenine-specific DNA methyltransferase. 	254
111029	pfam02087	Nitrophorin	Nitrophorin. 	178
145317	pfam02088	Ornatin	Ornatin. 	41
396594	pfam02089	Palm_thioest	Palmitoyl protein thioesterase. 	251
280292	pfam02090	SPAM	Salmonella surface presentation of antigen gene type M protein. 	140
396595	pfam02091	tRNA-synt_2e	Glycyl-tRNA synthetase alpha subunit. 	276
396596	pfam02092	tRNA_synt_2f	Glycyl-tRNA synthetase beta subunit. 	536
396597	pfam02093	Gag_p30	Gag P30 core shell protein. According to Swiss-Prot annotation this protein is the viral core shell protein. P30 is essential for viral assembly.	208
396598	pfam02095	Extensin_1	Extensin-like protein repeat. 	10
396599	pfam02096	60KD_IMP	60Kd inner membrane protein. 	188
280298	pfam02097	Filo_VP35	Filoviridae VP35. 	342
396600	pfam02098	His_binding	Tick histamine binding protein. 	150
396601	pfam02099	Josephin	Josephin. 	154
396602	pfam02100	ODC_AZ	Ornithine decarboxylase antizyme. This family consists of ornithine decarboxylase antizyme proteins. The polyamine biosynthetic enzyme ornithine decarboxylase (ODC) is degraded by the 26 S proteasome via a ubiquitin-independent pathway. Its degradation is greatly accelerated by association with the polyamine-induced regulatory protein antizyme 1 (AZ1).	112
396603	pfam02101	Ocular_alb	Ocular albinism type 1 protein. 	402
251087	pfam02102	Peptidase_M35	Deuterolysin metalloprotease (M35) family. 	352
396604	pfam02104	SURF1	SURF1 family. 	192
396605	pfam02106	Fanconi_C	Fanconi anaemia group C protein. 	547
396606	pfam02107	FlgH	Flagellar L-ring protein. 	175
396607	pfam02108	FliH	Flagellar assembly protein FliH. 	124
396608	pfam02109	DAD	DAD family. Members of this family are thought to be integral membrane proteins. Some members of this family have been shown to cause apoptosis if mutated, these proteins are known as DAD for defender against death. The family also includes the epsilon subunit of the oligosaccharyltransferase that is involved in N-linked glycosylation.	108
396609	pfam02110	HK	Hydroxyethylthiazole kinase family. 	247
396610	pfam02112	PDEase_II	cAMP phosphodiesterases class-II. 	339
396611	pfam02113	Peptidase_S13	D-Ala-D-Ala carboxypeptidase 3 (S13) family. 	444
251094	pfam02114	Phosducin	Phosducin. 	265
396612	pfam02115	Rho_GDI	RHO protein GDP dissociation inhibitor. 	194
396613	pfam02116	STE2	Fungal pheromone mating factor STE2 GPCR. 	276
111054	pfam02117	7TM_GPCR_Sra	Serpentine type 7TM GPCR chemoreceptor Sra. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Sra is part of the Sra superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'.	328
396614	pfam02118	Srg	Srg family chemoreceptor. 	270
396615	pfam02119	FlgI	Flagellar P-ring protein. 	342
396616	pfam02120	Flg_hook	Flagellar hook-length control protein FliK. This is the C terminal domain of FliK. FliK controls the length of the flagellar hook by directly measuring the hook length as a molecular ruler. This family also includes YscP of the Yersinia type III secretion system, and equivalent proteins in other pathogenic bacterial type III secretion systems.	83
396617	pfam02121	IP_trans	Phosphatidylinositol transfer protein. Along with the structurally unrelated Sec14p family (found in pfam00650), this family can bind/exchange one molecule of phosphatidylinositol (PI) or phosphatidylcholine (PC) and thus aids their transfer between different membrane compartments. There are three sub-families - all share an N-terminal PITP-like domain, whose sequence is highly conserved. It is described as consisting of three regions. The N-terminal region is thought to bind the lipid and contains two helices and an eight-stranded, mostly antiparallel beta-sheet. An intervening loop region, which is thought to play a role in protein-protein interactions, separates this from the C-terminal region, which exhibits the greatest sequence variation and may be involved in membrane binding. PITP alpha has a 16-fold greater affinity for PI than PC. Together with PITP beta, it is expressed ubiquitously in all tissues.	245
111059	pfam02122	Peptidase_S39	Peptidase S39. This family contains polyprotein processing endopeptidases from RNA viruses.	203
280316	pfam02123	RdRP_4	Viral RNA-directed RNA-polymerase. This family includes RNA-dependent RNA polymerase proteins (RdRPs) from Luteovirus, Totivirus and Rotavirus.	465
280317	pfam02124	Marek_A	Marek's disease glycoprotein A. 	210
396618	pfam02126	PTE	Phosphotriesterase family. 	298
396619	pfam02127	Peptidase_M18	Aminopeptidase I zinc metalloprotease (M18). 	430
396620	pfam02128	Peptidase_M36	Fungalysin metallopeptidase (M36). 	366
396621	pfam02129	Peptidase_S15	X-Pro dipeptidyl-peptidase (S15 family). 	264
396622	pfam02130	UPF0054	Uncharacterized protein family UPF0054. 	125
396623	pfam02132	RecR	RecR protein. 	40
251109	pfam02133	Transp_cyt_pur	Permease for cytosine/purines, uracil, thiamine, allantoin. 	439
396624	pfam02135	zf-TAZ	TAZ zinc finger. The TAZ2 domain of CBP binds to other transcription factors such as the p53 tumor suppressor protein, E1A oncoprotein, MyoD, and GATA-1. The zinc coordinating motif that is necessary for binding to target DNA sequences consists of HCCC.	72
396625	pfam02136	NTF2	Nuclear transport factor 2 (NTF2) domain. This family includes the NTF2-like Delta-5-3-ketosteroid isomerase proteins.	116
396626	pfam02137	A_deamin	Adenosine-deaminase (editase) domain. Adenosine deaminases acting on RNA (ADARs) can deaminate adenosine to form inosine. In long double-stranded RNA, this process is non-specific; it occurs site-specifically in RNA transcripts. The former is important in defense against viruses, whereas the latter may affect splicing or untranslated regions. They are primarily nuclear proteins, but a longer isoform of ADAR1 is found predominantly in the cytoplasm. ADARs are derived from the Tad1-like tRNA deaminases that are present across eukaryotes. These in turn belong to the nucleotide/nucleic acid deaminase superfamily and are characterized by a distinct insert between the two conserved cysteines that are involved in binding zinc.	327
396627	pfam02138	Beach	Beige/BEACH domain. 	277
396628	pfam02140	Gal_Lectin	Galactose binding lectin domain. 	80
396629	pfam02141	DENN	DENN (AEX-3) domain. DENN (after differentially expressed in neoplastic vs normal cells) is a domain which occurs in several proteins involved in Rab- mediated processes or regulation of MAPK signalling pathways.	185
396630	pfam02142	MGS	MGS-like domain. This domain composes the whole protein of methylglyoxal synthetase and the domain is also found in Carbamoyl phosphate synthetase (CPS) where it forms a regulatory domain that binds to the allosteric effector ornithine. This family also includes inosicase. The known structures in this family show a common phosphate binding site.	93
396631	pfam02144	Rad1	Repair protein Rad1/Rec1/Rad17. 	257
396632	pfam02145	Rap_GAP	Rap/ran-GAP. 	181
396633	pfam02146	SIR2	Sir2 family. This region is characteristic of Silent information regulator 2 (Sir2) proteins, or sirtuins. These are protein deacetylases that depend on nicotinamide adenine dinucleotide (NAD). They are found in many subcellular locations, including the nucleus, cytoplasm and mitochondria. Eukaryotic forms play an important role in the regulation of transcriptional repression. Moreover, they are involved in microtubule organisation and DNA damage repair processes.	179
396634	pfam02148	zf-UBP	Zn-finger in ubiquitin-hydrolases and other protein. 	63
396635	pfam02149	KA1	Kinase associated domain 1. 	44
280336	pfam02150	RNA_POL_M_15KD	RNA polymerases M/15 Kd subunit. 	36
308001	pfam02151	UVR	UvrB/uvrC motif. 	36
396636	pfam02152	FolB	Dihydroneopterin aldolase. This enzyme EC:4.1.2.25 catalyzes the conversion of 7,8-dihydroneopterin to 6-hydroxymethyl-7,8-dihydropterin in the biosynthetic pathway of tetrahydrofolate.	113
396637	pfam02153	PDH	Prephenate dehydrogenase. Members of this family are prephenate dehydrogenases EC:1.3.1.12 involved in tyrosine biosynthesis.	257
111086	pfam02154	FliM	Flagellar motor switch protein FliM. 	192
396638	pfam02155	GCR	Glucocorticoid receptor. 	371
396639	pfam02156	Glyco_hydro_26	Glycosyl hydrolase family 26. 	311
396640	pfam02157	Man-6-P_recep	Mannose-6-phosphate receptor. This family includes both Cation-dependent and cation independent mannose-6-phosphate receptors.	254
396641	pfam02158	Neuregulin	Neuregulin family. 	360
396642	pfam02159	Oest_recep	Oestrogen receptor. 	138
280345	pfam02160	Peptidase_A3	Cauliflower mosaic virus peptidase (A3). 	208
396643	pfam02161	Prog_receptor	Progesterone receptor. 	564
145362	pfam02162	XYPPX	XYPPX repeat (two copies). This repeat is found in a wide variety of proteins and generally consists of the motif XYPPX where X can be any amino acid. The family includes annexin VII and the carboxy tail of certain rhodopsins. This family also includes plaque matrix proteins, however this motif is embedded in a ten residue repeat in Mytilus edulis adhesive plaque matrix protein FP1. The molecular function of this repeat is unknown. It is also not clear if all the members of this family share a common evolutionary ancestor due to its short length and biased amino acid composition.	15
308008	pfam02163	Peptidase_M50	Peptidase family M50. 	275
396644	pfam02165	WT1	Wilm's tumor protein. 	290
396645	pfam02166	Androgen_recep	Androgen receptor. 	484
396646	pfam02167	Cytochrom_C1	Cytochrome C1 family. 	219
396647	pfam02169	LPP20	LPP20 lipoprotein. This family contains the LPP20 lipoprotein, which is a non-essential class of lipoprotein.	96
396648	pfam02170	PAZ	PAZ domain. This domain is named PAZ after the proteins Piwi Argonaut and Zwille. This domain is found in two families of proteins that are involved in post-transcriptional gene silencing. These are the Piwi family and the Dicer family, that includes the Carpel factory protein. The function of the domains is unknown but has been suggested to mediate complex formation between proteins of the Piwi and Dicer families by hetero-dimerization. The three-dimensional structure of this domain has been solved. The PAZ domain is composed of two subdomains. One subdomain is similar to the OB fold, albeit with a different topology. The OB-fold is well known as a single-stranded nucleic acid binding fold. The second subdomain is composed of a beta-hairpin followed by an alpha-helix. The PAZ domain shows low-affinity nucleic acid binding and appears to interact with the 3' ends of single-stranded regions of RNA in the cleft between the two subdomains. PAZ can bind the characteristic two-base 3' overhangs of siRNAs, indicating that although PAZ may not be a primary nucleic acid binding site in Dicer or RISC, it may contribute to the specific and productive incorporation of siRNAs and miRNAs into the RNAi pathway.	116
396649	pfam02171	Piwi	Piwi domain. This domain is found in the protein Piwi and its relatives. The function of this domain is the dsRNA guided hydrolysis of ssRNA. Determination of the crystal structure of Argonaute reveals that PIWI is an RNase H domain, and identifies Argonaute as Slicer, the enzyme that cleaves mRNA in the RNAi RISC complex. In addition, Mg+2 dependence and production of 3'-OH and 5' phosphate products are shared characteristics of RNaseH and RISC. The PIWI domain core has a tertiary structure belonging to the RNase H family of enzymes. RNase H fold proteins all have a five-stranded mixed beta-sheet surrounded by helices. By analogy to RNase H enzymes which cleave single-stranded RNA guided by the DNA strand in an RNA/DNA hybrid, the PIWI domain can be inferred to cleave single-stranded RNA, for example mRNA, guided by double stranded siRNA.	296
366953	pfam02172	KIX	KIX domain. CBP and P300 bind to the CREB via a domain known as KIX. The KIX domain of CBP also binds to transactivation domains of other nuclear factors including Myb and Jun.	81
396650	pfam02173	pKID	pKID domain. CBP and P300 bind to the pKID (phosphorylated kinase-inducible-domain) domain of CREB.	41
396651	pfam02174	IRS	PTB domain (IRS-1 type). 	97
111105	pfam02175	7TM_GPCR_Srb	Serpentine type 7TM GPCR chemoreceptor Srb. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srb is part of the Sra superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'.	236
280357	pfam02176	zf-TRAF	TRAF-type zinc finger. 	60
396652	pfam02177	APP_N	Amyloid A4 N-terminal heparin-binding. This N-terminal domain of APP, amyloid precursor protein, is the heparin-binding domain of the protein. This region is also responsible for stimulation of neurite outgrowth. The structure reveals both a highly charged basic surface that may interact with glycosaminoglycans in the brain and an abutting hydrophobic surface that is proposed to play an important functional role such as in dimerization or ligand-binding. Structural similarities with cysteine-rich growth factors, taken together with its known growth-promoting properties, suggest the APP N-terminal domain could function as a growth factor in vivo.	100
308018	pfam02178	AT_hook	AT hook motif. At hooks are DNA binding motifs with a preference for A/T rich regions.	13
396653	pfam02179	BAG	BAG domain. Domain present in Hsp70 regulators.	77
396654	pfam02180	BH4	Bcl-2 homology region 4. 	25
396655	pfam02181	FH2	Formin Homology 2 Domain. 	372
396656	pfam02182	SAD_SRA	SAD/SRA domain. The domain goes by several names including SAD, SRA and YDG. It adopts a beta barrel, modified PUA-like, fold that is widely present in eukaryotic chromatin proteins and in bacteria. Versions of this domain are known to bind hemi-methylated CpG dinucleotides and also other 5mC containing dinucleotides. The domain binds DNA by flipping out the methylated cytosine base from the DNA double helix. The conserved tyrosine and aspartate residues and a glycine rich patch are critical for recognition of the flipped out base. Mammalian UHRF1 that contains this domain plays an important role in maintenance of methylation at CpG dinucleotides by recruiting DNMT1 to hemimethylated sites associated with replication forks. The SAD/SRA domain has been combined with other domains involved in the ubiquitin pathway on multiple occasions and such proteins link recognition of DNA methylation to chromatin-protein ubiquitination. The domain is also found in species that lack DNA methylation, such as certain apicomplexans, suggestive of other DNA-binding modes or functions. A highly derived and distinct version of the domain is also found in fungi where it is fused to AlkB-type 2OGFeDO domains. In bacteria, the domain is usually fused or associated with restriction endonucleases, many of which target methylated or hemi-methylated DNA.	143
396657	pfam02183	HALZ	Homeobox associated leucine zipper. 	43
111114	pfam02184	HAT	HAT (Half-A-TPR) repeat. The HAT (Half A TPR) repeat is found in several RNA processing proteins.	32
396658	pfam02185	HR1	Hr1 repeat. The HR1 repeat was first described as a three times repeated homology region of the N-terminal non-catalytic part of protein kinase PRK1(PKN). The first two of these repeats were later shown to bind the small G protein rho known to activate PKN in its GTP-bound form. Similar rho-binding domains also occur in a number of other protein kinases and in the rho-binding proteins rhophilin and rhotekin. Recently, the structure of the N-terminal HR1 repeat complexed with RhoA has been determined by X-ray crystallography. It forms an antiparallel coiled-coil fold termed an ACC finger.	57
396659	pfam02186	TFIIE_beta	TFIIE beta subunit core domain. General transcription factor TFIIE consists of two subunits, TFIIE alpha pfam02002 and TFIIE beta. TFIIE beta has been found to bind to the region where the promoter starts to open to be single-stranded upon transcription initiation by RNA polymerase II. The structure of the DNA binding core region has been solved and has a winged helix fold.	66
396660	pfam02187	GAS2	Growth-Arrest-Specific Protein 2 Domain. The GAR2 domain is common in plakin family members and Gas2 family members. The GAR domain comprises around 57 amino acids and has been shown to bind to microtubules.	69
396661	pfam02188	GoLoco	GoLoco motif. 	20
396662	pfam02189	ITAM	Immunoreceptor tyrosine-based activation motif. 	20
396663	pfam02190	LON_substr_bdg	ATP-dependent protease La (LON) substrate-binding domain. This domain has been shown to be part of the PUA superfamily. This domain represents a general protein and polypeptide interaction domain for the ATP-dependent serine peptidase, LON, Peptidase_S16, pfam05362. ATP-dependent Lon proteases are conserved in all living organisms and catalyze rapid turnover of short-lived regulatory proteins and many damaged or denatured proteins.	195
396664	pfam02191	OLF	Olfactomedin-like domain. 	243
396665	pfam02192	PI3K_p85B	PI3-kinase family, p85-binding domain. 	76
396666	pfam02194	PXA	PXA domain. This domain is associated with PX domains pfam00787.	183
396667	pfam02195	ParBc	ParB-like nuclease domain. 	90
396668	pfam02196	RBD	Raf-like Ras-binding domain. 	65
396669	pfam02197	RIIa	Regulatory subunit of type II PKA R-subunit. 	37
396670	pfam02198	SAM_PNT	Sterile alpha motif (SAM)/Pointed domain. 	82
396671	pfam02199	SapA	Saposin A-type domain. 	33
366975	pfam02200	STE	STE like transcription factor. 	109
396672	pfam02201	SWIB	SWIB/MDM2 domain. This family includes the SWIB domain and the MDM2 domain. The p53-associated protein (MDM2) is an inhibitor of the p53 tumor suppressor gene binding the transactivation domain and down regulating the ability of p53 to activate transcription. This family contains the p53 binding domain of MDM2.	73
396673	pfam02202	Tachykinin	Tachykinin family. 	11
396674	pfam02203	TarH	Tar ligand binding domain homolog. 	152
366977	pfam02204	VPS9	Vacuolar sorting protein 9 (VPS9) domain. This domain acts as a GDP-GTP exchange factor (GEF). It activates Rab GTPases by stimulating the release of GDP and allowing GTP to bind.	104
396675	pfam02205	WH2	WH2 motif. The WH2 motif (for Wiskott Aldrich syndrome homology region 2) has been shown in WASP and Scar1 (mammalian homolog) to be the region that interacts with actin.	28
396676	pfam02206	WSN	Domain of unknown function. 	66
396677	pfam02207	zf-UBR	Putative zinc finger in N-recognin (UBR box). This region is found in E3 ubiquitin ligases that recognize N-recognins.	68
396678	pfam02208	Sorb	Sorbin homologous domain. 	45
396679	pfam02209	VHP	Villin headpiece domain. 	35
396680	pfam02210	Laminin_G_2	Laminin G domain. This family includes the Thrombospondin N-terminal-like domain, a Laminin G subfamily.	126
396681	pfam02211	NHase_beta	Nitrile hydratase beta subunit. Nitrile hydratases EC:4.2.1.84 are unusual metalloenzymes that catalyze the hydration of nitriles to their corresponding amides. They are used as biocatalysts in acrylamide production, one of the few commercial scale bioprocesses, as well as in environmental remediation for the removal of nitriles from waste streams. Nitrile hydratases are composed of two subunits, alpha and beta, and they contain one iron atom per alpha beta unit.	220
396682	pfam02212	GED	Dynamin GTPase effector domain. 	89
396683	pfam02213	GYF	GYF domain. The GYF domain is named because of the presence of Gly-Tyr-Phe residues. The GYF domain is a proline-binding domain in CD2-binding protein.	45
396684	pfam02214	BTB_2	BTB/POZ domain. In voltage-gated K+ channels this domain is responsible for subfamily-specific assembly of alpha-subunits into functional tetrameric channels. In KCTD1 this domain functions as a transcriptional repressor. It also mediates homomultimerisation of KCTD1 and interaction of KCTD1 with the transcription factor AP-2-alpha.	93
396685	pfam02216	B	B domain. This family contains the B domain of Staphylococcal protein A, which specifically binds to the Fc portion of immunoglobulin G.	51
280395	pfam02217	T_Ag_DNA_bind	Origin of replication binding protein. This domain of large T antigen binds to the SV40 origin of DNA replication.	94
396686	pfam02218	HS1_rep	Repeat in HS1/Cortactin. The function of this repeat is unknown. Seven copies are found in cortactin and four copies are found in HS1. The repeats are always found amino terminal to an SH3 domain pfam00018.	36
396687	pfam02219	MTHFR	Methylenetetrahydrofolate reductase. This family includes the 5,10-methylenetetrahydrofolate reductase EC:1.7.99.5 from bacteria and methylenetetrahydrofolate reductase EC: 1.5.1.20 from eukaryotes. The structure for this domain is known to be a TIM barrel.	287
396688	pfam02221	E1_DerP2_DerF2	ML domain. ML domain - MD-2-related lipid recognition domain. This family consists of proteins from plants, animals and fungi, including dust mite allergen Der P 2. It has been implicated in lipid recognition, particularly in the recognition of pathogen related products. A mutation in Npc2 causes a rare form of Niemann-Pick type C2 disease. This domain has a similar topology to immunoglobulin domains.	130
396689	pfam02222	ATP-grasp	ATP-grasp domain. This family does not contain all known ATP-grasp domain members. This family includes a diverse set of enzymes that possess ATP-dependent carboxylate-amine ligase activity.	169
396690	pfam02223	Thymidylate_kin	Thymidylate kinase. 	184
280401	pfam02224	Cytidylate_kin	Cytidylate kinase. Cytidylate kinase EC:2.7.4.14 catalyzes the phosphorylation of cytidine 5'-monophosphate (dCMP) to cytidine 5'-diphosphate (dCDP) in the presence of ATP or GTP.	211
396691	pfam02225	PA	PA domain. The PA (Protease associated) domain is found as an insert domain in diverse proteases. The PA domain is also found in a plant vacuolar sorting receptor and members of the RZF family. It has been suggested that this domain forms a lid-like structure that covers the active site in active proteases, and is involved in protein recognition in vacuolar sorting receptors.	89
308053	pfam02226	Pico_P1A	Picornavirus coat protein (VP4). VP1, VP2, VP3 and VP4 form the basic unit that forms the icosahedral coat of picornaviruses. Five symmetry-related N termini of coat protein VP4 form a ten-stranded, antiparallel beta barrel around the base of the icosahedral fivefold axis.	68
280404	pfam02228	Gag_p19	Major core protein p19. p19 is a component of the inner protein layer of the viral nucleocapsid.	92
396692	pfam02229	PC4	Transcriptional Coactivator p15 (PC4). p15 has a bipartite structure composed of an amino-terminal regulatory domain and a carboxy-terminal cryptic DNA-binding domain. The DNA-binding activity of the carboxy-terminal is disguised by the amino-terminal p15 domain. Activity is controlled by protein kinases that target the regulatory domain.	48
396693	pfam02230	Abhydrolase_2	Phospholipase/Carboxylesterase. This family consists of both phospholipases and carboxylesterases with broad substrate specificity, and is structurally related to alpha/beta hydrolases pfam00561.	217
280407	pfam02232	Alpha_TIF	Alpha trans-inducing protein (Alpha-TIF). Alpha-TIF, a virion protein (VP16), is involved in transcriptional activation of viral immediate early (IE) promoters (alpha genes). Specificity of tegument protein VP16 for IE genes is conferred by the 400 residue N-terminal, the 80 residue C-terminal is responsible for transcriptional activation.	343
396694	pfam02233	PNTB	NAD(P) transhydrogenase beta subunit. This family corresponds to the beta subunit of NADP transhydrogenase in prokaryotes, and either the protein N- or C terminal in eukaryotes. The domain is often found in conjunction with pfam01262. Pyridine nucleotide transhydrogenase catalyzes the reduction of NAD+ to NADPH. A complete loss of activity occurs upon mutation of Gly314 in E. coli.	452
396695	pfam02234	CDI	Cyclin-dependent kinase inhibitor. Cell cycle progression is negatively controlled by cyclin-dependent kinases inhibitors (CDIs). CDIs are involved in cell cycle arrest at the G1 phase.	46
280410	pfam02236	Viral_DNA_bi	Viral DNA-binding protein, all alpha domain. This family represents a domain of the viral DNA- binding protein, a multi functional protein involved in DNA replication and transcription control.	79
396696	pfam02237	BPL_C	Biotin protein ligase C terminal domain. The function of this structural domain is unknown. It is found to the C-terminus of the biotin protein ligase catalytic domain pfam01317.	47
396697	pfam02238	COX7a	Cytochrome c oxidase subunit VII. Cytochrome c oxidase, a 13 sub-unit complex, is the terminal oxidase in the mitochondrial electron transport chain. This family also contains both heart and liver isoforms of cytochrome c oxidase subunit VIIa.	53
366994	pfam02239	Cytochrom_D1	Cytochrome D1 heme domain. Cytochrome cd1 (nitrite reductase) catalyzes the conversion of nitrite to nitric oxide in the nitrogen cycle. This family represents the d1 heme binding domain of cytochrome cd1, in which His/Tyr side chains ligate the d1 heme iron of the active site in the oxidized state.	368
396698	pfam02240	MCR_gamma	Methyl-coenzyme M reductase gamma subunit. Methyl-coenzyme M reductase (MCR) is the enzyme responsible for microbial formation of methane. It is a hexamer composed of 2 alpha (pfam02249), 2 beta (pfam02241), and 2 gamma (this family) subunits with two identical nickel porphinoid active sites.	246
396699	pfam02241	MCR_beta	Methyl-coenzyme M reductase beta subunit, C-terminal domain. Methyl-coenzyme M reductase (MCR) is the enzyme responsible for microbial formation of methane. It is a hexamer composed of 2 alpha (pfam02249), 2 beta (this family), and 2 gamma (pfam02240) subunits with two identical nickel porphinoid active sites. The C-terminal domain of MCR beta has an all-alpha fold with buried central helix.	249
396700	pfam02244	Propep_M14	Carboxypeptidase activation peptide. Carboxypeptidases are found in abundance in pancreatic secretions. The pro-segment moiety (activation peptide) accounts for up to a quarter of the total length of the peptidase, and is responsible for modulation of folding and activity of the pro-enzyme.	68
396701	pfam02245	Pur_DNA_glyco	Methylpurine-DNA glycosylase (MPG). Methylpurine-DNA glycosylase is a base excision-repair protein. It is responsible for the hydrolysis of the deoxyribose N-glycosidic bond, excising 3-methyladenine and 3-methylguanine from damaged DNA.	181
308063	pfam02246	B1	Protein L b1 domain. Protein L is a bacterial protein with immunoglobulin (Ig) light chain-binding properties. It contains a number of homologous b1 repeats towards the N-terminus. These repeats have been found to be responsible for the interaction of protein L with Ig light chains.	62
396702	pfam02247	Como_LCP	Large coat protein. This family contains the large coat protein (LCP) of the comoviridae viral family.	369
396703	pfam02248	Como_SCP	Small coat protein. This family contains the small coat protein (SCP) of the comoviridae viral family.	183
396704	pfam02249	MCR_alpha	Methyl-coenzyme M reductase alpha subunit, C-terminal domain. Methyl-coenzyme M reductase (MCR) is the enzyme responsible for microbial formation of methane. It is a hexamer composed of 2 alpha (this family), 2 beta (pfam02241), and 2 gamma (pfam02240) subunits with two identical nickel porphinoid active sites. The C-terminal domain is comprised of an all-alpha multi-helical bundle.	127
396705	pfam02250	Orthopox_35kD	35kD major secreted virus protein. This family of orthopoxvirus secreted proteins (also known as T1 and A41) interact with members of both the CC and CXC superfamilies of chemokines. It has been suggested that these secreted proteins modulate leukocyte influx into virus-infected tissues.	224
396706	pfam02251	PA28_alpha	Proteasome activator pa28 alpha subunit. PA28 activator complex (also known as 11s regulator of 20S proteasome) is a ring shaped hexameric structure of alternating alpha and beta subunits. This family represents the alpha subunit. The activator complex binds to the 20S proteasome and stimulates peptidase activity in an ATP-independent manner.	61
396707	pfam02252	PA28_beta	Proteasome activator pa28 beta subunit. PA28 activator complex (also known as 11s regulator of 20S proteasome) is a ring shaped hexameric structure of alternating alpha and beta subunits. This family represents the beta subunit. The activator complex binds to the 20S proteasome and stimulates peptidase activity in an ATP-independent manner.	143
396708	pfam02253	PLA1	Phospholipase A1. Phospholipase A1 is a bacterial outer membrane bound acyl hydrolase with a broad substrate specificity EC:3.1.1.32. It has been proposed that Ser164 is the active site for Escherichia coli phospholipase A1.	251
396709	pfam02254	TrkA_N	TrkA-N domain. This domain is found in a wide variety of proteins. These proteins include potassium channels, phosphoesterases, and various other transporters. This domain binds to NAD.	115
396710	pfam02255	PTS_IIA	PTS system, Lactose/Cellobiose specific IIA subunit. The bacterial phosphoenolpyruvate: sugar phosphotransferase system (PTS) is a multi-protein system involved in the regulation of a variety of metabolic and transcriptional processes. The lactose/cellobiose-specific family is one of four structurally and functionally distinct group IIA PTS system enzymes. This family of proteins normally functions as a homotrimer, stabilized by a centrally located metal ion. Separation into subunits is thought to occur after phosphorylation.	94
396711	pfam02256	Fe_hyd_SSU	Iron hydrogenase small subunit. This family represents the small subunit of the Fe-only hydrogenases EC:1.18.99.1. The subunit is comprised of alternating random coil and alpha helical structures that encompasses the large subunit in a novel protein fold.	56
396712	pfam02257	RFX_DNA_binding	RFX DNA-binding domain. RFX is a regulatory factor which binds to the X box of MHC class II genes and is essential for their expression. The DNA-binding domain of RFX is the central domain of the protein and binds ssDNA as either a monomer or homodimer. It recognizes X-boxes (DNA of the sequence 5'-GTNRCC(0-3N)RGYAAC-3', where N is any nucleotide, R is a purine and Y is a pyrimidine) using a highly conserved 76-residue DNA-binding domain (DBD).	77
396713	pfam02258	SLT_beta	Shiga-like toxin beta subunit. This family represents the B subunit of shiga-like toxin (SLT or verotoxin) produced by some strains of E. coli associated with hemorrhagic colitis and hemolytic uremic syndrome. SLTs are composed of one enzymatic A subunit and five cell binding B subunits.	69
396714	pfam02259	FAT	FAT domain. The FAT domain is named after FRAP, ATM and TRRAP.	342
396715	pfam02260	FATC	FATC domain. The FATC domain is named after FRAP, ATM, TRRAP C-terminal. The solution structure of the FATC domain suggests it plays a role in redox-dependent structural and cellular stability.	32
396716	pfam02261	Asp_decarbox	Aspartate decarboxylase. Decarboxylation of aspartate is the major route of beta-alanine production in bacteria, and is catalyzed by the enzyme aspartate decarboxylase EC:4.1.1.11 which requires a pyruvoyl group for its activity. It is synthesized initially as a proenzyme which is then proteolytically cleaved to an alpha (C-terminal) and beta (N-terminal) subunit and a pyruvoyl group. This family contains both chains of aspartate decarboxylase.	107
396717	pfam02262	Cbl_N	CBL proto-oncogene N-terminal domain 1. Cbl is an adaptor protein that binds EGF receptors (or other tyrosine kinases) and SH3 domains, functioning as a negative regulator of many signaling pathways. The N-terminal domain is evolutionarily conserved, and is known to bind to phosphorylated tyrosine residues. Cbl_N is comprised of 3 structural domains of which this is the first - a four helix bundle.	119
308078	pfam02263	GBP	Guanylate-binding protein, N-terminal domain. Transcription of the anti-viral guanylate-binding protein (GBP) is induced by interferon-gamma during macrophage induction. This family contains GBP1 and GBP2, both GTPases capable of binding GTP, GDP and GMP.	260
396718	pfam02264	LamB	LamB porin. Maltoporin (LamB protein) forms a trimeric structure which facilitates the diffusion of maltodextrins across the outer membrane of Gram-negative bacteria. The membrane channel is formed by an antiparallel beta-barrel.	385
396719	pfam02265	S1-P1_nuclease	S1/P1 Nuclease. This family contains both S1 and P1 nucleases (EC:3.1.30.1) which cleave RNA and single stranded DNA with no base specificity.	252
396720	pfam02267	Rib_hydrolayse	ADP-ribosyl cyclase. ADP-ribosyl cyclase EC:3.2.2.5 (also known as cyclic ADP-ribose hydrolase or CD38) synthesizes cyclic-ADP ribose, a second messenger for glucose-induced insulin secretion.	229
396721	pfam02268	TFIIA_gamma_N	Transcription initiation factor IIA, gamma subunit, helical domain. Accurate transcription in vivo requires at least six general transcription initiation factors, in addition to RNA polymerase II. Transcription initiation factor IIA (TFIIA) is a multimeric protein which facilitates the binding of TFIID to the TATA box. The N-terminal domain of the gamma subunit is a 4 helix bundle.	46
190265	pfam02269	TFIID-18kDa	Transcription initiation factor IID, 18kD subunit. This family includes the Spt3 yeast transcription factors and the 18kD subunit from human transcription initiation factor IID (TFIID-18). Determination of the crystal structure reveals an atypical histone fold.	93
396722	pfam02270	TFIIF_beta	Transcription initiation factor IIF, beta subunit. Accurate transcription in vivo requires at least six general transcription initiation factors, in addition to RNA polymerase II. Transcription initiation factor IIF (TFIIF) is a tetramer of two beta subunits associated with two alpha subunits which interacts directly with RNA polymerase II. The beta subunit of TFIIF is required for recruitment of RNA polymerase II onto the promoter.	65
396723	pfam02271	UCR_14kD	Ubiquinol-cytochrome C reductase complex 14kD subunit. The ubiquinol-cytochrome C reductase complex (cytochrome bc1 complex) is a respiratory multienzyme complex. This Pfam family represents the 14kD (or VI) subunit of the complex which is not directly involved in electron transfer, but has a role in assembly of the complex.	100
396724	pfam02272	DHHA1	DHHA1 domain. This domain is often found adjacent to the DHH domain pfam01368 and is called DHHA1 for DHH associated domain. This domain is diagnostic of DHH subfamily 1 members. This domain is also found in alanyl tRNA synthetase, suggesting that this domain may have an RNA binding function. The domain is about 60 residues long and contains a conserved GG motif.	139
111194	pfam02273	Acyl_transf_2	Acyl transferase. This bacterial family of Acyl transferases (or myristoyl-acp-specific thioesterases) catalyze the first step in the bioluminescent fatty acid reductase system.	294
396725	pfam02274	Amidinotransf	Amidinotransferase. This family contains glycine (EC:2.1.4.1) and inosamine (EC:2.1.4.2) amidinotransferases, enzymes involved in creatine and streptomycin biosynthesis respectively. This family also includes arginine deiminases, EC:3.5.3.6. These enzymes catalyze the reaction: arginine + H2O <=> citrulline + NH3. Also found in this family is the Streptococcus anti tumor glycoprotein.	284
396726	pfam02275	CBAH	Linear amide C-N hydrolases, choloylglycine hydrolase family. This family includes several hydrolases which cleave carbon-nitrogen bonds, other than peptide bonds, in linear amides. These include choloylglycine hydrolase (conjugated bile acid hydrolase, CBAH) EC:3.5.1.24, penicillin acylase EC:3.5.1.11 and acid ceramidase EC:3.5.1.23. This domain forms the alpha-subunit for members from vertebrate species, see family NAAA-beta, pfam15508.	316
396727	pfam02276	CytoC_RC	Photosynthetic reaction centre cytochrome C subunit. Photosynthesis in purple bacteria is dependent on light-induced electron transfer in the reaction centre (RC), coupled to the uptake of protons from the cytoplasm. The RC contains a cytochrome molecule which re-reduces the oxidized electron donor.	309
396728	pfam02277	DBI_PRT	Phosphoribosyltransferase. This family of proteins represents the nicotinate-nucleotide-dimethylbenzimidazole phosphoribosyltransferase (NN:DBI PRT) enzymes involved in dimethylbenzimidazole synthesis. This function is essential to de novo cobalamin (vitamin B12) production in bacteria. Nicotinate mononucleotide (NaMN):5,6-dimethylbenzimidazole (DMB) phosphoribosyltransferase (CobT) from Salmonella enterica plays a central role in the synthesis of alpha-ribazole-5'-phosphate, an intermediate for the lower ligand of cobalamin.	333
396729	pfam02278	Lyase_8	Polysaccharide lyase family 8, super-sandwich domain. This family consists of a group of secreted bacterial lyase enzymes EC:4.2.2.1 capable of acting on hyaluronan and chondroitin in the extracellular matrix of host tissues, contributing to the invasive capacity of the pathogen.	252
280446	pfam02281	Dimer_Tnp_Tn5	Transposase Tn5 dimerization domain. Transposons are mobile DNA sequences capable of replication and insertion into the chromosome. Typically transposons code for the transposase enzyme, which catalyzes insertion, found between terminal inverted repeats. Tn5 has a unique method of self-regulation in which a truncated version of the transposase enzyme acts as an inhibitor. The catalytic domain of the Tn5 transposon is found in pfam01609. This domain mediates dimerization in the known structure.	106
396730	pfam02282	Herpes_UL42	DNA polymerase processivity factor (UL42). The DNA polymerase processivity factor (UL42) of herpes simplex virus forms a heterodimer with UL30 to create the viral DNA polymerase complex. UL42 functions to increase the processivity of polymerization and makes little contribution to the catalytic activity of the polymerase.	142
396731	pfam02283	CobU	Cobinamide kinase / cobinamide phosphate guanyltransferase. This family is composed of a group of bifunctional cobalamin biosynthesis enzymes which display cobinamide kinase and cobinamide phosphate guanyltransferase activity. The crystal structure of the enzyme reveals the molecule to be a trimer with a propeller-like shape.	167
396732	pfam02284	COX5A	Cytochrome c oxidase subunit Va. Cytochrome c oxidase, a 13 sub-unit complex, EC:1.9.3.1 is the terminal oxidase in the mitochondrial electron transport chain. This family is composed of cytochrome c oxidase subunit Va.	99
396733	pfam02285	COX8	Cytochrome oxidase c subunit VIII. Cytochrome c oxidase, a 13 sub-unit complex, EC:1.9.3.1 is the terminal oxidase in the mitochondrial electron transport chain. This family is composed of cytochrome c oxidase subunit VIII.	41
396734	pfam02286	Dehydratase_LU	Dehydratase large subunit. This family contains the large subunit of the trimeric diol dehydratases and glycerol dehydratases. These enzymes are produced by some enterobacteria in response to growth substances.	552
396735	pfam02287	Dehydratase_SU	Dehydratase small subunit. This family contains the small subunit of the trimeric diol dehydratases and glycerol dehydratases. These enzymes are produced by some enterobacteria in response to growth substances.	131
396736	pfam02288	Dehydratase_MU	Dehydratase medium subunit. This family contains the medium subunit of the trimeric diol dehydratases and glycerol dehydratases. These enzymes are produced by some enterobacteria in response to growth substances.	108
396737	pfam02289	MCH	Cyclohydrolase (MCH). Methenyl tetrahydromethanopterin cyclohydrolase EC:3.5.4.27 is involved in methanogenesis in bacteria and archaea, producing methane from carbon monoxide or carbon dioxide.	308
396738	pfam02290	SRP14	Signal recognition particle 14kD protein. The signal recognition particle (SRP) is a multimeric protein involved in targeting secretory proteins to the rough endoplasmic reticulum membrane. SRP14 and SRP9 form a complex essential for SRP RNA binding.	106
396739	pfam02291	TFIID-31kDa	Transcription initiation factor IID, 31kD subunit. This family represents the N-terminus of the 31kD subunit (42kD in drosophila) of transcription initiation factor IID (TAFII31). TAFII31 binds to p53, and is an essential requirement for p53 mediated transcription activation.	122
396740	pfam02293	AmiS_UreI	AmiS/UreI family transporter. This family includes UreI and proton gated urea channel as well as putative amide transporters.	165
280458	pfam02294	7kD_DNA_binding	7kD DNA-binding domain. This family contains members of the hyper-thermophilic archaebacterium 7kD DNA-binding/endoribonuclease P2 family. There are five 7kD DNA-binding proteins, 7a-7e, found as monomers in the cell. Protein 7e shows the tightest DNA-binding ability.	58
280459	pfam02295	z-alpha	Adenosine deaminase z-alpha domain. This family consists of the N-terminus and thus the z-alpha domain of double-stranded RNA-specific adenosine deaminase (ADAR), an RNA-editing enzyme. The z-alpha domain is a Z-DNA binding domain, and binding of this region to B-DNA has been shown to be disfavoured by steric hindrance.	67
367023	pfam02296	Alpha_adaptin_C	Alpha adaptin AP2, C-terminal domain. Alpha adaptin is a hetero tetramer which regulates clathrin-bud formation. The carboxyl-terminal appendage of the alpha subunit regulates translocation of endocytic accessory proteins to the bud site.	113
396741	pfam02297	COX6B	Cytochrome oxidase c subunit VIb. Cytochrome c oxidase, a 13 sub-unit complex, EC:1.9.3.1 is the terminal oxidase in the mitochondrial electron transport chain. This family is composed of the potentially heme-binding subunit VIb of the oxidase.	65
280462	pfam02298	Cu_bind_like	Plastocyanin-like domain. This family represents a domain found in flowering plants related to the copper binding protein plastocyanin. Some members of this family may not bind copper due to the lack of key residues.	84
396742	pfam02300	Fumarate_red_C	Fumarate reductase subunit C. Fumarate reductase is a membrane-bound flavoenzyme consisting of four subunits, A-D. A and B comprise the membrane-extrinsic catalytic domain and C and D link the catalytic centers to the electron-transport chain. This family consists of the 15kD hydrophobic subunit C.	127
396743	pfam02301	HORMA	HORMA domain. The HORMA (for Hop1p, Rev7p and MAD2) domain has been suggested to recognize chromatin states that result from DNA adducts, double stranded breaks or non-attachment to the spindle and acts as an adaptor that recruits other proteins. MAD2 is a spindle checkpoint protein which prevents progression of the cell cycle upon detection of a defect in mitotic spindle integrity.	209
396744	pfam02302	PTS_IIB	PTS system, Lactose/Cellobiose specific IIB subunit. The bacterial phosphoenolpyruvate: sugar phosphotransferase system (PTS) is a multi-protein system involved in the regulation of a variety of metabolic and transcriptional processes. The lactose/cellobiose-specific family is one of four structurally and functionally distinct group IIB PTS system cytoplasmic enzymes. The fold of IIB cellobiose shows similar structure to mammalian tyrosine phosphatases. This family also contains the fructose specific IIB subunit.	92
396745	pfam02303	Phage_DNA_bind	Helix-destabilizing protein. This family contains the bacteriophage helix-destabilizing protein, or single-stranded DNA binding protein, required for DNA synthesis.	83
280467	pfam02304	Phage_B	Scaffold protein B. This is a family of proteins from single-stranded DNA bacteriophages. Scaffold proteins B and D are required for procapsid formation. Sixty copies of the internal scaffold protein B are found in the procapsid.	117
308107	pfam02305	Phage_F	Capsid protein (F protein). This is a family of proteins from single-stranded DNA bacteriophages. Protein F is the major capsid component, sixty copies of which are found in the virion.	510
280468	pfam02306	Phage_G	Major spike protein (G protein). This is a family of proteins from single-stranded DNA bacteriophages. Five G proteins, each a tight beta barrel, form each of the twelve surface spikes.	175
396746	pfam02308	MgtC	MgtC family. The MgtC protein is found in an operon with the Mg2+ transporter protein MgtB. The function of MgtC and its homologs is not known.	120
396747	pfam02309	AUX_IAA	AUX/IAA family. Transcription of the AUX/IAA family of genes is rapidly induced by the plant hormone auxin. Some members of this family are longer and contain an N terminal DNA binding domain. The function of this region is uncertain.	188
396748	pfam02310	B12-binding	B12 binding domain. This domain binds to B12 (adenosylcobamide); it is found in several enzymes, such as glutamate mutase, methionine synthase, and methylmalonyl-CoA mutase. It contains a conserved DxHxxGx(41)SxVx(26)GG motif, which is important for B12 binding.	121
396749	pfam02311	AraC_binding	AraC-like ligand binding domain. This family represents the arabinose-binding and dimerization domain of the bacterial gene regulatory protein AraC. The domain is found in conjunction with the helix-turn-helix (HTH) DNA-binding motif pfam00165. This domain is distantly related to the Cupin domain pfam00190.	134
396750	pfam02312	CBF_beta	Core binding factor beta subunit. Core binding factor (CBF) is a heterodimeric transcription factor essential for genetic regulation of hematopoiesis and osteogenesis. The beta subunit enhances DNA-binding ability of the alpha subunit in vitro, and has been shown to have a structure related to the OB fold.	166
396751	pfam02313	Fumarate_red_D	Fumarate reductase subunit D. Fumarate reductase is a membrane-bound flavoenzyme consisting of four subunits, A-D. A and B comprise the membrane-extrinsic catalytic domain and C and D link the catalytic centers to the electron-transport chain. This family consists of the 13kD hydrophobic subunit D.	114
396752	pfam02315	MDH	Methanol dehydrogenase beta subunit. Methanol dehydrogenase (MDH) is a bacterial periplasmic quinoprotein that oxidizes methanol to formaldehyde. MDH is a tetramer of two alpha and two beta subunits. This family contains the small beta subunit.	88
367032	pfam02316	HTH_Tnp_Mu_1	Mu DNA-binding domain. This family consists of MuA-transposase and repressor protein CI. These proteins contain homologous DNA-binding domains at their N-termini which compete for the same DNA site within the Mu bacteriophage genome.	134
396753	pfam02317	Octopine_DH	NAD/NADP octopine/nopaline dehydrogenase, alpha-helical domain. This group of enzymes act on the CH-NH substrate bond using NAD(+) or NADP(+) as an acceptor. The Pfam family consists mainly of octopine and nopaline dehydrogenases from Ti plasmids.	149
396754	pfam02318	FYVE_2	FYVE-type zinc finger. This FYVE-type zinc finger is found at the N-terminus of effector proteins including rabphilin-3A and regulating synaptic membrane exocytosis protein 2.	118
396755	pfam02319	E2F_TDP	E2F/DP family winged-helix DNA-binding domain. This family contains the transcription factor E2F and its dimerization partners TDP1 and TDP2, which stimulate E2F-dependent transcription. E2F binds to DNA as a homodimer or as a heterodimer in association with TDP1/2, the heterodimer having increased binding efficiency. The crystal structure of an E2F4-DP2-DNA complex shows that the DNA-binding domains of the E2F and DP proteins both have a fold related to the winged-helix DNA-binding motif. Recognition of the central c/gGCGCg/c sequence of the consensus DNA-binding site is symmetric, and amino acids that contact these bases are conserved among all known E2F and DP proteins.	64
396756	pfam02320	UCR_hinge	Ubiquinol-cytochrome C reductase hinge protein. The ubiquinol-cytochrome C reductase complex (cytochrome bc1 complex) is a respiratory multienzyme complex. This Pfam family represents the 'hinge' protein of the complex which is thought to mediate formation of the cytochrome c1 and cytochrome c complex.	64
396757	pfam02321	OEP	Outer membrane efflux protein. The OEP family (Outer membrane efflux protein) form trimeric channels that allow export of a variety of substrates in Gram negative bacteria. Each member of this family is composed of two repeats. The trimeric channel is composed of a 12 stranded all beta sheet barrel that spans the outer membrane, and a long all helical barrel that spans the periplasm.	181
396758	pfam02322	Cyt_bd_oxida_II	Cytochrome bd terminal oxidase subunit II. This family consists of cytochrome bd type terminal oxidases that catalyze quinol-dependent, Na+-independent oxygen uptake. Members of this family are integral membrane proteins and contain a protohaem IX centre B558. One member of the family, Klebsiella pneumoniae CydB, is implicated in having an important role in micro-aerobic nitrogen fixation in the enteric bacterium Klebsiella pneumoniae. The family forms an integral functional unit with subunit I, family Bac_Ubq_Cox, pfam01654.	300
145463	pfam02323	ELH	Egg-laying hormone precursor. This family consists of egg-laying hormone (ELH) precursor and atrial gland peptides from little and California sea hare. The family also includes ovulation prohormone precursor from great pond snail. This family thus represents a conserved gastropoda ovulation and egg production prohormone. Note that many of the proteins present are further cleaved to give individual peptides. Neuropeptidergic bag cells of the marine mollusk Aplysia californica synthesize an egg-laying hormone (ELH) precursor protein which is cleaved to generate several bioactive peptides including ELH, bag cell peptides (BCP) and acidic peptide (AP).	255
334895	pfam02324	Glyco_hydro_70	Glycosyl hydrolase family 70. Members of this family belong to glycosyl hydrolase family 70. Glucosyltransferases or sucrose 6-glycosyl transferases (GTF-S) catalyze the transfer of D-glucopyranosyl units from sucrose onto acceptor molecules, EC:2.4.1.5. This family roughly corresponds to the N-terminal catalytic domain of the enzyme. Members of this family also contain the Putative cell wall binding domain pfam01473, which corresponds with the C-terminal glucan-binding domain.	804
396759	pfam02325	YGGT	YGGT family. This family consists of a repeat found in conserved hypothetical integral membrane proteins. The function of this region and the proteins which possess it is unknown.	71
396760	pfam02326	YMF19	Plant ATP synthase F0. This family corresponds to subunit 8 (YMF19) of the F0 complex of plant and algae mitochondrial F-ATPases (EC:3.6.1.34).	84
396761	pfam02327	BChl_A	Bacteriochlorophyll A protein. Bacteriochlorophyll A protein is involved in the energy transfer system of green photosynthetic bacteria. The protein forms a homotrimer, with each monomer unit containing seven molecules of bacteriochlorophyll A.	354
280487	pfam02329	HDC	Histidine carboxylase PI chain. Histidine carboxylase catalyzes the formation of histamine from histidine. Cleavage of the proenzyme PI chain yields two subunits, alpha and beta, which arrange as a hexamer (alpha beta)6.	293
396762	pfam02330	MAM33	Mitochondrial glycoprotein. This mitochondrial matrix protein family contains members of the MAM33 family which bind to the globular 'heads' of C1Q. It is thought to be involved in mitochondrial oxidative phosphorylation and in nucleus-mitochondrion interactions.	202
280489	pfam02331	P35	Apoptosis preventing protein. This viral protein functions to block the host apoptotic response caused by infection by the virus. The apoptosis preventing protein (or early 35kD protein, P35) acts by blocking caspase protease activity.	295
396763	pfam02332	Phenol_Hydrox	Methane/Phenol/Toluene Hydroxylase. Bacterial phenol hydroxylase is a multicomponent enzyme that catabolises phenol and some of its methylated derivatives. This Pfam family contains both the P1 and P3 polypeptides of phenol hydroxylase and the alpha and beta chain of methane hydroxylase protein A.	226
280491	pfam02333	Phytase	Phytase. Phytase is a secreted enzyme which hydrolyzes phytate to release inorganic phosphate. This family appears to represent a novel enzyme that shows phytase activity and has been shown to have a six-bladed propeller folding architecture.	375
396764	pfam02334	RTP	Replication terminator protein. The bacterial replication terminator protein (RTP) plays a role in the termination of DNA replication by impeding replication fork movement. Two RTP dimers bind to the two inverted repeat regions at the termination site.	113
396765	pfam02335	Cytochrom_C552	Cytochrome c552. Cytochrome c552 (cytochrome c nitrite reductase) is a crucial enzyme in the nitrogen cycle catalyzing the reduction of nitrite to ammonia. The crystal structure of cytochrome c552 reveals it to be a dimer, with 10 close-packed type c haem groups.	435
280494	pfam02336	Denso_VP4	Capsid protein VP4. Four different translation initiation sites of the densovirus capsid protein mRNA give rise to four viral proteins, VP1 to VP4. This family represents VP4.	431
396766	pfam02337	Gag_p10	Retroviral GAG p10 protein. This family consists of various retroviral GAG (core) polyproteins and encompasses the p10 region producing the p10 protein upon proteolytic cleavage of GAG by retroviral protease. The p10 or matrix protein (MA) is associated with the virus envelope glycoproteins in most mammalian retroviruses and may be involved in virus particle assembly, transport and budding. Some of the GAG polyproteins have alternate cleavage sites leading to the production of alternative and longer cleavage products (e.g. p19); the alignment of this family only covers the approximately N-terminal (GAG) 100 amino acid region of homology to p10.	83
396767	pfam02338	OTU	OTU-like cysteine protease. This family is comprised of a group of predicted cysteine proteases, homologous to the Ovarian tumor (OTU) gene in Drosophila. Members include proteins from eukaryotes, viruses and pathogenic bacteria. The conserved cysteine and histidine, and possibly the aspartate, represent the catalytic residues in this putative group of proteases.	128
111251	pfam02340	PRRSV_Env	PRRSV putative envelope protein. This family consists of a conserved probable envelope protein or ORF2 in porcine reproductive and respiratory syndrome virus (PRRSV). Also in the family is a minor structural protein from lactate dehydrogenase-elevating virus.	234
396768	pfam02341	RcbX	RbcX protein. The RBCX protein has been identified as having a possible chaperone-like function. The rbcX gene is juxtaposed to and cotranscribed with rbcL and rbcS encoding RuBisCO in Anabaena sp. CA. RbcX has been shown to possess a chaperone-like function assisting correct folding of RuBisCO in E. coli expression studies and is needed for RuBisCO to reach its maximal activity.	100
396769	pfam02342	TerD	TerD domain. The TerD domain is found in TerD family proteins that include the paralogous TerD, TerA, TerE, TerF and TerZ proteins. It is found in a stress response operon with TerB and TerC. TerD has a maximum of two calcium-binding sites depending on the conservation of aspartates. It has various fusions to nuclease domains, RNA binding domains, ubiquitin related domains, and metal binding domains. The ter gene products lie at the centre of membrane-linked metal recognition complexes with regulatory ramifications encompassing phosphorylation-dependent signal transduction, RNA-dependent regulation, biosynthesis of nucleoside-like metabolites and DNA processing linked to novel pathways.	187
396770	pfam02343	TRA-1_regulated	TRA-1 regulated protein R03H10.4. This family of proteins represents the protein product of the gene R03H10.4 which is located near a sequence that matches the TRA-1 binding consensus. TRA-1 is a transcription factor which controls sexual differentiation in C. elegans. R03H10.4 shows male-enriched reporter gene expression and acts as a direct target of TRA-1 regulation.	128
396771	pfam02344	Myc-LZ	Myc leucine zipper domain. This family consists of the leucine zipper dimerization domain found in both cellular c-Myc proto-oncogenes and viral v-Myc oncogenes. Dimerization via the leucine zipper motif with other basic helix-loop-helix-leucine zipper (b/HLH/lz) proteins such as Max is required for efficient DNA binding. The Myc-Max dimer is a transactivating complex activating expression of growth related genes promoting cell proliferation. The dimerization is facilitated via interdigitating leucine residues every 7th position of the alpha helix. Like charge repulsion of adjacent residues in this region perturbs the formation of homodimers with heterodimers being promoted by opposing charge attractions.	27
280501	pfam02346	Vac_Fusion	Chordopoxvirus multifunctional envelope protein A27. This is a family of viral fusion proteins from the chordopoxviruses. The A27L gene product, a 14-kDa Vaccinia Virus protein, has been demonstrated to function as a viral fusion protein mediating cell fusion at endosomal (low) pH. More recently it has been shown that A27 forms disulfide-linked protein complexes with A26 protein providing an anchor for A26 protein packaging into mature virions. A27 regulates virion-membrane fusion rather than inducing it and is critical for the successful egress of mature virus particles.	56
396772	pfam02347	GDC-P	Glycine cleavage system P-protein. This family consists of Glycine cleavage system P-proteins EC:1.4.4.2 from bacterial, mammalian and plant sources. The P protein is part of the glycine decarboxylase multienzyme complex EC:2.1.2.10 (GDC) also annotated as glycine cleavage system or glycine synthase. GDC consists of four proteins P, H, L and T. The reaction catalyzed by this protein is:- Glycine + lipoylprotein <=> S-aminomethyldihydrolipoylprotein + CO2	428
396773	pfam02348	CTP_transf_3	Cytidylyltransferase. This family consists of two main Cytidylyltransferase activities: 1) 3-deoxy-manno-octulosonate cytidylyltransferase,, EC:2.7.7.38 catalyzing the reaction:- CTP + 3-deoxy-D-manno-octulosonate <=> diphosphate + CMP-3-deoxy-D-manno-octulosonate, 2) acylneuraminate cytidylyltransferase EC:2.7.7.43, catalyzing the reaction:- CTP + N-acylneuraminate <=> diphosphate + CMP-N-acylneuraminate. NeuAc cytydilyltransferase of Mannheimia haemolytica has been characterized describing kinetics and regulation by substrate charge, energetic charge and amino-sugar demand.	217
396774	pfam02349	MSG	Major surface glycoprotein. This is a novel repeat in Pneumocystis carinii Major surface glycoprotein (MSG) some members of the alignment have up to nine repeats of this family, the repeats containing several conserved cysteines. The MSG of P. carinii is an important protein in host-pathogen interactions. Surface glycoprotein A from Pneumocystis carinii is a main target for the host immune system, this protein is implicated in the attachment of Pneumocystis carinii to the host alveolar epithelial cells, alveolar macrophages, host surfactant and possibly accounts in part for the hypoxia seen in Pneumocystis carinii pneumonia (PCP).	76
396775	pfam02350	Epimerase_2	UDP-N-acetylglucosamine 2-epimerase. This family consists of UDP-N-acetylglucosamine 2-epimerases EC:5.1.3.14; this enzyme catalyzes the production of UDP-ManNAc from UDP-GlcNAc. Note that some of the enzymes in this family are bifunctional; in these instances Pfam matches only the N-terminal half of the protein, suggesting that the additional C-terminal part (when compared to mono-functional members of this family) is responsible for the UDP-N-acetylmannosamine kinase activity of these enzymes. This hypothesis is further supported by the assumption that the C-terminal part of rat Gne is the kinase domain.	336
396776	pfam02351	GDNF	GDNF/GAS1 domain. This cysteine rich domain is found in multiple copies in GDNF and GAS1 proteins. GDNF and neurturin (NTN) receptors are potent survival factors for sympathetic, sensory and central nervous system neurons. GDNF and neurturin promote neuronal survival by signaling through similar multicomponent receptors that consist of a common receptor tyrosine kinase and a member of a GPI-linked family of receptors that determines ligand specificity.	88
367048	pfam02352	Decorin_bind	Decorin binding protein. This family consists of decorin binding proteins from Borrelia. The decorin binding protein of Borrelia burgdorferi the lyme disease spirochetes adheres to the proteoglycan decorin found on collagen fibers.	141
396777	pfam02353	CMAS	Mycolic acid cyclopropane synthetase. This family consists of Cyclopropane-fatty-acyl-phospholipid synthase, or CFA synthase, EC:2.1.1.79. This enzyme catalyzes the reaction: S-adenosyl-L-methionine + phospholipid olefinic fatty acid <=> S-adenosyl-L-homocysteine + phospholipid cyclopropane fatty acid.	272
396778	pfam02354	Latrophilin	Latrophilin Cytoplasmic C-terminal region. This family consists of the cytoplasmic C-terminal region in latrophilin. Latrophilin is a synaptic Ca2+ independent alpha-latrotoxin (LTX) receptor and is a novel member of the secretin family of G-protein coupled receptors that are involved in secretion. Latrophilin mRNA is present only in neuronal tissue. Latrophilin interacts with G-alpha O.	378
280510	pfam02355	SecD_SecF	Protein export membrane protein. This family consists of various prokaryotic SecD and SecF protein export membrane proteins. The SecD and SecF proteins are part of the multimeric protein export complex comprising SecA, D, E, F, G, Y, and YajC. SecD and SecF are required to maintain a proton motive force.	189
396779	pfam02357	NusG	Transcription termination factor nusG. 	98
396780	pfam02358	Trehalose_PPase	Trehalose-phosphatase. This family consists of trehalose-phosphatases EC:3.1.3.12; these enzymes catalyze the de-phosphorylation of trehalose-6-phosphate to trehalose and orthophosphate. The aligned region is present in trehalose-phosphatases and comprises the entire length of the protein; it is also found in the C-terminus of trehalose-6-phosphate synthase EC:2.4.1.15 adjacent to the trehalose-6-phosphate synthase domain - pfam00982. It would appear that the two equivalent genes in the E. coli otsBA operon, otsA the trehalose-6-phosphate synthase and otsB trehalose-phosphatase (this family), have undergone gene fusion in most eukaryotes. Trehalose is a common disaccharide of bacteria, fungi and invertebrates that appears to play a major role in desiccation tolerance.	232
396781	pfam02359	CDC48_N	Cell division protein 48 (CDC48), N-terminal domain. This domain has a double psi-beta barrel fold and includes VCP-like ATPase and N-ethylmaleimide sensitive fusion protein N-terminal domains. Both the VAT and NSF N-terminal functional domains consist of two structural domains of which this is at the N-terminus. The VAT-N domain found in AAA ATPases pfam00004 is a 185-residue substrate recognition domain.	85
396782	pfam02361	CbiQ	Cobalt transport protein. This family consists of various cobalt transport proteins, most of which are found in Cobalamin (Vitamin B12) biosynthesis operons. In Salmonella the cbiN, cbiQ (product CbiQ in this family) and cbiO are likely to form an active cobalt transport system.	215
396783	pfam02362	B3	B3 DNA binding domain. This is a family of plant transcription factors with various roles in development. The aligned region corresponds to the B3 DNA binding domain; this domain is found in VP1/ABI3 transcription factors. Some proteins also have a second AP2 DNA binding domain pfam00847 such as RAV1.	101
308140	pfam02363	C_tripleX	Cysteine rich repeat. This Cysteine repeat C-X3-C-X3-C is repeated in sequences of this family, 34 times in an uncharacterized C. elegans protein. The function of these repeats is unknown as is the function of the proteins in which they occur. Most of the sequences in this family are from C. elegans.	17
396784	pfam02364	Glucan_synthase	1,3-beta-glucan synthase component. This family consists of various 1,3-beta-glucan synthase components including Gls1, Gls2 and Gls3 from yeast. 1,3-beta-glucan synthase EC:2.4.1.34 also known as callose synthase catalyzes the formation of a beta-1,3-glucan polymer that is a major component of the fungal cell wall. The reaction catalyzed is:- UDP-glucose + {(1,3)-beta-D-glucosyl}(N) <=> UDP + {(1,3)-beta-D-glucosyl}(N+1).	819
396785	pfam02365	NAM	No apical meristem (NAM) protein. This is a family of no apical meristem (NAM) proteins; these are plant development proteins. Mutations in NAM result in the failure to develop a shoot apical meristem in petunia embryos. NAM is indicated as having a role in determining positions of meristems and primordia. One member of this family NAP (NAC-like, activated by AP3/PI) is encoded by the target genes of the AP3/PI transcriptional activators and functions in the transition between growth by cell division and cell expansion in stamens and petals.	123
396786	pfam02366	PMT	Dolichyl-phosphate-mannose-protein mannosyltransferase. This is a family of Dolichyl-phosphate-mannose-protein mannosyltransferase proteins EC:2.4.1.109. These proteins are responsible for O-linked glycosylation of proteins, they catalyze the reaction:- Dolichyl phosphate D-mannose + protein <=> dolichyl phosphate + O-D-mannosyl-protein. Also in this family is Drosophila rotated abdomen protein which is a putative mannosyltransferase. This family appears to be distantly related to pfam02516 (A Bateman pers. obs.). This family also contains sequences from ArnTs (4-amino-4-deoxy-L-arabinose lipid A transferase). They catalyze the addition of 4-amino-4-deoxy-l-arabinose (l-Ara4N) to the lipid A moiety of the lipopolysaccharide. This is a critical modification enabling bacteria (e.g. Escherichia coli and Salmonella typhimurium) to resist killing by antimicrobial peptides such as polymyxins. Members such as undecaprenyl phosphate-alpha-4-amino-4-deoxy-L-arabinose arabinosyl transferase are predicted to have 12 trans-membrane regions. The N-terminal portion of these proteins is hypothesized to have a conserved glycosylation activity which is shared between distantly related oligosaccharyltransferases ArnT and PglB families.	245
396787	pfam02367	TsaE	Threonylcarbamoyl adenosine biosynthesis protein TsaE. This family of proteins is involved in the synthesis of threonylcarbamoyl adenosine (t(6)A).	125
396788	pfam02368	Big_2	Bacterial Ig-like domain (group 2). This family consists of bacterial domains with an Ig-like fold. Members of this family are found in bacterial and phage surface proteins such as intimins.	77
396789	pfam02369	Big_1	Bacterial Ig-like domain (group 1). This family consists of bacterial domains with an Ig-like fold. Members of this family are found in bacterial surface proteins such as intimins and invasins involved in pathogenicity.	64
111279	pfam02370	M	M protein repeat. This short repeat is found in multiple copies in bacterial M proteins. The M proteins bind to IgA and are closely associated with virulence. The M protein has been postulated to be a major group A Streptococcal (GAS) virulence factor because of its contribution to the bacterial resistance to opsonophagocytosis.	21
396790	pfam02371	Transposase_20	Transposase IS116/IS110/IS902 family. Transposases are needed for efficient transposition of the insertion sequence or transposon DNA. This family includes transposases for IS116, IS110 and IS902. This region is often found with pfam01548. The exact function of this region is uncertain. This family contains a HHH motif suggesting a DNA-binding function.	86
308146	pfam02372	IL15	Interleukin 15. Interleukin-15 (IL-15) is a cytokine that possesses a variety of biological functions, including stimulation and maintenance of cellular immune responses. Structurally these proteins are short-chain 4-helical cytokines.	129
396791	pfam02373	JmjC	JmjC domain, hydroxylase. The JmjC domain belongs to the Cupin superfamily. JmjC-domain proteins may be protein hydroxylases that catalyze a novel histone modification. This is confirmed to be a hydroxylase: the human JmjC protein named Tyw5p unexpectedly acts in the biosynthesis of a hypermodified nucleoside, hydroxy-wybutosine, in tRNA-Phe by catalyzing hydroxylation.	114
396792	pfam02374	ArsA_ATPase	Anion-transporting ATPase. This Pfam family represents a conserved domain, which is sometimes repeated, in an anion-transporting ATPase. The ATPase is involved in the removal of arsenite, antimonite, and arsenate from the cell.	302
396793	pfam02375	JmjN	jmjN domain. 	34
396794	pfam02376	CUT	CUT domain. The CUT domain is a DNA-binding motif which can bind independently or in cooperation with the homeodomain, often found downstream of the CUT domain. Multiple copies of the CUT domain can exist in one protein.	78
396795	pfam02377	Dishevelled	Dishevelled specific domain. This domain is specific to the signalling protein dishevelled. The domain is found adjacent to the PDZ domain pfam00595, often in conjunction with DEP (pfam00610) and DIX (pfam00778). Much of it is disordered and yet conserved.	162
367061	pfam02378	PTS_EIIC	Phosphotransferase system, EIIC. The bacterial phosphoenolpyruvate: sugar phosphotransferase system (PTS) is a multi-protein system involved in the regulation of a variety of metabolic and transcriptional processes. The sugar-specific permease of the PTS consists of three domains (IIA, IIB and IIC). The IIC domain catalyzes the transfer of a phosphoryl group from IIB to the sugar substrate.	315
367062	pfam02380	Papo_T_antigen	T-antigen specific domain. This domain represents a conserved region in papovavirus small and middle T-antigens. It is found as the N-terminal domain in the small T-antigen, and is centrally located in the middle T-antigen.	93
396796	pfam02381	MraZ	MraZ protein, putative antitoxin-like. This small 70 amino acid domain is found duplicated in a family of bacterial proteins. These proteins may be DNA-binding transcription factors (Pers. comm. A Andreeva & A Murzin). It is likely, due to the similarity of fold, that this family acts as a bacterial antitoxin like the MazE antitoxin family.	72
396797	pfam02382	RTX	RTX N-terminal domain. The RTX family of bacterial toxins are a group of cytolysins and cytotoxins. This Pfam family represents the N-terminal domain which is found in association with a glycine-rich repeat domain and hemolysinCabind pfam00353.	312
396798	pfam02383	Syja_N	SacI homology domain. This Pfam family represents a protein domain which shows homology to the yeast protein SacI. The SacI homology domain is most notably found at the amino terminal of the inositol 5'-phosphatase synaptojanin.	296
396799	pfam02384	N6_Mtase	N-6 DNA Methylase. Restriction-modification (R-M) systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The R-M system is a complex containing three polypeptides: M (this family), S (pfam01420), and R. This family consists of N-6 adenine-specific DNA methylase EC:2.1.1.72 from Type I and Type IC restriction systems. These methylases have the same sequence specificity as their corresponding restriction enzymes.	312
280534	pfam02386	TrkH	Cation transport protein. This family consists of various cation transport proteins (Trk) and V-type sodium ATP synthase subunit J or translocating ATPase J EC:3.6.1.34. These proteins are involved in active sodium up-take utilising ATP in the process. TrkH a member of the family from E. coli is a hydrophobic membrane protein and determines the specificity and kinetics of cation transport by the TrK system in E. coli.	491
367066	pfam02387	IncFII_repA	IncFII RepA protein family. This protein is plasmid encoded and found to be essential for plasmid replication.	275
367067	pfam02388	FemAB	FemAB family. The femAB operon codes for two nearly identical approximately 50-kDa proteins involved in the formation of the Staphylococcal pentaglycine interpeptide bridge in peptidoglycan. These proteins are also considered as a factor influencing the level of methicillin resistance.	406
280537	pfam02389	Cornifin	Cornifin (SPRR) family. SPRR genes (formerly SPR) encode a novel class of polypeptides (small proline rich proteins) that are strongly induced during differentiation of human epidermal keratinocytes in vitro and in vivo. The most characteristic feature of the SPRR gene family resides in the structure of the central segments of the encoded polypeptides that are built up from tandemly repeated units of either eight (SPRR1 and SPRR3) or nine (SPRR2) amino acids with the general consensus XKXPEPXX where X is any amino acid. In order to avoid bacterial contamination due to the high polar-nature of the HMM the threshold has been set very high.	135
367068	pfam02390	Methyltransf_4	Putative methyltransferase. This is a family of putative methyltransferases. The aligned region contains the GXGXG S-AdoMet binding site suggesting a putative methyltransferase activity.	173
396800	pfam02391	MoaE	MoaE protein. This family contains the MoaE protein that is involved in biosynthesis of molybdopterin. Molybdopterin, the universal component of the pterin molybdenum cofactors, contains a dithiolene group serving to bind Mo. Addition of the dithiolene sulfurs to a molybdopterin precursor requires the activity of the converting factor. Converting factor contains the MoaE and MoaD proteins.	113
396801	pfam02392	Ycf4	Ycf4. This family consists of hypothetical Ycf4 proteins from various chloroplast genomes. It has been suggested that Ycf4 is involved in the assembly and/or stability of the photosystem I complex in chloroplasts.	176
396802	pfam02393	US22	US22 like. US22 proteins have been found across many animal DNA viruses and some vertebrates. The namesake of this family, US22, is an early nuclear protein that is secreted from cells. The US22 family may have a role in virus replication and pathogenesis. Domain analysis showed that US22 proteins usually contain two copies of conserved modules which are homologous to several other families like SMI1 and SYD (commonly called SUKH superfamily). Bacterial operon analysis revealed that all bacterial SUKH members function as immunity proteins against various toxins. Thus the US22 family is predicted to counter diverse anti-viral responses by interacting with specific host proteins.	124
396803	pfam02394	IL1_propep	Interleukin-1 propeptide. The Interleukin-1 cytokines are translated as precursor proteins. The N terminal approx. 115 amino acids form a propeptide that is cleaved off to release the active interleukin-1.	102
396804	pfam02395	Peptidase_S6	Immunoglobulin A1 protease. This family consists of immunoglobulin A1 protease proteins. The immunoglobulin A1 protease cleaves immunoglobulin IgA and is found in pathogenic bacteria such as Neisseria gonorrhoeae. Not all of the members of this family are IgA proteases, EspP from E. coli O157:H7 cleaves human coagulation factor V and hbp is a hemoglobin protease from E. coli EB1.	784
396805	pfam02397	Bac_transf	Bacterial sugar transferase. This Pfam family represents a conserved region from a number of different bacterial sugar transferases, involved in diverse biosynthesis pathways.	181
280545	pfam02398	Corona_7	Coronavirus protein 7. This is a family of proteins from coronavirus which may function in viral assembly.	101
280546	pfam02399	Herpes_ori_bp	Origin of replication binding protein. This Pfam family represents the herpesvirus origin of replication binding protein, probably involved in DNA replication.	820
396806	pfam02401	LYTB	LytB protein. The mevalonate-independent 2-C-methyl-D-erythritol 4-phosphate (MEP) pathway for isoprenoid biosynthesis is essential in many eubacteria, plants, and the malaria parasite. The LytB gene is involved in the trunk line of the MEP pathway.	267
367072	pfam02402	Lysis_col	Lysis protein. These small bacterial proteins are required for colicin release and partial cell lysis. This family contains lysis proteins for several different forms of colicin. B. subtilis LytA has been included in this family, the similarity is not highly significant, however it is also a short protein, that is involved in secretion of other proteins (Bateman A pers. obs.). This family includes a signal peptide motif and a lipid attachment site.	49
396807	pfam02403	Seryl_tRNA_N	Seryl-tRNA synthetase N-terminal domain. This domain is found associated with the Pfam tRNA synthetase class II domain (pfam00587) and represents the N-terminal domain of seryl-tRNA synthetase.	107
396808	pfam02404	SCF	Stem cell factor. Stem cell factor (SCF) is a homodimer involved in hematopoiesis. SCF binds to and activates the SCF receptor (SCFR), a receptor tyrosine kinase. The crystal structure of human SCF has been resolved and a potential receptor-binding site identified.	275
396809	pfam02405	MlaE	Permease MlaE. MlaE is a permease which in E. coli is a component of the Mla pathway, an ABC transport system that functions to maintain the asymmetry of the outer membrane. In NMB1965 it is involved in L-glutamate import into the cell. In Arabidopsis thaliana TGD1 it is involved in lipid transfer within the cell.	212
396810	pfam02406	MmoB_DmpM	MmoB/DmpM family. This family consists of monooxygenase components such as MmoB methane monooxygenase (EC:1.14.13.25) regulatory protein B. When MmoB is present at low concentration it converts methane monooxygenase from an oxidase to a hydroxylase and stabilizes intermediates required for the activation of dioxygen. Also found in this family is DmpM or Phenol hydroxylase (EC:1.14.13.7) protein component P2, this protein lacks redox co-factors and is required for optimal turnover of Phenol hydroxylase.	85
280553	pfam02407	Viral_Rep	Putative viral replication protein. This is a family of viral ORFs from various plant and animal ssDNA circoviruses. Published evidence to support the annotated function "viral replication associated protein" has not been found.	82
280554	pfam02408	CUB_2	CUB-like domain. This is a family of hypothetical C. elegans proteins. The aligned region has no known function nor do any of the proteins which possess it. However, this domain is related to the CUB domain.	120
396811	pfam02410	RsfS	Ribosomal silencing factor during starvation. This family is expressed by almost all bacterial and eukaryotic genomes but not by archaea. Its function is to down-regulate protein synthesis under conditions of nutrient shortage, and it does this by binding to protein L14 of the large ribosomal subunit, thus acting as a ribosomal silencing factor (RsfS) by blocking the joining of the ribosomal subunits. This family is structurally homologous to nucleotidyltransferases.	97
111318	pfam02411	MerT	MerT mercuric transport protein. MerT is a mercuric transport integral membrane protein and is responsible for transport of the Hg2+ ion from periplasmic MerP (also part of the transport system) to mercuric reductase (MerE).	116
367074	pfam02412	TSP_3	Thrombospondin type 3 repeat. The thrombospondin repeat is a short aspartate rich repeat which binds to calcium ions. The repeat was initially identified in thrombospondin proteins that contained 7 of these repeats. The repeat lacks defined secondary structure.	36
396812	pfam02413	Caudo_TAP	Caudovirales tail fibre assembly protein, lambda gpK. This family contains bacterial and phage tail fibre assembly proteins. E.coli contains several members of this family although the function of these proteins is uncertain. Using the lambda phage members as examples, there are both gptfa and gpK tail proteins here. GpK forms part of the TTC or tail-tip complex that is located at the distal end of the tail. TTCs form the platform on which the tail-tube proteins self-assemble and are also the attachment point for fibers or receptor-binding proteins that mediate phage-adsorption to the surface of the host cell. TTC assembly starts with gpJ, which is also known as the central tail fibre and is involved in host-cell adsorption. It is the C-terminus of gpJ that interacts with the lamB receptor on host cells. A number of intermediates including gpK then interact with gpJ during tail morphogenesis.	130
396813	pfam02414	Borrelia_orfA	Borrelia ORF-A. This protein is encoded by an open reading frame in plasmid borne DNA repeats of Borrelia species. This protein is known as ORF-A. The function of this putative protein is unknown.	285
308169	pfam02415	Chlam_PMP	Chlamydia polymorphic membrane protein (Chlamydia_PMP) repeat. This family contains several Chlamydia polymorphic membrane proteins. Chlamydia pneumoniae is an obligate intracellular bacterium and a common human pathogen causing infection of the upper and lower respiratory tract. Common for the Pmps are the tetrapeptide GGA(I/V/L) motif repeated several times in the N-terminal part. The C-terminal half is characterized by conserved tryptophans and a carboxy-terminal phenylalanine. A signal peptide leader sequence is predicted in 20 C. pneumoniae Pmps, which indicates an outer membrane localization. Pmp10 and Pmp11 contain a signal peptidase II cleavage site suggesting lipid modification. The C. pneumoniae pmp genes represent 17.5% of the chlamydia-specific coding capacity and they are all transcribed during chlamydial growth but the function of Pmps remains unknown. This family shows some similarity to pfam05594 and hence is likely to also form a beta-helical structure (personal obs:C Yeats).	19
280560	pfam02416	MttA_Hcf106	mttA/Hcf106 family. Members of this protein family are involved in a sec independent translocation mechanism. This pathway has been called the DeltapH pathway in chloroplasts. Members of this family in E.coli are involved in export of redox proteins with a "twin arginine" leader motif.	53
396814	pfam02417	Chromate_transp	Chromate transporter. Members of this family probably act as chromate transporters. Members of this family are found in both bacteria and archaebacteria. The proteins are composed of one or two copies of this region. The alignment contains two conserved motifs, FGG and PGP.	164
396815	pfam02419	PsbL	PsbL protein. This family consists of the photosystem II reaction centre protein PsbL from plants and Cyanobacteria. The function of this small protein is unknown. Interestingly the mRNA for this protein requires a post-transcriptional modification of an ACG triplet to form an AUG initiator codon.	37
111326	pfam02420	AFP	Insect antifreeze protein repeat. This family of extracellular proteins is involved in stopping the formation of ice crystals at low temperatures. The proteins are composed of a 12 residue repeat that forms a structural repeat. The structure of the repeats is a beta helix. Each repeat contains two cys residues that form a disulphide bridge.	12
396816	pfam02421	FeoB_N	Ferrous iron transport protein B. Escherichia coli has an iron(II) transport system (feo) which may make an important contribution to the iron supply of the cell under anaerobic conditions. FeoB has been identified as part of this transport system. FeoB is a large 700-800 amino acid integral membrane protein. The N-terminus contains a P-loop motif suggesting that iron transport may be ATP dependent.	156
396817	pfam02422	Keratin	Keratin. This family represents avian keratin proteins, found in feathers, scale and claw.	89
396818	pfam02423	OCD_Mu_crystall	Ornithine cyclodeaminase/mu-crystallin family. This family contains the bacterial Ornithine cyclodeaminase enzyme EC:4.3.1.12, which catalyzes the deamination of ornithine to proline. This family also contains mu-Crystallin the major component of the eye lens in several Australian marsupials, mRNA for this protein has also been found in human retina.	317
396819	pfam02424	ApbE	ApbE family. This prokaryotic family of lipoproteins are related to ApbE from Salmonella typhimurium. ApbE is involved in thiamine synthesis. It acts as an FAD:protein FMN-transferase, catalyzing the attachment of an FMN residue to a threonine residue of a protein via a phosphoester bond in such bacterial flavoproteins.	226
280567	pfam02425	GBP_PSP	Paralytic/GBP/PSP peptide. This family includes insect peptides that are short (23 amino acids) and contain 1 disulphide bridge. The family includes growth-blocking peptide (GBP) of Pseudaletia separata and the paralytic peptides from Manduca sexta, Heliothis virescens, and Spodoptera exigua as well as plasmatocyte-spreading peptide (PSP1). These peptides function to halt metamorphosis from larvae to pupae.	23
396820	pfam02426	MIase	Muconolactone delta-isomerase. This small enzyme forms a homodecameric complex, that catalyzes the third step in the catabolism of catechol to succinate- and acetyl-coa in the beta-ketoadipate pathway EC:5.3.3.4. The protein has a ferredoxin-like fold according to SCOP.	87
396821	pfam02427	PSI_PsaE	Photosystem I reaction centre subunit IV / PsaE. PsaE is a 69 amino acid polypeptide from photosystem I present on the stromal side of the thylakoid membrane. The structure is comprised of a well-defined five-stranded beta-sheet similar to SH3 domains.	59
396822	pfam02428	Prot_inhib_II	Potato type II proteinase inhibitor family. Members of this family are proteinase inhibitors that contain eight cysteines that form four disulphide bridges. The structure of the proteinase-inhibitor complex is known.	51
396823	pfam02429	PCP	Peridinin-chlorophyll A binding protein. Peridinin-chlorophyll-protein, a water-soluble light-harvesting complex that has a blue-green absorbing carotenoid as its main pigment, is present in most photosynthetic dinoflagellates. These proteins are composed of two similar repeated domains. These domains constitute a scaffold with pseudo-twofold symmetry surrounding a hydrophobic cavity filled by two lipid, eight peridinin, and two chlorophyll a molecules.	145
396824	pfam02430	AMA-1	Apical membrane antigen 1. Apical membrane antigen 1 (AMA-1) is a Plasmodium asexual blood-stage antigen. It has been suggested that positive selection operates on the AMA-1 gene in regions coding for antigenic sites.	432
396825	pfam02431	Chalcone	Chalcone-flavanone isomerase. Chalcone-flavanone isomerase is a plant enzyme responsible for the isomerisation of chalcone to naringenin, 4',5,7-trihydroxyflavanone, a key step in the biosynthesis of flavonoids.	203
367084	pfam02432	Fimbrial_K88	Fimbrial, major and minor subunit. Fimbriae (also known as pili) are polar filaments found on the bacterial surface, allowing colonisation of the host. This family consists of the minor and major fimbrial subunits.	155
396826	pfam02433	FixO	Cytochrome C oxidase, mono-heme subunit/FixO. The bacterial oxidase complex, fixNOPQ or cytochrome cbb3, is thought to be required for respiration in endosymbiosis. FixO is a membrane bound mono-heme constituent of the fixNOPQ complex.	217
367085	pfam02434	Fringe	Fringe-like. The drosophila protein fringe (FNG) is a glucosaminyltransferase that controls the response of the Notch receptor to specific ligands. FNG is localized to the Golgi apparatus (not secreted as previously thought). Modification of Notch occurs through glycosylation by FNG. The xenopus homolog, lunatic fringe, has been implicated in a variety of functions.	248
396827	pfam02435	Glyco_hydro_68	Levansucrase/Invertase. This Pfam family consists of the glycosyl hydrolase 68 family, including several bacterial levansucrase enzymes, and invertase from zymomonas.	411
396828	pfam02436	PYC_OADA	Conserved carboxylase domain. This domain represents a conserved region in pyruvate carboxylase (PYC), oxaloacetate decarboxylase alpha chain (OADA), and transcarboxylase 5s subunit. The domain is found adjacent to the HMGL-like domain (pfam00682) and often close to the biotin_lipoyl domain (pfam00364) of biotin requiring enzymes.	199
396829	pfam02437	Ski_Sno	SKI/SNO/DAC family. This family contains a presumed domain that is about 100 amino acids long. All members of this family contain a conserved CLPQ motif. The c-ski proto-oncogene has been shown to influence proliferation, morphological transformation and myogenic differentiation. Sno, a Ski proto-oncogene homolog, is expressed in two isoforms and plays a role in the response to proliferation stimuli. Dachshund also contains this domain. It is involved in various aspects of development.	100
308188	pfam02438	Adeno_100	Late 100kD protein. The late 100kD protein is a non-structural viral protein involved in the transport of hexon from the cytoplasm to the nucleus.	591
111345	pfam02439	Adeno_E3_CR2	Adenovirus E3 region protein CR2. Early region 3 (E3) of human adenoviruses (Ads) codes for proteins that appear to control viral interactions with the host. This region called CR2 (conserved region 2) is found in Adenovirus type 19 (a subgroup D virus) 49 Kd protein in the E3 region. CR2 is also found in the 20.1 Kd protein of subgroup B adenoviruses. The function of this 50 amino acid region is unknown.	38
367088	pfam02440	Adeno_E3_CR1	Adenovirus E3 region protein CR1. Early region 3 (E3) of human adenoviruses (Ads) codes for proteins that appear to control viral interactions with the host. This region called CR1 (conserved region 1) is found three times in Adenovirus type 19 (a subgroup D virus) 49 Kd protein in the E3 region. CR1 is also found in the 20.1 Kd protein of subgroup B adenoviruses. The function of this 80 amino acid region is unknown. This region is probably a divergent immunoglobulin domain (A. Bateman pers. observation).	95
396830	pfam02441	Flavoprotein	Flavoprotein. This family contains diverse flavoprotein enzymes. This family includes epidermin biosynthesis protein, EpiD, which has been shown to be a flavoprotein that binds FMN. This enzyme catalyzes the removal of two reducing equivalents from the cysteine residue of the C-terminal meso-lanthionine of epidermin to form a --C==C-- double bond. This family also includes the B chain of dipicolinate synthase a small polar molecule that accumulates to high concentrations in bacterial endospores, and is thought to play a role in spore heat resistance, or the maintenance of heat resistance. dipicolinate synthase catalyzes the formation of dipicolinic acid from dihydroxydipicolinic acid. This family also includes phenyl-acrylic acid decarboxylase (EC:4.1.1.-).	179
396831	pfam02442	L1R_F9L	Lipid membrane protein of large eukaryotic DNA viruses. The four families of large eukaryotic DNA viruses, Poxviridae, Asfarviridae, Iridoviridae, and Phycodnaviridae, referred to collectively as nucleocytoplasmic large DNA viruses or NCLDV, have all been shown to have a lipid membrane, in spite of the major differences in virion structure. The paralogous genes L1R and F9L encode membrane proteins that have a conserved domain architecture, with a single, C-terminal transmembrane helix, and an N-terminal, multiple-disulfide-bonded domain. The conservation of the myristoylated, disulfide-bonded protein L1R/F9L in most of the NCLDV correlates with the conservation of the thiol-disulfide oxidoreductase E10R which, in vaccinia virus, is required for the formation of disulfide bonds in L1R and F9L.	183
308191	pfam02443	Circo_capsid	Circovirus capsid protein. Circoviruses are small circular single stranded viruses. This family is the capsid protein from viruses such as porcine circovirus and beak and feather disease virus. These proteins are about 220 amino acids long.	200
280583	pfam02444	HEV_ORF1	Hepatitis E virus ORF-2 (Putative capsid protein). The Hepatitis E virus (HEV) genome is a single-stranded, positive-sense RNA molecule of approximately 7.5 kb. Three open reading frames (ORF) were identified within the HEV genome: ORF1 encodes non-structural proteins, ORF2 encodes the putative structural protein(s), and ORF3 encodes a protein of unknown function. ORF2 contains a consensus signal peptide sequence at its amino terminus and a capsid-like region with a high content of basic amino acids similar to that seen with other virus capsid proteins.	114
396832	pfam02445	NadA	Quinolinate synthetase A protein. Quinolinate synthetase catalyzes the second step of the de novo biosynthetic pathway of pyridine nucleotide formation. In particular, quinolinate synthetase is involved in the condensation of dihydroxyacetone phosphate and iminoaspartate to form quinolinic acid. This synthesis requires two enzymes, a FAD-containing "B protein" and an "A protein".	287
396833	pfam02446	Glyco_hydro_77	4-alpha-glucanotransferase. These enzymes EC:2.4.1.25 transfer a segment of a (1,4)-alpha-D-glucan to a new 4-position in an acceptor, which may be glucose or (1,4)-alpha-D-glucan.	460
308194	pfam02447	GntP_permease	GntP family permease. This is a family of integral membrane permeases that are involved in gluconate uptake. E. coli contains several members of this family including GntU, a low affinity transporter, and GntT, a high affinity transporter.	440
367089	pfam02448	L71	L71 family. This family of insect proteins are each about 100 amino acids long and have 6 conserved cysteine residues. They all have a predicted signal peptide and are probably excreted. The function of the proteins is unknown.	70
396834	pfam02449	Glyco_hydro_42	Beta-galactosidase. This group of beta-galactosidase enzymes belongs to the glycosyl hydrolase 42 family. The enzyme catalyzes the hydrolysis of terminal, non-reducing beta-D-galactose residues.	376
396835	pfam02450	LCAT	Lecithin:cholesterol acyltransferase. Lecithin:cholesterol acyltransferase (LCAT) is involved in extracellular metabolism of plasma lipoproteins, including cholesterol.	383
308198	pfam02451	Nodulin	Nodulin. Nodulin is a plant protein of unknown function. It is induced during nodulation in legume roots after rhizobium infection.	188
396836	pfam02452	PemK_toxin	PemK-like, MazF-like toxin of type II toxin-antitoxin system. PemK is a growth inhibitor in E. coli known to bind to the promoter region of the Pem operon, auto-regulating synthesis. This family represents the toxin molecule of a typical bacterial toxin-antitoxin system pairing. The family includes a number of different toxins, such as MazF, Kid, PemK, ChpA, ChpB and ChpAK.	108
396837	pfam02453	Reticulon	Reticulon. Reticulon, also known as neuroendocrine-specific protein (NSP), is a protein of unknown function which associates with the endoplasmic reticulum. This family represents the C-terminal domain of the three reticulon isoforms and their homologs.	157
111360	pfam02454	Sigma_1s	Sigma 1s protein. The reoviral gene S1 encodes for haemagglutinin (sigma 1 protein), an outer capsid protein and a major factor in determining virus-host cell interactions. Sigma 1s is one of two translation products of the S1 gene.	116
308201	pfam02455	Hex_IIIa	Hexon-associated protein (IIIa). The major capsid protein of the adenovirus strain is also known as a hexon. This is a family of hexon-associated proteins (protein IIIa).	539
280594	pfam02456	Adeno_IVa2	Adenovirus IVa2 protein. IVa2 protein can interact with the adenoviral packaging signal, and this interaction involves DNA sequences that have previously been demonstrated to be required for packaging. During the course of lytic infection, the adenovirus major late promoter (MLP) is induced to high levels after replication of viral DNA has started. IVa2 is a transcriptional activator of the major late promoter.	370
396838	pfam02457	DisA_N	DisA bacterial checkpoint controller nucleotide-binding. The DisA protein is a bacterial checkpoint protein that dimerizes into an octameric complex. The protein consists of three distinct domains. This domain is the first and is a globular, nucleotide-binding region; the next 146-289 residues constitute the DisA-linker family, pfam10635, that consists of an elongated bundle of three alpha helices (alpha-6, alpha-10, and alpha-11), one side of which carries an additional three helices (alpha7-9), which thus forms a spine like-linker between domains 1 and 3. The C-terminal residues, of domain 3, are represented by family HHH, pfam00633, the specific DNA-binding domain. The octameric complex thus has structurally linked nucleotide-binding and DNA-binding HhH domains and the nucleotide-binding domains are bound to a cyclic di-adenosine phosphate such that DisA is a specific di-adenylate cyclase. The di-adenylate cyclase activity is strongly suppressed by binding to branched DNA, but not to duplex or single-stranded DNA, suggesting a role for DisA as a monitor of the presence of stalled replication forks or recombination intermediates via DNA structure-modulated c-di-AMP synthesis.	114
280596	pfam02458	Transferase	Transferase family. This family includes a number of transferase enzymes. These include anthranilate N-hydroxycinnamoyl/benzoyltransferase that catalyzes the first committed reaction of phytoalexin biosynthesis. Deacetylvindoline 4-O-acetyltransferase EC:2.3.1.107 catalyzes the last step in vindoline biosynthesis is also a member of this family. The motif HXXXD is probably part of the active site. The family also includes trichothecene 3-O-acetyltransferase.	434
280597	pfam02459	Adeno_terminal	Adenoviral DNA terminal protein. This protein is covalently attached to the termini of replicating DNA in vivo.	543
308203	pfam02460	Patched	Patched family. The transmembrane protein Patched is a receptor for the morphogen Sonic Hedgehog. This protein associates with the smoothened protein to transduce hedgehog signals.	793
396839	pfam02461	AMO	Ammonia monooxygenase. Ammonia monooxygenase plays a key role in the nitrogen cycle and degrades a wide range of hydrocarbons and halogenated hydrocarbons.	238
308205	pfam02462	Opacity	Opacity family porin protein. Pathogenic Neisseria spp. possess a repertoire of phase-variable Opacity proteins that mediate various pathogen--host cell interactions. These proteins are integral membrane proteins related to other porins.	126
308206	pfam02463	SMC_N	RecF/RecN/SMC N terminal domain. This domain is found at the N-terminus of SMC proteins. The SMC (structural maintenance of chromosomes) superfamily proteins have ATP-binding domains at the N- and C-termini, and two extended coiled-coil domains separated by a hinge in the middle. The eukaryotic SMC proteins form two kinds of heterodimers: the SMC1/SMC3 and the SMC2/SMC4 types. These heterodimers constitute an essential part of higher order complexes, which are involved in chromatin and DNA dynamics. This family also includes the RecF and RecN proteins that are involved in DNA metabolism and recombination.	1162
396840	pfam02464	CinA	Competence-damaged protein. CinA is the first gene in the competence-inducible (cin) operon, and is thought to be specifically required at some stage in the process of transformation. This Pfam family consists of putative competence-damaged proteins from the cin operon. Some members of this family have nicotinamide mononucleotide (NMN) deamidase activity.	155
396841	pfam02465	FliD_N	Flagellar hook-associated protein 2 N-terminus. The flagellar hook-associated protein 2 (HAP2 or FliD) forms the distal end of the flagella, and plays a role in mucin specific adhesion of the bacteria. This alignment covers the N-terminal region of this family of proteins.	97
396842	pfam02466	Tim17	Tim17/Tim22/Tim23/Pmp24 family. The pre-protein translocase of the mitochondrial outer membrane (Tom) allows the import of pre-proteins from the cytoplasm. Tom forms a complex with a number of proteins, including Tim17. Tim17 and Tim23 are thought to form the translocation channel of the inner membrane. This family includes Tim17, Tim22 and Tim23. This family also includes Pmp24 a peroxisomal protein. The involvement of this domain in the targeting of PMP24 remains to be proved. PMP24 was known as Pmp27 in.	111
396843	pfam02467	Whib	Transcription factor WhiB. WhiB is a putative transcription factor in Actinobacteria, required for differentiation and sporulation.	65
396844	pfam02468	PsbN	Photosystem II reaction centre N protein (psbN). This is a family of small proteins encoded on the chloroplast genome. psbN is involved in photosystem II during photosynthesis, but its exact role is unknown.	43
396845	pfam02469	Fasciclin	Fasciclin domain. This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria.	123
396846	pfam02470	MlaD	MlaD protein. This family of proteins contains MlaD, which is a component of the Mla pathway, an ABC transport system that functions to maintain the asymmetry of the outer membrane. The family also contains the mce (mammalian cell entry) proteins from Mycobacterium tuberculosis. The archetype (Rv0169), was isolated as being necessary for colonisation of, and survival within, the macrophage. This family contains proteins of unknown function from other bacteria.	81
396847	pfam02471	OspE	Borrelia outer surface protein E. This is a family of outer surface proteins (Osp) from the Borrelia spirochete. The family includes OspE, and OspEF-related proteins (Erp). These proteins are coded for on different circular plasmids in the Borrelia genome.	107
396848	pfam02472	ExbD	Biopolymer transport protein ExbD/TolR. This group of proteins are membrane bound transport proteins essential for ferric ion uptake in bacteria. The Pfam family consists of ExbD, and TolR which are involved in TonB-dependent transport of various receptor bound substrates including colicins.	128
396849	pfam02474	NodA	Nodulation protein A (NodA). Rhizobia nodulation (nod) genes control the biosynthesis of Nod factors required for infection and nodulation of their legume hosts. Nodulation protein A (NodA) is a N-acetyltransferase involved in production of Nod factors that stimulate mitosis in various plant protoplasts.	195
396850	pfam02475	Met_10	Met-10+ like-protein. The methionine-10 mutant allele of N. crassa codes for a protein of unknown function. However, homologous proteins have been found in yeast, suggesting this protein may be involved in methionine biosynthesis, transport and/or utilisation.	198
396851	pfam02476	US2	US2 family. This is a family of unique short (US) region proteins from the herpesvirus strain. The US2 family have no known function.	124
396852	pfam02477	Nairo_nucleo	Nucleocapsid N protein. The nucleoprotein of the ssRNA negative-strand Nairovirus is an internal part of the virus particle.	443
280615	pfam02478	Pneumo_phosprot	Pneumovirus phosphoprotein. This family represents the phosphoprotein of Paramyxoviridae, a putative RNA polymerase alpha subunit that may function in template binding.	286
280616	pfam02479	Herpes_IE68	Herpesvirus immediate early protein. This regulatory protein is expressed from an immediate early gene in the cell cycle of herpesvirus. The protein is known by various names including IE-68, US1, ICP22 and IR4.	132
280617	pfam02480	Herpes_gE	Alphaherpesvirus glycoprotein E. Glycoprotein E (gE) of Alphaherpesvirus forms a complex with glycoprotein I (gI) (pfam01688), functioning as an immunoglobulin G (IgG) Fc binding protein. gE is involved in virus spread but is not essential for propagation.	432
396853	pfam02481	DNA_processg_A	DNA recombination-mediator protein A. The SMF family, of DNA processing chain A, dprA, are a group of bacterial proteins. In H. pylori, dprA is required for natural chromosomal and plasmid transformation. It has now been shown that DprA is found to bind cooperatively to single-stranded DNA (ssDNA) and to interact with RecA. In the process, DprA-RecA-ssDNA filaments are produced and these filaments catalyze the homology-dependent formation of joint molecules. While the E.coli SSB protein limits access of RecA to ssDNA, DprA alleviates this barrier. It is proposed that DprA is a new member of the recombination-mediator protein family, dedicated to natural bacterial transformation.	210
396854	pfam02482	Ribosomal_S30AE	Sigma 54 modulation protein / S30EA ribosomal protein. This Pfam family contains the sigma-54 modulation protein family and the S30AE family of ribosomal proteins which includes the light-repressed protein (lrtA).	92
280620	pfam02484	Rhabdo_NV	Rhabdovirus Non-virion protein. Infectious hematopoietic necrosis virus (IHNV) is a member of the family Rhabdoviridae. The non-virion protein (NV) is coded for by one of the six genes of the IHNV genome, but is absent in vesiculovirus-like rhabdovirus.	111
396855	pfam02485	Branch	Core-2/I-Branching enzyme. This is a family of two different beta-1,6-N-acetylglucosaminyltransferase enzymes, I-branching enzyme and core-2 branching enzyme. I-branching enzyme is responsible for the production of the blood group I-antigen during embryonic development. Core-2 branching enzyme forms crucial side-chain branches in O-glycans. This is a family of glycosyl-transferases that are Type II membrane proteins that are found in the endoplasmic reticulum (ER) and Golgi apparatus.	250
396856	pfam02486	Rep_trans	Replication initiation factor. Plasmid replication is initiated by the replication initiation factor (REP). This family represents a probable topoisomerase that makes a sequence-specific single-stranded nick in the plasmid DNA at the origin of replication. Human proteins also belong to this family, including myelin transcription factor 2 and cerebrin-50.	201
396857	pfam02487	CLN3	CLN3 protein. This is a family of proteins from the CLN3 gene. A mis-sense mutation of glutamic acid (E) to lysine (K) at position 295 in the human protein has been implicated in Juvenile neuronal ceroid lipofuscinosis (Batten disease). Batten disease is characterized by the accumulation of autofluorescent material in the lysosomes of most cells. Members of this family are transmembrane proteins functional in pre-vacuolar compartments. The protein in Sch.pombe is found to be localized to the vacuolar membrane, and a lack of functional protein clearly affects the size and pH of the vacuole. Thus the protein is necessary for vacuolar homeostasis. It is important for localization of late endosomal/lysosomal compartments, and it interacts with motor components driving both plus and minus end microtubular trafficking: tubulin, dynactin, dynein and kinesin-2.	384
280624	pfam02488	EMA	Merozoite Antigen. This family represents the immunodominant surface antigen of Theileria parasites including equi merozoite antigen-1 (EMA-1) and equi merozoite antigen-2 (EMA-2). The protein shows variation at a putative glycosylation site, a potential mechanism for host immune response evasion.	250
396858	pfam02489	Herpes_glycop_H	Herpesvirus glycoprotein H main domain. Herpesvirus glycoprotein H (gH) is a virion associated envelope glycoprotein. Complex formation between gH and gL has been demonstrated in both virions and infected cells.	500
396859	pfam02491	SHS2_FTSA	SHS2 domain inserted in FTSA. FtsA is essential for bacterial cell division, and co-localizes to the septal ring with FtsZ. The SHS2 domain is inserted in to the RNAseH fold of FtsA, and is involved in protein-protein interaction.	73
396860	pfam02492	cobW	CobW/HypB/UreG, nucleotide-binding domain. This domain is found in HypB, a hydrogenase expression / formation protein, and UreG a urease accessory protein. Both these proteins contain a P-loop nucleotide binding motif. HypB has GTPase activity and is a guanine nucleotide binding protein. It is not known whether UreG binds GTP or some other nucleotide. Both enzymes are involved in nickel binding. HypB can store nickel and is required for nickel dependent hydrogenase expression. UreG is required for functional incorporation of the urease nickel metallocenter. GTP hydrolysis may be required by these proteins for nickel incorporation into other nickel proteins. This family of domains also contains P47K, a Pseudomonas chlororaphis protein needed for nitrile hydratase expression, and the cobW gene product, which may be involved in cobalamin biosynthesis in Pseudomonas denitrificans.	179
308220	pfam02493	MORN	MORN repeat. The MORN (Membrane Occupation and Recognition Nexus) repeat is found in multiple copies in several proteins including junctophilins (see Takeshima et al. Mol. Cell 2000;6:11-22). A MORN-repeat protein has been identified as a dynamic component of the cell division apparatus in the parasite Toxoplasma gondii. It has been hypothesized to function as a linker protein between certain membrane regions and the parasite's cytoskeleton.	23
367105	pfam02494	HYR	HYR domain. This domain is known as the HYR (Hyalin Repeat) domain, after the protein hyalin that is composed exclusively of this repeat. This domain probably corresponds to a new superfamily in the immunoglobulin fold. The function of this domain is uncertain; it may be involved in cell adhesion.	81
396861	pfam02495	7kD_coat	7kD viral coat protein. This family consists of a 7kD coat protein from carlavirus and potexvirus.	59
396862	pfam02496	ABA_WDS	ABA/WDS induced protein. This is a family of plant proteins induced by water deficit stress (WDS), or abscisic acid (ABA) stress and ripening.	78
111400	pfam02497	Arteri_GP4	Arterivirus glycoprotein. This is a family of structural glycoproteins from arterivirus that corresponds to open reading frame 4 (ORF4) of the virus.	178
376797	pfam02498	Bro-N	BRO family, N-terminal domain. This family includes the N-terminus of baculovirus BRO and ALI motif proteins. The function of BRO proteins is unknown. It has been suggested that BRO-A and BRO-C are DNA binding proteins that influence host DNA replication and/or transcription. This Pfam domain does not include the characteristic invariant alanine, leucine, isoleucine motif of the ALI proteins.	96
280633	pfam02499	DNA_pack_C	Probable DNA packing protein, C-terminus. This family includes proteins that are probably involved in DNA packing in herpesvirus. This domain is found at the C-terminus of the protein.	348
280634	pfam02500	DNA_pack_N	Probable DNA packing protein, N-terminus. This family includes proteins that are probably involved in DNA packing in herpesvirus. This domain is normally found at the N-terminus of the protein.	277
396863	pfam02501	T2SSI	Type II secretion system (T2SS), protein I. The Type II secretion system, also called Secretion-dependent pathway (SDP), is responsible for the transport of proteins across the outer membrane first exported to the periplasm by the Sec or Tat translocon in Gram-negative (diderm) bacteria. As members of the T2SJ family, members of the T2SI family are pseudopilins containing prepilin signal sequences.	80
396864	pfam02502	LacAB_rpiB	Ribose/Galactose Isomerase. This family of proteins contains the sugar isomerase enzymes ribose 5-phosphate isomerase B (rpiB), galactose isomerase subunit A (LacA) and galactose isomerase subunit B (LacB).	134
396865	pfam02503	PP_kinase	Polyphosphate kinase middle domain. Polyphosphate kinase (Ppk) catalyzes the formation of polyphosphate from ATP, with chain lengths of up to a thousand or more orthophosphate molecules.	199
396866	pfam02504	FA_synthesis	Fatty acid synthesis protein. The plsX gene is part of the bacterial fab gene cluster which encodes several key fatty acid biosynthetic enzymes. The exact function of the plsX protein in fatty acid synthesis is unknown.	324
396867	pfam02505	MCR_D	Methyl-coenzyme M reductase operon protein D. Methyl coenzyme M reductase (MCR) catalyzes the final step in methanogenesis. MCR is composed of three subunits, alpha (pfam02249), beta (pfam02241) and gamma (pfam02240). Genes encoding the beta (mcrB) and gamma (mcrG) subunits are separated by two open reading frames coding for two proteins C and D. The function of proteins C and D (this family) is unknown.	142
367108	pfam02507	PSI_PsaF	Photosystem I reaction centre subunit III. Photosystem I (PSI) is an integral membrane protein complex that uses light energy to mediate electron transfer from plastocyanin to ferredoxin. Subunit III (or PSI-F) is one of at least 14 different subunits that compose the PSI complex.	159
396868	pfam02508	Rnf-Nqr	Rnf-Nqr subunit, membrane protein. This is a family of integral membrane proteins including Rhodobacter-specific nitrogen fixation (rnf) proteins RnfA and RnfE and Na+-translocating NADH:ubiquinone oxidoreductase (Na+-NQR) subunits NqrD and NqrE.	181
396869	pfam02509	Rota_NS35	Rotavirus non-structural protein 35. Rotavirus non-structural protein 35 (NS35) is a basic protein which possesses RNA-binding activity and is essential for genome replication.	317
280643	pfam02510	SPAN	Surface presentation of antigens protein. Surface presentation of antigens protein (SPAN), also known as invasion protein invJ, is a Salmonella secretory pathway protein involved in presentation of determinants required for mammalian host cell invasion.	336
396870	pfam02511	Thy1	Thymidylate synthase complementing protein. Thymidylate synthase complementing protein (Thy1) complements the thymidine growth requirement of the organisms in which it is found, but shows no homology to thymidylate synthase. The bacterial members of this family at least are flavin-dependent thymidylate synthases.	185
280645	pfam02512	UK	Virulence determinant. The UK protein is an African swine fever virus (ASFV) protein that is highly conserved amongst strains, and is an important viral virulence determinant for domestic pigs.	96
396871	pfam02513	Spin-Ssty	Spin/Ssty Family. Spindlin (Spin) is a novel maternal transcript present in the unfertilized egg and early embryo. The Y-linked spermiogenesis-specific transcript (Ssty) is also expressed during gametogenesis and forms part of this Pfam family. Members of this family contain three copies of this 50 residue repeat. The repeat is predicted to contain four beta strands.	49
376805	pfam02514	CobN-Mg_chel	CobN/Magnesium Chelatase. This family contains a domain common to the cobN protein and to magnesium protoporphyrin chelatase. CobN is implicated in the conversion of hydrogenobyrinic acid a,c-diamide to cobyrinic acid. Magnesium protoporphyrin chelatase is involved in chlorophyll biosynthesis.	1051
396872	pfam02515	CoA_transf_3	CoA-transferase family III. CoA-transferases are found in organisms from all lines of descent. Most of these enzymes belong to two well-known enzyme families, but recent work on unusual biochemical pathways of anaerobic bacteria has revealed the existence of a third family of CoA-transferases. The members of this enzyme family differ in sequence and reaction mechanism from CoA-transferases of the other families. Currently known enzymes of the new family are a formyl-CoA: oxalate CoA-transferase, a succinyl-CoA: (R)-benzylsuccinate CoA-transferase, an (E)-cinnamoyl-CoA: (R)-phenyllactate CoA-transferase, and a butyrobetainyl-CoA: (R)-carnitine CoA-transferase. In addition, a large number of proteins of unknown or differently annotated function from Bacteria, Archaea and Eukarya apparently belong to this enzyme family. Properties and reaction mechanisms of the CoA-transferases of family III are described and compared to those of the previously known CoA-transferases.	367
396873	pfam02516	STT3	Oligosaccharyl transferase STT3 subunit. This family consists of the oligosaccharyl transferase STT3 subunit and related proteins. The STT3 subunit is part of the oligosaccharyl transferase (OTase) complex of proteins and is required for its activity. In eukaryotes, OTase transfers a lipid-linked core-oligosaccharide to selected asparagine residues in the ER. In the archaea STT3 occurs alone, rather than in an OTase complex, and is required for N-glycosylation of asparagines.	478
396874	pfam02517	Abi	CAAX protease self-immunity. Members of this family are probably proteases (after a isoprenyl group is attached to the Cys residue in the C-terminal CAAX motif of a protein to attach it to the membrane, the AAX tripeptide being removed by one of the CAAX prenyl proteases). The family contains the CAAX prenyl protease. The proteins contain a highly conserved Glu-Glu motif at the amino end of the alignment. The alignment also contains two histidine residues that may be involved in zinc binding. While they are involved in membrane anchoring of proteins in eukaryotes, little is known about their function in prokaryotes. In some known bacteriocin loci, Abi genes have been found downstream of bacteriocin structural genes where they are probably involved in self-immunity. Investigation of the bacteriocin-like loci in the Gram positive bacteria locus from Lactobacillus sakei 23K confirmed that the bacteriocin-like genes (sak23Kalphabeta) exhibited antimicrobial activity when expressed in a heterologous host and that the associated Abi gene (sak23Ki) conferred immunity against the cognate bacteriocin. Interestingly, the immunity genes from three similar systems conferred a high degree of cross-immunity against each other's bacteriocins, suggesting the recognition of a common receptor. Site-directed mutagenesis demonstrated that the conserved motifs constituting the putative proteolytic active site of the Abi proteins are essential for the immunity function of Sak23Ki - thus a new concept in self-immunity.	92
396875	pfam02518	HATPase_c	Histidine kinase-, DNA gyrase B-, and HSP90-like ATPase. This family represents the structurally related ATPase domains of histidine kinase, DNA gyrase B and HSP90.	111
396876	pfam02519	Auxin_inducible	Auxin responsive protein. This family consists of the protein products of the ARG7 auxin responsive genes family none of which have any identified functional role.	92
396877	pfam02520	DUF148	Domain of unknown function DUF148. This domain has no known function nor do any of the proteins that possess it. In one member of this family the aligned region is repeated twice.	107
334957	pfam02521	HP_OMP_2	Putative outer membrane protein. This family consists of putative outer membrane proteins from Helicobacter pylori (campylobacter pylori).	442
396878	pfam02522	Antibiotic_NAT	Aminoglycoside 3-N-acetyltransferase. This family consists of bacterial aminoglycoside 3-N-acetyltransferases EC:2.3.1.81, these catalyze the reaction: Acetyl-CoA + a 2-deoxystreptamine antibiotic <=> CoA + N3'-acetyl-2-deoxystreptamine antibiotic. The enzyme can use a range of antibiotics with 2-deoxystreptamine rings as acceptor for its acetyltransferase activity, this inactivates and confers resistance to gentamicin, kanamycin, tobramycin, neomycin and apramycin amongst others.	230
280656	pfam02524	KID	KID repeat. This family contains the KID repeat as found in Borrelia spirochete RepA / Rep+ proteins. The function of these proteins is unknown. RepA and related Borrelia proteins have been suggested to play an important genus-wide role in the biology of the Borrelia.	11
396879	pfam02525	Flavodoxin_2	Flavodoxin-like fold. This family consists of a domain with a flavodoxin-like fold. The family includes bacterial and eukaryotic NAD(P)H dehydrogenase (quinone) EC:1.6.99.2. These enzymes catalyze the NAD(P)H-dependent two-electron reductions of quinones and protect cells against damage by free radicals and reactive oxygen species. This enzyme uses a FAD co-factor. The equation for this reaction is:- NAD(P)H + acceptor <=> NAD(P)(+) + reduced acceptor. This enzyme is also involved in the bioactivation of prodrugs used in chemotherapy. The family also includes acyl carrier protein phosphodiesterase EC:3.1.4.14. This enzyme converts holo-ACP to apo-ACP by hydrolytic cleavage of the phosphopantetheine residue from ACP. This family is related to pfam03358 and pfam00258.	190
280658	pfam02526	GBP_repeat	Glycophorin-binding protein. This family contains glycophorin binding proteins from P. falciparum the malarial parasite. Glycophorin is a cell surface protein of erythrocytes. The Glycophorin binding protein contains a tandem 38 residue repeat. In Plasmodium falciparum GBP the repeat occurs 11 times.	38
396880	pfam02527	GidB	rRNA small subunit methyltransferase G. This is a family of bacterial glucose inhibited division proteins; these are probably involved in the regulation of cell division. GidB has been shown to be a methyltransferase G specific to the rRNA small subunit. Previously identified as a glucose-inhibited division protein B that appears to be present and in a single copy in all complete eubacterial genomes so far sequenced. GidB specifically methylates the N7 position of a guanosine in 16S rRNA.	184
396881	pfam02529	PetG	Cytochrome B6-F complex subunit 5. This family consists of cytochrome B6-F complex subunit 5 (PetG). The cytochrome bf complex found in green plants, eukaryotic algae and cyanobacteria, connects photosystem I to photosystem II in the electron transport chain, functioning as a plastoquinol:plastocyanin/cytochrome c6 oxidoreductase. PetG or subunit 5 is associated with the bf complex and the absence of PetG affects either the assembly or stability of the cytochrome bf complex in Chlamydomonas reinhardtii.	36
308243	pfam02530	Porin_2	Porin subfamily. This family consists of porins from the alpha subdivision of Proteobacteria; the members of this family are related to pfam00267. The porins form large aqueous channels in the cell membrane allowing the selective entry of hydrophilic compounds; this so-called 'molecular sieve' is found in the cell walls of gram negative bacteria.	355
396882	pfam02531	PsaD	PsaD. This family consists of PsaD from plants and cyanobacteria. PsaD is an extrinsic polypeptide of photosystem I (PSI) and is required for native assembly of PSI reaction clusters and is implicated in the electrostatic binding of ferredoxin within the reaction centre. PsaD forms a dimer in solution which is bound by PsaE however PsaD is monomeric in its native complexed PSI environment.	133
308245	pfam02532	PsbI	Photosystem II reaction centre I protein (PSII 4.8 kDa protein). This family consists of various Photosystem II (PSII) reaction centre I proteins or PSII 4.8 kDa proteins, PsbI, from the chloroplast genome of many plants and Cyanobacteria. PsbI is a small, integral membrane component of PSII the role of which is not clear. Synechocystis mutants lacking PsbI have 20-30% loss of PSII activity however the PSII complex is not destabilized.	36
396883	pfam02533	PsbK	Photosystem II 4 kDa reaction centre component. This family consists of various photosystem II 4 kDa reaction centre components (PsbK) from plant and Cyanobacteria. The photosystem II reaction centre is responsible for catalyzing the core photosynthesis reaction the light-induced splitting of water and the consequential release of dioxygen. In C. reinhardtii the psbK product is required for the stable assembly and/or stability of the photosystem II complex.	41
367119	pfam02534	T4SS-DNA_transf	Type IV secretory system Conjugative DNA transfer. These proteins contain a P-loop and walker-B site for nucleotide binding. TraG is essential for DNA transfer in bacterial conjugation. These proteins are thought to mediate interactions between the DNA-processing (Dtr) and the mating pair formation (Mpf) systems. The C-terminus of this domain interacts with the relaxosome component TraM via the latter's tetramerisation domain. TraD is a hexameric ring ATPase that forms the cytoplasmic face of the conjugative pore. The family contains a number of different DNA transfer proteins.	468
396884	pfam02535	Zip	ZIP Zinc transporter. The ZIP family consists of zinc transport proteins and many putative metal transporters. The main contribution to this family is from the Arabidopsis thaliana ZIP protein family these proteins are responsible for zinc uptake in the plant. Also found within this family are C. elegans proteins of unknown function which are annotated as being similar to human growth arrest inducible gene product, although this protein in not found within this family.	325
396885	pfam02536	mTERF	mTERF. This family contains one sequence of known function Human mitochondrial transcription termination factor (mTERF) the rest of the family consists of hypothetical proteins none of which have any functional information. mTERF is a multizipper protein possessing three putative leucine zippers one of which is bipartite. The protein binds DNA as a monomer. The leucine zippers are not implicated in a dimerization role as in other leucine zippers.	313
396886	pfam02537	CRCB	CrcB-like protein, Camphor Resistance (CrcB). CRCB is a family of bacterial integral membrane proteins with four TMs. Overexpression in E. coli also leads to camphor resistance.	109
396887	pfam02538	Hydantoinase_B	Hydantoinase B/oxoprolinase. This family includes N-methylhydantoinase B which converts hydantoin to N-carbamyl-amino acids, and 5-oxoprolinase EC:3.5.2.9 which catalyzes the formation of L-glutamate from 5-oxo-L-proline. These enzymes are part of the oxoprolinase family and are related to pfam01968.	505
396888	pfam02540	NAD_synthase	NAD synthase. NAD synthase (EC:6.3.5.1) is involved in the de novo synthesis of NAD and is induced by stress factors such as heat shock and glucose limitation.	241
396889	pfam02541	Ppx-GppA	Ppx/GppA phosphatase family. This family consists of the N-terminal region of exopolyphosphatase (Ppx) EC:3.6.1.11 and guanosine pentaphosphate phospho-hydrolase (GppA) EC:3.6.1.40.	285
396890	pfam02542	YgbB	YgbB family. The ygbB protein is a putative enzyme of deoxy-xylulose pathway (terpenoid biosynthesis).	155
280673	pfam02543	Carbam_trans_N	Carbamoyltransferase N-terminus. This domain is found in NodU from Rhizobium, CmcH from Nocardia lactamdurans and the bifunctional carbamoyltransferase TobZ from Streptoalloteichus tenebrarius. NodU a Rhizobium nodulation protein involved in the synthesis of nodulation factors has 6-O-carbamoyltransferase-like activity. CmcH is involved in cephamycin (antibiotic) biosynthesis and has 3-hydroxymethylcephem carbamoyltransferase activity, EC:2.1.3.7 catalyzing the reaction: Carbamoyl phosphate + 3-hydroxymethylceph-3-EM-4-carboxylate <=> phosphate + 3-carbamoyloxymethylcephem. TobZ functions as an ATP carbamoyltransferase and tobramycin carbamoyltransferase. These proteins contain two domains, this is the larger, N-terminal, domain.	336
251363	pfam02544	Steroid_dh	3-oxo-5-alpha-steroid 4-dehydrogenase. This family consists of 3-oxo-5-alpha-steroid 4-dehydrogenases, EC:1.3.99.5, also known as steroid 5-alpha-reductases. The reaction catalyzed by this enzyme is: 3-oxo-5-alpha-steroid + acceptor <=> 3-oxo-delta(4)-steroid + reduced acceptor. The steroid 5-alpha-reductase enzyme is responsible for the formation of dihydrotestosterone; this hormone promotes the differentiation of male external genitalia and the prostate during fetal development. In humans, mutations in this enzyme can cause a form of male pseudohermaphroditism in which the external genitalia and prostate fail to develop normally. A related enzyme also found in plants is DET2, a steroid reductase from Arabidopsis. Mutations in this enzyme cause defects in light-regulated development.	150
396891	pfam02545	Maf	Maf-like protein. Maf is a putative inhibitor of septum formation in eukaryotes, bacteria, and archaea.	183
396892	pfam02547	Queuosine_synth	Queuosine biosynthesis protein. Queuosine (Q) biosynthesis protein, or S-adenosylmethionine:tRNA -ribosyltransferase-isomerase, is required for the synthesis of the queuosine precursor (oQ). It catalyzes the transfer and isomerisation of the ribose moiety from AdoMet to the 7-aminomethyl group of 7-deazaguanine (preQ1-tRNA) to form epoxyqueuosine (oQ-tRNA). Q is a hypermodified nucleoside usually found at the first position of the anticodon of asparagine, aspartate, histidine, and tyrosine tRNAs. In Streptococcus gordonii, QueA has been shown to play a role in the regulation of arginine deiminase genes.	336
396893	pfam02548	Pantoate_transf	Ketopantoate hydroxymethyltransferase. Ketopantoate hydroxymethyltransferase (EC:2.1.2.11) is the first enzyme in the pantothenate biosynthesis pathway.	259
251367	pfam02550	AcetylCoA_hydro	Acetyl-CoA hydrolase/transferase N-terminal domain. This family contains several enzymes which take part in pathways involving acetyl-CoA. Acetyl-CoA hydrolase EC:3.1.2.1 catalyzes the formation of acetate from acetyl-CoA, CoA transferase (CAT1) EC:2.8.3.- produces succinyl-CoA, and acetate-CoA transferase EC:2.8.3.8 utilizes acyl-CoA and acetate to form acetyl-CoA.	198
396894	pfam02551	Acyl_CoA_thio	Acyl-CoA thioesterase. This family represents the thioesterase II domain. Two copies of this domain are found in a number of acyl-CoA thioesterases.	132
251369	pfam02552	CO_dh	CO dehydrogenase beta subunit/acetyl-CoA synthase epsilon subunit. This family consists of Carbon monoxide dehydrogenase I/II beta subunit EC:1.2.99.2 and acetyl-CoA synthase epsilon subunit. Carbon monoxide beta subunit catalyzes the reaction: CO + H2O + acceptor <=> CO2 + reduced acceptor.	168
396895	pfam02553	CbiN	Cobalt transport protein component CbiN. CbiN is part of the active cobalt transport system involved in uptake of cobalt into the cell for cobalamin (vitamin B12) biosynthesis. It has been suggested that CbiN may function as the periplasmic binding protein component of the active cobalt transport system.	67
396896	pfam02554	CstA	Carbon starvation protein CstA. This family consists of Carbon starvation protein CstA a predicted membrane protein. It has been suggested that CstA is involved in peptide utilisation.	372
396897	pfam02556	SecB	Preprotein translocase subunit SecB. This family consists of preprotein translocase subunit SecB. SecB is required for the normal export of envelope proteins out of the cell cytoplasm.	137
396898	pfam02557	VanY	D-alanyl-D-alanine carboxypeptidase. 	131
396899	pfam02558	ApbA	Ketopantoate reductase PanE/ApbA. This is a family of 2-dehydropantoate 2-reductases also known as ketopantoate reductases, EC:1.1.1.169. The reaction catalyzed by this enzyme is: (R)-pantoate + NADP(+) <=> 2-dehydropantoate + NADPH. ApbA catalyzes the NADPH reduction of ketopantoic acid to pantoic acid in the alternative pyrimidine biosynthetic (APB) pathway. ApbA and PanE are allelic. ApbA, the ketopantoate reductase enzyme, is required for the synthesis of thiamine via the APB biosynthetic pathway.	150
396900	pfam02559	CarD_CdnL_TRCF	CarD-like/TRCF domain. CarD is a Myxococcus xanthus protein required for the activation of light- and starvation-inducible genes. This family includes the presumed N-terminal domain, CdnL. CarD interacts with the zinc-binding protein CarG to form a complex that regulates multiple processes in Myxococcus xanthus. This family also includes a domain to the N-terminal side of the DEAD helicase of TRCF (transcription-repair-coupling factor) proteins. TRCF displaces RNA polymerase stalled at a lesion, binds to the damage recognition protein UvrA, and increases the template strand repair rate during transcription. This domain is involved in binding to the stalled RNA polymerase. The family includes members otherwise referred to as CdnL, for CarD N-terminal like, which differ functionally from CarD. The TRCF domain mentioned above is the RNA polymerase-interacting domain or RID.	89
396901	pfam02560	Cyanate_lyase	Cyanate lyase C-terminal domain. Cyanate lyase (also known as cyanase) EC:4.2.1.104 is responsible for the hydrolysis of cyanate, allowing organisms that possess the enzyme to overcome the toxicity of environmental cyanate. This enzyme is composed of two domains, an N-terminal helix-turn-helix and this structurally unique C-terminal domain.	65
396902	pfam02561	FliS	Flagellar protein FliS. FliS is coded for by the FliD operon and is transcribed in conjunction with FliD and FliT, however this protein has no known function.	115
396903	pfam02562	PhoH	PhoH-like protein. PhoH is a cytoplasmic protein and predicted ATPase that is induced by phosphate starvation.	204
396904	pfam02563	Poly_export	Polysaccharide biosynthesis/export protein. This is a family of periplasmic proteins involved in polysaccharide biosynthesis and/or export.	74
396905	pfam02565	RecO_C	Recombination protein O C terminal. Recombination protein O (RecO) is involved in DNA repair and pfam00470 pathway recombination.	157
396906	pfam02566	OsmC	OsmC-like protein. Osmotically inducible protein C (OsmC) is a stress-induced protein found in E. coli. This family also contains an organic hydroperoxide detoxification protein that has a novel pattern of oxidative stress regulation.	99
396907	pfam02567	PhzC-PhzF	Phenazine biosynthesis-like protein. PhzC/PhzF is involved in dimerization of two 2,3-dihydro-3-oxo-anthranilic acid molecules to create PCA by P. fluorescens. This family also contains uncharacterized Mycobacterial proteins, though there is no significant sequence similarity to pfam00303 members. This family appears to be distantly related to pfam01678, including containing a weak internal duplication. However members of this family do not contain the conserved cysteines that are hypothesized to be active site residues (Bateman A pers obs).	280
280691	pfam02568	ThiI	Thiamine biosynthesis protein (ThiI). ThiI is required for thiazole synthesis, required for thiamine biosynthesis.	197
396908	pfam02569	Pantoate_ligase	Pantoate-beta-alanine ligase. Pantoate-beta-alanine ligase, also known as pantothenate synthase (EC:6.3.2.1), catalyzes the formation of pantothenate from pantoate and beta-alanine.	275
396909	pfam02570	CbiC	Precorrin-8X methylmutase. This is a family of Precorrin-8X methylmutases also known as Precorrin isomerase, CbiC/CobH, EC:5.4.1.2. This enzyme catalyzes the reaction: Precorrin-8X <=> hydrogenobyrinate. This enzyme is part of the Cobalamin (vitamin B12) biosynthetic pathway and catalyzes a methyl rearrangement.	191
396910	pfam02571	CbiJ	Precorrin-6x reductase CbiJ/CobK. This family consists of Precorrin-6x reductase EC:1.3.1.54. This enzyme catalyzes the reaction: precorrin-6Y + NADP(+) <=> precorrin-6X + NADPH. CbiJ and CobK both catalyze the reduction of the macrocycle in the cobalamin biosynthesis pathway.	248
396911	pfam02572	CobA_CobO_BtuR	ATP:corrinoid adenosyltransferase BtuR/CobO/CobP. This family consists of the BtuR, CobO, CobP proteins all of which are Cob(I)alamin adenosyltransferase, EC:2.5.1.17, involved in cobalamin (vitamin B12) biosynthesis. These enzymes catalyze the adenosylation reaction: ATP + cob(I)alamin + H2O <=> phosphate + diphosphate + adenosylcobalamin.	171
396912	pfam02574	S-methyl_trans	Homocysteine S-methyltransferase. This is a family of related homocysteine S-methyltransferase enzymes: 5-methyltetrahydrofolate--homocysteine S-methyltransferase, EC:2.1.1.13; Betaine--homocysteine S-methyltransferase (vitamin B12 dependent), EC:2.1.1.5; and Homocysteine S-methyltransferase, EC:2.1.1.10.	267
396913	pfam02575	YbaB_DNA_bd	YbaB/EbfC DNA-binding family. This is a family of DNA-binding proteins. Members of this family form homodimers which bind DNA via a tweezer-like structure. The conformation of the DNA is changed when bound to these proteins. In bacteria, these proteins may play a role in DNA replication-recovery following DNA damage.	90
396914	pfam02576	DUF150	RimP N-terminal domain. This family represents the N-terminal domain from RimP.	73
396915	pfam02577	DNase-RNase	Bifunctional nuclease. This family is a bifunctional nuclease, with both DNase and RNase activity. It forms a wedge-shaped dimer, with each monomer being triangular in shape. A large groove at the thick end of the wedge contains a possible active site.	112
396916	pfam02578	Cu-oxidase_4	Multi-copper polyphenol oxidoreductase laccase. Laccases are multi-copper oxidoreductases able to oxidize a wide variety of phenolic and non-phenolic compounds and are widely distributed among both prokaryotes and eukaryotes. There are two main active catalytic sites with conserved histidines that are capable of binding four copper atoms.	232
396917	pfam02579	Nitro_FeMo-Co	Dinitrogenase iron-molybdenum cofactor. This family contains several NIF (B, Y and X) proteins which are iron-molybdenum cofactors (FeMo-co) in the dinitrogenase enzyme which catalyzes the reduction of dinitrogen to ammonium. Dinitrogenase is a hetero-tetrameric (alpha(2)beta(2)) enzyme which contains the iron-molybdenum cofactor (FeMo-co) at its active site.	92
396918	pfam02580	Tyr_Deacylase	D-Tyr-tRNA(Tyr) deacylase. This family comprises several D-Tyr-tRNA(Tyr) deacylase proteins. Cell growth inhibition by several d-amino acids can be explained by an in vivo production of d-aminoacyl-tRNA molecules. Escherichia coli and yeast cells express an enzyme, d-Tyr-tRNA(Tyr) deacylase, capable of recycling such d-aminoacyl-tRNA molecules into free tRNA and d-amino acid. Accordingly, upon inactivation of the genes of the above deacylases, the toxicity of d-amino acids increases. Orthologues of the deacylase are found in many cells.	143
396919	pfam02581	TMP-TENI	Thiamine monophosphate synthase. Thiamine monophosphate synthase (TMP) (EC:2.5.1.3) catalyzes the substitution of the pyrophosphate of 2-methyl-4-amino-5-hydroxymethylpyrimidine pyrophosphate by 4-methyl-5-(beta-hydroxyethyl)thiazole phosphate to yield thiamine phosphate. This Pfam family also includes the regulatory protein TENI, a protein from Bacillus subtilis that regulates the production of several extracellular enzymes by reducing alkaline protease production. While TenI shows high sequence similarity with thiamin phosphate synthase, the purified protein has no thiamin phosphate synthase activity. Instead, it is a thiazole tautomerase.	180
396920	pfam02582	DUF155	Uncharacterized ACR, YagE family COG1723. 	173
396921	pfam02583	Trns_repr_metal	Metal-sensitive transcriptional repressor. This is a family of metal-sensitive repressors, involved in resistance to metal ions. Members of this family bind copper, nickel or cobalt ions via conserved cysteine and histidine residues. In the absence of metal ions, these proteins bind to promoter regions and repress transcription. When bound to metal ions they are unable to bind DNA, leading to transcriptional derepression.	79
396922	pfam02585	PIG-L	GlcNAc-PI de-N-acetylase. Members of this family are related to PIG-L an N-acetylglucosaminylphosphatidylinositol de-N-acetylase (EC:3.5.1.89) that catalyzes the second step in GPI biosynthesis.	125
396923	pfam02586	SRAP	SOS response associated peptidase (SRAP). The SRAP family functions as a DNA-associated autoproteolytic switch that recruits diverse repair enzymes onto DNA damage. We propose that the human protein Q96FZ2:UniProtKB, the eukaryotic member of the SRAP family, which has been recently shown to bind specifically to DNA with 5-hydroxymethylcytosine, 5-formylcytosine and 5-carboxycytosine, is a sensor for these oxidized bases generated by the TET (tetrahedral aminopeptidase of the M42 family) enzymes from methylcytosine. Hence, its autoproteolytic activity might help it act as a switch that recruits DNA repair enzymes to remove these oxidized methylcytosine species as part of the DNA demethylation pathway downstream of the TET enzymes.	212
396924	pfam02588	YitT_membrane	Uncharacterized 5xTM membrane BCR, YitT family COG1284. This is probably a bacterial ABC transporter permease (personal obs:Yeats C).	206
396925	pfam02589	LUD_dom	LUD domain. This entry represents a domain found in lactate utilization proteins B (LutB) and C (LutC), as well as several uncharacterized proteins. LutB and LutC are encoded by the conserved LutABC operon in bacteria. They are involved in lactate utilization and are implicated in the oxidative conversion of L-lactate into pyruvate.	188
396926	pfam02590	SPOUT_MTase	Predicted SPOUT methyltransferase. This family of proteins are predicted to be SPOUT methyltransferases.	155
396927	pfam02591	zf-RING_7	C4-type zinc ribbon domain. Zn-ribbon_9 is a Zn-ribbon domain rich in aromatic and positively charged amino acid residues. This C-terminal Zn-ribbon domain consists of two beta-strands acting as a scaffold for the two Zn knuckles. Both pairs of cysteines making up the two Zn knuckles are situated at highly conserved sharp beta-turns, an arrangement that facilitates the tetrahedral coordination of the divalent Zn ion. The two Zn-knuckle cysteine motifs are separated by 20 residues, 9 of which form an alpha-helix (helix 4). Structural modelling suggests this domain may bind nucleic acids. The domain appears to bind flaA-mRNA, thus contributing to flagellum formation and motility.	33
396928	pfam02592	Vut_1	Putative vitamin uptake transporter. 	154
367127	pfam02593	DUF166	Domain of unknown function. This family catalyzes the synthesis of thymidine monophosphate (dTMP) from deoxyuridine monophosphate (dUMP). The physiological co-substrate has not yet been identified. Previous designation of this family as being thymidylate synthase from one paper, PMID:10436953, has been shown to be erroneous. The proteins are uncharacterized.	218
396929	pfam02594	DUF167	Uncharacterized ACR, YggU family COG1872. 	75
396930	pfam02595	Gly_kinase	Glycerate kinase family. This is a family of Glycerate kinases.	367
396931	pfam02596	DUF169	Uncharacterized ArCR, COG2043. 	209
396932	pfam02597	ThiS	ThiS family. ThiS (thiaminS) is a 66 aa protein involved in sulphur transfer. ThiS is coded in the thiCEFSGH operon in E. coli. This family of proteins has two conserved Glycines at the COOH terminus. Thiocarboxylate is formed at the last G in the activation process. Sulphur is transferred from ThiI to ThiS in a reaction catalyzed by IscS. MoaD, a protein involved in sulphur transfer in molybdopterin synthesis, is about the same length and shows limited sequence similarity to ThiS. Both have the conserved GG at the COOH end.	74
396933	pfam02598	Methyltrn_RNA_3	Putative RNA methyltransferase. This family has a TIM barrel-like fold with a deep C-terminal trefoil knot. The arrangement of its hydrophilic and hydrophobic surfaces are opposite to that of the classic TIM barrel proteins. It is likely to bind RNA, and may function as a methyltransferase.	282
396934	pfam02599	CsrA	Global regulator protein family. This is a family of global regulator proteins. This protein is an RNA-binding protein and a global regulator of carbohydrate metabolism genes facilitating mRNA decay. In E. coli CsrA binds the CsrB RNA molecule to form the Csr regulatory system which has a strong negative regulatory effect on glycogen biosynthesis, gluconeogenesis and glycogen catabolism and a positive regulatory effect on glycolysis. In other bacteria such as Erwinia carotovora RmsA has been shown to regulate the production of virulence determinants, such as extracellular enzymes. RmsA binds to RmsB regulatory RNA.	50
396935	pfam02600	DsbB	Disulfide bond formation protein DsbB. This family consists of disulfide bond formation protein DsbB from bacteria. The DsbB protein oxidizes the periplasmic protein DsbA which in turn oxidizes cysteines in other periplasmic proteins in order to make disulfide bonds. DsbB acts as a redox potential transducer across the cytoplasmic membrane and is an integral membrane protein. DsbB possesses six cysteines, four of which are necessary for its proper function in vivo.	149
396936	pfam02601	Exonuc_VII_L	Exonuclease VII, large subunit. This family consists of exonuclease VII, large subunit, EC:3.1.11.6. This enzyme catalyzes exonucleolytic cleavage in either 5'->3' or 3'->5' direction to yield 5'-phosphomononucleotides. This exonuclease VII enzyme is composed of one large subunit and 4 small ones.	264
396937	pfam02602	HEM4	Uroporphyrinogen-III synthase HemD. This family consists of uroporphyrinogen-III synthase HemD EC:4.2.1.75 also known as Hydroxymethylbilane hydro-lyase (cyclizing) from eukaryotes, bacteria and archaea. This enzyme catalyzes the reaction: Hydroxymethylbilane <=> uroporphyrinogen-III + H(2)O. Some members of this family are multi-functional proteins possessing other enzyme activities related to porphyrin biosynthesis, such as HemD with pfam00590; however, the aligned region corresponds with the uroporphyrinogen-III synthase EC:4.2.1.75 activity only. Uroporphyrinogen-III synthase is the fourth enzyme in the heme pathway. Mutant forms of the Uroporphyrinogen-III synthase gene cause congenital erythropoietic porphyria in humans, a recessive inborn error of metabolism also known as Gunther disease.	230
396938	pfam02603	Hpr_kinase_N	HPr Serine kinase N-terminus. This family represents the N-terminal region of Hpr Serine/threonine kinase PtsK. This kinase is the sensor in a multicomponent phospho-relay system in control of carbon catabolic repression in bacteria. This kinase is unusual in that it recognizes the tertiary structure of its target and is a member of a novel family unrelated to any previously described protein phosphorylating enzymes. X-ray analysis of the full-length crystalline enzyme from Staphylococcus xylosus at a resolution of 1.95 A shows the enzyme to consist of two clearly separated domains that are assembled in a hexameric structure resembling a three-bladed propeller. The blades are formed by two N-terminal domains each, and the compact central hub assembles the C-terminal kinase domains.	125
396939	pfam02604	PhdYeFM_antitox	Antitoxin Phd_YefM, type II toxin-antitoxin system. Members of this family act as antitoxins in type II toxin-antitoxin systems. When bound to their toxin partners, they can bind DNA via the N-terminus and repress the expression of operons containing genes encoding the toxin and the antitoxin. This domain complexes with Txe toxins containing pfam06769, Fic/DOC toxins containing pfam02661 and YafO toxins containing pfam13957.	67
396940	pfam02605	PsaL	Photosystem I reaction centre subunit XI. This family consists of the photosystem I reaction centre subunit XI, PsaL, from plants and bacteria. PsaL is one of the smaller subunits in photosystem I with only two transmembrane alpha helices and interacts closely with PsaI.	143
396941	pfam02606	LpxK	Tetraacyldisaccharide-1-P 4'-kinase. This family consists of tetraacyldisaccharide-1-P 4'-kinase also known as Lipid-A 4'-kinase or Lipid A biosynthesis protein LpxK, EC:2.7.1.130. This enzyme catalyzes the reaction: ATP + 2,3-bis(3-hydroxytetradecanoyl)-D-glucosaminyl-(beta-D-1,6)-2,3-bis(3-hydroxytetradecanoyl)-D-glucosaminyl beta-phosphate <=> ADP + 2,3,2',3'-tetrakis(3-hydroxytetradecanoyl)-D-glucosaminyl-1,6-beta-D-glucosamine 1,4'-bisphosphate. This enzyme is involved in the synthesis of the lipid A portion of the bacterial lipopolysaccharide layer (LPS). The family contains a P-loop motif at the N-terminus.	318
396942	pfam02607	B12-binding_2	B12 binding domain. This B12 binding domain is found in methionine synthase EC:2.1.1.13, and other shorter proteins that bind to B12. This domain is always found to the N-terminus of pfam02310. The structure of this domain is known, it is a 4 helix bundle. Many of the conserved residues in this domain are involved in B12 binding, such as those in the MXXVG motif.	68
396943	pfam02608	Bmp	ABC transporter substrate-binding protein PnrA-like. Proteins containing this domain were originally annotated as basic membrane lipoproteins. However, several proteins containing this domain were later predicted as ABC transporter substrate-binding proteins, such as PnrA (also known as TmpC or TP0319) and RfuA (also known as Tpn38 or TP0298) from Treponema pallidum. PnrA transports purine nucleosides, while RfuA transports riboflavin. Proteins containing this domain also include Med from Bacillus subtilis. Med was annotated as a transcriptional activator protein that regulates comK. This domain can also be found at the N-terminus of glutamate receptor-like proteins from Dictyostelium (slime mold).	302
396944	pfam02609	Exonuc_VII_S	Exonuclease VII small subunit. This family consists of exonuclease VII, small subunit, EC:3.1.11.6. This enzyme catalyzes exonucleolytic cleavage in either 5'->3' or 3'->5' direction to yield 5'-phosphomononucleotides. This exonuclease VII enzyme is composed of one large subunit and 4 small ones.	52
396945	pfam02610	Arabinose_Isome	L-arabinose isomerase. This is a family of L-arabinose isomerases, AraA, EC:5.3.1.4. These enzymes catalyze the reaction: L-arabinose <=> L-ribulose. This reaction is the first step in the pathway of L-arabinose utilisation as a carbon source: after entering the cell, L-arabinose is converted into L-ribulose by the L-arabinose isomerase enzyme.	356
396946	pfam02611	CDH	CDP-diacylglycerol pyrophosphatase. This is a family of CDP-diacylglycerol pyrophosphatases, EC:3.6.1.26. This enzyme catalyzes the reaction CDP-diacylglycerol + H2O <=> CMP + phosphatidate.	224
396947	pfam02613	Nitrate_red_del	Nitrate reductase delta subunit. This family is the delta subunit of the nitrate reductase enzyme. The delta subunit is not part of the nitrate reductase enzyme but is most likely needed for assembly of the multi-subunit enzyme complex. In the absence of the delta subunit the core alpha beta enzyme complex is unstable. The delta subunit is essential for enzyme activity in vivo and in vitro. The nitrate reductase enzyme, EC:1.7.99.4, catalyzes the conversion of nitrate to nitrite, with concomitant oxidation of a reduced acceptor. The nitrate reductase enzyme is composed of three subunits. Nitrate is the most widely used alternative electron acceptor after oxygen. This family also now contains the family TorD, a family of cytoplasmic chaperone proteins; like many prokaryotic molybdoenzymes, the TMAO reductase (TorA) of Escherichia coli requires the insertion of a bis(molybdopterin guanine dinucleotide) molybdenum (bis(MGD)Mo) cofactor in its catalytic site to be active and translocated to the periplasm. The TorD chaperone increases apoTorA activation up to four-fold, allowing maturation of most of the apoprotein. Therefore TorD is involved in the first step of TorA maturation to make it competent to receive the cofactor.	133
396948	pfam02614	UxaC	Glucuronate isomerase. This is a family of Glucuronate isomerases also known as D-glucuronate isomerase, uronic isomerase, uronate isomerase, or uronic acid isomerase, EC:5.3.1.12. This enzyme catalyzes the reactions: D-glucuronate <=> D-fructuronate and D-galacturonate <=> D-tagaturonate. It is not however clear where the experimental evidence for this functional assignment came from and thus this family has no literature reference.	464
396949	pfam02615	Ldh_2	Malate/L-lactate dehydrogenase. This family consists of bacterial and archaeal Malate/L-lactate dehydrogenase. L-lactate dehydrogenase, EC:1.1.1.27, catalyzes the reaction (S)-lactate + NAD(+) <=> pyruvate + NADH. Malate dehydrogenase, EC:1.1.1.37 and EC:1.1.1.82, catalyzes the reactions: (S)-malate + NAD(+) <=> oxaloacetate + NADH, and (S)-malate + NADP(+) <=> oxaloacetate + NADPH respectively.	330
280735	pfam02616	SMC_ScpA	Segregation and condensation protein ScpA. This is a family of proteins that form part of the condensin complex that regulates chromosome segregation. This is the A subunit, which binds to the ScpB subunit, pfam04079, and SMC, pfam02463, to participate in chromosomal partition during cell division. The condensin complex pulls DNA away from the mid-cell into both cell halves. These proteins are part of the Kleisin superfamily.	225
396950	pfam02617	ClpS	ATP-dependent Clp protease adaptor protein ClpS. In the bacterial cytosol, ATP-dependent protein degradation is performed by several different chaperone-protease pairs, including ClpAP. ClpS directly influences the ClpAP machine by binding to the N-terminal domain of the chaperone ClpA. The degradation of ClpAP substrates, both SsrA-tagged proteins and ClpA itself, is specifically inhibited by ClpS. ClpS modifies ClpA substrate specificity, potentially redirecting degradation by ClpAP toward aggregated proteins.	80
396951	pfam02618	YceG	YceG-like family. This family of proteins is found in bacteria. Proteins in this family are typically between 332 and 389 amino acids in length. This family was previously incorrectly annotated and named as aminodeoxychorismate lyase. The structure of YceG was solved by X-ray crystallography.	274
396952	pfam02620	DUF177	Uncharacterized ACR, COG1399. This family is nearly universally conserved in bacteria and plants except the Chlorophyceae algae. Thus far, mutational analyses in bacteria have not established a function. In contrast, mutants have embryo lethal phenotypes in maize and Arabidopsis. In maize, the mutant embryos arrest at an early transition stage. It has been suggested that family members specifically affect 23S rRNA accumulation in plastids as well as bacteria.	118
396953	pfam02621	VitK2_biosynth	Menaquinone biosynthesis. This family includes two enzymes which are involved in menaquinone biosynthesis. One which catalyzes the conversion of cyclic de-hypoxanthine futalosine to 1,4-dihydroxy-6-naphthoate, and one which may be involved in the conversion of chorismate to futalosine. These enzymes comprise two domains with alpha/beta structures, a large domain and a small domain. A pocket between the two domains may form the active site, a conserved histidine located within this pocket could be the catalytic base.	253
396954	pfam02622	DUF179	Uncharacterized ACR, COG1678. 	159
396955	pfam02623	FliW	FliW protein. The protein BSU35380 from Bacillus subtilis (renamed FliW) was characterized as being a flagellar assembly factor. Experimental characterization was also carried out in Treponema pallidum (TP0658). In Campylobacter jejuni, Cj1075 has been shown to be involved in motility and flagellin biosynthesis. The two paralogues in Helicobacter pylori (HP1154 and HP1377) were found to be able to bind to flagellin. FliW proteins are involved in flagellar assembly. FliW is part of a three-part feedback loop: in Bacillus subtilis FliW inhibits CsrA (an RNA-binding protein) which inhibits FliC translation; hence FliW is required for FliC (flagellin) production.	121
396956	pfam02624	YcaO	YcaO cyclodehydratase, ATP- and Mg2+-binding. YcaO is an ATP- and Mg2+-binding protein involved in the peptidic biosynthesis of azoline. The three motifs involved in the binding are, in UniProtKB:P75838, 71-79: Sx3ExxER, 184-203: Sx6Ex3Qx3ExxER, and 286-290: RxxxE. Three slightly different functional families are represented in this family, proteins involved in TOMM (thiazole/oxazole-modified microcin) biogenesis, non-TOMM proteins such as UniProtKB:P75838, and TfuA-associated non-TOMM proteins involved in trifolitoxin biosynthesis. UniProtKB:P75838 hydrolyzes ATP to AMP and pyrophosphate.	319
396957	pfam02625	XdhC_CoxI	XdhC and CoxI family. This domain is often found in association with an NAD-binding region, related to TrkA-N (pfam02254; personal obs:C. Yeats). XdhC is believed to be involved in the attachment of molybdenum to Xanthine Dehydrogenase.	68
396958	pfam02626	CT_A_B	Carboxyltransferase domain, subdomain A and B. Urea carboxylase (UC) catalyzes a two-step, ATP- and biotin-dependent carboxylation reaction of urea. It is composed of biotin carboxylase (BC), carboxyltransferase (CT), and biotin carboxyl carrier protein (BCCP) domains. The CT domain of UC consists of four subdomains, named A, B, C and D. This domain covers the A and B subdomains of the CT domain. This domain covers the whole length of KipA (kinase A) from Bacillus subtilis. It can also be found in S. cerevisiae urea amidolyase Dur1,2, which is a multifunctional biotin-dependent enzyme with domains for urea carboxylase and allophanate (urea carboxylate) hydrolase activity.	263
396959	pfam02627	CMD	Carboxymuconolactone decarboxylase family. Carboxymuconolactone decarboxylase (CMD) EC:4.1.1.44 is involved in protocatechuate catabolism. In some bacteria a gene fusion event leads to expression of CMD with a hydrolase involved in the same pathway. In these bifunctional proteins CMD represents the C-terminal domain, pfam00561 represents the N-terminal domain.	84
396960	pfam02628	COX15-CtaA	Cytochrome oxidase assembly protein. This is a family of integral membrane proteins. CtaA is required for cytochrome aa3 oxidase assembly in Bacillus subtilis. COX15 is required for cytochrome c oxidase assembly in yeast.	322
396961	pfam02629	CoA_binding	CoA binding domain. This domain has a Rossmann fold and is found in a number of proteins including succinyl CoA synthetases, malate and ATP-citrate ligases.	97
396962	pfam02630	SCO1-SenC	SCO1/SenC. This family is involved in biogenesis of respiratory and photosynthetic systems. SCO1 is required for a post-translational step in the accumulation of subunits COXI and COXII of cytochrome c oxidase. SenC is required for optimal cytochrome c oxidase activity and maximal induction of genes encoding the light-harvesting and reaction centre complexes of R. capsulatus.	134
396963	pfam02631	RecX	RecX family. RecX is a putative bacterial regulatory protein. The gene encoding RecX is found downstream of recA, and is thought to interact with the RecA protein.	122
396964	pfam02632	BioY	BioY family. A number of bacterial genes are involved in bioconversion of pimelate into dethiobiotin. BioY is a component of the BioMNY transport system involved in biotin uptake in prokaryotes.	138
396965	pfam02633	Creatininase	Creatinine amidohydrolase. Creatinine amidohydrolase (EC:3.5.2.10), or creatininase, catalyzes the hydrolysis of creatinine to creatine.	226
396966	pfam02634	FdhD-NarQ	FdhD/NarQ family. A pan-bacterial lineage of proteins. Nitrate assimilation protein, NarQ, and FdhD are required for formate dehydrogenase activity. Structurally, they possess a deaminase fold with a characteristic binding pocket, suggesting that they might bind a nucleotide or related molecule allosterically to regulate the formate dehydrogenase catalytic subunit.	238
396967	pfam02635	DrsE	DsrE/DsrF-like family. DsrE is a small soluble protein involved in intracellular sulfur reduction. This family also includes DsrF.	118
396968	pfam02636	Methyltransf_28	Putative S-adenosyl-L-methionine-dependent methyltransferase. This family is a putative S-adenosyl-L-methionine (SAM)-dependent methyltransferase. In eukaryotes it plays a role in mitochondrial complex I activity.	247
396969	pfam02637	GatB_Yqey	GatB domain. This domain is found in GatB. It is about 140 amino acid residues long. This domain is found at the C-terminus of GatB, which transamidates Glu-tRNA to Gln-tRNA.	148
251441	pfam02638	GHL10	Glycosyl hydrolase-like 10. This is a family of bacterial glycosyl-hydrolase-like proteins falling into the family GHL10.	311
396970	pfam02639	DUF188	Uncharacterized BCR, YaiI/YqxD family COG1671. 	130
396971	pfam02641	DUF190	Uncharacterized ACR, COG1993. 	97
396972	pfam02643	DUF192	Uncharacterized ACR, COG1430. Two structures have been solved for members of this large (>500 members) family of bacterial proteins present mostly in environmental bacteria and metagenomes (distant homologs are also present in several Plasmodium species). TOPSAN analysis for Structure 3pjy shows that there is much similarity with the other solved structure, Structure 3m7a, solved for UniProt:Q2GA55 (Saro_0823), a homolog of Thermotoga maritima TM1668, UniProt:Q9X1Z6. The homolog in Caulobacter crescentus (CC1388), UniProt:Q9A8G6, is associated with CspD, a cold shock protein (CC1387), UniProt:Q9A8G7. However, the genomic context of UniProt:Q2GA55 is most conserved with a putative xylose isomerase, suggesting a possible role in extracellular sugar processing. Saro_0821, UniProt:Q2GA57, is annotated as an AMP-dependent synthetase and ligase. Structure 3m7a corresponds to the C-terminal (27-165) fragment of the YP_496102 (Saro_0823) protein and it is structurally unique, as the best hits from Dali have a Z-score of 3.8 (1nt0, 2j1t, 3kq4) and it is thus a likely candidate for a new fold. Interestingly, many of the top Dali hits are involved in sugar metabolism. There are no obvious active site-like cavities on the protein surface of 3m7a (http://www.topsan.org/Proteins/JCSG/).	105
396973	pfam02645	DegV	Uncharacterized protein, DegV family COG1307. The structure of this protein revealed a bound fatty-acid molecule in a pocket between the two protein domains. The structure indicates that this family has the molecular function of fatty-acid binding and may play a role in the cellular functions of fatty acid transport or metabolism.	280
396974	pfam02646	RmuC	RmuC family. This family contains several bacterial RmuC DNA recombination proteins. The function of the RmuC protein is unknown, but it is suspected that it is either a structural protein that protects DNA against nuclease action or is itself involved in DNA cleavage at regions of DNA secondary structure.	286
396975	pfam02649	GCHY-1	Type I GTP cyclohydrolase folE2. This is a family of prokaryotic proteins with type I GTP cyclohydrolase activity. GTP cyclohydrolase I is the first enzyme of the de novo tetrahydrofolate biosynthetic pathway present in bacteria, fungi, and plants, and encoded in Escherichia coli by the folE gene; it is also the first enzyme of the biopterin (BH4) pathway in Homo sapiens. The invariate, highly conserved glutamate residue at position 216 in Neisseria gonorrhoeae FolE2 is likely to be the substrate ligand and the metal ligand is likely to be the cysteine at position 147. The enzyme is Zinc 2+ dependent.	262
396976	pfam02650	HTH_WhiA	WhiA C-terminal HTH domain. This domain is found at the C-terminus of the sporulation regulator WhiA. It is predicted to form a DNA-binding helix-turn-helix structure. The WhiA protein also contains two N-terminal domains that are distant homologs of LAGLIDADG homing endonucleases.	83
280762	pfam02652	Lactate_perm	L-lactate permease. L-lactate permease is an integral membrane protein probably involved in L-lactate transport.	522
396977	pfam02653	BPD_transp_2	Branched-chain amino acid transport system / permease component. This is a large family mainly comprising high-affinity branched-chain amino acid transporter proteins such as E. coli LivH and LivM, both of which form part of the LIV-I transport system. Also found within this family are proteins from the galactose transport system permease and a ribose transport system.	269
396978	pfam02654	CobS	Cobalamin-5-phosphate synthase. This is a family of cobalamin-5-phosphate synthases, CobS, from bacteria. The CobS enzyme catalyzes the synthesis of AdoCbl-5'-P from AdoCbi-GDP and alpha-ribazole-5'-P. This enzyme is involved in the cobalamin (vitamin B12) biosynthesis pathway, in particular the nucleotide loop assembly stage, in conjunction with CobC, CobU and CobT.	217
396979	pfam02655	ATP-grasp_3	ATP-grasp domain. No functional information or experimental verification of function is known in this family. This family appears to be an ATP-grasp domain (Pers. obs. A Bateman).	160
396980	pfam02656	DUF202	Domain of unknown function (DUF202). This family consists of hypothetical proteins some of which are putative membrane proteins. No functional information or experimental verification of function is known. This domain is around 100 amino acids long.	68
396981	pfam02657	SufE	Fe-S metabolism associated domain. This family consists of the SufE-related proteins. These have been implicated in Fe-S metabolism and export.	119
396982	pfam02659	Mntp	Putative manganese efflux pump. MntP is a family of bacterial proteins with a signal peptide and four transmembrane domains. It is a putative manganese efflux pump, since deletion of the gene leads to profound manganese sensitivity and elevated intracellular manganese levels in bacteria. Manganese is a highly important trace nutrient for organisms from bacteria to humans, and acts as an important element in the defense against oxidative stress and as an enzyme cofactor.	152
396983	pfam02660	G3P_acyltransf	Glycerol-3-phosphate acyltransferase. This family of enzymes catalyzes the transfer of an acyl group from acyl-ACP to glycerol-3-phosphate to form lysophosphatidic acid.	174
396984	pfam02661	Fic	Fic/DOC family. This family consists of the Fic (filamentation induced by cAMP) protein and doc (death on curing). The Fic protein is involved in cell division and is suggested to be involved in the synthesis of PAB or folate, indicating that the Fic protein and cAMP are involved in a regulatory mechanism of cell division via folate metabolism. This family contains a central conserved motif HPFXXGNG in most members. The exact molecular function of these proteins is uncertain. P1 lysogens of Escherichia coli carry the prophage as a stable low copy number plasmid. The frequency with which viable cells cured of prophage are produced is about 10(-5) per cell per generation. A significant part of this remarkable stability can be attributed to a plasmid-encoded mechanism that causes death of cells that have lost P1. In other words, the lysogenic cells appear to be addicted to the presence of the prophage. The plasmid withdrawal response depends on a gene named doc (death on curing) that is represented by this family. Doc induces a reversible growth arrest of E. coli cells by targeting the protein synthesis machinery. Doc hosts the C-terminal domain of its antitoxin partner Phd (prevents host death) through fold complementation, a domain that is intrinsically disordered in solution but that folds into an alpha-helix on binding to Doc. This domain forms complexes with Phd antitoxins containing pfam02604.	94
396985	pfam02662	FlpD	Methyl-viologen-reducing hydrogenase, delta subunit. This family consists of methyl-viologen-reducing hydrogenase, delta subunit / heterodisulphide reductase. No specific functions have been assigned to this subunit. The aligned region corresponds to almost the entire delta chain sequence and contains 4 conserved cysteine residues. However, in two Archaeoglobus sequences this region corresponds to only the C-terminus of these proteins.	122
396986	pfam02663	FmdE	FmdE, Molybdenum formylmethanofuran dehydrogenase operon. This entry represents the FmdE protein that is encoded by the molybdenum formylmethanofuran dehydrogenase operon. FmdE does not co-purify with the molybdenum isozyme that is formed by FmdC and FmdB. The domain is typically found as a single copy, but is repeated two to three times in some sequences. It is also common for this domain to co-occur with a zinc-beta ribbon domain, suggesting that it may bind nucleic acid and be involved in transcription regulation.	89
396987	pfam02664	LuxS	S-Ribosylhomocysteinase (LuxS). This family consists of the LuxS protein involved in autoinducer AI2 synthesis and its hypothetical relatives. S-ribosylhomocysteinase (LuxS) catalyzes the cleavage of the thioether bond in S-ribosylhomocysteine (SRH) to produce homocysteine and 4,5-dihydroxy-2,3-pentanedione (DPD), the precursor of type II bacterial quorum sensing molecule.	154
396988	pfam02665	Nitrate_red_gam	Nitrate reductase gamma subunit. This family is the gamma subunit of the nitrate reductase enzyme, the gamma subunit is a b-type cytochrome that receives electrons from the quinone pool. It then transfers these via the iron-sulfur clusters of the beta subunit to the molybdenum cofactor found in the alpha subunit. The nitrate reductase enzyme, EC:1.7.99.4 catalyzes the conversion of nitrite to nitrate via the reduction of an acceptor. The nitrate reductase enzyme is composed of three subunits. Nitrate is the most widely used alternative electron acceptor after oxygen.	220
396989	pfam02666	PS_Dcarbxylase	Phosphatidylserine decarboxylase. This is a family of phosphatidylserine decarboxylases, EC:4.1.1.65. These enzymes catalyze the reaction: Phosphatidyl-L-serine <=> phosphatidylethanolamine + CO2. Phosphatidylserine decarboxylase plays a central role in the biosynthesis of aminophospholipids by converting phosphatidylserine to phosphatidylethanolamine.	198
280776	pfam02667	SCFA_trans	Short chain fatty acid transporter. This family consists of two sequences annotated as short chain fatty acid transporters, however, there are no references giving details of experimental characterization of this function.	453
367137	pfam02668	TauD	Taurine catabolism dioxygenase TauD, TfdA family. This family consists of taurine catabolism dioxygenases of the TauD, TfdA family. TauD from E. coli is an alpha-ketoglutarate-dependent taurine dioxygenase. This enzyme catalyzes the oxygenolytic release of sulfite from taurine. TfdA from Burkholderia sp. is a 2,4-dichlorophenoxyacetic acid/alpha-ketoglutarate dioxygenase. TfdA from Alcaligenes eutrophus JMP134 is a 2,4-dichlorophenoxyacetate monooxygenase. Also included are gamma-Butyrobetaine hydroxylase enzymes EC:1.14.11.1.	264
396990	pfam02669	KdpC	K+-transporting ATPase, c chain. This family consists of K+-transporting ATPase, c chain, KdpC. KdpC forms strong interactions with the KdpA subunit, serving to assemble and stabilize the Kdp complex. It has been suggested that KdpC could be one of the connecting links between the energy providing subunit KdpB and the K+-transporting subunit KdpA. The K+ transport system actively transports K+ ions via ATP hydrolysis.	179
396991	pfam02670	DXP_reductoisom	1-deoxy-D-xylulose 5-phosphate reductoisomerase. This is a family of 1-deoxy-D-xylulose 5-phosphate reductoisomerases. This enzyme catalyzes the formation of 2-C-methyl-D-erythritol 4-phosphate from 1-deoxy-D-xylulose-5-phosphate in the presence of NADPH. This reaction is part of the terpenoid biosynthesis pathway.	127
396992	pfam02671	PAH	Paired amphipathic helix repeat. This family contains the paired amphipathic helix repeat. The family contains the yeast SIN3 gene (also known as SDI1) that is a negative regulator of the yeast HO gene. This repeat may be distantly related to the helix-loop-helix motif, which mediates protein-protein interactions.	45
396993	pfam02672	CP12	CP12 domain. The function of this domain is unknown. It contains three conserved cysteines and a histidine, which suggests this may be a zinc binding domain (Bateman A pers. observation). This domain is found associated with CBS domains (pfam00571) in some proteins.	68
396994	pfam02673	BacA	Bacitracin resistance protein BacA. Bacitracin resistance protein (BacA) is a putative undecaprenol kinase. BacA confers resistance to bacitracin, probably by phosphorylation of undecaprenol. More recent studies show that BacA has undecaprenyl pyrophosphate phosphatase activity. Undecaprenyl phosphate is a key lipid intermediate involved in the synthesis of various bacterial cell wall polymers. Bacitracin, a mixture of related cyclic polypeptide antibiotics, is used to treat surface tissue infections. Its primary mode of action is the inhibition of bacterial cell wall synthesis through sequestration of the essential carrier lipid undecaprenyl pyrophosphate, C55-PP, resulting in the loss of cell integrity and lysis. The characteristic phosphatase sequence-motif in this family is likely to be the PGxSRSGG, compared with the PSGH of the PAP family of phosphatases.	257
396995	pfam02674	Colicin_V	Colicin V production protein. Colicin V production protein is required in E. coli for colicin V production from plasmid pColV-K30. This protein is coded for in the purF operon.	144
396996	pfam02675	AdoMet_dc	S-adenosylmethionine decarboxylase. This family contains several S-adenosylmethionine decarboxylase proteins from bacterial and archaebacterial species. S-adenosylmethionine decarboxylase (AdoMetDC), a key enzyme in the biosynthesis of spermidine and spermine, is first synthesized as a proenzyme, which is cleaved post translationally to form alpha and beta subunits. The alpha subunit contains a covalently bound pyruvoyl group derived from serine that is essential for activity.	98
396997	pfam02676	TYW3	Methyltransferase TYW3. The methyltransferase TYW3 (tRNA-yW-synthesising protein 3) has been identified in yeast to be involved in wybutosine (yW) biosynthesis. yW is a complexly modified guanosine residue that contains a tricyclic base and is found at the 3' position adjacent to the anticodon of phenylalanine tRNA. TYW3 is an N-4 methylase that methylates yW-86 to yield yW-72 in an Ado-Met-dependent manner.	207
396998	pfam02677	DUF208	Uncharacterized BCR, COG1636. 	175
396999	pfam02678	Pirin	Pirin. This family consists of Pirin proteins from both eukaryotes and prokaryotes. The function of Pirin is unknown but the gene coding for this protein is known to be expressed in all tissues in the human body although it is expressed most strongly in the liver and heart. Pirin is known to be a nuclear protein, exclusively localized within the nucleoplasma and predominantly concentrated within dot-like subnuclear structures. A tomato homolog of human Pirin has been found to be induced during programmed cell death. Human Pirin interacts with Bcl-3 and NFI and hence is probably involved in the regulation of DNA transcription and replication. It appears to be an Fe(II)-containing member of the Cupin superfamily.	104
397000	pfam02679	ComA	(2R)-phospho-3-sulfolactate synthase (ComA). In methanobacteria (2R)-phospho-3-sulfolactate synthase (ComA) catalyzes the first step of the biosynthesis of coenzyme M from phosphoenolpyruvate (P-enolpyruvate). This novel enzyme catalyzes the stereospecific Michael addition of sulfite to P-enolpyruvate, forming L-2-phospho-3-sulfolactate (PSL). It is suggested that the ComA-catalyzed reaction is analogous to those reactions catalyzed by beta-elimination enzymes that proceed through an enolate intermediate.	238
397001	pfam02680	DUF211	Uncharacterized ArCR, COG1888. 	88
397002	pfam02681	DUF212	Divergent PAP2 family. This family is related to the pfam01569 family (personal obs: C Yeats).	134
397003	pfam02682	CT_C_D	Carboxyltransferase domain, subdomain C and D. Urea carboxylase (UC) catalyzes a two-step, ATP- and biotin-dependent carboxylation reaction of urea. It is composed of biotin carboxylase (BC), carboxyltransferase (CT), and biotin carboxyl carrier protein (BCCP) domains. The CT domain of UC consists of four subdomains, named A, B, C and D. This domain covers the C and D subdomains of the CT domain. This domain covers the whole length of kipI (kinase A inhibitor) from Bacillus subtilis. It can also be found in S. cerevisiae urea amidolyase Dur1,2, which is a multifunctional biotin-dependent enzyme with domains for urea carboxylase and allophanate (urea carboxylate) hydrolase activity.	201
280792	pfam02683	DsbD	Cytochrome C biogenesis protein transmembrane region. This family consists of the transmembrane (i.e. non-catalytic) region of Cytochrome C biogenesis proteins also known as disulphide interchange proteins. These proteins possess a protein disulphide isomerase like domain that is not found within the aligned region of this family.	213
397004	pfam02684	LpxB	Lipid-A-disaccharide synthetase. This is a family of lipid-A-disaccharide synthetases, EC:2.4.2.128. These enzymes catalyze the reaction: UDP-2,3-bis(3-hydroxytetradecanoyl) glucosamine + 2,3-bis(3-hydroxytetradecanoyl)-beta-D-glucosaminyl 1-phosphate <=> UDP + 2,3-bis(3-hydroxytetradecanoyl)-D-glucosaminyl-1,6 -beta-D-2,3-bis(3-hydroxytetradecanoyl)-beta-D-glucosaminyl 1-phosphate. These enzymes catalyze the first disaccharide step in the synthesis of lipid-A-disaccharide.	374
397005	pfam02685	Glucokinase	Glucokinase. This is a family of glucokinases or glucose kinases EC:2.7.1.2. These enzymes phosphorylate glucose using ATP as a donor to give glucose-6-phosphate and ADP.	314
397006	pfam02686	Glu-tRNAGln	Glu-tRNAGln amidotransferase C subunit. This is a family of Glu-tRNAGln amidotransferase C subunits. The Glu-tRNAGln amidotransferase enzyme itself is an important translational fidelity mechanism, replacing incorrectly charged Glu-tRNAGln with the correct Gln-tRNAGln via transamidation of the misacylated Glu-tRNAGln. This activity compensates for the lack of glutaminyl-tRNA synthetase activity in gram-positive eubacteria, cyanobacteria, Archaea, and organelles.	70
397007	pfam02687	FtsX	FtsX-like permease family. This is a family of predicted permeases and hypothetical transmembrane proteins. Buchnera aphidicola LolC has been shown to transport lipids targeted to the outer membrane across the inner membrane. Both LolC and Streptococcus cristatus TptD have been shown to require ATP. This region contains three transmembrane helices.	96
280797	pfam02689	Herpes_Helicase	Helicase. This family consists of Helicases from the Herpes viruses. Helicases are responsible for the unwinding of DNA and are essential for replication and completion of the viral life cycle.	809
397008	pfam02690	Na_Pi_cotrans	Na+/Pi-cotransporter. This is a family of mainly mammalian type II renal Na+/Pi-cotransporters with other related sequences from lower eukaryotes and bacteria some of which are also Na+/Pi-cotransporters. In the kidney the type II renal Na+/Pi-cotransporters protein allows re-absorption of filtered Pi in the proximal tubule.	137
111576	pfam02691	VacA	Vacuolating cytotoxin. This family consists of vacuolating cytotoxin proteins from Proteobacteria. These proteins are an important virulence determinant in H. pylori and induce cytoplasmic vacuolation in a variety of mammalian cell lines.	1002
111577	pfam02694	UPF0060	Uncharacterized BCR, YnfA/UPF0060 family. 	107
397009	pfam02696	UPF0061	Uncharacterized ACR, YdiU/UPF0061 family. 	458
397010	pfam02697	VAPB_antitox	Putative antitoxin. Proteins in this family are possibly the antitoxin component of a VAPBC-like toxin-antitoxin (TA) module, which is widespread in both archaea and bacteria.	69
397011	pfam02698	DUF218	DUF218 domain. This large family of proteins contains several highly conserved charged amino acids, suggesting this may be an enzymatic domain (Bateman A pers. obs). The family includes SanA, which is involved in Vancomycin resistance. This protein may be involved in murein synthesis.	137
397012	pfam02699	YajC	Preprotein translocase subunit. See.	78
397013	pfam02700	PurS	Phosphoribosylformylglycinamidine (FGAM) synthase. This family forms a component of the de novo purine biosynthesis pathway.	76
397014	pfam02701	zf-Dof	Dof domain, zinc finger. The Dof domain is a zinc finger DNA-binding domain, that shows resemblance to the Cys2 zinc finger.	57
397015	pfam02702	KdpD	Osmosensitive K+ channel His kinase sensor domain. This is a family of KdpD sensor kinase proteins that regulate the kdpFABC operon responsible for potassium transport. The aligned region corresponds to the N-terminal cytoplasmic part of the protein which may be the sensor domain responsible for sensing turgor pressure.	210
308370	pfam02703	Adeno_E1A	Early E1A protein. This is a family of adenovirus early E1A proteins. The E1A protein is 32 kDa; it can, however, be cleaved to yield the 28 kDa protein. The E1A protein is responsible for the transcriptional activation of the early genes within the viral genome at the start of the infection process, as well as some cellular genes.	289
397016	pfam02704	GASA	Gibberellin regulated protein. This is the GASA gibberellin regulated cysteine rich protein family. The expression of these proteins is up-regulated by the plant hormone gibberellin; most of these proteins have some role in plant development. There are 12 cysteine residues conserved within the alignment, giving the potential for these proteins to possess 6 disulphide bonds.	60
397017	pfam02705	K_trans	K+ potassium transporter. This is a family of K+ potassium transporters that are conserved across phyla, having bacterial (KUP), yeast (HAK), and plant (AtKT) sequences as members.	534
367147	pfam02706	Wzz	Chain length determinant protein. This family includes proteins involved in lipopolysaccharide (lps) biosynthesis. This family comprises the whole length of chain length determinant protein (or wzz protein) that confers a modal distribution of chain length on the O-antigen component of lps. This region is also found as part of bacterial tyrosine kinases.	74
280810	pfam02707	MOSP_N	Major Outer Sheath Protein N-terminal region. This is a family of spirochete major outer sheath protein N-terminal regions. These proteins are present on the bacterial cell surface. In T. denticola the major outer sheath protein (Msp) binds immobilised laminin and fibronectin supporting the hypothesis that Msp mediates the extracellular matrix binding activity of T. denticola.	196
397018	pfam02709	Glyco_transf_7C	N-terminal domain of galactosyltransferase. This is the N-terminal domain of a family of galactosyltransferases from a wide range of Metazoa with three related galactosyltransferases activities, all three of which are possessed by one sequence in some cases. EC:2.4.1.90, N-acetyllactosamine synthase; EC:2.4.1.38, Beta-N-acetylglucosaminyl-glycopeptide beta-1,4- galactosyltransferase; and EC:2.4.1.22 Lactose synthase. Note that N-acetyllactosamine synthase is a component of Lactose synthase along with alpha-lactalbumin, in the absence of alpha-lactalbumin EC:2.4.1.90 is the catalyzed reaction.	77
397019	pfam02710	Hema_HEFG	Hemagglutinin domain of haemagglutinin-esterase-fusion glycoprotein. 	155
367150	pfam02711	Pap_E4	E4 protein. This is a family of Papillomavirus proteins, E4, coded for by ORF4. A splice variant, E1--E4, exists, but the function of neither E4 nor E1--E4 is known.	95
397020	pfam02713	DUF220	Domain of unknown function DUF220. This family consists of a region in several Arabidopsis thaliana hypothetical proteins, none of which have any known function. The aligned region contains two cysteine residues.	73
397021	pfam02714	RSN1_7TM	Calcium-dependent channel, 7TM region, putative phosphate. RSN1_7TM is the seven transmembrane domain region of putative phosphate transporter. The family is the 7TM region of osmosensitive calcium-permeable cation channels.	273
397022	pfam02718	Herpes_UL31	Herpesvirus UL31-like protein. This is a family of Herpesvirus proteins including UL31, UL53, and the product of ORF 69 in some strains. The proteins in this family have no known function.	251
397023	pfam02719	Polysacc_synt_2	Polysaccharide biosynthesis protein. This is a family of diverse bacterial polysaccharide biosynthesis proteins including the CapD protein, WalL protein, mannosyl-transferase, and several putative epimerases (e.g. WbiI).	284
397024	pfam02720	DUF222	Domain of unknown function (DUF222). This family is often found associated to the N-terminus of the HNH endonuclease domain pfam01844. The function of this domain is uncertain. This family has been called the 13E12 repeat family.	305
145722	pfam02721	DUF223	Domain of unknown function DUF223. 	95
280819	pfam02722	MOSP_C	Major Outer Sheath Protein C-terminal domain. This is a family of spirochete major outer sheath protein C-terminal regions. These proteins are present on the bacterial cell surface. In T. denticola the major outer sheath protein (Msp) binds immobilised laminin and fibronectin supporting the hypothesis that Msp mediates the extracellular matrix binding activity of T. denticola. This domain forms an amphipathic beta rich structure with channel forming activity.	205
397025	pfam02723	NS3_envE	Non-structural protein NS3/Small envelope protein E. This is a family of small non-structural proteins, well conserved among Coronavirus strains. This protein is also found in murine hepatitis virus as small envelope protein E.	75
397026	pfam02724	CDC45	CDC45-like protein. CDC45 is an essential gene required for initiation of DNA replication in S. cerevisiae, forming a complex with MCM5/CDC46. Homologs of CDC45 have been identified in human, mouse and smut fungus, among others.	539
280822	pfam02725	Paramyxo_NS_C	Non-structural protein C. This family consists of the polymerase accessory protein C from members of the paramyxoviridae.	164
397027	pfam02727	Cu_amine_oxidN2	Copper amine oxidase, N2 domain. This domain is the first or second structural domain in copper amine oxidases, it is known as the N2 domain. Its function is uncertain. The catalytic domain can be found in pfam01179. Copper amine oxidases are a ubiquitous and novel group of quinoenzymes that catalyze the oxidative deamination of primary amines to the corresponding aldehydes, with concomitant reduction of molecular oxygen to hydrogen peroxide. The enzymes are dimers of identical 70-90 kDa subunits, each of which contains a single copper ion and a covalently bound cofactor formed by the post-translational modification of a tyrosine side chain to 2,4,5-trihydroxyphenylalanine quinone (TPQ).	87
397028	pfam02728	Cu_amine_oxidN3	Copper amine oxidase, N3 domain. This domain is the second or third structural domain in copper amine oxidases, it is known as the N3 domain. Its function is uncertain. The catalytic domain can be found in pfam01179. Copper amine oxidases are a ubiquitous and novel group of quinoenzymes that catalyze the oxidative deamination of primary amines to the corresponding aldehydes, with concomitant reduction of molecular oxygen to hydrogen peroxide. The enzymes are dimers of identical 70-90 kDa subunits, each of which contains a single copper ion and a covalently bound cofactor formed by the post-translational modification of a tyrosine side chain to 2,4,5-trihydroxyphenylalanine quinone (TPQ).	101
397029	pfam02729	OTCace_N	Aspartate/ornithine carbamoyltransferase, carbamoyl-P binding domain. 	140
397030	pfam02730	AFOR_N	Aldehyde ferredoxin oxidoreductase, N-terminal domain. Aldehyde ferredoxin oxidoreductase (AOR) catalyzes the reversible oxidation of aldehydes to their corresponding carboxylic acids with the accompanying reduction of the redox protein ferredoxin. This domain interacts with the tungsten cofactor.	200
397031	pfam02731	SKIP_SNW	SKIP/SNW domain. This domain is found in chromatin proteins.	152
397032	pfam02732	ERCC4	ERCC4 domain. This domain is a family of nucleases. The family includes EME1 which is an essential component of a Holliday junction resolvase. EME1 interacts with MUS81 to form a DNA structure-specific endonuclease.	139
397033	pfam02733	Dak1	Dak1 domain. This is the kinase domain of the dihydroxyacetone kinase family EC:2.7.1.29.	310
397034	pfam02734	Dak2	DAK2 domain. This domain is the predicted phosphatase domain of the dihydroxyacetone kinase family.	175
397035	pfam02735	Ku	Ku70/Ku80 beta-barrel domain. The Ku heterodimer (composed of Ku70 and Ku80) contributes to genomic integrity through its ability to bind DNA double-strand breaks and facilitate repair by the non-homologous end-joining pathway. This is the central DNA-binding beta-barrel domain. This domain is found in both the Ku70 and Ku80 proteins that form a DNA binding heterodimer.	193
397036	pfam02736	Myosin_N	Myosin N-terminal SH3-like domain. This domain has an SH3-like fold. It is found at the N-terminus of many but not all myosins. The function of this domain is unknown.	39
397037	pfam02737	3HCDH_N	3-hydroxyacyl-CoA dehydrogenase, NAD binding domain. This family also includes lambda crystallin.	180
397038	pfam02738	Ald_Xan_dh_C2	Molybdopterin-binding domain of aldehyde dehydrogenase. 	541
397039	pfam02739	5_3_exonuc_N	5'-3' exonuclease, N-terminal resolvase-like domain. 	163
397040	pfam02740	Colipase_C	Colipase, C-terminal domain. SCOP reports duplication of common fold with Colipase N-terminal domain.	44
397041	pfam02741	FTR_C	FTR, proximal lobe. The FTR (Formylmethanofuran--tetrahydromethanopterin formyltransferase) enzyme EC:2.3.1.101 is involved in archaebacteria in the formation of methane from carbon dioxide. C-terminal proximal lobe of alpha+beta ferredoxin-like fold. SCOP reports fold duplication with N-terminal distal lobe.	149
397042	pfam02742	Fe_dep_repr_C	Iron dependent repressor, metal binding and dimerization domain. This family includes the Diphtheria toxin repressor.	70
397043	pfam02743	dCache_1	Cache domain. Double cache domain 1 covers the last three strands from the membrane distal PAS-like domain, the first two strands of the membrane proximal domain, and the connecting elements between the two domains.	238
397044	pfam02744	GalP_UDP_tr_C	Galactose-1-phosphate uridyl transferase, C-terminal domain. SCOP reports fold duplication with N-terminal domain. Both involved in Zn and Fe binding.	166
397045	pfam02745	MCR_alpha_N	Methyl-coenzyme M reductase alpha subunit, N-terminal domain. Methyl-coenzyme M reductase (MCR) is the enzyme responsible for microbial formation of methane. It is a hexamer composed of 2 alpha (this family), 2 beta (pfam02241), and 2 gamma (pfam02240) subunits with two identical nickel porphinoid active sites. The N-terminal domain has a ferredoxin-like fold.	269
397046	pfam02746	MR_MLE_N	Mandelate racemase / muconate lactonizing enzyme, N-terminal domain. SCOP reports fold similarity with enolase N-terminal domain.	117
280843	pfam02747	PCNA_C	Proliferating cell nuclear antigen, C-terminal domain. N-terminal and C-terminal domains of PCNA are topologically identical. Three PCNA molecules are tightly associated to form a closed ring encircling duplex DNA.	128
397047	pfam02748	PyrI_C	Aspartate carbamoyltransferase regulatory chain, metal binding domain. The regulatory chain is involved in allosteric regulation of aspartate carbamoyltransferase. The C-terminal metal binding domain has a rubredoxin-like fold and provides the interface with the catalytic chain.	48
397048	pfam02749	QRPTase_N	Quinolinate phosphoribosyl transferase, N-terminal domain. Quinolinate phosphoribosyl transferase (QPRTase) or nicotinate-nucleotide pyrophosphorylase EC:2.4.2.19 is involved in the de novo synthesis of NAD in both prokaryotes and eukaryotes. It catalyzes the reaction of quinolinic acid with 5-phosphoribosyl-1-pyrophosphate (PRPP) in the presence of Mg2+ to give rise to nicotinic acid mononucleotide (NaMN), pyrophosphate and carbon dioxide. The QA substrate is bound between the C-terminal domain of one subunit, and the N-terminal domain of the other. The N-terminal domain has an alpha/beta hammerhead fold.	87
308403	pfam02750	Synapsin_C	Synapsin, ATP binding domain. Ca dependent ATP binding in this ATP grasp fold. Function unknown.	203
397049	pfam02751	TFIIA_gamma_C	Transcription initiation factor IIA, gamma subunit. Accurate transcription in vivo requires at least six general transcription initiation factors, in addition to RNA polymerase II. Transcription initiation factor IIA (TFIIA) is a multimeric protein which facilitates the binding of TFIID to the TATA box. The C-terminal domain of the gamma subunit is a 12 stranded beta-barrel.	43
397050	pfam02752	Arrestin_C	Arrestin (or S-antigen), C-terminal domain. Ig-like beta-sandwich fold. SCOP reports duplication with N-terminal domain.	135
397051	pfam02753	PapD_C	Pili assembly chaperone PapD, C-terminal domain. Ig-like beta-sandwich fold. This domain is the C-terminal part of the pilus and flagellar-assembly chaperone protein PapD.	63
397052	pfam02754	CCG	Cysteine-rich domain. The key element of this family is the CX31-38CCX33-34CXXC sequence motif normally found at the C-terminus in archaeal and bacterial Hdr-like proteins. There may be one or two copies, and the motif is probably an iron-sulfur binding cluster. In some instances one of the cysteines is replaced by an aspartate, and aspartate can in principle also function as a ligand of an iron-sulfur cluster. The family includes a subunit from heterodisulphide reductase and a subunit from glycolate oxidase and glycerol-3-phosphate dehydrogenase.	84
397053	pfam02755	RPEL	RPEL repeat. The RPEL repeat is named after four conserved amino acids it contains. The RPEL motif binds to actin.	24
367166	pfam02756	GYR	GYR motif. The GYR motif is found in several drosophila proteins. Its function is unknown, however the presence of completely conserved tyrosine residues may suggest it could be a substrate for tyrosine kinases.	18
367167	pfam02757	YLP	YLP motif. The YLP motif is found in several drosophila proteins. Its function is unknown, however the presence of completely conserved tyrosine residues and its presence in human ERBB4 may suggest it could be a substrate for tyrosine kinases.	9
397054	pfam02758	PYRIN	PAAD/DAPIN/Pyrin domain. This domain is predicted to contain 6 alpha helices and to have the same fold as the pfam00531 domain. This similarity may mean that this is a protein-protein interaction domain.	75
397055	pfam02759	RUN	RUN domain. This domain is present in several proteins that are linked to the functions of GTPases in the Rap and Rab families. They could hence play important roles in multiple Ras-like GTPase signalling pathways. The domain comprises six conserved regions, which in some proteins have considerable insertions between them. The domain core is thought to take up a predominantly alpha fold, with basic amino acids in regions A and D possibly playing a functional role in interactions with Ras GTPases.	129
397056	pfam02760	HIN	HIN-200/IF120x domain. This domain has no known function. It is found in one or two copies per protein, and is found associated with the PAAD/DAPIN domain pfam02758.	168
397057	pfam02761	Cbl_N2	CBL proto-oncogene N-terminus, EF hand-like domain. Cbl is an adaptor protein that binds EGF receptors (or other tyrosine kinases) and SH3 domains, functioning as a negative regulator of many signaling pathways. The N-terminal domain is evolutionarily conserved, and is known to bind to phosphorylated tyrosine residues. The so called N-terminal domain is actually 3 structural domains, of which this is the central EF hand domain.	84
397058	pfam02762	Cbl_N3	CBL proto-oncogene N-terminus, SH2-like domain. Cbl is an adaptor protein that binds EGF receptors (or other tyrosine kinases) and SH3 domains, functioning as a negative regulator of many signaling pathways. The N-terminal domain is evolutionarily conserved, and is known to bind to phosphorylated tyrosine residues. The so called N-terminal domain is actually 3 structural domains, of which this is the C-terminal SH2 domain.	80
397059	pfam02763	Diphtheria_C	Diphtheria toxin, C domain. N-terminal catalytic (C) domain - blocks protein synthesis by transfer of ADP-ribose from NAD to a diphthamide residue of EF-2.	187
280860	pfam02764	Diphtheria_T	Diphtheria toxin, T domain. Central domain of diphtheria toxin is the translocation (T) domain. pH induced conformational change in this domain triggers insertion into the endosomal membrane and facilitates the transfer of the catalytic domain into the cytoplasm.	180
397060	pfam02765	POT1	Telomeric single stranded DNA binding POT1/CDC13. This domain binds single stranded telomeric DNA and adopts an OB fold. It includes the proteins POT1 and CDC13 which have been shown to regulate telomere length, replication and capping. POT1 is one component of the shelterin complex that protects telomere-ends from attack by DNA-repair mechanisms.	140
397061	pfam02767	DNA_pol3_beta_2	DNA polymerase III beta subunit, central domain. A dimer of the beta subunit of DNA polymerase III forms a ring which encircles duplex DNA. Each monomer contains three domains of identical topology and DNA clamp fold.	115
280863	pfam02768	DNA_pol3_beta_3	DNA polymerase III beta subunit, C-terminal domain. A dimer of the beta subunit of DNA polymerase III forms a ring which encircles duplex DNA. Each monomer contains three domains of identical topology and DNA clamp fold.	118
397062	pfam02769	AIRS_C	AIR synthase related protein, C-terminal domain. This family includes Hydrogen expression/formation protein HypE, AIR synthases EC:6.3.3.1, FGAM synthase EC:6.3.5.3 and selenide, water dikinase EC:2.7.9.3. The function of the C-terminal domain of AIR synthase is unclear, but the cleft formed between N and C domains is postulated as a sulphate binding site.	152
397063	pfam02770	Acyl-CoA_dh_M	Acyl-CoA dehydrogenase, middle domain. Central domain of Acyl-CoA dehydrogenase has a beta-barrel fold.	95
397064	pfam02771	Acyl-CoA_dh_N	Acyl-CoA dehydrogenase, N-terminal domain. The N-terminal domain of Acyl-CoA dehydrogenase is an all-alpha domain.	113
397065	pfam02772	S-AdoMet_synt_M	S-adenosylmethionine synthetase, central domain. The three domains of S-adenosylmethionine synthetase have the same alpha+beta fold.	119
397066	pfam02773	S-AdoMet_synt_C	S-adenosylmethionine synthetase, C-terminal domain. The three domains of S-adenosylmethionine synthetase have the same alpha+beta fold.	138
397067	pfam02774	Semialdhyde_dhC	Semialdehyde dehydrogenase, dimerization domain. This Pfam entry contains the following members: N-acetyl-glutamine semialdehyde dehydrogenase (AgrC) and aspartate-semialdehyde dehydrogenase.	167
397068	pfam02775	TPP_enzyme_C	Thiamine pyrophosphate enzyme, C-terminal TPP binding domain. 	151
397069	pfam02776	TPP_enzyme_N	Thiamine pyrophosphate enzyme, N-terminal TPP binding domain. 	169
397070	pfam02777	Sod_Fe_C	Iron/manganese superoxide dismutases, C-terminal domain. Superoxide dismutases (SODs) catalyze the conversion of superoxide radicals to hydrogen peroxide and molecular oxygen. Three evolutionarily distinct families of SODs are known, of which the Mn/Fe-binding family is one. In humans, there is a cytoplasmic Cu/Zn SOD, and a mitochondrial Mn/Fe SOD. C-terminal domain is a mixed alpha/beta fold.	102
397071	pfam02778	tRNA_int_endo_N	tRNA intron endonuclease, N-terminal domain. Members of this family cleave pre tRNA at the 5' and 3' splice sites to release the intron EC:3.1.27.9.	67
397072	pfam02779	Transket_pyr	Transketolase, pyrimidine binding domain. This family includes transketolase enzymes, pyruvate dehydrogenases, and branched chain alpha-keto acid decarboxylases.	174
397073	pfam02780	Transketolase_C	Transketolase, C-terminal domain. The C-terminal domain of transketolase has been proposed as a regulatory molecule binding site.	124
397074	pfam02781	G6PD_C	Glucose-6-phosphate dehydrogenase, C-terminal domain. 	296
397075	pfam02782	FGGY_C	FGGY family of carbohydrate kinases, C-terminal domain. This domain adopts a ribonuclease H-like fold and is structurally related to the N-terminal domain.	197
397076	pfam02783	MCR_beta_N	Methyl-coenzyme M reductase beta subunit, N-terminal domain. Methyl-coenzyme M reductase (MCR) is the enzyme responsible for microbial formation of methane. It is a hexamer composed of 2 alpha (pfam02249), 2 beta (this family), and 2 gamma (pfam02240) subunits with two identical nickel porphinoid active sites. The N-terminal domain has an alpha/beta ferredoxin-like fold.	182
397077	pfam02784	Orn_Arg_deC_N	Pyridoxal-dependent decarboxylase, pyridoxal binding domain. These pyridoxal-dependent decarboxylases act on ornithine, lysine, arginine and related substrates. This domain has a TIM barrel fold.	241
397078	pfam02785	Biotin_carb_C	Biotin carboxylase C-terminal domain. Biotin carboxylase is a component of the acetyl-CoA carboxylase multi-component enzyme which catalyzes the first committed step in fatty acid synthesis in animals, plants and bacteria. Most of the active site residues reported in reference are in this C-terminal domain.	107
397079	pfam02786	CPSase_L_D2	Carbamoyl-phosphate synthase L chain, ATP binding domain. Carbamoyl-phosphate synthase catalyzes the ATP-dependent synthesis of carbamyl-phosphate from glutamine or ammonia and bicarbonate. This important enzyme initiates both the urea cycle and the biosynthesis of arginine and/or pyrimidines. The carbamoyl-phosphate synthase (CPS) enzyme in prokaryotes is a heterodimer of a small and large chain. The small chain promotes the hydrolysis of glutamine to ammonia, which is used by the large chain to synthesize carbamoyl phosphate. See pfam00988. The small chain has a GATase domain in the carboxyl terminus. See pfam00117. The ATP binding domain (this one) has an ATP-grasp fold.	209
397080	pfam02787	CPSase_L_D3	Carbamoyl-phosphate synthetase large chain, oligomerization domain. Carbamoyl-phosphate synthase catalyzes the ATP-dependent synthesis of carbamyl-phosphate from glutamine or ammonia and bicarbonate. The carbamoyl-phosphate synthase (CPS) enzyme in prokaryotes is a heterodimer of a small and large chain.	79
397081	pfam02788	RuBisCO_large_N	Ribulose bisphosphate carboxylase large chain, N-terminal domain. The N-terminal domain of RuBisCO large chain adopts a ferredoxin-like fold.	120
397082	pfam02789	Peptidase_M17_N	Cytosol aminopeptidase family, N-terminal domain. 	127
397083	pfam02790	COX2_TM	Cytochrome C oxidase subunit II, transmembrane domain. The N-terminal domain of cytochrome C oxidase contains two transmembrane alpha-helices.	89
397084	pfam02791	DDT	DDT domain. The DDT domain is named after (DNA binding homeobox and Different Transcription factors) and is approximately 60 residues in length. Along with the WHIM motifs, it comprises an entirely alpha helical module found in diverse eukaryotic chromatin proteins. Based on the structure of Ioc3, this module is inferred to interact with nucleosomal linker DNA and the SLIDE domain of ISWI proteins. The resulting complex forms a protein ruler that measures out the spacing between two adjacent nucleosomes. In particular, the DDT domain, in combination with the WHIM1 and WHIM2 motifs, forms the SLIDE domain binding pocket.	58
397085	pfam02792	Mago_nashi	Mago nashi protein. This family was originally identified in Drosophila and called mago nashi, it is a strict maternal effect, grandchildless-like, gene. The human homolog has been shown to interact with an RNA binding protein. An RNAi knockout of the C. elegans homolog causes masculinization of the germ line (Mog phenotype) hermaphrodites, suggesting it is involved in hermaphrodite germ-line sex determination. Mago nashi has been found to be part of the exon-exon junction complex that binds 20 nucleotides upstream of exon-exon junctions.	131
397086	pfam02793	HRM	Hormone receptor domain. This extracellular domain contains four conserved cysteines that probably form disulphide bridges. The domain is found in a variety of hormone receptors. It may be a ligand binding domain.	64
397087	pfam02794	HlyC	RTX toxin acyltransferase family. Members of this family are enzymes EC:2.3.1.-. involved in fatty acylation of the protoxins (HlyA) at lysine residues, thereby converting them to the active toxin. Acyl-acyl carrier protein (ACP) is the essential acyl donor. This family shows a number of conserved residues that are possible candidates for participation in acyl transfer. Site-directed mutagenesis of the single conserved histidine residue in Escherichia coli HlyC resulted in complete inactivation of the enzyme.	127
397088	pfam02796	HTH_7	Helix-turn-helix domain of resolvase. 	45
397089	pfam02797	Chal_sti_synt_C	Chalcone and stilbene synthases, C-terminal domain. This domain of chalcone synthase is reported to be structurally similar to domains in thiolase and beta-ketoacyl synthase. The differences in activity are accounted for by differences in the N-terminal domain.	151
397090	pfam02798	GST_N	Glutathione S-transferase, N-terminal domain. Function: conjugation of reduced glutathione to a variety of targets. Also included in the alignment, but not GSTs: S-crystallins from squid (similarity to GST previously noted); eukaryotic elongation factors 1-gamma (not known to have GST activity and similarity not previously recognized); HSP26 family of stress-related proteins including auxin-regulated proteins in plants and stringent starvation proteins in E. coli (not known to have GST activity and similarity not previously recognized). The glutathione molecule binds in a cleft between the N- and C-terminal domains - the catalytically important residues are proposed to reside in the N-terminal domain.	76
397091	pfam02799	NMT_C	Myristoyl-CoA:protein N-myristoyltransferase, C-terminal domain. The N and C-terminal domains of NMT are structurally similar, each adopting an acyl-CoA N-acyltransferase-like fold.	193
397092	pfam02800	Gp_dh_C	Glyceraldehyde 3-phosphate dehydrogenase, C-terminal domain. GAPDH is a tetrameric NAD-binding enzyme involved in glycolysis and gluconeogenesis. C-terminal domain is a mixed alpha/antiparallel beta fold.	158
397093	pfam02801	Ketoacyl-synt_C	Beta-ketoacyl synthase, C-terminal domain. The structure of beta-ketoacyl synthase is similar to that of the thiolase family (pfam00108) and also chalcone synthase. The active site of beta-ketoacyl synthase is located between the N and C-terminal domains.	118
397094	pfam02803	Thiolase_C	Thiolase, C-terminal domain. Thiolase is reported to be structurally related to beta-ketoacyl synthase (pfam00109), and also chalcone synthase.	123
397095	pfam02805	Ada_Zn_binding	Metal binding domain of Ada. The Escherichia coli Ada protein repairs O6-methylguanine residues and methyl phosphotriesters in DNA by direct transfer of the methyl group to a cysteine residue. This domain contains four conserved cysteines that form a zinc binding site. One of these cysteines is a methyl group acceptor. The methylated domain can then specifically bind to the ada box on a DNA duplex.	62
397096	pfam02806	Alpha-amylase_C	Alpha amylase, C-terminal all-beta domain. Alpha amylase is classified as family 13 of the glycosyl hydrolases. The structure is an 8 stranded alpha/beta barrel containing the active site, interrupted by a ~70 a.a. calcium-binding domain protruding between beta strand 3 and alpha helix 3, and a carboxyl-terminal Greek key beta-barrel domain.	93
397097	pfam02807	ATP-gua_PtransN	ATP:guanido phosphotransferase, N-terminal domain. The N-terminal domain has an all-alpha fold.	67
397098	pfam02809	UIM	Ubiquitin interaction motif. This motif is called the ubiquitin interaction motif. One of the proteins containing this motif is a receptor for poly-ubiquitination chains for the proteasome. This motif has a pattern of conservation characteristic of an alpha helix.	16
397099	pfam02810	SEC-C	SEC-C motif. The SEC-C motif is found in the C-terminus of the SecA protein, in the middle of some SWI2 ATPases and also solo in several proteins. The motif is predicted to chelate zinc with the CXC and C[HC] pairs that constitute the most conserved feature of the motif. It is predicted to be a potential nucleic acid binding domain.	19
397100	pfam02811	PHP	PHP domain. The PHP (Polymerase and Histidinol Phosphatase) domain is a putative phosphoesterase domain.	164
397101	pfam02812	ELFV_dehydrog_N	Glu/Leu/Phe/Val dehydrogenase, dimerization domain. 	129
280904	pfam02813	Retro_M	Retroviral M domain. Retroviruses contain a small protein, MA (matrix), which forms a protein lining immediately beneath the phospholipid membrane of the mature virus particle. MA is located in the N-terminal region of the Gag precursor polyprotein. The N-terminal segment of MA proteins directs the Gag protein to the plasma membrane where budding takes place, and has been called the M domain. This domain forms an alpha helical bundle structure.	86
397102	pfam02814	UreE_N	UreE urease accessory protein, N-terminal domain. UreE is a urease accessory protein. Urease pfam00449 hydrolyzes urea into ammonia and carbamic acid.	62
397103	pfam02815	MIR	MIR domain. The MIR (protein mannosyltransferase, IP3R and RyR) domain is a domain that may have a ligand transferase function.	185
397104	pfam02816	Alpha_kinase	Alpha-kinase family. This family is a novel family of eukaryotic protein kinase catalytic domains, which have no detectable similarity to conventional kinases. The family contains myosin heavy chain kinases and Elongation Factor-2 kinase and a bifunctional ion channel. This family is known as the alpha-kinase family. The structure of the kinase domain revealed unexpected similarity to eukaryotic protein kinases in the catalytic core as well as to metabolic enzymes with ATP-grasp domains.	183
397105	pfam02817	E3_binding	e3 binding domain. This family represents a small domain of the E2 subunit of 2-oxo-acid dehydrogenases responsible for the binding of the E3 subunit.	35
397106	pfam02818	PPAK	PPAK motif. These motifs are found in the PEVK region of titin.	27
111689	pfam02819	Toxin_9	Spider toxin. This family of spider neurotoxins are thought to be calcium ion channel inhibitors.	43
397107	pfam02820	MBT	mbt repeat. The function of this repeat is unknown, but it is found in a number of nuclear proteins such as drosophila sex comb on midleg protein. The repeat is found in up to four copies. The repeat contains a completely conserved glutamate at its amino terminus that may be important for function.	68
367201	pfam02821	Staphylokinase	Staphylokinase/Streptokinase family. 	117
397108	pfam02822	Antistasin	Antistasin family. Members of this family are inhibitors of trypsin family proteases. This domain is highly disulphide bonded. The domain is also found in some large extracellular proteins in multiple copies.	26
397109	pfam02823	ATP-synt_DE_N	ATP synthase, Delta/Epsilon chain, beta-sandwich domain. Part of the ATP synthase CF(1). These subunits are part of the head unit of the ATP synthase. The subunit is called epsilon in bacteria and delta in mitochondria. In bacteria the delta (D) subunit is equivalent to the mitochondrial Oligomycin sensitive subunit, OSCP (pfam00213).	79
397110	pfam02824	TGS	TGS domain. The TGS domain is named after ThrRS, GTPase, and SpoT. Interestingly, TGS domain was detected also at the amino terminus of the uridine kinase from the spirochaete Treponema pallidum (but not any other organism, including the related spirochaete Borrelia burgdorferi). TGS is a small domain that consists of ~50 amino acid residues and is predicted to possess a predominantly beta-sheet structure. There is no direct information on the functions of the TGS domain, but its presence in two types of regulatory proteins (the GTPases and guanosine polyphosphate phosphohydrolases/synthetases) suggests a ligand (most likely nucleotide)-binding, regulatory role.	60
397111	pfam02825	WWE	WWE domain. The WWE domain is named after three of its conserved residues and is predicted to mediate specific protein-protein interactions in ubiquitin and ADP ribose conjugation systems.	64
397112	pfam02826	2-Hacid_dh_C	D-isomer specific 2-hydroxyacid dehydrogenase, NAD binding domain. This domain is inserted into the catalytic domain, the large dehydrogenase and D-lactate dehydrogenase families in SCOP. N-terminal portion of which is represented by family pfam00389.	176
397113	pfam02827	PKI	cAMP-dependent protein kinase inhibitor. Members of this family are extremely potent competitive inhibitors of cAMP-dependent protein kinase activity. These proteins interact with the catalytic subunit of the enzyme after the cAMP-induced dissociation of its regulatory chains.	69
397114	pfam02828	L27	L27 domain. The L27 domain is found in receptor targeting proteins Lin-2 and Lin-7.	52
397115	pfam02829	3H	3H domain. This domain is predicted to be a small molecule binding domain, based on its occurrence with other domains. The domain is named after its three conserved histidine residues.	97
367208	pfam02830	V4R	V4R domain. The V4R (vinyl 4 reductase) domain is a predicted small molecule binding domain that may bind to hydrocarbons.	62
397116	pfam02831	gpW	gpW. gpW is a 68 residue protein known to be present in phage particles. Extracts of phage-infected cells lacking gpW contain DNA-filled heads, and active tails, but no infectious virions. gpW is required for the addition of gpFII to the head, which is, in turn, required for the attachment of tails. Since gpFII and tails are known to be attached at the connector, gpW is also likely to assemble at this site. The addition of gpW to filled heads increases the DNase resistance of the packaged DNA, suggesting that gpW either forms a plug at the connector to prevent ejection of the DNA, or binds directly to the DNA. The large number of positively charged residues in gpW (its calculated pI is 10.8) is consistent with a role in DNA interaction.	62
280922	pfam02832	Flavi_glycop_C	Flavivirus glycoprotein, immunoglobulin-like domain. 	97
397117	pfam02833	DHHA2	DHHA2 domain. This domain is often found adjacent to the DHH domain pfam01368 and is called DHHA2 for DHH associated domain. This domain is diagnostic of DHH subfamily 2 members. The domain is about 120 residues long and contains a conserved DXK motif at its amino terminus.	124
397118	pfam02834	LigT_PEase	LigT like Phosphoesterase. Members of this family are bacterial and archaeal RNA ligases that are able to ligate tRNA half molecules containing 2',3'-cyclic phosphate and 5' hydroxyl termini to products containing the 2',5' phosphodiester linkage. Each member of this family contains an internal duplication, each of which contains an HXTX motif that defines the family. The structure of a related protein is known. They belong to the 2H phosphoesterase superfamily. They share a common active site, characterized by two conserved histidines, with vertebrate myelin-associated 2',3' phosphodiesterases, plant Arabidopsis thaliana CPDases and several bacteria and virus proteins.	87
397119	pfam02836	Glyco_hydro_2_C	Glycosyl hydrolases family 2, TIM barrel domain. This family contains beta-galactosidase, beta-mannosidase and beta-glucuronidase activities.	302
397120	pfam02837	Glyco_hydro_2_N	Glycosyl hydrolases family 2, sugar binding domain. This family contains beta-galactosidase, beta-mannosidase and beta-glucuronidase activities and has a jelly-roll fold. The domain binds the sugar moiety during the sugar-hydrolysis reaction.	169
397121	pfam02838	Glyco_hydro_20b	Glycosyl hydrolase family 20, domain 2. This domain has a zincin-like fold.	123
397122	pfam02839	CBM_5_12	Carbohydrate binding domain. This short domain is found in many different glycosyl hydrolase enzymes and is presumed to have a carbohydrate binding function. The domain has six aromatic groups that may be important for binding.	25
397123	pfam02840	Prp18	Prp18 domain. The splicing factor Prp18 is required for the second step of pre-mRNA splicing. The structure of a large fragment of the Saccharomyces cerevisiae Prp18 is known. This fragment is fully active in yeast splicing in vitro and includes the sequences of Prp18 that have been evolutionarily conserved. The core structure consists of five alpha-helices that adopt a novel fold. The most highly conserved region of Prp18, a nearly invariant stretch of 19 aa, forms part of a loop between two alpha-helices and may interact with the U5 small nuclear ribonucleoprotein particles.	138
397124	pfam02841	GBP_C	Guanylate-binding protein, C-terminal domain. Transcription of the anti-viral guanylate-binding protein (GBP) is induced by interferon-gamma during macrophage induction. This family contains GBP1 and GBP2, both GTPases capable of binding GTP, GDP and GMP.	297
397125	pfam02843	GARS_C	Phosphoribosylglycinamide synthetase, C domain. Phosphoribosylglycinamide synthetase catalyzes the second step in the de novo biosynthesis of purine. The reaction catalyzed by Phosphoribosylglycinamide synthetase is the ATP-dependent addition of 5-phosphoribosylamine to glycine to form 5'-phosphoribosylglycinamide. This domain is related to the C-terminal domain of biotin carboxylase/carbamoyl phosphate synthetase (see pfam02787).	92
397126	pfam02844	GARS_N	Phosphoribosylglycinamide synthetase, N domain. Phosphoribosylglycinamide synthetase catalyzes the second step in the de novo biosynthesis of purine. The reaction catalyzed by Phosphoribosylglycinamide synthetase is the ATP-dependent addition of 5-phosphoribosylamine to glycine to form 5'-phosphoribosylglycinamide. This domain is related to the N-terminal domain of biotin carboxylase/carbamoyl phosphate synthetase (see pfam00289). This domain is structurally related to the PreATP-grasp domain.	101
397127	pfam02845	CUE	CUE domain. CUE domains have been shown to bind ubiquitin. It has been suggested that CUE domains are related to pfam00627 and this has been confirmed by the structure of the domain. CUE domains also occur in two proteins of the IL-1 signal transduction pathway, tollip and TAB2.	42
397128	pfam02847	MA3	MA3 domain. Domain in DAP-5, eIF4G, MA-3 and other proteins. Highly alpha-helical. May contain repeats and/or regions similar to MIF4G domains.	113
397129	pfam02852	Pyr_redox_dim	Pyridine nucleotide-disulphide oxidoreductase, dimerization domain. This family includes both class I and class II oxidoreductases and also NADH oxidases and peroxidases.	107
397130	pfam02854	MIF4G	MIF4G domain. MIF4G is named after Middle domain of eukaryotic initiation factor 4G (eIF4G). Also occurs in NMD2p and CBP80. The domain is rich in alpha-helices and may contain multiple alpha-helical repeats. In eIF4G, this domain binds eIF4A, eIF3, RNA and DNA.	203
397131	pfam02861	Clp_N	Clp amino terminal domain, pathogenicity island component. This short domain is found in one or two copies at the amino terminus of ClpA and ClpB proteins from bacteria and eukaryotes. The function of these domains is uncertain but they may form a protein binding site. In many bacterial species, including E. coli, this region represents the N-terminus of one of the key components of the pathogenicity island complex that injects toxin from one bacterium into another.	53
397132	pfam02862	DDHD	DDHD domain. The DDHD domain is 180 residues long and contains four conserved residues that may form a metal binding site. The domain is named after these four residues. This pattern of conservation of metal binding residues is often seen in phosphoesterase domains. This domain is found in retinal degeneration B proteins, as well as a family of probable phospholipases. It has been shown that this domain is found in a longer C terminal region that binds to PYK2 tyrosine kinase. These proteins have been called N-terminal domain-interacting receptor (Nir1, Nir2 and Nir3). This suggests that this region is involved in functionally important interactions in other members of this family.	235
397133	pfam02863	Arg_repressor_C	Arginine repressor, C-terminal domain. 	68
397134	pfam02864	STAT_bind	STAT protein, DNA binding domain. STAT proteins (Signal Transducers and Activators of Transcription) are a family of transcription factors that are specifically activated to regulate gene transcription when cells encounter cytokines and growth factors. This family represents the DNA binding domain of STAT, which has an ig-like fold. STAT proteins also include an SH2 domain pfam00017.	133
397135	pfam02865	STAT_int	STAT protein, protein interaction domain. STAT proteins (Signal Transducers and Activators of Transcription) are a family of transcription factors that are specifically activated to regulate gene transcription when cells encounter cytokines and growth factors. STAT proteins also include an SH2 domain pfam00017.	119
397136	pfam02866	Ldh_1_C	lactate/malate dehydrogenase, alpha/beta C-terminal domain. L-lactate dehydrogenases are metabolic enzymes which catalyze the conversion of L-lactate to pyruvate, the last step in anaerobic glycolysis. L-2-hydroxyisocaproate dehydrogenases are also members of the family. Malate dehydrogenases catalyze the interconversion of malate to oxaloacetate. The enzyme participates in the citric acid cycle. L-lactate dehydrogenase is also found as a lens crystallin in bird and crocodile eyes.	173
397137	pfam02867	Ribonuc_red_lgC	Ribonucleotide reductase, barrel domain. 	487
397138	pfam02868	Peptidase_M4_C	Thermolysin metallopeptidase, alpha-helical domain. 	167
397139	pfam02870	Methyltransf_1N	6-O-methylguanine DNA methyltransferase, ribonuclease-like domain. 	77
397140	pfam02872	5_nucleotid_C	5'-nucleotidase, C-terminal domain. 	155
397141	pfam02873	MurB_C	UDP-N-acetylenolpyruvoylglucosamine reductase, C-terminal domain. Members of this family are UDP-N-acetylenolpyruvoylglucosamine reductase enzymes EC:1.1.1.158. This enzyme is involved in the biosynthesis of peptidoglycan.	99
397142	pfam02874	ATP-synt_ab_N	ATP synthase alpha/beta family, beta-barrel domain. This family includes the ATP synthase alpha and beta subunits and the ATP synthase associated with flagella.	69
397143	pfam02875	Mur_ligase_C	Mur ligase family, glutamate ligase domain. This family contains a number of related ligase enzymes which have EC numbers 6.3.2.*. This family includes: MurC, MurD, MurE, MurF, Mpl, and FolC. MurC, MurD, MurE and MurF catalyze consecutive steps in the synthesis of peptidoglycan. Peptidoglycan consists of a sheet of two sugar derivatives, with one of these N-acetylmuramic acid attaching to a small pentapeptide. The pentapeptide is made of L-alanine, D-glutamic acid, Meso-diaminopimelic acid and D-alanyl alanine. The peptide moiety is synthesized by successively adding these amino acids to UDP-N-acetylmuramic acid. MurC transfers the L-alanine, MurD transfers the D-glutamate, MurE transfers the diaminopimelic acid, and MurF transfers the D-alanyl alanine. This family also includes Folylpolyglutamate synthase that transfers glutamate to folylpolyglutamate.	87
397144	pfam02876	Stap_Strp_tox_C	Staphylococcal/Streptococcal toxin, beta-grasp domain. 	101
397145	pfam02877	PARP_reg	Poly(ADP-ribose) polymerase, regulatory domain. Poly(ADP-ribose) polymerase catalyzes the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active.	134
397146	pfam02878	PGM_PMM_I	Phosphoglucomutase/phosphomannomutase, alpha/beta/alpha domain I. 	138
397147	pfam02879	PGM_PMM_II	Phosphoglucomutase/phosphomannomutase, alpha/beta/alpha domain II. 	102
397148	pfam02880	PGM_PMM_III	Phosphoglucomutase/phosphomannomutase, alpha/beta/alpha domain III. 	114
397149	pfam02881	SRP54_N	SRP54-type protein, helical bundle domain. 	75
397150	pfam02882	THF_DHG_CYH_C	Tetrahydrofolate dehydrogenase/cyclohydrolase, NAD(P)-binding domain. 	160
397151	pfam02883	Alpha_adaptinC2	Adaptin C-terminal domain. Alpha adaptin is a heterotetramer which regulates clathrin-bud formation. The carboxyl-terminal appendage of the alpha subunit regulates translocation of endocytic accessory proteins to the bud site. This ig-fold domain is found in alpha, beta and gamma adaptins.	111
397152	pfam02884	Lyase_8_C	Polysaccharide lyase family 8, C-terminal beta-sandwich domain. This family consists of a group of secreted bacterial lyase enzymes EC:4.2.2.1 capable of acting on hyaluronan and chondroitin in the extracellular matrix of host tissues, contributing to the invasive capacity of the pathogen.	67
397153	pfam02885	Glycos_trans_3N	Glycosyl transferase family, helical bundle domain. This family includes anthranilate phosphoribosyltransferase (TrpD), thymidine phosphorylase. All these proteins can transfer a phosphorylated ribose substrate.	61
397154	pfam02886	LBP_BPI_CETP_C	LBP / BPI / CETP family, C-terminal domain. The N and C terminal domains of the LBP/BPI/CETP family are structurally similar.	238
397155	pfam02887	PK_C	Pyruvate kinase, alpha/beta domain. As well as being found in pyruvate kinase this family is found as an isolated domain in some bacterial proteins.	114
397156	pfam02888	CaMBD	Calmodulin binding domain. Small-conductance Ca2+-activated K+ channels (SK channels) are independent of voltage and gated solely by intracellular Ca2+. These membrane channels are heteromeric complexes that comprise pore-forming alpha-subunits and the Ca2+-binding protein calmodulin (CaM). CaM binds to the SK channel through the CaM-binding domain (CaMBD), which is located in an intracellular region of the alpha-subunit immediately carboxy-terminal to the pore. Channel opening is triggered when Ca2+ binds the EF hands in the N-lobe of CaM. The structure of this domain complexed with CaM is known. This domain forms an elongated dimer with a CaM molecule bound at each end; each CaM wraps around three alpha-helices, two from one CaMBD subunit and one from the other.	73
397157	pfam02889	Sec63	Sec63 Brl domain. This domain (also known as the Brl domain) is required for assembly of functional endoplasmic reticulum translocons.	307
308504	pfam02890	DUF226	Borrelia family of unknown function DUF226. This family of proteins are found in Borrelia. The proteins are about 190 amino acids long and have no known function.	139
397158	pfam02891	zf-MIZ	MIZ/SP-RING zinc finger. This domain has SUMO (small ubiquitin-like modifier) ligase activity and is involved in DNA repair and chromosome organisation.	50
397159	pfam02892	zf-BED	BED zinc finger. 	44
397160	pfam02893	GRAM	GRAM domain. The GRAM domain is found in glucosyltransferases, myotubularins and other putative membrane-associated proteins. Note the alignment is lacking the last two beta strands and alpha helix.	112
397161	pfam02894	GFO_IDH_MocA_C	Oxidoreductase family, C-terminal alpha/beta domain. This family of enzymes utilize NADP or NAD. This family is called the GFO/IDH/MOCA family in swiss-prot.	204
397162	pfam02895	H-kinase_dim	Signal transducing histidine kinase, homodimeric domain. This helical bundle domain is the homodimer interface of the signal transducing histidine kinase family.	66
397163	pfam02896	PEP-utilizers_C	PEP-utilising enzyme, TIM barrel domain. 	292
397164	pfam02897	Peptidase_S9_N	Prolyl oligopeptidase, N-terminal beta-propeller domain. This unusual 7-stranded beta-propeller domain protects the catalytic triad of prolyl oligopeptidase (see pfam00326), excluding larger peptides and proteins from proteolysis in the cytosol.	414
397165	pfam02898	NO_synthase	Nitric oxide synthase, oxygenase domain. 	362
397166	pfam02899	Phage_int_SAM_1	Phage integrase, N-terminal SAM-like domain. 	84
397167	pfam02900	LigB	Catalytic LigB subunit of aromatic ring-opening dioxygenase. 	260
397168	pfam02901	PFL-like	Pyruvate formate lyase-like. This family of enzymes includes pyruvate formate lyase, choline trimethylamine lyase, glycerol dehydratase, 4-hydroxyphenylacetate decarboxylase, and benzylsuccinate synthase.	646
397169	pfam02902	Peptidase_C48	Ulp1 protease family, C-terminal catalytic domain. This domain contains the catalytic triad Cys-His-Asn.	202
397170	pfam02903	Alpha-amylase_N	Alpha amylase, N-terminal ig-like domain. 	120
397171	pfam02905	EBV-NA1	Epstein Barr virus nuclear antigen-1, DNA-binding domain. This domain has a ferredoxin-like fold.	139
397172	pfam02906	Fe_hyd_lg_C	Iron only hydrogenase large subunit, C-terminal domain. 	277
397173	pfam02907	Peptidase_S29	Hepatitis C virus NS3 protease. Hepatitis C virus NS3 protein is a serine protease which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. NS2-3 proteinase, a zinc-dependent enzyme, performs a single proteolytic cut to release the N-terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4A.	149
397174	pfam02909	TetR_C	Tetracyclin repressor, C-terminal all-alpha domain. 	144
397175	pfam02910	Succ_DH_flav_C	Fumarate reductase flavoprotein C-term. This family contains fumarate reductases, succinate dehydrogenases and L-aspartate oxidases.	128
397176	pfam02911	Formyl_trans_C	Formyl transferase, C-terminal domain. 	95
397177	pfam02912	Phe_tRNA-synt_N	Aminoacyl tRNA synthetase class II, N-terminal domain. 	67
397178	pfam02913	FAD-oxidase_C	FAD linked oxidases, C-terminal domain. This domain has a ferredoxin-like fold.	248
397179	pfam02914	DDE_2	Bacteriophage Mu transposase. 	221
397180	pfam02915	Rubrerythrin	Rubrerythrin. This domain has a ferritin-like fold.	137
397181	pfam02916	DNA_PPF	DNA polymerase processivity factor. 	115
397182	pfam02917	Pertussis_S1	Pertussis toxin, subunit 1. 	239
280986	pfam02918	Pertussis_S2S3	Pertussis toxin, subunit 2 and 3, C-terminal domain. 	109
397183	pfam02919	Topoisom_I_N	Eukaryotic DNA topoisomerase I, DNA binding fragment. Topoisomerase I promotes the relaxation of DNA superhelical tension by introducing a transient single-stranded break in duplex DNA and is vital for the processes of replication, transcription, and recombination. This family may be more than one structural domain.	213
397184	pfam02920	Integrase_DNA	DNA binding domain of tn916 integrase. 	58
397185	pfam02921	UCR_TM	Ubiquinol cytochrome reductase transmembrane region. Each subunit of the cytochrome bc1 complex provides a single helix (this family) to make up the transmembrane region of the complex.	66
397186	pfam02922	CBM_48	Carbohydrate-binding module 48 (Isoamylase N-terminal domain). This domain is found in a range of enzymes that act on branched substrates - isoamylase, pullulanase and branching enzyme. This family also contains the beta subunit of 5' AMP activated kinase.	80
280991	pfam02923	BamHI	Restriction endonuclease BamHI. 	157
397187	pfam02924	HDPD	Bacteriophage lambda head decoration protein D. 	116
280993	pfam02925	gpD	Bacteriophage scaffolding protein D. 	141
397188	pfam02926	THUMP	THUMP domain. The THUMP domain is named after thiouridine synthases, methylases and PSUSs. The THUMP domain consists of about 110 amino acid residues. The structure of ThiI reveals that the THUMP has a fold unlike that of previously characterized RNA-binding domains. It is predicted that this domain is an RNA-binding domain. The THUMP domain probably functions by delivering a variety of RNA modification enzymes to their targets.	143
397189	pfam02927	CelD_N	Cellulase N-terminal ig-like domain. 	83
397190	pfam02928	zf-C5HC2	C5HC2 zinc finger. Predicted zinc finger with eight potential zinc ligand binding residues. This domain is found in Jumonji. This domain may have a DNA binding function.	54
397191	pfam02929	Bgal_small_N	Beta galactosidase small chain. This domain comprises the small chain of dimeric beta-galactosidases EC:3.2.1.23. This domain is also found in single chain beta-galactosidase.	230
397192	pfam02931	Neur_chan_LBD	Neurotransmitter-gated ion-channel ligand binding domain. This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure.	215
397193	pfam02932	Neur_chan_memb	Neurotransmitter-gated ion-channel transmembrane region. This family includes the four transmembrane helices that form the ion channel.	232
397194	pfam02933	CDC48_2	Cell division protein 48 (CDC48), domain 2. This domain has a double psi-beta barrel fold and includes VCP-like ATPase and N-ethylmaleimide sensitive fusion protein N-terminal domains. Both the VAT and NSF N-terminal functional domains consist of two structural domains of which this is at the C-terminus. The VAT-N domain found in AAA ATPases pfam00004 is a 185-residue substrate recognition domain.	64
397195	pfam02934	GatB_N	GatB/GatE catalytic domain. This domain is found in the GatB and GatE proteins.	283
397196	pfam02935	COX7C	Cytochrome c oxidase subunit VIIc. Cytochrome c oxidase, a 13 sub-unit complex, EC:1.9.3.1 is the terminal oxidase in the mitochondrial electron transport chain. This family is composed of cytochrome c oxidase subunit VIIc. The yeast member of this family is called COX VIII.	57
397197	pfam02936	COX4	Cytochrome c oxidase subunit IV. Cytochrome c oxidase, a 13 sub-unit complex, EC:1.9.3.1 is the terminal oxidase in the mitochondrial electron transport chain. This family is composed of cytochrome c oxidase subunit IV. The Dictyostelium member of this family is called COX VI. The yeast protein MTC3 appears to be the yeast COX IV subunit.	132
397198	pfam02937	COX6C	Cytochrome c oxidase subunit VIc. Cytochrome c oxidase, a 13 sub-unit complex, EC:1.9.3.1 is the terminal oxidase in the mitochondrial electron transport chain. This family is composed of cytochrome c oxidase subunit VIc.	70
397199	pfam02938	GAD	GAD domain. This domain is found in some members of the GatB and aspartyl tRNA synthetases.	94
397200	pfam02939	UcrQ	UcrQ family. The ubiquinol-cytochrome C reductase complex (cytochrome bc1 complex) is a respiratory multienzyme complex. This family represents the 9.5 kDa subunit of the complex.	76
397201	pfam02940	mRNA_triPase	mRNA capping enzyme, beta chain. The beta chain of mRNA capping enzyme has triphosphatase activity. The function of the capping enzyme also depends on the guanylyltransferase activity conferred by the alpha chain (see pfam01331)	221
397202	pfam02941	FeThRed_A	Ferredoxin thioredoxin reductase variable alpha chain. 	67
281009	pfam02942	Flu_B_NS1	Influenza B non-structural protein (NS1). A specific region of the influenza B virus NS1 protein, which includes part of its effector domain, blocks the covalent linkage of ISG15 to its target proteins both in vitro and in infected cells. Of the several hundred proteins induced by interferon (IFN) alpha/beta, the ubiquitin-like ISG15 protein is one of the most predominant. Influenza A virus employs a different strategy: its NS1 protein does not bind the ISG15 protein, but little or no ISG15 protein is produced during infection.	247
397203	pfam02943	FeThRed_B	Ferredoxin thioredoxin reductase catalytic beta chain. 	106
397204	pfam02944	BESS	BESS motif. The BESS motif is named after the proteins in which it is found (BEAF, Suvar(3)7 and Stonewall). The motif is 40 amino acid residues long and is composed of two predicted alpha helices. Based on the protein in which it is found and the presence of conserved positively charged residues it is predicted to be a DNA binding domain. This domain appears to be specific to drosophila.	35
397205	pfam02945	Endonuclease_7	Recombination endonuclease VII. 	82
397206	pfam02946	GTF2I	GTF2I-like repeat. This region of sequence similarity is found up to six times in a variety of proteins including GTF2I. It has been suggested that this may be a DNA binding domain.	75
281014	pfam02947	Flt3_lig	flt3 ligand. The flt3 ligand is a short chain cytokine with a 4 helical bundle fold.	131
397207	pfam02948	Amelogenin	Amelogenin. Amelogenins play a role in biomineralisation. They seem to regulate the formation of crystallites during the secretory stage of tooth enamel development and are thought to play a major role in the structural organisation and mineralisation of developing enamel. They are found in the extracellular matrix. Mutations in X-chromosomal amelogenin can cause Amelogenesis imperfecta.	173
251636	pfam02949	7tm_6	7tm Odorant receptor. This family is composed of 7 transmembrane receptors, that are probably drosophila odorant receptors.	313
367269	pfam02950	Conotoxin	Conotoxin. Conotoxins are small snail toxins that block ion channels.	74
397208	pfam02951	GSH-S_N	Prokaryotic glutathione synthetase, N-terminal domain. 	116
397209	pfam02952	Fucose_iso_C	L-fucose isomerase, C-terminal domain. 	142
397210	pfam02953	zf-Tim10_DDP	Tim10/DDP family zinc finger. Putative zinc binding domain with four conserved cysteine residues. This domain is found in the human disease protein TIMM8A. Members of this family such as Tim9 and Tim10 are involved in mitochondrial protein import. Members of this family seem to be localized to the mitochondrial intermembrane space.	62
397211	pfam02954	HTH_8	Bacterial regulatory protein, Fis family. 	41
397212	pfam02955	GSH-S_ATP	Prokaryotic glutathione synthetase, ATP-grasp domain. 	175
367272	pfam02956	TT_ORF1	TT viral orf 1. TT virus (TTV), isolated initially from a Japanese patient with hepatitis of unknown aetiology, has since been found to infect both healthy and diseased individuals and numerous prevalence studies have raised questions about its role in unexplained hepatitis. ORF1 is a large 750 residue protein. The N-terminal half of this protein corresponds to the capsid protein.	526
251643	pfam02957	TT_ORF2	TT viral ORF2. TT virus (TTV), isolated initially from a Japanese patient with hepatitis of unknown aetiology, has since been found to infect both healthy and diseased individuals, and numerous prevalence studies have raised questions about its role in unexplained hepatitis. ORF2 is a 150 residue protein. This family also includes the VP2 protein from the chicken anaemia virus which is a gyrovirus. Gyroviruses are small circular single stranded viruses. The proteins contain a set of conserved cysteine and histidine residues suggesting a zinc binding domain.	103
397213	pfam02958	EcKinase	Ecdysteroid kinase. This family includes ecdysteroid 22-kinase, an enzyme responsible for the phosphorylation of ecdysteroids (insect growth and moulting hormones) at C-22, to form physiologically inactive ecdysteroid 22-phosphates.	293
281024	pfam02959	Tax	HTLV Tax. Human T-cell leukaemia virus type I (HTLV-I) is the etiological agent for adult T-cell leukaemia (ATL), as well as for tropical spastic paraparesis (TSP) and HTLV-I-associated myelopathy (HAM). A biological understanding of the involvement of HTLV-I in ATL has focused significantly on the workings of the virally-encoded 40 kDa phospho-oncoprotein, Tax. Tax is a transcriptional activator. Its ability to modulate the expression and function of many cellular genes has been reasoned to be a major contributory mechanism explaining HTLV-I-mediated transformation of cells. In activating cellular gene expression, Tax impinges upon several cellular signal-transduction pathways, including those for CREB/ATF and NF-kappaB.	222
281025	pfam02960	K1	K1 glycoprotein. 	120
397214	pfam02961	BAF	Barrier to autointegration factor. The BAF protein has a SAM-domain-like bundle of orthogonally packed alpha-hairpins - one classic and one pseudo helix-hairpin-helix motif. The protein is involved in the prevention of retroviral DNA integration.	86
397215	pfam02962	CHMI	5-carboxymethyl-2-hydroxymuconate isomerase. 	124
397216	pfam02963	EcoRI	Restriction endonuclease EcoRI. 	257
281029	pfam02964	MeMO_Hyd_G	Methane monooxygenase, hydrolase gamma chain. 	161
397217	pfam02965	Met_synt_B12	Vitamin B12 dependent methionine synthase, activation domain. 	273
397218	pfam02966	DIM1	Mitosis protein DIM1. 	133
281032	pfam02969	TAF	TATA box binding protein associated factor (TAF). TAF proteins adopt a histone-like fold.	66
397219	pfam02970	TBCA	Tubulin binding cofactor A. 	88
397220	pfam02971	FTCD	Formiminotransferase domain. 	144
145886	pfam02972	Phycoerythr_ab	Phycoerythrin, alpha/beta chain. This family represents the non-globular alpha and beta chain components of phycoerythrin. The structure is a long beta-hairpin and a single alpha-helix.	57
281035	pfam02973	Sialidase	Sialidase, N-terminal domain. 	188
397221	pfam02974	Inh	Protease inhibitor Inh. The Inh inhibitor is secreted into the periplasm where its presumed physiological function is to protect periplasmic proteins against the action of secreted proteases. A range of proteases including A, B and C from E. chrysanthemi, alkaline protease from Pseudomonas aeruginosa and the 50 kDa protease from Serratia marcescens are inhibited.	95
397222	pfam02975	Me-amine-dh_L	Methylamine dehydrogenase, L chain. 	113
397223	pfam02976	MutH	DNA mismatch repair enzyme MutH. 	103
397224	pfam02977	CarbpepA_inh	Carboxypeptidase A inhibitor. 	40
397225	pfam02978	SRP_SPB	Signal peptide binding domain. 	95
397226	pfam02979	NHase_alpha	Nitrile hydratase, alpha chain. 	178
397227	pfam02980	FokI_C	Restriction endonuclease FokI, catalytic domain. 	136
397228	pfam02981	FokI_N	Restriction endonuclease FokI, recognition domain. 	135
202497	pfam02982	Scytalone_dh	Scytalone dehydratase. Scytalone dehydratases are structurally related to the NTF2 family (see pfam02136).	160
397229	pfam02983	Pro_Al_protease	Alpha-lytic protease prodomain. 	57
397230	pfam02984	Cyclin_C	Cyclin, C-terminal domain. Cyclins regulate cyclin dependent kinases (CDKs). Human CCNO is a Uracil-DNA glycosylase that is related to other cyclins. Cyclins contain two domains of similar all-alpha fold, of which this family corresponds with the C-terminal domain.	119
397231	pfam02985	HEAT	HEAT repeat. The HEAT repeat family is related to armadillo/beta-catenin-like repeats (see pfam00514).	31
397232	pfam02986	Fn_bind	Fibronectin binding repeat. The ability of bacteria to bind fibronectin is thought to enable the colonisation of wound tissue and blood clots. The fibronectin binding repeat is found in bacterial fibronectin binding proteins and serum opacity factor. Bacterial fibronectin binding proteins are surface proteins that covalently link to the bacterial cell wall, mediate adherence of the bacteria to host cells and trigger the fibronectin/integrin-mediated uptake of bacteria by host cells. Each fibronectin binding repeat is an array of short motifs that bind to fibronectin type I domains. Fibronectin binding repeats are natively unfolded in the absence of fibronectin and are thought to adopt a well-defined conformation (tandem beta-zipper) upon binding.	33
111833	pfam02987	LEA_4	Late embryogenesis abundant protein. Different types of LEA proteins are expressed at different stages of late embryogenesis in higher plant seed embryos and under conditions of dehydration stress. The function of these proteins is unknown.	44
190495	pfam02988	PLA2_inh	Phospholipase A2 inhibitor. 	83
281047	pfam02989	DUF228	Lyme disease proteins of unknown function. 	182
397233	pfam02990	EMP70	Endomembrane protein 70. 	509
281049	pfam02991	Atg8	Autophagy protein Atg8 ubiquitin like. Light chain 3 is proposed to function primarily as a subunit of microtubule associated proteins 1A and 1B and that its expression may regulate microtubule binding activity. Autophagy is generally known as a process involved in the degradation of bulk cytoplasmic components that are non-specifically sequestered into an autophagosome, where they are sequestered into double-membrane vesicles and delivered to the degradative organelle, the lysosome/vacuole, for breakdown and eventual recycling of the resulting macromolecules. The yeast proteins are involved in the autophagosome, and Atg8 binds Atg19, via its N-terminus and the C-terminus of Atg19.	104
397234	pfam02992	Transposase_21	Transposase family tnp2. 	211
367287	pfam02993	MCPVI	Minor capsid protein VI. This minor capsid protein may act as a link between the external capsid and the internal DNA-protein core. The C-terminal 11 residues may function as a protease cofactor leading to enzyme activation.	225
397235	pfam02994	Transposase_22	L1 transposable element RBD-like domain. This entry represents the RBD-like domain.	98
397236	pfam02995	DUF229	Protein of unknown function (DUF229). Members of this family are uncharacterized. They are 500-1200 amino acids in length and share a long region of conservation that probably corresponds to several domains. The GO annotation for the protein indicates that it is involved in nematode larval development and has a positive regulation on growth rate.	496
397237	pfam02996	Prefoldin	Prefoldin subunit. This family comprises several prefoldin subunits. The biogenesis of the cytoskeletal proteins actin and tubulin involves interaction of nascent chains of each of the two proteins with the oligomeric protein prefoldin (PFD) and their subsequent transfer to the cytosolic chaperonin CCT (chaperonin containing TCP-1). Electron microscopy shows that eukaryotic PFD, which has a similar structure to its archaeal counterpart, interacts with unfolded actin along the tips of its projecting arms. In its PFD-bound state, actin seems to acquire a conformation similar to that adopted when it is bound to CCT.	118
111843	pfam02998	Lentiviral_Tat	Lentiviral Tat protein. This family contains retroviral transactivating (Tat) proteins, from a variety of Lentiviruses.	86
308571	pfam02999	Borrelia_orfD	Borrelia orf-D family. Borrelia burgdorferi supercoiled plasmids encode multicopy tandem open reading frames called Orf-A, Orf-B, Orf-C and Orf-D. This family corresponds to Orf-D. The putative product of this gene has no known function.	100
397238	pfam03000	NPH3	NPH3 family. Phototropism of Arabidopsis thaliana seedlings in response to a blue light source is initiated by nonphototropic hypocotyl 1 (NPH1), a light-activated serine-threonine protein kinase. Mutations in NPH3 disrupt early signaling occurring downstream of the NPH1 photoreceptor. The NPH3 gene encodes a NPH1-interacting protein. NPH3 is a member of a large protein family, apparently specific to higher plants, and may function as an adapter or scaffold protein to bring together the enzymatic components of a NPH1-activated phosphorelay.	219
367290	pfam03002	Somatostatin	Somatostatin/Cortistatin family. Members of this family are hormones. Somatostatin inhibits the release of somatotropin. Cortistatin is a peptide related to the somatostatins that depresses neuronal electrical activity but, unlike somatostatin, induces low-frequency waves in the cerebral cortex and antagonizes the effects of acetylcholine on hippocampal and cortical measures of excitability.	18
281058	pfam03003	Pox_G9-A16	Pox virus entry-fusion-complex G9/A16. Pox_G9-A16 is a family of two of the eight entry-fusion complex proteins of pox viruses. The viral fusion proteins are components of the mature virion (MV) membrane. Extracellular enveloped virions (EVs), the infecting particles, are MVs with an additional membrane that is opened or removed prior to the fusion of the MV and cell membrane during virus entry. G9 and A16 interact closely with each other and each is required for membrane fusion and virus entry as well as for interaction with A56/K2.	128
367291	pfam03004	Transposase_24	Plant transposase (Ptta/En/Spm family). Transposase proteins are necessary for efficient DNA transposition. This family includes various plant transposases from the Ptta and En/Spm families.	137
397239	pfam03006	HlyIII	Haemolysin-III related. Members of this family are integral membrane proteins. This family includes a protein with hemolytic activity from Bacillus cereus. It has been proposed that YOL002c encodes a Saccharomyces cerevisiae protein that plays a key role in metabolic pathways that regulate lipid and phosphate metabolism. In eukaryotes, members are seven-transmembrane pass molecules found to encode functional receptors with a broad range of apparent ligand specificities, including progestin and adipoQ receptors, and hence have been named PAQR proteins. The mammalian members include progesterone binding proteins. Unlike the case with GPCR receptor proteins, the evolutionary ancestry of the members of this family can be traced back to the Archaea. This family belongs to the CREST superfamily, which are distantly related to GPCRs.	222
281060	pfam03007	WES_acyltransf	Wax ester synthase-like Acyl-CoA acyltransferase domain. This domain is found in wax ester synthase genes. In these proteins this domain catalyzes the CoA dependent acyltransferase reaction with fatty alcohols to form wax esters.	261
397240	pfam03008	DUF234	Archaea bacterial proteins of unknown function. 	91
397241	pfam03009	GDPD	Glycerophosphoryl diester phosphodiesterase family. E. coli has two sequence related isozymes of glycerophosphoryl diester phosphodiesterase (GDPD) - periplasmic and cytosolic. This family also includes agrocinopine synthase, the similarity to GDPD has been noted. This family appears to have weak but not significant matches to mammalian phospholipase C pfam00388, which suggests that this family may adopt a TIM barrel fold.	244
281063	pfam03010	GP4	GP4. GP4 is a minor membrane-associated glycoprotein. This family contains envelope protein GP4 from equine arteritis virus.	152
281064	pfam03011	PFEMP	PFEMP DBL domain. PfEMP1 (Plasmodium falciparum erythrocyte membrane protein) has been identified as the rosetting ligand of the malaria parasite P. falciparum. Rosetting is the adhesion of infected erythrocytes with uninfected erythrocytes in the vasculature of the infected organ, and is associated with severe malaria. PfEMP1 interacts with Complement Receptor One on uninfected erythrocytes to form rosettes. The extreme variation within these proteins and the grouping of var genes implies that var gene recombination preferentially occurs within var gene groups. These groups reflect a functional diversification that has evolved to cope with the varying conditions of transmission and host immune response met by the parasite. A recombination hotspot was uncovered between Duffy-binding-like (DBL) subdomains. Solution of the crystal structure of the N-terminal and first DBL region of PfEMP1 from the VarO variant of the PfEMP1 protein is found to be directly implicated in rosetting as the heparin-binding site.	154
397242	pfam03012	PP_M1	Phosphoprotein. This family includes the M1 phosphoprotein non-structural RNA polymerase alpha subunit, which is thought to be a component of the active polymerase, and may be involved in template binding.	296
397243	pfam03013	Pyr_excise	Pyrimidine dimer DNA glycosylase. Pyrimidine dimer DNA glycosylases excise pyrimidine dimers by hydrolysis of the glycosylic bond of the 5' pyrimidine, followed by the intra-pyrimidine phosphodiester bond. Pyrimidine dimers are the major UV-lesions of DNA.	81
367296	pfam03014	SP2	Structural protein 2. This family represents structural protein 2 of the hepatitis E virus. The high basic amino acid content of this protein has led to the suggestion of a role in viral genomic RNA encapsidation.	709
397244	pfam03015	Sterile	Male sterility protein. This family represents the C-terminal region of the male sterility protein in a number of arabidopsis and drosophila. A sequence-related jojoba acyl CoA reductase is also included.	92
397245	pfam03016	Exostosin	Exostosin family. The EXT family is a family of tumor suppressor genes. Mutations of EXT1 on 8q24.1, EXT2 on 11p11-13, and EXT3 on 19p have been associated with the autosomal dominant disorder known as hereditary multiple exostoses (HME). This is the most common known skeletal dysplasia. The chromosomal locations of other EXT genes suggest association with other forms of neoplasia. EXT1 and EXT2 have both been shown to encode a heparan sulphate polymerase with both D-glucuronyl (GlcA) and N-acetyl-D-glucosaminoglycan (GlcNAC) transferase activities. The nature of the defect in heparan sulphate biosynthesis in HME is unclear.	290
397246	pfam03017	Transposase_23	TNP1/EN/SPM transposase. 	64
397247	pfam03018	Dirigent	Dirigent-like protein. This family contains a number of proteins which are induced during disease response in plants. Members of this family are involved in lignification.	144
397248	pfam03020	LEM	LEM domain. The LEM domain is 50 residues long and is composed of two parallel alpha helices. This domain is found in inner nuclear membrane proteins. It is called the LEM domain after LAP2, Emerin, and Man1.	40
281073	pfam03021	CM2	Influenza C virus M2 protein. Influenza C virus M1 protein is encoded by a spliced mRNA. The unspliced mRNA is also found in small quantities and can encode the protein represented by this family.	139
308585	pfam03022	MRJP	Major royal jelly protein. Royal jelly is the food of queen bee larvae, and is responsible for the high reproductive ability of the queen. Major royal jelly proteins make up around 90% of larval jelly proteins. This family also includes the sequence-related yellow protein of Drosophila, which controls pigmentation of the adult cuticle and larval mouth parts.	288
397249	pfam03023	MVIN	MviN-like protein. Deletion of the mviN virulence gene in Salmonella enterica serovar Typhimurium greatly reduces virulence in a mouse model of typhoid-like disease. Open reading frames encoding homologs of MviN have since been identified in a variety of bacteria, including pathogens and non-pathogens and plant-symbionts. In the nitrogen-fixing symbiont Rhizobium tropici, mviN is required for motility. The MviM protein is predicted to be membrane-associated.	451
397250	pfam03024	Folate_rec	Folate receptor family. This family includes the folate receptor which binds to folate and reduced folic acid derivatives and mediates delivery of 5-methyltetrahydrofolate to the interior of cells. These proteins are attached to the membrane by a GPI-anchor. The proteins contain 16 conserved cysteines that form eight disulphide bridges.	172
145918	pfam03025	Papilloma_E5	Papillomavirus E5. The E5 protein from papillomaviruses is about 80 amino acids long. The proteins contain three regions that are predicted to be transmembrane alpha helices. The function of this protein is unknown.	72
281077	pfam03026	CM1	Influenza C virus M1 protein. This family represents the matrix 1 protein of influenza C virus. The protein is the product of a spliced mRNA. Small quantities of the unspliced mRNA are found in the cell additionally encoding the M2 protein (see pfam03021).	235
397251	pfam03028	Dynein_heavy	Dynein heavy chain and region D6 of dynein motor. This family represents the C-terminal region of dynein heavy chain. The chain also contains ATPase activity and microtubule binding ability and acts as a motor for the movement of organelles and vesicles along microtubules. Dynein is also involved in cilia and flagella movement. The dynein subunit consists of at least two heavy chains and a number of intermediate and light chains. The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This C-terminal domain carries the D6 region of the dynein motor where the P-loop has been lost in evolution but the general structure of a potential ATP binding site appears to be retained.	113
397252	pfam03029	ATP_bind_1	Conserved hypothetical ATP binding protein. Members of this family are found in a range of archaea and eukaryotes and have hypothesized ATP binding activity.	238
397253	pfam03030	H_PPase	Inorganic H+ pyrophosphatase. The H+ pyrophosphatase is a transmembrane proton pump involved in establishing the H+ electrochemical potential difference between the vacuole lumen and the cell cytosol. Vacuolar-type H(+)-translocating inorganic pyrophosphatases have long been considered to be restricted to plants and to a few species of phototrophic bacteria. However, in recent investigations, these pyrophosphatases have been found in organisms as disparate as thermophilic Archaea and parasitic protists.	663
397254	pfam03031	NIF	NLI interacting factor-like phosphatase. This family contains a number of NLI interacting factor isoforms and also the N-terminal regions of RNA polymerase II CTD phosphatase and FCP1 serine phosphatase. This region has been identified as the minimal phosphatase domain.	160
281082	pfam03032	FSAP_sig_propep	Frog skin active peptide family signal and propeptide. This family contains a number of defense peptides secreted from the skin of amphibians, including the opiate-like dermorphins and deltorphins, and the antimicrobial dermoseptins and temporins. The alignment for this family consists of the signal peptide and propeptide regions and does not include the active peptides.	46
397255	pfam03033	Glyco_transf_28	Glycosyltransferase family 28 N-terminal domain. The glycosyltransferase family 28 includes monogalactosyldiacylglycerol synthase (EC 2.4.1.46) and UDP-N-acetylglucosamine transferase (EC 2.4.1.-). This N-terminal domain contains the acceptor binding site and likely membrane association site. This family also contains a large number of proteins that probably have quite distinct activities.	139
397256	pfam03034	PSS	Phosphatidyl serine synthase. Phosphatidyl serine synthase is also known as serine exchange enzyme. This family represents eukaryotic PSS I and II, which are membrane-bound proteins that catalyze the replacement of the head group of a phospholipid (phosphatidylcholine or phosphatidylethanolamine) by L-serine.	273
367308	pfam03035	RNA_capsid	Calicivirus putative RNA polymerase/capsid protein. 	226
397257	pfam03036	Perilipin	Perilipin family. The perilipin family includes lipid droplet-associated protein (perilipin) and adipose differentiation-related protein (adipophilin).	403
367310	pfam03037	KMP11	Kinetoplastid membrane protein 11. Kinetoplastid membrane protein 11 is a major cell surface glycoprotein of the parasite Leishmania donovani.	90
281088	pfam03038	Herpes_UL95	UL95 family. Members of this family, found in several herpesviruses, include EBV BGLF3 and other UL95 proteins (e.g. HCMV UL95, HVS-1 34, HSV6 U67). Their function is unknown.	319
397258	pfam03039	IL12	Interleukin-12 alpha subunit. Interleukin 12 (IL-12) is a disulphide-bonded heterodimer consisting of a 35kDa alpha subunit and a 40kDa beta subunit. It is involved in the stimulation and maintenance of Th1 cellular immune responses, including the normal host defense against various intracellular pathogens, such as Leishmania, Toxoplasma, measles virus and HIV. IL-12 also has an important role in pathological Th1 responses, such as in inflammatory bowel disease and multiple sclerosis. Suppression of IL-12 activity in such diseases may have therapeutic benefit. On the other hand, administration of recombinant IL-12 may have therapeutic benefit in conditions associated with pathological Th2 responses.	214
397259	pfam03040	CemA	CemA family. Members of this family are probable integral membrane proteins. Their molecular function is unknown. CemA proteins are found in the inner envelope membrane of chloroplasts but not in the thylakoid membrane. A cyanobacterial member of this family has been implicated in CO2 transport, but is probably not a CO2 transporter itself. They are predicted to be haem-binding; however, this has not been proven experimentally.	228
281091	pfam03041	Baculo_LEF-2	lef-2. The lef-2 gene (for late expression factor 2) from baculovirus is required for expression of late genes. This gene has been shown to be specifically required for expression from the vp39 and polh promoters. LEF-1 is a DNA primase and there is some evidence to suggest that LEF-2 may bind to both DNA and LEF-1.	184
281092	pfam03042	Birna_VP5	Birnavirus VP5 protein. Birnaviruses are ds RNA viruses. Non structural protein VP5 is found in RNA segment A. The function of this small viral protein is unknown. The proteins are about 150 amino acids long and contain several conserved histidines and cysteines that might form a zinc binding site (Bateman A pers. obs.).	139
281093	pfam03043	Herpes_UL87	Herpesvirus UL87 family. Members of this family are functionally uncharacterized. This family groups together EBV BcRF1, HSV-6 U58, HVS-1 24 and HCMV UL87. The proteins range from 575 to 950 amino acids in length.	523
281094	pfam03044	Herpes_UL16	Herpesvirus UL16/UL94 family. This family groups together HSV-1 UL16, HSV-6 ORF11R, EHV-1 46, HCMV UL94, EBV BGLF2 and VZV 44. UL16 protein may play a role in capsid maturation including DNA packaging/cleavage. In immunofluorescence studies, UL16 was localized to the nucleus of infected cells in areas containing high concentrations of HSV capsid proteins. These nuclear compartments have been described previously as viral assemblons and are distinct from compartments containing replicating DNA. Localization within assemblons argues for a role of the UL16-encoded protein in capsid assembly or maturation.	328
397260	pfam03045	DAN	DAN domain. This domain contains 9 conserved cysteines and is extracellular. Therefore the cysteines may form disulphide bridges. This family of proteins has been termed the DAN family after the first member to be reported. This family includes DAN, Cerberus and Gremlin. The gremlin protein is an antagonist of bone morphogenetic protein signaling. It is postulated that all members of this family antagonize different TGF beta pfam00019 ligands. Recent work shows that the DAN protein is not an efficient antagonist of BMP-2/4 class signals; however, DAN was able to interact with GDF-5 in a frog embryo assay, suggesting that DAN may regulate signaling by the GDF-5/6/7 class of BMPs in vivo.	108
335196	pfam03047	ComC	COMC family. This family consists exclusively of streptococcal competence stimulating peptide precursors, which are generally up to 50 amino acid residues long. In all the members of this family, the leader sequence is cleaved after two conserved glycine residues; thus the leader sequence is of the double-glycine type. Competence stimulating peptides (CSP) are small (less than 25 amino acid residues) cationic peptides. The N-terminal amino acid residue is negatively charged, either glutamate or aspartate. The C-terminal end is positively charged. The third residue is also positively charged: a highly conserved arginine. A few COMC proteins and their precursors (not included in this family) do not fully follow the above description. In particular: the leader sequence in the CSP precursor from Streptococcus sanguis NCTC 7863 is not of the double-glycine type; the CSP from Streptococcus gordonii NCTC 3165 does not have a negatively charged N-terminus residue and has a lysine instead of arginine at the third position. Functionally, CSP act as pheromones, stimulating competence for genetic transformation in streptococci. In streptococci, the (CSP mediated) competence response requires exponential cell growth at a critical density, a relatively simple requirement when compared to the stationary-phase requirement of Haemophilus, or the late-logarithmic-phase of Bacillus. All bacteria induced to competence by a particular CSP are said to belong to the same pherotype, because each CSP is recognized by a specific receptor (the signalling domain of a histidine kinase ComD). Pherotypes are not necessarily species-specific. In addition, an organism may change pherotype. There are two possible mechanisms for pherotype switching: horizontal gene transfer, and accumulation of point mutations. The biological significance of pherotypes and pherotype switching is not definitively determined. Pherotype switching occurs frequently enough in naturally competent streptococci to suggest that it may be an important contributor to genetic exchange between different bacterial species. The family Antibacterial16, streptolysins from group A streptococci, has been merged into this family.	31
281097	pfam03048	Herpes_UL92	UL92 family. Members of this family, found in several herpesviruses, include EBV BDLF4, HCMV UL92, HHV8 31, HSV6 U63. Their function is unknown. The N-terminus of this protein contains 6 conserved cysteines and histidines that might form a zinc binding domain (A Bateman pers. obs.).	189
281098	pfam03049	Herpes_UL79	UL79 family. Members of this family are functionally uncharacterized proteins from herpesviruses. This family groups together HSV-6 U52, HVS-1 18 and HCMV UL79.	254
397261	pfam03050	DDE_Tnp_IS66	Transposase IS66 family. Transposase proteins are necessary for efficient DNA transposition. This family includes IS66 from Agrobacterium tumefaciens.	282
397262	pfam03051	Peptidase_C1_2	Peptidase C1-like family. This family is closely related to the Peptidase_C1 family pfam00112, containing several prokaryotic and eukaryotic aminopeptidases and bleomycin hydrolases.	438
308597	pfam03052	Adeno_52K	Adenoviral protein L1 52/55-kDa. The adenoviral protein L1 52/55-kDa is expressed in both the early and late stages of infection, which suggests that it could play multiple roles in the viral life cycle. The L1 52/55-kDa protein interacts with the viral IVa2 protein and is required for DNA packaging. L1 52/55-kDa is required to mediate stable association between the viral DNA and empty capsid.	198
251695	pfam03053	Corona_NS3b	ORF3b coronavirus protein. Members of this family are non-structural proteins, approximately 250 amino acid residues long. They are found in transmissible gastroenteritis coronavirus (TGEV) and porcine respiratory coronavirus (PRCV) isolates. These proteins are found on the same mRNA as another product, designated ORF3a. While ORF3a/b has been implicated in TGEV and PRCV pathogenesis, its precise role remains unclear.	226
281101	pfam03054	tRNA_Me_trans	tRNA methyl transferase. This family represents tRNA(5-methylaminomethyl-2-thiouridine)-methyltransferase which is involved in the biosynthesis of the modified nucleoside 5-methylaminomethyl-2-thiouridine present in the wobble position of some tRNAs.	353
397263	pfam03055	RPE65	Retinal pigment epithelial membrane protein. This family represents a retinal pigment epithelial membrane receptor which is abundantly expressed in retinal pigment epithelium, and binds plasma retinal binding protein. The family also includes the sequence related neoxanthin cleavage enzyme in plants and lignostilbene-alpha,beta-dioxygenase in bacteria.	445
397264	pfam03057	DUF236	DUF236 repeat. This family represents a short repeat region found in a number of C. elegans proteins of unknown function.	31
251699	pfam03058	Sar8_2	Sar8.2 family. Members of this family are found in Solanaceae plants, a taxonomic group (family) that includes pepper and tobacco plant species. Synthesis of these proteins is induced by tobacco mosaic virus (TMV) and salicylic acid; indeed they are thought to be involved in the development of systemic acquired resistance (SAR) after an initial hypersensitive response to microbial infection. SAR is characterized by long-lasting resistance to infection by a wide range of pathogens, extending to plant tissues distant from the initial infection site.	85
308600	pfam03059	NAS	Nicotianamine synthase protein. Nicotianamine synthase EC:2.5.1.43 catalyzes the trimerisation of S-adenosylmethionine to yield one molecule of nicotianamine. Nicotianamine has an important role in plant iron uptake mechanisms. Plants adopt two strategies (termed I and II) of iron acquisition. Strategy I is adopted by all higher plants except graminaceous plants, which adopt strategy II. In strategy I plants, the role of nicotianamine is not fully determined: possible roles include the formation of more stable complexes with ferrous than with ferric ion, which might serve as a sensor of the physiological status of iron within a plant, or which might be involved in the transport of iron. In strategy II (graminaceous) plants, nicotianamine is the key intermediate (and nicotianamine synthase the key enzyme) in the synthesis of the mugineic family (the only known family in plants) of phytosiderophores. Phytosiderophores are iron chelators whose secretion by the roots is greatly increased in instances of iron deficiency. The 3D structures of five example NAS from Methanothermobacter thermautotrophicus reveal the monomer to consist of a five-helical bundle N-terminal domain on top of a classic Rossmann fold C-terminal domain. The N-terminal domain is unique to the NAS family, whereas the C-terminal domain is homologous to the class I family of SAM-dependent methyltransferases. An active site is created at the interface of the two domains, at the rim of a large cavity that corresponds to the nucleotide binding site such as is found in other proteins adopting a Rossmann fold.	276
367316	pfam03060	NMO	Nitronate monooxygenase. Nitronate monooxygenase (NMO), formerly referred to as 2-nitropropane dioxygenase (NPD) (EC:1.13.11.32), is an FMN-dependent enzyme that uses molecular oxygen to oxidize (anionic) alkyl nitronates and, in the case of the enzyme from Neurospora crassa, (neutral) nitroalkanes to the corresponding carbonyl compounds and nitrite. Previously classified as 2-nitropropane dioxygenase, but it is now recognized that this was the result of the slow ionization of nitroalkanes to their nitronate (anionic) forms. The enzymes from the fungus Neurospora crassa and the yeast Williopsis saturnus var. mrakii (formerly classified as Hansenula mrakii) contain non-covalently bound FMN as the cofactor. Active towards linear alkyl nitronates of lengths between 2 and 6 carbon atoms and, with lower activity, towards propyl-2-nitronate. The enzyme from N. crassa can also utilize neutral nitroalkanes, but with lower activity. One atom of oxygen is incorporated into the carbonyl group of the aldehyde product. The reaction appears to involve the formation of an enzyme-bound nitronate radical and an a-peroxynitroethane species, which then decomposes, either in the active site of the enzyme or after release, to acetaldehyde and nitrite.	331
397265	pfam03061	4HBT	Thioesterase superfamily. This family contains a wide variety of enzymes, principally thioesterases. This family includes 4HBT (EC 3.1.2.23) which catalyzes the final step in the biosynthesis of 4-hydroxybenzoate from 4-chlorobenzoate in the soil dwelling microbe Pseudomonas CBS-3. This family includes various cytosolic long-chain acyl-CoA thioester hydrolases. Long-chain acyl-CoA hydrolases hydrolyze palmitoyl-CoA to CoA and palmitate; they also catalyze the hydrolysis of other long chain fatty acyl-CoA thioesters.	79
281107	pfam03062	MBOAT	MBOAT, membrane-bound O-acyltransferase family. The MBOAT (membrane bound O-acyl transferase) family of membrane proteins contains a variety of acyltransferase enzymes. A conserved histidine has been suggested to be the active site residue.	334
397266	pfam03063	Prismane	Prismane/CO dehydrogenase family. This family includes both hybrid-cluster proteins and the beta chain of carbon monoxide dehydrogenase. The hybrid-cluster proteins contain two Fe/S centers - a [4Fe-4S] cubane cluster, and a hybrid [4Fe-2S-2O] cluster. The physiological role of this protein is as yet unknown, although a role in nitrate/nitrite respiration has been suggested. The prismane protein from Escherichia coli was shown to contain hydroxylamine reductase activity (NH2OH + 2e + 2 H+ -> NH3 + H2O). This activity is rather low. Hydroxylamine reductase activity was also found in CO-dehydrogenase in which the active site Ni was replaced by Fe. The CO dehydrogenase contains a Ni-3Fe-2S-3O centre.	538
281109	pfam03064	U79_P34	HSV U79 / HCMV P34. This family represents herpes virus protein U79 and cytomegalovirus early phosphoprotein P34 (UL112).	228
397267	pfam03065	Glyco_hydro_57	Glycosyl hydrolase family 57. This family includes alpha-amylase (EC:3.2.1.1), 4-alpha-glucanotransferase (EC:2.4.1.-) and amylopullulanase enzymes.	293
397268	pfam03066	Nucleoplasmin	Nucleoplasmin/nucleophosmin domain. Nucleoplasmins are also known as chromatin decondensation proteins. They bind to core histones and transfer DNA to them in a reaction that requires ATP. This is thought to play a role in the assembly of regular nucleosomal arrays.	102
397269	pfam03067	LPMO_10	Lytic polysaccharide mono-oxygenase, cellulose-degrading. This domain is found associated with a wide variety of cellulose binding domains. This is a family of two very closely related proteins that together act as both a C1- and a C4-oxidising lytic polysaccharide mono-oxygenase, degrading cellulose. This domain is also found in baculoviral spheroidins and spindolins, proteins of unknown function.	186
397270	pfam03068	PAD	Protein-arginine deiminase (PAD). Members of this family are found in mammals. In the presence of calcium ions, PAD enzymes EC:3.5.3.15 catalyze the post-translational modification reaction responsible for the formation of citrulline residues: Protein L-arginine + H2O <=> Protein L-citrulline + NH3. Several types are recognized (and included in the family) on the basis of molecular mass, substrate specificity, and tissue localization. The expression of type I PAD is known to be under the control of oestrogen.	384
397271	pfam03069	FmdA_AmdA	Acetamidase/Formamidase family. This family includes amidohydrolases of formamide EC:3.5.1.49 and acetamide. Methylophilus methylotrophus FmdA forms a homotrimer suggesting all the members of this family also do.	271
397272	pfam03070	TENA_THI-4	TENA/THI-4/PQQC family. Members of this family are found in all three major phyla of life: archaebacteria, eubacteria, and eukaryotes. In Bacillus subtilis, TENA is one of a number of proteins that enhance the expression of extracellular enzymes, such as alkaline protease, neutral protease and levansucrase. The THI-4 protein, which is involved in thiamine biosynthesis, is also a member of this family. The C-terminal part of these proteins consistently shows significant sequence similarity to TENA proteins. This similarity was first noted with the Neurospora crassa THI-4. This family includes bacterial coenzyme PQQ synthesis protein C or PQQC proteins. Pyrroloquinoline quinone (PQQ) is the prosthetic group of several bacterial enzymes, including methanol dehydrogenase of methylotrophs and the glucose dehydrogenase of a number of bacteria. PQQC has been found to be required in the synthesis of PQQ but its function is unclear. The exact molecular function of members of this family is uncertain.	210
397273	pfam03071	GNT-I	GNT-I family. Alpha-1,3-mannosyl-glycoprotein beta-1,2-N-acetylglucosaminyltransferase (GNT-I, GLCNAC-T I) EC:2.4.1.101 transfers N-acetyl-D-glucosamine from UDP to high-mannose glycoprotein N-oligosaccharide. This is an essential step in the synthesis of complex or hybrid-type N-linked oligosaccharides. The enzyme is an integral membrane protein localized to the Golgi apparatus, and is probably distributed in all tissues. The catalytic domain is located at the C-terminus.	434
281117	pfam03072	DUF237	MG032/MG096/MG288 family 1. This family consists entirely of mycoplasmal proteins. Their function is unknown. Another related family, pfam03086, also consists entirely of mycoplasmal proteins of the MG032/MG096/MG288 family. Some proteins are included in both families, but of course differ in the aligned residues.	137
397274	pfam03073	TspO_MBR	TspO/MBR family. Tryptophan-rich sensory protein (TspO) is an integral membrane protein that acts as a negative regulator of the expression of specific photosynthesis genes in response to oxygen/light. It is involved in the efflux of porphyrin intermediates from the cell. This reduces the activity of coproporphyrinogen III oxidase, which is thought to lead to the accumulation of a putative repressor molecule that inhibits the expression of specific photosynthesis genes. Several conserved aromatic residues are necessary for TspO function: they are thought to be involved in binding porphyrin intermediates. The rat mitochondrial peripheral benzodiazepine receptor (MBR) was shown not only to retain its structure within a bacterial outer membrane, but also to be able to functionally substitute for TspO in TspO- mutants, and to act in a similar manner to TspO in its in situ location: the outer mitochondrial membrane. The biological significance of MBR remains unclear, however. It is thought to be involved in a variety of cellular functions, including cholesterol transport in steroidogenic tissues.	144
397275	pfam03074	GCS	Glutamate-cysteine ligase. This family represents the catalytic subunit of glutamate-cysteine ligase (E.C. 6.3.2.2), also known as gamma-glutamylcysteine synthetase (GCS). This enzyme catalyzes the rate limiting step in the biosynthesis of glutathione. The eukaryotic enzyme is a dimer of a heavy chain and a light chain with all the catalytic activity exhibited by the heavy chain (this family).	369
281120	pfam03076	GP3	Equine arteritis virus GP3. This protein is encoded by ORF3 of equine arteritis virus. The function is unknown.	160
281121	pfam03077	VacA2	Putative vacuolating cytotoxin. This family contains a number of Helicobacter outer membrane proteins with multiple copies of this small conserved region.	58
251715	pfam03078	ATHILA	ATHILA ORF-1 family. ATHILA is a group of Arabidopsis thaliana retrotransposons belonging to the Ty3/gypsy family of the long terminal repeat (LTR) class of eukaryotic retrotransposons. The central region of ATHILA retrotransposons contains two or three open reading frames (ORFs). This family represents the ORF1 product. The function of ORF1 is unknown.	456
281122	pfam03079	ARD	ARD/ARD' family. The two acireductone dioxygenase enzymes (ARD and ARD', previously known as E-2 and E-2') from Klebsiella pneumoniae share the same amino acid sequence, but bind different metal ions: ARD binds Ni2+, ARD' binds Fe2+. ARD and ARD' can be experimentally interconverted by removal of the bound metal ion and reconstitution with the appropriate metal ion. The two enzymes share the same substrate, 1,2-dihydroxy-3-keto-5-(methylthio)pentene, but yield different products. ARD' yields the alpha-keto precursor of methionine (and formate), thus forming part of the ubiquitous methionine salvage pathway that converts 5'-methylthioadenosine (MTA) to methionine. This pathway is responsible for the tight control of the concentration of MTA, which is a powerful inhibitor of polyamine biosynthesis and transmethylation reactions. ARD yields methylthiopropanoate, carbon monoxide and formate, and thus prevents the conversion of MTA to methionine. The role of the ARD catalyzed reaction is unclear: methylthiopropanoate is cytotoxic, and carbon monoxide can activate guanylyl cyclase, leading to increased intracellular cGMP levels. This family also contains other members, whose functions are not well characterized.	157
397276	pfam03080	Neprosin	Neprosin. Pitcher plants are insectivorous and secrete a digestive fluid into the pitcher. This fluid contains a mixture of enzymes including peptidases. One of these is neprosin, characterized from the pitcher plant Nepenthes ventrata. This peptidase is of unknown catalytic type and is unaffected by standard peptidase inhibitors. Unusually, activity is directed towards prolyl bonds, but unlike most peptidases that cleave after proline, there is no restriction on sequence length or position of the proline residue. The peptidase is secreted and is presumed to possess an N-terminal activation peptide. The neprosin domain corresponds to the mature peptidase. It is not known if other proteins with this domain are peptidases.	221
397277	pfam03081	Exo70	Exo70 exocyst complex subunit. The Exo70 protein forms one subunit of the exocyst complex. First discovered in S. cerevisiae, Exo70 and other exocyst proteins have been observed in several other eukaryotes, including humans. In S. cerevisiae, the exocyst complex is involved in the late stages of exocytosis, and is localized at the tip of the bud, the major site of exocytosis in yeast. Exo70 interacts with the Rho3 GTPase. This interaction mediates one of the three known functions of Rho3 in cell polarity: vesicle docking and fusion with the plasma membrane (the other two functions are regulation of actin polarity and transport of exocytic vesicles from the mother cell to the bud). In humans, the functions of Exo70 and the exocyst complex are less well characterized: Exo70 is expressed in several tissues and is thought to also be involved in exocytosis.	373
281125	pfam03082	MAGSP	Male accessory gland secretory protein. The accessory gland of male insects is a genital tissue that secretes many components of the ejaculatory fluid, some of which affect the female's receptivity to courtship and her rate of oviposition. This protein is expressed exclusively in the male accessory glands of adult Drosophila melanogaster. The proteins are transferred to the female fly during copulation and are rapidly altered in the female genital tract.	267
397278	pfam03083	MtN3_slv	Sugar efflux transporter for intercellular exchange. This family includes proteins such as Drosophila saliva, MtN3 involved in root nodule development and a protein involved in activation and expression of recombination activation genes (RAGs). Although the molecular function of these proteins is unknown, they are almost certainly transmembrane proteins. This family contains a region of two transmembrane helices that is found in two copies in most members of the family. This family also contains specific sugar efflux transporters that are essential for the maintenance of animal blood glucose levels, plant nectar production, and plant seed and pollen development. In many organisms it mediates glucose transport; in Arabidopsis it is necessary for pollen viability; and two of the rice homologs are specifically exploited by bacterial pathogens for virulence by means of direct binding of a bacterial effector to the SWEET promoter.	87
367326	pfam03084	Sigma_1_2	Reoviral Sigma1/Sigma2 family. Reoviruses are double-stranded RNA viruses. They lack a membrane envelope and their capsid is organized in two concentric icosahedral layers: an inner core and an outer capsid layer. The sigma1 protein is found in the outer capsid, and the sigma2 protein is found in the core. There are four other kinds of protein (besides sigma2) in the core, termed lambda 1-3, mu2. Interactions between sigma2 and lambda 1 and lambda 3 are thought to initiate core formation, followed by mu2 and lambda2. Sigma1 is a trimeric protein, and is positioned at the 12 vertices of the icosahedral outer capsid layer. Its N-terminal fibrous tail, arranged as a triple coiled coil, anchors it in the virion, and a C-terminal globular head interacts with the cellular receptor. These two parts form by separate trimerisation events. The N-terminal fibrous tail forms on the polysome, without the involvement of ATP or chaperones. The post-translational assembly of the C-terminal globular head involves the chaperone activity of Hsp90, which is associated with phosphorylation of Hsp90 during the process. Sigma1 protein acts as a cell attachment protein, and determines viral virulence, pathways of spread, and tropism. Junctional adhesion molecule has been identified as a receptor for sigma1. In type 3 reoviruses, a small region, predicted to form a beta sheet, in the N-terminal tail was found to bind target cell surface sialic acid (i.e. sialic acid acts as a co-receptor) and promote apoptosis. The sigma1 protein also binds to the lambda2 core protein.	452
367327	pfam03085	RAP-1	Rhoptry-associated protein 1 (RAP-1). Members of this family are found in Babesia species. Though not in this Pfam family, rhoptry-associated proteins are also found in Plasmodium falciparum. Indeed, animal infection with Babesia may produce a pattern similar to human malaria. Rhoptry organelles form part of the apical complex in apicomplexan parasites. Rhoptry-associated proteins are antigenic, and generate partially protective immune responses in infected mammals. Thus RAPs are among the targeted vaccine antigens for babesial (and malarial) parasites. However, RAP-1 proteins are encoded by a multigene family; thus RAP-1 proteins are polymorphic, with B and T cell epitopes that are conserved among strains, but not across species. Antibodies to Babesia RAP-1 may also be helpful in the serological detection of Babesia infections.	241
281129	pfam03086	DUF240	MG032/MG096/MG288 family 2. This family consists entirely of mycoplasmal proteins. Their function is unknown. Another related family, pfam03072, also consists entirely of mycoplasmal proteins of the MG032/MG096/MG288 family. Some proteins are included in both families, but of course differ in the aligned residues.	119
397279	pfam03087	DUF241	Arabidopsis protein of unknown function. This family represents a number of Arabidopsis proteins. Their functions are unknown.	238
281131	pfam03088	Str_synth	Strictosidine synthase. Strictosidine synthase (E.C. 4.3.3.2) is a key enzyme in alkaloid biosynthesis. It catalyzes the condensation of tryptamine with secologanin to form strictosidine.	89
397280	pfam03089	RAG2	Recombination activating protein 2. V-D-J recombination is the combinatorial process by which the huge range of immunoglobulin and T cell binding specificity is generated from a limited amount of genetic material. This process is synergistically activated by RAG1 and RAG2 in developing lymphocytes. Defects in RAG2 in humans are a cause of severe combined immunodeficiency B cell negative and Omenn syndrome.	338
397281	pfam03090	Replicase	Replicase family. This is a family of bacterial plasmid DNA replication initiator proteins. pfam01051 is a similar family. These RepA proteins exist as monomers and dimers in equilibrium: monomers bind directly to repeated DNA sequences and thus activate replication; dimers repress repA transcription by binding an inversely repeated DNA operator. Dimer dissociation can occur spontaneously or be mediated by Hsp70 chaperones.	128
397282	pfam03091	CutA1	CutA1 divalent ion tolerance protein. Several gene loci with a possible involvement in cellular tolerance to copper have been identified. One such locus in eubacteria and archaebacteria, cutA, is thought to be involved in cellular tolerance to a wide variety of divalent cations other than copper. The cutA locus consists of two operons, of one and two genes. The CutA1 protein is a cytoplasmic protein, encoded by the single-gene operon and has been linked to divalent cation tolerance. It has no recognized structural motifs. This family also contains putative proteins from eukaryotes (human and Drosophila).	99
308617	pfam03092	BT1	BT1 family. Members of this family are transmembrane proteins. Several are Leishmania putative proteins that are thought to be pteridine transporters. One such protein, previously termed (and still annotated as) ORFG, was shown to encode a biopterin transport protein using null mutants, thus being subsequently renamed BT1. The significant similarity of ORFG/BT1 to Trypanosoma brucei ESAG10 (a putative transmembrane protein and another member of this family) was previously noted. This family also contains five putative Arabidopsis thaliana proteins of unknown function. In addition, it also contains two predicted prokaryotic proteins (from the cyanobacteria Synechocystis and Synechococcus).	432
397283	pfam03094	Mlo	Mlo family. A family of plant integral membrane proteins, first discovered in barley. Mutants lacking wild-type Mlo proteins show broad spectrum resistance to the powdery mildew fungus, and dysregulated cell death control, with spontaneous cell death in response to developmental or abiotic stimuli. Thus wild-type Mlo proteins are thought to be inhibitors of cell death whose deficiency lowers the threshold required to trigger the cascade of events that result in plant cell death. Mlo proteins are localized in the plasma membrane and possess seven transmembrane regions; thus the Mlo family is the only major higher plant family to possess 7 transmembrane domains. It has been suggested that Mlo proteins function as G-protein coupled receptors in plants; however the molecular and biological functions of Mlo proteins remain to be fully determined.	484
397284	pfam03095	PTPA	Phosphotyrosyl phosphatase activator (PTPA) protein. Phosphotyrosyl phosphatase activator (PTPA) proteins stimulate the phosphotyrosyl phosphatase (PTPase) activity of the dimeric form of protein phosphatase 2A (PP2A). PTPase activity in PP2A (in vitro) is relatively low when compared to the better recognized phosphoserine/threonine protein phosphatase activity. The specific biological role of PTPA is unknown. Basal expression of PTPA depends on the activity of a ubiquitous transcription factor, Yin Yang 1 (YY1). The tumor suppressor protein p53 can inhibit PTPA expression through an unknown mechanism that negatively controls YY1.	291
397285	pfam03096	Ndr	Ndr family. This family consists of proteins from different gene families: Ndr1/RTP/Drg1, Ndr2, and Ndr3. Their similarity was previously noted. The precise molecular and cellular function of members of this family is still unknown. Yet, they are known to be involved in cellular differentiation events. The Ndr1 group was the first to be discovered. Their expression is repressed by the proto-oncogenes N-myc and c-myc, and in line with this observation, Ndr1 protein expression is down-regulated in neoplastic cells, and is reactivated when differentiation is induced by chemicals such as retinoic acid. Ndr2 and Ndr3 expression is not under the control of N-myc or c-myc. Ndr1 expression is also activated by several chemicals: tunicamycin and homocysteine induce Ndr1 in human umbilical endothelial cells; nickel induces Ndr1 in several cell types. Members of this family are found in a wide variety of multicellular eukaryotes, including an Ndr1 type protein in Helianthus annuus (sunflower), known as Sf21. Interestingly, the highest scoring matches in the noise are all alpha/beta hydrolases pfam00561, suggesting that this family may have an enzymatic function (Bateman A pers. obs.).	285
397286	pfam03097	BRO1	BRO1-like domain. This domain is found in a number of proteins including Rhophilin and BRO1. It is known to have a role in endosomal targeting. ESCRT-III subunit Snf7 binds to a conserved hydrophobic patch in the BRO1 domain that is required for protein complex formation and for the protein-sorting function of BRO1.	369
397287	pfam03098	An_peroxidase	Animal haem peroxidase. 	533
397288	pfam03099	BPL_LplA_LipB	Biotin/lipoate A/B protein ligase family. This family includes biotin protein ligase, lipoate-protein ligase A and B. Biotin is covalently attached at the active site of certain enzymes that transfer carbon dioxide from bicarbonate to organic acids to form cellular metabolites. Biotin protein ligase (BPL) is the enzyme responsible for attaching biotin to a specific lysine at the active site of biotin enzymes. Each organism probably has only one BPL. Biotin attachment is a two step reaction that results in the formation of an amide linkage between the carboxyl group of biotin and the epsilon-amino group of the modified lysine. Lipoate-protein ligase A (LPLA) catalyzes the formation of an amide linkage between lipoic acid and a specific lysine residue in lipoate dependent enzymes. The unusual biosynthesis pathway of lipoic acid is mechanistically intertwined with attachment of the cofactor.	129
397289	pfam03100	CcmE	CcmE. CcmE is the product of one of a cluster of Ccm genes that are necessary for cytochrome c biosynthesis in eubacteria. Expression of these proteins is induced when the organisms are grown under anaerobic conditions with nitrate or nitrite as the final electron acceptor.	129
335217	pfam03101	FAR1	FAR1 DNA-binding domain. This domain contains a WRKY like fold and is therefore most likely a zinc binding DNA-binding domain.	90
397290	pfam03102	NeuB	NeuB family. NeuB is the prokaryotic N-acetylneuraminic acid (Neu5Ac) synthase. It catalyzes the direct formation of Neu5Ac (the most common sialic acid) by condensation of phosphoenolpyruvate (PEP) and N-acetylmannosamine (ManNAc). This reaction has only been observed in prokaryotes; eukaryotes synthesize the 9-phosphate form, Neu5Ac-9-P, and utilize ManNAc-6-P instead of ManNAc. Such eukaryotic enzymes are not present in this family. This family also contains SpsE spore coat polysaccharide biosynthesis proteins.	240
397291	pfam03103	DUF243	Domain of unknown function (DUF243). This family of uncharacterized proteins is only found in fly proteins. It is found associated with YLP motifs pfam02757 in some proteins.	97
397292	pfam03104	DNA_pol_B_exo1	DNA polymerase family B, exonuclease domain. This domain has 3' to 5' exonuclease activity and adopts a ribonuclease H type fold.	333
397293	pfam03105	SPX	SPX domain. We have named this region the SPX domain after SYG1, Pho81 and XPR1. This 180 residue long domain is found at the amino terminus of a variety of proteins. In the yeast protein SYG1, the N-terminus directly binds to the G-protein beta subunit and inhibits transduction of the mating pheromone signal. Similarly, the N-terminus of the human XPR1 protein binds directly to the beta subunit of the G-protein heterotrimer leading to increased production of cAMP. These findings suggest that all the members of this family are involved in G-protein associated signal transduction. The N-termini of several proteins involved in the regulation of phosphate transport, including the putative phosphate level sensors PHO81 from Saccharomyces cerevisiae and NUC-2 from Neurospora crassa, are also members of this family. The SPX domain of S. cerevisiae low-affinity phosphate transporters Pho87 and Pho90 auto-regulates uptake and prevents efflux. This SPX dependent inhibition is mediated by the physical interaction with Spl2. NUC-2 contains several ankyrin repeats pfam00023. Several members of this family are annotated as XPR1 proteins: the xenotropic and polytropic retrovirus receptor confers susceptibility to infection with murine xenotropic and polytropic leukaemia viruses (MLV). Infection by these retroviruses can inhibit XPR1-mediated cAMP signalling and result in cell toxicity and death. The similarity between SYG1, phosphate regulators and XPR1 sequences has been previously noted, as has the additional similarity to several predicted proteins, of unknown function, from Drosophila melanogaster, Arabidopsis thaliana, Caenorhabditis elegans, Schizosaccharomyces pombe, and Saccharomyces cerevisiae, and many other diverse organisms. In addition, given the similarities between XPR1 and SYG1 and phosphate regulatory proteins, it has been proposed that XPR1 might be involved in G-protein associated signal transduction and may itself function as a phosphate sensor.	340
397294	pfam03106	WRKY	WRKY DNA-binding domain. 	57
367338	pfam03107	C1_2	C1 domain. This short domain is rich in cysteines and histidines. The pattern of conservation is similar to that found in pfam00130, therefore we have termed this domain DC1 for divergent C1 domain. This domain probably also binds to two zinc ions. The function of proteins with this domain is uncertain; however, this domain may bind to molecules such as diacylglycerol (A Bateman pers. obs.). This family is found in plant proteins.	48
397295	pfam03108	DBD_Tnp_Mut	MuDR family transposase. This region is found in plant proteins that are presumed to be the transposases for Mutator transposable elements. These transposons contain two ORFs. The molecular function of this region is unknown.	65
281150	pfam03109	ABC1	ABC1 family. This family includes ABC1 from yeast and AarF from E. coli. These proteins have a nuclear or mitochondrial subcellular location in eukaryotes. The exact molecular functions of these proteins are not clear; however, yeast ABC1 suppresses a cytochrome b mRNA translation defect and is essential for electron transfer in the bc1 complex, and E. coli AarF is required for ubiquinone production. It has been suggested that members of the ABC1 family are novel chaperonins. These proteins are unrelated to the ABC transporter proteins.	117
397296	pfam03110	SBP	SBP domain. SBP domains (for SQUAMOSA-PROMOTER BINDING PROTEIN) are found in plant proteins. It is a sequence specific DNA-binding domain. Members of this family probably function as transcription factors involved in the control of early flower development. The domain contains 10 conserved cysteine and histidine residues that probably are zinc ligands.	75
281152	pfam03112	DUF244	Uncharacterized protein family (ORF7) DUF. Several members of this family are Borrelia burgdorferi plasmid proteins of uncharacterized function.	161
281153	pfam03113	RSV_NS2	Respiratory syncytial virus non-structural protein NS2. The molecular structure and function of the NS2 protein is not known. However, mutants lacking NS2 grow at slower rates when compared to the wild-type. Nevertheless, NS2 is not essential for viral replication.	124
281154	pfam03114	BAR	BAR domain. BAR domains are dimerization, lipid binding and curvature sensing modules found in many different protein families. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysin, endophilin, BRAP and Nadrin. BAR domains are also frequently found alongside domains that determine lipid specificity, like pfam00169 and pfam00787 domains in beta centaurins and sorting nexins respectively.	234
367340	pfam03115	Astro_capsid_N	Astrovirus capsid protein precursor. This product is encoded by astrovirus ORF2, one of the three astrovirus ORFs (1a, 1b, 2). The 87kD precursor protein undergoes an intracellular cleavage to form a 79kD protein. Subsequently, extracellular trypsin cleavage yields the three proteins forming the infectious virion.	431
397297	pfam03116	NQR2_RnfD_RnfE	NQR2, RnfD, RnfE family. This family of bacterial proteins includes a sodium-translocating NADH-ubiquinone oxidoreductase (i.e. a respiration linked sodium pump). In Vibrio cholerae, it negatively regulates the expression of virulence factors through inhibiting (by an unknown mechanism) the transcription of the transcriptional activator ToxT. The family also includes proteins involved in nitrogen fixation, RnfD and RnfE. The similarity of these proteins to NADH-ubiquinone oxidoreductases was previously noted.	304
281157	pfam03117	Herpes_UL49_1	UL49 family. Members of this family, found in several herpesviruses, include EBV BFRF2 and other UL49 proteins (e.g. HCMVA UL49, HSV6 U33). There are eight conserved cysteine residues in this alignment, all lying towards the C-terminus. Their function is unknown.	243
397298	pfam03118	RNA_pol_A_CTD	Bacterial RNA polymerase, alpha chain C terminal domain. The alpha subunit of RNA polymerase consists of two independently folded domains, referred to as amino-terminal and carboxyl-terminal domains. The amino-terminal domain is involved in the interaction with the other subunits of the RNA polymerase. The carboxyl-terminal domain interacts with the DNA and activators. The amino acid sequence of the alpha subunit is conserved in prokaryotic and chloroplast RNA polymerases. There are three regions of particularly strong conservation, two in the amino-terminal and one in the carboxyl-terminal.	63
397299	pfam03119	DNA_ligase_ZBD	NAD-dependent DNA ligase C4 zinc finger domain. DNA ligases catalyze the crucial step of joining the breaks in duplex DNA during DNA replication, repair and recombination, utilising either ATP or NAD(+) as a cofactor. This family is a small zinc binding motif that is presumably DNA binding. It is found only in NAD dependent DNA ligases.	26
397300	pfam03120	DNA_ligase_OB	NAD-dependent DNA ligase OB-fold domain. DNA ligases catalyze the crucial step of joining the breaks in duplex DNA during DNA replication, repair and recombination, utilising either ATP or NAD(+) as a cofactor. This family is a small domain found after the adenylation domain pfam01653 in NAD dependent ligases. OB-fold domains generally are involved in nucleic acid binding.	79
367342	pfam03121	Herpes_UL52	Herpesviridae UL52/UL70 DNA primase. Herpes simplex virus type 1 DNA replication in host cells is known to be mediated by seven viral-encoded proteins, three of which form a heterotrimeric DNA helicase-primase complex. This complex consists of UL5, UL8, and UL52 subunits. Heterodimers consisting of UL5 and UL52 have been shown to retain both helicase and primase activities. Nevertheless, UL8 is still essential for replication: though it lacks any DNA binding or catalytic activities, it is involved in the transport of UL5-UL52 and it also interacts with other replication proteins. The molecular mechanisms of the UL5-UL52 catalytic activities are not known. While UL5 is associated with DNA helicase activity and UL52 with DNA primase activity, the helicase activity requires the interaction of UL5 and UL52. It is not known if the primase activity can be maintained by UL52 alone. The region encompassed by residues 610-636 of HSV1 UL52 is thought to contain a divalent metal cation binding motif. Indeed, this region contains several aspartate and glutamate residues that might be involved in divalent cation binding. The biological significance of UL52-UL8 interaction is not known. Yeast two-hybrid analysis together with immunoprecipitation experiments have shown that the HSV1 UL52 region between residues 366-914 is essential for this interaction, while the first 349 N-terminal residues are dispensable. This family also includes protein UL70 from cytomegalovirus (CMV, a subgroup of the Herpesviridae) strains, which, by analogy with UL52, is thought to have DNA primase activity. Indeed, CMV strains also possess a DNA helicase-primase complex, the other subunits being protein UL105 (with known similarity to HSV1 UL5) and protein UL102.	75
397301	pfam03122	Herpes_MCP	Herpes virus major capsid protein. This family represents the major capsid protein (MCP) of herpes viruses. The capsid shell consists of 150 MCP hexamers and 12 MCP pentamers. One pentamer is found at each of the 12 apices of the icosahedral shell, and the hexamers form the edges and 20 faces.	1368
397302	pfam03123	CAT_RBD	CAT RNA binding domain. This RNA binding domain is found at the amino terminus of transcriptional antitermination proteins such as BglG, SacY and LicT. These proteins control the expression of sugar metabolising operons in Gram+ and Gram- bacteria. This domain has been called the CAT (Co-AntiTerminator) domain. It binds as a dimer to short Ribonucleotidic Anti-Terminator (RAT) hairpin, each monomer interacting symmetrically with both strands of the RAT hairpin. In the full-length protein, CAT is followed by two phosphorylatable PTS regulation domains (pfam00874) that modulate the RNA binding activity of CAT. Upon activation, the dimeric proteins bind to RAT targets in the nascent mRNA, thereby preventing abortive dissociation of the RNA polymerase from the DNA template.	55
397303	pfam03124	EXS	EXS family. We have named this region the EXS family after (ERD1, XPR1, and SYG1). This family includes C-terminus portions from the SYG1 G-protein associated signal transduction protein from Saccharomyces cerevisiae, and sequences that are thought to be murine leukaemia virus (MLV) receptors (XPR1). N-terminus portions from these proteins are aligned in the SPX pfam03105 family. The previously noted similarity between SYG1 and MLV receptors over their whole sequences is thus borne out in pfam03105 and this family. While the N-termini aligned in pfam03105 are thought to be involved in signal transduction, the role of the C-terminus sequences aligned in this family is not known. This region of similarity contains several predicted transmembrane helices. This family also includes the ERD1 (ERD: ER retention defective) yeast proteins. ERD1 proteins are involved in the localization of endogenous endoplasmic reticulum (ER) proteins. erd1 null mutants secrete such proteins even though they possess the C-terminal HDEL ER lumen localization label sequence. In addition, null mutants also exhibit defects in the Golgi-dependent processing of several glycoproteins, which led to the suggestion that the sorting of luminal ER proteins actually occurs in the Golgi, with subsequent return of these proteins to the ER via `salvage' vesicles.	332
251743	pfam03125	Sre	C. elegans Sre G protein-coupled chemoreceptor. Caenorhabditis elegans Sre proteins are candidate chemosensory receptors. There are four main recognized groups of such receptors: Odr-10, Sra, Sro, and Srg. Sre (this family), Sra pfam02117 and Srb pfam02175 comprise the Sra group. All of the above receptors are thought to be G protein-coupled seven transmembrane domain proteins. The existence of several different chemosensory receptors underlies the fact that in spite of having only 20-30 chemosensory neurones, C. elegans detects hundreds of different chemicals, with the ability to discern individual chemicals among combinations.	363
397304	pfam03126	Plus-3	Plus-3 domain. This domain is about 90 residues in length and is often found associated with the pfam02213 domain. The function of this domain is uncertain. It is possible that this domain is involved in DNA binding as it has three conserved positively charged residues, hence this domain has been named the plus-3 domain. It is found in yeast Rtf1 which may be a transcription elongation factor.	103
397305	pfam03127	GAT	GAT domain. The GAT domain is responsible for binding of GGA proteins to several members of the ARF family including ARF1 and ARF3. The GAT domain stabilizes membrane bound ARF1 in its GTP bound state, by interfering with GAP proteins.	77
397306	pfam03128	CXCXC	CXCXC repeat. This repeat contains the conserved pattern CXCXC where X can be any amino acid. The repeat is found in up to five copies in Vascular endothelial growth factor C. In the salivary glands of the dipteran Chironomus tentans, a specific messenger ribonucleoprotein (mRNP) particle, the Balbiani ring (BR) granule, can be visualized during its assembly on the gene and during its nucleocytoplasmic transport. This repeat is found in over 70 copies in the Balbiani ring protein 3. It is also found in some silk proteins.	13
397307	pfam03129	HGTP_anticodon	Anticodon binding domain. This domain is found in histidyl, glycyl, threonyl and prolyl tRNA synthetases; it is probably the anticodon binding domain.	94
308641	pfam03130	HEAT_PBS	PBS lyase HEAT-like repeat. This family contains a short bi-helical repeat that is related to pfam02985. Cyanobacteria and red algae harvest light energy using macromolecular complexes known as phycobilisomes (PBS), peripherally attached to the photosynthetic membrane. The major components of PBS are the phycobiliproteins. These heterodimeric proteins are covalently attached to phycobilins: open-chain tetrapyrrole chromophores, which function as the photosynthetic light-harvesting pigments. Phycobiliproteins differ in sequence and in the nature and number of attached phycobilins to each of their subunits. This family includes the lyase enzymes that specifically attach particular phycobilins to apophycobiliprotein subunits. The most comprehensively studied of these is the CpcE/F lyase, which attaches phycocyanobilin (PCB) to the alpha subunit of apophycocyanin. Similarly, MpeU/V attaches phycoerythrobilin to phycoerythrin II, while CpeY/Z is thought to be involved in phycoerythrobilin (PEB) attachment to phycoerythrin (PE) I (PEs I and II differ in sequence and in the number of attached molecules of PEB: PE I has five, PE II has six). All the reactions of the above lyases involve an apoprotein cysteine SH addition to a terminal delta 3,3'-double bond. Such a reaction is not possible in the case of phycoviolobilin (PVB), the phycobilin of alpha-phycoerythrocyanin (alpha-PEC). It is thought that in this case, PCB, not PVB, is first added to apo-alpha-PEC, and is then isomerized to PVB. The addition reaction has been shown to occur in the presence of either of the components of alpha-PEC-PVB lyase PecE or PecF (or both). The isomerisation reaction occurs only when both PecE and PecF components are present, i.e. the PecE/F phycobiliprotein lyase is also a phycobilin isomerase. Another member of this family is the NblB protein, whose similarity to the phycobiliprotein lyases was previously noted. This constitutively expressed protein is not known to have any lyase activity. It is thought to be involved in the coordination of PBS degradation with environmental nutrient limitation. It has been suggested that the similarity of NblB to the phycobiliprotein lyases is due to the ability to bind tetrapyrrole phycobilins via the common repeated motif.	27
367348	pfam03131	bZIP_Maf	bZIP Maf transcription factor. Maf transcription factors contain a conserved basic region leucine zipper (bZIP) domain, which mediates their dimerization and DNA binding property. Thus, this family is probably related to pfam00170. This family also includes the DNA-binding domain of Skn-1; this domain lacks the leucine zipper found in other bZIP domains, and binds DNA as a monomer.	92
397308	pfam03133	TTL	Tubulin-tyrosine ligase family. Tubulins and microtubules are subjected to several post-translational modifications of which the reversible detyrosination/tyrosination of the carboxy-terminal end of most alpha-tubulins has been extensively analysed. This modification cycle involves a specific carboxypeptidase and the activity of the tubulin-tyrosine ligase (TTL). The true physiological function of TTL has so far not been established. Tubulin-tyrosine ligase (TTL) catalyzes the ATP-dependent post-translational addition of a tyrosine to the carboxy terminal end of detyrosinated alpha-tubulin. In normally cycling cells, the tyrosinated form of tubulin predominates. However, in breast cancer cells, the detyrosinated form frequently predominates, with a correlation to tumor aggressiveness. On the other hand, 3-nitrotyrosine has been shown to be incorporated, by TTL, into the carboxy terminal end of detyrosinated alpha-tubulin. This reaction is not reversible by the carboxypeptidase enzyme. Cells cultured in 3-nitrotyrosine rich medium showed evidence of altered microtubule structure and function, including altered cell morphology, epithelial barrier dysfunction, and apoptosis. Bacterial homologs of TTL are predicted to form peptide tags. Some of these are fused to a 2-oxoglutarate Fe(II)-dependent dioxygenase domain.	291
397309	pfam03134	TB2_DP1_HVA22	TB2/DP1, HVA22 family. This family includes members from a wide variety of eukaryotes. It includes the TB2/DP1 (deleted in polyposis) protein, which in humans is deleted in severe forms of familial adenomatous polyposis, an autosomal dominant oncological inherited disease. The family also includes the plant protein of known similarity to TB2/DP1, the HVA22 abscisic acid-induced protein, which is thought to be a regulatory protein.	77
367350	pfam03135	CagE_TrbE_VirB	CagE, TrbE, VirB family, component of type IV transporter system. This family includes the Helicobacter pylori protein CagE, which together with other proteins from the cag pathogenicity island (PAI), encodes a type IV transporter secretion system. The precise role of CagE is not known, but studies in animal models have shown that it is essential for pathogenesis in Helicobacter pylori induced gastritis and peptic ulceration. Indeed, the expression of the cag PAI has been shown to be essential for stimulating human gastric epithelial cell apoptosis in vitro. Similar type IV transport systems are also found in other bacteria. This family includes the TrbE and VirB proteins from the respective trb and Vir conjugal transfer systems in Agrobacterium tumefaciens. Homologs of VirB proteins from other species are also members of this family, e.g. VirB from Brucella suis.	202
397310	pfam03136	Pup_ligase	Pup-ligase protein. Pupylation is a novel protein modification system found in some bacteria. This family of proteins comprises the enzymes that conjugate proteins of the Pup family to lysine residues in target proteins, marking them for degradation. The archetypal protein in this family is PafA (proteasome accessory factor) from Mycobacterium tuberculosis. It has been suggested that these proteins are related to gamma-glutamyl-cysteine synthetases.	405
397311	pfam03137	OATP	Organic Anion Transporter Polypeptide (OATP) family. This family consists of several eukaryotic Organic-Anion-Transporting Polypeptides (OATPs). Several have been identified mostly in human and rat. Different OATPs vary in tissue distribution and substrate specificity. Since the numbering of different OATPs in particular species was based originally on the order of discovery, similarly numbered OATPs in humans and rats did not necessarily correspond in function, tissue distribution and substrate specificity (in spite of the name, some OATPs also transport organic cations and neutral molecules). Thus, Tamai et al. initiated the current scheme of using digits for rat OATPs and letters for human ones. Prostaglandin transporter (PGT) proteins are also considered to be OATP family members. In addition, the methotrexate transporter OATK is closely related to OATPs. This family also includes several predicted proteins from Caenorhabditis elegans and Drosophila melanogaster. This similarity was not previously noted. Note: Members of this family are described (in the Swiss-Prot database) as belonging to the SLC21 family of transporters.	529
397312	pfam03139	AnfG_VnfG	Vanadium/alternative nitrogenase delta subunit. The nitrogenase complex EC:1.18.6.1 catalyzes the conversion of molecular nitrogen to ammonia (nitrogen fixation) as follows: 8 reduced ferredoxin + 8 H(+) + N(2) + 16 ATP <=> 8 oxidized ferredoxin + 2 NH(3) + 16 ADP + 16 phosphate. The complex is hexameric, consisting of 2 alpha, 2 beta, and 2 delta subunits. This family represents the delta subunit of a group of nitrogenases that do not utilize molybdenum (Mo) as a cofactor, but instead use either vanadium (V nitrogenases), or iron (alternative nitrogenases). V nitrogenases are encoded by vnf operons, and alternative nitrogenases by anf operons. The delta subunits are VnfG and AnfG, respectively.	111
397313	pfam03140	DUF247	Plant protein of unknown function. The function of the plant proteins constituting this family is unknown.	389
335237	pfam03141	Methyltransf_29	Putative S-adenosyl-L-methionine-dependent methyltransferase. This family is a putative S-adenosyl-L-methionine (SAM)-dependent methyltransferase.	506
367353	pfam03142	Chitin_synth_2	Chitin synthase. Members of this family are fungal chitin synthase EC:2.4.1.16 enzymes. They catalyze chitin synthesis as follows: UDP-N-acetyl-D-glucosamine + {(1,4)-(N-acetyl-beta-D-glucosaminyl)}(N) <=> UDP + {(1,4)-(N-acetyl-beta-D-glucosaminyl)}(N+1).	527
397314	pfam03143	GTP_EFTU_D3	Elongation factor Tu C-terminal domain. Elongation factor Tu consists of three structural domains; this is the third domain. This domain adopts a beta barrel structure. The third domain is involved in binding to both charged tRNA and EF-Ts pfam00889.	105
397315	pfam03144	GTP_EFTU_D2	Elongation factor Tu domain 2. Elongation factor Tu consists of three structural domains; this is the second domain. This domain adopts a beta barrel structure. The second domain is involved in binding to charged tRNA. This domain is also found in other proteins such as elongation factor G and translation initiation factor IF-2. This domain is structurally related to pfam03143, and in fact has weak sequence matches to this domain.	73
397316	pfam03145	Sina	Seven in absentia protein family. The seven in absentia (sina) gene was first identified in Drosophila. The Drosophila Sina protein is essential for the determination of the R7 pathway in photoreceptor cell development: the loss of functional Sina results in the transformation of the R7 precursor cell to a non-neuronal cell type. The Sina protein contains an N-terminal RING finger domain pfam00097. Through this domain, Sina binds E2 ubiquitin-conjugating enzymes (UbcD1). Sina also interacts with Tramtrack (TTK88) via PHYL. Tramtrack is a transcriptional repressor that blocks photoreceptor determination, while PHYL down-regulates the activity of TTK88. In turn, the activity of PHYL requires the activation of the Sevenless receptor tyrosine kinase, a process essential for R7 determination. It is thought that Sina thus targets TTK88 for degradation, therefore promoting the R7 pathway. Murine and human homologs of Sina have also been identified. The human homolog Siah-1 also binds E2 enzymes (UbcH5) and through a series of physical interactions, targets beta-catenin for ubiquitin degradation. Siah-1 expression is enhanced by p53, itself promoted by DNA damage. Thus this pathway links DNA damage to beta-catenin degradation. Sina proteins, therefore, physically interact with a variety of proteins. The N-terminal RING finger domain that binds ubiquitin conjugating enzymes is described in pfam00097, and does not form part of the alignment for this family. The remaining C-terminal part is involved in interactions with other proteins, and is included in this alignment. In addition to the Drosophila protein and mammalian homologs, whose similarity was noted previously, this family also includes putative homologs from Caenorhabditis elegans and Arabidopsis thaliana.	198
397317	pfam03146	NtA	Agrin NtA domain. Agrin is a multidomain heparan sulphate proteoglycan that is a key organizer for the induction of postsynaptic specialisations at the neuromuscular junction. Binding of agrin to basement membranes requires the amino terminal (NtA) domain. This region mediates high affinity interaction with the coiled-coil domain of laminins. The binding of agrin to laminins via the NtA domain is subject to tissue-specific regulation. The NtA domain-containing form of agrin is expressed in non-neuronal cells or in neurons that project to non-neuronal cells, such as motor neurons. The structure of this domain is an OB-fold.	109
397318	pfam03147	FDX-ACB	Ferredoxin-fold anticodon binding domain. This is the anticodon binding domain found in some phenylalanyl tRNA synthetases. The domain has a ferredoxin fold.	94
397319	pfam03148	Tektin	Tektin family. Tektins are cytoskeletal proteins. They have been demonstrated in such cellular sites as centrioles, basal bodies, and along ciliary and flagellar doublet microtubules. Tektins form unique protofilaments, organized as longitudinal polymers of tektin heterodimers with axial periodicity matching tubulin. Tektin polypeptides consist of several alpha-helical regions that are predicted to form coiled coils. Indeed, tektins share considerable structural similarities with intermediate filament proteins. Possible functional roles for tektins are: stabilisation of tubulin protofilaments; attachment of A and B-tubules in ciliary/flagellar microtubule doublets and C-tubules in centrioles; binding of axonemal components.	383
397320	pfam03150	CCP_MauG	Di-haem cytochrome c peroxidase. This is a family of distinct cytochrome c peroxidases (CCPs) that contain two haem groups. Similar to other cytochrome c peroxidases, they reduce hydrogen peroxide to water using c-type haem as an oxidisable substrate. However, since they possess two, instead of one, haem prosthetic groups, bacterial CCPs reduce hydrogen peroxide without the need to generate semi-stable free radicals. The two haem groups have significantly different redox potentials. The high potential (+320 mV) haem feeds electrons from electron shuttle proteins to the low potential (-330 mV) haem, where peroxide is reduced (indeed, the low potential site is known as the peroxidatic site). The CCP protein itself is structured into two domains, each containing one c-type haem group, with a calcium-binding site at the domain interface. This family also includes MauG proteins, whose similarity to di-haem CCP was previously recognized.	151
308657	pfam03151	TPT	Triose-phosphate Transporter family. This family includes transporters with a specificity for triose phosphate.	290
397321	pfam03152	UFD1	Ubiquitin fusion degradation protein UFD1. Post-translational ubiquitin-protein conjugates are recognized for degradation by the ubiquitin fusion degradation (UFD) pathway. Several proteins involved in this pathway have been identified. This family includes UFD1, a 40kD protein that is essential for vegetative cell viability. The human UFD1 gene is expressed at high levels during embryogenesis, especially in the eyes and in the inner ear primordia and is thought to be important in the determination of ectoderm-derived structures, including neural crest cells. In addition, this gene is deleted in the CATCH-22 (cardiac defects, abnormal facies, thymic hypoplasia, cleft palate and hypocalcaemia with deletions on chromosome 22) syndrome. This clinical syndrome is associated with a variety of developmental defects, all characterized by microdeletions on 22q11.2. Two such developmental defects are the DiGeorge syndrome OMIM:188400, and the velo-cardio-facial syndrome OMIM:145410. Several of the abnormalities associated with these conditions are thought to be due to defective neural crest cell differentiation.	172
397322	pfam03153	TFIIA	Transcription factor IIA, alpha/beta subunit. Transcription initiation factor IIA (TFIIA) is a heterotrimer, the three subunits being known as alpha, beta, and gamma, in order of molecular weight. The N and C-terminal domains of the gamma subunit are represented in pfam02268 and pfam02751, respectively. This family represents the precursor that yields both the alpha and beta subunits. The TFIIA heterotrimer is an essential general transcription initiation factor for the expression of genes transcribed by RNA polymerase II. Together with TFIID, TFIIA binds to the promoter region; this is the first step in the formation of a pre-initiation complex (PIC). Binding of the rest of the transcription machinery follows this step. After initiation, the PIC does not completely dissociate from the promoter. Some components, including TFIIA, remain attached and re-initiate a subsequent round of transcription.	237
397323	pfam03154	Atrophin-1	Atrophin-1 family. Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.	982
397324	pfam03155	Alg6_Alg8	ALG6, ALG8 glycosyltransferase family. N-linked (asparagine-linked) glycosylation of proteins is mediated by a highly conserved pathway in eukaryotes, in which a lipid (dolichol phosphate)-linked oligosaccharide is assembled at the endoplasmic reticulum membrane prior to the transfer of the oligosaccharide moiety to the target asparagine residues. This oligosaccharide is composed of Glc(3)Man(9)GlcNAc(2). The addition of the three glucose residues is the final series of steps in the synthesis of the oligosaccharide precursor. Alg6 transfers the first glucose residue, and Alg8 transfers the second one. In the human alg6 gene, a C->T transition, which causes Ala333 to be replaced with Val, has been identified as the cause of a congenital disorder of glycosylation, designated as type Ic OMIM:603147.	472
367362	pfam03157	Glutenin_hmw	High molecular weight glutenin subunit. Members of this family include high molecular weight subunits of glutenin. This group of gluten proteins is thought to be largely responsible for the elastic properties of gluten, and hence, doughs. Indeed, glutenin high molecular weight subunits are classified as elastomeric proteins, because the glutenin network can withstand significant deformations without breaking, and return to the original conformation when the stress is removed. Elastomeric proteins differ considerably in amino acid sequence, but they are all polymers whose subunits consist of elastomeric domains, composed of repeated motifs, and non-elastic domains that mediate cross-linking between the subunits. The elastomeric domain motifs are all rich in glycine residues in addition to other hydrophobic residues. High molecular weight glutenin subunits have an extensive central elastomeric domain, flanked by two terminal non-elastic domains that form disulphide cross-links. The central elastomeric domain is characterized by the following three repeated motifs: PGQGQQ, GYYPTS[P/L]QQ, GQQ. It possesses overlapping beta-turns within and between the repeated motifs, and assumes a regular helical secondary structure with a diameter of approx. 1.9 nm and a pitch of approx. 1.5 nm.	786
281192	pfam03158	DUF249	Multigene family 530 protein. Members of this family are multigene family 530 proteins from African swine fever viruses. These proteins may be involved in promoting survival of infected macrophages.	192
397325	pfam03159	XRN_N	XRN 5'-3' exonuclease N-terminus. This family aligns residues towards the N-terminus of several proteins with multiple functions. The members of this family all appear to possess 5'-3' exonuclease activity EC:3.1.11.-. Thus, the aligned region may be necessary for 5' to 3' exonuclease function. The family also contains several Xrn1 and Xrn2 proteins. The 5'-3' exoribonucleases Xrn1p and Xrn2p/Rat1p function in the degradation and processing of several classes of RNA in Saccharomyces cerevisiae. Xrn1p is the main enzyme catalyzing cytoplasmic mRNA degradation in multiple decay pathways, whereas Xrn2p/Rat1p functions in the processing of rRNAs and small nucleolar RNAs (snoRNAs) in the nucleus.	231
397326	pfam03160	Calx-beta	Calx-beta domain. 	91
397327	pfam03161	LAGLIDADG_2	LAGLIDADG DNA endonuclease family. This is a family of site-specific DNA endonucleases encoded by DNA mobile elements. Similar to pfam00961, the members of this family are also LAGLIDADG endonucleases.	168
397328	pfam03162	Y_phosphatase2	Tyrosine phosphatase family. This family is closely related to the pfam00102 and pfam00782 families.	150
367367	pfam03164	Mon1	Trafficking protein Mon1. Members of this family have been called SAND proteins although these proteins do not contain a SAND domain. In Saccharomyces cerevisiae a protein complex of Mon1 and Ccz1 functions with the small GTPase Ypt7 to mediate vesicle trafficking to the vacuole. The Mon1/Ccz1 complex is conserved in eukaryotic evolution and members of this family (previously known as DUF254) are distant homologs to domains of known structure that assemble into cargo vesicle adapter (AP) complexes. describes orthologues in Fugu rubripes.	400
397329	pfam03165	MH1	MH1 domain. The MH1 (MAD homology 1) domain is found at the amino terminus of MAD related proteins such as Smads. This domain is separated from the MH2 domain by a non-conserved linker region. The crystal structure of the MH1 domain shows that a highly conserved 11 residue beta hairpin is used to bind the DNA consensus sequence GNCN in the major groove, shown to be vital for the transcriptional activation of target genes. Not all examples of MH1 can bind to DNA however. Smad2 cannot bind DNA and has a large insertion within the hairpin that presumably abolishes DNA binding. A basic helix (H2) in MH1 with the nuclear localization signal KKLKK has been shown to be essential for Smad3 nuclear import. Smads also use the MH1 domain to interact with transcription factors such as Jun, TFE3, Sp1, and Runx.	103
397330	pfam03166	MH2	MH2 domain. This is the MH2 (MAD homology 2) domain found at the carboxy terminus of MAD related proteins such as Smads. This domain is separated from the MH1 domain by a non-conserved linker region. The MH2 domain mediates interaction with a wide variety of proteins and provides specificity and selectivity to Smad function and also is critical for mediating interactions in Smad oligomers. Unlike MH1, MH2 does not bind DNA. The well-studied MH2 domain of Smad4 is composed of five alpha helices and three loops enclosing a beta sandwich. Smads are involved in the propagation of TGF-beta signals by direct association with the TGF-beta receptor kinase which phosphorylates the last two Ser of a conserved 'SSXS' motif located at the C-terminus of MH2.	181
397331	pfam03167	UDG	Uracil DNA glycosylase superfamily. 	154
397332	pfam03168	LEA_2	Late embryogenesis abundant protein. Different types of LEA proteins are expressed at different stages of late embryogenesis in higher plant seed embryos and under conditions of dehydration stress. The function of these proteins is unknown. This family represents a group of LEA proteins that appear to be distinct from those in pfam02987. The family DUF1511, pfam07427, has now been merged into this family.	98
367370	pfam03169	OPT	OPT oligopeptide transporter protein. The OPT family of oligopeptide transporters is distinct from the ABC pfam00005 and PTR pfam00854 transporter families. OPT transporters were first recognized in fungi (Candida albicans and Schizosaccharomyces pombe), but this alignment also includes orthologues from Arabidopsis thaliana. OPT transporters are thought to have 12-14 transmembrane domains and contain the following motif: SPYxEVRxxVxxxDDP.	614
397333	pfam03170	BcsB	Bacterial cellulose synthase subunit. This family includes bacterial proteins involved in cellulose synthesis. Cellulose synthesis has been identified in several bacteria. In Agrobacterium tumefaciens, for instance, cellulose has a pathogenic role: it allows the bacteria to bind tightly to their host plant cells. While several enzymatic steps are involved in cellulose synthesis, potentially the only step unique to this pathway is that catalyzed by cellulose synthase. This enzyme is a multi subunit complex. This family encodes a subunit that is thought to bind the positive effector cyclic di-GMP. This subunit is found in several different bacterial cellulose synthase enzymes. The first recognized sequence for this subunit is BcsB. In the AcsII cellulose synthase, this subunit and the subunit corresponding to BcsA are found in the same protein. Indeed, this alignment only includes the C-terminal half of the AcsAII synthase, which corresponds to BcsB.	605
397334	pfam03171	2OG-FeII_Oxy	2OG-Fe(II) oxygenase superfamily. This family contains members of the 2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily. This family includes the C-terminal of prolyl 4-hydroxylase alpha subunit. The holoenzyme has the activity EC:1.14.11.2 catalyzing the reaction: Procollagen L-proline + 2-oxoglutarate + O2 <=> procollagen trans-4-hydroxy-L-proline + succinate + CO2. The full enzyme consists of an alpha2 beta2 complex with the alpha subunit contributing most of the parts of the active site. The family also includes lysyl hydroxylases, isopenicillin synthases and AlkB.	101
397335	pfam03172	HSR	HSR domain. The Sp100 protein is a constituent of nuclear domains, also known as nuclear dots (NDs). An ND-targeting region that coincides with a homodimerization domain was mapped in Sp100. Sequences similar to the Sp100 homodimerization/ND-targeting region occur in several other proteins and constitute a novel protein motif, termed HSR domain (for homogeneously-staining region). The HSR domain has also been named ASS (AIRE, Sp-100 and Sp140). This domain is usually found at the amino terminus of proteins that contain a SAND domain pfam01342.	99
397336	pfam03173	CHB_HEX	Putative carbohydrate binding domain. This domain represents the N terminal domain in chitobiases and beta-hexosaminidases EC:3.2.1.52. It is composed of a beta sandwich structure that is similar in structure to the cellulose binding domain of cellulase from Cellulomonas fimi. This suggests that this may be a carbohydrate binding domain.	152
397337	pfam03174	CHB_HEX_C	Chitobiase/beta-hexosaminidase C-terminal domain. This short domain represents the C terminal domain in chitobiases and beta-hexosaminidases EC:3.2.1.52. It is composed of a beta sandwich structure. The function of this domain is unknown.	76
367375	pfam03175	DNA_pol_B_2	DNA polymerase type B, organellar and viral. Like pfam00136, members of this family are also DNA polymerase type B proteins. Those included here are found in plant and fungal mitochondria, and in viruses.	455
308676	pfam03176	MMPL	MMPL family. Members of this family are putative integral membrane proteins from bacteria. Several of the members are mycobacterial proteins. Many of the proteins contain two copies of this aligned region. The function of these proteins is not known, although it has been suggested that they may be involved in lipid transport.	332
397338	pfam03177	Nucleoporin_C	Non-repetitive/WGA-negative nucleoporin C-terminal. This is the C-terminal half of a family of nucleoporin proteins. Nucleoporins are the main components of the nuclear pore complex in eukaryotic cells, and mediate bidirectional nucleocytoplasmic transport, especially of mRNA and proteins. Two nucleoporin classes are known: one is characterized by the FG repeat pfam03093; the other is represented by this family, and lacks any repeats. RNA undergoing nuclear export first encounters the basket of the nuclear pore and many nucleoporins are accessible on the basket side of the pore.	559
397339	pfam03178	CPSF_A	CPSF A subunit region. This family includes a region that lies towards the C-terminus of the cleavage and polyadenylation specificity factor (CPSF) A (160 kDa) subunit. CPSF is involved in mRNA polyadenylation and binds the AAUAAA conserved sequence in pre-mRNA. CPSF has also been found to be necessary for splicing of single-intron pre-mRNAs. The function of the aligned region is unknown but may be involved in RNA/DNA binding.	318
397340	pfam03179	V-ATPase_G	Vacuolar (H+)-ATPase G subunit. This family represents the eukaryotic vacuolar (H+)-ATPase (V-ATPase) G subunit. V-ATPases generate an acidic environment in several intracellular compartments. Correspondingly, they are found as membrane-attached proteins in several organelles. They are also found in the plasma membranes of some specialized cells. V-ATPases consist of peripheral (V1) and membrane integral (V0) heteromultimeric complexes. The G subunit is part of the V1 subunit, but is also thought to be strongly attached to the V0 complex. It may be involved in the coupling of ATP degradation to H+ translocation.	105
397341	pfam03180	Lipoprotein_9	NLPA lipoprotein. This family of bacterial lipoproteins contains several antigenic members, that may be involved in bacterial virulence. Their precise function is unknown. However they are probably distantly related to pfam00497 which are solute binding proteins.	236
397342	pfam03181	BURP	BURP domain. The BURP domain is found at the C-terminus of several different plant proteins. It was named after the proteins in which it was first identified: the BNM2 clone-derived protein from Brassica napus; USPs and USP-like proteins; RD22 from Arabidopsis thaliana; and PG1beta from Lycopersicon esculentum. This domain is around 230 amino acid residues long. It possesses the following conserved features: two phenylalanine residues at its N-terminus; two cysteine residues; and four repeated cysteine-histidine motifs, arranged as: CH-X(10)-CH-X(25-27)-CH-X(25-26)-CH, where X can be any amino acid. The function of this domain is unknown.	215
112017	pfam03183	Borrelia_rep	Borrelia repeat protein. 	18
367380	pfam03184	DDE_1	DDE superfamily endonuclease. This family of proteins is related to pfam00665, and its members are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, whose members contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction. Interestingly this family also includes the CENP-B protein. This domain in that protein appears to have lost the metal binding residues and is unlikely to have endonuclease activity. Centromere Protein B (CENP-B) is a DNA-binding protein localized to the centromere.	177
397343	pfam03185	CaKB	Calcium-activated potassium channel, beta subunit. 	195
397344	pfam03186	CobD_Cbib	CobD/Cbib protein. This family includes CobD proteins from a number of bacteria. In Salmonella this protein is called Cbib; Salmonella CobD is a different protein. This protein is involved in cobalamin biosynthesis and is probably an enzyme responsible for the conversion of adenosylcobyric acid to adenosylcobinamide or adenosylcobinamide phosphate.	278
281216	pfam03187	Corona_I	Corona nucleocapsid I protein. 	207
397345	pfam03188	Cytochrom_B561	Eukaryotic cytochrome b561. Cytochrome b561 is a secretory vesicle-specific electron transport protein. It is an integral membrane protein, that binds two heme groups non-covalently. This is a eukaryotic family. Members of the 'prokaryotic cytochrome b561' family can be found in pfam01292.	137
397346	pfam03189	Otopetrin	Otopetrin. 	435
397347	pfam03190	Thioredox_DsbH	Protein of unknown function, DUF255. 	163
367385	pfam03192	DUF257	Pyrococcus protein of unknown function, DUF257. 	205
397348	pfam03193	RsgA_GTPase	RsgA GTPase. RsgA (also known as EngC and YjeQ) represents a protein family whose members are broadly conserved in bacteria and are indispensable for growth. The GTPase domain of RsgA is very similar to several P-loop GTPases, but differs in having a circular permutation of the GTPase structure described by a G4-G1-G3 pattern.	174
397349	pfam03194	LUC7	LUC7 N_terminus. This family contains the N terminal region of several LUC7 protein homologs and only contains eukaryotic proteins. LUC7 has been shown to be a U1 snRNA associated protein with a role in splice site recognition. The family also contains human and mouse LUC7 like (LUC7L) proteins and human cisplatin resistance-associated overexpressed protein (CROP).	246
397350	pfam03195	LOB	Lateral organ boundaries (LOB) domain. The lateral organ boundaries (LOB) gene encodes a plant-specific protein of unknown function that is expressed at the adaxial base of initiating lateral organs. The N-terminal of the LOB protein contains an approximately 100-amino acid conserved domain (the LOB domain) that is present in 42 other Arabidopsis thaliana proteins as well as in proteins from a variety of other plant species. The LOB domain contains conserved blocks of amino acids that identify the LOB domain (LBD) gene family. In particular, a conserved C-x(2)-C-x(6)-C-x(3)-C motif, which is the defining feature of the LOB domain, is present in all LBD proteins.	99
281224	pfam03196	DUF261	Protein of unknown function, DUF261. 	137
281225	pfam03197	FRD2	Bacteriophage FRD2 protein. 	103
397351	pfam03198	Glyco_hydro_72	Glucanosyltransferase. This is a family of glycosylphosphatidylinositol-anchored beta(1-3)glucanosyltransferases. The active site residues in the Aspergillus fumigatus example are the two glutamate residues at 160 and 261.	315
397352	pfam03199	GSH_synthase	Eukaryotic glutathione synthase. 	103
397353	pfam03200	Glyco_hydro_63	Glycosyl hydrolase family 63 C-terminal domain. This is a family of eukaryotic enzymes belonging to glycosyl hydrolase family 63. They catalyze the specific cleavage of the non-reducing terminal glucose residue from Glc(3)Man(9)GlcNAc(2). Mannosyl oligosaccharide glucosidase EC:3.2.1.106 is the first enzyme in the N-linked oligosaccharide processing pathway. This family represents the C-terminal catalytic domain.	494
397354	pfam03201	HMD	H2-forming N5,N10-methylene-tetrahydromethanopterin dehydrogenase. 	88
397355	pfam03202	Lipoprotein_10	Putative mycoplasma lipoprotein, C-terminal region. 	129
397356	pfam03203	MerC	MerC mercury resistance protein. 	105
397357	pfam03205	MobB	Molybdopterin guanine dinucleotide synthesis protein B. This protein contains a P-loop.	133
397358	pfam03206	NifW	Nitrogen fixation protein NifW. Nitrogenase is a complex metalloenzyme composed of two proteins designated the Fe-protein and the MoFe-protein. Apart from these two proteins, a number of accessory proteins are essential for the maturation and assembly of nitrogenase. Even though experimental evidence suggests that these accessory proteins are required for nitrogenase activity, the exact roles played by many of these proteins in the functions of nitrogenase are unclear. Using yeast two-hybrid screening it has been shown that NifW can interact with itself as well as NifZ.	99
367392	pfam03207	OspD	Borrelia outer surface protein D (OspD). 	254
397359	pfam03208	PRA1	PRA1 family protein. This family includes the PRA1 (Prenylated rab acceptor) protein which is a Rab guanine dissociation inhibitor (GDI) displacement factor. This family also includes the glutamate transporter EAAC1 interacting protein GTRAP3-18.	141
367394	pfam03209	PUCC	PUCC protein. This protein is required for high-level transcription of the PUC operon.	401
281237	pfam03210	Paramyx_P_V_C	Paramyxovirus P/V phosphoprotein C-terminal. Paramyxoviridae P genes are able to generate more than one product, using alternative reading frames and RNA editing. The P gene encodes the structural phosphoprotein P. In addition, it encodes several non-structural proteins present in the infected cell but not in the virus particle. This family includes phosphoprotein P and the non-structural phosphoprotein V from different paramyxoviruses. Phosphoprotein P is essential for the activity of the RNA polymerase complex which it forms with another subunit, L pfam00946. Although all the catalytic activities of the polymerase are associated with the L subunit, its function requires specific interactions with phosphoprotein P. The P and V phosphoproteins are amino co-terminal, but diverge at their C-termini. This difference is generated by an RNA-editing mechanism in which one or two non-templated G residues are inserted into P-gene-derived mRNA. In measles virus and Sendai virus, one G residue is inserted and the edited transcript encodes the V protein. In mumps, simian virus type 5 and Newcastle disease virus, two G residues are inserted, and the edited transcript codes for the P protein. Being phosphoproteins, both P and V are rich in serine and threonine residues over their whole lengths. In addition, the V proteins are rich in cysteine residues at the C-termini. This C-terminal region of the P phosphoprotein is likely to be the nucleocapsid-binding domain, and is found to be intrinsically disordered and thus liable to induced folding.	154
397360	pfam03211	Pectate_lyase	Pectate lyase. 	200
367396	pfam03212	Pertactin	Pertactin. 	121
281240	pfam03213	Pox_P35	Poxvirus P35 protein. 	323
367397	pfam03214	RGP	Reversibly glycosylated polypeptide. 	340
367398	pfam03215	Rad17	Rad17 cell cycle checkpoint protein. 	186
397361	pfam03216	Rhabdo_ncap_2	Rhabdovirus nucleoprotein. 	295
397362	pfam03217	SLAP	SLAP domain. This short domain is found in a variety of bacterial cell surface proteins. The domain is about 60 residues in length (although previously defined as 2 copies of this domain). It usually occurs in tandem pairs. It may be distantly related to the SH3 domain.	54
397363	pfam03219	TLC	TLC ATP/ADP transporter. 	491
397364	pfam03220	Tombus_P19	Tombusvirus P19 core protein. 	171
397365	pfam03221	HTH_Tnp_Tc5	Tc5 transposase DNA-binding domain. 	63
367404	pfam03222	Trp_Tyr_perm	Tryptophan/tyrosine permease family. 	393
397366	pfam03223	V-ATPase_C	V-ATPase subunit C. 	369
397367	pfam03224	V-ATPase_H_N	V-ATPase subunit H. The yeast Saccharomyces cerevisiae vacuolar H+-ATPase (V-ATPase) is a multisubunit complex responsible for acidifying organelles. It functions as an ATP dependent proton pump that transports protons across a lipid bilayer. This domain corresponds to the N terminal domain of the H subunit of V-ATPase. The N-terminal domain is required for the activation of the complex whereas the C-terminal domain is required for coupling ATP hydrolysis to proton translocation.	311
251809	pfam03225	Viral_Hsp90	Viral heat shock protein Hsp90 homolog. 	511
397368	pfam03226	Yippee-Mis18	Yippee zinc-binding/DNA-binding/Mis18, centromere assembly. This family includes both Yippee-type proteins and Mis18 kinetochore proteins. Yippee proteins are putative zinc-binding/DNA-binding proteins. Mis18 proteins are involved in the priming of centromeres for recruiting CENP-A. Mis18-alpha and beta form part of a small complex with Mis18-binding protein. Mis18-alpha is found to interact with DNA de-methylases through a Leu-rich region located at its carboxyl terminus. This entry also includes the CULT domain proteins such as Cereblon.	100
397369	pfam03227	GILT	Gamma interferon inducible lysosomal thiol reductase (GILT). This family includes the two characterized human gamma-interferon-inducible lysosomal thiol reductase (GILT) sequences. It also contains several other eukaryotic putative proteins with similarity to GILT. The aligned region contains three conserved cysteine residues. In addition, the two GILT sequences possess a C-X(2)-C motif that is shared by some of the other sequences in the family. This motif is thought to be associated with disulphide bond reduction.	106
281252	pfam03228	Adeno_VII	Adenoviral core protein VII. The function of this protein is unknown. It has a conserved amino terminus of 50 residues followed by a positively charged tail, suggesting it may interact with nucleic acid. The major core protein of the adenovirus, protein VII, was found to be associated with viral DNA throughout infection. The precursor to protein VII was shown to be an in vivo and in vitro acceptor of ADP-ribose. The ADP-ribosylated core proteins were assembled into mature virus particles. ADP-ribosylation of adenovirus core proteins may have a role in virus decapsidation.	142
251813	pfam03229	Alpha_GJ	Alphavirus glycoprotein J. 	126
397370	pfam03230	Antirestrict	Antirestriction protein. This family includes various proteins that are involved in antirestriction. The ArdB protein efficiently inhibits restriction by members of the three known families of type I systems of E. coli.	92
251815	pfam03231	Bunya_NS-S_2	Bunyavirus non-structural protein NS-S. This family represents the Bunyavirus NS-S family. Bunyavirus has three genomic segments: small (S), middle-sized (M), and large (L). The S segment encodes the nucleocapsid and a non-structural protein. The M segment codes for two glycoproteins, G1 and G2, and another non-structural protein (NSm). The L segment codes for an RNA polymerase.	444
397371	pfam03232	COQ7	Ubiquinone biosynthesis protein COQ7. Members of this family contain two repeats of about 90 amino acids that contain two conserved motifs. One of these, DXEXXH, may be part of an enzyme active site.	171
251817	pfam03233	Cauli_AT	Aphid transmission protein. This protein is found in various caulimoviruses. It codes for an 18 kDa protein (PII), which is dispensable for infection but which is required for aphid transmission of the virus. This protein interacts with the PIII protein.	163
397372	pfam03234	CDC37_N	Cdc37 N terminal kinase binding. Cdc37 is a molecular chaperone required for the activity of numerous eukaryotic protein kinases. This domain corresponds to the N terminal domain which binds predominantly to protein kinases and is found N terminal to the Hsp90 (heat shock protein 90)-binding domain pfam08565. Expression of a construct consisting of only the N-terminal domain of Schizosaccharomyces pombe Cdc37 results in cellular viability. This indicates that interactions with the cochaperone Hsp90 may not be essential for Cdc37 function.	123
397373	pfam03235	DUF262	Protein of unknown function DUF262. 	177
397374	pfam03237	Terminase_6	Terminase-like family. This family represents a group of terminase proteins.	203
281258	pfam03238	ESAG1	ESAG protein. Expression-site-associated gene (ESAG) proteins are thought to be involved in VSG activation. This family includes ESAG 117A as well as ESAG IM.	227
251822	pfam03239	FTR1	Iron permease FTR1 family. 	284
397375	pfam03241	HpaB	4-hydroxyphenylacetate 3-hydroxylase C terminal. HpaB encodes part of the 4-hydroxyphenylacetate 3-hydroxylase from Escherichia coli. HpaB is part of a heterodimeric enzyme that also requires HpaC. The enzyme is NADH-dependent and uses FAD as the redox chromophore. This family also includes PvcC, which may play a role in one of the proposed hydroxylation steps of pyoverdine chromophore biosynthesis.	196
397376	pfam03242	LEA_3	Late embryogenesis abundant protein. Members of this family are similar to late embryogenesis abundant proteins. Members of the family have been isolated in a number of different screens. However, the molecular function of these proteins remains obscure.	93
377009	pfam03243	MerB	Alkylmercury lyase. Alkylmercury lyase (EC:4.99.1.2) cleaves the carbon-mercury bond of organomercurials such as phenylmercuric acetate.	126
367413	pfam03244	PSI_PsaH	Photosystem I reaction centre subunit VI. Photosystem I (PSI) is an integral membrane protein complex that uses light energy to mediate electron transfer from plastocyanin to ferredoxin.	139
397377	pfam03245	Phage_lysis	Bacteriophage Rz lysis protein. This protein is involved in host lysis. This family is not considered to be a peptidase according to the MEROPS database. This family (Rz) and the Rz1 protein (pfam06085) represent a unique example of two genes located in different reading frames in the same nucleotide sequence, which encode different proteins that are both required in the same physiological pathway.	126
397378	pfam03246	Pneumo_ncap	Pneumovirus nucleocapsid protein. 	390
397379	pfam03247	Prothymosin	Prothymosin/parathymosin family. Prothymosin alpha and parathymosin are two ubiquitous small acidic nuclear proteins that are thought to be involved in cell cycle progression, proliferation, and cell differentiation.	111
397380	pfam03248	Rer1	Rer1 family. RER1 family proteins are involved in the retrieval of some endoplasmic reticulum membrane proteins from the early golgi compartment. The C-terminus of yeast Rer1p interacts with a coatomer complex.	167
397381	pfam03249	TSA	Type specific antigen. There are several antigenic variants in Rickettsia tsutsugamushi, and a type-specific antigen (TSA) of 56-kilodaltons located on the rickettsial surface is responsible for the variation. TSA proteins are probably integral membrane proteins.	510
397382	pfam03250	Tropomodulin	Tropomodulin. Tropomodulin is a novel tropomyosin regulatory protein that binds to the end of erythrocyte tropomyosin and blocks head-to-tail association of tropomyosin along actin filaments. Limited proteolysis shows this protein is composed of two domains. The amino terminal domain contains the tropomyosin binding function.	143
281269	pfam03251	Tymo_45kd_70kd	Tymovirus 45/70Kd protein. Tymoviruses are single stranded RNA viruses. This family includes a protein of unknown function that has been named based on its molecular weight. Tymoviruses such as the ononis yellow mosaic tymovirus encode only three proteins. Of these, two are overlapping: this protein overlaps a larger ORF that is thought to be the polymerase.	468
281270	pfam03252	Herpes_UL21	Herpesvirus UL21. The UL21 protein appears to be a dispensable component in herpesviruses.	524
397383	pfam03253	UT	Urea transporter. Members of this family transport urea across membranes. The family includes a bacterial homolog.	292
397384	pfam03254	XG_FTase	Xyloglucan fucosyltransferase. Plant cell walls are crucial for development, signal transduction, and disease resistance in plants. Cell walls are made of cellulose, hemicelluloses, and pectins. Xyloglucan (XG), the principal load-bearing hemicellulose of dicotyledonous plants, has a terminal fucosyl residue. This fucosyltransferase adds this residue.	458
397385	pfam03255	ACCA	Acetyl co-enzyme A carboxylase carboxyltransferase alpha subunit. Acetyl co-enzyme A carboxylase carboxyltransferase is composed of an alpha and beta subunit.	144
367420	pfam03256	ANAPC10	Anaphase-promoting complex, subunit 10 (APC10). 	185
281275	pfam03257	Adhesin_P1	Mycoplasma adhesin P1. This family corresponds to a short 100 residue region found in adhesins from Mycoplasmas.	91
367421	pfam03258	Baculo_FP	Baculovirus FP protein. The FP protein is missing in baculovirus (Few Polyhedra) mutants.	147
397386	pfam03259	Robl_LC7	Roadblock/LC7 domain. This family includes proteins that are about 100 amino acids long and have been shown to be related. Members of this family of proteins are associated with both flagellar outer arm dynein and Drosophila and rat brain cytoplasmic dynein. It is proposed that roadblock/LC7 family members may modulate specific dynein functions. This family also includes Golgi-associated MP1 adapter protein and MglB from Myxococcus xanthus, a protein involved in gliding motility. However the family also includes members from non-motile bacteria such as Streptomyces coelicolor, suggesting that the protein may play a structural or regulatory role.	91
397387	pfam03260	Lipoprotein_11	Lepidopteran low molecular weight (30 kD) lipoprotein. 	251
397388	pfam03261	CDK5_activator	Cyclin-dependent kinase 5 activator protein. 	356
281280	pfam03262	Corona_6B_7B	Coronavirus 6B/7B protein. 	206
281281	pfam03263	Cucumo_2B	Cucumovirus protein 2B. This protein may be a viral movement protein.	105
397389	pfam03264	Cytochrom_NNT	NapC/NirT cytochrome c family, N-terminal region. Within the NapC/NirT family of cytochrome c proteins, some members, such as NapC and NirT, bind four haem groups, while others, such as TorC, bind five haems. This family aligns the common N-terminal region that contains four haem-binding C-X(2)-CH motifs.	174
397390	pfam03265	DNase_II	Deoxyribonuclease II. 	312
397391	pfam03266	NTPase_1	NTPase. This domain is found across all species from bacteria to human, and the function was determined first in a hyperthermophilic bacterium to be an NTPase. The structure of one member-sequence represents a variation of the RecA fold, and implies that the function might be that of a DNA/RNA modifying enzyme. The sequence carries both a Walker A and Walker B motif which together are characteristic of ATPases or GTPases. The protein exhibits an increased expression profile in human liver cholangiocarcinoma when compared to normal tissue.	168
367428	pfam03268	DUF267	Caenorhabditis protein of unknown function, DUF267. 	360
367429	pfam03269	DUF268	Caenorhabditis protein of unknown function, DUF268. 	176
397392	pfam03270	DUF269	Protein of unknown function, DUF269. Members of this family may be involved in nitrogen fixation, since they are found within nitrogen fixation operons.	121
397393	pfam03271	EB1	EB1-like C-terminal motif. This motif is found at the C-terminus of proteins that are related to the EB1 protein. The EB1 proteins contain an N-terminal CH domain pfam00307. The human EB1 protein was originally discovered as a protein interacting with the C-terminus of the APC protein. This interaction is often disrupted in colon cancer, due to deletions affecting the APC C-terminus. Several EB1 orthologues are also included in this family. The interaction between EB1 and APC has been shown to have a potent synergistic effect on microtubule polymerization. Neither EB1 nor APC alone has this effect. It is thought that EB1 targets APC to the + ends of microtubules, where APC promotes microtubule polymerization. This process is regulated by APC phosphorylation by Cdc2, which disrupts APC-EB1 binding. Human EB1 protein can functionally substitute for the yeast EB1 homolog Mal3. In addition, Mal3 can substitute for human EB1 in promoting microtubule polymerization with APC.	39
397394	pfam03272	Mucin_bdg	Putative mucin or carbohydrate-binding module. This family is the putative binding domain for the substrates of enhancin, and other similar metallopeptidases. This is not the enzymically active, peptidase, part of the proteins - see pfam13402.	116
397395	pfam03273	Baculo_gp64	Baculovirus gp64 envelope glycoprotein family. This family includes the gp64 glycoprotein from baculovirus as well as other viruses.	506
281291	pfam03274	Foamy_BEL	Foamy virus BEL 1/2 protein. 	301
397396	pfam03275	GLF	UDP-galactopyranose mutase. 	203
281293	pfam03276	Gag_spuma	Spumavirus gag protein. 	614
281294	pfam03277	Herpes_UL4	Herpesvirus UL4 family. 	187
281295	pfam03278	IpaB_EvcA	IpaB/EvcA family. This family includes IpaB, which is an invasion plasmid antigen from Shigella, as well as EvcA from E. coli. Members of this family seem to be involved in pathogenicity of some enterobacteria. However the exact function of this component is not clear.	144
281296	pfam03279	Lip_A_acyltrans	Bacterial lipid A biosynthesis acyltransferase. 	294
397397	pfam03280	Lipase_chap	Proteobacterial lipase chaperone protein. 	193
397398	pfam03281	Mab-21	Mab-21 protein. This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development.	287
397399	pfam03283	PAE	Pectinacetylesterase. 	355
397400	pfam03284	PHZA_PHZB	Phenazine biosynthesis protein A/B. 	161
397401	pfam03285	Paralemmin	Paralemmin. 	312
281302	pfam03286	Pox_Ag35	Pox virus Ag35 surface protein. 	196
308745	pfam03287	Pox_C7_F8A	Poxvirus C7/F8A protein. 	147
367437	pfam03288	Pox_D5	Poxvirus D5 protein-like. This family includes D5 from Poxviruses which is necessary for viral DNA replication, and is a nucleic acid independent nucleoside triphosphatase. Members of this family are also found outside of poxviruses. This domain is a DNA-binding winged HTH domain.	85
281305	pfam03289	Pox_I1	Poxvirus protein I1. 	307
281306	pfam03290	Peptidase_C57	Vaccinia virus I7 processing peptidase. 	425
281307	pfam03291	Pox_MCEL	mRNA capping enzyme. This family of enzymes are related to pfam03919.	332
281308	pfam03292	Pox_P4B	Poxvirus P4B major core protein. 	657
397402	pfam03293	Pox_RNA_pol	Poxvirus DNA-directed RNA polymerase, 18 kD subunit. 	160
397403	pfam03294	Pox_Rap94	RNA polymerase-associated transcription specificity factor, Rap94. 	796
281311	pfam03295	Pox_TAA1	Poxvirus trans-activator protein A1 C-terminal. 	63
281312	pfam03296	Pox_polyA_pol	Poxvirus poly(A) polymerase nucleotidyltransferase domain. 	147
397404	pfam03297	Ribosomal_S25	S25 ribosomal protein. 	100
397405	pfam03298	Stanniocalcin	Stanniocalcin family. 	200
397406	pfam03299	TF_AP-2	Transcription factor AP-2. 	196
281316	pfam03300	Tenui_NS4	Tenuivirus non-structural, movement protein NS4. 	282
281317	pfam03301	Trp_dioxygenase	Tryptophan 2,3-dioxygenase. 	346
146106	pfam03302	VSP	Giardia variant-specific surface protein. 	397
281318	pfam03303	WTF	WTF protein. This is a family of hypothetical Schizosaccharomyces pombe proteins. Their function is unknown.	237
308750	pfam03304	Mlp	Mlp lipoprotein family. The Mlp (for Multicopy Lipoprotein) family of lipoproteins is found in Borrelia species. Members of this family were previously known as 2.9 lipoprotein genes. These surface expressed genes may represent new candidate vaccinogens for Lyme disease. Members of this family generally are downstream of four ORFs called A, B, C and D that are involved in hemolytic activity.	121
397407	pfam03305	Lipoprotein_X	Mycoplasma MG185/MG260 protein. Most of the aligned regions in this family are found towards the middle of the member proteins.	183
397408	pfam03306	AAL_decarboxy	Alpha-acetolactate decarboxylase. 	219
281322	pfam03307	Adeno_E3_15_3	Adenovirus 15.3kD protein in E3 region. 	117
281323	pfam03308	ArgK	ArgK protein. The ArgK protein acts as an ATPase enzyme and as a kinase, and phosphorylates periplasmic binding proteins involved in the LAO (lysine, arginine, ornithine)/AO transport systems.	272
397409	pfam03309	Pan_kinase	Type III pantothenate kinase. Type III pantothenate kinase catalyzes the phosphorylation of pantothenate (Pan), the first step in the universal pathway of CoA biosynthesis.	204
251863	pfam03310	Cauli_DNA-bind	Caulimovirus DNA-binding protein. 	121
397410	pfam03311	Cornichon	Cornichon protein. 	122
397411	pfam03312	DUF272	Protein of unknown function (DUF272). This family of proteins is restricted to C.elegans and has no known function. The protein contains a ubiquitin fold. The GO annotation for the protein indicates that it has a function in nematode larval development.	123
397412	pfam03313	SDH_alpha	Serine dehydratase alpha chain. L-serine dehydratase (EC:4.2.1.13) is found as a heterodimer of alpha and beta chains or as a fusion of the two chains in a single protein. This enzyme catalyzes the deamination of serine to form pyruvate. This enzyme is part of the gluconeogenesis pathway.	259
367444	pfam03314	DUF273	Protein of unknown function, DUF273. 	219
397413	pfam03315	SDH_beta	Serine dehydratase beta chain. L-serine dehydratase (EC:4.2.1.13) is found as a heterodimer of alpha and beta chains or as a fusion of the two chains in a single protein. This enzyme catalyzes the deamination of serine to form pyruvate. This enzyme is part of the gluconeogenesis pathway.	146
397414	pfam03317	ELF	ELF protein. This is a family of hypothetical proteins from cereal crops.	123
397415	pfam03318	ETX_MTX2	Clostridium epsilon toxin ETX/Bacillus mosquitocidal toxin MTX2. This family appears to be distantly related to pfam01117.	222
397416	pfam03319	EutN_CcmL	Ethanolamine utilisation protein EutN/carboxysome. The crystal structure of EutN contains a central five-stranded beta-barrel, with an alpha-helix at the open end of this barrel (Structure 2HD3). The structure also contains three additional beta-strands, which help the formation of a tight hexamer, with a hole in the center. This suggests that EutN forms a pore, with an opening of 26 Angstrom in diameter on one face and 14 Angstrom on the other face. EutN is involved in the cobalamin-dependent degradation of ethanolamine.	83
397417	pfam03320	FBPase_glpX	Bacterial fructose-1,6-bisphosphatase, glpX-encoded. 	304
397418	pfam03321	GH3	GH3 auxin-responsive promoter. 	529
397419	pfam03323	GerA	Bacillus/Clostridium GerA spore germination protein. 	467
367447	pfam03324	Herpes_HEPA	Herpesvirus DNA helicase/primase complex associated protein. This family includes HSV UL8, EHV-1 54, VZV 52 and HCMV 102.	95
397420	pfam03325	Herpes_PAP	Herpesvirus polymerase accessory protein. The same proteins are also known as polymerase processivity factors.	166
308764	pfam03326	Herpes_TAF50	Herpesvirus transcription activation factor (transactivator). This family includes EBV BRLF1 and similar ORF 50 proteins from other herpesviruses.	568
397421	pfam03327	Herpes_VP19C	Herpesvirus capsid shell protein VP19C. 	262
397422	pfam03328	HpcH_HpaI	HpcH/HpaI aldolase/citrate lyase family. This family includes 2,4-dihydroxyhept-2-ene-1,7-dioic acid aldolase and 4-hydroxy-2-oxovalerate aldolase.	221
397423	pfam03330	DPBB_1	Lytic transglycolase. Rare lipoprotein A (RlpA) contains a conserved region that has the double-psi beta-barrel (DPBB) fold. The function of RlpA is not well understood, but it has been shown to act as a prc mutant suppressor in Escherichia coli. The DPBB fold is often an enzymatic domain. The members of this family are quite diverse, and if catalytic this family may contain several different functions. Another example of this domain is found in the N-terminus of pollen allergen. Recent studies show that the full-length RlpA protein from Pseudomonas aeruginosa is an outer membrane protein that is a lytic transglycolase with specificity for peptidoglycan lacking stem peptides. Residue D157 in UniProtKB:Q9X6V6 is critical for lytic activity.	82
397424	pfam03331	LpxC	UDP-3-O-acyl N-acetylglucosamine deacetylase. The enzymes in this family catalyze the second step in the biosynthetic pathway for lipid A.	271
397425	pfam03332	PMM	Eukaryotic phosphomannomutase. This enzyme EC:5.4.2.8 is involved in the synthesis of the GDP-mannose and dolichol-phosphate-mannose required for a number of critical mannosyl transfer reactions.	220
377022	pfam03333	PapB	Adhesin biosynthesis transcription regulatory protein. This family includes PapB, DaaA, FanA, FanB, and AfaA.	91
397426	pfam03334	PhaG_MnhG_YufB	Na+/H+ antiporter subunit. This family includes PhaG from Rhizobium meliloti, MnhG from Staphylococcus aureus, YufB from Bacillus subtilis.	71
335296	pfam03335	Phage_fiber	Phage tail fibre repeat. 	14
281347	pfam03336	Pox_C4_C10	Poxvirus C4/C10 protein. 	322
281348	pfam03337	Pox_F12L	Poxvirus F12L protein. 	649
281349	pfam03338	Pox_J1	Poxvirus J1 protein. 	145
281350	pfam03339	Pox_L3_FP4	Poxvirus L3/FP4 protein. 	316
397427	pfam03340	Pox_Rif	Poxvirus rifampicin resistance protein. 	542
281352	pfam03341	Pox_mRNA-cap	Poxvirus mRNA capping enzyme, small subunit. The small subunit of the poxvirus mRNA capping enzyme has been found to have a structure which suggests that it started life as an RNA cap 2-prime O-methyltransferase. It has subsequently evolved to a catalytically inactive form that has been retained in order to help stabilize the large subunit, D1, and to enhance its methyltransferase activity through an allosteric mechanism.	286
281353	pfam03342	Rhabdo_M1	Rhabdovirus M1 matrix protein (M1 polymerase-associated protein). 	227
397428	pfam03343	SART-1	SART-1 family. SART-1 is a protein involved in cell cycle arrest and pre-mRNA splicing. It has been shown to be a component of U4/U6 x U5 tri-snRNP complex in human, Schizosaccharomyces pombe and Saccharomyces cerevisiae. SART-1 is a known tumor antigen in a range of cancers recognized by T cells.	569
397429	pfam03344	Daxx	Daxx N-terminal Rassf1C-interacting domain. The Daxx protein (also known as the Fas-binding protein) is thought to play a role in apoptosis. Daxx forms a complex with Axin. Remodelling of the family to a short domain based on the Structure 2kzs structure gives a more representative family. DAXX is a scaffold protein shown to play diverse roles in transcription and cell cycle regulation. This N-terminal domain folds into a left-handed four-helix bundle (H1, H2, H4, H5) that binds to the N-terminal residues of the tumor-suppressor Rassf1C.	95
397430	pfam03345	DDOST_48kD	Oligosaccharyltransferase 48 kDa subunit beta. Members of this family are involved in asparagine-linked protein glycosylation. In particular, dolichyl-diphosphooligosaccharide-protein glycosyltransferase (DDOST), also known as oligosaccharyltransferase EC:2.4.1.119, transfers the high-mannose sugar GlcNAc(2)-Man(9)-Glc(3) from a dolichol-linked donor to an asparagine acceptor in a consensus Asn-X-Ser/Thr motif. In most eukaryotes, the DDOST complex is composed of three subunits, which in humans are described as a 48kD subunit, ribophorin I, and ribophorin II. However, the yeast DDOST appears to consist of six subunits (alpha, beta, gamma, delta, epsilon, zeta). The yeast beta subunit is a 45kD polypeptide, previously discovered as the Wbp1 protein, with known sequence similarity to the human 48kD subunit and the other orthologues. This family includes the 48kD-like subunits from several eukaryotes; it also includes the yeast DDOST beta subunit Wbp1.	412
281357	pfam03347	TDH	Vibrio thermostable direct hemolysin. 	165
397431	pfam03348	Serinc	Serine incorporator (Serinc). This is a family of eukaryotic membrane proteins which incorporate serine into membranes and facilitate the synthesis of the serine-derived lipids phosphatidylserine and sphingolipid. Members of this family contain 11 transmembrane domains and form intracellular complexes with key enzymes involved in serine and sphingolipid biosynthesis.	421
397432	pfam03349	Toluene_X	Outer membrane protein transport protein (OMPP1/FadL/TodX). This family includes TodX from Pseudomonas putida F1 and TbuX from Ralstonia pickettii PKO1. These are membrane proteins of uncertain function that are involved in toluene catabolism. Related proteins involved in the degradation of similar aromatic hydrocarbons are also in this family, such as CymD. This family also includes FadL involved in translocation of long-chain fatty acids across the outer membrane. It is also a receptor for the bacteriophage T2.	419
397433	pfam03350	UPF0114	Uncharacterized protein family, UPF0114. 	118
397434	pfam03351	DOMON	DOMON domain. The DOMON (named after dopamine beta-monooxygenase N-terminal) domain is 110-125 residues long. It is predicted to form an all beta fold with up to 11 strands and is secreted to the extracellular compartment. The beta-strand folding produces a hydrophobic pocket which appears to bind soluble haem. This is consistent with the predominant architectures where the protein is associated with cytochromes or enzymatic domains whose activity involves redox or electron transfer reactions potentially as a direct participant in the electron transfer process. The DOMON domain superfamily, of which this is just one member, shows (1) multiple hydrophobic residues that contribute to the hydrophobic core of the strands of the beta-sandwich, and small residues found at the boundaries of strands and loops, (2) a strongly conserved charged residue (usually arginine/lysine) at the end of strand 9, which possibly stabilizes the loop between 9 and 10, and (3) a polar residue (usually histidine, lysine or arginine), that interacts or coordinates with ligands. The suggested superfamily includes both haem- and sugar-binding members: the haem-binding families being the ethyl-Benzoate dehydrogenase family EB_dh, pfam09459, the cellobiose dehydrogenase family CBDH and this family, and the sugar-binding families being the xylanases, CBM_4_9, pfam02018. The common feature of the superfamily is the 11-beta-strand structure, although the first and eleventh strands are not well conserved either within families or between families.	119
397435	pfam03352	Adenine_glyco	Methyladenine glycosylase. The DNA-3-methyladenine glycosylase I is constitutively expressed and is specific for the alkylated 3-methyladenine DNA.	177
397436	pfam03353	Lin-8	Ras-mediated vulval-induction antagonist. LIN-8 is a nuclear protein, present at the sites of transcriptional repressor complexes, which interacts with LIN-35 Rb. LIN-35 Rb is a product of the class B synMuv gene lin-35 which silences genes required for vulval specification through chromatin modification and remodelling. The biological role of the interaction has not yet been determined; however, predictions have been made. The interaction shows that class A synMuv genes control vulval induction through the transcriptional regulation of gene expression. LIN-8 normally functions as part of a protein complex; however, when the complex is absent, other family members can partially replace LIN-8 activity.	309
281364	pfam03354	Terminase_1	Phage Terminase. The majority of the members of this family are bacteriophage proteins, several of which are thought to be terminase large subunit proteins. There are also a number of bacterial proteins of unknown function.	466
146143	pfam03355	Pox_TAP	Viral Trans-Activator Protein. These proteins function as a trans-activator of viral late genes.	260
281365	pfam03356	Pox_LP_H2	Viral late protein H2. All members of this family show similarity to the vaccinia virus late protein H2. This protein is often referred to by its gene name of H2R. Members from this family all belong to the viral taxon Poxviridae.	188
397437	pfam03357	Snf7	Snf7. This family of proteins are involved in protein sorting and transport from the endosome to the vacuole/lysosome in eukaryotic cells. Vacuoles/lysosomes play an important role in the degradation of both lipids and cellular proteins. In order to perform this degradative function, vacuoles/lysosomes contain numerous hydrolases which have been transported in the form of inactive precursors via the biosynthetic pathway and are proteolytically activated upon delivery to the vacuole/lysosome. The delivery of transmembrane proteins, such as activated cell surface receptors to the lumen of the vacuole/lysosome, either for degradation/downregulation, or in the case of hydrolases, for proper localization, requires the formation of multivesicular bodies (MVBs). These late endosomal structures are formed by invaginating and budding of the limiting membrane into the lumen of the compartment. During this process, a subset of the endosomal membrane proteins is sorted into the forming vesicles. Mature MVBs fuse with the vacuole/lysosome, thereby releasing cargo containing vesicles into its hydrolytic lumen for degradation. Endosomal proteins that are not sorted into the intralumenal MVB vesicles are either recycled back to the plasma membrane or Golgi complex, or remain in the limiting membrane of the MVB and are thereby transported to the limiting membrane of the vacuole/lysosome as a consequence of fusion. Therefore, the MVB sorting pathway plays a critical role in the decision between recycling and degradation of membrane proteins. A few archaeal sequences are also present within this family.	170
397438	pfam03358	FMN_red	NADPH-dependent FMN reductase. 	151
397439	pfam03359	GKAP	Guanylate-kinase-associated protein (GKAP) protein. 	341
397440	pfam03360	Glyco_transf_43	Glycosyltransferase family 43. 	202
281370	pfam03361	Herpes_IE2_3	Herpes virus intermediate/early protein 2/3. These viral sequences are similar to UL117 protein of human and chimpanzee cytomegalovirus, and to intermediate/early proteins 2 and 3 of certain herpes viruses. UL117 is thought to be a glycoprotein that is expressed at early and late times after infection. This region is close to the C-terminus of the protein and may be a transmembrane region.	162
281371	pfam03362	Herpes_UL47	Herpesvirus UL47 protein. 	448
281372	pfam03363	Herpes_LP	Herpesvirus leader protein. 	174
397441	pfam03364	Polyketide_cyc	Polyketide cyclase / dehydrase and lipid transport. This family contains polyketide cyclases/dehydrases which are enzymes involved in polyketide synthesis. The family also includes proteins which are involved in the binding/transport of lipids.	125
397442	pfam03366	YEATS	YEATS family. We have named this family the YEATS family, after `YNK7', `ENL', `AF-9', and `TFIIF small subunit'. This family also contains the GAS41 protein. All these proteins are thought to have a transcription stimulatory activity.	85
397443	pfam03367	zf-ZPR1	ZPR1 zinc-finger domain. The zinc-finger protein ZPR1 is ubiquitous among eukaryotes. It is indeed known to be an essential protein in yeast. In quiescent cells, ZPR1 is localized to the cytoplasm. But in proliferating cells treated with EGF or with other mitogens, ZPR1 accumulates in the nucleolus. ZPR1 interacts with the cytoplasmic domain of the inactive EGF receptor (EGFR) and is thought to inhibit the basal protein tyrosine kinase activity of EGFR. This interaction is disrupted when cells are treated with EGF, though by themselves, inactive EGFRs are not sufficient to sequester ZPR1 to the cytoplasm. Upon stimulation by EGF, ZPR1 directly binds the eukaryotic translation elongation factor-1alpha (eEF-1alpha) to form ZPR1/eEF-1alpha complexes. These move into the nucleus, localising particularly at the nucleolus. Indeed, the interaction between ZPR1 and eEF-1alpha has been shown to be essential for normal cellular proliferation, and ZPR1 is thought to be involved in pre-ribosomal RNA expression. The ZPR1 domain consists of an elongation initiation factor 2-like zinc finger and a double-stranded beta helix with a helical hairpin insertion. ZPR1 binds preferentially to GDP-bound eEF1A but does not directly influence the kinetics of nucleotide exchange or GTP hydrolysis. The alignment for this family shows a domain of which there are two copies in ZPR1 proteins. This family also includes several hypothetical archaeal proteins (from both Crenarchaeota and Euryarchaeota), which only contain one copy of the aligned region. This similarity between ZPR1 and archaeal proteins was not previously noted.	159
397444	pfam03368	Dicer_dimer	Dicer dimerization domain. This domain is found in members of the Dicer protein family which function in RNA interference, an evolutionarily conserved mechanism for gene silencing using double-stranded RNA (dsRNA) molecules. It is essential for the activity of Dicer. It is a divergent double stranded RNA-binding domain. The N-terminal alpha helix of this domain is in a different orientation to that found in canonical dsRNA-binding domains. This results in a change of charge distribution at the potential dsRNA-binding surface and in the N- and C-termini of the domain being in close proximity. This domain has weak dsRNA-binding activity. It mediates heterodimerization of Dicer proteins with their respective protein partners.	90
281377	pfam03369	Herpes_UL3	Herpesvirus UL3 protein. 	135
397445	pfam03370	CBM_21	Carbohydrate/starch-binding module (family 21). This family consists of several eukaryotic proteins that are thought to be involved in the regulation of glycogen metabolism. For instance, the mouse PTG protein has been shown to interact with glycogen synthase, phosphorylase kinase, phosphorylase a: these three enzymes have key roles in the regulation of glycogen metabolism. PTG also binds the catalytic subunit of protein phosphatase 1 (PP1C) and localizes it to glycogen. Subsets of similar interactions have been observed with several other members of this family, such as the yeast PIG1, PIG2, GAC1 and GIP2 proteins. While the precise function of these proteins is not known, they may serve a scaffold function, bringing together the key enzymes in glycogen metabolism. This family is a carbohydrate binding domain.	113
397446	pfam03371	PRP38	PRP38 family. Members of this family are related to the pre mRNA splicing factor PRP38 from yeast. Therefore all the members of this family could be involved in splicing. This conserved region could be involved in RNA binding. The putative domain is about 180 amino acids in length. PRP38 is a unique component of the U4/U6.U5 tri-small nuclear ribonucleoprotein (snRNP) particle and is necessary for an essential step late in spliceosome maturation.	166
397447	pfam03372	Exo_endo_phos	Endonuclease/Exonuclease/phosphatase family. This large family of proteins includes magnesium dependent endonucleases and a large number of phosphatases involved in intracellular signalling. This family includes: AP endonuclease proteins EC:4.2.99.18, DNase I proteins EC:3.1.21.1, Synaptojanin an inositol-1,4,5-trisphosphate phosphatase EC:3.1.3.56, Sphingomyelinase EC:3.1.4.12, and Nocturnin.	228
281381	pfam03373	Octapeptide	Octapeptide repeat. This octapeptide repeat is found in several bacterial proteins. The function of this repeat is unknown.	8
397448	pfam03374	ANT	Phage antirepressor protein KilAC domain. This domain was called the KilAC domain by Iyer and colleagues.	105
281383	pfam03376	Adeno_E3B	Adenovirus E3B protein. 	67
397449	pfam03377	TAL_effector	TAL effector repeat. The proteins in this family bind to DNA. Each repeat binds to a base pair in a predictable way. The structure shows that each repeat is composed of two alpha helices.	33
367469	pfam03378	CAS_CSE1	CAS/CSE protein, C-terminus. Mammalian cellular apoptosis susceptibility (CAS) proteins are homologous to the yeast chromosome-segregation protein, CSE1. This family aligns the C-terminal halves (approximately). CAS is involved in both cellular apoptosis and proliferation. Apoptosis is inhibited in CAS-depleted cells, while the expression of CAS correlates to the degree of cellular proliferation. Like CSE1, it is essential for the mitotic checkpoint in the cell cycle (CAS depletion blocks the cell in the G2 phase), and has been shown to be associated with the microtubule network and the mitotic spindle, as is the protein MEK, which is thought to regulate the intracellular localization (predominantly nuclear vs. predominantly cytosolic) of CAS. In the nucleus, CAS acts as a nuclear transport factor in the importin pathway. The importin pathway mediates the nuclear transport of several proteins that are necessary for mitosis and further progression. CAS is therefore thought to affect the cell cycle through its effect on the nuclear transport of these proteins. Since apoptosis also requires the nuclear import of several proteins (such as P53 and transcription factors), it has been suggested that CAS also enables apoptosis by facilitating the nuclear import of at least a subset of these essential proteins.	435
281385	pfam03379	CcmB	CcmB protein. CcmB is the product of one of a cluster of Ccm genes that are necessary for cytochrome c biosynthesis in eubacteria. Expression of these proteins is induced when the organisms are grown under anaerobic conditions with nitrate or nitrite as the final electron acceptor. CcmB is required for the export of haem to the periplasm.	215
367470	pfam03380	DUF282	Caenorhabditis protein of unknown function, DUF282. 	38
397450	pfam03381	CDC50	LEM3 (ligand-effect modulator 3) family / CDC50 family. Members of this family have been predicted to contain transmembrane helices. The family member LEM3 is a ligand-effect modulator, mutation of which increases glucocorticoid receptor activity in response to dexamethasone and also confers increased activity on other intracellular receptors including the progesterone, oestrogen and mineralocorticoid receptors. LEM3 is thought to affect a downstream step in the glucocorticoid receptor pathway. Factors that modulate ligand responsiveness are likely to contribute to the context-specific actions of the glucocorticoid receptor in mammalian cells. The products of genes YNR048w, YNL323w, and YCR094w (CDC50) show redundancy of function and are involved in regulation of transcription via CDC39. CDC39 (also known as NOT1) is normally a negative regulator of transcription either by affecting the general RNA polymerase II machinery or by altering chromatin structure. One function of CDC39 is to block activation of the mating response pathway in the absence of pheromone, and mutation causes arrest in G1 by activation of the pathway. It may be that the cold-sensitive arrest in G1 noticed in CDC50 mutants may be due to inactivation of CDC39. The effects of LEM3 on glucocorticoid receptor activity may also be due to effects on transcription via CDC39.	276
397451	pfam03382	DUF285	Mycoplasma protein of unknown function, DUF285. This region appears distantly related to leucine rich repeats.	120
251914	pfam03383	Serpentine_r_xa	Caenorhabditis serpentine receptor-like protein, class xa. This family contains various Caenorhabditis proteins, some of which are annotated as being serpentine receptors, mainly of the xa class.	153
146165	pfam03384	DUF287	Drosophila protein of unknown function, DUF287. 	55
308795	pfam03385	STELLO	STELLO glycosyltransferases. This domain family is found in Metazoa and in Viridiplantae. Two of the family members are characterized in Arabidopsis thaliana and named STELLO1 (STL1) and STELLO2 (STL2) respectively. They are Golgi-localized proteins that can interact with CesAs (cellulose synthase A) and control cellulose quantity. In the absence of STELLO function, the spatial distribution within the Golgi, secretion and activity of the CSCs are impaired indicating a central role of the STELLO proteins in CSC assembly. Point mutations in the predicted catalytic domains of the STELLO proteins indicate that they are glycosyltransferases facing the Golgi lumen. STL homologs are present throughout the plant kingdom, but STL proteins are distinct from distantly related proteins in nematodes, fungi and molluscs.	388
397452	pfam03386	ENOD93	Early nodulin 93 ENOD93 protein. 	78
281391	pfam03387	Herpes_UL46	Herpesvirus UL46 protein. 	435
397453	pfam03388	Lectin_leg-like	Legume-like lectin family. Lectins are structurally diverse proteins that bind to specific carbohydrates. This family includes the VIP36 and ERGIC-53 lectins. These two proteins were the first recognized members of a family of animal lectins similar (19-24%) to the leguminous plant lectins. The alignment for this family aligns residues lying towards the N-terminus, where the similarity of VIP36 and ERGIC-53 is greatest. However, while Fiedler and Simons identified these proteins as a new family of animal lectins, our alignment also includes yeast sequences. ERGIC-53 is a 53kD protein, localized to the intermediate region between the endoplasmic reticulum and the Golgi apparatus (ER-Golgi-Intermediate Compartment, ERGIC). It was identified as a calcium-dependent, mannose-specific lectin. Its dysfunction has been associated with combined factors V and VIII deficiency OMIM:227300 OMIM:601567, suggesting an important and substrate-specific role for ERGIC-53 in the glycoprotein-secreting pathway.	226
397454	pfam03389	MobA_MobL	MobA/MobL family. This family includes the MobA protein from the E. coli plasmid RSF1010, and the MobL protein from the Thiobacillus ferrooxidans plasmid PTF1. These sequences are mobilisation proteins, which are essential for specific plasmid transfer.	217
397455	pfam03390	2HCT	2-hydroxycarboxylate transporter family. The 2-hydroxycarboxylate transporter family is a family of secondary transporters found exclusively in the bacterial kingdom. They function in the metabolism of the di- and tricarboxylates malate and citrate, mostly in fermentative pathways involving decarboxylation of malate or oxaloacetate.	415
397456	pfam03391	Nepo_coat	Nepovirus coat protein, central domain. The members of this family are derived from nepoviruses. Together with comoviruses and picornaviruses, nepoviruses are classified in the picornavirus superfamily of plus strand single-stranded RNA viruses. This family aligns several nepovirus coat protein sequences. In several cases, this is found at the C-terminus of the RNA2-encoded viral polyprotein. The coat protein consists of three trapezoid-shaped beta-barrel domains, and forms a pseudo T = 3 icosahedral capsid structure.	167
397457	pfam03392	OS-D	Insect pheromone-binding family, A10/OS-D. 	93
308800	pfam03393	Pneumo_matrix	Pneumovirus matrix protein. 	252
281397	pfam03394	Pox_E8	Poxvirus E8 protein. 	238
281398	pfam03395	Pox_P4A	Poxvirus P4A protein. 	882
397458	pfam03396	Pox_RNA_pol_35	Poxvirus DNA-directed RNA polymerase, 35 kD subunit. 	293
281400	pfam03397	Rhabdo_matrix	Rhabdovirus matrix protein. 	168
397459	pfam03398	Ist1	Regulator of Vps4 activity in the MVB pathway. ESCRT-I, -II, and -III are endosomal sorting complexes required for transporting proteins and carry out cargo sorting and vesicle formation in the multivesicular bodies, MVBs, pathway. These complexes are transiently recruited from the cytoplasm to the endosomal membrane where they bind transmembrane proteins previously marked for degradation by mono-ubiquitination. Assembly of ESCRT-III, a complex composed of at least four subunits (Vps2, Vps24, Vps20, Snf7), is intimately linked with MVB vesicle formation, its disassembly being an essential step in the MVB vesicle formation, a reaction that is carried out by Vps4, an AAA-type ATPase. The family Ist1 is a regulator of Vps4 activity; by interacting with Did2 and Vps4, Ist1 appears to regulate the recruitment and oligomerization of Vps4. Together Ist1, Did2, and Vta1 form a network of interconnected regulatory proteins that modulate Vps4 activity, thereby regulating the flow of cargo through the MVB pathway.	164
397460	pfam03399	SAC3_GANP	SAC3/GANP family. This family includes diverse proteins involved in large complexes. The alignment contains one highly conserved negatively charged residue and one highly conserved positively charged residue that are probably important for the function of these proteins. The family includes the yeast nuclear export factor Sac3, and mammalian GANP/MCM3-associated proteins, which facilitate the nuclear localization of MCM3, a protein that associates with chromatin in the G1 phase of the cell-cycle.	293
281403	pfam03400	DDE_Tnp_IS1	IS1 transposase. Transposase proteins are necessary for efficient DNA transposition. This family represents bacterial IS1 transposases.	131
397461	pfam03401	TctC	Tripartite tricarboxylate transporter family receptor. These probable extra-cytoplasmic solute receptors are strongly overrepresented in several beta-proteobacteria. This family, formerly known as Bug - Bordetella uptake gene (bug) product - is a family of bacterial tripartite tricarboxylate receptors of the extracytoplasmic solute binding receptor-dependent transporter group of families, distinct from the ABC and TRAP-T families. The TctABC system has been characterized in S. typhimurium, and TctC is the extracytoplasmic tricarboxylate-binding receptor which binds the transporters TctA and TctB, two integral membrane proteins. Complete three-component systems are found only in bacteria.	274
112227	pfam03402	V1R	Vomeronasal organ pheromone receptor family, V1R. This family represents one of two known vomeronasal organ receptor families, the V1R family.	265
397462	pfam03403	PAF-AH_p_II	Platelet-activating factor acetylhydrolase, isoform II. Platelet-activating factor acetylhydrolase (PAF-AH) is a subfamily of phospholipases A2, responsible for inactivation of platelet-activating factor through cleavage of an acetyl group. Three known PAF-AHs are the brain heterotrimeric PAF-AH Ib, whose catalytic beta and gamma subunits are aligned in pfam02266, the extracellular, plasma PAF-AH (pPAF-AH), and the intracellular PAF-AH isoform II (PAF-AH II). This family aligns pPAF-AH and PAF-AH II, whose similarity was previously noted.	372
397463	pfam03404	Mo-co_dimer	Mo-co oxidoreductase dimerization domain. This domain is found in molybdopterin cofactor (Mo-co) oxidoreductases. It is involved in dimer formation, and has an Ig-fold structure.	136
397464	pfam03405	FA_desaturase_2	Fatty acid desaturase. 	319
397465	pfam03406	Phage_fiber_2	Phage tail fibre repeat. This repeat is found in the tail fibers of phage, for example in protein K. The repeats are about 40 residues long.	38
367480	pfam03407	Nucleotid_trans	Nucleotide-diphospho-sugar transferase. Proteins in this family have been predicted to be nucleotide-diphospho-sugar transferases.	208
281409	pfam03408	Foamy_virus_ENV	Foamy virus envelope protein. Expression of the envelope (Env) glycoprotein is essential for viral particle egress. This feature is unique to the Spumavirinae, a subclass of the Retroviridae.	984
397466	pfam03409	Glycoprotein	Transmembrane glycoprotein. This family of proteins has some GO annotations for positive regulation of growth rate and nematode larval development. This is probably a family of membrane glycoproteins.	351
281411	pfam03410	Peptidase_M44	Metallopeptidase from vaccinia pox. This is a family of Poxviridae metalloendopeptidases. The members were often originally named as G1 proteins. The family carries three zinc-binding ligands and a catalytic glutamate. The first two zinc ligands are histidine residues, found together with the catalytic glutamate in a HXXEH motif, an inverse of the classical metallopeptidase motif, HEXXH. The third zinc ligand is a glutamate C-terminal to the HXXEH motif within a motif ELENEY (see MEROPS).	596
397467	pfam03411	Peptidase_M74	Penicillin-insensitive murein endopeptidase. 	242
367483	pfam03412	Peptidase_C39	Peptidase C39 family. Lantibiotic and non-lantibiotic bacteriocins are synthesized as precursor peptides containing N-terminal extensions (leader peptides) which are cleaved off during maturation. Most non-lantibiotics and also some lantibiotics have leader peptides of the so-called double-glycine type. These leader peptides share consensus sequences and also a common processing site with two conserved glycine residues in positions -1 and -2. The double-glycine-type leader peptides are unrelated to the N-terminal signal sequences which direct proteins across the cytoplasmic membrane via the sec pathway. Their processing sites are also different from typical signal peptidase cleavage sites, suggesting that a different processing enzyme is involved. Peptide bacteriocins are exported across the cytoplasmic membrane by a dedicated ATP-binding cassette (ABC) transporter. The ABC transporter is the maturation protease and its proteolytic domain resides in the N-terminal part of the protein. This peptidase domain is found in a wide range of ABC transporters, however the presumed catalytic cysteine and histidine are not conserved in all members of this family.	133
397468	pfam03413	PepSY	Peptidase propeptide and YPEB domain. This region is likely to have a protease inhibitory function (personal obs:C Yeats). This model is likely to miss some members of this family as the separation from signal to noise is not clear. The name is derived from Peptidase & Bacillus subtilis YPEB.	59
397469	pfam03414	Glyco_transf_6	Glycosyltransferase family 6. 	289
397470	pfam03415	Peptidase_C11	Clostripain family. 	354
397471	pfam03416	Peptidase_C54	Peptidase family C54. 	271
397472	pfam03417	AAT	Acyl-coenzyme A:6-aminopenicillanic acid acyl-transferase. 	223
367487	pfam03418	Peptidase_A25	Germination protease. 	354
397473	pfam03419	Peptidase_U4	Sporulation factor SpoIIGA. 	275
397474	pfam03420	Peptidase_S77	Prohead core protein serine protease. 	198
397475	pfam03421	Acetyltransf_14	YopJ Serine/Threonine acetyltransferase. The Yersinia effector YopJ inhibits the innate immune response by blocking MAP kinase and NFkappaB signaling pathways. YopJ is a serine/threonine acetyltransferase which regulates signalling pathways by blocking phosphorylation. Specifically, YopJ has been shown to block phosphorylation of active site residues. It has also been shown that YopJ acetyltransferase is activated by eukaryotic host cell inositol hexakisphosphate. This family was previously incorrectly annotated in Pfam as being a peptidase family.	172
397476	pfam03422	CBM_6	Carbohydrate binding module (family 6). 	125
367491	pfam03423	CBM_25	Carbohydrate binding domain (family 25). 	101
397477	pfam03424	CBM_17_28	Carbohydrate binding domain (family 17/28). 	204
397478	pfam03425	CBM_11	Carbohydrate binding domain (family 11). 	175
367493	pfam03426	CBM_15	Carbohydrate binding domain (family 15). 	153
112252	pfam03427	CBM_19	Carbohydrate binding domain (family 19). 	61
397479	pfam03428	RP-C	Replication protein C N-terminal domain. Replication protein C is involved in the early stages of viral DNA replication.	174
367495	pfam03429	MSP1b	Major surface protein 1B. The major surface protein (MSP1) of the cattle pathogen Anaplasma is a heterodimer comprised of MSP1a and MSP1b. This family is the MSP1b chain. These MSP1 proteins are putative adhesins for bovine erythrocytes.	769
281430	pfam03430	TATR	Trans-activating transcriptional regulator. This family of trans-activating transcriptional regulator (TATR), also known as intermediate early protein 1, are common to the Nucleopolyhedroviruses.	575
367496	pfam03431	RNA_replicase_B	RNA replicase, beta-chain. This family is of Leviviridae RNA replicases. The replicase is also known as RNA dependent RNA polymerase.	538
367497	pfam03432	Relaxase	Relaxase/Mobilisation nuclease domain. Relaxases/mobilisation proteins are required for the horizontal transfer of genetic information contained on plasmids that occurs during bacterial conjugation. The relaxase, in conjunction with several auxiliary proteins, forms the relaxation complex or relaxosome. Relaxases nick duplex DNA in a specific manner by catalyzing trans-esterification.	240
367498	pfam03433	EspA	EspA-like secreted protein. EspA is the prototypical member of this family. EspA, together with EspB, EspD and Tir are exported by a type III secretion system. These proteins are essential for attaching and effacing lesion formation. EspA is a structural protein and a major component of a large, transiently expressed, filamentous surface organelle which forms a direct link between the bacterium and the host cell.	180
308824	pfam03434	DUF276	DUF276. This family is specific to Borrelia burgdorferi. The protein is encoded on extra-chromosomal DNA. This domain has no known function.	291
397480	pfam03435	Sacchrp_dh_NADP	Saccharopine dehydrogenase NADP binding domain. This family contains the NADP binding domain of saccharopine dehydrogenase. In some organisms this enzyme is found as a bifunctional polypeptide with lysine ketoglutarate reductase. The saccharopine dehydrogenase can also function as a saccharopine reductase.	120
367499	pfam03436	DUF281	Domain of unknown function (DUF281). This family of worm domain has no known function. The boundaries of the presumed domain are rather uncertain.	54
281436	pfam03437	BtpA	BtpA family. The BtpA protein is tightly associated with the thylakoid membranes, where it stabilizes the reaction centre proteins of photosystem I.	254
367500	pfam03438	Pneumo_NS1	Pneumovirus NS1 protein. This non-structural protein is one of two found in pneumoviruses. The protein is about 140 amino acids in length. The NS1 protein appears to be important for efficient replication but not essential. The NS1 protein has been shown by yeast two-hybrid to interact with the viral P protein. This protein is also known as the 1C protein. It has also been shown that NS1 can potently inhibit transcription and RNA replication.	136
397481	pfam03439	Spt5-NGN	Early transcription elongation factor of RNA pol II, NGN section. Spt5p and prokaryotic NusG are shown to contain a novel 'NGN' domain. The combined NGN and KOW motif regions of Spt5 form the binding domain with Spt4. Spt5 complexes with Spt4 as a 1:1 heterodimer and this Spt5-Spt4 complex regulates early transcription elongation by RNA polymerase II and has an imputed role in pre-mRNA processing via its physical association with mRNA capping enzymes. The Schizosaccharomyces pombe core Spt5-Spt4 complex is a heterodimer bearing a trypsin-resistant Spt4-binding domain within the Spt5 subunit.	84
367501	pfam03440	APT	Aerolysin/Pertussis toxin (APT) domain. This family represents the N-terminal domain of aerolysin and pertussis toxin and has a type-C lectin like fold.	87
397482	pfam03441	FAD_binding_7	FAD binding domain of DNA photolyase. 	201
397483	pfam03442	CBM_X2	Carbohydrate binding domain X2. This domain binds to cellulose and to bacterial cell walls. It is found in glycosyl hydrolases and in scaffolding proteins of cellulosomes (multiprotein glycosyl hydrolase complexes). In the cellulosome it may aid cellulose degradation by anchoring the cellulosome to the bacterial cell wall and by binding it to its substrate. This domain has an Ig-like fold.	83
397484	pfam03443	Glyco_hydro_61	Glycosyl hydrolase family 61. Although weak endoglucanase activity has been demonstrated in several members of this family, they lack the clustered conserved catalytic acidic amino acids present in most glycoside hydrolases. Many members of this family lack measurable cellulase activity on their own, but enhance the activity of other cellulolytic enzymes. They are therefore unlikely to be true glycoside hydrolases. The substrate-binding surface of this family is a flat Ig-like fold.	211
281443	pfam03444	HrcA_DNA-bdg	Winged helix-turn-helix transcription repressor, HrcA DNA-binding. This domain is always found with a pair of CBS domains pfam00571.	79
397485	pfam03445	DUF294	Putative nucleotidyltransferase DUF294. This domain is found associated with pfam00571. This region is uncharacterized, however it seems to be similar to pfam01909, conserving the DXD motif. This strongly suggests that members of this family are also nucleotidyltransferases (Bateman A pers. obs.).	138
397486	pfam03446	NAD_binding_2	NAD binding domain of 6-phosphogluconate dehydrogenase. The NAD binding domain of 6-phosphogluconate dehydrogenase adopts a Rossmann fold.	159
281446	pfam03447	NAD_binding_3	Homoserine dehydrogenase, NAD binding domain. This domain adopts a Rossmann NAD binding fold. The C-terminal domain of homoserine dehydrogenase contributes a single helix to this structural domain, which is not included in the Pfam model.	116
397487	pfam03448	MgtE_N	MgtE intracellular N domain. This domain is found at the N-terminus of eubacterial magnesium transporters of the MgtE family pfam01769. This domain is an intracellular domain that has an alpha-helical structure. The crystal structure of the MgtE transporter shows two of 5 magnesium ions are in the interface between the N domain and the CBS domains. In the absence of magnesium there is a large shift between the N and CBS domains.	102
397488	pfam03449	GreA_GreB_N	Transcription elongation factor, N-terminal. This domain adopts a long alpha-hairpin structure.	71
397489	pfam03450	CO_deh_flav_C	CO dehydrogenase flavoprotein C-terminal domain. 	103
397490	pfam03451	HELP	HELP motif. The founding member of the EMAP protein family is the 75 kDa Echinoderm Microtubule-Associated Protein, so-named for its abundance in sea urchin, sand dollar and starfish eggs. The Hydrophobic EMAP-Like Protein (HELP) motif was identified initially in the human EMAP-Like Protein 2 (EML2) and subsequently in the entire EMAP Protein family. The HELP motif is approximately 60-70 amino acids in length and is conserved amongst metazoans. Although the HELP motif is hydrophobic, there is no evidence that EMAP-Like Proteins are membrane-associated. All members of the EMAP-Like Protein family, identified to-date, are constructed with an amino terminal HELP motif followed by a WD domain. In C. elegans, EMAP-Like Protein-1 (ELP-1) is required for touch sensation indicating that ELP-1 may play a role in mechanosensation. The localization of ELP-1 to microtubules and adhesion sites implies that ELP-1 may transmit forces between the body surface and the touch receptor neurons.	73
397491	pfam03452	Anp1	Anp1. The members of this family (Anp1, Van1 and Mnn9) are membrane proteins required for proper Golgi function. These proteins co-localize within the cis Golgi, and they are physically associated in two distinct complexes.	265
397492	pfam03453	MoeA_N	MoeA N-terminal region (domain I and II). This family contains two structural domains. One of these contains the conserved DGXA motif. This region is found in proteins involved in biosynthesis of molybdopterin cofactor however the exact molecular function of this region is uncertain.	145
397493	pfam03454	MoeA_C	MoeA C-terminal region (domain IV). This domain is found in proteins involved in biosynthesis of molybdopterin cofactor however the exact molecular function of this domain is uncertain. The structure of this domain is known and forms an incomplete beta barrel.	72
397494	pfam03455	dDENN	dDENN domain. This region is always found associated with pfam02141. It is predicted to form a globular domain. Although not statistically supported it has been suggested that this domain may be similar to members of the Rho/Rac/Cdc42 GEF family. This N-terminal region of DENN folds into a longin module, consisting of a central antiparallel beta-sheet layered between helix H1 and helices H2 and H3 (strands S1-S5). Rab35 interacts with dDENN via residues in helix 1 and in the loop S3-S4.	50
397495	pfam03456	uDENN	uDENN domain. This region is always found associated with pfam02141. It is predicted to form an all beta domain.	62
397496	pfam03457	HA	Helicase associated domain. This short domain is found in multiple copies in bacterial helicase proteins. The domain is predicted to contain 3 alpha helices. The function of this domain may be to bind nucleic acid.	63
397497	pfam03458	UPF0126	UPF0126 domain. Domain always found as pair in bacterial membrane proteins of unknown function. This domain contains three transmembrane helices. The conserved glycines are suggestive of an ion channel (C. Yeats unpublished obs.).	74
397498	pfam03459	TOBE	TOBE domain. The TOBE domain (Transport-associated OB) always occurs as a dimer as the C-terminal strand of each domain is supplied by the partner. Probably involved in the recognition of small ligands such as molybdenum and sulfate. Found in ABC transporters immediately after the ATPase domain.	62
377044	pfam03460	NIR_SIR_ferr	Nitrite/Sulfite reductase ferredoxin-like half domain. Sulfite and Nitrite reductases are key to both biosynthetic assimilation of sulfur and nitrogen and dissimilation of oxidized anions for energy transduction. Two copies of this repeat are found in Nitrite and Sulfite reductases and form a single structural domain.	67
397499	pfam03461	TRCF	TRCF domain. 	93
397500	pfam03462	PCRF	PCRF domain. This domain is found in peptide chain release factors.	192
397501	pfam03463	eRF1_1	eRF1 domain 1. The release factor eRF1 terminates protein biosynthesis by recognising stop codons at the A site of the ribosome and stimulating peptidyl-tRNA bond hydrolysis at the peptidyl transferase centre. The crystal structure of human eRF1 is known. The overall shape and dimensions of eRF1 resemble a tRNA molecule with domains 1, 2, and 3 of eRF1 corresponding to the anticodon loop, aminoacyl acceptor stem, and T stem of a tRNA molecule, respectively. The position of the essential GGQ motif at an exposed tip of domain 2 suggests that the Gln residue coordinates a water molecule to mediate the hydrolytic activity at the peptidyl transferase centre. A conserved groove on domain 1, 80 A from the GGQ motif, is proposed to form the codon recognition site. This family also includes other proteins for which the precise molecular function is unknown. Many of them are from Archaebacteria. These proteins may also be involved in translation termination but this awaits experimental verification.	128
397502	pfam03464	eRF1_2	eRF1 domain 2. The release factor eRF1 terminates protein biosynthesis by recognising stop codons at the A site of the ribosome and stimulating peptidyl-tRNA bond hydrolysis at the peptidyl transferase centre. The crystal structure of human eRF1 is known. The overall shape and dimensions of eRF1 resemble a tRNA molecule with domains 1, 2, and 3 of eRF1 corresponding to the anticodon loop, aminoacyl acceptor stem, and T stem of a tRNA molecule, respectively. The position of the essential GGQ motif at an exposed tip of domain 2 suggests that the Gln residue coordinates a water molecule to mediate the hydrolytic activity at the peptidyl transferase centre. A conserved groove on domain 1, 80 A from the GGQ motif, is proposed to form the codon recognition site. This family also includes other proteins for which the precise molecular function is unknown. Many of them are from Archaebacteria. These proteins may also be involved in translation termination but this awaits experimental verification.	133
397503	pfam03465	eRF1_3	eRF1 domain 3. The release factor eRF1 terminates protein biosynthesis by recognising stop codons at the A site of the ribosome and stimulating peptidyl-tRNA bond hydrolysis at the peptidyl transferase centre. The crystal structure of human eRF1 is known. The overall shape and dimensions of eRF1 resemble a tRNA molecule with domains 1, 2, and 3 of eRF1 corresponding to the anticodon loop, aminoacyl acceptor stem, and T stem of a tRNA molecule, respectively. The position of the essential GGQ motif at an exposed tip of domain 2 suggests that the Gln residue coordinates a water molecule to mediate the hydrolytic activity at the peptidyl transferase centre. A conserved groove on domain 1, 80 A from the GGQ motif, is proposed to form the codon recognition site. This family also includes other proteins for which the precise molecular function is unknown. Many of them are from Archaebacteria. These proteins may also be involved in translation termination but this awaits experimental verification.	100
397504	pfam03466	LysR_substrate	LysR substrate binding domain. The structure of this domain is known and is similar to the periplasmic binding proteins.	209
397505	pfam03467	Smg4_UPF3	Smg-4/UPF3 family. This family contains proteins that are involved in nonsense-mediated mRNA decay, a process that is triggered by premature stop codons in mRNA. The family includes Smg-4 and UPF3.	171
397506	pfam03468	XS	XS domain. The XS (rice gene X and SGS3) domain is found in a family of plant proteins including gene X and SGS3. SGS3 is thought to be involved in post-transcriptional gene silencing (PTGS). This domain contains a conserved aspartate residue that may be functionally important. The XS domain has recently been predicted to possess an RRM-like RNA-binding domain by fold recognition.	113
397507	pfam03469	XH	XH domain. The XH (rice gene X Homology) domain is found in a family of plant proteins including gene X. The molecular function of these proteins is unknown. However these proteins usually contain an XS domain that is also found in the PTGS protein SGS3. This domain contains a conserved glutamate residue that may be functionally important.	131
251981	pfam03470	zf-XS	XS zinc finger domain. This domain is a putative nucleic acid binding zinc finger found in proteins that also contain an XS domain.	43
397508	pfam03471	CorC_HlyC	Transporter associated domain. This small domain is found in a family of proteins with the pfam01595 domain and two CBS domains with this domain found at the C-terminus of the proteins, the domain is also found at the C-terminus of some Na+/H+ antiporters. This domain is also found in CorC that is involved in Magnesium and cobalt efflux. The function of this domain is uncertain but might be involved in modulating transport of ion substrates.	81
397509	pfam03472	Autoind_bind	Autoinducer binding domain. This domain is found in a large family of transcriptional regulators. This domain specifically binds to autoinducer molecules.	149
397510	pfam03473	MOSC	MOSC domain. The MOSC (MOCO sulfurase C-terminal) domain is a superfamily of beta-strand-rich domains identified in the molybdenum cofactor sulfurase and several other proteins from both prokaryotes and eukaryotes. These MOSC domains contain an absolutely conserved cysteine and occur either as stand-alone forms or fused to other domains such as NifS-like catalytic domain in Molybdenum cofactor sulfurase. The MOSC domain is predicted to be a sulfur-carrier domain that receives sulfur abstracted by the pyridoxal phosphate-dependent NifS-like enzymes, on its conserved cysteine, and delivers it for the formation of diverse sulfur-metal clusters.	116
397511	pfam03474	DMA	DMRTA motif. This region is found to the C-terminus of the pfam00751. DM-domain proteins with this motif are known as DMRTA proteins. The function of this region is unknown.	36
397512	pfam03475	3-alpha	3-alpha domain. This small triple helical domain has been predicted to assume a topology similar to helix-turn-helix domains. These domains are found at the C-terminus of proteins related to Escherichia coli YiiM.	44
281474	pfam03476	MOSC_N	MOSC N-terminal beta barrel domain. This domain is found to the N-terminus of pfam03473. The function of this domain is unknown, however it is predicted to adopt a beta barrel fold.	118
397513	pfam03477	ATP-cone	ATP cone domain. 	86
397514	pfam03478	DUF295	Protein of unknown function (DUF295). This family of proteins are found in plants. The function of the proteins is unknown.	57
397515	pfam03479	DUF296	Domain of unknown function (DUF296). This putative domain is found in proteins that contain AT-hook motifs pfam02178, which strongly suggests a DNA-binding function for the proteins as a whole. There are three highly conserved histidine residues, eg at 117, 119 and 133 in Reut_B5223, which should be a structurally conserved metal-binding unit, based on structural comparison with known metal-binding structures. The proteins should work as trimers.	114
397516	pfam03480	DctP	Bacterial extracellular solute-binding protein, family 7. This family of proteins is involved in binding extracellular solutes for transport across the bacterial cytoplasmic membrane. This family includes DctP, a C4-dicarboxylate-binding protein and the sialic acid-binding protein SiaP. The structure of the SiaP receptor has revealed an overall topology similar to ATP binding cassette ESR (extracytoplasmic solute receptors) proteins. Upon binding of sialic acid, SiaP undergoes domain closure about a hinge region and kinking of an alpha-helix hinge component.	285
397517	pfam03481	SUA5	Putative GTP-binding controlling metal-binding. Structural investigation of this domain suggests that it might be a GTP-binding region that regulates metal binding and involves hydrolysis of ATP to AMP. It is found to the C-terminus of pfam01300.	132
281480	pfam03482	SIC	sic protein repeat. Serotype M1 group A Streptococcus strains cause epidemic waves of human infections. This 30 aa repeat occurs in the sic protein, an extracellular protein (streptococcal inhibitor of complement) that inhibits human complement.	30
397518	pfam03483	B3_4	B3/4 domain. This domain is found in tRNA synthetase beta subunits as well as in some non tRNA synthetase proteins.	174
397519	pfam03484	B5	tRNA synthetase B5 domain. This domain is found in phenylalanine-tRNA synthetase beta subunits.	67
397520	pfam03485	Arg_tRNA_synt_N	Arginyl tRNA synthetase N terminal domain. This domain is found at the amino terminus of Arginyl tRNA synthetase, also called additional domain 1 (Add-1). It is about 140 residues long and it has been suggested that this domain will be involved in tRNA recognition.	83
397521	pfam03486	HI0933_like	HI0933-like protein. 	404
397522	pfam03487	IL13	Interleukin-13. 	108
112313	pfam03488	Ins_beta	Nematode insulin-related peptide beta type. 	48
397523	pfam03489	SapB_2	Saposin-like type B, region 2. 	34
397524	pfam03491	5HT_transport_N	Serotonin (5-HT) neurotransmitter transporter, N-terminus. This short domain lies at the very N-terminus of many serotonin and other transporter proteins, eg SNF, pfam00209.	41
397525	pfam03492	Methyltransf_7	SAM dependent carboxyl methyltransferase. This family of plant methyltransferases contains enzymes that act on a variety of substrates including salicylic acid, jasmonic acid and 7-Methylxanthine. Caffeine is synthesized through sequential three-step methylation of xanthine derivatives at positions 7-N, 3-N, and 1-N. The protein 7-methylxanthine methyltransferase (designated as CaMXMT) catalyzes the second step to produce theobromine.	326
397526	pfam03493	BK_channel_a	Calcium-activated BK potassium channel alpha subunit. 	98
367525	pfam03494	Beta-APP	Beta-amyloid peptide (beta-APP). 	37
367526	pfam03495	Binary_toxB	Clostridial binary toxin B/anthrax toxin PA Ca-binding domain. This domain is a calcium binding domain in the anthrax toxin protective antigen.	78
397527	pfam03496	ADPrib_exo_Tox	ADP-ribosyltransferase exoenzyme. This is a family of bacterial and viral bi-glutamic acid ADP-ribosyltransferases, where, in Aeromonas salmonicida AexT, E403 is the catalytic residue and E401 contributes to the transfer of ADP-ribose to the target protein. In clostridial species it is actin that is being ADP-ribosylated; this result is lethal and dermonecrotic in infected mammals.	199
397528	pfam03497	Anthrax_toxA	Anthrax toxin LF subunit. 	174
397529	pfam03498	CDtoxinA	Cytolethal distending toxin A/C domain. 	149
281496	pfam03500	Cellsynth_D	Cellulose synthase subunit D. 	144
397530	pfam03501	S10_plectin	Plectin/S10 domain. This presumed domain is found at the N-terminus of some isoforms of the cytoskeletal muscle protein plectin as well as the ribosomal S10 protein. This domain may be involved in RNA binding.	92
367531	pfam03502	Channel_Tsx	Nucleoside-specific channel-forming protein, Tsx. 	242
308876	pfam03503	Chlam_OMP3	Chlamydia cysteine-rich outer membrane protein 3. 	54
281500	pfam03504	Chlam_OMP6	Chlamydia cysteine-rich outer membrane protein 6. 	91
308877	pfam03505	Clenterotox	Clostridium enterotoxin. 	197
281501	pfam03506	Flu_C_NS1	Influenza C non-structural protein (NS1). The influenza C virus genome consists of seven single-stranded RNA segments. The shortest RNA segment encodes a 286 amino acid non-structural protein NS1. This protein contains 6 conserved cysteines that may be functionally important, perhaps binding to a metal ion.	162
397531	pfam03507	CagA	CagA exotoxin. 	39
397532	pfam03508	Connexin43	Gap junction alpha-1 protein (Cx43). 	20
397533	pfam03509	Connexin50	Gap junction alpha-8 protein (Cx50). 	65
281505	pfam03510	Peptidase_C24	2C endopeptidase (C24) cysteine protease family. 	105
397534	pfam03511	Fanconi_A	Fanconi anaemia group A protein. 	63
397535	pfam03512	Glyco_hydro_52	Glycosyl hydrolase family 52. 	414
281508	pfam03513	Cloacin_immun	Cloacin immunity protein. 	80
397536	pfam03514	GRAS	GRAS domain family. Proteins in the GRAS (GAI, RGA, SCR) family are known as major players in gibberellin (GA) signaling, which regulates various aspects of plant growth and development. Mutation of the SCARECROW (SCR) gene results in a radial pattern defect, loss of a ground tissue layer, in the root. The PAT1 protein is involved in phytochrome A signal transduction. A sequence, structure and evolutionary analysis showed that the GRAS family emerged in bacteria and belongs to the Rossmann-fold, AdoMET (SAM)-dependent methyltransferase superfamily. All bacterial, and a subset of plant GRAS proteins, are predicted to be active and function as small-molecule methylases. Several plant GRAS proteins lack one or more AdoMet (SAM)-binding residues while preserving their substrate-binding residues. Although GRAS proteins are implicated to function as transcriptional factors, the above analysis suggests that they instead might either modify or bind small molecules.	374
367536	pfam03515	Cloacin	Colicin-like bacteriocin tRNase domain. The C-terminal region of colicin-like bacteriocins is either a pore-forming or an endonuclease-like domain. Cloacin and Pyocins have similar structures and activities to the colicins from E. coli and the klebicins from Klebsiella spp. Colicins E5 and D cleave the anticodon loops of distinct tRNAs of Escherichia coli both in vivo and in vitro. The full-length molecule has an N-terminal translocation domain and a middle, double alpha-helical region which is receptor-binding.	273
397537	pfam03516	Filaggrin	Filaggrin. 	56
397538	pfam03517	Voldacs	Regulator of volume decrease after cellular swelling. ICln is a ubiquitously expressed multi-functional protein that plays a critical role in regulating volume decrease in cells after cellular swelling. In plants, ICln induces Cl- currents, thus regulating Cl- homoeostasis in eukaryotes. Structurally, the fold resembles a pleckstrin homology fold, one of whose roles is to recruit and tether their host protein to the cell membrane; and although the surface charges of the ICln fold are not equivalent to those of the PH domain, ICln can be phosphorylated in vitro and the PH-nature of the domain may be the part involving it in the transposition from cytosol to cell membrane during cytotonic swelling.	139
397539	pfam03519	Invas_SpaK	Invasion protein B family. 	78
397540	pfam03520	KCNQ_channel	KCNQ voltage-gated potassium channel. This family matches to the C-terminal tail of KCNQ type potassium channels.	190
397541	pfam03521	Kv2channel	Kv2 voltage-gated K+ channel. 	288
397542	pfam03522	SLC12	Solute carrier family 12. 	415
397543	pfam03523	Macscav_rec	Macrophage scavenger receptor. 	49
397544	pfam03524	CagX	Conjugal transfer protein. This family includes type IV secretion system CagX conjugation protein. Other members of this family are involved in conjugal transfer to plant cells of T-DNA.	217
367544	pfam03525	Meiotic_rec114	Meiotic recombination protein rec114. 	328
112349	pfam03526	Microcin	Colicin E1 (microcin) immunity protein. 	55
397545	pfam03527	RHS	RHS protein. 	38
367545	pfam03528	Rabaptin	Rabaptin. 	486
397546	pfam03529	TF_Otx	Otx1 transcription factor. 	89
397547	pfam03530	SK_channel	Calcium-activated SK potassium channel. 	109
397548	pfam03531	SSrecog	Structure-specific recognition protein (SSRP1). SSRP1 has been implicated in transcriptional initiation and elongation and in DNA replication and repair. This domain belongs to the Pleckstrin homology fold superfamily.	69
281524	pfam03532	OMS28_porin	OMS28 porin. 	253
367549	pfam03533	SPO11_like	SPO11 homolog. 	43
397549	pfam03534	SpvB	Salmonella virulence plasmid 65kDa B protein. 	286
397550	pfam03535	Paxillin	Paxillin family. Paxillin is a multi-domain protein that localizes in cultured cells primarily to sites of cell adhesion to the extracellular matrix (ECM) called focal adhesions. The family here represents the N-terminal regions with the proline-rich part as well as the Paxillin part. Focal adhesions form a structural link between the ECM and the actin cytoskeleton and are also important sites of signal transduction; their components propagate signals arising from the activation of integrins following their engagement with ECM proteins, such as fibronectin, collagen and laminin. Importantly, focal adhesion proteins including paxillin also serve as a point of convergence for signals resulting from stimulation of various classes of growth factor receptor.	198
397551	pfam03536	VRP3	Salmonella virulence-associated 28kDa protein. 	218
397552	pfam03537	Glyco_hydro_114	Glycoside-hydrolase family GH114. This family is recognized as a glycosyl-hydrolase family, number 114. It is endo-alpha-1,4-polygalactosaminidase, a rare enzyme. It is proposed to be TIM-barrel, the most common structure amongst the catalytic domains of glycosyl-hydrolases.	221
367552	pfam03538	VRP1	Salmonella virulence plasmid 28.1kDa A protein. 	311
112362	pfam03539	Spuma_A9PTase	Spumavirus aspartic protease (A9). 	163
397553	pfam03540	TFIID_30kDa	Transcription initiation factor TFIID 23-30kDa subunit. 	49
397554	pfam03542	Tuberin	Tuberin. Tuberous sclerosis complex (TSC) is an autosomal dominant disorder and is characterized by the presence of hamartomas in many organs, such as brain, skin, heart, lung, and kidney. It is caused by mutation of either the TSC1 or the TSC2 tumor suppressor gene. The TSC2 gene codes for tuberin and interacts with hamartin pfam04388, containing two coiled-coil regions, which have been shown to mediate binding to tuberin. These two proteins function within the same pathway(s) regulating cell cycle, cell growth, adhesion, and vesicular trafficking.	351
397555	pfam03543	Peptidase_C58	Yersinia/Haemophilus virulence surface antigen. 	203
397556	pfam03544	TonB_C	Gram-negative bacterial TonB protein C-terminal. The TonB_C domain is the well-characterized C-terminal region of the TonB receptor molecule. This protein is bound to an inner membrane-bound protein ExbB via a globular domain and has a flexible middle region that is likely to help in positioning the C-terminal domain into the iron-transporter barrel in the outer membrane. TonB_C interacts with the N-terminal TonB box of the outer membrane transporter that binds the Fe3+-siderophore complex. The barrel of the transporter, consisting of 22 beta-sheets and an inside plug, binds the iron complex in the barrel entrance.	79
397557	pfam03545	YopE	Yersinia virulence determinant (YopE). 	70
397558	pfam03546	Treacle	Treacher Collins syndrome protein Treacle. 	524
308904	pfam03547	Mem_trans	Membrane transport protein. This family includes auxin efflux carrier proteins and other transporter proteins from all domains of life.	341
397559	pfam03548	LolA	Outer membrane lipoprotein carrier protein LolA. 	165
308906	pfam03549	Tir_receptor_M	Translocated intimin receptor (Tir) intimin-binding domain. Intimin and its translocated intimin receptor (Tir) are bacterial proteins that mediate adhesion between mammalian cells and attaching and effacing (A/E) pathogens. A unique and essential feature of A/E bacterial pathogens is the formation of actin-rich pedestals beneath the intimately adherent bacteria and localized destruction of the intestinal brush border. The bacterial outer membrane adhesin, intimin, is necessary for the production of the A/E lesion and diarrhoea. The A/E bacteria translocate their own receptor for intimin, Tir, into the membrane of mammalian cells using the type III secretion system. The translocated Tir triggers additional host signalling events and actin nucleation, which are essential for lesion formation. This family represents the Tir intimin-binding domain (Tir IBD) which is needed to bind intimin and support the predicted topology for Tir, with both N- and C-terminal regions in the mammalian cell cytosol.	65
397560	pfam03550	LolB	Outer membrane lipoprotein LolB. 	149
397561	pfam03551	PadR	Transcriptional regulator PadR-like family. Members of this family are transcriptional regulators that appear to be related to the pfam01047 family. This family includes PadR, a protein that is involved in negative regulation of phenolic acid metabolism.	74
281541	pfam03552	Cellulose_synt	Cellulose synthase. Cellulose, an aggregate of unbranched polymers of beta-1,4-linked glucose residues, is the major component of wood and thus paper, and is synthesized by plants, most algae, some bacteria and fungi, and even some animals. The genes that synthesize cellulose in higher plants differ greatly from the well-characterized genes found in Acetobacter and Agrobacterium sp. More correctly designated as 'cellulose synthase catalytic subunits', plant cellulose synthase (CesA) proteins are integral membrane proteins, approximately 1,000 amino acids in length. There are a number of highly conserved residues, including several motifs shown to be necessary for processive glycosyltransferase activity.	715
281542	pfam03553	Na_H_antiporter	Na+/H+ antiporter family. This family includes integral membrane proteins, some of which are NA+/H+ antiporters.	303
281543	pfam03554	Herpes_UL73	UL73 viral envelope glycoprotein. This family groups together the viral proteins BLRF1, U46, 53, and UL73. The UL73-like envelope glycoproteins, which associate in a high molecular mass complex with their counterpart, gM, induce neutralising antibody responses in the host. These glycoproteins are highly polymorphic, particularly in the N-terminal region.	74
281544	pfam03555	Flu_C_NS2	Influenza C non-structural protein (NS2). The influenza C virus genome consists of seven single-stranded RNA segments. The shortest RNA segment encodes a 286 amino acid non-structural protein NS1 pfam03506 as well as the NS2 protein. The NS2 protein is only about 60 amino acids in length and of unknown function.	57
397562	pfam03556	Cullin_binding	Cullin binding. This domain binds to cullins and to Rbx-1, components of an E3 ubiquitin ligase complex for neddylation. Neddylation is the process by which the C-terminal glycine of the ubiquitin-like protein Nedd8 is covalently linked to lysine residues in a protein through an isopeptide bond. The structure of this domain is composed entirely of alpha helices.	116
281546	pfam03557	Bunya_G1	Bunyavirus glycoprotein G1. Bunyavirus has three genomic segments: small (S), middle-sized (M), and large (L). The S segment encodes the nucleocapsid and a non-structural protein. The M segment codes for two glycoproteins, G1 and G2, and another non-structural protein (NSm). The L segment codes for an RNA polymerase. This family contains the G1 glycoprotein which is the viral attachment protein.	879
367561	pfam03558	TBSV_P22	TBSV core protein P21/P22. This protein is required for cell-to-cell movement in plants. Furthermore, the membrane-associated protein is dispensable for both replication and transcription.	187
397563	pfam03559	Hexose_dehydrat	NDP-hexose 2,3-dehydratase. This family includes a range of proteins from antibiotic production pathways. The family includes gra-ORF27 product that probably functions at an early step, most likely as a dTDP-4-keto-6-deoxyglucose-2,3-dehydratase. Its homologs include dnmT from the daunorubicin biosynthetic gene cluster in S. peucetius, a similar gene from the daunomycin biosynthetic cluster in Streptomyces sp. strain C5, eryBVI from the erythromycin cluster in S. erythraea and snoH from the nogalamycin cluster in S. nogalater. The proteins in this family are composed of two copies of a 200 amino acid long unit that may be a structural domain.	203
397564	pfam03561	Allantoicase	Allantoicase repeat. This family is found in pairs in Allantoicases, forming the majority of the protein. These proteins allow the use of purines as secondary nitrogen sources in nitrogen-limiting conditions through the reaction: allantoate + H(2)O = (-)-ureidoglycolate + urea.	103
397565	pfam03562	MltA	MltA specific insert domain. This beta barrel domain is found inserted in MltA, a murein-degrading transglycosylase enzyme. This domain may be involved in peptidoglycan binding.	231
397566	pfam03563	Bunya_G2	Bunyavirus glycoprotein G2. Bunyavirus has three genomic segments: small (S), middle-sized (M), and large (L). The S segment encodes the nucleocapsid and a non-structural protein. The M segment codes for two glycoproteins, G1 and G2, and another non-structural protein (NSm). The L segment codes for an RNA polymerase. This family contains the G2 glycoprotein which interacts with the pfam03557 G1 glycoprotein.	281
281552	pfam03564	DUF1759	Protein of unknown function (DUF1759). This is a family of proteins of unknown function. Most of the members are gag-polyproteins.	148
397567	pfam03566	Peptidase_A21	Peptidase family A21. 	574
397568	pfam03567	Sulfotransfer_2	Sulfotransferase family. This family includes a variety of sulfotransferase enzymes. Chondroitin 6-sulfotransferase catalyzes the transfer of sulfate to position 6 of the N-acetylgalactosamine residue of chondroitin. This family also includes Heparan sulfate 2-O-sulfotransferase (HS2ST) and Heparan sulfate 6-sulfotransferase (HS6ST). Heparan sulfate (HS) is a co-receptor for a number of growth factors, morphogens, and adhesion proteins. HS biosynthetic modifications may determine the strength and outcome of HS-ligand interactions. Mice that lack HS2ST undergo developmental failure only after midgestation, the most dramatic effect being the complete failure of kidney development. Heparan sulphate 6-O-sulfotransferase (HS6ST) catalyzes the transfer of sulphate from adenosine 3'-phosphate, 5'-phosphosulphate to the 6th position of the N-sulphoglucosamine residue in heparan sulphate.	235
397569	pfam03568	Peptidase_C50	Peptidase family C50. 	394
281556	pfam03569	Peptidase_C8	Peptidase family C8. 	208
397570	pfam03571	Peptidase_M49	Peptidase family M49. 	549
397571	pfam03572	Peptidase_S41	Peptidase family S41. 	165
397572	pfam03573	OprD	outer membrane porin, OprD family. This family includes outer membrane proteins related to OprD. OprD has been described as a serine type peptidase. However the proposed catalytic residues are not conserved suggesting that many of these proteins are not peptidases.	393
397573	pfam03574	Peptidase_S48	Peptidase family S48. 	149
397574	pfam03575	Peptidase_S51	Peptidase family S51. 	206
397575	pfam03576	Peptidase_S58	Peptidase family S58. 	309
335383	pfam03577	Peptidase_C69	Peptidase family C69. 	401
397576	pfam03578	HGWP	HGWP repeat. This short (30 amino acids) repeat is found in a number of plant proteins. It contains a conserved HGWP motif, hence its name. The function of these proteins is unknown.	28
281565	pfam03579	SHP	Small hydrophobic protein. The small hydrophobic integral membrane protein, SH (previously designated 1A) is found to have a variety of glycosylated forms. This protein is a component of the mature virion.	64
281566	pfam03580	Herpes_UL14	Herpesvirus UL14-like protein. This is a family of Herpesvirus proteins including UL14. UL14 protein is a minor component of the virion tegument and is expressed late in infection. UL14 protein can influence the intracellular localization patterns of a number of proteins belonging to the capsid or the DNA encapsidation machinery.	146
281567	pfam03581	Herpes_UL33	Herpesvirus UL33-like protein. This is a family of Herpesvirus proteins including UL33 and UL51. The proteins in this family are involved in packaging viral DNA.	72
367570	pfam03583	LIP	Secretory lipase. These lipases are expressed and secreted during the infection cycle of these pathogens. In particular, C. albicans has a large number of different lipases, possibly reflecting broad lipolytic activity, which may contribute to the persistence and virulence of C. albicans in human tissue.	286
367571	pfam03584	Herpes_ICP4_N	Herpesvirus ICP4-like protein N-terminal region. The immediate-early protein ICP4 (infected-cell polypeptide 4) is required for efficient transcription of early and late viral genes and is thus essential for productive infection. ICP4 is a large phosphoprotein that binds DNA in a sequence specific manner as a homodimer. ICP4 represses transcription from LAT, ICP4 and ORF-P that have a high-affinity ICP4 binding site that spans the transcription initiation site. ICP4 proteins have two highly conserved regions; this family contains the N-terminal region that contains sites for DNA binding and homodimerization.	163
281570	pfam03585	Herpes_ICP4_C	Herpesvirus ICP4-like protein C-terminal region. The immediate-early protein ICP4 (infected-cell polypeptide 4) is required for efficient transcription of early and late viral genes and is thus essential for productive infection. ICP4 is a large phosphoprotein that binds DNA in a sequence specific manner as a homodimer. ICP4 represses transcription from LAT, ICP4 and ORF-P that have a high-affinity ICP4 binding site that spans the transcription initiation site. ICP4 proteins have two highly conserved regions; this family contains the C-terminal region that probably acts as an enhancer for the N-terminal region.	444
397577	pfam03586	Herpes_UL36	Herpesvirus UL36 tegument protein. The UL36 open reading frame (ORF) encodes the largest herpes simplex virus type 1 (HSV-1) protein, a 270-kDa polypeptide designated VP1/2, which is also a component of the virion tegument. A null mutation in the UL36 gene of herpes simplex virus type 1 results in accumulation of unenveloped DNA-filled capsids in the cytoplasm of infected cells. This family only covers a small central part of this large protein.	252
397578	pfam03587	EMG1	EMG1/NEP1 methyltransferase. Members of this family are essential for 40S ribosomal biogenesis. The structure of EMG1 has revealed that it is a novel member of the superfamily of alpha/beta knot fold methyltransferases.	205
397579	pfam03588	Leu_Phe_trans	Leucyl/phenylalanyl-tRNA protein transferase. 	171
397580	pfam03589	Antiterm	Antitermination protein. 	85
397581	pfam03590	AsnA	Aspartate-ammonia ligase. 	228
397582	pfam03591	AzlC	AzlC protein. 	135
397583	pfam03592	Terminase_2	Terminase small subunit. Packaging of double-stranded viral DNA concatemers requires interaction of the prohead with virus DNA. This process is mediated by a phage-encoded DNA recognition and terminase protein. The terminase enzymes described so far, which are hetero-oligomers composed of a small and a large subunit, do not have a significant level of sequence homology. The small terminase subunit is thought to form a nucleoprotein structure that helps to position the terminase large subunit at the packaging initiation site.	143
397584	pfam03594	BenE	Benzoate membrane transport protein. 	378
397585	pfam03595	SLAC1	Voltage-dependent anion channel. This family of transporters has ten alpha helical transmembrane segments. The structure of a bacterial homolog of SLAC1 shows it to have a trimeric arrangement. The pore is composed of five helices with a conserved Phe residue involved in gating. One homolog, Mae1 from the yeast Schizosaccharomyces pombe, functions as a malate uptake transporter; another, Ssu1 from Saccharomyces cerevisiae and other fungi including Aspergillus fumigatus, is characterized as a sulfite efflux pump; and TehA from Escherichia coli is identified as a tellurite resistance protein by virtue of its association in the tehA/tehB operon. In plants, this family is found in the stomatal guard cells functioning as an anion-transporting pore. Many homologs are incorrectly annotated as tellurite resistance or dicarboxylate transporter (TDT) proteins.	332
281580	pfam03596	Cad	Cadmium resistance transporter. 	192
397586	pfam03597	FixS	Cytochrome oxidase maturation protein cbb3-type. 	44
397587	pfam03598	CdhC	CO dehydrogenase/acetyl-CoA synthase complex beta subunit. 	155
397588	pfam03599	CdhD	CO dehydrogenase/acetyl-CoA synthase delta subunit. 	384
397589	pfam03600	CitMHS	Citrate transporter. 	299
397590	pfam03601	Cons_hypoth698	Conserved hypothetical protein 698. 	305
397591	pfam03602	Cons_hypoth95	Conserved hypothetical protein 95. 	179
397592	pfam03603	DNA_III_psi	DNA polymerase III psi subunit. 	126
397593	pfam03604	DNA_RNApol_7kD	DNA directed RNA polymerase, 7 kDa subunit. 	32
397594	pfam03605	DcuA_DcuB	Anaerobic c4-dicarboxylate membrane transporter. 	364
281589	pfam03606	DcuC	C4-dicarboxylate anaerobic carrier. 	452
397595	pfam03607	DCX	Doublecortin. 	55
397596	pfam03608	EII-GUT	PTS system enzyme II sorbitol-specific factor. 	162
397597	pfam03609	EII-Sor	PTS system sorbose-specific iic component. 	233
397598	pfam03610	EIIA-man	PTS system fructose IIA component. 	114
397599	pfam03611	EIIC-GAT	PTS system sugar-specific permease component. This family includes bacterial transmembrane proteins with a putative sugar-specific permease function, including and analogous to the IIC component of the PTS system. It has been suggested that this permease may form part of an L-ascorbate utilisation pathway, with proposed specificity for 3-keto-L-gulonate (formed by hydrolysis of L-ascorbate). This family includes the IIC component of the galactitol specific GAT family PTS system.	397
397600	pfam03612	EIIBC-GUT_N	Sorbitol phosphotransferase enzyme II N-terminus. 	184
397601	pfam03613	EIID-AGA	PTS system mannose/fructose/sorbose family IID component. 	263
281597	pfam03614	Flag1_repress	Repressor of phase-1 flagellin. 	170
397602	pfam03615	GCM	GCM motif protein. 	140
308938	pfam03616	Glt_symporter	Sodium/glutamate symporter. 	368
281600	pfam03617	IBV_3A	IBV 3A protein. The gene product of gene 3 from Avian infectious bronchitis virus. Currently, the function of this protein remains unknown.	57
397603	pfam03618	Kinase-PPPase	Kinase/pyrophosphorylase. This family of regulatory proteins has ADP-dependent kinase and inorganic phosphate-dependent pyrophosphorylase activity.	255
397604	pfam03619	Solute_trans_a	Organic solute transporter Ostalpha. This family is a transmembrane organic solute transport protein. In vertebrates these proteins form a complex with Ostbeta, and function as bile transporters. In plants they may transport brassinosteroid-like compounds and act as regulators of cell death.	261
281603	pfam03620	IBV_3C	IBV 3C protein. Product of ORF 3C from Avian infectious bronchitis virus (IBV). Currently, the function of this protein remains unknown.	92
397605	pfam03621	MbtH	MbtH-like protein. This domain is found in the MbtH protein as well as at the N-terminus of the antibiotic synthesis protein NIKP1. MbtH and its homologs were first noted in gene clusters involved in non-ribosomal peptides and other secondary metabolites by Quadri et al. This domain is about 70 amino acids long and contains 3 fully conserved tryptophan residues. The structure of the PA2412 protein shows it adopts a beta-beta-beta-alpha-alpha topology with the short C-terminal helix forming the tip of an overall arrowhead shape. MbtH proteins have been shown to be required for the synthesis of antibiotics, siderophores and glycopeptidolipids.	52
281605	pfam03622	IBV_3B	IBV 3B protein. Product of ORF 3B from Avian infectious bronchitis virus (IBV). Currently, the function of this protein remains unknown.	64
397606	pfam03623	Focal_AT	Focal adhesion targeting region. Focal adhesion kinase (FAK) is a tyrosine kinase found in focal adhesions, intracellular signaling complexes that are formed following engagement of the extracellular matrix by integrins. The C-terminal 'focal adhesion targeting' (FAT) region is necessary and sufficient for localising FAK to focal adhesions. The crystal structure of FAT shows it forms a four-helix bundle that resembles those found in two other proteins involved in cell adhesion, alpha-catenin and vinculin. The binding of FAT to the focal adhesion protein, paxillin, requires the integrity of the helical bundle, whereas binding to another focal adhesion protein, talin, does not.	130
397607	pfam03625	DUF302	Domain of unknown function DUF302. Domain is found in an undescribed set of proteins. Normally occurs uniquely within a sequence, but is found as a tandem repeat. Shows interesting phylogenetic distribution with majority of examples in bacteria and archaea, but also in D. melanogaster.	63
397608	pfam03626	COX4_pro	Prokaryotic Cytochrome C oxidase subunit IV. Cytochrome c oxidase (COX) is a multi-subunit enzyme complex that catalyzes the final step of electron transfer through the respiratory chain on the mitochondrial inner membrane. This family is composed of cytochrome c oxidase subunit 4 from prokaryotes.	72
112444	pfam03627	PapG_N	PapG carbohydrate binding domain. PapG, the adhesin of the P-pili, is situated at the tip and is only a minor component of the whole pilus structure. A two-domain structure has been postulated for PapG; a carbohydrate binding N-terminus (this domain) and chaperone binding C-terminus. The carbohydrate-binding domain interacts with the receptor glycan.	226
397609	pfam03628	PapG_C	PapG chaperone-binding domain. PapG, the adhesin of the P-pili, is situated at the tip and is only a minor component of the whole pilus structure. A two-domain structure has been postulated for PapG; a carbohydrate binding N-terminus and chaperone binding C-terminus (this domain). The chaperone-binding domain is highly conserved, and is essential for the correct assembly of the pili structure when aided by the chaperone molecule PapD.	108
397610	pfam03629	SASA	Carbohydrate esterase, sialic acid-specific acetylesterase. The catalytic triad of this esterase enzyme comprises residues Ser127, His403 and Asp391 in UniProtKB:P70665.	226
397611	pfam03630	Fumble	Fumble. Fumble is required for cell division in Drosophila. Mutants lacking fumble exhibit abnormalities in bipolar spindle organisation, chromosome segregation, and contractile ring formation. Analyses have demonstrated that fumble encodes three protein isoforms, all of which contain a domain with high similarity to the pantothenate kinases of A. nidulans and mouse. A role of fumble in membrane synthesis has been proposed.	321
397612	pfam03631	Virul_fac_BrkB	Virulence factor BrkB. This family acts as a virulence factor. In Bordetella pertussis, BrkB is essential for resistance to complement-dependent killing by serum. This family was originally predicted to be ribonuclease BN, but this prediction has since been shown to be incorrect.	256
281612	pfam03632	Glyco_hydro_65m	Glycosyl hydrolase family 65 central catalytic domain. This family of glycosyl hydrolases contains vacuolar acid trehalase and maltose phosphorylase. Maltose phosphorylase (MP) is a dimeric enzyme that catalyzes the conversion of maltose and inorganic phosphate into beta-D-glucose-1-phosphate and glucose. The central domain is the catalytic domain, which binds a phosphate ion that is proximal to the highly conserved Glu. The arrangement of the phosphate and the glutamate is thought to cause nucleophilic attack on the anomeric carbon atom. The catalytic domain also forms the majority of the dimerization interface.	387
397613	pfam03633	Glyco_hydro_65C	Glycosyl hydrolase family 65, C-terminal domain. This family of glycosyl hydrolases contains vacuolar acid trehalase and maltose phosphorylase. Maltose phosphorylase (MP) is a dimeric enzyme that catalyzes the conversion of maltose and inorganic phosphate into beta-D-glucose-1-phosphate and glucose. The C-terminal domain forms a two layered jelly roll motif. This domain is situated at the base of the catalytic domain; however, its function remains unknown.	50
397614	pfam03634	TCP	TCP family transcription factor. This is a family of TCP plant transcription factors. TCP proteins were named after the first characterized members (TB1, CYC and PCFs) and they are involved in multiple developmental control pathways. This region contains a DNA binding basic-Helix-Loop-Helix (bHLH) structure.	152
397615	pfam03635	Vps35	Vacuolar protein sorting-associated protein 35. Vacuolar protein sorting-associated protein (Vps) 35 is one of around 50 proteins involved in protein trafficking. In particular, Vps35 assembles into a retromer complex with at least four other proteins Vps5, Vps17, Vps26 and Vps29. Vps35 contains a central region of weaker sequence similarity, thought to indicate the presence of at least three domains.	732
397616	pfam03636	Glyco_hydro_65N	Glycosyl hydrolase family 65, N-terminal domain. This family of glycosyl hydrolases contains vacuolar acid trehalase and maltose phosphorylase. Maltose phosphorylase (MP) is a dimeric enzyme that catalyzes the conversion of maltose and inorganic phosphate into beta-D-glucose-1-phosphate and glucose. This domain is believed to be essential for catalytic activity although its precise function remains unknown.	240
397617	pfam03637	Mob1_phocein	Mob1/phocein family. Mob1 is an essential Saccharomyces cerevisiae protein, identified from a two-hybrid screen, that binds Mps1p, a protein kinase essential for spindle pole body duplication and mitotic checkpoint regulation. Mob1 contains no known structural motifs; however MOB1 is a member of a conserved gene family and shares sequence similarity with a nonessential yeast gene, MOB2. Mob1 is a phosphoprotein in vivo and a substrate for the Mps1p kinase in vitro. Conditional alleles of MOB1 cause a late nuclear division arrest at restrictive temperature. This family also includes phocein, a rat protein that by yeast two hybrid interacts with striatin.	170
397618	pfam03638	TCR	Tesmin/TSO1-like CXC domain, cysteine-rich domain. This family includes proteins that have two copies of a cysteine rich motif as follows: C-X-C-X4-C-X3-YC-X-C-X6-C-X3-C-X-C-X2-C. The family includes Tesmin and TSO1. This family is also called a CXC domain.	38
397619	pfam03639	Glyco_hydro_81	Glycosyl hydrolase family 81. Family of eukaryotic beta-1,3-glucanases. Within the Aspergillus fumigatus protein ENGL1, two perfectly conserved Glu residues (E550 or E554) have been proposed as putative nucleophiles of the active site of the Engl1 endoglucanase, while the proton donor would be D475. The endo-beta-1,3-glucanase activity is essential for efficient spore release.	321
397620	pfam03640	Lipoprotein_15	Secreted repeat of unknown function. This family occurs as tandem repeats in a set of lipoproteins. The alignment contains a Y-X4-D motif.	45
397621	pfam03641	Lysine_decarbox	Possible lysine decarboxylase. The members of this family share a highly conserved motif PGGXGTXXE that is probably functionally important. This family includes proteins annotated as lysine decarboxylases, although the evidence for this is not clear.	130
367591	pfam03642	MAP	MAP domain. This presumed 110 amino acid residue domain is found in multiple copies in MAP (MHC class II analogue protein). The protein has been found in a wide range of extracellular matrix proteins.	87
397622	pfam03643	Vps26	Vacuolar protein sorting-associated protein 26. Vacuolar protein sorting-associated protein (Vps) 26 is one of around 50 proteins involved in protein trafficking. In particular, Vps26 assembles into a retromer complex with at least four other proteins Vps5, Vps17, Vps29 and Vps35. This family also contains Down syndrome critical region 3/A.	275
397623	pfam03644	Glyco_hydro_85	Glycosyl hydrolase family 85. Family of endo-beta-N-acetylglucosaminidases. These enzymes work on a broad spectrum of substrates.	291
397624	pfam03645	Tctex-1	Tctex-1 family. Tctex-1 is a dynein light chain. It has been shown that Tctex-1 can bind to the cytoplasmic tail of rhodopsin. C-terminal rhodopsin mutations responsible for retinitis pigmentosa inhibit this interaction.	98
397625	pfam03646	FlaG	FlaG protein. Although important for flagella the exact function of this protein is unknown.	101
397626	pfam03647	Tmemb_14	Transmembrane proteins 14C. This family of short membrane proteins are as yet uncharacterized.	90
397627	pfam03648	Glyco_hydro_67N	Glycosyl hydrolase family 67 N-terminus. Alpha-glucuronidases, components of an ensemble of enzymes central to the recycling of photosynthetic biomass, remove the alpha-1,2 linked 4-O-methyl glucuronic acid from xylans. This family represents the N-terminal region of alpha-glucuronidase. The N-terminal domain forms a two-layer sandwich, each layer being formed by a beta sheet of five strands. A further two helices form part of the interface with the central, catalytic, module (pfam07488).	120
397628	pfam03649	UPF0014	Uncharacterized protein family (UPF0014). 	242
397629	pfam03650	MPC	Uncharacterized protein family (UPF0041). 	110
397630	pfam03652	RuvX	Holliday junction resolvase. This family of nucleases resolves the Holliday junction intermediates in genetic recombination.	134
397631	pfam03653	UPF0093	Uncharacterized protein family (UPF0093). 	146
252088	pfam03656	Pam16	Pam16. The Pam16 protein is the fifth essential subunit of the pre-sequence translocase-associated protein import motor (PAM). In Saccharomyces cerevisiae, Pam16 is required for preprotein translocation into the matrix, but not for protein insertion into the inner membrane. Pam16 has a degenerate J domain. J-domain proteins play important regulatory roles as co-chaperones, recruiting Hsp70 partners and accelerating the ATP-hydrolysis step of the chaperone cycle. Pam16's J-like domain strongly interacts with Pam18's J domain, leading to a productive interaction of Pam18 with mtHsp70 at the mitochondria import channel. Pam18 stimulates the ATPase activity of mtHsp70.	127
397632	pfam03657	UPF0113	Uncharacterized protein family (UPF0113). 	72
397633	pfam03658	Ub-RnfH	RnfH family Ubiquitin. A member of the RnfH family of the ubiquitin superfamily. Members of this family strongly co-occur in two distinct gene neighborhood contexts. In one it is associated with a START domain protein, a membrane protein SmpA and the transfer mRNA binding protein SmpB. This association suggests a possible role in the SmpB-tmRNA-based tagging and degradation system of bacteria, which is interesting given that other members of the ubiquitin system are analogously involved in protein-tagging and degradation across eukaryotes and various prokaryotes. The second context in which the RnfH genes are present is in a membrane associated complex involved in transporting electrons for various reductive reactions such as nitrogen fixation.	83
397634	pfam03659	Glyco_hydro_71	Glycosyl hydrolase family 71. Family of alpha-1,3-glucanases.	372
397635	pfam03660	PHF5	PHF5-like protein. This family of proteins belongs to the superfamily of PHD-finger proteins. At least one example, from mouse, may act as a chromatin-associated protein. The S. pombe ini1 gene is essential, required for splicing. It is localized in the nucleus, but not detected in the nucleolus and can be complemented by human ini1.	104
397636	pfam03661	UPF0121	Uncharacterized protein family (UPF0121). Uncharacterized integral membrane protein family.	237
397637	pfam03662	Glyco_hydro_79n	Glycosyl hydrolase family 79, N-terminal domain. Family of endo-beta-N-glucuronidase, or heparanase. Heparan sulfate proteoglycans (HSPGs) play a key role in the self-assembly, insolubility and barrier properties of basement membranes and extracellular matrices. Hence, cleavage of heparan sulfate (HS) affects the integrity and functional state of tissues and thereby fundamental normal and pathological phenomena involving cell migration and response to changes in the extracellular micro-environment. Heparanase degrades HS at specific intra-chain sites. The enzyme is synthesized as a latent approximately 65 kDa protein that is processed at the N-terminus into a highly active approximately 50 kDa form. Experimental evidence suggests that heparanase may facilitate both tumor cell invasion and neovascularization, both critical steps in cancer progression. The enzyme is also involved in cell migration associated with inflammation and autoimmunity.	318
397638	pfam03663	Glyco_hydro_76	Glycosyl hydrolase family 76. Family of alpha-1,6-mannanases.	348
281639	pfam03664	Glyco_hydro_62	Glycosyl hydrolase family 62. Family of alpha-L-arabinofuranosidase (EC 3.2.1.55). This enzyme hydrolyzes aryl alpha-L-arabinofuranosides and cleaves arabinosyl side chains from arabinoxylan and arabinan.	272
397639	pfam03665	UPF0172	Uncharacterized protein family (UPF0172). In Chlamydomonas reinhardtii the protein TLA1 (truncated light-harvesting chlorophyll antenna size) apparently regulates genes that define the chlorophyll-a antenna size in the photosynthetic apparatus. This family was formerly known as UPF0172.	192
397640	pfam03666	NPR3	Nitrogen Permease regulator of amino acid transport activity 3. This family, also known in yeasts as Rmd11, complexes with NPR2, pfam06218. This complex heterodimer is responsible for inactivating TORC1, an evolutionarily conserved protein complex that controls cell size via nutritional input signals, specifically, in response to amino acid starvation.	446
397641	pfam03668	ATP_bind_2	P-loop ATPase protein family. This family contains an ATP-binding site and could be an ATPase (personal obs:C Yeats).	284
397642	pfam03669	UPF0139	Uncharacterized protein family (UPF0139). 	96
112485	pfam03670	UPF0184	Uncharacterized protein family (UPF0184). 	83
397643	pfam03671	Ufm1	Ubiquitin fold modifier 1 protein. This is a family of short ubiquitin-like proteins, that is like neither type-1 nor type-2. It is a ubiquitin-fold modifier 1 (Ufm1) that is synthesized in a precursor form of 85 amino-acid residues. In humans the enzyme for Ufm1 is Uba5 and the conjugating enzyme is Ufc1. Prior to activation by Uba5 the extra two amino acids at the C-terminal region of the human pro-Ufm1 protein are removed to expose Gly whose residue is necessary for conjugation to target molecule(s). The mature Ufm1 is conjugated to yet unidentified endogenous proteins. While Ubiquitin and many Ubls possess the conserved C-terminal di-glycine that is adenylated by each specific E1 or E1-like enzyme, respectively, in an ATP-dependent manner, Ufm1(1-83) possesses a single glycine at its C-terminus, which is followed by a Ser-Cys dipeptide in the precursor form of Ufm1. The C-terminally processed Ufm1(1-83) is specifically activated by Uba5, an E1-like enzyme, and then transferred to its cognate Ufc1, an E2-like enzyme.	75
397644	pfam03672	UPF0154	Uncharacterized protein family (UPF0154). This family contains a set of short bacterial proteins of unknown function.	60
308975	pfam03673	UPF0128	Uncharacterized protein family (UPF0128). The members of this family are about 240 amino acids in length. The proteins are as yet uncharacterized.	221
397645	pfam03676	UPF0183	Uncharacterized protein family (UPF0183). This family of proteins includes Lin-10 from C. elegans.	395
281647	pfam03677	UPF0137	Uncharacterized protein family (UPF0137). This family includes GP6-D a virulence plasmid encoded protein.	237
308977	pfam03678	Adeno_hexon_C	Hexon, adenovirus major coat protein, C-terminal domain. Hexon is the major coat protein from adenovirus type 2. Hexon forms a homo-trimer. The 240 copies of the hexon trimer are organized so that 12 lie on each of the 20 facets. The central 9 hexons in a facet are cemented together by 12 copies of polypeptide IX. The penton complex, formed by the peripentonal hexons and base hexon (holding in place a fibre), lie at each of the 12 vertices. The N and C-terminal domains adopt the same PNGase F-like fold although they are significantly different in length.	241
308978	pfam03682	UPF0158	Uncharacterized protein family (UPF0158). 	157
397646	pfam03683	UPF0175	Uncharacterized protein family (UPF0175). This family contains small proteins of unknown function.	75
397647	pfam03684	UPF0179	Uncharacterized protein family (UPF0179). The function of this family is unknown, however the proteins contain two cysteine clusters that may be iron sulphur redox centers.	139
397648	pfam03685	UPF0147	Uncharacterized protein family (UPF0147). This family of small proteins have no known function.	81
397649	pfam03686	UPF0146	Uncharacterized protein family (UPF0146). The function of this family of proteins is unknown.	129
281654	pfam03687	UPF0164	Uncharacterized protein family (UPF0164). This family of uncharacterized proteins are only found in Treponema pallidum. These proteins belong to the membrane beta barrel superfamily.	326
397650	pfam03688	Nepo_coat_C	Nepovirus coat protein, C-terminal domain. The members of this family are derived from nepoviruses. Together with comoviruses and picornaviruses, nepoviruses are classified in the picornavirus superfamily of plus strand single-stranded RNA viruses. This family aligns several nepovirus coat protein sequences. In several cases, this is found at the C-terminus of the RNA2-encoded viral polyprotein. The coat protein consists of three trapezoid-shaped beta-barrel domains, and forms a pseudo T = 3 icosahedral capsid structure.	163
397651	pfam03689	Nepo_coat_N	Nepovirus coat protein, N-terminal domain. The members of this family are derived from nepoviruses. Together with comoviruses and picornaviruses, nepoviruses are classified in the picornavirus superfamily of plus strand single-stranded RNA viruses. This family aligns several nepovirus coat protein sequences. In several cases, this is found at the C-terminus of the RNA2-encoded viral polyprotein. The coat protein consists of three trapezoid-shaped beta-barrel domains, and forms a pseudo T = 3 icosahedral capsid structure.	91
397652	pfam03690	UPF0160	Uncharacterized protein family (UPF0160). This family of proteins contains a large number of metal binding residues. The patterns are suggestive of a phosphoesterase function. The conserved DHH motif may mean this family is related to pfam01368.	315
397653	pfam03691	UPF0167	Uncharacterized protein family (UPF0167). The proteins in this family are about 200 amino acids long and each contain 3 CXXC motifs.	174
308985	pfam03692	CxxCxxCC	Putative zinc- or iron-chelating domain. This family of proteins contains 8 conserved cysteines. It has in the past been annotated as being one of the complex of proteins of the flagellar Fli complex. However this was due to a mis-annotation of the original Salmonella LT2 Genbank entry of 'fliB'. With all its conserved cysteines it is possibly a domain that chelates iron or zinc ions.	85
281658	pfam03693	ParD_antitoxin	Bacterial antitoxin of ParD toxin-antitoxin type II system and RHH. ParD is the antitoxin of a bacterial toxin-antitoxin gene pair. The cognate toxin is ParE, pfam05016. The family contains several related antitoxins from Cyanobacteria, Proteobacteria and Actinobacteria. Antitoxins of this class carry an N-terminal ribbon-helix-helix domain, RHH, that is highly conserved across all type II bacterial antitoxins, which dimerizes with the RHH domain of a second VapB molecule. A hinge section follows the RHH, with an additional pair of flexible alpha helices at the C-terminus. This C-terminus is the toxin-binding region of the dimer, and so is specific to the cognate toxin, whereas the RHH domain has the specific function of lying across the RNA-binding groove of the toxin dimer and inactivating the active-site - a more general function of all type II antitoxins.	80
397654	pfam03694	Erg28	Erg28 like protein. This is a family of integral membrane proteins, which may contain four transmembrane helices. Members of this family are thought to be involved in sterol C-4 demethylation. In S. cerevisiae they may tether Erg26p (sterol dehydrogenase/decarboxylase) and Erg27p (3-ketoreductase) to the endoplasmic reticulum or may facilitate interaction between these proteins. The family contains a conserved arginine and histidine that may be functionally important.	110
397655	pfam03695	UPF0149	Uncharacterized protein family (UPF0149). The proteins in this family are about 190 amino acids long. The function of these proteins is unknown.	170
397656	pfam03698	UPF0180	Uncharacterized protein family (UPF0180). The members of this family are small uncharacterized proteins.	73
397657	pfam03699	UPF0182	Uncharacterized protein family (UPF0182). This family contains uncharacterized integral membrane proteins.	753
397658	pfam03700	Sorting_nexin	Sorting nexin, N-terminal domain. These proteins bind to the cytoplasmic domain of plasma membrane receptors and are involved in endocytic protein trafficking. The N-terminal domain appears to be specific to sorting nexins 1 and 2.	78
397659	pfam03701	UPF0181	Uncharacterized protein family (UPF0181). This family contains small proteins of about 50 amino acids of unknown function. The family includes YoaH.	50
397660	pfam03702	AnmK	Anhydro-N-acetylmuramic acid kinase. Anhydro-N-acetylmuramic acid kinase catalyzes the specific phosphorylation of 1,6-anhydro-N-acetylmuramic acid (anhMurNAc) with the simultaneous cleavage of the 1,6-anhydro ring, generating MurNAc-6-P. It is also required for the utilization of anhMurNAc, either imported from the medium, or derived from its own cell wall murein, and in so doing plays a role in cell wall recycling.	364
397661	pfam03703	bPH_2	Bacterial PH domain. Domain found in uncharacterized family of membrane proteins. 1-3 copies found in each protein, with each copy flanked by transmembrane helices. Members of this family have a PH domain like structure.	79
397662	pfam03704	BTAD	Bacterial transcriptional activator domain. Found in the DNRI/REDD/AFSR family of regulators. This region of AFSR, along with the C terminal region, is capable of independently directing actinorhodin production. This family contains TPR repeats.	146
397663	pfam03705	CheR_N	CheR methyltransferase, all-alpha domain. CheR proteins are part of the chemotaxis signaling mechanism in bacteria. CheR methylates the chemotaxis receptor at specific glutamate residues. CheR is an S-adenosylmethionine-dependent methyltransferase.	53
397664	pfam03706	LPG_synthase_TM	Lysylphosphatidylglycerol synthase TM region. LPG_synthase_TM is the N-terminal region of this family of bacterial phosphatidylglycerol lysyltransferases. The function of the family is to add lysyl groups to membrane lipids, and this region is the transmembrane domain of 7xTMs. In order to counteract attack by membrane-damaging external cationic antimicrobial molecules - from host immune systems, bacteriocins, defensins, etc - bacteria modify their anionic membrane phosphatidylglycerol with positively-charged L-lysine; this results in repulsion of the foreign cationic peptides.	300
397665	pfam03707	MHYT	Bacterial signalling protein N terminal repeat. Found as an N terminal triplet tandem repeat in bacterial signalling proteins. Family includes CoxC and CoxH from P.carboxydovorans. Each repeat contains two transmembrane helices. Domain is also described as the MHYT domain.	54
281671	pfam03708	Avian_gp85	Avian retrovirus envelope protein, gp85. Family of avian-specific viral glycoproteins that form a receptor-binding gp85 polypeptide that is linked through disulfide to a membrane-spanning gp37 spike. Gp85 confers a high degree of subgroup specificity for interaction with distinct cell receptors.	246
397666	pfam03709	OKR_DC_1_N	Orn/Lys/Arg decarboxylase, N-terminal domain. This domain has a flavodoxin-like fold, and is termed the "wing" domain because of its position in the overall 3D structure.	111
397667	pfam03710	GlnE	Glutamate-ammonia ligase adenylyltransferase. Conserved repeated domain found in GlnE proteins. These proteins adenylate and deadenylate glutamine synthases: ATP + {L-Glutamate:ammonia ligase (ADP-forming)} = Diphosphate + Adenylyl-{L-Glutamate:Ammonia ligase (ADP-forming)}. The family is related to the pfam01909 domain.	249
397668	pfam03711	OKR_DC_1_C	Orn/Lys/Arg decarboxylase, C-terminal domain. 	129
397669	pfam03712	Cu2_monoox_C	Copper type II ascorbate-dependent monooxygenase, C-terminal domain. The N and C-terminal domains of members of this family adopt the same PNGase F-like fold.	157
397670	pfam03713	DUF305	Domain of unknown function (DUF305). Domain found in small family of bacterial secreted proteins with no known function. Also found in Paramecium bursaria chlorella virus 1. This domain is short and found in one or two copies. The domain has a conserved HH motif that may be functionally important. This domain belongs to the ferritin superfamily. It contains two sequence similar repeats each of which is composed of two alpha helices.	151
397671	pfam03714	PUD	Bacterial pullanase-associated domain. Domain is found in pullanase - carbohydrate de-branching - proteins. It is found at either the N or the C terminus of the alpha-amylase active site region. This domain contains several conserved aromatic residues that are suggestive of a carbohydrate binding function.	97
397672	pfam03715	Noc2	Noc2p family. At least one member, Noc2p from yeast, is required for a late step in 60S subunit export from the nucleus. It has also been shown to co-precipitate with Nug1p, a nuclear GTPase also required for ribosome nucleus export. This family was formerly known as UPF0120.	299
202737	pfam03716	WCCH	WCCH motif. The WCCH motif is found in retrotransposons and Gemini viruses. A specific function has not been associated with this motif.	25
397673	pfam03717	PBP_dimer	Penicillin-binding Protein dimerization domain. This domain is found at the N-terminus of Class B High Molecular Weight Penicillin-Binding Proteins. Its function has not been precisely defined, but is strongly implicated in PBP polymerization. The domain forms a largely disordered 'sugar tongs' structure.	178
397674	pfam03718	Glyco_hydro_49	Glycosyl hydrolase family 49. Family of dextranase (EC 3.2.1.11) and isopullulanase (EC 3.2.1.57). Dextranase hydrolyzes alpha-1,6-glycosidic bonds in dextran polymers. This domain corresponds to the C-terminal pectate lyase like domain.	118
397675	pfam03719	Ribosomal_S5_C	Ribosomal protein S5, C-terminal domain. 	66
397676	pfam03720	UDPG_MGDP_dh_C	UDP-glucose/GDP-mannose dehydrogenase family, UDP binding domain. The UDP-glucose/GDP-mannose dehydrogenases are a small group of enzymes which possess the ability to catalyze the NAD-dependent 2-fold oxidation of an alcohol to an acid without the release of an aldehyde intermediate.	103
397677	pfam03721	UDPG_MGDP_dh_N	UDP-glucose/GDP-mannose dehydrogenase family, NAD binding domain. The UDP-glucose/GDP-mannose dehydrogenases are a small group of enzymes which possess the ability to catalyze the NAD-dependent 2-fold oxidation of an alcohol to an acid without the release of an aldehyde intermediate.	186
397678	pfam03722	Hemocyanin_N	Hemocyanin, all-alpha domain. This family includes arthropod hemocyanins and insect larval storage proteins.	124
397679	pfam03723	Hemocyanin_C	Hemocyanin, ig-like domain. This family includes arthropod hemocyanins and insect larval storage proteins.	243
397680	pfam03724	META	META domain. Small domain family found in proteins of unknown function. Some are secreted and implicated in motility in bacteria. Also occurs in Leishmania spp. as an essential gene. Over-expression in L.amazonensis increases virulence. A pair of cysteine residues show correlated conservation, suggesting that they form a disulphide bond.	109
397681	pfam03725	RNase_PH_C	3' exoribonuclease family, domain 2. This family includes 3'-5' exoribonucleases. Ribonuclease PH contains a single copy of this domain, and removes nucleotide residues following the -CCA terminus of tRNA. Polyribonucleotide nucleotidyltransferase (PNPase) contains two tandem copies of the domain. PNPase is involved in mRNA degradation in a 3'-5' direction. The exosome is a 3'-5' exoribonuclease complex that is required for 3' processing of the 5.8S rRNA. Three of its five protein components contain a copy of this domain. A hypothetical protein from S. pombe appears to belong to an uncharacterized subfamily. This subfamily is found in both eukaryotes and archaebacteria.	67
397682	pfam03726	PNPase	Polyribonucleotide nucleotidyltransferase, RNA binding domain. This family contains the RNA binding domain of Polyribonucleotide nucleotidyltransferase (PNPase). PNPase is involved in mRNA degradation in a 3'-5' direction.	80
397683	pfam03727	Hexokinase_2	Hexokinase. Hexokinase (EC:2.7.1.1) contains two structurally similar domains represented by this family and pfam00349. Some members of the family have two copies of each of these domains.	241
281690	pfam03728	Viral_DNA_Zn_bi	Viral DNA-binding protein, zinc binding domain. This family represents the zinc binding domain of the viral DNA-binding protein, a multifunctional protein involved in DNA replication and transcription control. Two copies of this domain are found at the C-terminus of many members of the family.	94
377113	pfam03729	DUF308	Short repeat of unknown function (DUF308). Family of short repeats that occurs in a limited number of membrane proteins. It may divide further in short repeats of around 7-10 residues of the pattern G-#-X(2)-#(2)-X (#=hydrophobic).	73
397684	pfam03730	Ku_C	Ku70/Ku80 C-terminal arm. The Ku heterodimer (composed of Ku70 and Ku80) contributes to genomic integrity through its ability to bind DNA double-strand breaks and facilitate repair by the non-homologous end-joining pathway. This is the C terminal arm. This alpha helical region embraces the beta-barrel domain pfam02735 of the opposite subunit.	87
397685	pfam03731	Ku_N	Ku70/Ku80 N-terminal alpha/beta domain. The Ku heterodimer (composed of Ku70 and Ku80) contributes to genomic integrity through its ability to bind DNA double-strand breaks and facilitate repair by the non-homologous end-joining pathway. This is the amino terminal alpha/beta domain. This domain only makes a small contribution to the dimer interface. The domain comprises a six stranded beta sheet of the Rossmann fold.	220
367628	pfam03732	Retrotrans_gag	Retrotransposon gag protein. Gag or Capsid-like proteins from LTR retrotransposons. There is a central motif QGXXEXXXXXFXXLXXH that is common to Retroviridae gag-proteins, but is poorly conserved.	97
397686	pfam03733	YccF	Inner membrane component domain. Domain occurs as one or more copies in bacterial and eukaryotic proteins. These are membrane proteins of four TM regions, two appearing in each of the two copies when both are present. Many of the latter members also carry the sodium/calcium exchanger protein family pfam01699, which have multipass membrane regions.	51
397687	pfam03734	YkuD	L,D-transpeptidase catalytic domain. This family of proteins are found in a range of bacteria. It has been shown that this domain can act as an L,D-transpeptidase that gives rise to an alternative pathway for peptidoglycan cross-linking. This gives bacteria resistance to beta-lactam antibiotics that inhibit PBPs which usually carry out the cross-linking reaction. The conserved region contains a conserved histidine and cysteine, with the cysteine thought to be an active site residue. Several members of this family contain peptidoglycan binding domains. The molecular structure of YkuD protein shows this domain has a novel tertiary fold consisting of a beta-sandwich with two mixed sheets, one containing five strands and the other, six strands. The two beta-sheets form a cradle capped by an alpha-helix. This family was formerly called the ErfK/YbiS/YcfS/YnhG family, but is now named after the first protein of known structure.	89
397688	pfam03735	ENT	ENT domain. This presumed domain is named after Emsy N-terminus (ENT). Emsy is a protein that is amplified in breast cancer and interacts with BRCA2. The N-terminus of this protein is found to be similar to other vertebrate and plant proteins of unknown function. This domain has a completely conserved histidine residue that may be functionally important.	68
397689	pfam03736	EPTP	EPTP domain. Mutations in the LGI/Epitempin gene can result in a special form of epilepsy, autosomal dominant lateral temporal epilepsy. The Epitempin protein contains a large repeat in its C terminal section. The architecture and structural features of this repeat make it a likely member of the 7-bladed beta-propeller fold.	41
397690	pfam03737	RraA-like	Aldolase/RraA. Members of this family include regulator of ribonuclease E activity A (RraA) and 4-hydroxy-4-methyl-2-oxoglutarate (HMG)/4-carboxy-4-hydroxy-2-oxoadipate (CHA) aldolase, also known as RraA-like protein. RraA acts as a trans-acting modulator of RNA turnover, binding essential endonuclease RNase E and inhibiting RNA processing. RraA-like proteins seem to contain aldolase and/or decarboxylase activity either in place of or in addition to the RNase E inhibitor functions.	149
397691	pfam03738	GSP_synth	Glutathionylspermidine synthase preATP-grasp. This region contains the Glutathionylspermidine synthase enzymatic activity EC:6.3.1.8. This is the C-terminal region in bi-enzymes. Glutathionylspermidine (GSP) synthetases of Trypanosomatidae and Escherichia coli couple hydrolysis of ATP (to ADP and Pi) with formation of an amide bond between spermidine and the glycine carboxylate of glutathione (gamma-Glu-Cys-Gly). In the pathogenic trypanosomatids, this reaction is the penultimate step in the biosynthesis of the antioxidant metabolite, trypanothione (N1,N8-bis-(glutathionyl)spermidine), and is a target for drug design. This region, the pre-ATP grasp region, probably carries the substrate-binding site.	369
397692	pfam03739	YjgP_YjgQ	Predicted permease YjgP/YjgQ family. Members of this family are predicted integral membrane proteins of unknown function. They are about 350 amino acids long and contain about 6 transmembrane regions. They are predicted to be permeases although there is no verification of this.	351
397693	pfam03740	PdxJ	Pyridoxal phosphate biosynthesis protein PdxJ. Members of this family belong to the PdxJ family that catalyzes the condensation of 1-deoxy-d-xylulose-5-phosphate (DXP) and 1-amino-3-oxo-4-(phosphohydroxy)propan-2-one to form pyridoxine 5'-phosphate (PNP). This reaction is involved in de novo synthesis of pyridoxine (vitamin B6) and pyridoxal phosphate.	234
397694	pfam03741	TerC	Integral membrane protein TerC family. This family contains a number of integral membrane proteins, including the TerC protein. TerC has been implicated in resistance to tellurium. This protein may be involved in efflux of tellurium ions. The tellurite-resistant Escherichia coli strain KL53 was found during testing of a group of clinical isolates for antibiotic and heavy metal ion resistance. The determinant of the tellurite resistance of the strain was located on a large conjugative plasmid. Analyses showed that the genes terB, terC, terD and terE are essential for conservation of the resistance. The members of the family contain a number of conserved aspartates that could be involved in binding to metal ions.	179
397695	pfam03742	PetN	PetN. PetN is a small hydrophobic protein, crucial for cytochrome b6-f complex assembly and/or stability.	29
397696	pfam03743	TrbI	Bacterial conjugation TrbI-like protein. Although not essential for conjugation, the TrbI protein greatly increases the conjugational efficiency.	186
397697	pfam03744	BioW	6-carboxyhexanoate--CoA ligase. This family contains the enzyme 6-carboxyhexanoate--CoA ligase EC:6.2.1.14. This enzyme is involved in the first step of biotin synthesis, where it converts pimelate into pimeloyl-CoA. The enzyme requires magnesium as a cofactor and forms a homodimer.	239
397698	pfam03745	DUF309	Domain of unknown function (DUF309). This domain is found in eubacterial and archaebacterial proteins of unknown function. The proteins contain a motif HXXXEXX(W/Y) where X can be any amino acid. This motif is likely to be functionally important and may be involved in metal binding.	59
397699	pfam03746	LamB_YcsF	LamB/YcsF family. This family includes LamB. The lam locus of Aspergillus nidulans consists of two divergently transcribed genes, lamA and lamB, involved in the utilisation of lactams such as 2-pyrrolidinone. Both genes are under the control of the positive regulatory gene amdR and are subject to carbon and nitrogen metabolite repression. The exact molecular function of the proteins in this family is unknown.	238
397700	pfam03747	ADP_ribosyl_GH	ADP-ribosylglycohydrolase. This family includes enzymes that reverse ADP-ribosylations, for example ADP-ribosylarginine hydrolase EC:3.2.2.19 cleaves ADP-ribose-L-arginine. The family also includes dinitrogenase reductase activating glycohydrolase. Most surprisingly the family also includes jellyfish crystallins; these proteins appear to have lost the presumed active site residues.	198
397701	pfam03748	FliL	Flagellar basal body-associated protein FliL. This FliL protein controls the rotational direction of the flagella during chemotaxis. FliL is a cytoplasmic membrane protein associated with the basal body.	98
397702	pfam03749	SfsA	Sugar fermentation stimulation protein. This family contains sugar fermentation stimulation proteins, which are probably regulatory factors involved in maltose metabolism. SfsA has been shown to bind DNA and it contains a helix-turn-helix motif that probably binds DNA at its C-terminus.	138
397703	pfam03750	Csm2_III-A	Csm2 Type III-A. Clusters of short DNA repeats with non-homologous spacers, which are found at regular intervals in the genomes of phylogenetically distinct prokaryotic species, comprise a family with recognisable features. This family is known as CRISPR (short for Clustered Regularly Interspaced Short Palindromic Repeats). A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-associated) proteins. This entry represents Csm2 Type III-A, a family of Cas proteins also known as TM1810/Csm2.	112
397704	pfam03752	ALF	Short repeats of unknown function. This set of repeats is found in a small family of secreted proteins of no known function, though they are possibly involved in signal transduction. ALF stands for Alanine-rich (AL) - conserved Phenylalanine (F).	42
367637	pfam03753	HHV6-IE	Human herpesvirus 6 immediate early protein. The proteins in this family are poorly characterized, but an investigation has indicated that the immediate early protein is required for the down-regulation of MHC class I expression in dendritic cells. Human herpesvirus 6 immediate early protein is also referred to as U90.	964
281714	pfam03754	DUF313	Domain of unknown function (DUF313). Family of proteins from Arabidopsis thaliana with uncharacterized function.	113
397705	pfam03755	YicC_N	YicC-like family, N-terminal region. Family of bacterial proteins. Although poorly characterized, the members of this protein family have been demonstrated to play a role in stationary phase survival. These proteins are not essential during stationary phase.	155
367638	pfam03756	AfsA	A-factor biosynthesis hotdog domain. The AfsA family are key enzymes in A-factor biosynthesis, which is essential for streptomycin production and resistance. This domain is distantly related to the thioester dehydratase FabZ family and therefore has a HotDog domain.	133
397706	pfam03759	PRONE	PRONE (Plant-specific Rop nucleotide exchanger). This is a functional guanine exchange factor (GEF) of plant Rho GTPase.	361
397707	pfam03760	LEA_1	Late embryogenesis abundant (LEA) group 1. Family members are conserved along the entire coding region, especially within the hydrophobic internal 20 amino acid motif, which may be repeated.	71
367641	pfam03761	DUF316	Chymotrypsin family Peptidase-S1. This is a family of trypsin-6 part of the chymotrypsin family S21, ie a serine peptidase. The C. elegans sequence UniProt:O01566 is trypsin-6: all the active site residues are present (His90, Asp168, Ser267).	281
397708	pfam03762	VOMI	Vitelline membrane outer layer protein I (VOMI). VOMI binds tightly to ovomucin fibrils of the egg yolk membrane. The structure consists of three beta-sheets forming Greek key motifs, which are related by an internal pseudo three-fold symmetry. Furthermore, the structure of VOMI has strong similarity to the structure of the delta-endotoxin, as well as a carbohydrate-binding site in the top region of the common fold.	168
397709	pfam03763	Remorin_C	Remorin, C-terminal region. Remorins are plant-specific plasma membrane-associated proteins. In tobacco remorin co-purifies with lipid rafts. Most remorins have a variable, proline-rich N-half and a more conserved C-half that is predicted to form coiled coils. Consistent with this, circular dichroism studies have demonstrated that much of the protein is alpha-helical. Remorins exist in plasma membrane preparations as oligomeric structures and form filaments in vitro. The proteins can bind polyanions including the extracellular matrix component oligogalacturonic acid (OGA). In vitro, remorin in plasma membrane preparations is phosphorylated (principally on threonine residues) in the presence of OGA and thus co-purifies with a protein kinase(s). The biological functions of remorins are unknown but roles as components of the membrane/cytoskeleton are possible.	106
397710	pfam03764	EFG_IV	Elongation factor G, domain IV. This domain is found in elongation factor G, elongation factor 2 and some tetracycline resistance proteins and adopts a ribosomal protein S5 domain 2-like fold.	121
397711	pfam03765	CRAL_TRIO_N	CRAL/TRIO, N-terminal domain. This all-alpha domain is found to the N-terminus of pfam00650.	53
367646	pfam03766	Remorin_N	Remorin, N-terminal region. Remorins are plant-specific plasma membrane-associated proteins. In tobacco remorin co-purifies with lipid rafts. Most remorins have a variable, proline-rich C-half and a more conserved N-half that is predicted to form coiled coils. Consistent with this, circular dichroism studies have demonstrated that much of the protein is alpha-helical. Remorins exist in plasma membrane preparations as oligomeric structures and form filaments in vitro. The proteins can bind polyanions including the extracellular matrix component oligogalacturonic acid (OGA). In vitro, remorin in plasma membrane preparations is phosphorylated (principally on threonine residues) in the presence of OGA and thus co-purifies with a protein kinase(s). The biological functions of remorins are unknown but roles as components of the membrane/cytoskeleton are possible.	51
397712	pfam03767	Acid_phosphat_B	HAD superfamily, subfamily IIIB (Acid phosphatase). This family of proteins includes acid phosphatases and a number of vegetative storage proteins.	213
397713	pfam03768	Attacin_N	Attacin, N-terminal region. This family includes attacin and sarcotoxin, but not diptericin (which shares similarity to the C-terminal region of attacin). All members of this family are insect antibacterial proteins which are induced by the fat body and subsequently secreted into the hemolymph where they act synergistically to kill the invading microorganism.	64
397714	pfam03769	Attacin_C	Attacin, C-terminal region. This family includes attacin, sarcotoxin and diptericin. All members of this family are insect antibacterial proteins which are induced by the fat body and subsequently secreted into the hemolymph where they act synergistically to kill the invading microorganism.	120
397715	pfam03770	IPK	Inositol polyphosphate kinase. ArgRIII has been demonstrated to be an inositol polyphosphate kinase.	190
367651	pfam03771	SPDY	Domain of unknown function (DUF317). This is a sequence family found in a set of bacterial proteins with no known function. This domain is currently only found in Streptomyces bacteria. Most proteins contain two copies of this domain.	59
397716	pfam03772	Competence	Competence protein. Members of this family are integral membrane proteins with 6 predicted transmembrane helices. Some members of this family have been shown to be essential for bacterial competence in uptake of extracellular DNA. These proteins may transport DNA across the cell membrane. These proteins contain a highly conserved motif in the amino terminal transmembrane region that has two histidines that may form a metal binding site.	270
281730	pfam03773	ArsP_1	Predicted permease. This family of integral membrane proteins are predicted to be permeases of unknown specificity.	316
397717	pfam03775	MinC_C	Septum formation inhibitor MinC, C-terminal domain. In Escherichia coli FtsZ assembles into a Z ring at midcell while assembly at polar sites is prevented by the min system. MinC, a component of this system, is an inhibitor of FtsZ assembly that is positioned within the cell by interaction with MinDE. MinC is an oligomer, probably a dimer. The C terminal half of MinC is the most conserved and interacts with MinD. The N terminal half is thought to interact with FtsZ.	97
397718	pfam03776	MinE	Septum formation topological specificity factor MinE. The E. coli minicell locus was shown to code for three gene products (MinC, MinD, and MinE) whose coordinate action is required for proper placement of the division septum. The minE gene codes for a topological specificity factor that, in wild-type cells, prevents the division inhibitor from acting at internal division sites while permitting it to block septation at polar sites.	67
397719	pfam03777	DUF320	Small secreted domain (DUF320). Small domain found in a family of secreted Streptomyces proteins. It occurs singly or as a pair. Many of the domains have two cysteines that may form a disulphide bridge.	55
112584	pfam03778	DUF321	Protein of unknown function (DUF321). This family may be related to the FARP (FMRFamide) family, pfam01581. Currently this repeat was only detectable in Arabidopsis thaliana.	20
397720	pfam03779	SPW	SPW repeat. A short repeat found in a small family of membrane-bound proteins. This repeat contains a conserved SPW motif in the first of two transmembrane helices.	48
397721	pfam03780	Asp23	Asp23 family, cell envelope-related function. The alkaline shock protein Asp23 was identified as an alkaline shock protein that was expressed in a sigmaB-dependent manner in Staphylococcus aureus. Following an alkaline shock Asp23 accumulates in the soluble protein fraction of the S. aureus cell. Asp23 is one of the most abundant proteins in the cytosolic protein fraction of stationary S. aureus cells, with a copy-number of >25000 per cell. A second Asp23-family protein, AmaP, which is encoded within the asp23-operon, is required to localize Asp23 to the cell membrane. The overall function for the family is thus a cell envelope-related one in Gram-positive bacteria.	108
397722	pfam03781	FGE-sulfatase	Sulfatase-modifying factor enzyme 1. This domain is found in eukaryotic proteins required for post-translational sulfatase modification (SUMF1). These proteins are associated with the rare disorder multiple sulfatase deficiency (MSD). The protein product of the SUMF1 gene is FGE, the formylglycine (FGly)-generating enzyme, which is a sulfatase. Sulfatases are enzymes essential for degradation and remodelling of sulfate esters, and formylglycine (FGly), the key catalytic residue in the active site, is unique to sulfatases. FGE is localized to the endoplasmic reticulum (ER) and interacts with and modifies the unfolded form of newly synthesized sulfatases. FGE is a single-domain monomer with a surprising paucity of secondary structure that adopts a unique fold which is stabilized by two Ca2+ ions. The effect of all mutations found in MSD patients is explained by the FGE structure, providing a molecular basis for MSD. A redox-active disulfide bond is present in the active site of FGE. An oxidized cysteine residue, possibly cysteine sulfenic acid, has been detected that may allow formulation of a structure-based mechanism for FGly formation from cysteine residues in all sulfatases. In Mycobacteria and Treponema denticola this enzyme functions as an iron(II)-dependent oxidoreductase.	259
397723	pfam03782	AMOP	AMOP domain. This domain may have a role in cell adhesion. It is called the AMOP domain after Adhesion associated domain in MUC4 and Other Proteins. This domain is extracellular and contains a number of cysteines that probably form disulphide bridges.	144
397724	pfam03783	CsgG	Curli production assembly/transport component CsgG. CsgG is an outer membrane-located lipoprotein that is highly resistant to protease digestion. During assembly of curli, an adhesive surface fibre, CsgG is required to maintain the stability of CsgA and CsgB.	209
397725	pfam03784	Cyclotide	Cyclotide family. This family contains a set of cyclic peptides with a variety of activities. The structure consists of a distorted triple-stranded beta-sheet and a cysteine-knot arrangement of the disulfide bonds. Cyclotides can be separated into two subfamilies, namely bracelet and moebius. The bracelet cyclotide subfamily tends to contain a larger number of positively charged residues and has a bracelet-like circularisation of the backbone. The moebius cyclotide subfamily contains a backbone twist due to a cis-Pro peptide bond and may conceptually be regarded as a molecular Moebius strip.	28
367657	pfam03785	Peptidase_C25_C	Peptidase family C25, C terminal ig-like domain. 	74
397726	pfam03786	UxuA	D-mannonate dehydratase (UxuA). UxuA (this family) and UxuB are required for hexuronate degradation.	351
397727	pfam03787	RAMPs	RAMP superfamily. The molecular function of these proteins is not yet known. However, they have been identified and called the RAMP (Repair Associated Mysterious Proteins) superfamily. The members of this family have no known function; they are around 300 amino acids in length and have several conserved motifs.	188
397728	pfam03788	LrgA	LrgA family. This family is uncharacterized. It contains the protein LrgA that has been hypothesized to export murein hydrolases.	94
397729	pfam03789	ELK	ELK domain. This domain is required for the nuclear localization of these proteins. All of these proteins are members of the Tale/Knox homeodomain family, a subfamily within homeobox pfam00046.	22
397730	pfam03790	KNOX1	KNOX1 domain. The MEINOX region is comprised of two domains, KNOX1 and KNOX2. KNOX1 plays a role in suppressing target gene expression. KNOX2, essential for function, is thought to be necessary for homo-dimerization.	40
397731	pfam03791	KNOX2	KNOX2 domain. The MEINOX region is comprised of two domains, KNOX1 and KNOX2. KNOX1 plays a role in suppressing target gene expression. KNOX2, essential for function, is thought to be necessary for homo-dimerization.	48
397732	pfam03792	PBC	PBC domain. The PBC domain is a member of the TALE (three-amino-acid loop extension) superclass of homeodomain proteins.	188
397733	pfam03793	PASTA	PASTA domain. This domain is found at the C termini of several Penicillin-binding proteins and bacterial serine/threonine kinases. It binds the beta-lactam stem, which implicates it in sensing D-alanyl-D-alanine - the PBP transpeptidase substrate. It is a small globular fold consisting of 3 beta-sheets and an alpha-helix. The name PASTA is derived from PBP and Serine/Threonine kinase Associated domain.	63
397734	pfam03795	YCII	YCII-related domain. The majority of proteins in this family consist of a single copy of this domain, though it is also found as a repeat. A strongly conserved histidine and an aspartate suggest that the domain has an enzymatic function. This family also now includes the family formerly known as the DGPF domain (COG3795). Although its function is unknown, it is found fused to a sigma-70 factor family domain in CC_1329, suggesting that this domain plays a role in transcription initiation (Bateman A pers. obs.). This domain is named after the most conserved motif in the alignment.	94
397735	pfam03796	DnaB_C	DnaB-like helicase C terminal domain. The hexameric helicase DnaB unwinds the DNA duplex at the Escherichia coli chromosome replication fork. Although the mechanism by which DnaB both couples ATP hydrolysis to translocation along DNA and denatures the duplex is unknown, a change in the quaternary structure of the protein involving dimerization of the N-terminal domain has been observed and may occur during the enzymatic cycle. This C-terminal domain contains an ATP-binding site and is therefore probably the site of ATP hydrolysis.	255
397736	pfam03797	Autotransporter	Autotransporter beta-domain. Secretion of protein products occurs by a number of different pathways in bacteria. One of these pathways known as the type V pathway was first described for the IgA1 protease. The protein component that mediates secretion through the outer membrane is contained within the secreted protein itself, hence the proteins secreted in this way are called autotransporters. This family corresponds to the presumed integral membrane beta-barrel domain that transports the protein. This domain is found at the C-terminus of the proteins it occurs in. The N-terminus contains the variable passenger domain that is translocated across the membrane. Once the passenger domain is exported it is cleaved auto-catalytically in some proteins, in others a different protease is used and in some cases no cleavage occurs.	254
397737	pfam03798	TRAM_LAG1_CLN8	TLC domain. 	200
397738	pfam03799	FtsQ	Cell division protein FtsQ. FtsQ is one of several cell division proteins. FtsQ interacts with other Fts proteins. The precise function of FtsQ is unknown.	111
397739	pfam03800	Nuf2	Nuf2 family. Members of this family are components of the mitotic spindle. It has been shown that Nuf2 from yeast is part of a complex called the Ndc80p complex. This complex is thought to bind to the microtubules of the spindle. An arabidopsis protein has been included in this family that has previously not been identified as a member of this family. The match is not strong, but in common with other members of this family contains coiled-coil to the C-terminus of this region.	139
397740	pfam03801	Ndc80_HEC	HEC/Ndc80p family. Members of this family are components of the mitotic spindle. It has been shown that Ndc80/HEC from yeast is part of a complex called the Ndc80p complex. This complex is thought to bind to the microtubules of the spindle.	156
397741	pfam03802	CitX	Apo-citrate lyase phosphoribosyl-dephospho-CoA transferase. 	164
252175	pfam03803	Scramblase	Scramblase. Scramblase is palmitoylated and contains a potential protein kinase C phosphorylation site. Scramblase exhibits Ca2+-activated phospholipid scrambling activity in vitro. There are also possible SH3 and WW binding motifs. Scramblase is involved in the redistribution of phospholipids after cell activation or injury.	221
281756	pfam03804	DUF325	Viral domain of unknown function. 	73
367667	pfam03805	CLAG	Cytoadherence-linked asexual protein. Clag (cytoadherence linked asexual gene) is a malaria surface protein which has been shown to be involved in the binding of Plasmodium falciparum infected erythrocytes to host endothelial cells, a process termed cytoadherence. The cytoadherence phenomenon is associated with the sequestration of infected erythrocytes in the blood vessels of the brain, cerebral malaria. Clag is a multi-gene family in Plasmodium falciparum with at least 9 members identified to date. Orthologous proteins in the rodent malaria species Plasmodium chabaudi (Lawson D Unpubl. obs.) suggest that the gene family is found in other malaria species and may play a more generic role in cytoadherence.	1286
397742	pfam03806	ABG_transport	AbgT putative transporter family. 	502
397743	pfam03807	F420_oxidored	NADP oxidoreductase coenzyme F420-dependent. 	92
397744	pfam03808	Glyco_tran_WecB	Glycosyl transferase WecB/TagA/CpsF family. 	169
397745	pfam03810	IBN_N	Importin-beta N-terminal domain. 	72
281762	pfam03811	Zn_Tnp_IS1	InsA N-terminal domain. This appears to be a short zinc binding domain found in IS1 InsA family protein. It is found at the N-terminus of the protein and may be a DNA-binding domain.	35
397746	pfam03812	KdgT	2-keto-3-deoxygluconate permease. 	298
397747	pfam03813	Nrap	Nrap protein domain 1. Members of this family are nucleolar RNA-associated proteins (Nrap) which are highly conserved from yeast (Saccharomyces cerevisiae) to human. In the mouse, Nrap is ubiquitously expressed and is specifically localized in the nucleolus. Nrap is a large nucleolar protein (of more than 1000 amino acids). Nrap appears to be associated with ribosome biogenesis by interacting with pre-rRNA primary transcript. This domain has a nucleotidyltransferase structure.	144
397748	pfam03814	KdpA	Potassium-transporting ATPase A subunit. 	549
397749	pfam03815	LCCL	LCCL domain. 	96
397750	pfam03816	LytR_cpsA_psr	Cell envelope-related transcriptional attenuator domain. 	149
397751	pfam03817	MadL	Malonate transporter MadL subunit. 	117
397752	pfam03818	MadM	Malonate/sodium symporter MadM subunit. 	55
397753	pfam03819	MazG	MazG nucleotide pyrophosphohydrolase domain. This domain is about 100 amino acid residues in length. It is found in the MazG protein from E. coli. It contains four conserved negatively charged residues that probably form an active site or metal binding site. This domain is found in isolation in some proteins as well as associated with pfam00590. This domain is clearly related to pfam01503 another pyrophosphohydrolase involved in histidine biosynthesis. This family may be structurally related to the NUDIX domain pfam00293 (Bateman A pers. obs.).	74
397754	pfam03820	Mtc	Tricarboxylate carrier. 	319
397755	pfam03821	Mtp	Golgi 4-transmembrane spanning transporter. 	231
397756	pfam03822	NAF	NAF domain. 	56
397757	pfam03823	Neurokinin_B	Neurokinin B. 	58
397758	pfam03824	NicO	High-affinity nickel-transport protein. High affinity nickel transporters involved in the incorporation of nickel into H2-uptake hydrogenase and urease enzymes. Essential for the expression of catalytically active hydrogenase and urease. Ion uptake is dependent on proton motive force. HoxN in Alcaligenes eutrophus is thought to be an integral membrane protein with seven transmembrane helices. The family also includes a cobalt transporter.	287
281776	pfam03825	Nuc_H_symport	Nucleoside H+ symporter. 	400
397759	pfam03826	OAR	OAR domain. 	19
397760	pfam03827	Orexin_rec2	Orexin receptor type 2. 	57
397761	pfam03828	PAP_assoc	Cid1 family poly A polymerase. This domain is found in poly(A) polymerases and has been shown to have polynucleotide adenylyltransferase activity. Proteins in this family have been located to both the nucleus and the cytoplasm.	60
397762	pfam03829	PTSIIA_gutA	PTS system glucitol/sorbitol-specific IIA component. 	113
397763	pfam03830	PTSIIB_sorb	PTS system sorbose subfamily IIB component. 	148
397764	pfam03831	PhnA	PhnA domain. 	69
397765	pfam03832	WSK	WSK motif. This short motif is named after three conserved residues found in a WXSXK motif in protein kinase A anchoring proteins.	29
397766	pfam03833	PolC_DP2	DNA polymerase II large subunit DP2. 	866
397767	pfam03834	Rad10	Binding domain of DNA repair protein Ercc1 (rad10/Swi10). Ercc1 and XPF (xeroderma pigmentosum group F-complementing protein) are two structure-specific endonucleases of a class of seven containing an ERCC4 domain. Together they form an obligate complex that functions primarily in nucleotide excision repair (NER), a versatile pathway able to detect and remove a variety of DNA lesions induced by UV light and environmental carcinogens, and secondarily in DNA interstrand cross-link repair and telomere maintenance. This domain in fact binds simultaneously to both XPF and single-stranded DNA; this ternary complex explains the important role of Ercc1 in targeting its catalytic XPF partner to the NER pre-incision complex.	114
397768	pfam03835	Rad4	Rad4 transglutaminase-like domain. 	144
397769	pfam03836	RasGAP_C	RasGAP C-terminus. This domain can be found in the C-terminus of the IQGAP family members, including human IQGAP1/2/3, S. cerevisiae Iqg1 and S. pombe Rng2. Some members function in cytoskeletal remodelling. Human IQGAP1 is a scaffolding protein that can assemble multi-protein complexes involved in cell-cell interaction, cell adherence, and movement via actin/tubulin-based cytoskeletal reorganization. IQGAP1 is also a regulator of the MAPK and Wnt/beta-catenin signaling pathways. Iqg1 and Rng2 are required for actomyosin ring construction during cytokinesis.	140
397770	pfam03837	RecT	RecT family. The DNA single-strand annealing proteins (SSAPs), such as RecT, Red-beta, ERF and Rad52, function in RecA-dependent and RecA-independent DNA recombination pathways. This family includes proteins related to RecT.	192
397771	pfam03838	RecU	Recombination protein U. 	162
397772	pfam03839	Sec62	Translocation protein Sec62. 	213
397773	pfam03840	SecG	Preprotein translocase SecG subunit. 	69
309101	pfam03841	SelA	L-seryl-tRNA selenium transferase. 	367
146463	pfam03842	Silic_transp	Silicon transporter. 	513
397774	pfam03843	Slp	Outer membrane lipoprotein Slp family. 	151
281794	pfam03845	Spore_permease	Spore germination protein. 	321
397775	pfam03846	SulA	Cell division inhibitor SulA. 	111
367691	pfam03847	TFIID_20kDa	Transcription initiation factor TFIID subunit A. 	68
397776	pfam03848	TehB	Tellurite resistance protein TehB. 	193
397777	pfam03849	Tfb2	Transcription factor Tfb2. 	362
397778	pfam03850	Tfb4	Transcription factor Tfb4. This family appears to be distantly related to the VWA domain.	273
252205	pfam03851	UvdE	UV-endonuclease UvdE. 	275
281799	pfam03852	Vsr	DNA mismatch endonuclease Vsr. 	74
397779	pfam03853	YjeF_N	YjeF-related protein N-terminus. YjeF-N domain is a novel version of the Rossmann fold with a set of catalytic residues and structural features that are different from the conventional dehydrogenases. YjeF-N domain is fused to Ribokinases in bacteria (YjeF), where they may be phosphatases, and to divergent Sm and the FDF domain in eukaryotes (Dcp3p and FLJ21128), where they may be involved in decapping and catalyze hydrolytic RNA-processing reactions.	168
281801	pfam03854	zf-P11	P-11 zinc finger. 	50
281802	pfam03855	M-factor	M-factor. The M-factor is a pheromone produced upon nitrogen starvation. The production of M-factor is increased by the pheromone signal. The protein undergoes post-translational modification to remove the C-terminal signal peptide; the carboxy-terminal cysteine residue is carboxy-methylated and S-alkylated with a farnesyl residue.	43
397780	pfam03856	SUN	Beta-glucosidase (SUN family). Members of this family include Nca3, Sun4 and Sim1. This is a family of yeast proteins, involved in a diverse set of functions (DNA replication, aging, mitochondrial biogenesis and cell septation). BGLA from Candida wickerhamii has been characterized as a Beta-glucosidase EC:3.2.1.21.	244
367697	pfam03857	Colicin_im	Colicin immunity protein. Colicin immunity proteins are plasmid-encoded proteins necessary for protecting the cell against colicins. Colicins are toxins released by bacteria during times of stress.	138
367698	pfam03858	Crust_neuro_H	Crustacean neurohormone H. These proteins are referred to as precursor-related peptides as they are typically co-transcribed and translated with the CHH neurohormone (pfam01147). However, in some species this neuropeptide is synthesized as a separate protein. Furthermore, neurohormone H can undergo proteolysis to give rise to 5 different neuropeptides.	41
397781	pfam03859	CG-1	CG-1 domain. CG-1 domains are highly conserved domains of about 130 amino-acid residues containing a predicted bipartite NLS and named after a partial cDNA clone isolated from parsley encoding a sequence-specific DNA-binding protein. CG-1 domains are associated with CAMTA proteins (for CAlModulin -binding Transcription Activator) that are transcription factors containing a calmodulin -binding domain and ankyrins (ANK) motifs.	114
397782	pfam03860	DUF326	Domain of Unknown Function (DUF326). This family is a small cysteine-rich repeat. The cysteines mostly follow a C-X(2)-C-X(3)-C-X(2)-C-X(3) pattern, though they often appear at other positions in the repeat as well.	21
397783	pfam03861	ANTAR	ANTAR domain. ANTAR (AmiR and NasR transcription antitermination regulators) is an RNA-binding domain found in bacterial transcription antitermination regulatory proteins. The majority of the domain consists of a coiled-coil.	46
397784	pfam03862	SpoVAC_SpoVAEB	SpoVAC/SpoVAEB sporulation membrane protein. Members of this family are all transcribed from the spoVA operon. Bacillus and Clostridium are two well studied endospore forming bacteria. Spore formation provides a resistance mechanism in response to extreme or unfavourable environmental conditions such as heat, radiation, and chemical agents or nutrient deprivation. The reverse process termed germination takes place where spores develop into growing cells in response to nutrient availability or stress reduction. Nutrient germinant receptors (GRs) and the SpoVA proteins are important players in the germination process. In B. subtilis the SpoVAC and SpoVAEB, belonging to this domain family, are predicted to be membrane proteins, with two to five membrane-spanning segments. Biophysical and biochemical studies suggest that SpoVAC acts as a mechano-sensitive channel with properties that would allow the release of Ca-DPA (dipicolinic acid) and amino acids during germination of the spore. The release of Ca-DPA is a crucial event during spore germination. When expressed in E. coli SpoVAC provides protection against osmotic downshift. Furthermore, SpoVAC acts as a channel that facilitates the efflux down the concentration gradient of osmolytes up to a mass of at least 600 Da. Another conserved SpoVA protein in all spore-forming bacteria is SpoVAEb, which appears to be an integral membrane protein with no known function.	116
281809	pfam03863	Phage_mat-A	Phage maturation protein. 	446
397785	pfam03864	Phage_cap_E	Phage major capsid protein E. Major capsid protein E is involved with the stabilisation of the condensed form of the DNA molecule in phage heads.	335
397786	pfam03865	ShlB	Haemolysin secretion/activation protein ShlB/FhaC/HecB. This family represents a group of sequences that are related to ShlB from Serratia marcescens. ShlB is an outer membrane protein pore involved in the Type Vb or Two-partner secretion system where it functions to secrete and activate the haemolysin ShlA. The activation of ShlA occurs during secretion when ShlB imposes a conformational change in the inactive haemolysin to form the active protein.	316
146478	pfam03866	HAP	Hydrophobic abundant protein (HAP). Expression of HAP is thought to be developmentally regulated and possibly involved in spherule cell wall formation.	167
397787	pfam03867	FTZ	Fushi tarazu (FTZ), N-terminal region. This region contains the important motif (LXXLL) necessary for the interaction of FTZ with the nuclear receptor FTZ-F1. FTZ is thought to represent a category of LXXLL motif-dependent co-activators for nuclear receptors.	269
397788	pfam03868	Ribosomal_L6e_N	Ribosomal protein L6, N-terminal domain. 	60
397789	pfam03869	Arc	Arc-like DNA binding domain. Arc repressor acts by the cooperative binding of two Arc repressor dimers to a 21-base-pair operator site. Each Arc dimer uses an antiparallel beta-sheet to recognize bases in the major groove.	50
397790	pfam03870	RNA_pol_Rpb8	RNA polymerase Rpb8. Rpb8 is a subunit common to the three yeast RNA polymerases, pol I, II and III. Rpb8 interacts with the largest subunit Rpb1, and with Rpb3 and Rpb11, two smaller subunits.	136
397791	pfam03871	RNA_pol_Rpb5_N	RNA polymerase Rpb5, N-terminal domain. Rpb5 has a bipartite structure which includes a eukaryote-specific N-terminal domain and a C-terminal domain resembling the archaeal RNAP subunit H. The N-terminal domain is involved in DNA binding and is part of the jaw module in the RNA pol II structure. This module is important for positioning the downstream DNA.	89
397792	pfam03872	RseA_N	Anti sigma-E protein RseA, N-terminal domain. Sigma-E is important for the induction of proteins involved in heat shock response. RseA binds sigma-E via its N-terminal domain, sequestering sigma-E and preventing transcription from heat-shock promoters. The C-terminal domain is located in the periplasm, and may interact with other proteins that signal periplasmic stress.	87
397793	pfam03873	RseA_C	Anti sigma-E protein RseA, C-terminal domain. Sigma-E is important for the induction of proteins involved in heat shock response. RseA binds sigma-E via its N-terminal domain, sequestering sigma-E and preventing transcription from heat-shock promoters. The C-terminal domain is located in the periplasm, and may interact with other proteins that signal periplasmic stress.	53
397794	pfam03874	RNA_pol_Rpb4	RNA polymerase Rpb4. This family includes the Rpb4 protein. This family also includes C17 (aka CGRP-RCP), an essential subunit of RNA polymerase III. C17 forms a subcomplex with C25 which is likely to be the counterpart of subcomplex Rpb4/7 in Pol II.	115
397795	pfam03875	Statherin	Statherin. Statherin functions biologically to inhibit the nucleation and growth of calcium phosphate minerals. The N-terminus of statherin is highly charged, the glutamic acids of which have been shown to be important in the recognition of hydroxyapatite.	41
397796	pfam03876	SHS2_Rpb7-N	SHS2 domain found in N-terminus of Rpb7p/Rpc25p/MJ0397. Rpb7 binds to Rpb4 to form a heterodimer. This complex is thought to interact with the nascent RNA strand during RNA polymerase II elongation. This family includes the homologs from RNA polymerase I and III. In RNA polymerase I, Rpa43 is at least one of the subunits contacted by the transcription factor TIF-IA. The N-terminus of Rpb7p/Rpc25p/MJ0397 has a SHS2 domain that is involved in protein-protein interaction.	57
397797	pfam03878	YIF1	YIF1. YIF1 (Yip1 interacting factor) is an integral membrane protein that is required for membrane fusion of ER derived vesicles. It also plays a role in the biogenesis of ER derived COPII transport vesicles.	243
397798	pfam03879	Cgr1	Cgr1 family. Members of this family are coiled-coil proteins that are involved in pre-rRNA processing.	107
397799	pfam03880	DbpA	DbpA RNA binding domain. This RNA binding domain is found at the C-terminus of a number of DEAD helicase proteins. It is sufficient to confer specificity for hairpin 92 of 23S rRNA, which is part of the ribosomal A-site. However, several members of this family lack specificity for 23S rRNA. These proteins can generally be distinguished by a basic region that extends beyond this domain [Karl Kossen, unpublished data].	72
397800	pfam03881	Fructosamin_kin	Fructosamine kinase. This family includes eukaryotic fructosamine-3-kinase enzymes. The family also includes bacterial members that have not been characterized but probably have a similar or identical function.	287
397801	pfam03882	KicB	MukF winged-helix domain. The kicA and kicB genes are found upstream of mukB. It has been suggested that the kicB gene encodes a killing factor and the kicA gene codes for a protein that suppresses the killing function of the kicB gene product. It was also demonstrated that KicA and KicB can function as a post-segregational killing system, when the genes are transferred from the E. coli chromosome onto a plasmid.	115
397802	pfam03883	H2O2_YaaD	Peroxide stress protein YaaA. YaaA is a key element of the stress response to H2O2. It acts by reducing intracellular iron levels after peroxide stress, thereby attenuating the Fenton reaction and the DNA damage that this would cause. The molecular mechanism of action is not known.	233
397803	pfam03884	YacG	DNA gyrase inhibitor YacG. YacG inhibits all the catalytic activities of DNA gyrase by preventing its interaction with DNA. It acts by binding directly to the C-terminal domain of GyrB, which probably disrupts DNA binding by the gyrase. YacG has been shown to bind zinc and contains the structural motifs typical of zinc-binding proteins. The conserved four cysteine motif in this protein (-C-X(2)-C-X(15)-C-X(3)-C-) is not found in other zinc-binding proteins with known structures.	48
397804	pfam03885	DUF327	Protein of unknown function (DUF327). The proteins in this family are around 140-170 residues in length. The proteins contain many conserved residues, with the most conserved motifs found in the central and C-terminal region. The function of these proteins is unknown.	141
397805	pfam03886	ABC_trans_aux	ABC-type transport auxiliary lipoprotein component. ABC_trans_aux is a family of bacterial proteins that act as auxiliaries to the ABC-transporter in the gamma-hexachlorocyclohexane uptake permease system in Sphingobium japonicum. Gamma-hexachlorocyclohexane, or Lindane, can be used as the sole source of carbon in S. japonicum in aerobic conditions. Lindane is an insecticide.	154
397806	pfam03887	YfbU	YfbU domain. This presumed domain is about 160 residues long. It is found in archaebacteria and eubacteria. In Corynebacterium glutamicum Ycg4L it is associated with a helix-turn-helix domain. This suggests that this may be a ligand binding domain.	165
397807	pfam03888	MucB_RseB	MucB/RseB N-terminal domain. Members of this family are regulators of the anti-sigma E protein RseD.	178
397808	pfam03889	DUF331	Domain of unknown function. Members of this family are uncharacterized proteins from a number of bacterial species. The proteins range in size from 50-70 residues.	44
397809	pfam03891	DUF333	Domain of unknown function (DUF333). This small domain of about 70 residues is found in a number of bacterial proteins. It is found at the N-terminus of the AF_1947 protein. The proteins containing this domain are uncharacterized.	46
397810	pfam03892	NapB	Nitrate reductase cytochrome c-type subunit (NapB). The napB gene encodes a dihaem cytochrome c, the small subunit of a heterodimeric periplasmic nitrate reductase.	122
309136	pfam03893	Lipase3_N	Lipase 3 N-terminal region. N terminal region to pfam01764, found on a subset of Lipase 3 containing proteins.	76
397811	pfam03894	XFP	D-xylulose 5-phosphate/D-fructose 6-phosphate phosphoketolase. Bacterial enzyme splits fructose-6-P and/or xylulose-5-P with the aid of inorganic phosphate into either acetyl-P and erythrose-4-P and/or acetyl-P and glyceraldehyde-3-P EC:4.1.2.9, EC:4.1.2.22. This family is distantly related to transketolases e.g. pfam02779.	177
397812	pfam03895	YadA_anchor	YadA-like membrane anchor domain. This region represents the C-terminal 120 amino acids of a family of surface-exposed bacterial proteins. YadA, an adhesin from Yersinia, was the first member of this family to be characterized. UspA2 from Moraxella was second. The Eib immunoglobulin-binding proteins from E. coli were third, followed by the DsrA proteins of Haemophilus ducreyi and others. These proteins are homologous at their C-terminal and have predicted signal sequences, but they diverge elsewhere. The C-terminal 9 amino acids, consisting of alternating hydrophobic amino acids ending in F or W, comprise a targeting motif for the outer membrane of the Gram negative cell envelope. This region is important for oligomerization.	60
397813	pfam03896	TRAP_alpha	Translocon-associated protein (TRAP), alpha subunit. The alpha-subunit of the TRAP complex (TRAP alpha) is a single-spanning membrane protein of the endoplasmic reticulum (ER) which is found in proximity of nascent polypeptide chains translocating across the membrane.	279
146498	pfam03898	TNV_CP	Satellite tobacco necrosis virus coat protein. 	198
397814	pfam03899	ATP-synt_I	ATP synthase I chain. The atp operon of alkaliphilic Bacillus pseudofirmus OF4, as in most prokaryotes, contains the eight structural genes for the F-ATPase (ATP synthase), which are preceded by an atpI gene that encodes a membrane protein with 2 TMSs. A tenth gene, atpZ, has been found in this operon, which is upstream of and overlapping with atpI. AtpI is a Ca2+/Mg2+ transporter.	99
397815	pfam03900	Porphobil_deamC	Porphobilinogen deaminase, C-terminal domain. 	72
281842	pfam03901	Glyco_transf_22	Alg9-like mannosyltransferase family. Members of this family are mannosyltransferase enzymes. At least some members are localized in endoplasmic reticulum and involved in GPI anchor biosynthesis.	414
397816	pfam03902	Gal4_dimer	Gal4-like dimerization domain. 	44
397817	pfam03903	Phage_T4_gp36	Phage T4 tail fibre. 	131
112704	pfam03904	DUF334	Domain of unknown function (DUF334). Staphylococcus aureus plasmid proteins with no characterized function.	229
146503	pfam03905	Corona_NS4	Coronavirus non-structural protein NS4. 	45
281845	pfam03906	Phage_T7_tail	Phage T7 tail fibre protein. The bacteriophage T7 tail complex consists of a conical tail-tube surrounded by six kinked tail-fibers, which are oligomers of the viral protein gp17.	157
397818	pfam03907	Spo7	Spo7-like protein. S. cerevisiae Spo7 has an unknown function, but has a role in formation of a spherical nucleus and meiotic division.	205
112708	pfam03908	Sec20	Sec20. Sec20 is a membrane glycoprotein associated with secretory pathway.	92
397819	pfam03909	BSD	BSD domain. This domain contains a distinctive -FW- motif. It is found in a family of eukaryotic transcription factors as well as a set of proteins of unknown function.	55
367721	pfam03910	Adeno_PV	Adenovirus minor core protein PV. 	355
397820	pfam03911	Sec61_beta	Sec61beta family. This family consists of homologs of Sec61beta - a component of the Sec61/SecYEG protein secretory system. The domain is found in eukaryotes and archaea and is possibly homologous to the bacterial SecG. It consists of a single putative transmembrane helix, preceded by a short stretch containing various charged residues; this arrangement may help determine orientation in the cell membrane.	38
397821	pfam03912	Psb28	Psb28 protein. Psb28 is a 13 kDa soluble protein that is directly assembled in dimeric PSII supercomplexes. The negatively charged N-terminal region is essential for this process. This protein was formerly known as PsbW, but PsbW is now reserved for pfam07123.	106
112713	pfam03913	Amb_V_allergen	Amb V Allergen. 	44
397822	pfam03914	CBF	CBF/Mak21 family. 	152
397823	pfam03915	AIP3	Actin interacting protein 3. 	407
397824	pfam03916	NrfD	Polysulphide reductase, NrfD. NrfD is an integral transmembrane protein with loops in both the periplasm and the cytoplasm. NrfD is thought to participate in the transfer of electrons, from the quinone pool into the terminal components of the Nrf pathway.	313
397825	pfam03917	GSH_synth_ATP	Eukaryotic glutathione synthase, ATP binding domain. 	475
397826	pfam03918	CcmH	Cytochrome C biogenesis protein. Members of this family include NrfF, CcmH, CycL, Ccl2.	143
397827	pfam03919	mRNA_cap_C	mRNA capping enzyme, C-terminal domain. 	108
397828	pfam03920	TLE_N	Groucho/TLE N-terminal Q-rich domain. The N-terminal domain of the Groucho/TLE co-repressor proteins is involved in oligomerization.	125
397829	pfam03921	ICAM_N	Intercellular adhesion molecule (ICAM), N-terminal domain. ICAMs normally function to promote intercellular adhesion and signalling. However, the N-terminal domain of the receptor binds to the rhinovirus 'canyon' surrounding the icosahedral 5-fold axes, during the viral attachment process. This family is part of the Ig superfamily and is therefore related to the family ig (pfam00047).	86
397830	pfam03922	OmpW	OmpW family. This family includes outer membrane protein W (OmpW) proteins from a variety of bacterial species. This protein may form the receptor for S4 colicins in E. coli.	193
397831	pfam03923	Lipoprotein_16	Uncharacterized lipoprotein. The function of this presumed lipoprotein is unknown. The family includes E. coli YajG.	149
397832	pfam03924	CHASE	CHASE domain. This domain is found in the extracellular portion of receptor-like proteins - such as serine/threonine kinases and adenylyl cyclases. Predicted to be a ligand binding domain.	186
397833	pfam03925	SeqA	SeqA protein C-terminal domain. The binding of SeqA protein to hemimethylated GATC sequences is important in the negative modulation of chromosomal initiation at oriC, and in the formation of SeqA foci necessary for Escherichia coli chromosome segregation. SeqA tetramers are able to aggregate or multimerize in a reversible, concentration-dependent manner. Apart from its function in the control of DNA replication, SeqA may also be a specific transcription factor.	110
397834	pfam03927	NapD	NapD protein. Uncharacterized protein involved in formation of periplasmic nitrate reductase.	71
397835	pfam03928	Haem_degrading	Haem-degrading. Haem_bdg is a bacterial protein that is up-regulated in response to haemin- and peroxide-based oxidative stress. It interacts with the SenS/SenR two-component signal transduction system. Iron binds to surface-exposed lysine residues of an octameric assembly of the protein.	116
397836	pfam03929	PepSY_TM	PepSY-associated TM region. The PepSY_TM family is so named because it is an alignment of up to five transmembrane helices found in bacterial species, some of which carry a nested PepSY domain, pfam03413.	354
397837	pfam03930	Flp_N	Recombinase Flp protein N-terminus. 	82
397838	pfam03931	Skp1_POZ	Skp1 family, tetramerisation domain. 	60
397839	pfam03932	CutC	CutC family. Copper transport in Escherichia coli is mediated by the products of at least six genes, cutA, cutB, cutC, cutD, cutE, and cutF. A mutation in one or more of these genes results in an increased copper sensitivity. Members of this family are between 200 and 300 amino acids in length and are found in both eukaryotes and bacteria.	201
397840	pfam03934	T2SSK	Type II secretion system (T2SS), protein K. Members of this family are involved in the Type II protein secretion system. The T2SK family includes proteins such as ExeK, PulK, OutX and XcpX.	282
397841	pfam03935	SKN1	Beta-glucan synthesis-associated protein (SKN1). This family consists of the beta-glucan synthesis-associated proteins KRE6 and SKN1. Beta1,6-Glucan is a key component of the yeast cell wall, interconnecting cell wall proteins, beta1,3-glucan, and chitin. It has been postulated that the synthesis of beta1,6-glucan begins in the endoplasmic reticulum with the formation of protein-bound primer structures and that these primer structures are extended in the Golgi complex by two putative glucosyltransferases that are functionally redundant, Kre6 and Skn1. This is followed by maturation steps at the cell surface and by coupling to other cell wall macromolecules.	500
397842	pfam03936	Terpene_synth_C	Terpene synthase family, metal binding domain. It has been suggested that this gene family be designated tps (for terpene synthase). It has been split into six subgroups on the basis of phylogeny, called tpsa-tpsf. tpsa includes vetispiradiene synthase, 5-epi-aristolochene synthase, and (+)-delta-cadinene synthase. tpsb includes (-)-limonene synthase. tpsc includes kaurene synthase A. tpsd includes taxadiene synthase, pinene synthase, and myrcene synthase. tpse includes kaurene synthase B. tpsf includes linalool synthase.	266
397843	pfam03937	Sdh5	Flavinator of succinate dehydrogenase. This family includes the highly conserved mitochondrial and bacterial proteins Sdh5/SDHAF2/SdhE. Both yeast and human Sdh5/SDHAF2 interact with the catalytic subunit of the succinate dehydrogenase (SDH) complex, a component of both the electron transport chain and the tricarboxylic acid cycle. Sdh5 is required for SDH-dependent respiration and for Sdh1 flavination (incorporation of the flavin adenine dinucleotide cofactor). Mutational inactivation of Sdh5 confers tumor susceptibility in humans. Bacterial homologs of Sdh5, termed SdhE, are functionally conserved, being required for the flavinylation of SdhA and succinate dehydrogenase activity. Like Sdh5, SdhE interacts with SdhA. Furthermore, SdhE was characterized as a FAD co-factor chaperone that directly binds FAD to facilitate the flavinylation of SdhA. Phylogenetic analysis demonstrates that SdhE/Sdh5 proteins evolved only once in an ancestral alpha-proteobacterium prior to the evolution of the mitochondria and now remain in subsequent descendants including eukaryotic mitochondria and the alpha, beta and gamma proteobacteria. This family was previously annotated in Pfam as being a divergent TPR repeat but structural evidence has indicated this is not true. The E. coli protein YgfY also acts as the antitoxin to the membrane-bound toxin family Cpta, pfam13166, whose E. coli member YgfX is expressed from the same operon as YgfY.	73
397844	pfam03938	OmpH	Outer membrane protein (OmpH-like). This family includes outer membrane proteins such as OmpH among others. Skp (OmpH) has been characterized as a molecular chaperone that interacts with unfolded proteins as they emerge in the periplasm from the Sec translocation machinery.	140
397845	pfam03939	Ribosomal_L23eN	Ribosomal protein L23, N-terminal domain. The N-terminal domain appears to be specific to the eukaryotic ribosomal proteins L25, L23, and L23a.	50
281873	pfam03940	MSSP	Male specific sperm protein. This family of Drosophila proteins is typified by the repetitive motif C-G-P.	51
397846	pfam03941	INCENP_ARK-bind	Inner centromere protein, ARK binding region. This region of the inner centromere protein has been found to be necessary and sufficient for binding to aurora-related kinase. This interaction has been implicated in the coordination of chromosome segregation with cell division in yeast.	53
397847	pfam03942	DTW	DTW domain. This presumed domain is found in bacterial and eukaryotic proteins. Its function is unknown. The domain contains multiple conserved motifs including a DTXW motif that this domain has been named after.	194
397848	pfam03943	TAP_C	TAP C-terminal domain. The vertebrate Tap protein is a member of the NXF family of shuttling transport receptors for nuclear export of mRNA. Tap has a modular structure, and its most C-terminal domain is important for binding to FG repeat-containing nuclear pore proteins (FG-nucleoporins) and is sufficient to mediate nuclear shuttling. The structure of the C-terminal domain is composed of four helices. The structure is related to the UBA domain.	48
397849	pfam03944	Endotoxin_C	delta endotoxin. This family contains insecticidal toxins produced by Bacillus species of bacteria. During spore formation the bacteria produce crystals of this protein. When an insect ingests these proteins they are activated by proteolytic cleavage. The N-terminus is cleaved in all of the proteins and a C terminal extension is cleaved in some members. Once activated the endotoxin binds to the gut epithelium and causes cell lysis leading to death. This activated region of the delta endotoxin is composed of three structural domains. The N-terminal helical domain is involved in membrane insertion and pore formation. The second and third domains are involved in receptor binding.	142
397850	pfam03945	Endotoxin_N	delta endotoxin, N-terminal domain. This family contains insecticidal toxins produced by Bacillus species of bacteria. During spore formation the bacteria produce crystals of this protein. When an insect ingests these proteins they are activated by proteolytic cleavage. The N-terminus is cleaved in all of the proteins and a C terminal extension is cleaved in some members. Once activated the endotoxin binds to the gut epithelium and causes cell lysis leading to death. This activated region of the delta endotoxin is composed of three structural domains. The N-terminal helical domain is involved in membrane insertion and pore formation. The second and third domains are involved in receptor binding.	218
397851	pfam03946	Ribosomal_L11_N	Ribosomal protein L11, N-terminal domain. The N-terminal domain of Ribosomal protein L11 adopts an alpha/beta fold and is followed by the RNA binding C-terminal domain.	65
397852	pfam03947	Ribosomal_L2_C	Ribosomal Proteins L2, C-terminal domain. 	123
397853	pfam03948	Ribosomal_L9_C	Ribosomal protein L9, C-terminal domain. 	86
397854	pfam03949	Malic_M	Malic enzyme, NAD binding domain. 	257
397855	pfam03950	tRNA-synt_1c_C	tRNA synthetases class I (E and Q), anti-codon binding domain. Other tRNA synthetase sub-families are too dissimilar to be included. This family includes only glutamyl and glutaminyl tRNA synthetases. In some organisms, a single glutamyl-tRNA synthetase aminoacylates both tRNA(Glu) and tRNA(Gln).	175
397856	pfam03951	Gln-synt_N	Glutamine synthetase, beta-Grasp domain. 	82
397857	pfam03952	Enolase_N	Enolase, N-terminal domain. 	131
397858	pfam03953	Tubulin_C	Tubulin C-terminal domain. This family includes the tubulin alpha, beta and gamma chains. Members of this family are involved in polymer formation. Tubulins are GTPases. FtsZ can polymerize into tubes, sheets, and rings in vitro and is ubiquitous in eubacteria and archaea. Tubulin is the major component of microtubules. (The FtsZ GTPases have been split into their own family).	125
397859	pfam03954	Lectin_N	Hepatic lectin, N-terminal domain. 	128
281888	pfam03955	Adeno_PIX	Adenovirus hexon-associated protein (IX). Hexon (PF01065) is the major coat protein from adenovirus type 2. Hexon forms a homo-trimer. The 240 copies of the hexon trimer are organized so that 12 lie on each of the 20 facets. The central 9 hexons in a facet are cemented together by 12 copies of polypeptide IX.	110
397860	pfam03956	Lys_export	Lysine exporter LysO. Members of this family contain a conserved core of four predicted transmembrane segments. Some members have an additional pair of N-terminal transmembrane helices. This family includes lysine exporter LysO (YbjE) from E. coli.	190
397861	pfam03957	Jun	Jun-like transcription factor. 	228
397862	pfam03958	Secretin_N	Bacterial type II/III secretion system short domain. This is a short, often repeated, domain found in bacterial type II/III secretory system proteins. All previous NolW-like domains fall into this family.	68
397863	pfam03959	FSH1	Serine hydrolase (FSH1). This is a family of serine hydrolases.	208
397864	pfam03960	ArsC	ArsC family. This family is related to glutaredoxins pfam00462.	109
397865	pfam03961	FapA	Flagellar Assembly Protein A. Members of this family include FapA (flagellar assembly protein A), found in Vibrio vulnificus. The synthesis of flagella allows bacteria to respond to chemotaxis by facilitating motility. Studies examining the role of FapA show that the loss or delocalization of FapA results in a complete failure of the flagellar biosynthesis and motility in response to glucose mediated chemotaxis. The polar localization of FapA is required for flagellar synthesis, and dephosphorylated EIIAGlc (Glucose-permease IIA component) inhibited the polar localization of FapA through direct interaction.	451
397866	pfam03962	Mnd1	Mnd1 family. This family of proteins includes MND1 from S. cerevisiae. The mnd1 protein forms a complex with hop2 to promote homologous chromosome pairing and meiotic double-strand break repair.	60
397867	pfam03963	FlgD	Flagellar hook capping protein - N-terminal region. FlgD is known to be absolutely required for hook assembly, yet it has not been detected in the mature flagellum. It appears to act as a hook-capping protein to enable assembly of hook protein subunits. FlgD regulates the assembly of the hook cap structure to prevent leakage of hook monomers into the medium and hook monomer polymerization and also plays a role in determination of the correct hook length, with the help of the FliK protein. This family represents the N-terminal conserved region of FlgD. A recent crystal structure showed that this region was likely to be flexible and was cleaved off during crystallisation.	75
281897	pfam03964	Chorion_2	Chorion family 2. The chorion genes of Drosophila are amplified in response to developmental signals in the follicle cells of the ovary.	103
397868	pfam03965	Penicillinase_R	Penicillinase repressor. The penicillinase repressor negatively regulates expression of the penicillinase gene. The N-terminal region of this protein is involved in operator recognition, while the C-terminal is responsible for dimerization of the protein.	115
397869	pfam03966	Trm112p	Trm112p-like protein. The function of this family is uncertain. The bacterial members are about 60-70 amino acids in length and the eukaryotic examples are about 120 amino acids in length. The C-terminus contains the strongest conservation. Trm112p is required for tRNA methylation in S. cerevisiae and is found in complexes with 2 tRNA methylases (TRM9 and TRM11) also with putative methyltransferase YDR140W. The zinc-finger protein Ynr046w is plurifunctional and a component of the eRF1 methyltransferase in yeast. The crystal structure of Ynr046w has been determined to 1.7 A resolution. It comprises a zinc-binding domain built from both the N- and C-terminal sequences and an inserted domain, absent from bacterial and archaeal orthologs of the protein, composed of three alpha-helices.	44
397870	pfam03967	PRCH	Photosynthetic reaction centre, H-chain N-terminal region. The family corresponds to the N-terminal cytoplasmic domain.	133
397871	pfam03968	OstA	OstA-like protein. This family of proteins are mostly uncharacterized. However the family does include E. coli OstA that has been characterized as an organic solvent tolerance protein.	113
397872	pfam03969	AFG1_ATPase	AFG1-like ATPase. This P-loop motif-containing family of proteins includes AFG1, LACE1 and ZapE. ATPase family gene 1 (AFG1) is a 377 amino acid yeast protein with an ATPase motif typical of the family. LACE1, the mammalian homolog of AFG1, is a mitochondrial integral membrane protein that is essential for maintenance of fused mitochondrial reticulum and lamellar cristae morphology. It has also been demonstrated that LACE1 mediates degradation of nuclear-encoded complex IV subunits COX4 (cytochrome c oxidase 4), COX5A and COX6A, and is required for normal activity of complexes III and IV of the respiratory chain. ZapE is a cell division protein found in Gram-negative bacteria. The bacterial cell division process relies on the assembly, positioning, and constriction of FtsZ ring (the so-called Z-ring), a ring-like network that marks the future site of the septum of bacterial cell division. ZapE is a Z-ring associated protein required for cell division under low-oxygen conditions. It is an ATPase that appears at the constricting Z-ring late in cell division. It reduces the stability of FtsZ polymers in the presence of ATP in vitro.	361
281903	pfam03970	Herpes_UL37_1	Herpesvirus UL37 tegument protein. UL37 interacts with UL36, which is thought to be an important early step in tegumentation during virion morphogenesis in the cytoplasm.	267
397873	pfam03971	IDH	Monomeric isocitrate dehydrogenase. NADP(+)-dependent isocitrate dehydrogenase (ICD) is an important enzyme of the intermediary metabolism, as it controls the carbon flux within the citric acid cycle and supplies the cell with 2-oxoglutarate EC:1.1.1.42 and NADPH for biosynthetic purposes.	733
397874	pfam03972	MmgE_PrpD	MmgE/PrpD family. This family includes 2-methylcitrate dehydratase EC:4.2.1.79 (PrpD) that is required for propionate catabolism. It catalyzes the third step of the 2-methylcitric acid cycle.	437
397875	pfam03973	Triabin	Triabin. Triabin is a serine-protease inhibitor with a calycin fold.	147
397876	pfam03974	Ecotin	Ecotin. Ecotin is a broad range serine protease inhibitor, which forms homodimers. The C-terminal region contains the dimerization motif. Interestingly, the binding sites show a fluidity of protein contacts derived from ecotin's innate flexibility in fitting itself to proteases.	122
397877	pfam03975	CheD	CheD chemotactic sensory transduction. This chemotaxis protein stimulates methylation of MCP proteins. The chemotaxis machinery of Bacillus subtilis is similar to that of the well characterized system of Escherichia coli. However, B. subtilis contains several chemotaxis genes not found in the E. coli genome, such as CheC and CheD, indicating that the B. subtilis chemotactic system is more complex. CheD plays an important role in chemotactic sensory transduction for many organisms. CheD deamidates other B. subtilis chemoreceptors including McpB and McpC. Deamidation by CheD is required for B. subtilis chemoreceptors to effectively transduce signals to the CheA kinase. The structure of a complex between the signal-terminating phosphatase, CheC, and the receptor-modifying deamidase, CheD, reveals how CheC mimics receptor substrates to inhibit CheD and how CheD stimulates CheC phosphatase activity. CheD resembles other cysteine deamidases from bacterial pathogens that inactivate host Rho-GTPases. Phospho-CheY, the intracellular signal and CheC target, stabilizes the CheC-CheD complex and reduces availability of CheD. A model is proposed whereby CheC acts as a CheY-P-induced regulator of CheD; CheY-P would cause CheC to sequester CheD from the chemoreceptors, inducing adaptation of the chemotaxis system.	106
397878	pfam03976	PPK2	Polyphosphate kinase 2 (PPK2). Inorganic polyphosphate (polyP) plays a role in metabolism and regulation and has been proposed to serve as an energy source in a pre-ATP world. In prokaryotes, the synthesis and utilisation of polyP are catalyzed by PPK1, PPK2 and polyphosphatases. Proteins with a single PPK2 domain catalyze polyP-dependent phosphorylation of ADP to ATP, whereas proteins containing 2 fused PPK2 domains phosphorylate AMP to ADP. The structure of PPK2 from Pseudomonas aeruginosa has revealed a 3-layer alpha/beta/alpha sandwich fold with an alpha-helical lid similar to the structures of microbial thymidylate kinases.	229
397879	pfam03977	OAD_beta	Na+-transporting oxaloacetate decarboxylase beta subunit. Members of this family are integral membrane proteins. The decarboxylation reactions they catalyze are coupled to the vectorial transport of Na+ across the cytoplasmic membrane, thereby creating a sodium ion motive force that is used for ATP synthesis.	345
397880	pfam03978	Borrelia_REV	Borrelia burgdorferi REV protein. This family consists of several REV proteins from Borrelia burgdorferi (Lyme disease spirochete). The function of REV is unknown, although it is known that the gene is induced during the ingestion of host blood, suggesting a role in the metabolic activation of borreliae to adapt to physiological stimuli.	157
397881	pfam03979	Sigma70_r1_1	Sigma-70 factor, region 1.1. Region 1.1 modulates DNA binding by regions 2 and 4 when sigma is unbound by the core RNA polymerase. Region 1.1 is also involved in promoter binding.	69
397882	pfam03980	Nnf1	Nnf1. NNF1 is an essential yeast gene that is necessary for chromosome segregation. It is associated with the spindle poles and forms part of a kinetochore subcomplex called MIND.	103
397883	pfam03981	Ubiq_cyt_C_chap	Ubiquinol-cytochrome C chaperone. 	137
112781	pfam03982	DAGAT	Diacylglycerol acyltransferase. The terminal step of triacylglycerol (TAG) formation is catalyzed by the enzyme diacylglycerol acyltransferase (DAGAT).	297
397884	pfam03983	SHD1	SLA1 homology domain 1, SHD1. NPFXD peptides specifically interact with the SHD1 domain. NPFXD is a clathrin-facilitated endocytic targeting signal. NPFXD was originally discovered in the cytoplasmic domain of the furin-like protease Kex2p. Sla1 is thought to function as an endocytic adaptor.	67
367755	pfam03984	DUF346	Repeat of unknown function (DUF346). This repeat was found as seven tandem copies in one protein. It is predicted to be composed of beta-strands. Thus it is likely that it forms a beta-propeller structure. It is found in association with BNR repeats, which also form a beta-propeller.	36
397885	pfam03985	Paf1	Paf1. Members of this family are components of the RNA polymerase II associated Paf1 complex. The Paf1 complex functions during the elongation phase of transcription in conjunction with Spt4-Spt5 and Spt16-Pob3i.	416
367757	pfam03986	Autophagy_N	Autophagocytosis associated protein (Atg3), N-terminal domain. Autophagocytosis is a starvation-induced process responsible for transport of cytoplasmic proteins to the lysosome/vacuole. Atg3 is a ubiquitin like modifier that is topologically similar to the canonical E2 enzyme. It catalyzes the conjugation of Atg8 and phosphatidylethanolamine.	126
397886	pfam03987	Autophagy_act_C	Autophagocytosis associated protein, active-site domain. Autophagocytosis is a starvation-induced process responsible for transport of cytoplasmic proteins to the vacuole. The cysteine residue within the HPC motif is the putative active-site residue for recognition of the Apg5 subunit of the autophagosome complex.	188
397887	pfam03988	DUF347	Repeat of Unknown Function (DUF347). This repeat is found as four tandem repeats in a family of bacterial membrane proteins. Each repeat contains two transmembrane regions and a conserved tryptophan.	50
397888	pfam03989	DNA_gyraseA_C	DNA gyrase C-terminal domain, beta-propeller. This repeat is found as 6 tandem copies at the C-termini of GyrA and ParC DNA gyrases. It is predicted to form 4 beta strands and to probably form a beta-propeller structure. This region has been shown to bind DNA non-specifically and may stabilize the DNA-topoisomerase complex.	48
397889	pfam03990	DUF348	Domain of unknown function (DUF348). This domain normally occurs as tandem repeats; however it is found as a single copy in the S. cerevisiae DNA-binding nuclear protein YCR593. This protein is involved in sporulation as part of the SET3C complex, which is required to repress early/middle sporulation genes during meiosis. The bacterial proteins are likely to be involved in a cell wall function as they are found in conjunction with the pfam07501 domain, which is involved in various cell surface processes.	41
112790	pfam03991	Prion_octapep	Copper binding octapeptide repeat. This repeat is found at the amino terminus of prion proteins. It has been shown to bind to copper.	8
397890	pfam03992	ABM	Antibiotic biosynthesis monooxygenase. This domain is found in monooxygenases involved in the biosynthesis of several antibiotics by Streptomyces species. Its occurrence as a repeat in Streptomyces coelicolor SCO1909 is suggestive that the other proteins function as multimers. There is also a conserved histidine which is likely to be an active site residue.	74
397891	pfam03993	DUF349	Domain of Unknown Function (DUF349). This domain is found singly or as up to five tandem repeats in a small set of bacterial proteins. There are two or three alpha-helices, and possibly a beta-strand.	72
397892	pfam03994	DUF350	Domain of Unknown Function (DUF350). This domain occurs in a small set of bacterial proteins. It has two transmembrane regions, and often occurs as tandem repeats. There are no conserved catalytic residues.	54
377188	pfam03995	Inhibitor_I36	Peptidase inhibitor family I36. This domain is currently only found in a small set of S. coelicolor secreted proteins. There are four conserved cysteines that probably form two disulphide bonds. Proteins 2SCK31.15C and SCO3675 also have probable beta-propellers at their C-termini. This family includes Streptomyces nigrescens SmpI, a known peptidase inhibitor of known structure. This protein has a crystallin like fold pfam00030 and is distantly related by sequence. It is not known whether other members of this family are peptidase inhibitors.	69
397893	pfam03996	Hema_esterase	Hemagglutinin esterase. 	384
397894	pfam03997	VPS28	VPS28 protein. 	187
397895	pfam03998	Utp11	Utp11 protein. This protein is found to be part of a large ribonucleoprotein complex containing the U3 snoRNA. Depletion of the Utp proteins impedes production of the 18S rRNA, indicating that they are part of the active pre-rRNA processing complex. This large RNP complex has been termed the small subunit (SSU) processome.	245
397896	pfam03999	MAP65_ASE1	Microtubule associated protein (MAP65/ASE1 family). 	562
397897	pfam04000	Sas10_Utp3	Sas10/Utp3/C1D family. This family contains Utp3 and LCP5 which are components of the U3 ribonucleoprotein complex. It also includes the human C1D protein and Saccharomyces cerevisiae YHR081W (rrp47), an exosome-associated protein required for the 3' processing of stable RNAs, and Sas10 which has been identified as a regulator of chromatin silencing. This family also includes the human protein Neuroguidin, an initiation factor 4E (eIF4E) binding protein.	84
397898	pfam04001	Vhr1	Transcription factor Vhr1. Vhr1 is a transcription factor which regulates the biotin-dependent expression of transporters VHT1 and BIO5.	91
397899	pfam04002	RadC	RadC-like JAB domain. A family of proteins present widely across the bacteria. This family was named initially with reference to the E. coli radC102 mutation which suggested that RadC was involved in repair of DNA lesions. However the relevant mutation has subsequently been shown to be in recG, where radC is in fact an allele of recG. In addition, a personal communication from Claverys, J-P, et al, indicates a total failure of all attempts to characterize a radiation-related function for RadC in Streptococcus pneumoniae, suggesting that it is not involved in repair of DNA lesions, in recombination during transformation, in gene conversion, nor in mismatch repair. Computational analysis, however, provides a possible function. The RadC-like family belongs to the JAB superfamily of metalloproteins. The domain shows fusions to an N-terminal Helix-hairpin-Helix (HhH) domain in most instances. Other domain combinations include fusions to the anti-restriction module ArdC, the DinG/RAD3-like superfamily II helicases and the DNAG-like primase. In some bacteria, closely related DinG/Rad3-like superfamily II helicases are fused to a 3'-5' exonuclease in the same position as the RadC-like JAB domain. These conserved domain associations lead to the hypothesis that the RadC-like JAB domains might function as a nuclease.	113
397900	pfam04003	Utp12	Dip2/Utp12 Family. This domain is found at the C-terminus of proteins containing WD40 repeats. These proteins are part of the U3 ribonucleoprotein; the yeast protein is called Utp12 or DIP2.	106
397901	pfam04004	Leo1	Leo1-like protein. Members of this family are part of the Paf1/RNA polymerase II complex. The Paf1 complex probably functions during the elongation phase of transcription. The Leo1 subunit of the yeast Paf1-complex binds RNA and contributes to complex recruitment. The subunit acts by co-ordinating co-transcriptional chromatin modifications and helping recruitment of mRNA 3prime-end processing factors.	163
397902	pfam04005	Hus1	Hus1-like protein. Hus1, Rad1, and Rad9 are three evolutionarily conserved proteins required for checkpoint control in fission yeast. These proteins are known to form a stable complex in vivo. Hus1-Rad1-Rad9 complex may form a PCNA-like ring structure, and could function as a sliding clamp during checkpoint control.	288
397903	pfam04006	Mpp10	Mpp10 protein. This family includes proteins related to Mpp10 (M phase phosphoprotein 10). The U3 small nucleolar ribonucleoprotein (snoRNP) is required for three cleavage events that generate the mature 18S rRNA from the pre-rRNA. In Saccharomyces cerevisiae, depletion of Mpp10, a U3 snoRNP-specific protein, halts 18S rRNA production and impairs cleavage at the three U3 snoRNP-dependent sites.	591
281936	pfam04007	DUF354	Protein of unknown function (DUF354). Members of this family are around 350 amino acids in length. They are found in archaebacteria and have no known function.	349
397904	pfam04008	Adenosine_kin	Adenosine specific kinase. The structure of a member of this family from the hyperthermophilic archaeon Pyrobaculum aerophilum contains a modified histidine residue which is interpreted as stable phosphorylation. In vitro binding studies confirmed that adenosine and AMP but not ADP or ATP bind to the protein.	154
397905	pfam04009	DUF356	Protein of unknown function (DUF356). Members of this family are around 120 amino acids in length and are found in some archaebacteria. The function of this family is unknown. However it contains a conserved motif IHPPAH that may be involved in its function.	106
397906	pfam04010	DUF357	Protein of unknown function (DUF357). Members of this family are short (less than 100 amino acid) proteins found in archaebacteria. The function of these proteins is unknown.	73
397907	pfam04011	LemA	LemA family. The members of this family are related to the LemA protein. LemA contains an amino terminal predicted transmembrane helix. It has been predicted that the small amino terminus is extracellular. The exact molecular function of this protein is uncertain.	149
281941	pfam04012	PspA_IM30	PspA/IM30 family. This family includes PspA a protein that suppresses sigma54-dependent transcription. The PspA protein, a negative regulator of the Escherichia coli phage shock psp operon, is produced when virulence factors are exported through secretins in many Gram-negative pathogenic bacteria and its homolog in plants, VIPP1, plays a critical role in thylakoid biogenesis, essential for photosynthesis. Activation of transcription by the enhancer-dependent bacterial sigma(54) containing RNA polymerase occurs through ATP hydrolysis-driven protein conformational changes enabled by activator proteins that belong to the large AAA(+) mechanochemical protein family. It has been shown that PspA directly and specifically acts upon and binds to the AAA(+) domain of the PspF transcription activator.	218
397908	pfam04013	Methyltrn_RNA_2	Putative SAM-dependent RNA methyltransferase. This family is likely to be an S-adenosyl-L-methionine (SAM)-dependent RNA methyltransferase. It is responsible for N1-methylation of pseudouridine 54 in archaeal tRNAs.	198
397909	pfam04014	MazE_antitoxin	Antidote-toxin recognition MazE, bacterial antitoxin. AbrB-like is a family of small proteins that operate in conjunction with a cognate toxin molecule. The commonly attributed role of toxin-antitoxin systems is to maintain low-copy number plasmids from one generation to the next. Such gene-pairs are also found on chromosomes and to be associated with a number of biological functions such as: reduction of protein synthesis, gene regulation and retardation of cell growth under nutritional stress. This family includes proteins from a number of different pairings, eg MazE, AbrB, VapB, PhoU, PemI-like and SpoVT. MazE is the antidote to the toxin MazF of E. coli. MazE-MazF in E. coli is a regulated prokaryotic chromosomal addiction module. MazE antidote is degraded by the ClpPA protease of the bacterial proteasome. MazE-MazF is thought to play a role in programmed cell death when cells suffer nutrient deprivation, and MazE-MazF modules have also been implicated in the bacteriostatic effects of other addiction modules.	44
397910	pfam04015	DUF362	Domain of unknown function (DUF362). Domain that is sometimes present in iron-sulphur proteins.	199
397911	pfam04016	DUF364	Putative heavy-metal chelation. This domain of unknown function has a PLP-dependent transferase-like fold. Its genomic context suggests that it may have a role in anaerobic vitamin B12 biosynthesis. This domain is often found at the C-terminus of proteins containing DUF4213, pfam13938. The structure of UniProtKB:B8FUJ5, Structure 3l5o, suggests that the protein has an enolase N-terminal-like fold and this Rossmann-like C-terminal domain. Structural and bioinformatic analyses reveal partial similarities to Rossmann-like methyltransferases, with residues from the enolase-like fold combining to form a unique active site that is likely to be involved in the condensation or hydrolysis of molecules implicated in the synthesis of flavins, pterins or other siderophores. The protein may be playing a role in heavy-metal chelation.	147
397912	pfam04017	DUF366	Domain of unknown function (DUF366). Archaeal domain of unknown function.	184
397913	pfam04018	DUF368	Domain of unknown function (DUF368). Predicted transmembrane domain of unknown function. Family members have between 6 and 9 predicted transmembrane segments.	241
397914	pfam04019	DUF359	Protein of unknown function (DUF359). This family of archaebacterial proteins are about 170 amino acids in length. They have no known function. The most conserved portion of the protein contains the sequence GEEDL that may be important for its function.	122
397915	pfam04020	Phage_holin_4_2	Mycobacterial 4 TMS phage holin, superfamily IV. These proteins are predicted transmembrane proteins with probably four transmembrane spans. The 1.E.40 is represented by the mycobacterial 4 phage holin, but it also contains many cyanobacterial, proteobacterial and firmicute proteins. Holins are encoded within the genomes of Gram-positive and Gram-negative bacteria as well as in those of the bacteriophage of these organisms. The primary function of holins appears to be transport of murein hydrolases across the cytoplasmic membrane to the cell wall where these enzymes hydrolyze the cell wall polymer as a prelude to cell lysis. When chromosomally encoded the enzymes are therefore autolysins. Holins may also facilitate leakage of electrolytes and nutrients from the cell cytoplasm, thereby promoting cell death. Some may catalyze export of nucleases.	105
397916	pfam04021	Class_IIIsignal	Class III signal peptide. This family of archaeal proteins contains an amino terminal motif QXSXEXXXL that has been suggested to be part of a class III signal sequence, with the Q being the +1 residue of the signal peptidase cleavage site. Two members of this family are cleaved by a type IV pilin-like signal peptidase.	27
281951	pfam04022	Staphylcoagulse	Staphylocoagulase repeat. 	27
397917	pfam04023	FeoA	FeoA domain. This family includes FeoA a small protein, probably involved in Fe2+ transport. This presumed short domain is also found at the C-terminus of a variety of metal dependent transcriptional regulators. This suggests that this domain may be metal-binding. In most cases this is likely to be either iron or manganese.	74
397918	pfam04024	PspC	PspC domain. This family includes Phage shock protein C (PspC) that is thought to be a transcriptional regulator. The presumed domain is 60 amino acid residues in length.	57
397919	pfam04025	DUF370	Domain of unknown function (DUF370). Bacterial domain of unknown function.	73
397920	pfam04026	SpoVG	SpoVG. Stage V sporulation protein G. Essential for sporulation and specific to stage V sporulation in Bacillus megaterium and subtilis. In B. subtilis, expression decreases after 30-60 minutes of cold shock.	82
397921	pfam04027	DUF371	Domain of unknown function (DUF371). Archaeal domain of unknown function.	133
397922	pfam04028	DUF374	Domain of unknown function (DUF374). Bacterial domain of unknown function.	69
397923	pfam04029	2-ph_phosp	2-phosphosulpholactate phosphatase. Thought to catalyze 2-phosphosulpholactate = sulpholactate + phosphate. Probable magnesium cofactor. Involved in the second step of coenzyme M biosynthesis. Inhibited by vanadate in Methanococcus jannaschii. Also known as the ComB family.	220
397924	pfam04030	ALO	D-arabinono-1,4-lactone oxidase. This domain is specific to D-arabinono-1,4-lactone oxidase EC:1.1.3.-, which is involved in the final step of the D-erythroascorbic acid biosynthesis pathway.	259
397925	pfam04031	Las1	Las1-like. Las1 is an essential nuclear protein involved in cell morphogenesis and cell surface growth.	147
397926	pfam04032	Rpr2	RNAse P Rpr2/Rpp21/SNM1 subunit domain. This family contains a ribonuclease P subunit of humans and yeast. Other members of the family include the probable archaeal homologs. This family includes SNM1. It is a subunit of RNase MRP (mitochondrial RNA processing), a ribonucleoprotein endoribonuclease that has roles in both mitochondrial DNA replication and nuclear 5.8S rRNA processing. SNM1 is an RNA binding protein that binds the MRP RNA specifically. This subunit possibly binds the precursor tRNA.	88
397927	pfam04033	DUF365	Domain of unknown function (DUF365). Archaeal domain of unknown function.	96
397928	pfam04034	Ribo_biogen_C	Ribosome biogenesis protein, C-terminal. This family represents the C-terminal domain of some putative ribosome biogenesis proteins in archaea. It has also been identified in the eukaryotic protein Tsr3, which is involved in ribosomal RNA biogenesis.	126
397929	pfam04037	DUF382	Domain of unknown function (DUF382). This domain is specific to the human splicing factor 3b subunit 2 and its orthologues. Splicing factor 3b subunit 2 or SAP145 is a suppressor of U2 snRNA mutations. Pre-mRNA splicing is catalyzed by a large ribonucleoprotein complex called the spliceosome. Spliceosomes are multi-component enzymes that catalyze pre-mRNA splicing and form step-wise by the ordered interaction of UsnRNPs and non-snRNP proteins with short conserved regions of the pre-mRNA at the 5' and 3' splice sites and branch site.	127
397930	pfam04038	DHNA	Dihydroneopterin aldolase. 	108
397931	pfam04039	MnhB	Domain related to MnhB subunit of Na+/H+ antiporter. Possible subunit of Na+/H+ antiporter. Predicted integral membrane protein, usually four transmembrane regions in this domain. Often found in bacterial NADH dehydrogenase subunit.	124
397932	pfam04041	Glyco_hydro_130	beta-1,4-mannooligosaccharide phosphorylase. This is a family of glycosyl-hydrolases of the CAZy GH130 family. Several have been characterized as mannosylglucose phosphorylase. This enzyme is part of the mannan catalytic pathway and feeds into the glycolysis cycle. Specifically it catalyzes the reversible phosphorolysis of beta-1,4-D-mannosyl-N-acetyl-D-glucosamine. This family was noted to belong to the Beta fructosidase superfamily in.	315
397933	pfam04042	DNA_pol_E_B	DNA polymerase alpha/epsilon subunit B. This family contains a number of DNA polymerase subunits. The B subunit of the DNA polymerase alpha plays an essential role at the initial stage of DNA replication in S. cerevisiae and is phosphorylated in a cell cycle-dependent manner. DNA polymerase epsilon is essential for cell viability and chromosomal DNA replication in budding yeast. In addition, DNA polymerase epsilon may be involved in DNA repair and cell-cycle checkpoint control. The enzyme consists of at least four subunits in mammalian cells as well as in yeast. The largest subunit of DNA polymerase epsilon is responsible for polymerase activity. In mouse, the DNA polymerase epsilon subunit B is the second largest subunit of the DNA polymerase. A part of the N-terminal was found to be responsible for the interaction with SAP18. Experimental evidence suggests that this subunit may recruit histone deacetylase to the replication fork to modify the chromatin structure.	209
397934	pfam04043	PMEI	Plant invertase/pectin methylesterase inhibitor. This domain inhibits pectin methylesterases (PMEs) and invertases through formation of a non-covalent 1:1 complex. It has been implicated in the regulation of fruit development, carbohydrate metabolism and cell wall extension. It may also be involved in inhibiting microbial pathogen PMEs. It has been observed that it is often expressed as a large inactive preprotein. It is also found at the N-termini of PMEs predicted from DNA sequences (personal obs:C Yeats), suggesting that both PMEs and their inhibitor are expressed as a single polyprotein and subsequently processed. It has two disulphide bridges and is mainly alpha-helical.	148
397935	pfam04045	P34-Arc	Arp2/3 complex, 34 kD subunit p34-Arc. Arp2/3 protein complex has been implicated in the control of actin polymerization in cells. The human complex consists of seven subunits which include the actin related Arp2 and Arp3, and five others referred to as p41-Arc, p34-Arc, p21-Arc, p20-Arc, and p16-Arc. This family represents the p34-Arc subunit.	239
397936	pfam04046	PSP	PSP. Proline rich domain found in numerous spliceosome associated proteins.	45
367782	pfam04048	Sec8_exocyst	Sec8 exocyst complex component specific domain. 	141
397937	pfam04049	ANAPC8	Anaphase promoting complex subunit 8 / Cdc23. The anaphase-promoting complex is composed of eight protein subunits, including BimE (APC1), CDC27 (APC3), CDC16 (APC6), and CDC23 (APC8).	140
397938	pfam04050	Upf2	Up-frameshift suppressor 2. Transcripts harbouring premature signals for translation termination are recognized and rapidly degraded by eukaryotic cells through a pathway known as nonsense-mediated mRNA decay. In Saccharomyces cerevisiae, three trans-acting factors (Upf1 to Upf3) are required for nonsense-mediated mRNA decay.	133
397939	pfam04051	TRAPP	Transport protein particle (TRAPP) component. TRAPP plays a key role in the targeting and/or fusion of ER-to-Golgi transport vesicles with their acceptor compartment. TRAPP is a large multimeric protein that contains at least 10 subunits. This family contains many TRAPP family proteins. The Bet3 subunit is one of the better characterized TRAPP proteins and has a dimeric structure with hydrophobic channels. The channel entrances are located on a putative membrane-interacting surface that is distinctively flat, wide and decorated with positively charged residues. Bet3 is proposed to localize TRAPP to the Golgi.	150
397940	pfam04052	TolB_N	TolB amino-terminal domain. TolB is an essential periplasmic component of the tol-dependent translocation system. The function of this amino terminal domain is uncertain.	105
397941	pfam04053	Coatomer_WDAD	Coatomer WD associated region. This region is composed of WD40 repeats.	439
397942	pfam04054	Not1	CCR4-Not complex component, Not1. The Ccr4-Not complex is a global regulator of transcription that affects genes positively and negatively and is thought to regulate transcription factor TFIID.	365
397943	pfam04055	Radical_SAM	Radical SAM superfamily. Radical SAM proteins catalyze diverse reactions, including unusual methylations, isomerisation, sulphur insertion, ring formation, anaerobic oxidation and protein radical formation.	159
397944	pfam04056	Ssl1	Ssl1-like. Ssl1-like proteins are 40kDa subunits of the Transcription factor II H complex.	175
397945	pfam04057	Rep-A_N	Replication factor-A protein 1, N-terminal domain. 	99
112856	pfam04059	RRM_2	RNA recognition motif 2. 	97
397946	pfam04060	FeS	Putative Fe-S cluster. This family includes a domain with four conserved cysteines that probably form an Fe-S redox cluster.	33
397947	pfam04061	ORMDL	ORMDL family. Evidence suggests that ORMDLs are involved in protein folding in the ER. Orm proteins have been identified as negative regulators of sphingolipid synthesis that form a conserved complex with serine palmitoyltransferase, the first and rate-limiting enzyme in sphingolipid production. This novel and conserved protein complex has been termed the SPOTS complex (serine palmitoyltransferase, Orm1/2, Tsc3, and Sac1).	135
397948	pfam04062	P21-Arc	ARP2/3 complex ARPC3 (21 kDa) subunit. The seven component ARP2/3 actin-organising complex is involved in actin assembly and function.	173
397949	pfam04063	DUF383	Domain of unknown function (DUF383). 	188
397950	pfam04064	DUF384	Domain of unknown function (DUF384). 	55
397951	pfam04065	Not3	Not1 N-terminal domain, CCR4-Not complex component. 	228
397952	pfam04066	MrpF_PhaF	Multiple resistance and pH regulation protein F (MrpF / PhaF). Members of the PhaF / MrpF family are predicted to be integral membrane proteins with three transmembrane regions, involved in regulation of pH. PhaF is part of a potassium efflux system involved in pH regulation. It is also involved in symbiosis in Rhizobium meliloti. MrpF is part of a Na+/H+ antiporter complex, also involved in pH homeostasis. MrpF is thought to be an efflux system for Na+ and cholate. The Mrp system in Bacilli may also have primary energisation capacities.	51
397953	pfam04068	RLI	Possible Fer4-like domain in RNase L inhibitor, RLI. Possible metal-binding domain in endoribonuclease RNase L inhibitor. Found at the N-terminal end of RNase L inhibitor proteins, adjacent to the 4Fe-4S binding domain, fer4, pfam00037. Also often found adjacent to the DUF367 domain pfam04034 in uncharacterized proteins. The RNase L system plays a major role in the anti-viral and anti-proliferative activities of interferons, and could possibly play a more general role in the regulation of RNA stability in mammalian cells. Inhibitory activity requires concentration-dependent association of RLI with RNase L.	35
397954	pfam04069	OpuAC	Substrate binding domain of ABC-type glycine betaine transport system. Part of a high affinity multicomponent binding-protein-dependent transport system involved in bacterial osmoregulation. This domain is often fused to the permease component of the transporter complex. Family members are often integral membrane proteins or predicted to be attached to the membrane by a lipid anchor. Glycine betaine is involved in protection from high osmolarity environments for example in Bacillus subtilis. The family member OpuBC is closely related, and involved in choline transport. Choline is necessary for the biosynthesis of glycine betaine. L-carnitine is important for osmoregulation in Listeria monocytogenes. Family also contains proteins binding l-proline (ProX), histidine (HisX) and taurine (TauA).	257
397955	pfam04070	DUF378	Domain of unknown function (DUF378). Predicted transmembrane domain of unknown function. The majority of the family have two predicted transmembrane regions.	60
397956	pfam04071	zf-like	Cysteine-rich small domain. Probable metal-binding domain.	80
397957	pfam04072	LCM	Leucine carboxyl methyltransferase. Family of leucine carboxyl methyltransferases EC:2.1.1.-. This family may need dividing, as the full alignment contains a significantly shorter mouse sequence.	188
397958	pfam04073	tRNA_edit	Aminoacyl-tRNA editing domain. This domain is found either on its own or in association with the tRNA synthetase class II core domain (pfam00587). It is involved in the tRNA editing of mis-charged tRNAs including Cys-tRNA(Pro), Cys-tRNA(Cys), Ala-tRNA(Pro). The structure of this domain shows a novel fold.	123
397959	pfam04074	DUF386	Domain of unknown function (DUF386). This family consists of conserved hypothetical proteins, typically about 150 amino acids in length, with no known function.	148
281995	pfam04075	F420H2_quin_red	F420H(2)-dependent quinone reductase. This family of proteins is found in the genera Mycobacterium and Streptomyces. Member protein Rv3547 has been characterized as a deazaflavin-dependent nitroreductase. Rv1558 is an F420H(2)-dependent quinone reductase involved in oxidative stress protection.	129
281996	pfam04076	BOF	Bacterial OB fold (BOF) protein. Proteins in this family form an OB-fold. Analysis of the predicted binding site of BOF family proteins implies that they lack nucleic acid-binding properties. They contain a predicted N-terminal signal peptide which indicates that they localize in the periplasm where they may function to bind proteins, small molecules, or other typical OB-fold ligands. As hypothesized for the distantly related OB-fold containing bacterial enterotoxins, the loss of nucleotide-binding function and the rapid evolution of the BOF ligand-binding site may be associated with the presence of BOF proteins in mobile genetic elements and their potential role in bacterial pathogenicity.	103
397960	pfam04077	DsrH	DsrH like protein. DsrH is involved in oxidation of intracellular sulphur in the phototrophic sulphur bacterium Chromatium vinosum D.	87
397961	pfam04078	Rcd1	Cell differentiation family, Rcd1-like. Two of the members in this family have been characterized as being involved in regulation of Ste11 regulated sex genes. Mammalian Rcd1 is a novel transcriptional cofactor that mediates retinoic acid-induced cell differentiation.	259
397962	pfam04079	SMC_ScpB	Segregation and condensation complex subunit ScpB. This is a family of prokaryotic proteins that form one of the subunits, ScpB, of the segregation and condensation complex, condensin, that plays a key role in the maintenance of the chromosome. In prokaryotes the complex consists of three proteins, SMC, ScpA (kleisin) and ScpB. ScpB dimerizes and binds to ScpA. As originally predicted, ScpB is structurally a winged-helix at both its N- and C-terminal halves. In Bacillus subtilis, one Smc dimer is bridged by a single ScpAB to generate asymmetric tripartite rings analogous to eukaryotic SMC complex ring-shaped assemblies.	160
397963	pfam04080	Per1	Per1-like family. PER1 is required for GPI-phospholipase A2 activity and is involved in lipid remodelling of GPI-anchored proteins. PER1 is part of the CREST superfamily.	254
367800	pfam04081	DNA_pol_delta_4	DNA polymerase delta, subunit 4. 	131
397964	pfam04082	Fungal_trans	Fungal specific transcription factor domain. 	262
397965	pfam04083	Abhydro_lipase	Partial alpha/beta-hydrolase lipase region. This family corresponds to a N-terminal part of an alpha/beta hydrolase domain.	62
397966	pfam04084	ORC2	Origin recognition complex subunit 2. All DNA replication initiation is driven by a single conserved eukaryotic initiator complex termed the origin recognition complex (ORC). The ORC is a six protein complex. The function of ORC is reviewed in.	323
397967	pfam04085	MreC	rod shape-determining protein MreC. MreC (murein formation C) is involved in the rod shape determination in E. coli, and more generally in cell shape determination of bacteria whether or not they are rod-shaped.	112
397968	pfam04086	SRP-alpha_N	Signal recognition particle, alpha subunit, N-terminal. SRP is a complex of six distinct polypeptides and a 7S RNA that is essential for transferring nascent polypeptide chains that are destined for export from the cell to the translocation apparatus of the endoplasmic reticulum (ER) membrane. SRP binds hydrophobic signal sequences as they emerge from the ribosome, and arrests translation.	287
397969	pfam04087	DUF389	Domain of unknown function (DUF389). Family of hypothetical bacterial proteins with an undetermined function.	137
397970	pfam04088	Peroxin-13_N	Peroxin 13, N-terminal region. Both termini of the Peroxin-13 are oriented to the cytosol. Peroxin-13 is required for peroxisomal association of peroxin-14.	141
397971	pfam04089	BRICHOS	BRICHOS domain. The BRICHOS domain is about 100 amino acids long. It is found in a variety of proteins implicated in dementia, respiratory distress and cancer. Its exact function is unknown; roles that have been proposed for it include (a) in targeting of the protein to the secretory pathway, (b) intramolecular chaperone-like function, and (c) assisting the specialized intracellular protease processing system. This C-terminal domain is embedded in the endoplasmic reticulum lumen, and binds to the N-terminal, transmembrane, SP_C, pfam08999, provided that it is in non-helical conformation. Thus the Brichos domain of proSP-C is a chaperone that induces alpha-helix formation of an aggregation-prone TM region.	88
367807	pfam04090	RNA_pol_I_TF	RNA polymerase I specific initiation factor. 	199
397972	pfam04091	Sec15	Exocyst complex subunit Sec15-like. 	308
397973	pfam04092	SAG	SRS domain. Toxoplasma gondii is a persistent protozoan parasite capable of infecting almost any warm-blooded vertebrate. The surface of Toxoplasma is coated with a family of developmentally regulated glycosylphosphatidylinositol (GPI)-linked proteins (SRSs), of which SAG1 is the prototypic member. SRS proteins mediate attachment to host cells and interface with the host immune response to regulate the virulence of the parasite. SAG1 is composed of two disulphide linked SRS domains. These have 6 cysteines that form 1-6,2-5 and 3-4 pairings. The structure of the immunodominant SAG1 antigen reveals a homodimeric configuration. The SRS domain is found in a single copy in the SAG2 proteins. This family of surface antigens are found in other apicomplexans.	132
282013	pfam04093	MreD	rod shape-determining protein MreD. MreD (murein formation D) is involved in the rod shape determination in E. coli, and more generally in cell shape determination of bacteria whether or not they are rod-shaped.	160
397974	pfam04095	NAPRTase	Nicotinate phosphoribosyltransferase (NAPRTase) family. Nicotinate phosphoribosyltransferase (EC:2.4.2.11) is the rate limiting enzyme that catalyzes the first reaction in the NAD salvage synthesis. This family also includes Pre-B cell enhancing factor that is a cytokine. This family is related to Quinolinate phosphoribosyltransferase pfam01729.	235
397975	pfam04096	Nucleoporin2	Nucleoporin autopeptidase. 	143
397976	pfam04097	Nic96	Nup93/Nic96. Nup93/Nic96 is a component of the nuclear pore complex. It is required for the correct assembly of the nuclear pore complex. In Saccharomyces cerevisiae, Nic96 has been shown to be involved in the distribution and cellular concentration of the GTPase Gsp1. The structure of Nic96 has revealed a mostly alpha helical structure.	611
367812	pfam04098	Rad52_Rad22	Rad52/22 family double-strand break repair protein. The DNA single-strand annealing proteins (SSAPs), such as RecT, Red-beta, ERF and Rad52, function in RecA-dependent and RecA-independent DNA recombination pathways. This family includes proteins related to Rad52. These proteins contain two helix-hairpin-helix motifs.	140
282019	pfam04099	Sybindin	Sybindin-like family. Sybindin is a physiological syndecan-2 ligand on dendritic spines, the small protrusions on the surface of dendrites that receive the vast majority of excitatory synapses.	134
282020	pfam04100	Vps53_N	Vps53-like, N-terminal. Vps53 complexes with Vps52 and Vps54 to form a multi-subunit complex involved in regulating membrane trafficking events.	374
397977	pfam04101	Glyco_tran_28_C	Glycosyltransferase family 28 C-terminal domain. The glycosyltransferase family 28 includes monogalactosyldiacylglycerol synthase (EC 2.4.1.46) and UDP-N-acetylglucosamine transferase (EC 2.4.1.-). Structural analysis suggests the C-terminal domain contains the UDP-GlcNAc binding site.	166
397978	pfam04102	SlyX	SlyX. The SlyX protein has no known function. It is short (less than 80 amino acids) and is found close to the slyD gene. The SlyX protein has a conserved PPH(Y/W) motif at its C-terminus. The protein may be a coiled-coil structure.	66
397979	pfam04103	CD20	CD20-like family. This family includes the CD20 protein and the beta subunit of the high affinity receptor for IgE Fc. The high affinity receptor for IgE is a tetrameric structure consisting of a single IgE-binding alpha subunit, a single beta subunit, and two disulfide-linked gamma subunits. The alpha subunit of Fc epsilon RI and most Fc receptors are homologous members of the Ig superfamily. By contrast, the beta and gamma subunits from Fc epsilon RI are not homologous to the Ig superfamily. Both molecules have four putative transmembrane segments and a probable topology where both amino- and carboxy termini protrude into the cytoplasm. This family also includes LR8 like proteins from humans, mice and rats. The function of the human LR8 protein is unknown although it is known to be strongly expressed in the lung fibroblasts. This family also includes sarcospan, a transmembrane component of dystrophin-associated glycoprotein. Loss of the sarcoglycan complex and sarcospan alone is sufficient to cause muscular dystrophy. The role of the sarcoglycan complex and sarcospan is thought to be to strengthen the dystrophin axis connecting the basement membrane with the cytoskeleton.	156
397980	pfam04104	DNA_primase_lrg	Eukaryotic and archaeal DNA primase, large subunit. DNA primase is the polymerase that synthesizes small RNA primers for the Okazaki fragments made during discontinuous DNA replication. DNA primase is a heterodimer of two subunits, the small subunit Pri1 (48 kDa in yeast), and the large subunit Pri2 (58 kDa in the yeast S. cerevisiae). The large subunit of DNA primase forms interactions with the small subunit, and the structure implies that it is not directly involved in catalysis, but plays roles in correctly positioning the primase/DNA complex, and in the transfer of RNA to DNA polymerase.	222
397981	pfam04106	APG5	Autophagy protein Apg5. Apg5 is directly required for the import of aminopeptidase I via the cytoplasm-to-vacuole targeting pathway.	201
397982	pfam04107	GCS2	Glutamate-cysteine ligase family 2(GCS2). Also known as gamma-glutamylcysteine synthetase and gamma-ECS (EC:6.3.2.2). This enzyme catalyzes the first and rate limiting step in de novo glutathione biosynthesis. Members of this family are found in archaea, bacteria and plants. May and Leaver discuss the possible evolutionary origins of glutamate-cysteine ligase enzymes in different organisms and suggest that it evolved independently in different eukaryotes, from an ancestral bacterial enzyme. They also state that Arabidopsis thaliana gamma-glutamylcysteine synthetase is structurally unrelated to mammalian, yeast and Escherichia coli homologs. In plants, there are separate cytosolic and chloroplast forms of the enzyme.	289
397983	pfam04108	APG17	Autophagy protein Apg17. Apg17 is required for activating Apg1 protein kinases.	387
397984	pfam04109	APG9	Autophagy protein Apg9. In yeast, 15 Apg proteins coordinate the formation of autophagosomes. Autophagy is a bulk degradation process induced by starvation in eukaryotic cells. Apg9 plays a direct role in the formation of the cytoplasm to vacuole targeting and autophagic vesicles, possibly serving as a marker for a specialized compartment essential for these vesicle-mediated alternative targeting pathways.	478
397985	pfam04110	APG12	Ubiquitin-like autophagy protein Apg12. In yeast, 15 Apg proteins coordinate the formation of autophagosomes. Autophagy is a bulk degradation process induced by starvation in eukaryotic cells. The Apg12 system is one of the ubiquitin-like protein conjugation systems conserved in eukaryotes. It was first discovered in yeast during systematic analyses of the apg mutants defective in autophagy. Covalent attachment of Apg12-Apg5 is essential for autophagy.	87
397986	pfam04111	APG6	Autophagy protein Apg6. In yeast, 15 Apg proteins coordinate the formation of autophagosomes. Autophagy is a bulk degradation process induced by starvation in eukaryotic cells. Apg6/Vps30p has two distinct functions in the autophagic process, either associated with the membrane or in a retrieval step of the carboxypeptidase Y sorting pathway.	176
397987	pfam04112	Mak10	Mak10 subunit, NatC N(alpha)-terminal acetyltransferase. NatC N(alpha)-terminal acetyltransferase contains Mak10p, Mak31p and Mak3p subunits. All three subunits are associated with each other to form the active complex.	164
397988	pfam04113	Gpi16	Gpi16 subunit, GPI transamidase component. GPI (glycosyl phosphatidyl inositol) transamidase is a multi-protein complex. Gpi16, Gpi8 and Gaa1 form a sub-complex of the GPI transamidase. GPI transamidase adds glycosylphosphatidylinositols (GPIs) to newly synthesized proteins. Gpi16 is an essential N-glycosylated transmembrane glycoprotein. Gpi16 is largely found on the lumenal side of the ER. It has a single C-terminal transmembrane domain and a small C-terminal, cytosolic extension with an ER retrieval motif.	526
397989	pfam04114	Gaa1	Gaa1-like, GPI transamidase component. GPI (glycosyl phosphatidyl inositol) transamidase is a multi-protein complex. Gpi16, Gpi8 and Gaa1 form a sub-complex of the GPI transamidase. GPI transamidase adds glycosylphosphatidylinositols (GPIs) to newly synthesized proteins.	496
397990	pfam04115	Ureidogly_lyase	Ureidoglycolate lyase. Ureidoglycolate lyase (EC:4.3.2.3) is one of the enzymes that acts upon ureidoglycolate, an intermediate of purine catabolism, releasing urea. The enzyme has in the past been wrongly assigned to EC:3.5.3.19, enzymes which release ammonia from ureidoglycolate.	162
397991	pfam04116	FA_hydroxylase	Fatty acid hydroxylase superfamily. This superfamily includes fatty acid and carotene hydroxylases and sterol desaturases. Beta-carotene hydroxylase is involved in zeaxanthin synthesis by hydroxylating beta-carotene, but the enzyme may be involved in other pathways. This family includes C-5 sterol desaturase and C-4 sterol methyl oxidase. Members of this family are involved in cholesterol biosynthesis and biosynthesis of plant cuticular wax. These enzymes contain two copies of a HXHH motif. Members of this family are integral membrane proteins.	134
397992	pfam04117	Mpv17_PMP22	Mpv17 / PMP22 family. The 22-kDa peroxisomal membrane protein (PMP22) is a major component of peroxisomal membranes. PMP22 seems to be involved in pore forming activity and may contribute to the unspecific permeability of the organelle membrane. PMP22 is synthesized on free cytosolic ribosomes and then directed to the peroxisome membrane by specific targeting information. Mpv17 is a closely related peroxisomal protein. In mouse, the Mpv17 protein is involved in the development of early-onset glomerulosclerosis. More recently a homolog of Mpv17 in S. cerevisiae has been found to be an integral membrane protein of the inner mitochondrial membrane where it has been proposed to have a role in ethanol metabolism and tolerance during heat-shock. Defects in MPV17 are associated with mitochondrial DNA depletion syndrome (MDDS) and Navajo neurohepatopathy (NNH). MDDS is a clinically heterogeneous group of disorders characterized by a reduction in mitochondrial DNA (mtDNA) copy number. Primary mtDNA depletion is inherited as an autosomal recessive trait and may affect single organs, typically muscle or liver, or multiple tissues. Individuals with the hepatocerebral form of mitochondrial DNA depletion syndrome have early progressive liver failure and neurologic abnormalities, hypoglycemia, and increased lactate in body fluids. NNH is an autosomal recessive disease that is prevalent among Navajo children in the South Western states of America. The major clinical features are hepatopathy, peripheral neuropathy, corneal anesthesia and scarring, acral mutilation, cerebral leukoencephalopathy, failure to thrive, and recurrent metabolic acidosis with intercurrent infections. Infantile, childhood, and classic forms of NNH have been described. Mitochondrial DNA depletion was detected in the livers of patients, suggesting a primary defect in mtDNA maintenance.	62
397993	pfam04118	Dopey_N	Dopey, N-terminal. DopA is the founding member of the Dopey family and is required for correct cell morphology and spatiotemporal organisation of multicellular structures in the filamentous fungus Aspergillus nidulans. DopA homologs are found in mammals. S. cerevisiae DOP1 is essential for viability and affects cellular morphogenesis.	297
397994	pfam04119	HSP9_HSP12	Heat shock protein 9/12. These heat shock proteins (Hsp9 and Hsp12) are strongly expressed, an increase of 100 fold, upon entry into stationary phase in yeast.	59
397995	pfam04120	Iron_permease	Low affinity iron permease. 	125
397996	pfam04121	Nup84_Nup100	Nuclear pore protein 84 / 107. Nup84p forms a complex with five proteins, including Nup120p, Nup85p, Sec13p, and a Sec13p homolog. This Nup84p complex in conjunction with Sec13-type proteins is required for correct nuclear pore biogenesis.	699
397997	pfam04122	CW_binding_2	Putative cell wall binding repeat 2. This repeat is found in multiple tandem copies in proteins including amidase enhancers and adhesins.	80
397998	pfam04123	DUF373	Domain of unknown function (DUF373). Archaeal domain of unknown function. Predicted to be an integral membrane protein with six transmembrane regions.	336
252394	pfam04124	Dor1	Dor1-like family. Dor1 is involved in vesicle targeting to the yeast Golgi apparatus and complexes with a number of other trafficking proteins, which include Sec34 and Sec35.	339
309308	pfam04126	Cyclophil_like	Cyclophilin-like. This domain has a cyclophilin-like fold, consisting of an eight-stranded beta-barrel with an alpha helix located between the beta-2 and beta-3 strands and a 310 helix located between the beta-7 and beta-8 strands. The catalytic site found in human cyclophilin is not conserved in this domain, suggesting a different function for this domain.	119
397999	pfam04127	DFP	DNA / pantothenate metabolism flavoprotein. The DNA/pantothenate metabolism flavoprotein (EC:4.1.1.36) affects synthesis of DNA, and pantothenate metabolism.	183
282044	pfam04129	Vps52	Vps52 / Sac2 family. Vps52 complexes with Vps53 and Vps54 to form a multi-subunit complex involved in regulating membrane trafficking events.	508
398000	pfam04130	Spc97_Spc98	Spc97 / Spc98 family. The spindle pole body (SPB) functions as the microtubule-organising centre in yeast. Members of this family are spindle pole body (SPB) components such as Spc97 and Spc98 that form a complex with gamma-tubulin. This family of proteins includes the grip motif 1 and grip motif 2. Members of this family all form components of the gamma-tubulin complex, GCP.	296
398001	pfam04131	NanE	Putative N-acetylmannosamine-6-phosphate epimerase. This family represents a putative ManNAc-6-P-to-GlcNAc-6P epimerase in the N-acetylmannosamine (ManNAc) utilisation pathway found mainly in pathogenic bacteria.	192
398002	pfam04133	Vps55	Vacuolar protein sorting 55. Vps55 is involved in the secretion of the Golgi form of the soluble vacuolar carboxypeptidase Y, but not the trafficking of the membrane-bound vacuolar alkaline phosphatase. Both Vps55 and obesity receptor gene-related protein are important for functioning membrane trafficking to the vacuole/lysosome of eukaryotic cells.	118
398003	pfam04134	DUF393	Protein of unknown function, DUF393. Members of this family have two highly conserved cysteine residues near their N-terminus. The function of these proteins is unknown.	113
398004	pfam04135	Nop10p	Nucleolar RNA-binding protein, Nop10p family. Nop10p is a nucleolar protein that is specifically associated with H/ACA snoRNAs. It is essential for normal 18S rRNA production and rRNA pseudouridylation by the ribonucleoprotein particles containing H/ACA snoRNAs (H/ACA snoRNPs). Nop10p is probably necessary for the stability of these RNPs.	51
398005	pfam04136	Sec34	Sec34-like family. Sec34 and Sec35 form a sub-complex, in a seven protein complex that includes Dor1 (pfam04124). This complex is thought to be important for tethering vesicles to the Golgi.	146
398006	pfam04137	ERO1	Endoplasmic Reticulum Oxidoreductin 1 (ERO1). Members of this family are required for the formation of disulphide bonds in the ER.	347
398007	pfam04138	GtrA	GtrA-like protein. Members of this family are predicted to be integral membrane proteins with three or four transmembrane spans. They are involved in the synthesis of cell surface polysaccharides. The GtrA family are a subset of this family. GtrA is predicted to be an integral membrane protein with 4 transmembrane spans. It is involved in O antigen modification by Shigella flexneri bacteriophage X (SfX), but does not determine the specificity of glucosylation. Its function remains unknown, but it may play a role in translocation of undecaprenyl phosphate linked glucose (UndP-Glc) across the cytoplasmic membrane. Another member of this family is a DTDP-glucose-4-keto-6-deoxy-D-glucose reductase, which catalyzes the conversion of dTDP-4-keto-6-deoxy-D-glucose to dTDP-D-fucose, which is involved in the biosynthesis of the serotype-specific polysaccharide antigen of Actinobacillus actinomycetemcomitans Y4 (serotype b). This family also includes the teichoic acid glycosylation protein, GtcA, which is a serotype-specific protein in some Listeria innocua and monocytogenes strains. Its exact function is not known, but it is essential for decoration of cell wall teichoic acids with glucose and galactose.	117
398008	pfam04139	Rad9	Rad9. Rad9 is required for transient cell-cycle arrests and transcriptional induction of DNA repair in response to DNA damage. It contains a Bcl-2 homology domain 3 (BH3).	250
367837	pfam04140	ICMT	Isoprenylcysteine carboxyl methyltransferase (ICMT) family. The isoprenylcysteine o-methyltransferase (EC:2.1.1.100) family carry out carboxyl methylation of cleaved eukaryotic proteins that terminate in a CaaX motif. In Saccharomyces cerevisiae this methylation is carried out by Ste14p, an integral endoplasmic reticulum membrane protein. Ste14p is the founding member of the isoprenylcysteine carboxyl methyltransferase (ICMT) family, whose members share significant sequence homology.	94
398009	pfam04142	Nuc_sug_transp	Nucleotide-sugar transporter. This family of membrane proteins transports nucleotide sugars from the cytoplasm into Golgi vesicles. SLC35A1 transports CMP-sialic acid, SLC35A2 transports UDP-galactose and SLC35A3 transports UDP-GlcNAc.	315
398010	pfam04143	Sulf_transp	Sulphur transport. This is an integral membrane protein. It is predicted to have a function in the transport of sulphur-containing molecules. It contains several conserved glycines and an invariant cysteine that is probably an important functional residue.	310
398011	pfam04144	SCAMP	SCAMP family. In vertebrates, secretory carrier membrane proteins (SCAMPs) 1-3 constitute a family of putative membrane-trafficking proteins composed of cytoplasmic N-terminal sequences with NPF repeats, four central transmembrane regions (TMRs), and a cytoplasmic tail. SCAMPs probably function in endocytosis by recruiting EH-domain proteins to the N-terminal NPF repeats but may have additional functions mediated by their other sequences.	172
398012	pfam04145	Ctr	Ctr copper transporter family. The redox active metal copper is an essential cofactor in critical biological processes such as respiration, iron transport, oxidative stress protection, hormone production, and pigmentation. A widely conserved family of high-affinity copper transport proteins (Ctr proteins) mediates copper uptake at the plasma membrane. A series of clustered methionine residues in the hydrophilic extracellular domain, and an MXXXM motif in the second transmembrane domain, are important for copper uptake. These methionines probably coordinate copper during the process of metal transport.	151
398013	pfam04146	YTH	YT521-B-like domain. A protein of the YTH family has been shown to selectively remove transcripts of meiosis-specific genes expressed in mitotic cells. It has been speculated that in higher eukaryotes YTH-family members may be involved in similar mechanisms to suppress gene regulation during gametogenesis or general silencing. The rat protein YT521-B is a tyrosine-phosphorylated nuclear protein that interacts with the nuclear transcriptosomal component scaffold attachment factor B, and the 68-kDa Src substrate associated during mitosis, Sam68. In vivo splicing assays demonstrated that YT521-B modulates alternative splice site selection in a concentration-dependent manner. The YTH domain has been identified as part of the PUA superfamily.	135
398014	pfam04147	Nop14	Nop14-like family. Emg1 and Nop14 are novel proteins whose interaction is required for the maturation of the 18S rRNA and for 40S ribosome production.	844
398015	pfam04148	Erv26	Transmembrane adaptor Erv26. Erv26 is an integral membrane protein that is packed into COPII vesicles and cycles between the ER and Golgi compartments. It directs pro-alkaline phosphatase into endoplasmic reticulum-derived COPII transport vesicles.	202
398016	pfam04149	DUF397	Domain of unknown function (DUF397). The function of this family is unknown.	50
398017	pfam04151	PPC	Bacterial pre-peptidase C-terminal domain. This domain is normally found at the C-terminus of secreted bacterial peptidases. They are not present in the active peptidase. It is possible that they fulfill a similar role to the PKD (pfam00801) domain, which also are found in this context. Visual analysis suggests that PKD and PPC are distantly related (personal obs:Bateman A, Yeats C).	68
398018	pfam04152	Mre11_DNA_bind	Mre11 DNA-binding presumed domain. The Mre11 complex is a multi-subunit nuclease that is composed of Mre11, Rad50 and Nbs1/Xrs2, and is involved in checkpoint signalling and DNA replication. Mre11 has an intrinsic DNA-binding activity that is stimulated by Rad50 on its own or in combination with Nbs1.	173
398019	pfam04153	NOT2_3_5	NOT2 / NOT3 / NOT5 family. NOT1, NOT2, NOT3, NOT4 and NOT5 form a nuclear complex that negatively regulates the basal and activated transcription of many genes. This family includes NOT2, NOT3 and NOT5.	124
398020	pfam04155	Ground-like	Ground-like domain. This family consists of the ground-like domain and is specific to C.elegans. It has been proposed that the ground-like domain containing proteins may bind and modulate the activity of Patched-like membrane molecules, reminiscent of the modulating activities of neuropeptides.	73
398021	pfam04157	EAP30	EAP30/Vps36 family. This family includes EAP30 as well as the Vps36 protein. Vps36 is involved in Golgi to endosome trafficking. EAP30 is a subunit of the ELL complex. The ELL is an 80-kDa RNA polymerase II transcription factor. ELL interacts with three other proteins to form the complex known as ELL complex. The ELL complex is capable of increasing the catalytic rate of transcription elongation, but is unable to repress initiation of transcription by RNA polymerase II as is the case of ELL. EAP30 is thought to lead to the derepression of ELL's transcriptional inhibitory activity.	210
398022	pfam04158	Sof1	Sof1-like domain. Sof1 is essential for cell growth and is a component of the nucleolar rRNA processing machinery.	87
282069	pfam04159	NB	NB glycoprotein. The NB glycoprotein is found in Influenza type B virus. Its function is unknown.	100
282070	pfam04160	Borrelia_orfX	Orf-X protein. This short protein has no known function and is found in Jaagsiekte sheep retrovirus. Jaagsiekte sheep retrovirus (JSRV) is the etiological agent of a contagious lung tumor of sheep known as sheep pulmonary adenomatosis. JSRV exhibits a simple genetic organisation, characteristic of the type D and type B retroviruses, with the canonical retroviral sequences gag, pro, pol and env encoding the structural proteins of the virion. An additional open reading frame (orf-x) of approximately 500 bp overlaps pol.	154
398023	pfam04161	Arv1	Arv1-like family. Arv1 is a transmembrane protein with potential zinc-binding motifs. ARV1 is a novel mediator of eukaryotic sterol homeostasis.	205
282072	pfam04162	Gyro_capsid	Gyrovirus capsid protein (VP1). Gyroviruses are small circular single stranded viruses. This family includes the VP1 protein from the chicken anaemia virus which is the viral capsid protein.	449
282073	pfam04163	Tht1	Tht1-like nuclear fusion protein. 	595
282074	pfam04165	DUF401	Protein of unknown function (DUF401). Members of this family are predicted to have 10 transmembrane regions.	393
398024	pfam04166	PdxA	Pyridoxal phosphate biosynthetic protein PdxA. In Escherichia coli the coenzyme pyridoxal 5'-phosphate is synthesized de novo by a pathway that is thought to involve the condensation of 4-(phosphohydroxy)-L-threonine and 1-deoxy-D-xylulose, catalyzed by the enzymes PdxA and PdxJ, to form either pyridoxine (vitamin B6) or pyridoxine 5'-phosphate.	251
398025	pfam04167	DUF402	Protein of unknown function (DUF402). Family member FomD is a predicted protein from a fosfomycin biosynthesis gene cluster in Streptomyces wedmorensis. Its function is unknown.	68
398026	pfam04168	Alpha-E	A predicted alpha-helical domain with a conserved ER motif. An uncharacterized alpha helical domain containing a highly conserved ER motif and typically found as a tandem duplication. Contextual analysis suggests that it functions in a distinct peptide synthesis/modification system comprising of a transglutaminase, a peptidase of the NTN-hydrolase superfamily, an active and inactive circularly permuted ATP-grasp domains and a transglutaminase fused N-terminal to a circularly permuted COOH-NH2 ligase domain.	303
398027	pfam04170	NlpE	NlpE N-terminal domain. This family represents a bacterial outer membrane lipoprotein that is necessary for signalling by the Cpx pathway. This pathway responds to cell envelope disturbances and increases the expression of periplasmic protein folding and degradation factors. While the molecular function of the NlpE protein is unknown, it may be involved in detecting bacterial adhesion to abiotic surfaces. In Escherichia coli and Salmonella typhi, NlpE is also known to confer copper tolerance in copper-sensitive strains of Escherichia coli, and may be involved in copper efflux and delivery of copper to copper-dependent enzymes.	78
398028	pfam04172	LrgB	LrgB-like family. The two products of the lrgAB operon are potential membrane proteins, and LrgA and LrgB are both thought to control murein hydrolase activity and penicillin tolerance.	206
282080	pfam04173	DoxD	TQO small subunit DoxD. DoxD is a subunit of the terminal quinol oxidase present in the plasma membrane of Acidianus ambivalens, with calculated molecular mass of 20.4 kDa. Thiosulphate:quinone oxidoreductase (TQO) is one of the early steps in elemental sulphur oxidation. A novel TQO enzyme was purified from the thermo-acidophilic archaeon Acidianus ambivalens and shown to consist of a large subunit (DoxD) and a smaller subunit (DoxA). The DoxD- and DoxA-like two subunits are fused together in a single polypeptide in BT_0515.	167
398029	pfam04174	CP_ATPgrasp_1	A circularly permuted ATPgrasp. An ATP-grasp family that is present both as catalytically active and inactive versions. Contextual analysis suggests that it functions in a distinct peptide synthesis/modification system that additionally contains a transglutaminase, an NTN-hydrolase, the Alpha-E domain, and a transglutaminase fused N-terminal to a circularly permuted COOH-NH2 ligase. The inactive forms are often fused N-terminal to the Alpha-E domain.	332
398030	pfam04175	DUF406	Protein of unknown function (DUF406). Members of this family appear to be found only in gamma proteobacteria. The function of this protein family is undetermined. The solved structures of the two members of this family investigated bear some resemblance to that of the single domain enzyme pterin-4a-carbinolamine dehydratase, PCD. Although the residues of PCDs involved in binding of metabolite are not conserved in the two structures under study, they do correspond to a surface-region structurally aligned with residues that are highly conserved, e.g. Glu 89, suggesting that this region is also involved in binding of a ligand, thereby possibly constituting a catalytic site of a yet uncharacterized enzyme specific for gamma proteobacteria.	91
398031	pfam04176	TIP41	TIP41-like family. The TOR signalling pathway activates a cell-growth program in response to nutrients. TIP41 interacts with TAP42 and negatively regulates the TOR signaling pathway.	168
398032	pfam04177	TAP42	TAP42-like family. The TOR signalling pathway activates a cell-growth program in response to nutrients. TIP41 (pfam04176) interacts with TAP42 and negatively regulates the TOR signaling pathway.	310
398033	pfam04178	Got1	Got1/Sft2-like family. Traffic through the yeast Golgi complex depends on a member of the syntaxin family of SNARE proteins, Sed5, present in early Golgi cisternae. Got1 is thought to facilitate Sed5-dependent fusion events. This is a family of sequences derived from eukaryotic proteins. They are similar to a region of a SNARE-like protein required for traffic through the Golgi complex, SFT2 protein. This is a conserved protein with four putative transmembrane helices, thought to be involved in vesicular transport in later Golgi compartments.	114
398034	pfam04179	Init_tRNA_PT	Rit1 DUSP-like domain. This enzyme (EC:2.4.2.-) modifies exclusively the initiator tRNA in position 64 using 5'-phosphoribosyl-1'-pyrophosphate as the modification donor. As the initiator tRNA participates both in the initiation and elongation of translation, the 2'-O-ribosyl phosphate modification discriminates the initiator tRNAs from the elongator tRNAs. This C-terminal domain shows similarity to dual specificity phosphatases.	110
398035	pfam04180	LTV	Low temperature viability protein. The low-temperature viability protein LTV1 is involved in ribosome biogenesis 40S subunit production.	409
398036	pfam04181	RPAP2_Rtr1	Rtr1/RPAP2 family. This family includes the human RPAP2 (RNAP II associated polypeptide) protein and the yeast Rtr1 protein. It has been suggested that this family of proteins are regulators of core RNA polymerase II function.	75
367858	pfam04182	B-block_TFIIIC	B-block binding subunit of TFIIIC. Yeast transcription factor IIIC (TFIIIC) is a multi-subunit protein complex that interacts with two control elements of class III promoters called the A and B blocks. This family represents the subunit within TFIIIC involved in B-block binding.	75
398037	pfam04183	IucA_IucC	IucA / IucC family. IucA and IucC catalyze discrete steps in biosynthesis of the siderophore aerobactin from N epsilon-acetyl-N epsilon-hydroxylysine and citrate. This family represents the N-terminal region. The C-terminal region appears to be related to iron transporter proteins.	210
398038	pfam04184	ST7	ST7 protein. The ST7 (for suppression of tumorigenicity 7) protein is thought to be a tumor suppressor gene. The molecular function of this protein is uncertain.	528
309350	pfam04185	Phosphoesterase	Phosphoesterase family. This family includes both bacterial phospholipase C enzymes (EC:3.1.4.3) and eukaryotic acid phosphatases (EC:3.1.3.2).	348
398039	pfam04186	FxsA	FxsA cytoplasmic membrane protein. This is a bacterial family of cytoplasmic membrane proteins. It includes two transmembrane regions. The molecular function of FxsA is unknown, but in Escherichia coli its over-expression has been shown to alleviate the exclusion of phage T7 in those cells with an F plasmid.	109
398040	pfam04187	Cofac_haem_bdg	Haem-binding uptake, Tiki superfamily, ChaN. This is a family of putative bacterial lipoproteins necessary for the uptake of haem-iron. The structure of UniProtKB:Q0PBW2, Structure 2g5g, comprises a large parallel beta-sheet with flanking alpha-helices and a smaller domain consisting of alpha-helices. Two cofacial haem groups (~3.5 Angstrom apart with an inter-iron distance of 4.4 Angstrom) bind in a pocket formed by a dimer of two ChaN monomers.	209
282094	pfam04188	Mannosyl_trans2	Mannosyltransferase (PIG-V). This is a family of eukaryotic ER membrane proteins that are involved in the synthesis of glycosylphosphatidylinositol (GPI), a glycolipid that anchors many proteins to the eukaryotic cell surface. Proteins in this family are involved in transferring the second mannose in the biosynthetic pathway of GPI.	439
398041	pfam04189	Gcd10p	Gcd10p family. eIF-3 is a multi-subunit complex that stimulates translation initiation in vitro at several different steps. This family corresponds to the gamma subunit of eIF3. The yeast protein Gcd10p has also been shown to be part of a complex with the methyltransferase Gcd14p that is involved in modifying tRNA.	292
398042	pfam04190	DUF410	Protein of unknown function (DUF410). This family of proteins is from Caenorhabditis elegans and has no known function. The protein has some GO references indicating that the protein has a positive regulation of growth rate and is involved in nematode larval development.	251
398043	pfam04191	PEMT	Phospholipid methyltransferase. The S. cerevisiae phospholipid methyltransferase (EC:2.1.1.16) has a broad substrate specificity of unsaturated phospholipids.	106
398044	pfam04192	Utp21	Utp21 specific WD40 associated putative domain. Utp21 is a subunit of U3 snoRNP, which is essential for synthesis of 18S rRNA.	230
398045	pfam04193	PQ-loop	PQ loop repeat. Members of this family are all membrane bound proteins possessing a pair of repeats each spanning two transmembrane helices connected by a loop. The PQ motif found on loop 2 is critical for the localization of cystinosin to lysosomes. However, the PQ motif appears not to be a general lysosome-targeting motif. It is thought likely to possess a more general function. Most probably this involves a glutamine residue. Family members are membrane transporters since two members, cystinosin and PQLC2, transport cystine and cationic amino acids, respectively, across the lysosomal membrane. The 2nd PQ-loop of cystinosin hosts the substrate-coupled H+ binding site underlying its H+ symport mechanism, suggesting that PQ-loop repeats have functional significance. It is thus likely that PQ-loop-containing proteins act as a family of membrane transporters. Some transport cystine and cationic amino acids, respectively, across the lysosomal membrane. Others transport lysine and/or arginine across the lysosomal membrane in order to maintain the acidic homoeostasis.	61
398046	pfam04194	PDCD2_C	Programmed cell death protein 2, C-terminal putative domain. 	126
398047	pfam04195	Transposase_28	Putative gypsy type transposon. This family of plant genes are thought to be related to gypsy type transposons.	70
282102	pfam04196	Bunya_RdRp	Bunyavirus RNA dependent RNA polymerase. The bunyaviruses are enveloped viruses with a genome consisting of 3 ssRNA segments (called L, M and S). The nucleocapsid protein is encoded on the small (S) genomic RNA. The L segment codes for an RNA polymerase. This family contains the RNA dependent RNA polymerase on the L segment.	739
398048	pfam04197	Birna_RdRp	Birnavirus RNA dependent RNA polymerase (VP1). Birnaviruses are dsRNA viruses. This family corresponds to the RNA dependent RNA polymerase. This protein is also known as VP1. All of the birnavirus VP1 proteins contain conserved RdRp motifs that reside in the catalytic "palm" domain of all classes of polymerases. However, the birnavirus RdRps lack the highly conserved Gly-Asp-Asp (GDD) sequence, a component of the proposed catalytic site of this enzyme family that exists in the conserved motif VI of the palm domain of other RdRps.	802
398049	pfam04198	Sugar-bind	Putative sugar-binding domain. This probable domain is found in bacterial transcriptional regulators such as DeoR and SorC. These proteins have an amino-terminal helix-turn-helix pfam00325 that binds to DNA. This domain is probably the ligand regulator binding region. SorC is regulated by sorbose and other members of this family are likely to be regulated by other sugar substrates.	256
398050	pfam04199	Cyclase	Putative cyclase. Proteins in this family are thought to be cyclase enzymes. They are found in proteins involved in antibiotic synthesis. However they are also found in organisms that do not make antibiotics pointing to a wider role for these proteins. The proteins contain a conserved motif HXGTHXDXPXH that is likely to form part of the active site.	158
398051	pfam04200	Lipoprotein_17	Lipoprotein associated domain. This presumed domain is about 100 amino acids in length. It is found in lipoprotein of unknown function and is greatly expanded in Mycoplasma pulmonis. The domain is found in up to five copies in some proteins. This family also includes the Mycoplasma arthritidis MAA2 variable surface protein. MAA2 is implicated in cytoadherence and virulence and has been shown to exhibit both size and phase variability.	84
398052	pfam04201	TPD52	tumor protein D52 family. The hD52 gene was originally identified through its elevated expression level in human breast carcinoma. Cloning of D52 homologs from other species has indicated that D52 may play roles in calcium-mediated signal transduction and cell proliferation. Two human homologs of hD52, hD53 and hD54, have also been identified, demonstrating the existence of a novel gene/protein family. These proteins have an amino terminal coiled-coil that allows members to form homo- and heterodimers with each other.	148
112992	pfam04202	Mfp-3	Foot protein 3. Mytilus foot protein-3 (Mfp-3) is a highly polymorphic protein family located in the byssal adhesive plaques of blue mussels.	71
398053	pfam04203	Sortase	Sortase family. The founder member of this family is S.aureus sortase, a transpeptidase that attaches surface proteins by the threonine of an LPXTG motif to the cell wall.	123
398054	pfam04204	HTS	Homoserine O-succinyltransferase. 	298
398055	pfam04205	FMN_bind	FMN-binding domain. This conserved region includes the FMN-binding site of the NqrC protein as well as the NosR and NirI regulatory proteins.	72
398056	pfam04206	MtrE	Tetrahydromethanopterin S-methyltransferase, subunit E. The N5-methyltetrahydromethanopterin: coenzyme M (EC:2.1.1.86) of Methanosarcina mazei Go1 is a membrane-associated, corrinoid-containing protein that uses a transmethylation reaction to drive an energy-conserving sodium ion pump.	271
398057	pfam04207	MtrD	Tetrahydromethanopterin S-methyltransferase, subunit D. The N5-methyltetrahydromethanopterin: coenzyme M (EC:2.1.1.86) of Methanosarcina mazei Go1 is a membrane-associated, corrinoid-containing protein that uses a transmethylation reaction to drive an energy-conserving sodium ion pump.	217
398058	pfam04208	MtrA	Tetrahydromethanopterin S-methyltransferase, subunit A. The N5-methyltetrahydromethanopterin: coenzyme M (EC:2.1.1.86) of Methanosarcina mazei Go1 is a membrane-associated, corrinoid-containing protein that uses a transmethylation reaction to drive an energy-conserving sodium ion pump.	165
398059	pfam04209	HgmA	homogentisate 1,2-dioxygenase. Homogentisate dioxygenase cleaves the aromatic ring during the metabolic degradation of Phe and Tyr. Homogentisate dioxygenase deficiency causes alkaptonuria. The structure of homogentisate dioxygenase shows that the enzyme forms a hexamer arrangement comprised of a dimer of trimers. The active site iron ion is coordinated near the interface between the trimers.	424
367872	pfam04210	MtrG	Tetrahydromethanopterin S-methyltransferase, subunit G. The N5-methyltetrahydromethanopterin: coenzyme M (EC:2.1.1.86) of Methanosarcina mazei Go1 is a membrane-associated, corrinoid-containing protein that uses a transmethylation reaction to drive an energy-conserving sodium ion pump.	64
398060	pfam04211	MtrC	Tetrahydromethanopterin S-methyltransferase, subunit C. The N5-methyltetrahydromethanopterin: coenzyme M (EC:2.1.1.86) of Methanosarcina mazei Go1 is a membrane-associated, corrinoid-containing protein that uses a transmethylation reaction to drive an energy-conserving sodium ion pump.	266
398061	pfam04212	MIT	MIT (microtubule interacting and transport) domain. The MIT domain forms an asymmetric three-helix bundle and binds ESCRT-III (endosomal sorting complexes required for transport) substrates.	66
398062	pfam04213	HtaA	HtaA. This domain is found in HtaA, a secreted protein implicated in iron acquisition and transport.	159
398063	pfam04214	DUF411	Protein of unknown function, DUF. The function of the members of this bacterial protein family is unknown. Some members may be involved in conferring cation resistance.	68
398064	pfam04216	FdhE	Protein involved in formate dehydrogenase formation. The function of these proteins is unknown. They may possibly be involved in the formation of formate dehydrogenase.	286
398065	pfam04217	DUF412	Protein of unknown function, DUF412. This family consists of bacterial proteins, including yfbV from E. coli. YfbV is a membrane protein involved in insulating the chromosome from the TerR macrodomain.	133
398066	pfam04218	CENP-B_N	CENP-B N-terminal DNA-binding domain. Centromere Protein B (CENP-B) is a DNA-binding protein localized to the centromere. Within the N-terminal 125 residues, there is a DNA-binding region, which binds to a corresponding 17bp CENP-B box sequence. CENP-B dimers either bind two separate DNA molecules or alternatively, they may bind two CENP-B boxes on one DNA molecule, with the intervening stretch of DNA forming a loop structure. The CENP-B DNA-binding domain consists of two repeating domains, RP1 and RP2. This family corresponds to RP1, which has been shown to consist of four helices in a helix-turn-helix structure.	53
398067	pfam04219	DUF413	Protein of unknown function, DUF. 	89
398068	pfam04220	YihI	Der GTPase activator (YihI). YihI activates the GTPase activity of Der, a 50S ribosomal subunit stability factor. The stimulation is specific to Der as YihI does not stimulate the GTPase activity of Era or ObgE. The interaction of YihI with Der requires only the C-terminal 78 amino acids of YihI. A yihI deletion mutant is viable and shows a shorter lag period, but the same post-lag growth rate as a wild-type strain. yihI is expressed during the lag period. Overexpression of yihI inhibits cell growth and biogenesis of the 50S ribosomal subunit. YihI is an unusual, highly hydrophilic protein with an uneven distribution of charged residues, resulting in an N-terminal region with high pI and a C-terminal region with low pI.	156
398069	pfam04221	RelB	RelB antitoxin. RelE and RelB form a toxin-antitoxin system. RelE represses translation, probably through binding ribosomes. RelB stably binds RelE, presumably deactivating it.	82
398070	pfam04222	DUF416	Protein of unknown function (DUF416). This is a bacterial protein family of unknown function. Proteins in this family adopt an alpha helical structure. Genome context analysis has suggested a high probability of a functional association with histidine kinases, which implicates proteins in this family to play a role in signalling (information from TOPSAN 2Q9R).	183
398071	pfam04223	CitF	Citrate lyase, alpha subunit (CitF). In citrate-utilising prokaryotes, citrate lyase EC:4.1.3.6 cleaves intracellular citrate into acetate and oxaloacetate, and is organized as a functional complex consisting of alpha, beta, and gamma subunits. The gamma subunit serves as an acyl carrier protein (ACP), and has a 2'-(5''-phosphoribosyl)-3'-dephospho-CoA prosthetic group. The citrate lyase is active only if this prosthetic group is acetylated; this acetylation is catalyzed by an acetate:SH-citrate lyase ligase. The alpha subunit substitutes citryl for the acetyl group to form citryl-S-ACP. The beta subunit completes the reaction by cleaving the citryl to yield oxaloacetate and (regenerated) acetyl-S-ACP. This family represents the alpha subunit EC:2.8.3.10.	466
282127	pfam04224	DUF417	Protein of unknown function, DUF417. This family of uncharacterized proteins appears to be restricted to proteobacteria.	175
282128	pfam04225	OapA	Opacity-associated protein A LysM-like domain. This family includes the Haemophilus influenzae opacity-associated protein. This protein is required for efficient nasopharyngeal mucosal colonisation, and its expression is associated with a distinctive transparent colony phenotype. OapA is thought to be a secreted protein, and its expression exhibits high-frequency phase variation. This is a LysM-like domain.	84
398072	pfam04226	Transgly_assoc	Transglycosylase associated protein. Bacterial protein, predicted to be an integral membrane protein. Some family members have been annotated as transglycosylase associated proteins, but no experimental evidence is provided. This family was annotated based on the information found for Escherichia coli YmgE.	49
398073	pfam04227	Indigoidine_A	Indigoidine synthase A like protein. Indigoidine is a blue pigment synthesized by Erwinia chrysanthemi implicated in pathogenicity and protection from oxidative stress. IdgA is involved in indigoidine biosynthesis, but its specific function is unknown. The recommended name for this protein is now pseudouridine-5'-phosphate glycosidase.	288
398074	pfam04228	Zn_peptidase	Putative neutral zinc metallopeptidase. Members of this family have a predicted zinc binding motif characteristic of neutral zinc metallopeptidases (Prosite:PDOC00129).	291
398075	pfam04229	GrpB	GrpB protein. This family has been suggested to belong to the nucleotidyltransferase superfamily. It occurs at the C-terminus of dephospho-CoA kinase (CoaE) in a number of cases, where it plays a role in the proper folding of the enzyme.	160
398076	pfam04230	PS_pyruv_trans	Polysaccharide pyruvyl transferase. Pyruvyl-transferases involved in peptidoglycan-associated polymer biosynthesis. CsaB in Bacillus anthracis is necessary for the non-covalent anchoring of proteins containing an SLH (S-layer homology) domain to peptidoglycan-associated pyruvylated polysaccharides. WcaK and AmsJ are involved in the biosynthesis of colanic acid in Escherichia coli and of amylovoran in Erwinia amylovora.	233
398077	pfam04231	Endonuclease_1	Endonuclease I. Bacterial periplasmic or secreted endonuclease I (EC:3.1.21.1). E. coli endonuclease I (EndoI) is a sequence independent endonuclease located in the periplasm. It is inhibited by different RNA species. It is thought to normally generate double strand breaks in DNA, except in the presence of high salt concentrations and RNA, when it generates single strand breaks in DNA. Its biological role is unknown. Other family members are known to be extracellular. This family also includes a non-specific, Mg2+ activated ribonuclease precursor.	235
398078	pfam04232	SpoVS	Stage V sporulation protein S (SpoVS). In Bacillus subtilis this protein interferes with sporulation at an early stage and this inhibitory effect is overcome by SpoIIB and SpoVG. SpoVS seems to play a positive role in allowing progression beyond stage V of sporulation. Null mutations in the spoVS gene block sporulation at stage V, impairing the development of heat resistance and coat assembly.	81
309383	pfam04233	Phage_Mu_F	Phage Mu protein F like protein. Members of this family are found in double-stranded DNA bacteriophages, and in some bacteria. A member of this family is required for viral head morphogenesis in bacteriophage SPP1. This family is possibly a minor head protein. This family may be related to the family TT_ORF1 (pfam02956).	110
398079	pfam04234	CopC	CopC domain. CopC is a bacterial blue copper protein that binds 1 atom of copper per protein molecule. Along with CopA, CopC mediates copper resistance by sequestration of copper in the periplasm.	93
398080	pfam04235	DUF418	Protein of unknown function (DUF418). Probable integral membrane protein.	163
398081	pfam04236	Transp_Tc5_C	Tc5 transposase C-terminal domain. This family corresponds to a C-terminal cysteine rich region that probably binds to a metal ion and could be DNA binding (pers. obs. A Bateman).	63
398082	pfam04237	YjbR	YjbR. YjbR has a CyaY-like fold.	91
398083	pfam04238	DUF420	Protein of unknown function (DUF420). Predicted membrane protein with four transmembrane helices.	130
398084	pfam04239	DUF421	Protein of unknown function (DUF421). YDFR family	119
398085	pfam04240	Caroten_synth	Carotenoid biosynthesis protein. The representative member of this family is CruF, a C50 carotenoid 2',3'-hydratase involved in the synthesis of the C50 carotenoid bacterioruberin in the halophilic archaeon Haloarcula japonica.	209
398086	pfam04241	DUF423	Protein of unknown function (DUF423). This family of proteins with unknown function is a possible integral membrane protein from Caenorhabditis elegans. This family of proteins has GO references indicating the protein is involved in nematode larval development and is a positive regulator of growth rate.	86
377260	pfam04242	DUF424	Protein of unknown function (DUF424). This is a family of uncharacterized proteins.	92
398087	pfam04244	DPRP	Deoxyribodipyrimidine photo-lyase-related protein. This family appears to be related to pfam00875.	221
398088	pfam04245	NA37	37-kD nucleoid-associated bacterial protein. 	311
398089	pfam04246	RseC_MucC	Positive regulator of sigma(E), RseC/MucC. This bacterial family of integral membrane proteins represents a positive regulator of the sigma(E) transcription factor, namely RseC/MucC. The sigma(E) transcription factor is up-regulated by cell envelope protein misfolding, and regulates the expression of genes that are collectively termed ECF (devoted to Extra-Cellular Functions). In Pseudomonas aeruginosa, de-repression of sigma(E) is associated with the alginate-overproducing phenotype characteristic of chronic respiratory tract colonisation in cystic fibrosis patients. The mechanism by which RseC/MucC positively regulates the sigma(E) transcription factor is unknown. RseC is also thought to have a role in thiamine biosynthesis in Salmonella typhimurium. In addition, this family also includes an N-terminal part of RnfF, a Rhodobacter capsulatus protein, of unknown function, that is essential for nitrogen fixation. This protein also contains an ApbE domain pfam02424, which is itself involved in thiamine biosynthesis.	129
398090	pfam04247	SirB	Invasion gene expression up-regulator, SirB. SirB up-regulates Salmonella typhimurium invasion gene transcription. It is, however, not essential for the expression of these genes. Its function is unknown.	120
398091	pfam04248	NTP_transf_9	Domain of unknown function (DUF427). This domain contains a beta-tent fold.	93
398092	pfam04250	DUF429	Protein of unknown function (DUF429). 	213
398093	pfam04252	RNA_Me_trans	Predicted SAM-dependent RNA methyltransferase. This family of proteins are predicted to be alpha/beta-knot SAM-dependent RNA methyltransferases.	200
398094	pfam04253	TFR_dimer	Transferrin receptor-like dimerization domain. This domain is involved in dimerization of the transferrin receptor as shown in its crystal structure.	118
398095	pfam04254	DUF432	Protein of unknown function (DUF432). Archaeal protein of unknown function.	120
398096	pfam04255	DUF433	Protein of unknown function (DUF433). 	54
398097	pfam04256	DUF434	Protein of unknown function (DUF434). 	55
398098	pfam04257	Exonuc_V_gamma	Exodeoxyribonuclease V, gamma subunit. The Exodeoxyribonuclease V enzyme is a multi-subunit enzyme comprised of the proteins RecB, RecC (this family) and RecD. This enzyme plays an important role in homologous genetic recombination, repair of double strand DNA breaks, and resistance to UV irradiation and chemical DNA-damage. The enzyme (EC:3.1.11.5) catalyzes ssDNA or dsDNA-dependent ATP hydrolysis, hydrolysis of ssDNA or dsDNA and unwinding of dsDNA. This family consists of two AAA domains.	762
282158	pfam04258	Peptidase_A22B	Signal peptide peptidase. The members of this family are membrane proteins. In some proteins this region is found associated with pfam02225. This family corresponds with Merops subfamily A22B, the type example of which is signal peptide peptidase. There is a sequence-similarity relationship with pfam01080.	286
367886	pfam04259	SASP_gamma	Small, acid-soluble spore protein, gamma-type. The SASP family is a family of small, glutamine and asparagine-rich peptides that store amino acids in the spores of Bacillus subtilis and related bacteria.	84
398099	pfam04260	DUF436	Protein of unknown function (DUF436). Family of bacterial proteins with undetermined function.	170
398100	pfam04261	Dyp_perox	Dyp-type peroxidase family. This family of dye-decolourising peroxidases lacks a typical heme-binding region.	315
398101	pfam04262	Glu_cys_ligase	Glutamate-cysteine ligase. Family of bacterial glutamate-cysteine ligases (EC:6.3.2.2) that carry out the first step of the glutathione biosynthesis pathway.	371
398102	pfam04263	TPK_catalytic	Thiamin pyrophosphokinase, catalytic domain. Family of thiamin pyrophosphokinase (EC:2.7.6.2). Thiamin pyrophosphokinase (TPK) catalyzes the transfer of a pyrophosphate group from ATP to vitamin B1 (thiamin) to form the coenzyme thiamin pyrophosphate (TPP). Thus, TPK is important for the formation of a coenzyme required for central metabolic functions. The structure of thiamin pyrophosphokinase suggests that the enzyme may operate by a mechanism of pyrophosphoryl transfer similar to those described for pyrophosphokinases functioning in nucleotide biosynthesis.	112
398103	pfam04264	YceI	YceI-like domain. E. coli YceI is a base-induced periplasmic protein. The recent structure of a member of this family shows that it binds to poly-isoprenoid. The structure consists of an extended, eight-stranded, antiparallel beta-barrel that resembles the lipocalin fold.	101
398104	pfam04265	TPK_B1_binding	Thiamin pyrophosphokinase, vitamin B1 binding domain. Family of thiamin pyrophosphokinase (EC:2.7.6.2). Thiamin pyrophosphokinase (TPK) catalyzes the transfer of a pyrophosphate group from ATP to vitamin B1 (thiamin) to form the coenzyme thiamin pyrophosphate (TPP). Thus, TPK is important for the formation of a coenzyme required for central metabolic functions. The structure of thiamin pyrophosphokinase suggests that the enzyme may operate by a mechanism of pyrophosphoryl transfer similar to those described for pyrophosphokinases functioning in nucleotide biosynthesis.	66
398105	pfam04266	ASCH	ASCH domain. The ASCH domain adopts a beta-barrel fold similar to the pfam01472 domain. It is thought to function as an RNA-binding domain during coactivation, RNA-processing and possibly during prokaryotic translation regulation.	102
398106	pfam04267	SoxD	Sarcosine oxidase, delta subunit family. Sarcosine oxidase is a hetero-tetrameric enzyme that contains both covalently bound FMN and non-covalently bound FAD and NAD(+). This enzyme catalyzes the oxidative demethylation of sarcosine to yield glycine, H2O2, and 5,10-CH2-tetrahydrofolate (H4folate) in a reaction requiring H4folate and O2.	80
398107	pfam04268	SoxG	Sarcosine oxidase, gamma subunit family. Sarcosine oxidase is a hetero-tetrameric enzyme that contains both covalently bound FMN and non-covalently bound FAD and NAD(+). This enzyme catalyzes the oxidative demethylation of sarcosine to yield glycine, H2O2, and 5,10-CH2-tetrahydrofolate (H4folate) in a reaction requiring H4folate and O2.	152
398108	pfam04269	DUF440	Protein of unknown function, DUF440. This family consists of uncharacterized bacterial proteins.	101
398109	pfam04270	Strep_his_triad	Streptococcal histidine triad protein. All members of this family are proteins from Streptococcal species. The proteins are characterized by having a HxxHxH motif that usually occurs multiple times throughout the protein. The histidine triad is predicted to bind metal cations, in particular Zn2+. The zinc is transferred, on the surface of the streptococcus from the Strep_his_triad protein, a zinc scavenger, to apo-ADCAII, a cell-surface lipoprotein transporter that leads to Zn2+ uptake into the bacterium.	51
367891	pfam04272	Phospholamban	Phospholamban. The regulation of calcium levels across the membrane of the sarcoplasmic reticulum involves the interplay of many membrane proteins. Phospholamban is a 52 residue integral membrane protein that is involved in reversibly inhibiting the Ca(2+) pump and regulating the flow of Ca ions across the sarcoplasmic reticulum membrane during muscle contraction and relaxation. Phospholamban is thought to form a pentamer in the membrane.	52
398110	pfam04273	DUF442	Putative phosphatase (DUF442). Although this domain is uncharacterized it seems likely that it performs a phosphatase function.	110
398111	pfam04275	P-mevalo_kinase	Phosphomevalonate kinase. Phosphomevalonate kinase (EC:2.7.4.2) catalyzes the phosphorylation of 5-phosphomevalonate into 5-diphosphomevalonate, an essential step in isoprenoid biosynthesis via the mevalonate pathway. This family represents the animal type of the enzyme. The other is the ERG8 type, found in plants and fungi, and some bacteria (see pfam00288).	111
398112	pfam04276	DUF443	Protein of unknown function (DUF443). Family of uncharacterized proteins.	202
398113	pfam04277	OAD_gamma	Oxaloacetate decarboxylase, gamma chain. 	76
398114	pfam04278	Tic22	Tic22-like family. The preprotein translocation at the inner envelope membrane of chloroplasts so far involves five proteins: Tic110, Tic55, Tic40, Tic22 (this family) and Tic20. The molecular function of these proteins has not yet been established.	244
398115	pfam04279	IspA	Intracellular septation protein A. 	176
398116	pfam04280	Tim44	Tim44-like domain. Tim44 is an essential component of the machinery that mediates the translocation of nuclear-encoded proteins across the mitochondrial inner membrane. Tim44 is thought to bind phospholipids of the mitochondrial inner membrane both by electrostatic interactions and by penetrating the polar head group region. This family includes the C-terminal region of Tim44 that has been shown to form a stable proteolytic fragment in yeast. This region is also found in a set of smaller bacterial proteins. The molecular function of the bacterial members of this family is unknown but transport seems likely. The crystal structure of the C terminal of Tim44 has revealed a large hydrophobic pocket which might play an important role in interacting with the acyl chains of lipid molecules in the mitochondrial membrane.	144
398117	pfam04281	Tom22	Mitochondrial import receptor subunit Tom22. The mitochondrial protein translocase family, which is responsible for movement of nuclear encoded pre-proteins into mitochondria, is very complex with at least 19 components. These proteins include several chaperone proteins, four proteins of the outer membrane translocase (Tom) import receptor, five proteins of the Tom channel complex, five proteins of the inner membrane translocase (Tim) and three "motor" proteins. This family represents the Tom22 proteins. The N terminal region of Tom22 has been shown to have chaperone-like activity, and the C terminal region faces the intermembrane face.	137
398118	pfam04282	DUF438	Family of unknown function (DUF438). 	67
398119	pfam04283	CheF-arch	Chemotaxis signal transduction system protein F from archaea. This is a family of proteins that are archaea-specific components of the bacterial-like chemotaxis signal transduction system of archaea. In H. salinarum, the CheF proteins interact with the chemotaxis proteins CheY, CheD and CheC2 as well as the flagella-accessory proteins FlaCE and FlaD, and are essential for any tactic response. CheF probably functions at the interface between the bacterial-like chemotaxis signal transduction system and the archaeal flagellar apparatus.	217
398120	pfam04284	DUF441	Protein of unknown function (DUF441). Predicted to be an integral membrane protein.	139
398121	pfam04285	DUF444	Protein of unknown function (DUF444). Bacterial protein of unknown function. One family member is predicted to contain a von Willebrand factor (vWF) type A domain (Smart:VWA).	409
398122	pfam04286	DUF445	Protein of unknown function (DUF445). Predicted to be a membrane protein.	368
398123	pfam04287	DUF446	tRNA pseudouridine synthase C. This family is suggested to be the catalytic domain of tRNA pseudouridine synthase C by association. The structure has been solved for one member, as Structure 2HGK, which by inference is designated in this way.	97
398124	pfam04288	MukE	MukE-like family. Bacterial protein involved in chromosome partitioning, MukE	229
398125	pfam04289	DUF447	Protein of unknown function (DUF447). Archaeal protein of unknown function. A fungal member UniProtKB:M2LN89 is clearly a Flavine-reductase enzyme by homology, and UniProtKB:O28442 has been shown to bind riboflavin 5'-phosphate (unpublished structural Xray analysis).	174
377284	pfam04290	DctQ	Tripartite ATP-independent periplasmic transporters, DctQ component. The function of the members of this family is unknown, but DctQ homologs are invariably found in the tripartite ATP-independent periplasmic transporters.	131
398126	pfam04293	SpoVR	SpoVR like protein. Bacillus subtilis stage V sporulation protein R is involved in spore cortex formation. Little is known about cortex biosynthesis, except that it depends on several sigma E controlled genes, including spoVR.	419
398127	pfam04294	VanW	VanW like protein. Family members include vancomycin resistance protein W (VanW). Genes encoding members of this family have been found in vancomycin resistance gene clusters vanB and vanG. The function of VanW is unknown.	130
398128	pfam04295	GD_AH_C	D-galactarate dehydratase / Altronate hydrolase, C-terminus. Family members include the C termini of D-galactarate dehydratase (EC:4.2.1.42) which is thought to catalyze the reaction D-galactarate = 5-keto-4-deoxy-D-glucarate + H2O, and altronate hydrolase (altronic acid hydratase, EC:4.2.1.7), which catalyzes D-altronate = 2-keto-2-deoxygluconate + H2O. As purified, both enzymes are catalytically inactive in the absence of added Fe2+, Mn2+, and beta-mercaptoethanol. Synergistic activation of altronate hydrolase activity is seen in the presence of both iron and manganese ions, suggesting that the enzyme may have two ion binding sites. Mn2+ appears to be part of the enzyme active centre, but the function of the single bound Fe2+ ion is unknown. The hydratase has no Fe-S core.	393
398129	pfam04296	DUF448	Protein of unknown function (DUF448). 	74
282193	pfam04297	UPF0122	Putative helix-turn-helix protein, YlxM / p13 like. Members of this family are predicted to contain a helix-turn-helix motif, for example residues 37-55 in Mycoplasma mycoides p13. Genes encoding family members are often part of operons that encode components of the SRP pathway, and this protein may regulate the expression of an operon related to the SRP pathway.	98
398130	pfam04298	Zn_peptidase_2	Putative neutral zinc metallopeptidase. Zinc metallopeptidase zinc binding regions have been predicted in some family members by a pattern match (Prosite:PS00142), of the characteristic HEXXH motif.	216
398131	pfam04299	FMN_bind_2	Putative FMN-binding domain. In Bacillus subtilis, family member PAI 2/ORF-2 was found to be essential for growth. The SUPERFAMILY database finds that this domain is related to FMN-binding domains, suggesting this protein is also FMN-binding.	167
398132	pfam04300	FBA	F-box associated region. Members of this family are associated with F-box domains, hence the name FBA. This domain is probably involved in binding other proteins that will be targeted for ubiquitination. FBXO2 is involved in binding to N-glycosylated proteins.	177
113085	pfam04301	DUF452	Protein of unknown function (DUF452). 	213
398133	pfam04303	PrpF	PrpF protein. PrpF is a protein found in the 2-methylcitrate pathway. It is structurally similar to DAP epimerase and proline racemase. This protein likely acts to isomerise trans-aconitate to cis-aconitate.	384
398134	pfam04304	DUF454	Protein of unknown function (DUF454). Predicted membrane protein.	115
398135	pfam04305	DUF455	Protein of unknown function (DUF455). 	243
398136	pfam04306	DUF456	Protein of unknown function (DUF456). This family is a putative membrane protein that contains glycine zipper motifs.	135
398137	pfam04307	YdjM	LexA-binding, inner membrane-associated putative hydrolase. YdjM is a family of putative LexA-binding proteins. Members are predicted to be membrane-bound metal-dependent hydrolases that may be acting as phospholipases. It is a member of the SOS network, that rescues cells from UV and other DNA-damage. Expression of YdjM is regulated by LexA.	173
398138	pfam04308	RNaseH_like	Ribonuclease H-like. RNaseH_like is a family of uncharacterized eubacterial proteins that are distant homologs of Ribonuclease H-like. The family maintains all the core secondary structure elements of the RNase H-like fold and shares several conserved, presumably active site residues with RNase HI. This finding suggests that it functions as a nuclease.	145
398139	pfam04309	G3P_antiterm	Glycerol-3-phosphate responsive antiterminator. Intracellular glycerol is usually converted to glycerol-3-phosphate in an ATP-requiring phosphorylation reaction catalyzed by glycerol kinase (GlpK). Glycerol-3-phosphate activates the antiterminator GlpP.	173
398140	pfam04310	MukB	MukB N-terminal. This family represents the N-terminal region of MukB, one of a group of bacterial proteins essential for the movement of nucleoids from mid-cell towards the cell quarters (i.e. chromosome partitioning). The structure of the N-terminal domain consists of an antiparallel six-stranded beta sheet surrounded by one helix on one side and by five helices on the other side. It contains an exposed Walker A loop in an unexpected helix-loop-helix motif (in other proteins, Walker A motifs generally adopt a P loop conformation as part of a strand-loop-helix motif embedded in a conserved topology of alternating helices and (parallel) beta strands).	226
398141	pfam04311	DUF459	Protein of unknown function (DUF459). Putative periplasmic protein.	322
398142	pfam04312	DUF460	Protein of unknown function (DUF460). Archaeal protein of unknown function.	135
398143	pfam04313	HSDR_N	Type I restriction enzyme R protein N-terminus (HSDR_N). This family consists of a number of N terminal regions found in type I restriction enzyme R (HSDR) proteins. Restriction and modification (R/M) systems are found in a wide variety of prokaryotes and are thought to protect the host bacterium from the uptake of foreign DNA. Type I restriction and modification systems are encoded by three genes: hsdR, hsdM, and hsdS. The three polypeptides, HsdR, HsdM, and HsdS, often assemble to give an enzyme (R2M2S1) that modifies hemimethylated DNA and restricts unmethylated DNA.	139
398144	pfam04314	PCuAC	Copper chaperone PCu(A)C. PCu(A)C is a periplasmic copper chaperone. Its role may be to capture and transfer copper to two other copper chaperones, PrrC and Cox11, which in turn deliver Cu(I) to cytochrome c oxidase.	107
398145	pfam04315	EpmC	Elongation factor P hydroxylase. This family catalyzes the final step in the elongation factor P modification pathway. It hydroxylates Lys-34 of elongation factor P. Members of this family have a conserved HEXXH motif, suggesting they are putative peptidases of zincin fold.	162
398146	pfam04316	FlgM	Anti-sigma-28 factor, FlgM. FlgM binds and inhibits the activity of the transcription factor sigma 28. Inhibition of sigma 28 prevents the expression of genes from flagellar transcriptional class 3, which include genes for the filament and chemotaxis. Correctly assembled basal body-hook structures export FlgM, relieving inhibition of sigma 28 and allowing expression of class 3 genes. NMR studies show that free FlgM is mostly unfolded, which may facilitate its export. The C terminal half of FlgM adopts a tertiary structure when it binds to sigma 28. All mutations in FlgM that prevent sigma 28 inhibition affect the C-terminal domain, which is the region thought to constitute the binding domain. A minimal binding domain has been identified between Glu 64 and Arg 88 in Salmonella typhimurium. The N-terminal portion remains unstructured and may be necessary for recognition by the export machinery.	54
398147	pfam04317	DUF463	YcjX-like family, DUF463. These proteins possess a P-loop motif.	443
367903	pfam04318	DUF468	Protein of unknown function (DUF468). These conserved ORFs are probably not translated into protein [Personal communication, Val Wood].	84
398148	pfam04319	NifZ	NifZ domain. This short protein is found in the nif (nitrogen fixation) operon. Its function is unknown but is probably involved in nitrogen fixation or regulating some component of this process. This 75 residue region is presumed to be a domain. It is found in isolation in some members and in the amino terminal half of the longer NifZ proteins.	70
398149	pfam04320	DUF469	Protein with unknown function (DUF469). Family of bacterial proteins with no known function.	102
398150	pfam04321	RmlD_sub_bind	RmlD substrate binding domain. L-rhamnose is a saccharide required for the virulence of some bacteria. Its precursor, dTDP-L-rhamnose, is synthesized by four different enzymes the final one of which is RmlD. The RmlD substrate binding domain is responsible for binding a sugar nucleotide.	284
398151	pfam04322	DUF473	Protein of unknown function (DUF473). Family of uncharacterized Archaeal proteins.	118
398152	pfam04324	Fer2_BFD	BFD-like [2Fe-2S] binding domain. The two Fe ions are each coordinated by two conserved cysteine residues. This domain occurs alone in small proteins such as Bacterioferritin-associated ferredoxin (BFD). The function of BFD is not known, but it may be a general redox and/or regulatory component involved in the iron storage or mobilisation functions of bacterioferritin in bacteria. This domain is also found in nitrate reductase proteins in association with Nitrite and sulphite reductase 4Fe-4S domain (pfam01077), Nitrite/Sulfite reductase ferredoxin-like half domain (pfam03460) and Pyridine nucleotide-disulphide oxidoreductase (pfam00070). It is also found in NifU nitrogen fixation proteins, in association with NifU-like N terminal domain (pfam01592) and NifU-like domain (pfam01106).	50
398153	pfam04325	DUF465	Protein of unknown function (DUF465). Family members are found in small bacterial proteins, and also in the heavy chains of eukaryotic myosin and kinesin, C terminal of the motor domain (Myosin pfam00063, Kinesin pfam00225). Members of this family may form coiled coil structures.	48
398154	pfam04326	AlbA_2	Putative DNA-binding domain. This family belongs to the AlbA clan of DNA-binding domains.	116
398155	pfam04327	Peptidase_Prp	Cysteine protease Prp. This is a family of cysteine proteases that are found to cleave the N-terminal extension of ribosomal subunit L27 in eubacteria. Proteins in this family are distinguished by a pair of invariant histidine and cysteine residues with conserved spacing that form the classic catalytic dyad of a cysteine protease.	102
398156	pfam04328	Sel_put	Selenoprotein, putative. This entry includes a group of putative selenoproteins from Proteobacteria, Actinobacteria and Firmicutes. The invariant cysteine at the C-terminus is encoded by a TGA Sec codon in some Epsilonproteobacteria, suggesting a redox activity for the protein.	61
367905	pfam04332	DUF475	Protein of unknown function (DUF475). Predicted to be an integral membrane protein with multiple membrane spans.	295
398157	pfam04333	MlaA	MlaA lipoprotein. MlaA is a component of the Mla pathway, an ABC transport system that functions to maintain the asymmetry of the outer membrane. MlaA is required for the intercellular spreading of Shigella flexneri. It is attached to the outer membrane by a lipid anchor.	194
113117	pfam04334	DUF478	Protein of unknown function (DUF478). This family contains uncharacterized protein encoded on Trypanosoma kinetoplast minicircles.	68
398158	pfam04335	VirB8	VirB8 protein. VirB8 is a bacterial virulence protein with cytoplasmic, transmembrane, and periplasmic regions. It is thought that it is a primary constituent of a DNA transporter. The periplasmic region interacts with VirB9, VirB10, and itself. This family also includes the conjugal transfer protein family TrbF, a family of proteins known to be involved in conjugal transfer. The TrbF protein is thought to compose part of the pilus required for transfer. This domain has a similar fold to the NTF2 protein.	212
398159	pfam04336	ACP_PD	Acyl carrier protein phosphodiesterase. YajB, now renamed acpH, encodes an ACP hydrolase that converts holo-ACP to apo-ACP by hydrolytic cleavage of the phosphopantetheine prosthetic group from ACP.	105
398160	pfam04337	DUF480	Protein of unknown function, DUF480. This family consists of several proteins of uncharacterized function.	149
398161	pfam04338	DUF481	Protein of unknown function, DUF481. This family includes several proteins of uncharacterized function.	211
398162	pfam04339	FemAB_like	Peptidoglycan biosynthesis/recognition. FemAB_like is a family of both bacterial and Viridiplantae proteins with responsibility for building interpeptide bridges in peptidoglycan. Such a function is feasible for bacteria but less likely for the plant members of this family. Perhaps the plant members are using homologous proteins to recognize bacterial peptidoglycans as part of their innate immune system.	369
282228	pfam04340	DUF484	Protein of unknown function, DUF484. This family consists of several proteins of uncharacterized function.	219
398163	pfam04341	DUF485	Protein of unknown function, DUF485. This family includes several putative integral membrane proteins.	89
398164	pfam04342	DMT_6	Putative member of DMT superfamily (DUF486). This family contains several proteins of uncharacterized function. The family is represented in the Transport classification database as 2.A.7.34, though the exact nature of what is transported is not known.	104
377317	pfam04343	DUF488	Protein of unknown function, DUF488. This family includes several proteins of uncharacterized function.	123
398165	pfam04344	CheZ	Chemotaxis phosphatase, CheZ. This family represents the bacterial chemotaxis phosphatase, CheZ. This protein forms a dimer characterized by a long four-helix bundle, composed of two helices from each monomer. CheZ dephosphorylates CheY in a reaction that is essential to maintain a continuous chemotactic response to environmental changes. It is thought that CheZ's conserved residue Gln 147 orientates a water molecule for nucleophilic attack at the CheY active site.	204
113128	pfam04345	Chor_lyase	Chorismate lyase. Chorismate lyase catalyzes the first step in ubiquinone synthesis, i.e. the removal of pyruvate from chorismate, to yield 4-hydroxybenzoate.	168
398166	pfam04346	EutH	Ethanolamine utilisation protein, EutH. EutH is a bacterial membrane protein whose molecular function is unknown. It has been suggested that it may act as an ethanolamine transporter, responsible for carrying ethanolamine from the periplasm to the cytoplasm.	351
398167	pfam04347	FliO	Flagellar biosynthesis protein, FliO. FliO is an essential component of the flagellum-specific protein export apparatus. It is an integral membrane protein. Its precise molecular function is unknown.	88
367906	pfam04348	LppC	LppC putative lipoprotein. This family includes several bacterial outer membrane antigens, whose molecular function is unknown.	532
398168	pfam04349	MdoG	Periplasmic glucan biosynthesis protein, MdoG. This family represents MdoG, a protein that is necessary for the synthesis of periplasmic glucans. The function of MdoG remains unknown. It has been suggested that it may catalyze the addition of branches to a linear glucan backbone.	474
309474	pfam04350	PilO	Pilus assembly protein, PilO. PilO proteins are involved in the assembly of pilin. However, the precise function of this family of proteins is not known.	145
398169	pfam04351	PilP	Pilus assembly protein, PilP. The PilP family are periplasmic proteins involved in the biogenesis of type IV pili.	148
398170	pfam04352	ProQ	ProQ/FINO family. This family includes ProQ, which is required for full activation of the osmoprotectant transporter, ProP, in Escherichia coli. This family includes several bacterial fertility inhibition (FINO) proteins. The conjugative transfer of F-like plasmids is repressed by FinO, an RNA binding protein. FinO interacts with the F-plasmid encoded traJ mRNA and its antisense RNA, FinP, stabilizing FinP against endonucleolytic degradation and facilitating sense-antisense RNA recognition. ProQ operates as an RNA-chaperone, binding RNA and bringing about both RNA strand-exchange and RNA duplexing. This suggests that in fact it does not regulate ProP transcription but rather regulates ProP translation through activity as an RNA-binding protein.	106
398171	pfam04353	Rsd_AlgQ	Regulator of RNA polymerase sigma(70) subunit, Rsd/AlgQ. This family includes bacterial transcriptional regulators that are thought to act through an interaction with the conserved region 4 of the sigma(70) subunit of RNA polymerase. The Pseudomonas aeruginosa homolog, AlgQ, positively regulates virulence gene expression and is associated with the mucoid phenotype observed in Pseudomonas aeruginosa isolates from cystic fibrosis patients.	149
398172	pfam04354	ZipA_C	ZipA, C-terminal FtsZ-binding domain. This family represents the ZipA C-terminal domain. ZipA is involved in septum formation in bacterial cell division. Its C-terminal domain binds FtsZ, a major component of the bacterial septal ring. The structure of this domain is an alpha-beta fold with three alpha helices and a beta sheet of six antiparallel beta strands. The major loops protruding from the beta sheet surface are thought to form a binding site for FtsZ.	127
398173	pfam04355	SmpA_OmlA	SmpA / OmlA family. Bacterial outer membrane lipoprotein, possibly involved in maintaining the structural integrity of the cell envelope. Lipid attachment site is a conserved N terminal cysteine residue. Sometimes found adjacent to the OmpA domain (pfam00691).	69
398174	pfam04356	DUF489	Protein of unknown function (DUF489). Protein of unknown function, cotranscribed with purB in Escherichia coli, but with function unrelated to purine biosynthesis.	192
398175	pfam04357	TamB	TamB, inner membrane protein subunit of TAM complex. TamB is an integral inner membrane protein that forms a complex - the translocation and assembly module or TAM - with the outer membrane protein, TamA. TAM is responsible for the efficient secretion of the adhesin protein Ag43 in E.coli K-12.	383
398176	pfam04358	DsrC	DsrC like protein. Family member DsvC has been observed to co-purify with Desulfovibrio vulgaris dissimilatory sulfite reductase, and many members of this family are annotated as the third (gamma) subunit of dissimilatory sulphite reductase. However, this protein appears to be only loosely associated to the sulfite reductase, which suggests that DsrC may not be an integral part of the dissimilatory sulphite reductase. Members of this family are found in organisms such as E. coli and H. influenzae which do not contain dissimilatory sulphite reductases but can synthesize assimilatory sirohaem sulphite and nitrite reductases. It is speculated that DsrC may be involved in the assembly, folding or stabilisation of sirohaem proteins. The strictly conserved cysteine in the C-terminus suggests that DsrC may have a catalytic function in the metabolism of sulphur compounds.	103
398177	pfam04359	DUF493	Protein of unknown function (DUF493). This domain is likely to act in a regulatory capacity like pfam01842 domains. This domain has a remarkable property in that the C-terminal residue of every protein in the family lines up in the alignment. This suggests that the C-terminal residue plays some important functional role (Bateman A pers obs).	83
398178	pfam04360	Serglycin	Serglycin. Serglycin is the most prevalent proteoglycan produced in haemopoietic cells. Serglycin is a proteinase resistant secretory granule proteoglycan.	148
398179	pfam04361	DUF494	Protein of unknown function (DUF494). Members of this family of uncharacterized proteins are often named Smg.	153
398180	pfam04362	Iron_traffic	Bacterial Fe(2+) trafficking. This is a family of bacterial Fe(2+) trafficking proteins.	86
398181	pfam04363	DUF496	Protein of unknown function (DUF496). 	93
398182	pfam04364	DNA_pol3_chi	DNA polymerase III chi subunit, HolC. The DNA polymerase III holoenzyme (EC:2.7.7.7) is the polymerase responsible for the replication of the Escherichia coli chromosome. The holoenzyme is composed of the DNA polymerase III core, the sliding clamp, and the DnaX clamp loading complex. The DnaX complex contains either the tau or gamma product of gene dnax, complexed to delta.delta' and to chi psi. Chi forms a 1:1 heterodimer with psi. The chi psi complex functions by increasing the affinity of tau and gamma for delta.delta' allowing a functional clamp-loading complex to form at physiological subunit concentrations. Psi is responsible for the interaction with DnaX (gamma/tau), but psi is insoluble unless it is in a complex with chi.	135
398183	pfam04365	BrnT_toxin	Ribonuclease toxin, BrnT, of type II toxin-antitoxin system. BrnT is a ribonuclease toxin of a type II toxin-antitoxin system that exhibits a RelE-like fold. The antitoxin that neutralizes this toxin is pfam14384. BrnT is found in bacteria, archaea, bacteriophage, and plasmids. BrnT-BrnA forms a 2:2 tetrameric complex and autoregulates its own expression, which is induced by a number of different environmental stresses. Expression of BrnT alone results in cessation of bacterial growth which can be rescued after subsequent expression of BrnA.	77
398184	pfam04366	Ysc84	Las17-binding protein actin regulator. Ysc84 is a family of Las17-binding proteins found in metazoa. Together, Las17 and Ysc84 are essential for proper polymerization of actin; Ysc84 is able to bind to and stabilize the actin dimer presented by Las17 and thereby promote polymerization. An active actin cytoskeleton is necessary for adequate endocytosis. (pfam00018), or a FYVE zinc finger (pfam01363).	127
398185	pfam04367	DUF502	Protein of unknown function (DUF502). Predicted to be an integral membrane protein.	106
398186	pfam04368	DUF507	Protein of unknown function (DUF507). Bacterial protein of unknown function.	182
113152	pfam04369	Lactococcin	Lactococcin-like family. Family of bacteriocins from lactic acid bacteria.	60
367915	pfam04370	DUF508	Domain of unknown function (DUF508). This is a family of uncharacterized proteins from C. elegans.	142
398187	pfam04371	PAD_porph	Porphyromonas-type peptidyl-arginine deiminase. Peptidyl-arginine deiminase (PAD) enzymes catalyze the deimination of the guanidino group from carboxy-terminal arginine residues of various peptides to produce ammonia. PAD from Porphyromonas gingivalis (PPAD) appears to be evolutionarily unrelated to mammalian PAD (pfam03068), which is a metalloenzyme. PPAD is thought to belong to the same superfamily as aminotransferase and arginine deiminase, and to form an alpha/beta propeller structure. This family has previously been named PPADH (Porphyromonas peptidyl-arginine deiminase homologs). The predicted catalytic residues in PPAD are Asp130, Asp187, His236, Asp238 and Cys351. These are absolutely conserved with the exception of Asp187 which is absent in two family members. PPAD is also able to catalyze the deimination of free L-arginine, but has primarily peptidyl-arginine specificity. It may have a FMN cofactor.	324
398188	pfam04375	HemX	HemX, putative uroporphyrinogen-III C-methyltransferase. This is a family of bacterial putative uroporphyrinogen-III C-methyltransferase proteins. It forms one of the members of a complex of proteins involved in the biogenesis of the inner membrane in E. coli. Uroporphyrinogen-III C-methyltransferase (HemX) is a single spanning inner membrane protein that regulates the activity of NAD(P)H:glutamyl-tRNA reductase (HemA) in the tetrapyrrole biosynthesis pathway.	236
398189	pfam04376	ATE_N	Arginine-tRNA-protein transferase, N-terminus. This family represents the N terminal region of the enzyme arginine-tRNA-protein transferase (EC 2.3.2.8), which catalyzes the post-translational conjugation of arginine to the N-terminus of a protein. In eukaryotes, this functions as part of the N-end rule pathway of protein degradation by conjugating a de-stabilizing amino acid to the amino terminal aspartate or glutamate of a protein, targeting the protein for ubiquitin-dependent proteolysis. N terminal cysteine is sometimes modified. In S cerevisiae, Cys20, 23, 94 and/or 95 are thought to be important for activity. Of these, only Cys 94 appears to be completely conserved in this family.	71
398190	pfam04377	ATE_C	Arginine-tRNA-protein transferase, C-terminus. This family represents the C terminal region of the enzyme arginine-tRNA-protein transferase (EC 2.3.2.8), which catalyzes the post-translational conjugation of arginine to the N-terminus of a protein. In eukaryotes, this functions as part of the N-end rule pathway of protein degradation by conjugating a destabilizing amino acid to the amino terminal aspartate or glutamate of a protein, targeting the protein for ubiquitin-dependent proteolysis. N terminal cysteine is sometimes modified.	122
398191	pfam04378	RsmJ	Ribosomal RNA large subunit methyltransferase D, RlmJ. RlmJ, ribosomal RNA large subunit methyltransferase J, is required for full methylation of 23S ribosomal RNA (rRNA) during ribosome biogenesis. The ribosomal RNA of E. coli carries 24 residues that require methylation, and this methyltransferase, which modifies A2030, is the last to be described. RlmJ displays a variant of the Rossmann-like methyltransferase (MTase) fold with an inserted helical subdomain. On binding cofactor and substrate a large shift of the N-terminal motif X tail is induced in order to make it cover the cofactor-binding site and to trigger active-site changes in motifs IV and VIII.	245
398192	pfam04379	DUF525	ApaG domain. Members of this family include the bacterial protein ApaG and the C termini of some F-box proteins (pfam00646). F-box proteins contain a carboxyl-terminal domain that interacts with protein substrates, so this family may be involved in protein-protein interaction. The function of ApaG proteins is unknown, but mutations in the Salmonella typhimurium ApaG homolog corD gives a phenotype of low-level cobalt resistance and decreased magnesium efflux by effects on the CorA magnesium transport system.	87
398193	pfam04380	BMFP	Membrane fusogenic activity. BMFP consists of two structural domains, a coiled-coil C-terminal domain via which the protein self-associates as a trimer, and an N-terminal domain disordered at neutral pH but adopting an amphipathic alpha-helical structure in the presence of phospholipid vesicles, high ionic strength, acidic pH or SDS. BMFP interacts with phospholipid vesicles through the predicted amphipathic alpha-helix induced in the N-terminal half of the protein and promotes aggregation and fusion of vesicles in vitro.	70
398194	pfam04381	RdgC	Putative exonuclease, RdgC. Members of the RdgC family may have exonuclease activity. RdgC is required for efficient pilin variation in Neisseria gonorrhoeae, suggesting that it may be involved in recombination reactions. In Escherichia coli, RdgC is required for growth in recombination-deficient exonuclease-depleted strains. Under these conditions, RdgC may act as an exonuclease to remove collapsed replication forks, in the absence of the normal repair mechanisms.	295
398195	pfam04382	SAB	SAB domain. This presumed domain is found in proteins containing FERM domains pfam00373. This domain is found to bind to both spectrin and actin, hence the name SAB (Spectrin and Actin Binding) domain.	49
367917	pfam04383	KilA-N	KilA-N domain. The amino-terminal module of the D6R/N1R proteins defines a novel, conserved DNA-binding domain (the KilA-N domain) that is found in a wide range of proteins of large bacterial and eukaryotic DNA viruses. The KilA-N domain family also includes the previously defined APSES domain. The KilA-N and APSES domains may also share a common fold with the nucleic acid-binding modules of the LAGLIDADG nucleases and the amino-terminal domains of the tRNA endonuclease.	107
398196	pfam04384	Fe-S_assembly	Iron-sulphur cluster assembly. This family of proteins is likely to be involved in the assembly of iron-sulphur clusters. It may function as an adaptor protein. In Escherichia coli IscX forms part of the isc operon, which encodes genes involved in iron-sulphur cluster assembly. Its structure is entirely alpha helical, and it contains a modified wing-helix structure, usually found in DNA-binding proteins. It binds to Fe2+ and Fe3+ ions and to the cysteine desulfurase IscS, the same surface of the protein is involved in both binding to iron and to IscS.	64
252557	pfam04385	FAINT	Domain of unknown function, DUF529. This family represents a repeated region found in several Theileria parva proteins. The repeat is normally about 70 residues long and contains a conserved aromatic residue in the middle.	78
398197	pfam04386	SspB	Stringent starvation protein B. Escherichia coli stringent starvation protein B (SspB) is thought to enhance the specificity of degradation of tmRNA-tagged proteins by the ClpXP protease. The tmRNA tag, also known as ssrA, is an 11-aa peptide added to the C-terminus of proteins stalled during translation; it targets proteins for degradation by ClpXP and ClpAP. SspB is a cytoplasmic protein that specifically binds to residues 1-4 and 7 of the tag. Binding of SspB enhances degradation of tagged proteins by ClpX, and masks sequence elements important for ClpA interactions, inhibiting degradation by ClpA. However, more recent work has cast doubt on the importance of SspB in wild-type cells. SspB is encoded in an operon whose synthesis is stimulated by carbon, amino acid, and phosphate starvation. SspB may play a special role during nutrient stress, for example by ensuring rapid degradation of the products of stalled translation, without causing a global increase in degradation of all ClpXP substrates.	144
398198	pfam04387	PTPLA	Protein tyrosine phosphatase-like protein, PTPLA. This family includes the mammalian protein tyrosine phosphatase-like protein, PTPLA. A significant variation of PTPLA from other protein tyrosine phosphatases is the presence of proline instead of catalytic arginine at the active site. It is thought that PTPLA proteins have a role in the development, differentiation, and maintenance of a number of tissue types.	162
398199	pfam04388	Hamartin	Hamartin protein. This family includes the hamartin protein which is thought to function as a tumor suppressor. The hamartin protein interacts with the tuberin protein pfam03542. Tuberous sclerosis complex (TSC) is an autosomal dominant disorder and is characterized by the presence of hamartomas in many organs, such as brain, skin, heart, lung, and kidney. It is caused by mutation of either the TSC1 or TSC2 tumor suppressor gene. TSC1 encodes a protein, hamartin, containing two coiled-coil regions, which have been shown to mediate binding to tuberin. The TSC2 gene codes for tuberin pfam03542. These two proteins function within the same pathway(s) regulating cell cycle, cell growth, adhesion, and vesicular trafficking.	730
398200	pfam04389	Peptidase_M28	Peptidase family M28. 	192
398201	pfam04390	LptE	Lipopolysaccharide-assembly. LptE (formerly known as RlpB) is involved in lipopolysaccharide-assembly on the outer membrane of Gram-negative organisms. The lipopolysaccharide component of the outer bacterial membrane is transported from its source of origin to the outer membrane by a set of proteins constituting a transport machinery that is made up of LptA, LptB, LptC, LptD, LptE. LptD appears to be anchored in the outer membrane, and LptE forms a complex with it. This part of the machinery complex is involved in the assembly of lipopolysaccharide in the outer leaflet of the outer membrane.	49
398202	pfam04391	DUF533	Protein of unknown function (DUF533). Some family members may be secreted or integral membrane proteins.	177
398203	pfam04392	ABC_sub_bind	ABC transporter substrate binding protein. This family contains many hypothetical proteins and some ABC transporter substrate binding proteins.	293
398204	pfam04393	DUF535	Protein of unknown function (DUF535). Family member Shigella flexneri VirK is a virulence protein required for the expression, or correct membrane localization, of IcsA (VirG) on the bacterial cell surface. This family also includes Pasteurella haemolytica lapB, which is thought to be membrane-associated.	279
282276	pfam04394	DUF536	Protein of unknown function, DUF536. This family aligns the C-terminal region from several bacterial proteins of unknown function that may be involved in a theta-type replication mechanism.	43
282277	pfam04395	Poxvirus_B22R	Poxvirus B22R protein. This is a highly conserved C-rich, central region of poxvirus proteins from, e.g., Fowlpox virus, Myxoma virus, Lumpy skin disease virus, Variola virus and other members of the Poxviridae family of double-stranded, no-RNA stage poxviruses. There are three pairs of conserved cysteine residues.	204
398205	pfam04397	LytTR	LytTr DNA-binding domain. This domain is found in a variety of bacterial transcriptional regulators. The domain binds to a specific DNA sequence pattern.	98
398206	pfam04398	DUF538	Protein of unknown function, DUF538. This family consists of several plant proteins of unknown function.	109
398207	pfam04399	Glutaredoxin2_C	Glutaredoxin 2, C terminal domain. Glutaredoxins are a multifunctional family of glutathione-dependent disulphide oxidoreductases. Unlike other glutaredoxins, glutaredoxin 2 (Grx2) cannot reduce ribonucleotide reductase. Grx2 has significantly higher catalytic activity in the reduction of mixed disulphides with glutathione (GSH) compared with other glutaredoxins. The active site residues (Cys9-Pro10-Tyr11-Cys12, in Escherichia coli Grx2), which are found at the interface between the N- and C-terminal domains are identical to other glutaredoxins, but there is no other similarity between glutaredoxin 2 and other glutaredoxins. Grx2 is structurally similar to glutathione-S-transferases (GST), but there is no obvious sequence similarity. The inter-domain contacts are mainly hydrophobic, suggesting that the two domains are unlikely to be stable on their own. Both domains are needed for correct folding and activity of Grx2. It is thought that the primary function of Grx2 is to catalyze reversible glutathionylation of proteins with GSH in cellular redox regulation including the response to oxidative stress.	130
398208	pfam04400	NqrM	(Na+)-NQR maturation NqrM. The NqrM gene is often found adjacent to the nqr operons that encode (Na+)-NQR subunits. It is involved in the maturation of (Na+) translocating NADH:quinone oxidoreductase in proteobacteria. The four conserved Cys residues found in NqrM are required for (Na+)- NQR maturation and may serve as ligands for a metal ion or metal cluster used to build up the (Na+)-NQR molecule.	42
398209	pfam04402	SIMPL	Protein of unknown function (DUF541). Members of this family have so far been found in bacteria and mouse SwissProt or TrEMBL entries. However possible family members have also been identified in translated rat (Genbank:AW144450) and human (Genbank:AI478629) ESTs. A mouse family member has been named SIMPL (signalling molecule that associates with mouse pelle-like kinase). SIMPL appears to facilitate and/or regulate complex formation between IRAK/mPLK (IL-1 receptor-associated kinase) and IKK (inhibitor of kappa-B kinase) containing complexes, and thus regulate NF-kappa-B activity. Separate experiments demonstrate that a mouse family member (named LaXp180) binds the Listeria monocytogenes surface protein ActA, which is a virulence factor that induces actin polymerization. It may also bind stathmin, a protein involved in signal transduction and in the regulation of microtubule dynamics. In bacteria its function is unknown, but it is thought to be located in the periplasm or outer membrane.	156
398210	pfam04403	PqiA	Paraquat-inducible protein A. Paraquat is a superoxide radical-generating agent. The promoter for the pqiA gene is also inducible by other known superoxide generators. This is predicted to be a family of integral membrane proteins, possibly located in the inner membrane. This family is related to NADH dehydrogenase subunit 2 (pfam00361).	155
398211	pfam04404	ERF	ERF superfamily. The DNA single-strand annealing proteins (SSAPs), such as RecT, Red-beta, ERF and Rad52, function in RecA-dependent and RecA-independent DNA recombination pathways. This family includes proteins related to ERF.	151
398212	pfam04405	ScdA_N	Domain of Unknown function (DUF542). This domain is always found in conjunction with the HHE domain (pfam03794) at the N-terminus.	55
398213	pfam04406	TP6A_N	Type IIB DNA topoisomerase. Type II DNA topoisomerases are ubiquitous enzymes that catalyze the ATP-dependent transport of one DNA duplex through a second DNA segment via a transient double-strand break. Type II DNA topoisomerases are now subdivided into two sub-families, type IIA and IIB DNA topoisomerases. TP6A_N is present in type IIB topoisomerase and is thought to be involved in DNA binding owing to its sequence similarity to E. coli catabolite activator protein (CAP).	62
367925	pfam04407	DUF531	Protein of unknown function (DUF531). Family of hypothetical archaeal proteins.	170
398214	pfam04408	HA2	Helicase associated domain (HA2). This presumed domain is about 90 amino acid residues in length. It is found in a diverse set of RNA helicases. Its function is unknown; however, it seems likely to be involved in nucleic acid binding.	104
398215	pfam04409	DUF530	Protein of unknown function (DUF530). Family of hypothetical archaeal proteins.	521
398216	pfam04410	Gar1	Gar1/Naf1 RNA binding region. Gar1 is a small nucleolar RNP that is required for pre-mRNA processing and pseudouridylation. It is co-immunoprecipitated with the H/ACA families of snoRNAs. This family represents the conserved central region of Gar1. This region is necessary and sufficient for normal cell growth, and specifically binds two snoRNAs snR10 and snR30. This region is also necessary for nucleolar targeting, and it is thought that the protein is co-transported to the nucleolus as part of a nucleoprotein complex. In humans, Gar1 is also a component of telomerase in vivo. Naf1 is an essential protein that plays a role in ribosome biogenesis, modification of spliceosomal small nuclear RNAs and telomere synthesis, and is homologous to Gar1.	153
398217	pfam04411	PDDEXK_7	PD-(D/E)XK nuclease superfamily. This domain has been identified as a member of the PD-(D/E)XK nuclease superfamily through transitive meta profile searches. The domain has two additional beta-strands inserted into the core fold after the first core alpha-helix. The PD-(D/E)XK signature is clearly conserved, corresponding to invariant PD (motif II) and DAK (motif III) motifs. There is also a conserved glutamic acid in motif I that is most likely to be involved in metal ion binding. The second core alpha-helix contains an invariant MHXYRD motif. It has been speculated that it could function as a methylation-dependent restriction enzyme.	161
398218	pfam04412	DUF521	Protein of unknown function (DUF521). Family of hypothetical proteins.	392
398219	pfam04413	Glycos_transf_N	3-Deoxy-D-manno-octulosonic-acid transferase (kdotransferase). Members of this family transfer activated sugars to a variety of substrates, including glycogen, fructose-6-phosphate and lipopolysaccharides. Members of the family transfer UDP, ADP, GDP or CMP linked sugars. The Glycos_transf_N region is flanked at the N-terminus by a signal peptide and at the C-terminus by Glycos_transf_1 (pfam00534). The eukaryotic glycogen synthases may be distant members of this bacterial family.	176
398220	pfam04414	tRNA_deacylase	D-aminoacyl-tRNA deacylase. Several aminoacyl-tRNA synthetases have the ability to transfer the D-isomer of their amino acid onto their cognate tRNA. D-aminoacyl-tRNA deacylases hydrolyze the ester bond between the polynucleotide and the D-amino acid, thereby preventing the accumulation of such mis-acylated and metabolically inactive tRNA molecules.	203
282295	pfam04415	DUF515	Protein of unknown function (DUF515). Family of hypothetical Archaeal proteins.	449
398221	pfam04417	DUF501	Protein of unknown function (DUF501). Family of uncharacterized bacterial proteins.	137
398222	pfam04418	DUF543	Domain of unknown function (DUF543). This family of short eukaryotic proteins has no known function. Most of the members of this family are only 80 amino acid residues long. However the Arabidopsis homolog is over 300 residues long. The presumed domain contains a conserved amino terminal cysteine and a conserved motif GXGXGXG in the carboxy terminal half that may be functionally important.	75
398223	pfam04419	4F5	4F5 protein family. Members of this family are short proteins that are rich in aspartate, glutamate, lysine and arginine. Although the function of these proteins is unknown, they are found to be ubiquitously expressed.	38
398224	pfam04420	CHD5	CHD5-like protein. Members of this family are probably coiled-coil proteins that are similar to the CHD5 (Congenital heart disease 5) protein. In Saccharomyces cerevisiae this protein localizes to the ER and is thought to play a homeostatic role.	158
398225	pfam04421	Mss4	Mss4 protein. 	94
398226	pfam04422	FrhB_FdhB_N	Coenzyme F420 hydrogenase/dehydrogenase, beta subunit N-term. Coenzyme F420 hydrogenase (EC:1.12.99.1) reduces the low-potential two-electron acceptor coenzyme F420. This family contains the N termini of F420 hydrogenase and dehydrogenase beta subunits. The N-terminus of Methanobacterium formicicum formate dehydrogenase beta chain (EC:1.2.1.2) is also a member of this family. This region is often found in association with the 4Fe-4S binding domain, fer4 (pfam00037).	78
398227	pfam04423	Rad50_zn_hook	Rad50 zinc hook motif. The Mre11 complex (Mre11 Rad50 Nbs1) is central to chromosomal maintenance and functions in homologous recombination, telomere maintenance and sister chromatid association. The Rad50 coiled-coil region contains a dimer interface at the apex of the coiled coils in which pairs of conserved Cys-X-X-Cys motifs form interlocking hooks that bind one Zn ion. This alignment includes the zinc hook motif and a short stretch of coiled-coil on either side.	49
398228	pfam04424	MINDY_DUB	MINDY deubiquitinase. This entry represents a group of deubiquitinating (DUB) enzymes known as the MINDY family (MIU-containing novel DUB). Ubiquitin (Ub) is released one molecule at a time from the distal end of proteins with Lys48-linked polyubiquitin chains. Long polyubiquitin chains are preferred. The catalytic Cys and His residues have been identified by site-directed mutagenesis, as has the Gln that participates in formation of the oxyanion hole during catalysis. Despite the structural similarity to papain-like cysteine peptidases, a residue corresponding to the Asn that orientates the imidazolium ring of the catalytic His has not been identified. Members of the MINDY family of DUBs contain an MIU (motif interacting with Ub) motif, which is a helical motif that binds mono-Ub.	110
398229	pfam04425	Bul1_N	Bul1 N-terminus. This family contains the N-terminus of Saccharomyces cerevisiae Bul1. Bul1 binds the ubiquitin ligase Rsp5, via an N terminal PPSY motif. The complex containing Bul1 and Rsp5 is involved in intracellular trafficking of the general amino acid permease Gap1, degradation of Rog1 in cooperation with Bul2 and GSK-3, and mitochondrial inheritance. Bul1 may contain HEAT repeats.	445
367936	pfam04426	Bul1_C	Bul1 C-terminus. This family contains the C-terminus of Saccharomyces cerevisiae Bul1. Bul1 binds the ubiquitin ligase Rsp5, via an N terminal PPSY motif. The complex containing Bul1 and Rsp5 is involved in intracellular trafficking of the general amino acid permease Gap1, degradation of Rog1 in cooperation with Bul2 and GSK-3, and mitochondrial inheritance. Bul1 may contain HEAT repeats.	271
398230	pfam04427	Brix	Brix domain. 	137
367938	pfam04428	Choline_kin_N	Choline kinase N-terminus. Found N terminal to choline/ethanolamine kinase regions (pfam01633) in some plant and fungal choline kinase enzymes (EC:2.7.1.32). This region is only found in some members of the choline kinase family, and is therefore unlikely to contribute to catalysis.	51
398231	pfam04430	DUF498	Protein of unknown function (DUF498/DUF598). This is a large family of uncharacterized proteins found in all domains of life. The structure shows a novel fold with three beta sheets. A dimeric form is found in the crystal structure. It was suggested that the cleft in between the two monomers might bind nucleic acid.	104
398232	pfam04431	Pec_lyase_N	Pectate lyase, N-terminus. This region is found N terminal to the pectate lyase domain (pfam00544) in some plant pectate lyase enzymes.	52
398233	pfam04432	FrhB_FdhB_C	Coenzyme F420 hydrogenase/dehydrogenase, beta subunit C-terminus. Coenzyme F420 hydrogenase (EC:1.12.99.1) reduces the low-potential two-electron acceptor coenzyme F420. This family contains the C termini of F420 hydrogenase and dehydrogenase beta subunits. The N-terminus of Methanobacterium formicicum formate dehydrogenase beta chain (EC:1.2.1.2) is also a member of this family. This region is often found in association with the 4Fe-4S binding domain, fer4 (pfam00037).	146
398234	pfam04433	SWIRM	SWIRM domain. This SWIRM domain is a small alpha-helical domain of about 85 amino acid residues found in chromosomal proteins. It contains a helix-turn helix motif and binds to DNA.	78
309540	pfam04434	SWIM	SWIM zinc finger. This domain is found in bacterial, archaeal and eukaryotic proteins. It is predicted to be organized into two N-terminal beta-strands and a C-terminal alpha helix, thus possibly adopting a fold similar to that of the C2H2 zinc finger (pfam00096). SWIM is thought to be a versatile domain that can interact with DNA or proteins in different contexts.	38
398235	pfam04435	SPK	Domain of unknown function (DUF545). Family of uncharacterized C. elegans proteins. The region represented by this family is found to be repeated up to four times in some proteins.	104
398236	pfam04437	RINT1_TIP1	RINT-1 / TIP-1 family. This family includes RINT-1, a Rad50 interacting protein which participates in radiation induced checkpoint control, as well as the TIP-1 protein from yeast that seems to be involved in a complex with Sec20p that is required for Golgi transport.	511
398237	pfam04438	zf-HIT	HIT zinc finger. This presumed zinc finger contains up to 6 cysteine residues that could coordinate zinc. The domain is named after the HIT protein. This domain is also found in the Thyroid receptor interacting protein 3 (TRIP-3) that specifically interacts with the ligand binding domain of the thyroid receptor.	30
398238	pfam04439	Adenyl_transf	Streptomycin adenylyltransferase. Also known as Aminoglycoside 6-adenylyltransferase (EC:2.7.7.-), this protein confers resistance to aminoglycoside antibiotics.	278
398239	pfam04440	Dysbindin	Dysbindin (Dystrobrevin binding protein 1). Dysbindin is an evolutionary conserved 40-kDa coiled-coil-containing protein that binds to alpha- and beta-dystrobrevin in muscle and brain. Dystrophin and alpha-dystrobrevin are co-immunoprecipitated with dysbindin, indicating that dysbindin is DPC-associated in muscle. Dysbindin co-localizes with alpha-dystrobrevin at the sarcolemma and is up-regulated in dystrophin-deficient muscle. In the brain, dysbindin is found primarily in axon bundles and especially in certain axon terminals, notably mossy fibre synaptic terminals in the cerebellum and hippocampus. Dysbindin may have implications for the molecular pathology of Duchenne muscular dystrophy and may provide an alternative route for anchoring dystrobrevin and the DPC to the muscle membrane. Genetic variation in the human dysbindin gene is also thought to be associated with Schizophrenia.	143
398240	pfam04441	Pox_VERT_large	Poxvirus early transcription factor (VETF), large subunit. The poxvirus early transcription factor (VETF), in addition to the viral RNA polymerase, is required for efficient transcription of early genes in vitro. VETF is a heterodimeric protein that binds specifically to early gene promoters. The heterodimer is comprised of an 82 kDa (this family) subunit and a 70 kDa subunit.	697
398241	pfam04442	CtaG_Cox11	Cytochrome c oxidase assembly protein CtaG/Cox11. Cytochrome c oxidase assembly protein is essential for the assembly of functional cytochrome oxidase protein. In eukaryotes it is an integral protein of the mitochondrial inner membrane. Cox11 is essential for the insertion of Cu(I) ions to form the CuB site. This is essential for the stability of other structures in subunit I, for example haems a and a3, and the magnesium/manganese centre. Cox11 is probably only required in sub-stoichiometric amounts relative to the structural units. The C terminal region of the protein is known to form a dimer. Each monomer coordinates one Cu(I) ion via three conserved cysteine residues (111, 208 and 210) in Saccharomyces cerevisiae. Met 224 is also thought to play a role in copper transfer or stabilizing the copper site.	148
282319	pfam04443	LuxE	Acyl-protein synthetase, LuxE. LuxE is an acyl-protein synthetase found in bioluminescent bacteria. LuxE catalyzes the formation of an acyl-protein thioester from a fatty acid and a protein. This is the second step in the bioluminescent fatty acid reduction system, which converts tetradecanoic acid to the aldehyde substrate of the luciferase-catalyzed bioluminescence reaction. A conserved cysteine found at position 364 in Photobacterium phosphoreum LuxE is thought to be acylated during the transfer of the acyl group from the synthetase subunit to the reductase. The carboxyl terminal of the synthetase is thought to act as a flexible arm to transfer acyl groups between the sites of activation and reduction. This family also includes Vibrio cholerae RBFN protein, which is involved in the biosynthesis of the O-antigen component 3-deoxy-L-glycero-tetronic acid.	386
398242	pfam04444	Dioxygenase_N	Catechol dioxygenase N-terminus. This family consists of the N termini of catechol, chlorocatechol or hydroxyquinol 1,2-dioxygenase proteins. This region is always found adjacent to the dioxygenase domain (pfam00775).	75
398243	pfam04445	SAM_MT	Putative SAM-dependent methyltransferase. This is a family of putative SAM-dependent methyltransferases.	231
398244	pfam04446	Thg1	tRNAHis guanylyltransferase. The Thg1 protein from Saccharomyces cerevisiae is responsible for adding a GMP residue to the 5' end of tRNA His. The catalytic domain Thg1 contains a RRM (ferredoxin) fold palm domain, just like the viral RNA-dependent RNA polymerases, reverse transcriptases, family A and B DNA polymerases, adenylyl cyclases, diguanylate cyclases (GGDEF domain) and the predicted polymerase of the CRISPR system. Thg1 possesses an active site with three acidic residues that chelate Mg++ cations. Thg1 catalyzes polymerization similar to the 5'-3' polymerases.	127
367945	pfam04447	DUF550	Protein of unknown function (DUF550). This family is found in a range of Proteobacteria and a few P-22 dsDNA virus particles. The function is currently not known.	97
398245	pfam04448	DUF551	Protein of unknown function (DUF551). This family represents the carboxy terminus of a protein of unknown function, found in dsDNA viruses with no RNA stage, including bacteriophages lambda and P22, and also in some Escherichia coli prophages.	66
398246	pfam04449	Fimbrial_CS1	CS1 type fimbrial major subunit. Fimbriae, also known as pili, form filaments radiating from the surface of the bacterium to a length of 0.5-1.5 micrometres. They enable the cell to colonise host epithelia. This family constitutes the major subunits of CS1 like pili, including CS2 and CFA1 from Escherichia coli, and also the Cable type II pilin major subunit from Burkholderia cepacia. The major subunit of CS1 pili is called CooA. Periplasmic CooA is mostly complexed with the assembly protein CooB. In addition, a small pool of CooA multimers, and CooA-CooD complexes exists, but the functional significance is unknown. A member of this family has also been identified in Salmonella typhi and Salmonella enterica.	134
398247	pfam04450	BSP	Peptidase of plants and bacteria. These basic secretory proteins (BSPs) are believed to be part of the plant's defense mechanism against pathogens.	205
398248	pfam04451	Capsid_NCLDV	Large eukaryotic DNA virus major capsid protein. This family includes the major capsid protein of iridoviruses, chlorella virus and Spodoptera ascovirus, which are all dsDNA viruses with no RNA stage. This is the most abundant structural protein and can account for up to 45% of virion protein. In Chlorella virus PBCV-1 the major capsid protein is a glycoprotein. The four families of large eukaryotic DNA viruses, Poxviridae, Asfarviridae, Iridoviridae, and Phycodnaviridae, are referred to collectively as nucleocytoplasmic large DNA viruses or NCLDV. The virions of different NCLDV have dramatically different structures. The major capsid proteins of iridoviruses and phycodnaviruses, both of which have icosahedral capsids surrounding an inner lipid membrane, showed a high level of sequence conservation. A more limited, but statistically significant sequence similarity was observed between these proteins and the major capsid protein (p72) of ASFV, which also has an icosahedral capsid. It was surprising, however, to find that all of these proteins shared a conserved domain with the poxvirus protein D13L, which is an integral virion component thought to form a scaffold for the formation of viral crescents and immature virion.	192
398249	pfam04452	Methyltrans_RNA	RNA methyltransferase. RNA methyltransferases modify nucleotides during ribosomal RNA maturation in a site-specific manner. The Escherichia coli member is specific for U1498 methylation.	221
398250	pfam04453	OstA_C	Organic solvent tolerance protein. Family involved in organic solvent tolerance in bacteria. The region contains several highly conserved, potentially catalytic, residues.	384
398251	pfam04454	Linocin_M18	Encapsulating protein for peroxidase. The Linocin_M18 is found in eubacteria and archaea. These proteins, referred to as encapsulins, form nanocompartments within the bacterium which contain ferritin-like proteins or peroxidases, enzymes involved in oxidative-stress response. These enzymes are targeted to the interior of encapsulins via unique C-terminal extensions.	253
398252	pfam04455	Saccharop_dh_N	LOR/SDH bifunctional enzyme conserved region. Lysine-oxoglutarate reductase/Saccharopine dehydrogenase (LOR/SDH) is a bifunctional enzyme. This conserved region is commonly found immediately N-terminal to Saccharop_dh (pfam03435) in eukaryotes.	92
398253	pfam04456	DUF503	Protein of unknown function (DUF503). Family of hypothetical bacterial proteins.	82
398254	pfam04457	DUF504	Protein of unknown function (DUF504). Family of uncharacterized proteins.	76
398255	pfam04458	DUF505	Protein of unknown function (DUF505). Family of uncharacterized prokaryotic proteins.	621
398256	pfam04459	DUF512	Protein of unknown function (DUF512). Family of uncharacterized prokaryotic proteins.	203
398257	pfam04461	DUF520	Protein of unknown function (DUF520). Family of uncharacterized proteins.	161
398258	pfam04463	DUF523	Protein of unknown function (DUF523). Family of uncharacterized bacterial proteins.	142
398259	pfam04464	Glyphos_transf	CDP-Glycerol:Poly(glycerophosphate) glycerophosphotransferase. Wall-associated teichoic acids are a heterogeneous class of phosphate-rich polymers that are covalently linked to the cell wall peptidoglycan of gram-positive bacteria. They consist of a main chain of phosphodiester-linked polyols and/or sugar moieties attached to peptidoglycan via a linkage unit. CDP-glycerol:poly(glycerophosphate) glycerophosphotransferase is responsible for the polymerization of the main chain of the teichoic acid by sequential transfer of glycerol-phosphate units from CDP-glycerol to the linkage unit lipid.	360
367954	pfam04465	DUF499	Protein of unknown function (DUF499). Family of uncharacterized hypothetical prokaryotic proteins.	1016
335802	pfam04466	Terminase_3	Phage terminase large subunit. Initiation of packaging of double-stranded viral DNA involves the specific interaction of the prohead with viral DNA in a process mediated by a phage-encoded terminase protein. The terminase enzymes are usually hetero-oligomers composed of a small and a large subunit. This region is found on the large subunit and possesses an endonuclease and ATPase activity that require Mg2+ and a neutral or slightly basic reaction. This region is also found in bacterial sequences.	201
367955	pfam04467	DUF483	Protein of unknown function (DUF483). Family of uncharacterized prokaryotic proteins.	119
398260	pfam04468	PSP1	PSP1 C-terminal conserved region. This region is present in both eukaryotes and eubacteria. The yeast PSP1 protein is involved in suppressing mutations in the DNA polymerase alpha subunit in yeast.	86
398261	pfam04471	Mrr_cat	Restriction endonuclease. Prokaryotic family found in type II restriction enzymes containing the hallmark (D/E)-(D/E)XK active site. Presence of catalytic residues implicates this region in the enzymatic cleavage of DNA.	114
398262	pfam04472	SepF	Cell division protein SepF. SepF accumulates at the cell division site in an FtsZ-dependent manner and is required for proper septum formation. Mutants are viable but the formation of the septum is much slower and occurs with a very abnormal morphology. This family also includes archaeal related proteins of unknown function.	72
282345	pfam04473	DUF553	Transglutaminase-like domain. This family of uncharacterized archaeal proteins are related to Transglutaminase-like domains. This family has previously been called DUF553 and UPF0252.	140
398263	pfam04474	DUF554	Protein of unknown function (DUF554). Family of uncharacterized prokaryotic proteins. Multiple predicted transmembrane regions suggest that the region is membrane associated.	220
398264	pfam04475	DUF555	Protein of unknown function (DUF555). Family of uncharacterized, hypothetical archaeal proteins.	101
398265	pfam04476	4HFCP_synth	4-HFC-P synthase. (5-formylfuran-3-yl)methyl phosphate synthase, also known as 4-HFC-P synthase, is involved in the production of methanofuran. This family has a classical TIM-barrel structure whose biological unit is a homohexamer.	228
398266	pfam04478	Mid2	Mid2 like cell wall stress sensor. This family represents a region near the C-terminus of Mid2, which contains a transmembrane region. The remainder of the protein sequence is serine-rich and of low complexity, and is therefore impossible to align accurately. Mid2 is thought to act as a mechanosensor of cell wall stress. The C-terminal cytoplasmic region of Mid2 is known to interact with Rom2, a guanine nucleotide exchange factor (GEF) for Rho1, which is part of the cell wall integrity signalling pathway.	150
398267	pfam04479	RTA1	RTA1 like protein. This family is comprised of fungal proteins with multiple transmembrane regions. RTA1 is involved in resistance to 7-aminocholesterol, while RTM1 confers resistance to an unknown toxic chemical in molasses. These proteins may bind to the toxic substance, and thus prevent toxicity. They are not thought to be involved in the efflux of xenobiotics.	210
398268	pfam04480	DUF559	Protein of unknown function (DUF559). 	95
113257	pfam04481	DUF561	Protein of unknown function (DUF561). Protein of unknown function found in a cyanobacterium, and the chloroplasts of algae.	243
398269	pfam04483	DUF565	Protein of unknown function (DUF565). Predicted transmembrane protein found in plants, chloroplasts and cyanobacteria. This family is also known as YCF20.	57
398270	pfam04484	QWRF	QWRF family. AUG8 belongs to the plant QWRF motif-containing protein family, which also includes microtubule-associated protein ENDOSPERM DEFECTIVE 1 and SNOWY COTYLEDON 3. AUG8 binds the microtubule plus-end and participates in the reorientation of microtubules in hypocotyls (the stem of a germinating seedling).	300
398271	pfam04485	NblA	Phycobilisome degradation protein nblA. In the cyanobacterium Synechococcus PCC 7942, nblA triggers degradation of light-harvesting phycobiliproteins in response to deprivation of nutrients including nitrogen, phosphorus and sulphur. The mechanism of nblA function is not known, but it has been hypothesized that nblA may act by disrupting phycobilisome structure, activating a protease or tagging phycobiliproteins for proteolysis. Members of this family have also been identified in the chloroplasts of some red algae.	50
398272	pfam04486	SchA_CurD	SchA/CurD like domain. Members of this family have only been identified in species of the Streptomyces genus. Two family members are known to be part of gene clusters involved in the synthesis of polyketide-based spore pigments, homologous to clusters involved in the synthesis of polyketide antibiotics. The function of this protein is unknown, but it has been speculated to contain a NAD(P) binding site. Many of these proteins contain two copies of this presumed domain.	118
398273	pfam04487	CITED	CITED. CITED, CBP/p300-interacting transactivator with ED-rich tail, are characterized by a conserved 32-amino acid sequence at the C-terminus. CITED proteins do not bind DNA directly and are thought to function as transcriptional co-activators.	210
398274	pfam04488	Gly_transf_sug	Glycosyltransferase sugar-binding region containing DXD motif. The DXD motif is a short conserved motif found in many families of glycosyltransferases, which add a range of different sugars to other sugars, phosphates and proteins. DXD-containing glycosyltransferases all use nucleoside diphosphate sugars as donors and require divalent cations, usually manganese. The DXD motif is expected to play a carbohydrate binding role in sugar-nucleoside diphosphate and manganese dependent glycosyltransferases.	93
282358	pfam04489	DUF570	Protein of unknown function (DUF570). Protein of unknown function, found in herpesvirus and cytomegalovirus.	456
282359	pfam04490	Pox_T4_C	Poxvirus T4 protein, C-terminus. This family of poxvirus proteins are thought to be retained in the endoplasmic reticulum. M-T4 of myxoma virus is thought to protect infected lymphocytes from apoptosis and modulate the inflammatory response to virus infection.	146
282360	pfam04491	Pox_T4_N	Poxvirus T4 protein, N-terminus. This family of poxvirus proteins are thought to be secreted or retained in the endoplasmic reticulum if the protein also contains an additional C terminal region (pfam04490). M-T4 of myxoma virus is thought to protect infected lymphocytes from apoptosis and modulate the inflammatory response to virus infection.	46
398275	pfam04492	Phage_rep_O	Bacteriophage replication protein O. Replication protein O is necessary for the initiation of bacteriophage DNA replication. Protein O interacts with the lambda replication origin, and also with replication protein P to form an oligomer. It is speculated that the N-terminal half interacts with the replication origin while the C terminal half mediates protein-protein interaction.	92
398276	pfam04493	Endonuclease_5	Endonuclease V. Endonuclease V is specific for single-stranded DNA or for duplex DNA that contains uracil or that is damaged by a variety of agents.	198
398277	pfam04494	TFIID_NTD2	WD40 associated region in TFIID subunit, NTD2 domain. This region is an all-alpha domain associated with the WD40 helical bundle of the TAF5 subunit of transcription factor TFIID. The domain has distant structural similarity to RNA polymerase II CTD interacting factors. It contains several conserved clefts that are likely to be critical for TFIID complex assembly. The TAF5 subunit is present twice in the TFIID complex and is critical for the function and assembly of the complex, and the NTD2 and N-terminal domain is crucial for homodimerization.	125
398278	pfam04495	GRASP55_65	GRASP55/65 PDZ-like domain. GRASP55 (Golgi re-assembly stacking protein of 55 kDa) and GRASP65 (a 65 kDa protein) are highly homologous. GRASP55 is a component of the Golgi stacking machinery. GRASP65 is an N-ethylmaleimide-sensitive membrane protein required for the stacking of Golgi cisternae in a cell-free system. This region appears to be related to the PDZ domain.	138
282365	pfam04496	Herpes_UL35	Herpesvirus UL35 family. UL35 represents a true late gene which encodes a 12-kDa capsid protein.	109
282366	pfam04497	Pox_E2-like	Poxviridae protein. This family of proteins is restricted to Poxviridae. It contains a number of differently named uncharacterized proteins.	729
282367	pfam04498	Pox_VP8_L4R	Poxvirus nucleic acid binding protein VP8/L4R. The 25 kDa product of Vaccinia virus gene L4R is also known as VP8. VP8 is found in the cores of Vaccinia virions and is essential for the formation of transcriptionally competent viral particles. It binds both single stranded and double stranded DNA and RNA with similar affinities. Binding is thought to involve cooperative interactions between protein subunits. The protein is proteolytically cleaved during viral assembly at an Ala-Gly-Ala site. Possible roles for VP8 include packaging and maintaining the DNA genome in a transcribable configuration; binding ssDNA during transcription initiation; and cooperation with I8R protein to unwind early promoter regions. VP8 may also function in either transcription elongation or release of mRNA molecules from viral particles.	217
398279	pfam04499	SAPS	SIT4 phosphatase-associated protein. This family includes a conserved region from a group of yeast proteins that associate with the SIT4 phosphatase. This association is required for SIT4's role in G1 cyclin transcription and for bud formation. This family also includes homologous regions from other eukaryotes.	383
398280	pfam04500	FLYWCH	FLYWCH zinc finger domain. Mutations in the mod(mdg4) gene have effects on variegation (PEV), the properties of insulator sequences, correct path-finding of growing nerve cells, meiotic pairing of chromosomes, and apoptosis. The occurrence of FLYWCH motifs in mod(mdg4) gene product and other proteins is discussed in.	62
367967	pfam04501	Baculo_VP39	Baculovirus major capsid protein VP39. This family constitutes the 39 kDa major capsid protein of the Baculoviridae.	240
398281	pfam04502	DUF572	Family of unknown function (DUF572). Family of eukaryotic proteins with undetermined function.	316
398282	pfam04503	SSDP	Single-stranded DNA binding protein, SSDP. This is a family of eukaryotic single-stranded DNA binding proteins with specificity to a pyrimidine-rich element found in the promoter region of the alpha2(I) collagen gene.	294
398283	pfam04504	DUF573	Protein of unknown function, DUF573. 	89
398284	pfam04505	CD225	Interferon-induced transmembrane protein. This family includes the human leukocyte antigen CD225, which is an interferon inducible transmembrane protein, and is associated with interferon induced cell growth suppression.	68
398285	pfam04506	Rft-1	Rft protein. 	513
398286	pfam04507	DUF576	Csa1 family. This family contains several uncharacterized staphylococcal proteins. These proteins have been called conserved staphylococcal antigens (Csa).	225
282377	pfam04508	Pox_A_type_inc	Viral A-type inclusion protein repeat. The repeat is found in the A-type inclusion protein of the Poxvirus family.	22
398287	pfam04509	CheC	CheC-like family. The restoration of pre-stimulus levels of the chemotactic response regulator, CheY-P, is important for allowing bacteria to respond to new environmental stimuli. The members of this family, CheC, CheX, CheA and FliY are CheY-P phosphatases. CheC appears to be primarily involved in restoring normal CheY-P levels, whereas FliY seems to act on CheY-P constitutively. CheD enhances the activity of CheC 5-fold, which is normally relatively low. In some cases, the region represented by this entry is present as multiple copies.	38
367974	pfam04510	DUF577	Family of unknown function (DUF577). Family of Arabidopsis thaliana proteins. Many of these members contain a repeated region.	173
398288	pfam04511	DER1	Der1-like family. The endoplasmic reticulum (ER) of the yeast Saccharomyces cerevisiae contains a proteolytic system able to selectively degrade misfolded lumenal secretory proteins. For examination of the components involved in this degradation process, mutants were isolated. They could be divided into four complementation groups. The mutations led to stabilisation of two different substrates for this process. The mutant classes were called 'der' for 'degradation in the ER'. DER1 was cloned by complementation of the der1-2 mutation. The DER1 gene codes for a novel, hydrophobic protein, that is localized to the ER. Deletion of DER1 abolished degradation of the substrate proteins. The function of the Der1 protein seems to be specifically required for the degradation process associated with the ER. Interestingly this family seems distantly related to the Rhomboid family of membrane peptidases, suggesting that this family may also mediate degradation of misfolded proteins (Bateman A pers. obs.).	191
398289	pfam04512	Baculo_PEP_N	Baculovirus polyhedron envelope protein, PEP, N-terminus. Polyhedra are large crystalline occlusion bodies containing nucleopolyhedrovirus virions, and surrounded by an electron-dense structure called the polyhedron envelope or polyhedron calyx. The polyhedron envelope (associated) protein PEP is thought to be an integral part of the polyhedron envelope. PEP is concentrated at the surface of polyhedra, and is thought to be important for the proper formation of the periphery of polyhedra. It is thought that PEP may stabilize polyhedra and protect them from fusion or aggregation.	91
309591	pfam04513	Baculo_PEP_C	Baculovirus polyhedron envelope protein, PEP, C-terminus. Polyhedra are large crystalline occlusion bodies containing nucleopolyhedrovirus virions, and surrounded by an electron-dense structure called the polyhedron envelope or polyhedron calyx. The polyhedron envelope (associated) protein PEP is thought to be an integral part of the polyhedron envelope. PEP is concentrated at the surface of polyhedra, and is thought to be important for the proper formation of the periphery of polyhedra. It is thought that PEP may stabilize polyhedra and protect them from fusion or aggregation.	140
282383	pfam04514	BTV_NS2	Bluetongue virus non-structural protein NS2. This family includes NS2 proteins from other members of the Orbivirus genus. NS2 is a non-specific single-stranded RNA-binding protein that forms large homomultimers and accumulates in viral inclusion bodies of infected cells. Three RNA binding regions have been identified in Bluetongue virus serotype 17 at residues 2-11, 153-166 and 274-286. NS2 multimers also possess nucleotidyl phosphatase activity. The precise function of NS2 is not known, but it may be involved in the transport and condensation of viral mRNAs.	349
398290	pfam04515	Choline_transpo	Plasma-membrane choline transporter. This family represents a high-affinity plasma-membrane choline transporter in C.elegans which is thought to be rate-limiting for ACh synthesis in cholinergic nerve terminals.	324
398291	pfam04516	CP2	CP2 transcription factor. This family represents a conserved region in the CP2 transcription factor family.	214
282386	pfam04517	Microvir_lysis	Microvirus lysis protein (E), C-terminus. E protein causes host cell lysis by inhibiting MraY, a peptidoglycan biosynthesis enzyme. This leads to cell wall failure at septation. The N terminal transmembrane region matches the signal peptide model and must be omitted from the family.	42
309594	pfam04518	Effector_1	Effector from type III secretion system. This is a family of effector proteins which are secreted by the type III secretion system. The precise function of this family is unknown.	352
398292	pfam04519	Bactofilin	Polymer-forming cytoskeletal. This is a family of bactofilins, a functionally diverse class of cytoskeletal, polymer-forming, proteins that is widely conserved among bacteria. In the example species C. crescentus, two bactofilins assemble into a membrane-associated laminar structure that shows cell-cycle-dependent polar localization and acts as a platform for the recruitment of a cell wall biosynthetic enzyme involved in polar morphogenesis. Bactofilins display distinct subcellular distributions and dynamics in different bacterial species, suggesting that they are versatile structural elements that have adopted a range of different cellular functions.	89
398293	pfam04520	Senescence_reg	Senescence regulator. This protein regulates the expression of proteins associated with leaf senescence in plants.	171
282390	pfam04521	Viral_P18	ssRNA positive strand viral 18kD cysteine rich protein. 	137
282391	pfam04522	DUF585	Protein of unknown function (DUF585). This region represents the N-terminus of bromovirus 2a protein, and is always found N terminal to a predicted RNA-dependent RNA polymerase region (pfam00978).	234
282392	pfam04523	Herpes_U30	Herpes virus tegument protein U30. This family is named after the human herpesvirus protein, but has been characterized in cytomegalovirus as UL47. Cytomegalovirus UL47 is a component of the tegument, which is a protein layer surrounding the viral capsid. UL47 co-precipitates with UL48 and UL69 tegument proteins, and the major capsid protein UL86. A UL47-containing complex is thought to be involved in the release of viral DNA from the disassembling virus particle.	906
398294	pfam04525	LOR	LURP-one-related. The structure of this family has been solved. It comprises a 12-stranded beta barrel with a central C-terminal alpha helix. This helix is thought to be a transmembrane helix. It is structurally similar to the C-terminal domain of the Tubby protein. In plants it plays a role in defense against pathogens.	186
398295	pfam04526	DUF568	Protein of unknown function (DUF568). Family of uncharacterized plant proteins.	100
309599	pfam04527	Retinin_C	Drosophila Retinin like protein. Family of Drosophila proteins related to the C-terminal region of the Drosophila Retinin protein. Conserved region is found towards the C-terminus of the member proteins.	63
282396	pfam04528	Adeno_E4_34	Adenovirus early E4 34 kDa protein conserved region. Conserved region found in the Adenovirus E4 34 kDa protein.	145
398296	pfam04529	Herpes_U59	Herpesvirus U59 protein. The proteins in this family have no known function. Cytomegalovirus UL88 is also a member of this family.	365
282398	pfam04530	Viral_Beta_CD	Viral Beta C/D like family. Family of ssRNA positive-strand viral proteins. Conserved region found in the Beta C and Beta D transcripts.	123
398297	pfam04531	Phage_holin_1	Bacteriophage holin. This family of holins is found in several staphylococcal and streptococcal bacteriophages. Holins are a diverse family of proteins that cause bacterial membrane lysis during late-protein synthesis. It is thought that the temporal precision of holin-mediated lysis may occur through the buildup of a holin oligomer which causes the lysis.	82
282400	pfam04532	DUF587	Protein of unknown function (DUF587). This family consists of the N termini of some human herpesvirus U58 proteins, and some cytomegalovirus UL87 proteins. This region is always found N terminal to the Pfam family UL87 (pfam03043), which has no known function.	227
282401	pfam04533	Herpes_U44	Herpes virus U44 protein. This is a family of proteins from dsDNA beta-herpesvirinae and gamma-herpesvirinae viruses. The function is not known, and the proteins are named variously as U44, BSRF1, UL71, and M71. The family BSRF1 has been merged into this.	202
398298	pfam04534	Herpes_UL56	Herpesvirus UL56 protein. In herpes simplex virus type 2, UL56 is thought to be a tail-anchored type II membrane protein involved in vesicular trafficking. The C terminal hydrophobic region is required for association with the cytoplasmic membrane, and the N terminal proline-rich region is important for the translocation of UL56 to the Golgi apparatus and cytoplasmic vesicles.	197
367980	pfam04535	DUF588	Domain of unknown function (DUF588). This family of plant proteins contains a domain that may have a catalytic activity. It has a conserved arginine and aspartate that could form an active site. These proteins are predicted to contain 3 or 4 transmembrane helices.	150
398299	pfam04536	TPM_phosphatase	TPM domain. This family was first named TPM domain after its founding proteins: TLP18.3, Psb32 and MOLO-1. In Arabidopsis, this domain is called the thylakoid acid phosphatase (TAP) domain and has a Rossmann-like fold. In plants, the family resides in the thylakoid lumen attached to the outer membrane of the chloroplast/plastid. It is active in photosystem II.	125
282405	pfam04537	Herpes_UL55	Herpesvirus UL55 protein. In infected cells, UL55 is associated with the nuclear matrix, and found adjacent to compartments containing the capsid protein ICP35. UL55 was not detected in assembled virions. It is thought that UL55 may play a role in virion assembly or maturation.	164
398300	pfam04538	BEX	Brain expressed X-linked like family. This is a family of transcription elongation factors which includes those referred to as Bex proteins as well as those named TCEAL7. Bex1 was shown to be a novel link between neurotrophin signalling, the cell cycle, and neuronal differentiation, suggesting it might function by coordinating internal cellular states with the ability of cells to respond to external signals. TCEAL7 has been shown to negatively regulate the NF-kappaB pathway, hence being important in ovarian cancer as it is one of the genes frequently downregulated in this cancer. A closely related protein, TFIIS/TCEA, found in pfam07500 is involved in transcription elongation and transcript fidelity. TFIIS/TCEA promotes 3' endoribonuclease activity of RNA polymerase II (pol II) and allows pol II to bypass transcript pause or 'arrest' during the elongation process. It is thus possible that BEX is also acting in this way.	103
398301	pfam04539	Sigma70_r3	Sigma-70 region 3. Region 3 forms a discrete compact three helical domain within the sigma-factor. Region 3 is not normally involved in the recognition of promoter DNA, but in some specific bacterial promoters containing an extended -10 promoter element, residues within region 3 play an important role. Region 3 is primarily involved in binding the core RNA polymerase in the holoenzyme.	76
282408	pfam04540	Herpes_UL51	Herpesvirus UL51 protein. UL51 protein is a virion protein. In pseudorabies virus, UL51 was identified as a component of the capsid. In herpes simplex virus type 1 there is evidence for post-translational modification of UL51.	158
398302	pfam04541	Herpes_U34	Herpesvirus virion protein U34. The virion proteins in this family include membrane phosphoprotein-like proteins such as UL34, Epstein-Barr and R50, from dsDNA viruses, no RNA stage, Herpesvirales. The family Herpes_BFRF1, pfam05900, has been merged in.	182
398303	pfam04542	Sigma70_r2	Sigma-70 region 2. Region 2 of sigma-70 is the most conserved region of the entire protein. All members of this class of sigma-factor contain region 2. The high conservation is due to region 2 containing both the -10 promoter recognition helix and the primary core RNA polymerase binding determinant. The core binding helix interacts with the clamp domain of the largest polymerase subunit, beta prime. The aromatic residues of the recognition helix, found at the C-terminus of this domain, are thought to mediate strand separation, thereby allowing transcription initiation.	69
282411	pfam04544	Herpes_UL20	Herpesvirus egress protein UL20. UL20 is predicted to be a transmembrane protein with multiple membrane spans. It is involved in the trans-cellular transport of enveloped virions, and is therefore important for viral egress. However, UL20 operates in different cellular compartments and different stages of egress in pseudorabies virus and herpes simplex virus. This is thought to be due to differences in egress pathways between these two viruses.	179
398304	pfam04545	Sigma70_r4	Sigma-70, region 4. Region 4 of sigma-70 like sigma-factors is involved in binding to the -35 promoter element via a helix-turn-helix motif. Due to the way Pfam works, the threshold has been set artificially high to prevent overlaps with other helix-turn-helix families. Therefore there are many false negatives.	50
398305	pfam04546	Sigma70_ner	Sigma-70, non-essential region. The domain is found in the primary vegetative sigma factor. The function of this domain is unclear and can be removed without loss of function.	169
398306	pfam04547	Anoctamin	Calcium-activated chloride channel. The family carries eight putative transmembrane domains, and, although it has no similarity to other known channel proteins, it is clearly a calcium-activated ionic channel. It is expressed in various secretory epithelia, the retina and sensory neurons, and mediates receptor-activated chloride currents in diverse physiological processes.	422
398307	pfam04548	AIG1	AIG1 family. Arabidopsis protein AIG1 appears to be involved in plant resistance to bacteria.	200
367985	pfam04549	CD47	CD47 transmembrane region. This family represents the transmembrane region of CD47 leukocyte antigen.	147
398308	pfam04550	Phage_holin_3_2	Phage holin family 2. Holins are a diverse family of proteins that cause bacterial membrane lysis during late-protein synthesis. It is thought that the temporal precision of holin-mediated lysis may occur through the buildup of a holin oligomer which causes the lysis.	86
398309	pfam04551	GcpE	GcpE protein. In a variety of organisms, including plants and several eubacteria, isoprenoids are synthesized by the mevalonate-independent 2-C-methyl-D-erythritol 4-phosphate (MEP) pathway. Although different enzymes of this pathway have been described, the terminal biosynthetic steps of the MEP pathway have not been fully elucidated. GcpE gene of Escherichia coli is involved in this pathway.	342
398310	pfam04552	Sigma54_DBD	Sigma-54, DNA binding domain. This DNA binding domain is based on peptide fragmentation data. This domain is proximal to DNA in the promoter/holoenzyme complex. Furthermore this region contains a putative helix-turn-helix motif. At the C-terminus, there is a highly conserved region known as the RpoN box and is the signature of the sigma-54 proteins.	159
398311	pfam04553	Tis11B_N	Tis11B like protein, N-terminus. Members of this family always contain a tandem repeat of CCCH zinc fingers pfam00642. Tis11B, Tis11D and their homologs are thought to be regulatory proteins involved in the response to growth factors. The function of the N-terminus is unknown.	105
252669	pfam04554	Extensin_2	Extensin-like region. 	57
398312	pfam04555	XhoI	Restriction endonuclease XhoI. This family consists of type II restriction enzymes (EC:3.1.21.4) that recognize the double-stranded sequence CTCGAG and cleave after C-1.	191
398313	pfam04556	DpnII	DpnII restriction endonuclease. Members of this family are type II restriction enzymes (EC:3.1.21.4). They recognize the double-stranded unmethylated sequence GATC and cleave before G-1. http://rebase.neb.com/rebase/enz/DpnII.html	276
398314	pfam04557	tRNA_synt_1c_R2	Glutaminyl-tRNA synthetase, non-specific RNA binding region part 2. This is a region found N terminal to the catalytic domain of glutaminyl-tRNA synthetase (EC 6.1.1.18) in eukaryotes but not in Escherichia coli. This region is thought to bind RNA in a non-specific manner, enhancing interactions between the tRNA and enzyme, but is not essential for enzyme function.	84
398315	pfam04558	tRNA_synt_1c_R1	Glutaminyl-tRNA synthetase, non-specific RNA binding region part 1. This is a region found N terminal to the catalytic domain of glutaminyl-tRNA synthetase (EC 6.1.1.18) in eukaryotes but not in Escherichia coli. This region is thought to bind RNA in a non-specific manner, enhancing interactions between the tRNA and enzyme, but is not essential for enzyme function.	162
398316	pfam04559	Herpes_UL17	Herpesvirus UL17 protein. UL17 protein is required for DNA cleavage and packaging in herpes viruses. It has been shown to associate with immature B-type capsids, and is required for the localization of capsids and capsid proteins to the intranuclear sites where viral DNA is cleaved and packaged. In the virion, UL17 is a component of the tegument, which is a protein layer surrounding the viral capsid.	492
398317	pfam04560	RNA_pol_Rpb2_7	RNA polymerase Rpb2, domain 7. RNA polymerases catalyze the DNA dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). Rpb2 is the second largest subunit of the RNA polymerase. This domain is comprised of the structural domains anchor and clamp. The clamp region (C-terminal) contains a zinc-binding motif. The clamp region is named due to its interaction with the clamp domain found in Rpb1. The domain also contains a region termed "switch 4". The switches within the polymerase are thought to signal different stages of transcription.	85
398318	pfam04561	RNA_pol_Rpb2_2	RNA polymerase Rpb2, domain 2. RNA polymerases catalyze the DNA dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). Rpb2 is the second largest subunit of the RNA polymerase. This domain forms one of the two distinctive lobes of the Rpb2 structure. This domain is also known as the lobe domain. DNA has been demonstrated to bind to the concave surface of the lobe domain, and plays a role in maintaining the transcription bubble. Many of the bacterial members contain large insertions within this domain, a region known as dispensable region 1 (DRI).	185
398319	pfam04562	Dicty_spore_N	Dictyostelium spore coat protein, N-terminus. The Dictyostelium spore coat is a polarised extracellular matrix composed of glycoproteins and cellulose. Four of the major coat glycoproteins exist as a multi-protein complex within the prespore vesicles before secretion. Of these, SP96 and SP70 are members of this family. The presence of SP96 and SP70 in the complex is necessary for the cellulose binding activity of the complex, which is in turn necessary for normal spore coat assembly. The function of this region of these proteins is not known.	114
367994	pfam04563	RNA_pol_Rpb2_1	RNA polymerase beta subunit. RNA polymerases catalyze the DNA dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain forms one of the two distinctive lobes of the Rpb2 structure. This domain is also known as the protrusion domain. The other lobe (pfam04561) is nested within this domain.	396
398320	pfam04564	U-box	U-box domain. This domain is related to the Ring finger pfam00097 but lacks the zinc binding residues.	73
398321	pfam04565	RNA_pol_Rpb2_3	RNA polymerase Rpb2, domain 3. RNA polymerases catalyze the DNA dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). Domain 3 is also known as the fork domain and is proximal to the catalytic site.	67
398322	pfam04566	RNA_pol_Rpb2_4	RNA polymerase Rpb2, domain 4. RNA polymerases catalyze the DNA dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). Domain 4 is also known as the external 2 domain.	62
398323	pfam04567	RNA_pol_Rpb2_5	RNA polymerase Rpb2, domain 5. RNA polymerases catalyze the DNA dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). Domain 5 is also known as the external 2 domain.	54
367997	pfam04568	IATP	Mitochondrial ATPase inhibitor, IATP. ATP synthase inhibitor prevents the enzyme from switching to ATP hydrolysis during collapse of the electrochemical gradient, for example during oxygen deprivation. ATP synthase inhibitor forms a one to one complex with the F1 ATPase, possibly by binding at the alpha-beta interface. It is thought to inhibit ATP synthesis by preventing the release of ATP. The minimum inhibitory region for bovine inhibitor is from residues 39 to 72. The inhibitor has two oligomeric states, dimer (the active state) and tetramer. At low pH, the inhibitor forms a dimer via antiparallel coiled coil interactions between the C terminal regions of two monomers. At high pH, the inhibitor forms tetramers and higher oligomers by coiled coil interactions involving the N-terminus and inhibitory region, thus preventing the inhibitory activity.	98
367998	pfam04569	DUF591	Protein of unknown function. This family represents a conserved region in a number of uncharacterized plant proteins.	41
398324	pfam04570	zf-FLZ	zinc-finger of the FCS-type, C2-C2. zf-FLZ is a FCS-like zinc-finger domain found in higher plants. It is bryophytic in origin. It carries a zf-FCS-like C2-C2 zinc finger, consisting of a consensus cysteine-signature sequence with conserved phenylalanine and serine residues associated with a third cysteine. It acts as a protein-protein interaction module.	53
398325	pfam04571	Lipin_N	lipin, N-terminal conserved region. Mutations in the lipin gene lead to fatty liver dystrophy in mice. The protein has been shown to be phosphorylated by the TOR Ser/Thr protein kinases in response to insulin stimulation. The conserved region is found at the N-terminus of the member proteins.	103
398326	pfam04572	Gb3_synth	Alpha 1,4-glycosyltransferase conserved region. The glycosphingolipids (GSL) form part of eukaryotic cell membranes. They consist of a hydrophilic carbohydrate moiety linked to a hydrophobic ceramide tail embedded within the lipid bilayer of the membrane. Lactosylceramide, Gal1,4Glc1Cer (LacCer), is the common synthetic precursor to the majority of GSL found in vertebrates. Alpha 1,4-glycosyltransferases utilize UDP donors and transfer the sugar to a beta-linked acceptor. This region appears to be confined to higher eukaryotes. No function has yet been assigned to this region.	125
398327	pfam04573	SPC22	Signal peptidase subunit. Translocation of polypeptide chains across the endoplasmic reticulum membrane is triggered by signal sequences. During translocation of the nascent chain through the membrane, the signal sequence of most secretory and membrane proteins is cleaved off. Cleavage occurs by the signal peptidase complex (SPC) which consists of four subunits in yeast and five in mammals. This family is common to yeast and mammals.	172
368003	pfam04574	DUF592	Protein of unknown function (DUF592). This region is found in some SIR2 family proteins (pfam02146).	153
398328	pfam04575	DUF560	Protein of unknown function (DUF560). Family of hypothetical bacterial proteins.	288
398329	pfam04576	Zein-binding	Zein-binding. This domain binds to zein proteins, pfam01559. Zein proteins are seed storage proteins.	92
398330	pfam04577	DUF563	Protein of unknown function (DUF563). Family of uncharacterized proteins.	209
398331	pfam04578	DUF594	Protein of unknown function, DUF594. 	54
398332	pfam04579	Keratin_matx	Keratin, high-sulphur matrix protein. Family of Keratin, high-sulfur matrix proteins. The keratin products of mammalian epidermal derivatives such as wool and hair consist of microfibrils embedded in a rigid matrix of other proteins. The matrix proteins include the high-sulphur and high-tyrosine keratins, having molecular weights of 6-20 kDa, whereas microfibrils contain the larger, low-sulphur keratins (40-56 kDa).	96
282445	pfam04580	Pox_D3	Chordopoxvirinae D3 protein. Chordopoxvirinae D3 protein conserved region. Region occupies entire length of D3 protein.	248
368009	pfam04582	Reo_sigmaC	Reovirus sigma C capsid protein. 	130
368010	pfam04583	Baculo_p74	Baculoviridae p74 conserved region. Baculoviruses are distinct from other virus families in that there are two viral phenotypes: budded virus (BV) and occlusion-derived virus (ODV). BVs disseminate viral infection throughout the tissues of the host and ODVs transmit baculovirus between insect hosts. GFP tagging experiments implicate p74 as an ODV envelope protein.	218
282447	pfam04584	Pox_A28	Poxvirus A28 family. Family of conserved Poxvirus A28 family proteins. Conserved region spans entire protein in the majority of family members.	140
309640	pfam04586	Peptidase_S78	Caudovirus prohead serine protease. Family of Caudovirus prohead serine proteases also found in a number of bacteria possibly as the result of horizontal transfer.	160
398333	pfam04587	ADP_PFK_GK	ADP-specific Phosphofructokinase/Glucokinase conserved region. In archaea a novel type of glycolytic pathway exists that is deviant from the classical Embden-Meyerhof pathway. This pathway utilizes two novel proteins: an ADP-dependent Glucokinase and an ADP-dependent Phosphofructokinase. This conserved region is present at the C-terminal of both these proteins. Interestingly this family contains sequences from higher eukaryotes.	428
398334	pfam04588	HIG_1_N	Hypoxia induced protein conserved region. This family is found in proteins thought to be involved in the response to hypoxia. Family members mostly come from diverse eukaryotic organisms however eubacterial members have been identified. This region is found at the N-terminus of the member proteins which are predicted to be transmembrane.	50
398335	pfam04589	RFX1_trans_act	RFX1 transcription activation region. The RFX family is a family of winged-helix DNA binding proteins. RFX1 is a regulatory factor essential for expression of MHC class II genes. This region is found N terminal to the RFX DNA binding region (pfam02257) in some mammalian RFX proteins, and is thought to activate transcription when associated with DNA. Deletion analysis has identified the region 233-351 in human RFX1 as being required for maximal activation.	160
398336	pfam04591	DUF596	Protein of unknown function, DUF596. This family contains several uncharacterized proteins.	68
398337	pfam04592	SelP_N	Selenoprotein P, N terminal region. SelP is the only known eukaryotic selenoprotein that contains multiple selenocysteine (Sec) residues, and accounts for more than 50% of the selenium content of rat and human plasma. It is thought to be glycosylated. SelP may have antioxidant properties. It can attach to epithelial cells, and may protect vascular endothelial cells against peroxynitrite toxicity. The high selenium content of SelP suggests that it may be involved in selenium intercellular transport or storage. The promoter structure of bovine SelP suggest that it may be involved in countering heavy metal intoxication, and may also have a developmental function. The N-terminal region of SelP can exist independently of the C terminal region. Zebrafish selenoprotein Pb lacks the C terminal Sec-rich region, and a protein encoded by the rat SelP gene and lacking this region has also been reported. N-terminal region contains a conserved SecxxCys motif, which is similar to the CysxxCys found in thioredoxins. It is speculated that the N terminal region may adopt a thioredoxin fold and catalyze redox reactions. The N-terminal region also contains a His-rich region, which is thought to mediate heparin binding. Binding to heparan proteoglycans could account for the membrane binding properties of SelP. The function of the bacterial members of this family is uncharacterized.	233
335847	pfam04593	SelP_C	Selenoprotein P, C terminal region. SelP is the only known eukaryotic selenoprotein that contains multiple selenocysteine (Sec) residues, and accounts for more than 50% of the selenium content of rat and human plasma. It is thought to be glycosylated. SelP may have antioxidant properties. It can attach to epithelial cells, and may protect vascular endothelial cells against peroxynitrite toxicity. The high selenium content of SelP suggests that it may be involved in selenium intercellular transport or storage. The promoter structure of bovine SelP suggests that it may be involved in countering heavy metal intoxication, and may also have a developmental function. The N terminal region always contains one Sec residue, and this is separated from the C terminal region (9-16 Sec residues) by a histidine-rich sequence. The large number of Sec residues in the C-terminal portion of SelP suggests that it may be involved in selenium transport or storage. However, it is also possible that this region has a redox function.	133
252691	pfam04595	Pox_I6	Poxvirus I6-like family. This family includes I6 proteins as well as the related F5L proteins.	320
282455	pfam04596	Pox_F15	Poxvirus protein F15. 	136
398338	pfam04597	Ribophorin_I	Ribophorin I. Ribophorin I is an essential subunit of oligosaccharyltransferase (OST), which is also known as Dolichyl-diphosphooligosaccharide--protein glycosyltransferase, (EC:2.4.1.119). OST catalyzes the transfer of an oligosaccharide from dolichol pyrophosphate to selected asparagine residues of nascent polypeptides as they are translocated into the lumen of the rough endoplasmic reticulum. Ribophorin I and OST48 are thought to be responsible for OST catalytic activity. Both yeast and mammalian proteins are glycosylated but the sites are not conserved. Glycosylation may contribute towards general solubility but is unlikely to be involved in a specific biochemical function. Most family members are predicted to have a transmembrane helix at the C-terminus of this region.	439
398339	pfam04598	Gasdermin	Gasdermin family. The precise function of this protein is unknown. A deletion/insertion mutation is associated with an autosomal dominant non-syndromic hearing impairment form. In addition, this protein has also been found to contribute to acquired etoposide resistance in melanoma cells. This family also includes the gasdermin protein	240
282458	pfam04599	Pox_G5	Poxvirus G5 protein. This protein has been predicted to be related to the FEN-1 endonuclease.	425
282459	pfam04601	DUF569	Domain of unknown function (DUF569). Family of hypothetical proteins. Some family members contain two copies of the domain.	142
377388	pfam04602	Arabinose_trans	Mycobacterial cell wall arabinan synthesis protein. Arabinosyltransferase is involved in arabinogalactan (AG) biosynthesis pathway in mycobacteria. AG is a component of the macromolecular assembly of the mycolyl-AG-peptidoglycan complex of the cell wall. This enzyme has important clinical applications as it is believed to be the target of the antimycobacterial drug Ethambutol.	459
398340	pfam04603	Mog1	Ran-interacting Mog1 protein. Segregation of nuclear and cytoplasmic processes facilitates regulation of many eukaryotic cellular functions such as gene expression and cell cycle progression. Trafficking through the nuclear pore requires a number of highly conserved soluble factors that escort macromolecular substrates into and out of the nucleus. The Mog1 protein has been shown to interact with RanGTP which stimulates guanine nucleotide release, suggesting Mog1 regulates the nuclear transport functions of Ran. The human homolog of Mog1 is thought to be alternatively spliced.	136
368018	pfam04604	L_biotic_typeA	Type-A lantibiotic. Lantibiotics are antibiotic peptides distinguished by the presence of the rare thioether amino acids lanthionine and/or methyl-lanthionine. They are produced by Gram-positive bacteria as gene-encoded precursor peptides and undergo post-translational modification to generate the mature peptide. Based on their structural and functional features lantibiotics are currently divided into two major groups: the flexible amphiphilic type-A and the rather rigid and globular type-B. Type-A lantibiotics act primarily by pore formation in the bacterial membrane by a mechanism involving the interaction with specific docking molecules such as the membrane precursor lipid II.	51
398341	pfam04606	Ogr_Delta	Ogr/Delta-like zinc finger. This is a viral family of phage zinc-binding transcriptional activators, which also contains cryptic members in some bacterial genomes. The P4 phage delta protein contains two such domains attached covalently, while the P2 phage Ogr proteins possess one domain but function as dimers. All the members of this family have the following consensus sequence: C-X(2)-C-X(3)-A-(X)2-R-X(15)-C-X(4)-C-X(3)-F. This family also includes zinc fingers in recombinase proteins.	47
398342	pfam04607	RelA_SpoT	Region found in RelA / SpoT proteins. This region of unknown function is found in RelA and SpoT of Escherichia coli, and their homologs in plants and in other eubacteria. RelA is a guanosine 3',5'-bis-pyrophosphate (ppGpp) synthetase (EC:2.7.6.5) while SpoT is thought to be a bifunctional enzyme catalyzing both ppGpp synthesis and degradation (ppGpp 3'-pyrophosphohydrolase, (EC:3.1.7.2)). This region is often found in association with HD (pfam01966), a metal-dependent phosphohydrolase, TGS (pfam02824) which is a possible nucleotide-binding region, and the ACT regulatory domain (pfam01842).	113
398343	pfam04608	PgpA	Phosphatidylglycerophosphatase A. This family represents a family of bacterial phosphatidylglycerophosphatases (EC:3.1.3.27), known as PgpA. It appears that bacteria possess several phosphatidylglycerophosphatases, and thus, PgpA is not essential in Escherichia coli.	144
398344	pfam04609	MCR_C	Methyl-coenzyme M reductase operon protein C. Methyl coenzyme M reductase (MCR) catalyzes the final step in methanogenesis. MCR is composed of three subunits, alpha (pfam02249), beta (pfam02241) and gamma (pfam02240). Genes encoding the beta (mcrB) and gamma (mcrG) subunits are separated by two open reading frames coding for two proteins C and D. The function of proteins C and D (this family) is unknown. This family now also includes family MtrC_related,	271
377390	pfam04610	TrbL	TrbL/VirB6 plasmid conjugal transfer protein. 	214
282467	pfam04611	AalphaY_MDB	Mating type protein A alpha Y mating type dependent binding region. This region is important for the mating type dependent binding of Y protein to the A alpha Z protein of another mating type in Schizophyllum commune.	145
398345	pfam04612	T2SSM	Type II secretion system (T2SS), protein M. This family of membrane proteins consists of Type II secretion system protein M sequences from several Gram-negative (diderm) bacteria. The precise function of these proteins is unknown, though in Vibrio cholerae, the T2SM (EpsM) protein interacts with the T2SL (EpsL) protein, and also forms homodimers.	159
398346	pfam04613	LpxD	UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase, LpxD. UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase (EC 2.3.1.-) catalyzes an early step in lipid A biosynthesis: UDP-3-O-(3-hydroxytetradecanoyl)glucosamine + (R)-3-hydroxytetradecanoyl- [acyl carrier protein] -> UDP-2,3-bis(3-hydroxytetradecanoyl)glucosamine + [acyl carrier protein]. Members of this family also contain a hexapeptide repeat (pfam00132). This family constitutes the non-repeating region of LPXD proteins.	69
398347	pfam04614	Pex19	Pex19 protein family. 	246
398348	pfam04615	Utp14	Utp14 protein. This protein is found to be part of a large ribonucleoprotein complex containing the U3 snoRNA. Depletion of the Utp proteins impedes production of the 18S rRNA, indicating that they are part of the active pre-rRNA processing complex. This large RNP complex has been termed the small subunit (SSU) processome.	729
398349	pfam04616	Glyco_hydro_43	Glycosyl hydrolases family 43. The glycosyl hydrolase family 43 contains members that are arabinanases. Arabinanases hydrolyze the alpha-1,5-linked L-arabinofuranoside backbone of plant cell wall arabinans. The structure of arabinanase Arb43A from Cellvibrio japonicus reveals a five-bladed beta-propeller fold. A long V-shaped groove, partially enclosed at one end, forms a single extended substrate-binding surface across the face of the propeller.	281
398350	pfam04617	Hox9_act	Hox9 activation region. This family constitutes the N termini of the paralogous homeobox proteins HoxA9, HoxB9, HoxC9 and HoxD9. The N terminal region is found to act as a transcription activation region. Btg1 and Btg2 - the B-cell translocation gene products - may function as cofactors for Hoxb9-mediated transcription. The Btg proteins modulate Hoxb9 transcriptional activity by recruiting a multiprotein Ccr4-like complex.	182
398351	pfam04618	HD-ZIP_N	HD-ZIP protein N-terminus. This family consists of the N termini of plant homeobox-leucine zipper proteins. Its function is unknown.	99
309663	pfam04619	Adhesin_Dr	Dr-family adhesin. This family of adhesins binds to the Dr blood group antigen component of decay-accelerating factor. This mediates adherence of uropathogenic Escherichia coli to the urinary tract. This family contains both fimbriated and afimbriated adherence structures. This protein also confers the phenotype of mannose-resistant hemagglutination, which can be inhibited by chloramphenicol. The N terminal portion of the protein is thought to be responsible for chloramphenicol sensitivity.	139
309664	pfam04620	FlaA	Flagellar filament outer layer protein FlaA. Periplasmic flagella are the organelles of spirochete mobility, and are structurally different from the flagella of other motile bacteria. They reside inside the cell within the periplasmic space, and confer mobility in viscous gel-like media such as connective tissue. The flagella are composed of an outer sheath of FlaA proteins and a core filament of FlaB proteins. Each species usually has several FlaA protein species.	230
398352	pfam04621	ETS_PEA3_N	PEA3 subfamily ETS-domain transcription factor N terminal domain. The N-terminus of the PEA3 transcription factors is implicated in transactivation and in inhibition of DNA binding. Transactivation is potentiated by activation of the Ras/MAP kinase and protein kinase A signalling cascades. The N terminal region contains conserved MAP kinase phosphorylation sites.	342
398353	pfam04622	ERG2_Sigma1R	ERG2 and Sigma1 receptor like protein. This family consists of the fungal C-8 sterol isomerase and mammalian sigma1 receptor. C-8 sterol isomerase (delta-8--delta-7 sterol isomerase), catalyzes a reaction in ergosterol biosynthesis, which results in unsaturation at C-7 in the B ring of sterols. Sigma 1 receptor is a low molecular mass mammalian protein located in the endoplasmic reticulum, which interacts with endogenous steroid hormones, such as progesterone and testosterone. It also binds the sigma ligands, which are a set of chemically unrelated drugs including haloperidol, pentazocine, and ditolylguanidine. Sigma1 effectors are not well understood, but sigma1 agonists have been observed to affect NMDA receptor function, the alpha-adrenergic system and opioid analgesia.	193
113396	pfam04623	Adeno_E1B_55K_N	Adenovirus E1B protein N-terminus. This family constitutes the amino termini of E1B 55 kDa (pfam01696). E1B 55K binds the tumor suppressor protein p53, converting it from a transcriptional activator which responds to damaged DNA into an unregulated repressor of genes with a p53 binding site. This protects the virus against p53 induced host antiviral responses and prevents apoptosis as induced by the adenovirus E1A protein. The role of the N-terminus in the function of E1B is not known.	71
309667	pfam04624	Dec-1	Dec-1 repeat. The defective chorion-1 gene (dec-1) in Drosophila encodes follicle cell proteins necessary for proper eggshell assembly. Multiple products of the dec-1 gene are formed by alternative RNA splicing and proteolytic processing. Cleavage products include S80 (80 kDa) which is incorporated into the eggshell, and further proteolysis of S80 gives S60 (60 kDa). This repeat is usually found in 12 copies in the central region of the protein. Its function is unknown. Length polymorphisms of Dec-1 have been observed in wild-type strains, and are caused by changes in the numbers of the first five repeats.	27
368028	pfam04625	DEC-1_N	DEC-1 protein, N-terminal region. The defective chorion-1 gene (dec-1) in Drosophila encodes follicle cell proteins necessary for proper eggshell assembly. Multiple products of the dec-1 gene are formed by alternative RNA splicing and proteolytic processing. Cleavage products include S80 (80 kDa) which is incorporated into the eggshell, and further proteolysis of S80 gives S60 (60 kDa).	403
282480	pfam04626	DEC-1_C	Dec-1 protein, C terminal region. The defective chorion-1 gene (dec-1) in Drosophila encodes follicle cell proteins necessary for proper eggshell assembly. Multiple products of the dec-1 gene are formed by alternative RNA splicing and proteolytic processing. Cleavage products include S80 (80 kDa) which is incorporated into the eggshell, and further proteolysis of S80 gives S60 (60 kDa). Alternative splicing generates different carboxyl terminal ends in different protein isoforms, so this region is the most C terminal region that is present in the main isoforms.	131
398354	pfam04627	ATP-synt_Eps	Mitochondrial ATP synthase epsilon chain. This family constitutes the mitochondrial ATP synthase epsilon subunit. This is not to be confused with the bacterial epsilon subunit, which is homologous to the mitochondrial delta subunit (pfam00401 and pfam02823). The epsilon subunit is located in the extrinsic membrane section F1, which is the catalytic site of ATP synthesis. The epsilon subunit was not well ordered in the crystal structure of bovine F1, but it is known to be located in the stalk region of F1. E subunit is thought to be involved in the regulation of ATP synthase, since a null mutation increased oligomycin sensitivity and decreased inhibition by inhibitor protein IF1.	49
335859	pfam04628	Sedlin_N	Sedlin, N-terminal conserved region. Mutations in this protein are associated with the X-linked spondyloepiphyseal dysplasia tarda syndrome (OMIM:313400). This family represents an N-terminal conserved region.	129
398355	pfam04629	ICA69	Islet cell autoantigen ICA69, C-terminal domain. This family includes a 69 kD protein which has been identified as an islet cell autoantigen in type I diabetes mellitus. Its precise function is unknown.	254
309671	pfam04630	Phage_TTP_1	Phage tail tube protein. This is a family of phage tail tube proteins from Myoviridae.	199
282484	pfam04631	PIF2	Per os infectivity factor 2. This family includes several hypothetical baculoviral proteins, with predicted molecular weights of approximately 44 kD. Family members include per os infectivity factor 2 (PIF2). PIF2 forms a stable complex with PIF1, PIF3, PIF4 which is essential for oral infectivity of Autographa californica multinucleocapsid nucleopolyhedrovirus (AcMNPV) in insect larvae, and P74 is also associated with this complex.	372
398356	pfam04632	FUSC	Fusaric acid resistance protein family. This family includes a conserved region found in two proteins associated with fusaric acid resistance, FusC from Burkholderia cepacia and fdt-2 from Klebsiella oxytoca. These proteins are likely to be membrane transporter proteins.	655
282486	pfam04633	Herpes_BMRF2	Herpesvirus BMRF2 protein. 	349
398357	pfam04634	DUF600	Protein of unknown function, DUF600. This conserved region is found in several uncharacterized proteins from Gram positive bacteria.	144
398358	pfam04636	PA26	PA26 p53-induced protein (sestrin). PA26 is a p53-inducible protein. Its function is unknown. It has similarity to pfam04636 in its N-terminus.	440
282489	pfam04637	Herpes_pp85	Herpesvirus phosphoprotein 85 (HHV6-7 U14/HCMV UL25). This family includes UL25 proteins from HCMV, as well as U14 proteins from HHV 6 and HHV7. These 85 kD phosphoproteins appear to act as structural antigens, but their precise function is otherwise unknown.	542
282490	pfam04639	Baculo_E56	Baculoviral E56 protein, specific to ODV envelope. This family represents the E56 protein, which localizes to the occlusion derived virus (ODV) envelope, but not to the budded virus (BV) envelope.	293
398359	pfam04640	PLATZ	PLATZ transcription factor. Plant AT-rich sequence and zinc-binding proteins (PLATZ) are zinc-dependent DNA binding proteins. They bind to AT rich sequences and function in transcriptional repression.	79
398360	pfam04641	Rtf2	Rtf2 RING-finger. It is vital for effective cell-replication that replication is not stalled at any point by, for instance, damaged bases. Replication termination factor 2 (Rtf2) stabilizes the replication fork stalled at the site-specific replication barrier RTS1 by preventing replication restart until completion of DNA synthesis by a converging replication fork initiated at a flanking origin. The RTS1 element terminates replication forks that are moving in the cen2-distal direction while allowing forks moving in the cen2-proximal direction to pass through the region. Rtf2 contains a C2HC2 motif related to the C3HC4 RING-finger motif, and would appear to fold up, creating a RING finger-like structure but forming only one functional Zn2+ ion-binding site. This domain is also found at the N-terminus of peptidyl-prolyl cis-trans isomerase 4, a divergent cyclophilin family.	258
282493	pfam04642	DUF601	Protein of unknown function, DUF601. This family represents a conserved region found in several uncharacterized plant proteins.	327
368034	pfam04643	Motilin_assoc	Motilin/ghrelin-associated peptide. This family represents a peptide sequence that lies C-terminal to motilin/ghrelin on the respective precursor peptide. Its function is unknown.	59
398361	pfam04644	Motilin_ghrelin	Motilin/ghrelin. Motilin is a gastrointestinal regulatory polypeptide produced by motilin cells in the duodenal epithelium. It is released into the general circulation at about 100-min intervals during the inter-digestive state and is the most important factor in controlling the inter-digestive migrating contractions. Motilin also stimulates endogenous release of the endocrine pancreas. This family also includes ghrelin, a growth hormone secretagogue synthesized by endocrine cells in the stomach. Ghrelin stimulates growth hormone secretagogue receptors in the pituitary. These receptors are distinct from the growth hormone-releasing hormone receptors, and thus provide a means of controlling pituitary growth hormone release by the gastrointestinal system.	28
282496	pfam04645	DUF603	Protein of unknown function, DUF603. This family includes several uncharacterized proteins from Borrelia species.	181
398362	pfam04646	DUF604	Protein of unknown function, DUF604. This family includes a conserved region found in several uncharacterized plant proteins.	256
398363	pfam04647	AgrB	Accessory gene regulator B. The agr locus consists of two transcripts: RNAII and RNAIII. RNAII encodes four genes (agrA, B, C, and D) whose gene products assemble a quorum sensing system. AgrB and AgrD are essential for the production of the autoinducing peptide which functions as a signal for quorum sensing. AgrB is a transmembrane protein.	185
398364	pfam04648	MF_alpha	Yeast mating factor alpha hormone. The hormone is excreted into the culture medium by haploid cells of the alpha mating type and acts on cells of the opposite mating type (type A). It inhibits DNA synthesis in type A cells synchronising them with type alpha, and so mediates the conjugation process.	13
68229	pfam04649	VlpA_repeat	Mycoplasma hyorhinis VlpA repeat. This repeat is found in the extracellular (C-terminal) region of the variant surface antigen A (VlpA) of Mycoplasma hyorhinis. Mutations that change the number of repeats in the protein are involved in antigenic variation and immune evasion of this swine pathogen.	13
398365	pfam04650	YSIRK_signal	YSIRK type signal peptide. Many surface proteins found in Streptococcus, Staphylococcus, and related lineages share apparently homologous signal sequences. A motif resembling [YF]SIRKxxxGxxS[VIA] appears at the start of the transmembrane domain. The GxxS motif appears perfectly conserved, suggesting a specific function and not just homology. There is a strong correlation between proteins carrying this region at the N-terminus and those carrying the Gram-positive anchor domain with the LPXTG sortase processing site at the C-terminus.	25
282501	pfam04651	Pox_A12	Poxvirus A12 protein. 	183
398366	pfam04652	Vta1	Vta1 like. Vta1 (VPS20-associated protein 1) is a positive regulator of Vps4. Vps4 is an ATPase that is required in the multivesicular body (MVB) sorting pathway to dissociate the endosomal sorting complex required for transport (ESCRT). Vta1 promotes correct assembly of Vps4 and stimulates its ATPase activity through its conserved Vta1/SBP1/LIP5 region.	133
398367	pfam04654	DUF599	Protein of unknown function, DUF599. This family includes several uncharacterized proteins.	211
398368	pfam04655	APH_6_hur	Aminoglycoside/hydroxyurea antibiotic resistance kinase. The aminoglycoside phosphotransferases achieve inactivation of their antibiotic substrates by phosphorylation utilising ATP. Likewise hydroxyurea is inactivated by phosphorylation of the hydroxy group in the hydroxylamine moiety.	250
282505	pfam04656	Pox_E6	Pox virus E6 protein. Family of pox virus E6 proteins.	566
398369	pfam04657	DMT_YdcZ	Putative inner membrane exporter, YdcZ. DMT_YdcZ is a family of putative inner membrane exporters from both Gram-positive and Gram-negative bacteria.	139
398370	pfam04658	TAFII55_N	TAFII55 protein conserved region. The general transcription factor, TFIID, consists of the TATA-binding protein (TBP) associated with a series of TBP-associated factors (TAFs) that together participate in the assembly of the transcription preinitiation complex. TAFII55 binds to TAFII250 and inhibits its acetyltransferase activity. The exact role of TAFII55 is currently unknown. The conserved region is situated towards the N-terminus of the protein.	161
398371	pfam04659	Arch_fla_DE	Archaeal flagella protein. Family of archaeal flaD and flaE proteins. Conserved region found at N-terminus of flaE but towards the C-terminus of flaD.	96
282509	pfam04660	Nanovirus_coat	Nanovirus coat protein. Family of conserved Nanoviral coat proteins.	177
282510	pfam04661	Pox_I3	Poxvirus I3 ssDNA-binding protein. 	257
252728	pfam04662	Luteo_PO	Luteovirus P0 protein. This family of proteins may be involved in suppression of PTGS, a plant defense mechanism.	208
398372	pfam04663	Phenol_monoox	Phenol hydroxylase conserved region. Under aerobic conditions, phenol is usually hydroxylated to catechol and degraded via the meta or ortho pathways. Two types of phenol hydroxylase are known: one is a multi-component enzyme, the other is a single-component monooxygenase. This region is found in both types of enzymes.	66
398373	pfam04664	OGFr_N	Opioid growth factor receptor (OGFr) conserved region. Opioid peptides act as growth factors in neural and non-neural cells and tissues, in addition to serving in neurotransmission/neuromodulation in the nervous system. The Opioid growth factor receptor is an integral membrane protein associated with the nucleus. The conserved region is situated at the N-terminus of the member proteins with a series of imperfect repeats lying immediately to its C-terminus.	208
282513	pfam04665	Pox_A32	Poxvirus A32 protein. The A32 protein is thought to be involved in viral DNA packaging.	241
398374	pfam04666	Glyco_transf_54	N-Acetylglucosaminyltransferase-IV (GnT-IV) conserved region. The complex-type of oligosaccharides are synthesized through elongation by glycosyltransferases after trimming of the precursor oligosaccharides transferred to proteins in the endoplasmic reticulum. N-Acetylglucosaminyltransferases (GnTs) take part in the formation of branches in the biosynthesis of complex-type sugar chains. In vertebrates, six GnTs, designated as GnT-I to -VI, which catalyze the transfer of GlcNAc to the core mannose residues of Asn-linked sugar chains, have been identified. GnT-IV (EC:2.4.1.145) catalyzes the transfer of GlcNAc from UDP-GlcNAc to the GlcNAc1-2Man1-3 arm of core oligosaccharide [Gn2(22)core oligosaccharide] and forms GlcNAc1-4(GlcNAc1-2)Man1-3 structure on the core oligosaccharide (Gn3(2,4,2)core oligosaccharide). In some members the conserved region occupies all but the very N-terminus, where there is a signal sequence on all members. For other members the conserved region does not occupy the entire protein but is still to the N-terminus of the protein.	278
398375	pfam04667	Endosulfine	cAMP-regulated phosphoprotein/endosulfine conserved region. Conserved region found in both cAMP-regulated phosphoprotein 19 (ARPP-19) and Alpha/Beta endosulfine. No function has yet been assigned to ARPP-19. Endosulfine is the endogenous ligand for the ATP-dependent potassium (K ATP) channels which occupy a key position in the control of insulin release from the pancreatic beta cell by coupling cell polarity to metabolism. In both cases the region occupies the majority of the protein.	80
398376	pfam04668	Tsg	Twisted gastrulation (Tsg) protein conserved region. Tsg was identified in Drosophila as being required to specify the dorsal-most structures in the embryo, for example amnioserosa. Biochemical experiments have revealed three key properties of Tsg: it can synergistically inhibit Dpp/BMP action in both Drosophila and vertebrates by forming a tripartite complex between itself, SOG/chordin and a BMP ligand; Tsg seems to enhance the Tld/BMP-1-mediated cleavage rate of SOG/chordin and may change the preference of site utilisation; Tsg can promote the dissociation of chordin cysteine-rich-containing fragments from the ligand to inhibit BMP signalling.	97
368048	pfam04669	Polysacc_synt_4	Polysaccharide biosynthesis. This family of proteins plays a role in xylan biosynthesis in plant cell walls. The precise role of IRX15/IRX15-L in xylan biosynthesis is unknown. Glucuronoxylan methyltransferase (GXMT) catalyzes 4-O-methylation of the glucuronic acid substituents of this polysaccharide. AtGXMT1 specifically transfers the methyl group from S-adenosyl-l-methionine to O-4 of alpha-d-glucopyranosyluronic acid residues that are linked to O-2 of the xylan backbone. The function of members of this family in animals and fungi is not known.	177
398377	pfam04670	Gtr1_RagA	Gtr1/RagA G protein conserved region. GTR1 was first identified in S. cerevisiae as a suppressor of a mutation in RCC1. Biochemical analysis revealed that Gtr1 is in fact a G protein of the Ras family. The RagA/B proteins are the human homologs of Gtr1. Included in this family is the human Rag C, a novel protein that has been shown to interact with RagA/B.	231
309696	pfam04671	Ag332	Erythrocyte membrane-associated giant protein antigen 332. To date many different Plasmodium antigens recognized by hyperimmune human sera have been cloned, sequenced and characterized. The majority contain tandemly repeated amino acid sequences which make up a considerable portion of the protein sequence. It has been suggested that these repeat-containing antigens may provide an immunological 'smokescreen' to the parasite in order to evade the human immune system. This repeat is found exclusively in the Plasmodium falciparum Ag332 protein and occupies most of its length.	21
252734	pfam04672	Methyltransf_19	S-adenosyl methyltransferase. This family contains a SAM (S-adenosyl methyltransferase) domain, with a central beta sheet with 3 alpha-helices on both sides. Crystal packing analysis of the structure 3giw suggests that a monomer is the solution state oligomeric form. An unidentified ligand (UNL, cyan) was found at the putative active site surrounded by the residues His57, His170, Phe171, Tyr216 and Met22. The UNL is likely to be a phenylalanine or phenylalanine-like molecule. (details derived from TOPSAN).	268
398378	pfam04673	Cyclase_polyket	Polyketide synthesis cyclase. This family represents a number of cyclases involved in polyketide synthesis in a number of actinobacterial species.	104
398379	pfam04674	Phi_1	Phosphate-induced protein 1 conserved region. Family of conserved plant proteins. Conserved region identified in a phosphate-induced protein of unknown function.	265
398380	pfam04675	DNA_ligase_A_N	DNA ligase N-terminus. This region is found in many but not all ATP-dependent DNA ligase enzymes (EC:6.5.1.1). It is thought to be involved in DNA binding and in catalysis. In human DNA ligase I, and in Saccharomyces cerevisiae, this region was necessary for catalysis, and separated from the amino terminus by targeting elements. In vaccinia virus this region was not essential for catalysis, but deletion decreases the affinity for nicked DNA and decreased the rate of strand joining at a step subsequent to enzyme-adenylate formation.	169
398381	pfam04676	CwfJ_C_2	Protein similar to CwfJ C-terminus 2. This region is found in the N-terminus of Schizosaccharomyces pombe protein CwfJ. CwfJ is part of the Cdc5p complex involved in mRNA splicing.	96
309701	pfam04677	CwfJ_C_1	Protein similar to CwfJ C-terminus 1. This region is found in the N-terminus of Schizosaccharomyces pombe protein CwfJ. CwfJ is part of the Cdc5p complex involved in mRNA splicing.	122
398382	pfam04678	MCU	Mitochondrial calcium uniporter. MCU functions with MICU1, an essential gatekeeper component of calcium-channel transport, to facilitate Ca2+ uptake into the mitochondrion.	179
398383	pfam04679	DNA_ligase_A_C	ATP dependent DNA ligase C terminal region. This region is found in many but not all ATP-dependent DNA ligase enzymes (EC:6.5.1.1). It is thought to constitute part of the catalytic core of ATP dependent DNA ligase.	94
282527	pfam04680	OGFr_III	Opioid growth factor receptor repeat. Proline-rich repeat found only in a human opioid growth factor receptor.	20
113449	pfam04681	Bys1	Blastomyces yeast-phase-specific protein. The molecular function of this protein is not known. Its expression is specific to the high temperature, unicellular yeast morphology (as opposed to the lower temperature, multicellular mycelium form).	155
282528	pfam04682	Herpes_BTRF1	Herpesvirus BTRF1 protein conserved region. Herpesvirus protein.	258
398384	pfam04683	Proteasom_Rpn13	Proteasome complex subunit Rpn13 ubiquitin receptor. This family was thought originally to be involved in cell-adhesion, but the members are now known to be proteasome subunit Rpn13, a novel ubiquitin receptor. The 26S proteasome is a huge macromolecular protein-degradation machine consisting of a proteolytically active 20S core, in the form of four disc-like proteins, and one or two 19S regulatory particles. The regulatory particle(s) sit on the top and or bottom of the core, de-ubiquitinate the substrate peptides, unfold them and guide them into the narrow channel through the centre of the core. Rpn13 and its homologs dock onto the regulatory particle through the N-terminal region which binds Rpn2. The C-terminal part of the domain binds de-ubiquitinating enzyme Uch37/UCHL5 and enhances its isopeptidase activity. Rpn13 binds ubiquitin via a conserved amino-terminal region called the pleckstrin-like receptor for ubiquitin, termed Pru, domain. The domain forms two contiguous anti-parallel beta-sheets with a configuration similar to the pleckstrin-homology domain (PHD) fold. Rpn13's ability to bind ubiquitin and the proteasome subunit Rpn2/S1 simultaneously supports evidence of its role as a ubiquitin receptor. Finally, when complexed to di-ubiquitin, via the Pru, and Uch37 via the C-terminal part, it frees up the distal ubiquitin for de-ubiquitination by the Uch37.	87
398385	pfam04684	BAF1_ABF1	BAF1 / ABF1 chromatin reorganising factor. ABF1 is a sequence-specific DNA binding protein involved in transcription activation, gene silencing and initiation of DNA replication. ABF1 is known to remodel chromatin, and it is proposed that it mediates its effects on transcription and gene expression by modifying local chromatin architecture. These functions require a conserved stretch of 20 amino acids in the C-terminal region of ABF1 (amino acids 639 to 662 in the S. cerevisiae protein). The N-terminal two thirds of the protein are necessary for DNA binding, and the N-terminus (amino acids 9 to 91 in S. cerevisiae) is thought to contain a novel zinc-finger motif which may stabilize the protein structure.	501
398386	pfam04685	DUF608	Glycosyl-hydrolase family 116, catalytic region. This represents a family of archaeal, bacterial and eukaryotic glycosyl hydrolases, that belong to superfamily GH116. The primary catabolic pathway for glucosylceramide is catalysis by the lysosomal enzyme glucocerebrosidase. In higher eukaryotes, glucosylceramide is the precursor of glycosphingolipids, a complex group of ubiquitous membrane lipids. Mutations in the human protein cause motor-neurone defects in hereditary spastic paraplegia. The catalytic nucleophile, identified in UniProtKB:Q97YG8_SULSO, is Glu-335, with the likely acid/base at Asp-442 and the aspartates at Asp-406 and Asp-458 residues also playing a role in the catalysis of glucosides and xylosides that are beta-bound to hydrophobic groups. The family is defined as GH116, which presently includes enzymes with beta-glucosidase, EC:3.2.1.21, beta-xylosidase, EC:3.2.1.37, and glucocerebrosidase EC:3.2.1.45 activity.	362
398387	pfam04686	SsgA	Streptomyces sporulation and cell division protein, SsgA. The precise function of SsgA is unknown. It has been found to be essential for spore formation, and to stimulate cell division.	97
282533	pfam04687	Microvir_H	Microvirus H protein (pilot protein). A single molecule of H protein is found on each of the 12 spikes on the microvirus shell. H is involved in the ejection of the phage DNA, and at least one copy is injected into the host's periplasmic space along with the ssDNA viral genome. Part of H is thought to lie outside the shell, where it recognizes lipopolysaccharide from virus-sensitive strains. Part of H may lie within the capsid, since mutations in H can influence the DNA ejection mechanism by affecting the DNA-protein interactions. H may span the capsid through the hydrophilic channels formed by G proteins. Elucidation of the DNA-ejection mechanism from the crystal structure of part of the H protein shows that this tail-less icosahedral, single-stranded DNA phiX174-like coliphage bacteriophage requires H as a pilot protein for its DNA-delivery. H oligomerizes to form a tube the function of which seems to be the delivery of the DNA genome across the host's periplasmic space into the host cytoplasm. The tube is constructed of ten alpha-helices with their amino termini arrayed in a right-handed super-helical coiled-coil and their carboxy termini arrayed in a left-handed super-helical coiled-coil. The tube spans the periplasmic space and is present while the genome is being delivered into the host cell's cytoplasm.	310
398388	pfam04688	Holin_SPP1	SPP1 phage holin. This family constitutes holin proteins from the dsDNA Siphoviridae group bacteriophages with two transmembrane segments. Most bacteriophages require an endolysin and a holin for host lysis. During late gene expression, holins accumulate and oligomerize in the host cell membrane. They then suddenly trigger to permeabilise the membrane, which causes lysis by allowing endolysin to attack the peptidoglycan. There are thought to be at least 35 different families of holin genes.	74
282535	pfam04689	S1FA	DNA binding protein S1FA. S1FA is a DNA-binding protein found in plants that specifically recognizes the negative promoter element S1F.	66
398389	pfam04690	YABBY	YABBY protein. YABBY proteins are a group of plant-specific transcription factors involved in the specification of abaxial polarity in lateral organs.	163
398390	pfam04691	ApoC-I	Apolipoprotein C-I (ApoC-1). Apolipoprotein C-I (ApoC-1) is a water-soluble protein component of plasma lipoprotein. It solubilises lipids and regulates lipid metabolism. ApoC-1 transfers among HDL (high density lipoprotein), VLDL (very low-density lipoprotein) and chylomicrons. ApoC-1 activates lecithin:cholesterol acyltransferase (LCAT), inhibits cholesteryl ester transfer protein, can inhibit hepatic lipase and phospholipase 2 and can stimulate cell growth. ApoC-1 delays the clearance of beta-VLDL by inhibiting its uptake via the LDL receptor-related pathway. ApoC-1 has been implicated in hypertriglyceridemia, and Alzheimer's disease.	60
398391	pfam04692	PDGF_N	Platelet-derived growth factor, N terminal region. This family consists of the amino terminal regions of platelet-derived growth factor (PDGF, pfam00341) A and B chains.	77
147046	pfam04693	DDE_Tnp_2	Archaeal putative transposase ISC1217. 	327
282539	pfam04694	Corona_3	Coronavirus ORF3 protein. 	59
398392	pfam04695	Pex14_N	Peroxisomal membrane anchor protein (Pex14p) conserved region. Family of peroxisomal membrane anchor proteins which bind the PTS1 (peroxisomal targeting signal) receptor and are required for the import of PTS1-containing proteins into peroxisomes. Loss of functional Pex14p results in defects in both the PTS1 and PTS2-dependent import pathways. Deletion analysis of this conserved region implicates it in selective peroxisome degradation. In the majority of members this region is situated at the N-terminus of the protein.	46
398393	pfam04696	Pinin_SDK_memA	pinin/SDK/memA/ protein conserved region. Members of this family have very varied localizations within the eukaryotic cell. pinin is known to localize at the desmosomes and is implicated in anchoring intermediate filaments to the desmosomal plaque. SDK2/3 is a dynamically localized nuclear protein thought to be involved in modulation of alternative pre-mRNA splicing. memA is a tumor marker preferentially expressed in human melanoma cell lines. A common feature of the members of this family is that they may all participate in regulating protein-protein interactions.	130
398394	pfam04697	Pinin_SDK_N	pinin/SDK conserved region. SDK2/3 is localized in nuclear speckles whereas pinin is known to localize at the desmosomes where it is thought to be involved in anchoring intermediate filaments to the desmosomal plaque. The role of SDK2/3 in the nucleus is thought to be concerned with modulation of alternative pre-mRNA splicing. pinin has also been implicated as a tumor suppressor. The conserved region is found at the N-terminus of the member proteins.	132
398395	pfam04698	Rab_eff_C	Rab effector MyRIP/melanophilin C-terminus. This domain is found at the C-terminus of the Rab effector proteins MyRIP and melanophilin.	715
398396	pfam04699	P16-Arc	ARP2/3 complex 16 kDa subunit (p16-Arc). The Arp2/3 protein complex has been implicated in the control of actin polymerization. The human complex consists of seven subunits which include the actin related proteins Arp2 and Arp3, and five others referred to as p41-Arc, p34-Arc, p21-Arc, p20-Arc, and p16-Arc. The precise function of p16-Arc is currently unknown. Its structure consists of a single domain containing a bundle of seven alpha helices.	147
282545	pfam04700	Baculo_gp41	Structural glycoprotein p40/gp41 conserved region. Family of viral structural glycoproteins.	185
282546	pfam04701	Pox_D2	Pox virus D2 protein. 	139
252749	pfam04702	Vicilin_N	Vicilin N terminal region. This region is found in plant seed storage proteins, N-terminal to the Cupin domain (pfam00190). In Macadamia integrifolia, this region is processed into peptides of approximately 50 amino acids containing a C-X-X-X-C-(10-12)X-C-X-X-X-C motif. These peptides exhibit antimicrobial activity in vitro.	147
398397	pfam04703	FaeA	FaeA-like protein. This family represents a number of fimbrial protein transcription regulators found in Gram-negative bacteria. These proteins are thought to facilitate binding of the leucine-rich regulatory protein to regulatory elements, possibly by inhibiting deoxyadenosine methylation of these elements by deoxyadenosine methylase.	61
398398	pfam04704	Zfx_Zfy_act	Zfx / Zfy transcription activation region. Zfx and Zfy are transcription factors implicated in mammalian sex determination. This region is found N terminal to multiple copies of a C2H2 Zinc finger (pfam00096). This region has been shown to activate transcription when fused to a GAL4 DNA binding domain.	328
368067	pfam04705	TSNR_N	Thiostrepton-resistance methylase, N-terminus. This region is found in some members of the SpoU-type rRNA methylase family (pfam00588).	111
398399	pfam04706	Dickkopf_N	Dickkopf N-terminal cysteine-rich region. Dickkopf proteins are a class of Wnt antagonists. They possess two conserved cysteine-rich regions. This family represents the N-terminal one. The C-terminal region has been found to share significant sequence similarity to the colipase fold, pfam01114, pfam02740.	50
398400	pfam04707	PRELI	PRELI-like family. This family includes a conserved region found in the PRELI protein and yeast YLR168C gene MSF1 product. The function of this protein is unknown, though it is thought to be involved in intra-mitochondrial protein sorting. This region is also found in a number of other eukaryotic proteins.	156
282551	pfam04708	Pox_F16	Poxvirus F16 protein. 	215
398401	pfam04709	AMH_N	Anti-Mullerian hormone, N terminal region. Anti-Mullerian hormone, AMH is a signalling molecule involved in male and female sexual differentiation. Defects in synthesis or action of AMH cause persistent Mullerian duct syndrome (PMDS), a rare form of male pseudohermaphroditism. This family represents the N terminal part of the protein, which is not thought to be essential for activity. AMH contains a TGF-beta domain (pfam00019), at the C-terminus.	390
398402	pfam04710	Pellino	Pellino. Pellino is involved in Toll-like signalling pathways, and associates with the kinase domain of the Pelle Ser/Thr kinase.	409
398403	pfam04711	ApoA-II	Apolipoprotein A-II (ApoA-II). Apolipoprotein A-II (ApoA-II) is the second major apolipoprotein of high density lipoprotein in human plasma. Mature ApoA-II is present as a dimer of two 77-amino acid chains joined by a disulphide bridge. ApoA-II regulates many steps in HDL metabolism, and its role in coronary heart disease is unclear. In bovine serum, the ApoA-II homolog is present in almost free form. Bovine ApoA-II shows antimicrobial activity against Escherichia coli and yeasts in phosphate buffered saline (PBS).	75
398404	pfam04712	Radial_spoke	Radial spokehead-like protein. This family includes the radial spoke head proteins RSP4 and RSP6 from Chlamydomonas reinhardtii, and several eukaryotic homologs, including mammalian RSHL1, the protein product of a familial ciliary dyskinesia candidate gene.	493
282556	pfam04713	Pox_I5	Poxvirus protein I5. 	75
398405	pfam04714	BCL_N	BCL7, N-terminal conserved region. Members of the BCL family have significant sequence similarity at their N-terminus, represented in this family. The function of BCL7 proteins is unknown. They may be involved in early development. In addition, BCL7B is commonly hemizygously deleted in patients with Williams syndrome.	48
398406	pfam04715	Anth_synt_I_N	Anthranilate synthase component I, N terminal region. Anthranilate synthase (EC:4.1.3.27) catalyzes the first step in the biosynthesis of tryptophan. Component I catalyzes the formation of anthranilate using ammonia and chorismate. The catalytic site lies in the adjacent region, described in the chorismate binding enzyme family (pfam00425). This region is involved in feedback inhibition by tryptophan. This family also contains a region of Para-aminobenzoate synthase component I (EC 4.1.3.-).	141
398407	pfam04716	ETC_C1_NDUFA5	ETC complex I subunit conserved region. Family of eukaryotic NADH-ubiquinone oxidoreductase subunits (EC:1.6.5.3) (EC:1.6.99.3) from complex I of the electron transport chain initially identified in Neurospora crassa as a 29.9 kDa protein. The conserved region is found at the N-terminus of the member proteins.	66
398408	pfam04717	Phage_base_V	Type VI secretion system, phage-baseplate injector. Family of bacterial and phage baseplate assembly proteins responsible for forming the small spike at the end of the tail or bacterial pathogenic needle-shaft.	75
398409	pfam04718	ATP-synt_G	Mitochondrial ATP synthase g subunit. The Fo sector of the ATP synthase is a membrane bound complex which mediates proton transport. It is composed of nine different polypeptide subunits (a, b, c, d, e, f, g, F6, A6L). The function of subunit g is currently unknown. The conserved region covers all but the very N-terminus of the member sequences. No prokaryotic members have been identified thus far.	92
398410	pfam04719	TAFII28	hTAFII28-like protein conserved region. The general transcription factor, TFIID, consists of the TATA-binding protein (TBP) associated with a series of TBP-associated factors (TAFs) that together participate in the assembly of the transcription preinitiation complex. The conserved region is found at the C-terminal of most member proteins. The crystal structure of hTAFII28 with hTAFII18 shows that this region is involved in the binding of these two subunits. The conserved region contains four alpha helices and three loops arranged as in histone H3.	85
398411	pfam04720	PDDEXK_6	PDDEXK-like family of unknown function. PDDEXK_6 is a family of plant proteins that are distant homologs of the PD-(D/E)XK nuclease superfamily. The core structure is retained, as alpha-beta-beta-beta-alpha-beta. It retains the characteristic PDDEXK motifs II and III in modified forms - xDxxx motif located in the second core beta-strand, where x is any hydrophobic residue, and a (D/E)X(D/N/S/C/G) pattern. The missing positively charged residue in motif III is possibly replaced by a conserved arginine in motif IV located in the proceeding alpha-helix. The family is not in general fused with any other domains, so its function cannot be predicted.	215
398412	pfam04721	PAW	PNGase C-terminal domain, mannose-binding module PAW. The PAW domain is found at the C-terminus of PNGase, or peptide-N-glycanase, enzymes. It was named for 'domain present in PNGases and other worm proteins'. PNGase catalyzes the deglycosylation of several misfolded N-linked glycoproteins by cleaving off the bulky glycan chain before these proteins are degraded by the proteasome. PNGase specifically acts on the unfolded form of high-mannose type N-glycosylated proteins, and this domain appears to be the mannose-binding domain, which contributes to the oligosaccharide-binding specificity of PNGase.	197
398413	pfam04722	Ssu72	Ssu72-like protein. The highly conserved and essential protein Ssu72 has intrinsic phosphatase activity and plays an essential role in the transcription cycle. Ssu72 was originally identified in a yeast genetic screen as enhancer of a defect caused by a mutation in the transcription initiation factor TFIIB. It binds to TFIIB and is also involved in mRNA elongation. Ssu72 is further involved in both poly(A) dependent and independent termination. It is a subunit of the yeast cleavage and polyadenylation factor (CPF), which is part of the machinery for mRNA 3'-end formation. Ssu72 is also essential for transcription termination of snRNAs.	189
377402	pfam04723	GRDA	Glycine reductase complex selenoprotein A. Found in clostridia, this protein contains one active site selenocysteine and catalyzes the reductive deamination of glycine, which is coupled to the esterification of orthophosphate resulting in the formation of ATP. A member of this family may also exist in Treponema denticola.	147
398414	pfam04724	Glyco_transf_17	Glycosyltransferase family 17. This family represents beta-1,4-mannosyl-glycoprotein beta-1,4-N-acetylglucosaminyltransferase (EC:2.4.1.144). This enzyme transfers the bisecting GlcNAc to the core mannose of complex N-glycans. The addition of this residue is regulated during development and has functional consequences for receptor signalling, cell adhesion, and tumor progression.	349
309736	pfam04725	PsbR	Photosystem II 10 kDa polypeptide PsbR. This protein is associated with the oxygen-evolving complex of photosystem II. Its function in photosynthesis is not known. The C-terminal hydrophobic region functions as a thylakoid transfer signal but is not removed.	98
282569	pfam04726	Microvir_J	Microvirus J protein. This small protein is involved in DNA packaging, interacting with DNA via its hydrophobic carboxyl terminus. In bacteriophage phi-X174, J is present in 60 copies, and forms an S-shaped polypeptide chain without any secondary structure. It is thought to interact with DNA through simple charge interactions.	37
398415	pfam04727	ELMO_CED12	ELMO/CED-12 family. This family represents a conserved domain which is found in a number of eukaryotic proteins including CED-12, ELMO I and ELMO II. ELMO1 is a component of signalling pathways that regulate phagocytosis and cell migration and is the mammalian orthologue of the C. elegans gene, ced-12. CED-12 is required for the engulfment of dying cells and cell migration. In mammalian cells, ELMO1 interacts with Dock180 as part of the CrkII/Dock180/Rac pathway responsible for phagocytosis and cell migration. ELMO1 is ubiquitously expressed, although its expression is highest in the spleen, an organ rich in immune cells. ELMO1 has a PH domain and a polyproline sequence motif at its C-terminus which are not present in this alignment.	165
398416	pfam04728	LPP	Lipoprotein leucine-zipper. This leucine-zipper is found in the enterobacterial outer membrane lipoprotein LPP. It is likely that this domain oligomerizes and is involved in protein-protein interactions. As such it is a bundle of alpha-helical coiled-coils, which are known to play key roles in mediating specific protein-protein interactions in molecular recognition and the assembly of multi-protein complexes.	53
398417	pfam04729	ASF1_hist_chap	ASF1 like histone chaperone. This family includes the yeast and human ASF1 protein. These proteins have histone chaperone activity. ASF1 participates in both the replication-dependent and replication-independent pathways. The three-dimensional structure has been determined as a compact immunoglobulin-like beta sandwich fold topped by three helical linkers.	154
368086	pfam04730	Agro_virD5	Agrobacterium VirD5 protein. The virD operon in Agrobacterium encodes a site-specific endonuclease, and a number of other poorly characterized products. This family represents the VirD5 protein.	672
398418	pfam04731	Caudal_act	Caudal like protein activation region. This family consists of the amino termini of proteins belonging to the caudal-related homeobox protein family. This region is thought to mediate transcription activation. The level of activation caused by mouse Cdx2 is affected by phosphorylation at serine 60 via the mitogen-activated protein kinase pathway. Caudal family proteins are involved in the transcriptional regulation of multiple genes expressed in the intestinal epithelium, and are important in differentiation and maintenance of the intestinal epithelial lining. Caudal proteins always have a homeobox DNA binding domain (pfam00046).	127
368088	pfam04732	Filament_head	Intermediate filament head (DNA binding) region. This family represents the N-terminal head region of intermediate filaments. Intermediate filament heads bind DNA. Vimentin heads are able to alter nuclear architecture and chromatin distribution, and the liberation of heads by HIV-1 protease may play an important role in HIV-1 associated cytopathogenesis and carcinogenesis. Phosphorylation of the head region can affect filament stability. The head has been shown to interact with the rod domain of the same protein.	83
398419	pfam04733	Coatomer_E	Coatomer epsilon subunit. This family represents the epsilon subunit of the coatomer complex, which is involved in the regulation of intracellular protein trafficking between the endoplasmic reticulum and the Golgi complex.	288
398420	pfam04734	Ceramidase_alk	Neutral/alkaline non-lysosomal ceramidase, N-terminal. This family represents N-terminal domain of a group of neutral/alkaline ceramidases found in both bacteria and eukaryotes. The EC classification is EC:3.5.1.23. The enzyme hydrolyzes ceramide to generate sphingosine and fatty acid. The enzyme plays a regulatory role in a variety of physiological events in eukaryotes and also functions as an exotoxin in particular bacteria. This N-terminal domain carries two metal-binding sites, the first for Zn2+ residing within the domain, and the second, for Mg2+/Ca2+ lying at the interface between the two domains.	473
282577	pfam04735	Baculo_helicase	Baculovirus DNA helicase. 	1307
368091	pfam04736	Eclosion	Eclosion hormone. Eclosion hormone is an insect neuropeptide that triggers the performance of ecdysis behaviour, which causes shedding of the old cuticle at the end of a molt.	61
398421	pfam04738	Lant_dehydr_N	Lantibiotic dehydratase, N-terminus. Lantibiotics are ribosomally synthesized antimicrobial agents derived from ribosomally synthesized peptides. They are produced by bacteria of the Firmicutes phylum, and include mutacin, subtilin, and nisin. Lantibiotic peptides contain thioether bridges termed lanthionines that are thought to be generated by dehydration of serine and threonine residues followed by addition of cysteine residues. This family constitutes the N-terminus of the enzyme proposed to catalyze the dehydration step, via glutamylation of the substrate during lantibiotic biosynthesis. The enzyme dehydrates Ser/Thr residues in the precursor by glutamylation.	648
398422	pfam04739	AMPKBI	5'-AMP-activated protein kinase beta subunit, interaction domain. This region is found in the beta subunit of the 5'-AMP-activated protein kinase complex, and its yeast homologs Sip1, Sip2 and Gal83, which are found in the SNF1 kinase complex. This region is sufficient for interaction of this subunit with the kinase complex, but is not solely responsible for the interaction, and the interaction partner is not known. The isoamylase N-terminal domain (pfam02922) is sometimes found in proteins belonging to this family.	69
398423	pfam04740	LXG	LXG domain of WXG superfamily. This domain is present in the N-terminal region of a group of polymorphic toxin proteins in bacteria. It is predicted to use Type VII secretion pathway to mediate export of bacterial toxins.	202
282582	pfam04741	InvH	InvH outer membrane lipoprotein. This family represents the Salmonella outer membrane lipoprotein InvH. The molecular function of this protein is unknown, but it is required for the localization to outer membrane of InvG, which is involved in a type III secretion apparatus mediating host cell invasion.	147
398424	pfam04744	Monooxygenase_B	Monooxygenase subunit B protein. Family of membrane associated monooxygenases (EC 1.13.12.-) which utilize O(2) to oxidize their substrate. Family members include both ammonia and methane monooxygenases involved in the oxidation of their respective substrates. These enzymes are multi-subunit complexes. This family represents the B subunit of the enzyme; the A subunit is thought to contain the active site.	379
282584	pfam04745	Pox_A8	VITF-3 subunit protein. Family of Chordopoxvirus proteins composing one of the two subunits that make up VITF-3, a virally encoded complex necessary for intermediate stage transcription.	289
113513	pfam04746	DUF575	Protein of unknown function (DUF575). Family of uncharacterized proteins. Contains several chlamydial members.	101
282585	pfam04747	DUF612	Protein of unknown function, DUF612. This family includes several uncharacterized proteins from Caenorhabditis elegans.	511
398425	pfam04748	Polysacc_deac_2	Divergent polysaccharide deacetylase. This family is divergently related to pfam01522 (personal obs:Yeats C).	212
398426	pfam04749	PLAC8	PLAC8 family. This family includes the Placenta-specific gene 8 protein.	100
368096	pfam04750	Far-17a_AIG1	FAR-17a/AIG1-like protein. This family includes the hamster androgen-induced FAR-17a protein, and its human homolog, the AIG1 protein. The function of these proteins is unknown. This family also includes homologous regions from a number of other metazoan proteins.	206
398427	pfam04751	DUF615	Protein of unknown function (DUF615). This family of bacterial proteins has no known function.	139
398428	pfam04752	ChaC	ChaC-like protein. The ChaC family of proteins function as gamma-glutamyl cyclotransferases acting specifically to degrade glutathione but not other gamma-glutamyl peptides. It is conserved across all phyla and represents a new pathway for glutathione degradation in living cells.	176
398429	pfam04753	Corona_NS2	Coronavirus non-structural protein NS2. 	109
368098	pfam04754	Transposase_31	Putative transposase, YhgA-like. This family of putative transposases includes the YhgA sequence from Escherichia coli and several prokaryotic homologs.	202
309752	pfam04755	PAP_fibrillin	PAP_fibrillin. This family identifies a conserved region found in a number of plastid lipid-associated proteins (PAPs), and in a number of putative fibrillin proteins.	196
398430	pfam04756	OST3_OST6	OST3 / OST6 family, transporter family. The proteins in this family are part of a complex of eight ER proteins that transfers core oligosaccharide from dolichol carrier to Asn-X-Ser/Thr motifs. This family includes both OST3 and OST6, each of which contains four predicted transmembrane helices. Disruption of OST3 and OST6 leads to a defect in the assembly of the complex. Hence, the function of these genes seems to be essential for recruiting a fully active complex necessary for efficient N-glycosylation. These proteins are also thought to be novel Mg2+ transporters.	294
398431	pfam04757	Pex2_Pex12	Pex2 / Pex12 amino terminal region. This region is found at the N terminal of a number of known and predicted peroxins including Pex2, Pex10 and Pex12. This conserved region is usually associated with a C terminal ring finger (pfam00097) domain.	213
398432	pfam04758	Ribosomal_S30	Ribosomal protein S30. 	58
398433	pfam04759	DUF617	Protein of unknown function, DUF617. This family represents a conserved region in a number of uncharacterized plant proteins.	163
398434	pfam04760	IF2_N	Translation initiation factor IF-2, N-terminal region. This conserved feature at the N-terminus of bacterial translation initiation factor IF2 has recently had its structure solved. It shows structural similarity to the tRNA anticodon Stem Contact Fold domains of the methionyl-tRNA and glutaminyl-tRNA synthetases, and a similar fold is also found in the B5 domain of the phenylalanine-tRNA synthetase.	52
282599	pfam04761	Phage_Treg	Lactococcus bacteriophage putative transcription regulator. This family represents a number of putative transcription repressor proteins found in several Lactococcus bacteriophages. Horizontal transfer may account for the presence of similar proteins in Lactococcus.	61
398435	pfam04762	IKI3	IKI3 family. Members of this family are components of the elongator multi-subunit component of a novel RNA polymerase II holoenzyme for transcriptional elongation. This region contains WD40 like repeats.	920
368104	pfam04763	DUF562	Protein of unknown function (DUF562). Family of uncharacterized proteins.	146
252787	pfam04764	DUF613	Protein of unknown function (DUF613). Family of chloroplast proteins of unknown function. Some members have two copies of the conserved region.	120
398436	pfam04765	DUF616	Protein of unknown function (DUF616). Family of uncharacterized proteins.	303
309761	pfam04766	Baculo_p26	Nucleopolyhedrovirus p26 protein. Family of Baculovirus p26 proteins.	234
282602	pfam04767	Pox_F17	DNA-binding 11 kDa phosphoprotein. Family of poxvirus proteins required for virus morphogenesis. Protein function necessary for proteolytic processing of the major viral structural proteins, P4a and P4b.	93
398437	pfam04768	NAT	NAT, N-acetyltransferase, of N-acetylglutamate synthase. This is the C-terminal NAT or N-acetyltransferase domain of bifunctional N-acetylglutamate synthase/kinases. It catalyzes the first two steps in arginine biosynthesis. This domain contains the putative NAGS - N-acetylglutamate synthase - active site. It is found at the C-terminus of Neurospora crassa acetylglutamate synthase - amino-acid acetyltransferase, EC: 2.3.1.1. It is also found C-terminal to the amino acid kinase region (pfam00696) in some fungal acetylglutamate kinase enzymes. It stabilizes the yeast NAGK, N-acetyl-L-glutamate kinase, slows catalysis and modulates feed-back inhibition by arginine. This domain is found to be the N-acetyltransferase (NAT) domain, and it has a typical GCN5-related NAT fold and a site that catalyzes NAG synthesis which is located >25 Angstrom away from the L-arginine binding site in the N-terminal domain pfam00696.	166
398438	pfam04769	MATalpha_HMGbox	Mating-type protein MAT alpha 1 HMG-box. This family includes Saccharomyces cerevisiae mating type protein alpha 1. Mat alpha 1 is a transcription activator which activates mating-type alpha-specific genes. MAT alpha 1 and MCM 1 bind cooperatively to PQ elements upstream of alpha-specific genes. Alpha 1 interacts in vivo with STE12, linking expression of alpha-specific genes to the alpha-pheromone (pfam04648) response pathway. In silico modelling of the MAT_Alpha1 domain indicates that its best scoring templates were structures of HMG-box proteins (DOI: 10.4236/ojbiphy.2013.31001). Phylogenetic analysis suggests that the MAT_Alpha1 domain diverged from the MATA_HMG-box subfamily. The name of MATalpha_HMG-box was proposed for the MAT_alpha1 domain.	186
398439	pfam04770	ZF-HD_dimer	ZF-HD protein dimerization region. This family of proteins are plant transcription factors, and have been named ZF-HD for zinc finger homeodomain proteins, on the basis of similarity to proteins of known structure. This region is thought to be involved in the formation of homo and heterodimers, and may form a zinc finger.	55
282606	pfam04771	CAV_VP3	Chicken anaemia virus VP-3 protein. This protein is found in the nucleus of infected cells and may act as a transcriptional regulator. It induces apoptosis, and is also known as apoptin.	121
282607	pfam04772	Flu_B_M2	Influenza B matrix protein 2 (BM2). M2 is synthesized in the late phase of infection and incorporated into the virion. It may be phosphorylated in vivo. The function of BM2 is unknown.	109
398440	pfam04773	FecR	FecR protein. FecR is involved in regulation of iron dicitrate transport. In the absence of citrate FecR inactivates FecI. FecR is probably a sensor that recognizes iron dicitrate in the periplasm.	96
398441	pfam04774	HABP4_PAI-RBP1	Hyaluronan / mRNA binding family. This family includes the HABP4 family of hyaluronan-binding proteins, and the PAI-1 mRNA-binding protein, PAI-RBP1. HABP4 has been observed to bind hyaluronan (a glucosaminoglycan), but it is not known whether this is its primary role in vivo. It has also been observed to bind RNA, but with a lower affinity than that for hyaluronan. PAI-1 mRNA-binding protein specifically binds the mRNA of type-1 plasminogen activator inhibitor (PAI-1), and is thought to be involved in regulation of mRNA stability. However, in both cases, the sequence motifs predicted to be important for ligand binding are not conserved throughout the family, so it is not known whether members of this family share a common function.	106
398442	pfam04775	Bile_Hydr_Trans	Acyl-CoA thioester hydrolase/BAAT N-terminal region. This family consists of the amino termini of acyl-CoA thioester hydrolase and bile acid-CoA:amino acid N-acetyltransferase (BAAT). This region is not thought to contain the active site of either enzyme. Thioesterase isoforms have been identified in peroxisomes, cytoplasm and mitochondria, where they are thought to have distinct functions in lipid metabolism. For example, in peroxisomes, the hydrolase acts on bile-CoA esters.	120
398443	pfam04776	protein_MS5	Protein MS5. Proteins are known only from species of Brassicaceae. Protein MS5 is essential for pairing of homologs during early prophase stage of meiosis but not necessary for the initiation of DNA double-strand breaks.	119
398444	pfam04777	Evr1_Alr	Erv1 / Alr family. Biogenesis of Fe/S clusters involves a number of essential mitochondrial proteins. Erv1p of Saccharomyces cerevisiae mitochondria is required for the maturation of Fe/S proteins in the cytosol. The ALR (augmenter of liver regeneration) represents a mammalian orthologue of yeast Erv1p. Both Erv1p and full-length ALR are located in the mitochondrial intermembrane space and are thought to operate downstream of the mitochondrial ABC transporter.	91
398445	pfam04778	LMP	LMP repeated region. This family consists of a repeated sequence element found in the LMP group of surface-located membrane proteins of Mycoplasma hominis. The number of repeats in the protein affects the tendency of cells to spontaneously aggregate. Agglutination may be an important factor in colonisation. Non-agglutinating microorganisms might easily be distributed whereas aggregation might provide a better chance to avoid an antibody response since some of the epitopes may be buried.	158
398446	pfam04780	DUF629	Protein of unknown function (DUF629). This family represents a region of several plant proteins of unknown function. A C2H2 zinc finger is predicted in this region in some family members, but the spacing between the cysteine residues is not conserved throughout the family.	465
398447	pfam04781	DUF627	Protein of unknown function (DUF627). This family represents the N-terminal region of several plant proteins of unknown function.	108
398448	pfam04782	DUF632	Protein of unknown function (DUF632). This plant protein may be a leucine zipper, but there is no experimental evidence for this.	311
398449	pfam04783	DUF630	Protein of unknown function (DUF630). This region is sometimes found at the N-terminus of putative plant bZIP proteins. Its function is not known. Structural modelling suggests this domain may bind nucleic acids.	59
398450	pfam04784	DUF547	Protein of unknown function, DUF547. Family of uncharacterized proteins from C. elegans and A. thaliana.	120
282619	pfam04785	Rhabdo_M2	Rhabdovirus matrix protein M2. M protein is involved in condensing and targeting the ribonucleoprotein (RNP) coil to the plasma membrane. M interacts specifically with the transmembrane spike protein (G) and is important for the incorporation of G protein into budding virions.	202
398451	pfam04786	Baculo_DNA_bind	ssDNA binding protein. Family of Baculovirus ssDNA binding proteins.	248
282621	pfam04787	Pox_H7	Late protein H7. Family of poxvirus late H7 proteins.	143
398452	pfam04788	DUF620	Protein of unknown function (DUF620). Family of uncharacterized proteins.	243
398453	pfam04789	DUF621	Protein of unknown function (DUF621). Family of uncharacterized proteins. Some are annotated as having possible G-protein-coupled receptor-like activity.	301
398454	pfam04790	Sarcoglycan_1	Sarcoglycan complex subunit protein. The dystrophin glycoprotein complex (DGC) is a membrane-spanning complex that links the interior cytoskeleton to the extracellular matrix in muscle. The sarcoglycan complex is a subcomplex within the DGC and is composed of several muscle-specific, transmembrane proteins (alpha-, beta-, gamma-, delta- and zeta-sarcoglycan). The sarcoglycans are asparagine-linked glycosylated proteins with single transmembrane domains. This family contains beta, gamma and delta members.	253
398455	pfam04791	LMBR1	LMBR1-like membrane protein. Members of this family are integral membrane proteins that are around 500 residues in length. LMBR1 is not involved in preaxial polydactyly, as originally thought. Vertebrate members of this family may play a role in limb development. A member of this family has been shown to be a lipocalin membrane receptor.	467
398456	pfam04792	LcrV	V antigen (LcrV) protein. Yersinia pestis, the aetiologic agent of plague, secretes a set of environmentally regulated, plasmid pCD1-encoded virulence proteins termed Yops and V antigen (LcrV) by a type III secretion mechanism. LcrV is a multifunctional protein that has been shown to act at the level of secretion control by binding the Ysc inner-gate protein LcrG and to modulate the host immune response by altering cytokine production. LcrV is also necessary for full induction of low-calcium response (LCR) stimulon virulence gene transcription. Family members are not confined to Yersinia pestis.	298
368123	pfam04793	Herpes_BBRF1	BRRF1-like protein. Family of herpesvirus proteins including Epstein-Barr virus protein BBRF1.	282
398457	pfam04794	YdjC	YdjC-like protein. Family of YdjC-like proteins. This region is possibly involved in the cleavage of cellobiose-phosphate.	193
398458	pfam04795	PAPA-1	PAPA-1-like conserved region. Family of proteins with a conserved region found in PAPA-1, a PAP-1 binding protein.	79
309782	pfam04796	RepA_C	Plasmid encoded RepA protein. Family of plasmid encoded proteins involved in plasmid replication. The role of RepA in the replication process is not clearly understood.	161
309783	pfam04797	Herpes_ORF11	Herpesvirus dUTPase protein. This family of proteins are found in Herpesvirus proteins. This family includes proteins called ORF10 and ORF11 amongst others. However, these proteins seem to be related to other dUTPases pfam00692 suggesting that these proteins are also dUTPases (Bateman A pers. obs.).	372
282632	pfam04798	Baculo_19	Baculovirus 19 kDa protein conserved region. Family of Baculovirus proteins of approximate mass 19 kDa.	143
398459	pfam04799	Fzo_mitofusin	fzo-like conserved region. Family of putative transmembrane GTPase. The fzo protein is a mediator of mitochondrial fusion. This conserved region is also found in the human mitofusin protein.	159
398460	pfam04800	ETC_C1_NDUFA4	ETC complex I subunit conserved region. Family of pankaryotic NADH-ubiquinone oxidoreductase subunits (EC:1.6.5.3) (EC:1.6.99.3) from complex I of the electron transport chain initially identified in Neurospora crassa as a 21 kDa protein.	96
398461	pfam04801	Sin_N	Sin-like protein conserved region. Family of higher eukaryotic proteins. SIN was identified as a protein that interacts specifically with SXL (sex lethal) in a yeast two-hybrid assay. The interaction is mediated by one of the SXL RNA binding domains.	426
398462	pfam04802	SMK-1	Component of IIS longevity pathway SMK-1. SMK-1 is a component of the IIS longevity pathway which regulates aging in C. elegans. Specifically, SMK-1 influences DAF-16-dependent regulation of the aging process by regulating the transcriptional specificity of DAF-16 activity. SMK-1 plays a role in longevity by modulating the transcriptional specificity of DAF-16.	191
398463	pfam04803	Cor1	Cor1/Xlr/Xmr conserved region. Cor1 is a component of the chromosome core in the meiotic prophase chromosomes. Xlr is a lymphoid cell specific protein. Xmr is abundantly transcribed in testis in a tissue-specific and developmentally regulated manner. The protein is located in the nuclei of spermatocytes, early in the prophase of the first meiotic division, and later becomes concentrated in the XY nuclear subregion where it is in particular associated with the axes of sex chromosomes.	132
282638	pfam04805	Pox_E10	E10-like protein conserved region. Family of poxvirus proteins.	69
398464	pfam04806	EspF	EspF protein repeat. The enteropathogenic Escherichia coli EspF secreted protein induces host cell apoptosis. Its proline-rich structure suggests that it may act by binding to SH3 domains or EVH1 domains of host cell signalling proteins.	47
147122	pfam04807	Gemini_AC4_5	Geminivirus AC4/5 conserved region. 	33
113574	pfam04808	CTV_P23	Citrus tristeza virus (CTV) P23 protein. This family consists of protein P23 from the citrus tristeza virus, which is a member of the Closteroviridae. CTV viruses produce more positive than negative RNA strands, and P23 controls this asymmetrical RNA accumulation. Amino acids 42-180 are essential for function and are thought to contain RNA-binding and zinc finger domains.	209
398465	pfam04809	HupH_C	HupH hydrogenase expression protein, C-terminal conserved region. This family represents a C-terminal conserved region found in these bacterial proteins necessary for hydrogenase synthesis. Their precise function is unknown.	109
398466	pfam04810	zf-Sec23_Sec24	Sec23/Sec24 zinc finger. COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is a zinc-binding domain.	38
398467	pfam04811	Sec23_trunk	Sec23/Sec24 trunk domain. COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is known as the trunk domain and has an alpha/beta vWA fold and forms the dimer interface.	241
398468	pfam04812	HNF-1B_C	Hepatocyte nuclear factor 1 (HNF-1), beta isoform C-terminus. This family consists of a region found within the alpha isoform and at the C-terminus of the beta isoform of the homeobox-containing transcription factor of HNF-1. Different isoforms of HNF-1 are generated by the differential use of polyadenylation sites and by alternative splicing. The C-terminal region of HNF-1 is responsible for the activation of transcription. Mutations and polymorphisms in HNF-1 cause the type 3 form of maturity-onset diabetes of the young (MODY3).	258
398469	pfam04813	HNF-1A_C	Hepatocyte nuclear factor 1 (HNF-1), alpha isoform C-terminus. This family consists of an alternative C-terminus of homeobox-containing transcription factor HNF-1, found in the HNF-1A isoform. Different isoforms of HNF-1 are generated by the differential use of polyadenylation sites and by alternative splicing. The C-terminal region of HNF-1 is responsible for the activation of transcription, and HNF-1A, which has this C-terminal extension, transactivates less well than the B and C isoforms. Mutations and polymorphisms in HNF-1 cause the type 3 form of maturity-onset diabetes of the young (MODY3).	89
398470	pfam04814	HNF-1_N	Hepatocyte nuclear factor 1 (HNF-1), N-terminus. This family consists of the N-terminus of homeobox-containing transcription factor HNF-1. This region contains a dimerization sequence and an acidic region that may be involved in transcription activation. Mutations and the common Ala/Val 98 polymorphism in HNF-1 cause the type 3 form of maturity-onset diabetes of the young (MODY3).	167
398471	pfam04815	Sec23_helical	Sec23/Sec24 helical domain. COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is composed of five alpha helices.	103
398472	pfam04816	TrmK	tRNA (adenine(22)-N(1))-methyltransferase. tRNA_MT is a family of bacterial tRNA (adenine(22)-N(1))-methyltransferase enzymes with a Rossmann-like fold. This enzyme carries out the function of N1-adenosine methylation at position 22 of bacterial tRNA.	205
113583	pfam04817	Umbravirus_LDM	Umbravirus long distance movement (LDM) family. The long distance movement protein of Umbraviruses mediates the movement of viral RNA through the phloem of infected plants.	231
398473	pfam04818	CTD_bind	RNA polymerase II-binding domain. This domain binds to the phosphorylated C-terminal domain (CTD) of RNA polymerase II.	120
398474	pfam04819	DUF716	Family of unknown function (DUF716). This family is equally distributed in both metazoa and plants. Annotation associated with a Nicotiana tabacum mRNA suggests that it may be involved in response to viral attack in plants. However, no clear function has been assigned to this family.	134
398475	pfam04820	Trp_halogenase	Tryptophan halogenase. Tryptophan halogenase catalyzes the chlorination of tryptophan to form 7-chlorotryptophan. This is the first step in the biosynthesis of pyrrolnitrin, an antibiotic with broad-spectrum anti-fungal activity. Tryptophan halogenase is NADH-dependent.	457
398476	pfam04821	TIMELESS	Timeless protein. The timeless gene in Drosophila melanogaster and its homologs in a number of other insects and mammals (including human) are involved in circadian rhythm control. This family includes related proteins from a number of fungal species.	271
398477	pfam04822	Takusan	Takusan. This domain is named takusan, which is a Japanese word meaning 'many'. Members of this family regulate synaptic activity.	85
398478	pfam04823	Herpes_UL49_2	Herpesvirus UL49 tegument protein. 	82
335912	pfam04824	Rad21_Rec8	Conserved region of Rad21 / Rec8 like protein. This family represents a conserved region found in eukaryotic cohesins of the Rad21, Rec8 and Scc1 families. Members of this family mediate sister chromatid cohesion during mitosis and meiosis, as part of the cohesin complex. Cohesion is necessary for homologous recombination (including double-strand break repair) and correct chromatid segregation. These proteins may also be involved in chromosome condensation. Dissociation at the metaphase to anaphase transition causes loss of cohesion and chromatid segregation.	55
398479	pfam04825	Rad21_Rec8_N	N-terminus of Rad21 / Rec8 like protein. This family represents a conserved N-terminal region found in eukaryotic cohesins of the Rad21, Rec8 and Scc1 families. Members of this family mediate sister chromatid cohesion during mitosis and meiosis, as part of the cohesin complex. Cohesion is necessary for homologous recombination (including double-strand break repair) and correct chromatid segregation. These proteins may also be involved in chromosome condensation. Dissociation at the metaphase to anaphase transition causes loss of cohesion and chromatid segregation.	104
398480	pfam04826	Arm_2	Armadillo-like. This domain contains armadillo-like repeats. Proteins containing this domain interact with numerous other proteins; through these interactions they are involved in a wide variety of processes including carcinogenesis, control of cellular ageing and survival, regulation of circadian rhythm and lysosomal sorting of G protein-coupled receptors.	252
203098	pfam04827	Plant_tran	Plant transposon protein. This family contains plant transposases which are putative members of the PIF / Ping-Pong family.	205
398481	pfam04828	GFA	Glutathione-dependent formaldehyde-activating enzyme. The GFA enzyme catalyzes the first step in the detoxification of formaldehyde. This domain has a beta-tent fold.	93
398482	pfam04829	PT-VENN	Pre-toxin domain with VENN motif. This family represents a conserved region found in many bacterial polymorphic toxins which is located before the C-terminal toxin modules.	52
377410	pfam04830	DUF637	Possible hemagglutinin (DUF637). This family represents a conserved region found in a bacterial protein which may be a hemagglutinin or hemolysin.	170
398483	pfam04831	Popeye	Popeye protein conserved region. The function of Popeye proteins is not well understood. They are predominantly expressed in cardiac and skeletal muscle. This family represents a conserved region which includes three potential transmembrane domains.	226
398484	pfam04832	SOUL	SOUL heme-binding protein. This family represents a group of putative heme-binding proteins. Our family includes archaeal and bacterial homologs.	173
398485	pfam04833	COBRA	COBRA-like protein. A family of plant proteins designated COBRA-like (COBL) proteins. The 12 Arabidopsis members of the family are all GPI-linked. Some members of this family are annotated as phytochelatin synthase, but these annotations are incorrect.	167
309809	pfam04834	Adeno_E3_14_5	Early E3 14.5 kDa protein. The E3B 14.5 kDa was first identified in Human adenovirus type 5. It is an integral membrane protein oriented with its C-terminus in the cytoplasm. It functions to down-regulate the epidermal growth factor receptor and prevent tumor necrosis factor cytolysis. It achieves this through the interaction with E3 10.4 kDa protein.	100
282664	pfam04835	Pox_A9	A9 protein conserved region. Family of Chordopoxvirus A9 proteins.	53
398486	pfam04836	IFRD_C	Interferon-related protein conserved region. Family of proteins thought to be involved in regulating gene activity in the proliferative and/or differentiative pathways induced by NGF.	52
113603	pfam04837	MbeB_N	MbeB-like, N-term conserved region. This family represents an N-terminal conserved region of MbeB/MobB proteins. These proteins are essential for specific plasmid transfer.	52
282666	pfam04838	Baculo_LEF5	Baculoviridae late expression factor 5. 	156
398487	pfam04839	PSRP-3_Ycf65	Plastid and cyanobacterial ribosomal protein (PSRP-3 / Ycf65). This small acidic protein is found in 30S ribosomal subunit of cyanobacteria and plant plastids. In plants it has been named plastid-specific ribosomal protein 3 (PSRP-3), and in cyanobacteria it is named Ycf65. Plastid-specific ribosomal proteins may mediate the effects of nuclear factors on plastid translation. The acidic PSRPs are thought to contribute to protein-protein interactions in the 30S subunit, and are not thought to bind RNA.	47
282668	pfam04840	Vps16_C	Vps16, C-terminal region. This protein forms part of the Class C vacuolar protein sorting (Vps) complex. Vps16 is essential for vacuolar protein sorting, which is essential for viability in plants, but not yeast. The Class C Vps complex is required for SNARE-mediated membrane fusion at the lysosome-like yeast vacuole. It is thought to play essential roles in membrane docking and fusion at the Golgi-to-endosome and endosome-to-vacuole stages of transport. The role of VPS16 in this complex is not known.	320
252829	pfam04841	Vps16_N	Vps16, N-terminal region. This protein forms part of the Class C vacuolar protein sorting (Vps) complex. Vps16 is essential for vacuolar protein sorting, which is essential for viability in plants, but not yeast. The Class C Vps complex is required for SNARE-mediated membrane fusion at the lysosome-like yeast vacuole. It is thought to play essential roles in membrane docking and fusion at the Golgi-to-endosome and endosome-to-vacuole stages of transport. The role of VPS16 in this complex is not known.	408
368150	pfam04842	DUF639	Plant protein of unknown function (DUF639). Plant protein of unknown function.	230
398488	pfam04843	Herpes_teg_N	Herpesvirus tegument protein, N-terminal conserved region. 	158
398489	pfam04844	Ovate	Transcriptional repressor, ovate. This is a family of transcriptional repressors. In plants, these proteins are important regulators of growth and development.	58
282672	pfam04845	PurA	PurA ssDNA and RNA-binding protein. This family represents most of the length of the protein.	219
282673	pfam04846	Herpes_pp38	Herpesvirus pp38 phosphoprotein. This protein represents a conserved region found in most herpesvirus pp38 phosphoproteins.	63
282674	pfam04847	Calcipressin	Calcipressin. Calcipressin is also known as calcineurin-binding protein, since it inhibits calcineurin-mediated transcriptional modulation by binding to calcineurin's catalytic domain.	183
282675	pfam04848	Pox_A22	Poxvirus A22 protein. 	143
398490	pfam04849	HAP1_N	HAP1 N-terminal conserved region. This family represents an N-terminal conserved region found in several huntingtin-associated protein 1 (HAP1) homologs. HAP1 binds to huntingtin in a polyglutamine repeat-length-dependent manner. However, its possible role in the pathogenesis of Huntington's disease is unclear. This family also includes a similar N-terminal conserved region from hypothetical protein products of ALS2CR3 genes found in the human juvenile amyotrophic lateral sclerosis critical region 2q33-2q34.	305
398491	pfam04850	Baculo_E66	Baculovirus E66 occlusion-derived virus envelope protein. 	387
398492	pfam04851	ResIII	Type III restriction enzyme, res subunit. 	162
398493	pfam04852	DUF640	Protein of unknown function (DUF640). This family represents a conserved region found in plant proteins including Resistance protein-like protein.	126
398494	pfam04854	DUF624	Protein of unknown function, DUF624. This family includes several uncharacterized bacterial proteins.	77
398495	pfam04855	SNF5	SNF5 / SMARCB1 / INI1. SNF5 is a component of the yeast SWI/SNF complex, which is an ATP-dependent nucleosome-remodelling complex that regulates the transcription of a subset of yeast genes. SNF5 is a key component of all SWI/SNF-class complexes characterized so far. This family consists of the conserved region of SNF5, including a direct repeat motif. SNF5 is essential for the assembly, promoter targeting and chromatin remodelling activity of the SWI-SNF complex. SNF5 is also known as SMARCB1, for SWI/SNF-related, matrix-associated, actin-dependent regulator of chromatin, subfamily b, member 1, and also INI1 for integrase interactor 1. Loss-of-function mutations in SNF5 are thought to contribute to oncogenesis in malignant rhabdoid tumors (MRTs).	178
282682	pfam04856	Securin	Securin sister-chromatid separation inhibitor. Securin is also known as pituitary tumor-transforming gene product. Over-expression of securin is associated with a number of tumors, and it has been proposed that this may be due to erroneous chromatid separation leading to chromosome gain or loss.	214
398496	pfam04857	CAF1	CAF1 family ribonuclease. The major pathways of mRNA turnover in eukaryotes initiate with shortening of the polyA tail. CAF1 encodes a critical component of the major cytoplasmic deadenylase in yeast. Caf1p is required for normal mRNA deadenylation in vivo and localizes to the cytoplasm. Caf1p copurifies with a Ccr4p-dependent polyA-specific exonuclease activity. Some members of this family include an inserted RNA binding domain pfam01424. This family of proteins is related to other exonucleases pfam00929 (Bateman A pers. obs.). The crystal structure of Saccharomyces cerevisiae Pop2 has been resolved at 2.3 Angstrom resolution.	370
398497	pfam04858	TH1	TH1 protein. TH1 is a highly conserved but uncharacterized metazoan protein. No homolog has been identified in Caenorhabditis elegans. TH1 binds specifically to A-Raf kinase.	579
398498	pfam04859	DUF641	Plant protein of unknown function (DUF641). Plant protein of unknown function.	127
398499	pfam04860	Phage_portal	Phage portal protein. Bacteriophage portal proteins form a dodecamer that is located at a five-fold vertex of the viral capsid. The portal complex forms a channel through which the viral DNA is packaged into the capsid, and exits during infection. The portal protein is thought to rotate during DNA packaging. Portal proteins from different phage show little sequence homology, so this family does not represent all portal proteins.	323
398500	pfam04862	DUF642	Protein of unknown function (DUF642). This family represents a duplicated conserved region found in a number of uncharacterized plant proteins, potentially in the stem. There is a conserved CGP sequence motif.	157
398501	pfam04863	EGF_alliinase	Alliinase EGF-like domain. Allicin is a thiosulphinate that gives rise to dithiines, allyl sulphides and ajoenes, the three groups of active compounds in Allium species. Allicin is synthesized from sulfoxide cysteine derivatives by alliinase (EC:4.4.1.4), whose C-S lyase activity cleaves C(beta)-S(gamma) bonds. It is thought that this enzyme forms part of a primitive plant defense system. This family represents the N-terminal EGF-like domain.	56
398502	pfam04864	Alliinase_C	Alliinase. Allicin is a thiosulphinate that gives rise to dithiines, allyl sulphides and ajoenes, the three groups of active compounds in Allium species. Allicin is synthesized from sulfoxide cysteine derivatives by alliinase (EC:4.4.1.4), whose C-S lyase activity cleaves C(beta)-S(gamma) bonds. It is thought that this enzyme forms part of a primitive plant defense system.	363
398503	pfam04865	Baseplate_J	Baseplate J-like protein. The P2 bacteriophage J protein lies at the edge of the baseplate. This family also includes a number of bacterial homologs, which are thought to have been horizontally transferred.	253
282690	pfam04866	Rota_NS6	Rotavirus non-structural protein 6. 	92
282691	pfam04867	DUF643	Protein of unknown function (DUF643). Protein of unknown function found in Borrelia burgdorferi, the Lyme disease spirochete.	114
398504	pfam04868	PDE6_gamma	Retinal cGMP phosphodiesterase, gamma subunit. Retinal rod and cone cGMP phosphodiesterases function as the effector enzymes in the vertebrate visual transduction cascade. This family represents the inhibitory gamma subunit, which is also expressed outside retinal tissues and has been shown to interact with the G-protein-coupled receptor kinase 2 signalling system to regulate the epidermal growth factor- and thrombin-dependent stimulation of p42/p44 mitogen-activated protein kinase in human embryonic kidney 293 cells.	82
398505	pfam04869	Uso1_p115_head	Uso1 / p115 like vesicle tethering protein, head region. Also known as General vesicular transport factor, Transcytosis associated protein (TAP) and Vesicle docking protein, this myosin-shaped molecule consists of an N-terminal globular head region, a coiled-coil tail which mediates dimerization, and a short C-terminal acidic region. p115 tethers COP1 vesicles to the Golgi by binding the coiled coil proteins giantin (on the vesicles) and GM130 (on the Golgi), via its C-terminal acidic region. It is required for intercisternal transport in the Golgi stack. This family consists of part of the head region. The head region is highly conserved, but its function is unknown. It does not seem to be essential for vesicle tethering. The N-terminal part of the head region, not within this family, contains context-detected Armadillo/beta-catenin-like repeats (pfam00514).	311
398506	pfam04870	Moulting_cycle	Moulting cycle. This family of proteins plays a role in the moulting cycle of nematodes, which involves the synthesis of a new collagen-rich cuticle underneath the existing cuticle and the subsequent removal of the old cuticle.	343
398507	pfam04871	Uso1_p115_C	Uso1 / p115 like vesicle tethering protein, C terminal region. Also known as General vesicular transport factor, Transcytosis associated protein (TAP) and Vesicle docking protein, this myosin-shaped molecule consists of an N-terminal globular head region, a coiled-coil tail which mediates dimerization, and a short C-terminal acidic region. p115 tethers COP1 vesicles to the Golgi by binding the coiled coil proteins giantin (on the vesicles) and GM130 (on the Golgi), via its C-terminal acidic region. It is required for intercisternal transport in the Golgi stack. This family consists of the acidic C-terminus, which binds to the golgins giantin and GM130. p115 is thought to juxtapose two membranes by binding giantin with one acidic region, and GM130 with another.	130
282696	pfam04872	Pox_L5	Poxvirus L5 protein family. This family includes variola (smallpox) and vaccinia virus L5 proteins. However, not all proteins in this family are called L5. L5 is thought to contain a metal-binding region.	79
398508	pfam04873	EIN3	Ethylene insensitive 3. Ethylene insensitive 3 (EIN3) proteins are a family of plant DNA-binding proteins that regulate transcription in response to the gaseous plant hormone ethylene, and are essential for ethylene-mediated responses including the triple response, cell growth inhibition, and accelerated senescence.	252
398509	pfam04874	Mak16	Mak16 protein C-terminal region. The precise function of this eukaryotic protein family is unknown. The yeast orthologues have been implicated in cell cycle progression and biogenesis of 60S ribosomal subunits. The Schistosoma mansoni Mak16 has been shown to target protein transport to the nucleolus.	99
282699	pfam04875	DUF645	Protein of unknown function, DUF645. This family includes several uncharacterized proteins from Vibrio cholerae. There is some doubt regarding the existence of these proteins, as they are encoded by open reading frames contained within a repeated region in the Vibrio superintegron.	59
282700	pfam04876	Tenui_NCP	Tenuivirus major non-capsid protein. This protein of unknown function accumulates in large amounts in tenuivirus infected cells. It is found in all forms of the inclusion bodies that are formed after infection.	173
368170	pfam04877	Hairpins	HrpZ. HrpZ from the plant pathogen Pseudomonas syringae binds to lipid bilayers and forms a cation-conducting pore in vivo. This pore-forming activity may allow nutrient release or delivery of virulence factors during bacterial colonisation of host plants. The family of hairpinN proteins, Harpin, has been merged into this family. HrpN is a virulence determinant which elicits lesion formation in Arabidopsis and tobacco and triggers systemic resistance in Arabidopsis.	277
282702	pfam04878	Baculo_p48	Baculovirus P48 protein. 	370
398510	pfam04879	Molybdop_Fe4S4	Molybdopterin oxidoreductase Fe4S4 domain. This domain is found in formate dehydrogenase H for which the structure is known. This first domain (residues 1 to 60) of Structure 1aa6 is an Fe4S4 cluster just below the protein surface.	55
398511	pfam04880	NUDE_C	NUDE protein, C-terminal conserved region. This family represents the C-terminal conserved region of the NUDE proteins. NUDE proteins are involved in nuclear migration.	169
282705	pfam04881	Adeno_GP19K	Adenovirus GP19K. This 19 kDa glycoprotein binds the major histocompatibility (MHC) class I antigens in the endoplasmic reticulum (ER). The ER retention signal at the C-terminus of GP19K causes retention of the complex in the ER, preventing lysis of the cell by cytotoxic T lymphocytes.	132
398512	pfam04882	Peroxin-3	Peroxin-3. Peroxin-3 is a peroxisomal protein. It is thought to be involved in membrane vesicle assembly prior to the translocation of matrix proteins.	453
398513	pfam04883	HK97-gp10_like	Bacteriophage HK97-gp10, putative tail-component. This family of proteins is found in the caudovirales. It may be a tail component.	80
398514	pfam04884	DUF647	Vitamin B6 photo-protection and homoeostasis. In plants, this domain plays a role in auxin-transport, plant growth and development and appears to be expressed by all cells in the plant as well as in plastids. The family has been shown to play a role in vitamin B6 photo-protection and homoeostasis in plants.	240
398515	pfam04885	Stig1	Stigma-specific protein, Stig1. This family represents the Stig1 cysteine rich plant protein. The STIG1 gene is developmentally regulated and expressed specifically in the stigmatic secretory zone.	134
282710	pfam04886	PT	PT repeat. This short repeat is composed of the tetrapeptide XPTX. This repeat is found in a variety of proteins, however it is not clear if these repeats are homologous to each other. The alignment represents nine copies of this repeat.	36
282711	pfam04887	Pox_M2	Poxvirus M2 protein. This family includes M2 protein from variola virus. The function of this protein is not known.	196
398516	pfam04888	SseC	Secretion system effector C (SseC) like family. SseC is a secreted protein that forms a complex together with SecB and SecD on the surface of Salmonella. All these proteins are secreted by the type III secretion system. Many mucosal pathogens use type III secretion systems for the injection of effector proteins into target cells. SecB, SseC and SecD are inserted into the target cell membrane, where they form a small pore or translocon. In addition to SseC, this family includes the bacterial secreted proteins PopB, PepB, YopB and EspD which are thought to be directly involved in pore formation, and type III secretion system translocon.	312
398517	pfam04889	Cwf_Cwc_15	Cwf15/Cwc15 cell cycle control protein. This family represents Cwf15/Cwc15 (from Schizosaccharomyces pombe and Saccharomyces cerevisiae respectively) and their homologs. The function of these proteins is unknown, but they form part of the spliceosome and are thus thought to be involved in mRNA splicing.	243
309840	pfam04890	DUF648	Family of unknown function (DUF648). Family of hypothetical Chlamydia proteins. This family may well comprise two domains, as some members only match the N-terminus.	289
398518	pfam04891	NifQ	NifQ. NifQ is involved in early stages of the biosynthesis of the iron-molybdenum cofactor (FeMo-co), which is an integral part of the active site of dinitrogenase. The conserved C-terminal cysteine residues may be involved in metal binding.	159
398519	pfam04892	VanZ	VanZ like family. This family contains several examples of the VanZ protein, but also contains examples of phosphotransbutyrylases.	131
398520	pfam04893	Yip1	Yip1 domain. The Yip1 integral membrane domain contains four transmembrane alpha helices. The domain is characterized by the motifs DLYGP and GY. The Yip1 protein is a Golgi protein involved in vesicular transport that interacts with GTPases.	173
398521	pfam04894	Nre_N	Archaeal Nre, N-terminal. This conserved region is found in the N-terminal region of archaeal Nre proteins. While most archaeal organisms encode only a single Nre protein, some encode two, NreA and NreB.	270
398522	pfam04895	Nre_C	Archaeal Nre, C-terminal. This conserved region is found in the C-terminal region of archaeal Nre proteins. While most archaeal organisms encode only a single Nre protein, some encode two, NreA and NreB.	110
398523	pfam04896	AmoC	Ammonia monooxygenase/methane monooxygenase, subunit C. Ammonia monooxygenase plays a key role in the nitrogen cycle and degrades a wide range of hydrocarbons and halogenated hydrocarbons. This family represents the AmoC subunit. It also includes the particulate methane monooxygenase subunit PmoC from methanotrophic bacteria.	245
398524	pfam04898	Glu_syn_central	Glutamate synthase central domain. The central domain of glutamate synthase connects the amino terminal amidotransferase domain with the FMN-binding domain and has an alpha / beta overall topology. This domain appears to be a rudimentary form of the FMN-binding TIM barrel according to SCOP.	281
113664	pfam04899	MbeD_MobD	MbeD/MobD like. The MbeD and MobD proteins are plasmid encoded, and are involved in the plasmid's mobilisation and transfer in the presence of conjugative plasmids.	70
398525	pfam04900	Fcf1	Fcf1. Fcf1 is a nucleolar protein involved in pre-rRNA processing. Depletion of yeast Fcf1 and Fcf2 leads to a decrease in synthesis of the 18S rRNA and results in a deficit in 40S ribosomal subunits.	99
398526	pfam04901	RAMP	Receptor activity modifying family. The calcitonin-receptor-like receptor can function as either a calcitonin-gene-related peptide or an adrenomedullin receptor. The receptor's function is modified by receptor-activity-modifying protein or RAMP. RAMPs are single-transmembrane-domain proteins.	108
398527	pfam04902	Nab1	Conserved region in Nab1. Nab1 and Nab2 are co-repressors that specifically interact with and repress transcription mediated by the three members of the NGFI-A (Egr-1, Krox24, zif/268) family of transcription factors. This C-terminal region is found only in the Nab1 subfamily.	190
398528	pfam04904	NCD1	NAB conserved region 1 (NCD1). Nab1 and Nab2 are co-repressors that specifically interact with and repress transcription mediated by the three members of the NGFI-A (Egr-1, Krox24, zif/268) family of transcription factors. This region consists of the N-terminal NAB conserved region 1, which interacts with the EGR1 inhibitory domain (R1). It may also mediate multimerisation.	79
398529	pfam04905	NCD2	NAB conserved region 2 (NCD2). Nab1 and Nab2 are co-repressors that specifically interact with and repress transcription mediated by the three members of the NGFI-A (Egr-1, Krox24, zif/268) family of transcription factors. This family consists of NAB conserved region 2, near the C-terminus of the protein. It is necessary for transcriptional repression by the Nab proteins. It is also required for transcription activation by Nab proteins at Nab-activated promoters.	123
368183	pfam04906	Tweety	Tweety. The tweety (tty) gene has not been characterized at the protein level. However, it is thought to form a membrane protein with five potential membrane-spanning regions. A number of potential functions have been suggested.	406
398530	pfam04908	SH3BGR	SH3-binding, glutamic acid-rich protein. 	92
398531	pfam04909	Amidohydro_2	Amidohydrolase. These proteins are amidohydrolases that are related to pfam01979.	285
398532	pfam04910	Tcf25	Transcriptional repressor TCF25. Members of this family are transcriptional repressors. They may act by increasing histone deacetylase activity at promoter regions.	321
368187	pfam04911	ATP-synt_J	ATP synthase j chain. 	51
398533	pfam04912	Dynamitin	Dynamitin. Dynamitin is a subunit of the microtubule-dependent motor complex and is implicated in cell adhesion by binding to macrophage-enriched myristoylated alanine-rich C kinase substrate (MacMARCKS).	393
282731	pfam04913	Baculo_Y142	Baculovirus Y142 protein. This domain family is found in Baculovirus proteins including protein AC142, which is expressed in the cytoplasm and nucleus throughout infection. It is required for nucleocapsid envelopment in the budding virus to form the occlusion-derived virus and subsequent embedding of virions into polyhedra.	440
398534	pfam04914	DltD	DltD protein. DltD is an integral membrane protein involved in the biosynthesis of D-alanyl-lipoteichoic acid. This is important in controlling the net ionic charge in lipoteichoic acid (LTA). This family is found in bacteria of the Bacillus/Clostridium group. DltD binds Dcp and ligates it with D-alanine. DltD does not ligate acyl carrier protein (ACP) with D-alanine. It also has thioesterase activity for mischarged D-alanyl-acyl carrier protein (ACP). DltD is thought to be responsible for discriminating between Dcp involved in the D-alanylation of LTA, and ACP involved in fatty acid biosynthesis.	349
398535	pfam04916	Phospholip_B	Phospholipase B. Phospholipase B (PLB) catalyzes the hydrolytic cleavage of both acylester bonds of glycerophospholipids. This family of PLB enzymes has been identified in mammals, flies and nematodes but not in yeast. In Drosophila this protein was named LAMA for laminin ancestor since it is expressed in the neuronal and glial precursors that surround the lamina.	536
309860	pfam04917	Shufflon_N	Bacterial shufflon protein, N-terminal constant region. This family represents the high-similarity N-terminal 'constant region' shared by shufflon proteins.	324
282736	pfam04919	DUF655	Protein of unknown function (DUF655). This family includes several uncharacterized archaeal proteins. This protein appears to contain two HHH motifs.	181
282737	pfam04920	DUF656	Family of unknown function (DUF656). A family of hypothetical proteins from Beet necrotic yellow vein virus.	126
398536	pfam04921	XAP5	XAP5, circadian clock regulator. This protein is found in a wide range of eukaryotes. It is a nuclear protein and is suggested to be DNA binding. In plants, this family is essential for correct circadian clock functioning by acting as a light-quality regulator coordinating the activities of blue and red light signalling pathways during plant growth - inhibiting growth in red light but promoting growth in blue light.	238
398537	pfam04922	DIE2_ALG10	DIE2/ALG10 family. The ALG10 protein from Saccharomyces cerevisiae encodes the alpha-1,2 glucosyltransferase of the endoplasmic reticulum. This protein has been characterized in rat as potassium channel regulator 1.	383
398538	pfam04923	Ninjurin	Ninjurin. Ninjurin (nerve injury-induced protein) is involved in nerve regeneration and in the formation and function of some tissues.	101
282741	pfam04924	Pox_A6	Poxvirus A6 protein. 	370
398539	pfam04925	SHQ1	SHQ1 protein. S. cerevisiae SHQ1 protein is required for SnoRNAs of the box H/ACA Quantitative accumulation (unpublished).	176
398540	pfam04926	PAP_RNA-bind	Poly(A) polymerase predicted RNA binding domain. Based on its structural similarity to the RNA recognition motif, this domain is thought to be RNA binding.	177
398541	pfam04927	SMP	Seed maturation protein. Plant seed maturation protein.	59
398542	pfam04928	PAP_central	Poly(A) polymerase central domain. The central domain of Poly(A) polymerase shares structural similarity with the allosteric activity domain of ribonucleotide reductase R1, which comprises a four-helix bundle and a three-stranded mixed beta-sheet. Even though the two enzymes bind ATP, the ATP-recognition motifs are different.	344
282746	pfam04929	Herpes_DNAp_acc	Herpes DNA replication accessory factor. Replicative DNA polymerases are capable of polymerising tens of thousands of nucleotides without dissociating from their DNA templates. The high processivity of these polymerases is dependent upon accessory proteins that bind to the catalytic subunit of the polymerase or to the substrate. The Epstein-Barr virus (EBV) BMRF1 protein is an essential component of the viral DNA polymerase and is absolutely required for lytic virus replication. BMRF1 is also a transactivator. This family is predicted to have a UL42 like structure.	400
398543	pfam04930	FUN14	FUN14 family. This family of short proteins are found in eukaryotes and some archaea. Although the function of these proteins is not known they may contain transmembrane helices.	93
398544	pfam04931	DNA_pol_phi	DNA polymerase phi. This family includes the fifth essential DNA polymerase in yeast EC:2.7.7.7. Pol5p is localized exclusively to the nucleolus and binds near or at the enhancer region of rRNA-encoding DNA repeating units.	765
398545	pfam04932	Wzy_C	O-Antigen ligase. This group of bacterial proteins is involved in the synthesis of O-antigen, a lipopolysaccharide found in the outer membrane in gram-negative bacteria. This family includes O-antigen ligases such as E. coli RfaL.	149
398546	pfam04934	Med6	MED6 mediator sub complex component. Component of RNA polymerase II holoenzyme and mediator sub complex.	132
398547	pfam04935	SURF6	Surfeit locus protein 6. The surfeit locus protein SURF-6 is shown to be a component of the nucleolar matrix and has a strong binding capacity for nucleic acids.	197
282752	pfam04936	DUF658	Protein of unknown function (DUF658). Protein of unknown function found in Lactococcus lactis bacteriophages.	186
398548	pfam04937	DUF659	Protein of unknown function (DUF 659). Transposase-like protein with no known function.	152
398549	pfam04938	SIP1	Survival motor neuron (SMN) interacting protein 1 (SIP1). Survival motor neuron (SMN) interacting protein 1 (SIP1) interacts with SMN protein and plays a crucial role in the biogenesis of spliceosomes. There is evidence that the protein is linked to spinal muscular atrophy (SMA) and amyotrophic lateral sclerosis (ALS) in humans.	212
398550	pfam04939	RRS1	Ribosome biogenesis regulatory protein (RRS1). This family consists of several eukaryotic ribosome biogenesis regulatory (RRS1) proteins. RRS1 is a nuclear protein that is essential for the maturation of 25 S rRNA and the 60 S ribosomal subunit assembly in Saccharomyces cerevisiae.	161
398551	pfam04940	BLUF	Sensors of blue-light using FAD. The BLUF domain has been shown to bind FAD in the AppA protein. AppA is involved in the repression of photosynthesis genes in response to blue-light.	91
282757	pfam04941	LEF-8	Late expression factor 8 (LEF-8). Late expression factor 8 (LEF-8) is one of the primary components of RNA polymerase produced by polyhedrosis viruses. LEF-8 shows homology to the second largest subunit of prokaryotic DNA-directed RNA polymerase.	730
368205	pfam04942	CC	CC domain. This short domain contains four conserved cysteines that probably form two disulphide bonds. The domain is named after the characteristic CC motif.	34
282759	pfam04943	Pox_F11	Poxvirus F11 protein. The protein F11 is an early virus protein.	409
398552	pfam04945	YHS	YHS domain. This short presumed domain is about 50 amino acid residues long. It often contains two cysteines that may be functionally important. This domain is found in copper transporting ATPases, some phenol hydroxylases and in a set of uncharacterized membrane proteins. This domain is named after three of the most conserved amino acids it contains. The domain may be metal binding, possibly copper ions. This domain is duplicated in some copper transporting ATPases.	47
282761	pfam04947	Pox_VLTF3	Poxvirus Late Transcription Factor VLTF3 like. Members of this family are approximately 26 kDa and are involved in trans-activation of late transcription.	168
282762	pfam04948	Pox_A51	Poxvirus A51 protein. 	337
398553	pfam04949	Transcrip_act	Transcriptional activator. This family of proteins may act as a transcriptional activator. It plays a role in stress response in plants.	154
398554	pfam04950	RIBIOP_C	40S ribosome biogenesis protein Tsr1 and BMS1 C-terminal. RIBIOP_C is a family of eukaryotic proteins from the C-terminus of pre-rRNA-processing protein or ribosome biogenesis proteins BMS1 and TSR1. These proteins act, in the nucleolus, as a molecular switch during maturation of the 40S ribosomal subunit. This domain, domain IV of translation elongation factor selb, adopts the same fold as translation proteins such as domain II of GTP-elongation factor Tu proteins.	289
398555	pfam04951	Peptidase_M55	D-aminopeptidase. Bacillus subtilis DppA is a binuclear zinc-dependent, D-specific aminopeptidase. The structure reveals that DppA is a new example of a 'self-compartmentalising protease', a family of proteolytic complexes. Proteasomes are the most extensively studied representatives of this family. The DppA enzyme is composed of identical 30 kDa subunits organized in a decamer with 52 point-group symmetry. A 20 A wide channel runs through the complex, giving access to a central chamber holding the active sites. The structure shows DppA to be a prototype of a new family of metalloaminopeptidases characterized by the SXDXEG key sequence. The only known substrates are D-ala-D-ala and D-ala-gly-gly.	263
398556	pfam04952	AstE_AspA	Succinylglutamate desuccinylase / Aspartoacylase family. This family includes Succinylglutamate desuccinylase EC:3.1.-.- that catalyzes the fifth and last step in arginine catabolism by the arginine succinyltransferase pathway. The family also includes aspartoacylase EC:3.5.1.15 which cleaves acylaspartate into a fatty acid and aspartate. Mutations in human ASPA lead to Canavan disease. This family is probably structurally related to pfam00246 (Bateman A pers. obs.).	287
398557	pfam04954	SIP	Siderophore-interacting protein. 	119
398558	pfam04955	HupE_UreJ	HupE / UreJ protein. This family of proteins are hydrogenase / urease accessory proteins. The alignment contains many conserved histidines that are likely to be involved in nickel binding. The members usually have five membrane-spanning regions.	179
398559	pfam04956	TrbC	TrbC/VIRB2 family. Conjugal transfer protein, TrbC has been identified as a subunit of the pilus precursor in bacteria. The protein undergoes three processing steps before gaining its mature cyclic structure. This family also contains several VIRB2 type IV secretion proteins. The virB2 gene encodes a putative type IV secretion system and is known to be a pathogenicity factor in Bartonella species.	98
398560	pfam04957	RMF	Ribosome modulation factor. This protein associates with 70S ribosomes and converts them to a dimeric form (100S ribosomes) which appear during the transition from the exponential growth phase to the stationary phase of Escherichia coli cells.	51
398561	pfam04958	AstA	Arginine N-succinyltransferase beta subunit. Arginine N-succinyltransferase EC:2.3.1.109 catalyzes the transfer of succinyl-CoA to arginine to produce succinyl-arginine. This is the first step in arginine catabolism by the arginine succinyltransferase pathway.	335
398562	pfam04959	ARS2	Arsenite-resistance protein 2. Arsenite is a carcinogenic compound which can act as a co-mutagen by inhibiting DNA repair. Arsenite-resistance protein 2 is thought to play a role in arsenite resistance.	195
398563	pfam04960	Glutaminase	Glutaminase. This family of enzymes deaminates glutamine to glutamate EC:3.5.1.2.	283
398564	pfam04961	FTCD_C	Formiminotransferase-cyclodeaminase. Members of this family are thought to be Formiminotransferase-cyclodeaminase enzymes EC:4.3.1.4. This domain is found in the C-terminus of the bifunctional animal members of the family.	181
398565	pfam04962	KduI	KduI/IolB family. This family includes the 5-keto 4-deoxyuronate isomerase enzyme EC:5.3.1.17 that is involved in pectin degradation. This family also includes bacterial Myo-inositol catabolism (IolB) proteins. The Bacillus subtilis inositol operon (iolABCDEFGHIJ) is involved in myo-inositol catabolism. Glucose repression of the iol operon induced by inositol is exerted through catabolite repression mediated by CcpA and the iol induction system mediated by IolR. The exact function of IolB is unknown. Members of this family possess a Cupin like structure.	260
398566	pfam04963	Sigma54_CBD	Sigma-54 factor, core binding domain. This domain makes a direct interaction with the core RNA polymerase, to form an enhancer dependent holoenzyme. The centre of this domain contains a very weak similarity to a helix-turn-helix motif which may represent the other DNA binding domain.	182
398567	pfam04964	Flp_Fap	Flp/Fap pilin component. 	46
398568	pfam04965	GPW_gp25	Gene 25-like lysozyme. This family includes the phage protein Gene 25 from T4 which is a structural component of the outer wedge of the baseplate that has acidic lysozyme activity. The family also includes relatives from bacteria that are also presumably lysozymes.	93
335962	pfam04966	OprB	Carbohydrate-selective porin, OprB family. 	373
282780	pfam04967	HTH_10	HTH DNA binding domain. 	53
398569	pfam04968	CHORD	CHORD. CHORD represents a Zn binding domain. Silencing of the C. elegans CHORD-containing gene results in semisterility and embryo lethality, suggesting an essential function of the wild-type gene in nematode development.	62
398570	pfam04969	CS	CS domain. The CS and CHORD (pfam04968) are fused into a single polypeptide chain in metazoans but are found in separate proteins in plants; this is thought to be indicative of an interaction between CS and CHORD. It has been suggested that the CS domain is a binding module for HSP90, implying that CS domain-containing proteins are involved in recruiting heat shock proteins to multiprotein assemblies. Two CS domains are found at the N-terminus of Ubiquitin carboxyl-terminal hydrolase 19 (USP19), these domains may play a role in the interaction of USP19 with cellular inhibitor of apoptosis 2.	76
398571	pfam04970	LRAT	Lecithin retinol acyltransferase. The full-length members of this family, are representatives of a novel class II tumor-suppressor family, designated as H-REV107-like. This domain is the catalytic N-terminal proline-rich region of the protein. The downstream region is a putative C-terminal transmembrane domain which is found to be crucial for cellular localization, but not necessary for the enzyme activity. H-REV107-like proteins are homologous to lecithin retinol acyltransferase (LRAT), an enzyme that catalyzes the transfer of the sn-1 acyl group of phosphatidylcholine to all-trans-retinol and forming a retinyl ester.	106
398572	pfam04971	Phage_holin_2_1	Bacteriophage P21 holin S. Phage_holin_2_1 is a family of small hydrophobic holin proteins with one or more transmembrane domains. Members of this family fall into the holin superfamily II, and Phage 21 S holin is the prototype for this superfamily. It has two transmembrane segments with both the N- and C-termini on the cytoplasmic side of the inner membrane in E. coli. Holins are a diverse family of proteins that cause bacterial membrane lysis during late-protein synthesis. It is thought that the temporal precision of holin-mediated lysis may occur through the build up of a holin oligomer which causes the lysis.	64
398573	pfam04972	BON	BON domain. This domain is found in a family of osmotic shock protection proteins. It is also found in some Secretins and a group of potential haemolysins. Its likely function is attachment to phospholipid membranes.	69
398574	pfam04973	NMN_transporter	Nicotinamide mononucleotide transporter. Members of this family are integral membrane proteins that are involved in transport of nicotinamide mononucleotide.	176
398575	pfam04976	DmsC	DMSO reductase anchor subunit (DmsC). The terminal electron transfer enzyme Me2SO reductase of Escherichia coli is a heterotrimeric enzyme composed of a membrane extrinsic catalytic dimer (DmsAB) and a membrane intrinsic polytopic anchor subunit (DmsC).	275
398576	pfam04977	DivIC	Septum formation initiator. DivIC from B. subtilis is necessary for both vegetative and sporulation septum formation. These proteins are mainly composed of an amino terminal coiled-coil.	69
398577	pfam04978	DUF664	Protein of unknown function (DUF664). This family is commonly found in Streptomyces coelicolor and is of unknown function. These proteins contain several conserved histidines at their N-terminus that may form a metal binding site.	149
398578	pfam04979	IPP-2	Protein phosphatase inhibitor 2 (IPP-2). Protein phosphatase inhibitor 2 (IPP-2) is a phosphoprotein conserved among all eukaryotes, and it appears in both the nucleus and cytoplasm of tissue culture cells.	130
398579	pfam04981	NMD3	NMD3 family. The NMD3 protein is involved in nonsense mediated mRNA decay. This amino terminal region contains four conserved CXXC motifs that could be metal binding. Export of the 60S ribosomal subunit is mediated by the adapter protein Nmd3p in a Crm1p-dependent pathway.	242
398580	pfam04982	HPP	HPP family. These proteins are integral membrane proteins with four transmembrane spanning helices. The most conserved region of the alignment is a motif HPP. The function of these proteins is uncertain but they may be transporters.	122
398581	pfam04983	RNA_pol_Rpb1_3	RNA polymerase Rpb1, domain 3. RNA polymerases catalyze the DNA dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 3, represents the pore domain. The 3' end of RNA is positioned close to this domain. The pore delimited by this domain is thought to act as a channel through which nucleotides enter the active site and/or where the 3' end of the RNA may be extruded during back-tracking.	158
398582	pfam04984	Phage_sheath_1	Phage tail sheath protein subtilisin-like domain. This entry represents the second domain in a variety of phage tail sheath proteins. According to ECOD this domain has a subtilisin-like structure.	157
398583	pfam04985	Phage_tube	Phage tail tube protein FII. The major structural components of the contractile tail of bacteriophage P2 are proteins FI and FII, which are believed to be the tail sheath and tube proteins, respectively.	162
398584	pfam04986	Y2_Tnp	Putative transposase. Transposases are needed for efficient transposition of the insertion sequence or transposon DNA. This family includes transposases IS1294 and IS801. This is a rolling-circle transposase.	183
398585	pfam04987	PigN	Phosphatidylinositolglycan class N (PIG-N). Phosphatidylinositolglycan class N (PIG-N) is a mammalian homolog of the yeast protein MCD4P and is expressed in the endoplasmic reticulum. PIG-N is essential for glycosylphosphatidylinositol anchor synthesis. Glycosylphosphatidylinositol (GPI)-anchored proteins are cell surface-localized proteins that serve many important cellular functions.	455
398586	pfam04988	AKAP95	A-kinase anchoring protein 95 (AKAP95). A-kinase (or PKA)-anchoring protein AKAP95 is implicated in mitotic chromosome condensation by acting as a targeting molecule for the condensin complex. The protein contains two zinc fingers which are thought to mediate the binding of AKAP95 to DNA.	134
398587	pfam04989	CmcI	Cephalosporin hydroxylase. Members of this family are about 220 amino acids long. The CmcI protein is presumed to represent the cephalosporin-7--hydroxylase. However this has not been experimentally verified.	206
398588	pfam04990	RNA_pol_Rpb1_7	RNA polymerase Rpb1, domain 7. RNA polymerases catalyze the DNA dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 7, represents a mobile module of the RNA polymerase. Domain 7 forms a substantial interaction with the lobe domain of Rpb2 (pfam04561).	136
398589	pfam04991	LicD	LicD family. The LICD family of proteins show high sequence similarity and are involved in phosphorylcholine metabolism. There is evidence to show that LicD2 mutants have a reduced ability to take up choline, have decreased ability to adhere to host cells and are less virulent. These proteins are part of the nucleotidyltransferase superfamily.	224
398590	pfam04992	RNA_pol_Rpb1_6	RNA polymerase Rpb1, domain 6. RNA polymerases catalyze the DNA dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 6, represents a mobile module of the RNA polymerase. Domain 6 forms part of the shelf module. This family appears to be specific to the largest subunit of RNA polymerase II.	188
398591	pfam04993	TfoX_N	TfoX N-terminal domain. TfoX may play a key role in the development of genetic competence by regulating the expression of late competence-specific genes. This family corresponds to the N-terminal presumed domain of TfoX. The domain is found as an isolated domain in some proteins suggesting this is an autonomous domain.	91
398592	pfam04994	TfoX_C	TfoX C-terminal domain. TfoX may play a key role in the development of genetic competence by regulating the expression of late competence-specific genes. This family corresponds to the C-terminal presumed domain of TfoX. The domain is found associated with pfam00383 in Neisseria meningitidis TadA. It is also found as an isolated domain in some proteins suggesting this is an autonomous domain.	81
398593	pfam04995	CcmD	Heme exporter protein D (CcmD). The CcmD protein is part of a C-type cytochrome biogenesis operon. The exact function of this protein is uncertain. It has been proposed that CcmC, CcmD and CcmE interact directly with each other, establishing a cytoplasm to periplasm haem delivery pathway for cytochrome c maturation. These proteins contain a predicted transmembrane helix.	44
398594	pfam04996	AstB	Succinylarginine dihydrolase. This enzyme transforms N(2)-succinylglutamate into succinate and glutamate. This is the fifth and last step in arginine catabolism by the arginine succinyltransferase pathway.	442
398595	pfam04997	RNA_pol_Rpb1_1	RNA polymerase Rpb1, domain 1. RNA polymerases catalyze the DNA dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 1, represents the clamp domain, which is a mobile domain involved in positioning the DNA, maintenance of the transcription bubble and positioning of the nascent RNA strand.	320
398596	pfam04998	RNA_pol_Rpb1_5	RNA polymerase Rpb1, domain 5. RNA polymerases catalyze the DNA dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 5, represents the discontinuous cleft domain that is required to form the central cleft or channel where the DNA is bound.	516
398597	pfam04999	FtsL	Cell division protein FtsL. In Escherichia coli, nine gene products are known to be essential for assembly of the division septum. One of these, FtsL, is a bitopic membrane protein whose precise function is not understood. It has been proposed that FtsL interacts with the DivIC protein pfam04977, however this interaction may be indirect.	97
398598	pfam05000	RNA_pol_Rpb1_4	RNA polymerase Rpb1, domain 4. RNA polymerases catalyze the DNA dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 4, represents the funnel domain. The funnel contains the binding site for some elongation factors.	108
398599	pfam05001	RNA_pol_Rpb1_R	RNA polymerase Rpb1 C-terminal repeat. The repetitive C-terminal domain (CTD) of Rpb1 (RNA polymerase Pol II) plays a critical role in the regulation of gene expression. The activity of the CTD is dependent on its state of phosphorylation.	12
398600	pfam05002	SGS	SGS domain. This domain was thought to be unique to the SGT1-like proteins, but is also found in calcyclin binding proteins.	82
398601	pfam05003	DUF668	Protein of unknown function (DUF668). Uncharacterized plant protein.	88
398602	pfam05004	IFRD	Interferon-related developmental regulator (IFRD). Interferon-related developmental regulator (IFRD1) is the human homolog of the rat early response protein PC4 and its murine homolog TIS7. The exact function of IFRD1 is unknown but it has been shown that PC4 is necessary to muscle differentiation and that it might have a role in signal transduction. This family also contains IFRD2 and its murine equivalent SKMc15 which are highly expressed soon after gastrulation and in the hepatic primordium, suggesting an involvement in early hematopoiesis.	314
398603	pfam05005	Ocnus	Janus/Ocnus family (Ocnus). This family is comprised of the Ocnus, Janus-A and Janus-B proteins. These proteins have been found to be testes specific in Drosophila melanogaster.	102
368236	pfam05006	PIF3	Per os infectivity factor 3. This family contains viral proteins and includes Baculovirus Per os infectivity factor 3 (PIF3). PIF3 forms a complex on the occlusion-derived virus surface with PIF1, PIF2, and P74 which has an essential function in the initial stages of baculovirus oral infection.	149
252941	pfam05007	Mannosyl_trans	Mannosyltransferase (PIG-M). PIG-M has a DXD motif. The DXD motif is found in many glycosyltransferases that utilize nucleotide sugars. It is thought that the motif is involved in the binding of a manganese ion that is required for association of the enzymes with nucleotide sugar substrates.	259
398604	pfam05008	V-SNARE	Vesicle transport v-SNARE protein N-terminus. V-SNARE proteins are required for protein traffic between eukaryotic organelles. The v-SNAREs on transport vesicles interact with t-SNAREs on target membranes in order to facilitate this. This domain is the N-terminal half of the V-Snare proteins.	78
282817	pfam05009	EBV-NA3	Epstein-Barr virus nuclear antigen 3 (EBNA-3). This family contains EBNA-3A, -3B, and -3C which are latent infection nuclear proteins important for Epstein-Barr virus (EBV)-induced B-cell immortalisation and the immune response to EBV infection.	254
398605	pfam05010	TACC	Transforming acidic coiled-coil-containing protein (TACC). This family contains the proteins TACC 1, 2 and 3 the genes for which are found concentrated in the centrosomes of eukaryotes and may play a conserved role in organising centrosomal microtubules. The human TACC proteins have been linked to cancer and TACC2 has been identified as a possible tumor suppressor (AZU-1). The functional homolog (Alp7) in Schizosaccharomyces pombe has been shown to be required for organisation of bipolar spindles.	201
398606	pfam05011	DBR1	Lariat debranching enzyme, C-terminal domain. This presumed domain is found at the C-terminus of lariat debranching enzyme. This domain is always found in association with pfam00149.	136
398607	pfam05013	FGase	N-formylglutamate amidohydrolase. Formylglutamate amidohydrolase (FGase) catalyzes the terminal reaction in the five-step pathway for histidine utilisation in Pseudomonas putida. By this action, N-formyl-L-glutamate (FG) is hydrolyzed to produce L-glutamate plus formate.	218
398608	pfam05014	Nuc_deoxyrib_tr	Nucleoside 2-deoxyribosyltransferase. Nucleoside 2-deoxyribosyltransferase EC:2.4.2.6 catalyzes the cleavage of the glycosidic bonds of 2'-deoxyribonucleosides.	115
398609	pfam05015	HigB-like_toxin	RelE-like toxin of type II toxin-antitoxin system HigB. This family carries several different examples of type II bacterial toxins of toxin-antitoxin systems including many HigB-like ones. The fold is referred to as the RelE/YoeB/Txe/Yeeu fold suggesting all examples of these are present in this family. Several plasmids with proteic killer gene-systems have been reported. All of them encode a stable toxin and an unstable antidote. Upon loss of the plasmid, the less stable inhibitor is inactivated more rapidly than the toxin, allowing the toxin to be activated. The activation of these systems results in cell filamentation and cessation of viable cell production. It has been verified that both the stable killer and the unstable inhibitor of the systems are short polypeptides. This family corresponds to the toxin.	93
398610	pfam05016	ParE_toxin	ParE toxin of type II toxin-antitoxin system, parDE. ParE is the toxin family of a type II toxin-antitoxin family. It is toxic towards DNA gyrase, but is neutralized by the antitoxin ParD. The family also encompasses RelE/ParE.	91
368241	pfam05017	TMP	TMP repeat. This short repeat consists of the motif WXXh where X can be any residue and h is a hydrophobic residue. The repeat is named TMP after its occurrence in the tape measure protein (TMP). Tape measure protein is a component of phage tail and probably forms a beta-helix. Truncated forms of TMP lead to shortened tail fibers. This repeat is also found in non-phage proteins where it may play a structural role.	11
398611	pfam05018	DUF667	Protein of unknown function (DUF667). This family of proteins are highly conserved in eukaryotes. Some proteins in the family are annotated as transcription factors. However, there is currently no support for this in the literature.	185
398612	pfam05019	Coq4	Coenzyme Q (ubiquinone) biosynthesis protein Coq4. Coq4p was shown to peripherally associate with the matrix face of the mitochondrial inner membrane. The putative mitochondrial-targeting sequence present at the amino-terminus of the polypeptide efficiently imported it to mitochondria. The function of Coq4p is unknown, although its presence is required to maintain a steady-state level of Coq7p, another component of the Q biosynthetic pathway. The overall structure of Coq4 is alpha helical and shows resemblance to haemoglobin/myoglobin (information from TOPSAN).	213
398613	pfam05020	zf-NPL4	NPL4 family, putative zinc binding region. The HRD4 gene was identical to NPL4, a gene previously implicated in nuclear transport. Using a diverse set of substrates and direct ubiquitination assays, analysis revealed that HRD4/NPL4 is required for a poorly characterized step in ER-associated degradation after ubiquitination of target proteins but before their recognition by the 26S proteasome. This region of the protein contains possibly two zinc binding motifs (Bateman A pers. obs.). Npl4p physically associates with Cdc48p via Ufd1p to form a Cdc48p-Ufd1p-Npl4p complex. The Cdc48-Ufd1-Npl4 complex functions in the recognition of several polyubiquitin-tagged proteins and facilitates their presentation to the 26S proteasome for processive degradation or even more specific processing.	145
398614	pfam05021	NPL4	NPL4 family. The HRD4 gene was identical to NPL4, a gene previously implicated in nuclear transport. Using a diverse set of substrates and direct ubiquitination assays, analysis revealed that HRD4/NPL4 is required for a poorly characterized step in ER-associated degradation after ubiquitination of target proteins but before their recognition by the 26S proteasome. Npl4p physically associates with Cdc48p via Ufd1p to form a Cdc48p-Ufd1p-Npl4p complex. The Cdc48-Ufd1-Npl4 complex functions in the recognition of several polyubiquitin-tagged proteins and facilitates their presentation to the 26S proteasome for processive degradation or even more specific processing.	308
398615	pfam05022	SRP40_C	SRP40, C-terminal domain. This presumed domain is found at the C-terminus of the S. cerevisiae SRP40 protein and its homologs. SRP40/nopp40 is a chaperone involved in nucleocytoplasmic transport. SRP40 is also a suppressor of mutant AC40 subunit of RNA polymerase I and III.	75
398616	pfam05023	Phytochelatin	Phytochelatin synthase. Phytochelatin synthase is the enzyme responsible for the synthesis of heavy-metal-binding peptides (phytochelatins) from glutathione and related thiols. The crystal structure of a member of this family shows it to possess a papain fold. The enzyme catalyzes the deglycination of a GSH donor molecule. The enzyme contains a catalytic triad of cysteine, histidine and aspartate residues.	207
398617	pfam05024	Gpi1	N-acetylglucosaminyl transferase component (Gpi1). Glycosylphosphatidylinositol (GPI) represents an important anchoring molecule for cell surface proteins. The first step in its synthesis is the transfer of N-acetylglucosamine (GlcNAc) from UDP-N-acetylglucosamine to phosphatidylinositol (PI). This chemically simple step is genetically complex because three or four genes are required in both yeast (GPI1, GPI2 and GPI3) and mammals (GPI1, PIG A, PIG H and PIG C), respectively.	187
398618	pfam05025	RbsD_FucU	RbsD / FucU transport protein family. The Escherichia coli high-affinity ribose-transport system consists of six proteins encoded by the rbs operon (rbsD, rbsA, rbsC, rbsB, rbsK and rbsR). Of the six components, RbsD is the only one whose function is unknown although it is thought that it somehow plays a critical role in PtsG-mediated ribose transport. This family also includes FucU a protein from the fucose biosynthesis operon that is presumably also involved in fucose transport by similarity to RbsD.	133
398619	pfam05026	DCP2	Dcp2, box A domain. This domain is always found to the amino terminal side of pfam00293. This domain is specific to mRNA decapping protein 2 and this region has been termed Box A. Removal of the cap structure is catalyzed by the Dcp1-Dcp2 complex.	83
398620	pfam05028	PARG_cat	Poly (ADP-ribose) glycohydrolase (PARG). Poly(ADP-ribose) glycohydrolase (PARG), is a ubiquitously expressed exo- and endoglycohydrolase which mediates oxidative and excitotoxic neuronal death.	324
398621	pfam05029	TIMELESS_C	Timeless protein C terminal region. The timeless (tim) gene is essential for circadian function in Drosophila. Putative homologs of Drosophila tim have been identified in both mice and humans (mTim and hTIM, respectively). Mammalian TIM is not the true orthologue of Drosophila TIM, but is the likely orthologue of a fly gene, timeout (also called tim-2). mTim has been shown to be essential for embryonic development, but does not have substantiated circadian function. Some family members contain a SANT domain in this region.	88
398622	pfam05030	SSXT	SSXT protein (N-terminal region). The SSXT or SS18 protein is involved in synovial sarcoma in humans. A SYT-SSX fusion gene resulting from the chromosomal translocation t(X;18) (p11;q11) is characteristic of synovial sarcomas. This translocation fuses the SSXT (SYT) gene from chromosome 18 to either of two homologous genes at Xp11, SSX1 or SSX2.	60
398623	pfam05031	NEAT	Iron Transport-associated domain. NEAT domains are heme and/or hemoprotein-binding modules highly conserved in secondary structure. They have roles in hemoprotein binding, heme extraction and heme transfer.	113
398624	pfam05032	Spo12	Spo12 family. This family of proteins includes Spo12 from S. cerevisiae. The Spo12 protein plays a regulatory role in two of the most fundamental processes of biology, mitosis and meiosis, and yet its biochemical function remains elusive. Spo12 is a nuclear protein. Spo12 is a component of the FEAR (Cdc fourteen early anaphase release) regulatory network, that promotes Cdc14 release from the nucleolus during early anaphase. The FEAR network is comprised of the polo kinase Cdc5, the separase Esp1, the kinetochore-associated protein Slk19, and Spo12.	33
398625	pfam05033	Pre-SET	Pre-SET motif. This protein motif is a zinc binding motif. It contains 9 conserved cysteines that coordinate three zinc ions. It is thought that this region plays a structural role in stabilizing SET domains.	98
398626	pfam05034	MAAL_N	Methylaspartate ammonia-lyase N-terminus. Methylaspartate ammonia-lyase EC:4.3.1.2 catalyzes the second step of fermentation of glutamate. It is a homodimer. This family represents the N-terminal region of Methylaspartate ammonia-lyase. This domain is structurally related to pfam03952. This domain is associated with the catalytic domain pfam07476.	160
398627	pfam05035	DGOK	2-keto-3-deoxy-galactonokinase. 2-keto-3-deoxy-galactonokinase EC:2.7.1.58 catalyzes the second step in D-galactonate degradation.	284
398628	pfam05036	SPOR	Sporulation related domain. This 70 residue domain is composed of two 35 residue repeats found in proteins involved in sporulation and cell division such as FtsN, DedD, and CwlM. This domain is involved in binding peptidoglycan. Two tandem repeats fold into a pseudo-2-fold symmetric single-domain structure containing numerous contacts between the repeats. FtsN is an essential cell division protein with a simple bitopic topology, a short N-terminal cytoplasmic segment fused to a large carboxy periplasmic domain through a single transmembrane domain. These repeats lay at the periplasmic C-terminus. FtsN localizes to the septum ring complex.	76
398629	pfam05037	DUF669	Protein of unknown function (DUF669). Members of this family are found in various phage proteins.	126
398630	pfam05038	Cytochrom_B558a	Cytochrome b558 alpha-subunit. Cytochrome b-245 light chain (p22-phox) is one of the key electron transfer elements of the NADPH oxidase in phagocytes.	177
398631	pfam05039	Agouti	Agouti protein. The agouti protein regulates pigmentation in the mouse hair follicle producing a black hair with a subapical yellow band. A highly homologous protein agouti signal protein (ASIP) is present in humans and is expressed at highest levels in adipose tissue where it may play a role in energy homeostasis and possibly human pigmentation.	87
398632	pfam05041	Pecanex_C	Pecanex protein (C-terminus). This family consists of C terminal region of the pecanex protein homologs. The pecanex protein is a maternal-effect neurogenic gene found in Drosophila.	227
398633	pfam05042	Caleosin	Caleosin related protein. This family contains plant proteins related to caleosin. Caleosins contain calcium-binding domains and have an oleosin-like association with lipid bodies. Caleosins are present at relatively low levels and are mainly bound to microsomal membrane fractions at the early stages of seed development. As the seeds mature, overall levels of caleosins increased dramatically and they were associated almost exclusively with storage lipid bodies. This family is probably related to EF hands pfam00036.	170
398634	pfam05043	Mga	Mga helix-turn-helix domain. M regulator protein trans-acting positive regulator (Mga) is a DNA-binding protein that activates the expression of several important virulence genes in group A streptococcus in response to changing environmental conditions. This domain is found in the centre of the Mga proteins. This family also contains a number of bacterial RofA transcriptional regulators that seem to be largely restricted to streptococci. These proteins have been shown to regulate the expression of important bacterial adhesins. This is presumably a DNA-binding domain.	87
398635	pfam05044	HPD	Homeo-prospero domain. Prospero is a large Drosophila transcription factor protein that is expressed in all neural lineages of Drosophila embryos. It is needed for correct expression of several neural proteins and in determining the cell fates of neural stem cells. Homologs of prospero are found in a wide range of animals including humans with the highest level of similarity being found in the C-terminal 160 amino acids. This region was identified as containing an atypical homeobox domain followed by a prospero domain. However, the structure shows that these two regions form a single stable structural domain as defined here. This homeo-prospero domain binds to DNA.	152
398636	pfam05045	RgpF	Rhamnan synthesis protein F. This family consists of a group of proteins which are related to the Streptococcus rhamnose-glucose polysaccharide assembly protein (RgpF). Rhamnan backbones are found in several O polysaccharides of phytopathogenic bacteria and are regarded as pathogenic factors.	501
398637	pfam05046	Img2	Mitochondrial large subunit ribosomal protein (Img2). This family of proteins has been identified as part of the mitochondrial large ribosomal subunit in yeast.	82
368263	pfam05047	L51_S25_CI-B8	Mitochondrial ribosomal protein L51 / S25 / CI-B8 domain. The proteins in this family are located in the mitochondrion. The family includes ribosomal protein L51, and S25. This family also includes mitochondrial NADH-ubiquinone oxidoreductase B8 subunit (CI-B8) EC:1.6.5.3. It is not known whether all members of this family form part of the NADH-ubiquinone oxidoreductase and whether they are also all ribosomal proteins. Structurally related to thioredoxin-fold.	51
398638	pfam05048	NosD	Periplasmic copper-binding protein (NosD). NosD is a periplasmic protein which is thought to insert copper into the exported reductase apoenzyme (NosZ). This region forms a parallel beta helix domain.	215
398639	pfam05049	IIGP	Interferon-inducible GTPase (IIGP). Interferon-inducible GTPase (IIGP) is thought to play a role in intracellular defense. IIGP is predominantly associated with the Golgi apparatus and also localizes to the endoplasmic reticulum and exerts a distinct role in IFN-induced intracellular membrane trafficking or processing.	375
398640	pfam05050	Methyltransf_21	Methyltransferase FkbM domain. This family has members from bacteria to human, and appears to be a methyltransferase.	173
398641	pfam05051	COX17	Cytochrome C oxidase copper chaperone (COX17). Cox17 is essential for the assembly of functional cytochrome c oxidase (CCO) and for delivery of copper ions to the mitochondrion for insertion into the enzyme in yeast. The structure of Cox17 shows the protein to have an unstructured N-terminal region followed by two helices and several unstructured C-terminal residues. The Cu(I) binding site has been modelled as two-coordinate with ligation by conserved residues Cys23 and Cys26.	47
309965	pfam05052	MerE	MerE protein. The prokaryotic MerE (or URF-1) protein is part of the mercury resistance operon. The protein is thought not to have any direct role in conferring mercury resistance to the organism but may be a mercury resistance transposon.	75
398642	pfam05053	Menin	Menin. MEN1, the gene responsible for multiple endocrine neoplasia type 1, is a tumor suppressor gene that encodes a protein called Menin which may be an atypical GTPase stimulated by nm23.	617
282858	pfam05054	AcMNPV_Ac109	Autographa californica nuclear polyhedrosis virus (AcMNPV) protein. This domain family is found in viral proteins such as Ac109 from Autographa californica nuclear polyhedrosis virus (AcMNPV). The gene (Orf1090) is essential and transcribed late in virus assembly, and protein AC109 has been shown to be important for the transport of the budded virion to the host nucleus. In mutants lacking the AC109 gene, virions are unable to enter the nucleus and remain in the cytoplasm. Although addition of AC109 allowed virions to enter the nucleus, the occlusion bodies were empty, indicating that AC109 is also important for the production of infectious budded virus. The exact function of this domain family remains unknown.	418
252976	pfam05055	DUF677	Protein of unknown function (DUF677). This family consists of AT14A-like proteins from Arabidopsis thaliana. At14a has a small domain that has sequence similarities to integrins from fungi, insects and humans. Transcripts of At14a are found in all Arabidopsis tissues, and the protein localizes partly to the plasma membrane.	336
398643	pfam05056	DUF674	Protein of unknown function (DUF674). This family is found in Arabidopsis thaliana and contains several uncharacterized proteins.	449
309968	pfam05057	DUF676	Putative serine esterase (DUF676). This family of proteins are probably serine esterase type enzymes with an alpha/beta hydrolase fold.	212
282861	pfam05058	ActA	ActA Protein. The ActA family is found in Listeria and is associated with motility. ActA protein acts as a scaffold to assemble and activate host cell actin cytoskeletal factors at the bacterial surface, resulting in directional actin polymerization and propulsion of the bacterium through the cytoplasm of the host cell.	633
282862	pfam05059	Orbi_VP4	Orbivirus VP4 core protein. Orbiviruses are double-stranded RNA viruses of which the bluetongue virus is a member. The core of bluetongue virus (BTV) is a multienzyme complex composed of two major proteins (VP7 and VP3) and three minor proteins (VP1, VP4 and VP6) in addition to the viral genome. VP4 has been shown to perform all RNA capping activities and has both methyltransferase type 1 and type 2 activities associated with it.	640
398644	pfam05060	MGAT2	N-acetylglucosaminyltransferase II (MGAT2). UDP-N-acetyl-D-glucosamine:alpha-6-D-mannoside beta-1,2-N-acetylglucosaminyltransferase II (EC 2.4.1.143) (GnT II/MGAT2) is a Golgi resident enzyme that catalyzes an essential step in the biosynthetic pathway leading from high mannose to complex N-linked oligosaccharides. Mutations in the MGAT2 gene lead to congenital disorder of glycosylation (CDG IIa). CDG IIa patients have an increased bleeding tendency, unrelated to coagulation factors.	349
282864	pfam05061	Pox_A11	Poxvirus A11 Protein. Family of conserved Chordopoxvirinae A11 family proteins. Conserved region spans entire protein in the majority of family members.	315
309970	pfam05062	RICH	RICH domain. This presumed domain is about 85 residues in length and very rich in charged residues, hence the name RICH (Rich In CHarged residues). It is found in secreted proteins such as PspC, SpsA and IgA FC receptor from Streptococcus agalactiae. This domain could be involved in bacterial adherence or cell wall binding.	81
368270	pfam05063	MT-A70	MT-A70. MT-A70 is the S-adenosylmethionine-binding subunit of human mRNA:m6A methyl-transferase (MTase), an enzyme that sequence-specifically methylates adenines in pre-mRNAs.	174
398645	pfam05064	Nsp1_C	Nsp1-like C-terminal region. This family probably forms a coiled-coil. This important region of Nsp1 is involved in binding Nup82.	116
398646	pfam05065	Phage_capsid	Phage capsid family. Family of bacteriophage hypothetical proteins and capsid proteins.	273
398647	pfam05066	HARE-HTH	HB1, ASXL, restriction endonuclease HTH domain. A winged helix-turn-helix domain present in the plant HB1, vertebrate ASXL, the H. pylori restriction endonuclease HpyAIII (HgrA), the RNA polymerase delta subunit (RpoE) of Gram positive bacteria and several restriction endonucleases. The domain is distinguished by the presence of a conserved one-turn helix between helix-3 and the preceding conserved turn. Its diverse architectures in eukaryotic species with extensive gene body methylation are suggestive of a chromatin function. The genetic interaction of the HARE-HTH containing ASXL with the methyl cytosine hydroxylating Tet2 protein is suggestive of a role for the domain in discriminating sequences with DNA modifications such as hmC. Bacterial versions include fusions to diverse restriction endonucleases, and a DNA glycosylase where it may play a similar role in detecting modified DNA. Certain bacterial versions of the HARE-HTH domain show fusions to the helix-hairpin-helix domain of the RNA polymerase alpha subunit and the HTH domains found in regions 3 and 4 of the sigma factors. These versions are predicted to function as a novel inhibitor of the binding of RNA polymerase to transcription start sites, similar to the Bacillus delta protein.	71
252986	pfam05067	Mn_catalase	Manganese containing catalase. Catalases are important antioxidant metalloenzymes that catalyze disproportionation of hydrogen peroxide, forming dioxygen and water. Two families of catalases are known, one having a heme cofactor, and this family that is a structurally distinct family containing non-heme manganese.	283
398648	pfam05068	MtlR	Mannitol repressor. The mannitol operon of Escherichia coli, encoding the mannitol-specific enzyme II of the phosphotransferase system (MtlA) and mannitol phosphate dehydrogenase (MtlD) contains an additional downstream open reading frame which encodes the mannitol repressor (MtlR).	164
398649	pfam05069	Phage_tail_S	Phage virion morphogenesis family. Protein S of phage P2 is thought to be involved in tail completion and stable head joining.	148
398650	pfam05071	NDUFA12	NADH ubiquinone oxidoreductase subunit NDUFA12. This family contains the 17.2 kD subunit of complex I (NDUFA12) and its homologs. The family also contains a second related eukaryotic protein of unknown function.	78
282873	pfam05072	Herpes_UL43	Herpesvirus UL43 protein. UL43 genes are expressed with true-late (gamma2) kinetics and have been identified as a virion tegument component.	373
368276	pfam05073	Baculo_p24	Baculovirus P24 capsid protein. Baculovirus P24 is associated with nucleocapsids of budded and polyhedra-derived virions.	165
368277	pfam05075	DUF684	Protein of unknown function (DUF684). This family contains several uncharacterized proteins from Caenorhabditis elegans. The GO annotation suggests that the protein is involved in nematode larval development and has a positive regulation on growth rate.	338
398651	pfam05076	SUFU	Suppressor of fused protein (SUFU). SUFU, encoding the human orthologue of Drosophila suppressor of fused, appears to have a conserved role in the repression of Hedgehog signaling. SUFU exerts its repressor role by physically interacting with GLI proteins in both the cytoplasm and the nucleus. SUFU has been found to be a tumor-suppressor gene that predisposes individuals to medulloblastoma by modulating the SHH signaling pathway. Genomic contextual analysis of bacterial SUFU versions revealed that they are immunity proteins against diverse nuclease toxins in polymorphic toxin systems.	171
282877	pfam05077	DUF678	Protein of unknown function (DUF678). This family contains several poxvirus proteins of unknown function.	73
398652	pfam05078	DUF679	Protein of unknown function (DUF679). This family contains several uncharacterized plant proteins.	163
398653	pfam05079	DUF680	Protein of unknown function (DUF680). This family contains several uncharacterized proteins which seem to be found exclusively in Rhizobium loti.	54
113835	pfam05080	DUF681	Protein of unknown function (DUF681). This family contains several uncharacterized beak and feather disease virus proteins.	101
282879	pfam05081	DUF682	Protein of unknown function (DUF682). This family consists of several uncharacterized baculovirus proteins.	157
398654	pfam05082	Rop-like	Rop-like. This family contains several uncharacterized bacterial proteins. These proteins are found in nitrogen fixation operons so are likely to play some role in this process. They consist of two alpha helices which are joined by a four residue linker. The helices form an antiparallel bundle and cross towards their termini. They are likely to form a rod-like dimer. They have structural similarity to the regulatory protein Rop, pfam01815.	60
398655	pfam05083	LST1	LST-1 protein. B144/LST1 is a gene encoded in the human major histocompatibility complex that produces multiple forms of alternatively spliced mRNA and encodes peptides fewer than 100 amino acids in length. B144/LST1 is strongly expressed in dendritic cells. Transfection of B144/LST1 into a variety of cells induces morphologic changes including the production of long, thin filopodia.	78
398656	pfam05084	GRA6	Granule antigen protein (GRA6). This family contains the granule antigen protein GRA6 which is found in the parasitic protozoa Toxoplasma gondii and Neospora caninum. GRA6 protein plays an important role in the antigenicity and pathogenicity in these organisms.	216
282883	pfam05085	DUF685	Protein of unknown function (DUF685). This family consists of several uncharacterized proteins from Borrelia burgdorferi (Lyme disease spirochete). There is some evidence to suggest that the proteins may be outer surface proteins.	265
252996	pfam05086	Dicty_REP	Dictyostelium (Slime Mold) REP protein. This family consists of REP proteins from Dictyostelium (Slime molds). REP protein is likely involved in transcription regulation and control of DNA replication, specifically amplification of plasmid at low copy numbers. The formation of homomultimers may be required for their regulatory activity.	910
398657	pfam05087	Rota_VP2	Rotavirus VP2 protein. Rotavirus particles consist of three concentric proteinaceous capsid layers. The innermost capsid (core) is made of VP2. The genomic RNA and the two minor proteins VP1 and VP3 are encapsidated within this layer. The N-terminus of rotavirus VP2 is necessary for the encapsidation of VP1 and VP3.	882
398658	pfam05088	Bac_GDH	Bacterial NAD-glutamate dehydrogenase. This family consists of several bacterial proteins which are closely related to NAD-glutamate dehydrogenase found in Streptomyces clavuligerus. Glutamate dehydrogenases (GDHs) are a broadly distributed group of enzymes that catalyze the reversible oxidative deamination of glutamate to ketoglutarate and ammonia.	1530
398659	pfam05089	NAGLU	Alpha-N-acetylglucosaminidase (NAGLU) tim-barrel domain. Alpha-N-acetylglucosaminidase is a lysosomal enzyme required for the stepwise degradation of heparan sulfate. Mutations in the alpha-N-acetylglucosaminidase (NAGLU) gene can lead to Mucopolysaccharidosis type IIIB (MPS IIIB; or Sanfilippo syndrome type B) characterized by neurological dysfunction but relatively mild somatic manifestations. The structure shows that the enzyme is composed of three domains. This central domain has a tim barrel fold.	333
398660	pfam05090	VKG_Carbox	Vitamin K-dependent gamma-carboxylase. Using reduced vitamin K, oxygen, and carbon dioxide, gamma-glutamyl carboxylase post-translationally modifies certain glutamates by adding carbon dioxide to the gamma position of those amino acids. In vertebrates, the modification of glutamate residues of target proteins is facilitated by an interaction between a propeptide present on target proteins and the gamma-glutamyl carboxylase.	431
398661	pfam05091	eIF-3_zeta	Eukaryotic translation initiation factor 3 subunit 7 (eIF-3). This family is made up of eukaryotic translation initiation factor 3 subunit 7 (eIF-3 zeta/eIF3 p66/eIF3d). Eukaryotic initiation factor 3 is a multi-subunit complex that is required for binding of mRNA to 40 S ribosomal subunits, stabilisation of ternary complex binding to 40 S subunits, and dissociation of 40 and 60 S subunits. These functions and the complex nature of eIF3 suggest multiple interactions with many components of the translational machinery. The gene coding for the protein has been implicated in cancer in mammals.	530
282889	pfam05092	PIF	Per os infectivity. This is a family of dsDNA Baculovirus proteins. It is required for the infectivity of the OBs or occlusion bodies. It is a structural protein of the ODV envelope required only in the first steps of per os larva infection, as viruses being produced in cells expressing the gene for this protein but not containing it in their genomes are able to produce successful infections. Baculoviruses are large DNA viruses that infect arthropods, mainly members of the order Lepidoptera. In their life cycle, they produce two kinds of particles, a budded, non-occluded virus (BV), which buds out of the infected cell and is responsible for the cell-to-cell transmission of the virus, and an occluded form, the occlusion body (OB), which is responsible for protecting the virus between encounters with larvae. A variable number of virions are included in the para-crystalline structure of the OB, mainly constituted by the virus-encoded polyhedrin protein; these virions are called occlusion body-derived virions or ODVs.	519
398662	pfam05093	CIAPIN1	Cytokine-induced anti-apoptosis inhibitor 1, Fe-S biogenesis. Anamorsin, subsequently named CIAPIN1 for cytokine-induced anti-apoptosis inhibitor 1, in humans is the homolog of yeast Dre2, a conserved soluble eukaryotic Fe-S cluster protein, that functions in cytosolic Fe-S protein biogenesis. It is found in both the cytoplasm and in the mitochondrial intermembrane space (IMS). CIAPIN1 is found to be up-regulated in hepatocellular cancer, is considered to be a downstream effector of the receptor tyrosine kinase-Ras signalling pathway, and is essential in mouse definitive haematopoiesis. Dre2 has been found to interact with the yeast reductase Tah18, forming a tight cytosolic complex implicated in the response to high levels of oxidative stress.	99
282891	pfam05094	LEF-9	Late expression factor 9 (LEF-9). Late expression factor 9 (LEF-9) is one of the primary components of RNA polymerase produced by baculoviruses. LEF-9 is homologous to the largest beta-subunit of prokaryotic DNA-directed RNA polymerase.	493
309989	pfam05095	DUF687	Protein of unknown function (DUF687). This family contains several uncharacterized Chlamydia proteins.	537
398663	pfam05096	Glu_cyclase_2	Glutamine cyclotransferase. This family of enzymes EC:2.3.2.5 catalyze the cyclization of free L-glutamine and N-terminal glutaminyl residues in proteins to pyroglutamate (5-oxoproline) and pyroglutamyl residues respectively. This family includes plant and bacterial enzymes and seems unrelated to the mammalian enzymes.	240
368284	pfam05097	DUF688	Protein of unknown function (DUF688). This family contains several uncharacterized proteins found in Arabidopsis thaliana.	443
282893	pfam05098	LEF-4	Late expression factor 4 (LEF-4). Late expression factor 4 (LEF-4) is one of the Baculovirus late expression factor proteins. LEF-4 carries out all the enzymatic functions related to mRNA capping.	471
398664	pfam05099	TerB	Tellurite resistance protein TerB. This family contains the TerB tellurite resistance proteins from a number of bacteria.	142
282895	pfam05100	Phage_tail_L	Phage minor tail protein L. 	206
398665	pfam05101	VirB3	Type IV secretory pathway, VirB3-like protein. This family includes the Type IV secretory pathway VirB3 protein, that is found associated with bacterial inner and outer membranes. The family also includes the conjugal transfer protein TrbD family that contains a nucleotide binding motif and may provide energy for the export of DNA or the export of other Trb proteins.	82
309993	pfam05102	Holin_BlyA	holin, BlyA family. BlyA, a small holin found in Borrelia circular plasmids that is encoded by a prophage. BlyA contains two largely hydrophobic helices and a highly charged C-terminus and has two transmembrane segments.	61
398666	pfam05103	DivIVA	DivIVA protein. The Bacillus subtilis divIVA1 mutation causes misplacement of the septum during cell division, resulting in the formation of small, circular, anucleate mini-cells. Inactivation of divIVA produces a mini-cell phenotype, whereas overproduction of DivIVA results in a filamentation phenotype. These proteins appear to contain coiled-coils.	131
398667	pfam05104	Rib_recp_KP_reg	Ribosome receptor lysine/proline rich region. This highly conserved region is found towards the C-terminus of the transmembrane domain. The function is unclear.	139
398668	pfam05105	Phage_holin_4_1	Bacteriophage holin family. Phage holins and lytic enzymes are both necessary for bacterial lysis and virus dissemination. This family also includes TcdE/UtxA involved in toxin secretion in Clostridium difficile. The 1.E.10 family is represented by Bacillus subtilis phi29 holin; 1.E.16 represents the Cph1 holin; and the 1.E.19 family is represented by the Clostridium difficile TcdE holin. Toxigenic strains of C. difficile produce two large toxins (TcdA and TcdB) encoded within a pathogenicity locus. tcdE, encoded between tcdA and tcdB, encodes a 166 aa protein which causes death to E. coli when expressed, and the structure of TcdE resembles holins. TcdE acts on the bacterial membrane. Since TcdA and TcdB lack signal peptides, they may be released via TcdE either prior to or subsequent to cell lysis.	109
398669	pfam05106	Phage_holin_3_1	Phage holin family (Lysis protein S). This family represents one of a large number of mutually dissimilar families of phage holins. Holins act against the host cell membrane to allow lytic enzymes of the phage to reach the bacterial cell wall. This family includes the product of the S gene of phage lambda.	99
398670	pfam05107	Cas_Cas7	CRISPR-associated protein Cas7. CRISPR-associated protein Cas7 is one of the components of the type I-B cascade-like antiviral defense complex. In Haloferax volcanii, Cas5, Cas6 and Cas7 form a small complex that aids the stability of CRISPR-derived RNA.	252
398671	pfam05108	T7SS_ESX1_EccB	Type VII secretion system ESX-1, transport TM domain B. EccB is a family of largely Gram-positive bacterial transmembrane components of the type VII secretion system characterized in Mycobacterium tuberculosis, systems ESX1-5. Translocation of virulent peptides through the membranes is thought to be mediated via a complex that includes EccB, EccC, EccD, EccE, and MycP. EccB, EccC, EccD, and EccE form a stable complex in the mycobacterial cell envelope.	405
282904	pfam05109	Herpes_BLLF1	Herpes virus major outer envelope glycoprotein (BLLF1). This family consists of the BLLF1 viral late glycoprotein, also termed gp350/220. It is the most abundantly expressed glycoprotein in the viral envelope of the Herpesviruses and is the major antigen responsible for stimulating the production of neutralising antibodies in vivo.	886
398672	pfam05110	AF-4	AF-4 proto-oncoprotein. This family consists of AF4 (Proto-oncogene AF4) and FMR2 (Fragile X E mental retardation syndrome) nuclear proteins. These proteins have been linked to human diseases such as acute lymphoblastic leukaemia and mental retardation. The family also contains a Drosophila AF4 protein homolog Lilliputian which contains an AT-hook domain. Lilliputian represents a novel pair-rule gene that acts in cytoskeleton regulation, segmentation and morphogenesis in Drosophila.	515
398673	pfam05111	Amelin	Ameloblastin precursor (Amelin). This family consists of several mammalian Ameloblastin precursor (Amelin) proteins. Matrix proteins of tooth enamel consist mainly of amelogenin but also of non-amelogenin proteins, which, although their volumetric percentage is low, have an important role in enamel mineralisation. One of the non-amelogenin proteins is ameloblastin, also known as amelin and sheathlin. Ameloblastin (AMBN) is one of the enamel sheath proteins which is thought to have a role in determining the prismatic structure of growing enamel crystals.	417
282907	pfam05112	Baculo_p47	Baculovirus P47 protein. This family consists of several Baculovirus P47 proteins. P47 is one of the primary components of Baculovirus encoded RNA polymerase, which initiates transcription from late and very late promoters.	306
310001	pfam05113	DUF693	Protein of unknown function (DUF693). This family consists of several uncharacterized proteins from Borrelia burgdorferi (Lyme disease spirochete).	313
398674	pfam05114	DUF692	Protein of unknown function (DUF692). This family consists of several uncharacterized bacterial proteins.	263
398675	pfam05115	PetL	Cytochrome B6-F complex subunit VI (PetL). This family consists of several Cytochrome B6-F complex subunit VI (PetL) proteins found in several plant species. PetL is one of the small subunits which make up the cytochrome b(6)f complex. PetL is strictly required neither for the accumulation nor for the function of cytochrome b6f; in its absence, however, the complex becomes unstable in vivo in aging cells and labile in vitro. It has been suggested that the N-terminus of the protein is likely to lie in the thylakoid lumen.	31
398676	pfam05116	S6PP	Sucrose-6F-phosphate phosphohydrolase. This family consists of Sucrose-6F-phosphate phosphohydrolase proteins found in plants and cyanobacteria. Sucrose-6(F)-phosphate phosphohydrolase catalyzes the final step in the pathway of sucrose biosynthesis.	246
398677	pfam05117	DUF695	Family of unknown function (DUF695). Family of uncharacterized bacterial proteins.	129
398678	pfam05118	Asp_Arg_Hydrox	Aspartyl/Asparaginyl beta-hydroxylase. Iron (II)/2-oxoglutarate (2-OG)-dependent oxygenases catalyze oxidative reactions in a range of metabolic processes. Proline 3-hydroxylase hydroxylates proline at position 3, the first of a 2-OG oxygenase catalyzing oxidation of a free alpha-amino acid. The structure of proline 3-hydroxylase contains the conserved motifs present in other 2-OG oxygenases including a jelly roll strand core and residues binding iron and 2-oxoglutarate, consistent with divergent evolution within the extended family. This family represents the arginine, asparagine and proline hydroxylases. The aspartyl/asparaginyl beta-hydroxylase (EC:1.14.11.16) specifically hydroxylates one aspartic or asparagine residue in certain epidermal growth factor-like domains of a number of proteins.	157
398679	pfam05119	Terminase_4	Phage terminase, small subunit. 	92
310007	pfam05120	GvpG	Gas vesicle protein G. These proteins are involved in the formation of gas vesicles.	80
398680	pfam05121	GvpK	Gas vesicle protein K. These proteins are involved in the formation of gas vesicles.	81
398681	pfam05122	SpdB	Mobile element transfer protein. These proteins are involved in transferring a group of integrating conjugative DNA elements, such as pSAM2 from Streptomyces ambofaciens. Their precise role is not known.	50
368294	pfam05123	S_layer_N	S-layer like family, N-terminal region. 	284
368295	pfam05124	S_layer_C	S-layer like family, C-terminal region. 	221
398682	pfam05125	Phage_cap_P2	Phage major capsid protein, P2 family. 	326
377463	pfam05127	Helicase_RecD	Helicase. This domain contains a P-loop (Walker A) motif, suggesting that it has ATPase activity, and a Walker B motif. In tRNA(Met) cytidine acetyltransferase (TmcA) it may function as an RNA helicase motor (driven by ATP hydrolysis) which delivers the wobble base to the active centre of the GCN5-related N-acetyltransferase (GNAT) domain. It is found in the bacterial exodeoxyribonuclease V alpha chain (RecD), which has 5'-3' helicase activity. It is structurally similar to the motor domain 1A in other SF1 helicases.	175
368297	pfam05128	DUF697	Domain of unknown function (DUF697). Family of bacterial hypothetical proteins that is sometimes associated with GTPase domains.	162
398683	pfam05129	Elf1	Transcription elongation factor Elf1 like. This family of short proteins contains a putative zinc binding domain with four conserved cysteines. ELF1 has been identified as a transcription elongation factor in Saccharomyces cerevisiae.	77
398684	pfam05130	FlgN	FlgN protein. This family includes the FlgN protein and export chaperone involved in flagellar synthesis.	141
398685	pfam05131	Pep3_Vps18	Pep3/Vps18/deep orange family. This region is found in a number of proteins identified as involved in golgi function and vacuolar sorting. The molecular function of this region is unknown. The members of this family contain a C-terminal ring finger domain.	147
398686	pfam05132	RNA_pol_Rpc4	RNA polymerase III RPC4. Specific subunit for Pol III, the tRNA specific polymerase.	138
398687	pfam05133	Phage_prot_Gp6	Phage portal protein, SPP1 Gp6-like. This protein forms a hole, or portal, that enables DNA passage during packaging and ejection. It also forms the junction between the phage head (capsid) and the tail proteins. During SPP1 morphogenesis, Gp6 participates in the procapsid assembly reaction. This family also includes the old Pfam family Phage_min_cap (PF05126).	416
282928	pfam05134	T2SSL	Type II secretion system (T2SS), protein L. This family consists of Type II secretion system protein L sequences from several Gram-negative (diderm) bacteria. The Type II secretion system, also called Secretion-dependent pathway (SDP), is responsible for extracellular secretion of a number of different proteins, including proteases and toxins. This pathway supports secretion of proteins across the cell envelope in two distinct steps, in which the second step, involving translocation through the outer membrane, is assisted by at least 13 different gene products. T2SL is predicted to contain a large cytoplasmic domain represented by this family and has been shown to interact with the autophosphorylating cytoplasmic membrane protein T2SE. It is thought that the tri-molecular complex of T2SL, T2SE (pfam00437) and T2SM (pfam04612) might be involved in regulating the opening and closing of the secretion pore and/or transducing energy to the site of outer membrane translocation.	230
398688	pfam05135	Phage_connect_1	Phage gp6-like head-tail connector protein. This family of proteins contain head-tail connector proteins related to gp6 from bacteriophage HK97. A structure of this protein shows similarity to gp15 a well characterized connector component of bacteriophage SPP1.	94
398689	pfam05136	Phage_portal_2	Phage portal protein, lambda family. This protein forms a hole, or portal, that enables DNA passage during packaging and ejection. It also forms the junction between the phage capsid and the tail proteins.	343
398690	pfam05137	PilN	Fimbrial assembly protein (PilN). 	77
398691	pfam05138	PaaA_PaaC	Phenylacetic acid catabolic protein. This family includes proteins such as PaaA and PaaC that are part of a catabolic pathway of phenylacetic acid. These proteins may form part of a dioxygenase complex.	258
398692	pfam05139	Erythro_esteras	Erythromycin esterase. This family includes erythromycin esterase enzymes that confer resistance to the erythromycin antibiotic.	312
398693	pfam05140	ResB	ResB-like family. This family includes both ResB and cytochrome c biogenesis proteins. Mutations in ResB indicate that they are essential for growth. ResB is predicted to be a transmembrane protein.	446
398694	pfam05141	DIT1_PvcA	Pyoverdine/dityrosine biosynthesis protein. DIT1 is involved in synthesising dityrosine. Dityrosine is a sporulation-specific component of the yeast ascospore wall that is essential for the resistance of the spores to adverse environmental conditions. Pyoverdine biosynthesis protein PvcA is involved in the biosynthesis of pyoverdine, a cyclized isocyano derivative of tyrosine. It has a modified Rossmann fold.	270
398695	pfam05142	DUF702	Domain of unknown function (DUF702). Members of this family are found in various putative zinc finger proteins.	154
368303	pfam05144	Phage_CRI	Phage replication protein CRI. The phage replication protein CRI, also known as Gene II, is essential for DNA replication.	234
398696	pfam05145	AbrB	Transition state regulatory protein AbrB. Bacillus subtilis respond to a multitude of environmental stimuli by using transcription factors called transition state regulators (TSRs). They play an essential role in cell survival by regulating spore formation, competence, and biofilm development. AbrB is one of the most known TSRs, acting as a pleiotropic regulator for over 60 different genes where it directly binds to their promoter or regulatory regions. Many other genes are indirectly controlled by AbrB since it is a regulator of other regulatory proteins, including ScoC, Abh, SinR and SigH. Hence, AbrB is considered a global regulatory protein controlling processes such as Bacillus subtilis growth and cell division as well as production of extracellular degradative enzymes, nitrogen utilization and amino acid metabolism, motility, synthesis of antibiotics and their resistant determinants, development of competence, transport systems, oxidative stress response, phosphate metabolism, cell surface components and sporulation. AbrB is a tetramer consisting of identical 94 residue monomers. Its DNA-binding function resides solely in the N-terminal domain (AbrBN) of 53 residues. Although it does not recognize a well-defined DNA base-pairing sequence, it appears to target a very weak pseudo consensus nucleotide sequence, TGGNA-5bp-TGGNA, which allows it to be rather promiscuous in binding. The N-terminal domains of very similar sequences are present in two more Bacillus subtilis proteins, Abh and SpoVT. Mutagenesis studies suggest that the role of the C-terminal domain is in forming multimers.	312
398697	pfam05147	LANC_like	Lanthionine synthetase C-like protein. Lanthionines are thioether bridges that are putatively generated by dehydration of Ser and Thr residues followed by addition of cysteine residues within the peptide. This family contains the lanthionine synthetase C-like proteins 1 and 2 which are related to the bacterial lanthionine synthetase components C (LanC). LANCL1 (P40 seven-transmembrane-domain protein) and LANCL2 (testes-specific adriamycin sensitivity protein) are thought to be peptide-modifying enzyme components in eukaryotic cells. Both proteins are produced in large quantities in the brain and testes and may have role in the immune surveillance of these organs. Lanthionines are found in lantibiotics, which are peptide-derived, post-translationally modified antimicrobials produced by several bacterial strains. This region contains seven internal repeats.	350
398698	pfam05148	Methyltransf_8	Hypothetical methyltransferase. This family consists of several uncharacterized eukaryotic proteins which are related to methyltransferases pfam01209.	214
368306	pfam05149	Flagellar_rod	Paraflagellar rod protein. This family consists of several eukaryotic paraflagellar rod component proteins. The eukaryotic flagellum represents one of the most complex macromolecular structures found in any organism and contains more than 250 proteins. In addition to its locomotive role, the flagellum is probably involved in nutrient uptake since receptors for host low-density lipoproteins are localized on the flagellar membrane as well as on the flagellar pocket membrane.	287
398699	pfam05150	Legionella_OMP	Legionella pneumophila major outer membrane protein precursor. This family consists of major outer membrane protein precursors from Legionella pneumophila.	279
398700	pfam05151	PsbM	Photosystem II reaction centre M protein (PsbM). This family consists of several Photosystem II reaction centre M proteins (PsbM) from plants and cyanobacteria. During the photosynthetic light reactions in the thylakoid membranes of cyanobacteria, algae, and plants, photosystem II (PSII), a multi-subunit membrane protein complex, catalyzes oxidation of water to molecular oxygen and reduction of plastoquinone.	31
282943	pfam05152	DUF705	Protein of unknown function (DUF705). This family contains several uncharacterized Baculovirus proteins.	302
398701	pfam05153	MIOX	Myo-inositol oxygenase. MIOX is the enzyme myo-inositol oxygenase. It catalyzes the first committed step in the glucuronate-xylulose pathway. It is a di-iron oxygenase with a key role in inositol metabolism. The structure reveals a monomeric, single-domain protein with a mostly helical fold that is distantly related to the diverse HD domain superfamily. The structural core is of five alpha-helices that contribute six ligands, four His and two Asp, to the di-iron centre where the two iron atoms are bridged by a putative hydroxide ion and one of the Asp ligands. The substrate, myo-inositol, is bound in a terminal substrate-binding mode to a di-iron cluster. Within the structure are two additional proteinaceous lids that cover and shield the enzyme's active site.	249
377473	pfam05154	TM2	TM2 domain. This family is composed of a pair of transmembrane alpha helices connected by a short linker. The function of this domain is unknown, however it occurs in a wide range of protein contexts.	50
282946	pfam05155	Phage_X	Phage X family. This family is the product of Gene X. The function of this protein is unknown.	88
398702	pfam05157	T2SSE_N	Type II secretion system (T2SS), protein E, N-terminal domain. This domain is found at the N-terminus of members of the Type II secretion system protein E. Proteins in this subfamily are typically involved in Type 4 pilus biogenesis, though some are involved in other processes; for instance aggregation in Myxococcus xanthus. The structure of this domain is now known.	109
398703	pfam05158	RNA_pol_Rpc34	RNA polymerase Rpc34 subunit. Subunit specific to RNA Pol III, the tRNA specific polymerase. The C34 subunit of yeast RNA Pol III is part of a subcomplex of three subunits which have no counterpart in the other two nuclear RNA polymerases. This subunit interacts with TFIIIB70 and therefore participates in Pol III recruitment.	317
282949	pfam05159	Capsule_synth	Capsule polysaccharide biosynthesis protein. This family includes export proteins involved in capsule polysaccharide biosynthesis, such as KpsS and LipB.	310
398704	pfam05160	DSS1_SEM1	DSS1/SEM1 family. This family contains the breast cancer tumor suppressor BRCA2-interacting protein DSS1 and its homolog SEM1, both of which are short acidic proteins. DSS1 has been shown to be a conserved component of the Rae1 mediated mRNA export pathway in Schizosaccharomyces pombe.	56
398705	pfam05161	MOFRL	MOFRL family. The MOFRL (multi-organism fragment with rich Leucine) family exists in bacteria and eukaryotes. The function of this domain is not clear, although it exists in some putative enzymes such as reductases and kinases.	106
398706	pfam05162	Ribosomal_L41	Ribosomal protein L41. 	24
398707	pfam05163	DinB	DinB family. DNA damage-inducible (din) genes in Bacillus subtilis are coordinately regulated and together compose a global regulatory network that has been termed the SOS-like or SOB regulon. This family includes DinB from B. subtilis.	163
398708	pfam05164	ZapA	Cell division protein ZapA. ZapA is a cell division protein which interacts with FtsZ. FtsZ is part of a mid-cell cytokinetic structure termed the Z-ring that recruits a hierarchy of fission related proteins early in the bacterial cell cycle. The interaction of FtsZ with ZapA drives its polymerization and promotes FtsZ filament bundling thereby contributing to the spatio-temporal tuning of the Z-ring.	85
147379	pfam05165	GCH_III	GTP cyclohydrolase III. GTP cyclohydrolase (GCH) III from Methanocaldococcus jannaschii catalyzes the conversion of GTP to 2-amino-5-formylamino-6-ribosylamino-4(3H)-pyrimidinone 5'-phosphate (FAPy). The reaction requires two bound magnesium ions for the catalysis and is activated by monovalent cations such as potassium and ammonium. The enzyme is a tetramer of identical subunits; each monomer is composed of an N- and a C-terminal domain that adopt nearly superimposable structures, suggesting that the protein has arisen by gene duplication. The family is found in archaea and bacteria.	246
398709	pfam05166	YcgL	YcgL domain. This family of proteins, formerly called DUF709, includes the E. coli gene ycgL. Homologs of YcgL are found in gammaproteobacteria. The structure of this protein shows a novel alpha/beta/alpha sandwich structure.	73
398710	pfam05167	DUF711	Uncharacterized ACR (DUF711). The proteins in this family are functionally uncharacterized. The proteins are around 450 amino acids long. It is likely that this family represents a group of glycerol-3-phosphate dehydrogenases.	402
398711	pfam05168	HEPN	HEPN domain. 	117
282957	pfam05170	AsmA	AsmA family. The product of the AsmA gene is involved in the assembly of outer membrane proteins in Escherichia coli. AsmA mutations were isolated as extragenic suppressors of an OmpF assembly mutant. AsmA may have a role in LPS biogenesis.	608
398712	pfam05171	HemS	Haemin-degrading HemS.ChuX domain. The Yersinia enterocolitica O:8 periplasmic binding-protein-dependent transport system consisted of four proteins: the periplasmic haemin-binding protein HemT, the haemin permease protein HemU, the ATP-binding hydrophilic protein HemV and the haemin-degrading protein HemS (this family). The structure for HemS has been solved and consists of a tandem repeat of this domain.	128
398713	pfam05172	Nup35_RRM	Nup53/35/40-type RNA recognition motif. Members of this family belong to the nuclear pore complex, NPC, the only gateway between the nucleus and the cytoplasm. The NPC consists of several subcomplexes each one of which is made up of multiple copies of several individual Nup, Nic or Sec protein subunits. In yeast, this Nup or nucleoporin subunit is numbered Nup53, Nup40 in Schizo. pombe and in vertebrates as Nup35. This subunit forms part of the inner ring within the membrane and interacts directly with Nup-Ndc1, considered to be an anchor for the NPC in the pore membrane. This region of the Nup is the RNA-recognition region.	81
398714	pfam05173	DapB_C	Dihydrodipicolinate reductase, C-terminus. Dihydrodipicolinate reductase (DapB) reduces the alpha,beta-unsaturated cyclic imine, dihydro-dipicolinate. This reaction is the second committed step in the biosynthesis of L-lysine and its precursor meso-diaminopimelate, which are critical for both protein and cell wall biosynthesis. The C-terminal domain of DapB has been proposed to be the substrate-binding domain.	134
398715	pfam05175	MTS	Methyltransferase small domain. This domain is found in ribosomal RNA small subunit methyltransferase C as well as other methyltransferases.	170
398716	pfam05176	ATP-synt_10	ATP10 protein. ATP 10 is essential for the assembly of a functional mitochondrial ATPase complex.	255
398717	pfam05177	RCSD	RCSD region. Proteins containing this region include C.elegans UNC-89. This region is found repeated in UNC-89 and shows conservation in prolines, lysines and glutamic acids. Proteins with RCSD are involved in muscle M-line assembly, but the function of the RCSD region is not clear.	101
398718	pfam05178	Kri1	KRI1-like family. The yeast member of this family (Kri1p) is found to be required for 40S ribosome biogenesis in the nucleolus.	101
398719	pfam05179	CDC73_C	RNA pol II accessory factor, Cdc73 family, C-terminal. CDC73 is an RNA polymerase II accessory factor, and forms part of the Paf1 complex that has roles in post-initiation events. More specifically, crystal structure analysis shows the C-terminus to be a Ras-like domain that adopts a fold that is highly similar to GTPases of the Ras superfamily. The canonical nucleotide binding pocket is altered in CDC73, and there is no nucleotide ligand, but it contributes to histone methylation and Paf1C recruitment to active genes. Thus together with Rtf1 it combines to couple the Paf1 complex to elongating polymerase. The family has been added to the P-loop clan on the basis of the topology of the b-stranded core, and its similarity to Ras.	155
398720	pfam05180	zf-DNL	DNL zinc finger. The domain is named after a short C-terminal motif of D(N/H)L. This domain is a novel zinc-finger protein essential for protein import into mitochondria.	64
398721	pfam05181	XPA_C	XPA protein C-terminus. 	51
398722	pfam05182	Fip1	Fip1 motif. This short motif is about 40 amino acids in length. The Fip1 protein is a component of a yeast pre-mRNA polyadenylation factor that directly interacts with poly(A) polymerase. This region of Fip1 is needed for the interaction with the Th1 subunit of the complex and for specific polyadenylation of the cleaved mRNA precursor.	43
398723	pfam05183	RdRP	RNA dependent RNA polymerase. This family of proteins are eukaryotic RNA dependent RNA polymerases. These proteins are involved in post transcriptional gene silencing where they are thought to amplify dsRNA templates.	554
398724	pfam05184	SapB_1	Saposin-like type B, region 1. 	36
398725	pfam05185	PRMT5	PRMT5 arginine-N-methyltransferase. The human homolog of yeast Skb1 (Shk1 kinase-binding protein 1) is PRMT5, an arginine-N-methyltransferase. These proteins appear to be key mitotic regulators. They play a role in Jak signalling in higher eukaryotes.	171
398726	pfam05186	Dpy-30	Dpy-30 motif. This motif is found in a wide variety of domain contexts. It is found in the Dpy-30 proteins, hence the motif's name. It is about 40 residues long and is probably formed of two alpha-helices. It may be a dimerization motif analogous to pfam02197 (Bateman A pers obs).	42
398727	pfam05187	ETF_QO	Electron transfer flavoprotein-ubiquinone oxidoreductase, 4Fe-4S. Electron-transfer flavoprotein-ubiquinone oxidoreductase (ETF-QO) in the inner mitochondrial membrane accepts electrons from electron-transfer flavoprotein which is located in the mitochondrial matrix and reduces ubiquinone in the mitochondrial membrane. The two redox centers in the protein, FAD and a [4Fe4S] cluster, are present in a 64-kDa monomer.	103
398728	pfam05188	MutS_II	MutS domain II. This domain is found in proteins of the MutS family (DNA mismatch repair proteins) and is found associated with pfam00488, pfam01624, pfam05192 and pfam05190. The MutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair; other members of the family included the eukaryotic MSH 1,2,3, 4,5 and 6 proteins. These have various roles in DNA repair and recombination. Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein. This domain corresponds to domain II in Thermus aquaticus MutS and resembles RNAse-H-like domains (see pfam00075).	133
398729	pfam05189	RTC_insert	RNA 3'-terminal phosphate cyclase (RTC), insert domain. RNA cyclases are a family of RNA-modifying enzymes that are conserved in all cellular organisms. They catalyze the ATP-dependent conversion of the 3'-phosphate to the 2',3'-cyclic phosphodiester at the end of RNA, in a reaction involving formation of the covalent AMP-cyclase intermediate. The structure of RTC demonstrates that RTCs comprise two domains. The larger domain contains an insert domain of approximately 100 amino acids.	102
398730	pfam05190	MutS_IV	MutS family domain IV. This domain is found in proteins of the MutS family (DNA mismatch repair proteins) and is found associated with pfam01624, pfam05188, pfam05192 and pfam00488. The mutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair; other members of the family included the eukaryotic MSH 1,2,3, 4,5 and 6 proteins. These have various roles in DNA repair and recombination. Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein. The aligned region corresponds in part with globular domain IV, which is involved in DNA binding, in Thermus aquaticus MutS as characterized in.	92
398731	pfam05191	ADK_lid	Adenylate kinase, active site lid. Comparisons of adenylate kinases have revealed a particular divergence in the active site lid. In some organisms, particularly the Gram-positive bacteria, residues in the lid domain have been mutated to cysteines and these cysteine residues are responsible for the binding of a zinc ion. The bound zinc ion in the lid domain, is clearly structurally homologous to Zinc-finger domains. However, it is unclear whether the adenylate kinase lid is a novel zinc-finger DNA/RNA binding domain, or that the lid bound zinc serves a purely structural function.	36
398732	pfam05192	MutS_III	MutS domain III. This domain is found in proteins of the MutS family (DNA mismatch repair proteins) and is found associated with pfam00488, pfam05188, pfam01624 and pfam05190. The MutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair; other members of the family included the eukaryotic MSH 1,2,3, 4,5 and 6 proteins. These have various roles in DNA repair and recombination. Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein. The aligned region corresponds with domain III, which is central to the structure of Thermus aquaticus MutS as characterized in.	290
398733	pfam05193	Peptidase_M16_C	Peptidase M16 inactive domain. Peptidase M16 consists of two structurally related domains. One is the active peptidase, whereas the other is inactive. The two domains hold the substrate like a clamp.	181
398734	pfam05194	UreE_C	UreE urease accessory protein, C-terminal domain. UreE is a urease accessory protein. Urease pfam00449 hydrolyzes urea into ammonia and carbamic acid. The C-terminal region of members of this family contains a His rich Nickel binding site.	86
398735	pfam05195	AMP_N	Aminopeptidase P, N-terminal domain. This domain is structurally very similar to the creatinase N-terminal domain (pfam01321). However, little or no sequence similarity exists between the two families.	121
398736	pfam05196	PTN_MK_N	PTN/MK heparin-binding protein family, N-terminal domain. 	57
398737	pfam05197	TRIC	TRIC channel. TRIC (trimeric intracellular cation) channels are differentially expressed in intracellular stores in animal cell types. TRIC subtypes contain three proposed transmembrane segments, and form homo-trimers with a bullet-like structure. Electrophysiological measurements with purified TRIC preparations identify a monovalent cation-selective channel.	185
398738	pfam05198	IF3_N	Translation initiation factor IF-3, N-terminal domain. 	70
398739	pfam05199	GMC_oxred_C	GMC oxidoreductase. This domain is found associated with pfam00732.	143
398740	pfam05201	GlutR_N	Glutamyl-tRNAGlu reductase, N-terminal domain. 	149
398741	pfam05202	Flp_C	Recombinase Flp protein. 	254
368334	pfam05203	Hom_end_hint	Hom_end-associated Hint. Homing endonucleases are encoded by mobile DNA elements that are found inserted within host genes in all domains of life. The crystal structure of the homing nuclease PI-Sce revealed two domains: an endonucleolytic centre resembling the C-terminal domain of Drosophila melanogaster Hedgehog protein, and a second domain containing the protein-splicing active site. This Domain corresponds to the latter protein-splicing domain.	444
368335	pfam05204	Hom_end	Homing endonuclease. Homing endonucleases are encoded by mobile DNA elements that are found inserted within host genes in all domains of life.	110
398742	pfam05205	COMPASS-Shg1	COMPASS (Complex proteins associated with Set1p) component shg1. The Shg1 subunit is one of the eight subunits of the COMPASS complex, complex associated with SET1, conserved in yeasts and in other eukaryotes up to humans. It is associated with the region of the Set1 protein that is N-terminal to the C-terminus, ie Set1-560-900. The function of Shg1 seems to be to slightly inhibit histone 3 lysine 4 (H3K4) di- and tri-methylation, and it is a pioneer protein. The COMPASS complex functions to methylate the fourth lysine of Histone 3 and for silencing of genes close to the telomeres of chromosomes.	100
398743	pfam05206	TRM13	Methyltransferase TRM13. This is a family of eukaryotic proteins which are responsible for 2'-O-methylation of tRNA at position 4. TRM13 shows no sequence similarity to other known methyltransferases.	256
398744	pfam05207	zf-CSL	CSL zinc finger. This is a zinc binding motif which contains four cysteine residues which chelate zinc. This domain is often found associated with a pfam00226 domain. This domain is named after the conserved motif of the final cysteine.	55
398745	pfam05208	ALG3	ALG3 protein. The formation of N-glycosidic linkages of glycoproteins involves the ordered assembly of the common Glc3Man9GlcNAc2 core-oligosaccharide on the lipid carrier dolichyl pyrophosphate. Whereas early mannosylation steps occur on the cytoplasmic side of the endoplasmic reticulum with GDP-Man as donor, the final reactions from Man5GlcNAc2-PP-Dol to Man9GlcNAc2-PP-Dol on the lumenal side use Dol-P-Man. ALG3 gene encodes the Dol-P-Man:Man5GlcNAc2-PP-Dol mannosyltransferase.	358
282993	pfam05209	MinC_N	Septum formation inhibitor MinC, N-terminal domain. In Escherichia coli FtsZ assembles into a Z ring at midcell while assembly at polar sites is prevented by the min system. MinC, a component of this system, is an inhibitor of FtsZ assembly that is positioned within the cell by interaction with MinDE. MinC is an oligomer, probably a dimer. The C terminal half of MinC is the most conserved and interacts with MinD. The N terminal half is thought to interact with FtsZ.	104
398746	pfam05210	Sprouty	Sprouty protein (Spry). This family consists of eukaryotic Sprouty protein homologs. Sprouty proteins have been revealed as inhibitors of the Ras/mitogen-activated protein kinase (MAPK) cascade, a pathway crucial for developmental processes initiated by activation of various receptor tyrosine kinases. The sprouty gene has been found to be expressed in the brain, cochlea, nasal organs, teeth, salivary gland, lungs, digestive tract, kidneys and limb buds in mice.	101
398747	pfam05211	NLBH	Neuraminyllactose-binding hemagglutinin precursor (NLBH). This family is comprised of several flagellar sheath adhesin proteins also called neuraminyllactose-binding hemagglutinin precursor (NLBH) or N-acetylneuraminyllactose-binding fibrillar hemagglutinin receptor-binding subunits. NLBH is found exclusively in Helicobacter which are gut colonising bacteria and bind to sialic acid rich macromolecules present on the gastric epithelium.	229
398748	pfam05212	DUF707	Protein of unknown function (DUF707). This family consists of several uncharacterized proteins from Arabidopsis thaliana.	292
282997	pfam05213	Corona_NS2A	Coronavirus NS2A protein. This family contains a number of corona virus non-structural proteins of unknown function. The family also includes a polymerase protein fragment from Berne virus and does not seem to be related to the pfam04753 Coronavirus NS2 family. This family is part of the 2H phosphoesterase superfamily.	267
282998	pfam05214	Baculo_p33	Baculovirus P33. This family consists of a series of Baculovirus P33 protein homologs of unknown function.	247
253093	pfam05215	Spiralin	Spiralin. This family consists of Spiralin proteins found in spiroplasma bacteria. Spiroplasmas are helically shaped pathogenic bacteria related to the mycoplasmas. The surface of spiroplasma bacteria is crowded with the membrane-anchored lipoprotein spiralin whose structure and function are unknown although its cellular function is thought to be a structural and mechanical one rather than a catalytic one.	239
398749	pfam05216	UNC-50	UNC-50 family. Gmh1p from S. cerevisiae is located in the Golgi membrane and interacts with ARF exchange factors.	223
310081	pfam05217	STOP	STOP protein. Neurons contain abundant subsets of highly stable microtubules that resist de-polymerising conditions such as exposure to the cold. Stable microtubules are thought to be essential for neuronal development, maintenance, and function. STOP is a major factor responsible for the intriguing stability properties of neuronal microtubules and is important for synaptic plasticity. Additionally, knowledge of STOP's function and properties may help in the treatment of neuroleptics in illnesses such as schizophrenia, currently thought to result from synaptic defects.	35
368344	pfam05218	DUF713	Protein of unknown function (DUF713). This family contains several proteins of unknown function from C.elegans. The GO annotation suggests that this protein is involved in nematode development and has a positive regulation on growth rate.	185
253097	pfam05219	DREV	DREV methyltransferase. This family contains DREV protein homologs from several eukaryotes. The function of this protein is unknown. However, these proteins appear to be related to other methyltransferases (Bateman A pers obs).	265
283002	pfam05220	MgpC	MgpC protein precursor. This family contains several Mycoplasma MgpC like-proteins.	226
398750	pfam05221	AdoHcyase	S-adenosyl-L-homocysteine hydrolase. 	461
398751	pfam05222	AlaDh_PNT_N	Alanine dehydrogenase/PNT, N-terminal domain. This family now also contains the lysine 2-oxoglutarate reductases.	136
398752	pfam05223	MecA_N	NTF2-like N-terminal transpeptidase domain. The structure of this domain from MecA is known and is found to be similar to that found in NTF2 pfam02136. This domain seems unlikely to have an enzymatic function, and its role remains unknown.	117
398753	pfam05224	NDT80_PhoG	NDT80 / PhoG like DNA-binding family. This family includes the DNA-binding region of NDT80 as well as PhoG and its homologs. The family contains VIB-1. VIB-1 is thought to be a regulator of conidiation in Neurospora crassa and shares a region of similarity to PHOG, a possible phosphate nonrepressible acid phosphatase in Aspergillus nidulans. It has been found that vib-1 is not the structural gene for nonrepressible acid phosphatase, but rather may regulate nonrepressible acid phosphatase activity.	180
283007	pfam05225	HTH_psq	helix-turn-helix, Psq domain. This DNA-binding motif is found in four copies in the pipsqueak protein of Drosophila melanogaster. In pipsqueak this domain binds to GAGA sequence.	45
398754	pfam05226	CHASE2	CHASE2 domain. CHASE2 is an extracellular sensory domain, which is present in various classes of transmembrane receptors that are parts of signal transduction pathways in bacteria. Specifically, CHASE2 domains are found in histidine kinases, adenylate cyclases, serine/threonine kinases and predicted diguanylate cyclases/phosphodiesterases. Environmental factors that are recognized by CHASE2 domains are not known at this time.	266
398755	pfam05227	CHASE3	CHASE3 domain. CHASE3 is an extracellular sensory domain, which is present in various classes of transmembrane receptors that are parts of signal transduction pathways in bacteria. Specifically, CHASE3 domains are found in histidine kinases, adenylate cyclases, methyl-accepting chemotaxis proteins and predicted diguanylate cyclases/phosphodiesterases. Environmental factors that are recognized by CHASE3 domains are not known at this time.	138
398756	pfam05228	CHASE4	CHASE4 domain. CHASE4. This is an extracellular sensory domain, which is present in various classes of transmembrane receptors that are parts of signal transduction pathways in prokaryotes. Specifically, CHASE4 domains are found in histidine kinases in Archaea and in predicted diguanylate cyclases/phosphodiesterases in Bacteria. Environmental factors that are recognized by CHASE4 domains are not known at this time.	142
398757	pfam05229	SCPU	Spore Coat Protein U domain. This domain is found in a bacterial family of spore coat proteins, as well as a family of secreted pili proteins involved in motility and biofilm formation. This family is distantly related to fimbrial proteins.	139
398758	pfam05230	MASE2	MASE2 domain. Predicted integral membrane sensory domain found in histidine kinases, diguanylate cyclases and other bacterial signaling proteins.	85
398759	pfam05231	MASE1	MASE1. Predicted integral membrane sensory domain found in histidine kinases, diguanylate cyclases and other bacterial signaling proteins. This entry also includes members of the 8 transmembrane UhpB type (8TMR-UT) domain family.	299
398760	pfam05232	BTP	Chlorhexidine efflux transporter. This family represents a conserved pair of two transmembrane alpha-helices. All members carry the two pairs of TMs. BTP is a form of drug efflux pump that actively transports chlorhexidine out of the cell. Chlorhexidine, a bisbiguanide antimicrobial agent, is commonly used as an antiseptic and disinfectant in hospitals, and there is an increasing problem with resistance to it in some pathogenic bacteria. BTP is localized in the cytoplasmic membrane.	63
398761	pfam05233	PHB_acc	PHB accumulation regulatory domain. The proteins this domain is found in are typically involved in regulating polymer accumulation in bacteria, particularly poly-beta-hydroxybutyrate (PHB). The N-terminal region is likely to be the DNA-binding domain (pfam07879) while this domain probably binds PHB (personal obs:C Yeats).	40
113985	pfam05234	UAF_Rrn10	UAF complex subunit Rrn10. The protein Rrn10 has been identified as a component of the Upstream Activating Factor (UAF), an RNA polymerase I (pol I) specific transcription stimulatory factor.	122
398762	pfam05235	CHAD	CHAD domain. The CHAD domain is an alpha-helical domain functionally associated with the pfam01928 domains. It has conserved histidines that may chelate metals.	164
398763	pfam05236	TAF4	Transcription initiation factor TFIID component TAF4 family. This region of similarity is found in Transcription initiation factor TFIID component TAF4.	259
398764	pfam05238	CENP-N	Kinetochore protein CHL4 like. CHL4 is a protein involved in chromosome segregation. It is a component of the central kinetochore which mediates the attachment of the centromere to the mitotic spindle. CENP-N is one of the components that assembles onto the CENP-A-nucleosome-associated (NAC) centromere. The centromere, which is the basic element of chromosome inheritance, is epigenetically determined in mammals. CENP-A, the centromere-specific histone H3 variant, assembles an array of nucleosomes and it is this that seems to be the prime candidate for specifying centromere identity. CENP-A nucleosomes directly recruit a proximal CENP-A nucleosome associated complex (NAC) comprised of CENP-M, CENP-N and CENP-T, CENP-U(50), CENP-C and CENP-H. Assembly of the CENP-A NAC at centromeres is dependent on CENP-M, CENP-N and CENP-T. Additionally, there are seven other subunits which make up the CENP-A-nucleosome distal (CAD) centromere, CENP-K, CENP-L, CENP-O, CENP-P, CENP-Q, CENP-R and CENP-S, also assembling on the CENP-A NAC.	403
398765	pfam05239	PRC	PRC-barrel domain. The PRC-barrel is an all beta barrel domain found in photosystem reaction centre subunit H of the purple bacteria and RNA metabolism proteins of the RimM group. PRC-barrels are approximately 80 residues long, and found widely represented in bacteria, archaea and plants. This domain is also present at the carboxyl terminus of the pan-bacterial protein RimM, which is involved in ribosomal maturation and processing of 16S rRNA. A family of small proteins conserved in all known euryarchaea are composed entirely of a single stand-alone copy of the domain.	78
398766	pfam05240	APOBEC_C	APOBEC-like C-terminal domain. This domain is found at the C-termini of the Apolipoprotein B mRNA editing enzyme.	78
398767	pfam05241	EBP	Emopamil binding protein. Emopamil binding protein (EBP) is a gene that encodes a non-glycosylated type I integral membrane protein of endoplasmic reticulum and shows high level expression in epithelial tissues. The EBP protein has emopamil binding domains, including the sterol acceptor site and the catalytic centre, which show Delta8-Delta7 sterol isomerase activity. Human sterol isomerase, a homolog of mouse EBP, is suggested not only to play a role in cholesterol biosynthesis, but also to affect lipoprotein internalisation. In humans, mutations of EBP are known to cause the genetic disorder of X-linked dominant chondrodysplasia punctata (CDPX2). This syndrome of humans is lethal in most males, and affected females display asymmetric hyperkeratotic skin and skeletal abnormalities.	113
368355	pfam05242	GLYCAM-1	Glycosylation-dependent cell adhesion molecule 1 (GlyCAM-1). This family consists of the lactophorin precursors proteose peptone component 3 (PP3) and glycosylation-dependent cell adhesion molecule 1 (GlyCAM-1). GlyCAM-1 functions as a ligand for L-selectin, a saccharide-binding protein on the surface of circulating leukocytes, and mediates the trafficking of blood-born lymphocytes into secondary lymph nodes. In this context, sulphatation of the carbohydrates of GlyCAM-1 has been shown to be a critical structural requirement to be recognized by L-selectin. GlyCAM-1 is also expressed in pregnant and lactating mammary glands of mouse and in an unknown site in the lung, in the bovine uterus and rat cochlea.	135
113995	pfam05244	Brucella_OMP2	Brucella outer membrane protein 2. This family consists of several outer membrane proteins (2a and 2b) from brucella bacteria. Brucellae are Gram-negative, facultative intracellular bacteria that can infect many species of animals and man.	240
283023	pfam05246	DUF735	Protein of unknown function (DUF735). This family consists of several uncharacterized Borrelia burgdorferi (Lyme disease spirochete) proteins of unknown function.	211
398768	pfam05247	FlhD	Flagellar transcriptional activator (FlhD). This family consists of several bacterial flagellar transcriptional activator (FlhD) proteins. FlhD combines with FlhC to form a regulatory complex in E. coli; this complex has been shown to be a global regulator involved in many cellular processes as well as a flagellar transcriptional activator.	102
310102	pfam05248	Adeno_E3A	Adenovirus E3A. 	104
398769	pfam05250	UPF0193	Uncharacterized protein family (UPF0193). This family of proteins is functionally uncharacterized.	215
398770	pfam05251	Ost5	Oligosaccharyltransferase subunit 5. Eukaryotic N-glycosylation is catalyzed in the ER lumen, where the enzyme oligosaccharyltransferase (OTase) transfers donor glycans from a dolichol pyrophosphate (DolP) carrier (Lipid-linked oligosaccharide; LLO) to polypeptides. The yeast OTase is a hetero-oligomeric complex composed of essential (Ost1, Ost2, Wbp1, Stt3, and Swp1) and nonessential (Ost3, Ost4, Ost5, and Ost6) subunits. This domain family is found in Ost5. The precise function of this subunit is not known, however Ost5 appears to form a sub-complex with Ost1, and this sub-complex associates with the catalytic Stt3 subunit of OTase. Down regulation of Ost5 resulted in a limited effect on glycosylation and no effect on the stability of Ost1 or Stt3 subunits.	73
398771	pfam05253	zf-U11-48K	U11-48K-like CHHC zinc finger. This zinc binding domain has four conserved zinc chelating residues in a CHHC pattern. This domain is predicted to have an RNA-binding function.	24
398772	pfam05254	UPF0203	Uncharacterized protein family (UPF0203). This family of proteins is functionally uncharacterized.	69
398773	pfam05255	UPF0220	Uncharacterized protein family (UPF0220). This family of proteins is functionally uncharacterized.	160
398774	pfam05256	UPF0223	Uncharacterized protein family (UPF0223). This family of proteins is functionally uncharacterized.	85
398775	pfam05257	CHAP	CHAP domain. This domain corresponds to an amidase function. Many of these proteins are involved in cell wall metabolism of bacteria. This domain is found at the N-terminus of Escherichia coli gss, where it functions as a glutathionylspermidine amidase EC:3.5.1.78. This domain is found to be the catalytic domain of PlyCA. CHAP is the amidase domain of bifunctional Escherichia coli glutathionylspermidine synthetase/amidase, and it catalyzes the hydrolysis of Gsp (glutathionylspermidine) into glutathione and spermidine.	83
398776	pfam05258	DUF721	Protein of unknown function (DUF721). This family contains several actinomycete proteins of unknown function.	88
283034	pfam05259	Herpes_UL1	Herpesvirus glycoprotein L. This family consists of several herpesvirus glycoprotein L or UL1 proteins. Glycoprotein L is known to form a complex with glycoprotein H but the function of this complex is poorly understood.	103
398777	pfam05261	Tra_M	TraM protein, DNA-binding. The TraM protein is an essential part of the DNA transfer machinery of the conjugative resistance plasmid R1 (IncFII). On the basis of mutational analyses, it was shown that the essential transfer protein TraM has at least two functions. First, a functional TraM protein was found to be required for normal levels of transfer gene expression. Second, experimental evidence was obtained that TraM stimulates efficient site-specific single-stranded DNA cleavage at the oriT, in vivo. Furthermore, a specific interaction of the cytoplasmic TraM protein with the membrane protein TraD was demonstrated, suggesting that the TraM protein creates a physical link between the relaxosomal nucleoprotein complex and the membrane-bound DNA transfer apparatus.	126
114011	pfam05262	Borrelia_P83	Borrelia P83/100 protein. This family consists of several Borrelia P83/P100 antigen proteins.	489
283036	pfam05263	DUF722	Protein of unknown function (DUF722). This family contains several bacteriophage proteins of unknown function.	129
310112	pfam05264	CfAFP	Choristoneura fumiferana antifreeze protein (CfAFP). This family consists of several antifreeze proteins from the insect Choristoneura fumiferana (Spruce budworm). Antifreeze proteins (AFPs) and antifreeze glycoproteins (AFGPs) are present in many organisms that must survive sub-zero temperatures. These proteins bind to seed ice crystals and inhibit their growth through an adsorption-inhibition mechanism.	137
283037	pfam05265	DUF723	Protein of unknown function (DUF723). This family contains several uncharacterized proteins from Neisseria meningitidis. These proteins may have a role in DNA-binding.	60
398778	pfam05266	DUF724	Protein of unknown function (DUF724). This family contains several uncharacterized proteins found in Arabidopsis thaliana and other plants. This region is often found associated with Agenet domains and may contain coiled-coil.	188
398779	pfam05267	DUF725	Protein of unknown function (DUF725). This family contains several Drosophila proteins of unknown function.	121
147458	pfam05268	GP38	Phage tail fibre adhesin Gp38. This family contains several Gp38 proteins from T-even-like phages. Gp38, together with a second phage protein, gp57, catalyzes the organisation of gp37 but is absent from the phage particle. Gp37 is responsible for receptor recognition.	261
398780	pfam05269	Phage_CII	Bacteriophage CII protein. This family consists of several phage CII regulatory proteins. CII plays a key role in the lysis-lysogeny decision in bacteriophage lambda and related phages.	79
398781	pfam05270	AbfB	Alpha-L-arabinofuranosidase B (ABFB) domain. This family consists of several fungal alpha-L-arabinofuranosidase B proteins. L-Arabinose is a constituent of plant-cell-wall polysaccharides. It is found in a polymeric form in L-arabinan, in which the backbone is formed by 1,5-a-linked l-arabinose residues that can be branched via 1,2-a- and 1,3-a-linked l-arabinofuranose side chains. AbfB hydrolyzes 1,5-a, 1,3-a and 1,2-a linkages in both oligosaccharides and polysaccharides, which contain terminal non-reducing l-arabinofuranoses in side chains.	137
147459	pfam05271	Tobravirus_2B	Tobravirus 2B protein. This family consists of several tobravirus 2B proteins. It is known that the 2B protein is required for transmission by both Paratrichodorus pachydermus and P. anemones nematodes.	117
398782	pfam05272	VirE	Virulence-associated protein E. This family contains several bacterial virulence-associated protein E like proteins. These proteins contain a P-loop motif.	217
398783	pfam05273	Pox_RNA_Pol_22	Poxvirus RNA polymerase 22 kDa subunit. This family consists of several poxvirus DNA-dependent RNA polymerase 22 kDa subunits.	184
283043	pfam05274	Baculo_E25	Occlusion-derived virus envelope protein E25. This family consists of several nucleopolyhedrovirus occlusion-derived virus envelope E25 proteins.	190
398784	pfam05275	CopB	Copper resistance protein B precursor (CopB). This family consists of several bacterial copper resistance proteins. Copper is essential and serves as cofactor for more than 30 enzymes yet a surplus of copper is toxic and leads to radical formation and oxidation of biomolecules. Therefore, copper homeostasis is a key requisite for every organism. CopB serves to extrude copper when it approaches toxic levels.	207
398785	pfam05276	SH3BP5	SH3 domain-binding protein 5 (SH3BP5). This family consists of several eukaryotic SH3 domain-binding protein 5 or c-Jun N-terminal kinase (JNK)-interacting proteins (SH3BP5 or Sab). Sab binds to and serves as a substrate for JNK in vitro, and has been found to interact with the Src homology 3 (SH3) domain of Bruton's tyrosine kinase (Btk). Inspection of the sequence of Sab reveals the presence of two putative mitogen-activated protein kinase interaction motifs (KIMs) similar to that found in the JNK docking domain of the c-Jun transcription factor, and four potential serine-proline JNK phosphorylation sites in the C-terminal half of the molecule.	231
368366	pfam05277	DUF726	Protein of unknown function (DUF726). This family consists of several uncharacterized eukaryotic proteins.	341
253129	pfam05278	PEARLI-4	Arabidopsis phospholipase-like protein (PEARLI 4). This family contains several phospholipase-like proteins from Arabidopsis thaliana which are homologous to PEARLI 4.	234
191249	pfam05279	Asp-B-Hydro_N	Aspartyl beta-hydroxylase N-terminal region. This family includes the N-terminal regions of the junctin, junctate and aspartyl beta-hydroxylase proteins. Junctate is an integral ER/SR membrane calcium binding protein, which comes from an alternatively spliced form of the same gene that generates aspartyl beta-hydroxylase and junctin. Aspartyl beta-hydroxylase catalyzes the post-translational hydroxylation of aspartic acid or asparagine residues contained within epidermal growth factor (EGF) domains of proteins.	240
398786	pfam05280	FlhC	Flagellar transcriptional activator (FlhC). This family consists of several bacterial flagellar transcriptional activator (FlhC) proteins. FlhC combines with FlhD to form a regulatory complex in E. coli; this complex has been shown to be a global regulator involved in many cellular processes as well as a flagellar transcriptional activator.	171
368368	pfam05281	Secretogranin_V	Neuroendocrine protein 7B2 precursor (Secretogranin V). The neuroendocrine protein 7B2 has a critical role in the proteolytic conversion and activation of proPC2, the enzyme responsible for the proteolytic conversion of many peptide hormone precursors. The 7B2 protein acts as an intracellular binding protein for proPC2, facilitates its maturation, and is required for its enzymatic activity. Processing of many important peptide precursors does not occur in 7B2 nulls. 7B2 null mice exhibit a unique form of Cushing's disease with many atypical symptoms, such as hypoglycemia.	230
398787	pfam05282	AAR2	AAR2 protein. This family consists of several eukaryotic AAR2-like proteins. The yeast protein AAR2 is involved in splicing pre-mRNA of the a1 cistron and other genes that are important for cell growth.	355
368370	pfam05283	MGC-24	Multi-glycosylated core protein 24 (MGC-24), sialomucin. This family consists of several MGC-24 (or Cd164 antigen) proteins from eukaryotic organisms. MGC-24/CD164 is a sialomucin expressed in many normal and cancerous tissues. In humans, soluble and transmembrane forms of MGC-24 are produced by alternative splicing.	140
398788	pfam05284	DUF736	Protein of unknown function (DUF736). This family consists of several uncharacterized bacterial proteins of unknown function.	98
398789	pfam05285	SDA1	SDA1. This family consists of several SDA1 protein homologs. SDA1 is a Saccharomyces cerevisiae protein which is involved in the control of the actin cytoskeleton. The protein is essential for cell viability and is localized in the nucleus.	288
283053	pfam05287	PMG	PMG protein. This family consists of several mouse anagen-specific protein mKAP13 (PMG1 and PMG2). PMG1 and 2 contain characteristic repeats reminiscent of the keratin-associated proteins (KAPs). Both genes are expressed in growing hair follicles in skin as well as in sebaceous and eccrine sweat glands. Interestingly, expression is also detected in the mammary epithelium where it is limited to the onset of the pubertal growth phase and is independent of ovarian hormones. Their broad, developmentally controlled expression pattern, together with their unique amino acid composition, demonstrate that pmg-1 and pmg-2 constitute a novel KAP gene family participating in the differentiation of all epithelial cells forming the epidermal appendages.	180
283054	pfam05288	Pox_A3L	Poxvirus A3L Protein. This family consists of several poxvirus A3L or A2_5L proteins.	70
368373	pfam05289	BLYB	Borrelia hemolysin accessory protein. This family consists of several borrelia hemolysin accessory proteins (BLYB). BLYB was thought to be an accessory protein, which was proposed to comprise a hemolysis system but it is now thought that BlyA and BlyB function instead as a prophage-encoded holin or holin-like system.	120
368374	pfam05290	Baculo_IE-1	Baculovirus immediate-early protein (IE-0). The Autographa californica multinucleocapsid nuclear polyhedrosis virus (AcMNPV) ie-1 gene product (IE-1) is thought to play a central role in stimulating early viral transcription. IE-1 has been demonstrated to activate several early viral gene promoters and to negatively regulate the promoters of two other AcMNPV regulatory genes, ie-0 and ie-2. It is thought that IE-1 negatively regulates the expression of certain genes by binding directly, or as part of a complex, to promoter regions containing a specific IE-1-binding motif (5'-ACBYGTAA-3') near their mRNA start sites.	141
398790	pfam05291	Bystin	Bystin. Trophinin and tastin form a cell adhesion molecule complex that potentially mediates an initial attachment of the blastocyst to uterine epithelial cells at the time of implantation. Trophinin and tastin bind to an intermediary cytoplasmic protein called bystin. Bystin may be involved in implantation and trophoblast invasion because bystin is found with trophinin and tastin in the cells at human implantation sites and also in the intermediate trophoblasts at invasion front in the placenta from early pregnancy. This family also includes the yeast protein ENP1. ENP1 is an essential protein in Saccharomyces cerevisiae and is localized in the nucleus. It is thought that ENP1 plays a direct role in the early steps of rRNA processing as enp1 defective yeast cannot synthesize 20S pre-rRNA and hence 18S rRNA, which leads to reduced formation of 40S ribosomal subunits.	289
398791	pfam05292	MCD	Malonyl-CoA decarboxylase C-terminal domain. This family consists of several eukaryotic malonyl-CoA decarboxylase (MLYCD) proteins. Malonyl-CoA, in addition to being an intermediate in the de novo synthesis of fatty acids, is an inhibitor of carnitine palmitoyltransferase I, the enzyme that regulates the transfer of long-chain fatty acyl-CoA into mitochondria, where they are oxidized. After exercise, malonyl-CoA decarboxylase participates with acetyl-CoA carboxylase in regulating the concentration of malonyl-CoA in liver and adipose tissue, as well as in muscle. Malonyl-CoA decarboxylase is regulated by AMP-activated protein kinase (AMPK).	245
114041	pfam05293	ASFV_L11L	African swine fever virus (ASFV) L11L protein. L11L is an integral membrane protein of the African swine fever virus (ASFV) which is expressed late in the virus replication cycle. The protein is thought to be non-essential for growth in vitro and for virus virulence in domestic swine.	78
253137	pfam05294	Toxin_5	Scorpion short toxin. This family contains various secreted scorpion short toxins and seems to be unrelated to pfam00451.	32
253138	pfam05295	Luciferase_N	Luciferase/LBP N-terminal domain. This family consists of a presumed N-terminal domain that is conserved between dinoflagellate luciferase and luciferin binding proteins. Luciferase is involved in catalyzing the light emitting reaction in bioluminescence and luciferin binding protein (LBP) is known to bind to luciferin (the substrate for luciferase) to stop it reacting with the enzyme and therefore switching off the bioluminescence function. The expression of these two proteins is controlled by a circadian clock at the translational level, with synthesis and degradation occurring on a daily basis. However, this domain is not the catalytic part of the protein. It has been suggested that this region may mediate an interaction between LBP and Luciferase or their association with the vacuolar membrane.	82
283059	pfam05296	TAS2R	Taste receptor protein (TAS2R). This family consists of several forms of eukaryotic taste receptor proteins (TAS2Rs). TAS2Rs are G protein-coupled receptors expressed in subsets of taste receptor cells of the tongue and palate epithelia in humans and mice, and are organized in the genome in clusters. The proteins are genetically linked to loci that influence bitter perception in mice and humans.	303
283060	pfam05297	Herpes_LMP1	Herpesvirus latent membrane protein 1 (LMP1). This family consists of several latent membrane protein 1 or LMP1s mostly from Epstein-Barr virus. LMP1 of EBV is a 62-65 kDa plasma membrane protein possessing six membrane spanning regions, a short cytoplasmic N-terminus and a long cytoplasmic carboxy tail of 200 amino acids. EBV latent membrane protein 1 (LMP1) is essential for EBV-mediated transformation and has been associated with several cases of malignancies. EBV-like viruses in Cynomolgus monkeys (Macaca fascicularis) have been associated with high lymphoma rates in immunosuppressed monkeys.	386
114046	pfam05298	Bombinin	Bombinin. This family consists of Bombinin and Maximin proteins from Bombina maxima (Chinese red belly toad). Two groups of antimicrobial peptides have been isolated from skin secretions of Bombina maxima. Peptides in the first group, named maximins 1, 2, 3, 4 and 5, are structurally related to bombinin-like peptides (BLPs). Unlike BLPs, sequence variations in maximins occurred all through the molecules. In addition to the potent antimicrobial activity, cytotoxicity against tumor cells and spermicidal action of maximins, maximin 3 possessed a significant anti-HIV activity. Maximins 1 and 3 have been found to be toxic to mice. Peptides in the second group, termed maximins H1, H2, H3 and H4, are homologous with bombinin H peptides.	141
398792	pfam05299	Peptidase_M61	M61 glycyl aminopeptidase. Glycyl aminopeptidase is an unusual peptidase in that it has a preference for substrates with an N-terminal glycine or alanine. These proteins are found in Bacteria and in Archaea.	116
398793	pfam05300	DUF737	Protein of unknown function (DUF737). This family consists of several uncharacterized mammalian proteins of unknown function.	142
398794	pfam05301	Acetyltransf_16	GNAT acetyltransferase, Mec-17. Mec-17 is the protein product of one of the 18 genes required for the development and function of the touch receptor neuron for gentle touch. Mec-17 is specifically required for maintaining the differentiation of the touch receptor. The family shares all the residue-motifs characteristic of Gcn5-related acetyl-transferases, though the exact function is still unknown.	176
310131	pfam05302	DUF720	Protein of unknown function (DUF720). This family consists of several uncharacterized Chlamydia proteins of unknown function.	128
398795	pfam05303	DUF727	Protein of unknown function (DUF727). This family consists of several uncharacterized eukaryotic proteins of unknown function.	103
283066	pfam05304	DUF728	Protein of unknown function (DUF728). This family consists of several uncharacterized tobravirus proteins of unknown function.	139
398796	pfam05305	DUF732	Protein of unknown function (DUF732). This family consists of several uncharacterized Mycobacterium tuberculosis and leprae proteins of unknown function.	72
368381	pfam05306	DUF733	Protein of unknown function (DUF733). This family consists of several uncharacterized Drosophila melanogaster proteins of unknown function.	85
398797	pfam05307	Bundlin	Bundlin. This family consists of several bundlin proteins from E. coli. Bundlin is a type IV pilin protein that is the only known structural component of enteropathogenic Escherichia coli bundle-forming pili (BFP). BFP play a role in virulence, antigenicity, autoaggregation, and localized adherence to epithelial cells. These proteins contain an N-terminal methylation motif.	60
398798	pfam05308	Mito_fiss_reg	Mitochondrial fission regulator. In eukaryotes, this family of proteins induces mitochondrial fission.	242
398799	pfam05309	TraE	TraE protein. This family consists of several bacterial sex pilus assembly and synthesis proteins (TraE). Conjugal transfer of plasmids from donor to recipient cells is a complex process in which a cell-to-cell contact plays a key role. Many genes encoded by self-transmissible plasmids are required for various processes of conjugation, including pilus formation, stabilisation of mating pairs, conjugative DNA metabolism, surface exclusion and regulation of transfer gene expression. The exact function of the TraE protein is unknown.	182
191255	pfam05310	Tenui_NS3	Tenuivirus movement protein. This family of ssRNA negative-strand crop plant tenuivirus proteins appears to combine PV2, NS2, NS3, and PV3 proteins. Plant viruses encode specific proteins known as movement proteins (MPs) to control their spread through plasmodesmata (PD) in walls between cells as well as from leaf to leaf via vascular-dependent transport. During this movement process, the virally encoded MPs interact with viral genomes for transport from the viral replication sites to the PDs in the walls of infected cells along the cytoskeleton and/or endoplasmic reticulum (ER) network. The virus is then thought to move through the PDs in the form of MP-associated ribonucleoprotein complexes or as virions. The NS3 protein appears to function as an RNA silencing suppressor.	186
253146	pfam05311	Baculo_PP31	Baculovirus 33KDa late protein (PP31). Autographa californica nuclear polyhedrosis virus (AcMNPV) pp31 is a nuclear phosphoprotein that accumulates in the virogenic stroma, which is the viral replication centre in the infected-cell nucleus, binds to DNA, and serves as a late expression factor.	267
283071	pfam05313	Pox_P21	Poxvirus P21 membrane protein. The P21 membrane protein of vaccinia virus, encoded by the A17L (or A18L) gene, has been reported to localize on the inner of the two membranes of the intracellular mature virus (IMV). It has also been shown that P21 acts as a membrane anchor for the externally located fusion protein P14 (A27L gene).	189
283072	pfam05314	Baculo_ODV-E27	Baculovirus occlusion-derived virus envelope protein EC27. This family consists of several baculovirus occlusion-derived virus envelope proteins (EC27 or E27). The ODV-E27 protein has distinct functional characteristics compared to cellular and viral cyclins. Depending on the cdk protein, and perhaps other viral or cellular proteins yet to be described, the kinase-EC27 complex may have either cyclin B- or D-like activity.	295
398800	pfam05315	ICEA	ICEA Protein. This family consists of several ICEA proteins from Helicobacter pylori. Helicobacter pylori infection causes gastritis and peptic ulcer disease, and is classified as a definite carcinogen of gastric cancer. ICEA1 is speculated to be associated with peptic ulcer disease.	218
283074	pfam05316	VAR1	Mitochondrial ribosomal protein (VAR1). This family consists of the yeast mitochondrial ribosomal proteins VAR1. Mitochondria possess their own ribosomes responsible for the synthesis of a small number of proteins encoded by the mitochondrial genome. In yeast the two ribosomal RNAs and a single ribosomal protein, VAR1, are products of mitochondrial genes, and the remaining approximately 80 ribosomal proteins are encoded in the nucleus. VAR1 along with 15S rRNA are necessary for the formation of mature 37S subunits.	337
368384	pfam05317	Thermopsin	Thermopsin. This family consists of several thermopsin proteins from archaebacteria. Thermopsin is a thermostable acid protease which is capable of hydrolysing the following bonds: Leu-Val, Leu-Tyr, Phe-Phe, Phe-Tyr, and Tyr-Thr. The specificity of thermopsin is therefore similar to that of pepsin, that is, it prefers large hydrophobic residues at both sides of the scissile bond.	253
253150	pfam05318	Tombus_movement	Tombusvirus movement protein. This family consists of several Tombusvirus movement proteins. These proteins allow the virus to move from cell-to-cell and allow host-specific systemic spread.	68
398801	pfam05320	Pox_RNA_Pol_19	Poxvirus DNA-directed RNA polymerase 19 kDa subunit. This family contains several DNA-directed RNA polymerase 19 kDa polypeptides. The Poxvirus DNA-directed RNA polymerase (EC: 2.7.7.6) catalyzes DNA-template-directed extension of the 3'-end of an RNA strand by one nucleotide at a time.	164
398802	pfam05321	HHA	Haemolysin expression modulating protein. This family consists of haemolysin expression modulating protein (HHA) homologs. YmoA and Hha are highly similar bacterial proteins downregulating gene expression in Yersinia enterocolitica and Escherichia coli, respectively.	56
283078	pfam05322	NinE	NINE Protein. This family consists of NINE proteins from several bacteriophages and from E. coli.	58
283079	pfam05323	Pox_A21	Poxvirus A21 Protein. This family consists of several poxvirus A21 proteins.	111
398803	pfam05324	Sperm_Ag_HE2	Sperm antigen HE2. This family consists of several variants of the human and chimpanzee sperm antigen proteins (HE2 and EP2 respectively). The EP2 gene codes for a family of androgen-dependent, epididymis-specific secretory proteins. The EP2 gene uses alternative promoters and differential splicing to produce a family of variant messages. The translated putative protein variants differ significantly from each other. Some of these putative proteins have similarity to beta-defensins, a family of antimicrobial peptides.	70
114071	pfam05325	DUF730	Protein of unknown function (DUF730). This family consists of several uncharacterized Arabidopsis thaliana proteins of unknown function.	122
398804	pfam05326	SVA	Seminal vesicle autoantigen (SVA). This family consists of seminal vesicle autoantigen and prolactin-inducible (PIP) proteins. Seminal vesicle autoantigen (SVA) is specifically present in the seminal plasma of mice. This 19-kDa secretory glycoprotein suppresses the motility of spermatozoa by interacting with phospholipid. PIP has several known functions. In saliva, this protein plays a role in host defense by binding to microorganisms such as Streptococcus. PIP is an aspartyl proteinase and it acts as a factor capable of suppressing T-cell apoptosis through its interaction with CD4.	124
398805	pfam05327	RRN3	RNA polymerase I specific transcription initiation factor RRN3. This family consists of several eukaryotic proteins which are homologous to the yeast RRN3 protein. RRN3 is one of the RRN genes specifically required for the transcription of rDNA by RNA polymerase I (Pol I) in Saccharomyces cerevisiae.	543
368389	pfam05328	CybS	CybS, succinate dehydrogenase cytochrome B small subunit. This family consists of several eukaryotic succinate dehydrogenase [ubiquinone] cytochrome B small subunit, mitochondrial precursor (CybS) proteins. SDHD encodes the small subunit (cybS) of cytochrome b in succinate-ubiquinone oxidoreductase (mitochondrial complex II). Mitochondrial complex II is involved in the Krebs cycle and in the aerobic electron transport chain. It contains four proteins. The catalytic core consists of a flavoprotein and an iron-sulfur protein; these proteins are anchored to the mitochondrial inner membrane by the large subunit of cytochrome b (cybL) and cybS, which together comprise the heme-protein cytochrome b. Mutations in the SDHD gene can lead to hereditary paraganglioma, characterized by the development of benign, vascularised tumors in the head and neck.	133
398806	pfam05331	DUF742	Protein of unknown function (DUF742). This family consists of several uncharacterized Streptomyces proteins as well as one from Mycobacterium tuberculosis. The function of these proteins is unknown.	114
283085	pfam05332	DUF743	Protein of unknown function (DUF743). This family consists of several uncharacterized Calicivirus proteins of unknown function.	113
398807	pfam05334	DUF719	Protein of unknown function (DUF719). This family consists of several eukaryotic proteins of unknown function.	189
398808	pfam05335	DUF745	Protein of unknown function (DUF745). This family consists of several uncharacterized Drosophila melanogaster proteins of unknown function.	180
398809	pfam05336	rhaM	L-rhamnose mutarotase. This family contains L-rhamnose mutarotase which is a glycosyl hydrolase that converts the monosaccharide L-rhamnopyranose from the alpha to the beta stereoisomer. In Escherichia coli this enzyme is the product of the rhaM gene (also known as yiiL). The tertiary structure has been solved, in complex with L-rhamnose, and the catalytic mechanism determined. His22 is the proton donor. The enzyme naturally exists as a dimer.	100
398810	pfam05337	CSF-1	Macrophage colony stimulating factor-1 (CSF-1). Colony stimulating factor 1 (CSF-1) is a homodimeric polypeptide growth factor whose primary function is to regulate the survival, proliferation, differentiation, and function of cells of the mononuclear phagocytic lineage. This lineage includes mononuclear phagocytic precursors, blood monocytes, tissue macrophages, osteoclasts, and microglia of the brain, all of which possess cell surface receptors for CSF-1. The protein has also been linked with male fertility and mutations in the Csf-1 gene have been found to cause osteopetrosis and failure of tooth eruption. Structurally these are short-chain 4-helical cytokines.	140
283089	pfam05338	DUF717	Protein of unknown function (DUF717). This family consists of several herpesvirus proteins of unknown function.	55
283090	pfam05339	DUF739	Protein of unknown function (DUF739). This family contains several bacteriophage proteins. Some of the proteins in this family have been labeled putative cro repressor proteins.	69
398811	pfam05340	DUF740	Protein of unknown function (DUF740). This family consists of several uncharacterized plant proteins of unknown function.	610
283092	pfam05341	PIF6	Per os infectivity factor 6. Family members include Autographa californica nuclear polyhedrosis virus (AcMNPV) Orf68 (also known as per os infectivity factor 6, PIF6 or ac68). PIF6 is present in both the budded virus (BV) and the occluded-derived virus (ODV). The ac68 gene overlaps the lef3 gene which encodes the single-stranded DNA binding protein, and knockout experiments of ac68 have to ensure that a functional lef3 gene is present. In ac68KO experiments, viral DNA replication and BV levels were unaffected as were mortality rates if caterpillars were injected with BV directly into the hemolymph bypassing the gut. However, in oral bioassays the ac68KO occlusion bodies failed to kill larvae, indicating that PIF6 is a per os infectivity factor.	105
398812	pfam05342	Peptidase_M26_N	M26 IgA1-specific Metallo-endopeptidase N-terminal region. These peptidases, which cleave mammalian IgA, are found in Gram-positive bacteria. Often found associated with pfam00746, they may be attached to the cell wall.	250
398813	pfam05343	Peptidase_M42	M42 glutamyl aminopeptidase. These peptidases are found in Archaea and Bacteria. The example in Lactococcus lactis, PepA, aids growth on milk. Pyrococcus horikoshii contain a thermostable de-blocking aminopeptidase member of this family used commercially for N-terminal protein sequencing.	292
283095	pfam05344	DUF746	Domain of Unknown Function (DUF746). This is a short conserved region found in some transposons. Structural modelling suggests this domain may bind nucleic acids.	64
398814	pfam05345	He_PIG	Putative Ig domain. This alignment represents the conserved core region of a ~90 residue repeat found in several haemagglutinins and other cell surface proteins. Sequence similarities to (pfam02494) and (pfam00801) suggest an Ig-like fold (personal obs:C. Yeats). So this family may be similar in function to the (pfam02639) and (pfam02638) domains. This domain is also found in the WisP family of proteins of Tropheryma whipplei.	95
398815	pfam05346	DUF747	Eukaryotic membrane protein family. This family is a family of eukaryotic membrane proteins. It was previously annotated as including a putative receptor for human cytomegalovirus gH but this has since been disputed. Analysis of the mouse Tapt1 protein (transmembrane anterior posterior transformation 1) has shown it to be involved in patterning of the vertebrate axial skeleton.	311
398816	pfam05347	Complex1_LYR	Complex 1 protein (LYR family). Proteins in this family include an accessory subunit of the higher eukaryotic NADH dehydrogenase complex. In Saccharomyces cerevisiae, the Isd11 protein has been shown to play a role in Fe/S cluster biogenesis in mitochondria. We have named this family LYR after a highly conserved tripeptide motif close to the N-terminus of these proteins.	59
398817	pfam05348	UMP1	Proteasome maturation factor UMP1. UMP1 is a short-lived chaperone present in the precursor form of the 20S proteasome and absent in the mature complex. UMP1 is required for the correct assembly and enzymatic activation of the proteasome. UMP1 seems to be degraded by the proteasome upon its formation.	115
398818	pfam05349	GATA-N	GATA-type transcription activator, N-terminal. GATA transcription factors mediate cell differentiation in a diverse range of tissues. Mutations are often associated with certain congenital human disorders. The six classical vertebrate GATA proteins, GATA-1 to GATA-6, are highly homologous and have two tandem zinc fingers. The classical GATA transcription factors function as transcription activators. In lower metazoans GATA proteins carry a single canonical zinc finger. This family represents the N-terminal domain of the family of GATA transcription activators.	176
398819	pfam05350	GSK-3_bind	Glycogen synthase kinase-3 binding. Glycogen synthase kinase-3 (GSK-3) sequentially phosphorylates four serine residues on glycogen synthase (GS), in the sequence SxxxSxxxSxxx-SxxxS(p), by recognising and phosphorylating the first serine in the sequence motif SxxxS(P) (where S(p) represents a phosphoserine). Interaction of GSK-3 with a peptide derived from GSK-3 binding protein (this family) prevents GSK-3 interaction with Axin. This interaction thereby inhibits the Axin-dependent phosphorylation of beta-catenin by GSK-3.	237
398820	pfam05351	GMP_PDE_delta	GMP-PDE, delta subunit. GMP-PDE delta subunit was originally identified as a fourth subunit of rod-specific cGMP phosphodiesterase (PDE)(EC:3.1.4.35). The precise function of PDE delta subunit in the rod specific GMP-PDE complex is unclear. In addition, PDE delta subunit is not confined to photoreceptor cells but is widely distributed in different tissues. PDE delta subunit is thought to be a specific soluble transport factor for certain prenylated proteins and Arl2-GTP a regulator of PDE-mediated transport.	154
147504	pfam05352	Phage_connector	Phage Connector (GP10). The head-tail connector of bacteriophage phi29 is composed of twelve 36 kDa subunits with 12-fold symmetry. It is the central component of a rotary motor that packages the genomic dsDNA into pre-formed proheads. This motor consists of the head-tail connector, surrounded by a phi29-encoded, 174-base RNA and a viral ATPase protein.	281
398821	pfam05353	Atracotoxin	Delta Atracotoxin. Delta atracotoxin produces potentially fatal neurotoxic symptoms in primates by slowing the inactivation of voltage-gated sodium channels. The structure of atracotoxin comprises a core beta region containing a triple-stranded beta sheet, a thumb-like extension protruding from the beta region and a C-terminal helix. The beta region contains a cystine knot motif, a feature seen in other neurotoxic polypeptides.	42
283102	pfam05354	Phage_attach	Phage Head-Tail Attachment. The phage head-tail attachment protein is required for the joining of phage heads and tails at the last step of morphogenesis.	117
398822	pfam05355	Apo-CII	Apolipoprotein C-II. Apolipoprotein C-II (ApoC-II) is the major activator of lipoprotein lipase, a key enzyme in the regulation of triglyceride levels in human serum.	77
398823	pfam05356	Phage_Coat_B	Phage Coat protein B. The major coat protein in the capsid of filamentous bacteriophage forms a helical assembly of about 7000 identical protomers, with each protomer comprised of 46 amino acids after the cleavage of the signal peptide. Each protomer forms a slightly curved helix, and these helices combine to form a tubular structure that encapsulates the viral DNA.	56
368403	pfam05357	Phage_Coat_A	Phage Coat Protein A. Infection of Escherichia coli by filamentous bacteriophages is mediated by the minor phage coat protein A and involves two distinct cellular receptors, the F' pilus and the periplasmic protein TolA. These two receptors are contacted in a sequential manner, such that binding of TolA by the extreme N-terminal domain is conditional on a primary interaction of the second coat protein A domain with the F' pilus.	62
283105	pfam05358	DicB	DicB protein. DicB is part of the dic operon, which resides on cryptic prophage Kim. Under normal conditions, expression of dicB is actively repressed. When expression is induced, however, cell division rapidly ceases, and this division block is dependent on MinC with which it interacts.	62
398824	pfam05359	DUF748	Domain of Unknown Function (DUF748). 	152
398825	pfam05360	YiaAB	yiaA/B two helix domain. This domain consists of two transmembrane helices and a conserved linking section.	53
398826	pfam05361	PP1_inhibitor	PKC-activated protein phosphatase-1 inhibitor. Contractility of vascular smooth muscle depends on phosphorylation of myosin light chains, and is modulated by hormonal control of myosin phosphatase activity. Signaling pathways activate kinases such as PKC or Rho-dependent kinases that phosphorylate the myosin phosphatase inhibitor protein called CPI-17. Phosphorylation of CPI-17 at Thr-38 enhances its inhibitory potency 1000-fold, creating a molecular switch for regulating contraction.	141
368405	pfam05362	Lon_C	Lon protease (S16) C-terminal proteolytic domain. The Lon serine proteases must hydrolyze ATP to degrade protein substrates. In Escherichia coli, these proteases are involved in turnover of intracellular proteins, including abnormal proteins following heat-shock. The active site for protease activity resides in a C-terminal domain. The Lon proteases are classified as family S16 in Merops.	205
398827	pfam05363	Herpes_US12	Herpesvirus US12 family. US12 is a key factor in the evasion of cellular immune response against HSV-infected cells. Specific inhibition of the transporter associated with antigen processing (TAP) by US12 prevents peptide transport into the endoplasmic reticulum and subsequent loading of major histocompatibility complex (MHC) class I molecules. US12 is comprised of three helices and is associated with cellular membranes.	82
283111	pfam05364	SecIII_SopE_N	Salmonella type III secretion SopE effector N-terminus. Salmonella typhimurium employs a type III secretion system to inject bacterial toxins into the host cell cytosol. These toxins transiently activate Rho family GTP-binding protein-dependent signaling cascades to induce cytoskeletal rearrangements. SopE, one of these toxins, can activate Cdc42 in a Dbl-like fashion via its C-terminal GEP domain pfam07487. This family represents the N-terminal region of SopE. The function of this domain is unknown.	74
398828	pfam05365	UCR_UQCRX_QCR9	Ubiquinol-cytochrome C reductase, UQCRX/QCR9 like. The UQCRX/QCR9 protein is the 9/10 subunit of complex III, encoding a protein of about 7-kDa. Deletion of QCR9 results in the inability of cells to grow on non-fermentable carbon sources in yeast.	53
368407	pfam05366	Sarcolipin	Sarcolipin. Sarcolipin is a 31 amino acid integral membrane protein that regulates Ca-ATPase activity in skeletal muscle.	31
368408	pfam05367	Phage_endo_I	Phage endonuclease I. The bacteriophage endonuclease I is a nuclease that is selective for the structure of the four-way Holliday DNA junction.	149
398829	pfam05368	NmrA	NmrA-like family. NmrA is a negative transcriptional regulator involved in the post-translational modification of the transcription factor AreA. NmrA is part of a system controlling nitrogen metabolite repression in fungi. This family only contains a few sequences as iteration results in significant matches to other Rossmann fold families.	236
368410	pfam05369	MtmB	Monomethylamine methyltransferase MtmB. Monomethylamine methyltransferase of the archaebacterium Methanosarcina barkeri contains a novel amino acid, pyrrolysine, encoded by the termination codon UAG. The structure reveals a homohexamer comprised of individual subunits with a TIM barrel fold.	450
398830	pfam05370	DUF749	Domain of unknown function (DUF749). Archaeal domain of unknown function. This domain has been solved as part of a structural genomics project and comprises segregated helical and anti-parallel beta sheet regions.	87
368412	pfam05371	Phage_Coat_Gp8	Phage major coat protein, Gp8. Class I phage major coat protein Gp8 or B. The coat protein is largely alpha-helix with a slight curve.	52
398831	pfam05372	Delta_lysin	Delta lysin family. Delta-lysin is a 26 amino acid, hemolytic peptide toxin secreted by Staphylococcus aureus. It is thought that delta-toxin forms an amphipathic helix upon binding to lipid bilayers. The precise mode of action of delta-lysin is unclear.	25
398832	pfam05373	Pro_3_hydrox_C	L-proline 3-hydroxylase, C-terminal. Iron (II)/2-oxoglutarate (2-OG)-dependent oxygenases catalyze oxidative reactions in a range of metabolic processes. Proline 3-hydroxylase hydroxylates proline at position 3, the first of a 2-OG oxygenase catalyzing oxidation of a free alpha-amino acid. The structure contains conserved motifs present in other 2-OG oxygenases including a jelly roll strand core and residues binding iron and 2-oxoglutarate, consistent with divergent evolution within the extended family. The structure differs significantly from many other 2-OG oxygenases in possessing a discrete C-terminal helical domain.	101
310168	pfam05374	Mu-conotoxin	Mu-Conotoxin. Mu-conotoxins are peptide inhibitors of voltage-sensitive sodium channels.	22
253170	pfam05375	Pacifastin_I	Pacifastin inhibitor (LCMII). Structures of members of this family show that they are comprised of a triple-stranded antiparallel beta-sheet connected by three disulfide bridges, which defines this as a novel family of serine protease inhibitors.	40
398833	pfam05377	FlaC_arch	Flagella accessory protein C (FlaC). Although archaeal flagella appear superficially similar to those of bacteria, they are quite distinct. In several archaea, the flagellin genes are followed immediately by the flagellar accessory genes flaCDEFGHIJ. The gene products may have a role in translocation, secretion, or assembly of the flagellum. FlaC is a protein whose exact role is unknown but it has been shown to be membrane-associated (by immuno-blotting fractionated cells).	55
398834	pfam05378	Hydant_A_N	Hydantoinase/oxoprolinase N-terminal region. This family is found at the N-terminus of the pfam01968 family.	176
283122	pfam05379	Peptidase_C23	Carlavirus endopeptidase. A peptidase involved in auto-proteolysis of a polyprotein from the plant pathogen blueberry scorch carlavirus (BBScV). Corresponds to Merops family C23.	88
398835	pfam05380	Peptidase_A17	Pao retrotransposon peptidase. Corresponds to Merops family A17. These proteins are homologous to aspartic proteinases encoded by retroposons and retroviruses.	162
398836	pfam05381	Peptidase_C21	Tymovirus endopeptidase. Corresponds to Merops family C21. The best-studied plant alpha-like virus proteolytic enzyme is the proteinase of turnip yellow mosaic virus (TYMV). The TYMV replicase protein undergoes auto-cleavage to yield two products. The auto-peptidase activity has been mapped to the central part of this polyprotein.	100
283125	pfam05382	Amidase_5	Bacteriophage peptidoglycan hydrolase. At least one of the members of this family, the Pal protein from the pneumococcal bacteriophage Dp-1 has been shown to be a N-acetylmuramoyl-L-alanine amidase. According to the known modular structure of this and other peptidoglycan hydrolases from the pneumococcal system, the active site should reside at the N-terminal domain whereas the C-terminal domain binds to the choline residues of the cell wall teichoic acids. This family appears to be related to pfam00877.	142
398837	pfam05383	La	La domain. This presumed domain is found at the N-terminus of La RNA-binding proteins as well as other proteins. The function of this region is uncertain.	59
398838	pfam05384	DegS	Sensor protein DegS. This is a small family of Bacillus DegS proteins. The DegS-DegU two-component regulatory system of Bacillus subtilis controls various processes that characterize the transition from the exponential to the stationary growth phase, including the induction of extracellular degradative enzymes, expression of late competence genes and down-regulation of the sigma D regulon. The family also contains one sequence from Thermoanaerobacter tengcongensis which is described as a sensory transduction histidine kinase.	159
283128	pfam05385	Adeno_E4	Mastadenovirus early E4 13 kDa protein. This family consists of human and simian mastadenovirus early E4 13 kDa proteins. Human adenovirus type 9 (Ad9) is unique in eliciting exclusively estrogen-dependent mammary tumors in rats and in not requiring viral E1 region transforming genes for tumorigenicity. E4 codes for an oncoprotein essential for tumorigenesis by Ad9.	108
398839	pfam05386	TEP1_N	TEP1 N-terminal domain. This short sequence region is found in four copies at the N-terminus of the TEP1 telomerase component. The functional significance of the region is uncertain. However the conservation of two histidines and a cysteine suggests it is a potential zinc binding domain.	29
310175	pfam05387	Chorion_3	Chorion family 3. This family consists of several Drosophila chorion proteins S36 and S38. The chorion genes of Drosophila are amplified in response to developmental signals in the follicle cells of the ovary.	277
398840	pfam05388	Carbpep_Y_N	Carboxypeptidase Y pro-peptide. This family is found at the N-terminus of several carboxypeptidase Y proteins and contains a signal peptide and pro-peptide regions.	126
398841	pfam05389	MecA	Negative regulator of genetic competence (MecA). This family contains several bacterial MecA proteins. The development of competence in Bacillus subtilis is regulated by growth conditions and several regulatory genes. In complex media competence development is poor, and there is little or no expression of late competence genes. Mec mutations permit competence development and late competence gene expression in complex media, bypassing the requirements for many of the competence regulatory genes. The mecA gene product acts negatively in the development of competence. Null mutations in mecA allow expression of a late competence gene comG, under conditions where it is not normally expressed, including in complex media and in cells mutant for several competence regulatory genes. Overexpression of MecA inhibits comG transcription.	168
398842	pfam05390	KRE9	Yeast cell wall synthesis protein KRE9/KNH1. This family contains several KRE9 and KNH1 proteins which are involved in encoding cell surface O glycoproteins, which are required for beta-1,6-glucan synthesis in yeast.	101
398843	pfam05391	Lsm_interact	Lsm interaction motif. This short motif is found at the C-terminus of Prp24 proteins and probably interacts with the Lsm proteins to promote U4/U6 formation.	19
398844	pfam05392	COX7B	Cytochrome C oxidase chain VIIB. 	79
283135	pfam05393	Hum_adeno_E3A	Human adenovirus early E3A glycoprotein. This family consists of several early glycoproteins from human adenoviruses.	102
368421	pfam05394	AvrB_AvrC	Avirulence protein. This family consists of several avirulence proteins from Pseudomonas syringae and Xanthomonas campestris.	326
398845	pfam05395	DARPP-32	Protein phosphatase inhibitor 1/DARPP-32. This family consists of several mammalian protein phosphatase inhibitor 1 (IPP-1) and dopamine- and cAMP-regulated neuronal phosphoprotein (DARPP-32) proteins. Protein phosphatase inhibitor-1 is involved in signal transduction and is an endogenous inhibitor of protein phosphatase-1. It has been demonstrated that DARPP-32, if phosphorylated, can inhibit protein-phosphatase-1. DARPP-32 has a key role in many neurotransmitter pathways throughout the brain and has been shown to be involved in controlling receptors, ion channels and other physiological factors including the brain's response to drugs of abuse, such as cocaine, opiates and nicotine. DARPP-32 is reciprocally regulated by the two neurotransmitters that are most often implicated in schizophrenia - dopamine and glutamate. Dopamine activates DARPP-32 through the D1 receptor pathway and disables DARPP-32 through the D2 receptor. Glutamate, acting through the N-methyl-d-aspartate receptor, renders DARPP-32 inactive. A mutant form of DARPP-32 has been linked with gastric cancers.	136
147533	pfam05396	Phage_T7_Capsid	Phage T7 capsid assembly protein. 	123
398846	pfam05397	Med15_fungi	Mediator complex subunit 15. GAL11 or MED15 is one of the up to 32 or so subunits of the Mediator complex which is found from fungi to humans. The Mediator complex interacts with RNA polymerase II and other general transcription factors to form the RNA polymerase II holoenzyme, thereby affecting transcription through targeting of activators and repressors. This family is found in fungi and the small metazoan starlet anemone.	112
310184	pfam05398	PufQ	PufQ cytochrome subunit. This family consists of bacterial PufQ proteins. PufQ is required for bacteriochlorophyll biosynthesis, serving a regulatory function in the formation of photosynthetic complexes.	74
368424	pfam05399	EVI2A	Ecotropic viral integration site 2A protein (EVI2A). This family contains several mammalian ecotropic viral integration site 2A (EVI2A) proteins. The function of this protein is unknown although it is thought to be a membrane protein and may function as an oncogene in retrovirus induced myeloid tumors.	231
398847	pfam05400	FliT	Flagellar protein FliT. This family contains several bacterial flagellar FliT proteins. The flagellar proteins FlgN and FliT have been proposed to act as substrate specific export chaperones, facilitating incorporation of the enterobacterial hook-associated axial proteins (HAPs) FlgK/FlgL and FliD into the growing flagellum. In Salmonella typhimurium flgN and fliT mutants, the export of target HAPs is reduced, concomitant with loss of unincorporated flagellin into the surrounding medium.	85
398848	pfam05401	NodS	Nodulation protein S (NodS). This family consists of nodulation S (NodS) proteins. The products of the rhizobial nodulation genes are involved in the biosynthesis of lipochitin oligosaccharides (LCOs), which are host-specific signal molecules required for nodule formation. NodS is an S-adenosyl-L-methionine (SAM)-dependent methyltransferase involved in N methylation of LCOs. NodS uses N-deacetylated chitooligosaccharides, the products of the NodBC proteins, as its methyl acceptors.	199
398849	pfam05402	PqqD	Coenzyme PQQ synthesis protein D (PqqD). This family contains several bacterial coenzyme PQQ synthesis protein D (PqqD) sequences. This protein is required for coenzyme pyrrolo-quinoline-quinone (PQQ) biosynthesis.	64
253181	pfam05403	Plasmodium_HRP	Plasmodium histidine-rich protein (HRPII/III). This family consists of several histidine-rich protein II and III sequences from Plasmodium falciparum.	218
398850	pfam05404	TRAP-delta	Translocon-associated protein, delta subunit precursor (TRAP-delta). This family consists of several eukaryotic translocon-associated protein, delta subunit precursors (TRAP-delta or SSR-delta). The exact function of this protein is unknown.	162
368426	pfam05405	Mt_ATP-synt_B	Mitochondrial ATP synthase B chain precursor (ATP-synt_B). The Fo sector of the ATP synthase is a membrane bound complex which mediates proton transport. It is composed of nine different polypeptide subunits (a, b, c, d, e, f, g, F6, A6L).	163
398851	pfam05406	WGR	WGR domain. This domain is found in a variety of polyA polymerases as well as the E. coli molybdate metabolism regulator and other proteins of unknown function. I have called this domain WGR after the most conserved central motif of the domain. The domain is found in isolation in proteins such as Rhizobium radiobacter Ych and is between 70 and 80 residues in length. I propose that this may be a nucleic acid binding domain.	79
368428	pfam05407	Peptidase_C27	Rubella virus endopeptidase. Corresponds to Merops family C27. Required for processing of the rubella virus replication protein.	171
283147	pfam05408	Peptidase_C28	Foot-and-mouth virus L-proteinase. Corresponds to Merops family C28. Protein fold of the peptidase unit for members of this family resembles that of papain. The leader proteinase of foot and mouth disease virus (FMDV) cleaves itself from the growing polyprotein and also cleaves the host translation initiation factor 4GI (eIF4G), thus inhibiting 5'-cap dependent translation.	201
398852	pfam05409	Peptidase_C30	Coronavirus endopeptidase C30. Corresponds to Merops family C30. These peptidases are involved in viral polyprotein processing in replication.	274
398853	pfam05410	Peptidase_C31	Porcine arterivirus-type cysteine proteinase alpha. Corresponds to Merops family C31. These peptidases are involved in viral polyprotein processing in replication.	105
398854	pfam05411	Peptidase_C32	Equine arteritis virus putative proteinase. These proteins are characterized by a region that has been proposed to have peptidase activity involved in viral polyprotein processing in replication.	127
114153	pfam05412	Peptidase_C33	Equine arterivirus Nsp2-type cysteine proteinase. Corresponds to Merops family C33. These peptidases are involved in viral polyprotein processing in replication.	108
147545	pfam05413	Peptidase_C34	Putative closterovirus papain-like endopeptidase. Corresponds to Merops family C34. Putative closterovirus papain-like endopeptidase from the apple chlorotic leaf spot closterovirus.	92
283149	pfam05414	DUF1717	Viral domain of unknown function (DUF1717). This domain is found in viral proteins of unknown function.	78
283150	pfam05415	Peptidase_C36	Beet necrotic yellow vein furovirus-type papain-like endopeptidase. Corresponds to Merops family C36. This protease is involved in processing the viral polyprotein.	104
253185	pfam05416	Peptidase_C37	Southampton virus-type processing peptidase. Corresponds to Merops family C37. Norwalk-like viruses (NLVs), including the Southampton virus, cause acute non-bacterial gastroenteritis in humans. The NLV genome encodes three open reading frames (ORFs). ORF1 encodes a polyprotein, which is processed by the viral protease into six proteins.	535
283151	pfam05417	Peptidase_C41	Hepatitis E cysteine protease. Corresponds to MEROPs family C41. This papain-like protease cleaves the viral polyprotein encoded by ORF1 of the hepatitis E virus (HEV).	161
283152	pfam05418	Apo-VLDL-II	Apovitellenin I (Apo-VLDL-II). This family consists of several avian apovitellenin I sequences. As part of the avian reproductive effort, large quantities of triglyceride-rich very-low-density lipoprotein (VLDL) particles are transported by receptor-mediated endocytosis into the female germ cells. Although the oocytes are surrounded by a layer of granulosa cells harbouring high levels of active lipoprotein lipase, non-lipolysed VLDL is transported into the yolk. This is because VLDL particles from laying chickens are protected from lipolysis by apolipoprotein (apo)-VLDL-II, a potent dimeric lipoprotein lipase inhibitor. Apo-VLDL-II is produced in the liver and secreted into the blood stream when induced by estrogen production in female birds.	79
398855	pfam05419	GUN4	GUN4-like. In Arabidopsis, GUN4 is required for the functioning of the plastid mediated repression of nuclear transcription that is involved in controlling the levels of magnesium-protoporphyrin IX. GUN4 binds the product and substrate of Mg-chelatase, an enzyme that produces Mg-Proto, and activates Mg-chelatase. GUN4 is thought to participate in plastid-to-nucleus signaling by regulating magnesium-protoporphyrin IX synthesis or trafficking.	138
398856	pfam05420	BCSC_C	Cellulose synthase operon protein C C-terminus (BCSC_C). This family contains the C-terminal regions of several bacterial cellulose synthase operon C (BCSC) proteins. BCSC is involved in cellulose synthesis although the exact function of this protein is unknown.	336
398857	pfam05421	DUF751	Protein of unknown function (DUF751). This family contains several plant, cyanobacterial and algal proteins of unknown function. The family is exclusively found in phototrophic organisms and may therefore play a role in photosynthesis (personal obs:Moxon SJ).	60
398858	pfam05422	SIN1	Stress-activated map kinase interacting protein 1 (SIN1). SIN1 is the N-terminus of stress-activated map kinase interacting protein 1 (MAPKAP1 OR SIN1) sequences. This domain is likely to be the Ras-binding domain. The fission yeast Sty1/Spc1 mitogen-activated protein (MAP) kinase is a member of the eukaryotic stress-activated MAP kinase (SAPK) family. Sin1 interacts with Sty1/Spc1. Cells lacking Sin1 display many, but not all, of the phenotypes of cells lacking the Sty1/Spc1 MAP kinase including sterility, multiple stress sensitivity and a cell-cycle delay. Sin1 is phosphorylated after stress but this is not Sty1/Spc1-dependent. The separate CRIM and PH, pleckstrin-homology domains of the full-length SIN1 proteins have been separated into distinct families.	139
283157	pfam05423	Mycobact_memb	Mycobacterium membrane protein. This family contains several membrane proteins from Mycobacterium species.	138
398859	pfam05424	Duffy_binding	Duffy binding domain. This domain is found in Plasmodium Duffy binding proteins. Plasmodium vivax and Plasmodium knowlesi merozoites invade human erythrocytes that express Duffy blood group surface determinants. The Duffy receptor family is localized in micronemes, an organelle found in all organisms of the phylum Apicomplexa. This family is closely associated on PfEMP1 proteins with PFEMP, pfam03011.	187
398860	pfam05425	CopD	Copper resistance protein D. Copper sequestering activity displayed by some bacteria is determined by copper-binding protein products of the copper resistance operon (cop). CopD, together with CopC, perform copper uptake into the cytoplasm.	97
398861	pfam05426	Alginate_lyase	Alginate lyase. This family contains several bacterial alginate lyase proteins. Alginate is a family of 1-4-linked copolymers of beta-D-mannuronic acid (M) and alpha-L-guluronic acid (G). It is produced by brown algae and by some bacteria belonging to the genera Azotobacter and Pseudomonas. Alginate lyases catalyze the depolymerization of alginates by beta-elimination, generating a molecule containing 4-deoxy-L-erythro-hex-4-enepyranosyluronate at the nonreducing end. This family adopts an all alpha fold.	274
398862	pfam05427	FIBP	Acidic fibroblast growth factor binding (FIBP). Acidic fibroblast growth factor (aFGF) intracellular binding protein (FIBP) is a protein found mainly in the nucleus that is thought to be involved in the intracellular function of aFGF.	360
398863	pfam05428	CRF-BP	Corticotropin-releasing factor binding protein (CRF-BP). This family consists of several eukaryotic corticotropin-releasing factor binding proteins (CRF-BP or CRH-BP). Corticotropin-releasing hormone (CRH) plays multiple roles in vertebrate species. In mammals, it is the major hypothalamic releasing factor for pituitary adrenocorticotropin secretion, and is a neurotransmitter or neuromodulator at other sites in the central nervous system. In non-mammalian vertebrates, CRH not only acts as a neurotransmitter and hypophysiotropin, it also acts as a potent thyrotropin-releasing factor, allowing CRH to regulate both the adrenal and thyroid axes, especially in development. CRH-BP is thought to play an inhibitory role in which it binds CRH and other CRH-like ligands and prevents the activation of CRH receptors. There is however evidence that CRH-BP may also exhibit diverse extra and intracellular roles in a cell specific fashion and at specific times in development.	298
398864	pfam05430	Methyltransf_30	S-adenosyl-L-methionine-dependent methyltransferase. This family is a S-adenosyl-L-methionine (SAM)-dependent methyltransferase. It is often found in association with pfam01266, where it is responsible for catalyzing the transfer of a methyl group from S-adenosyl-L-methionine to 5-aminomethyl-2-thiouridine to form 5-methylaminomethyl-2-thiouridine.	124
398865	pfam05431	Toxin_10	Insecticidal Crystal Toxin, P42. Family of Bacillus insecticidal crystal toxins. Strains of Bacillus that have this insecticidal activity use a binary toxin comprised of two proteins, P51 and P42 (this family). Members of this family are highly conserved between strains of different serotypes and phage groups.	169
398866	pfam05432	BSP_II	Bone sialoprotein II (BSP-II). Bone sialoprotein (BSP) is a major structural protein of the bone matrix that is specifically expressed by fully-differentiated osteoblasts. The expression of bone sialoprotein (BSP) is normally restricted to mineralized connective tissues of bones and teeth where it has been associated with mineral crystal formation. However, it has been found that ectopic expression of BSP occurs in various lesions, including oral and extraoral carcinomas, in which it has been associated with the formation of microcrystalline deposits and the metastasis of cancer cells to bone.	301
398867	pfam05433	Rick_17kDa_Anti	Glycine zipper 2TM domain. This family includes a putative two transmembrane alpha-helical region that contains glycine zipper motifs. This family includes several Rickettsia genus specific 17 kDa surface antigen proteins.	42
398868	pfam05434	Tmemb_9	TMEM9. This family contains several eukaryotic transmembrane proteins which are homologous to human transmembrane protein 9. The TMEM9 gene encodes a 183 amino-acid protein that contains an N-terminal signal peptide, a single transmembrane region, three potential N-glycosylation sites and three conserved cys-rich domains in the N-terminus, but no known functional domains. The protein is highly conserved between species from Caenorhabditis elegans to man and belongs to a novel family of transmembrane proteins. The exact function of TMEM9 is unknown although it has been found to be widely expressed and localized to the late endosomes and lysosomes. Members of this family contain pfam03128 repeats in their N-terminal region.	142
368442	pfam05435	Phi-29_GP3	Phi-29 DNA terminal protein GP3. This family consists of DNA terminal protein GP3 sequences from Phi-29 like bacteriophages. DNA terminal protein GP3 is linked to the 5' ends of both strands of the genome through a phosphodiester bond between the beta-hydroxyl group of a serine residue and the 5'-phosphate of the terminal deoxyadenylate. This protein is essential for DNA replication and is involved in the priming of DNA elongation.	266
398869	pfam05436	MF_alpha_N	Mating factor alpha precursor N-terminus. This family contains the N-terminal regions of the Saccharomyces mating factor alpha precursor protein. All proteins in this family contain one or more copies of pfam04648 further toward their C-terminus.	87
398870	pfam05437	AzlD	Branched-chain amino acid transport protein (AzlD). This family consists of a number of bacterial and archaeal branched-chain amino acid transport proteins. AzlD is known to be involved in conferring resistance to 4-azaleucine although its exact role is uncertain.	99
398871	pfam05438	TRH	Thyrotropin-releasing hormone (TRH). This family consists of several thyrotropin-releasing hormone (TRH) proteins. Thyrotropin-Releasing Hormone (TRH; pyroGlu-His-Pro-NH2), originally isolated as a hypothalamic neuropeptide hormone, most likely acts also as a neuromodulator and/or neurotransmitter in the central nervous system (CNS). This interpretation is supported by the identification of a peptidase localized on the surface of neuronal cells which has been termed TRH-degrading ectoenzyme (TRH-DE) since it selectively inactivates TRH. TRH has been used clinically for the treatment of spinocerebellar degeneration and disturbance of consciousness in humans.	219
398872	pfam05439	JTB	Jumping translocation breakpoint protein (JTB). This family contains several jumping translocation breakpoint proteins or JTBs. Jumping translocation (JT) is an unbalanced translocation that comprises amplified chromosomal segments jumping to various telomeres. JTB, located at 1q21, has been found to fuse with the telomeric repeats of acceptor telomeres in a case of JT. hJTB (human JTB) encodes a trans-membrane protein that is highly conserved among divergent eukaryotic species. JT results in a hJTB truncation, which potentially produces an hJTB product devoid of the trans-membrane domain. hJTB is located in a gene-rich region at 1q21, called EDC (Epidermal Differentiation Complex). JTB has also been implicated in prostatic carcinomas.	110
398873	pfam05440	MtrB	Tetrahydromethanopterin S-methyltransferase subunit B. The N5-methyltetrahydromethanopterin: coenzyme M methyltransferase (EC:2.1.1.86) of Methanosarcina mazei Go1 is a membrane-associated, corrinoid-containing protein that uses a transmethylation reaction to drive an energy-conserving sodium ion pump.	94
398874	pfam05443	ROS_MUCR	ROS/MUCR transcriptional regulator protein. This family consists of several ROS/MUCR transcriptional regulator proteins. The ros chromosomal gene is present in octopine and nopaline strains of Agrobacterium tumefaciens as well as in Rhizobium meliloti. This gene encodes a 15.5-kDa protein that specifically represses the virC and virD operons in the virulence region of the Ti plasmid and is necessary for succinoglycan production. Sinorhizobium meliloti can produce two types of acidic exopolysaccharides, succinoglycan and galactoglucan, that are interchangeable for infection of alfalfa nodules. MucR from Sinorhizobium meliloti acts as a transcriptional repressor that blocks the expression of the exp genes responsible for galactoglucan production therefore allowing the exclusive production of succinoglycan.	122
398875	pfam05444	DUF753	Protein of unknown function (DUF753). This family contains sequences which are repeated in several uncharacterized proteins from Drosophila melanogaster.	149
283176	pfam05445	Pox_ser-thr_kin	Poxvirus serine/threonine protein kinase. 	434
398876	pfam05448	AXE1	Acetyl xylan esterase (AXE1). This family consists of several bacterial acetyl xylan esterase proteins. Acetyl xylan esterases are enzymes that hydrolyze the ester linkages of the acetyl groups in position 2 and/or 3 of the xylose moieties of natural acetylated xylan from hardwood. These enzymes are one of the accessory enzymes which are part of the xylanolytic system, together with xylanases, beta-xylosidases, alpha-arabinofuranosidases and methylglucuronidases; these are all required for the complete hydrolysis of xylan.	316
398877	pfam05449	Phage_holin_3_7	Putative 3TM holin, Phage_holin_3. This is a family of putative proteobacterial phage three-transmembrane-domain holins.	80
310213	pfam05450	Nicastrin	Nicastrin. Nicastrin and presenilin are two major components of the gamma-secretase complex, which executes the intramembrane proteolysis of type I integral membrane proteins such as the amyloid precursor protein (APP) and Notch. Nicastrin is synthesized in fibroblasts and neurons as an endoglycosidase-H-sensitive glycosylated precursor protein (immature nicastrin) and is then modified by complex glycosylation in the Golgi apparatus and by sialylation in the trans-Golgi network (mature nicastrin). A region featured in this family has a fold similar to human transferrin receptor (TfR) and a bacterial aminopeptidase. It is implicated in the pathogenesis of Alzheimer's disease.	227
283180	pfam05451	Phytoreo_Pns	Phytoreovirus nonstructural protein Pns10/11. This family consists of Phytoreovirus nonstructural proteins Pns10 and Pns11. Genome segment S11 of rice gall dwarf virus (RGDV), a member of Phytoreovirus encodes a putative protein of 40 kDa that exhibits approximately 37% homology at the amino acid level to the nonstructural proteins Pns10 of rice dwarf and wound tumor viruses, which are other members of Phytoreovirus.	359
147565	pfam05452	Clavanin	Clavanin. This family consists of clavanin proteins from the haemocytes of the invertebrate Styela clava, a solitary tunicate. The family is made up of four alpha-helical antimicrobial peptides, clavanins A, B, C and D. The tunicate peptides resemble magainins in size, primary sequence and antibacterial activity. Synthetic clavanin A displays comparable antimicrobial activity to magainins and cecropins. The presence of alpha-helical antimicrobial peptides in the haemocytes of a urochordate suggests that such peptides are primeval effectors of innate immunity in the vertebrate lineage.	80
398878	pfam05453	Toxin_6	BmTXKS1/BmP02 toxin family. This family consists of toxin-like peptides that are isolated from the venom of Buthus martensii Karsch scorpion. The precursor consists of 60 amino acid residues, with a putative signal peptide of 28 residues and an extra residue, and a mature peptide of 31 residues with an amidated C-terminus. The peptides share close homology with other scorpion K+ channel toxins and should present a common three-dimensional fold - the cysteine-stabilized alphabeta (CSalphabeta) motif. This family acts by blocking small conductance calcium activated potassium ion channels in their victim.	28
398879	pfam05454	DAG1	Dystroglycan (Dystrophin-associated glycoprotein 1). Dystroglycan is one of the dystrophin-associated glycoproteins, which is encoded by a 5.5 kb transcript in human. The protein product is cleaved into two non-covalently associated subunits, [alpha] (N-terminal) and [beta] (C-terminal). In skeletal muscle the dystroglycan complex works as a transmembrane linkage between the extracellular matrix and the cytoskeleton. [alpha]-dystroglycan is extracellular and binds to merosin ([alpha]-2 laminin) in the basement membrane, while [beta]-dystroglycan is a transmembrane protein and binds to dystrophin, which is a large rod-like cytoskeletal protein, absent in Duchenne muscular dystrophy patients. Dystrophin binds to intracellular actin cables. In this way, the dystroglycan complex, which links the extracellular matrix to the intracellular actin cables, is thought to provide structural integrity in muscle tissues. The dystroglycan complex is also known to serve as an agrin receptor in muscle, where it may regulate agrin-induced acetylcholine receptor clustering at the neuromuscular junction. There is also evidence which suggests the function of dystroglycan as a part of the signal transduction pathway because it is shown that Grb2, a mediator of the Ras-related signal pathway, can interact with the cytoplasmic domain of dystroglycan. In general, aberrant expression of dystrophin-associated protein complex underlies the pathogenesis of Duchenne muscular dystrophy, Becker muscular dystrophy and severe childhood autosomal recessive muscular dystrophy. Interestingly, no genetic disease has been described for either [alpha]- or [beta]-dystroglycan. Dystroglycan is widely distributed in non-muscle tissues as well as in muscle tissues. During epithelial morphogenesis of kidney, the dystroglycan complex is shown to act as a receptor for the basement membrane. Dystroglycan expression in mouse brain and neural retina has also been reported. However, the physiological role of dystroglycan in non-muscle tissues has remained unclear.	290
310215	pfam05455	GvpH	GvpH. This family consists of archaeal GvpH proteins which are thought to be involved in gas vesicle synthesis.	177
398880	pfam05456	eIF_4EBP	Eukaryotic translation initiation factor 4E binding protein (EIF4EBP). This family consists of several eukaryotic translation initiation factor 4E binding proteins (EIF4EBP1,2 and 3). Translation initiation in eukaryotes is mediated by the cap structure (m7GpppN, where N is any nucleotide) present at the 5' end of all cellular mRNAs, except organellar. The cap is recognized by eukaryotic initiation factor 4F (eIF4F), which consists of three polypeptides, including eIF4E, the cap-binding protein subunit. The interaction of the cap with eIF4E facilitates the binding of the ribosome to the mRNA. eIF4E activity is regulated in part by translational repressors, 4E-BP1, 4E-BP2 and 4E-BP3 which bind to it and prevent its assembly into eIF4F.	120
283184	pfam05458	Siva	Cd27 binding protein (Siva). Siva binds to the CD27 cytoplasmic tail. It has a DD homology region, a box-B-like ring finger, and a zinc finger-like domain. Overexpression of Siva in various cell lines induces apoptosis, suggesting an important role for Siva in the CD27-transduced apoptotic pathway. Siva-1 binds to and inhibits BCL-X(L)-mediated protection against UV radiation-induced apoptosis. Indeed, the unique amphipathic helical region (SAH) present in Siva-1 is required for its binding to BCL-X(L) and sensitising cells to UV radiation. Natural complexes of Siva-1/BCL-X(L) are detected in HUT78 and murine thymocyte, suggesting a potential role for Siva-1 in regulating T cell homeostasis. This family contains both Siva-1 and the shorter Siva-2 lacking the sequence coded by exon 2. It has been suggested that Siva-2 could regulate the function of Siva-1.	173
398881	pfam05459	Herpes_UL69	Herpesvirus transcriptional regulator family. This family includes UL69 and IE63 that are transcriptional regulator proteins.	217
368452	pfam05460	ORC6	Origin recognition complex subunit 6 (ORC6). This family consists of several eukaryotic origin recognition complex subunit 6 (ORC6) proteins. Despite differences in their structure and sequences among eukaryotic replicators, ORC is a conserved feature of replication initiation in all eukaryotes. ORC-related genes have been identified in organisms ranging from S. pombe to plants to humans. All DNA replication initiation is driven by a single conserved eukaryotic initiator complex termed the origin recognition complex (ORC). The ORC is a six protein complex. The function of ORC is reviewed in.	288
398882	pfam05461	ApoL	Apolipoprotein L. Apo L belongs to the high density lipoprotein family that plays a central role in cholesterol transport. The cholesterol content of membranes is important in cellular processes such as modulating gene transcription and signal transduction both in the adult brain and during neurodevelopment. There are six apo L genes located in close proximity to each other on chromosome 22q12 in humans. 22q12 is a confirmed high-susceptibility locus for schizophrenia and close to the region associated with velocardiofacial syndrome that includes symptoms of schizophrenia.	313
283188	pfam05462	Dicty_CAR	Slime mold cyclic AMP receptor. This family consists of cyclic AMP receptor (CAR) proteins from slime molds. CAR proteins are responsible for controlling development in Dictyostelium discoideum.	305
398883	pfam05463	Sclerostin	Sclerostin (SOST). This family contains several mammalian sclerostin (SOST) proteins. SOST is thought to suppress bone formation. Mutations of the SOST gene lead to sclerosteosis, a progressive sclerosing bone dysplasia with an autosomal recessive mode of inheritance. Radiologically, it is characterized by a generalized hyperostosis and sclerosis leading to a markedly thickened and sclerotic skull, with mandible, ribs, clavicles and all long bones also being affected. Due to narrowing of the foramina of the cranial nerves, facial nerve palsy, hearing loss and atrophy of the optic nerves can occur. Sclerosteosis is clinically and radiologically very similar to van Buchem disease, mainly differentiated by hand malformations and a large stature in sclerosteosis patients.	198
398884	pfam05464	Phi-29_GP4	Phi-29-like late genes activator (early protein GP4). This family consists of phi-29-like late genes activator (or early protein GP4). This protein is thought to be a positive regulator of late transcription and may function as a sigma like component of the host RNA polymerase.	123
310220	pfam05465	Halo_GVPC	Halobacterial gas vesicle protein C (GVPC) repeat. This family consists of Halobacterium gas vesicle protein C sequences which are thought to confer stability to the gas vesicle membranes.	32
398885	pfam05466	BASP1	Brain acid soluble protein 1 (BASP1 protein). This family consists of several brain acid soluble protein 1 (BASP1) or neuronal axonal membrane protein NAP-22. The BASP1 is a neuron enriched Ca(2+)-dependent calmodulin-binding protein of unknown function.	239
283192	pfam05467	Herpes_U47	Herpesvirus glycoprotein U47. 	677
283193	pfam05470	eIF-3c_N	Eukaryotic translation initiation factor 3 subunit 8 N-terminus. The largest of the mammalian translation initiation factors, eIF3, consists of at least eight subunits ranging in mass from 35 to 170 kDa. eIF3 binds to the 40 S ribosome in an early step of translation initiation and promotes the binding of methionyl-tRNAi and mRNA.	544
398886	pfam05472	Ter	DNA replication terminus site-binding protein (Ter protein). This family contains several bacterial Ter proteins. The Ter protein specifically binds to DNA replication terminus sites on the host and plasmid genome and then blocks progress of the DNA replication fork.	296
368457	pfam05473	UL45	UL45 protein, carbohydrate-binding C-type lectin-like. This family consists of several UL45 proteins. The herpes simplex virus UL45 gene encodes an 18 kDa virion envelope protein whose function remains unknown. It has been suggested that the 18 kDa UL45 gene product is required for efficient growth in the central nervous system at low doses and may play an important role under the conditions of a naturally acquired infection. This family also contains several Varicellovirus UL45 or gene 15 proteins. The Equine herpesvirus 1 UL45 protein represents a type II membrane glycoprotein which has been found to be non-essential for EHV-1 growth in vitro but deletion reduces the viruses' replication efficiency. Studies have shown that UL45 has a C-type lectin-like fold, suggesting that it might have a carbohydrate-binding function.	191
368458	pfam05474	Semenogelin	Semenogelin. This family consists of several mammalian semenogelin (I and II) proteins. Freshly ejaculated human semen has the appearance of a loose gel in which the predominant structural protein components are the seminal vesicle secreted semenogelins (Sg).	582
368459	pfam05475	Chlam_vir	Pgp3 C-terminal domain. This family consists of Chlamydia virulence proteins which are thought to be required for growth within mammalian cells. The C-terminal domain shows distant homology to the TNF superfamily.	146
368460	pfam05476	PET122	PET122. The nuclear PET122 gene of S. cerevisiae encodes a mitochondrial-localized protein that activates initiation of translation of the mitochondrial mRNA from the COX3 gene, which encodes subunit III of cytochrome c oxidase.	259
398887	pfam05477	SURF2	Surfeit locus protein 2 (SURF2). Surfeit locus protein 2 is part of a group of at least six sequence unrelated genes (Surf-1 to Surf-6). The six Surfeit genes have been classified as housekeeping genes, being expressed in all tissue types tested and not containing a TATA box in their promoter region. The exact function of SURF2 is unknown.	240
398888	pfam05478	Prominin	Prominin. The prominins are an emerging family of proteins that among the multispan membrane proteins display a novel topology. Mouse prominin and human prominin (mouse)-like 1 (PROML1) are predicted to contain five membrane spanning domains, with an N-terminal domain exposed to the extracellular space followed by four, alternating small cytoplasmic and large extracellular, loops and a cytoplasmic C-terminal domain. The exact function of prominin is unknown although in humans defects in PROM1, the gene coding for prominin, cause retinal degeneration.	799
398889	pfam05479	PsaN	Photosystem I reaction centre subunit N (PSAN or PSI-N). This family contains several Photosystem I reaction centre subunit N (PSI-N) proteins. The protein has no known function although it is localized in the thylakoid lumen. PSI-N is a small extrinsic subunit at the lumen side and is very likely involved in the docking of plastocyanin.	132
398890	pfam05480	Staph_haemo	Staphylococcus haemolytic protein. This family consists of several different short Staphylococcal proteins; it contains SLUSH A, B and C proteins as well as haemolysin and gonococcal growth inhibitor. Some strains of the coagulase-negative Staphylococcus lugdunensis produce a synergistic hemolytic activity (SLUSH), phenotypically similar to the delta-hemolysin of S. aureus. Gonococcal growth inhibitor from Staphylococcus acts on the cytoplasmic membrane of the gonococcal cell causing cytoplasmic leakage and, eventually, death.	41
398891	pfam05481	Myco_19_kDa	Mycobacterium 19 kDa lipoprotein antigen. Most of the antigens of Mycobacterium leprae and M. tuberculosis that have been identified are members of stress protein families, which are highly conserved throughout many diverse species. Of the M. leprae and M. tuberculosis antigens identified by monoclonal antibodies, all except the 18-kDa M. leprae antigen and the 19-kDa M. tuberculosis antigen are strongly cross-reactive between these two species and are coded within very similar genes.	116
368466	pfam05482	Serendipity_A	Serendipity locus alpha protein (SRY-A). The Drosophila serendipity alpha (sry alpha) gene is specifically transcribed at the blastoderm stage, from nuclear cycle 11 to the onset of gastrulation, in all somatic nuclei. SRY-A is required for the cellularisation of the embryo and is involved in the localization of the actin filaments just prior to and during plasma membrane invagination.	542
114219	pfam05483	SCP-1	Synaptonemal complex protein 1 (SCP-1). Synaptonemal complex protein 1 (SCP-1) is the major component of the transverse filaments of the synaptonemal complex. Synaptonemal complexes are structures that are formed between homologous chromosomes during meiotic prophase.	787
398892	pfam05484	LRV_FeS	LRV protein FeS4 cluster. This Iron sulphur cluster is found at the N-terminus of some proteins containing pfam01816 repeats.	53
398893	pfam05485	THAP	THAP domain. The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes.	74
398894	pfam05486	SRP9-21	Signal recognition particle 9 kDa protein (SRP9). This family consists of several eukaryotic SRP9 proteins. SRP9 together with the Alu-homologous region of 7SL RNA and SRP14 comprise the "Alu domain" of SRP, which mediates pausing of synthesis of ribosome associated nascent polypeptides that have been engaged by the targeting domain of SRP. This family also contains the homologous fungal SRP21.	82
398895	pfam05488	PAAR_motif	PAAR motif. This motif is usually found in pairs in a family of bacterial membrane proteins. It is also found as a triplet of tandem repeats comprising the entire length in another family of hypothetical proteins.	71
283209	pfam05489	Phage_tail_X	Phage Tail Protein X. This domain is found in a family of phage tail proteins. Visual analysis suggests that it is related to pfam01476 (personal obs: C Yeats). The functional annotation of family members further confirms this hypothesis.	60
398896	pfam05491	RuvB_C	Holliday junction DNA helicase ruvB C-terminus. The RuvB protein makes up part of the RuvABC revolvasome which catalyzes the resolution of Holliday junctions that arise during genetic recombination and DNA repair. Branch migration is catalyzed by the RuvB protein that is targeted to the Holliday junction by the structure specific RuvA protein. This family consists of the C-terminal region of the RuvB protein which is thought to be the helicase DNA-binding domain.	72
398897	pfam05493	ATP_synt_H	ATP synthase subunit H. ATP synthase subunit H is an extremely hydrophobic protein of approximately 9 kDa. This subunit may be required for assembly of vacuolar ATPase.	59
398898	pfam05494	MlaC	MlaC protein. MlaC is a component of the Mla pathway, an ABC transport system that functions to maintain the asymmetry of the outer membrane. This family of proteins is involved in toluene tolerance, which is mediated by increased cell membrane rigidity resulting from changes in fatty acid and phospholipid compositions, exclusion of toluene from the cell membrane, and removal of intracellular toluene by degradation. Many proteins are involved in these processes.	164
398899	pfam05495	zf-CHY	CHY zinc finger. This family of domains are likely to bind to zinc ions. They contain many conserved cysteine and histidine residues. We have named this domain after the N-terminal motif CXHY. This domain can be found in isolation in some proteins, but is also often associated with pfam00097. One of the proteins in this family is a mitochondrial intermembrane space protein called Hot13. This protein is involved in the assembly of small TIM complexes.	76
398900	pfam05496	RuvB_N	Holliday junction DNA helicase ruvB N-terminus. The RuvB protein makes up part of the RuvABC revolvasome which catalyzes the resolution of Holliday junctions that arise during genetic recombination and DNA repair. Branch migration is catalyzed by the RuvB protein that is targeted to the Holliday junction by the structure specific RuvA protein. This family contains the N-terminal region of the protein.	159
398901	pfam05497	Destabilase	Destabilase. Destabilase is an endo-epsilon(gamma-Glu)-Lys isopeptidase, which cleaves isopeptide bonds formed by transglutaminase (Factor XIIIa) between glutamine gamma-carboxamide and the epsilon-amino group of lysine.	118
398902	pfam05498	RALF	Rapid ALkalinization Factor (RALF). RALF, a 5-kDa ubiquitous polypeptide in plants, arrests root growth and development.	65
398903	pfam05499	DMAP1	DNA methyltransferase 1-associated protein 1 (DMAP1). DNA methylation can contribute to transcriptional silencing through several transcriptionally repressive complexes, which include methyl-CpG binding domain proteins (MBDs) and histone deacetylases (HDACs). The chief enzyme that maintains mammalian DNA methylation, DNMT1, can also establish a repressive transcription complex. The non-catalytic amino terminus of DNMT1 binds to HDAC2 and DMAP1 (for DNMT1 associated protein), and can mediate transcriptional repression. DMAP1 has intrinsic transcription repressive activity, and binds to the transcriptional co-repressor TSG101. DMAP1 is targeted to replication foci through interaction with the far N-terminus of DNMT1 throughout S phase, whereas HDAC2 joins DNMT1 and DMAP1 only during late S phase, providing a platform for how histones may become deacetylated in heterochromatin following replication.	163
368473	pfam05501	DUF755	Domain of unknown function (DUF755). This family is predominated by ORFs from Circoviridae. The function of this family remains to be determined.	122
368474	pfam05502	Dynactin_p62	Dynactin p62 family. Dynactin is a multi-subunit complex and a required cofactor for most, or all, of the cellular processes powered by the microtubule-based motor cytoplasmic dynein. p62 binds directly to the Arp1 subunit of dynactin.	472
283219	pfam05503	Pox_G7	Poxvirus G7-like. 	367
398904	pfam05504	Spore_GerAC	Spore germination B3/ GerAC like, C-terminal. The GerAC protein of the Bacillus subtilis spore is required for the germination response to L-alanine. Members of this family are thought to be located in the inner spore membrane. Although the function of this family is unclear, they are likely to encode the components of the germination apparatus that respond directly to this germinant, mediating the spore's response.	167
398905	pfam05505	Ebola_NP	Ebola nucleoprotein. This family consists of Ebola and Marburg virus nucleoproteins. These proteins are responsible for encapsidation of genomic RNA. It has been found that nucleoprotein DNA vaccines can offer protection from the virus.	717
398906	pfam05506	DUF756	Domain of unknown function (DUF756). This domain is found, normally as a tandem repeat, at the C-terminus of bacterial phospholipase C proteins.	86
398907	pfam05507	MAGP	Microfibril-associated glycoprotein (MAGP). This family consists of several mammalian microfibril-associated glycoprotein (MAGP) 1 and 2 proteins. MAGP1 and 2 are components of elastic fibers. MAGP-1 has been proposed to bind a C-terminal region of tropoelastin, the soluble precursor of elastin. MAGP-2 was found to interact with fibrillin-1 and -2, as well as fibulin-1, another component of elastic fibers; this suggests that MAGP-2 may be important in the assembly of microfibrils.	133
398908	pfam05508	Ran-binding	RanGTP-binding protein. The small Ras-like GTPase Ran plays an essential role in the transport of macromolecules in and out of the nucleus and has been implicated in spindle and nuclear envelope formation during mitosis in higher eukaryotes. The S. cerevisiae ORF YGL164c encoding a novel RanGTP-binding protein, termed Yrb30p was identified. The protein competes with yeast RanBP1 (Yrb1p) for binding to the GTP-bound form of yeast Ran (Gsp1p) and is, like Yrb1p, able to form trimeric complexes with RanGTP and some of the karyopherins.	308
398909	pfam05509	TraY	TraY domain. This family consists of several enterobacterial TraY proteins. TraY is involved in bacterial conjugation where it is required for efficient nick formation in the F plasmid. These proteins have a ribbon-helix-helix fold and are likely to be DNA-binding proteins.	49
368478	pfam05510	Sarcoglycan_2	Sarcoglycan alpha/epsilon. Sarcoglycans are a subcomplex of transmembrane proteins which are part of the dystrophin-glycoprotein complex. They are expressed in the skeletal, cardiac and smooth muscle. Although numerous studies have been conducted on the sarcoglycan subcomplex in skeletal and cardiac muscle, the manner of the distribution and localization of these proteins along the nonjunctional sarcolemma is not clear. This family contains alpha and epsilon members.	385
398910	pfam05511	ATP-synt_F6	Mitochondrial ATP synthase coupling factor 6. Coupling factor 6 (F6) is a component of mitochondrial ATP synthase which is required for the interactions of the catalytic and proton-translocating segments.	96
398911	pfam05512	AWPM-19	AWPM-19-like family. Members of this family are 19 kDa membrane proteins. The levels of the plant protein AWPM-19 increase dramatically when there is an increased level of abscisic acid. The increased presence of this protein leads to greater tolerance of freezing.	142
283228	pfam05513	TraA	TraA. Conjugative transfer of a bacteriocin plasmid, pPD1, of Enterococcus faecalis is induced in response to a peptide sex pheromone, cPD1, secreted from plasmid-free recipient cells. cPD1 is taken up by a pPD1 donor cell and binds to an intracellular receptor, TraA. Once a recipient cell acquires pPD1, it starts to produce an inhibitor of cPD1, termed iPD1, which functions as a TraA antagonist and blocks self-induction in donor cells. TraA transduces the signal of cPD1 to the mating response.	120
283229	pfam05514	HR_lesion	HR-like lesion-inducing. Family of plant proteins that are associated with the hypersensitive response (HR) pathway of defense against plant pathogens.	138
283230	pfam05515	Viral_NABP	Viral nucleic acid binding. This family is common to ssRNA positive-strand viruses, and its members are commonly described as nucleic acid binding proteins (NABP).	190
398912	pfam05517	p25-alpha	p25-alpha. This family encodes a 25 kDa protein that is phosphorylated by a Ser/Thr-Pro kinase. It has been described as a brain specific protein, but it is found in Tetrahymena thermophila.	155
253234	pfam05518	Totivirus_coat	Totivirus coat protein. 	753
114252	pfam05520	Citrus_P18	Citrus tristeza virus P18 protein. 	167
377521	pfam05521	Phage_H_T_join	Phage head-tail joining protein. 	96
114254	pfam05522	Metallothio_6	Metallothionein. This family consists of metallothioneins from several worm and sea urchin species. Metallothioneins are low molecular weight, cysteine rich proteins known to be involved in heavy metal detoxification and homeostasis.	65
398913	pfam05523	FdtA	WxcM-like, C-terminal. This family includes FdtA from Aneurinibacillus thermoaerophilus, which has been characterized as a dtdp-6-deoxy-3,4-keto-hexulose isomerase. It also includes WxcM from Xanthomonas campestris (pv. campestris).	129
398914	pfam05524	PEP-utilizers_N	PEP-utilising enzyme, N-terminal. 	125
283235	pfam05525	Branch_AA_trans	Branched-chain amino acid transport protein. This family consists of several bacterial branched-chain amino acid transport proteins which are responsible for the transport of leucine, isoleucine and valine via proton motive force.	429
398915	pfam05526	R_equi_Vir	Rhodococcus equi virulence-associated protein. This family consists of several virulence-associated proteins from Rhodococcus equi. Rhodococcus equi is an important pulmonary pathogen of foals and is increasingly isolated from pneumonic infections and other infections in human immunodeficiency virus (HIV)-infected patients. Isolates from foals possess a large virulence plasmid, varying in size from 80 to 90 kb. Isolates lacking the plasmid are avirulent to foals. Little is known about the function of the plasmid apart from its encoding virulence-associated surface proteins.	177
398916	pfam05527	DUF758	Domain of unknown function (DUF758). Family of eukaryotic proteins with unknown function, which are induced by tumor necrosis factor.	155
283238	pfam05528	Coronavirus_5	Coronavirus gene 5 protein. Infectious bronchitis virus (IBV), a member of Coronaviridae family, has a single-stranded positive-sense RNA genome, which is 27 kb in length. Gene 5 contains two (5a and 5b) open reading frames. The function of the 5a and 5b proteins is unknown.	82
398917	pfam05529	Bap31	B-cell receptor-associated protein 31-like. Bap31 is a polytopic integral protein of the endoplasmic reticulum membrane and a substrate of caspase-8. Bap31 is cleaved within its cytosolic domain, generating pro-apoptotic p20 Bap31.	137
368485	pfam05531	NPV_P10	Nucleopolyhedrovirus P10 protein. This family consists of several nucleopolyhedrovirus P10 proteins which are thought to be involved in the morphogenesis of the polyhedra.	75
398918	pfam05532	CsbD	CsbD-like. CsbD is a bacterial general stress response protein. Its expression is mediated by sigma-B, an alternative sigma factor. The role of CsbD in stress response is unclear.	53
114264	pfam05533	Peptidase_C42	Beet yellows virus-type papain-like endopeptidase C42. Members of the Closteroviridae and Potyviridae families of plant positive-strand RNA viruses encode one or two papain-like leader proteinases, belonging to Merops peptidase family C42.	88
310261	pfam05534	HicB	HicB family. This family consists of several bacterial HicB related proteins. The function of HicB is unknown although it is thought to be involved in pilus formation. It has been speculated that HicB performs a function antagonistic to that of pili and yet is necessary for invasion of certain niches.	51
398919	pfam05535	Chromadorea_ALT	Chromadorea ALT protein. This family consists of several ALT protein homologs found in nematodes. Lymphatic filariasis is a major tropical disease caused by the mosquito borne nematodes Brugia and Wuchereria. About 120 million people are infected and at risk of lymphatic pathology such as acute lymphangitis and elephantiasis. Expression of alt-1 and alt-2 is initiated midway through development in the mosquito, peaking in the infective larva and declining sharply following entry into the host. ALT-1 and the closely related ALT-2 have been found to be strong candidates for a future vaccine against human filariasis.	77
336138	pfam05536	Neurochondrin	Neurochondrin. This family contains several eukaryotic neurochondrin proteins. Neurochondrin induces hydroxyapatite resorptive activity in bone marrow cells resistant to bafilomycin A1, an inhibitor of macrophage- and osteoclast-mediated resorption. Expression of the gene is localized to chondrocyte, osteoblast, and osteocyte in the bone and to the hippocampus and Purkinje cell layer of cerebellum in the brain.	605
283245	pfam05537	DUF759	Borrelia burgdorferi protein of unknown function (DUF759). This family consists of several uncharacterized proteins from the Lyme disease spirochete Borrelia burgdorferi.	429
368487	pfam05538	Campylo_MOMP	Campylobacter major outer membrane protein. This family consists of Campylobacter major outer membrane proteins. The major outer membrane protein (MOMP), a putative porin and a multifunction surface protein of Campylobacter jejuni, may play an important role in the adaptation of the organism to various host environments.	421
114270	pfam05539	Pneumo_att_G	Pneumovirinae attachment membrane glycoprotein G. 	408
398920	pfam05540	Serpulina_VSP	Serpulina hyodysenteriae variable surface protein. This family consists of several variable surface proteins from Serpulina hyodysenteriae.	394
253243	pfam05541	Spheroidin	Entomopoxvirus spheroidin protein. Entomopoxviruses (EPVs) are large (300-400 nm) oval-shaped viruses replicating in the cytoplasm of their insect host cells. At the end of their replicative cycle EPVs virions are occluded in a highly expressed protein called spheroidin. This protein forms large (5-20 mm long) oval-shaped occlusion bodies (OBs) called spherules. The infectious cycle of EPVs begins with the ingestion by the insect host of the spherules, their dissolution by the alkaline reducing conditions of the midgut fluid and the release of virions in the midgut lumen. The infective particles first replicate in midgut epithelial cells, then pass the gut barrier to colonise the internal tissues, mainly the fat body cells. Whilst spheroidin has been demonstrated to be non-essential for viral replication, it plays an essential role in the natural biological cycle of the virus in protecting virions from adverse environmental conditions (e.g. UV degradation) and thus improving transmission efficacy. In this respect, spheroidins are functionally similar to polyhedrins of baculoviruses or cypoviruses.	943
398921	pfam05542	DUF760	Protein of unknown function (DUF760). This family contains several uncharacterized plant proteins.	83
398922	pfam05543	Peptidase_C47	Staphopain peptidase C47. Staphopains are one of four major families of proteinases secreted by the Gram-positive Staphylococcus aureus. These staphylococcal cysteine proteases are secreted as preproenzymes that are proteolytically cleaved to generate the mature enzyme.	174
398923	pfam05544	Pro_racemase	Proline racemase. This family consists of proline racemase (EC 5.1.1.4) proteins which catalyze the interconversion of L- and D-proline in bacteria. This family also contains several similar eukaryotic proteins including Trypanosoma cruzi PA45-A, a protein with B-cell mitogenic properties which has been characterized as a co-factor-independent proline racemase.	325
398924	pfam05545	FixQ	Cbb3-type cytochrome oxidase component FixQ. This family consists of several Cbb3-type cytochrome oxidase components (FixQ/CcoQ). FixQ is found in nitrogen fixing bacteria. Since nitrogen fixation is an energy-consuming process, effective symbioses depend on operation of a respiratory chain with a high affinity for O2, closely coupled to ATP production. This requirement is fulfilled by a special three-subunit terminal oxidase (cytochrome terminal oxidase cbb3), which was first identified in Bradyrhizobium japonicum as the product of the fixNOQP operon.	49
398925	pfam05546	She9_MDM33	She9 / Mdm33 family. Members of this family are mitochondrial inner membrane proteins with a role in inner mitochondrial membrane organisation and biogenesis.	198
398926	pfam05547	Peptidase_M6	Immune inhibitor A peptidase M6. The insect pathogenic Gram-positive Bacillus thuringiensis secretes immune inhibitor A, a metallopeptidase, which specifically cleaves host antibacterial proteins. A homolog of immune inhibitor A, PrtV, has been identified in the Gram-negative human pathogen Vibrio cholerae.	644
336143	pfam05548	Peptidase_M11	Gametolysin peptidase M11. In the unicellular biflagellated alga, Chlamydomonas reinhardtii, gametolysin, a zinc-containing metallo-protease, is responsible for the degradation of the cell wall. Homologs of gametolysin have also been reported in the simple multicellular organism, Volvox.	303
253247	pfam05549	Allexi_40kDa	Allexivirus 40kDa protein. 	271
283255	pfam05550	Peptidase_C53	Pestivirus Npro endopeptidase C53. Unique to pestiviruses, the N-terminal protein encoded by the bovine viral diarrhoea virus genome is a cysteine protease (Npro) responsible for a self-cleavage that releases the N-terminus of the core protein. This unique protease is dispensable for viral replication, and its coding region can be replaced by a ubiquitin gene directly fused in frame to the core.	168
368495	pfam05551	zf-His_Me_endon	Zinc-binding loop region of homing endonuclease. This domain is the short zinc-binding loops region of a number of much longer chain homing endonucleases. Such loops are probably stabilized by the zinc and may be viewed as small but separate domains. The common structural feature of these domains is that at least three zinc ligands lie very close to each other in the sequence and are not incorporated into regular secondary structural elements. The biological roles played by these small zinc-binding domains are presently unknown.	131
398927	pfam05552	TM_helix	Conserved TM helix. This alignment represents a conserved transmembrane helix as well as some flanking sequence. It is often found in association with pfam00924.	50
398928	pfam05553	DUF761	Cotton fibre expressed protein. This family consists of several plant proteins of unknown function. Three of the sequences (from Gossypium hirsutum) in this family are described as cotton fibre expressed proteins. The remaining sequences, found in Arabidopsis thaliana, are uncharacterized.	35
147629	pfam05554	Novirhabdo_Nv	Viral hemorrhagic septicemia virus non-virion protein. This family consists of several viral hemorrhagic septicemia virus non-virion (Nv) proteins. The NV protein is a nonstructural protein absent from mature virions although it is present in infected cells. The function of this protein is unknown.	122
283259	pfam05555	DUF762	Coxiella burnetii protein of unknown function (DUF762). This family consists of several uncharacterized proteins from the bacterium Coxiella burnetii. Coxiella burnetii is the causative agent of Q fever.	244
398929	pfam05556	Calsarcin	Calcineurin-binding protein (Calsarcin). This family consists of several mammalian calcineurin-binding proteins. The calcium- and calmodulin-dependent protein phosphatase calcineurin has been implicated in the transduction of signals that control the hypertrophy of cardiac muscle and slow fibre gene expression in skeletal muscle. Calsarcin-1 and calsarcin-2 are expressed in developing cardiac and skeletal muscle during embryogenesis, but calsarcin-1 is expressed specifically in adult cardiac and slow-twitch skeletal muscle, whereas calsarcin-2 is restricted to fast skeletal muscle. Calsarcins represent a novel family of sarcomeric proteins that link calcineurin with the contractile apparatus, thereby potentially coupling muscle activity to calcineurin activation. Calsarcin-3, is expressed specifically in skeletal muscle and is enriched in fast-twitch muscle fibers. Like calsarcin-1 and calsarcin-2, calsarcin-3 interacts with calcineurin, and the Z-disc proteins alpha-actinin, gamma-filamin, and telethonin.	255
368498	pfam05557	MAD	Mitotic checkpoint protein. This family consists of several eukaryotic mitotic checkpoint (Mitotic arrest deficient or MAD) proteins. The mitotic spindle checkpoint monitors proper attachment of the bipolar spindle to the kinetochores of aligned sister chromatids and causes a cell cycle arrest in prometaphase when failures occur. Multiple components of the mitotic spindle checkpoint have been identified in yeast and higher eukaryotes. In S.cerevisiae, the existence of a Mad1-dependent complex containing Mad2, Mad3, Bub3 and Cdc20 has been demonstrated.	660
368499	pfam05558	DREPP	DREPP plasma membrane polypeptide. This family contains several plant plasma membrane proteins termed DREPPs as they are developmentally regulated plasma membrane polypeptides.	206
398930	pfam05559	DUF763	Protein of unknown function (DUF763). This family consists of several uncharacterized bacterial and archaeal proteins of unknown function.	312
114291	pfam05560	Bt_P21	Bacillus thuringiensis P21 molecular chaperone protein. This family contains several Bacillus thuringiensis P21 proteins. These proteins are thought to be molecular chaperones and have mosquitocidal properties.	182
310276	pfam05561	DUF764	Borrelia burgdorferi protein of unknown function (DUF764). This family consists of proteins of unknown function from Borrelia burgdorferi (Lyme disease spirochete).	182
398931	pfam05562	WCOR413	Cold acclimation protein WCOR413. This family consists of several WCOR413-like plant cold acclimation proteins.	181
398932	pfam05563	SpvD	Salmonella plasmid virulence protein SpvD. This family consists of several SpvD plasmid virulence proteins from different Salmonella species. The structure of the protein from Salmonella typhimurium has been solved and shows a papain-like fold, with a predicted catalytic triad of Cys73, His162 and Asp182. The protein has been shown to have deubiquitinating-like activity, releasing aminoluciferin (AML) from Ub-AML.	213
398933	pfam05564	Auxin_repressed	Dormancy/auxin associated protein. This family contains several plant dormancy-associated and auxin-repressed proteins, the functions of which are poorly understood.	117
398934	pfam05565	Sipho_Gp157	Siphovirus Gp157. This family contains both viral and bacterial proteins which are related to the Gp157 protein of the Streptococcus thermophilus SFi bacteriophages. It is thought that bacteria possessing the gene coding for this protein have an increased resistance to the bacteriophage.	162
283269	pfam05566	Pox_vIL-18BP	Orthopoxvirus interleukin 18 binding protein. Interleukin-18 (IL-18) is a proinflammatory cytokine that plays a key role in the activation of natural killer and T helper 1 cell responses principally by inducing interferon-gamma (IFN-gamma). Several poxvirus genes encode proteins with sequence similarity to IL-18BPs. It has been shown that vaccinia, ectromelia and cowpox viruses secrete from infected cells a soluble IL-18BP (vIL-18BP) that may modulate the host antiviral response. The expression of vIL-18BPs by distinct poxvirus genera that cause local or general viral dissemination, or persistent or acute infections in the host, emphasises the importance of IL-18 in response to viral infections.	126
398935	pfam05567	Neisseria_PilC	Neisseria PilC beta-propeller domain. This family consists of several PilC protein sequences from Neisseria gonorrhoeae and N. meningitidis. PilC is a phase-variable protein associated with pilus-mediated adherence of pathogenic Neisseria to target cells. This domain has been shown to adopt a beta-propeller structure.	411
114299	pfam05568	ASFV_J13L	African swine fever virus J13L protein. This family consists of several African swine fever virus J13L proteins.	189
310280	pfam05569	Peptidase_M56	BlaR1 peptidase M56. Production of beta-lactamase and penicillin-binding protein 2a (which mediate staphylococcal resistance to beta-lactam antibiotics) is regulated by a signal-transducing integral membrane protein and a transcriptional repressor. The signal transducer is a fusion protein with penicillin-binding and zinc metalloprotease domains. The signal for protein expression is transmitted by site-specific proteolytic cleavage of both the transducer, which auto-activates, and the repressor, which is inactivated, unblocking gene transcription. Homologs of this peptidase domain, which corresponds to Merops family M56, are also found in a number of other bacterial genome sequences.	299
114301	pfam05570	DUF765	Circovirus protein of unknown function (DUF765). This family consists of several short (27-30aa) porcine and bovine circovirus ORF6 proteins of unknown function.	29
398936	pfam05571	DUF766	Protein of unknown function (DUF766). This family consists of several eukaryotic proteins of unknown function.	292
368506	pfam05572	Peptidase_M43	Pregnancy-associated plasma protein-A. Pregnancy-associated plasma protein A (PAPP-A) is a metallo-protease belonging to Merops family M43. It cleaves insulin-like growth factor (IGF) binding protein-4 (IGFBP-4), causing a dramatic reduction in its affinity for IGF-I and -II. Through this mechanism, PAPP-A is a regulator of IGF bioactivity in several systems, including the human ovary and the cardiovascular system.	152
398937	pfam05573	NosL	NosL. NosL is one of the accessory proteins of the nos (nitrous oxide reductase) gene cluster. NosL is a monomeric protein of 18,540 MW that specifically and stoichiometrically binds Cu(I). The copper ion in NosL is ligated by a Cys residue, and one Met and one His are thought to serve as the other ligands. It is possible that NosL is a copper chaperone involved in metallo-centre assembly.	131
114305	pfam05575	V_cholerae_RfbT	Vibrio cholerae RfbT protein. This family consists of several RfbT proteins from Vibrio cholerae. It has been found that genetic alteration of the rfbT gene is responsible for serotype conversion of Vibrio cholerae O1 and determines the difference between the Ogawa and Inaba serotypes, in that the presence of rfbT is sufficient for Inaba-to-Ogawa serotype conversion.	286
283275	pfam05576	Peptidase_S37	PS-10 peptidase S37. These serine proteases have been found in Streptomyces species.	448
310284	pfam05577	Peptidase_S28	Serine carboxypeptidase S28. These serine proteases include several eukaryotic enzymes such as lysosomal Pro-X carboxypeptidase, dipeptidyl-peptidase II, and thymus-specific serine peptidase.	434
398938	pfam05578	Peptidase_S31	Pestivirus NS3 polyprotein peptidase S31. These serine peptidases are involved in processing of the flavivirus polyprotein.	211
253263	pfam05579	Peptidase_S32	Equine arteritis virus serine endopeptidase S32. Serine peptidases involved in processing nidovirus polyprotein.	297
398939	pfam05580	Peptidase_S55	SpoIVB peptidase S55. The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis.	210
398940	pfam05582	Peptidase_U57	YabG peptidase U57. YabG is a protease involved in the proteolysis and maturation of SpoIVA and YrbA proteins, conserved with the cortex and/or coat assembly by Bacillus subtilis.	280
283279	pfam05584	Sulfolobus_pRN	Sulfolobus plasmid regulatory protein. This family consists of several plasmid regulatory proteins from the extreme thermophilic and acidophilic archaea Sulfolobus.	72
147642	pfam05585	DUF1758	Putative peptidase (DUF1758). This is a family of nematode proteins of unknown function. However, it seems likely that these proteins act as aspartic peptidases.	164
398941	pfam05586	Ant_C	Anthrax receptor C-terminus region. This region is found in the putatively cytoplasmic C-terminus of the anthrax receptor.	93
398942	pfam05587	Anth_Ig	Anthrax receptor extracellular domain. This region is found in the putatively extracellular N-terminal half of the anthrax receptor. It is probably part of the Ig superfamily and most closely related to pfam01833 (personal obs: C Yeats).	100
283282	pfam05588	Botulinum_HA-17	Clostridium botulinum HA-17 domain. This family consists of several Clostridium botulinum hemagglutinin (HA) subcomponents. Clostridium botulinum type D strain 4947 produces two different sizes of progenitor toxins (M and L) as intact forms without proteolytic processing. The M toxin is composed of neurotoxin (NT) and nontoxic-nonhemagglutinin (NTNHA), whereas the L toxin is composed of the M toxin and hemagglutinin (HA) subcomponents (HA-70, HA-17, and HA-33).	145
398943	pfam05589	DUF768	Protein of unknown function (DUF768). This family consists of several uncharacterized hypothetical proteins from Rhizobium loti.	67
283284	pfam05590	DUF769	Xylella fastidiosa protein of unknown function (DUF769). This family consists of several uncharacterized hypothetical proteins of unknown function from Xylella fastidiosa, the organism that causes Pierce's disease in plants.	259
398944	pfam05591	T6SS_VipA	Type VI secretion system, VipA, VC_A0107 or Hcp2. VipA is a family of Gram-negative bacterial proteins that form part of the type VI pathogenic secretion system. Members have been variously defined as VC_A0107 family, Hcp2 and VipA, for ClpV-interacting proteins. VipB and VipA proteins interact very closely to form the shaft of the pathogenic penetrating needle system.	153
336149	pfam05592	Bac_rhamnosid	Bacterial alpha-L-rhamnosidase concanavalin-like domain. This family consists of bacterial rhamnosidase A and B enzymes. L-Rhamnose is abundant in biomass as a common constituent of glycolipids and glycosides, such as plant pigments, pectic polysaccharides, gums or biosurfactants. Some rhamnosides are important bioactive compounds. For example, terpenyl glycosides, the glycosidic precursor of aromatic terpenoids, act as important flavouring substances in grapes. Other rhamnosides act as cytotoxic rhamnosylated terpenoids, as signal substances in plants or play a role in the antigenicity of pathogenic bacteria.	102
398945	pfam05593	RHS_repeat	RHS Repeat. RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. The 3D structure of an RHS-repeat-containing protein (the B and C components of an ABC toxin complex) has been determined. The RHS repeats form an extended strip of beta-sheet that spirals around to form a hollow shell, encapsulating the variable C-terminal domain.	36
398946	pfam05594	Fil_haemagg	Haemagglutinin repeat. This highly divergent repeat occurs in a number of proteins implicated in cell aggregation. The Pfam alignment probably contains three such repeats (personal obs: C Yeats). These are likely to have a beta-helical structure.	69
398947	pfam05595	DUF771	Domain of unknown function (DUF771). Family of uncharacterized ORFs found in Bacteriophage and Lactococcus lactis.	90
398948	pfam05596	Taeniidae_ag	Taeniidae antigen. This family consists of several antigen proteins from Taenia and Echinococcus (tapeworm) species.	64
398949	pfam05597	Phasin	Poly(hydroxyalkanoate) granule associated protein (phasin). Polyhydroxyalkanoates (PHAs) are storage polyesters synthesized by various bacteria as intracellular carbon and energy reserve material. PHAs are accumulated as water-insoluble inclusions within the cells. This family consists of the phasins PhaF and PhaI, which act as transcriptional regulators of PHA biosynthesis genes. PhaF has been proposed to repress expression of the phaC1 gene and the phaIF operon.	126
398950	pfam05598	DUF772	Transposase domain (DUF772). This presumed domain is found at the N-terminus of many proteins found in transposons.	71
191311	pfam05599	Deltaretro_Tax	Deltaretrovirus Tax protein. This family consists of Rex/Tax proteins from human and simian T-cell leukaemia viruses. The exact function of these proteins is unknown. Tax is the viral transactivator; it is a nuclear phosphoprotein that interacts with CREB, coactivator CBP/p300 and PCAF to form a multiprotein complex, which activates viral LTR and stimulates virus expression. Tax is also involved in deregulated expression of numerous cellular genes leading to T-cell leukaemia. Rex is a nucleolar post-transcriptional regulator that facilitates export to the cytoplasm of unspliced or incompletely spliced viral RNA [personal communication, Dr. S Nicot].	87
398951	pfam05600	DUF773	Protein of unknown function (DUF773). This family contains several eukaryotic sequences which are thought to be CDK5 activator-binding proteins, however, the function of this family is unknown.	505
398952	pfam05602	CLPTM1	Cleft lip and palate transmembrane protein 1 (CLPTM1). This family consists of several eukaryotic cleft lip and palate transmembrane protein 1 sequences. Cleft lip with or without cleft palate is a common birth defect that is genetically complex. The nonsyndromic forms have been studied genetically using linkage and candidate-gene association studies with only partial success in defining the loci responsible for orofacial clefting. CLPTM1 encodes a transmembrane protein and has strong homology to two Caenorhabditis elegans genes, suggesting that CLPTM1 may belong to a new gene family. This family also contains the human cisplatin resistance related protein CRR9p which is associated with CDDP-induced apoptosis.	431
398953	pfam05603	DUF775	Protein of unknown function (DUF775). This family consists of several eukaryotic proteins of unknown function.	200
368516	pfam05604	DUF776	Protein of unknown function (DUF776). This family consists of several highly related mouse and human proteins of unknown function.	176
398954	pfam05605	zf-Di19	Drought induced 19 protein (Di19), zinc-binding. This family consists of several drought induced 19 (Di19) like proteins. Di19 has been found to be strongly expressed in both the roots and leaves of Arabidopsis thaliana during progressive drought. This domain is a zinc-binding domain.	54
368518	pfam05606	DUF777	Borrelia burgdorferi protein of unknown function (DUF777). This family consists of several hypothetical proteins of unknown function from Borrelia burgdorferi (Lyme disease spirochete).	135
368519	pfam05608	DUF778	Protein of unknown function (DUF778). This family consists of several eukaryotic proteins of unknown function.	136
398955	pfam05609	LAP1C	Lamina-associated polypeptide 1C (LAP1C). This family contains rat LAP1C proteins and several uncharacterized highly related sequences from both mice and humans. LAP1s (lamina-associated polypeptide 1s) are type 2 integral membrane proteins with a single membrane-spanning region of the inner nuclear membrane. LAP1s bind to both A- and B-type lamins and have a putative role in the membrane attachment and assembly of the nuclear lamina.	446
398956	pfam05610	DUF779	Protein of unknown function (DUF779). This family consists of several bacterial proteins of unknown function.	94
368521	pfam05611	DUF780	Caenorhabditis elegans protein of unknown function (DUF780). This family consists of several short C. elegans proteins of unknown function.	71
398957	pfam05612	Leg1	Leg1. Protein liver-enriched gene 1 (Leg1) has been suggested to function as a novel secreted regulator for the liver development.	331
283304	pfam05613	Herpes_U15	Human herpesvirus U15 protein. 	110
114342	pfam05614	DUF782	Circovirus protein of unknown function (DUF782). This family consists of porcine and bovine circovirus proteins of unknown function.	104
398958	pfam05615	THOC7	Tho complex subunit 7. The Tho complex is involved in transcription elongation and mRNA export from the nucleus.	135
283306	pfam05616	Neisseria_TspB	Neisseria meningitidis TspB protein. This family consists of several Neisseria meningitidis TspB virulence factor proteins.	517
398959	pfam05617	Prolamin_like	Prolamin-like. Prolamin_like (in which DUF784 and DUF1278 have been merged) is found to be expressed in the plant embryo sac and to be regulated by the Myb98 transcription factor. Computational analysis has revealed that members are homologous to the plant prolamin superfamily (Protease inhibitor-seed storage-LTP family, pfam00234). In contrast to typical prolamin members that have eight conserved Cys residues forming four pairs of disulfide bonds, this domain contains only six conserved Cys residues that may form three pairs of disulfide bonds. The domain may have a potential function in lipid transfer or protection during plant embryo sac development and reproduction.	70
283308	pfam05618	Zn_protease	Putative ATP-dependent zinc protease. Proteins in this family are annotated as being ATP-dependent zinc proteases.	138
310307	pfam05619	DUF787	Borrelia burgdorferi protein of unknown function (DUF787). This family consists of several hypothetical proteins of unknown function from Borrelia burgdorferi (Lyme disease spirochete).	369
398960	pfam05620	DUF788	Protein of unknown function (DUF788). This family consists of several eukaryotic proteins of unknown function.	167
398961	pfam05621	TniB	Bacterial TniB protein. This family consists of several bacterial TniB NTP-binding proteins. TniB is a probable ATP-binding protein which is involved in Tn5053 mercury resistance transposition.	189
398962	pfam05622	HOOK	HOOK protein. This family consists of several HOOK1, 2 and 3 proteins from different eukaryotic organisms. The different members of the human gene family are HOOK1, HOOK2 and HOOK3. Different domains have been identified in the three human HOOK proteins, and it was demonstrated that the highly conserved NH2-domain mediates attachment to microtubules, whereas the central coiled-coil motif mediates homodimerization and the more divergent C-terminal domains are involved in binding to specific organelles (organelle-binding domains). It has been demonstrated that endogenous HOOK3 binds to Golgi membranes, whereas both HOOK1 and HOOK2 are localized to discrete but unidentified cellular structures. In mice the Hook1 gene is predominantly expressed in the testis. Hook1 function is necessary for the correct positioning of microtubular structures within the haploid germ cell. Disruption of Hook1 function in mice causes abnormal sperm head shape and fragile attachment of the flagellum to the sperm head.	526
398963	pfam05623	DUF789	Protein of unknown function (DUF789). This family consists of several plant proteins of unknown function.	294
398964	pfam05624	LSR	Lipolysis stimulated receptor (LSR). The lipolysis-stimulated receptor (LSR) is a lipoprotein receptor primarily expressed in the liver and activated by free fatty acids. It is thought to be involved in the clearance of triglyceride-rich lipoproteins, and has been shown in mice to be critical for liver and embryonic development.	48
398965	pfam05625	PAXNEB	PAXNEB protein. PAXNEB or PAX6 neighbor is found in several eukaryotic organisms. PAXNEB is an RNA polymerase II Elongator protein subunit. It is part of the HAP subcomplex of Elongator, which is a six-subunit component of the RNA polymerase II holoenzyme. The HAP subcomplex is required for Elongator structural integrity and histone acetyltransferase activity. This protein family has a P-loop motif. However, its sequence has degraded in many members of the family.	358
398966	pfam05626	DUF790	Protein of unknown function (DUF790). This family consists of several hypothetical archaeal proteins of unknown function.	386
398967	pfam05627	AvrRpt-cleavage	Cleavage site for pathogenic type III effector avirulence factor Avr. This domain is conserved in small families of otherwise unrelated proteins in both mono-cots and di-cots, suggesting that it has a conserved, plant-specific function. It is found both in the plant RIN4 (resistance R membrane-bound host-target protein) where it appears to contribute to the binding of the protein to both RCS (AvrRpt2 auto-cleavage site) and AvrB, the virulence factor from the infecting bacterium. The cleavage site for the AvrRpt2 avirulence protein would appear to be the sequence motifs VPQFGDW and LPKFGEW, both of which are highly conserved within the domain.	36
310316	pfam05628	Borrelia_P13	Borrelia membrane protein P13. This family consists of P13 proteins from Borrelia species. P13 is a 13kDa integral membrane protein which is post-translationally processed at both ends and modified by an unknown mechanism.	138
283319	pfam05629	Nanovirus_C8	Nanovirus component 8 (C8) protein. This family consists of a group of 17.4 kDa nanovirus proteins which are highly related to the faba bean necrotic yellows virus component 8 protein whose function is unknown.	154
398968	pfam05630	NPP1	Necrosis inducing protein (NPP1). This family consists of several NPP1-like necrosis inducing proteins from oomycetes, fungi and bacteria. Infiltration of NPP1 into leaves of Arabidopsis thaliana plants results in transcript accumulation of pathogenesis-related (PR) genes, production of ROS and ethylene, callose apposition, and HR-like cell death.	198
283321	pfam05631	MFS_5	Sugar-transporters, 12 TM. MFS_5 is a family of sugar-transporters from both prokaryotes and eukaryotes.	356
368533	pfam05632	DUF792	Borrelia burgdorferi protein of unknown function (DUF792). This family consists of several hypothetical proteins from the Lyme disease spirochete Borrelia burgdorferi.	184
283323	pfam05633	BPS1	Protein BYPASS1-related. This family consists of several plant proteins and includes BYPASS1, which is required for normal root and shoot development. This protein prevents constitutive production of a root mobile carotenoid-derived signaling compound that is capable of arresting shoot and leaf development.	386
398969	pfam05634	APO_RNA-bind	APO RNA-binding. This domain contains conserved cysteine and histidine residues. It resembles zinc fingers, and binds to zinc. This domain functions as an RNA-binding domain.	194
398970	pfam05635	23S_rRNA_IVP	23S rRNA-intervening sequence protein. This family consists of bacterial proteins encoded within an intervening sequence present within some 23S rRNA genes. It folds into an anti-parallel four-helix bundle and forms homopentamers.	106
398971	pfam05636	HIGH_NTase1	HIGH Nucleotidyl Transferase. This family consists of HIGH Nucleotidyl Transferases.	393
368535	pfam05637	Glyco_transf_34	galactosyl transferase GMA12/MNN10 family. This family contains a number of glycosyltransferase enzymes that contain a DXD motif. This family includes a number of C. elegans homologs where the DXD is replaced by DXH. Some members of this family are included in glycosyltransferase family 34.	238
398972	pfam05638	T6SS_HCP	Type VI secretion system effector, Hcp. HCP is a family of proteins which are expressed in up to 1000 copies in Gram-negative bacteria. Together these copies aggregate into a needle-like shaft or tube that will penetrate other bacteria via a puncturing protein attached to its head. Initially Hcp forms a hexameric structure with a central channel of 40 Angstroms. These hexamers pile up one on top of each other forming nanotubes resembling the gp19 tail phage tube.	129
368536	pfam05639	Pup	Pup-like protein. This family consists of several short bacterial proteins formerly known as DUF797. It was recently shown that Mycobacterium tuberculosis contains a small protein, Pup (Rv2111c), that is covalently conjugated to the e-NH2 groups of lysines on several target proteins (pupylation) such as the malonyl CoA acyl carrier protein (FabD). Pupylation of FabD was shown to result in its recruitment to the mycobacterial proteasome and subsequent degradation analogous to eukaryotic ubiquitin-conjugated proteins. Searches recovered Pup orthologs in all major actinobacteria lineages including the basal bifidobacteria and also sporadically in certain other bacterial lineages. The Pup proteins were all between 50-90 residues in length and a multiple alignment shows that they all contain a conserved motif with a G [EQ] signature at the C-terminus. Thus, all of them are suitable for conjugation via the terminal glutamate or the deamidated glutamine (as shown in the case of the Mycobacterium Pup). The conserved globular core of Pup is predicted to form a bihelical unit with the extreme C-terminal 6-7 residues forming a tail in the extended conformation. Thus, Pup is structurally unrelated to the ubiquitin fold and has convergently evolved the function of protein modifier.	64
398973	pfam05640	NKAIN	Na,K-Atpase Interacting protein. NKAIN (Na,K-Atpase INteracting) proteins are a family of evolutionarily conserved transmembrane proteins that localize to neurons, that are critical for neuronal function, and that interact with the beta subunits, beta1 in vertebrates and beta in Drosophila, of Na,K-ATPase. NKAINs have highly conserved trans-membrane domains but otherwise no other characterized domains. NKAINs may function as subunits of pore or channel structures in neurons or they may affect the function of other membrane proteins. They are likely to function within the membrane bilayer.	197
398974	pfam05641	Agenet	Agenet domain. This domain is related to the TUDOR domain pfam00567. The function of the agenet domain is unknown. This family now matches both Agenet domains in the FMR proteins.	61
253298	pfam05642	Sporozoite_P67	Sporozoite P67 surface antigen. This family consists of several Theileria P67 surface antigens. A stage specific surface antigen of Theileria parva, p67, is the basis for the development of an anti-sporozoite vaccine for the control of East Coast fever (ECF) in cattle. The antigen has been shown to contain five distinct linear peptide sequences recognized by sporozoite-neutralising murine monoclonal antibodies.	727
398975	pfam05643	DUF799	Putative bacterial lipoprotein (DUF799). This family consists of several bacterial proteins of unknown function. Some of the family members are described as putative lipoproteins.	185
398976	pfam05644	Miff	Mitochondrial and peroxisomal fission factor Mff. This protein has a role in mitochondrial and peroxisomal fission.	292
398977	pfam05645	RNA_pol_Rpc82	RNA polymerase III subunit RPC82. This family consists of several DNA-directed RNA polymerase III polypeptides which are related to the Saccharomyces cerevisiae RPC82 protein. RNA polymerase C (III) promotes the transcription of tRNA and 5S RNA genes. In Saccharomyces cerevisiae, the enzyme is composed of 15 subunits, ranging from 160 to about 10 kDa.	233
398978	pfam05647	Epiglycanin_TR	Tandem-repeating region of mucin, epiglycanin-like. The unusual mucin, epiglycanin, is membrane-bound at the C-terminus but has a long region of this tandem-repeat at the N-terminus. It was the first mucin identified to be associated with the malignant behaviour of carcinoma cells. Mouse Muc21/epiglycanin is thought to be a highly glycosylated molecule, which makes it likely that its function is dependent on its glycoforms. Cells expressing Muc21 are significantly less adherent to each other and to extracellular matrix components than control cells, and this loss of adhesion is mediated by the TR portion of Muc21. This family also now contains the repeat that was the C. elegans protein of unknown function (DUF801).	67
398979	pfam05648	PEX11	Peroxisomal biogenesis factor 11 (PEX11). This family consists of several peroxisomal biogenesis factor 11 (PEX11) proteins from several eukaryotic species. The PEX11 peroxisomal membrane proteins promote peroxisome division in multiple eukaryotes.	223
398980	pfam05649	Peptidase_M13_N	Peptidase family M13. M13 peptidases are well-studied proteases found in a wide range of organisms including mammals and bacteria. In mammals they participate in processes such as cardiovascular development, blood-pressure regulation, nervous control of respiration, and regulation of the function of neuropeptides in the central nervous system. In bacteria they may be used for digestion of milk.	380
398981	pfam05650	DUF802	Domain of unknown function (DUF802). This region is found as two or more repeats in a small number of hypothetical proteins.	53
398982	pfam05651	Diacid_rec	Putative sugar diacid recognition. This region is found in several proteins characterized as carbohydrate diacid regulators. An HTH DNA-binding motif is found at the C-terminus of these proteins suggesting that this region includes the sugar recognition region.	131
398983	pfam05652	DcpS	Scavenger mRNA decapping enzyme (DcpS) N-terminal. This family consists of several scavenger mRNA decapping enzymes (DcpS) and is the N-terminal domain of these proteins. DcpS is a scavenger pyrophosphatase that hydrolyzes the residual cap structure following 3' to 5' decay of an mRNA. The association of DcpS with 3' to 5' exonuclease exosome components suggests that these two activities are linked and there is a coupled exonucleolytic decay-dependent decapping pathway.	103
398984	pfam05653	Mg_trans_NIPA	Magnesium transporter NIPA. NIPA (nonimprinted in Prader-Willi/Angelman syndrome) is a family of integral membrane proteins which function as magnesium transporters.	295
398985	pfam05655	AvrD	Pseudomonas avirulence D protein (AvrD). This family consists of several avirulence D (AvrD) proteins primarily found in Pseudomonas syringae.	330
398986	pfam05656	DUF805	Protein of unknown function (DUF805). This family consists of several bacterial proteins of unknown function.	108
368546	pfam05657	DUF806	Protein of unknown function (DUF806). This family consists of several Siphovirus and Lactococcus proteins of unknown function. The viral sequences are thought to be tail component proteins.	121
398987	pfam05658	YadA_head	Head domain of trimeric autotransporter adhesin. This seven residue repeat makes up the majority sequence of a family of bacterial haemagglutinins and invasins. The representative alignment contains four repeats.	27
398988	pfam05659	RPW8	Arabidopsis broad-spectrum mildew resistance protein RPW8. This family consists of several broad-spectrum mildew resistance proteins from Arabidopsis thaliana. Plant disease resistance (R) genes control the recognition of specific pathogens and activate subsequent defense responses. The Arabidopsis thaliana locus Resistance To Powdery Mildew 8 (RPW8) contains two naturally polymorphic, dominant R genes, RPW8.1 and RPW8.2, which individually control resistance to a broad range of powdery mildew pathogens. They induce localized, salicylic acid-dependent defenses similar to those induced by R genes that control specific resistance. Apparently, broad-spectrum resistance mediated by RPW8 uses the same mechanisms as specific resistance.	139
283346	pfam05660	DUF807	Coxiella burnetii protein of unknown function (DUF807). This family consists of several proteins of unknown function from Coxiella burnetii (the causative agent of a zoonotic disease called Q fever).	142
398989	pfam05661	DUF808	Protein of unknown function (DUF808). This family consists of several bacterial proteins of unknown function.	299
398990	pfam05662	YadA_stalk	Coiled stalk of trimeric autotransporter adhesin. This short motif is found in invasins and haemagglutinins, normally associated with (pfam05658).	43
147685	pfam05663	DUF809	Protein of unknown function (DUF809). This family consists of several proteins of unknown function from Raphanus sativus (Radish) and Brassica napus (Rape).	138
398991	pfam05664	DUF810	Plant family of unknown function (DUF810). This family is found in plant-symbionts and pathogens of the alpha-, beta- and gamma-Proteobacteria, but is not known in any other organism. It represents a candidate family for involvement in interactions with plants, or it may at least play a role in plant-associated lifestyles.	678
368549	pfam05666	Fels1	Fels-1 Prophage Protein-like. 	42
398992	pfam05667	DUF812	Protein of unknown function (DUF812). This family consists of several eukaryotic proteins of unknown function.	590
398993	pfam05669	Med31	SOH1. The family consists of Saccharomyces cerevisiae SOH1 homologs. SOH1 is responsible for the repression of temperature sensitive growth of the HPR1 mutant and has been found to be a component of the RNA polymerase II transcription complex. SOH1 not only interacts with factors involved in DNA repair, but transcription as well. Thus, the SOH1 protein may serve to couple these two processes.	94
398994	pfam05670	DUF814	Domain of unknown function (DUF814). This domain occurs in proteins that have been annotated as Fibronectin/fibrinogen binding protein by similarity. This annotation comes from Bacillus subtilis YloA, where the N-terminal region is involved in this activity. Hence the activity of this C-terminal domain is unknown. This domain contains a conserved motif D/E-X-W/Y-X-H that may be functionally important.	111
398995	pfam05671	GETHR	GETHR pentapeptide repeat (5 copies). This pentapeptide repeat is found mainly in C. elegans. The most conserved amino acid at each position leads to its name GETHR (Bateman A unpublished obs.). The family also includes a divergent repeat in a microneme protein. The function of this repeat is unknown.	25
398996	pfam05672	MAP7	MAP7 (E-MAP-115) family. The organisation of microtubules varies with the cell type and is presumably controlled by tissue-specific microtubule-associated proteins (MAPs). The 115-kDa epithelial MAP (E-MAP-115/MAP7) has been identified as a microtubule-stabilizing protein predominantly expressed in cell lines of epithelial origin. The binding of this microtubule associated protein is nucleotide independent.	163
398997	pfam05673	DUF815	Protein of unknown function (DUF815). This family consists of several bacterial proteins of unknown function.	250
283357	pfam05674	DUF816	Baculovirus protein of unknown function (DUF816). This family includes proteins that are about 200 amino acids in length. The proteins are all from baculoviruses. This family includes ORF107 from Orgyia pseudotsugata multicapsid polyhedrosis virus (OpMNPV) and a variety of other numbered ORF proteins, such as ORF52, ORF140. The function of these proteins is unknown.	176
398998	pfam05675	DUF817	Protein of unknown function (DUF817). This family consists of several bacterial proteins of unknown function.	234
398999	pfam05676	NDUF_B7	NADH-ubiquinone oxidoreductase B18 subunit (NDUFB7). This family consists of several NADH-ubiquinone oxidoreductase B18 subunit proteins from different eukaryotic organisms. Oxidative phosphorylation is the well-characterized process in which ATP, the principal carrier of chemical energy of individual cells, is produced due to a mitochondrial proton gradient formed by the transfer of electrons from NADH and FADH2 to molecular oxygen. The oxidative phosphorylation (OXPHOS) system is located in the mitochondrial inner membrane and consists of five multi-subunit enzyme complexes and two small electron carriers: coenzyme Q10 and cytochrome C. At least 70 structural proteins involved in the formation of the whole OXPHOS system are encoded by nuclear genes, whereas 13 structural proteins are encoded by the mitochondrial genome. Deficiency of NADH ubiquinone oxidoreductase, the first enzyme complex of the mitochondrial respiratory chain, is one of the most frequent causes of human mitochondrial encephalomyopathies.	60
253315	pfam05677	DUF818	Chlamydia CHLPS protein (DUF818). This family consists of several Chlamydia CHLPS proteins, the function of which is unknown.	364
399000	pfam05678	VQ	VQ motif. This short motif is found in a variety of plant proteins. These proteins vary greatly in length and are mostly composed of low complexity regions. They all conserve a short motif FXhVQChTG, where X is any amino acid and h is a hydrophobic amino acid. The function of this motif is uncertain, however one protein in this family has been found to bind the SigA sigma factor. It would seem plausible that this motif is needed for this activity and that this whole family might be involved in modulating plastid sigma factors (Bateman A pers. obs.).	28
399001	pfam05679	CHGN	Chondroitin N-acetylgalactosaminyltransferase. 	501
368558	pfam05680	ATP-synt_E	ATP synthase E chain. This family consists of several ATP synthase E chain sequences which are components of the CF(0) subunit.	83
399002	pfam05681	Fumerase	Fumarate hydratase (Fumerase). This family consists of several bacterial fumarate hydratase proteins FumA and FumB. Fumarase, or fumarate hydratase (EC 4.2.1.2), is a component of the citric acid cycle. In facultative anaerobes such as Escherichia coli, fumarase also engages in the reductive pathway from oxaloacetate to succinate during anaerobic growth. Three fumarases, FumA, FumB, and FumC, have been reported in E. coli. fumA and fumB genes are homologous and encode products of identical sizes which form thermolabile dimers of Mr 120,000. FumA and FumB are class I enzymes and are members of the iron-dependent hydrolases, which include aconitase and malate hydratase. The active FumA contains a 4Fe-4S centre, and it can be inactivated upon oxidation to give a 3Fe-4S centre.	267
399003	pfam05683	Fumerase_C	Fumarase C-terminus. This family consists of the C terminal region of several bacterial fumarate hydratase proteins (FumA and FumB). Fumarase, or fumarate hydratase (EC 4.2.1.2), is a component of the citric acid cycle. In facultative anaerobes such as Escherichia coli, fumarase also engages in the reductive pathway from oxaloacetate to succinate during anaerobic growth.	204
114410	pfam05684	DUF819	Protein of unknown function (DUF819). This family contains proteins of unknown function from archaeal, bacterial and plant species.	379
399004	pfam05685	Uma2	Putative restriction endonuclease. This family consists of hypothetical proteins that are greatly expanded in cyanobacteria. The proteins are found sporadically in other bacteria. A small number of member proteins also contain pfam02861 domains that are involved in protein interactions. Solutions of several structures for members of this family show that it is likely to be acting as an endonuclease.	168
310354	pfam05686	Glyco_transf_90	Glycosyl transferase family 90. This family of glycosyl transferases are specifically (mannosyl) glucuronoxylomannan/galactoxylomannan -beta 1,2-xylosyltransferases, EC:2.4.2.-.	396
399005	pfam05687	BES1_N	BES1/BZR1 plant transcription factor, N-terminal. This family consists of the N terminal regions of several plant transcription factors. It is classified as BES1/BZR1, a plant-specific transcription factor that cooperates with transcription factors such as BIM1 to regulate brassinosteroid-induced genes.	143
399006	pfam05688	DUF824	Salmonella repeat of unknown function (DUF824). This family consists of several repeated sequences of around 45 residues.	108
399007	pfam05689	DUF823	Salmonella repeat of unknown function (DUF823). This family consists of a series of repeated sequences (of around 180 residues) which are found in Salmonella typhimurium and Salmonella typhi. Sequences from this family are almost always found with pfam05688.	135
399008	pfam05690	ThiG	Thiazole biosynthesis protein ThiG. This family consists of several bacterial thiazole biosynthesis protein G sequences. ThiG, together with ThiF and ThiH, is proposed to be involved in the synthesis of 4-methyl-5-(b-hydroxyethyl)thiazole (THZ) which is an intermediate in the thiazole production pathway. This family also includes triosephosphate isomerase and pyridoxal 5'-phosphate synthase subunit PdxS.	247
283371	pfam05691	Raffinose_syn	Raffinose synthase or seed imbibition protein Sip1. This family consists of several raffinose synthase proteins, also known as seed imbibition (Sip1) proteins. Raffinose (O-alpha- D-galactopyranosyl- (1-->6)- O-alpha- D-glucopyranosyl-(1<-->2)- O-beta- D-fructofuranoside) is a widespread oligosaccharide in plant seeds and other tissues. Raffinose synthase (EC:2.4.1.82) is the key enzyme that channels sucrose into the raffinose oligosaccharide pathway. Raffinose family oligosaccharides (RFOs) are ubiquitous in plant seeds and are thought to play critical roles in the acquisition of tolerance to desiccation and seed longevity. Raffinose synthases are alkaline alpha-galactosidases and are solely responsible for RFO breakdown in germinating maize seeds, whereas acidic galactosidases appear to have other functions. Glycoside hydrolase family 36 can be split into 11 families, GH36A to GH36K. This family includes enzymes from GH36C.	749
368561	pfam05692	Myco_haema	Mycoplasma haemagglutinin. This family consists of several haemagglutinin sequences from Mycoplasma synoviae and Mycoplasma gallisepticum. The major plasma membrane proteins, pMGAs, of Mycoplasma gallisepticum are cell adhesin (hemagglutinin) molecules. It has been shown that the genetic determinants that code for the haemagglutinins are organized into a large family of genes and that only one of these genes is predominately expressed in any given strain.	424
399009	pfam05693	Glycogen_syn	Glycogen synthase. This family consists of the eukaryotic glycogen synthase proteins GYS1, GYS2 and GYS3. Glycogen synthase (GS) is the enzyme responsible for the synthesis of alpha-1,4-linked glucose chains in glycogen. It is the rate limiting enzyme in the synthesis of the polysaccharide, and its activity is highly regulated through phosphorylation at multiple sites and also by allosteric effectors, mainly glucose 6-phosphate (G6P).	639
399010	pfam05694	SBP56	56kDa selenium binding protein (SBP56). This family consists of several eukaryotic selenium binding proteins as well as three sequences from archaea. The exact function of this protein is unknown although it is thought that SBP56 participates in late stages of intra-Golgi protein transport. The Lotus japonicus homolog of SBP56, LjSBP is thought to have more than one physiological role and can be implicated in controlling the oxidation/reduction status of target proteins, in vesicular Golgi transport.	453
283375	pfam05695	DUF825	Plant protein of unknown function (DUF825). This family consists of several plant proteins greater than 1000 residues in length. The function of this family is unknown.	1486
368563	pfam05696	DUF826	Protein of unknown function (DUF826). This family consists of several enterobacterial and siphoviral sequences of unknown function.	65
399011	pfam05697	Trigger_N	Bacterial trigger factor protein (TF). In the E. coli cytosol, a fraction of the newly synthesized proteins requires the activity of molecular chaperones for folding to the native state. The major chaperones implicated in this folding process are the ribosome-associated Trigger Factor (TF), and the DnaK and GroEL chaperones with their respective co-chaperones. Trigger Factor is an ATP-independent chaperone and displays chaperone and peptidyl-prolyl-cis-trans-isomerase (PPIase) activities in vitro. It is composed of at least three domains, an N-terminal domain which mediates association with the large ribosomal subunit, a central substrate binding and PPIase domain with homology to FKBP proteins, and a C-terminal domain of unknown function. The positioning of TF at the peptide exit channel, together with its ability to interact with nascent chains as short as 57 residues renders TF a prime candidate for being the first chaperone that binds to the nascent polypeptide chains. This family represents the N-terminal region of the protein.	144
399012	pfam05698	Trigger_C	Bacterial trigger factor protein (TF) C-terminus. In the E. coli cytosol, a fraction of the newly synthesized proteins requires the activity of molecular chaperones for folding to the native state. The major chaperones implicated in this folding process are the ribosome-associated Trigger Factor (TF), and the DnaK and GroEL chaperones with their respective co-chaperones. Trigger Factor is an ATP-independent chaperone and displays chaperone and peptidyl-prolyl-cis-trans-isomerase (PPIase) activities in vitro. It is composed of at least three domains, an N-terminal domain which mediates association with the large ribosomal subunit, a central substrate binding and PPIase domain with homology to FKBP proteins, and a C-terminal domain of unknown function. The positioning of TF at the peptide exit channel, together with its ability to interact with nascent chains as short as 57 residues renders TF a prime candidate for being the first chaperone that binds to the nascent polypeptide chains. This family represents the C-terminal region of the protein.	162
399013	pfam05699	Dimer_Tnp_hAT	hAT family C-terminal dimerization region. This dimerization region is found at the C-terminus of the transposases of elements belonging to the Activator superfamily (hAT element superfamily). The isolated dimerization region forms extremely stable dimers in vitro.	83
399014	pfam05700	BCAS2	Breast carcinoma amplified sequence 2 (BCAS2). This family consists of several eukaryotic sequences of unknown function. The mammalian members of this family are annotated as breast carcinoma amplified sequence 2 (BCAS2) proteins. BCAS2 is a putative spliceosome associated protein.	204
399015	pfam05701	WEMBL	Weak chloroplast movement under blue light. WEMBL consists of several plant proteins required for the chloroplast avoidance response under high intensity blue light. This avoidance response consists in the relocation of chloroplasts on the anticlinal side of exposed cells. WEMBL acts in association with PMI2 to maintain the velocity of chloroplast photo-relocation movement via the regulation of cp-actin filaments. Thus several member-sequences are described as "myosin heavy chain-like".	562
368568	pfam05702	Herpes_UL49_5	Herpesvirus UL49.5 envelope/tegument protein. UL49.5 protein consists of 98 amino acids with a calculated molecular mass of 10,155 Da. It contains putative signal peptide and transmembrane domains but lacks a consensus sequence for N glycosylation. UL49.5 protein is an O-glycosylated structural component of the viral envelope.	98
399016	pfam05703	Auxin_canalis	Auxin canalisation. This domain is frequently found at the N-terminus of proteins containing pfam08458 at the C-terminus. It is a component of the auto-regulatory loop which enables auxin canalisation by recruitment of the PIN1 auxin efflux protein to the cell membrane.	258
368570	pfam05704	Caps_synth	Capsular polysaccharide synthesis protein. This family consists of several capsular polysaccharide proteins. Capsular polysaccharide (CPS) is a major virulence factor in Streptococcus pneumoniae.	278
399017	pfam05705	DUF829	Eukaryotic protein of unknown function (DUF829). This family consists of several uncharacterized eukaryotic proteins.	240
399018	pfam05706	CDKN3	Cyclin-dependent kinase inhibitor 3 (CDKN3). This family consists of cyclin-dependent kinase inhibitor 3 or kinase associated phosphatase proteins from several mammalian species. The cyclin-dependent kinase (Cdk)-associated protein phosphatase (KAP) is a human dual specificity protein phosphatase that dephosphorylates Cdk2 on threonine 160 in a cyclin-dependent manner.	168
283385	pfam05707	Zot	Zonular occludens toxin (Zot). This family consists of bacterial and viral proteins which are very similar to the Zonular occludens toxin (Zot). Zot is elaborated by bacteriophages present in toxigenic strains of Vibrio cholerae. Zot is a single polypeptide chain of 44.8 kDa, with the ability to reversibly alter intestinal epithelial tight junctions, allowing the passage of macromolecules through mucosal barriers.	195
399019	pfam05708	Peptidase_C92	Permuted papain-like amidase enzyme, YaeF/YiiX, C92 family. Amidase_YiiX is a family of permuted papain-like amidases. It has amidase specificity for the amide bond between a lipid and an amino acid (or peptide). From the structure, a tetramer, each monomer is made up of a layered alpha-beta fold with a central, 6-stranded, antiparallel beta-sheet that is protected by helices on either side. The catalytic Cys154 in UniProtKB:Q74NK7, Structure 3kw0, is located on the N-terminus of helix alphaF. The two additional helices located above Cys154 contribute to the formation of the active site, where the lysine ligand is bound.	157
399020	pfam05709	Sipho_tail	Phage tail protein. This family consists of several Siphovirus and other phage tail component proteins as well as some bacterial proteins of unknown function.	258
283388	pfam05710	Coiled	Coiled coil. This region is found in a group of Dictyostelium discoideum proteins. It is likely to form a coiled-coil. Some of the proteins are regulated by cyclic AMP and are expressed late in development.	90
399021	pfam05711	TylF	Macrocin-O-methyltransferase (TylF). This family consists of bacterial macrocin O-methyltransferase (TylF) proteins. TylF is responsible for the methylation of macrocin to produce tylosin. Tylosin is a macrolide antibiotic used in veterinary medicine to treat infections caused by Gram-positive bacteria and as an animal growth promoter in the swine industry. It is produced by several Streptomyces species. As with other macrolides, the antibiotic activity of tylosin is due to the inhibition of protein biosynthesis by a mechanism that involves the binding of tylosin to the ribosome, preventing the formation of the mRNA-aminoacyl-tRNA-ribosome complex. The structure of one representative sequence from this family, NovP, shows it to be an S-adenosyl-l-methionine-dependent O-methyltransferase that catalyzes the penultimate step in the biosynthesis of the aminocoumarin antibiotic novobiocin. Specifically, it methylates at 4-OH of the noviose moiety, and the resultant methoxy group is important for the potency of the mature antibiotic. It is likely that the key structural features of NovP are common to the rest of the family and include: a helical 'lid' region that gates access to the co-substrate binding pocket and an active centre that contains a 3-Asp putative metal binding site. A further conserved Asp probably acts as the general base that initiates the reaction by de-protonating the 4-OH group of the noviose unit.	256
399022	pfam05712	MRG	MRG. This family consists of three different eukaryotic proteins (mortality factor 4 (MORF4/MRG15), male-specific lethal 3(MSL-3) and ESA1-associated factor 3(EAF3)). It is thought that the MRG family is involved in transcriptional regulation via histone acetylation. It contains 2 chromo domains and a leucine zipper motif.	184
399023	pfam05713	MobC	Bacterial mobilisation protein (MobC). This family consists of several bacterial MobC-like, mobilisation proteins. MobC proteins belong to the group of relaxases. Together with MobA and MobB they bind to a single cis-active site of a mobilising plasmid, the origin of transfer (oriT) region. The absence of MobC has several different effects on oriT DNA. Site- and strand-specific nicking by MobA protein is severely reduced, accounting for the lower frequency of mobilisation. The localized DNA strand separation required for this nicking is less affected, but becomes more sensitive to the level of active DNA gyrase in the cell. In addition, strand separation is not efficiently extended through the region containing the nick site. These effects suggest a model in which MobC acts as a molecular wedge for the relaxosome-induced melting of oriT DNA. The effect of MobC on strand separation may be partially complemented by the helical distortion induced by supercoiling. However, MobC extends the melted region through the nick site, thus providing the single-stranded substrate required for cleavage by MobA.	43
399024	pfam05714	Borrelia_lipo_1	Borrelia burgdorferi virulent strain associated lipoprotein. This family consists of several virulent strain associated lipoproteins from the Lyme disease spirochete Borrelia burgdorferi.	184
399025	pfam05715	zf-piccolo	Piccolo Zn-finger. This (predicted) Zinc finger is found in the bassoon and piccolo proteins. There are eight conserved cysteines, suggesting that it coordinates two zinc ligands.	59
399026	pfam05716	AKAP_110	A-kinase anchor protein 110 kDa (AKAP 110). This family consists of several mammalian protein kinase A anchoring protein 3 (PRKA3) or A-kinase anchor protein 110 kDa (AKAP 110) sequences. Agents that increase intracellular cAMP are potent stimulators of sperm motility. Anchoring inhibitor peptides, designed to disrupt the interaction of the cAMP-dependent protein kinase A (PKA) with A kinase-anchoring proteins (AKAPs), are potent inhibitors of sperm motility. PKA anchoring is a key biochemical mechanism controlling motility. AKAP110 shares compartments with both RI and RII isoforms of PKA and may function as a regulator of both motility- and head-associated functions such as capacitation and the acrosome reaction.	692
399027	pfam05717	TnpB_IS66	IS66 Orf2 like protein. This protein is found in insertion sequences related to IS66. The function of these proteins is uncertain, but they are probably essential for transposition.	96
283396	pfam05718	Pox_int_trans	Poxvirus intermediate transcription factor. This family consists of several highly related Poxvirus sequences which are thought to be intermediate transcription factors.	382
399028	pfam05719	GPP34	Golgi phosphoprotein 3 (GPP34). This family consists of several eukaryotic GPP34 like proteins. GPP34 localizes to the Golgi complex and is conserved from yeast to humans. The cytosolically exposed location of GPP34 predicts a role for a novel coat protein in Golgi trafficking.	197
283398	pfam05720	Dicty_CAD	Cell-cell adhesion domain. This family is based on a group of Dictyostelium discoideum proteins that are essential in early development. csbA and csbB are located on the cell surface and mediate cell-cell adhesion.	75
399029	pfam05721	PhyH	Phytanoyl-CoA dioxygenase (PhyH). This family is made up of several eukaryotic phytanoyl-CoA dioxygenase (PhyH) proteins, ectoine hydroxylases and a number of bacterial deoxygenases. PhyH is a peroxisomal enzyme catalyzing the first step of phytanic acid alpha-oxidation. PhyH deficiency causes Refsum's disease (RD) which is an inherited neurological syndrome biochemically characterized by the accumulation of phytanic acid in plasma and tissues.	213
114448	pfam05722	Ustilago_mating	Ustilago B locus mating-type protein. This family consists of several Ustilago mating-type proteins. The b locus of the phytopathogenic fungus Ustilago maydis encodes a multiallelic recognition function that controls the ability of the fungus to form a dikaryon and complete the sexual stage of the life cycle. The b locus has at least 25 alleles and any combination of two different alleles, brought together by mating between haploid cells, allows the fungus to cause disease and undergo sexual development within the plant.	239
399030	pfam05724	TPMT	Thiopurine S-methyltransferase (TPMT). This family consists of thiopurine S-methyltransferase proteins from both eukaryotes and prokaryotes. Thiopurine S-methyltransferase (TPMT) is a cytosolic enzyme that catalyzes S-methylation of aromatic and heterocyclic sulfhydryl compounds, including anticancer and immunosuppressive thiopurines.	218
191356	pfam05725	FNIP	FNIP Repeat. This repeat is approximately 22 residues long and is only found in Dictyostelium discoideum. It appears to be related to pfam00560 (personal obs:C Yeats). The alignment consists of two tandem repeats. It is termed the FNIP repeat after the pattern of conserved residues.	44
399031	pfam05726	Pirin_C	Pirin C-terminal cupin domain. This region is found in the C-terminal half of the Pirin protein.	103
283402	pfam05727	UPF0228	Uncharacterized protein family (UPF0228). This small family of proteins is currently restricted to Methanosarcina species. Members of this family are about 200 residues in length, except for MA_2565 that has two copies of this region. Although the function of this region is unknown the pattern of conservation suggests that this may be an enzyme, including multiple conserved aspartate and glutamate residues (Bateman A. pers. obs.). The most conserved motif in these proteins is NEL/MEXNE/D, where X can be any amino acid, which is found at the C-terminus of these proteins.	124
283403	pfam05728	UPF0227	Uncharacterized protein family (UPF0227). Despite being classed as uncharacterized proteins, the members of this family are almost certainly enzymes that are distantly related to pfam00561.	187
399032	pfam05729	NACHT	NACHT domain. This NTPase domain is found in apoptosis proteins as well as those involved in MHC transcription activation. This family is closely related to pfam00931.	166
399033	pfam05730	CFEM	CFEM domain. This fungal specific cysteine rich domain is found in some proteins with proposed roles in fungal pathogenesis. The structure of the CFEM domain containing protein 'Surface antigen protein 2' from Candida albicans has been solved.	66
399034	pfam05731	TROVE	TROVE domain. This presumed domain is found in TEP1 and Ro60 proteins, that are RNA-binding components of Telomerase, Ro and Vault RNPs. This domain has been named TROVE, (after Telomerase, Ro and Vault). This domain is probably RNA-binding.	362
253356	pfam05732	RepL	Firmicute plasmid replication protein (RepL). This family consists of Firmicute RepL proteins which are involved in plasmid replication.	165
368585	pfam05733	Tenui_N	Tenuivirus/Phlebovirus nucleocapsid protein. This family consists of several Tenuivirus and Phlebovirus nucleocapsid proteins. These are ssRNA viruses.	224
283407	pfam05734	DUF832	Herpesvirus protein of unknown function (DUF832). This family consists of several herpesvirus proteins of unknown function.	228
399035	pfam05735	TSP_C	Thrombospondin C-terminal region. This region is found at the C-terminus of thrombospondin and related proteins.	198
399036	pfam05736	OprF	OprF membrane domain. This domain represents the presumed membrane spanning region of the OprF proteins. This region is involved in channel formation and is thought to form an 8-stranded beta-barrel.	156
283410	pfam05737	Collagen_bind	Collagen binding domain. The domain fold is a jelly-roll, composed of two antiparallel beta-sheets and two short alpha-helices. A groove on beta-sheet I exhibited the best surface complementarity to the collagen. This site partially overlaps with the peptide sequence previously shown to be critical for collagen binding. Recombinant proteins containing single amino acid mutations designed to disrupt the surface of the putative binding site exhibited significantly lower affinities for collagen.	129
399037	pfam05738	Cna_B	Cna protein B-type domain. This domain is found in Staphylococcus aureus collagen-binding surface protein. The structure of the repetitive B-region has been solved and forms a beta sandwich structure.	87
399038	pfam05739	SNARE	SNARE domain. Most if not all vesicular membrane fusion events in eukaryotic cells are believed to be mediated by a conserved fusion machinery, the SNARE [soluble N-ethylmaleimide-sensitive factor (NSF) attachment protein (SNAP) receptors] machinery. The SNARE domain is thought to act as a protein-protein interaction module in the assembly of a SNARE protein complex.	52
399039	pfam05741	zf-nanos	Nanos RNA binding domain. This family consists of several conserved novel zinc finger domains found in the eukaryotic proteins Nanos and Xcat-2. In Drosophila melanogaster, Nanos functions as a localized determinant of posterior pattern. Nanos RNA is localized to the posterior pole of the maturing egg cell and encodes a protein that emanates from this localized source. Nanos acts as a translational repressor and thereby establishes a gradient of the morphogen Hunchback. Xcat-2 is found in the vegetal cortical region and is inherited by the vegetal blastomeres during development, and is degraded very early in development. The localized and maternally restricted expression of Xcat-2 RNA suggests a role for its protein in setting up regional differences in gene expression that occur early in development.	53
399040	pfam05742	TANGO2	Transport and Golgi organisation 2. In eukaryotes this family is predicted to play a role in protein secretion and Golgi organisation. In plants this family includes Solanum habrochaites Cwp, which is involved in water permeability in the cuticles of fruit. Mouse Tango2 has been found to be expressed during early embryogenesis in mice. This protein contains a conserved NRDE motif. This gene has been characterized in Drosophila melanogaster and named as transport and Golgi organisation 2, hence the name Tango2.	254
399041	pfam05743	UEV	UEV domain. This family includes the eukaryotic tumor susceptibility gene 101 protein (TSG101). Altered transcripts of this gene have been detected in sporadic breast cancers and many other human malignancies. However, the involvement of this gene in neoplastic transformation and tumorigenesis is still elusive. TSG101 is required for normal cell function of embryonic and adult tissues but this gene is not a tumor suppressor for sporadic forms of breast cancer. This family is related to the ubiquitin conjugating enzymes.	119
283416	pfam05744	Benyvirus_P25	Benyvirus P25/P26 protein. This family consists of P25 and P26 proteins from the beet necrotic yellow vein viruses.	240
368591	pfam05745	CRPA	Chlamydia 15 kDa cysteine-rich outer membrane protein (CRPA). This family consists of several Chlamydia 15 kDa cysteine-rich outer membrane proteins which are associated with differentiation of reticulate bodies (RBs) into elementary bodies (EBs).	150
399042	pfam05746	DALR_1	DALR anticodon binding domain. This all alpha helical domain is the anticodon binding domain in Arginyl and glycyl tRNA synthetase. This domain is known as the DALR domain after characteristic conserved amino acids.	117
283418	pfam05748	Rubella_E1	Rubella membrane glycoprotein E1. Rubella virus (RV), the sole member of the genus Rubivirus within the family Togaviridae, is a small enveloped, positive strand RNA virus. The nucleocapsid consists of 40S genomic RNA and a single species of capsid protein which is enveloped within a host-derived lipid bilayer containing two viral glycoproteins, E1 (58 kDa) and E2 (42-46 kDa). In virus infected cells, RV matures by budding either at the plasma membrane, or at the internal membranes depending on the cell type and enters adjacent uninfected cells by a membrane fusion process in the endosome, directed by E1-E2 heterodimers. The heterodimer formation is crucial for E1 transport out of the endoplasmic reticulum to the Golgi and plasma membrane. In RV E1, a cysteine at position 82 is crucial for the E1-E2 heterodimer formation and cell surface expression of the two proteins. The E1 has been shown to be a type 1 membrane protein, rich in cysteine residues with extensive intramolecular disulfide bonds.	496
283419	pfam05749	Rubella_E2	Rubella membrane glycoprotein E2. Rubella virus (RV), the sole member of the genus Rubivirus within the family Togaviridae, is a small enveloped, positive strand RNA virus. The nucleocapsid consists of 40S genomic RNA and a single species of capsid protein which is enveloped within a host-derived lipid bilayer containing two viral glycoproteins, E1 (58 kDa) and E2 (42-46 kDa). In virus infected cells, RV matures by budding either at the plasma membrane, or at the internal membranes depending on the cell type and enters adjacent uninfected cells by a membrane fusion process in the endosome, directed by E1-E2 heterodimers. The heterodimer formation is crucial for E1 transport out of the endoplasmic reticulum to the Golgi and plasma membrane. In RV E1, a cysteine at position 82 is crucial for the E1-E2 heterodimer formation and cell surface expression of the two proteins.	267
399043	pfam05750	Rubella_Capsid	Rubella capsid protein. Rubella virus is an enveloped positive-strand RNA virus of the family Togaviridae. Virions are composed of three structural proteins: a capsid and two membrane-spanning glycoproteins, E2 and E1. During virus assembly, the capsid interacts with genomic RNA to form nucleocapsids. It has been discovered that capsid phosphorylation serves to negatively regulate binding of viral genomic RNA. This may delay the initiation of nucleocapsid assembly until sufficient amounts of virus glycoproteins accumulate at the budding site and/or prevent non-specific binding to cellular RNA when levels of genomic RNA are low. It follows that at a late stage in replication, the capsid may undergo dephosphorylation before nucleocapsid assembly occurs.	269
399044	pfam05751	FixH	FixH. This family consists of several Rhizobium FixH like proteins. It has been suggested that the four proteins FixG, FixH, FixI, and FixS may participate in a membrane-bound complex coupling the FixI cation pump with a redox process catalyzed by FixG.	150
399045	pfam05752	Calici_MSP	Calicivirus minor structural protein. This family consists of minor structural proteins largely from human calicivirus isolates. Human calicivirus causes gastroenteritis. The function of this family is unknown.	165
399046	pfam05753	TRAP_beta	Translocon-associated protein beta (TRAPB). This family consists of several eukaryotic translocon-associated protein beta (TRAPB) or signal sequence receptor beta subunit (SSR-beta) proteins. The normal translocation of nascent polypeptides into the lumen of the endoplasmic reticulum (ER) is thought to be aided in part by a translocon-associated protein (TRAP) complex consisting of 4 protein subunits. The association of mature proteins with the ER and Golgi, or other intracellular locales, such as lysosomes, depends on the initial targeting of the nascent polypeptide to the ER membrane. A similar scenario must also exist for proteins destined for secretion.	178
399047	pfam05754	DUF834	Domain of unknown function (DUF834). This short presumed domain is found in a large number of hypothetical plant proteins. The domain is quite rich in conserved glycine residues. It occurs in some putative transposons but currently has no known function.	53
399048	pfam05755	REF	Rubber elongation factor protein (REF). This family consists of the highly related rubber elongation factor (REF), small rubber particle protein (SRPP) and stress-related protein (SRP) sequences. REF and SRPP are released from the rubber particle membrane into the cytosol during osmotic lysis of the sedimentable organelles (lutoids). The exact function of this family is unknown.	206
399049	pfam05756	S-antigen	S-antigen protein. S-antigens are heat stable proteins that are found in the blood of individuals infected with malaria.	92
399050	pfam05757	PsbQ	Oxygen evolving enhancer protein 3 (PsbQ). This family consists of the plant specific oxygen evolving enhancer protein 3 (PsbQ). Photosystem II (PSII) is a pigment-protein complex, which consists of at least 25 different protein subunits, at present denoted PsbA-Z according to the genes that encode them. PsbQ plays an important role in the lumenal oxygen-evolving activity of PSII from higher plants and green algae.	198
368599	pfam05758	Ycf1	Ycf1. The chloroplast genomes of most higher plants contain two giant open reading frames designated ycf1 and ycf2. Although the function of Ycf1 is unknown, it is known to be an essential gene.	944
399051	pfam05760	IER	Immediate early response protein (IER). This family consists of several eukaryotic immediate early response (IER) 2 and 5 proteins. The role of IER5 is unclear although it plays an important role in mediating the cellular response to mitogenic signals. Again, little is known about the function of IER2 although it is thought to play a role in mediating the cellular responses to a variety of extracellular signals.	304
399052	pfam05761	5_nucleotid	5' nucleotidase family. This family of eukaryotic proteins includes 5' nucleotidase enzymes, such as purine 5'-nucleotidase EC:3.1.3.5.	444
399053	pfam05762	VWA_CoxE	VWA domain containing CoxE-like protein. This family is annotated by SMART as containing a VWA (von Willebrand factor type A) domain. The exact function of this family is unknown. It is found as part of a CO oxidising (Cox) system operon in several bacteria.	221
368601	pfam05763	DUF835	Protein of unknown function (DUF835). The members of this archaebacterial protein family are around 250-300 amino acid residues in length. The function of these proteins is not known.	136
399054	pfam05764	YL1	YL1 nuclear protein. The proteins in this family are designated YL1. These proteins have been shown to be DNA-binding and may be a transcription factor.	246
368603	pfam05766	NinG	Bacteriophage Lambda NinG protein. NinG or Rap is involved in recombination. Rap (recombination adept with plasmid) increases lambda-by-plasmid recombination catalyzed by Escherichia coli's RecBCD pathway.	186
283435	pfam05767	Pox_A14	Poxvirus virion envelope protein A14. This family consists of several Poxvirus virion envelope protein A14 like sequences. A14 is a component of the virion membrane and has been found to be an H1 phosphatase substrate in vivo and in vitro. A14 is hyperphosphorylated on serine residues in the absence of H1 expression.	93
399055	pfam05768	DUF836	Glutaredoxin-like domain (DUF836). These proteins are related to the pfam00462 family.	80
399056	pfam05769	SIKE	SIKE family. This family consists of several eukaryotic proteins. Suppressor of IKBKE 1 (SIKE) is a physiological suppressor of IKK-epsilon and TBK1, which are two IKK-related kinases involved in virus- and TLR3-triggered activation of interferon regulatory factor 3 (IRF-3). Other members of this family are circulating cathodic antigen (CCA), found in Schistosoma mansoni (Blood fluke), and FGFR1 oncogene partner 2, which may be involved in wound healing pathways.	180
368606	pfam05770	Ins134_P3_kin	Inositol 1, 3, 4-trisphosphate 5/6-kinase. This family consists of several inositol 1, 3, 4-trisphosphate 5/6-kinase proteins. Inositol 1,3,4-trisphosphate is at a branch point in inositol phosphate metabolism. It is dephosphorylated by specific phosphatases to either inositol 3,4-bisphosphate or inositol 1,3-bisphosphate. Alternatively, it is phosphorylated to inositol 1,3,4,6-tetrakisphosphate or inositol 1,3,4,5-tetrakisphosphate by inositol trisphosphate 5/6-kinase.	201
283438	pfam05771	Pox_A31	Poxvirus A31 protein. 	113
399057	pfam05772	NinB	NinB protein. The ninR region of phage lambda contains two recombination genes, orf (ninB) and rap (ninG), that have roles when the RecF and RecBCD recombination pathways of E. coli, respectively, operate on phage lambda. NinB binds to single-stranded DNA.	117
399058	pfam05773	RWD	RWD domain. This domain was identified in WD40 repeat proteins and Ring finger domain proteins. The function of this domain is unknown. GCN2 is the alpha-subunit of the only translation initiation factor (eIF2 alpha) kinase that appears in all eukaryotes. Its function requires an interaction with GCN1 via the domain at its N-terminus, which is termed the RWD domain after three major RWD-containing proteins: RING finger-containing proteins, WD-repeat-containing proteins, and yeast DEAD (DEXD)-like helicases. The structure forms an alpha + beta sandwich fold consisting of two layers: a four-stranded antiparallel beta-sheet, and three side-by-side alpha-helices.	111
283441	pfam05774	Herpes_heli_pri	Herpesvirus helicase-primase complex component. This family consists of several helicase-primase complex components from the Gammaherpesviruses.	127
283442	pfam05775	AfaD	Enterobacteria AfaD invasin protein. This family consists of several AfaD and related proteins from Escherichia coli and Salmonella bacteria. The afa gene clusters encode an afimbrial adhesive sheath produced by Escherichia coli. The adhesive sheath is composed of two proteins, AfaD and AfaE, which are independently exposed at the bacterial cell surface. AfaE is required for bacterial adhesion to HeLa cells and AfaD for the uptake of adherent bacteria into these cells.	105
147756	pfam05776	Papilloma_E5A	Papillomavirus E5A protein. Human papillomaviruses (HPVs) are epitheliotropic viruses, and their life cycle is intimately linked to the stratification and differentiation state of the host epithelial tissues. The kinetics of E5a protein expression during the complete viral life cycle has been studied and the highest level was found to be coincidental with the onset of virion morphogenesis.	91
283443	pfam05777	Acp26Ab	Drosophila accessory gland-specific peptide 26Ab (Acp26Ab). This family consists of accessory gland-specific 26Ab peptides or male accessory gland secretory protein 355B from different Drosophila species. Drosophila males, like males of most other insects, transfer a group of specific proteins (Acp26Ab and Acp26Aa in Drosophila) to the females during mating. These proteins are produced primarily in the accessory gland and are likely to influence the female's reproduction.	90
399059	pfam05778	Apo-CIII	Apolipoprotein CIII (Apo-CIII). This family consists of several mammalian apolipoprotein CIII (Apo-CIII) sequences. Apolipoprotein C-III is a 79-residue glycoprotein. It is synthesized in the intestine and liver as part of the very low density lipoprotein (VLDL) and the high density lipoprotein (HDL) particles. Owing to its positive correlation with plasma triglyceride (Tg) levels, Apo-CIII is suggested to play a role in Tg metabolism and is therefore of interest regarding atherosclerosis. However, unlike other apolipoproteins such as Apo-AI, Apo E or CII for which many naturally occurring mutations are known, the structure-function relationships of apo C-III remain a subject of debate. One possibility is that apo C-III inhibits lipoprotein lipase (LPL) activity, as shown by in vitro experiments. Another suggestion is that elevated levels of Apo-CIII displace other apolipoproteins at the lipoprotein surface, modifying their clearance from plasma.	68
399060	pfam05781	MRVI1	MRVI1 protein. This family consists of mammalian MRVI1 proteins which are related to the lymphoid-restricted membrane protein (JAW1) and the IP3 receptor associated cGMP kinase substrates A and B (IRAGA and IRAGB). The function of MRVI1 is unknown although mutations in the Mrvi1 gene induce myeloid leukaemia by altering the expression of a gene important for myeloid cell growth and/or differentiation so it has been speculated that Mrvi1 is a tumor suppressor gene. IRAG is very similar in sequence to MRVI1 and is an essential NO/cGKI-dependent regulator of IP3-induced calcium release. Activation of cGKI decreases IP3-stimulated elevations in intracellular calcium, induces smooth muscle relaxation and contributes to the antiproliferative and pro-apoptotic effects of NO/cGMP. Jaw1 is a member of a class of proteins with COOH-terminal hydrophobic membrane anchors and is structurally similar to proteins involved in vesicle targeting and fusion. This suggests that the function and/or the structure of the ER in lymphocytes may be modified by lymphoid-restricted resident ER proteins.	521
399061	pfam05782	ECM1	Extracellular matrix protein 1 (ECM1). This family consists of several eukaryotic extracellular matrix protein 1 (ECM1) sequences. ECM1 has been shown to regulate endochondral bone formation, stimulate the proliferation of endothelial cells and induce angiogenesis. Mutations in the ECM1 gene can cause lipoid proteinosis, a disorder which causes generalized thickening of skin, mucosae and certain viscera. Classical features include beaded eyelid papules and laryngeal infiltration leading to hoarseness.	559
368612	pfam05783	DLIC	Dynein light intermediate chain (DLIC). This family consists of several eukaryotic dynein light intermediate chain proteins. The light intermediate chains (LICs) of cytoplasmic dynein consist of multiple isoforms, which undergo post-translational modification to produce a large number of species. DLIC1 is known to be involved in assembly, organisation, and function of centrosomes and mitotic spindles when bound to pericentrin. DLIC2 is a subunit of cytoplasmic dynein 2 that may play a role in maintaining Golgi organisation by binding cytoplasmic dynein 2 to its Golgi-associated cargo.	468
283448	pfam05784	Herpes_UL82_83	Betaherpesvirus UL82/83 protein N-terminus. This family represents the N terminal region of the Betaherpesvirus UL82 and UL83 proteins. As viruses are reliant upon their host cell to serve as proper environments for their replication, many have evolved mechanisms to alter intracellular conditions to suit their own needs. Human cytomegalovirus induces quiescent cells to enter the cell cycle and then arrests them in late G(1), before they enter the S phase, a cell cycle compartment that is presumably favourable for viral replication. The protein product of the human cytomegalovirus UL82 gene, pp71, can accelerate the movement of cells through the G(1) phase of the cell cycle. This activity would help infected cells reach the late G(1) arrest point sooner and thus may stimulate the infectious cycle. pp71 also induces DNA synthesis in quiescent cells, but a pp71 mutant protein that is unable to induce quiescent cells to enter the cell cycle still retains the ability to accelerate the G(1) phase. Thus, the mechanism through which pp71 accelerates G(1) cell cycle progression appears to be distinct from the one that it employs to induce quiescent cells to exit G(0) and subsequently enter the S phase.	343
283449	pfam05785	CNF1	Rho-activating domain of cytotoxic necrotizing factor. This family consists of several bacterial cytotoxic necrotizing factor proteins as well as related dermonecrotic toxin (DNT) from Bordetella species. Cytotoxic necrotizing factor 1 (CNF1) causes necrosis of rabbit skin and re-organisation of the actin cytoskeleton in cultured cells. Bordetella dermonecrotic toxin (DNT) stimulates the assembly of actin stress fibers and focal adhesions by deamidating or polyaminating Gln63 of the small GTPase Rho. DNT is an A-B toxin which is composed of an N-terminal receptor-binding (B) domain and a C-terminal enzymatically active (A) domain.	286
399062	pfam05786	Cnd2	Condensin complex subunit 2. This family consists of several Barren protein homologs from several eukaryotic organisms. In Drosophila Barren (barr) is required for sister-chromatid segregation in mitosis. barr encodes a novel protein that is present in proliferating cells and has homologs in yeast and human. Mitotic defects in barr embryos become apparent during cycle 16, resulting in a loss of PNS and CNS neurons. Centromeres move apart at the metaphase-anaphase transition and Cyclin B is degraded, but sister chromatids remain connected, resulting in chromatin bridging. Barren protein localizes to chromatin throughout mitosis. Colocalization and biochemical experiments indicate that Barren associates with Topoisomerase II throughout mitosis and alters the activity of Topoisomerase II. It has been suggested that this association is required for proper chromosomal segregation by facilitating the decatenation of chromatids at anaphase. This family forms one of the three non-structural maintenance of chromosomes (SMC) subunits of the mitotic condensation complex along with Cnd1 and Cnd3.	743
399063	pfam05787	DUF839	Bacterial protein of unknown function (DUF839). This family consists of several bacterial proteins of unknown function that contain predicted beta-propeller repeats.	504
399064	pfam05788	Orbi_VP1	Orbivirus RNA-dependent RNA polymerase (VP1). This family consists of the RNA-dependent RNA polymerase protein VP1 from the Orbiviruses. VP1 may have both enzymatic and structural roles in the virus life cycle.	1297
283452	pfam05789	Baculo_VP1054	Baculovirus VP1054 protein. This family consists of several VP1054 proteins from the Baculoviruses. VP1054 is a virus structural protein required for nucleocapsid assembly.	379
399065	pfam05790	C2-set	Immunoglobulin C2-set domain. 	80
368615	pfam05791	Bacillus_HBL	Bacillus haemolytic enterotoxin (HBL). This family consists of several Bacillus haemolytic enterotoxins (HblC, HblD, HblA, NheA, and NheB) which can cause food poisoning in humans.	177
399066	pfam05792	Candida_ALS	Candida agglutinin-like (ALS). This family consists of several agglutinin-like proteins from different Candida species. ALS genes of Candida albicans encode a family of cell-surface glycoproteins with a three-domain structure. Each Als protein has a relatively conserved N-terminal domain, a central domain consisting of a tandemly repeated motif of variable number, and a serine-threonine-rich C-terminal domain that is relatively variable across the family. The ALS family exhibits several types of variability that indicate the importance of considering strain and allelic differences when studying ALS genes and their encoded proteins. Fungal adhesins, which include sexual agglutinins, virulence factors, and flocculins, are surface proteins that mediate cell-cell and cell-environment interactions. It is possible that both the serine/threonine-rich domain and the cysteine residues in the C-terminal and DIPSY pfam11763 participate in anchoring the terminal domains inside the wall, so that only the inner part of Map4p, including the repeat region, is sticking out as a fold-back loop then able to act in adhesion.	33
310411	pfam05793	TFIIF_alpha	Transcription initiation factor IIF, alpha subunit (TFIIF-alpha). Transcription initiation factor IIF, alpha subunit (TFIIF-alpha) or RNA polymerase II-associating protein 74 (RAP74) is the large subunit of transcription factor IIF (TFIIF), which is essential for accurate initiation and stimulates elongation by RNA polymerase II.	528
399067	pfam05794	Tcp11	T-complex protein 11. This family consists of several eukaryotic T-complex protein 11 (Tcp11) related sequences. Tcp11 is only expressed in fertile adult mammalian testes and is thought to be important in sperm function and fertility. The family also contains the yeast Sok1 protein which is known to suppress cyclic AMP-dependent protein kinase mutants.	389
310413	pfam05795	Plasmodium_Vir	Plasmodium vivax Vir protein. This family consists of several Vir proteins specific to Plasmodium vivax. The vir genes are present at about 600-1,000 copies per haploid genome and encode proteins that are immunovariant in natural infections, indicating that they may have a functional role in establishing chronic infection through antigenic variation.	371
283458	pfam05796	Chordopox_G2	Chordopoxvirus protein G2. This family consists of several Chordopoxvirus isatin-beta-thiosemicarbazone dependent protein (protein G2) sequences. Inactivation of the gene coding for this protein renders the virus dependent upon isatin-beta-thiosemicarbazone (IBT) for growth.	215
283459	pfam05797	Rep_4	Yeast trans-acting factor (REP1/REP2). This family consists of the yeast trans-acting factor B and C (REP1 and 2) proteins. The yeast plasmid stability system consists of two plasmid-coded proteins, Rep1 and Rep2, and a cis-acting locus, STB. The Rep proteins show both self- and cross-interactions in vivo and in vitro, and bind to the STB DNA with assistance from host factor(s). Within the yeast nucleus, the Rep1 and Rep2 proteins tightly associate with STB-containing plasmids into well organized plasmid foci that form a cohesive unit in partitioning. It is generally accepted that the protein-protein and DNA-protein interactions engendered by the Rep-STB system are central to plasmid partitioning. Point mutations in Rep1 that knock out interaction with Rep2 or with STB simultaneously block the ability of these Rep1 variants to support plasmid stability.	369
283460	pfam05798	Phage_FRD3	Bacteriophage FRD3 protein. This family consists of bacteriophage FRD3 proteins.	75
399068	pfam05800	GvpO	Gas vesicle synthesis protein GvpO. This family consists of archaeal GvpO proteins which are required for gas vesicle synthesis. The family also contains two related sequences from Streptomyces coelicolor.	94
283462	pfam05801	DUF840	Lagovirus protein of unknown function (DUF840). This family consists of several Lagovirus sequences of unknown function, largely from rabbit hemorrhagic disease virus.	113
368619	pfam05802	EspB	Enterobacterial EspB protein. EspB is a type-III-secreted pore-forming protein of enteropathogenic Escherichia coli (EPEC) which is essential for EPEC pathogenesis. EspB is also found in Citrobacter rodentium.	165
283464	pfam05803	Chordopox_L2	Chordopoxvirus L2 protein. This family consists of several Chordopoxvirus L2 proteins.	79
253396	pfam05804	KAP	Kinesin-associated protein (KAP). This family consists of several eukaryotic kinesin-associated (KAP) proteins. Kinesins are intracellular multimeric transport motor proteins that move cellular cargo on microtubule tracks. It has been shown that the sea urchin KRP85/95 holoenzyme associates with a KAP115 non-motor protein, forming a heterotrimeric complex in vitro, called the Kinesin-II.	708
399069	pfam05805	L6_membrane	L6 membrane protein. This family consists of several eukaryotic L6 membrane proteins. L6, IL-TMP, and TM4SF5 are cell surface proteins predicted to have four transmembrane domains. Previous sequence analysis led to their assignment as members of the tetraspanin superfamily, but it has now been found that they are not significantly related to genuine tetraspanins, but instead constitute their own L6 family. Several members of this family have been implicated in human cancer.	192
399070	pfam05806	Noggin	Noggin. This family consists of the eukaryotic Noggin proteins. Noggin is a glycoprotein that binds bone morphogenetic proteins (BMPs) selectively and, when added to osteoblasts, it opposes the effects of BMPs. It has been found that noggin arrests the differentiation of stromal cells, preventing cellular maturation.	215
399071	pfam05808	Podoplanin	Podoplanin. This family consists of several mammalian podoplanin like proteins which are thought to control specifically the unique shape of podocytes.	134
399072	pfam05810	NinF	NinF protein. This family consists of several bacteriophage NinF proteins as well as related sequences from E. coli.	57
399073	pfam05811	DUF842	Eukaryotic protein of unknown function (DUF842). This family consists of a number of conserved eukaryotic proteins of unknown function. The sequences carry three sets of CxxxC motifs, which might suggest a type of zinc-finger formation.	126
283470	pfam05812	Herpes_BLRF2	Herpesvirus BLRF2 protein. This family consists of several Herpesvirus BLRF2 proteins.	119
283471	pfam05813	Orthopox_F7	Orthopoxvirus F7 protein. 	80
114536	pfam05814	Ac76	Orf76 (Ac76). This family consists mainly of baculovirus proteins. Family members include Autographa californica multiple nucleopolyhedrovirus (AcMNPV), protein AC76. Ac76 has been shown to be involved in intranuclear microvesicle formation. Functional studies suggest that ac76 is essential for both BV (budded virus) and ODV (occlusion-derived virus) development but is not required for viral DNA synthesis.	83
283472	pfam05815	DUF844	Baculovirus protein of unknown function (DUF844). This family consists of several Baculovirus sequences of between 350 and 380 residues long. The family has no known function.	377
399074	pfam05816	TelA	Toxic anion resistance protein (TelA). This family consists of several prokaryotic TelA like proteins. TelA and KlA are associated with tellurite resistance and plasmid fertility inhibition.	330
399075	pfam05817	Ribophorin_II	Oligosaccharyltransferase subunit Ribophorin II. This family contains eukaryotic Ribophorin II (RPN2) proteins. The mammalian oligosaccharyltransferase (OST) is a protein complex that effects the cotranslational N-glycosylation of newly synthesized polypeptides, and is composed of the following proteins: ribophorins I and II (RI and RII), OST48, and DAD1, N33/IAP, OST4, STT3. The family also includes the SWP1 protein from yeast. In yeast the oligosaccharyltransferase complex is composed of 7 or 8 subunits, SWP1 being one of them.	625
399076	pfam05818	TraT	Enterobacterial TraT complement resistance protein. The traT gene is one of the F factor transfer genes and encodes an outer membrane protein which is involved in interactions between an Escherichia coli and its surroundings.	214
399077	pfam05819	NolX	NolX protein. This family consists of Rhizobium NolX and Xanthomonas HrpF proteins. The interaction between the plant pathogen Xanthomonas campestris pv. vesicatoria and its host plants is controlled by hrp genes (hypersensitive reaction and pathogenicity), which encode a type III protein secretion system. Among type III-secreted proteins are avirulence proteins, effectors involved in the induction of plant defense reactions. HrpF is dispensable for protein secretion but required for AvrBs3 recognition in planta, and is thought to function as a translocator of effector proteins into the host cell. NolX, a soybean cultivar specificity protein, is secreted by a type III secretion system (TTSS) and shows homology to HrpF of the plant pathogen Xanthomonas campestris pv. vesicatoria. It is not known whether NolX functions at the bacterium-plant interface or acts inside the host cell. NolX is expressed in planta only during the early stages of nodule development.	438
399078	pfam05820	Ac81	Baculoviridae AC81. This family consists of several highly related Baculovirus proteins and includes Autographa californica multiple nucleopolyhedrovirus (AcMNPV) protein AC81, which is required for nucleocapsid envelopment. Ac81 contains a functional hydrophobic transmembrane (TM) domain, whose deletion resulted in a phenotype similar to that of Ac81 knockout.	178
399079	pfam05821	NDUF_B8	NADH-ubiquinone oxidoreductase ASHI subunit (CI-ASHI or NDUFB8). This family consists of several eukaryotic NADH-ubiquinone oxidoreductase ASHI subunit (CI-ASHI) proteins. NADH:ubiquinone oxidoreductase (complex I) is an extremely complicated multiprotein complex located in the inner mitochondrial membrane. Its main function is the transport of electrons from NADH to ubiquinone, which is accompanied by translocation of protons from the mitochondrial matrix to the intermembrane space. Human complex I appears to consist of 41 subunits.	166
310424	pfam05822	UMPH-1	Pyrimidine 5'-nucleotidase (UMPH-1). This family consists of several eukaryotic pyrimidine 5'-nucleotidase proteins. P5'N-1, also known as uridine monophosphate hydrolase-1 (UMPH-1), is a member of a large functional group of enzymes, characterized by the ability to dephosphorylate nucleic acids. P5'N-1 catalyzes the dephosphorylation of pyrimidine nucleoside monophosphates to the corresponding nucleosides. Deficiencies in this protein's function can lead to several different disorders in humans.	246
399080	pfam05823	Gp-FAR-1	Nematode fatty acid retinoid binding protein (Gp-FAR-1). Parasitic nematodes produce at least two structurally novel classes of small helix-rich retinol- and fatty-acid-binding proteins that have no counterparts in their plant or animal hosts and thus represent potential targets for new nematicides. Gp-FAR-1 is a member of the nematode-specific fatty-acid- and retinol-binding (FAR) family of proteins but localizes to the surface of the organism, placing it in a strategic position for interaction with the host. Gp-FAR-1 functions as a broad-spectrum retinol- and fatty-acid-binding protein, and it is thought that it is involved in the evasion of primary host plant defense systems.	142
399081	pfam05824	Pro-MCH	Pro-melanin-concentrating hormone (Pro-MCH). This family consists of several mammalian pro-melanin-concentrating hormone (Pro-MCH) 1 and 2 proteins. Melanin-concentrating hormone (MCH) is a 19 amino acid cyclic peptide that was first isolated from the pituitary of teleost fish. It is produced from pro-MCH that encodes, in addition to MCH, NEI, and a putative peptide, NGE. In lower vertebrates, MCH acts to regulate skin colour by antagonising the melanin-dispersing actions of alpha-melanocyte stimulating hormone (alpha-MSH). In mammals, MCH serves as a neuropeptide and is found in many regions of the brain and especially the hypothalamus. It affects many types of behaviours such as appetite, sexual receptivity, aggression, and anxiety. MCH also stimulates the release of luteinising hormone.	85
283482	pfam05825	PSP94	Beta-microseminoprotein (PSP-94). This family consists of the mammalian specific protein beta-microseminoprotein. Prostatic secretory protein of 94 amino acids (PSP94), also called beta-microseminoprotein, is a small, nonglycosylated protein, rich in cysteine residues. It was first isolated as a major protein from human seminal plasma. The exact function of this protein is unknown.	94
399082	pfam05826	Phospholip_A2_2	Phospholipase A2. This family consists of several phospholipase A2 like proteins mostly from insects.	95
399083	pfam05827	ATP-synt_S1	Vacuolar ATP synthase subunit S1 (ATP6S1). This family consists of eukaryotic vacuolar ATP synthase subunit S1 proteins. The threshold is set high to avoid the inclusion of BIG1 ER integral membrane proteins which are involved in cell wall organisation and biogenesis.	149
310429	pfam05829	Adeno_PX	Adenovirus late L2 mu core protein (Protein X). This family consists of several Adenovirus late L2 mu core protein or Protein X sequences.	41
399084	pfam05830	NodZ	Nodulation protein Z (NodZ). The nodulation genes of Rhizobia are regulated by the nodD gene product in response to host-produced flavonoids and appear to encode enzymes involved in the production of a lipo-chitose signal molecule required for infection and nodule formation. NodZ is required for the addition of a 2-O-methylfucose residue to the terminal reducing N-acetylglucosamine of the nodulation signal. This substitution is essential for the biological activity of this molecule. Mutations in nodZ result in defective nodulation. nodZ represents a unique nodulation gene that is not under the control of NodD and yet is essential for the synthesis of an active nodulation signal.	320
399085	pfam05831	GAGE	GAGE protein. This family consists of several GAGE and XAGE proteins which are found exclusively in humans. The function of this family is unknown although they have been implicated in human cancers.	107
399086	pfam05832	DUF846	Eukaryotic protein of unknown function (DUF846). This family consists of several proteins of unknown function from a variety of eukaryotic organisms.	139
399087	pfam05833	FbpA	Fibronectin-binding protein A N-terminus (FbpA). This family consists of the N-terminal region of the prokaryotic fibronectin-binding protein. Fibronectin binding is considered to be an important virulence factor in streptococcal infections. Fibronectin is a dimeric glycoprotein that is present in a soluble form in plasma and extracellular fluids; it is also present in a fibrillar form on cell surfaces. Both the soluble and cellular forms of fibronectin may be incorporated into the extracellular tissue matrix. While fibronectin has critical roles in eukaryotic cellular processes, such as adhesion, migration and differentiation, it is also a substrate for the attachment of bacteria. The binding of pathogenic Streptococcus pyogenes and Staphylococcus aureus to epithelial cells via fibronectin facilitates their internalisation and systemic spread within the host.	452
310433	pfam05834	Lycopene_cycl	Lycopene cyclase protein. This family consists of lycopene beta and epsilon cyclase proteins. Carotenoids with cyclic end groups are essential components of the photosynthetic membranes in all plants, algae, and cyanobacteria. These lipid-soluble compounds protect against photo-oxidation, harvest light for photosynthesis, and dissipate excess light energy absorbed by the antenna pigments. The cyclisation of lycopene (psi, psi-carotene) is a key branch point in the pathway of carotenoid biosynthesis. Two types of cyclic end groups are found in higher plant carotenoids: the beta and epsilon rings. Carotenoids with two beta rings are ubiquitous, and those with one beta and one epsilon ring are common; however, carotenoids with two epsilon rings are rare.	380
399088	pfam05835	Synaphin	Synaphin protein. This family consists of several eukaryotic synaphin 1 and 2 proteins. Synaphin/complexin is a cytosolic protein that preferentially binds to syntaxin within the SNARE complex. Synaphin promotes SNAREs to form precomplexes that oligomerize into higher order structures. A peptide from the central, syntaxin binding domain of synaphin competitively inhibits these two proteins from interacting and prevents SNARE complexes from oligomerising. It is thought that oligomerization of SNARE complexes into a higher order structure creates a SNARE scaffold for efficient, regulated fusion of synaptic vesicles. Synaphin promotes neuronal exocytosis by promoting interaction between the complementary syntaxin and synaptobrevin transmembrane regions that reside in opposing membranes prior to fusion.	142
283492	pfam05836	Chorion_S16	Chorion protein S16. This family consists of several examples of the fruit fly specific chorion protein S16. The chorion genes of Drosophila are amplified in response to developmental signals in the follicle cells of the ovary.	108
399089	pfam05837	CENP-H	Centromere protein H (CENP-H). This family consists of several eukaryotic centromere protein H (CENP-H) sequences. The macromolecular centromere-kinetochore complex plays a critical role in sister chromatid separation, but its complete protein composition as well as its precise dynamic function during mitosis has not yet been clearly determined. CENP-H contains a coiled-coil structure and a nuclear localization signal. CENP-H is specifically and constitutively localized in kinetochores throughout the cell cycle. CENP-H may play a role in kinetochore organisation and function throughout the cell cycle. This is the C-terminus of the region, which is conserved from fungi to humans.	112
399090	pfam05838	Glyco_hydro_108	Glycosyl hydrolase 108. This family acts as a lysozyme (N-acetylmuramidase), EC:3.2.1.17. It contains a conserved EGGY motif near the N-terminus, the glutamic acid within this motif is essential for catalytic activity. In bacteria, it may activate the secretion of large proteins via the breaking and rearrangement of the peptidoglycan layer during secretion. It is frequently found at the N-terminus of proteins containing a C-terminal pfam09374 domain.	86
310437	pfam05839	Apc13p	Apc13p protein. The anaphase-promoting complex (APC) is a conserved multi-subunit ubiquitin ligase required for the degradation of key cell cycle regulators. Members of this family are components of the anaphase-promoting complex homologous to Apc13p.	89
336220	pfam05840	Phage_GPA	Bacteriophage replication gene A protein (GPA). This family consists of a group of bacteriophage replication gene A protein (GPA) like sequences from both viruses and bacteria. The members of this family are likely to be endonucleases.	321
399091	pfam05841	Apc15p	Apc15p protein. The anaphase-promoting complex (APC) is a conserved multi-subunit ubiquitin ligase required for the degradation of key cell cycle regulators. Members of this family are components of the anaphase-promoting complex homologous to Apc15p.	118
191385	pfam05842	Euplotes_phero	Euplotes octocarinatus mating pheromone protein. This family consists of several mating pheromone proteins from Euplotes octocarinatus. Cells of the ten mating types of the ciliate Euplotes octocarinatus communicate by pheromones before they enter conjugation. The pheromones induce homotypic pairing when applied to mating types that do not secrete the same pheromone(s). Heterotypic pairs (i.e., those between cells of different mating types) are formed only when both mating types in a mixture secrete a pheromone that the other does not. The genetics of mating types is based on four codominant mating type alleles, each allele determining production of a different pheromone. The pheromones not only induce pair formation but also attract cells.	135
399092	pfam05843	Suf	Suppressor of forked protein (Suf). This family consists of several eukaryotic suppressor of forked (Suf) like proteins. The Drosophila melanogaster Suppressor of forked [Su(f)] protein shares homology with the yeast RNA14 protein and the 77-kDa subunit of human cleavage stimulation factor, which are proteins involved in mRNA 3' end formation. This suggests a role for Su(f) in mRNA 3' end formation in Drosophila. The su(f) gene produces three transcripts; two of them are polyadenylated at the end of the transcription unit, and one is a truncated transcript, polyadenylated in intron 4. It is thought that su(f) plays a role in the regulation of poly(A) site utilisation and an important role of the GU-rich sequence for this regulation to occur.	291
283498	pfam05844	YopD	YopD protein. This family consists of several bacterial YopD like proteins. Virulent Yersinia species harbour a common plasmid that encodes essential virulence determinants (Yersinia outer proteins [Yops]), which are regulated by the extracellular stimuli Ca2+ and temperature. YopD is thought to be a possible transmembrane protein and contains an amphipathic alpha-helix in its carboxy terminus.	292
399093	pfam05845	PhnH	Bacterial phosphonate metabolism protein (PhnH). This family consists of several bacterial PhnH sequences which are known to be involved in phosphonate metabolism.	182
283500	pfam05846	Chordopox_A15	Chordopoxvirus A15 protein. This family consists of several Chordopoxvirus A15 like sequences.	90
283501	pfam05847	Baculo_LEF-3	Nucleopolyhedrovirus late expression factor 3 (LEF-3). This family consists of LEF-3 Nucleopolyhedrovirus late expression factor 3 (LEF-3) sequences which are known to be ssDNA-binding proteins. Alkaline nuclease (AN) and LEF-3 may participate in homologous recombination of the baculovirus genome in a manner similar to that of exonuclease (Redalpha) and DNA-binding protein (Redbeta) of the Red-mediated homologous recombination system of bacteriophage lambda. LEF-3 is essential for transporting the putative baculovirus helicase protein P143 into the nucleus where they function together during viral DNA replication. LEF-3 and other proteins have been shown to bind to closely linked sites on viral chromatin in vivo, suggesting that they may form part of the baculovirus replisome.	364
399094	pfam05848	CtsR	Firmicute transcriptional repressor of class III stress genes (CtsR). This family consists of several Firmicute transcriptional repressor of class III stress genes (CtsR) proteins. CtsR of L. monocytogenes negatively regulates the clpC, clpP and clpE genes belonging to the CtsR regulon.	72
368637	pfam05849	L-fibroin	Fibroin light chain (L-fibroin). This family consists of several moth fibroin light chain (L-fibroin) proteins. Fibroin of the silkworm, Bombyx mori, is secreted into the lumen of posterior silk gland (PSG) from the surrounding PSG cells as a molecular complex consisting of a heavy (H)-chain of approximately 350 kDa, a light (L)-chain of 25 kDa and a P25 of about 27 kDa. The H- and L-chains are disulfide-linked but P25 is associated with the H-L complex by non-covalent force.	239
147807	pfam05851	Lentivirus_VIF	Lentivirus virion infectivity factor (VIF). This family consists of several feline specific Lentivirus virion infectivity factor (VIF) proteins. VIF is essential for productive FIV infection of host target cells in vitro.	250
283504	pfam05852	DUF848	Gammaherpesvirus protein of unknown function (DUF848). This family consists of several uncharacterized proteins from the Gammaherpesvirinae.	145
399095	pfam05853	BKACE	beta-keto acid cleavage enzyme. BKACE, beta-keto acid cleavage enzyme, plays a role in lysine degradation. In certain instances it catalyzes the conversion of 3-keto-5-aminohexanoate and acetyl-CoA into acetoacetate and 3-aminobutyryl-CoA. The family is found to have at least 14 slightly different potential new enzymatic activities, all of which can therefore be designated as beta-keto acid cleavage enzymes.	274
399096	pfam05854	MC1	Non-histone chromosomal protein MC1. This family consists of archaeal chromosomal protein MC1 sequences which protect DNA against thermal denaturation.	90
399097	pfam05856	ARPC4	ARP2/3 complex 20 kDa subunit (ARPC4). This family consists of several eukaryotic ARP2/3 complex 20 kDa subunit (P20-ARC) proteins. The Arp2/3 protein complex has been implicated in the control of actin polymerization in cells. The human complex consists of seven subunits which include the actin related proteins Arp2 and Arp3. It has been suggested that the complex promotes actin assembly in lamellipodia and may participate in lamellipodial protrusion.	166
399098	pfam05857	TraX	TraX protein. This family consists of several bacterial TraX proteins. TraX is responsible for the amino-terminal acetylation of F-pilin subunits.	215
147812	pfam05858	BIV_Env	Bovine immunodeficiency virus surface protein (SU). The bovine lentivirus also known as the bovine immunodeficiency-like virus (BIV) has conserved and hypervariable regions in the surface envelope gene. This family corresponds to the SU surface protein.	548
399099	pfam05859	Mis12	Mis12 protein. Kinetochores are the chromosomal sites for spindle interaction and play a vital role in chromosome segregation. Fission yeast kinetochore protein Mis12 is required for correct spindle morphogenesis, determining metaphase spindle length. Thirty-five to sixty percent extension of metaphase spindle length takes place in Mis12 mutants. It has been shown that Mis12 genetically interacts with Mal2, another inner centromere core complex protein in S. pombe.	137
399100	pfam05860	Haemagg_act	haemagglutination activity domain. This domain is suggested to be a carbohydrate-dependent haemagglutination activity site. It is found in a range of haemagglutinins and haemolysins.	118
399101	pfam05861	PhnI	Bacterial phosphonate metabolism protein (PhnI). This family consists of several Proteobacterial phosphonate metabolism protein (PhnI) sequences. Bacteria that use phosphonates as a phosphorus source must be able to break the stable carbon-phosphorus bond. In Escherichia coli phosphonates are broken down by a C-P lyase that has a broad substrate specificity. The genes for phosphonate uptake and degradation in E. coli are organized in an operon of 14 genes, named phnC to phnP. Three gene products (PhnC, PhnD and PhnE) comprise a binding protein-dependent phosphonate transporter, which also transports phosphate, phosphite, and certain phosphate esters such as phosphoserine; two gene products (PhnF and PhnO) may have a role in gene regulation; and nine gene products (PhnG, PhnH, PhnI, PhnJ, PhnK, PhnL, PhnM, PhnN, and PhnP) probably comprise a membrane-associated C-P lyase enzyme complex.	346
147815	pfam05862	IceA2	Helicobacter pylori IceA2 protein. This family consists of several Helicobacter pylori specific IceA2 proteins. The function of this family is unknown.	59
283512	pfam05864	Chordopox_RPO7	Chordopoxvirus DNA-directed RNA polymerase 7 kDa polypeptide (RPO7). This family consists of several Chordopoxvirus DNA-directed RNA polymerase 7 kDa polypeptide sequences. DNA-dependent RNA polymerase catalyzes the transcription of DNA into RNA.	63
310449	pfam05865	Cypo_polyhedrin	Cypovirus polyhedrin protein. This family consists of several Cypovirus polyhedrin proteins. Polyhedrin is known to form a crystalline matrix (polyhedra) in infected insect cells.	248
399102	pfam05866	RusA	Endodeoxyribonuclease RusA. This family consists of several bacterial and phage Holliday junction resolvase (RusA) like proteins. The RusA protein of Escherichia coli is an endonuclease that can resolve Holliday intermediates and correct the defects in genetic recombination and DNA repair associated with inactivation of RuvAB or RuvC.	122
399103	pfam05867	DUF851	Protein of unknown function (DUF851). 	241
114586	pfam05868	Rotavirus_VP7	Rotavirus major outer capsid protein VP7. This family consists of several Rotavirus major outer capsid protein VP7 sequences. The rotavirus capsid is composed of three concentric protein layers. Proteins VP4 and VP7 comprise the outer layer. VP4 forms spikes and is the viral attachment protein. VP7 is a glycoprotein and the major constituent of the outer protein layer.	249
399104	pfam05869	Dam	DNA N-6-adenine-methyltransferase (Dam). This family consists of several bacterial and phage DNA N-6-adenine-methyltransferase (Dam) like sequences.	165
399105	pfam05870	PA_decarbox	Phenolic acid decarboxylase (PAD). This family consists of several bacterial phenolic acid decarboxylase proteins. Phenolic acids, also called substituted cinnamic acids, are important lignin-related aromatic acids and natural constituents of plant cell walls. These acids (particularly ferulic, p-coumaric, and caffeic acids) bind the complex lignin polymer to the hemicellulose and cellulose in plants. The Phenolic acid decarboxylase (PAD) gene (pad) is transcriptionally regulated by p-coumaric, ferulic, or caffeic acid; these three acids are the three substrates of PAD.	158
399106	pfam05871	ESCRT-II	ESCRT-II complex subunit. This family of conserved eukaryotic proteins are subunits of the endosome associated complex ESCRT-II which recruits transport machinery for protein sorting at the multivesicular body (MVB). This protein complex transiently associates with the endosomal membrane and thereby initiates the formation of ESCRT-III, a membrane-associated protein complex that functions immediately downstream of ESCRT-II during sorting of MVB cargo. ESCRT-II in turn functions downstream of ESCRT-I, a protein complex that binds to ubiquitinated endosomal cargo.	133
283518	pfam05872	DUF853	Bacterial protein of unknown function (DUF853). This family consists of several bacterial proteins of unknown function. BMEI1370 is thought to be an ATPase.	503
399107	pfam05873	Mt_ATP-synt_D	ATP synthase D chain, mitochondrial (ATP5H). This family consists of several ATP synthase D chain, mitochondrial (ATP5H) proteins. Subunit d has no extensive hydrophobic sequences, and is not apparently related to any subunit described in the simpler ATP synthases in bacteria and chloroplasts.	154
368647	pfam05874	PBAN	Pheromone biosynthesis activating neuropeptide (PBAN). This family consists of several moth pheromone biosynthesis activating neuropeptide (PBAN) sequences. Female moths produce and release species specific sex pheromones to attract males for mating. Pheromone biosynthesis is hormonally regulated by the Pheromone Biosynthesis Activating Neuropeptide (PBAN) which is biosynthesized in the subesophageal ganglion (SOG).	184
399108	pfam05875	Ceramidase	Ceramidase. This family consists of several ceramidases. Ceramidases are enzymes involved in regulating cellular levels of ceramides, sphingoid bases, and their phosphates, EC:3.5.1.23. This family belongs to the CREST superfamily, which are distantly related to the GPCRs.	260
368649	pfam05876	Terminase_GpA	Phage terminase large subunit (GpA). This family consists of several phage terminase large subunit proteins as well as related sequences from several bacterial species. The DNA packaging enzyme of bacteriophage lambda, terminase, is a heteromultimer composed of a small subunit, gpNu1, and a large subunit, gpA, products of the Nu1 and A genes, respectively. Terminase is involved in the site-specific binding and cutting of the DNA in the initial stages of packaging. It is now known that gpA is actively involved in late stages of packaging, including DNA translocation, and that this enzyme contains separate functional domains for its early and late packaging activities.	559
283523	pfam05878	Phyto_Pns9_10	Phytoreovirus nonstructural protein Pns9/Pns10. This family consists of the Phytoreovirus nonstructural proteins Pns9 and Pns10. The function of this family is unknown.	344
399109	pfam05879	RHD3	Root hair defective 3 GTP-binding protein (RHD3). This family consists of several eukaryotic root hair defective 3 like GTP-binding proteins. It has been speculated that the RHD3 protein is a member of a novel class of GTP-binding proteins that is widespread in eukaryotes and required for regulated cell enlargement. The family also contains the homologous yeast synthetic enhancement of YOP1 (SEY1) protein which is involved in membrane trafficking.	639
283524	pfam05880	Fiji_64_capsid	Fijivirus 64 kDa capsid protein. This family consists of several Fijivirus 64 kDa capsid proteins.	554
399110	pfam05881	CNPase	2',3'-cyclic nucleotide 3'-phosphodiesterase (CNP or CNPase). This family consists of the eukaryotic protein 2',3'-cyclic nucleotide 3'-phosphodiesterase (CNP). 2',3'-cyclic nucleotide 3'-phosphodiesterase (CNP) is one of the earliest myelin-related proteins expressed in differentiating oligodendrocytes and Schwann cells. CNP is abundant in the central nervous system and in oligodendrocytes. This protein is also found in mammalian photoreceptor cells, testis and lymphocytes. Although the biological function of CNP is unknown, it is thought to play a significant role in the formation of the myelin sheath, where it comprises 4% of total protein. CNP selectively cleaves 2',3'-cyclic nucleotides to produce 2'-nucleotides in vitro. Although physiologically relevant substrates with 2',3'-cyclic termini are still unknown, numerous cyclic phosphate containing RNAs occur transiently within eukaryotic cells. Other known protein families capable of hydrolysing 2',3'-cyclic nucleotides include tRNA ligases and plant cyclic phosphodiesterases. The catalytic domains from all these proteins contain two tetra-peptide motifs H-X-T/S-X, where X is usually a hydrophobic residue. Mutation of either histidine in CNP abolishes enzymatic activity. CNPases belong to the 2H phosphoesterase superfamily. They share a common active site, characterized by two conserved histidines, with the bacterial tRNA-ligating enzyme LigT, vertebrate myelin-associated 2',3' phosphodiesterases, plant Arabidopsis thaliana CPDases and several bacterial and viral proteins.	214
283526	pfam05883	Baculo_RING	Baculovirus U-box/Ring-like domain. This family consists of several Baculovirus proteins of around 130 residues in length. The function of this family is unknown, but it appears to be related to the U-box and ring finger domain by profile-profile comparison.	138
368652	pfam05884	ZYG-11_interact	Interactor of ZYG-11. This family of proteins represents the protein product of the gene W03D8.9 which has been identified as an interactor of ZYG-11. ZYG-11 is the substrate-recognition subunit for a CUL-2 based complex that regulates cell division and embryonic development.	295
283528	pfam05886	Orthopox_F8	Orthopoxvirus F8 protein. This family consists of several Orthopoxvirus F8 proteins. The function of this family is unknown.	65
368653	pfam05887	Trypan_PARP	Procyclic acidic repetitive protein (PARP). This family consists of several Trypanosoma brucei procyclic acidic repetitive protein (PARP) like sequences. The procyclic acidic repetitive protein (parp) genes of Trypanosoma brucei encode a small family of abundant surface proteins whose expression is restricted to the procyclic form of the parasite. They are found at two unlinked loci, parpA and parpB; transcription of both loci is developmentally regulated.	134
399111	pfam05889	SepSecS	O-phosphoseryl-tRNA(Sec) selenium transferase, SepSecS. Early annotation suggested this family, SepSecS, of several eukaryotic and archaeal proteins, was involved in antigen-antibodies responses in the liver and pancreas. Structural studies show that the family is O-phosphoseryl-tRNA(Sec) selenium transferase, an enzyme involved in the synthesis of the amino acid selenocysteine (Sec). Sec is the only amino acid whose biosynthesis occurs on its cognate transfer RNA (tRNA). SepSecS catalyzes the final step in the formation of the amino acid. The early observation that autoantibodies isolated from patients with type I autoimmune hepatitis targeted a ribonucleoprotein complex containing tRNASec led to the identification and characterization of the archaeal and the human SepSecS. SepSecS forms its own branch in the family of fold-type I pyridoxal phosphate (PLP) enzymes that goes back to the last universal common ancestor which explains why the archaeal sequences spcS and MK0229 are annotated as being pyridoxal phosphate-dependent enzymes.	389
399112	pfam05890	Ebp2	Eukaryotic rRNA processing protein EBP2. This family consists of several Eukaryotic rRNA processing protein EBP2 sequences. Ebp2p is required for the maturation of 25S rRNA and 60S subunit assembly. Ebp2p may be one of the target proteins of Rrs1p for executing the signal to regulate ribosome biogenesis. This family also plays a role in chromosome segregation.	273
283530	pfam05891	Methyltransf_PK	AdoMet dependent proline di-methyltransferase. This protein is expressed in the tail neuron PVT and in uterine cells in C. elegans [worm-base]. In Saccharomyces cerevisiae this is AdoMet dependent proline di-methyltransferase. This enzyme catalyzes the di-methylation of ribosomal proteins Rpl12 and Rps25 at N-terminal proline residues. The methyltransferases described here specifically recognize the N-terminal X-Pro-Lys sequence motif, and they may account for nearly all previously described eukaryotic protein N-terminal methylation reactions. A number of other yeast and human proteins also share the recognition motif and may be similarly modified. As with other methyltransferases, this family carries the characteristic GxGxG motif.	217
283531	pfam05892	Tricho_coat	Trichovirus coat protein. This family consists of several coat proteins which are specific to the ssRNA positive-strand, no DNA stage viruses such as the Trichovirus and Vitivirus.	195
399113	pfam05893	LuxC	Acyl-CoA reductase (LuxC). This family consists of several bacterial Acyl-CoA reductase (LuxC) proteins. The channelling of fatty acids into the fatty aldehyde substrate for the bacterial bioluminescence reaction is catalyzed by a fatty acid reductase multienzyme complex, which channels fatty acids through the thioesterase (LuxD), synthetase (LuxE) and reductase (LuxC) components.	401
399114	pfam05894	Podovirus_Gp16	Podovirus DNA encapsidation protein (Gp16). This family consists of several DNA encapsidation protein (Gp16) sequences from the phi-29-like viruses. Gene product 16 catalyzes the in vivo and in vitro genome-encapsidation reaction.	331
310464	pfam05895	DUF859	Siphovirus protein of unknown function (DUF859). This family consists of several uncharacterized proteins from the Siphoviruses as well as one bacterial sequence. Some of the members of this family are described as putative minor structural proteins.	626
399115	pfam05896	NQRA	Na(+)-translocating NADH-quinone reductase subunit A (NQRA). This family consists of several bacterial Na(+)-translocating NADH-quinone reductase subunit A (NQRA) proteins. The Na(+)-translocating NADH: ubiquinone oxidoreductase (Na(+)-NQR) generates an electrochemical Na(+) potential driven by aerobic respiration.	257
399116	pfam05899	Cupin_3	Protein of unknown function (DUF861). This family consists of several proteins which seem to be specific to plants and bacteria. The function of this family is unknown.	74
399117	pfam05901	Excalibur	Excalibur calcium-binding domain. Extracellular Ca2+-dependent nuclease YokF from Bacillus subtilis and several other surface-exposed proteins from diverse bacteria are encoded in the genomes in two paralogous forms that differ by a ~45 amino acid fragment, which comprises a novel conserved domain. Sequence analysis of this domain revealed a conserved DxDxDGxxCE motif, which is strikingly similar to the Ca2+-binding loop of the calmodulin-like EF-hand domains, suggesting an evolutionary relationship between them. Functions of many of the other proteins in which the novel domain, named Excalibur (extracellular calcium-binding region), is found, as well as a structural model of its conserved motif are consistent with the notion that the Excalibur domain binds calcium. This domain is but one more example of the diversity of structural contexts surrounding the EF-hand-like calcium-binding loop in bacteria. This loop is thus more widespread than hitherto recognized and the evolution of EF-hand-like domains is probably more complex than previously appreciated.	36
399118	pfam05902	4_1_CTD	4.1 protein C-terminal domain (CTD). At the C-terminus of all known 4.1 proteins is a sequence domain unique to these proteins, known as the C-terminal domain (CTD). Mammalian CTDs are associated with a growing number of protein-protein interactions, although such activities have yet to be associated with invertebrate CTDs. Mammalian CTDs are generally defined by sequence alignment as encoded by exons 18-21. Comparison of known vertebrate 4.1 proteins with invertebrate 4.1 proteins indicates that mammalian 4.1 exon 19 represents a vertebrate adaptation that extends the sequence of the CTD with a Ser/Thr-rich sequence. The CTD was first described as a 22/24-kDa domain by chymotryptic digestion of erythrocyte 4.1 (4.1R). CTD is thought to represent an independent folding structure which has gained function since the divergence of vertebrates from invertebrates.	106
399119	pfam05903	Peptidase_C97	PPPDE putative peptidase domain. The PPPDE superfamily (after Permuted Papain fold Peptidases of DsRNA viruses and Eukaryotes), consists of predicted thiol peptidases with a circularly permuted papain-like fold. The inference of the likely DUB function of the PPPDE superfamily proteins is based on the fusions of the catalytic domain to Ub-binding PUG (PUB)/UBA domains and a novel alpha-helical Ub-associated domain (the PUL domain, after PLAP, Ufd3p and Lub1p).	151
399120	pfam05904	DUF863	Plant protein of unknown function (DUF863). This family consists of a number of hypothetical proteins from Arabidopsis thaliana and Oryza sativa. The function of this family is unknown.	939
114618	pfam05906	DUF865	Herpesvirus-7 repeat of unknown function (DUF865). This family consists of a series of 12 repeats of 35 amino acids in length which are found exclusively in Herpesvirus-7. The function of this family is unknown.	35
399121	pfam05907	DUF866	Eukaryotic protein of unknown function (DUF866). This family consists of a number of hypothetical eukaryotic proteins of unknown function with an average length of around 165 residues.	152
399122	pfam05908	Gamma_PGA_hydro	Poly-gamma-glutamate hydrolase. This family consists of a number of bacterial and phage proteins that function as gamma-PGA hydrolase enzymes. Structurally the protein in this family adopted an open alpha/beta mixed core structure with a seven-stranded parallel/anti-parallel beta-sheet. This structure shows similarity to mammalian carboxypeptidase A and related enzymes.	191
399123	pfam05910	DUF868	Plant protein of unknown function (DUF868). This family consists of several hypothetical proteins from Arabidopsis thaliana and Oryza sativa. The function of this family is unknown.	266
399124	pfam05911	FPP	Filament-like plant protein, long coiled-coil. FPP is a family of long coiled-coil plant proteins that are filament-like. It interacts with the nuclear envelope-associated protein, MAF1, the WPP family pfam13943.	859
399125	pfam05912	DUF870	Caenorhabditis elegans protein of unknown function (DUF870). This family consists of a number of hypothetical proteins which seem to be specific to Caenorhabditis elegans. The function of this family is unknown.	111
399126	pfam05913	DUF871	Bacterial protein of unknown function (DUF871). This family consists of several conserved hypothetical proteins from bacteria and archaea. The function of this family is unknown.	116
399127	pfam05914	RIB43A	RIB43A. This family consists of several RIB43A-like eukaryotic proteins. Ciliary and flagellar microtubules contain a specialized set of protofilaments, termed ribbons, that are composed of tubulin and several associated proteins. RIB43A was first characterized in the unicellular biflagellate, Chlamydomonas reinhardtii although highly related sequences are present in several higher eukaryotes including humans. The function of this protein is unknown although the structure of RIB43A and its association with the specialized protofilament ribbons and with basal bodies is relevant to the proposed role of ribbons in forming and stabilizing doublet and triplet microtubules and in organising their three-dimensional structure. Human RIB43A homologs could represent a structural requirement in centriole replication in dividing cells.	376
399128	pfam05915	DUF872	Eukaryotic protein of unknown function (DUF872). This family consists of several uncharacterized eukaryotic proteins. The function of this family is unknown.	115
399129	pfam05916	Sld5	GINS complex protein. The eukaryotic GINS complex is essential for the initiation and elongation phases of DNA replication. It consists of four paralogous protein subunits (Sld5, Psf1, Psf2 and Psf3), all of which are included in this family. The GINS complex is conserved from yeast to humans, and has been shown in human to bind directly to DNA primase.	105
283549	pfam05917	DUF874	Helicobacter pylori protein of unknown function (DUF874). This family consists of several hypothetical proteins specific to Helicobacter pylori. The function of this family is unknown.	398
399130	pfam05918	API5	Apoptosis inhibitory protein 5 (API5). This family consists of apoptosis inhibitory protein 5 (API5) sequences from several organisms. Apoptosis or programmed cell death is a physiological form of cell death that occurs in embryonic development and organ formation. It is characterized by biochemical and morphological changes such as DNA fragmentation and cell volume shrinkage. API5 is an anti apoptosis gene located in human chromosome 11, whose expression prevents the programmed cell death that occurs upon the deprivation of growth factors.	534
253459	pfam05919	Mitovir_RNA_pol	Mitovirus RNA-dependent RNA polymerase. This family consists of several Mitovirus RNA-dependent RNA polymerase proteins. The family also contains fragment matches in the mitochondria of Arabidopsis thaliana.	495
399131	pfam05920	Homeobox_KN	Homeobox KN domain. This is a homeobox transcription factor KN domain conserved from fungi to human and plants. They were first identified as TALE homeobox genes in eukaryotes (including KNOX and MEIS genes). They have been recently classified.	40
399132	pfam05922	Inhibitor_I9	Peptidase inhibitor I9. This family includes the proteinase B inhibitor from Saccharomyces cerevisiae and the activation peptides from peptidases of the subtilisin family. The subtilisin propeptides are known to function as molecular chaperones, assisting in the folding of the mature peptidase, but have also been shown to act as 'temporary inhibitors'.	82
399133	pfam05923	APC_r	APC repeat. This short region is found repeated in the mid region of the adenomatous polyposis proteins (APCs). In the human protein many cancer-linked SNPs are found near the first three occurrences of the motif. These repeats bind beta-catenin.	24
399134	pfam05924	SAMP	SAMP Motif. This short region is found repeated in the mid region of the adenomatous polyposis proteins (APCs). This motif binds axin.	22
283555	pfam05925	IpgD	Enterobacterial virulence protein IpgD. This family consists of several enterobacterial IpgD like virulence factor proteins. In the Gram-negative pathogen Shigella flexneri, the virulence factor IpgD is translocated directly into eukaryotic cells and acts as a potent inositol 4-phosphatase that specifically dephosphorylates phosphatidylinositol 4,5-bisphosphate [PtdIns(4,5)P(2)] into phosphatidylinositol 5-monophosphate [PtdIns(5)P] that then accumulates. Transformation of PtdIns(4,5)P(2) into PtdIns(5)P by IpgD is responsible for dramatic morphological changes of the host cell, leading to a decrease in membrane tether force associated with membrane blebbing and actin filament remodelling.	580
399135	pfam05926	Phage_GPL	Phage head completion protein (GPL). This family consists of several phage head completion protein (GPL) as well as related bacterial sequences. Members of this family allow the completion of filled heads by rendering newly packaged DNA in the heads resistant to DNase. The protein is thought to bind to DNA filled capsids.	139
147854	pfam05927	Penaeidin	Penaeidin. This family consists of several isoforms of the penaeidin protein which is specific to shrimps. Penaeidins, a unique family of antimicrobial peptides (AMPs) with both proline and cysteine-rich domains, were initially identified in the hemolymph of the Pacific white shrimp, Litopenaeus vannamei.	73
114639	pfam05928	Zea_mays_MuDR	Zea mays MURB-like protein (MuDR). This family consists of several Zea mays specific MURB-like proteins. The transposition of Mu elements underlying Mutator activity in maize requires a transcriptionally active MuDR element. Despite variation in MuDR copy number and RNA levels in Mutator lines, transposition events are consistently late in plant development, and Mu excision frequencies are similar.	207
399136	pfam05929	Phage_GPO	Phage capsid scaffolding protein (GPO) serine peptidase. This family consists of several bacteriophage capsid scaffolding proteins (GPO) and some related bacterial sequences. GPO is thought to function in both the assembly of proheads and the cleavage of GPN. The family is found to function as a serine peptidase, with a conserved Asp, His and Ser catalytic triad, as in subtilisin, and as represented in MEROPS:S73. The family includes capsid assembly scaffolding protein from Enterobacteria phage P2 which cleaves itself and then becomes the scaffold protein upon which the bacteriophage prohead is built - a mechanism quite common amongst phages.	272
399137	pfam05930	Phage_AlpA	Prophage CP4-57 regulatory protein (AlpA). This family consists of several short bacterial and phage proteins which are related to the E. coli protein AlpA. AlpA suppresses two phenotypes of a delta lon protease mutant, overproduction of capsular polysaccharide and sensitivity to UV light. Several of the sequences in this family are thought to be DNA-binding proteins.	51
368676	pfam05931	AgrD	Staphylococcal AgrD protein. This family consists of several AgrD proteins from many Staphylococcus species. The agr locus was initially described in Staphylococcus aureus as an element controlling the production of exoproteins implicated in virulence. Its pattern of action has been shown to be complex, upregulating certain extracellular toxins and enzymes expressed post-exponentially and repressing some exponential-phase surface components. AgrD encodes the precursor of the autoinducing peptide (AIP). The AIP derived from AgrD by the action of AgrB interacts with AgrC in the membrane to activate AgrA, which upregulates transcription both from promoter P2, amplifying the response, and from P3, initiating the production of a novel effector: RNAIII. In S. aureus, delta-hemolysin is the only translation product of RNA III and is not involved in the regulatory functions of the transcript, which is therefore the primary agent for modulating the expression of other operons controlled by agr.	45
399138	pfam05932	CesT	Tir chaperone protein (CesT) family. This family consists of a number of bacterial sequences which are highly similar to the Tir chaperone protein in E. coli. In many Gram-negative bacteria, a key indicator of pathogenic potential is the possession of a specialized type III secretion system, which is utilized to deliver virulence effector proteins directly into the host cell cytosol. Many of the proteins secreted from such systems require small cytosolic chaperones to maintain the secreted substrates in a secretion-competent state. CesT serves a chaperone function for the enteropathogenic Escherichia coli (EPEC) translocated intimin receptor (Tir) protein, which confers upon EPEC the ability to alter host cell morphology following intimate bacterial attachment. This family also contains several DspF and related sequences from several plant pathogenic bacteria. The "disease-specific" (dsp) region next to the hrp gene cluster of Erwinia amylovora is required for pathogenicity but not for elicitation of the hypersensitive reaction. DspF and AvrF are small (16 kDa and 14 kDa) and acidic with predicted amphipathic alpha helices in their C termini; they resemble chaperones for virulence factors secreted by type III secretion systems of animal pathogens.	119
283561	pfam05933	Fun_ATP-synt_8	Fungal ATP synthase protein 8 (A6L). This family consists of fungus specific ATP synthase protein 8 (EC:3.6.3.14). The family may be related to the ATP synthase protein 8 found in other eukaryotes pfam00895.	48
310487	pfam05934	MCLC	Mid-1-related chloride channel (MCLC). This family consists of several mid-1-related chloride channels. mid-1-related chloride channel (MCLC) proteins function as a chloride channel when incorporated in the planar lipid bilayer.	549
399139	pfam05935	Arylsulfotrans	Arylsulfotransferase (ASST). This family consists of several bacterial Arylsulfotransferase proteins. Arylsulfotransferase (ASST) transfers a sulfate group from phenolic sulfate esters to a phenolic acceptor substrate.	368
399140	pfam05936	T6SS_VasE	Bacterial Type VI secretion, VC_A0110, EvfL, ImpJ, VasE. T6SS_VasE is a family of bacterial proteins that are essential for the type VI pathogenic secretion system, although the exact function of this particular component of the system is still not known.	427
399141	pfam05937	EB1_binding	EB-1 Binding Domain. This region, found at the C-terminus of the APC proteins, binds the microtubule-associating protein EB-1. At the C-terminus of the alignment is also a pfam00595 binding domain. A short motif in the middle of the region appears to be found in the APC2 proteins.	174
399142	pfam05938	Self-incomp_S1	Plant self-incompatibility protein S1. This family consists of a series of plant proteins which are related to the Papaver rhoeas S1 self-incompatibility protein. Self incompatibility (SI) is the single most important outbreeding device found in angiosperms and is a mechanism that regulates the acceptance or rejection of pollen. S1 is known to exhibit specific pollen-inhibitory properties.	89
399143	pfam05939	Phage_min_tail	Phage minor tail protein. This family consists of a series of phage minor tail proteins and related sequences from several bacterial species.	107
399144	pfam05940	NnrS	NnrS protein. This family consists of several bacterial NnrS like proteins. NnrS is a putative heme-Cu protein (NnrS) and a member of the short-chain dehydrogenase family. Expression of nnrS is dependent on the transcriptional regulator NnrR, which also regulates expression of genes required for the reduction of nitrite to nitrous oxide, including nirK and nor. NnrS is a haem- and copper-containing membrane protein. Genes encoding putative orthologues of NnrS are sometimes but not always found in bacteria encoding nitrite and/or nitric oxide reductase.	367
283569	pfam05941	Chordopox_A20R	Chordopoxvirus A20R protein. This family consists of several Chordopoxvirus A20R proteins. The A20R protein is required for DNA replication, is associated with the processive form of the viral DNA polymerase, and directly interacts with the viral proteins encoded by the D4R, D5R, and H5R open reading frames. A20R may contribute to the assembly or stability of the multiprotein DNA replication complex.	335
377572	pfam05942	PaREP1	Archaeal PaREP1/PaREP8 family. This family consists of several archaeal PaREP1 and PaREP8 proteins. The function of this family is unknown.	115
399145	pfam05943	VipB	Type VI secretion protein, EvpB/VC_A0108, tail sheath. EvpB is a family of Gram-negative probable type VI secretion system components of the tail sheath. They have been known as COG:COG3517. These sheath-components, of which there are many copies in the sheath, are also variously referred to as VipA/VipB and TssB/TssC. On contact with another bacterial cell the sheath contracts and pushes the puncturing device and tube through the cell envelope and punches the target bacterial cell.	301
399146	pfam05944	Phage_term_smal	Phage small terminase subunit. This family consists of several phage small terminase subunit proteins as well as some related bacterial sequences.	129
283573	pfam05946	TcpA	Toxin-coregulated pilus subunit TcpA. This family consists of toxin-coregulated pilus subunit (TcpA) proteins from Vibrio cholerae and related sequences. The major virulence factors of toxigenic Vibrio cholerae are cholera toxin (CT), which is encoded by a lysogenic bacteriophage (CTXPhi), and toxin-coregulated pilus (TCP), an essential colonisation factor which is also the receptor for CTXPhi. The genes for the biosynthesis of TCP are part of a larger genetic element known as the TCP pathogenicity island.	130
399147	pfam05947	T6SS_TssF	Type VI secretion system, TssF. This is a family of Gram-negative bacterial proteins that form part of the type VI pathogenicity secretion system (T6SS), including TssF. TssF is homologous to phage tail proteins and is required for proper assembly of the Hcp tube (the T6SS inner tube) in bacteria.	606
399148	pfam05949	DUF881	Bacterial protein of unknown function (DUF881). This family consists of a series of hypothetical bacterial proteins. One of the family members YlxW from Bacillus subtilis is thought to be involved in cell division and sporulation.	141
283576	pfam05950	Orthopox_A36R	Orthopoxvirus A36R protein. This family consists of several Orthopoxvirus A36R proteins. The A36R protein is predicted to be a type Ib membrane protein.	158
399149	pfam05951	Peptidase_M15_2	Bacterial protein of unknown function (DUF882). This family consists of a series of hypothetical bacterial proteins of unknown function.	150
368681	pfam05952	ComX	Bacillus competence pheromone ComX. Natural genetic competence in Bacillus subtilis is controlled by quorum-sensing (QS). The ComP-ComA two-component system detects the signalling molecule ComX, and this signal is transduced by a conserved phosphotransfer mechanism. ComX is synthesized as an inactive precursor and is then cleaved and modified by ComQ before export to the extracellular environment.	55
283579	pfam05953	Allatostatin	Allatostatin. This family consists of allatostatins, bombystatins, helicostatins, cydiastatins and schistostatin from several insect species. Allatostatins (ASTs) of the Tyr/Phe-Xaa-Phe-Gly Leu/Ile-NH2 family are a group of insect neuropeptides that inhibit juvenile hormone biosynthesis by the corpora allata.	11
399150	pfam05954	Phage_GPD	Phage late control gene D protein (GPD). This family includes a number of phage late control gene D proteins and related bacterial sequences. This family also includes Bacteriophage Mu P proteins and related sequences.	305
368683	pfam05955	Herpes_gp2	Equine herpesvirus glycoprotein gp2. This family consists of a number of glycoprotein gp2 sequences from equine herpesviruses.	226
399151	pfam05956	APC_basic	APC basic domain. This region of the APC family of proteins is known as the basic domain. It contains a high proportion of positively charged amino acids and interacts with microtubules.	337
399152	pfam05957	DUF883	Bacterial protein of unknown function (DUF883). This family consists of several hypothetical bacterial proteins of unknown function.	53
310503	pfam05958	tRNA_U5-meth_tr	tRNA (Uracil-5-)-methyltransferase. This family consists of (Uracil-5-)-methyltransferases EC:2.1.1.35 from bacteria, archaea and eukaryotes. A 5-methyluridine (m(5)U) residue at position 54 is a conserved feature of bacterial and eukaryotic tRNAs. The methylation of U54 is catalyzed by the tRNA(m5U54)methyltransferase, which in Saccharomyces cerevisiae is encoded by the nonessential TRM2 gene. It is thought that tRNA modification enzymes might have a role in tRNA maturation not necessarily linked to their known catalytic activity.	357
283584	pfam05959	DUF884	Nucleopolyhedrovirus protein of unknown function (DUF884). This family consists of several hypothetical Nucleopolyhedrovirus proteins of unknown function.	194
399153	pfam05960	DUF885	Bacterial protein of unknown function (DUF885). This family consists of several hypothetical bacterial proteins several of which are putative membrane proteins.	527
283586	pfam05961	Chordopox_A13L	Chordopoxvirus A13L protein. This family consists of A13L proteins from the Chordopoxviruses. A13L or p8 is one of the three most abundant membrane proteins of the intracellular mature Vaccinia virus.	69
399154	pfam05962	HutD	HutD. HutD from Pseudomonas fluorescens SBW25 is a component of the histidine uptake and utilisation operon. HutD is operonic with the well characterized repressor protein HutC. Genetic analysis using transcriptional fusions (lacZ) and deletion mutants shows that hutD is necessary to maintain fitness in environments replete with histidine. Evidence outlined by Zhang & Rainey (2007) suggests that HutD functions as a governor that sets an upper bound on the level of hut operon transcription. The mechanistic basis is unknown, but in silico molecular docking studies based on the crystal structure of PA5104 (HutD from Pseudomonas aeruginosa) show that urocanate (the first breakdown product of histidine) docks with the active site of HutD.	180
283588	pfam05963	Cytomega_US3	Cytomegalovirus US3 protein. US3 of human cytomegalovirus is an endoplasmic reticulum resident transmembrane glycoprotein that binds to major histocompatibility complex class I molecules and prevents their departure. The endoplasmic reticulum retention signal of the US3 protein is contained in the luminal domain of the protein.	212
399155	pfam05964	FYRN	F/Y-rich N-terminus. This region is normally found in the trithorax/ALL1 family proteins. It is similar to SMART:SM00541.	51
399156	pfam05965	FYRC	F/Y rich C-terminus. This region is normally found in the trithorax/ALL1 family proteins. It is similar to SMART:SM00542.	83
283591	pfam05966	Chordopox_A33R	Chordopoxvirus A33R protein. This family consists of several Chordopoxvirus A33R proteins. A33R plays a role in promoting Ab-resistant cell-to-cell spread of virus and interacts with A36R to incorporate the protein into the outer membrane of intracellular enveloped virions (IEV).	184
399157	pfam05968	Bacillus_PapR	Bacillus PapR protein. This family consists of the Bacillus species specific PapR protein. The papR gene belongs to the PlcR regulon and is located 70 bp downstream from plcR. It encodes a 48-amino-acid peptide. Disruption of the papR gene abolishes expression of the PlcR regulon, resulting in a large decrease in haemolysis and virulence in insect larvae. A processed form of PapR activates the PlcR regulon by allowing PlcR to bind to its DNA target. This activating mechanism is strain specific.	45
399158	pfam05969	PSII_Ycf12	Photosystem II complex subunit Ycf12. Ycf12 has been identified as a core subunit in the photosystem II (PSII) complex. PsbZ has been shown to be required for the association of PsbK and Ycf12 with PSII.	29
399159	pfam05970	PIF1	PIF1-like helicase. This family includes homologs of the PIF1 helicase, which inhibits telomerase activity and is cell cycle regulated. This family includes a large number of largely uncharacterized plant proteins. This family includes a P-loop motif that is involved in nucleotide binding.	360
399160	pfam05971	Methyltransf_10	Protein of unknown function (DUF890). This family consists of several conserved hypothetical proteins from both eukaryotes and prokaryotes. The function of this family is unknown.	291
283595	pfam05972	APC_15aa	APC 15 residue motif. This motif, known as the 15 aa repeat, is found in the APC protein family. They are involved in binding beta-catenin along with the pfam05923 repeats. Many human cancer mutations map to the region around these motifs, and may be involved in disrupting their binding of beta-catenin.	15
399161	pfam05973	Gp49	Phage derived protein Gp49-like (DUF891). This family consists of hypothetical bacterial proteins of unknown function as well as phage Gp49 proteins.	90
399162	pfam05974	DUF892	Domain of unknown function (DUF892). This family consists of several hypothetical bacterial proteins of unknown function.	156
399163	pfam05975	EcsB	Bacterial ABC transporter protein EcsB. This family consists of several bacterial ABC transporter proteins which are homologous to the EcsB protein of Bacillus subtilis. EcsB is thought to encode a hydrophobic protein with six membrane-spanning helices in a pattern found in other hydrophobic components of ABC transporters.	383
399164	pfam05977	MFS_3	Transmembrane secretion effector. This is a family of transport proteins. Members of this family include a protein responsible for the secretion of the ferric chelator, enterobactin, and a protein involved in antibiotic resistance.	523
310513	pfam05978	UNC-93	Ion channel regulatory protein UNC-93. This family of proteins is a component of a multi-subunit protein complex which is involved in the coordination of muscle contraction. UNC-93 is most likely an ion channel regulatory protein.	157
399165	pfam05979	DUF896	Bacterial protein of unknown function (DUF896). In B. subtilis, one small SOS response operon under the control of LexA, the yneA operon, is comprised of three genes: yneA, yneB, and ynzC. This family consists of several short, hypothetical bacterial proteins of unknown function. These proteins are mainly found in gram-positive firmicutes. Structures show that the N-terminus is composed of two alpha helices forming a helix-loop-helix motif. The structure of ynzC from B. subtilis forms a trimeric complex. Structural modelling suggests this domain may bind nucleic acids. This family is also known as UPF0291.	62
368691	pfam05980	Toxin_7	Toxin 7. This family consists of several short spider neurotoxin proteins including many from the Funnel-web spider.	34
399166	pfam05981	CreA	CreA protein. This family consists of several bacterial CreA proteins, the function of which is unknown.	118
399167	pfam05982	Sbt_1	Na+-dependent bicarbonate transporter superfamily. Family of bacterial proteins that are likely to be part of the Na(+)-dependent bicarbonate transporter (sbt) family. Members carry 10TMS in a 5+5 duplicated structure. The loop between helices 5 and 6 in Synechocystis PCC6803 is likely to be the location for regulatory mechanisms governing the activation of the transporter.	308
399168	pfam05983	Med7	MED7 protein. This family consists of several eukaryotic proteins which are homologs of the yeast MED7 protein. Activation of gene transcription in metazoans is a multi-step process that is triggered by factors that recognize transcriptional enhancer sites in DNA. These factors work with co-activators such as MED7 to direct transcriptional initiation by the RNA polymerase II apparatus.	180
283605	pfam05984	Cytomega_UL20A	Cytomegalovirus UL20A protein. This family consists of several Cytomegalovirus UL20A proteins. UL20A is thought to be a glycoprotein.	103
399169	pfam05985	EutC	Ethanolamine ammonia-lyase light chain (EutC). This family consists of several bacterial ethanolamine ammonia-lyase light chain (EutC) EC:4.3.1.7 sequences. Ethanolamine ammonia-lyase is a bacterial enzyme that catalyzes the adenosylcobalamin-dependent conversion of certain vicinal amino alcohols to oxo compounds and ammonia.	233
368694	pfam05986	ADAM_spacer1	ADAM-TS Spacer 1. This family represents the Spacer-1 region from the ADAM-TS family of metalloproteinases.	114
399170	pfam05987	DUF898	Bacterial protein of unknown function (DUF898). This family consists of several bacterial proteins of unknown function. Some of the family members are described as putative membrane proteins.	336
399171	pfam05988	DUF899	Bacterial protein of unknown function (DUF899). This family consists of several uncharacterized bacterial proteins of unknown function.	224
283610	pfam05989	Chordopox_A35R	Chordopoxvirus A35R protein. This family consists of several Chordopoxvirus sequences homologous to the Vaccinia virus A35R protein. The function of this family is unknown.	172
399172	pfam05990	DUF900	Alpha/beta hydrolase of unknown function (DUF900). This family consists of several hypothetical proteins of unknown function mostly found in Rhizobium species. Members of this family have an alpha/beta hydrolase fold.	236
399173	pfam05991	NYN_YacP	YacP-like NYN domain. This family consists of bacterial proteins related to YacP. This family is uncharacterized functionally, but it has been suggested that these proteins are nucleases due to them containing a NYN domain. NYN (for N4BP1, YacP-like Nuclease) domains were discovered by Anantharaman and Aravind. Based on gene neighborhoods it was suggested that the bacterial YacP proteins interact with the Ribonuclease III and TrmH methylase in a processome complex that catalyzes the maturation of rRNA and tRNA.	166
399174	pfam05992	SbmA_BacA	SbmA/BacA-like family. The Rhizobium meliloti bacA gene encodes a function that is essential for bacterial differentiation into bacteroids within plant cells in the symbiosis between R. meliloti and alfalfa. An Escherichia coli homolog of BacA, SbmA, is implicated in the uptake of microcins and bleomycin. This family is likely to be a subfamily of the ABC transporter family.	315
283614	pfam05993	Reovirus_M2	Reovirus major virion structural protein Mu-1/Mu-1C (M2). This family consists of several Reovirus major virion structural protein Mu-1/Mu-1C (M2) sequences. This family is thought to play a role in host cell membrane penetration.	647
399175	pfam05994	FragX_IP	Cytoplasmic Fragile-X interacting family. CYFIP1/2 (Cytoplasmic fragile X mental retardation interacting protein) like proteins form a highly conserved protein family. The function of CYFIPs is unclear, but CYFIP interaction with fragile X mental retardation interacting protein (FMRP) involves the domain of FMRP which also mediates homo- and heteromerization.	842
399176	pfam05995	CDO_I	Cysteine dioxygenase type I. Cysteine dioxygenase type I (EC:1.13.11.20) converts cysteine to cysteinesulphinic acid and is the rate-limiting step in sulphate production.	168
399177	pfam05996	Fe_bilin_red	Ferredoxin-dependent bilin reductase. This family consists of several different but closely related proteins which include phycocyanobilin:ferredoxin oxidoreductase EC:1.3.7.5 (PcyA), 15,16-dihydrobiliverdin:ferredoxin oxidoreductase EC:1.3.7.2 (PebA) and phycoerythrobilin:ferredoxin oxidoreductase EC:1.3.7.3 (PebB). Phytobilins are linear tetrapyrrole precursors of the light-harvesting prosthetic groups of the phytochrome photoreceptors of plants and the phycobiliprotein photosynthetic antennae of cyanobacteria, red algae, and cryptomonads. It is known that phytobilins are synthesized from heme via the intermediary of biliverdin IX alpha (BV), which is reduced subsequently by ferredoxin-dependent bilin reductases with different double-bond specificities.	228
399178	pfam05997	Nop52	Nucleolar protein, Nop52. Nop52 is believed to be involved in the generation of 28S rRNA.	212
283619	pfam05999	Herpes_U5	Herpesvirus U5-like family. This family of Herpesvirus includes U4, U5 and UL27.	488
399179	pfam06001	DUF902	Domain of Unknown Function (DUF902). This domain of unknown function is found in several transcriptional co-activators including the CREB-binding protein, which is an acetyltransferase that acetylates histones, giving a specific tag for transcriptional activation. This short domain is found to the C-terminus of bromodomains. The 40 residue domain contains four conserved cysteines suggesting that it may be stabilized by a zinc ion. In CREB this domain is to the N-terminus of another zinc binding PHD domain.	40
310528	pfam06002	CST-I	Alpha-2,3-sialyltransferase (CST-I). This family consists of several alpha-2,3-sialyltransferase (CST-I) proteins largely found in Campylobacter jejuni.	293
399180	pfam06003	SMN	Survival motor neuron protein (SMN). This family consists of several eukaryotic survival motor neuron (SMN) proteins. The Survival of Motor Neurons (SMN) protein, the product of the spinal muscular atrophy-determining gene, is part of a large macromolecular complex (SMN complex) that functions in the assembly of spliceosomal small nuclear ribonucleoproteins (snRNPs). The SMN complex functions as a specificity factor essential for the efficient assembly of Sm proteins on U snRNAs and likely protects cells from illicit, and potentially deleterious, non-specific binding of Sm proteins to RNAs.	263
399181	pfam06004	DUF903	Bacterial protein of unknown function (DUF903). This family consists of several small bacterial proteins several of which are classified as putative lipoproteins. The function of this family is unknown.	48
399182	pfam06005	ZapB	Cell division protein ZapB. ZapB is a non-essential, abundant cell division factor that is required for proper Z-ring formation.	71
368702	pfam06006	DUF905	Bacterial protein of unknown function (DUF905). This family consists of several short hypothetical Enterobacteria proteins of unknown function. Structural analysis of the surface features of the protein YvyC has revealed a single cluster of highly conserved residues on the surface. Additionally, these residues fall into two groups which lie within the two largest of the three cavities identified over the surface. The conclusion from this is that these two cavities, with Leu 58, Glu 75, Ile 82, Glu 83 and Pro 86 conserved, are likely to be important for the molecular function and reflect the cavities found on the surface of the FlaG proteins in pfam03646.	70
399183	pfam06007	PhnJ	Phosphonate metabolism protein PhnJ. This family consists of several bacterial phosphonate metabolism (PhnJ) sequences. The exact role that PhnJ plays in phosphonate utilisation is unknown.	274
310534	pfam06008	Laminin_I	Laminin Domain I. It has been suggested that the domains I and II from laminin A, B1 and B2 may come together to form a triple helical coiled-coil structure.	258
368703	pfam06009	Laminin_II	Laminin Domain II. It has been suggested that the domains I and II from laminin A, B1 and B2 may come together to form a triple helical coiled-coil structure.	138
399184	pfam06011	TRP	Transient receptor potential (TRP) ion channel. This family of proteins are transient receptor potential (TRP) ion channels. They are essential for cellular viability and are involved in cell growth and cell wall synthesis. The genes for these proteins are homologous to polycystic kidney disease related ion channel genes.	424
399185	pfam06012	DUF908	Domain of Unknown Function (DUF908). 	350
399186	pfam06013	WXG100	Proteins of 100 residues with WXG. ESAT-6 is a small protein that appears to be of fundamental importance in virulence and protective immunity in Mycobacterium tuberculosis. Homologs have been detected in other Gram-positive bacterial species. It may represent a novel secretion system potentially driven by the pfam01580 domains in the YukA-like proteins.	85
399187	pfam06014	DUF910	Bacterial protein of unknown function (DUF910). This family consists of several short bacterial proteins of unknown function.	61
283633	pfam06015	Chordopox_A30L	Chordopoxvirus A30L protein. This family consists of several short Chordopoxvirus proteins which are homologous to the A30L protein of Vaccinia virus. The vaccinia virus A30L protein is required for the association of electron-dense, granular, proteinaceous material with the concave surfaces of crescent membranes, an early step in viral morphogenesis. A30L is known to interact with the G7L protein and it has been shown that the stability of each is dependent on its association with the other.	71
283634	pfam06016	Reovirus_L2	Reovirus core-spike protein lambda-2 (L2). This family consists of several Reovirus core-spike protein lambda-2 (L2) sequences. The reovirus L2 genome segment encodes the core spike protein lambda-2, which mediates enzymatic reactions in 5' capping of the viral plus-strand transcripts.	1297
399188	pfam06017	Myosin_TH1	Unconventional myosin tail, actin- and lipid-binding. Unconventional myosins, ie those that are not found in muscle, have the common, classical-type head domain, sometimes a neck with the IQ calmodulin-binding motifs, and then non-standard tails. These tails determine the subcellular localization of the unconventional myosins and also help determine their individual functions. The family carries several different unconventional myosins, eg. Myo1f is expressed mainly in immune cells as well as in the inner ear where it can be associated with deafness, Myo1d has a lipid-binding module in their tail and is implicated in endosome vesicle recycling in epithelial cells. Myo1a, b, c and g from various eukaryotes are also found in this family.	196
399189	pfam06018	CodY	CodY GAF-like domain. This domain is a GAF-like domain found at the N-terminus of several bacterial GTP-sensing transcriptional pleiotropic repressor CodY proteins. Presumably this domain is involved in GTP binding. CodY has been found to repress the dipeptide transport operon (dpp) of Bacillus subtilis in nutrient-rich conditions. The CodY protein also has a repressor effect on many genes in Lactococcus lactis during growth in milk.	177
147919	pfam06019	Phage_30_8	Phage GP30.8 protein. This family consists of several GP30.8 proteins from the T4-like phages. The function of this family is unknown.	124
310540	pfam06020	Roughex	Drosophila roughex protein. This family consists of several roughex (RUX) proteins specific to Drosophila species. Roughex can influence the intracellular distribution of cyclin A and is therefore defined as a distinct and specialized cell cycle inhibitor for cyclin A-dependent kinase activity. Rux is thought to regulate the metaphase to anaphase transition during development.	379
399190	pfam06021	Gly_acyl_tr_N	Aralkyl acyl-CoA:amino acid N-acyltransferase. This family consists of several mammalian specific aralkyl acyl-CoA:amino acid N-acyltransferase (glycine N-acyltransferase) proteins EC:2.3.1.13.	196
399191	pfam06022	Cir_Bir_Yir	Plasmodium variant antigen protein Cir/Yir/Bir. This family consists of several Cir, Yir and Bir proteins from the Plasmodium species P.chabaudi, P.yoelii and P.berghei.	253
283640	pfam06023	Csa1	CRISPR-associated exonuclease Csa1. CRISPR (clustered regularly interspaced short palindromic repeats) elements and cas (CRISPR-associated) genes are widespread in Bacteria and Archaea. The CRISPR/Cas system operates as a defense mechanism against mobile genetic elements (i.e., viruses or plasmids). Csa1 is part of the archaeal subtype I-A system. Cas1 has not yet been enzymatically characterized.	292
368709	pfam06024	Orf78	Orf78 (ac78). Family members include Autographa californica nuclear polyhedrosis virus (AcMNPV), AC78 or Orf78. AC78 is a late gene in the viral life cycle and encodes an envelope structural protein that plays an essential role in embedding the occlusion-derived virus (ODV) in the occlusion body. Although AC78 is not essential for budding virus formation or nucleocapsid assembly and ODV formation, their numbers are significantly reduced if the gene is knocked out.	101
399192	pfam06025	DUF913	Domain of Unknown Function (DUF913). Members of this family are found in various ubiquitin protein ligases.	368
399193	pfam06026	Rib_5-P_isom_A	Ribose 5-phosphate isomerase A (phosphoriboisomerase A). This family consists of several ribose 5-phosphate isomerase A or phosphoriboisomerase A (EC:5.3.1.6) from bacteria, eukaryotes and archaea.	169
283644	pfam06027	SLC35F	Solute carrier family 35. This is a family of putative solute carrier proteins from eukaryotes.	299
283645	pfam06028	DUF915	Alpha/beta hydrolase of unknown function (DUF915). This family consists of several bacterial proteins of unknown function. Members of this family have an alpha/beta hydrolase fold.	253
399194	pfam06029	AlkA_N	AlkA N-terminal domain. 	118
399195	pfam06030	DUF916	Bacterial protein of unknown function (DUF916). This family consists of several hypothetical bacterial proteins of unknown function.	120
399196	pfam06031	SERTA	SERTA motif. This family consists of a novel motif designated as SERTA (for SEI-1, RBT1, and TARA), corresponding to the largest conserved region among TRIP-Br proteins. The function of this motif is uncertain, but the CDK4-interacting segment of p34SEI-1 (amino acid residues 44-161) includes most of the SERTA motif.	36
399197	pfam06032	DUF917	Protein of unknown function (DUF917). This family consists of hypothetical bacterial and archaeal proteins of unknown function.	350
283650	pfam06033	DUF918	Nucleopolyhedrovirus protein of unknown function (DUF918). This family consists of several Nucleopolyhedrovirus proteins with no known function.	152
114740	pfam06034	DUF919	Nucleopolyhedrovirus protein of unknown function (DUF919). This family consists of several short Nucleopolyhedrovirus proteins of unknown function.	62
399198	pfam06035	Peptidase_C93	Bacterial transglutaminase-like cysteine proteinase BTLCP. Members of this family are predicted to be bacterial transglutaminase-like cysteine proteinases. They contain a conserved Cys-His-Asp catalytic triad. Their structure is predicted to be similar to that of Salmonella typhimurium N-hydroxyarylamine O-acetyltransferase, in pfam00797, however they lack the sub-domain which is important for arylamine recognition.	161
399199	pfam06037	DUF922	Bacterial protein of unknown function (DUF922). This family of proteins has a conserved HEXXH motif, suggesting they are putative peptidases of zincin fold.	159
399200	pfam06039	Mqo	Malate:quinone oxidoreductase (Mqo). This family consists of several bacterial Malate:quinone oxidoreductase (Mqo) proteins (EC:1.1.99.16). Mqo takes part in the citric acid cycle. It oxidizes L-malate to oxaloacetate and donates electrons to ubiquinone-1 and other artificial acceptors or, via the electron transfer chain, to oxygen. NAD is not an acceptor and the natural direct acceptor for the enzyme is most likely a quinone. The enzyme is therefore called malate:quinone oxidoreductase, abbreviated to Mqo. Mqo is a peripheral membrane protein and can be released from the membrane by addition of chelators.	488
253527	pfam06040	Adeno_E3	Adenovirus E3 protein. This family consists of several Adenovirus E3 proteins. The E3 protein does not seem to be essential for virus replication in cultured cells suggesting that the protein may function in virus-host interactions.	126
399201	pfam06041	DUF924	Bacterial protein of unknown function (DUF924). This family consists of several hypothetical bacterial proteins of unknown function. Structurally, this family resembles TPR-like repeats.	185
399202	pfam06042	NTP_transf_6	Nucleotidyltransferase. This family consists of several hypothetical bacterial proteins of unknown function. This family was recently identified as belonging to the nucleotidyltransferase superfamily.	157
283656	pfam06043	Reo_P9	Reovirus P9-like family. 	334
399203	pfam06044	DpnI	Dam-replacing family. Dam-replacing protein (DRP) is a restriction endonuclease that is flanked by pseudo-transposable small repeat elements. The replacement of Dam-methylase by DRP allows phase variation through slippage-like mechanisms in several pathogenic isolates of Neisseria meningitidis.	182
283658	pfam06045	Rhamnogal_lyase	Rhamnogalacturonate lyase family. Rhamnogalacturonate lyase (EC:4.2.2.-) degrades the rhamnogalacturonan I (RG-I) backbone of pectin. This family contains mainly members from plants, but also contains the plant pathogen Erwinia chrysanthemi.	211
399204	pfam06046	Sec6	Exocyst complex component Sec6. Sec6 is a component of the multiprotein exocyst complex. Sec6 interacts with Sec8, Sec10 and Exo70. These exocyst proteins localize to regions of active exocytosis (at the growing ends of interphase cells and in the medial region of cells undergoing cytokinesis) in an F-actin-dependent and exocytosis-independent manner.	566
399205	pfam06047	SynMuv_product	Ras-induced vulval development antagonist. This family is from synthetic multi-vulval genes which encode chromatin-associated proteins involved in transcriptional repression. This protein has a role in antagonising Ras-induced vulval development.	102
399206	pfam06048	DUF927	Domain of unknown function (DUF927). Family of bacterial proteins of unknown function. The C-terminal half of this family contains a P-loop motif. The N-terminal domain appears to have a unique fold, which contains three helices and two strands. Structural analyses show that helicases containing this domain form a hexameric ring with a positively charged central pore threading a single DNA strand through, suggestive of a replicative function for this helicase.	286
399207	pfam06049	LSPR	Coagulation Factor V LSPD Repeat. These repeats are found in coagulation factor V (five). The name LSPD derives from the conserved residues in the middle of the repeat. They occur in the B domain, which is cleaved prior to activation of the protein. It has been suggested that domain B brings domains A and C together for activation.	9
399208	pfam06050	HGD-D	2-hydroxyglutaryl-CoA dehydratase, D-component. Degradation of glutamate via the hydroxyglutarate pathway involves the syn-elimination of water from 2-hydroxyglutaryl-CoA. This anaerobic process is catalyzed by 2-hydroxyglutaryl-CoA dehydratase, an enzyme with two components (A and D) that reversibly associate during reaction cycles. This component contains one non-reducible [4Fe-4S]2+ cluster and a reduced riboflavin 5'-monophosphate.	339
399209	pfam06051	DUF928	Domain of Unknown Function (DUF928). Family of uncharacterized bacterial protein.	186
399210	pfam06052	3-HAO	3-hydroxyanthranilic acid dioxygenase. In eukaryotes 3-hydroxyanthranilic acid dioxygenase (EC:1.13.11.6) is part of the kynurenine pathway for the degradation of tryptophan and the biosynthesis of nicotinic acid. The prokaryotic homolog is involved in the 2-nitrobenzoate degradation pathway.	151
368722	pfam06053	DUF929	Domain of unknown function (DUF929). Family of proteins from the archaeon Sulfolobus, with undetermined function.	248
283666	pfam06054	CoiA	Competence protein CoiA-like family. Many of the members of this family are described as transcription factors. CoiA falls within a competence-specific operon in Streptococcus. CoiA is an uncharacterized protein.	377
399211	pfam06055	ExoD	Exopolysaccharide synthesis, ExoD. Among the bacterial genes required for nodule invasion are the exo genes. These genes are involved in the production of an extracellular polysaccharide. Mutations in the exoD result in altered exopolysaccharide production and defects in nodule invasion.	174
310561	pfam06056	Terminase_5	Putative ATPase subunit of terminase (gpP-like). This family of proteins are annotated as ATPase subunits of phage terminase. Terminases are viral proteins that are involved in packaging viral DNA into the capsid.	58
368724	pfam06057	VirJ	Bacterial virulence protein (VirJ). This family consists of several bacterial VirJ virulence proteins. VirJ is thought to be involved in the type IV secretion system. It is thought that the substrate proteins localized to the periplasm may associate with the pilus in a manner that is mediated by VirJ, and suggest a two-step process for type IV secretion in Agrobacterium.	191
399212	pfam06058	DCP1	Dcp1-like decapping family. An essential step in mRNA turnover is decapping. In yeast, two proteins have been identified that are essential for decapping, Dcp1 (this family) and Dcp2 (pfam05026). The precise role of these proteins in the decapping reaction has not been established. Evidence suggests that Dcp1 may enhance the function of Dcp2.	112
399213	pfam06059	DUF930	Domain of Unknown Function (DUF930). Family of bacterial proteins with undetermined function. All bacteria in this family are from the Rhizobiales order.	99
368727	pfam06060	Mesothelin	Pre-pro-megakaryocyte potentiating factor precursor (Mesothelin). This family consists of several mammalian pre-pro-megakaryocyte potentiating factor precursor (MPF) or mesothelin proteins. Mesothelin is a glycosylphosphatidylinositol-linked glycoprotein highly expressed in mesothelial cells, mesotheliomas, and ovarian cancer, but the biological function of the protein is not known.	624
310566	pfam06061	Baculo_ME53	Baculoviridae ME53. ME53 is one of the major early-transcribed genes. The ME53 protein is reported to contain a putative zinc finger motif.	339
399214	pfam06062	UPF0231	Uncharacterized protein family (UPF0231). Family of uncharacterized Proteobacteria proteins.	121
399215	pfam06064	Gam	Host-nuclease inhibitor protein Gam. The Gam protein inhibits RecBCD nuclease and is found in both bacteria and bacteriophage.	98
368729	pfam06066	SepZ	SepZ. SepZ is a component of the type III secretion system used in bacteria. SepZ is a gene within the enterocyte effacement locus. SepZ mutants exhibit reduced invasion efficiency and lack of tyrosine phosphorylation of Hp90.	99
399216	pfam06067	DUF932	Domain of unknown function (DUF932). Family of prokaryotic proteins with unknown function. Contains a number of highly conserved polar residues that could suggest an enzymatic activity.	227
399217	pfam06068	TIP49	TIP49 C-terminus. This family consists of the C-terminal region of several eukaryotic and archaeal RuvB-like 1 (Pontin or TIP49a) and RuvB-like 2 (Reptin or TIP49b) proteins. The N-terminal domain contains the pfam00004 domain. In zebrafish, the liebeskummer (lik) mutation causes development of hyperplastic embryonic hearts. lik encodes Reptin, a component of a DNA-stimulated ATPase complex. Beta-catenin and Pontin, a DNA-stimulated ATPase that is often part of complexes with Reptin, are in the same genetic pathways. The Reptin/Pontin ratio serves to regulate heart growth during development, at least in part via the beta-catenin pathway. TBP-interacting protein 49 (TIP49) was originally identified as a TBP-binding protein, and two related proteins are encoded by individual genes, tip49a and b. Although the function of this gene family has not been elucidated, they are supposed to play a critical role in nuclear events because they interact with various kinds of nuclear factors and have DNA helicase activities. TIP49a has been suggested to act as an autoantigen in some patients with autoimmune diseases.	347
310570	pfam06069	PerC	PerC transcriptional activator. PerC is a transcriptional activator of EaeA/BfpA expression in enteropathogenic bacteria.	90
283680	pfam06070	Herpes_UL32	Herpesvirus large structural phosphoprotein UL32. The large phosphorylated protein (UL32-like) of herpes viruses is the polypeptide most frequently reactive in immuno-blotting analyses with antisera when compared with other viral proteins.	1037
399218	pfam06071	YchF-GTPase_C	Protein of unknown function (DUF933). This domain is found at the C-terminus of the YchF GTP-binding protein and is possibly related to the ubiquitin-like and MoaD/ThiS superfamilies.	82
283682	pfam06072	Herpes_US9	Alphaherpesvirus tegument protein US9. This family consists of several US9 and related proteins from the Alphaherpesviruses. The function of the US9 protein is unknown although in Bovine herpesvirus 5 Us9 is essential for the anterograde spread of the virus from the olfactory mucosa to the bulb.	61
399219	pfam06073	DUF934	Bacterial protein of unknown function (DUF934). This family consists of several bacterial proteins of unknown function. One of the members of this family BMEI1764 is thought to be an oxidoreductase.	103
399220	pfam06074	DUF935	Protein of unknown function (DUF935). This family consists of several bacterial proteins of unknown function as well as the Bacteriophage Mu gp29 protein.	516
399221	pfam06075	DUF936	Plant protein of unknown function (DUF936). This family consists of several hypothetical proteins from Arabidopsis thaliana and Oryza sativa. The function of this family is unknown.	680
114778	pfam06076	Orthopox_F14	Orthopoxvirus F14 protein. This family consists of several short Orthopoxvirus F14 proteins. The function of this protein is unknown.	73
399222	pfam06078	DUF937	Bacterial protein of unknown function (DUF937). This family consists of several hypothetical bacterial proteins of unknown function.	107
399223	pfam06079	Apyrase	Apyrase. This family consists of several eukaryotic apyrase proteins (EC:3.6.1.5). The salivary apyrases of blood-feeding arthropods are nucleotide hydrolysing enzymes implicated in the inhibition of host platelet aggregation through the hydrolysis of extracellular adenosine diphosphate.	289
253548	pfam06080	DUF938	Protein of unknown function (DUF938). This family consists of several hypothetical proteins from both prokaryotes and eukaryotes. The function of this family is unknown.	201
399224	pfam06081	ArAE_1	Aromatic acid exporter family member 1. This family consists of bacterial proteins with three transmembrane regions that are purported to be aromatic acid exporters.	141
399225	pfam06082	YjbH	Exopolysaccharide biosynthesis protein YjbH. YjbH is a family of Gram-negative beta-barrel outer-membrane lipoproteins that act as putative porins. YjbH is one of four gene-products expressed from an operon, yjbEFGH, which is regulated by the Rcs phosphorelay in a RcsA-dependent manner, similar to that of other exopolysaccharide biosynthetic pathways. It is highly possible that the yjbEFGH operon encodes a system involved in EPS secretion since none of the products is predicted to have enzymic activity, the products are all secreted and YjbH and F are predicted to be beta-barrel lipoproteins similar to porins. It may be that the operon products play some role in biofilm formation and/or matrix production.	662
399226	pfam06083	IL17	Interleukin-17. IL-17 is a potent proinflammatory cytokine produced by activated memory T cells. The IL-17 family is thought to represent a distinct signaling system that appears to have been highly conserved across vertebrate evolution.	80
399227	pfam06084	Cytomega_TRL10	Cytomegalovirus TRL10 protein. This family consists of several Cytomegalovirus TRL10 proteins. TRL10 represents a structural component of the virus particle and like the other HCMV envelope glycoproteins, is present in a disulfide-linked complex.	149
368737	pfam06085	Rz1	Lipoprotein Rz1 precursor. This family consists of several bacteria and phage lipoprotein Rz1 precursors. Rz1 is a proline-rich lipoprotein from bacteriophage lambda which is known to have fusogenic properties. Rz1-induced liposome fusion is thought to be mediated primarily by the generation of local perturbation in the bilayer lipid membrane and to a lesser extent by electrostatic forces. This family Rz1 and the Rz protein Rz (pfam03245) represent a unique example of two genes located in different reading frames in the same nucleotide sequence, which encode different proteins that are both required in the same physiological pathway.	41
399228	pfam06086	Pox_A30L_A26L	Orthopoxvirus A26L/A30L protein. This family consists of several Orthopoxvirus A26L and A30L proteins. The Vaccinia A30L gene is regulated by a late promoter and encodes a protein of approximately 9 kDa. It is thought that the A30L protein is needed for vaccinia virus morphogenesis, specifically the association of the dense viroplasm with viral membranes.	219
399229	pfam06087	Tyr-DNA_phospho	Tyrosyl-DNA phosphodiesterase. Covalent intermediates between topoisomerase I and DNA can become dead-end complexes that lead to cell death. Tyrosyl-DNA phosphodiesterase can hydrolyze the bond between topoisomerase I and DNA.	433
368739	pfam06088	TLP-20	Nucleopolyhedrovirus telokin-like protein-20 (TLP20). This family consists of several Nucleopolyhedrovirus telokin-like protein-20 (TLP20) sequences. The function of this family is unknown but TLP20 is known to share some antigenic similarities with the smooth muscle protein telokin, although the amino acid sequence shows no homology to telokin.	164
399230	pfam06089	Asparaginase_II	L-asparaginase II. This family consists of several bacterial L-asparaginase II proteins. L-asparaginase (EC:3.5.1.1) catalyzes the hydrolysis of L-asparagine to L-aspartate and ammonium. Rhizobium etli possesses two asparaginases: asparaginase I, which is thermostable and constitutive, and asparaginase II, which is thermolabile, induced by asparagine and repressed by the carbon source.	320
399231	pfam06090	Ins_P5_2-kin	Inositol-pentakisphosphate 2-kinase. This is a family of inositol-pentakisphosphate 2-kinases (EC 2.7.1.158) (also known as inositol 1,3,4,5,6-pentakisphosphate 2-kinase, Ins(1,3,4,5,6)P5 2-kinase and InsP5 2-kinase). This enzyme phosphorylates Ins(1,3,4,5,6)P5 to form Ins(1,2,3,4,5,6)P6 (also known as InsP6 or phytate). InsP6 is involved in many processes such as mRNA export, nonhomologous end-joining, endocytosis and ion channel regulation.	376
399232	pfam06092	DUF943	Enterobacterial putative membrane protein (DUF943). This family consists of several hypothetical putative membrane proteins from Escherichia coli, Yersinia pestis and Salmonella typhi.	151
399233	pfam06093	Spt4	Spt4/RpoE2 zinc finger. This family consists of several eukaryotic transcription elongation Spt4 proteins as well as archaebacterial RpoE2. Three transcription-elongation factors Spt4, Spt5, and Spt6 are conserved among eukaryotes and are essential for transcription via the modulation of chromatin structure. Spt4 and Spt5 are tightly associated in a complex, while the physical association of the Spt4-Spt5 complex with Spt6 is considerably weaker. It has been demonstrated that Spt4, Spt5, and Spt6 play roles in transcription elongation in both yeast and humans including a role in activation by Tat. It is known that Spt4, Spt5, and Spt6 are general transcription-elongation factors, controlling transcription both positively and negatively in important regulatory and developmental roles. RpoE2 is one of 13 subunits in the archaeal RNA polymerase. These proteins contain a C4-type zinc finger, and the structure has been solved in. The structure reveals that Spt4-Spt5 binding is governed by an acid-dipole interaction between Spt5 and Spt4, and the complex binds to and travels along the elongating RNA polymerase. The Spt4-Spt5 complex is likely to be an ancient, core component of the transcription elongation machinery.	77
399234	pfam06094	GGACT	Gamma-glutamyl cyclotransferase, AIG2-like. GGACT, gamma-glutamylamine cyclotransferase, is a ubiquitous enzyme found in bacteria, plants, and metazoans from Dictyostelium through to humans. It converts gamma-glutamylamines to free amines and 5-oxoproline.	114
368743	pfam06096	Baculo_8kDa	Baculoviridae 8.2 KDa protein. Family of proteins from various Baculoviruses with undetermined function.	65
399235	pfam06097	DUF945	Bacterial protein of unknown function (DUF945). This family consists of several hypothetical bacterial proteins of unknown function.	458
399236	pfam06098	Radial_spoke_3	Radial spoke protein 3. This family consists of several radial spoke protein 3 (RSP3) sequences. Eukaryotic cilia and flagella present in diverse types of cells perform motile, sensory, and developmental functions in organisms from protists to humans. They are centred by precisely organized, microtubule-based structures, the axonemes. The axoneme consists of two central singlet microtubules, called the central pair, and nine outer doublet microtubules. These structures are well-conserved during evolution. The outer doublet microtubules, each composed of A and B sub-fibers, are connected to each other by nexin links, while the central pair is held at the centre of the axoneme by radial spokes. The radial spokes are T-shaped structures extending from the A-tubule of each outer doublet microtubule to the centre of the axoneme. Radial spoke protein 3 (RSP3), is present at the proximal end of the spoke stalk and helps in anchoring the radial spoke to the outer doublet. It is thought that radial spokes regulate the activity of inner arm dynein through protein phosphorylation and dephosphorylation.	286
399237	pfam06099	Phenol_hyd_sub	Phenol hydroxylase subunit. This family consists of several bacterial phenol hydroxylase subunit proteins which are part of a multicomponent phenol hydroxylase. Some bacteria can utilize phenol or some of its methylated derivatives as their sole source of carbon and energy. The first step in this process is the conversion of phenol into catechol. Catechol is then further metabolized via the meta-cleavage pathway into TCA cycle intermediates.	56
399238	pfam06100	MCRA	MCRA family. The MCRA (myosin-cross-reactive antigen) family of proteins were thought to have structural features in common with the beta chain of the class II antigens, as well as myosin, and may play an important role in the pathogenesis. More recent work shows that these proteins act as hydratase enzymes that convert linoleic acid and oleic acid to their respective 10-hydroxy derivatives. It has been suggested that MCRA proteins catalyze the first step in conjugated linoleic acid production. Proteins in this family act in an FAD dependent manner. The structure of a fatty acid double-bond hydratase from Lactobacillus acidophilus has been recently solved showing four structural domains.	492
399239	pfam06101	Vps62	Vacuolar protein sorting-associated protein 62. Vps62 is a vacuolar protein sorting (VPS) protein required for cytoplasm to vacuole targeting of proteins.	539
399240	pfam06102	RRP36	rRNA biogenesis protein RRP36. RRP36 is involved in the early processing steps of the pre-rRNA.	158
399241	pfam06103	DUF948	Bacterial protein of unknown function (DUF948). This family consists of bacterial sequences several of which are thought to be general stress proteins.	83
399242	pfam06105	Aph-1	Aph-1 protein. This family consists of several eukaryotic Aph-1 proteins. Gamma-secretase catalyzes the intramembrane proteolysis of Notch, beta-amyloid precursor protein, and other substrates as part of a new signaling paradigm and as a key step in the pathogenesis of Alzheimer's disease. It is thought that the presenilin heterodimer comprises the catalytic site and that a highly glycosylated form of nicastrin associates with it. Aph-1 and Pen-2, two membrane proteins genetically linked to gamma-secretase, associate directly with presenilin and nicastrin in the active protease complex. Co-expression of all four proteins leads to marked increases in presenilin heterodimers, full glycosylation of nicastrin, and enhanced gamma-secretase activity.	224
399243	pfam06106	SAUGI	S. aureus uracil DNA glycosylase inhibitor. Uracil-DNA glycosylase inhibitors, are DNA mimic proteins that prevent the DNA binding sites of UDGs (Uracil DNA glycosylase) from interacting with their DNA substrate. SSP0047 (SAUGI; for Staphylococcus aureus uracil-DNA glycosylase inhibitor) acts as a uracil-DNA glycosylase inhibitor that breaks the uracil-removing activity of S. aureus uracil-DNA glycosylase (SAUDG) pfam03167. The SAUGI/SAUDG complex has been determined, and shows that SAUGI binds to the SAUDG DNA binding region via several strong interactions, by using a hydrophobic pocket to hold SAUDG's protruding residue (i.e. SAUDG Leu184, E. coli UDG Leu191 and B. subtilis UDG Phe191). By binding to SAUDG in this way, SAUGI thus prevents SAUDG from binding to its DNA substrate and performing DNA repair activity.	112
399244	pfam06107	DUF951	Bacterial protein of unknown function (DUF951). This family consists of several short hypothetical bacterial proteins of unknown function. Structural modelling suggests this domain may bind nucleic acids.	55
399245	pfam06108	DUF952	Protein of unknown function (DUF952). This family consists of several hypothetical bacterial and plant proteins of unknown function.	84
399246	pfam06109	HlyE	Haemolysin E (HlyE). This family consists of several enterobacterial haemolysin (HlyE) proteins. Hemolysin E (HlyE) is a novel pore-forming toxin of Escherichia coli, Salmonella typhi, and Shigella flexneri. HlyE is unrelated to the well characterized pore-forming E. coli hemolysins of the RTX family, haemolysin A (HlyA), and the enterohaemolysin encoded by the plasmid borne ehxA gene of E. coli O157. However, it is evident that expression of HlyE in the absence of the RTX toxins is sufficient to give a hemolytic phenotype in E. coli. HlyE is a protein of 34 kDa that is expressed during anaerobic growth of E. coli. Anaerobic expression is controlled by the transcription factor, FNR, such that, upon ingestion and entry into the anaerobic mammalian intestine, HlyE is produced and may then contribute to the colonisation of the host.	333
399247	pfam06110	DUF953	Eukaryotic protein of unknown function (DUF953). This family consists of several hypothetical eukaryotic proteins of unknown function.	119
399248	pfam06112	Herpes_capsid	Gammaherpesvirus capsid protein. This family consists of several Gammaherpesvirus capsid proteins. The exact function of this family is unknown.	169
399249	pfam06113	BRE	Brain and reproductive organ-expressed protein (BRE). This family consists of several eukaryotic brain and reproductive organ-expressed (BRE) proteins. BRE is a putative stress-modulating gene, found able to down-regulate TNF-alpha-induced-NF-kappaB activation upon over expression. A total of six isoforms are produced by alternative splicing predominantly at either end of the gene. Compared to normal cells, immortalised human cell lines uniformly express higher levels of BRE. Peripheral blood monocytes respond to LPS by down-regulating the expression of all the BRE isoforms. It is thought that the function of BRE and its isoforms is to regulate peroxisomal activities.	320
399250	pfam06114	Peptidase_M78	IrrE N-terminal-like domain. This entry includes the catalytic domain of the protein ImmA, which is a metallopeptidase containing an HEXXH zinc-binding motif from peptidase family M78. ImmA is encoded on a conjugative transposon. Conjugating bacteria are able to transfer conjugative transposons that can, for example, confer resistance to antibiotics. The transposon is integrated into the chromosome, but during conjugation excises itself and then moves to the recipient bacterium and re-integrates into its chromosome. Typically a conjugative transposon encodes only the proteins required for this activity and the proteins that regulate it. During exponential growth, the ICEBs1 transposon of Bacillus subtilis is inactivated by the immunity repressor protein ImmR, which is encoded by the transposon and represses the genes for excision and transfer. Cleavage of ImmR relaxes repression and allows transfer of the transposon. ImmA has been shown to be essential for the cleavage of ImmR. This domain is also found in metalloprotease IrrE, a central regulator of DNA damage repair in Deinococcaceae, and in the HTH-type transcriptional regulators RamB and PrpC.	122
399251	pfam06115	DUF956	Domain of unknown function (DUF956). Family of bacterial sequences with undetermined function.	117
283718	pfam06116	RinB	Transcriptional activator RinB. This family consists of several Staphylococcus aureus bacteriophage RinB proteins and related sequences from their host. The int gene of staphylococcal bacteriophage phi 11 is the only viral gene responsible for the integrative recombination of phi 11. rinA and rinB, are both required to activate expression of the int gene.	51
399253	pfam06119	NIDO	Nidogen-like. This is a nidogen-like (NIDO) domain, an extracellular domain found in nidogen and hypothetical proteins of unknown function.	90
399254	pfam06120	Phage_HK97_TLTM	Tail length tape measure protein. This family consists of the tail length tape measure protein from bacteriophage HK97 and related sequences from Escherichia coli O157:H7.	288
399255	pfam06121	DUF959	Domain of Unknown Function (DUF959). This N-terminal domain is not expressed in the 'Short' isoform of Collagen A.	192
399256	pfam06122	TraH	Conjugative relaxosome accessory transposon protein. The TraH protein is thought to be a relaxosome accessory component, also necessary for transfer but not for H-pilus synthesis within the conjugative transposon.	356
399257	pfam06123	CreD	Inner membrane protein CreD. This family consists of several bacterial CreD or Cet inner membrane proteins. Dominant mutations of the cet gene of Escherichia coli result in tolerance to colicin E2 and increased amounts of an inner membrane protein with an Mr of 42,000. The cet gene is shown to be in the same operon as the phoM gene, which is required in a phoR background for expression of the structural gene for alkaline phosphatase, phoA. Although the Cet protein is not required for phoA expression, it has been suggested that the Cet protein has an enhancing effect on the transcription of phoA.	428
399258	pfam06124	DUF960	Staphylococcal protein of unknown function (DUF960). This family consists of several hypothetical proteins from several species of Staphylococcus. The function of this family is unknown.	94
399259	pfam06125	DUF961	Bacterial protein of unknown function (DUF961). This family consists of several hypothetical bacterial proteins of unknown function.	96
147993	pfam06126	Herpes_LAMP2	Herpesvirus Latent membrane protein 2. Family of Kaposi's sarcoma-associated herpesvirus (HHV8) latent membrane protein.	510
377609	pfam06127	DUF962	Protein of unknown function (DUF962). This family consists of several eukaryotic and prokaryotic proteins of unknown function. The yeast protein YGL010W has been found to be non-essential for cell growth.	95
283728	pfam06128	Shigella_OspC	Shigella flexneri OspC protein. This family consists of the Shigella flexneri specific protein OspC. The function of this family is unknown but it is thought that Osp proteins may be involved in post invasion events related to virulence. Since bacterial pathogens adapt to multiple environments during the course of infecting a host, it has been proposed that Shigella evolved a mechanism to take advantage of a unique intracellular cue, which is mediated through MxiE, to express proteins when the organism reaches the eukaryotic cytosol.	292
283729	pfam06129	Chordopox_G3	Chordopoxvirus G3 protein. This family consists of several Chordopoxvirus specific G3 proteins. The function of this family is unknown.	108
399260	pfam06130	PTAC	Phosphate propanoyltransferase. This family includes phosphotransacylases (PTACs) required for the degradation of 1,2-propanediol (1,2-PD).	67
283731	pfam06131	DUF963	Schizosaccharomyces pombe repeat of unknown function (DUF963). This family consists of a series of repeated sequences from one hypothetical protein found in Schizosaccharomyces pombe. The function of this family is unknown.	36
399261	pfam06133	Com_YlbF	Control of competence regulator ComK, YlbF/YmcA. YlbF is a family of short Gram-positive and archaeal proteins that includes both YlbF and YmcA which may interact synergistically. The family is necessary for correct biofilm formation, as null mutants of ymcA and ylbF fail to form pellicles at air-liquid interfaces and grow on solid media as smooth, undifferentiated colonies. During development, YmcA, YlbF and YaaT, family PSPI, pfam04468, interact directly with one another forming a stable ternary complex, in vitro. All three proteins are required for competence, sporulation and the formation of biofilms. The YmcA-YlbF-YaaT complex affects the phosphotransfer between Spo0F and Spo0B, thus accelerating the production of Spo0A~P. The three processes of biofilm formation, mature spore formation and competence all require the active, phosphorylated form of Spo0A, as Spo0A-P.	104
399262	pfam06134	RhaA	L-rhamnose isomerase (RhaA). This family consists of several bacterial L-rhamnose isomerase proteins (EC:5.3.1.14).	417
399263	pfam06135	DUF965	Bacterial protein of unknown function (DUF965). This family consists of several hypothetical bacterial proteins. The function of the family is unknown.	77
399264	pfam06136	DUF966	Domain of unknown function (DUF966). Family of plant proteins with unknown function.	366
283736	pfam06138	Chordopox_E11	Chordopoxvirus E11 protein. This family consists of several Chordopoxvirus E11 proteins. The E11 gene of vaccinia virus encodes a 15-kDa polypeptide. Mutations in the E11 gene makes the virus temperature-sensitive due to either the fact that virus infectivity requires a threshold level of active E11 protein or that E11 function is conditionally essential.	126
399265	pfam06139	BphX	BphX-like. Family of bacterial proteins located in the phenyl dioxygenase (bph) operon. The function of this family is unknown.	133
399266	pfam06140	Ifi-6-16	Interferon-induced 6-16 family. 	77
399267	pfam06141	Phage_tail_U	Phage minor tail protein U. Tail fibre component U of bacteriophage.	129
114838	pfam06143	Baculo_11_kDa	Baculovirus 11 kDa family. Family of uncharacterized Baculovirus proteins that are all about 11 kDa in size.	84
399268	pfam06144	DNA_pol3_delta	DNA polymerase III, delta subunit. DNA polymerase III, delta subunit (EC 2.7.7.7) is required for, along with delta' subunit, the assembly of the processivity factor beta(2) onto primed DNA in the DNA polymerase III holoenzyme-catalyzed reaction. The delta subunit is also known as HolA.	174
148007	pfam06145	Corona_NS1	Coronavirus nonstructural protein NS1. Bovine coronavirus NS1 encodes a 4.9 kDa protein.	29
399269	pfam06146	PsiE	Phosphate-starvation-inducible E. Phosphate-starvation-inducible E (PsiE) expression is under direct positive and negative control by PhoB and cAMP-CRP, respectively. The function of PsiE remains to be determined.	68
399270	pfam06147	DUF968	Protein of unknown function (DUF968). Family of uncharacterized prophage proteins found in Gammaproteobacteria. These may be HNH-nucleases, as there are several conserved cysteines and histidines.	206
399271	pfam06148	COG2	COG (conserved oligomeric Golgi) complex component, COG2. The COG complex comprises eight proteins COG1-8. The COG complex plays critical roles in Golgi structure and function. The proposed function of the complex is to mediate the initial physical contact between transport vesicles and their membrane targets. A comparable role in tethering vesicles has been suggested for at least six additional large multisubunit complexes, including the exocyst, a complex that mediates trafficking to the plasma membrane. COG2 structure reveals a six-helix bundle with few conserved surface features but a general resemblance to recently determined crystal structures of four different exocyst subunits. These bundles in COG2 may act as platforms for interaction with other trafficking proteins including SNAREs (soluble N-ethylmaleimide factor attachment protein receptors) and Rabs.	133
399272	pfam06149	DUF969	Protein of unknown function (DUF969). Family of uncharacterized bacterial membrane proteins.	216
399273	pfam06150	ChaB	ChaB. This family of proteins contains a conserved 60 residue region. This protein is known as ChaB in E. coli and is found next to ChaA which is a cation transporter protein. ChaB may regulate ChaA function in some way.	60
336318	pfam06151	Trehalose_recp	Trehalose receptor. In Drosophila, taste is perceived by gustatory neurons located in sensilla distributed on several different appendages throughout the body of the animal. This family represents the taste receptor sensitive to trehalose.	411
399274	pfam06152	Phage_min_cap2	Phage minor capsid protein 2. Family of related phage minor capsid proteins.	367
283747	pfam06153	CdAMP_rec	Cyclic-di-AMP receptor. CdAMP is a family of bacterial cyclic-di-AMP receptor proteins. Cyclic-di-AMP (c-di-AMP) is a bacterial secondary messenger involved in various processes, including sensing of DNA-integrity, cell wall metabolism and potassium transport. CdAMP_rec has a ferredoxin-like fold and is structurally related to Pii-signal transduction proteins.	109
399275	pfam06154	CbeA_antitoxin	CbeA_antitoxin, type IV, cytoskeleton bundling-enhancing factor A. CbeA_antitoxin is a family of cognate antitoxins to the CbtA toxins that act by inhibiting the polymerization of cytoskeletal proteins, see pfam06755. These are classified as a type IV toxin-antitoxin system. The family includes three proteins from E. coli YagB, YeeU and YfjZ, which act not by forming a complex with CbtA but through acting as antagonists to the CbtA toxicity, by stabilizing the CbtA target proteins. For example, YeeU binds directly to both MreB and FtsZ and enhances the bundling of their filaments in vitro. YeeU is also able to neutralize the toxicity caused by other MreB and FtsZ inhibitors, such as A22 [S-(3, 4-dichlorobenzyl)isothiourea] for MreB, and SulA and DicB for FtsZ. Thus CbeA, for cytoskeleton bundling-enhancing factor A, is proposed as a general name for all of these antitoxin proteins.	101
399276	pfam06155	DUF971	Protein of unknown function (DUF971). This family consists of several short bacterial proteins and one sequence from Oryza sativa. The function of this family is unknown.	83
399277	pfam06156	YabB	Initiation control protein YabA. YabA is involved in initiation control of chromosome replication. It interacts with both DnaA and DnaN, acting as a bridge between these two proteins.	103
368769	pfam06157	DUF973	Protein of unknown function (DUF973). This family consists of several hypothetical archaeal proteins of unknown function.	309
399278	pfam06159	DUF974	Protein of unknown function (DUF974). Family of uncharacterized eukaryotic proteins.	243
399279	pfam06160	EzrA	Septation ring formation regulator, EzrA. During the bacterial cell cycle, the tubulin-like cell-division protein FtsZ polymerizes into a ring structure that establishes the location of the nascent division site. EzrA modulates the frequency and position of FtsZ ring formation.	542
283754	pfam06161	DUF975	Protein of unknown function (DUF975). Family of uncharacterized bacterial proteins.	244
310625	pfam06162	PgaPase_1	Putative pyroglutamyl peptidase PgaPase_1. PgaPase_1 is a family of functionally diverse Caenorhabditis proteins. The family is homologous to the cysteine-peptidases, but lack of a strictly conserved Glu-Cys-His catalytic triad or pGlu binding site implies that it has other functions that could have resulted in a change in reaction-specificity or even of catalytic activity.	166
310626	pfam06163	DUF977	Bacterial protein of unknown function (DUF977). This family consists of several hypothetical bacterial proteins from Escherichia coli and Salmonella typhi. The function of this family is unknown.	134
399280	pfam06165	Glyco_transf_36	Glycosyltransferase family 36. The glycosyltransferase family 36 includes cellobiose phosphorylase (EC:2.4.1.20), cellodextrin phosphorylase (EC:2.4.1.49), chitobiose phosphorylase (EC:2.4.1.-). Many members of this family contain two copies of this domain.	247
399281	pfam06166	DUF979	Protein of unknown function (DUF979). This family consists of several putative bacterial membrane proteins. The function of this family is unclear.	311
399282	pfam06167	Peptidase_M90	Glucose-regulated metallo-peptidase M90. MtfA (earlier known as YeeI) is a transcription factor A that binds Mlc (make large colonies), itself a repressor of glucose and hence a protein important in regulation of the phosphoenolpyruvate:glucose-phosphotransferase (ptsG) system, the major glucose transporter in E. coli. Mlc is a repressor of ptsG, and MtfA is found to bind and inactivate Mlc with high affinity. The membrane-bound protein EIICBGlc encoded by the ptsG gene is the major glucose transporter in Escherichia coli. MtfA is found to be a glucose-regulated peptidase, whose activity is regulated by binding to Mlc available in the cytoplasm, which in turn has been released from EIICBGlc during times when no glucose is taken up. A physiologically relevant target for this peptidase is not yet known.	243
399283	pfam06168	DUF981	Protein of unknown function (DUF981). Family of uncharacterized proteins found in bacteria and archaea.	180
399284	pfam06169	DUF982	Protein of unknown function (DUF982). This family consists of several hypothetical proteins from Rhizobium meliloti, Rhizobium loti and Agrobacterium tumefaciens. The function of this family is unknown. Structural modelling suggests this domain may bind nucleic acids.	71
399285	pfam06170	DUF983	Protein of unknown function (DUF983). This family consists of several bacterial proteins of unknown function.	85
399286	pfam06172	Cupin_5	Cupin superfamily (DUF985). Family of uncharacterized proteins found in bacteria and eukaryotes that belongs to the Cupin superfamily.	138
399287	pfam06173	DUF986	Protein of unknown function (DUF986). This family consists of several bacterial putative membrane proteins of unknown function.	148
368775	pfam06174	DUF987	Protein of unknown function (DUF987). Family of bacterial proteins that are related to the hypothetical protein yeeT.	65
114868	pfam06175	MiaE	tRNA-(MS[2]IO[6]A)-hydroxylase (MiaE). This family consists of several bacterial tRNA-(MS[2]IO[6]A)-hydroxylase (MiaE) proteins. The modified nucleoside 2-methylthio-N-6-isopentenyl adenosine (ms2i6A) is present at position 37 (3' of the anticodon) of tRNAs that read codons beginning with U except tRNA(I,V Ser) in Escherichia coli. Salmonella typhimurium 2-methylthio-cis-ribozeatin (ms2io6A) is found in tRNA, probably in the corresponding species that have ms2i6A in E. coli. The miaE gene is absent in E. coli, a finding consistent with the absence of the hydroxylated derivative of ms2i6A in this species.	199
283765	pfam06176	WaaY	Lipopolysaccharide core biosynthesis protein (WaaY). This family consists of several bacterial lipopolysaccharide core biosynthesis proteins (WaaY or RfaY). The waaY, waaQ, and waaP genes are located in the central operon of the waa (formerly rfa) locus on the chromosome of Escherichia coli. This locus contains genes whose products are involved in the assembly of the core region of the lipopolysaccharide molecule. WaaY is the enzyme that phosphorylates HepII in this system.	229
399288	pfam06177	QueT	QueT transporter. This family includes the queT gene encoding a hypothetical integral membrane protein with 5 predicted transmembrane regions. The queT genes in Firmicutes are often preceded by the PreQ1 (7-aminomethyl-7-deazaguanine) riboswitches of two distinct classes, suggesting involvement of the QueT transporters in uptake of a queuosine biosynthetic intermediate.	140
399289	pfam06178	KdgM	Oligogalacturonate-specific porin protein (KdgM). This family consists of several bacterial proteins which are homologous to the oligogalacturonate-specific porin protein KdgM from Erwinia chrysanthemi. The phytopathogenic Gram-negative bacteria Erwinia chrysanthemi secretes pectinases, which are able to degrade the pectic polymers of plant cell walls, and uses the degradation products as a carbon source for growth. KdgM is a major outer membrane protein, whose synthesis is strongly induced in the presence of pectic derivatives. KdgM behaves like a voltage-dependent porin that is slightly selective for anions and that exhibits fast block in the presence of trigalacturonate. In contrast to most porins, KdgM seems to be monomeric.	215
399290	pfam06179	Med22	Surfeit locus protein 5 subunit 22 of Mediator complex. This family consists of several eukaryotic Surfeit locus protein 5 (SURF5) sequences. The human Surfeit locus has been mapped on chromosome 9q34.1. The locus includes six tightly clustered housekeeping genes (Surf1-6), and the gene organisation is similar in human, mouse and chicken Surfeit locus. The Med22 subunit of Mediator complex is part of the essential core head region.	105
399291	pfam06180	CbiK	Cobalt chelatase (CbiK). This family consists of several bacterial cobalt chelatase (CbiK) proteins (EC:4.99.1.-).	261
399292	pfam06181	Urate_ox_N	Urate oxidase N-terminal. Cytochrome c urate oxidase (Uox) PuuD is involved in purine degradation. In contrast with soluble Uox it is a membrane protein with an 8-helix transmembrane N-terminal domain and a C-terminal cytochrome c.	295
253605	pfam06182	ABC2_membrane_6	ABC-2 family transporter protein. This family acts as the transmembrane domain (TMD) of ABC transporters. The family includes proteins responsible for the transport of herbicides.	229
399293	pfam06183	DinI	DinI-like family. This family of short proteins includes DNA-damage-inducible protein I (DinI) and related proteins. The SOS response, a set of cellular phenomena exhibited by eubacteria, is initiated by various causes that include DNA damage-induced replication arrest, and is positively regulated by the co-protease activity of RecA. Escherichia coli DinI, a LexA-regulated SOS gene product, shuts off the initiation of the SOS response when overexpressed in vivo. Biochemical and genetic studies indicated that DinI physically interacts with RecA to inhibit its co-protease activity. The structure of DinI is known.	63
399294	pfam06184	Potex_coat	Potexvirus coat protein. This family consists of several Potexvirus coat proteins.	153
399295	pfam06185	YecM	YecM protein. This family consists of several bacterial YecM proteins of unknown function.	179
399296	pfam06186	DUF992	Protein of unknown function (DUF992). This family consists of several hypothetical bacterial proteins of unknown function.	143
399297	pfam06187	DUF993	Protein of unknown function (DUF993). This family consists of several hypothetical bacterial proteins of unknown function.	381
368784	pfam06188	HrpE	HrpE/YscL/FliH and V-type ATPase subunit E. This is a prokaryotic family that contains proteins of the FliH and HrpE/YscL family. These proteins are involved in type III secretion, which is the process that drives flagellar biosynthesis and mediates bacterial-eukaryotic interactions. This family also contains V-type ATPase subunit E. This subunit appears to form a tight interaction with subunit G in the F0 complex. Subunits E and G may act together as stators to prevent certain subunits from rotating with the central rotary element. pfam01991 also contains V-type ATPase subunit E proteins.	187
399298	pfam06189	5-nucleotidase	5'-nucleotidase. This family consists of both eukaryotic and prokaryotic 5'-nucleotidase sequences (EC:3.1.3.5).	265
399299	pfam06191	DUF995	Protein of unknown function (DUF995). Family of uncharacterized Proteobacteria proteins.	140
283778	pfam06193	Orthopox_A5L	Orthopoxvirus A5L protein-like. This family includes several Orthopoxvirus A5L proteins. The vaccinia virus WR A5L open reading frame (corresponding to open reading frame A4L in vaccinia virus Copenhagen) encodes an immunodominant late protein found in the core of the vaccinia virion. The A5 protein appears to be required for the immature virion to form the brick-shaped intracellular mature virion.	216
283779	pfam06194	Phage_Orf51	Phage Conserved Open Reading Frame 51. Family of conserved bacteriophage open reading frames.	80
399300	pfam06195	DUF996	Protein of unknown function (DUF996). Family of uncharacterized bacterial and archaeal proteins.	135
399301	pfam06196	DUF997	Protein of unknown function (DUF997). Family of predicted bacterial membrane protein with unknown function.	77
399302	pfam06197	DUF998	Protein of unknown function (DUF998). Family of conserved archaeal proteins.	185
114890	pfam06198	DUF999	Protein of unknown function (DUF999). Family of conserved Schizosaccharomyces pombe proteins with unknown function.	143
399303	pfam06199	Phage_tail_2	Phage tail tube protein. Characterized members are major tail tube proteins from various phages, including lactococcal temperate bacteriophage TP901-1.	134
399304	pfam06200	tify	tify domain. This short possible domain is found in a variety of plant transcription factors that contain GATA domains as well as other motifs. Although previously known as the Zim domain this is now called the tify domain after its most conserved amino acids. TIFY proteins can be further classified into two groups depending on the presence (group I) or absence (group II) of a C2C2-GATA domain. Functional annotation of these proteins is still poor, but several screens revealed a link between TIFY proteins of group II and jasmonic acid-related stress response.	34
399305	pfam06201	PITH	PITH domain. This family was formerly known as DUF1000. The full-length, Txnl1, protein which is a probable component of the 26S proteasome, uses its C-terminal, PITH, domain to associate specifically with the 26S proteasome. PITH derives from proteasome-interacting thioredoxin domain.	145
283786	pfam06202	GDE_C	Amylo-alpha-1,6-glucosidase. This family includes human glycogen debranching enzyme AGL. This enzyme contains a number of distinct catalytic activities. It has been shown for the yeast homolog GDB1 that mutations in this region disrupt the enzyme's amylo-alpha-1,6-glucosidase activity (EC:3.2.1.33).	374
399306	pfam06203	CCT	CCT motif. This short motif is found in a number of plant proteins. It is rich in basic amino acids and has been called a CCT motif after Co, Col and Toc1. The CCT motif is about 45 amino acids long and contains a putative nuclear localization signal within the second half of the CCT motif. Toc1 mutants have been identified in this region.	44
399307	pfam06206	CpeT	CpeT/CpcT family (DUF1001). This family consists of proteins belonging to the CpeT/CpcT family. These proteins are around 200 amino acids in length. The proteins contain a conserved motif PYR in the amino terminal half of the protein that may be functionally important. The species distribution of the family is interesting. So far it is restricted to cyanobacteria, cryptomonads and plants. It has been shown that CpcT encodes a bilin lyase responsible for attachment of phycocyanobilin to the beta subunit of phycocyanin.	179
399308	pfam06207	DUF1002	Protein of unknown function (DUF1002). This protein family has no known function. Its members are about 300 amino acids in length. It has so far been detected in Firmicute bacteria and some archaebacteria.	220
283790	pfam06208	BDV_G	Borna disease virus G protein. This family consists of Borna disease virus G glycoprotein sequences. Borna disease virus (BDV) infection produces a variety of clinical diseases, from behavioural illnesses to classical fatal encephalitis. G protein is important for viral entry into the host cell.	503
399309	pfam06209	COBRA1	Cofactor of BRCA1 (COBRA1). This family consists of several cofactor of BRCA1 (COBRA1) like proteins. It is thought that COBRA1 along with BRCA1 is involved in chromatin unfolding. COBRA1 is recruited to the chromosome site by the first BRCT repeat of BRCA1, and is itself sufficient to induce chromatin unfolding. BRCA1 mutations that enhance chromatin unfolding also increase its affinity for, and recruitment of, COBRA1. It is thought that reorganisation of higher levels of chromatin structure is an important regulated step in BRCA1-mediated nuclear functions.	472
399310	pfam06210	DUF1003	Protein of unknown function (DUF1003). This family consists of several hypothetical bacterial proteins of unknown function.	101
283793	pfam06211	BAMBI	BMP and activin membrane-bound inhibitor (BAMBI) N-terminal domain. This family consists of several eukaryotic BMP and activin membrane-bound inhibitor (BAMBI) proteins. Members of the transforming growth factor-beta (TGF-beta) superfamily, including TGF-beta, bone morphogenetic proteins (BMPs), activins and nodals, are vital for regulating growth and differentiation. BAMBI is related to TGF-beta-family type I receptors but lacks an intracellular kinase domain. BAMBI is co-expressed with the ventralising morphogen BMP4 during Xenopus embryogenesis and requires BMP signalling for its expression. The protein stably associates with TGF-beta-family receptors and inhibits BMP and activin as well as TGF-beta signalling.	107
399311	pfam06212	GRIM-19	GRIM-19 protein. This family consists of several eukaryotic gene associated with retinoic-interferon-induced mortality 19 (GRIM-19) proteins. GRIM-19 was reported to encode a small protein primarily distributed in the nucleus and was able to promote cell death induced by IFN-beta and RA. A bovine homolog of GRIM-19 was co-purified with mitochondrial NADH:ubiquinone oxidoreductase (complex I) in bovine heart. Therefore, its exact cellular localization and function are unclear. It has now been discovered that GRIM-19 is a specific interacting protein which negatively regulates Stat3 activity.	132
399312	pfam06213	CobT	Cobalamin biosynthesis protein CobT. This family consists of several bacterial cobalamin biosynthesis (CobT) proteins. CobT is involved in the transformation of precorrin-3 into cobyrinic acid.	274
368793	pfam06214	SLAM	Signaling lymphocytic activation molecule (SLAM) protein. This family consists of several mammalian signaling lymphocytic activation molecule (SLAM) proteins. Optimal T cell activation and expansion require engagement of the TCR plus co-stimulatory signals delivered through accessory molecules. SLAM, a 70-kDa co-stimulatory molecule belonging to the Ig superfamily, is defined as a human cell surface molecule that mediates CD28-independent proliferation of human T cells and IFN-gamma production by human Th1 and Th2 clones. SLAM has also been recognized as a receptor for measles virus.	125
368794	pfam06215	ISAV_HA	Infectious salmon anaemia virus haemagglutinin. This family consists of several infectious salmon anaemia virus haemagglutinin proteins. Infectious salmon anaemia virus (ISAV), an orthomyxovirus-like virus, is an important fish pathogen in marine aquaculture.	380
283798	pfam06216	RTBV_P46	Rice tungro bacilliform virus P46 protein. This family consists of several Rice tungro bacilliform virus P46 proteins. The function of this family is unknown.	389
399313	pfam06217	GAGA_bind	GAGA binding protein-like family. This family includes gbp, a protein from soybean that binds to GAGA element dinucleotide repeat DNA. It seems likely that this domain mediates DNA binding. This putative domain contains several conserved cysteines and a histidine, suggesting this may be a zinc-binding DNA interaction domain.	290
368796	pfam06218	NPR2	Nitrogen permease regulator 2. This family of regulators are involved in post-translational control of nitrogen permease.	439
399314	pfam06219	DUF1005	Protein of unknown function (DUF1005). Family of plant proteins with undetermined function.	430
368798	pfam06220	zf-U1	U1 zinc finger. This family consists of several U1 small nuclear ribonucleoprotein C (U1-C) proteins. The U1 small nuclear ribonucleoprotein (U1 snRNP) binds to the pre-mRNA 5' splice site (ss) at early stages of spliceosome assembly. Recruitment of U1 to a class of weak 5' ss is promoted by binding of the protein TIA-1 to uridine-rich sequences immediately downstream from the 5' ss. Binding of TIA-1 in the vicinity of a 5' ss helps to stabilize U1 snRNP recruitment, at least in part, via a direct interaction with U1-C, thus providing one molecular mechanism for the function of this splicing regulator. This domain is probably zinc-binding. It is found in multiple copies in some members of the family.	38
399315	pfam06221	zf-C2HC5	Putative zinc finger motif, C2HC5-type. This zinc finger appears to be common in activating signal cointegrator 1/thyroid receptor interacting protein 4.	54
368800	pfam06222	Phage_TAC_1	Phage tail assembly chaperone. 	126
283804	pfam06223	Phage_tail_T	Minor tail protein T. Minor tail protein T is located at the distal end and is involved in the assembly of the initiator complex for tail polymerization.	100
399316	pfam06224	HTH_42	Winged helix DNA-binding domain. This family contains two copies of a winged helix domain.	324
399317	pfam06226	DUF1007	Protein of unknown function (DUF1007). Family of conserved bacterial proteins with unknown function.	210
368802	pfam06227	Poxvirus	dsDNA Poxvirus. This is a family of proteins from poxviruses, which are dsDNA viruses with no RNA stage.	145
399318	pfam06228	ChuX_HutX	Haem utilisation ChuX/HutX. This family is found within haem utilisation operons. It has a similar structure to that of pfam05171. pfam05171 usually occurs as a duplicated domain, but this domain occurs as a single domain and forms a dimer. The organisation of the dimer is very similar to that of the duplicated pfam05171 domains. It binds haem via conserved histidines.	128
368803	pfam06229	FRG1	FRG1-like domain. The human FRG1 gene maps to human chromosome 4q35 and has been identified as a candidate for facioscapulohumeral muscular dystrophy. Currently, the function of FRG1 is unknown.	189
399319	pfam06230	DUF1009	Protein of unknown function (DUF1009). Family of uncharacterized bacterial proteins.	131
283811	pfam06231	DUF1010	Protein of unknown function (DUF1010). Family of plasmid encoded proteins with unknown function.	81
399320	pfam06232	ATS3	Embryo-specific protein 3, (ATS3). This is a family of plant seed-specific proteins identified in Arabidopsis thaliana (Mouse-ear cress). ATS3 (Arabidopsis thaliana seed gene 3) is expressed in a pattern similar to the Arabidopsis seed storage protein genes.	125
399321	pfam06233	Usg	Usg-like family. Family of bacterial proteins, referred to as Usg. Usg is found in the same operon as trpF, trpB, and trpA and is expressed in a coupled transcription-translation system.	80
399322	pfam06234	TmoB	Toluene-4-monooxygenase system protein B (TmoB). This family consists of several Toluene-4-monooxygenase system protein B (TmoB) sequences. Pseudomonas mendocina KR1 metabolizes toluene as a carbon source. The initial step of the pathway is hydroxylation of toluene to form p-cresol by a multicomponent toluene-4-monooxygenase (T4MO) system. TmoB adopts a ubiquitin fold. Although TmoB is a component of the T4MO system, its precise role remains unclear.	78
368806	pfam06235	NAD4L	NADH dehydrogenase subunit 4L (NAD4L). This family consists of NADH dehydrogenase subunit 4L (NAD4L) proteins from the mitochondria of several parasitic flatworms.	86
399323	pfam06236	MelC1	Tyrosinase co-factor MelC1. This family consists of several tyrosinase co-factor MELC1 proteins from a number of Streptomyces species. The melanin operon (melC) of Streptomyces antibioticus contains two genes, melC1 and melC2 (apotyrosinase). It is thought that MelC1 forms a transient binary complex with the downstream apotyrosinase MelC2 to facilitate the incorporation of copper ion and the secretion of tyrosinase indicating that MelC1 is a chaperone for the apotyrosinase MelC2.	113
399324	pfam06237	DUF1011	Protein of unknown function (DUF1011). Family of uncharacterized eukaryotic proteins.	98
399325	pfam06239	ECSIT	Evolutionarily conserved signalling intermediate in Toll pathway. Activation of NF-kappaB as a consequence of signaling through the Toll and IL-1 receptors is a major element of innate immune responses. ECSIT plays an important role in signalling to NF-kappaB, functioning as the intermediate in the signaling pathways between TRAF-6 and MEKK-1.	219
399326	pfam06240	COXG	Carbon monoxide dehydrogenase subunit G (CoxG). The CO dehydrogenase structural genes coxMSL are flanked by nine accessory genes arranged as the cox gene cluster. The cox genes are specifically and coordinately transcribed under chemolithoautotrophic conditions in the presence of CO as carbon and energy source.	140
399327	pfam06241	Castor_Poll_mid	Castor and Pollux, part of voltage-gated ion channel. This family represents a short region in the middle of largely plant proteins that belong to the TCDB:1.A.1.23.2 family of the voltage-gated ion channel superfamily, eg UniProtKB:Q5H8A6, Q5H8A5 and Q4VY51.	104
399328	pfam06242	DUF1013	Protein of unknown function (DUF1013). Family of uncharacterized proteins found in Proteobacteria.	138
399329	pfam06243	PaaB	Phenylacetic acid degradation B. Phenylacetic acid degradation protein B (PaaB) is thought to be part of a multicomponent oxygenase involved in phenylacetyl-CoA hydroxylation.	88
399330	pfam06244	Ccdc124	Coiled-coil domain-containing protein 124. Ccdc124 is a centrosome and midbody protein involved in cytokinesis.	121
399331	pfam06245	DUF1015	Protein of unknown function (DUF1015). Family of proteins with unknown function found in archaea and bacteria.	330
399332	pfam06246	Isy1	Isy1-like splicing family. Isy1 protein is important in the optimisation of splicing.	245
399333	pfam06247	Plasmod_Pvs28	Pvs28 EGF domain. This family consists of several ookinete surface proteins (Pvs28) from several species of Plasmodium. Pvs25 and Pvs28 are expressed on the surface of ookinetes. These proteins are potential candidates for vaccine and induce antibodies that block the infectivity of Plasmodium vivax in immunized animals. The structure of this protein shows it is composed of four EGF domains.	35
368816	pfam06248	Zw10	Centromere/kinetochore Zw10. Zw10 and rough deal proteins are both required for correct metaphase check-pointing during mitosis. These proteins bind to the centromere/kinetochore.	543
114941	pfam06249	EutQ	Ethanolamine utilisation protein EutQ. The eut operon of Salmonella typhimurium encodes proteins involved in the cobalamin-dependent degradation of ethanolamine. The role of EutQ in this process is unclear.	152
399334	pfam06250	DUF1016	Protein of unknown function (DUF1016). Family of uncharacterized proteins found in viruses, archaea and bacteria.	155
399335	pfam06251	Caps_synth_GfcC	Capsule biosynthesis GfcC. Many bacteria are covered in a layer of surface-associated polysaccharide called the capsule. These capsules can be divided into four groups depending upon the organisation of genes responsible for capsule assembly, the assembly pathway and regulation. This family plays a role in group 4 capsule biosynthesis. These proteins have a beta-grasp fold. Two beta-grasp domains, D2 and D3, are arranged in tandem. There is a C-terminal amphipathic helix which packs against D3. A helical hairpin insert in D2 binds to D3 and constrains its position, a conserved arginine residue at the end of this hairpin is essential for structural integrity.	229
399336	pfam06252	DUF1018	Protein of unknown function (DUF1018). This family consists of several bacterial and phage proteins of unknown function.	118
399337	pfam06253	MTTB	Trimethylamine methyltransferase (MTTB). This family consists of several trimethylamine methyltransferase (MTTB) (EC:2.1.1.-) proteins from numerous Rhizobium and Methanosarcina species.	489
368819	pfam06254	YdaT_toxin	Putative bacterial toxin ydaT. YdaT_toxin is a family of putative bacterial toxins that are neutralized by the putative antitoxin YdaS, UniProtKB:P76063, family pfam144549.	88
283833	pfam06255	MafB	Neisseria toxin MafB. MafB constitutes a family of secreted toxins in pathogenic Neisseria species, probably involved in interbacterial competition. Genes immediately downstream of mafB encode a specific immunity protein (MafI). MafB proteins exhibit a signal peptide sequence, a N-terminal conserved domain and a C-terminal variable region. Toxic domains identified at the C-terminus include pfam15542, pfam14437, pfam15524, and pfam14436.	312
283834	pfam06256	Nucleo_LEF-12	Nucleopolyhedrovirus LEF-12 protein. This family consists of several Nucleopolyhedrovirus late expression factor-12 (LEF-12) proteins. The function of this family is unknown.	173
399338	pfam06257	VEG	Biofilm formation stimulator VEG. VEG is a family that is highly conserved among Gram-positive bacteria. It stimulates biofilm formation through inducing transcription of the tapA-sipW-tasA operon. The products of this operon are responsible for production of the amyloid fibre (TasA) component of the biofilm. Veg or a Veg-induced protein acts as an antirepressor of SinR - part of the major overall biofilm transcriptional control system - to regulate and stimulate biofilm formation. Veg is transcribed at high levels during both exponential growth and sporulation.	62
399339	pfam06258	Mito_fiss_Elm1	Mitochondrial fission ELM1. In plants, this family is involved in mitochondrial fission. It binds to dynamin-related proteins and plays a role in their relocation from the cytosol to mitochondrial fission sites. Its function in bacteria is unknown.	301
283837	pfam06259	Abhydrolase_8	Alpha/beta hydrolase. Members of this family are predicted to have an alpha/beta hydrolase fold. They contain a predicted Ser-His-Asp catalytic triad, in which the serine is likely to act as a nucleophile.	178
283838	pfam06260	DUF1024	Protein of unknown function (DUF1024). This family consists of several hypothetical Staphylococcus aureus and Staphylococcus aureus phage phi proteins. The function of this family is unknown.	82
368820	pfam06261	LktC	Actinobacillus actinomycetemcomitans leukotoxin activator LktC. This family consists of several Actinobacillus actinomycetemcomitans leukotoxin activator (LktC) proteins. Actinobacillus actinomycetemcomitans is a Gram-negative bacterium that has been implicated in the etiology of several forms of periodontitis, especially localized juvenile periodontitis. LktC along with LktB and LktD are thought to be required for activation and localization of the leukotoxin.	150
399340	pfam06262	Zincin_1	Zincin-like metallopeptidase. This family of proteins has a conserved HEXXH motif, suggesting they are putative peptidases of zincin fold. The structure of this family is a minimal version of the metalloprotease fold (Structure 3E11).	95
399341	pfam06265	DUF1027	Protein of unknown function (DUF1027). This family consists of several hypothetical bacterial proteins of unknown function.	84
336357	pfam06266	HrpF	HrpF protein. The species Pseudomonas syringae encompasses plant pathogens with differing host specificities and corresponding pathovar designations. P. syringae requires the Hrp (type III protein secretion) system, encoded by a 25-kb cluster of hrp and hrc genes, in order to elicit the hypersensitive response (HR) in nonhosts or to be pathogenic in hosts. The exact function of HrpF is unknown but the protein is needed for pathogenicity.	74
399342	pfam06267	DUF1028	Family of unknown function (DUF1028). Family of bacterial and archaeal proteins with unknown function. Some members are associated with a C-terminal peptidoglycan binding domain, so this could be an enzyme involved in peptidoglycan metabolism.	189
399343	pfam06268	Fascin	Fascin domain. This family consists of several eukaryotic fascin or singed proteins. The fascins are a structurally unique and evolutionarily conserved group of actin cross-linking proteins. Fascins function in the organisation of two major forms of actin-based structures: dynamic, cortical cell protrusions and cytoplasmic microfilament bundles. The cortical structures, which include filopodia, spikes, lamellipodial ribs, oocyte microvilli and the dendrites of dendritic cells, have roles in cell-matrix adhesion, cell interactions and cell migration, whereas the cytoplasmic actin bundles appear to participate in cell architecture. Dictyostelium hisactophilin, another actin-binding protein, is a submembranous pH sensor that signals slight changes of the H+ concentration to actin by inducing actin polymerization and binding to microfilaments only at pH values below seven. Members of this family are histidine rich, typically contain the repeated motif of HHXH.	111
283844	pfam06269	DUF1029	Protein of unknown function (DUF1029). This family consists of several short Chordopoxvirus proteins of unknown function.	53
114962	pfam06270	DUF1030	Protein of unknown function (DUF1030). This family consists of several short Circovirus proteins of unknown function.	53
399344	pfam06271	RDD	RDD family. This family of proteins contains three highly conserved amino acids: one arginine and two aspartates, hence the name of RDD family. This region contains two predicted transmembrane regions. The arginine occurs at the N-terminus of the first helix and the first aspartate occurs in the middle of this helix. The molecular function of this region is unknown. However, this region may be involved in transport of an as yet unknown set of ligands (Bateman A pers. obs.).	136
310698	pfam06273	eIF-4B	Plant specific eukaryotic initiation factor 4B. This family consists of several plant specific eukaryotic initiation factor 4B proteins.	502
114966	pfam06275	DUF1031	Protein of unknown function (DUF1031). This family consists of several Lactococcus lactis bacteriophage and Lactococcus lactis proteins of unknown function.	80
399345	pfam06276	FhuF	Ferric iron reductase FhuF-like transporter. This family consists of several bacterial ferric iron reductase protein (FhuF) sequences. FhuF is involved in the reduction of ferric iron in cytoplasmic ferrioxamine B. This family also includes the IucA and IucC proteins.	163
377642	pfam06277	EutA	Ethanolamine utilisation protein EutA. This family consists of several bacterial EutA ethanolamine utilisation proteins. The EutA protein is thought to protect the lyase (EutBC) from inhibition by CNB12.	475
399346	pfam06278	CNDH2_N	Condensin II complex subunit CAP-H2 or CNDH2, N-terminal. CNDH2_N is the N-terminal domain of the H2 subunit of the condensin II complex, found in eukaryotes but not in fungi. Eukaryotes carry at least two condensin complexes, I and II, each made up of five subunits. The functions of the two complexes are collaborative but non-overlapping. CI appears to be functional in G2 phase in the cytoplasm beginning the process of chromosomal lateral compaction while the CII is concentrated in the nucleus, possibly to counteract the activity of cohesion at this stage. In prophase, CII contributes to axial shortening of chromatids while CI continues to bring about lateral chromatid compaction, during which time the sister chromatids are joined centrally by cohesins. There appears to be just one condensin complex in fungi. CI and CII each contain SMC2 and SMC4 (structural maintenance of chromosomes) subunits, then CI has non-SMC CAP-D2 (CND1), CAP-G (CND3), and CAP-H (CND2). CII has, in addition to the two SMCs, CAP-D3, CAPG2 and CAP-H2. All four of the CAP-D and CAP-G subunits have degenerate HEAT repeats, whereas the CAP-H are kleisins or SMC-interacting proteins (ie they bind directly to the SMC subunits in the complex). The SMC molecules are each long with a small hinge-like knob at the free end of a longish strand, articulating with each other at the hinge. Each strand ends in a knob-like head that binds to one or other end of the CAP-H subunit. The HEAT-repeat containing D and G subunits bind side-by-side between the ends of the H subunit. Activity of the various parts of the complex seems to be triggered by extensive phosphorylations, eg, entry of the complex, in S. pombe, into the nucleus during mitosis is promoted by Cdk1 phosphorylation of SMC4/Cut3; and it has been shown that Cdk1 phosphorylates CAP-D3 at Thr1415 in HeLa cells thus promoting early stage chromosomal condensation by CII.	111
399347	pfam06279	DUF1033	Protein of unknown function (DUF1033). This family consists of several hypothetical bacterial proteins. Many of the sequences in this family are annotated as putative DNA binding proteins but the function of this family is unknown.	117
399348	pfam06280	fn3_5	Fn3-like domain. Fn3_5 is an fn3-like domain which is frequently found as the first of three on streptococcal C5a peptidase (SCP), a highly specific protease and adhesin/invasin. The family is found in conjunction with pfam00082, pfam02225 and pfam00746.	112
253656	pfam06281	DUF1035	Protein of unknown function (DUF1035). This family consists of several Sulfolobus and Sulfolobus virus proteins of unknown function.	73
399349	pfam06282	DUF1036	Protein of unknown function (DUF1036). This family consists of several hypothetical bacterial proteins of unknown function.	111
399350	pfam06283	ThuA	Trehalose utilisation. This family consists of several bacterial ThuA like proteins. ThuA appears to be involved in utilisation of trehalose. The thuA and thuB genes form part of the trehalose/sucrose transport operon thuEFGKAB, which is located on the pSymB megaplasmid. The thuA and thuB genes are induced in vitro by trehalose but not by sucrose and the extent of its induction depends on the concentration of trehalose available in the medium.	213
283854	pfam06284	Cytomega_UL84	Cytomegalovirus UL84 protein. This family consists of several Cytomegalovirus UL84 proteins. The open reading frame UL84 of human cytomegalovirus encodes a multifunctional regulatory protein which is required for viral DNA replication and binds with high affinity to the immediate-early transactivator IE2-p86.	586
253659	pfam06286	Coleoptericin	Coleoptericin. This family consists of several insect Coleoptericin, Acaloleptin, Holotricin and Rhinocerosin proteins which are all known to be antibacterial proteins.	143
377644	pfam06287	DUF1039	Protein of unknown function (DUF1039). This family consists of several hypothetical bacterial proteins from Escherichia coli and Citrobacter rodentium. The function of this family is unknown.	65
399351	pfam06288	DUF1040	Protein of unknown function (DUF1040). This family consists of several bacterial YihD proteins of unknown function.	86
399352	pfam06289	FlbD	Flagellar protein (FlbD). This family consists of several bacterial FlbD flagellar proteins. The exact function of this family is unknown.	59
399353	pfam06290	PsiB	Plasmid SOS inhibition protein (PsiB). This family consists of several plasmid SOS inhibition protein (PsiB) sequences.	138
399354	pfam06291	Lambda_Bor	Bor protein. This family consists of several Bacteriophage lambda Bor and Escherichia coli Iss proteins. Expression of bor significantly increases the survival of the Escherichia coli host cell in animal serum. This property is a well known bacterial virulence determinant; indeed, bor and its adjacent sequences are highly homologous to the iss serum resistance locus of the plasmid ColV2-K94, which confers virulence in animals. It has been suggested that lysogeny may generally have a role in bacterial survival in animal hosts, and perhaps in pathogenesis.	77
399355	pfam06292	DUF1041	Domain of Unknown Function (DUF1041). This family consists of several eukaryotic domains of unknown function. Members of this family are often found in tandem repeats and co-occur with pfam00168, pfam00130 and pfam00169 domains.	107
399356	pfam06293	Kdo	Lipopolysaccharide kinase (Kdo/WaaP) family. These lipopolysaccharide kinases are related to protein kinases pfam00069. This family includes the waaP (rfaP) gene product, which is required for the addition of phosphate to O-4 of the first heptose residue of the lipopolysaccharide (LPS) inner core region. It has previously been shown that WaaP is necessary for resistance to hydrophobic and polycationic antimicrobials in E. coli and that it is required for virulence in invasive strains of S. enterica.	206
399357	pfam06294	CH_2	CH-like domain in sperm protein. Spef is a region of sperm flagellar proteins. It probably exerts a role in spermatogenesis in that the protein is expressed predominantly in adult tissue. It is present in the tails of developing and epididymal sperm internal to the fibrous sheath and around the dense outer fibers of the sperm flagellum. The amino-terminal domain (residues 1-110) shows a possible calponin homology (CH) domain; however Spef does not bind actin directly under in vitro conditions, so the function of the amino-terminal calponin-like domain is unclear. Transcription aberrations leading to a truncated protein result in immotile sperm.	91
399358	pfam06295	DUF1043	Protein of unknown function (DUF1043). This family consists of several hypothetical bacterial proteins of unknown function.	123
399359	pfam06296	RelE	RelE toxin of RelE / RelB toxin-antitoxin system. RelE is a family of Gram-negative bacterial toxins of the RelE/RelB toxin-antitoxin system. Its cognate antitoxin is family RelB, pfam04221.	120
399360	pfam06297	PET	PET Domain. This domain is suggested to be involved in protein-protein interactions. The family is found in conjunction with pfam00412.	85
399361	pfam06298	PsbY	Photosystem II protein Y (PsbY). This family consists of several bacterial and plant photosystem II protein Y (PsbY) sequences. PsbY is a manganese-binding protein that has an L-arginine metabolising enzyme activity.	35
399362	pfam06299	DUF1045	Protein of unknown function (DUF1045). This family consists of several hypothetical proteins from Agrobacterium, Rhizobium and Brucella species. The function of this family is unknown.	159
368834	pfam06300	Tsp45I	Tsp45I type II restriction enzyme. This family consists of several type II restriction enzymes.	260
399363	pfam06301	Lambda_Kil	Bacteriophage lambda Kil protein. This family consists of several Bacteriophage lambda Kil protein like sequences from both phages and bacteria. Induction of a lambda prophage causes the death of the host cell even in the absence of phage replication and lytic functions due to expression of the lambda kil gene.	42
399364	pfam06303	MatP	MatP N-terminal domain. This family, many of whose members are YcbG, organizes the macrodomain Ter of the chromosome of bacteria such as E. coli. In these bacteria, insulated macrodomains influence the segregation of sister chromatids and the mobility of chromosomal DNA. Organisation of the Terminus region (Ter) into a macrodomain relies on the presence of a 13 bp motif called matS repeated 23 times in the 800-kb-long domain. MatS sites are the main targets in the E. coli chromosome of YcbG or MatP (macrodomain Ter protein). MatP accumulates in the cell as a discrete focus that co-localizes with the Ter macrodomain. The effects of MatP inactivation reveal its role as the main organizer of the Ter macrodomain: in the absence of MatP, DNA is less compacted, the mobility of markers is increased, and segregation of the Ter macrodomain occurs early in the cell cycle. A specific organisational system is required in the Terminus region for bacterial chromosome management during the cell cycle. This entry represents the N-terminal domain of MatP.	84
399365	pfam06304	DUF1048	Protein of unknown function (DUF1048). This family consists of several hypothetical bacterial proteins of unknown function.	103
399366	pfam06305	LapA_dom	Lipopolysaccharide assembly protein A domain. This family includes a domain found in lipopolysaccharide assembly protein A (LapA). LapA functions along with LapB in the assembly of lipopolysaccharide (LPS). Domains in this family are also found in some uncharacterized bacterial proteins.	64
114995	pfam06306	CgtA	Beta-1,4-N-acetylgalactosaminyltransferase (CgtA). This family consists of several beta-1,4-N-acetylgalactosaminyltransferase proteins from Campylobacter jejuni.	347
253668	pfam06307	Herpes_IR6	Herpesvirus IR6 protein. This family consists of several Herpesvirus IR6 proteins. The equine herpesvirus 1 (EHV-1) IR6 protein forms typical rod-like structures in infected cells, influences virus growth at elevated temperatures, and determines the virulence of EHV-1 Rac strains.	214
368836	pfam06308	ErmC	23S rRNA methylase leader peptide (ErmC). This family consists of several very short bacterial 23S rRNA methylase leader peptide (ErmC) sequences. ermC confers resistance to macrolide-lincosamide streptogramin B antibiotics by specifying a ribosomal RNA methylase, which results in decreased ribosomal affinity for these antibiotics. ermC expression is induced by exposure to erythromycin.	31
399367	pfam06309	Torsin	Torsin. This family consists of several eukaryotic torsin proteins. Torsion dystonia is an autosomal dominant movement disorder characterized by involuntary, repetitive muscle contractions and twisted postures. The most severe early-onset form of dystonia has been linked to mutations in the human DYT1 (TOR1A) gene encoding a protein termed torsinA. While causative genetic alterations have been identified, the function of torsin proteins and the molecular mechanism underlying dystonia remain unknown. Phylogenetic analysis of the torsin protein family indicates these proteins share distant sequence similarity with the large and diverse family of (pfam00004) proteins. It has been suggested that torsins play a role in effectively managing protein folding and that possible breakdown in a neuroprotective mechanism that is, in part, mediated by torsins may be responsible for the neuronal dysfunction associated with dystonia.	120
399368	pfam06311	NumbF	NUMB domain. This presumed domain is found in the Numb family of proteins adjacent to the PTB domain.	90
368838	pfam06312	Neurexophilin	Neurexophilin. This family consists of mammalian neurexophilin proteins. Mammalian brains contain four different neurexophilin proteins. Neurexophilins form a family of related glycoproteins that are proteolytically processed after synthesis and bind to alpha-neurexins. The structure and characteristics of neurexophilins indicate that they function as neuropeptides that may signal via alpha-neurexins.	203
399369	pfam06313	ACP53EA	Drosophila ACP53EA protein. This family consists of several Drosophila ACP53EA accessory gland (seminal) proteins.	90
399370	pfam06314	ADC	Acetoacetate decarboxylase (ADC). This family consists of several acetoacetate decarboxylase (ADC) proteins (EC:4.1.1.4).	239
399371	pfam06315	AceK	Isocitrate dehydrogenase kinase/phosphatase (AceK). This family consists of several bacterial isocitrate dehydrogenase kinase/phosphatase (AceK) proteins (EC:2.7.1.116).	560
115004	pfam06316	Ail_Lom	Enterobacterial Ail/Lom protein. This family consists of several bacterial and phage Ail/Lom-like proteins. The Yersinia enterocolitica Ail protein is a known virulence factor. Proteins in this family are predicted to consist of eight transmembrane beta-sheets and four cell surface-exposed loops. It is thought that Ail directly promotes invasion and loop 2 contains an active site, perhaps a receptor-binding domain. The phage protein Lom is expressed during lysogeny and is a host-cell envelope protein. Lom is found in the bacterial outer membrane, and is homologous to virulence proteins of two other enterobacterial genera. It has been suggested that lysogeny may generally have a role in bacterial survival in animal hosts, and perhaps in pathogenesis.	199
399372	pfam06317	Arena_RNA_pol	Arenavirus RNA polymerase. This family consists of several Arenavirus RNA polymerase proteins (EC:2.7.7.48).	2039
399373	pfam06319	MmcB-like	DNA repair protein MmcB-like. This family includes Caulobacter MmcB (CCNA_03580), which is involved in DNA repair. It has been proposed to be an endonuclease that creates the substrate for translesion synthesis.	148
399374	pfam06320	GCN5L1	GCN5-like protein 1 (GCN5L1). This family consists of several eukaryotic GCN5-like protein 1 (GCN5L1) sequences. The function of this family is unknown.	113
399375	pfam06321	P_gingi_FimA	Major fimbrial subunit protein (FimA). This family consists of several Porphyromonas gingivalis major fimbrial subunit protein (FimA) sequences. Fimbriae of Porphyromonas gingivalis, a periodontopathogen, play an important role in its adhesion to and invasion of host cells. The fimA genes encoding fimbrillin (FimA), a subunit protein of fimbriae, have been classified into five types, types I to V, based on nucleotide sequences. It has been found that type II FimA can bind to epithelial cells most efficiently through specific host receptors. Human dental plaque is a multispecies microbial biofilm that is associated with two common oral diseases, dental caries and periodontal disease. There is an inter-species contact-dependent communication system between P. gingivalis and S. cristatus that involves the Arc-A enzyme.	157
283883	pfam06322	Phage_NinH	Phage NinH protein. This family consists of several phage NinH proteins. The function of this family is unknown.	60
399376	pfam06323	Phage_antiter_Q	Phage antitermination protein Q. This family consists of several phage antitermination protein Q and related bacterial sequences. Phage 82 gene Q encodes a phage-specific positive regulator of late gene expression, thought, by analogy to the corresponding gene of phage lambda, to be a transcription antiterminator.	220
283885	pfam06324	Pigment_DH	Pigment-dispersing hormone (PDH). This family consists of several eukaryotic pigment-dispersing hormone (PDH) proteins. The pigment-dispersing hormone (PDH) is produced in the eyestalks of Crustacea where it induces light-adapting movements of pigment in the compound eye and regulates the pigment dispersion in the chromatophores.	18
399377	pfam06325	PrmA	Ribosomal protein L11 methyltransferase (PrmA). This family consists of several Ribosomal protein L11 methyltransferase (EC:2.1.1.-) sequences.	295
283887	pfam06326	Vesiculo_matrix	Vesiculovirus matrix protein. This family consists of several Vesiculovirus matrix proteins. The matrix (M) protein of vesicular stomatitis virus (VSV) expressed in the absence of other viral components causes many of the cytopathic effects of VSV, including an inhibition of host gene expression and the induction of cell rounding. It has been shown that M protein also induces apoptosis in the absence of other viral components. It is thought that the activation of apoptotic pathways causes the inhibition of host gene expression and cell rounding by M protein.	240
399378	pfam06327	DUF1053	Domain of Unknown Function (DUF1053). This domain is found in Adenylate cyclases.	100
399379	pfam06328	Lep_receptor_Ig	Ig-like C2-type domain. This domain is a ligand-binding immunoglobulin-like domain. The two cysteine residues form a disulphide bridge.	87
310730	pfam06330	TRI5	Trichodiene synthase (TRI5). This family consists of several fungal trichodiene synthase proteins (EC:4.2.3.6). TRI5 encodes the enzyme trichodiene synthase, which has been shown to catalyze the first step in the trichothecene pathways of Fusarium and Trichothecium species.	353
399380	pfam06331	Tfb5	Transcription factor TFIIH complex subunit Tfb5. This family is a component of the general transcription and DNA repair factor IIH. TFB5 has been shown to be required for efficient recruitment of TFIIH to a promoter.	67
399381	pfam06333	Med13_C	Mediator complex subunit 13 C-terminal. Mediator is a large complex of up to 33 proteins that is conserved from plants through fungi to humans - the number and representation of individual subunits varying with species. It is arranged into four different sections, a core, a head, a tail and a kinase-activity part, and the number of subunits within each of these is what varies with species. Overall, Mediator regulates the transcriptional activity of RNA polymerase II but it would appear that each of the four different sections has a slightly different function. Med13 is part of the ancillary kinase module, together with Med12, CDK8 and CycC, which in yeast is implicated in transcriptional repression, though most of this activity is likely attributable to the CDK8 kinase. The large Med12 and Med13 proteins are required for specific developmental processes in Drosophila, zebrafish, and Caenorhabditis elegans but their biochemical functions are not understood.	322
283893	pfam06334	Orthopox_A47	Orthopoxvirus A47 protein. This family consists of several Orthopoxvirus A47 proteins. The function of this family is unknown.	244
399382	pfam06335	DUF1054	Protein of unknown function (DUF1054). This family consists of several hypothetical bacterial proteins of unknown function.	198
283895	pfam06336	Corona_5a	Coronavirus 5a protein. This family consists of several Coronavirus 5a proteins. The function of this family is unknown.	64
399383	pfam06337	DUSP	DUSP domain. The DUSP (domain present in ubiquitin-specific protease) domain is found at the N-terminus of Ubiquitin-specific proteases. The structure of this domain has been solved. Its tripod-like structure consists of a 3-fold alpha-helical bundle supporting a triple-stranded anti-parallel beta-sheet.	80
399384	pfam06338	ComK	ComK protein. This family consists of several bacterial ComK proteins. The ComK protein of Bacillus subtilis positively regulates the transcription of several late competence genes as well as comK itself. It has been found that ClpX plays an important role in the regulation of ComK at the post-transcriptional level.	152
368850	pfam06339	Ectoine_synth	Ectoine synthase. This family consists of several bacterial ectoine synthase proteins. The ectABC genes encode the diaminobutyric acid acetyltransferase (EctA), the diaminobutyric acid aminotransferase (EctB), and the ectoine synthase (EctC). Together these proteins constitute the ectoine biosynthetic pathway.	127
283899	pfam06340	TcpF	Vibrio cholerae toxin co-regulated pilus biosynthesis protein F. This family consists of several Vibrio cholerae toxin co-regulated pilus biosynthesis protein F (TcpF) sequences. TcpF is known to be a secreted virulence protein but its exact function is unknown.	317
283900	pfam06341	DUF1056	Protein of unknown function (DUF1056). This family consists of several putative head-tail joining bacteriophage proteins.	63
115027	pfam06342	DUF1057	Alpha/beta hydrolase of unknown function (DUF1057). This family consists of several Caenorhabditis elegans specific proteins of unknown function. Members of this family have an alpha/beta hydrolase fold.	297
283901	pfam06344	Parecho_VpG	Parechovirus Genome-linked protein. This family is of the Parechovirus genome-linked protein Vpg type P3B.	20
399385	pfam06345	Drf_DAD	DRF Autoregulatory Domain. This motif is found in Diaphanous-related formins. It binds the N-terminal GTPase-binding domain; this link is broken when GTP-bound Rho binds to the GBD and activates the protein. The addition of DAD to mammalian cells induces actin filament formation, stabilizes microtubules, and activates serum-response mediated transcription.	15
399386	pfam06346	Drf_FH1	Formin Homology Region 1. This region is found in some of the Diaphanous related formins (Drfs). It consists of low complexity repeats of around 12 residues.	154
399387	pfam06347	SH3_4	Bacterial SH3 domain. This family consists of several hypothetical bacterial proteins of unknown function. These are composed of SH3-like domains.	56
399388	pfam06348	DUF1059	Protein of unknown function (DUF1059). This family consists of several short hypothetical archaeal proteins of unknown function.	56
399389	pfam06350	HSL_N	Hormone-sensitive lipase (HSL) N-terminus. This family consists of several mammalian hormone-sensitive lipase (HSL) proteins (EC:3.1.1.-). Hormone-sensitive lipase, a key enzyme in fatty acid mobilisation, overall energy homeostasis, and possibly steroidogenesis, is acutely controlled through reversible phosphorylation by catecholamines and insulin.	306
368854	pfam06351	Allene_ox_cyc	Allene oxide cyclase. This family consists of several plant specific allene oxide cyclase proteins (EC:5.3.99.6). The allene oxide cyclase (AOC)-catalyzed step in jasmonate (JA) biosynthesis is important in the wound response of tomato.	175
399390	pfam06353	DUF1062	Protein of unknown function (DUF1062). This family consists of several hypothetical bacterial proteins of unknown function.	135
399391	pfam06355	Aegerolysin	Aegerolysin. This family consists of several bacterial and fungal Aegerolysin-like proteins. It has been found that aegerolysin and ostreolysin are expressed during formation of primordia and fruiting bodies. It has been suggested that these haemolysins play an important role in the initial phase of fungal fruiting. The bacterial members of this family are expressed during sporulation. Ostreolysin was found to be cytolytic to various erythrocytes and tumor cells. It forms transmembrane pores 4 nm in diameter. The activity is inhibited by total membrane lipids, and modulated by lysophosphatides. The potential use of aegerolysins is reviewed with special emphasis on their properties which would allow their use in therapeutics. Aegerolysin is part of the pleurotolysin pore-forming (Pleurotolysin) transporter superfamily. Member proteins assemble into a transmembrane pore complex.	131
283910	pfam06356	DUF1064	Protein of unknown function (DUF1064). This family consists of several phage and bacterial proteins of unknown function.	117
253691	pfam06357	Omega-toxin	Omega-atracotoxin. This family consists of several Hadronyche versuta (Blue mountains funnel-web spider) specific omega-atracotoxin proteins. Omega-Atracotoxin-Hv1a is an insect-specific neurotoxin whose phylogenetic specificity derives from its ability to antagonise insect, but not vertebrate, voltage-gated calcium channels. Two spatially proximal residues, Asn(27) and Arg(35), form a contiguous molecular surface that is essential for toxin activity. It has been proposed that this surface of the beta-hairpin is a key site for interaction of the toxin with insect calcium channels.	37
283911	pfam06358	DUF1065	Protein of unknown function (DUF1065). This family consists of several Benyvirus proteins of unknown function.	111
399392	pfam06360	E_raikovi_mat	Euplotes raikovi mating pheromone. This family consists of several Euplotes raikovi mating pheromone proteins. Diffusible polypeptide pheromones, which distinguish otherwise morphologically identical vegetative cell types from one another, are produced by some species of ciliates. In the marine sand-dwelling protozoan ciliate Euplotes raikovi, pheromone molecules promote the vegetative reproduction (mitogenic proliferation or growth) of the same cells from which they originate. As, understandably, such autocrine pheromone activity is primary to that of targeting and inducing a foreign cell to mate (paracrine functions), this finding provides an example of how the original function of a molecule can be obscured during evolution by the acquisition of a new one.	33
283912	pfam06361	RTBV_P12	Rice tungro bacilliform virus P12 protein. This family consists of several Rice tungro bacilliform virus P12 proteins. The function of this family is unknown.	110
115044	pfam06362	DUF1067	Protein of unknown function (DUF1067). This family consists of several hypothetical Mycobacterium leprae specific proteins. The function of this family is unknown.	97
368858	pfam06363	Picorna_P3A	Picornaviridae P3A protein. This family consists of the P3A protein of picornaviridae. P3A has been identified as a genome-linked protein (VPg) and is involved in replication.	98
399393	pfam06364	DUF1068	Protein of unknown function (DUF1068). This family consists of several hypothetical plant proteins from Arabidopsis thaliana and Oryza sativa. The function of this family is unknown.	165
399394	pfam06365	CD34_antigen	CD34/Podocalyxin family. This family consists of several mammalian CD34 antigen proteins. The CD34 antigen is a human leukocyte membrane protein expressed specifically by lymphohematopoietic progenitor cells. CD34 is a phosphoprotein. Activation of protein kinase C (PKC) has been found to enhance CD34 phosphorylation. This family contains several eukaryotic podocalyxin proteins. Podocalyxin is a major membrane protein of the glomerular epithelium and is thought to be involved in maintenance of the architecture of the foot processes and filtration slits characteristic of this unique epithelium by virtue of its high negative charge. Podocalyxin functions as an anti-adhesin that maintains an open filtration pathway between neighboring foot processes in the glomerular epithelium by charge repulsion.	210
399395	pfam06366	FlhE	Flagellar protein FlhE. This family consists of several Enterobacterial FlhE flagellar proteins. The exact function of this family is unknown.	106
368862	pfam06367	Drf_FH3	Diaphanous FH3 Domain. This region is found in the Formin-like and diaphanous proteins.	195
399396	pfam06368	Met_asp_mut_E	Methylaspartate mutase E chain (MutE). This family consists of several methylaspartate mutase E chain proteins (EC:5.4.99.1). Glutamate mutase catalyzes the first step in the fermentation of glutamate by Clostridium tetanomorphum. This is an unusual isomerisation in which L-glutamate is converted to threo-beta-methyl L-aspartate.	441
399397	pfam06369	Anemone_cytotox	Sea anemone cytotoxic protein. Sea anemones are a rich source of cytotoxic proteins. Cytolysins comprise a group of more than 30 highly basic proteins with molecular masses of about 20 kDa. Cytolysins isolated from the sea anemone, Heteractis magnifica, include magnificalysin I (HMg I), magnificalysin II (HMg II) and Heteractis magnifica toxin (HMgtxn). These are highly homologous at their N-terminals. HMg I and II have molecular masses of approximately 19 kDa, and pI values of 9.4 and 10.0, respectively. Cytolysins isolated from other sea anemones Actinia tenebrosa (Tenebrosin-C, TN-C), Actinia equina (Equinatoxin, EqT) and Stichodactyla helianthus (ShC) exhibit pore-forming, haemolytic, cytotoxic, and heart stimulatory activities.	176
191504	pfam06370	DUF1069	Protein of unknown function (DUF1069). This family consists of several Maize streak virus 21.7 kDa proteins. The function of this family is unknown.	206
399398	pfam06371	Drf_GBD	Diaphanous GTPase-binding Domain. This domain is bound by GTP-bound Rho proteins, leading to activation of the Drf protein.	190
148150	pfam06372	Gemin6	Gemin6 protein. This family consists of several mammalian Gemin6 proteins. The exact function of Gemin6 is unknown but it has been found to form part of the SMN complex (see pfam06003). The SMN complex plays a key role in the biogenesis of spliceosomal small nuclear ribonucleoproteins (snRNPs) and other ribonucleoprotein particles.	169
368865	pfam06373	CART	Cocaine and amphetamine regulated transcript protein (CART). This family consists of several cocaine and amphetamine regulated transcript type I protein (CART) sequences. Cocaine and amphetamine regulated transcript (CART) peptide has been shown to be an anorectic peptide that inhibits both normal and starvation-induced feeding and completely blocks the feeding response induced by neuropeptide Y and regulated by leptin in the hypothalamus. The C-terminal part containing the three disulfide bridges is the biologically active part of the molecule affecting food intake. The solution structure of the active part of CART has a fold equivalent to other functionally distinct small proteins. CART consists mainly of turns and loops spanned by a compact framework composed by a few small stretches of antiparallel beta-sheet common to cystine knots.	70
399399	pfam06374	NDUF_C2	NADH-ubiquinone oxidoreductase subunit b14.5b (NDUFC2). This family consists of several NADH-ubiquinone oxidoreductase subunit b14.5b proteins (EC:1.6.5.3).	109
399400	pfam06375	AP3D1	AP-3 complex subunit delta-1. AP-3 complex subunit delta-1 (AP3D1) is part of the AP-3 complex, an adaptor-related complex which is not clathrin-associated. The complex is associated with the Golgi region as well as more peripheral structures. AP3D1 is required for efficient transport of VSV-G (vesicular stomatitis virus glycoprotein) from the trans-Golgi network to the cell surface.	157
399401	pfam06376	AGP	Arabinogalactan peptide. This entry represents the arabinogalactan peptide family found in plants.	35
399402	pfam06377	Adipokin_hormo	Adipokinetic hormone. This family consists of several insect adipokinetic hormones as well as the related crustacean red pigment concentrating hormone. Flight activity of insects comprises one of the most intense biochemical processes known in nature, and therefore provides an attractive model system to study the hormonal regulation of metabolism during physical exercise. In long-distance flying insects, such as the migratory locust, both carbohydrate and lipid reserves are utilized as fuels for sustained flight activity. The mobilization of these energy stores in Locusta migratoria is mediated by three structurally related adipokinetic hormones (AKHs), which are all capable of stimulating the release of both carbohydrates and lipids from the fat body.	51
399403	pfam06378	DUF1071	Protein of unknown function (DUF1071). This family consists of several hypothetical bacterial and phage proteins of unknown function.	152
115061	pfam06379	RhaT	L-rhamnose-proton symport protein (RhaT). This family consists of several bacterial L-rhamnose-proton symport protein (RhaT) sequences.	344
148156	pfam06380	DUF1072	Protein of unknown function (DUF1072). This family consists of several Barley yellow dwarf virus proteins of unknown function.	39
399404	pfam06381	DUF1073	Protein of unknown function (DUF1073). This family consists of several hypothetical bacterial proteins. The function of this family is unknown.	355
283927	pfam06382	DUF1074	Protein of unknown function (DUF1074). This family consists of several proteins which appear to be specific to Drosophila melanogaster. The function of this family is unknown.	125
399405	pfam06384	ICAT	Beta-catenin-interacting protein ICAT. This family consists of several eukaryotic beta-catenin-interacting (ICAT) proteins. Beta-catenin is a multifunctional protein involved in both cell adhesion and transcriptional activation. Transcription mediated by the beta-catenin/Tcf complex is involved in embryological development and is upregulated in various cancers. ICAT selectively inhibits beta-catenin/Tcf binding in vivo, without disrupting beta-catenin/cadherin interactions.	76
283929	pfam06385	Baculo_LEF-11	Baculovirus LEF-11 protein. This family consists of several Baculovirus LEF-11 proteins. The exact function of this family is unknown although it has been shown that LEF-11 is required for viral DNA replication during the infection cycle.	93
399406	pfam06386	GvpL_GvpF	Gas vesicle synthesis protein GvpL/GvpF. This family consists of several bacterial and archaeal gas vesicle synthesis protein (GvpL/GvpF) sequences. The exact function of this family is unknown.	237
399407	pfam06387	Calcyon	D1 dopamine receptor-interacting protein (calcyon). This family consists of several D1 dopamine receptor-interacting (calcyon) proteins. D1/D5 dopamine receptors in the basal ganglia, hippocampus, and cerebral cortex modulate motor, reward, and cognitive behaviour. D1-like dopamine receptors likely modulate neocortical and hippocampal neuronal excitability and synaptic function via Ca(2+) as well as cAMP-dependent signaling. Defective calcyon proteins have been implicated in both attention-deficit/hyperactivity disorder (ADHD) and schizophrenia.	178
399408	pfam06388	DUF1075	Protein of unknown function (DUF1075). This family consists of several eukaryotic proteins of unknown function.	124
399409	pfam06389	Filo_VP24	Filovirus membrane-associated protein VP24. This family consists of several membrane-associated protein VP24 sequences from a variety of Ebola and Marburg viruses. The VP24 protein of Ebola virus is believed to be a secondary matrix protein and minor component of virions. VP24 possesses structural features commonly associated with viral matrix proteins and may have a role in virus assembly and budding.	227
115071	pfam06390	NESP55	Neuroendocrine-specific golgi protein P55 (NESP55). This family consists of several mammalian neuroendocrine-specific golgi protein P55 (NESP55) sequences. NESP55 is a novel member of the chromogranin family and is a soluble, acidic, heat-stable secretory protein that is expressed exclusively in endocrine and nervous tissues, although less widely than chromogranins.	261
399410	pfam06391	MAT1	CDK-activating kinase assembly factor MAT1. MAT1 is an assembly/targeting factor for cyclin-dependent kinase-activating kinase (CAK), which interacts with the transcription factor TFIIH. The domain found to the N-terminal side of this domain is a C3HC4 RING finger.	203
368876	pfam06392	Asr	Acid shock protein repeat. The Asr protein is synthesized as a precursor and the cleavage is essential for moderate to high acid tolerance.	22
399411	pfam06393	BID	BH3 interacting domain (BID). BID is a member of the BCL-2 superfamily of proteins, which are key regulators of programmed cell death; hence this family is related to pfam00452. BID is a pro-apoptotic member of the Bcl-2 superfamily and as such possesses the ability to target intracellular membranes and contains the BH3 death domain. The activity of BID is regulated by a Caspase 8-mediated cleavage event, exposing the BH3 domain and significantly changing the surface charge and hydrophobicity, which causes a change of cellular localization.	191
368878	pfam06394	Pepsin-I3	Pepsin inhibitor-3-like repeated domain. Pepsin inhibitor-3 (PI-3) consists of two domains, each comprising an antiparallel beta-sheet flanked by an alpha-helix. In the enzyme-inhibitor complex, the N-terminal beta-strand of PI-3 pairs with one strand of the active site flap region of pepsin. The two domains are tandem repeats in sequence, hence the term repeated domain.	74
399412	pfam06395	CDC24	CDC24 Calponin. This is a calponin homology domain.	89
399413	pfam06396	AGTRAP	Angiotensin II, type I receptor-associated protein (AGTRAP). This family consists of several angiotensin II, type I receptor-associated protein (AGTRAP) sequences. AGTRAP is known to interact specifically with the carboxyl-terminal cytoplasmic region of the angiotensin II type 1 (AT(1)) receptor to regulate different aspects of AT(1) receptor physiology. The function of this family is unclear.	146
283940	pfam06397	Desulfoferrod_N	Desulfoferrodoxin, N-terminal domain. Most members of this family are small (approximately 36 amino acids) proteins that form homodimeric complexes. Each subunit contains a high-spin iron atom tetrahedrally bound to four cysteinyl sulphur atoms. This family has a similar fold to the rubredoxin metal binding domain. It is also found as the N-terminal domain of desulfoferrodoxin (see pfam01880).	36
399414	pfam06398	Pex24p	Integral peroxisomal membrane peroxin. Peroxisomes play diverse roles in the cell, compartmentalising many activities related to lipid metabolism and functioning in the decomposition of toxic hydrogen peroxide. Sequence similarity was identified between two hypothetical proteins and the peroxin integral membrane protein Pex24p.	369
399415	pfam06399	GFRP	GTP cyclohydrolase I feedback regulatory protein (GFRP). Tetrahydrobiopterin, the cofactor required for hydroxylation of aromatic amino acids, regulates its own synthesis via feedback inhibition of GTP cyclohydrolase I. This mechanism is mediated by the regulatory subunit called GTP cyclohydrolase I feedback regulatory protein (GFRP).	81
399416	pfam06400	Alpha-2-MRAP_N	Alpha-2-macroglobulin RAP, N-terminal domain. The alpha-2-macroglobulin receptor-associated protein (RAP) is an intracellular glycoprotein that binds to the alpha-2-macroglobulin receptor and other members of the low density lipoprotein receptor family. The protein inhibits binding of all currently known ligands of these receptors. The N-terminal domain is predominately alpha helical. Two different studies have provided conflicting domain boundaries.	117
399417	pfam06401	Alpha-2-MRAP_C	Alpha-2-macroglobulin RAP, C-terminal domain. The alpha-2-macroglobulin receptor-associated protein (RAP) is an intracellular glycoprotein that binds to the alpha-2-macroglobulin receptor and other members of the low density lipoprotein receptor family. The protein inhibits binding of all currently known ligands of these receptors. Two different studies have provided conflicting domain boundaries.	209
399418	pfam06403	Lamprin	Lamprin. This family consists of several lamprin proteins from the Sea lamprey Petromyzon marinus. Lamprin, an insoluble non-collagen, non-elastin protein, is the major connective tissue component of the fibrillar extracellular matrix of lamprey annular cartilage. Although not generally homologous to any other protein, soluble lamprins contain a tandemly repeated peptide sequence (GGLGY) which is present in both silkmoth chorion proteins and spider dragline silk. Strong homologies to this repeat sequence are also present in several mammalian and avian elastins. It is thought that these proteins share a structural motif which promotes self-aggregation and fibril formation in proteins through interdigitation of hydrophobic side chains in beta-sheet/beta-turn structures, a motif that has been preserved in recognisable form over several hundred million years of evolution.	91
399419	pfam06404	PSK	Phytosulfokine precursor protein (PSK). This family consists of several plant specific phytosulfokine precursor proteins. Phytosulfokines are active as either a pentapeptide or a C-terminally truncated tetrapeptide. These compounds were first isolated because of their ability to stimulate cell division in somatic embryo cultures of Asparagus officinalis.	89
399420	pfam06405	RCC_reductase	Red chlorophyll catabolite reductase (RCC reductase). This family consists of several red chlorophyll catabolite reductase (RCC reductase) proteins. Red chlorophyll catabolite (RCC) reductase (RCCR) and pheophorbide (Pheide) a oxygenase (PaO) catalyze the key reaction of chlorophyll catabolism, porphyrin macrocycle cleavage of Pheide a to a primary fluorescent catabolite (pFCC).	253
310773	pfam06406	StbA	StbA protein. This family consists of several bacterial StbA plasmid stability proteins.	317
399421	pfam06407	BDV_P40	Borna disease virus P40 protein. This family consists of several Borna disease virus P40 proteins. Borna disease (BD) is a persistent viral infection of the central nervous system caused by the single-negative-strand, nonsegmented RNA Borna disease virus (BDV). P40 is known to be a nucleoprotein.	348
399422	pfam06409	NPIP	Nuclear pore complex interacting protein (NPIP). This family consists of a series of primate specific nuclear pore complex interacting protein (NPIP) sequences. The function of this family is unknown, but it is well conserved from African apes to humans.	262
399423	pfam06411	HdeA	HdeA/HdeB family. HdeA (hns-dependent expression protein A) is a single domain alpha-helical protein localized in the periplasmic space. HdeA is involved in acid resistance essential for infectivity of enteric bacterial pathogens. Functional studies demonstrate that HdeA is activated by a dimer-to-monomer transition at acidic pH, leading to suppression of aggregation by acid-denatured proteins. The gene encoding HdeA was initially identified as part of an operon regulated by the nucleoid protein H-NS. This family also contains HdeB.	92
399424	pfam06412	TraD	Conjugal transfer protein TraD. This family contains bacterial TraD conjugal transfer proteins. Mutations in the TraD gene result in loss of transfer.	61
399425	pfam06413	Neugrin	Neugrin. This family consists of several mouse and human neugrin proteins. Neugrin and m-neugrin are mainly expressed in neurons in the nervous system, and are thought to play an important role in the process of neuronal differentiation.	225
399426	pfam06414	Zeta_toxin	Zeta toxin. This family consists of several bacterial zeta toxin proteins. Zeta toxin is thought to be part of a postregulational killing system in bacteria. It relies on antitoxin/toxin systems that secure stable inheritance of low and medium copy number plasmids during cell division and kill cells that have lost the plasmid.	192
399427	pfam06415	iPGM_N	BPG-independent PGAM N-terminus (iPGM_N). This family represents the N-terminal region of the 2,3-bisphosphoglycerate-independent phosphoglycerate mutase (or phosphoglyceromutase or BPG-independent PGAM) protein (EC:5.4.2.1). The family is found in conjunction with pfam01676 (located in the C-terminal region of the protein).	217
368893	pfam06416	T3SS_NleG	Effector protein NleG. Many bacterial pathogens deliver effector proteins into host cells via a type III secretion system. These effector proteins then alter the host cell's biology in ways that are advantageous to the pathogen. The NleG protein and its homologs form the largest family of effector proteins in the enterohemorrhagic Escherichia coli O157:H7, with 14 members identified in the Sakai strain alone.	113
399428	pfam06417	DUF1077	Protein of unknown function (DUF1077). This family consists of several hypothetical eukaryotic proteins of unknown function.	118
399429	pfam06418	CTP_synth_N	CTP synthase N-terminus. This family consists of the N-terminal region of the CTP synthase protein (EC:6.3.4.2). This family is found in conjunction with pfam00117 located in the C-terminal region of the protein. CTP synthase catalyzes the synthesis of CTP from UTP by amination of the pyrimidine ring at the 4-position.	265
399430	pfam06419	COG6	Conserved oligomeric complex COG6. COG6 is a component of the conserved oligomeric golgi complex, which is composed of eight different subunits and is required for normal golgi morphology and localization.	612
399431	pfam06420	Mgm101p	Mitochondrial genome maintenance MGM101. The mgm101 gene was identified as essential for maintenance of the mitochondrial genome in Saccharomyces cerevisiae. Based on its DNA-binding activity, and experimental work with a temperature-sensitive mgm101 mutant, it has been proposed that the mgm101 gene product performs an essential function in the repair of oxidatively damaged mitochondrial DNA.	170
399432	pfam06421	LepA_C	GTP-binding protein LepA C-terminus. This family consists of the C-terminal region of several pro- and eukaryotic GTP-binding LepA proteins.	107
399433	pfam06422	PDR_CDR	CDR ABC transporter. Corresponds to a region of the PDR/CDR subgroup of ABC transporters comprising extracellular loop 3, transmembrane segment 6 and linker region.	92
399434	pfam06423	GWT1	GWT1. Glycosylphosphatidylinositol (GPI) is a conserved post-translational modification to anchor cell surface proteins to plasma membrane in eukaryotes. GWT1 is involved in GPI anchor biosynthesis; it is required for inositol acylation in yeast.	140
399435	pfam06424	PRP1_N	PRP1 splicing factor, N-terminal. This domain is specific to the N-terminal part of the prp1 splicing factor, which is involved in mRNA splicing (and possibly also poly(A)+ RNA nuclear export and cell cycle progression).	109
399436	pfam06426	SATase_N	Serine acetyltransferase, N-terminal. The N-terminal domain of serine acetyltransferase has a sequence that is conserved in plants and bacteria.	104
399437	pfam06427	UDP-g_GGTase	UDP-glucose:Glycoprotein Glucosyltransferase. UDP-g_GGTase is an important, central component of the QC system in the ER for checking that glycoproteins are folded correctly. This QC prevents incorrectly folded glycoproteins from leaving the ER.	109
368902	pfam06428	Sec2p	GDP/GTP exchange factor Sec2p. In Saccharomyces cerevisiae, Sec2p is a GDP/GTP exchange factor for Sec4p, which is required for vesicular transport at the post-Golgi stage of yeast secretion.	92
377656	pfam06429	Flg_bbr_C	Flagellar basal body rod FlgEFG protein C-terminal. This family consists of a number of C-terminal domains of unknown function. This domain seems to be specific to flagellar basal-body rod and flagellar hook proteins in which pfam00460 is often present at the extreme N-terminus.	74
399438	pfam06430	L_lactis_RepB_C	Lactococcus lactis RepB C-terminus. This family consists of the C-terminal region of RepB proteins from Lactococcus lactis (See pfam01051).	122
283968	pfam06431	Polyoma_lg_T_C	Polyomavirus large T antigen C-terminus. 	417
399439	pfam06432	GPI2	Phosphatidylinositol N-acetylglucosaminyltransferase. Glycosylphosphatidylinositol (GPI) represents an important anchoring molecule for cell surface proteins. The first step in its synthesis is the transfer of N-acetylglucosamine (GlcNAc) from UDP-N-acetylglucosamine to phosphatidylinositol (PI). This step involves products of three or four genes in both yeast (GPI1, GPI2 and GPI3) and mammals (GPI1, PIG A, PIG H and PIG C), respectively.	267
399440	pfam06433	Me-amine-dh_H	Methylamine dehydrogenase heavy chain (MADH). Methylamine dehydrogenase (EC:1.4.99.3) is a periplasmic quinoprotein found in several methylotrophic bacteria. Induced when the bacteria are grown on methylamine as a carbon source, MADH catalyzes the oxidative deamination of amines to their corresponding aldehydes. MADH is a heterotetramer, comprised of two heavy chains (H) and two light chains (L). The H-chain forms a beta-propeller like structure.	343
399441	pfam06434	Aconitase_2_N	Aconitate hydratase 2 N-terminus. This family represents the N-terminal region of several bacterial Aconitate hydratase 2 proteins and is found in conjunction with pfam00330.	204
283972	pfam06435	DUF1079	Repeat of unknown function (DUF1079). This family consists of several repeats of 31 residues in length and seems to be exclusive to Moraxella catarrhalis UspA proteins. The UspA1 and UspA2 proteins of Moraxella catarrhalis are structurally related and are exposed on the bacterial cell surface, where they can function as adhesins. This family is commonly found with the pfam03895 family.	31
399442	pfam06436	Pneumovirus_M2	Pneumovirus matrix protein 2 (M2). This family consists of several Pneumovirus matrix glycoprotein M2 sequences. This family functions as a transcription processivity factor that is essential for virus replication.	155
399443	pfam06437	ISN1	IMP-specific 5'-nucleotidase. The Saccharomyces cerevisiae ISN1 (YOR155c) gene encodes an IMP-specific 5'-nucleotidase, which catalyzes degradation of IMP to inosine as part of the purine salvage pathway.	407
399444	pfam06438	HasA	Heme-binding protein A (HasA). Free iron is limited in vertebrate hosts, thus an alternative to siderophores has been developed by pathogenic bacteria to access host iron bound in protein complexes. HasA is a secreted hemophore that has the ability to obtain iron from hemoglobin. Once bound to HasA, the heme is shuttled to the receptor HasR, which releases the heme into the bacterium.	184
399445	pfam06439	DUF1080	Domain of Unknown Function (DUF1080). This family has structural similarity to an endo-1,3-1,4-beta glucanase belonging to glycoside hydrolase family 16. However, the structure surrounding the active site differs from that of the endo-1,3-1,4-beta glucanase.	182
399446	pfam06440	DNA_pol3_theta	DNA polymerase III, theta subunit. DNA polymerase III (EC 2.7.7.7) is comprised of three tightly associated subunits, alpha, epsilon and theta. This family contains the theta subunit. The structure of the theta subunit shows that the N-terminal two thirds is comprised of three helices while the C-terminal third is disordered. The function of the theta subunit is poorly understood, but the interaction of the theta subunit with the epsilon subunit is thought to enhance the 3' to 5' exonucleolytic proofreading activity of epsilon.	68
399447	pfam06441	EHN	Epoxide hydrolase N-terminus. This family represents the N-terminal region of the eukaryotic epoxide hydrolase protein. Epoxide hydrolases (EC:3.3.2.3) comprise a group of functionally related enzymes that catalyze the addition of water to oxirane compounds (epoxides), thereby usually generating vicinal trans-diols. EHs have been found in all types of living organisms, including mammals, invertebrates, plants, fungi and bacteria. In animals, the major interest in EH is directed towards their detoxification capacity for epoxides since they are important safeguards against the cytotoxic and genotoxic potential of oxirane derivatives that are often reactive electrophiles because of the high tension of the three-membered ring system and the strong polarization of the C--O bonds. This is of significant relevance because epoxides are frequent intermediary metabolites which arise during the biotransformation of foreign compounds. This family is often found in conjunction with pfam00561.	106
368910	pfam06442	DHFR_2	R67 dihydrofolate reductase. R67 dihydrofolate reductase is a plasmid encoded enzyme that provides resistance to the antibacterial drug trimethoprim. The R67 dihydrofolate reductase does not share significant similarity to the chromosomal encoded dihydrofolate reductase.	78
115120	pfam06443	SEF14_adhesin	SEF14-like adhesin. Family of enterotoxigenic bacterial adhesins.	165
368911	pfam06444	NADH_dehy_S2_C	NADH dehydrogenase subunit 2 C-terminus. This family consists of the C-terminal region specific to the eukaryotic NADH dehydrogenase subunit 2 protein and is found in conjunction with pfam00361.	51
399448	pfam06445	GyrI-like	GyrI-like small molecule binding domain. This family contains the small molecule binding domain of a number of different bacterial transcription activators. This family also contains DNA gyrase inhibitors. The GyrI superfamily contains a diad of the SHS2 module, adapted for small-molecule binding. The GyrI superfamily includes a family of secreted forms that is found only in animals and the bacterial pathogen Leptospira.	153
399449	pfam06446	Hepcidin	Hepcidin. Hepcidin is an antibacterial and antifungal protein expressed in the liver and is also a signaling molecule in iron metabolism. The hepcidin protein is cysteine-rich and forms a distorted beta-sheet with an unusual disulphide bond found at the turn of the hairpin.	53
399450	pfam06448	DUF1081	Domain of Unknown Function (DUF1081). This region is found in Apolipophorin proteins.	103
368914	pfam06449	DUF1082	Mitochondrial domain of unknown function (DUF1082). This family consists of the C-terminal region of several plant mitochondria specific proteins. The function of this family is unknown. This family is found in conjunction with pfam02326.	51
283984	pfam06450	NhaB	Bacterial Na+/H+ antiporter B (NhaB). This family consists of several bacterial Na+/H+ antiporter B (NhaB) proteins. The exact function of this family is unknown.	515
368915	pfam06451	Moricin	Moricin. Moricin is an antibacterial peptide that is highly basic. The structure of moricin reveals that it is comprised of a long alpha-helix. The N-terminus of the helix is amphipathic, and the C-terminus of the helix is predominately hydrophobic. The amphipathic N-terminal segment of the alpha-helix is mainly responsible for the increase in permeability of the bacterial membrane which kills the bacteria.	41
399451	pfam06452	CBM9_1	Carbohydrate family 9 binding domain-like. CBM9_1 is a C-terminal domain on bacterial xylanase proteins, and it is tandemly repeated in a number of family-members. The CBM9 module binds to amorphous and crystalline cellulose and a range of soluble di- and monosaccharides as well as to cello- and xylo- oligomers of different degrees of polymerization. Comparison of the glucose and cellobiose complexes during crystallisation reveals surprising differences in binding of these two substrates by CBM9-2. Cellobiose was found to bind in a distinct orientation from glucose, while still maintaining optimal stacking and electrostatic interactions with the reducing end sugar.	182
115129	pfam06453	LT-IIB	Type II heat-labile enterotoxin, B subunit (LT-IIB). Family of B subunits from the type II heat-labile enterotoxin. The B subunits form a pentameric ring, which interacts with one A subunit. Thus, the structural arrangement of type I and type II heat-labile enterotoxins are very similar.	122
399452	pfam06454	DUF1084	Protein of unknown function (DUF1084). This family consists of several hypothetical plant specific proteins of unknown function.	271
368917	pfam06455	NADH5_C	NADH dehydrogenase subunit 5 C-terminus. This family represents the C-terminal region of several NADH dehydrogenase subunit 5 proteins and is found in conjunction with pfam00361 and pfam00662.	181
399453	pfam06456	Arfaptin	Arfaptin-like domain. Arfaptin interacts with ARF1, a small GTPase involved in vesicle budding at the Golgi complex and immature secretory granules. The structure of arfaptin shows that upon binding to a small GTPase, arfaptin forms an elongated, crescent-shaped dimer of three-helix coiled-coils. The N-terminal region of ICA69 is similar to arfaptin.	207
115133	pfam06457	Ectatomin	Ectatomin. Ectatomin is a toxic component from the Ectatomma tuberculatum ant venom. It is comprised of two subunits, A and B, which are homologous. The structure of ectatomin reveals that each subunit is comprised of two helices and a connecting hinge region, which forms a hairpin structure that is stabilized by disulphide bridges. The two hinges are connected by a disulphide bond.	34
399454	pfam06458	MucBP	MucBP domain. The MucBP (MUCin-Binding Protein) domain is found in a wide variety of bacterial proteins, in several repeats. The domain is found in bacterial peptidoglycan bound proteins and is often found in conjunction with pfam00746 and pfam00560.	61
399455	pfam06459	RR_TM4-6	Ryanodine Receptor TM 4-6. This region covers TM regions 4-6 of the ryanodine receptor 1 family.	280
399456	pfam06460	NSP13	Coronavirus NSP13. This family covers the NSP13 region of the coronavirus polyprotein. This protein is predicted to function as an mRNA cap-1 methyltransferase.	297
368921	pfam06461	DUF1086	Domain of Unknown Function (DUF1086). This family consists of several eukaryotic domains of unknown function which are present in chromodomain helicase DNA binding proteins. This domain is often found in conjunction with pfam00176, pfam00271, pfam06465, pfam00385 and pfam00628.	138
399457	pfam06462	Hyd_WA	Propeller. Probable beta-propeller.	30
399458	pfam06463	Mob_synth_C	Molybdenum Cofactor Synthesis C. This region contains two iron-sulphur (3Fe-4S) binding sites. Mutations in this region of human MOCS1 cause MOCOD (Molybdenum Co-Factor Deficiency) type A.	127
368923	pfam06464	DMAP_binding	DMAP1-binding Domain. This domain binds DMAP1, a transcriptional co-repressor.	104
399459	pfam06465	DUF1087	Domain of Unknown Function (DUF1087). Members of this family are found in various chromatin remodelling factors and transposases. Their exact function is, as yet, unknown.	60
399460	pfam06466	PCAF_N	PCAF (P300/CBP-associated factor) N-terminal domain. This region is spliced out of human KAT2A isoform 2. It is predicted to be of a mixed alpha/beta fold - though predominantly helical.	249
399461	pfam06467	zf-FCS	MYM-type Zinc finger with FCS sequence motif. MYM-type zinc fingers were identified in MYM family proteins. Human protein ZMYM3 is involved in a chromosomal translocation and may be responsible for X-linked retardation in XQ13.1. ZMYM2 is also involved in disease. In myeloproliferative disorders it is fused to FGF receptor 1; in atypical myeloproliferative disorders it is rearranged. Members of the family generally are involved in development. This Zn-finger domain functions as a transcriptional trans-activator of late vaccinia viral genes, and orthologues are also found in all nucleocytoplasmic large DNA viruses, NCLDV. This domain is also found fused to the C termini of recombinases from certain prokaryotic transposons.	40
399462	pfam06468	Spond_N	Spondin_N. This conserved region is found in the N-terminal half of several Spondin proteins. Spondins are involved in patterning axonal growth trajectory through either inhibiting or promoting adhesion of embryonic nerve cells.	185
399463	pfam06469	DUF1088	Domain of Unknown Function (DUF1088). This family is found in the neurobeachins. The function of this region is not known.	168
399464	pfam06470	SMC_hinge	SMC proteins Flexible Hinge Domain. This family represents the hinge region of the SMC (Structural Maintenance of Chromosomes) family of proteins. The hinge region is responsible for formation of the DNA interacting dimer. It is also possible that the precise structure of it is an essential determinant of the specificity of the DNA-protein interaction.	115
399465	pfam06471	NSP11	NSP11. This region of coronavirus polyproteins encodes the NSP11 protein.	515
399466	pfam06472	ABC_membrane_2	ABC transporter transmembrane region 2. This domain covers the transmembrane of a small family of ABC transporters and shares sequence similarity with pfam00664. Mutations in this domain in ABCD3 are believed responsible for Zellweger Syndrome-2; mutations in ABCD1 are responsible for recessive X-linked adrenoleukodystrophy. A Saccharomyces cerevisiae homolog is involved in the import of long-chain fatty acids.	269
399467	pfam06473	FGF-BP1	FGF binding protein 1 (FGF-BP1). This family consists of several mammalian FGF binding protein 1. Fibroblast growth factors (FGFs) play important roles during fetal and embryonic development. Fibroblast growth factor-binding protein (FGF-BP) 1 is a secreted protein that can bind fibroblast growth factors (FGFs) 1 and 2.	226
399468	pfam06474	MLTD_N	MltD lipid attachment motif. This short motif is a lipid attachment site.	34
399469	pfam06475	Glycolipid_bind	Putative glycolipid-binding. This family has a novel fold known as a spiral beta-roll, consisting of a 15-stranded beta sheet wrapped around a single alpha helix. It forms dimers. It has some structural similarity to the E. coli lipoprotein localization factors LolA and LolB. Its structure suggests that it may have a role in glycolipid binding. Its genomic context supports a role in glycolipid metabolism.	178
399470	pfam06476	DUF1090	Protein of unknown function (DUF1090). This family consists of several bacterial proteins of unknown function and is known as YqjC in E. coli.	106
399471	pfam06477	DUF1091	Protein of unknown function (DUF1091). This is a family of uncharacterized proteins. Based on its distant similarity to pfam02221 and conserved pattern of cysteine residues it is possible that these domains are also lipid binding.	83
399472	pfam06478	Corona_RPol_N	Coronavirus RPol N-terminus. This family covers the N-terminal region of the coronavirus RNA-directed RNA Polymerase.	353
399473	pfam06479	Ribonuc_2-5A	Ribonuclease 2-5A. This domain is an endoribonuclease. Specifically it cleaves an intron from Hac1 mRNA in humans, which causes it to be much more efficiently translated.	127
377663	pfam06480	FtsH_ext	FtsH Extracellular. This domain is found in the FtsH family of proteins. FtsH is the only membrane-bound ATP-dependent protease universally conserved in prokaryotes. It only efficiently degrades proteins that have a low thermodynamic stability - e.g. it lacks robust unfoldase activity. This feature may be key and implies that this could be a criterion for degrading a protein. In Oenococcus oeni FtsH is involved in protection against environmental stress, and shows increased expression under heat or osmotic stress. These two lines of evidence suggest that it is a fundamental prokaryotic self-protection mechanism that checks if proteins are correctly folded (personal obs: Yeats C). The precise function of this N-terminal region is unclear.	103
399474	pfam06481	COX_ARM	COX Aromatic Rich Motif. COX2 (Cytochrome O ubiquinol OXidase 2) is a major component of the respiratory complex during vegetative growth. It transfers electrons from a quinol to the binuclear centre of the catalytic subunit 1. The function of this region is not known.	46
399475	pfam06482	Endostatin	Collagenase NC10 and Endostatin. NC10 stands for Non-helical region 10 and is taken from COL15A1. A mutation in this region in COL18A1 is associated with an increased risk of prostate cancer. This domain is cleaved from the precursor and forms endostatin. Endostatin is a key tumor suppressor and has been used highly successfully to treat cancer. It is a potent angiogenesis inhibitor. Endostatin also binds a zinc ion near the N-terminus; this is likely to be of structural rather than functional importance.	222
368936	pfam06483	ChiC	Chitinase C. This ~170 aa region is found at the C-terminus of pfam00704.	174
399476	pfam06484	Ten_N	Teneurin Intracellular Region. This family is found in the intracellular N-terminal region of the Teneurin family of proteins. These proteins are 'pair-rule' genes and are involved in tissue patterning, specifically probably neural patterning. The intracellular domain is cleaved in response to homophilic interaction of the extracellular domain, and translocates to the nucleus. Here it probably carries out some transcriptional regulatory activity. The length of this region and the conservation suggests that there may be two structural domains here (personal obs:C Yeats).	367
399477	pfam06485	DUF1092	Protein of unknown function (DUF1092). This family consists of several hypothetical proteins of unknown function all from photosynthetic organisms including plants and cyanobacteria.	268
399478	pfam06486	DUF1093	Protein of unknown function (DUF1093). This family consists of several hypothetical bacterial proteins of unknown function.	81
399479	pfam06487	SAP18	Sin3 associated polypeptide p18 (SAP18). This family consists of several eukaryotic Sin3 associated polypeptide p18 (SAP18) sequences. SAP18 is known to be a component of the Sin3-containing complex which is responsible for the repression of transcription via the modification of histone polypeptides. SAP18 is also present in the ASAP complex which is thought to be involved in the regulation of splicing during the execution of programmed cell death.	123
115164	pfam06488	L_lac_phage_MSP	Phage tail tube protein. This is a family of Siphoviridae phage tail tube proteins including several from Lactococcus lactis.	301
399480	pfam06489	Orthopox_A49R	Orthopoxvirus A49R protein. This family consists of several Orthopoxvirus A49R proteins. The function of this family is unknown.	150
377666	pfam06490	FleQ	Flagellar regulatory protein FleQ. This domain is found at the N-terminus of a subset of sigma54-dependent transcriptional activators that are involved in regulation of flagellar motility e.g. FleQ in Pseudomonas aeruginosa. It is clearly related to pfam00072, but lacks the conserved aspartate residue that undergoes phosphorylation in the classic two-component system response regulator (pfam00072).	108
399481	pfam06491	Disulph_isomer	Disulphide isomerase. This family of proteins has disulphide isomerase activity, EC:5.3.4.1. It has a similar fold to thioredoxin, with an alpha-beta-alpha-beta-alpha-beta-beta-alpha topology. It has a conserved CGC motif in the loop immediately downstream of the first beta strand. This motif is essential for activity.	136
368941	pfam06493	DUF1096	Protein of unknown function (DUF1096). This family represents the N-terminal region of several proteins found in C. elegans. The family is often found with pfam02363.	52
253769	pfam06495	Transformer	Fruit fly transformer protein. This family consists of transformer proteins from several Drosophila species and also from Ceratitis capitata (Mediterranean fruit fly). The transformer locus (tra) produces an RNA processing protein that alternatively splices the doublesex pre-mRNA in the sex determination hierarchy of Drosophila melanogaster.	182
399482	pfam06496	DUF1097	Protein of unknown function (DUF1097). This family consists of several bacterial putative membrane proteins.	139
284024	pfam06497	DUF1098	Protein of unknown function (DUF1098). This family consists of several hypothetical Baculovirus proteins of unknown function.	99
399483	pfam06500	DUF1100	Alpha/beta hydrolase of unknown function (DUF1100). This family consists of several hypothetical bacterial proteins of unknown function. Members of this family have an alpha/beta hydrolase fold.	410
253772	pfam06501	Herpes_U55	Human herpesvirus U55 protein. This family consists of several human herpesvirus U55 proteins. The function of this family is unknown.	432
115174	pfam06502	Equine_IAV_S2	Equine infectious anaemia virus S2 protein. This family consists of several equine infectious anaemia virus S2 proteins. The function of this family is unknown.	67
284026	pfam06503	DUF1101	Protein of unknown function (DUF1101). This family consists of several hypothetical Fijivirus proteins of unknown function.	360
399484	pfam06504	RepC	Replication protein C (RepC). This family consists of several bacterial replication protein C (RepC) sequences.	273
399485	pfam06505	XylR_N	Activator of aromatic catabolism. This domain is found at the N-terminus of a subset of sigma54-dependent transcriptional activators in several proteobacteria, including activators of phenol degradation such as XylR. It is found adjacent to pfam02830.	100
399486	pfam06506	PrpR_N	Propionate catabolism activator. This domain is found at the N-terminus of several sigma54-dependent transcriptional activators including PrpR, which activates catabolism of propionate.	165
399487	pfam06507	Auxin_resp	Auxin response factor. A conserved region of auxin-responsive transcription factors.	83
284031	pfam06508	QueC	Queuosine biosynthesis protein QueC. This family of proteins participates in the biosynthesis of 7-carboxy-7-deazaguanine. They catalyze the conversion of 7-deaza-7-carboxyguanine to preQ0.	211
368946	pfam06510	DUF1102	Protein of unknown function (DUF1102). This family consists of several hypothetical archaeal proteins of unknown function.	141
399488	pfam06511	IpaD	Invasion plasmid antigen IpaD. This family consists of several invasion plasmid antigen IpaD proteins. Entry of Shigella flexneri into epithelial cells and lysis of the phagosome involve the IpaB, IpaC, and IpaD proteins, which are secreted by type III secretion machinery.	355
399489	pfam06512	Na_trans_assoc	Sodium ion transport-associated. Members of this family contain a region found exclusively in eukaryotic sodium channels or their subunits, many of which are voltage-gated. Members very often also contain between one and four copies of pfam00520 and, less often, one copy of pfam00612.	201
115185	pfam06513	DUF1103	Repeat of unknown function (DUF1103). This family consists of several repeats of around 30 residues in length which are found specifically in mature-parasite-infected erythrocyte surface antigen proteins from Plasmodium falciparum. This family is often found in conjunction with pfam00226.	215
284035	pfam06514	PsbU	Photosystem II 12 kDa extrinsic protein (PsbU). This family consists of several photosystem II 12 kDa extrinsic protein (PsbU) proteins from cyanobacteria and algae. PsbU is an extrinsic protein of the photosystem II complex of cyanobacteria and red algae. PsbU is known to stabilize the oxygen-evolving machinery of the photosystem II complex against heat-induced inactivation. This family appears to be related to the Helix-hairpin-helix domain.	93
115187	pfam06515	BDV_P10	Borna disease virus P10 protein. This family consists of several Borna disease virus P10 (or X) proteins. Borna disease virus (BDV) is unique among the non-segmented negative-strand RNA viruses of animals and man because it transcribes and replicates its genome in the nucleus of the infected cell. It has been suggested that the p10 protein plays a role in viral RNA synthesis or ribonucleoprotein transport.	87
399490	pfam06516	NUP	Purine nucleoside permease (NUP). This family consists of several purine nucleoside permeases from both bacteria and fungi.	304
284037	pfam06517	Orthopox_A43R	Orthopoxvirus A43R protein. This family consists of several Orthopoxvirus A43R proteins. The function of this family is unknown.	195
399491	pfam06518	DUF1104	Protein of unknown function (DUF1104). This family consists of several hypothetical proteins of unknown function which appear to be found largely in Helicobacter pylori.	83
399492	pfam06519	TolA	TolA C-terminal. This family consists of several bacterial TolA proteins as well as two eukaryotic proteins of unknown function. Tol proteins are involved in the translocation of group A colicins. Colicins are bacterial protein toxins, which are active against Escherichia coli and other related species (See pfam01024). TolA is anchored to the cytoplasmic membrane by a single membrane spanning segment near the N-terminus, leaving most of the protein exposed to the periplasm.	94
399493	pfam06521	PAR1	PAR1 protein. This family consists of several plant specific PAR1 proteins from Nicotiana tabacum and Arabidopsis thaliana. The function of this family is unknown.	156
399494	pfam06522	B12D	NADH-ubiquinone reductase complex 1 MLRQ subunit. The MLRQ subunit of mitochondrial NADH-ubiquinone reductase complex I is nuclear and is found in plants, insects, fungi and higher metazoans. It appears to act within the membrane and, in mammals, is highly expressed in muscle and neural tissue, indicative of a role in ATP generation.	69
115195	pfam06523	DUF1106	Protein of unknown function (DUF1106). This family consists of several hypothetical bacterial proteins found in Escherichia coli and Citrobacter rodentium. The function of this family is unknown.	91
368953	pfam06524	NOA36	NOA36 protein. This family consists of several NOA36 proteins which contain 29 highly conserved cysteine residues. The function of this protein is unknown.	306
310845	pfam06525	SoxE	Sulfocyanin (SoxE) domain. This family consists of several archaeal sulfocyanin (or blue copper protein) sequences from a number of Sulfolobus species.	149
399495	pfam06526	DUF1107	Protein of unknown function (DUF1107). This family consists of several short, hypothetical bacterial proteins of unknown function.	63
399496	pfam06527	TniQ	TniQ. This family consists of several bacterial TniQ proteins. TniQ along with TniA and B is involved in the transposition of the mercury-resistance transposon Tn5053 which carries the mer operon. It has been suggested that the tni genes are involved in the dissemination of integrons.	142
399497	pfam06528	Phage_P2_GpE	Phage P2 GpE. This family consists of several phage and bacterial proteins which are closely related to the GpE tail protein from Phage P2.	37
368956	pfam06529	Vert_IL3-reg_TF	Vertebrate interleukin-3 regulated transcription factor. This family includes vertebrate transcription factors, some of which are regulated by IL-3/adenovirus E4 promoter binding protein. Others were found to strongly repress transcription in a DNA-binding-site-dependent manner.	332
399498	pfam06530	Phage_antitermQ	Phage antitermination protein Q. This family consists of several phage antitermination protein Q and related bacterial sequences. Antiterminator proteins control gene expression by recognising control signals near the promoter and preventing transcriptional termination which would otherwise occur at sites that may be a long way downstream.	118
368958	pfam06531	DUF1108	Protein of unknown function (DUF1108). This family consists of several bacterial proteins from Staphylococcus aureus as well as a number of phage proteins. The function of this family is unknown.	84
399499	pfam06532	DUF1109	Protein of unknown function (DUF1109). This family consists of several hypothetical bacterial proteins of unknown function.	204
399500	pfam06533	DUF1110	Protein of unknown function (DUF1110). This family consists of hypothetical proteins specific to Oryza sativa. One sequence appears to be tandemly repeated.	189
399501	pfam06534	RGM_C	Repulsive guidance molecule (RGM) C-terminus. This family consists of several mammalian and one bird sequence from Gallus gallus (Chicken). This family represents the C-terminal region of several sequences but in others it represents the full protein. All of the mammalian proteins are hypothetical and have no known function but RGMA from chicken is annotated as being a repulsive guidance molecule (RGM). RGM is a GPI-linked axon guidance molecule of the retinotectal system. RGM is repulsive for a subset of axons, those from the temporal half of the retina. Temporal retinal axons invade the anterior optic tectum in a superficial layer, and encounter RGM expressed in a gradient with increasing concentration along the anterior-posterior axis. Temporal axons are able to receive posterior-dependent information by sensing gradients or concentrations of guidance cues. Thus, RGM is likely to provide positional information for temporal axons invading the optic tectum in the stratum opticum.	171
399502	pfam06535	RGM_N	Repulsive guidance molecule (RGM) N-terminus. This family consists of the N-terminal region of several mammalian and one bird sequence from Gallus gallus (Chicken). All of the mammalian proteins are hypothetical and have no known function but RGMA from chicken is annotated as being a repulsive guidance molecule (RGM). RGM is a GPI-linked axon guidance molecule of the retinotectal system. RGM is repulsive for a subset of axons, those from the temporal half of the retina. Temporal retinal axons invade the anterior optic tectum in a superficial layer, and encounter RGM expressed in a gradient with increasing concentration along the anterior-posterior axis. Temporal axons are able to receive posterior-dependent information by sensing gradients or concentrations of guidance cues. Thus, RGM is likely to provide positional information for temporal axons invading the optic tectum in the stratum opticum.	166
368962	pfam06536	Av_adeno_fibre	Avian adenovirus fibre, N-terminal. This family is the N-terminal region of avian adenovirus fibre proteins; the domain is frequently found repeated several times along the fibre. These fibers have been linked to variations in virulence. Avian adenoviruses possess penton capsomers that consist of a pentameric base associated with two fibers.	56
399503	pfam06537	DHOR	Di-haem oxidoreductase, putative peroxidase. DHOR is a family of di-haem oxidoreductases. It carries the two characteristic Cys-X-Y-Cys-His haem-binding motifs. The C-terminal high-potential site functions as an electron transfer centre, and the N-terminal low-potential site corresponds to the peroxidatic centre. Its probable function is as a peroxidase.	486
399504	pfam06540	GMAP	Galanin message associated peptide (GMAP). This family consists of several galanin message associated peptides. In rat preprogalanin, galanin is C-terminally flanked by a 60 amino acid long peptide: galanin message-associated peptide (GMAP). GMAP sequences in different species show high degree of homology, but the biological function of this family is unknown.	58
399505	pfam06541	ABC_trans_CmpB	Putative ABC-transporter type IV. CmpB is a family of membrane proteins that are likely to be part of a two-component type IV ABC-transporter system. Families can transport multiple drugs including ethidium and fluoroquinolones. UniProtKB:Q83XH0 is a member of TCDB family 3.A.1.121.4.	149
368965	pfam06542	PHA-1	Regulator protein PHA-1. This family represents the protein product of the gene pha-1 which coordinates with lin-35 Rb during animal development. The protein is expressed during embryonic development and functions in the cytoplasm. PHA-1 acts in a parallel pathway with UBC-18 to regulate the activity of a common cellular target.	403
284059	pfam06543	Lac_bphage_repr	Lactococcus bacteriophage repressor. This family represents the C-terminus of Lactococcus bacteriophage repressor proteins.	49
399506	pfam06544	DUF1115	Protein of unknown function (DUF1115). This family represents the C-terminus of hypothetical eukaryotic proteins of unknown function.	133
399507	pfam06545	DUF1116	Protein of unknown function (DUF1116). This family contains hypothetical bacterial proteins of unknown function.	215
399508	pfam06546	Vert_HS_TF	Vertebrate heat shock transcription factor. This family represents the C-terminal region of vertebrate heat shock transcription factors. Heat shock transcription factors regulate the expression of heat shock proteins - a set of proteins that protect the cell from damage caused by stress and aid the cell's recovery after the removal of stress. This C-terminal region is found with the N-terminal pfam00447, and may contain a three-stranded coiled-coil trimerisation domain and a CE2 regulatory region, the latter of which is involved in sustained heat shock response.	269
399509	pfam06547	DUF1117	Protein of unknown function (DUF1117). This family represents the C-terminus of a number of hypothetical plant proteins.	110
368969	pfam06549	DUF1118	Protein of unknown function (DUF1118). This family consists of several hypothetical plant proteins of unknown function.	115
284066	pfam06550	SPP	Signal-peptide peptidase, presenilin aspartyl protease. SPP is a family of signal-peptide aspartyl proteases. The family carries the characteristic catalytic aspartate GXGD motif, and members are integral membrane peptidases of the presenilin-type with nine transmembrane regions. UniProtKB:Q18K19 is part of the TCDB family 1.A.54.3.4, the presenilin ER Ca(2+) leak channel (presenilin).	283
368970	pfam06551	DUF1120	Protein of unknown function (DUF1120). This family consists of several hypothetical bacterial proteins of unknown function.	116
284068	pfam06552	TOM20_plant	Plant specific mitochondrial import receptor subunit TOM20. This family consists of several plant specific mitochondrial import receptor subunit TOM20 (translocase of outer membrane 20 kDa subunit) proteins. Most mitochondrial proteins are encoded by the nuclear genome, and are synthesized in the cytosol. TOM20 is a general import receptor that binds to mitochondrial pre-sequences in the early step of protein import into the mitochondria.	187
399510	pfam06553	BNIP3	BNIP3. This family consists of several mammalian specific BCL2/adenovirus E1B 19-kDa protein-interacting protein 3 or BNIP3 sequences. BNIP3 belongs to the Bcl-2 homology 3 (BH3)-only family, a Bcl-2-related family possessing an atypical Bcl-2 homology 3 (BH3) domain, which regulates PCD from mitochondrial sites by selective Bcl-2/Bcl-XL interactions. BNIP3 family members contain a C-terminal transmembrane domain that is required for their mitochondrial localization, homodimerization, as well as regulation of their pro-apoptotic activities. BNIP3-mediated apoptosis has been reported to be independent of caspase activation and cytochrome c release and is characterized by early plasma membrane and mitochondrial damage, prior to the appearance of chromatin condensation or DNA fragmentation.	184
399511	pfam06554	Olfactory_mark	Olfactory marker protein. This family consists of several olfactory marker proteins. Expression of the olfactory marker protein (OMP) is highly restricted to mature olfactory receptor neurons in virtually all vertebrate species from fish to man.	149
284071	pfam06556	ASFV_p27	IAP-like protein p27 C-terminus. This family represents the C-terminal region of the African swine fever virus IAP-like protein p27. This family is found in conjunction with pfam00653. It has been suggested that the family may be a host range gene involved in aspects of infection in the arthropod host, ticks of the genus Ornithodoros.	131
399512	pfam06557	DUF1122	Protein of unknown function (DUF1122). This family consists of several hypothetical archaeal and bacterial proteins of unknown function.	162
399513	pfam06558	SecM	Secretion monitor precursor protein (SecM). This family consists of several bacterial Secretion monitor precursor (SecM) proteins. SecM is known to regulate SecA expression. The eubacterial protein secretion machinery consists of a number of soluble and membrane associated components. One critical element is SecA ATPase, which acts as a molecular motor to promote protein secretion at translocation sites that consist of SecYE, the SecA receptor, and SecG and SecDFyajC proteins, which regulate SecA membrane cycling.	146
399514	pfam06559	DCD	2'-deoxycytidine 5'-triphosphate deaminase (DCD). This family consists of several bacterial 2'-deoxycytidine 5'-triphosphate deaminase proteins (EC:3.5.4.13).	360
284075	pfam06560	GPI	Glucose-6-phosphate isomerase (GPI). This family consists of several bacterial and archaeal glucose-6-phosphate isomerase (GPI) proteins (EC:5.3.1.9).	177
284076	pfam06563	DUF1125	Protein of unknown function (DUF1125). This family consists of several short Lactococcus lactis and bacteriophage proteins. The function of this family is unknown.	55
399515	pfam06564	CBP_BcsQ	Cellulose biosynthesis protein BcsQ. This is a family of bacterial proteins involved in cellulose biosynthesis. (Roemling U. and Galperin M.Y. "Bacterial cellulose biosynthesis. Diversity of operons and subunits" (manuscript in preparation)). A second component of the extracellular matrix of the multicellular morphotype (rdar) of Salmonella typhimurium and Escherichia coli is cellulose. The family does contain a P-loop sequence motif suggesting a nucleotide binding function, but this has not been confirmed.	234
399516	pfam06565	DUF1126	DUF1126 PH-like domain. The structure of this domain shows that it has a PH-like fold.	105
368978	pfam06566	Chon_Sulph_att	Chondroitin sulphate attachment domain. This family represents the chondroitin sulphate attachment domain of vertebrate neural transmembrane proteoglycans that contain EGF modules. Evidence has been accumulated to support the idea that neural proteoglycans are involved in various cellular events including mitogenesis, differentiation, axonal outgrowth and synaptogenesis. This domain contains several potential sites of chondroitin sulphate attachment, as well as potential sites of N-linked glycosylation.	249
368979	pfam06567	Neural_ProG_Cyt	Neural chondroitin sulphate proteoglycan cytoplasmic domain. This family represents the C-terminal cytoplasmic domain of vertebrate neural chondroitin sulphate proteoglycans that contain EGF modules. Evidence has been accumulated to support the idea that neural proteoglycans are involved in various cellular events including mitogenesis, differentiation, axonal outgrowth and synaptogenesis. This domain contains a number of potential sites of phosphorylation by protein kinase C.	120
399517	pfam06568	DUF1127	Domain of unknown function (DUF1127). This family is found in several hypothetical bacterial proteins. In some cases it represents the C-terminal region whereas in others it represents the whole sequence.	33
399518	pfam06569	DUF1128	Protein of unknown function (DUF1128). This family consists of several short, hypothetical bacterial proteins of unknown function.	71
399519	pfam06570	DUF1129	Protein of unknown function (DUF1129). This family consists of several hypothetical bacterial proteins of unknown function.	200
399520	pfam06572	DUF1131	Protein of unknown function (DUF1131). This family consists of several hypothetical bacterial proteins of unknown function.	169
399521	pfam06573	Churchill	Churchill protein. This family consists of several eukaryotic Churchill proteins. This protein contains a novel zinc binding region that mediates FGF signaling during neural development (unpublished obs Sheng G and Stern C).	111
399522	pfam06574	FAD_syn	FAD synthetase. This family corresponds to the N terminal domain of the bifunctional enzyme riboflavin kinase / FAD synthetase. These enzymes have both ATP:riboflavin 5'-phospho transferase and ATP:FMN-adenylyltransferase activity. They catalyze the 5'-phosphorylation of riboflavin to FMN and the adenylylation of FMN to FAD. This domain is thought to have the flavin mononucleotide (FMN) adenylyltransferase activity.	158
284087	pfam06575	DUF1132	Protein of unknown function (DUF1132). This family consists of several hypothetical proteins from Neisseria meningitidis. The function of this family is unknown.	101
148278	pfam06576	DUF1133	Protein of unknown function (DUF1133). This family consists of a number of hypothetical proteins from Escherichia coli O157:H7 and Salmonella typhi. The function of this family is unknown.	176
399523	pfam06577	DUF1134	Protein of unknown function (DUF1134). This family consists of several hypothetical bacterial proteins of unknown function.	159
284089	pfam06578	YscK	YOP proteins translocation protein K (YscK). This family consists of several YscK proteins. The function of this protein is unknown but it belongs to an operon involved in the secretion of Yop proteins across bacterial membranes.	209
399524	pfam06579	Ly-6_related	Caenorhabditis elegans ly-6-related protein. This family consists of several Caenorhabditis elegans specific ly-6-related HOT and ODR proteins. These proteins are involved in the olfactory system. Odr-2 mutants are known to be defective in the ability to chemotax to odorants that are recognized by the two AWC olfactory neurons. Odr-2 encodes a membrane-associated protein related to the Ly-6 superfamily of GPI-linked signaling proteins.	125
399525	pfam06580	His_kinase	Histidine kinase. This family represents a region within bacterial histidine kinase enzymes. Two-component signal transduction systems such as those mediated by histidine kinase are integral parts of bacterial cellular regulatory processes, and are used to regulate the expression of genes involved in virulence. Members of this family often contain pfam02518 and/or pfam00672.	79
368986	pfam06581	p31comet	Mad1 and Cdc20-bound-Mad2 binding. This family is involved in the cell-cycle surveillance mechanism called the spindle checkpoint. This mechanism monitors the proper bipolar attachment of sister chromatids to spindle microtubules and ensures the fidelity of chromosome segregation during mitosis. A key player in mitosis is Mad2, and Mad2 exhibits an unusual two-state behaviour. A Mad1-Mad2 core complex recruits cytosolic Mad2 to kinetochores through Mad2 dimerization and converts Mad2 to a conformer amenable to Cdc20 binding. p31comet inactivates the checkpoint by binding to Mad1- or Cdc20-bound Mad2 in such a way as to stop Mad2 activation and to promote the dissociation of the Mad2-Cdc20 complex.	265
399526	pfam06582	DUF1136	Repeat of unknown function (DUF1136). This family consists of several eukaryote specific repeats of unknown function. This repeat seems to always be found with pfam00047.	27
399527	pfam06583	Neogenin_C	Neogenin C-terminus. This family represents the C-terminus of eukaryotic neogenin precursor proteins, which contains several potential phosphorylation sites. Neogenin is a member of the N-CAM family of cell adhesion molecules (and therefore contains multiple copies of pfam00047 and pfam00041) and is closely related to the DCC tumor suppressor gene product - these proteins may play an integral role in regulating differentiation programmes and/or cell migration events within many adult and embryonic tissues.	289
399528	pfam06584	DIRP	DIRP. DIRP (Domain in Rb-related Pathway) is postulated to be involved in the Rb-related pathway, which is encoded by multiple eukaryotic genomes and is present in proteins including lin-9 of Caenorhabditis elegans, aly of fruit fly and mustard weed. Studies of lin-9 and aly of fruit fly proteins containing DIRP suggest that this domain might be involved in development. Aly, lin-9, act in parallel to, or downstream of, activation of MAPK by the RTK-Ras signalling pathway.	107
399529	pfam06585	JHBP	Haemolymph juvenile hormone binding protein (JHBP). This family consists of several insect-specific haemolymph juvenile hormone binding proteins (JHBP). Juvenile hormone regulates embryogenesis, maintains the status quo of larval development and stimulates reproductive maturation in the adult insect. JH is transported from the sites of its synthesis to target tissues by a haemolymph carrier called juvenile hormone-binding protein (JHBP). JHBP protects the JH molecules from hydrolysis by non-specific esterases present in the insect haemolymph. The crystal structure of the JHBP from Galleria mellonella shows an unusual fold consisting of a long alpha-helix wrapped in a much curved antiparallel beta-sheet. The folding pattern for this structure closely resembles that found in some tandem-repeat mammalian lipid-binding and bactericidal permeability-increasing proteins, with a similar organisation of the major cavity and a disulfide bond linking the long helix and the beta-sheet. It would appear that JHBP forms two cavities, only one of which, the one near the N- and C-termini, binds the hormone; binding induces a conformational change, of unknown significance. This family now includes DUF233, pfam03027.	239
399530	pfam06586	TraK	TraK protein. This family consists of several TraK proteins from Escherichia coli, Salmonella typhi and Salmonella typhimurium. TraK is known to be essential for pilus assembly but its exact role in this process is unknown.	228
368992	pfam06587	DUF1137	Protein of unknown function (DUF1137). This family consists of several hypothetical proteins specific to Chlamydia species. The function of this family is unknown.	117
284099	pfam06588	Muskelin_N	Muskelin N-terminus. This family represents the N-terminal region of muskelin and is found in conjunction with several pfam01344 repeats. Muskelin is an intracellular, kelch repeat protein that is needed in cell-spreading responses to the matrix adhesion molecule, thrombospondin-1.	197
399531	pfam06589	CRA	Circumsporozoite-related antigen (CRA). This family consists of several circumsporozoite-related antigen (CRA) or exported protein-1 (EXP1) sequences found specifically in Plasmodium species. The function of this family is unknown.	123
115260	pfam06590	PerB	PerB protein. This family consists of several PerB or BfpV proteins found specifically in Escherichia coli. PerB is thought to play a role in regulating the expression of BfpA.	129
148289	pfam06591	Phage_T4_Ndd	T4-like phage nuclear disruption protein (Ndd). This family consists of several nuclear disruption (Ndd) proteins from T4-like phages. Early in a bacteriophage T4 infection, the phage ndd gene causes the rapid destruction of the structure of the Escherichia coli nucleoid. The targets of Ndd action may be the chromosomal sequences that determine the structure of the nucleoid.	154
399532	pfam06592	DUF1138	Protein of unknown function (DUF1138). This family consists of several hypothetical short plant proteins from Arabidopsis thaliana and Oryza sativa. The function of this family is unknown.	73
284102	pfam06593	RBDV_coat	Raspberry bushy dwarf virus coat protein. This family consists of several Raspberry bushy dwarf virus coat proteins.	274
399533	pfam06594	HCBP_related	Haemolysin-type calcium binding protein related domain. This family consists of a number of bacteria specific domains which are found in haemolysin-type calcium binding proteins. This family is found in conjunction with pfam00353 and is often found in multiple copies.	42
284104	pfam06595	BDV_P24	Borna disease virus P24 protein. This family consists of several Borna disease virus (BDV) P24 proteins. The function of this family is unknown.	201
399534	pfam06596	PsbX	Photosystem II reaction centre X protein (PsbX). This family consists of several photosystem II reaction centre X protein (PsbX) sequences from both prokaryotes and eukaryotes.	37
368996	pfam06597	Clostridium_P47	Clostridium P-47 protein. This family consists of several P-47 proteins from various Clostridium species as well as two related sequences from Pseudomonas putida. The function of this family is unknown.	469
253815	pfam06598	Chlorovi_GP_rpt	Chlorovirus glycoprotein repeat. This family consists of a number of repeats found in Chlorovirus glycoproteins. The function of this family is unknown.	34
115269	pfam06599	DUF1139	Protein of unknown function (DUF1139). This family consists of several hypothetical Fijivirus proteins of unknown function.	309
284107	pfam06600	DUF1140	Protein of unknown function (DUF1140). This family consists of several short, hypothetical phage and bacterial proteins. The function of this family is unknown.	99
284108	pfam06601	Orthopox_F6	Orthopoxvirus F6 protein. This family consists of several Orthopoxvirus F6L proteins, the function of which is unknown.	72
399535	pfam06602	Myotub-related	Myotubularin-like phosphatase domain. This family represents the phosphatase domain within eukaryotic myotubularin-related proteins. Myotubularin is a dual-specific lipid phosphatase that dephosphorylates phosphatidylinositol 3-phosphate and phosphatidylinositol (3,5)-bisphosphate. Mutations in genes encoding myotubularin-related proteins have been associated with disease.	330
399536	pfam06603	UpxZ	UpxZ family of transcription anti-terminator antagonists. The UpxZ family of proteins acts to inhibit transcription of heterologous capsular polysaccharide loci in Bacteroides species by interfering with the action of the UpxY family of transcription anti-terminators. As antagonists of polysaccharide locus-specific UpxY transcription anti-terminators, the UpxZ proteins exert a hierarchical level of regulation, ensuring that only one of the multiple phase-variable capsular polysaccharide loci per cell characteristic of this genus is transcribed at a time.	104
399537	pfam06605	Prophage_tail	Prophage endopeptidase tail. This family is of prophage tail proteins that are probably acting as endopeptidases.	251
148298	pfam06607	Prokineticin	Prokineticin. This family consists of several prokineticin proteins and related BM8 sequences. The suprachiasmatic nucleus (SCN) controls the circadian rhythm of physiological and behavioural processes in mammals. It has been shown that prokineticin 2 (PK2), a cysteine-rich secreted protein, functions as an output molecule from the SCN circadian clock. PK2 messenger RNA is rhythmically expressed in the SCN, and the phase of PK2 rhythm is responsive to light entrainment. Molecular and genetic studies have revealed that PK2 is a gene that is controlled by a circadian clock.	97
284112	pfam06608	DUF1143	Protein of unknown function (DUF1143). This family consists of several hypothetical mammalian proteins (from mouse and human). The function of this family is unknown.	148
115279	pfam06609	TRI12	Fungal trichothecene efflux pump (TRI12). This family consists of several fungal specific trichothecene efflux pump proteins. Many of the genes involved in trichothecene toxin biosynthesis in Fusarium sporotrichioides are present within a gene cluster. It has been suggested that TRI12 may play a role in F. sporotrichioides self-protection against trichothecenes.	598
399538	pfam06610	AlaE	L-alanine exporter. AlaE is a family of Gram-negative amino-acid transporters. It is not entirely clear why bacteria export metabolites but recent studies have shown that many excrete alanine. AlaE is likely to be the exporter protein for L-alanine. UniProtKB:A8ANM6, UniProt:G4R961 and UniProt:H5SVY7 are classified as putative alanine exporters.	141
399539	pfam06611	DUF1145	Protein of unknown function (DUF1145). This family consists of several hypothetical bacterial proteins of unknown function.	56
399540	pfam06612	DUF1146	Protein of unknown function (DUF1146). This family consists of several hypothetical bacterial proteins of unknown function.	48
399541	pfam06613	KorB_C	KorB C-terminal beta-barrel domain. This family consists of several KorB transcriptional repressor proteins. The korB gene is a major regulatory element in the replication and maintenance of broad host-range plasmid RK2. It negatively controls the replication gene trfA, the host-lethal determinants kilA and kilB, and the korA-korB operon. This beta-barrel domain is found at the C-terminus of KorB.	58
369001	pfam06614	Neuromodulin	Neuromodulin. This family consists of several neuromodulin (Axonal membrane protein GAP-43) sequences and is found in conjunction with pfam00612. GAP-43 is a neuronal calmodulin-binding phosphoprotein that is concentrated in growth cones and pre-synaptic terminals.	175
115285	pfam06615	DUF1147	Protein of unknown function (DUF1147). This family consists of several short Circovirus proteins of unknown function.	59
377679	pfam06616	BsuBI_PstI_RE	BsuBI/PstI restriction endonuclease C-terminus. This family represents the C-terminus of bacterial enzymes similar to type II restriction endonucleases BsuBI and PstI (EC:3.1.21.4). The enzymes of the BsuBI restriction/modification (R/M) system recognize the target sequence 5'CTGCAG and are functionally identical with those of the PstI R/M system.	153
399542	pfam06617	M-inducer_phosp	M-phase inducer phosphatase. This family represents a region within eukaryotic M-phase inducer phosphatases (EC:3.1.3.48), which also contain the pfam00581 domain. These proteins are involved in the control of mitosis.	228
115288	pfam06618	DUF1148	Protein of unknown function (DUF1148). This family consists of several Maize streak virus proteins of unknown function.	114
399543	pfam06619	DUF1149	Protein of unknown function (DUF1149). This family consists of several hypothetical bacterial proteins of unknown function.	122
399544	pfam06620	DUF1150	Protein of unknown function (DUF1150). This family consists of several hypothetical bacterial proteins of unknown function.	76
399545	pfam06621	SIM_C	Single-minded protein C-terminus. This family represents the C-terminal region of the eukaryotic single-minded (SIM) protein. Drosophila single-minded acts as a positive master gene regulator in central nervous system midline formation. There are two homologs in mammals: SIM1 and SIM2, which are members of the basic-helix-loop-helix PAS family of transcription factors. SIM1 and SIM2 are novel heterodimerization partners for ARNT in vitro, and they may function both as positive and negative transcriptional regulators in vivo, during embryogenesis and in the adult organism. SIM2 is thought to contribute to some specific Down syndrome phenotypes. This family is found in conjunction with a pfam00989 domain and associated pfam00785 motif.	293
115292	pfam06622	SepQ	SepQ protein. This family consists of several enterobacterial SepQ proteins from Escherichia coli and Citrobacter rodentium. The function of this family is unclear.	305
399546	pfam06623	MHC_I_C	MHC_I C-terminus. This family represents the C-terminal region of the MHC class I antigen. The family is found in conjunction with pfam00129 and pfam00047.	26
399547	pfam06624	RAMP4	Ribosome associated membrane protein RAMP4. This family consists of several ribosome associated membrane protein RAMP4 (or SERP1) sequences. Stabilisation of membrane proteins in response to stress involves the concerted action of a rescue unit in the ER membrane comprised of SERP1/RAMP4, other components of the translocon, and molecular chaperones in the ER.	57
399548	pfam06625	DUF1151	Protein of unknown function (DUF1151). This family consists of several hypothetical eukaryotic proteins of unknown function.	114
399549	pfam06626	DUF1152	Protein of unknown function (DUF1152). This family consists of several hypothetical archaeal proteins of unknown function.	294
399550	pfam06627	DUF1153	Protein of unknown function (DUF1153). This family consists of several short, hypothetical bacterial proteins of unknown function.	87
399551	pfam06628	Catalase-rel	Catalase-related immune-responsive. This family represents a small conserved region within catalase enzymes (EC:1.11.1.6). All members also contain the Catalase family, pfam00199 domain. Catalase decomposes hydrogen peroxide into water and oxygen, serving to protect cells from its toxic effects. This domain carries the immune-responsive amphipathic octa-peptide that is recognized by T cells.	61
399552	pfam06629	MipA	MltA-interacting protein MipA. This family consists of several bacterial MltA-interacting protein (MipA) like sequences. As well as interacting with the membrane-bound lytic transglycosylase MltA, MipA is known to bind to PBP1B, a bifunctional murein transglycosylase/transpeptidase. MipA is considered to be a structural protein mediating the assembly of MltA to PBP1B into a complex.	221
284130	pfam06630	Exonuc_VIII	Enterobacterial exodeoxyribonuclease VIII. This family consists of several Enterobacterial exodeoxyribonuclease VIII proteins.	203
399553	pfam06631	DUF1154	Protein of unknown function (DUF1154). This family represents a small conserved region of unknown function within eukaryotic phospholipase C (EC:3.1.4.3). All members also contain pfam00387 and pfam00388.	44
369011	pfam06632	XRCC4	DNA double-strand break repair and V(D)J recombination protein XRCC4. This family consists of several eukaryotic DNA double-strand break repair and V(D)J recombination protein XRCC4 sequences. In the non-homologous end joining pathway of DNA double-strand break repair, the ligation step is catalyzed by a complex of XRCC4 and DNA ligase IV. It is thought that XRCC4 and ligase IV are essential for alignment-based gap filling, as well as for final ligation of the breaks.	336
115303	pfam06633	DUF1155	Protein of unknown function (DUF1155). This family consists of several Cucumber mosaic virus ORF IIB proteins. The function of this family is unknown.	42
369012	pfam06634	DUF1156	Protein of unknown function (DUF1156). This family represents a conserved region within hypothetical prokaryotic and archaeal proteins of unknown function. Structural modelling suggests this domain may bind nucleic acids.	71
399554	pfam06635	NolV	Nodulation protein NolV. This family consists of several nodulation protein NolV sequences from different Rhizobium species. The function of this family is unclear.	206
253836	pfam06636	DUF1157	Protein of unknown function (DUF1157). This family consists of several uncharacterized proteins from Melanoplus sanguinipes entomopoxvirus (MsEPV). The function of this family is unknown.	370
399555	pfam06637	PV-1	PV-1 protein (PLVAP). This family consists of several PV-1 (PLVAP) proteins which seem to be specific to mammals. PV-1 is a novel protein component of the endothelial fenestral and stomatal diaphragms. The function of this family is unknown.	440
399556	pfam06638	Strabismus	Strabismus protein. This family consists of several strabismus (STB) or Van Gogh-like (VANGL) proteins 1 and 2. The exact function of this family is unknown. It is thought, however, that the STB1 and STB2 genes may be potent tumor suppressor gene candidates.	503
369015	pfam06639	BAP	Basal layer antifungal peptide (BAP). This family consists of several basal layer antifungal peptide (BAP) sequences specific to Zea mays. The BAP2 peptide exhibits potent broad-range activity against a range of filamentous fungi, including several plant pathogens.	76
336460	pfam06640	P_C	P protein C-terminus. This family represents the C-terminus of plant P proteins. The maize P gene is a transcriptional regulator of genes encoding enzymes for flavonoid biosynthesis in the pathway leading to the production of a red phlobaphene pigment, and P proteins are homologous to the DNA-binding domain of myb-like transcription factors. All members of this family contain the pfam00249 domain.	206
399557	pfam06643	DUF1158	Protein of unknown function (DUF1158). This family consists of several enterobacterial YbdJ proteins. The function of this family is unknown.	78
399558	pfam06644	ATP11	ATP11 protein. This family consists of several eukaryotic ATP11 proteins. In Saccharomyces cerevisiae, expression of functional F1-ATPase requires two proteins encoded by the ATP11 and ATP12 genes. Atp11p is a molecular chaperone of the mitochondrial matrix that participates in the biogenesis pathway to form F1, the catalytic unit of the ATP synthase.	267
399559	pfam06645	SPC12	Microsomal signal peptidase 12 kDa subunit (SPC12). This family consists of several microsomal signal peptidase 12 kDa subunit proteins. Translocation of polypeptide chains across the endoplasmic reticulum (ER) membrane is triggered by signal sequences. Subsequently, signal recognition particle interacts with its membrane receptor and the ribosome-bound nascent chain is targeted to the ER where it is transferred into a protein-conducting channel. At some point, a second signal sequence recognition event takes place in the membrane and translocation of the nascent chain through the membrane occurs. The signal sequence of most secretory and membrane proteins is cleaved off at this stage. Cleavage occurs by the signal peptidase complex (SPC) as soon as the lumenal domain of the translocating polypeptide is large enough to expose its cleavage site to the enzyme. The signal peptidase complex is possibly also involved in proteolytic events in the ER membrane other than the processing of the signal sequence, for example the further digestion of the cleaved signal peptide or the degradation of membrane proteins. Mammalian signal peptidase is a complex of five different polypeptide chains. This family represents the 12 kDa subunit (SPC12).	71
284141	pfam06646	Mycoplasma_p37	High affinity transport system protein p37. This family consists of several high affinity transport system protein p37 sequences which are specific to Mycoplasma species. The p37 gene is part of an operon encoding two additional proteins which are highly similar to components of the periplasmic binding-protein-dependent transport systems of Gram-negative bacteria. It has been suggested that p37 is part of a homologous, high-affinity transport system in M. hyorhinis, a Gram-positive bacterium.	330
148323	pfam06648	DUF1160	Protein of unknown function (DUF1160). This family consists of several hypothetical Baculovirus proteins of unknown function.	122
399560	pfam06649	DUF1161	Protein of unknown function (DUF1161). This family consists of several short, hypothetical bacterial proteins of unknown function.	52
399561	pfam06650	SHR-BD	SHR-binding domain of vacuolar-sorting associated protein 13. SHR-BD is a family of eukaryotic proteins found on vacuolar-sorting associated proteins towards the C-terminus. In plants, the domain is found to be the region which interacts with SHR or the SHORT-ROOT transcription factor, a regulator of root-growth and asymmetric cell division that separates ground tissue into endodermis and cortex. The plant protein containing the SHR-BD is named SHRUBBY or SHBY, UniProtKB:Q9FT44.	272
284144	pfam06651	DUF1163	Protein of unknown function (DUF1163). This family represents the C-terminus of hypothetical Arabidopsis thaliana proteins of unknown function.	67
399562	pfam06652	Methuselah_N	Methuselah N-terminus. This family represents the N-terminal region of the Drosophila specific Methuselah protein. Drosophila Methuselah (Mth) mutants have a 35% increase in average lifespan and increased resistance to several forms of stress, including heat, starvation, and oxidative damage. The protein affected by this mutation is related to G protein-coupled receptors of the secretin receptor family. Mth, like secretin receptor family members, has a large N-terminal ectodomain, which may constitute the ligand binding site. This family is found in conjunction with pfam00002.	179
399563	pfam06653	Claudin_3	Tight junction protein, Claudin-like. This is a family of probable membrane tight junction, Claudin-like, proteins.	164
284147	pfam06656	Tenui_PVC2	Tenuivirus PVC2 protein. This family consists of several Tenuivirus PVC2 proteins from Rice grassy stunt virus, Maize stripe virus and Rice hoja blanca virus. The function of this family is unknown.	784
399564	pfam06657	Cep57_MT_bd	Centrosome microtubule-binding domain of Cep57. This C-terminal region of Cep57 binds, nucleates and bundles microtubules. The N-terminal part, family Cep57_CLD, pfam14073, is the centrosome localization domain of Cep57.	77
399565	pfam06658	DUF1168	Protein of unknown function (DUF1168). This family consists of several hypothetical eukaryotic proteins of unknown function.	136
284150	pfam06661	VirE3	VirE3. This family represents a conserved region within Agrobacterium tumefaciens VirE3. Agrobacterium tumefaciens (a plant pathogen) has a tumor-inducing (Ti) plasmid of which part, the transfer (T)-region, is transferred to plant cells during the infection process. Vir proteins mediate the processing of the T-region and the transfer of a single-stranded (ss) DNA copy of this region, the T-strand, into the recipient cells. VirE3 is a translocated effector protein, but its specific role has not been established.	316
399566	pfam06662	C5-epim_C	D-glucuronyl C5-epimerase C-terminus. This family represents the C-terminus of D-glucuronyl C5-epimerase (EC:5.1.3.-). Glucuronyl C5-epimerases catalyze the conversion of D-glucuronic acid (GlcUA) to L-iduronic acid (IdoA) units during the biosynthesis of glycosaminoglycans.	188
399567	pfam06663	DUF1170	Protein of unknown function (DUF1170). This family represents a conserved region of unknown function within MAGUIN, a neuronal membrane-associated guanylate kinase-interacting protein. This region is situated between the pfam00595 and pfam00169 domains. All family members also contain an N-terminal pfam00536 domain.	214
399568	pfam06664	MIG-14_Wnt-bd	Wnt-binding factor required for Wnt secretion. MIG-14 is a Wnt-binding factor. Newly synthesized EGL-20/Wnt binds to MIG-14 in the Golgi, targetting the Wnt to the cell membrane for secretion. AP-2-mediated endocytosis and retromer retrieval at the sorting endosome would recycle MIG-14 to the Golgi, where it can bind to EGL-20/Wnt for the next cycle of secretion.	294
399569	pfam06666	DUF1173	Protein of unknown function (DUF1173). This family contains a group of hypothetical bacterial proteins that contain three conserved cysteine residues towards the N-terminal. The function of these proteins is unknown.	380
399570	pfam06667	PspB	Phage shock protein B. This family consists of several bacterial phage shock protein B (PspB) sequences. The phage shock protein (psp) operon is induced in response to heat, ethanol, osmotic shock and infection by filamentous bacteriophages. Expression of the operon requires the alternative sigma factor sigma54 and the transcriptional activator PspF. In addition, PspA plays a negative regulatory role, and the integral-membrane proteins PspB and PspC play a positive one.	73
399571	pfam06668	ITI_HC_C	Inter-alpha-trypsin inhibitor heavy chain C-terminus. This family represents the C-terminal region of inter-alpha-trypsin inhibitor heavy chains. Inter-alpha-trypsin inhibitors are glycoproteins with a high inhibitory activity against trypsin, built up from different combinations of four polypeptides: bikunin and the three heavy chains that belong to this family (HC1, HC2, HC3). The heavy chains do not have any protease inhibitory properties but have the capacity to interact in vitro and in vivo with hyaluronic acid, which promotes the stability of the extra-cellular matrix. All family members contain the pfam00092 domain.	188
115333	pfam06670	Etmic-2	Microneme protein Etmic-2. This family consists of several Microneme protein Etmic-2 sequences from Eimeria tenella. Etmic-2 is a 50 kDa acidic protein, which is found within the microneme organelles of Eimeria tenella sporozoites and merozoites.	379
369031	pfam06671	DUF1174	Repeat of unknown function (DUF1174). This family consists of a number of Caenorhabditis elegans specific repeats of around 36 residues in length which are found in two hypothetical proteins. This family is found in conjunction with pfam00024.	24
399572	pfam06672	DUF1175	Protein of unknown function (DUF1175). This family consists of several hypothetical bacterial proteins of around 210 residues in length. The function of this family is unknown.	218
115336	pfam06673	L_lactis_ph-MCP	Lactococcus lactis bacteriophage major capsid protein. This family consists of several Lactococcus lactis bacteriophage major capsid proteins.	347
399573	pfam06674	DUF1176	Protein of unknown function (DUF1176). This family consists of several hypothetical bacterial proteins of around 340 residues in length. Members of this family contain six highly conserved cysteine residues. The function of this family is unknown.	319
399574	pfam06675	DUF1177	Protein of unknown function (DUF1177). This family consists of several hypothetical archaeal and bacterial proteins of around 300 residues in length. The function of this family is unknown.	270
399575	pfam06676	DUF1178	Protein of unknown function (DUF1178). This family consists of several hypothetical bacterial proteins of around 150 residues in length. The function of this family is unknown.	148
399576	pfam06677	Auto_anti-p27	Sjogren's syndrome/scleroderma autoantigen 1 (Autoantigen p27). This family consists of several Sjogren's syndrome/scleroderma autoantigen 1 (Autoantigen p27) sequences. It is thought that the potential association of anti-p27 with anti-centromere antibodies suggests that autoantigen p27 might play a role in mitosis.	38
369033	pfam06678	DUF1179	Protein of unknown function (DUF1179). This family consists of several hypothetical Caenorhabditis elegans proteins of around 106 residues in length. The function of the family is unknown.	107
369034	pfam06679	DUF1180	Protein of unknown function (DUF1180). This family consists of several hypothetical mammalian proteins of around 190 residues in length. The function of this family is unknown.	167
284166	pfam06680	DUF1181	Protein of unknown function (DUF1181). This family consists of several hypothetical proteins of around 120 residues in length which are found specifically in Trypanosoma brucei. The function of this family is unknown.	120
284167	pfam06681	DUF1182	Protein of unknown function (DUF1182). This family consists of several hypothetical proteins of around 360 residues in length and seems to be specific to Caenorhabditis elegans. The function of this family is unknown. It appears to carry seven TM regions.	208
399577	pfam06682	SARAF	SOCE-associated regulatory factor of calcium homoeostasis. SARAF is a family of eukaryotic proteins embedded in the ER. SARAF is SOCE-associated regulatory factor, where SOCE is store operated calcium entry, i.e. a mechanism governing calcium homoeostasis in the cell and the mitochondria. SOCE involves the enabling of Ca2+ release-activated Ca2+ (CRAC) channels. SARAF is a single pass ER membrane protein whose cytosolic-facing domain is responsible for activity and whose luminal-facing domain carries out a regulatory function in conjunction with another membrane protein STIM, an ER single pass membrane protein that detects changes in ER Ca2+ levels through its EF-hand, conserved Ca2+ binding domain. STIM is the major target for SARAF regulation, and thus SARAF negatively regulates the SOCE entry of calcium into cells protecting them from overfilling.	320
310940	pfam06683	DUF1184	Protein of unknown function (DUF1184). This family contains a number of hypothetical proteins of unknown function from Arabidopsis thaliana.	203
399578	pfam06684	AA_synth	Amino acid synthesis. This family of proteins is structurally similar to proteins with the Bacillus chorismate mutase-like (BCM-like) fold. This structure, combined with its genomic context, suggests that it has a role in amino acid synthesis.	175
369037	pfam06685	DUF1186	Protein of unknown function (DUF1186). This family consists of several hypothetical bacterial proteins of around 250 residues in length and is found in several Chlamydia and Anabaena species. The function of this family is unknown.	246
399579	pfam06686	SpoIIIAC	Stage III sporulation protein AC/AD protein family. This family consists of several bacterial stage III sporulation protein AC (SpoIIIAC) and SpoIIIAD sequences. The exact function of this family is unknown. SpoIIIAD is an uncharacterized protein which is part of the spoIIIA operon that acts at sporulation stage III as part of a cascade of events leading to endospore formation. The operon is regulated by sigmaG.	56
399580	pfam06687	SUR7	SUR7/PalI family. This family consists of several fungal-specific SUR7 proteins. Its activity regulates expression of RVS161, a homolog of human endophilin, suggesting a function for both in endocytosis. The protein carries four transmembrane domains and is thus likely to act as an anchoring protein for the eisosome to the plasma membrane. Eisosomes are the immobile protein complexes, that include the proteins Pil1 and Lsp1, which co-localize with sites of protein and lipid endocytosis at the plasma membrane. SUR7 protein may play a role in sporulation. This family also includes PalI which is part of a pH signal transduction cascade. Based on the similarity of PalI to the yeast Rim9 meiotic signal transduction component it has been suggested that PalI might be a membrane sensor for ambient pH.	201
115351	pfam06688	DUF1187	Protein of unknown function (DUF1187). This family consists of several short, hypothetical bacterial proteins of around 62 residues in length. Members of this family are found in Escherichia coli and Salmonella typhi. The function of this family is unknown.	61
399581	pfam06689	zf-C4_ClpX	ClpX C4-type zinc finger. The ClpX heat shock protein of Escherichia coli is a member of the universally conserved Hsp100 family of proteins, and possesses a putative zinc finger motif of the C4 type. This presumed zinc binding domain is found at the N-terminus of the ClpX protein. ClpX is an ATPase which functions both as a substrate specificity component of the ClpXP protease and as a molecular chaperone. The molecular function of this domain is now known.	39
399582	pfam06690	DUF1188	Protein of unknown function (DUF1188). This family consists of several hypothetical archaeal proteins of around 260 residues in length which seem to be specific to Methanobacterium, Methanococcus and Methanopyrus species. The function of this family is unknown.	248
399583	pfam06691	DUF1189	Protein of unknown function (DUF1189). This family consists of several hypothetical bacterial proteins of around 260 residues in length. The function of this family is unknown.	240
115355	pfam06692	MNSV_P7B	Melon necrotic spot virus P7B protein. This family consists of several Melon necrotic spot virus (MNSV) P7B proteins. The function of this family is unknown.	61
399584	pfam06693	DUF1190	Protein of unknown function (DUF1190). This family consists of several hypothetical Enterobacterial proteins of around 212 residues in length and is known as YjfM in Escherichia coli. The function of this family is unknown.	161
369039	pfam06694	Plant_NMP1	Plant nuclear matrix protein 1 (NMP1). This family consists of several plant specific nuclear matrix protein 1 (NMP1) sequences. Nuclear Matrix Protein 1 is a ubiquitously expressed 36 kDa protein, which has no homologs in animals and fungi, but is highly conserved among flowering and non-flowering plants. NMP1 is located in both the cytoplasm and nucleus, and the nuclear fraction is associated with the nuclear matrix. NMP1 is a candidate for a plant-specific structural protein with a function both in the nucleus and cytoplasm.	318
399585	pfam06695	Sm_multidrug_ex	Putative small multi-drug export protein. This family contains a small number of putative small multi-drug export proteins.	112
336475	pfam06696	Strep_SA_rep	Streptococcal surface antigen repeat. This family consists of a number of ~25 residue long repeats found commonly in Streptococcal surface antigens although one copy is present in the HPSR2-heavy chain potential motor protein of Giardia lamblia. This family is often found in conjunction with pfam00746.	25
399586	pfam06697	DUF1191	Protein of unknown function (DUF1191). This family contains hypothetical plant proteins of unknown function.	182
399587	pfam06698	DUF1192	Protein of unknown function (DUF1192). This family consists of several short, hypothetical, bacterial proteins of around 60 residues in length. The function of this family is unknown.	58
399588	pfam06699	PIG-F	GPI biosynthesis protein family Pig-F. PIG-F is involved in glycosylphosphatidylinositol (GPI) anchor biosynthesis.	183
399589	pfam06701	MIB_HERC2	Mib_herc2. Named "mib/herc2 domain". Usually the protein also contains an E3 ligase domain (either Ring or Hect).	66
399590	pfam06702	Fam20C	Golgi casein kinase, C-terminal, Fam20. Fam20C represents the C-terminus of eukaryotic secreted Golgi casein kinase proteins. Fam20C is the Golgi casein kinase that phosphorylates secretory pathway proteins within Ser-x-Glu/pSer motifs. Mutations in Fam20C cause Raine syndrome, an autosomal recessive osteosclerotic bone dysplasia.	218
399591	pfam06703	SPC25	Microsomal signal peptidase 25 kDa subunit (SPC25). This family consists of several microsomal signal peptidase 25 kDa subunit proteins. Translocation of polypeptide chains across the endoplasmic reticulum (ER) membrane is triggered by signal sequences. Subsequently, signal recognition particle interacts with its membrane receptor and the ribosome-bound nascent chain is targeted to the ER where it is transferred into a protein-conducting channel. At some point, a second signal sequence recognition event takes place in the membrane and translocation of the nascent chain through the membrane occurs. The signal sequence of most secretory and membrane proteins is cleaved off at this stage. Cleavage occurs by the signal peptidase complex (SPC) as soon as the lumenal domain of the translocating polypeptide is large enough to expose its cleavage site to the enzyme. The signal peptidase complex is possibly also involved in proteolytic events in the ER membrane other than the processing of the signal sequence, for example the further digestion of the cleaved signal peptide or the degradation of membrane proteins. Mammalian signal peptidase is a complex of five different polypeptide chains. This family represents the 25 kDa subunit (SPC25).	153
284187	pfam06705	SF-assemblin	SF-assemblin/beta giardin. This family consists of several eukaryotic SF-assemblin and related beta giardin proteins. During mitosis the SF-assemblin-based cytoskeleton is reorganized; it divides in prophase and is reduced to two dot-like structures at each spindle pole in metaphase. During anaphase, the two dots present at each pole are connected again. In telophase there is an asymmetrical outgrowth of new fibers. It has been suggested that SF-assemblin is involved in re-establishing the microtubular root system characteristic of interphase cells after mitosis.	247
115368	pfam06706	CTV_P6	Citrus tristeza virus 6-kDa protein. This family consists of several Citrus tristeza virus (CTV) 6-kDa, 51 residue long hydrophobic (P6) proteins. The function of this family is unknown.	51
369046	pfam06707	DUF1194	Protein of unknown function (DUF1194). This family consists of several hypothetical Rhizobiales specific proteins of around 270 residues in length. The function of this family is unknown.	206
399592	pfam06708	DUF1195	Protein of unknown function (DUF1195). This family consists of several plant specific hypothetical proteins of around 160 residues in length. The function of this family is unknown.	147
399593	pfam06711	DUF1198	Protein of unknown function (DUF1198). This family consists of several bacterial proteins of around 150 residues in length which are specific to Escherichia coli, Salmonella species and Yersinia pestis. The function of this family is unknown.	142
115374	pfam06712	DUF1199	Protein of unknown function (DUF1199). This family consists of several hypothetical Feline immunodeficiency virus (FIV) proteins. Members of this family are typically around 67 residues long and are often annotated as ORF3 proteins. The function of this family is unknown.	52
369049	pfam06713	bPH_4	Bacterial PH domain. This family consists of several hypothetical proteins specific to Oceanobacillus and Bacillus species. Members of this family are typically around 130 residues in length. The function of this family is unknown. Members of this family have a PH domain like structure.	74
148361	pfam06714	Gp5_OB	Gp5 N-terminal OB domain. This domain is found at the N-terminus of the Gp5 baseplate protein of bacteriophage T4. This domain binds to the Gp27 protein. This domain has the common OB fold.	144
310962	pfam06715	Gp5_C	Gp5 C-terminal repeat (3 copies). This repeat composes the C-terminal part of the bacteriophage T4 baseplate protein Gp5. This region of the protein forms a needle like projection from the baseplate that is presumed to puncture the bacterial cell membrane. Structurally three copies of the repeated region trimerize to form a beta solenoid type structure. This family also includes repeats from bacterial Vgr proteins.	24
284193	pfam06716	DUF1201	Protein of unknown function (DUF1201). This family consists of several Sugar beet yellow virus (SBYV) putative membrane-binding proteins of around 54 residues in length. The function of this family is unknown.	54
284194	pfam06717	DUF1202	Protein of unknown function (DUF1202). This family consists of several hypothetical bacterial proteins of around 335 residues in length. Members of this family are found exclusively in Escherichia coli and Salmonella species and are often referred to as YggM proteins. The function of this family is unknown.	307
399594	pfam06718	DUF1203	Protein of unknown function (DUF1203). This family consists of several hypothetical bacterial proteins of around 155 residues in length. Family members are present in Rhizobium, Agrobacterium and Streptomyces species.	116
399595	pfam06719	AraC_N	AraC-type transcriptional regulator N-terminus. This family represents the N-terminus of bacterial ARAC-type transcriptional regulators. In E. coli, these regulate the L-arabinose operon through sensing the presence of arabinose, and when the sugar is present, transmitting this information from the arabinose-binding domains to the protein's DNA-binding domains. This family might represent the N-terminal arm of the protein, which binds to the C-terminal DNA binding domains to hold them in a state where the protein prefers to loop and remain non-activating. All family members contain the pfam00165 domain.	152
336481	pfam06720	Phi-29_GP16_7	Bacteriophage phi-29 early protein GP16.7. This family consists of several bacteriophage phi-29 early protein GP16.7 sequences of around 130 residues in length. The function of this family is unknown.	130
284198	pfam06721	DUF1204	Protein of unknown function (DUF1204). This family represents the C-terminus of a number of Arabidopsis thaliana hypothetical proteins of unknown function. Family members contain a conserved DFD motif.	243
369050	pfam06722	DUF1205	Protein of unknown function (DUF1205). This family represents a conserved region of unknown function within bacterial glycosyl transferases. Many family members contain pfam03033.	95
399596	pfam06723	MreB_Mbl	MreB/Mbl protein. This family consists of bacterial MreB and Mbl proteins as well as two related archaeal sequences. MreB is known to be a rod shape-determining protein in bacteria and goes to make up the bacterial cytoskeleton. Genes coding for MreB/Mbl are only found in elongated bacteria, not in coccoid forms. It has been speculated that constituents of the eukaryotic cytoskeleton (tubulin, actin) may have evolved from prokaryotic precursor proteins closely related to today's bacterial proteins FtsZ and MreB/Mbl.	327
399597	pfam06724	DUF1206	Domain of Unknown Function (DUF1206). This region consists of a pair of transmembrane helices and occurs three times in each of the family member proteins.	70
399598	pfam06725	3D	3D domain. This short presumed domain contains three conserved aspartate residues, hence the name 3D. It has been shown to be part of the catalytic double psi beta barrel domain of MltA.	72
369052	pfam06726	BC10	Bladder cancer-related protein BC10. This family consists of a series of short proteins of around 90 residues in length. The human protein BC10 has been implicated in bladder cancer where the transcription of the gene coding for this protein is nearly completely abolished in highly invasive transitional cell carcinomas (TCCs). The protein is a small globular protein containing two transmembrane helices, and it is a multiply edited transcript. All the editing sites are found in either the 5'-UTR or the N-terminal section of the protein, which is predicted to be outside the membrane. The three coding edits are all non-synonymous and predicted to encode exposed residues. The function of this family is unknown.	65
310969	pfam06727	DUF1207	Protein of unknown function (DUF1207). This family consists of a number of hypothetical bacterial proteins of around 410 residues in length which seem to be specific to Chlamydia species. The function of this family is unknown.	337
399599	pfam06728	PIG-U	GPI transamidase subunit PIG-U. Many eukaryotic proteins are anchored to the cell surface via glycosylphosphatidylinositol (GPI), which is posttranslationally attached to the carboxyl-terminus by GPI transamidase. The mammalian GPI transamidase is a complex of at least four subunits, GPI8, GAA1, PIG-S, and PIG-T. PIG-U is thought to represent a fifth subunit in this complex and may be involved in the recognition of either the GPI attachment signal or the lipid portion of GPI.	374
399600	pfam06729	CENP-R	Kinetochore component, CENP-R. This family consists of mammalian kinetochore sub-complex proteins CENP-R, also referred to as nuclear receptor co-activator NRIF3 proteins. NRIF3 exhibits a distinct receptor specificity in interacting with and potentiating the activity of only TRs and RXRs but not other examined nuclear receptors. NRIF3 is a co-regulator that possesses both transactivation and transrepression domains and/or functions. Collectively, the NRIF3 family of co-regulators may play dual roles in mediating both positive and negative regulatory effects on gene expression. CENP-R is one of the 15 components that make up the constitutive centromere associated complex (CCAN) part of the kinetochore. A sub-complex of CCAN, consisting of CENP-P/O/R/Q/U self-assembles on kinetochores with varying stoichiometry and undergoes a pre-mitotic maturation step. Kinetochore assembly is a cell cycle regulated multi-step process. The initial step occurs during interphase and involves loading of the 15-subunit constitutive centromere associated complex (CCAN). Kinetochores are multi-protein megadalton assemblies that are required for attachment of microtubules to centromeres and, in turn, the segregation of chromosomes in mitosis.	137
284207	pfam06730	FAM92	FAM92 protein. This family of proteins has a role in embryogenesis. During embryogenesis it is essential for ectoderm and axial mesoderm development. It may regulate cell proliferation and apoptosis.	225
399601	pfam06732	Pescadillo_N	Pescadillo N-terminus. This family represents the N-terminal region of Pescadillo. Pescadillo protein localizes to distinct substructures of the interphase nucleus including nucleoli, the site of ribosome biogenesis. During mitosis pescadillo closely associates with the periphery of metaphase chromosomes and by late anaphase is associated with nucleolus-derived foci and prenucleolar bodies. Blastomeres in mouse embryos lacking pescadillo arrest at morula stages of development, the nucleoli fail to differentiate and accumulation of ribosomes is inhibited. It has been proposed that in mammalian cells pescadillo is essential for ribosome biogenesis and nucleologenesis and that disruption to its function results in cell cycle arrest. This family is often found in conjunction with a pfam00533 domain.	269
399602	pfam06733	DEAD_2	DEAD_2. This represents a conserved region within a number of RAD3-like DNA-binding helicases that are seemingly ubiquitous - members include proteins of eukaryotic, bacterial and archaeal origin. RAD3 is involved in nucleotide excision repair, and forms part of the transcription factor TFIIH in yeast.	168
284210	pfam06734	UL97	UL97. This family represents a conserved region within viral UL97 phosphotransferases. UL97 participates in the phosphorylation of the nucleoside analog ganciclovir (GCV) to produce GCV-monophosphate.	187
399603	pfam06736	DUF1211	Protein of unknown function (DUF1211). This family represents a conserved region within a number of hypothetical proteins of unknown function found in eukaryotes, bacteria and archaea. These may possibly be integral membrane proteins.	88
399604	pfam06737	Transglycosylas	Transglycosylase-like domain. This family of proteins is very likely to act as transglycosylase enzymes related to pfam00062 and pfam01464. These other families are weakly matched by this family, and include the known active site residues.	75
399605	pfam06738	ThrE	Putative threonine/serine exporter. ThrE is a family of bacterial and Archaeal proteins that catalyze the export of L-threonine from the cell. UniProtKB:Q79VD1 has been characterized as being necessary for this export. The domain exhibits 10 putative TMs and catalyzes the proton-motive-force-dependent efflux of threonine and serine.	241
115400	pfam06739	SBBP	Beta-propeller repeat. This family is related to pfam00400 and is likely to also form a beta-propeller. SBBP stands for Seven Bladed Beta Propeller.	38
369057	pfam06740	DUF1213	Protein of unknown function (DUF1213). This family represents a short conserved repeat within Drosophila melanogaster proteins of unknown function. Approximately 50 copies of this repeat are present in each protein.	32
399606	pfam06741	LsmAD	LsmAD domain. This domain is found associated with Lsm domain.	65
399607	pfam06742	DUF1214	Protein of unknown function (DUF1214). This family represents the C-terminal region of several hypothetical proteins of unknown function. Family members are mostly bacterial, but a few are also found in eukaryotes and archaea.	109
399608	pfam06743	FAST_1	FAST kinase-like protein, subdomain 1. This family represents a conserved region of eukaryotic Fas-activated serine/threonine (FAST) kinases (EC:2.7.1.-) that contains several conserved leucine residues. FAST kinase is rapidly activated during Fas-mediated apoptosis, when it phosphorylates TIA-1, a nuclear RNA-binding protein that has been implicated as an effector of apoptosis. Note that many family members are hypothetical proteins. This region is often found immediately N-terminal to the FAST kinase-like protein, subdomain 2.	69
399609	pfam06744	IcmF_C	Type VI secretion protein IcmF C-terminal. IcmF_C family represents a conserved region situated towards the C-terminal end of IcmF-like proteins. It was thought to be involved in Vibrio cholerae cell surface reorganisation that results in increased adherence to epithelial cells leading to an increased conjugation frequency. IcmF as a whole interacts with DotU whereby these bind tightly and form the docking area of the T6SS within the inner membrane. The exact function of this domain is not clear.	106
399610	pfam06745	ATPase	KaiC. This family is in the P-loop NTPase superfamily and is found in archaea, bacteria and eukaryotes. More than one copy is sometimes found in each protein. This family includes KaiC, which is one of the Kai proteins among which direct protein-protein association may be a critical process in the generation of circadian rhythms in cyanobacteria.	231
369060	pfam06746	DUF1216	Protein of unknown function (DUF1216). This family represents a conserved region, within Arabidopsis thaliana proteins, of unknown function. Family members sometimes contain more than one copy. It has been reported that this domain will be found in other Brassicaceae.	132
399611	pfam06747	CHCH	CHCH domain. We have identified a conserved motif in the LOC118487 protein that we have called the CHCH motif. Alignment of this protein with related members showed the presence of three subgroups of proteins, which are called the S (Small), N (N-terminal extended) and C (C-terminal extended) subgroups. All three sub-groups of proteins have in common that they contain a predicted conserved [coiled coil 1]-[helix 1]-[coiled coil 2]-[helix 2] domain (CHCH domain). Within each helix of the CHCH domain, there are two cysteines present in a C-X9-C motif. The N-group contains an additional double helix domain, and each helix contains the C-X9-C motif. This family contains a number of characterized proteins: Cox19 protein - a nuclear gene of Saccharomyces cerevisiae, codes for an 11-kDa protein (Cox19p) required for expression of cytochrome oxidase. Because cox19 mutants are able to synthesize the mitochondrial and nuclear gene products of cytochrome oxidase, Cox19p probably functions post-translationally during assembly of the enzyme. Cox19p is present in the cytoplasm and mitochondria, where it exists as a soluble intermembrane protein. This dual location is similar to what was previously reported for Cox17p, a low molecular weight copper protein thought to be required for maturation of the CuA centre of subunit 2 of cytochrome oxidase. Cox19p has four conserved potential metal ligands, these are three cysteines and one histidine. Mrp10 - belongs to the class of yeast mitochondrial ribosomal proteins that are essential for translation. Eukaryotic NADH-ubiquinone oxidoreductase 19 kDa (NDUFA8) subunit. The CHCH domain was previously called DUF657.	35
399612	pfam06748	DUF1217	Protein of unknown function (DUF1217). This family represents a conserved region that is found within bacterial proteins, most of which are hypothetical. Some members contain multiple copies.	149
399613	pfam06749	DUF1218	Protein of unknown function (DUF1218). This family contains hypothetical plant proteins of unknown function. Family members contain a number of conserved cysteine residues.	95
399614	pfam06750	DiS_P_DiS	Bacterial Peptidase A24 N-terminal domain. This family is found at the N-terminus of the pre-pilin peptidases (pfam01478). Its function has not been specifically determined; however some of the family have been characterized as bifunctional, and this domain may contain the N-methylation activity (EC:2.1.1.-). It consists of an intracellular region between a pair of transmembrane regions. This region contains an invariant proline and two almost fully conserved disulphide bridges - hence the name DiS-P-DiS. The cysteines have been shown to be essential to the overall function of the enzyme, but their role was incorrectly ascribed.	84
399615	pfam06751	EutB	Ethanolamine ammonia lyase large subunit (EutB). This family consists of several bacterial ethanolamine ammonia lyase large subunit (EutB) proteins (EC:4.3.1.7). Ethanolamine ammonia-lyase is a bacterial enzyme that catalyzes the adenosylcobalamin-dependent conversion of certain vicinal amino alcohols to oxo compounds and ammonia. The enzyme is a heterodimer composed of subunits of Mr approximately 55,000 (EutB) and 35,000 (EutC).	435
399616	pfam06752	E_Pc_C	Enhancer of Polycomb C-terminus. This family represents the C-terminus of eukaryotic enhancer of polycomb proteins, which have roles in heterochromatin formation. This family contains several conserved motifs.	229
70231	pfam06753	Bradykinin	Bradykinin. This family consists of several bradykinin sequences. The skins of anuran amphibians, in addition to mucus glands, contain highly specialized poison glands, which, in reaction to stress or attack, exude a complex noxious cocktail of biologically active molecules. These secretions often contain a plethora of peptides among which bradykinin or structural variants have been identified.	19
399617	pfam06754	PhnG	Phosphonate metabolism protein PhnG. This family consists of several bacterial phosphonate metabolism protein PhnG sequences. In Escherichia coli, the phn operon encodes proteins responsible for the uptake and breakdown of phosphonates. The exact function of PhnG is unknown, however it is thought likely that along with six other proteins PhnG makes up the C-P (carbon-phosphorus) lyase.	142
369065	pfam06755	CbtA_toxin	CbtA_toxin of type IV toxin-antitoxin system. CbtA is a family of bacterial and archaeal toxins of type IV toxin-antitoxin system. Toxins from such systems in free-living bacteria inhibit cell growth by targeting essential functions of cellular metabolism. In this case the toxin inhibits cell-division leading to changes in morphology and finally lysis, by interacting with two essential cytoskeletal proteins, FtsZ and MreB. For FtsZ it inhibits its GTPase activity and GTP-dependent polymerization, and for MreB it inhibits its ATP-dependent polymerization. These actions of CbtA appear to occur simultaneously. The cognate antitoxin family is represented by pfam06154.	108
369066	pfam06756	S19	Chorion protein S19 C-terminal. This family represents the C-terminal region of eukaryotic chorion protein S19. In Drosophilidae, the S19 gene is known to form part of an autosomal cluster that also contains s16, s15 and s18. Note that members of this family contain a conserved PVA motif, and many contain pfam03964.	72
399618	pfam06757	Ins_allergen_rp	Insect allergen related repeat, nitrile-specifier detoxification. This family exemplifies a case of novel gene evolution. The case in point is the arms-race between plants and their insect herbivores in the area of the glucosinolate-myrosinase system. Brassicas have developed the glucosinolate-myrosinase system as a chemical defense mechanism against the insects, and consequently the insects have adapted to produce a detoxifying molecule, nitrile-specifier protein (NSP). NSP is present in the small white butterfly Pieris rapae. NSP is structurally different from and has no amino acid homology to any known detoxifying enzymes, and it appears to have arisen by a process of domain and gene duplication of a sequence of unknown function that is widespread in insect species and referred to as insect-allergen-repeat protein. Thus this family is found either as a single domain or as a multiple repeat-domain.	173
399619	pfam06758	DUF1220	Repeat of unknown function (DUF1220). 	66
399620	pfam06760	DUF1221	Protein of unknown function (DUF1221). This is a family of plant proteins, most of which are hypothetical and of unknown function. All members contain the pfam00069 domain, suggesting that they may possess kinase activity.	215
399621	pfam06761	IcmF-related	Intracellular multiplication and human macrophage-killing. This family represents a conserved region within several bacterial proteins that resemble IcmF, which has been proposed to be involved in Vibrio cholerae cell surface reorganisation, resulting in increased adherence to epithelial cells and increased conjugation frequency. Note that many family members are hypothetical proteins.	304
399622	pfam06762	LMF1	Lipase maturation factor. This family of transmembrane proteins includes the lipase maturation factor, LMF1. Lipoprotein lipase and hepatic lipase require LMF1 to fold into their active states. The precise role of LMF1 in lipase folding has yet to be determined.	378
284235	pfam06763	Minor_tail_Z	Prophage minor tail protein Z (GPZ). This family consists of several prophage minor tail protein Z like sequences from Escherichia coli, Salmonella typhimurium and Lambda-like bacteriophages.	190
399623	pfam06764	DUF1223	Protein of unknown function (DUF1223). This family consists of several hypothetical proteins of around 250 residues in length which are found in both plants and bacteria. The function of this family is unknown. Structurally it lies in the Thioredoxin-like superfamily.	201
399624	pfam06766	Hydrophobin_2	Fungal hydrophobin. This is a family of fungal hydrophobins that seems to be restricted to ascomycetes. These are small, moderately hydrophobic extracellular proteins that have eight cysteine residues arranged in a strictly conserved motif. Hydrophobins are generally found on the outer surface of conidia and of the hyphal wall, and may be involved in mediating contact and communication between the fungus and its environment. Note that some family members contain multiple copies.	65
399625	pfam06767	Sif	Sif protein. This family consists of several SifA and SifB and SseJ proteins which seem to be specific to the Salmonella species. SifA, SifB and SseJ have been demonstrated to localize to the Salmonella-containing vacuole (SCV) and to Salmonella-induced filaments (Sifs). Trafficking of SseJ and SifB away from the SCV requires the SPI-2 effector SifA. SseJ trafficking away from the SCV along Sifs is unnecessary for its virulence function.	336
284238	pfam06769	YoeB_toxin	YoeB-like toxin of bacterial type II toxin-antitoxin system. YoeB_toxin is a family of bacterial toxins that forms one component of the type II toxin-antitoxin system in E. coli whose antitoxin is represented by YefM, found in pfam02604. The plasmid encoded Axe-Txe proteins in Enterococcus faecium act as an antitoxin-toxin pair. When the plasmid is lost, the antitoxin is degraded relatively quickly by host enzymes. This allows the toxin to interact with its intracellular target, thus killing the cell or impeding cell growth. These toxins are highly potent protein synthesis inhibitors, specifically blocking the initiation of translation. In the case of YoeB, it binds to the 50 S ribosomal subunit in 70 S ribosomes and interacts with the A site leading to mRNA cleavage at this site. As a result, the 3'-end portion of the mRNA is released from ribosomes, and translation initiation is effectively inhibited.	80
284239	pfam06770	Arif-1	Actin-rearrangement-inducing factor (Arif-1). This family consists of several Nucleopolyhedrovirus actin-rearrangement-inducing factor (Arif-1) proteins. In response to Autographa californica multicapsid nuclear polyhedrosis virus (AcMNPV) infection, a sequential rearrangement of the actin cytoskeleton occurs; this is induced by Arif-1. Arif-1 is tyrosine phosphorylated and is located at the plasma membrane as a component of the actin rearrangement-inducing complex.	205
284240	pfam06771	Desmo_N	Viral Desmoplakin N-terminus. This family represents the N-terminus of viral desmoplakin. Desmoplakin is a component of mature desmosomes, which are the main adhesive junctions in epithelia and cardiac muscle. Desmoplakin is also essential for the maturation of adherens junctions. Note that many family members are hypothetical.	97
399626	pfam06772	LtrA	Bacterial low temperature requirement A protein (LtrA). This family consists of several bacteria specific low temperature requirement A (LtrA) protein sequences which have been found to be essential for growth at low temperatures in Listeria monocytogenes.	352
399627	pfam06773	Bim_N	Bim protein N-terminus. This family represents the N-terminal region of several mammal specific Bim proteins. The Bim protein is one of the BH3-only proteins, members of the Bcl-2 family that have only one of the Bcl-2 homology regions, BH3. BH3-only proteins are essential initiators of apoptotic cell death.	40
399628	pfam06775	Seipin	Putative adipose-regulatory protein (Seipin). Seipin is a protein of approximately 400 residues, in humans, which is the product of a gene homologous to the murine guanine nucleotide-binding protein (G protein) gamma-3 linked gene. This gene is implicated in the regulation of body fat distribution and insulin resistance and particularly in the auto-immune disease Berardinelli-Seip congenital lipodystrophy type 2. Seipin has no similarity with other known proteins or consensus motifs that might predict its function, but it is predicted to contain two transmembrane domains at residues 28-49 and 237-258, in human, and a third transmembrane domain might be present at residues 155-173. Seipin may also be implicated in Silver spastic paraplegia syndrome and distal hereditary motor neuropathy type V.	194
399629	pfam06776	IalB	Invasion associated locus B (IalB) protein. This family consists of several invasion associated locus B (IalB) proteins and related sequences. IalB is known to be a major virulence factor in Bartonella bacilliformis where it was shown to have a direct role in human erythrocyte parasitism. IalB is upregulated in response to environmental cues signaling vector-to-host transmission. Such environmental cues would include, but not be limited to, temperature, pH, oxidative stress, and haemin limitation. It is also thought that IalB would aid B. bacilliformis survival under stress-inducing environmental conditions. The role of this protein in other bacterial species is unknown.	134
399630	pfam06777	HBB	Helical and beta-bridge domain. HBB is the domain on DEAD-box eukaryotic DNA repair helicases (EC:3.6.1.-) that appears to be a unique fold. Its conformation is of alpha-helices 12-16 plus a short beta-bridge to the FeS-cluster domain at the N-terminal. The full-length XPD protein verifies the presence of damage to DNA and allows DNA repair to proceed. XPD is an assembly of several domains to form a doughnut-shaped molecule that is able to separate two DNA strands and scan the DNA for damage. HBB helps to form the overall DNA-clamping architecture. This family represents a conserved region within a number of eukaryotic DNA repair helicases (EC:3.6.1.-).	190
399631	pfam06778	Chlor_dismutase	Chlorite dismutase. This family contains chlorite dismutase enzymes of bacterial and archaeal origin. This enzyme catalyzes the disproportionation of chlorite into chloride and oxygen. Note that many family members are hypothetical proteins.	190
399632	pfam06779	MFS_4	Uncharacterized MFS-type transporter YbfB. This family represents putative bacterial membrane proteins which may be sugar transporters. Members carry twelve transmembrane regions which are characteristic of members of the major facilitator sugar-transporter superfamily.	365
311003	pfam06780	Erp_C	Erp protein C-terminus. This family represents the C-terminus of bacterial Erp proteins that seem to be specific to Borrelia burgdorferi (a causative agent of Lyme disease). Borrelia Erp proteins are particularly heterogeneous, which might enable them to interact with a wide variety of host components.	140
399633	pfam06781	CrgA	Cell division protein CrgA. CrgA is a trans-membrane (TM) protein, first described in Streptomyces as being required for sporulation through the coordination of several aspects of reproductive growth. In Mtb (Mycobacterium tuberculosis) CrgA is a central component of the divisome, and consists of 93 residues with two predicted TM helices (TM1: residues 29-51; and TM2: residues 66-88). CrgA facilitates the recruitment of the proteins essential for peptidoglycan synthesis to the divisome and also stabilizes the divisome. Reduced production of CrgA results in elongated cells and reduced growth rate, and loss of CrgA impairs peptidoglycan synthesis. CrgA has homologs in other actinomycetes.	88
284250	pfam06782	UPF0236	Uncharacterized protein family (UPF0236). 	479
399634	pfam06783	UPF0239	Uncharacterized protein family (UPF0239). 	83
399635	pfam06784	UPF0240	Uncharacterized protein family (UPF0240). 	167
399636	pfam06785	UPF0242	Uncharacterized protein family (UPF0242). 	194
399637	pfam06786	UPF0253	Uncharacterized protein family (UPF0253). 	65
399638	pfam06787	UPF0254	Uncharacterized protein family (UPF0254). 	162
399639	pfam06788	UPF0257	Uncharacterized protein family (UPF0257). 	235
399640	pfam06789	UPF0258	Uncharacterized protein family (UPF0258). 	148
284258	pfam06790	UPF0259	Uncharacterized protein family (UPF0259). 	248
399641	pfam06791	TMP_2	Prophage tail length tape measure protein. This family represents a conserved region located towards the N-terminal end of prophage tail length tape measure protein (TMP). TMP is important for assembly of phage tails and involved in tail length determination. Mutated forms of TMP cause tail fibers to be shortened.	205
399642	pfam06792	UPF0261	Uncharacterized protein family (UPF0261). 	400
399643	pfam06793	UPF0262	Uncharacterized protein family (UPF0262). 	152
399644	pfam06794	UPF0270	Uncharacterized protein family (UPF0270). 	67
284263	pfam06795	Erythrovirus_X	Erythrovirus X protein. This family consists of several Erythrovirus X proteins which seem to be found exclusively in human parvovirus and human erythrovirus. The function of this family is unknown.	81
369084	pfam06796	NapE	Periplasmic nitrate reductase protein NapE. This family consists of several bacterial periplasmic nitrate reductase NapE proteins. Seven genes, napKEFDABC, encoding the periplasmic nitrate reductase system were cloned from the denitrifying phototrophic bacterium Rhodobacter sphaeroides f. sp. denitrificans IL106. NapE is thought to be a transmembrane protein.	53
369085	pfam06797	DUF1229	Protein of unknown function (DUF1229). This family consists of several hypothetical proteins of around 415 residues in length which seem to be specific to the bacterium Leptospira interrogans.	146
399645	pfam06798	PrkA	PrkA serine protein kinase C-terminal domain. This is a family of PrkA bacterial and archaeal serine kinases approximately 630 residues long. This family corresponds to the C-terminal domain.	252
399646	pfam06799	DUF1230	Protein of unknown function (DUF1230). This family consists of several hypothetical plant and photosynthetic bacterial proteins of around 160 residues in length. The function of this family is unknown although looking at the species distribution the protein may play a part in photosynthesis.	141
284268	pfam06800	Sugar_transport	Sugar transport protein. This is a family of bacterial sugar transporters approximately 300 residues long. Members include glucose uptake proteins, ribose transport proteins, and several putative and hypothetical membrane proteins probably involved in sugar transport across bacterial membranes. These members are transmembrane proteins which are usually 5+5 duplications. This model recognizes a set of five TMs,	281
284269	pfam06802	DUF1231	Protein of unknown function (DUF1231). This family consists of several Orthopoxvirus specific proteins predominantly of around 340 residues in length. This family contains both B17 and B15 proteins, the function of which are unknown.	340
399647	pfam06803	DUF1232	Protein of unknown function (DUF1232). This family represents a conserved region of approximately 60 residues within a number of hypothetical bacterial and archaeal proteins of unknown function.	37
399648	pfam06804	Lipoprotein_18	NlpB/DapX lipoprotein. This family consists of a number of bacterial lipoproteins often known as NlpB or DapX. This lipoprotein is detected in outer membrane vesicles in Escherichia coli and appears to be nonessential.	292
284272	pfam06805	Lambda_tail_I	Bacteriophage lambda tail assembly protein I. This family consists of tail assembly proteins from lambdoid and T1 phages and related prophages, e.g. the tail assembly protein I (TAPI). Members of this family contain a core ubiquitin fold domain. The exact function of TAPI is not clear but it is not incorporated into the mature tail. Gene neighborhoods reveal that TAPI co-occurs with genes encoding the host-specificity protein TapJ, and TapK, which contains a JAB metallopeptidase fused to an NlpC/P60 peptidase. It is proposed that the TAPI protein is processed by the peptidase domains of TapK.	82
399649	pfam06806	DUF1233	Putative excisionase (DUF1233). This family consists of several putative phage excisionase proteins of around 80 residues in length.	70
399650	pfam06807	Clp1	Pre-mRNA cleavage complex II protein Clp1. This family consists of several pre-mRNA cleavage complex II Clp1 (or HeaB) proteins. Six different protein factors are required in vitro for 3' end formation of mammalian pre-mRNAs by endonucleolytic cleavage and polyadenylation. Clp1 is a subunit of cleavage complex IIA, which is required for cleavage, but not for polyadenylation of pre-mRNA.	112
284275	pfam06808	DctM	Tripartite ATP-independent periplasmic transporter, DctM component. This family contains a diverse range of predicted transporter proteins, including the DctM subunit of the bacterial and archaeal TRAP C4-dicarboxylate transport (Dct) system permease. In general, C4-dicarboxylate transport systems allow C4-dicarboxylates like succinate, fumarate, and malate to be taken up. TRAP C4-dicarboxylate carriers are secondary carriers that use an electrochemical H+ gradient as the driving force for transport. DctM is an integral membrane protein that is one of the constituents of TRAP carriers. Note that many family members are hypothetical proteins.	413
284276	pfam06809	NPDC1	Neural proliferation differentiation control-1 protein (NPDC1). This family consists of several neural proliferation differentiation control-1 (NPDC1) proteins. NPDC1 plays a role in the control of neural cell proliferation and differentiation. It has been suggested that NPDC1 may be involved in the development of several secretion glands. This family also contains the C-terminal region of the C. elegans protein CAB-1, which is known to interact with AEX-3.	352
399651	pfam06810	Phage_GP20	Phage minor structural protein GP20. This family consists of several phage minor structural protein GP20 sequences of around 180 residues in length. The function of this family is unknown.	148
399652	pfam06812	ImpA_N	ImpA, N-terminal, type VI secretion system. This family represents a conserved region located towards the N-terminal end of ImpA and related proteins. ImpA is an inner membrane protein, which has been suggested to be involved with proteins that are exported and associated with colony variations in Actinobacillus actinomycetemcomitans. The ImpA gene in Vibrio cholera and many other bacteria is expressed from the virulence factor operon which produces the pathogenic injection, type VI secretion system; although the exact function of this gene-product is not known it appears to be an essential component of the pathogenic effect.	115
284279	pfam06813	Nodulin-like	Nodulin-like. This family represents a conserved region within plant nodulin-like proteins.	250
369089	pfam06814	Lung_7-TM_R	Lung seven transmembrane receptor. This family represents a conserved region within eukaryotic lung seven transmembrane receptors and related proteins.	294
399653	pfam06815	RVT_connect	Reverse transcriptase connection domain. This domain is known as the connection domain. This domain lies between the thumb and palm domains.	102
399654	pfam06816	NOD	NOTCH protein. NOTCH signalling plays a fundamental role during a great number of developmental processes in multicellular animals. NOD and NODP represent a region present in many NOTCH proteins and NOTCH homologs in multiple species such as NOTCH2 and NOTCH3, LIN12, SC1 and TAN1. Role of NOD domain remains to be elucidated.	55
399655	pfam06817	RVT_thumb	Reverse transcriptase thumb domain. This domain is known as the thumb domain. It is composed of a four helix bundle.	66
399656	pfam06818	Fez1	Fez1. This family represents the eukaryotic Fez1 protein. Fez1 contains a leucine-zipper region with similarity to the DNA-binding domain of the cAMP-responsive activating-transcription factor 5. There is evidence that Fez1 inhibits cancer cell growth through regulation of mitosis, and that its alterations result in abnormal cell growth. Note that some family members contain more than one copy of this region.	198
399657	pfam06819	Arc_PepC	Archaeal Peptidase A24 C-terminal Domain. This region is of unknown function but is found in some archaeal pfam01478. It is predicted to be of mixed alpha/beta secondary structure by JPred.	112
148432	pfam06820	Phage_fiber_C	Putative prophage tail fibre C-terminus. This family represents the C-terminus of a prophage tail fibre protein found mostly in E. coli. All family members contain a conserved RLGP motif.	64
399658	pfam06821	Ser_hydrolase	Serine hydrolase. Members of this family have serine hydrolase activity. They contain a conserved serine hydrolase motif, GXSXG/A, where the serine is a putative nucleophile. This family has an alpha-beta hydrolase fold. Eukaryotic members of this family have a conserved LXCXE motif, which binds to retinoblastomas. This motif is absent from prokaryotic members of this family.	171
284287	pfam06822	DUF1235	Protein of unknown function (DUF1235). This family contains a number of viral proteins of unknown function.	261
399659	pfam06823	DUF1236	Protein of unknown function (DUF1236). This family contains a number of hypothetical bacterial proteins of unknown function. Some family members contain more than one copy of the region represented by this family.	64
399660	pfam06824	Glyco_hydro_125	Metal-independent alpha-mannosidase (GH125). This family, which contains bacterial and fungal glycoside hydrolases, is also known as GH125. They function as metal-independent alpha-mannosidases, with specificity for alpha-1,6-linked non-reducing terminal mannose residues. Structurally this family is part of the 6 hairpin glycosidase superfamily.	416
399661	pfam06825	HSBP1	Heat shock factor binding protein 1. Heat shock factor binding protein 1 (HSBP1) appears to be a negative regulator of the heat shock response.	51
284291	pfam06826	Asp-Al_Ex	Predicted Permease Membrane Region. This family represents five transmembrane helices that are normally found flanking (five either side) a pair of pfam02080 domains. This suggests that the paired regions form a ten helical structure, probably forming the pore, whereas the pfam02080 domain binds a ligand for export or regulation of the pore. Tetragenococcus halophilus aspT is described as an aspartate-alanine antiporter. In conjunction with aspD it forms a 'proton motive metabolic cycle catalyzed by an aspartate-alanine exchange'. The general conservation of domain architecture in this family suggests that they are functional orthologues.	167
399662	pfam06827	zf-FPG_IleRS	Zinc finger found in FPG and IleRS. This zinc binding domain is found at the C-terminus of isoleucyl tRNA synthetase and the enzyme Formamidopyrimidine-DNA glycosylase EC:3.2.2.23.	28
399663	pfam06830	Root_cap	Root cap. The cells at the periphery of the root cap are continuously sloughed off from the root into the mucilage, and are thought to be programmed to die. This family represents a conserved region approximately 60 residues in length within plant root cap proteins, which may be involved in the process.	57
399664	pfam06831	H2TH	Formamidopyrimidine-DNA glycosylase H2TH domain. Formamidopyrimidine-DNA glycosylase (Fpg) is a DNA repair enzyme that excises oxidized purines from damaged DNA. This family is the central domain containing the DNA-binding helix-two turn-helix domain.	89
399665	pfam06832	BiPBP_C	Penicillin-Binding Protein C-terminus Family. This conserved region of approximately 90 residues is found in a sub-group of bacterial Penicillin-Binding Proteins (PBPs). A variable length loop region separates this region from the transpeptidase unit (pfam00905). It is predicted by PROF to be an all beta fold.	90
399666	pfam06833	MdcE	Malonate decarboxylase gamma subunit (MdcE). This family consists of several bacterial malonate decarboxylase gamma subunit proteins. Malonate decarboxylase of Klebsiella pneumoniae consists of four different subunits and catalyzes the conversion of malonate plus H+ to acetate and CO2. The catalysis proceeds via acetyl and malonyl thioester residues with the phosphoribosyl-dephospho-CoA prosthetic group of the acyl carrier protein (ACP) subunit. MdcD and E together probably function as malonyl-S-ACP decarboxylase.	232
399667	pfam06834	TraU	TraU protein. This family consists of several bacterial TraU proteins. TraU appears to be more essential to conjugal DNA transfer than to assembly of pilus filaments.	306
399668	pfam06835	LptC	Lipopolysaccharide-assembly, LptC-related. This family consists of several related groups of proteins, one of which is the LptC family. LptC is involved in lipopolysaccharide-assembly on the outer membrane of Gram-negative organisms. The lipopolysaccharide component of the outer bacterial membrane is transported from its source of origin to the outer membrane by a set of proteins constituting a transport machinery that is made up of LptA, LptB, LptC, LptD, LptE. LptC is located on the inner membrane side of the intermembrane space.	176
336520	pfam06836	DUF1240	Protein of unknown function (DUF1240). This family consists of a number of hypothetical putative membrane proteins which seem to be specific to Yersinia pestis. The function of this family is unknown.	95
284300	pfam06837	Fijivirus_P9-2	Fijivirus P9-2 protein. This family consists of several Fijivirus specific P9-2 proteins from Rice black streaked dwarf virus (RBSDV) and Fiji disease virus. The function of this family is unknown.	207
399669	pfam06838	Met_gamma_lyase	Methionine gamma-lyase. This is a putative pyridoxal 5'-phosphate-dependent methionine gamma-lyase enzyme involved in methionine catabolism.	405
399670	pfam06839	zf-GRF	GRF zinc finger. This presumed zinc binding domain is found in a variety of DNA-binding proteins. It seems likely that this domain is involved in nucleic acid binding. It is named GRF after three conserved residues in the centre of the alignment of the domain. This zinc finger may be related to pfam01396.	45
399671	pfam06840	DUF1241	Protein of unknown function (DUF1241). This family consists of several programmed cell death 10 protein (PDCD10 or TFAR15) sequences. The function of this family is unknown.	150
399672	pfam06841	Phage_T4_gp19	T4-like virus tail tube protein gp19. This family consists of several tail tube protein gp19 sequences from the T4-like viruses. This family also contains bacterial members which suggest lateral transfer of genes.	134
399673	pfam06842	DUF1242	Protein of unknown function (DUF1242). This family consists of a number of eukaryotic proteins of around 72 residues in length. The function of this family is unknown.	35
399674	pfam06844	DUF1244	Protein of unknown function (DUF1244). This family consists of several short bacterial proteins of around 100 residues in length. The function of this family is unknown.	65
377719	pfam06847	Arc_PepC_II	Archaeal Peptidase A24 C-terminus Type II. This region is of unknown function but is found in some archaeal pfam01478. It is predicted to be of mixed alpha/beta secondary structure by Prof.	93
399675	pfam06848	Disaggr_repeat	Disaggregatase related repeat. This family consists of several repeats which seem to be specific to the Methanosarcina archaea species and are often found in multiple copies in disaggregatase proteins. Members of this family are also found in single copies in several hypothetical proteins. This repeat is also known as DNRLRE repeat and is predicted to form a mainly beta-strand structure with two alpha-helices [Adindla et al. Comparative and Functional Genomics 2004; 5:2-16]. It is found in some cell surface proteins.	179
399676	pfam06849	DUF1246	Protein of unknown function (DUF1246). This family represents the N-terminus of a number of hypothetical archaeal proteins of unknown function. This family is structurally related to the PreATP-grasp domain.	122
399677	pfam06850	PHB_depo_C	PHB de-polymerase C-terminus. This family represents the C-terminus of bacterial poly(3-hydroxybutyrate) (PHB) de-polymerase. This degrades PHB granules to oligomers and monomers of 3-hydroxy-butyric acid.	203
284311	pfam06851	DUF1247	Protein of unknown function (DUF1247). This family contains a number of hypothetical viral proteins of unknown function approximately 200 residues long.	149
369108	pfam06852	DUF1248	Protein of unknown function (DUF1248). This family represents a conserved region within a number of proteins of unknown function that seem to be specific to C. elegans. Note that some family members contain more than one copy of this region.	181
399678	pfam06853	DUF1249	Protein of unknown function (DUF1249). This family consists of several hypothetical bacterial proteins of around 150 residues in length. The function of this family is unknown.	116
399679	pfam06854	Phage_Gp15	Bacteriophage Gp15 protein. This family consists of bacteriophage Gp15 proteins and related bacterial sequences. The function of this family is unknown.	172
399680	pfam06855	YozE_SAM_like	YozE SAM-like fold. YozE-like is a family of Firmicute proteins that carries a four-helix motif similar to sterile alpha motif (SAM) domains. The family is suggested to fall into two subfamilies, possibly with differing functions based on the different surface charges on the three structural representatives, YozE, MW0776 and MW1311. This function is not yet known, although it is likely to involve binding to DNA.	66
369111	pfam06856	DUF1251	Protein of unknown function (DUF1251). This family consists of the N-terminal region of several hypothetical Nucleopolyhedrovirus proteins of unknown function.	121
399681	pfam06857	ACP	Malonate decarboxylase delta subunit (MdcD). This family consists of several bacterial malonate decarboxylase delta subunit (MdcD) proteins. Malonate decarboxylase of Klebsiella pneumoniae consists of four different subunits and catalyzes the conversion of malonate plus H+ to acetate and CO2. The catalysis proceeds via acetyl and malonyl thioester residues with the phosphoribosyl-dephospho-CoA prosthetic group of the acyl carrier protein (ACP) subunit. MdcC is the (apo) ACP subunit. The family also contains the CitD family of citrate lyase acyl carrier proteins.	83
399682	pfam06858	NOG1	Nucleolar GTP-binding protein 1 (NOG1). This family represents a conserved region of approximately 60 residues in length within nucleolar GTP-binding protein 1 (NOG1). In S. cerevisiae, the NOG1 gene has been shown to be essential for cell viability, suggesting that NOG1 may play an important role in nucleolar functions. Family members include eukaryotic, bacterial and archaeal proteins.	57
399683	pfam06859	Bin3	Bicoid-interacting protein 3 (Bin3). This family represents a conserved region of approximately 120 residues within eukaryotic Bicoid-interacting protein 3 (Bin3). Bin3, which shows similarity to a number of protein methyltransferases that modify RNA-binding proteins, interacts with Bicoid, which itself directs pattern formation in the early Drosophila embryo. The interaction might allow Bicoid to switch between its dual roles in transcription and translation. Note that family members contain a conserved HLN motif.	108
115513	pfam06861	BALF1	BALF1 protein. This family consists of several BALF1 proteins which seem to be specific to the Lymphocryptoviruses. BALF1 inhibits the antiapoptotic activity of EBV BHRF1 and of KSBcl-2.	184
369113	pfam06862	UTP25	Utp25, U3 small nucleolar RNA-associated SSU processome protein 25. UTP25 is a family of eukaryotic proteins. The family displays limited sequence similarity to DEAD-box RNA helicases, having alternative residues at the Walker A and DEAD-box sites, but conservation of structural and other key residues. The domain is required and sufficient for the interaction of Utp25 with Utp3. UTP25 interacts with nucleolar protein Nop19 in S. cerevisiae, and Nop19p is essential for the incorporation of Utp25p into pre-ribosomes.	471
399684	pfam06863	DUF1254	Protein of unknown function (DUF1254). This family represents a conserved region about 130 residues long within hypothetical proteins of unknown function. Family members include eukaryotic, bacterial and archaeal proteins.	131
399685	pfam06864	PAP_PilO	Pilin accessory protein (PilO). This family consists of several enterobacterial PilO proteins. The function of PilO is unknown although it has been suggested that it is a cytoplasmic protein in the absence of other Pil proteins, but PilO protein is translocated to the outer membrane in the presence of other Pil proteins. Alternatively, PilO protein may form a complex with other Pil protein(s). PilO has been predicted to function as a component of the pilin transport apparatus and thin-pilus basal body. This family does not seem to be related to pfam04350.	412
399686	pfam06865	DUF1255	Protein of unknown function (DUF1255). This family consists of several conserved hypothetical bacterial proteins of around 95 residues in length. The function of this family is unknown.	91
115518	pfam06866	DUF1256	Protein of unknown function (DUF1256). This family consists of several uncharacterized bacterial proteins which seem to be specific to the orders Clostridia and Bacillales. Family members are typically around 180 residues in length. The function of this family is unknown. These proteins are related to peptidase family M63 and so may be peptidases.	164
399687	pfam06868	DUF1257	Protein of unknown function (DUF1257). This family contains hypothetical proteins of unknown function that are approximately 120 residues long. Family members include eukaryotic and bacterial proteins.	103
369115	pfam06869	DUF1258	Protein of unknown function (DUF1258). This family represents a conserved region approximately 260 residues long within a number of hypothetical proteins of unknown function that seem to be specific to C. elegans. Note that this family contains a number of conserved cysteine and histidine residues.	250
399688	pfam06870	RNA_pol_I_A49	A49-like RNA polymerase I associated factor. Saccharomyces cerevisiae A49 is a specific subunit associated with RNA polymerase I (Pol I) in eukaryotes. Pol I maintains transcription activities in A49 deletion mutants. However, such mutants are deficient in transcription activity at low temperatures. Deletion analysis of the fission yeast homolog indicates that only the C-terminal two thirds are required for function. Transcript analysis has demonstrated that A49 maximises transcription of ribosomal DNA.	380
284325	pfam06871	TraH_2	TraH_2. This family consists of several TraH proteins which seem to be specific to Agrobacterium and Rhizobium species. This protein is thought to be involved in conjugal transfer but its function is unknown. This family does not appear to be related to pfam06122.	207
399689	pfam06872	EspG	EspG protein. This family consists of several EspG like proteins from Citrobacter rodentium and Escherichia coli. EspG is secreted by the type III secretory system and is translocated into host epithelial cells. EspG is homologous with Shigella flexneri protein VirA and can rescue invasion in a Shigella virA mutant, indicating that these proteins are functionally equivalent in Shigella. EspG plays an accessory but as yet undefined role in EPEC virulence that may involve intestinal colonisation.	351
284327	pfam06873	SerH	Cell surface immobilisation antigen SerH. This family consists of several cell surface immobilisation antigen SerH proteins which seem to be specific to Tetrahymena thermophila. The SerH locus of Tetrahymena thermophila is one of several paralogous loci with genes encoding variants of the major cell surface protein known as the immobilisation antigen (i-ag).	418
399690	pfam06874	FBPase_2	Firmicute fructose-1,6-bisphosphatase. This family consists of several bacterial fructose-1,6-bisphosphatase proteins (EC:3.1.3.11) which seem to be specific to phylum Firmicutes. Fructose-1,6-bisphosphatase (FBPase) is a well known enzyme involved in gluconeogenesis. This family does not seem to be structurally related to pfam00316.	638
115526	pfam06875	PRF	Plethodontid receptivity factor PRF. This family consists of several plethodontid receptivity factor (PRF) proteins which seem to be specific to Plethodon jordani (Jordan's salamander). PRF is a courtship pheromone produced by males to increase female receptivity.	214
369118	pfam06876	SCRL	Plant self-incompatibility response (SCRL) protein. This family consists of several Plant self-incompatibility response (SCRL) proteins. The male component of the self-incompatibility response in Brassica has been shown to be encoded by the S locus cysteine-rich gene (SCR). SCR is related, at the sequence level, to the pollen coat protein (PCP) gene family whose members encode small, cysteine-rich proteins located in the proteo-lipidic surface layer (tryphine) of Brassica pollen grains.	67
399691	pfam06877	RraB	Regulator of ribonuclease activity B. This family of proteins regulate mRNA abundance by binding to RNaseE and inhibiting its endonucleolytic activity. A subset of these proteins are predicted to function as immunity proteins.	97
399692	pfam06878	Pkip-1	Pkip-1 protein. This family consists of several Pkip-1 proteins which seem to be specific to Nucleopolyhedroviruses. The function of this family is unknown although it has been found that Pkip-1 is not essential for virus replication in cell culture or by in vivo intrahaemocoelic injection.	163
399693	pfam06880	DUF1262	Protein of unknown function (DUF1262). This family represents a conserved region within a number of proteins of unknown function that seem to be specific to Arabidopsis thaliana. Note that some family members contain more than one copy of this region.	101
399694	pfam06881	Elongin_A	RNA polymerase II transcription factor SIII (Elongin) subunit A. This family represents a conserved region within RNA polymerase II transcription factor SIII (Elongin) subunit A. In mammals, the Elongin complex activates elongation by RNA polymerase II by suppressing transient pausing of the polymerase at many sites within transcription units. Elongin is a heterotrimer composed of A, B, and C subunits of 110, 18, and 15 kilodaltons, respectively. Subunit A has been shown to function as the transcriptionally active component of Elongin.	105
399695	pfam06882	DUF1263	Protein of unknown function (DUF1263). This family represents a conserved region located towards the C-terminus of a number of proteins of unknown function that seem to be specific to Oryza sativa.	57
399696	pfam06883	RNA_pol_Rpa2_4	RNA polymerase I, Rpa2 specific domain. This domain is found between domain 3 (pfam04565) and domain 5 (pfam04565), but shows no homology to domain 4 of Rpb2. The external domains in multisubunit RNA polymerase (those most distant from the active site) are known to demonstrate more sequence variability.	53
399697	pfam06884	DUF1264	Protein of unknown function (DUF1264). This family contains a number of bacterial and eukaryotic proteins of unknown function that are approximately 200 residues long. Some family members are annotated as putative lipoproteins.	169
399698	pfam06886	TPX2	Targeting protein for Xklp2 (TPX2). This family represents a conserved region approximately 60 residues long within the eukaryotic targeting protein for Xklp2 (TPX2). Xklp2 is a kinesin-like protein localized on centrosomes throughout the cell cycle and on spindle pole microtubules during metaphase. In Xenopus, it has been shown that Xklp2 protein is required for centrosome separation and maintenance of spindle bi-polarity. TPX2 is a microtubule-associated protein that mediates the binding of the C-terminal domain of Xklp2 to microtubules. It is phosphorylated during mitosis in a microtubule-dependent way.	82
369125	pfam06887	DUF1265	Protein of unknown function (DUF1265). This family represents a conserved region approximately 50 residues long within a number of proteins of unknown function that seem to be restricted to C. elegans. The GO annotation for this protein indicates that it is involved in nematode larval development and in positive regulation of growth rate.	47
284339	pfam06888	Put_Phosphatase	Putative Phosphatase. This family contains a number of putative eukaryotic acid phosphatases. Some family members represent the products of the PSI14 phosphatase family in Lycopersicon esculentum (Tomato).	234
399699	pfam06889	DUF1266	Protein of unknown function (DUF1266). This family consists of several hypothetical bacterial proteins of around 235 residues in length. Members of this family seem to be found exclusively in the Enterobacteria Salmonella typhimurium and Escherichia coli. The function of this family is unknown.	174
399700	pfam06890	Phage_Mu_Gp45	Bacteriophage Mu Gp45 protein. This family consists of Bacteriophage Mu Gp45 related proteins from both phages and bacteria. The function of this family is unknown although it has been suggested that family members may be involved in baseplate assembly.	68
399701	pfam06891	P2_Phage_GpR	P2 phage tail completion protein R (GpR). This family consists of P2 phage tail completion protein R (GpR) like sequences. GpR is thought to be a tail completion protein which is essential for stable head joining.	131
399702	pfam06892	Phage_CP76	Phage regulatory protein CII (CP76). This family consists of several phage regulatory protein CII (CP76) sequences which are thought to be DNA binding proteins which are involved in the establishment of lysogeny.	155
284344	pfam06894	Phage_TAC_2	Bacteriophage lambda tail assembly chaperone, TAC, protein G. This family consists of Bacteriophage lambda minor tail protein G and related sequences. The construction of phage tails involves a stage of tail-tube formation, and tail-tube polymerization requires two additional proteins, gpG and gpGT. The open reading frames, ORFs, for gpG and gpGT are overlapping and are related by a programmed translational frameshift. During virion morphogenesis, gpG is expressed in large amounts, and about 3.5% of the time, a -1 translational frameshift leads to the production of the larger fusion protein, gpGT. The correct ratio of gpG to gpGT, as determined by the frequency of frameshifting, is crucial for tail assembly. Since gpG accumulates to high levels during a lambda infection and yet is not found in mature phage particles it is believed to act as a chaperone.	126
311073	pfam06896	Phage_TAC_3	Phage tail assembly chaperone proteins, TAC. This is a family of phage tail tube assembly chaperone proteins from some Siphoviridae viruses.	115
399703	pfam06897	DUF1269	Protein of unknown function (DUF1269). This family consists of several bacterial and archaeal proteins of around 200 residues in length. The function of this family is unknown. The family carries a repeated glycine-zipper sequence motif, GxxxGxxxG, where the x following the G is frequently found to be an alanine. As glycine-zippers occur in membrane proteins, this family is likely to be found spanning a membrane.	99
399704	pfam06898	YqfD	Putative stage IV sporulation protein YqfD. This family consists of several putative bacterial stage IV sporulation (SpoIV) proteins. YqfD of Bacillus subtilis is known to be essential for efficient sporulation although its exact function is unknown.	379
399705	pfam06899	WzyE	WzyE protein, O-antigen assembly polymerase. This family consists of several WzyE proteins which appear to be specific to Enterobacteria. Members of this family are described as putative ECA polymerases, but this has been found to be incorrect. The function of this family is unknown. The family is a transmembrane family with up to 11 TM regions, and is necessary for the assembly of O-antigen lipopolysaccharide.	446
284349	pfam06900	DUF1270	Protein of unknown function (DUF1270). This family consists of several hypothetical Staphylococcus aureus and phage proteins of 53 residues in length. The function of this family is unknown.	53
399706	pfam06901	FrpC	RTX iron-regulated protein FrpC. This family consists of several RTX iron-regulated FrpC proteins which appear to be found exclusively in Neisseria meningitidis. FrpC has been shown to be related to the RTX family of bacterial cytotoxins. FrpC is found in the meningococcal outer membrane. The function of this family is unknown although it is thought to be a virulence factor.	228
399707	pfam06902	Fer4_19	Divergent 4Fe-4S mono-cluster. Members of this family contain three highly conserved cysteine residues. This family includes proteins containing divergent domains which are most likely to bind to iron-sulfur clusters.	64
399708	pfam06903	VirK	VirK protein. This family consists of several bacterial VirK proteins of around 145 residues in length. The function of this family is unknown.	98
399709	pfam06904	Extensin-like_C	Extensin-like protein C-terminus. This family represents the C-terminus (approx. 120 residues) of a number of bacterial extensin-like proteins. Extensins are cell wall glycoproteins normally associated with plants, where they strengthen the cell wall in response to mechanical stress. Note that many family members of this family are hypothetical.	176
399710	pfam06905	FAIM1	Fas apoptotic inhibitory molecule (FAIM1). This family consists of several fas apoptotic inhibitory molecule (FAIM1) proteins. FAIM expression is upregulated in B cells by anti-Ig treatment that induces Fas-resistance, and overexpression of FAIM diminishes sensitivity to Fas-mediated apoptosis of B and non-B cell lines. FAIM1 is highly evolutionarily conserved and is widely expressed in murine tissues, suggesting that FAIM plays an important role in cellular physiology.	174
399711	pfam06906	DUF1272	Protein of unknown function (DUF1272). This family consists of several hypothetical bacterial proteins of around 80 residues in length. This family contains a number of conserved cysteine residues and its function is unknown.	57
399712	pfam06907	Latexin	Latexin. This family consists of several animal specific latexin proteins. Latexin is a carboxypeptidase A inhibitor and is expressed in a cell type-specific manner in both central and peripheral nervous systems in the rat.	216
399713	pfam06908	DUF1273	Protein of unknown function (DUF1273). This family consists of several hypothetical bacterial proteins of around 180 residues in length. The function of this family is unknown.	168
399714	pfam06910	MEA1	Male enhanced antigen 1 (MEA1). This family consists of several mammalian male enhanced antigen 1 (MEA1) proteins. The Mea-1 gene is found to be localized in primary and secondary spermatocytes and spermatids, but the protein products are detected only in spermatids. Intensive transcription of the Mea-1 gene and specific localization of the gene product suggest that Mea-1 may play an important role in the late stage of spermatogenesis.	128
399715	pfam06911	Senescence	Senescence-associated protein. This family contains a number of plant senescence-associated proteins of approximately 450 residues in length. In Hemerocallis, petals have a genetically based program that leads to senescence and cell death approximately 24 hours after the flower opens, and it is believed that senescence proteins produced around that time have a role in this program. This family extends to the higher vertebrates where the full-length protein is often a Spartin, associated with mitochondrial membranes and transportation along microtubules.	179
399716	pfam06912	DUF1275	Protein of unknown function (DUF1275). This family consists of several hypothetical bacterial proteins of around 200 residues in length. The function of this family is unknown although most members have 6 TM regions, and may be putative permeases.	202
369134	pfam06916	DUF1279	Protein of unknown function (DUF1279). This family represents the C-terminus (approx. 120 residues) of a number of eukaryotic proteins of unknown function.	88
369135	pfam06917	Pectate_lyase_2	Periplasmic pectate lyase. This family consists of several Enterobacterial periplasmic pectate lyase proteins (EC:4.2.2.2). A major virulence determinant of the plant-pathogenic enterobacterium Erwinia chrysanthemi is the production of pectate lyase enzymes that degrade plant cell walls.	556
369136	pfam06918	DUF1280	Protein of unknown function (DUF1280). This family represents a conserved region approximately 200 residues long within a number of proteins of unknown function that seem to be specific to C. elegans.	219
284364	pfam06919	Phage_T4_Gp30_7	Phage Gp30.7 protein. This family consists of several phage Gp30.7 proteins of 121 residues in length. Family members seem to be exclusively from the T4-like viruses. The function of this family is unknown.	121
399717	pfam06920	DHR-2	Dock homology region 2. This family represents a conserved region within a number of eukaryotic dedicator of cytokinesis proteins. These are potential guanine nucleotide exchange factors, which activate some small GTPases by exchanging bound GDP for free GTP. This region interacts with RAC1 and ELMO1.	489
115570	pfam06922	CTV_P13	Citrus tristeza virus P13 protein. This family consists of several Citrus tristeza virus (CTV) P13 13-kDa proteins. Citrus tristeza virus (CTV), a member of the closterovirus group, is one of the more complex single-stranded RNA viruses. The function of this family is unknown.	119
399718	pfam06923	GutM	Glucitol operon activator protein (GutM). This family consists of several glucitol operon activator (GutM) proteins. Expression of the glucitol (gut) operon in Escherichia coli is regulated by an unusual, complex system which consists of an activator (encoded by the gutM gene) and a repressor (encoded by the gutR gene) in addition to the cAMP-CRP complex (CRP, cAMP receptor protein). Synthesis of the mRNA, which initiates at the promoter specific to the gutR gene, occurs within the gutM gene. Expressional control of the gut operon appears to occur as a consequence of the antagonistic action of the products of the autogenously regulated gutM and gutR genes.	105
399719	pfam06924	DUF1281	Protein of unknown function (DUF1281). This family consists of several hypothetical enterobacterial proteins of around 170 residues in length. Members of this family are found in Escherichia coli, Salmonella typhimurium and Shigella species. The function of this family is unknown.	179
284368	pfam06925	MGDG_synth	Monogalactosyldiacylglycerol (MGDG) synthase. This family represents a conserved region of approximately 180 residues within plant and bacterial monogalactosyldiacylglycerol (MGDG) synthase (EC:2.4.1.46). In Arabidopsis, there are two types of MGDG synthase which differ in their N-terminal portion: type A and type B.	169
284369	pfam06926	Rep_Org_C	Putative replisome organizer protein C-terminus. This family represents the C-terminus (approximately 100 residues) of a putative replisome organizer protein in Lactococcus bacteriophages.	95
399720	pfam06929	Rotavirus_VP3	Rotavirus VP3 protein. This family consists of several Rotavirus specific VP3 proteins. VP3 is known to be a viral guanylyltransferase and is thought to possess methyltransferase activity; VP3 is therefore a predicted multifunctional capping enzyme.	687
369140	pfam06930	DUF1282	Protein of unknown function (DUF1282). This family consists of several hypothetical proteins of around 200 residues in length. The function of this family is unknown although a number of family members are thought to be putative membrane proteins.	172
284372	pfam06931	Adeno_E4_ORF3	Mastadenovirus E4 ORF3 protein. This family consists of several Mastadenovirus E4 ORF3 proteins. Early proteins E4 ORF3 and E4 ORF6 have complementary functions during viral infection. Both proteins facilitate efficient viral DNA replication, late protein expression, and prevention of concatenation of viral genomes. A unique function of E4 ORF3 is the reorganisation of nuclear structures known as PML oncogenic domains (PODs). The function of these domains is unclear, but PODs have been implicated in a number of important cellular processes, including transcriptional regulation, apoptosis, transformation, and response to interferon.	113
399721	pfam06932	DUF1283	Protein of unknown function (DUF1283). This family consists of several hypothetical proteins of around 115 residues in length which seem to be specific to Enterobacteria. The function of the family is unknown.	74
115579	pfam06933	SSP160	Special lobe-specific silk protein SSP160. This family consists of several special lobe-specific silk protein SSP160 sequences which appear to be specific to Chironomus (Midge) species.	758
399722	pfam06934	CTI	Fatty acid cis/trans isomerase (CTI). This family consists of several fatty acid cis/trans isomerase proteins which appear to be found exclusively in bacteria of the orders Vibrionales and Pseudomonadales. Cis/trans isomerase (CTI) catalyzes the cis-trans isomerisation of esterified fatty acids in phospholipids, mainly cis-oleic acid (C(16:1,9)) and cis-vaccenic acid (C(18:1,11)), in response to solvents. The CTI protein has been shown to be involved in solvent resistance in Pseudomonas putida.	692
399723	pfam06935	DUF1284	Protein of unknown function (DUF1284). This family consists of several hypothetical bacterial and archaeal proteins of around 130 residues in length. The function of this family is unknown, although it is thought that they may be iron-sulphur binding proteins.	102
369143	pfam06936	Selenoprotein_S	Selenoprotein S (SelS). This family consists of several mammalian selenoprotein S (SelS) sequences. SelS is a plasma membrane protein and is present in a variety of tissues and cell types. Selenoprotein S (SelS) is an intrinsically disordered protein. It forms a selenosulfide bond between Cys 174 and Sec 188 that has a redox potential of -234 mV. In vitro, SelS is an efficient reductase that depends on the presence of selenocysteine. Due to its high reactivity, SelS also has peroxidase activity that can catalyze the reduction of hydrogen peroxide. It is also resistant to inactivation by hydrogen peroxide, which might provide an evolutionary advantage compared to cysteine-containing peroxidases.	192
284377	pfam06937	EURL	EURL protein. This family consists of several animal EURL proteins. EURL is preferentially expressed in chick retinal precursor cells as well as in the anterior epithelial cells of the lens at early stages of development. EURL transcripts are found primarily in the peripheral dorsal retina, i.e., the most undifferentiated part of the dorsal retina. EURL transcripts are also detected in the lens at stage 18 and remain abundant in the proliferating epithelial cells of the lens until at least day 11. The distribution pattern of EURL in the developing retina and lens suggest a role before the events leading to cell determination and differentiation.	283
399724	pfam06938	DUF1285	Protein of unknown function (DUF1285). This family consists of several hypothetical bacterial proteins of around 200 residues in length. The function of this family is unknown. The structures revealed a conserved core with domain duplication and a superficial similarity of the C-terminal domain to pleckstrin homology-like folds. The conservation of the domain interface indicates a potential binding site that is likely to involve a nucleotide-based ligand, with genome-context and gene-fusion analyses additionally supporting a role for this family in signal transduction, possibly during oxidative stress.	145
311095	pfam06939	DUF1286	Protein of unknown function (DUF1286). This family consists of several hypothetical archaeal proteins of around 120 residues in length. All members of this family seem to be Sulfolobus species specific. The function of this family is unknown.	111
399725	pfam06940	DUF1287	Domain of unknown function (DUF1287). This family consists of several hypothetical bacterial proteins of around 200 residues in length. The function of this family is unknown. This family is related to pfam00877.	163
284381	pfam06941	NT5C	5' nucleotidase, deoxy (Pyrimidine), cytosolic type C protein (NT5C). This family consists of several 5' nucleotidase, deoxy (Pyrimidine), cytosolic type C (NT5C) proteins. 5'(3')-Deoxyribonucleotidase is a ubiquitous enzyme in mammalian cells whose physiological function is not known.	180
399726	pfam06942	GlpM	GlpM protein. This family consists of several bacterial GlpM membrane proteins. GlpM is a hydrophobic protein containing 109 amino acids. It is thought that GlpM may play a role in alginate biosynthesis in Pseudomonas aeruginosa.	107
399727	pfam06943	zf-LSD1	LSD1 zinc finger. This family consists of several plant specific LSD1 zinc finger domains. Arabidopsis lsd1 mutants are hyper-responsive to cell death initiators and fail to limit the extent of cell death. Superoxide is a necessary and sufficient signal for cell death propagation. LSD1 monitors a superoxide-dependent signal and negatively regulates a plant cell death pathway. LSD1 protein contains three zinc finger domains, defined by CxxCxRxxLMYxxGASxVxCxxC. It has been suggested that LSD1 defines a zinc finger protein subclass and that LSD1 regulates transcription, via either repression of a pro-death pathway or activation of an anti-death pathway, in response to signals emanating from cells undergoing pathogen-induced hypersensitive cell death.	25
399728	pfam06945	DUF1289	Protein of unknown function (DUF1289). This family consists of a number of hypothetical bacterial proteins. The aligned region spans around 56 residues and contains 4 highly conserved cysteine residues towards the N-terminus. The function of this family is unknown. Structural modelling suggests this domain may bind nucleic acids.	47
311099	pfam06946	Phage_holin_5_1	Bacteriophage A118-like holin, Hol118. This family consists of several Listeria bacteriophage holin proteins and related bacterial sequences. Holins are a diverse family of proteins that cause bacterial membrane lysis during late-protein synthesis. It is thought that the temporal precision of holin-mediated lysis may occur through the build up of a holin oligomer which causes the lysis.	92
399729	pfam06947	DUF1290	Protein of unknown function (DUF1290). This family consists of several bacterial small basic proteins of around 100 residues in length. The function of this family is unknown.	86
399730	pfam06949	DUF1292	Protein of unknown function (DUF1292). This family consists of several hypothetical bacterial proteins of around 90 residues in length. The function of this family is unknown.	77
284388	pfam06950	DUF1293	Protein of unknown function (DUF1293). This family consists of several bacterial and phage proteins of around 115 residues in length. The function of this family is unknown.	115
399731	pfam06951	PLA2G12	Group XII secretory phospholipase A2 precursor (PLA2G12). This family consists of several group XII secretory phospholipase A2 precursor (PLA2G12) (EC:3.1.1.4) proteins. Group XII and group V PLA(2)s are thought to participate in helper T cell immune response through release of immediate second signals and generation of downstream eicosanoids.	186
369145	pfam06952	PsiA	PsiA protein. This family consists of several Enterobacterial PsiA proteins. The function of PsiA is unknown although it is thought that it may affect the generation of an SOS signal in Escherichia coli.	237
399732	pfam06953	ArsD	Arsenical resistance operon trans-acting repressor ArsD. This family consists of several bacterial arsenical resistance operon trans-acting repressor ArsD proteins. ArsD is a trans-acting repressor of the arsRDABC operon that confers resistance to arsenicals and antimonials in Escherichia coli. It possesses two-pairs of vicinal cysteine residues, Cys(12)-Cys(13) and Cys(112)-Cys(113), that potentially form separate binding sites for the metalloids that trigger dissociation of ArsD from the operon. However, as a homodimer it has four vicinal cysteine pairs.	120
399733	pfam06954	Resistin	Resistin. This family consists of several mammalian resistin proteins. Resistin is a 12.5-kDa cysteine-rich secreted polypeptide first reported from rodent adipocytes. It belongs to a multigene family termed RELMs or FIZZ proteins. Plasma resistin levels are significantly increased in both genetically susceptible and high-fat-diet-induced obese mice. Immunoneutralisation of resistin improves hyperglycemia and insulin resistance in high-fat-diet-induced obese mice, while administration of recombinant resistin impairs glucose tolerance and insulin action in normal mice. It has been demonstrated that increases in circulating resistin levels markedly stimulate glucose production in the presence of fixed physiological insulin levels, whereas insulin suppressed resistin expression. It has been suggested that resistin could be a link between obesity and type 2 diabetes.	85
399734	pfam06955	XET_C	Xyloglucan endo-transglycosylase (XET) C-terminus. This family represents the C-terminus (approximately 60 residues) of plant xyloglucan endo-transglycosylase (XET). Xyloglucan is the predominant hemicellulose in the cell walls of most dicotyledons. With cellulose, it forms a network that strengthens the cell wall. XET catalyzes the splitting of xyloglucan chains and the linking of the newly generated reducing end to the non-reducing end of another xyloglucan chain, thereby loosening the cell wall. Note that all family members contain the pfam00722 domain.	48
399735	pfam06956	RtcR	Regulator of RNA terminal phosphate cyclase. RtcR is a sigma54-dependent enhancer binding protein that activates transcription of the rtcBA operon. The product of the rtcA gene is an RNA 3'-terminal phosphate cyclase. This domain is found at the N-terminus of the RtcR sequence. RtcR, and other sigma54-dependent activators, contain pfam00158 in the central region of the protein sequence.	183
399736	pfam06957	COPI_C	Coatomer (COPI) alpha subunit C-terminus. This family represents the C-terminus (approximately 500 residues) of the eukaryotic coatomer alpha subunit. Coatomer (COPI) is a large cytosolic protein complex which forms a coat around vesicles budding from the Golgi apparatus. Such coatomer-coated vesicles have been proposed to play a role in many distinct steps of intracellular transport. Note that many family members also contain the pfam04053 domain.	403
399737	pfam06958	Pyocin_S	S-type Pyocin. This family represents a conserved region approximately 180 residues long within bacterial S-type pyocins. Pyocins are polypeptide toxins produced by, and active against, bacteria. S-type pyocins cause cell death by DNA breakdown due to endonuclease activity.	139
399738	pfam06959	RecQ5	RecQ helicase protein-like 5 (RecQ5). This family represents a conserved region approximately 200 residues long within eukaryotic RecQ helicase protein-like 5 (RecQ5). The RecQ helicases have been implicated in DNA repair and recombination, and RecQ5 may have an important role in DNA metabolism.	202
399739	pfam06961	DUF1294	Protein of unknown function (DUF1294). This family includes a number of hypothetical bacterial and archaeal proteins of unknown function.	55
399740	pfam06962	rRNA_methylase	Putative rRNA methylase. This family contains a number of putative rRNA methylases. Note that many family members are hypothetical proteins.	135
311113	pfam06963	FPN1	Ferroportin1 (FPN1). This family represents a conserved region approximately 100 residues long within eukaryotic Ferroportin1 (FPN1), a protein that may play a role in iron export from the cell. This family may represent a number of transmembrane regions in Ferroportin1.	430
399741	pfam06964	Alpha-L-AF_C	Alpha-L-arabinofuranosidase C-terminal domain. This family represents the C-terminus (approximately 200 residues) of bacterial and eukaryotic alpha-L-arabinofuranosidase (EC:3.2.1.55). This catalyzes the hydrolysis of nonreducing terminal alpha-L-arabinofuranosidic linkages in L-arabinose-containing polysaccharides.	192
399742	pfam06965	Na_H_antiport_1	Na+/H+ antiporter 1. This family contains a number of bacterial Na+/H+ antiporter 1 proteins. These are integral membrane proteins that catalyze the exchange of H+ for Na+ in a manner that is highly dependent on the pH.	378
399743	pfam06966	DUF1295	Protein of unknown function (DUF1295). This family contains a number of bacterial and eukaryotic proteins of unknown function that are approximately 300 residues long.	235
399744	pfam06967	Mo-nitro_C	Mo-dependent nitrogenase C-terminus. This family represents the C-terminus (approximately 80 residues) of a number of bacterial Mo-dependent nitrogenases. These are involved in nitrogen fixation in cyanobacteria. Note that many family members are hypothetical proteins.	83
399745	pfam06968	BATS	Biotin and Thiamin Synthesis associated domain. Biotin synthase (BioB), EC:2.8.1.6, catalyzes the last step of the biotin biosynthetic pathway. The reaction consists of the introduction of a sulphur atom into dethiobiotin. BioB functions as a homodimer. Thiamin synthesis is a complex process involving at least six gene products (ThiFSGH, ThiI and ThiJ). Two of the proteins required for the biosynthesis of the thiazole moiety of thiamine (vitamin B(1)) are ThiG and ThiH (this family), which form a heterodimer. Both of these reactions are thought to involve the binding of co-factors, and both function as dimers. This domain therefore may be involved in co-factor binding or dimerization (Finn, RD personal observation).	85
399746	pfam06969	HemN_C	HemN C-terminal domain. Members of this family are all oxygen-independent coproporphyrinogen-III oxidases (HemN). This enzyme catalyzes the oxygen-independent conversion of coproporphyrinogen-III to protoporphyrinogen-IX, one of the last steps in haem biosynthesis. The function of this domain is unclear, but comparison to other proteins containing a radical SAM domain (pfam04055) suggest it may be a substrate binding domain.	66
399747	pfam06970	RepA_N	Replication initiator protein A (RepA) N-terminus. This family of predicted proteins represents the N-terminus (approximately 80 residues) of replication initiator protein A (RepA), a DNA replication initiator in plasmids. Most proteins in this family are bacterial, but archaeal and eukaryotic members are also included.	76
399748	pfam06971	Put_DNA-bind_N	Putative DNA-binding protein N-terminus. This family represents the N-terminus (approximately 50 residues) of a number of putative bacterial DNA-binding proteins.	49
399749	pfam06972	DUF1296	Protein of unknown function (DUF1296). This family represents a conserved region approximately 60 residues long within a number of plant proteins of unknown function. Structural modelling suggests this domain may bind nucleic acids.	60
399750	pfam06973	DUF1297	Domain of unknown function (DUF1297). This family represents the C-terminus (approximately 200 residues) of a number of archaeal proteins of unknown function. One member is annotated as being a possible carboligase enzyme.	188
399751	pfam06974	DUF1298	Protein of unknown function (DUF1298). This family represents the C-terminus (approximately 170 residues) of a number of hypothetical plant proteins of unknown function.	144
115620	pfam06975	DUF1299	Protein of unknown function (DUF1299). This family represents a conserved region approximately 50 residues long within a number of proteins of unknown function that seem to be specific to Arabidopsis thaliana. Note that many family members contain multiple copies of this region.	47
399752	pfam06977	SdiA-regulated	SdiA-regulated. This family represents a conserved region within a number of hypothetical bacterial proteins that may be regulated by SdiA, a member of the LuxR family of transcriptional regulators. Some family members contain the pfam01436 repeat.	249
399753	pfam06978	POP1	Ribonucleases P/MRP protein subunit POP1. This family represents a conserved region approximately 150 residues long located towards the N-terminus of the POP1 subunit that is common to both the RNase MRP and RNase P ribonucleoproteins (EC:3.1.26.5). These RNA-containing enzymes generate mature tRNA molecules by cleaving their 5' ends.	211
399754	pfam06979	TMEM70	Assembly, mitochondrial proton-transport ATP synth complex. TMEM70 is a family of proteins essential for assembly of the mitochondrial proton-transporting ATP synthase complex within the inner mitochondrial membrane.	132
399755	pfam06980	DUF1302	Protein of unknown function (DUF1302). This family contains a number of hypothetical bacterial proteins of unknown function that are approximately 600 residues long. Most family members seem to be from Pseudomonas.	569
399756	pfam06983	3-dmu-9_3-mt	3-demethylubiquinone-9 3-methyltransferase. This family represents a conserved region approximately 100 residues long within a number of bacterial and archaeal 3-demethylubiquinone-9 3-methyltransferases (EC:2.1.1.64). Note that some family members contain more than one copy of this region, and that many members are hypothetical proteins.	116
369158	pfam06984	MRP-L47	Mitochondrial 39-S ribosomal protein L47 (MRP-L47). This family represents the N-terminal region (approximately 8 residues) of the eukaryotic mitochondrial 39-S ribosomal protein L47 (MRP-L47). Mitochondrial ribosomal proteins (MRPs) are the counterparts of the cytoplasmic ribosomal proteins, in that they fulfil similar functions in protein biosynthesis. However, they are distinct in number, features and primary structure.	86
369159	pfam06985	HET	Heterokaryon incompatibility protein (HET). This family represents a conserved region approximately 150 residues long within various heterokaryon incompatibility proteins that seem to be restricted to ascomycete fungi. Genetic differences in specific het genes prevent a viable heterokaryotic fungal cell from being formed by the fusion of filaments from two different wild-type strains. Many family members also contain the pfam00400 repeat and the pfam05729 domain.	146
399757	pfam06986	TraN	Type-IV conjugative transfer system mating pair stabilisation. TraN is a large cysteine-rich outer membrane protein involved in the mating-pair stabilisation (adhesin) component of the F-type conjugative plasmid transfer system. TraN is believed to interact with the core type IV secretion system apparatus through the TraV protein.	239
399758	pfam06988	NifT	NifT/FixU protein. This family consists of several NifT/FixU bacterial proteins. NifT/FixU is a very small, conserved protein that is found in nif clusters; however, its function is unknown. It is thought that the protein may be involved in biosynthesis of the FeMo cofactor of nitrogenase, although perturbation of nifT expression in K. pneumoniae has only a limited effect on nitrogen fixation.	64
369161	pfam06989	BAALC_N	BAALC N-terminus. This family represents the N-terminal region of the mammalian BAALC proteins. BAALC (brain and acute leukaemia, cytoplasmic) is highly conserved among mammals but evidently absent from lower organisms. Two isoforms are specifically expressed in neuroectoderm-derived tissues, but not in tumors or cancer cell lines of non-neural tissue origin. It has been shown that blasts from a subset of patients with acute leukaemia greatly overexpress eight different BAALC transcripts, resulting in five protein isoforms. Among patients with acute myeloid leukaemia, those overexpressing BAALC show distinctly poor prognosis, pointing to a key role of the BAALC products in leukaemia. It has been suggested that BAALC is a gene implicated in both neuroectodermal and hematopoietic cell functions.	50
254007	pfam06990	Gal-3-0_sulfotr	Galactose-3-O-sulfotransferase. This family consists of several mammalian galactose-3-O-sulfotransferase proteins. Gal-3-O-sulfotransferase is thought to play a critical role in 3'-sulfation of N-acetyllactosamine in both O- and N-glycans.	400
399759	pfam06991	MFAP1	Microfibril-associated/Pre-mRNA processing. MFAP1 was first named for proteins associated with microfibrils, which are an important component of the extracellular matrix (ECM) of many tissues. For example, MFAP1 has been shown to be associated with elastin-like fibers at the base of Schlemm's canal endothelium cells, in the juxtacanalicular tissue, and in the uveal region. Based on its role in the ECM and the proximity of the MFAP1 gene to FBN1, it was hypothesized that mutations in MFAP1 contributed to heritable diseases affecting microfibrils, such as Marfan syndrome, but this has now been shown not to be the case. MFAP1 has also been shown to interact directly with certain pre-mRNA processing factor proteins, Prps, which are also spliceosome components, and is thus required for pre-mRNA processing. MFAP1 bound to Pr38 of yeast is necessary for cells in vivo to progress from G2 to M phase.	214
399760	pfam06992	Phage_lambda_P	Replication protein P. This family consists of several Bacteriophage lambda replication protein P like proteins. The bacteriophage lambda P protein promotes replication of the phage chromosome by recruiting a key component of the cellular replication machinery to the viral origin. Specifically, P protein delivers one or more molecules of Escherichia coli DnaB helicase to a nucleoprotein structure formed by the lambda O initiator at the lambda replication origin.	162
399761	pfam06993	DUF1304	Protein of unknown function (DUF1304). This family consists of several hypothetical bacterial proteins of around 120 residues in length. The function of this family is unknown.	110
399762	pfam06994	Involucrin2	Involucrin. This family represents a conserved region approximately 60 residues long, multiple copies of which are found within eukaryotic involucrin, and which is rich in glutamine and glutamic acid residues. Involucrin forms part of the insoluble cornified cell envelope (a specialized protective barrier) of stratified squamous epithelia. Members of this family seem to be restricted to mammals.	41
399763	pfam06995	Phage_P2_GpU	Phage P2 GpU. This family consists of several bacterial and phage proteins of around 130 residues in length which seem to be related to the bacteriophage P2 GpU protein, which is thought to be involved in tail assembly.	120
399764	pfam06996	T6SS_TssG	Type VI secretion, TssG. This is a family of Gram-negative bacterial proteins that form part of the type VI pathogenicity secretion system (T6SS), including TssG. TssG is homologous to phage tail proteins and is required for proper assembly of the Hcp tube in bacteria. One other member in this family, SciB (Q93IT4) from Salmonella enterica, is thought to be involved in virulence.	303
399765	pfam06998	DUF1307	Protein of unknown function (DUF1307). This family consists of several hypothetical bacterial proteins of around 150 residues in length. Some family members are described as putative lipoproteins but the function of the family is unknown.	114
399766	pfam06999	Suc_Fer-like	Sucrase/ferredoxin-like. This family contains a number of bacterial and eukaryotic proteins approximately 400 residues long that resemble ferredoxin and appear to have sucrolytic activity.	217
399767	pfam07000	DUF1308	Protein of unknown function (DUF1308). This family consists of several hypothetical eukaryotic sequences of around 400 residues in length. The function of this family is unknown.	163
399768	pfam07001	BAT2_N	BAT2 N-terminus. This family represents the N-terminus (approximately 200 residues) of the proline-rich protein BAT2. BAT2 is similar to other proteins with large proline-rich domains, such as some nuclear proteins, collagens, elastin, and synapsin.	189
284432	pfam07002	Copine	Copine. This family represents a conserved region approximately 220 residues long within eukaryotic copines. Copines are Ca(2+)-dependent phospholipid-binding proteins that are thought to be involved in membrane-trafficking, and may also be involved in cell division and growth.	218
311141	pfam07004	SHIPPO-rpt	Sperm-tail PG-rich repeat. This family represents a short conserved region carrying a PGP motif that is repeated in eukaryotic proteins of sperm-tails. Shippo orthologues from some species may include up to 40 Pro-Gly-Pro repeats.	33
399769	pfam07005	DUF1537	Putative sugar-binding N-terminal domain. This conserved region is found in proteins of unknown function in a range of Proteobacteria as well as the Gram-positive Oceanobacillus iheyensis. Structural analysis of the whole protein indicates the N- and C-termini interacting to produce an interacting surface into which a threonate-ADP complex is bound, suggesting that a sugar binding site is on the N-terminal domain here, and a nucleotide binding site is in the C-terminal domain (manuscript in preparation). There is a critical motif, DDXTG, at approximately residues 22-25.	230
399770	pfam07006	DUF1310	Protein of unknown function (DUF1310). This family consists of several hypothetical proteins of around 125 residues in length. Members of this family seem to be specific to Listeria and Streptococcus species. The function of this family is unknown.	116
399771	pfam07007	LprI	Lysozyme inhibitor LprI. This family consists of several bacterial proteins of around 120 residues in length. Members of this family contain four highly conserved cysteine residues. Family members include lipoprotein LprI from Mycobacterium, which binds to and inhibits macrophage lysozyme, which may aid bacterial survival.	103
399772	pfam07009	NusG_II	NusG domain II. This domain is found in some NusG proteins where it forms domain II. However most NusG proteins are missing this domain. In other cases this domain is found in isolation. The function of this domain is unknown.	107
399773	pfam07010	Endomucin	Endomucin. This family consists of several mammalian endomucin proteins. Endomucin is an early endothelial-specific antigen that is also expressed on putative hematopoietic progenitor cells.	260
399774	pfam07011	DUF1313	Protein of unknown function (DUF1313). This family consists of several hypothetical plant proteins of around 100 residues in length. The function of this family is unknown.	83
399775	pfam07012	Curlin_rpt	Curlin associated repeat. This family consists of several bacterial repeats of around 30 residues in length. These repeats are often found in multiple copies in the curlin proteins CsgA and CsgB. Curli fibers are thin aggregative surface fibers, connected with adhesion, which bind laminin, fibronectin, plasminogen, human contact phase proteins, and major histocompatibility complex (MHC) class I molecules. Curli fibers are coded for by the csg gene cluster, which is comprised of two divergently transcribed operons. One operon encodes the csgB, csgA, and csgC genes, while the other encodes csgD, csgE, csgF, and csgG. The assembly of the fibers is unique and involves extracellular self-assembly of the curlin subunit (CsgA), dependent on a specific nucleator protein (CsgB). CsgD is a transcriptional activator essential for expression of the two curli fibre operons, and CsgG is an outer membrane lipoprotein involved in extracellular stabilisation of CsgA and CsgB.	34
284441	pfam07013	DUF1314	Protein of unknown function (DUF1314). This family consists of several Alphaherpesvirus proteins of around 200 residues in length. The function of this family is unknown.	197
399776	pfam07014	Hs1pro-1_C	Hs1pro-1 protein C-terminus. This family represents the C-terminus (approximately 270 residues) of a number of plant Hs1pro-1 proteins, which are believed to confer nematode resistance.	258
148565	pfam07015	VirC1	VirC1 protein. This family consists of several bacterial VirC1 proteins. In Agrobacterium tumefaciens, a cis-active 24-base-pair sequence adjacent to the right border of the T-DNA, called overdrive, stimulates tumor formation by increasing the level of T-DNA processing. It is thought that the virC operon which enhances T-DNA processing probably does so because the VirC1 protein interacts with overdrive. It has now been shown that the virC1 gene product binds to overdrive but not to the right border of T-DNA.	231
284443	pfam07016	CRAM_rpt	Cysteine-rich acidic integral membrane protein precursor. This family consists of several 24 residue repeats from the Trypanosoma brucei cysteine-rich, acidic integral membrane protein precursor (CRAM). CRAM is concentrated in the flagellar pocket, an invagination of the cell surface of the trypanosome where endocytosis has been documented.	22
399777	pfam07017	PagP	Antimicrobial peptide resistance and lipid A acylation protein PagP. This family consists of several bacterial antimicrobial peptide resistance and lipid A acylation (PagP) proteins. The bacterial outer membrane enzyme PagP transfers a palmitate chain from a phospholipid to lipid A. In a number of pathogenic Gram-negative bacteria, PagP confers resistance to certain cationic antimicrobial peptides produced during the host innate immune response.	146
399778	pfam07019	Rab5ip	Rab5-interacting protein (Rab5ip). This family consists of several Rab5-interacting protein (RIP5 or Rab5ip) sequences. The ras-related GTPase rab5 is rate-limiting for homotypic early endosome fusion. Rab5ip represents a novel rab5 interacting protein that may function on endocytic vesicles as a receptor for rab5-GDP and participate in the activation of rab5.	79
115660	pfam07020	Orthopox_C10L	Orthopoxvirus C10L protein. This family consists of several Orthopoxvirus C10L proteins. C10L viral protein can play an important role in vaccinia virus evasion of the host immune system. It may consist in the blockade of IL-1 receptors by the C10L protein, a homolog of the IL-1 Ra.	83
399779	pfam07021	MetW	Methionine biosynthesis protein MetW. This family consists of several bacterial and one archaeal methionine biosynthesis MetW proteins. Biosynthesis of methionine from homoserine in Pseudomonas putida takes place in three steps. The first step is the acylation of homoserine to yield an acyl-L-homoserine. This reaction is catalyzed by the products of the metXW genes and is equivalent to the first step in enterobacteria, gram-positive bacteria and fungi, except that in these microorganisms the reaction is catalyzed by a single polypeptide (the product of the metA gene in Escherichia coli and the met5 gene product in Neurospora crassa). In Pseudomonas putida, as in gram-positive bacteria and certain fungi, the second and third steps are a direct sulfhydrylation that converts the O-acyl-L-homoserine into homocysteine and further methylation to yield methionine. The latter reaction can be mediated by either of the two methionine synthetases present in the cells.	193
311152	pfam07022	Phage_CI_repr	Bacteriophage CI repressor helix-turn-helix domain. This family consists of several phage CI repressor proteins and related bacterial sequences. The CI repressor is known to function as a transcriptional switch, determining whether transcription is lytic or lysogenic.	65
399780	pfam07023	DUF1315	Protein of unknown function (DUF1315). This family consists of several bacterial proteins of around 90 residues in length. The function of this family is unknown.	81
399781	pfam07024	ImpE	ImpE protein. This family consists of several bacterial proteins including ImpE from Rhizobium leguminosarum. It has been suggested that the imp locus is involved in the secretion to the environment of proteins, including periplasmic RbsB protein, that cause blocking of infection specifically in pea plants. The exact function of this family is unknown.	122
284449	pfam07026	DUF1317	Protein of unknown function (DUF1317). This family consists of several hypothetical bacterial and phage proteins of around 60 residues in length. The function of this family is unknown.	60
399782	pfam07027	DUF1318	Protein of unknown function (DUF1318). This family consists of several bacterial proteins of around 100 residues in length and is often known as YdbL. The function of this family is unknown.	86
369174	pfam07028	DUF1319	Protein of unknown function (DUF1319). This family contains a number of viral proteins of unknown function approximately 200 residues long. Family members seem to be restricted to badnaviruses.	109
369175	pfam07029	CryBP1	CryBP1 protein. This family consists of several CryBP1 like proteins from Bacillus thuringiensis and Paenibacillus popilliae. Members of this family are thought to be involved in the overall toxicity of the bacteria to their hosts.	180
399783	pfam07030	DUF1320	Protein of unknown function (DUF1320). This family consists of both hypothetical bacterial and phage proteins of around 145 residues in length. The function of this family is unknown.	109
369177	pfam07032	DUF1322	Protein of unknown function (DUF1322). This family consists of several hypothetical 9.4 kDa Borrelia burgdorferi (Lyme disease spirochete) proteins of around 78 residues in length. The function of this family is unknown.	73
284455	pfam07033	Orthopox_B11R	Orthopoxvirus B11R protein. This family consists of several Orthopoxvirus B11R proteins of around 70 residues in length. The function of this family is unknown.	71
399784	pfam07034	ORC3_N	Origin recognition complex (ORC) subunit 3 N-terminus. This family represents the N-terminus (approximately 300 residues) of subunit 3 of the eukaryotic origin recognition complex (ORC). Origin recognition complex (ORC) is composed of six subunits that are essential for cell viability. They collectively bind to the autonomously replicating sequence (ARS) in a sequence-specific manner and lead to the chromatin loading of other replication factors that are essential for initiation of DNA replication.	330
399785	pfam07035	Mic1	Colon cancer-associated protein Mic1-like. This family represents the C-terminus (approximately 160 residues) of a number of proteins that resemble colon cancer-associated protein Mic1.	157
399786	pfam07037	DUF1323	Putative transcription regulator (DUF1323). This family consists of several hypothetical Enterobacterial proteins of around 120 residues in length. This family appears to have an HTH domain and is therefore likely to act as a transcriptional regulator.	122
70500	pfam07038	DUF1324	Protein of unknown function (DUF1324). This family consists of several Circovirus proteins of around 60 residues in length. The function of this family is unknown.	59
399787	pfam07039	DUF1325	SGF29 tudor-like domain. This domain is found in the yeast protein SAGA-associated factor 29. This domain is related to members of the Tudor domain superfamily such as pfam05641. The SAGA complex is involved in RNA polymerase II-dependent transcriptional regulation. The membership of the tudor domain superfamily suggests this domain may bind to RNA.	131
399788	pfam07040	DUF1326	Protein of unknown function (DUF1326). This family consists of several hypothetical bacterial proteins which seem to be found exclusively in Rhizobium and Ralstonia species. Members of this family are typically around 210 residues in length and contain 5 highly conserved cysteine residues at their N-terminus. The function of this family is unknown.	175
311164	pfam07041	DUF1327	Protein of unknown function (DUF1327). This family consists of several hypothetical bacterial proteins of around 115 residues in length which seem to be specific to Escherichia coli. The function of this family is unknown.	113
115680	pfam07042	TrfA	TrfA protein. This family consists of several bacterial TrfA proteins. The trfA operon of broad-host-range IncP plasmids is essential to activate the origin of vegetative replication in diverse species. The trfA operon encodes two ORFs. The first ORF is highly conserved and encodes a putative single-stranded DNA binding protein (Ssb). The second, trfA, contains two translational starts as in the IncP alpha plasmids, generating related polypeptides of 406 (TrfA1) and 282 (TrfA2) amino acids. TrfA2 is very similar to the IncP alpha product, whereas the N-terminal region of TrfA1 shows very little similarity to the equivalent region of IncP alpha TrfA1. This region has been implicated in the ability of IncP alpha plasmids to replicate efficiently in Pseudomonas aeruginosa.	282
399789	pfam07043	DUF1328	Protein of unknown function (DUF1328). This family consists of several hypothetical bacterial proteins of around 50 residues in length. The function of this family is unknown.	38
399790	pfam07044	DUF1329	Protein of unknown function (DUF1329). This family consists of several hypothetical bacterial proteins of around 475 residues in length. The majority of family members are from Pseudomonas species but the family also contains sequences from Shewanella oneidensis and Thauera aromatica.	366
399791	pfam07045	DUF1330	Domain of unknown function (DUF1330). This family consists of several hypothetical bacterial proteins of around 90 residues in length. The function of this family is unknown.	94
336588	pfam07046	CRA_rpt	Cytoplasmic repetitive antigen (CRA) like repeat. This family consists of several repeats of around 42 residues in length. These repeated sequences are found in multiple copies in Trypanosoma cruzi antigens; the cytoplasmic repetitive antigen (CRA) protein contains 23 copies of this repeat.	42
399792	pfam07047	OPA3	Optic atrophy 3 protein (OPA3). This family consists of several optic atrophy 3 (OPA3) proteins. OPA3 deficiency causes type III 3-methylglutaconic aciduria (MGA) in humans. This disease manifests with early bilateral optic atrophy, spasticity, extrapyramidal dysfunction, ataxia, and cognitive deficits, but normal longevity.	125
115686	pfam07048	DUF1331	Protein of unknown function (DUF1331). This family consists of several Circovirus proteins of around 35 residues in length. Members of this family are described as ORF-10 proteins and their function is unknown.	35
399793	pfam07051	OCIA	Ovarian carcinoma immunoreactive antigen (OCIA). This family consists of several ovarian carcinoma immunoreactive antigen (OCIA) and related eukaryotic sequences. The function of this family is unknown.	87
399794	pfam07052	Hep_59	Hepatocellular carcinoma-associated antigen 59. This family represents a conserved region approximately 100 residues long within mammalian hepatocellular carcinoma-associated antigen 59 and similar proteins. Family members are found in a variety of eukaryotes, mainly as hypothetical proteins.	102
369186	pfam07054	Pericardin_rpt	Pericardin like repeat. This family consists of several repeated sequences of around 34 residues in length. This repeat is found in multiple copies in the Drosophila pericardin and other extracellular matrix proteins.	35
399795	pfam07055	Eno-Rase_FAD_bd	Enoyl reductase FAD binding domain. This family carries the region of the enzyme trans-2-enoyl-CoA reductase, at the very C-terminus, that binds to FAD. The activity was characterized in Euglena where an unusual fatty acid synthesis pathway in mitochondria performs a malonyl-CoA independent synthesis of fatty acids leading to accumulation of wax esters, which serve as the sink for electrons stemming from glycolytic ATP synthesis and pyruvate oxidation. The full enzyme catalyzes the reduction of enoyl-CoA to acyl-CoA. The conserved region is seen as the motif FGFxxxxxDY.	64
399796	pfam07056	DUF1335	Protein of unknown function (DUF1335). This family represents a conserved region approximately 130 residues long within a number of proteins of unknown function that seem to be specific to the white spot syndrome virus (WSSV).	131
399797	pfam07057	TraI	DNA helicase TraI. This family represents a conserved region approximately 130 residues long within the bacterial DNA helicase TraI (EC:3.6.1.-). TraI is a bifunctional protein that catalyzes the unwinding of duplex DNA as well as acts as a sequence-specific DNA trans-esterase, providing the site- and strand-specific nick required to initiate DNA transfer.	123
399798	pfam07058	MAP70	Microtubule-associated protein 70. This family represents a family of plant microtubule-associated proteins of size 70 kDa. The proteins contain four predicted coiled-coil domains, and truncation studies identify a central domain that targets the proteins to microtubules. It has no predicted trans-membrane domains, and the region between the coils from approximately residues 240-483 is the targetting region.	544
399799	pfam07059	DUF1336	Protein of unknown function (DUF1336). This family represents the C-terminus (approximately 250 residues) of a number of hypothetical plant proteins of unknown function.	211
399800	pfam07061	Swi5	Swi5. Swi5 is involved in meiotic DNA repair synthesis and meiotic joint molecule formation. It is known to interact with Swi2, Rhp51 and Swi6.	79
311178	pfam07062	Clc-like	Clc-like. This family contains a number of Clc-like proteins that are approximately 250 residues long.	212
399801	pfam07063	DUF1338	Domain of unknown function (DUF1338). This domain is found in a variety of bacterial and fungal hypothetical proteins of unknown function. The structure of this domain has been solved by structural genomics. The structure implies a zinc-binding function, so it is a putative metal hydrolase (information derived from TOPSAN for Structure 3iuz).	322
399802	pfam07064	RIC1	RIC1. RIC1 has been identified in yeast as a Golgi protein involved in retrograde transport to the cis-Golgi network. It forms a heterodimer with Rgp1 and functions as a guanyl-nucleotide exchange factor.	248
399803	pfam07065	D123	D123. This family contains a number of eukaryotic D123 proteins approximately 330 residues long. It has been shown that mutated variants of D123 exhibit temperature-dependent differences in their degradation rate. D123 proteins are regulators of eIF2, the central regulator of translational initiation.	300
369194	pfam07066	DUF3882	Lactococcus phage M3 protein. This family consists of several Lactococcus phage middle-3 (M3) proteins of around 160 residues in length. The function of this family is unknown.	159
115703	pfam07067	DUF1340	Protein of unknown function (DUF1340). This family consists of several hypothetical Streptococcus thermophilus bacteriophage proteins of around 235 residues in length. The function of this family is unknown.	236
399804	pfam07068	Gp23	Major capsid protein Gp23. This family contains a number of major capsid Gp23 proteins approximately 500 residues long, from T4-like bacteriophages.	449
115705	pfam07069	PRRSV_2b	Porcine reproductive and respiratory syndrome virus 2b. This family consists of several Porcine reproductive and respiratory syndrome virus (PRRSV) ORF2b proteins. The function of this family is unknown however it is known that large amounts of 2b protein are present in the virion and it is thought that this protein may be an integral component of the virion.	73
399805	pfam07070	Spo0M	Spo0M protein. This family consists of several bacterial Spo0M proteins which are thought to control sporulation in Bacillus subtilis. Spo0M exerts certain negative effects on sporulation and its gene expression is controlled by sigmaH.	203
399806	pfam07071	KDGP_aldolase	KDGP aldolase. DgaF is part of the dga operon required for wild-type growth of Salmonella Typhimurium with D-glucosaminate. It catalyzes the conversion of keto-3-deoxygluconate 6-phosphate (KDGP) to yield pyruvate and glyceraldehyde-3-phosphate. Orthologues of the dga genes are largely restricted to certain enteric bacteria and a few species in the phylum Firmicutes.	217
399807	pfam07072	ZapD	Cell division protein. Cell division protein ZapD enhances FtsZ-ring assembly. It directly interacts with FtsZ and promotes bundling of FtsZ protofilaments, with a reduction in FtsZ GTPase activity.	210
399808	pfam07073	ROF	Modulator of Rho-dependent transcription termination (ROF). This family consists of several bacterial modulator of Rho-dependent transcription termination (ROF) proteins. ROF binds transcription termination factor Rho and inhibits Rho-dependent termination in vivo.	80
399809	pfam07074	TRAP-gamma	Translocon-associated protein, gamma subunit (TRAP-gamma). This family consists of several eukaryotic translocon-associated protein, gamma subunit (TRAP-gamma) sequences. The translocation site (translocon), at which nascent polypeptides pass through the endoplasmic reticulum membrane, contains a component previously called 'signal sequence receptor' that is now renamed as 'translocon-associated protein' (TRAP). The TRAP complex is comprised of four membrane proteins alpha, beta, gamma and delta which are present in a stoichiometric relation, and are genuine neighbors in intact microsomes. The gamma subunit is predicted to span the membrane four times.	170
399810	pfam07075	DUF1343	Protein of unknown function (DUF1343). This family consists of several hypothetical bacterial proteins of around 400 residues in length. The function of this family is unknown.	362
399811	pfam07076	DUF1344	Protein of unknown function (DUF1344). This family consists of several short, hypothetical bacterial proteins of around 80 residues in length. Members of this family are found in Rhizobium, Agrobacterium and Brucella species. The function of this family is unknown.	59
399812	pfam07077	DUF1345	Protein of unknown function (DUF1345). This family consists of several hypothetical bacterial proteins of around 230 residues in length. The function of this family is unknown.	171
399813	pfam07078	FYTT	Forty-two-three protein. This family consists of several mammalian proteins of around 320 residues in length called 40-2-3 proteins. The function of this family is unknown.	308
284489	pfam07079	DUF1347	Protein of unknown function (DUF1347). This family consists of several hypothetical bacterial proteins of around 610 residues in length. Members of this family are highly conserved and seem to be specific to Chlamydia species. The function of this family is unknown.	548
399814	pfam07080	DUF1348	Protein of unknown function (DUF1348). This family consists of several highly conserved hypothetical proteins of around 150 residues in length. The function of this family is unknown.	130
399815	pfam07081	DUF1349	Protein of unknown function (DUF1349). This family consists of several hypothetical bacterial proteins but contains one sequence from Saccharomyces cerevisiae. Members of this family are typically around 200 residues in length. The function of this family is unknown.	168
115718	pfam07082	DUF1350	Protein of unknown function (DUF1350). This family consists of several hypothetical proteins from both cyanobacteria and plants. Members of this family are typically around 250 residues in length. The function of this family is unknown but the species distribution indicates that the family may be involved in photosynthesis.	250
399816	pfam07083	DUF1351	Protein of unknown function (DUF1351). This family consists of several bacterial and phage proteins of around 230 residues in length. The function of this family is unknown.	210
399817	pfam07084	Spot_14	Thyroid hormone-inducible hepatic protein Spot 14. This family consists of several thyroid hormone-inducible hepatic protein (Spot 14 or S14) sequences. Mainly expressed in tissues that synthesize triglycerides, the mRNA coding for Spot 14 has been shown to be increased in rat liver by insulin, dietary carbohydrates, glucose in hepatocyte culture medium, as well as thyroid hormone. In contrast, dietary fats and polyunsaturated fatty acids, have been shown to decrease the amount of Spot 14 mRNA, while an elevated level of cAMP acts as a dominant negative factor. In addition, liver-specific factors or chromatin organisation of the gene have been shown to contribute to the regulation of its expression. Spot 14 protein is thought to be required for induction of hepatic lipogenesis.	144
399818	pfam07085	DRTGG	DRTGG domain. This presumed domain is about 120 amino acids in length. It is found associated with CBS domains pfam00571, as well as the CbiA domain pfam01656. The function of this domain is unknown. It is named the DRTGG domain after some of the most conserved residues. This domain may be very distantly related to a pair of CBS domains. There are no significant sequence similarities, but its length and association with CBS domains supports this idea (Bateman A, pers. obs.).	105
399819	pfam07086	Jagunal	Jagunal, ER re-organisation during oogenesis. Jagunal is an endoplasmic-reticulum (ER)-membrane protein found in eukaryotes. It is involved in reorganising the ER in cells that must increase exocytic membrane traffic during development, that is, in the oocyte during vitellogenesis. It facilitates vesicular traffic in the subcortex.	186
399820	pfam07087	DUF1353	Protein of unknown function (DUF1353). This family consists of several hypothetical bacterial proteins of around 100 residues in length. The function of this family is unknown.	91
311196	pfam07088	GvpD	GvpD gas vesicle protein. This family consists of several archaeal GvpD gas vesicle proteins. GvpD is thought to be involved in the regulation of gas vesicle formation.	484
399821	pfam07090	GATase1_like	Putative glutamine amidotransferase. This family consists of several hypothetical bacterial proteins of around 250 residues in length. The function of this family is unknown. The structure of this cytoplasmic domain was solved by the Midwest Center for Structural Genomics (MCSG). The structure has been classified as part of the Class-I Glutamine amidotransferase superfamily owing to similarity with other known structures. The monomer combines with itself to form a hexamer, and this hexamer exposes a potential catalytic surface rich in Glu, Asp, Tyr, Ser, Trp and His residues.	246
336604	pfam07091	FmrO	Ribosomal RNA methyltransferase (FmrO). This family consists of several bacterial ribosomal RNA methyltransferase (aminoglycoside-resistance methyltransferase) proteins.	252
399822	pfam07092	DUF1356	Protein of unknown function (DUF1356). This family consists of several hypothetical mammalian proteins of around 250 residues in length. The function of this family is unknown.	225
399823	pfam07093	SGT1	SGT1 protein. This family consists of several eukaryotic SGT1 proteins. Human SGT1 or hSGT1 is known to suppress GCR2 and is highly expressed in the muscle and heart. The function of this family is unknown although it has been speculated that SGT1 may be functionally analogous to the Gcr2p protein of Saccharomyces cerevisiae which is known to be a regulatory factor of glycolytic gene expression.	583
284502	pfam07094	DUF1357	Protein of unknown function (DUF1357). This family consists of several hypothetical bacterial proteins of around 225 residues in length. Members of this family appear to be specific to Borrelia burgdorferi (Lyme disease spirochete). The function of this family is unknown.	223
399824	pfam07095	IgaA	Intracellular growth attenuator protein IgaA. This family consists of several bacterial intracellular growth attenuator (IgaA) proteins. IgaA is involved in negative control of bacterial proliferation within fibroblasts. IgaA is homologous to the E. coli YrfF and P. mirabilis UmoB proteins. Whereas the biological function of YrfF is currently unknown, UmoB has been shown elsewhere to act as a positive regulator of FlhDC, the master regulator of flagella and swarming. FlhDC has been shown to repress cell division during P. mirabilis swarming, suggesting that UmoB could repress cell division via FlhDC. This biological function, if maintained in S. enterica, could sustain a putative negative control of cell division and growth exerted by IgaA in intracellular bacteria.	696
369206	pfam07096	DUF1358	Protein of unknown function (DUF1358). This family consists of several hypothetical eukaryotic proteins of around 125 residues in length. The function of this family is unknown.	115
284505	pfam07097	DUF1359	Protein of unknown function (DUF1359). This family consists of several hypothetical bacterial and phage proteins of around 100 residues in length. Members of this family seem to be found exclusively in Lactococcus lactis and the bacteriophages that infect this species. The function of this family is unknown.	104
399825	pfam07098	DUF1360	Protein of unknown function (DUF1360). This family consists of several bacterial proteins of around 115 residues in length. Members of this family are found in Bacillus species and Streptomyces coelicolor. The function of this family is unknown.	102
399826	pfam07099	DUF1361	Protein of unknown function (DUF1361). This family consists of several hypothetical bacterial proteins of around 200 residues in length. The function of this family is unknown although some members are annotated as being putative integral membrane proteins.	166
399827	pfam07100	ASRT	Anabaena sensory rhodopsin transducer. The family of bacterial Anabaena sensory rhodopsin transducers are likely to bind sugars or related metabolites. The entire protein is comprised of a single globular domain with an eight-stranded beta-sandwich fold. There are a few characteristics which define this beta-sandwich fold as being distinct from other so-named folds, and these are: 1) a well conserved tryptophan, usually following a polar residue, present at the start of the first strand; this tryptophan appears to be central to a hydrophobic interaction required to hold the two beta-sheets of the sandwich together, and 2) a nearly absolutely conserved asparagine located at the end of the second beta-strand, that hydrogen bonds with the backbone carbonyls of the residues 2 and 4 positions downstream from it, thereby stabilizing the characteristic tight turn between strands 2 and 3 of the structure.	119
115736	pfam07101	DUF1363	Protein of unknown function (DUF1363). This family consists of several Trypanosoma brucei putative variant specific antigen proteins of around 80 residues in length.	124
377777	pfam07102	DUF1364	Protein of unknown function (DUF1364). This family consists of several bacterial and phage proteins of around 95 residues in length. The function of this family is unknown.	91
399828	pfam07103	DUF1365	Protein of unknown function (DUF1365). This family consists of several bacterial and plant proteins of around 250 residues in length. The function of this family is unknown.	227
336610	pfam07104	DUF1366	Protein of unknown function (DUF1366). This family consists of several hypothetical Streptococcus thermophilus bacteriophage proteins of around 130 residues in length. One of the sequences in this family, from phage Sfi11 is known as Gp149. The function of this family is unknown.	116
369209	pfam07105	DUF1367	Protein of unknown function (DUF1367). This family consists of several highly conserved, hypothetical phage proteins of around 200 residues in length. The function of this family is unknown. Some proteins are annotated as IrsA (intracellular response to stress).	192
369210	pfam07106	TBPIP	Tat binding protein 1(TBP-1)-interacting protein (TBPIP). This family consists of several eukaryotic TBP-1 interacting protein (TBPIP) sequences. TBP-1 has been demonstrated to interact with the human immunodeficiency virus type 1 (HIV-1) viral protein Tat, then modulate the essential replication process of HIV. In addition, TBP-1 has been shown to be a component of the 26S proteasome, a basic multiprotein complex that degrades ubiquitinated proteins in an ATP-dependent fashion. Human TBPIP interacts with human TBP-1 then modulates the inhibitory action of human TBP-1 on HIV-Tat-mediated transactivation.	61
284513	pfam07107	WI12	Wound-induced protein WI12. This family consists of several plant wound-induced protein sequences related to WI12 from Mesembryanthemum crystallinum. Wounding, methyl jasmonate, and pathogen infection are known to induce local WI12 expression. WI12 expression is also thought to be developmentally controlled in the placenta and developing seeds. WI12 preferentially accumulates in the cell wall and it has been suggested that it plays a role in the reinforcement of cell wall composition after wounding and during plant development. This family seems partly related to the NTF2-like superfamily.	109
369211	pfam07108	PipA	PipA protein. This family consists of several Salmonella PipA (pathogenicity island-encoded protein A) and related phage sequences. PipA is thought to contribute to enteric but not to systemic salmonellosis. The family carries a highly conserved HEXXH sequence motif along with several highly conserved glutamic acid residues which might be indicative of the family being a metallo-peptidase.	200
399829	pfam07109	Mg-por_mtran_C	Magnesium-protoporphyrin IX methyltransferase C-terminus. This family represents the C-terminus (approximately 100 residues) of bacterial and eukaryotic Magnesium-protoporphyrin IX methyltransferase (EC:2.1.1.11). This converts magnesium-protoporphyrin IX to magnesium-protoporphyrin IX methylester using S-adenosyl-L-methionine as a cofactor.	97
399830	pfam07110	EthD	EthD domain. This family consists of several bacterial sequences which are related to the EthD protein of Rhodococcus ruber. In Rhodococcus ruber, EthD is thought to be involved in the degradation of ethyl tert-butyl ether (ETBE). EthD synthesis is induced by ETBE but its exact function is unknown; it is, however, thought to be essential to the ETBE degradation system.	95
284517	pfam07111	HCR	Alpha helical coiled-coil rod protein (HCR). This family consists of several mammalian alpha helical coiled-coil rod HCR proteins. The function of HCR is unknown but it has been implicated in psoriasis in humans and is thought to affect keratinocyte proliferation.	749
254061	pfam07112	DUF1368	Protein of unknown function (DUF1368). This family consists of several proteins which seem to be specific to red algae plasmids. Members of this family are typically around 415 residues in length. The function of this family is unknown.	404
399831	pfam07114	TMEM126	Transmembrane protein 126. This entry includes the transmembrane protein 126 A/B (TMEM126A/B) from animals. Human TMEM126B participates in constructing the membrane arm of mitochondrial respiratory complex I.	176
254062	pfam07116	DUF1372	Protein of unknown function (DUF1372). This family consists of several Streptococcus bacteriophage sequences and related proteins from Streptococcus species. Members of this family are typically around 100 residues in length and their function is unknown.	104
399832	pfam07117	DUF1373	Protein of unknown function (DUF1373). This family consists of several hypothetical proteins which seem to be specific to Oryzias latipes (Japanese ricefish). Members of this family are typically around 200 residues in length. The function of this family is unknown.	213
399833	pfam07118	DUF1374	Protein of unknown function (DUF1374). This family consists of several hypothetical Sulfolobus virus proteins of around 100 residues in length. The function of this family is unknown.	92
399834	pfam07119	DUF1375	Protein of unknown function (DUF1375). This family consists of several hypothetical, putative lipoproteins of around 80 residues in length. Members of this family seem to be specific to the Class Gammaproteobacteria. The function of this family is unknown.	53
399835	pfam07120	DUF1376	Protein of unknown function (DUF1376). This family consists of several hypothetical bacterial proteins of around 95 residues in length. The function of this family is unknown.	87
284524	pfam07122	VLPT	Variable length PCR target protein (VLPT). This family consists of a number of 29 residue repeats which seem to be specific to the Ehrlichia chaffeensis variable length PCR target (VLPT) protein. Ehrlichia chaffeensis is a tick-transmitted rickettsial agent and is responsible for human monocytic ehrlichiosis (HME). The function of this family is unknown.	30
399836	pfam07123	PsbW	Photosystem II reaction centre W protein (PsbW). This family consists of several plant specific photosystem II reaction centre W (PsbW) proteins. PsbW is a nuclear-encoded protein located in the thylakoid membrane of the chloroplast. PsbW is a core component of photosystem II but not photosystem I. This family does not appear to be related to pfam03912.	129
399837	pfam07124	Phytoreo_P8	Phytoreovirus outer capsid protein P8. This family consists of several Phytoreovirus outer capsid protein P8 sequences.	427
115758	pfam07125	DUF1378	Protein of unknown function (DUF1378). This family consists of hypothetical bacterial and phage proteins of around 59 residues in length. Bacterial members of this family seem to be specific to Enterobacteria. The function of this family is unknown. Structural modelling suggests this domain may bind nucleic acids.	59
399838	pfam07126	ZapC	Cell-division protein ZapC. ZapC is one of four FtsZ-binding components of the Z ring in bacteria. Formation of the Z ring on the cytoplasmic surface of the membrane is the starting process for assembly of the cell-division apparatus. It binds directly to the Z ring, and although it is not essential for absolute cell division it contributes to it by enhancing the interactions between the FtsZ protofilaments (or polymers) which aggregate to form the ring conformation in the Z ring.	169
399839	pfam07127	Nodulin_late	Late nodulin protein. This family consists of several plant specific late nodulin sequences which are homologous to the Pisum sativum (Garden pea) ENOD3 protein. ENOD3 is expressed in the late stages of root nodule formation and contains two pairs of cysteine residues towards the C-terminus which may be involved in metal-binding.	54
399840	pfam07128	DUF1380	Protein of unknown function (DUF1380). This family consists of several hypothetical bacterial proteins of around 140 residues in length. Members of this family seem to be specific to Enterobacteria. The function of this family is unknown.	137
369223	pfam07129	DUF1381	Protein of unknown function (DUF1381). This family consists of several hypothetical Staphylococcus aureus and Staphylococcus aureus bacteriophage proteins of around 65 residues in length. The function of this family is unknown.	44
399841	pfam07130	YebG	YebG protein. This family consists of several bacterial YebG proteins of around 75 residues in length. The exact function of this protein is unknown but it is thought to be involved in the SOS response. The induction of the yebG gene occurs as cells enter the stationary growth phase and is dependent on cyclic AMP and H-NS.	72
369224	pfam07131	DUF1382	Protein of unknown function (DUF1382). This family consists of several hypothetical Escherichia coli and bacteriophage lambda-like proteins of around 60 residues in length. The function of this family is unknown. Structural modelling suggests this domain may bind nucleic acids.	60
284533	pfam07133	Merozoite_SPAM	Merozoite surface protein (SPAM). This family consists of several Plasmodium falciparum SPAM (secreted polymorphic antigen associated with merozoites) proteins. Variation among SPAM alleles is the result of deletions and amino acid substitutions in non-repetitive sequences within and flanking the alanine heptad-repeat domain. Heptad repeats in which the a and d position contain hydrophobic residues generate amphipathic alpha-helices which give rise to helical bundles or coiled-coil structures in proteins. SPAM is an example of a P. falciparum antigen in which a repetitive sequence has features characteristic of a well-defined structural element.	182
254068	pfam07134	DUF1383	Protein of unknown function (DUF1383). This family consists of several hypothetical Nucleopolyhedrovirus proteins of around 375 residues in length. The function of this family is unknown.	328
399842	pfam07136	DUF1385	Protein of unknown function (DUF1385). This family contains a number of hypothetical bacterial proteins of unknown function approximately 300 residues in length. Some family members are predicted to be metal-dependent.	231
399843	pfam07137	VDE	VDE lipocalin domain. This family represents a conserved region approximately 150 residues long within plant violaxanthin de-epoxidase (VDE). In higher plants, violaxanthin de-epoxidase forms part of a conserved system that dissipates excess energy as heat in the light-harvesting complexes of photosystem II (PSII), thus protecting them from photo-inhibitory damage.	240
369226	pfam07138	DUF1386	Protein of unknown function (DUF1386). This family consists of several hypothetical Nucleopolyhedrovirus proteins of around 350 residues in length. The function of this family is unknown.	334
399844	pfam07139	DUF1387	Protein of unknown function (DUF1387). This family represents a conserved region approximately 300 residues long within a number of hypothetical proteins of unknown function that seem to be restricted to mammals.	313
284538	pfam07140	IFNGR1	Interferon gamma receptor (IFNGR1). This family consists of several eukaryotic and viral interferon gamma receptor proteins. Molecular interactions among cytokines and cytokine receptors in eukaryotes form the basis of many cell-signaling pathways relevant to immune function. Human interferon-gamma (IFN-gamma) signals through a multimeric receptor complex consisting of two different but structurally related transmembrane chains: the high-affinity receptor-binding subunit (IFN-gammaRalpha) and a species specific accessory factor (AF-1 or IFN-gammaRbeta). The vaccinia viral interferon gamma receptor has been shown to be secreted from infected cells during early infection. The structure has been halved such that the N-terminus of this family is now represented by Tissue_fac pfam01108.	133
369228	pfam07141	Phage_term_sma	Putative bacteriophage terminase small subunit. This family consists of several putative Lactococcus bacteriophage terminase small subunit proteins. The exact function of this family is unknown.	174
399845	pfam07142	DUF1388	Repeat of unknown function (DUF1388). This family consists of several repeats of around 29 residues in length. Members of this family are found in the variable surface lipoproteins in Mycoplasma bovis and in mammalian neurofilament triplet H (NefH or NF-H) proteins. This repeat contains several Lys-Ser-Pro (KSP) motifs and in NefH these are thought to function as the main target for neurofilament directed protein kinases in vivo.	27
399846	pfam07143	CrtC	CrtC N-terminal lipocalin domain. This family contains the members of the old Pfam family DUF2006. Structural characterization of family member NE1406 (from DUF2006 now merged into this family) has revealed a lipocalin-like fold with domain duplication.	175
399847	pfam07145	PAM2	Ataxin-2 C-terminal region. The PABP-interacting motif PAM2 has been identified in various eukaryotic proteins as an important binding site for pfam00658. It has been found in a wide range of eukaryotic proteins. Strikingly, this motif appears to occur solely outside of globular domains.	16
254076	pfam07146	DUF1389	Protein of unknown function (DUF1389). This family consists of several hypothetical bacterial proteins which seem to be specific to Chlamydia pneumoniae. Members of this family are typically around 400 residues in length. The function of this family is unknown.	311
399848	pfam07147	PDCD9	Mitochondrial 28S ribosomal protein S30 (PDCD9). This family consists of several eukaryotic mitochondrial 28S ribosomal protein S30 (or programmed cell death protein 9 PDCD9) sequences. The exact function of this family is unknown although it is known to be a component of the mitochondrial ribosome and a component in cellular apoptotic signaling pathways.	441
399849	pfam07148	MalM	Maltose operon periplasmic protein precursor (MalM). This family consists of several maltose operon periplasmic protein precursor (MalM) sequences. The function of this family is unknown.	134
399850	pfam07149	Pes-10	Pes-10. This family consists of several Caenorhabditis elegans pes-10 and related proteins. Members of this family are typically around 400 residues in length. The function of this family is unknown.	397
284546	pfam07150	DUF1390	Protein of unknown function (DUF1390). This family consists of several Paramecium bursaria chlorella virus 1 (PBCV-1) proteins of around 250 residues in length. The function of this family is unknown.	226
369233	pfam07151	DUF1391	Protein of unknown function (DUF1391). This family consists of several Enterobacterial proteins of around 50 residues in length. Members of this family are found in Escherichia coli and Salmonella typhi where they are often known as YdfA. The function of this family is unknown.	48
399851	pfam07152	YaeQ	YaeQ protein. This family consists of several hypothetical bacterial proteins of around 180 residues in length which are often known as YaeQ. YaeQ is homologous to RfaH, a specialized transcription elongation protein. YaeQ is known to compensate for loss of RfaH function.	172
369234	pfam07153	Marek_SORF3	Marek's disease-like virus SORF3 protein. This family consists of several SORF3 proteins from the Marek's disease-like viruses. Members of this family are around 350 residues in length. The function of this family is unknown.	290
284550	pfam07154	DUF1392	Protein of unknown function (DUF1392). This family consists of several hypothetical cyanobacterial proteins of around 150 residues in length which seem to be specific to Anabaena species. The function of this family is unknown.	150
399852	pfam07155	ECF-ribofla_trS	ECF-type riboflavin transporter, S component. This family is the substrate-binding component (S component) of the energy coupling-factor (ECF)-type riboflavin transporter. It is a transmembrane protein which binds riboflavin, and is responsible for riboflavin-uptake by cells.	168
399853	pfam07156	Prenylcys_lyase	Prenylcysteine lyase. This family contains prenylcysteine lyases (EC:1.8.3.5) that are approximately 500 residues long. Prenylcysteine lyase is a FAD-dependent thioether oxidase that degrades a variety of prenylcysteines, producing free cysteine, an isoprenoid aldehyde and hydrogen peroxide as products of the reaction. It has been noted that this enzyme has considerable homology with ClP55, a 55 kDa protein that is associated with chloride ion pumps.	362
399854	pfam07157	DNA_circ_N	DNA circularisation protein N-terminus. This family represents the N-terminus (approximately 100 residues) of a number of phage DNA circularisation proteins.	88
115789	pfam07158	MatC_N	Dicarboxylate carrier protein MatC N-terminus. This family represents the N-terminal region of the bacterial dicarboxylate carrier protein MatC. The MatC protein is an integral membrane protein that could function as a malonate carrier.	149
399855	pfam07159	DUF1394	Protein of unknown function (DUF1394). This family consists of several hypothetical eukaryotic proteins of around 320 residues in length. The function of this family is unknown.	302
399856	pfam07160	SKA1	Spindle and kinetochore-associated protein 1. Spindle and kinetochore-associated protein 1 (SKA1) is a component of the SKA1 complex (consists of Ska1, Ska2, and Ska3/Rama1), a microtubule-binding subcomplex of the outer kinetochore that is essential for proper chromosome segregation.	234
399857	pfam07161	LppX_LprAFG	LppX_LprAFG lipoprotein. This entry consists of several lipoproteins mainly from Mycobacterium species, collectively known as the LppX_LprAFG family. Proteins in this entry include LprG, LppX, LprF and LprA.	191
399858	pfam07162	B9-C2	Ciliary basal body-associated, B9 protein. The B9-C2 domain is found in proteins associated with the ciliary basal body. B9 domains were identified as a specific family of C2 domains. There are three sub-families represented by this family, notably, Mks1-Xbx7, Stumpy-Tza1 and Tza2 groups of proteins. Mutations in human Mks1 result in the developmental disorder Meckel-Gruber syndrome; mutations in mouse Stumpy lead to perinatal hydrocephalus and severe polycystic kidney disease. All three distinct types of B9-C2 proteins cooperatively localize to the basal body or centrosome of cilia.	165
399859	pfam07163	Pex26	Pex26 protein. This family consists of Pex26 and related mammalian proteins. Pex26 is a type II peroxisomal membrane protein which recruits Pex6-Pex1 complexes to peroxisomes. Mutations in Pex26 can lead to human disorders.	301
399860	pfam07165	DUF1397	Protein of unknown function (DUF1397). This family consists of several insect specific proteins. A member from Manduca sexta is annotated as being a haemolymph glycoprotein precursor. The function of this family is unknown.	203
399861	pfam07166	DUF1398	Protein of unknown function (DUF1398). This family consists of several hypothetical Enterobacterial proteins of around 130 residues in length. Members of this family seem to be found exclusively in Escherichia coli and Salmonella species. The function of this family is unknown.	118
399862	pfam07167	PhaC_N	Poly-beta-hydroxybutyrate polymerase (PhaC) N-terminus. This family represents the N-terminal region of the bacterial poly-beta-hydroxybutyrate polymerase (PhaC). Polyhydroxyalkanoic acids (PHAs) are carbon and energy reserve polymers produced in some bacteria when carbon sources are plentiful and another nutrient, such as nitrogen, phosphate, oxygen, or sulfur, becomes limiting. PHAs composed of monomeric units ranging from 3 to 14 carbons exist in nature. When the carbon source is exhausted, PHA is utilized by the bacterium. PhaC links D-(-)-3-hydroxybutyryl-CoA to an existing PHA molecule by the formation of an ester bond. This family appears to be a partial segment of an alpha/beta hydrolase domain.	173
399863	pfam07168	Ureide_permease	Ureide permease. Heterocyclic nitrogen compounds may serve as nitrogen sources or nitrogen transport compounds in plants that are not able to fix nitrogen. This family represents ureide permease, a transporter of a wide spectrum of oxo derivatives of heterocyclic nitrogen compounds, including allantoin, uric acid and xanthine; it has 10 putative transmembrane domains with a large cytosolic central domain containing a 'Walker A' motif. Ureide permease is likely to transport other purine degradation products when nitrogen sources are low. Transport is dependent on glucose and a proton gradient. The family is found in bacteria, plants and yeast. These transporters are constituted of two sets of 5xTMs.	358
399864	pfam07171	MlrC_C	MlrC C-terminus. This family represents the C-terminus (approximately 200 residues) of the product of a bacterial gene cluster that is involved in the degradation of the cyanobacterial toxin microcystin LR. Many members of this family are hypothetical proteins.	178
254089	pfam07172	GRP	Glycine rich protein family. This family of proteins includes several glycine rich proteins as well as two nodulins 16 and 24. The family also contains proteins that are induced in response to various stresses.	91
399865	pfam07173	GRDP-like	Glycine-rich domain-containing protein-like. This entry includes Arabidopsis Glycine-rich domain-containing protein 1 and 2 (GRDP1/2). They are involved in development and stress responses.	139
399866	pfam07174	FAP	Fibronectin-attachment protein (FAP). This family contains bacterial fibronectin-attachment proteins (FAP). Family members are rich in alanine and proline, are approximately 300 residues long, and seem to be restricted to mycobacteria. These proteins contain a fibronectin-binding motif that allows mycobacteria to bind to fibronectin in the extracellular matrix.	301
399867	pfam07175	Osteoregulin	Osteoregulin. This family represents a conserved region approximately 180 residues long within osteoregulin, a bone-remodelling protein expressed highly in osteocytes within trabecular and cortical bone. A conserved RGD motif is found towards the C-terminal end of this region, and this is potentially involved in integrin recognition.	160
399868	pfam07176	DUF1400	Alpha/beta hydrolase of unknown function (DUF1400). This family contains a number of hypothetical proteins of unknown function that seem to be specific to cyanobacteria. Members of this family have an alpha/beta hydrolase fold.	127
399869	pfam07177	Neuralized	Neuralized. This family contains a conserved region approximately 60 residues long within eukaryotic neuralized and neuralized-like proteins. Neuralized belongs to a group of ubiquitin ligases and is required in a subset of Notch pathway-mediated cell fate decisions during development of the Drosophila nervous system. Some family members contain multiple copies of this region.	150
399870	pfam07178	TraL	TraL protein. This family consists of several bacterial TraL proteins. TraL is a predicted peripheral membrane protein which is thought to be involved in bacterial sex pilus assembly. The exact function of this family is unclear.	87
399871	pfam07179	SseB	SseB protein N-terminal domain. This family consists of several SseB proteins which appear to be found exclusively in Enterobacteria. SseB is known to enhance serine-sensitivity in Escherichia coli and is part of the Salmonella pathogenicity island 2 (SPI-2) translocon. This entry contains the presumed N-terminal domain of SseB.	120
369250	pfam07180	CaiF_GrlA	CaiF/GrlA transcriptional regulator. This is a family of transcriptional regulators. CaiF is involved in carnitine metabolism. GrlA is encoded within the LEE type III secretion system in the enteropathogenic E. coli O157:H. GrlR interacts with GrlA at its Helix-Turn-Helix (HTH) motif, preventing GrlA from binding to its target promoter DNA.	134
284572	pfam07181	VirC2	VirC2 protein. This family consists of several VirC2 proteins which seem to be found exclusively in Agrobacterium species and Rhizobium etli. VirC2 is known to be involved in virulence in Agrobacterium species but its exact function is unclear.	200
399872	pfam07182	DUF1402	Protein of unknown function (DUF1402). This family consists of several hypothetical bacterial proteins of around 310 residues in length. Members of this family seem to be found exclusively in Agrobacterium, Rhizobium and Brucella species. The function of this family is unknown.	300
399873	pfam07183	DUF1403	Protein of unknown function (DUF1403). This family consists of several hypothetical bacterial proteins of around 320 residues in length. Members of this family are mainly found in Rhizobium and Agrobacterium species. The function of this family is unknown.	320
115814	pfam07184	CTV_P33	Citrus tristeza virus P33 protein. This family consists of several Citrus tristeza virus (CTV) P33 proteins. The function of P33 is unclear although it is known that the protein is not needed for virion formation.	303
369252	pfam07185	DUF1404	Protein of unknown function (DUF1404). This family consists of several archaeal proteins of around 180 residues in length. Members of this family seem to be found exclusively in Sulfolobus tokodaii and Sulfolobus solfataricus. The function of this family is unknown.	169
399874	pfam07187	DUF1405	Protein of unknown function (DUF1405). This family consists of several bacterial and related archaeal proteins of around 180 residues in length. The function of this family is unknown.	160
284577	pfam07188	KSHV_K8	Kaposi's sarcoma-associated herpesvirus (KSHV) K8 protein. This family consists of Kaposi's sarcoma-associated herpesvirus (KSHV) K8 proteins. KSHV is a human Gammaherpesvirus related to Epstein-Barr virus (EBV) and herpesvirus saimiri. KSHV open reading frame K8 encodes a basic region-leucine zipper protein of 237 aa that homodimerizes. K8 interacts and co-localizes with human pfam04855, a cellular chromatin-remodelling factor, both in vivo and in vitro. K8 is thought to function as a transcriptional activator under specific conditions and its transactivation activity requires its interaction with the cellular chromatin remodelling factor hSNF5.	238
399875	pfam07189	SF3b10	Splicing factor 3B subunit 10 (SF3b10). This family consists of several eukaryotic splicing factor 3B subunit 10 (SF3b10) proteins. SF3b10 is a 10 kDa subunit of the splicing factor SF3b. SF3b associates with the splicing factor SF3a and a 12S RNA unit to form the U2 small nuclear ribonucleoproteins complex. SF3b10 and SF3b14b are also thought to facilitate the interaction of U2 with the branch site.	75
148663	pfam07190	DUF1406	Protein of unknown function (DUF1406). This family consists of several Orthopoxvirus proteins of around 185 residues in length. Members of this family seem to be exclusive to Vaccinia, Camelpox and Cowpox viruses. Some family members are annotated as being C8 proteins but their function is unknown.	170
399876	pfam07191	zinc-ribbons_6	zinc-ribbons. This family consists of several short, hypothetical bacterial proteins of around 70 residues in length. Members of this family have 8 highly conserved cysteine residues, which form two zinc ribbon domains.	64
369255	pfam07192	SNURF	SNURF/RPN4 protein. This family consists of several mammalian SNRPN upstream reading frame (SNURF) proteins. SNURF or RPF4 is a RING-finger protein and a coregulator of androgen receptor-dependent transcription. It has been suggested that SNURF is involved in the regulation of processes required for late steps of spermatid maturation.	65
284581	pfam07193	DUF1408	Protein of unknown function (DUF1408). This family consists of several hypothetical Lactococcus lactis and related phage proteins of around 75 residues in length. The function of this family is unknown.	71
399877	pfam07194	P2	P2 response regulator binding domain. The response regulators for CheA bind to the P2 domain, which is found between pfam01627 and pfam02895 as either one or two copies. Highly flexible linkers connect P2 to the rest of CheA and impart remarkable mobility to the P2 domain. This feature is thought to enhance the inter CheA dimer phosphotransfer reactions within the signalling complex, thereby amplifying the phosphorylation signal.	80
311255	pfam07195	FliD_C	Flagellar hook-associated protein 2 C-terminus. The flagellar hook-associated protein 2 (HAP2 or FliD) forms the distal end of the flagella, and plays a role in mucin specific adhesion of the bacteria. This alignment covers the C-terminal region of this family of proteins.	235
399878	pfam07196	Flagellin_IN	Flagellin hook IN motif. The function of this region is not clear, but it is found in many flagellar hook proteins, including FliD homologs. It is normally repeated, but is also apparently seen as a singleton. A conserved IN is seen at the centre of the motif. The diversity of these motifs makes it likely that some members of the family are not identified.	55
369256	pfam07197	DUF1409	Protein of unknown function (DUF1409). This family represents a short conserved region (approximately 50 residues long), sometimes repeated, within a number of hypothetical Oryza sativa proteins of unknown function.	48
399879	pfam07198	DUF1410	Protein of unknown function (DUF1410). This family represents a conserved domain approximately 100 residues long, multiple copies of which are found within hypothetical Ureaplasma parvum proteins of unknown function, as well as related species.	61
369258	pfam07199	DUF1411	Protein of unknown function (DUF1411). This family represents a conserved region approximately 150 residues long that is sometimes repeated within some Babesia bovis proteins of unknown function.	188
399880	pfam07200	Mod_r	Modifier of rudimentary (Mod(r)) protein. This family represents a conserved region approximately 150 residues long within a number of eukaryotic proteins that show homology with Drosophila melanogaster Modifier of rudimentary (Mod(r)) proteins. The N-terminal half of Mod(r) proteins is acidic, whereas the C-terminal half is basic, and both of these regions are represented in this family. Members of this family include the Vps37 subunit of the endosomal sorting complex ESCRT-I, a complex involved in recruiting transport machinery for protein sorting at the multivesicular body (MVB). The yeast ESCRT-I complex consists of three proteins (Vps23, Vps28 and Vps37). The mammalian homolog of Vps37 interacts with Tsg101 (pfam05743) through its mod(r) domain and its function is essential for lysosomal sorting of EGF receptors.	146
399881	pfam07201	HrpJ	HrpJ-like domain. This family represents a conserved region approximately 200 residues long within a number of bacterial hypersensitivity response secretion protein HrpJ and similar proteins. HrpJ forms part of a type III secretion system through which, in phytopathogenic bacterial species, virulence factors are thought to be delivered to plant cells. This family also includes the InvE invasion protein from Salmonella. This protein is involved in host parasite interactions and mutations in the InvE gene render Salmonella typhimurium non-invasive. InvE S. typhimurium mutants fail to elicit a rapid Ca2+ increase in cultured cells, an important event in the infection procedure and internalisation of S. typhimurium into epithelial cells. This family includes bacterial SepL and SsaL proteins. SepL plays an essential role in the infection process of enterohemorrhagic Escherichia coli and is thought to be responsible for the secretion of EspA, EspD, and EspB. SsaL of Salmonella typhimurium is thought to be a component of the type III secretion system.	165
399882	pfam07202	Tcp10_C	T-complex protein 10 C-terminus. This family represents the C-terminus (approximately 180 residues) of eukaryotic T-complex protein 10. The T-complex is involved in spermatogenesis in mice.	35
311262	pfam07203	DUF1412	Protein of unknown function (DUF1412). This family consists of several Caenorhabditis elegans proteins of around 70-75 residues in length. The function of this family is unknown.	53
115833	pfam07204	Orthoreo_P10	Orthoreovirus membrane fusion protein p10. This family consists of several Orthoreovirus membrane fusion protein p10 sequences. p10 is thought to be a multifunctional protein that plays a key role in virus-host interaction.	98
399883	pfam07205	DUF1413	Domain of unknown function (DUF1413). This family consists of several hypothetical bacterial proteins which seem to be specific to firmicute species. Members of this family are typically around 100 residues in length. The function of this family is unknown.	69
284592	pfam07206	Baculo_LEF-10	Baculovirus late expression factor 10 (LEF-10). This family consists of several Baculovirus specific late expression factor 10 (LEF-10) sequences. LEF-10 is thought to be a late expressed structural protein although its exact function is unknown.	71
399884	pfam07207	Lir1	Light regulated protein Lir1. This family consists of several plant specific light regulated Lir1 proteins. Lir1 mRNA accumulates in the light, reaching maximum and minimum steady-state levels at the end of the light and dark period, respectively. Plants germinated in the dark have very low levels of lir1 mRNA, whereas plants germinated in continuous light express lir1 at an intermediate but constant level. It is thought that lir1 expression is controlled by light and a circadian clock. The exact function of this family is unclear.	134
399885	pfam07208	DUF1414	Protein of unknown function (DUF1414). This family consists of several hypothetical bacterial proteins of around 70 residues in length. Members of this family are often referred to as YejL. The function of this family is unknown.	44
399886	pfam07209	DUF1415	Protein of unknown function (DUF1415). This family consists of several hypothetical bacterial proteins of around 180 residues in length. The function of this family is unknown.	171
377793	pfam07210	DUF1416	Protein of unknown function (DUF1416). This family consists of several hypothetical bacterial proteins of around 100 residues in length. Members of this family appear to be Actinomycete specific. The function of this family is unknown.	97
399887	pfam07212	Hyaluronidase_1	Hyaluronidase protein (HylP). This family consists of several phage associated hyaluronidase proteins (EC:3.2.1.35) which seem to be specific to Streptococcus pyogenes and Streptococcus pyogenes bacteriophages. The substrate of hyaluronidase is hyaluronic acid, a sugar polymer composed of alternating N-acetylglucosamine and glucuronic acid residues. Hyaluronic acid is found in the ground substance of human connective tissue and the vitreous of the eye and also is the sole component of the capsule of group A streptococci. The capsule has been shown to be an important virulence factor of this organism by virtue of its ability to resist phagocytosis. Production by S. pyogenes of both a hyaluronic acid capsule and hyaluronidase enzymatic activity capable of destroying the capsule is an interesting, yet-unexplained, phenomenon.	278
284598	pfam07213	DAP10	DAP10 membrane protein. This family consists of several mammalian DAP10 membrane proteins. In activated mouse natural killer (NK) cells, the NKG2D receptor associates with two intracellular adaptors, DAP10 and DAP12, which trigger phosphatidyl inositol 3 kinase (PI3K) and Syk family protein tyrosine kinases, respectively. It has been suggested that the DAP10-PI3K pathway is sufficient to initiate NKG2D-mediated killing of target cells.	79
369264	pfam07214	DUF1418	Protein of unknown function (DUF1418). This family consists of several hypothetical Enterobacterial proteins of around 100 residues in length. Members of this family are often described as YbjC. In E. coli the ybjC gene is located downstream of nfsA (which encodes the major oxygen-insensitive nitroreductase). It is thought that nfsA and ybjC form an operon and its promoter is a class I SoxS-dependent promoter. The function of this family is unknown.	94
399888	pfam07215	DUF1419	Protein of unknown function (DUF1419). This family consists of several bacterial proteins of around 110 residues in length. Members of this family seem to be specific to Agrobacterium species and to Rhizobium loti. The function of this family is unknown.	112
284601	pfam07216	LcrG	LcrG protein. This family consists of several bacterial LcrG proteins. Yersiniae are equipped with the Yop virulon, an apparatus that allows extracellular bacteria to deliver toxic Yop proteins inside the host cell cytosol in order to sabotage the communication networks of the host cell or even to cause cell death. LcrG is a component of the Yop virulon involved in the regulation of secretion of the Yops.	91
399889	pfam07217	Het-C	Heterokaryon incompatibility protein Het-C. In filamentous fungi, het loci (for heterokaryon incompatibility) are believed to regulate self/nonself-recognition during vegetative growth. As filamentous fungi grow, hyphal fusion occurs within an individual colony to form a network. Hyphal fusion can occur also between different individuals to form a heterokaryon, in which genetically distinct nuclei occupy a common cytoplasm. However, heterokaryotic cells are viable only if the individuals involved have identical alleles at all het loci.	560
284603	pfam07218	RAP1	Rhoptry-associated protein 1 (RAP-1). This family consists of several rhoptry-associated protein 1 (RAP-1) sequences which appear to be specific to Plasmodium falciparum.	793
399890	pfam07219	HemY_N	HemY protein N-terminus. This family represents the N-terminus (approximately 150 residues) of bacterial HemY porphyrin biosynthesis proteins. This is a membrane protein involved in a late step of protoheme IX synthesis.	104
115849	pfam07220	DUF1420	Protein of unknown function (DUF1420). This family consists of several hypothetical putative lipoproteins which seem to be found specifically in the bacterium Leptospira interrogans. Members of this family are typically around 670 residues in length and their function is unknown.	672
399891	pfam07221	GlcNAc_2-epim	N-acylglucosamine 2-epimerase (GlcNAc 2-epimerase). This family contains a number of eukaryotic and bacterial N-acylglucosamine 2-epimerase (GlcNAc 2-epimerase) enzymes (EC:5.3.1.8) approximately 500 residues long. This converts N-acyl-D-glucosamine to N-acyl-D-mannosamine.	347
311272	pfam07222	PBP_sp32	Proacrosin binding protein sp32. This family consists of several mammalian specific proacrosin binding protein sp32 sequences. sp32 is a sperm specific protein which is known to bind with 55- and 53-kDa proacrosins and the 49-kDa acrosin intermediate. The exact function of sp32 is unclear; it is thought, however, that the binding of sp32 to proacrosin may be involved in packaging the acrosin zymogen into the acrosomal matrix.	243
399892	pfam07223	DUF1421	UBA-like domain (DUF1421). This domain represents a conserved region that has a UBA like fold. It is found in a number of plant proteins of unknown function.	45
254111	pfam07224	Chlorophyllase	Chlorophyllase. This family consists of several plant specific Chlorophyllase proteins (EC:3.1.1.14). Chlorophyllase (Chlase) is the first enzyme involved in chlorophyll (Chl) degradation and catalyzes the hydrolysis of ester bond to yield chlorophyllide and phytol.	307
399893	pfam07225	NDUF_B4	NADH-ubiquinone oxidoreductase B15 subunit (NDUFB4). This family consists of several NADH-ubiquinone oxidoreductase B15 subunit proteins (EC:1.6.5.3).	124
399894	pfam07226	DUF1422	Protein of unknown function (DUF1422). This family consists of several hypothetical bacterial proteins of around 120 residues in length. The function of this family is unknown.	114
399895	pfam07227	PHD_Oberon	PHD - plant homeodomain finger protein. PHD_oberon is a plant homeodomain finger domain of Oberon proteins from plants. Oberon is necessary for maintenance and/or establishment of both the shoot and root apical meristems in Arabidopsis. Oberon proteins are made up of a PHD finger domain and a coiled-coil domain. The PHD-finger domain is found in a wide variety of proteins involved in the regulation of chromatin structure. Oberon proteins mediate the TMO7 (the direct target of MP) expression through modification of, or binding to, chromatin at the TMO7 locus. TMO7 stands for the target of Monopteros 7 (MP) (or Auxin response factor 7).	130
399896	pfam07228	SpoIIE	Stage II sporulation protein E (SpoIIE). This family contains a number of bacterial stage II sporulation E proteins (EC:3.1.3.16). These are required for formation of a normal polar septum during sporulation. The N-terminal region is hydrophobic and is expected to contain up to 12 membrane-spanning segments.	192
399897	pfam07229	VirE2	VirE2. This family consists of several VirE2 proteins which seem to be specific to Agrobacterium tumefaciens and Rhizobium etli. VirE2 is known to interact, via its C-terminus, with VirD4. Agrobacterium tumefaciens transfers oncogenic DNA and effector proteins to plant cells during the course of infection. Substrate translocation across the bacterial cell envelope is mediated by a type IV secretion (TFS) system composed of the VirB proteins, as well as VirD4, a member of a large family of inner membrane proteins implicated in the coupling of DNA transfer intermediates to the secretion machine. VirE2 is therefore thought to be a protein substrate of a type IV secretion system which is recruited to a member of the coupling protein superfamily.	557
399898	pfam07230	Peptidase_S80	Bacteriophage T4-like capsid assembly protein (Gp20). This family consists of several bacteriophage T4-like capsid assembly (or portal) proteins. The exact mechanism by which the double-stranded (ds) DNA bacteriophages incorporate the portal protein at a unique vertex of the icosahedral capsid is unknown. In phage T4, there is evidence that this vertex, constituted by 12 subunits of gp20, acts as an initiator for the assembly of the major capsid protein and the scaffolding proteins into a prolate icosahedron of precise dimensions. The regulation of portal protein gene expression is an important regulator of prohead assembly in bacteriophage T4. This family represents the protease responsible for the proteolysis of head proteins, a critical step in the morphogenesis of many tailed phages. Cleavage facilitates the conversion of the prohead to the mature capsid. All these cleavages are carried out by action at consensus S/A/G-X-E recognition sequences at 39 cleavage sites. Evidence of multiple processing sites in nine phiKZ proteins appears to represent a built-in mechanism by which the phage ensures that the majority of the propeptide regions are removed, and emphasizes the essential nature of processing in phiKZ-head morphogenesis. The family is classified by MEROPS as a serine peptidase.	445
399899	pfam07231	Hs1pro-1_N	Hs1pro-1 N-terminus. This family represents the N-terminus (approximately 180 residues) of plant Hs1pro-1, which is believed to confer resistance to nematodes.	195
115861	pfam07232	DUF1424	Putative rep protein (DUF1424). This family consists of several archaeal proteins of around 320 residues in length. Members of this family seem to be found exclusively in Halobacterium and Haloferax species. The function of this family is unknown. This protein is probably a rep protein due to conservation of functional motifs.	329
399900	pfam07233	DUF1425	Protein of unknown function (DUF1425). This family consists of several hypothetical bacterial proteins of around 125 residues in length. Several members of this family are described as putative lipoproteins and are often known as YcfL. The function of this family is unknown.	87
284616	pfam07234	Babuvirus_MP	Movement and RNA silencing protein. This family consists of several Babuvirus proteins of around 120 residues in length. Proteins in this family include movement and RNA silencing protein (also known as MP) from Banana bunchy top virus. MP acts as a suppressor of RNA-mediated gene silencing, also known as post-transcriptional gene silencing (PTGS), a mechanism of plant viral defense that limits the accumulation of viral RNAs. It transports viral genome to neighboring plant cells directly through plasmodesmata, without any budding. The movement protein allows efficient cell to cell propagation, by bypassing the host cell wall barrier.	117
399901	pfam07235	DUF1427	Protein of unknown function (DUF1427). This family consists of several bacterial proteins of around 100 residues in length. The function of this family is unknown.	84
399902	pfam07236	Phytoreo_S7	Phytoreovirus S7 protein. This family consists of several Phytoreovirus S7 proteins which are thought to be viral core proteins.	505
399903	pfam07237	DUF1428	Protein of unknown function (DUF1428). This family consists of several hypothetical bacterial and one archaeal sequence of around 120 residues in length. The function of this family is unknown. The structure of this family shows it to be part of the Dimeric-alpha-beta-barrel superfamily. Many members are annotated as being RNA signal recognition particle 4.5S RNA, but this could not be verified.	102
399904	pfam07238	PilZ	PilZ domain. PilZ is a c-di-GMP binding domain which is found C terminal to pfam07317. Proteins which contain PilZ are known to interact with the flagellar switch-complex proteins FliG and FliM. This interaction results in a reduction of torque generation and induces CCW motor bias. This domain forms a beta barrel structure.	102
399905	pfam07239	OpcA	Outer membrane protein OpcA. This family consists of several Neisseria species specific OpcA outer membrane proteins. Opc (formerly called 5C) is one of the major outer membrane proteins and has been shown to play an important role in meningococcal adhesion and invasion of both epithelial and endothelial cells.	250
399906	pfam07240	Turandot	Stress-inducible humoral factor Turandot. This family consists of several Drosophila species specific Turandot proteins. The Turandot A (TotA) gene encodes a humoral factor, which is secreted from the fat body and accumulates in the body fluids. TotA is strongly induced upon bacterial challenge, as well as by other types of stress such as high temperature, mechanical pressure, dehydration, UV irradiation, and oxidative agents. It is also up-regulated during metamorphosis and at high age. Flies that over-express TotA show prolonged survival and retain normal activity at otherwise lethal temperatures. Although TotA is only induced by severe stress, it responds to a much wider range of stimuli than heat shock genes such as hsp70 or immune genes such as Cecropin A1.	81
369281	pfam07242	DUF1430	Protein of unknown function (DUF1430). This family represents the C-terminus (approximately 120 residues) of a number of hypothetical bacterial proteins of unknown function. These are possibly membrane proteins involved in immunity.	100
369282	pfam07243	Phlebovirus_G1	Phlebovirus glycoprotein G1. This family consists of several Phlebovirus glycoprotein G1 sequences. Members of the Bunyaviridae family acquire an envelope by budding through the lipid bilayer of the Golgi complex. The budding compartment is thought to be determined by the accumulation of the two heterodimeric membrane glycoproteins G1 and G2 in the Golgi.	527
399907	pfam07244	POTRA	Surface antigen variable number repeat. This family is found primarily in bacterial surface antigens, normally as variable number repeats at the N-terminus. The C-terminus of these proteins is normally represented by pfam01103. The alignment centers on a -GY- or -GF- motif. Some members of this family are found in the mitochondria. It is predicted to have a mixed alpha/beta secondary structure.	80
399908	pfam07245	Phlebovirus_G2	Phlebovirus glycoprotein G2. This family consists of several Phlebovirus glycoprotein G2 sequences. Members of the Bunyaviridae family acquire an envelope by budding through the lipid bilayer of the Golgi complex. The budding compartment is thought to be determined by the accumulation of the two heterodimeric membrane glycoproteins G1 and G2 in the Golgi.	325
369285	pfam07246	Phlebovirus_NSM	Phlebovirus nonstructural protein NS-M. This family consists of several Phlebovirus nonstructural NS-M proteins which represent the N-terminal region of the M polyprotein precursor. The function of this family is unknown.	228
369286	pfam07247	AATase	Alcohol acetyltransferase. This family contains a number of alcohol acetyltransferase (EC:2.3.1.84) enzymes approximately 500 residues long found in both bacteria and metazoa. These catalyze the esterification of isoamyl alcohol by acetyl coenzyme A.	500
399909	pfam07248	DUF1431	Protein of unknown function (DUF1431). This family contains a number of Drosophila melanogaster proteins of unknown function. These contain several conserved cysteine residues.	154
369288	pfam07249	Cerato-platanin	Cerato-platanin. This family contains a number of fungal cerato-platanin phytotoxic proteins approximately 150 residues long. Cerato-platanin contains four cysteine residues that form two disulphide bonds.	118
399910	pfam07250	Glyoxal_oxid_N	Glyoxal oxidase N-terminus. This family represents the N-terminus (approximately 300 residues) of a number of plant and fungal glyoxal oxidase enzymes. Glyoxal oxidase catalyzes the oxidation of aldehydes to carboxylic acids, coupled with reduction of dioxygen to hydrogen peroxide. It is an essential component of the extracellular lignin degradation pathways of the wood-rot fungus Phanerochaete chrysosporium.	243
284628	pfam07252	DUF1433	Protein of unknown function (DUF1433). This family contains a number of hypothetical bacterial proteins of unknown function approximately 100 residues in length.	88
254125	pfam07253	Gypsy	Gypsy protein. This family consists of several Gypsy/Env proteins from Drosophila and Ceratitis fruit fly species. Gypsy is an endogenous retrovirus of Drosophila melanogaster. Phylogenetic studies suggest that occasional horizontal transfer events of gypsy occur between Drosophila species. Gypsy possesses infective properties associated with the products of the envelope gene that might be at the origin of these interspecies transfers. This family contains many members with full-length matches; however, it also includes a number of very short sequences and short matches of sequences with other unrelated domains on them, which cannot be excluded. These matches may represent remnants of once-functional genes.	472
399911	pfam07254	Cpta_toxin	Membrane-bound toxin component of toxin-antitoxin system. CptA is a family of bacterial proteins named for the member of this family, YGFX_ECOLI. YgfX was previously thought to be the toxic part of a toxin-antitoxin module along with the antitoxin, pfam03937 Sdh5. However, studies have shown that YgfX interferes with correct cell division and morphology. Furthermore, the function of YgfX-SdhE as a TA system could not be demonstrated in either E. coli or Serratia sp. ATCC 39006. YgfX is predicted to have a short N-terminal cytoplasmic domain followed by two transmembrane helices (TMHs) separated by a short periplasmic loop and finally, a larger C-terminal cytoplasmic domain. The TMHs of YgfX are required for activity, but the sequence of the cytoplasmic 13 N-terminal amino acids is not essential. Furthermore, the amino acids W34 and D117 are not required for localization but are necessary for YgfX multimerization, interaction with SdhE, and YgfX activity. It is proposed that the formation of YgfX multimeric membrane-bound proteins is required to enable the interaction with the cytoplasmic SDH assembly factor SdhE. Another study has demonstrated that sdhEygfX (bicistronic operon) affects pig biosynthesis, directly or indirectly, at the level of transcription of the biosynthetic operon (pigA-O). It has also been suggested that, in addition to indirect transcriptional activation of pigA-O, YgfX might facilitate the formation of a terminal pig biosynthetic complex consisting of PigB and PigC.	130
284630	pfam07255	Benyvirus_14KDa	Benyvirus 14KDa protein. This family consists of several Benyvirus specific 14KDa proteins of around 125 residues in length. Members of this family contain 9 conserved cysteine residues. The function of this family is unknown.	123
399912	pfam07256	DUF1435	Protein of unknown function (DUF1435). This family consists of several hypothetical Enterobacterial proteins of around 80 residues in length. The function of this family is unknown.	75
399913	pfam07258	COMM_domain	COMM domain. The leucine-rich, 70-85 amino acid long COMM domain is predicted to form a beta-sheet and an extreme C-terminal alpha-helix. The COMM domain containing proteins are about 200 residues in length and passed the C-terminal COMM domain.	72
399914	pfam07259	ProSAAS	ProSAAS precursor. This family consists of several mammalian proSAAS precursor proteins. ProSAAS mRNA is expressed primarily in brain and other neuroendocrine tissues (pituitary, adrenal, pancreas); within brain, the mRNA is broadly distributed among neurons. ProSAAS is thought to be an endogenous inhibitor of prohormone convertase 1 and may function as a neuropeptide. N-terminal fragments of proSAAS in intracellular Pick Bodies (PBs) may cause a functional disturbance of neurons in Pick's disease.	266
311292	pfam07260	ANKH	Progressive ankylosis protein (ANKH). This family consists of several progressive ankylosis protein (ANK or ANKH) sequences. The ANK protein spans the outer cell membrane and shuttles inorganic pyrophosphate (PPi), a major inhibitor of physiologic and pathologic calcification, bone mineralisation and bone resorption. Mutations in ANK are thought to give rise to Craniometaphyseal dysplasia (CMD) which is a rare skeletal disorder characterized by progressive thickening and increased mineral density of craniofacial bones and abnormally developed metaphyses in long bones. This family shows distant homology to the MOP (TCDB) superfamily of transporters.	344
399915	pfam07261	DnaB_2	Replication initiation and membrane attachment. This family consists of several bacterial replication initiation and membrane attachment (DnaB) proteins, as well as DnaD which is a component of the PriA primosome. The PriA primosome functions to recruit the replication fork helicase onto the DNA. The DnaB protein is essential for both replication initiation and membrane attachment of the origin region of the chromosome and plasmid pUB110 in Bacillus subtilis. It is known that there are two different classes (DnaBI and DnaBII) in the DnaB mutants; DnaBI is essential for both chromosome and pUB110 replication, whereas DnaBII is necessary only for chromosome replication. DnaD has been merged into this family. This family also includes Ftn6, a cyanobacterial-specific divisome component possibly playing a role at the interface between DNA replication and cell division. Ftn6 possesses a conserved domain localized within the N-terminus of the proteins. This domain, named FND, exhibits sequence and structure similarities with the DnaD-like domains pfam04271 now merged into pfam07261.	70
399916	pfam07262	CdiI	CDI immunity protein. CdiI immunity proteins function as part of the bacterial contact-dependent growth inhibition (CDI) system. CDI is mediated by the CdiB-CdiA two-partner secretion system. Each CdiA protein exhibits a distinct growth inhibition activity, which resides in the polymorphic C-terminal region (CdiA-CT). Cells with the CDI system also express a CdiI immunity protein that blocks the activity of cognate CdiA-CT, thereby protecting the cell from autoinhibition. In many CDI systems the cdiBAI genes are followed by orphan cdiA-CT/cdiI modules, suggesting that these modules are exchanged between the CDI systems of different bacteria.	155
399917	pfam07263	DMP1	Dentin matrix protein 1 (DMP1). This family consists of several mammalian dentin matrix protein 1 (DMP1) sequences. The dentin matrix acidic phosphoprotein 1 (DMP1) gene has been mapped to human chromosome 4q21. DMP1 is a bone and teeth specific protein initially identified from mineralized dentin. DMP1 is primarily localized in the nuclear compartment of undifferentiated osteoblasts. In the nucleus, DMP1 acts as a transcriptional component for activation of osteoblast-specific genes like osteocalcin. During the early phase of osteoblast maturation, Ca(2+) surges into the nucleus from the cytoplasm, triggering the phosphorylation of DMP1 by a nuclear isoform of casein kinase II. This phosphorylated DMP1 is then exported out into the extracellular matrix, where it regulates nucleation of hydroxyapatite. DMP1 is a unique molecule that initiates osteoblast differentiation by transcription in the nucleus and orchestrates mineralized matrix formation extracellularly, at later stages of osteoblast maturation. The DMP1 gene has been found to be ectopically expressed in lung cancer although the reason for this is unknown.	522
399918	pfam07264	EI24	Etoposide-induced protein 2.4 (EI24). This family contains a number of eukaryotic etoposide-induced 2.4 (EI24) proteins approximately 350 residues long as well as bacterial CysZ proteins (formerly known as DUF540). In cells treated with the cytotoxic drug etoposide, EI24 is induced by p53. It has been suggested to play an important role in negative cell growth control.	161
399919	pfam07265	TAP35_44	Tapetum specific protein TAP35/TAP44. This family consists of several plant tapetum specific proteins. Members of this family are found in Arabidopsis thaliana, Brassica napus and Sinapis alba. Members of this family may be involved in sporopollenin formation and/or deposition.	119
369296	pfam07267	Nucleo_P87	Nucleopolyhedrovirus capsid protein P87. This family consists of several Nucleopolyhedrovirus capsid protein P87 sequences. P87 is expressed late in infection and concentrated in infected cell nuclei.	623
284641	pfam07268	EppA_BapA	Exported protein precursor (EppA/BapA). This family consists of a number of exported protein precursor (EppA and BapA) sequences which seem to be specific to Borrelia burgdorferi (Lyme disease spirochete). bapA gene sequences are quite stable but the encoded proteins do not provoke a strong immune response in most individuals. Conversely, EppA proteins are much more antigenic but are more variable in sequence. It is thought that BapA and EppA play important roles during the Borrelia burgdorferi infectious cycle.	138
284642	pfam07270	DUF1438	Protein of unknown function (DUF1438). This family consists of several hypothetical proteins of around 170 residues in length which appear to be mouse specific. The function of this family is unknown.	151
284643	pfam07271	Cytadhesin_P30	Cytadhesin P30/P32. This family consists of several Mycoplasma species specific Cytadhesin P32 and P30 proteins. P30 has been found to be membrane associated and localized on the tip organelle. It is thought that it is important in cytadherence and virulence.	308
148716	pfam07272	Orthoreo_P17	Orthoreovirus P17 protein. This family consists of several Orthoreovirus P17 proteins. P17 is specified by ORF2 of the S1 gene and represents a nonstructural protein which associates with cell membranes.	146
399920	pfam07273	DUF1439	Protein of unknown function (DUF1439). This family consists of several hypothetical bacterial proteins of around 190 residues in length. Several members of this family are annotated as being putative lipoproteins and are often known as YceB. The function of this family is unknown.	151
399921	pfam07274	DUF1440	Protein of unknown function (DUF1440). This family contains a number of bacterial proteins of unknown function approximately 180 residues long. These are possibly integral membrane proteins.	133
399922	pfam07275	ArdA	Antirestriction protein (ArdA). This family consists of several bacterial antirestriction (ArdA) proteins. ArdA functions in bacterial conjugation to allow an unmodified plasmid to evade restriction in the recipient bacterium and yet acquire cognate modification.	153
115901	pfam07276	PSGP	Apopolysialoglycoprotein (PSGP). This family represents a series of 13-residue repeats found in the apopolysialoglycoprotein of Oncorhynchus mykiss (Rainbow trout) and Oncorhynchus masou (Cherry salmon). Polysialoglycoprotein (PSGP) of unfertilized eggs of rainbow trout consists of tandem repeats of a glycotridecapeptide, Asp-Asp-Ala-Thr*-Ser*-Glu-Ala-Ala-Thr*-Gly-Pro-Ser-Gly (* denotes the attachment site of a polysialoglycan chain). In response to egg activation, PSGP is discharged by exocytosis into the space between the vitelline envelope and the plasma membrane, i.e. the perivitelline space, where the 200-kDa PSGP molecules undergo rapid and dramatic depolymerization by proteolysis into glycotridecapeptides.	13
399923	pfam07277	SapC	SapC. This family contains a number of bacterial SapC proteins approximately 250 residues long. In Campylobacter fetus, SapC forms part of a paracrystalline surface layer (S-layer) that confers serum resistance.	221
399924	pfam07278	DUF1441	Protein of unknown function (DUF1441). This family consists of several hypothetical Enterobacterial proteins of around 160 residues in length. The function of this family is unknown. However, it appears to be distantly related to other HTH families so may act as a transcriptional regulator.	149
115904	pfam07279	DUF1442	Protein of unknown function (DUF1442). This family consists of several hypothetical Arabidopsis thaliana proteins of around 225 residues in length. The function of this family is unknown.	218
148722	pfam07280	Ac110_PIF	Per os infectivity factor AC110. This family consists of several Baculovirus proteins of around 55 residues in length. Family members include Autographa californica nuclear polyhedrosis virus (AcMNPV) Per os infectivity factor AC110, which is required for oral infectivity. It may play a role after occlusion-derived virions pass through the host's peritrophic membrane.	43
399925	pfam07281	INSIG	Insulin-induced protein (INSIG). This family contains a number of eukaryotic Insulin-induced proteins (INSIG-1 and INSIG-2) approximately 200 residues long. INSIG-1 and INSIG-2 are found in the endoplasmic reticulum and bind the sterol-sensing domain of SREBP cleavage-activating protein (SCAP), preventing it from escorting SREBPs to the Golgi. Their combined action permits feedback regulation of cholesterol synthesis over a wide range of sterol concentrations.	187
284650	pfam07282	OrfB_Zn_ribbon	Putative transposase DNA-binding domain. This putative domain is found at the C-terminus of a large number of transposase proteins. This domain contains four conserved cysteines suggestive of a zinc binding domain. Given the need for transposases to bind DNA as well as the large number of DNA-binding zinc fingers we hypothesize this domain is DNA-binding.	69
369299	pfam07283	TrbH	Conjugal transfer protein TrbH. This family contains TrbH, a bacterial conjugal transfer protein approximately 150 residues long. This contains a putative membrane lipoprotein lipid attachment site.	119
399926	pfam07284	BCHF	2-vinyl bacteriochlorophyllide hydratase (BCHF). This family contains the bacterial enzyme 2-vinyl bacteriochlorophyllide hydratase (EC:4.2.1.-) (approximately 150 residues long). This is involved in the light-independent bacteriochlorophyll biosynthesis pathway by adding water across the 2-vinyl group.	139
377802	pfam07285	DUF1444	Protein of unknown function (DUF1444). This family contains several hypothetical bacterial proteins of unknown function that are approximately 250 residues long.	264
399927	pfam07286	DUF1445	Protein of unknown function (DUF1445). This family represents a conserved region approximately 150 residues long within a number of hypothetical bacterial and eukaryotic proteins of unknown function.	143
399928	pfam07287	AtuA	Acyclic terpene utilisation family protein AtuA. This family consists of several bacterial and plant proteins of around 400 residues in length. One member of this family has been characterized in Pseudomonas citronellolis as AtuA, a member of a gene cluster that is essential for the acyclic terpene utilisation (Atu) pathway.	348
399929	pfam07288	DUF1447	Protein of unknown function (DUF1447). This family consists of several bacterial proteins of around 70 residues in length. The function of this family is unknown.	68
399930	pfam07289	BBL5	Bardet-Biedl syndrome 5 protein. BBS5 is part of the BBSome complex that may function as a coat complex required for sorting of specific membrane proteins to the primary cilia. Mutations in the BBS5 gene cause Bardet-Biedl syndrome 5.	334
399931	pfam07290	DUF1449	Protein of unknown function (DUF1449). This family consists of several bacterial proteins of around 210 residues in length. The function of this family is unknown.	198
399932	pfam07291	MauE	Methylamine utilisation protein MauE. This family consists of several bacterial methylamine utilisation MauE proteins. Enzymes involved in methylamine oxidation via methylamine dehydrogenase (MADH) are encoded by genes present in the mau cluster. MauE and MauD are specifically involved in the processing, transport, and/or maturation of the beta-subunit, and the absence of either of these proteins leads to production of a non-functional beta-subunit which becomes rapidly degraded.	184
399933	pfam07292	NID	Nmi/IFP 35 domain (NID). This family represents a domain of approximately 90 residues that is tandemly repeated within interferon-induced 35 kDa protein (IFP 35) and the homologous N-myc-interactor (Nmi). This domain mediates Nmi-Nmi protein interactions and subcellular localization.	89
399934	pfam07293	DUF1450	Protein of unknown function (DUF1450). This family consists of several hypothetical bacterial proteins of around 80 residues in length. Members of this family contain four highly conserved cysteine residues. The function of this family is unknown.	75
284662	pfam07294	Fibroin_P25	Fibroin P25. This family consists of several insect fibroin P25 proteins. Silk fibroin produced by the silkworm Bombyx mori consists of a heavy chain, a light chain, and a glycoprotein, P25. The heavy and light chains are linked by a disulfide bond, and P25 associates with disulfide-linked heavy and light chains by non-covalent interactions. P25 plays an important role in maintaining the integrity of the complex.	196
399935	pfam07295	DUF1451	Zinc-ribbon containing domain. This family consists of several hypothetical bacterial proteins of around 160 residues in length. Members of this family contain four highly conserved cysteine residues toward the C-terminal region of the protein.	146
254144	pfam07296	TraP	TraP protein. This family consists of several bacterial conjugative transfer TraP proteins from Escherichia coli and Salmonella typhimurium. TraP appears to play a minor role in conjugation and may interact with TraB, which varies in sequence along with TraP, in order to stabilize the proposed transmembrane complex formed by the tra operon products.	202
399936	pfam07297	DPM2	Dolichol phosphate-mannose biosynthesis regulatory protein (DPM2). This family consists of several eukaryotic dolichol phosphate-mannose biosynthesis regulatory (DPM2) proteins. Biosynthesis of glycosylphosphatidylinositol and N-glycan precursor is dependent upon a mannosyl donor, dolichol phosphate-mannose (DPM). DPM2, an 84 amino acid membrane protein expressed in the endoplasmic reticulum (ER), makes a complex with DPM1 that is essential for the ER localization and stable expression of DPM1. Moreover, DPM2 enhances binding of dolichol phosphate, a substrate of DPM synthase. Biosynthesis of DPM in mammalian cells is regulated by DPM2.	76
399937	pfam07298	NnrU	NnrU protein. This family consists of several plant and bacterial NnrU proteins. NnrU is thought to be involved in the reduction of nitric oxide. The exact function of NnrU is unclear. It is thought however that NnrU and perhaps NnrT are required for expression of both nirK and nor.	189
399938	pfam07299	EF-G-binding_N	Elongation factor G-binding protein, N-terminal. This domain can be found in the N-terminus of the FusB, FusC, and FusD proteins from Staphylococcus aureus. They are elongation factor G (EF-G) binding proteins that are linked to the fusidic acid (FA) resistance in S. aureus. The FusB proteins are two-domain metalloproteins, and this N-terminal domain forms a four-helical bundle whose helices help to stabilize the conformation of the treble-clef zinc-finger in the C-terminal domain. FA is an antibiotic that binds to EF-G, preventing its release from the ribosome, thus stalling bacterial protein synthesis. The FusB proteins provide FA resistance by preventing formation or facilitating dissociation of the FA-locked EF-G-ribosome complex during elongation and ribosome recycling.	82
399939	pfam07301	DUF1453	Protein of unknown function (DUF1453). This family consists of several hypothetical bacterial proteins of around 150 residues in length. The function of this family is unknown. Members of this family seem to be found exclusively in the Order Bacillales.	144
399940	pfam07302	AroM	AroM protein. This family consists of several bacterial and archaeal AroM proteins. In Escherichia coli the aroM gene is cotranscribed with aroL. The function of this family is unknown.	218
399941	pfam07303	Occludin_ELL	Occludin homology domain. This domain represents a conserved region approximately 100 residues long within eukaryotic occludin proteins and the RNA polymerase II elongation factor ELL. Occludin is an integral membrane protein that localizes to tight junctions, while ELL is an elongation factor that can increase the catalytic rate of RNA polymerase II transcription by suppressing transient pausing by polymerase at multiple sites along the DNA. This shared domain is thought to mediate protein interactions.	101
399942	pfam07304	SRA1	Steroid receptor RNA activator (SRA1). This family consists of several hypothetical mammalian steroid receptor RNA activator proteins. SRA-RNAs likely to encode stable proteins are widely expressed in breast cancer cell lines. SRA-RNA is a steroid receptor co-activator which acts as a functional RNA and is classified as belonging to the growing family of functional non-coding RNAs.	145
399943	pfam07305	DUF1454	Protein of unknown function (DUF1454). This family consists of several Enterobacterial sequences of around 200 residues in length which are often known as YiiQ proteins. The function of this family is unknown.	190
284672	pfam07306	DUF1455	Protein of unknown function (DUF1455). This family consists of several hypothetical putative outer membrane proteins which appear to be specific to Anaplasma marginale and Anaplasma ovis.	130
399944	pfam07307	HEPPP_synt_1	Heptaprenyl diphosphate synthase (HEPPP synthase) subunit 1. This family contains subunit 1 of bacterial heptaprenyl diphosphate synthase (HEPPP synthase) (EC:2.5.1.30) (approximately 230 residues long). The enzyme consists of two subunits, both of which are required for catalysis of heptaprenyl diphosphate synthesis.	210
399945	pfam07308	DUF1456	Protein of unknown function (DUF1456). This family consists of several hypothetical bacterial proteins of around 150 residues in length. The function of this family is unknown.	68
399946	pfam07309	FlaF	Flagellar protein FlaF. This family consists of several bacterial FlaF flagellar proteins. FlaF and FlaG are trans-acting, regulatory factors that modulate flagellin synthesis during flagellum biogenesis.	113
399947	pfam07310	PAS_5	PAS domain. This family contains a number of hypothetical bacterial proteins of unknown function approximately 200 residues long. This region is distantly similar to other PAS domains.	136
399948	pfam07311	Dodecin	Dodecin. Dodecin is a flavin-binding protein, found in several bacteria and a few archaea, and represents a stand-alone version of the SHS2 domain. It most closely resembles the SHS2 domains of FtsA and Rpb7p, and represents a single domain small-molecule binding form.	62
369313	pfam07312	DUF1459	Protein of unknown function (DUF1459). This family consists of several hypothetical Caenorhabditis elegans proteins of around 85 residues in length. The function of this family is unknown.	81
399949	pfam07313	DUF1460	Protein of unknown function (DUF1460). This family consists of several hypothetical bacterial proteins of around 260 residues in length. The function of this family is unknown.	214
399950	pfam07314	DUF1461	Protein of unknown function (DUF1461). This family contains a number of hypothetical bacterial proteins of unknown function approximately 200 residues long. These are possibly integral membrane proteins.	175
377812	pfam07315	DUF1462	Protein of unknown function (DUF1462). This family consists of several hypothetical bacterial proteins of around 100 residues in length. The function of this family is unknown.	93
311332	pfam07316	DUF1463	Protein of unknown function (DUF1463). This family consists of several hypothetical bacterial proteins of around 140 residues in length. Members of this family seem to be found exclusively in Borrelia burgdorferi (Lyme disease spirochete). The function of this family is unknown.	137
399951	pfam07317	YcgR	Flagellar regulator YcgR. This domain is found N terminal to pfam07238. Proteins which contain YcgR domains are known to interact with the flagellar switch-complex proteins FliG and FliM. This interaction results in a reduction of torque generation and induces CCW motor bias.	103
399952	pfam07318	DUF1464	Protein of unknown function (DUF1464). This family consists of several hypothetical archaeal proteins of around 350 residues in length. The function of this family is unknown.	327
399953	pfam07319	DnaI_N	Primosomal protein DnaI N-terminus. This family represents the N-terminus (approximately 120 residues) of bacterial primosomal DnaI proteins, although one family member appears to be of viral origin. DnaI is one of the components of the Bacillus subtilis replication restart primosome, and is required for the DnaB75-dependent loading of the DnaC helicase.	90
399954	pfam07321	YscO	Type III secretion protein YscO. This family contains the bacterial type III secretion protein YscO, which is approximately 150 residues long. YscO has been shown to be required for high-level expression and secretion of the anti-host proteins V antigen and Yops in Yersinia pestis.	148
284687	pfam07322	Seadorna_Vp10	Seadornavirus Vp10. This family consists of several Seadornavirus Vp10 proteins found in the Banna and Kadipiro viruses. Members of this family are typically around 240 residues in length. The function of this family is unknown.	241
399955	pfam07323	DUF1465	Protein of unknown function (DUF1465). This family consists of several hypothetical bacterial proteins of around 180 residues in length. The function of this family is unknown.	154
369318	pfam07324	DGCR6	DiGeorge syndrome critical region 6 (DGCR6) protein. This family contains DiGeorge syndrome critical region 6 (DGCR6) proteins (approximately 200 residues long) of a number of vertebrates. DGCR6 is a candidate for involvement in the DiGeorge syndrome pathology by playing a role in neural crest cell migration into the third and fourth pharyngeal pouches, the structures from which derive the organs affected in DiGeorge syndrome. Also found in this family is the Drosophila melanogaster gonadal protein gdl.	187
148753	pfam07325	Curto_V2	Curtovirus V2 protein. This family consists of several Curtovirus V2 proteins. The exact function of V2 is unclear but it is known that the protein is required for a successful host infection process.	126
399956	pfam07326	DUF1466	Protein of unknown function (DUF1466). This family consists of several hypothetical mammalian proteins of around 240 residues in length.	229
115951	pfam07327	Neuroparsin	Neuroparsin. This family consists of several locust specific neuroparsin proteins. Neuroparsins are produced by the A1 type of protocerebral median neurosecretory cells of the PI-CC system and display pleiotropic activities: inhibition of the effect of juvenile hormone, stimulation of fluid reabsorption of isolated recta, induction of an increase in hemolymph lipid and trehalose levels, and neurotrophic effects.	103
284691	pfam07328	VirD1	T-DNA border endonuclease VirD1. This family consists of several T-DNA border endonuclease VirD1 proteins which appear to be found exclusively in Agrobacterium species. Agrobacterium, a plant pathogen, is capable of stably transforming the plant cell with a segment of its own DNA called T-DNA (transferred DNA). This process depends, among other factors, on the specialized bacterial virulence proteins VirD1 and VirD2 that excise the T-DNA from its adjacent sequences. VirD1 is thought to interact with VirD2 in this process.	142
399957	pfam07330	DUF1467	Protein of unknown function (DUF1467). This family consists of several bacterial proteins of around 90 residues in length. The function of this family is unknown.	82
399958	pfam07331	TctB	Tripartite tricarboxylate transporter TctB family. This family consists of several hypothetical bacterial proteins of around 150 residues in length. This family was formerly known as DUF1468.	136
399959	pfam07332	Phage_holin_3_6	Putative Actinobacterial Holin-X, holin superfamily III. Phage_holin_3_6 is a family of small hydrophobic proteins with two or three transmembrane domains of the Hol-X family. Holin proteins are produced by double-stranded DNA bacteriophages that use an endolysin-holin strategy to achieve lysis of their hosts. The endolysins are peptidoglycan-degrading enzymes that are usually accumulated in the cytosol until access to the cell wall substrate is provided by the holin membrane lesion.	116
284695	pfam07333	SLR1-BP	S locus-related glycoprotein 1 binding pollen coat protein (SLR1-BP). This family consists of a number of cysteine rich SLR1 binding pollen coat like proteins. Adhesion of pollen grains to the stigmatic surface is a critical step during sexual reproduction in plants. In Brassica, S locus-related glycoprotein 1 (SLR1), a stigma-specific protein belonging to the S gene family of proteins, has been shown to be involved in this step. SLR1-BP specifically binds SLR1 with high affinity. The SLR1-BP gene is specifically expressed in pollen at late stages of development and is a member of the class A pollen coat protein (PCP) family, which includes PCP-A1, an SLG (S locus glycoprotein)-binding protein.	56
284696	pfam07334	IFP_35_N	Interferon-induced 35 kDa protein (IFP 35) N-terminus. This family represents the N-terminus of interferon-induced 35 kDa protein (IFP 35) (approximately 80 residues long), which contains a leucine zipper motif in an alpha helical configuration. This family also includes N-myc-interactor (Nmi), a homologous interferon-induced protein.	76
399960	pfam07335	Glyco_hydro_75	Fungal chitosanase of glycosyl hydrolase group 75. This family consists of several fungal chitosanase proteins. Chitin, xylan, 6-O-sulphated chitosan and O-carboxymethyl chitin are indigestible by chitosanase. EC:3.2.1.132. The mechanism is likely to be inverting, and the probable catalytic nucleophile/base is Asp, with the probable catalytic proton donor being Glu (see the Chitosanase web-page from CAZY).	165
399961	pfam07336	ABATE	Putative stress-induced transcription regulator. The structure of one member of the ABATE domain family consists of a two-domain organisation, with the N-terminal domain presenting a new fold called the ABATE domain that may bind an as yet unknown ligand. The C-terminal domain forms a treble-clef zinc-finger that is likely to be involved in DNA binding. Further computational analyses suggest a role as a stress-induced transcriptional regulator. Members of this family are found in Streptomyces, Rhizobium, Ralstonia, Agrobacterium and Bradyrhizobium species.	90
284699	pfam07337	CagY_M	DC-EC Repeat. This repeat is found in the CagY proteins, which are part of the CAG pathogenicity island and are involved in delivery of the protein CagA into host cells. CagY forms part of a surface needle structure, and this repeat may form an alpha-helical rod structure. Conserved -DC- and -EC- motifs can be seen regularly spaced in the alignment.	32
399962	pfam07338	DUF1471	Protein of unknown function (DUF1471). This family consists of several hypothetical Enterobacterial proteins of around 90 residues in length. Some members of this family are annotated as ydgH precursors and contain two copies of this region, one at the N-terminus and the other at the C-terminus. The function of this family is unknown.	56
115962	pfam07339	DUF1472	Protein of unknown function (DUF1472). This family consists of several Enterobacterial proteins of around 125 residues in length and contains 6 highly conserved cysteine residues. The function of this family is unknown.	101
284701	pfam07340	Herpes_IE1	Cytomegalovirus IE1 protein. Expression from a human cytomegalovirus early promoter (E1.7) has been shown to be activated in trans by the IE2 gene product. Although the IE1 gene product alone had no effect on this early viral promoter, maximal early promoter activity was detected when both IE1 and IE2 gene products were present. The IE1 protein from cytomegalovirus is also known as UL123.	391
115964	pfam07341	DUF1473	Protein of unknown function (DUF1473). This family consists of several hypothetical bacterial proteins of around 150 residues in length. Members of this family seem to be found exclusively in Borrelia burgdorferi (Lyme disease spirochete). The function of this family is unknown.	163
369323	pfam07342	DUF1474	Protein of unknown function (DUF1474). This family consists of several bacterial proteins of around 100 residues in length. Members of this family seem to be found exclusively in Staphylococcus aureus. The function of this family is unknown.	100
399963	pfam07343	DUF1475	Protein of unknown function (DUF1475). This family consists of several hypothetical plant proteins of around 250 residues in length. Members of this family seem to be found exclusively in Arabidopsis thaliana. The function of this family is unknown.	236
399964	pfam07344	Amastin	Amastin surface glycoprotein. This family contains the eukaryotic surface glycoprotein amastin (approximately 180 residues long). In Trypanosoma cruzi, amastin is particularly abundant during the amastigote stage.	156
399965	pfam07345	DUF1476	Domain of unknown function (DUF1476). This family consists of several hypothetical bacterial proteins of around 100 residues in length. Members of this family are found in Bradyrhizobium, Rhizobium, Brucella and Caulobacter species. The function of this family is unknown.	102
311348	pfam07346	DUF1477	Protein of unknown function (DUF1477). This family consists of several hypothetical Nucleopolyhedrovirus proteins of around 100 residues in length. The function of this family is unknown.	115
399966	pfam07347	CI-B14_5a	NADH:ubiquinone oxidoreductase subunit B14.5a (Complex I-B14.5a). This family contains the eukaryotic NADH:ubiquinone oxidoreductase subunit B14.5a (Complex I-B14.5a) (EC:1.6.5.3). This is approximately 100 residues long, and forms part of a multiprotein complex that resides on the inner mitochondrial membrane. The main function of the complex is the transport of electrons from NADH to ubiquinone, accompanied by translocation of protons from the mitochondrial matrix to the intermembrane space.	91
399967	pfam07348	Syd	Syd protein (SUKH-2). This family contains a number of bacterial Syd proteins approximately 180 residues long. It has been suggested that Syd is loosely associated with the cytoplasmic surface of the cytoplasmic membrane, and that interaction with SecY may be involved in this membrane association. Operon analysis showed that Syd protein may function as immunity protein in bacterial toxin systems.	174
284709	pfam07349	DUF1478	Protein of unknown function (DUF1478). This family consists of several hypothetical Sapovirus proteins of around 165 residues in length. The function of this family is unknown.	161
399968	pfam07350	DUF1479	Protein of unknown function (DUF1479). This family consists of several hypothetical Enterobacterial proteins, of around 420 residues in length. Members of this family are often known as YbiU. The function of this family is unknown.	404
369328	pfam07351	DUF1480	Protein of unknown function (DUF1480). This family consists of several hypothetical Enterobacterial proteins of around 80 residues in length. The function of this family is unknown.	79
399969	pfam07352	Phage_Mu_Gam	Bacteriophage Mu Gam like protein. This family consists of bacterial and phage Gam proteins. The gam gene of bacteriophage Mu encodes a protein which protects linear double stranded DNA from exonuclease degradation in vitro and in vivo.	146
284713	pfam07353	Uroplakin_II	Uroplakin II. This family contains uroplakin II, which is approximately 180 residues long and seems to be restricted to mammals. Uroplakin II is an integral membrane protein, and is one of the components of the apical plaques of mammalian urothelium formed by the asymmetric unit membrane - this is believed to play a role in strengthening the urothelial apical surface to prevent the cells from rupturing during bladder distension.	161
336682	pfam07354	Sp38	Zona-pellucida-binding protein (Sp38). This family contains a number of zona-pellucida-binding proteins that seem to be restricted to mammals. These are sperm proteins that bind to the 90-kDa family of zona pellucida glycoproteins in a calcium-dependent manner. These represent some of the specific molecules that mediate the first steps of gamete interaction, allowing fertilisation to occur.	177
399970	pfam07355	GRDB	Glycine/sarcosine/betaine reductase selenoprotein B (GRDB). This family represents a conserved region approximately 350 residues long within the selenoprotein B component of the bacterial glycine, sarcosine and betaine reductase complexes.	347
399971	pfam07356	DUF1481	Protein of unknown function (DUF1481). This family consists of several hypothetical bacterial proteins of around 230 residues in length. Members of this family are often referred to as YjaH and are found in the Orders Vibrionales and Enterobacteriales. The function of this family is unknown.	186
369330	pfam07357	DRAT	Dinitrogenase reductase ADP-ribosyltransferase (DRAT). This family consists of several bacterial dinitrogenase reductase ADP-ribosyltransferase (DRAT) proteins. Members of this family seem to be specific to Rhodospirillum, Rhodobacter and Azospirillum species. Dinitrogenase reductase ADP-ribosyl transferase (DRAT) carries out the transfer of the ADP-ribose from NAD to the Arg-101 residue of one subunit of the dinitrogenase reductase homodimer, resulting in inactivation of that enzyme. Dinitrogenase reductase-activating glycohydrolase (DRAG) removes the ADP-ribose group attached to dinitrogenase reductase, thus restoring nitrogenase activity. The DRAT-DRAG system negatively regulates nitrogenase activity in response to exogenous NH4+ or energy limitation in the form of a shift to darkness or to anaerobic conditions.	256
369331	pfam07358	DUF1482	Protein of unknown function (DUF1482). This family consists of several Enterobacterial proteins of around 60 residues in length. The function of this family is unknown.	57
284718	pfam07359	LEAP-2	Liver-expressed antimicrobial peptide 2 precursor (LEAP-2). This family consists of several mammalian liver-expressed antimicrobial peptide 2 (LEAP-2) sequences. LEAP-2 is a cysteine-rich, cationic protein. LEAP-2 contains a core structure with two disulfide bonds formed by cysteine residues in relative 1-3 and 2-4 positions. LEAP-2 is synthesized as a 77-residue precursor, which is predominantly expressed in the liver and highly conserved among mammals. The largest native LEAP-2 form of 40 amino acid residues is generated from the precursor at a putative cleavage site for a furin-like endoprotease. In contrast to smaller LEAP-2 variants, this peptide exhibits dose-dependent antimicrobial activity against selected microbial model organisms. The exact function of this family is unclear.	77
399972	pfam07361	Cytochrom_B562	Cytochrome b562. This family contains the bacterial cytochrome b562. This forms a four-helix bundle that non-covalently binds a single heme prosthetic group.	101
399973	pfam07362	CcdA	Post-segregation antitoxin CcdA. This family consists of several Enterobacterial post-segregation antitoxin CcdA proteins. The F plasmid-carried bacterial toxin, the CcdB protein, is known to act on DNA gyrase in two different ways. CcdB poisons the gyrase-DNA complex, blocking the passage of polymerases and leading to double-strand breakage of the DNA. Alternatively, in cells that overexpress CcdB, the A subunit of DNA gyrase (GyrA) has been found as an inactive complex with CcdB. Both poisoning and inactivation can be prevented and reversed in the presence of the F plasmid-encoded antidote, the CcdA protein.	71
284721	pfam07363	DUF1484	Protein of unknown function (DUF1484). This family consists of several hypothetical bacterial proteins of around 110 residues in length. Members of this family appear to be found exclusively in Ralstonia solanacearum. The function of this family is unknown.	109
399974	pfam07364	DUF1485	Metallopeptidase family M81. This is a family of proteobacterial metallo-peptidases.	287
191732	pfam07365	Toxin_8	Alpha conotoxin precursor. This family consists of several alpha conotoxin precursor proteins from a number of Conus species. The alpha-conotoxins are small peptide neurotoxins from the venom of fish-hunting cone snails which block nicotinic acetylcholine receptors (nAChRs).	50
399975	pfam07366	SnoaL	SnoaL-like polyketide cyclase. This family includes SnoaL, a polyketide cyclase involved in nogalamycin biosynthesis. This family was formerly known as DUF1486. The proteins in this family adopt a distorted alpha-beta barrel fold. Structural data together with site-directed mutagenesis experiments have shown that SnoaL has a different mechanism to that of the classical aldolase for catalyzing intramolecular aldol condensation.	126
399976	pfam07367	FB_lectin	Fungal fruit body lectin. This family consists of several fungal fruit body lectin proteins. Fruit body lectins are thought to have insecticidal activity and may also function in capturing nematodes.	139
254173	pfam07368	DUF1487	Protein of unknown function (DUF1487). This family consists of several uncharacterized proteins from Drosophila melanogaster. The function of this family is unknown.	215
399977	pfam07369	DUF1488	Protein of unknown function (DUF1488). This family consists of several hypothetical bacterial proteins of around 85 residues in length. The function of this family is unknown.	82
399978	pfam07370	DUF1489	Protein of unknown function (DUF1489). This family consists of several hypothetical bacterial proteins of around 150 residues in length. Members of this family seem to be found exclusively in the Class Alphaproteobacteria. The function of this family is unknown.	138
399979	pfam07371	DUF1490	Protein of unknown function (DUF1490). This family consists of several hypothetical bacterial proteins of around 90 residues in length. Members of the family seem to be found exclusively in Mycobacterium species. The function of this family is unknown.	88
399980	pfam07372	DUF1491	Protein of unknown function (DUF1491). This family consists of several bacterial proteins of around 115 residues in length. Members of this family seem to be found exclusively in the Class Alphaproteobacteria. The function of this family is unknown.	103
399981	pfam07373	CAMP_factor	CAMP factor (Cfa). This family consists of several bacterial CAMP factor (Cfa) proteins which seem to be specific to Streptococcus species. The CAMP reaction is a synergistic lysis of erythrocytes by the interaction of an extracellular protein (CAMP factor) produced by some streptococcal species with the Staphylococcus aureus sphingomyelinase C (beta-toxin).	220
311366	pfam07374	DUF1492	Protein of unknown function (DUF1492). This family consists of several hypothetical, highly conserved Streptococcal and related phage proteins of around 100 residues in length. The function of this family is unknown. It appears to be distantly related to pfam08281.	100
284730	pfam07376	Prosystemin	Prosystemin. This family consists of several plant specific prosystemin proteins. Prosystemin is the precursor protein of the 18 amino acid wound signal systemin which activates systemic defense in plant leaves against insect herbivores.	204
399982	pfam07377	DUF1493	Protein of unknown function (DUF1493). This family consists of several bacterial proteins of around 115 residues in length. Members of this family seem to be found exclusively in Salmonella and Yersinia species and several have been described as being putative cytoplasmic proteins. The function of this family is unknown.	111
399983	pfam07378	FlbT	Flagellar protein FlbT. This family consists of several FlbT proteins. FlbT is a post-transcriptional regulator of flagellin. FlbT is associated with the 5' untranslated region (UTR) of fljK (25 kDa flagellin) mRNA, and this association requires a predicted loop structure in the transcript. Mutations within this loop abolish FlbT association and result in increased mRNA stability. It is therefore thought that FlbT promotes the degradation of flagellin mRNA by associating with the 5' UTR.	125
284733	pfam07379	DUF1494	Protein of unknown function (DUF1494). This family consists of several bacterial proteins of around 175 residues in length. Members of this family seem to be found exclusively in Chlamydia species. The function of this family is unknown.	179
284734	pfam07380	Pneumo_M2	Pneumovirus M2 protein. This family consists of several Pneumovirus M2 proteins. The M2-1 protein of respiratory syncytial virus (RSV) is a transcription processivity factor that is essential for virus replication.	89
369338	pfam07381	DUF1495	Winged helix DNA-binding domain (DUF1495). This family consists of several hypothetical archaeal proteins of around 110 residues in length. The structure of this domain possesses a winged helix DNA-binding domain suggesting these proteins are bacterial transcription factors.	90
369339	pfam07382	HC2	Histone H1-like nucleoprotein HC2. This family contains the bacterial histone H1-like nucleoprotein HC2 (approximately 200 residues long), which seems to be found mostly in Chlamydia. HC2 functions in DNA condensation, although it has been suggested that it also has other roles.	187
399984	pfam07383	DUF1496	Protein of unknown function (DUF1496). This family consists of several bacterial proteins of around 90 residues in length. Members of this family seem to be found exclusively in the Orders Vibrionales and Enterobacteriales. The function of this family is unknown.	51
284738	pfam07384	DUF1497	Protein of unknown function (DUF1497). This family consists of several phage and bacterial proteins of around 59 residues in length. Members of this family seem to be found exclusively in Lactococcus lactis and the bacteriophages that infect this organism. The function of this family is unknown.	59
399985	pfam07385	Lyx_isomer	D-lyxose isomerase. Members of this family of sugar isomerases belong to the cupin superfamily. The enzyme from Cohnella laevoribosii has been shown to be specific for D-lyxose, L-ribose, and D-mannose. E. coli sugar isomerase (EcSI) has been structurally and functionally characterized and shows a preference for D-lyxose and D-mannose.	223
399986	pfam07386	DUF1499	Protein of unknown function (DUF1499). This family consists of several hypothetical bacterial and plant proteins of around 125 residues in length. The function of this family is unknown.	114
254182	pfam07387	Seadorna_VP7	Seadornavirus VP7. This family consists of several Seadornavirus specific VP7 proteins of around 305 residues in length. The function of this family is unknown. However, it appears to be distantly related to protein kinases.	308
399987	pfam07388	A-2_8-polyST	Alpha-2,8-polysialyltransferase (POLYST). This family contains the bacterial enzyme alpha-2,8-polysialyltransferase (EC:2.4.99.-) (approximately 500 residues long). This catalyzes the polycondensation of alpha-2,8-linked sialic acid required for the synthesis of polysialic acid (PSA).	322
284742	pfam07389	DUF1500	Protein of unknown function (DUF1500). This family consists of several Orthopoxvirus specific proteins of around 100 residues in length. The function of this family is unknown.	97
369342	pfam07390	P30	Mycoplasma P30 protein. This family consists of several P30 proteins which seem to be specific to Mycoplasma agalactiae. P30 is a 30-kDa immunodominant antigen and is known to be a transmembrane protein.	150
369343	pfam07391	NPR	NPR nonapeptide repeat (2 copies). This is a nine-residue repeat, named NPR after NonaPeptide Repeat. It is found in two malarial proteins and has the consensus EEhhEEhhP, where h stands for a hydrophobic amino acid.	17
369344	pfam07392	P19Arf_N	Cyclin-dependent kinase inhibitor 2a p19Arf N-terminus. This family represents the N-terminus (approximately 50 residues) of cyclin-dependent kinase inhibitor 2a p19Arf, which seems to be restricted to mammals. This is a tumor-suppressor protein that has been shown to inhibit the growth of human tumor cells lacking functional p53 by inducing a transient G2 arrest and subsequently apoptosis.	51
399988	pfam07393	Sec10	Exocyst complex component Sec10. This family contains the Sec10 component (approximately 650 residues long) of the eukaryotic exocyst complex, which specifically affects the synthesis and delivery of secretory and basolateral plasma membrane proteins.	704
399989	pfam07394	DUF1501	Protein of unknown function (DUF1501). This family contains a number of hypothetical bacterial proteins of unknown function approximately 400 residues long.	392
284747	pfam07395	Mig-14	Mig-14. This family contains a number of bacterial mig-14 proteins (approximately 270 residues long). In Salmonella, mig-14 contributes to resistance to antimicrobial peptides, although the mechanism is not fully understood.	264
399990	pfam07396	Porin_O_P	Phosphate-selective porin O and P. This family represents a conserved region approximately 400 residues long within the bacterial phosphate-selective porins O and P. These are anion-specific porins, the binding site of which has a higher affinity for phosphate than chloride ions. Porin O has a higher affinity for polyphosphates, while porin P has a higher affinity for orthophosphate. In P. aeruginosa, porin O was found to be expressed only under phosphate-starvation conditions during the stationary growth phase.	358
116019	pfam07397	DUF1502	Repeat of unknown function (DUF1502). This family consists of a number of repeats of around 34 residues in length. Members of this family seem to be found exclusively in three hypothetical Murid herpesvirus 4 proteins. The function of this family is unknown.	34
399991	pfam07398	MDMPI_C	MDMPI C-terminal domain. This domain is found at the C-terminus of the mycothiol maleylpyruvate isomerase enzyme (MDMPI). The structure of this protein has been solved. This domain appears weakly similar to pfam08608.	88
399992	pfam07399	Na_H_antiport_3	Putative Na+/H+ antiporter. This family consists of several hypothetical bacterial proteins of around 440 residues in length. The function of this family is unknown. Many members carry 11 or 12 transmembrane regions, suggesting that they might be transporters. One family member, UniProtKB:Q821X2 is classified by TCDB as being an NhaE type of Na+/H+ antiporter.	418
399993	pfam07400	IL11	Interleukin 11. This family contains interleukin 11 (approximately 200 residues long). This is a secreted protein that stimulates megakaryocytopoiesis, resulting in increased production of platelets, as well as activating osteoclasts, inhibiting epithelial cell proliferation and apoptosis, and inhibiting macrophage mediator production. These functions may be particularly important in mediating the hematopoietic, osseous and mucosal protective effects of interleukin 11. Family members seem to be restricted to mammals.	167
116023	pfam07401	Lenti_VIF_2	Bovine Lentivirus VIF protein. This family consists of several Lentivirus viral infectivity factor (VIF) proteins. VIF is known to be essential for the ability of cell-free virus preparations to infect cells. Members of this family are specific to Bovine immunodeficiency virus (BIV) and Jembrana disease virus, which also infects cattle.	198
284752	pfam07402	Herpes_U26	Human herpesvirus U26 protein. This family consists of several Human herpesvirus U26 proteins of around 300 residues in length. The function of this family is unknown.	293
369351	pfam07403	DUF1505	Protein of unknown function (DUF1505). This family consists of several uncharacterized Caenorhabditis elegans proteins of around 115 residues in length. Members of this family contain 6 highly conserved cysteine residues. The function of this family is unknown.	114
254188	pfam07404	TEBP_beta	Telomere-binding protein beta subunit (TEBP beta). This family consists of several telomere-binding protein beta subunits which appear to be specific to the family Oxytrichidae. Telomeres are specialized protein-DNA complexes that compose the ends of eukaryotic chromosomes. Telomeres protect chromosome termini from degradation and recombination and act together with telomerase to ensure complete genome replication. TEBP beta forms a complex with TEBP alpha and this complex is able to recognize and bind ssDNA to form a sequence-specific, telomeric nucleoprotein complex that caps the very 3' ends of chromosomes.	375
284754	pfam07405	DUF1506	Protein of unknown function (DUF1506). This family consists of several bacterial proteins of around 130 residues in length. Members of this family seem to be specific to Borrelia burgdorferi (Lyme disease spirochete). The function of this family is unknown.	127
399994	pfam07406	NICE-3	NICE-3 protein. This family consists of several eukaryotic NICE-3 and related proteins. The gene coding for NICE-3 is part of the epidermal differentiation complex (EDC) which comprises a large number of genes that are of crucial importance for the maturation of the human epidermis. The function of NICE-3 is unknown.	181
284756	pfam07407	Seadorna_VP6	Seadornavirus VP6 protein. This family consists of several VP6 proteins from the Banna virus as well as a related protein VP5 from the Kadipiro virus. Members of this family are typically of around 420 residues in length. The function of this family is unknown.	420
399995	pfam07408	DUF1507	Protein of unknown function (DUF1507). This family consists of several hypothetical bacterial proteins of around 90 residues in length. The function of this family is unknown.	84
399996	pfam07409	GP46	Phage protein GP46. This family contains GP46 phage proteins (approximately 120 residues long).	115
399997	pfam07410	Phage_Gp111	Streptococcus thermophilus bacteriophage Gp111 protein. This family consists of several Streptococcus thermophilus bacteriophage Gp111 proteins of around 110 residues in length. The function of this family is unknown.	118
399998	pfam07411	DUF1508	Domain of unknown function (DUF1508). This family represents a series of bacterial domains of unknown function of around 50 residues in length. Members of this family are often found as tandem repeats and in some cases represent the whole protein. All member proteins are described as being hypothetical.	48
399999	pfam07412	Geminin	Geminin. This family contains the eukaryotic protein geminin (approximately 200 residues long). Geminin inhibits DNA replication by preventing the incorporation of MCM complex into prereplication complex, and is degraded during the mitotic phase of the cell cycle. It has been proposed that geminin inhibits DNA replication during S, G2, and M phases and that geminin destruction at the metaphase-anaphase transition permits replication in the succeeding cell cycle.	194
284761	pfam07413	Herpes_UL37_2	Betaherpesvirus immediate-early glycoprotein UL37. This family consists of several Betaherpesvirus immediate-early glycoprotein UL37 sequences. The human cytomegalovirus (HCMV) UL37 immediate-early regulatory protein is a type I integral membrane N-glycoprotein which traffics through the ER and the Golgi network.	334
284762	pfam07415	Herpes_LMP2	Gammaherpesvirus latent membrane protein (LMP2) protein. This family consists of several Gammaherpesvirus latent membrane protein (LMP2) proteins. Epstein-Barr virus is a human Gammaherpesvirus that infects and establishes latency in B lymphocytes in vivo. The latent membrane protein 2 (LMP2) gene is expressed in latently infected B cells and encodes two protein isoforms, LMP2A and LMP2B, that are identical except for an additional N-terminal 119 aa cytoplasmic domain which is present in the LMP2A isoform. LMP2A is thought to play a key role in either the establishment or the maintenance of latency and/or the reactivation of productive infection from the latent state. The significance of LMP2B and its role in pathogenesis remain unclear.	497
284763	pfam07416	Crinivirus_P26	Crinivirus P26 protein. This family consists of several Crinivirus P26 proteins which seem to be found exclusively in the Lettuce infectious yellows virus. The function of this family is unknown.	227
400000	pfam07417	Crl	Transcriptional regulator Crl. This family contains the bacterial transcriptional regulator Crl (approximately 130 residues long). This is a transcriptional regulator of the csgA curlin subunit gene for curli fibers that are found on the surface of certain bacteria.	128
400001	pfam07418	PCEMA1	Acidic phosphoprotein precursor PCEMA1. This family consists of several acidic phosphoprotein precursor PCEMA1 sequences which appear to be found exclusively in Plasmodium chabaudi. PCEMA1 is an antigen that is associated with the membrane of the infected erythrocyte throughout the entire intraerythrocytic cycle. The exact function of this family is unclear.	294
369356	pfam07419	PilM	PilM. This family contains the bacterial protein PilM (approximately 150 residues long). PilM is an inner membrane protein that has been predicted to function as a component of the pilin transport apparatus and thin-pilus basal body.	135
284767	pfam07420	DUF1509	Protein of unknown function (DUF1509). This family consists of several uncharacterized viral proteins from the Marek's disease-like viruses. Members of this family are typically around 400 residues in length. The function of this family is unknown.	384
400002	pfam07421	Pro-NT_NN	Neurotensin/neuromedin N precursor. This family contains the precursor of bacterial neurotensin/neuromedin N (approximately 170 residues long). This is the common precursor of two biologically active related peptides, neurotensin and neuromedin N. It undergoes tissue-specific processing leading to the formation in some tissues and cancer cell lines of large peptides ending with the neurotensin or neuromedin N sequence.	161
400003	pfam07422	s48_45	Sexual stage antigen s48/45 domain. This family contains sexual stage s48/45 antigens from Plasmodium (approximately 450 residues long). These are surface proteins expressed by Plasmodium male and female gametes that have been shown to play a conserved and important role in fertilisation. This domain contains 6 conserved cysteines suggesting 3 disulphide bridges.	110
400004	pfam07423	DUF1510	Protein of unknown function (DUF1510). This family consists of several hypothetical bacterial proteins of around 200 residues in length. The function of this family is unknown.	93
369359	pfam07424	TrbM	TrbM. This family contains the bacterial protein TrbM (approximately 180 residues long). In Comamonas testosteroni T-2, TrbM is derived from the IncP1beta plasmid pTSA, which encodes the widespread genes for p-toluenesulfonate (TSA) degradation.	156
116046	pfam07425	Pardaxin	Pardaxin. This family consists of several Pardaxin proteins. Pardaxin, a 33-amino-acid pore-forming polypeptide toxin isolated from the Red Sea Moses sole Pardachirus marmoratus, has a helix-hinge-helix structure. This is a common structural motif found both in antibacterial peptides that can act selectively on bacterial membranes (e.g., cecropin), and in cytotoxic peptides that can lyse both mammalian and bacterial cells (e.g., melittin). Pardaxin possesses a high antibacterial activity with a significantly reduced haemolytic activity towards human red blood cells compared with melittin. Pardaxin has also been found to have a shark repellent action.	33
400005	pfam07426	Dynactin_p22	Dynactin subunit p22. This family contains p22, the smallest subunit of dynactin, a complex that binds to cytoplasmic dynein and is a required activator for cytoplasmic dynein-mediated vesicular transport. Dynactin localizes to the cleavage furrow and to the midbodies of dividing cells, suggesting that it may function in cytokinesis. Family members are approximately 170 residues long.	164
400006	pfam07428	Tri3	15-O-acetyltransferase Tri3. This family represents a conserved region approximately 400 residues long within 15-O-acetyltransferase (Tri3), which seems to be restricted to ascomycete fungi. In Fusarium sporotrichioides, this is required for acetylation of the C-15 hydroxyl group of trichothecenes in the biosynthesis of T-2 toxin.	416
400007	pfam07429	Glyco_transf_56	4-alpha-L-fucosyltransferase glycosyl transferase group 56. This family contains the bacterial enzyme 4-alpha-L-fucosyltransferase (Fuc4NAc transferase) (EC 2.4.1.-) (approximately 360 residues long). This catalyzes the synthesis of Fuc4NAc-ManNAcA-GlcNAc-PP-Und (lipid III) as part of the biosynthetic pathway of enterobacterial common antigen (ECA), a polysaccharide comprised of the trisaccharide repeat unit Fuc4NAc-ManNAcA-GlcNAc.	358
311397	pfam07430	PP1	Phloem filament protein PP1 cystatin-like domain. This domain represents a conserved region related to cystatins, eight copies of which are found within the plant phloem filament protein PP1. This is one of the constituents of the proteinaceous filaments found in the sieve elements of Cucurbita phloem.	78
400008	pfam07431	DUF1512	Protein of unknown function (DUF1512). This family consists of several archaeal proteins of around 370 residues in length. The function of this family is unknown.	355
311399	pfam07432	Hc1	Histone H1-like protein Hc1. This family consists of several bacterial histone H1-like Hc1 proteins. In Chlamydia, Hc1 is expressed in the late stages of the life cycle, concomitant with the reorganisation of chlamydial reticulate bodies into elementary bodies. This suggests that Hc1 protein plays a role in the condensation of chromatin during intracellular differentiation.	124
400009	pfam07433	DUF1513	Protein of unknown function (DUF1513). This family consists of several bacterial proteins of around 360 residues in length. The function of this family is unknown.	304
400010	pfam07434	CblD	CblD like pilus biogenesis initiator. This family consists of several minor pilin proteins including CblD from Burkholderia cepacia, which is known to be the initiator of pilus biogenesis. The family also contains a variety of Enterobacterial minor pilin proteins.	380
400011	pfam07435	YycH	YycH protein. This family contains the bacterial protein YycH which is approximately 450 residues long. YycH plays a role in signal transduction and is found immediately downstream of the essential histidine kinase YycG. YycG forms a two component system together with its cognate response regulator YycF. PhoA fusion studies have shown that YycH is transported across the cytoplasmic membrane. It is postulated that YycH functions as an antagonist to YycG. The molecule is made up of three domains, and has a novel three-dimensional structure. The N-terminal domain features a calcium binding site and the central domain contains two conserved loop regions.	407
369364	pfam07436	Curto_V3	Curtovirus V3 protein. This family consists of several Curtovirus V3 proteins of around 90 residues in length. The function of this family is unknown.	87
284781	pfam07437	YfaZ	YfaZ precursor. This family contains the precursor of the bacterial protein YfaZ (approximately 180 residues long). Many members of this family are hypothetical proteins.	180
369365	pfam07438	DUF1514	Protein of unknown function (DUF1514). This family consists of several Staphylococcus aureus and related bacteriophage proteins of around 65 residues in length. The function of this family is unknown. Structural modelling suggests this domain may bind nucleic acids.	62
400012	pfam07439	DUF1515	Protein of unknown function (DUF1515). This family consists of several hypothetical bacterial proteins of around 130 residues in length. Members of this family seem to be found exclusively in Rhizobium species. The function of this family is unknown.	113
116061	pfam07440	Caerin_1	Caerin 1 protein. This family consists of several caerin 1 proteins from Litoria species. The caerin 1 peptides are among the most powerful of the broad-spectrum antibiotic amphibian peptides.	24
400013	pfam07441	BofA	SigmaK-factor processing regulatory protein BofA. This family contains the sigmaK-factor processing regulatory protein BofA (Bypass-of-forespore protein A) (approximately 80 residues long). During sporulation in Bacillus subtilis, transcription is controlled in the developing sporangium by a cascade of sporulation-specific transcription factors (sigma factors). Following engulfment, processing of sigmaK is inhibited by BofA. It has been suggested that this effect is exerted by alteration of the level of the SpoIVFA protein.	73
116063	pfam07442	Ponericin	Ponericin. This family contains a number of ponericin peptides (approximately 30 residues long) from the venom of the predatory ant Pachycondyla goeldii. These peptides exhibit antibacterial and insecticidal properties, and may adopt an amphipathic alpha-helical structure in polar environments such as cell membranes.	29
400014	pfam07443	HARP	HepA-related protein (HARP). This family represents a conserved region approximately 60 residues long within eukaryotic HepA-related protein (HARP). This exhibits single-stranded DNA-dependent ATPase activity, and is ubiquitously expressed in human and mouse tissues. Family members may contain more than one copy of this region.	55
400015	pfam07444	Ycf66_N	Ycf66 protein N-terminus. This family represents the N-terminus (approximately 80 residues) of Ycf66, a protein that seems to be restricted to eukaryotes that contain chloroplasts and to cyanobacteria.	76
400016	pfam07445	PriC	Primosomal replication protein priC. This family contains the bacterial primosomal replication protein priC. In Escherichia coli, this functions in the assembly of the primosome.	173
400017	pfam07447	VP40	Matrix protein VP40. This family contains viral VP40 matrix proteins that seem to be restricted to the Filoviridae. These play an important role in the assembly process of virus particles by interacting with cellular factors, cellular membranes, and the ribonucleoprotein particle complex. It has been shown that the N-terminal region of VP40 folds into a mixture of hexameric and octameric states - these may have distinct roles.	292
311407	pfam07448	Spp-24	Secreted phosphoprotein 24 (Spp-24) cystatin-like domain. This family represents a conserved region approximately 60 residues long within secreted phosphoprotein 24 (Spp-24), which seems to be restricted to vertebrates. This is a non-collagenous protein found in bone that is related in sequence to the cystatin family of thiol protease inhibitors. This suggests that Spp-24 could function to modulate the thiol protease activities known to be involved in bone turnover. It is also possible that the intact form of Spp-24 found in bone could be a precursor to a biologically active peptide that coordinates an aspect of bone turnover.	64
400018	pfam07449	HyaE	Hydrogenase-1 expression protein HyaE. This family contains bacterial hydrogenase-1 expression proteins approximately 120 residues long. This includes the E. coli protein HyaE, and the homologous proteins HoxO of R. eutropha and HupG of R. leguminosarum. Deletion of the hoxO gene in R. eutropha led to complete loss of the uptake [NiFe] hydrogenase activity, suggesting that it has a critical role in hydrogenase assembly.	108
400019	pfam07450	HycH	Formate hydrogenlyase maturation protein HycH. This family contains the bacterial formate hydrogenlyase maturation protein HycH, which is approximately 140 residues long. This may be required for the conversion of a precursor form of the large subunit of hydrogenlyase 3 into a mature form.	129
400020	pfam07451	SpoVAD	Stage V sporulation protein AD (SpoVAD). This family contains the bacterial stage V sporulation protein AD (SpoVAD), which is approximately 340 residues long. This is one of six proteins encoded by the spoVA operon, which is transcribed exclusively in the forespore at about the time of dipicolinic acid (DPA) synthesis in the mother cell. The functions of the proteins encoded by the spoVA operon are unknown, but it has been suggested they are involved in DPA transport during sporulation.	329
400021	pfam07452	CHRD	CHRD domain. CHRD (after SWISS-PROT abbreviation for chordin) is a novel domain identified in chordin, an inhibitor of bone morphogenetic proteins. This family includes bacterial homologs. It is anticipated to have an immunoglobulin-like beta-barrel structure based on limited similarity to superoxide dismutases but, as yet, no clear functional prediction can be made. Its most conserved feature is a GE[I/L]RCG[V/I/L] motif towards its C-terminal end. Most bacterial proteins in this family have only one CHRD domain, whereas it is found repeated in many eukaryotic proteins such as human chordin and Drosophila SOG.	115
284793	pfam07453	NUMOD1	NUMOD1 domain. This domain probably represents a DNA-binding helix-turn-helix based on its similarity to other families (Bateman A pers obs).	37
400022	pfam07454	SpoIIP	Stage II sporulation protein P (SpoIIP). This family contains the bacterial stage II sporulation protein P (SpoIIP) (approximately 350 residues long). It has been shown that a block in polar cytokinesis in Bacillus subtilis is mediated partly by transcription of spoIID, spoIIM and spoIIP. This inhibition of polar division is involved in the locking in of asymmetry after the formation of a polar septum during sporulation. Engulfment in Bacillus subtilis is mediated by two complementary systems: the first includes the proteins SpoIID, SpoIIM and SpoIIP (DMP) which carry out the engulfment, and the second includes the SpoIIQ-SpoIIIAGH (Q-AH) zipper, that recruits other proteins to the septum in a second-phase of the engulfment. The course of events follows as the incorporation firstly of SpoIIB into the septum during division to serve directly or indirectly as a landmark for localising SpoIIM and then SpoIIP and SpoIID to the septum. SpoIIP and SpoIID interact together to form part of the DMP complex. SpoIIP itself has been identified as an autolysin with peptidoglycan hydrolase activity.	263
311413	pfam07455	Psu	Phage polarity suppression protein (Psu). This family contains a number of phage polarity suppression proteins (Psu) (approximately 190 residues long). The Psu protein of bacteriophage P4 causes suppression of transcriptional polarity in Escherichia coli by overcoming Rho termination factor activity.	174
400023	pfam07456	Hpre_diP_synt_I	Heptaprenyl diphosphate synthase component I. This family contains component I of bacterial heptaprenyl diphosphate synthase (EC:2.5.1.30) (approximately 170 residues long). This is one of the two dissociable subunits that form the enzyme, both of which are required for the catalysis of the biosynthesis of the side chain of menaquinone-7.	147
400024	pfam07457	DUF1516	Protein of unknown function (DUF1516). This family contains a number of hypothetical bacterial proteins of unknown function approximately 120 residues long.	107
400025	pfam07458	SPAN-X	Sperm protein associated with nucleus, mapped to X chromosome. This family contains human sperm proteins associated with the nucleus and mapped to the X chromosome (SPAN-X) (approximately 100 residues long). SPAN-X proteins are cancer-testis antigens (CTAs), and thus represent potential targets for cancer immunotherapy because they are widely distributed in tumors but not in normal tissues, except testes. They are highly insoluble, acidic, and polymorphic.	94
400026	pfam07459	CTX_RstB	CTX phage RstB protein. This family contains a number of RstB proteins approximately 120 residues long, including RstB1 and RstB2, from the Vibrio cholerae phage CTX. Functional analyses indicate that rstB2 is required for integration of the CTXphi phage into the V. cholerae chromosome.	90
400027	pfam07460	NUMOD3	NUMOD3 motif (2 copies). NUMOD3 is a DNA-binding motif found in homing endonucleases and related proteins. It occurs on its own or in tandem repeats in GIY-YIG (pfam01541) and HTH proteins. It constitutes a beta-turn-loop-helix subregion of the DNA-binding domain of I-TevI homing endonuclease.	37
284801	pfam07461	NADase_NGA	Nicotinamide adenine dinucleotide glycohydrolase (NADase). This family consists of several bacterial nicotinamide adenine dinucleotide glycohydrolase (NGA) proteins which appear to be specific to Streptococcus pyogenes. NAD glycohydrolase (NADase) is a potential virulence factor. Streptococcal NADase may contribute to virulence by its ability to cleave beta-NAD at the ribose-nicotinamide bond, depleting intracellular NAD pools and producing the potent vasoactive compound nicotinamide.	446
400028	pfam07462	MSP1_C	Merozoite surface protein 1 (MSP1) C-terminus. This family represents the C-terminal region of merozoite surface protein 1 (MSP1), which is found in a number of Plasmodium species. MSP-1 is a 200-kDa protein expressed on the surface of the P. vivax merozoite. MSP-1 of Plasmodium species is synthesized as a high-molecular-weight precursor and then processed into several fragments. At the time of red cell invasion by the merozoite, only the 19-kDa C-terminal fragment (MSP-119), which contains two epidermal growth factor-like domains, remains on the surface. Antibodies against MSP-119 inhibit merozoite entry into red cells, and immunisation with MSP-119 protects monkeys from challenging infections. Hence, MSP-119 is considered a promising vaccine candidate.	553
400029	pfam07463	NUMOD4	NUMOD4 motif. NUMOD4 is a putative DNA-binding motif found in homing endonucleases and related proteins.	49
369376	pfam07464	ApoLp-III	Apolipophorin-III precursor (apoLp-III). This family consists of several insect apolipoprotein-III sequences. Exchangeable apolipoproteins constitute a functionally important family of proteins that play critical roles in lipid transport and lipoprotein metabolism. Apolipophorin III (apoLp-III) is a prototypical exchangeable apolipoprotein found in many insect species that functions in transport of diacylglycerol (DAG) from the fat body lipid storage depot to flight muscles in the adult life stage.	143
400030	pfam07465	PsaM	Photosystem I protein M (PsaM). This family consists of several plant and cyanobacterial photosystem I protein M (PsaM) sequences. PsaM forms part of the photosystem I complex and its binding is stabilized by PsaI.	29
400031	pfam07466	DUF1517	Protein of unknown function (DUF1517). This family consists of several hypothetical glycine rich plant and bacterial proteins of around 300 residues in length. The function of this family is unknown.	183
400032	pfam07467	BLIP	Beta-lactamase inhibitor (BLIP). The structure of BLIP reveals two structural domains, which form a polar, concave surface that docks onto a predominantly polar, convex protrusion on beta-lactamase. The ability of BLIP to adapt to a variety of class A beta-lactamases is thought to be due to flexibility between these two domains.	123
116089	pfam07468	Agglutinin	Agglutinin domain. 	141
400033	pfam07469	DUF1518	Domain of unknown function (DUF1518). This domain, which is usually found tandemly repeated, is found in various receptor co-activating proteins.	58
400034	pfam07470	Glyco_hydro_88	Glycosyl Hydrolase Family 88. Unsaturated glucuronyl hydrolase catalyzes the hydrolytic release of unsaturated glucuronic acids from oligosaccharides (EC:3.2.1.-) produced by the reactions of polysaccharide lyases.	343
369382	pfam07471	Phage_Nu1	Phage DNA packaging protein Nu1. Terminase, the DNA packaging enzyme of bacteriophage lambda, is a heteromultimer composed of subunits Nu1 and A. The smaller Nu1 terminase subunit has a low-affinity ATPase stimulated by non-specific DNA.	164
400035	pfam07472	PA-IIL	Fucose-binding lectin II (PA-IIL). In Pseudomonas aeruginosa the fucose-binding lectin II (PA-IIL) contributes to the pathogenic virulence of the bacterium. PA-IIL functions as a tetramer when binding fucose. Each monomer is comprised of a nine-stranded, antiparallel beta-sandwich arrangement and contains two calcium cations that mediate the binding of fucose in a recognition mode unique among carbohydrate-protein interactions.	107
400036	pfam07473	Toxin_11	Spasmodic peptide gm9a; conotoxin from Conus species. This family consists of several spasmodic peptide gm9a sequences. Conotoxin gm9a is a putative 27-residue polypeptide encoded by Conus gloriamaris and is known to be a homolog of the 'spasmodic peptide', tx9a, isolated from the venom of the mollusc-hunting cone shell Conus textile. Upon injection of this venom component, normal mice are converted into behavioural phenocopies of a well-known mutant, the spasmodic mouse.	28
400037	pfam07474	G2F	G2F domain. Nidogen, an invariant component of basement membranes, is a multifunctional protein that interacts with most other major basement membrane proteins. The G2 fragment (or G2F domain) contains binding sites for collagen IV and perlecan. The structure is composed of an 11-stranded beta-barrel with a central helix. This domain is structurally related to that of green fluorescent protein pfam01353. A large surface patch on the beta-barrel is conserved in all metazoan nidogens.	184
400038	pfam07475	Hpr_kinase_C	HPr Serine kinase C-terminal domain. This family represents the C terminal kinase domain of Hpr Serine/threonine kinase PtsK. This kinase is the sensor in a multicomponent phosphorelay system in control of carbon catabolic repression in bacteria. This kinase is unusual in that it recognizes the tertiary structure of its target and is a member of a novel family unrelated to any previously described protein phosphorylating enzymes. X-ray analysis of the full-length crystalline enzyme from Staphylococcus xylosus at a resolution of 1.95 A shows the enzyme to consist of two clearly separated domains that are assembled in a hexameric structure resembling a three-bladed propeller.	171
284814	pfam07476	MAAL_C	Methylaspartate ammonia-lyase C-terminus. Methylaspartate ammonia-lyase EC:4.3.1.2 catalyzes the second step of fermentation of glutamate. It is a homodimer. This family represents the C-terminal region of Methylaspartate ammonia-lyase and contains a TIM barrel fold similar to that of pfam01188. This family represents the catalytic domain and contains a metal binding site.	247
400039	pfam07477	Glyco_hydro_67C	Glycosyl hydrolase family 67 C-terminus. Alpha-glucuronidases, components of an ensemble of enzymes central to the recycling of photosynthetic biomass, remove the alpha-1,2 linked 4-O-methyl glucuronic acid from xylans. This family represents the C terminal region of alpha-glucuronidase which is mainly alpha-helical. It wraps around the catalytic domain (pfam07488), making additional interactions both with the N-terminal domain (pfam03648) of its parent monomer and also forming the majority of the dimer-surface with the equivalent C-terminal domain of the other monomer of the dimer.	223
400040	pfam07478	Dala_Dala_lig_C	D-ala D-ala ligase C-terminus. This family represents the C-terminal, catalytic domain of the D-alanine--D-alanine ligase enzyme EC:6.3.2.4. D-Alanine is one of the central molecules of the cross-linking step of peptidoglycan assembly. There are three enzymes involved in the D-alanine branch of peptidoglycan biosynthesis: the pyridoxal phosphate-dependent D-alanine racemase (Alr), the ATP-dependent D-alanine:D-alanine ligase (Ddl), and the ATP-dependent D-alanine:D-alanine-adding enzyme (MurF).	205
400041	pfam07479	NAD_Gly3P_dh_C	NAD-dependent glycerol-3-phosphate dehydrogenase C-terminus. NAD-dependent glycerol-3-phosphate dehydrogenase (GPDH) catalyzes the interconversion of dihydroxyacetone phosphate and L-glycerol-3-phosphate. This family represents the C-terminal substrate-binding domain.	141
369385	pfam07481	DUF1521	Domain of Unknown Function (DUF1521). This family of unknown function is found in a limited set of Bradyrhizobium proteins. There appears to be a periodic -DG- motif in it.	169
284819	pfam07482	DUF1522	Domain of Unknown Function (DUF1522). 	110
400042	pfam07483	W_rich_C	Tryptophan-rich Synechocystis species C-terminal domain. This domain is found at the C-terminus, normally between 2-3 copies, of a range of Synechocystis membrane proteins. This domain is fairly tryptophan rich as well.	105
400043	pfam07484	Collar	Phage Tail Collar Domain. This region is occasionally found in conjunction with pfam03335. Most of the family appear to be phage tail proteins; however some appear to be involved in other processes. For instance a member from Rhizobium leguminosarum may be involved in plant-microbe interactions. A related protein MrpB is involved in the pathogenicity of Microcystis aeruginosa. The finding of this family in a structural component of the phage tail fibre baseplate suggests that its function is structural rather than enzymatic. Structural studies show this region consists of a helix and a loop and three beta-strands. This alignment does not catch the third strand as it is separated from the rest of the structure by around 100 residues. This strand is conserved in homologs but the intervening sequence is not. Much of the function of phage T4 appears to reside in this intervening region. In the tertiary structure of the phage baseplate this domain forms part of the 'collar'. The domain may bind SO4, however the residues accredited with this vary between the PDB file and the Swiss-Prot entry. The long unconserved region may be due to domain swapping in and out of a loop or reflective of rapid evolution.	57
400044	pfam07485	DUF1529	Domain of Unknown Function (DUF1529). This family is the lppY/lpqO homolog family.	118
400045	pfam07486	Hydrolase_2	Cell Wall Hydrolase. These enzymes have been implicated in cell wall hydrolysis, most extensively in Bacillus subtilis. For instance Bacillus subtilis steB is expressed during sporulation as an inactive form and then deposited on the cell outer cortex. During germination the enzyme is activated and hydrolyzes the cortex. A similar role is carried out by the partially redundant Bacillus subtilis CwlJ. It is not clear whether these enzymes are amidases or peptidases.	101
400046	pfam07487	SopE_GEF	SopE GEF domain. This family represents the C-terminal guanine nucleotide exchange factor (GEF) domain of SopE. Salmonella typhimurium employs a type III secretion system to inject bacterial toxins into the host cell cytosol. These toxins transiently activate Rho family GTP-binding protein-dependent signaling cascades to induce cytoskeletal rearrangements. SopE, can activate Cdc42, an essential component of the host cellular signaling cascade, in a Dbl-like fashion despite its lack of sequence similarity to Dbl-like proteins, the Rho-specific eukaryotic guanine nucleotide exchange factors.	136
400047	pfam07488	Glyco_hydro_67M	Glycosyl hydrolase family 67 middle domain. Alpha-glucuronidases, components of an ensemble of enzymes central to the recycling of photosynthetic biomass, remove the alpha-1,2 linked 4-O-methyl glucuronic acid from xylans. This family represents the central catalytic domain of alpha-glucuronidase.	324
369388	pfam07489	Tir_receptor_C	Translocated intimin receptor (Tir) C-terminus. Intimin and its translocated intimin receptor (Tir) are bacterial proteins that mediate adhesion between mammalian cells and attaching and effacing (A/E) pathogens. A unique and essential feature of A/E bacterial pathogens is the formation of actin-rich pedestals beneath the intimately adherent bacteria and localized destruction of the intestinal brush border. The bacterial outer membrane adhesin, intimin, is necessary for the production of the A/E lesion and diarrhoea. The A/E bacteria translocate their own receptor for intimin, Tir, into the membrane of mammalian cells using the type III secretion system. The translocated Tir triggers additional host signalling events and actin nucleation, which are essential for lesion formation. This family represents the Tir C-terminal domain which has been reported to bind uninfected host cells and beta-1 integrins although the role of intimin binding to integrins is unclear. This intimin C-terminal domain has also been shown to be sufficient for Tir recognition.	222
254231	pfam07490	Tir_receptor_N	Translocated intimin receptor (Tir) N-terminus. Intimin and its translocated intimin receptor (Tir) are bacterial proteins that mediate adhesion between mammalian cells and attaching and effacing (A/E) pathogens. A unique and essential feature of A/E bacterial pathogens is the formation of actin-rich pedestals beneath the intimately adherent bacteria and localized destruction of the intestinal brush border. The bacterial outer membrane adhesin, intimin, is necessary for the production of the A/E lesion and diarrhoea. The A/E bacteria translocate their own receptor for intimin, Tir, into the membrane of mammalian cells using the type III secretion system. The translocated Tir triggers additional host signalling events and actin nucleation, which are essential for lesion formation. This family represents the Tir N-terminal domain which is involved in Tir stability and Tir secretion.	269
400048	pfam07491	PPI_Ypi1	Protein phosphatase inhibitor. These proteins include Ypi1, a novel Saccharomyces cerevisiae type 1 protein phosphatase inhibitor and ppp1r11/hcgv, annotated as having protein phosphatase inhibitor activity.	57
400049	pfam07492	Trehalase_Ca-bi	Neutral trehalase Ca2+ binding domain. Neutral trehalases mobilise trehalose accumulated by fungal cells as a protective and storage carbohydrate. This family represents a calcium-binding domain similar to EF hand. Residues 97 and 108 in S. pombe ntp1 have been implicated in this interaction. It is thought that this domain may provide a general mechanism for regulating neutral trehalase activity in yeasts and filamentous fungi.	30
400050	pfam07494	Reg_prop	Two component regulator propeller. A large group of two component regulator proteins appear to have the same N-terminal structure of 14 tandem repeats. These repeats show homology to pfam01011 and pfam00400 indicating that they are likely to form a beta-propeller. This family has been built with artificially high cut-offs in order to avoid overlaps with other beta-propeller families. The fourteen repeats are likely to form two propellers; it is not clear if these structures are likely to recruit other proteins or interact with DNA.	24
400051	pfam07495	Y_Y_Y	Y_Y_Y domain. This domain is mostly found at the end of the beta propellers (pfam07494) in a family of two component regulators. However they are also found tandemly repeated in CTC_02402 without other signal conduction domains being present. It's named after the conserved tyrosines found in the alignment. The exact function is not known.	65
400052	pfam07496	zf-CW	CW-type Zinc Finger. This domain appears to be a zinc finger. The alignment shows four conserved cysteine residues and a conserved tryptophan. It was first identified by, and is predicted to be a "highly specialized mononuclear four-cysteine zinc finger...that plays a role in DNA binding and/or promoting protein-protein interactions in complicated eukaryotic processes including...chromatin methylation status and early embryonic development." Weak homology to pfam00628 further evidences these predictions (personal obs: C Yeats). Twelve different CW-domain-containing protein subfamilies are described, with different subfamilies being characteristic of vertebrates, higher plants and other animals in which this domain is found.	46
400053	pfam07497	Rho_RNA_bind	Rho termination factor, RNA-binding domain. The Rho termination factor disengages newly transcribed RNA from its DNA template at certain, specific transcripts. It is thought that two copies of Rho bind to RNA and that Rho functions as a hexamer of protomers.	72
400054	pfam07498	Rho_N	Rho termination factor, N-terminal domain. The Rho termination factor disengages newly transcribed RNA from its DNA template at certain, specific transcripts. It is thought that two copies of Rho bind to RNA and that Rho functions as a hexamer of protomers. This domain is found to the N-terminus of the RNA binding domain (pfam07497).	43
400055	pfam07499	RuvA_C	RuvA, C-terminal domain. Homologous recombination is a crucial process in all living organisms. In bacteria, the RuvA, RuvB, and RuvC proteins are involved in this process. More specifically the proteins process the Holliday junction DNA. RuvA is comprised of three distinct domains. This domain represents the C-terminal domain and plays a significant role in the ATP-dependent branch migration of the hetero-duplex through direct contact with RuvB. Within the Holliday junction, the C-terminal domain makes no interaction with DNA.	47
400056	pfam07500	TFIIS_M	Transcription factor S-II (TFIIS), central domain. Transcription elongation by RNA polymerase II is regulated by the general elongation factor TFIIS. This factor stimulates RNA polymerase II to transcribe through regions of DNA that promote the formation of stalled ternary complexes. TFIIS is composed of three structural domains, termed I, II, and III. The two C-terminal domains (II and III), this domain and pfam01096 are required for transcription activity.	110
400057	pfam07501	G5	G5 domain. This domain is found in a wide range of extracellular proteins. It is found tandemly repeated in up to 8 copies. It is found in the N-terminus of peptidases belonging to the M26 family which cleave human IgA. The domain is also found in proteins involved in metabolism of bacterial cell walls suggesting this domain may have an adhesive function.	75
400058	pfam07502	MANEC	MANEC domain. This region of similarity, comprising 8 conserved cysteines, is found in the N-terminal region of several membrane-associated and extracellular proteins. Although formerly called MANSC (for motif at N-terminus with seven cysteines) it has now been renamed MANEC (motif at N-terminus with eight cysteines) by Richard Mitter and Stephen Fitzgerald after the discovery of an eighth conserved cysteine. It is postulated that this domain may play a role in the formation of protein complexes involving various protease activators and inhibitors.	89
400059	pfam07503	zf-HYPF	HypF finger. The HypF family of proteins are involved in the maturation and regulation of hydrogenase. In the N-terminus they appear to have two Zinc finger domains, as modelled by this family.	33
400060	pfam07504	FTP	Fungalysin/Thermolysin Propeptide Motif. This motif is found in both the bacterial M4 peptidase propeptide and the fungal M36 propeptide. Its exact function is not clear, but it is likely to either inhibit the peptidase, so as to prevent its premature activation, or has a chaperone activity. Both of these roles have been ascribed to the M4 and M36 propeptides.	51
400061	pfam07505	DUF5131	Protein of unknown function (DUF5131). This is a family of bacterial and phage proteins of unknown function. There are three highly conserved cysteine residues in the disposition, Cx6Cxxc, amongst many highly conserved residues.	246
311449	pfam07506	RepB	RepB plasmid partitioning protein. This family includes proteins with sequence similarity to the RepB partitioning protein of the large Ti (tumor-inducing) plasmids of Agrobacterium tumefaciens.	185
400062	pfam07507	WavE	WavE lipopolysaccharide synthesis. These proteins are encoded by putative wav gene clusters, which are responsible for the synthesis of the core oligosaccharide (OS) region of Vibrio cholerae lipopolysaccharide.	305
377856	pfam07508	Recombinase	Recombinase. This domain is usually found associated with pfam00239 in putative integrases/recombinases of mobile genetic elements of diverse bacteria and phages.	102
400063	pfam07509	DUF1523	Protein of unknown function (DUF1523). 	175
400064	pfam07510	DUF1524	Protein of unknown function (DUF1524). This family of uncharacterized proteins contain a conserved HXXP motif. A similar motif is seen in protein families in the His-Me finger endonuclease superfamily which suggests this family of proteins may also act as endonucleases.	140
400065	pfam07511	DUF1525	Protein of unknown function (DUF1525). 	113
400066	pfam07514	TraI_2	Putative helicase. Some members of this family have been annotated as helicases.	325
400067	pfam07515	TraI_2_C	Putative conjugal transfer nickase/helicase TraI C-term. 	123
400068	pfam07516	SecA_SW	SecA Wing and Scaffold domain. SecA protein binds to the plasma membrane where it interacts with proOmpA to support translocation of proOmpA through the membrane. SecA protein achieves this translocation, in association with SecY protein, in an ATP dependent manner. This family is composed of two C-terminal alpha helical subdomains: the wing and scaffold subdomains.	209
369399	pfam07517	SecA_DEAD	SecA DEAD-like domain. SecA protein binds to the plasma membrane where it interacts with proOmpA to support translocation of proOmpA through the membrane. SecA protein achieves this translocation, in association with SecY protein, in an ATP dependent manner. This domain represents the N-terminal ATP-dependent helicase domain, which is related to the pfam00270.	379
284851	pfam07519	Tannase	Tannase and feruloyl esterase. This family includes fungal tannase and feruloyl esterase. It also includes several bacterial homologs of unknown function.	460
400069	pfam07520	SrfB	Virulence factor SrfB. This family includes homologs of SrfB. SsrAB is a two-component regulatory system encoded within the Salmonella pathogenicity island SPI-2. Among the products of genes activated by SsrAB within epithelial and macrophage cells is Salmonella typhimurium srfB. Homologs are found in several other proteobacteria.	985
400070	pfam07521	RMMBL	Zn-dependent metallo-hydrolase RNA specificity domain. The metallo-beta-lactamase fold contains five sequence motifs. The first four motifs are found in pfam00753 and are common to all metallo-beta-lactamases. This, the fifth motif, appears to be specific to Zn-dependent metallohydrolases such as ribonuclease J 2 which are involved in the processing of mRNA. This domain adds essential structural elements to the CASP-domain and is unique to RNA/DNA-processing nucleases, showing that they are pre-mRNA 3'-end-processing endonucleases.	61
400071	pfam07522	DRMBL	DNA repair metallo-beta-lactamase. The metallo-beta-lactamase fold contains five sequence motifs. The first four motifs are found in pfam00753 and are common to all metallo-beta-lactamases. The fifth motif appears to be specific to function. This entry represents the fifth motif from metallo-beta-lactamases involved in DNA repair.	107
400072	pfam07523	Big_3	Bacterial Ig-like domain (group 3). This family consists of bacterial domains with an Ig-like fold. Members of this family are found in a variety of bacterial surface proteins.	67
400073	pfam07524	Bromo_TP	Bromodomain associated. This domain is predicted to bind DNA and is often found associated with pfam00439 and in transcription factors. It has a histone-like fold.	77
400074	pfam07525	SOCS_box	SOCS box. The SOCS box acts as a bridge between specific substrate-binding domains and more generic proteins that comprise a large family of E3 ubiquitin protein ligases.	38
400075	pfam07526	POX	Associated with HOX. The function of this domain is unknown. It is often found in plant proteins associated with pfam00046.	139
400076	pfam07527	Hairy_orange	Hairy Orange. The Orange domain is found in the Drosophila proteins Hesr-1, Hairy, and Enhancer of Split. The Orange domain is proposed to mediate specific protein-protein interaction between Hairy and Scute.	39
284860	pfam07528	DZF	DZF domain. The function of this domain is unknown. It is often found associated with pfam00098 or pfam00035. This domain has been predicted to belong to the nucleotidyltransferase superfamily.	248
400077	pfam07529	HSA	HSA. This domain is predicted to bind DNA and is often found associated with helicases.	71
148888	pfam07530	PRE_C2HC	Associated with zinc fingers. The function of this domain is unknown. It is often found associated with pfam00096.	68
400078	pfam07531	TAFH	NHR1 homology to TAF. This corresponds to the region NHR1 that is conserved between the product of the nervy gene in Drosophila and the human mtg8b protein, which is hypothesized to be a transcription factor.	89
400079	pfam07532	Big_4	Bacterial Ig-like domain (group 4). This family consists of bacterial domains with an Ig-like fold. Members of this family are found in a variety of bacterial surface proteins.	59
400080	pfam07533	BRK	BRK domain. The function of this domain is unknown. It is often found associated with helicases and transcription factors.	43
400081	pfam07534	TLD	TLD. This domain is predicted to be an enzyme and is often found associated with pfam01476. Its structure consists of a beta-sandwich surrounded by two helices and two one-turn helices.	136
400082	pfam07535	zf-DBF	DBF zinc finger. This domain is predicted to bind metal ions and is often found associated with pfam00533 and pfam02178. It was first identified in the Drosophila chiffon gene product, and is associated with initiation of DNA replication.	42
400083	pfam07536	HWE_HK	HWE histidine kinase. Two-component systems, consisting of a histidine kinase and a cognate response regulator protein, represent the best-known apparatus for transducing external cues into a physiological response in bacteria. The HWE domain is found in a subset of two-component system kinases, belonging to the same superfamily as pfam00512. The family was defined by the presence of a highly conserved H residue in the kinase domain and a WxE motif in a C-terminal ATPase domain that is related to pfam02518. These proteins are found in a variety of alpha- and gamma-proteobacteria, with significant enrichment in the rhizobia.	83
400084	pfam07537	CamS	CamS sex pheromone cAM373 precursor. This family includes CamS, from which Staphylococcus aureus sex pheromone staph-cAM373 is processed.	316
400085	pfam07538	ChW	Clostridial hydrophobic W. A novel extracellular macromolecular system has been proposed based on the proteins containing ChW repeats. ChW stands for Clostridial hydrophobic with conserved W (tryptophan). This repeat was originally described in Clostridium acetobutylicum but is also found in other Gram-positive bacteria including Enterococcus faecalis, Streptococcus agalactiae and Streptomyces coelicolor.	35
400086	pfam07539	DRIM	Down-regulated in metastasis. These eukaryotic proteins include DRIM (Down-Regulated In Metastasis), which is differentially expressed in metastatic and non-metastatic human breast carcinoma cells. It is believed to be involved in processing of non-coding RNA.	591
400087	pfam07540	NOC3p	Nucleolar complex-associated protein. Nucleolar complex-associated protein (Noc3p) is conserved in eukaryotes and has essential roles in replication and rRNA processing in Saccharomyces cerevisiae.	91
400088	pfam07541	EIF_2_alpha	Eukaryotic translation initiation factor 2 alpha subunit. These proteins share a region of similarity that falls towards the C-terminus from pfam00575.	112
400089	pfam07542	ATP12	ATP12 chaperone protein. Mitochondrial F1-ATPase is an oligomeric enzyme composed of five distinct subunit polypeptides. The alpha and beta subunits make up the bulk of protein mass of F1. In Saccharomyces cerevisiae both subunits are synthesized as precursors with amino-terminal targeting signals that are removed upon translocation of the proteins to the matrix compartment. These proteins include examples from eukaryotes and bacteria and may have chaperone activity, being involved in F1 ATPase complex assembly.	121
369416	pfam07543	PGA2	Protein trafficking PGA2. A Saccharomyces cerevisiae member of this family (PGA2) is an ER protein which has been implicated in protein trafficking.	138
400090	pfam07544	Med9	RNA polymerase II transcription mediator complex subunit 9. This family of Med9 proteins is conserved in yeasts. It forms part of the middle region of Mediator. Med9 has two functional domains. The species-specific amino-terminal half (aa 1-63) plays a regulatory role in transcriptional regulation, whereas this well-conserved carboxy-terminal half (aa 64-149) has a more fundamental function involved in direct binding to the amino-terminal portions of Med4 and Med7 and the assembly of Med9 into the Middle module. Also, some unidentified factor(s) in med9 extracts may impact the binding of TFIID to the promoter.	79
400091	pfam07545	Vg_Tdu	Vestigial/Tondu family. The mammalian TEF and the Drosophila scalloped genes belong to a conserved family of transcriptional factors that possesses a TEA/ATTS DNA-binding domain. Transcriptional activation by these proteins likely requires interactions with specific coactivators. In Drosophila, Scalloped (Sd) interacts with Vestigial (Vg) to form a complex, which binds DNA through the Sd TEA/ATTS domain. The Sd-Vg heterodimer is a key regulator of wing development, which directly controls several target genes and is able to induce wing outgrowth when ectopically expressed. This short conserved region is needed for interaction with Sd.	30
400092	pfam07546	EMI	EMI domain. The Pfam alignment is truncated at the C-terminus and does not include the final cysteine defined in Callebaut et al. This is to stop the family overlapping with other domains.	67
400093	pfam07547	RSD-2	RSD-2 N-terminal domain. This domain is found in three copies in the N-terminus of the C. elegans RSD-2 protein. RSD-2 (RNAi spreading defective) is involved in systemic RNAi. Mutations in the rsd-2 gene do not affect somatic genes but only germline expressed genes.	83
311484	pfam07548	ChlamPMP_M	Chlamydia polymorphic membrane protein middle domain. This family contains several Chlamydia polymorphic membrane proteins. Chlamydia pneumoniae is an obligate intracellular bacterium and a common human pathogen causing infection of the upper and lower respiratory tract. This domain is found between the beta-helical repeats (pfam02415) and the C-terminal pfam03797. This domain is excised subsequent to secretion.	170
400094	pfam07549	Sec_GG	SecD/SecF GG Motif. This family consists of various prokaryotic SecD and SecF protein export membrane proteins. The SecD and SecF proteins are part of the multimeric protein export complex comprising SecA, D, E, F, G, Y, and YajC. SecD and SecF are required to maintain a proton motive force. This alignment encompasses a -GG- motif typically found in the N-terminal half of the SecD/SecF proteins.	27
369421	pfam07550	DUF1533	Protein of unknown function (DUF1533). This family consists of several hypothetical bacterial proteins and is around 60 residues in length. Its function is not known.	59
148905	pfam07551	DUF1534	Protein of unknown function (DUF1534). This family is found in a group of small bacterial proteins. Its function is not known.	48
284881	pfam07552	Coat_X	Spore Coat Protein X and V domain. This family is found in the Bacillales coat protein X as a tandem repeat and also in coat protein V. The proteins are found in the insoluble fraction.	54
400095	pfam07553	Lipoprotein_Ltp	Host cell surface-exposed lipoprotein. This is a family of lipoproteins that is involved in superinfection exclusion. Proteins in this family have been shown to act at the stage of DNA release from the phage head into the cell.	48
400096	pfam07554	FIVAR	FIVAR domain. This domain is found in a wide variety of contexts, but mostly occurring in cell wall associated proteins. A lack of conserved catalytic residues suggests that it is a binding domain. From context, possible substrates are hyaluronate or fibronectin (personal obs: C Yeats). This is further evidenced by. Possibly the exact substrate is N-acetyl glucosamine. Finding it in the same protein as pfam05089 further supports this proposal. It is found in the C-terminal part of Bacillus sp. Gellan lyase, which is removed during maturation. Some of the proteins it is found in are involved in methicillin resistance. The name FIVAR derives from Found In Various Architectures.	69
400097	pfam07555	NAGidase	beta-N-acetylglucosaminidase. This family has previously been described as a hyaluronidase. However, more recently it has been shown that this family has beta-N-acetylglucosaminidase activity.	293
400098	pfam07556	DUF1538	Protein of unknown function (DUF1538). This family contains several conserved glycines and phenylalanines.	209
400099	pfam07557	Shugoshin_C	Shugoshin C-terminus. Shugoshin-like proteins contain this conserved sequence at the C-terminus, which is rich in basic amino-acids. Shugoshin (Sgo1) protects Rec8 at centromeres during anaphase I (during meiosis) so that sister chromatids remain tethered. Sgo2 is a paralogue of Sgo1 and is involved in correctly orienting sister-centromeres.	25
400100	pfam07558	Shugoshin_N	Shugoshin N-terminal coiled-coil region. The Shugoshin protein is found to have this conserved N-terminal coiled-coil region and a highly conserved C-terminal basic region, family Shugoshin_C pfam07557. Shugoshin is a crucial target of Bub1 kinase function at kinetochores, necessary for both meiotic and mitotic localization of shugoshin to the kinetochore. Human shugoshin is diffusible and mediates kinetochore-driven formation of kinetochore-microtubules during bipolar spindle assembly. Further, the primary role of shugoshin is to ensure bipolar attachment of kinetochores, and its role in protecting cohesion has co-developed to facilitate this process.	45
400101	pfam07559	FlaE	Flagellar basal body protein FlaE. This family consists of several bacterial FlaE flagellar proteins. These proteins are part of the flagellar basal body rod complex.	85
116179	pfam07560	DUF1539	Domain of Unknown Function (DUF1539). 	126
400102	pfam07561	DUF1540	Domain of Unknown Function (DUF1540). This family has four conserved cysteines, which is suggestive of a metal binding function.	40
400103	pfam07562	NCD3G	Nine Cysteines Domain of family 3 GPCR. This conserved sequence contains several highly-conserved Cys residues that are predicted to form disulphide bridges. It is predicted to lie outside the cell membrane, tethered to the pfam00003 in several receptor proteins.	54
400104	pfam07563	DUF1541	Protein of unknown function (DUF1541). This family consists of several hypothetical bacterial proteins and occurs as a tandem repeat.	52
400105	pfam07564	DUF1542	Domain of Unknown Function (DUF1542). This domain is found in several cell surface proteins. Some are involved in antibiotic resistance and/or cellular adhesion.	77
400106	pfam07565	Band_3_cyto	Band 3 cytoplasmic domain. This family contains the cytoplasmic domain of the Band 3 anion exchange proteins that exchange Cl-/HCO3-. Band 3 constitutes the most abundant polypeptide in the red blood cell membrane, comprising 25% of the total membrane protein. The cytoplasmic domain of band 3 functions primarily as an anchoring site for other membrane-associated proteins. Included among the protein ligands of cdb3 are ankyrin, protein 4.2, protein 4.1, glyceraldehyde-3-phosphate dehydrogenase (GAPDH), phosphofructokinase, aldolase, hemoglobin, hemichromes, and the protein tyrosine kinase (p72syk).	258
400107	pfam07566	DUF1543	Domain of Unknown Function (DUF1543). This domain is found as 1-2 copies in a small family of proteins of unknown function.	52
400108	pfam07568	HisKA_2	Histidine kinase. This is the dimerization and phosphoacceptor domain of a sub-family of histidine kinases. It shares sequence similarity with pfam00512 and pfam07536. It is usually found adjacent to a C-terminal ATPase domain (pfam02518). This domain is found in a wide range of Bacteria and also several Archaea.	76
400109	pfam07569	Hira	TUP1-like enhancer of split. The Hira proteins are found in a range of eukaryotes and are implicated in the assembly of repressive chromatin. These proteins also contain pfam00400.	206
400110	pfam07571	TAF6_C	TAF6 C-terminal HEAT repeat domain. TAF6_C is the C-terminal domain of the TAF6 subunit of the general transcription factor TFIID. The crystal structure reveals the presence of five conserved HEAT repeats. This region is necessary for the complexing together of the subunits TAF5, TAF6 and TAF9.	90
400111	pfam07572	BCNT	Bucentaur or craniofacial development. Bucentaur or craniofacial development protein 1 (BCNT) in ruminants has a different domain architecture to that in mouse and human. For this reason it has been used as a model for molecular evolution. Both bovine and human BCNTs are phosphorylated by casein kinase II in vitro.	71
311502	pfam07573	AreA_N	Nitrogen regulatory protein AreA N-terminus. The AreA nitrogen regulatory protein proteins (which are GATA type transcription factors) share a highly conserved N-terminus and pfam00320 at the C-terminus.	94
400112	pfam07574	SMC_Nse1	Nse1 non-SMC component of SMC5-6 complex. S. cerevisiae Nse1 forms part of a complex with SMC5-SMC6. This non-structural maintenance of chromosomes (SMC) complex plays an essential role in genomic stability, being involved in DNA repair and DNA metabolism. It is conserved in eukaryotes from yeast to human. This domain lies immediately upstream of the DNA-binding zinc-finger domain, zf-RING-like pfam08746.	195
400113	pfam07575	Nucleopor_Nup85	Nup85 Nucleoporin. A family of nucleoporins conserved from yeast to human. The nuclear pore complex is a large assembly composed of two essential complexes: the heptameric Nup84 complex and the heteromeric Nic96-containing complex. The Nup84 complex is composed of one copy each of Nup84, Nup85, Nup120, Nup133, Nup145C, Sec13, and Seh1. The structure of a complex of Nup85 and Seh1 was solved. The N-terminus of Nup85 is inserted and forms a three-stranded blade that completes the Seh1 6-bladed beta-propeller in trans. Following its N-terminal insertion blade, Nup85 forms a compact cuboid structure composed of 20 helices, with two distinct modules, referred to as crown and trunk.	562
400114	pfam07576	BRAP2	BRCA1-associated protein 2. These proteins include BRCA1-associated protein 2 (BRAP2), which binds nuclear localization signals (NLSs) in vitro and in yeast two-hybrid screening. These proteins share a region of sequence similarity at their N-terminus. They also have pfam02148 at the C-terminus.	93
284902	pfam07577	DUF1547	Domain of Unknown Function (DUF1547). This family appears to be found only in a small family of Chlamydia species.	60
400115	pfam07578	LAB_N	Lipid A Biosynthesis N-terminal domain. This family is found at the N-terminus of a group of Chlamydial Lipid A biosynthesis proteins. It is also found by itself in a family of proteins of unknown function.	68
311507	pfam07579	DUF1548	Domain of Unknown Function (DUF1548). This family appears to be found only in a small family of Chlamydia proteins.	135
400116	pfam07580	Peptidase_M26_C	M26 IgA1-specific Metallo-endopeptidase C-terminal region. These peptidases, which cleave mammalian IgA, are found in Gram-positive bacteria. Often found associated with pfam00746, they may be attached to the cell wall.	734
311509	pfam07581	Glug	The GLUG motif. This family is found in the IgA1 (M26) peptidases, which are attached to the cell wall peptidoglycan by an amide bond. IgA1 protease selectively cleaves human IgA1 and is likely to be a pathogenicity factor in some pathogens. This family is also found in various other contexts, including with pfam05860. It is named GLUG after the mostly conserved G-L-any-G motif.	28
377871	pfam07582	AP_endonuc_2_N	AP endonuclease family 2 C-terminus. This highly-conserved sequence is found at the C-terminus of several apurinic/apyrimidinic (AP) endonucleases in a range of Gram-positive and Gram-negative bacteria. See also pfam01261.	55
400117	pfam07583	PSCyt2	Protein of unknown function (DUF1549). A family of paralogues in the planctomyces.	208
400118	pfam07584	BatA	Aerotolerance regulator N-terminal. These proteins share a highly-conserved sequence at their N-terminus. They include several proteins from Rhodopirellula baltica and also several from proteobacteria. The proteins are produced by the Batl operon which appears to be important in pathogenicity and aerotolerance. This family is the conserved N-terminus, but the full length proteins carry multiple membrane-spanning domains. BatA ensures bacterial survival in the early stages of the infection process, when the infected sites are aerobic, and is produced under conditions of oxidative stress.	75
377874	pfam07585	BBP7	Putative beta barrel porin-7 (BBP7). This is a family of putative beta barrel porin-7 BBP7 proteins identified initially in Rhodopirellula baltica.	350
377875	pfam07586	HXXSHH	Protein of unknown function (DUF1552). A family of proteins identified in Rhodopirellula baltica.	301
400119	pfam07587	PSD1	Protein of unknown function (DUF1553). A family of proteins found in Rhodopirellula baltica.	213
369437	pfam07588	DUF1554	Protein of unknown function (DUF1554). A family of proteins identified in Leptospira interrogans.	136
377877	pfam07589	VPEP	PEP-CTERM motif. This motif has been identified in a wide range of bacteria at their C-terminus. It has been suggested that this is a protein sorting signal. Based on phylogenetic profiling it has been suggested that the EpsH family of proteins mediate this function.	23
284914	pfam07590	DUF1556	Protein of unknown function (DUF1556). 	82
400120	pfam07591	PT-HINT	Pretoxin HINT domain. A member of the HINT superfamily of proteases that is usually found N-terminal to the toxin module in polymorphic toxin systems. The domain is predicted to function in releasing the toxin domain by autoproteolysis.	136
400121	pfam07592	DDE_Tnp_ISAZ013	Rhodopirellula transposase DDE domain. These transposases are found in the planctomycete Rhodopirellula baltica, the cyanobacterium Nostoc, and the Gram-positive bacterium Streptomyces.	308
400122	pfam07593	UnbV_ASPIC	ASPIC and UnbV. This conserved sequence is found associated with pfam00515 in several paralogous proteins in Rhodopirellula baltica. It is also found associated with pfam01839 in several eukaryotic integrin-like proteins (e.g. human ASPIC) and in several other bacterial proteins.	66
284918	pfam07595	Planc_extracel	Planctomycete extracellular. This motif is conserved as the N-terminus of several Rhodopirellula baltica proteins predicted to be extracellular.	24
284919	pfam07596	SBP_bac_10	Protein of unknown function (DUF1559). A large family of paralogous proteins apparently unique to planctomycetes.	268
284921	pfam07598	DUF1561	Protein of unknown function (DUF1561). A family of paralogous proteins in Leptospira interrogans.	625
203693	pfam07599	DUF1563	Protein of unknown function (DUF1563). A small family of short hypothetical proteins in Leptospira interrogans.	43
284922	pfam07600	DUF1564	Protein of unknown function (DUF1564). A family of paralogous proteins in Leptospira interrogans. Several have been annotated as possible CopG-like transcriptional regulators (see pfam01402).	167
284923	pfam07602	DUF1565	Protein of unknown function (DUF1565). These proteins share a region of homology in their N termini, and are found in several phylogenetically diverse bacteria and in the archaeon Methanosarcina acetivorans. Some of these proteins also contain characterized domains such as pfam00395 and pfam03422.	256
400123	pfam07603	DUF1566	Protein of unknown function (DUF1566). These proteins of unknown function are found in Leptospira interrogans and in several gamma proteobacteria.	118
369441	pfam07606	DUF1569	Protein of unknown function (DUF1569). A family of hypothetical proteins identified in Rhodopirellula baltica.	152
284926	pfam07607	DUF1570	Protein of unknown function (DUF1570). A family of hypothetical proteins in Rhodopirellula baltica. This family carries a highly conserved HExxH sequence motif characteristic of members of the Peptidase clan MA.	129
400124	pfam07608	DUF1571	Protein of unknown function (DUF1571). A family of paralogous proteins in Rhodopirellula baltica.	208
400125	pfam07609	DUF1572	Protein of unknown function (DUF1572). These proteins, from several diverse bacteria, share a short conserved sequence towards their N termini.	163
400126	pfam07610	DUF1573	Protein of unknown function (DUF1573). These hypothetical proteins, from bacteria such as Rhodopirellula baltica, Bacteroides thetaiotaomicron, and Porphyromonas gingivalis, share a region of conserved sequence towards their N-termini.	98
369442	pfam07611	DUF1574	Protein of unknown function (DUF1574). A family of hypothetical proteins in Leptospira interrogans.	342
400127	pfam07613	DUF1576	Protein of unknown function (DUF1576). This small family is found in several undescribed proteins. The alignment is distinguished by the frequent occurrence of conserved glycine and aromatic residues.	176
369443	pfam07614	DUF1577	Protein of unknown function (DUF1577). A family of hypothetical proteins in Leptospira interrogans.	256
400128	pfam07615	Ykof	YKOF-related Family. 	81
400129	pfam07617	DUF1579	Protein of unknown function (DUF1579). A family of paralogous hypothetical proteins identified in Rhodopirellula baltica that also has members in Gloeobacter violaceus, Sinorhizobium meliloti and Agrobacterium tumefaciens.	155
284935	pfam07618	DUF1580	Protein of unknown function (DUF1580). A family of short hypothetical proteins found in Rhodopirellula baltica.	57
284936	pfam07619	DUF1581	Protein of unknown function (DUF1581). Several Rhodopirellula baltica proteins share this probable domain. Most of these proteins are predicted to be secreted or membrane-associated.	84
284937	pfam07621	DUF1582	Protein of unknown function (DUF1582). A family of hypothetical proteins in Rhodopirellula baltica.	29
284938	pfam07622	DUF1583	Protein of unknown function (DUF1583). Most of these Rhodopirellula baltica hypothetical proteins also match pfam07619.	411
377883	pfam07624	PSD2	Protein of unknown function (DUF1585). A conserved sequence region at the C-terminus of several cytochrome-like proteins in Rhodopirellula baltica.	74
377884	pfam07626	PSD3	Protein of unknown function (DUF1587). A region of similarity shared by several Rhodopirellula baltica cytochrome-like proteins that are predicted to be secreted. These proteins also match pfam07624.	65
377885	pfam07627	PSCyt3	Protein of unknown function (DUF1588). A region of similarity shared by several Rhodopirellula baltica cytochrome-like proteins that are predicted to be secreted. These proteins also match pfam07626 and pfam07624.	98
284944	pfam07628	DUF1589	Protein of unknown function (DUF1589). A family of short hypothetical proteins in Rhodopirellula baltica.	164
377886	pfam07631	PSD4	Protein of unknown function (DUF1592). A region of similarity shared by several Rhodopirellula baltica cytochrome-like proteins that are predicted to be secreted. These proteins also match pfam07627, pfam07626, and pfam07624.	128
400130	pfam07632	DUF1593	Protein of unknown function (DUF1593). A family of proteins in Rhodopirellula baltica that are predicted to be secreted. Also, a member has been identified in Caulobacter crescentus. These proteins may be related to pfam01156.	261
400131	pfam07634	RtxA	RtxA repeat. This short repeat is found in the RtxA toxin family.	18
369445	pfam07635	PSCyt1	Planctomycete cytochrome C. These proteins share a region of homology at their N-terminus that contains the C-{CPWHF}-{CPWR}-C-H-{CFYW} motif typical of cytochromes C, or CxxCH.	59
148958	pfam07636	PSRT	PSRT. This motif is found at the N-terminus of several short hypothetical proteins in Rhodopirellula baltica and the predicted Arylsulfatase B (EC:3.1.6.12).	32
377888	pfam07637	PSD5	Protein of unknown function (DUF1595). A family of proteins in Rhodopirellula baltica, associated with pfam07635, pfam07626, pfam07631, pfam07627, and pfam07624.	62
254323	pfam07638	Sigma70_ECF	ECF sigma factor. These proteins are probably RNA polymerase sigma factors belonging to the extra-cytoplasmic function (ECF) subfamily and show sequence similarity to pfam04542 and pfam04545.	185
377889	pfam07639	YTV	YTV. These hypothetical proteins in Rhodopirellula baltica contain several repeats of a sequence whose core is the residues YTV.	40
400132	pfam07642	BBP2	Putative beta-barrel porin-2, OmpL-like. BBP2 is a family of putative porin proteins that are likely to be outer membrane beta-barrel porins.	340
377890	pfam07643	DUF1598	Protein of unknown function (DUF1598). A family of Rhodopirellula baltica hypothetical proteins of about 500 amino acids in length.	84
311536	pfam07645	EGF_CA	Calcium-binding EGF domain. 	42
400133	pfam07646	Kelch_2	Kelch motif. The kelch motif was initially discovered in Kelch. In this protein there are six copies of the motif. It has been shown that Drosophila kel is related to Galactose Oxidase for which a structure has been solved. The kelch motif forms a beta sheet. Several of these sheets associate to form a beta propeller structure as found in pfam00064, pfam00400 and pfam00415.	47
400134	pfam07647	SAM_2	SAM domain (Sterile alpha motif). 	66
400135	pfam07648	Kazal_2	Kazal-type serine protease inhibitor domain. Usually indicative of serine protease inhibitors. However, kazal-like domains are also seen in the extracellular part of agrins, which are not known to be protease inhibitors. Kazal domains often occur in tandem arrays. Small alpha+beta fold containing three disulphides.	50
400136	pfam07650	KH_2	KH domain. 	78
400137	pfam07651	ANTH	ANTH domain. AP180 is an endocytotic accessory protein that has been implicated in the formation of clathrin-coated pits. The domain is involved in phosphatidylinositol 4,5-bisphosphate binding and is a universal adaptor for nucleation of clathrin coats.	272
400138	pfam07652	Flavi_DEAD	Flavivirus DEAD domain. 	146
400139	pfam07653	SH3_2	Variant SH3 domain. SH3 (Src homology 3) domains are often indicative of a protein involved in signal transduction related to cytoskeletal organisation. First described in the Src cytoplasmic tyrosine kinase. The structure is a partly opened beta barrel.	52
400140	pfam07654	C1-set	Immunoglobulin C1-set domain. 	85
377891	pfam07655	Secretin_N_2	Secretin N-terminal domain. This is a short domain found in bacterial type II/III secretory system proteins. The architecture of these proteins suggest that this family may be functionally analogous to pfam03958.	91
400141	pfam07657	MNNL	N-terminus of Notch ligand. This entry represents a region of conserved sequence at the N-terminus of several Notch ligand proteins.	75
400142	pfam07659	DUF1599	Domain of Unknown Function (DUF1599). 	61
377893	pfam07660	STN	Secretin and TonB N-terminus short domain. This is a short domain found at the N-terminus of the Secretins of the bacterial type II/III secretory system as well as the TonB-dependent receptor proteins. These proteins are involved in TonB-dependent active uptake of selective substrates.	51
311548	pfam07661	MORN_2	MORN repeat variant. This family represents an apparent variant of the pfam02493 repeat (personal obs:C Yeats).	22
400143	pfam07662	Nucleos_tra2_C	Na+ dependent nucleoside transporter C-terminus. This family consists of nucleoside transport proteins. Rat Slc28a2 is a purine-specific Na+-nucleoside cotransporter localized to the bile canalicular membrane. Rat Slc28a1 is a Na+-dependent nucleoside transporter selective for pyrimidine nucleosides and adenosine; it also transports the anti-viral nucleoside analogues AZT and ddC. This alignment covers the C-terminus of this family of transporters.	205
400144	pfam07663	EIIBC-GUT_C	Sorbitol phosphotransferase enzyme II C-terminus. 	92
400145	pfam07664	FeoB_C	Ferrous iron transport protein B C-terminus. Escherichia coli has an iron(II) transport system (feo) which may make an important contribution to the iron supply of the cell under anaerobic conditions. FeoB has been identified as part of this transport system. FeoB is a large 700-800 amino acid integral membrane protein. The N-terminus has been previously erroneously described as being ATP-binding. Recent work shows that it is similar to eukaryotic G-proteins and that it is a GTPase.	51
254342	pfam07666	MpPF26	M penetrans paralogue family 26. These proteins include those ascribed to M penetrans paralogue family 26 in.	133
284973	pfam07667	DUF1600	Protein of unknown function (DUF1600). These proteins appear to be specific to Mycoplasma species.	109
284974	pfam07668	MpPF1	M penetrans paralogue family 1. This family of paralogous proteins identified in Mycoplasma penetrans includes homologs of p35.	313
369456	pfam07669	Eco57I	Eco57I restriction-modification methylase. Homologs of the Escherichia coli Eco57I restriction-modification methylase are found in several phylogenetically diverse bacteria. The structure of TaqI has been solved.	104
400146	pfam07670	Gate	Nucleoside recognition. This region in the nucleoside transporter proteins is responsible for determining nucleoside specificity in the human CNT1 and CNT2 proteins. In the FeoB proteins, which are believed to be Fe2+ transporters, it includes the membrane pore region, so the function of this region is likely to be more general than just nucleoside specificity. This family may represent the pore and gate, with a wide potential range of specificity. Hence its name 'Gate'.	101
400147	pfam07671	DUF1601	Protein of unknown function (DUF1601). This repeat is found in a small number of proteins and is apparently limited to Coxiella and related species.	37
284977	pfam07672	MFS_Mycoplasma	Mycoplasma MFS transporter. These proteins share some similarity with members of the Major Facilitator Superfamily (MFS).	267
284979	pfam07675	Cleaved_Adhesin	Cleaved Adhesin Domain. This is a family of bacterial protein modules thought to function in various roles including cell adhesion, cell lysis and carbohydrate binding. A tandem repeat of these modules (either two or three repeats) constitute the haemagglutinin/adhesin (HA) regions of the gingipains, RgpA, Kgp, and Lys-gingipain HG66 expressed by Porphyromonas gingivalis (Bacteroides gingivalis). They form components of the major extracellular virulence complex RgpA-Kgp - a mixture of proteinases and adhesin domains. The adhesin domains in this complex are found in proteinase-cleaved forms when isolated from the cell surface. Haemagglutinin genes of P. gingivalis (hagA1 HAGA1_PORGI - and hagA2 HAGA2_PORGI) suggest that such proteins are composed of eight to ten tandem repeats of these adhesin modules. Genomic data predicts that homologous protein modules are also expressed by a number of other bacteria and form part of putative multi-domain proteins. These domains may be acting in concert with other adhesion modules thought to be part of these multi-domain proteins such as fibronectin type III, pfam00041, and Meprin, A5, mu (MAM), pfam00629, domains.	166
400148	pfam07676	PD40	WD40-like Beta Propeller Repeat. This family appears to be related to the pfam00400 repeat. This repeat corresponds to the RIVW repeat identified in cell surface proteins [Adindla et al. Comparative and Functional Genomics 2004; 5:2-16].	37
400149	pfam07677	A2M_recep	A-macroglobulin receptor. This family includes the receptor domain region of the alpha-2-macroglobulin family.	90
400150	pfam07678	A2M_comp	A-macroglobulin complement component. This family includes the complement components region of the alpha-2-macroglobulin family.	310
400151	pfam07679	I-set	Immunoglobulin I-set domain. 	90
400152	pfam07680	DoxA	TQO small subunit DoxA. Thiosulphate:quinone oxidoreductase (TQO) is one of the early steps in elemental sulphur oxidation. A novel TQO enzyme was purified from the thermo-acidophilic archaeon Acidianus ambivalens and shown to consist of a large subunit (DoxD) and a smaller subunit (DoxA). The DoxD- and DoxA-like two subunits are fused together in a single polypeptide in BT_0515.	131
400153	pfam07681	DoxX	DoxX. These proteins appear to have some sequence similarity with pfam04173 but their function is unknown.	84
400154	pfam07682	SOR	Sulphur oxygenase reductase. The sulphur oxygenase/reductase (SOR) of the thermo-acidophilic archaeon Acidianus ambivalens is an unusual enzyme consisting of 24 identical subunits arranged in a perfectly symmetrical hollow sphere and containing a mononuclear non-heme iron centre (personal communication: A. Kletzin). At 85 degrees C in vitro, elemental sulphur is oxidized to sulphite, thiosulphate and hydrogen sulphide with no external cofactors needed. The proposed equation is: 4S + O2 + 4 H2O ---> 2 HSO3- + 2 H2S + 2 H+.	302
377897	pfam07683	CobW_C	Cobalamin synthesis protein cobW C-terminal domain. This is a large and diverse family of putative metal chaperones that can be separated into up to 15 subgroups. In addition to known roles in cobalamin biosynthesis and the activation of the Fe-type nitrile hydratase, this family is also known to be involved in the response to zinc limitation. The CobW subgroup involved in cobalamin synthesis represents only a small sub-fraction of the family.	94
400155	pfam07684	NODP	NOTCH protein. NOTCH signalling plays a fundamental role during a great number of developmental processes in multicellular animals. NOD and NODP represent a region present in many NOTCH proteins and NOTCH homologs in multiple species such as NOTCH2 and NOTCH3, LIN12, SC1 and TAN1. The role of the NOD and NODP domains remains to be elucidated.	58
400156	pfam07685	GATase_3	CobB/CobQ-like glutamine amidotransferase domain. 	189
400157	pfam07686	V-set	Immunoglobulin V-set domain. This domain is found in antibodies as well as neural protein P0 and CTL4 amongst others.	109
400158	pfam07687	M20_dimer	Peptidase dimerization domain. This domain consists of 4 beta strands and two alpha helices which make up the dimerization surface of members of the M20 family of peptidases. This family includes a range of zinc metallopeptidases belonging to several families in the peptidase classification. Family M20 are Glutamate carboxypeptidases. Peptidase family M25 contains X-His dipeptidases.	107
400159	pfam07688	KaiA	KaiA C-terminal domain. The cyanobacterial clock proteins KaiA and KaiB are proposed as regulators of the circadian rhythm in cyanobacteria. The overall fold of the KaiA C-terminal domain is that of a four-helix bundle, which forms a dimer in the known structure.	122
400160	pfam07689	KaiB	KaiB domain. The cyanobacterial clock proteins KaiA and KaiB are proposed as regulators of the circadian rhythm in cyanobacteria. Mutations in both proteins have been reported to alter or abolish circadian rhythmicity. KaiB adopts an alpha-beta meander motif and is found to be a dimer.	82
369468	pfam07690	MFS_1	Major Facilitator Superfamily. 	346
400161	pfam07691	PA14	PA14 domain. This domain forms an insert in bacterial beta-glucosidases and is found in other glycosidases, glycosyltransferases, proteases, amidases, yeast adhesins, and bacterial toxins, including anthrax protective antigen (PA). The domain also occurs in a Dictyostelium prespore-cell-inducing factor Psi and in fibrocystin, the mammalian protein whose mutation leads to polycystic kidney and hepatic disease. The crystal structure of PA shows that this domain (named PA14 after its location in the PA20 pro-peptide) has a beta-barrel structure. The PA14 domain sequence suggests a binding function, rather than a catalytic role. The PA14 domain distribution is compatible with carbohydrate binding.	141
254362	pfam07692	Fea1	Low iron-inducible periplasmic protein. In Chlamydomonas reinhardtii, the gene encoding Fe-assimilating protein 1 is induced by iron deficiency. In green algae, this protein is periplasmic. The two paralogues FEA1 and FEA2 are the major proteins secreted by iron-deficient Chlamydomonas reinhardtii, and both are up-regulated in response to iron deficiency. FEA1 but not FEA2 is up-regulated by high CO2 concentration. Both FEA1 and FEA2 are secreted into the periplasmic space and genetic evidence confirms that their association with the cell is required for growth in low iron.	359
284995	pfam07693	KAP_NTPase	KAP family P-loop domain. The KAP (after Kidins220/ARMS and PifA) family of predicted NTPases are sporadically distributed across a wide phylogenetic range in bacteria and in animals. Many of the prokaryotic KAP NTPases are encoded in plasmids and tend to undergo disruption to form pseudogenes. A unique feature of all eukaryotic and certain bacterial KAP NTPases is the presence of two or four transmembrane helices inserted into the P-loop NTPase domain. These transmembrane helices anchor KAP NTPases in the membrane such that the P-loop domain is located on the intracellular side.	293
400162	pfam07694	5TM-5TMR_LYT	5TMR of 5TMR-LYT. This entry represents the transmembrane region of the 5TM-LYT (5TM Receptors of the LytS-YhcK type).	171
377898	pfam07695	7TMR-DISM_7TM	7TM diverse intracellular signalling. This entry represents the transmembrane region of the 7TM-DISM (7TM Receptors with Diverse Intracellular Signalling Modules).	207
400163	pfam07696	7TMR-DISMED2	7TMR-DISM extracellular 2. This entry represents one of two distinct types of extracellular domain found in the 7TM-DISM (7TM Receptors with Diverse Intracellular Signalling Modules) bacterial transmembrane proteins. It is possible that this domain adopts a jelly roll fold and acts as a receptor for carbohydrates and their derivatives.	127
400164	pfam07697	7TMR-HDED	7TM-HD extracellular. This entry represents the extracellular domain of the 7TM-HD (7TM Receptors with HD hydrolase).	219
400165	pfam07698	7TM-7TMR_HD	7TM receptor with intracellular HD hydrolase. These bacterial 7TM receptor proteins have an intracellular pfam01966. This entry corresponds to the 7 helix transmembrane domain. These proteins also contain an N-terminal extracellular domain.	190
400166	pfam07699	Ephrin_rec_like	Putative ephrin-receptor like. This family has repeats of a region rich in cysteines.	48
400167	pfam07700	HNOB	Haem-NO-binding. The HNOB (Haem NO Binding) domain is a predominantly alpha-helical domain and binds heme via a covalent linkage to histidine. It is a haem protein sensor (SONO) that displays femtomolar affinity for nitric oxide, NO. It is predicted to function as a haem-dependent sensor for gaseous ligands and to transduce diverse downstream signals in both bacteria and animals.	163
400168	pfam07701	HNOBA	Heme NO binding associated. The HNOBA domain is found associated with the HNOB domain and pfam00211 in soluble cyclases and signalling proteins. The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals.	215
400169	pfam07702	UTRA	UTRA domain. The UbiC transcription regulator-associated (UTRA) domain is a conserved ligand-binding domain that has a similar fold to pfam04345. It is believed to modulate activity of bacterial transcription factors in response to binding small molecules.	141
400170	pfam07703	A2M_N_2	Alpha-2-macroglobulin family N-terminal region. This family includes a region of the alpha-2-macroglobulin family.	141
400171	pfam07704	PSK_trans_fac	Rv0623-like transcription factor. This entry represents the Rv0623-like family of transcription factors associated with the PSK operon.	82
400172	pfam07705	CARDB	CARDB. Cell adhesion related domain found in bacteria.	101
400173	pfam07706	TAT_ubiq	Aminotransferase ubiquitination site. This segment contains a probable site of ubiquitination that ensures rapid degradation of tyrosine aminotransferase in rats. The half life of the enzyme in vivo is about 2-4 hours. In addition, unpublished information identifies at least 2 phosphorylation sites including CAPK at Ser29 and, at the other end of the protein, a casein kinase II site at S*QEECDK. This region of TAT is probably primarily related to regulatory events. Most other transaminases are much more stable and are not phosphorylated.	40
400174	pfam07707	BACK	BTB And C-terminal Kelch. This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation).	100
285010	pfam07708	Tash_PEST	Tash protein PEST motif. This motif is found in the Tash AT-hook proteins of Theileria annulata. These proteins are transported to the host's nucleus and are likely to be involved in pathogenesis. It is also often found in conjunction with pfam04385. It is suggested that they may be 'part of PEST motifs' (a signal for rapid proteolytic degradation), though this is not definite. This motif is also found in other T. annulata proteins, which have no other known domains (unpublished data: C Yeats).	18
116323	pfam07709	SRR	Seven Residue Repeat. Associated with pfam02969 in This repeat is found in some Plasmodium and Theileria proteins.	14
400175	pfam07710	P53_tetramer	P53 tetramerisation motif. 	37
400176	pfam07711	RabGGT_insert	Rab geranylgeranyl transferase alpha-subunit, insert domain. Rab geranylgeranyl transferase (RabGGT) catalyzes the addition of two geranylgeranyl groups to the C-terminal cysteine residues of Rab proteins, which is crucial for membrane association and function of these proteins in intracellular vesicular trafficking. This domain is inserted between pfam01239 repeats. This domain adopts an Ig-like fold and is thought to be involved in protein-protein interactions and might be involved in the recognition and binding of REP.	101
400177	pfam07712	SURNod19	Stress up-regulated Nod 19. 	377
400178	pfam07713	DUF1604	Protein of unknown function (DUF1604). This family is found at the N-terminus of several eukaryotic RNA processing proteins.	84
400179	pfam07714	Pkinase_Tyr	Protein tyrosine kinase. 	258
400180	pfam07715	Plug	TonB-dependent Receptor Plug Domain. The Plug domain has been shown to be an independently folding subunit of the TonB-dependent receptors. It acts as the channel gate, blocking the pore until the channel is bound by ligand. At this point it undergoes conformational changes that open the channel.	107
400181	pfam07716	bZIP_2	Basic region leucine zipper. 	51
400182	pfam07717	OB_NTP_bind	Oligonucleotide/oligosaccharide-binding (OB)-fold. This family is found towards the C-terminus of the DEAD-box helicases (pfam00270). In these helicases it is apparently always found in association with pfam04408. There do seem to be a couple of instances where it occurs by itself. The structure 3i4u adopts an OB-fold. This C-terminal domain of the yeast helicase contains an oligonucleotide/oligosaccharide-binding (OB)-fold which seems to be placed at the entrance of the putative nucleic acid cavity. It also constitutes the binding site for the G-patch-containing domain of Pfa1p. When found on DEAH/RHA helicases, this domain is central to the regulation of the helicase activity through its binding of both RNA and G-patch domain proteins.	82
400183	pfam07718	Coatamer_beta_C	Coatomer beta C-terminal region. This family is found at the C-terminus of the coatomer beta subunit proteins (Beta-coat proteins). This C-terminal domain probably adapts the function of the N-terminal pfam01602 domain.	138
400184	pfam07719	TPR_2	Tetratricopeptide repeat. This Pfam entry includes outlying Tetratricopeptide-like repeats (TPR) that are not matched by pfam00515.	33
400185	pfam07720	TPR_3	Tetratricopeptide repeat. This Pfam entry includes tetratricopeptide-like repeats found in the LcrH/SycD-like chaperones.	34
311590	pfam07721	TPR_4	Tetratricopeptide repeat. This Pfam entry includes tetratricopeptide-like repeats not detected by the pfam00515, pfam07719 and pfam07720 models.	26
400186	pfam07722	Peptidase_C26	Peptidase C26. These peptidases have gamma-glutamyl hydrolase activity; that is they catalyze the cleavage of the gamma-glutamyl bond in poly-gamma-glutamyl substrates. They are structurally related to pfam00117, but contain extensions in four loops and at the C-terminus.	217
336782	pfam07723	LRR_2	Leucine Rich Repeat. This Pfam entry includes some LRRs that fail to be detected with the pfam00560 model.	26
400187	pfam07724	AAA_2	AAA domain (Cdc48 subfamily). This Pfam entry includes some of the AAA proteins not detected by the pfam00004 model.	168
400188	pfam07725	LRR_3	Leucine Rich Repeat. This Pfam entry includes some LRRs that fail to be detected by the pfam00560 model.	20
400189	pfam07726	AAA_3	ATPase family associated with various cellular activities (AAA). This Pfam entry includes some of the AAA proteins not detected by the pfam00004 model.	131
400190	pfam07727	RVT_2	Reverse transcriptase (RNA-dependent DNA polymerase). A reverse transcriptase gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. Reverse transcriptases occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. This Pfam entry includes reverse transcriptases not recognized by the pfam00078 model.	243
400191	pfam07728	AAA_5	AAA domain (dynein-related subfamily). This Pfam entry includes some of the AAA proteins not detected by the pfam00004 model.	135
400192	pfam07729	FCD	FCD domain. This domain is the C-terminal ligand binding domain of many members of the GntR family. This domain probably binds to a range of effector molecules that regulate the transcription of genes through the action of the N-terminal DNA-binding domain pfam00392. This domain is found in Escherichia coli NanR and DgoR that are regulators of sugar biosynthesis operons. It is also in the known structure of FadR, where it binds to acyl-CoA; the domain is alpha helical. This family has been named FCD (FadR C-terminal Domain).	121
400193	pfam07730	HisKA_3	Histidine kinase. This is the dimerization and phosphoacceptor domain of a sub-family of histidine kinases. It shares sequence similarity with pfam00512 and pfam07536.	68
400194	pfam07731	Cu-oxidase_2	Multicopper oxidase. This entry contains many divergent copper oxidase-like domains that are not recognized by the pfam00394 model.	138
400195	pfam07732	Cu-oxidase_3	Multicopper oxidase. This entry contains many divergent copper oxidase-like domains that are not recognized by the pfam00394 model.	119
400196	pfam07733	DNA_pol3_alpha	Bacterial DNA polymerase III alpha subunit. 	259
254394	pfam07734	FBA_1	F-box associated. Most of these proteins contain pfam00646 at the N-terminus, suggesting that they are effectors linked with ubiquitination.	159
400197	pfam07735	FBA_2	F-box associated. Most of these proteins contain pfam00646 at the N-terminus, suggesting that they are effectors linked with ubiquitination.	66
400198	pfam07736	CM_1	Chorismate mutase type I. Chorismate mutase EC:5.4.99.5 catalyzes the conversion of chorismate to prephenate in the pathway of tyrosine and phenylalanine biosynthesis. This enzyme is negatively regulated by tyrosine, tryptophan and phenylalanine.	115
285037	pfam07737	ATLF	Anthrax toxin lethal factor, N- and C-terminal domain. The C-terminal domain is the catalytically active domain whereas the N-terminal domain is likely to be inactive.	218
400199	pfam07738	Sad1_UNC	Sad1 / UNC-like C-terminal. The C. elegans UNC-84 protein is a nuclear envelope protein that is involved in nuclear anchoring and migration during development. The S. pombe Sad1 protein localizes at the spindle pole body. UNC-84 and and Sad1 share a common C-terminal region, that is often termed the SUN (Sad1 and UNC) domain. In mammals, the SUN domain is present in two proteins, Sun1 and Sun2. The SUN domain of Sun2 has been demonstrated to be in the periplasm.	130
400200	pfam07739	TipAS	TipAS antibiotic-recognition domain. This domain is found at the C-terminus of some MerR family transcription factors. The domain has an alpha-helical globin-like fold. The family includes Mta, a central regulator of multidrug resistance in Bacillus subtilis.	117
400201	pfam07740	Toxin_12	Ion channel inhibitory toxin. This is a family of potent toxins that function as ion-channel inhibitors for several different ions. Omega-Grammotoxin SIA is a VSCC antagonist that inhibits neuronal N- and P-type VSCC responses. Huwentoxin-IV, from the Chinese bird spider, is a highly potent neurotoxin that specifically inhibits the neuronal tetrodotoxin-sensitive voltage-gated sodium channel in rat dorsal root ganglion neurons. Hainantoxin-4, from the venom of spider Selenocosmia hainana, adopts an inhibitor cystine knot structural motif like huwentoxin-IV, and is a potent antagonist that acts at site 1 on tetrodotoxin-sensitive (TTX-S) sodium channels. Study of the molecular nature of toxin-receptor interactions has helped elucidate the functioning of many ion-channels.	30
400202	pfam07741	BRF1	Brf1-like TBP-binding domain. This region covers both the Brf homology II and III regions. This region is involved in binding TATA binding protein.	98
400203	pfam07742	BTG	BTG family. 	115
400204	pfam07743	HSCB_C	HSCB C-terminal oligomerization domain. This domain is the HSCB C-terminal oligomerization domain and is found on co-chaperone proteins.	75
400205	pfam07744	SPOC	SPOC domain. The SPOC (Spen paralogue and orthologue C-terminal) domain is involved in developmental signalling.	142
311610	pfam07745	Glyco_hydro_53	Glycosyl hydrolase family 53. This domain belongs to family 53 of the glycosyl hydrolase classification. These enzymes are endo-1,4-beta-galactanases (EC:3.2.1.89). The structure of this domain is known and has a TIM barrel fold.	333
400206	pfam07746	LigA	Aromatic-ring-opening dioxygenase LigAB, LigA subunit. This is a family of aromatic ring opening dioxygenases which catalyze the ring-opening reaction of protocatechuate and related compounds.	87
400207	pfam07747	MTH865	MTH865-like family. This domain has an EF-hand like fold.	70
400208	pfam07748	Glyco_hydro_38C	Glycosyl hydrolases family 38 C-terminal domain. Glycosyl hydrolases are key enzymes of carbohydrate metabolism.	204
400209	pfam07749	ERp29	Endoplasmic reticulum protein ERp29, C-terminal domain. ERp29 is a ubiquitously expressed endoplasmic reticulum protein found in mammals. ERp29 is comprised of two domains. This domain, the C-terminal domain, has an all helical fold. ERp29 is thought to form part of the thyroglobulin folding complex.	94
400210	pfam07750	GcrA	GcrA cell cycle regulator. GcrA is a master cell cycle regulator that, together with CtrA (see pfam00072 and pfam00486), is involved in controlling cell cycle progression and asymmetric polar morphogenesis. During this process, there are temporal and spatial variations in the concentrations of GcrA and CtrA. The variation in concentration produces time and space dependent transcriptional regulation of modular functions that implement cell-cycle processes. More specifically, GcrA acts as an activator of components of the replisome and the segregation machinery.	155
400211	pfam07751	Abi_2	Abi-like protein. This family, found in various bacterial species, contains sequences that are similar to the Abi group of proteins, which are involved in bacteriophage resistance mediated by abortive infection in Lactococcus species. The proteins are thought to have helix-turn-helix motifs, found in many DNA-binding proteins, allowing them to perform their function.	184
400212	pfam07752	S-layer	S-layer protein. Archaeal S-layer proteins consist of two copies of this domain.	258
285051	pfam07753	DUF1609	Protein of unknown function (DUF1609). This region is found in a number of hypothetical proteins thought to be expressed by the eukaryote Encephalitozoon cuniculi, an obligate intracellular microsporidial parasite. It is approximately 200 residues long.	227
400213	pfam07754	DUF1610	Domain of unknown function (DUF1610). This zinc ribbon domain is found in archaeal species. It is likely to bind zinc via its four well-conserved cysteine residues.	24
400214	pfam07755	DUF1611	Domain of unknown function (DUF1611_C) P-loop domain. This region is found in a number of hypothetical bacterial and archaeal proteins. According to structure it has a P-loop structure.	198
400215	pfam07756	DUF1612	Protein of unknown function (DUF1612). This family includes sequences of largely unknown function but which share a number of features in common. They are expressed by bacterial species, and in many cases these bacteria are known to associate symbiotically with plants. Moreover, the majority are coded for by plasmids, which in many cases are known to confer on the organism the ability to interact symbiotically with leguminous plants. An example of such a plasmid is NGR234, which encodes Y4CF, a protein of unknown function that is a member of this family. Other members of this family are expressed by organisms with a documented genomic similarity to plant symbionts.	127
254409	pfam07757	AdoMet_MTase	Predicted AdoMet-dependent methyltransferase. Proteins in this family have been predicted to function as AdoMet-dependent methyltransferases.	112
400216	pfam07758	DUF1614	Protein of unknown function (DUF1614). This is a family of sequences coming from hypothetical proteins found in both bacterial and archaeal species.	170
400217	pfam07759	DUF1615	Protein of unknown function (DUF1615). This is a family of proteins of unknown function expressed by various bacterial species. Some members of this family are thought to be lipoproteins. Another member of this family is thought to be involved in photosynthesis.	320
400218	pfam07760	DUF1616	Protein of unknown function (DUF1616). This is a family of sequences from hypothetical archaeal proteins. The region in question is approximately 330 amino acid residues long.	301
285058	pfam07761	DUF1617	Protein of unknown function (DUF1617). This is a family of sequences from hypothetical bacterial and bacteriophage proteins. The region in question is approximately 150 residues long and is highly conserved throughout the family.	143
400219	pfam07762	DUF1618	Protein of unknown function (DUF1618). The members of this family are mainly hypothetical proteins expressed by Oryza sativa.	130
400220	pfam07763	FEZ	FEZ-like protein. This is a family of eukaryotic proteins thought to be involved in axonal outgrowth and fasciculation. The N-terminal regions of these sequences are less conserved than the C-terminal regions, and are highly acidic. The C. elegans homolog, UNC-76, may play structural and signalling roles in the control of axonal extension and adhesion (particularly in the presence of adjacent neuronal cells) and these roles have also been postulated for other FEZ family proteins. Certain homologs have been definitively found to interact with the N-terminal variable region (V1) of PKC-zeta, and this interaction causes cytoplasmic translocation of the FEZ family protein in mammalian neuronal cells. The C-terminal region probably participates in the association with the regulatory domain of PKC-zeta. The members of this family are predicted to form coiled-coil structures, which may interact with members of the RhoA family of signalling proteins, but are not thought to contain other characteristic protein motifs. Certain members of this family are expressed almost exclusively in the brain, whereas others (such as FEZ2) are expressed in other tissues, and are thought to perform similar but unknown functions in these tissues.	240
311623	pfam07764	Omega_Repress	Omega Transcriptional Repressor. The omega transcriptional repressor regulates expression of genes involved in copy number control and stable maintenance of plasmids. The omega protein belongs to the structural superfamily of MetJ/Arc repressors featuring a ribbon-helix-helix DNA-binding motif with the beta-ribbon located in and recognising the major groove of operator DNA.	71
400221	pfam07765	KIP1	KIP1-like protein. This is a family of sequences found exclusively in plants. They are similar to kinase interacting protein 1 (KIP1), which has been found to interact with the kinase domain of PRK1, a receptor-like kinase. This particular region contains two coiled-coils, which are described as motifs involved in protein-protein interactions. It has also been suggested that the protein's coiled-coils allow it to dimerize in vivo.	74
400222	pfam07766	LETM1	LETM1-like protein. Members of this family are inner mitochondrial membrane proteins which play a role in potassium and hydrogen ion exchange. Deletion of LETM1 is thought to be involved in the development of Wolf-Hirschhorn syndrome in humans.	264
400223	pfam07767	Nop53	Nop53 (60S ribosomal biogenesis). This nucleolar family of proteins are involved in 60S ribosomal biogenesis. They are specifically involved in the processing beyond the 27S stage of 25S rRNA maturation. This family contains sequences that bear similarity to the glioma tumor suppressor candidate region gene 2 protein (p60). This protein has been found to interact with herpes simplex type 1 regulatory proteins.	350
285065	pfam07768	PVL_ORF50	PVL ORF-50-like family. This is a family of sequences found in both bacteria and bacteriophages. This region is approximately 130 residues long and in some cases is found as part of the PVL (Panton-Valentine leukocidin) group of genes, which encode a member of the leukocidin group of bacterial toxins that kill leukocytes by creation of pores in the cell membrane. PVL appears to be a virulence factor associated with a number of human diseases.	116
400224	pfam07769	PsiF_repeat	psiF repeat. This region is approximately 35 residues long. It is found repeated in a number of putative phosphate starvation-inducible proteins expressed by various bacterial species. psiF is known to be an example of such phosphate starvation-inducible proteins.	34
369508	pfam07771	TSGP1	Tick salivary peptide group 1. This contains a group of peptides derived from a salivary gland cDNA library of the tick Ixodes scapularis. Also present are peptides from a related tick species, Ixodes ricinus. They are characterized by a putative signal peptide indicative of secretion and conserved cysteine residues.	120
400225	pfam07773	DUF1619	Protein of unknown function (DUF1619). This is a family of sequences derived from hypothetical eukaryotic proteins. The region in question is approximately 330 residues long and has a cysteine rich amino-terminus.	315
400226	pfam07774	DUF1620	Protein of unknown function (DUF1620). These sequences are mainly derived from predicted eukaryotic proteins. The region in question lies towards the C-terminus of these large proteins and is approximately 300 amino acid residues long.	214
285070	pfam07775	PaRep2b	PaRep2b protein. This is a family of proteins, expressed in the crenarchaeon Pyrobaculum aerophilum, whose members are variable in length and level of conservation. The presence of numerous frameshifts and internal stop codons in multiple alignments are thought to indicate that most family members are no longer functional.	512
400227	pfam07776	zf-AD	Zinc-finger associated domain (zf-AD). The zf-AD domain, also known as ZAD, forms an atypical treble-cleft-like zinc co-ordinating fold. The zf-AD domain is thought to be involved in mediating dimer formation, but does not bind to DNA.	74
400228	pfam07777	MFMR	G-box binding protein MFMR. This region is found to the N-terminus of the pfam00170 transcription factor domain. It is between 150 and 200 amino acids in length. The N-terminal half is rather rich in proline residues and has been termed the PRD (proline rich domain), whereas the C-terminal half is more polar and has been called the MFMR (multifunctional mosaic region). It has been suggested that this family is composed of three sub-families called A, B and C, classified according to motif composition. It has been suggested that some of these motifs may be involved in mediating protein-protein interactions. The MFMR region contains a nuclear localization signal in bZIP opaque and GBF-2. The MFMR also contains a transregulatory activity in TAF-1. The MFMR in CPRF-2 contains cytoplasmic retention signals.	96
369513	pfam07778	CENP-I	Mis6. Mis6 is an essential centromere connector protein acting during G1-S phase of the cell cycle. Mis6 is thought to be required for recruiting CENP-A, the centromere-specific histone H3 variant, an important event for centromere function and chromosome segregation during mitosis.	511
400229	pfam07779	Cas1_AcylT	10 TM Acyl Transferase domain found in Cas1p. Cas1p protein of Cryptococcus neoformans is required for the synthesis of O-acetylated glucuronoxylomannans, a constituent of the capsule, and is critical for its virulence. The multi TM domain of the Cas1p was unified with the 10 TM Sugar Acyltransferase superfamily. This superfamily is comprised of members from the OatA, MdoC, OpgC, NolL and GumG families in addition to the Cas1p family. The Cas1p protein has an N-terminal PC-Esterase domain with the opposing Acyl esterase activity.	474
400230	pfam07780	Spb1_C	Spb1 C-terminal domain. This presumed domain is found at the C-terminus of a family of FtsJ-like methyltransferases. Members of this family are involved in 60S ribosomal biogenesis.	209
369516	pfam07781	Reovirus_Mu2	Reovirus minor core protein Mu-2. This family represents the Reovirus core protein Mu-2. Mu-2 is a microtubule associated protein and is thought to play a key role in the formation and structural organisation of reovirus inclusion bodies.	727
400231	pfam07782	DC_STAMP	DC-STAMP-like protein. This is a family of sequences which are similar to a region of the dendritic cell-specific transmembrane protein (DC-STAMP). This is thought to be a novel receptor protein that shares no identity with other multimembrane-spanning proteins. It is thought to have seven putative transmembrane regions, two of which are found in the region featured in this family. DC-STAMP is also described as having potential N-linked glycosylation sites and a potential phosphorylation site for PKC, but these are not conserved throughout the family.	191
400232	pfam07784	DUF1622	Protein of unknown function (DUF1622). This is a family of 14 highly conserved sequences, from hypothetical proteins expressed by both bacterial and archaeal species.	78
400233	pfam07785	DUF1623	Protein of unknown function (DUF1623). The members of this family are all derived from relatively short hypothetical proteins thought to be expressed by various Nucleopolyhedroviruses.	90
377915	pfam07786	DUF1624	Protein of unknown function (DUF1624). These sequences are found in hypothetical proteins of unknown function expressed by bacterial and archaeal species. The region in question is approximately 230 residues long.	222
400234	pfam07787	TMEM43	Transmembrane protein 43. This entry represents the transmembrane protein 43 family of proteins, which may function as tetraspanin-like membrane organizers.	248
369519	pfam07788	PDDEXK_10	PD-(D/E)XK nuclease superfamily. This family is found to carry modified motifs characteristic of PD-(D/E)XK endonuclease superfamily. These are the conserved Glu of motif I, the Asp surrounded by hydrophobics of motif II, EIKS of motif III, and the lysine of motif IV has migrated to an alpha-helix following the third core beta-strand. The conserved patch of positively charged lysine and arginine residues in the motif IV alpha-helix might be involved in substrate-binding or be contributing to active site formation. Members with an additional N-terminal coiled-coil domain are annotated as tropomyosin, coiled-coil or microtubule binding proteins.	74
369520	pfam07789	DUF1627	Protein of unknown function (DUF1627). This is a group of sequences found in hypothetical proteins predicted to be expressed in a number of bacterial species. The region in question is approximately 150 amino acid residues long.	155
400235	pfam07790	Pilin_N	Archaeal Type IV pilin, N-terminal. This entry represents the N-terminal domain of archaeal pilins, which play important roles in surface adhesion and twitching motility. This domain contains a conserved N-terminal hydrophobic motif.	78
369521	pfam07791	DUF1629	Protein of unknown function (DUF1629). This family consists of sequences from hypothetical proteins thought to be expressed by two members of the Xanthomonas genus. The region in question is 125 amino acid residues long.	123
400236	pfam07792	Afi1	Docking domain of Afi1 for Arf3 in vesicle trafficking. This domain occurs at the N-terminus of Afi1, an Arf3p-interacting protein necessary for vesicle trafficking in yeast. This domain is the interacting region of the protein which binds to Arf3, the highly conserved small GTPases (ADP-ribosylation factors). Afi1 is distributed asymmetrically at the plasma membrane and is required for polarized distribution of Arf3 but not of an Arf3 guanine nucleotide-exchange factor, Yel1p. However, Afi1 is not required for targeting of Arf3 or Yel1p to the plasma membrane. Afi1 functions as an Arf3 polarization-specific adapter and participates in development of polarity. Although Arf3 is the homolog of human Arf6 it does not function in the same way, not being necessary for endocytosis or for mating factor receptor internalization. In the S phase, however, it is concentrated at the plasma membrane of the emerging bud. Because of its polarized localization and its critical function in the normal budding pattern of yeast, Arf3 is probably a regulator of vesicle trafficking, which is important for polarized growth.	119
400237	pfam07793	DUF1631	Protein of unknown function (DUF1631). The members of this family are sequences derived from a group of hypothetical proteins expressed by certain bacterial species. The region concerned is approximately 440 amino acid residues in length.	741
116408	pfam07794	DUF1633	Protein of unknown function (DUF1633). This family contains sequences derived from a group of hypothetical proteins expressed by Arabidopsis thaliana. These sequences are highly similar and the region concerned is about 100 residues long.	698
400238	pfam07795	DUF1635	Protein of unknown function (DUF1635). The members of this family include sequences that are parts of hypothetical proteins expressed by plant species. The region in question is about 170 amino acids long.	223
400239	pfam07796	DUF1638	Protein of unknown function (DUF1638). This family contains sequences covering an approximately 270 amino acid stretch of a group of hypothetical proteins. These proteins are expressed by archaeal species of the Methanosarcina genus.	161
400240	pfam07797	DUF1639	Protein of unknown function (DUF1639). This approximately 50 residue region is found in a number of sequences derived from hypothetical plant proteins. This region features a highly basic 5 amino-acid stretch towards its centre.	50
400241	pfam07798	DUF1640	Protein of unknown function (DUF1640). This family consists of sequences derived from hypothetical eukaryotic proteins. A region approximately 100 residues in length is featured.	174
400242	pfam07799	DUF1643	Protein of unknown function (DUF1643). The members of this family are all sequences found within hypothetical proteins expressed by various bacterial species. The region concerned is approximately 150 residues long.	133
400243	pfam07800	DUF1644	Protein of unknown function (DUF1644). This family consists of sequences found in a number of hypothetical plant proteins of unknown function. The region of interest contains nine highly conserved cysteine residues and is approximately 160 amino acids in length, and is probably a zinc-binding domain.	164
369527	pfam07801	DUF1647	Protein of unknown function (DUF1647). The sequences making up this family are all derived from hypothetical proteins expressed by C. elegans. The region in question is approximately 160 amino acids long. The GO annotation for this protein indicates the protein to be involved in nematode larval development and to have a positive regulation on growth rate.	141
400244	pfam07802	GCK	GCK domain. This domain is found in proteins carrying other domains known to be involved in intracellular signalling pathways (such as pfam00071) indicating that it might also be involved in these pathways. It has 4 highly conserved cysteine residues, suggesting that it can bind zinc ions. Moreover, it is found repeated in some members of this family; this may indicate that these domains are able to interact with one another, raising the possibility that this domain mediates heterodimerization.	74
400245	pfam07803	GSG-1	GSG1-like protein. This family contains sequences bearing similarity to a region of GSG1, a protein specifically expressed in testicular germ cells. It is possible that overexpression of the human homolog may be involved in tumorigenesis of human testicular germ cell tumors. The region in question has four highly-conserved cysteine residues.	109
400246	pfam07804	HipA_C	HipA-like C-terminal domain. The members of this family are similar to a region close to the C-terminus of the HipA protein expressed by various bacterial species. This protein is known to be involved in high-frequency persistence to the lethal effects of inhibition of either DNA or peptidoglycan synthesis. When expressed alone, it is toxic to bacterial cells, but it is usually tightly associated with HipB, and the HipA-HipB complex may be involved in autoregulation of the hip operon. The hip proteins may be involved in cell division control and may interact with cell division genes or their products.	221
116420	pfam07806	Nod_GRP	Nodule-specific GRP repeat. The region featured in this family is found repeated in a number of plant proteins, some of which are expressed specifically in nodules formed during symbiotic interactions with certain bacterial species. Some of these proteins are also termed glycine-rich proteins (GRPs), due to the presence of a glycine-rich C-terminal region in their structures. Bacterial infection is required for the induction of nodule-specific GRP genes, and it is thought that nodule-specific GRPs may play non-redundant roles required at specific stages of nodule development. Members of this group of proteins may be cytosolic, whereas others are thought to be membrane-associated.	38
400247	pfam07807	RED_C	RED-like protein C-terminal region. This family contains sequences that are similar to the C-terminal region of Red protein. This and related proteins are thought to be localized to the nucleus, and contain a RED repeat which consists of a number of RE and RD sequence elements. The region in question has several conserved NLS sequences. The function of Red protein is unknown, but efficient sequestration to nuclear bodies suggests that its expression may be tightly regulated or that the protein self-aggregates extremely efficiently.	107
400248	pfam07808	RED_N	RED-like protein N-terminal region. This family contains sequences that are similar to the N-terminal region of Red protein. This and related proteins contain a RED repeat which consists of a number of RE and RD sequence elements. The region in question has several conserved NLS sequences and a putative trimeric coiled-coil region, suggesting that these proteins are expressed in the nucleus. The function of Red protein is unknown, but efficient sequestration to nuclear bodies suggests that its expression may be tightly regulated or that the protein self-aggregates extremely efficiently.	226
400249	pfam07809	RTP801_C	RTP801 C-terminal region. The members of this family are sequences similar to the C-terminal region of RTP801, the protein product of a hypoxia-inducible factor 1 (HIF-1)-responsive gene. Two members of this family expressed by Drosophila melanogaster, Scylla and Charybde, are designated by the GenBank as Hox targets. RTP801 is thought to be involved in various cellular processes. Its overexpression caused the apoptosis-resistant phenotype in cycling cells, and apoptosis sensitivity in growth arrested cells. Moreover, the protein product of the mouse homolog of RTP801 (dig2) is thought to be induced by diverse apoptotic signals, and also by dexamethasone treatment.	113
400250	pfam07810	TMC	TMC domain. These sequences are similar to a region conserved amongst various protein products of the transmembrane channel-like (TMC) gene family, such as Transmembrane channel-like protein 3 and EVIN2 - this region is termed the TMC domain. Mutations in these genes are implicated in a number of human conditions, such as deafness and epidermodysplasia verruciformis. TMC proteins are thought to have important cellular roles, and may be modifiers of ion channels or transporters.	111
400251	pfam07811	TadE	TadE-like protein. The members of this family are similar to a region of the protein product of the bacterial tadE locus. In various bacterial species, the tad locus is closely linked to flp-like genes, which encode proteins required for the production of pili involved in adherence to surfaces. It is thought that the tad loci encode proteins that act to assemble or export an Flp pilus in various bacteria. All tad loci but TadA have putative transmembrane regions, and in fact the region in question in this family has a high proportion of hydrophobic amino acid residues.	43
400252	pfam07812	TfuA	TfuA-like protein. This family consists of a group of sequences that are similar to a region of TfuA protein. This protein is involved in the production of trifolitoxin (TFX), a gene-encoded, post-translationally modified peptide antibiotic. The role of TfuA in TFX synthesis is unknown, and it may be involved in other cellular processes.	120
400253	pfam07813	LTXXQ	LTXXQ motif family protein. This protein family includes two copies of a five-residue motif and is found in a number of bacterial proteins bearing similarity to the protein CpxP. This is a periplasmic protein that aids in combating extracytoplasmic protein-mediated toxicity, and may also be involved in the response to alkaline pH. Another member of this family, Spy, is also a periplasmic protein that may be involved in the response to stress. The homology between CpxP and Spy may indicate that these two proteins are functionally related.	97
400254	pfam07814	WAPL	Wings apart-like protein regulation of heterochromatin. This family contains sequences expressed in eukaryotic organisms bearing high similarity to the WAPL conserved region of D. melanogaster wings apart-like protein. This protein is involved in the regulation of heterochromatin structure. hWAPL, the human homolog, is found to play a role in the development of cervical carcinogenesis, and is thought to have similar functions to Drosophila wapl protein. Malfunction of the hWAPL pathway is thought to activate an apoptotic pathway that consequently leads to cell death.	344
400255	pfam07815	Abi_HHR	Abl-interactor HHR. The region featured in this family is found towards the N-terminus of a number of adaptor proteins that interact with Abl-family tyrosine kinases. More specifically, it is termed the homeo-domain homologous region (HHR), as it is similar to the DNA-binding region of homeo-domain proteins. Other homeo-domain proteins have been implicated in specifying positional information during embryonic development, and in the regulation of the expression of cell-type specific genes. The Abl-interactor proteins are thought to coordinate the cytoplasmic and nuclear functions of the Abl-family kinases, and seem to be involved in cytoskeletal reorganisation, but their precise role remains unclear.	64
400256	pfam07816	DUF1645	Protein of unknown function (DUF1645). These sequences are derived from a number of hypothetical plant proteins. The region in question is approximately 270 amino acids long. Some members of this family are annotated as yeast pheromone receptor proteins AR781 but no literature was found to support this.	194
400257	pfam07817	GLE1	GLE1-like protein. The members of this family are sequences that are similar to the human protein GLE1. This protein is localized at the nuclear pore complexes and functions in poly(A)+ RNA export to the cytoplasm.	231
400258	pfam07818	HCNGP	HCNGP-like protein. This family comprises sequences bearing significant similarity to the mouse transcriptional regulator protein HCNGP. This protein is localized to the nucleus and is thought to be involved in the regulation of beta-2-microglobulin genes.	91
369540	pfam07819	PGAP1	PGAP1-like protein. The sequences found in this family are similar to PGAP1. This is an endoplasmic reticulum membrane protein with a catalytic serine containing motif that is conserved in a number of lipases. PGAP1 functions as a GPI inositol-deacylase; this deacylation is important for the efficient transport of GPI-anchored proteins from the endoplasmic reticulum to the Golgi body.	233
369541	pfam07820	TraC	TraC-like protein. The members of this family are sequences that are similar to TraC. The gene encoding this protein is one of a group of genes found on plasmid p42a of Rhizobium etli CFN42 that are thought to be involved in the process of plasmid self-transmission. Mobilisation of plasmid p42a is of importance as it is required for transfer of plasmid p42a, which is also known as plasmid pSym as it carries most of the genes required for nodulation and nitrogen fixation by the symbiotic bacterium. The predicted protein products of p42a are similar to known transfer proteins of Agrobacterium tumefaciens plasmid pTiC58.	88
400259	pfam07821	Alpha-amyl_C2	Alpha-amylase C-terminal beta-sheet domain. This domain is organized as a five-stranded anti-parallel beta-sheet. It is the probable result of a decay of the common-fold.	59
400260	pfam07822	Toxin_13	Neurotoxin B-IV-like protein. The members of this family resemble neurotoxin B-IV, which is a crustacean-selective neurotoxin produced by the marine worm Cerebratulus lacteus. This highly cationic peptide is approximately 55 residues and is arranged to form two antiparallel helices connected by a well-defined loop in a hairpin structure. The branches of the hairpin are linked by four disulphide bonds. Three residues identified as being important for activity, namely Arg-17, -25 and -34, are found on the same face of the molecule, while another residue important for activity, Trp30, is on the opposite side. The protein's mode of action is not entirely understood, but it may act on voltage-gated sodium channels, possibly by binding to an as yet uncharacterized site on these proteins. Its site of interaction may also be less specific, for example it may interact with negatively charged membrane lipids.	55
311669	pfam07823	CPDase	Cyclic phosphodiesterase-like protein. Cyclic phosphodiesterase (CPDase) is involved in the tRNA splicing pathway. This protein exhibits a bilobal arrangement of two alpha-beta modules. Two antiparallel helices are found on the outer side of each lobe and frame an antiparallel beta-sheet that is wrapped around an accessible cleft. Moreover, the beta-strands of each lobe interact with the other lobe. The central water-filled cavity houses the enzyme's active site.	199
285114	pfam07824	Chaperone_III	Type III secretion chaperone domain. Type III secretion chaperones are involved in delivering virulence effector proteins from bacterial pathogens directly into eukaryotic cells. The chaperones may prevent aggregation and degradation of their substrates, may target the effector to the secretion apparatus, and may ensure a secretion-competent unfolded conformation of their specific substrate. One member of this family, SigE, forms homodimers in crystal. The monomers have a novel fold with an alpha-beta(3)-alpha-beta(2)-alpha topology.	110
116439	pfam07825	Exc	Excisionase-like protein. The phage-encoded excisionase protein (Xis) is involved in excisive recombination by regulating the assembly of the excisive intasome and by inhibiting viral integration. It adopts an unusual 'winged'-helix structure in which two alpha helices are packed against two extended strands. Also present in the structure is a two-stranded anti-parallel beta-sheet, whose strands are connected by a four-residue 'wing'. During interaction with DNA, helix alpha2 is thought to insert into the major groove, while the wing contacts the adjacent minor groove or phosphodiester backbone. The C-terminal region of Xis is involved in interaction with phage-encoded integrase (Int), and a putative C-terminal alpha helix may fold upon interaction with Int and/or DNA.	72
400261	pfam07826	IMP_cyclohyd	IMP cyclohydrolase-like protein. This enzyme may catalyze the cyclization of 5-formylamidoimidazole-4-carboxamide ribonucleotide to inosine monophosphate (IMP), a reaction which is important in de novo purine biosynthesis in archaeal species. This single domain protein is arranged to form an overall fold that consists of a four-layered alpha-beta-beta-alpha core structure. The two antiparallel beta-sheets pack against each other and are covered by alpha-helices on one face of the molecule. The protein is structurally similar to members of the N-terminal nucleophile (NTN) hydrolase superfamily. A deep pocket was in fact found on the surface of IMP cyclohydrolase in a position equivalent to that of active sites of NTN-hydrolases, but an N-terminal nucleophile could not be found. Therefore, it is thought that this enzyme is structurally but not functionally similar to members of the NTN-hydrolase family.	194
285116	pfam07827	KNTase_C	KNTase C-terminal domain. Kanamycin nucleotidyltransferase (KNTase) is involved in conferring resistance to aminoglycoside antibiotics and catalyzes the transfer of a nucleoside monophosphate group from a nucleotide to kanamycin. This enzyme is dimeric with each subunit being composed of two domains. The C-terminal domain contains five alpha helices, four of which are organized into an up-and-down alpha helical bundle. Residues found in this domain may contribute to this enzyme's active site.	132
369543	pfam07828	PA-IL	PA-IL-like protein. The members of this family are similar to the galactophilic lectin-1 expressed by P. aeruginosa (PA-IL). Lectins recognising specific carbohydrates found on the surface of host cells are known to be involved in the initiation of infections by this organism. The protein is thought to be organized into an extensive network of beta-sheets, as is the case with many other lectins.	121
311671	pfam07829	Toxin_14	Alpha-A conotoxin PIVA-like protein. Alpha-A conotoxin PIVA is the major paralytic toxin found in the venom produced by the piscivorous snail Conus purpurascens. This peptide acts by blocking the acetylcholine binding site of the nicotinic acetylcholine receptor at the neuromuscular junction. The overall shape of the peptide is described as an "iron" with a highly charged hydrophilic loop of 15S-19R forming the "handle" domain that is exposed to the exterior of the protein. The stability of the conotoxin is primarily governed by three disulphide bonds. A triangular structural motif formed by residues 19R, 12H and 6Y is thought to constitute a "binding core" that is important in binding to the acetylcholine receptor.	26
400262	pfam07830	PP2C_C	Protein serine/threonine phosphatase 2C, C-terminal domain. Protein phosphatase 2C (PP2C) is involved in regulating cellular responses to stress in various eukaryotes. It consists of two domains: an N-terminal catalytic domain and a C-terminal domain characteristic of mammalian PP2Cs. This domain consists of three antiparallel alpha helices, one of which packs against two corresponding alpha-helices of the N-terminal domain. The C-terminal domain does not seem to play a role in catalysis, but it may provide protein substrate specificity due to the cleft that is created between it and the catalytic domain.	79
400263	pfam07831	PYNP_C	Pyrimidine nucleoside phosphorylase C-terminal domain. This domain is found at the C-terminal end of the large alpha/beta domain making up various pyrimidine nucleoside phosphorylases. It has slightly different conformations in different members of this family. For example, in pyrimidine nucleoside phosphorylase (PYNP) there is an added three-stranded anti-parallel beta sheet as compared to other members of the family, such as E. coli thymidine phosphorylase (TP). The domain contains an alpha/beta hammerhead fold and residues in this domain seem to be important in formation of the homodimer.	74
400264	pfam07832	Bse634I	Cfr10I/Bse634I restriction endonuclease. Cfr10I and Bse634I are two Type II restriction endonucleases. They exhibit a conserved tetrameric architecture that is of functional importance, wherein two dimers are arranged 'back-to-back' with their putative DNA-binding clefts facing opposite directions. These clefts are formed between two monomers that interact, mainly via hydrophobic interactions supported by a few hydrogen bonds, to form a U-shaped dimer. Each monomer is folded to form a compact alpha-beta structure, whose core is made up of a five-stranded mixed beta-sheet. The monomer may be split into separate N-terminal and C-terminal subdomains at a hinge located in helix alpha3.	281
400265	pfam07833	Cu_amine_oxidN1	Copper amine oxidase N-terminal domain. Copper amine oxidases catalyze the oxidative deamination of primary amines to the corresponding aldehydes, while reducing molecular oxygen to hydrogen peroxide. These enzymes are dimers of identical subunits, each comprising four domains. The N-terminal domain, which is absent in some amine oxidases, consists of a five-stranded antiparallel beta sheet twisted around an alpha helix. The D1 domains from the two subunits comprise the 'stalk' of the mushroom-shaped dimer, and interact with each other but do not pack tightly against each other.	93
400266	pfam07834	RanGAP1_C	RanGAP1 C-terminal domain. Ran-GTPase activating protein 1 (RanGAP1) is a GTPase activator for the nuclear Ras-related regulatory protein Ran, converting it to the putatively inactive GDP-bound state. Its C-terminal domain is required for RanGAP1 localization at the vertebrate nuclear pore complex, and is sumoylated by the small ubiquitin-related modifier protein (SUMO-1). This domain is composed almost entirely of helical substructures that are organized into an alpha-alpha superhelix fold, with the exception of the peptide containing the lysine residue required for SUMO-1 conjugation.	179
400267	pfam07835	COX4_pro_2	Bacterial aa3 type cytochrome c oxidase subunit IV. Bacterial cytochrome c oxidase is found bound to the cell membrane, where it is involved in the generation of the transmembrane proton electrochemical gradient. It is composed of four subunits. Subunit IV consists of one transmembrane helix that does not interact directly with the other subunits, but maintains its position by indirect contacts via phospholipid molecules found in the structure. The function of subunit IV is as yet unknown.	42
400268	pfam07836	DmpG_comm	DmpG-like communication domain. This domain is found towards the C-terminal region of various aldolase enzymes. It consists of five alpha-helices, four of which form an antiparallel helical bundle that plugs the C-terminus of the N-terminal TIM barrel domain. The communication domain is thought to play an important role in the heterodimerization of the enzyme.	63
400269	pfam07837	FTCD_N	Formiminotransferase domain, N-terminal subdomain. The formiminotransferase (FT) domain of formiminotransferase-cyclodeaminase (FTCD) forms a homodimer, and each protomer comprises two subdomains. The N-terminal subdomain is made up of a six-stranded mixed beta-pleated sheet and five alpha helices, which are arranged on the external surface of the beta sheet. This, in turn, faces the beta-sheet of the C-terminal subdomain to form a double beta-sheet layer. The two subdomains are separated by a short linker sequence, which is not thought to be any more flexible than the remainder of the molecule. The substrate is predicted to form a number of contacts with residues found in both the N-terminal and C-terminal subdomains.	173
400270	pfam07839	CaM_binding	Plant calmodulin-binding domain. The sequences featured in this family are found repeated in a number of plant calmodulin-binding proteins, and are thought to constitute the calmodulin-binding domains. Binding of the proteins to calmodulin depends on the presence of calcium ions. These proteins are thought to be involved in various processes, such as plant defense responses and stolonisation or tuberization.	118
400271	pfam07840	FadR_C	FadR C-terminal domain. This family contains sequences that are similar to the fatty acid metabolism regulator protein (FadR). This functions as a dimer, with each monomer being composed of an N-terminal DNA-binding domain and a regulatory C-terminal domain. A linker comprising two short alpha helices joins the two domains. In the C-terminal domain, an antiparallel array of six alpha helices forms a barrel-like structure, while a seventh alpha helix forms a 'lid' at the end closest to the N-terminal domain. This structure was found to be similar to that of the C-terminal domain of the Tet repressor. Long-chain acyl-CoA thioesters interact directly and reversibly with the C-terminal domain, and this interaction affects the structure and therefore the DNA binding properties of the N-terminal domain.	163
400272	pfam07841	DM4_12	DM4/DM12 family. This family contains sequences derived from hypothetical proteins expressed by two insect species, D. melanogaster and A. gambiae. The region in question is approximately 115 amino acid residues long and contains four highly conserved cysteine residues.	86
400273	pfam07842	GCFC	GC-rich sequence DNA-binding factor-like protein. Sequences found in this family are similar to a region of a human GC-rich sequence DNA-binding factor homolog. This is thought to be a protein involved in transcriptional regulation due to partial homologies to a transcription repressor and histone-interacting protein. This family also contains tuftelin interacting protein 11 which has been identified as both a nuclear and cytoplasmic protein, and has been implicated in the secretory pathway. Sip1, a septin interacting protein is also a member of this family.	275
400274	pfam07843	DUF1634	Protein of unknown function (DUF1634). This family contains many hypothetical bacterial and archaeal proteins. A few members of this family are annotated as being putative transmembrane proteins, and the region in question in fact contains many hydrophobic residues.	102
400275	pfam07845	DUF1636	Protein of unknown function (DUF1636). The sequences featured in this family are derived from a number of hypothetical prokaryotic proteins. The region in question is approximately 130 amino acids long.	117
285130	pfam07846	Metallothio_Cad	Metallothionein family. The sequences making up Metallothio_Cad are found repeated in metallothionein proteins expressed by several different Tetrahymena species. Metallothioneins are low molecular mass, cysteine-rich metal-binding proteins that are thought to be involved in the regulation of levels of trace metals, and detoxification of these metals when present in excess. Some of the metallothioneins found in this family are known to be induced by cadmium and are thought to be involved in the cellular sequestration of toxic metal ions. The high proportion of cysteine residues allows the metal ions to be bound by the formation of clusters of metal-thiolate complexes. Tetrahymena spp. metallothioneins differ from other eukaryotic metallothioneins mainly in the length of their sequences and in the cysteine-containing motifs they exhibit.	20
400276	pfam07847	PCO_ADO	PCO_ADO. This entry includes cysteine oxidases (PCOs) from plants and 2-aminoethanethiol dioxygenases (ADOs) from animals.	201
285132	pfam07848	PaaX	PaaX-like protein. This family contains proteins that are similar to the product of the paaX gene of Escherichia coli. This protein is involved in the regulation of expression of a group of proteins known to participate in the metabolism of phenylacetic acid. In fact, some members of this family are annotated by InterPro as containing a winged helix DNA-binding domain.	70
400277	pfam07849	DUF1641	Protein of unknown function (DUF1641). Archaeal and bacterial hypothetical proteins are found in this family, with the region in question being approximately 40 residues long.	39
400278	pfam07850	Renin_r	Renin receptor-like protein. The sequences featured in this family are similar to a region of the human renin receptor that bears a putative transmembrane spanning segment. The renin receptor is involved in intracellular signal transduction by the activation of the ERK1/ERK2 pathway, and it also serves to increase the efficiency of angiotensinogen cleavage by receptor-bound renin, therefore facilitating angiotensin II generation and action on a cell surface.	97
400279	pfam07851	TMPIT	TMPIT-like protein. A number of members of this family are annotated as being transmembrane proteins induced by tumor necrosis factor alpha, but no literature was found to support this.	324
369557	pfam07852	DUF1642	Protein of unknown function (DUF1642). The sequences making up this family are derived from various hypothetical phage and prophage proteins. The region in question is approximately 140 amino acids long.	136
400280	pfam07853	DUF1648	Protein of unknown function (DUF1648). Members of this family are hypothetical proteins expressed by either bacterial or archaeal species. Some of these are annotated as being transmembrane proteins, and in fact many of these sequences contain a high proportion of hydrophobic residues.	49
285138	pfam07854	DUF1646	Protein of unknown function (DUF1646). Some of the members of this family are hypothetical bacterial and archaeal proteins, but others are annotated as being cation transporters expressed by the archaebacterium Methanosarcina mazei.	347
400281	pfam07855	ATG101	Autophagy-related protein 101. Atg101 is a critical autophagy factor that functions together with ULK, Atg13 and FIP200.	152
400282	pfam07856	Orai-1	Mediator of CRAC channel activity. ORAI-1 is a protein homolog of Drosophila Orai and human Orai1, Orai2 and Orai3. ORAI-1 GFP reporters are co-expressed with STIM-1 (ER Ca(2+) sensors) in the gonad and intestine. The protein has four predicted transmembrane domains with a highly conserved region between TM2 and TM3. This conserved domain is thought to function in channel regulation. ORAI1-related proteins are required for the production of the calcium channel, CRAC, along with STIM1-related proteins.	168
285141	pfam07857	TMEM144	Transmembrane family, TMEM144 of transporters. Members of this family fall into the drug/metabolite transporter (dmt) superfamily. They carry 10xTM domains arranged as 5+5. Although these two sets may originally have arisen by gene-duplication, the divergence now is such that the two halves are no longer homologous.	333
400283	pfam07858	LEH	Limonene-1,2-epoxide hydrolase catalytic domain. Epoxide hydrolases catalyze the hydrolysis of epoxides to corresponding diols, which is important in detoxification, synthesis of signal molecules, or metabolism. Limonene-1,2-epoxide hydrolase (LEH) differs from many other epoxide hydrolases in its structure and its novel one-step catalytic mechanism. Its main fold consists of a six-stranded mixed beta-sheet, with three N-terminal alpha helices packed to one side to create a pocket that extends into the protein core. A fourth helix lies in such a way that it acts as a rim to this pocket. Although mainly lined by hydrophobic residues, this pocket features a cluster of polar groups that lie at its deepest point and constitute the enzyme's active site.	125
400284	pfam07859	Abhydrolase_3	alpha/beta hydrolase fold. This catalytic domain is found in a very wide range of enzymes.	208
285144	pfam07860	CCD	WisP family C-Terminal Region. This family is found at the C-terminus of the Tropheryma whipplei WisP family proteins.	130
311693	pfam07861	WND	WisP family N-Terminal Region. This family is found at the N-terminus of the Tropheryma whipplei WisP family proteins.	239
400285	pfam07862	Nif11	Nif11 domain. This domain is found mainly in the Cyanobacteria and in Proteobacteria such as the nitrogen-fixing bacterium Azotobacter vinelandii. It is found in Nif11, a protein described in Azotobacter as linked to nitrogen fixation. It also constitutes a leader peptide in Nif11-derived peptides (N11P), which are thought to be post-translationally modified microcins derived from a putative nitrogen-fixing protein. N11P sequences have a classic leader peptide cleavage motif, usually Gly-Gly, which marks the end of family-wide similarity area and the beginning of a low-complexity region rich in Cys, Gly and Ser.	47
400286	pfam07863	CtnDOT_TraJ	homologs of TraJ from Bacteroides conjugative transposon. Members of this family have been implicated in an unusual form of DNA transfer (conjugation) in Bacteroides. The family has been named CtnDOT_TraJ to avoid confusion with other conjugative transfer systems.	66
400287	pfam07864	DUF1651	Protein of unknown function (DUF1651). This is a family containing bacterial proteins of unknown function.	73
311697	pfam07865	DUF1652	Protein of unknown function (DUF1652). This is a family containing hypothetical bacterial proteins.	67
400288	pfam07866	DUF1653	Protein of unknown function (DUF1653). This is a family of hypothetical bacterial proteins of unknown function.	61
311699	pfam07867	DUF1654	Protein of unknown function (DUF1654). This family consists of proteins from the Pseudomonadaceae.	70
285151	pfam07868	DUF1655	Protein of unknown function (DUF1655). This protein is found in some prophages found in Lactobacillales lactis.	55
400289	pfam07869	DUF1656	Protein of unknown function (DUF1656). This family contains bacterial proteins, many of which are hypothetical. Some proteins in this family are putative membrane proteins.	56
377932	pfam07870	DUF1657	Protein of unknown function (DUF1657). This domain appears to be restricted to the Bacillales.	49
285154	pfam07871	DUF1658	Protein of unknown function (DUF1658). This family of small proteins seems to be found in several places in the Coxiella genome.	21
400290	pfam07872	DUF1659	Protein of unknown function (DUF1659). This family consists of hypothetical bacterial proteins of unknown function.	44
400291	pfam07873	YabP	YabP family. This family of proteins is involved in spore coat assembly during the process of sporulation.	64
369565	pfam07874	DUF1660	Prophage protein (DUF1660). This protein is found in Lactobacillae prophages.	64
400292	pfam07875	Coat_F	Coat F domain. The Coat F proteins contribute to the Bacillales spore coat. This domain occurs multiple times in the genomes in which it is found.	63
400293	pfam07876	Dabb	Stress responsive A/B Barrel Domain. The function of this family is unknown, but it is upregulated in response to salt stress in Populus balsamifera. It is also found at the C-terminus of a fructose 1,6-bisphosphate aldolase from Hydrogenophilus thermoluteolus. Arthrobacter nicotinovorans ORF106 is found in the pA01 plasmid, which encodes genes for molybdopterin uptake and degradation of plant alkaloid nicotine. The structure of one has been solved and the domain forms an a/b barrel dimer. Although there is a clear duplication within the domain it is not obviously detectable in the sequence.	96
285160	pfam07877	DUF1661	Protein of unknown function (DUF1661). This is a family containing bacterial proteins of unknown function. Many of the proteins in this family are hypothetical.	31
400294	pfam07878	RHH_5	CopG-like RHH_1 or ribbon-helix-helix domain, RHH_5. This family contains bacterial proteins that form a ribbon-helix-helix fold. This fold occurs in many examples of bacterial antitoxins.	43
400295	pfam07879	PHB_acc_N	PHB/PHA accumulation regulator DNA-binding domain. This domain is found at the N-terminus of the Polyhydroxyalkanoate (PHA) synthesis regulators. These regulators have been shown to directly bind DNA and PHA. The invariant nature of this domain compared to the C-terminal pfam05233 domain(s) suggests that it contains the DNA-binding function.	59
400296	pfam07880	T4_gp9_10	Bacteriophage T4 gp9/10-like protein. The members of this family are similar to gene products 9 (gp9) and 10 (gp10) of bacteriophage T4. Both proteins are components of the viral baseplate. Gp9 connects the long tail fibers of the virus to the baseplate and triggers tail contraction after viral attachment to a host cell. The protein is active as a trimer, with each monomer being composed of three domains. The N-terminal domain consists of an extended polypeptide chain and two alpha helices. The alpha1 helix from each of the three monomers in the trimer interacts with its counterparts to form a coiled-coil structure. The middle domain is a seven-stranded beta-sandwich that is thought to be a novel protein fold. The C-terminal domain is thought to be essential for gp9 trimerisation and is organized into an eight-stranded antiparallel beta-barrel, which was found to resemble the 'jelly roll' fold found in many viral capsid proteins. The long flexible region between the N-terminal and middle domains may be required for the function of gp9 to transmit signals from the long tail fibers. Together with gp11, gp10 initiates the assembly of wedges that then go on to associate with a hub to form the viral baseplate.	285
400297	pfam07881	Fucose_iso_N1	L-fucose isomerase, first N-terminal domain. The members of this family are similar to L-fucose isomerase expressed by E. coli (EC:5.3.1.3). This enzyme corresponds to glucose-6-phosphate isomerase in glycolysis, and converts an aldo-hexose to a ketose to prepare it for aldol cleavage. The enzyme is a hexamer, with each subunit being wedge-shaped and composed of three domains. Both domains 1 and 2 contain central parallel beta-sheets with surrounding alpha helices. Domain 1 demonstrates the beta-alpha-beta-alpha-beta Rossmann fold. The active centre is shared between pairs of subunits related along the molecular three-fold axis, with domains 2 and 3 from one subunit providing most of the substrate-contacting residues, and domain 1 from the adjacent subunit contributing some other residues.	168
400298	pfam07882	Fucose_iso_N2	L-fucose isomerase, second N-terminal domain. The members of this family are similar to L-fucose isomerase expressed by E. coli (EC:5.3.1.3). This enzyme corresponds to glucose-6-phosphate isomerase in glycolysis, and converts an aldo-hexose to a ketose to prepare it for aldol cleavage. The enzyme is a hexamer, with each subunit being wedge-shaped and composed of three domains. Both domains 1 and 2 contain central parallel beta-sheets with surrounding alpha helices. The active centre is shared between pairs of subunits related along the molecular three-fold axis, with domains 2 and 3 from one subunit providing most of the substrate-contacting residues.	180
400299	pfam07883	Cupin_2	Cupin domain. This family represents the conserved barrel domain of the 'cupin' superfamily ('cupa' is the Latin term for a small barrel).	70
400300	pfam07884	VKOR	Vitamin K epoxide reductase family. Vitamin K epoxide reductase (VKOR) recycles reduced vitamin K, which is used subsequently as a co-factor in the gamma-carboxylation of glutamic acid residues in blood coagulation enzymes. VKORC1 is a member of a large family of predicted enzymes that are present in vertebrates, Drosophila, plants, bacteria and archaea. Four cysteine residues and one residue, which is either serine or threonine, are identified as likely active-site residues. In some plant and bacterial homologs the VKORC1 homologous domain is fused with domains of the thioredoxin family of oxidoreductases.	132
400301	pfam07885	Ion_trans_2	Ion channel. This family includes the two membrane helix type ion channels found in bacteria.	76
400302	pfam07886	BA14K	BA14K-like protein. The sequences found in this family are similar to the BA14K proteins expressed by Brucella abortus and by Brucella suis. BA14K was found to be strongly immunoreactive; it induces both humoral and cellular responses in hosts throughout the infective process.	29
400303	pfam07887	Calmodulin_bind	Calmodulin binding protein-like. The members of this family are putative or actual calmodulin binding proteins expressed by various plant species. Some members are known to be involved in the induction of plant defense responses. However, their precise function in this regard is as yet unknown.	291
400304	pfam07888	CALCOCO1	Calcium binding and coiled-coil domain (CALCOCO1) like. Proteins found in this family are similar to the coiled-coil transcriptional coactivator protein coexpressed by Mus musculus (CoCoA/CALCOCO1). This protein binds to a highly conserved N-terminal domain of p160 coactivators, such as GRIP1, and thus enhances transcriptional activation by a number of nuclear receptors. CALCOCO1 has a central coiled-coil region with three leucine zipper motifs, which is required for its interaction with GRIP1 and may regulate the autonomous transcriptional activation activity of the C-terminal region.	488
400305	pfam07889	DUF1664	Protein of unknown function (DUF1664). The members of this family are hypothetical plant proteins of unknown function. The region featured in this family is approximately 100 amino acids long.	122
400306	pfam07890	Rrp15p	Rrp15p. Rrp15p is required for the formation of 60S ribosomal subunits.	126
400307	pfam07891	DUF1666	Protein of unknown function (DUF1666). These sequences are derived from hypothetical plant proteins of unknown function. The region in question is approximately 250 residues long.	246
400308	pfam07892	DUF1667	Protein of unknown function (DUF1667). Hypothetical archaeal and bacterial proteins make up this family. A few proteins are annotated as being potential metal-binding proteins, and in fact the members of this family have four highly conserved cysteine residues, but no further literature evidence was found in this regard.	82
369579	pfam07893	DUF1668	Protein of unknown function (DUF1668). The hypothetical proteins found in this family are expressed by Oryza sativa and are of unknown function.	330
400309	pfam07894	DUF1669	Protein of unknown function (DUF1669). This family is composed of sequences derived from hypothetical eukaryotic proteins of unknown function. Some members of this family are annotated as being potential phospholipases but no literature was found to support this.	276
400310	pfam07895	DUF1673	Protein of unknown function (DUF1673). This family contains hypothetical proteins of unknown function expressed by two archaeal species.	207
400311	pfam07896	DUF1674	Protein of unknown function (DUF1674). The members of this family are sequences derived from hypothetical eukaryotic and bacterial proteins. The region in question is approximately 60 residues long.	50
400312	pfam07897	EAR	Ethylene-responsive binding factor-associated repression. The EAR motif is the ethylene-responsive element binding factor-associated amphiphilic repression motif. This motif binds to the Groucho/Tup1-type co-repressor TOPLESS (TPL) and TPL-related proteins. The motif is frequently found at the N-terminus of NINJA, or Novel INteractor of JAZ, proteins. The EAR motif, defined by the consensus sequence patterns of either LxLxL or DLNxxP, is the most predominant form of transcriptional repression motif so far identified in plants. It is highly conserved in transcriptional regulators that are known to function as negative regulators in a broad range of developmental and physiological processes across evolutionarily diverse plant species. This family is closely related to family AUX_IAA Pfam:PF02309 which also has an LxLxL signature.	35
400313	pfam07898	DUF1676	Protein of unknown function (DUF1676). This family contains sequences derived from proteins of unknown function expressed by Drosophila melanogaster and Anopheles gambiae.	172
400314	pfam07899	Frigida	Frigida-like protein. This family is composed of plant proteins that are similar to FRIGIDA protein expressed by Arabidopsis thaliana. This protein is probably nuclear and is required for the regulation of flowering time in the late-flowering phenotype. It is known to increase RNA levels of flowering locus C. Allelic variation at the FRIGIDA locus is a major determinant of natural variation in flowering time.	290
369585	pfam07900	DUF1670	Protein of unknown function (DUF1670). The hypothetical eukaryotic proteins found in this family are of unknown function.	218
285184	pfam07901	DUF1672	Protein of unknown function (DUF1672). This family is composed of hypothetical bacterial proteins of unknown function.	271
369586	pfam07902	Gp58	gp58-like protein. Sequences found in this family are derived from a number of bacteriophage and prophage proteins. They are similar to gp58, a minor structural protein of Lactobacillus delbrueckii bacteriophage LL-H.	594
285186	pfam07903	PaRep2a	PaRep2a protein. This is a family of proteins expressed by the crenarchaeon Pyrobaculum aerophilum. The members are highly variable in length and level of conservation. The presence of numerous frameshifts and internal stop codons in multiple alignments is thought to indicate that most family members are no longer functional.	122
400315	pfam07904	Eaf7	Chromatin modification-related protein EAF7. The S. cerevisiae member of this family is part of NuA4, the only essential histone acetyltransferase complex in Saccharomyces cerevisiae involved in global histone acetylation.	97
400316	pfam07905	PucR	Purine catabolism regulatory protein-like family. The bacterial proteins found in this family are similar to the purine catabolism regulatory protein expressed by Bacillus subtilis (PucR). PucR is thought to be a transcriptional activator involved in the induction of the purine degradation pathway, and may contain a LysR-like DNA-binding domain. It is similar to LysR-type regulators in that it represses its own expression. The other members of this family are also annotated as being putative regulatory proteins.	117
369588	pfam07906	Toxin_15	ShET2 enterotoxin, N-terminal region. The members of this family are sequences that are similar to the N-terminal half of the ShET2 enterotoxin produced by Shigella flexneri and Escherichia coli. This protein was found to confer toxigenicity in the Ussing chamber, and the N-terminal region was found to be important for the protein's enterotoxic effect. It is thought to be a hydrophobic protein that forms inclusion bodies within the bacterial cell, and may be secreted by the Mxi system. Most members of this family are annotated as putative enterotoxins, but one member is a regulator of acetyl CoA synthetase, and another two members are annotated as ankyrin-like regulatory proteins and contain Ank repeats (pfam00023).	278
400317	pfam07907	YibE_F	YibE/F-like protein. The sequences featured in this family are similar to two proteins expressed by Lactococcus lactis, YibE and YibF. Most of the members of this family are annotated as being putative membrane proteins, and in fact the sequences contain a high proportion of hydrophobic residues.	240
116521	pfam07909	DUF1663	Protein of unknown function (DUF1663). The members of this family are hypothetical proteins expressed by Trypanosoma cruzi, a eukaryotic parasite that causes Chagas' disease in humans. This region is found as multiple copies per protein.	514
400318	pfam07910	Peptidase_C78	Peptidase family C78. This family formerly known as DUF1671 has been shown to be a cysteine peptidase called (Ufm1)-specific protease.	199
400319	pfam07911	DUF1677	Protein of unknown function (DUF1677). The sequences found in this family are all derived from hypothetical plant proteins of unknown function. The region features a number of highly conserved cysteine residues.	89
400320	pfam07912	ERp29_N	ERp29, N-terminal domain. ERp29 is a ubiquitously expressed endoplasmic reticulum protein, and is involved in the processes of protein maturation and protein secretion in this organelle. The protein exists as a homodimer, with each monomer being composed of two domains. The N-terminal domain featured in this family is organized into a thioredoxin-like fold that resembles the a domain of human protein disulphide isomerase (PDI). However, this domain lacks the C-X-X-C motif required for the redox function of PDI; it is therefore thought that ERp29's function is similar to the chaperone function of PDI. The N-terminal domain is exclusively responsible for the homodimerization of the protein, without covalent linkages or additional contacts with other domains.	126
285193	pfam07913	DUF1678	Protein of unknown function (DUF1678). This family is composed of uncharacterized proteins expressed by Methanopyrus kandleri, a hyperthermophilic archaebacterium.	196
369592	pfam07914	DUF1679	Protein of unknown function (DUF1679). The region featured in this family is found in a number of C. elegans proteins, in one case as a repeat. In many of the family members, this region is associated with the CHK region described by SMART as being found in ZnF_C4 and HLH domain-containing kinases. In fact, one member of this family is annotated as being a member of the nuclear hormone receptor family, and contains regions typical of such proteins (Interpro:IPR000536, Interpro:IPR008946, and Interpro:IPR001628).	413
400321	pfam07915	PRKCSH	Glucosidase II beta subunit-like protein. The sequences found in this family are similar to a region found in the beta-subunit of glucosidase II, which is also known as protein kinase C substrate 80K-H (PRKCSH). The enzyme catalyzes the sequential removal of two alpha-1,3-linked glucose residues in the second step of N-linked oligosaccharide processing. The beta subunit is required for the solubility and stability of the heterodimeric enzyme, and is involved in retaining the enzyme within the endoplasmic reticulum. Mutations in the gene coding for PRKCSH have been found to be involved in the development of autosomal dominant polycystic liver disease (ADPLD), but the precise role the protein has in the pathogenesis of this disease is unknown. This family also includes an ER sensor for misfolded glycoproteins and is therefore likely to be a generic sugar binding domain.	72
400322	pfam07916	TraG_N	TraG-like protein, N-terminal region. The bacterial sequences found in this family are similar to the N-terminal region of the TraG protein. This is a membrane-spanning protein, with three predicted transmembrane segments and two periplasmic regions. TraG protein is known to be essential for DNA transfer in the process of conjugation, with the N-terminal portion being required for F pilus assembly. The protein is thought to interact with the periplasmic domain of TraN to stabilize mating-cell interactions.	400
400323	pfam07918	CAP160	CAP160 repeat. The region featured in this family is repeated in spinach cold acclimation protein CAP160. CAP160 is induced during periods of drought stress; its precise function is unknown but it has been implicated in the stabilisation of membranes, cytoskeletal elements, and ribosomes. By acting as a compatible solute, it may reduce the toxic effects of cellular solutes that accumulate at high concentration during dehydration; it may also function as an enzyme that produces such a solute. Other members of this family are also induced by water stress, abscisic acid, and/or low temperature, such as desiccation-responsive protein 29B and CDet11-24 protein.	27
400324	pfam07919	Gryzun	Gryzun, putative trafficking through Golgi. The proteins featured in this family are all eukaryotic, and many of them are annotated as being Gryzun. Gryzun is distantly related to, but distinct from, the Trs130 subunit of the TRAPP complex but is absent from S. cerevisiae. RNAi of human Gryzun blocks Golgi exit. Thus the family is likely to be involved with trafficking of proteins through membranes, perhaps as part of the TRAPP complex.	590
400325	pfam07920	DUF1684	Protein of unknown function (DUF1684). The sequences featured in this family are found in hypothetical archaeal and bacterial proteins of unknown function. The region in question is approximately 200 amino acids long.	141
254516	pfam07921	Fibritin_C	Fibritin C-terminal region. This family features sequences bearing similarity to the C-terminal portion of the bacteriophage T4 protein fibritin. This protein is responsible for attachment of long tail fibers to the virus particle, and forms the 'whiskers' or fibers on the neck of the virion. The region seen in this family contains an N-terminal coiled-coil portion and the C-terminal globular foldon domain (residues 457-486), which is essential for fibritin trimerisation and folding. This domain consists of a beta-hairpin; three such hairpins come together in a beta-propeller-like arrangement in the trimer, which is stabilized by hydrogen bonds, salt bridges and hydrophobic interactions.	93
400326	pfam07922	Glyco_transf_52	Glycosyltransferase family 52. This family features glycosyltransferases belonging to glycosyltransferase family 52, which have alpha-2,3-sialyltransferase (EC:4.2.99.4) and alpha-glucosyltransferase (EC 2.4.1.-) activity. For example, beta-galactoside alpha-2,3-sialyltransferase expressed by Neisseria meningitidis is a member of this family and is involved in a step of lipooligosaccharide biosynthesis requiring sialic acid transfer; these lipooligosaccharides are thought to be important in the process of pathogenesis. This family includes several bacterial lipooligosaccharide sialyltransferases similar to the Haemophilus ducreyi LST protein. Haemophilus ducreyi is the cause of the sexually transmitted disease chancroid and produces a lipooligosaccharide (LOS) containing a terminal sialyl N-acetyllactosamine trisaccharide.	271
400327	pfam07923	N1221	N1221-like protein. The sequences featured in this family are similar to a hypothetical protein product of ORF N1221 in the CPT1-SPC98 intergenic region of the yeast genome. This encodes an acidic polypeptide with several possible transmembrane regions.	282
400328	pfam07924	NuiA	Nuclease A inhibitor-like protein. This family consists of protein sequences that are similar to the nuclease A inhibitor expressed by bacteria of the genus Anabaena (NuiA). This sequence is organized to form an alpha-beta-alpha sandwich fold, which is similar to the PR-1-like fold. NuiA interacts with nuclease A by means of residues located at one end of the molecule, including residues making up the loop between helices III and IV and the loop between strands C and D. The mechanism of inhibition of nuclease A by NuiA is as yet incompletely understood.	130
369599	pfam07925	RdRP_5	Reovirus RNA-dependent RNA polymerase lambda 3. The sequences in this family are similar to the reoviral minor core protein lambda 3, which functions as a RNA-dependent RNA polymerase within the protein capsid. It is organized into 3 domains. N- and C-terminal domains create a 'cage' that encloses a conserved central catalytic domain within a hollow centre; this catalytic domain is arranged to form 'fingers', 'palm' and 'thumb' subdomains. Unlike other RNA polymerases, like HIV reverse transcriptase and T7 RNA polymerase, lambda 3 protein binds template and substrate with only localized rearrangements, and catalytic activity can occur with little structural change. However, the structure of the catalytic complex is similar to that of other polymerase catalytic complexes with known structure.	1271
400329	pfam07926	TPR_MLP1_2	TPR/MLP1/MLP2-like protein. The sequences featured in this family are similar to a region of human TPR protein and to yeast myosin-like proteins 1 (MLP1) and 2 (MLP2). These proteins share a number of features; for example, they all have coiled-coil regions and all three are associated with nuclear pores. TPR is thought to be a component of nuclear pore complex-attached intra-nuclear filaments, and is implicated in nuclear protein import. Moreover, its N-terminal region is involved in the activation of oncogenic kinases, possibly by mediating the dimerization of kinase domains or by targeting these kinases to the nuclear pore complex. MLP1 and MLP2 are involved in the process of telomere length regulation, where they are thought to interact with proteins such as Tel1p and modulate their activity.	129
400330	pfam07927	HicA_toxin	HicA toxin of bacterial toxin-antitoxin. HicA_toxin is a bacterial family of toxins that act as mRNA interferases. The antitoxin that neutralizes this is family HicB, pfam15919.	56
400331	pfam07928	Vps54	Vps54-like protein. This family contains various proteins that are homologs of the yeast Vps54 protein, such as the rat homolog, the human homolog, and the mouse homolog. In yeast, Vps54 associates with Vps52 and Vps53 proteins to form a trimolecular complex that is involved in protein transport between Golgi, endosomal, and vacuolar compartments. All Vps54 homologs contain a coiled coil region (not found in the region featured in this family) and multiple dileucine motifs.	133
400332	pfam07929	PRiA4_ORF3	Plasmid pRiA4b ORF-3-like protein. Members of this family are similar to the protein product of ORF-3 found on plasmid pRiA4 in the bacterium Agrobacterium rhizogenes. This plasmid is responsible for tumorigenesis at wound sites of plants infected by this bacterium, but the ORF-3 product does not seem to be involved in the pathogenetic process. Other proteins found in this family are annotated as being putative TnpR resolvases, but no further evidence was found to back this. Moreover, another member of this family is described as a probable lexA repressor and in fact carries a LexA DNA binding domain (pfam01726), but no references were found to expand on this.	166
400333	pfam07930	DAP_B	D-aminopeptidase, domain B. D-aminopeptidase is a dimeric enzyme with each monomer being composed of three domains. Domain B is organized to form a beta barrel made up of eight antiparallel beta strands. It is connected to domain A, the catalytic domain, by an eight-residue sequence, and also interacts with both domains A and C via non-covalent bonds. Domain B probably functions in maintaining domain C in a good position to interact with domain A.	181
400334	pfam07931	CPT	Chloramphenicol phosphotransferase-like protein. The members of this family are all similar to chloramphenicol 3-O phosphotransferase (CPT) expressed by Streptomyces venezuelae. Chloramphenicol (Cm) is a metabolite produced by this bacterium that can inhibit ribosomal peptidyl transferase activity and therefore protein production. By transferring a phosphate group to the C-3 hydroxyl group of Cm, CPT inactivates this potentially lethal metabolite.	172
400335	pfam07933	DUF1681	Protein of unknown function (DUF1681). This family is composed of sequences derived from a number of hypothetical eukaryotic proteins of unknown function.	156
400336	pfam07934	OGG_N	8-oxoguanine DNA glycosylase, N-terminal domain. The presence of 8-oxoguanine residues in DNA can give rise to G-C to T-A transversion mutations. This enzyme is found in archaeal, bacterial and eukaryotic species, and is specifically responsible for the process which leads to the removal of 8-oxoguanine residues. It has DNA glycosylase activity (EC:3.2.2.23) and DNA lyase activity (EC:4.2.99.18). The region featured in this family is the N-terminal domain, which is organized into a single copy of a TBP-like fold. The domain contributes residues to the 8-oxoguanine binding pocket.	115
311750	pfam07935	SSV1_ORF_D-335	ORF D-335-like protein. The sequences featured in this family are similar to a probable integrase expressed by the SSV1 virus of the archaebacterium Sulfolobus shibatae. This protein may be necessary for the integration of the virus into the host genome by a process of site-specific recombination.	63
254527	pfam07936	Defensin_4	Potassium-channel blocking toxin. This family features the antihypertensive and antiviral proteins BDS-I and BDS-II expressed by Anemonia sulcata. BDS-I is organized into a triple-stranded antiparallel beta-sheet, with an additional small antiparallel beta-sheet at the N-terminus. Both peptides are known to specifically block the Kv3.4 potassium channel, and thus bring about a decrease in blood pressure. Moreover, they inhibit the cytopathic effects of mouse hepatitis virus strain MHV-A59 on mouse liver cells, by an unknown mechanism.	34
285213	pfam07937	DUF1686	Protein of unknown function (DUF1686). The members of this family are all hypothetical proteins of unknown function expressed by the eukaryotic parasite Encephalitozoon cuniculi GB-M1. The region in question is approximately 250 amino acids long.	182
369605	pfam07938	Fungal_lectin	Fungal fucose-specific lectin. Lectins are involved in many recognition events at the molecular or cellular level. These fungal lectins, such as Aleuria aurantia lectin (AAL), specifically recognize fucosylated glycans. AAL is a dimeric protein, with each monomer being organized into a six-bladed beta-propeller fold and a small antiparallel two-stranded beta-sheet. The beta-propeller fold is important in fucose recognition; five binding pockets are found between the propeller blades. The small beta-sheet, on the other hand, is involved in the dimerization process.	303
400337	pfam07939	DUF1685	Protein of unknown function (DUF1685). The members of this family are hypothetical eukaryotic proteins of unknown function. The region in question is approximately 100 amino acid residues long.	60
400338	pfam07940	Hepar_II_III	Heparinase II/III-like protein. This family features sequences that are similar to a region of the Flavobacterium heparinum proteins heparinase II and heparinase III. The former is known to degrade heparin and heparin sulphate, whereas the latter predominantly degrades heparin sulphate. Both are secreted into the periplasmic space upon induction with heparin.	235
400339	pfam07941	K_channel_TID	Potassium channel Kv1.4 tandem inactivation domain. This family features the tandem inactivation domain found at the N-terminus of the Kv1.4 potassium channel. It is composed of two subdomains. Inactivation domain 1 (ID1, residues 1-38) consists of a flexible N-terminus anchored at a 5-turn helix, and is thought to work by occluding the ion pathway, as is the case with a classical ball domain. Inactivation domain 2 (ID2, residues 40-50) is a 2.5 turn helix with a high proportion of hydrophobic residues that probably serves to attach ID1 to the cytoplasmic face of the channel. In this way, it can promote rapid access of ID1 to the receptor site in the open channel. ID1 and ID2 function together to bring about fast inactivation of the Kv1.4 channel, which is important for the channel's role in short-term plasticity.	71
400340	pfam07942	N2227	N2227-like protein. This family features sequences that are similar to a region of hypothetical yeast gene product N2227. This is thought to be expressed during meiosis and may be involved in the defense response to stressful conditions.	268
400341	pfam07943	PBP5_C	Penicillin-binding protein 5, C-terminal domain. Penicillin-binding protein 5 expressed by E. coli functions as a D-alanyl-D-alanine carboxypeptidase. It is composed of two domains that are oriented at approximately right angles to each other. The N-terminal domain (pfam00768) is the catalytic domain. The C-terminal domain featured in this family is organized into a sandwich of two anti-parallel beta-sheets, and has a relatively hydrophobic surface as compared to the N-terminal domain. Its precise function is unknown; it may mediate interactions with other cell wall-synthesising enzymes, thus allowing the protein to be recruited to areas of active cell wall synthesis. It may also function as a linker domain that positions the active site in the catalytic domain closer to the peptidoglycan layer, to allow it to interact with cell wall peptides.	91
400342	pfam07944	Glyco_hydro_127	Beta-L-arabinofuranosidase, GH127. One member of this family, from Bifidobacterium longum, UniProtKB:E8MGH8, has been characterized as an unusual beta-L-arabinofuranosidase enzyme, EC:3.2.1.185. It releases l-arabinose from the l-arabinofuranose (Araf)-beta1,2-Araf disaccharide and also transglycosylates 1-alkanols with retention of the anomeric configuration. Terminal beta-l-arabinofuranosyl residues have been found in arabinogalactan proteins from a number of different plant species. beta-l-Arabinofuranosyl linkages with 1-4 arabinofuranosides are also found in the sugar chains of extensin and solanaceous lectins, hydroxyproline (Hyp)2-rich glycoproteins that are widely observed in plant cell wall fractions. The critical residue for catalytic activity is Glu-338, in a ET/SCAS sequence context.	503
116555	pfam07945	Toxin_16	Janus-atracotoxin. This family includes three peptides secreted by the spider Hadronyche versuta. These are insect-selective, excitatory neurotoxins that may function by antagonising muscle acetylcholine receptors, or acetylcholine receptor subtypes present in other invertebrate neurons. Janus atracotoxin-Hv1c (J-ACTX-Hv1c) is organized into a disulphide-rich globular core (residues 3-19) and a beta-hairpin (residues 20-34). There are 4 disulphide bridges, one of which is a vicinal disulphide bridge; this is known to be unimportant in the maintenance of structure but critical for insecticidal activity.	36
400343	pfam07946	DUF1682	Protein of unknown function (DUF1682). The members of this family are all hypothetical eukaryotic proteins of unknown function. One member is described as being an adipocyte-specific protein, but no evidence of this was found.	326
400344	pfam07947	YhhN	YhhN family. The members of this family are similar to the hypothetical protein yhhN expressed by E. coli. Many are annotated as possible transmembrane proteins, and in fact they all have a high proportion of hydrophobic residues. A human member of this family, formerly known as TMEM86B, is a lysoplasmalogenase that catalyzes the hydrolysis of the vinyl ether bond of lysoplasmalogen. Putative conserved active site residues have been proposed for the YhhN family.	182
285223	pfam07948	Nairovirus_M	Nairovirus M polyprotein-like. The sequences in this family are similar to the Dugbe virus M polyprotein precursor, which includes glycoproteins G1 and G2. Both are thought to be inserted in the membrane of the Golgi complex of the infected host cell, and G1 is known to have a role in infection of vertebrate hosts.	657
400345	pfam07949	YbbR	YbbR-like protein. The members of this family are all hypothetical bacterial proteins of unknown function, and are similar to the YbbR protein expressed by Bacillus subtilis. One member is annotated as an uncharacterized secreted protein, whereas another member is described as a hypothetical protein in the 5' region of the def gene of Thermus thermophilus, which encodes a deformylase, but no further information was found in either case. This region is found repeated up to four times in many members of this family.	80
400346	pfam07950	DUF1691	Protein of unknown function (DUF1691). This family of fungal proteins is uncharacterized. Each protein contains two copies of this region.	105
285226	pfam07951	Toxin_R_bind_C	Clostridium neurotoxin, C-terminal receptor binding. The Clostridium neurotoxin family is composed of tetanus neurotoxins and seven serotypes of botulinum neurotoxin. The structure of the botulinum neurotoxin reveals a four domain protein: the N-terminal catalytic domain (pfam01742), the central translocation domain and two receptor binding domains. This domain is the C-terminal receptor binding domain, which adopts a modified beta-trefoil fold with a six stranded beta-barrel and a beta-hairpin triplet capping the domain. The first step in the intoxication process is a binding event between this domain and the pre-synaptic nerve ending.	217
400347	pfam07952	Toxin_trans	Clostridium neurotoxin, Translocation domain. The Clostridium neurotoxin family is composed of tetanus neurotoxin and seven serotypes of botulinum neurotoxin. The structure of the botulinum neurotoxin reveals a four domain protein: the N-terminal catalytic domain (pfam01742), the central translocation domain and two receptor binding domains. Subsequent to cell surface binding and receptor mediated endocytosis of the neurotoxin, an acid induced conformational change in the neurotoxin translocation domain is believed to allow the domain to penetrate the endosome and form a pore, thereby facilitating the passage of the catalytic domain across the membrane into the cytosol. The structure of the translocation domain reveals a pair of helices that are 105 Angstroms long and is structurally distinct from other pore forming toxins.	323
400348	pfam07953	Toxin_R_bind_N	Clostridium neurotoxin, N-terminal receptor binding. The Clostridium neurotoxin family is composed of tetanus neurotoxin and seven serotypes of botulinum neurotoxin. The structure of the botulinum neurotoxin reveals a four domain protein: the N-terminal catalytic domain (pfam01742), the central translocation domain and two receptor binding domains. This domain is the N-terminal receptor binding domain, which is comprised of two seven-stranded beta-sheets sandwiched together to form a jelly roll motif. The role of this domain in receptor binding appears to be indirect.	192
369614	pfam07954	DUF1689	Protein of unknown function (DUF1689). Family of fungal proteins with unknown function. A member of this family has been found to localize in the mitochondria.	143
400349	pfam07955	DUF1687	Protein of unknown function (DUF1687). This is a putative redox protein which is predicted to have a thioredoxin fold containing a single active cysteine.	124
400350	pfam07956	DUF1690	Protein of Unknown function (DUF1690). Family of uncharacterized fungal proteins.	138
400351	pfam07957	DUF3294	Protein of unknown function (DUF3294). This family was annotated as mitochondrial Ribosomal protein MRP8, based on the presumed similarity of the S.cerevisiae protein to an E.coli mitochondrial ribosomal protein; however, this similarity is spurious, and the function is not known [Wood, V].	213
400352	pfam07958	DUF1688	Protein of unknown function (DUF1688). A family of uncharacterized proteins.	420
400353	pfam07959	Fucokinase	L-fucokinase. In the salvage pathway of GDP-L-fucose, free cytosolic fucose is phosphorylated by L-fucokinase to form L-fucose-1-phosphate, which is then further converted to GDP-L-fucose in the reaction catalyzed by GDP-L-fucose pyrophosphorylase.	404
369619	pfam07960	CBP4	CBP4. The CBP4 in S. cerevisiae is essential for the expression and activity of ubiquinol-cytochrome c reductase. This family appears to be fungal specific.	125
254545	pfam07961	MBA1	MBA1-like protein. Mba1 is an inner membrane protein that is part of the mitochondrial protein export machinery. It binds to the large subunit of mitochondrial ribosomes and cooperates with the C-terminal ribosome-binding domain of Oxa1, which is a central component of the insertion machinery of the inner membrane. In the absence of both Mba1 and the C-terminus of Oxa1, mitochondrial translation products fail to be properly inserted into the inner membrane and serve as substrates of the matrix chaperone Hsp70. It is proposed that Mba1 functions as a ribosome receptor that cooperates with Oxa1 in the positioning of the ribosome exit site to the insertion machinery of the inner membrane.	235
400354	pfam07962	Swi3	Replication Fork Protection Component Swi3. Replication fork pausing is required to initiate recombination events. More specifically, Swi1 is required for recombination near the mat1 locus. Swi3 has been found to co-purify with Swi1. Swi3, together with Swi1, defines a fork protection complex that coordinates leading- and lagging-strand synthesis and stabilizes stalled replication forks. The Swi1-Swi3 complex is required for accurate replication, fork protection and replication checkpoint signalling.	83
400355	pfam07963	N_methyl	Prokaryotic N-terminal methylation motif. This short motif directs methylation of the conserved phenylalanine residue. It is most often found at the N-terminus of pilins and other proteins involved in secretion, see pfam00114, pfam05946, pfam02501 and pfam07596.	27
369621	pfam07964	Red1	Rec10 / Red1. Rec10 / Red1 is involved in meiotic recombination and chromosome segregation during homologous chromosome formation. This protein localizes to the synaptonemal complex in S. cerevisiae and the analogous structures (linear elements) in S. pombe. This family is currently only found in fungi.	748
400356	pfam07965	Integrin_B_tail	Integrin beta tail domain. This is the beta tail domain of the Integrin protein. Integrins are receptors which are involved in cell-cell and cell-extracellular matrix interactions.	84
400357	pfam07966	A1_Propeptide	A1 Propeptide. Most eukaryotic endopeptidases (Merops Family A1) are synthesized with signal and propeptides. The animal pepsin-like endopeptidase propeptides form a distinct family of propeptides, which contain a conserved motif approximately 30 residues long. In pepsinogen A, the first 11 residues of the mature pepsin sequence are displaced by residues of the propeptide. The propeptide contains two helices that block the active site cleft; in particular, the conserved Asp11 residue in pepsin hydrogen bonds to a conserved Arg residue in the propeptide. This hydrogen bond stabilizes the propeptide conformation and is probably responsible for triggering the conversion of pepsinogen to pepsin under acidic conditions.	29
400358	pfam07967	zf-C3HC	C3HC zinc finger-like. This zinc-finger like domain is distributed throughout the eukaryotic kingdom in NIPA (Nuclear interacting partner of ALK) proteins. NIPA is implicated in an antiapoptotic role in nucleophosmin-anaplastic lymphoma kinase (ALK) mediated signaling events. The domain is often repeated, with the second domain usually containing a large insert (approximately 90 residues) after the first three cysteine residues. In Schizosaccharomyces pombe, the protein containing this domain is involved in mRNA export from the nucleus.	132
400359	pfam07968	Leukocidin	Leukocidin/Hemolysin toxin family. 	250
400360	pfam07969	Amidohydro_3	Amidohydrolase family. 	464
400361	pfam07970	COPIIcoated_ERV	Endoplasmic reticulum vesicle transporter. This family is conserved from plants and fungi to humans. Erv46 works in close conjunction with Erv41 and together they form a complex which cycles between the endoplasmic reticulum and Golgi complex. Erv46-41 interacts strongly with the endoplasmic reticulum glucosidase II. Mammalian glucosidase II comprises a catalytic alpha-subunit and a 58 kDa beta subunit, which is required for ER localization. All proteins identified biochemically as Erv41p-Erv46p interactors are localized to the early secretory pathway and are involved in protein maturation and processing in the ER and/or sorting into COPII vesicles for transport to the Golgi.	223
400362	pfam07971	Glyco_hydro_92	Glycosyl hydrolase family 92. Members of this family are alpha-1,2-mannosidases, enzymes which remove alpha-1,2-linked mannose residues from Man(9)(GlcNAc)(2) by hydrolysis. They are critical for the maturation of N-linked oligosaccharides and ER-associated degradation.	465
400363	pfam07972	Flavodoxin_NdrI	NrdI Flavodoxin like. 	119
400364	pfam07973	tRNA_SAD	Threonyl and Alanyl tRNA synthetase second additional domain. The catalytically active form of threonyl/alanyl tRNA synthetase is a dimer. Within the tRNA synthetase class II dimer, the bound tRNA interacts with both monomers making specific interactions with the catalytic domain, the C-terminal domain, and this domain (the second additional domain). The second additional domain is comprised of a pair of perpendicularly orientated antiparallel beta sheets, of four and three strands, respectively, that surround a central alpha helix that forms the core of the domain.	43
400365	pfam07974	EGF_2	EGF-like domain. This family contains EGF domains found in a variety of extracellular proteins.	26
336887	pfam07975	C1_4	TFIIH C1-like domain. The carboxyl-terminal region of TFIIH is essential for transcription activity. This region binds three zinc atoms through two independent domains. The first contains a C4 zinc finger motif, whereas the second is characterized by a CX(2)CX(2-4)FCADCD motif. The solution structure of the second C-terminal domain revealed homology with the regulatory domain of protein kinase C (pfam00130).	55
400366	pfam07976	Phe_hydrox_dim	Phenol hydroxylase, C-terminal dimerization domain. Phenol hydroxylase acts as a homodimer to hydroxylate phenol to catechol or a similar product. The enzyme is comprised of three domains. The first two domains form the active site. The third domain, this domain, is involved in forming the dimerization interface. The domain adopts a thioredoxin-like fold.	166
400367	pfam07977	FabA	FabA-like domain. This enzyme domain has a HotDog fold.	134
400368	pfam07978	NIPSNAP	NIPSNAP. Members of this family include many hypothetical proteins. It also includes members of the NIPSNAP family which have putative roles in vesicular transport. This domain is often found in duplicate.	102
311782	pfam07979	Intimin_C	Intimin C-type lectin domain. This domain is found at the C-terminus of intimin. Its structure has been solved and shown to have a C-lectin type of structure. Intimin is a bacterial adhesion molecule involved in intimate attachment of enteropathogenic and enterohemorrhagic Escherichia coli to mammalian host cells. Intimin targets the translocated intimin receptor (Tir), which is exported by the bacteria and integrated into the host cell plasma membrane.	101
400369	pfam07980	SusD_RagB	SusD family. This domain is found in bacterial cell surface proteins such as SusD and SusD-like proteins, as well as RagB, an outer membrane surface receptor antigen. Bacteroidetes, one of the two dominant bacterial phyla in the human gut, are Gram-negative saccharolytic microorganisms that utilize a diverse array of glycans. Hence, they express a starch-utilization system (Sus) for glycan uptake. SusD has 551 amino acids, and is almost entirely alpha-helical, with 22 alpha-helices, eight of which form 4 tetra-trico peptide repeats (TPRs: helix-turn-helix motifs involved in protein-protein interactions). The four TPRs pack together to create a right-handed super-helix. This is predicted to mediate the formation of the SusD and SusC porin complex at the cell surface. The interaction between SusC and the TPR1/TPR2 region of SusD is predicted to be of functional importance since it allows SusD to be in position for oligosaccharide capture from other Sus lipoproteins and delivery of these glycans to the SusC porin. The non-TPR containing portion of SusD is where starch binding occurs. The binding site is a shallow surface cavity located on top of TPR1. SusD homologs such as SusD-like proteins have a critical role in carbohydrate acquisition. Both SusD and its homologs contain about 15-20 residues at the N-terminus that might be a flexible linker region, anchoring the protein to the membrane and the glycan-binding domain. Other homologs to SusD have been examined in Porphyromonas gingivalis such as RagB, an immunodominant outer-membrane surface receptor antigen. Structural characterization of RagB shows substantial similarity with Bacteroides thetaiotaomicron SusD (i.e. alpha-helices and TPR regions). Based on this structural similarity, functional studies suggest that RagB binding of glycans occurs at pockets on the molecular surface that are distinct from those of SusD.	292
400370	pfam07981	Plasmod_MYXSPDY	Plasmodium repeat_MYXSPDY. This repeat is found in two hypothetical Plasmodium proteins.	17
285256	pfam07982	Herpes_UL74	Herpes UL74 glycoproteins. Members of this family are viral glycoproteins that form part of an envelope complex.	418
400371	pfam07983	X8	X8 domain. The X8 domain contains at least 6 conserved cysteine residues that presumably form three disulphide bridges. The domain is found in an Olive pollen allergen as well as at the C-terminus of several families of glycosyl hydrolases. This domain may be involved in carbohydrate binding. This domain is characteristic of GPI-anchored domains.	76
400372	pfam07984	NTP_transf_7	Nucleotidyltransferase. This family contains many hypothetical proteins. It also includes four nematode prion-like proteins. This domain has been identified as part of the nucleotidyltransferase superfamily.	319
400373	pfam07985	SRR1	SRR1. SRR1 proteins are signalling proteins involved in regulating the circadian clock in Arabidopsis.	54
400374	pfam07986	TBCC	Tubulin binding cofactor C. Members of this family are involved in the folding pathway of tubulins and form a beta helix structure.	119
400375	pfam07987	DUF1775	Domain of unknown function (DUF1775). Domain found in bacteria with undetermined function. Its structure has been determined and is an immunoglobulin-like fold.	145
400376	pfam07988	LMSTEN	LMSTEN motif. This region of Myb proteins has previously been described as the transcriptional activation domain present in the vertebrate c-Myb and A-Myb, but neither vertebrate B-Myb proteins nor Myb proteins of invertebrates. Because vertebrate B-Myb (but neither A-Myb nor c-Myb) can partially complement Drosophila Myb null mutants, this region appears to have been a relatively recent insertion.	45
400377	pfam07989	Cnn_1N	Centrosomin N-terminal motif 1. This domain has been identified in two microtubule associated proteins in Schizosaccharomyces pombe, Mto1 and Pcp1. Mto1 has been identified in association with spindle pole body and non-spindle pole body microtubules. The pericentrin homolog Pcp1 is also associated with the fungal centrosome or spindle pole body (SPB). Members of this family have been named centrosomins, and are an essential mitotic centrosome component required for assembly of all other known pericentriolar matrix proteins in order to achieve microtubule-organising activity in fission yeast. Cnn_1N is a short conserved motif towards the N-terminus. Motif 1 is found to be necessary for proper recruitment of gamma-tubulin, D-TACC (the homolog of vertebrate transforming acidic coiled-coil proteins [TACC]), and Minispindles (Msps) to embryonic centrosomes but is not required for assembly of other centrosome components including Aurora A kinase and CP60 in Drosophila.	69
400378	pfam07990	NABP	Nucleic acid binding protein NABP. Many members of this family are putative nucleic acid binding proteins. One member of this family has been partially characterized and contains two putative phosphorylation sites and a possible dimerization / leucine zipper domain.	387
285265	pfam07991	IlvN	Acetohydroxy acid isomeroreductase, NADPH-binding domain. Acetohydroxy acid isomeroreductase catalyzes the conversion of acetohydroxy acids into dihydroxy valerates. This reaction is the second in the synthetic pathway of the essential branched side chain amino acids valine and isoleucine. This N-terminal region of the enzyme carries the binding-site for NADPH. The active-site for enzymatic activity lies in the C-terminal part, IlvC, pfam01450.	165
400379	pfam07992	Pyr_redox_2	Pyridine nucleotide-disulphide oxidoreductase. This family includes both class I and class II oxidoreductases and also NADH oxidases and peroxidases. This domain is actually a small NADH binding domain within a larger FAD binding domain.	301
400380	pfam07993	NAD_binding_4	Male sterility protein. This family represents the C-terminal region of the male sterility protein in a number of arabidopsis and drosophila. A sequence-related jojoba acyl CoA reductase is also included.	257
400381	pfam07994	NAD_binding_5	Myo-inositol-1-phosphate synthase. This is a family of myo-inositol-1-phosphate synthases. Inositol-1-phosphate synthase catalyzes the conversion of glucose-6-phosphate to inositol-1-phosphate, which is then dephosphorylated to inositol. Inositol phosphates play an important role in signal transduction.	435
400382	pfam07995	GSDH	Glucose / Sorbosone dehydrogenase. Members of this family are glucose/sorbosone dehydrogenases that possess a beta-propeller fold.	326
400383	pfam07996	T4SS	Type IV secretion system proteins. Members of this family are components of the type IV secretion system. They mediate intracellular transfer of macromolecules via a mechanism ancestrally related to that of bacterial conjugation machineries.	191
400384	pfam07997	DUF1694	Protein of unknown function (DUF1694). This family contains many hypothetical proteins.	114
191923	pfam07998	Peptidase_M54	Peptidase family M54. This is a family of metallopeptidases. Two human proteins have been reported to degrade synthetic substrates and peptides.	176
191924	pfam07999	RHSP	Retrotransposon hot spot protein. Members of this family are retrotransposon hot spot proteins. They are associated with polymorphic subtelomeric regions in Trypanosoma. These proteins contain a P-loop motif.	439
400385	pfam08000	bPH_1	Bacterial PH domain. This family contains many bacterial hypothetical proteins. The structures 3hsa and 3dcx show similarities to the PH or pleckstrin homology domain. First evidence of PH-like domains in bacteria suggests a role in the cell envelope stress response.	122
285273	pfam08001	CMV_US	CMV US. This is a family of unique short (US) cytoplasmic glycoproteins which are expressed in cytomegalovirus.	245
400386	pfam08002	DUF1697	Protein of unknown function (DUF1697). This family contains many hypothetical bacterial proteins.	131
400387	pfam08003	Methyltransf_9	Protein of unknown function (DUF1698). This family contains many hypothetical proteins. It also includes two putative methyltransferase proteins.	315
369645	pfam08004	DUF1699	Protein of unknown function (DUF1699). This family contains many archaeal proteins which have very conserved sequences.	130
400388	pfam08005	PHR	PHR domain. This domain is called PHR as it was originally found in the proteins PAM, highwire, and RPM. This domain can be duplicated in the highwire, PAM and RPM sequences. The C-terminal region of the protein BTBD1 includes the PHR domain and is known to interact with Topoisomerase I, an enzyme which relaxes DNA supercoils.	150
285278	pfam08006	DUF1700	Protein of unknown function (DUF1700). This family contains many hypothetical bacterial proteins and putative membrane proteins.	181
400389	pfam08007	Cupin_4	Cupin superfamily protein. This family contains many hypothetical proteins that belong to the cupin superfamily.	319
285280	pfam08008	Viral_cys_rich	Viral cysteine rich. Members of this family are polydnavirus proteins that contain a cysteine rich motif. Some members of this family have multiple copies of this domain.	83
400390	pfam08009	CDP-OH_P_tran_2	CDP-alcohol phosphatidyltransferase 2. This domain is found on CDP-alcohol phosphatidyltransferases. These enzymes catalyze the displacement of CMP from a CDP-alcohol by a second alcohol with formation of a phosphodiester bond and concomitant breaking of a phosphoric anhydride bond.	37
285282	pfam08010	Phage_30_3	Bacteriophage protein GP30.3. Proteins in this family are bacteriophage GP30.3 proteins. Their function is poorly characterized.	138
400391	pfam08011	PDDEXK_9	PD-(D/E)XK nuclease superfamily. This family contains many hypothetical bacterial proteins. It has been identified as a member of the PD-(D/E)XK nuclease superfamily through transitive meta profile searches. DUF1703 has the predicted secondary structure pattern of the restriction endonuclease-like fold core and contains an additional beta-strand at the C-terminus.	104
400392	pfam08012	DUF1702	Protein of unknown function (DUF1702). This family of proteins contains many bacterial proteins that are encoded by the UnbL gene. The function of these proteins is unknown.	319
400393	pfam08013	Tagatose_6_P_K	Tagatose 6 phosphate kinase. Proteins in this family are tagatose 6 phosphate kinases.	420
400394	pfam08014	DUF1704	Domain of unknown function (DUF1704). This family contains many hypothetical proteins.	365
369652	pfam08015	Pheromone	Fungal mating-type pheromone. This family corresponds to mating-type pheromone proteins. The homobasidiomycetes, or mushroom fungi, have arguably the most complex mating system of all known organisms. Many species possess a mating system known as bifactorial incompatibility, where two unlinked incompatibility loci (the A and B mating-type loci) control the mating type of an individual. Each A mating-type sublocus encodes a pair of divergently transcribed homeodomain transcription factors while the genes responsible for B mating-type activity encode lipopeptide pheromones and G-protein-coupled pheromone receptors.	67
400395	pfam08016	PKD_channel	Polycystin cation channel. This family contains the cation channel region of PKD1 and PKD2 proteins.	424
311808	pfam08017	Fibrinogen_BP	Fibrinogen binding protein. Proteins in this family bind to fibrinogen. Members of this family include the fibrinogen receptor, FbsA, which mediates platelet aggregation.	393
285289	pfam08018	Antimicrobial_1	Frog antimicrobial peptide. This family includes antimicrobial peptides secreted from skins of frogs. The secretion of antimicrobial peptides from the skins of frogs plays an important role in the self defense of these frogs. Structural characterization of these peptides showed that they belonged to four known families: the brevinin-1 family, the esculentin-2 family, the ranatuerin-2 family and the temporin family.	24
400396	pfam08019	DUF1705	Domain of unknown function (DUF1705). Some members of this family are putative bacterial membrane proteins. This domain is found immediately N terminal to the sulfatase domain in many sulfatases.	149
400397	pfam08020	DUF1706	Protein of unknown function (DUF1706). This family contains many hypothetical proteins from bacteria and yeast.	161
311811	pfam08021	FAD_binding_9	Siderophore-interacting FAD-binding domain. 	118
285293	pfam08022	FAD_binding_8	FAD-binding domain. 	108
400398	pfam08023	Antimicrobial_2	Frog antimicrobial peptide. This family consists of the major classes of antimicrobial peptides secreted from the skin of frogs that protect the frogs against invading microbes. They are typically between 10-50 amino acids long and are derived from proteolytic cleavage of larger precursors. Major classes of peptides such as esculentin, gaegurin, brevinin, rugosin and ranatuerin are included in this family.	31
116634	pfam08024	Antimicrobial_4	Ant antimicrobial peptide. This family consists of the ponericin family of antimicrobial peptides isolated from predatory ant Pachycondyla goeldii. The ponericin peptides may adopt amphipathic alpha-helical structure in polar environments. In the ant colony, these peptides exhibit a defensive role against microbial pathogens arising from prey introduction and/or ingestion.	24
116635	pfam08025	Antimicrobial_3	Spider antimicrobial peptide. This family includes antimicrobial peptides isolated from the crude venom of the wolf spider Oxyopes kitabensis. These peptides, known as oxyopinins, are the largest linear cationic amphipathic peptides chemically characterized and exhibit disrupting activities towards biological membranes.	37
369654	pfam08026	Antimicrobial_5	Bee antimicrobial peptide. This family consists of antimicrobial peptides produced by bees. These peptides have strong antimicrobial and some anti-fungal activity and have homology to abaecin, which is the largest proline-rich antimicrobial peptide isolated from the European bumblebee Bombus pascuorum.	31
369655	pfam08027	Albumin_I	Albumin I chain b. The albumin I protein, a hormone-like peptide, stimulates kinase activity upon binding a membrane-bound 43 kDa receptor. The structure of this domain (chain b) reveals a knottin-like fold, comprising three beta strands.	35
400399	pfam08028	Acyl-CoA_dh_2	Acyl-CoA dehydrogenase, C-terminal domain. 	133
400400	pfam08029	HisG_C	HisG, C-terminal domain. 	73
369657	pfam08030	NAD_binding_6	Ferric reductase NAD binding domain. 	151
369658	pfam08031	BBE	Berberine and berberine like. This domain is found in the berberine bridge and berberine bridge-like enzymes which are involved in the biosynthesis of numerous isoquinoline alkaloids. They catalyze the transformation of the N-methyl group of (S)-reticuline into the C-8 berberine bridge carbon of (S)-scoulerine.	45
400401	pfam08032	SpoU_sub_bind	RNA 2'-O ribose methyltransferase substrate binding. This domain is a RNA 2'-O ribose methyltransferase substrate binding domain.	74
400402	pfam08033	Sec23_BS	Sec23/Sec24 beta-sandwich domain. 	86
400403	pfam08034	TES	Trematode eggshell synthesis protein. This domain has been identified in a number of distantly related species of trematodes. This protein domain is crucial for eggshell synthesis in trematodes (Ebersberger I).	66
400404	pfam08035	Op_neuropeptide	Opioids neuropeptide. This family corresponds to the conserved YGG motif that is found in a wide variety of opioid neuropeptides such as enkephalin.	30
285304	pfam08036	Antimicrobial_6	Diapausin family of antimicrobial peptide. This family consists of diapausin-related antimicrobial peptides. Diapause during periods of environmental adversity is an essential part of the life cycle of many organisms with the molecular basis being different among animals. Diapause-specific peptides provide anti-fungal activity and act as N-type voltage-gated calcium channel blockers.	39
311820	pfam08037	Attractin	Attractin family. This family consists of the attractin family of water-borne pheromones. Mate attraction in Aplysia involves a long-distance water-borne signal in the form of the attractin peptide, which is released during egg laying. These peptides contain 6 conserved cysteines and are folded into 2 antiparallel helices. The second helix contains the IEECKTS sequence conserved in Aplysia attractins.	55
400405	pfam08038	Tom7	TOM7 family. This family consists of TOM7 family of mitochondrial import receptors. TOM7 forms part of the translocase of the outer mitochondrial membrane (TOM) complex and it appears to function as a modulator of the dynamics of the mitochondrial protein transport machinery by promoting the dissociation of subunits of the outer membrane translocase.	41
400406	pfam08039	Mit_proteolip	Mitochondrial proteolipid. This family consists of proteins with similarity to the mitochondrial proteolipids. Mitochondrial proteolipid consists of about 60 amino acid residues and is about 6.8 kDa in size.	60
400407	pfam08040	NADH_oxidored	MNLL subunit. This family consists of the MNLL subunits of NADH-ubiquinone oxidoreductase complex. NADH-ubiquinone oxidoreductase is involved in the transfer of electrons from NADH to the electron transport chain. This oxidation of NADH is coupled to proton transfer across the membrane, generating a proton motive force that is utilized for the synthesis of ATP. MNLL subunit is one of the many subunits found in the complex and it contains a mitochondrial import sequence. However, the role of MNLL subunit is unclear.	58
400408	pfam08041	PetM	PetM family of cytochrome b6f complex subunit 7. This family consists of the PetM family of cytochrome b6f complex subunit IV. The cytochrome b6f complex consists of 7 subunits and contains 2 beta hemes and 1 chlorophyll alpha per cytochrome f. It is highly active in transferring electrons from decylplastoquinol to oxidized plastocyanin.	29
369666	pfam08042	PqqA	PqqA family. This family consists of proteins belonging to the coenzyme Pyrroloquinoline quinone A (pqqA) family. PQQ is the non-covalently bound prosthetic group of many quinoproteins catalyzing reactions in the periplasm of Gram-negative bacteria. PQQ is formed by the fusion of glutamate and tyrosine, and synthesis of PQQ requires the proteins encoded by the pqqABCDEF operon, but details of the biosynthetic pathway are unclear.	19
400409	pfam08043	Xin	Xin repeat. The repeat has the consensus sequence GDV(K/Q/R)(T/S/G)X(R/K/T) WLFETXPLD. This repeat motif is typically found in the N-terminus of the proteins, with a copy number between 2 and 28 repeats. Direct evidence for binding to and stabilizing F-actin has been found in the human protein XIRP1. The homologs in mouse and chicken localize in the adherens junction complex of the intercalated disc in cardiac muscle and in the myotendon junction of skeletal muscle. mXin may co-localize with Vinculin, which is known to attach actin to the cytoplasmic membrane. It has been shown that the amino-terminus of human xin (CMYA1) binds the EVH1 domain of Mena/VASP/EVL, and the carboxy-terminus binds domain 20 of filamin C, a domain unique within the filamin family. This confirms the proposed role of xin repeat containing proteins as F-actin-binding adapter proteins.	16
400410	pfam08044	DUF1707	Domain of unknown function (DUF1707). This domain is found in a variety of Actinomycetales proteins. All of the proteins containing this domain are hypothetical and probably membrane bound or associated. Currently, the function of this domain is unclear.	52
400411	pfam08045	CDC14	Cell division control protein 14, SIN component. Cdc14 is a component of the septation initiation network (SIN) and is required for the localization and activity of Sid1. Sid1 is a protein kinase that localizes asymmetrically to one spindle pole body (SPB) in anaphase and disappears prior to cell separation.	283
285313	pfam08046	IlvGEDA_leader	IlvGEDA operon leader peptide. This family consists of the leader peptides of the ilvGEDA operon. The expression of the ilvGEDA operon of E. coli K-12 is multivalently controlled by the three branched-chain amino acids. Regulation is thought to occur by attenuation of transcription in response to the changing levels of the cognate tRNAs. Transcription of this operon is usually terminated at the end of the leader (regulatory) region.	32
285314	pfam08047	His_leader	Histidine operon leader peptide. This family consists of the leader peptide of the histidine (his) operon. The his operon contains all the genes necessary for histidine biosynthesis. The region corresponding to the untranslated 5' end of the transcript, named the his leader region, displays the typical features of the T box transcriptional attenuation mechanism which is involved in the regulation of many amino acid biosynthetic operons.	16
116658	pfam08048	RepA1_leader	Tap RepA1 leader peptide. This family consists of the RepA1 leader peptides. The frequency of replication of IncFII plasmid NR1 during the cell division cycle is regulated by the control of the synthesis of the plasmid-specific replication initiation protein (RepA1). When RepA1 is synthesized, it binds to the plasmid replication origin (ori) and effects the assembly of a replication complex composed of host proteins that mediate the replication of the plasmid. The tap gene encodes a 24-amino-acid protein. The translation of tap is required for translation of repA.	25
369669	pfam08049	IlvB_leader	IlvB leader peptide. This family consists of the leader peptides of the ilvB operon. This region encodes a potential leader polypeptide containing 32 amino acids, 12 of which are the regulatory amino acids valine and leucine. A model for the multivalent regulation of this operon by valyl- and leucyl-tRNA is proposed on the basis of the mutually exclusive formation of five strong stem-and-loop structures in the leader mRNA.	32
254601	pfam08050	Tet_res_leader	Tetracycline resistance leader peptide. This family consists of the tetracycline resistance leader peptide. The presence of 3 inverted repeats which can form 2 different conformations of mRNA suggests that the tetracycline resistance (TcR) region is regulated by a translational attenuation mechanism. A Rho-independent transcriptional terminator structure is present immediately after the translational stop codon of the TET protein.	20
369670	pfam08051	Ery_res_leader1	Erythromycin resistance leader peptide. This family consists of erythromycin resistance gene leader peptides. These leader peptides are involved in the translational attenuation of erythromycin resistance genes. Interestingly, the consensus sequence of peptides conferring erythromycin resistance is similar to that of the leader peptides, thus indicating that a similar type of interaction between the nascent peptide and antibiotics can occur in both cases. This family also includes a small number of regions from within larger proteins from actinomycetes.	15
285317	pfam08052	PyrBI_leader	PyrBI operon leader peptide. This family consists of the pyrBI operon leader peptides. The expression of the pyrBI operon, which encodes the subunits of the pyrimidine biosynthetic enzyme aspartate transcarbamylase, is regulated primarily through a UTP-sensitive transcriptional attenuation control mechanism. In this mechanism, the concentration of UTP determines the extent of coupling between transcription and translation within the pyrBI leader region, hence determining the level of rho-independent transcriptional termination at an attenuator preceding the pyrB gene.	44
369671	pfam08053	Tna_leader	Tryptophanase operon leader peptide. This family consists of the tryptophanase (tna) operon leader peptide. Tna catalyzes the degradation of L-tryptophan to indole, pyruvate and ammonia, enabling the bacteria to utilize tryptophan as a source of carbon, nitrogen and energy. The tna operon of E. coli contains two major structural genes, tnaA and tnaB. Preceding tnaA in the tna operon is a 319-nucleotide transcribed regulatory region that contains the coding region for a 24-residue leader peptide, TnaC. The RNA sequence in the vicinity of the tnaC stop codon is rich in cytidylate residues, which are required for efficient Rho-dependent termination in the leader region of the tna operon.	23
285319	pfam08054	Leu_leader	Leucine operon leader peptide. This family consists of the leucine operon leader peptide. The leucine operon is involved in the control of the biosynthesis of leucine. Four adjacent leucine codons within the leucine leader RNA are critically important in transcription attenuation-mediated control of leucine operon expression in bacteria. The leader RNA contains translational start and stop signals, a cluster of four leucine codons and overlapping regions of dyad symmetry that are capable of forming stem-and-loop structures.	28
116663	pfam08055	Trp_leader1	Tryptophan leader peptide. This family consists of the tryptophan (trp) leader peptides. Tryptophan accumulation is the principal event resulting in downregulation of transcription of the structural genes of the trp operon. The leader peptide of the trp operon forms mutually exclusive secondary structures that would either result in the termination of transcription of the trp operon when tryptophan is in plentiful supply or vice versa.	18
285320	pfam08056	Trp_leader2	Tryptophan operon leader peptide. This family consists of the tryptophan operon leader peptides. The tryptophan operon is regulated by transcription attenuation in response to changes in the level of tryptophan. The transcript of the leader peptide can adopt alternative mutually-exclusive secondary structures that would either result in termination of transcription of the tryptophan structural genes or in transcription of the entire operon.	41
71493	pfam08057	Ery_res_leader2	Erythromycin resistance leader peptide. This family consists of erythromycin resistance gene leader peptides. These leader peptides are involved in the transcriptional attenuation control of the synthesis of the macrolide-lincosamide-streptogramin B resistance protein. It acts as a transcriptional attenuator, in contrast to other inducible erm genes. The mRNA leader sequence can fold in either of two mutually exclusive conformations, one of which is postulated to form in the absence of induction, and to contain two rho factor-independent terminators.	14
400412	pfam08058	NPCC	Nuclear pore complex component. Proteins containing this domain are components of the nuclear pore complex. One member of this family is Nucleoporin POM34, which is thought to have a role in anchoring peripheral Nups into the pore and mediating pore formation.	134
400413	pfam08059	SEP	SEP domain. The SEP domain is named after Saccharomyces cerevisiae Shp1, Drosophila melanogaster eyes closed gene (eyc), and vertebrate p47. In p47, the SEP domain has been shown to bind to and inhibit the cysteine protease cathepsin L. Most SEP domains are succeeded closely by a UBX domain.	75
400414	pfam08061	P68HR	P68HR (NUC004) repeat. This short region is found in two copies in p68-like RNA helicases.	32
285324	pfam08062	P120R	P120R (NUC006) repeat. This characteristic repeat of proliferating cell nuclear antigen P120 is found in three copies.	22
400415	pfam08063	PADR1	PADR1 (NUC008) domain. This domain is found in poly(ADP-ribose)-synthetases. The function of this domain is unknown.	53
400416	pfam08064	UME	UME (NUC010) domain. This domain is characteristic of UVSB PI-3 kinase, MEI-41 and ESR1.	102
400417	pfam08065	K167R	K167R (NUC007) repeat. This family represents the K167/Chmadrin repeat. The function of this repeat is unknown.	112
400418	pfam08066	PMC2NT	PMC2NT (NUC016) domain. This domain is found at the N-terminus of 3'-5' exonucleases with HRDC domains, and also in putative exosome components.	87
400419	pfam08067	ROKNT	ROKNT (NUC014) domain. This presumed domain is found at the N-terminus of RNP K-like proteins that also contain KH domains pfam00013.	42
400420	pfam08068	DKCLD	DKCLD (NUC011) domain. This is a TruB_N/PUA domain associated N-terminal domain of Dyskerin-like proteins.	58
400421	pfam08069	Ribosomal_S13_N	Ribosomal S13/S15 N-terminal domain. This domain is found at the N-terminus of ribosomal S13 and S15 proteins. This domain is also identified as NUC021.	57
400422	pfam08070	DTHCT	DTHCT (NUC029) region. The DTHCT region is the C-terminal part of DNA gyrases B / topoisomerase IV / HATPase proteins. This region is composed of quite low complexity sequence.	96
400423	pfam08071	RS4NT	RS4NT (NUC023) domain. This is the N-terminal domain of Ribosomal S4 / S4e proteins. This domain is associated with S4 and KOW domains.	37
400424	pfam08072	BDHCT	BDHCT (NUC031) domain. This is a C-terminal domain in Bloom's syndrome DEAD helicase subfamily.	40
400425	pfam08073	CHDNT	CHDNT (NUC034) domain. The CHDNT domain is found in PHD/RING finger and chromo domain-associated helicases.	54
400426	pfam08074	CHDCT2	CHDCT2 (NUC038) domain. The CHDCT2 C-terminal domain is found in PHD/RING finger and chromo domain-associated CHD-like helicases.	126
400427	pfam08075	NOPS	NOPS (NUC059) domain. This domain is found at the C-terminus of NONA and PSP1 proteins adjacent to 1 or 2 pfam00076 domains.	52
285338	pfam08076	TetM_leader	Tetracycline resistance determinant leader peptide. This family consists of the tetracycline resistance determinant tet(M) leader peptides. A short open reading frame corresponding to a 28-amino-acid peptide, which contains a number of inverted repeat sequences, was found immediately upstream of tet(M). Transcriptional analyses have found that expression of tet(M) resulted from an extension of a small transcript representing the upstream leader region into the resistance determinant. Thus this leader sequence is responsible for transcriptional attenuation and thus regulation of the transcription of tet(M).	28
71513	pfam08077	Cm_res_leader	Chloramphenicol resistance gene leader peptide. This family consists of chloramphenicol (Cm) resistance gene leader peptides. Inducible resistance to Cm in both Gram positive and Gram negative bacteria is controlled by translation attenuation. In translation attenuation, the ribosome-binding-site (RBS) for the resistance determinant is sequestered in a secondary structure domain within the mRNA. Preceding the secondary structure is a short, translated ORF termed the leader. Ribosome stalling in the leader causes the destabilization of the downstream secondary structure, allowing initiation of translation of the Cm resistance gene.	17
369686	pfam08078	PsaX	PsaX family. This family consists of the PsaX family of photosystem I (PSI) protein subunits. PSI is a large multi-subunit pigment protein complex embedded in the thylakoid membranes of green plants and cyanobacteria. PsaX is one of the 12 protein subunits found in PSI and these subunits are arranged as monomers or trimers within the membrane as shown by the structure of the trimeric complex from Synechococcus elongatus.	35
400428	pfam08079	Ribosomal_L30_N	Ribosomal L30 N-terminal domain. This presumed domain is found at the N-terminus of Ribosomal L30 proteins and has been termed RL30NT or NUC018.	66
369688	pfam08080	zf-RNPHF	RNPHF zinc finger. This domain is a putative zinc-binding domain (CHHC motif) in RNP H and F. The domain is often associated with pfam00076.	36
400429	pfam08081	RBM1CTR	RBM1CTR (NUC064) family. This C-terminal region is found in RBM1-like RNA binding hnRNPs.	45
400430	pfam08082	PRO8NT	PRO8NT (NUC069), PrP8 N-terminal domain. The PRO8NT domain is found at the N-terminus of pre-mRNA splicing factors of PRO8 family. The NLS or nuclear localization signal for these spliceosome proteins begins at the start and runs for 60 residues. N-terminal to this domain is a highly variable proline-rich region.	152
400431	pfam08083	PROCN	PROCN (NUC071) domain. The PROCN domain is the central domain in pre-mRNA splicing factors of PRO8 family.	402
400432	pfam08084	PROCT	PROCT (NUC072) domain. The PROCT domain is the C-terminal domain in pre-mRNA splicing factors of PRO8 family.	111
400433	pfam08085	Entericidin	Entericidin EcnA/B family. This family consists of the entericidin antidote/toxin peptides. The entericidin locus is activated in stationary phase under high osmolarity conditions by rho-S and simultaneously repressed by the osmoregulatory EnvZ/OmpR signal transduction pathway. The entericidin locus encodes tandem paralogous genes (ecnAB) and directs the synthesis of two small cell-envelope lipoproteins which can maintain plasmids in a bacterial population by means of post-segregational killing.	20
116691	pfam08086	Toxin_17	Ergtoxin family. This family consists of ergtoxin peptides which are toxins secreted by the scorpions. The ergtoxins are capable of blocking the function of K+ channels. More than 100 ergtoxins have been found from scorpion venoms and they have been classified into three subfamilies according to their primary structures.	41
191941	pfam08087	Toxin_18	Conotoxin O-superfamily. This family consists of members of the conotoxin O-superfamily. The O-superfamily of conotoxins consists of 3 groups of Conus peptides that belong to the same structural group. These 3 groups differ in their pharmacological properties: the omega-conotoxins, which inhibit calcium channels; the delta-conotoxins, which slow down the inactivation rate of voltage-sensitive sodium channels; and the muO-conotoxins, which block the voltage-sensitive sodium currents.	31
400434	pfam08088	Toxin_19	Conotoxin I-superfamily. This family consists of the I-superfamily of conotoxins. This is a new class of peptides in the venom of some Conus species. These toxins are characterized by four disulfide bridges and inhibit or modify ion channels of nerve cells. The I-superfamily conotoxins are found in five or six major clades of cone snails and could possibly be found in many more species.	40
400435	pfam08089	Toxin_20	Huwentoxin-II family. This family consists of the huwentoxin-II (HWTX-II) family of toxins secreted by spiders. These toxins are found in venom secreted by the bird spider Selenocosmia huwena Wang. The HWTX-II adopts a novel scaffold different from the ICK motif that is found in other huwentoxins. HWTX-II consists of 37 amino acid residues including six cysteines involved in three disulfide bridges.	39
116695	pfam08090	Enterotoxin_HS1	Heat stable E. coli enterotoxin 1. Heat-stable toxin 1 of entero-aggregative E. coli (EAST1) is a small toxin. It is not, however, solely associated with entero-aggregative E. coli but also with many other diarrhoeic E. coli families. Some studies have established the role of EAST1 in some human outbreaks of diarrhoea. Isolates from farm animals have been shown to carry the astA gene coding for EAST1. However, the relation between the presence of EAST1 and disease is not conclusive.	36
400436	pfam08091	Toxin_21	Spider insecticidal peptide. This family consists of insecticidal peptides isolated from venom of spiders of Aptostichus schlingeri and Calisoga sp. Nine insecticidal peptides were isolated from the venom of the Aptostichus schlingeri spider and seven of these toxins cause flaccid paralysis to insect larvae within 10 min of injection. However, all nine peptides were lethal within 24 hours. The structure of Aps III was solved and shown to be an atypical knottin peptide with four disulphide bridges.	40
149265	pfam08092	Toxin_22	Magi peptide toxin family. This family consists of Magi peptide toxins (Magi 1, 2 and 5) isolated from the venom of Hexathelidae spider. These insecticidal peptide toxins bind to sodium channels and induce flaccid paralysis when injected into lepidopteran larvae. However, these peptides are not toxic to mice when injected intracranially at 20 pmol/g.	38
116698	pfam08093	Toxin_23	Magi 5 toxic peptide family. This family consists of toxic peptides (Magi 5) found in the venom of the Hexathelidae spider. Magi 5 is the first spider toxin with binding affinity to site 4 of a mammalian sodium channel and the toxin has an insecticidal effect on larvae, causing paralysis when injected into the larvae.	30
311850	pfam08094	Toxin_24	Conotoxin TVIIA/GS family. This family consists of conotoxins isolated from the venom of cone snail Conus tulipa and Conus geographus. Conotoxin TVIIA, isolated from Conus tulipa displays little sequence homology with other well-characterized pharmacological classes of peptides, but displays similarity with conotoxin GS, a peptide from Conus geographus. Both these peptides block skeletal muscle sodium channels and also share several biochemical features and represent a distinct subgroup of the four-loop conotoxins.	33
71530	pfam08095	Toxin_25	Hefutoxin family. This family consists of the hefutoxins that are found in the venom of the scorpion Heterometrus fulvipes. These toxins, kappa-hefutoxin1 and kappa-hefutoxin2, exhibit no homology to any known toxins. The hefutoxins are potassium channel toxins.	22
71531	pfam08096	Bombolitin	Bombolitin family. This family consists of the bombolitin peptides that are found in the venom of the bumblebee Megabombus pennsylvanicus. Bombolitins are structurally and functionally very similar. They lyse erythrocytes and liposomes, release histamine from rat peritoneal mast cells, and stimulate phospholipase A2 from different sources.	17
71532	pfam08097	Toxin_26	Conotoxin T-superfamily. This family consists of the T-superfamily of conotoxins. Eight different T-superfamily peptides from five Conus species were identified. These peptides share a consensus signal sequence, and a conserved arrangement of cysteine residues. T-superfamily peptides were found expressed in venom ducts of all major feeding types of Conus, suggesting that the T-superfamily is a large and diverse group of peptides, widely distributed in the 500 different Conus species.	11
400437	pfam08098	ATX_III	Anemonia sulcata toxin III family. This family consists of the Anemonia sulcata toxin III (ATX III) neurotoxin family. ATX III is a neurotoxin that is produced by sea anemone; it adopts a compact structure containing four reverse turns and two other chain reversals, but no regular alpha-helix or beta-sheet. A hydrophobic patch found on the surface of the peptide may constitute part of the sodium channel binding surface.	23
400438	pfam08099	Toxin_27	Scorpion calcine family. This family consists of the calcine family of scorpion toxins. The calcine family consists of Maurocalcine and Imperatoxin. These toxins have been shown to be potent effectors of ryanodine-sensitive calcium channels from skeletal muscles. These toxins are thus useful for dihydropyridine receptor/ryanodine receptor interaction studies.	33
400439	pfam08100	dimerization	dimerization domain. This domain is found at the N-terminus of a variety of plant O-methyltransferases. It has been shown to mediate dimerization of these proteins.	50
400440	pfam08101	DUF1708	Domain of unknown function (DUF1708). This is a yeast domain of unknown function.	423
116704	pfam08102	Antimicrobial_7	Scorpion antimicrobial peptide. This family consists of antimicrobial peptides secreted by scorpions. Novel antimicrobial peptides have been isolated from scorpions, namely the opistoporin and the pandinin. These peptides form essentially helical structures and demonstrate high antimicrobial activity against Gram-negative and Gram-positive bacteria respectively.	43
116705	pfam08103	Antimicrobial_8	Uperin family. This family consists of the uperin family of antimicrobial peptides. Uperin is a wide-spectrum antibiotic peptide isolated from the Australian toadlet, Uperoleia mjobergii. Being only 17 amino acid residues long, it is smaller than most other wide-spectrum antibiotic peptides isolated from amphibians. Uperin adopts a well-defined amphipathic alpha-helix with distinct hydrophilic and hydrophobic faces.	17
71539	pfam08104	Antimicrobial_9	Ponericin L family. This family consists of the ponericin L family of antimicrobial peptides that are isolated from the venom of the predatory ant Pachycondyla goeldii. Ponericin L family shares similarities with dermaseptins. Ponericin L may adopt an amphipathic alpha-helical structure in polar environments and these peptides exhibit a defensive role against microbial pathogens arising from prey introduction and/or ingestion.	24
311854	pfam08105	Antimicrobial10	Metchnikowin family. This family consists of the metchnikowin family of antimicrobial peptides from Drosophila. metchnikowin is a proline-rich peptide whose expression is immune-inducible. Induction of the metchnikowin gene expression can be mediated either by the TOLL pathway or by the imd gene product. The metchnikowin peptide is unique among the Drosophila antimicrobial peptides in that it is active against both bacteria and fungi.	50
71541	pfam08106	Antimicrobial11	Formaecin family. This family consists of the formaecin family of antimicrobial peptides isolated from the bulldog ant Myrmecia gulosa in response to bacterial infection. Formaecins are inducible peptide antibiotics and are active against growing Escherichia coli but were inactive against other Gram-negative and Gram-positive bacteria. Formaecin peptides are 16 amino acids long, are rich in proline and have N-acetylgalactosamine O-linked to a conserved threonine.	16
116706	pfam08107	Antimicrobial12	Pleurocidin family. This family consists of the pleurocidin family of antimicrobial peptides. Pleurocidins are found in the skin mucous secretions of the winter flounder (Pleuronectes americanus) and these peptides exhibit antimicrobial activity against Escherichia coli. Pleurocidin is predicted to assume an amphipathic alpha-helical conformation similar to other linear antimicrobial peptides and may play a role in innate host defense.	42
71543	pfam08108	Antimicrobial13	Halocidin family. This family consists of the halocidin family of antimicrobial peptides. Halocidins are isolated from the haemocytes of the tunicate, Halocynthia aurantium. They are dimeric structures formed via a disulfide linkage between cysteines of two different-sized monomers. Halocidins have been shown to have strong antimicrobial activities against a wide variety of pathogenic bacteria and could be ideal candidates as peptide antibiotics against multidrug-resistant bacteria.	15
71544	pfam08109	Antimicrobial14	Lactocin 705 family. This family consists of lactocin 705 which is a bacteriocin produced by Lactobacillus casei CRL 705. Lactocin 705 is a class IIb bacteriocin, whose activity depends upon the complementation of two peptides (705-alpha and 705-beta) of 33 amino acid residues each. Lactocin 705 is active against several Gram-positive bacteria, including food-borne pathogens and is a good candidate to be used for biopreservation of fermented meats.	31
149268	pfam08110	Antimicrobial15	Ocellatin family. This family consists of the ocellatin family of antimicrobial peptides. Ocellatins are obtained from the electrically stimulated skin secretions of the South American frog, Leptodactylus ocellatus. The family consists of three structurally related peptides, ocellatin 1, ocellatin 2 and ocellatin 3. These peptides present hemolytic activity against human erythrocytes and are also active against Escherichia coli.	19
71546	pfam08111	Pea-VEAacid	Pea-VEAacid family. This family consists of the PEA-VEAacid neuropeptides family. These neuropeptides are isolated from the abdominal perisympathetic organs of the American cockroach. These peptides are found together with Pea-YLS-amide and Pea-SKNacid, giving a unique neuropeptide pattern in abdominal perisympathetic organs. The functions of these neuropeptides are unknown.	15
116708	pfam08112	ATP-synt_E_2	ATP synthase epsilon subunit. This family consists of epsilon subunits of the ATP synthase. The ATP synthase complex is composed of an oligomeric transmembrane sector (CF0), and a catalytic core (CF1). CF1 is composed of 5 subunits, of which the epsilon subunit functions as a potent inhibitor of ATPase activity in both soluble and bound CF1. Only when the epsilon inhibition is disabled is high ATPase activity detected in ATPase	56
285350	pfam08113	CoxIIa	Cytochrome c oxidase subunit IIa family. This family consists of the cytochrome c oxidase subunit IIa family. The ba3-type cytochrome c oxidase from Thermus thermophilus is known as a two subunit enzyme. From its crystal structure, it was discovered that an additional transmembrane helix 'subunit IIa' spans the membrane. This subunit consists of 34 residues forming one helix across the membrane. The presence of this subunit seems to be important for the function of cytochrome c oxidases.	33
285351	pfam08114	PMP1_2	ATPase proteolipid family. This family consists of small proteolipids associated with the plasma membrane H+ ATPase. Two proteolipids (PMP1 and PMP2) are associated with the ATPase and both genes are similarly expressed in the wild-type strain of yeast; no modification of the level of transcription of one PMP gene is detected in a strain deleted for the other. Though both proteolipids show similarity with other small proteolipids associated with other cation-transporting ATPases, their functions remain unclear.	43
285352	pfam08115	Toxin_28	SFI toxin family. This family consists of the SFI family of spider toxins. This family of toxins might share structural, evolutionary and functional relationships with other small, highly structurally constrained spider neurotoxins. These toxins are highly selective agonists/antagonists of different voltage-dependent calcium channels and are extremely valuable reagents in the analysis of neuromuscular function.	35
149271	pfam08116	Toxin_29	PhTx neurotoxin family. This family consists of PhTx insecticidal neurotoxins that are found in the venom of the Brazilian spider Phoneutria nigriventer. The venom of Phoneutria nigriventer contains numerous neurotoxic polypeptides of 30-140 amino acids which exert a range of biological effects. While some of these neurotoxins are lethal to mice after intracerebroventricular injection, others are extremely toxic to insects of the orders Diptera and Dictyoptera but have much weaker toxic effects on mice.	31
400441	pfam08117	Toxin_30	Ptu family. This family consists of toxic peptides that are isolated from the saliva of assassin bugs. The saliva contains a complex mixture of proteins that are used by the bug either to immobilise the prey or to digest it. One of the proteins (Ptu1) has been purified and shown to block reversibly the N-type calcium channels and to be less specific for the L- and P/Q- type calcium channels expressed in BHK cells.	36
400442	pfam08118	MDM31_MDM32	Yeast mitochondrial distribution and morphology (MDM) proteins. Proteins in this family are yeast mitochondrial inner membrane proteins MDM31 and MDM32. These proteins are required for the maintenance of mitochondrial morphology, and the stability of mitochondrial DNA.	519
71554	pfam08119	Toxin_31	Scorpion acidic alpha-KTx toxin family. This family consists of acidic alpha-KTx short chain scorpion toxins. These toxins named parabutoxins, block voltage-gated K channels and have extremely low pI values. Furthermore, they lack the crucial pore-plugging lysine. In addition, the second important residue of the dyad, the hydrophobic residue (Phe or Tyr) is also missing.	37
71555	pfam08120	Toxin_32	Tamulustoxin family. This family consists of the tamulustoxins which are found in the venom of the Indian red scorpion (Mesobuthus tamulus). Tamulustoxin shares no similarity with other scorpion venom toxins, although the positions of its six cysteine residues suggest that it shares the same structural scaffold. Tamulustoxin acts as a potassium channel blocker.	35
71556	pfam08121	Toxin_33	Waglerin family. This family consists of the lethal peptides (waglerins) that are found in the venom of Trimeresurus wagleri. Waglerins are 22-24 residue lethal peptides and are competitive antagonists of the muscle nicotinic receptor (nAChR). Waglerin-1 possesses a distinctive selectivity for the alpha-epsilon interface binding site of the mouse nAChR.	22
400443	pfam08122	NDUF_B12	NADH-ubiquinone oxidoreductase B12 subunit family. This family consists of the NADH-ubiquinone oxidoreductase B12 subunit proteins. NADH is the central source of electrons in the mitochondrial and bacterial respiration. NADH-ubiquinone oxidoreductase is involved in the transfer of electrons from NADH to the electron transport chain. This oxidation of NADH is coupled to proton transfer across the membrane, generating a proton motive force that is utilized for the synthesis of ATP. The function of this subunit is unclear.	55
149273	pfam08123	DOT1	Histone methylation protein DOT1. The DOT1 domain regulates gene expression by methylating histone H3. H3 methylation by DOT1 has been shown to be required for the DNA damage checkpoint in yeast.	205
400444	pfam08124	Lyase_8_N	Polysaccharide lyase family 8, N terminal alpha-helical domain. This family consists of a group of secreted bacterial lyase enzymes EC:4.2.2.1 capable of acting on hyaluronan and chondroitin in the extracellular matrix of host tissues, contributing to the invasive capacity of the pathogen.	323
369700	pfam08125	Mannitol_dh_C	Mannitol dehydrogenase C-terminal domain. 	246
285357	pfam08126	Propeptide_C25	Propeptide_C25. This is found at the N terminal end of some of the members of the C25 peptidase family (PF01364). Little is known about the function of this motif.	205
400445	pfam08127	Propeptide_C1	Peptidase family C1 propeptide. This motif is found at the N terminal of some members of the Peptidase_C1 family (pfam00112) and is involved in activation of this peptidase.	40
71564	pfam08129	Antimicrobial17	Alpha/beta enterocin family. This family consists of the alpha and beta enterocins and lactococcin G peptides. These peptides have some antimicrobial properties; they inhibit the growth of Enterococcus spp. and a few other gram-positive bacteria. These peptides act as pore-forming toxins that create cell membrane channels through a barrel-stave mechanism and thus produce an ionic imbalance in the cell. This family of antimicrobial peptides belongs to the class II group of bacteriocins.	57
116721	pfam08130	Antimicrobial18	Type A lantibiotic family. This family consists of the type A lantibiotic peptides. Both Pep5 and epicidin-280 are ribosomally-synthesized antimicrobial peptides produced by Gram-positive bacteria that are characterized by the presence of lanthionine and/or methyllanthionine residues. The lantibiotics family has a highly specific activity against multidrug-resistant bacteria and has potential to be utilized in a wide range of medical applications.	60
400446	pfam08131	Defensin_3	Defensin-like peptide family. This family consists of the defensin-like peptides (DLPs) isolated from platypus venom. These DLPs show similar three-dimensional fold to that of beta-defensin-12 and sodium-channel neurotoxin Shl. However the side chains known to be functionally important to beta-defensin-12 and Shl are not conserved in DLPs. This suggests a different biological function. Consistent with this contention, DLPs have been shown to possess no anti-microbial properties and have no observable activity on rat dorsal-root-ganglion sodium-channel currents.	39
369703	pfam08132	AdoMetDC_leader	S-adenosyl-l-methionine decarboxylase leader peptide. This family consists of the S-adenosyl-l-methionine decarboxylase (AdoMetDC) leader peptides. AdoMetDC is a key regulatory enzyme in the biosynthesis of polyamines. All expressed plant AdoMetDC mRNA 5' leader sequences contain a highly conserved pair of upstream ORFs (uORFs) that overlap by one base. Sequences of the small uORFs are highly conserved between monocot, dicot and gymnosperm AdoMetDC mRNA species, suggesting a translational regulatory mechanism.	51
285360	pfam08133	Nuclease_act	Anticodon nuclease activator family. This family consists of the anticodon nuclease activator proteins. Pre-existing host tRNAs are reprocessed during bacteriophage T4 infection of certain Escherichia coli strains. In this pathway, tRNA(Lys) is cleaved by the anticodon nuclease 5' to the wobble base and is later restored in polynucleotide kinase and RNA ligase reactions.	26
400447	pfam08134	cIII	cIII protein family. This family consists of the cIII family of regulatory proteins. The lambda CIII protein has 54 amino acids and it forms an amphipathic helix within its amino acid sequence. Lambda cIII stabilizes the lambda cII protein and the host sigma factor 32, responsible for transcribing genes of the heat shock regulon.	37
285362	pfam08135	EPV_E5	Major transforming protein E5 family. This family consists of the major transforming proteins (E5) of the bovine papilloma virus (BPV). The equine sarcoid is one of the most common dermatological lesions in equids. It is a benign, locally invasive dermal fibroblastic lesion and studies have shown an association of the lesions with BPV. E5 is a short hydrophobic membrane protein localising to the Golgi apparatus and other intracellular membranes. It binds to and constitutively activates the platelet-derived growth factor-beta receptor in transformed cells. This stimulation activates a receptor signaling cascade which results in an intracellular growth stimulatory signal.	43
311863	pfam08136	Ribosomal_S22	30S ribosomal protein subunit S22 family. This family consists of the 30S ribosomal protein subunit S22 polypeptides. This polypeptide is 47 amino acids in length and has a molecular weight of about 5 kDa. The S22 subunit is a stationary-phase-specific ribosomal protein and is assembled in the ribosomal particles in the stationary phase. This subunit, along with other stationary-phase-specific ribosomal proteins, results in compositional changes of ribosomes during the stationary phase. The significance of this change is not clear as yet.	44
400448	pfam08137	DVL	DVL family. This family consists of the DVL family of proteins. In a gain-of-function genetic screen for genes that influence fruit development in Arabidopsis, the DEVIL (DVL) gene was identified. DVL is a small protein and overexpression of the protein results in pleiotropic phenotypes characterized by shortened stature, rounder rosette leaves, clustered inflorescences, shortened pedicels, and siliques with pronged tips. The DVL family is a novel class of small polypeptides and the overexpression phenotypes suggest that these polypeptides may have a role in plant development.	19
285365	pfam08138	Sex_peptide	Sex peptide (SP) family. This family consists of Sex Peptides (SP) that are found in Drosophila. On mating, the Drosophila female decreases her remating rate and increases her egg-laying rate due, in part, to the transfer of SP from the male to the female. SP are found in seminal fluids transferred from the male to the female during mating. The male seminal fluid proteins are referred to as accessory gland proteins (Acps). The SP is one of the most interesting Acps and plays an important role in reproduction.	55
400449	pfam08139	LPAM_1	Prokaryotic membrane lipoprotein lipid attachment site. In prokaryotes, membrane lipoproteins are synthesized with a precursor signal peptide, which is cleaved by a specific lipoprotein signal peptidase (signal peptidase II). The peptidase recognizes a conserved sequence and cuts upstream of a cysteine residue to which a glyceride-fatty acid lipid is attached.	18
369707	pfam08140	Cuticle_1	Crustacean cuticle protein repeat. This family consists of the cuticle proteins from the Cancer pagurus and the Homarus americanus. These proteins are isolated from the calcified regions of the crustacean and they contain two copies of an 18 residue sequence motif, which thus far has been found only in crustacean calcified exoskeletons.	40
400450	pfam08141	SspH	Small acid-soluble spore protein H family. This family consists of the small acid-soluble spore proteins (SASP) of the H type (sspH). SspH are unique to spores of Bacillus subtilis and are expressed only in the forespore compartment during sporulation of this organism. The sspH genes are monocistronic and are recognized by the forespore-specific sigma factor for RNA polymerase - sigma-G. The specific role of this protein is unclear but is thought to play a role in sporulation under conditions different from that of the common laboratory tests of spore properties.	58
400451	pfam08142	AARP2CN	AARP2CN (NUC121) domain. This domain is the central domain of AARP2. It is weakly similar to the GTP-binding domain of elongation factor TU.	85
311868	pfam08143	CBFNT	CBFNT (NUC161) domain. This N terminal domain is found in CARG-binding factor A-like proteins.	60
400452	pfam08144	CPL	CPL (NUC119) domain. This C terminal domain is found in Penguin-like proteins associated with Pumilio-like repeats.	140
400453	pfam08145	BOP1NT	BOP1NT (NUC169) domain. This N terminal domain is found in BOP1-like WD40 proteins.	259
400454	pfam08146	BP28CT	BP28CT (NUC211) domain. This C terminal domain is found in BAP28-like nucleolar proteins.	146
400455	pfam08147	DBP10CT	DBP10CT (NUC160) domain. This C terminal domain is found in the Dbp10p subfamily of hypothetical RNA helicases.	63
400456	pfam08148	DSHCT	DSHCT (NUC185) domain. This C terminal domain is found in DOB1/SK12/helY-like DEAD box helicases.	158
400457	pfam08149	BING4CT	BING4CT (NUC141) domain. This C terminal domain is found in the BING4 family of nucleolar WD40 repeat proteins.	79
400458	pfam08150	FerB	FerB (NUC096) domain. This is central domain B in proteins of the Ferlin family.	76
400459	pfam08151	FerI	FerI (NUC094) domain. This domain is present in proteins of the Ferlin family. It is often located between two C2 domains.	52
400460	pfam08152	GUCT	GUCT (NUC152) domain. This is the C terminal domain found in the RNA helicase II / Gu protein family.	91
400461	pfam08153	NGP1NT	NGP1NT (NUC091) domain. This N terminal domain is found in a subfamily of hypothetical nucleolar GTP-binding proteins similar to human NGP1.	130
400462	pfam08154	NLE	NLE (NUC135) domain. This domain is located N terminal to WD40 repeats. It is found in the microtubule-associated yeast ribosome biogenesis protein YTM1.	65
400463	pfam08155	NOGCT	NOGCT (NUC087) domain. This C terminal domain is found in the NOG subfamily of nucleolar GTP-binding proteins.	51
400464	pfam08156	NOP5NT	NOP5NT (NUC127) domain. This N terminal domain is found in RNA-binding proteins of the NOP5 family.	66
400465	pfam08157	NUC129	NUC129 domain. This C terminal domain is found in a novel family of hypothetical nucleolar proteins.	63
400466	pfam08158	NUC130_3NT	NUC130/3NT domain. This N terminal domain is found in a novel nucleolar protein family.	50
400467	pfam08159	NUC153	NUC153 domain. This small domain is found in a novel nucleolar family.	29
400468	pfam08161	NUC173	NUC173 domain. This is the central domain of a novel family of hypothetical nucleolar proteins.	202
400469	pfam08163	NUC194	NUC194 domain. This is domain B in the catalytic subunit of DNA-dependent protein kinases.	387
400470	pfam08164	TRAUB	Apoptosis-antagonizing transcription factor, C-terminal. This C terminal domain is found in traube proteins. This is the domain of the AATF proteins that interacts with BLOS2 or Ceap, that functions as an adaptor in processes such as protein and vesicle processing and transport, and perhaps transcription.	81
400471	pfam08165	FerA	FerA (NUC095) domain. This is central domain A in proteins of the Ferlin family.	58
149302	pfam08166	NUC202	NUC202 domain. This domain is found in a novel family of nucleolar proteins.	61
400472	pfam08167	RIX1	rRNA processing/ribosome biogenesis. Rix1 is a nucleoplasmic particle involved in rRNA processing/ribosome assembly. It associates with two other proteins, Ipi1 and Ipi3, to form the RIX1 complex that allows Rea1 - the AAA ATPase - to associate with the 60S ribosomal subunit. More than 170 assembly factors are involved in the construction and maturation of yeast ribosomes, and after these factors have completed their function they need to be released from the pre-ribosomes. Rea1 induces the release of the assembly protein complex in a mechanical fashion. This family is usually associated with NUC202, pfam08166.	187
400473	pfam08168	NUC205	NUC205 domain. This domain is found in a novel family of nucleolar proteins.	44
400474	pfam08169	RBB1NT	RBB1NT (NUC162) domain. This domain is found N terminal to the ARID/BRIGHT domain in DNA-binding proteins of the Retinoblastoma-binding protein 1 family.	94
400475	pfam08170	POPLD	POPLD (NUC188) domain. This domain is found in POP1-like nucleolar proteins.	92
400476	pfam08171	Mad3_BUB1_II	Mad3/BUB1 homology region 2. This domain is found in checkpoint proteins which are involved in cell division. This region has been shown to be necessary and sufficient for the binding of MAD3 to BUB3 in Saccharomyces cerevisiae. This domain is present in BUB1 which also binds BUB3.	65
400477	pfam08172	CASP_C	CASP C terminal. This domain is the C-terminal region of the CASP family of proteins. It is a Golgi membrane protein which is thought to have a role in vesicle transport.	247
400478	pfam08173	YbgT_YccB	Membrane bound YbgT-like protein. This family contains a set of membrane proteins, typically 33 amino acids long. The family has no known function, but the protein is found in the operon CydAB in E. coli. Members have a consensus motif (MWYFXW) which is rich in aromatic residues. The protein forms a single membrane-spanning helix. This family seems to be restricted to Proteobacteria.	26
400479	pfam08174	Anillin	Cell division protein anillin. Anillin is a protein involved in septin organisation during cell division. It is an actin binding protein that is localized to the cleavage furrow, and it maintains the localization of active myosin, which ensures the spatial control of concerted contraction during cytokinesis.	140
369735	pfam08175	SspO	Small acid-soluble spore protein O family. This family consists of the small acid-soluble spore proteins (SASP) O type (sspO). SspO (originally cotK) are unique to the spores of Bacillus subtilis and are expressed only in the forespore compartment of sporulating cells of this organism. The sspO is the first gene in a likely operon with sspP and transcription of this gene is primarily by RNA polymerase with the forespore-specific sigma factor, sigma-G. Mutation deleting sspO causes the loss of the SspO from the forespore but had no discernible effect on sporulation, spore properties or spore germination.	50
285399	pfam08176	SspK	Small acid-soluble spore protein K family. This family consists of the small acid-soluble spore proteins (SASP) belonging to the K type (sspK). The sspK are unique to the spores of Bacillus subtilis and are expressed only in the forespore compartment of sporulating cells of this organism. The sspK gene is monocistronic and transcription is primarily by the RNA polymerase with the forespore-specific sigma factor, sigma-G. Mutation deleting sspK results in loss of SspK from the spore but had no discernible effect on sporulation, spore properties or spore germination.	47
285400	pfam08177	SspN	Small acid-soluble spore protein N family. This family consists of the small acid-soluble spore protein (SASP) N type (sspN). SspN is a 48-residue protein that is expressed only in the forespore compartment of sporulating Bacillus subtilis. The sspN gene is recognized equally by both sigma-G and sigma-F. The role of SspN is still not well-defined.	46
285401	pfam08178	GnsAB_toxin	GnsA/GnsB toxin of bacterial toxin-antitoxin system. This family consists of the GnsA/GnsB family. GnsA and GnsB are multicopy suppressors of the secG null mutation. These proteins participate in the synthesis of phospholipids, suggesting a functional relationship between SecG and membrane phospholipids. Over-expression of gnsA and gnsB causes a remarkable increase in the unsaturated fatty acid content. However, the gnsA-gnsB double null mutant exhibits no effect. Both proteins are predicted to possess a helix-turn-helix structure. GnsAB is a family of putative bacterial toxins (both GnsA and GnsB) that are neutralized by the antitoxin YmcE, pfam15939.	54
400480	pfam08179	SspP	Small acid-soluble spore protein P family. This family consists of the small acid-soluble spore proteins (SASP) P type (sspP). sspP is expressed only in the forespore compartment of the sporulating cell. sspP is also expressed under sigma-G control from the same promoter as sspO. Mutations deleting sspP cause no discernible effect on sporulation, spore properties or spore germination.	44
400481	pfam08180	BAGE	B melanoma antigen family. This family consists of the B melanoma antigen (BAGE) peptides. The BAGE gene encodes a human tumor antigen that is recognized by a cytolytic T lymphocyte. BAGE genes are expressed in melanomas, bladder and lung carcinomas and in a few tumors of other histological types.	28
285404	pfam08181	DegQ	DegQ (SacQ) family. This family consists of the DegQ (formerly sacQ) regulatory peptides. The DegQ family of peptides controls the rates of synthesis of a class of both secreted and intracellular degradative enzymes in Bacillus subtilis. DegQ is 46 amino acids long and activates the synthesis of degradative enzymes. The expression of this peptide was shown to be subject both to catabolite repression and to DegS-DegU-mediated control, thus allowing an increase in the rate of synthesis of degQ under conditions of nitrogen starvation.	46
400482	pfam08182	Pedibin	Pedibin/Hym-346 family. This family consists of the pedibin and Hym-346 signalling peptides. These two peptides have been isolated from Hydra vulgaris and Hydra magnipapillata. Experiments have indicated that both cause a reduction in the positional value gradient, the principal patterning process governing the maintenance of form in the adult hydra. The peptides cause an increase in the rate of foot regeneration following bisection of the body column. Thus both play important signalling roles in patterning processes in cnidaria and maybe in more complex metazoans.	35
369736	pfam08183	SpoV	Stage V sporulation protein family. This family consists of the stage V sporulation (SpoV) proteins of Bacillus subtilis, which include SpoVM. SpoVM is a small, 26-residue protein that is produced in the mother cell chamber of the sporangium during the process of sporulation in B. subtilis. SpoVM forms an amphipathic alpha-helix and is recruited to the polar septum shortly after the sporangium undergoes asymmetric division. The function of SpoVM depends on proper subcellular localization.	25
116772	pfam08184	Cuticle_2	Cuticle protein 7 isoform family. This family consists of cuticle protein 7 isoforms that are isolated from the carapace cuticle of a juvenile horseshoe crab, Limulus polyphemus. There are 3 isoforms of cuticle protein 7. The 3 isoforms are N-terminally blocked but could be deblocked by treatment with pyroglutaminase, showing that the N-terminal residue is a pyroglutamine residue.	59
369737	pfam08186	Wound_ind	Wound-inducible basic protein family. This family consists of the wound-inducible basic proteins from plants. The metabolic activities of plants are dramatically altered upon mechanical injury or pathogen attack. A large number of proteins accumulate at wound or infection sites, such as the wound-inducible basic proteins. These proteins are small, 47 amino acids in length, have no signal peptides and are hydrophilic and basic.	44
71621	pfam08187	Tetradecapep	Myoactive tetradecapeptides family. This family consists of myoactive tetradecapeptides that are isolated from the gut of earthworms, Eisenia foetida and Pheretima vitata. These peptides were termed ETP and PTP respectively. Both peptides showed a potent excitatory action on spontaneous contractions of the anterior gut. These peptides show similarity to Molluscan tetradecapeptides and arthropodan tridecapeptides.	14
71622	pfam08188	Protamine_3	Spermatozal protamine family. This family consists of the spermatozal protamines. Spermatozal protamines play an important role in remodelling of the sperm chromatin during mammalian spermiogenesis. Nuclear elongation and chromatin condensation are concomitant with modifications in the basic protein complement associated with DNA. Somatic histones are initially replaced by testis-specific histone variants, then by transitional proteins, and ultimately by protamines.	48
400483	pfam08189	Meleagrin	Meleagrin/Cygnin family. This family consists of meleagrin and cygnin basic peptides that are isolated from turkey and black swan respectively. Both peptides are low in molecular weight and contain three disulphide bonds with high concentrations of aromatic residues. These peptides show similarity to transferrins and probably play some vital role in avian eggs but the exact function is still unknown.	38
369738	pfam08190	PIH1	pre-RNA processing PIH1/Nop17. This domain is involved in pre-rRNA processing. It has been shown to be required either for nucleolar retention or correct assembly of the box C/D snoRNP in Saccharomyces cerevisiae. The C-terminal region of this family has similarity to the CS domain pfam04969.	166
369739	pfam08191	LRR_adjacent	LRR adjacent. These are small, all beta strand domains, structurally described for the protein Internalin (InlA) and related proteins InlB, InlE, InlH from the pathogenic bacterium Listeria monocytogenes. Their function appears to be mainly structural: They are fused to the C-terminal end of leucine-rich repeats (LRR), significantly stabilizing the LRR, and forming a common rigid entity with the LRR. They are themselves not involved in protein-protein-interactions but help to present the adjacent LRR-domain for this purpose. These domains belong to the family of Ig-like domains in that they consist of two sandwiched beta sheets that follow the classical connectivity of Ig-domains. The beta strands in one of the sheets are, however, much smaller than in most standard Ig-like domains, making it somewhat of an outlier.	57
369740	pfam08192	Peptidase_S64	Peptidase family S64. This family of fungal proteins is involved in the processing of membrane bound transcription factor Stp1. The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS:S1) and to have a typical trypsin-like catalytic triad.	684
400484	pfam08193	INO80_Ies4	INO80 complex subunit Ies4. The INO80 ATPase is a member of the SNF2 family of ATPases and functions as an integral component of a multisubunit ATP-dependent chromatin remodelling complex. This family of proteins corresponds to the fungal Ies4 subunit of INO80.	233
400485	pfam08194	DIM	DIM protein. Drosophila immune-induced molecules (DIMs) are short proteins induced during the immune response of Drosophila. This family includes DIMs 1 to 4 that have masses below 5 kDa.	36
369743	pfam08195	TRI9	TRI9 protein. Putative gene of 129 bp in the Trichothecene gene cluster of Fusarium sporotrichioides and F. graminearum, encoding a predicted protein of 43 amino acids whose function is unknown.	43
285414	pfam08196	UL2	UL2 protein. Orf UL2 of Human cytomegalovirus (HCMV) which is a short protein of unknown function.	59
285415	pfam08197	TT_ORF2a	pORF2a truncated protein. Most isolated ORF2 of TT virus (TTV) encode a 49 amino acids protein (pORF2a) because of an in-frame stop codon. ORF2s isolated from G1 TTV encode 202 amino acids protein (pORF2ab).	49
400486	pfam08198	Thymopoietin	Thymopoietin protein. Short protein of 49 amino acids isolated from bovine spleen cells. Thymopoietins (TMPOs) are a group of ubiquitously expressed nuclear proteins. They are suggested to play an important role in nuclear envelope organisation and cell cycle control.	48
285417	pfam08199	E2	Bacteriophage E2-like protein. Short conserved protein described in Lactococcus Bacteriophage c2 of 37 amino acids.	37
369745	pfam08200	Phage_1_1	Bacteriophage 1.1 Protein. Gene 1.1 in Bacteriophage T7 encodes a 42 amino acid protein, rich in basic amino acids suggesting its interaction with nucleic acids. Many homologs are present in different T7 and T3-like bacteriophage.	45
369746	pfam08201	BssC_TutF	BssC/TutF protein. BssC short protein (57 amino acids) has been described as the gamma-subunit of benzylsuccinate synthase from Thauera aromatica strain K172. TutF has been identified and described as highly similar to BssC in T.aromatica strain T1.	56
369747	pfam08202	MIS13	Mis12-Mtw1 protein family. Mis12-Mtw1 is a eukaryotic conserved kinetochore protein that is involved in chromosome segregation.	288
400487	pfam08203	RNA_polI_A14	Yeast RNA polymerase I subunit RPA14. This is a family of yeast proteins. A14 is one of the final two subunits of Saccharomyces cerevisiae RNA polymerase I and is proposed to play a role in the recruitment of pol I to the promoter.	76
400488	pfam08204	V-set_CD47	CD47 immunoglobulin-like domain. This family represents the CD47 leukocyte antigen V-set like Ig domain.	93
400489	pfam08205	C2-set_2	CD80-like C2-set immunoglobulin domain. These domains belong to the immunoglobulin superfamily.	89
400490	pfam08206	OB_RNB	Ribonuclease B OB domain. This family includes the N-terminal OB domain found in ribonuclease B proteins in one or two copies.	58
400491	pfam08207	EFP_N	Elongation factor P (EF-P) KOW-like domain. 	58
400492	pfam08208	RNA_polI_A34	DNA-directed RNA polymerase I subunit RPA34.5. This is a family of proteins conserved from yeasts to human. Subunit A34.5 of RNA polymerase I is a non-essential subunit which is thought to help Pol I overcome topological constraints imposed on ribosomal DNA during the process of transcription.	203
369754	pfam08209	Sgf11	Sgf11 (transcriptional regulation protein). The Sgf11 family is a SAGA complex subunit in Saccharomyces cerevisiae. The SAGA complex is a multisubunit protein complex involved in transcriptional regulation. SAGA combines proteins involved in interactions with DNA-bound activators and TATA-binding protein (TBP), as well as enzymes for histone acetylation and deubiquitylation.	33
400493	pfam08210	APOBEC_N	APOBEC-like N-terminal domain. A mechanism of generating protein diversity is mRNA editing. Members of this family are C-to-U editing enzymes. The N-terminal domain of APOBEC-1 like proteins is the catalytic domain, while the C-terminal domain is a pseudocatalytic domain. More specifically, the catalytic domain is a zinc dependent deaminase domain and is essential for cytidine deamination. APOBEC-3 like members contain two copies of this domain. RNA editing by APOBEC-1 requires homodimerization and this complex interacts with RNA binding proteins to form the editosome (and references therein). This family also includes the functionally homologous activation induced deaminase (AID), which is essential for the development of antibody diversity in B lymphocytes, and the sea lamprey PmCDA1 and PmCDA2, which are predicted to play an AID-like role in the adaptive immune response of jawless vertebrates. Divergent members of this family are present in various eukaryotes such as Nematostella, C. elegans, Micromonas and Emiliania, and prokaryotes such as Wolbachia and Pseudomonas brassicacearum.	170
400494	pfam08211	dCMP_cyt_deam_2	Cytidine and deoxycytidylate deaminase zinc-binding region. 	122
400495	pfam08212	Lipocalin_2	Lipocalin-like domain. Lipocalins are transporters for small hydrophobic molecules, such as lipids, steroid hormones, bilins, and retinoids. The structure is an eight-stranded beta barrel.	143
400496	pfam08213	DUF1713	Mitochondrial domain of unknown function (DUF1713). This domain is found at the C terminal end of mitochondrial proteins of unknown function.	33
400497	pfam08214	HAT_KAT11	Histone acetylation protein. Histone acetylation is required in many cellular processes including transcription, DNA repair, and chromatin assembly. This family contains the fungal KAT11 protein (previously known as RTT109) which is required for H3K56 acetylation. Loss of KAT11 results in the loss of H3K56 acetylation, both on bulk histone and on chromatin. KAT11 and H3K56 acetylation appear to correlate with actively transcribed genes and associate with the elongating form of Pol II in yeast. This family also incorporates the p300/CBP histone acetyltransferase domain which has different catalytic properties and cofactor regulation to KAT11.	348
400498	pfam08216	CTNNBL	Catenin-beta-like, Arm-motif containing nuclear. CTNNBL is a family of eukaryotic nuclear proteins of the catenin-beta-like 1 type that contain an armadillo motif. A human nuclear protein with this domain is thought to have a role in apoptosis. The interaction of CTNNBL1 with its known partners (the Prp19-CDC5L complex and AID) is mediated by recognition of NLS (nuclear localization signal) motifs. The RNA-splicing factor Prp31 is also an interactor, with recognition also occurring through the NLS. CTNNBL1 uses its central armadillo (ARM) domain to bind NLS-containing partners.	104
369760	pfam08217	DUF1712	Fungal domain of unknown function (DUF1712). The function of this family of proteins is unknown.	468
285434	pfam08218	Citrate_ly_lig	Citrate lyase ligase C-terminal domain. This family is composed of the C-terminal domain of citrate lyase ligase EC:6.2.1.22.	182
400499	pfam08219	TOM13	Outer membrane protein TOM13. The TOM13 family of proteins are mitochondrial outer membrane proteins that mediate the assembly of beta-barrel proteins.	82
285436	pfam08220	HTH_DeoR	DeoR-like helix-turn-helix domain. 	57
400500	pfam08221	HTH_9	RNA polymerase III subunit RPC82 helix-turn-helix domain. This family consists of several DNA-directed RNA polymerase III polypeptides which are related to the Saccharomyces cerevisiae RPC82 protein. RNA polymerase C (III) promotes the transcription of tRNA and 5S RNA genes. In Saccharomyces cerevisiae, the enzyme is composed of 15 subunits, ranging from 160 to about 10 kDa. This region is probably a DNA-binding helix-turn-helix.	62
369763	pfam08222	HTH_CodY	CodY helix-turn-helix domain. This family consists of the C-terminal helix-turn-helix domain found in several bacterial GTP-sensing transcriptional pleiotropic repressor CodY proteins. CodY has been found to repress the dipeptide transport operon (dpp) of Bacillus subtilis in nutrient-rich conditions. The CodY protein also has a repressor effect on many genes in Lactococcus lactis during growth in milk.	61
369764	pfam08223	PaaX_C	PaaX-like protein C-terminal domain. This family contains proteins that are similar to the product of the paaX gene of Escherichia coli. This protein is involved in the regulation of expression of a group of proteins known to participate in the metabolism of phenylacetic acid.	99
400501	pfam08224	DUF1719	Domain of unknown function (DUF1719). This is a domain of unknown function. It may have a role in ATPase activation.	231
400502	pfam08225	Antimicrobial19	Pseudin antimicrobial peptide. Pseudins are a subfamily of the FSAP family (Frog Secreted Active Peptides) extracted from the skin of the paradoxical frog Pseudis paradoxa (Pseudidae). The pseudins belong to the class of cationic, amphipathic-helical antimicrobial peptides.	23
369766	pfam08226	DUF1720	Domain of unknown function (DUF1720). This domain is found in different combinations with cortical patch components EF hand, SH3 and ENTH and is therefore likely to be involved in cytoskeletal processes. This family contains many hypothetical proteins.	75
400503	pfam08227	DASH_Hsk3	DASH complex subunit Hsk3 like. The DASH complex is a ~10 subunit microtubule-binding complex that is transferred to the kinetochore prior to mitosis. In Saccharomyces cerevisiae DASH forms both rings and spiral structures on microtubules in vitro. This family also includes several higher eukaryotic proteins. However, other DASH subunits do not appear to be conserved in higher eukaryotes.	45
400504	pfam08228	RNase_P_pop3	RNase P subunit Pop3. This family of fungal proteins form a subunit of RNase P, the ribonucleoprotein enzyme that cleaves the leader sequence of precursor tRNAs to generate mature tRNAs. The structure of Pop3 has been assigned the L7Ae/L30e fold. This RNA-binding fold is also present in human RNase P subunit Rpp38, raising the possibility that Pop3p and Rpp38 are functional homologs.	158
369769	pfam08229	SHR3_chaperone	ER membrane protein SH3. This family of proteins are membrane localized chaperones that are required for correct plasma membrane localization of amino acid permeases (AAPs). SH3 prevents AAPs proteins from aggregating and assists in their correct folding. In the absence of SH3, AAPs are retained in the ER.	185
400505	pfam08230	CW_7	CW_7 repeat. This domain was originally found in the C-terminal moiety of the Cpl-7 lysozyme encoded by the Streptococcus pneumoniae bacteriophage Cp-7. It is also found in the cell wall hydrolases of human and livestock pathogens. CW_7 repeats make up a cell wall binding motif.	40
400506	pfam08231	SYF2	SYF2 splicing factor. Proteins in this family are involved in cell cycle progression and pre-mRNA splicing.	150
400507	pfam08232	Striatin	Striatin family. Striatin is an intracellular protein which has a caveolin-binding motif, a coiled-coil structure, a calmodulin-binding site, and a WD (pfam00400) repeat domain. It acts as a scaffold protein and is involved in signalling pathways.	142
400508	pfam08234	Spindle_Spc25	Chromosome segregation protein Spc25. This is a family of chromosome segregation proteins. It contains Spc25, which is a conserved eukaryotic kinetochore protein involved in cell division. In fungi the Spc25 protein is a subunit of the Nuf2-Ndc80 complex, and in vertebrates it forms part of the Ndc80 complex.	71
400509	pfam08235	LNS2	LNS2 (Lipin/Ned1/Smp2). This domain is found in Saccharomyces cerevisiae protein SMP2, proteins with an N-terminal lipin domain (pfam04571). SMP2 (also known as PAH1) is involved in plasmid maintenance and respiration, and has been identified as a Mg2+-dependent phosphatidate phosphatase (EC:3.1.3.4) that contains a haloacid dehalogenase (HAD)-like domain. Lipin proteins are involved in adipose tissue development and insulin resistance.	226
400510	pfam08236	SRI	SRI (Set2 Rpb1 interacting) domain. The SRI (Set2 Rpb1 interacting) domain mediates RNA polymerase II interaction and couples histone H3 K36 methylation with transcript elongation. This domain is conserved from yeast to humans. Members of this family form a compact, closed three-helix bundle, with an up-down-up topology. The first and second helices are antiparallel to each other and are of similar length; the third helix, which is packed across helices alpha1 and alpha2 is slightly shorter, consisting of only 15 amino acids. Most conserved hydrophobic residues are largely buried in the interior of the structure and form an extensive and contiguous hydrophobic core that stabilizes the packing of the three-helix bundle. This domain mediates RNA polymerase II interaction and couples histone H3 K36 methylation with transcript elongation.	83
369775	pfam08237	PE-PPE	PE-PPE domain. This domain is found C terminal to the PE (pfam00934) and PPE (pfam00823) domains. The secondary structure of this domain is predicted to be a mixture of alpha helices and beta strands.	227
400511	pfam08238	Sel1	Sel1 repeat. This short repeat is found in the Sel1 protein. It is related to TPR repeats.	35
400512	pfam08239	SH3_3	Bacterial SH3 domain. 	54
400513	pfam08240	ADH_N	Alcohol dehydrogenase GroES-like domain. This is the catalytic domain of alcohol dehydrogenases. Many of them contain an inserted zinc binding domain. This domain has a GroES-like structure.	106
400514	pfam08241	Methyltransf_11	Methyltransferase domain. Members of this family are SAM dependent methyltransferases.	94
400515	pfam08242	Methyltransf_12	Methyltransferase domain. Members of this family are SAM dependent methyltransferases.	98
400516	pfam08243	SPT2	SPT2 chromatin protein. This family includes the Saccharomyces cerevisiae protein SPT2 which is a chromatin protein involved in transcriptional regulation.	105
400517	pfam08244	Glyco_hydro_32C	Glycosyl hydrolases family 32 C terminal. This domain corresponds to the C terminal domain of glycosyl hydrolase family 32. It forms a beta sandwich module.	162
400518	pfam08245	Mur_ligase_M	Mur ligase middle domain. 	200
400519	pfam08246	Inhibitor_I29	Cathepsin propeptide inhibitor domain (I29). This domain is found at the N-terminus of some C1 peptidases such as Cathepsin L where it acts as a propeptide. There are also a number of proteins that are composed solely of multiple copies of this domain such as the peptidase inhibitor salarin. This family is classified as I29 by MEROPS.	58
116832	pfam08247	ENOD40	ENOD40 protein. Rohrig et al. reported the in vitro translation of two peptides of 12 and 24 amino acids from the short, overlapping ORFs of soybean ENOD40 mRNA. The putative role of the enod40 genes has been in favour of organogenesis, such as induction of the cortical cell divisions that lead to initiation of nodule primordia, in developing lateral roots and embryonic tissues. This supports the hypothesis for a role of enod40 in lateral organ development.	12
116833	pfam08248	Tryp_FSAP	Tryptophyllin-3 skin active peptide. PdT-3 or Tryptophyllin-3 peptide is a subfamily of the family Tryptophyllin and of the superfamily FSAP (Frog Skin Active Peptide). Originally identified in skin extracts of Neotropical leaf frogs, Phyllomedusa sp. This subfamily has an average length of 13 amino acids. The pharmacological activity of the tryptophyllins remains to be established but it seems that these peptides possess an action on liver protein synthesis and body weight.	12
116834	pfam08249	Mastoparan	Mastoparan protein. Mastoparans are a family of tetradecapeptides from wasp venom that have been shown to directly activate GTP-binding regulatory proteins. These peptides show selectivity among G proteins: they strongly activate Go and Gi but not Gs or Gt. The peptides of this family are composed of 14 amino acids but they can assume different structures.	14
116835	pfam08250	Sperm_act_pep	Sperm-activating peptides. The sperm-activating peptides (SAPs) are isolated in egg-conditioned media (egg jelly) of sea urchins. SAPs have several effects on sea urchin spermatozoa: they stimulate sperm respiration and motility through intracellular alkalinization, and transient elevation of cAMP, cGMP and Ca++ levels in sperm cells.	10
116836	pfam08251	Mastoparan_2	Mastoparan peptide. Mastoparan (MP) peptides I, II and III are extracted from the venom gland of the Neotropical social wasp Protopolybia exigua (Saussure). They are tetradecapeptides presenting from seven to ten hydrophobic amino acid residues and from two to four lysine residues in their primary sequences. These peptides cause the degranulation of mast cells. Protopolybia-MP-I also causes hemolysis in erythrocytes.	14
285459	pfam08252	Leader_CPA1	arg-2/CPA1 leader peptide. This family contains leader peptides involved in the regulation of the glutaminase subunit (small subunit) of arginine-specific carbamoyl phosphate synthetase. In Neurospora crassa it is a small upstream ORF of 24 codons upstream of the arg-2 locus. In yeast it is the leader peptide of the CPA1 gene. The 5' region of CPA1 mRNA contains a 25-codon upstream open reading frame. The leader peptide, the product of the upstream open reading frame, plays an essential, negative role in the specific repression of CPA1 by arginine.	23
285460	pfam08253	Leader_Erm	Erm Leader peptide. These short proteins are Leader peptides (15-19 amino acids) of erm genes that code for resistance determinants in Staphylococcus aureus.	19
285461	pfam08254	Leader_Thr	Threonine leader peptide. Threonine leader peptide of the Threonine operon thrA1A2BC. It has been sequenced in different bacteria: E. coli, Serratia marcescens, Salmonella typhi.	22
285462	pfam08255	Leader_Trp	Trp-operon Leader Peptide. The tryptophan operon regulatory region of C. freundii's (leader transcript) encodes a 14-residue peptide containing characteristic tandem tryptophan residues. It is about 10 nucleotides shorter than those of E. coli and S. typhimurium.	14
116841	pfam08256	Antimicrobial20	Aurein-like antibiotic peptide. This family of antibacterial peptides are secreted from the granular dorsal glands of the Green and Golden Bell Frog Litoria aurea, Southern Bell Frog L. raniformis, Blue Mountains tree-frog Litoria citropa (genus Litoria) and frogs from genus Uperoleia. They are a part of the FSAP peptide family. Amongst the more active of these are aurein 1.2, aurein 2.2 and aurein 3.1; caerin 1.1, maculatin 1.1, uperin 3.6; citropin 1.1, citropin 1.2, citropin 1.3 and a minor peptide are wide-spectrum antibacterial peptides.	13
400520	pfam08257	Sulfakinin	Sulfakinin family. The sulfakinin (SK) family of neuropeptides has only been identified in crustaceans and insects. For most species there is the potential for producing two sulfakinin peptides, one having a short sulfakinin sequence. The function of the sulfakinins is difficult to assess. For the American cockroach, various forms of the endogenous sulfakinins have been shown to be active on the hindgut, and also on the heart. In C. vomitoria the peptides act as neurotransmitters or neuromodulators, linking the brain with all thoracic and abdominal ganglia. In adults of P. monodon they appear to be restricted to a few neurones in the brain with a neural pathway extending along to the ventral thoracic and abdominal ganglia.	9
116843	pfam08258	WWamide	WWamide peptide. This family contains neuropeptides isolated from ganglia of the African giant snail, Achatina fulica. Each peptide has a Trp residue at both the N- and C-termini. Purified WWamide-1, -2 and -3 showed an inhibitory effect on the phasic contractions of the anterior byssus retractor muscle (ABRM).	7
400521	pfam08259	Periviscerokin	Periviscerokinin family. Abdominal perisympathetic organs of insects contain periviscerokinin neuropeptides of about 11 amino acids.	11
116845	pfam08260	Kinin	Insect kinin peptide. These neuropeptides are the first members of the insect kinin-family isolated from the American cockroach. Their occurrence in the retrocerebral complex suggests a physiological role as a neurohormone. The C-terminal sequence Phe-X-Ser-Trp-Gly-NH2 characterized the peptides as members of the insect kinin family. Data suggest a possible involvement of insect kinins in water-balance by regulating the osmoregulation. These peptides have length from 6 to 14 amino acids.	8
87473	pfam08261	Carcinustatin	Carcinustatin peptide. A total of 20 peptides of the superfamily allostatin were isolated from the shore crab Carcinus maenas. They are named carcinustatin 1 to 20 and their length ranges from 5 to 27 amino acids. This family includes carcinustatin 8,9,15 and 16.	8
116846	pfam08262	Lem_TRP	Leucophaea maderae tachykinin-related peptide. These peptides are designated Leucophaea maderae tachykinin-related peptides (Lem TRPs). Some were isolated from the midgut of L. maderae, whereas others appear to be brain specific. The Lem TRPs of the brain are myotropic and induce increases in the amplitude and frequency of spontaneous contractions and tonus of hindgut muscle in L. maderae. They were also isolated from brain-corpora, cardiaca-corpora, allata-suboesophageal ganglion extracts of the Locusta migratoria. They stimulate visceral muscle contractions of the oviduct and the foregut of Locusta migratoria.	10
400522	pfam08263	LRRNT_2	Leucine rich repeat N-terminal domain. Leucine Rich Repeats pfam00560 are short sequence motifs present in a number of proteins with diverse functions and cellular locations. Leucine Rich Repeats are often flanked by cysteine rich domains. This domain is often found at the N-terminus of tandem leucine rich repeats.	41
400523	pfam08264	Anticodon_1	Anticodon-binding domain of tRNA. This domain is found mainly in hydrophobic tRNA synthetases. The domain binds to the anticodon of the tRNA.	141
400524	pfam08265	YL1_C	YL1 nuclear protein C-terminal domain. This domain is found in proteins of the YL1 family. These proteins have been shown to be DNA-binding and may be a transcription factor. This domain is found in proteins that are not YL1 proteins.	29
400525	pfam08266	Cadherin_2	Cadherin-like. This cadherin domain is usually the most N-terminal copy of the domain.	83
400526	pfam08267	Meth_synt_1	Cobalamin-independent synthase, N-terminal domain. The N-terminal domain and C-terminal domains of cobalamin-independent synthases together define a catalytic cleft in the enzyme. The N-terminal domain is thought to bind the substrate, in particular, the negatively charged polyglutamate chain. The N-terminal domain is also thought to stabilize a loop from the C-terminal domain.	310
285468	pfam08268	FBA_3	F-box associated domain. 	125
377967	pfam08269	dCache_2	Cache domain. Double Cache domain 2 (dCache_2) may be a result of single Cache domain 2 (sCache_2) duplication.	297
285470	pfam08270	PRD_Mga	M protein trans-acting positive regulator (MGA) PRD domain. Mga is a DNA-binding protein that activates the expression of several important virulence genes in group A streptococcus in response to changing environmental conditions. This corresponds to the PRD like region.	220
400527	pfam08271	TF_Zn_Ribbon	TFIIB zinc-binding. The transcription factor TFIIB contains a zinc-binding motif near the N-terminus. This domain is involved in the interaction with RNA pol II and TFIIF and plays a crucial role in selecting the transcription initiation site. The domain adopts a zinc ribbon like structure.	43
400528	pfam08272	Topo_Zn_Ribbon	Topoisomerase I zinc-ribbon-like. Some Proteobacteria topoisomerase I contain two zinc-ribbon-like domains at the C-terminus that are structurally homologous to pfam01396. However, this domain no longer binds zinc. Indeed, only one of the four cysteine residues remains.	39
400529	pfam08273	Prim_Zn_Ribbon	Zinc-binding domain of primase-helicase. 	37
311948	pfam08274	PhnA_Zn_Ribbon	PhnA Zinc-Ribbon. 	30
400530	pfam08275	Toprim_N	DNA primase catalytic core, N-terminal domain. 	128
400531	pfam08276	PAN_2	PAN-like domain. 	67
400532	pfam08277	PAN_3	PAN-like domain. 	71
400533	pfam08278	DnaG_DnaB_bind	DNA primase DnaG DnaB-binding. Eubacterial DnaG primases interact with several factors to form the replisome. One of these factors is DnaB, a helicase. This domain has been demonstrated to be responsible for the interaction between DnaG and DnaB.	122
400534	pfam08279	HTH_11	HTH domain. This family includes helix-turn-helix domains in a wide variety of proteins.	52
311953	pfam08280	HTH_Mga	M protein trans-acting positive regulator (MGA) HTH domain. Mga is a DNA-binding protein that activates the expression of several important virulence genes in group A streptococcus in response to changing environmental conditions.	59
400535	pfam08281	Sigma70_r4_2	Sigma-70, region 4. Region 4 of sigma-70 like sigma-factors are involved in binding to the -35 promoter element via a helix-turn-helix motif.	54
400536	pfam08282	Hydrolase_3	haloacid dehalogenase-like hydrolase. This family contains haloacid dehalogenase-like hydrolase enzymes.	254
285483	pfam08283	Gemini_AL1_M	Geminivirus rep protein central domain. This is the central domain of the geminivirus rep proteins.	107
400537	pfam08284	RVP_2	Retroviral aspartyl protease. Single domain aspartyl proteases from retroviruses, retrotransposons, and badnaviruses (plant dsDNA viruses). These proteases are generally part of a larger polyprotein; usually pol, more rarely gag. Retroviral proteases appear to be homologous to a single domain of the two-domain eukaryotic aspartyl proteases.	134
400538	pfam08285	DPM3	Dolichol-phosphate mannosyltransferase subunit 3 (DPM3). This family corresponds to subunit 3 of dolichol-phosphate mannosyltransferase, an enzyme which generates mannosyl donors for glycosylphosphatidylinositols, N-glycan and protein O- and C-mannosylation. DPM3 is an integral membrane protein and plays a role in stabilizing the dolichol-phosphate mannosyl transferase complex.	89
400539	pfam08286	Spc24	Spc24 subunit of Ndc80. Spc24 is a component of the evolutionarily conserved kinetochore-associated Ndc80 complex and is involved in chromosome segregation.	106
400540	pfam08287	DASH_Spc19	Spc19. Spc19 is a component of the DASH complex. The DASH complex associates with the spindle pole body and is important for spindle and kinetochore integrity during cell division.	148
400541	pfam08288	PIGA	PIGA (GPI anchor biosynthesis). This domain is found on phosphatidylinositol N-acetylglucosaminyltransferase proteins. These proteins are involved in GPI anchor biosynthesis and are associated with the disease paroxysmal nocturnal haemoglobinuria.	90
285488	pfam08289	Flu_M1_C	Influenza Matrix protein (M1) C-terminal domain. This region is thought to be a second domain of the M1 matrix protein.	97
285489	pfam08290	Hep_core_N	Hepatitis core protein, putative zinc finger. This short region is found at the N-terminus of some hepatitis core proteins. Its conservation of four cys and his suggests a zinc binding domain.	27
400542	pfam08291	Peptidase_M15_3	Peptidase M15. 	112
400543	pfam08292	RNA_pol_Rbc25	RNA polymerase III subunit Rpc25. Rpc25 is a strongly conserved subunit of RNA polymerase III and has homology to Rpa43 in RNA polymerase I, Rpb7 in RNA polymerase II and the archaeal RpoE subunit. Rpc25 is required for transcription initiation and is not essential for the elongating properties of RNA polymerase III.	121
400544	pfam08293	MRP-S33	Mitochondrial ribosomal subunit S27. This family of proteins corresponds to mitochondrial ribosomal subunit S27 in prokaryotes and to subunit S33 in humans. It is a small 106 residue protein. The evolutionary history of the mitoribosomal proteome that is encoded by a diverse subset of eukaryotic genomes, reveals an ancestral ribosome of alpha-proteobacterial descent that more than doubled its protein content in most eukaryotic lineages. Several new MRPs have originated via duplication of existing MRPs as well as by recruitment from outside of the mitoribosomal proteome.	87
311963	pfam08294	TIM21	TIM21. TIM21 interacts with the outer mitochondrial TOM complex and promotes the insertion of proteins into the inner mitochondrial membrane.	145
400545	pfam08295	Sin3_corepress	Sin3 family co-repressor. This domain is found on transcriptional regulators. It forms interactions with histone deacetylases.	97
400546	pfam08297	U3_snoRNA_assoc	U3 snoRNA associated. This family of proteins is associated with U3 snoRNA. U3 snoRNA is required for nucleolar processing of pre-18S ribosomal RNA.	88
116881	pfam08298	AAA_PrkA	PrkA AAA domain. This is a family of PrkA bacterial and archaeal serine kinases approximately 630 residues long. This is the N-terminal AAA domain.	358
400547	pfam08299	Bac_DnaA_C	Bacterial dnaA protein helix-turn-helix. 	69
369801	pfam08300	HCV_NS5a_1a	Hepatitis C virus non-structural 5a zinc finger domain. The molecular function of the non-structural 5a protein is uncertain. The NS5a protein is phosphorylated when expressed in mammalian cells. It is thought to interact with the ds RNA dependent (interferon inducible) kinase PKR. This domain corresponds to the N-terminal zinc binding domain.	62
149382	pfam08301	HCV_NS5a_1b	Hepatitis C virus non-structural 5a domain 1b. The molecular function of the non-structural 5a protein is uncertain. The NS5a protein is phosphorylated when expressed in mammalian cells. It is thought to interact with the ds RNA dependent (interferon inducible) kinase PKR. This region corresponds to the 1b domain.	102
400548	pfam08302	tRNA_lig_CPD	Fungal tRNA ligase phosphodiesterase domain. This domain is found in fungal tRNA ligases and has cyclic phosphodiesterase activity. tRNA ligases are enzymes required for the splicing of precursor tRNA molecules containing introns.	253
400549	pfam08303	tRNA_lig_kinase	tRNA ligase kinase domain. This domain is found in fungal tRNA ligases and has kinase activity. tRNA ligases are enzymes required for the splicing of precursor tRNA molecules containing introns. This family contains a P-loop motif.	168
400550	pfam08305	NPCBM	NPCBM/NEW2 domain. This novel putative carbohydrate binding module (NPCBM) domain is found at the N-terminus of glycosyl hydrolase family 98 proteins. This domain has also been called the NEW2 domain (Naumoff DG. Phylogenetic analysis of alpha-galactosidases of the GH27 family. Molecular Biology (Engl Transl). (2004)38:388-399.)	136
400551	pfam08306	Glyco_hydro_98M	Glycosyl hydrolase family 98. This domain is the putative catalytic domain of glycosyl hydrolase family 98 proteins.	328
285502	pfam08307	Glyco_hydro_98C	Glycosyl hydrolase family 98 C-terminal domain. This putative domain is found at the C-terminus of glycosyl hydrolase family 98 proteins. This domain is not expected to form part of the catalytic activity.	270
285503	pfam08308	PEGA	PEGA domain. This domain is found in both archaea and bacteria and has similarity to S-layer (surface layer) proteins. It is named after the characteristic PEGA sequence motif found in this domain. The secondary structure of this domain is predicted to be beta-strands [Adindla et al. Comparative and Functional Genomics 2004; 5:2-16].	70
285504	pfam08309	LVIVD	LVIVD repeat. This repeat is found in bacterial and archaeal cell surface proteins, many of which are hypothetical. The secondary structure corresponding to this repeat is predicted to comprise from 1-7 of 4-beta-strands which may associate to form a beta-propeller. The repeat copy number varies from 3-29. This repeat is sometimes found with the PKD domain pfam00801.	42
400552	pfam08310	LGFP	LGFP repeat. This 54 amino acid repeat is found in many hypothetical proteins. Several hypothetical proteins from C. glutamicum and C. efficiens along with PS1 protein contain this repeat region. The N-terminal region of PS1 contains an esterase domain which transfers corynomycolic acid. The C-terminal region consists of 4 tandem LGFP repeats. It is hypothesized that the PS1 proteins in Corynebacterium, when associated with the cell wall, may be anchored via the LGFP tandem repeats that may be important for maintaining cell wall integrity [Adindla et al. Comparative and Functional Genomics 2004; 5:2-16]. Deletion of Corynebacterium glutamicum csp1 protein results in a 10-fold increase in the cell volume of the organism and implies the corresponding protein's involvement in cell shape formation. The secondary structure of each repeat is predicted to comprise two beta-strands and one alpha-helix [Adindla et al. 2004].	52
400553	pfam08311	Mad3_BUB1_I	Mad3/BUB1 homology region 1. Proteins containing this domain are checkpoint proteins involved in cell division. This region has been shown to be essential for the binding of BUB1 and MAD3 to CDC20p.	123
400554	pfam08312	cwf21	cwf21 domain. The cwf21 family is involved in mRNA splicing. It has been isolated as a subcomplex of the spliceosome in Schizosaccharomyces pombe. The function of the cwf21 domain is to bind directly to the spliceosomal protein Prp8. Mutations in the cwf21 domain prevent Prp8 from binding. The structure of this domain has been solved and shows the domain to be composed of two alpha helices.	42
400555	pfam08313	SCA7	SCA7, zinc-binding domain. This domain is found in the protein Sgf73/Sca7 which is a component of the multihistone acetyltransferase complexes SAGA and SLIK. This domain is also found in Ataxin-7, a human protein which, in its polyglutamine-expanded pathological form, is responsible for the neurodegenerative disease spinocerebellar ataxia 7 (SCA7). Ataxin-7 is an integral component of the mammalian SAGA-like complexes, the TATA-binding protein-free TAF-containing complex (TFTC) and the SPT3/TAF9/GCN5 acetyltransferase complex (STAGA). This domain is a minimal domain in ataxin-7-like proteins that is required for interaction with TFTC/STAGA subunits and is highly conserved through evolution. The domain contains a conserved Cys(3)His motif that binds zinc, thus indicating this to be a new zinc-binding domain.	60
400556	pfam08314	Sec39	Secretory pathway protein Sec39. Mnaimneh et al identified Sec39p as a protein involved in ER-Golgi transport in a large scale promoter shut down analysis of essential yeast genes. Kraynack et al. (2005) showed that Sec39p (Dsl3p) is required for Golgi-ER retrograde transport and is part of a very stable protein complex that also includes Dsl1p (in mammals ZW10), Tip20p (Rint-1) and the ER localized Q-SNARE proteins Ufe1p (syntaxin-18), Sec20p and Use1p. This was confirmed in a genome-wide analysis of protein complexes by Gavin et al (2006).	724
400557	pfam08315	cwf18	cwf18 pre-mRNA splicing factor. The cwf18 family is involved in mRNA splicing. It has been isolated as a subcomplex of the spliceosome in Schizosaccharomyces pombe.	135
400558	pfam08316	Pal1	Pal1 cell morphology protein. Pal1 is a membrane associated protein that is involved in the maintenance of cylindrical cellular morphology. It localizes to sites of active growth. Pal1 physically interacts and displays overlapping localization with the Huntingtin-interacting-protein (Hip1)-related protein Sla2p/End4p.	135
400559	pfam08317	Spc7	Spc7 kinetochore protein. This domain is found in cell division proteins which are required for kinetochore-spindle association.	311
400560	pfam08318	COG4	COG4 transport protein. This region is found in yeast oligomeric Golgi complex component 4, which is involved in ER to Golgi and intra-Golgi transport.	326
400561	pfam08320	PIG-X	PIG-X / PBN1. Mammalian PIG-X and yeast PBN1 are essential components of glycosylphosphatidylinositol-mannosyltransferase I. These enzymes are involved in the transfer of sugar molecules.	205
400562	pfam08321	PPP5	PPP5 TPR repeat region. This region is specific to the PPP5 subfamily of serine/threonine phosphatases and contains TPR repeats.	92
400563	pfam08323	Glyco_transf_5	Starch synthase catalytic domain. 	239
400564	pfam08324	PUL	PUL domain. The PUL (PLAP, Ufd3p and Lub1p) domain is a novel alpha-helical Ub-associated domain. It directly binds to Cdc48, a chaperone-like AAA ATPase that collects ubiquitylated substrates.	260
311984	pfam08325	WLM	WLM domain. This is a predicted metallopeptidase domain called WLM (Wss1p-like metalloproteases). These are linked to the Ub-system by virtue of fusions with the UB-binding PUG (PUB), Ub-like, and Little Finger domains. More specifically, genetic evidence implicates the WLM family in de-SUMOylation.	190
400565	pfam08326	ACC_central	Acetyl-CoA carboxylase, central region. The region featured in this family is found in various eukaryotic acetyl-CoA carboxylases, N-terminal to the catalytic domain (pfam01039). This enzyme (EC:6.4.1.2) is involved in the synthesis of long-chain fatty acids, as it catalyzes the rate-limiting step in this process.	718
400566	pfam08327	AHSA1	Activator of Hsp90 ATPase homolog 1-like protein. This family includes eukaryotic, prokaryotic and archaeal proteins that bear similarity to a C-terminal region of human activator of 90 kDa heat shock protein ATPase homolog 1 (AHSA1/p38). This protein is known to interact with the middle domain of Hsp90, and stimulate its ATPase activity. It is probably a general upregulator of Hsp90 function, particularly contributing to its efficiency in conditions of increased stress. p38 is also known to interact with the cytoplasmic domain of the VSV G protein, and may thus be involved in protein transport. It has also been reported as being underexpressed in Down's syndrome. This region is found repeated in two members of this family.	125
400567	pfam08328	ASL_C	Adenylosuccinate lyase C-terminal. This domain is found at the C-terminus of adenylosuccinate lyase (ASL; PurB in E. coli). It has been identified in bacteria, eukaryotes and archaea and is found together with the lyase domain pfam00206. ASL catalyzes the cleavage of succinylaminoimidazole carboxamide ribotide to aminoimidazole carboxamide ribotide and fumarate and the cleavage of adenylosuccinate to adenylate and fumarate.	115
400568	pfam08329	ChitinaseA_N	Chitinase A, N-terminal domain. This domain is found in a number of bacterial chitinases and similar viral proteins. It is organized into a fibronectin III module domain-like fold, comprising only beta strands. Its function is not known, but it may be involved in interaction with the enzyme substrate, chitin. It is separated by a hinge region from the catalytic domain (pfam00704); this hinge region is probably mobile, allowing the N-terminal domain to have different relative positions in solution.	130
400569	pfam08331	DUF1730	Domain of unknown function (DUF1730). This domain of unknown function occurs in Iron-sulfur cluster-binding proteins together with the 4Fe-4S binding domain (pfam00037).	77
285524	pfam08332	CaMKII_AD	Calcium/calmodulin dependent protein kinase II association domain. This domain is found at the C-terminus of the Calcium/calmodulin dependent protein kinases II (CaMKII). These proteins also have a Ser/Thr protein kinase domain (pfam00069) at their N-terminus. The function of the CaMKII association domain is the assembly of the single proteins into large (8 to 14 subunits) multimers.	128
400570	pfam08333	DUF1725	Protein of unknown function (DUF1725). This family includes many eukaryotic sequences and one bacterial sequence. Many of its members are annotated as being putative L1 retrotransposons or LINE-1 reverse transcriptase homologs. The region in question is found repeated in some family members.	19
400571	pfam08334	T2SSG	Type II secretion system (T2SS), protein G. The Type II secretion system, also called Secretion-dependent pathway (SDP), is responsible for the transport across the outer membrane of proteins first exported to the periplasm by the Sec or Tat translocon in Gram-negative (diderm) bacteria. The T2SSG family includes proteins such as EpsG (P45773) in Vibrio cholerae, XcpT also called PddA (Q00514) in Pseudomonas aeruginosa or PulG (P15746) in Klebsiella pneumoniae. PulG is thought to be anchored in the inner membrane with its C-terminus directed towards the periplasm. Together with other members of the Type II secretion machinery, it is thought to assemble into a pilus-like structure that may function as a dynamic mechanism to push secreted proteins out of the cell. The polypeptide is organized into a long N-terminal alpha-helix followed by a loop region that separates it from a C-terminal anti-parallel beta-sheet.	106
400572	pfam08335	GlnD_UR_UTase	GlnD PII-uridylyltransferase. This is a family of bifunctional uridylyl-removing enzymes/uridylyltransferases (UR/UTases, GlnD) that are responsible for the modification (EC:2.7.7.59) of the regulatory protein P-II, or GlnB (pfam00543). In response to nitrogen limitation, these transferases catalyze the uridylylation of the PII protein, which in turn stimulates deadenylylation of glutamine synthetase (GlnA). Deadenylylated glutamine synthetase is the more active form of the enzyme. Moreover, uridylylated PII can act together with NtrB and NtrC to increase transcription of genes in the sigma54 regulon, which include glnA and other nitrogen-level controlled genes. It has also been suggested that the product of the glnD gene is involved in other physiological functions such as control of iron metabolism in certain species. The region described in this family is found in many of its members to be C-terminal to a nucleotidyltransferase domain (pfam01909), and N-terminal to an HD domain (pfam01966) and two ACT domains (pfam01842).	140
400573	pfam08336	P4Ha_N	Prolyl 4-Hydroxylase alpha-subunit, N-terminal region. The members of this family are eukaryotic proteins, and include all three isoforms of the prolyl 4-hydroxylase alpha subunit. This enzyme (EC:1.14.11.2) is important in the post-translational modification of collagen, as it catalyzes the formation of 4-hydroxyproline. In vertebrates, the complete enzyme is an alpha2-beta2 tetramer; the beta-subunit is identical to protein disulphide isomerase. The function of the N-terminal region featured in this family does not seem to be known.	92
400574	pfam08337	Plexin_cytopl	Plexin cytoplasmic RasGAP domain. This family features the C-terminal regions of various plexins. Plexins are receptors for semaphorins, and plexin signalling is important in path finding and patterning of both neurons and developing blood vessels. The cytoplasmic region, which has been called a SEX domain in some members of this family, is involved in downstream signalling pathways, by interaction with proteins such as Rac1, RhoD, Rnd1 and other plexins. This domain acts as a RasGAP domain.	504
400575	pfam08338	DUF1731	Domain of unknown function (DUF1731). This domain of unknown function appears towards the C-terminus of proteins of the NAD dependent epimerase/dehydratase family (pfam01370) in bacteria, eukaryotes and archaea. Many of the proteins in which it is found are involved in cell-division inhibition.	44
400576	pfam08339	RTX_C	RTX C-terminal domain. This family describes the C-terminal region of various bacterial haemolysins and leukotoxins, which belong to the RTX family of toxins. These are produced by various Gram negative bacteria, such as E. coli and Actinobacillus pleuropneumoniae. RTX toxins may interact with lipopolysaccharide (LPS) to functionally impair and eventually kill leukocytes. This region is found in association with the RTX N-terminal domain (pfam02382) and multiple hemolysin-type calcium-binding repeats (pfam00353).	131
400577	pfam08340	DUF1732	Domain of unknown function (DUF1732). This domain of unknown function is often found at the C-terminus of bacterial proteins, many of which are hypothetical, including proteins of the YicC family which have pfam03755 at the N-terminus. These include a protein important in the stationary phase of growth, and required for growth at high temperature. Structural modelling suggests this domain may bind nucleic acids.	85
400578	pfam08341	TED	Thioester domain. This domain is found near the N-terminus of a variety of bacterial surface proteins and pili. This domain contains an unusual covalent ester bond between a conserved cysteine and glutamine residue.	104
400579	pfam08343	RNR_N	Ribonucleotide reductase N-terminal. This domain is found at the N-terminus of bacterial ribonucleoside-diphosphate reductases (ribonucleotide reductases, RNRs) which catalyze the formation of deoxyribonucleotides. It occurs together with the RNR all-alpha domain (pfam00317) and the RNR barrel domain (pfam02867).	82
400580	pfam08344	TRP_2	Transient receptor ion channel II. This domain is found in the transient receptor ion channel (Trp) family of proteins. There is strong evidence that Trp proteins are structural elements of calcium-ion entry channels activated by G protein-coupled receptors. This domain does not tend to appear with the TRP domain (pfam06011) but is often found to the C-terminus of Ankyrin repeats (pfam00023).	60
400581	pfam08345	YscJ_FliF_C	Flagellar M-ring protein C-terminal. This domain is found in bacterial flagellar M-ring (FliF) proteins together with the YscJ/FliF domain (pfam01514).	155
400582	pfam08346	AntA	AntA/AntB antirepressor. In E. coli the two proteins AntA and AntB have 62% amino acid identities near their N termini. AntA appears to be encoded by a truncated and divergent copy of AntB. The two proteins are homologous to putative antirepressors found in numerous bacteriophages, such as the hypothetical antirepressor protein encoded by the gene LO142 of the bacteriophage 933W.	68
400583	pfam08347	CTNNB1_binding	N-terminal CTNNB1 binding. This region tends to appear at the N-terminus of proteins also containing DNA-binding HMG (high mobility group) boxes (pfam00505) and appears to bind the armadillo repeat of CTNNB1 (beta-catenin), forming a stable complex. Signaling by Wnt through TCF/LEF is involved in developmental patterning, induction of neural tissues, cell fate decisions and stem cell differentiation. Isoforms of HMG T-cell factors lacking the N-terminal CTNNB1-binding domain cannot fulfill their role as transcriptional activators in T-cell differentiation.	206
400584	pfam08348	PAS_6	YheO-like PAS domain. This family contains various hypothetical bacterial proteins that are similar to the E. coli protein YheO. Their function is unknown, but they are likely to be involved in signalling based on the presence of this PAS domain.	115
400585	pfam08349	DUF1722	Protein of unknown function (DUF1722). This domain of unknown function is found in bacteria and archaea and is homologous to the hypothetical protein ybgA from E. coli.	115
400586	pfam08350	DUF1724	Domain of unknown function (DUF1724). This domain of unknown function has so far only been found at the C-terminus of archaean proteins, including several transcriptional regulators of the ArsR family (see pfam01022).	62
400587	pfam08351	DUF1726	Domain of unknown function (DUF1726). This domain of unknown function is often found at the N-terminus of proteins containing pfam05127. Its fold resembles that of pfam05127, but it does not appear to bind ATP.	92
400588	pfam08352	oligo_HPY	Oligopeptide/dipeptide transporter, C-terminal region. This family features a region found towards the C-terminus of oligopeptide ABC transporter ATP binding proteins, immediately following the ATP-binding domain (pfam00005). All characterized members appear able to be involved in the transport of oligopeptides or dipeptides. Some are important for sporulation or antibiotic resistance. Some dipeptide transporters also act on the heme precursor delta-aminolevulinic acid.	65
400589	pfam08353	DUF1727	Domain of unknown function (DUF1727). This domain of unknown function is found at the C-terminus of bacterial proteins which include UDP-N-acetylmuramyl tripeptide synthase and the related Mur ligase.	110
369827	pfam08354	DUF1729	Domain of unknown function (DUF1729). This domain of unknown function is found in fatty acid synthase beta subunits together with the MaoC-like domain (pfam01575) and the Acyltransferase domain (pfam00698). The domain has been identified in fungi and bacteria.	353
400590	pfam08355	EF_assoc_1	EF hand associated. This region typically appears on the C-terminus of EF hands in GTP-binding proteins such as Arht/Rhot (may be involved in mitochondrial homeostasis and apoptosis). The EF hand associated region is found in yeast, vertebrates and plants.	69
400591	pfam08356	EF_assoc_2	EF hand associated. This region predominantly appears near EF-hands (pfam00036) in GTP-binding proteins. It is found in all three eukaryotic kingdoms.	85
254756	pfam08357	SEFIR	SEFIR domain. This family comprises IL17 receptors (IL17Rs) and SEF proteins. The latter are feedback inhibitors of FGF signalling and are also thought to be receptors. Due to its similarity to the TIR domain (pfam01582), the SEFIR region is thought to be involved in homotypic interactions with other SEFIR/TIR-domain-containing proteins. Thus, SEFs and IL17Rs may be involved in TOLL/IL1R-like signalling pathways.	150
149427	pfam08358	Flexi_CP_N	Carlavirus coat. This domain is found together with the viral coat protein domain (pfam00286) in coat/capsid proteins of Carlaviruses infecting plants.	52
285548	pfam08359	TetR_C_4	YsiA-like protein, C-terminal region. The members of this family are thought to be TetR-type transcriptional regulators that bear particular similarity to YsiA, a hypothetical protein expressed by B. subtilis.	133
400592	pfam08360	TetR_C_5	QacR-like protein, C-terminal region. This family features the C-terminal region of a number of proteins that bear similarity to the QacR protein, a transcriptional regulator of the TetR family. QacR is able to bind various environmental agents, which include a number of cationic lipophilic compounds, and thus regulate the transcription of QacA, a multidrug efflux pump. The C-terminal region contains the multifaceted, expansive drug-binding pocket, which is composed of several separate, but linked, binding sites.	131
369831	pfam08361	TetR_C_2	MAATS-type transcriptional repressor, C-terminal region. This family is named after the various transcriptional regulatory proteins that it contains, including MtrR, AcrR, ArpR, TtgR and SmeT. These are members of the TetR family of transcriptional repressors, that are involved in the control of expression of multidrug resistance proteins.	121
400593	pfam08362	TetR_C_3	YcdC-like protein, C-terminal region. This family comprises proteins that belong to the TetR family of transcriptional regulators. They bear particular similarity to YcdC, a putative HTH-containing protein. This family features the C-terminal region of these sequences, which does not include the helix-turn-helix.	143
400594	pfam08363	GbpC	Glucan-binding protein C. This domain is found in the Streptococcus Glucan-binding protein C (GbpC) and also in surface protein antigen (Spa)-family proteins which show sequence similarity to GbpC.	260
400595	pfam08364	IF2_assoc	Bacterial translation initiation factor IF-2 associated region. Most of the sequences in this alignment come from bacterial translation initiation factors (IF-2, also pfam04760), but the domain is also found in the eukaryotic translation initiation factor 4 gamma in yeast and in a hypothetical Euglenozoa protein of unknown function.	39
400596	pfam08365	IGF2_C	Insulin-like growth factor II E-peptide. This domain is found at the C-terminal domain of the insulin-like growth factor II (IGF-2, also see pfam00049) in vertebrates and seems to represent the E-peptide.	56
400597	pfam08366	LLGL	LLGL2. This domain is found in lethal giant larvae homolog 2 (LLGL2) proteins and syntaxin-binding proteins like tomosyn. It has been identified in eukaryotes and tends to be found together with WD repeats (pfam00400).	102
400598	pfam08367	M16C_assoc	Peptidase M16C associated. This domain appears in eukaryotes as well as bacteria and tends to be found near the C-terminus of the metalloprotease M16C (pfam05193).	245
400599	pfam08368	FAST_2	FAST kinase-like protein, subdomain 2. This family represents a conserved region of eukaryotic Fas-activated serine/threonine (FAST) kinases (EC:2.7.1.-) that contains several conserved leucine residues. FAST kinase is rapidly activated during Fas-mediated apoptosis, when it phosphorylates TIA-1, a nuclear RNA-binding protein that has been implicated as an effector of apoptosis. Note that many family members are hypothetical proteins. This subdomain is often found associated with the FAST kinase-like protein, subdomain 1.	87
400600	pfam08369	PCP_red	Proto-chlorophyllide reductase 57 kD subunit. This domain is found in bacterial and plant chloroplast proteins. It often appears at the C-terminus of Nitrogenase component 1 type Oxidoreductases (pfam00148) and sometimes independently in bacterial proteins such as the Proto-chlorophyllide reductase 57 kD subunit of the Cyanobacterium Synechocystis.	44
400601	pfam08370	PDR_assoc	Plant PDR ABC transporter associated. This domain is found on the C-terminus of ABC-2 type transporter domains (pfam01061). It seems to be associated with the plant pleiotropic drug resistance (PDR) protein family of ABC transporters. Like in yeast, plant PDR ABC transporters may also play a role in the transport of antifungal agents [pfam06422]. The PDR family is characterized by a configuration in which the ABC domain is nearer the N-terminus of the protein than the transmembrane domain.	62
337028	pfam08372	PRT_C	Plant phosphoribosyltransferase C-terminal. This domain is found at the C-terminus of phosphoribosyltransferases and phosphoribosyltransferase-like proteins. It contains putative transmembrane regions. It often appears together with calcium-ion dependent C2 domains (pfam00168).	156
400602	pfam08373	RAP	RAP domain. This domain is found in various eukaryotic species, where it is found in proteins that are important in various parasite-host cell interactions. It is thought to be an RNA-binding domain. The domain is involved in plant defense in response to bacterial infection.	58
400603	pfam08374	Protocadherin	Protocadherin. The structure of protocadherins is similar to that of classic cadherins (pfam00028), but particularly on the cytoplasmic domains they also have some unique features. They are expressed in a variety of organisms and are found in high concentrations in the brain where they seem to be localized mainly at cell-cell contact sites. Their expression seems to be developmentally regulated.	217
400604	pfam08375	Rpn3_C	Proteasome regulatory subunit C-terminal. This eukaryotic domain is found at the C-terminus of 26S proteasome regulatory subunits such as the non-ATPase Rpn3 subunit which is essential for proteasomal function. It occurs together with the PCI/PINT domain (pfam01399).	57
400605	pfam08376	NIT	Nitrate and nitrite sensing. The nitrate- and nitrite sensing domain (NIT) is found in receptor components of signal transducing pathways in bacteria which control gene expression, cellular motility and enzyme activity in response to nitrate and nitrite concentrations. The NIT domain is predicted to be all alpha-helical in structure.	234
369841	pfam08377	MAP2_projctn	MAP2/Tau projection domain. This domain is found in the MAP2/Tau family of proteins which includes MAP2, MAP4, Tau, and their homologs. All isoforms contain a conserved C-terminal domain containing tubulin-binding repeats (pfam00418), and an N-terminal projection domain of varying size. This domain has a net negative charge and exerts a long-range repulsive force. This provides a mechanism that can regulate microtubule spacing which might facilitate efficient organelle transport.	1134
400606	pfam08378	NERD	Nuclease-related domain. The nuclease-related domain (NERD) is found in a range of bacterial as well as archaeal and plant proteins. It has distant similarity to endonucleases (hence its name) and its predicted secondary structure is helix - sheet - sheet - sheet - sheet - weak sheet/long loop - helix - sheet - sheet. The majority of NERD-containing proteins are single-domain, but in several cases proteins containing NERD have additional domains which in 75% of cases are involved in DNA processing.	108
400607	pfam08379	Bact_transglu_N	Bacterial transglutaminase-like N-terminal region. This region is found towards the N-terminus of various archaeal and bacterial hypothetical proteins. Some of these are annotated as being transglutaminase-like proteins, and in fact contain a transglutaminase-like superfamily domain (pfam01841).	80
400608	pfam08381	BRX	Transcription factor regulating root and shoot growth via Pin3. The BREVIS RADIX (BRX) domain was characterized as being a transcription factor in plants regulating the extent of cell proliferation and elongation in the growth zone of the root. BRX is rate limiting for auxin-responsive gene-expression by mediating cross-talk with the brassino-steroid pathway. BRX has a ubiquitous, although quantitatively variable role in modulating the growth rate in both the root and the shoot. The family features a short region of alpha-helix, approximately 60 residues in length, which is found repeated up to three times. BRX is expressed in the vasculature and is rate-limiting for transcriptional auxin action.	56
400609	pfam08383	Maf_N	Maf N-terminal region. This region is found in various leucine zipper transcription factors of the Maf family. These are implicated in the regulation of insulin gene expression, in erythroid differentiation, and in differentiation of the neuroretina.	34
400610	pfam08384	NPP	Pro-opiomelanocortin, N-terminal region. This family features the N-terminal peptide of pro-opiomelanocortin (NPP). It is thought to represent an important pituitary peptide, given its high yield from pituitary glands, and exhibits a potent in vitro aldosterone-stimulating activity.	43
400611	pfam08385	DHC_N1	Dynein heavy chain, N-terminal region 1. Dynein heavy chains interact with other heavy chains to form dimers, and with intermediate chain-light chain complexes to form a basal cargo binding unit. The region featured in this family includes the sequences implicated in mediating these interactions. It is thought to be flexible and not to adopt a rigid conformation.	559
312032	pfam08386	Abhydrolase_4	TAP-like protein. This is a family of putative bacterial peptidases and hydrolases that bear similarity to a tripeptidyl aminopeptidase isolated from Streptomyces lividans. A member of this family is thought to be involved in the C-terminal processing of propionicin F, a bacteriocin characterized from Propionibacterium freudenreichii.	98
400612	pfam08387	FBD	FBD. This region is found in F-box (pfam00646) and other domain containing plant proteins; it is repeated in two family members. Its precise function is unknown, but it is thought to be associated with nuclear processes. In fact, several family members are annotated as being similar to transcription factors.	46
400613	pfam08388	GIIM	Group II intron, maturase-specific domain. This region is found mainly in various bacterial and archaeal species, but a few members of this family are expressed by fungal and chlamydomonal species. It has been implicated in the binding of intron RNA during reverse transcription and splicing.	80
400614	pfam08389	Xpo1	Exportin 1-like protein. The sequences featured in this family are similar to a region close to the N-terminus of yeast exportin 1 (Xpo1, Crm1). This region is found just C-terminal to an importin-beta N-terminal domain (pfam03810) in many members of this family. Exportin 1 is a nuclear export receptor that interacts with leucine-rich nuclear export signal (NES) sequences, and Ran-GTP, and is involved in translocation of proteins out of the nucleus.	147
400615	pfam08390	TRAM1	TRAM1-like protein. This family comprises sequences that are similar to human TRAM1. This is a transmembrane protein of the endoplasmic reticulum, thought to be involved in the membrane transfer of secretory proteins. The region featured in this family is found N-terminal to the longevity-assurance protein region (pfam03798).	63
400616	pfam08391	Ly49	Ly49-like protein, N-terminal region. The sequences making up this family are annotated as, or are similar to, Ly49 receptors. These are type II transmembrane receptors expressed by mouse natural killer (NK) cells. They are classified as being activating (e.g. Ly49D and H) or inhibitory (e.g. Ly49A and G), depending on their effect on NK cell function. They are members of the C-type lectin receptor superfamily, and in fact in many family members this region is found immediately N-terminal to a lectin C-type domain (pfam00059).	120
400617	pfam08392	FAE1_CUT1_RppA	FAE1/Type III polyketide synthase-like protein. The members of this family are described as 3-ketoacyl-CoA synthases, type III polyketide synthases, fatty acid elongases and fatty acid condensing enzymes, and are found in both prokaryotic and eukaryotic (mainly plant) species. The region featured in this family contains the active site residues, as well as motifs involved in substrate binding.	290
400618	pfam08393	DHC_N2	Dynein heavy chain, N-terminal region 2. Dyneins are described as motor proteins of eukaryotic cells, as they can convert energy derived from the hydrolysis of ATP to force and movement along cytoskeletal polymers, such as microtubules. This region is found C-terminal to the dynein heavy chain N-terminal region 1 (pfam08385) in many members of this family. No functions seem to have been attributed specifically to this region.	331
285580	pfam08394	Arc_trans_TRASH	Archaeal TRASH domain. This region is found in the C-terminus of a number of archaeal transcriptional regulators. It is thought to function as a metal-sensing regulatory module.	37
400619	pfam08395	7tm_7	7tm Chemosensory receptor. This family includes a number of gustatory and odorant receptors mainly from insect species such as A. gambiae and D. melanogaster. They are classified as G-protein-coupled receptors (GPCRs), or seven-transmembrane receptors. They show high sequence divergence, consistent with an ancient origin for the family.	370
254775	pfam08396	Toxin_34	Spider toxin omega agatoxin/Tx1 family. The Tx1 family lethal spider neurotoxin induces excitatory symptoms in mice.	75
400620	pfam08397	IMD	IRSp53/MIM homology domain. The N-terminal predicted helical stretch of the insulin receptor tyrosine kinase substrate p53 (IRSp53) is an evolutionary conserved F-actin bundling domain involved in filopodium formation. The domain has been named IMD after the IRSp53 and missing in metastasis (MIM) proteins in which it occurs. Filopodium-inducing IMD activity is regulated by Cdc42 and Rac1 and is SH3-independent.	218
285583	pfam08398	Parvo_coat_N	Parvovirus coat protein VP1. This is the N-terminal region of the Parvovirus VP1 coat protein. Also see Parvovirus coat protein VP2 (pfam00740).	63
400621	pfam08399	VWA_N	VWA N-terminal. This domain is found at the N-terminus of proteins containing von Willebrand factor type A (VWA, pfam00092) and Cache (pfam02743) domains. It has been found in vertebrates, Drosophila and C. elegans but has not yet been identified in other eukaryotes. It is probably involved in the function of some voltage-dependent calcium channel subunits.	123
285585	pfam08400	phage_tail_N	Prophage tail fibre N-terminal. This domain is found at the N-terminus of prophage tail fibre proteins.	134
400622	pfam08401	DUF1738	Domain of unknown function (DUF1738). This region is found in a number of bacterial hypothetical proteins. Some members are annotated as being similar to replication primases, and in fact this region is often found together with the Toprim domain (pfam01751).	127
400623	pfam08402	TOBE_2	TOBE domain. The TOBE domain (Transport-associated OB) always occurs as a dimer as the C-terminal strand of each domain is supplied by the partner. Probably involved in the recognition of small ligands such as molybdenum and sulphate. Found in ABC transporters immediately after the ATPase domain. In this family a strong RPE motif is found at the presumed N-terminus of the domain.	73
400624	pfam08403	AA_permease_N	Amino acid permease N-terminal. This domain is found to the N-terminus of the amino acid permease domain (pfam00324) in metazoan Na-K-Cl cotransporters.	69
285589	pfam08404	Baculo_p74_N	Baculoviridae P74 N-terminal. This domain is found at the N-terminus of P74 occlusion-derived virus (ODV) envelope proteins which are required for oral infectivity. The envelope proteins are found in baculoviruses which are insect pathogens. The C-terminus of P74 is anchored to the membrane whereas the N-terminus is exposed to the virion surface. Furthermore P74 is unusual for a virus envelope protein as it lacks an N-terminal localization signal sequence. Also see pfam04583.	300
285590	pfam08405	Calici_PP_N	Viral polyprotein N-terminal. This domain is found at the N-terminus of non-structural viral polyproteins of the Caliciviridae subfamily.	358
400625	pfam08406	CbbQ_C	CbbQ/NirQ/NorQ C-terminal. This domain is found at the C-terminus of proteins of the CbbQ/NirQ/NorQ family of proteins which play a role in the post-translational activation of Rubisco. It is also found in the Thauera aromatica TutH protein which is similar to the CbbQ/NirQ/NorQ family, as well as in putative chaperones. The ATPase family associated with various cellular activities (AAA) pfam07728 is found in the same bacterial and archaeal proteins as the domain described here.	85
400626	pfam08407	Chitin_synth_1N	Chitin synthase N-terminal. This is the N-terminal domain of Chitin synthase (pfam01644).	70
369858	pfam08408	DNA_pol_B_3	DNA polymerase family B viral insert. This viral domain is found between the exonuclease domain of the DNA polymerase family B (pfam03104) and the pfam00136 domain, connecting the two.	128
400627	pfam08409	DUF1736	Domain of unknown function (DUF1736). This domain of unknown function is found in various hypothetical metazoan proteins.	74
400628	pfam08410	DUF1737	Domain of unknown function (DUF1737). This domain of unknown function is found at the N-terminus of bacterial and viral hypothetical proteins.	51
400629	pfam08411	Exonuc_X-T_C	Exonuclease C-terminal. This bacterial domain is found at the C-terminus of Exodeoxyribonuclease I/Exonuclease I (pfam00929), which is a single-strand specific DNA nuclease affecting recombination and expression pathways. The exonuclease I protein in E. coli is associated with DNA deoxyribophosphodiesterase (dRPase).	267
400630	pfam08412	Ion_trans_N	Ion transport protein N-terminal. This metazoan domain is found to the N-terminus of pfam00520 in voltage- and cyclic nucleotide-gated K/Na ion channels.	43
400631	pfam08414	NADPH_Ox	Respiratory burst NADPH oxidase. This domain is found in plant proteins such as respiratory burst NADPH oxidase proteins which produce reactive oxygen species as a defense mechanism. It tends to occur to the N-terminus of an EF-hand (pfam00036), which suggests a direct regulatory effect of Ca2+ on the activity of the NADPH oxidase in plants.	100
400632	pfam08416	PTB	Phosphotyrosine-binding domain. The phosphotyrosine-binding domain (PTB, also phosphotyrosine-interaction or PI domain) in the protein tensin tends to be found at the C-terminus. Tensin is a multi-domain protein that binds to actin filaments and functions as a focal-adhesion molecule (focal adhesions are regions of plasma membrane through which cells attach to the extracellular matrix). Human tensin has actin-binding sites, an SH2 (pfam00017) domain and a region similar to the tumor suppressor PTEN. The PTB domain interacts with the cytoplasmic tails of beta integrin by binding to an NPXY motif.	128
400633	pfam08417	PaO	Pheophorbide a oxygenase. This domain is found in bacterial and plant proteins to the C-terminus of a Rieske 2Fe-2S domain (pfam00355). One of the proteins the domain is found in is Pheophorbide a oxygenase (PaO) which seems to be a key regulator of chlorophyll catabolism. Arabidopsis PaO (AtPaO) is a Rieske-type 2Fe-2S enzyme that is identical to Arabidopsis accelerated cell death 1 and homologous to lethal leaf spot 1 (LLS1) of maize, in which the domain described here is also found.	89
400634	pfam08418	Pol_alpha_B_N	DNA polymerase alpha subunit B N-terminal. This is the eukaryotic DNA polymerase alpha subunit B N-terminal domain which is involved in complex formation. Also see pfam04058.	240
400635	pfam08421	Methyltransf_13	Putative zinc binding domain. This domain is found at the N-terminus of bacterial methyltransferases and contains four conserved cysteines suggesting a potential zinc binding domain.	62
400636	pfam08423	Rad51	Rad51. Rad51 is a DNA repair and recombination protein and is a homolog of the bacterial ATPase RecA protein.	255
400637	pfam08424	NRDE-2	NRDE-2, necessary for RNA interference. This is a family of eukaryotic proteins. Eukaryotic cells express a wide variety of endogenous small regulatory RNAs that regulate heterochromatin formation, developmental timing, defense against parasitic nucleic acids, and genome rearrangement. Many small regulatory RNAs are thought to function in nuclei, and in plants and fungi small interfering (si)RNAs associate with nascent transcripts and direct chromatin and/or DNA modifications. This family protein, NRDE-2, is required for small interfering (si)RNA-mediated silencing in nuclei. NRDE-2 associates with the Argonaute protein NRDE-3 within nuclei and is recruited by NRDE-3/siRNA complexes to nascent transcripts that have been targeted by RNA interference, RNAi, the process whereby double-stranded RNA (dsRNA) directs the sequence-specific degradation of mRNA.	315
400638	pfam08426	ICE2	ICE2. ICE2 is a fungal ER protein which has been shown to play an important role in forming/maintaining the cortical ER. It has also been identified as a protein which is necessary for nuclear inner membrane targeting.	402
400639	pfam08427	DUF1741	Domain of unknown function (DUF1741). This is a eukaryotic domain of unknown function.	230
400640	pfam08428	Rib	Rib/alpha-like repeat. The region featured in this family is found repeated in a number of bacterial surface proteins, such as Rib and alpha. These are expressed by group B streptococci, and Rib is thought to confer protective immunity.	76
400641	pfam08429	PLU-1	PLU-1-like protein. Sequences in this family bear similarity to the central region of PLU-1. This is a nuclear protein that may have a role in DNA-binding and transcription, and is closely associated with the malignant phenotype of breast cancer. This region is found in various other Jumonji/ARID domain-containing proteins (see pfam02373, pfam01388).	334
369872	pfam08430	Forkhead_N	Forkhead N-terminal region. The region described in this family is found towards the N-terminus of various eukaryotic forkhead/HNF-3-related transcription factors (which contain the pfam00250 domain). These proteins play key roles in embryogenesis, maintenance of differentiated cell states, and tumorigenesis.	139
400642	pfam08432	Vfa1	AAA-ATPase Vps4-associated protein 1. Vps Four-Associated 1, Vfa1, in yeast, is an endosomal protein that interacts with the AAA-ATPase Vps4. It would seem to be involved in regulating the trafficking of other proteins to the endocytic vacuole. There is a CCCH zinc finger at the N-terminus.	179
400643	pfam08433	KTI12	Chromatin associated protein KTI12. This is a family of chromatin associated proteins which interact with the Elongator complex, a component of the elongating form of RNA polymerase II. The Elongator complex has histone acetyltransferase activity.	269
400644	pfam08434	CLCA	Calcium-activated chloride channel N terminal. The CLCA family of calcium-activated chloride channels has been identified in many epithelial and endothelial cell types as well as in smooth muscle cells and has four or five putative transmembrane regions. Additionally to their role as chloride channels some CLCA proteins function as adhesion molecules and may also have roles as tumor suppressors. This protein cleaves itself into an N-terminal portion and a C-terminal portion. The N-terminus contains an HEXXHXXXGXXDE motif which is essential for proteolytic cleavage.	266
400645	pfam08435	Calici_coat_C	Calicivirus coat protein C-terminal. This is the calicivirus coat protein (pfam00915) C-terminal region.	222
400646	pfam08436	DXP_redisom_C	1-deoxy-D-xylulose 5-phosphate reductoisomerase C-terminal. This domain is found to the C-terminus of pfam02670 domains in bacterial and plant 1-deoxy-D-xylulose 5-phosphate reductoisomerases which catalyze the formation of 2-C-methyl-D-erythritol 4-phosphate from 1-deoxy-D-xylulose-5-phosphate in the presence of NADPH.	84
400647	pfam08437	Glyco_transf_8C	Glycosyl transferase family 8 C-terminal. This domain is found at the C-terminus of the pfam01501 domain in bacterial glucosyltransferase and galactosyltransferase proteins.	54
400648	pfam08438	MMR_HSR1_C	GTPase of unknown function C-terminal. This domain is found at the C-terminus of pfam01926 in archaeal and eukaryotic GTP-binding proteins. The C-terminal domain of the GTP-binding proteins is necessary for the full activity of the protein, namely interacting with the 50S ribosome and binding both adenine and guanine nucleotides, with a preference for guanine nucleotides.	109
400649	pfam08439	Peptidase_M3_N	Oligopeptidase F. This domain is found to the N-terminus of the pfam01432 domain in bacterial and archaeal proteins including Oligoendopeptidase F. An example of this protein is Lactococcus lactis PepF.	70
285618	pfam08440	Poty_PP	Potyviridae polyprotein. This domain is found in polyproteins of the viral Potyviridae taxon.	277
400650	pfam08441	Integrin_alpha2	Integrin alpha. This domain is found in integrin alpha and integrin alpha precursors to the C-terminus of a number of pfam01839 repeats and to the N-terminus of the pfam00357 cytoplasmic region. This region is composed of three immunoglobulin-like domains.	449
400651	pfam08442	ATP-grasp_2	ATP-grasp domain. 	202
369879	pfam08443	RimK	RimK-like ATP-grasp domain. This ATP-grasp domain is found in the ribosomal S6 modification enzyme RimK.	188
117021	pfam08444	Gly_acyl_tr_C	Aralkyl acyl-CoA:amino acid N-acyltransferase, C-terminal region. This family features the C-terminal region of several mammalian specific aralkyl acyl-CoA:amino acid N-acyltransferase (glycine N-acyltransferase) proteins EC:2.3.1.13.	89
117022	pfam08445	FR47	FR47-like protein. The members of this family are similar to the C-terminal region of the D. melanogaster hypothetical protein FR47. This protein has been found to consist of two N-acyltransferase-like domains swapped with the C-terminal strands.	86
400652	pfam08446	PAS_2	PAS fold. The PAS fold corresponds to the structural domain that has previously been defined as PAS and PAC motifs. The PAS fold appears in archaea, eubacteria and eukarya.	107
369881	pfam08447	PAS_3	PAS fold. The PAS fold corresponds to the structural domain that has previously been defined as PAS and PAC motifs. The PAS fold appears in archaea, eubacteria and eukarya.	89
312075	pfam08448	PAS_4	PAS fold. The PAS fold corresponds to the structural domain that has previously been defined as PAS and PAC motifs. The PAS fold appears in archaea, eubacteria and eukarya.	110
312076	pfam08449	UAA	UAA transporter family. This family includes transporters with a specificity for UDP-N-acetylglucosamine.	302
400653	pfam08450	SGL	SMP-30/Gluconolactonase/LRE-like region. This family describes a region that is found in proteins expressed by a variety of eukaryotic and prokaryotic species. These proteins include various enzymes, such as senescence marker protein 30 (SMP-30), gluconolactonase and luciferin-regenerating enzyme (LRE). SMP-30 is known to hydrolyze diisopropyl phosphorofluoridate in the liver, and has been noted as having sequence similarity, in the region described in this family, with PON1 and LRE.	246
400654	pfam08451	A_deaminase_N	Adenosine/AMP deaminase N-terminal. This domain is found to the N-terminus of the Adenosine/AMP deaminase domain (pfam00962) in metazoan proteins such as the Cat eye syndrome critical region protein 1 and its homologs.	95
369884	pfam08452	DNAP_B_exo_N	DNA polymerase family B exonuclease domain, N-terminal. This domain is found in viral DNA polymerases to the N-terminus of DNA polymerase family B exonuclease domains (pfam03104).	21
400655	pfam08453	Peptidase_M9_N	Peptidase family M9 N-terminal. This domain is found in microbial collagenase metalloproteases to the N-terminus of pfam01752.	183
400656	pfam08454	RIH_assoc	RyR and IP3R Homology associated. This eukaryotic domain is found in ryanodine receptors (RyR) and inositol 1,4,5-trisphosphate receptors (IP3R) which together form a superfamily of homotetrameric ligand-gated intracellular Ca2+ channels. There seems to be no known function for this domain. Also see the IP3-binding domain pfam01365 and pfam02815.	98
400657	pfam08455	SNF2_assoc	Bacterial SNF2 helicase associated. This domain is found in bacterial proteins of the SWF/SNF/SWI helicase family to the N-terminus of the SNF2 family N-terminal domain (pfam00176) and together with the Helicase conserved C-terminal domain (pfam00271). The function of the domain is not clear.	369
285632	pfam08456	Vmethyltransf_C	Viral methyltransferase C-terminal. This domain is found to the C-terminus of the viral methyltransferase domain (pfam01660) in single-stranded-RNA positive-strand viruses with no DNA stage in the Virgaviridae family.	230
400658	pfam08457	Sfi1	Sfi1 spindle body protein. This is a family of fungal spindle pole body proteins that play a role in spindle body duplication. They contain binding sites for calmodulin-like proteins called centrins which are present in microtubule-organising centers.	570
400659	pfam08458	PH_2	Plant pleckstrin homology-like region. This family describes a pleckstrin homology (PH)-like region found in several plant proteins of unknown function.	104
400660	pfam08459	UvrC_HhH_N	UvrC Helix-hairpin-helix N-terminal. This domain is found in the C subunits of the bacterial and archaeal UvrABC system which catalyzes nucleotide excision repair in a multi-step process. UvrC catalyzes the first incision on the fourth or fifth phosphodiester bond 3' and on the eighth phosphodiester bond 5' from the damage that is to be excised. The domain described here is found to the N-terminus of a helix hairpin helix (pfam00633) motif and also co-occurs with the pfam01541 catalytic domain which is found at the N-terminus of the same proteins.	150
285636	pfam08460	SH3_5	Bacterial SH3 domain. 	65
285637	pfam08461	HTH_12	Ribonuclease R winged-helix domain. This domain is found at the amino terminus of Ribonuclease R and a number of presumed transcriptional regulatory proteins from archaebacteria.	66
117039	pfam08462	Carmo_coat_C	Carmovirus coat protein. This domain is found to the C-terminus of the pfam00729 domain in Carmoviruses.	99
400661	pfam08463	EcoEI_R_C	EcoEI R protein C-terminal. The restriction enzyme EcoEI recognizes 5'-GAGN(7)ATGC-3' and is composed of the three proteins R, M, and S. The domain described here is found at the C-terminus of the R protein (HsdR) which is required for both nuclease and ATPase activity.	158
369889	pfam08464	Gemini_AC4_5_2	Geminivirus AC4/5 conserved region. This domain is found in replication initiator (Rep) associated proteins such as AC5 in the Geminivirus/Begomovirus.	43
285639	pfam08465	Herpes_TK_C	Thymidine kinase from Herpesvirus C-terminal. This domain is found towards the C-terminus in Herpesvirus Thymidine kinases.	33
400662	pfam08466	IRK_N	Inward rectifier potassium channel N-terminal. This metazoan domain is found to the N-terminus of the pfam01007 domain in Inward rectifier potassium channels (KIR2 or IRK2).	45
285641	pfam08467	Luteo_P1-P2	Luteovirus RNA polymerase P1-P2/replicase. This domain is found in RNA-dependent RNA polymerase P1-P2 fusion/replicase proteins in plant Luteoviruses.	339
400663	pfam08468	MTS_N	Methyltransferase small domain N-terminal. This domain is found to the N-terminus of the methyltransferase small domain (pfam05175) in bacterial proteins.	157
400664	pfam08469	NPHI_C	Nucleoside triphosphatase I C-terminal. This viral domain is found to the C-terminus of Poxvirus nucleoside triphosphatase phosphohydrolase I (NPH I) together with the helicase conserved C-terminal domain (pfam00271).	148
400665	pfam08470	NTNH_C	Nontoxic nonhaemagglutinin C-terminal. Bacteria of the Clostridium genus produce protein neurotoxins, which are complexes consisting of neurotoxin (NT), haemagglutinin (HA), nontoxic nonhaemagglutinin (NTNH), and RNA. The domain described here is found at the C-terminus of the NTNH component.	162
400666	pfam08471	Ribonuc_red_2_N	Class II vitamin B12-dependent ribonucleotide reductase. This domain is found to the N-terminus of the ribonucleotide reductase barrel domain (pfam02867). It occurs in bacterial class II ribonucleotide reductase proteins which depend upon coenzyme B12 (deoxyadenosylcobalamine).	99
369893	pfam08472	S6PP_C	Sucrose-6-phosphate phosphohydrolase C-terminal. This is the Sucrose-6-phosphate phosphohydrolase (S6PP or SPP) C-terminal domain as found in plant sucrose phosphatases. These enzymes irreversibly catalyze the last step in sucrose synthesis following the formation of Sucrose-6-Phosphate via sucrose-phosphate synthase (SPS).	133
400667	pfam08473	VGCC_alpha2	Neuronal voltage-dependent calcium channel alpha 2acd. This eukaryotic domain has been found in the neuronal voltage-dependent calcium channel (VGCC) alpha 2a, 2c, and 2d subunits. It is also found in other calcium channel alpha-2 delta subunits to the C-terminus of a Cache domain (pfam02743).	430
400668	pfam08474	MYT1	Myelin transcription factor 1. This domain is found in the myelin transcription factor 1 (MYT1) of chordates. MYT1 contains C2HC zinc finger domains (pfam01530) and is expressed in developing neurons of the central nervous system where it is involved in the selection of neuronal precursor cells.	236
285649	pfam08475	Baculo_VP91_N	Viral capsid protein 91 N-terminal. This domain is found in Baculoviridae including the nucleopolyhedrovirus at the N-terminus of the viral capsid protein 91 (VP91).	192
285650	pfam08476	VD10_N	Viral D10 N-terminal. This domain is found on the N-terminus of the viral protein D10 (VD10) and the related MutT motif proteins. The VD10 protein is probably essential for virus replication and is often found to the N-terminus of a pfam00293 domain.	41
400669	pfam08477	Roc	Ras of Complex, Roc, domain of DAPkinase. Roc, or Ras of Complex, proteins are mitochondrial Rho proteins (Miro-1, and Miro-2) and atypical Rho GTPases. Full-length proteins have a unique domain organisation, with tandem GTP-binding domains and two EF hand domains (pfam00036) that may bind calcium. They are also larger than classical small GTPases. It has been proposed that they are involved in mitochondrial homeostasis and apoptosis.	120
400670	pfam08478	POTRA_1	POTRA domain, FtsQ-type. FtsQ/DivIB bacterial division proteins (pfam03799) contain an N-terminal POTRA domain (for polypeptide-transport-associated domain). This is found in different types of proteins, usually associated with a transmembrane beta-barrel. FtsQ/DivIB may have chaperone-like roles, which has also been postulated for the POTRA domain in other contexts.	69
369898	pfam08479	POTRA_2	POTRA domain, ShlB-type. The POTRA domain (for polypeptide-transport-associated domain) is found towards the N-terminus of ShlB family proteins (pfam03865). ShlB is important in the secretion and activation of the haemolysin ShlA. It has been postulated that the POTRA domain has a chaperone-like function over ShlA; it may fold back into the C-terminal beta-barrel channel.	76
285654	pfam08480	Disaggr_assoc	Disaggregatase related. This domain is found in disaggregatases and several hypothetical proteins of the archaeal genus Methanosarcina. Disaggregatases cause aggregates to separate into single cells and contain parallel beta-helix repeats. Also see pfam06848.	194
400671	pfam08481	GBS_Bsp-like	GBS Bsp-like repeat. This domain is found as a repeat in a number of Streptococcus proteins including some hypothetical proteins and Bsp. Bsp is a protein of group B Streptococcus (GBS) which might control cell morphology.	89
400672	pfam08482	HrpB_C	ATP-dependent helicase C-terminal. This domain is found near the C-terminus of bacterial ATP-dependent helicases such as HrpB.	126
378004	pfam08483	IstB_IS21_ATP	IstB-like ATP binding N-terminal. This bacterial domain is found to the N-terminus of the pfam01695 like ATP binding domain in proteins which are putative transposase subunits.	28
400673	pfam08484	Methyltransf_14	C-methyltransferase C-terminal domain. This domain is found in bacterial C-methyltransferase proteins. This domain is found C-terminal to methyltransferase domains such as pfam08241 or pfam08242. But this domain is not a methyltransferase.	160
400674	pfam08485	Polysacc_syn_2C	Polysaccharide biosynthesis protein C-terminal. This domain is found to the C-terminus of the pfam02719 domain in bacterial polysaccharide biosynthesis enzymes including the capsule protein CapD and several putative epimerases/dehydratases.	48
400675	pfam08486	SpoIID	Stage II sporulation protein. This domain is found in the stage II sporulation protein SpoIID. SpoIID is necessary for membrane migration as well as for some of the earlier steps in engulfment during bacterial endospore formation. The domain is also found in amidase enhancer proteins. Amidases, like SpoIID, are cell wall hydrolases.	100
400676	pfam08487	VIT	Vault protein inter-alpha-trypsin domain. Inter-alpha-trypsin inhibitors (ITIs) consist of one light chain and a variable set of heavy chains. ITIs play a role in extracellular matrix (ECM) stabilisation and tumor metastasis as well as in plasma protease inhibition. The vault protein inter-alpha-trypsin (VIT) domain described here is found to the N-terminus of a von Willebrand factor type A domain (pfam00092) in ITI heavy chains (ITIHs) and their precursors.	111
254827	pfam08488	WAK	Wall-associated kinase. This domain is found together with the eukaryotic protein kinase domain pfam00069 in plant wall-associated kinases (WAKs) and related proteins. WAKs are serine-threonine kinases which might be involved in signalling to the cytoplasm and are required for cell expansion.	103
400677	pfam08489	DUF1743	Domain of unknown function (DUF1743). This domain of unknown function is found in many hypothetical proteins and predicted DNA-binding proteins such as transcription-associated proteins. It is found in bacteria and archaea.	116
400678	pfam08490	DUF1744	Domain of unknown function (DUF1744). This domain is found on the epsilon catalytic subunit of DNA polymerase. It is found C terminal to pfam03104 and pfam00136.	400
400679	pfam08491	SE	Squalene epoxidase. This domain is found in squalene epoxidase (SE) and related proteins which are found in taxonomically diverse groups of eukaryotes and also in bacteria. SE was first cloned from Saccharomyces cerevisiae where it was named ERG1. It contains a putative FAD binding site and is a key enzyme in the sterol biosynthetic pathway. Putative transmembrane regions are found to the protein's C-terminus.	276
400680	pfam08492	SRP72	SRP72 RNA-binding domain. This region has been identified as the binding site of the SRP72 protein to SRP RNA.	57
400681	pfam08493	AflR	Aflatoxin regulatory protein. This domain is found in the aflatoxin regulatory protein (AflR) which is involved in the regulation of the biosynthesis of aflatoxin in the fungal genus Aspergillus. It occurs together with the fungal Zn(2)-Cys(6) binuclear cluster domain (pfam00172).	274
400682	pfam08494	DEAD_assoc	DEAD/H associated. This domain is found in ATP-dependent helicases as well as a number of hypothetical proteins together with the helicase conserved C-terminal domain (pfam00270) and the pfam00271 domain.	182
400683	pfam08495	FIST	FIST N domain. The FIST N domain is a novel sensory domain, which is present in signal transduction proteins from Bacteria, Archaea and Eukarya. Chromosomal proximity of FIST-encoding genes to those coding for proteins involved in amino acid metabolism and transport suggest that FIST domains bind small ligands, such as amino acids.	130
400684	pfam08496	Peptidase_S49_N	Peptidase family S49 N-terminal. This domain is found to the N-terminus of bacterial signal peptidases of the S49 family (pfam01343).	147
400685	pfam08497	Radical_SAM_N	Radical SAM N-terminal. This domain tends to occur to the N-terminus of the pfam04055 domain in hypothetical bacterial proteins.	298
400686	pfam08498	Sterol_MT_C	Sterol methyltransferase C-terminal. This domain is found to the C-terminus of a methyltransferase domain (pfam08241) in fungal and plant sterol methyltransferases.	63
400687	pfam08499	PDEase_I_N	3'5'-cyclic nucleotide phosphodiesterase N-terminal. This domain is found to the N-terminus of the calcium/calmodulin-dependent 3'5'-cyclic nucleotide phosphodiesterase domain (pfam00233).	57
149522	pfam08500	Tombus_P33	Tombusvirus p33. Tombusviruses, which replicate in a wide range of plant hosts, replicate with the help of viral replicase protein including the overlapping p33 and p92 proteins which contain the domain described here.	142
400688	pfam08501	Shikimate_dh_N	Shikimate dehydrogenase substrate binding domain. This domain is the substrate binding domain of shikimate dehydrogenase.	83
400689	pfam08502	LeuA_dimer	LeuA allosteric (dimerization) domain. This is the C-terminal regulatory (R) domain of alpha-isopropylmalate synthase, which catalyzes the first committed step in the leucine biosynthetic pathway. This domain is an internally duplicated structure with a novel fold. It comprises two similar units that are arranged such that the two alpha-helices pack together in the centre, crossing at an angle of 34 degrees, sandwiched between the two three-stranded, antiparallel beta-sheets. The overall domain is thus constructed as a beta-alpha-beta three-layer sandwich.	112
400690	pfam08503	DapH_N	Tetrahydrodipicolinate succinyltransferase N-terminal. This domain is found at the N-terminus of tetrahydrodipicolinate N-succinyltransferase (DapH) which catalyzes the acylation of L-2-amino-6-oxopimelate to 2-N-succinyl-6-oxopimelate in the meso-diaminopimelate/lysine biosynthetic pathway of bacteria, blue-green algae, and plants. The N-terminal domain as defined here contains three alpha-helices and two twisted hairpin loops.	83
400691	pfam08504	RunxI	Runx inhibition domain. This domain lies to the C-terminus of Runx-related transcription factors and homologous proteins (AML, CBF-alpha, PEBP2). Its function might be to interact with functional cofactors.	98
400692	pfam08505	MMR1	Mitochondrial Myo2 receptor-related protein. Myo2p, a class V myosin, is essential for mitochondrial distribution in S. cerevisiae, where class V myosins are vital for organelle distribution. The established mechanism for distribution of cellular components by class V myosins is that they interact with the cargo at the C-terminal tail domain and transport it along the actin cytoskeleton using the N-terminal motor domain. Cargo-specific myosin receptors act as the link between the myosin tail and cargo. Myo2 binds with MMR1 (mitochondrial Myo2p receptor-related 1), the receptor on cargo, via the C-terminal domain.	267
369909	pfam08506	Cse1	Cse1. This domain is present in Cse1 nuclear export receptor proteins. Cse1 mediates the nuclear export of importin alpha. This domain contains HEAT repeats.	370
400693	pfam08507	COPI_assoc	COPI associated protein. Proteins in this family colocalize with COPI vesicle coat proteins.	130
400694	pfam08508	DUF1746	Fungal domain of unknown function (DUF1746). This is a fungal domain of unknown function.	116
369912	pfam08509	Ad_cyc_g-alpha	Adenylate cyclase G-alpha binding domain. This fungal domain is found in adenylate cyclase and interacts with the alpha subunit of heterotrimeric G proteins.	48
400695	pfam08510	PIG-P	PIG-P. PIG-P (phosphatidylinositol N-acetylglucosaminyltransferase subunit P) is an enzyme involved in GPI anchor biosynthesis.	120
400696	pfam08511	COQ9	COQ9. COQ9 is an enzyme that is required for the biosynthesis of coenzyme Q. It may either catalyze a reaction in the coenzyme Q biosynthetic pathway or have a regulatory role.	73
400697	pfam08512	Rtt106	Histone chaperone Rttp106-like. This family includes Rttp106, a histone chaperone involved in heterochromatin-mediated silencing. This domain belongs to the Pleckstrin homology domain superfamily.	83
369916	pfam08513	LisH	LisH. The LisH (lis homology) domain mediates protein dimerization and tetramerisation. The LisH domain is found in Sif2, a component of the Set3 complex which is responsible for repressing meiotic genes. It has been shown that the LisH domain helps mediate interaction with components of the Set3 complex.	25
400698	pfam08514	STAG	STAG domain. STAG domain proteins are subunits of the cohesin complex, a protein complex required for sister chromatid cohesion in eukaryotes. The STAG domain is present in Schizosaccharomyces pombe mitotic cohesin Psc3, and the meiosis-specific cohesin Rec11. Many organisms express a meiosis-specific STAG protein; for example, mice and humans have a meiosis-specific variant called STAG3, although budding yeast does not have a meiosis-specific version.	108
400699	pfam08515	TGF_beta_GS	Transforming growth factor beta type I GS-motif. This motif is found in the transforming growth factor beta (TGF-beta) type I receptor, which regulates cell growth and differentiation. The name of the GS motif comes from its highly conserved GSGSGLP signature in the cytoplasmic juxtamembrane region immediately preceding the protein's kinase domain. Point mutations in the GS motif modify the signaling ability of the type I receptor.	28
400700	pfam08516	ADAM_CR	ADAM cysteine-rich. ADAMs are membrane-anchored proteases that proteolytically modify cell surface and extracellular matrix (ECM) in order to alter cell behaviour. It has been shown that the cysteine-rich domain of ADAM13 regulates the protein's metalloprotease activity.	115
400701	pfam08517	AXH	Ataxin-1 and HBP1 module (AXH). AXH is a protein-protein and RNA binding motif found in Ataxin-1 (ATX1). ATX1 is responsible for the autosomal-dominant neurodegenerative disorder Spinocerebellar ataxia type-1 (SCA1) in humans. The AXH module has also been identified in the apparently unrelated transcription factor HBP1 which is thought to be involved in the architectural regulation of chromatin and in specific gene expression.	109
400702	pfam08518	GIT_SHD	Spa2 homology domain (SHD) of GIT. GIT proteins are signaling integrators with GTPase-activating function which may be involved in the organisation of the cytoskeletal matrix assembled at active zones (CAZ). The function of the CAZ might be to define sites of neurotransmitter release. Mutations in the Spa2 homology domain (SHD) domain of GIT1 described here interfere with the association of GIT1 with Piccolo, beta-PIX, and focal adhesion kinase.	29
400703	pfam08519	RFC1	Replication factor RFC1 C terminal domain. This is the C terminal domain of replication factor C, RFC1. RFC complexes hydrolyze ATP and load sliding clamps such as PCNA (proliferating cell nuclear antigen) onto double-stranded DNA. RFC1 is essential for RFC function in vivo.	158
400704	pfam08520	DUF1748	Fungal protein of unknown function (DUF1748). This is a family of fungal proteins of unknown function.	69
400705	pfam08521	2CSK_N	Two-component sensor kinase N-terminal. This domain is found in bacterial two-component sensor kinases towards the N-terminus.	140
400706	pfam08522	DUF1735	Domain of unknown function (DUF1735). This domain of unknown function is found in a number of bacterial proteins including acylhydrolases. The structure of this domain has a beta-sandwich fold.	120
400707	pfam08523	MBF1	Multiprotein bridging factor 1. This domain is found in the multiprotein bridging factor 1 (MBF1) which forms a heterodimer with MBF2. It has been shown to make direct contact with the TATA-box binding protein (TBP) and interacts with Ftz-F1, stabilizing the Ftz-F1-DNA complex. It is also found in the endothelial differentiation-related factor (EDF-1). Human EDF-1 is involved in the repression of endothelial differentiation, interacts with CaM and is phosphorylated by PKC. The domain is found in a wide range of eukaryotic proteins including metazoans, fungi and plants. A helix-turn-helix motif (pfam01381) is found to its C-terminus.	70
400708	pfam08524	rRNA_processing	rRNA processing. This is a family of proteins that are involved in rRNA processing. In a localization study they were found to localize to the nucleus and nucleolus. The family also includes other metazoa members from plants to mammals where the protein has been named BR22 and is associated with TTF-1, thyroid transcription factor 1. In the lungs, the family binds TTF-1 to form a complex which influences the expression of the key lung surfactant protein-B (SP-B) and -C (SP-C), the small hydrophobic surfactant proteins that maintain surface tension in alveoli.	142
400709	pfam08525	OapA_N	Opacity-associated protein A N-terminal motif. This family includes the Haemophilus influenzae opacity-associated protein. This protein is required for efficient nasopharyngeal mucosal colonisation, and its expression is associated with a distinctive transparent colony phenotype. OapA is thought to be a secreted protein, and its expression exhibits high-frequency phase variation. This motif occurs at the N-terminus of these proteins. It contains a conserved histidine followed by a run of hydrophobic residues.	28
400710	pfam08526	PAD_N	Protein-arginine deiminase (PAD) N-terminal domain. This family represents the N-terminal non-catalytic domain of protein-arginine deiminase. This domain has a cupredoxin-like fold.	113
400711	pfam08527	PAD_M	Protein-arginine deiminase (PAD) middle domain. This family represents the central non-catalytic domain of protein-arginine deiminase. This domain has an immunoglobulin-like fold.	159
400712	pfam08528	Whi5	Whi5 like. In metazoans, cyclin-dependent kinase (CDK)-dependent phosphorylation of the retinoblastoma tumor suppressor protein (Rb) alleviates repression of E2F and thereby activates G1/S transcription. The cell size regulator Whi5 appears to be an analogous target of CDK activity during G1 phase.	25
400713	pfam08529	NusA_N	NusA N-terminal domain. This domain represents the RNA polymerase binding domain of NusA.	120
400714	pfam08530	PepX_C	X-Pro dipeptidyl-peptidase C-terminal non-catalytic domain. This domain contains a beta sandwich domain.	216
400715	pfam08531	Bac_rhamnosid_N	Alpha-L-rhamnosidase N-terminal domain. This family consists of bacterial rhamnosidase A and B enzymes. This domain is probably involved in substrate recognition.	172
369931	pfam08532	Glyco_hydro_42M	Beta-galactosidase trimerisation domain. This is the non-catalytic domain B of beta-galactosidase enzymes belonging to the glycosyl hydrolase 42 family. This domain is related to glutamine amidotransferase enzymes, but the catalytic residues are replaced by non-functional amino acids. This domain is involved in trimerisation.	207
400716	pfam08533	Glyco_hydro_42C	Beta-galactosidase C-terminal domain. This domain is found at the C-terminus of beta-galactosidase enzymes that belong to the glycosyl hydrolase 42 family.	58
400717	pfam08534	Redoxin	Redoxin. This family of redoxins includes peroxiredoxin, thioredoxin and glutaredoxin proteins.	148
400718	pfam08535	KorB	KorB domain. This family consists of several KorB transcriptional repressor proteins. The korB gene is a major regulatory element in the replication and maintenance of broad host-range plasmid RK2. It negatively controls the replication gene trfA, the host-lethal determinants kilA and kilB, and the korA-korB operon. This domain includes the DNA-binding HTH motif.	88
400719	pfam08536	Whirly	Whirly transcription factor. This family contains the plant whirly transcription factors.	136
400720	pfam08537	NBP1	Fungal Nap binding protein NBP1. NBP1 is a nuclear protein which has been shown in Saccharomyces cerevisiae to be essential for the G2/M transition of the cell cycle.	332
369936	pfam08538	DUF1749	Protein of unknown function (DUF1749). This is a plant and fungal family of unknown function. This family contains many hypothetical proteins.	299
400721	pfam08539	HbrB	HbrB-like. HbrB is involved in hyphal growth and polarity.	159
400722	pfam08540	HMG_CoA_synt_C	Hydroxymethylglutaryl-coenzyme A synthase C terminal. 	280
400723	pfam08541	ACP_syn_III_C	3-Oxoacyl-[acyl-carrier-protein (ACP)] synthase III C terminal. This domain is found on 3-Oxoacyl-[acyl-carrier-protein (ACP)] synthase III EC:2.3.1.41, the enzyme responsible for initiating the chain of reactions of the fatty acid synthase in plants and bacteria.	90
400724	pfam08542	Rep_fac_C	Replication factor C C-terminal domain. This is the C-terminal domain of RFC (replication factor-C) protein of the clamp loader complex which binds to the DNA sliding clamp (proliferating cell nuclear antigen, PCNA). The five modules of RFC assemble into a right-handed spiral, which results in only three of the five RFC subunits (RFC-A, RFC-B and RFC-C) making contact with PCNA, leaving a wedge-shaped gap between RFC-E and the PCNA clamp-loader complex. The C-terminal is vital for the correct orientation of RFC-E with respect to RFC-A.	87
400725	pfam08543	Phos_pyr_kin	Phosphomethylpyrimidine kinase. This enzyme EC:2.7.4.7 is part of the Thiamine pyrophosphate (TPP) synthesis pathway, TPP is an essential cofactor for many enzymes.	246
378019	pfam08544	GHMP_kinases_C	GHMP kinases C terminal. This family includes homoserine kinases, galactokinases and mevalonate kinases.	86
400726	pfam08545	ACP_syn_III	3-Oxoacyl-[acyl-carrier-protein (ACP)] synthase III. This domain is found on 3-Oxoacyl-[acyl-carrier-protein (ACP)] synthase III EC:2.3.1.180, the enzyme responsible for initiating the chain of reactions of the fatty acid synthase in plants and bacteria.	80
400727	pfam08546	ApbA_C	Ketopantoate reductase PanE/ApbA C terminal. This is a family of 2-dehydropantoate 2-reductases also known as ketopantoate reductases, EC:1.1.1.169. The reaction catalyzed by this enzyme is: (R)-pantoate + NADP(+) <=> 2-dehydropantoate + NADPH. ApbA catalyzes the NADPH reduction of ketopantoic acid to pantoic acid in the alternative pyrimidine biosynthetic (APB) pathway. ApbA and PanE are allelic. ApbA, the ketopantoate reductase enzyme, is required for the synthesis of thiamine via the APB biosynthetic pathway.	125
400728	pfam08547	CIA30	Complex I intermediate-associated protein 30 (CIA30). This protein is associated with mitochondrial Complex I intermediate-associated protein 30 (CIA30) in human and mouse. The family is also present in Schizosaccharomyces pombe which does not contain the NADH dehydrogenase component of complex I, or many of the other essential subunits. This means it is possible that this family of protein may not be directly involved in oxidative phosphorylation.	156
400729	pfam08548	Peptidase_M10_C	Peptidase M10 serralysin C terminal. Serralysins are peptidases related to mammalian matrix metallopeptidases (MMPs). The peptidase unit is found at the N terminal while this domain at the C terminal forms a corkscrew and is thought to be important for secretion of the protein through the bacterial cell wall. This domain contains the calcium ion binding domain pfam00353.	221
400730	pfam08549	SWI-SNF_Ssr4	Fungal domain of unknown function (DUF1750). This is a fungal domain of unknown function.	714
400731	pfam08550	DUF1752	Fungal protein of unknown function (DUF1752). This is a family of fungal proteins of unknown function. This short domain is bounded by two highly conserved tryptophans. The family contains MKD1, which is thought to be a negative regulator of the RAS-cAMP pathway in S. cerevisiae. The S. pombe member is a GAF1 transcription factor that is also associated with the zinc finger family GATA pfam00320.	28
369945	pfam08551	DUF1751	Eukaryotic integral membrane protein (DUF1751). This domain is found in eukaryotic integral membrane proteins. YOL107W, a Saccharomyces cerevisiae protein, has been shown to localize to COP II vesicles.	99
400732	pfam08552	Kei1	Inositolphosphorylceramide synthase subunit Kei1. Kei1 is a subunit of Saccharomyces cerevisiae inositol phosphorylceramide (IPC) synthase. It is localized to the Golgi and is cleaved by the late Golgi processing endopeptidase Kex2. Kei1 is essential for both the activity and the Golgi localization of IPC synthase.	181
400733	pfam08553	VID27	VID27 cytoplasmic protein. This is a family of fungal and plant proteins and contains many hypothetical proteins. VID27 is a cytoplasmic protein that plays a potential role in vacuolar protein degradation.	356
369948	pfam08555	DUF1754	Eukaryotic family of unknown function (DUF1754). This is a eukaryotic protein family of unknown function.	91
400734	pfam08557	Lipid_DES	Sphingolipid Delta4-desaturase (DES). Sphingolipids are important membrane signalling molecules involved in many different cellular functions in eukaryotes. Sphingolipid delta 4-desaturase catalyzes the formation of (E)-sphing-4-enine. Some proteins in this family have bifunctional delta 4-desaturase/C-4-hydroxylase activity. Delta 4-desaturated sphingolipids may play a role in early signalling required for entry into meiotic and spermatid differentiation pathways during Drosophila spermatogenesis. This small domain associates with FA_desaturase pfam00487 and appears to be specific to sphingolipid delta 4-desaturase.	37
400735	pfam08558	TRF	Telomere repeat binding factor (TRF). Telomere repeat binding factor (TRF) family proteins are important for the regulation of telomere stability. The two related human TRF proteins hTRF1 and hTRF2 form homodimers and bind directly to telomeric TTAGGG repeats via the myb DNA binding domain pfam00249 at the carboxy terminus. TRF1 is implicated in telomere length regulation and TRF2 in telomere protection. Other telomere complex associated proteins are recruited through their interaction with either TRF1 or TRF2. The fission yeast protein Taz1p (telomere-associated in Schizosaccharomyces pombe) has similarity to both hTRF1 and hTRF2 and may perform the dual functions of TRF1 and TRF2 at fission yeast telomeres. This domain is composed of multiple alpha helices arranged in a solenoid conformation similar to TPR repeats. The fungal members have now also been found to carry two double strand telomeric repeat binding factors.	212
400736	pfam08559	Cut8	Cut8, nuclear proteasome tether protein. In Schizosaccharomyces pombe, Cut8 is a nuclear envelope protein that physically interacts with and tethers 26S proteasome in the nucleus resulting in the nuclear accumulation of proteasome. Cut8 comprises three functional domains. An N-terminal lysine-rich segment which binds to the proteasome when ubiquitinated, a central dimerization domain and a C-terminal nine-helix, Structure 3q5w, bundle which shows structural similarity to 14-3-3 phosphoprotein-binding domains. The helical bundle is necessary for liposome and cholesterol binding. Cut8 is a proteasome substrate and the N-terminal segment is polyubiquitinated and functions as a degron tag. Ubiquitination of the amino N-terminal segment is essential for the function of Cut8. Lysine residues in the N-terminal segment of Cut8 are required for physical interaction with proteasome. In fission yeast the function of Cut8 has been demonstrated to be regulated by ubiquitin-conjugating Rhp6/Ubc2/Rad6 and ligating enzymes Ubr1. Cut8 homologs have been identified in Drosophila melanogaster, Anopheles gambiae and Dictyostelium discoideum.	196
400737	pfam08560	DUF1757	Protein of unknown function (DUF1757). This family of proteins are about 150 amino acids in length and have no known function.	147
400738	pfam08561	Ribosomal_L37	Mitochondrial ribosomal protein L37. This family includes yeast MRPL37 a mitochondrial ribosomal protein.	88
400739	pfam08562	Crisp	Crisp. This domain is found on Crisp proteins which contain pfam00188 and has been termed the Crisp domain. It is found in the mammalian reproductive tract and the venom of reptiles, and has been shown to regulate ryanodine receptor Ca2+ signalling. It contains 10 conserved cysteines which are all involved in disulphide bonds and is structurally related to the ion channel inhibitor toxins BgK and ShK.	55
400740	pfam08563	P53_TAD	P53 transactivation motif. The binding of the p53 transactivation domain by regulatory proteins regulates p53 transcription activation. This motif is comprised of a single amphipathic alpha helix and contains a highly conserved sequence.	25
400741	pfam08564	CDC37_C	Cdc37 C terminal domain. Cdc37 is a protein required for the activity of numerous eukaryotic protein kinases. This domain corresponds to the C terminal domain whose function is unclear. It is found C terminal to the Hsp90 chaperone (Heat shocked protein 90) binding domain pfam08565 and the N terminal kinase binding domain of Cdc37 pfam03234.	87
400742	pfam08565	CDC37_M	Cdc37 Hsp90 binding domain. Cdc37 is a molecular chaperone required for the activity of numerous eukaryotic protein kinases. This domain corresponds to the Hsp90 chaperone (Heat shocked protein 90) binding domain of Cdc37. It is found between the N terminal Cdc37 domain pfam03234, which is predominantly involved in kinase binding, and the C terminal domain of Cdc37 pfam08564 whose function is unclear.	113
400743	pfam08566	Pam17	Mitochondrial import protein Pam17. The presequence translocase-associated motor (PAM) drives the completion of preprotein translocation into the mitochondrial matrix. The Pam17 subunit is required for formation of a stable complex between cochaperones Pam16 and Pam18 and promotes the association of Pam16-Pam18 with the presequence translocase. Mitochondria lacking Pam17 are selectively impaired in the import of matrix proteins.	165
400744	pfam08567	PH_TFIIH	TFIIH p62 subunit, N-terminal domain. The N-terminal domain of the TFIIH basal transcription factor complex p62 subunit (BTF2-p62) forms an interaction with the 3' endonuclease XPG, which is essential for activity. The 3' endonuclease XPG is a major component of the nucleotide excision repair machinery. The structure of the N-terminal domain reveals that it adopts a pleckstrin homology (PH) fold. This PH-type domain has been shown to bind to a mono-phosphorylated inositide.	88
400745	pfam08568	Kinetochor_Ybp2	Uncharacterized protein family, YAP/Alf4/glomulin. This entry contains a number of protein families with apparently unrelated functions. These include the YAP binding proteins of yeasts. These are stress response and redox homeostasis proteins, induced by hydrogen peroxide or induced in response to alkylating agent methyl methanesulphonate (MMS). The family includes Aberrant root formation protein 4 (Alf4) of Arabidopsis thaliana (Mouse-ear cress), which is required for the initiation of lateral roots independent from auxin signalling. It may also function in maintaining the pericycle in the mitotically competent state needed for lateral root formation. The family includes glomulin (FAP68), which is essential for normal development of the vasculature and may represent a naturally occurring ligand of the immunophilins FKBP59 and FKBP12.	624
400746	pfam08569	Mo25	Mo25-like. Mo25-like proteins are involved in both polarised growth and cytokinesis. In fission yeast Mo25 is localized alternately to the spindle pole body and to the site of cell division in a cell cycle dependent manner.	328
400747	pfam08570	DUF1761	Protein of unknown function (DUF1761). Family of conserved fungal and bacterial membrane proteins with unknown function.	122
400748	pfam08571	Yos1	Yos1-like. In yeast, Yos1 is a subunit of the Yip1p-Yif1p complex and is required for transport between the endoplasmic reticulum and the Golgi complex. Yos1 appears to be conserved in eukaryotes.	80
400749	pfam08572	PRP3	pre-mRNA processing factor 3 (PRP3). Pre-mRNA processing factor 3 (PRP3) is a U4/U6-associated splicing factor. The human PRP3 has been implicated in autosomal retinitis pigmentosa.	218
400750	pfam08573	SAE2	DNA repair protein endonuclease SAE2/CtIP C-terminus. SAE2 is a protein involved in repairing meiotic and mitotic double-strand breaks in DNA. It has been shown to negatively regulate DNA damage checkpoint signalling. SAE2 is homologous to the CtIP proteins in mammals and an homologous protein in plants. Crucial sequence motifs that are highly conserved are the CxxC and the RHR motifs in this C-terminal part of the protein. It is now known to be an endonuclease. In budding yeast, genetic evidence suggests that the SAE2 protein is essential for the processing of hairpin DNA intermediates and meiotic double-strand breaks by Mre11/Rad50 complexes. SAE2 binds DNA and exhibits endonuclease activity on single-stranded DNA independently of Mre11/Rad50 complexes, but hairpin DNA structures are cleaved cooperatively in the presence of Mre11/Rad50 or Mre11/Rad50/Xrs2. Hairpin structures are not processed at the tip by SAE2 but rather at single-stranded DNA regions adjacent to the hairpin. The catalytic activities of SAE2 are important for its biological functions.	110
400751	pfam08574	Iwr1	Transcription factor Iwr1. Iwr1 is involved in transcription from polymerase II promoters; it interacts with most of the polymerase II subunits. Deletion of this protein results in hypersensitivity to the K1 killer toxin.	73
400752	pfam08576	DUF1764	Eukaryotic protein of unknown function (DUF1764). This is a family of eukaryotic proteins of unknown function. This family contains many hypothetical proteins.	99
400753	pfam08577	PI31_Prot_C	PI31 proteasome regulator. PI31 is a cellular regulator of proteasome formation and of proteasome-mediated antigen processing.	80
400754	pfam08578	DUF1765	Protein of unknown function (DUF1765). This region represents a conserved region found in hypothetical proteins from fungi, mycetozoa and entamoebidae.	125
369970	pfam08579	RPM2	Mitochondrial ribonuclease P subunit (RPM2). Ribonuclease P (RNase P) generates mature tRNA molecules by cleaving their 5' ends. RPM2 is a protein subunit of the yeast mitochondrial RNase P. It has the ability to act as transcriptional activator in the nucleus where it plays a role in defining the steady-state levels of mRNAs for some nucleus-encoded mitochondrial components.	119
369971	pfam08580	KAR9	Yeast cortical protein KAR9. The KAR9 protein in Saccharomyces cerevisiae is a cytoskeletal protein required for karyogamy, correct positioning of the mitotic spindle and for orientation of cytoplasmic microtubules. KAR9 localizes at the shmoo tip in mating cells and at the tip of the growing bud in anaphase.	683
400755	pfam08581	Tup_N	Tup N-terminal. The N-terminal domain of the Tup protein has been shown to interact with the Ssn6 transcriptional co-repressor.	77
400756	pfam08583	Cmc1	Cytochrome c oxidase biogenesis protein Cmc1 like. Cmc1 is a metallo-chaperone like protein which is known to localize to the inner mitochondrial membrane in Saccharomyces cerevisiae. It is essential for full expression of cytochrome c oxidase and respiration. Cmc1 contains two Cx9C motifs and is able to bind copper(I). Cmc1 is thought to play a role in mitochondrial copper trafficking and transfer to cytochrome c oxidase.	70
400757	pfam08584	Ribonuc_P_40	Ribonuclease P 40kDa (Rpp40) subunit. The tRNA processing enzyme ribonuclease P (RNase P) consists of an RNA molecule and at least eight protein subunits. Subunits hpop1, Rpp21, Rpp29, Rpp30, Rpp38, and Rpp40 (this entry) are involved in extensive, but weak, protein-protein interactions in the holoenzyme complex.	277
400758	pfam08585	RMI1_N	RecQ mediated genome instability protein. RMI1_N is an N-terminal family of eukaryotic proteins. The domain probably carries an oligo-nucleotide-binding domain or OB-fold, and forms a stable complex with Bloom syndrome protein BLM and DNA topoisomerase 3-alpha.	193
400759	pfam08586	Rsc14	RSC complex, Rsc14/Ldb7 subunit. RSC is an ATP-dependent chromatin remodelling complex found in yeast. The RSC components Rsc7/Npl6 and Rsc14/Ldb7 interact physically and/or functionally with Rsc3, Rsc30, and Htl1 to form a module important for a broad range of RSC functions.	99
400760	pfam08587	UBA_2	Ubiquitin associated domain (UBA). This is a UBA (ubiquitin associated) domain. Ubiquitin is involved in intracellular proteolysis.	45
400761	pfam08588	DUF1769	Protein of unknown function (DUF1769). Family of fungal proteins with unknown function.	55
400762	pfam08589	DUF1770	Fungal protein of unknown function (DUF1770). The function of this family is unknown. These proteins are rather dissimilar except for a single strongly conserved motif (PDLRFEQ).	98
400763	pfam08590	DUF1771	Domain of unknown function (DUF1771). This domain is always found adjacent to pfam01713.	62
400764	pfam08591	RNR_inhib	Ribonucleotide reductase inhibitor. This family includes S. pombe Spd1. Spd1p inhibits fission yeast RNR activity by interacting with the Cdc22p.	95
400765	pfam08592	DUF1772	Domain of unknown function (DUF1772). This domain is of unknown function.	136
400766	pfam08593	DUF1773	Domain of unknown function. This is the C-terminal part of some meiotically up-regulated gene products from fission yeast. The actual function is not yet known but the proteins are likely to be cell-surface glycoproteins.	58
369983	pfam08594	UPF0300	Uncharacterized protein family (UPF0300). This family of proteins appears to be specific to S. pombe.	212
400767	pfam08595	RXT2_N	RXT2-like, N-terminal. The family represents the N-terminal region of RXT2-like proteins. In S. cerevisiae, RXT2 has been demonstrated to be involved in conjugation with cellular fusion (mating) and invasive growth. A high throughput localization study has localized RXT2 to the nucleus.	139
400768	pfam08596	Lgl_C	Lethal giant larvae(Lgl) like, C-terminal. The Lethal giant larvae (Lgl) tumor suppressor family is conserved from yeast to mammals. The Lgl family functions in cell polarity, at least in part, by regulating SNARE-mediated membrane delivery events at the cell surface. The N-terminal half of Lgl members contains WD40 repeats (see pfam00400), while the C-terminal half appears specific to the family.	393
400769	pfam08597	eIF3_subunit	Translation initiation factor eIF3 subunit. This is a family of proteins which are subunits of the eukaryotic translation initiation factor 3 (eIF3). In yeast it is called Hcr1. The Saccharomyces cerevisiae protein HCR1 has been shown to be required for processing of 20S pre-rRNA and binds to 18S rRNA and eIF3 subunits Rpg1p and Prt1p.	242
400770	pfam08598	Sds3	Sds3-like. Repression of gene transcription is mediated by histone deacetylases containing repressor-co-repressor complexes, which are recruited to promoters of target genes via interactions with sequence-specific transcription factors. The co-repressor complex contains a core of at least seven proteins. This family represents the conserved region found in Sds3, Dep1 and BRMS1-homolog p40 proteins.	214
400771	pfam08599	Nbs1_C	DNA damage repair protein Nbs1. This C terminal region of the DNA damage repair protein Nbs1 has been identified to be necessary for the binding of Mre11 and Tel1.	62
369989	pfam08600	Rsm1	Rsm1-like. Rsm1 is a protein involved in mRNA export from the nucleus.	97
369990	pfam08601	PAP1	Transcription factor PAP1. The transcription factor Pap1 regulates antioxidant-gene transcription in response to H2O2. This region is cysteine rich. Alkylation of cysteine residues following treatment with a cysteine alkylating agent can mask the accessibility of the nuclear exporter Crm1, triggering nuclear accumulation and Pap1 dependent transcriptional expression.	363
369991	pfam08602	Mgr1	Mgr1-like, i-AAA protease complex subunit. The S. cerevisiae Mgr1 protein has been shown to be required for mitochondrial viability in yeast lacking mitochondrial DNA. It is a mitochondrial inner membrane protein, which interacts with Yme1 and is a new subunit of the i-AAA protease complex.	374
400772	pfam08603	CAP_C	Adenylate cyclase associated (CAP) C terminal. 	156
400773	pfam08604	Nup153	Nucleoporin Nup153-like. This family contains both the nucleoporin Nup153 from human and Nup154 from fission yeast. These have been demonstrated to be functionally equivalent.	501
369994	pfam08605	Rad9_Rad53_bind	Fungal Rad9-like Rad53-binding. In Saccharomyces cerevisiae, Rad9 is a key adaptor protein in DNA damage checkpoint pathways. DNA damage induces Rad9 phosphorylation, and Rad53 specifically associates with this region of Rad9, when phosphorylated, via Rad53 pfam00498 domains. This region is structurally composed of a pair of TUDOR domains.	129
400774	pfam08606	Prp19	Prp19/Pso4-like. This region is found specifically in PRP19-like proteins. The region represented by this family covers the sequence implicated in self-interaction and a coiled-coil motif. PRP19-like proteins form an oligomer that is necessary for spliceosome assembly.	65
400775	pfam08608	Wyosine_form	Wyosine base formation. Some proteins in this family appear to be important in wyosine base formation in a subset of phenylalanine specific tRNAs. It has been proposed that they participate in converting tRNA(Phe)-m(1)G(37) to tRNA(Phe)-yW.	63
400776	pfam08609	Fes1	Nucleotide exchange factor Fes1. Fes1 is a cytosolic homolog of Sls1, an ER protein which has nucleotide exchange factor activity. Fes1 in yeast has been shown to bind to the molecular chaperone Hsp70 and has adenyl-nucleotide exchange factor activity.	89
400777	pfam08610	Pex16	Peroxisomal membrane protein (Pex16). Pex16 is a peripheral protein located at the matrix face of the peroxisomal membrane.	346
400778	pfam08611	DUF1774	Fungal protein of unknown function (DUF1774). This is a fungal family of unknown function.	95
400779	pfam08612	Med20	TATA-binding related factor (TRF) of subunit 20 of Mediator complex. This family of proteins is related to TATA-binding protein (TBP). TBP is a highly conserved RNA polymerase II general transcription factor that binds to the core promoter and initiates assembly of the preinitiation complex. Human TRF has been shown to associate with an RNA polymerase II-SRB complex. This Med20 subunit of Mediator is found in the non-essential part of the head.	200
400780	pfam08613	Cyclin	Cyclin. This family includes many different cyclin proteins. Members include the G1/S-specific cyclin pas1, and the phosphate system cyclin PHO80/PHO85.	149
400781	pfam08614	ATG16	Autophagy protein 16 (ATG16). Autophagy is a ubiquitous intracellular degradation system for eukaryotic cells. During autophagy, cytoplasmic components are enclosed in autophagosomes and delivered to lysosomes/vacuoles. ATG16 (also known as Apg16) has been shown to bind to Apg5 and is required for the function of the Apg12p-Apg5p conjugate in the yeast autophagy pathway.	176
400782	pfam08615	RNase_H2_suC	Ribonuclease H2 non-catalytic subunit (Ylr154p-like). This entry represents the non-catalytic subunit of RNase H2, which in S. cerevisiae is Ylr154p/Rnh203p. Whereas bacterial and archaeal RNases H2 are active as single polypeptides, the Saccharomyces cerevisiae homolog, Rnh2Ap, when expressed in Escherichia coli, fails to produce an active RNase H2. For RNase H2 activity three proteins are required [Rnh2Ap (Rnh201p), Ydr279p (Rnh202p) and Ylr154p (Rnh203p)]. Deletion of any one of the proteins or mutations in the catalytic site in Rnh2A leads to loss of RNase H2 activity. RNase H2 is an endonuclease that specifically degrades the RNA of RNA:DNA hybrids. It participates in DNA replication, possibly by mediating the removal of lagging-strand Okazaki fragment RNA primers during DNA replication.	133
400783	pfam08616	SPA	Stabilization of polarity axis. Yeast AFI1 has been shown to interact with the outer plaque of the spindle pole body. In Aspergillus nidulans the protein member is necessary for stabilization of the polarity axes during septation, and in S. cerevisiae it functions as a polarisation-specific docking factor.	113
400784	pfam08617	CGI-121	Kinase binding protein CGI-121. CGI-121 has been shown to bind to the p53-related protein kinase (PRPK). PRPK is a novel protein kinase which binds to and induces phosphorylation of the tumor suppressor protein p53. CGI-121 is part of a conserved protein complex, KEOPS. The KEOPS complex is involved in telomere uncapping and telomere elongation. Interestingly this family also includes archaeal homologs, formerly in the DUF509 family. A structure for these proteins has been solved by structural genomics.	159
370005	pfam08618	Opi1	Transcription factor Opi1. Opi1 is a leucine zipper containing yeast transcription factor that negatively regulates phospholipid biosynthesis. It represses the expression of several UAS(INO) cis acting element containing genes and its activity is mediated by phosphorylations catalyzed by protein kinase A, protein kinase C and casein kinase II.	416
400785	pfam08619	Nha1_C	Alkali metal cation/H+ antiporter Nha1 C-terminus. The C-terminus of the plasma membrane Nha1 antiporter plays an important role in the immediate cell response to hypo-osmotic shock which prevents an excessive loss of ions and water. This domain is found with pfam00999.	323
400786	pfam08620	RPAP1_C	RPAP1-like, C-terminal. Inhibition of RPAP1 synthesis in Saccharomyces cerevisiae results in changes in global gene expression that are similar to those caused by the loss of the RNAPII subunit Rpb11. This entry represents the C-terminal region that contains the motif GLHHH. This region is conserved from yeast to humans.	69
400787	pfam08621	RPAP1_N	RPAP1-like, N-terminal. Inhibition of RPAP1 synthesis in Saccharomyces cerevisiae results in changes in global gene expression that are similar to those caused by the loss of the RNAPII subunit Rpb11. This entry represents the N-terminal region of RPAP-1 that is conserved from yeast to humans.	45
400788	pfam08622	Svf1	Svf1-like N-terminal lipocalin domain. Family of proteins that are involved in survival during oxidative stress. This entry corresponds to the N-terminal lipocalin domain of a pair.	162
400789	pfam08623	TIP120	TATA-binding protein interacting (TIP20). TIP120 (also known as cullin-associated and neddylation-dissociated protein 1) is a TATA binding protein interacting protein that enhances transcription.	165
400790	pfam08624	CRC_subunit	Chromatin remodelling complex Rsc7/Swp82 subunit. This family has been identified as a subunit of chromatin remodelling complexes. Saccharomyces cerevisiae NPL6 and its paralogue SWP82 have been identified as subunits of the RSC chromatin remodelling complex, and SWI/SNF chromatin remodelling complex respectively.	134
400791	pfam08625	Utp13	Utp13 specific WD40 associated domain. Utp13 is a component of the five protein Pwp2 complex that forms part of a stable particle subunit independent of the U3 small nucleolar ribonucleoprotein that is essential for the initial assembly steps of the 90S pre-ribosome. Pwp2 is capable of interacting directly with the 35 S pre-rRNA 5' end.	141
400792	pfam08626	TRAPPC9-Trs120	Transport protein Trs120 or TRAPPC9, TRAPP II complex subunit. This region is found at the N terminal of Saccharomyces cerevisiae Trs120 protein. Trs120 is a subunit of the multiprotein complex TRAPP (transport particle protein) which functions in ER to Golgi traffic. Trs120 is specific to the larger TRAPP complex, TRAPP II, along with Trs65p and Trs130p(TRAPPC10). It is suggested that Trs120p is required for the stability of the Trs130p subunit, suggesting that these two proteins might interact in some way. It is likely that there is a complex function for TRAPP II in multiple pathways.	1220
400793	pfam08627	CRT-like	CRT-like, chloroquine-resistance transporter-like. This region is found in proteins related to Plasmodium falciparum chloroquine resistance transporter (CRT).	331
400794	pfam08628	Nexin_C	Sorting nexin C terminal. This region is found at the C terminal of proteins belonging to the sorting nexin family. It is found on proteins which also contain pfam00787.	111
400795	pfam08629	PDE8	PDE8 phosphodiesterase. This region is found in members of the PDE8 phosphodiesterase family. It is found with pfam00233.	52
400796	pfam08630	Dfp1_Him1_M	Dfp1/Him1, central region. This is the middle region described by Ogino et al. This region, together with the C-terminal zinc finger (pfam07535), is essential for the mitotic and kinase activation functions of Dfp1/Him1.	128
400797	pfam08631	SPO22	Meiosis protein SPO22/ZIP4 like. SPO22/ZIP4 in yeast is a meiosis specific protein involved in sporulation. It has been shown to regulate crossover distribution by promoting synaptonemal complex formation.	272
370018	pfam08632	Zds_C	Activator of mitotic machinery Cdc14 phosphatase activation C-term. This region of the Zds1 protein is critical for sporulation and has also been shown to suppress the calcium sensitivity of Zds1 deletions. The C-terminal motif is common to both Zds1 and Zds2 proteins, both of which are putative interactors of Cdc55 and are required for the completion of mitotic exit and cytokinesis. They both contribute to timely Cdc14 activation during mitotic exit and are required downstream of separase to facilitate nucleolar Cdc14 release.	49
400798	pfam08633	Rox3	Rox3 mediator complex subunit. The mediator complex is part of the RNA polymerase II holoenzyme. Rox3 is a subunit of the mediator complex.	163
400799	pfam08634	Pet127	Mitochondrial protein Pet127. Pet127 has been implicated in mitochondrial RNA stability and/or processing and is localized to the mitochondrial membrane. The Pet127 family is part of the PD-(D/E)XK nuclease superfamily including a full set of active site residues.	275
117208	pfam08635	ox_reductase_C	Putative oxidoreductase C terminal. This is the C terminal of a family of putative oxidoreductases.	142
400800	pfam08636	Pkr1	ER protein Pkr1. Pkr1 has been identified as an ER protein of unknown function.	69
400801	pfam08637	NCA2	ATP synthase regulation protein NCA2. NCA2 has been shown to be required for the regulation of ATP synthase subunits Atp6p and Atp8p in Saccharomyces cerevisiae.	288
400802	pfam08638	Med14	Mediator complex subunit MED14. Saccharomyces cerevisiae RGR1 mediator complex subunit affects chromatin structure, transcriptional regulation of diverse genes and sporulation, required for glucose repression, HO repression, RME1 repression and sporulation. This subunit is also found in higher eukaryotes and Med14 is the agreed unified nomenclature for this subunit. Med14 is found in the tail region of Mediator.	192
400803	pfam08639	SLD3	DNA replication regulator SLD3. The SLD3 DNA replication regulator is required for loading and maintenance of Cdc45 on chromatin during DNA replication.	534
400804	pfam08640	U3_assoc_6	U3 small nucleolar RNA-associated protein 6. This is a family of U3 nucleolar RNA-associated proteins which are involved in nucleolar processing of pre-18S ribosomal RNA.	77
400805	pfam08641	Mis14	Kinetochore protein Mis14 like. Mis14 is a kinetochore protein which is known to be recruited to kinetochores independently of CENP-A.	131
400806	pfam08642	Rxt3	Histone deacetylation protein Rxt3. Rxt3 has been shown in yeast to be required for histone deacetylation.	113
370028	pfam08643	DUF1776	Fungal family of unknown function (DUF1776). This is a fungal family of unknown function. One of the proteins in this family YSC83 has been localized to the mitochondria.	295
400807	pfam08644	SPT16	FACT complex subunit (SPT16/CDC68). Proteins in this family are subunits of the FACT complex. The FACT complex plays a role in transcription initiation and promotes binding of TATA-binding protein (TBP) to a TATA box in chromatin.	151
370030	pfam08645	PNK3P	Polynucleotide kinase 3 phosphatase. Polynucleotide kinase 3 phosphatases play a role in the repair of single breaks in DNA induced by DNA-damaging agents such as gamma radiation and camptothecin.	161
400808	pfam08646	Rep_fac-A_C	Replication factor-A C terminal domain. This domain is found at the C terminal of replication factor A. Replication factor A (RPA) binds single-stranded DNA and is involved in replication, repair, and recombination of DNA.	146
400809	pfam08647	BRE1	BRE1 E3 ubiquitin ligase. BRE1 is an E3 ubiquitin ligase that has been shown to act as a transcriptional activator through direct activator interactions.	95
400810	pfam08648	DUF1777	Protein of unknown function (DUF1777). This is a family of eukaryotic proteins of unknown function. Some of the proteins in this family are putative nucleic acid binding proteins.	56
400811	pfam08649	DASH_Dad1	DASH complex subunit Dad1. The DASH complex is a ~10 subunit microtubule-binding complex that is transferred to the kinetochore prior to mitosis. In Saccharomyces cerevisiae DASH forms both rings and spiral structures on microtubules in vitro. Components of the DASH complex, including Dam1, Duo1, Spc34, Dad1 and Ask1, are essential and connect the centromere to the plus end of spindle microtubules. Dad1 remains bound to kinetochores throughout the cell cycle and its association is dependent on Mis6 and Mal2.	55
400812	pfam08650	DASH_Dad4	DASH complex subunit Dad4. The DASH complex is a ~10 subunit microtubule-binding complex that is transferred to the kinetochore prior to mitosis. In Saccharomyces cerevisiae DASH forms both rings and spiral structures on microtubules in vitro.	71
400813	pfam08651	DASH_Duo1	DASH complex subunit Duo1. The DASH complex is a ~10 subunit microtubule-binding complex that is transferred to the kinetochore prior to mitosis. In Saccharomyces cerevisiae DASH forms both rings and spiral structures on microtubules in vitro.	72
400814	pfam08652	RAI1	RAI1 like PD-(D/E)XK nuclease. RAI1 is homologous to Caenorhabditis elegans DOM-3 and human DOM3Z and binds to a nuclear exoribonuclease. It is required for 5.8S rRNA processing. Profile-profile comparison tools demonstrate this to be a PD-(D/E)XK nuclease, with a full set of canonical active site signature motifs characteristic to the PD-(D/E)XK nuclease superfamily.	69
400815	pfam08653	DASH_Dam1	DASH complex subunit Dam1. The DASH complex is a ~10 subunit microtubule-binding complex that is transferred to the kinetochore prior to mitosis. In Saccharomyces cerevisiae DASH forms both rings and spiral structures on microtubules in vitro. Components of the DASH complex, including Dam1, Duo1, Spc34, Dad1 and Ask1, are essential and connect the centromere to the plus end of spindle microtubules.	56
400816	pfam08654	DASH_Dad2	DASH complex subunit Dad2. The DASH complex is a ~10 subunit microtubule-binding complex that is transferred to the kinetochore prior to mitosis. In Saccharomyces cerevisiae DASH forms both rings and spiral structures on microtubules in vitro.	99
400817	pfam08655	DASH_Ask1	DASH complex subunit Ask1. The DASH complex is a ~10 subunit microtubule-binding complex that is transferred to the kinetochore prior to mitosis. In Saccharomyces cerevisiae DASH forms both rings and spiral structures on microtubules in vitro. Components of the DASH complex, including Dam1, Duo1, Spc34, Dad1 and Ask1, are essential and connect the centromere to the plus end of spindle microtubules.	64
400818	pfam08656	DASH_Dad3	DASH complex subunit Dad3. The DASH complex is a ~10 subunit microtubule-binding complex that is transferred to the kinetochore prior to mitosis. In Saccharomyces cerevisiae DASH forms both rings and spiral structures on microtubules in vitro.	75
400819	pfam08657	DASH_Spc34	DASH complex subunit Spc34. The DASH complex is a ~10 subunit microtubule-binding complex that is transferred to the kinetochore prior to mitosis. In Saccharomyces cerevisiae DASH forms both rings and spiral structures on microtubules in vitro. Components of the DASH complex, including Dam1, Duo1, Spc34, Dad1 and Ask1, are essential and connect the centromere to the plus end of spindle microtubules.	279
400820	pfam08658	Rad54_N	Rad54 N terminal. This is the N terminal of the DNA repair protein Rad54.	180
370044	pfam08659	KR	KR domain. This enzymatic domain is part of bacterial polyketide synthases and catalyzes the first step in the reductive modification of the beta-carbonyl centers in the growing polyketide chain. It uses NADPH to reduce the keto group to a hydroxy group.	180
370045	pfam08660	Alg14	Oligosaccharide biosynthesis protein Alg14 like. Alg14 is involved in dolichol-linked oligosaccharide biosynthesis and anchors the catalytic subunit Alg13 to the ER membrane.	171
400821	pfam08661	Rep_fac-A_3	Replication factor A protein 3. Replication factor A is involved in eukaryotic DNA replication, recombination and repair.	105
400822	pfam08662	eIF2A	Eukaryotic translation initiation factor eIF2A. This is a family of eukaryotic translation initiation factors.	194
400823	pfam08663	HalX	HalX domain. HalX is a domain of unknown function, previously (mis)annotated as HoxA-like transcriptional regulator.	68
400824	pfam08664	YcbB	YcbB domain. YcbB is a DNA-binding domain.	136
400825	pfam08665	PglZ	PglZ domain. This family is a member of the Alkaline phosphatase clan.	176
400826	pfam08666	SAF	SAF domain. This domain family includes a range of different proteins, such as antifreeze proteins, flagellar FlgA proteins, and CpaB pilus proteins.	61
400827	pfam08667	BetR	BetR domain. This family includes an N-terminal helix-turn-helix domain.	148
400828	pfam08668	HDOD	HDOD domain. 	196
400829	pfam08669	GCV_T_C	Glycine cleavage T-protein C-terminal barrel domain. This is a family of glycine cleavage T-proteins, part of the glycine cleavage multienzyme complex (GCV) found in bacteria and the mitochondria of eukaryotes. GCV catalyzes the catabolism of glycine in eukaryotes. The T-protein is an aminomethyl transferase.	80
400830	pfam08670	MEKHLA	MEKHLA domain. The MEKHLA domain shares similarity with the PAS domain and is found in the 3' end of plant HD-ZIP III homeobox genes, and bacterial proteins.	141
400831	pfam08671	SinI	Anti-repressor SinI. SinR is a pleiotropic regulator of several late growth processes. It is a tetrameric DNA binding protein whose activity is down-regulated through the formation of a SinI:SinR protein complex. When complexed with SinI, the SinR tetramer is disrupted such that it is no longer able to bind DNA.	28
400832	pfam08672	ANAPC2	Anaphase promoting complex (APC) subunit 2. The anaphase promoting complex or cyclosome (APC2) is an E3 ubiquitin ligase which is part of the SCF family of ubiquitin ligases. Ubiquitin ligases catalyze the transfer of ubiquitin from the ubiquitin conjugating enzyme (E2), to the substrate protein.	60
400833	pfam08673	RsbU_N	Phosphoserine phosphatase RsbU, N-terminal domain. RsbU is a phosphoserine phosphatase which acts as a positive regulator of the general stress-response factor of gram positive organisms, sigma-B. The phosphatase activity of RsbU is stimulated by association with the RsbT kinase. Deletions in the N terminal domain are deleterious to the activity of RsbU.	75
400834	pfam08674	AChE_tetra	Acetylcholinesterase tetramerisation domain. The acetylcholinesterase tetramerisation domain is found at the C-terminus and forms a left handed superhelix.	35
400835	pfam08675	RNA_bind	RNA binding domain. This domain corresponds to the RNA binding domain of Poly(A)-specific ribonuclease (PARN).	75
400836	pfam08676	MutL_C	MutL C terminal dimerization domain. MutL and MutS are key components of the DNA repair machinery that corrects replication errors. MutS recognizes mispaired or unpaired bases in a DNA duplex and in the presence of ATP, recruits MutL to form a DNA signaling complex for repair. The N terminal region of MutL contains the ATPase domain and the C terminal is involved in dimerization.	144
400837	pfam08677	GP11	GP11 baseplate wedge protein. GP11 is a viral structural protein that connects short tail fibers to the baseplate. The tail region is responsible for attachment to the host bacteria during infection.	252
400838	pfam08678	Rsbr_N	Rsbr N terminal. Rsbr is a regulator of the RNA polymerase sigma factor subunit sigma(B). The structure of the N terminal domain belongs to the globin fold superfamily.	130
370055	pfam08679	DsrD	Dissimilatory sulfite reductase D (DsrD). The structure of the DsrD protein has shown it to contain a winged-helix motif similar to those found in DNA binding proteins. The structure suggests a possible role for DsrD in transcription or translation of genes which catalyze dissimilatory sulfite reduction.	64
400839	pfam08680	DUF1779	TATA-box binding. TATA-box_bdg is a family of bacterial proteins. YwmB from Bacillus subtilis contains a circularly permuted TATA-box binding protein-like fold. Jian-Xiang Liu, Qi Xie, Jun Lin. Protein Structural Data Mining and Evolutionary Bioinformatic Analysis on Domains of TATA-box Binding Protein-like Fold. Life Science Journal 2014; 11(2): 298-302 (not yet in PubMed 27-02-2014).	184
400840	pfam08681	DUF1778	Protein of unknown function (DUF1778). This is a family of uncharacterized proteins. The structure of one of the hypothetical proteins in this family has been solved and it forms a helix structure which may form interactions with DNA.	80
285845	pfam08682	DUF1780	Putative endonuclease, protein of unknown function (DUF1780). This is a family of uncharacterized proteins. The structure of a hypothetical protein from Pseudomonas aeruginosa has shown it to adopt an alpha/beta fold, placing it in the Endonuclease superfamily/clan of restriction endonucleases.	208
400841	pfam08683	CAMSAP_CKK	Microtubule-binding calmodulin-regulated spectrin-associated. This is the C-terminal domain of a family of eumetazoan proteins collectively defined as calmodulin-regulated spectrin-associated, or CAMSAP, proteins. CAMSAP proteins carry an N-terminal region that includes the CH domain, a central region including a predicted coiled-coil and this C-terminal, or CKK, domain - defined as being present in CAMSAP, KIAA1078 and KIAA1543. The C-terminal domain is the part of the CAMSAP proteins that binds to microtubules. The domain appears to act by producing inhibition of neurite extension, probably by blocking microtubule function. CKK represents a domain that has evolved with the metazoa. The structure of a murine hypothetical protein from RIKEN cDNA has shown the domain to adopt a mainly beta barrel structure with an associated alpha-helical hairpin.	119
400842	pfam08684	ocr	DNA mimic ocr. The structure of an ocr protein from bacteriophage T7 has shown that this protein mimics the size and shape of a bent DNA molecule. ocr has also been shown to be an inhibitor of the complex type I DNA restriction enzymes.	100
400843	pfam08685	GON	GON domain. The GON domain is found in the ADAMTS (a disintegrin and metalloproteinase domain with thrombospondin type-1 modules) family of proteins. It contains several conserved cysteine residues.	198
400844	pfam08686	PLAC	PLAC (protease and lacunin) domain. The PLAC (protease and lacunin) domain is a short six-cysteine region that is usually found at the C terminal of proteins. It is found in a range of proteins including PACE4 (paired basic amino acid cleaving enzyme 4) and the extracellular matrix protein lacunin.	31
400845	pfam08687	ASD2	Apx/Shroom domain ASD2. This region is found in the actin binding protein Shroom which mediates apical constriction in epithelial cells and is required for neural tube closure.	288
400846	pfam08688	ASD1	Apx/Shroom domain ASD1. This region is found in the actin binding protein Shroom which mediates apical constriction in epithelial cells and is required for neural tube closure. ASD1 has been implicated directly in F-actin binding.	170
370064	pfam08689	Med5	Mediator complex subunit Med5. The mediator complex is required for the expression of nearly all RNA pol II dependent genes in Saccharomyces cerevisiae. Deletion of the MED5 gene leads to increased transcription of nuclear genes encoding components of the oxidative phosphorylation machinery, and decreased transcription of mitochondrial genes encoding components of the same machinery. There is no orthologue from pombe, and this subunit appears to be fungal specific.	1082
400847	pfam08690	GET2	GET complex subunit GET2. This family corresponds to the GET complex subunit GET2. The GET complex is involved in the retrieval of ER resident proteins from the Golgi.	308
370066	pfam08691	Nse5	DNA repair proteins Nse5 and Nse6. Nse5 and Nse6 are non essential nuclear proteins that are critical for chromosome segregation in fission yeast. Nse5 forms a dimer with Nse6 and facilitates DNA repair as part of the Smc5-Smc6 holocomplex.	503
400848	pfam08692	Pet20	Mitochondrial protein Pet20. Pet20 is a mitochondrial protein which is thought to play a role in the correct assembly/maintenance of mitochondrial components.	134
400849	pfam08693	SKG6	Transmembrane alpha-helix domain. SKG6/Axl2 are membrane proteins that show polarised intracellular localization. SKG6_Tmem is the highly conserved transmembrane alpha-helical domain of SKG6 and Axl2 proteins. The full-length fungal protein has a negative regulatory function in cytokinesis.	38
400850	pfam08694	UFC1	Ubiquitin-fold modifier-conjugating enzyme 1. Ubiquitin-like (UBL) post-translational modifiers are covalently linked to most, if not all, target protein(s) through an enzymatic cascade analogous to ubiquitylation, consisting of E1 (activating), E2 (conjugating), and E3 (ligating) enzymes. Ubiquitin-fold modifier 1 (Ufm1) a ubiquitin-like protein is activated by a novel E1-like enzyme, Uba5, by forming a high-energy thioester bond. Activated Ufm1 is then transferred to its cognate E2-like enzyme, Ufc1, in a similar thioester linkage. This family represents the E2-like enzyme.	155
400851	pfam08695	Coa1	Cytochrome oxidase complex assembly protein 1. Coa1 is an inner mitochondrial membrane protein that associates with Shy1 and is required for cytochrome oxidase complex IV assembly. It contains a conserved hydrophobic segment (amino acids 74-92) with the potential to form a membrane-spanning helix. The N-terminus of Coa1 is rich in positively charged amino acids and could form an amphipathic alpha helix, characteristic of a mitochondrial presequence. A cleavage site for the mitochondrial processing peptidase is predicted adjacent to the presequence. Upon in vitro import into mitochondria, Coa1 is processed to a mature form, indicating that it possesses a cleavable presequence. The eukaryotic cytochrome oxidase complex consists of 12-13 subunits, with three mitochondrial encoded subunits, Cox1-Cox3, forming the core enzyme. Translation of the Cox1 transcript requires the two promoters, Pet309 and Mss51, and the latter has an additional role in translational elongation. Coa1 is necessary for linking the activity of Mss51 to Cox1 insertion into the assembly complex.	117
400852	pfam08696	Dna2	DNA replication factor Dna2. Dna2 is a DNA replication factor with single-stranded DNA-dependent ATPase, ATP-dependent nuclease (5'-flap endonuclease) and helicase activities. It is required for Okazaki fragment processing and is involved in DNA repair pathways.	203
400853	pfam08698	Fcf2	Fcf2 pre-rRNA processing. This is a family of eukaryotic nucleolar proteins that are involved in pre-rRNA processing.	94
400854	pfam08699	ArgoL1	Argonaute linker 1 domain. ArgoL1 is a region found in argonaute proteins. It normally co-occurs with pfam02179 and pfam02171. It is a linker region between the N-terminal and the PAZ domains. It contains an alpha-helix packed against a three-stranded antiparallel beta-sheet with two long beta-strands (beta8 and beta9) of the sheet spanning one face of the adjacent N and PAZ domains. L1 together with linker 2, L2, PAZ and ArgoN forms a compact global fold.	52
400855	pfam08700	Vps51	Vps51/Vps67. This family includes a presumed domain found in a number of components of vesicular transport. The VFT tethering complex (also known as GARP complex, Golgi associated retrograde protein complex, Vps53 tethering complex) is a conserved eukaryotic docking complex which is involved in recycling of proteins from endosomes to the late Golgi. Vps51 (also known as Vps67) is a subunit of VFT and interacts with the SNARE Tlg1. Cog1_N is the N-terminus of the Cog1 subunit of the eight-unit Conserved Oligomeric Golgi (COG) complex that participates in retrograde vesicular transport and is required to maintain normal Golgi structure and function. The subunits are located in two lobes and Cog1 serves to bind the two lobes together probably via the highly conserved N-terminal domain of approximately 85 residues.	86
400856	pfam08701	GN3L_Grn1	GNL3L/Grn1 putative GTPase. Grn1 (yeast) and GNL3L (human) are putative GTPases which are required for growth and play a role in processing of nucleolar pre-rRNA. This family contains a potential nuclear localization signal.	74
400857	pfam08702	Fib_alpha	Fibrinogen alpha/beta chain family. Fibrinogen is a protein involved in platelet aggregation and is essential for the coagulation of blood. This domain forms part of the central coiled coil region of the protein which is formed from two sets of three non-identical chains (alpha, beta and gamma).	142
400858	pfam08703	PLC-beta_C	PLC-beta C terminal. This domain corresponds to the alpha helical C terminal domain of phospholipase C beta.	176
312288	pfam08704	GCD14	tRNA methyltransferase complex GCD14 subunit. GCD14 is a subunit of the tRNA methyltransferase complex and is required for 1-methyladenosine modification and maturation of initiator methionyl-tRNA.	242
312289	pfam08705	Gag_p6	Gag protein p6. HIV protein p6 contains two late-budding domains (L domains) which are short sequence motifs essential for viral particle release. p6 interacts with the endosomal sorting complex and represents a docking site for several cellular and binding factors. The PTAP motif interacts with the cellular budding factor TSG101. This domain is also found in some chimpanzee immunodeficiency virus (SIV-cpz) proteins.	37
378029	pfam08706	D5_N	D5 N terminal like. This domain is found in D5 proteins of DNA viruses and in bacteriophage P4 DNA primases.	145
400859	pfam08707	PriCT_2	Primase C terminal 2 (PriCT-2). This alpha helical domain is found at the C terminal of primases.	76
400860	pfam08708	PriCT_1	Primase C terminal 1 (PriCT-1). This alpha helical domain is found at the C terminal of primases.	64
400861	pfam08709	Ins145_P3_rec	Inositol 1,4,5-trisphosphate/ryanodine receptor. This domain corresponds to the ligand binding region on inositol 1,4,5-trisphosphate receptor, and the N terminal region of the ryanodine receptor. Both receptors are involved in Ca2+ release. They can couple to the activation of neurotransmitter-gated receptors and voltage-gated Ca2+ channels on the plasma membrane, thus allowing the endoplasmic reticulum to discriminate between different types of neuronal activity.	213
285872	pfam08710	nsp9	nsp9 replicase. nsp9 is a single-stranded RNA-binding viral protein likely to be involved in RNA synthesis. Its structure comprises a single beta barrel.	111
400862	pfam08711	Med26	TFIIS helical bundle-like domain. Mediator is a large complex of up to 33 proteins that is conserved from plants to fungi to humans - the number and representation of individual subunits varying with species. It is arranged into four different sections, a core, a head, a tail and a kinase-activity part, and the number of subunits within each of these is what varies with species. Overall, Mediator regulates the transcriptional activity of RNA polymerase II but it would appear that each of the four different sections has a slightly different function. Mediator exists in two major forms in human cells: a smaller form that interacts strongly with pol II and activates transcription, and a large form that does not interact strongly with pol II and does not directly activate transcription. Notably, the 'small' and 'large' Mediator complexes differ in their subunit composition: the Med26 subunit preferentially associates with the small, active complex, whereas cdk8, cyclin C, Med12 and Med13 associate with the large Mediator complex. This family includes the C terminal region of a number of eukaryotic hypothetical proteins which are homologous to the Saccharomyces cerevisiae protein IWS1. IWS1 is known to be a Pol II transcription elongation factor and interacts with Spt6 and Spt5.	52
400863	pfam08712	Nfu_N	Scaffold protein Nfu/NifU N terminal. This domain is found at the N-terminus of NifU and NifU related proteins, and in the human Nfu protein. Both of these proteins are thought to be involved in the assembly of iron-sulphur clusters.	81
378033	pfam08713	DNA_alkylation	DNA alkylation repair enzyme. Proteins in this family are predicted to be DNA alkylation repair enzymes. The structure of a hypothetical protein in this family shows it to adopt a supercoiled alpha helical structure.	212
400864	pfam08714	Fae	Formaldehyde-activating enzyme (Fae). Formaldehyde-activating enzyme is an enzyme required for energy metabolism and formaldehyde detoxification. It catalyzes the condensation of formaldehyde and tetrahydromethanopterin to methylene tetrahydromethanopterin.	158
400865	pfam08715	Viral_protease	Papain like viral protease. This family of viral proteases are similar to the papain protease and are required for proteolytic processing of the replicase polyprotein. The structure of this protein has shown it adopts a fold similar to that of de-ubiquitinating enzymes.	320
285878	pfam08716	nsp7	nsp7 replicase. nsp7 (non structural protein 7) has been implicated in viral RNA replication and is predominantly alpha helical in structure. It forms a hexadecameric supercomplex with nsp8 that adopts a hollow cylinder-like structure. The dimensions of the central channel and positive electrostatic properties of the cylinder imply that it confers processivity on RNA-dependent RNA polymerase.	83
400866	pfam08717	nsp8	nsp8 replicase. Viral nsp8 (non structural protein 8) forms a hexadecameric supercomplex with nsp7 that adopts a hollow cylinder-like structure. The dimensions of the central channel and positive electrostatic properties of the cylinder imply that it confers processivity on RNA-dependent RNA polymerase.	197
400867	pfam08718	GLTP	Glycolipid transfer protein (GLTP). GLTP is a cytosolic protein that catalyzes the intermembrane transfer of glycolipids.	137
400868	pfam08719	DUF1768	Domain of unknown function (DUF1768). This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate.	155
72144	pfam08720	Hema_stalk	Influenza C hemagglutinin stalk. This domain corresponds to the stalk segment of hemagglutinin in influenza C virus. It forms a coiled coil structure.	175
400869	pfam08721	Tn7_Tnp_TnsA_C	TnsA endonuclease C terminal. The Tn7 transposase is composed of proteins TnsA and TnsB. DNA breakage at the 5' end of the transposon is carried out by TnsA, and breakage and joining at the 3' end is carried out by TnsB. The C terminal domain of TnsA binds DNA.	83
400870	pfam08722	Tn7_Tnp_TnsA_N	TnsA endonuclease N terminal. The Tn7 transposase is composed of proteins TnsA and TnsB. DNA breakage at the 5' end of the transposon is carried out by TnsA, and breakage and joining at the 3' end is carried out by TnsB. The N terminal domain of TnsA is catalytic.	83
72147	pfam08723	Gag_p15	Gag protein p15. Gag p15 is a viral membrane-binding matrix protein which is alpha helical in structure.	123
312303	pfam08724	Rep_N	Rep protein catalytic domain like. Adeno-associated virus (AAV) Replication (Rep) protein is essential for viral replication and integration. The catalytic domain has DNA binding and endonuclease activity.	187
400871	pfam08725	Integrin_b_cyt	Integrin beta cytoplasmic domain. Integrins are a group of transmembrane proteins which function as extracellular matrix receptors and in cell adhesion. Integrins are ubiquitously expressed and are heterodimeric, each composed of an alpha and beta subunit. Several variations of the alpha and beta subunits exist, and association of different alpha and beta subunits can have a different binding specificity. This domain corresponds to the cytoplasmic domain of the beta subunit.	44
400872	pfam08726	EFhand_Ca_insen	Ca2+ insensitive EF hand. EF hands are helix-loop-helix binding motifs involved in the regulation of many cellular processes. EF hands usually bind to Ca2+ ions which causes a major conformational change that allows the protein to interact with its designated targets. This domain corresponds to an EF hand which has partially or entirely lost its calcium-binding properties. The calcium insensitive EF hand is still able to mediate protein-protein recognition.	69
400873	pfam08727	P3A	Poliovirus 3A protein like. This domain is found in positive-strand RNA viruses. The 3A protein is a critical component of the poliovirus replication complex, and is also an inhibitor of host cell ER to Golgi transport.	59
400874	pfam08728	CRT10	CRT10. CRT10 is a transcriptional regulator of ribonucleotide reductase (RNR) genes. RNR catalyzes the rate limiting step in dNTP synthesis. Mutations in CRT10 have been shown to enhance hydroxyurea resistance.	618
400875	pfam08729	HUN	HPC2 and ubinuclein domain. HPC2 (Histone promoter control 2) is required for cell-cycle regulation of histone transcription. It regulates transcription of the histone genes during the S-phase of the cell cycle by repressing transcription at other cell cycle stages. HPC2 mutants display synthetic interactions with FACT complex which allows RNA Pol II to elongate through nucleosomes. Hpc2 is one of the proteins of one of the multi-subunit complexes that mediate replication-independent nucleosome assembly, along with histone chaperone proteins. The Hip4 sequence from Schizosaccharomyces pombe is an integral component of this complex that is required for transcriptional silencing at multiple loci. HPC2, ubinuclein/yemanuclein, and the cell cycle regulator FLJ25778 share a conserved domain that is predicted to bind histone tails. This domain is also referred to as the HRD or Hpc2-related domain.	52
400876	pfam08730	Rad33	Rad33. Rad33 is involved in nucleotide excision repair (NER). NER is the main pathway for repairing DNA lesions induced by UV. Cells deleted for RAD33 display intermediate UV sensitivity that is epistatic with NER.	165
400877	pfam08731	AFT	Transcription factor AFT. AFT (activator of iron transcription) is an iron regulated transcriptional activator that regulates the expression of genes involved in iron homeostasis. This family includes the paralogous pair of transcription factors AFT1 and AFT2.	91
400878	pfam08732	HIM1	HIM1. HIM1 (high induction of mutagenesis protein 1) plays a role in the control of spontaneous and induced mutagenesis. It is thought to participate in the control of processing of mutational intermediates appearing during error-prone bypass of DNA damage.	169
400879	pfam08733	PalH	PalH/RIM21. PalH (also known as RIM21) is a transmembrane protein required for proteolytic cleavage of Rim101/PacC transcription factors which are activated by C terminal proteolytic processing. Rim101/PacC family proteins play a key role in pH-dependent responses and PalH has been implicated as a pH sensor.	335
400880	pfam08734	GYD	GYD domain. This protein is found in a range of bacteria. It is usually less than 100 amino acids in length. The function of the protein is unknown. It may belong to the dimeric alpha/beta barrel superfamily.	89
400881	pfam08735	DUF1786	Putative pyruvate formate-lyase activating enzyme (DUF1786). This family is annotated as pyruvate formate-lyase activating enzyme (EC:1.97.1.4) in UniProt. It is not clear where this annotation comes from.	251
400882	pfam08736	FA	FERM adjacent (FA). This region is found adjacent to Band 4.1 / FERM domains (pfam00373) in a subset of FERM containing proteins. The region has been hypothesized to play a role in regulatory adaptation, based on similarity to other protein kinase substrates.	44
400883	pfam08737	Rgp1	Rgp1. Rgp1 forms a heterodimer with Ric1 (pfam07064) which associates with Golgi membranes and functions as a guanyl-nucleotide exchange factor.	414
370092	pfam08738	Gon7	Gon7 family. In S. cerevisiae Gon7 is a member of the KEOPS protein complex, a complex proposed to be involved in transcription and in promoting telomere uncapping and telomere elongation.	105
400884	pfam08740	BCS1_N	BCS1 N terminal. This domain is found at the N terminal of the mitochondrial ATPase BCS1. It encodes the import and intramitochondrial sorting for the protein.	179
400885	pfam08741	YwhD	YwhD family. This family of proteins is currently uncharacterized. They are around 170 amino acids in length.	162
400886	pfam08742	C8	C8 domain. This domain contains 8 conserved cysteine residues, but this family only contains 7 of them to avoid overlaps with other domains. It is found in disease-related proteins including von Willebrand factor, Alpha tectorin, Zonadhesin and Mucin. It is often found on proteins containing pfam00094 and pfam01826.	68
400887	pfam08743	Nse4_C	Nse4 C-terminal. Nse4 is a component of the Smc5/6 DNA repair complex. It forms interactions with Smc5 and Nse1. The exact function of this highly conserved C-terminal domain is not known.	87
285901	pfam08744	NOZZLE	Plant transcription factor NOZZLE. NOZZLE is a transcription factor that plays a role in patterning the proximal-distal and adaxial-abaxial axes.	335
400888	pfam08745	PIN_5	PINc domain ribonuclease. This is a family of bacterial and archaeal PINc domains. PIN domains are characterized by the conservation of three acidic residues, possibly four, an Asp at residue 13, a Glu at 63, and then Asps at 172 and 194 in UniProtKB:Q58360.	206
400889	pfam08746	zf-RING-like	RING-like domain. This is a zinc finger domain that is related to the C3HC4 RING finger domain (pfam00097).	43
400890	pfam08747	DUF1788	Domain of unknown function (DUF1788). Putative uncharacterized domain in proteins of length around 200 amino acids.	119
370098	pfam08748	Phage_TAC_4	Phage tail assembly chaperone. This is a family of phage tail assembly chaperone proteins largely from phage T1 Gp40.	124
400891	pfam08750	CNP1	CNP1-like family. This family of proteins are likely to be lipoproteins. CNP1 (cryptic neisserial protein) has been expressed in E. coli and shown to be localized to the periplasm.	135
400892	pfam08751	TrwC	TrwC relaxase. Relaxases are DNA strand transferases which function during the conjugative cell to cell DNA transfer. TrwC binds to the origin of transfer (oriT) and melts the double helix.	283
400893	pfam08752	COP-gamma_platf	Coatomer gamma subunit appendage platform subdomain. COPI-coated vesicles function in retrograde transport from the Golgi to the ER, and in intra-Golgi transport. This is the platform subdomain of the coatomer gamma subunit appendage domain. It carries a protein-protein interaction site at UniProt:P53620, residue W776, which in yeast binds to the ARFGAP Glo3p, and in mammalian gamma-COP binds to a Glo3p orthologue, ARFGAP2.	149
400894	pfam08753	NikR_C	NikR C terminal nickel binding domain. NikR is a transcription factor that regulates nickel uptake. It consists of two dimeric DNA binding domains separated by a tetrameric regulatory domain that binds nickel. This domain corresponds to the C terminal regulatory domain which contains four nickel binding sites at the tetramer interface.	74
400895	pfam08755	YccV-like	Hemimethylated DNA-binding protein YccV like. YccV is a hemimethylated DNA binding protein which has been shown to regulate dnaA gene expression. The structure of one of the hypothetical proteins in this family has been solved and it forms a beta sheet structure with a terminating alpha helix.	94
378044	pfam08756	YfkB	YfkB-like domain. This protein is adjacent to YfkA in B. subtilis. In other bacterial species it is fused to this protein. As YfkA contains a Radical SAM domain, this suggests that this domain interacts with it.	149
400896	pfam08757	CotH	CotH kinase protein. Members of this family include the spore coat protein H (cotH). This protein is an atypical protein kinase that phosphorylates CotB and CotG.	318
400897	pfam08758	Cadherin_pro	Cadherin prodomain like. Cadherins are a family of proteins that mediate calcium dependent cell-cell adhesion. They are activated through cleavage of a prosequence in the late Golgi. This domain corresponds to the folded region of the prosequence, and is termed the prodomain. The prodomain shows structural resemblance to the cadherin domain, but lacks all the features known to be important for cadherin-cadherin interactions.	90
400898	pfam08759	GT-D	Glycosyltransferase GT-D fold. This domain is found at the C-terminus of proteins such as the probable glycosyltransferase Gly that also contain the glycosyl transferase domain at the N-terminus. It is also found N-terminal in numerous putative glycosyltransferases such as GalT1. GalT1 has been shown to catalyze the third step of Fap1 glycosylation. This domain is structurally distinct from all known GT folds of glycosyltransferases and contains a metal binding site. This new glycosyltransferase fold has been named GT-D.	223
400899	pfam08760	DUF1793	Domain of unknown function (DUF1793). This presumed domain is found at the C-terminus of a glutaminase protein from fungi. This domain is also found as a single domain protein in Bacteroides thetaiotaomicron.	169
400900	pfam08761	dUTPase_2	dUTPase. 2'-Deoxyuridine 5'-triphosphate nucleotidohydrolase (dUTPase) catalyzes the hydrolysis of dUTP to dUMP and pyrophosphate (EC:3.6.1.23). Members of this family have a novel all-alpha fold and are unrelated to the all-beta fold found in dUTPases of the majority of organisms. This family contains both dUTPases and homologs of dUTPase, including dCTPase of phage T4.	162
370104	pfam08762	CRPV_capsid	CRPV capsid protein like. This is a family of capsid proteins found in positive stranded ssRNA viruses such as cricket paralysis virus (CRPV). It forms an all beta sheet structure.	198
400901	pfam08763	Ca_chan_IQ	Voltage gated calcium channel IQ domain. Voltage gated calcium channels control cellular calcium entry in response to changes in membrane potential. The isoleucine-glutamine (IQ) motif in the voltage gated calcium channel IQ domain interacts with hydrophobic pockets of Ca2+/calmodulin. The interaction regulates two self-regulatory calcium-dependent feedback mechanisms: calcium-dependent inactivation (CDI) and calcium-dependent facilitation (CDF).	75
312335	pfam08764	Coagulase	Staphylococcus aureus coagulase. Staphylococcus aureus secretes a cofactor called coagulase. Coagulase is an extracellular protein that forms a complex with human prothrombin, and activates it without the usual proteolytic cleavages. The resulting complex directly initiates blood clotting.	279
400902	pfam08765	Mor	Mor transcription activator family. Mor (Middle operon regulator) is a sequence specific DNA binding protein. It mediates transcription activation through its interactions with the C-terminal domains of the alpha and sigma subunits of bacterial RNA polymerase. The N terminal region of Mor is the dimerization region, and the C terminal contains a helix-turn-helix motif which binds DNA.	107
400903	pfam08766	DEK_C	DEK C terminal domain. DEK is a chromatin associated protein that is linked with cancers and autoimmune disease. This domain is found at the C terminal of DEK and is of clinical importance since it can reverse the characteristic abnormal DNA-mutagen sensitivity in fibroblasts from ataxia-telangiectasia (A-T) patients. The structure of this domain shows it to be homologous to the E2F/DP transcription factor family. This domain is also found in chitin synthase proteins and in protein phosphatases.	54
370107	pfam08767	CRM1_C	CRM1 C terminal. CRM1 (also known as Exportin1) mediates the nuclear export of proteins bearing a leucine-rich nuclear export signal (NES). CRM1 forms a complex with the NES containing protein and the small GTPase Ran. This region forms an alpha helical structure formed by six helical hairpin motifs that are structurally similar to the HEAT repeat, but share little sequence similarity to the HEAT repeat.	323
400904	pfam08768	DUF1794	Domain of unknown function (DUF1794). This domain forms a beta barrel structure but the function is unknown. The GO annotation for this protein indicates that the protein has a function in nematode larval development and has a positive regulation on growth rate.	150
400905	pfam08769	Spo0A_C	Sporulation initiation factor Spo0A C terminal. The response regulator Spo0A is comprised of a phosphoacceptor domain and a transcription activation domain. This domain corresponds to the transcription activation domain and forms an alpha helical structure comprising 6 alpha helices. The structure contains a helix-turn-helix and binds DNA.	104
400906	pfam08770	SoxZ	Sulphur oxidation protein SoxZ. SoxZ forms an anti parallel beta structure and forms a complex with SoxY. Sulphur oxidation occurs at the thiol of a conserved cysteine residue of the SoxY subunit.	94
400907	pfam08771	FRB_dom	FKBP12-rapamycin binding domain. The macrolide antibiotic rapamycin and the cytosol protein FKBP12 can form a complex which specifically inhibits the TORC1 complex, leading to growth arrest. The FKBP12-rapamycin complex interferes with TORC1 function by binding to the FKBP12-rapamycin binding domain (FRB) of the TOR proteins. This entry represents the FRB domain.	98
400908	pfam08772	NOB1_Zn_bind	Nin one binding (NOB1) Zn-ribbon like. This domain corresponds to a zinc ribbon and is found on the RNA binding protein NOB1 (Nin one binding).	72
400909	pfam08773	CathepsinC_exc	Cathepsin C exclusion domain. Cathepsin C (dipeptidyl peptidase I) is the physiological activator of a group of serine proteases. This domain corresponds to the exclusion domain whose structure excludes the approach of a polypeptide apart from its termini. It forms an enclosed beta barrel structure composed from 8 anti-parallel beta strands. Based on a structural comparison and interaction data, it is suggested that the exclusion domain originates from a metallo-protease inhibitor.	118
400910	pfam08774	VRR_NUC	VRR-NUC domain. 	114
400911	pfam08775	ParB	ParB family. ParB is a component of the par system which mediates accurate DNA partition during cell division. It recognizes A-box and B-box DNA motifs. ParB forms an asymmetric dimer with 2 extended helix-turn-helix (HTH) motifs that bind to A-boxes. The HTH motifs emanate from a beta sheet coiled coil DNA binding module. Both DNA binding elements are free to rotate around a flexible linker; this enables them to bind to complex arrays of A- and B-box elements on adjacent DNA arms of the looped partition site.	125
400912	pfam08776	VASP_tetra	VASP tetramerisation domain. Vasodilator-stimulated phosphoprotein (VASP) is an actin cytoskeletal regulatory protein. This region corresponds to the tetramerisation domain which forms a right handed alpha helical coiled coil structure.	36
400913	pfam08777	RRM_3	RNA binding motif. This domain is found in protein La which functions as an RNA chaperone during RNA polymerase III transcription, and can also stimulate translation initiation. It contains a five stranded beta sheet which forms an atypical RNA recognition motif.	102
400914	pfam08778	HIF-1a_CTAD	HIF-1 alpha C terminal transactivation domain. Hypoxia inducible factor-1 alpha (HIF-1 alpha) is the regulatory subunit of the heterodimeric transcription factor HIF-1. It plays a key role in cellular response to low oxygen tension. This region corresponds to the C terminal transactivation domain.	36
400915	pfam08779	SARS_X4	SARS coronavirus X4 like. The structure of the coronavirus X4 protein (also known as 7a and U122) shows similarities to the immunoglobulin like fold and suggests a binding activity to integrin I domains. In SARS-CoV-infected cells, the X4 protein is expressed and retained intra-cellularly within the Golgi network. X4 has been implicated to function during the replication cycle of SARS-CoV.	107
400916	pfam08780	NTase_sub_bind	Nucleotidyltransferase substrate binding protein like. Nucleotidyltransferases (EC 2.7.7) comprise a large enzyme family with diverse roles in polynucleotide synthesis and modification. This domain is structurally related to kanamycin nucleotidyltransferase (KNTase) and forms a complex with HI0073, a sequence homolog of the nucleotide-binding domain of this nucleotidyltransferase superfamily.	126
400917	pfam08781	DP	Transcription factor DP. DP forms a heterodimer with E2F and regulates genes involved in cell cycle progression. The transcriptional activity of E2F is inhibited by the retinoblastoma protein which binds to the E2F-DP heterodimer and negatively regulates the G1-S transition.	138
400918	pfam08782	c-SKI_SMAD_bind	c-SKI Smad4 binding domain. c-SKI is an oncoprotein that inhibits TGF-beta signaling through interaction with Smad proteins. This domain binds to Smad4.	91
400919	pfam08783	DWNN	DWNN domain. DWNN is a ubiquitin like domain found at the N-terminus of the RBBP6 family of splicing-associated proteins. The DWNN domain is independently expressed in higher vertebrates so it may function as a novel ubiquitin-like modifier of other proteins.	73
400920	pfam08784	RPA_C	Replication protein A C terminal. This domain corresponds to the C terminal of the single stranded DNA binding protein RPA (replication protein A). RPA is involved in many DNA metabolic pathways including DNA replication, DNA repair, recombination, cell cycle and DNA damage checkpoints.	106
400921	pfam08785	Ku_PK_bind	Ku C terminal domain like. The non-homologous end joining (NHEJ) pathway is one method by which double stranded breaks in chromosomal DNA are repaired. Ku is a component of a multi-protein complex that is involved in the NHEJ. Ku has affinity for DNA ends and recruits the DNA-dependent protein kinase catalytic subunit (DNA-PKcs). This domain is found at the C terminal of Ku which binds to DNA-PKcs.	117
400922	pfam08786	DcrB	DcrB. DcrB is a bacterial protein required for phages C1 and C6 adsorption. It may be involved in the opening or formation of diffusion channels in the outer membrane. Structurally, it consists of an antiparallel beta sheet with some alpha helical regions.	126
400923	pfam08787	Alginate_lyase2	Alginate lyase. Alginate lyases are enzymes that degrade the linear polysaccharide alginate. They cleave the glycosidic linkage of alginate through a beta-elimination reaction. This family forms an all beta fold and is different from the all alpha fold of pfam05426.	222
370124	pfam08788	NHR2	NHR2 domain like. The NHR2 (Nervy homology 2) domain is found in the ETO protein where it mediates oligomerization and protein-protein interactions. It forms an alpha-helical tetramer.	67
285942	pfam08789	PBCV_basic_adap	PBCV-specific basic adaptor domain. The small PBCV-specific basic adaptor domain is found fused to S/T protein kinases and the 2-Cysteine domain.	38
400924	pfam08790	zf-LYAR	LYAR-type C2HC zinc finger. This C2HC zinc finger is found in LYAR proteins, which are involved in cell growth regulation.	28
285944	pfam08792	A2L_zn_ribbon	A2L zinc ribbon domain. This zinc ribbon domain is found associated with some viral A2L transcription factors.	33
285945	pfam08793	2C_adapt	2-cysteine adaptor domain. The virus-specific 2-cysteine adaptor domain is found fused to OTU/A20-like peptidases and S/T protein kinases. The domain associations of these proteins indicate that they might function as viral adaptors connecting the kinases and OTU/A20 peptidases to specific targets.	35
400925	pfam08794	Lipoprot_C	Lipoprotein GNA1870 C terminal like. GNA1870 is a surface exposed lipoprotein in Neisseria meningitidis and is a potent antigen of Meningococcus. The structure of the C terminal domain consists of an anti-parallel beta barrel overlaid by a short alpha helical region.	155
400926	pfam08795	DUF1796	Putative papain-like cysteine peptidase (DUF1796). 	165
400927	pfam08796	DUF1797	Protein of unknown function (DUF1797). This is a domain of unknown function. It forms a central anti-parallel beta sheet with flanking alpha helical regions.	67
400928	pfam08797	HIRAN	HIRAN domain. The HIRAN domain (HIP116, Rad5p N-terminal) is found in the N-terminal regions of the SWI2/SNF2 proteins typified by HIP116 and Rad5p. The HIRAN domain is found as a standalone protein in several bacteria and prophages, or fused to other catalytic domains, such as a nuclease of the restriction endonuclease fold and TDP1-like DNA phosphoesterases, in the eukaryotes. It has been predicted that this domain functions as a DNA-binding domain that probably recognizes features associated with damaged DNA or stalled replication forks.	96
400929	pfam08798	CRISPR_assoc	CRISPR associated protein. This domain forms an anti-parallel beta strand structure with flanking alpha helical regions.	216
400930	pfam08799	PRP4	pre-mRNA processing factor 4 (PRP4) like. This small domain is found on PRP4 ribonucleoproteins. PRP4 is a U4/U6 small nuclear ribonucleoprotein that is involved in pre-mRNA processing.	29
400931	pfam08800	VirE_N	VirE N-terminal domain. This presumed domain is found at the N-terminus of VirE proteins.	133
400932	pfam08801	Nucleoporin_N	Nup133 N terminal like. Nup133 is a nucleoporin that is crucial for nuclear pore complex (NPC) biogenesis. The N terminal forms a seven-bladed beta propeller structure. This family now contains other sized nucleoporins, including Nup155, Nup8, Nup132, Nup15 and Nup170.	426
312365	pfam08802	CytB6-F_Fe-S	Cytochrome B6-F complex Fe-S subunit. The cytochrome B6-F complex mediates electron transfer between photosystem II (PSII) and photosystem I (PSI), cyclic electron flow around PSI, and state transitions. This domain corresponds to the alpha helical transmembrane domain of the cytochrome B6-F complex iron-sulphur subunit.	39
400933	pfam08803	ydhR	Putative mono-oxygenase ydhR. ydhR is a homodimeric protein that comprises a central four-stranded beta sheet and four surrounding alpha helices. It shows structural homology to the ActVA-Orf6 and YgiN proteins which indicates it could be a mono-oxygenase.	96
400934	pfam08804	gp32	gp32 DNA binding protein like. gp32 is a single stranded (ss) DNA binding protein in bacteriophage T4 that is essential for DNA replication, recombination and repair. The ssDNA binding cleft of gp32 comprises regions from three structural subdomains.	203
285957	pfam08805	PilS	PilS N terminal. Type IV pili are bacterial virulence-associated adhesins that promote bacterial attachment to host cells. In Salmonella typhi, the structural pilin protein PilS interacts with the cystic fibrosis transmembrane conductance regulator. Mutagenesis studies suggest that residues on an alpha-beta loop and the C terminal disulphide-bonded region of PilS might be involved in binding specificity of the pilus.	137
400935	pfam08806	Sep15_SelM	Sep15/SelM redox domain. Sep15 and SelM are eukaryotic selenoproteins that have a thioredoxin-like domain and a surface accessible active site redox motif. This suggests that they function as thiol-disulphide isomerases involved in disulphide bond formation in the endoplasmic reticulum. Structurally it resembles the thioredoxin-fold.	75
400936	pfam08807	DUF1798	Bacterial domain of unknown function (DUF1798). This domain is found in many hypothetical proteins. The structure of one of the proteins in this family has been solved and it adopts an all alpha helical fold.	108
400937	pfam08808	RES	RES domain. This presumed domain contains 3 highly conserved polar groups that could form an active site. These are an arginine, glutamate and serine, hence the RES domain. The domain is found widely distributed in bacteria. The domain is about 150 residues in length.	154
400938	pfam08809	DUF1799	Phage related hypothetical protein (DUF1799). Members of this family are about 100 amino acids in length and are uncharacterized.	75
378054	pfam08810	KapB	Kinase associated protein B. This bacterial protein forms an anti-parallel beta sheet with an extending alpha helical region.	111
400939	pfam08811	DUF1800	Protein of unknown function (DUF1800). This is a family of large bacterial proteins of unknown function.	436
400940	pfam08812	YtxC	YtxC-like family. This family includes proteins similar to B. subtilis YtxC, an uncharacterized protein.	215
370135	pfam08813	Phage_tail_3	Phage tail tube protein, TTP. This is a family of phage tail tube proteins. A few members have an associated bacterial Ig-like domain, pfam02368, at their C-terminus.	165
400941	pfam08814	XisH	XisH protein. The fdxN element, along with two other DNA elements, is excised from the chromosome during heterocyst differentiation in cyanobacteria. The xisH as well as the xisF and xisI genes are required.	133
400942	pfam08815	Nuc_rec_co-act	Nuclear receptor coactivator. This region is found on eukaryotic nuclear receptor coactivators and forms an alpha helical structure.	47
400943	pfam08816	Ivy	Inhibitor of vertebrate lysozyme (Ivy). This bacterial family is a strong inhibitor of vertebrate lysozyme.	117
400944	pfam08817	YukD	WXG100 protein secretion system (Wss), protein YukD. The YukD protein family members participate in the formation of a translocon required for the secretion of WXG100 proteins (pfam06013) in monoderm bacteria, with the WXG100 protein secretion system (Wss). Like the cytoplasmic protein EsaC in Staphylococcus aureus, YukD was hypothesized to play a role of a chaperone. YukD adopts a ubiquitin-like fold. Usually, ubiquitin covalently binds to protein and flags them for protein degradation, however conjugation assays have indicated that the classical YukD lacks the capacity for covalent bond formation with other proteins. In contrast to the situation in firmicutes, YukD-like proteins in actinobacteria are often fused to a transporter involved in the ESAT-6/ESX/Wss secretion pathway. Members of the YukD family are also associated in gene neighborhoods with other enzymatic members of the ubiquitin signaling and degradation pathway such as the E1, E2 and E3 trienzyme complex that catalyze ubiquitin transfer to substrates, and the JAB family metallopeptidases that are involved in its release. This suggests that a subset of the YukD family in bacteria are conjugated and released from proteins as in the eukaryotic ubiquitin-mediated signaling and degradation pathway.	77
400945	pfam08818	DUF1801	Domain of unknown function (DUF1801). This large family of bacterial proteins is uncharacterized. They contain a presumed domain about 110 amino acids in length.	96
400946	pfam08819	DUF1802	Domain of unknown function (DUF1802). The function of this family is unknown. This region is found associated with pfam04471, suggesting they could be part of a restriction modification system.	175
400947	pfam08820	DUF1803	Domain of unknown function (DUF1803). This small domain is found in one or two copies in proteins from bacteria. The function of this domain is unknown.	91
400948	pfam08821	CGGC	CGGC domain. This putative domain contains a quite highly conserved sequence of CGGC in its central region. The domain has many conserved cysteines and histidines suggestive of a zinc binding function.	105
400949	pfam08822	DUF1804	Protein of unknown function (DUF1804). This family of bacterial proteins is uncharacterized.	164
255058	pfam08823	PG_binding_2	Putative peptidoglycan binding domain. This family may be a peptidoglycan binding domain.	74
400950	pfam08824	Serine_rich	Serine rich protein interaction domain. This is a serine rich domain that is found in the docking protein p130(cas) (Crk-associated substrate). This domain folds into a four helix bundle which is associated with protein-protein interactions.	156
400951	pfam08825	E2_bind	E2 binding domain. E1 and E2 enzymes play a central role in ubiquitin and ubiquitin-like protein transfer cascades. This is an E2 binding domain that is found on NEDD8 activating E1 enzyme. The domain resembles ubiquitin, and recruits the catalytic core of the E2 enzyme Ubc12 in a similar manner to that in which ubiquitin interacts with ubiquitin binding domains.	81
117396	pfam08826	DMPK_coil	DMPK coiled coil domain like. This domain is found in the myotonic dystrophy protein kinase (DMPK) and adopts a coiled coil structure. It plays a role in dimerization.	61
400952	pfam08827	DUF1805	Domain of unknown function (DUF1805). This domain is found in bacteria and archaea and has an N terminal tetramerisation region that is composed of beta sheets.	58
370143	pfam08828	DSX_dimer	Doublesex dimerization domain. Doublesex (DSX) is a transcription factor that regulates somatic sexual differences in Drosophila. The structure of this domain has revealed a novel dimeric arrangement of ubiquitin-associated folds that has not previously been identified in a transcription factor.	60
337221	pfam08829	AlphaC_N	Alpha C protein N terminal. The alpha C protein (ACP) is found in Streptococcus and acts as an invasin which plays a role in the internalisation and translocation of the organism across human epithelial surfaces. Group B Streptococcus is the leading cause of diseases including bacterial pneumonia, sepsis and meningitis. The N terminal of ACP is associated with virulence and forms a beta sandwich and a three helix bundle. ACP consists of an N-terminal domain (170 amino acids) followed by a variable number of tandem repeats (82 amino acids each) and a C-terminal domain (45 amino acids) containing an LPXTG peptidoglycan-anchoring motif. This entry is the N-terminal domain of ACP (NtACP). NtACP can be further divided into two structurally distinct domains, D1 and D2. D1, the more distal (amino-terminal) portion, consists of a beta sandwich with strong structural homology to fibronectin's integrin-binding region (FnIII10). D2 consists of three antiparallel alpha helix coils containing a portion of the glycosaminoglycan (GAG)-binding domain adjacent to the repeat region. NtACP binds to heparin and GAGs only when it is covalently associated with the adjacent repeat region. NtACP's D1 region contains a K144-T145-D146 (KTD) motif, located within a loop region that is structurally analogous to the loop containing the RGD integrin-binding motif in FnIII10. A single mutation within the KTD motif (D146A), present in the D1 domain, reduces NtACP binding to a1b1 integrin. The a1b1-integrin is one of four collagen-binding I-domain-containing integrins. Structural analysis shows that the D1 domain, in particular the region containing the putative integrin-binding loop and KTD motif, shares strong structural homology with FnIII10's integrin-binding region. Amino acid sequence alignment of Alps indicates that KTD is highly conserved.	106
400953	pfam08830	DUF1806	Protein of unknown function (DUF1806). This is a bacterial family of uncharacterized proteins. The structure of one of the proteins in this family has been solved and it adopts a beta barrel-like structure.	112
400954	pfam08831	MHCassoc_trimer	Class II MHC-associated invariant chain trimerisation domain. The class II associated invariant chain peptide is required for folding and localization of MHC class II heterodimers. This domain is involved in trimerisation of the ectodomain and interferes with DM/class II binding. The trimeric protein forms a cylindrical shape which is thought to be important for interactions between the invariant chain and class II molecules.	69
400955	pfam08832	SRC-1	Steroid receptor coactivator. This domain is found in steroid/nuclear receptor coactivators and contains two LXXLL motifs that are involved in receptor binding. The family includes SRC-1/NcoA-1, NcoA-2/TIF2, pCIP/ACTR/GRIP-1/AIB1.	87
400956	pfam08833	Axin_b-cat_bind	Axin beta-catenin binding domain. This domain is found on the scaffolding protein Axin which is a component of the beta-catenin destruction complex. It competes with the tumor suppressor adenomatous polyposis coli protein (APC) for binding to beta-catenin.	37
400957	pfam08837	DUF1810	Protein of unknown function (DUF1810). This is a family of uncharacterized proteins. The structure of one of the members in this family has been solved and it adopts a mainly alpha helical structure.	136
400958	pfam08838	DUF1811	Protein of unknown function (DUF1811). This is a bacterial family of uncharacterized proteins. Some of the proteins are annotated as being transcriptional regulators. The structure of one of the proteins in this family has revealed a beta-barrel like structure with helix-turn-helix like motif.	99
400959	pfam08839	CDT1	DNA replication factor CDT1 like. CDT1 is a component of the replication licensing system and promotes the loading of the mini-chromosome maintenance complex onto chromatin. Geminin is an inhibitor of CDT1 and prevents inappropriate re-initiation of replication on an already fired origin. This region of CDT1 binds to Geminin.	171
400960	pfam08840	BAAT_C	BAAT / Acyl-CoA thioester hydrolase C terminal. This catalytic domain is found at the C terminal of acyl-CoA thioester hydrolases and bile acid-CoA:amino acid N-acyltransferases (BAAT).	211
400961	pfam08841	DDR	Diol dehydratase reactivase ATPase-like domain. Diol dehydratase (DDH, EC:4.2.1.28) and its isofunctional homolog glycerol dehydratase (GDH, EC:4.2.1.30) are enzymes which catalyze the conversion of glycerol, 1,2-propanediol, and 1,2-ethanediol to aldehydes. These reactions require coenzyme B12. Cleavage of the Co-C bond of coenzyme B12 by substrates or coenzyme analogues results in inactivation during which coenzyme B12 remains tightly bound to the apoenzyme. This family comprises the large subunit of the diol dehydratase and glycerol dehydratase reactivating factors whose function is to reactivate the holoenzyme by exchange of a damaged cofactor for intact coenzyme.	328
400962	pfam08842	Mfa2	Fimbrillin-A associated anchor proteins Mfa1 and Mfa2. This family of proteins may be lipoproteins principally from bacilli. They are between 300 and 400 residues. Many Bacteroides-like bacterial species, including Porphyromonas gingivalis, the causal agent of periodontal infection, carry at least two types of fimbriae, namely FimA and Mfa1 fimbriae, following the names of their major subunit proteins. Normally, FimA fimbriae are long filaments that are easily detached from cells, whereas Mfa1 fimbriae are short filaments that are tightly bound to cells; however, in the absence of Mfa2 protein, the Mfa1 fimbriae are also very long and are not attached. Mfa2 and Mfa1 are associated with each other in whole P. gingivalis cells to the extent that Mfa2 is located on the cell surface and probably associated with Mfa1 fimbriae in such a way that it anchors the Mfa1 fimbriae to the cell surface and regulates Mfa1 filament length.	276
400963	pfam08843	AbiEii	Nucleotidyl transferase AbiEii toxin, Type IV TA system. This family was recently identified as belonging to the nucleotidyltransferase superfamily. AbiEii is the cognate toxin of the type IV toxin-antitoxin 'innate immunity' bacterial abortive infection (Abi) system that protects bacteria from the spread of a phage infection. The Abi system is activated upon infection with phage to abort the cell thus preventing the spread of phage through viral replication. There are some 20 or more Abis, and they are predominantly plasmid-encoded lactococcal systems. TA, toxin-antitoxin, systems on plasmids function by killing cells that lose the plasmid upon division. AbiE phage resistance systems function as novel Type IV TAs and are widespread in bacteria and archaea. The cognate antitoxin is pfam13338.	238
400964	pfam08844	DUF1815	Domain of unknown function (DUF1815). This presumed domain is about 100 amino acids in length and is functionally uncharacterized.	98
370150	pfam08845	SymE_toxin	Toxin SymE, type I toxin-antitoxin system. SymE (SOS-induced yjiW gene with similarity to MazE) is an SOS-induced toxin. It inhibits cell growth, decreases protein synthesis and increases RNA degradation. It may play a role in the recycling of RNAs damaged under SOS response-inducing conditions. It is predicted to have an AbrB fold, similar to that of the antitoxin MazE. Its translation is repressed by the antisense RNA SymR, which acts as an antitoxin.	54
400965	pfam08846	DUF1816	Domain of unknown function (DUF1816). Crocosphaera watsonii CpcD is associated with the pfam01383 domain suggesting this presumed domain could have a role in phycobilisomes.	64
400966	pfam08847	Crr6	Chlororespiratory reduction 6. Chlororespiratory reduction 6 (Crr6) is a factor required for the assembly or stabilisation of the chloroplast NAD(P)H dehydrogenase complex in Arabidopsis.	150
400967	pfam08848	DUF1818	Domain of unknown function (DUF1818). This presumed domain is found in a small family of cyanobacterial proteins. These proteins are functionally uncharacterized.	113
400968	pfam08849	DUF1819	Putative inner membrane protein (DUF1819). These proteins are functionally uncharacterized. Several are annotated as putative inner membrane proteins.	181
400969	pfam08850	DUF1820	Domain of unknown function (DUF1820). This family includes small functionally uncharacterized proteins around 100 amino acids in length.	97
370155	pfam08852	DUF1822	Protein of unknown function (DUF1822). This family of proteins are functionally uncharacterized.	370
400970	pfam08853	DUF1823	Domain of unknown function (DUF1823). This presumed domain is functionally uncharacterized.	111
400971	pfam08854	DUF1824	Domain of unknown function (DUF1824). This uncharacterized family of proteins are principally found in cyanobacteria.	124
400972	pfam08855	DUF1825	Domain of unknown function (DUF1825). This uncharacterized family of proteins are principally found in cyanobacteria.	103
400973	pfam08856	DUF1826	Protein of unknown function (DUF1826). These proteins are functionally uncharacterized.	197
400974	pfam08857	ParBc_2	Putative ParB-like nuclease. This domain is probably distantly related to pfam02195, suggesting that these uncharacterized proteins have a nuclease function.	159
400975	pfam08858	IDEAL	IDEAL domain. This short domain is found at the C-terminus of proteins in the UPF0302 family. The domain is named after the sequence of the most conserved region in some members. The function of this domain is unknown.	37
400976	pfam08859	DGC	DGC domain. This domain appears to be a zinc binding domain from the conservation of four potential chelating cysteines. The domain is named after a conserved central motif. The function of this domain is unknown.	103
400977	pfam08860	DUF1827	Domain of unknown function (DUF1827). This presumed domain has no known function.	91
400978	pfam08861	DUF1828	Domain of unknown function DUF1828. This presumed domain is functionally uncharacterized.	90
400979	pfam08862	DUF1829	Domain of unknown function DUF1829. This short domain is usually associated with pfam08861.	87
400980	pfam08863	YolD	YolD-like protein. Members of this family are functionally uncharacterized. However it has been predicted that these proteins are functionally equivalent to the UmuD subunit of polymerase V from gram-negative bacteria. This family has been shown to belong to the WYL-like superfamily.	94
400981	pfam08864	UPF0302	UPF0302 domain. This family is known as UPF0302. It is currently uncharacterized.	105
400982	pfam08865	DUF1830	Domain of unknown function (DUF1830). This family of short proteins is functionally uncharacterized.	66
400983	pfam08866	DUF1831	Putative amino acid metabolism. Solution of the structure of the Lactobacillus plantarum protein from this family has indicated a potential new fold with remote similarities to TBP-like (TATA-binding protein) structures. This similarity, in combination with genomic context analysis, leads us to propose an involvement in amino-acid metabolism. The potentially novel fold is an alpha + beta fold comprising two beta sheets packed against a single helix. The enzyme is present in the cytosol.	110
400984	pfam08867	FRG	FRG domain. This presumed domain contains a conserved N-terminal (F/Y)RG motif. It is functionally uncharacterized.	93
370164	pfam08868	YugN	YugN-like family. This family of proteins related to B. subtilis YugN are functionally uncharacterized.	130
400985	pfam08869	XisI	XisI protein. The fdxN element, along with two other DNA elements, is excised from the chromosome during heterocyst differentiation in cyanobacteria. The xisH as well as the xisF and xisI genes are required.	103
400986	pfam08870	DndE	DNA sulphur modification protein DndE. DndE is a small protein of 126 residues. It is a putative carboxylase homologous to NCAIR synthetase. It is encoded by an operon that is associated with a sulphur-based modification to DNA.	110
370166	pfam08872	KGK	KGK domain. This presumed domain is found in one or two copies in cyanobacterial proteins. It is named after a short sequence motif.	111
370167	pfam08873	DUF1834	Domain of unknown function (DUF1834). This family of proteins are functionally uncharacterized. One member is the Gp37 protein from the FluMu prophage.	204
400987	pfam08874	DUF1835	Domain of unknown function (DUF1835). This family of proteins are functionally uncharacterized.	122
286019	pfam08875	DUF1833	Domain of unknown function (DUF1833). This family of proteins are functionally uncharacterized and are predicted to adopt an all-beta fold. They are often found in gene neighborhoods containing genes for an NlpC peptidase and a Ubiquitin domain predicted to be involved in tail assembly.	150
400988	pfam08876	DUF1836	Domain of unknown function (DUF1836). This family of proteins are functionally uncharacterized.	102
400989	pfam08877	MepB	MepB protein. MepB is a functionally uncharacterized protein in the mepRAB gene cluster of Staphylococcus aureus.	122
378078	pfam08878	DUF1837	Domain of unknown function (DUF1837). This family of proteins are functionally uncharacterized.	230
370168	pfam08879	WRC	WRC. The WRC domain, named after the conserved Trp-Arg-Cys motif, contains two distinctive features: a putative nuclear localization signal and a zinc-finger motif (C3H). It is suggested that the WRC domain functions in DNA binding.	42
400990	pfam08880	QLQ	QLQ. The QLQ domain is named after the conserved Gln, Leu, Gln motif. The QLQ domain is found at the N-terminus of SWI2/SNF2 protein, which has been shown to be involved in protein-protein interactions. This domain has thus been postulated to be involved in mediating protein interactions.	35
400991	pfam08881	CVNH	CVNH domain. CyanoVirin-N Homology domains are found in the sugar-binding antiviral protein cyanovirin-N (CVN) as well as filamentous ascomycetes and in the fern Ceratopteris richardii.	101
400992	pfam08882	Acetone_carb_G	Acetone carboxylase gamma subunit. Acetone carboxylase is the key enzyme of bacterial acetone metabolism, catalyzing the condensation of acetone and CO(2) to form acetoacetate.	113
400993	pfam08883	DOPA_dioxygen	Dopa 4,5-dioxygenase family. This family of proteins are related to a DOPA 4,5-dioxygenase that is involved in synthesis of betalain. DOPA-dioxygenase is the key enzyme involved in betalain biosynthesis. It converts 3,4-dihydroxyphenylalanine to betalamic acid, a yellow chromophore.	106
400994	pfam08884	Flagellin_D3	Flagellin D3 domain. This domain is found in the central portion of bacterial flagellin FliC. The domain contains a structural motif called a beta-folium fold. Although no specific function is assigned to this domain, its deletion leads to a reduction in filament stability.	88
400995	pfam08885	GSCFA	GSCFA family. This family of proteins are functionally uncharacterized. They have been named GSCFA after a highly conserved N-terminal motif in the alignment. Distant similarity to the pfam00657 lipases suggests these proteins are likely to be enzymes.	237
400996	pfam08886	GshA	Glutamate-cysteine ligase. This is a rare family of glutamate--cysteine ligases, EC:6.3.2.2, demonstrated first in Thiobacillus ferrooxidans and present in a few other Proteobacteria. It is the first of two enzymes for glutathione biosynthesis. It is also called gamma-glutamylcysteine synthetase. The structure of this family has been solved, and is similar to that of human glutathione synthetase and very different to gamma-glutamylcysteine synthetase from Escherichia coli.	402
400997	pfam08887	GAD-like	GAD-like domain. This domain is functionally uncharacterized, but it appears to be distantly related to the GAD domain pfam02938.	102
400998	pfam08888	HopJ	HopJ type III effector protein. Pathovars of Pseudomonas syringae interact with their plant hosts via the action of Hrp outer protein (Hop) effector proteins, injected into plant cells by the type III secretion system. The proteins in this family are called HopJ after the original member HopPmaJ.	109
400999	pfam08889	WbqC	WbqC-like protein family. This family of proteins are functionally uncharacterized. However it is found in an O-antigen gene cluster in E. coli and other bacteria suggesting a role in O-antigen production. Feng et al. suggest that wbnG may code for a glycine transferase.	217
401000	pfam08890	Phage_TAC_5	Phage XkdN-like tail assembly chaperone protein, TAC. This is a family of phage tail assembly chaperone proteins, TACs, from Gram-positive bacteriophages, in particular PBSX from Firmicutes.	135
401001	pfam08891	YfcL	YfcL protein. This family of proteins are functionally uncharacterized. They are related to the short YfcL protein from E. coli.	85
401002	pfam08892	YqcI_YcgG	YqcI/YcgG family. This family of proteins are functionally uncharacterized. The family includes YqcI and YcgG from B. subtilis. The alignment contains a conserved FPC motif at the N-terminus and CPF at the C-terminus.	211
401003	pfam08893	DUF1839	Domain of unknown function (DUF1839). This family of proteins are functionally uncharacterized.	312
378086	pfam08894	DUF1838	Protein of unknown function (DUF1838). This family of proteins are functionally uncharacterized.	235
401004	pfam08895	DUF1840	Domain of unknown function (DUF1840). This family of proteins are functionally uncharacterized.	108
401005	pfam08896	DUF1842	Domain of unknown function (DUF1842). This domain is found at the N-terminus of proteins that are functionally uncharacterized.	110
401006	pfam08897	DUF1841	Domain of unknown function (DUF1841). This family of proteins are functionally uncharacterized.	135
370178	pfam08898	DUF1843	Domain of unknown function (DUF1843). This domain is found at the C-terminus of a family of proteins that are functionally uncharacterized. The presumed domain is about 60 amino acid residues in length and is found independently in some proteins.	52
401007	pfam08899	DUF1844	Domain of unknown function (DUF1844). This family of proteins are functionally uncharacterized.	72
401008	pfam08900	DUF1845	Domain of unknown function (DUF1845). This family of proteins are functionally uncharacterized.	215
401009	pfam08901	DUF1847	Protein of unknown function (DUF1847). This family of proteins are functionally uncharacterized. They contain 4 N-terminal cysteines that may form a zinc binding domain.	157
401010	pfam08902	DUF1848	Domain of unknown function (DUF1848). This family of proteins are functionally uncharacterized. The C-terminus contains a cluster of cysteines that are similar to the iron-sulfur cluster found at the N-terminus of pfam04055.	262
401011	pfam08903	DUF1846	Domain of unknown function (DUF1846). This family of proteins are functionally uncharacterized. Some members of the family are annotated as ATP-dependent peptidases. However, we can find no support for this annotation.	489
401012	pfam08904	DUF1849	Domain of unknown function (DUF1849). This family of proteins are functionally uncharacterized.	248
401013	pfam08905	DUF1850	Domain of unknown function (DUF1850). This family of proteins are functionally uncharacterized. Some members of this family appear to be misannotated as RocC an amino acid transporter from B. subtilis.	86
401014	pfam08906	DUF1851	Domain of unknown function (DUF1851). This domain is found at the C-terminus of a variety of proteins that are functionally uncharacterized.	72
401015	pfam08907	DUF1853	Domain of unknown function (DUF1853). This family of proteins are functionally uncharacterized.	282
401016	pfam08908	DUF1852	Domain of unknown function (DUF1852). This family of proteins are functionally uncharacterized.	321
401017	pfam08909	DUF1854	Domain of unknown function (DUF1854). This potential domain is functionally uncharacterized. It is found at the C-terminus of a number of ATP transporter proteins suggesting this domain may be involved in ligand binding.	126
401018	pfam08910	Aida_N	Aida N-terminus. This is the N-terminal domain of the axin interactor, dorsalization-associated protein family.	103
401019	pfam08911	NUP50	NUP50 (Nucleoporin 50 kDa). Nucleoporin 50 kDa (NUP50) acts as a cofactor for the importin-alpha:importin-beta heterodimer, which in turn allows for transportation of many nuclear-targeted proteins through nuclear pore complexes. The C-terminus of NUP50 binds importin-beta through RAN-GTP, the N-terminus binds the C-terminus of importin-alpha, while a central domain binds importin-beta. NUP50:importin-alpha:importin-beta then binds cargo and can stimulate nuclear import. The N-terminal domain of NUP50 is also able to actively displace nuclear localization signals from importin-alpha.	64
401020	pfam08912	Rho_Binding	Rho Binding. Rho Binding Domain is responsible for the recognition and binding of Rho binding domain-containing proteins (such as ROCK) to Rho, resulting in activation of the GTPase which in turn modulates the phosphorylation of various signalling proteins. This domain is within an amphipathic alpha-helical coiled-coil and interacts with Rho through predominantly hydrophobic interactions.	67
312463	pfam08913	VBS	Vinculin Binding Site. Vinculin binding sites are predominantly found in talin and talin-like molecules, enabling binding of vinculin to talin, stabilizing integrin-mediated cell-matrix junctions. Talin, in turn, links integrins to the actin cytoskeleton. The consensus sequence for Vinculin binding sites is LxxAAxxVAxxVxxLIxxA, with a secondary structure prediction of four amphipathic helices. The hydrophobic residues that define the VBS are themselves 'masked' and are buried in the core of a series of helical bundles that make up the talin rod.	125
286058	pfam08914	Myb_DNA-bind_2	Rap1 Myb domain. The Rap1 Myb domain adopts a canonical three-helix bundle tertiary structure, with the second and third helices forming a helix-turn-helix variant motif. The function of this domain is unclear: it may either interact with DNA via an adaptor protein or it may be only involved in protein-protein interactions.	65
401021	pfam08915	tRNA-Thr_ED	Archaea-specific editing domain of threonyl-tRNA synthetase. Archaea-specific editing domain of threonyl-tRNA synthetase, with marked structural similarity to D-amino acids deacylases found in eubacteria and eukaryotes. This domain can bind D-amino acids, and ensures high fidelity during translation. It is especially responsible for removing incorrectly attached serine from tRNA-Thr. The domain forms a fold that can be defined as two layers of beta-sheets (a three-stranded sheet and a five-stranded sheet), with two alpha-helices located adjacent to the five-stranded sheet.	137
401022	pfam08916	Phe_ZIP	Phenylalanine zipper. The phenylalanine zipper consists of aromatic side chains from ten phenylalanine residues that are stacked within a hydrophobic core. This zipper mediates dimerization of various proteins, such as APS, SH2-B and Lnk.	57
401023	pfam08917	ecTbetaR2	Transforming growth factor beta receptor 2 ectodomain. The Transforming growth factor beta receptor 2 ectodomain is a compact fold consisting of nine beta-strands and a single helix stabilized by a network of six intra strand disulphide bonds. The folding topology includes a central five-stranded antiparallel beta-sheet, eight-residues long at its centre, covered by a second layer consisting of two segments of two-stranded antiparallel beta-sheets (beta1-beta4, beta3-beta9).	103
401024	pfam08918	PhoQ_Sensor	PhoQ Sensor. The PhoQ Sensor is required for the virulence of various Gram-negative bacteria by allowing interaction of PhoPQ with the intracellular membrane, resulting in remodelling of the bacterial cell surface and subsequent bacterial resistance to host antimicrobial peptides. The domain contains a major flat acidic surface, which binds to at least 3 calcium ions, neutralising the domain's negative charge and allowing interaction with the negatively charged membrane.	179
401025	pfam08919	F_actin_bind	F-actin binding. The F-actin binding domain forms a compact bundle of four antiparallel alpha-helices, which are arranged in a left-handed topology. Binding of F-actin to the F-actin binding domain may result in cytoplasmic retention and subcellular distribution of the protein, as well as possible inhibition of protein function.	106
401026	pfam08920	SF3b1	Splicing factor 3B subunit 1. This family consists of several eukaryotic splicing factor 3B subunit 1 proteins, which associate with p14 through a C-terminus beta-strand that interacts with beta-3 of the p14 RNA recognition motif (RRM) beta-sheet, which is in turn connected to an alpha-helix by a loop that makes extensive contacts with both the shorter C-terminal helix and RRM of p14. This subunit is required for 'A' splicing complex assembly (formed by the stable binding of U2 snRNP to the branchpoint sequence in pre-mRNA) and 'E' splicing complex assembly.	116
401027	pfam08921	DUF1904	Domain of unknown function (DUF1904). This domain is found in a set of hypothetical bacterial proteins.	107
401028	pfam08922	DUF1905	Domain of unknown function (DUF1905). This domain is found in a set of hypothetical bacterial proteins.	78
312472	pfam08923	MAPKK1_Int	Mitogen-activated protein kinase kinase 1 interacting. Mitogen-activated protein kinase kinase 1 interacting protein is a small subcellular adaptor protein required for MAPK signaling and ERK1/2 activation. The overall topology of this domain has a central five-stranded beta-sheet sandwiched between a two alpha-helix and a one alpha-helix layer.	119
401029	pfam08924	DUF1906	Domain of unknown function (DUF1906). This domain is found in a set of uncharacterized hypothetical bacterial proteins.	179
401030	pfam08925	DUF1907	Domain of Unknown Function (DUF1907). The structure of this domain displays an alpha-beta-beta-alpha four layer topology, with an HxHxxxxxxxxxH motif that coordinates a zinc ion, and an acetate anion at a site that likely supports the enzymatic activity of an ester hydrolase.	281
401031	pfam08926	DUF1908	Domain of unknown function (DUF1908). This domain is found in a set of hypothetical/structural eukaryotic proteins.	282
401032	pfam08928	DUF1910	Domain of unknown function (DUF1910). This domain is found in a set of hypothetical bacterial proteins.	117
401033	pfam08929	DUF1911	Domain of unknown function (DUF1911). This domain is found in a set of hypothetical bacterial proteins.	105
286073	pfam08930	DUF1912	Domain of unknown function (DUF1912). This domain has no known function. It is found in various Streptococcal proteins.	84
286074	pfam08931	Caudo_bapla_RBP	Receptor-binding protein of phage tail base-plate Siphoviridae, head. Caudo_bapla_RBP is a family of proteins expressed from ORF18 of the Lactococcus P2-like phage. This is one of three protein species, shoulders, neck, and head, that form the phage tail base-plate. In the overall structure this head domain exists as six trimers, and is necessary for specific recognition of the receptors at the host cell surface. Siphoviridae are the P2-like Caudovirales of Lactococcus. This family now includes DUF1914. Family Baseplate, pfam16774, is the ORF15 or shoulder component of the base-plate complex.	262
255115	pfam08933	DUF1864	Domain of unknown function (DUF1864). This domain has no known function. It is found in various hypothetical and conserved domain proteins.	387
401034	pfam08934	Rb_C	Rb C-terminal domain. The Rb C-terminal domain is required for high-affinity binding to E2F-DP complexes and for maximal repression of E2F-responsive promoters, thereby acting as a growth suppressor by blocking the G1-S transition of the cell cycle. This domain has a strand-loop-helix structure, which directly interacts with both E2F1 and DP1, followed by a tail segment that lacks regular secondary structure.	151
286076	pfam08935	VP4_2	Viral protein VP4 subunit. This domain is predominantly found in viral proteins from the family Picornaviridae. It is VP4 of the viral polyprotein which, in poliovirus, is part of the capsid that consists of 60 copies each of four proteins VP1, VP2, VP3, and VP4 arranged on an icosahedral lattice. VP4 is on the inside and differs from the others in being small, myristoylated and having an extended structure. Productive infection involves the externalisation of the VP4, which is cleaved from the rest, along with the N-terminus of VP1. There thus seem to be three stages of the virus, ie a multi-step process for cell entry involving RNA translocation through a membrane channel formed by the externalised N termini of VP1.	84
401035	pfam08936	CsoSCA	Carboxysome Shell Carbonic Anhydrase. Carboxysome Shell Carbonic Anhydrase is a bacterial carbonic anhydrase localized in the carboxysome, where it converts bicarbonate ions to carbon dioxide for use in carbon fixation. It contains three domains, these being: (1) an N-terminal domain composed primarily of four alpha-helices; (2) a catalytic domain containing a tightly bound zinc ion and (3) a C-terminal domain with weak structural similarity to the catalytic domain.	455
401036	pfam08937	DUF1863	MTH538 TIR-like domain (DUF1863). This domain adopts the flavodoxin fold, that is, five parallel beta-strands and four helical segments. The structure is a three-layer sandwich with alpha-1 and alpha-4 on one side of the beta-sheet, and alpha-2 and alpha-3 on the other side. Probable role in signal transduction as a phosphorylation-independent conformational switch protein. This domain is similar to the TIR domain.	130
401037	pfam08938	HBS1_N	HBS1 N-terminus. This domain is found at the N-terminus of HBS1 proteins. It interacts with the ribosomal protein rpS3 at the mRNA entry site.	74
401038	pfam08939	DUF1917	Domain of unknown function (DUF1917). This domain is found in various hypothetical and basophilic leukaemia proteins. It has no known function.	258
401039	pfam08940	DUF1918	Domain of unknown function (DUF1918). This domain, found in various hypothetical bacterial proteins, has no known function.	58
401040	pfam08941	USP8_interact	USP8 interacting. This domain interacts with the UBP deubiquitinating enzyme USP8.	179
401041	pfam08942	DUF1919	Domain of unknown function (DUF1919). This domain has no known function. It is found in various hypothetical and putative bacterial proteins.	191
401042	pfam08943	CsiD	CsiD. This family consists of various bacterial proteins pertaining to the non-haem Fe(II)-dependent oxygenase family. Exact function is unknown, but a putative role includes involvement in the control of utilisation of gamma-aminobutyric acid.	294
401043	pfam08944	p47_phox_C	NADPH oxidase subunit p47Phox, C terminal domain. The C terminal domain of the phagocyte NADPH oxidase subunit p47Phox contains conserved PxxP motifs that allow binding to SH3 domains, with subsequent activation of the NADPH oxidase, and generation of superoxide, which plays a crucial role in host defense against microbial infection.	32
401044	pfam08945	Bclx_interact	Bcl-x interacting, BH3 domain. This domain is a long alpha helix, required for interaction with Bcl-x. It is found in BAM, Bim and Bcl2-like protein 11. This domain is also known as the BH3 domain between residues 146 and 161.	39
401045	pfam08946	Osmo_CC	Osmosensory transporter coiled coil. The osmosensory transporter coiled coil is a C-terminal domain found in various bacterial osmoprotective transporters, such as ProP, Proline/betaine transporter, Proline permease 2 and the citrate proton symporters. It adopts an antiparallel coiled-coil structure, and is essential for osmosensory and osmoprotectant transporter function.	46
401046	pfam08947	BPS	BPS (Between PH and SH2). The BPS (Between PH and SH2) domain, comprised of 2 beta strands and a C-terminal helix, is an approximately 45 residue region found in the adaptor proteins Grb7/10/14 that mediates inhibition of the tyrosine kinase domain of the insulin receptor by binding of the N-terminal portion of the BPS domain to the substrate peptide groove of the kinase, acting as a pseudosubstrate inhibitor.	45
401047	pfam08948	DUF1859	Domain of unknown function (DUF1859). This domain has no known function. It is predominantly found in the N-terminus of bacteriophage spike proteins.	126
401048	pfam08949	DUF1860	Domain of unknown function (DUF1860). This domain has no known function. It is predominantly found in the C-terminus of bacteriophage spike proteins.	219
401049	pfam08950	DUF1861	Protein of unknown function (DUF1861). This hypothetical protein, found in bacteria and in the eukaryote Leishmania, has no known function.	295
401050	pfam08951	EntA_Immun	Enterocin A Immunity. Gram-positive lactobacilli produce bacteriocins to kill closely-related competitor species. To protect themselves from the bacteriocidal activity of this molecule they co-express an immunity protein (for discussion of this operon see Bacteriocin_IIc pfam10439). The immunity protein structure is a soluble, cytoplasmic, antiparallel four alpha-helical globular bundle with a fifth, more flexible and more divergent C-terminal helical hair-pin. The C-terminal hair-pin recognizes the C-terminus of the producer bacteriocin and this interaction is sufficient to dis-orient the bacteriocin within the membrane and close up the permeabilising pore that on its own the bacteriocin creates. These immunity proteins interact in the same way with other bacteriocins, family Bacteriocin_II, pfam01721. Since many enterococci can produce more than one bacteriocin it seems likely that the whole operon can be carried on transferable plasmids.	67
286093	pfam08952	DUF1866	Domain of unknown function (DUF1866). This domain, found in Synaptojanin, has no known function.	146
401051	pfam08953	DUF1899	Domain of unknown function (DUF1899). This set of domains is found in various eukaryotic proteins. Function is unknown.	66
401052	pfam08954	Trimer_CC	Trimerisation motif. This domain is predominantly found in the structural protein coronin, and is duplicated in some sequences. It appears to have the function of stabilizing the topology of short coiled-coils in proteins.	52
401053	pfam08955	BofC_C	BofC C-terminal domain. The C-terminal domain of the bacterial protein 'bypass of forespore C' contains a three-stranded beta-sheet and three alpha-helices. Its exact function is, as yet, unknown.	74
401054	pfam08956	DUF1869	Domain of unknown function (DUF1869). This domain is found in a set of hypothetical bacterial proteins.	56
401055	pfam08958	DUF1871	Domain of unknown function (DUF1871). This set of hypothetical proteins is produced by prokaryotes pertaining to the Bacillus genus.	77
401056	pfam08960	DUF1874	Domain of unknown function (DUF1874). This domain is found in a set of hypothetical viral and bacterial proteins.	100
401057	pfam08961	NRBF2	Nuclear receptor-binding factor 2, autophagy regulator. NRBF2 plays an essential role in autophagy, the cellular pathway that degrades long-lived proteins and other cytoplasmic contents through lysosomes. NRBF2 binds Atg14L - a Beclin-binding protein - directly via the MIT domain and enhances Atg14L-linked Vps34 kinase (a class III phosphatidylinositol-3 kinase) activity and autophagy induction.	197
401058	pfam08962	DUF1876	Domain of unknown function (DUF1876). This domain is found in a set of hypothetical bacterial proteins.	82
401059	pfam08963	DUF1878	Protein of unknown function (DUF1878). This domain is found in a set of hypothetical bacterial proteins.	110
401060	pfam08964	Crystall_3	Beta/Gamma crystallin. This family of beta/gamma crystallins includes the N-terminal domain of Dictyostelium discoideum Calcium-dependent cell adhesion molecule 1, which mediates cell-cell adhesion through homophilic interactions.	86
401061	pfam08965	DUF1870	Domain of unknown function (DUF1870). This domain is found in a set of hypothetical bacterial proteins. It contains a helix-turn-helix domain so may be a DNA-binding protein.	117
401062	pfam08966	DUF1882	Domain of unknown function (DUF1882). This domain is found in a set of hypothetical bacterial proteins.	72
370216	pfam08967	DUF1884	Domain of unknown function (DUF1884). This domain is found in a set of hypothetical bacterial proteins. It shows similarity to the N-terminus of ATP-synthase.	92
370217	pfam08968	DUF1885	Domain of unknown function (DUF1885). This domain is found in a set of hypothetical proteins produced by bacteria of the Bacillus genus.	131
401063	pfam08969	USP8_dimer	USP8 dimerization domain. This domain is predominantly found in the amino terminal region of Ubiquitin carboxyl-terminal hydrolase 8 (USP8). It forms a five helical bundle that dimerizes.	112
370219	pfam08970	Sda	Sporulation inhibitor A. Members of this protein family contain two antiparallel alpha helices that are linked by a highly structured inter-helix loop to form a helical hairpin; the structure is stabilized by numerous hydrophobic and electrostatic interactions. These sporulation inhibitors are antikinases that bind to the histidine kinase KinA phosphotransfer domain and act as a molecular barricade that inhibit productive interaction between the ATP binding site and the phosphorylatable KinA His residue. This results in the inhibition of sporulation (by preventing phosphorylation of spo0A).	45
401064	pfam08971	GlgS	Glycogen synthesis protein. Members of this family are involved in glycogen synthesis in Enterobacteria. The structure of the polypeptide chain comprises a bundle of two parallel amphipathic helices, alpha-1 and alpha-3, and a short hydrophobic helix alpha-2 sandwiched between them.	66
312506	pfam08972	DUF1902	Domain of unknown function (DUF1902). Members of this family of prokaryotic proteins adopt a fold consisting of one alpha-helix and four beta-strands. Their function has not, as yet, been elucidated.	75
401065	pfam08973	TM1506	Domain of unknown function (DUF1893). A member of the deaminase fold that binds an unknown ligand in the crystal structure. The protein is ADP-ribosylated at a conserved aspartate. Contextual analysis suggests that the domain is likely to bind NAD or ADP ribose either to sense redox states or to function as a regulatory ADP ribosyltransferase.	126
401066	pfam08974	DUF1877	Domain of unknown function (DUF1877). This domain is found in a set of hypothetical bacterial proteins.	163
401067	pfam08975	2H-phosphodiest	Domain of unknown function (DUF1868). This group of 2H-phosphodiesterases comprises a single family typified by the protein mlr3352 from M.loti. Members are also present in various alpha-proteobacteria, Synechocystis, Streptococcus and Chilo iridescent virus. The presence of a member of this predominantly bacterial group in a large eukaryotic DNA virus represents a potential case of horizontal transfer from a bacterial source into a virus. Several proteins of bacterial origin have been noticed in the insect viruses (L.M.Iyer, E.V.Koonin and L.Aravind, unpublished observations) and these appear to have been acquired from endo-symbiotic or parasitic bacteria that share the same host cells with the viruses. Presence of 2H proteins in the proteomes of large DNA viruses (e.g. T4 57B protein and the Fowl-pox virus FPV025) may point to some role for these proteins in regulating the viral tRNA metabolism. Each member of this family contains an internal duplication, each of which contains an HXTX motif that defines the family.	116
401068	pfam08976	EF-hand_11	EF-hand domain. This domain is found predominantly in DJ binding proteins.	105
286116	pfam08977	BOFC_N	Bypass of Forespore C, N terminal. The N-terminal domain of 'bypass of forespore C' is composed of a four-stranded beta-sheet covered by an alpha-helix. The beta-sheet has a beta2-beta1-beta4-beta3 topology, where strands beta1 and beta2 and strands beta3 and beta4 are connected by beta-turns, whereas strands beta2 and beta3 are joined by an alpha-helix that runs across one face of the beta-sheet. This domain is similar to the third immunoglobulin G-binding domain of protein G from Streptococcus, the latter belonging to a large and diverse group of cell surface-associated proteins that bind to immunoglobulins. It has been hypothesized that this domain may be a mediator of protein-protein interactions involved in proteolytic events at the cell surface.	49
401069	pfam08978	Reoviridae_Vp9	Reoviridae VP9. This domain is found in various VP9 viral outer-coat proteins. It has no known function.	280
401070	pfam08979	DUF1894	Domain of unknown function (DUF1894). Members of this family have an important role in methanogenesis. They assume an alpha-beta globular structure consisting of six beta-strands and three alpha-helices forming the secondary structural topological arrangement of alpha1-beta1-alpha2-beta2-beta3-beta4-beta5-beta6-alpha3.	85
401071	pfam08980	DUF1883	Domain of unknown function (DUF1883). This domain is found in a set of hypothetical bacterial proteins.	86
401072	pfam08982	DUF1857	Domain of unknown function (DUF1857). This domain has no known function. It is found in various hypothetical bacterial and fungal proteins.	146
370225	pfam08983	DUF1856	Domain of unknown function (DUF1856). This domain has no known function. It is found in the C-terminal segment of various vasopressin receptors.	48
401073	pfam08984	DUF1858	Domain of unknown function (DUF1858). This domain has no known function. It is found in various hypothetical bacterial proteins.	57
378109	pfam08985	DP-EP	DP-EP family. The DP-EP family of proteins, formerly known as DUF1888, have been shown to catalyze a cleavage of an internal peptide bond.	120
401074	pfam08986	DUF1889	Domain of unknown function (DUF1889). This domain is found in a set of hypothetical bacterial proteins.	119
370227	pfam08987	DUF1892	Protein of unknown function (DUF1892). Members of this family, that are synthesized by Saccharomycetes, adopt a structure consisting of a four-stranded beta-sheet, with strand order beta2-beta1-beta4-beta3, and two alpha-helices, with an overall topology of beta-beta-alpha-beta-beta-alpha. They have no known function.	107
401075	pfam08988	T3SS_needle_E	Type III secretion system, cytoplasmic E component of needle. T3SS_needle_E is a family of proteins from the operon that builds and controls the needle of the injection system of type III secretion. The YscE protein, produced by the pathogen Yersinia, assumes a secondary structure composed of two anti-parallel alpha-helices separated by a flexible loop. The family is cytoplasmic and may help to stabilize and prevent early polymerization of the needle-protein F.	66
401076	pfam08989	DUF1896	Domain of unknown function (DUF1896). This domain is found in a set of hypothetical bacterial proteins.	142
401077	pfam08990	Docking	Erythronolide synthase docking. The N terminal docking domain found in modular polyketide synthase assumes an alpha-helical structure, wherein two alpha-helices are connected by a short loop. Two such N-terminal domains dimerize to form amphipathic parallel alpha-helical coiled coils: dimerization is essential for protein function.	29
401078	pfam08991	MTCP1	Mature-T-Cell Proliferation I type. Members of this family adopt a coiled coil structure, with two antiparallel alpha-helices that are tightly strapped together by two disulfide bridges at each end. The protein sequence shows a cysteine motif, required for the stabilisation of the coiled-coil-like structure. Additional inter-helix hydrophobic contacts impart stability to this scaffold. The precise function of this eukaryotic domain is, as yet, unknown. MTCP1 is found in mitochondria. MTCP1 (Mature-T-Cell Proliferation) is the first gene unequivocally identified in the group of uncommon leukemias with a mature phenotype.	62
401079	pfam08992	QH-AmDH_gamma	Quinohemoprotein amine dehydrogenase, gamma subunit. Members of this family contain a cross-linked, proteinous quinone cofactor, cysteine tryptophylquinone, which is required for catalysis of the oxidative deamination of a wide range of aliphatic and aromatic amines. The domain assumes a globular secondary structure, with two short alpha-helices having many turns and bends.	75
401080	pfam08993	T4_Gp59_N	T4 gene Gp59 loader of gp41 DNA helicase. Bacteriophage T4 gene-59 helicase assembly protein is required for recombination-dependent DNA replication, which is the predominant mode of DNA replication in the late stage of T4 infection. T4 gene-59 helicase assembly protein accelerates the loading of the T4 gene-41 helicase during DNA synthesis by the T4 replication system in vitro. T4 gene-59 helicase assembly protein binds to both T4 gene-41 helicase and T4 gene-32 single-stranded DNA binding protein, and to single and double-stranded DNA. The structure of T4 gene-59 helicase assembly protein reveals a novel alpha-helical bundle fold with two domains of similar size, this being the N-terminal domain that consists of six alpha-helices linked by loop segments and short turns. The surface of the domain contains large regions of exposed hydrophobic residues and clusters of acidic and basic residues. This domain has structural similarity to members of the high-mobility-group (HMG) family of DNA minor groove binding proteins including rat HMG1A and lymphoid enhancer-binding factor, and is required for binding of the helicase to the DNA minor groove.	93
401081	pfam08994	T4_Gp59_C	T4 gene Gp59 loader of gp41 DNA helicase C-term. Bacteriophage T4 gene-59 helicase assembly protein is required for recombination-dependent DNA replication, which is the predominant mode of DNA replication in the late stage of T4 infection. T4 gene-59 helicase assembly protein accelerates the loading of the T4 gene-41 helicase during DNA synthesis by the T4 replication system in vitro. T4 gene-59 helicase assembly protein binds to both T4 gene-41 helicase and T4 gene-32 single-stranded DNA binding protein, and to single and double-stranded DNA. The structure of T4 gene-59 helicase assembly protein reveals a novel alpha-helical bundle fold with two domains of similar size, this being the C-terminal domain that consists of seven alpha-helices with short intervening loops and turns. The surface of the domain contains large regions of exposed hydrophobic residues and clusters of acidic and basic residues. The hydrophobic region on the 'bottom' surface of the domain near the C-terminal helix binds the leading strand DNA, whilst the hydrophobic region on the 'top' surface of the domain lies between the two arms of the fork DNA, allowing for T4 gene 41 helicase binding and assembly into a hexameric complex around the lagging strand.	109
117561	pfam08995	NIP_1	Necrosis inducing protein-1. Necrosis inducing protein-1, a fungal avirulence protein produced by plants, consists of two parts containing beta-sheets of two and three anti-parallel strands, respectively. Five intramolecular disulfide bonds stabilize these parts and their position with respect to each other, providing a high level of stability.	82
401082	pfam08996	zf-DNA_Pol	DNA Polymerase alpha zinc finger. The DNA Polymerase alpha zinc finger domain adopts an alpha-helix-like structure, followed by three turns, all of which involve proline. The resulting motif is a helix-turn-helix motif, in contrast to other zinc finger domains, which show anti-parallel sheet and helix conformation. Zinc binding occurs due to the presence of four cysteine residues positioned to bind the metal centre in a tetrahedral coordination geometry. Function of this domain is uncertain: it has been proposed that the zinc finger motif may be an essential part of the DNA binding domain.	184
401083	pfam08997	UCR_6-4kD	Ubiquinol-cytochrome C reductase complex, 6.4kD protein. The ubiquinol-cytochrome C reductase complex (cytochrome bc1 complex) is an essential component of the mitochondrial cellular respiratory chain. This family represents the 6.4kD protein, which may be closely linked to the iron-sulphur protein in the complex and function as an iron-sulphur protein binding factor.	51
370234	pfam08998	Epsilon_antitox	Bacterial epsilon antitoxin. The epsilon antitoxin, produced by various prokaryotes, forms part of a postsegregational killing system which is involved in the initiation of programmed cell death of plasmid-free cells. The protein is folded into a three-helix bundle that directly interacts with the zeta toxin, inactivating it.	89
286135	pfam08999	SP_C-Propep	Surfactant protein C, N terminal propeptide. The N-terminal propeptide of surfactant protein C adopts an alpha-helical structure, with turn and extended regions. Its main function is the stabilisation of metastable surfactant protein C (SP-C), since the latter can irreversibly transform from its native alpha-helical structure to beta-sheet aggregates and form amyloid-like fibrils. The correct intracellular trafficking of proSP-C has also been reported to depend on the propeptide.	96
401084	pfam09000	Cytotoxic	Cytotoxic. The cytotoxic domain confers cytotoxic activity to proteins, enabling the formation of nucleolytic breaks in 16S ribosomal RNA. The structure of the domain reveals a highly twisted central beta-sheet elaborated with a short N-terminal alpha-helix.	82
401085	pfam09001	DUF1890	Domain of unknown function (DUF1890). This domain is found in a set of hypothetical archaeal proteins.	141
312526	pfam09002	DUF1887	Domain of unknown function (DUF1887). This domain is found in a set of hypothetical bacterial proteins.	379
370236	pfam09003	Arm-DNA-bind_1	Bacteriophage lambda integrase, Arm DNA-binding domain. The amino terminal domain of bacteriophage lambda integrase folds into a three-stranded, antiparallel beta-sheet that packs against a C-terminal alpha-helix, adopting a fold that is structurally related to the three-stranded beta-sheet family of DNA-binding domains (which includes the GCC-box DNA-binding domain and the N-terminal domain of Tn916 integrase). This domain is responsible for high-affinity binding to each of the five DNA arm-type sites and is also a context-sensitive modulator of DNA cleavage.	72
117570	pfam09004	DUF1891	Domain of unknown function (DUF1891). This domain is found in a set of hypothetical eukaryotic proteins.	38
401086	pfam09005	DUF1897	Domain of unknown function (DUF1897). This domain is found in Psi proteins produced by Drosophila, and in various eukaryotic hypothetical proteins. It has no known function.	36
286141	pfam09006	Surfac_D-trimer	Lung surfactant protein D coiled-coil trimerisation. This domain, predominantly found in lung surfactant protein D, forms a triple-helical parallel coiled coil, and mediates trimerisation of the protein.	46
401087	pfam09007	EBP50_C	EBP50, C-terminal. This C terminal domain allows interaction of EBP50 with FERM (four-point one ERM) domains, resulting in the activation of Ezrin-radixin-moesin (ERM), with subsequent cytoskeletal modulation and cellular growth control. It includes a disordered section between two reasonably well conserved hydrophobic regions.	127
401088	pfam09008	Head_binding	Head binding. The head binding domain found in the Phage P22 tailspike protein contains two regular beta-sheets, A and B, oriented nearly perpendicular to each other and composed of five and three strands respectively. The topology of the strands is exclusively antiparallel. The tailspike protein trimerizes through this domain, and the direction of the strands with respect to the molecular triad is almost parallel for beta-sheet A, whereas beta-sheet B is perpendicular to the triad, forming a dome-like structure. This domain is dispensable for thermostability and SDS resistance of the intact protein, and its deletion has only minor effects on tailspike folding kinetics.	103
401089	pfam09009	Exotox-A_cataly	Exotoxin A catalytic. Members of this family, which are found in prokaryotic exotoxin A, catalyze the transfer of ADP ribose from nicotinamide adenine dinucleotide (NAD) to elongation factor-2 in eukaryotic cells, with subsequent inhibition of protein synthesis.	168
401090	pfam09010	AsiA	Anti-Sigma Factor A. Anti-sigma factor A is a transcriptional inhibitor that inhibits sigma 70-directed transcription by weakening its interaction with the core of the host's RNA polymerase. It is an all-helical protein, composed of six helical segments and intervening loops and turns, as well as a helix-turn-helix DNA binding motif, although neither free anti-sigma factor nor anti-sigma factor bound to sigma-70 has been shown to interact directly with DNA. In solution, the protein forms a symmetric dimer of small (10.59 kDa) protomers, which are composed of helix and coil regions and are devoid of beta-strand/sheet secondary structural elements.	85
401091	pfam09011	HMG_box_2	HMG-box domain. This short 71 residue domain is an HMG-box domain. HMG-box domains mediate re-modelling of chromatin-structure. Mammalian HMG-box proteins are of two types: those that are non-sequence-specific DNA-binding proteins with two HMG-box domains and a long highly acidic C-tail; and a diverse group of sequence-specific transcription factor-proteins with either a single HMG-box or up to six copies, and no acidic C-tail.	71
401092	pfam09012	FeoC	FeoC like transcriptional regulator. This family contains several transcriptional regulators, including FeoC, which contain an HTH motif. FeoC acts as a [Fe-S] dependent transcriptional repressor.	69
401093	pfam09013	YopH_N	YopH, N-terminal. The N-terminal domain of YopH is a compact structure composed of four alpha-helices and two beta-hairpins. Helices alpha-1 and alpha-3 are parallel to each other and antiparallel to helices alpha-2 and alpha-4. This domain targets YopH for secretion from the bacterium and translocation into eukaryotic cells, and has phosphotyrosyl peptide-binding activity, allowing for recognition of p130Cas and paxillin.	123
401094	pfam09014	Sushi_2	Beta-2-glycoprotein-1 fifth domain. The fifth domain of beta-2-glycoprotein-1 (b2GP-1) is composed of four well-defined anti-parallel beta-strands and two short alpha-helices, as well as a long highly flexible loop. It plays an important role in the binding of b2GP-1 to negatively charged compounds and subsequent capture for binding of anti-b2GP-1 antibodies.	88
401095	pfam09015	NgoMIV_restric	NgoMIV restriction enzyme. Members of this family are prokaryotic DNA restriction enzymes, exhibiting an alpha/beta structure, with a central region comprising a mixed six-stranded beta-sheet with alpha-helices on each side. A long 'arm' protrudes out of the core of the domain between strands beta2 and beta3 and is mainly involved in the tetramerisation interface of the protein. These restriction enzymes recognize the double-stranded sequence GCCGGC and cleave after G-1.	275
312534	pfam09016	Pas_Saposin	Pas factor saposin fold. Members of this family adopt a compact structure comprising five alpha helices. Charged and polar residues are exposed mostly on the surface, while most of the hydrophobic residues are buried inside the hydrophobic core of the helical bundle. The precise function of this domain is unknown, but it has been shown to induce secretion of periplasmic proteins, especially collagenase.	76
117583	pfam09017	Transglut_prok	Microbial transglutaminase. Microbial transglutaminase (MTG) catalyzes an acyl transfer reaction by means of a Cys-Asp diad mechanism, in which the gamma-carboxyamide groups of peptide-bound glutamine residues act as the acyl donors. The MTG molecule forms a single, compact domain belonging to the alpha+beta folding class, containing 11 alpha-helices and 8 beta-strands. The alpha-helices and the beta-strands are concentrated mainly at the amino and carboxyl ends of the polypeptide, respectively. These secondary structures are arranged so that a beta-sheet is surrounded by alpha-helices, which are clustered into three regions.	414
312535	pfam09018	Phage_Capsid_P3	P3 major capsid protein. The P3 major capsid protein adopts a 'double-barrel' structure comprising two eight-stranded viral beta-barrels or jelly rolls, each of which contains a 12-residue alpha-helix. This protein then trimerizes through a 'trimerisation loop' sequence, and is incorporated within the viral capsid.	394
401096	pfam09019	EcoRII-C	EcoRII C terminal. The C-terminal catalytic domain of the Restriction Endonuclease EcoRII has a restriction endonuclease-like fold with a central five-stranded mixed beta-sheet surrounded on both sides by alpha-helices. It cleaves DNA specifically at single 5' CCWGG sites.	165
286154	pfam09020	YopE_N	YopE, N terminal. The N terminal YopE domain targets YopE for secretion from the bacterium and translocation into eukaryotic cells.	126
401097	pfam09021	HutP	HutP. The HutP protein family regulates the expression of Bacillus 'hut' structural genes by an anti-termination complex, which recognizes three UAG triplet units, separated by four non-conserved nucleotides on the RNA terminator region. L-histidine and Mg2+ ions are also required. These proteins exhibit the structural elements of alpha/beta proteins, arranged in the order: alpha-alpha-beta-alpha-alpha-beta-beta-beta in the primary structure, and the four antiparallel beta-strands form a beta-sheet in the order beta1-beta2-beta3-beta4, with two alpha-helices each on the front (alpha1 and alpha2) and at the back (alpha3 and alpha4) of the beta-sheet.	128
286156	pfam09022	Staphostatin_A	Staphostatin A. The staphostatin A polypeptide chain folds into a slightly deformed, eight-stranded beta-barrel, with strands beta-4 through beta-8 forming an antiparallel sheet while the N-terminus forms a psi-loop motif. Members of this family constitute a class of cysteine protease inhibitors distinct in the fold and the mechanism of action from any known inhibitors of these enzymes.	105
401098	pfam09023	Staphostatin_B	Staphostatin B. Staphostatin B inhibits the cysteine protease Staphopain B, produced by Staphylococcus aureus, by blocking the active site of the enzyme. The domain adopts an eight-stranded mixed beta-barrel structure, with a deviation from the up-down topology of canonical beta-barrels in the amino-terminal part of the molecule.	105
286158	pfam09025	T3SS_needle_reg	YopR, type III needle-polymerization regulator. The YopR core domain, predominantly found in the Gammaproteobacteria virulence factor YopR, is composed of five alpha-helices, four of which are arranged in an antiparallel bundle. Little is known about this domain, though it may contribute to the virulence of the protein YopR. YopR controls the selective access of early (YscF, YscI and YscP) substrates to the type III secretion machines of yersiniae and other Gammaproteobacteriae. YopR is a mobile regulatory component thought to function as a checkpoint probing the completion of discrete intermediary stages in the assembly of the type III injection pathway. The location of secreted YopR (into the medium) is directly controlling the secretion of YscF, the polymerized needle protein pfam09392, thereby impacting the assembly of type III machines.	145
286159	pfam09026	CENP-B_dimeris	Centromere protein B dimerization domain. The centromere protein B (CENP-B) dimerization domain is composed of two alpha-helices, which are folded into an antiparallel configuration. Dimerization of CENP-B is mediated by this domain, in which monomers dimerize to form a symmetrical, antiparallel, four-helix bundle structure with a large hydrophobic patch in which 23 residues of one monomer form van der Waals contacts with the other monomer. This CENP-B dimer configuration may be suitable for capturing two distant CENP-B boxes during centromeric heterochromatin formation.	100
401099	pfam09027	GTPase_binding	GTPase binding. The GTPase binding domain binds to the G protein Cdc42, inhibiting both its intrinsic and stimulated GTPase activity. The domain is largely unstructured in the absence of Cdc42.	66
401100	pfam09028	Mac-1	Mac 1. The bacterial protein Mac 1 adopts an alpha/beta fold, with 14 beta strands and 9 alpha helices. The N-terminal domain is made up predominantly of alpha helices, whereas the C-terminal domain consists predominantly of beta sheets. Mac 1 blocks polymorphonuclear opsonophagocytosis, inhibits the production of reactive oxygen species and contains IgG endopeptidase activity.	347
401101	pfam09029	Preseq_ALAS	5-aminolevulinate synthase presequence. The N terminal presequence domain found in 5-aminolevulinate synthase exists as an amphipathic helix, with a positively charged surface provided by lysine residues and no stable helix at the N-terminus. The domain is essential for the import process by which ALAS is transported into the mitochondria: translocase of the outer membrane (Tom) and translocase of the inner membrane protein complexes appear responsible for recognition and import through the mitochondrial membrane. The protein Tom20 is anchored to the mitochondrial outer membrane, and its interaction with presequences is thought to be the recognition step which allows subsequent import.	114
401102	pfam09030	Creb_binding	Creb binding. The Creb binding domain assumes a structure comprising three alpha-helices which pack in a bundle, exposing a hydrophobic groove between alpha-1 and alpha-3 within which complementary domains found in the protein 'activator for thyroid hormone and retinoid receptors' (ACTR) can dock. Docking of these domains is required for the recruitment of RNA polymerase II and the basal transcription machinery.	104
401103	pfam09032	Siah-Interact_N	Siah interacting protein, N terminal. The N terminal domain of Siah interacting protein (SIP) adopts a helical hairpin structure with a hydrophobic core stabilized by a classic knobs-and-holes arrangement of side chains contributed by the two amphipathic helices. Little is known about this domain's function, except that it is crucial for interactions with Siah. It has also been hypothesized that SIP can dimerize through this N terminal domain.	76
401104	pfam09033	DFF-C	DNA Fragmentation factor 45kDa, C terminal domain. The C terminal domain of DNA Fragmentation factor 45kDa (DFF-C) consists of four alpha-helices, which are folded in a helix-packing arrangement, with alpha-2 and alpha-3 packing against a long C-terminal helix (alpha-4). The main function of this domain is the inhibition of DFF40 by binding to its C-terminal catalytic domain through ionic interactions, thereby inhibiting the fragmentation of DNA in the apoptotic process. In addition to blocking the DNase activity of DFF40, the C-terminal region of DFF45 is also important for the DFF40-specific folding chaperone activity, as demonstrated by the ability of DFF45 to refold DFF40.	165
401105	pfam09034	TRADD_N	TRADD, N-terminal domain. The N terminal domain of 'tumor necrosis factor receptor type 1 associated death domain protein' (TRADD) folds into an alpha-beta sandwich with a four-stranded beta sheet and six alpha helices, each forming one layer of the structure. The domain allows docking of TRADD onto 'tumor necrosis factor receptor-associated factor' (TRAF): the binding is at the beta-sandwich domain, away from the coiled-coil domain. Binding ensures the recruitment of cIAPs to the signaling complex, which may be important for direct caspase-8 inhibition and the immediate suppression of apoptosis at the apical point of the cascade.	111
286167	pfam09035	Tn916-Xis	Excisionase from transposon Tn916. The phage-encoded excisionase protein Tn916-Xis adopts a winged-helix structure that consists of a three-stranded anti-parallel beta-sheet that packs against a helix-turn-helix (HTH) motif and a third C-terminal alpha-helix. It is encoded for by Tn916, which also codes for the integrase Tn916-Int. The protein interacts with DNA by the insertion of helix alpha-2 into the major groove and the contact of the hairpin that connects strands beta-2 and beta-3 with the adjacent phosphodiester backbone and/or minor groove. Tn916-Xis stimulates phage excision and inhibits viral integration by stabilizing distorted DNA structures.	62
401106	pfam09036	Bcr-Abl_Oligo	Bcr-Abl oncoprotein oligomerization domain. The Bcr-Abl oncoprotein oligomerization domain consists of a short N-terminal helix (alpha-1), a flexible loop and a long C-terminal helix (alpha-2). Together these form an N-shaped structure, with the loop allowing the two helices to assume a parallel orientation. The monomeric domains associate into a dimer through the formation of an antiparallel coiled coil between the alpha-2 helices and domain swapping of two alpha-1 helices, where one alpha-1 helix swings back and packs against the alpha-2 helix from the second monomer. Two dimers then associate into a tetramer. The oligomerization domain is essential for the oncogenicity of the Bcr-Abl protein.	73
401107	pfam09037	Sulphotransf	Stf0 sulphotransferase. Members of this family are essential for the biosynthesis of sulpholipid-1 in prokaryotes. They adopt a structure that belongs to the sulphotransferase superfamily, consisting of a single domain with a core four-stranded parallel beta-sheet flanked by alpha-helices.	244
401108	pfam09038	53-BP1_Tudor	Tumor suppressor p53-binding protein-1 Tudor. Members of this family consist of ten beta-strands and a carboxy-terminal alpha-helix. The amino-terminal five beta-strands and the C-terminal five beta-strands adopt folds that are identical to each other. This domain is essential for the recruitment of proteins to double stranded breaks in DNA, which is mediated by interaction with methylated Lys 79 of histone H3.	122
370256	pfam09039	HTH_Tnp_Mu_2	Mu DNA binding, I gamma subdomain. Members of this family are responsible for binding the DNA attachment sites at each end of the Mu genome. They adopt a secondary structure comprising a four helix bundle tightly packed around a hydrophobic core consisting of aliphatic and aromatic amino acid residues. Helices 1 and 2 are oriented antiparallel to each other. Helix 3 crosses helices 1 and 2 at angles of 60 and 120 degrees, respectively. Excluding the C-terminal helix 4, the fold of the I-gamma subdomain is remarkably similar to that of the homeodomain family of helix-turn-helix DNA-binding proteins, although their amino acid sequences are completely unrelated.	109
312544	pfam09040	H-K_ATPase_N	Gastric H+/K+-ATPase, N terminal domain. Members of this family adopt an alpha-helical conformation under hydrophobic conditions. The domain contains tyrosine residues, phosphorylation of which regulates the function of the ATPase. Additionally, the domain also interacts with various structural proteins, including the spectrin-binding domain of ankyrin III.	43
370257	pfam09041	Aurora-A_bind	Aurora-A binding. The Aurora-A binding domain binds to two distinct sites on the Aurora kinase: the upstream residues bind at the N-terminal lobe, whilst the downstream residues bind in an alpha-helical conformation between the N- and C-terminal lobes. The two Aurora-A binding motifs are connected by a flexible linker that is variable in length and sequence across species. Binding of the domain results in strong activation of Aurora-A and protection from deactivating dephosphorylation by phosphatase PP1.	68
401109	pfam09042	Titin_Z	Titin Z. The titin Z domain, that recognizes and binds to the C-terminal calmodulin-like domain of alpha-actinin-2 (Act-EF34), adopts a helical structure, and binds in a groove formed by the two planes between the helix pairs of Act-EF34. This interaction is essential for sarcomere assembly.	40
401110	pfam09043	Lys-AminoMut_A	D-Lysine 5,6-aminomutase TIM-barrel domain of alpha subunit. Members of this family are involved in the 1,2 rearrangement of the terminal amino group of DL-lysine and of L-beta-lysine, using adenosylcobalamin (AdoCbl) and pyridoxal-5'-phosphate as co-factors. The structure is predominantly a PLP-binding TIM barrel domain, with several additional alpha-helices and beta-strands at the N and C termini. These helices and strands form an intertwined accessory clamp structure that wraps around the sides of the TIM barrel and extends up toward the Ado ligand of the Cbl co-factor, providing most of the interactions observed between the protein and the Ado ligand of the Cbl, suggesting that its role is mainly in stabilizing AdoCbl in the precatalytic resting state. This is a TIM-barrel domain.	508
370259	pfam09044	Kp4	Kp4. Members of this fungal family of toxins specifically inhibit voltage-gated calcium channels in mammalian cells. They adopt an alpha/beta-sandwich structure, comprising a five-stranded antiparallel beta-sheet with two antiparallel alpha-helices lying at approximately 45 degrees to these strands.	123
312549	pfam09045	L27_2	L27_2. The L27_2 domain is a protein-protein interaction domain capable of organising scaffold proteins into supramolecular assemblies by formation of heteromeric L27_2 domain complexes. L27_2 domain-mediated protein assemblies have been shown to play essential roles in cellular processes including asymmetric cell division, establishment and maintenance of cell polarity, and clustering of receptors and ion channels. Members of this family form specific heterotetrameric complexes, in which each domain contains three alpha-helices. The two N-terminal helices of each L27_2 domain pack together to form a tight, four-helix bundle in the heterodimer, whilst the third helix of each L27_2 domain forms another four-helix bundle that assembles the two units of the heterodimer into a tetramer.	58
401111	pfam09046	AvrPtoB-E3_ubiq	AvrPtoB E3 ubiquitin ligase. The E3 ubiquitin ligase domain found in the bacterial protein AvrPtoB inhibits immunity-associated programmed cell death (PCD) when translocated into plant cells, probably by recruiting E2 enzymes and transferring ubiquitin molecules to cellular proteins involved in regulation of PCD and targeting them for degradation. The structure of this domain reveals a globular fold centred on a four-stranded beta-sheet that packs against two helices on one face and has three very extended loops connecting the elements of secondary structure, with remarkable homology to the RING-finger and U-box families of proteins involved in ubiquitin ligase complexes in eukaryotes.	118
370261	pfam09047	MEF2_binding	MEF2 binding. The myocyte enhancer factor-2 (MEF2) binding domain, predominantly found in the calcineurin-binding protein CABIN 1, adopts an amphipathic alpha-helical structure, which allows it to bind a hydrophobic groove on the MEF2S domain, forming a triple-helical interaction. Interaction of this domain with MEF2 causes repression of transcription.	35
401112	pfam09048	Cro	Cro. Members of this family are involved in the repression of transcription by binding as a homodimer to palindromic DNA operator sites in phage lambda: they repress genes expressed in early phage development and are necessary for the late stage of lytic growth. These proteins have a secondary structure consisting of three alpha-helices and three beta-sheets, and dimerize through interactions between the two antiparallel beta-strands.	59
401113	pfam09049	SNN_transmemb	Stannin transmembrane. Members of this family consist of a single highly hydrophobic transmembrane helix that traverses the lipid bilayer at a 20 degree angle with respect to the membrane normal. They contain a conserved cysteine residue (Cys32) that, together with Cys34 found in the stannin unstructured linker domain, constitutes the putative trimethyltin-binding site that resides at the end of the transmembrane domain close to the lipid/solvent interface.	32
370263	pfam09050	SNN_linker	Stannin unstructured linker. Members of this family are unstructured, acting as connectors of the stannin helical domains. They contain a conserved CXC metal-binding motif and a putative 14-3-3-zeta binding domain. Upon coordinating dimethyltin, considerable structural or dynamic changes in the flexible loop region of SNN may take place, recruiting other binding partners such as 14-3-3-zeta, and thereby initiating the apoptotic cascade.	26
401114	pfam09051	SNN_cytoplasm	Stannin cytoplasmic. Members of this family consist of a distorted cytoplasmic helix that is partially absorbed into the plane of the lipid bilayer with a tilt angle of approximately 80 degrees from the membrane normal. They interact with the surface of the lipid bilayer, and contribute to the initiation of the apoptotic cascade on binding of the unstructured linker domain to dimethyltin.	26
401115	pfam09052	SipA	Salmonella invasion protein A. Salmonella invasion protein A is an actin-binding protein that contributes to host cytoskeletal rearrangements by stimulating actin polymerization and counteracting F-actin destabilizing proteins. Members of this family possess an all-helical fold consisting of eight alpha-helices arranged so that six long, amphipathic helices form a compact fold that surrounds a final, predominantly hydrophobic helix in the middle of the molecule.	213
401116	pfam09053	CagZ	CagZ. CagZ is a 23 kDa protein consisting of a single compact L-shaped domain, composed of seven alpha-helices that run antiparallel to each other. 70% of the residues are in alpha-helix conformation and no beta-sheet is present. CagZ is essential for the translocation of the pathogenic protein CagA into host cells.	198
401117	pfam09055	Sod_Ni	Nickel-containing superoxide dismutase. Nickel containing superoxide dismutase (NiSOD) is a metalloenzyme containing a hexameric assembly of right-handed 4-helix bundles of up-down-up-down topology with an N-terminal His-Cys-X-X-Pro-Cys-Gly-X-Tyr motif that chelates the active site Ni ions. NiSOD catalyzes the disproportionation of superoxide to peroxide and molecular oxygen through alternate oxidation and reduction of Ni, protecting cells from the toxic products of aerobic metabolism.	127
401118	pfam09056	Phospholip_A2_3	Prokaryotic phospholipase A2. The prokaryotic phospholipase A2 domain is predominantly found in bacterial and fungal phospholipases, as well as various hypothetical and putative proteins. It enables the liberation of fatty acids and lysophospholipid by hydrolysing the 2-ester bond of 1,2-diacyl-3-sn-phosphoglycerides. The domain adopts an alpha-helical secondary structure, consisting of five alpha-helices and two helical segments.	102
401119	pfam09057	Smac_DIABLO	Second Mitochondria-derived Activator of Caspases. Second Mitochondria-derived Activator of Caspases promotes apoptosis by activating caspases in the cytochrome c/Apaf-1/caspase-9 pathway, and by opposing the inhibitory activity of inhibitor of apoptosis proteins (XIAP-BIR3). The protein assumes an elongated three-helix bundle structure, and forms a dimer in solution.	237
401120	pfam09058	L27_1	L27_1. The L27 domain is a protein interaction module that exists in a large family of scaffold proteins, functioning as an organisation centre of large protein assemblies required for the establishment and maintenance of cell polarity. L27 domains form specific heterotetrameric complexes, in which each domain contains three alpha-helices.	61
401121	pfam09059	TyeA	TyeA. Members of this family are composed of two pairs of parallel alpha-helices, and interact with the bacterial protein YopN via hydrophobic residues located on the helices. Association of TyeA with the C-terminus of YopN is accompanied by conformational changes in both polypeptides that create order out of disorder: the resulting structure then serves as an impediment to type III secretion of YopN.	81
401122	pfam09060	L27_N	L27_N. The L27_N domain plays a role in the biogenesis of tight junctions and in the establishment of cell polarity in epithelial cells. Each L27_N domain consists of three alpha-helices, the first two of which form an antiparallel coiled-coil. Two L27 domains come together to form a four-helical bundle with the antiparallel coiled-coils formed by the first two helices. The third helix of each domain forms another coiled-coil packing at one end of the four-helix bundle, creating a large hydrophobic interface: the hydrophobic interactions are the major force that drives heterodimer formation.	48
401123	pfam09061	Stirrup	Stirrup. The Stirrup domain, found in the prokaryotic protein ribonucleotide reductase, has a molecular mass of 9 kDa and is folded into an alpha/beta structure. It allows for binding of the reductase to DNA via electrostatic interactions, since it has a predominance of positive charges distributed on its surface.	79
401124	pfam09062	Endonuc_subdom	PI-PfuI Endonuclease subdomain. The endonuclease subdomain, found in the prokaryotic protein ribonucleotide reductase, assumes an alpha-beta-beta-alpha-beta-beta-alpha-alpha topology. The four stranded beta-sheet forms a saddle-shaped surface and assembles together through an interface made of alpha-helices. The presence of 14 basic residues on the surface of the beta-sheets suggests that this large groove may be involved in DNA binding.	98
401125	pfam09063	Phage_coat	Phage PP7 coat protein. Members of this family form the capsid of P. aeruginosa phage PP7. They adopt a secondary structure consisting of a six stranded beta sheet and an alpha helix.	127
312559	pfam09064	Tme5_EGF_like	Thrombomodulin like fifth domain, EGF-like. Members of this family adopt a fold similar to other EGF domains, with a flat major and a twisted minor beta sheet. Disulphide pairing, however, is not of the usual 1-3, 2-4, 5-6 type; rather 1-2, 3-4, 5-6 pairing is found. Its extended major sheet (strands beta-2 and beta-3 and the connecting loop) projects into thrombin's active site groove. This domain is required for interaction of thrombomodulin with thrombin, and subsequent activation of protein-C.	34
72483	pfam09065	Haemadin	Haemadin. Members of this family adopt a secondary structure consisting of five short beta-strands (beta1-beta5), which are arranged in two antiparallel distorted sheets formed by strands beta1-beta4-beta5 and beta2-beta3 facing each other. This beta-sandwich is stabilized by six enclosed cysteines arranged in a [1-2, 3-5, 4-6] disulphide pairing resulting in a disulphide-rich hydrophobic core that is largely inaccessible to bulk solvent. The close proximity of disulfide bonds [3-5] and [4-6] organizes haemadin into four distinct loops. The N-terminal segment of this domain binds to the active site of thrombin, inhibiting it.	27
401126	pfam09066	B2-adapt-app_C	Beta2-adaptin appendage, C-terminal sub-domain. Members of this family adopt a structure consisting of a 5 stranded beta-sheet, flanked by one alpha helix on the outer side, and by two alpha helices on the inner side. This domain is required for binding to clathrin, and its subsequent polymerization. Furthermore, a hydrophobic patch present in the domain also binds to a subset of D-phi-F/W motif-containing proteins that are bound by the alpha-adaptin appendage domain (epsin, AP180, eps15).	109
401127	pfam09067	EpoR_lig-bind	Erythropoietin receptor, ligand binding. Members of this family interact with erythropoietin (EPO), with subsequent initiation of the downstream chain of events associated with binding of EPO to the receptor, including EPO-induced erythroblast proliferation and differentiation through induction of the JAK2/STAT5 signaling cascade. The domain adopts a secondary structure composed of a short amino-terminal helix, followed by two beta-sandwich regions.	104
401128	pfam09068	EF-hand_2	EF hand. Members of this family adopt a helix-loop-helix motif, as per other EF hand domains. However, since they do not contain the canonical pattern of calcium binding residues found in many EF hand domains, they do not bind calcium ions. The main function of this domain is the provision of specificity in beta-dystroglycan recognition, though in dystrophin it serves an additional role: stabilisation of the WW domain (pfam00397), enhancing dystroglycan binding.	121
401129	pfam09069	EF-hand_3	EF-hand. Members of this family adopt a helix-loop-helix motif, as per other EF hand domains. However, since they do not contain the canonical pattern of calcium binding residues found in many EF hand domains, they do not bind calcium ions. The main function of this domain is the provision of specificity in beta-dystroglycan recognition, though in dystrophin it serves an additional role: stabilisation of the WW domain (pfam00397), enhancing dystroglycan binding.	90
401130	pfam09070	PFU	PFU (PLAA family ubiquitin binding). This domain is found N terminal to pfam08324 and binds to ubiquitin.	111
286197	pfam09071	Alpha-amyl_C	Alpha-amylase, C terminal. Members of this family, which are found in the prokaryotic protein glycosyltrehalose trehalohydrolase, assume a gamma-crystallin-type fold with a five-stranded anti-parallel beta-sheet that packs against the C-terminal side of a beta-alpha barrel. This domain is common to family 13 glycosidases and typically contains a five to ten strand beta-sheet, however its precise fold varies.	67
401131	pfam09072	TMA7	Translation machinery associated TMA7. TMA7 plays a role in protein translation. Deletion of the TMA7 gene results in altered protein synthesis rates.	62
401132	pfam09073	BUD22	BUD22. BUD22 has been shown in yeast to be a nuclear protein involved in bud-site selection. It plays a role in positioning the proximal bud pole signal. More recently it has been shown to be involved in ribosome biogenesis.	426
401133	pfam09074	Mer2	Mer2. Mer2 (Rec107) forms part of a complex that is required for meiotic double strand DNA break formation. Mer2 increases in abundance and is phosphorylated during the prophase phase of cell division. Blocking double strand break formation results in delayed dephosphorylation and dissociation of Mer2 from the chromosome.	193
401134	pfam09075	STb_secrete	Heat-stable enterotoxin B, secretory. Members of this family assume a helical secondary structure, with two alpha helices forming a disulphide crosslinked alpha-helical hairpin. The disulphide bonds are crucial for the toxic activity of the protein, and are required for maintenance of the tertiary structure, and subsequent interaction with the particulate form of guanylate cyclase, increasing cyclic GMP levels within the host intestinal epithelial cells.	48
370280	pfam09076	Crystall_2	Beta/Gamma crystallin. Members of this family assume a beta-gamma-crystallin fold, wherein nine beta-strands are connected by loops, and are separated into two sheets, each sheet forming the Greek key motif. The two Greek key motifs face each other in the global topology. The three-dimensional structure of the molecule is a 'sandwich'-shaped beta-barrel structure: hydrophobic side-chains are packed in the large interface area of the beta-sheets. In Streptomyces killer toxin-like protein, the domain confers a cytocidal effect to the toxin, causing cell death in both budding and fission yeasts, and morphological changes in yeasts and filamentous fungi. This family also includes chitin-binding antifungal proteins.	69
401135	pfam09077	Phage-MuB_C	Mu B transposition protein, C terminal. The C terminal domain of the B transposition protein from Bacteriophage Mu comprises four alpha-helices arranged in a loosely packed bundle, where helix alpha1 runs parallel to alpha3, and anti-parallel to helices alpha2 and alpha4. The domain allows for non-specific binding of Mu to double-stranded DNA, allowing for integration into the bacterial genome, and mediates dimerization of the protein.	78
401136	pfam09078	CheY-binding	CheY binding. Members of this family adopt a secondary structure consisting of an open-face beta/alpha sandwich, with four antiparallel beta-strands and two alpha-helices. They bind to a corresponding domain on CheY, with subsequent phosphorylation of the CheY Asp57 residue, and activation of CheY, which then affects flagellar rotation.	63
401137	pfam09079	Cdc6_C	CDC6, C terminal winged helix domain. The C terminal domain of CDC6 assumes a winged helix fold, with a five alpha-helical bundle (alpha15-alpha19) structure, backed on one side by three beta strands (beta6-beta8). It has been shown that this domain acts as a DNA-localization factor, however its exact function is, as yet, unknown. Putative functions include: (1) mediation of protein-protein interactions and (2) regulation of nucleotide binding and hydrolysis. Mutagenesis studies have shown that this domain is essential for appropriate Cdc6 activity.	83
401138	pfam09080	K-cyclin_vir_C	K cyclin, C terminal. Members of this family adopt a secondary structure consisting of a five alpha-helix cyclin fold. Interaction with cyclin dependent kinases (CDKs) at a PSTAIRE sequence motif within the catalytic cleft of CDK results in the regulation of CDK activity.	104
401139	pfam09081	DUF1921	Domain of unknown function (DUF1921). This domain, which is found in a set of prokaryotic amylases, has no known function.	51
401140	pfam09082	DUF1922	Domain of unknown function (DUF1922). Members of this family consist of a beta-sheet region followed by an alpha-helix and an unstructured C-terminus. The beta-sheet region contains a CXCX...XCXC sequence with Cys residues located in two proximal loops and pointing towards each other. The precise function of this set of bacterial proteins is, as yet, unknown.	65
72501	pfam09083	DUF1923	Domain of unknown function (DUF1923). Members of this family are found in maltosyltransferases, and adopt a secondary structure consisting of eight antiparallel beta-strands, which form an open-sided 'jelly roll' Greek key beta-barrel. Their exact function is, as yet, unknown.	64
401141	pfam09084	NMT1	NMT1/THI5 like. This family contains the NMT1 and THI5 proteins. These proteins are proposed to be required for the biosynthesis of the pyrimidine moiety of thiamine. They are regulated by thiamine. The protein adopts a fold related to the periplasmic binding protein (PBP) family. Both pyridoxal-5'-phosphate (PLP) and an iron atom are bound to the protein suggesting numerous residues of the active site necessary for HMP-P biosynthesis. The yeast protein is a dimer and, although exceptionally using PLP as a substrate, has notable similarities with enzymes dependent on this molecule as a cofactor.	216
401142	pfam09085	Adhes-Ig_like	Adhesion molecule, immunoglobulin-like. Members of this family are found in a set of mucosal cellular adhesion proteins and adopt an immunoglobulin-like beta-sandwich structure, with seven strands arranged in two beta-sheets in a Greek-key topology. They are essential for recruitment of lymphocytes to specific tissues.	107
401143	pfam09086	DUF1924	Domain of unknown function (DUF1924). This domain is found in a set of bacterial proteins, including Cytochrome c-type protein. It is functionally uncharacterized.	91
401144	pfam09087	Cyc-maltodext_N	Cyclomaltodextrinase, N-terminal. Members of this family assume a beta-sandwich structure composed of eight antiparallel beta-strands. A ten residue linker is also present at the C-terminal end, which connects the N terminal domain to a distal domain in the protein. This domain participates in oligomerization of the protein, wherein the N-terminal domain of one subunit contacts the active centre of the other subunit, and is also required for binding of cyclodextrin to substrate.	88
370287	pfam09088	MIF4G_like	MIF4G like. Members of this family are involved in mediating U snRNA export from the nucleus. They adopt a highly helical structure, wherein the polypeptide chain forms a right-handed solenoid. At the tertiary level, the domain is composed of a superhelical arrangement of successive antiparallel pairs of helices.	191
337290	pfam09089	gp12-short_mid	Phage short tail fibre protein gp12, middle domain. Members of this family adopt a right-handed triple-stranded beta-helix fold, and are found in the middle of the phage short tail fibre protein gp12.	81
401145	pfam09090	MIF4G_like_2	MIF4G like. Members of this family are involved in mediating U snRNA export from the nucleus. They adopt a highly helical structure, wherein the polypeptide chain forms a right-handed solenoid. At the tertiary level, the domain is composed of a superhelical arrangement of successive antiparallel pairs of helices.	263
401146	pfam09092	Lyase_N	Lyase, N terminal. Members of this family are predominantly found in chondroitin ABC lyase I, and adopt a jelly-roll fold topology consisting of a two-layered bent beta-sheet sandwich with one short alpha-helix. The convex beta sheet is composed of five antiparallel strands, whilst the concave beta-sheet contains five antiparallel beta-strands with a loop between two consecutive strands folding back onto the concave surface. This domain is required for binding of the protein to long glycosaminoglycan chains.	167
401147	pfam09093	Lyase_catalyt	Lyase, catalytic. Members of this family are predominantly found in chondroitin ABC lyase I, and adopt a helical structure, with fifteen alpha-helices which are at least two turns long and several short helical turns. The bulk of the domain is formed by ten alpha-helices forming five hairpin-like pairs and arranged into an incomplete toroid, the (alpha/alpha)5 fold. Additionally, two long and two short alpha-helices at the N-terminus of the domain wrap around the toroid. At the C-terminal end of the toroid there is one additional short alpha-helix. This domain is required for degradation of polysaccharides containing 1,4-beta-D-hexosaminyl and 1,3-beta-D-glucuronosyl or 1,3-alpha-L-iduronosyl linkages to disaccharides containing 4-deoxy-beta-D-gluc-4-enuronosyl groups.	361
370291	pfam09094	DUF1925	Domain of unknown function (DUF1925). Members of this family, which are found in a set of prokaryotic transferases, adopt an immunoglobulin/albumin-binding domain-like fold, with a bundle of three alpha-helices. Their function is, as yet, unknown.	80
401148	pfam09095	DUF1926	Domain of unknown function (DUF1926). Members of this family, which are found in a set of prokaryotic transferases, adopt a beta-sandwich fold, in which two layers of anti-parallel beta-sheets are arranged in a nearly parallel fashion. The exact function of this family is, as yet, unknown, however it has been proposed that they may play a role in transglycosylation reactions.	274
286219	pfam09096	Phage-tail_2	Baseplate structural protein, domain 2. Members of this family adopt a beta barrel structure with a Greek key topology, which is topologically similar to the FMN-binding split barrel. They are structural components of the viral baseplate, predominantly found in the structural protein gp27.	173
286220	pfam09097	Phage-tail_1	Baseplate structural protein, domain 1. Members of this family adopt a beta barrel structure with a Greek key topology, which is topologically similar to the FMN-binding split barrel. They are structural components of the viral baseplate, predominantly found in the structural protein gp27.	196
401149	pfam09098	Dehyd-heme_bind	Quinohemoprotein amine dehydrogenase A, alpha subunit, haem binding. Members of this family are predominantly found in the prokaryotic protein quinohemoprotein amine dehydrogenase. They have a predominantly alpha-helical structure and can be divided into two subdomains, each binding a haem C group via a conserved CXXCH motif.	164
286222	pfam09099	Qn_am_d_aIII	Quinohemoprotein amine dehydrogenase, alpha subunit domain III. Members of this family, which are predominantly found in the prokaryotic protein quinohemoprotein amine dehydrogenase, adopt an immunoglobulin-like beta-sandwich fold, with seven strands arranged into two beta sheets; the fold is possibly related to the immunoglobulin and/or fibronectin type III superfamilies. The precise function of this domain has not, as yet, been defined.	82
401150	pfam09100	Qn_am_d_aIV	Quinohemoprotein amine dehydrogenase, alpha subunit domain IV. Members of this family, which are predominantly found in the prokaryotic protein quinohemoprotein amine dehydrogenase, adopt an immunoglobulin-like beta-sandwich fold, with seven strands arranged into two beta sheets; the fold is possibly related to the immunoglobulin and/or fibronectin type III superfamilies. The precise function of this domain has not, as yet, been defined.	133
286224	pfam09101	Exotox-A_bind	Exotoxin A binding. Members of this family are found in Pseudomonas aeruginosa exotoxin A, and are responsible for binding of the toxin to the alpha-2-macroglobulin receptor, with subsequent internalisation into endosomes. The domain adopts a thirteen-strand antiparallel beta jelly roll topology, which belongs to the Concanavalin A-like lectins/glucanases fold superfamily.	274
401151	pfam09102	Exotox-A_target	Exotoxin A, targeting. Members of this family are found in Pseudomonas aeruginosa exotoxin A, and are responsible for transmembrane targeting of the toxin, as well as transmembrane translocation of the catalytic domain into the cytoplasmic compartment. A furin cleavage site is present within the domain: cleavage generates a 37 kDa carboxy-terminal fragment, which includes the enzymatic domain, which is then translocated into the cytoplasm. The domain adopts a helical structure, with six alpha-helices forming a bundle.	142
401152	pfam09103	BRCA-2_OB1	BRCA2, oligonucleotide/oligosaccharide-binding, domain 1. Members of this family assume an OB fold, which consists of a highly curved five-stranded beta-sheet that closes on itself to form a beta-barrel. OB1 has a shallow groove formed by one face of the curved sheet and is demarcated by two loops, one between beta 1 and beta 2 and another between beta 4 and beta 5, which allows for weak single strand DNA binding. The domain also binds the 70-amino acid DSS1 (deleted in split-hand/split foot syndrome) protein, which was originally identified as one of three genes that map to a 1.5-Mb locus deleted in an inherited developmental malformation syndrome.	120
401153	pfam09104	BRCA-2_OB3	BRCA2, oligonucleotide/oligosaccharide-binding, domain 3. Members of this family assume an OB fold, which consists of a highly curved five-stranded beta-sheet that closes on itself to form a beta-barrel. OB3 has a pronounced groove formed by one face of the curved sheet and is demarcated by two loops, one between beta 1 and beta 2 and another between beta 4 and beta 5, which allows for strong ssDNA binding.	137
370297	pfam09105	SelB-wing_1	Elongation factor SelB, winged helix. Members of this family adopt a winged-helix fold, with an alpha/beta structure consisting of three alpha-helices and a twisted three-stranded antiparallel beta-sheet, with an alpha-beta-alpha-alpha-beta-beta connectivity. They are involved in both DNA and RNA binding.	61
401154	pfam09106	SelB-wing_2	Elongation factor SelB, winged helix. Members of this family adopt a winged-helix fold, with an alpha/beta structure consisting of three alpha-helices and a twisted three-stranded antiparallel beta-sheet, with an alpha-beta-alpha-alpha-beta-beta connectivity. They are involved in both DNA and RNA binding.	57
401155	pfam09107	SelB-wing_3	Elongation factor SelB, winged helix. Members of this family adopt a winged-helix fold, with an alpha/beta structure consisting of three alpha-helices and a twisted three-stranded antiparallel beta-sheet, with an alpha-beta-alpha-alpha-beta-beta connectivity. They are involved in both DNA and RNA binding.	46
401156	pfam09108	Xol-1_N	Switch protein XOL-1, N-terminal. Members of this family, which are required for the formation of the active site of the sex-determining protein Xol-1, adopt a secondary structure consisting of five alpha helices and six antiparallel beta sheets, in a beta-alpha-beta-beta-beta-alpha-beta-alpha-alpha-alpha-beta arrangement. The fold of this family is similar to that found in ribosomal protein S5 domain 2-like.	160
401157	pfam09109	Xol-1_GHMP-like	Switch protein XOL-1, GHMP-like. Members of this family, which are required for the formation of the active site of the sex-determining protein Xol-1, adopt a secondary structure consisting of five alpha helices and seven antiparallel beta sheets, in a beta-alpha-beta-alpha-alpha-alpha-beta-beta-alpha-beta-beta-beta arrangement. The fold of this family is structurally similar to that found in the C-terminal domain of GHMP Kinase.	196
401158	pfam09110	HAND	HAND. The HAND domain adopts a secondary structure consisting of four alpha helices, three of which (H2, H3, H4) form an L-like configuration. Helix H2 runs antiparallel to helices H3 and H4, packing closely against helix H4, whilst helix H1 reposes in the concave surface formed by these three helices and runs perpendicular to them. The domain confers DNA and nucleosome binding properties to the protein.	110
401159	pfam09111	SLIDE	SLIDE. The SLIDE domain adopts a secondary structure comprising a main core of three alpha-helices. It has a role in DNA binding, contacting DNA target sites similar to c-Myb (pfam00249) repeats or homeodomains.	114
401160	pfam09112	N-glycanase_N	Peptide-N-glycosidase F, N terminal. Members of this family adopt an eight-stranded antiparallel beta jelly roll configuration, with the beta strands arranged into two sheets. They are similar in topology to many viral capsid proteins, as well as lectins and several glucanases. The domain allows the protein to bind sugars and catalyzes the complete removal of N-linked oligosaccharide chains from glycoproteins.	147
401161	pfam09113	N-glycanase_C	Peptide-N-glycosidase F, C terminal. Members of this family adopt an eight-stranded antiparallel beta jelly roll configuration, with the beta strands arranged into two sheets. They are similar in topology to many viral capsid proteins, as well as lectins and several glucanases. The domain allows the protein to bind sugars and catalyzes the complete removal of N-linked oligosaccharide chains from glycoproteins.	137
312591	pfam09114	MotA_activ	Transcription factor MotA, activation domain. Members of this family of viral protein domains are implicated in transcriptional activation. They are almost completely alpha-helical, with five alpha-helices and a short, two-stranded, beta-ribbon. Four alpha helices (alpha1, alpha3, alpha4 and alpha5) are amphipathic and pack their hydrophobic surfaces around the central helix alpha2.	95
401162	pfam09115	DNApol3-delta_C	DNA polymerase III, delta subunit, C terminal. Members of this family, which are predominantly found in prokaryotic DNA polymerase III, assume an alpha helical structure, with a core of five alpha helices, and an additional small helix. They are essential for the formation of the polymerase clamp loader.	112
401163	pfam09116	gp45-slide_C	gp45 sliding clamp, C terminal. Members of this family are essential for the interaction of the gp45 sliding clamp with the corresponding polymerase. They adopt a DNA clamp fold, consisting of two alpha helices and two beta sheets - the fold is duplicated and has internal pseudo two-fold symmetry.	105
370303	pfam09117	MiAMP1	MiAMP1. MiAMP1 is a highly basic protein from the nut kernel of Macadamia integrifolia which inhibits the growth of several microbial plant pathogens in vitro while having no effect on mammalian or plant cells. It consists of eight beta-strands which are arranged in two Greek key motifs. These Greek key motifs then associate to form a Greek key beta-barrel.	76
401164	pfam09118	DUF1929	Domain of unknown function (DUF1929). Members of this family adopt a secondary structure consisting of a bundle of seven, mostly antiparallel, beta-strands surrounding a hydrophobic core. The 7 strands are arranged in 2 sheets, in a Greek-key topology. Their precise function has not, as yet, been defined, though they are mostly found in sugar-utilising enzymes, such as galactose oxidase.	91
401165	pfam09119	SicP-binding	SicP binding. Members of this family bind the chaperone SicP, which is required both to maintain the stability of SptP, as well as to ensure the eventual secretion of the protein. The domain is found in the Salmonella effector protein SptP, which interacts with SicP chaperone dimers mainly through four regions of its chaperone-binding domain. The structure of the SptP-SicP complex contains four molecules of SicP, aligned in a linear fashion and arranged in two sets of tightly bound homodimers that bind two SptP molecules. The SicP homodimers do not interact with each other, but are held together by a molecular interface formed between two SptP molecules. Each SptP molecule is wrapped around by three SicP chaperones (two chaperones from one homodimer and a third one from the opposite homodimer pair).	84
401166	pfam09121	Tower	Tower. Members of this family adopt a secondary structure consisting of a pair of long, antiparallel alpha-helices (the stem) that support a three-helix bundle (3HB) at their end. The 3HB contains a helix-turn-helix motif and is similar to the DNA binding domains of the bacterial site-specific recombinases, and of eukaryotic Myb and homeodomain transcription factors. The Tower domain has an important role in the tumor suppressor function of BRCA2, and is essential for appropriate binding of BRCA2 to DNA.	42
401167	pfam09122	DUF1930	Domain of unknown function (DUF1930). Members of this family are found in 3-mercaptopyruvate sulfurtransferase, and have no known function. They adopt a structure consisting of a four-stranded antiparallel beta-sheet and an alpha-helix, arranged in a beta(2)-alpha-beta(2) fashion, and bearing a remarkable structural similarity to the FK506-binding protein class of peptidylprolyl cis/trans-isomerase.	67
401168	pfam09123	DUF1931	Domain of unknown function (DUF1931). Members of this family, which are found in a set of hypothetical bacterial proteins, contain a core of six alpha-helices, where one central helix is surrounded by the other five. The exact function of this family has not, as yet, been determined. The known structure shows this domain contains two copies of the histone fold.	138
401169	pfam09124	Endonuc-dimeris	T4 recombination endonuclease VII, dimerization. Members of this family, which are predominantly found in Bacteriophage T4 recombination endonuclease VII, adopt a helical secondary structure, with three alpha helices oriented parallel to each other. They mediate dimerization of the protein, as well as binding to the DNA major groove.	54
255195	pfam09125	COX2-transmemb	Cytochrome C oxidase subunit II, transmembrane. Members of this family adopt a tertiary structure consisting of two antiparallel transmembrane helices, in a transmembrane helix hairpin fold.	38
401170	pfam09126	NaeI	Restriction endonuclease NaeI. Members of this family adopt a secondary structure consisting of nine alpha-helices, six 3-10 helices and 13 beta-strands. They bind two GCC-CGG recognition sequences to cleave DNA into blunt-ended products.	288
401171	pfam09127	Leuk-A4-hydro_C	Leukotriene A4 hydrolase, C-terminal. Members of this family adopt a structure consisting of two layers of parallel alpha-helices, five in the inner layer and four in the outer, arranged in an antiparallel manner, with perpendicular loops containing short helical segments on top. They are required for the formation of a deep cleft harbouring the catalytic Zn2+ site in Leukotriene A4 hydrolase.	112
401172	pfam09128	RGS-like	Regulator of G protein signalling-like domain. Members of this family adopt a structure consisting of twelve helices that fold into a compact domain that contains the overall structural scaffold observed in other RGS proteins and three additional helical elements that pack closely to it. Helices 1-9 comprise the RGS (pfam00615) fold, in which helices 4-7 form a classic antiparallel bundle adjacent to the other helices. Like other RGS structures, helices 7 and 8 span the length of the folded domain and form essentially one continuous helix with a kink in the middle. Helices 10-12 form an apparently stable C-terminal extension of the structural domain, and although other RGS proteins lack this structure, these elements are intimately associated with the rest of the structural framework by hydrophobic interactions. Members of the family bind to active G-alpha proteins, promoting GTP hydrolysis by the alpha subunit of heterotrimeric G proteins, thereby inactivating the G protein and rapidly switching off G protein-coupled receptor signalling pathways.	188
401173	pfam09129	Chol_subst-bind	Cholesterol oxidase, substrate-binding. The substrate-binding domain found in Cholesterol oxidase is composed of an eight-stranded mixed beta-pleated sheet and six alpha-helices. This domain is positioned over the isoalloxazine ring system of the FAD cofactor bound by FAD_binding_4 (PF:PF01565) and forms the roof of the active site cavity, allowing for catalysis of oxidation and isomerisation of cholesterol to cholest-4-en-3-one.	321
401174	pfam09130	DUF1932	Domain of unknown function (DUF1932). This domain is found in a set of hypothetical prokaryotic proteins. Its exact function has not, as yet, been described.	70
117687	pfam09131	Endotoxin_mid	Bacillus thuringiensis delta-Endotoxin, middle domain. Members of this family adopt a structure consisting of three four-stranded beta-sheets, each with a Greek key fold, with internal pseudo threefold symmetry. Thus they act as a receptor binding beta-prism, binding to insect-specific receptors of gut epithelial cells.	206
312601	pfam09132	BmKX	BmKX. Members of this family assume a structure adopted by most short-chain scorpion toxins, consisting of a cysteine-stabilized alpha/beta scaffold consisting of a short 3-10-helix and a two-stranded antiparallel beta-sheet. They are predominantly found in short-chain scorpion toxins, and their biological method of action has not, as yet, been defined.	30
401175	pfam09133	SANTA	SANTA (SANT Associated). The SANTA domain (SANT Associated domain) is approximately 90 amino acids in length and is conserved in Eukaryota. It is sometimes found in association with the SANT domain (pfam00249, also known as Myb-like DNA-binding domain), implying a putative function in regulating chromatin remodelling. Sequence analysis has shown that the SANTA domain is likely to form four central beta-sheets with three flanking alpha-helices. Many conserved hydrophobic residues are present, implying a possible role in protein-protein interactions.	87
401176	pfam09134	Invasin_D3	Invasin, domain 3. Members of this family adopt a structure consisting of an immunoglobulin-like beta-sandwich, with seven strands in two beta-sheets, arranged in a Greek-key topology. It forms part of the extracellular region of the protein, which can be expressed as a soluble protein (Inv497) that binds integrins and promotes subsequent uptake by cells when attached to bacteria.	98
401177	pfam09135	Alb1	Alb1. Alb1 is a nuclear shuttling factor involved in ribosome biogenesis.	105
401178	pfam09136	Glucodextran_B	Glucodextranase, domain B. Members of this family adopt a structure consisting of seven/eight-strand antiparallel beta-sheets, in a Greek-key topology, similar to the immunoglobulin beta-sandwich fold. They act as cell wall anchors, where they interact with the S-layer present in the cell wall of Gram-positive bacteria by hydrophobic interactions. In glucodextranase, Domain B is buried in the S-layer, and a flexible linker located between domain B and the catalytic unit confers motion to the catalytic unit, which is capable of efficient hydrolysis of the substrates located close to the cell surface.	89
401179	pfam09137	Glucodextran_N	Glucodextranase, domain N. Members of this family, which are uniquely found in bacterial and archaeal glucoamylases and glucodextranases, adopt a structure consisting of 17 antiparallel beta-strands. These beta-strands are divided into two beta-sheets, and one of the beta-sheets is wrapped by an extended polypeptide, which appears to stabilize the domain. Members of this family are mainly concerned with catalytic activity, hydrolysing alpha-1,6-glucosidic linkages of dextran to release beta-D-glucose from the non-reducing end via an inverting reaction mechanism.	263
370315	pfam09138	Urm1	Urm1 (Ubiquitin related modifier). Urm1 is a ubiquitin related protein that modifies proteins in the yeast ubiquitin-like urmylation pathway. Structural comparisons and phylogenetic analysis of the ubiquitin superfamily have indicated that Urm1 has the most conserved structural and sequence features of the common ancestor of the entire superfamily.	96
401180	pfam09139	Mmp37	Mitochondrial matrix Mmp37. Mmp37 is a mitochondrial matrix protein that functions in the translocation of proteins across the mitochondrial inner membrane. It has been shown that Mmp37 proteins possess the NTase fold but have only one active site carboxylate and thus are probably unable to carry out an enzymatic reaction. These potentially non-active members of the NTase fold superfamily may bind ATP, hydrolysis of which is necessary for the translocation of proteins through the membrane.	322
401181	pfam09140	MipZ	ATPase MipZ. MipZ is an ATPase that forms a complex with the chromosome partitioning protein ParB near the chromosomal origin of replication. It is responsible for the temporal and spatial regulation of FtsZ ring formation.	262
401182	pfam09141	Talin_middle	Talin, middle domain. Members of this family adopt a structure consisting of five alpha helices that fold into a bundle. They contain a Vinculin binding site (VBS) composed of a hydrophobic surface spanning five turns of helix four. Activation of the VBS causes subsequent recruitment of Vinculin, which enables maturation of small integrin/talin complexes into more stable adhesions. Formation of the complex between VBS and Vinculin requires prior unfolding of this middle domain: once released from the talin hydrophobic core, the VBS helix is then available to induce the 'bundle conversion' conformational change within the vinculin head domain thereby displacing the intramolecular interaction with the vinculin tail, allowing vinculin to bind actin.	161
370319	pfam09142	TruB_C	tRNA Pseudouridine synthase II, C terminal. The C terminal domain of tRNA Pseudouridine synthase II adopts a PUA (pfam01472) fold, with a four-stranded mixed beta-sheet flanked by one alpha-helix on each side. It allows for binding of the enzyme to RNA, as well as stabilisation of the RNA molecule.	56
401183	pfam09143	AvrPphF-ORF-2	AvrPphF-ORF-2. Members of this family of plant pathogenic proteins adopt an elongated structure somewhat reminiscent of a mushroom that can be divided into 'stalk' and 'head' subdomains. The stalk subdomain is composed of the N-terminal helix (alpha1) and beta strands beta3-beta4. An antiparallel beta sheet (beta5, beta7-beta8) forms the base of the head subdomain that interacts with the stalk. A pair of twisted antiparallel beta sheets (beta1 and beta6; beta2 and beta9/9') supported by alpha2 form the dome of the head. The head subdomain possesses weak structural similarity with the catalytic portion of a number of ADP-ribosyltransferase toxins.	175
72561	pfam09144	YpM	Yersinia pseudo-tuberculosis mitogen. Members of this family of Yersinia pseudo-tuberculosis mitogens adopt a sandwich structure consisting of nine strands in two beta sheets, in a jelly-roll topology. As with other super-antigens, they are able to excessively activate T cells by binding to the T cell receptor.	117
401184	pfam09145	Ubiq-assoc	Ubiquitin-associated. Ubiquitin associated domains contain approximately 40 residues and bind ubiquitin non-covalently. They adopt a secondary structure consisting of three alpha-helices, and have been identified in various modular proteins involved in protein trafficking, clathrin assembly/disassembly, DNA repair, proteasomal degradation, and cell cycle regulation.	44
286257	pfam09147	DUF1933	Domain of unknown function (DUF1933). Members of this family are predominantly found in carbapenam synthetase, and are composed of two antiparallel six-stranded beta-sheets that form a sandwich, flanked on each side by two alpha-helices. Their exact function has not, as yet, been determined.	201
401185	pfam09148	DUF1934	Domain of unknown function (DUF1934). Members of this family are found in a set of hypothetical bacterial proteins. Their precise function has not, as yet, been defined.	123
401186	pfam09149	DUF1935	Domain of unknown function (DUF1935). Members of this family are found in various bacterial and eukaryotic hypothetical proteins, as well as in the cysteine protease calpain. Their exact function has not, as yet, been defined.	100
401187	pfam09150	Carot_N	Orange carotenoid protein, N-terminal. Members of this family adopt an alpha-helical structure consisting of two four-helix bundles. They are predominantly found in prokaryotic orange carotenoid protein, and carotenoid binding proteins.	149
401188	pfam09151	DUF1936	Domain of unknown function (DUF1936). This domain is found in a set of hypothetical Archaeal proteins. Its exact function has not, as yet, been defined. It possesses a zinc ribbon fold.	34
370325	pfam09152	DUF1937	Domain of unknown function (DUF1937). This domain is found in a set of hypothetical bacterial proteins. Its exact function has not, as yet, been described.	111
401189	pfam09153	DUF1938	Domain of unknown function (DUF1938). Members of this family, which are predominantly found in the archaeal protein O6-alkylguanine-DNA alkyltransferase, adopt a secondary structure consisting of a three stranded antiparallel beta-sheet and three alpha helices. Their exact function has not, as yet, been defined, though it has been postulated that they confer thermostability to the archaeal protein.	90
401190	pfam09154	DUF1939	Domain of unknown function (DUF1939). Members of this family, which are predominantly found in Archaeal amylase, adopt a secondary structure consisting of an eight-stranded antiparallel beta-sheet containing a Greek key motif. Their exact function has not, as yet, been determined.	57
312612	pfam09155	DUF1940	Domain of unknown function (DUF1940). Members of this family adopt a secondary structure consisting of six alpha helices, in which four long helices (alpha1, alpha2, alpha5, alpha6) form a left-handed, antiparallel alpha helical bundle. The function of this family of Archaeal hypothetical proteins has not, as yet, been defined.	143
401191	pfam09156	Anthrax-tox_M	Anthrax toxin lethal factor, middle domain. Members of this family, which are predominantly found in anthrax toxin lethal factor, adopt a structure consisting of a core of antiparallel beta sheets and alpha helices. They form a long deep groove within the protein that anchors the 16-residue N-terminal tail of MAPKK-2 before cleavage. It has been noted that this domain resembles the ADP-ribosylating toxin from Bacillus cereus, but the active site has been modified to augment substrate recognition.	286
401192	pfam09157	TruB-C_2	Pseudouridine synthase II TruB, C-terminal. Members of this family adopt a secondary structure consisting of a four-stranded beta sheet and one alpha helix. They are predominantly RNA-binding domains, mostly found in Pseudouridine synthase II TruB.	57
286268	pfam09158	MotCF	Bacteriophage T4 MotA, C-terminal. Members of this family adopt a compact alpha/beta structure comprising three alpha-helices and six beta-strands in the order: alpha1-beta1-beta2-beta3-beta4-alpha2-beta5-beta6-alpha3. The beta-strands form a single anti-parallel beta-sheet and the three alpha-helices pack side-by-side onto one surface of the beta-sheet. In this architecture, the domain's hydrophobic core is at the sheet-helix interface, and the second surface of the beta-sheet is completely exposed. The domain is a DNA-binding motif, with a consensus sequence containing nine base pairs (5'-TTTGCTTTA-3'), that appears to bind to various mot boxes, allowing access to the minor groove towards the 5'-end of this sequence and the major groove towards the 3'-end.	103
401193	pfam09159	Ydc2-catalyt	Mitochondrial resolvase Ydc2 / RNA splicing MRS1. Members of this family adopt a secondary structure consisting of two beta sheets and one alpha helix, arranged as a beta-alpha-beta motif. Each beta sheet has five strands, arranged in a 32145 order, with the second strand being antiparallel to the rest. Mitochondrial resolvase Ydc2 is capable of resolving Holliday junctions and cleaves DNA after 5'-CT-3' and 5'-TT-3' sequences. This family also contains the mitochondrial RNA-splicing protein MRS1 which is involved in the excision of group I introns.	251
370330	pfam09160	FimH_man-bind	FimH, mannose binding. Members of this family adopt a secondary structure consisting of a beta sandwich, with nine strands arranged in two sheets in a Greek key topology. They are predominantly found in bacterial mannose-specific adhesins, since they are capable of binding to D-mannose.	144
401194	pfam09162	Tap-RNA_bind	Tap, RNA-binding. Members of this family adopt a structure consisting of an alpha+beta sandwich with an antiparallel beta-sheet, arranged in a 2(beta-alpha-beta) motif. They are mainly found in mRNA export factors, and mediate the sequence nonspecific nuclear export of cellular mRNAs as well as the sequence-specific export of retroviral mRNAs bearing the constitutive transport element.	83
401195	pfam09163	Form-deh_trans	Formate dehydrogenase N, transmembrane. Members of this family are predominantly found in the beta subunit of formate dehydrogenase, and consist of a single transmembrane helix. They act as a transmembrane anchor, and allow for conduction of electrons within the protein.	43
401196	pfam09164	VitD-bind_III	Vitamin D binding protein, domain III. Members of this family are predominantly found in Vitamin D binding protein, and adopt a multi-helical structure. They are required for formation of an actin 'clamp', allowing the protein to bind to actin.	65
401197	pfam09165	Ubiq-Cytc-red_N	Ubiquinol-cytochrome c reductase 8 kDa, N-terminal. Members of this family adopt a structure consisting of many antiparallel beta sheets, with few alpha helices, in a non-globular arrangement. They are required for proper functioning of the respiratory chain.	74
401198	pfam09166	Biliv-reduc_cat	Biliverdin reductase, catalytic. Members of this family adopt a structure consisting of four alpha helices and six beta sheets, in an alpha-beta-alpha-alpha-alpha-beta-beta-beta-beta-beta arrangement. They contain a catalytic active site, capable of reducing the gamma-methene bridge of the open tetrapyrrole, biliverdin IX alpha, to bilirubin with the concomitant oxidation of a NADH or NADPH cofactor.	113
401199	pfam09167	DUF1942	Domain of unknown function (DUF1942). Members of this family of bacterial proteins assume a beta-sandwich structure consisting of two antiparallel beta-sheets similar to an immunoglobulin-like fold, with an additional small, antiparallel beta-sheet. The longer-stranded beta-sheet is made up of four antiparallel beta-strands. The shorter-stranded beta-sheet consists of five beta-strands, four of which form an antiparallel beta-sheet. The exact function of this family of proteins is unknown, though a putative role is involvement in host-bacterial interactions during endocytosis or phagocytosis, possibly during bacterial internalisation.	123
401200	pfam09168	PepX_N	X-Prolyl dipeptidyl aminopeptidase PepX, N-terminal. Members of this family adopt a secondary structure consisting of a helical bundle of eight alpha helices and three beta strands, the last alpha helix connecting to the first strand of the catalytic domain. The first strand of the N-terminus also forms a small parallel beta sheet with strand 5' of the catalytic domain. The domain mediates dimerization of the protein, with two proline residues present in the domain being critical for interaction.	156
401201	pfam09169	BRCA-2_helical	BRCA2, helical. Members of this family adopt a helical structure, consisting of a four-helix cluster core (alpha 1, alpha 8, alpha 9, alpha 10) and two successive beta-hairpins (beta 1 to beta 4). An approx. 50-amino acid segment that contains four short helices (alpha 2 to alpha 4), meanders around the surface of the core structure. In BRCA2, the alpha 9 and alpha 10 helices pack with BRCA-2_OB1 (pfam09103) through van der Waals contacts involving hydrophobic and aromatic residues, and also through side-chain and backbone hydrogen bonds. The domain binds the 70-amino acid DSS1 (deleted in split-hand/split foot syndrome) protein, which was originally identified as one of three genes that map to a 1.5-Mb locus deleted in an inherited developmental malformation syndrome.	187
401202	pfam09170	STN1_2	CST, Suppressor of cdc thirteen homolog, complex subunit STN1. STN1 is a component of the CST complex, a complex that binds to single-stranded DNA and is required for protecting telomeres from DNA degradation. The CST complex binds single-stranded DNA with high affinity in a sequence-independent manner, while isolated subunits bind DNA with low affinity on their own. In addition to telomere protection, the CST complex probably has a more general role in DNA metabolism at non-telomeric sites.	176
401203	pfam09171	AGOG	N-glycosylase/DNA lyase. This domain is predominantly found in the Archaeal protein N-glycosylase/DNA lyase.	248
401204	pfam09172	DUF1943	Domain of unknown function (DUF1943). Members of this family adopt a structure consisting of several large open beta-sheets. Their exact function has not, as yet, been determined.	302
401205	pfam09173	eIF2_C	Initiation factor eIF2 gamma, C terminal. Members of this family, which are found in the initiation factors eIF2 and EF-Tu, adopt a structure consisting of a beta barrel with Greek key topology. They are required for formation of the ternary complex with GTP and initiator tRNA.	85
401206	pfam09174	Maf1	Maf1 regulator. Maf1 is a negative regulator of RNA polymerase III. It targets the initiation factor TFIIIB.	174
401207	pfam09175	DUF1944	Domain of unknown function (DUF1944). Members of this family adopt a structure consisting of several large open beta-sheets. Their exact function has not, as yet, been determined.	167
401208	pfam09176	Mpt_N	Methylene-tetrahydromethanopterin dehydrogenase, N-terminal. Members of this family adopt an alpha-beta structure, with a core comprising three alpha/beta/alpha layers, in which each sheet contains four strands. They are predominantly found in prokaryotic methylene-tetrahydromethanopterin dehydrogenase, which catalyzes the dehydrogenation of methylene-tetrahydromethanopterin and the reversible dehydrogenation of methylene-H(4)F.	76
401209	pfam09177	Syntaxin-6_N	Syntaxin 6, N-terminal. Members of this family, which are found in the amino terminus of various SNARE proteins, adopt a structure consisting of an antiparallel three-helix bundle. Their exact function has not been determined, though it is known that they regulate the SNARE motif, as well as mediate various protein-protein interactions involved in membrane-transport.	91
286287	pfam09178	DUF1945	Domain of unknown function (DUF1945). Members of this family, which are predominantly found in prokaryotic 4-alpha-glucanotransferase, adopt a structure composed of six antiparallel beta-strands, four of which form a beta-sheet and another two form a type I' beta-hairpin. The role of this family of domains has not, as yet, been defined.	50
401210	pfam09179	TilS	TilS substrate binding domain. This domain is found in the tRNA(Ile) lysidine synthetase (TilS) protein.	66
401211	pfam09180	ProRS-C_1	Prolyl-tRNA synthetase, C-terminal. Members of this family are predominantly found in prokaryotic prolyl-tRNA synthetase. They contain a zinc binding site, and adopt a structure consisting of alpha helices and antiparallel beta sheets arranged in 2 layers, in a beta-alpha-beta-alpha-beta motif.	70
401212	pfam09181	ProRS-C_2	Prolyl-tRNA synthetase, C-terminal. Members of this family are predominantly found in prokaryotic prolyl-tRNA synthetase. They contain a zinc binding site, and adopt a structure consisting of alpha helices and antiparallel beta sheets arranged in 2 layers, in a beta-alpha-beta-alpha-beta motif.	66
401213	pfam09182	PuR_N	Bacterial purine repressor, N-terminal. The N-terminal domain of the bacterial purine repressor PuR is a winged-helix domain, a subdivision of the HTH structural family. It consists of a canonical arrangement of secondary structures: a1-b1-a2-T-a3-b2-W-b3, where a2-T-a3 is the HTH motif, a3 is the recognition helix, and W is the wing. The domain allows for recognition of a conserved CGAA sequence in the centre of a DNA PurBox, resulting in binding to the major groove of DNA.	70
401214	pfam09183	DUF1947	Domain of unknown function (DUF1947). Members of this family are found in a set of hypothetical Archaeal proteins. Their exact function has not, as yet, been defined.	63
401215	pfam09184	PPP4R2	PPP4R2. PPP4R2 (protein phosphatase 4 core regulatory subunit R2) is the regulatory subunit of the histone H2A phosphatase complex. It has been shown to confer resistance to the anticancer drug cisplatin in yeast, and may confer resistance in higher eukaryotes.	287
401216	pfam09185	DUF1948	Domain of unknown function (DUF1948). Members of this family of Mycoplasma hypothetical proteins adopt a helical structure, with one central alpha-helix surrounded by five others, in a NusB-like fold. Their function has not, as yet, been determined.	140
401217	pfam09186	DUF1949	Domain of unknown function (DUF1949). Members of this family pertain to a set of functionally uncharacterized hypothetical bacterial proteins. They adopt a ferredoxin-like fold, with a beta-alpha-beta-beta-alpha-beta arrangement.	56
401218	pfam09187	RdDM_RDM1	RNA-directed DNA methylation 1. This family of plant proteins includes RDM1 from Arabidopsis, which is a component of the RNA-directed DNA methylation (RdDM) effector complex and may have a role in linking siRNA production with pre-existing or de novo cytosine methylation. As part of the DDR complex with two other RdDM components, it has been shown to facilitate association of PolV to chromatin.	119
286297	pfam09188	DUF1951	Domain of unknown function (DUF1951). Members of this family of Mycoplasma hypothetical proteins adopt a helical structure, with a buried central helix. Their function has not, as yet, been determined.	137
401219	pfam09189	DUF1952	Domain of unknown function (DUF1952). Members of this family are found in various Thermus thermophilus proteins. Their exact function has not, as yet, been determined.	73
401220	pfam09190	DALR_2	DALR domain. This DALR domain is found in cysteinyl-tRNA-synthetases.	63
401221	pfam09191	CD4-extracel	CD4, extracellular. Members of this family adopt an immunoglobulin-like beta-sandwich, with seven strands in 2 beta sheets, in a Greek key topology. They are predominantly found in the extracellular portion of CD4 proteins, where they enable interaction with major histocompatibility complex class II antigens.	105
370352	pfam09192	Act-Frag_cataly	Actin-fragmin kinase, catalytic. Members of this family assume a secondary structure consisting of eight beta strands and 11 alpha-helices, organized in two lobes. They are predominantly found in actin-fragmin kinase, where they act as a catalytic domain that mediates the phosphorylation of actin.	278
401222	pfam09193	CholecysA-Rec_N	Cholecystokinin A receptor, N-terminal. Members of this family are found in the extracellular region of the cholecystokinin A receptor, where they adopt a tertiary structure consisting of a few helical turns and a disulphide-crosslinked loop. They are required for interaction of the cholecystokinin A receptor with its corresponding hormonal ligand.	47
370354	pfam09194	Endonuc-BsobI	Restriction endonuclease BsobI. Members of this family of prokaryotic restriction endonucleases recognize the double-stranded sequence CYCGRG (where Y = T/C, and R = A/G) and cleave after C-1. They catalyze the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5'-phosphates.	314
370355	pfam09195	Endonuc-BglII	Restriction endonuclease BglII. Members of this family are predominantly found in prokaryotic restriction endonuclease BglII, and adopt a structure consisting of an alpha/beta core containing a six-stranded beta-sheet surrounded by five alpha-helices, two of which are involved in homodimerization of the endonuclease. They recognize the double-stranded DNA sequence AGATCT and cleave after A-1, resulting in specific double-stranded fragments with terminal 5'-phosphates.	161
286305	pfam09196	DUF1953	Domain of unknown function (DUF1953). This domain is found in the Archaeal protein maltooligosyl trehalose synthase produced by Sulfolobus spp. Its function has not, as yet, been defined.	63
401223	pfam09197	Rap1-DNA-bind	Rap1, DNA-binding. Members of this family, which are predominantly found in the yeast protein rap1, assume a secondary structure consisting of a three-helix bundle and an N-terminal arm. They contain an Arg-Asp-Arg-Lys sequence that interacts with an ACACC region in the 3' region of the DNA-binding site.	108
72614	pfam09198	T4-Gluco-transf	Bacteriophage T4 beta-glucosyltransferase. Members of this family are DNA-modifying enzymes encoded by bacteriophage T4 that transfer glucose from uridine diphosphoglucose to 5-hydroxymethyl cytosine bases of phage T4 DNA.	38
401224	pfam09199	SSL_OB	Staphylococcal superantigen-like OB-fold domain. This OB-fold domain folds into a five-stranded beta-barrel. Members of this family are found in various staphylococcal toxins described as staphylococcal superantigen-like (SSL) proteins that are related to the staphylococcal enterotoxins (SEs) or superantigens. These SSL proteins, of which 11 have so far been characterized, have a typical SE tertiary structure consisting of a distinct oligonucleotide/oligosaccharide binding fold (OB-fold), this domain, linked to a beta-grasp domain, family Stap_Strp_tox_C, pfam02876. SSLs do not bind to T-cell receptors or major histocompatibility complex class II molecules and do not stimulate T cells. SSLs target components of innate immunity, such as complement, Fc receptors, and myeloid cells. SSL protein 7 (SSL7) is the best characterized of the SSLs and binds complement factor C5 and IgA with high affinity and inhibits the end stage of complement activation and IgA binding to FcalphaR.	84
312644	pfam09200	Monellin	Monellin. Monellin, a protein produced by the West African plant Dioscoreophyllum cumminsii, is approximately 70,000 times sweeter than sucrose on a molar basis. The protein adopts an alpha-beta structure, with a cystatin-like fold, where each helix packs against a coiled antiparallel beta-sheet.	40
370358	pfam09201	SRX	SRX, signal recognition particle receptor alpha subunit. Members of this family, which are predominantly found in eukaryotic signal recognition particle receptor alpha, consist of a central six-stranded anti-parallel beta-sheet sandwiched by helix alpha1 on one side and helices alpha2-alpha4 on the other. They interact with the small GTPase SR-beta, forming a complex that matches a class of small G protein-effector complexes, including Rap-Raf, Ras-PI3K(gamma), Ras-RalGDS, and Arl2-PDE(delta). Structurally the alpha subunit is SNARE-like.	148
401225	pfam09202	Rio2_N	Rio2, N-terminal. Members of this family are found in Rio2, and are structurally homologous to the winged helix (wHTH) domain. They adopt a structure consisting of four alpha helices followed by two beta strands and a fifth alpha helix. The domain confers DNA binding properties to the protein, as per other winged helix domains.	82
401226	pfam09203	MspA	MspA. MspA is a membrane porin produced by Mycobacteria, allowing hydrophilic nutrients to enter the bacterium. The protein forms a tightly interconnected octamer with eightfold rotation symmetry that resembles a goblet and contains a central channel. Each subunit fold contains a beta-sandwich of Ig-like topology and a beta-ribbon arm that forms an oligomeric transmembrane barrel.	175
401227	pfam09204	Colicin_immun	Bacterial self-protective colicin-like immunity. Colicin D, which is synthesized by various prokaryotes, adopts an antiparallel four helical bundle fold: the helices are tightly packed, forming a compact cylindrical molecule. The protein specifically cleaves the anticodon loop of all four tRNA-Arg isoacceptors, thereby inactivating prokaryotic protein synthesis and leading to cell death. This family also contains immunity proteins to klebicins and microcins. Many bacteria produce proteins that destroy their competitors. Colicin D is one such. The immunity proteins are expressed on the same operon as their cognate bacteriocins and protect the expressing bacterium from the effects of its own bacteriocin.	84
401228	pfam09205	DUF1955	Domain of unknown function (DUF1955). Members of this family are found in hypothetical proteins synthesized by the Archaeal organism Sulfolobus. Their exact function has not, as yet, been determined.	159
401229	pfam09206	ArabFuran-catal	Alpha-L-arabinofuranosidase B, catalytic. Members of this family, which are present in fungal alpha-L-arabinofuranosidase B, adopt a beta-sandwich fold similar to that of Concanavalin A-like lectins/glucanase. The beta-sandwich fold consists of two anti-parallel beta-sheets with seven and six strands, respectively. In addition, there are four helices outside of the beta-strands. The beta-sandwich strands are closely packed and curved with a jelly roll topology, creating a small catalytic pocket. The domain catalyzes the hydrolysis of alpha-1,2-, alpha-1,3- and alpha-1,5-L-arabinofuranosidic bonds in L-arabinose-containing hemicelluloses such as arabinoxylan and L-arabinan.	315
312650	pfam09207	Yeast-kill-tox	Yeast killer toxin. Members of this family, which are produced by Williopsis fungi, adopt a secondary structure consisting of eight strands in two beta sheets, in a Greek-key topology.	87
401230	pfam09208	Endonuc-MspI	Restriction endonuclease MspI. Members of this family of prokaryotic restriction endonucleases recognize the palindromic tetranucleotide sequence 5'-CCGG and cleave between the first and second nucleotides, leaving 2 base 5' overhangs. They fold into an alpha/beta architecture, with a five-stranded mixed beta-sheet sandwiched on both sides by alpha-helices.	258
401231	pfam09209	DUF1956	Domain of unknown function (DUF1956). Members of this family are found in various prokaryotic transcriptional regulator proteins. Their exact function has not, as yet, been identified.	124
401232	pfam09210	DUF1957	Domain of unknown function (DUF1957). This domain is found in a set of hypothetical bacterial proteins. Its exact function has not, as yet, been defined.	98
401233	pfam09211	DUF1958	Domain of unknown function (DUF1958). Members of this functionally uncharacterized family are found in prokaryotic penicillin-binding protein 4.	63
117765	pfam09212	CBM27	Carbohydrate binding module 27. Members of this family are carbohydrate binding modules that bind to beta-1, 4-manno-oligosaccharides, carob galactomannan, and konjac glucomannan, but not to cellulose (insoluble and soluble) or soluble birchwood xylan. They adopt a beta sandwich structure comprising 13 beta strands with a single, small alpha-helix and a single metal atom.	173
117766	pfam09213	M3	M3. Members of this family of viral chemokine binding proteins adopt a structure consisting of two different beta-sandwich domains of partial topological similarity to immunoglobulin-like folds. They bind with the CC-chemokine MCP-1, acting as cytokine decoy receptors.	367
401234	pfam09214	Prd1-P2	Bacteriophage Prd1, adsorption protein P2. Members of this family form a set of bacteriophage adsorption proteins, composed mainly of beta-strands whose complicated topology forms an elongated seahorse-shaped molecule with a distinct head, containing a pseudo-beta propeller structure with approximate 6-fold symmetry, and tail. They are required for the attachment of the phage to the host conjugative DNA transfer complex. This is a poorly understood large transmembrane complex of unknown architecture, with at least 11 different proteins.	554
401235	pfam09215	Phage-Gp8	Bacteriophage T4, Gp8. Members of this family of viral baseplate structural proteins adopt a structure consisting of a three-layer beta-sandwich with two finger-like loops containing an alpha-helix at the opposite sides of the sandwich. The two peripheral, five-stranded, antiparallel beta-sheets are stacked against the middle, four-stranded, antiparallel beta-sheet. Attachment of this family of proteins to the baseplate during assembly creates a binding site for subsequent attachment of Gp6.	337
401236	pfam09216	Pfg27	Pfg27. Members of this family are essential for gametocytogenesis in Plasmodium falciparum. They contain a fold composed of two pseudo dyad-related repeats of the helix-turn-helix motif, serving as a platform for RNA and Src homology-3 (SH3) binding.	176
401237	pfam09217	EcoRII-N	Restriction endonuclease EcoRII, N-terminal. The N-terminal effector-binding domain of the Restriction Endonuclease EcoRII has a DNA recognition fold, allowing for binding to 5'-CCWGG sequences. It assumes a structure composed of an eight-stranded beta-sheet with the strands in the order of b2, b5, b4, b3, b7, b6, b1 and b8. They are mostly antiparallel to each other except that b3 is parallel to b7. Alternatively, it may also be viewed as consisting of two mini beta-sheets of four antiparallel beta-strands, sheet I from beta-strands b2, b5, b4, b3 and sheet II from strands b7, b6, b1, b8, folded into an open mixed beta-barrel with a novel topology. Sheet I has a simple Greek key motif while sheet II does not.	148
401238	pfam09218	DUF1959	Domain of unknown function (DUF1959). This domain is found in a set of uncharacterized Archaeal hypothetical proteins. Its function has not, as yet, been described.	116
401239	pfam09220	LA-virus_coat	L-A virus, major coat protein. Members of this family form the major coat protein of the Saccharomyces cerevisiae L-A virus.	439
401240	pfam09221	Bacteriocin_IId	Bacteriocin class IId cyclical uberolysin-like. Members of this family are membrane-interacting peptides, produced by Firmicutes that display a broad anti-microbial spectrum against Gram-positive and Gram-negative bacteria. They adopt a helical structure, with four or five alpha helices forming a Saposin-like fold. The structure has been found to be cyclical. It should be pointed out that one reference implies that both circularin A and gassericin A are class V or IIc-type bacteriocins; however, we find that these two proteins fall into different Pfam families, this one and BacteriocIIc_cy, pfam12173.	67
370370	pfam09222	Fim-adh_lectin	Fimbrial adhesin F17-AG, lectin domain. Members of this family are carbohydrate-specific lectin domains found in bacterial fimbrial adhesins. They adopt a compact, elongated structure consisting of a beta-sandwich with two major sheets: one consisting of five long strands in mixed orientations, and a front sheet with four antiparallel strands, forming an immunoglobulin-like fold.	171
401241	pfam09223	ZinT	ZinT (YodA) periplasmic lipocalin-like zinc-recruitment. ZinT plays a critical role in recruiting periplasmic zinc to the bacterial zinc-uptake complex ZnuABC, consisting of families pfam01297, pfam00950, pfam00005, regulated by the transcription-regulator FUR, pfam01475. ZinT acts as a Zn2+-buffering protein that delivers Zn2+ to ZnuA (TroA), pfam01297. Members of this family of prokaryotic domains were first identified as part of the response of bacteria to a challenge with the toxic heavy metal cadmium. They are able to bind to cadmium, and ensure its subsequent elimination.	181
401242	pfam09224	DUF1961	Domain of unknown function (DUF1961). Members of this family are found in a set of hypothetical bacterial proteins. Their exact function has not, as yet, been determined.	214
370372	pfam09225	Endonuc-PvuII	Restriction endonuclease PvuII. Members of this family are predominantly found in prokaryotic restriction endonuclease PvuII. They recognize the double-stranded DNA sequence 5'-CAGCTG-3' and cleave after G-3, resulting in specific double-stranded fragments with terminal 5'-phosphates.	154
312659	pfam09226	Endonuc-HincII	Restriction endonuclease HincII. Members of this family of prokaryotic restriction endonucleases recognize the double-stranded sequence 5'-GTYRAC-3' and cleave after Y-3. They catalyze the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5'-phosphates.	247
286330	pfam09227	DUF1962	Domain of unknown function (DUF1962). Members of this family of fungal domains are functionally uncharacterized.	64
401243	pfam09228	Prok-TraM	Prokaryotic Transcriptional repressor TraM. Members of this family of transcriptional repressors adopt a T-shaped structure, with a core composed of two antiparallel alpha-helices. These proteins can be divided into two parts, a 'globular head' and an 'elongated tail', and they negatively regulate conjugation and the expression of tra genes by antagonising traR/AAI-dependent activation.	102
401244	pfam09229	Aha1_N	Activator of Hsp90 ATPase, N-terminal. Members of this family, which are predominantly found in the protein 'Activator of Hsp90 ATPase' adopt a secondary structure consisting of an N-terminal alpha-helix leading into a four-stranded meandering antiparallel beta-sheet, followed by a C-terminal alpha-helix. The two helices are packed together, with the beta-sheet curving around them. They bind to the molecular chaperone HSP82 and stimulate its ATPase activity.	130
401245	pfam09230	DFF40	DNA fragmentation factor 40 kDa. Members of this family of eukaryotic apoptotic proteins induce DNA fragmentation and chromatin condensation during apoptosis.	225
286334	pfam09231	RDV-p3	Rice dwarf virus p3. Members of this family are core structural proteins found in the double-stranded RNA virus Phytoreovirus. They are large proteins without apparent domain division, with a number of all-alpha regions and one all beta domain near the C-terminal end.	963
370376	pfam09232	Caenor_Her-1	Caenorhabditis elegans Her-1. Her-1 adopts an all-helical structure with two subdomains: residues 19-80 comprise a left-handed three-helix bundle with an overhand connection between the second and third helices, whilst residues 81-164 comprise a left-handed anti-parallel four-helix bundle in which the first helix consists of four consecutive turns of 3-10-helix. Fourteen Cys are conserved in all known HER-1 sequences and form seven disulfide bonds. The protein dictates male development in Caenorhabditis elegans, probably by playing a direct role in cell signaling during C. elegans sex determination. It also inhibits the function of tra-2a.	131
401246	pfam09233	Endonuc-EcoRV	Restriction endonuclease EcoRV. Members of this family of prokaryotic restriction endonucleases recognize the double-stranded sequence 5'-GATATC-3' and cleave after T-3. They catalyze the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5'-phosphates.	240
401247	pfam09234	DUF1963	Domain of unknown function (DUF1963). This domain is found in a set of hypothetical bacterial proteins. Its exact function has not, as yet, been described.	178
401248	pfam09235	Ste50p-SAM	Ste50p, sterile alpha motif. The fungal Ste50p SAM domain consists of five helices, which form a compact, globular fold. It is required for mediation of homodimerization and heterodimerization (and in some cases oligomerization) of the protein.	75
401249	pfam09236	AHSP	Alpha-haemoglobin stabilizing protein. Alpha-haemoglobin stabilizing protein (AHSP) acts as a molecular chaperone for free alpha-haemoglobin, preventing the harmful aggregation of alpha-haemoglobin during normal erythroid cell development: it specifically protects free alpha-haemoglobin from precipitation. AHSP adopts a helical secondary structure consisting of an elongated antiparallel three alpha-helix bundle.	87
150045	pfam09237	GAGA	GAGA factor. Members of this family bind to a 5'-GAGAG-3' DNA consensus binding site, and contain a Cys2-His2 zinc finger core as well as an N-terminal extension containing two highly basic regions. The zinc finger core binds in the DNA major groove and recognizes the first three GAG bases of the consensus in a manner similar to that seen in other classical zinc finger-DNA complexes. The second basic region forms a helix that interacts in the major groove recognising the last G of the consensus, while the first basic region wraps around the DNA in the minor groove and recognizes the A in the fourth position of the consensus sequence.	54
401250	pfam09238	IL4Ra_N	Interleukin-4 receptor alpha chain, N-terminal. Members of this family are related in overall topology to fibronectin type III modules and fold into a sandwich comprising seven antiparallel beta strands arranged in a three-strand and a four-strand beta-pleated sheet. They are required for binding of interleukin-4 to the receptor alpha chain, which is a crucial event for the generation of a Th2-dominated early immune response.	90
401251	pfam09239	Topo-VIb_trans	Topoisomerase VI B subunit, transducer. Members of this family adopt a structure consisting of a four-stranded beta-sheet backed by three alpha-helices, the last of which is over 50 amino acids long and extends from the body of the protein by several turns. This domain has been proposed to mediate intersubunit communication by structurally transducing signals from the ATP binding and hydrolysis domains to the DNA binding and cleavage domains of the gyrase holoenzyme.	157
401252	pfam09240	IL6Ra-bind	Interleukin-6 receptor alpha chain, binding. Members of this family adopt a structure consisting of an immunoglobulin-like beta-sandwich, with seven strands in two beta-sheets, in a Greek-key topology. They are required for binding to the cytokine Interleukin-6.	96
286342	pfam09241	Herp-Cyclin	Herpesviridae viral cyclin. Members of this family of viral cyclins adopt a helical structure consisting of five alpha-helices, with one helix surrounded by the others. They specifically activate CDK6 of host cells to a very high degree.	106
401253	pfam09242	FCSD-flav_bind	Flavocytochrome c sulphide dehydrogenase, flavin-binding. Members of this family adopt a structure consisting of a beta(3,4)-alpha(3) core, and an alpha+beta sandwich. They are required for binding to flavin, and subsequent electron transfer.	68
401254	pfam09243	Rsm22	Mitochondrial small ribosomal subunit Rsm22. Rsm22 has been identified as a mitochondrial small ribosomal subunit and is a methyltransferase. In Schizosaccharomyces pombe, Rsm22 is tandemly fused to Cox11 (a factor required for copper insertion into cytochrome oxidase) and the two proteins are proteolytically cleaved after import into the mitochondria.	275
401255	pfam09244	DUF1964	Domain of unknown function (DUF1964). Members of this family of bacterial domains adopt a beta-sandwich fold, with Greek-key topology. They are C-terminal to the catalytic sucrose phosphorylase beta/alpha barrel domain, and are functionally uncharacterized.	67
312671	pfam09245	MA-Mit	Mycoplasma arthritidis-derived mitogen. Mycoplasma arthritidis-derived mitogen (MA-Mit) adopts a completely alpha-helical structure consisting of ten alpha helices. It is a superantigen that can activate large fractions of T cells bearing particular TCR V-beta elements. Two MA-Mit molecules form an asymmetric dimer and cross-link two MHC antigens to form a dimerized MA-Mit-MHC complex.	213
204178	pfam09246	PHAT	PHAT. The PHAT (pseudo-HEAT analogous topology) domain assumes a structure consisting of a layer of three parallel helices packed against a layer of two antiparallel helices, into a cylindrical shaped five-helix bundle. It is found in the RNA-binding protein Smaug, where it is essential for high-affinity RNA binding.	108
370383	pfam09247	TBP-binding	TATA box-binding protein binding. Members of this family adopt a structure consisting of three alpha helices and a beta-hairpin. They bind to TATA box-binding protein (TBP), inhibiting TBP interaction with the TATA element, thereby resulting in shutting down of gene transcription.	58
401256	pfam09248	DUF1965	Domain of unknown function (DUF1965). Members of this family of fungal domains adopt a structure that consists of an alpha/beta motif. Their exact function has not, as yet, been determined.	69
401257	pfam09249	tRNA_NucTransf2	tRNA nucleotidyltransferase, second domain. Members of this family adopt a structure consisting of a five helical bundle core. They are predominantly found in Archaeal tRNA nucleotidyltransferase, following the catalytic nucleotidyltransferase domain.	111
401258	pfam09250	Prim-Pol	Bifunctional DNA primase/polymerase, N-terminal. Members of this family adopt a structure consisting of a core of antiparallel beta sheets. They are found in various bacterial hypothetical proteins, and have been shown to harbour both primase and polymerase activities.	158
312676	pfam09251	PhageP22-tail	Salmonella phage P22 tail-spike. Members of this family of viral domains adopt a structure consisting of a single-stranded right-handed beta-helix, which in turn is made of parallel beta-strands and short turns. They are required for recognition of the O-antigenic repeating units of the cell surface, and for subsequent infection of the bacterial cell.	550
401259	pfam09252	Feld-I_B	Allergen Fel d I-B chain. Members of this family of cat allergens adopt a helical structure consisting of eight alpha helices, in a Uteroglobin-like fold. They are one of the most important causes of allergic asthma worldwide.	67
401260	pfam09253	Ole-e-6	Pollen allergen ole e 6. Members of this family consist of two nearly antiparallel alpha-helices, that are connected by a short loop and followed by a long, unstructured C-terminal tail. They are highly allergenic, primarily mediating olive allergy.	39
401261	pfam09254	Endonuc-FokI_C	Restriction endonuclease FokI, C terminal. Members of this family are predominantly found in prokaryotic restriction endonuclease FokI, and adopt a structure consisting of an alpha/beta/alpha core containing a five-stranded beta-sheet. They recognize the double-stranded DNA sequence 5'-GGATG-3' and cleave DNA phosphodiester groups 9 base pairs away on this strand and 13 base pairs away on the complementary strand.	188
286355	pfam09255	Antig_Caf1	Caf1 Capsule antigen. Members of this family are predominantly found in the F1 capsule antigen Caf1 synthesized by Yersinia bacteria. They adopt a structure consisting of seven strands arranged in two beta-sheets, in a Greek-key topology, and mediate targeting of the bacterium to sites of infection.	136
401262	pfam09256	BaffR-Tall_bind	BAFF-R, TALL-1 binding. Members of this family, which are predominantly found in the tumor necrosis factor receptor superfamily member 13c, BAFF-R, are required for binding to tumor necrosis factor ligand TALL-1.	28
401263	pfam09257	BCMA-Tall_bind	BCMA, TALL-1 binding. Members of this family, which are predominantly found in the tumor necrosis factor receptor superfamily member 17, BCMA, are required for binding to tumor necrosis factor ligand TALL-1.	37
401264	pfam09258	Glyco_transf_64	Glycosyl transferase family 64 domain. Members of this family catalyze the transfer reaction of N-acetylglucosamine and N-acetylgalactosamine from the respective UDP-sugars to the non-reducing end of [glucuronic acid]beta 1-3[galactose]beta 1-O-naphthalenemethanol, an acceptor substrate analog of the natural common linker of various glycosaminoglycans. They are also required for the biosynthesis of heparan-sulphate.	240
401265	pfam09259	Fve	Fungal immunomodulatory protein Fve. Fve is a major fruiting body protein from Flammulina velutipes, a mushroom possessing immunomodulatory activity. It stimulates lymphocyte mitogenesis, suppresses systemic anaphylaxis reactions and oedema, enhances transcription of IL-2, IFN-gamma and TNF-alpha, and haemagglutinates red blood cells. It appears to be a lectin with specificity for complex cell-surface carbohydrates. Fve adopts a tertiary structure consisting of an immunoglobulin-like beta-sandwich, with seven strands arranged in two beta sheets, in a Greek-key topology. It forms a non-covalently linked homodimer containing no Cys, His or Met residues; dimerization occurs by 3-D domain swapping of the N-terminal helices and is stabilized predominantly by hydrophobic interactions.	111
370390	pfam09260	DUF1966	Domain of unknown function (DUF1966). This domain is found in various fungal alpha-amylase proteins. Its exact function has not, as yet, been defined.	90
401266	pfam09261	Alpha-mann_mid	Alpha mannosidase middle domain. Members of this family adopt a structure consisting of three alpha helices, in an immunoglobulin/albumin-binding domain-like fold. They are predominantly found in the enzyme alpha-mannosidase.	98
401267	pfam09262	PEX-1N	Peroxisome biogenesis factor 1, N-terminal. Members of this family adopt a double psi beta-barrel fold, similar in structure to the Cdc48 N-terminal domain. It has been suggested that this domain may be involved in interactions with ubiquitin, ubiquitin-like protein modifiers, or ubiquitin-like domains, such as Ubx. Furthermore, the domain may possess a putative adaptor or substrate binding site, allowing for peroxisomal biogenesis, membrane fusion and protein translocation.	77
401268	pfam09263	PEX-2N	Peroxisome biogenesis factor 1, N-terminal. Members of this family adopt a Cdc48 domain 2-like fold, with a beta-alpha-beta(3) arrangement. It has been suggested that this domain may be involved in interactions with ubiquitin, ubiquitin-like protein modifiers, or ubiquitin-like domains, such as Ubx. Furthermore, the domain may possess a putative adaptor or substrate binding site, allowing for peroxisomal biogenesis, membrane fusion and protein translocation.	83
401269	pfam09264	Sial-lect-inser	Vibrio cholerae sialidase, lectin insertion. Members of this family are predominantly found in Vibrio cholerae sialidase, and adopt a beta sandwich structure consisting of 12-14 strands arranged in two beta-sheets. They bind to lectins with high affinity helping to target the protein to sialic acid-rich environments, thereby enhancing the catalytic efficiency of the enzyme.	198
401270	pfam09265	Cytokin-bind	Cytokinin dehydrogenase 1, FAD and cytokinin binding. Members of this family adopt an alpha+beta sandwich structure with an antiparallel beta-sheet, in a ferredoxin-like fold. They are predominantly found in plant cytokinin dehydrogenase 1, where they are capable of binding both FAD and cytokinin substrates. The substrate displays a 'plug-into-socket' binding mode that seals the catalytic site and precisely positions the carbon atom undergoing oxidation in close contact with the reactive locus of the flavin.	278
286365	pfam09266	VirDNA-topo-I_N	Viral DNA topoisomerase I, N-terminal. Members of this family are predominantly found in viral DNA topoisomerase, and assume a beta(2)-alpha-beta-alpha-beta(2) fold, with a left-handed crossover between strands beta2 and beta3.	58
370396	pfam09267	Dict-STAT-coil	Dictyostelium STAT, coiled coil. Members of this family are found in Dictyostelium STAT proteins and adopt a structure consisting of four long alpha-helices, folded into a coiled coil. They are responsible for nuclear export of the protein.	114
401271	pfam09268	Clathrin-link	Clathrin, heavy-chain linker. Members of this family adopt a structure consisting of alpha-alpha superhelix. They are predominantly found in clathrin, where they act as a heavy-chain linker domain.	24
401272	pfam09269	DUF1967	Domain of unknown function (DUF1967). Members of this family contain a four-stranded beta sheet and three alpha helices flanked by an additional beta strand. They are predominantly found in the bacterial GTP-binding protein Obg, and are still functionally uncharacterized.	67
401273	pfam09270	BTD	Beta-trefoil DNA-binding domain. Members of this family of DNA binding domains adopt a beta-trefoil fold, that is, a capped beta-barrel with internal pseudo threefold symmetry. In the DNA-binding protein LAG-1, it also is the site of mutually exclusive interactions with NotchIC (and the viral protein EBNA2) and co-repressors (SMRT/N-Cor and CIR).	123
401274	pfam09271	LAG1-DNAbind	LAG1, DNA binding. Members of this family are found in various eukaryotic hypothetical proteins and in the DNA-binding protein LAG-1. They adopt a beta sandwich structure, with nine strands in two beta-sheets, in a Greek-key topology, and allow for DNA binding. This domain is also known as RHR-N (Rel-homology region) as it is related to Rel domain proteins.	135
401275	pfam09272	Hepsin-SRCR	Hepsin, SRCR. Members of this family form an extracellular domain of the serine protease hepsin. They are formed primarily by three elements of regular secondary structure: a 12-residue alpha helix, a twisted five-stranded antiparallel beta sheet, and a second, two-stranded, antiparallel sheet. The two beta-sheets lie at roughly right angles to each other, with the helix nestled between the two, adopting an SRCR fold. The exact function of this domain has not been identified, though it may serve to orient the protease domain or place it in the vicinity of its substrate.	110
401276	pfam09273	Rubis-subs-bind	Rubisco LSMT substrate-binding. Members of this family adopt a multihelical structure, with an irregular array of long and short alpha-helices. They allow binding of the protein to substrate, such as the N-terminal tails of histones H3 and H4 and the large subunit of the Rubisco holoenzyme complex.	121
117819	pfam09274	ParG	ParG. Members of this family of plasmid partition proteins adopt a ribbon-helix-helix fold, with a core of four alpha-helices. They are an essential component of the DNA partition complex of the multidrug resistance plasmid TP228.	76
286372	pfam09275	Pertus-S4-tox	Pertussis toxin S4 subunit. Members of this family of Bordetella pertussis toxins adopt a structure consisting of an OB fold, with a closed or partly opened beta-barrel in a Greek-key topology.	110
286373	pfam09276	Pertus-S5-tox	Pertussis toxin S5 subunit. Members of this family of Bordetella pertussis toxins adopt a structure consisting of an OB fold, with a closed or partly opened beta-barrel in a Greek-key topology.	97
401277	pfam09277	Erythro-docking	Erythronolide synthase, docking. Members of this family of docking domains are found in prokaryotic erythronolide synthase. They adopt a structure consisting of a bundle of four alpha-helices, and mediate homodimerization of the protein, stabilizing the resulting complex.	58
401278	pfam09278	MerR-DNA-bind	MerR, DNA binding. Members of this family of DNA-binding domains are predominantly found in the prokaryotic transcriptional regulator MerR. They adopt a structure consisting of a core of three alpha helices, with an architecture that is similar to that of the 'winged helix' fold.	65
401279	pfam09279	EF-hand_like	Phosphoinositide-specific phospholipase C, efhand-like. Members of this family are predominantly found in phosphoinositide-specific phospholipase C. They adopt a structure consisting of a core of four alpha helices, in an EF like fold, and are required for functioning of the enzyme.	85
401280	pfam09280	XPC-binding	XPC-binding domain. Members of this family adopt a structure consisting of four alpha helices, arranged in an array. They bind specifically and directly to the xeroderma pigmentosum group C protein (XPC) to initiate nucleotide excision repair.	57
370405	pfam09281	Taq-exonuc	Taq polymerase, exonuclease. Members of this family are found in prokaryotic Taq DNA polymerase, where they assume a ribonuclease H-like motif. The domain confers 5'-3' exonuclease activity to the polymerase.	129
401281	pfam09282	Mago-bind	Mago binding. Members of this family adopt a structure consisting of a small globular all-beta-domain, with a three-stranded beta-sheet and a contiguous beta-hairpin. They bind to Mago alpha-helices via extensive electrostatic interactions and at a beta2-beta3 loop via hydrophobic interactions.	27
401282	pfam09284	RhgB_N	Rhamnogalacturonan lyase B, N-terminal. Members of this family are found in fungi, bacteria and wood-eating arthropods. The domain is found at the N-terminus of rhamnogalacturonase B, a member of the polysaccharide lyase family 4. The domain adopts a structure consisting of a beta super-sandwich, with eighteen strands in two beta-sheets. The three domains of the whole protein rhamnogalacturonan lyase (RGL4) are involved in the degradation of rhamnogalacturonan-I, RG-I, an important pectic plant cell-wall polysaccharide. The active-site residues are a lysine at position 169 in UniProtKB:Q00019 and a histidine at 229; Lys169 is likely to be a proton abstractor and His229 a proton donor in the mechanism. The substrate is a disaccharide, and RGL4, in contrast to other rhamnogalacturonan hydrolases, cleaves the alpha-1,4 linkages of RG-I between Rha and GalUA through a beta-elimination resulting in a double bond in the nonreducing GalUA residue, and is thus classified as a polysaccharide lyase (PL).	251
401283	pfam09285	Elong-fact-P_C	Elongation factor P, C-terminal. Members of this family of nucleic acid binding domains are predominantly found in elongation factor P, where they adopt an OB-fold, with five beta-strands forming a beta-barrel in a Greek-key topology.	56
401284	pfam09286	Pro-kuma_activ	Pro-kumamolisin, activation domain. Members of this family are found in various subtilase propeptides, and adopt a ferredoxin-like fold, with an alpha+beta sandwich. Cleavage of the domain results in activation of the peptide.	142
401285	pfam09287	CEP1-DNA_bind	CEP-1, DNA binding. Members of this family of DNA-binding domains are found in the transcription factor CEP-1. They adopt a beta sandwich structure, with nine strands in two beta-sheets, in a Greek-key topology.	198
117832	pfam09288	UBA_3	Fungal ubiquitin-associated domain. Members of this family of ubiquitin binding domains adopt a structure consisting of a three alpha-helix bundle. They are predominantly found in fungal ubiquitin-protein ligases.	55
401286	pfam09289	FOLN	Follistatin/Osteonectin-like EGF domain. Members of this family are predominantly found in osteonectin and follistatin and adopt an EGF-like fold.	22
401287	pfam09290	AcetDehyd-dimer	Prokaryotic acetaldehyde dehydrogenase, dimerization. Members of this family are found in prokaryotic acetaldehyde dehydrogenase (acylating), and adopt a structure consisting of an alpha-beta-alpha-beta(3) core. They mediate dimerization of the protein.	138
401288	pfam09291	DUF1968	Domain of unknown function (DUF1968). Members of this family are found in mammalian T-cell antigen receptor, and adopt an immunoglobulin-like beta-sandwich fold, with seven strands in two beta-sheets in a Greek-key topology. Their exact function has not, as yet, been determined.	80
401289	pfam09292	Neil1-DNA_bind	Endonuclease VIII-like 1, DNA bind. Members of this family are predominantly found in Endonuclease VIII-like 1 and adopt a glucocorticoid receptor-like fold. They allow for DNA binding.	39
401290	pfam09293	RNaseH_C	T4 RNase H, C terminal. Members of this family are found in T4 RNaseH ribonuclease, and adopt a SAM domain-like fold, consisting of a bundle of four/five helices. These residues may have a role in providing a docking site for other proteins or enzymes in the replication fork.	124
401291	pfam09294	Interfer-bind	Interferon-alpha/beta receptor, fibronectin type III. Members of this family adopt a secondary structure consisting of seven beta-strands arranged in an immunoglobulin-like beta-sandwich, in a Greek-key topology. They are required for binding to interferon-alpha.	104
401292	pfam09295	ChAPs	ChAPs (Chs5p-Arf1p-binding proteins). ChAPs (Chs5p-Arf1p-binding proteins) are required for the export of specialized cargo from the Golgi. They physically interact with Chs3, Chs5 and the small GTPase Arf1, and they also form interactions with each other.	395
401293	pfam09296	NUDIX-like	NADH pyrophosphatase-like rudimentary NUDIX domain. This is the N-terminal domain of NADH pyrophosphatase, which has a rudimentary Nudix fold according to SCOP.	96
401294	pfam09297	zf-NADH-PPase	NADH pyrophosphatase zinc ribbon domain. This domain is found in between two duplicated NUDIX domains. It has a zinc ribbon structure.	32
401295	pfam09298	FAA_hydrolase_N	Fumarylacetoacetase N-terminal. The N-terminal domain of fumarylacetoacetate hydrolase is functionally uncharacterized, and adopts a structure consisting of an SH3-like barrel.	106
401296	pfam09299	Mu-transpos_C	Mu transposase, C-terminal. Members of this family are found in various prokaryotic integrases and transposases. They adopt a beta-barrel structure with Greek-key topology.	61
286393	pfam09300	Tecti-min-caps	Tectiviridae, minor capsid. Members of this family form the minor capsid protein of various Tectiviridae.	83
401297	pfam09301	DUF1970	Domain of unknown function (DUF1970). Members of this family consist of various uncharacterized viral hypothetical proteins.	118
401298	pfam09302	XLF	XLF-Cernunnos, XRcc4-like factor, NHEJ component. XLF (also called Cernunnos) is Xrcc4-like-factor, and interacts with the XRCC4-DNA ligase IV complex to promote DNA non-homologous end-joining. It directly interacts with the XRCC4-Ligase IV complex and siRNA-mediated down-regulation of XLF in human cell lines leads to radio-sensitivity and impaired DNA non-homologous end-joining. This family contains Nej1 (non-homologous end-joining factor), and Lif1, ligase-interacting factor. XLF forms one of the components of the NHEJ machinery for DNA non-homologous end-joining.	181
401299	pfam09303	KcnmB2_inactiv	KCNMB2, ball and chain domain. Members of this family are found in the cytoplasmic N-terminus of KCNMB2, the beta-2 subunit of large conductance calcium and voltage-activated potassium channels. They are responsible for the fast inactivation of these channels.	30
312712	pfam09304	Cortex-I_coil	Cortexillin I, coiled coil. Members of this family are predominantly found in the actin-bundling protein Cortexillin I from Dictyostelium discoideum. They adopt a structure consisting of an 18-heptad-repeat alpha-helical coiled-coil, and are a prerequisite for the assembly of Cortexillin I.	107
370419	pfam09305	TACI-CRD2	TACI, cysteine-rich domain. Members of this family are predominantly found in tumor necrosis factor receptor superfamily, member 13b (TACI), and are required for binding to the ligands APRIL and BAFF.	39
286399	pfam09306	Phage-scaffold	Bacteriophage, scaffolding protein. Members of this family of scaffolding proteins are produced by various bacteriophages.	303
401300	pfam09307	MHC2-interact	CLIP, MHC2 interacting. Members of this family are found in class II invariant chain-associated peptide (CLIP), and are required for association with class II major histocompatibility complex (MHC) in the MHC class II processing pathway.	109
401301	pfam09308	LuxQ-periplasm	LuxQ, periplasmic. Members of this family constitute the periplasmic sensor domain of the prokaryotic protein LuxQ, and assume a structure consisting of two tandem Per/ARNT/Simple-minded (PAS) folds.	238
401302	pfam09309	FCP1_C	FCP1, C-terminal. The C-terminal domain of FCP-1 is required for interaction with the carboxy terminal domain of RAP74. Interaction relies extensively on van der Waals contacts between hydrophobic residues situated within alpha-helices in both domains.	260
401303	pfam09310	PD-C2-AF1	POU domain, class 2, associating factor 1. Members of this family are transcriptional coactivators that specifically associate with either OCT1 or OCT2, through recognition of their POU domains. They are essential for the response of B-cells to antigens and required for the formation of germinal centers.	248
401304	pfam09311	Rab5-bind	Rabaptin-like protein. Members of this family are predominantly found in Rabaptin and allow for binding to the GTPase Rab5. This interaction is necessary and sufficient for Rab5-dependent recruitment of Rabaptin5 to early endosomal membranes.	307
401305	pfam09312	SurA_N	SurA N-terminal domain. This domain is found at the N-terminus of the chaperone SurA. It is a helical domain of unknown function. The C-terminus of the SurA protein folds back and forms part of this domain also but is not included in the current alignment.	118
401306	pfam09313	DUF1971	Domain of unknown function (DUF1971). Members of this family of functionally uncharacterized domains are predominantly found in bacterial Tellurite resistance protein.	80
370426	pfam09314	DUF1972	Domain of unknown function (DUF1972). Members of this family of functionally uncharacterized domains are found in bacterial glycosyltransferases and rhamnosyltransferases.	186
401307	pfam09316	Cmyb_C	C-myb, C-terminal. Members of this family are predominantly found in the proto-oncogene c-myb and the viral transforming protein myb. Truncation of the domain results in 'activation' of c-myb and subsequent tumorigenesis.	164
401308	pfam09317	DUF1974	Domain of unknown function (DUF1974). Members of this family of functionally uncharacterized domains are predominantly found in various prokaryotic acyl-coenzyme a dehydrogenases.	284
370428	pfam09318	Glyco_trans_A_1	Glycosyl transferase 1 domain A. Glyco_trans_A_1 is a family found predominantly at the N-terminus of various prokaryotic alpha-glucosyltransferases. Whether the domain exists as a whole molecule or as a half molecule determines the number of sugar residues that the molecule transfers. Two-domain proteins are processive in that they transfer more than one sugar residue; single-domain proteins transfer just one sugar moiety.	199
401309	pfam09320	DUF1977	Domain of unknown function (DUF1977). Members of this family of functionally uncharacterized domains are predominantly found in dnaj-like proteins.	104
312723	pfam09321	DUF1978	Domain of unknown function (DUF1978). Members of this family are found in various hypothetical proteins produced by the bacterium Chlamydia pneumoniae. Their exact function has not, as yet, been identified.	244
401310	pfam09322	DUF1979	Domain of unknown function (DUF1979). Members of this family of functionally uncharacterized domains are found in various Oryza sativa mutator-like transposases.	58
401311	pfam09323	DUF1980	Domain of unknown function (DUF1980). Members of this family are found in a set of prokaryotic hypothetical proteins. Their exact function has not, as yet, been defined.	179
401312	pfam09324	DUF1981	Domain of unknown function (DUF1981). Members of this family of functionally uncharacterized domains are found in various plant and yeast protein transport proteins.	84
401313	pfam09325	Vps5	Vps5 C terminal like. Vps5 is a sorting nexin that functions in membrane trafficking. This is the C terminal dimerization domain.	219
401314	pfam09326	NADH_dhqG_C	NADH-ubiquinone oxidoreductase subunit G, C-terminal. Members of this family are found at the C-terminus of NADH dehydrogenase subunit G or NADH-ubiquinone oxidoreductase subunit G. EC:1.6.99.5.	41
401315	pfam09327	DUF1983	Domain of unknown function (DUF1983). Members of this family of functionally uncharacterized domains are found in various bacteriophage host specificity proteins.	75
401316	pfam09328	Phytochelatin_C	Domain of unknown function (DUF1984). Members of this family of functionally uncharacterized domains are found at the C-terminus of plant phytochelatin synthases.	252
401317	pfam09329	zf-primase	Primase zinc finger. This zinc finger is found in yeast Mcm10 proteins and DnaG-type primases.	46
401318	pfam09330	Lact-deh-memb	D-lactate dehydrogenase, membrane binding. Members of this family are predominantly found in prokaryotic D-lactate dehydrogenase, forming the cap-membrane-binding domain, which consists of a large seven-stranded antiparallel beta-sheet flanked on both sides by alpha-helices. They allow for membrane association.	290
401319	pfam09331	DUF1985	Domain of unknown function (DUF1985). Members of this family of functionally uncharacterized domains are found in a set of Arabidopsis thaliana hypothetical proteins.	133
401320	pfam09332	Mcm10	Mcm10 replication factor. Mcm10 is a eukaryotic DNA replication factor that regulates the stability and chromatin association of DNA polymerase alpha.	344
401321	pfam09333	ATG_C	Autophagy-related protein C terminal domain. ATG2 (also known as Apg2) is a peripheral membrane protein. It functions in both cytoplasm-to-vacuole targeting and in autophagy.	96
401322	pfam09334	tRNA-synt_1g	tRNA synthetases class I (M). This family includes methionyl tRNA synthetases.	387
401323	pfam09335	SNARE_assoc	SNARE associated Golgi protein. This is a family of SNARE associated Golgi proteins. The yeast member of this family localizes with the t-SNARE Tlg2.	120
401324	pfam09336	Vps4_C	Vps4 C terminal oligomerization domain. This domain is found at the C terminal of ATPase proteins involved in vacuolar sorting. It forms an alpha helix structure and is required for oligomerization.	61
370444	pfam09337	zf-H2C2	His(2)-Cys(2) zinc finger. This domain binds to histone upstream activating sequence (UAS) elements that are found in histone gene promoters. Added to clan to resolve overlaps with PF16721 but neither are classic zf_C2H2 zinc-fingers.	39
401325	pfam09338	Gly_reductase	Glycine/sarcosine/betaine reductase component B subunits. This is a family of glycine reductase, sarcosine reductase and betaine reductases. These enzymes catalyze the following reactions. sarcosine reductase: Acetyl phosphate + methylamine + thioredoxin disulphide = N-methylglycine + phosphate + thioredoxin Acetyl phosphate + NH(3) + thioredoxin disulphide = glycine + phosphate + thioredoxin. betaine reductase: Acetyl phosphate + trimethylamine + thioredoxin disulphide = N,N,N-trimethylglycine + phosphate + thioredoxin.	426
401326	pfam09339	HTH_IclR	IclR helix-turn-helix domain. 	52
401327	pfam09340	NuA4	Histone acetyltransferase subunit NuA4. The NuA4 histone acetyltransferase (HAT) multisubunit complex is responsible for acetylation of histone H4 and H2A N-terminal tails in yeast. NuA4 complexes are highly conserved in eukaryotes and play primary roles in transcription, cellular response to DNA damage, and cell cycle control.	78
401328	pfam09341	Pcc1	Transcription factor Pcc1. Pcc1 is a transcription factor that functions in regulating genes involved in cell cycle progression and polarised growth.	75
286432	pfam09342	DUF1986	Domain of unknown function (DUF1986). This domain is found in serine proteases and is predicted to contain disulphide bonds.	116
401329	pfam09343	DUF2460	Conserved hypothetical protein 2217 (DUF2460). This model represents a family of conserved hypothetical proteins. It is usually (but not always) found in apparent phage-derived regions of bacterial chromosomes.	200
401330	pfam09344	Cas_CT1975	CT1975-like protein. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This family is represented by CT1975 of Chlorobium tepidum.	365
401331	pfam09345	DUF1987	Domain of unknown function (DUF1987). This family of proteins are functionally uncharacterized.	121
401332	pfam09346	SMI1_KNR4	SMI1 / KNR4 family (SUKH-1). Proteins in this family are involved in the regulation of 1,3-beta-glucan synthase activity and cell-wall formation. Genome contextual information showed that SMI1 are primary immunity proteins in bacterial toxin systems.	119
401333	pfam09347	DUF1989	Domain of unknown function (DUF1989). This family of proteins are functionally uncharacterized.	165
401334	pfam09348	DUF1990	Domain of unknown function (DUF1990). This family of proteins are functionally uncharacterized.	152
401335	pfam09349	OHCU_decarbox	OHCU decarboxylase. The proteins in this family are OHCU decarboxylase - enzymes of the purine catabolism that catalyze the conversion of OHCU into S(+)-allantoin. This is the third step of the conversion of uric acid (a purine derivative) to allantoin. Step one is catalyzed by urate oxidase (pfam01014) and step two is catalyzed by HIUases (pfam00576).	155
401336	pfam09350	DUF1992	Domain of unknown function (DUF1992). This family of proteins are functionally uncharacterized.	72
401337	pfam09351	DUF1993	Domain of unknown function (DUF1993). This family of proteins are functionally uncharacterized.	161
401338	pfam09353	DUF1995	Domain of unknown function (DUF1995). This family of proteins are functionally uncharacterized.	202
401339	pfam09354	HNF_C	HNF3 C-terminal domain. This presumed domain is found in the C-terminal region of Hepatocyte Nuclear Factor 3 alpha and beta chains. Its specific function is uncertain. The N-terminal region of this presumed domain contains an EH1 (engrailed homology 1) motif, that is characterized by the FxIxxIL sequence.	66
286444	pfam09355	Phage_Gp19	Phage protein Gp19/Gp15/Gp42. This family of proteins are functionally uncharacterized. They are found in a variety of bacteriophage.	116
401340	pfam09356	Phage_BR0599	Phage conserved hypothetical protein BR0599. This entry describes a family of proteins found almost exclusively in phage or in prophage regions of bacterial genomes, including the phage-like Rhodobacter capsulatus gene transfer agent, which packages DNA. An apparent exception is Wolbachia pipientis wMel, a bacterial endosymbiont of the fruit fly, which has several candidate phage-related genes physically separate from obvious prophage regions.	80
401341	pfam09357	RteC	RteC protein. Human colonic Bacteroides species harbor a family of large conjugative transposons, called tetracycline resistance (Tcr) elements. Activities of these elements are enhanced by pregrowth of bacteria in medium containing tetracycline, indicating that at least some Tcr element genes are regulated by tetracycline. An insertional disruption in the rteC gene abolished self-transfer of the Tcr element to Bacteroides recipients, indicating that the gene was essential for self-transfer.	218
401342	pfam09358	E1_UFD	Ubiquitin fold domain. The ubiquitin fold domain is found at the C-terminus of ubiquitin-activating E1 family enzymes. This domain binds to E2 enzymes.	93
401343	pfam09359	VTC	VTC domain. This presumed domain is found in the yeast vacuolar transport chaperone proteins VTC2, VTC3 and VTC4. This domain is also found in a variety of bacterial proteins.	235
401344	pfam09360	zf-CDGSH	Iron-binding zinc finger CDGSH type. The CDGSH-type zinc finger domain binds iron rather than zinc as a redox-active pH-labile 2Fe-2S cluster. The conserved sequence C-X-C-X2-(S/T)-X3-P-X-C-D-G-(S/A/T)-H is a defining feature of this family. The domain is oriented towards the cytoplasm and is tethered to the mitochondrial membrane by a more N-terminal domain found in higher vertebrates, MitoNEET_N, pfam10660. The domain forms a uniquely folded homo-dimer and spans the outer mitochondrial membrane, orienting the iron-binding residues towards the cytoplasm.	42
401345	pfam09361	Phasin_2	Phasin protein. This entry describes a group of small proteins found associated with inclusions in bacterial cells. Most associate with polyhydroxyalkanoate (PHA) inclusions, the most common of which consist of polyhydroxybutyrate (PHB). These are designated granule-associated proteins or phasins.	90
401346	pfam09362	DUF1996	Domain of unknown function (DUF1996). This family of proteins are functionally uncharacterized.	234
401347	pfam09363	XFP_C	XFP C-terminal domain. Bacterial enzyme splits fructose-6-P and/or xylulose-5-P with the aid of inorganic phosphate into either acetyl-P and erythrose-4-P and/or acetyl-P and glyceraldehyde-3-P EC:4.1.2.9, EC:4.1.2.22.	200
401348	pfam09364	XFP_N	XFP N-terminal domain. Bacterial enzyme splits fructose-6-P and/or xylulose-5-P with the aid of inorganic phosphate into either acetyl-P and erythrose-4-P and/or acetyl-P and glyceraldehyde-3-P EC:4.1.2.9, EC:4.1.2.22. This family is distantly related to transketolases e.g. pfam02779.	364
401349	pfam09365	DUF2461	Conserved hypothetical protein (DUF2461). Members of this family are widely (though sparsely) distributed bacterial proteins, about 230 residues in length. All members have a motif RxxRDxRFxxx[DN]KxxY. The function of this protein family is unknown.	207
401350	pfam09366	DUF1997	Protein of unknown function (DUF1997). This family of proteins are functionally uncharacterized.	153
401351	pfam09367	CpeS	CpeS-like protein. This family, that includes CpeS proteins, is functionally uncharacterized.	169
401352	pfam09368	Sas10	Sas10 C-terminal domain. Sas10 is an Essential subunit of U3-containing Small Subunit (SSU) processome complex involved in the production of the 18S rRNA and assembly of the small ribosomal subunit.	74
401353	pfam09369	DUF1998	Domain of unknown function (DUF1998). This family of proteins are functionally uncharacterized. They are mainly found in helicase proteins so could be RNA binding. This family includes a probable zinc binding motif at its C-terminus.	83
401354	pfam09370	PEP_hydrolase	Phosphoenolpyruvate hydrolase-like. This domain has a TIM barrel fold related to IGPS and to phosphoenolpyruvate mutase/aldolase/carboxylase.	266
401355	pfam09371	Tex_N	Tex-like protein N-terminal domain. This presumed domain is found at the N-terminus of Bordetella pertussis tex. This protein defines a novel family of prokaryotic transcriptional accessory factors.	183
286461	pfam09372	PRANC	PRANC domain. This presumed domain is found at the C-terminus of a variety of Pox virus proteins. The PRANC (Pox proteins Repeats of ANkyrin - C terminal) domain is also found on its own in some proteins. The function of this domain is unknown, but it appears to be related to the F-box domain and may play a similar role.	95
117915	pfam09373	PMBR	Pseudomurein-binding repeat. Methanothermobacter thermautotrophicus is a methanogenic Gram-positive microorganism with a cell wall consisting of pseudomurein. This repeat specifically binds to pseudomurein. This repeat is found at the N-terminus of PeiW and PeiP which are pseudomurein binding phage proteins.	33
401356	pfam09374	PG_binding_3	Predicted Peptidoglycan domain. This family contains a potential peptidoglycan binding domain.	76
401357	pfam09375	Peptidase_M75	Imelysin. The imelysin peptidase was first identified in Pseudomonas aeruginosa. The active site residues have not been identified. However, His201 and Glu204 are completely conserved in the family and occur in an HXXE motif that is also found in family M14.	287
401358	pfam09376	NurA	NurA domain. This family includes NurA a nuclease exhibiting both single-stranded endonuclease activity and 5'-3' exonuclease activity on single-stranded and double-stranded DNA from the hyperthermophilic archaeon Sulfolobus acidocaldarius.	252
401359	pfam09377	SBDS_C	SBDS protein C-terminal domain. This family is highly conserved in species ranging from archaea to vertebrates and plants. The family contains several Shwachman-Bodian-Diamond syndrome (SBDS) proteins from both mouse and humans. Shwachman-Diamond syndrome is an autosomal recessive disorder with clinical features that include pancreatic exocrine insufficiency, haematological dysfunction and skeletal abnormalities. Members of this family play a role in RNA metabolism.	116
401360	pfam09378	HAS-barrel	HAS barrel domain. The HAS barrel is named after HerA-ATP Synthase. In ATP synthases, this domain is implicated in the assembly of the catalytic toroid and docking of accessory subunits, such as the subunit of the ATP synthase complex. Similar roles in docking of the functional partner, the NurA nuclease, and assembly of the HerA toroid complex appear likely for the HAS-barrel of the HerA family.	91
401361	pfam09379	FERM_N	FERM N-terminal domain. This domain is the N-terminal ubiquitin-like structural domain of the FERM domain.	64
401362	pfam09380	FERM_C	FERM C-terminal PH-like domain. 	85
401363	pfam09381	Porin_OmpG	Outer membrane protein G (OmpG). Porins are channel proteins in the outer membrane of gram negative bacteria which mediate the uptake of molecules required for growth and survival. Escherichia coli OmpG forms a 14 stranded beta-barrel and in contrast to most porins, appears to function as a monomer. The central pore of OmpG is wider than other E. coli porins and it is speculated that it may form a non-specific channel for the transport of larger oligosaccharides.	285
401364	pfam09382	RQC	RQC domain. This DNA-binding domain is found in the RecQ helicase among others and has a helix-turn-helix structure. The RQC domain, found only in RecQ family enzymes, is a high affinity G4 DNA binding domain.	109
401365	pfam09383	NIL	NIL domain. This domain is found at the C-terminus of ABC transporter proteins involved in D-methionine transport as well as a number of ferredoxin-like proteins. This domain is likely to act as a substrate binding domain. The domain has been named after a conserved sequence in some members of the family.	73
401366	pfam09384	UTP15_C	UTP15 C terminal. U3 snoRNA is ubiquitous in eukaryotes and is required for nucleolar processing of pre-18S ribosomal RNA. It is a component of the ribosomal small subunit (SSU) processome. UTP15 is needed for optimal pre-ribosomal RNA transcription by RNA polymerase I, together with a subset of U3 proteins required for transcription (t-UTPs). This entry represents the C terminal of UTP15, and is found adjacent to WD40 repeats (pfam00400).	147
286473	pfam09385	HisK_N	Histidine kinase N terminal. This domain is found at the N terminal of sensor histidine kinase proteins.	129
401367	pfam09386	ParD	Antitoxin ParD. ParD is a plasmid antitoxin that forms a ribbon-helix-helix DNA binding structure. It stabilizes plasmids by inhibiting ParE toxicity in cells that express ParD and ParE. ParD forms a dimer and also regulates its own promoter (parDE).	80
401368	pfam09387	MRP	Mitochondrial RNA binding protein MRP. MRP1 and MRP2 are mitochondrial RNA binding proteins that form a heteromeric complex. The MRP1/MRP2 heterotetrameric complex binds to guide RNAs and stabilizes them in an unfolded conformation suitable for RNA-RNA hybridisation. Each MRP subunit adopts a 'whirly' transcription factor fold.	192
401369	pfam09388	SpoOE-like	Spo0E like sporulation regulatory protein. Spore formation is an extreme response to starvation and can also be a component of disease transmission. Sporulation is controlled by an expanded two-component system where starvation signals result in sensor kinase activation and phosphorylation of the master sporulation response regulator Spo0A. Phosphatases such as Spo0E dephosphorylate Spo0A thereby inhibiting sporulation. This is a family of Spo0E-like phosphatases. The structure of a Bacillus anthracis member of this family has revealed an anti-parallel alpha-helical structure.	42
370462	pfam09390	DUF1999	Protein of unknown function (DUF1999). This family contains a putative Fe-S binding reductase whose structure adopts an alpha and beta fold.	151
401370	pfam09391	DUF2000	Protein of unknown function (DUF2000). This is a family of proteins of unknown function. The structure of one of the proteins in this family has been shown to adopt an alpha beta fold.	133
401371	pfam09392	T3SS_needle_F	Type III secretion needle MxiH, YscF, SsaG, EprI, PscF, EscF. Type III secretion systems are essential virulence determinants for many gram-negative bacterial pathogens. MxiH is an extracellular alpha helical needle that is required for translocation of effector proteins into host cells. Once inside, the effector proteins subvert normal cell function to aid infection. The needle protein F, polymerizes to form a shaft.	67
401372	pfam09393	DUF2001	Phage tail tube protein. This is a family of phage tail tube proteins including protein XkdM from phage-like element PBSX protein whose structure adopts a beta barrel flanked with alpha helical regions.	139
401373	pfam09394	Inhibitor_I42	Chagasin family peptidase inhibitor I42. Chagasin is a cysteine peptidase inhibitor which forms a beta barrel structure.	89
401374	pfam09396	Thrombin_light	Thrombin light chain. Thrombin is an enzyme that cleaves bonds after Arg and Lys, converts fibrinogen to fibrin and activates factors V, VII, VIII. Prothrombin is activated on the surface of a phospholipid membrane where factor Xa removes the activation peptide and cleaves the remaining part into light and heavy chains. This domain corresponds to the light chain of thrombin.	47
401375	pfam09397	Ftsk_gamma	Ftsk gamma domain. This domain directs oriented DNA translocation and forms a winged helix structure. Mutated proteins with substitutions in the FtsK gamma DNA-recognition helix are impaired in DNA binding.	63
312785	pfam09398	FOP_dimer	FOP N terminal dimerization domain. Fibroblast growth factor receptor 1 (FGFR1) oncogene partner (FOP) is a centrosomal protein that is involved in anchoring microtubules to subcellular structures. This domain includes a Lis-homology motif. It forms an alpha helical bundle and is involved in dimerization.	81
401376	pfam09399	SARS_lipid_bind	SARS lipid binding protein. This is a family of proteins found in SARS coronavirus. The protein has a novel fold which forms a dimeric tent-like beta structure with an amphipathic surface, and a central hydrophobic cavity that binds lipid molecules. This cavity is likely to be involved in membrane attachment.	97
401377	pfam09400	DUF2002	Protein of unknown function (DUF2002). This is a family of putative cytoplasmic proteins. The structure of these proteins forms an antiparallel beta sheet and contains some alpha helical regions.	110
286486	pfam09401	NSP10	RNA synthesis protein NSP10. Non-structural protein 10 (NSP10) is involved in RNA synthesis. It is synthesized as a polyprotein whose cleavage generates many non-structural proteins. NSP10 contains two zinc binding motifs and forms two anti-parallel helices which are stacked against an irregular beta sheet. A cluster of basic residues on the protein surface suggests a nucleic acid-binding function.	119
401378	pfam09402	MSC	Man1-Src1p-C-terminal domain. MAN1 is an integral protein of the inner nuclear membrane which binds to chromatin associated proteins and plays a role in nuclear organisation. The C terminal nucleoplasmic region forms a DNA binding winged helix and binds to Smad. This C-terminal tail is also found in S. cerevisiae and is thought to consist of three conserved helices followed by two downstream strands.	333
401379	pfam09403	FadA	Adhesion protein FadA. FadA (Fusobacterium adhesin A) is an adhesin which forms two alpha helices.	99
401380	pfam09404	DUF2003	Eukaryotic protein of unknown function (DUF2003). This is a family of proteins of unknown function which adopt an alpha helical and beta sheet structure.	440
401381	pfam09405	Btz	CASC3/Barentsz eIF4AIII binding. This domain is found on CASC3 (cancer susceptibility candidate gene 3 protein) which is also known as Barentsz (Btz). CASC3 is a component of the EJC (exon junction complex) which is a complex that is involved in post-transcriptional regulation of mRNA in metazoa. The complex is formed by the association of four proteins (eIF4AIII, Barentsz, Mago, and Y14), mRNA, and ATP. This domain wraps around eIF4AIII and stacks against the 5' nucleotide.	116
401382	pfam09406	DUF2004	Protein of unknown function (DUF2004). This is a family of proteins with unknown function. The structure of one of the proteins in this family has revealed a novel alpha-beta fold.	106
401383	pfam09407	AbiEi_1	AbiEi antitoxin C-terminal domain. AbiEi_1 is the cognate antitoxin of the type IV toxin-antitoxin 'innate immunity' bacterial abortive infection (Abi) system that protects bacteria from the spread of a phage infection. The Abi system is activated upon infection with phage to abort the cell thus preventing the spread of phage through viral replication. There are some 20 or more Abis, and they are predominantly plasmid-encoded lactococcal systems. TA, toxin-antitoxin, systems on plasmids function by killing cells that lose the plasmid upon division. AbiE phage resistance systems function as novel Type IV TAs and are widespread in bacteria and archaea. The cognate antitoxin is pfam13338.	143
401384	pfam09408	Spike_rec_bind	Spike receptor binding domain. Spike is an envelope glycoprotein which aids viral entry into the host cell. This domain corresponds to the immunogenic receptor binding domain of the protein which binds to angiotensin-converting enzyme 2 (ACE2).	177
401385	pfam09409	PUB	PUB domain. The PUB (also known as PUG) domain is found in peptide N-glycanase where it functions as a AAA ATPase binding domain. This domain is also found on other proteins linked to the ubiquitin-proteasome system.	77
401386	pfam09411	PagL	Lipid A 3-O-deacylase (PagL). PagL is an outer membrane protein with lipid A 3-O-deacylase activity. It forms an 8 stranded beta barrel structure.	129
401387	pfam09412	XendoU	Endoribonuclease XendoU. This is a family of endoribonucleases involved in RNA biosynthesis which has been named XendoU in Xenopus laevis. XendoU is a U-specific metal dependent enzyme that produces products with 2'-3' cyclic phosphate termini.	264
401388	pfam09413	DUF2007	Putative prokaryotic signal transducing protein. This is a family of putative prokaryotic signal transducing proteins of Pii-type.	66
401389	pfam09414	RNA_ligase	RNA ligase. This is a family of RNA ligases. The enzyme repairs RNA strand breaks in nicked DNA:RNA and RNA:RNA but not in DNA:DNA duplexes.	132
401390	pfam09415	CENP-X	CENP-S associating Centromere protein X. The centromere, essential for faithful chromosome segregation during mitosis, has a network of constitutive centromere-associated (CCAN) proteins associating with it during mitosis. So far in vertebrates at least 15 centromere proteins have been identified, which are divided into several subclasses based on functional and biochemical analyses. These provide a platform for the formation of a functional kinetochore during mitosis. CENP-S is one that does not associate with the CENP-H-containing complex but rather interacts with CENP-X to form a stable assembly of outer kinetochore proteins that functions downstream of other components of the CCAN. This complex may directly allow efficient and stable formation of the outer kinetochore on the CCAN platform.	72
401391	pfam09416	UPF1_Zn_bind	RNA helicase (UPF2 interacting domain). UPF1 is an essential RNA helicase that detects mRNAs containing premature stop codons and triggers their degradation. This domain contains 3 zinc binding motifs and forms interactions with another protein (UPF2) that is also involved in nonsense-mediated mRNA decay (NMD).	152
401392	pfam09418	DUF2009	Protein of unknown function (DUF2009). This is a eukaryotic family of proteins with unknown function.	454
286502	pfam09419	PGP_phosphatase	Mitochondrial PGP phosphatase. This is a family of proteins that acts as a mitochondrial phosphatase in cardiolipin biosynthesis. Cardiolipin is a unique dimeric phosphoglycerolipid predominantly present in mitochondrial membranes. The inverted phosphatase motif includes the highly conserved DKD triad.	166
401393	pfam09420	Nop16	Ribosome biogenesis protein Nop16. Nop16 is a protein involved in ribosome biogenesis.	209
401394	pfam09421	FRQ	Frequency clock protein. The frequency clock protein is the central component of the frq-based circadian negative feedback loop and regulates various aspects of the circadian clock in Neurospora crassa. This protein has been shown to interact with itself via a coiled-coil.	982
401395	pfam09422	WTX	WTX protein. The WTX protein is found to be inactivated in one third of Wilms tumors. The WTX protein is functionally uncharacterized.	468
401396	pfam09423	PhoD	PhoD-like phosphatase. 	342
401397	pfam09424	YqeY	Yqey-like protein. The function of this domain found in the YqeY protein is uncertain.	143
401398	pfam09425	CCT_2	Divergent CCT motif. This short motif is found in a number of plant proteins. It appears to be related to the N-terminal half of the CCT motif. The CCT motif is about 45 amino acids long and contains a putative nuclear localization signal within the second half of the CCT motif.	25
401399	pfam09426	Nyv1_N	Vacuolar R-SNARE Nyv1 N terminal. This domain corresponds to the N terminal domain of vacuolar R-SNARE Nyv1 which adopts a longin fold. In yeast it has been shown that this domain is sufficient to direct the transport of Nyv1 to limiting membrane of the vacuole.	138
401400	pfam09427	DUF2014	Domain of unknown function (DUF2014). This domain is found at the C terminal of a family of ER membrane bound transcription factors called sterol regulatory element binding proteins (SREBP).	263
401401	pfam09428	DUF2011	Fungal protein of unknown function (DUF2011). This is a family of fungal proteins whose function is unknown.	90
401402	pfam09429	Wbp11	WW domain binding protein 11. The WW domain is a small protein module with a triple-stranded beta-sheet fold. This is a family of WW domain binding proteins.	76
401403	pfam09430	DUF2012	Protein of unknown function (DUF2012). This is a eukaryotic family of uncharacterized proteins.	122
401404	pfam09431	DUF2013	Protein of unknown function (DUF2013). This region is found at the C terminal of a group of cytoskeletal proteins.	136
401405	pfam09432	THP2	Tho complex subunit THP2. The THO complex plays a role in coupling transcription elongation to mRNA export. It is composed of subunits THP2, HPR1, THO2 and MFT1.	129
401406	pfam09435	DUF2015	Fungal protein of unknown function (DUF2015). This is a fungal family of uncharacterized proteins.	110
312811	pfam09436	DUF2016	Domain of unknown function (DUF2016). A predicted alpha+beta domain that is usually fused N-terminal to the JAB metallopeptidase. This protein in turn is found in conserved gene neighborhoods that include genes encoding the bacterial homologs of the ubiquitin modification system such as the E1, E2 and Ub proteins. The domain is also known as the JAB-N domain.	72
117976	pfam09437	Pombe_5TM	Pombe specific 5TM protein. 	219
401407	pfam09438	DUF2017	Domain of unknown function (DUF2017). This is an alpha-helical domain found in gene neighborhoods that contain genes encoding ubiquitin, cysteine synthases and JAB peptidases.	170
370490	pfam09439	SRPRB	Signal recognition particle receptor beta subunit. The beta subunit of the signal recognition particle receptor (SRP) is a transmembrane GTPase which anchors the alpha subunit to the endoplasmic reticulum membrane.	181
401408	pfam09440	eIF3_N	eIF3 subunit 6 N terminal domain. This is the N terminal domain of subunit 6 translation initiation factor eIF3.	132
401409	pfam09441	Abp2	ARS binding protein 2. This DNA-binding protein binds to the autonomously replicating sequence (ARS) binding element. It may play a role in regulating the cell cycle response to stress signals.	171
401410	pfam09442	DUF2018	Domain of unknown function (DUF2018). Acid-adaptive protein possibly of physiological significance when H. pylori colonises the human stomach, which adopts a unique four alpha-helical triangular conformation. The biologically active form is thought to be a tetramer. The protein is expressed along with six other proteins, some of which are related to iron storage and haem biosynthesis.	83
401411	pfam09443	CFC	Cripto_Frl-1_Cryptic (CFC). CFC domain is one half of the membrane protein Cripto, a protein overexpressed in many tumors and structurally similar to the C-terminal extracellular portions of Jagged 1 and Jagged 2. CFC is approx 40-residues long, compacted by three internal disulphide bridges, and binds Alk4 via a hydrophobic patch. CFC is structurally homologous to the VWFC-like domain.	35
401412	pfam09444	MRC1	MRC1-like domain. This putative domain is found to be the most conserved region in mediator of replication checkpoint protein 1.	141
370496	pfam09445	Methyltransf_15	RNA cap guanine-N2 methyltransferase. RNA cap guanine-N2 methyltransferases such as Schizosaccharomyces pombe Tgs1 and Giardia lamblia Tgs2 catalyze methylation of the exocyclic N2 amine of 7-methylguanosine.	165
401413	pfam09446	VMA21	VMA21-like domain. This presumed short domain appears to contain two potential transmembrane helices. VMA21 is localized in the ER where it is needed as an accessory factor for assembly of the V0 component of the vacuolar ATPase.	64
401414	pfam09447	Cnl2_NKP2	Cnl2/NKP2 family protein. This family includes the Cnl2 kinetochore protein.	65
370499	pfam09448	MmlI	Methylmuconolactone methyl-isomerase. MmlI is a short, approx 115 residue, protein of two alpha helices and four beta strands. It is involved in the catabolism of methyl-substituted aromatics via a modified oxo-adipate pathway in bacteria. The enzyme appears to be monomeric in some species and tetrameric in others. The known structure shows two copies of the protein form a dimeric alpha beta barrel.	114
401415	pfam09449	DUF2020	Domain of unknown function (DUF2020). Protein of unknown function found in bacteria.	144
312824	pfam09450	DUF2019	Domain of unknown function (DUF2019). Protein of unknown function found in bacteria.	105
401416	pfam09451	ATG27	Autophagy-related protein 27. 	261
370502	pfam09452	Mvb12	ESCRT-I subunit Mvb12. The endosomal sorting complex required for transport (ESCRT) complexes play a critical role in receptor down-regulation and retroviral budding. A new component of the ESCRT-I complex was identified, multivesicular body sorting factor of 12 kD (Mvb12), which binds to the coiled-coil domain of the ESCRT-I subunit vacuolar protein sorting 23 (Vps23).	90
401417	pfam09453	HIRA_B	HIRA B motif. The HirA B (Histone regulatory homolog A binding) motif is the essential binding interface between HIRA pfam07569 and ASF1a, of approx. 40 residues. It forms an antiparallel beta-hairpin that binds perpendicular to the strands of the beta-sandwich of ASF1a N-terminal core domain, via beta-sheet, salt bridge and van der Waals interactions. The two histone chaperone proteins, HIRA and ASF1a, form a heterodimer with histones H3 and H4. HIRA is the human orthologue of Hir proteins known to silence histone gene expression and create transcriptionally silent heterochromatin in yeast, flies, plants and humans. The yeast CAF1B proteins which bind H3 also carry this motif at their very C-terminus.	23
401418	pfam09454	Vps23_core	Vps23 core domain. ESCRT complexes form the main machinery driving protein sorting from endosomes to lysosomes. The core domain of the Vps23 subunit of the heterotrimeric ESCRT-I complex is a helical hairpin sandwiched in a fan-like formation between two other helical hairpins from Vps28 (pfam03997) and Vps37. Vps23 gives ESCRT-I complex its stability.	60
401419	pfam09455	Cas_DxTHG	CRISPR-associated (Cas) DxTHG family. CRISPR is a term for Clustered Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR associated) proteins. The family describes Cas proteins of about 400 residues that include the motif [VIL]-D-x-[ST]-H-[GS]. The CRISPR and associated proteins are thought to be involved in the evolution of host resistance. The exact molecular function of this family is currently unknown.	313
401420	pfam09456	RcsC	RcsC Alpha-Beta-Loop (ABL). This domain is found in the C-terminus of the phospho-relay kinase RcsC between pfam00512 and pfam00072, and forms a discrete alpha/beta/loop structure.	91
401421	pfam09457	RBD-FIP	FIP domain. The FIP domain is the Rab11-binding domain (RBD) at the C-terminus of a family of Rab11-interacting proteins (FIPs). The Rab proteins constitute the largest family of small GTPases (>60 members in mammals). Among them Rab11 is a well characterized regulator of endocytic and recycling pathways. Rab11 associates with a broad range of post-Golgi organelles, including recycling endosomes.	41
401422	pfam09458	H_lectin	H-type lectin domain. The H-type lectin domain is a unit of six beta chains, combined into a homo-hexamer. It is involved in self/non-self recognition of cells, through binding with carbohydrates. It is sometimes found in association with the F5_F8_type_C domain pfam00754.	67
401423	pfam09459	EB_dh	Ethylbenzene dehydrogenase. Ethylbenzene dehydrogenase is a heterotrimer of three subunits that catalyzes the anaerobic degradation of hydrocarbons. The alpha subunit contains the catalytic centre as a Molybdenum cofactor-complex. This removes an electron-pair from the hydrocarbon and passes it along an electron transport system involving iron-sulphur complexes held in the beta subunit and a Haem b molecule contained in the gamma subunit. The electron-pair is then subsequently passed to an as yet unknown receiver. The enzyme is found in a variety of different bacteria.	193
370509	pfam09460	Saf-Nte_pilin	Saf-pilin pilus formation protein. This domain consists of the adjacent Saf-Nte and Saf-pilin chains of the pilus-forming complex. Pilus assembly in Gram-negative bacteria involves a Donor-strand exchange mechanism between the C- and the N-termini of this domain. The C-terminal subunit forms an incomplete Ig-fold which is then complemented by the 10-18 residue N-terminus of another, incoming, pilus subunit which is not involved in the Ig-fold. The N-terminus sequences contain a motif of alternating hydrophobic residues that occupy the P2 to P5 binding pockets in the groove of the first pilus subunit.	144
370510	pfam09461	PcF	Phytotoxin PcF protein. PcF is a 52 residue protein factor of two alpha helices, containing a 4-hydroxyproline and three cysteine bridges. The presence of the hydroxyproline is unique in relation to other fungal phytotoxic proteins. The protein has a high content of acidic side-chains implying a lack of binding with lipid-rich components of membranes and appears to be an extracellular phytotoxin that causes leaf necrosis in strawberries.	43
370511	pfam09462	Mus7	Mus7/MMS22 family. This family includes a conserved region from the Mus7 protein. Mus7 is involved in the repair of replication-associated DNA damage in the fission yeast Schizosaccharomyces pombe. Mus7 functions in the same pathway as Mus81, a subunit of the Mus81-Eme1 structure-specific endonuclease, which has been implicated in the repair of the replication-associated DNA damage. The MMS22 proteins are involved in repairing double-stranded DNA breaks created by the cleavage reaction of topoisomerase II.	610
401424	pfam09463	Opy2	Opy2 protein. Opy2p acts as a membrane anchor in the HOG signalling pathway.	35
401425	pfam09465	LBR_tudor	Lamin-B receptor of TUDOR domain. The Lamin-B receptor, found on the TUDOR domain pfam00567, is a chromatin and lamin binding protein in the inner nuclear membrane. It is one of the integral inner Nuclear Envelope membrane proteins responsible for targeting nuclear membranes to chromatin, being a downstream effector of Ran, a small Ras-like nuclear GTPase which regulates NE assembly. Lamin-B receptor interacts with Importin beta, a Ran-binding protein, thereby directly contributing to the fusion of membrane vesicles and the formation of the NE.	55
312837	pfam09466	Yqai	Hypothetical protein Yqai. This hypothetical protein is expressed in bacteria, particularly Bacillus subtilis. It forms a homo-dimer, with each monomer containing an alpha helix and four beta strands.	66
401426	pfam09467	Yopt	Hypothetical protein Yopt. This hypothetical protein is expressed in bacteria, particularly Bacillus subtilis. It forms homo-dimers, with each monomer consisting of one alpha helix and three beta strands.	71
401427	pfam09468	RNase_H2-Ydr279	Ydr279p protein family (RNase H2 complex component). RNases H are enzymes that specifically hydrolyze RNA when annealed to a complementary DNA and are present in all living organisms. In yeast RNase H2 is composed of a complex of three proteins (Rnh2Ap, Ydr279p and Ylr154p); this family represents the homologs of Ydr279p. It is not known whether non-yeast proteins in this family fulfil the same function.	157
312839	pfam09469	Cobl	Cordon-bleu ubiquitin-like domain. The Cordon-bleu protein domain is highly conserved among vertebrates. The sequence contains three repeated lysine, arginine, and proline-rich regions, the KKRAP motif. The exact function of the protein is unknown but it is thought to be involved in mid-brain neural tube closure. It is expressed specifically in the node. This domain has a ubiquitin-like fold.	79
401428	pfam09470	Telethonin	Telethonin protein. Telethonin is a 167-residue protein which complexes with the large muscle protein, titin. The very N-terminus of titin, composed of two immunoglobulin-like (Ig) domains, referred to as Z1 and Z2, interacts with the N-terminal region (residues 1-53) of telethonin, mediating the antiparallel assembly of two Z1Z2 domains. The C-terminus of the telethonin appears to induce dimerization of this 2:1 titin/telethonin structure which thus forms a complex necessary for myofibril assembly and maintenance of the intact Z-disk of skeletal and cardiac muscles.	154
401429	pfam09471	Peptidase_M64	IgA Peptidase M64. This is a family of highly selective metallo-endopeptidases. The primary structure of the Clostridium ramosum IgA proteinase shows no significant overall similarity to any other known metallo-endopeptidase.	259
401430	pfam09472	MtrF	Tetrahydromethanopterin S-methyltransferase, F subunit (MtrF). Many archaea have evolved energy-yielding pathways marked by one-carbon biochemistry featuring novel cofactors and enzymes. This domain is mostly found in MtrF, where it covers the entire length of the protein. This polypeptide is one of eight subunits of the N5-methyltetrahydromethanopterin: coenzyme M methyltransferase complex found in methanogenic archaea. This is a membrane-associated enzyme complex that uses methyl-transfer reactions to drive a sodium-ion pump. MtrF itself is involved in the transfer of the methyl group from N5-methyltetrahydromethanopterin to coenzyme M. Subsequently, methane is produced by two-electron reduction of the methyl moiety in methyl-coenzyme M by another enzyme, methyl-coenzyme M reductase. In some organisms this domain is found at the C terminal region of what appears to be a fusion of the MtrA and MtrF proteins. The function of these proteins is unknown, though it is likely that they are involved in C1 metabolism.	62
401431	pfam09474	Type_III_YscX	Type III secretion system YscX (type_III_YscX). Members of this family are encoded within bacterial type III secretion gene clusters. Among all species with type III secretion, those with this protein are found among those that target animal rather than plant cells. The member of this family in Yersinia was shown by mutation to be required for type III secretion of Yops effector proteins and therefore is believed to be part of the secretion machinery.	121
286550	pfam09475	Dot_icm_IcmQ	Dot/Icm secretion system protein (dot_icm_IcmQ). Proteins in this entry are the IcmQ component of Dot/Icm secretion systems, as found in the obligate intracellular pathogens Legionella pneumophila and Coxiella burnetii. While this system resembles type IV secretion systems and has been called a form of type IV, the literature now seems to favour calling this the Dot/Icm system. This protein was shown to be essential for translocation.	178
401432	pfam09476	Pilus_CpaD	Pilus biogenesis CpaD protein (pilus_cpaD). Proteins in this entry consist of a pilus biogenesis protein, CpaD, from Caulobacter, and homologs in other bacteria, including three in the root nodule bacterium Bradyrhizobium japonicum. The molecular function of the homologs is not known.	201
401433	pfam09477	Type_III_YscG	Bacterial type III secretion system chaperone protein (type_III_yscG). YscG is a molecular chaperone for YscE, where both are part of the type III secretion system that in Yersinia is designated Ysc (Yersinia secretion). The secretion system delivers effector proteins, designated Yops (Yersinia outer proteins), in Yersinia. This entry consists of YscG from Yersinia and functionally equivalent type III secretion proteins in other species: e.g. AscG in Aeromonas and LscG in Photorhabdus luminescens.	116
286553	pfam09478	CBM49	Carbohydrate binding domain CBM49. This domain is found at the C terminal of cellulases and in vitro binding studies have shown it to bind to crystalline cellulose.	80
401434	pfam09479	Flg_new	Listeria-Bacteroides repeat domain (List_Bact_rpt). This model describes a conserved core region of about 43 residues, which occurs in at least two families of tandem repeats. These include 78-residue repeats which occur from 2 to 15 times in some proteins of Bacteroides forsythus ATCC 43037, and 70-residue repeats found in families of internalins of Listeria species. Single copies are found in proteins of Fibrobacter succinogenes, Geobacter sulfurreducens, and a few other bacteria.	65
401435	pfam09480	PrgH	Type III secretion system protein PrgH-EprH (PrgH). In Salmonella, the gene encoding this protein is part of a four-gene operon PrgHIJK, while in other organisms it is found in type III secretion operons. PrgH has been shown to be required for type III secretion and is a structural component of the needle complex, which is the core component of type III secretion systems.	374
401436	pfam09481	CRISPR_Cse1	CRISPR-associated protein Cse1 (CRISPR_cse1). Clusters of short DNA repeats with non-homologous spacers, which are found at regular intervals in the genomes of phylogenetically distinct prokaryotic species, comprise a family with recognisable features. This family is known as CRISPR (short for Clustered, Regularly Interspaced Short Palindromic Repeats). A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This entry, represented by CT1972 from Chlorobaculum tepidum, is found in the CRISPR/Cas subtype Ecoli regions of many bacteria (most of which are mesophiles), and not in Archaea. It is designated Cse1.	465
370521	pfam09482	OrgA_MxiK	Bacterial type III secretion apparatus protein (OrgA_MxiK). This protein is encoded by genes which are found in type III secretion operons, and has been shown to be essential for the invasion phenotype in Salmonella and a component of the secretion apparatus. The protein is known as OrgA in Salmonella due to its oxygen-dependent expression pattern in which low-oxygen levels up-regulate the gene. In Shigella the gene is called MxiK and has been shown to be essential for the proper assembly of the needle complex, which is the core component of type III secretion systems.	181
401437	pfam09483	HpaP	Type III secretion protein (HpaP). This entry represents proteins encoded by genes which are always found in type III secretion operons, although their function in the processes of secretion and virulence is unclear. Hpa stands for Hrp-associated gene, where Hrp stands for hypersensitivity response and virulence. see also PMID:18584024	90
401438	pfam09484	Cas_TM1802	CRISPR-associated protein TM1802 (cas_TM1802). Clusters of short DNA repeats with non-homologous spacers, which are found at regular intervals in the genomes of phylogenetically distinct prokaryotic species, comprise a family with recognisable features. This family is known as CRISPR (short for Clustered, Regularly Interspaced Short Palindromic Repeats). A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This minor cas protein is found in at least five prokaryotic genomes: Methanosarcina mazei, Sulfurihydrogenibium azorense, Thermotoga maritima, Carboxydothermus hydrogenoformans, and Dictyoglomus thermophilum, the first of which is archaeal while the rest are bacterial.	584
401439	pfam09485	CRISPR_Cse2	CRISPR-associated protein Cse2 (CRISPR_cse2). Clusters of short DNA repeats with non-homologous spacers, which are found at regular intervals in the genomes of phylogenetically distinct prokaryotic species, comprise a family with recognisable features. This family is known as CRISPR (short for Clustered, Regularly Interspaced Short Palindromic Repeats). A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This family of proteins, represented by CT1973 from Chlorobaculum tepidum, is encoded by genes found in the CRISPR/Cas subtype Ecoli regions of many bacteria (most of which are mesophiles), and not in Archaea. It is designated Cse2.	144
370523	pfam09486	HrpB7	Bacterial type III secretion protein (HrpB7). This entry represents proteins encoded by genes which are found in type III secretion operons in a narrow range of species including Xanthomonas, Burkholderia and Ralstonia.	157
370524	pfam09487	HrpB2	Bacterial type III secretion protein (HrpB2). This entry represents proteins encoded by genes which are found in type III secretion operons in a narrow group of species including Xanthomonas, Burkholderia and Ralstonia.	114
401440	pfam09488	Osmo_MPGsynth	Mannosyl-3-phosphoglycerate synthase (osmo_MPGsynth). This family consists of examples of mannosyl-3-phosphoglycerate synthase (MPGS), which together with mannosyl-3-phosphoglycerate phosphatase (MPGP) EC:2.4.1.217, comprises a two-step pathway for mannosylglycerate biosynthesis. Mannosylglycerate is a compatible solute that tends to be restricted to extreme thermophiles of archaea and bacteria. Note that in Rhodothermus marinus, this pathway is one of two; the other is condensation of GDP-mannose with D-glycerate by mannosylglycerate synthase.	380
401441	pfam09489	CbtB	Probable cobalt transporter subunit (CbtB). This entry represents a family of proteins which have been proposed to act as cobalt transporters acting in concert with vitamin B12 biosynthesis systems. Evidence for this assignment includes 1) prediction of a single transmembrane segment and a C-terminal histidine-rich motif likely to be a metal-binding site, 2) positional gene linkage with known B12 biosynthesis genes, 3) upstream proximity of B12 transcriptional regulatory sites, 4) the absence of other known cobalt import systems and 5) the obligate co-localization with a protein (CbtA) predicted to have five additional transmembrane segments.	50
401442	pfam09490	CbtA	Probable cobalt transporter subunit (CbtA). This entry represents a family of proteins which have been proposed to act as cobalt transporters acting in concert with vitamin B12 biosynthesis systems. Evidence for this assignment includes 1) prediction of five transmembrane segments, 2) positional gene linkage with known B12 biosynthesis genes, 3) upstream proximity of B12 transcriptional regulatory sites, 4) the absence of other known cobalt import systems and 5) the obligate co-localization with a small protein (CbtB) having a single additional transmembrane segment and a C-terminal histidine-rich motif likely to be a metal-binding site.	240
401443	pfam09491	RE_AlwI	AlwI restriction endonuclease. This family includes the AlwI (recognizes GGATC), Bsp6I (recognizes GC^NGC), BstNBI (recognizes GASTC), PleI (recognizes GAGTC) and MlyI (recognizes GAGTC) restriction endonucleases.	441
401444	pfam09492	Pec_lyase	Pectic acid lyase. Members of this family are isozymes of pectate lyase (EC:4.2.2.2), also called polygalacturonic transeliminase and alpha-1,4-D-endopolygalacturonic acid lyase.	289
401445	pfam09493	DUF2389	Tryptophan-rich protein (DUF2389). Members of this family are small hypothetical proteins of 60 to 100 residues from Cyanobacteria and some Proteobacteria. Prochlorococcus marinus strains have two members, other species one only. Interestingly, of the eight most conserved residues, four are aromatic and three are invariant tryptophans. It appears all species that encode this protein can synthesize tryptophan de novo.	62
401446	pfam09494	Slx4	Slx4 endonuclease. The Slx4 protein is a heteromeric structure-specific endonuclease found from fungi to mammals. Slx4 with Slx1 acts as a nuclease on branched DNA substrates, particularly simple-Y, 5'-flap, or replication fork structures by cleaving the strand bearing the 5' non-homologous arm at the branch junction and thus generating ligatable nicked products from 5'-flap or replication fork substrates.	59
401447	pfam09495	DUF2462	Protein of unknown function (DUF2462). This protein is highly conserved, but its function is unknown. It can be isolated from HeLa cell nucleoli and is found to be homologous with Leydig cell tumor protein whose function is unknown.	74
401448	pfam09496	CENP-O	Cenp-O kinetochore centromere component. This eukaryotic protein is a component of the inner kinetochore subcomplex of the centromere. It has been shown to be involved in chromosome segregation via regulation of the spindle in both yeast and human.	202
401449	pfam09497	Med12	Transcription mediator complex subunit Med12. Med12 is a negative regulator of the Gli3-dependent sonic hedgehog signalling pathway via its interaction with Gli3 within the RNA polymerase II transcriptional Mediator. A complex is formed between Med12, Med13, CDK8 and CycC which is responsible for suppression of transcription. This subunit forms part of the Kinase section of Mediator.	63
401450	pfam09498	DUF2388	Protein of unknown function (DUF2388). This family consists of small hypothetical proteins, about 100 amino acids in length. The family includes five members (three in tandem) in Pseudomonas aeruginosa PAO1 and in Pseudomonas putida (strain KT2440), four in Pseudomonas syringae DC3000, and single members in several other Proteobacteria. The function is unknown.	70
286574	pfam09499	RE_ApaLI	ApaLI-like restriction endonuclease. This family includes R.ApaLI and R.XbaI restriction endonucleases. ApaLI recognizes and cleaves the sequence GTGCAC.	189
401451	pfam09500	YiiD_C	Putative thioesterase (yiiD_Cterm). This entry consists of a broadly distributed uncharacterized domain often found as a standalone protein. The member from Shewanella oneidensis is described from crystallography work as a putative thioesterase because it belongs to the HotDog clan of enzymes. About half of the members of this family are fused to an Acetyltransf_1 domain pfam00583.	144
312864	pfam09501	Bac_small_YrzI	Probable sporulation protein (Bac_small_yrzI). Members of this family are very small proteins, about 47 residues each, in the genus Bacillus. Single members are found in Bacillus subtilis and Bacillus halodurans, while arrays of six members in tandem are found in Bacillus cereus and Bacillus anthracis. An EIxxE motif present in most members of this family resembles cleavage sites by the germination protease GPR in a number of small acid-soluble spore proteins (SASP). A role in sporulation is possible.	45
401452	pfam09502	HrpB4	Bacterial type III secretion protein (HrpB4). This entry represents proteins encoded by genes which are found in type III secretion operons in a narrow range of species including Xanthomonas, Burkholderia and Ralstonia.	217
401453	pfam09504	RE_Bsp6I	Bsp6I restriction endonuclease. This family includes the Bsp6I (recognizes and cleaves GC^NGC) restriction endonucleases.	179
370535	pfam09505	Dimeth_Pyl	Dimethylamine methyltransferase (Dimeth_PyL). This family consists of dimethylamine methyltransferases from the genus Methanosarcina. It is found in three nearly identical copies in each of Methanosarcina acetivorans, Methanosarcina barkeri, and Methanosarcina mazei. It is one of a suite of three non-homologous enzymes with a critical UAG-encoded pyrrolysine residue in these species (along with trimethylamine methyltransferase and monomethylamine methyltransferase). It demethylates dimethylamine, leaving monomethylamine, and methylates the prosthetic group of the small corrinoid protein MtbC. The methyl group is then transferred by methylcorrinoid:coenzyme M methyltransferase to coenzyme M. Note that the pyrrolysine residue is variously translated as K or X, or as a stop codon that truncates the sequence.	462
401454	pfam09506	Salt_tol_Pase	Glucosylglycerol-phosphate phosphatase (Salt_tol_Pase). Proteins in this family are glucosylglycerol-phosphate phosphatases, with the gene symbol stpA (Salt Tolerance Protein A). A motif characteristic of acid phosphatases is found, but otherwise this family shows little sequence similarity to other phosphatases. This enzyme acts on the glucosylglycerol phosphate, product of glucosylglycerol phosphate synthase and immediate precursor of the osmoprotectant glucosylglycerol.	388
370537	pfam09507	CDC27	DNA polymerase subunit Cdc27. This protein forms the C subunit of DNA polymerase delta. It carries the essential residues for binding to the Pol1 subunit of polymerase alpha, from residues 293-332, which are characterized by the motif D--G--VT, referred to as the DPIM motif. The first 160 residues of the protein form the minimal domain for binding to the B subunit, Cdc1, of polymerase delta, the final 10 C-terminal residues, 362-372, being the DNA sliding clamp, PCNA, binding motif.	427
401455	pfam09508	Lact_bio_phlase	Lacto-N-biose phosphorylase N-terminal TIM barrel domain. The gene which codes for this protein in gut-bacteria is located in a novel putative operon for galactose metabolism. The protein appears to be a carbohydrate-processing phosphorolytic enzyme (EC:2.4.1.211), unlike either glycoside hydrolases or glycoside lyase. Intestinal colonisation by bifidobacteria is important for human health, especially in pediatrics, because colonisation seems to prevent infection by some pathogenic bacteria that cause diarrhoea or other illnesses. The operon seems to be involved in intestinal colonisation by bifidobacteria mediated by metabolism of mucin sugars. In addition, it may also resolve the question of the nature of the bifidus factor in human milk as the lacto-N-biose structure found in milk oligosaccharides.	434
401456	pfam09509	Hypoth_Ymh	Protein of unknown function (Hypoth_ymh). This entry consists of a relatively rare prokaryotic protein family (about 8 occurrences per 200 genomes). Genes for members of this family appear to be associated variously with phage and plasmid regions, restriction system loci, transposons, and housekeeping genes. Their function is unknown.	118
401457	pfam09510	Rtt102p	Rtt102p-like transcription regulator protein. This protein is found in fungi. The family includes Rtt102p, a transcription regulator protein which appears to be integrally associated with both the Swi-Snf and the RSC chromatin remodelling complexes.	130
401458	pfam09511	RNA_lig_T4_1	RNA ligase. Members of this family include T4 phage proteins with ATP-dependent RNA ligase activity. Host defense to phage may include cleavage and inactivation of specific tRNA molecules; members of this family act to reverse this RNA damage. The enzyme is adenylated, transiently, on a Lys residue in a motif KXDGSL. This family also includes fungal tRNA ligases that have adenylyltransferase activity. tRNA ligases are enzymes required for the splicing of precursor tRNA molecules containing introns.	221
401459	pfam09512	ThiW	Thiamine-precursor transporter protein (ThiW). Levels of thiamine pyrophosphate (TPP) or thiamine regulate transcription or translation of a number of thiamine biosynthesis, salvage, or transport genes in a wide range of prokaryotes. The mechanism involves direct binding, with no protein involved, to a structural element called THI found in the untranslated upstream region of thiamine metabolism gene operons. This element is called a riboswitch and is seen also for other metabolites such as FMN and glycine. This protein family consists of proteins identified in operons controlled by the THI riboswitch and designated ThiW. The hydrophobic nature of this protein and reconstructed metabolic background suggests that this protein acts in transport of a thiazole precursor of thiamine.	150
401460	pfam09514	SSXRD	SSXRD motif. SSX1 can repress transcription, and this has been attributed to a putative Kruppel associated box (KRAB) repression domain at the N-terminus. However, from the analysis of these deletion constructs further repression activity was found at the C-terminus of SSX1, which has been called the SSXRD (SSX Repression Domain). The potent repression exerted by full-length SSX1 appears to localize to this region.	30
401461	pfam09515	Thia_YuaJ	Thiamine transporter protein (Thia_YuaJ). Members of this protein family have been assigned as thiamine transporters by a phylogenetic analysis of families of genes regulated by the THI element, a broadly conserved RNA secondary structure element through which thiamine pyrophosphate (TPP) levels can regulate transcription of many genes related to thiamine transport, salvage, and de novo biosynthesis. Species with this protein always lack the ThiBPQ ABC transporter. In some species (e.g. Streptococcus mutans and Streptococcus pyogenes), yuaJ is the only THI-regulated gene. Evidence from Bacillus cereus indicates thiamine uptake is coupled to proton translocation.	177
401462	pfam09516	RE_CfrBI	CfrBI restriction endonuclease. This family includes the CfrBI (recognizes and cleaves C^CWWGG) restriction endonuclease.	257
370543	pfam09517	RE_Eco29kI	Eco29kI restriction endonuclease. This family includes the Eco29kI (recognizes and cleaves CCGC^GG) restriction endonuclease.	161
286590	pfam09518	RE_HindIII	HindIII restriction endonuclease. This family includes the HindIII (recognizes and cleaves A^AGCTT) restriction endonuclease.	284
370544	pfam09519	RE_HindVP	HindVP restriction endonuclease. This family includes the HindVP (recognizes GRCGYC but the cleavage site is unknown) restriction endonucleases.	324
401463	pfam09520	RE_TdeIII	Type II restriction endonuclease, TdeIII. This family includes many TdeIII restriction endonucleases that recognize and cleave at GGNCC sites. TdeIII cleaves unmethylated double-stranded DNA.	239
401464	pfam09521	RE_NgoPII	NgoPII restriction endonuclease. This family includes the NgoPII (recognizes and cleaves GG^CC) restriction endonuclease.	262
337439	pfam09522	RE_R_Pab1	R.Pab1 restriction endonuclease. 	119
401465	pfam09523	DUF2390	Protein of unknown function (DUF2390). Members of this family are bacterial hypothetical proteins, about 160 amino acids in length, found in various proteobacteria, including members of the genera Pseudomonas and Vibrio. The C-terminal region is poorly conserved and is not included in the model.	107
401466	pfam09524	Phg_2220_C	Conserved phage C-terminus (Phg_2220_C). This entry represents the conserved C-terminal domain of a family of proteins found exclusively in bacteriophage and in bacterial prophage regions. The functions of this domain and the proteins containing it are unknown.	74
401467	pfam09526	DUF2387	Probable metal-binding protein (DUF2387). Members of this family are small proteins, about 70 residues in length, with a basic triplet near the N-terminus and a probable metal-binding motif CPXCX(18)CXXC. Members are found in various proteobacteria.	75
401468	pfam09527	ATPase_gene1	Putative F0F1-ATPase subunit Ca2+/Mg2+ transporter. This model represents a protein found encoded in F1F0-ATPase operons in several genomes, including Methanosarcina barkeri (archaeal) and Chlorobium tepidum (bacterial). It is a small protein (about 100 amino acids) with long hydrophobic stretches and is presumed to be a subunit of the enzyme. It carries two transmembrane helices and is a magnesium or calcium uniporter. The atp operon of alkaliphilic Bacillus pseudofirmus OF4, as in most prokaryotes, contains the eight structural genes for the F-ATPase (ATP synthase), which are preceded by an atpI gene that encodes a membrane protein with 2 TMSs. A tenth gene, atpZ, has been found in this operon, which is upstream of and overlapping with atpI.	54
312885	pfam09528	Ehrlichia_rpt	Ehrlichia tandem repeat (Ehrlichia_rpt). This entry represents a 30 amino acid tandem repeat, found in a variable number of copies in an immunodominant outer membrane protein of Ehrlichia chaffeensis, a tick-borne obligate intracellular pathogen. These short tandem repeats elicit a strong antibody response in the hosts.	36
401469	pfam09529	Intg_mem_TP0381	Integral membrane protein (intg_mem_TP0381). This entry represents a family of hydrophobic proteins with seven predicted transmembrane alpha helices. Members are found in Bacillus subtilis (ywaF), Treponema pallidum (TP0381), Streptococcus pyogenes, Rhodococcus erythropolis, etc.	210
401470	pfam09531	Ndc1_Nup	Nucleoporin protein Ndc1-Nup. Ndc1 is a nucleoporin protein that is a component of the Nuclear Pore Complex, and, in fungi, also of the Spindle Pole Body. It consists of six transmembrane segments, three lumenal loops, both concentrated at the N-terminus and cytoplasmic domains largely at the C-terminus, all of which are well conserved.	546
370547	pfam09532	FDF	FDF domain. The FDF domain, so called because of the conserved FDF at its N-terminus, is an entirely alpha-helical domain with multiple exposed hydrophilic loops. It is found at the C-terminus of Scd6p-like SM domains. It is also found with other divergent Sm domains and in proteins such as Dcp3p and FLJ21128, where it is found N-terminal to the YjeF-N domain, a novel Rossmann fold domain.	102
370548	pfam09533	DUF2380	Predicted lipoprotein of unknown function (DUF2380). This family consists of at least 9 paralogs in Myxococcus xanthus, a member of the Deltaproteobacteria. One appears truncated toward the N-terminus; the others are predicted lipoproteins. The function is unknown.	187
401471	pfam09534	Trp_oprn_chp	Tryptophan-associated transmembrane protein (Trp_oprn_chp). Members of this family are predicted transmembrane proteins with four membrane-spanning helices. Members are found in the Actinobacteria (Mycobacterium, Corynebacterium, Streptomyces), always associated with genes for tryptophan biosynthesis.	180
370549	pfam09535	Gmx_para_CXXCG	Protein of unknown function (Gmx_para_CXXCG). This entry consists of at least 10 paralogous proteins from Myxococcus xanthus that lack detectable sequence similarity to any other protein family. An imperfectly conserved CXXCG motif, a probable binding site, appears twice in the multiple sequence alignment.	236
370550	pfam09536	DUF2378	Protein of unknown function (DUF2378). This family consists of a set of at least 17 paralogous proteins in Myxococcus xanthus DK 1622 and 12 in Stigmatella aurantiaca DW4/3-1. Members are about 200 amino acids in length. The function is unknown.	177
401472	pfam09537	DUF2383	Domain of unknown function (DUF2383). Members of this protein family are found mostly in the Proteobacteria, although one member is found in the marine planctomycete Pirellula sp. strain 1. The function is unknown.	106
401473	pfam09538	FYDLN_acid	Protein of unknown function (FYDLN_acid). Members of this family are bacterial proteins with a conserved motif [KR]FYDLN, sometimes flanked by a pair of CXXC motifs, followed by a long region of low complexity sequence in which roughly half the residues are Asp and Glu, including multiple runs of five or more acidic residues. The function of members of this family is unknown.	108
401474	pfam09539	DUF2385	Protein of unknown function (DUF2385). Members of this uncharacterized protein family are found in a number of alphaproteobacteria, including root nodule bacteria, Brucella suis, Caulobacter crescentus, and Rhodopseudomonas palustris. Conserved residues include two well-separated cysteines, suggesting a disulfide bond. The function is unknown.	88
370553	pfam09543	DUF2379	Protein of unknown function (DUF2379). This family consists of at least 7 paralogs in Myxococcus xanthus and 6 in Stigmatella aurantiaca, both members of the Deltaproteobacteria. The function is unknown.	120
370554	pfam09544	DUF2381	Protein of unknown function (DUF2381). This family consists of at least 8 paralogs in Myxococcus xanthus, a member of the Deltaproteobacteria. The function is unknown.	287
286611	pfam09545	RE_AccI	AccI restriction endonuclease. This family includes the AccI (recognizes and cleaves GT^MKAC) restriction endonuclease.	366
401475	pfam09546	Spore_III_AE	Stage III sporulation protein AE (spore_III_AE). This represents the stage III sporulation protein AE, which is encoded in a spore formation operon spoIIIAABCDEFGH under the control of sigma G. A comparative genome analysis of all sequenced genomes of Firmicutes shows that the proteins are strictly conserved among the sub-set of endospore-forming species.	321
401476	pfam09547	Spore_IV_A	Stage IV sporulation protein A (spore_IV_A). SpoIVA is designated stage IV sporulation protein A. It acts in the mother cell compartment and plays a role in spore coat morphogenesis. A comparative genome analysis of all sequenced genomes of Firmicutes shows that the proteins are strictly conserved among the sub-set of endospore-forming species.	490
401477	pfam09548	Spore_III_AB	Stage III sporulation protein AB (spore_III_AB). SpoIIIAB represents the stage III sporulation protein AB, which is encoded in a spore formation operon: spoIIIAABCDEFGH that is under sigma G regulation. A comparative genome analysis of all sequenced genomes of Firmicutes shows that the proteins are strictly conserved among the sub-set of endospore-forming species.	169
370555	pfam09549	RE_Bpu10I	Bpu10I restriction endonuclease. This family includes the Bpu10I (recognizes and cleaves CCTNAGC (-5/-2)) restriction endonucleases.	220
401478	pfam09550	Phage_TAC_6	Phage tail assembly chaperone protein, TAC. This is a family of phage tail assembly chaperone proteins largely derived from the Rhodobacter species viral agent GTA (gene transfer agent) gp10.	58
401479	pfam09551	Spore_II_R	Stage II sporulation protein R (spore_II_R). SpoIIR is designated stage II sporulation protein R. A comparative genome analysis of all sequenced genomes of Firmicutes shows that the proteins are strictly conserved among the sub-set of endospore-forming species. SpoIIR is a signalling protein that links the activation of sigma E to the transcriptional activity of sigma F during sporulation.	125
286618	pfam09552	RE_BstXI	BstXI restriction endonuclease. This family includes the BstXI (recognizes and cleaves CCANNNNN^NTGG) restriction endonuclease.	290
401480	pfam09553	RE_Eco47II	Eco47II restriction endonuclease. This family includes the Eco47II (which recognizes GGNCC, but the cleavage site is unknown) restriction endonuclease.	202
370557	pfam09554	RE_HaeII	HaeII restriction endonuclease. This family includes the HaeII (recognizes and cleaves RGCGC^Y) restriction endonuclease.	338
401481	pfam09556	RE_HaeIII	HaeIII restriction endonuclease. This family includes the HaeIII (recognizes and cleaves GG^CC) restriction endonuclease.	298
401482	pfam09557	DUF2382	Domain of unknown function (DUF2382). This entry describes an uncharacterized domain, sometimes found in association with a PRC-barrel domain (pfam05239, which is also found in rRNA processing protein RimM and in a photosynthetic reaction centre complex protein). This domain is found in proteins from Bacillus subtilis, Deinococcus radiodurans, Nostoc sp. PCC 7120, Myxococcus xanthus, and several other species. The function is not known.	111
286623	pfam09558	DUF2375	Protein of unknown function (DUF2375). Two members of this family are found in Colwellia psychrerythraea (strain 34H / ATCC BAA-681) and one each in various other species of Colwellia and Shewanella. One member from C. psychrerythraea is of special interest because it is preceded by the same cis-regulatory site as a number of genes that have the PEP-CTERM domain described by PEP_anchor (IPR013424).	69
401483	pfam09559	Cas6	Cas6 Crispr. The Cas6 Crispr family of proteins averaging 140 residues are characterized by having a GhGxxxxxGhG motif, where h indicates a hydrophobic residue, at the C-terminus. The CRISPR-Cas system is possibly a mechanism of defense against invading pathogens and plasmids that functions analogously to the RNA interference (RNAi) systems in eukaryotes.	190
401484	pfam09560	Spore_YunB	Sporulation protein YunB (Spo_YunB). Spo_YunB is the sporulation protein YunB. In Bacillus subtilis its expression is controlled by sigmaE. The gene YunB seems to code for a protein involved, at least indirectly, in the pathway leading to the activation of sigmaK. Inactivation of YunB delays sigmaK activation and results in reduced sporulation efficiency.	91
401485	pfam09561	RE_HpaII	HpaII restriction endonuclease. This family includes the HpaII (recognizes and cleaves C^CGG) restriction endonuclease.	352
401486	pfam09562	RE_LlaMI	LlaMI restriction endonuclease. This family includes the LlaMI (recognizes and cleaves CC^NGG) restriction endonuclease.	242
370562	pfam09563	RE_LlaJI	LlaJI restriction endonuclease. This family includes the LlaJI (recognizes GACGC) restriction endonucleases.	365
401487	pfam09564	RE_NgoBV	NgoBV restriction endonuclease. This family includes the NgoBV (recognizes GGNNCC but cleavage site is unknown) restriction endonuclease.	238
401488	pfam09565	RE_NgoFVII	NgoFVII restriction endonuclease. This family includes the NgoFVII (recognizes GCSGC but cleavage site unknown) restriction endonuclease.	293
401489	pfam09566	RE_SacI	SacI restriction endonuclease. This family includes the SacI (recognizes and cleaves GAGCT^C) restriction endonuclease.	267
401490	pfam09567	RE_MamI	MamI restriction endonuclease. This family includes the MamI (recognizes and cleaves GATNN^NNATC) restriction endonuclease.	183
401491	pfam09568	RE_MjaI	MjaI restriction endonuclease. This family includes the MjaI (recognizes CTAG but cleavage site unknown) restriction endonuclease.	164
401492	pfam09569	RE_ScaI	ScaI restriction endonuclease. This family includes the ScaI (recognizes and cleaves AGT^ACT) restriction endonuclease.	192
312916	pfam09570	RE_SinI	SinI restriction endonuclease. This family includes the SinI (recognizes and cleaves G^GWCC) restriction endonuclease.	218
286631	pfam09571	RE_XcyI	XcyI restriction endonuclease. This family includes the XcyI (recognizes and cleaves C^CCGGG) restriction endonucleases.	305
401493	pfam09572	RE_XamI	XamI restriction endonuclease. This family includes the XamI (recognizes GTCGAC but cleavage site unknown) restriction endonuclease.	254
401494	pfam09573	RE_TaqI	TaqI restriction endonuclease. This family includes the TaqI (recognizes and cleaves T^CGA) restriction endonuclease.	229
370569	pfam09574	DUF2374	Protein of unknown function (Duf2374). This very small protein (about 46 amino acids) consists largely of a single predicted membrane-spanning region. It is found in Photobacterium profundum SS9 and in three species of Vibrio, always near periplasmic nitrate reductase genes, but far from the periplasmic nitrate reductase genes in Aeromonas hydrophila ATCC 7966.	42
286635	pfam09575	Spore_SspJ	Small spore protein J (Spore_SspJ). Spore_SspJ represents a group of small acid-soluble proteins (SASP) from Bacillus sp., which are present in spores but not in growing cells. The sspJ gene is transcribed in the forespore compartment by RNA polymerase with the forespore-specific sigmaG. Loss of SspJ causes a slight decrease in the rate of spore outgrowth in an otherwise wild-type background.	46
312918	pfam09577	Spore_YpjB	Sporulation protein YpjB (SpoYpjB). These proteins are found in the endospore-forming bacteria which include Bacillus species. In Bacillus subtilis, ypjB was found to be part of the sigma-E regulon. Sigma-E is a sporulation sigma factor that regulates expression in the mother cell compartment. Null mutants of ypjB show a sporulation defect, but this gene is not, however, a part of the endospore formation minimal gene set.	223
401495	pfam09578	Spore_YabQ	Spore cortex protein YabQ (Spore_YabQ). This protein is predicted to span the membrane several times. It is only found in genomes of species that perform sporulation, such as Bacillus subtilis, Clostridium tetani, and other members of the Firmicutes (low-GC Gram-positive bacteria). Mutation of this sigmaE-dependent gene blocks development of the spore cortex. The length of the C-terminal region, which includes some hydrophobic regions, is variable.	75
401496	pfam09579	Spore_YtfJ	Sporulation protein YtfJ (Spore_YtfJ). Proteins in this family are encoded by bacterial genomes if, and only if, the species is capable of endospore formation. YtfJ was confirmed in spores of B. subtilis; it appears to be expressed in the forespore under control of SigF.	81
401497	pfam09580	Spore_YhcN_YlaJ	Sporulation lipoprotein YhcN/YlaJ (Spore_YhcN_YlaJ). This entry contains YhcN and YlaJ, which are predicted lipoproteins that have been detected as spore proteins but not vegetative proteins in Bacillus subtilis. Both appear to be expressed under control of the RNA polymerase sigma-G factor. The YlaJ-like members of this family have a low-complexity, strongly acidic, 40-residue C-terminal domain.	157
401498	pfam09581	Spore_III_AF	Stage III sporulation protein AF (Spore_III_AF). This family represents the stage III sporulation protein AF (Spore_III_AF) of the bacterial endospore formation program, which exists in some but not all members of the Firmicutes (formerly called low-GC Gram-positives). The C-terminal region of these proteins is poorly conserved.	184
401499	pfam09582	AnfO_nitrog	Iron only nitrogenase protein AnfO (AnfO_nitrog). Proteins in this entry include Anf1 from Rhodobacter capsulatus (Rhodopseudomonas capsulata) and AnfO from Azotobacter vinelandii. They are found exclusively in species which contain the iron-only nitrogenase, and are encoded immediately downstream of the structural genes for the nitrogenase enzyme in these species.	190
401500	pfam09583	Phageshock_PspG	Phage shock protein G (Phageshock_PspG). This protein was previously designated as YjbO in Escherichia coli. It is found only in genomes that have the phage shock operon (psp), but it is only rarely encoded near other psp genes. The psp regulon is upregulated in response to a number of stress conditions, including ethanol, expression of the filamentous phage secretin protein IV and other secretins and heat shock.	64
401501	pfam09584	Phageshock_PspD	Phage shock protein PspD (Phageshock_PspD). Members of this family are phage shock protein PspD, found in a minority of bacteria that carry the defining genes of the phage shock regulon (pspA, pspB, pspC, and pspF). It is found in Escherichia coli, Yersinia pestis, and closely related species, where it is part of the phage shock operon. It is known to be expressed but its function is unknown.	61
401502	pfam09585	Lin0512_fam	Conserved hypothetical protein (Lin0512_fam). This family consists of few members, broadly distributed. It occurs so far in several Firmicutes (twice in Oceanobacillus), one Cyanobacterium, one alpha Proteobacterium, and (with a long prefix) in plants. The function is unknown. The alignment includes a well conserved motif GxGxDxHG near the N-terminus.	114
401503	pfam09586	YfhO	Bacterial membrane protein YfhO. This protein is a conserved membrane protein. The yfhO gene is transcribed in Difco sporulation medium and the transcription is affected by the YvrGHb two-component system. Some members of this family have been annotated as glycosyl transferases of the PMT family.	839
401504	pfam09587	PGA_cap	Bacterial capsule synthesis protein PGA_cap. This protein is a putative poly-gamma-glutamate capsule biosynthesis protein found in bacteria. Poly-gamma-glutamate is a natural polymer that may be involved in virulence and may help bacteria survive in high salt concentrations. It is a surface-associated protein.	247
401505	pfam09588	YqaJ	YqaJ-like viral recombinase domain. This protein family is found in many different bacterial species but is of viral origin. The protein forms an oligomer and functions as a processive alkaline exonuclease that digests linear double-stranded DNA in a Mg(2+)-dependent reaction. It has a preference for 5'-phosphorylated DNA ends. It thus forms part of the two-component SynExo viral recombinase functional unit.	142
370573	pfam09589	HrpA_pilin	HrpA pilus formation protein. HrpA is an essential component of the type III secretion system (TTSS) which pathogens use to inject virulence factors directly into their host cells, and to cause disease. The TTSS has an Hrp pilus appendage for channelling effector proteins through the plant cell wall and this pilus elongates by the addition of HrpA pilin subunits at the distal end.	96
150302	pfam09590	Env-gp36	Lentivirus surface glycoprotein. This protein is found in feline immunodeficiency retrovirus. It represents the surface glycoprotein which is found in the polyprotein C-terminal to the Env protein.	591
286649	pfam09591	DUF2463	Protein of unknown function (DUF2463). This protein is found in eukaryotic, parasitic microsporidia. Its function is unknown.	210
401506	pfam09592	DUF2031	Protein of unknown function (DUF2031). This protein is expressed in Plasmodium; its function is unknown. It may be the product of gene family pyst-b.	227
401507	pfam09593	Pathogen_betaC1	Beta-satellite pathogenicity beta C1 protein. Cotton leaf-curl disease - CLCuD - is of major economic importance in cotton-growing areas of the far-east. The infectious agent appears to be a single-stranded DNA molecule of approx 1350 nucleotides in length, which, when inoculated with the Begomovirus into cotton, induces symptoms typical of CLCuD. This molecule requires the Begomovirus for replication and encapsidation. DNA beta encodes a single protein, betaC1. The intracellular distribution of betaC1 is consistent with the hypothesis that it has a role in transporting the DNA A of Begomovirus from the nuclear site of replication to the plasmodesmatal exit sites of the infected cell. The DNA beta-encoded protein, betaC1, is the determinant of both pathogenicity and suppression of gene silencing.	117
401508	pfam09594	GT87	Glycosyltransferase family 87. The enzymes in this family are glycosyltransferases. PimE is involved in phosphatidylinositol mannoside (PIM) synthesis, a major class of glycolipids in all mycobacteria. PimE is a polyprenol-phosphate-mannose-dependent mannosyltransferase that transfers the fifth mannose of PIM. The family also includes alpha(1-->3) arabinofuranosyltransferase, involved in the synthesis of mycobacterial arabinogalactan.	237
312932	pfam09595	Metaviral_G	Metaviral_G glycoprotein. This is a viral attachment glycoprotein from region G of metaviruses. It is high in serine and threonine suggesting it is highly glycosylated.	183
401509	pfam09596	MamL-1	MamL-1 domain. The MamL-1 domain is a polypeptide of up to 70 residues, numbers 15-67 of which adopt an elongated kinked helix that wraps around ANK and CSL forming one of the complexes in the build-up of the Notch transcriptional complex for recruiting general transcription factors.	58
401510	pfam09597	IGR	IGR protein motif. This domain is found in fungal proteins and contains a conserved IGR motif. Its function is unknown.	55
401511	pfam09598	Stm1_N	Stm1. This region is found at the N-terminus of the Stm1 protein. Stm1 is a G4 quadruplex and purine motif triplex nucleic acid-binding protein. It has been implicated in many biological processes including apoptosis and telomere biosynthesis. Stm1 is known to interact with CDC13, and is known to associate with ribosomes and nuclear telomere cap complexes.	62
286655	pfam09599	IpaC_SipC	Salmonella-Shigella invasin protein C (IpaC_SipC). This entry represents a family of proteins associated with bacterial type III secretion systems, which are injection machines for virulence factors into host cell cytoplasm. Characterized members of this protein family are known to be secreted and are described as invasins, including IpaC from Shigella flexneri and SipC from Salmonella typhimurium. Members may be referred to as invasins, pathogenicity island effectors, and cell invasion proteins.	334
401512	pfam09600	Cyd_oper_YbgE	Cyd operon protein YbgE (Cyd_oper_YbgE). This entry describes a small protein of unknown function, about 100 amino acids in length, essentially always found in an operon with CydAB, subunits of the cytochrome d terminal oxidase. It appears to be an integral membrane protein. It is found so far only in the Proteobacteria.	76
401513	pfam09601	DUF2459	Protein of unknown function (DUF2459). This conserved hypothetical protein of unknown function is found in several Proteobacteria. Its function is unknown and its genome context is not well-conserved. It is found amid urease genes in at least one species.	171
370578	pfam09602	PhaP_Bmeg	Polyhydroxyalkanoic acid inclusion protein (PhaP_Bmeg). This entry describes a protein found in polyhydroxyalkanoic acid (PHA) gene regions and incorporated into PHA inclusions in Bacillus cereus and Bacillus megaterium. The role of the protein may include amino acid storage.	168
401514	pfam09603	Fib_succ_major	Fibrobacter succinogenes major domain (Fib_succ_major). This domain of about 175 to 200 amino acids is found, in from one to five copies, in over 50 proteins in Fibrobacter succinogenes S85, an obligate anaerobe of the rumen. Many members of this family have an apparent lipoprotein signal sequence. Conserved cysteine residues, suggestive of disulfide bond formation, are also consistent with an extracytoplasmic location for this domain. This domain can also be found in small numbers of proteins in Chlorobium tepidum and Bacteroides thetaiotaomicron.	172
401515	pfam09604	Potass_KdpF	F subunit of K+-transporting ATPase (Potass_KdpF). This entry describes a very small integral membrane peptide KdpF, a subunit of the K(+)-translocating Kdp complex. It is found upstream of the KdpA subunit (IPR004623). Because of its very small size and highly hydrophobic character, it is sometimes missed in genome annotation.	24
401516	pfam09605	Trep_Strep	Hypothetical bacterial integral membrane protein (Trep_Strep). This family consists of strongly hydrophobic proteins about 190 amino acids in length with a strongly basic motif near the C-terminus. It is found in rather few species, but in paralogous families of 12 members in the oral pathogenic spirochaete Treponema denticola and 2 in Streptococcus pneumoniae R6.	185
312941	pfam09606	Med15	ARC105 or Med15 subunit of Mediator complex non-fungal. The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators use the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.	732
401517	pfam09607	BrkDBD	Brinker DNA-binding domain. This DNA-binding domain is the first approx. 100 residues of the N-terminal end of Brinker. The structure of this domain in complex with DNA consists of four alpha-helices that contain a helix-turn-helix DNA recognition motif specific for GC-rich DNA. The Brinker nuclear repressor is a major element of the Drosophila Decapentaplegic morphogen signalling pathway.	58
401518	pfam09608	Alph_Pro_TM	Putative transmembrane protein (Alph_Pro_TM). This family consists of predicted transmembrane proteins of about 270 amino acids. Members are found, so far, only among the Alphaproteobacteria and only once in each genome.	227
312943	pfam09609	Cas_GSU0054	CRISPR-associated protein, GSU0054 family (Cas_GSU0054). This entry represents a rare CRISPR-associated protein. So far, members are found in Geobacter sulfurreducens and in two unpublished genomes: Gemmata obscuriglobus and Actinomyces naeslundii. CRISPR-associated proteins typically are found near CRISPR repeats and other CRISPR-associated proteins, have low levels of sequence identity, have sequence relationships that suggest lateral transfer, and show some sequence similarity to DNA-active proteins such as helicases and repair proteins.	552
401519	pfam09610	Myco_arth_vir_N	Mycoplasma virulence signal region (Myco_arth_vir_N). This entry represents the N-terminal region of a family of large, virulence-associated proteins in Mycoplasma arthritidis and smaller proteins in Mycoplasma capricolum. It includes a probable signal sequence or signal anchor, which, in most instances, has four consecutive Lys residues before the hydrophobic stretch.	32
401520	pfam09611	Cas_Csy1	CRISPR-associated protein (Cas_Csy1). CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is a widespread family of prokaryotic direct repeats with spacers of unique sequence between consecutive repeats. This entry, typified by YPO2465 of Yersinia pestis, is a CRISPR-associated (Cas) entry strictly associated with the Ypest subtype of CRISPR/Cas locus. It is designated Csy1, for CRISPR/Cas Subtype Ypest protein 1.	376
401521	pfam09612	HtrL_YibB	Bacterial protein of unknown function (HtrL_YibB). The protein from this rare, uncharacterized protein family is designated HtrL or YibB in E. coli, where its gene is found in a region of LPS core biosynthesis genes. Homologs are found in Shigella flexneri, Campylobacter jejuni, and Caenorhabditis elegans only. The htrL gene may represent an insertion to the LPS core biosynthesis region, rather than an LPS biosynthetic protein.	267
312947	pfam09613	HrpB1_HrpK	Bacterial type III secretion protein (HrpB1_HrpK). This family of proteins is encoded by genes found within type III secretion operons in a limited range of species including Xanthomonas, Ralstonia and Burkholderia.	126
401522	pfam09614	Cas_Csy2	CRISPR-associated protein (Cas_Csy2). CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is a widespread family of prokaryotic direct repeats with spacers of unique sequence between consecutive repeats. This entry, typified by YPO2464 of Yersinia pestis, is a CRISPR-associated (Cas) entry strictly associated with the Ypest subtype of CRISPR/Cas locus. It is designated Csy2, for CRISPR/Cas Subtype Ypest protein 2.	295
401523	pfam09615	Cas_Csy3	CRISPR-associated protein (Cas_Csy3). CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is a widespread family of prokaryotic direct repeats with spacers of unique sequence between consecutive repeats. This entry, typified by YPO2463 of Yersinia pestis, is a CRISPR-associated (Cas) entry strictly associated with the Ypest subtype of CRISPR/Cas locus. It is designated Csy3, for CRISPR/Cas Subtype Ypest protein 3.	327
401524	pfam09617	Cas_GSU0053	CRISPR-associated protein GSU0053 (Cas_GSU0053). This entry is found in CRISPR-associated (cas) proteins in the genomes of Geobacter sulfurreducens PCA and Desulfotalea psychrophila LSv54 (both Desulfobacterales from the Deltaproteobacteria), Gemmata obscuriglobus (a Planctomycete), and Actinomyces naeslundii MG1 (Actinobacteria).	170
401525	pfam09618	Cas_Csy4	CRISPR-associated protein (Cas_Csy4). CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is a widespread family of prokaryotic direct repeats with spacers of unique sequence between consecutive repeats. This protein family, typified by YPO2462 of Yersinia pestis, is a CRISPR-associated (Cas) family strictly associated with the Ypest subtype of CRISPR/Cas locus. It is designated Csy4, for CRISPR/Cas Subtype Ypest protein 4.	181
401526	pfam09619	YscW	Type III secretion system lipoprotein chaperone (YscW). This entry is encoded within type III secretion operons. The protein has been characterized as a chaperone for the outer membrane pore component YscC. YscW is a lipoprotein which is itself localized to the outer membrane and, it is believed, facilitates the oligomerization and localization of YscC.	105
401527	pfam09620	Cas_csx3	CRISPR-associated protein (Cas_csx3). This entry is encoded in CRISPR-associated (cas) gene clusters, near CRISPR repeats, in the genomes of several different thermophiles: Archaeoglobus fulgidus (archaeal), Aquifex aeolicus (Aquificae), Dictyoglomus thermophilum (Dictyoglomi), and a thermophilic Synechococcus (Cyanobacteria). It is not yet assigned to a specific CRISPR/cas subtype (hence the x designation csx3).	79
286674	pfam09621	LcrR	Type III secretion system regulator (LcrR). This family of proteins is encoded within type III secretion operons and has been characterized in Yersinia as a regulator of the Low-Calcium Response (LCR).	138
401528	pfam09622	DUF2391	Putative integral membrane protein (DUF2391). This entry is found in Nostoc sp. PCC 7120, Agrobacterium tumefaciens, Rhizobium meliloti, and Gloeobacter violaceus in a conserved two-gene neighborhood. Proteins containing this entry appear to span the membrane seven times.	269
370585	pfam09623	Cas_NE0113	CRISPR-associated protein NE0113 (Cas_NE0113). Members of this minor CRISPR-associated (Cas) protein family are encoded in cas gene clusters in Vibrio vulnificus YJ016, Nitrosomonas europaea ATCC 19718, Mannheimia succiniciproducens MBEL55E, and Verrucomicrobium spinosum.	210
401529	pfam09624	DUF2393	Protein of unknown function (DUF2393). The function of this protein is unknown. It is always found as part of a two-gene operon with IPR013416, a protein that appears to span the membrane seven times. It has so far been found in the bacteria Nostoc sp. PCC 7120, Agrobacterium tumefaciens, Rhizobium meliloti, and Gloeobacter violaceus.	142
312957	pfam09625	VP9	VP9 protein. VP9 is a protein containing a ferredoxin fold. Two dimers come together to form one asymmetric unit which possesses a DNA recognition fold and specific metal binding sites possibly for zinc. It is postulated that being a non-structural protein VP9 is involved in the transcriptional regulation of the White spot syndrome virus, WSSV, from which it comes. WSSV is the major viral pathogen in shrimp aquaculture. VP9 is found N-terminal to the pfam07056 domain.	73
401530	pfam09626	DHC	Dihaem cytochrome c. Dihaem cytochrome c (DHC) is a soluble c-type cytochrome that folds into two distinct domains, each binding a single haem group and connected by a small linker region. Despite little sequence similarity, the N-terminal domain (residues 12-75) is a class I type cytochrome c, that binds one of the haems, but the domain surrounding the other haem is structurally unique. DHC binds electrostatically to an oxygen-binding protein, sphaeroides haem protein (SHP), as a component of a conserved electron transfer pathway. DHC acts as the physiological electron donor for SHP during phototrophic growth. In certain species DHC is found upstream of pfam01292.	110
286680	pfam09627	PrgU	PrgU-like protein. This hypothetical protein of 125 residues is expressed in bacteria but is thought to be plasmid in origin. It forms a six beta-strand barrel with three accompanying alpha helices and is probably a homo-dimer in the cell. It may be involved in pheromone-inducible conjugation.	105
401531	pfam09628	YvfG	YvfG protein. YvfG is a hypothetical protein of 71 residues expressed in some bacteria. The monomer consists of two parallel alpha helices, and the protein crystallizes as a homo-dimer.	68
401532	pfam09629	YorP	YorP protein. YorP is a 71 residue protein found in bacteria. As it is also found in a bacteriophage it might be of viral origin. The structure is of an alpha helix between two of five beta strands. The function is unknown.	71
401533	pfam09630	DUF2024	Domain of unknown function (DUF2024). This protein of 86 residues is expressed in bacteria. It consists of four alpha helices and two beta strands. Its function is unknown. One UniProt entry gives the gene name as Traf5.	81
401534	pfam09631	Sen15	Sen15 protein. The Sen15 subunit of the tRNA intron-splicing endonuclease is one of the two structural subunits of this hetero-tetrameric enzyme. Residues 36-157 of this subunit possess a novel homodimeric fold. Each monomer consists of three alpha-helices and a mixed antiparallel/parallel beta-sheet. Two monomers of Sen15 fold with two monomers of Sen34, one of the two catalytic subunits, to form an alpha2-beta2 tetramer as part of the functional endonuclease assembly.	101
401535	pfam09632	Rac1	Rac1-binding domain. The Rac1-binding domain is the C-terminal portion of YpkA from Yersinia. It is an all-helical molecule consisting of two distinct subdomains connected by a linker. The N-terminal end, residues 434-615, consists of six helices organized into two three-helix bundles packed against each other. This region is involved with binding to GTPases. The C-terminal end, residues 705-732, is a novel and elongated fold consisting of four helices clustered into two pairs, and this fold carries the helix implicated in actin activation. Rac1-binding domain mimics host guanine nucleotide dissociation inhibitors (GDIs) of the Rho GTPases, thereby inhibiting nucleotide exchange in Rac1 and causing cytoskeletal disruption in the host. It is usually found downstream of pfam00069.	293
401536	pfam09633	DUF2023	Protein of unknown function (DUF2023). This protein of approx. 120 residues consists of three beta strands and five alpha helices, thought to fold into a homo-dimer. It is expressed in bacteria.	99
312962	pfam09634	DUF2025	Protein of unknown function (DUF2025). This protein is produced from gene PA1123 in Pseudomonas. It contains three alpha helices and six beta strands and is thought to be monomeric. It appears to be present in the biofilm layer and may be a lipoprotein.	105
401537	pfam09635	MetRS-N	MetRS-N binding domain. The MetRS-N domain binds an Arc1-P domain in a tetrameric complex resembling a classical GST homo-dimer. Domain-swapping between symmetrically related MetRS-N and Arc1p-N domains generates a 2:2 tetramer held together by van der Waals forces. This domain is necessary for formation of the aminoacyl-tRNA synthetase complex necessary for tRNA nuclear export and shuttling as part of the translational apparatus. The domain is associated with pfam09334.	121
401538	pfam09636	XkdW	XkdW protein. This protein of approx. 100 residues contains two alpha helices and two beta strands and is probably monomeric. It is expressed in bacteria but is probably viral in origin. Its function is unknown.	106
401539	pfam09637	Med18	Med18 protein. Med18 is one subunit of Mediator, a head-module multiprotein complex, that stimulates basal RNA polymerase II (Pol II) transcription. Med18 consists of an eight-stranded beta-barrel with a central pore and three flanking helices. It complexes with Med8 and Med20 proteins by forming a heterodimer of two-fold symmetry with Med20 and binding the C-terminal alpha-helix region of Med8 across the top of its barrel. This complex creates a multipartite TBP-binding site that can be modulated by transcriptional activators.	229
286688	pfam09638	Ph1570	Ph1570 protein. This is a hypothetical protein from Pyrococcus horikoshii of unknown function. It contains six alpha helices and eight beta strands and is thought to be monomeric.	156
401540	pfam09639	YjcQ	YjcQ protein. YjcQ is a protein of approx. 100 residues containing four alpha helices and three beta strands. It is expressed in bacteria and also in viruses. It appears to be under the regulation of SigD RNA polymerase which is responsible for the expression of many genes encoding cell-surface proteins related to flagellar assembly, motility, chemotaxis and autolysis in the late exponential growth phase. The exact function of YjcQ is unknown. However, it is thought to be a prophage head protein in viruses.	94
401541	pfam09640	DUF2027	Domain of unknown function (DUF2027). This protein domain is of unknown function, though putatively involved in DNA mismatch repair. It is associated with pfam01713.	154
378228	pfam09641	DUF2026	Protein of unknown function (DUF2026). This protein of approx. 100 residues is found in bacteria. It contains up to five alpha helices and up to seven beta strands and is probably monomeric. Its function is unknown. It is cited as a major prophage head protein, so might generally be of viral origin.	205
401542	pfam09642	YonK	YonK protein. YonK protein is expressed by the bacterial prophage SPbetaC. It is a 63 residue protein that associates into a homo-octamer in the form of a beta-stranded barrel with four outer helical features at points of the compass. Its function is unknown.	58
401543	pfam09643	YopX	YopX protein. YopX is a protein that is largely helical, with three identical chains probably complexing into a twelve-chain structure.	122
401544	pfam09644	Mg296	Mg296 protein. This protein of 129 residues is expressed in bacteria. It consists of three identical chains of five alpha helices. Two copies of each chain associate into a complex of six units of possible biological significance but of unknown function.	126
401545	pfam09645	F-112	F-112 protein. F-112 protein is of 70-110 residues and is found in viruses. Its winged-helix structure suggests a DNA-binding function.	71
401546	pfam09646	Gp37	Gp37 protein. This protein of 154 residues consists of a unit of helices and beta sheets that crystallizes into a beautiful asymmetrical dodecameric barrel-structure, of two six-membered rings one on top of the other. It is expressed in bacteria but is of viral origin as it is found in phage BcepMu and is probably a pathogenesis factor.	134
401547	pfam09648	YycI	YycH protein. This domain is exclusively found in YycI proteins in the low GC content Gram positive species. These two domains share the same structural fold with domains two and three of YycH pfam07435. Both YycH and YycI are always found as a pair on the chromosome, downstream of the essential histidine kinase YycG. Additionally, both proteins share a function in regulating the YycG kinase with which they appear to form a ternary complex. Lastly, the two proteins always contain an N-terminal transmembrane helix and are localized to the periplasmic space as shown by PhoA fusion studies.	229
401548	pfam09649	CHZ	Histone chaperone domain CHZ. This domain is highly conserved from yeasts to humans and is part of the chaperone protein HIRIP3 in vertebrates which interacts with the H3.3 chaperone HIRA, implicated in histone replacement during transcription. N- and C- termini of Chz family members are relatively divergent but do contain similar acidic stretches rich in Glu/Asp residues, characteristic of all histone chaperones.	34
401549	pfam09650	PHA_gran_rgn	Putative polyhydroxyalkanoic acid system protein (PHA_gran_rgn). Proteins in this entry are encoded by genes involved in either polyhydroxyalkanoic acid (PHA) biosynthesis or utilisation, including proteins found at the surface of PHA granules. These proteins have so far been found in the Pseudomonadales, Xanthomonadales, and Vibrionales, all of which belong to the Gammaproteobacteria.	79
401550	pfam09651	Cas_APE2256	CRISPR-associated protein (Cas_APE2256). This entry represents a conserved region of about 150 amino acids found in at least five archaeal and three bacterial species. These species all contain CRISPRs (Clustered Regularly Interspaced Short Palindromic Repeats). In six of eight species, the protein is encoded in the vicinity of a CRISPR/Cas locus.	136
401551	pfam09652	Cas_VVA1548	Putative CRISPR-associated protein (Cas_VVA1548). This entry represents a conserved region of about 95 amino acids found exclusively in species with CRISPRs (Clustered Regularly Interspaced Short Palindromic Repeats). In all bacterial species that contain this entry, the genes encoding the proteins are in the midst of a cluster of cas (CRISPR-associated) genes.	91
401552	pfam09654	DUF2396	Protein of unknown function (DUF2396). These conserved hypothetical proteins have so far been found only in the Cyanobacteria. They are about 170 amino acids long and contain a CxxCx(14)CxxH motif near the N-terminus.	159
401553	pfam09655	Nitr_red_assoc	Conserved nitrate reductase-associated protein (Nitr_red_assoc). Proteins in this entry are found in the Cyanobacteria, and are mostly encoded near nitrate reductase and molybdopterin biosynthesis genes. Molybdopterin guanine dinucleotide is a cofactor for nitrate reductase. These proteins are sometimes annotated as nitrate reductase-associated proteins, though their function is unknown.	144
370603	pfam09656	PGPGW	Putative transmembrane protein (PGPGW). Proteins in this entry are putative Actinobacterial proteins of about 150 amino acids in length, with three predicted transmembrane helices and an unusual motif with consensus sequence PGPGW.	53
286705	pfam09657	Cas_Csx8	CRISPR-associated protein Csx8 (Cas_Csx8). Clusters of short DNA repeats with nonhomologous spacers, which are found at regular intervals in the genomes of phylogenetically distinct prokaryotic species, comprise a family with recognisable features. This family is known as CRISPR (short for Clustered, Regularly Interspaced Short Palindromic Repeats). A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This entry describes proteins of unknown function which are encoded in the midst of a cas gene operon.	493
370604	pfam09658	Cas_Csx9	CRISPR-associated protein (Cas_Csx9). Clusters of short DNA repeats with nonhomologous spacers, which are found at regular intervals in the genomes of phylogenetically distinct prokaryotic species, comprise a family with recognisable features. This family is known as CRISPR (short for Clustered, Regularly Interspaced Short Palindromic Repeats). A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This entry describes archaeal proteins encoded in cas gene regions.	377
401554	pfam09659	Cas_Csm6	CRISPR-associated protein (Cas_Csm6). Clusters of short DNA repeats with nonhomologous spacers, which are found at regular intervals in the genomes of phylogenetically distinct prokaryotic species, comprise a family with recognisable features. This family is known as CRISPR (short for Clustered, Regularly Interspaced Short Palindromic Repeats). A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins.	370
401555	pfam09660	DUF2397	Protein of unknown function (DUF2397). Proteins in this entry are encoded within a conserved four-gene neighborhood found sporadically in a phylogenetically broad range of bacteria including: Nocardia farcinica, Symbiobacterium thermophilum, and Streptomyces avermitilis (Actinobacteria), Geobacillus kaustophilus (Firmicutes), Azoarcus sp. EbN1 and Ralstonia solanacearum (Betaproteobacteria).	483
401556	pfam09661	DUF2398	Protein of unknown function (DUF2398). Proteins in this entry are encoded within a conserved four-gene neighborhood found sporadically in a phylogenetically broad range of bacteria including: Nocardia farcinica, Symbiobacterium thermophilum, and Streptomyces avermitilis (Actinobacteria), Geobacillus kaustophilus (Firmicutes), Azoarcus sp. EbN1 and Ralstonia solanacearum (Betaproteobacteria).	360
286710	pfam09662	Phenyl_P_gamma	Phenylphosphate carboxylase gamma subunit (Phenyl_P_gamma). Members of this protein family are the gamma subunit of phenylphosphate carboxylase. Phenol (hydroxybenzene) is converted to phenylphosphate, then para-carboxylated by this four-subunit enzyme, with the release of phosphate, to 4-hydroxybenzoate. The enzyme contains neither biotin nor thiamin pyrophosphate. The gamma subunit has no known homologs.	83
401557	pfam09663	Amido_AtzD_TrzD	Amidohydrolase ring-opening protein (Amido_AtzD_TrzD). Members of this family are ring-opening amidohydrolases, including cyanuric acid amidohydrolase (EC:3.5.2.15) (AtzD and TrzD) and barbiturase. Note that barbiturase does not act as defined for EC:3.5.2.1 (barbiturate + water = malonate + urea) but rather catalyzes the ring opening of barbituric acid to ureidomalonic acid.	361
401558	pfam09664	DUF2399	Protein of unknown function C-terminus (DUF2399). Proteins in this entry are encoded within a conserved four-gene neighborhood found sporadically in a phylogenetically broad range of bacteria including: Nocardia farcinica, Symbiobacterium thermophilum, and Streptomyces avermitilis (Actinobacteria), Geobacillus kaustophilus (Firmicutes), Azoarcus sp. EbN1 and Ralstonia solanacearum (Beta-proteobacteria). Just the C-terminal region is included here.	152
401559	pfam09665	RE_Alw26IDE	Type II restriction endonuclease (RE_Alw26IDE). Members of this entry are type II restriction endonucleases of the Alw26I/Eco31I/Esp3I family. Characterized specificities of the three members are GGTCTC, CGTCTC and the shared subsequence GTCTC.	443
401560	pfam09666	Sororin	Sororin protein. Sororin is an essential, cell cycle-dependent mediator of sister chromatid cohesion. The protein is nuclear in interphase cells, dispersed from the chromatin in mitosis, and interacts with the cohesin complex.	155
312980	pfam09667	DUF2028	Domain of unknown function (DUF2028). This region of similarity is found in the vertebrate homologs of the drosophila Bobby Sox.	198
312981	pfam09668	Asp_protease	Aspartyl protease. This family of eukaryotic aspartyl proteases have a fold similar to retroviral proteases which implies they function proteolytically during regulated protein turnover.	124
401561	pfam09669	Phage_pRha	Phage regulatory protein Rha (Phage_pRha). Members of this protein family are found in temperate phage and bacterial prophage regions. Members include the product of the rha gene of the lambdoid phage phi-80, a late operon gene. The presence of this gene interferes with infection of bacterial strains that lack integration host factor (IHF), which regulates the rha gene. It is suggested that Rha is a phage regulatory protein.	91
401562	pfam09670	Cas_Cas02710	CRISPR-associated protein (Cas_Cas02710). Members of this family are found, exclusively in the vicinity of CRISPR repeats and other CRISPR-associated (cas) genes, in Methanothermobacter thermautotrophicus (Methanobacterium thermoformicicum), Thermus thermophilus (Deinococcus-Thermus), Chloroflexus aurantiacus (Chloroflexi), and Thermomicrobium roseum (Thermomicrobia).	377
401563	pfam09671	Spore_GerQ	Spore coat protein (Spore_GerQ). Members of this protein family are the spore coat protein GerQ of endospore-forming Firmicutes (low GC Gram-positive bacteria). This protein is cross-linked by a spore coat-associated transglutaminase.	76
401564	pfam09673	TrbC_Ftype	Type-F conjugative transfer system pilin assembly protein. This entry represents TrbC, a protein that is an essential component of the F-type conjugative pilus assembly system for the transfer of plasmid DNA. The N-terminal portion of these proteins is heterogeneous.	111
401565	pfam09674	DUF2400	Protein of unknown function (DUF2400). Members of this uncharacterized protein family are found sporadically, so far only among spirochetes, epsilon and delta proteobacteria, and Bacteroides. The function is unknown and its gene neighborhoods show little conservation.	228
286721	pfam09675	Chlamy_scaf	Chlamydia-phage Chp2 scaffold (Chlamy_scaf). Members of this entry are encoded by genes in chlamydia-phage such as Chp2. These viruses have around eight genes and obligately infect intracellular bacterial pathogens of the genus Chlamydia. This protein is annotated as VP3 or structural protein (as if a protein of mature viral particles), however, it is displaced from procapsids as DNA is packaged, and therefore is more correctly described as a scaffolding protein.	109
401566	pfam09676	TraV	Type IV conjugative transfer system lipoprotein (TraV). This entry includes TraV, which is a component of conjugative type IV secretion system. TraV is an outer membrane lipoprotein that is believed to interact with the secretin TraK. The alignment contains three conserved cysteines in the N-terminal half.	127
401567	pfam09677	TrbI_Ftype	Type-F conjugative transfer system protein (TrbI_Ftype). This entry represents TrbI, an essential component of the F-type conjugative transfer system for plasmid DNA transfer that has been shown to be localized to the periplasm.	106
401568	pfam09678	Caa3_CtaG	Cytochrome c oxidase caa3 assembly factor (Caa3_CtaG). Members of this family are the CtaG protein required for assembly of active cytochrome c oxidase of the caa3 type, as found in Bacillus subtilis.	238
370613	pfam09679	TraQ	Type-F conjugative transfer system pilin chaperone (TraQ). This entry represents TraQ, a protein that makes a specific interaction with pilin (TraA) to aid its transfer through the inner membrane during the process of F-type conjugative pilus assembly.	85
401569	pfam09680	Tiny_TM_bacill	Protein of unknown function (Tiny_TM_bacill). This entry represents a family of hypothetical proteins, half of which are 40 residues or less in length. Members are found only in spore-forming species. A Gly-rich variable region is followed by a strongly conserved, highly hydrophobic region, predicted to form a transmembrane helix, ending with an invariant Gly. The consensus for this stretch is FALLVVFILLIIV.	24
401570	pfam09681	Phage_rep_org_N	N-terminal phage replisome organizer (Phage_rep_org_N). This entry represents the N-terminal domain of a small family of phage proteins. The protein contains a region of low-complexity sequence that reflects DNA direct repeats able to function as an origin of phage replication. The region is N-terminal to the low-complexity region.	121
401571	pfam09682	Phage_holin_6_1	Bacteriophage holin of superfamily 6 (Holin_LLH). Phage_holin_6_1 or Holin_LLH identifies a family of phage holins from a number of phage and prophage regions of Gram-positive bacteria. Like other holins, it is large for holins (about 100-160 amino acids) with stretches of hydrophobic sequence and is encoded adjacent to lytic enzymes. The Holin LLH family is found in phage of Firmicutes and has an N-terminal transmembrane segment.	100
401572	pfam09683	Lactococcin_972	Bacteriocin (Lactococcin_972). These sequences represent bacteriocins related to lactococcin. Members tend to be found in association with a seven transmembrane putative immunity protein.	63
401573	pfam09684	Tail_P2_I	Phage tail protein (Tail_P2_I). These sequences represent the family of phage P2 protein I and related tail proteins from a number of temperate phage of Gram-negative bacteria.	132
401574	pfam09685	DUF4870	Domain of unknown function (DUF4870). 	107
401575	pfam09686	Plasmid_RAQPRD	Plasmid protein of unknown function (Plasmid_RAQPRD). This entry identifies a family of proteins, which are about 100 amino acids in length, including a predicted signal sequence and a perfectly conserved motif RAQPRD towards the C-terminus. Members are found in the Pseudomonas putida TOL plasmid pWW0 and in cryptic plasmid regions of Salmonella enterica subsp. enterica serovar Typhi and Pseudomonas syringae DC3000. The function of these proteins is unknown.	75
401576	pfam09687	PRESAN	Plasmodium RESA N-terminal. The short, four-helical domain first identified in the Plasmodium export proteins PHISTa and PHISTc has been extended to become this six-helical PRESAN domain identified in the P. falciparum-specific RESA-type (Ring-infected erythrocyte surface antigen) proteins in association with the DnaJ domain. Overall, at least 67 proteins have been detected in P. falciparum with complete copies of the PRESAN domain. No versions of this domain were detected in other apicomplexan genera, suggesting that the domain was 'invented' after the divergence of the lineage leading to the genus Plasmodium undergoing a dramatic proliferation only in P. falciparum. A secondary structure-prediction derived from the multiple alignment of the PRESAN family reveals that it is composed of an all-helical fold with six conserved helical segments. There is some evidence it might localize to membranes.	125
401577	pfam09688	Wx5_PLAF3D7	Protein of unknown function (Wx5_PLAF3D7). This set of protein sequences represent a family of at least four proteins in Plasmodium falciparum (isolate 3D7). An interesting feature is five perfectly conserved Trp residues.	144
401578	pfam09689	PY_rept_46	Plasmodium yoelii repeat (PY_rept_46). This repeat is found in the products of only 2 genes in Plasmodium yoelii, in each of these proteins it is repeated 9 times. It is found in no other organism.	51
401579	pfam09690	PYST-C1	Plasmodium yoelii subtelomeric region (PYST-C1). This group of sequences are defined by the N-terminal domain of a paralogous family of Plasmodium yoelii genes preferentially located in the subtelomeric regions of the chromosomes. There are no obvious homologs to these genes in any other organism. The C-terminal portions of the genes that contain this domain are divergent and some contain other yoelii-specific paralogous domains such as PYST-C2 (IPR006491).	55
401580	pfam09691	T2SS_PulS_OutS	Type II secretion system pilotin lipoprotein (PulS_OutS). This family comprises lipoproteins from four gamma proteobacterial species: PulS protein of Klebsiella pneumoniae (P20440), the OutS protein of Erwinia chrysanthemi (Q01567) and Pectobacterium chrysanthemi, and the functionally uncharacterized E. coli protein EtpO. PulS and OutS have been shown to interact with and facilitate insertion of secretins into the outer membrane, suggesting a chaperone-like, or piloting function for members of this family. In the pilotin from this four-helix protein from enterohemorrhagic Escherichia coli, the straight helix alpha2, the curved helix alpha3 and the bent helix alpha4 surround the central N-terminal helix alpha1. These helices create a prominent groove, mainly formed by side chains of helices 1,2 and 3 suggesting this groove is important as a binding site.	96
401581	pfam09692	Arb1	Argonaute siRNA chaperone (ARC) complex subunit Arb1. Arb1 is required for histone H3 Lys9 (H3-K9) methylation, heterochromatin assembly and siRNA generation in fission yeast.	401
401582	pfam09693	Phage_XkdX	Phage uncharacterized protein (Phage_XkdX). This entry identifies a family of small (about 50 amino acid) phage proteins, found in at least 12 different phage and prophage regions of Gram-positive bacteria. In a number of these phage, the gene for this protein is found near the holin and endolysin genes.	40
286740	pfam09694	Gcw_chp	Bacterial protein of unknown function (Gcw_chp). This entry represents a conserved hypothetical protein about 240 residues in length found so far in Proteobacteria including Shewanella oneidensis and Ralstonia solanacearum, usually as part of a paralogous family. The function is unknown.	228
401583	pfam09695	YtfJ_HI0045	Bacterial protein of unknown function (YtfJ_HI0045). These are sequences from gamma proteobacteria that are related to the E. coli protein, YtfJ.	158
401584	pfam09696	Ctf8	Ctf8. Ctf8 (chromosome transmission fidelity 8) is a component of the Ctf18 RFC-like complex which is a DNA clamp loader involved in sister chromatid cohesion.	127
401585	pfam09697	Porph_ging	Protein of unknown function (Porph_ging). This family of proteins of unknown function is found in Porphyromonas gingivalis (Bacteroides gingivalis).	81
286744	pfam09698	GSu_C4xC__C2xCH	Geobacter CxxxxCH...CXXCH motif (GSu_C4xC__C2xCH). This motif occurs from three to eight times in eight different proteins of Geobacter sulfurreducens. The final CXXCH motif matches the cytochrome c family haem-binding site signature, suggesting that the sequence may be involved in haem-binding.	36
401586	pfam09699	Paired_CXXCH_1	Doubled CXXCH motif (Paired_CXXCH_1). This entry represents a domain of about 41 amino acids that contains, among other motifs, two copies of the motif CXXCH associated with haem binding. This domain is predicted to be a high molecular weight c-type cytochrome and is often found in multiple copies. Members are found mostly in species of Shewanella, Geobacter, and Vibrio.	41
401587	pfam09700	Cas_Cmr3	CRISPR-associated protein (Cas_Cmr3). CRISPR is a term for Clustered Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR associated) proteins. This highly divergent family, found in at least ten different archaeal and bacterial species, is represented by TM1793 from Thermotoga maritima.	372
401588	pfam09701	Cas_Cmr5	CRISPR-associated protein (Cas_Cmr5). CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This family, represented by TM1791.1 of Thermotoga maritima, is found in both archaeal and bacterial species.	120
401589	pfam09702	Cas_Csa5	CRISPR-associated protein (Cas_Csa5). CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This entry represents a minor family of Cas proteins found in various species of Sulfolobus and Pyrococcus (all archaeal). It is found with two different CRISPR loci in Sulfolobus solfataricus.	104
286749	pfam09703	Cas_Csa4	CRISPR-associated protein (Cas_Csa4). CRISPR loci appear to be mobile elements with a wide host range. This entry represents a protein that tends to be found near CRISPR repeats. The species range for this protein, so far, is exclusively archaeal. It is found so far in only four different species, and includes two tandem genes in Pyrococcus furiosus DSM 3638. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins.	355
401590	pfam09704	Cas_Cas5d	CRISPR-associated protein (Cas_Cas5). CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This small Cas family is represented by CT1134 of Chlorobium tepidum.	215
401591	pfam09706	Cas_CXXC_CXXC	CRISPR-associated protein (Cas_CXXC_CXXC). CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This entry describes a conserved region of about 65 amino acids from an otherwise highly divergent protein found in a minority of CRISPR-associated protein regions. This region features two motifs of CXXC.	69
401592	pfam09707	Cas_Cas2CT1978	CRISPR-associated protein (Cas_Cas2CT1978). This entry represents a minor branch of the Cas2 family of CRISPR-associated protein which are found in IPR003799. Cas proteins are found adjacent to a characteristic short, palindromic repeat cluster termed CRISPR, a probable mobile DNA element.	86
401593	pfam09709	Cas_Csd1	CRISPR-associated protein (Cas_Csd1). CRISPR loci appear to be mobile elements with a wide host range. This entry represents proteins that tend to be found near CRISPR repeats. The species range, so far, is exclusively bacterial and mesophilic, although CRISPR loci are particularly common among the archaea and thermophilic bacteria. Clusters of short DNA repeats with nonhomologous spacers, which are found at regular intervals in the genomes of phylogenetically distinct prokaryotic species, comprise a family with recognisable features. This family is known as CRISPR (short for Clustered, Regularly Interspaced Short Palindromic Repeats). A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins.	561
286754	pfam09710	Trep_dent_lipo	Treponema clustered lipoprotein (Trep_dent_lipo). This entry represents a family of six predicted lipoproteins from a region of about 20 tandemly arranged genes in the Treponema denticola genome. Two other neighboring genes share the lipoprotein signal peptide region but do not show more extensive homology. The function of this locus is unknown.	397
401594	pfam09711	Cas_Csn2	CRISPR-associated protein (Cas_Csn2). CRISPR loci appear to be mobile elements with a wide host range. This entry represents proteins found only in CRISPR-containing species, near other CRISPR-associated proteins (cas). The species range so far for these proteins is pathogenic bacteria only. Clusters of short DNA repeats with nonhomologous spacers, which are found at regular intervals in the genomes of phylogenetically distinct prokaryotic species, comprise a family with recognisable features. This family is known as CRISPR (short for Clustered, Regularly Interspaced Short Palindromic Repeats).	176
370627	pfam09712	PHA_synth_III_E	Poly(R)-hydroxyalkanoic acid synthase subunit (PHA_synth_III_E). This entry represents the PhaE subunit of the heterodimeric class (class III) of polymerase for poly(R)-hydroxyalkanoic acids (PHAs), carbon and energy storage polymers of many bacteria. The most common PHA is polyhydroxybutyrate but about 150 different constituent hydroxyalkanoic acids (HAs) have been identified in various species.	314
401595	pfam09713	A_thal_3526	Plant protein 1589 of unknown function (A_thal_3526). This plant-specific family of proteins is defined by an uncharacterized region 57 residues in length. It is found toward the N-terminus of most proteins that contain it. Examples include at least several proteins from Arabidopsis thaliana and Oryza sativa. The function of the proteins is unknown.	51
401596	pfam09715	Plasmod_dom_1	Plasmodium protein of unknown function (Plasmod_dom_1). These sequences represent an uncharacterized family consisting of a small number of hypothetical proteins of the malaria parasite Plasmodium falciparum (isolate 3D7).	66
401597	pfam09716	ETRAMP	Malarial early transcribed membrane protein (ETRAMP). These sequences represent a family of proteins from the malaria parasite Plasmodium falciparum, several of which have been shown to be expressed specifically in the ring stage as well as the rodent parasite Plasmodium yoelii. A homolog from Plasmodium chabaudi was localized to the parasitophorous vacuole membrane. Members have an initial hydrophobic, Phe/Tyr-rich, stretch long enough to span the membrane, a highly charged region rich in Lys, a second putative transmembrane region and a second highly charged, low complexity sequence region. Some members have up to 100 residues of additional C-terminal sequence. These genes have been shown to be found in the sub-telomeric regions of both Plasmodium falciparum and P. yoelii chromosomes.	81
401598	pfam09717	CPW_WPC	Plasmodium falciparum domain of unknown function (CPW_WPC). This group of sequences is defined by a domain of about 61 residues in length with six well-conserved cysteine residues and six well-conserved aromatic sites. The domain can be found in tandem repeats, and is known so far only in Plasmodium falciparum. It is named for motifs of CPxxW and (less well conserved) WPC. Its function is unknown.	59
401599	pfam09718	Tape_meas_lam_C	Lambda phage tail tape-measure protein (Tape_meas_lam_C). This represents a relatively well-conserved region near the C-terminus of the tape measure protein of a lambda and related phage. The protein, which controls phage tail length, is typically about 1000 residues in length. Both low-complexity sequence and insertion/deletion events appear common in this family. Mutational studies suggest a ruler or template role in the determination of phage tail length. Similar behaviour is attributed to proteins from distantly related or unrelated families in other phage.	76
401600	pfam09719	C_GCAxxG_C_C	Putative redox-active protein (C_GCAxxG_C_C). This entry represents a putative redox-active protein of about 140 residues, with four perfectly conserved Cys residues. It includes a CGAXXG motif. Most members are found within one or two loci of transporter or oxidoreductase genes. A member from Geobacter sulfurreducens, located in a molybdenum transporter operon, has a TAT (twin-arginine translocation) signal sequence for Sec-independent transport across the plasma membrane, a hallmark of bound prosthetic groups such as FeS clusters.	115
401601	pfam09720	Unstab_antitox	Putative addiction module component. This entry defines several short bacterial proteins, typically about 75 amino acids long, which are always found as part of a pair (at least) of small genes. The other protein in the pair always belongs to a family of plasmid stabilisation proteins (IPR007712). It is likely that this protein and its partner comprise some form of addiction module - a pair of genes consisting of a stable toxin and an unstable antitoxin which mediate programmed cell death - although these gene-pairs are usually found on the bacterial main chromosome.	54
401602	pfam09721	Exosortase_EpsH	Transmembrane exosortase (Exosortase_EpsH). Members of this family are designated exosortase, analogous to sortase in cell wall sorting mediated by LPXTG domains in Gram-positive bacteria. The phylogenetic distribution of the proteins in this entry is nearly perfectly correlated with the distribution of the proteins having the PEP-CTERM anchor motif, IPR013424. Members of this entry are integral membrane proteins with eight predicted transmembrane helices in common. Some members of this family have long trailing sequences past the region described by this model. This model does not include the region of the first predicted transmembrane region. The best characterized member is EpsH of Methylobacillus sp. 12S, where it is part of a locus associated with biosynthesis of the exopolysaccharide methanolan.	249
401603	pfam09722	DUF2384	Protein of unknown function (DUF2384). Proteins in this family are found almost exclusively in the Proteobacteria, but also in Gloeobacter violaceus PCC 7421, a cyanobacterium. The function is unknown.	51
401604	pfam09723	Zn-ribbon_8	Zinc ribbon domain. This entry represents a region of about 41 amino acids found in a number of small proteins in a wide range of bacteria. The region usually begins with the initiator Met and contains two CxxC motifs separated by 17 amino acids. One protein in this entry has been noted as a putative regulatory protein, designated FmdB. Most proteins in this entry have a C-terminal region containing highly degenerate sequence.	41
401605	pfam09724	Dcc1	Sister chromatid cohesion protein Dcc1. Sister chromatid cohesion protein Dcc1 is a component of the Ctf18-RFC complex. This complex is required for the efficient establishment of chromosome cohesion during S-phase and for loading the replication clamp Pol30/PCNA, which functions in DNA replication and repair. Ctf18-RFC loads PCNA onto DNA in an ATP-dependent manner. It may also have PCNA-unloading activity.	318
401606	pfam09725	Fra10Ac1	Folate-sensitive fragile site protein Fra10Ac1. This entry represents the full-length proteins in which, in higher eukaryotes, the nested domain EDSLL lies. Fra10Ac1 is a highly conserved nuclear protein of unknown function that is highly expressed in brain.	113
401607	pfam09726	Macoilin	Macoilin family. Macoilin proteins have an N-terminal portion composed of 5 transmembrane helices, followed by a C-terminal coiled-coil region. Macoilin is a highly conserved protein present in eukaryotes. Macoilin appears to be found in the ER and to be involved in the function of neurons.	664
401608	pfam09727	CortBP2	Cortactin-binding protein-2. This entry is the first approximately 250 residues of cortactin-binding protein 2. In addition to being a positional candidate for autism this protein is expressed at highest levels in the brain in humans. The human protein has six associated ankyrin repeat domains pfam00023 towards the C-terminus which act as protein-protein interaction domains.	187
401609	pfam09728	Taxilin	Myosin-like coiled-coil protein. Taxilin contains an extraordinarily long coiled-coil domain in its C-terminal half and is ubiquitously expressed. It is a novel binding partner of several syntaxin family members and is possibly involved in Ca2+-dependent exocytosis in neuroendocrine cells. Gamma-taxilin, described as leucine zipper protein Factor Inhibiting ATF4-mediated Transcription (FIAT), localizes to the nucleus in osteoblasts and dimerizes with ATF4 to form inactive dimers, thus inhibiting ATF4-mediated transcription.	302
401610	pfam09729	Gti1_Pac2	Gti1/Pac2 family. In S. pombe the gti1 protein promotes the onset of gluconate uptake upon glucose starvation. In S. pombe the Pac2 protein controls the onset of sexual development, by inhibiting the expression of ste11, in a pathway that is independent of the cAMP cascade.	112
370639	pfam09730	BicD	Microtubule-associated protein Bicaudal-D. BicD proteins consist of three coiled-coil domains and are involved in dynein-mediated minus end-directed transport from the Golgi apparatus to the endoplasmic reticulum (ER). For full functioning they bind with GSK-3beta pfam05350 to maintain the anchoring of microtubules to the centromere. It appears that amino-acid residues 437-617 of BicD and the kinase activity of GSK-3 are necessary for the formation of a complex between BicD and GSK-3beta in intact cells.	720
370640	pfam09731	Mitofilin	Mitochondrial inner membrane protein. Mitofilin controls mitochondrial cristae morphology. Mitofilin is enriched in the narrow space between the inner boundary and the outer membranes, where it forms a homotypic interaction and assembles into a large multimeric protein complex. The first 78 amino acids contain a typical amino-terminal-cleavable mitochondrial presequence rich in positive-charged and hydroxylated residues and a membrane anchor domain. In addition, it has three centrally located coiled coil domains.	583
401611	pfam09732	CactinC_cactus	Cactus-binding C-terminus of cactin protein. CactinC_cactus is the C-terminal 200 residues of the cactin protein which are necessary for the association of cactin with IkappaB-cactus as one of the intracellular members of the Rel complex. The Rel (NF-kappaB) pathway is conserved in invertebrates and vertebrates. In mammals, it controls the activities of the immune and inflammatory response genes as well as viral genes, and is critical for cell growth and survival. In Drosophila, the Rel pathway functions in the innate cellular and humoral immune response, in muscle development, and in the establishment of dorsal-ventral polarity in the early embryo. Most members of the family also have a Cactin_mid domain pfam10312 further upstream.	125
401612	pfam09733	VEFS-Box	VEFS-Box of polycomb protein. The VEFS-Box (VRN2-EMF2-FIS2-Su(z)12) box is the C-terminal region of these proteins, characterized by an acidic cluster and a tryptophan/methionine-rich sequence, the acidic-W/M domain. Some of these sequences are associated with a zinc-finger domain about 100 residues towards the N-terminus. This protein is one of the polycomb cluster of proteins which control HOX gene transcription as it functions in heterochromatin-mediated repression.	137
401613	pfam09734	Tau95	RNA polymerase III transcription factor (TF)IIIC subunit. TFIIIC1 is a multisubunit DNA binding factor that serves as a dynamic platform for assembly of pre-initiation complexes on class III genes. This entry represents the tau 95 subunit which holds a key position in TFIIIC, exerting both upstream and downstream influence on the TFIIIC-DNA complex by rendering the complex more stable. Once bound to tDNA-intragenic promoter elements, TFIIIC directs the assembly of TFIIIB on the DNA, which in turn recruits the RNA polymerase III (pol III) and activates multiple rounds of transcription.	150
401614	pfam09735	Nckap1	Membrane-associated apoptosis protein. Expression of this protein was found to be markedly reduced in patients with Alzheimer's disease. It is involved in the regulation of actin polymerization in the brain as part of a WAVE2 signalling complex.	1114
401615	pfam09736	Bud13	Pre-mRNA-splicing factor of RES complex. This entry is characterized by proteins with alternating conserved and low-complexity regions. Bud13, together with Snu17p and a newly identified factor, Pml1p/Ylr016c, forms a novel trimeric complex called the RES (pre-mRNA retention and splicing) complex. Subunits of this complex are not essential for viability of yeasts but they are required for efficient splicing in vitro and in vivo. Furthermore, inactivation of this complex causes pre-mRNA leakage from the nucleus. Bud13 contains a unique, phylogenetically conserved C-terminal region of unknown function.	143
401616	pfam09737	Det1	De-etiolated protein 1 Det1. This is the C-terminal conserved 400 residues of Det1 proteins of approximately 550 amino acids. Det1 (de-etiolated-1) is an essential negative regulator of plant light responses, and it is a component of the Arabidopsis CDD complex containing DDB1 and COP10 ubiquitin E2 variant. Mammalian Det1 forms stable DDD-E2 complexes, consisting of DDB1, DDA1 (DET1, DDB1 Associated 1), and a member of the UBE2E group of canonical ubiquitin conjugating enzymes and modulates Cul4A function.	409
401617	pfam09738	LRRFIP	LRRFIP family. LRRFIP1 is a transcriptional repressor which preferentially binds to the GC-rich consensus sequence (5'-AGCCCCCGGCG-3') and may regulate expression of TNF, EGFR and PDGFA. LRRFIP2 may function as activator of the canonical Wnt signalling pathway, in association with DVL3, upstream of CTNNB1/beta-catenin.	305
401618	pfam09739	MCM_bind	Mini-chromosome maintenance replisome factor. This entry is of proteins of approximately 600 residues in length containing alternating regions of conservation and low complexity. The Arabidopsis protein is a replisome factor found to bind with the mini-chromosome maintenance, MCM-binding, complex and is crucial for efficient DNA replication. The family now spans the full-length proteins.	571
401619	pfam09740	DUF2043	Uncharacterized conserved protein (DUF2043). This is a 100 residue conserved region of a family of proteins found from fungi to humans. This region contains three conserved Cysteines and a motif of {CP}{y/l}{HG}.	103
401620	pfam09741	DUF2045	Uncharacterized conserved protein (DUF2045). This entry is the conserved 250 residues of proteins of approximately 450 amino acids. It contains several highly conserved motifs including a CVxLxxxD motif. The function is unknown.	225
401621	pfam09742	Dymeclin	Dyggve-Melchior-Clausen syndrome protein. Dymeclin (Dyggve-Melchior-Clausen syndrome protein) contains a large number of leucine and isoleucine residues and a total of 17 repeated dileucine motifs. It is characteristically about 700 residues long and present in plants and animals. Mutations in the gene coding for this protein in humans give rise to the disorder Dyggve-Melchior-Clausen syndrome (DMC, MIM 223800) which is an autosomal-recessive disorder characterized by the association of a spondylo-epi-metaphyseal dysplasia and mental retardation. DYM transcripts are widely expressed throughout human development and Dymeclin is not an integral membrane protein of the ER, but rather a peripheral membrane protein dynamically associated with the Golgi apparatus.	640
401622	pfam09743	E3_UFM1_ligase	E3 UFM1-protein ligase 1. The ubiquitin fold modifier 1 (Ufm1) is the most recently discovered ubiquitin-like modifier whose conjugation (ufmylation) system is conserved in multicellular organisms. Ufm1 is known to covalently attach to cellular protein(s) via a specific E1-activating enzyme (Uba5), an E2-conjugating enzyme (Ufc1), and an E3-ligating enzyme.	272
401623	pfam09744	Jnk-SapK_ap_N	JNK_SAPK-associated protein-1. This is the N-terminal 200 residues of a set of proteins conserved from yeasts to humans. Most of the proteins in this entry have a RhoGEF pfam00621 domain at their C-terminal end.	149
401624	pfam09745	DUF2040	Coiled-coil domain-containing protein 55 (DUF2040). This entry is a conserved domain of approximately 130 residues of proteins conserved from fungi to humans. The proteins do contain a coiled-coil domain, but the function is unknown.	121
401625	pfam09746	Membralin	Tumor-associated protein. Membralin is evolutionarily highly conserved, though it seems to represent a unique protein family. The protein appears to contain several transmembrane regions. In humans it is expressed in certain cancers, particularly ovarian cancers. Membralin-like gene homologs have been identified in plants including grape, cotton and tomato.	377
401626	pfam09747	DUF2052	Coiled-coil domain containing protein (DUF2052). This entry is of sequences of two conserved domains separated by a region of low complexity, spanning some 200 residues. The function is unknown.	195
401627	pfam09748	Med10	Transcription factor subunit Med10 of Mediator complex. Med10 is one of the protein subunits of the Mediator complex, tethered to Rgr1 protein. The Mediator complex is required for the transcription of most RNA polymerase II (Pol II)-transcribed genes. Med10 specifically mediates basal-level HIS4 transcription via Gcn4, and, additionally, there is a putative requirement for Med10 in Bas2-mediated transcription. Med10 is part of the middle region of Mediator.	119
401628	pfam09749	HVSL	Uncharacterized conserved protein. This entry is of proteins of approximately 300 residues conserved from plants to humans. It contains two conserved motifs, HxSL and FHVSL. The function is unknown.	239
401629	pfam09750	DRY_EERY	Alternative splicing regulator. This entry represents the conserved N-terminal region of SWAP (suppressor-of-white-apricot protein) proteins. This region contains two highly conserved motifs, viz. DRY and EERY, which appear to be the sites for alternative splicing of exons 2 and 3 of the SWAP mRNA. These proteins are thus thought to be involved in auto-regulation of pre-mRNA splicing. Most family members are associated with two Surp domains pfam01805 and an arginine-serine-rich binding region towards the C-terminus.	129
401630	pfam09751	Es2	Nuclear protein Es2. This entry is of a family of proteins of approximately 500 residues with alternating regions of low complexity and conservation where the domain similarities are strong. Apart from a predicted coiled-coil domain, no other known functional domains have been characterized. The protein appears to be expressed in the nucleus and particularly highly in the pons sub-region of the brain. The protein is clearly necessary for normal development of the nervous system.	419
370661	pfam09752	DUF2048	Abhydrolase domain containing 18. The proteins in this family are conserved from plants to vertebrates. The function is unknown.	352
401631	pfam09753	Use1	Membrane fusion protein Use1. This entry is of a family of proteins all approximately 300 residues in length. The proteins have a single C-terminal trans-membrane domain and a SNARE [soluble NSF (N-ethylmaleimide-sensitive fusion protein) attachment protein receptor] domain of approximately 60 residues. The SNARE domains are essential for membrane fusion and are conserved from yeasts to humans. Use1 is one of the three protein subunits that make up the SNARE complex and it is specifically required for Golgi-endoplasmic reticulum retrograde transport.	243
401632	pfam09754	PAC2	PAC2 family. This PAC2 (Proteasome assembly chaperone) family of proteins is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 247 and 307 amino acids in length. These proteins function as a chaperone for the 26S proteasome. The 26S proteasome mediates ubiquitin-dependent proteolysis in eukaryotic cells. A number of studies including very recent ones have revealed that assembly of its 20S catalytic core particle is an ordered process that involves several conserved proteasome assembly chaperones (PACs). Two heterodimeric chaperones, PAC1-PAC2 and PAC3-PAC4, promote the assembly of rings composed of seven alpha subunits.	167
401633	pfam09755	DUF2046	Uncharacterized conserved protein H4 (DUF2046). This is the conserved N-terminal 350 residues of a family of proteins of unknown function possibly containing a coiled-coil domain.	304
370664	pfam09756	DDRGK	DDRGK domain. This is a family of proteins of approximately 300 residues, found in plants and vertebrates. They contain a highly conserved DDRGK motif.	188
401634	pfam09757	Arb2	Arb2 domain. A second fission yeast Argonaute complex (Argonaute siRNA chaperone, ARC) that contains two previously uncharacterized proteins, Arb1 and Arb2, both of which are required for histone H3 Lys9 (H3-K9) methylation, heterochromatin assembly and siRNA generation. This family includes a region found in Arb2 and the Hda1 protein.	252
401635	pfam09758	FPL	Uncharacterized conserved protein. This entry represents an N-terminal region of approximately 150 residues of a family of proteins of unknown function. It contains a highly conserved FPL motif.	148
401636	pfam09759	Atx10homo_assoc	Spinocerebellar ataxia type 10 protein domain. This is the conserved C-terminal 100 residues of Ataxin-10. Ataxin-10 belongs to the family of armadillo repeat proteins and in solution it tends to form homotrimeric complexes, which associate via a tip-to-tip association in a horseshoe-shaped contact with the concave sides of the molecules facing each other. This domain may represent the homo-association site since that is located near the C-terminus of Ataxin-10. The protein does not contain a signal sequence for secretion or any subcellular compartment confirming its cytoplasmic localization, specifically to the olivocerebellar region.	99
401637	pfam09762	KOG2701	Coiled-coil domain-containing protein (DUF2037). This entry represents the conserved N-terminal 200 residues of a family of proteins conserved from plants to vertebrates. In Drosophila it comes from the Fidipidine gene, and is of unknown function.	173
401638	pfam09763	Sec3_C	Exocyst complex component Sec3. This entry is the conserved middle and C-terminus of the Sec3 protein. Sec3 binds to the C-terminal cytoplasmic domain of GLYT1 (glycine transporter protein 1). Sec3 is the exocyst component that is closest to the plasma membrane docking site and it serves as a spatial landmark in the plasma membrane for incoming secretory vesicles. Sec3 is recruited to the sites of polarised membrane growth through its interaction with Rho1p, a small GTP-binding protein.	696
401639	pfam09764	Nt_Gln_amidase	N-terminal glutamine amidase. This protein is conserved from plants to humans. It represents a family of N terminal glutamine amidases. The enzyme removes the NH2 group from a Gln, at the N-terminal, rendering it a Glu.	172
401640	pfam09765	WD-3	WD-repeat region. This entry is of a region of approximately 100 residues containing three WD repeats and six cysteine residues possibly as three cystine-bridges. These regions are contained within the Fancl protein in humans which is the putative E3 ubiquitin ligase subunit of the FA complex (Fanconi anaemia). Eight subunits of the Fanconi anaemia gene products form a multisubunit nuclear complex which is required for mono-ubiquitination of a downstream FA protein, FANCD2. The WD repeats are required for interaction with other subunits of the FA complex.	87
401641	pfam09766	FimP	Fms-interacting protein. This entry carries part of the crucial 144 N-terminal residues of the FmiP protein, which is essential for the binding of the protein to the cytoplasmic domain of activated Fms-molecules in M-CSF induced haematopoietic differentiation of macrophages. The C-terminus contains a putative nuclear localization sequence and a leucine zipper which suggest further, as yet unknown, nuclear functions. The level of FMIP expression might form a threshold that determines whether cells differentiate into macrophages or into granulocytes.	339
401642	pfam09767	DUF2053	Predicted membrane protein (DUF2053). This entry is of the conserved N-terminal 150 residues of proteins conserved from plants to humans. The function is unknown although some annotation suggests it to be a transmembrane protein.	157
401643	pfam09768	Peptidase_M76	Peptidase M76 family. This is a family of metalloproteases. Proteins in this family are also annotated as Ku70-binding proteins.	168
401644	pfam09769	ApoO	Apolipoprotein O. Members of this family promote cholesterol efflux from macrophage cells. They are present in various lipoprotein complexes, including HDL, LDL and VLDL. The apoprotein is secreted by a microsomal triglyceride transfer protein (MTTP)-dependent mechanism, probably as a VLDL-associated protein that is subsequently transferred to HDL.	121
401645	pfam09770	PAT1	Topoisomerase II-associated protein PAT1. Members of this family are necessary for accurate chromosome transmission during cell division.	846
401646	pfam09771	Tmemb_18A	Transmembrane protein 188. The function of this family of transmembrane proteins has not, as yet, been determined.	118
370678	pfam09772	Tmem26	Transmembrane protein 26. The function of this family of transmembrane proteins has not, as yet, been determined.	288
401647	pfam09773	Meckelin	Meckelin (Transmembrane protein 67). Members of this family are thought to be related to the ciliary basal body. Defects result in Meckel syndrome type 3, an autosomal recessive disorder characterized by a combination of renal cysts and variably associated features including developmental anomalies of the central nervous system (typically encephalocele), hepatic ductal dysplasia and cysts, and polydactyly. Joubert syndrome type 6 is also a manifestation of certain mutations; it is an autosomal recessive congenital malformation of the cerebellar vermis and brainstem with abnormalities of axonal decussation (crossing in the brain) affecting the corticospinal tract and superior cerebellar peduncles. Individuals with Joubert syndrome have motor and behavioral abnormalities, including an inability to walk due to severe clumsiness and 'mirror' movements, and cognitive and behavioural disturbances.	816
401648	pfam09774	Cid2	Caffeine-induced death protein 2. Members of this family of proteins mediate the disruption of the DNA replication checkpoint (S-M checkpoint) mechanism caused by caffeine.	156
401649	pfam09775	Keratin_assoc	Keratinocyte-associated protein 2. Members of this family comprise various keratinocyte-associated proteins. Their exact function has not, as yet, been determined.	129
401650	pfam09776	Mitoc_L55	Mitochondrial ribosomal protein L55. Members of this family are involved in mitochondrial biogenesis and G2/M phase cell cycle progression. They form a component of the mitochondrial ribosome large subunit (39S) which comprises a 16S rRNA and about 50 distinct proteins.	115
401651	pfam09777	OSTMP1	Osteopetrosis-associated transmembrane protein 1 precursor. Members of this family of proteins are required for osteoclast and melanocyte maturation and function. Mutations give rise to autosomal recessive osteopetrosis; also called autosomal recessive Albers-Schonberg disease.	229
401652	pfam09778	Guanylate_cyc_2	Guanylate cyclase. Members of this family of proteins catalyze the conversion of guanosine triphosphate (GTP) to 3',5'-cyclic guanosine monophosphate (cGMP) and pyrophosphate.	211
401653	pfam09779	Ima1_N	Ima1 N-terminal domain. This domain occurs at the N-terminus of the Schizosaccharomyces pombe inner nuclear membrane protein, Ima1. Ima1 interacts with other inner nuclear membrane proteins.	131
401654	pfam09781	NDUF_B5	NADH:ubiquinone oxidoreductase, NDUFB5/SGDH subunit. Members of this family mediate the transfer of electrons from NADH to the respiratory chain. The immediate electron acceptor for the enzyme is believed to be ubiquinone, the reaction that occurs being: NADH + ubiquinone = NAD(+) + ubiquinol.	186
370687	pfam09782	NDUF_B6	NADH:ubiquinone oxidoreductase, NDUFB6/B17 subunit. Members of this family mediate the transfer of electrons from NADH to the respiratory chain. The immediate electron acceptor for the enzyme is believed to be ubiquinone, the reaction that occurs being: NADH + ubiquinone = NAD(+) + ubiquinol.	128
401655	pfam09783	Vac_ImportDeg	Vacuolar import and degradation protein. Members of this family are involved in the negative regulation of gluconeogenesis. They are required for both proteasome-dependent and vacuolar catabolite degradation of fructose-1,6-bisphosphatase (FBPase), where they probably regulate FBPase targeting from the FBPase-containing vesicles to the vacuole.	162
401656	pfam09784	L31	Mitochondrial ribosomal protein L31. This is a family of mitochondrial ribosomal proteins. L31 is essential for mitochondrial function in yeast.	102
401657	pfam09785	Prp31_C	Prp31 C terminal domain. This is the C terminal domain of the pre-mRNA processing factor Prp31. Prp31 is required for U4/U6.U5 tri-snRNP formation. In humans this protein has been linked to autosomal dominant retinitis pigmentosa.	121
401658	pfam09786	CytochromB561_N	Cytochrome B561, N terminal. Members of this family are found in the N terminal region of cytochrome B561, as well as in various other putative uncharacterized proteins.	581
401659	pfam09787	Golgin_A5	Golgin subfamily A member 5. Members of this family of proteins are involved in maintaining Golgi structure. They stimulate the formation of Golgi stacks and ribbons, and are involved in intra-Golgi retrograde transport. Two main interactions have been characterized: one with RAB1A that has been activated by GTP-binding and another with isoform CASP of CUTL1.	306
401660	pfam09788	Tmemb_55A	Transmembrane protein 55A. Members of this family catalyze the hydrolysis of the 4-position phosphate of phosphatidylinositol 4,5-bisphosphate, in the reaction: 1-phosphatidyl-myo-inositol 4,5-bisphosphate + H(2)O = 1-phosphatidyl-1D-myo-inositol 5-phosphate + phosphate.	245
401661	pfam09789	DUF2353	Uncharacterized coiled-coil protein (DUF2353). Members of this family of uncharacterized proteins have no known function.	311
401662	pfam09790	Hyccin	Hyccin. Members of this family of proteins may have a role in the beta-catenin-Tcf/Lef signaling pathway, as well as in the process of myelination of the central and peripheral nervous system. Defects in Hyccin are the cause of hypomyelination with congenital cataracts. This disorder is characterized by congenital cataracts, progressive neurologic impairment, and diffuse myelin deficiency. Affected individuals experience progressive pyramidal and cerebellar dysfunction, muscle weakness and wasting prevailing in the lower limbs.	323
401663	pfam09791	Oxidored-like	Oxidoreductase-like protein, N-terminal. Members of this family are found in the N terminal region of various oxidoreductase like proteins. Their exact function is, as yet, unknown.	46
401664	pfam09792	But2	Ubiquitin 3 binding protein But2 C-terminal domain. This family is of proteins conserved in yeasts. It binds to Uba3 and is involved in the NEDD8 signalling pathway. This family represents a presumed C-terminal domain.	141
401665	pfam09793	AD	Anticodon-binding domain. This domain of approximately 100 residues is conserved from plants to humans. It is frequently found in association with Lsm domain-containing proteins. It is an anticodon-binding domain of a prolyl-tRNA synthetase, whose PDB structure is available under the identifier 1h4q.	90
401666	pfam09794	Avl9	Transport protein Avl9. Avl9 is a protein involved in exocytic transport from the Golgi. It has been speculated that Avl9 could play a role in deforming membranes for vesicle fission and/or in recruiting cargo.	376
401667	pfam09795	Atg31	Autophagy-related protein 31. Autophagy is an intracellular degradation system that responds to nutrient starvation. Cis1/Atg31 has been shown to be required for autophagosome formation in Saccharomyces cerevisiae. It interacts with Atg17.	154
401668	pfam09796	QCR10	Ubiquinol-cytochrome-c reductase complex subunit (QCR10). The QCR10 family of proteins are a component of the ubiquinol-cytochrome c reductase complex (also known as complex III or cytochrome b-c1 complex). This complex is located on the inner mitochondrial membrane and it couples electron transfer from ubiquinol to cytochrome. This subunit (QCR10) is required for stable association of the iron-sulfur protein with the complex.	62
401669	pfam09797	NatB_MDM20	N-acetyltransferase B complex (NatB) non catalytic subunit. This is the non-catalytic subunit of the N-terminal acetyltransferase B complex (NatB). The NatB complex catalyzes the acetylation of the amino-terminal methionine residue of all proteins beginning with Met-Asp or Met-Glu and of some proteins beginning with Met-Asn or Met-Met. In Saccharomyces cerevisiae this subunit is called MDM20 and in Schizosaccharomyces pombe it is called Arm1. NatB acetylates the Tpm1 protein and regulates tropomyosin-actin interactions. This subunit is required by the NatB complex for the N-terminal acetylation of Tpm1.	371
401670	pfam09798	LCD1	DNA damage checkpoint protein. This is a family of proteins which regulate checkpoint kinases. In Schizosaccharomyces pombe this protein is called Rad26 and in Saccharomyces cerevisiae it is called LCD1.	616
401671	pfam09799	Transmemb_17	Predicted membrane protein. This is a 100 amino acid region of a family of proteins conserved from nematodes to humans. It is predicted to be a transmembrane region but its function is not known.	104
370705	pfam09801	SYS1	Integral membrane protein S linking to the trans Golgi network. Members of this family are integral membrane proteins involved in protein trafficking between the late Golgi and endosome. They may also serve as a receptor for ADP-ribosylation factor-related protein 1 (ARFRP1). Sys1p is a small integral membrane protein with four predicted transmembrane domains that localizes to the trans-Golgi network (TGN) in yeast and human cells.	144
401672	pfam09802	Sec66	Preprotein translocase subunit Sec66. Members of this family of proteins are a component of the heterotetrameric Sec62/63 complex composed of SEC62, SEC63, SEC66 and SEC72. The Sec62/63 complex associates with the Sec61 complex to form the Sec complex. Sec66 is involved in SRP-independent post-translational translocation across the endoplasmic reticulum and functions together with the Sec61 complex and KAR2 in a channel-forming translocon complex. Furthermore, Sec66 is also required for growth at elevated temperatures.	176
401673	pfam09803	Pet100	Pet100. Pet100 is a chaperone required for the assembly of cytochrome c oxidase. The human Pet100 homolog (also known as C19orf79) has been shown to be located in the mitochondrial inner membrane and forms a ~300 kDa subcomplex with mitochondrial complex IV subunits.	63
401674	pfam09804	DUF2347	Uncharacterized conserved protein (DUF2347). Members of this family of hypothetical proteins have no known function.	283
401675	pfam09805	Nop25	Nucleolar protein 12 (25kDa). Members of this family of proteins are part of the yeast nuclear pore complex-associated pre-60S ribosomal subunit. The family functions as a highly conserved exonuclease that is required for the 5'-end maturation of 5.8S and 25S rRNAs, demonstrating that 5'-end processing also has a redundant pathway. Nop25 binds late pre-60S ribosomes, accompanying them from the nucleolus to the nuclear periphery; and there is evidence for both physical and functional links between late 60S subunit processing and export.	135
370710	pfam09806	CDK2AP	Cyclin-dependent kinase 2-associated protein. Members of this family of proteins are cell-growth suppressors, associating with and influencing the biological activities of important cell cycle regulators in the S phase including monomeric non-phosphorylated cyclin-dependent kinase 2 (CDK2) and DNA polymerase alpha/primase. An association between mutations in the gene coding for this protein and oral cancer has been described.	205
401676	pfam09807	ELP6	Elongation complex protein 6. ELP6 is a subunit of the RNA polymerase II elongator complex. The Elongator complex promotes RNA-polymerase II transcript elongation through histone acetylation in the nucleus and tRNA modification in the cytoplasm. ELP5 and ELP6 play a major role in the migration, invasion and tumorigenicity of melanoma cells, as integral subunits of Elongator.	250
401677	pfam09808	SNAPc_SNAP43	Small nuclear RNA activating complex (SNAPc), subunit SNAP43. Members of this family are part of the SNAPc complex required for the transcription of both RNA polymerase II and III small-nuclear RNA genes. They bind to the proximal sequence element (PSE), a non-TATA-box basal promoter element common to these 2 types of genes. Furthermore, they also recruit TBP and BRF2 to the U6 snRNA TATA box.	191
286847	pfam09809	MRP-L27	Mitochondrial ribosomal protein L27. Members of this family of proteins are components of the mitochondrial ribosome large subunit. They are also involved in apoptosis and cell cycle regulation.	97
401678	pfam09810	Exo5	Exonuclease V - a 5' deoxyribonuclease. Exonuclease V is a monomeric 5' deoxyribonuclease that is localized in the nucleus. It degrades single-stranded, but not double-stranded, DNA from the 5'-end, and the products are dinucleotides, except the 3'-terminal tri- and tetranucleotides, which are not degraded. The initial hydrolytic cut of exonuclease V on the dephosphorylated substrate produces a mixture of dinucleoside monophosphates and trinucleoside diphosphates. The enzyme is processive in action. Exo5 is specific for single-stranded DNA and does not hydrolyze RNA. However, Exo5 has the capacity to slide across 5' double-stranded DNA or 5' RNA sequences and resume cutting two nucleotides downstream of the double-stranded-to-single-stranded junction or RNA-to-DNA junction, respectively.	363
401679	pfam09811	Yae1_N	Essential protein Yae1, N terminal. Members of this family are found in the N terminal region of the essential protein Yae1. Their exact function has not, as yet, been determined. The family DUF1715, pfam08215 has now been merged into this family.	39
401680	pfam09812	MRP-L28	Mitochondrial ribosomal protein L28. Members of this family are components of the mitochondrial large ribosomal subunit. Mature mitochondrial ribosomes consist of a small (37S) and a large (54S) subunit. The 37S subunit contains at least 33 different proteins and 1 molecule of RNA (15S). The 54S subunit contains at least 45 different proteins and 1 molecule of RNA (21S).	157
286851	pfam09813	Coiled-coil_56	Coiled-coil domain-containing protein 56. Members of this family of proteins have no known function.	101
401681	pfam09814	HECT_2	HECT-like Ubiquitin-conjugating enzyme (E2)-binding. HECT_2 is a family of UbcH10-binding proteins.	305
401682	pfam09815	XK-related	XK-related protein. Members of this family comprise various XK-related proteins that are involved in sodium-dependent transport of neutral amino acids or oligopeptides. These proteins are responsible for the Kx blood group system; defects result in McLeod syndrome, an X-linked multi-system disorder characterized by late onset abnormalities in the neuromuscular and hematopoietic systems.	337
401683	pfam09816	EAF	RNA polymerase II transcription elongation factor. Members of this family act as transcriptional transactivators of ELL and ELL2 elongation activities. Eaf proteins form a stable heterodimer complex with ELL proteins to facilitate the binding of RNA polymerase II to activate transcription elongation. The N-terminus of approx 120 residues is globular and highly conserved.	101
401684	pfam09817	Zwilch	RZZ complex, subunit zwilch. The protein Zwilch is an essential component of the mitotic checkpoint, which prevents cells from prematurely exiting mitosis. It is required for the assembly of the dynein-dynactin, Mad2 complexes and spindly/CG15415 onto kinetochores.	573
401685	pfam09818	ABC_ATPase	Predicted ATPase of the ABC class. Members of this family include various bacterial predicted ABC class ATPases.	445
401686	pfam09819	ABC_cobalt	ABC-type cobalt transport system, permease component. Members of this family of prokaryotic proteins include various hypothetical proteins as well as ABC-type cobalt transport systems.	121
401687	pfam09820	AAA-ATPase_like	Predicted AAA-ATPase. This family contains many hypothetical bacterial proteins. This family was previously the N-terminal part of the Pfam DUF1703 (pfam08011) family before it was split into two. This region is predicted to be an AAA-ATPase domain.	277
401688	pfam09821	AAA_assoc_C	C-terminal AAA-associated domain. This had been thought to be an ATPase domain of ABC-transporter proteins. However, only one member has any trans-membrane regions. It is associated with an upstream ATP-binding cassette family, pfam00005.	117
401689	pfam09822	ABC_transp_aux	ABC-type uncharacterized transport system. This domain is found in various eukaryotic and prokaryotic intra-flagellar transport proteins involved in gliding motility, as well as in several hypothetical proteins.	264
401690	pfam09823	DUF2357	Domain of unknown function (DUF2357). This entry was previously the N terminal portion of DUF524 (pfam04411) before it was split into two. This domain has no known function. It is predicted to adopt an all beta secondary structure pattern followed by mainly alpha-helical structures.	248
401691	pfam09824	ArsR	ArsR transcriptional regulator. Members of this family of archaeal proteins are conserved transcriptional regulators belonging to the ArsR family.	159
401692	pfam09825	BPL_N	Biotin-protein ligase, N terminal. The function of this structural domain is unknown. It is found N-terminal to the biotin protein ligase catalytic domain.	373
401693	pfam09826	Beta_propel	Beta propeller domain. Members of this family comprise secreted bacterial proteins containing a C-terminal beta-propeller domain distantly related to WD-40 repeats. Jpred secondary-structure prediction shows the family to be a series of 4 short beta-strands, characteristic of beta-propeller families.	505
401694	pfam09827	CRISPR_Cas2	CRISPR associated protein Cas2. Members of this family of bacterial proteins comprise various hypothetical proteins, as well as CRISPR (clustered regularly interspaced short palindromic repeats) associated proteins, conferring resistance to infection by certain bacteriophages.	72
401695	pfam09828	Chrome_Resist	Chromate resistance exported protein. Members of this family of bacterial proteins are involved in the reduction of chromate accumulation and are essential for chromate resistance.	131
401696	pfam09829	DUF2057	Uncharacterized protein conserved in bacteria (DUF2057). This domain, found in various prokaryotic proteins, has no known function.	191
401697	pfam09830	ATP_transf	ATP adenylyltransferase. Members of this family of proteins catabolize Ap4N nucleotides (where N is A,C,G or U). Additionally they catalyze the conversion of adenosine-5-phosphosulfate (AMPs) plus Pi to ADP plus sulphate, the exchange of NDP and phosphate and the synthesis of Ap4A from AMPs plus ATP.	65
401698	pfam09831	DUF2058	Uncharacterized protein conserved in bacteria (DUF2058). This domain, found in various prokaryotic proteins, has no known function.	173
401699	pfam09832	DUF2059	Uncharacterized protein conserved in bacteria (DUF2059). This domain, found in various prokaryotic proteins, has no known function.	59
401700	pfam09834	DUF2061	Predicted membrane protein (DUF2061). This domain, found in various prokaryotic proteins, has no known function.	51
401701	pfam09835	DUF2062	Uncharacterized protein conserved in bacteria (DUF2062). This domain, found in various prokaryotic proteins, has no known function.	139
401702	pfam09836	DUF2063	Putative DNA-binding domain. This family represents the N-terminal part of a Neisseria protein, UniProtKB:Q5F5I0, Structure 3dee. It runs from residues 31-117 as a helical bundle with 4 main helices. From genomic context and the fold of the C-terminal part, it is suggested that this protein is involved in transcriptional regulation.	87
401703	pfam09837	DUF2064	Uncharacterized protein conserved in bacteria (DUF2064). This family has structural similarity to proteins in the nucleotide-diphospho-sugar transferases superfamily. The similarity suggests that it is an enzyme with a sugar substrate.	120
401704	pfam09838	DUF2065	Uncharacterized protein conserved in bacteria (DUF2065). This domain, found in various prokaryotic proteins, has no known function.	56
401705	pfam09839	DUF2066	Uncharacterized protein conserved in bacteria (DUF2066). This domain, found in various prokaryotic proteins, has no known function.	230
378268	pfam09840	DUF2067	Uncharacterized protein conserved in archaea (DUF2067). This domain, found in various archaeal proteins, has no known function.	186
401706	pfam09842	DUF2069	Predicted membrane protein (DUF2069). This domain, found in various prokaryotes, has no known function.	104
401707	pfam09843	DUF2070	Predicted membrane protein (DUF2070). This is a family of Archaeal 7-TM proteins. There are 6 closely assembled TM-regions at the N-terminus followed by a long intracellular, from residues 220-590, highly conserved region, of unknown function, terminating with one more TM-region. The short 25 residue section between TMs 5 and 6 might lie on the outer surface of the membrane and be acting as a receptor (from TMHMM).	561
401708	pfam09844	DUF2071	Uncharacterized conserved protein (COG2071). This conserved protein (similar to YgjF), found in various prokaryotes, has no known function.	211
401709	pfam09845	DUF2072	Zn-ribbon containing protein. This archaeal protein has no known function.	134
401710	pfam09846	DUF2073	Uncharacterized protein conserved in archaea (DUF2073). This archaeal protein has no known function.	105
286883	pfam09847	12TM_1	Membrane protein of 12 TMs. This family carries twelve transmembrane regions. It does not have any characteristic nucleotide-binding-domains of the GxSGSGKST type, so it may not be an ATP-binding cassette transporter. However, it may well be a transporter of some description. ABC transporters always have two nucleotide binding domains; this has two unusual conserved sequence-motifs: 'KDhKxhhR' and 'LxxLP'.	448
401711	pfam09848	DUF2075	Uncharacterized conserved protein (DUF2075). This domain, found in various prokaryotic proteins (including putative ATP/GTP binding proteins), has no known function.	355
401712	pfam09849	DUF2076	Uncharacterized protein conserved in bacteria (DUF2076). This domain, found in various hypothetical prokaryotic proteins, has no known function. The domain, however, is found in various periplasmic ligand-binding sensor proteins.	257
401713	pfam09850	DotU	Type VI secretion system protein DotU. DotU is a family of Gram-negative bacterial proteins that form part of the membrane-joining complex of the type VI secretion system. DotU binds tightly to IcmF and together they are tethered to the inner membrane at one end and the peptidoglycan layer at the other; they interact with Lip1 which then tethers the peptidoglycan layer to the outer membrane.	204
401714	pfam09851	SHOCT	Short C-terminal domain. 	28
401715	pfam09852	DUF2079	Predicted membrane protein (DUF2079). This domain, found in various hypothetical prokaryotic proteins, has no known function.	455
313137	pfam09853	DUF2080	Putative transposon-encoded protein (DUF2080). This domain, found in various hypothetical archaeal proteins, has no known function.	50
401716	pfam09855	zinc_ribbon_13	Nucleic-acid-binding protein containing Zn-ribbon domain (DUF2082). This domain, found in various hypothetical prokaryotic proteins, as well as some Zn-ribbon nucleic-acid-binding proteins, has no known function.	63
401717	pfam09856	DUF2083	Predicted transcriptional regulator (DUF2083). This domain is found in various prokaryotic transcriptional regulatory proteins belonging to the XRE family. Its exact function is, as yet, unknown.	157
401718	pfam09857	YjhX_toxin	Putative toxin of bacterial toxin-antitoxin pair. YjhX_toxin is a putative toxin of a bacterial toxin-antitoxin pair, which is neutralized by the proteins YjhQ in family pfam00583.	85
401719	pfam09858	DUF2085	Predicted membrane protein (DUF2085). This domain, found in various hypothetical prokaryotic proteins, has no known function.	90
401720	pfam09859	Oxygenase-NA	Oxygenase, catalyzing oxidative demethylation of damaged DNA. This family of bacterial sequences is predicted to catalyze oxidative demethylation of damaged bases in DNA.	172
401721	pfam09860	DUF2087	Uncharacterized protein conserved in bacteria (DUF2087). This domain, found in various hypothetical prokaryotic proteins and transcriptional activators, has no known function. Structural modelling suggests this domain may bind nucleic acids.	67
401722	pfam09861	DUF2088	Domain of unknown function (DUF2088). This domain, found in various hypothetical prokaryotic proteins, has no known function.	204
401723	pfam09862	DUF2089	Protein of unknown function (DUF2089). This domain, found in various hypothetical prokaryotic proteins, has no known function. This domain is a zinc-ribbon.	115
401724	pfam09863	DUF2090	Uncharacterized protein conserved in bacteria (DUF2090). This domain, found in various prokaryotic carbohydrate kinases, has no known function.	310
401725	pfam09864	MliC	Membrane-bound lysozyme-inhibitor of c-type lysozyme. Lysozymes are ancient and important components of the innate immune system of animals that hydrolyze peptidoglycan, the major bacterial cell wall polymer. Various mechanisms have evolved by which bacteria can evade this bactericidal enzyme, one being the production of lysozyme inhibitors. MliC (membrane bound lysozyme inhibitor of c-type lysozyme) of E. coli and Pseudomonas aeruginosa, possess lysozyme inhibitory activity and confer increased lysozyme tolerance upon expression in E. coli. Structural analyses show that the invariant loop of MliC plays a crucial role in the inhibition of the lysozyme by its insertion into the active site cleft of the lysozyme, where the loop forms hydrogen and ionic bonds with the catalytic residues.	68
401726	pfam09865	DUF2092	Predicted periplasmic protein (DUF2092). This domain, found in various hypothetical prokaryotic proteins, has no known function.	209
401727	pfam09866	DUF2093	Uncharacterized protein conserved in bacteria (DUF2093). This domain, found in various hypothetical prokaryotic proteins, has no known function.	41
401728	pfam09867	DUF2094	Uncharacterized protein conserved in bacteria (DUF2094). This domain, found in various hypothetical prokaryotic proteins, has no known function.	135
370731	pfam09868	DUF2095	Uncharacterized protein conserved in archaea (DUF2095). This domain, found in various hypothetical prokaryotic proteins, has no known function.	129
401729	pfam09869	DUF2096	Uncharacterized protein conserved in archaea (DUF2096). This domain, found in various hypothetical prokaryotic proteins, has no known function.	168
401730	pfam09870	DUF2097	Uncharacterized protein conserved in archaea (DUF2097). This domain, found in various hypothetical prokaryotic proteins, has no known function.	83
401731	pfam09871	DUF2098	Uncharacterized protein conserved in archaea (DUF2098). This domain, found in various hypothetical prokaryotic proteins, has no known function.	94
401732	pfam09872	DUF2099	Uncharacterized protein conserved in archaea (DUF2099). This domain, found in various hypothetical prokaryotic proteins, has no known function.	257
401733	pfam09873	DUF2100	Uncharacterized protein conserved in archaea (DUF2100). This domain, found in various hypothetical archaeal proteins, has no known function.	210
255617	pfam09874	DUF2101	Predicted membrane protein (DUF2101). This domain, found in various archaeal and bacterial proteins, has no known function.	206
401734	pfam09875	DUF2102	Uncharacterized protein conserved in archaea (DUF2102). This domain, found in various hypothetical archaeal proteins, has no known function.	102
401735	pfam09876	DUF2103	Predicted metal-binding protein (DUF2103). This domain, found in various putative metal binding prokaryotic proteins, has no known function.	98
401736	pfam09877	DUF2104	Predicted membrane protein (DUF2104). This domain, found in various hypothetical archaeal proteins, has no known function.	99
401737	pfam09878	DUF2105	Predicted membrane protein (DUF2105). This domain, found in various hypothetical archaeal proteins, has no known function.	200
401738	pfam09879	DUF2106	Predicted membrane protein (DUF2106). This domain, found in various hypothetical archaeal proteins, has no known function.	151
401739	pfam09880	DUF2107	Predicted membrane protein (DUF2107). This domain, found in various hypothetical archaeal proteins, has no known function.	73
401740	pfam09881	DUF2108	Predicted membrane protein (DUF2108). This domain, found in various hypothetical archaeal proteins, has no known function.	70
401741	pfam09882	DUF2109	Predicted membrane protein (DUF2109). This domain, found in various hypothetical archaeal proteins, has no known function.	76
401742	pfam09883	DUF2110	Uncharacterized protein conserved in archaea (DUF2110). This domain, found in various hypothetical archaeal proteins, has no known function.	223
401743	pfam09884	DUF2111	Uncharacterized protein conserved in archaea (DUF2111). This domain, found in various hypothetical archaeal proteins, has no known function.	83
401744	pfam09885	DUF2112	Uncharacterized protein conserved in archaea (DUF2112). This domain, found in various hypothetical archaeal proteins, has no known function.	143
401745	pfam09886	DUF2113	Uncharacterized protein conserved in archaea (DUF2113). This domain, found in various hypothetical archaeal proteins, has no known function.	185
401746	pfam09887	DUF2114	Uncharacterized protein conserved in archaea (DUF2114). This domain, found in various hypothetical archaeal proteins, has no known function.	449
401747	pfam09888	DUF2115	Uncharacterized protein conserved in archaea (DUF2115). This domain, found in various hypothetical archaeal proteins, has no known function.	163
401748	pfam09889	DUF2116	Uncharacterized protein containing a Zn-ribbon (DUF2116). This domain, found in various hypothetical archaeal proteins, has no known function. Structural modelling suggests this domain may bind nucleic acids.	59
401749	pfam09890	DUF2117	Uncharacterized protein conserved in archaea (DUF2117). This domain, found in various hypothetical archaeal proteins, has no known function.	213
378294	pfam09891	DUF2118	Uncharacterized protein conserved in archaea (DUF2118). This domain, found in various hypothetical archaeal proteins, has no known function.	149
401750	pfam09892	DUF2119	Uncharacterized protein conserved in archaea (DUF2119). This domain, found in various hypothetical archaeal proteins, has no known function.	186
401751	pfam09893	DUF2120	Uncharacterized protein conserved in archaea (DUF2120). This domain, found in various hypothetical archaeal proteins, has no known function.	136
401752	pfam09894	DUF2121	Uncharacterized protein conserved in archaea (DUF2121). This domain, found in various hypothetical archaeal proteins, has no known function.	196
378295	pfam09895	DUF2122	RecB-family nuclease (DUF2122). This domain, found in various hypothetical archaeal proteins, has no known function.	106
401753	pfam09897	DUF2124	Uncharacterized protein conserved in archaea (DUF2124). This domain, found in various hypothetical archaeal proteins, has no known function.	141
401754	pfam09898	DUF2125	Uncharacterized protein conserved in bacteria (DUF2125). This domain, found in various hypothetical prokaryotic proteins, has no known function.	306
401755	pfam09899	DUF2126	Putative amidoligase enzyme (DUF2126). Members of this family of bacterial domains are predominantly found in transglutaminase and transglutaminase-like proteins. Their exact function is, as yet, unknown, but they are likely to act as amidoligase enzymes. Proteins in this family are found in conserved gene neighborhoods encoding a glutamine amidotransferase-like thiol peptidase (in proteobacteria) or an Aig2 family cyclotransferase protein (in firmicutes).	822
401756	pfam09900	DUF2127	Predicted membrane protein (DUF2127). This domain, found in various hypothetical prokaryotic and archaeal proteins, has no known function.	140
401757	pfam09902	DUF2129	Uncharacterized protein conserved in bacteria (DUF2129). This domain, found in various hypothetical prokaryotic proteins, has no known function. Structural modelling suggests this domain may bind nucleic acids.	70
401758	pfam09903	DUF2130	Uncharacterized protein conserved in bacteria (DUF2130). This domain, found in various hypothetical prokaryotic proteins, has no known function.	248
370745	pfam09904	HTH_43	Winged helix-turn helix. This family, found in various hypothetical prokaryotic proteins, is a probable winged helix DNA-binding domain.	89
401759	pfam09905	VF530	DNA-binding protein VF530. VF530 contains a unique four-helix motif that shows some similarity to the C-terminal double-stranded DNA (dsDNA) binding domain of RecA, as well as other nucleic acid binding domains.	63
401760	pfam09906	DUF2135	Uncharacterized protein conserved in bacteria (DUF2135). This domain, found in various hypothetical prokaryotic proteins, has no known function.	52
401761	pfam09907	HigB_toxin	HigB_toxin, RelE-like toxic component of a toxin-antitoxin system. HigB_toxin is a family of RelE-like prokaryotic proteins that function as mRNA interferases. HigB cleaves translated mRNA only, and cleavage depends on translation of the target RNAs. HigB belongs to the RelE super-family of RNases. The toxin-antitoxin gene-pair is induced by environmental stress factors.	73
370747	pfam09909	DUF2138	Uncharacterized protein conserved in bacteria (DUF2138). This domain, found in various hypothetical prokaryotic proteins, has no known function.	546
401762	pfam09910	DUF2139	Uncharacterized protein conserved in archaea (DUF2139). This domain, found in various hypothetical archaeal proteins, has no known function.	340
401763	pfam09911	DUF2140	Uncharacterized protein conserved in bacteria (DUF2140). This domain, found in various hypothetical prokaryotic proteins, has no known function.	186
401764	pfam09912	DUF2141	Uncharacterized protein conserved in bacteria (DUF2141). This domain, found in various hypothetical prokaryotic proteins, has no known function.	112
401765	pfam09913	DUF2142	Predicted membrane protein (DUF2142). This domain, found in various hypothetical prokaryotic proteins, has no known function.	393
401766	pfam09916	DUF2145	Uncharacterized protein conserved in bacteria (DUF2145). This domain, found in various hypothetical prokaryotic proteins, has no known function.	199
401767	pfam09917	DUF2147	Uncharacterized protein conserved in bacteria (DUF2147). This domain, found in various hypothetical prokaryotic proteins, has no known function.	112
401768	pfam09918	DUF2148	Uncharacterized protein containing a ferredoxin domain (DUF2148). This domain, found in various hypothetical bacterial proteins containing a ferredoxin domain, has no known function.	65
401769	pfam09919	DUF2149	Uncharacterized conserved protein (DUF2149). This domain, found in various hypothetical prokaryotic proteins, has no known function.	92
401770	pfam09920	DUF2150	Uncharacterized protein conserved in archaea (DUF2150). This domain, found in various hypothetical archaeal proteins, has no known function.	189
378308	pfam09921	DUF2153	Uncharacterized protein conserved in archaea (DUF2153). This domain, found in various hypothetical archaeal proteins, has no known function.	123
401771	pfam09922	DUF2154	Cell wall-active antibiotics response 4TMS YvqF. 	114
401772	pfam09923	DUF2155	Uncharacterized protein conserved in bacteria (DUF2155). This domain, found in various hypothetical prokaryotic proteins, has no known function.	89
401773	pfam09924	DUF2156	Uncharacterized conserved protein (DUF2156). This domain, found in various hypothetical prokaryotic proteins, has no known function.	297
401774	pfam09925	DUF2157	Predicted membrane protein (DUF2157). This domain, found in various hypothetical prokaryotic proteins, has no known function.	141
401775	pfam09926	DUF2158	Uncharacterized small protein (DUF2158). Members of this family of prokaryotic proteins have no known function.	52
401776	pfam09928	DUF2160	Predicted small integral membrane protein (DUF2160). The members of this family of hypothetical prokaryotic proteins have no known function. It is thought that they are transmembrane proteins, but their function has not been inferred yet.	88
401777	pfam09929	DUF2161	Putative PD-(D/E)XK phosphodiesterase (DUF2161). This family of proteins is functionally uncharacterized. This family of proteins is found in prokaryotes. Advanced homology-detection methods supported with superfamily-wide domain architecture and horizontal gene transfer analyses have established this family to be a member of the PD-(D/E)XK superfamily.	111
401778	pfam09930	DUF2162	Predicted transporter (DUF2162). Members of this family of bacterial proteins are thought to be membrane transporters, but their exact function has not, as yet, been elucidated.	223
401779	pfam09931	DUF2163	Uncharacterized conserved protein (DUF2163). This domain, found in various hypothetical prokaryotic proteins, has no known function.	163
401780	pfam09932	DUF2164	Uncharacterized conserved protein (DUF2164). This domain, found in various hypothetical prokaryotic proteins, has no known function.	73
401781	pfam09933	DUF2165	Predicted small integral membrane protein (DUF2165). This domain, found in various hypothetical prokaryotic proteins, has no known function.	157
401782	pfam09935	DUF2167	Protein of unknown function (DUF2167). This domain, found in various hypothetical membrane-anchored prokaryotic proteins, has no known function.	238
401783	pfam09936	Methyltrn_RNA_4	SAM-dependent RNA methyltransferase. This family has a Rossmanoid fold, with a deep trefoil knot in its C-terminal region. It has structural similarity to RNA methyltransferases, and is likely to function as an S-adenosyl-L-methionine (SAM)-dependent RNA 2'-O methyltransferase.	181
401784	pfam09937	DUF2169	Uncharacterized protein conserved in bacteria (DUF2169). This domain, found in various hypothetical prokaryotic proteins, has no known function.	294
401785	pfam09938	DUF2170	Uncharacterized protein conserved in bacteria (DUF2170). This domain, found in various hypothetical prokaryotic proteins, has no known function.	137
401786	pfam09939	DUF2171	Uncharacterized protein conserved in bacteria (DUF2171). This domain, found in various hypothetical prokaryotic proteins, has no known function.	63
401787	pfam09940	DUF2172	Domain of unknown function (DUF2172). This domain, found in various hypothetical prokaryotic proteins, has no known function. An aminopeptidase domain is conserved within the family, but its relevance has not been established yet. Rebuilding from Structure 3kt9 shows this is an inserted (nested) domain within the aminopeptidase. The function of this small domain is not known.	91
370757	pfam09941	DUF2173	Uncharacterized conserved protein (DUF2173). This domain, found in various hypothetical prokaryotic proteins, has no known function.	104
370758	pfam09943	DUF2175	Uncharacterized protein conserved in archaea (DUF2175). This domain, found in various hypothetical archaeal proteins, has no known function.	98
401788	pfam09945	DUF2177	Predicted membrane protein (DUF2177). This domain, found in various hypothetical bacterial proteins, has no known function.	120
401789	pfam09946	DUF2178	Predicted membrane protein (DUF2178). This domain, found in various hypothetical archaeal proteins, has no known function.	106
401790	pfam09947	DUF2180	Uncharacterized protein conserved in archaea (DUF2180). This domain, found in various hypothetical archaeal proteins, has no known function. A few of the family members contain a zinc finger domain.	68
401791	pfam09948	DUF2182	Predicted metal-binding integral membrane protein (DUF2182). This domain, found in various hypothetical bacterial membrane proteins having predicted metal-binding properties, has no known function.	188
401792	pfam09949	DUF2183	Uncharacterized conserved protein (DUF2183). This domain, found in various hypothetical bacterial proteins, has no known function.	99
401793	pfam09950	DUF2184	Uncharacterized protein conserved in bacteria (DUF2184). This domain, found in various hypothetical bacterial proteins, has no known function.	251
401794	pfam09951	DUF2185	Protein of unknown function (DUF2185). This domain, found in various hypothetical bacterial proteins, has no known function.	85
370761	pfam09952	AbiEi_2	Transcriptional regulator, AbiEi antitoxin, Type IV TA system. AbiEi_2 is the cognate antitoxin of the type IV toxin-antitoxin 'innate immunity' bacterial abortive infection (Abi) system that protects bacteria from the spread of a phage infection. The Abi system is activated upon infection with phage to abort the cell thus preventing the spread of phage through viral replication. There are some 20 or more Abis, and they are predominantly plasmid-encoded lactococcal systems. TA, toxin-antitoxin, systems on plasmids function by killing cells that lose the plasmid upon division. AbiE phage resistance systems function as novel Type IV TAs and are widespread in bacteria and archaea. The cognate antitoxin is pfam13338.	142
401795	pfam09953	DUF2187	Uncharacterized protein conserved in bacteria (DUF2187). This domain, found in various hypothetical bacterial proteins, has no known function.	60
378323	pfam09954	DUF2188	Uncharacterized protein conserved in bacteria (DUF2188). This domain, found in various hypothetical bacterial proteins, has no known function.	62
401796	pfam09955	DUF2189	Predicted integral membrane protein (DUF2189). Members of this family are found in various hypothetical prokaryotic proteins, as well as putative cytochrome c oxidases. Their exact function has not, as yet, been established.	126
401797	pfam09956	DUF2190	Uncharacterized conserved protein (DUF2190). This domain, found in various hypothetical prokaryotic proteins, as well as in some putative RecA/RadA recombinases, has no known function.	103
401798	pfam09957	VapB_antitoxin	Bacterial antitoxin of type II TA system, VapB. VapB is the antitoxin of a bacterial toxin-antitoxin gene pair. The cognate toxin is VapC, pfam05016. The family contains several related antitoxins from Cyanobacteria and Actinobacterial families. Antitoxins of this class carry an N-terminal ribbon-helix-helix domain, RHH, that is highly conserved across all type II bacterial antitoxins, which dimerizes with the RHH domain of a second VapB molecule. A hinge section follows the RHH, with an additional pair of flexible alpha helices at the C-terminus. This C-terminus is the Toxin-binding region of the dimer, and so is specific to the cognate toxin, whereas the RHH domain has the specific function of lying across the RNA-binding groove of the toxin dimer and inactivating the active-site - a more general function of all antitoxins.	43
378325	pfam09958	DUF2192	Uncharacterized protein conserved in archaea (DUF2192). This domain, found in various hypothetical archaeal proteins, has no known function.	229
401799	pfam09959	DUF2193	Uncharacterized protein conserved in archaea (DUF2193). This domain, found in various hypothetical archaeal proteins, has no known function.	498
401800	pfam09960	DUF2194	Uncharacterized protein conserved in bacteria (DUF2194). This domain, found in various hypothetical bacterial proteins, has no known function.	586
401801	pfam09961	DUF2195	Uncharacterized protein conserved in bacteria (DUF2195). This domain, found in various hypothetical bacterial proteins, has no known function.	117
401802	pfam09962	DUF2196	Uncharacterized conserved protein (DUF2196). This domain, found in various hypothetical bacterial proteins, has no known function.	59
313229	pfam09963	DUF2197	Uncharacterized protein conserved in bacteria (DUF2197). This domain, found in various hypothetical bacterial proteins, has no known function.	56
401803	pfam09964	DUF2198	Uncharacterized protein conserved in bacteria (DUF2198). This domain, found in various hypothetical bacterial proteins, has no known function.	72
401804	pfam09965	DUF2199	Uncharacterized protein conserved in bacteria (DUF2199). This domain, found in various hypothetical bacterial proteins, has no known function.	145
401805	pfam09966	DUF2200	Uncharacterized protein conserved in bacteria (DUF2200). This domain, found in various hypothetical bacterial proteins, has no known function.	110
401806	pfam09967	DUF2201	VWA-like domain (DUF2201). This domain, found in various hypothetical bacterial proteins, has no known function. However, it is clearly related to the VWA domain.	123
401807	pfam09968	DUF2202	Uncharacterized protein domain (DUF2202). This domain, found in various hypothetical archaeal proteins, has no known function.	162
401808	pfam09969	DUF2203	Uncharacterized conserved protein (DUF2203). This domain, found in various hypothetical bacterial proteins, has no known function.	121
378331	pfam09970	DUF2204	Nucleotidyl transferase of unknown function (DUF2204). This domain, found in various hypothetical archaeal proteins, has no known function. However, this family was identified as belonging to the nucleotidyltransferase superfamily.	181
401809	pfam09971	DUF2206	Predicted membrane protein (DUF2206). This domain, found in various hypothetical archaeal proteins, has no known function.	380
401810	pfam09972	DUF2207	Predicted membrane protein (DUF2207). This domain, found in various hypothetical bacterial proteins, has no known function.	434
378334	pfam09973	DUF2208	Predicted membrane protein (DUF2208). This domain, found in various hypothetical archaeal proteins, has no known function.	231
401811	pfam09974	DUF2209	Uncharacterized protein conserved in archaea (DUF2209). This domain, found in various hypothetical archaeal proteins, has no known function.	121
401812	pfam09976	TPR_21	Tetratricopeptide repeat-like domain. This family resembles a single unit of a TPR repeat.	194
401813	pfam09977	Tad_C	Putative Tad-like Flp pilus-assembly. This domain, found in various hypothetical prokaryotic proteins, is likely to be involved in Flp pilus biogenesis.	93
401814	pfam09979	DUF2213	Uncharacterized protein conserved in bacteria (DUF2213). Members of this family of bacterial proteins comprise various hypothetical and phage-related proteins. The exact function of these proteins has not, as yet, been determined.	166
401815	pfam09980	DUF2214	Predicted membrane protein (DUF2214). This domain, found in various hypothetical bacterial proteins, has no known function.	144
401816	pfam09981	DUF2218	Uncharacterized protein conserved in bacteria (DUF2218). This domain, found in various hypothetical bacterial proteins, has no known function.	88
401817	pfam09982	DUF2219	Uncharacterized protein conserved in bacteria (DUF2219). This domain, found in various hypothetical bacterial proteins, has no known function.	294
401818	pfam09983	DUF2220	Uncharacterized protein conserved in bacteria C-term (DUF2220). This domain, found in various hypothetical bacterial proteins, has no known function. The family represents just the C-terminus.	181
401819	pfam09984	DUF2222	Uncharacterized signal transduction histidine kinase domain (DUF2222). Members of this family of domains are found in various BarA-like signal transduction histidine kinases, which are involved in the regulation of carbon metabolism via the csrA/csrB regulatory system. The role of this domain has not, as yet, been established.	146
401820	pfam09985	Glucodextran_C	C-terminal binding-module, SLH-like, of glucodextranase. Glucodextran_C is the C-terminal domain of glucodextranase-like proteins found in various prokaryotic membrane-anchored proteins. It shows homology to the carbohydrate-binding unit of some glycosidases.	228
401821	pfam09986	DUF2225	Uncharacterized protein conserved in bacteria (DUF2225). This domain, found in various hypothetical bacterial proteins, has no known function.	213
255677	pfam09987	DUF2226	Uncharacterized protein conserved in archaea (DUF2226). This domain, found in various hypothetical archaeal proteins, has no known function.	252
401822	pfam09988	DUF2227	Uncharacterized metal-binding protein (DUF2227). Members of this family of hypothetical bacterial proteins possess metal binding properties; however, their exact function has not, as yet, been determined.	172
401823	pfam09989	DUF2229	CoA enzyme activase uncharacterized domain (DUF2229). Members of this family include various bacterial hypothetical proteins, as well as CoA enzyme activases. The exact function of this domain has not, as yet, been defined.	213
401824	pfam09990	DUF2231	Predicted membrane protein (DUF2231). This domain, found in various hypothetical bacterial proteins, has no known function.	100
370773	pfam09991	DUF2232	Predicted membrane protein (DUF2232). This family of bacterial proteins are multi-pass membrane proteins with up to 10 (2 x 4/5) transmembrane regions. The exact function of this potential pore molecule is not known, but in many instances it is associated with ABC-transporter-like domains, implying that it is part of a secretion system that uses energy.	290
401825	pfam09992	NAGPA	Phosphodiester glycosidase. This is a family conserved from bacteria to humans. The structure of a member from Bacteroides has been crystallized and modelled onto the luminal region of the human member of the family, the transmembrane glycoprotein N-acetylglucosamine-1-phosphodiester alpha-N-acetylglucosaminidase. There is some conservation of potentially functional residues, implying that in the bacterial members this family acts in some way as a phosphodiester glycosidase. The human protein is also present, so the eukaryotic members are likely to be catalyzing the second step in the formation of the mannose 6-phosphate targeting signal on lysosomal enzyme oligosaccharides.	169
401826	pfam09994	DUF2235	Uncharacterized alpha/beta hydrolase domain (DUF2235). This domain, found in various hypothetical bacterial proteins, has no known function.	283
401827	pfam09995	DUF2236	Uncharacterized protein conserved in bacteria (DUF2236). This domain, found in various hypothetical bacterial proteins, has no known function. This family contains a highly conserved arginine and histidine that may be active site residues for an as yet unknown catalytic activity.	223
401828	pfam09996	DUF2237	Uncharacterized protein conserved in bacteria (DUF2237). This domain, found in various hypothetical bacterial proteins, has no known function.	108
401829	pfam09997	DUF2238	Predicted membrane protein (DUF2238). This domain, found in various hypothetical bacterial proteins, has no known function.	140
401830	pfam09998	DUF2239	Uncharacterized protein conserved in bacteria (DUF2239). This domain, found in various hypothetical bacterial proteins, has no known function.	181
401831	pfam09999	DUF2240	Uncharacterized protein conserved in archaea (DUF2240). This domain, found in various hypothetical archaeal proteins, has no known function.	144
401832	pfam10000	ACT_3	ACT domain. This domain, found in various hypothetical bacterial proteins, has no known function. However, its structure is similar to the ACT domain which suggests that it binds to amino acids and regulates other protein activity. This family was formerly known as DUF2241.	69
401833	pfam10001	DUF2242	Uncharacterized protein conserved in bacteria (DUF2242). This domain is found in various hypothetical bacterial proteins, and has no known function.	121
401834	pfam10002	DUF2243	Predicted membrane protein (DUF2243). This domain, found in various hypothetical bacterial proteins, has no known function.	139
401835	pfam10003	DUF2244	Integral membrane protein (DUF2244). This domain, found in various bacterial hypothetical and putative membrane proteins, has no known function.	135
401836	pfam10004	DUF2247	Uncharacterized protein conserved in bacteria (DUF2247). This domain, found in various hypothetical bacterial proteins, has no known function.	158
401837	pfam10005	zinc-ribbon_6	zinc-ribbon domain. This family appears to be a true zinc-ribbon, with two sets of putative zinc-binding domains in tandem.	93
401838	pfam10006	DUF2249	Uncharacterized conserved protein (DUF2249). Members of this family of hypothetical bacterial proteins have no known function.	70
401839	pfam10007	DUF2250	Uncharacterized protein conserved in archaea (DUF2250). Members of this family of hypothetical archaeal proteins have no known function.	93
401840	pfam10008	DUF2251	Uncharacterized protein conserved in bacteria (DUF2251). Members of this family of hypothetical bacterial proteins have no known function.	89
401841	pfam10009	DUF2252	Uncharacterized protein conserved in bacteria (DUF2252). This domain, found in various hypothetical bacterial proteins, has no known function.	387
401842	pfam10011	DUF2254	Predicted membrane protein (DUF2254). Members of this family of bacterial proteins comprise various hypothetical and putative membrane proteins. Their exact function has not, as yet, been defined.	371
401843	pfam10012	DUF2255	Uncharacterized protein conserved in bacteria (DUF2255). Members of this family of hypothetical bacterial proteins have no known function.	113
401844	pfam10013	DUF2256	Uncharacterized protein conserved in bacteria (DUF2256). Members of this family of hypothetical bacterial proteins have no known function.	40
401845	pfam10014	2OG-Fe_Oxy_2	2OG-Fe dioxygenase. This family contains 2-oxoglutarate (2OG) and Fe-dependent dioxygenases. It includes L-isoleucine dioxygenase (IDO).	191
401846	pfam10015	DUF2258	Uncharacterized protein conserved in archaea (DUF2258). Members of this family of hypothetical archaeal proteins have no known function. Structural modelling suggests this domain may bind nucleic acids.	78
401847	pfam10016	DUF2259	Predicted secreted protein (DUF2259). Members of this family of hypothetical bacterial proteins have no known function.	189
401848	pfam10017	Methyltransf_33	Histidine-specific methyltransferase, SAM-dependent. The mycobacterial members of this family are expressed from part of the ergothioneine biosynthetic gene cluster. EGTD is the histidine methyltransferase that transfers three methyl groups to the alpha-amino moiety of histidine, in the first stage of the production of this histidine betaine derivative that carries a thiol group attached to the C2 atom of an imidazole ring.	305
401849	pfam10018	Med4	Vitamin-D-receptor interacting Mediator subunit 4. Members of this family function as part of the Mediator (Med) complex, which links DNA-bound transcriptional regulators and the general transcription machinery, particularly the RNA polymerase II enzyme. They play a role in basal transcription by mediating activation or repression according to the specific complement of transcriptional regulators bound to the promoter.	184
401850	pfam10020	DUF2262	Uncharacterized protein conserved in bacteria (DUF2262). This domain, found in various hypothetical bacterial proteins, has no known function.	141
401851	pfam10021	DUF2263	Uncharacterized protein conserved in bacteria (DUF2263). This domain, found in various hypothetical bacterial and eukaryotic proteins, has no known function.	136
401852	pfam10022	DUF2264	Uncharacterized protein conserved in bacteria (DUF2264). Members of this family of hypothetical bacterial proteins have no known function.	351
401853	pfam10023	Aminopep	Putative aminopeptidase. This family of bacterial proteins has a conserved HEXXH motif, suggesting that members are putative peptidases of zincin fold.	322
401854	pfam10025	DUF2267	Uncharacterized conserved protein (DUF2267). This domain, found in various hypothetical bacterial proteins, has no known function.	122
401855	pfam10026	DUF2268	Predicted Zn-dependent protease (DUF2268). This domain, found in various hypothetical bacterial proteins, as well as predicted zinc dependent proteases, has no known function.	195
401856	pfam10027	DUF2269	Predicted integral membrane protein (DUF2269). Members of this family of bacterial hypothetical integral membrane proteins have no known function.	150
401857	pfam10028	DUF2270	Predicted integral membrane protein (DUF2270). This domain, found in various hypothetical bacterial proteins, has no known function.	182
401858	pfam10029	DUF2271	Predicted periplasmic protein (DUF2271). This domain, found in various hypothetical bacterial proteins and misannotated lysozyme proteins, has no known function.	136
370785	pfam10030	DUF2272	Uncharacterized protein conserved in bacteria (DUF2272). Members of this family of hypothetical bacterial proteins have no known function. However, given its similarity to the CHAP domain it seems likely that this is an enzyme involved in cleaving peptidoglycan.	191
401859	pfam10031	DUF2273	Small integral membrane protein (DUF2273). Members of this family of hypothetical bacterial proteins have no known function.	47
401860	pfam10032	Pho88	Phosphate transport (Pho88). Members of this family of proteins are involved in regulating inorganic phosphate transport, as well as telomere length regulation and maintenance.	175
401861	pfam10033	ATG13	Autophagy-related protein 13. Members of this family of phosphoproteins are involved in cytoplasm to vacuole transport (Cvt), and more specifically in Cvt vesicle formation. They are probably involved in the switching machinery regulating the conversion between the Cvt pathway and autophagy. Finally, ATG13 is also required for glycogen storage.	229
401862	pfam10034	Dpy19	Q-cell neuroblast polarisation. Dpy-19, formerly known as DUF2211, is a transmembrane domain family that is required to orient the neuroblast cells, QR and QL, accurately on the anterior-posterior axis: QL and QR are born in the same anterior-posterior position, but polarise and migrate left-right asymmetrically, QL migrating towards the posterior and QR migrating towards the anterior. It is also required, with unc-40, to express mab-5 correctly in the Q cell descendants. The Dpy-19 protein derives from the C. elegans DUMPY mutant.	645
401863	pfam10035	DUF2179	Uncharacterized protein conserved in bacteria (DUF2179). This domain, found in various hypothetical bacterial proteins, has no known function.	55
401864	pfam10036	RLL	Putative carnitine deficiency-associated protein. This family of proteins conserved from nematodes to humans is of approximately 250 amino acids. It is purported to be carnitine deficiency-associated protein but this could not be confirmed. It carries a characteristic RLL sequence-motif. The function is unknown.	243
401865	pfam10037	MRP-S27	Mitochondrial 28S ribosomal protein S27. Members of this family of small ribosomal proteins possess one of three conserved blocks of sequence found in proteins that stimulate the dissociation of guanine nucleotides from G-proteins, leaving open the possibility that MRP-S27 might be a functional partner of GTP-binding ribosomal proteins.	391
401866	pfam10038	DUF2274	Protein of unknown function (DUF2274). Members of this family of hypothetical bacterial proteins have no known function.	69
401867	pfam10039	DUF2275	Predicted integral membrane protein (DUF2275). This domain, found in various hypothetical bacterial proteins and in the RNA polymerase sigma factor, has no known function.	201
401868	pfam10040	CRISPR_Cas6	CRISPR-associated endoribonuclease Cas6. Cas6 is a member of the RAMP (repeat-associated mysterious protein) superfamily. It is among the most widely distributed Cas proteins and is found in both bacteria and archaea. Cas6 functions in the generation of CRISPR-derived guide RNAs for invader defense in prokaryotes.	65
401869	pfam10041	DUF2277	Uncharacterized conserved protein (DUF2277). Members of this family of hypothetical bacterial proteins have no known function.	74
401870	pfam10042	DUF2278	Uncharacterized conserved protein (DUF2278). Members of this family of hypothetical bacterial proteins have no known function.	205
370792	pfam10043	DUF2279	Predicted periplasmic lipoprotein (DUF2279). This domain, found in various hypothetical bacterial proteins, has no known function.	91
401871	pfam10044	LIN52	Retinal tissue protein. LIN52 is a family of proteins of approximately 112 amino acids in length which is conserved from nematodes to humans. The proposed tertiary structure is of almost entirely alpha helix interrupted only by loops located at proline residues. Three sites in the protein sequence reveal two types of possible post-translation modification. A serine residue, at position 41, is a candidate for protein kinase C phosphorylation. Glycine residues at position 69 and 91 are probable sites for acetylation by covalent amide linkage of myristate via N-myristoyl transferase. LIN52 is differentially expressed in the trout retina between parr and smolt developmental stages (smoltification). It is likely to be a house-keeping protein. LIN52 forms a complex (LINC) required for transcriptional activation of G2/M genes. The LINC core complex consists of at least five subunits including the chromatin-associated LIN-9 and RbAp48 proteins. LINC associates with a large number of E2F-regulated promoters in quiescent cells. Family members are required for spermatogenesis by repressing testis-specific gene expression.	92
401872	pfam10045	DUF2280	Uncharacterized conserved protein (DUF2280). Members of this family of hypothetical bacterial proteins have no known function.	103
401873	pfam10046	BLOC1_2	Biogenesis of lysosome-related organelles complex-1 subunit 2. Members of this family of proteins play a role in cellular proliferation, as well as in the biogenesis of specialized organelles of the endosomal-lysosomal system.	95
378366	pfam10047	DUF2281	Protein of unknown function (DUF2281). Members of this family of hypothetical bacterial proteins have no known function.	64
401874	pfam10048	DUF2282	Predicted integral membrane protein (DUF2282). Members of this family of hypothetical bacterial proteins and putative signal peptide proteins have no known function.	52
401875	pfam10049	DUF2283	Protein of unknown function (DUF2283). Members of this family of hypothetical bacterial proteins have no known function.	49
401876	pfam10050	DUF2284	Predicted metal-binding protein (DUF2284). Members of this family of metal-binding hypothetical bacterial proteins have no known function.	161
378369	pfam10051	DUF2286	Uncharacterized protein conserved in archaea (DUF2286). Members of this family of hypothetical archaeal proteins have no known function.	138
401877	pfam10052	DUF2288	Protein of unknown function (DUF2288). Members of this family of hypothetical bacterial proteins have no known function.	89
370798	pfam10053	DUF2290	Uncharacterized conserved protein (DUF2290). Members of this family of hypothetical bacterial proteins have no known function.	195
401878	pfam10054	DUF2291	Predicted periplasmic lipoprotein (DUF2291). Members of this family of hypothetical bacterial proteins have no known function.	199
401879	pfam10055	DUF2292	Uncharacterized small protein (DUF2292). Members of this family of hypothetical bacterial proteins have no known function.	37
401880	pfam10056	DUF2293	Uncharacterized conserved protein (DUF2293). This domain, found in various hypothetical bacterial proteins, has no known function.	85
401881	pfam10057	DUF2294	Uncharacterized conserved protein (DUF2294). Members of this family of hypothetical bacterial proteins have no known function.	111
401882	pfam10058	zinc_ribbon_10	Predicted integral membrane zinc-ribbon metal-binding protein. This domain, found in various hypothetical bacterial and eukaryotic metal-binding proteins, is probably a zinc-ribbon.	51
401883	pfam10060	DUF2298	Uncharacterized membrane protein (DUF2298). This domain, found in various hypothetical bacterial proteins, has no known function.	485
401884	pfam10061	DUF2299	Uncharacterized conserved protein (DUF2299). Members of this family of hypothetical bacterial proteins have no known function.	137
401885	pfam10062	DUF2300	Predicted secreted protein (DUF2300). This domain, found in various bacterial hypothetical and putative signal peptide proteins, has no known function.	122
401886	pfam10063	DUF2301	Uncharacterized integral membrane protein (DUF2301). This domain, found in various hypothetical bacterial proteins, has no known function.	133
401887	pfam10065	DUF2303	Uncharacterized conserved protein (DUF2303). Members of this family of hypothetical bacterial proteins have no known function.	268
401888	pfam10066	DUF2304	Uncharacterized conserved protein (DUF2304). Members of this family of hypothetical archaeal proteins have no known function.	106
370803	pfam10067	DUF2306	Predicted membrane protein (DUF2306). Members of this family of hypothetical bacterial proteins have no known function.	147
401889	pfam10069	DICT	Sensory domain in DIguanylate Cyclases and Two-component system. DICT is a sensory domain found associated with GGDEF, EAL, HD-GYP, STAS, and two component systems (histidine-kinase type). It assumes an alpha+beta fold with a 4-stranded beta-sheet and might have a role in light response (Natural history of sensor domains in bacterial signaling systems by Aravind L, LM Iyer, Anantharaman V, from 'Sensory Mechanisms in Bacteria: Molecular Aspects of Signal Recognition.' Caister Academic Press. 2010) - see (http://de.scribd.com/doc/28576661/Bacterial-Signaling-Chapter)	126
401890	pfam10070	DUF2309	Uncharacterized protein conserved in bacteria (DUF2309). Members of this family of hypothetical bacterial proteins have no known function.	758
401891	pfam10071	DUF2310	Zn-ribbon-containing, possibly nucleic-acid-binding protein (DUF2310). Members of this family of proteobacterial zinc ribbon proteins are thought to bind to nucleic acids; however, their exact function has not, as yet, been defined.	255
401892	pfam10073	DUF2312	Uncharacterized protein conserved in bacteria (DUF2312). Members of this family of hypothetical bacterial proteins have no known function. Structural modelling suggests this domain may bind nucleic acids.	72
401893	pfam10074	DUF2285	Uncharacterized conserved protein (DUF2285). This domain, found in various hypothetical bacterial proteins, has no known function.	102
370807	pfam10075	CSN8_PSD8_EIF3K	CSN8/PSMD8/EIF3K family. This domain is conserved from plants to humans. It is a signature protein motif found in components of CSN (COP9 signalosome) where it functions as a structural scaffold for subunit-subunit interactions within the complex and is a key regulator of photomorphogenic development. It is found in Eukaryotic translation initiation factor 3 subunit K, a component of the eukaryotic translation initiation factor 3 (eIF-3) complex required for the initiation of protein synthesis. It is also found in 26S proteasome non-ATPase regulatory subunit 8 (PSMD8), a regulatory subunit of the 26S proteasome.	137
401894	pfam10076	DUF2313	Uncharacterized protein conserved in bacteria (DUF2313). Members of this family of proteins comprise various hypothetical and putative bacteriophage tail proteins.	150
401895	pfam10077	DUF2314	Uncharacterized protein conserved in bacteria (DUF2314). This domain is found in various bacterial hypothetical proteins, as well as putative ankyrin repeat proteins. The exact function of the domains comprising this family has not, as yet, been determined.	136
401896	pfam10078	DUF2316	Uncharacterized protein conserved in bacteria (DUF2316). Members of this family of hypothetical bacterial proteins have no known function.	89
401897	pfam10079	BshC	Bacillithiol biosynthesis BshC. Members of this protein family include BshC, which is an enzyme required for bacillithiol biosynthesis and described as a cysteine-adding enzyme.	538
401898	pfam10080	DUF2318	Predicted membrane protein (DUF2318). Members of this family of hypothetical bacterial proteins have no known function.	98
401899	pfam10081	Abhydrolase_9	Alpha/beta-hydrolase family. This is a family of alpha/beta hydrolases which may function as lipases. This domain is the catalytic domain and includes the catalytic triad and the GXSXG sequence motif which is a characteristic of these enzymes.	282
401900	pfam10082	BBP2_2	Putative beta-barrel porin 2. This domain is a putative beta-barrel porin type 2.	378
401901	pfam10083	DUF2321	Uncharacterized protein conserved in bacteria (DUF2321). Members of this family of hypothetical bacterial proteins have no known function.	156
401902	pfam10084	DUF2322	Uncharacterized protein conserved in bacteria (DUF2322). Members of this family of hypothetical bacterial proteins have no known function.	99
401903	pfam10086	DUF2324	Putative membrane peptidase family (DUF2324). This domain, found in various hypothetical bacterial proteins, has no known function. This family appears to be related to the prenyl protease 2 family pfam02517, suggesting this family may be peptidases.	223
401904	pfam10087	DUF2325	Uncharacterized protein conserved in bacteria (DUF2325). Members of this family of hypothetical bacterial proteins have no known function.	94
401905	pfam10088	DUF2326	Uncharacterized protein conserved in bacteria (DUF2326). This domain, found in various hypothetical bacterial proteins, has no known function.	135
401906	pfam10090	HPTransfase	Histidine phosphotransferase C-terminal domain. HPTransfase is a family of essential histidine phosphotransferases. It controls the activity of the master bacterial cell-cycle regulator CtrA through phosphorylation. It behaves as a homodimer by adopting the domain architecture of the intracellular part of class I histidine kinases. Each subunit consists of two distinct domains: an N-terminal helical hairpin domain and a C-terminal [alpha]/[beta] domain. The two N-terminal domains are adjacent within the dimer, forming a four-helix bundle. The C-terminal domain adopts an atypical Bergerat ATP-binding fold.	123
401907	pfam10091	Glycoamylase	Putative glucoamylase. The structure of UniProt:Q5LIB7 has an alpha/alpha toroid fold and is similar structurally to a number of glucoamylases. Most of these structural homologs are glucoamylases, involved in breaking down complex sugars (e.g. starch). The biologically relevant state is likely to be monomeric. The putative active site is located at the centre of the toroid with a well defined large cavity.	215
370813	pfam10092	DUF2330	Uncharacterized protein conserved in bacteria (DUF2330). Members of this family of hypothetical bacterial proteins have no known function.	311
401908	pfam10093	DUF2331	Uncharacterized protein conserved in bacteria (DUF2331). Members of this family of hypothetical bacterial proteins have no known function.	373
401909	pfam10094	DUF2332	Uncharacterized protein conserved in bacteria (DUF2332). Members of this family of hypothetical bacterial proteins have no known function.	334
401910	pfam10095	DUF2333	Uncharacterized protein conserved in bacteria (DUF2333). Members of this family of hypothetical bacterial proteins have no known function.	330
401911	pfam10096	DUF2334	Uncharacterized protein conserved in bacteria (DUF2334). This domain, found in various hypothetical bacterial proteins, has no known function.	206
401912	pfam10097	DUF2335	Predicted membrane protein (DUF2335). Members of this family of hypothetical bacterial proteins have no known function.	50
401913	pfam10098	DUF2336	Uncharacterized protein conserved in bacteria (DUF2336). Members of this family of hypothetical bacterial proteins have no known function.	258
401914	pfam10099	RskA	Anti-sigma-K factor rskA. This domain, formerly known as DUF2337, is the anti-sigma-K factor, RskA. In Mycobacterium tuberculosis the protein positively regulates expression of the antigenic proteins MPB70 and MPB83.	180
401915	pfam10100	DUF2338	Uncharacterized protein conserved in bacteria (DUF2338). Members of this family of hypothetical bacterial proteins have no known function.	423
401916	pfam10101	DUF2339	Predicted membrane protein (DUF2339). This domain, found in various hypothetical bacterial proteins, has no known function.	650
401917	pfam10102	DUF2341	Domain of unknown function (DUF2341). Members of this family are found in various bacterial proteins, including MotA/TolQ/ExbB proton channels and other transport proteins. The exact function of this set of domains has not, as yet, been determined.	82
401918	pfam10103	Zincin_2	Zincin-like metallopeptidase. This family of proteins has a conserved HEXXH motif, suggesting they are putative peptidases of zincin fold. The structure of this family has similarity to Peptidase_M1 (pfam01433, Structure 3CMN).	340
401919	pfam10104	Brr6_like_C_C	Di-sulfide bridge nucleocytoplasmic transport domain. Brr6_like_C_C is the highly conserved C-terminal region of a group of proteins found in fungi. It carries four highly conserved cysteine residues. It is suggested that members of the family interact with each other via di-sulfide bridges to form a complex which is involved in nucleocytoplasmic transport. Brr6 in yeast is an essential integral membrane protein of the NE-ER, with two predicted transmembrane domains, and is a dosage suppressor of Apq12, pfam12716.	133
401920	pfam10105	DUF2344	Uncharacterized protein conserved in bacteria (DUF2344). This domain, found in various hypothetical bacterial proteins and Radical Sam domain proteins, has no known function. This domain is distantly related to tRNA pseudouridine synthases, suggesting this family may carry out a function related to RNA modification. But this family appears to lack the catalytic aspartate found in pseudouridine synthases.	178
401921	pfam10106	DUF2345	Uncharacterized protein conserved in bacteria (DUF2345). Members of this family are found in various bacterial hypothetical proteins, as well as Rhs element Vgr proteins.	151
401922	pfam10107	Endonuc_Holl	Endonuclease related to archaeal Holliday junction resolvase. This domain is found in various predicted bacterial endonucleases which are distantly related to archaeal Holliday junction resolvases.	159
401923	pfam10108	DNA_pol_B_exo2	Predicted 3'-5' exonuclease related to the exonuclease domain of PolB. This domain is found in various prokaryotic 3'-5' exonucleases and hypothetical proteins.	213
401924	pfam10109	Phage_TAC_7	Phage tail assembly chaperone proteins, E, or 41 or 14. This is a family of various Myoviridae bacteriophage tail assembly chaperone, or TAC, proteins.	76
401925	pfam10110	GPDPase_memb	Membrane domain of glycerophosphoryl diester phosphodiesterase. Members of this family comprise the membrane domain of the prokaryotic enzyme glycerophosphoryl diester phosphodiesterase.	321
313356	pfam10111	Glyco_tranf_2_2	Glycosyltransferase like family 2. Members of this family of prokaryotic proteins include putative glucosyltransferases, which are involved in bacterial capsule biosynthesis.	276
401926	pfam10112	Halogen_Hydrol	5-bromo-4-chloroindolyl phosphate hydrolysis protein. Members of this family of prokaryotic proteins mediate the hydrolysis of 5-bromo-4-chloroindolyl phosphate bonds.	186
401927	pfam10113	Fibrillarin_2	Fibrillarin-like archaeal protein. Members of this family of proteins include archaeal fibrillarin homologs.	500
401928	pfam10114	PocR	Sensory domain found in PocR. PocR, a ligand binding domain, has a novel variant of the PAS-like Fold. Evidence suggests that it binds small hydrocarbon derivatives such as 1,3-propanediol. In (Natural history of sensor domains in bacterial signaling systems by Aravind L, LM Iyer, Anantharaman V, from 'Sensory Mechanisms in Bacteria: Molecular Aspects of Signal Recognition.' Caister Academic Press. 2010) - see (http://de.scribd.com/doc/28576661/Bacterial-Signaling-Chapter)	162
401929	pfam10115	HlyU	Transcriptional activator HlyU. This domain, found in various hypothetical prokaryotic proteins, has no known function. One of the sequences in this family corresponds to the transcriptional activator HlyU, indicating a possible similar role in other members.	91
401930	pfam10116	Host_attach	Protein required for attachment to host cells. Members of this family of bacterial proteins are required for the attachment of the bacterium to host cells.	136
370821	pfam10117	McrBC	McrBC 5-methylcytosine restriction system component. Members of this family of bacterial proteins modify the specificity of mcrB restriction by expanding the range of modified sequences restricted.	319
401931	pfam10118	Metal_hydrol	Predicted metal-dependent hydrolase. Members of this family of proteins comprise various bacterial transition metal-dependent hydrolases.	250
401932	pfam10119	MethyTransf_Reg	Predicted methyltransferase regulatory domain. Members of this family of domains are found in various prokaryotic methyltransferases, where they regulate the activity of the methyltransferase domain.	84
401933	pfam10120	ThiP_synth	Thiamine-phosphate synthase. This family is thiamine-phosphate synthase, and it belongs to the SCOP phosphomethylpyrimidine kinase C-terminal domain-like family. Vitamin B1 (thiamine pyrophosphate) is involved in several microbial metabolic functions. Thiamine biosynthesis is accomplished by joining two intermediate molecules that are synthesized separately, HMP-PP and HET-P. In the archaeon Natrialba magadii, ThiE and ThiN are known to join HMP-PP (hydroxymethylpyrimidine pyrophosphate) and HET-P (hydroxyethylthiazole phosphate) to generate thiamine phosphate. Whereas ThiE in Natrialba magadii is a mono-functional protein, ThiN exists as a C-terminal domain in a ThiDN fusion protein - examples of all three forms, from various prokaryotes, are found in this family.	164
287133	pfam10122	Mu-like_Com	Mu-like prophage protein Com. Members of this family of proteins comprise the translational regulator of mom.	52
401934	pfam10123	Mu-like_Pro	Mu-like prophage I protein. Members of this family of proteins comprise various viral Mu-like prophage I proteins.	325
401935	pfam10124	Mu-like_gpT	Mu-like prophage major head subunit gpT. Members of this family of proteins comprise various caudoviral prophage proteins, including the Mu-like prophage major head subunit gpT.	289
401936	pfam10125	NADHdeh_related	NADH dehydrogenase I, subunit N related protein. This family comprises a set of NADH dehydrogenase I, subunit N related proteins found in archaea. Their exact function has not, as yet, been determined.	218
401937	pfam10126	Nit_Regul_Hom	Uncharacterized protein, homolog of nitrogen regulatory protein PII. This domain, found in various hypothetical archaeal proteins, has no known function. It is distantly similar to the nitrogen regulatory protein PII.	107
401938	pfam10127	Nuc-transf	Predicted nucleotidyltransferase. Members of this family of bacterial proteins catalyze the transfer of nucleotide residues from nucleoside diphosphates or triphosphates into dimer or polymer forms.	246
401939	pfam10128	OpcA_G6PD_assem	Glucose-6-phosphate dehydrogenase subunit. Members of this family are found in various prokaryotic OpcA and glucose-6-phosphate dehydrogenase proteins. The exact function of the domain is, as yet, unknown.	255
401940	pfam10129	OpgC_C	OpgC protein. This domain is found in various hypothetical and OpgC prokaryotic proteins. It is likely to act as an acyltransferase enzyme.	358
370825	pfam10130	PIN_2	PIN domain. Members of this family of bacterial domains are predicted to be RNases (from similarities to 5'-exonucleases).	132
401941	pfam10131	PTPS_related	6-pyruvoyl-tetrahydropterin synthase related domain; membrane protein. This domain is found in various bacterial hypothetical membrane proteins, as well as in tetratricopeptide TPR_2 repeat protein. The exact function of the domain has not, as yet, been established.	621
401942	pfam10133	RNA_bind_2	Predicted RNA-binding protein. Members of this family of bacterial proteins are thought to have RNA-binding properties; however, their exact function has not, as yet, been defined.	60
401943	pfam10134	RPA	Replication initiator protein A. Members of this family of bacterial proteins are single-stranded DNA binding proteins that are involved in DNA replication, repair and recombination.	229
401944	pfam10135	Rod-binding	Rod binding protein. Members of this family are involved in the assembly of the prokaryotic flagellar rod.	50
401945	pfam10136	SpecificRecomb	Site-specific recombinase. Members of this family of bacterial proteins are found in various putative site-specific recombinase transmembrane proteins.	640
401946	pfam10137	TIR-like	Predicted nucleotide-binding protein containing TIR-like domain. Members of this family of bacterial nucleotide-binding proteins contain a TIR-like domain. Their exact function has not, as yet, been defined.	120
401947	pfam10138	vWA-TerF-like	vWA found in TerF C-terminus. vWA domain fused to TerD domain typified by the TerF protein. Sometimes found as solos.	200
401948	pfam10139	Virul_Fac	Putative bacterial virulence factor. Members of this family of prokaryotic proteins include various putative virulence factor effector proteins. Their exact function is, as yet, unknown.	872
401949	pfam10140	YukC	WXG100 protein secretion system (Wss), protein YukC. Members of this family of proteins include predicted membrane proteins homologous to YukC in B. subtilis. The YukC protein family would participate in the formation of a translocon required for the secretion of WXG100 proteins (pfam06013) in monoderm bacteria, the WXG100 protein secretion system (Wss). This family includes EssB in Staphylococcus aureus.	357
401950	pfam10141	ssDNA-exonuc_C	Single-strand DNA-specific exonuclease, C terminal domain. Members of this set of prokaryotic domains are found in a set of single-strand DNA-specific exonucleases, including RecJ. Their exact function has not, as yet, been determined.	202
401951	pfam10142	PhoPQ_related	PhoPQ-activated pathogenicity-related protein. Members of this family of bacterial proteins are involved in the virulence of some pathogenic proteobacteria.	366
401952	pfam10143	PhosphMutase	2,3-bisphosphoglycerate-independent phosphoglycerate mutase. Members of this family are found in various bacterial 2,3-bisphosphoglycerate-independent phosphoglycerate mutase enzymes, which catalyze the interconversion of 2-phosphoglycerate and 3-phosphoglycerate in the reaction: [2-phospho-D-glycerate + 2,3-diphosphoglycerate = 3-phospho-D-glycerate + 2,3-diphosphoglycerate].	171
401953	pfam10144	SMP_2	Bacterial virulence factor haemolysin. Members of this family of bacterial proteins are membrane proteins that effect the expression of haemolysin under anaerobic conditions.	159
401954	pfam10145	PhageMin_Tail	Phage-related minor tail protein. Members of this family are found in putative phage tail tape measure proteins.	200
401955	pfam10146	zf-C4H2	Zinc finger-containing protein. This is a family of proteins which appears to have a highly conserved zinc finger domain at the C terminal end, described as -C-X2-CH-X3-H-X5-C-X2-C-. The structure is predicted to contain a coiled coil. Members are annotated as being tumor-associated antigen HCA127 in humans but this could not be confirmed.	213
401956	pfam10147	CR6_interact	Growth arrest and DNA-damage-inducible proteins-interacting protein 1. Members of this family of proteins act as negative regulators of G1 to S cell cycle phase progression by inhibiting cyclin-dependent kinases. Inhibitory effects are additive with GADD45 proteins but occur also in the absence of GADD45 proteins. Furthermore, they act as a repressor of the orphan nuclear receptor NR4A1 by inhibiting AB domain-mediated transcriptional activity.	204
401957	pfam10148	SCHIP-1	Schwannomin-interacting protein 1. Members of this family are coiled-coil proteins involved in linking membrane proteins to the cytoskeleton.	233
401958	pfam10149	TM231	Transmembrane protein 231. This is a family of transmembrane proteins, given the number 231, of unknown function. It is conserved in eukaryotes.	301
401959	pfam10150	RNase_E_G	Ribonuclease E/G family. Ribonuclease E and Ribonuclease G are related enzymes that cleave a wide variety of RNAs.	267
401960	pfam10151	TMEM214	TMEM214, C-terminal, caspase 4 activator. This is the N-terminal domain of transmembrane family 214, from eukaryotes. The family is localized on the endoplasmic reticulum where it recruits procaspase 4 to the ER and subsequently allows this to be cleaved to caspase 4 so leading to apoptosis.	661
401961	pfam10152	CCDC53	Subunit CCDC53 of WASH complex. CCDC53 is a component of the WASH complex, which plays a key role in the fission of tubules that serve as transport intermediates during endosome sorting.	146
401962	pfam10153	Efg1	rRNA-processing protein Efg1. Efg1 is involved in rRNA processing.	114
401963	pfam10154	DUF2362	Uncharacterized conserved protein (DUF2362). This is a family of proteins conserved from nematodes to humans. The function is not known.	500
401964	pfam10155	DUF2363	Uncharacterized conserved protein (DUF2363). This is a region of 120 amino acids of a family of proteins conserved from plants to humans. The function is not known.	124
401965	pfam10156	Med17	Subunit 17 of Mediator complex. This Mediator complex subunit was formerly known as Srb4 in yeasts or Trap80 in Drosophila and human. The Med17 subunit is located within the head domain and is essential for cell viability to the extent that a mutant strain of S. cerevisiae lacking it shows all RNA polymerase II-dependent transcription ceasing at non-permissive temperatures.	441
401966	pfam10157	BORCS6	BLOC-1-related complex sub-unit 6. This is a family of conserved proteins found from nematodes to humans. Family members include BORCS6 (BLOC-1-related complex sub-unit 6) also known as Lyspersin (lysosome-dispersing protein) or C17orf59. It constitutes sub-unit 6 of the BORC complex (BLOC-one-related complex). BORC is a multisubunit complex that regulates the positioning of lysosomes at the cell periphery, and consequently affects cell migration. BORC associates with the lysosomal membrane, where it functions to recruit the small GTPase Arl8. This initiates a series of interactions that promote the microtubule-guided transport of lysosomes toward the cell periphery.	152
287167	pfam10158	LOH1CR12	Tumor suppressor protein. This is a region of 130 amino acids that is the most conserved region of hypothetical proteins involved in loss of heterozygosity and thus tumor suppression. The exact function of family members is not known. This region is also found in subunit 5 of the BLOC-1-related complex, which is also found in the BORC complex. BLOC-1 is important for the biogenesis of lysosome-related organelles, and BORC is important for the positioning of the lysosome in the cytoplasm. The BORC complex associates with the lysosomal membrane where it recruits the small GTPase Arl8, which leads in turn to the kinesin-dependent movement of lysosomes toward the plus ends of microtubules in the peripheral cytoplasm.	131
401967	pfam10159	MMtag	Kinase phosphorylation protein. This is a glycine-rich domain that is the most highly conserved region of a family of proteins that in vertebrates are associated with tumors in multiple myelomas. The region may contain phosphorylation sites for several protein kinases, as well as N-myristoylation sites and nuclear localization signals, so it might act as a signal molecule in the nucleus.	78
401968	pfam10160	Tmemb_40	Predicted membrane protein. This is a region of 280 amino acids from a group of proteins conserved from plants to humans. It is predicted to be a membrane protein but its function is otherwise unknown.	258
401969	pfam10161	DDDD	Putative mitochondrial precursor protein. This is a family of small conserved proteins found from nematodes to humans. The C-terminal region is rich in asparagine. Members are putatively assigned to be mitochondrial precursor proteins but this could not be confirmed.	76
401970	pfam10162	G8	G8 domain. This domain is found in disease proteins PKHD1 and KIAA1199 and is named G8 after its 8 conserved glycines. It is predicted to contain 10 beta strands and an alpha helix.	123
401971	pfam10163	EnY2	Transcription factor e(y)2. EnY2 is a small transcription factor which is combined in a complex with the TAFII40 protein. The protein is conserved from paramecium to humans.	79
401972	pfam10164	DUF2367	Uncharacterized conserved protein (DUF2367). This is a highly conserved family of proteins which contains three pairs of cysteine residues within a length of 42 amino acids and is rich in proline residues towards the N-terminus. The function is unknown. Several members are putatively assigned as brain protein i3 but this was not validated.	105
401973	pfam10165	Ric8	Guanine nucleotide exchange factor synembryn. Ric8 is involved in the EGL-30 neurotransmitter signalling pathway. It is a guanine nucleotide exchange factor that regulates neurotransmitter secretion.	406
401974	pfam10166	DUF2368	Uncharacterized conserved protein (DUF2368). This family is conserved from nematodes to humans. The function is not known.	134
401975	pfam10167	BORCS8	BLOC-1-related complex sub-unit 8. This is the N-terminal 80 residues of a family of proteins conserved from plants to humans. It contains a characteristic NEP sequence motif. Family members include BORCS8 (BLOC-1-related complex sub-unit 8) also known as MEF2BNB. It constitutes sub-unit 8 of the BORC complex (BLOC-one-related complex). BORC is a multisubunit complex that regulates the positioning of lysosomes at the cell periphery, and consequently affects cell migration. BORC associates with the lysosomal membrane, where it functions to recruit the small GTPase Arl8. This initiates a series of interactions that promote the microtubule-guided transport of lysosomes toward the cell periphery.	107
401976	pfam10168	Nup88	Nuclear pore component. Nup88 can be divided into two structural domains; the N-terminal two-thirds of the protein has no obvious structural motifs but is the region for binding to Nup98, one of the components of the nuclear pore. The C-terminal end is a predicted coiled-coil domain. Nup88 is overexpressed in tumor cells.	713
401977	pfam10169	Laps	Learning-associated protein. This is a family of 121-amino acid secretory proteins. Laps functions in the regulation of neuronal cell adhesion and/or movement and synapse attachment. Laps binds to the ApC/EBP (Aplysia CCAAT/enhancer binding protein) promoter and activates the transcription of ApC/EBP mRNA.	124
401978	pfam10170	C6_DPF	Cysteine-rich domain. This is the N-terminal approximately 100 amino acids of a family of proteins found from nematodes to humans. It contains between six and eight highly conserved cysteine residues and a characteristic DPF sequence motif. One member is putatively named as receptor for egg jelly protein but this could not be confirmed.	94
401979	pfam10171	Tim29	Translocase of the Inner Mitochondrial membrane 29. This is a family of proteins conserved from nematodes to humans. The function is not known. However, family members such as the import inner membrane translocase sub-unit Tim29 (C19orf52), found in humans, have been shown to be required for the stability of the TIM22 complex. The TIM22 complex imports and inserts multi-pass trans-membrane proteins into the mitochondrial inner membrane by formation of a twin-pore translocase with components in the outer and inner membranes. TIM29 is integrated into the inner membrane with the C-terminus exposed to the inter-membrane space and able to contact the translocase of the outer membrane. It is required for complex stability and for the addition of the TIMM22 protein to the complex.	169
401980	pfam10172	DDA1	Det1 complexing ubiquitin ligase. DDA1 (De-etiolated 1, Damaged DNA binding protein 1 associated 1) protein binds strongly with DDB1 and Det1 forming a DDD complex which is part of the ubiquitin conjugation system.	66
401981	pfam10173	Mit_KHE1	Mitochondrial K+-H+ exchange-related. The members of this family function as mitochondrial potassium-hydrogen exchange transporters. The family is part of a large mitochondrial KHE protein complex.	191
401982	pfam10174	Cast	RIM-binding protein of the cytomatrix active zone. This is a family of proteins that form part of the CAZ (cytomatrix at the active zone) complex which is involved in determining the site of synaptic vesicle fusion. The C-terminus is a PDZ-binding motif that binds directly to RIM (a small G protein Rab-3A effector). The family also contains four coiled-coil domains.	765
401983	pfam10175	MPP6	M-phase phosphoprotein 6. This is a family of M-phase phosphoprotein 6s which is necessary for generation of the 3' end of the 5.8S rRNA precursor. It preferentially binds to poly(C) and poly(U).	130
401984	pfam10176	DUF2370	Protein of unknown function (DUF2370). This family is conserved from fungi to humans. The human member is annotated as a Golgi-associated protein-Nedd4 WW domain-binding protein but this could not be confirmed.	215
401985	pfam10177	DUF2371	Uncharacterized conserved protein (DUF2371). This is a family of proteins conserved from nematodes to humans. The function is not known.	141
401986	pfam10178	PAC3	Proteasome assembly chaperone 3. PAC3 is a family of eukaryotic proteasome assembly chaperone 3 proteins conserved from fungi to plants to humans. PAC3 plays a crucial part in the assembly of the 20S core proteasome unit, in conjunction with PAC4.	86
401987	pfam10179	DUF2369	Uncharacterized conserved protein (DUF2369). This is a proline-rich region of a group of proteins found from plants to fungi. The function is not known.	94
401988	pfam10180	DUF2373	Uncharacterized conserved protein (DUF2373). This is the C-terminal conserved region of a family of proteins found from fungi to humans. The function is not known.	62
401989	pfam10181	PIG-H	GPI-GlcNAc transferase complex, PIG-H component. PIG-H is a family of conserved proteins that complexes with three other proteins to form the GPI-GnT (glycosylphosphatidylinositol anchor biosynthesis transferase) complex. It appears to be a peripheral membrane protein facing the cytoplasm involved in the first step in GPI anchor formation.	67
401990	pfam10182	Flo11	Flo11 domain. This presumed domain is found at the N-terminus of the S. cerevisiae Flo11 protein. Flo11 is required for diploid pseudohyphal formation and haploid invasive growth. It belongs to a family of proteins involved in invasive growth, cell-cell adhesion, and mating, many of which can substitute for each other under abnormal conditions.	151
401991	pfam10183	ESSS	ESSS subunit of NADH:ubiquinone oxidoreductase (complex I). This subunit is part of the mitochondrial NADH:ubiquinone oxidoreductase (complex I). It carries mitochondrial import sequences.	115
370865	pfam10184	DUF2358	Uncharacterized conserved protein (DUF2358). DUF2358 is a family of conserved proteins found from plants to humans. The function is unknown.	113
401992	pfam10185	Mesd	Chaperone for wingless signalling and trafficking of LDL receptor. Mesd is a family of highly conserved proteins found from nematodes to humans. The final C-terminal residues, KEDL, are the endoplasmic reticulum retention sequence as it is an ER protein specifically required for the intracellular trafficking of members of the low-density lipoprotein family of receptors (LDLRs). The N- and C-terminal sequences are predicted to adopt a random coil conformation, with the exception of an isolated predicted helix within the N-terminal region. The central folded domain flanked by natively unstructured regions is the necessary structure for facilitating maturation of LRP6 (Low-Density Lipoprotein Receptor-Related Protein 6 Maturation).	155
370867	pfam10186	Atg14	Vacuolar sorting 38 and autophagy-related subunit 14. The Atg14 or Apg14 proteins are hydrophilic proteins with a predicted molecular mass of 40.5 kDa, and have a coiled-coil motif at the N-terminus region. Yeast cells with mutant Atg14 are defective not only in autophagy but also in sorting of carboxypeptidase Y (CPY), a vacuolar-soluble hydrolase, to the vacuole. Subcellular fractionation indicates that Apg14p and Apg6p are peripherally associated with a membrane structure(s). Apg14p was co-immunoprecipitated with Apg6p, suggesting that they form a stable protein complex. These results imply that Apg6/Vps30p has two distinct functions: in the autophagic process and in the vacuolar protein sorting pathway. Apg14p may be a component specifically required for the function of Apg6/Vps30p through the autophagic pathway. There are 17 auto-phagosomal component proteins which are categorized into six functional units, one of which is the AS-PI3K complex (Vps30/Atg6 and Atg14). The AS-PI3K complex and the Atg2-Atg18 complex are essential for nucleation, and the specific function of the AS-PI3K apparently is to produce phosphatidylinositol 3-phosphate (PtdIns(3)P) at the pre-autophagosomal structure (PAS). The localization of this complex at the PAS is controlled by Atg14. Autophagy mediates the cellular response to nutrient deprivation, protein aggregation, and pathogen invasion in humans, and malfunction of autophagy has been implicated in multiple human diseases including cancer. This effect seems to be mediated through direct interaction of the human Atg14 with Beclin 1 in the human phosphatidylinositol 3-kinase class III complex.	342
401993	pfam10187	Nefa_Nip30_N	N-terminal domain of NEFA-interacting nuclear protein NIP30. This is the N-terminal 100 amino acids of a family of proteins conserved from plants to humans. The full-length protein has putatively been called NEFA-interacting nuclear protein NIP30; however, no reference could be found to confirm this.	102
401994	pfam10188	Oscp1	Organic solute transport protein 1. Oscp1 is a family of proteins conserved from plants to humans. It is called organic solute transport protein or oxido-red-nitro domain-containing protein 1; however, no reference could be found to confirm the function of the protein.	173
401995	pfam10189	Ints3	Integrator complex subunit 3. The Integrator complex is involved in small nuclear RNA (snRNA) U1 and U2 transcription, and in their 3'-box-dependent processing. This complex associates with the C-terminal domain of RNA polymerase II largest subunit and is recruited to the U1 and U2 snRNA genes. This entry represents subunit 3 of this complex.	225
401996	pfam10190	Tmemb_170	Putative transmembrane protein 170. Tmem170 is a family of putative transmembrane proteins conserved from fungi to nematodes to humans. The protein is only approximately 130 amino acids in length. The function is unknown.	106
337664	pfam10191	COG7	Golgi complex component 7 (COG7). COG7 is a component of the conserved oligomeric Golgi complex which is required for normal Golgi morphology and localization. Mutation in COG7 causes a congenital disorder of glycosylation.	736
401997	pfam10192	GpcrRhopsn4	Rhodopsin-like GPCR transmembrane domain. This region of 270 amino acids is the seven transmembrane alpha-helical domains included within five GPCRRHODOPSN4 motifs of a G-protein-coupled-receptor (GPCR) protein, conserved from nematodes to humans. GPCRs are integral membrane receptors whose intracellular actions are mediated by signalling pathways involving G proteins and downstream secondary messengers.	257
401998	pfam10193	Telomere_reg-2	Telomere length regulation protein. This family is the central conserved 110 amino acid region of a group of proteins called telomere-length regulation or clock abnormal protein-2 which are conserved from plants to humans. The full-length protein regulates telomere length and contributes to silencing of sub-telomeric regions. In vitro the protein binds to telomeric DNA repeats.	112
401999	pfam10195	Phospho_p8	DNA-binding nuclear phosphoprotein p8. P8 is a short 80-82 amino acid protein that is conserved from nematodes to humans. It carries at least one protein kinase C domain suggesting a possible role in signal transduction and it is thought to be a phosphoprotein, but the sites of phosphorylation and the kinases involved remain to be determined.	58
402000	pfam10197	Cir_N	N-terminal domain of CBF1 interacting co-repressor CIR. This is a 45 residue conserved region at the N-terminal end of a family of proteins referred to as CIRs (CBF1-interacting co-repressors). CBF1 (centromere-binding factor 1) acts as a transcription factor that causes repression by binding specifically to GTGGGAA motifs in responsive promoters, and it requires CIR as a co-repressor. CIR binds to histone deacetylase and to SAP30 and serves as a linker between CBF1 and the histone deacetylase complex.	37
402001	pfam10198	Ada3	Histone acetyltransferases subunit 3. Ada3 is a family of proteins conserved from yeasts to humans. It is an essential component of the Ada transcriptional coactivator (alteration/deficiency in activation) complex. Ada3 plays a key role in linking histone acetyltransferase-containing complexes to p53 (tumor suppressor protein) thereby regulating p53 acetylation, stability and transcriptional activation following DNA damage.	123
370877	pfam10199	Adaptin_binding	Alpha and gamma adaptin binding protein p34. p34 is a protein involved in membrane trafficking. It is known to interact with both alpha and gamma adaptin. It has been speculated that p34 may play a chaperone role such as preventing the soluble adaptors from co-assembling with soluble clathrin, or helping to remove the adaptors from the coated vesicle. Another possible function is in aiding the recruitment of soluble adaptors onto the membrane.	93
370878	pfam10200	Ndufs5	NADH:ubiquinone oxidoreductase, NDUFS5-15kDa. This is a family of short, approximately 105 amino acid residue, proteins which form part of NADH:ubiquinone oxidoreductase complex I. Complex I is the first multisubunit inner membrane protein complex of the mitochondrial electron transport chain and it transfers two electrons from NADH to ubiquinone. The protein carries four highly conserved cysteine residues but these do not appear to be in a configuration which would favour metal binding so the exact function of the protein is uncertain.	95
402002	pfam10203	Pet191_N	Cytochrome c oxidase assembly protein PET191. Pet191_N is the conserved N-terminal of a family of conserved proteins found from nematodes to humans. It carries six highly conserved cysteine residues. Pet191 is required for the assembly of active cytochrome c oxidase but does not form part of the final assembled complex.	67
402003	pfam10204	DuoxA	Dual oxidase maturation factor. DuoxA (Dual oxidase maturation factor) is the essential protein necessary for the final release of DUOX2 (an NADPH:O2 oxidoreductase flavoprotein) from the endoplasmic reticulum. Dual oxidases (DUOX1 and DUOX2) constitute the catalytic core of the hydrogen peroxide generator, which generates H2O2 at the apical membrane of thyroid follicular cells, essential for iodination of thyroglobulin by thyroid peroxidases. DuoxA carries five membrane-integral regions including a reverse signal-anchor with external N-terminus (type III) and two N-glycosylation sites. It is conserved from nematodes to humans.	274
402004	pfam10205	KLRAQ	Predicted coiled-coil domain-containing protein. This is the N-terminal 100 amino acid domain of a family of proteins conserved from nematodes to humans. It carries a characteristic KLRAQ sequence-motif. The function is not known.	100
402005	pfam10206	WRW	Mitochondrial F1F0-ATP synthase, subunit f. This is a family of small proteins of approximately 110 amino acids, which are highly conserved from nematodes to humans. Some members of the family have been annotated in Swiss-Prot as being the f subunit of mitochondrial F1F0-ATP synthase but this could not be confirmed. The sequence has a well-conserved WRW motif. The exact function of the protein is not known.	102
402006	pfam10208	Armet	Degradation arginine-rich protein for mis-folding. This is a family of small proteins of approximately 170 residues which contain four di-sulfide bridges that are highly conserved from nematodes to humans. Armet is a soluble protein resident in the endoplasmic reticulum and induced by ER stress. It appears to be involved with dealing with mis-folded proteins in the ER, thus in quality control of ER stress.	145
402007	pfam10209	DUF2340	Uncharacterized conserved protein (DUF2340). This is a family of small proteins of approximately 150 amino acids of unknown function.	120
402008	pfam10210	MRP-S32	Mitochondrial 28S ribosomal protein S32. This entry is of a family of short, approximately 100 amino acid residues, proteins which are mitochondrial 28S ribosomal proteins named as MRP-S32. Their exact function could not be confirmed.	92
402009	pfam10211	Ax_dynein_light	Axonemal dynein light chain. Axonemal dynein light chain proteins play a dynamic role in flagellar and cilia motility. Eukaryotic cilia and flagella are complex organelles consisting of a core structure, the axoneme, which is composed of nine microtubule doublets forming a cylinder that surrounds a pair of central singlet microtubules. This ultra-structural arrangement seems to be one of the most stable micro-tubular assemblies known and is responsible for the flagellar and ciliary movement of a large number of organisms ranging from protozoan to mammals. This light chain interacts directly with the N-terminal half of the heavy chains.	182
402010	pfam10212	TTKRSYEDQ	Predicted coiled-coil domain-containing protein. This is the C-terminal 500 amino acids of a family of proteins with a predicted coiled-coil domain conserved from nematodes to humans. It carries a characteristic TTKRSYEDQ sequence-motif. The function is not known.	523
287217	pfam10213	MRP-S28	Mitochondrial ribosomal subunit protein. This is a conserved region of approx. 125 residues of one of the proteins that makes up the small subunit of the mitochondrial ribosome. In Saccharomyces cerevisiae the protein is MRP-S24 whereas in humans it is MRP-S28. The human mitochondrial ribosome has 29 distinct proteins in the small subunit and these have homologs in, for example, Drosophila melanogaster, Caenorhabditis elegans, and in the genomes of several fungi.	127
402011	pfam10214	Rrn6	RNA polymerase I-specific transcription-initiation factor. RNA polymerase I-specific transcription-initiation factor Rrn6 and Rrn7 represent components of a multisubunit transcription factor essential for the initiation of rDNA transcription by Pol I. These proteins are found in fungi.	847
402012	pfam10215	Ost4	Oligosaccharyltransferase. Ost4 is a very short, approximately 30 residues, enzyme found from fungi to vertebrates. It is a member of the ER oligosaccharyltransferase complex, EC 2.4.1.119, that catalyzes the asparagine-linked glycosylation of proteins. It appears to be an integral membrane protein that mediates the en bloc transfer of a preassembled high-mannose oligosaccharide onto asparagine residues of nascent polypeptides as they enter the lumen of the rough endoplasmic reticulum (RER).	34
402013	pfam10216	ChpXY	CO2 hydration protein (ChpXY). This small family of proteins includes paralogues ChpX and ChpY in Synechococcus sp. PCC7942 and other cyanobacteria, associated with distinct NAD(P)H dehydrogenase complexes. These proteins collectively enable light-dependent CO2 hydration and CO2 uptake; loss of both blocks growth at low CO2 concentrations.	352
402014	pfam10217	DUF2039	Uncharacterized conserved protein (DUF2039). This entry is a region of approximately 100 residues containing three pairs of cysteine residues. The region is conserved from plants to humans but its function is unknown.	89
402015	pfam10218	DUF2054	Uncharacterized conserved protein (DUF2054). This entry contains 14 conserved cysteines, three of which are CC-dimers. The region is of approximately 200 residues in length but its function is unknown.	128
402016	pfam10220	Smg8_Smg9	Smg8_Smg9. Smg8 and Smg9 are two subunits of the Smg-1 complex. They suppress Smg-1 kinase activity in the isolated Smg-1 complex, and are involved in nonsense-mediated mRNA decay (NMD) in both mammals and nematodes.	868
402017	pfam10221	DUF2151	Cell cycle and development regulator. This is a set of proteins conserved from worms to humans. The proteins are a PAN GU kinase substrate, Mat89Bb, essential for S-M cycles of early Drosophila embryogenesis, Xenopus embryonic cell cycles and morphogenesis, and cell division in cultured mammalian cells.	680
287225	pfam10222	DUF2152	Uncharacterized conserved protein (DUF2152). This is a family of proteins conserved from worms to humans. Its function is unknown.	605
402018	pfam10223	DUF2181	Uncharacterized conserved protein (DUF2181). This is a region of approximately 250 residues conserved from worms to humans. Its function is unknown.	240
402019	pfam10224	DUF2205	Predicted coiled-coil protein (DUF2205). This entry represents a highly conserved 100 residue region which is likely to be a coiled-coil structure. The exact function is unknown.	71
402020	pfam10225	NEMP	NEMP family. This entry includes a group of nuclear envelope integral membrane proteins from animals and plants, including NEMP1 from Xenopus laevis. NEMP1 is a RanGTP-binding protein and is involved in eye development.	249
402021	pfam10226	CCDC85	CCDC85 family. This entry includes human CCDC85A/B/C and C. elegans Picc-1 protein. Picc-1 serves as a linker protein which helps to recruit the Rho GTPase-activating protein, pac-1, to adherens junctions. Human CCDC85B suppresses the beta-catenin activity in a p53-dependent manner.	190
402022	pfam10228	DUF2228	Uncharacterized conserved protein (DUF2228). This is a family of conserved proteins of approximately 700 residues found from worms to humans.	250
402023	pfam10229	MMADHC	Methylmalonic aciduria and homocystinuria type D protein. This entry represents methylmalonic aciduria and homocystinuria type D protein and homologs. These proteins are involved in cobalamin (vitamin B12) metabolism.	272
370901	pfam10230	LIDHydrolase	Lipid-droplet associated hydrolase. This family of proteins is conserved from plants to humans. The function is as a lipid-droplet hydrolase in the yeast members.	261
402024	pfam10231	DUF2315	Uncharacterized conserved protein (DUF2315). This is a family of small conserved proteins found from worms to humans. The function is not known.	118
402025	pfam10232	Med8	Mediator of RNA polymerase II transcription complex subunit 8. Arc32, or Med8, is one of the subunits of the Mediator complex of RNA polymerase II. The region conserved contains two alpha helices putatively necessary for binding to other subunits within the core of the Mediator complex. The N-terminus of Med8 binds to the essential core Head part of Mediator and the C-terminus hinges to Med18 on the non-essential part of the Head that also includes Med20.	231
402026	pfam10233	Cg6151-P	Uncharacterized conserved protein CG6151-P. This is a family of small, less than 200 residue long, proteins which are named as CG6151-P proteins that are conserved from fungi to humans. The function is unknown. The fungal members have a characteristic ICP sequence motif. Some members are annotated as putative clathrin-coated vesicle protein but this could not be defined.	113
402027	pfam10234	Cluap1	Clusterin-associated protein-1. This protein is conserved from worms to humans. The protein of 413 amino acids contains a central coiled-coil domain, possibly the region that binds to clusterin. Cluap1 expression is highest in the nucleus and gradually increases during late S to G2/M phases of the cell cycle and returns to the basal level in the G0/G1 phases. In addition, it is upregulated in colon cancer tissues compared to corresponding non-cancerous mucosa. It thus plays a crucial role in the life of the cell.	268
402028	pfam10235	Cript	Microtubule-associated protein CRIPT. The CRIPT protein is a cytoskeletal protein involved in microtubule production. The C-terminal domain is essential for binding to the PDZ3 domain of the SAP90 protein, one of a super-family of PDZ-containing proteins that play an important role in coupling the membrane ion channels with their signalling partners. SAP90 is concentrated in the post synaptic density of glutamatergic neurons.	87
370907	pfam10236	DAP3	Mitochondrial ribosomal death-associated protein 3. This is a family of conserved proteins which were originally described as death-associated-protein-3 (DAP-3). The proteins carry a P-loop DNA-binding motif, and induce apoptosis. DAP3 has been shown to be a pro-apoptotic factor in the mitochondrial matrix and to be crucial for mitochondrial biogenesis and so has also been designated as MRP-S29 (mitochondrial ribosomal protein subunit 29).	310
402029	pfam10237	N6-adenineMlase	Probable N6-adenine methyltransferase. This is a protein of approximately 200 residues which is conserved from plants to humans. It contains a highly conserved QFW motif close to the N-terminus and a DPPF motif in the centre. The DPPF motif is characteristic of N-6 adenine-specific DNA methylases, and this family is found in eukaryotes.	118
402030	pfam10238	Eapp_C	E2F-associated phosphoprotein. This entry represents the conserved C-terminal portion of an E2F binding protein. E2F transcription factors play an essential role in cell proliferation and apoptosis and their activity is frequently deregulated in human cancers. E2F activity is regulated by a variety of mechanisms, frequently mediated by proteins binding to individual members or a subgroup of the family. EAPP interacts with a subset of E2F factors and influences E2F-dependent promoter activity. EAPP is present throughout the cell cycle but disappears during mitosis.	133
402031	pfam10239	DUF2465	Protein of unknown function (DUF2465). FAM98A and B proteins are found from worms to humans but their function is unknown. This entry is of a family of proteins that is rich in glycines.	321
402032	pfam10240	DUF2464	Multivesicular body subunit 12. MVB12A (also known as CFBP) and MVB12B are subunits of the ESCRT-I complex, which mediates the sorting of ubiquitinated cargo protein from the plasma membrane to the endosomal vesicle. MVB12A plays a key role in the ligand-mediated internalization and down-regulation of the EGF receptor.	256
402033	pfam10241	KxDL	Uncharacterized conserved protein. This is a family of short proteins which are conserved over a region of 80 residues. There is a characteristic KxDL motif towards the C-terminus. The function is unknown.	80
370913	pfam10242	L_HMGIC_fpl	Lipoma HMGIC fusion partner-like protein. This is a group of proteins expressed from a series of genes referred to as Lipoma HMGIC fusion partner-like. The proteins carry four highly conserved transmembrane domains in this entry. In certain instances, eg in LHFPL5, mutations cause deafness in humans and hypospadias, and LHFPL1 is transcribed in six liver tumor cell lines.	181
402034	pfam10243	MIP-T3	Microtubule-binding protein MIP-T3. This protein, which interacts with both microtubules and TRAF3 (tumor necrosis factor receptor-associated factor 3), is conserved from worms to humans. The N-terminal region is the microtubule binding domain and is well-conserved; the C-terminal 100 residues, also well-conserved, constitute the coiled-coil region which binds to TRAF3. The central region of the protein is rich in lysine and glutamic acid and carries KKE motifs which may also be necessary for tubulin-binding, but this region is the least well-conserved.	113
402035	pfam10244	MRP-L51	Mitochondrial ribosomal subunit. MRP-L51 is a family of small proteins from the intact 55 S mitochondrial ribosome. It has otherwise been referred to as bMRP-64. The exact function of this family is not known.	93
402036	pfam10245	MRP-S22	Mitochondrial 28S ribosomal protein S22. This is the conserved N-terminus and central portion of the mitochondrial small subunit 28S ribosomal protein S22. Mammalian mitochondria carry out the synthesis of 13 polypeptides that are essential for oxidative phosphorylation and, hence, for the synthesis of the majority of the ATP used by eukaryotic organisms. The number of proteins produced by prokaryotes is smaller, reflected in the lower number of ribosomal proteins present in them.	241
402037	pfam10246	MRP-S35	Mitochondrial ribosomal protein MRP-S35. This is a family of short mitochondrial ribosomal proteins, less than 200 amino acids long, that are highly conserved from worms to humans. The structure has previously been referred to as MRP-S18 but the current numbering fits the preferred nomenclature from these authors.	105
402038	pfam10247	Romo1	Reactive mitochondrial oxygen species modulator 1. This is a family of small, approximately 100 amino acid, proteins found from yeasts to humans. The majority of endogenous reactive oxygen species (ROS) in cells are produced by the mitochondrial respiratory chain. An increase or imbalance in ROS alters the intracellular redox homeostasis, triggers DNA damage, and may contribute to cancer development and progression. Members of this family are mitochondrial reactive oxygen species modulator 1 (Romo1) proteins that are responsible for increasing the level of ROS in cells. Increased Romo1 expression can have a number of other effects including: inducing premature senescence of cultured human fibroblasts and increased resistance to 5-fluorouracil.	66
402039	pfam10248	Mlf1IP	Myelodysplasia-myeloid leukemia factor 1-interacting protein. This entry is the conserved central region of a group of proteins that are putative transcriptional repressors. The structure contains a putative 14-3-3 binding motif involved in the subcellular localization of various regulatory molecules, and it may be that interaction with the transcription factor DREF could be regulated through this motif. DREF regulates proliferation-related genes in Drosophila. Mlf1IP is expressed in both the nuclei and the cytoplasm and thus may have multi-functions.	174
402040	pfam10249	NDUFB10	NADH-ubiquinone oxidoreductase subunit 10. NDUFB10 is a family of conserved proteins of up to 180 residues. It is one of the 41 protein subunits within the hydrophobic fraction of the NADH:ubiquinone oxidoreductase (complex I), a multiprotein complex located in the inner mitochondrial membrane whose main function is the transport of electrons from NADH to ubiquinone, which is accompanied by translocation of protons from the mitochondrial matrix to the intermembrane space. NDUFB10 is encoded in the nucleus.	126
402041	pfam10250	O-FucT	GDP-fucose protein O-fucosyltransferase. This is a family of conserved proteins representing the enzyme responsible for adding O-fucose to EGF (epidermal growth factor-like) repeats. Six highly conserved cysteines are present in O-FucT-1 as well as a DXD-like motif (ERD), conserved in mammals, Drosophila, and C. elegans. Both features are characteristic of several glycosyltransferase families. The enzyme is a membrane-bound protein released by proteolysis and, as for most glycosyltransferases, is strongly activated by manganese.	254
402042	pfam10251	PEN-2	Presenilin enhancer-2 subunit of gamma secretase. This entry is a short, 101-residue protein which is the smallest subunit of the gamma-secretase aspartyl protease complex that catalyzes the intramembrane cleavage of a subset of type I transmembrane proteins. The other active constituents of the complex are presenilin (PS), nicastrin and anterior pharynx defective-1 (APH-1) protein. PEN-2 adopts a hairpin orientation in the membrane with its N- and C-terminal domains facing the luminal/extracellular space, and the C-terminal domain maintains PS stability within the complex.	93
402043	pfam10252	PP28	Casein kinase substrate phosphoprotein PP28. This domain is a region of 70 residues conserved in proteins from plants to humans and contains a serine/arginine rich motif. In rats the full protein is a casein kinase substrate, and this region contains phosphorylation sites for both cAMP-dependent protein kinase and casein kinase II.	80
402044	pfam10253	PRCC	Mitotic checkpoint regulator, MAD2B-interacting. This family constitutes the major, conserved, portion of PRCC proteins. In humans this family interacts with MAD2B, the mitotic checkpoint protein. In Schizosaccharomyces pombe this protein is part of the Cwf-complex that is known to be involved in pre-mRNA splicing.	128
402045	pfam10254	Pacs-1	PACS-1 cytosolic sorting protein. PACS-1 is a cytosolic sorting protein that directs the localization of membrane proteins in the trans-Golgi network (TGN)/endosomal system. PACS-1 connects the clathrin adaptor AP-1 to acidic cluster sorting motifs contained in the cytoplasmic domain of cargo proteins such as furin, the cation-independent mannose-6-phosphate receptor and in viral proteins such as human immunodeficiency virus type 1 Nef.	413
402046	pfam10255	Paf67	RNA polymerase I-associated factor PAF67. RNA polymerase I is a multisubunit enzyme and its transcription competence is dependent on the presence of PAF67. This family of proteins is conserved from worms to humans.	399
402047	pfam10256	Erf4	Golgin subfamily A member 7/ERF4 family. This family of proteins includes Golgin subfamily A member 7 proteins as well as Ras modification protein ERF4.	116
370927	pfam10257	RAI16-like	Retinoic acid induced 16-like protein. This is the conserved N-terminal 450 residues of a family of proteins described as retinoic acid-induced protein 16-like proteins. The exact function is not known. The proteins are found from worms to humans.	357
402048	pfam10258	RNA_GG_bind	PHAX RNA-binding domain. RNA_GG_bind is the highly conserved U3 snoRNA-binding domain of PHAX (phosphorylated adaptor for RNA export) whose function is to transport U3 snoRNA from the nucleus after transcription. It is characterized by having two pairs of adjacent glycines, as GGx12GG.	84
402049	pfam10259	Rogdi_lz	Rogdi leucine zipper containing protein. This is a family of conserved proteins which have been suggested as containing leucine-zipper domains. A leucine zipper domain is a region of 30 amino acids with leucines repeating every seven or eight residues; these proteins do have many such leucines. The protein in Drosophila comes from the gene ROGDI.	295
402050	pfam10260	SAYSvFN	Uncharacterized conserved domain (SAYSvFN). This domain of approximately 75 residues contains a highly conserved SATSv/iFN motif. The function is unknown but the domain is conserved from plants to humans.	65
402051	pfam10261	Scs3p	Inositol phospholipid synthesis and fat-storage-inducing TM. This is a family of transmembrane proteins which are variously annotated as possibly being inositol phospholipid synthesis protein and fat-storage-inducing. The members are conserved from yeasts to humans and are localized to the endoplasmic reticulum where they are involved in triglyceride lipid droplet formation.	242
402052	pfam10262	Rdx	Rdx family. This entry is an approximately 100 residue region of selenoprotein-T, conserved from plants to humans. The protein binds to UDP-glucose:glycoprotein glucosyltransferase (UGTR), the endoplasmic reticulum (ER)-resident protein, which is known to be involved in the quality control of protein folding. Selenium (Se) plays an essential role in cell survival and most of the effects of Se are probably mediated by selenoproteins, including selenoprotein T. However, despite its binding to UGTR and that its mRNA is up-regulated in extended asphyxia, the function of the protein and hence of this region of it is unknown. Selenoprotein W contains selenium as selenocysteine in the primary protein structure and levels of this selenoprotein are affected by selenium.	74
402053	pfam10263	SprT-like	SprT-like family. This family represents a domain found in eukaryotes and prokaryotes. The domain contains a characteristic motif of the zinc metallopeptidases. This family includes the bacterial SprT protein.	105
402054	pfam10264	Stork_head	Winged helix Storkhead-box1 domain. This is the conserved N-terminal winged helix domain of Storkhead-box1 protein which is likely to be a DNA binding domain. In humans the full-length protein controls polyploidization of extravillous trophoblast and is implicated in pre-eclampsia.	79
402055	pfam10265	Miga	Mitoguardin. Mitoguardin (Miga) was first identified in flies as a mitochondrial outer-membrane protein that promotes mitochondrial fusion. Later, the mammalian Miga homologs, Miga1 and Miga2, were identified. They are found to promote mitochondrial fusion by regulating mitochondrial phospholipid metabolism via MitoPLD.	535
402056	pfam10266	Strumpellin	Hereditary spastic paraplegia protein strumpellin. This is a family of proteins conserved from plants to humans, in which two closely situated point mutations in the human protein lead to the condition of hereditary spastic paraplegia. Strumpellin contains one known domain called a spectrin repeat that consists of three alpha-helices of a characteristic length wrapped in a left-handed coiled coil. The spectrin proteins have multiple copies of this repeat, which can then form multimers in the cell. Spectrin associates with the cell membrane via spectrin repeats in the ankyrin protein. The spectrin repeat is a structural platform for cytoskeletal protein assemblies.	1081
402057	pfam10267	Tmemb_cc2	Predicted transmembrane and coiled-coil 2 protein. This family of transmembrane coiled-coil containing proteins is conserved from worms to humans. Its function is unknown.	386
402058	pfam10268	Tmemb_161AB	Predicted transmembrane protein 161AB. Transmemb_161AB is a family of conserved proteins found from worms to humans. Members are putative transmembrane proteins but otherwise the function is not known.	485
402059	pfam10269	Tmemb_185A	Transmembrane Fragile-X-F protein. This is a family of conserved transmembrane proteins that appear in humans to be expressed from a region upstream of the FragileXF site and to be intimately linked with the Fragile-X syndrome. Absence of TMEM185A does not necessarily lead to developmental delay, but might in combination with other, yet unknown, factors. Otherwise, the lack of the TMEM185A protein is either disposable (redundant) or its function can be complemented by the highly similar chromosome 2 retro-pseudogene product, TMEM185B.	245
402060	pfam10270	MMgT	Membrane magnesium transporter. This entry represents a novel family of membrane magnesium transporters (MMgT). The proteins, MMgT1 and MMgT2, are localized to the Golgi complex and post-Golgi vesicles, including the early endosomes, suggesting that they may provide regulated pathways for Mg(2+) transport in the Golgi and post-Golgi organelles of epithelium-derived cells.	117
402061	pfam10271	Tmp39	Putative transmembrane protein. This is a family of conserved proteins found from worms to humans. They are putative transmembrane proteins but the function is unknown.	429
402062	pfam10272	Tmpp129	Putative transmembrane protein precursor. This is a family of proteins conserved from worms to humans. The proteins are purported to be transmembrane protein-precursors but the function is unknown.	351
402063	pfam10273	WGG	Pre-rRNA-processing protein TSR2. This entry represents the central conserved section of a family of proteins described as pre-rRNA-processing protein TSR2. The region has a distinctive WGG motif but the function is unknown.	80
402064	pfam10274	ParcG	Parkin co-regulated protein. This family of proteins is transcribed anti-sense along the DNA to the Parkin gene product and the two appear to be transcribed under the same promoter. The protein has predicted alpha-helical and beta-sheet domains which suggest its function is in the ubiquitin/proteasome system. Mutations in parkin are the genetic cause of early-onset and autosomal recessive juvenile parkinsonism.	183
402065	pfam10275	Peptidase_C65	Peptidase C65 Otubain. This family of proteins conserved from plants to humans is a highly specific ubiquitin iso-peptidase that removes ubiquitin from proteins. The modification of cellular proteins by ubiquitin (Ub) is an important event that underlies protein stability and function in eukaryote being a dynamic and reversible process. Otubain carries several key conserved domains: (i) the OTU (ovarian tumor domain) in which there is an active cysteine protease triad (ii) a nuclear localization signal, (iii) a Ub interaction motif (UIM)-like motif phi-xx-A-xxxs-xx-Ac (where phi indicates an aromatic amino acid, x indicates any amino acid and Ac indicates an acidic amino acid), (iv) a Ub-associated (UBA)-like domain and (v) the LxxLL motif.	242
402066	pfam10276	zf-CHCC	Zinc-finger domain. This is a short zinc-finger domain conserved from fungi to humans. It is Cx8Hx14Cx2C.	37
402067	pfam10277	Frag1	Frag1/DRAM/Sfk1 family. This family includes Frag1, DRAM and Sfk1 proteins. Frag1 (FGF receptor activating protein 1) is a protein that is conserved from fungi to humans. There are four potential iso-prenylation sites throughout the peptide, viz CILW, CIIW and CIGL. Frag1 is a membrane-spanning protein that is ubiquitously expressed in adult tissues suggesting an important cellular function. Dram is a family of proteins conserved from nematodes to humans with six hydrophobic transmembrane regions and an Endoplasmic Reticulum signal peptide. It is a lysosomal protein that induces macro-autophagy as an effector of p53-mediated death, where p53 is the tumor-suppressor gene that is frequently mutated in cancer. Expression of Dram is stress-induced. This region is also part of a family of small plasma membrane proteins, referred to as Sfk1, that may act together with or upstream of Stt4p to generate normal levels of the essential phospholipid PI4P, thus allowing proper localization of Stt4p to the actin cytoskeleton.	219
287279	pfam10278	Med19	Mediator of RNA pol II transcription subunit 19. Med19 represents a family of conserved proteins which are members of the multi-protein co-activator Mediator complex. Mediator is required for activation of RNA polymerase II transcription by DNA binding transactivators.	178
192511	pfam10279	Latarcin	Latarcin precursor. This family represents the precursor proteins for a number of short antimicrobial peptides called Latarcins. Latarcins were discovered in the venom of the spider Lachesana tarabaevi. Latarcins are likely to adopt amphipathic alpha-helical structure in the plasma membrane.	64
402068	pfam10280	Med11	Mediator complex protein. Mediator is a large, modular protein complex that is conserved from yeast to human and conveys regulatory signals from DNA-binding transcription factors to RNA polymerase II. Not only are the polypeptides conserved but the structural organisation is also largely conserved. One or two subunits are either fungal or vertebral specific but Med11 is one of the subunits that is conserved from fungi to humans. Med11 appears to be necessary for the full and successful assembly of the core head sub-region.	119
402069	pfam10281	Ish1	Putative stress-responsive nuclear envelope protein. This entry represents a repeat found in the fungal protein Ish1, a putative stress-responsive nuclear envelope protein.	37
402070	pfam10282	Lactonase	Lactonase, 7-bladed beta-propeller. This entry contains bacterial 6-phosphogluconolactonases (6PGL)YbhE-type (EC:3.1.1.31) which hydrolyze 6-phosphogluconolactone to 6-phosphogluconate. The entry also contains the fungal muconate lactonising enzyme carboxy-cis,cis-muconate cyclase (EC:5.5.1.5) and muconate cycloisomerase (EC:5.5.1.1), which convert cis,cis-muconates to muconolactones and vice versa as part of the microbial beta-ketoadipate pathway. Structures of proteins in this family have revealed a 7-bladed beta-propeller fold.	340
402071	pfam10283	zf-CCHH	Zinc-finger (CX5CX6HX5H) motif. This domain is a zinc-finger motif that in humans is part of the APLF, aprataxin- and PNK-like forkhead association domain-containing protein. The ZnF is highly conserved both in primary sequence and in the spacing between the putative zinc coordinating residues and is configured CX5CX6HX5H. Many of the proteins containing the APLF-like ZnF are involved in DNA strand break repair and/or contain domains implicated in DNA metabolism.	26
118808	pfam10284	Luciferase_3H	Luciferase helical bundle domain. This domain is found associated with the catalytic domain of dinoflagellate luciferase. Luciferase is involved in catalyzing the light emitting reaction in bioluminescence. The structure of this domain has been solved. This domain has a three helix bundle structure that holds four important histidines that are thought to play a role in the pH regulation of the enzyme.	66
118809	pfam10285	Luciferase_cat	Luciferase catalytic domain. This domain is the catalytic domain of dinoflagellate luciferase. Luciferase is involved in catalyzing the light emitting reaction in bioluminescence. The structure of this domain has been solved. The core part of the domain is a 10 stranded beta barrel that is structurally similar to lipocalins and FABP.	296
402072	pfam10287	DUF2401	Putative TOS1-like glycosyl hydrolase (DUF2401). This family of proteins is conserved in fungi. One member is annotated putatively as OPEL, a house-keeping protein, but this could not be confirmed. It contains 5 highly conserved cysteines two of which form a characteristic CGC sequence motif. It has recently been shown that this family is related to known glycosyl hydrolases.	223
402073	pfam10288	CTU2	Cytoplasmic tRNA 2-thiolation protein 2. CTU2 is a family of proteins necessary for the formation of the wobble nucleoside 5-methoxycarbonylmethyl-2-thiouridine in Saccharomyces cerevisiae. The family is conserved from plants to humans. It plays a central role in the 2-thiolation of 5-methoxycarbonylmethyl-2-thiouridine, or the wobble nucleoside. This wobble modification in tRNAs, 5-methoxycarbonylmethyl-2-thiouridine (mcm(5)s(2)U), is required for the proper decoding of NNR codons in eukaryotes. The 2-thio group gives rigidity by largely fixing the C3'-endo ribose puckering, ensuring stable and accurate codon-anticodon pairing.	106
402074	pfam10290	DUF2403	Glycine-rich protein domain (DUF2403). This domain is found in the N-terminal region of members of DUF2401 pfam10287. The function of this glycine-rich region is unknown.	59
402075	pfam10291	muHD	Muniscin C-terminal mu homology domain. The muniscins are a family of endocytic adaptors that is conserved from yeast to humans. This C-terminal domain is structurally similar to mu homology domains, and is the region of the muniscin proteins involved in the interactions with the endocytic adaptor-scaffold proteins Ede1-eps15. This interaction influences muniscin localization. The muniscins provide a combined adaptor-membrane-tubulation activity that is important for regulating endocytosis.	255
370953	pfam10292	7TM_GPCR_Srab	Serpentine type 7TM GPCR receptor class ab chemoreceptor. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srab is part of the Sra superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The expression pattern of the srab genes is biologically intriguing. Of the six promoters successfully expressed in transgenic organisms, one was exclusively expressed in the tail phasmid neurons, two were exclusively expressed in a head amphid neuron, and two were expressed both in the head and tail neurons as well as a limited number of other cells.	324
402076	pfam10293	DUF2405	Domain of unknown function (DUF2405). This is a conserved region of a family of proteins conserved in fungi. The function is unknown.	152
313513	pfam10294	Methyltransf_16	Lysine methyltransferase. Methyltrans_16 is a lysine methyltransferase. Characterized members of this family are protein methyltransferases targeting Lys residues in specific proteins, including calmodulin, VCP, Kin17 and Hsp70 proteins.	172
402077	pfam10295	DUF2406	Uncharacterized protein (DUF2406). This is a family of small proteins conserved in fungi. The function is not known.	58
402078	pfam10296	MMM1	Maintenance of mitochondrial morphology protein 1. MMM1 is conserved from plants to humans. MMM1 is an integral ER protein. It is N-glycosylated, and forms a complex with Mdm10, Mdm12 and Mdm34 to tether the mitochondria to the endoplasmic reticulum.	314
402079	pfam10297	Hap4_Hap_bind	Minimal binding motif of Hap4 for binding to Hap2/3/5. In Saccharomyces cerevisiae, the haem-activated protein complex Hap2/3/4/5 plays a major role in the transcription of genes involved in respiration. Hap4_Hap_bind is the essential domain of Hap4 which allows it to associate with Hap2, Hap3 and Hap5 to form the Hap complex.	17
402080	pfam10298	WhiA_N	WhiA N-terminal LAGLIDADG-like domain. This domain is found at the N terminal of sporulation factor WhiA. This domain is related to the LAGLIDADG Homing endonuclease domain while the C terminal domain of WhiA is predicted to be a DNA binding helix-turn-helix domain.	86
402081	pfam10300	DUF3808	Protein of unknown function (DUF3808). This is a family of proteins conserved from fungi to humans. Members of this family also carry a TPR_2 domain pfam07719 at their C-terminus.	474
370959	pfam10302	DUF2407	DUF2407 ubiquitin-like domain. This is a family of proteins found in fungi. The function is not known. This domain is related to the ubiquitin domain.	101
402082	pfam10303	DUF2408	Protein of unknown function (DUF2408). This is a family of proteins conserved in fungi. The function is unknown.	128
402083	pfam10304	RTP1_C2	Required for nuclear transport of RNA pol II C-terminus 2. This domain is found towards the C-terminus of required for the nuclear transport of RNA pol II protein (RTP1). RTP1 is required for the nuclear localization of RNA polymerase II. This family is found in association with pfam10363.	34
402084	pfam10305	Fmp27_SW	RNA pol II promoter Fmp27 protein domain. Fmp27_SW is a conserved domain of a family of proteins involved in RNA polymerase II transcription initiation. It contains characteristic SW and GKG sequence motifs.	101
402085	pfam10306	FLILHELTA	Hypothetical protein FLILHELTA. This is a family of conserved proteins found in fungi. It contains a characteristic FL(I)LHE(L)TA sequence motif, where the bracketed residues are I, L or V. The function is not known.	82
402086	pfam10307	DUF2410	Hypothetical protein (DUF2410). This is a family of proteins conserved in fungi. The function is not known. There are two characteristic sequence motifs, GGWW and TGR.	198
402087	pfam10309	NCBP3	Nuclear cap-binding protein subunit 3. NCBP3 and NCBP1 form an alternative cap-binding complex in higher eukaryotes. NCBP3 binds mRNA, associates with components of the mRNA processing machinery and contributes to polyA RNA export.	59
402088	pfam10310	DUF5427	Family of unknown function (DUF5427). This is a domain of unknown function. Family members found in Saccharomyces cerevisiae are synthetic lethal with genes involved in maintenance of telomere capping. However, experimental evidence is yet to verify the exact function of family members and the domain.	456
402089	pfam10311	Ilm1	Increased loss of mitochondrial DNA protein 1. This is a family of proteins of approximately 200 residues that are conserved in fungi. Ilm1 is part of the peroxisome, a complex that is the sole site of beta-oxidation in Saccharomyces cerevisiae and known to be required for optimal growth in the presence of fatty acid. Ilm1 may participate in the control of the C16/C18 ratio since it interacts strongly with Mga2p, a transcription factor that controls expression of Ole1, the sole fatty acyl desaturase in S. cerevisiae responsible for conversion of the saturated fatty acids stearate (C18) and palmitate (C16) to oleate and palmitoleate, respectively.	160
402090	pfam10312	Cactin_mid	Conserved mid region of cactin. This is the conserved middle region of a family of proteins referred to as cactins. The region contains two of three predicted coiled-coil domains. Most members of this family have a CactinC_cactus pfam09732 domain at the C-terminal end. Upstream of Mid_cactin in Drosophila members are a serine-rich region, some non-typical RD motifs and three predicted bipartite nuclear localization signals, none of which are well-conserved. Cactin associates with IkappaB-cactus as one of the intracellular members of the Rel (NF-kappaB) pathway which is conserved in invertebrates and vertebrates. In mammals, this pathway controls the activities of the immune and inflammatory response genes as well as viral genes, and is critical for cell growth and survival. In Drosophila, the Rel pathway functions in the innate cellular and humoral immune response, in muscle development, and in the establishment of dorsal-ventral polarity in the early embryo.	186
402091	pfam10313	DUF2415	Uncharacterized protein domain (DUF2415). This is a short, 30 residue domain, from a family of proteins conserved in fungi. The function is unknown. There is a characteristic DLL sequence motif.	43
402092	pfam10315	Aim19	Altered inheritance of mitochondria protein 19. This is a family of conserved proteins found in fungi. The function is not known.	111
402093	pfam10316	7TM_GPCR_Srbc	Serpentine type 7TM GPCR chemoreceptor Srbc. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srbc is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'.	273
402094	pfam10317	7TM_GPCR_Srd	Serpentine type 7TM GPCR chemoreceptor Srd. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srd is part of the larger Str superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'.	292
402095	pfam10318	7TM_GPCR_Srh	Serpentine type 7TM GPCR chemoreceptor Srh. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srh is part of the Str superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'.	302
370974	pfam10319	7TM_GPCR_Srj	Serpentine type 7TM GPCR chemoreceptor Srj. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srj is part of the Str superfamily of chemoreceptors. The srj family is designated as the out-group based on its location in preliminary phylogenetic analyses of the entire superfamily. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'.	310
255903	pfam10320	7TM_GPCR_Srsx	Serpentine type 7TM GPCR chemoreceptor Srsx. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'.	257
402096	pfam10321	7TM_GPCR_Srt	Serpentine type 7TM GPCR chemoreceptor Srt. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srt is a member of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'.	313
370976	pfam10322	7TM_GPCR_Sru	Serpentine type 7TM GPCR chemoreceptor Sru. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Sru is a member of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'.	304
370977	pfam10323	7TM_GPCR_Srv	Serpentine type 7TM GPCR chemoreceptor Srv. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srv is a member of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'.	283
402097	pfam10324	7TM_GPCR_Srw	Serpentine type 7TM GPCR chemoreceptor Srw. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz.	318
402098	pfam10325	7TM_GPCR_Srz	Serpentine type 7TM GPCR chemoreceptor Srz. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srz is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srz appear to be under strong adaptive evolutionary pressure.	265
402099	pfam10326	7TM_GPCR_Str	Serpentine type 7TM GPCR chemoreceptor Str. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Str is a member of the Str superfamily of chemoreceptors. Almost a quarter (22.5%) of str and srj family genes and pseudogenes in C. elegans appear to have been newly formed by gene duplications since the species split. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'.	307
313539	pfam10327	7TM_GPCR_Sri	Serpentine type 7TM GPCR chemoreceptor Sri. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Sri is part of the Str superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'.	303
402100	pfam10328	7TM_GPCR_Srx	Serpentine type 7TM GPCR chemoreceptor Srx. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'.	262
402101	pfam10329	DUF2417	Region of unknown function (DUF2417). This is a region of a family of proteins conserved in fungi some of whose members also have the Abhydrolase_1, pfam00561, domain in their sequence. The function of this region is not known.	234
402102	pfam10330	Stb3	Putative Sin3 binding protein. This is a family of the conserved N-terminal end of a group of proteins conserved in fungi. It is likely to be a Sin3 binding protein. Sin3p does not bind DNA directly even though the yeast SIN3 gene functions as a transcriptional repressor. Sin3p is part of a large multiprotein complex. Stb3 appears to bind directly to ribosomal RNA Processing Elements (RRPE) although there are no obvious domains which would accord with this, implying that Stb3 may be a novel RNA-binding protein.	79
402103	pfam10332	DUF2418	Protein of unknown function (DUF2418). This is a conserved 100 residue central region of a family of proteins found in fungi. It carries a characteristic EYD sequence motif. The function is not known.	96
370985	pfam10333	Pga1	GPI-Mannosyltransferase II co-activator. Pga1 is found only in yeasts and not in mammals. It localizes in the ER as a glycosylated integral membrane protein. It binds to the GPI-mannosyltransferase II subunit of the GPI and it is responsible for the second mannose addition to GPI precursors. The GPI-anchoring complex is a glycolipid that functions as a membrane anchor for many cell-surface proteins.	174
370986	pfam10334	ArAE_2	Aromatic acid exporter family member 2. This is a family of proteins conserved in fungi. The function is not known.	228
402104	pfam10335	DUF294_C	Putative nucleotidyltransferase substrate binding domain. This domain is found associated with presumed nucleotidyltransferase domains and seems to be distantly related to other helical substrate binding domains.	144
402105	pfam10336	DUF2420	Protein of unknown function (DUF2420). This is a family of proteins conserved in fungi. The function is not known.	107
370988	pfam10337	ArAE_2_N	Putative ER transporter, 6TM, N-terminal. This is a family of proteins conserved in fungi. The function is not known. This family is the C-terminal half of some member proteins which contain the DUF2421 pfam10334 domain at their N-terminus. These proteins are putative endoplasmic reticulum transporters, with a total of 12 TMs.	470
402106	pfam10338	DUF2423	Protein of unknown function (DUF2423). This is a family of proteins conserved in fungi. The function is not known.	44
402107	pfam10339	Vel1p	Yeast-specific zinc responsive. This is a small family of proteins from Saccharomyces and related species. The function is not known but member proteins are highly induced in zinc-depleted conditions and have increased expression in NAP1-deletion mutants. The S. cerevisiae genes are named VEL by association with Velum formation in the wine making process http://www.ajevonline.org/content/48/1/55.abstract	202
313549	pfam10340	Say1_Mug180	Steryl acetyl hydrolase. This entry includes budding yeast steryl acetyl hydrolase 1 (Say1) and fission yeast Mug180. Say1 is a membrane-anchored deacetylase required for the deacetylation of acetylated sterols. It is involved in the resistance to eugenol and pregnenolone toxicity. Mug180 has a role in meiosis.	374
402108	pfam10341	TPP1	Shelterin complex subunit, TPP1/ACD. TPP1 is a component of the telomerase holoenzyme, involved in telomere replication. It has been demonstrated that TPP1 dimerizes and binds to DNA and RNA. Furthermore, TPP1 stimulates the dissociation of RNA/DNA hetero-duplexes. Yeast telomerase protein TPP1 (Est3 in yeast) is a novel type of GTPase. The key residues in yeast EST3 are an Asp at residue 86 and the Arg at residue 110. The Asp is totally conserved in the family, whereas the Arg is not so well conserved. The N-terminal of TPP1 is likely to be the binding surface for TINF2, whereas the C-terminus probably binds to POT1, thereby tethering POT1 to the shelterin complex. The complex bound to telomeric DNA increases the activity and processivity of the human telomerase core enzyme, thus helping to maintain the length of the telomeres. This domain is conserved from fungi to mammals, hence family Telomere_Pot1 has been merged into the family. The human shelterin complex includes six proteins: telomere repeat binding factor 1 (TRF1), TRF2, repressor/activator protein 1 (RAP1), TRF1-interacting nuclear protein 2 (TIN2), TIN2-interacting protein 1 (TPP1) and protection of telomeres 1 (POT1).	105
370992	pfam10342	GPI-anchored	Ser-Thr-rich glycosyl-phosphatidyl-inositol-anchored membrane family. Some members of this family appear to be serine- threonine-rich membrane-anchored proteins, anchored by glycosyl-phosphatidylinositol. In A. fumigatus these proteins play a role in fungal cell wall organisation. In Lentinula edodes this family is involved in fruiting body formation, and may have a more general role in signalling in other organisms as it interacts with MAPK. The family is also found in archaea and bacteria.	93
402109	pfam10343	Q_salvage	Potential Queuosine, Q, salvage protein family. Q_salvage proteins occur in most Eukarya as well as in a few bacteria, possibly via horizontal gene transfer. Queuosine (Q) is a chemical modification found at the wobble position of tRNAs that have GUN anticodons. Most bacteria synthesize queuosine de novo, whereas eukaryotes rely solely on salvaging this essential component from the environment or the gut flora. The exact enzymatic function of the domain has yet to be determined, but structural similarity with DNA glycosidases suggests a ribonucleoside hydrolase role.	285
402110	pfam10344	Fmp27	Mitochondrial protein from FMP27. This family contains mitochondrial FMP27 proteins which in yeasts together with SEN1 are long genes that exist in a looped conformation, effectively bringing together their promoter and terminator regions. Pol-II is located at both ends of FMP27 when this gene is transcribed from a GAL1 promoter under induced and non-induced conditions. The exact function of the Fmp27 protein is not certain.	864
402111	pfam10345	Cohesin_load	Cohesin loading factor. Cohesin_load is a common cohesin loading factor protein that is conserved in fungi. It is associated with the cohesin complex and is required in G1 for cohesin binding to chromosomes but dispensable in G2 when cohesion has been established. It is referred to as both Ssl3, in pombe, and Scc4, in S.cerevisiae. It complexes with Mis4.	594
402112	pfam10346	Con-6	Conidiation protein 6. Con-6 is the conserved N-terminal region of a family of small proteins found in fungi. It is expressed at approximately 6 hours after the induction of development and is induced just prior to major constriction-chain growth.	33
402113	pfam10347	Fmp27_GFWDK	RNA pol II promoter Fmp27 protein domain. Fmp27_GFWDK is a conserved domain of a family of proteins involved in RNA polymerase II transcription initiation. It contains characteristic GFWDK sequence motifs. Some members are associated with domain Fmp27_SW (pfam10305) towards the N-terminus.	155
402114	pfam10348	DUF2427	Domain of unknown function (DUF2427). This is the N-terminal region of a family of proteins conserved in fungi. Several members are annotated as being Ftp1 but this could not be confirmed. The function is not known.	105
402115	pfam10350	DUF2428	Putative death-receptor fusion protein (DUF2428). This is a family of proteins conserved from plants to humans. The function is not known. Several members have been annotated as being HEAT repeat-containing proteins while others are designated as death-receptor interacting proteins, but neither of these could be confirmed.	256
402116	pfam10351	Apt1	Golgi-body localization protein domain. This is the C-terminus of a family of proteins conserved from plants to humans. The plant members are localized to the Golgi proteins and appear to regulate membrane trafficking, as they are required for rapid vesicle accumulation at the tip of the pollen tube. The C-terminus probably contains the Golgi localization signal and it is well-conserved.	468
118874	pfam10353	DUF2430	Protein of unknown function (DUF2430). This is a family of short, 111 residue, proteins found in S. pombe. The function is not known.	107
402117	pfam10354	DUF2431	Domain of unknown function (DUF2431). This is the N-terminal domain of a family of proteins found from plants to humans. The function is not known.	163
402118	pfam10355	Ytp1	Protein of unknown function (Ytp1). This is a family of proteins found in fungi. The region appears to contain regions similar to mitochondrial electron transport proteins. The C-terminal domain is hydrophobic and negatively charged. There are consensus sites for both N-linked glycosylation and cAMP-dependent protein kinase phosphorylation.	274
287343	pfam10356	DUF2034	Protein of unknown function (DUF2034). This protein is expressed in fungi but its function is unknown.	185
402119	pfam10357	Kin17_mid	Domain of Kin17 curved DNA-binding protein. Kin17_mid is the conserved central 169 residue region of a family of Kin17 proteins. Towards the N-terminal end there is a zinc-finger domain, and in human and mouse members there is a RecA-like domain further downstream. The Kin17 protein in humans forms intra-nuclear foci during cell proliferation and is re-distributed in the nucleoplasm during the cell cycle.	123
402120	pfam10358	NT-C2	N-terminal C2 in EEIG1 and EHBP1 proteins. This version of the C2 domain was initially identified in the vertebrate estrogen early-induced gene 1 (EEIG1), and its Drosophila ortholog required for uptake of dsRNA via the endocytotic machinery to induce RNAi silencing. It is also in C.elegans ortholog Sym-3 (SYnthetic lethal with Mec-3) and the mammalian protein EHBP1 (EH domain Binding Protein-1) that regulates endocytotic recycling and two plant proteins, RPG that regulates Rhizobium-directed polar growth and PMI1 (Plastid Movement Impaired 1) that is essential for intracellular movement of chloroplasts in response to blue light.	143
402121	pfam10359	Fmp27_WPPW	RNA pol II promoter Fmp27 protein domain. Fmp27_WPPW is a conserved domain of a family of proteins involved in RNA polymerase II transcription initiation. It contains characteristic HQR and WPPW sequence motifs, and is towards the C-terminal in members which contain Fmp27_SW pfam10305.	481
402122	pfam10360	DUF2433	Protein of unknown function (DUF2433). This is a conserved 120 residue region of a family of proteins found in fungi. The function is not known.	95
402123	pfam10361	DUF2434	Protein of unknown function (DUF2434). This is a family of proteins conserved in fungi. The function is not known.	294
402124	pfam10363	RTP1_C1	Required for nuclear transport of RNA pol II C-terminus 1. This domain is found towards the C-terminus of required for the nuclear transport of RNA pol II protein (RTP1). RTP1 is required for the nuclear localization of RNA polymerase II. This family is found in association with pfam10304.	108
402125	pfam10364	NKWYS	Putative capsular polysaccharide synthesis protein. Found only in Vibrio species, pombe and one other fungus, this is the N-terminal 150 residues of a family of proteins of unknown function. There is a characteristic NKWYS sequence motif.	132
287351	pfam10365	DUF2436	Domain of unknown function (DUF2436). This domain is found on peptidase C25 proteins and has no known function.	164
371008	pfam10366	Vps39_1	Vacuolar sorting protein 39 domain 1. This domain is found on the vacuolar sorting protein Vps39 which is a component of the C-Vps complex. Vps39 is thought to be required for the fusion of endosomes and other types of transport intermediates with the vacuole. In Saccharomyces cerevisiae, Vps39 has been shown to stimulate nucleotide exchange. The precise function of this domain has not been characterized.	108
402126	pfam10367	Vps39_2	Vacuolar sorting protein 39 domain 2. This domain is found on the vacuolar sorting protein Vps39 which is a component of the C-Vps complex. Vps39 is thought to be required for the fusion of endosomes and other types of transport intermediates with the vacuole. In Saccharomyces cerevisiae, Vps39 has been shown to stimulate nucleotide exchange. This domain is involved in localization and in mediating the interactions of Vps39 with Vps11.	109
402127	pfam10368	YkyA	Putative cell-wall binding lipoprotein. YkyA is a family of proteins containing a lipoprotein signal and a hydrolase domain. It is similar to cell wall binding proteins and might also be recognisable by a host immune defense system. It is thus likely to belong to pathways important for pathogenicity.	185
402128	pfam10369	ALS_ss_C	Small subunit of acetolactate synthase. ALS_ss_C is the C-terminal half of a family of proteins which are the small subunits of acetolactate synthase. Acetolactate synthase is a tetrameric enzyme, containing probably two large and two small subunits, which catalyzes the first step in branched-chain amino acid biosynthesis. This reaction is sensitive to certain herbicides.	73
402129	pfam10370	DUF2437	Domain of unknown function (DUF2437). This is the N-terminal 50 amino acids of a group of bacterial proteins annotated as fumarylacetoacetate hydrolase-containing enzymes. In most cases members are associated with FAA_hydrolase pfam01557 further towards the C-terminus.	50
402130	pfam10371	EKR	Domain of unknown function. EKR is a short, 33 residue, domain found in bacterial and some lower eukaryotic species which lies between a POR (pyruvate ferredoxin/flavodoxin oxidoreductase) pfam01558 and the 4Fe-4S binding domain Fer4 pfam00037. It contains a characteristic EKR sequence motif. The exact function of this domain is not known.	54
402131	pfam10372	YojJ	Bacterial membrane-spanning protein N-terminus. YojJ is the N-terminus of a family of bacterial proteins some of which are associated with DUF147 pfam02457 towards the C-terminus. It is a putative membrane-spanning protein.	69
402132	pfam10373	EST1_DNA_bind	Est1 DNA/RNA binding domain. Est1 is a protein which recruits or activates telomerase at the site of polymerization. This is the DNA/RNA binding domain of EST1.	279
402133	pfam10374	EST1	Telomerase activating protein Est1. Est1 is a protein which recruits or activates telomerase at the site of polymerization. Structurally it resembles a TPR-like repeat.	130
402134	pfam10375	GRAB	GRIP-related Arf-binding domain. The GRAB (GRIP-related Arf-binding) domain is towards the C-terminus of Rud3 type proteins. This domain is related to the GRIP domain, but the conserved tyrosine residue found at position 4 in all GRIP domains is replaced by a leucine residue. The Arf small GTPase is localized to the cis-Golgi where it recruits proteins via their GRAB domain, as part of the transport of cargo from the endoplasmic reticulum to the plasma membrane.	49
371014	pfam10376	Mei5	Double-strand recombination repair protein. Mei5 is one of a pair of meiosis-specific proteins which facilitate the loading of Dmc1 on to Rad51 on DNA at double-strand breaks during recombination. Recombination is carried out by a large protein complex based around the two RecA homologs, Rad51 and Dmc1. This complex may play both a catalytic and a structural role in the interaction between homologous chromosomes during meiosis. Mei5 is seen to contain a coiled-coil region.	207
402135	pfam10377	ATG11	Autophagy-related protein 11. The function of this family is conflicting. In the fission yeast, Schizosaccharomyces pombe, this protein has been shown to interact with the telomere cap complex. However, in budding yeast, Saccharomyces cerevisiae, this protein is called ATG11 and is shown to be involved in autophagy.	131
402136	pfam10378	RRM	Putative RRM domain. This is a putative RRM, RNA-binding, domain found only in fungi. It occurs in proteins annotated as Nrd1 yeast proteins, which are known to carry RRM domains. It is not homologous with any of the other RRM domains, eg RRM_1 pfam00076.	46
402137	pfam10379	nec1	Virulence protein nec1. This is a family of virulence proteins that are found in pathogenic Streptomyces species.	184
371017	pfam10380	CRF1	Transcription factor CRF1. CRF1 is a transcription factor that co-represses ribosomal genes with FHL1 via the TOR signalling pathway and protein kinase A.	122
371018	pfam10381	Autophagy_C	Autophagocytosis associated protein C-terminal. Autophagocytosis is a starvation-induced process responsible for transport of cytoplasmic proteins to the vacuole. The small C-terminal domain is likely to be a distinct binding region for the stability of the autophagosome complex. It carries a highly characteristic conserved FLKF sequence motif.	25
402138	pfam10382	DUF2439	Protein of unknown function (DUF2439). Proteins in this family have been implicated in telomere maintenance in Saccharomyces cerevisiae and in meiotic chromosome segregation in Schizosaccharomyces pombe.	74
402139	pfam10383	Clr2	Transcription-silencing protein Clr2. Clr2 is a chromatin silencing protein, one of a quartet of proteins forming the core of SHREC, a multienzyme effector complex that mediates hetero-chromatic transcriptional gene silencing in fission yeast. Clr2 does not have any obvious well-conserved domains but, along with the other core proteins, binds to the histone deacetylase Clr3, and on its own might also have a role in chromatin organisation at the cnt domain, the site of kinetochore assembly.	138
402140	pfam10384	Scm3	Centromere protein Scm3. Scm3 is a centromere protein that has been shown in Saccharomyces cerevisiae to be required for G2/M progression and Cse4 localization. The C terminal region of Scm3 proteins is variable in size and sometimes consists of DNA binding motifs.	53
402141	pfam10385	RNA_pol_Rpb2_45	RNA polymerase beta subunit external 1 domain. RNA polymerases catalyze the DNA-dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared with three in eukaryotes (not including mitochondrial or chloroplast polymerases). This domain in prokaryotes spans the gap between domains 4 and 5 of the yeast protein. It is also known as the external 1 region of the polymerase and is bound in association with the external 2 region.	66
402142	pfam10386	DUF2441	Protein of unknown function (DUF2441). This is a family of highly conserved, predicted, proteins from Bacillus species. The structure forms a homo-dimer. The function is unknown.	141
402143	pfam10387	DUF2442	Protein of unknown function (DUF2442). This family of bacterial and fungal proteins has several members annotated as being putative molybdopterin-guanine dinucleotide biosynthesis protein A; however this could not be verified. Hence the function is not known. This family also includes the DUF3532 that was found to be related and was merged into this family. Members of this family also fall into the NE0471 N-terminal domain-like superfamily, a family of proteins with a unique fold in SCOP:143880.	72
402144	pfam10388	YkuI_C	EAL-domain associated signalling protein domain. In Bacillus species this highly conserved region of the YkuI protein lies immediately downstream of the EAL (diguanylate cyclase/phosphodiesterase domain 2) pfam00563 domain so that together they form a monomer which dimerizes for its enzymatic action. The region contains three alpha helices and five beta strands and is the C-terminal half of the structure.	166
313589	pfam10389	CoatB	Bacteriophage coat protein B. CoatB is a single filamentous bacteriophage alpha helix of approximately 44 residues. It is likely to assemble into a complex of 35 monomers in a Catherine-wheel like formation. It is the major coat protein of the virion.	46
402145	pfam10390	ELL	RNA polymerase II elongation factor ELL. ELL is a family of RNA polymerase II elongation factors. It is bound stably to elongation-associated factors 1 and 2, EAFs, and together these act as a strong regulator of transcription activity by direct interaction with Pol II. ELL binds to pol II on its own but the affinity is greatly increased by the cooperation of EAF. Some members carry an Occludin domain pfam07303 just downstream. There is no S. cerevisiae member.	281
402146	pfam10391	DNA_pol_lambd_f	Fingers domain of DNA polymerase lambda. DNA polymerases catalyze the addition of dNMPs onto the 3-prime ends of DNA chains. There is a general polymerase fold consisting of three subdomains that have been likened to the fingers, palm, and thumb of a right hand. DNA_pol_lambd_f is the central three-helical region of DNA polymerase lambda referred to as the F and G helices of the fingers domain. Contacts with DNA involve this conserved helix-hairpin-helix motif in the fingers region which interacts with the primer strand. This motif is common to several DNA binding proteins and confers a sequence-independent interaction with the DNA backbone.	50
192566	pfam10392	COG5	Golgi transport complex subunit 5. The COG complex, the peripheral membrane oligomeric protein complex involved in intra-Golgi protein trafficking, consists of eight subunits arranged in two lobes bridged by Cog1. Cog5 is in the smaller, B lobe, bound in with Cog6-8, and is itself bound to Cog1 as well as, strongly, to Cog7.	132
402147	pfam10393	Matrilin_ccoil	Trimeric coiled-coil oligomerization domain of matrilin. This short domain is a coiled coil structure and has a single cysteine residue at the start which is likely to form a di-sulfide bridge with a corresponding cysteine in an upstream EGF (pfam00008) domain thereby spanning a VWA (pfam00092) domain. All three domains can be associated together as in the cartilage matrix protein matrilin, where this domain is likely to be responsible for oligomerization.	43
402148	pfam10394	Hat1_N	Histone acetyl transferase HAT1 N-terminus. This domain is the N-terminal half of the structure of histone acetyl transferase HAT1. It is often found in association with the C-terminal part of the GNAT Acetyltransf_1 (pfam00583) domain. It seems to be motifs C and D of the structure. Histone acetyltransferases (HATs) catalyze the transfer of an acetyl group from acetyl-CoA to the lysine E-amino groups on the N-terminal tails of histones. HATs are involved in transcription since histones tend to be hyper-acetylated in actively transcribed regions of chromatin, whereas in transcriptionally silent regions histones are hypo-acetylated.	157
402149	pfam10395	Utp8	Utp8 family. Utp8 is an essential component of the nuclear tRNA export machinery in Saccharomyces cerevisiae. It is a tRNA binding protein that acts at a step between tRNA maturation/aminoacylation and translocation of the tRNA across the nuclear pore complex.	690
402150	pfam10396	TrmE_N	GTP-binding protein TrmE N-terminus. This family represents the shorter, B, chain of the homo-dimeric structure which is a guanine nucleotide-binding protein that binds and hydrolyzes GTP. TrmE is homologous to the tetrahydrofolate-binding domain of N,N-dimethylglycine oxidase and indeed binds formyl-tetrahydrofolate. TrmE actively participates in the formylation reaction of uridine and regulates the ensuing hydrogenation reaction of a Schiff's base intermediate. This B chain is the N-terminal portion of the protein consisting of five beta-strands and three alpha helices and is necessary for mediating dimer formation within the protein.	117
402151	pfam10397	ADSL_C	Adenylosuccinate lyase C-terminus. This is the C-terminal seven alpha helices of the structure whose full length represents the enzyme adenylosuccinate lyase. This sequence lies C-terminal to the conserved motif necessary for beta-elimination reactions. Adenylosuccinate lyase catalyzes two steps in the synthesis of purine nucleotides: the conversion of succinylaminoimidazole-carboxamide ribotide into aminoimidazole-carboxamide ribotide, the eighth step of the de novo pathway, and the formation of adenosine monophosphate (AMP) from adenylosuccinate, the second step in the conversion of inosine monophosphate into AMP.	79
371028	pfam10398	DUF2443	Protein of unknown function (DUF2443). This is a small family of highly conserved proteins from bacteria, in particular Helicobacter species. The structure is a bundle of alpha helices. The function is not known.	79
402152	pfam10399	UCR_Fe-S_N	Ubiquitinol-cytochrome C reductase Fe-S subunit TAT signal. This is the N-terminal region of the E or R chain, Ubiquitinol-cytochrome C reductase Fe-S subunit, of the hetero-hexameric cytochrome bc1 complex. This region is a TAT-signal region. The cytochrome bc1 complex is an oligomeric membrane protein complex that is a component of respiratory and photosynthetic electron transfer chains. The enzyme couples the transfer of electrons from ubiquinol to cytochrome c with the generation of a proton gradient across the membrane. The motif is also associated with Rieske (pfam00355), UCR_TM (pfam02921) and Ubiq-Cytc-red_N (pfam09165).	41
402153	pfam10400	Vir_act_alpha_C	Virulence activator alpha C-term. This structure is homo-dimeric, and the domain here is the C-terminal half of the structure, often associated with PadR upstream, (pfam03551), which is a transcriptional regulator.	85
402154	pfam10401	IRF-3	Interferon-regulatory factor 3. This is the interferon-regulatory factor 3 chain of the hetero-dimeric structure which also contains the shorter chain CREB-binding protein. These two subunits make up the DRAF1 (double-stranded RNA-activated factor 1). Viral dsRNA produced during viral transcription or replication leads to the activation of DRAF1. The DNA-binding specificity of DRAF1 correlates with transcriptional induction of ISG (interferon-alpha,beta-stimulated gene). IRF-3 preexists in the cytoplasm of uninfected cells and translocates to the nucleus following viral infection. Translocation of IRF-3 is accompanied by an increase in serine and threonine phosphorylation, and association with the CREB coactivator occurs only after infection.	179
402155	pfam10403	BHD_1	Rad4 beta-hairpin domain 1. This short domain is found in the Rad4 protein. This domain binds to DNA.	53
402156	pfam10404	BHD_2	Rad4 beta-hairpin domain 2. This short domain is found in the Rad4 protein. This domain binds to DNA.	63
402157	pfam10405	BHD_3	Rad4 beta-hairpin domain 3. This short domain is found in the Rad4 protein. This domain binds to DNA.	73
402158	pfam10406	TAF8_C	Transcription factor TFIID complex subunit 8 C-term. This is the C-terminal, Delta, part of the TAF8 protein. The N-terminal is generally the histone fold domain, Bromo_TP (pfam07524). TAF8 is one of the key subunits of the transcription factor for pol II, TFIID. TAF8 is one of the several general cofactors which are typically involved in gene activation to bring about the communication between gene-specific transcription factors and components of the general transcription machinery.	48
402159	pfam10407	Cytokin_check_N	Cdc14 phosphatase binding protein N-terminus. Cytokinesis in yeasts involves a family of proteins whose essential function is to bind Cdc14-family phosphatase and prevent this from being sequestered and inhibited in the nucleolus. This is the highly conserved N-terminus of a family of proteins which act as cytokinesis checkpoint controls by allowing cells to cope with cytokinesis defects. These proteins are required for rDNA silencing and mini-chromosome maintenance.	71
402160	pfam10408	Ufd2P_core	Ubiquitin elongating factor core. This is the most conserved part of the core region of Ufd2P ubiquitin elongating factor or E4, running from helix alpha-11 to alpha-38. It consists of 31 helices of variable length connected by loops of variable size forming a compact unit; the helical packing pattern of the compact unit consists of five structural repeats that resemble tandem Armadillo (ARM) repeats. This domain is involved in ubiquitination as it binds Cdc48p and escorts ubiquitinated proteins from Cdc48p to the proteasome for degradation. The core is structurally similar to the nuclear transporter protein importin-alpha. The core is associated with the U-box at the C-terminus, pfam04564, which has ligase activity.	594
402161	pfam10409	PTEN_C2	C2 domain of PTEN tumor-suppressor protein. This is the C2 domain-like domain, in greek key form, of the PTEN protein, phosphatidyl-inositol triphosphate phosphatase, and it is the C-terminus. This domain may well include a CBR3 loop which means it plays a central role in membrane binding. This domain associates across an extensive interface with the N-terminal phosphatase domain DSPc (pfam00782) suggesting that the C2 domain productively positions the catalytic part of the protein onto the membrane.	133
402162	pfam10410	DnaB_bind	DnaB-helicase binding domain of primase. This domain is the C-terminal region three-helical domain of primase. Primases synthesize short RNA strands on single-stranded DNA templates, thereby generating the hybrid duplexes required for the initiation of synthesis by DNA polymerases. Primases are recruited to single-stranded DNA by helicases, and this domain is the region of the primase which binds DnaB-helicase. It is associated with the Toprim domain (pfam01751) which is the central catalytic core.	56
402163	pfam10411	DsbC_N	Disulfide bond isomerase protein N-terminus. This is the N-terminal domain of the disulfide bond isomerase DsbC. The whole molecule is V-shaped, where each arm is a DsbC monomer of two domains linked by a hinge; and the N-termini of each monomer join to form the dimer interface at the base of the V, so are vital for dimerization. DsbC is required for disulfide bond formation and functions as a disulfide bond isomerase during oxidative protein-folding in bacterial periplasm. It also has chaperone activity.	54
313610	pfam10412	TrwB_AAD_bind	Type IV secretion-system coupling protein DNA-binding domain. The plasmid conjugative coupling protein TrwB forms hexamers from six structurally very similar protomers. This hexamer contains a central channel running from the cytosolic pole (made up by the AADs) to the membrane pole ending at the transmembrane pore shaped by 12 transmembrane helices, rendering an overall mushroom-like structure. The TrwB_AAD (all-alpha domain) domain appears to be the DNA-binding domain of the structure. TrwB, a basic integral inner-membrane nucleoside-triphosphate-binding protein, is the structural prototype for the type IV secretion system coupling proteins, a family of proteins essential for macromolecular transport between cells and export.	386
313611	pfam10413	Rhodopsin_N	Amino terminal of the G-protein receptor rhodopsin. Rhodopsin is the archetypal G-protein-coupled receptor. Such receptors participate in virtually all physiological processes, as signalling molecules. They utilize heterotrimeric guanosine triphosphate (GTP)-binding proteins to transduce extracellular signals to intracellular events. Rhodopsin is important because of the pivotal role it plays in visual signal transduction. Rhodopsin is a dimeric transmembrane protein and its intradiskal surface consists of this amino terminal domain and three loops connecting six of the seven transmembrane helices. The N-terminus is a compact domain of alpha-helical regions with breaks and bends at proline residues outside the membrane. The transmembrane part of rhodopsin is represented by 7tm_1 (pfam00001). The N-terminal domain is extracellular and is necessary for successful dimerization and molecular stability.	35
402164	pfam10414	CysG_dimerizer	Sirohaem synthase dimerization region. Bacterial sulfur metabolism depends on the iron-containing porphinoid sirohaem. CysG, S-adenosyl-L-methionine (SAM)-dependent bis-methyltransferase, dehydrogenase and ferrochelatase, synthesizes sirohaem from uroporphyrinogen III via reactions which encompass two branchpoint intermediates in tetrapyrrole biosynthesis, diverting flux first from protoporphyrin IX biosynthesis and then from cobalamin (vitamin B12) biosynthesis. CysG is a dimer of two structurally similar protomers held together asymmetrically through a number of salt-bridges across complementary residues in the CysG_dimerizer region to produce a series of active sites, accounting for CysG's multifunctionality, catalyzing four diverse reactions: two SAM-dependent methylations, NAD+-dependent tetrapyrrole dehydrogenation and metal chelation. The CysG_dimerizer region holding the two protomers together is of 74 residues.	58
402165	pfam10415	FumaraseC_C	Fumarase C C-terminus. Fumarase C catalyzes the stereo-specific interconversion of fumarate to L-malate as part of the Krebs cycle. The full-length protein forms a tetramer with visible globular shape. FumaraseC_C is the C-terminal 65 residues referred to as domain 3. The core of the molecule consists of a bundle of 20 alpha-helices from the five-helix bundle of domain 2. The projections from the core of the tetramer are generated from domains 1 and 3 of each subunit. FumaraseC_C does not appear to be part of either the active site or the activation site but is helical in structure forming a little bundle.	54
402166	pfam10416	IBD	Transcription-initiator DNA-binding domain IBD. In Trichomonas vaginalis, thought to be the earliest extant eukaryote, the sole initiator element for control of the start of transcription is Inr, and this is recognized by the initiator binding protein IBP39. IBP39 contains an N-terminal Inr binding domain, IBD, connected via a flexible, proteolytically sensitive, linker (residues 127-145) to a C-terminal domain. The IBD structure reveals a winged-helix-wing conformation with each element binding to DNA, the central helix-turn-helix contributing the majority of the specificity-determining contacts with the Inr core motif TCAPy(T/A). The binding of IBP39 to the Inr directly recruits RNA polymerase II and in this way initiates transcription.	125
402167	pfam10417	1-cysPrx_C	C-terminal domain of 1-Cys peroxiredoxin. This is the C-terminal domain of 1-Cys peroxiredoxin (1-cysPrx), a member of the peroxiredoxin superfamily which protect cells against membrane oxidation through glutathione (GSH)-dependent reduction of phospholipid hydroperoxides to corresponding alcohols. The C-terminal domain is crucial for providing the extra cysteine necessary for dimerization of the whole molecule. Loss of the enzyme's peroxidase activity is associated with oxidation of the catalytic cysteine, upstream of this domain; and glutathionylation, presumably through its disruption of protein structure, facilitates access for GSH, resulting in spontaneous reduction of the mixed disulfide to the sulfhydryl and consequent activation of the enzyme. The domain is associated with family AhpC-TSA, pfam00578, which carries the catalytic cysteine.	40
402168	pfam10418	DHODB_Fe-S_bind	Iron-sulfur cluster binding domain of dihydroorotate dehydrogenase B. Lactococcus lactis is one of the few organisms with two dihydroorotate dehydrogenases, DHODs, A and B. The B enzyme is a prototype for DHODs in Gram-positive bacteria that use NAD+ as the second substrate. DHODB is a hetero-tetramer composed of a central homodimer of PyrDB subunits resembling the DHODA structure and two PyrK subunits along with three different cofactors: FMN, FAD, and a [2Fe-2S] cluster. The [2Fe-2S] iron-sulfur cluster binds to this C-terminal domain of the PyrK subunit, which is at the interface between the flavin and NAD binding domains and contains three beta-strands. The four cysteine residues at the N-terminal part of this domain are the ones that bind, in pairs, to the iron-sulfur cluster. The conformation of the whole molecule means that the iron-sulfur cluster is localized in a well-ordered part of this domain close to the FAD binding site. The FAD and NAD binding domains are FAD_binding_6, pfam00970 and NAD_binding_1, pfam00175.	40
402169	pfam10419	TFIIIC_sub6	TFIIIC subunit. This is a family of protein subunits of TFIIIC. TFIIIC in yeast and humans is required for transcription of tRNA and 5 S RNA genes by RNA polymerase III. Yeast members of this family are fused to phosphoglycerate mutase domain.	99
402170	pfam10420	IL12p40_C	Cytokine interleukin-12p40 C-terminus. IL12p40_C is the largely beta stranded C-terminal, D3, domain of interleukin-12p40 or interleukin-12B. This interleukin is produced on stimulation by macrophage-engulfed micro-organisms and other stimuli, when it dimerizes with interleukin-12p35 to form a heterodimer which then binds to receptors on natural killer cells to activate them to destroy the micro-organisms. This domain contains two disulfide bridges, one of which serves to bind p40 to p35 and the other to hold the beta strands within the domain together. The cupped shape of the p35 binding interface matches the elbow-like bend between D2 and D3 in p40. The domain is often associated with family fn3, pfam00041.	85
402171	pfam10421	OAS1_C	2'-5'-oligoadenylate synthetase 1, domain 2, C-terminus. This is the largely alpha-helical, C-terminal half of 2'-5'-oligoadenylate synthetase 1, being described as domain 2 of the enzyme and homologous to a tandem ubiquitin repeat. It carries the region of enzymic activity between 320 and 344 at the extreme C-terminal end. Oligoadenylate synthetases are antiviral enzymes that counteract viral attack by degrading viral RNA. The enzyme uses ATP in 2'-specific nucleotidyl transfer reactions to synthesize 2'-5'-oligoadenylates, which activate latent ribonuclease, resulting in degradation of viral RNA and inhibition of virus replication. This domain is often associated with NTP_transf_2 pfam01909.	185
255978	pfam10422	LRS4	Monopolin complex subunit LRS4. Monopolin is a protein complex, originally identified in Saccharomyces cerevisiae, that is required for the segregation of homologous centromeres to opposite poles of a dividing cell during meiosis I. The orthologous complex in Schizosaccharomyces pombe is not required for meiosis I chromosome segregation, but is proposed to play a similar physiological role in clamping microtubule binding sites. In S.cerevisiae this subunit is called LRS4, and in S. pombe it is known as Mde4.	211
402172	pfam10423	AMNp_N	Bacterial AMP nucleoside phosphorylase N-terminus. This is the N-terminal domain of bacterial AMP nucleoside phosphorylase (AMNp). The N- and C-termini form distinct domains which intertwine with each other to form a stable monomer which associates with five other monomers to yield the active hexamer. The N-terminus consists of a long helix and a four-stranded sheet with a novel topology. The C-terminus binds the nucleoside whereas the N-terminus acts as the enzymatic regulatory domain. AMNp (EC:3.2.2.4) catalyzes the hydrolysis of AMP to form adenine and ribose 5-phosphate, thereby regulating intracellular AMP levels.	151
402173	pfam10425	SdrG_C_C	C-terminus of bacterial fibrinogen-binding adhesin. This is the C-terminal half of a bacterial fibrinogen-binding adhesin SdrG. SdrG is a Gram-positive cell-wall-anchored adhesin that allows attachment of the bacterium to host tissues via specific binding to the beta-chain of human fibrinogen (Fg). SdrG binds to its ligand with a dynamic "dock, lock, and latch" mechanism which represents a general mode of ligand-binding for structurally related cell wall-anchored proteins in most Gram-positive bacteria. The C-terminal part of SdrG(276-596) is integral to the folding of the immunoglobulin-like whole to create the docking grooves necessary for Fg binding. The domain is associated with families of Cna_B, pfam05738.	156
287407	pfam10426	zf-RAG1	Recombination-activating protein 1 zinc-finger domain. This is a C2-H2 zinc-finger domain closely resembling the classical TFIIIA-type zinc-finger, CX3FX5LX2-3H, despite having a valine and a tyrosine at the core instead of a phenylalanine and a leucine, hence CX3VX1LX2YX2H. The structure, nevertheless, contains the characteristic two-stranded beta-sheet and alpha-helix of a classical zinc-finger. The domain binds one zinc and, in complex with the zinc-RING-finger domain, helps to stabilize the whole of the dimerization region of recombination activating protein 1 (RAG1). The function of the whole is to bind double-stranded DNA.	30
371044	pfam10427	Ago_hook	Argonaute hook. This region has been called the argonaute hook. It has been shown to bind to the Piwi domain pfam02171 of Argonaute proteins.	150
402174	pfam10428	SOG2	RAM signalling pathway protein. SOG2 proteins in Saccharomyces cerevisiae are involved in cell separation and cytokinesis.	479
402175	pfam10429	Mtr2	Nuclear pore RNA shuttling protein Mtr2. Mtr2 is a monomeric, dual-action, RNA-shuttle protein found in yeasts. Transport across the nuclear-cytoplasmic membrane is via the macro-molecular membrane-spanning nuclear pore complex, NPC. The pore is lined by a subset of NPC members called nucleoporins that present FG (Phe-Gly) receptors, characteristically GLFG and FXFG motifs, for shuttling RNAs and proteins. RNA cargo is bound to soluble transport proteins (nuclear export factors) such as Mex67 in yeasts, and TAP in metazoa, which pass along the pore by binding to successive FG receptors. Mtr2 when bound to Mex67 maximises this FG-binding. Mtr2 also acts independently of Mex67 in transporting the large ribosomal RNA subunit through the pore.	164
402176	pfam10430	Ig_Tie2_1	Tie-2 Ig-like domain 1. 	95
402177	pfam10431	ClpB_D2-small	C-terminal, D2-small domain, of ClpB protein. This is the C-terminal domain of ClpB protein, referred to as the D2-small domain, and is a mixed alpha-beta structure. Compared with the D1-small domain (included in AAA, pfam00004) it lacks the long coiled-coil insertion, and instead of helix C4 contains a beta-strand (e3) that is part of a three stranded beta-pleated sheet. In Thermophilus the whole protein forms a hexamer with the D1-small and D2-small domains located on the outside of the hexamer, with the long coiled-coil being exposed on the surface. The D2-small domain is essential for oligomerization, forming a tight interface with the D2-large domain of a neighboring subunit and thereby providing enough binding energy to stabilize the functional assembly. The domain is associated with two Clp_N, pfam02861, at the N-terminus as well as AAA, pfam00004 and AAA_2, pfam07724.	80
402178	pfam10432	bact-PGI_C	Bacterial phospho-glucose isomerase C-terminal SIS domain. This is the C-terminal SIS domain of a bacterial phospho-glucose isomerase EC:5.3.1.9 protein which is similar to eukaryote homologs to the extent that the sequence includes the cluster of threonines and serines that forms the sugar phosphate-binding site in conventional PGI. This domain contributes a good proportion of the active catalytic site residues. This PGI uses the same catalytic mechanisms for both glucose ring-opening and isomerisation for the interconversion of glucose 6-phosphate to fructose 6-phosphate. It is associated with family SIS, pfam01380.	147
402179	pfam10433	MMS1_N	Mono-functional DNA-alkylating methyl methanesulfonate N-term. MMS1 is a protein that protects against replication-dependent DNA damage in Saccharomyces cerevisiae. MMS1 belongs to the DDB1 family of cullin 4 adaptors and the two proteins are homologous. MMS1 bridges the interaction of MMS22 and Crt10 with Cul8/Rtt101. Cul8/Rtt101 is a cullin protein involved in the regulation of DNA replication subsequent to DNA damage. The N-terminal region of MMS1 and the C-terminal of MMS22 are required for the MMS1-MMS22 interaction. The human HIV-1 virion-associated protein Vpr assembles with DDB1 through interaction with DCAF1 (chromatin assembly factor) to form an E3 ubiquitin ligase that targets cellular substrates for proteasome-mediated degradation and subsequent G2 arrest.	486
371048	pfam10434	MAM1	Monopolin complex protein MAM1. Monopolin is a protein complex, originally identified in Saccharomyces cerevisiae, that is required for the segregation of homologous centromeres to opposite poles of a dividing cell during meiosis I. MAM1 is required in S. cerevisiae for monopolar attachment.	255
402180	pfam10435	BetaGal_dom2	Beta-galactosidase, domain 2. This is the second domain of the five-domain beta-galactosidase enzyme that altogether catalyzes the hydrolysis of beta(1-3) and beta(1-4) galactosyl bonds in oligosaccharides as well as the inverse reaction of enzymatic condensation and trans-glycosylation. This domain is made up of 16 antiparallel beta-strands and an alpha-helix at its C-terminus. The fold of this domain appears to be unique. In addition, the last seven strands of the domain form a subdomain with an immunoglobulin-like (I-type Ig) fold in which the first strand is divided between the two beta-sheets. In Penicillium spp. this strand is interrupted by a 12-residue insertion which forms an additional edge-strand to the second beta-sheet of the sub-domain. The remainder of the second domain forms a series of beta-hairpins at its N-terminus, four strands of which are contiguous with part of the Ig-like sub-domain, forming in total a seven-stranded antiparallel beta-sheet. This domain is associated with family Glyco_hydro_35, pfam01301, which is N-terminal to it, but itself has no metazoan members.	180
402181	pfam10436	BCDHK_Adom3	Mitochondrial branched-chain alpha-ketoacid dehydrogenase kinase. Catabolism and synthesis of leucine, isoleucine and valine are finely balanced, allowing the body to make the most of dietary input but removing excesses to prevent toxic build-up of their corresponding keto-acids. This is the butyryl-CoA dehydrogenase, subunit A domain 3, a largely alpha-helical bundle of the enzyme BCDHK. This enzyme is the regulator of the dehydrogenase complex that breaks branched-chain amino-acids down, by phosphorylating and thereby inactivating it when synthesis is required. The domain is associated with family HATPase_c pfam02518 which is towards the C-terminal.	159
402182	pfam10437	Lip_prot_lig_C	Bacterial lipoate protein ligase C-terminus. This is the C-terminal domain of a bacterial lipoate protein ligase. There is no conservation between this C-terminus and that of vertebrate lipoate protein ligase C-termini, but both are associated with the domain BPL_LipA_LipB pfam03099, further upstream. This domain is required for adenylation of lipoic acid by lipoate protein ligases. The domain is not required for transfer of lipoic acid from the adenylate to the lipoyl domain. Upon adenylation, this domain rotates 180 degrees away from the active site cleft. Therefore, the domain does not interact with the lipoyl domain during transfer.	84
402183	pfam10438	Cyc-maltodext_C	Cyclo-malto-dextrinase C-terminal domain. This domain is at the very C-terminus of cyclo-malto-dextrinase proteins and consists of 8 beta strands, is largely globular and appears to help stabilize the active sites created by upstream domains, Cyc-maltodext_N pfam09087, and Alpha-amylase pfam00128. Cyclo-malto-dextrinases hydrolyze cyclodextrans to maltose and glucose and catalyze trans-glycosylation of oligosaccharides to the C3-, C4- or C6-hydroxyl groups of various acceptor sugar molecules.	76
402184	pfam10439	Bacteriocin_IIc	Bacteriocin class II with double-glycine leader peptide. This is a family of bacteriocidal bacteriocins secreted by Streptococcal species in order to kill off closely-related competitor Gram-positives. The sequence includes the peptide precursor, this being cleaved off proteolytically at the double-glycine. The family does not carry the YGNGVXC motif characteristic of pediocin-like Bacteriocins, Bacteriocin_II pfam01721. The producer bacteria are protected from the effects of their own bacteriocins by production of a specific immunity protein which is co-transcribed with the genes encoding the bacteriocins, eg family EntA_Immun pfam08951. The bacteriocins are structurally more specific than their immunity-protein counterparts. Typically, production of the bacteriocin gene is from within an operon carrying up to 6 genes including a typical two-component regulatory system (R and H), a small peptide pheromone (C), and a dedicated ABC transporter (A and -B) as well as an immunity protein. The ABC transporter is thought to recognize the N termini of both the pheromone and the bacteriocins and to transport these peptides across the cytoplasmic membrane, concurrent with cleavage at the conserved double-glycine motif. Cleaved extracellular C can then bind to the sensor kinase, H, resulting in activation of R and up-regulation of the entire gene cluster via binding to consensus sequences within each promoter. It seems likely that this whole regulon is carried on a transmissible plasmid which is passed between closely related Firmicute species since many clinical isolates from different Firmicutes can produce at least two bacteriocins, and the same bacteriocins can be produced by different species.	58
402185	pfam10440	WIYLD	Ubiquitin-binding WIYLD domain. This presumed domain has been predicted to contain three alpha helices. The domain was named the WIYLD domain based on the pattern of most conserved residues. It binds ubiquitin. In the Arabidopsis thaliana histone-lysine N-methyltransferase SUVR4, binding of ubiquitin to this domain stimulates enzymatic activity and converts its activity from a strict dimethylase to a di/trimethylase.	58
402186	pfam10441	Urb2	Urb2/Npa2 family. This family includes the Urb2 protein from yeast that is involved in ribosome biogenesis.	213
402187	pfam10442	FIST_C	FIST C domain. The FIST C domain is a novel sensory domain, which is present in signal transduction proteins from Bacteria, Archaea and Eukarya. Chromosomal proximity of FIST-encoding genes to those coding for proteins involved in amino acid metabolism and transport suggest that FIST domains bind small ligands, such as amino acids.	135
371055	pfam10443	RNA12	RNA12 protein. This family includes RNA12 from S. cerevisiae. That protein contains an RRM domain. This region is C-terminal to that and includes a P-loop motif suggesting this region binds to NTP. The RNA12 protein is involved in pre-rRNA maturation.	429
402188	pfam10444	Nbl1_Borealin_N	Nbl1 / Borealin N terminal. Nbl1 is a subunit of the conserved CPC, the chromosomal passenger complex, which regulates mitotic chromosome segregation. In Fungi and Animalia, this complex consists of the kinase Aurora B/AIR-2/Ipl1p, INCENP/ICP-1/Sli15p, and Survivin/BIR-1/Bir1p. In Animalia, a fourth subunit (Borealin/Dasra/CSC-1) is required for targeting CPC to centromeres and central spindles. Nbl1 has been shown in budding yeast to be essential for viability, and for CPC localization, stability, integrity, and function. The N-terminus of Borealin is homologous to Nbl1. This family contains both Nbl1, and the N terminal region of Borealin.	55
402189	pfam10445	DUF2456	Protein of unknown function (DUF2456). This is a family of uncharacterized proteins.	94
402190	pfam10446	DUF2457	Protein of unknown function (DUF2457). This is a family of uncharacterized proteins.	458
402191	pfam10447	EXOSC1	Exosome component EXOSC1/CSL4. This family of proteins are components of the exosome 3'->5' exoribonuclease complex. The exosome mediates degradation of unstable mRNAs that contain AU-rich elements (AREs) within their 3' untranslated regions.	112
402192	pfam10448	POC3_POC4	20S proteasome chaperone assembly proteins 3 and 4. This family contains chaperones of the 20S proteasome which function in early 20S proteasome assembly. The structures of two of the proteins in this family (POC3 and POC4) have been solved, and they closely resemble those of the mammalian proteasome assembling chaperone PAC3, although there is little sequence similarity between them.	136
402193	pfam10450	POC1	POC1 chaperone. In yeast, POC1 is a chaperone of the 20S proteasome which functions in early 20S proteasome assembly.	223
371062	pfam10451	Stn1	Telomere regulation protein Stn1. The budding yeast protein Stn1 is a DNA-binding protein which has specificity for telomeric DNA. Structural profiling has predicted an OB-fold. This domain is the N-terminal part of the molecule, which adopts the OB fold. Protection of telomeres by multiple proteins with OB-fold domains is conserved in eukaryotic evolution.	252
371063	pfam10452	TCO89	TORC1 subunit TCO89. TCO89 is a component of the TORC1 complex. TORC1 is responsible for a wide range of rapamycin-sensitive cellular activities.	546
402194	pfam10453	NUFIP1	Nuclear fragile X mental retardation-interacting protein 1 (NUFIP1). Proteins in this family have been implicated in the assembly of the large subunit of the ribosome and in telomere maintenance. Some proteins in this family contain a CCCH zinc finger. This family contains a protein called human fragile X mental retardation-interacting protein 1, which is known to bind RNA and is phosphorylated upon DNA damage.	53
402195	pfam10454	DUF2458	Protein of unknown function (DUF2458). This is a family of uncharacterized proteins.	172
402196	pfam10455	BAR_2	Bin/amphiphysin/Rvs domain for vesicular trafficking. This Pfam entry includes proteins that are not matched by pfam03114.	286
313646	pfam10456	BAR_3_WASP_bdg	WASP-binding domain of Sorting nexin protein. The C-terminal region of the Sorting nexin group of proteins appears to carry a BAR-like (Bin/amphiphysin/Rvs) domain. This domain is very diverse and the similarities with other BAR domains are few. In the Sorting nexins it is associated with family PX, pfam00787.13, and in combination with PX appears to be necessary to bind WASP along with p85 to form a multimeric signalling complex.	236
402197	pfam10457	MENTAL	Cholesterol-capturing domain. Human meta-static lymph node (MLN) 64 is a late endosomal membrane protein, and carries this MENTAL (MLN64N-terminal) domain at its N-terminus. The domain is composed of four trans-membrane helices with three short intervening loops. The function of the domain is to capture cholesterol and pass it to the associated START domain pfam01852 for transfer to a cytosolic acceptor protein or membrane. In mammals, the MENTAL domain is involved in the localization of MLN64 and MENTHO in late endosomes, and also in homo- and hetero-interactions of these two proteins.	176
402198	pfam10458	Val_tRNA-synt_C	Valyl tRNA synthetase tRNA binding arm. This domain is found at the C-terminus of Valyl tRNA synthetases.	66
402199	pfam10459	Peptidase_S46	Peptidase S46. Dipeptidyl-peptidase 7 (DPP-7) is the best characterized member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.	694
378438	pfam10460	Peptidase_M30	Peptidase M30. This family contains the metallopeptidase hyicolysin. Hyicolysin has a zinc ion which is liganded by two histidine and one glutamate residue.	364
402200	pfam10461	Peptidase_S68	Peptidase S68. This family of serine peptidases contains PIDD proteins. PIDD forms a complex with RAIDD and procaspase-2 that is known as the 'PIDDosome'. The PIDDosome forms when DNA damage occurs and either activates NF-kappaB, leading to cell survival, or caspase-2, which leads to apoptosis.	34
287440	pfam10462	Peptidase_M66	Peptidase M66. This family of metallopeptidases contains StcE, a virulence factor found in Shiga toxigenic Escherichia coli organisms. StcE peptidase cleaves C1 esterase inhibitor.	306
402201	pfam10463	Peptidase_U49	Peptidase U49. This family contains Lit peptidase from Escherichia coli. Lit protease functions in bacterial cell death in response to infection by bacteriophage T4. Following binding of Gol peptide to domains II and III of elongation factor Tu, the Lit peptidase cleaves domain I of the elongation factor. This prevents binding of guanine nucleotides, shuts down translation and leads to cell death.	198
287442	pfam10464	Peptidase_U40	Peptidase U40. This family contains P5 murein endopeptidase from bacteriophage phi-6. P5 murein endopeptidase has lytic activity against several gram-negative bacteria. It is thought that the enzyme cleaves the cell wall peptide bridge formed by meso-2,6-diaminopimelic acid and D-Ala.	212
287443	pfam10465	Inhibitor_I24	PinA peptidase inhibitor. PinA inhibits the endopeptidase La. It binds to the La homotetramer but does not interfere with the ATP binding site or the active site of La.	140
402202	pfam10466	Inhibitor_I34	Saccharopepsin inhibitor I34. The saccharopepsin inhibitor is highly specific for the aspartic peptidase saccharopepsin. It is largely unstructured in the absence of saccharopepsin, but in the presence, the inhibitor undergoes a conformation change forming an almost perfect alpha-helix from Asn2 to Met32 in the active site cleft of the peptidase.	69
313652	pfam10467	Inhibitor_I48	Peptidase inhibitor clitocypin. Clitocypin binds and inhibits cysteine proteinases. It has no similarity to any other known cysteine proteinase inhibitors but bears some similarity to a lectin-like family of proteins from mushrooms.	142
402203	pfam10468	Inhibitor_I68	Carboxypeptidase inhibitor I68. This is a family of tick carboxypeptidase inhibitors.	74
402204	pfam10469	AKAP7_NLS	AKAP7 2'5' RNA ligase-like domain. AKAP7_NLS is the N-terminal domain of the cyclic AMP-dependent protein kinase A, PKA, anchor protein AKAP7. This protein anchors PKA for its role in regulating PKA-mediated gene transcription in both somatic cells and oocytes. AKAP7_NLS carries the nuclear localization signal (NLS) KKRKK, that indicates the cellular destiny of this anchor protein. Binding to the regulatory subunits RI and RII of PKA is mediated via the family AKAP7_RIRII_bdg at the C-terminus. This family represents a region that contains two 2'5' RNA ligase like domains pfam02834. Presumably this domain carries out some as yet unknown enzymatic function.	207
371072	pfam10470	AKAP7_RIRII_bdg	PKA-RI-RII subunit binding domain of A-kinase anchor protein. AKAP7_RIRII_bdg is the C-terminal domain of the cyclic AMP-dependent protein kinase A, PKA, anchor protein AKAP7. This protein anchors PKA, for its role in regulating PKA-mediated gene transcription in both somatic cells and oocytes, by binding to its regulatory subunits, RI and RII, hence being known as a dual-specific AKAP. The 25 crucial amino acids of RII-binding domains in general form structurally conserved amphipathic helices with unrelated sequences; hydrophobic amino acid residues form the backbone of the interaction and hydrogen bond- and salt-bridge-forming amino acid residues increase the affinity of the interaction. The N-terminus, of family AKAP7_NLS, carries the nuclear localization signal.	57
402205	pfam10471	ANAPC_CDC26	Anaphase-promoting complex APC subunit CDC26. The anaphase-promoting complex (APC) or cyclosome is a cell cycle-regulated ubiquitin-protein ligase that regulates important events in mitosis such as the initiation of anaphase and exit from telophase. The APC, in conjunction with other enzymes, assembles multi-ubiquitin chains on a variety of regulatory proteins thereby targeting them for proteolysis by the 26S proteasome. CDC26 is one of the nine or so subunits identified within APC but its exact function is not known. The APC/C becomes active at the metaphase/anaphase transition and remains active during G1 phase. One mechanism linked to activation of the APC/C is phosphorylation. The yeast APC/C is composed of at least 13 subunits, but the function of many of the subunits is unknown. Hcn1 is the smallest subunit of the S. pombe APC/C, and is found to be essential for cell viability, APC/C integrity, and proper APC/C regulation. In addition, Hcn1 phosphorylation indicates a specific role for the phosphorylation of this subunit late in the cell cycle.	65
371074	pfam10472	CReP_N	eIF2-alpha phosphatase phosphorylation constitutive repressor. This is the conserved N-terminal domain of CReP, constitutive repressor of eIF2-alpha phosphorylation/protein phosphatase 1, catalytic subunit. It functions in the dephosphorylation of eIF2-alpha under basal conditions in the absence of stress. In response to translation inhibition, there is reduced synthesis of the labile CReP that contributes to elevated levels of eIF2-alpha phosphorylation. The C-terminus, family PP1c, is shared with the apoptosis-associated protein Gadd34 and herpes simplex virus.	411
402206	pfam10473	CENP-F_leu_zip	Leucine-rich repeats of kinetochore protein Cenp-F/LEK1. Cenp-F, a centromeric kinetochore, microtubule-binding protein consisting of two 1,600-amino acid-long coils, is essential for the full functioning of the mitotic checkpoint pathway. There are several leucine-rich repeats along the sequence of LEK1 that are considered to be zippers, though they do not appear to be binding DNA directly in this instance.	140
402207	pfam10474	DUF2451	Protein of unknown function C-terminus (DUF2451). This protein is found in eukaryotes but its function is not known. The C-terminal part of some members is DUF2450.	229
402208	pfam10475	Vps54_N	Vacuolar-sorting protein 54, of GARP complex. This is a family of vacuolar-sorting proteins 54, from eukaryotes. Along with VPS52 and VPS53 this forms the Golgi-associated retrograde protein complex GARP. VPS54 is separated into N- and C-terminal regions each of which has a different function. This N-terminal family is important for GARP complex assembly and stability, whereas the C-terminal domain, pfam07928, brings about localization to an early endocytic compartment.	291
402209	pfam10476	DUF2448	Protein of unknown function C-terminus (DUF2448). The family DUF2349 is the N-terminal part of this family. This protein is found in eukaryotes but its function is not known.	204
371079	pfam10477	EIF4E-T	Nucleocytoplasmic shuttling protein for mRNA cap-binding EIF4E. EIF4E-T is the transporter protein for shuttling the mRNA cap-binding protein EIF4E, targeting it for nuclear import. EIF4E-T contains several key binding domains including two functional leucine-rich NESs (nuclear export signals) between residues 438-447 and 613-638 in the human protein. The other two binding domains are an EIF4E-binding site, between residues 27-42 in Q9EST3, and a bipartite NLS (nuclear localization signal) between 194-211, and these lie in family EIF4E-T_N. EIF4E is the eukaryotic translation initiation factor 4E that is the rate-limiting factor for cap-dependent translation initiation.	646
313663	pfam10479	FSA_C	Fragile site-associated protein C-terminus. This is the conserved C-terminal half of the protein KIAA1109 which is the fragile site-associated protein FSA. Genome-wide association studies showed this protein to be linked to susceptibility to coeliac disease. The protein may also be associated with polycystic kidney disease.	726
119000	pfam10480	ICAP-1_inte_bdg	Beta-1 integrin binding protein. ICAP-1 is a serine/threonine-rich protein that binds to the cytoplasmic domains of beta-1 integrins in a highly specific manner, binding to a NPXY sequence motif on the beta-1 integrin. The cytoplasmic domains of integrins are essential for cell adhesion, and the fact that ICAP-1 is phosphorylated by interaction with the cell-matrix implies an important role of ICAP-1 during integrin-dependent cell adhesion. Overexpression of ICAP-1 strongly reduces the integrin-mediated cell spreading on extracellular matrix and inhibits both Cdc42 and Rac1. In addition, ICAP-1 induces release of Cdc42 from cellular membranes and prevents the dissociation of GDP from this GTPase. An additional function of ICAP-1 is to promote differentiation of osteoprogenitors by supporting their condensation through modulating the integrin high affinity state.	200
402210	pfam10481	CENP-F_N	Cenp-F N-terminal domain. Mitosin or centromere-associated protein-F (Cenp-F) is found bound across the centromere as one of the proteins of the outer layer of the kinetochore. Most of the kinetochore/centromere functions appear to depend upon binding of the C-terminal part of the molecule, whereas the N-terminal part, here, may be a cytoplasmic player in controlling the function of microtubules and dynein.	304
402211	pfam10482	CtIP_N	tumor-suppressor protein CtIP N-terminal domain. CtIP is predominantly a nuclear protein that complexes with both BRCA1 and the BRCA1-associated RING domain protein (BARD1). At the protein level, CtIP expression varies with cell cycle progression in a pattern identical to that of BRCA1. Thus, the steady-state levels of CtIP polypeptides, which remain low in resting cells and G1 cycling cells, increase dramatically as dividing cells traverse the G1/S boundary. CtIP can potentially modulate the functions ascribed to BRCA1 in transcriptional regulation, DNA repair, and/or cell cycle checkpoint control. This N-terminal domain carries a coiled-coil region and is essential for homodimerization of the protein. The C-terminal domain is family pfam08573.	119
402212	pfam10483	Elong_Iki1	Elongator subunit Iki1. This family is a component of the RNA polymerase II elongator complex. This complex is involved in elongation of RNA polymerase II transcription and in modification of wobble nucleosides in tRNA.	278
402213	pfam10484	MRP-S23	Mitochondrial ribosomal protein S23. MRP-S23 is one of the proteins that makes up the 55S ribosome in eukaryotes from nematodes to humans. It does not appear to carry any common motifs, either RNA binding or ribosomal protein motifs. All of the mammalian MRPs are encoded in nuclear genes that are evolving more rapidly than those encoding cytoplasmic ribosomal proteins. The MRPs are imported into mitochondria where they assemble coordinately with mitochondrially transcribed rRNAs into ribosomes that are responsible for translating the 13 mRNAs for essential proteins of the oxidative phosphorylation system. MRP-S23 is significantly up-regulated in uterine cancer cells.	124
402214	pfam10486	PI3K_1B_p101	Phosphoinositide 3-kinase gamma adapter protein p101 subunit. Class I PI3Ks are dual-specific lipid and protein kinases involved in numerous intracellular signaling pathways. Class IB PI3K, p110gamma, is mainly activated by seven-transmembrane G-protein-coupled receptors (GPCRs), through its regulatory subunit p101 and G-protein beta-gamma subunits.	860
402215	pfam10487	Nup188	Nucleoporin subcomplex protein binding to Pom34. This is one of the many peptides that make up the nucleoporin complex (NPC), and is found across eukaryotes. The Nup188 subcomplex (Nic96p-Nup188p-Nup192p-Pom152p) is one of at least six that make up the NPC, and as such is symmetrically localized on both faces of the NPC at the nuclear end, being integrally bound to the C-terminus of Pom34p.	916
402216	pfam10488	PP1c_bdg	Phosphatase-1 catalytic subunit binding region. This conserved C-terminus appears to be a protein phosphatase-1 catalytic subunit (PP1C) binding region, which may in some circumstances also be retroviral in origin since it is found in both herpes simplex virus and in mouse and man. This domain is found in Gadd-34 apoptosis-associated proteins as well as the constitutive repressor of eIF2-alpha phosphorylation/protein phosphatase 1, regulatory (inhibitor) subunit 15b, otherwise known as CReP. Diverse stressful conditions are associated with phosphorylation of the alpha subunit of eukaryotic translation initiation factor 2 (eIF2-alpha) on serine 51. This signaling event, which is conserved from yeast to mammals, negatively regulates the guanine nucleotide exchange factor, eIF2-B and inhibits the recycling of eIF2 to its active GTP bound form. In mammalian cells eIF2-alpha phosphorylation emerges as an important event in stress signaling that impacts on gene expression at both the translational and transcriptional levels.	287
402217	pfam10490	CENP-F_C_Rb_bdg	Rb-binding domain of kinetochore protein Cenp-F/LEK1. Cenp-F, a centromeric kinetochore, microtubule-binding protein consisting of two 1,600-amino acid-long coils, is essential for the full functioning of the mitotic checkpoint pathway. This domain is at the very C-terminus of the C-terminal coiled-coil, and is one of the key Rb-binding domains.	47
402218	pfam10491	Nrf1_DNA-bind	NLS-binding and DNA-binding and dimerization domains of Nrf1. In Drosophila, the erect wing (ewg) protein is required for proper development of the central nervous system and the indirect flight muscles. The fly ewg gene encodes a novel DNA-binding domain that is also found in four genes previously identified in sea urchin, chicken, zebrafish, and human. Nuclear respiratory factor-1 is a transcriptional activator that has been implicated in the nuclear control of respiratory chain expression in vertebrates. The first 26 amino acids of nuclear respiratory factor-1 are required for the binding of dynein light chain. The interaction with dynein light chain is observed for both ewg and Nrf-1, transcription factors that are structurally and functionally similar between humans and Drosophila. The highest level of expression of both ewg and Nrf-1 was found in the central nervous system, somites, first branchial arch, optic vesicle, and otic vesicle. In the mouse Nrf-1 protein there is also an NLS domain at 88-116, and a DNA binding and dimerization domain at 127-282. Ewg is a site-specific transcriptional activator, and evolutionarily conserved regions of ewg contribute both positively and negatively to transcriptional activity.	213
313673	pfam10492	Nrf1_activ_bdg	Nrf1 activator activation site binding domain. In Drosophila, the erect wing (ewg) protein is required for proper development of the central nervous system and the indirect flight muscles. The fly ewg gene encodes a novel DNA-binding domain that is also found in four genes previously identified in sea urchin, chicken, zebrafish, and human. Nuclear respiratory factor-1 is a transcriptional activator that has been implicated in the nuclear control of respiratory chain expression in vertebrates. The first 26 amino acids of nuclear respiratory factor-1 are required for the binding of dynein light chain. The interaction with dynein light chain is observed for both ewg and Nrf-1, transcription factors that are structurally and functionally similar between humans and Drosophila. The highest level of expression of both ewg and Nrf-1 was found in the central nervous system, somites, first branchial arch, optic vesicle, and otic vesicle. In the mouse Nrf-1 protein, there is an activation domain at 303-469, the most conserved part of which is this domain 446-469. Ewg is a site-specific transcriptional activator, and evolutionarily conserved regions of ewg contribute both positively and negatively to transcriptional activity. The family Nrf1_DNA-bind is associated with this domain towards the N-terminal, as is the N terminal of the activation domain.	82
402219	pfam10493	Rod_C	Rough deal protein C-terminal region. Rod, the Rough deal protein, displays a dynamic intracellular staining pattern, localising first to kinetochores in pro-metaphase, but moving to kinetochore microtubules at metaphase. Early in anaphase the protein is once again restricted to the kinetochores, where it persists until the end of telophase. This behaviour is in all respects similar to that described for ZW10, and indeed the two proteins function together, localization of each depending upon the other. These two proteins are found at the kinetochore in complex with a third, Zwilch, in both flies and humans. The C-terminus is the most conserved part of the protein. During pro-metaphase, the ZW10-Rod complex, dynein/dynactin, and Mad2 all accumulate on unattached kinetochores; microtubule capture leads to Mad2 depletion as it is carried off by dynein/dynactin; ZW10-Rod complex accumulation continues, replenishing kinetochore dynein. The continuing recruitment of the ZW10-Rod complex during metaphase may serve to maintain adequate dynein/dynactin complex on kinetochores for assisting chromatid movement during anaphase. The ZW10-Rod complex acts as a bridge whose association with Zwint-1 links Mad1 and Mad2, components that are directly responsible for generating the diffusible 'wait anaphase' signal, to a structural, inner kinetochore complex containing Mis12 and KNL-1AF15q14, the last of which has been proved to be essential for kinetochore assembly in C. elegans. Removal of ZW10 or Rod inactivates the mitotic checkpoint.	560
402220	pfam10494	Stk19	Serine-threonine protein kinase 19. This serine-threonine protein kinase number 19 is expressed from the MHC and predominantly in the nucleus. Protein kinases are involved in signal transduction pathways and play fundamental roles in the regulation of cell functions. This is a novel Ser/Thr protein kinase that has Mn2+-dependent protein kinase activity that phosphorylates alpha-casein at Ser/Thr residues and histone at Ser residues. It can be covalently modified by the reactive ATP analogue 5'-p-fluorosulfonylbenzoyladenosine in the absence of ATP, and this modification is prevented in the presence of 1 mM ATP, indicating that the kinase domain is capable of binding ATP.	244
402221	pfam10495	PACT_coil_coil	Pericentrin-AKAP-450 domain of centrosomal targeting protein. This domain is a coiled-coil region close to the C-terminus of centrosomal proteins that is directly responsible for recruiting AKAP-450 and pericentrin to the centrosome. Hence the suggested name for this region is a PACT domain (pericentrin-AKAP-450 centrosomal targeting). This domain is also present at the C-terminus of coiled-coil proteins from Drosophila and S. pombe, and that from the Drosophila protein is sufficient for targeting to the centrosome in mammalian cells. The function of these proteins is unknown but they seem good candidates for having a centrosomal or spindle pole body location. The final 22 residues of this domain in AKAP-450 appear specifically to be a calmodulin-binding domain indicating that this member at least is likely to contribute to centrosome assembly.	77
402222	pfam10496	Syntaxin-18_N	SNARE-complex protein Syntaxin-18 N-terminus. This is the conserved N-terminal of Syntaxin-18. Syntaxin-18 is found in the SNARE complex of the endoplasmic reticulum and functions in the trafficking between the ER intermediate compartment and the cis-Golgi vesicle. In particular, the N-terminal region is important for the formation of ER aggregates. More specifically, syntaxin-18 is involved in endoplasmic reticulum-mediated phagocytosis, presumably by regulating the specific and direct fusion of the ER with the plasma or phagosomal membranes.	86
402223	pfam10497	zf-4CXXC_R1	Zinc-finger domain of monoamine-oxidase A repressor R1. R1 is a transcription factor repressor that inhibits monoamine oxidase A gene expression. This domain is a four-CXXC zinc finger putative DNA-binding domain found at the C-terminal end of R1. The domain carries 12 cysteines of which four pairs are of the CXXC type.	95
402224	pfam10498	IFT57	Intra-flagellar transport protein 57. Eukaryotic cilia and flagella are specialized organelles found at the periphery of cells of diverse organisms. Intra-flagellar transport (IFT) is required for the assembly and maintenance of eukaryotic cilia and flagella, and consists of the bidirectional movement of large protein particles between the base and the distal tip of the organelle. IFT particles contain multiple copies of two distinct protein complexes, A and B, which contain at least 6 and 11 protein subunits. IFT57 is part of complex B but is not, however, required for the core subunits to stay associated. This protein is known as Huntington-interacting protein-1 in humans.	359
402225	pfam10500	SR-25	Nuclear RNA-splicing-associated protein. SR-25, otherwise known as ADP-ribosylation factor-like factor 6-interacting protein 4, is expressed in virtually all tissues. At the N-terminus there is a repeat of serine-arginine (SR repeat), and towards the middle of the protein there are clusters of both serines and of basic amino acids. The presence of many nuclear localization signals strongly implies that this is a nuclear protein that may contribute to RNA splicing. SR-25 is also implicated, along with heat-shock-protein-27, as a mediator in the Rac1 (GTPase ras-related C3 botulinum toxin substrate 1) signalling pathway.	228
402226	pfam10501	Ribosomal_L50	Ribosomal subunit 39S. The 39S ribosomal protein appears to be a subunit of one of the larger mitochondrial 66S or 70S units. Under conditions of ethanol-stress in rats the larger subunit is largely dissociated into its smaller components. In E. coli, in the absence of the enzyme pseudouridine synthase (RluD), there is an accumulation of 50S and 30S subunits and the appearance of abnormal particles (62S and 39S), with concomitant loss of 70S ribosomes.	109
402227	pfam10502	Peptidase_S26	Signal peptidase, peptidase S26. This is a family of membrane signal serine endopeptidases which function in the processing of newly-synthesized secreted proteins. Peptidase S26 removes the hydrophobic, N-terminal, signal peptides as proteins are translocated across membranes. The active site residues take the form of a catalytic dyad that is Ser, Lys in subfamily S26A; the Ser is the nucleophile in catalysis, and the Lys is the general base.	162
402228	pfam10503	Esterase_phd	Esterase PHB depolymerase. This family of proteins include acetyl xylan esterases (AXE), feruloyl esterases (FAE), and poly(3-hydroxybutyrate) (PHB) depolymerases.	219
371098	pfam10504	DUF2452	Protein of unknown function (DUF2452). This protein is found in eukaryotes but its function is unknown.	152
402229	pfam10505	NARG2_C	NMDA receptor-regulated gene protein 2 C-terminus. The transition of neuronal cells from precursor to mature state is regulated by the N-methyl-d-aspartate (NMDA) receptor, a glutamate-gated ion channel that is permeable to Ca2+. NMDA receptors probably mediate this activity by permitting expression of NARG2. NARG2 is transiently expressed, being a regulatory protein that is present in the nucleus of dividing cells and then down-regulated as progenitors exit the cell cycle and begin to differentiate. NARG2 contains repeats of (S/T)PXX (11 in mouse, six in human), a putative DNA-binding motif that is found in many gene-regulatory proteins including Kruppel, Hunchback and Antennapedia.	206
402230	pfam10506	MCC-bdg_PDZ	PDZ domain of MCC-2 bdg protein for Usher syndrome. The protein has a high homology to the tumor suppressor MCC (mutated in colon cancer; or MCC1 hereafter) and was named MCC2. MCC2 protein binds the first PDZ domain of AIE-75 with its C-terminal amino acids -DTFL. A possible role of MCC2 as a tumor suppressor has been put forward. The carboxyl terminus of the predicted protein was DTFL which matched the consensus motif X-S/T-X-phi (phi: hydrophobic amino acid residue) for binding to the PDZ domain of AIE-75.	65
402231	pfam10507	TMEM65	Transmembrane protein 65. TMEM65 is an intercalated disc protein that interacts with connexin 43 (Cx43) and is required for correct localization of Cx43 to the intercalated disc. It is essential for cardiac function in zebrafish.	108
371102	pfam10508	Proteasom_PSMB	Proteasome non-ATPase 26S subunit. The 26S proteasome, a eukaryotic ATP-dependent, dumb-bell shaped, protease complex with a molecular mass of approximately 2000 kDa, consists of a central 20S proteasome, functioning as a catalytic machine, and two large V-shaped terminal modules, having possible regulatory roles, composed of multiple subunits of 25-110 kDa attached to the central portion in opposite orientations. It is responsible for degradation of abnormal intracellular proteins, including oxidatively damaged proteins, and may play a role as a component of a cellular anti-oxidative system. Expression of catalytic core subunits including PSMB5 and peptidase activities of the proteasome were elevated following incubation with 3-methylcholanthrene. The 20S proteasome comprises a cylindrical stack of four rings, two outer rings formed by seven alpha-subunits (alpha1-alpha7) and two inner rings of seven beta-subunits (beta1-beta7). Two outer rings of alpha subunits maintain structure, while the central beta rings contain the proteolytic active core subunits beta1 (PSMB6), beta2 (PSMB7), and beta5 (PSMB5). Expression of PSMB5 can be altered by chemical reactants, such as 3-methylcholanthrene.	497
402232	pfam10509	GalKase_gal_bdg	Galactokinase galactose-binding signature. This is the highly conserved galactokinase signature sequence which appears to be present in all galactokinases irrespective of how many other ATP binding sites, etc that they carry. The function of this domain appears to be to bind galactose, and the domain is normally at the N-terminus of the enzymes, EC:2.7.1.6. This domain is associated with the families GHMP_kinases_C, pfam08544 and GHMP_kinases_N, pfam00288.	50
402233	pfam10510	PIG-S	Phosphatidylinositol-glycan biosynthesis class S protein. PIG-S is one of several key, core, components of the glycosylphosphatidylinositol (GPI) trans-amidase complex that mediates GPI anchoring in the endoplasmic reticulum. Anchoring occurs when a protein's C-terminal GPI attachment signal peptide is replaced with a pre-assembled GPI. Mammalian GPITransamidase consists of at least five components: Gaa1, Gpi8, PIG-S, PIG-T, and PIG-U, all five of which are required for function. It is possible that Gaa1, Gpi8, PIG-S, and PIG-T form a tightly associated core that is only weakly associated with PIG-U. The exact function of PIG-S is unclear.	497
371104	pfam10511	Cementoin	Trappin protein transglutaminase binding domain. Trappin-2, itself a protease inhibitor, has this unique N-terminal domain that enables it to become cross-linked to extracellular matrix proteins by transglutaminase. This domain contains several repeated motifs with the consensus sequence Gly-Gln-Asp-Pro-Val-Lys, and these together can anchor the whole molecule to extracellular matrix proteins, such as laminin, fibronectin, beta-crystallin, collagen IV, fibrinogen, and elastin, by transglutaminase-catalyzed cross-links. The whole domain is rich in glutamine and lysine, thus allowing transglutaminase(s) to catalyze the formation of an intermolecular epsilon-(gamma-glutamyl)lysine isopeptide bond. Cementoin is associated with the WAP family, pfam00095, at the C-terminus.	17
402234	pfam10512	Borealin	Cell division cycle-associated protein 8. The chromosomal passenger complex of Aurora B kinase, INCENP, and Survivin has essential regulatory roles at centromeres and the central spindle in mitosis. Borealin is also a member of the complex. Approximately half of Aurora B in mitotic cells is complexed with INCENP, Borealin, and Survivin. Depletion of Borealin by RNA interference delays mitotic progression and results in kinetochore-spindle mis-attachments and an increase in bipolar spindles associated with ectopic asters.	120
402235	pfam10513	EPL1	Enhancer of polycomb-like. This is a family of EPL1 (Enhancer of polycomb-like) proteins. The EPL1 protein is a member of a histone acetyltransferase complex which is involved in transcriptional activation of selected genes.	166
402236	pfam10514	Bcl-2_BAD	Pro-apoptotic Bcl-2 protein, BAD. BAD is a Bcl-2 homology domain 3 (BH3)-only pro-apoptotic member of the Bcl-2 protein family that is regulated by phosphorylation in response to survival factors. Binding of BAD to mitochondria is thought to be exclusively mediated by its BH3 domain. Membrane localization of BAD mediates membrane translocation of Bcl-XL. The C-terminal part of BAD is sufficient for membrane binding. There are two segments with differing lipid-binding preferences, LBD1 and LBD2, that are responsible for this binding: (i) LBD1 located in the proximity of the BH3 domain (amino acids 122-131) and (ii) LBD2, the putative C-terminal alpha-helix-5. Phosphorylation-regulated 14-3-3 protein binding may expose the cholesterol-preferring LBD1 and bury the LBD2, thereby mediating translocation of BAD to raft-like micro-domains.	166
402237	pfam10515	APP_amyloid	beta-amyloid precursor protein C-terminus. This is the amyloid, C-terminal, protein of the beta-Amyloid precursor protein (APP) which is a conserved and ubiquitous transmembrane glycoprotein strongly implicated in the pathogenesis of Alzheimer's disease but whose normal biological function is unknown. The C-terminal 100 residues are released and aggregate into amyloid deposits which are strongly implicated in the pathology of Alzheimer's disease plaque-formation. The domain is associated with family A4_EXTRA, pfam02177, further towards the N-terminus.	52
402238	pfam10516	SHNi-TPR	SHNi-TPR. SHNi-TPR family members contain a reiterated sequence motif that is an interrupted form of TPR repeat.	38
402239	pfam10517	DM13	Electron transfer DM13. The DM13 domain is a component of a novel electron-transfer system potentially involved in oxidative modification of animal cell-surface proteins. It contains a nearly absolutely conserved cysteine, which could be involved in a redox reaction, either as a naked thiol group or through binding a prosthetic group like heme.	104
402240	pfam10518	TAT_signal	TAT (twin-arginine translocation) pathway signal sequence. 	26
402241	pfam10520	TMEM189_B_dmain	B domain of TMEM189, localization domain. TMEM189_B is the B domain or probable localization domain of the transmembrane protein TMEM189 which in some mammals is fused with Kua ubiquitin-conjugation E2 enzyme proteins. The domain is also found on fatty acid desaturase FAD4 in Arabidopsis.	176
402242	pfam10521	Tti2	Tti2 family. Budding yeast Tti2 is a subunit of the ASTRA complex, which is involved in chromatin remodelling. Tti2 homolog from humans, TELO2-interacting protein 2, is part of the TTT complex that is involved in the cellular resistance to DNA damage stresses.	282
371112	pfam10522	RII_binding_1	RII binding domain. This domain is found in a wide variety of AKAPs (A kinase anchoring proteins). The domain is also found on microtubule-associated proteins.	19
402243	pfam10523	BEN	BEN domain. The BEN domain is found in diverse animal proteins such as BANP/SMAR1, NAC1 and the Drosophila mod(mdg4) isoform C, in the chordopoxvirus virosomal protein E5R and in several proteins of polydnaviruses. Computational analysis suggests that the BEN domain mediates protein-DNA and protein-protein interactions during chromatin organisation and transcription.	77
371114	pfam10524	NfI_DNAbd_pre-N	Nuclear factor I protein pre-N-terminus. The Nuclear factor I (NFI) family of site-specific DNA-binding proteins (also known as CTF or CAAT box transcription factor) functions both in viral DNA replication and in the regulation of gene expression in higher organisms. The N-terminal 200 residues contains the DNA-binding and dimerization domain, but also has an 8-47 residue highly conserved region 5' of this, whose function is not known. Deletion of the N-terminal 200 amino acids removes the DNA-binding activity, dimerization-ability and the stimulation of adenovirus DNA replication.	41
402244	pfam10525	Engrail_1_C_sig	Engrailed homeobox C-terminal signature domain. Engrailed homeobox proteins are characterized by the presence of a conserved region of some 20 amino-acid residues located at the C-terminal of the 'homeobox' domain. This domain of approximately 20 residues forms a kind of a signature pattern for this subfamily of proteins.	31
402245	pfam10528	GLEYA	GLEYA domain. The GLEYA domain is related to lectin-like binding domains found in the S. cerevisiae Flo proteins and the C. glabrata Epa proteins. It is a carbohydrate-binding domain that is found in fungal adhesins (also referred to as agglutinins or flocculins). Adhesins with a GLEYA domain possess a typical N-terminal signal peptide and a domain of conserved sequence repeats, but lack glycosylphosphatidylinositol (GPI) anchor attachment signals. They contain a conserved motif G(M/L)(E/A/N/Q)YA, hence the name GLEYA. Based on sequence homology, it is suggested that the GLEYA domain would predominantly contain beta sheets. The GLEYA domain is also found in S. pombe putative cell agglutination protein fta5, thought to be a kinetochore protein (Sim4 complex subunit); however, no direct evidence for kinetochore association has been found. Furthermore, a global protein localization study in S. pombe identified it as a secreted protein localized to the Golgi complex.	91
402246	pfam10529	Hist_rich_Ca-bd	Histidine-rich Calcium-binding repeat region. This is a histidine-rich calcium binding repeat which appears in proteins called histidine-rich-calcium binding proteins (HRC). HRC is a high capacity, low affinity Ca2+-binding protein, residing in the lumen of the sarcoplasmic reticulum. HRC binds directly to triadin. This binding interaction occurs between the histidine-rich region of HRC and multiple clusters of charged amino acids, named as the KEKE motifs, in the lumenal domain of triadin. The region in which this repeat is found in many copies is long and variable but is the acidic region of the protein. There is also a cysteine-rich region further towards the C-terminus. HRC may regulate sarcoplasmic reticular calcium transport and play a critical role in maintaining calcium homeostasis and function in the heart. HRC is a candidate regulator of sarcoplasmic reticular calcium uptake.	15
287498	pfam10530	Toxin_35	Toxin with inhibitor cystine knot ICK or Knottin scaffold. Spider toxins of the CSTX family are ion channel toxins containing an inhibitor cystine knot (ICK) structural motif or Knottin scaffold. The four disulfide bonds present in the CSTX spider toxin family are arranged in the following pattern: 1-4, 2-5, 3-8 and 6-7. CSTX-1 is the most important component of C. salei venom in terms of relative abundance and toxicity and therefore is likely to contribute significantly to the overall toxicity of the whole venom. CSTX-1 blocked rat neuronal L-type, but no other types of HVA Cav channels. Interestingly, the omega-toxins from Phoneutria nigriventer venom (another South American species also belonging to the Ctenidae family) are included as they carry the same disulfide bond arrangement, which suggests that CSTX-1 may interact with Cav channels. Calcium ion voltage channel heteromultimer containing an L-type pore-forming alpha1-subunit is the most probable candidate for the molecular target of CSTX-1 and these toxins.	61
402247	pfam10531	SLBB	SLBB domain. 	56
119052	pfam10532	Plant_all_beta	Plant specific N-all beta domain. This domain was identified by Babu and colleagues. It is found associated with the WRKY domain pfam03106.	114
402248	pfam10533	Plant_zn_clust	Plant zinc cluster domain. This zinc binding domain was identified by Babu and colleagues and found associated with the WRKY domain pfam03106.	42
402249	pfam10534	CRIC_ras_sig	Connector enhancer of kinase suppressor of ras. The CRIC - Connector enhancer of kinase suppressor of ras - domain functions as a scaffold in several signal cascades and acts on proliferation, differentiation and apoptosis.	93
402250	pfam10536	PMD	Plant mobile domain. This domain was identified by Babu and colleagues in a variety of transposases.	345
402251	pfam10537	WAC_Acf1_DNA_bd	ATP-utilising chromatin assembly and remodelling N-terminal. ACF (for ATP-utilising chromatin assembly and remodelling factor) is a chromatin-remodelling complex that catalyzes the ATP-dependent assembly of periodic nucleosome arrays. The WAC (WSTF/Acf1/cbp146) domain is an approximately 110-residue module present at the N-termini of Acf1-related proteins in a variety of organisms. The DNA-binding region of Acf1 includes the WAC domain, which is necessary for the efficient binding of ACF complex to DNA.	101
402252	pfam10538	ITAM_Cys-rich	Immunoreceptor tyrosine-based activation motif. Signal transduction by T and B cell antigen receptors and certain receptors for Ig Fc regions involves a conserved sequence motif, termed an immunoreceptor tyrosine-based activation motif (ITAM). It is also found in the cytoplasmic domain of apoptosis receptor.	38
402253	pfam10539	Dev_Cell_Death	Development and cell death domain. The DCD domain is found in plant proteins involved in development and cell death. The DCD domain is an approximately 130 amino acid long stretch that contains several mostly invariable motifs. These include a FGLP and a LFL motif at the N-terminus and a PAQV and a PLxE motif towards the C-terminus of the domain. The DCD domain is present in proteins with different architectures. Some of these proteins contain additional recognisable motifs, like the KELCH repeats or the ParB domain.	126
402254	pfam10540	Membr_traf_MHD	Munc13 (mammalian uncoordinated) homology domain. Munc13 proteins constitute a family of three highly homologous molecules (Munc13-1, Munc13-2 and Munc13-3) with homology to Caenorhabditis elegans unc-13p. Munc13 proteins contain a phorbol ester-binding C1 domain and two C2 domains, which are Ca2+/phospholipid binding domains. Sequence analyses have uncovered two regions called Munc13 homology domains 1 (MHD1) and 2 (MHD2) that are arranged between two flanking C2 domains. MHD1 and MHD2 domains are present in a wide variety of proteins from Arabidopsis thaliana, C. elegans, Drosophila melanogaster, mouse, rat and human, some of which may function in a Munc13-like manner to regulate membrane trafficking. The MHD1 and MHD2 domains are predicted to be alpha-helical.	141
402255	pfam10541	KASH	Nuclear envelope localization domain. The KASH (for Klarsicht/ANC-1/Syne-1 homology) or KLS domain is a highly hydrophobic nuclear envelope localization domain of approximately 60 amino acids comprising a 20-amino-acid transmembrane region and a 30-35-residue C-terminal region that lies between the inner and the outer nuclear membranes. During meiotic prophase, telomeres cluster to form a bouquet arrangement of chromosomes. SUN and KASH domain proteins form complexes that span both membranes of the nuclear envelope. The KASH domain links the dynein motor complex of the microtubules, through the outer nuclear membrane to the Sad1 domain in the inner nuclear membrane which then interacts with the bouquet proteins Bqt1 and Bqt2 that are complexed with Bqt4, Rap1 and Taz1 and attached to the telomere. SUN domain-containing proteins are essential for recruiting KASH domain proteins at the outer nuclear membrane, and KASH domains provide a generic NE tethering device for functionally distinct proteins whose cytoplasmic domains mediate nuclear positioning, maintain physical connections with other cellular organelles, and possibly even influence chromosome dynamics.	58
371125	pfam10542	Vitelline_membr	Vitelline membrane cysteine-rich region. In Drosophila melanogaster the vitelline membrane (VM) is the first layer of the eggshell produced by the follicular epithelium. It is composed of at least four different proteins. VM proteins are similarly organized with a central highly conserved 38-amino acid domain which is flanked by unrelated regions. The domain contains three highly conserved cysteines.	37
402256	pfam10543	ORF6N	ORF6N domain. This domain was identified by Iyer and colleagues.	82
402257	pfam10544	T5orf172	T5orf172 domain. This domain was identified by Iyer and colleagues.	98
402258	pfam10545	MADF_DNA_bdg	Alcohol dehydrogenase transcription factor Myb/SANT-like. The myb/SANT-like domain in Adf-1 (MADF) is an approximately 80-amino-acid module that directs sequence specific DNA binding to a site consisting of multiple tri-nucleotide repeats. The MADF domain is found in one or more copies in eukaryotic and viral proteins and is often associated with the BESS domain. It is likely that the MADF domain is more closely related to the myb/SANT domain than it is to other HTH domains.	85
402259	pfam10546	P63C	P63C domain. This domain was identified by Iyer and colleagues.	93
402260	pfam10547	P22_AR_N	P22_AR N-terminal domain. This domain was identified by Iyer and colleagues.	110
402261	pfam10548	P22_AR_C	P22AR C-terminal domain. This domain was identified by Iyer and colleagues. It is found associated with pfam10547.	73
287514	pfam10549	ORF11CD3	ORF11CD3 domain. This domain was identified by Iyer and colleagues.	52
402262	pfam10550	Toxin_36	Conantokin-G mollusc-toxin. The conantokins are a family of neuroactive peptides found in the venoms of fish-hunting cone snails. They possess a high content of gamma-carboxyglutamic acid (Gla) (4-5 residues), a non-standard amino-acid made by the post-translational modification of glutamate (Glu) residue. Conantokins are the only natural biochemically characterized peptides known to be N-methyl-D-aspartate (NMDA) receptor antagonists.	17
402263	pfam10551	MULE	MULE transposase domain. This domain was identified by Babu and colleagues.	98
402264	pfam10552	ORF6C	ORF6C domain. This domain was identified by Iyer and colleagues.	114
287517	pfam10553	MSV199	MSV199 domain. This domain was identified by Iyer and colleagues.	132
402265	pfam10554	Phage_ASH	Ash protein family. This family was identified by Iyer and colleagues. It includes the Ash protein from bacteriophage P4.	102
402266	pfam10555	MraY_sig1	Phospho-N-acetylmuramoyl-pentapeptide-transferase signature 1. Phospho-N-acetylmuramoyl-pentapeptide-transferase (EC 2.7.8.13) (mraY) is a bacterial enzyme responsible for the formation of the first lipid intermediate of the cell wall peptidoglycan synthesis. It catalyzes the formation of undecaprenyl-pyrophosphoryl-N-acetylmuramoyl-pentapeptide from UDP-MurNAc-pentapeptide and undecaprenyl-phosphate. It is an integral membrane protein with probably ten transmembrane domains. This domain is located at the end of the first cytoplasmic loop and the beginning of the second transmembrane domain.	13
402267	pfam10557	Cullin_Nedd8	Cullin protein neddylation domain. This is the neddylation site of cullin proteins which are a family of structurally related proteins containing an evolutionarily conserved cullin domain. With the exception of APC2, each member of the cullin family is modified by Nedd8 and several cullins function in Ubiquitin-dependent proteolysis, a process in which the 26S proteasome recognizes and subsequently degrades a target protein tagged with K48-linked poly-ubiquitin chains. Cullins are molecular scaffolds responsible for assembling the ROC1/Rbx1 RING-based E3 ubiquitin ligases, of which several play a direct role in tumorigenesis. Nedd8/Rub1 is a small ubiquitin-like protein, which was originally found to be conjugated to Cdc53, a cullin component of the SCF (Skp1-Cdc53/CUL1-F-box protein) E3 Ub ligase complex in Saccharomyces cerevisiae, and Nedd8 modification has now emerged as a regulatory pathway of fundamental importance for cell cycle control and for embryogenesis in metazoans. The only identified Nedd8 substrates are cullins. Neddylation results in covalent conjugation of a Nedd8 moiety onto a conserved cullin lysine residue.	62
402268	pfam10558	MTP18	Mitochondrial 18 KDa protein (MTP18). This family of proteins are mitochondrial 18KDa proteins that are often misannotated as carbonic anhydrases. It was shown that knockdown of MTP18 protein results in a cytochrome c release from mitochondria and consequently leads to apoptosis. Overexpression studies suggest that MTP18 is required for mitochondrial fission.	170
402269	pfam10559	Plug_translocon	Plug domain of Sec61p. The Sec61/SecY translocon mediates translocation of proteins across the membrane and integration of membrane proteins into the lipid bilayer. The structure of the translocon revealed a plug domain blocking the pore on the lumenal side. The plug is unlikely to be important for sealing the translocation pore in yeast but it plays a role in stabilizing Sec61p during translocon formation. The domain runs from residues 52-74.	33
402270	pfam10561	UPF0565	Uncharacterized protein family UPF0565. This family of proteins has no known function.	290
402271	pfam10562	CaM_bdg_C0	Calmodulin-binding domain C0 of NMDA receptor NR1 subunit. This is a very short highly conserved domain that is C-terminal to the cytosolic transmembrane region IV of the NMDA-receptor 1. It has been shown to bind Calmodulin-Calcium with high affinity. The ionotropic N-methyl-D-aspartate receptor (NMDAR) is a major source of calcium flux into neurons in the brain and plays a critical role in learning, memory, neural development, and synaptic plasticity. Calmodulin (CaM) regulates NMDARs by binding tightly to the C0 and C1 regions of their NR1 subunit. The conserved tryptophan is considered to be the anchor residue.	29
402272	pfam10563	CdCA1	Cadmium carbonic anhydrase repeat. This domain is the cadmium carbonic anhydrase repeat unit of the beta-carbonic anhydrase of a marine diatom, that uses both zinc and cadmium for catalysis of the reversible hydration of carbon dioxide for use in inorganic carbon acquisition for photosynthesis (thus being a cambialistic enzyme). Compared with alpha- and gamma-carbonic anhydrases that use three histidines to coordinate the zinc-atom, this beta-carbonic anhydrase has two cysteines and one histidine, and rapidly binds cadmium.	182
402273	pfam10564	MAR_sialic_bdg	Sialic-acid binding micronemal adhesive repeat. This domain is a novel carbohydrate-binding domain found on micronemal proteins. Micronemal proteins (MICs) are released onto the parasite surface just before invasion of host cells and play important roles in host cell recognition, attachment and penetration. Toxoplasma gondii can infect and replicate within all nucleated cells. This domain interacts with sialylated oligosaccharides; the protein in Toxoplasma gondii is a monomer but several MAR domains are carried on the protein. Each MAR domain contains one central sialic acid-binding pocket.	94
402274	pfam10565	NMDAR2_C	N-methyl D-aspartate receptor 2B3 C-terminus. This domain is found at the C-terminus of many NMDA-receptor proteins, many of which also carry the Ligated ion-channel family pfam00060 further upstream as well as the ANF_receptor family pfam01094. This region is predicted to be a large extra-cellular domain of the NMDA receptor proteins, being highly hydrophilic, and is thought to be integrally involved in the function of the receptor. The region also carries a number of potential N-glycosylation sites.	634
402275	pfam10566	Glyco_hydro_97	Glycoside hydrolase 97. This domain is the catalytic region of the bacterial glycosyl-hydrolase family 97. This central part of the GH97 family protein sequences represents a typical and complete (beta/alpha)8-barrel or catalytic TIM-barrel type domain. The N- and C-terminal parts of the sequences, mainly consisting of beta-strands, form two additional non-catalytic domains. In all known glycosidases with the (beta-alpha)8-barrel fold, the amino acid residues at the active site are located on the C-termini of the beta-strands.	278
402276	pfam10567	Nab6_mRNP_bdg	RNA-recognition motif. This conserved domain is found in fungal proteins and appears to be involved in RNA-processing. It binds to poly-adenylated RNA, interacts genetically with mRNA 3'-end processing factors, copurifies with the nuclear cap-binding protein Cbp20p, and is found in complexes containing other translation factors, such as EIF4G.	315
402277	pfam10568	Tom37	Outer mitochondrial membrane transport complex protein. The TOM37 protein is one of the outer membrane proteins that make up the TOM complex for guiding cytosolic mitochondrial beta-barrel proteins from the cytosol across the outer mitochondrial membrane into the intra-membrane space. In conjunction with TOM70 it guides peptides without an MTS into TOM40, the protein that forms the passage through the outer membrane. It has homology with Metaxin-1, also part of the outer mitochondrial membrane beta-barrel protein transport complex.	125
287532	pfam10570	Myelin-PO_C	Myelin-PO cytoplasmic C-term p65 binding region. Myelin protein zero is the major myelin protein in the peripheral nervous system and is essential for normal myelination. The family is a single-pass transmembrane molecule containing one Ig-like loop in the extracellular domain and this highly basic 69 residue C-terminal cytoplasmic domain which is the region that interacts with protein p65.	65
402278	pfam10571	UPF0547	Uncharacterized protein family UPF0547. This domain contains a zinc-ribbon motif.	26
402279	pfam10572	UPF0556	Uncharacterized protein family UPF0556. This family of proteins has no known function.	126
287535	pfam10573	UPF0561	Uncharacterized protein family UPF0561. This family of proteins has no known function.	120
402280	pfam10574	UPF0552	Uncharacterized protein family UPF0552. This family of proteins has no known function.	224
402281	pfam10576	EndIII_4Fe-2S	Iron-sulfur binding domain of endonuclease III. Escherichia coli endonuclease III (EC 4.2.99.18) is a DNA repair enzyme that acts both as a DNA N-glycosylase, removing oxidized pyrimidines from DNA, and as an apurinic/apyrimidinic (AP) endonuclease, introducing a single-strand nick at the site from which the damaged base was removed. Endonuclease III is an iron-sulfur protein that binds a single 4Fe-4S cluster. The 4Fe-4S cluster does not seem to be important for catalytic activity, but is probably involved in the proper positioning of the enzyme along the DNA strand. The 4Fe-4S cluster is bound by four cysteines which are all located in a 17 amino acid region at the C-terminal end of endonuclease III. A similar region is also present in the central section of mutY and in the C-terminus of ORF-10 and of the Micrococcus UV endonuclease.	17
402282	pfam10577	UPF0560	Uncharacterized protein family UPF0560. This family of proteins has no known function.	819
371146	pfam10578	SVS_QK	Seminal vesicle protein repeat. 	12
402283	pfam10579	Rapsyn_N	Rapsyn N-terminal myristoylation and linker region. Neuromuscular junction formation relies upon the clustering of acetylcholine receptors and other proteins in the muscle membrane. Rapsyn is a peripheral membrane protein that is selectively concentrated at the neuromuscular junction and is essential for the formation of synaptic acetylcholine receptor aggregates. Acetylcholine receptors fail to aggregate beneath nerve terminals in mice where rapsyn has been knocked out. The N-terminal six amino acids of rapsyn are its myristoylation site, and myristoylation is necessary for the targeting of the protein to the membrane.	80
402284	pfam10580	Neuromodulin_N	Gap junction protein N-terminal region. 	30
402285	pfam10581	Synapsin_N	Synapsin N-terminal. This highly conserved domain of synapsin proteins has a serine at position 9 or 10 which is a phosphorylation site. The domain appears to be the part of the molecule that binds to calmodulin.	32
402286	pfam10583	Involucrin_N	Involucrin of squamous epithelia N-terminus. This is the N-terminal three beta strands of involucrin, a protein present in keratinocytes of epidermis and other stratified squamous epithelia. Involucrin first appears in the cell cytosol, but ultimately becomes cross-linked to membrane proteins by transglutaminase thus helping in the formation of an insoluble envelope beneath the plasma membrane. Apigenin is a plant-derived flavonoid that has significant promise as a skin cancer chemopreventive agent. It has been found that apigenin regulates normal human keratinocyte differentiation by suppressing it and this is associated with reduced cell proliferation without apoptosis. The downstream part of the protein is represented by the family Involucrin, pfam00904.	69
402287	pfam10584	Proteasome_A_N	Proteasome subunit A N-terminal signature. This domain is conserved in the A subunits of the proteasome complex proteins.	23
402288	pfam10585	UBA_e1_thiolCys	Ubiquitin-activating enzyme active site. Ubiquitin-activating enzyme (E1 enzyme) activates ubiquitin by first adenylating with ATP its C-terminal glycine residue and thereafter linking this residue to the side chain of a cysteine residue in E1, yielding an ubiquitin-E1 thiolester and free AMP. Later the ubiquitin moiety is transferred to a cysteine residue on one of the many forms of ubiquitin-conjugating enzymes (E2). This domain carries the last of five conserved cysteines that is part of the active site of the enzyme, responsible for ubiquitin thiolester complex formation, the active site being represented by the sequence motif PICTLKNFP.	252
402289	pfam10587	EF-1_beta_acid	Eukaryotic elongation factor 1 beta central acidic region. 	28
402290	pfam10588	NADH-G_4Fe-4S_3	NADH-ubiquinone oxidoreductase-G iron-sulfur binding region. 	40
402291	pfam10589	NADH_4Fe-4S	NADH-ubiquinone oxidoreductase-F iron-sulfur binding region. 	83
402292	pfam10590	PNP_phzG_C	Pyridoxine 5'-phosphate oxidase C-terminal dimerization region. This domain represents one of the two dimerization regions of the protein, located at the edge of the dimer interface, at the C-terminus, being the last three beta strands, S6, S7, and S8 along with the last three residues to the end. In Myxococcus xanthus PdxH, S6 runs from residues 178-192, S7 from 200-206 and S8 from 211-215. The extended loop of residues 167-177 may well be involved in the pocket formed between the two dimers that positions the FMN molecule. To date, the only time functional oxidase or phenazine biosynthesis activities have been experimentally demonstrated is when the sequences contain both pfam01243 and pfam10590. The role performed by each domain in bringing about molecular functions of either oxidase or phenazine activity is unknown.	42
402293	pfam10591	SPARC_Ca_bdg	Secreted protein acidic and rich in cysteine Ca binding region. The SPARC_Ca_bdg domain of Secreted Protein Acidic and Rich in Cysteine is responsible for the anti-spreading activity of human urothelial cells. It is rich in alpha-helices. This extracellular calcium-binding domain contains two EF-hands that each coordinates one Ca2+ ion, forming a helix-loop-helix structure that not only drives the conformation of the protein but is also necessary for biological activity. The anti-spreading activity was dependent on the coordination of Ca2+ by a Glu residue at the Z position of EF-hand 2.	108
402294	pfam10592	AIPR	AIPR protein. This family of proteins was identified as an abortive infection phage resistance protein often found in restriction modification system operons.	297
402295	pfam10593	Z1	Z1 domain. This uncharacterized domain was identified by Iyer and colleagues. It is found associated with a helicase domain of superfamily type II.	222
402296	pfam10595	UPF0564	Uncharacterized protein family UPF0564. This family of proteins has no known function. However, one of the members, TTHERM_01026310, is annotated as an EF-hand family protein.	362
371153	pfam10596	U6-snRNA_bdg	U6-snRNA interacting domain of PrP8. This domain incorporates the interacting site for the U6-snRNA as part of the U4/U6.U5 tri-snRNPs complex of the spliceosome, and is the prime candidate for the role of cofactor for the spliceosome's RNA core. The essential spliceosomal protein Prp8 interacts with U5 and U6 snRNAs and with specific pre-mRNA sequences that participate in catalysis. This close association with crucial RNA sequences, together with extensive genetic evidence, suggests that Prp8 could directly affect the function of the catalytic core, perhaps acting as a splicing cofactor.	159
402297	pfam10597	U5_2-snRNA_bdg	U5-snRNA binding site 2 of PrP8. The essential spliceosomal protein Prp8 interacts with U5 and U6 snRNAs and with specific pre-mRNA sequences that participate in catalysis. This close association with crucial RNA sequences, together with extensive genetic evidence, suggests that Prp8 could directly affect the function of the catalytic core, perhaps acting as a splicing cofactor.	134
402298	pfam10598	RRM_4	RNA recognition motif of the spliceosomal PrP8. The large RNA-protein complex of the spliceosome catalyzes pre-mRNA splicing. One of the most conserved core proteins is PrP8 which occupies a central position in the catalytic core of the spliceosome, and has been implicated in several crucial molecular rearrangements that occur there, and has recently come under the spotlight for its role in the inherited human disease, Retinitis Pigmentosa. The RNA-recognition motif of PrP8 is highly conserved and provides a possible RNA binding centre for the 5-prime SS, BP, or 3-prime SS of pre-mRNA which are known to contact with Prp8. The most conserved regions of an RRM are defined as the RNP1 and RNP2 sequences. Recognition of RNA targets can also be modulated by a number of other factors, most notably the two loops beta1-alpha1, beta2-beta3 and the amino acid residues C-terminal to the RNP2 domain.	92
371156	pfam10599	Nup_retrotrp_bd	Retro-transposon transporting motif. This is the highly conserved C-terminal motif GRKIxxxxxRRKx of nucleoporins that plays a critical and unique role in the nuclear import of retro-transposons in both yeasts and higher organisms. It would appear that the arginine residues at positions 2 and 9-10 constitute a bipartite nuclear localization signal, with two basic peptide motifs separated by an interchangeable spacer sequence, that is crucial for the retro-transposon activity.	86
402299	pfam10600	PDZ_assoc	PDZ-associated domain of NMDA receptors. This domain is found in higher eukaryotes between the second and third PDZ domains, pfam00595, of glutamate receptor like proteins. Its exact function is not known.	68
402300	pfam10601	zf-LITAF-like	LITAF-like zinc ribbon domain. Members of this family display a conserved zinc ribbon structure with the motif C-XX-C- separated from the more C-terminal HX-C(P)X-C-X4-G-R motif by a variable region of usually 25-30 (hydrophobic) residues. Although it belongs to one of the zinc finger's fold groups (zinc ribbon), this particular domain was first identified in LPS-induced tumor necrosis alpha factor (LITAF) which is produced in mammalian cells after being challenged with lipopolysaccharide (LPS). The hydrophobic region probably inserts into the membrane rather than traversing it. Such an insertion brings together the N- and C-terminal C-XX-C motifs to form a compact Zn2+-binding structure.	67
402301	pfam10602	RPN7	26S proteasome subunit RPN7. RPN7 (known as the non ATPase regulatory subunit 6 in higher eukaryotes) is one of the lid subunits of the 26S proteasome and has been shown in Saccharomyces cerevisiae to be required for structural integrity. The 26S proteasome is involved in the ATP-dependent degradation of ubiquitinated proteins.	174
402302	pfam10604	Polyketide_cyc2	Polyketide cyclase / dehydrase and lipid transport. This family contains polyketide cyclases/dehydrases which are enzymes involved in polyketide synthesis. It also includes other proteins of the START superfamily.	132
402303	pfam10605	3HBOH	3HB-oligomer hydrolase (3HBOH). D-(-)-3-hydroxybutyrate oligomer hydrolase (also known as 3HB-oligomer hydrolase) functions in the degradation of poly-3-hydroxybutyrate (PHB). It catalyzes the hydrolysis of D(-)-3-hydroxybutyrate oligomers (3HB-oligomers) into 3HB-monomers.	690
402304	pfam10606	GluR_Homer-bdg	Homer-binding domain of metabotropic glutamate receptor. This is the proline-rich region of metabotropic glutamate receptor proteins that binds Homer-related synaptic proteins. The Homer proteins form a physical tether linking mGluRs with the inositol trisphosphate receptors (IP3R) that appears to be due to the proline-rich "Homer ligand" (PPXXFr). Activation of PI turnover triggers intracellular calcium release. MGluR function is altered in the mouse model of human Fragile X syndrome mental retardation, a disorder caused by loss of function mutations in the Fragile X mental retardation gene Fmr1. Homer 3 (and to a lesser extent Homer 1b/c) has been shown to form a multimeric complex with mGlu1a and the IP3 receptor, indicating that Homers may play a role in the localization of receptors to their signalling partners.	50
402305	pfam10607	CLTH	CTLH/CRA C-terminal to LisH motif domain. RanBPM is a scaffolding protein and is important in regulating cellular function in both the immune system and the nervous system. This domain is at the C-terminus of the proteins and is the binding domain for the CRA motif (for CT11-RanBPM), which is comprised of approximately 100 amino acids at the C terminal of RanBPM. It was found to be important for the interaction of RanBPM with fragile X mental retardation protein (FMRP), but its functional significance has yet to be determined. This region contains CTLH and CRA domains annotated by SMART; however, these may be a single domain, and it is referred to as a C-terminal to LisH motif.	143
402306	pfam10608	MAGUK_N_PEST	Polyubiquitination (PEST) N-terminal domain of MAGUK. The residues upstream of this domain are the probable palmitoylation sites, particularly two cysteines. The domain has a putative PEST site at the very start that seems to be responsible for poly-ubiquitination. PEST domains are polypeptide sequences enriched in proline (P), glutamic acid (E), serine (S) and threonine (T) that target proteins for rapid destruction. The whole domain, in conjunction with a C-terminal domain of the longer protein, is necessary for dimerization of the whole protein.	89
402307	pfam10609	ParA	NUBPL iron-transfer P-loop NTPase. This family contains ATPases involved in plasmid partitioning. It also contains the cytosolic Fe-S cluster assembling factor NBP35 which is required for biogenesis and export of both ribosomal subunits.	246
313764	pfam10610	Tafi-CsgC	Thin aggregative fimbriae synthesis protein. Fimbriae are cell-surface protein polymers, of eg. E coli and Salmonella spp, that mediate interactions important for host and environmental persistence, development of biofilms, motility, colonisation and invasion of cells, and conjugation. Four general assembly pathways for different fimbriae have been proposed, one of which is extracellular nucleation-precipitation (ENP), that differs from the others in that fibre-growth occurs extracellularly. Thin aggregative fimbriae (Tafi) are the only fimbriae dependent on the ENP pathway. Tafi were first identified in Salmonella spp and the controlling operon termed agf; however subsequent isolation of the homologous operon in E coli led to its being called csg. Tafi are known as curli because, in the absence of extracellular polysaccharides, their morphology appears curled; however, when expressed with such polysaccharides their morphology appears as a tangled amorphous matrix. The gene agfC is found to be transcribed at low levels, localized to the periplasm in a mature form, and in combination with AgfE is important for AgfA extracellular assembly, which facilitates the synthesis of Tafi. The genes involved in Tafi production are organized into two adjacent divergently transcribed operons, agfBAC and agfDEFG, both of which are required for biosynthesis and assembly.	98
313765	pfam10611	DUF2469	Protein of unknown function (DUF2469). Member proteins often found in Actinomycetes clustered with signal peptidase and/or RNAse-HII.	100
402308	pfam10612	Spore-coat_CotZ	Spore coat protein Z. This family has members annotated as Spore coat protein Z, otherwise known as CotZ. It is a cysteine-rich spore coat family, and along with CotY is necessary for assembly of intact exosporium.	157
402309	pfam10613	Lig_chan-Glu_bd	Ligated ion channel L-glutamate- and glycine-binding site. This region, sometimes called the S1 domain, is the luminal domain just upstream of the first, M1, transmembrane region of transmembrane ion-channel proteins, and it binds L-glutamate and glycine. It is found in association with Lig_chan, pfam00060.	111
402310	pfam10614	CsgF	Type VIII secretion system (T8SS), CsgF protein. The extracellular nucleation-precipitation (ENP) pathway or Type VIII secretion system (T8SS) in Gram-negative (diderm) bacteria is responsible for the secretion and assembly of prepilins for fimbriae biogenesis, the prototypical curli. Besides the T2SS that can be involved in the assembly of prototypical Type 4 pilus, the T4SS that can be involved in the biogenesis of the prototypical pilus T, the T3SS involved in the assembly of the injectisome and the T7SS involved in the formation of the prototypical Type 1 pilus, the T8SS differs in that fibre-growth occurs extracellularly. The curli, also called thin aggregative fimbriae (Tafi), are the only fimbriae dependent on the T8SS. Tafi were first identified in Salmonella spp and the controlling operon termed agf; however subsequent isolation of the homologous operon in E coli led to its being called csg. In the absence of extracellular polysaccharides Tafi appear curled, although when expressed with such polysaccharides their morphology appears as a tangled amorphous matrix. CsgF is one of three putative curli assembly factors appearing to act as a nucleator protein. Unlike eukaryotic amyloid formation, curli biogenesis is a productive pathway requiring a specific assembly machinery.	118
402311	pfam10615	DUF2470	Protein of unknown function (DUF2470). This family is a putative haem-iron utilisation family, as many members are annotated as being pyridoxamine 5'-phosphate oxidase-related, FMN-binding; however this could not be confirmed.	73
402312	pfam10616	DUF2471	Protein of unknown function (DUF2471). The function of this family is unknown. Members all come from Burkholderia spp. BDAG_04162 is annotated as Serine/threonine-protein kinase, but this could not be confirmed.	118
402313	pfam10617	DUF2474	Protein of unknown function (DUF2474). This family of short proteins has no known function.	39
402314	pfam10618	Tail_tube	Phage tail tube protein. This bacterial family of proteins contains phage tail tube proteins related to the Mu phage tail tube protein M. Bacteriophage Mu has an icosahedral head and contractile tail. The tail is composed of an outer sheath and an inner tube.	117
402315	pfam10620	MdcG	Phosphoribosyl-dephospho-CoA transferase MdcG. MdcG is a phosphoribosyl-dephospho-CoA transferase that is involved in the biosynthesis of the prosthetic group of malonate decarboxylase. Malonate decarboxylase from Klebsiella pneumoniae contains an acyl carrier protein (MdcC) to which a 2'-(5''-phosphoribosyl)-3'-dephospho-CoA prosthetic group is attached via phosphodiester linkage. MdcG catalyzes the following reaction: 2'-(5''-triphosphoribosyl)-3'-dephospho-CoA + apo-[acyl-carrier-protein] = holo-[acyl-carrier-protein] + diphosphate.	196
287577	pfam10621	FpoO	F420H2 dehydrogenase subunit FpoO. This is the FpoO subunit of F420H2 dehydrogenase, an enzyme which oxidizes reduced coenzyme F420. Reduced coenzyme F420 is a universal electron carrier in methanogens.	110
402316	pfam10622	Ehbp	Energy-converting hydrogenase B subunit P (EhbP). Ehb (energy-converting hydrogenase B) is a methanogenic archaeal enzyme that functions in one of the metabolic pathways involved in methanol reduction to methane. This family contains subunit P of Ehb.	77
151143	pfam10623	PilI	Plasmid conjugative transfer protein PilI. The thin pilus of plasmid R64 belongs to the type IV family and is required for liquid matings. pilI is one of 14 genes that have been identified as being involved in biogenesis of the R64 thin pilus.	83
287579	pfam10624	TraS	Plasmid conjugative transfer entry exclusion protein TraS. Entry exclusion (Eex) is a process which prevents redundant transfer of DNA between donor cells. TraS is a protein involved in Eex. It blocks redundant conjugative DNA synthesis and transport between donor cells, and it is suggested that TraS interferes with a signalling pathway that is required to trigger DNA transfer. TraS on the recipient cell is known to form an interaction with TraG on the donor cell.	162
337810	pfam10625	UspB	Universal stress protein B (UspB). UspB in Escherichia coli is a 14kDa protein which is predicted to be an integral membrane protein. Overexpression of UspB results in cell death in stationary phase, and mutants of uspB are sensitive to ethanol exposure during stationary phase.	107
402317	pfam10626	TraO	Conjugative transposon protein TraO. This is a family of conjugative transposon proteins.	168
402318	pfam10627	CsgE	Curli assembly protein CsgE. Curli are a class of highly aggregated surface fibers that are part of a complex extracellular matrix. They promote biofilm formation in addition to other activities. CsgE is a non-structural protein involved in curli biogenesis. CsgE forms an outer membrane complex with the curli assembly proteins CsgG and CsgF.	105
402319	pfam10628	CotE	Outer spore coat protein E (CotE). CotE is a morphogenic protein that is required for the assembly of the outer coat of the endospore and spore resistance to lysozyme. CotE also regulates the expression of cotA, cotB, cotC and other genes encoding spore outer coat proteins. The timing of cotE expression has been shown in Bacillus subtilis to affect spore coat morphology but not lysozyme resistance.	177
402320	pfam10629	DUF2475	Protein of unknown function (DUF2475). This family of proteins has no known function.	67
402321	pfam10630	DUF2476	Protein of unknown function (DUF2476). This is a family of proteins of unknown function. The family is rich in proline residues.	258
313781	pfam10631	DUF2477	Protein of unknown function (DUF2477). This is a family of proteins with no known function. The family is rich in proline residues.	141
378460	pfam10632	He_PIG_assoc	He_PIG associated, NEW1 domain of bacterial glycohydrolase. The English-language version of the first reference can be found on pages 388-399 of the above. This domain has been named NEW1 but its actual function is not known. It is found on proteins which are bacterial galactosidases. The domain is associated with the He_PIG family, pfam05345, a putative Ig-containing domain.	29
402322	pfam10633	NPCBM_assoc	NPCBM-associated, NEW3 domain of alpha-galactosidase. The English-language version of the first reference can be found on pages 388-399 of the above. This domain has been named NEW3 but its actual function is not known. It is found on proteins which are bacterial galactosidases. The domain is associated with the NPCBM family, pfam08305, a novel putative carbohydrate binding module found at the N-terminus of glycosyl hydrolases.	78
402323	pfam10634	Iron_transport	Fe2+ transport protein. This is a bacterial family of periplasmic proteins that are thought to function in high-affinity Fe2+ transport.	150
402324	pfam10635	DisA-linker	DisA bacterial checkpoint controller linker region. The DisA protein is a bacterial checkpoint protein that dimerizes into an octameric complex. The protein consists of three distinct domains. The first, N-terminal region, from 1-145 is globular and is represented by family DisA_N, pfam02457; the next 146-289 residues is this domain that consists of an elongated bundle of three alpha helices (alpha-6, alpha-10, and alpha-11), one side of which carries an additional three helices (alpha7-9), thus forming a spine-like linker between domains 1 and 3. The C-terminal residues of domain 3 are family HHH, pfam00633, the specific DNA-binding domain. The octameric complex thus has structurally linked nucleotide-binding and DNA-binding HhH domains and the nucleotide-binding domains are bound to a cyclic di-adenosine phosphate such that DisA is a specific di-adenylate cyclase. The di-adenylate cyclase activity is strongly suppressed by binding to branched DNA, but not to duplex or single-stranded DNA, suggesting a role for DisA as a monitor of the presence of stalled replication forks or recombination intermediates via DNA structure-modulated c-di-AMP synthesis.	141
402325	pfam10636	hemP	Hemin uptake protein hemP. This is a bacterial family of proteins that are involved in the uptake of the iron source hemin.	37
402326	pfam10637	Ofd1_CTDD	Oxoglutarate and iron-dependent oxygenase degradation C-term. Ofd1 is a prolyl 4-hydroxylase-like 2-oxoglutarate-Fe(II) dioxygenase that accelerates the degradation of Sre1N in the presence of oxygen. The domain is conserved from yeasts to humans. Yeast Sre1 is the orthologue of mammalian sterol regulatory element binding protein (SREBP), and it responds to changes in oxygen-dependent sterol synthesis as an indirect measure of oxygen availability. However, unlike the prolyl 4-hydroxylases that regulate mammalian hypoxia-inducible factor, Ofd1 uses multiple domains to regulate Sre1N degradation by oxygen; the Ofd1 N-terminal dioxygenase domain is required for oxygen sensing and this Ofd1 C-terminal domain accelerates Sre1N degradation in yeasts.	255
402327	pfam10638	Sfi1_C	Spindle body associated protein C-terminus. This C-terminal domain of spindle-body-associated protein Sfi1 has an important role to play in the bridge-splitting during bi-polar spindle assembly, and this separation event possibly requires interaction with integral components of the nuclear envelope, such as the Mps2-Bbp1 complex. Centrally to this domain is a region carrying centrin-binding repeats with repeating units containing tryptophan, family Sfi1_central, pfam08457.	100
371174	pfam10639	TMEM234	Putative transmembrane family 234. TMEM234 is a family of putative inner membrane proteins. Many bacterial members are annotated as putative L-Ara4N-phosphoundecaprenol flippase subunit ArnE, and as inner membrane proteins. They may be transporters of the multi-drug-resistant superfamily.	113
287595	pfam10640	Pox_ATPase-GT	mRNA capping enzyme N-terminal, ATPase and guanylyltransferase. This domain is the N-terminus of the large subunit viral mRNA capping enzyme, and carries both the ATPase and the guanylyltransferase activities of the enzyme. The guanylyltransferase enzymatic region runs from residues 242 (leucine)-273(arginine), the core of the active site being the lysine residue at 260. The ATPase activity is at the very N-terminal part of the domain.	311
402328	pfam10642	Tom5	Mitochondrial import receptor subunit or translocase. This protein family is very short and is only found in yeasts. Tom5 is one of three very small translocases of the mitochondrial outer membrane. Tom5 links mitochondrial preprotein receptors to the general import pore. Although Tom5 has allegedly been identified in vertebrates this could not be confirmed.	47
402329	pfam10643	Cytochrome-c551	Photosystem P840 reaction-centre cytochrome c-551. A photosynthetic reaction-centre complex is found in certain green sulphur bacteria such as Chlorobium vibrioforme which are anaerobic photo-auto-trophic organisms. The primary electron donor is P840, a probable B-Chl a dimer, and the primary electron acceptor is a B-Chl monomer. Also on the donor side c-type cytochromes are known to function as electron donors to photo-oxidized P840. This family is thus the secondary endogenous donor of the photosynthetic reaction-centre complex and is a membrane-bound cytochrome containing a single haem group.	207
402330	pfam10644	Misat_Tub_SegII	Misato Segment II tubulin-like domain. The misato protein contains three distinct, conserved domains, segments I, II and III. Segments I and III are common to Tubulins pfam00091, but segment II aligns with myosin heavy chain sequences from D. melanogaster (PIR C35815), rabbit (SP P04460), and human (PIR S12458). Segment II of misato is a major contributor to its greater length compared with the various tubulins. The most significant sequence similarities to this 54-amino acid region are from a motif found in the heavy chains of myosins from different organisms. A comparison of segment II with the vertebrate myosin heavy chains reveals that it is homologous to a myosin peptide in the hinge region linking the S2 and LMM domains. Segment II also contains heptad repeats which are characteristic of the myosin tail alpha-helical coiled-coils. This myosin-like homology may be due only to the fact that both myosin and Misato carry coiled-coils, which appear similar but are not necessarily homologous (Wood V, personal communication).	115
402331	pfam10645	Carb_bind	Carbohydrate binding. This is a carbohydrate binding domain which has been shown in Schizosaccharomyces pombe to be required for septum localization.	49
402332	pfam10646	Germane	Sporulation and spore germination. The GerMN domain is a region of approximately 100 residues that is found, duplicated, in the Bacillus GerM protein and is implicated in both sporulation and spore germination. The domain is found in a number of different bacterial species both alone and in association with other domains such as Amidase_3 pfam01520, Gmad1 and Gmad2. It is predicted to have a novel alpha-beta fold.	114
402333	pfam10647	Gmad1	Lipoprotein LpqB beta-propeller domain. The Gmad1 domain is found associated with the GerMN family, pfam10646, in bacterial spore formation. It is predicted to have a beta-propeller fold and to have a passive binding role rather than a catalytic function owing to the low number of conserved hydrophilic residues.	255
402334	pfam10648	Gmad2	Immunoglobulin-like domain of bacterial spore germination. This domain is found linked to the GerMN domain pfam10646 in some bacterial proteins. It is predicted to contain an immunoglobulin-like all-beta fold.	85
402335	pfam10649	DUF2478	Protein of unknown function (DUF2478). This is a family of hypothetical bacterial proteins found in the vicinity of Molybdenum ABC transporter ATP-binding gene-products MobA MobB and MobC. However the function could not be confirmed. This family appears to belong to the P-loop superfamily by alignment to pfam03266. However, the characteristic P-loop sequence motif appears to have diverged beyond recognition in this family.	159
402336	pfam10650	zf-C3H1	Putative zinc-finger domain. This domain is conserved in fungi and might be a zinc-finger domain as it contains three conserved Cs and an H in the C-x8-C-x5-C-x3-H conformation typical of a zinc-finger.	22
371181	pfam10651	DUF2479	Domain of unknown function (DUF2479). This domain is found in phage from a number of different bacteria. It is purported to be a putative long tail fibre (Bacteriophage A118) protein, but this could not be confirmed.	143
402337	pfam10652	DUF2480	Protein of unknown function (DUF2480). All the members of this family are uncharacterized proteins, but the environment in which they are found on the bacterial genome suggests a function as a glucose-6-phosphate isomerase (EC 5.3.1.9). This could not, however, be confirmed.	165
287607	pfam10653	Phage-A118_gp45	Protein gp45 of Bacteriophage A118. This domain is found in bacteriophage and is thought to have a gp45 function within the phage tail-fibre system.	62
287608	pfam10654	DUF2481	Protein of unknown function (DUF2481). This is a hypothetical protein family homologous to Lmo2305 in Bacteriophage A118 systems.	126
402338	pfam10655	DUF2482	Hypothetical protein of unknown function (DUF2482). All the members of this very small, very short family are derived from bacteriophages, of the SA bacteriophages 11, Mu50B, system, and from the Staphylococcal_phi-Mu50B-like_prophages subsystem. All members are hypothetical proteins.	98
371183	pfam10656	DUF2483	Hypothetical protein of unknown function (DUF2483). This is a family of proteins found in bacteriophage particularly of the SA bacteriophages 11, Mu50B, family, homologous to phi-ETA orf16.	72
402339	pfam10657	RC-P840_PscD	Photosystem P840 reaction centre protein PscD. The photosynthetic reaction centers (RCs) of aerotolerant organisms contain a heterodimeric core, built up of two strongly homologous polypeptides each of which contributes five transmembrane peptide helices to hold a pseudo-symmetric double set of redox components. Two molecules of PscD are housed within a subunit. PscD may be involved in stabilizing the PscB component since it is found to co-precipitate with FMO (Fenna-Matthews-Olson BChl a-protein) and PscB. It may also be involved in the interaction with ferredoxin.	144
402340	pfam10658	DUF2484	Protein of unknown function (DUF2484). A role of this family in UDP-N-acetylenolpyruvoylglucosamine reductase, as MurB, could not be confirmed.	76
287612	pfam10659	Trypan_glycop_C	Trypanosome variant surface glycoprotein C-terminal domain. The trypanosome parasite expresses these proteins to evade the immune response.	104
313801	pfam10660	MitoNEET_N	Iron-containing outer mitochondrial membrane protein N-terminus. MitoNEET_N is the N-terminal region of the MitoNEET and Miner-type proteins that carry a zf-CDGSH, pfam09360, redox-active 2Fe-2S cluster. The whole protein regulates oxidative capacity. The domain is an anchor sequence that tethers the protein to the outer membrane.	64
287614	pfam10661	EssA	WXG100 protein secretion system (Wss), protein EssA. The WXG100 protein secretion system (Wss) is responsible for the secretion of WXG100 proteins (pfam06013) such as ESAT-6 and CFP-10 in Mycobacterium tuberculosis or EsxA and EsxB in Staphylococcus aureus. In S. aureus, the Wss seems to be encoded by a locus of eight CDS, called ess (eSAT-6 secretion system). This locus encodes, amongst several other proteins, EssA, a protein predicted to possess one transmembrane domain. Due to its predicted membrane location and its absolute requirement for WXG100 protein secretion, it has been speculated that EssA could form a secretion apparatus in conjunction with the polytopic membrane protein EsaA, YukC (pfam10140) and YukAB, which is a membrane-bound ATPase containing Ftsk/SpoIIIE domains (pfam01580) called EssC in S. aureus and Snm1/Snm2 in Mycobacterium tuberculosis. Proteins homologous to EssA, YukC, EsaA and YukD seem absent from mycobacteria.	148
402341	pfam10662	PduV-EutP	Ethanolamine utilisation - propanediol utilisation. Members of this family function in ethanolamine and propanediol degradation pathways. PduV may be involved in the association of the bacterial microcompartments (BMCs) to filaments.	137
402342	pfam10664	NdhM	Cyanobacterial and plastid NDH-1 subunit M. The proton-pumping NADH:ubiquinone oxidoreductase catalyzes the electron transfer from NADH to ubiquinone linked with proton translocation across the membrane. It is the largest, most complex and least understood of the respiratory chain enzymes and is referred to as Complex I. The subunit composition of the enzyme varies between groups of organisms. Complex I originating from mammalian mitochondria contains 45 different proteins, whereas in bacteria, the corresponding complex NDH-1 consists of 14 different polypeptides. Homologs of these 14 proteins are found among subunits of the mitochondrial complex I, and therefore bacterial NDH-1 might be considered a model proton-pumping NADH dehydrogenase with a minimal set of subunits. Escherichia coli NDH-1 readily disintegrates into 3 sub-complexes: a water-soluble NADH dehydrogenase fragment (NuoE, -F, and -G), the connecting fragment (NuoB, -C, -D, and -I), and the membrane fragment (NuoA, -H, -J, -K, -L, -M, -N). In cyanobacteria and their descendants, the chloroplasts of green plants, the subunit composition of NDH-1 remains obscure. The genes for eleven subunits NdhA-NdhK, homologous to the NuoA-NuoD and NuoH-NuoN of the E. coli complex, have been found in the genome of Synechocystis sp. PCC 6803 which has a family of 6 ndhD genes and a family of 3 ndhF genes. Two reported multisubunit complexes, NDH-1L and NDH-1M, represent distinct NDH-1 complexes in the thylakoid membrane of Synechocystis 6803 -cyanobacterium. NDH-1L was shown to be essential for photoheterotrophic cell growth, whereas expression of NDH-1M was a prerequisite for CO2 uptake and played an important role in growth of cells at low CO2. Here we report the subunit composition of these two complexes. Fifteen proteins were discovered in NDH-1L including NdhL, a new component of the membrane fragment, and Ssl1690 (designated as NdhO), a novel peripheral subunit. The cyanobacterial NDH-1 complex contains additional subunits, NdhM and NdhN, compared with the minimal set of the bacterial enzyme and these seem to be specific for thylakoid-located NDH-1 of photosynthetic organisms. The three subunits of NDH-1, NdhM, NdhN and NdhO are essential for effecting cyclic electron flow around photosystem I, by supplying extra-ATP for photosynthesis in both plastids and cyanobacteria.	107
402343	pfam10665	Minor_capsid_1	Minor capsid protein. This is a putative tail-knob or minor capsid protein from bacteriophages.	104
402344	pfam10666	Phage_TAC_8	Phage tail assembly chaperone protein Gp14 ()A118. This phage protein family is expressed from within a cluster of tail- and base plate-producing genes. It is a family of tail assembly chaperone proteins.	140
402345	pfam10667	DUF2486	Protein of unknown function (DUF2486). This family is made up of members from various Burkholderia spp. The function is unknown.	251
402346	pfam10668	Phage_terminase	Phage terminase small subunit. This family of small highly conserved proteins come from a subset of Firmicute species. Its putative function is as a phage terminase small subunit.	67
402347	pfam10669	Phage_Gp23	Protein gp23 (Bacteriophage A118). This is the highly conserved family of the major tail subunit protein.	120
402348	pfam10670	DUF4198	Domain of unknown function (DUF4198). This family was previously misannotated in Pfam as NikM.	209
402349	pfam10671	TcpQ	Toxin co-regulated pilus biosynthesis protein Q. The toxin-coregulated pilus (TCP) of Vibrio cholerae and the soluble TcpF protein that is secreted via the TCP biogenesis apparatus are essential for intestinal colonisation in the disease of cholera. TcpQ is part of an outer membrane complex of the TCP biogenesis apparatus, comprised of TcpC and TcpQ, and the TcpQ is required for proper localization of TcpC to the outer membrane. The domain is found in other Proteobacterial species apart from Vibrio.	80
287624	pfam10672	Methyltrans_SAM	S-adenosylmethionine-dependent methyltransferase. Members of this family are S-adenosylmethionine-dependent methyltransferases from gamma-proteobacterial species. The diversity in the roles of methylation is matched by the almost bewildering number of methyltransferase enzymes that catalyze the methylation reaction. Although several classes of methyltransferase enzymes are known, the great majority of methylation reactions are catalyzed by the S-adenosylmethionine-dependent methyltransferases.	286
402350	pfam10673	DUF2487	Protein of unknown function (DUF2487). This is a bacterial family of uncharacterized proteins.	142
402351	pfam10674	Ycf54	Protein of unknown function (DUF2488). This protein is conserved in the green lineage and located in the chloroplast.	91
402352	pfam10675	DUF2489	Protein of unknown function (DUF2489). This is a bacterial family of uncharacterized proteins.	130
371190	pfam10676	gerPA	Spore germination protein gerPA/gerPF. This is a bacterial family of proteins that are required for the formation of functionally normal spores. Proteins in this family may be involved in establishing normal coat structure and/or permeability which could control the access of germinants to their receptor.	69
402353	pfam10677	DUF2490	Protein of unknown function (DUF2490). This is a bacterial family of uncharacterized proteins. They appear to belong to the outer membrane beta barrel superfamily.	180
402354	pfam10678	DUF2492	Protein of unknown function (DUF2492). This is a bacterial family of uncharacterized proteins.	77
402355	pfam10679	DUF2491	Protein of unknown function (DUF2491). This is a bacterial family of uncharacterized proteins.	211
402356	pfam10680	RRN9	RNA polymerase I specific transcription initiation factor. Initiation of transcription of ribosomal DNA (rDNA) in yeast involves an interaction of upstream activation factor (UAF) with the upstream element of the promoter, to form a stable UAF-template complex. UAF, together with the TATA-binding transcription initiation factor protein TBP, then recruits an essential core factor to the promoter, to form a stable preinitiation complex. This Rrn9 domain, which seems to be constrained to fungi, is the two highly conserved regions of proteins which form one of the subunits of UAF and appears to be the region responsible for the interaction with TBP. The family includes the S.pombe Arc1 protein, which is found to be essential for the accumulation of condensin at kinetochores.	65
402357	pfam10681	Rot1	Chaperone for protein-folding within the ER, fungal. This conserved fungal family is an essential molecular chaperone in the endoplasmic reticulum. Molecular chaperones transiently interact with unfolded proteins to inhibit their self-aggregation and to support their folding and/or assembly. Rot1 is a general chaperone with some substrate specificity, its substrates being the structurally unrelated Kre5 Kre6 Big1 Atg22, which are type I, type II, and polytopic membrane proteins. The dependencies of each for Rot1 do not share similarities. However, their folding does require BiP, and one of these proteins was simultaneously associated with both Rot1 and BiP. In addition, Rot1 may cooperate with BiP/Kar2 in the folding of Kre6.	208
287634	pfam10682	UL40	Glycoprotein of human cytomegalovirus HHV-5. This is glycoprotein UL40 from human cytomegalovirus or herpesvirus 5. The signal sequence of the UL40 polypeptide contains an HLA-E ligand identical with HLA-Cw*0304. The first 37 residues of UL40, including this ligand, are predicted to encode a signal peptide. The virus thus prevents the lysis by NK (natural killer) cells of the cell it has invaded.	214
402358	pfam10683	DBD_Tnp_Hermes	Hermes transposase DNA-binding domain. This domain confers specific DNA-binding on Hermes transposase.	67
313818	pfam10684	BDM	Putative biofilm-dependent modulation protein. This is a family of tightly conserved proteins from Enterobacteriaceae which are annotated as being biofilm-dependent modulation protein homologs.	71
402359	pfam10685	KGG	Stress-induced bacterial acidophilic repeat motif. This repeat is found in proteins which are expressed under conditions of stress in bacteria. The repeat contains a highly conserved, characteristic sequence motif, KGG, that is also recognized by plants and lower eukaryotes and repeated in their LEA (late embryogenesis abundant) family of proteins, thereby rendering those proteins bacteriostatic. An example of such an LEA family is LEA_5, pfam00477. Further downstream from this motif is a Walker A, nucleotide binding, motif GXXXXGK(S,T), that in YciG of E coli is QSGGNKSGKS. YciG is expressed as part of a three-gene operon, yciGFE, and this operon is induced by stress and is regulated by RpoS, which controls the general stress-response in E coli. YciG was shown to be important for stationary-phase resistance to thermal stress and in particular to acid stress.	21
402360	pfam10686	DUF2493	Protein of unknown function (DUF2493). Members of this family are all Proteobacteria. The function is not known.	66
402361	pfam10688	Imp-YgjV	Bacterial inner membrane protein. This is a family of inner membrane proteins. Many of the members are YgjV protein.	155
402362	pfam10689	DUF2496	Protein of unknown function (DUF2496). This family consists of proteins from Gammaproteobacteria spp. Many members are annotated as being like the E coli protein YbaM.	43
151186	pfam10690	Myticin-prepro	Myticin pre-proprotein from the mussel. Myticin is a cysteine-rich peptide produced in three isoforms, A, B and C, by Mytilus galloprovincialis, the Mediterranean mussel. Some isoforms show antibacterial activity against gram-positive bacteria, while others are additionally active against the fungus Fusarium oxysporum and a gram-negative bacterium, Escherichia coli D31. Myticin-prepro is the precursor peptide. The mature molecule, named myticin, consists of 40 residues, with four intramolecular disulfide bridges and a cysteine array in the primary structure different from that of previously characterized cysteine-rich antimicrobial peptides. The first 20 amino acids are a putative signal peptide, and the antimicrobial peptide sequence is a 36-residue C-terminal extension. Such a structure suggests that myticins are synthesized as prepro-proteins that are then processed by various proteolytic events before storage in the haemocytes as the active peptide. Myticin precursors are expressed mainly in the haemocytes. The family Mytilin has been merged into this family.	98
402363	pfam10691	DUF2497	Protein of unknown function (DUF2497). Members of this family belong to the Alphaproteobacteria. The function of the family is not known.	70
402364	pfam10692	DUF2498	Protein of unknown function (DUF2498). Members of this family are Gammaproteobacteria. Many are annotated as like E coli protein YciN. The function is not known.	79
402365	pfam10693	DUF2499	Protein of unknown function (DUF2499). Members of this family are found in plants, lower eukaryotes, and bacteria and the chloroplast where it is annotated as Ycf49 or Ycf49-like. The function is not known though several members are annotated as putative membrane proteins.	87
402366	pfam10694	DUF2500	Protein of unknown function (DUF2500). The members of this family are largely confined to the Gammaproteobacteria. The function is not known.	102
402367	pfam10696	DUF2501	Protein of unknown function (DUF2501). Members of this family are all Proteobacteria. Several are annotated as being YjjA or YjjA-like, but this protein is uncharacterized.	77
371199	pfam10697	DUF2502	Protein of unknown function (DUF2502). Members of this family are all Gammaproteobacteria. The function is not known.	90
402368	pfam10698	DUF2505	Protein of unknown function (DUF2505). Members of this family are all Actinobacteria. The function is not known.	151
402369	pfam10699	HAP2-GCS1	Male gamete fusion factor. The gene encoding Arabidopsis HAP2 is allelic with GCS1 (Generative cell-specific protein 1). HAP2 is expressed only in the haploid sperm and is required for efficient guidance of the pollen tube to the ovules. In Arabidopsis the protein is a predicted membrane protein with an N-terminal secretion signal, a single transmembrane domain and a C-terminal histidine-rich domain. HAP2-GCS1 is found from plants to lower eukaryotes and is necessary for the fusion of the gametes in fertilisation. Studies in the green alga Chlamydomonas and the malaria organism Plasmodium showed that it is involved in a novel mechanism for gamete fusion where a first species-specific protein binds male and female gamete membranes together after which a second, broadly conserved protein, either directly or indirectly, causes fusion of the two membranes together. The broadly conserved protein is represented by this HAP2-GCS1 domain, conserved from plants to lower eukaryotes. In Plasmodium berghei the protein is expressed only in male gametocytes and gametes, having a male-specific function during the interaction with female gametes, and being indispensable for parasite fertilisation. The gene in plants and eukaryotes might well have originated from acquisition of plastids from red algae.	433
402370	pfam10702	DUF2507	Protein of unknown function (DUF2507). This family is conserved in Firmicutes. The function is not known.	123
402371	pfam10703	MoaF	MoaF N-terminal domain. MoaF protein is essential for the production of the monoamine-inducible 30kDa protein in Klebsiella. It is necessary for reconstituting organoautotrophic growth in Ralstonia eutropha. It is conserved in Proteobacteria and some lower eukaryotes. The operon regulating the Moa genes is responsible for molybdenum cofactor biosynthesis. This entry corresponds to the N-terminal domain.	108
402372	pfam10704	DUF2508	Protein of unknown function (DUF2508). This family is conserved in Firmicutes. Several members are annotated as being the protein YaaL. The function is not known.	71
313834	pfam10705	Ycf15	Chloroplast protein precursor Ycf15 putative. In some species of plants the ycf15 gene is probably not a protein-coding gene because the protein in these species has premature stop codons. Most of the members of the family are hypothetical or uncharacterized.	86
402373	pfam10706	Aminoglyc_resit	Aminoglycoside-2''-adenylyltransferase. This family is conserved in Bacteria. It confers resistance to kanamycin, gentamicin, and tobramycin. The protein is also produced by plasmids in various bacterial species and confers resistance to essentially all clinically available aminoglycosides except streptomycin, and it eliminates the synergism between aminoglycosides and cell-wall active agents.	174
402374	pfam10707	YrbL-PhoP_reg	PhoP regulatory network protein YrbL. This is a family of proteins that are activated by PhoP. PhoP protein controls the expression of a large number of genes that mediate adaptation to low Mg2+ environments and/or virulence in several bacterial species. YrbL is proposed to be acting in a loop activity with PhoP and PmrA analogous to the multicomponent loop in Salmonella where the PhoP-dependent PmrD protein activates the regulatory protein PmrA, and the activated PmrA then represses transcription from the PmrD promoter which harbours binding sites for both the PhoP and PmrA proteins. Expression of YrbL is induced in low Mg2+ in a PhoP-dependent fashion and repressed by Fe3+ in a PmrA-dependent manner.	185
402375	pfam10708	DUF2510	Protein of unknown function (DUF2510). This is a family of proteins conserved in Actinobacteria. Many members are annotated as putative membrane proteins but this could not be confirmed.	35
402376	pfam10709	DUF2511	Protein of unknown function (DUF2511). This family is conserved in bacteria. The function is not known.	87
371204	pfam10710	DUF2512	Protein of unknown function (DUF2512). Proteins in this family are predicted to be integral membrane proteins, and many of them are annotated as being YndM protein. They are all found in Firmicutes. The true function is not known.	136
402377	pfam10711	DUF2513	Hypothetical protein (DUF2513). This family is found in bacteria. The function is not known.	98
402378	pfam10712	NAD-GH	NAD-specific glutamate dehydrogenase. The members of this family are annotated as being NAD-specific glutamate dehydrogenase encoded in antisense gene pair with DnaK-J.	576
402379	pfam10713	DUF2509	Protein of unknown function (DUF2509). This family is conserved in Proteobacteria. The function is not known but many of the members are annotated as protein YgdB.	131
371207	pfam10714	LEA_6	Late embryogenesis abundant protein 18. This is a family of late embryogenesis-abundant proteins. There is high accumulation of this protein in dry seeds, and in the roots of full-grown plants in response to dehydration and ABA (abscisic acid application) treatments. This LEA protein disappears after germination. It accumulates in growing regions of well irrigated hypocotyls and meristems suggesting a role in seedling growth resumption on rehydration. As a group the LEA proteins are highly hydrophilic, contain a high percentage of glycine residues, lack Cys and Trp residues and do not coagulate upon exposure to high temperature, and for these reasons are considered to be members of a group of proteins called hydrophilins. Expression of the protein is negatively regulated during etiolating growth, particularly in roots, in contrast to its expression patterns during normal growth.	75
402380	pfam10715	REGB_T4	T4-phage Endoribonuclease RegB. The RegB endoribonuclease encoded by bacteriophage T4 is a unique sequence-specific nuclease that cleaves in the middle of GGAG or, in a few cases, GGAU tetranucleotides, preferentially those found in the Shine-Dalgarno regions of early phage mRNAs. Phage RB49 in addition to RegB utilizes Escherichia coli endoribonuclease E for the degradation of its transcripts for gene regB. The deduced primary structure of RegB proteins of 32 phages studied is almost identical to that of T4, while the sequences of RegB encoded by phages RB69, TuIa and RB49 show substantial divergence from their T4 counterpart. Rebuilding from the Structure 2hx6 structure, this family does not fall into the Lysozyme-like family, but rather is a new member of the RelE/YoeB structural and functional family of ribonucleases specialising in mRNA inactivation within the ribosome.	150
402381	pfam10716	NdhL	NADH dehydrogenase transmembrane subunit. The NdhL family is a component of the NDH-1L complex that is one of the proton-pumping NADH:ubiquinone oxidoreductases that catalyze the electron transfer from NADH to ubiquinone linked with proton translocation across the membrane. NDH-1L is essential for photoheterotrophic cell growth. NdhL appears to contain two transmembrane helices and it is necessary for the functioning of though not the correct assembly of the NDH-1 complex in Synechocystis 6803. The conservation between cyanobacteria and green plants suggests that chloroplast NDH-1 complexes contain related subunits.	76
287662	pfam10717	ODV-E18	Occlusion-derived virus envelope protein ODV-E18. This family of occlusion-derived viral envelope proteins are detected in viral-induced intranuclear microvesicles and are not detected in the plasma membrane, cytoplasmic membranes, or the nuclear envelope. The ODV-E18 protein is encoded by baculovirus late genes with transcription initiating from a TAAG motif. It exists as a dimer in the ODV envelope and contains a hydrophobic domain which is putatively acting as a target or retention signal for intranuclear microvesicles.	87
402382	pfam10718	Ycf34	Hypothetical chloroplast protein Ycf34. This family is of proteins annotated as hypothetical chloroplast protein YCF34. The function is not known.	78
402383	pfam10719	ComFB	Late competence development protein ComFB. This family is conserved in bacteria. Some members, with three conserved cysteines, are annotated as late competence development protein ComFB.	79
371209	pfam10720	DUF2515	Protein of unknown function (DUF2515). This family is conserved in Firmicutes. Several members are annotated as YppC. The function is not known.	303
287666	pfam10721	DUF2514	Protein of unknown function (DUF2514). This family is conserved in bacteria and some viruses. The function is not known.	161
402384	pfam10722	YbjN	Putative bacterial sensory transduction regulator. YbjN is a putative sensory transduction regulator protein found in Proteobacteria. As it is a multi-copy suppressor of the coenzyme A-associated temperature sensitivity in temperature-sensitive mutant strains of Escherichia coli the suggestion is that it both helps CoA-A1 and possibly works as a general stabilizer for some other unstable proteins. This family was expanded to subsume other related families: DUF1790, DUF1821 and DUF2596.	126
287668	pfam10723	RepB-RCR_reg	Replication regulatory protein RepB. This is a family of proteins which regulate replication of rolling circle replication (RCR) plasmids that have a double-strand replication origin (dso). Regulation of replication of RCR plasmids occurs mainly at initiation of leading strand synthesis at the dso, such that Rep protein concentration controls plasmid replication.	81
402385	pfam10724	DUF2516	Protein of unknown function (DUF2516). This family is conserved in Actinobacteria. The function is not known.	91
402386	pfam10725	DUF2517	Protein of unknown function (DUF2517). This family is conserved in Proteobacteria. Several members are annotated as being protein YbfA. The function is not known.	61
402387	pfam10726	DUF2518	Protein of unknown function (DUF2518). This family is conserved in Cyanobacteria. Several members are annotated as the protein Ycf51. The function is not known.	142
287672	pfam10727	Rossmann-like	Rossmann-like domain. This family of proteins contain a Rossmann-like domain.	127
402388	pfam10728	DUF2520	Domain of unknown function (DUF2520). This presumed domain is found C-terminal to a Rossmann-like domain suggesting that these proteins are oxidoreductases.	126
313850	pfam10729	CedA	Cell division activator CedA. CedA is made up of four antiparallel beta-strands and an alpha-helix. It activates cell division by inhibiting chromosome over-replication. This is mediated by binding to dsDNA via the beta-sheet.	75
371213	pfam10730	DUF2521	Protein of unknown function (DUF2521). Family of unknown function specific to Bacillus.	146
287676	pfam10731	Anophelin	Thrombin inhibitor from mosquito. Members of this family are all inhibitors of thrombin, the peptidase that is at the end of the blood coagulation cascade and which creates the clot by cleaving fibrinogen. The interaction between thrombin and fibrinogen involves two different areas of contact - via the thrombin active site and via a second substrate-binding site known as an exosite. The inhibitor acts by blocking the exosite, rather than by interacting with the active site. The inhibitors are from mosquitoes that feed on human blood and which, by inhibiting thrombin, prevent the blood from clotting and keep it flowing.	65
287677	pfam10732	DUF2524	Protein of unknown function (DUF2524). This family of proteins with unknown function appears to be restricted to Bacillaceae bacteria.	84
402389	pfam10733	DUF2525	Protein of unknown function (DUF2525). This family of proteins with unknown function appears to be restricted to Enterobacteriaceae. The family has a highly conserved sequence.	60
371214	pfam10734	DUF2523	Protein of unknown function (DUF2523). This is a family of phage related proteins whose function is uncharacterized.	80
402390	pfam10735	DUF2526	Protein of unknown function (DUF2526). This family of proteins with unknown function is restricted to Enterobacteriaceae. The family has a highly conserved sequence.	76
119256	pfam10736	DUF2527	Protein of unknown function (DUF2527). This family of proteins with unknown function appears to be restricted to a family of Enterobacterial proteins. It has a highly conserved sequence.	38
371215	pfam10737	GerPC	Spore germination protein GerPC. GerPC is required for the formation of functionally normal spores. The gerP locus encodes a number of proteins which are thought to be involved in the establishment of normal spore coat structure and/or permeability, which allows the access of germinants to their receptor.	173
402391	pfam10738	Lpp-LpqN	Probable lipoprotein LpqN. This family is conserved in Mycobacteriaceae and is likely to be a lipoprotein.	171
402392	pfam10739	DUF2550	Protein of unknown function (DUF2550). This family is conserved in Corynebacterineae. The function is not known though most members are annotated as either secreted, or membrane, proteins.	127
402393	pfam10740	DUF2529	Domain of unknown function (DUF2529). This domain is conserved in the Bacillales. The function is not known, but given this domain's relationship to the SIS domain it may carry out a sugar isomerase reaction. Several members are annotated as being YWJG, a protein expressed downstream of pyrG, a gene encoding for cytidine triphosphate synthetase.	167
402394	pfam10741	T2SSM_b	Type II secretion system (T2SS), protein M subtype b. The T2SMb family is conserved in Proteobacteria and Actinobacteria, and differs from the T2SM proteins in Vibrio spp. (pfam04612).	111
402395	pfam10742	DUF2555	Protein of unknown function (DUF2555). This family is conserved in Cyanobacteria. The function is not known.	55
371218	pfam10743	Phage_Cox	Regulatory phage protein cox. This family of phage Cox proteins is expressed by Enterobacteria phages. The Cox protein is a 79-residue basic protein with a predicted strong helix-turn-helix DNA-binding motif. It inhibits integrative recombination and it activates site-specific excision of the HP1 genome from the Haemophilus influenzae chromosome, Hp1. Cox appears to function as a tetramer. Cox binding sites consist of two direct repeats of the consensus motif 5'-GGTMAWWWWA, one Cox tetramer binding to each motif. Cox binding interferes with the interaction of HP1 integrase with one of its binding sites, IBS5. This competition is central to directional control. Both Cox binding sites are needed for full inhibition of integration and for activating excision, because it plays a positive role in assembling the nucleoprotein complexes that produce excisive recombination, by inducing the formation of a critical conformation in those complexes.	87
402396	pfam10744	Med1	Mediator of RNA polymerase II transcription subunit 1. Mediator complexes are basic necessities for linking transcriptional regulators to RNA polymerase II. This domain, Med1, is conserved from plants to fungi to humans and forms part of the Med9 submodule of the Srb/Med complex. It is one of three subunits essential for viability of the whole organism via its role in environmentally-directed cell-fate decisions. Med1 is part of the tail region of the Mediator complex.	377
402397	pfam10745	DUF2530	Protein of unknown function (DUF2530). This family of proteins with unknown function appears to be restricted to mycobacteria.	73
287690	pfam10746	Phage_holin_2_2	Phage holin T7 family, holin superfamily II. Holins are a diverse family of proteins that cause bacterial membrane lysis during late-protein synthesis.	55
402398	pfam10747	SirA	Sporulation inhibitor of replication protein SirA. This entry represents the Sporulation inhibitor of replication (sirA) family of proteins from Bacillus sp. Induction of sporulation in rapidly growing cells inhibits replication; this is thought to be through the action of SirA protein and independent of phosphorylated Spo0A; however SirA protein synthesis is induced by Spo0A.	140
402399	pfam10748	DUF2531	Protein of unknown function (DUF2531). This family of proteins with unknown function appears to be restricted to Enterobacteriaceae.	132
402400	pfam10749	DUF2534	Protein of unknown function (DUF2534). This family of proteins with unknown function appears to be restricted to Enterobacteriaceae.	80
371221	pfam10750	DUF2536	Protein of unknown function (DUF2536). This family of proteins with unknown function appears to be restricted to Bacillus spp. Structural modelling suggests this domain may bind nucleic acids.	68
313866	pfam10751	DUF2535	Protein of unknown function (DUF2535). This family of proteins with unknown function appears to be restricted to Bacillus spp.	83
313867	pfam10752	DUF2533	Protein of unknown function (DUF2533). This family of proteins with unknown function appears to be restricted to Bacillus spp.	83
402401	pfam10753	Toxin_GhoT_OrtT	Toxin GhoT_OrtT. GhoT is part of the GhoT-GhoS type V toxin-antitoxin (TA) system. OrtT is homologous to GhoT, but it is not part of a TA pair. In this case, it acts as an independent toxin to reduce growth during stress related to amino acid and DNA synthesis.	55
402402	pfam10754	DUF2569	Protein of unknown function (DUF2569). This family is conserved in bacteria. The function is not known, but several members are annotated as being YdgK or a homolog thereof.	142
402403	pfam10755	DUF2585	Protein of unknown function (DUF2585). This family is conserved in Proteobacteria. The function is not known.	164
371225	pfam10756	bPH_6	Bacterial PH domain. This domain has a bacterial type PH domain structure. This domain was previously known as DUF2581. This family is conserved in the Actinomycetales. Although several members are annotated as RbiX homologs, RbiX being a putative regulator of riboflavin biosynthesis, the function could not be confirmed.	73
371226	pfam10757	YbaJ	Biofilm formation regulator YbaJ. YbaJ regulates biofilm formation. It also has an important role in the regulation of motility in the biofilm. YbaJ functions in increasing conjugation, aggregation and decreasing the motility, resulting in an increase of biofilm.	114
402404	pfam10758	DUF2586	Protein of unknown function (DUF2586). This bacterial family of proteins has no known function.	363
402405	pfam10759	DUF2587	Protein of unknown function (DUF2587). This is a bacterial family of proteins with no known function.	161
402406	pfam10761	DUF2590	Protein of unknown function (DUF2590). This family of proteins has no known function.	98
402407	pfam10762	DUF2583	Protein of unknown function (DUF2583). Some members in this family of proteins are annotated as YchH however currently no function is known.	86
402408	pfam10763	DUF2584	Protein of unknown function (DUF2584). This bacterial family of proteins have no known function.	77
402409	pfam10764	Gin	Inhibitor of sigma-G Gin. Gin allows sigma-F to delay late forespore transcription by preventing sigma-G to take over before the cell has reached a critical stage of development. Gin is also known as CsfB.	44
402410	pfam10765	DUF2591	Protein of unknown function (DUF2591). This bacterial family of proteins has no known function.	107
371231	pfam10766	AcrZ	Multidrug efflux pump-associated protein AcrZ. AcrZ is associated with the AcrA-TolC multidrug efflux pump, it may enhance the ability of the pump to recognize and export certain substrates.	44
402411	pfam10767	DUF2593	Protein of unknown function (DUF2593). This family of proteins appear to be restricted to Enterobacteriaceae. Some members in the family are annotated as YbjO however currently there is no known function.	141
402412	pfam10768	FliX	Class II flagellar assembly regulator. The FliX protein is possibly a transient component of the flagellum that is required for the assembly process. FliX may contribute to the targeting or assembly of the P- and L-ring protein monomers at the cell pole. The family carries a potential N-terminal signal sequence and at least one transmembrane domain indicating that it might function either in or in association with the cell membrane.	135
371234	pfam10769	DUF2594	Protein of unknown function (DUF2594). This family of proteins with unknown function appear to be restricted to Enterobacteriaceae.	74
402413	pfam10771	DUF2582	Winged helix-turn-helix domain (DUF2582). This family is conserved in bacteria and archaea. The function is not known. The structure of two proteins in this family were solved using NMR and shown to adopt a winged helix-turn-helix fold. Structural analysis shows that these proteins form an unusual dimeric conformation. This dimer was shown to be similar to that found in the FadR and TubR wHTH domains. It was suggested that these proteins are not very likely to bind to DNA.	65
402414	pfam10772	DUF2597	Protein of unknown function (DUF2597). This family of proteins has no known function.	134
402415	pfam10774	DUF4226	Domain of unknown function (DUF4226). This family of mycobacterial proteins are uncharacterized.	115
402416	pfam10775	ATP_sub_h	ATP synthase complex subunit h. Subunit h is a component of the yeast mitochondrial F1-F0 ATP synthase. It is essential for the correct assembly and functioning of this enzyme. Subunit h occupies a central place in the peripheral stalk between the F1 sector and the membrane.	67
378492	pfam10776	DUF2600	Protein of unknown function (DUF2600). This is a bacterial family of proteins. Some members in the family are annotated as YtpB however currently no function is known.	328
402417	pfam10777	YlaC	Inner membrane protein YlaC. Members of this family include proteins annotated as inner membrane protein YlaC in E. coli and Salmonella. The function of this family is unknown.	154
402418	pfam10778	DehI	Halocarboxylic acid dehydrogenase DehI. Haloacid dehalogenases catalyze the removal of halides from organic haloacids. DehI can process both L- and D-substrates. A crucial aspartate residue is predicted to activate a water molecule for nucleophilic attack of the substrate chiral centre resulting in an inversion of the configuration of either L- or D-substrates in contrast to D-only enzymes.	148
402419	pfam10779	XhlA	Haemolysin XhlA. XhlA is a cell-surface associated haemolysin that lyses the two most prevalent types of insect immune cells (granulocytes and plasmatocytes) as well as rabbit and horse erythrocytes. This family has had DUF1267, pfam06895, merged into it.	67
402420	pfam10780	MRP_L53	39S ribosomal protein L53/MRP-L53. MRP-L53 is also known as Mrp144. It is part of the 39S ribosome.	52
402421	pfam10781	DSRB	Dextransucrase DSRB. DSRB is a novel dextransucrase which produces a dextran different from the typical dextran, as it contains (1-6) and (1-2) linkages, when this strain is grown in the presence of sucrose.	61
402422	pfam10782	zf-C2HCIx2C	Zinc-finger. This bacterial family of proteins is a zinc-finger domain of the C2HC type with an additional cysteine.	58
402423	pfam10783	DUF2599	Protein of unknown function (DUF2599). This family is conserved in Actinobacteria. The function is not known.	94
119304	pfam10784	Plasmid_stab_B	Plasmid stability protein. This family is conserved in the Enterobacteriales. It is a putative plasmid stability protein in that it is expressed from the operon involved in stability, but its actual function has not yet been characterized.	72
402424	pfam10785	NADH-u_ox-rdase	NADH-ubiquinone oxidoreductase complex I, 21 kDa subunit. This family is the N-terminal domain of NADH-ubiquinone oxidoreductase 21 kDa subunits from fungi, lower metazoa and plants.	84
402425	pfam10786	G6PD_bact	Glucose-6-phosphate 1-dehydrogenase (EC 1.1.1.49). This family is conserved in Firmicutes and Proteobacteria. Several members are annotated as being glucose-6-phosphate 1-dehydrogenase (EC:1.1.1.49) but this could not be confirmed.	213
402426	pfam10787	YfmQ	Uncharacterized protein from bacillus cereus group. This family is conserved in the Bacillus cereus group. Several members are called YfmQ but the function is not known.	141
402427	pfam10788	DUF2603	Protein of unknown function (DUF2603). This family is conserved in Epsilon-proteobacteria. The function is not known.	134
256166	pfam10789	Phage_RpbA	Phage RNA polymerase binding, RpbA. Upon infection the phage-encoded RpbA protein binds to the ADP-ribosylated core RNA polymerase and modulates function to preferentially bind T4 promoters. This is a non-essential protein to the phage life cycle.	108
402428	pfam10790	DUF2604	Protein of Unknown function (DUF2604). Family of bacterial proteins with undetermined function.	76
402429	pfam10791	F1F0-ATPsyn_F	Mitochondrial F1-F0 ATP synthase subunit F of fungi. The membrane bound F1-FO-type H+ ATP synthase of mitochondria catalyzes the terminal step in oxidative respiration converting the generation of the electrochemical gradient into ATP for cellular biosynthesis. The general structure and the core subunits of the enzyme are highly conserved in both prokaryotic and eukaryotic organisms.	87
402430	pfam10792	DUF2605	Protein of unknown function (DUF2605). This family is conserved in Cyanobacteria. The function is not known.	96
371249	pfam10793	Gloverin	Gloverin-like protein. This family of proteins are Gloverin-like. Gloverin is a 13.8kDa inducible antibacterial insect protein which inhibits the synthesis of vital outer membrane proteins leading to a permeable outer membrane. Gloverin contains a large number of glycine residues.	161
287732	pfam10794	DUF2606	Protein of unknown function (DUF2606). Family of bacterial proteins with unknown function. These proteins have been classified as membrane proteins.	134
402431	pfam10795	DUF2607	Protein of unknown function (DUF2607). This family is conserved in Gammaproteobacteria. The function is not known.	94
402432	pfam10796	Anti-adapt_IraP	Sigma-S stabilisation anti-adaptor protein. This family is conserved in Enterobacteriaceae. It is one of a series of proteins, expressed by these bacteria in response to stress, that help to regulate Sigma-S, the stationary phase sigma factor of Escherichia coli and Salmonella. IraP is essential for Sigma-S stabilisation in some but not all starvation conditions.	86
402433	pfam10797	YhfT	Protein of unknown function. This family is conserved in Firmicutes and Proteobacteria. The function is not known but several members are annotated as being homologs of E coli YhfT, a protein thought to be involved in fatty acid oxidation.	422
402434	pfam10798	YmgB	Biofilm development protein YmgB/AriR. YmgB is part of the three gene cluster ymgABC which has a role in biofilm development and stability. YmgB represses biofilm formation in rich medium containing glucose, decreases cellular motility and also protects the cell from acid which indicates that YmgB has an important function in acid-resistance. YmgB binds as a dimer to genes which are important for biofilm formation via a ligand. Due to its important function in acid resistance it is also known as AriR (regulator of acid resistance influenced by indole).	59
313902	pfam10799	YliH	Biofilm formation protein (YliH/bssR). YliH is induced in biofilms and is involved in repression of motility in the biofilms. YliH is also known as bssR (regulator of biofilm through signal secreton).	126
402435	pfam10800	DUF2528	Protein of unknown function (DUF2528). This family of proteins has no known function. Some of the sequences are annotated as ea10 however the function of this protein is unknown.	103
402436	pfam10801	DUF2537	Protein of unknown function (DUF2537). This bacterial family of proteins has no known function.	75
402437	pfam10802	DUF2540	Protein of unknown function (DUF2540). This family of proteins with unknown function appears to be restricted to Methanococcus.	75
313906	pfam10803	GerPB	Spore germination GerPB. Members of this family are required for formation of functionally normal spores. They may be involved in the establishment of spore coat structure or permeability.	52
402438	pfam10804	DUF2538	Protein of unknown function (DUF2538). This family of proteins has no known function.	155
402439	pfam10805	DUF2730	Protein of unknown function (DUF2730). This family of proteins with unknown function appears to be restricted to Gammaproteobacteria.	101
402440	pfam10806	SAM35	SAM35, subunit of SAM complex. SAM35 is a family of fungal proteins found in the peripheral mitochondrial outer membrane. It is essential for cell viability. It forms a subunit of the SAM (sorting and assembly machinery) complex and is crucial for the assembly of the precursors of Tom40 and porin, the outer membrane beta-barrel proteins involved in mitochondrial biogenesis. SAM35 is required in order for the Sam50 subunit of the SAM complex to bind outer membrane substrate proteins.	125
287745	pfam10807	DUF2541	Protein of unknown function (DUF2541). This family of proteins with unknown function appears to be restricted to Enterobacteriaceae. All proteins are annotated as YaaI precursor however currently no function is known.	130
151258	pfam10808	DUF2542	Protein of unknown function (DUF2542). This family of proteins with unknown function appears to be restricted to Enterobacteriaceae. The family has a highly conserved sequence.	79
402441	pfam10809	DUF2732	Protein of unknown function (DUF2732). This family of proteins has no known function.	74
371254	pfam10810	DUF2545	Protein of unknown function (DUF2545). This family of proteins with unknown function is restricted to Enterobacteriaceae. The sequence is highly conserved.	80
287748	pfam10811	DUF2532	Protein of unknown function (DUF2532). This bacterial family of proteins has no known function.	158
402442	pfam10812	DUF2561	Protein of unknown function (DUF2561). This family of proteins with unknown function appears to be restricted to Mycobacterium spp.	204
287750	pfam10813	DUF2733	Protein of unknown function (DUF2733). This viral family of proteins has no known function.	32
402443	pfam10814	CwsA	Cell wall synthesis protein CwsA. Cell wall synthesis protein CwsA is required for cell division, cell wall synthesis and cell shape maintenance.	133
313911	pfam10815	ComZ	ComZ. ComZ is part of a two gene operon. It affects competence regulation by negatively affecting the transcription of the ComG operon. ComZ contains a leucine zipper motif.	55
402444	pfam10816	DUF2760	Domain of unknown function (DUF2760). This is a bacterial family of uncharacterized proteins.	123
287754	pfam10817	DUF2563	Protein of unknown function (DUF2563). This family of proteins with unknown function appears to be restricted to Mycobacterium.	104
402445	pfam10818	DUF2547	Protein of unknown function (DUF2547). This bacterial family of proteins has no known function.	96
287756	pfam10819	DUF2564	Protein of unknown function (DUF2564). This family of proteins with unknown function appears to be restricted to Bacillus spp.	78
402446	pfam10820	DUF2543	Protein of unknown function (DUF2543). This family of proteins with unknown function appear to be restricted to Enterobacteriaceae. The family has a highly conserved sequence.	81
402447	pfam10821	DUF2567	Protein of unknown function (DUF2567). This is a bacterial family of proteins with unknown function.	166
402448	pfam10823	DUF2568	Protein of unknown function (DUF2568). One member in this family is annotated as yrdB which is part of a four gene operon however currently no function is known.	92
371259	pfam10824	T7SS_ESX_EspC	Excreted virulence factor EspC, type VII ESX diderm. T7SS_ESX-EspC is a family of exported virulence proteins from largely Acinetobacteria and a few Firmicutes, Gram-positive bacteria. It is exported in conjunction with EspA as an interacting pair.	100
402449	pfam10825	DUF2752	Protein of unknown function (DUF2752). This family is conserved in bacteria. Many members are annotated as being putative membrane proteins.	47
402450	pfam10826	DUF2551	Protein of unknown function (DUF2551). This Archaeal family of proteins has no known function.	83
371260	pfam10827	DUF2552	Protein of unknown function (DUF2552). This bacterial family of proteins has no known function.	79
402451	pfam10828	DUF2570	Protein of unknown function (DUF2570). This is a family of proteins with unknown function.	108
313921	pfam10829	DUF2554	Protein of unknown function (DUF2554). This family of proteins with unknown function appears to be restricted to Enterobacteriaceae.	76
371262	pfam10830	DUF2553	Protein of unknown function (DUF2553). This family of bacterial proteins has no known function.	75
402452	pfam10831	DUF2556	Protein of unknown function (DUF2556). This family of proteins with unknown function appears to be restricted to Enterobacteriaceae.	53
402453	pfam10832	DUF2559	Protein of unknown function (DUF2559). This family of proteins appear to be restricted to Enterobacteriaceae. The sequences are annotated as yhfG however currently no function is known.	52
402454	pfam10833	DUF2572	Protein of unknown function (DUF2572). This bacterial family of proteins has no known function.	220
287770	pfam10834	DUF2560	Protein of unknown function (DUF2560). This family of proteins has no known function.	72
313925	pfam10835	DUF2573	Protein of unknown function (DUF2573). Some members in this bacterial family of proteins are annotated as YusU however no function is currently known. This family of proteins appears to be restricted to Bacillus spp.	75
287772	pfam10836	DUF2574	Protein of unknown function (DUF2574). This family of proteins appears to be restricted to Enterobacteriaceae. Members of the family are annotated as yehE however currently no function is known.	93
313926	pfam10838	DUF2677	Protein of unknown function (DUF2677). Members in this family of proteins are annotated as UL121 however currently no function is known.	166
287774	pfam10839	DUF2647	Protein of unknown function (DUF2647). This eukaryotic family of proteins are annotated as ycf68 but have no known function.	70
337858	pfam10840	DUF2645	Protein of unknown function (DUF2645). This family of proteins appear to be restricted to Enterobacteriaceae. Some members in the family are annotated as YjeO however no function for this protein is currently known.	98
402455	pfam10841	DUF2644	Protein of unknown function (DUF2644). This family of proteins with unknown function appear to be restricted to Pasteurellaceae.	59
402456	pfam10842	DUF2642	Protein of unknown function (DUF2642). This family of proteins with unknown function appear to be restricted to Bacillus spp.	61
371266	pfam10843	RGI1	Respiratory growth induced protein 1. This family of fungal proteins includes RGI1, standing for respiratory growth induced 1. RGI1 is involved in aerobic energetic metabolism.	194
402457	pfam10844	DUF2577	Protein of unknown function (DUF2577). This family of proteins has no known function.	106
313931	pfam10845	DUF2576	Protein of unknown function (DUF2576). The function of this viral family of proteins is unknown.	48
371267	pfam10846	DUF2722	Protein of unknown function (DUF2722). This eukaryotic family of proteins has no known function.	369
402458	pfam10847	DUF2656	Protein of unknown function (DUF2656). This bacterial family of proteins has no known function.	141
287783	pfam10849	DUF2654	Protein of unknown function (DUF2654). Some members in this family of proteins are annotated as a-gt.4 however currently no function is known.	70
313933	pfam10850	DUF2653	Protein of unknown function (DUF2653). This family of proteins with unknown function appears to be restricted to Bacillus spp.	88
402459	pfam10851	DUF2652	Protein of unknown function (DUF2652). This family of proteins has no known function.	118
371270	pfam10852	DUF2651	Protein of unknown function (DUF2651). This family of proteins with unknown function appears to be restricted to Bacillus spp.	73
402460	pfam10853	DUF2650	Protein of unknown function (DUF2650). This family of proteins with unknown function appear to be restricted to Caenorhabditis elegans.	35
287788	pfam10854	DUF2649	Protein of unknown function (DUF2649). Members in this family of proteins are annotated as Plectrovirus orf 10 transmembrane proteins however currently no function is known.	67
402461	pfam10856	DUF2678	Protein of unknown function (DUF2678). This family of proteins has no known function.	118
287791	pfam10857	DUF2701	Protein of unknown function (DUF2701). This viral family of proteins has no known function.	63
402462	pfam10858	DUF2659	Protein of unknown function (DUF2659). This bacterial family of proteins has no known function.	224
313938	pfam10859	DUF2660	Protein of unknown function (DUF2660). This is a family of proteins with unknown function.	91
287794	pfam10860	DUF2661	Protein of unknown function (DUF2661). This viral family of proteins have no known function.	113
402463	pfam10861	DUF2784	Protein of Unknown function (DUF2784). This is a family of uncharacterized protein. The function is not known however it is conserved in Bacteria.	105
402464	pfam10862	FcoT	FcoT-like thioesterase domain. Proteins in this family have a HotDog fold. This family was formerly known as domain of unknown function 2662 (DUF2662). The structure of Rv0098 from M. tuberculosis suggested a thioesterase function. Assays showed that this protein was a thioesterase with a preference for long chain fatty acyl groups. The maximal Kcat was observed for palmitoyl-CoA although longer and shorter molecules were also cleaved. In solution this protein forms a homo-hexameric complex.	150
402465	pfam10863	NOP19	Nucleolar protein 19. Nucleolar protein 19 plays an essential role in 40S ribosomal subunit biogenesis.	140
371275	pfam10864	DUF2663	Protein of unknown function (DUF2663). Some members in this family of proteins are annotated as YpbF however currently no function is known.	130
402466	pfam10865	DUF2703	Domain of unknown function (DUF2703). This family of protein has no known function, but it may be distantly related to the thioredoxin fold. It contains the CXXC motif that is characteristic of thioredoxins.	120
313944	pfam10866	DUF2704	Protein of unknown function (DUF2704). This viral family of proteins has no known function.	167
287800	pfam10867	DUF2664	Protein of unknown function (DUF2664). This family of proteins is a viral family, annotated as UL96. Currently no function is known.	89
402467	pfam10868	Defensin_like	Cysteine-rich antifungal protein 2, defensin-like. This is a family of plant antifungal proteins. It has insecticidal and antifungal activity against certain plant pathogens.	50
402468	pfam10869	DUF2666	Protein of unknown function (DUF2666). This Archaeal family of proteins has no known function.	135
287802	pfam10870	DUF2729	Protein of unknown function (DUF2729). This viral family of proteins has no known function.	57
402469	pfam10871	DUF2748	Protein of unknown function (DUF2748). This is a bacterial family of proteins with unknown function.	439
287804	pfam10872	DUF2740	Protein of unknown function (DUF2740). This family of proteins with unknown function has a highly conserved sequence.	48
313948	pfam10873	CYYR1	Cysteine and tyrosine-rich protein 1. Members in this family of proteins are annotated as Cysteine and tyrosine-rich protein 1, however currently no function is known.	149
371279	pfam10874	DUF2746	Protein of unknown function (DUF2746). This family of proteins has no known function.	101
287807	pfam10875	DUF2670	Protein of unknown function (DUF2670). This bacterial family of proteins has no known function.	145
337861	pfam10876	Phage_TAC_9	Phage tail assembly chaperone protein, TAC. This is a family of putative phage tail assembly chaperone proteins largely from Haemophilus and Xylella species.	133
402470	pfam10877	DUF2671	Protein of unknown function (DUF2671). This family of proteins with unknown function appears to be restricted to Rickettsia spp.	89
313952	pfam10878	DUF2672	Protein of unknown function (DUF2672). This family of proteins with unknown function appears to be restricted to Rickettsiae.	67
402471	pfam10879	DUF2674	Protein of unknown function (DUF2674). This family of proteins with unknown function appears to be conserved to Rickettsia spp.	63
313953	pfam10880	DUF2673	Protein of unknown function (DUF2673). This family of proteins with unknown function appears to be restricted to Rickettsiae spp.	82
371282	pfam10881	DUF2726	Protein of unknown function (DUF2726). This bacterial family of proteins has no known function.	127
378504	pfam10882	bPH_5	Bacterial PH domain. This family of proteins with unknown function appears to be related to bacterial PH domains. This family was formerly known as DUF2679.	100
402472	pfam10883	DUF2681	Protein of unknown function (DUF2681). This family of proteins is found in bacteria. Proteins in this family are typically between 81 and 117 amino acids in length.	87
287815	pfam10884	DUF2683	Protein of unknown function (DUF2683). This family of proteins with unknown function appears to be restricted to Methanosarcinaceae.	78
287817	pfam10886	DUF2685	Protein of unknown function (DUF2685). Members in this family of proteins are annotated as uvdY.-2 which is an open reading frame within uvsY. However currently there is no known function.	55
287818	pfam10887	DUF2686	Protein of unknown function (DUF2686). Some members in this family of proteins are annotated as yjfZ however currently no function is known.	285
371285	pfam10888	DUF2742	Protein of unknown function (DUF2742). Members in this family of phage proteins are the product of the gene phiRv1, however no function is known.	97
371286	pfam10890	Cyt_b-c1_8	Cytochrome b-c1 complex subunit 8. This entry represents subunit 8 of the Cytochrome b-c1 complex.	72
151339	pfam10891	DUF2719	Protein of unknown function (DUF2719). This family of proteins with unknown function appears to be restricted to Nucleopolyhedrovirus.	81
371287	pfam10892	DUF2688	Protein of unknown function (DUF2688). Members in this family of proteins are annotated as KleB however currently no function is known.	56
371288	pfam10893	DUF2724	Protein of unknown function (DUF2724). This is a family of proteins with unknown function.	63
313958	pfam10894	DUF2689	Protein of unknown function (DUF2689). Members in this family of proteins are annotated as TrbD however currently no function is known.	57
371289	pfam10895	DUF2715	Domain of unknown function (DUF2715). This family of proteins with unknown function appears to be largely found in spirochaete bacteria. It is related to membrane beta barrel proteins.	153
402473	pfam10896	DUF2714	Protein of unknown function (DUF2714). This family of proteins with unknown function appears to be restricted to Mycoplasmataceae.	143
313960	pfam10897	DUF2713	Protein of unknown function (DUF2713). This family of proteins with unknown function appears to be restricted to Enterobacteriaceae.	235
371291	pfam10898	DUF2716	Protein of unknown function (DUF2716). This bacterial family of proteins has no known function.	140
402474	pfam10899	AbiGi	Putative abortive phage resistance protein AbiGi, antitoxin. This is a bacterial family of proteins with unknown function. AbiGi is a family of putative type IV toxin-antitoxin system antitoxins. The AbiG abortive phage resistance system affects lactococcal bacteriophages phiP335 and phiQ30 but not the other P335 phage species. AbiGii toxin appears to confer resistance to phages by a mechanism of abortive infection that acts by interfering with phage RNA synthesis. The cognate toxin is found in pfam16873.	178
402475	pfam10901	DUF2690	Protein of unknown function (DUF2690). This bacterial family of proteins has no known function.	86
402476	pfam10902	WYL_2	WYL_2, Sm-like SH3 beta-barrel fold. WYL_2 is a family of Sm-like SH3 beta-barrel fold containing domains. WYL is named for three conserved amino acids found in a subset of domains of this superfamily. These residues are not strongly conserved throughout the family. Rather, the conservation pattern includes four basic residues and a position often occupied by a cysteine, which are predicted to line a ligand-binding groove typical of the Sm-like SH3 beta-barrels. It is predicted to be a ligand-sensing domain that could bind negatively charged ligands, such as nucleotides or nucleic acid fragments, to regulate CRISPR-Cas and other defense systems such as the abortive infection AbiG system.	74
371294	pfam10903	DUF2691	Protein of unknown function (DUF2691). This bacterial family of proteins has no known function.	152
287827	pfam10904	DUF2694	Protein of unknown function (DUF2694). This family of proteins with unknown function appears to be restricted to Mycobacterium spp.	97
402477	pfam10905	DUF2695	Protein of unknown function (DUF2695). This bacterial family of proteins has no known function.	53
371295	pfam10906	Mrx7	MIOREX complex component 7. This entry includes budding yeast MIOREX complex component 7 (Mrx7), which associates with mitochondrial ribosome. Its function is not clear.	66
337866	pfam10907	DUF2749	Protein of unknown function (DUF2749). This bacterial family of proteins appears to come from the Trb operon; however, currently no function is known.	64
402478	pfam10908	DUF2778	Protein of unknown function (DUF2778). This is a bacterial family of uncharacterized proteins.	119
151356	pfam10909	DUF2682	Protein of unknown function (DUF2682). This viral family of proteins has no known function.	77
287830	pfam10910	DUF2744	Protein of unknown function (DUF2744). This is a viral family of proteins with unknown function.	119
151358	pfam10911	DUF2717	Protein of unknown function (DUF2717). Members in this family of proteins are annotated as gene 6.5 protein however currently there is no known function.	77
371296	pfam10912	DUF2700	Protein of unknown function (DUF2700). This family of proteins with unknown function appears to be restricted to Caenorhabditis elegans.	136
313971	pfam10913	DUF2706	Protein of unknown function (DUF2706). This family of proteins with unknown function appears to be restricted to Rickettsia spp.	59
371297	pfam10914	DUF2781	Protein of unknown function (DUF2781). This is a eukaryotic family of uncharacterized proteins. Some of the proteins in this family are annotated as membrane proteins.	145
402479	pfam10915	DUF2709	Protein of unknown function (DUF2709). This bacterial family of proteins has no known function.	237
402480	pfam10916	DUF2712	Protein of unknown function (DUF2712). This family of proteins with unknown function appears to be restricted to Bacillales.	115
371300	pfam10917	Fungus-induced	Fungus-induced protein. This entry represents fungus-induced proteins which may have role in hypoxia response.	49
313975	pfam10918	DUF2718	Protein of unknown function (DUF2718). This viral family of proteins has no known function.	129
402481	pfam10920	DUF2705	Protein of unknown function (DUF2705). This bacterial family of proteins has no known function.	226
287837	pfam10921	DUF2710	Protein of unknown function (DUF2710). This family of proteins with unknown function appears to be restricted to Mycobacteriaceae.	104
313976	pfam10922	DUF2745	Protein of unknown function (DUF2745). This is a viral family of proteins with unknown function.	85
402482	pfam10923	DUF2791	P-loop Domain of unknown function (DUF2791). This is a family of proteins found in archaea and bacteria. This domain contains a P-loop motif suggesting it binds to a nucleotide such as ATP.	412
402483	pfam10924	DUF2711	Protein of unknown function (DUF2711). Some members in this family of proteins are annotated as ywbB however currently there is no known function.	216
371301	pfam10925	DUF2680	Protein of unknown function (DUF2680). Members in this family of proteins are annotated as yckD however currently no function is known.	57
402484	pfam10926	DUF2800	Protein of unknown function (DUF2800). This is a family of uncharacterized proteins found in bacteria and viruses. Some members of this family are annotated as being Phi APSE P51-like proteins.	366
287842	pfam10927	DUF2738	Protein of unknown function (DUF2738). This is a viral family of proteins with unknown function.	236
402485	pfam10928	DUF2810	Protein of unknown function (DUF2810). This is a bacterial family of uncharacterized proteins.	53
402486	pfam10929	DUF2811	Protein of unknown function (DUF2811). This is a bacterial family of uncharacterized proteins.	57
402487	pfam10930	DUF2737	Protein of unknown function (DUF2737). This family of proteins has no known function.	53
371303	pfam10931	DUF2735	Protein of unknown function (DUF2735). Some members in this family of proteins are annotated as glutamine synthetase translation inhibitor however this function can not be confirmed.	52
402488	pfam10932	DUF2783	Protein of unknown function (DUF2783). This is a bacterial family of uncharacterized protein.	59
402489	pfam10933	DUF2827	Protein of unknown function (DUF2827). This is a family of uncharacterized proteins found in Burkholderia.	362
402490	pfam10934	DUF2634	Protein of unknown function (DUF2634). Some members in this family of proteins are annotated as phage related, xkdS however currently there is no known function.	104
402491	pfam10935	DUF2637	Protein of unknown function (DUF2637). This family of proteins has no known function.	161
402492	pfam10936	DUF2617	Protein of unknown function DUF2617. This bacterial family of proteins has no known function.	156
402493	pfam10937	S36_mt	Ribosomal protein S36, mitochondrial. This entry is represented by a mitochondrial ribosomal protein of the small subunit, which has similarity to human mitochondrial ribosomal protein MRP-S36.	116
402494	pfam10938	YfdX	YfdX protein. YfdX is a protein found in Proteobacteria of unknown function. The protein coding for this gene is regulated by EvgA in E. coli.	148
402495	pfam10939	DUF2631	Protein of unknown function (DUF2631). This is a bacterial family of proteins with unknown function.	63
287855	pfam10940	DUF2618	Protein of unknown function (DUF2618). This bacterial family of proteins has no known function. The sequences within the family are highly conserved.	40
402496	pfam10941	DUF2620	Protein of unknown function DUF2620. This is a bacterial family of proteins with unknown function.	116
402497	pfam10942	DUF2619	Protein of unknown function (DUF2619). This bacterial family of proteins has no known function.	69
287858	pfam10943	DUF2632	Protein of unknown function (DUF2632). This is a family of membrane proteins with unknown function.	233
402498	pfam10944	DUF2630	Protein of unknown function (DUF2630). This bacterial family of proteins has no known function.	80
402499	pfam10945	CBP_BcsR	Cellulose biosynthesis protein BcsR. CBP_BcsR is a family of bacterial cellulose biosynthesis proteins. Cellulose is necessary for biofilm formation in bacteria. Roemling U. and Galperin M.Y. "Bacterial cellulose biosynthesis. Diversity of operons and subunits" (manuscript in preparation).	42
402500	pfam10946	DUF2625	Protein of unknown function DUF2625. Some members in this family of proteins are annotated as ybfG however currently no function is known.	207
402501	pfam10947	DUF2628	Protein of unknown function (DUF2628). Some members in this family of proteins are annotated as yigF however currently no function is known.	78
378516	pfam10948	DUF2635	Protein of unknown function (DUF2635). This is a family of phage proteins with unknown function.	46
371311	pfam10949	DUF2777	Protein of unknown function (DUF2777). This family of proteins with unknown function appears to be restricted to Bacillus cereus.	181
402502	pfam10950	Organ_specific	Organ specific protein. This eukaryotic family includes a number of plant organ-specific proteins. While their function is unknown, their predicted amino acid sequence suggests that these proteins could be exported and glycosylated.	117
402503	pfam10951	DUF2776	Protein of unknown function (DUF2776). This bacterial family of proteins has no known function.	348
287867	pfam10952	DUF2753	Protein of unknown function (DUF2753). This bacterial family of proteins has no known function.	140
402504	pfam10953	DUF2754	Protein of unknown function (DUF2754). This family of proteins with unknown function appears to be restricted to Enterobacteriaceae.	70
402505	pfam10954	DUF2755	Protein of unknown function (DUF2755). Some members in this family of proteins are annotated as YaiY however no function is known. The family appears to be restricted to Enterobacteriaceae.	100
402506	pfam10955	DUF2757	Protein of unknown function (DUF2757). Members in this family of proteins are annotated as YabK however currently no function is known.	73
402507	pfam10956	DUF2756	Protein of unknown function (DUF2756). Some members in this family of proteins are annotated yhhA however currently no function is known. The family appears to be restricted to Enterobacteriaceae.	104
337871	pfam10957	Spore_Cse60	Sporulation protein Cse60. Cse60 is expressed during sporulation in Bacillus subtilis. Transcription commences around 2h after the start of sporulation and has an absolute requirement for the transcription factor sigmaE. Cse60 is an acidic product of only 60 residues, whose function is not known.	60
378517	pfam10958	DUF2759	Protein of unknown function (DUF2759). This family of proteins with unknown function appears to be restricted to Bacillaceae.	50
371315	pfam10959	DUF2761	Protein of unknown function (DUF2761). Members in this family of proteins are annotated as KleF however no function is known.	94
402508	pfam10960	Holin_BhlA	BhlA holin family. The Phage_holin_BhlA family is a family of holin-like proteins from both bacteriophages and bacterial chromosomes. In bacteriophage, holins are small membrane proteins that accumulate and oligomerize to form non-specific lesions in the cytoplasmic membrane allowing the release of the second protein, endolysins, to access the peptidoglycan. Most holins share common structural features: two or three transmembrane domains separated by a beta-turn, a short hydrophilic N-terminus, a highly charged C-terminus and a dual translational start motif. The BhlA holin of Bacillus is found to be toxic to the host cell where the site of action is on the cell membrane and causes bacterial death by cell membrane disruption.	66
402509	pfam10961	SelK_SelG	Selenoprotein SelK_SelG. This entry includes a group of eukaryotic selenoproteins, such as SelK and SelG. SelK seems to play an important role in protecting cells from endoplasmic reticulum stress induced apoptosis. SelG may be involved in regulating the redox state of the cell.	81
402510	pfam10962	DUF2764	Protein of unknown function (DUF2764). This bacterial family of proteins has no known function.	271
402511	pfam10963	Phage_TAC_10	Phage tail assembly chaperone. This is a family of phage tail assembly chaperone proteins.	82
371318	pfam10964	DUF2766	Protein of unknown function (DUF2766). This family of proteins with unknown function appears to be restricted to Enterobacteriaceae.	79
402512	pfam10965	DUF2767	Protein of unknown function (DUF2767). This family of proteins with unknown function appears to be restricted to Enterobacteriaceae.	67
402513	pfam10966	DUF2768	Protein of unknown function (DUF2768). This family of proteins with unknown function appears to be restricted to Bacillus spp.	58
402514	pfam10967	DUF2769	Protein of unknown function (DUF2769). This family of proteins has no known function.	57
371319	pfam10968	DUF2770	Protein of unknown function (DUF2770). Members in this family of proteins are annotated as yceO however currently no function is known.	36
402515	pfam10969	DUF2771	Protein of unknown function (DUF2771). This bacterial family of proteins has no known function.	128
287884	pfam10970	GerPE	Spore germination protein GerPE. GerPE is required for the formation of functionally normal spores. It could be involved in the establishment of a normal spore coat structure and (or) permeability, which allows the access of germinants to their receptor.	123
287885	pfam10971	DUF2773	Protein of unknown function (DUF2773). This family of proteins with unknown function appears to be restricted to Enterobacteriaceae.	81
402516	pfam10972	CsiV	Peptidoglycan-binding protein, CsiV. CsiV, a small periplasmic protein (cell-shape integrity in Vibrio), is essential for growth of Vibrio cholerae in the presence of DAA, non-canonical amino-acids, the typical components of peptidoglycan side-chains in Vibrio cholerae. CsiV interacts with LpoA, the lipoprotein activator of penicillin-binding-protein1A that is necessary for mediating the assembly of peptidoglycan. CsiV acts through LpoA to promote peptidoglycan biogenesis in V. cholerae and other vibrio species as well as in the other genera where this protein is found.	210
402517	pfam10973	DUF2799	Protein of unknown function (DUF2799). Some members in this family of proteins are annotated as yfiL which has no known function.	86
402518	pfam10974	DUF2804	Protein of unknown function (DUF2804). This is a family of proteins with unknown function.	321
402519	pfam10975	DUF2802	Protein of unknown function (DUF2802). This bacterial family of proteins has no known function.	63
402520	pfam10976	DUF2790	Protein of unknown function (DUF2790). This family of proteins with unknown function appears to be restricted to Pseudomonadaceae.	77
402521	pfam10977	DUF2797	Protein of unknown function (DUF2797). This family of proteins has no known function.	228
402522	pfam10978	DUF2785	Protein of unknown function (DUF2785). Some members in this family are annotated as hypothetical membrane spanning proteins however this cannot be confirmed. The family has no known function.	174
402523	pfam10979	DUF2786	Protein of unknown function (DUF2786). This family of proteins has no known function.	40
402524	pfam10980	DUF2787	Protein of unknown function (DUF2787). This bacterial family of proteins has no known function.	128
402525	pfam10981	DUF2788	Protein of unknown function (DUF2788). This bacterial family of proteins has no known function.	51
402526	pfam10982	DUF2789	Protein of unknown function (DUF2789). This bacterial family of proteins has no known function.	75
402527	pfam10983	DUF2793	Protein of unknown function (DUF2793). This is a bacterial family of proteins with unknown function.	87
402528	pfam10984	DUF2794	Protein of unknown function (DUF2794). This is a bacterial family of proteins with unknown function.	85
402529	pfam10985	DUF2805	Protein of unknown function (DUF2805). This is a bacterial family of proteins with unknown function.	71
402530	pfam10986	DUF2796	Protein of unknown function (DUF2796). This bacterial family of proteins has no known function.	163
371326	pfam10987	DUF2806	Protein of unknown function (DUF2806). This bacterial family of proteins has no known function.	221
402531	pfam10988	DUF2807	Putative auto-transporter adhesin, head GIN domain. This bacterial family of proteins shows structural similarity to other pectin lyase families. Although structures from this family align with acetyl-transferases, there is no conservation of catalytic residues found. It is likely that the function is one of cell-adhesion. In Structure 3jx8, it is interesting to note that the sequence contains several well defined sequence repeats, centred around GSG motifs defining the tight beta turn between the two sheets of the super-helix; there are 8 such repeats in the C-terminal half of the protein, which could be grouped into 4 repeats of two. It seems likely that this family belongs to the superfamily of trimeric auto-transporter adhesins (TAAs), which are important virulence factors in Gram-negative pathogens. In the case of Parabacteroides distasonis, which is a component of the normal distal human gut microbiota, TAA-like complexes probably modulate adherence to the host (information derived from TOPSAN).	181
402532	pfam10989	DUF2808	Protein of unknown function (DUF2808). This family of proteins with unknown function appears to be restricted to Cyanobacteria.	144
402533	pfam10990	DUF2809	Protein of unknown function (DUF2809). Some members in this family of proteins are annotated as yjgA however currently no function for the protein is known.	86
402534	pfam10991	DUF2815	Protein of unknown function (DUF2815). This is a phage related family of proteins with unknown function.	167
402535	pfam10992	DUF2816	Protein of unknown function (DUF2816). This eukaryotic family of proteins has no known function.	83
402536	pfam10993	DUF2818	Protein of unknown function (DUF2818). This bacterial family of proteins has no known function.	93
378533	pfam10994	DUF2817	Protein of unknown function (DUF2817). This family of proteins has no known function.	340
402537	pfam10995	CBP_GIL	GGDEF I-site like or GIL domain. The GIL domain, for GGDEF I-site like domain, is a c-di-GMP binding domain on the BcsE proteins of enterobacteria. It is not essential for cellulose synthesis but is critical for maximal cellulose production. Cellulose production in enterobacteria is controlled by a two-tiered c-di-GMP-dependent system involving BcsE and the PilZ domain containing glycosyltransferase BcsA.	513
402538	pfam10996	Beta-Casp	Beta-Casp domain. The beta-CASP domain is found C terminal to the beta-lactamase domain in pre-mRNA 3'-end-processing endonuclease. The active site of this enzyme is located at the interface of these two domains.	107
402539	pfam10997	Amj	Alternate to MurJ. This bacterial family of proteins has no known function. However, family members include lipid II flippase Amj, which is required for bacterial cell wall synthesis. It transports lipid-linked peptidoglycan precursors from the inner to the outer surface of the cytoplasmic membrane.	253
402540	pfam10998	DUF2838	Protein of unknown function (DUF2838). This bacterial family of proteins has no known function.	108
402541	pfam10999	DUF2839	Protein of unknown function (DUF2839). This bacterial family of proteins of unknown function appears to be restricted to Cyanobacteria.	67
402542	pfam11000	DUF2840	Protein of unknown function (DUF2840). This bacterial family of proteins has no known function.	148
402543	pfam11001	DUF2841	Protein of unknown function (DUF2841). This family of proteins with unknown function are all present in yeast.	122
402544	pfam11002	RDM	RFPL defining motif (RDM). The RDM domain is found on RFPL (Ret finger protein like) proteins. In humans, RFPL transcripts can be detected at the onset of neurogenesis in differentiating human embryonic stem cells, and in the developing human neocortex. The RDM domain is thought to have emerged from a neofunctionalisation event. It is found N terminal to the SPRY domain (pfam00622).	42
402545	pfam11003	DUF2842	Protein of unknown function (DUF2842). This bacterial family of proteins has no known function.	61
402546	pfam11004	Kdo_hydroxy	3-deoxy-D-manno-oct-2-ulosonic acid (Kdo) hydroxylase. This is a family of 3-deoxy-D-manno-oct-2-ulosonic acid 3-hydroxylases, which catalyze the conversion of 3-deoxy-D-manno-oct-2-ulosonic acid (Kdo) to D-glycero-D-talo-oct-2-ulosonic acid (Ko). It contains a potential iron-binding motif, HXDX(n)H (n>40). Hydroxylation activity is iron-dependent.	273
402547	pfam11005	DUF2844	Protein of unknown function (DUF2844). This bacterial family of proteins has no known function.	129
402548	pfam11006	DUF2845	Protein of unknown function (DUF2845). This bacterial family of proteins has no known function.	78
402549	pfam11007	CotJA	Spore coat associated protein JA (CotJA). CotJA is part of the CotJ operon which contains CotJA and CotJC. The operon encodes spore coat proteins. Interaction of CotJA with CotJC is required for the assembly of both CotJA and CotJC into the spore coat.	35
402550	pfam11008	DUF2846	Protein of unknown function (DUF2846). Some members in this family of proteins with unknown function are annotated as lipoproteins however this cannot be confirmed.	89
402551	pfam11009	DUF2847	Protein of unknown function (DUF2847). Some members in this bacterial family of proteins with unknown function are annotated as YtxJ, a putative general stress protein. This cannot be confirmed.	103
402552	pfam11010	DUF2848	Protein of unknown function (DUF2848). This bacterial family of proteins has no known function.	194
402553	pfam11011	DUF2849	Protein of unknown function (DUF2849). This bacterial family of proteins has no known function.	86
314054	pfam11012	DUF2850	Protein of unknown function (DUF2850). This family of proteins with unknown function appears to be restricted to Vibrionaceae.	78
402554	pfam11013	DUF2851	Protein of unknown function (DUF2851). This bacterial family of proteins has no known function.	369
402555	pfam11014	DUF2852	Protein of unknown function (DUF2852). This bacterial family of proteins has no known function.	116
402556	pfam11015	DUF2853	Protein of unknown function (DUF2853). This bacterial family of proteins has no known function.	101
402557	pfam11016	DUF2854	Protein of unknown function (DUF2854). This family of proteins has no known function.	145
402558	pfam11017	DUF2855	Protein of unknown function (DUF2855). This family of proteins has no known function.	334
371343	pfam11018	Cuticle_3	Pupal cuticle protein C1. Insect cuticles are composite structures whose mechanical properties are optimized for biological function. The major components are the chitin filament system and the cuticular proteins, and the cuticle's properties are determined largely by the interactions between these two sets of molecules. The proteins can be ordered by species.	186
378542	pfam11019	DUF2608	Protein of unknown function (DUF2608). This family is conserved in Bacteria. The function is not known.	240
371344	pfam11020	DUF2610	Domain of unknown function (DUF2610). This family is conserved in Proteobacteria. One member is annotated as being elongation factor P but this could not be confirmed. This domain is related to the Ribbon-helix-helix superfamily so may be a DNA-binding protein.	78
402559	pfam11021	DUF2613	Protein of unknown function (DUF2613). This is a family of putative small secreted proteins expressed by Actinobacteria. The function is not known.	55
402560	pfam11022	DUF2611	Protein of unknown function (DUF2611). This family is conserved in the Dikarya of Fungi. The function is not known.	64
402561	pfam11023	DUF2614	Zinc-ribbon containing domain. This is a family of proteins conserved in the Bacillaceae family. Some members are annotated as being protein YgzB. The function is not known.	111
337892	pfam11024	DGF-1_4	Dispersed gene family protein 1 of Trypanosoma cruzi region 4. This protein is likely to be highly expressed, and is expressed from the sub-telomeric region. However, the function is not known. Other domains on this protein include DGF-1_N, DGF-1_2, and DGF-1_5. This domain is just downstream from the C-terminus, but not the C-terminus of proteins, also annotated as being DGF-1, that constitute family DGF-1_C.	70
314067	pfam11025	GP40	Glycoprotein GP40 of Cryptosporidium. This family is highly conserved in Cryptosporidium spp. Many members are annotated as being a 60 kDa glycoprotein.	164
402562	pfam11026	DUF2721	Protein of unknown function (DUF2721). This family is conserved in bacteria. The function is not known.	127
402563	pfam11027	DUF2615	Protein of unknown function (DUF2615). This small, approximately 100-residue family is conserved from worms to humans. It is cysteine-rich with a characteristic FDxCEC sequence motif. The function is not known.	102
402564	pfam11028	DUF2723	Protein of unknown function (DUF2723). This family is conserved in bacteria. The function is not known.	168
402565	pfam11029	DAZAP2	DAZ associated protein 2 (DAZAP2). DAZ associated protein 2 has a highly conserved sequence throughout evolution including a conserved polyproline region and several SH2/SH3 binding sites. It occurs as a single copy gene with a four-exon organisation and is located on chromosome 12. It encodes a ubiquitously expressed protein and binds to DAZ and DAZL1 through DAZ repeats.	129
287944	pfam11030	Nucleocapsid-N	Nucleocapsid protein N. This is the N protein of the nucleocapsid. The nucleocapsid functions to protect the RNA against nuclease degradation and to promote its reverse transcription. The NC protein promotes viral RNA dimerization and encapsidation and initiates reverse transcription by activating the annealing of the primer tRNA to the initiation site.	167
287945	pfam11031	Phage_holin_T	Bacteriophage T holin. Bacteriophage effects host lysis with T holin along with an endolysin. T disrupts the membrane allowing sequential events which lead to the attack of the peptidoglycan. T has an unusual periplasmic domain which transduces environmental information for the real-time control of lysis timing.	233
402566	pfam11032	ApoM	ApoM domain. ApoM is a 25 kDa plasma protein associated with high-density lipoproteins (HDLs). ApoM is important in the formation of pre-beta-HDL and also in increasing cholesterol efflux from macrophage foam cells. Lipoproteins consist of lipids solubilized by apolipoproteins. ApoM lacks an external amphipathic motif and is uniquely secreted to plasma without cleavage of its terminal signal peptide.	188
402567	pfam11033	ComJ	Competence protein J (ComJ). ComJ is a competence specific protein.	122
402568	pfam11034	Grg1	Glucose-repressible protein Grg1. This fungal protein increases during glucose deprivation. Its function is unknown.	65
402569	pfam11035	SnAPC_2_like	Small nuclear RNA activating complex subunit 2, SNAP190 Myb. This family of proteins is snRNA-activating protein complex subunit 2 (SnAPC subunit 2). SnAPC complex allows the transcription of human small nuclear RNA genes to occur by recognition of the proximal sequence element, the TATA box. The family functions both to specifically recognize the proximal sequence element present in the core promoters of human snRNA genes and to stimulate TBP recognition of the neighboring TATA box present in human U6 snRNA promoters.	331
151483	pfam11036	YqgB	Virulence promoting factor. YqgB encodes adaptive factors that act in synergy with vqfZ, enabling the bacteria to cope with the physical environment in vivo, facilitating colonisation of the host.	43
371352	pfam11037	Musclin	Insulin-resistance promoting peptide in skeletal muscle. Musclin is a muscle derived secretory peptide which induces insulin resistance in vitro. It encodes a 130 amino acid sequence including a NH(2) terminal 30 amino acid signal sequence. Musclin expression level is tightly regulated by nutritional changes.	132
287951	pfam11038	DGF-1_5	Dispersed gene family protein 1 of Trypanosoma cruzi region 5. This protein is likely to be highly expressed, and is expressed from the sub-telomeric region. However, the function is not known. Other domains on this protein include DGF-1_N, DGF-1_2, and DGF-1_4. This domain is just downstream from the C-terminus, but not the C-terminus of proteins, also annotated as being DGF-1, that constitute family DGF-1_C.	278
371353	pfam11039	DUF2824	Protein of unknown function (DUF2824). This family of proteins has no known function. Some members in the family are annotated as the P22 head assembly protein gp14 however this cannot be confirmed.	151
337895	pfam11040	DGF-1_C	Dispersed gene family protein 1 of Trypanosoma cruzi C-terminus. This protein is likely to be highly expressed, and is expressed from the sub-telomeric region. However, the function is not known. This is the very C-terminal part of the protein.	87
402570	pfam11041	DUF2612	Protein of unknown function (DUF2612). This is a phage protein family expressed from a range of Proteobacteria species. The function is not known.	181
402571	pfam11042	DUF2750	Protein of unknown function (DUF2750). This family is conserved in Proteobacteria. The function is not known.	102
287956	pfam11043	DUF2856	Protein of unknown function (DUF2856). Some members in this viral family of proteins with unknown function are annotated as Abc2 however this cannot be confirmed.	97
371355	pfam11044	TMEMspv1-c74-12	Plectrovirus spv1-c74 ORF 12 transmembrane protein. This is a family of proteins expressed by Plectroviruses. The plectroviruses are single-stranded DNA viruses belonging to the Inoviridae. Except that it is a putative transmembrane protein the function is not known.	49
402572	pfam11045	YbjM	Putative inner membrane protein of Enterobacteriaceae. This family is conserved in the Enterobacteriaceae. It is a putative inner membrane protein, named YbjM, but the function is not known.	117
402573	pfam11046	HycA_repressor	Transcriptional repressor of hyc and hyp operons. This family is conserved in Proteobacteria. It is likely to be the transcriptional repressor molecule for the hyc and hyp operons, which express, amongst others, the protein HycA. This protein may be harnessed for the reduction of technetium oxide, an unwelcome product of radio-nucleotide bioaccumulation. HycA produces formate hydrogenlyase, one of the key proteins necessary for metal compound reduction.	140
287960	pfam11047	SopD	Salmonella outer protein D. SopD is a type III virulence effector protein whose structure consists of 38% alpha-helix and 26% beta-strand.	319
287961	pfam11049	KSHV_K1	Glycoprotein K1 of Kaposi's sarcoma-associated herpes virus. This is a highly glycosylated cytoplasmic and membrane protein similar to the immunoglobulin receptor family that is expressed as an inducible early-lytic-cycle gene product in primary effusion lymphoma cell-lines. This domain would appear to be the cytoplasmic region of the protein.	71
287962	pfam11050	Viral_env_E26	Virus envelope protein E26. E26 is a multifunctional protein. One form of E26 associates with viral DNA or DNA binding proteins, while a second form associates with intracellular membranes.	225
402574	pfam11051	Mannosyl_trans3	Mannosyltransferase putative. This family is conserved in fungi. Several members are annotated as being alpha-1,3-mannosyltransferase but this could not be confirmed.	273
337898	pfam11052	Tr-sialidase_C	Trans-sialidase of Trypanosoma hydrophobic C-terminal. This is a highly conserved sequence motif that is the very C-terminus of a number of more diverse proteins from Trypanosoma cruzi. All members of the family are annotated putatively as being trans-sialidase but this appears to be a diverse group.	23
287965	pfam11053	DNA_Packaging	Terminase DNA packaging enzyme. Phage T4 terminase functions in packaging concatemeric DNA. The T4 terminase is composed of a large subunit, gp17, and a small subunit, gp16. The role of gp16 is not well characterized; however, it is known that it binds to double-stranded DNA but not single-stranded DNA.	157
402575	pfam11054	Surface_antigen	Sporozoite TA4 surface antigen. This family of proteins is a eukaryotic family of surface antigens. One of the better characterized members of the family is the sporulated TA4 antigen. The TA4 gene encodes a single polypeptide of 25 kDa which contains a 17 kDa and an 8 kDa polypeptide.	209
402576	pfam11055	Gsf2	Glucose signalling factor 2. Gsf2 is localized to the ER and functions to promote the secretion of certain hexose transporters.	371
402577	pfam11056	UvsY	Recombination, repair and ssDNA binding protein UvsY. UvsY protein enhances the rate of single-stranded-DNA-dependent ATP hydrolysis by UvsX protein. The enhancement of ATP hydrolysis by UvsY protein is shown to result from the ability of UvsY protein to increase the affinity of UvsX protein for single-stranded DNA.	128
314086	pfam11057	Cortexin	Cortexin of kidney. In the middle of cortexin protein there is a single membrane-spanning domain which indicates that this protein may be a membrane protein involved in intracellular or extracellular signalling of the kidney or brain, since it is expressed specifically in the kidneys and brain only. The protein is highly conserved among species. Cortexin is also thought to be important to neurons of both the developing and adult cerebral cortex.	73
287970	pfam11058	Ral	Antirestriction protein Ral. Ral alleviates restriction and enhances modification by the E. coli restriction and modification system.	66
371359	pfam11059	DUF2860	Protein of unknown function (DUF2860). This bacterial family of proteins has no known function.	297
402578	pfam11060	DUF2861	Protein of unknown function (DUF2861). This bacterial family of proteins has no known function.	267
402579	pfam11061	DUF2862	Protein of unknown function (DUF2862). This family of proteins has no known function.	60
402580	pfam11062	DUF2863	Protein of unknown function (DUF2863). This bacterial family of proteins has no known function.	398
402581	pfam11064	DUF2865	Protein of unknown function (DUF2865). This bacterial family of proteins has no known function.	110
402582	pfam11065	DUF2866	Protein of unknown function (DUF2866). This bacterial family of proteins has no known function.	64
402583	pfam11066	DUF2867	Protein of unknown function (DUF2867). This bacterial family of proteins has no known function.	144
402584	pfam11067	DUF2868	Protein of unknown function (DUF2868). Some members in this family of proteins with unknown function are annotated as putative membrane proteins. However, this cannot be confirmed.	309
402585	pfam11068	YlqD	YlqD protein. The structure of a representative of this family has been solved (Structure 4dci) and found to form a tetrameric structure of prefoldin-like architecture with the beta-barrel core and helical coiled coil tentacles. This suggests that this family may act as molecular chaperones.	131
402586	pfam11069	DUF2870	Protein of unknown function (DUF2870). This is a eukaryotic family of proteins with unknown function.	95
402587	pfam11070	DUF2871	Protein of unknown function (DUF2871). This family of proteins has no known function.	133
402588	pfam11071	Nuc_deoxyri_tr3	Nucleoside 2-deoxyribosyltransferase YtoQ. 	140
402589	pfam11072	DUF2859	Protein of unknown function (DUF2859). This is a bacterial family of uncharacterized proteins.	145
402590	pfam11073	NSs	Rift valley fever virus non structural protein (NSs) like. This family contains several Phlebovirus non structural proteins which act as a major determinant of virulence by antagonising interferon beta gene expression.	242
402591	pfam11074	DUF2779	Domain of unknown function (DUF2779). This domain is conserved in bacteria. The function is not known.	126
402592	pfam11075	DUF2780	Protein of unknown function VcgC/VcgE (DUF2780). This is a bacterial family of uncharacterized proteins.	175
402593	pfam11076	YbhQ	Putative inner membrane protein YbhQ. This family is conserved in Proteobacteria. The function is not known but most members are annotated as being inner membrane protein YbhQ.	132
371366	pfam11077	DUF2616	Protein of unknown function (DUF2616). This cysteine-rich family is expressed by the double-stranded Nucleopolyhedrovirus, a member of the Baculoviridae family of dsDNA viruses. The function is not known.	172
337907	pfam11078	Optomotor-blind	Optomotor-blind protein N-terminal region. This family is conserved in Drosophila spp. Optomotor-blind is one of the essential toolkit proteins for coordinating development in diverse animal taxa, and in Drosophila it plays a key role in establishing the abdominal pigmentation pattern, in development of the central nervous system and leg and wing imaginal disc-formation of Drosophila melanogaster. This is the N-terminal region of the protein and does not include the T-box-containing transcription factor that plays a part in DNA-binding.	79
402594	pfam11079	YqhG	Bacterial protein YqhG of unknown function. This family of putative proteins is conserved in the Bacillaceae family of the Firmicutes. The function is not known.	258
402595	pfam11080	GhoS	Endoribonuclease GhoS. GhoS is part of the GhoT-GhoS type V toxin-antitoxin (TA) system. GhoT is inhibited by antitoxin GhoS, which specifically cleaves its mRNA.	87
314108	pfam11081	DUF2890	Protein of unknown function (DUF2890). This family is conserved in dsDNA adenoviruses of vertebrates. The function is not known.	168
287992	pfam11082	DUF2880	Protein of unknown function (DUF2880). This bacterial family of proteins has no known function.	79
314109	pfam11083	Streptin-Immun	Lantibiotic streptin immunity protein. Streptococcal species produce a lantibiotic, streptin, in a similar manner to the production of nisin and subtilin by other lactic acid bacteria, in order to compete against competing bacteria within the environment. The immunity protein protects the bacterium from destruction by its own lantibiotic. In general, there is little homology between the immunity proteins of different genera of bacteria.	93
402596	pfam11084	DUF2621	Protein of unknown function (DUF2621). This family is conserved in the Bacillaceae family. Several members are named as YneK. The function is not known.	139
314111	pfam11085	YqhR	Conserved membrane protein YqhR. This family is conserved in the Bacillaceae family of the Firmicutes. The function is not known.	165
402597	pfam11086	DUF2878	Protein of unknown function (DUF2878). This bacterial family of proteins has no known function. Some members annotate the proteins as the permease component of a Mn2+/Zn2+ transport system however this cannot be confirmed.	150
151532	pfam11087	PRD1_DD	PRD1 phage membrane DNA delivery. This small family of phage proteins are bound in the viral membrane and assist, along with P11 and P18 in the delivery of DNA.	54
287997	pfam11088	RL11D	Glycoprotein encoding membrane proteins RL5A and RL6. RL5A and RL6 are part of the RL11 family which are predicted to encode membrane glycoproteins. Two adjacent open reading frames potentially encode a domain that is the hallmark of proteins encoded by the RL11 family.	99
402598	pfam11089	SyrA	Exopolysaccharide production repressor. SyrA is a small protein located in the cytoplasmic membrane that lacks an apparent DNA binding domain. SyrA mediates the transcriptional up-regulation of exo genes involved in the biosynthesis of the symbiotic exopolysaccharide succinoglycan. It does this through a mechanism which requires a two component system.	38
151535	pfam11090	DUF2833	Protein of unknown function (DUF2833). This family of proteins with unknown function is found in the bacteriophage T7. Some of the members of this family are annotated as gene 13 protein.	86
371369	pfam11091	T4_tail_cap	Tail-tube assembly protein. This tail tube protein is also referred to as Gp48. It is required for the assembly and length regulation of the tail tube of bacteriophage T4.	348
402599	pfam11092	Alveol-reg_P311	Neuronal protein 3.1 (p311). P311 has several PEST-like motifs and is found in neuron and muscle cells. P311 could have some function in myo-fibroblast transformation and prevention of fibrosis. It has also been identified as a potential regulator of alveolar generation.	66
402600	pfam11093	Mitochondr_Som1	Mitochondrial export protein Som1. Som1 is a component of the mitochondrial protein export system. The various Som1 proteins exhibit a highly conserved region and a pattern of cysteine residues. Stabilisation of Som1 occurs through an interaction between Som1 and Imp1, a peptidase required for proteolytic processing of certain proteins during their transport across the mitochondrial membrane. This suggests that Som1 represents a third subunit of the Imp1 peptidase complex.	81
288002	pfam11094	UL11	Membrane-associated tegument protein. The UL11 gene product of herpes simplex virus is a membrane-associated tegument protein that is incorporated into the HSV virion and functions in viral envelopment. UL11 is acylated which is crucial for lipid raft association.	39
402601	pfam11095	Gemin7	Gem-associated protein 7 (Gemin7). Gemin7 is a novel component of the survival of motor neuron complex which functions in the assembly of spliceosomal small nuclear ribonucleoproteins. Gemin7 interacts with several Sm proteins of spliceosomal small nuclear ribonucleoproteins, especially SmE.	76
288004	pfam11097	DUF2883	Protein of unknown function (DUF2883). This family of proteins has no known function but appears to be restricted to phage.	75
371373	pfam11098	Chlorosome_CsmC	Chlorosome envelope protein C. Chlorosomes are light-harvesting antennae found in green bacteria. CsmC is one of the proteins that exists in the chlorosome envelope. CsmC has been shown to exist as a homomultimer with CsmD in the chlorosome envelope. CsmC is thought to be important in chlorosome elongation and shape.	139
402602	pfam11099	M11L	Apoptosis regulator M11L like. Apoptosis regulators function to modulate the apoptotic cascades and thereby favour productive viral replication. M11L inhibits mitochondrial-dependent apoptosis by mimicking and competing with host proteins for the binding and blocking of Bak and Bax, two executioner proteins.	141
314118	pfam11100	TrbE	Conjugal transfer protein TrbE. TrbE is essential for conjugation and phage adsorption. It contains four common motifs and one conserved domain.	66
402603	pfam11101	DUF2884	Protein of unknown function (DUF2884). Some members in this bacterial family of proteins are annotated as YggN which currently has no known function.	228
402604	pfam11102	YjbF	Group 4 capsule polysaccharide lipoprotein gfcB, YjbF. This family includes lipoprotein GfcB (YmcC), involved in group 4 capsule polysaccharide formation. YjbF is a family of Gram-negative bacterial outer-membrane lipoproteins, predicted to be a beta-barrel and possibly a porin that is one of four gene-products expressed from an operon, yjbEFGH, which is regulated by the Rcs phosphorelay in a RcsA-dependent manner, similar to that of other exopolysaccharide biosynthetic pathways. It is highly possible that the yjbEFGH operon encodes a system involved in EPS secretion since none of the products is predicted to have enzymic activity, the products are all secreted and YjbF and H are predicted to be beta-barrel lipoproteins similar to porins. It may be that the operon products play some role in biofilm formation and/or matrix production.	189
371375	pfam11103	DUF2887	Protein of unknown function (DUF2887). This bacterial family of proteins has no known function. These proteins may be distantly related to the PD(D/E)XK superfamily.	200
402605	pfam11104	PilM_2	Type IV pilus assembly protein PilM. The type IV pilus assembly protein PilM is required for competency and pilus biogenesis. It binds to PilN and ATP.	340
314123	pfam11105	CCAP	Arthropod cardioacceleratory peptide 2a. CCAP exerts a reversible and dose-dependent cardio-stimulatory effect on the semi-isolated heart of experimental beetles. CCAP also increases free hemolymph sugar concentration in young larvae and adults of the meal-worm beetle.	128
402606	pfam11106	YjbE	Exopolysaccharide production protein YjbE. YjbE is part of a four gene operon which is involved in exopolysaccharide production. The expression of YjbE is higher than the rest of the operon yjbEFGH. It appears to be restricted to Enterobacteriaceae. YjbE is one of four gene-products expressed from an operon, yjbEFGH, which is regulated by the Rcs phosphorelay in a RcsA-dependent manner, similar to that of other exopolysaccharide biosynthetic pathways. It is highly possible that the yjbEFGH operon encodes a system involved in EPS secretion since none of the products is predicted to have enzymic activity, the products are all secreted and YjbH and F are predicted to be beta-barrel lipoproteins similar to porins. It may be that the operon products play some role in biofilm formation and/or matrix production.	79
402607	pfam11107	FANCF	Fanconi anemia group F protein (FANCF). FANCF regulates its own expression by methylation at both mRNA and protein levels. Methylation-induced inactivation of FANCF has an important role in the occurrence of ovarian cancers by disrupting the FA-BRCA pathway.	345
371377	pfam11108	Phage_glycop_gL	Viral glycoprotein L. gL forms a complex with gH, a glycoprotein known to be essential for entry of HSV-1 into cells and virus-induced cell fusion. It is a hetero-oligomer of gH and gL which is incorporated into virions and transported to the cell surface, and which acts during entry of virus into cells.	98
402608	pfam11109	RFamide_26RFa	Orexigenic neuropeptide Qrfp/P518. Qrfp/P518 has a direct role in maintaining bone mineral density. Qrfp has also been found to be important in energy homeostasis by regulating appetite and energy expenditure in mice. The C-terminal 28 residues are the functional 26RFa.	131
314127	pfam11110	Phage_hub_GP28	Baseplate hub distal subunit. These baseplate proteins are also referred to as Gp28. Gp28 is the structural component of the central part of the bacteriophage T4 baseplate, which possesses a hydrophobic region and is membrane bound. Gp28 forms a complex with gp27 which is another structural component of the baseplate.	154
402609	pfam11111	CENP-M	Centromere protein M (CENP-M). The prime candidate for specifying centromere identity is the array of nucleosomes assembled with CENP-A. CENP-A recruits a nucleosome associated complex (NAC) comprised of CENP-M along with two other proteins. Assembly of the CENP-A NAC at centromeres is partly dependent on CENP-M. The CENP-A NAC is essential, as disruption of the complex causes errors of chromosome alignment and segregation that preclude cell survival.	171
402610	pfam11112	PyocinActivator	Pyocin activator protein PrtN. PrtN is a transcriptional activator for pyocin synthesis genes. It activates the expression of various pyocin genes by interaction with the DNA sequences conserved in the 5' noncoding regions of the pyocin genes.	74
371381	pfam11113	Phage_head_chap	Head assembly gene product. This head assembly protein is also referred to as gene product 40 (Gp40). A specific gp20-gp40 membrane insertion structure constitutes the T4 prohead assembly initiation complex. This protein in T4 stimulates head formation.	56
402611	pfam11114	Minor_capsid_2	Minor capsid protein. Most of the members of this family are annotated as being minor capsid proteins. The genomes carrying the genes usually have three similar proteins adjacent to each other, hence this one being named as No.2.	113
371383	pfam11115	DUF2623	Protein of unknown function (DUF2623). This family is conserved in the Enterobacteriaceae family. Several members are named as YghW. The function is not known.	93
402612	pfam11116	DUF2624	Protein of unknown function (DUF2624). This family is conserved in the Bacillaceae family. Several members are named as YqfT. The function is not known.	83
378559	pfam11117	DUF2626	Protein of unknown function (DUF2626). This family is conserved in the Bacillaceae family. Several members are named as YqgY. The function is not known.	73
378560	pfam11118	DUF2627	Protein of unknown function (DUF2627). This family is conserved in the Bacillaceae family. Several members are named as YqzF. The function is not known.	72
402613	pfam11119	DUF2633	Protein of unknown function (DUF2633). This family is conserved largely in the Bacillaceae family. Several members are named as YfgG. The function is not known.	54
402614	pfam11120	CBP_BcsF	Cellulose biosynthesis protein BcsF. CBP_BcsF is a family of bacterial cellulose biosynthesis proteins. Cellulose is necessary for biofilm formation in bacteria. (Roemling U. and Galperin M.Y. "Bacterial cellulose biosynthesis. Diversity of operons and subunits" (manuscript in preparation)).	56
288026	pfam11121	DUF2639	Protein of unknown function (DUF2639). This family is conserved in the Bacillaceae family. Several members are named as being YflJ, but the function is not known.	37
371387	pfam11122	Spore-coat_CotD	Inner spore coat protein D. This family is conserved in the Enterobacteriaceae family. CotD is an inner spore coat protein that is expressed in the middle phase of mother cell gene expression. Along with CotD, CotH, CotS and CotT it is assumed to assemble into the loose skeleton of the matrix, between the shells of SpoIVA and CotE. Coat proteins do not share much sequence similarity between species, but this does not imply they do not share secondary, tertiary, or quaternary features.	86
402615	pfam11123	DNA_Packaging_2	DNA packaging protein. This DNA packaging protein is also referred to as gene 18 product (gp18). This protein is required for DNA packaging and functions in a complex with gp19.	82
402616	pfam11124	Pho86	Inorganic phosphate transporter Pho86. Pho86p is an ER protein which is produced in response to phosphate starvation. It is essential for growth when phosphate levels are limiting. Pho86p is also involved in the regulation of Pho84p, a high-affinity phosphate transporter which is localized to the endoplasmic reticulum (ER) in low phosphate medium. When the level of phosphate increases Pho84p is transported to the vacuole. Pho86p is required for packaging of Pho84p into COPII vesicles.	284
288030	pfam11125	DUF2830	Protein of unknown function (DUF2830). Several members in this viral family of proteins are annotated as lysis proteins.	54
288031	pfam11126	Phage_DsbA	Transcriptional regulator DsbA. DsbA is a double stranded binding protein found in bacteriophage T4 which is involved in transcriptional regulation. DsbA, along with other viral proteins, interacts with the host RNA polymerase core enzyme enabling initiation of transcription. DsbA acts as an enhancer protein of late genes in vitro. The protein consists of mainly alpha helices.	67
402617	pfam11127	DUF2892	Protein of unknown function (DUF2892). This family is conserved in bacteria. The function is not known.	66
371390	pfam11128	Nucleocap_ssRNA	Plant viral coat protein nucleocapsid. This family of nucleocapsid proteins is from ssRNA negative-strand viruses of plant origin.	179
151573	pfam11129	EIAV_Rev	Rev protein of equine infectious anaemia virus. The sequence of this family is highly conserved and carries a nuclear export signal from residues 31-55, and RNA binding/nuclear localization signals of RRDR at residue 76 and KRRRK at residue 159. Rev is an essential regulatory protein required for nucleocytoplasmic transport of incompletely spliced viral mRNAs that encode structural proteins. Rev has been shown to down-regulate the expression of viral late genes and alter sensitivity to Gag-specific cytotoxic-T-lymphocytes (CTL). Equine infectious anaemia virus (EIAV) exhibits a high rate of genetic variation in vivo, and results in a clinically variable disease in infected horses.	134
402618	pfam11130	TraC_F_IV	F pilus assembly Type-IV secretion system for plasmid transfer. This family of TraC proteins is conserved in Proteobacteria. TraC is a cytoplasmic, peripheral membrane protein and is one of the proteins encoded by the F transfer region of the conjugative plasmid that is required for the assembly of F pilin into the mature F pilus structure. F pili are filamentous appendages that help establish the physical contact between donor and recipient cells involved in the conjugation process.	231
288035	pfam11131	PhrC_PhrF	Rap-phr extracellular signalling. PhrC and PhrF stimulate ComA-dependent gene expression to different levels and are both required for full expression of genes activated by ComA, which activates the expression of genes involved in competence development and the production of several secreted products.	38
371391	pfam11132	SplA	Transcriptional regulator protein (SplA). The SplA protein functions in trans as a negative regulator of the level of splB-lacZ expression in the developing forespore.	73
256308	pfam11133	Phage_head_fibr	Head fiber protein. This head fiber protein is also referred to as Gp8.5. Gp8.5 is a structural protein in phage. It is a dispensable head protein.	277
288037	pfam11134	Phage_stabilize	Phage stabilisation protein. Members of this family are phage proteins that are probably involved with stabilizing the condensed DNA within the capsid.	469
288038	pfam11135	DUF2888	Protein of unknown function (DUF2888). Some members in this family of proteins with unknown function are annotated as immediate early protein ICP-18 however this cannot be confirmed.	144
402619	pfam11136	DUF2889	Protein of unknown function (DUF2889). This bacterial family of proteins has no known function.	123
402620	pfam11137	DUF2909	Protein of unknown function (DUF2909). This is a family of proteins of unknown function conserved in Proteobacteria.	60
402621	pfam11138	DUF2911	Protein of unknown function (DUF2911). This bacterial family of proteins has no known function.	141
378566	pfam11139	SfLAP	Sap, sulfolipid-1-addressing protein. SAP is a transmembrane transport protein with six predicted transmembrane helices, with a hydrophilic domain between helices 3 and 4. This hydrophobic region is highly variable among identified Gap-like (GPL, peptidoglycolipid, addressing protein) proteins and may be involved in substrate recognition. SAP also belongs to the LysE protein superfamily (pfam01810), whose members have been implicated in small molecule transport in bacteria. Other Gap proteins export metabolites across the cell membrane so it is possible that Sap specifically may be involved in transport of sulfolipid-1 across the membrane.	213
402622	pfam11140	DUF2913	Protein of unknown function (DUF2913). This family of proteins with unknown function appears to be restricted to Gammaproteobacteria.	207
402623	pfam11141	DUF2914	Protein of unknown function (DUF2914). This bacterial family of proteins has no known function.	62
402624	pfam11142	DUF2917	Protein of unknown function (DUF2917). This bacterial family of proteins appears to be restricted to Proteobacteria.	59
402625	pfam11143	DUF2919	Protein of unknown function (DUF2919). This bacterial family of proteins has no known function. Some members are annotated as YfeZ however this cannot be confirmed.	146
314148	pfam11144	DUF2920	Protein of unknown function (DUF2920). This bacterial family of proteins has no known function.	394
402626	pfam11145	DUF2921	Protein of unknown function (DUF2921). This eukaryotic family of proteins has no known function.	891
402627	pfam11146	DUF2905	Protein of unknown function (DUF2905). This is a conserved family of bacterial proteins of unknown function.	64
402628	pfam11148	DUF2922	Protein of unknown function (DUF2922). This bacterial family of proteins has no known function.	63
402629	pfam11149	DUF2924	Protein of unknown function (DUF2924). This bacterial family of proteins has no known function.	134
402630	pfam11150	DUF2927	Protein of unknown function (DUF2927). This family is conserved in Proteobacteria. Several members are described as being putative lipoproteins, but otherwise the function is not known.	204
402631	pfam11151	DUF2929	Protein of unknown function (DUF2929). This family of proteins with unknown function appears to be restricted to Firmicutes.	56
402632	pfam11152	CCB2_CCB4	Cofactor assembly of complex C subunit B, CCB2/CCB4. Cofactor maturation pathways such as the CCB system (system IV) for cytochrome c-heme attachment are conserved in all organisms performing oxygenic photosynthesis. The CCB system consists of four proteins: CCB1-4. CCB2 and CCB4 are paralogues derived from a unique cyanobacterial ancestor. Orthologues are conserved in higher plants.	192
402633	pfam11153	DUF2931	Protein of unknown function (DUF2931). Some members in this family of proteins are annotated as lipoproteins however this cannot be confirmed. Currently, there is no known function.	189
402634	pfam11154	DUF2934	Protein of unknown function (DUF2934). This bacterial family of proteins has no known function.	37
402635	pfam11155	DUF2935	Domain of unknown function (DUF2935). This family of proteins with unknown function appears to be restricted to Firmicutes. The structure of this protein has been solved and each domain is composed of four alpha helices. A metal cluster composed of iron and magnesium lies between the two domains.	121
402636	pfam11157	DUF2937	Protein of unknown function (DUF2937). This family of proteins with unknown function appears to be restricted to Proteobacteria.	160
402637	pfam11158	DUF2938	Protein of unknown function (DUF2938). This bacterial family of proteins has no known function. Some members are thought to be membrane proteins however this cannot be confirmed.	150
402638	pfam11159	DUF2939	Protein of unknown function (DUF2939). This bacterial family of proteins has no known function.	92
402639	pfam11160	DUF2945	Protein of unknown function (DUF2945). This family of proteins has no known function.	59
402640	pfam11161	DUF2944	Protein of unknown function (DUF2944). This family of proteins with unknown function appears to be restricted to Proteobacteria.	183
402641	pfam11162	DUF2946	Protein of unknown function (DUF2946). This family of proteins has no known function.	116
314165	pfam11163	DUF2947	Protein of unknown function (DUF2947). This family of proteins with unknown function appears to be restricted to Gammaproteobacteria.	151
402642	pfam11164	DUF2948	Protein of unknown function (DUF2948). This family of proteins with unknown function appears to be restricted to Proteobacteria.	137
402643	pfam11165	DUF2949	Protein of unknown function (DUF2949). This family of proteins with unknown function appears to be restricted to Cyanobacteria.	56
288067	pfam11166	DUF2951	Protein of unknown function (DUF2951). This family of proteins has no known function. It has a highly conserved sequence.	98
378578	pfam11167	DUF2953	Protein of unknown function (DUF2953). This family of proteins has no known function.	53
402644	pfam11168	DUF2955	Protein of unknown function (DUF2955). Some members in this family of proteins with unknown function annotate the proteins as membrane protein. However, this cannot be confirmed.	140
402645	pfam11169	DUF2956	Protein of unknown function (DUF2956). This family of proteins with unknown function appears to be restricted to Gammaproteobacteria.	101
402646	pfam11170	DUF2957	Protein of unknown function (DUF2957). Some members annotate the proteins to be putative lipoproteins however this cannot be confirmed. Currently no function is known for this family of proteins.	298
402647	pfam11171	DUF2958	Protein of unknown function (DUF2958). Some members are annotated as lipoproteins however this cannot be confirmed. This family of proteins has no known function.	111
402648	pfam11172	DUF2959	Protein of unknown function (DUF2959). This family of proteins with unknown function appears to be restricted to Gammaproteobacteria.	190
402649	pfam11173	DUF2960	Protein of unknown function (DUF2960). This family of proteins with unknown function appears to be restricted to Gammaproteobacteria.	79
402650	pfam11174	DUF2970	Protein of unknown function (DUF2970). This short family is conserved in Proteobacteria. The function is not known.	56
402651	pfam11175	DUF2961	Protein of unknown function (DUF2961). This family of proteins has no known function.	234
402652	pfam11176	Tma16	Translation machinery-associated protein 16. Proteins in this family localize to the nucleus. Their function is not clear.	146
402653	pfam11177	DUF2964	Protein of unknown function (DUF2964). This family of proteins with unknown function appears to be restricted to Proteobacteria.	62
314179	pfam11178	DUF2963	Protein of unknown function (DUF2963). This family of proteins with unknown function appears to be restricted to Mollicutes.	51
402654	pfam11179	DUF2967	Protein of unknown function (DUF2967). This family of proteins with unknown function appears to be restricted to Drosophila.	954
402655	pfam11180	DUF2968	Protein of unknown function (DUF2968). This family of proteins has no known function.	180
402656	pfam11181	YflT	Heat induced stress protein YflT. YflT is a heat induced protein.	100
402657	pfam11182	AlgF	Alginate O-acetyl transferase AlgF. AlgF is essential for the addition of O-acetyl groups to alginate, an extracellular polysaccharide. The presence of O-acetyl groups plays an important role in the ability of the polymer to act as a virulence factor.	164
402658	pfam11183	PmrD	Polymyxin resistance protein PmrD. PmrB forms a two-component system (TCS) with PmrA that allows Gram-negative bacteria to survive the cationic antimicrobial peptide polymyxin G. The TCS is linked to another one via the polymyxin resistance protein PmrD. PmrD is the first protein identified to mediate the connectivity between the two TCSs. It binds to the N terminal domain of the PmrA response regulator which prevents its dephosphorylation, thereby promoting the transcription of genes involved in polymyxin resistance.	81
402659	pfam11184	DUF2969	Protein of unknown function (DUF2969). This family of proteins with unknown function appears to be restricted to Lactobacillales.	71
402660	pfam11185	DUF2971	Protein of unknown function (DUF2971). This bacterial family of proteins has no known function.	89
288086	pfam11186	DUF2972	Protein of unknown function (DUF2972). Some members in this family of proteins with unknown function are annotated as sugar transferase proteins, however this cannot be confirmed.	198
402661	pfam11187	DUF2974	Protein of unknown function (DUF2974). This bacterial family of proteins has no known function.	224
402662	pfam11188	DUF2975	Protein of unknown function (DUF2975). This family of bacterial proteins has no known function. These proteins are likely to be integral membrane proteins. The proteins contain a highly conserved glutamic acid close to their C-terminus.	130
402663	pfam11189	DUF2973	Protein of unknown function (DUF2973). Some members in this family of proteins are annotated as membrane proteins however this cannot be confirmed. Currently they have no known function.	68
402664	pfam11190	DUF2976	Protein of unknown function (DUF2976). This family of proteins has no known function. Some members are annotated as membrane proteins however this cannot be confirmed.	87
402665	pfam11191	DUF2782	Protein of unknown function (DUF2782). This is a bacterial family of proteins whose function is unknown.	88
288092	pfam11192	DUF2977	Protein of unknown function (DUF2977). This family of proteins has no known function.	61
402666	pfam11193	DUF2812	Protein of unknown function (DUF2812). This is a bacterial family of uncharacterized proteins, however some members of this family are annotated as membrane proteins.	108
402667	pfam11195	DUF2829	Protein of unknown function (DUF2829). This is an uncharacterized family of proteins found in bacteria and bacteriophages.	71
402668	pfam11196	DUF2834	Protein of unknown function (DUF2834). This is a bacterial family of uncharacterized proteins.	95
402669	pfam11197	DUF2835	Protein of unknown function (DUF2835). This is a bacterial family of uncharacterized proteins. One member of this family is annotated as the A subunit of Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV).	71
402670	pfam11198	DUF2857	Protein of unknown function (DUF2857). This is a bacterial family of uncharacterized proteins.	174
402671	pfam11199	DUF2891	Protein of unknown function (DUF2891). This is a bacterial family of uncharacterized proteins.	323
402672	pfam11200	DUF2981	Protein of unknown function (DUF2981). This eukaryotic family of proteins has no known function.	334
402673	pfam11201	DUF2982	Protein of unknown function (DUF2982). This family of proteins with unknown function appears to be restricted to Gammaproteobacteria.	151
402674	pfam11202	PRTase_1	Phosphoribosyl transferase (PRTase). This PRTase family is fused to a C-terminal RNA binding Pelota domain, pfam01248. These genes are found in the biosynthetic operon associated with the Ter stress response operon and are predicted to be involved in the biosynthesis of a ribo-nucleoside involved in stress response.	246
402675	pfam11203	EccE	Putative type VII ESX secretion system translocon, EccE. EccE is a family of largely Gram-positive bacterial transmembrane components of the type VII secretion system characterized in Mycobacterium tuberculosis, systems ESX1-5. Translocation of virulent peptides through the membranes is thought to be mediated via a complex that includes EccB, EccC, EccD, EccE, and MycP. EccB, EccC, EccD, and EccE form a stable complex in the mycobacterial cell envelope.	97
402676	pfam11204	DUF2985	Protein of unknown function (DUF2985). This eukaryotic family of proteins has no known function.	78
402677	pfam11205	DUF2987	Protein of unknown function (DUF2987). This family of proteins with unknown function appears to be restricted to Gammaproteobacteria.	145
371419	pfam11207	DUF2989	Protein of unknown function (DUF2989). Some members in this bacterial family of proteins are annotated as lipoproteins however this cannot be confirmed.	201
402678	pfam11208	DUF2992	Protein of unknown function (DUF2992). This bacterial family of proteins has no known function. However, the cis-regulatory yjdF motif, just upstream from the gene encoding the proteins for this family, is a small non-coding RNA, Rfam:RF01764. The yjdF motif is found in many Firmicutes, including Bacillus subtilis. In most cases, it resides in potential 5' UTRs of homologs of the yjdF gene whose function is unknown. However, in Streptococcus thermophilus, a yjdF RNA motif is associated with an operon whose protein products synthesize nicotinamide adenine dinucleotide (NAD+). Also, the S. thermophilus yjdF RNA lacks typical yjdF motif consensus features downstream of and including the P4 stem. Thus, if yjdF RNAs are riboswitch aptamers, the S. thermophilus RNAs might sense a distinct compound that structurally resembles the ligand bound by other yjdF RNAs. On the other hand, perhaps these RNAs have an alternative solution forming a similar binding site, as is observed with some SAM riboswitches.	132
402679	pfam11209	DUF2993	Protein of unknown function (DUF2993). This family of proteins with unknown function appears to be restricted to Cyanobacteria.	216
402680	pfam11210	DUF2996	Protein of unknown function (DUF2996). This family of proteins has no known function.	121
402681	pfam11211	DUF2997	Protein of unknown function (DUF2997). This family of proteins has no known function.	47
288110	pfam11212	DUF2999	Protein of unknown function (DUF2999). This family of proteins with unknown function appears to be restricted to Gammaproteobacteria.	82
402682	pfam11213	DUF3006	Protein of unknown function (DUF3006). This family of proteins has no known function.	67
402683	pfam11214	Med2	Mediator complex subunit 2. This family of mediator complex subunit 2 proteins is conserved in fungi. Cyclin-dependent kinase CDK8 or Srb10 interacts with and phosphorylates Med2. Post-translational modifications of Mediator subunits are important for regulation of gene expression.	100
402684	pfam11215	DUF3010	Protein of unknown function (DUF3010). This family of proteins with unknown function appears to be restricted to Gammaproteobacteria.	137
402685	pfam11216	DUF3012	Protein of unknown function (DUF3012). This family of proteins with unknown function is restricted to Gammaproteobacteria.	32
402686	pfam11217	DUF3013	Protein of unknown function (DUF3013). This bacterial family of proteins with unknown function appears to be restricted to Firmicutes.	159
402687	pfam11218	DUF3011	Protein of unknown function (DUF3011). This bacterial family of proteins has no known function. Most members belong to Proteobacteria.	197
378600	pfam11219	DUF3014	Protein of unknown function (DUF3014). This family of proteins with unknown function appears to be restricted to Proteobacteria.	156
402688	pfam11220	DUF3015	Protein of unknown function (DUF3015). This bacterial family of proteins has no known function.	137
402689	pfam11221	Med21	Subunit 21 of Mediator complex. Med21 has been known as Srb7 in yeasts, hSrb7 in humans and Trap 19 in Drosophila. The heterodimer of the two subunits Med7 and Med21 appears to act as a hinge between the middle and the tail regions of Mediator.	140
402690	pfam11222	DUF3017	Protein of unknown function (DUF3017). This bacterial family of proteins with unknown function appears to be restricted to Actinobacteria.	74
402691	pfam11223	DUF3020	Protein of unknown function (DUF3020). This family of fungal proteins is conserved towards the C-terminus of HMG domains. The function is not known.	49
288122	pfam11224	DUF3023	Protein of unknown function (DUF3023). This bacterial family of proteins with unknown function appears to be restricted to Alphaproteobacteria.	130
402692	pfam11225	DUF3024	Protein of unknown function (DUF3024). This family of proteins has no known function.	56
402693	pfam11226	DUF3022	Protein of unknown function (DUF3022). This family of proteins with unknown function appears to be restricted to Proteobacteria.	103
402694	pfam11227	DUF3025	Protein of unknown function (DUF3025). Some members in this bacterial family of proteins are annotated as transmembrane proteins however this cannot be confirmed. Currently this family of proteins has no known function.	210
402695	pfam11228	DUF3027	Protein of unknown function (DUF3027). This family of proteins with unknown function appears to be restricted to Actinobacteria.	193
402696	pfam11229	Focadhesin	Focadhesin. Focadhesin (FOCAD) is a focal adhesion protein with potential tumor suppressor function in gliomas.	589
402697	pfam11230	DUF3029	Protein of unknown function (DUF3029). Some members in this family of proteins are annotated as ykkI. Currently no function is known.	485
402698	pfam11231	DUF3034	Protein of unknown function (DUF3034). This family of proteins with unknown function appears to be restricted to Proteobacteria.	256
402699	pfam11232	Med25	Mediator complex subunit 25 PTOV activation and synapsin 2. Mediator is a large complex of up to 33 proteins that is conserved from plants to fungi to humans - the number and representation of individual subunits varying with species. It is arranged into four different sections, a core, a head, a tail and a kinase-active part, and the number of subunits within each of these is what varies with species. Overall, Mediator regulates the transcriptional activity of RNA polymerase II but it would appear that each of the four different sections has a slightly different function. The overall function of the full-length Med25 is efficiently to coordinate the transcriptional activation of RAR/RXR (retinoic acid receptor/retinoic X receptor) in higher eukaryotic cells. Human Med25 consists of several domains with different binding properties, the N-terminal, VWA domain, an SD1 - synapsin 1 - domain from residues 229-381, a PTOV(B) or ACID domain from 395-545, an SD2 domain from residues 564-645 and a C-terminal NR box-containing domain (646-650) from 646-747. This family is the combined PTOV and SD2 domains, the PTOV domain being the domain through which Med25 co-operates with the histone acetyltransferase CBP, but the function of the SD2 domain is unclear.	147
402700	pfam11233	DUF3035	Protein of unknown function (DUF3035). This family of proteins with unknown function appears to be restricted to Alphaproteobacteria.	140
402701	pfam11235	Med25_SD1	Mediator complex subunit 25 synapsin 1. The overall function of the full-length Med25 is efficiently to coordinate the transcriptional activation of RAR/RXR (retinoic acid receptor/retinoic X receptor) in higher eukaryotic cells. Human Med25 consists of several domains with different binding properties, the N-terminal, VWA, domain, this SD1 - synapsin 1 - domain from residues 229-381, a PTOV(B) or ACID domain from 395-545, an SD2 domain from residues 564-645 and a C-terminal NR box-containing domain (646-650) from 646-747. The function of the SD domains is unclear.	157
402702	pfam11236	DUF3037	Protein of unknown function (DUF3037). This bacterial family of proteins has no known function.	118
402703	pfam11237	DUF3038	Protein of unknown function (DUF3038). This family of proteins with unknown function appears to be restricted to Cyanobacteria.	169
402704	pfam11238	DUF3039	Protein of unknown function (DUF3039). This family of proteins with unknown function appears to be restricted to Actinobacteria.	56
402705	pfam11239	DUF3040	Protein of unknown function (DUF3040). Some members in this family of proteins with unknown function are annotated as membrane proteins however this cannot be confirmed.	82
402706	pfam11240	DUF3042	Protein of unknown function (DUF3042). This family of proteins with unknown function appears to be restricted to Firmicutes.	54
402707	pfam11241	DUF3043	Protein of unknown function (DUF3043). Some members in this family of proteins with unknown function are annotated as membrane proteins. This cannot be confirmed.	171
402708	pfam11242	DUF2774	Protein of unknown function (DUF2774). This is a viral family of proteins with unknown function.	63
288140	pfam11243	DUF3045	Protein of unknown function (DUF3045). Members in this family of proteins are annotated as gene protein 30.1. Currently no function is known.	88
402709	pfam11244	Med25_NR-box	Mediator complex subunit 25 C-terminal NR box-containing. The overall function of the full-length Med25 is efficiently to coordinate the transcriptional activation of RAR/RXR (retinoic acid receptor/retinoic X receptor) in higher eukaryotic cells. Human Med25 consists of several domains with different binding properties, the N-terminal, VWA, domain, an SD1 - synapsin 1 - domain from residues 229-381, a PTOV(B) or ACID domain from 395-545, an SD2 domain from residues 564-645 and this C-terminal NR box-containing domain (646-650) from 646-747. The NR box of MED25 is critical for its recruitment to the promoter, probably through an interaction with pre-bound RAR.	90
288142	pfam11245	DUF2544	Protein of unknown function (DUF2544). This is a bacterial family of proteins with unknown function.	246
402710	pfam11246	Phage_gp53	Base plate wedge protein 53. The baseplate of bacteriophage T4 controls host cell recognition, attachment, tail sheath contraction and viral DNA ejection. The structure of the baseplate suggests a mechanism of baseplate structural transition during the initial stages of T4 infection. The baseplate is assembled from six identical wedges that surround the central hub. Gp53, along with other T4 gene products, combine sequentially to assemble a wedge.	200
314235	pfam11247	DUF2675	Protein of unknown function (DUF2675). Members in this family of proteins are annotated as Gene protein 5.5. Currently no function is known.	99
402711	pfam11248	DUF3046	Protein of unknown function (DUF3046). This family of proteins with unknown function appears to be restricted to Actinobacteria.	62
402712	pfam11249	DUF3047	Protein of unknown function (DUF3047). This bacterial family of proteins has no known function.	198
402713	pfam11250	FAF	Fantastic Four meristem regulator. FAF is a family of plant proteins that regulate the size of the shoot meristem by modulating the CLV3-WUS feedback loop. The proteins are expressed in the centre of the shoot meristem, overlapping with the site of WUS - the homeodomain transcription factor WUSCHEL- expression. FAF proteins are capable of modulating shoot growth by repressing WUS in the organising centre of the shoot meristem. The ability of plants to form new organs throughout their life cycle requires tight control of the meristems to avoid unregulated growth. Plants have evolved an elaborate genetic network that controls meristem size and maintenance. WUS and the CLAVATA (CLV) ligand-receptor system are at the core of the network that regulates the size of the stem cell population in the shoot meristem.	54
402714	pfam11251	DUF3050	Protein of unknown function (DUF3050). This bacterial family of proteins has no known function.	232
151694	pfam11252	DUF3051	Protein of unknown function (DUF3051). This viral family of proteins has no known function.	189
402715	pfam11253	DUF3052	Protein of unknown function (DUF3052). This family of proteins with unknown function appears to be restricted to Actinobacteria.	123
402716	pfam11254	DUF3053	Protein of unknown function (DUF3053). Some members in this family of proteins are annotated as the membrane protein YiaF. No function is currently known.	219
402717	pfam11255	DUF3054	Protein of unknown function (DUF3054). Some members in this family of proteins are annotated as membrane proteins however this cannot be confirmed. Currently no function is known.	110
402718	pfam11256	DUF3055	Protein of unknown function (DUF3055). This family of proteins with unknown function appears to be restricted to Firmicutes.	80
402719	pfam11258	DUF3048	Protein of unknown function (DUF3048) N-terminal domain. Some members in this bacterial family of proteins are annotated as YerB. However currently no function is known. This entry represents the N-terminal domain.	143
402720	pfam11259	DUF3060	Protein of unknown function (DUF3060). Some members in this family of proteins are annotated as membrane proteins however this cannot be confirmed.	58
371436	pfam11260	Spidroin_MaSp	Major ampullate spidroin 1 and 2. Dragline silk is composed of two proteins, major ampullate spidroin 1 (MaSp1) and major ampullate spidroin 2 (MaSp2). MaSp1 contains five alpha-helices. Only the C-terminus of the proteins is shown.	85
402721	pfam11261	IRF-2BP1_2	Interferon regulatory factor 2-binding protein zinc finger. IRF-2BP1 and IRF-2BP2 are nuclear transcriptional repressor proteins and can inhibit both enhancer-activated and basal transcription. They both contain an N-terminal zinc finger, represented in this family, and C-terminal RING finger domains.	52
402722	pfam11262	Tho2	Transcription factor/nuclear export subunit protein 2. THO and TREX form a eukaryotic complex which functions in messenger ribonucleoprotein metabolism and plays a role in preventing the transcription-associated genetic instability. Tho2, along with four other subunits, forms THO.	304
402723	pfam11263	Attachment_P66	Borrelia burgdorferi attachment protein P66. P66 is an outer membrane protein in Borrelia burgdorferi, the agent of Lyme disease. P66 has a role in the attachment of Borrelia burgdorferi to human cell-surface receptors.	253
402724	pfam11264	ThylakoidFormat	Thylakoid formation protein. THF1 is localized to the outer plastid membrane and the stroma. THF1 has a role in sugar signalling. THF1 is also thought to have a role in chloroplast and leaf development. THF1 has been shown to play a crucial role in vesicle-mediated thylakoid membrane biogenesis.	216
402725	pfam11265	Med25_VWA	Mediator complex subunit 25 von Willebrand factor type A. The overall function of the full-length Med25 is efficiently to coordinate the transcriptional activation of RAR/RXR (retinoic acid receptor/retinoic X receptor) in higher eukaryotic cells. Human Med25 consists of several domains with different binding properties, the N-terminal, VWA domain which is this one, an SD1 domain from residues 229-381, a PTOV(B) or ACID domain from 395-545, an SD2 domain from residues 564-645 and a C-terminal NR box-containing domain (646-650) from 646-747. This VWA or von Willebrand factor type A domain when bound to RAR and the histone acetyltransferase CBP is responsible for recruiting Med1 to the rest of the Mediator complex.	213
402726	pfam11266	Ald_deCOase	Long-chain fatty aldehyde decarbonylase. This cyanobacterial family of fatty aldehyde decarbonylases acts on mainly C16 and C18 substrates to form hydrocarbons and carbon monoxide. Note that the corresponding EC number (EC:4.1.99.5) dating from 1989 refers to a nonorthologous Pisum sativum enzyme that acts on C18 and longer chains and attaches the overly narrow name octadecanal decarbonylase.	215
402727	pfam11267	DUF3067	Domain of unknown function (DUF3067). This family of proteins found in plants and cyanobacteria has no known function. The structure of this domain has been solved by NMR for the alr2454 protein. The structure was determined to be a novel fold composed of four alpha helices and a sheet of three anti-parallel beta-strands.	90
402728	pfam11268	DUF3071	Protein of unknown function (DUF3071). Some members in this family of proteins are annotated as DNA-binding proteins however this cannot be confirmed. Currently no function is known.	165
314254	pfam11269	DUF3069	Protein of unknown function (DUF3069). This family of proteins with unknown function appears to be restricted to Gammaproteobacteria.	118
288164	pfam11270	DUF3070	Protein of unknown function (DUF3070). This eukaryotic family of proteins has no known function.	23
402729	pfam11271	DUF3068	Protein of unknown function (DUF3068). Some members in this family of proteins with unknown function are annotated as membrane proteins however this cannot be confirmed.	297
402730	pfam11272	DUF3072	Protein of unknown function (DUF3072). This bacterial family of proteins has no known function.	56
402731	pfam11273	DUF3073	Protein of unknown function (DUF3073). This family of proteins with unknown function appears to be restricted to Actinobacteria.	63
402732	pfam11274	DUF3074	Protein of unknown function (DUF3074). This eukaryotic family of proteins has no known function but appears to be part of the START superfamily.	181
371447	pfam11275	DUF3077	Protein of unknown function (DUF3077). This family of proteins with unknown function appears to be restricted to Gammaproteobacteria.	77
402733	pfam11276	DUF3078	Protein of unknown function (DUF3078). This bacterial family of proteins has no known function.	90
402734	pfam11277	Med24_N	Mediator complex subunit 24 N-terminal. This subunit of the Mediator complex appears to be conserved only from insects to humans. It is essential for correct retinal development in fish. Subunit composition of the mediator contributes to the control of differentiation in the vertebrate CNS as there are divergent functions of the mediator subunits Crsp34/Med27, Trap100/Med24, and Crsp150/Med14.	994
402735	pfam11278	DUF3079	Protein of unknown function (DUF3079). This family of proteins with unknown function appears to be restricted to Proteobacteria.	50
402736	pfam11279	DUF3080	Protein of unknown function (DUF3080). Some members in this family of proteins are annotated as lipoproteins however this cannot be confirmed. Currently this family has no known function.	315
337968	pfam11280	DUF3081	Protein of unknown function (DUF3081). This family of proteins with unknown function appears to be restricted to Gammaproteobacteria.	77
314264	pfam11281	DUF3083	Protein of unknown function (DUF3083). This family of proteins with unknown function appears to be restricted to Gammaproteobacteria.	315
402737	pfam11282	DUF3082	Protein of unknown function (DUF3082). This family of proteins has no known function.	80
402738	pfam11283	DUF3084	Protein of unknown function (DUF3084). This bacterial family of proteins has no known function.	77
402739	pfam11284	DUF3085	Protein of unknown function (DUF3085). This family of proteins with unknown function appears to be restricted to Proteobacteria.	89
402740	pfam11285	DUF3086	Protein of unknown function (DUF3086). This family of proteins with unknown function appears to be restricted to Cyanobacteria.	275
402741	pfam11286	DUF3087	Protein of unknown function (DUF3087). This family of proteins with unknown function appears to be restricted to Gammaproteobacteria.	165
402742	pfam11287	DUF3088	Protein of unknown function (DUF3088). This family of proteins with unknown function appears to be restricted to Proteobacteria.	112
402743	pfam11288	DUF3089	Protein of unknown function (DUF3089). This family of proteins has no known function but appears to have an alpha/beta hydrolase domain and so is likely to be enzymatic.	200
402744	pfam11289	APA3_viroporin	Coronavirus accessory protein 3a. APA3_viroporin is a pro-apoptosis-inducing protein. It localizes to the endoplasmic reticulum (ER)-Golgi compartment. The Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV) causes apoptosis of infected cells, and this is one of the culprits. Multi-pass membrane protein that forms a homotetrameric potassium-sensitive ion channel called a viroporin whose activity causes ER-stress to the host cell.	273
402745	pfam11290	DUF3090	Protein of unknown function (DUF3090). This family of proteins with unknown function appears to be restricted to Actinobacteria.	171
151732	pfam11291	DUF3091	Protein of unknown function (DUF3091). This eukaryotic family of proteins has no known function.	100
402746	pfam11292	DUF3093	Protein of unknown function (DUF3093). This family of proteins with unknown function appears to be restricted to Actinobacteria. Some members are annotated as alanine rich membrane proteins however this cannot be confirmed.	140
402747	pfam11293	DUF3094	Protein of unknown function (DUF3094). This family of proteins with unknown function appears to be restricted to Gammaproteobacteria.	55
402748	pfam11294	DUF3095	Protein of unknown function (DUF3095). Some members in this bacterial family of proteins are annotated as adenylyl cyclase however this cannot be confirmed. Currently no function is known.	377
371454	pfam11295	DUF3096	Protein of unknown function (DUF3096). This family of proteins with unknown function appears to be restricted to Proteobacteria.	37
402749	pfam11296	DUF3097	Protein of unknown function (DUF3097). This family of proteins with unknown function appears to be restricted to Actinobacteria.	270
402750	pfam11297	DUF3098	Protein of unknown function (DUF3098). This bacterial family of proteins has no known function.	67
402751	pfam11298	DUF3099	Protein of unknown function (DUF3099). Some members in this family of proteins are annotated as membrane proteins however this cannot be confirmed. Currently no function is known.	72
402752	pfam11299	DUF3100	Protein of unknown function (DUF3100). Some members in this family of proteins are annotated as membrane proteins however this cannot be confirmed. Currently no function is known.	235
402753	pfam11300	DUF3102	Protein of unknown function (DUF3102). This family of proteins has no known function.	129
402754	pfam11301	DUF3103	Protein of unknown function (DUF3103). This family of proteins with unknown function appears to be restricted to Proteobacteria.	351
402755	pfam11302	DUF3104	Protein of unknown function (DUF3104). This family of proteins with unknown function appears to be restricted to Cyanobacteria.	69
402756	pfam11303	DUF3105	Protein of unknown function (DUF3105). Some members in this family of proteins are annotated as membrane proteins however this cannot be confirmed. Currently no function is known.	123
402757	pfam11304	DUF3106	Protein of unknown function (DUF3106). Some members in this family of proteins are annotated as transmembrane proteins however this cannot be confirmed. Currently no function is known.	98
402758	pfam11305	DUF3107	Protein of unknown function (DUF3107). Some members in this family of proteins are annotated as ATP-binding proteins however this cannot be confirmed. Currently no function is known.	73
402759	pfam11306	DUF3108	Protein of unknown function (DUF3108). This is a bacterial family of putative lipoproteins. Structure 3fzx, the first structural template for this large family including several homologs in the human gut microbiome and in metagenomic datasets, folds into a beta barrel that topologically looks like a small-scale porin (such as FepA). Bacteroides fragilis glyA is a putative exported protein, and this fold is of the YmcC-like type, with a predicted signal peptide SpI cleavage site AGAMA|QNQDC, and a Phobius server prediction of non-cytoplasmic localization for amino acids 21-236. The possibility of it being a membrane protein can be ruled out by the hydrophilic nature of the solvent exposed surface outside the barrels. Analysis of sequence conservation suggests that an area near Glu172/Trp206 is potentially interesting. These two residues are also conserved in Dali hit Structure 2in5, a hypothetical lipoprotein classified as a new YmcC-like fold in SCOP (SCOP:159271, with a 12-stranded meander beta-sheet folded into a deformed beta-barrel) despite large structural differences between the two structures, suggesting similarity in function.	225
402760	pfam11307	DUF3109	Protein of unknown function (DUF3109). This bacterial family of proteins has no known function.	181
402761	pfam11308	Glyco_hydro_129	Glycosyl hydrolases related to GH101 family, GH129. This family of bacterial and lower eukaryote glycosyl hydrolases is related to CAZy family GH129, and distantly to GH101, and is made up of sub-families GHL1-GHL3.	324
402762	pfam11309	DUF3112	Protein of unknown function (DUF3112). This eukaryotic family of proteins has no known function.	217
288203	pfam11310	DUF3113	Protein of unknown function (DUF3113). This family of proteins has no known function. It has a highly conserved sequence.	60
402763	pfam11311	DUF3114	Protein of unknown function (DUF3114). Some members in this family of proteins with unknown function are annotated as cytosolic proteins. This cannot be confirmed.	253
402764	pfam11312	Methyltransf_34	Putative SAM-dependent methyltransferase. The largely fungal proteins in this family are likely to be methyltransferases. This was determined through multiple motif screening in yeast.	294
402765	pfam11313	DUF3116	Protein of unknown function (DUF3116). This family of proteins with unknown function appears to be restricted to Bacillales.	84
371461	pfam11314	DUF3117	Protein of unknown function (DUF3117). This family of proteins with unknown function appears to be restricted to Actinobacteria.	50
402766	pfam11315	Med30	Mediator complex subunit 30. Med30 is a metazoan-specific subunit of Mediator, having no homologs in yeasts.	147
371463	pfam11316	Rhamno_transf	Putative rhamnosyl transferase. Most members of this family are uncharacterized, but one is a putative side-chain-rhamnosyl transferase.	235
402767	pfam11317	DUF3119	Protein of unknown function (DUF3119). This family of proteins has no known function.	108
402768	pfam11318	DUF3120	Protein of unknown function (DUF3120). This family of proteins with unknown function appears to be restricted to Cyanobacteria.	199
402769	pfam11319	VasI	Type VI secretion system VasI, EvfG, VC_A0118. VasI is a family of Gram-negative proteins that form part of the pathogenicity apparatus for bacteria-to-bacteria attack. The exact function of this component is not known.	183
402770	pfam11320	DUF3122	Protein of unknown function (DUF3122). This family of proteins with unknown function appears to be restricted to Cyanobacteria.	134
402771	pfam11321	DUF3123	Protein of unknown function (DUF3123). This eukaryotic family of proteins has no known function.	113
402772	pfam11322	DUF3124	Protein of unknown function (DUF3124). This bacterial family of proteins has no known function.	123
402773	pfam11324	DUF3126	Protein of unknown function (DUF3126). This family of proteins with unknown function appears to be restricted to Alphaproteobacteria.	63
402774	pfam11325	DUF3127	Domain of unknown function (DUF3127). This bacterial family of proteins has no known function. However, it does show distant similarity to pfam00436, with proteins such as Prevotella buccalis HMPREF0650_0099 being similar to both families. This suggests that this family may have a DNA-binding function.	84
402775	pfam11326	DUF3128	Protein of unknown function (DUF3128). This eukaryotic family of proteins has no known function.	80
402776	pfam11327	DUF3129	Protein of unknown function (DUF3129). This eukaryotic family of proteins has no known function.	183
288220	pfam11328	DUF3130	Protein of unknown function (DUF3130). This bacterial family of proteins has no known function.	89
402777	pfam11329	DUF3131	Protein of unknown function (DUF3131). This bacterial family of proteins has no known function.	367
371472	pfam11330	DUF3132	Protein of unknown function (DUF3132). Proteins in this viral family are 55 kDa. No function is currently known.	242
402778	pfam11331	zinc_ribbon_12	Probable zinc-ribbon domain. This eukaryotic family of proteins has no known function.	45
402779	pfam11332	DUF3134	Protein of unknown function (DUF3134). This family of proteins with unknown function appears to be restricted to Cyanobacteria.	68
402780	pfam11333	DUF3135	Protein of unknown function (DUF3135). This family of proteins with unknown function appears to be restricted to Proteobacteria.	80
402781	pfam11334	DUF3136	Protein of unknown function (DUF3136). This family of proteins with unknown function appears to be restricted to Cyanobacteria.	64
402782	pfam11335	DUF3137	Protein of unknown function (DUF3137). This bacterial family of proteins has no known function.	142
402783	pfam11336	DUF3138	Protein of unknown function (DUF3138). This family of proteins with unknown function appears to be restricted to Proteobacteria.	525
402784	pfam11337	DUF3139	Protein of unknown function (DUF3139). This family of proteins with unknown function appears to be restricted to Firmicutes.	77
402785	pfam11338	DUF3140	Protein of unknown function (DUF3140). Some members in this family of proteins are annotated as DNA binding proteins. No function is currently known.	92
402786	pfam11339	DUF3141	Protein of unknown function (DUF3141). This family of proteins with unknown function appears to be restricted to Proteobacteria.	582
371478	pfam11340	DUF3142	Protein of unknown function (DUF3142). This bacterial family of proteins has no known function.	223
402787	pfam11341	DUF3143	Protein of unknown function (DUF3143). This family of proteins has no known function.	65
402788	pfam11342	DUF3144	Protein of unknown function (DUF3144). This family of proteins with unknown function appears to be restricted to Proteobacteria.	77
402789	pfam11343	DUF3145	Protein of unknown function (DUF3145). This family of proteins with unknown function appears to be restricted to Actinobacteria.	157
402790	pfam11344	DUF3146	Protein of unknown function (DUF3146). This family of proteins with unknown function appears to be restricted to Cyanobacteria.	80
288237	pfam11345	DUF3147	Protein of unknown function (DUF3147). Some members in this family of proteins are annotated as membrane proteins however this cannot be confirmed. Currently no function is known.	111
402791	pfam11346	DUF3149	Protein of unknown function (DUF3149). This bacterial family of proteins has no known function.	39
402792	pfam11347	DUF3148	Protein of unknown function (DUF3148). This family of proteins has no known function.	61
402793	pfam11348	DUF3150	Protein of unknown function (DUF3150). This bacterial family of proteins with unknown function appears to be restricted to Proteobacteria.	256
402794	pfam11349	DUF3151	Protein of unknown function (DUF3151). This family of proteins with unknown function appears to be restricted to Actinobacteria.	127
371483	pfam11350	DUF3152	Protein of unknown function (DUF3152). Some members in this family of proteins are annotated as membrane proteins however this cannot be confirmed. Currently there is no known function.	207
402795	pfam11351	GTA_holin_3TM	Holin of 3TMs, for gene-transfer release. This is a family of bacterial 3TM holins. In Rhodobacter capsulatus the protein is expressed just overlapping and downstream of a putative N-acetylmuramidase lysozyme (an endolysin) thought to be responsible for lysing a phage particle, RcGTA - a gene-transfer agent. A holin would be necessary for such an endolysin to access the peptidoglycan. Gene-transfer agents are bacteriophage-like genetic elements with the sole known function of horizontal gene transfer, serving an important role in microbial evolution. In order to be released from the cell these require the combined action of an endolysin and this holin.	123
402796	pfam11352	DUF3155	Protein of unknown function (DUF3155). This family of proteins with unknown function appears to be restricted to Cyanobacteria.	88
402797	pfam11353	DUF3153	Protein of unknown function (DUF3153). This family of proteins with unknown function appears to be restricted to Cyanobacteria. Some members are annotated as membrane proteins; however, this cannot be confirmed.	191
402798	pfam11354	DUF3156	Protein of unknown function (DUF3156). This family of proteins with unknown function appears to be restricted to Proteobacteria.	161
371487	pfam11355	DUF3157	Protein of unknown function (DUF3157). This family of proteins with unknown function appears to be restricted to Gammaproteobacteria.	199
402799	pfam11356	T2SSC	Type II secretion system protein C. This is the greater N-terminal region of GspC-type proteins. GspC proteins form part of the sophisticated transport mechanism of Gram-negative pathogens for injecting diverse proteins into their hosts, a type-II secretion system - T2SS. The region is made up of a short N-terminal cytoplasmic domain that is followed by the single transmembrane helix, a Pro-rich linker, and the so-called homology region domain in the periplasm. This inner membrane GspC interacts with the outer membrane secretin GspD via periplasmic domains, an interaction which is critical for the effectiveness of type II secretion.	142
402800	pfam11357	Spy1	Cell cycle regulatory protein. Speedy (Spy1) is a cell cycle regulatory protein which activates CDK2, the major kinase that allows progression through G1/S phase and further replication events. Spy1 expression overcomes a p27-induced cell cycle arrest to allow for DNA synthesis, so cell cycle progression occurs due to an interaction between Spy1 and p27. Spy1 is also known as Ringo protein A.	131
402801	pfam11358	DUF3158	Protein of unknown function (DUF3158). Some members in this family of proteins are annotated as integrase regulator R; however, this cannot be confirmed. This family of proteins with unknown function appears to be restricted to Proteobacteria.	156
288251	pfam11359	gpUL132	Glycoprotein UL132. Glycoprotein UL132 is a low-abundance structural component of Human cytomegalovirus (HCMV). The function of this protein is not fully understood.	238
402802	pfam11360	DUF3110	Protein of unknown function (DUF3110). This family of proteins has no known function.	84
371490	pfam11361	DUF3159	Protein of unknown function (DUF3159). Some members in this family of proteins with unknown function are annotated as membrane proteins however this cannot be confirmed. Currently this family of proteins has no known function.	188
402803	pfam11362	DUF3161	Protein of unknown function (DUF3161). This eukaryotic family of proteins has no known function.	83
371491	pfam11363	DUF3164	Protein of unknown function (DUF3164). This family of proteins has no known function.	194
402804	pfam11364	DUF3165	Protein of unknown function (DUF3165). Some members in this family of proteins are annotated as membrane proteins however this cannot be confirmed. Currently there is no known function.	81
402805	pfam11365	SOGA	Protein SOGA. The SOGA (suppressor of glucose by autophagy) family consists of proteins SOGA1, SOGA2, and SOGA3. SOGA1 regulates autophagy by playing a role in the reduction of glucose production in an adiponectin and insulin dependent manner.	95
402806	pfam11367	DUF3168	Protein of unknown function (DUF3168). This family of proteins has no known function but is likely to be a component of bacteriophage.	117
402807	pfam11368	DUF3169	Protein of unknown function (DUF3169). Some members in this family of proteins are annotated as membrane proteins however this cannot be confirmed. Currently there is no known function.	248
402808	pfam11369	DUF3160	Protein of unknown function (DUF3160). This family of proteins has no known function.	606
402809	pfam11371	DUF3172	Protein of unknown function (DUF3172). This family of proteins has no known function.	139
402810	pfam11372	DUF3173	Domain of unknown function (DUF3173). This family of proteins with unknown function appears to be restricted to Firmicutes. These proteins appear to be distantly related to HHH domains and are therefore likely to be DNA-binding. Genomic environment-visualisation confirms the likely function as being DNA-binding, as this short protein lies very closely between an integrase and a replication protein (http://www.microbesonline.org/).	58
402811	pfam11373	DUF3175	Protein of unknown function (DUF3175). This family of proteins with unknown function appears to be restricted to Proteobacteria.	84
402812	pfam11374	DUF3176	Protein of unknown function (DUF3176). This eukaryotic family of proteins has no known function.	107
402813	pfam11375	DUF3177	Protein of unknown function (DUF3177). Some members in this family of proteins are annotated as membrane proteins however this cannot be confirmed. Currently there is no known function.	189
402814	pfam11376	DUF3179	Protein of unknown function (DUF3179). This family of proteins has no known function.	289
402815	pfam11377	DUF3180	Protein of unknown function (DUF3180). Some members in this family of proteins are annotated as membrane proteins however this cannot be confirmed. Currently there is no known function.	137
402816	pfam11378	DUF3181	Protein of unknown function (DUF3181). This family of proteins has no known function.	88
402817	pfam11379	DUF3182	Protein of unknown function (DUF3182). This family of proteins with unknown function appears to be restricted to Proteobacteria.	353
402818	pfam11380	Stealth_CR2	Stealth protein CR2, conserved region 2. Stealth_CR2 is the second of several highly conserved regions on stealth proteins in metazoa and bacteria. There are up to four CR regions on all member proteins. CR2 carries a well-conserved NDD sequence-motif. The domain is found in tandem with CR1, CR3 and CR4 on both potential metazoan hosts and pathogenic eubacterial species that are capsular polysaccharide phosphotransferases. The CR domains appear on eukaryotic proteins such as GNPTAB, N-acetylglucosamine-1-phosphotransferase subunits alpha/beta. Horizontal gene-transfer seems to have occurred between host and bacteria of these sequence-regions in order for the bacteria to evade detection by the host innate immune system.	108
402819	pfam11381	DUF3185	Protein of unknown function (DUF3185). Some members in this bacterial family of proteins are annotated as membrane proteins however this cannot be confirmed. Currently no function is known.	56
402820	pfam11382	MctB	Copper transport outer membrane protein, MctB. Outer membrane channel protein MctB in Mycobacterium tuberculosis is part of a Cu resistance mechanism that ensures low intracellular Cu levels in the bacterium. Human resistance to bacteria, via IFN-gamma-mediated activation of macrophages, may involve the use of reactive Cu(1) in the presence of hydrogen peroxide, since activated Cu is toxic to bacteria. IFN-gamma stimulates the trafficking of the Cu transporter ATP7a to the vesicles that fuse with phagosomes, and these phagosomes are found to have a high Cu content and an increased bactericidal activity against E. coli. Using MctB, Mycobacterium tuberculosis may limit the amount of excess Cu within it while in the host.	298
402821	pfam11383	DUF3187	Protein of unknown function (DUF3187). This family of proteins with unknown function appears to be restricted to Proteobacteria. These proteins are likely to be outer membrane proteins.	320
402822	pfam11384	DUF3188	Protein of unknown function (DUF3188). This bacterial family of proteins has no known function.	50
402823	pfam11385	DUF3189	Protein of unknown function (DUF3189). This family of proteins with unknown function appears to be restricted to Firmicutes.	147
371502	pfam11386	VERL	Vitelline envelope receptor for lysin. VERL, the egg vitelline envelope (VE) receptor for lysin, is a giant unbranched glycoprotein comprising 30% of the vitelline envelope. Lysin binds to VERL and creates a hole as VERL molecules lose cohesion and splay apart. These proteins are important in the mediation of fertilisation.	78
402824	pfam11387	DUF2795	Protein of unknown function (DUF2795). This family of proteins has no known function.	44
402825	pfam11388	DotA	Phagosome trafficking protein DotA. DotA is essential for intracellular growth in Legionella. DotA is thought to play an important role in regulating initial phagosome trafficking decisions either upon or immediately after macrophage uptake.	105
371504	pfam11389	Porin_OmpL1	Leptospira porin protein OmpL1. OmpL1 is a member of the outer membrane (OM) proteins in the mammalian pathogen Leptospira. Specifically, it is a porin.	272
402826	pfam11390	FdsD	NADH-dependent formate dehydrogenase delta subunit FdsD. FdsD is the delta subunit of the enzyme formate dehydrogenase. This subunit may play a role in maintaining the quaternary structure by means of electrostatic interactions with the other subunits. The delta subunit is not involved in the active centre of the enzyme.	61
402827	pfam11391	DUF2798	Protein of unknown function (DUF2798). This family of proteins has no known function.	58
402828	pfam11392	DUF2877	Protein of unknown function (DUF2877). This bacterial family of proteins are putative carboxylase proteins however this cannot be confirmed.	109
402829	pfam11393	T4BSS_DotI_IcmL	Type-IV b secretion system, inner-membrane complex component. IcmL contains two amphipathic beta-sheet regions, required for the pore-forming ability which may be related to the transfer of this protein into a host cell membrane. The icmL gene shows significant similarity to plasmid genes involved in conjugation however IcmL is thought to be required for macrophage killing. It is unknown whether conjugation plays a role in macrophage killing. This is a family of DotI/IcmL proteins of type IVb secretion systems, that reside in the inner-membrane. It carries a single transmembrane helix in the N-terminal conserved region, has an extra-periplasmic domain, and is conserved in all T4BSSs including I-type conjugation systems (TraM). DotI/IcmL (and DotJ) may form an inner membrane complex that associates with the core complex.	180
288282	pfam11394	DUF2875	Protein of unknown function (DUF2875). This family of proteins with unknown function appears to be restricted to Proteobacteria.	462
402830	pfam11395	DUF2873	Protein of unknown function (DUF2873). This viral family of proteins has no known function.	43
402831	pfam11396	PepSY_like	Putative beta-lactamase-inhibitor-like, PepSY-like. This family of bacterial proteins is probably periplasmic. Members are found predominantly in microbes of the human gut and oral cavity. Structurally, one member of this family is found to show similarity to the beta-lactamase-inhibitor PepSY proteins, so the overall function may be inhibitory. There are tandem repeats of the domain on many family members.	86
371506	pfam11397	GlcNAc	Glycosyltransferase (GlcNAc). GlcNAc is an enzyme that carries out the first glycosylation step of hydroxylated Skp1, a ubiquitous eukaryotic protein, in the cytoplasm.	351
402832	pfam11398	DUF2813	Protein of unknown function (DUF2813). This entry contains YjbD from Escherichia coli, which is annotated as a nucleotide triphosphate hydrolase.	372
378660	pfam11399	DUF3192	Protein of unknown function (DUF3192). Some members in this family of proteins are annotated as lipoproteins however this cannot be confirmed.	101
402833	pfam11401	Tetrabrachion	Tetrabrachion. Tetrabrachion forms a parallel right-handed coiled coil structure with hydrophobic interactions and salt bridges forming a thermostable tetrameric structure. It contains large hydrophobic cavities. No function is known for this family of proteins.	49
402834	pfam11402	Antifungal_prot	Antifungal protein. Antifungal protein consists of five antiparallel beta strands which are highly twisted creating a beta barrel stabilized by four internal disulphide bridges. A cationic site adjacent to a hydrophobic stretch on the protein surface may constitute a phospholipid binding site.	50
402835	pfam11403	Yeast_MT	Yeast metallothionein. Metallothioneins are characterized by an abundance of cysteine residues and a lack of generic secondary structure motifs. This protein functions in primary metal storage, transport and detoxification. For the first 40 residues in the protein the polypeptide wraps around the metal by forming two large parallel loops separated by a deep cleft containing the metal cluster.	39
402836	pfam11404	Potassium_chann	Potassium voltage-gated channel. Fast inactivation of voltage-dependent potassium channels occurs by a 'ball-and-chain'-type mechanism. It controls membrane excitability and signal propagation in central neurons. Inactivation is regulated by protein phosphorylation where phosphorylation of serine residues leads to a reduction of the fast inactivation.	29
402837	pfam11405	Inhibitor_I67	Bromelain inhibitor VI. Bromelain inhibitor VI is a double-chain inhibitor consisting of an 11-residue and a 41-residue chain. This protein is the 41-residue heavy chain which is joined to the 11-residue chain by disulphide bonds. The inhibitor acts to inhibit the cysteine proteinase bromelain.	41
371512	pfam11406	Tachystatin_A	Antimicrobial peptide tachystatin A. Tachystatin A contains a cysteine-stabilized triple-stranded beta-sheet and shows features common to membrane-interactive peptides. Tachystatin A is thought to have an antimicrobial activity similar to defensins. Tachystatin A is also a chitin-binding peptide.	44
402838	pfam11407	RestrictionMunI	Type II restriction enzyme MunI. Type II restriction enzyme MunI recognizes the palindromic sequence C/AATTG. It makes contact with the DNA via the major groove.	202
151847	pfam11408	Helicase_Sgs1	Sgs1 RecQ helicase. RecQ helicases unwind DNA in an ATP-dependent manner. Sgs1 has a HRDC (helicase and RNaseD C-terminal) domain which modulates the helicase function via auxiliary contacts to DNA.	79
402839	pfam11409	SARA	Smad anchor for receptor activation (SARA). Smad proteins mediate transforming growth factor-beta (TGF-beta) signaling from the transmembrane serine-threonine receptor kinases to the nucleus. SARA recruits Smad2 to the TGF-beta receptors for phosphorylation.	37
371514	pfam11410	Antifungal_pept	Antifungal peptide. This peptide has six cysteines involved in three disulphide bonds. It contains a global fold which involves a cysteine-knotted three-stranded antiparallel beta-sheet along with a flexible loop and four beta-reverse turns. It also has an amphiphilic character which is the main structural basis of its biological function.	33
402840	pfam11411	DNA_ligase_IV	DNA ligase IV. DNA ligase IV along with Xrcc4 functions in DNA non-homologous end joining. This process is required to mend double-strand breaks. Upon ligase binding to an Xrcc4 dimer, the helical tails unwind leading to a flat interaction surface.	34
402841	pfam11412	DsbC	Disulphide bond corrector protein DsbC. DsbC rearranges incorrect disulphide bonds during oxidative protein folding. It is activated by the N-terminal domain of DsbD, a transmembrane electron transporter. DsbD binds to a DsbC dimer and selectively activates it using electrons from the cytoplasm.	117
402842	pfam11413	HIF-1	Hypoxia-inducible factor-1. HIF-1 is a transcriptional complex and controls cellular and systemic homeostatic responses to oxygen availability. In the presence of oxygen HIF-1 alpha is targeted for proteasomal degradation by pVHL, a ubiquitination complex.	32
402843	pfam11414	Suppressor_APC	Adenomatous polyposis coli tumor suppressor protein. The tumor suppressor protein, APC, has a nuclear export activity as well as many different intracellular functions. The structure consists of three alpha-helices forming two separate antiparallel coiled coils.	82
151854	pfam11415	Toxin_37	Antifungal peptide termicin. Termicin is a cysteine-rich antifungal peptide which exhibits antibacterial activity. A cysteine stabilized alpha beta motif is formed due to an alpha-helical segment and a two-stranded antiparallel beta-sheet.	35
402844	pfam11416	Syntaxin-5_N	Syntaxin-5 N-terminal, Sly1p-binding domain. Syntaxin-5_N is the Sed5 N-terminal and the N-terminus of Syntaxin-5-like proteins. It is the region of Syntaxin that interacts with Sly1p, a positive regulator of intracellular membrane fusion, allowing SM (cytosolic Sec1/munc18-like) proteins to stay associated with the assembling fusion machinery. This allows the SM proteins to participate in late fusion steps.	22
402845	pfam11417	Inhibitor_G39P	Loader and inhibitor of phage G40P. G39P inhibits the initiation of DNA replication by blocking G40P replicative helicase. G39P has a bipartite structure consisting of a folded N-terminal domain and an unfolded C-terminal domain. The C terminal is essential for helicase interaction.	66
402846	pfam11418	Scaffolding_pro	Phi29 scaffolding protein. This protein is also referred to as gp7. The protein contains a DNA-binding function and may have a role in mediating the structural transition from prohead to mature virus and also scaffold release. Gp7 is arranged within the capsid as a series of concentric shells.	100
402847	pfam11419	DUF3194	Protein of unknown function (DUF3194). This family of proteins has no known function however the structure has been determined. The protein consists of two alpha-helices packed on the same side of a central beta-hairpin.	83
371520	pfam11420	Subtilosin_A	Bacteriocin subtilosin A. Subtilosin A is a bacteriocin from Bacillus subtilis. The protein has a cyclized peptide backbone and forms three cross-links between the sulphurs of Cys13, Cys7 and Cys4 and the alpha-positions of Phe22, Thr28 and Phe31.	35
402848	pfam11421	Synthase_beta	ATP synthase F1 beta subunit. The NMR solution structure of the protein in SDS micelles was found to contain two helices, an N-terminal amphipathic alpha-helix and a C-terminal alpha-helix separated by a large unstructured internal domain. The N-terminal alpha-helix is the Tom20 receptor binding site whereas the C-terminal alpha-helix is located upstream of the mitochondrial processing peptidase cleavage site.	47
402849	pfam11422	IBP39	Initiator binding protein 39 kDa. IBP39 recognizes the initiator which is solely responsible for transcription start site selection. IBP39 contains an N-terminal Inr binding domain connected to a C-terminal domain. The C domain structure indicates that it interacts with the T. vaginalis RNAP II large subunit C-terminal domain. Binding of IBP39 to Inr recruits RNAP II and initiates transcription.	179
402850	pfam11423	Repressor_Mnt	Regulatory protein Mnt. Mnt is a repressor which is involved in the genetic switch between lysogenic and lytic growth in bacteriophage P22. The C-terminal domain of the protein consists of a dimer of two antiparallel coiled coils with a right handed twist, which is both stronger and has closer inter-helical separation compared with those found in left-handed coiled coils.	25
402851	pfam11424	DUF3195	Protein of unknown function (DUF3195). This archaeal family of proteins has no known function.	85
371524	pfam11426	Tn7_TnsC_Int	Tn7 transposition regulator TnsC. TnsC is a molecular switch that regulates transposition and interacts with TnsA which is a component of the transposase. The two proteins interact via the residues 504-555 on TnsC. The TnsA/TnsC interaction is very important in Tn7 transposition.	47
192757	pfam11427	HTH_Tnp_Tc3_1	Tc3 transposase. Tc3 is a transposase with a specific DNA-binding domain which contains three alpha-helices, two of which form a helix-turn-helix motif which makes four base-specific contacts with the major groove. The N-terminus makes contacts with the minor groove. There is a base specific recognition between Tc3 and the transposon DNA. The DNA binding domain forms a dimer in which each monomer binds a separate transposon end. This implies that the dimer has a role in synapsis and is necessary for the simultaneous cleavage of both transposon termini.	50
402852	pfam11428	DUF3196	Protein of unknown function (DUF3196). This protein is the product of the gene MPN330 and is thought to be involved in a cellular function that has yet to be characterized. The protein has 11 helices and a novel fold. No function is currently known for this protein.	264
402853	pfam11429	Colicin_D	Colicin D. Colicin D is a tRNase which kills sensitive E.coli cells via a specific tRNA cleavage. It targets the four isoaccepting tRNAs for Arg and cleaves the phosphodiester bond between positions 38 and 39 at the 3' junction of the anticodon stem and the loop.	81
402854	pfam11430	EGL-1	Programmed cell death activator EGL-1. Initiation of programmed cell death in C.elegans occurs by the binding of EGL-1 to CED-9 which disrupts a complex involving CED-4/CED-9 and allows CED-4 to activate CED-3, a caspase. It is the C terminal domain of EGL-1 which is involved in the formation of the complex with CED-9. The formation of the complex induces structural rearrangements in CED-9 and EGL-1 adopts an extended alpha-helical conformation.	20
402855	pfam11431	Transport_MerF	Membrane transport protein MerF. The mercury transport membrane protein, MerF has a core helix-loop-helix domain. It has two vicinal pairs of cysteine residues which are involved in the transport of Hg(II) across the membrane and are exposed to the cytoplasm.	45
371529	pfam11432	DUF3197	Protein of unknown function (DUF3197). This bacterial family of proteins has no known function.	113
371530	pfam11433	DUF3198	Protein of unknown function (DUF3198). Some members in this family of proteins are annotated as membrane proteins however this cannot be confirmed. Currently, this archaeal family has no known function.	49
371531	pfam11434	CHIPS	Chemotaxis-inhibiting protein CHIPS. The chemotaxis inhibitory protein, CHIPS, is an excreted virulence factor which acts by binding to C5a and formylated peptide receptor (FPR), blocking phagocyte responses. A fragment of CHIPS containing residues 31-121 comprises an alpha helix packed onto a four stranded anti-parallel beta-sheet. Most of the conserved residues of CHIPS are present in the alpha-helix.	91
402856	pfam11435	She2p	RNA binding protein She2p. She2p is an RNA binding protein which binds to RNA via a helical hairpin. The protein is required for the actin dependent transport of ASH1 mRNA in yeast, a form of mRNP translocation. ASH1 mRNP requires recognition of zip code elements by the RNA binding protein She2p. She2p contains a globular domain consisting of a bundle of five alpha-helices.	199
402857	pfam11436	DUF3199	Protein of unknown function (DUF3199). Some members in this family of proteins with unknown function are annotated as YqbG; however, this cannot be confirmed. Currently these proteins have no known function.	123
371534	pfam11437	Vanabin-2	Vanadium-binding protein 2. The Vanadium binding protein, Vanabin2, contains four alpha-helices connected by nine disulphide bonds. Vanadium accumulates in Ascidians however the biological reason remains unclear.	93
288317	pfam11438	N36	36-mer N-terminal peptide of the N protein (N36). The arginine-rich motif of the N protein is involved in transcriptional antitermination of phage lambda. N36 forms a complex with boxB RNA by binding tightly to the major groove of the boxB hairpin via hydrophobic and electrostatic interactions forming a bent alpha helix.	35
371535	pfam11439	T3SchapCesA	Type III secretion system filament chaperone CesA. This family represents a chaperone protein for the type III secretion system - TTSS - translocon protein EspA, to prevent the latter's self-polymerization. The TTSS is a highly specialized bacterial protein secretory pathway, similar in many ways to the flagellar system, that is essential for the pathogenesis of many Gram-negative bacteria. The twenty or so proteins making up the TTSS apparatus, referred to as the needle complex, allow the injection of virulence proteins (known as effectors) directly into the cytoplasm of the eukaryotic host cells they infect; however, the injection process itself is mediated by a subset of extracellular proteins that are secreted by the needle complex to the bacterial surface and assembled into the type III translocon - EspA, EspB and EspD. EspA polymerizes into an extracellular filament, and, as with other fibrous proteins, is apt to undergo massive polymerization when overexpressed. CesA is the secretion chaperone protein that binds to EspA. CesA is dimeric and helical, and it traps EspA in a monomeric state and inhibits its polymerization.	95
288318	pfam11440	AGT	DNA alpha-glucosyltransferase. The T4 bacteriophage of E.coli protects its DNA via two glycosyltransferases which glucosylate 5-hydroxymethyl cytosines (5-HMC) using UDP-glucose. These two proteins are the retaining alpha-glucosyltransferase (AGT) and the inverting beta-glucosyltransferase (BGT). The proteins in this family are AGT. AGT adopts the GT-B fold and binds both the sugar donor and acceptor to the C-terminal domain. There is evidence for a role of AGT in the base-flipping mechanism and for its specific recognition of the acceptor base.	355
288319	pfam11441	MxiM	Pilot protein MxiM. MxiM, a Shigella pilot protein, is essential for the assembly and membrane association of the Shigella secretin MxiD. MxiM contains an orthologous secretin component and has a specific binding domain for the acyl chains of bacterial lipids. The C terminal domain of MxiD hinders lipid binding to MxiM.	115
288320	pfam11442	DUF2826	Protein of unknown function (DUF2826). This is a family of uncharacterized proteins that is highly conserved in Trypanosoma cruzi.	158
402858	pfam11443	DUF2828	Domain of unknown function (DUF2828). This is an uncharacterized domain found in eukaryotes and viruses.	612
402859	pfam11444	DUF2895	Protein of unknown function (DUF2895). This is a bacterial family of uncharacterized proteins.	189
402860	pfam11445	DUF2894	Protein of unknown function (DUF2894). This is a bacterial family of uncharacterized proteins.	181
402861	pfam11446	DUF2897	Protein of unknown function (DUF2897). This is a bacterial family of uncharacterized proteins.	50
402862	pfam11447	DUF3201	Protein of unknown function (DUF3201). This archaeal family of proteins has no known function.	153
402863	pfam11448	DUF3005	Protein of unknown function (DUF3005). This is a bacterial family of uncharacterized proteins.	109
402864	pfam11449	ArsP_2	Putative, 10TM heavy-metal exporter. This is a family of putative manganese transporters with 9-11 TMs. Members carry two well-conserved characteristic sequence-motifs of 'PGCG'.	363
402865	pfam11450	DUF3008	Protein of unknown function (DUF3008). This is a bacterial family of uncharacterized proteins.	57
402866	pfam11452	DUF3000	Protein of unknown function (DUF3000). This is a bacterial family of uncharacterized proteins.	173
402867	pfam11453	DUF2950	Protein of unknown function (DUF2950). This is a bacterial family of uncharacterized proteins.	273
402868	pfam11454	DUF3016	Protein of unknown function (DUF3016). This is a bacterial family of uncharacterized proteins.	139
402869	pfam11455	DUF3018	Protein of unknown function (DUF3018). This is a bacterial family of uncharacterized proteins.	64
378667	pfam11456	DUF3019	Protein of unknown function (DUF3019). This is a bacterial family of uncharacterized proteins.	102
402870	pfam11457	DUF3021	Protein of unknown function (DUF3021). This is a bacterial family of uncharacterized proteins.	130
402871	pfam11458	Mistic	Membrane-integrating protein Mistic. Mistic is an integral membrane protein that folds autonomously into the membrane. The protein forms a helical bundle with a polar lipid-facing surface. Mistic can be used for high-level production of other membrane proteins in their native conformations.	74
402872	pfam11459	AbiEi_3	Transcriptional regulator, AbiEi antitoxin, Type IV TA system. AbiEi_3 is the cognate antitoxin of the type IV toxin-antitoxin 'innate immunity' bacterial abortive infection (Abi) system that protects bacteria from the spread of a phage infection. The Abi system is activated upon infection with phage to abort the cell thus preventing the spread of phage through viral replication. There are some 20 or more Abis, and they are predominantly plasmid-encoded lactococcal systems. TA, toxin-antitoxin, systems on plasmids function by killing cells that lose the plasmid upon division. AbiE phage resistance systems function as novel Type IV TAs and are widespread in bacteria and archaea. The cognate antitoxin is pfam13338.	159
402873	pfam11460	DUF3007	Protein of unknown function (DUF3007). This is a family of uncharacterized proteins found in bacteria and eukaryotes.	96
402874	pfam11461	RILP	Rab interacting lysosomal protein. RILP contains a domain which contains two coiled-coil regions and is found mainly in the cytosol. RILP is recruited onto late endosomal and lysosomal membranes by Rab7 and acts as a downstream effector of Rab7. This recruitment process is important for phagosome maturation and fusion with late endosomes and lysosomes.	58
314397	pfam11462	DUF3203	Protein of unknown function (DUF3203). This family of proteins with unknown function appears to be restricted to Gammaproteobacteria.	67
314398	pfam11463	R-HINP1I	R.HinP1I restriction endonuclease. HinP1I is a type II restriction endonuclease, recognising and cleaving a palindromic tetranucleotide sequence (G/CGC) resulting in 2 nt 5' overhanging ends. HinP1I has a conserved catalytic core domain containing an active site motif SDC18QXK and a DNA-binding domain.	205
402875	pfam11464	Rbsn	Rabenosyn Rab binding domain. Rabenosyn-5 (Rbsn) is a multivalent effector that interacts with the Rab family. Rbsn contains distinct Rab4 and Rab5 binding sites within residues 264-500 and 627-784 respectively. Rab proteins are GTPases involved in the regulation of all stages of membrane trafficking.	39
288342	pfam11465	Receptor_2B4	Natural killer cell receptor 2B4. 2B4 is a transmembrane receptor which is expressed primarily on natural killer cells. It plays a role in activating NK-mediated cytotoxicity through its interaction with CD48 on target cells in a subset of CD8 T cells. The structure of 2B4 consists of an immunoglobulin variable domain fold and contains two beta-sheets. One of the beta-sheets, the six-stranded sheet, contains structural features that may have a role in ligand recognition and receptor function.	108
402876	pfam11466	Doppel	Prion-like protein Doppel. Dpl is a homolog related to the prion protein (PrP). Dpl is toxic to neurons and is expressed in the brains of mice that do not express PrP. In DHPC and SDS micelles, Dpl shows about 40% alpha-helical structure; however, in aqueous solution it consists of a random coil. The alpha-helical segment can also adopt a transmembrane localization in a membrane. The unprocessed Dpl protein is thought to possess a possible channel formation mechanism which may be related to toxicity through direct interaction with cell membranes and damage to the cell membrane.	30
402877	pfam11467	LEDGF	Lens epithelium-derived growth factor (LEDGF). LEDGF is a chromatin-associated protein that protects cells from stress-induced apoptosis. It is the binding partner of HIV-1 integrase in human cells. The integrase binding domain (IBD) of LEDGF is a compact right-handed bundle composed of five alpha-helices. The residues essential for the interaction with the integrase are present in the inter-helical loop regions of the bundle structure.	112
371546	pfam11468	PTase_Orf2	Aromatic prenyltransferase Orf2. In vivo Orf2 attaches a geranyl group to a 1,3,6,8-tetrahydroxynaphthalene-derived polyketide during naphterpin biosynthesis. In vitro, Orf2 catalyzes carbon-carbon based and carbon-oxygen based prenylation of hydroxyl-containing aromatic acceptors of synthetic, microbial and plant origin.	287
402878	pfam11469	Ribonucleas_3_2	Ribonuclease III. This is a family of archaeal ribonuclease_III proteins.	119
402879	pfam11470	TUG-UBL1	TUG ubiquitin-like domain. TUG is a GLUT4 regulating protein and functions to retain membrane vesicles containing GLUT4 intracellularly. TUG releases the GLUT4 containing vesicles to the cellular exocytic machinery in response to insulin stimulation which allows translocation to the plasma membrane. TUG has an N-terminal ubiquitin-like domain (UBL1) which in similar proteins appears to participate in protein-protein interactions. The region does have an area of negative electrostatic potential and increased backbone motility which leads to suggestions of a potential protein-protein interaction site. This domain is also found at the N-terminus of yeast UBX4.	65
402880	pfam11471	Sugarporin_N	Maltoporin periplasmic N-terminal extension. This domain would appear to be the periplasmic, N-terminal extension of the outer membrane maltoporins, pfam02264, LamB.	31
402881	pfam11472	DUF3206	Protein of unknown function (DUF3206). This bacterial family of proteins has no known function.	128
314404	pfam11473	B2	RNA binding protein B2. B2 is expressed by the insect Flock House virus (FHV) as a counter-defense mechanism against antiviral RNA silencing during infection. In vitro, B2 binds to dsRNA as a dimer and inhibits the cleavage of it by Dicer. B2 blocks cleavage of the FHV genome by Dicer and also the incorporation of FHV small interfering RNAs into the RNA-induced silencing complex.	72
151913	pfam11474	N-Term_TEN	Telomerase reverse transcriptase TEN domain. This is the N terminal domain of the protein telomerase reverse transcriptase called TEN. The TEN domain is able to bind both RNA and telomeric DNA and contributes towards telomerase catalysis. The TEN domain has a structure that consists of a core beta sheet surrounded by seven alpha helices and a short beta hairpin.	188
288349	pfam11475	VP_N-CPKC	Virion protein N terminal domain. This is the N terminal domain of a family of virion proteins which contains a zinc finger domain. Currently no function is known.	32
402882	pfam11476	TgMIC1	Toxoplasma gondii micronemal protein 1 TgMIC1. TgMIC1 is released as part of a complex by Toxoplasma gondii prior to invasion. The complex which consists of TgMIC4-MIC1-MIC6 participates in host cell attachment and penetration and is critical in invasion. This is the C terminal domain of TgMIC1 which has a Galectin-like fold which interacts with and stabilizes TgMIC6 providing a mechanism for an exit from the early secretory compartments and trafficking of the complex to micronemes.	137
402883	pfam11477	PM0188	Sialyltransferase PM0188. PM0188 is a sialyltransferase from P. multocida. It transfers sialic acid from cytidine 5'-monophosphoneuraminic acid to an acceptor sugar. It has important catalytic residues such as Asp141, His311, Glu338, Ser355 and Ser356.	385
314405	pfam11478	Tachystatin_B	Antimicrobial chitin binding protein tachystatin B. Tachystatin B is an antimicrobial chitin binding peptide and consists of two isopeptides, B1 and B2. Both structures contain a short antiparallel beta sheet with an inhibitory cysteine knot motif. Tyr(14) and Arg(17) are thought to be the essential residues for chitin binding.	42
371552	pfam11479	Suppressor_P21	RNA silencing suppressor P21. P21 is produced by Beet yellows virus to suppress the antiviral silencing response mounted by the host. P21 acts by binding directly to siRNA which is a mediator in the process. P21 has an octameric ring structure with a large central cavity.	80
288352	pfam11480	ImmE5	Colicin-E5 Imm protein. Imms bind specifically to cognate colicins in order to protect their host cells. Imm-E5 is a specific inhibitor protein of colicin E5. It binds to E5 C-terminal ribonuclease domain (CRD) to prevent cell death. The binding mode of E5-CRD and Imm-E5 mimics that of mRNA and tRNA suggesting an evolutionary pathway from the RNA-RNA interaction through the RNA-protein interaction of tRNA/E5-CRD.	82
402884	pfam11482	DUF3208	Protein of unknown function (DUF3208). This bacterial family of proteins has no known function.	108
402885	pfam11483	DUF3209	Protein of unknown function (DUF3209). This family of proteins has no known function.	123
402886	pfam11485	DUF3211	Protein of unknown function (DUF3211). This archaeal family of proteins has no known function.	136
288356	pfam11486	DUF3212	Protein of unknown function (DUF3212). Members in this family of proteins are annotated as YfmB however currently no function for this protein is known.	119
402887	pfam11487	RestrictionSfiI	Type II restriction enzyme SfiI. SfiI is a restriction enzyme that can cleave two DNA sites simultaneously to leave 3-base 3' overhangs. It acts as a homo-tetramer and recognizes a specific eight base-pair palindromic DNA sequence. After binding two copies of its recognition sequence, SfiI becomes activated leading to cleavage of all four DNA strands. The structure of SfiI consists of a central twisted beta-sheet surrounded by alpha-helices.	216
402888	pfam11488	Lge1	Transcriptional regulatory protein LGE1. This family of proteins is conserved from fungi to human. In yeasts it is involved in the ubiquitination of histones H2A and H2B. This ubiquitination step is a vital one in the regulation of the transcriptional activity of RNA polymerase II. In S. cerevisiae, Rad6 and Bre1 are present in a complex, also containing Lge1, that is required for H2B ubiquitination. Bre1 is the H2B ubiquitin ligase that interacts with acidic activators, such as Gal4, and recruits Rad6 and its binding partner Lge1 to target promoters. In S. pombe the equivalent protein to Lge1 appears to be Shf1. In human, periphilin acts as a transcriptional co-repressor and regulates cell cycle progression.	71
371558	pfam11489	Aim21	Altered inheritance of mitochondria protein 21. This is a family of proteins conserved in yeasts. Saccharomyces cerevisiae Aim21 may be involved in mitochondrial migration along actin filaments. It may also interact with ribosomes.	677
402889	pfam11490	DNA_pol3_a_NII	DNA polymerase III polC-type N-terminus II. This is the second N-terminal domain, NII domain, of the DNA polymerase III polC subunit A that is found only in Firmicutes. DNA polymerase polC-type III enzyme functions as the 'replicase' in low G + C Gram-positive bacteria. Purine asymmetry is a characteristic of organisms with a heterodimeric DNA polymerase III alpha-subunit constituted by polC which probably plays a direct role in the maintenance of strand-biased gene distribution; since, among prokaryotic genomes, the distribution of genes on the leading and lagging strands of the replication fork is known to be biased. It has been predicted that the N-terminus of polC folds into two globular domains, NI and NII. A predicted hydrophobic surface patch suggests this domain may be involved in protein binding. This domain is associated with DNA_pol3_alpha pfam07733 and DNA_pol3_a_NI pfam14480.	117
402890	pfam11491	DUF3213	Protein of unknown function (DUF3213). The backbone structure of this family of proteins has been determined however the function remains unknown. The protein has an alpha and beta structure with a ferredoxin-like fold.	85
288362	pfam11492	Dicistro_VP4	Cricket paralysis virus, VP4. This is a family of minor capsid proteins, known as VP4, from the dicistroviridae. The dicistroviridae is a group of small, RNA-containing viruses that are closely structurally related to the picornaviridae. VP4 is a short, extended polypeptide chain found within the viral capsid, at the interface between the external protein shell and packaged RNA genome.	53
402891	pfam11493	TSP9	Thylakoid soluble phosphoprotein TSP9. The plant-specific protein, TSP9 is phosphorylated and released in response to changing light conditions from the photosynthetic membrane. The protein resembles the characteristics of transcription/translation regulatory factors. The structure of the protein is predicted to consist of a random coil.	80
402892	pfam11494	Ta0938	Ta0938. Ta0938 is a protein of unknown function however the structure has been determined. The protein has a novel fold and a putative Zn-binding motif. The structure has two different parts, one region contains a beta sheet flanked by two alpha helices and the other contains a bundle of loops which contain all cysteines in the protein.	92
402893	pfam11495	Regulator_TrmB	Archaeal transcriptional regulator TrmB. TrmB is an alpha-glucoside sensing transcriptional regulator. The protein is the transcriptional repressor for the gene cluster encoding the trehalose/maltose ABC transporter in T. litoralis and P. furiosus. TrmB has lost its DNA binding domain but retained its sugar recognition site. A nonreducing glucosyl residue is shared by all substrates bound to TrmB, which suggests that it is a common recognition motif.	233
402894	pfam11496	HDA2-3	Class II histone deacetylase complex subunits 2 and 3. This family of class II histone deacetylase complex subunits HDA2 and HDA3 is found in fungi. The member from S. pombe is referred to as Ccq1 (coiled-coil quantitatively-enriched protein 1). These proteins associate with HDA1 to generate the activity of the HDA1 histone deacetylase complex. HDA1 interacts with itself and with the HDA2-HDA3 subcomplex to form a probable tetramer and these interactions are necessary for catalytic activity. The HDA1 histone deacetylase complex is responsible for the deacetylation of lysine residues on the N-terminal part of the core histones (H2A, H2B, H3 and H4). Histone deacetylation gives a tag for epigenetic repression and plays an important role in transcriptional regulation, cell cycle progression and developmental events. HDA2 and HDA3 have a conserved coiled-coil domain towards their C-terminus.	281
402895	pfam11497	NADH_Oxid_Nqo15	NADH-quinone oxidoreductase chain 15. This protein, Nqo15, is a part of respiratory complex 1 which is a complex that plays a central role in cellular energy production in both bacteria and mitochondria. Nqo15 has a similar fold to Frataxin, the mitochondrial iron chaperone. This protein may have a role in iron-sulphur cluster regeneration in the complex. This domain represents more than half the molecular mass of the entire complex.	123
151935	pfam11498	Activator_LAG-3	Transcriptional activator LAG-3. The C.elegans Notch pathway, involved in the control of growth, differentiation and patterning in animal development, relies on either of the receptors GLP-1 or LIN-12. Both these receptors promote signalling by the recruitment of LAG-3 to target promoters, where it then acts as a transcriptional activator. LAG-3 works as a ternary complex together with the DNA binding protein, LAG-1.	476
402896	pfam11500	Cut12	Spindle pole body formation-associated protein. This is the central coiled-coil region of cut12 also found in other fungi, barring S. cerevisiae. The full protein has two predicted coiled-coil regions, and one consensus phosphorylation site for p34cdc2 and two for MAP kinase. During fission yeast mitosis, the duplicated spindle pole bodies (SPBs) nucleate microtubule arrays that interdigitate to form the mitotic spindle. Cut12 is localized to the SPB throughout the cell cycle, predominantly around the inner face of the interphase SPB, adjacent to the nucleus. Cut12 associates with Fin1 and is important in this context for the activity of Plo1.	149
402897	pfam11501	Nsp1	Non structural protein Nsp1. Nsp1 is the N-terminal cleavage product from the viral replicase that mediates RNA replication and processing. The specific function of the protein is unknown however the structure has been determined. The protein has a novel alpha/beta fold formed by a 6 stranded beta barrel with an alpha helix covering one end of the barrel and another helix alongside the barrel. Nsp1 could be involved in the degradation of mRNA.	138
371566	pfam11502	BCL9	B-cell lymphoma 9 protein. The Wnt pathway plays a role in embryonic development, stem cell growth and tumorigenesis. BCL9 associates with beta-catenin and Tcf in the nucleus when the Wnt pathway is stimulated leading to the transactivation of Wnt target genes.	39
371567	pfam11503	DUF3215	Protein of unknown function (DUF3215). This family of proteins with unknown function appears to be restricted to Saccharomycetaceae.	72
314418	pfam11504	Colicin_Ia	Colicin Ia. Colicins are toxic molecules secreted to kill other bacteria in times of stress. Colicin Ia kills susceptible E. coli cells by binding to the colicin I receptor leading to the formation of a voltage-dependent ion channel. The protein can be divided into three domains, a translocation domain, a receptor binding domain and a channel forming domain.	72
402898	pfam11505	DUF3216	Protein of unknown function (DUF3216). This family of archaeal proteins with unknown function appears to be restricted to Thermococcaceae.	91
402899	pfam11506	DUF3217	Protein of unknown function (DUF3217). This family of proteins with unknown function appears to be restricted to Mycoplasma. Some members in this family of proteins are annotated as MG376 however this cannot be confirmed.	104
402900	pfam11507	Transcript_VP30	Ebola virus-specific transcription factor VP30. VP30 is a nucleocapsid-associated Ebola virus-specific transcription factor. It acts by stabilizing nascent mRNA in Ebola virus replication. The C terminal domain of VP30 folds into a dimeric helical assembly. VP30 assembles into hexamers in solution by an N-terminal oligomerization domain which activates the transcription function of the protein. The oligomerization is mediated by hydrophobic amino acids at 94-112.	131
402901	pfam11508	DUF3218	Protein of unknown function (DUF3218). This family of proteins with unknown function appears to be restricted to Pseudomonas.	213
288377	pfam11510	FA_FANCE	Fanconi Anaemia group E protein FANCE. Fanconi Anaemia (FA) is a cancer predisposition disorder. In response to DNA damage, the FA core complex monoubiquitinates the downstream FANCD2 protein. The protein FANCE has an important role in DNA repair as it is the FANCD2-binding protein in the FA core complex so it represents the link between the FA core complex and FANCD2. The sequence shown is the C terminal domain of the protein which consists predominantly of helices and does not contain any beta-strand. The fold of the polypeptide is a continuous right-handed solenoidal pattern from the N terminal to the C terminal end.	262
314420	pfam11511	RhodobacterPufX	Intrinsic membrane protein PufX. PufX organizes RC-LH1, the photosynthesis reaction centre-light harvesting complex 1 core complex of Rhodobacter sphaeroides. It also facilitates the exchange of quinol for quinone between the reaction centre and cytochrome bc(1) complexes. In organic solvent, PufX contains two hydrophobic helices which are flanked by unstructured regions and connected by a helical bend.	67
371572	pfam11512	Atu4866	Agrobacterium tumefaciens protein Atu4866. Atu4866 is a protein with unknown function from Agrobacterium tumefaciens however the structure has been determined. Atu4866 adopts a streptavidin-like fold and has a beta-barrel/sandwich which is formed by eight antiparallel beta-strands. Atu4866 has a potential ligand-binding site where it has a stretch of conserved residues on the surface.	75
314422	pfam11513	TA0956	Thermoplasma acidophilum protein TA0956. TA0956 is a protein from Thermoplasma acidophilum which currently has no known function however the structure has been determined. The protein has a two-layered alpha/beta-sandwich topology and is a putative Elongation factor 1-alpha binding motif.	110
402902	pfam11514	DUF3219	Protein of unknown function (DUF3219). This family of proteins with unknown function appears to be restricted to Bacillaceae. Some members in this family of proteins are annotated as YkvR however this cannot be confirmed.	94
402903	pfam11515	Cul7	Mouse development and cellular proliferation protein Cullin-7. The Cullin Ring Ligase family member, Cul7, is required for normal mouse development and cellular proliferation. Cul7 has a CPH domain which is a p53 interaction domain. The CPH domain interaction surface of P53 is present in the tetramerisation domain.	75
151953	pfam11516	DUF3220	Protein of unknown function (DUF3220). This family of proteins with unknown function appears to be restricted to Bordetella.	106
402904	pfam11517	Nab2	Nuclear abundant poly(A) RNA-binding protein 2 (Nab2). Nab2 is a yeast heterogeneous nuclear ribonucleoprotein that modulates poly(A) tail length and mRNA. This is the N terminal domain of the protein which mediates interactions with the C-terminal globular domain, Myosin-like protein 1 and the mRNA export factor, Gfd1. The N-terminal domain of Nab2 shows a structure of a helical fold. The N terminal domain of Nab2 is thought to mediate protein-protein interactions that facilitate the nuclear export of mRNA. An essential hydrophobic Phe73 patch on the N terminal domain is thought to be an important component of the interface between Nab2 and Mlp1.	101
402905	pfam11518	DUF3221	Protein of unknown function (DUF3221). This family of proteins with unknown function appears to be restricted to Bacillus. Some members in this family of proteins are annotated as YobA however this cannot be confirmed. YobA is a protein with unknown function.	82
402906	pfam11519	DUF3222	Protein of unknown function (DUF3222). This family of proteins with unknown function appears to be restricted to Rhodopseudomonas.	75
402907	pfam11520	Cren7	Chromatin protein Cren7. Cren7 is a chromatin protein found in Crenarchaeota and has a higher affinity for double-stranded DNA than for single-stranded DNA. The protein constrains negative DNA supercoils and is associated with genomic DNA in vivo. Cren7 interacts with duplex DNA through a beta-sheet and a long flexible loop. The function has not been completely determined but it is thought that the protein may have a role similar to that of archaeal proteins in Euryarchaea.	55
402908	pfam11521	TFIIE-A_C	C-terminal general transcription factor TFIIE alpha. TFIIE is composed of two subunits, alpha and beta. This family represents the C terminal domain of the alpha subunit, which is the largest subunit and contains several functional domains which are important for basal transcription and cell growth. The C terminal end of the protein binds directly to the amino-terminal PH domain of p62/Tfb1 (of IIH) which is involved in the recruitment of the general transcription factor IIH to the transcription preinitiation complex. P53 competes for the same binding site as TFIIE alpha which shows their structural similarity. Like p53, TFIIE alpha 336-439 can activate transcription in vivo.	83
402909	pfam11522	Pik1	Yeast phosphatidylinositol-4-OH kinase Pik1. Pik1 is a regulator of membrane traffic and participates in the mating-pheromone signal-transduction cascade. The protein is localized to the nucleus and cytoplasm in the Golgi. Pik1 is thought to have an actin-independent role in membrane transport.	50
402910	pfam11523	DUF3223	Protein of unknown function (DUF3223). This family of proteins has no known function.	75
314430	pfam11524	SeleniumBinding	Selenium binding protein. Selenium is an important nutrient that needs to be regulated since lack of the nutrient leads to cell abnormalities and high concentrations are toxic. SeBP regulates the level of free selenium in the cell by sequestering the nutrient during transport. SeBP acts as a pentamer and delivers the selenium to the selenophosphate synthetase enzyme. Each subunit is composed of an alpha helix on top of a four stranded twisted beta sheet, stabilized by hydrogen bonds.	82
371581	pfam11525	CopK	Copper resistance protein K. CopK is a periplasmic dimeric protein which is strongly up-regulated in the presence of copper, leading to a high periplasmic accumulation. CopK has two different binding sites for Cu(I), each with a different affinity for the metal. Binding of the first Cu(I) ion induces a conformational change of CopK which involves dissociation of the dimeric apo-protein. Binding of a second Cu(I) further increases the plasticity of the protein. CopK has features that are common with functionally related proteins such as a structure consisting of an all-beta fold and a methionine-rich Cu(I) binding site.	70
402911	pfam11526	CFIA_Pcf11	Subunit of cleavage factor IA Pcf11. Pcf11 is a subunit of an essential polyadenylation factor in Saccharomyces cerevisiae, CFIA. Pcf11 binds to Clp1, another subunit of CFIA whose interaction is responsible for maintaining a tight coupling between the Clp1 nucleotide binding subunit and the other components of the polyadenylation machinery.	83
402912	pfam11527	ARL2_Bind_BART	The ARF-like 2 binding protein BART. BART binds specifically to ARL2.GTP with a high affinity however it does not bind to ARL2.GDP. It is thought that this specific interaction is due to BART being the first identified ARL2-specific effector. The function is not completely characterized. BART is predominantly cytosolic but can also be found to be associated with mitochondria. BART is also involved in binding to the adenine nucleotide transporter ANT1.	111
402913	pfam11528	DUF3224	Protein of unknown function (DUF3224). This bacterial family of proteins has no known function.	124
151966	pfam11529	AvrL567-A	Melampsora lini avirulence protein AvrL567-A. AvrL567-A is a protein from the fungal pathogen flax rust (Melampsora lini) which induces plant disease resistance in flax plants. The protein has a novel fold.	127
288394	pfam11530	Pilin_PilX	Minor type IV pilin, PilX. PilX is a protein from Neisseria meningitidis which is crucial for the formation of bacterial aggregates and adhesion to human cells. The structure of PilX is similar to all pilins as it has the common alpha/beta roll fold. PilX subunits have surface-exposed motifs which are thought to stabilize bacterial aggregates against pilus retraction. It also illustrates how a minor pilus component can modulate the virulence properties of pili which have a simple composition and structure.	127
402914	pfam11531	CARM1	Coactivator-associated arginine methyltransferase 1 N terminal. CARM1 is an arginine methyltransferase which methylates a variety of different proteins and plays a role in gene expression. This is the N terminal domain of the protein which has a PH domain, normally present to regulate protein-protein interactions. A molecular switch is also present on the N terminal domain.	105
402915	pfam11532	HnRNP_M	Heterogeneous nuclear ribonucleoprotein M. HnRNP M is a splicing regulatory factor that binds to the auxiliary RNA cis-element ISE/ISS-3 which promotes splicing of exon IIIb and silencing of exon IIIc in the fibroblast growth factor receptor 2 (FGFR2). By binding to ISE/ISS-3, HnRNP M plays a role in the regulation of alternative splicing in FGFR2 as it induces exon skipping and promotes exon inclusion.	30
402916	pfam11533	DUF3225	Protein of unknown function (DUF3225). This bacterial family of proteins has no known function.	126
402917	pfam11534	HTHP	Hexameric tyrosine-coordinated heme protein (HTHP). HTHP is from the marine bacterium Silicibacter pomeroyi and has peroxidase and catalase activity. HTHP consists of six monomers which each binds a solvent accessible heme group and is stabilized by the interaction of three neighboring monomers. The heme iron is penta-coordinated with a tyrosine residue as proximal ligand.	72
402918	pfam11535	Calci_bind_CcbP	Calcium binding. CcbP is a Ca(2+) binding protein which, in Anabaena, is thought to bind Ca(2+) by protein surface charge. When bound to Ca(2+), the protein becomes more compact and the level of free calcium decreases. The free Ca(2+) concentration which is regulated by CcbP is critical for the differentiation process. Calcium signalling is widespread in bacterial species, and prokaryotic cells like eukaryotes are equipped with all the elements to maintain Ca2+ homeostasis.	105
402919	pfam11536	DUF3226	Protein of unknown function (DUF3226). This archaeal family of proteins has no known function.	237
371589	pfam11537	DUF3227	Protein of unknown function (DUF3227). This archaeal family of proteins has no known function.	93
402920	pfam11538	Snurportin1	Snurportin1. Snurportin1 is a novel nuclear import receptor which contains an N-terminal importin beta binding domain which is essential for its function as an snRNP-specific nuclear import receptor. Snurportin1 interacts with m3G-cap where it enhances the m3G-cap dependent nuclear import of U snRNPs in Xenopus laevis oocytes and digitonin-permeabilized HeLa cells.	40
402921	pfam11539	DUF3228	Protein of unknown function (DUF3228). This family of proteins has no known function.	192
402922	pfam11540	Dynein_IC2	Cytoplasmic dynein 1 intermediate chain 2. Intermediate chain IC 2 forms part of the complex cytoplasmic dynein 1 along with a heavy chain (HC), two light intermediate chains (LICs) and three light chains (LCs). The complex is responsible for hydrolysing ATP to generate force toward the minus end of microtubules. IC binds to the HC via the N terminal binding domain on the HC and ICs contain binding sites for the LCs. The ICs are responsible for binding to kinetochores and the Golgi apparatus through an interaction with the p150Glued subunit of dynactin which is another complex.	29
402923	pfam11542	Mdv1	Mitochondrial division protein 1. Mdv1 is a component of the mitochondrial fission machinery in Saccharomyces cerevisiae. The protein is also involved in peroxisome proliferation. Mdv1, along with Fis1, is also involved in controlling division dependent on Dnm1, a GTPase that mediates mitochondrial division. In this role, Mdv1 is the linker between Fis1 and Dnm1. Mdv1 plays a key role in the regulation of Dnm1 self-assembly.	49
402924	pfam11543	UN_NPL4	Nuclear pore localization protein NPL4. Npl4 is part of the heterodimer UN along with Ufd1 which is involved in the recruitment of p97, an AAA ATPase, for tasks involving the ubiquitin pathway. Npl4 has a ubiquitin-like domain which has within its structure a beta-grasp fold with a helical insert.	74
402925	pfam11544	Spc42p	Spindle pole body component Spc42p. Spc42p is a 42-kD component of the S. cerevisiae spindle pole body that localizes to the electron dense central region of the SPB. Spc42p is a phosphoprotein which forms a polymeric layer at the periphery of the SPB central plaque. This functions during SPB duplication and also facilitates the attachment of the SPB to the nuclear membrane.	72
288407	pfam11545	HemeBinding_Shp	Cell surface heme-binding protein Shp. Shp is part of a complex which functions in heme uptake in Streptococcus pyogenes. During this process, Shp transfers its heme to HtsA which is a component of an ABC transporter. The heme binding region of Shp contains an immunoglobulin-like beta-sandwich fold and has a unique heme-iron coordination with the axial ligands being two methionine residues from the same Shp molecule. Surrounding the heme pocket, there is a negative surface which may serve as a docking interface for heme transfer.	151
371594	pfam11546	CompInhib_SCIN	Staphylococcal complement inhibitor SCIN. SCIN is released by Staphylococcus aureus to counteract the host immune defense. The protein binds to and inhibits C3 convertases on the bacterial surface, reducing phagocytosis and blocking downstream effector functions by C3b deposition on its surface. An 18 residue stretch 31-48 is crucial for SCIN activity.	114
402926	pfam11547	E3_UbLigase_EDD	E3 ubiquitin ligase EDD. EDD, the ER ubiquitin ligase from the HECT ligases, contains an N-terminal ubiquitin-associated domain which binds ubiquitin. Ubiquitin is recognized by helices alpha-1 and -3 in the UBA domain. EDD is involved in DNA damage repair pathways and binds to mono-ubiquitinated proteins.	52
402927	pfam11548	Receptor_IA-2	Protein-tyrosine phosphatase receptor IA-2. IA-2 is a protein-tyrosine phosphatase receptor. Upon exocytosis, its cytoplasmic domain is cleaved and moves to the nucleus, where it enhances transcription of the insulin gene. The mature exodomain of IA-2 participates in adhesion to the extracellular matrix and is self-proteolyzed in vitro by reactive oxygen species which may be a new shedding mechanism.	89
402928	pfam11549	Sec31	Protein transport protein SEC31. Sec31 is involved in COPII coat formation as it forms through the sequential binding of three cytoplasmic proteins: Sar1, Sec23/24 and Sec13/31. Sec13/31 is recruited by the pre-budding complex and polymerization of Sec13/31 occurs to form an octahedral cage that is the outer shell of the COPII coat. Sec13/31 is a hetero-tetramer which is organized as a linear array of alpha-solenoid and beta-propeller domains to form a rod in which twenty-four copies assemble to form the COPII cuboctahedron.	48
402929	pfam11550	IglC	Intracellular growth locus C protein. IglC protein is involved in the escape of F.tularensis live vaccine strain. It has been shown that the expression of IglC is essential for F.tularensis to induce macrophage apoptosis. IglC adopts a beta-sandwich conformation that has no similarity to any known protein structure.	210
402930	pfam11551	Omp28	Outer membrane protein Omp28. Omp28 is a 28-kDa outer membrane protein from Porphyromonas gingivalis. Omp28 is thought to be a surface adhesion/receptor protein. Omp28 is expressed in a wide distribution of P.gingivalis strains.	171
402931	pfam11553	DUF3231	Protein of unknown function (DUF3231). This bacterial family of proteins has no known function.	161
402932	pfam11554	DUF3232	Protein of unknown function (DUF3232). This bacterial family of proteins has no known function.	151
402933	pfam11555	Inhibitor_Mig-6	EGFR receptor inhibitor Mig-6. When the kinase domain of EGFR binds to segment one of Mitogen induced gene 6 (Mig-6), EGFR becomes inactive due to the conformation it adopts which is Src/CDK like. The binding of the two proteins prevents EGFR acting as a cyclin-like activator for other kinase domains. The structure of Mig-6(1) consists of alpha helices-G and -H with a polar surface and hydrophobic residues for interactions with EGFR. A critical step for the activation of EGFR is the formation of an asymmetric dimer involving the kinase domains of the protein. Since Mig-6 binds to the kinase domain it blocks this process and EGFR becomes inactive.	71
402934	pfam11556	EBA-175_VI	Erythrocyte binding antigen 175. EBA-175 is involved in the formation of a tight junction, a necessary step in invasion. This family represents the region VI which is a cysteine rich domain essential for EBA-175 trafficking. The structure is a homodimer that contains a five-alpha-helical core stabilized by four disulphide bridges.	80
371601	pfam11557	Omp_AT	Solitary outer membrane autotransporter beta-barrel domain. Omp_AT is a family of Gram-negative Gamma-proteobacteria outer membrane autotransporter beta-barrel proteins. Secondary structure prediction indicates a beta-barrel domain of 12 beta-strands, with an N terminal helix running along the central barrel axis perpendicular to the 12 antiparallel strands that form the barrel. Autotransporter translocation units are defined by a beta-barrel of 12 to 14 antiparallel strands with an N terminal helix perpendicular to the barrel.	327
402935	pfam11558	HET-s_218-289	Het-s 218-289. This family of proteins is residues 218-289 of Het-s, a protein of Podospora anserina. Het-s plays a role in heterokaryon incompatibility which prevents different forms of parasitism. This region of the protein is the C-terminal end and is unstructured in solution but forms infectious fibrils in vitro which has a structure consisting of a left-handed beta solenoid which contains two windings per molecule.	61
402936	pfam11559	ADIP	Afadin- and alpha -actinin-Binding. This family is found in mammals where it is localized at cell-cell adherens junctions, and in Sch. pombe and other fungi where it anchors spindle-pole bodies to spindle microtubules. It is a coiled-coil structure, and in pombe, it is required for anchoring the minus end of spindle microtubules to the centrosome equivalent, the spindle-pole body. The name ADIP derives from the family being composed of Afadin- and alpha -Actinin-Binding Proteins localized at Cell-Cell Adherens Junctions.	151
371604	pfam11560	LAP2alpha	Lamina-associated polypeptide 2 alpha. LAPs are components of the nuclear lamina which supports the nuclear envelope. LAP2alpha is a non-membrane-associated member of the LAP family which is unique. This family of proteins is the C terminal domain of LAP2alpha which consists of residues 459-693 and constitutes a dimeric structure with an antiparallel coiled coil. LAP2alpha is involved in cell-cycle regulation and chromatin organisation and preferentially binds to lamin A/C.	234
371605	pfam11561	Saw1	Single strand annealing-weakened 1. This family of yeast proteins is involved in single-strand-annealing, or SSA. SSA entails multiple steps: end resection and ssDNA formation; annealing of complementary ssDNAs; removal of 3' single-stranded non-homologous tails; gap fill-in synthesis; and ligation. Saw1 in combination with Slx4 catalyzes the 3' non-homologous tail removal during recombination. Saw1 interacts physically with Rad1/Rad10, Msh2/Msh3, and Rad52 proteins, and works by targeting Rad1/Rad10 to Rad52-coated recombination intermediates.	244
402937	pfam11563	Protoglobin	Protoglobin. This family includes protoglobin from Methanosarcina acetivorans C2A. It is also found near the N-terminus of the Haem-based aerotactic transducer HemAT in Bacillus subtilis. It is part of the haemoglobin superfamily. Protoglobin has specific loops and an amino-terminal extension which leads to the burying of the haem within the matrix of the protein. Protoglobin-specific apolar tunnels allow the access of O2, CO and NO to the haem distal site. In HemAT it acts as an oxygen sensor domain.	153
402938	pfam11564	BpuJI_N	Restriction endonuclease BpuJI - N terminal. BpuJI is a restriction endonuclease which recognizes the asymmetric sequence 5'-CCCGT and cuts at multiple sites in the surrounding area of the target sequence. This family of proteins is the N terminal domain of BpuJI which has DNA recognition functions. The recognition domain has two subdomains D1 and D2. The recognition of the target sequence occurs through major groove contacts of amino acids on the helix-turn-helix region and the N-terminal arm.	278
288422	pfam11565	PorB	Alpha helical Porin B. Porin B is a porin from Corynebacterium glutamicum which allows the exchange of material across the mycolic acid layer which is the protective nonpolar barrier. Porin B has an alpha helical core structure consisting of four alpha-helices surrounding a nonpolar interior. There is a disulphide bridge between helices 1 and 4 to form a stable covalently bound ring. The channel of PorB is oligomeric.	99
402939	pfam11566	PI31_Prot_N	PI31 proteasome regulator N-terminal. PI31 is a regulatory subunit of the immuno-proteasome which is an inhibitor of the 20 S proteasome in vitro. PI31 is also an F-box protein Fbxo7.Skp1 binding partner which requires an N terminal FP domain in both proteins for the interaction to occur via the FP beta sheets. The structure of PI31 FP domain contains a novel alpha/beta-fold and two intermolecular contact surfaces. This is the N-terminal domain of the members.	156
402940	pfam11567	PfUIS3	Plasmodium falciparum UIS3 membrane protein. UIS3 is a membrane protein essential for sporozoite development in infected hepatocytes. This family is 130-229 of the Plasmodium falciparum UIS3 protein which is compact and has an all alpha-helical structure. PfUIS3(130-229) interacts with lipids, phospholipid lysosomes, the human liver fatty acid-binding protein and with the lipid phosphatidylethanolamine. The interaction with liver fatty acid-binding protein provides the parasite with a method to import essential fatty acids/lipids during rapid growth phases of sporozoites.	101
402941	pfam11568	Med29	Mediator complex subunit 29. Mediator is a large complex of up to 33 proteins that is conserved from plants to fungi to humans - the number and representation of individual subunits varying with species. It is arranged into four different sections, a core, a head, a tail and a kinase-active part, and the number of subunits within each of these is what varies with species. Overall, Mediator regulates the transcriptional activity of RNA polymerase II but it would appear that each of the four different sections has a slightly different function. Med29, along with Med11 and Med28, in mammals, is part of the core head-region of the complex. Med29 is the apparent orthologue of the Drosophila melanogaster Intersex protein, which interacts directly with, and functions as a transcriptional coactivator for, the DNA-binding transcription factor Doublesex, so it is likely that mammalian Med29 serves as a target for one or more DNA-binding transcriptional activators.	140
288426	pfam11569	Homez	Homeodomain leucine-zipper encoding, Homez. Homez contains two leucine zipper-like motifs and an acidic domain and belongs to the superfamily of homeobox-containing proteins. The presence of leucine zippers suggests that Homez can function as a homo or heterodimer in the nucleus. It is thought that the first leucine zipper and homeodomain 1 (HD1) of Homez is responsible for dimerization and HD2 has a specific DNA-binding activity. Homez is also thought to function as a transcriptional repressor due to the acidic region in its C-terminal domain. Homez is involved in a complex regulatory network.	55
314459	pfam11570	E2R135	Coiled-coil receptor-binding R-domain of colicin E2. E2 is a DNase which utilizes the outer membrane receptor BtuB to bind to and enter the cell. This family of proteins is E2R135 (residues 321-443) which is the part of E2 which is responsible for binding to BtuB in a coiled coil formation.	136
402942	pfam11571	Med27	Mediator complex subunit 27. Mediator is a large complex of up to 33 proteins that is conserved from plants to fungi to humans - the number and representation of individual subunits varying with species. It is arranged into four different sections, a core, a head, a tail and a kinase-activity part, and the number of subunits within each of these is what varies with species. Overall, Mediator regulates the transcriptional activity of RNA polymerase II but it would appear that each of the four different sections has a slightly different function. Mediator exists in two major forms in human cells: a smaller form that interacts strongly with pol II and activates transcription, and a large form that does not interact strongly with pol II and does not directly activate transcription. The ubiquitous expression of Med27 mRNA suggests a universal requirement for Med27 in transcriptional initiation. Loss of Crsp34/Med27 decreases amacrine cell number, but increases the number of rod photoreceptor cells.	85
402943	pfam11572	DUF3234	Protein of unknown function (DUF3234). This bacterial family of proteins has no known function. Some members in this family of proteins are annotated as TTHA0547 however this cannot be confirmed.	95
402944	pfam11573	Med23	Mediator complex subunit 23. Med23 is one of the subunits of the Tail portion of the Mediator complex that regulates RNA polymerase II activity. Med23 is required for heat-shock-specific gene expression, and has been shown to mediate transcriptional activation of E1A in mice.	1302
402945	pfam11574	DUF3235	Protein of unknown function (DUF3235). Some members in this family of proteins with unknown function are annotated as RpfA however this cannot be confirmed.	83
402946	pfam11575	FhuF_C	FhuF 2Fe-2S C-terminal domain. This family consists of several bacterial ferric iron reductase protein (FhuF) sequences. FhuF is involved in the reduction of ferric iron in cytoplasmic ferrioxamine B. This domain is the C-terminal domain that contains 4 conserved cysteine residues that are found to be part of a 2Fe-2S cluster.	21
402947	pfam11576	DUF3236	Protein of unknown function (DUF3236). This family of proteins with unknown function appears to be restricted to Methanobacteria.	152
371612	pfam11577	NEMO	NF-kappa-B essential modulator NEMO. NEMO is a regulatory protein which is part of the IKK complex along with the catalytic IKKalpha and beta kinases. The IKK complex phosphorylates IkappaB targeting it for degradation which results in the release of NF-kappaB which initiates the inflammatory response, cell proliferation or cell differentiation. NEMO activates the IKK complex's activity by associating with the unphosphorylated IKK kinase C termini. The core domain of NEMO is a dimer which binds to two fragments of IKK.	67
402948	pfam11578	DUF3237	Protein of unknown function (DUF3237). This family of proteins has no known function.	149
314467	pfam11579	DUF3238	Protein of unknown function (DUF3238). This family of proteins with unknown function appears to be restricted to Bacillus cereus.	192
402949	pfam11580	DUF3239	Protein of unknown function (DUF3239). This bacterial family of proteins may be membrane proteins however this cannot be confirmed. Currently there is no known function.	125
371613	pfam11581	Argos	Antagonist of EGFR signalling, Argos. Argos is a natural secreted antagonist of EGFR signalling which functions by binding growth factor ligands that activate EGFR by forming a clamp like structure using three disulphide-bonded beta-sheet domains.	129
338041	pfam11582	DUF3240	Protein of unknown function (DUF3240). This family of proteins with unknown function appears to be restricted to Proteobacteria.	101
402950	pfam11583	AurF	P-aminobenzoate N-oxygenase AurF. This family is a metalloenzyme which is involved in the biosynthesis of antibiotic aureothin by catalyzing the formation of p-nitrobenzoic acid from p-aminobenzoic acid. AurF is a non-heme di-iron monooxygenase which creates nitroarenes via the sequential oxidation of aminoarenes.	280
402951	pfam11584	Toxin_ToxA	Proteinaceous host-selective toxin ToxA. ToxA is produced by particular Pyrenophora tritici-repentis races and is a proteinaceous host-selective toxin. It is necessary and sufficient to cause cell death in sensitive wheat cultivars. ToxA adopts a single-domain, beta-sandwich fold which has novel topology. The protein is directly involved in recognition events required for ToxA action. It is thought to be distantly related to FnIII proteins, gaining entry to the host via an integrin-like receptor.	117
152021	pfam11585	Stomoxyn	Insect antimicrobial peptide, stomoxyn. Stomoxyn, localized in the gut epithelium, is an insect antimicrobial peptide which functions in killing a range of microorganisms, parasites and some viruses. Stomoxyn has a structure consisting of a random coil in water however in TFE it adopts a stable helical structure. Stomoxyn is thought to have a similar function to cecropin A from Hyalophora cecropia due to structural similarities.	42
402952	pfam11586	DUF3242	Protein of unknown function (DUF3242). This protein from Thermotoga maritima is a hypothetical ORFan protein, TM1622, whose structure has been determined. The protein is composed of seven beta strands and three alpha helices.	125
371614	pfam11587	Prion_bPrPp	Major prion protein bPrPp - N terminal. This family represents the N-terminal domain (1-30) of the bovine prion protein (bPrPp). The protein's structure consists of mainly alpha helices. BPrPp forms a stable helix which inserts in a transmembrane location in the bilayer, with the N-terminal (1-30) functioning as a cell-penetrating peptide.	30
378681	pfam11588	DUF3243	Protein of unknown function (DUF3243). This family of proteins with unknown function appears to be restricted to Firmicutes.	79
402953	pfam11589	DUF3244	Domain of unknown function (DUF3244). This domain adopts an immunoglobulin-like beta-sandwich fold and structurally is most similar to fibronectin.	100
314473	pfam11590	DNAPolymera_Pol	DNA polymerase catalytic subunit Pol. This family of proteins represents the catalytic subunit, Pol, of the Herpes simplex virus DNA polymerase. Pol binds UL42, making up the DNA polymerase. UL42 is a processivity subunit which binds to the C-terminal of Pol in a similar way that the cell cycle regulator p21 binds to PCNA.	36
152027	pfam11591	2Fe-2S_Ferredox	Ferredoxin chloroplastic transit peptide. The structure of chloroplast ferredoxin in water is unstructured however in a 30:70 molar-ratio mixture of 2,2,2-trifluoroethanol, residues 3 to 13 form an alpha-helix. The rest of the peptide remains unstructured. This family is the N-terminal of the [2Fe-2S] ferredoxin from C.reinhardtii. This protein catalyzes the final reaction in a pathway which allows the production of H(2) from water in the chloroplast.	34
288446	pfam11592	AvrPto	Central core of the bacterial effector protein AvrPto. This family of proteins represents the bacterial effector protein AvrPto from Pseudomonas syringae. This is the central core region of the protein which consists of a three-helix bundle motif. AvrPto is part of a type III secretion system from P.syringae which is involved in the bacterial speck disease of tomato. In resistant plants, AvrPto interacts with the host Pto kinase, which elicits an antibacterial defense response. In plants lacking resistance, the Pto kinase is not present and AvrPto acts as a virulence factor, promoting bacterial growth.	105
402954	pfam11593	Med3	Mediator complex subunit 3 fungal. Mediator is a large complex of up to 33 proteins that is conserved from plants to fungi to humans - the number and representation of individual subunits varying with species. It is arranged into four different sections, a core, a head, a tail and a kinase-activity part, and the number of subunits within each of these is what varies with species. Overall, Mediator regulates the transcriptional activity of RNA polymerase II but it would appear that each of the four different sections has a slightly different function. Mediator subunit Hrs1/Med3 is a physical target for Cyc8-Tup1, a yeast transcriptional co-repressor.	398
402955	pfam11594	Med28	Mediator complex subunit 28. Mediator is a large complex of up to 33 proteins that is conserved from plants to fungi to humans - the number and representation of individual subunits varying with species. It is arranged into four different sections, a core, a head, a tail and a kinase-activity part, and the number of subunits within each of these is what varies with species. Overall, Mediator regulates the transcriptional activity of RNA polymerase II but it would appear that each of the four different sections has a slightly different function. Subunit Med28 of the Mediator may function as a scaffolding protein within Mediator by maintaining the stability of a submodule within the head module, and components of this submodule act together in a gene-regulatory programme to suppress smooth muscle cell differentiation. Thus, mammalian Mediator subunit Med28 functions as a repressor of smooth muscle-cell differentiation, which could have implications for disorders associated with abnormalities in smooth muscle cell growth and differentiation, including atherosclerosis, asthma, hypertension, and smooth muscle tumors.	101
371618	pfam11595	DUF3245	Protein of unknown function (DUF3245). This is a family of proteins conserved in fungi. The function is not known, and there is no S. cerevisiae member.	148
371619	pfam11596	DUF3246	Protein of unknown function (DUF3246). This is a small family of fungal proteins one of whose members, MUC1.5 from Pichia stipitis is described as being an extremely serine rich protein-mucin-like protein.	241
402956	pfam11597	Med13_N	Mediator complex subunit 13 N-terminal. Mediator is a large complex of up to 33 proteins that is conserved from plants through fungi to humans - the number and representation of individual subunits varying with species. It is arranged into four different sections, a core, a head, a tail and a kinase-activity part, and the number of subunits within each of these is what varies with species. Overall, Mediator regulates the transcriptional activity of RNA polymerase II but it would appear that each of the four different sections has a slightly different function. Med13 is part of the ancillary kinase module, together with Med12, CDK8 and CycC, which in yeast is implicated in transcriptional repression, though most of this activity is likely attributable to the CDK8 kinase. The large Med12 and Med13 proteins are required for specific developmental processes in Drosophila, zebrafish, and Caenorhabditis elegans but their biochemical functions are not understood.	311
402957	pfam11598	COMP	Cartilage oligomeric matrix protein. This family of proteins represents the five-stranded coiled-coil domain of cartilage oligomeric matrix protein (COMP). This region has a binding site between two internal rings formed by Leu37 and Thr40.	43
402958	pfam11599	AviRa	RRNA methyltransferase AviRa. This family of proteins represents the methyltransferase AviRa from Streptomyces viridochromogenes. This protein mediates the resistance to the antibiotic avilamycin. AviRa methylates a specific guanine base within the peptidyl-transferase loop of the 23S ribosomal RNA.	232
402959	pfam11600	CAF-1_p150	Chromatin assembly factor 1 complex p150 subunit, N-terminal. CAF-1_p150 is a polypeptide subunit of CAF-1, which functions in depositing newly synthesized and acetylated histones H3/H4 into chromatin during DNA replication and repair. CAF-1_p150 includes the HP1 interaction site, the PEST, KER and ED interacting sites. CAF-1_p150 interacts directly with newly synthesized and acetylated histones through the acidic KER and ED domains. The PEST domain is associated with proteins that undergo rapid proteolysis.	164
402960	pfam11601	Shal-type	Shal-type voltage-gated potassium channels, N-terminal. This family represents the short N-terminal helical domain of Shal-type voltage-gated potassium channels. The domain interacts with Kv channel-interacting proteins to modulate cell surface expression and the function of Kv4 channels. The interaction of the N-terminus of Shal-type protein Kv4.2 and the Kv interacting protein KChiP1 forms a structure which is like the structure between calmodulin and its target peptides when they interact. Interactions of an N terminal alpha helix in Kv4.2 and a C terminal alpha helix in KChIP1 are essential for the modulation of Kv4.2 by KChIPs.	28
402961	pfam11602	NTPase_P4	ATPase P4 of dsRNA bacteriophage phi-12. P4 is a packaging motor which is involved in the packaging of phi-12 genome into preformed capsids using ATP. P4 is located at the vertices of the icosahedral capsid. ATP drives RNA translocation through cooperative conformational changes.	320
402962	pfam11603	Sir1	Regulatory protein Sir1. Sir1p interacts with the BAH domain of the Orc1p subunit of the origin recognition complex (ORC) resulting in the establishment of silent chromatin at HMR and HML in S.cerevisiae. The amino acids from the ORC interaction region of Sir1p are presented on a conserved, convex surface that forms a complementary interface with the Orc1 BAH domain, critical for transcriptional silencing.	120
402963	pfam11604	CusF_Ec	Copper binding periplasmic protein CusF. CusF is a periplasmic protein involved in copper and silver resistance in Escherichia coli. CusF forms a five-stranded beta-barrel OB fold. Cu(I) binds to H36, M47 and M49 which are conserved residues in the protein.	67
402964	pfam11605	Vps36_ESCRT-II	Vacuolar protein sorting protein 36 Vps36. Vps36 is a subunit of ESCRT-II, a protein involved in driving protein sorting from endosomes to lysosomes. The GLUE domain of Vps36 allows for a tight interaction to occur between the protein and Vps28, a subunit of ESCRT-I. This interaction is critical for ubiquitinated cargo progression from early to late endosomes.	92
152042	pfam11606	AlcCBM31	Family 31 carbohydrate binding protein. This family of proteins represents the family 31 carbohydrate-binding module of beta-1,2-xylanase. This protein is from Alcaligenes sp. strain XY-234. The AlcCBM31 module makes a beta-sandwich structure with an immunoglobulin fold and contains two intra-molecular disulfide bonds. AlcCBM31 shows affinity with only beta-1,3-xylan.	93
402965	pfam11607	DUF3247	Protein of unknown function (DUF3247). This family of proteins is the protein product of the gene XC5848 from Xanthomonas campestris. The protein has no known function however its structure has been determined. The protein adopts a Lsm fold however differences with the fold were observed at the N-terminal and internal regions.	92
402966	pfam11608	Limkain-b1	Limkain b1. This family of proteins represents Limkain b1, which is a novel human autoantigen, localized to a subset of ABCD3 and PXF marked peroxisomes. Limkain b1 may be a relatively common target of human autoantibodies reactive to cytoplasmic vesicle-like structures.	89
288462	pfam11609	DUF3248	Protein of unknown function (DUF3248). This family of proteins is thought to be the product of the gene TT1592 from Thermus thermophilus however this cannot be confirmed. Currently there is no known function.	62
402967	pfam11610	Ste5	Scaffold protein Ste5, Fus3-binding region. This family of proteins represents the Fus3 binding region of Ste5. Ste5 functions in the yeast mating pathway and is required for signalling through the mating response MAPK pathway. Ste5 has separate binding sites for each member of the MAPK cascade. This region of Ste5 allosterically activates autophosphorylation of Fus3, a mitogen-activated protein kinase. Auto-activated Fus3 has a negative regulatory role, and promotes Ste5 phosphorylation which leads to a decrease in pathway transcriptional output.	30
402968	pfam11611	DUF4352	Domain of unknown function (DUF4352). Members of this family are putative lipoproteins that fall into the Antigen MPT63/MPB63 (immunoprotective extracellular protein) superfamily.	125
402969	pfam11612	T2SSJ	Type II secretion system (T2SS), protein J. The T2SJ proteins are pseudopilins, which are targeted to the membrane in E. coli. T2SJ forms a complex with T2SI (pfam02501) and T2SK (pfam03934) which is part of the Type II secretion apparatus involved in the translocation of proteins across the outer membrane in E.coli. The T2SK-I-J complex has quasihelical characteristics.	137
402970	pfam11614	FixG_C	IG-like fold at C-terminal of FixG, putative oxidoreductase. This domain is part of a transmembrane protein, FixG, itself part of the FixGHIS operon closely associated with the FixNOPQ operon that is the symbiotically essential cbb3-type haem-copper oxidase complex. FixG expression is induced by oxygen-deprivation. This C-terminal domain adopts an E-set Ig-like fold.	116
402971	pfam11615	Caf4	CCR4-associated factor 4. Caf4 is a WD40 repeats containing protein involved in mitochondrial fission. It displays physical interactions with CCR4-NOT complex. It has a paralogue, Mdv1. Both Caf4 and Mdv1 act as adapter proteins, binding to Fis1 on the mitochondrial outer membrane and recruiting the dynamin-like GTPase Dnm1 to form mitochondrial fission complexes. However, Fis1 and Caf4, but not Mdv1, determine the polar localization of Dnm1 clusters on the mitochondrial surface.	60
402972	pfam11616	EZH2_WD-Binding	WD repeat binding protein EZH2. This family of proteins represents Enhancer of zeste homolog 2, (EZH2) a 30 residue peptide which binds to a WD-repeat domain of EED by residues 39-68. EED is a component of PRC2 complex which is involved in gene expression. This interaction is required for the HMTase activity of PRC2.	30
402973	pfam11617	Cu-binding_MopE	Putative metal-binding motif. The sequence of structure 2vov is not matched in any other sequence either in UniProt or in NCBI (Sep2014). The model is of a short repeat not found on the G1UBC6 - 2vov - protein. The presence of conserved cysteine residues and the lack of hydrophobic residues suggests that this repeat might be a metal-binding site, perhaps for zinc or calcium ions.	28
402974	pfam11618	C2-C2_1	First C2 domain of RPGR-interacting protein 1. This domain is the first, more N-terminal, C2 domain on X-linked retinitis pigmentosa GTPase regulator-interacting proteins, or RPGR-interacting proteins.	140
314489	pfam11619	P53_C	Transcription factor P53 - C terminal domain. This family of proteins is the C terminal domain of the transcription factor P53. While the rest of the protein is quite conserved between the different transcription factors such as p53 and p73, the C terminal domain is highly divergent. The DM-p53 structure is characterized by an additional N-terminal beta-strand and a C-terminal helix.	67
402975	pfam11620	GABP-alpha	GA-binding protein alpha chain. This family of proteins represents the transcription factor GABP alpha. This alpha domain is a five-stranded beta-sheet crossed by a distorted helix termed an OST domain. The surface of the GABP alpha OST domain contains two clusters of negatively-charged residues suggesting there are positively-charged partner proteins. The OST domain binds to the CH1 and CH3 domains of the co-activator histone acetyltransferase CBP/p300; a direct link between GABP and transcriptional machinery has thus been made.	81
288473	pfam11621	Sbi-IV	C3 binding domain 4 of IgG-bind protein SBI. This family of proteins represents Sbi domain IV which binds the central complement protein C3. Sbi-IV interacts with Sbi-III to induce a consumption of complement via alternative pathway activation. When not interacting with Sbi-III, Sbi-IV inhibits the alternative pathway without complement consumption. The structure of Sbi-IV consists of a three-helix bundle fold.	69
402976	pfam11622	DUF3251	Protein of unknown function (DUF3251). This family of proteins with unknown function appears to be restricted to Enterobacteriaceae. Some members of this family are annotated as putative lipoprotein YajI however this cannot be confirmed.	156
402977	pfam11623	NdhS	NAD(P)H dehydrogenase subunit S. This family is found in Bacteria and Streptophyta and includes members such as NdhS (NAD(P)H-quinone oxidoreductase subunit S). NdhS, also known as CRR31 (chlororespiratory reduction 31), is a subunit of the chloroplast NADH dehydrogenase-like (NDH) complex. It is also a subunit of the cyanobacterial NDH-1 complex. NAD(P)H-oxidizing subunits have not been found in chloroplasts or cyanobacteria, where ferredoxin is probably the electron donor. NdhS contributes to the formation of a ferredoxin binding site of NDH and is necessary for high affinity binding of ferredoxin. The cyanobacterial NDH-1 complex, also known as NADPH:plastoquinone oxidoreductase or type I NAD(P)H dehydrogenase, is involved in plastoquinone reduction and cyclic electron transfer (CET) around photosystem I. The chloroplast NDH is more similar to cyanobacterial NDH-1, which is believed to be the origin of chloroplast NDH, than to mitochondrial NADH dehydrogenase present in the same species. The NDH complexes of chloroplasts, however, contain many subunits that are absent from cyanobacterial NDH-1 complexes.	52
402978	pfam11624	M157	MHC class I-like protein M157. This family of proteins represents M157, a divergent form of MHC class I-like proteins which is the protein product of the mouse cytomegalovirus. This protein is unique in its ability to engage both activating (Ly49H) and inhibitory (Ly49I) natural killer cell receptors. M157 is involved in intra- and intermolecular interactions within and between its domains to form a compact MHC-like molecule.	247
402979	pfam11625	DUF3253	Protein of unknown function (DUF3253). This bacterial family of proteins has no known function.	81
402980	pfam11626	Rap1_C	TRF2-interacting telomeric protein/Rap1 - C terminal domain. This family of proteins represents the C-terminal domain of the protein Rap-1, which plays a distinct role in silencing at the silent mating-type loci and telomeres. The Rap-1 C-terminus adopts an all-helical fold. Rap1 carries out its function by recruiting the Sir3 and Sir4 proteins to chromatin via its C terminal domain. Rap1 is otherwise known as TRF2-interacting protein, as it is one of the six subunit components of the Shelterin complex. Shelterin protects telomere ends from attack by DNA-repair mechanisms. Model doesn't capture Sch. pombe as it cuts this sequence into two.	80
402981	pfam11627	HnRNPA1	Nuclear factor hnRNPA1. This family of proteins represents hnRNPA1, a nuclear factor that binds to Pol II transcripts. The family of hnRNP proteins are involved in numerous RNA-related activities.	38
402982	pfam11628	TCR_zetazeta	T-cell surface glycoprotein CD3 zeta chain. The incorporation of the zetazeta signalling module requires one basic TCR alpha and two zetazeta aspartic acid TM residues. The structure of the zetazeta(TM) dimer consists of a left-handed coiled coil with polar contacts. Two aspartic acids are critical for zetazeta dimerization and assembly with TCR.	31
402983	pfam11629	Mst1_SARAH	C terminal SARAH domain of Mst1. This family of proteins represents the C terminal SARAH domain of Mst1. SARAH controls apoptosis and cell cycle arrest via the Ras, RASSF, MST pathway. The Mst1 SARAH domain interacts with Rassf1 and Rassf5 by forming a heterodimer which mediates the apoptosis process.	48
152066	pfam11630	DUF3254	Protein of unknown function (DUF3254). This family of proteins is most likely a family of anti-lipopolysaccharide factor proteins however this cannot be confirmed.	97
288482	pfam11631	DUF3255	Protein of unknown function (DUF3255). Members in this family of proteins are annotated as YxeF however no function is currently known. The family appears to be restricted to Bacillus.	123
402984	pfam11632	LcnG-beta	Lactococcin G-beta. This family of proteins is LcnG-beta, which with LcnG-alpha constitute the two-peptide bacteriocin lactococcin G (LcnG). This family of proteins represents the N terminal domain which has an alpha-helical structure and is amphiphilic. Both peptides have a GxxxG motif which they use for interaction through a helix-helix structure.	61
314498	pfam11633	SUD-M	Single-stranded poly(A) binding domain. This family of proteins represents Nsp3c, the product of ORF1a in group 2 coronavirus. The domain exhibits a macrodomain fold containing the nsp3 residues 528 to 648, with a flexibly extended N-terminal tail from residues 513 to 527 and a C-terminal flexible tail of residues 649 to 651. SUD-M(527-651) binds single-stranded poly(A); the contact area with this RNA on the protein surface, and the electrophoretic mobility shift assays confirm that SUD-M has higher affinity for purine bases than for pyrimidine bases.	143
288483	pfam11634	IPI_T4	Nuclease inhibitor from bacteriophage T4. This family of proteins represents IPI from bacteriophage T4. This protein is a nuclease inhibitor which is injected by T4 to protect its DNA from gmrS/gmrD CT of pathogenic Escherichia coli into the infected host. The structure of this protein consists of two small beta-sheets flanked by N and C termini by alpha-helices. The protein has a gmrS/gmrD hydrophobic binding site.	76
371639	pfam11635	Med16	Mediator complex subunit 16. Mediator is a large complex of up to 33 proteins that is conserved from plants through fungi to humans - the number and representation of individual subunits varying with species. It is arranged into four different sections, a core, a head, a tail and a kinase-activity part, and the number of subunits within each of these is what varies with species. Overall, Mediator regulates the transcriptional activity of RNA polymerase II but it would appear that each of the four different sections has a slightly different function. Med16 is one of the subunits of the Tail portion of the Mediator complex and is required for lipopolysaccharide gene-expression. Several members including the human protein MED16 have one or more WD40 domains on them, pfam00400.	755
371640	pfam11636	Troponin-I_N	Troponin I residues 1-32. This family of proteins represents the cardiac N-extension of troponin I. This region of the protein (1-32) interacts with the N-lobe of cTnC and modulates myofilament calcium(2) sensitivity.	32
402985	pfam11637	UvsW	ATP-dependent DNA helicase UvsW. This family of proteins represents the DNA helicase UvsW from bacteriophage T4. The protein is a member of the monomeric SF2 helicase superfamily and shows structural homology to the eukaryotic SF2 helicase Rad54. UvsW is thought to have a role in recombination and the rescue of stalled replication forks.	56
402986	pfam11638	DnaA_N	DnaA N-terminal domain. This family of proteins represents the N-terminal domain of DnaA, a protein involved in the initiation of bacterial chromosomal replication. The structure of this domain is known. It is also found in three copies in some proteins. The exact function of this domain is uncertain but it has been suggested to play a role in oligomerization.	65
402987	pfam11639	HapK	REDY-like protein HapK. This family of proteins represents HapK, a protein of unknown function, with two homologs PigK and RedY. The monomer structure of the protein contains a four-stranded antiparallel beta-sheet, three alpha-helices and a short C terminal tail which it uses for dimer formation. The surface of HapK has a deep cavity which consists of a kinked helix and a beta-four strand. HapK could be involved in prodigiosin biosynthesis, specifically the binding of a bipyrrole intermediate such as HBM or MBM.	103
402988	pfam11640	TAN	Telomere-length maintenance and DNA damage repair. ATM is a large protein kinase, in humans, critical for responding to DNA double-strand breaks (DSBs). Tel1, the orthologue from budding yeast, also regulates responses to DSBs. Tel1 is important for maintaining viability and for phosphorylation of the DNA damage signal transducer kinase Rad53 (an orthologue of mammalian CHK2). In addition to functioning in the response to DSBs, numerous findings indicate that Tel1/ATM regulates telomeres. The overall domain structure of Tel1/ATM is shared by proteins of the phosphatidylinositol 3-kinase (PI3K)-related kinase (PIKK) family, but this family carries a unique and functionally important TAN sequence motif, near its N-terminus, LxxxKxxE/DRxxxL, which is conserved specifically in the Tel1/ATM subclass of the PIKKs. The TAN motif is essential for both telomere length maintenance and Tel1 action in response to DNA damage. It is classified as an EC:2.7.11.1.	151
152077	pfam11641	Antigen_Bd37	Glycosylphosphatidylinositol-anchored merozoite surface protein. This family of proteins represents the core region of Bd37, a surface antigen of B.divergens which is GPI-anchored at the surface of the merozoite. The structure of the protein consists of mainly alpha folds and has three sub domains.	224
402989	pfam11642	Blo-t-5	Mite allergen Blo t 5. This family of proteins is Blo t 5, an allergen protein from Blomia tropicalis mites. This protein shows strong reactivity with IgE in asthmatic and rhinitis patients. The structure of the protein contains three alpha helices which form a coiled-coil.	118
402990	pfam11644	DUF3256	Protein of unknown function (DUF3256). This family of proteins with unknown function appears to be restricted to Bacteroidales.	195
402991	pfam11645	PDDEXK_5	PD-(D/E)XK endonuclease. This family of endonucleases includes a group I intron-encoded endonuclease. This family belongs to the PD-(D/E)XK superfamily.	137
402992	pfam11646	DUF3258	Protein of unknown function DUF3258. This viral family are possible phage integrase proteins however this cannot be confirmed.	99
402993	pfam11647	MLD	Membrane Localization Domain. This is a membrane localization domain found in multiple families of bacterial toxins including all of the clostridial glucosyltransferase toxins and various MARTX toxins (multifunctional-autoprocessing RTX toxins). In the Pasteurella multocida toxin (PMT) C-terminal fragment, structural analyses have indicated that the C1 domain possesses a signal that leads the toxin to the cell membrane. Furthermore, the C1 domain was found to structurally resemble the phospholipid-binding domain of C. difficile toxin B. Functional studies in Vibrio cholerae indicate that the subdomain at the N-terminus of RID (Rho-inactivation domain), homologous to the membrane targeting C1 domain of Pasteurella multocida toxin, is a conserved membrane localization domain essential for proper localization. The Rho-inactivation domain (RID) of MARTX (Multifunctional Autoprocessing RTX toxin) is responsible for inactivating the Rho-family of small GTPases in Vibrio cholerae. It is a bacterial toxin that self-processes by a cysteine peptidase mechanism. These cysteine peptidases belong to MEROPS peptidase family C80 (RTX self-cleaving toxin, clan CD).	67
402994	pfam11648	RIG-I_C-RD	C-terminal domain of RIG-I. This family of proteins represents the regulatory domain RD of RIG-I, a protein which initiates a signalling cascade that provides essential antiviral protection for the host. The RD domain binds viral RNA, activating the RIG-I ATPase by RNA-dependent dimerization. The structure of RD contains a zinc-binding domain and is thought to confer ligand specificity.	115
288494	pfam11649	T4_neck-protein	Virus neck protein. This family of proteins represents gene product 14, a major component of the neck in T4-like viruses along with gene product 13. Gene product 14 is rich in beta-sheets. The formation of the neck to the head of the bacteriophage is crucial for the tail attachment.	254
402995	pfam11650	P22_Tail-4	P22 tail accessory factor. This tail accessory factor of the P22 virus is also referred to as gene product 4 (Gp4). The protein's structure consists of 60% alpha helices. Gp4 is the first tail accessory factor to be added to newly DNA-filled capsids during P22-morphogenesis. In solution, the protein acts as a monomer and has low structural stability. The interaction of gp4 with the portal protein involves the binding of two non-equivalent sets of six gp4 proteins. Gp4 acts as a structural adaptor for gp10 and gp26, the other tail accessory factors.	148
402996	pfam11651	P22_CoatProtein	P22 coat protein - gene protein 5. This family of proteins represents gene product 5 from bacteriophage P22. This protein is involved in the formation of the pro-capsid shells in the bacteriophage. In total, there are 415 molecules of the coat protein which are arranged in an icosahedral shell.	416
402997	pfam11652	FAM167	FAM167. This entry describes a eukaryotic protein family of unknown function designated FAM167.	84
314509	pfam11653	VirionAssem_T7	Bacteriophage T7 virion assembly protein. This family of proteins represents the gene product 7.3 from T7 bacteriophage. The protein is localized to the tail and is thought to be important in virion assembly. Particles assembled in the absence of the protein fail to adsorb to cells.	99
402998	pfam11654	NCE101	Non-classical export protein 1. This entry represents the non-classical export protein 1 family. Family members are involved in a novel pathway of export of proteins that lack a cleavable signal sequence.	45
371654	pfam11655	DUF2589	Protein of unknown function (DUF2589). This family of proteins has no known function.	150
402999	pfam11656	DUF3811	YjbD family (DUF3811). This is a family of proteobacteria proteins of unknown function. This family is unrelated to pfam03960 which contains a set of transcription factors that are also named YjbD.	88
338055	pfam11657	Activator-TraM	Transcriptional activator TraM. TraM is required for quorum dependence. It binds to and in-activates TraR which controls the replication of the tumor-inducing virulence plasmid. TraM interacts in a two-step process with DNA-TraR to form a large, stable anti-activation complex.	142
403000	pfam11658	CBP_BcsG	Cellulose biosynthesis protein BcsG. CBP_BcsG is a family of bacterial cellulose biosynthesis proteins. Cellulose is necessary for biofilm formation in bacteria. (Roemling U. and Galperin M.Y. "Bacterial cellulose biosynthesis. Diversity of operons and subunits" (manuscript in preparation)).	516
403001	pfam11659	DUF3261	Protein of unknown function (DUF3261). This family of proteins with unknown function appears to be restricted to Proteobacteria. The family is related to the LolB family suggesting a role in lipoprotein insertion in the outer membrane.	139
403002	pfam11660	DUF3262	Protein of unknown function (DUF3262). This family of proteins with unknown function appears to be restricted to Proteobacteria.	76
314517	pfam11661	DUF2986	Protein of unknown function (DUF2986). This family of proteins has no known function.	43
403003	pfam11662	DUF3263	Protein of unknown function (DUF3263). This family of proteins with unknown function appears to be restricted to Actinobacteria.	74
403004	pfam11663	Toxin_YhaV	Toxin with endonuclease activity, of toxin-antitoxin system. YhaV causes reversible bacteriostasis and is part of a toxin-antitoxin system in Escherichia coli along with PrlF. The toxicity of YhaV is counteracted by PrlF by the formation of a tight complex which binds to the promoter of the prlF-yhaV operon. In vitro, YhaV also has endonuclease activity.	138
403005	pfam11665	DUF3265	Protein of unknown function (DUF3265). This family of proteins with unknown function appears to be restricted to Vibrio.	28
403006	pfam11666	DUF2933	Protein of unknown function (DUF2933). This bacterial family of proteins has no known function.	50
403007	pfam11667	DUF3267	Putative zincin peptidase. This family of proteins has a conserved HEXXH motif, suggesting the members are putative peptidases of zincin fold.	103
288512	pfam11668	Gp_UL130	HCMV glycoprotein pUL130. This family of proteins represents pUL130 from Human cytomegalovirus, a glycoprotein secreted from infected cells that is incorporated into the virion envelope as a Golgi-matured form. The protein promotes endothelial cell infection through a producer cell modification of the virion.	159
403008	pfam11669	WBP-1	WW domain-binding protein 1. This family of proteins represents WBP-1, a ligand of the WW domain of Yes-associated protein. This protein has a proline-rich domain. WBP-1 does not bind to the SH3 domain.	102
371661	pfam11670	MSP1a	Major surface protein 1a (MSP1a). MSP1a is part of the A.marginale major surface protein 1 (MSP1) complex and exists as a heterodimer with MSP1b. The complex has adhesive functions in bovine erythrocytes invasion.	252
152107	pfam11671	Apis_Csd	Complementary sex determiner protein. This family of proteins represents the complementary sex determiner in the honeybee. In the honeybee, the mechanism of sex determination depends on the csd gene which produces an SR-type protein. Males are hemizygous or homozygous while females are heterozygous for the csd gene. Heterozygosity generates an active protein which initiates female development.	146
403009	pfam11672	DUF3268	zinc-finger-containing domain. This is a family of bacterial and plasmid sequences that carry at least one zinc-finger towards the N-terminus and a possible second at the C-terminus.	118
288515	pfam11673	DUF3269	Protein of unknown function (DUF3269). This family of proteins has no known function.	73
403010	pfam11674	DUF3270	Protein of unknown function (DUF3270). This family of proteins with unknown function appears to be restricted to Streptococcus.	86
403011	pfam11675	DUF3271	Protein of unknown function (DUF3271). This family of proteins with unknown function appears to be restricted to Plasmodium.	248
403012	pfam11676	DUF3272	Protein of unknown function (DUF3272). This family of proteins with unknown function appears to be restricted to Streptococcus.	61
403013	pfam11677	DUF3273	Protein of unknown function (DUF3273). Some members in this family of proteins are annotated as multi-transmembrane proteins however this cannot be confirmed. Currently this family has no known function.	265
371665	pfam11678	DUF3274	Protein of unknown function (DUF3274). This bacterial family of proteins has no known function.	286
371666	pfam11679	DUF3275	Protein of unknown function (DUF3275). This family of proteins with unknown function appears to be restricted to Proteobacteria.	207
403014	pfam11680	DUF3276	Protein of unknown function (DUF3276). This bacterial family of proteins has no known function.	128
403015	pfam11681	DUF3277	Protein of unknown function (DUF3277). This family of proteins represents a putative bacteriophage protein. No function is currently known.	144
288524	pfam11682	zinc_ribbon_11	Probable zinc-ribbon. This family of proteins with unknown function appears to be restricted to Enterobacteriaceae. It is probably a zinc-ribbon.	127
371668	pfam11683	DUF3278	Protein of unknown function (DUF3278). This bacterial family of proteins has no known function.	127
403016	pfam11684	DUF3280	Protein of unknown function (DUF3280). This family of proteins with unknown function appears to be restricted to Proteobacteria.	133
314533	pfam11685	DUF3281	Protein of unknown function (DUF3281). This family of bacterial proteins has no known function.	267
403017	pfam11686	DUF3283	Protein of unknown function (DUF3283). This family of proteins with unknown function appears to be restricted to Proteobacteria.	60
403018	pfam11687	DUF3284	Domain of unknown function (DUF3284). This family of proteins with unknown function appears to be restricted to Firmicutes.	116
403019	pfam11688	DUF3285	Protein of unknown function (DUF3285). This family of proteins with unknown function appears to be restricted to Cyanobacteria.	44
314536	pfam11690	DUF3287	Protein of unknown function (DUF3287). This eukaryotic family of proteins has no known function.	121
403020	pfam11691	DUF3288	Protein of unknown function (DUF3288). This family of proteins with unknown function appears to be restricted to Cyanobacteria.	88
403021	pfam11692	DUF3289	Protein of unknown function (DUF3289). This family of proteins with unknown function appears to be restricted to Proteobacteria.	272
403022	pfam11693	DUF2990	Protein of unknown function (DUF2990). This family of proteins represents a fungal protein with unknown function.	62
403023	pfam11694	DUF3290	Protein of unknown function (DUF3290). This family of proteins with unknown function appears to be restricted to Firmicutes.	144
403024	pfam11695	DUF3291	Domain of unknown function (DUF3291). This bacterial family of proteins has no known function.	136
403025	pfam11696	DUF3292	Protein of unknown function (DUF3292). This eukaryotic family of proteins has no known function.	648
403026	pfam11697	DUF3293	Protein of unknown function (DUF3293). This bacterial family of proteins has no known function.	73
403027	pfam11698	V-ATPase_H_C	V-ATPase subunit H. The yeast Saccharomyces cerevisiae vacuolar H+-ATPase (V-ATPase) is a multisubunit complex responsible for acidifying organelles. It functions as an ATP dependent proton pump that transports protons across a lipid bilayer. This domain corresponds to the C terminal domain of the H subunit of V-ATPase. The N-terminal domain is required for the activation of the complex whereas the C-terminal domain is required for coupling ATP hydrolysis to proton translocation.	117
403028	pfam11699	CENP-C_C	Mif2/CENP-C like. CENP-C_C is a C-terminal family of fungal and eukaryote proteins necessary for centromere formation. CENP-C is the inner-kinetochore centromere (CEN) binding protein. In the budding-yeast, Mif2, the yeast homolog, binds in the CDEIII region of the centromere, and has been shown to recruit a substantial subset of all inner and outer kinetochore proteins. Mif2 adopts a cupin fold and is extremely similar both in polypeptide chain conformation and in dimer geometry to the dimerization domain of a bacterial transcription factor. The Mif2 dimer appears to be part of an enhanceosome-like structure that nucleates kinetochore assembly in budding yeast. This C-terminal domain is the region via which CENP-C localizes to centromeres throughout the cell cycle.	85
371676	pfam11700	ATG22	Vacuole effluxer Atg22 like. Autophagy is a major survival mechanism in which eukaryotes recycle cellular nutrients during stress conditions. Atg22, Avt3 and Avt4 are partially redundant vacuolar effluxers, which mediate the efflux of leucine and other amino acids resulting from autophagy. This family also includes other transporter proteins.	479
403029	pfam11701	UNC45-central	Myosin-binding striated muscle assembly central. The UNC-45 or small muscle protein 1 of C.elegans is expressed in two forms from different genomic positions in mammals, as a general tissue protein UNC-45a and a specific form Unc-45b expressed only in striated and skeletal muscle. All members carry up to three amino-terminal tetratricopeptide repeat (TPR) domains towards their N-terminal, a UCS domain at the C-terminal that contains a number of Arm repeats pfam00514 and this central region of approximately 400 residues. Both the general form and the muscle form of UNC-45 function in myotube formation through cell fusion. Myofibril formation requires both GC and SM UNC-45, consistent with the fact that the cytoskeleton is necessary for the development and maintenance of organized myofibrils. The S. pombe Rng3p, is crucial for cell shape, normal actin cytoskeleton, and contractile ring assembly, and is essential for assembly of the myosin II-containing progenitors of the contractile ring. Widespread defects in the cytoskeleton are found in null mutants of all three fungal proteins. Mammalian Unc45 is found to act as a specific chaperone during the folding of myosin and the assembly of striated muscle by forming a stable complex with the general chaperone Hsp90. The exact function of this central region is not known.	150
403030	pfam11702	DUF3295	Protein of unknown function (DUF3295). This family is conserved in fungi but the function is not known.	489
403031	pfam11703	UPF0506	UPF0506. This uncharacterized family is found in Schistosoma genomes. Although uncharacterized it appears to belong to the knottin fold. The sequence is composed of two repeats of a 6 cysteine motif.	59
403032	pfam11704	Folliculin	Vesicle coat protein involved in Golgi to plasma membrane transport. In yeast cells this family functions in the regulated delivery of Gap1p (a general amino acid permease) to the cell surface, perhaps as a component of a post-Golgi secretory-vesicle coat complex. Birt-Hogg-Dube (BHD)4 syndrome is an autosomal dominant disorder characterized by hamartomas of skin follicles, lung cysts, spontaneous pneumothorax, and renal cell carcinoma. Folliculin is the protein from the BHD4 gene and is found to have no significant homology to any other human proteins. It is expressed in most tissues. These same symptoms also occur in TSC or tuberous sclerosis complex, suggesting that the same pathway is involved, and it is likely that the target is the down-stream Tor2 - an essential gene. Folliculin appears to bind Tor2, and down-regulation of Tor2 activity leads to up-regulation of nitrogen responsive genes including membrane transporters and amino acid permeases.	163
403033	pfam11705	RNA_pol_3_Rpc31	DNA-directed RNA polymerase III subunit Rpc31. RNA polymerase III contains seventeen subunits in yeasts and in human cells. Twelve of these are akin to RNA polymerase I or II and the other five are RNA pol III-specific, and form the functionally distinct groups (i) Rpc31-Rpc34-Rpc82, and (ii) Rpc37-Rpc53. Rpc31, Rpc34 and Rpc82 form a cluster of enzyme-specific subunits that contribute to transcription initiation in S.cerevisiae and H.sapiens. There is evidence that these subunits are anchored at or near the N-terminal Zn-fold of Rpc1, itself prolonged by a highly conserved but RNA polymerase III-specific domain.	230
403034	pfam11706	zf-CGNR	CGNR zinc finger. This family consists of a C-terminal zinc finger domain. It seems likely to be DNA-binding given the conservation of many positively charged residues. The domain is named after a highly conserved motif found in many members of the family.	44
403035	pfam11707	Npa1	Ribosome 60S biogenesis N-terminal. Npa1p is required for ribosome biogenesis and operates in the same functional environment as Rsa3p and Dbp6p during early maturation of 60S ribosomal subunits. The protein partners of Npa1p include eight putative helicases as well as the novel Npa2p factor. Npa1p can also associate with a subset of H/ACA and C/D small nucleolar RNPs (snoRNPs) involved in the chemical modification of residues in the vicinity of the peptidyl transferase centre. The protein has also been referred to as Urb1, and this domain at the N-terminal is one of several conserved regions along the length.	332
403036	pfam11708	Slu7	Pre-mRNA splicing Prp18-interacting factor. The spliceosome, an assembly of snRNAs (U1, U2, U4/U6, and U5) and proteins, catalyzes the excision of introns from pre-mRNAs in two successive trans-esterification reactions. Step 2 depends upon integral spliceosome constituents such as U5 snRNA and Prp8 and non-spliceosomal proteins Prp16, Slu7, Prp18, and Prp22. ATP hydrolysis by the DEAH-box enzyme Prp16 promotes a conformational change in the spliceosome that leads to protection of the 3'ss from targeted RNase H cleavage. This change, which probably reflects binding of the 3'ss PyAG in the catalytic centre of the spliceosome, requires the ordered recruitment of Slu7, Prp18, and Prp22 to the spliceosome. There is a close functional relationship between Prp8, Prp18, and Slu7, and Prp18 interacts with Slu7, so that together they recruit Prp22 to the spliceosome. Most members of the family carry a zinc-finger of the CCHC-type upstream of this domain.	258
403037	pfam11709	Mit_ribos_Mrp51	Mitochondrial ribosomal protein subunit. This family is the mitochondrial ribosomal small-subunit protein Mrp51. Its function is not entirely clear, but deletion of the MRP51 gene completely blocked mitochondrial gene expression.	355
371685	pfam11710	Git3	G protein-coupled glucose receptor regulating Gpa2. Git3 is one of six proteins required for glucose-triggered adenylate cyclase activation, and is a G protein-coupled receptor responsible for the activation of adenylate cyclase through Gpa2 - heterotrimeric G protein alpha subunit, part of the glucose-detection pathway. Git3 contains seven predicted transmembrane domains, a third cytoplasmic loop and a cytoplasmic tail. This is the conserved N-terminus of these proteins, and the C-terminal conserved region is now in family Git3_C.	201
403038	pfam11711	Tim54	Inner membrane protein import complex subunit Tim54. Mitochondrial function depends on the import of hundreds of different proteins synthesized in the cytosol. Protein import is a multi-step pathway which includes the binding of precursor proteins to surface receptors, translocation of the precursor across one or both mitochondrial membranes, and folding and assembly of the imported protein inside the mitochondrion. Most precursor proteins carry amino-terminal targeting signals, called pre-sequences, and are imported into mitochondria via import complexes located in both the outer and the inner membrane (IM). The IM complex, TIM, is made up of at least two proteins which mediate translocation of proteins into the matrix by removing their signal peptide and another pair of proteins, Tim54 and Tim22, that insert the polytopic proteins, that carry internal targeting information, into the inner membrane.	372
403039	pfam11712	Vma12	Endoplasmic reticulum-based factor for assembly of V-ATPase. The yeast vacuolar proton-translocating ATPase (V-ATPase) is the best characterized member of the V-ATPase family. A total of thirteen genes are required for encoding the subunits of the enzyme complex itself and an additional three for providing factors necessary for the assembly of the whole. Vma12 is one of these latter, all three of which are localized to the endoplasmic reticulum.	139
288550	pfam11713	Peptidase_C80	Peptidase C80 family. This family belongs to cysteine peptidase family C80.	152
152150	pfam11714	Inhibitor_I53	Thrombin inhibitor Madanin. Members of this family are the peptidase inhibitor madanin proteins. These proteins were isolated from tick saliva.	78
403040	pfam11715	Nup160	Nucleoporin Nup120/160. Nup120 is conserved from fungi to plants to humans, and is homologous with the Nup160 of vertebrates. The nuclear pore complex, or NPC, mediates macromolecular transport across the nuclear envelope. Deletion of the NUP120 gene causes clustering of NPCs at one side of the nuclear envelope, moderate nucleolar fragmentation and slower cell growth. The vertebrate NPC is estimated to contain between 30 and 60 different proteins, most of which are not known. Two important ones in creating the nucleoporin basket are Nup98 and Nup153, and Nup120, in conjunction with Nup133, interacts with these two and itself plays a role in mRNA export. Nup160, Nup133, Nup96, and Nup107 are all targets of phosphorylation. The phosphorylation sites are clustered mainly at the N-terminal regions of these proteins, which are predicted to be natively disordered. The entire Nup107-160 sub-complex is stable throughout the cell cycle, thus it seems unlikely that phosphorylation affects interactions within the Nup107-160 sub-complex, but rather that it regulates the association of the sub-complex with the NPC and other proteins.	536
371689	pfam11716	MDMPI_N	Mycothiol maleylpyruvate isomerase N-terminal domain. 	139
403041	pfam11717	Tudor-knot	RNA binding activity-knot of a chromodomain. This is a novel knotted tudor domain which is required for binding to RNA. The knot influences the loop conformation of the helical turn Ht2 (residues 61-63) that is located at the side opposite the knot in the tudor domain-chromodomain; stabilisation of Ht2 is essential for RNA binding.	55
403042	pfam11718	CPSF73-100_C	Pre-mRNA 3'-end-processing endonuclease polyadenylation factor C-term. This is the C-terminal conserved region of the pre-mRNA 3'-end-processing of the polyadenylation factor CPSF-73/CPSF-100 proteins. The exact function of this domain is not known.	204
371692	pfam11719	Drc1-Sld2	DNA replication and checkpoint protein. Genome duplication is precisely regulated by cyclin-dependent kinases CDKs, which bring about the onset of S phase by activating replication origins and then prevent re-licensing of origins until mitosis is completed. The optimum sequence motif for CDK phosphorylation is S/T-P-K/R-K/R, and Drc1-Sld2 is found to have at least 11 potential phosphorylation sites. Drc1 is required for DNA synthesis and S-M replication checkpoint control. Drc1 associates with Cdc2 and is phosphorylated at the onset of S phase when Cdc2 is activated. Thus Cdc2 promotes DNA replication by phosphorylating Drc1 and regulating its association with Cut5. Sld2 and Sld3 represent the minimal set of S-CDK substrates required for DNA replication.	391
288556	pfam11720	Inhibitor_I78	Peptidase inhibitor I78 family. This family includes Aspergillus elastase inhibitor and belongs to MEROPS peptidase inhibitor family I78.	65
403043	pfam11721	Malectin	Di-glucose binding within endoplasmic reticulum. Malectin is a membrane-anchored protein of the endoplasmic reticulum that recognizes and binds Glc2-N-glycan. It carries a signal peptide from residues 1-26, a C-terminal transmembrane helix from residues 255-274, and a highly conserved central part of approximately 190 residues followed by an acidic, glutamate-rich region. Carbohydrate-binding is mediated by the four aromatic residues, Y67, Y89, Y116, and F117 and the aspartate at D186. NMR-based ligand-screening studies have shown binding of the protein to maltose and related oligosaccharides, on the basis of which the protein has been designated "malectin", and its endogenous ligand is found to be Glc2-high-mannose N-glycan.	164
403044	pfam11722	zf-TRM13_CCCH	CCCH zinc finger in TRM13 protein. This domain is found at the N-terminus of TRM13 methyltransferase proteins. It is presumed to be a zinc binding domain.	29
403045	pfam11723	Aromatic_hydrox	Homotrimeric ring hydroxylase. This domain is found on aromatic hydroxylating enzymes such as 2-oxo-1,2-dihydroquinoline 8-monooxygenase from Pseudomonas putida and carbazole 1,9a-dioxygenase from Janthinobacterium. These enzymes are homotrimers and are distantly related to the typical oxygenase. This domain is found C terminal to the Rieske domain which binds an iron-sulphur cluster.	241
378697	pfam11724	YvbH_ext	YvbH-like oligomerization region. This region is found at the C-terminus of a group of bacterial PH domains. This region is composed of a helical hairpin that appears to mediate oligomerization based on the known structure. This elaboration of the bacterial PH domain is only found in Bacillales.	61
338078	pfam11725	AvrE	Pathogenicity factor. This family is secreted by gram-negative Gammaproteobacteria such as Pseudomonas syringae of tomato and the fire blight plant pathogen Erwinia amylovora, amongst others. It is an essential pathogenicity factor of approximately 198 kDa. Its injection into the host-plant is dependent upon the bacterial type III or Hrp secretion system. The family is long and carries a number of predicted functional regions, including in Erwinia stewartii, an ERMS or endoplasmic reticulum membrane retention signal at both the C- and the N-termini, a leucine-zipper motif from residues 539-560, and a nuclear localization signal at 1358-1361. This conserved AvrE-family of effectors is among the few that are required for full virulence of many phytopathogenic pseudomonads, erwinias and pantoeas. A double beta-propeller structure is found towards the N-terminus.	1879
403046	pfam11726	Inovirus_Gp2	Inovirus Gp2. Isoform G2P plays an essential role in viral DNA replication; it binds to the origin of replication and cleaves the dsDNA replicative form I (RFI) and becomes covalently bound to it via phosphotyrosine bond, generating the dsDNA replicative form II (RFII).	179
403047	pfam11727	ISG65-75	Invariant surface glycoprotein. This family is found in Trypanosome species, and appears to be one of two invariant surface glycoproteins, ISG65 and ISG75, that are found in the mammalian stage of the parasitic protozoan. The sequence suggests the two families are polypeptides with N-terminal signal sequences, hydrophilic extracellular domains, single trans-membrane alpha-helices and short cytoplasmic domains. They are both expressed in the bloodstream form but not in the midgut stage. Both polypeptides are distributed over the entire surface of the parasite.	280
403048	pfam11728	ArAE_1_C	Putative aromatic acid exporter C-terminal domain. This region is a presumed intracellular domain found in a set of bacterial presumed transporter proteins. The region is about 160 amino acids in length.	161
288565	pfam11729	Capsid-VNN	nodavirus capsid protein. The capsid or coat protein of this family is expressed in Nodaviridae, that are ssRNA positive-strand viruses, with no DNA stage. These viruses are the causative agents of viral nervous necrosis in marine fish.	340
403049	pfam11730	DUF3297	Protein of unknown function (DUF3297). This family is expressed in Proteobacteria and Actinobacteria. The function is not known.	71
403050	pfam11731	Cdd1	Pathogenicity locus. Cdd1 is expressed as part of the pathogenicity locus operon in several different orders of bacteria. Many members of the family are annotated as being putative mitomycin resistance proteins but this could not be confirmed.	81
403051	pfam11732	Thoc2	Transcription- and export-related complex subunit. The THO/TREX complex is the transcription- and export-related complex associated with spliceosomes that preferentially deal with spliced mRNAs as opposed to unspliced mRNAs. Thoc2 plays a role in RNA polymerase II (RNA pol II)-dependent transcription and is required for the stability of DNA repeats. In humans, the TREX complex is comprised of the exon-junction-associated proteins Aly/REF and UAP56 together with the THO proteins THOC1 (hHpr1/p84), Thoc2 (hRlr1), THOC3 (hTex1), THOC5 (fSAP79), THOC6 (fSAP35), and THOC7 (fSAP24). Although much evidence indicates that the function of the TREX complex as an adaptor between the mRNA and components of the export machinery is conserved among eukaryotes, in Drosophila the majority of mRNAs can be exported from the nucleus independently of the THO complex.	75
314574	pfam11733	NP1-WLL	Non-capsid protein NP1. This family is the non-capsid protein NP1 of the ssDNA, Parvovirinae virus Bocavirus of cattle and humans.	213
403052	pfam11734	TilS_C	TilS substrate C-terminal domain. This domain is found in the tRNA(Ile) lysidine synthetase (TilS) protein.	74
403053	pfam11735	CAP59_mtransfer	Cryptococcal mannosyltransferase 1. The capsule of pathogenic fungi is a complex polysaccharide whose formation is determined by a number of enzymes including, most importantly, alpha-1,3-mannosyltransferase 1, EC:2.4.1.-.	225
403054	pfam11736	DUF3299	Protein of unknown function (DUF3299). This is a family of bacterial proteins of unknown function.	106
403055	pfam11737	DUF3300	Protein of unknown function (DUF3300). This hypothetical bacterial gene product has a long hydrophobic segment and is thus likely to be a membrane protein.	229
403056	pfam11738	DUF3298	Protein of unknown function (DUF3298). This family of bacterial protein C-terminal regions is highly conserved but the function is not known. Several members are annotated as being endo-1,4-beta-xylanase-like, but this could not be confirmed, and the structure can be defined as a heat-shock cognate 70kd protein 44kd ATPase.	81
403057	pfam11739	DctA-YdbH	Dicarboxylate transport. In certain bacterial families this protein is expressed from the ydbH gene, and there is a suggestion that this is a form of DctA or dicarboxylate transport protein. Dicarboxylate transport proteins are found in aerobic bacteria which grow on succinate or other C4-dicarboxylates.	204
403058	pfam11740	KfrA_N	Plasmid replication region DNA-binding N-term. The broad host-range plasmid RK2 is able to replicate in and be inherited in a stable manner in diverse Gram-negative bacterial species. It encodes a number of co-ordinately regulated operons including a central control korF1 operon that represses the kfrA operon. The KfrA polypeptide is a site-specific DNA-binding protein whose operator overlaps the kfrA promoter. The N-terminus, containing a helix-turn-helix motif, is essential for function. Downstream from this family is an extended coiled-coil domain containing a heptad repeat segment which is probably responsible for formation of multimers, and may provide an example of a bridge to host structures required for plasmid partitioning.	117
403059	pfam11741	AMIN	AMIN domain. This N-terminal domain of various bacterial protein families is crucial for the targetting of periplasmic or extracellular proteins to specific regions of the bacterial envelope. AMIN is derived from the N-terminal domain of AmiC, an N-acetylmuramoyl-l-alanine amidase of Escherichia coli which localizes to the septal ring during division and plays a key role in the separation of daughter cells. The AMIN domain is present in several protein families besides amidases suggesting that AMIN may represent a general targetting determinant involved in the localization of periplasmic protein complexes.	96
314583	pfam11742	DUF3302	Protein of unknown function (DUF3302). This family of unknown function is expressed by proteobacteria.	77
403060	pfam11743	DUF3301	Protein of unknown function (DUF3301). This family is conserved in Proteobacteria, but the function is not known.	93
314585	pfam11744	ALMT	Aluminium activated malate transporter. 	469
403061	pfam11745	DUF3304	Protein of unknown function (DUF3304). This is a family of bacterial proteins of unknown function.	104
403062	pfam11746	DUF3303	Protein of unknown function (DUF3303). Several members are annotated as being LysM domain-like proteins, but these did not match any LysM domains reported in the literature.	90
403063	pfam11747	RebB	Killing trait. RebB is one of three proteins necessary for the production of R-bodies, refractile inclusion bodies produced by a small number of bacterial species, essential for the expression of the killing trait of the endosymbiont bacteria that produce them for attack upon the host Paramecium. R-bodies are highly insoluble protein ribbons which coil into cylindrical structures in the cell and the genes for their synthesis and assembly are encoded on a plasmid. One of these three proteins is RebB.	68
403064	pfam11748	DUF3306	Protein of unknown function (DUF3306). This family of proteobacterial species proteins has no known function.	120
403065	pfam11749	DUF3305	Protein of unknown function (DUF3305). Several members of this family are annotated as being molybdopterin-guanine dinucleotide biosynthesis protein A; however, this could not be confirmed. The family is found in proteobacteria.	141
371703	pfam11750	DUF3307	Protein of unknown function (DUF3307). This family of bacterial proteins has no known function.	124
403066	pfam11751	PorP_SprF	Type IX secretion system membrane protein PorP/SprF. This entry describes a protein family unique to, and greatly expanded in, the Bacteroidetes. Species in this lineage include several, such as Cytophaga hutchinsonii and Cytophaga johnsonae (Flavobacterium johnsoniae), that exhibit a poorly understood rapid gliding phenotype. Several members of this protein family are found in operons with other genes whose loss leads to a loss of the rapid gliding phenotype.	274
403067	pfam11752	DUF3309	Protein of unknown function (DUF3309). This family is conserved in bacteria but its function is not known.	49
403068	pfam11753	DUF3310	Protein of unknown function (DUF3310). This is a family of conserved bacteriophage proteins of unknown function.	60
403069	pfam11754	Velvet	Velvet factor. The velvet factor is conserved in many fungal species and is found to have gained different roles depending on the organism's need, expanding the conserved role in developmental programmes. The velvet factor orthologues can be adapted to the fungal-specific life cycle and may be involved in diverse functions such as sclerotia formation and toxin production, as in A. parasiticus, nutrition-dependent sporulation, as in A. fumigatus, or the microconidia-to-macroconidia ratio and cell wall formation, as in the heterothallic fungus Fusarium verticillioides.	237
403070	pfam11755	DUF3311	Protein of unknown function (DUF3311). This is a family of short bacterial proteins of unknown function.	59
403071	pfam11756	YgbA_NO	Nitrous oxide-stimulated promoter. The function of ygaB is not known but it is a promoter that is stimulated by the presence of nitrous oxide. It is regulated by the gene-product of the bacterial nsrR gene.	106
403072	pfam11757	RSS_P20	Suppressor of RNA silencing P21-like. This is a large family of putative suppressors of RNA silencing proteins, P20-P25, from ssRNA positive-strand viruses such as Closterovirus, Potyvirus and Cucumovirus families. RNA silencing is one of the major mechanisms of defense against viruses, and, in response, some viruses have evolved or acquired functions for suppression of RNA silencing. These counter-defensive viral proteins with RNA silencing suppressor (RSS) activity were originally discovered in the members of plant virus genera Potyvirus and Cucumovirus. Each of the conserved blocks of amino acids found in P21-like proteins corresponds to a computer-predicted alpha-helix, with the most C-terminal element being 42 residues long. This suggests conservation of the predominantly alpha-helical secondary structure in the P21-like proteins.	94
403073	pfam11758	Bacteriocin_IIi	Aureocin-like type II bacteriocin. This is a small family of type II bacteriocins usually encoded on a plasmid. Characteristically the members are small, cationic, rich in Lys and Try, and bring about a generalized membrane permeabilisation leading to leakage of ions. The family includes aureocin A, lacticins Q and Z, and BhtB as well as an archaeal member.	51
314599	pfam11759	KRTAP	Keratin-associated matrix. The major structural proteins of mammalian hair are the hair keratin intermediate filaments (KIFs) and the keratin-associated proteins (KRTAPs). In the hair cortex, hair keratins are embedded in an inter-filamentous matrix consisting of KRTAPs which are essential for the formation of a rigid and resistant hair shaft as a result of disulfide bonds between cysteine residues. There are essentially three groups of KRTAPs, viz: the high-sulfur (HS) and ultra-high-sulfur (UHS) KRTAPs (cysteine content: 16-30 and >30 mol%, respectively) and the high-glycine/tyrosine (HGT: 35-60 mol% glycine and tyrosine) KRTAPs.	59
403074	pfam11760	CbiG_N	Cobalamin synthesis G N-terminal. Members of this family are involved in cobalamin synthesis. Synechocystis sp. cbiH represents a fusion between cbiH and cbiG. As other multi-functional proteins involved in cobalamin biosynthesis catalyze adjacent steps in the pathway, including CysG, CobL (CbiET), CobIJ and CobA-HemD, it is therefore possible that CbiG catalyzes a reaction step adjacent to CbiH. In the anaerobic pathway such a step could be the formation of a gamma lactone, which is thought to help to mediate the anaerobic ring contraction process. Within the cobalamin synthesis pathway CbiG catalyzes both the opening of the lactone ring and the extrusion of the two-carbon fragment of cobalt-precorrin-5A from C-20 and its associated methyl group (deacylation) to give cobalt-precorrin-5B. The N-terminal of the enzyme is conserved in this family, and the C-terminal and the mid-sections are conserved independently in other families, CbiG_C and CbiG_mid, although the distinct function of each region is unclear.	79
403075	pfam11761	CbiG_mid	Cobalamin biosynthesis central region. Members of this family are involved in cobalamin synthesis. Synechocystis sp. cbiH represents a fusion between cbiH and cbiG. As other multi-functional proteins involved in cobalamin biosynthesis catalyze adjacent steps in the pathway, including CysG, CobL (CbiET), CobIJ and CobA-HemD, it is therefore possible that CbiG catalyzes a reaction step adjacent to CbiH. In the anaerobic pathway such a step could be the formation of a gamma lactone, which is thought to help to mediate the anaerobic ring contraction process.	88
403076	pfam11762	Arabinose_Iso_C	L-arabinose isomerase C-terminal domain. This is a family of L-arabinose isomerases, AraA, EC:5.3.1.4. These enzymes catalyze the reaction: L-arabinose <=> L-ribulose. This reaction is the first step in the pathway of L-arabinose utilisation as a carbon source: after entering the cell, L-arabinose is converted into L-ribulose by the L-arabinose isomerase enzyme. This is a C-terminal non-catalytic domain.	114
403077	pfam11763	DIPSY	Cell-wall adhesin ligand-binding C-terminal. The DIPSY domain is characterized by the distinctive D*I*PSY motif at the very C-terminus of yeast cell-wall glycoproteins. It appears not to be conserved in any other species, however. In fungi, cell adhesion is required for flocculation, mating and virulence, and is mediated by covalently bound cell wall proteins termed adhesins. Map4, an adhesin required for mating in Schizosaccharomyces pombe, is N-glycosylated and O-glycosylated, and is an endogenous substrate for the mannosyl transferase Oma4p. Map4 has a modular structure with an N-terminal signal peptide, a serine and threonine (S/T)-rich domain that includes nine repeats of 36 amino acids (rich in serine and threonine residues, but lacking glutamines), and a C-terminal DIPSY domain with no glycosyl-phosphatidyl inositol (GPI)-anchor signal. The N-terminal S/T-rich regions are required for cell wall attachment, but the C-terminal DIPSY domain is required for agglutination and mating in liquid and solid media.	122
403078	pfam11764	N-SET	COMPASS (Complex proteins associated with Set1p) component N. The n-SET or N-SET domain is a component of the COMPASS complex, associated with SET1, conserved in yeasts and in other eukaryotes up to humans. The COMPASS complex functions to methylate the fourth lysine of Histone 3 and for the silencing of genes close to the telomeres of chromosomes. This domain promotes trimethylation in conjunction with an RRM domain and is necessary for binding of the Spp1 component of COMPASS into the complex.	172
403079	pfam11765	Hyphal_reg_CWP	Hyphally regulated cell wall protein N-terminal. The proteins in this family are all fungal and largely annotated as being hyphally regulated cell wall proteins, and several are listed as the enzyme EC:3.2.1.18. This enzyme is acetylneuraminyl hydrolase or exo-alpha-sialidase, that hydrolyzes glycosidic linkages of terminal sialic acid residues in oligosaccharides, glycoproteins, glycolipids, colominic acid and synthetic substrates.	322
403080	pfam11766	Candida_ALS_N	Cell-wall agglutinin N-terminal ligand-sugar binding. This is likely to be the sugar or ligand binding domain of the yeast alpha-agglutinins.	241
403081	pfam11767	SET_assoc	Histone lysine methyltransferase SET associated. SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. A subset of SET domains have been called PR domains. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. In addition to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure. This domain is found in fungi associated with SET and N-SET domains.	65
314607	pfam11768	Frtz	WD repeat-containing and planar cell polarity effector protein Fritz. Fritz is a probable effector of the planar cell polarity signaling pathway which regulates the septin cytoskeleton in both ciliogenesis and collective cell movements. In Drosophila melanogaster, fritz regulates both the location and the number of wing cell prehair initiation sites.	545
403082	pfam11769	DUF3313	Protein of unknown function (DUF3313). This is a bacterial family of proteins which are annotated as putative lipoproteins.	186
403083	pfam11770	GAPT	GRB2-binding adapter (GAPT). This is a family of transmembrane proteins which bind the growth factor receptor-bound protein 2 (GRB2) in B cells. In contrast to other transmembrane adaptor proteins, GAPT is not phosphorylated upon BCR ligation. It associates with GRB2 constitutively through its proline-rich region.	155
403084	pfam11771	DUF3314	Protein of unknown function (DUF3314). This small family contains human, mouse and fish members but the function is not known.	162
403085	pfam11772	EpuA	DNA-directed RNA polymerase subunit beta. This short 60-residue long bacterial family is the beta subunit of the DNA-directed RNA polymerase, likely to be EC:2.7.7.6. It is membrane-bound and is referred to by the name EpuA.	46
403086	pfam11773	PulG	Type II secretory pathway pseudopilin. The secreton (type II secretion) and type IV pilus biogenesis branches of the general secretory pathway in Gram-negative bacteria share many features that suggest a common evolutionary origin. Five components of the secreton, the pseudopilins, are similar to subunits of type IV pili. Pseudopilin PulG is one of the secreton pseudopilins, and is found to assemble into pilus-like bundles. PulG interacts with proteins H, I and J within the multi-protein complex as well as blocking extracellular secretion and reducing the amount of PulE protein as well as the amounts of PulL, PulM, PulC and PulD when G is over-expressed. In Klebsiella the pilus-like structure is composed largely of PulG.	82
403087	pfam11774	Lsr2	Lsr2. Lsr2 is a small, basic DNA-bridging protein present in Mycobacterium and related actinomycetes. It is a functional homolog of the H-NS-like proteins. H-NS proteins play a role in nucleoid organisation and also function as a pleiotropic regulator of gene expression.	109
288608	pfam11775	CobT_C	Cobalamin biosynthesis protein CobT VWA domain. This family consists of several bacterial cobalamin biosynthesis (CobT) proteins. CobT is involved in the transformation of precorrin-3 into cobyrinic acid.	220
403088	pfam11776	RcnB	Nickel/cobalt transporter regulator. RcnB is a family of Proteobacteria proteins. RcnB is required for maintaining metal ion homeostasis, in conjunction with the efflux pump RcnA, family NicO, pfam03824.	51
403089	pfam11777	DUF3316	Protein of unknown function (DUF3316). This family of bacterial proteins has no known function. Several members are, however, annotated as being putative acyl-CoA synthetase, but this could not be confirmed.	107
403090	pfam11778	SID	Septation initiation. This family is required for activation of the spg1 GTPase signalling cascade which leads to the initiation of septation and the subsequent termination of mitosis. It may act as a scaffold at the spindle pole body to which other components of the spg1 signalling cascade attach in pombe. In S.cerevisiae it is both required for the proper formation of the spindle pole body outer plaque and may also connect the outer plaque to the central plaque embedded in the nuclear envelope.	135
403091	pfam11779	SPT_ssu-like	Small subunit of serine palmitoyltransferase-like. Serine palmitoyltransferase (SPT) catalyzes the first committed step in sphingolipid biosynthesis. In mammals, two small subunits of serine palmitoyltransferase, ssSPTa and ssSPTb, substantially enhance the activity of SPT, conferring full enzyme activity upon it. The 2 ssSPT isoforms share a conserved hydrophobic central domain, which is predicted to reside in the membrane.	54
378716	pfam11780	DUF3318	Protein of unknown function (DUF3318). This is a bacterial family of uncharacterized proteins.	141
403092	pfam11781	zf-RRN7	Zinc-finger of RNA-polymerase I-specific TFIIB, Rrn7. This is the zinc-finger at the start of transcription-binding factor that associates strongly with both Rrn6 and Rrn7 to form a complex which itself binds the TATA-binding protein and is required for transcription by the core domain of the RNA PolI promoter.	32
403093	pfam11782	DUF3319	Protein of unknown function (DUF3319). This is a family of short bacterial proteins, a few of which are annotated as being minor tail protein. Otherwise the function is unknown.	89
403094	pfam11783	Cytochrome_cB	Cytochrome c bacterial. This is a family of long bacterial cytochrome c proteins, found in Proteobacteria and Chlorobi families.	173
403095	pfam11784	DUF3320	Protein of unknown function (DUF3320). This family is conserved in Proteobacteria and Chlorobi families. Many members are annotated as being putative DNA helicase-related proteins.	50
403096	pfam11785	Aft1_OSA	Aft1 osmotic stress response (OSM) domain. This domain is found in the transcription factor Aft1 which is required for a wide range of stress responses. The OSM domain has been shown to be involved in the osmotic stress response.	57
371723	pfam11786	Aft1_HRA	Aft1 HRA domain. This domain is found in the transcription factor Aft1 which is required for a wide range of stress responses. The HRA domain is involved in meiotic recombination. It has been shown to be necessary and sufficient to activate recombination.	76
403097	pfam11787	Aft1_HRR	Aft1 HRR domain. This domain is found in the transcription factor Aft1 which is required for a wide range of stress responses. The HRR domain is involved in meiotic recombination. It has been shown to be necessary and sufficient to repress recombination.	68
371725	pfam11788	MRP-L46	39S mitochondrial ribosomal protein L46. This is the L46 subunit of the mammalian mitochondrial ribosome, conserved from plants and fungi.	115
403098	pfam11789	zf-Nse	Zinc-finger of the MIZ type in Nse subunit. Nse1 and Nse2 are novel non-SMC subunits of the fission yeast Smc5-6 DNA repair complex. This family is the zinc-finger domain similar to the MIZ type of zinc-finger.	57
371727	pfam11790	Glyco_hydro_cc	Glycosyl hydrolase catalytic core. This family is probably a glycosyl hydrolase, and is conserved in fungi and some Proteobacteria. The pombe member is annotated as being from IPR013781.	235
403099	pfam11791	Aconitase_B_N	Aconitate B N-terminal domain. This family represents the N-terminal domain of Aconitase B.	152
288625	pfam11792	Baculo_LEF5_C	Baculoviridae late expression factor 5 C-terminal domain. This C-terminal domain is likely to be a zinc-binding domain.	42
403100	pfam11793	FANCL_C	FANCL C-terminal domain. This domain is found at the C-terminus of the Fancl protein in humans which is the putative E3 ubiquitin ligase subunit of the FA complex (Fanconi anaemia). Eight subunits of the Fanconi anaemia gene products form a multisubunit nuclear complex which is required for mono-ubiquitination of a downstream FA protein, FANCD2.	70
403101	pfam11794	HpaB_N	4-hydroxyphenylacetate 3-hydroxylase N terminal. HpaB is part of the 4-hydroxyphenylacetate 3-hydroxylase from Escherichia coli. HpaB is part of a heterodimeric enzyme that also requires HpaC. The enzyme is NADH-dependent and uses FAD as the redox chromophore. This family also includes PvcC, which may play a role in one of the proposed hydroxylation steps of pyoverdine chromophore biosynthesis.	266
403102	pfam11795	DUF3322	Uncharacterized protein conserved in bacteria N-term (DUF3322). This domain, found in various hypothetical bacterial proteins, has no known function. The family represents just the N-terminus.	187
403103	pfam11796	DUF3323	Protein of unknown function N-terminus (DUF3323). Proteins in this entry are encoded within a conserved four-gene neighborhood found sporadically in a phylogenetically broad range of bacteria including: Nocardia farcinica, Symbiobacterium thermophilum, and Streptomyces avermitilis (Actinobacteria), Geobacillus kaustophilus (Firmicutes), Azoarcus sp. EbN1 and Ralstonia solanacearum (Beta-proteobacteria).	209
403104	pfam11797	DUF3324	Protein of unknown function C-terminal (DUF3324). This family consists of several hypothetical bacterial proteins of unknown function.	138
403105	pfam11798	IMS_HHH	IMS family HHH motif. These proteins are involved in UV protection.	32
403106	pfam11799	IMS_C	impB/mucB/samB family C-terminal domain. These proteins are involved in UV protection.	110
403107	pfam11800	RP-C_C	Replication protein C C-terminal region. Replication protein C is involved in the early stages of viral DNA replication.	206
371732	pfam11801	Tom37_C	Tom37 C-terminal domain. The TOM37 protein is one of the outer membrane proteins that make up the TOM complex for guiding cytosolic mitochondrial beta-barrel proteins from the cytosol across the outer mitochondrial membrane into the intra-membrane space. In conjunction with TOM70 it guides peptides without an MTS into TOM40, the protein that forms the passage through the outer membrane. It has homology with Metaxin-1, also part of the outer mitochondrial membrane beta-barrel protein transport complex.	145
371733	pfam11802	CENP-K	Centromere-associated protein K. CENP-K is one of seven new CENP-A-nucleosome distal (CAD) centromere components (the others being CENP-L, CENP-O, CENP-P, CENP-Q, CENP-R and CENP-S) that are identified as assembling on the CENP-A nucleosome associated complex, NAC. The CENP-A NAC is essential, as disruption of the complex causes errors of chromosome alignment and segregation that preclude cell survival despite continued centromere-derived mitotic checkpoint signalling. CENP-K is centromere-associated through its interaction with one or more components of the CENP-A NAC.	263
403108	pfam11803	UXS1_N	UDP-glucuronate decarboxylase N-terminal. The N-terminus of the UDP-glucuronate decarboxylases may be involved in localization to the perinuclear Golgi membrane.	75
403109	pfam11804	DUF3325	Protein of unknown function (DUF3325). This family of short proteins are functionally uncharacterized. This family is restricted to Alpha-, Beta- and Gamma-proteobacteria.	101
403110	pfam11805	DUF3326	Protein of unknown function (DUF3326). This protein is functionally uncharacterized. It is about 300-500 amino acids in length. This family is found in plants and bacteria.	336
403111	pfam11806	DUF3327	Domain of unknown function (DUF3327). 	122
403112	pfam11807	DUF3328	Domain of unknown function (DUF3328). This family of proteins are functionally uncharacterized. This family is only found in eukaryotes.	220
403113	pfam11808	DUF3329	Domain of unknown function (DUF3329). This family of proteins are functionally uncharacterized. This family is only found in bacteria.	83
288642	pfam11809	DUF3330	Domain of unknown function (DUF3330). This family of proteins are functionally uncharacterized. This family is only found in bacteria.	69
403114	pfam11810	DUF3332	Domain of unknown function (DUF3332). This family of proteins are functionally uncharacterized. This family is only found in bacteria.	160
288644	pfam11811	DUF3331	Domain of unknown function (DUF3331). This family of proteins are functionally uncharacterized. This family is only found in bacteria. Proteins in this family vary in length from 96 to 160 amino acids.	90
403115	pfam11812	DUF3333	Domain of unknown function (DUF3333). This family of proteins are functionally uncharacterized. This family is only found in bacteria. This presumed domain is typically between 116 to 159 amino acids in length.	150
314646	pfam11813	DUF3334	Protein of unknown function (DUF3334). This family of proteins are functionally uncharacterized. This family is only found in bacteria. Proteins in this family are typically between 227 to 238 amino acids in length.	226
403116	pfam11814	DUF3335	Peptidase_C39 like family. 	206
403117	pfam11815	DUF3336	Domain of unknown function (DUF3336). This family of proteins are functionally uncharacterized. This family is found in bacteria and eukaryotes. This presumed domain is typically between 143 to 227 amino acids in length.	139
403118	pfam11816	DUF3337	Domain of unknown function (DUF3337). This family of proteins are functionally uncharacterized. This family is only found in eukaryotes. This presumed domain is typically between 285 to 342 amino acids in length.	168
371742	pfam11817	Foie-gras_1	Foie gras liver health family 1. Mutating the gene foie gras in zebrafish has been shown to affect development; the mutants develop large, lipid-filled hepatocytes in the liver, resembling those in individuals with fatty liver disease. Foie-gras protein is long and has several well-defined domains though none of them has a known function. We have annotated this one as the first. The C-terminus of this region contains TPR repeats.	262
403119	pfam11818	DUF3340	C-terminal domain of tail specific protease (DUF3340). This presumed domain is found at the C-terminus of tail specific proteases. Its function is unknown. This family is found in bacteria and eukaryotes. This presumed domain is typically between 88 to 187 amino acids in length.	149
403120	pfam11819	DUF3338	Domain of unknown function (DUF3338). This family of proteins are functionally uncharacterized. This family is found in eukaryotes. This presumed domain is about 130 amino acids in length.	135
403121	pfam11820	DUF3339	Protein of unknown function (DUF3339). This family of proteins are functionally uncharacterized. This family is found in eukaryotes. Proteins in this family are about 70 amino acids in length.	66
403122	pfam11821	DUF3341	Protein of unknown function (DUF3341). This family of proteins are functionally uncharacterized. This family is found in bacteria. Proteins in this family are about 170 amino acids in length.	170
403123	pfam11822	DUF3342	Domain of unknown function (DUF3342). This family of proteins are functionally uncharacterized. This family is found in bacteria. The domain is a BTB-like domain.	97
403124	pfam11823	DUF3343	Protein of unknown function (DUF3343). This family of proteins are functionally uncharacterized. This protein is found in bacteria and archaea. Proteins in this family are typically between 78 to 102 amino acids in length.	63
403125	pfam11824	DUF3344	Protein of unknown function (DUF3344). This family of proteins are functionally uncharacterized. This protein is found in bacteria and archaea. Proteins in this family are typically between 367 to 1857 amino acids in length.	267
403126	pfam11825	Nuc_recep-AF1	Nuclear/hormone receptor activator site AF-1. Nuclear receptors (NRs) are a family of ligand-inducible transcription factors, and, like other transcription factors, they contain a distinct DNA binding domain that allows for target gene recognition and several activation domains that possess the ability to activate transcription. One of these activation domains is at the N-terminal, although there are two distinct motifs within this domain, between residues 20-36 and between 74 and the end of this domain, which are the binding regions. One of the co-activators is TIF1beta, which appears to bind at the first motif.	113
288659	pfam11826	DUF3346	Protein of unknown function (DUF3346). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 231 to 659 amino acids in length.	225
403127	pfam11827	DUF3347	Protein of unknown function (DUF3347). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 169 to 570 amino acids in length.	93
403128	pfam11828	DUF3348	Protein of unknown function (DUF3348). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 244 to 323 amino acids in length.	247
403129	pfam11829	DUF3349	Protein of unknown function (DUF3349). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 99 to 124 amino acids in length.	94
403130	pfam11830	DUF3350	Domain of unknown function (DUF3350). This domain is functionally uncharacterized. This domain is found in eukaryotes. This presumed domain is typically between 50 to 64 amino acids in length.	62
403131	pfam11831	Myb_Cef	pre-mRNA splicing factor component. This family is a region of the Myb-Related Cdc5p/Cef1 proteins, in fungi, and is part of the pre-mRNA splicing factor complex.	226
403132	pfam11832	DUF3352	Protein of unknown function (DUF3352). This family of proteins are functionally uncharacterized. This protein is found in bacteria and eukaryotes. Proteins in this family are typically between 538 to 575 amino acids in length.	529
403133	pfam11833	CPP1-like	Protein CHAPERONE-LIKE PROTEIN OF POR1-like. This entry includes proteins from bacteria and eukaryotes. The plant member, CHAPERONE-LIKE PROTEIN OF POR1 (CPP1), is an essential protein for chloroplast development and plays a role in the regulation of POR (light-dependent protochlorophyllide oxidoreductase) stability and function.	193
403134	pfam11834	KHA	KHA, dimerization domain of potassium ion channel. KHA is the tetramerisation domain of eukaryotic voltage-dependent potassium ion-channel proteins. In plants the domain lies at the C-terminus whereas in many chordates it lies at the N-terminus.	67
371753	pfam11835	DUF3355	Domain of unknown function (DUF3355). This domain is functionally uncharacterized. This domain is found in eukaryotes. This presumed domain is typically between 111 to 177 amino acids in length.	89
403135	pfam11836	Phage_TAC_11	Phage tail tube protein, GTA-gp10. This is a family of phage tail tube proteins.	98
403136	pfam11837	DUF3357	Domain of unknown function (DUF3357). This domain is functionally uncharacterized. This domain is found in eukaryotes. This presumed domain is typically between 96 to 119 amino acids in length.	108
403137	pfam11838	ERAP1_C	ERAP1-like C-terminal domain. This large domain is composed of 16 alpha helices organized as 8 HEAT-like repeats. This domain forms a concave face that faces towards the active site of the peptidase.	316
371756	pfam11839	Alanine_zipper	Alanine-zipper, major outer membrane lipoprotein. This is a family of a major outer membrane lipoprotein, OprL that is an alanine-zipper. Zipper motifs are a seven-repeat motif where the first and fourth positions are occupied by an aliphatic residue, usually a leucine. These residues are positioned on the outside of the coil such as to bind firmly to one or more monomers of the protein to create a triple or five-helical coiled-coil that probably forms a seam in a membrane.	69
403138	pfam11840	DUF3360	Protein of unknown function (DUF3360). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 489 to 517 amino acids in length.	485
403139	pfam11841	DUF3361	Domain of unknown function (DUF3361). This domain is functionally uncharacterized. This domain is found in eukaryotes and predominantly in ELMO (Elongation and Cell motility) proteins where it may play an important role in defining the functions of the ELMO family members and may be functionally linked to the ELMO domain in these proteins.	153
403140	pfam11842	DUF3362	Domain of unknown function (DUF3362). This domain is functionally uncharacterized. This domain is found in bacteria and archaea. This presumed domain is typically between 117 to 158 amino acids in length.	148
403141	pfam11843	DUF3363	Protein of unknown function (DUF3363). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 323 to 658 amino acids in length.	380
403142	pfam11844	DUF3364	Domain of unknown function (DUF3364). This domain is functionally uncharacterized. This domain is found in bacteria. This presumed domain is about 60 amino acids in length.	56
403143	pfam11845	DUF3365	Protein of unknown function (DUF3365). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 198 to 657 amino acids in length.	167
403144	pfam11846	Wzy_C_2	Virulence factor membrane-bound polymerase, C-terminal. Wzy is a membrane-bound polymerase of 12 TMs, found in Gram-positive bacteria such as Streptococcus pneumoniae. It forms part of the EPS or exopolysaccharide system. This family is the 6xTMs at the C-terminal end of the molecule. Wzy functions in polymerizing the oligosaccharide repeat subunits to form high-molecular-weight capsular polysaccharides. A contiguous membrane-bound flippase, Wzx, pfam01943, transports the repeat units to the external surface of the membrane. These polysaccharides form the capsule and their differing compositions contribute to the multitudinous pneumococcal capsular serotypes, all being structurally and antigenically different.	186
403145	pfam11847	DUF3367	Domain of unknown function (DUF3367). This domain is functionally uncharacterized. This domain is found in bacteria and archaea. This presumed domain is typically between 667 to 694 amino acids in length.	642
403146	pfam11848	DUF3368	Domain of unknown function (DUF3368). This domain is functionally uncharacterized. This domain is found in bacteria and archaea. This presumed domain is about 50 amino acids in length.	46
403147	pfam11849	DUF3369	Domain of unknown function (DUF3369). This domain is functionally uncharacterized. This domain is found in bacteria. This presumed domain is about 170 amino acids in length. The domain appears to be related to the GAF domain.	168
403148	pfam11850	DUF3370	Protein of unknown function (DUF3370). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 452 to 532 amino acids in length.	422
403149	pfam11851	DUF3371	Domain of unknown function (DUF3371). This domain is functionally uncharacterized. This domain is found in eukaryotes. This presumed domain is typically between 125 to 142 amino acids in length.	127
403150	pfam11852	DUF3372	Domain of unknown function (DUF3372). This domain is functionally uncharacterized. This domain is found in bacteria and eukaryotes. This presumed domain is about 170 amino acids in length.	167
371762	pfam11853	DUF3373	Protein of unknown function (DUF3373). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 472 to 574 amino acids in length.	405
403151	pfam11854	MtrB_PioB	Putative outer membrane beta-barrel porin, MtrB/PioB. MtrB-PioB is a family of bacterial putative outer membrane porins. This family is secreted as part of the pio (phototrophic iron oxidation) operon that has been found to couple the oxidation of ferrous iron [Fe(II)] to reductive CO2 fixation using light energy. PioABC is found in Rhodopseudomonas palustris and MtrB-PioB is likely to be a beta-barrel porin. Similar to other outer membrane porins, PioB and MtrB are predicted to have long loops protruding into the extracellular space and short turns on the periplasmic side.	640
403152	pfam11855	DUF3375	Protein of unknown function (DUF3375). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 479 to 499 amino acids in length.	469
371765	pfam11856	DUF3376	Protein of unknown function (DUF3376). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 770 to 1142 amino acids in length.	521
403153	pfam11857	DUF3377	Domain of unknown function (DUF3377). This domain is functionally uncharacterized. This domain is found in eukaryotes. This presumed domain is about 70 amino acids in length.	72
403154	pfam11858	DUF3378	Domain of unknown function (DUF3378). This domain is functionally uncharacterized. This domain is found in bacteria. This presumed domain is about 80 amino acids in length.	76
403155	pfam11859	DUF3379	Protein of unknown function (DUF3379). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 234 to 251 amino acids in length.	232
403156	pfam11860	Muraidase	N-acetylmuramidase. Endolysins are bacteriophage encoded proteins synthesized at the end of the lytic infection cycle. They degrade the peptidoglycan (PG) of the host bacterium to allow viral progeny release. This domain family is found in bacteria and viruses. It is also found associated with pfam01471. One of the family members is the modular Gp110 endolysin found in the Salmonella phage. This domain represents the catalytic region found in the C-terminal of Gp110. It has been demonstrated to have N-acetylmuramidase (lysozyme) activity cleaving the beta-(1,4) glycosidic bond between N-acetylmuramic acid and N-acetylglucosamine residues in the sugar backbone of the PG. Furthermore, sequence alignments containing this domain show that the Gp110 E101 residue is conserved (suggesting that it is the catalytic residue), and followed by serine (a common feature in lysozymes). The structure of endolysins varies depending on their origin. In general, most of the endolysins from phages infecting Gram-positive bacteria have a modular structure consisting of one or two N-terminal enzymatic active domains (EADs) and a C-terminal cell wall binding domain (CBD) separated by a short linker. In silico analysis indicates that this endolysin has a modular structure harboring this EAD family at the C-terminus and a PG_binding_1 CBD at the N-terminus.	173
403157	pfam11861	DUF3381	Domain of unknown function (DUF3381). This domain is functionally uncharacterized. This domain is found in eukaryotes. This presumed domain is typically between 156 to 174 amino acids in length. This domain is found associated with pfam07780, pfam01728.	146
403158	pfam11862	DUF3382	Domain of unknown function (DUF3382). This domain is functionally uncharacterized. This domain is found in bacteria. This presumed domain is about 100 amino acids in length. This domain is found associated with pfam02653.	97
403159	pfam11863	DUF3383	Protein of unknown function (DUF3383). This family of proteins are functionally uncharacterized. This protein is found in bacteria and viruses. Proteins in this family are typically between 356 to 501 amino acids in length.	493
403160	pfam11864	DUF3384	Domain of unknown function (DUF3384). This domain is functionally uncharacterized. This domain is found in eukaryotes. This presumed domain is typically between 422 to 486 amino acids in length. This domain is found associated with pfam02145.	407
403161	pfam11865	DUF3385	Domain of unknown function (DUF3385). This domain is functionally uncharacterized. This domain is found in eukaryotes. This presumed domain is typically between 160 to 172 amino acids in length. This domain is found associated with pfam00454, pfam02260, pfam02985, pfam02259 and pfam08771.	160
403162	pfam11866	DUF3386	Protein of unknown function (DUF3386). This family of proteins are functionally uncharacterized. This protein is found in bacteria and eukaryotes. Proteins in this family are about 220 amino acids in length.	211
403163	pfam11867	DUF3387	Domain of unknown function (DUF3387). This domain is functionally uncharacterized. This domain is found in bacteria and archaea. This presumed domain is typically between 255 to 340 amino acids in length. This domain is found associated with pfam04851, pfam04313.	331
314698	pfam11868	DUF3388	Protein of unknown function (DUF3388). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 261 to 275 amino acids in length. This protein is found associated with pfam01842.	190
403164	pfam11869	DUF3389	Protein of unknown function (DUF3389). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are about 80 amino acids in length.	75
403165	pfam11870	DUF3390	Domain of unknown function (DUF3390). This domain is functionally uncharacterized. This domain is found in bacteria. This presumed domain is about 90 amino acids in length. This domain is found associated with pfam02589. This domain is found on most LutB proteins in association with DUF162 and usually Fer4_8. The LutABC operon is involved in lactate-utilisation and is essential. Duf162, pfam02589, is over-represented in the human gut-microbiome.	86
403166	pfam11871	DUF3391	Domain of unknown function (DUF3391). This domain is functionally uncharacterized. This domain is found in bacteria. This presumed domain is typically between 122 to 139 amino acids in length. This domain is found associated with pfam01966.	136
403167	pfam11872	DUF3392	Protein of unknown function (DUF3392). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are about 110 amino acids in length.	103
403168	pfam11873	DUF3393	Domain of unknown function (DUF3393). This domain is functionally uncharacterized. This domain is found in bacteria. This presumed domain is typically between 188 to 206 amino acids in length. This domain is found associated with pfam01464.	161
403169	pfam11874	DUF3394	Domain of unknown function (DUF3394). This domain is functionally uncharacterized. This domain is found in bacteria. This presumed domain is about 190 amino acids in length. This domain is found associated with pfam06808.	180
403170	pfam11875	DUF3395	Domain of unknown function (DUF3395). This domain is functionally uncharacterized. This domain is found in eukaryotes. This presumed domain is typically between 147 to 176 amino acids in length. This domain is found associated with pfam00226.	144
403171	pfam11876	DUF3396	Protein of unknown function (DUF3396). This family of proteins are functionally uncharacterized. This protein is found in bacteria and viruses. Proteins in this family are typically between 302 to 382 amino acids in length.	205
403172	pfam11877	DUF3397	Protein of unknown function (DUF3397). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 114 to 128 amino acids in length.	112
403173	pfam11878	DUF3398	Domain of unknown function (DUF3398). This domain is functionally uncharacterized. This domain is found in eukaryotes. This presumed domain is about 100 amino acids in length.	111
403174	pfam11879	DUF3399	Domain of unknown function (DUF3399). This domain is functionally uncharacterized. This domain is found in eukaryotes. This presumed domain is about 100 amino acids in length. This domain is found associated with pfam02214, pfam00520.	103
403175	pfam11880	DUF3400	Domain of unknown function (DUF3400). This domain is functionally uncharacterized. This domain is found in bacteria. This presumed domain is about 50 amino acids in length. This domain is found associated with pfam02754, pfam02913, pfam01565.	45
403176	pfam11881	SPAR_C	C-terminal domain of SPAR protein. This domain is found at the C-terminus of many spine-associated Rap GTPase-activating (SPAR) proteins in eukaryotes. This domain is found associated with pfam02145, pfam00595. The exact function is not known.	242
403177	pfam11882	DUF3402	Domain of unknown function (DUF3402). This domain is functionally uncharacterized. This domain is found in eukaryotes. This presumed domain is typically between 350 to 473 amino acids in length. This domain is found associated with pfam07923.	429
403178	pfam11883	DUF3403	Domain of unknown function (DUF3403). This domain is functionally uncharacterized. This domain is found in eukaryotes. This presumed domain is about 50 amino acids in length. This domain is found associated with pfam00069, pfam08276, pfam00954, pfam01453.	47
403179	pfam11884	DUF3404	Domain of unknown function (DUF3404). This domain is functionally uncharacterized. This domain is found in bacteria. This presumed domain is about 260 amino acids in length. This domain is found associated with pfam02518, pfam00512.	259
403180	pfam11885	DUF3405	Protein of unknown function (DUF3405). This family of proteins are functionally uncharacterized. This protein is found in bacteria and eukaryotes. Proteins in this family are typically between 636 to 810 amino acids in length.	493
403181	pfam11886	TOC159_MAD	Translocase of chloroplast 159/132, membrane anchor domain. This is the membrane-anchor domain of translocase of chloroplast 159, TOC159/132. This domain is present in plants at the C-terminus of the GTPase, AIG1, pfam04548, and anchors the GTPase region to the outer membrane of the chloroplast. The domain may carry a very C-terminal sequence motif that resembles a transit peptide.	267
403182	pfam11887	Mce4_CUP1	Cholesterol uptake porter CUP1 of Mce4, putative. Mce4_CUP1 is a family of putative Mce4 transporters of cholesterol. The domain is found associated with pfam02470. The full TCDB classification for this family in conjunction with PF02470 is TC:3.A.1.27.4.	238
403183	pfam11888	DUF3408	Protein of unknown function (DUF3408). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 128 to 160 amino acids in length.	136
288721	pfam11889	DUF3409	Domain of unknown function (DUF3409). This domain is functionally uncharacterized. This domain is found in viruses. This presumed domain is about 60 amino acids in length. This domain is found associated with pfam00271, pfam05550, pfam05578.	56
403184	pfam11890	DUF3410	Domain of unknown function (DUF3410). This domain is functionally uncharacterized. This domain is found in bacteria. This presumed domain is about 90 amino acids in length. This domain is found associated with pfam02826, pfam00389. This domain has a conserved RRE sequence motif.	81
403185	pfam11891	RETICULATA-like	Protein RETICULATA-related. This entry represents RETICULATA and related proteins from plants. Arabidopsis RETICULATA protein is involved in differential development of bundle sheath and mesophyll cell chloroplasts.	177
403186	pfam11892	DUF3412	Domain of unknown function (DUF3412). This presumed domain is functionally uncharacterized. This domain is found in bacteria. This domain is about 120 amino acids in length. This domain is found associated with pfam03641.	121
403187	pfam11893	DUF3413	Domain of unknown function (DUF3413). This presumed domain is functionally uncharacterized. This domain is found in bacteria. This domain is about 250 amino acids in length. This domain is found associated with pfam00884.	246
403188	pfam11894	Nup192	Nuclear pore complex scaffold, nucleoporins 186/192/205. This is a family of eukaryotic nucleoporins of several different sizes. All of them are long and form the scaffold of the nuclear pore complex. Nup192 in particular modulates the permeability of the central channel of the nuclear pore complex (NPC).	1483
403189	pfam11895	Peroxidase_ext	Fungal peroxidase extension region. This region is found as an extension to a haem peroxidase domain in some fungi. This region is about 80 amino acids in length and forms an extended structure on the surface of the peroxidase domain pfam00141.	72
403190	pfam11896	DUF3416	Domain of unknown function (DUF3416). This presumed domain is functionally uncharacterized. This domain is found in bacteria and archaea. This domain is about 190 amino acids in length. This domain is found associated with pfam00128.	185
403191	pfam11897	DUF3417	Protein of unknown function (DUF3417). This family of proteins are functionally uncharacterized. This protein is found in bacteria and archaea. Proteins in this family are typically between 145 to 860 amino acids in length. This protein is found associated with pfam00343. This protein has a conserved AYF sequence motif.	109
403192	pfam11898	DUF3418	Domain of unknown function (DUF3418). This presumed domain is functionally uncharacterized. This domain is found in bacteria. This domain is typically between 582 to 594 amino acids in length. This domain is found associated with pfam07717, pfam00271, pfam04408.	587
403193	pfam11899	DUF3419	Protein of unknown function (DUF3419). This family of proteins are functionally uncharacterized. This protein is found in bacteria and eukaryotes. Proteins in this family are typically between 398 to 802 amino acids in length.	383
314729	pfam11900	DUF3420	Domain of unknown function (DUF3420). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is about 50 amino acids in length. This domain is found associated with pfam00023.	47
403194	pfam11901	DUF3421	Protein of unknown function (DUF3421). This family of proteins are functionally uncharacterized. This protein is found in bacteria and eukaryotes. Proteins in this family are typically between 119 to 296 amino acids in length.	114
403195	pfam11902	DUF3422	Protein of unknown function (DUF3422). This family of proteins are functionally uncharacterized. This protein is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 426 to 444 amino acids in length.	415
314732	pfam11903	ParD_like	ParD-like antitoxin of type II bacterial toxin-antitoxin system. ParD-like antitoxin is a family of archaeal and bacterial proteins of a type II bacterial toxin-antitoxin system. Many of the cognate toxins for these molecules fall into family ParE-like_toxin, pfam15781. Gene-pairs are expressed from the same operon, the toxin of the pair being expressed first, eg, for UniProtKB:Q3AQ93 and UniProtKB:Q3AQ94.	73
403196	pfam11904	GPCR_chapero_1	GPCR-chaperone. This domain, and the associated ANK family repeat pfam00023 domain, together act as a chaperone for biogenesis and folding of the DP receptor for prostaglandin D2.	300
403197	pfam11905	DUF3425	Domain of unknown function (DUF3425). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 120 to 143 amino acids in length.	128
403198	pfam11906	DUF3426	Protein of unknown function (DUF3426). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 262 to 463 amino acids in length.	147
403199	pfam11907	DUF3427	Domain of unknown function (DUF3427). This presumed domain is functionally uncharacterized. This domain is found in bacteria and archaea. This domain is typically between 243 to 275 amino acids in length. This domain is found associated with pfam04851, pfam00271.	281
403200	pfam11909	NdhN	NADH-quinone oxidoreductase cyanobacterial subunit N. The proton-pumping NADH:ubiquinone oxidoreductase catalyzes the electron transfer from NADH to ubiquinone linked with proton translocation across the membrane. It is the largest, most complex and least understood of the respiratory chain enzymes and is referred to as Complex I. The subunit composition of the enzyme varies between groups of organisms. Complex I originating from mammalian mitochondria contains 45 different proteins, whereas in bacteria, the corresponding complex NDH-1 consists of 14 different polypeptides. Homologs of these 14 proteins are found among subunits of the mitochondrial complex I, and therefore bacterial NDH-1 might be considered a model proton-pumping NADH dehydrogenase with a minimal set of subunits. Escherichia coli NDH-1 readily disintegrates into 3 subcomplexes: a water-soluble NADH dehydrogenase fragment (NuoE, -F, and -G), the connecting fragment (NuoB, -C, -D, and -I), and the membrane fragment (NuoA, -H, -J, -K, -L, -M, -N). In cyanobacteria and their descendants, the chloroplasts of green plants, the subunit composition of NDH-1 remains obscure. The genes for eleven subunits NdhA-NdhK, homologous to the NuoA-NuoD and NuoH-NuoN of the E. coli complex, have been found in the genome of Synechocystis sp. PCC 6803 which has a family of 6 ndhD genes and a family of 3 ndhF genes. Two reported multisubunit complexes, NDH-1L and NDH-1M, represent distinct NDH-1 complexes in the thylakoid membrane of Synechocystis 6803 -cyanobacterium. NDH-1L was shown to be essential for photoheterotrophic cell growth, whereas expression of NDH-1M was a prerequisite for CO2 uptake and played an important role in growth of cells at low CO2. Here we report the subunit composition of these two complexes. Fifteen proteins were discovered in NDH-1L including NdhL, a new component of the membrane fragment, and Ssl1690 (designated as NdhO), a novel peripheral subunit. The cyanobacterial NDH-1 complex contains additional subunits, NdhM and NdhN, compared with the minimal set of the bacterial enzyme and these seem to be specific for thylakoid-located NDH-1 of photosynthetic organisms.	150
403201	pfam11910	NdhO	Cyanobacterial and plant NDH-1 subunit O. The proton-pumping NADH:ubiquinone oxidoreductase catalyzes the electron transfer from NADH to ubiquinone linked with proton translocation across the membrane. It is the largest, most complex and least understood of the respiratory chain enzymes and is referred to as Complex I. The subunit composition of the enzyme varies between groups of organisms. Complex I originating from mammalian mitochondria contains 45 different proteins, whereas in bacteria, the corresponding complex NDH-1 consists of 14 different polypeptides. Homologs of these 14 proteins are found among subunits of the mitochondrial complex I, and therefore bacterial NDH-1 might be considered a model proton-pumping NADH dehydrogenase with a minimal set of subunits. Escherichia coli NDH-1 readily disintegrates into 3 subcomplexes: a water-soluble NADH dehydrogenase fragment (NuoE, -F, and -G), the connecting fragment (NuoB, -C, -D, and -I), and the membrane fragment (NuoA, -H, -J, -K, -L, -M, -N). In cyanobacteria and their descendants, the chloroplasts of green plants, the subunit composition of NDH-1 remains obscure. The genes for eleven subunits NdhA-NdhK, homologous to the NuoA-NuoD and NuoH-NuoN of the E. coli complex, have been found in the genome of Synechocystis sp. PCC 6803 which has a family of 6 ndhD genes and a family of 3 ndhF genes. Two reported multisubunit complexes, NDH-1L and NDH-1M, represent distinct NDH-1 complexes in the thylakoid membrane of Synechocystis 6803 -cyanobacterium. NDH-1L was shown to be essential for photoheterotrophic cell growth, whereas expression of NDH-1M was a prerequisite for CO2 uptake and played an important role in growth of cells at low CO2. Here we report the subunit composition of these two complexes. Fifteen proteins were discovered in NDH-1L including NdhL, a new component of the membrane fragment, and Ssl1690 (designated as NdhO), a novel peripheral subunit. The three nuclear-encoded subunits NdhM, NdhN and NdhO are vital for the functional integrity of the plastidial complex.	65
403202	pfam11911	DUF3429	Protein of unknown function (DUF3429). This family of proteins are functionally uncharacterized. This protein is found in bacteria and eukaryotes. Proteins in this family are typically between 147 to 245 amino acids in length.	137
256719	pfam11912	DUF3430	Protein of unknown function (DUF3430). This family of proteins are functionally uncharacterized. This protein is found in eukaryotes. Proteins in this family are typically between 209 to 265 amino acids in length.	204
403203	pfam11913	DUF3431	Protein of unknown function (DUF3431). This family of proteins are functionally uncharacterized. This protein is found in eukaryotes. Proteins in this family are typically between 291 to 390 amino acids in length. This protein has a conserved NLRC sequence motif.	211
403204	pfam11914	DUF3432	Domain of unknown function (DUF3432). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is about 100 amino acids in length. This domain is found associated with pfam00096. This domain has two conserved sequence motifs: YPSPV and PSP.	98
403205	pfam11915	DUF3433	Protein of unknown function (DUF3433). This is a family of functionally uncharacterized proteins. The family is found in eukaryotes, and represents the conserved central region of the member proteins.	91
403206	pfam11916	Vac14_Fig4_bd	Vacuolar protein 14 C-terminal Fig4p binding. Vac14 is a scaffold for the Fab1 kinase complex, a complex that allows for the dynamic interconversion of PI3P and PI(3,5)P2p (phosphoinositide phosphate (PIP) lipids, that are generated transiently on the cytoplasmic face of selected intracellular membranes). This interconversion is regulated by at least five proteins in yeast: the lipid kinase Fab1p, lipid phosphatase Fig4p, the Fab1p activator Vac7p, the Fab1p inhibitor Atg18p, and Vac14p, a protein required for the activity of both Fab1p and Fig4p. The C-terminal region of Vac14 binds to Fig4p. The full length Vac14 in yeasts is likely to be a protein carrying a succession of HEAT repeats, most of which have now degenerated. This regulatory system is crucial for the proper functioning of the mammalian nervous system.	179
371796	pfam11917	DUF3435	Protein of unknown function (DUF3435). This family of proteins are functionally uncharacterized. This protein is found in eukaryotes. Proteins in this family are typically between 435 to 791 amino acids in length. This family is related to pfam00589 suggesting it may be an integrase enzyme.	418
403207	pfam11918	Peptidase_S41_N	N-terminal domain of Peptidase_S41 in eukaryotic IRBP. Peptidase_S41_N is a family found at the N-terminus of the functional unit of interphotoreceptor retinoid binding proteins 3, IRBP, in eukaryotes. From the structure 1j7x, the domain forms the N-terminal end of the module which is characterized as a serine-peptidase, pfam03572. Peptidase_S41_N forms a three-helix bundle followed by a small beta strand and is termed domain A. Part of the peptidase domain folds back over domain A to create a largely hydrophobic cleft between the two domains. On binding of ligand, domain A is structurally rearranged with respect to domain B.	129
403208	pfam11919	DUF3437	Domain of unknown function (DUF3437). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 142 to 163 amino acids in length.	86
403209	pfam11920	DUF3438	Protein of unknown function (DUF3438). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 276 to 307 amino acids in length.	289
288750	pfam11921	DUF3439	Domain of unknown function (DUF3439). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 46 to 94 amino acids in length. This domain is found associated with pfam01462, pfam00560.	122
403210	pfam11922	DUF3440	Domain of unknown function (DUF3440). This presumed domain is functionally uncharacterized. This domain is found in bacteria. This domain is typically between 53 to 190 amino acids in length. This domain is found associated with pfam01507. This domain has a conserved KND sequence motif.	181
403211	pfam11923	DUF3441	Domain of unknown function (DUF3441). This presumed domain is functionally uncharacterized. This domain is found in archaea and eukaryotes. This domain is typically between 104 to 119 amino acids in length. This domain is found associated with pfam05833, pfam05670. This domain has two conserved residues (P and G) that may be functionally important.	104
403212	pfam11924	IAT_beta	Inverse autotransporter, beta-domain. This is a family of beta-barrel porin-like outer membrane proteins from enteropathogenic Gram-negative bacteria. Intimins and invasins are virulence factors produced by pathogenic Gram-negative bacteria. They carry C-terminal extracellular passenger domains that are involved in adhesion to host cells and N-terminal beta domains that are embedded in the outer membrane. This family represents the beta-barrel porin-like domain in the outer membrane that can be found in intimins, invasins and some inverse autotransporters.	276
314750	pfam11925	DUF3443	Protein of unknown function (DUF3443). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 400 to 434 amino acids in length. This protein has two conserved sequence motifs: NPV and DNNG.	365
403213	pfam11926	DUF3444	Domain of unknown function (DUF3444). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is about 210 amino acids in length. This domain is found associated with pfam00226. This domain has two conserved sequence motifs: FSH and FSH.	210
403214	pfam11927	DUF3445	Protein of unknown function (DUF3445). This family of proteins are functionally uncharacterized. This protein is found in bacteria and eukaryotes. Proteins in this family are typically between 264 to 418 amino acids in length. This protein has a conserved RLP sequence motif. This protein has two completely conserved R residues that may be functionally important.	231
403215	pfam11928	DUF3446	Domain of unknown function (DUF3446). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 80 to 99 amino acids in length. This domain is found associated with pfam00096. This domain has a single completely conserved residue P that may be functionally important.	76
371804	pfam11929	DUF3447	Domain of unknown function (DUF3447). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is about 80 amino acids in length. This domain is found associated with pfam00023. This domain has a conserved SHN sequence motif. It seems likely that this region represents divergent Ankyrin repeats.	76
403216	pfam11931	DUF3449	Domain of unknown function (DUF3449). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 181 to 207 amino acids in length. This domain has two conserved sequence motifs: PIP and CEICG. The domain carries a zinc-finger domain of the C2H2-type.	191
403217	pfam11932	DUF3450	Protein of unknown function (DUF3450). This family of proteins are functionally uncharacterized. This protein is found in bacteria and eukaryotes. Proteins in this family are about 260 amino acids in length.	238
403218	pfam11933	Na_trans_cytopl	Cytoplasmic domain of voltage-gated Na+ ion channel. This is a large cytoplasmic domain towards the start of voltage-dependent sodium ion channel proteins in eukaryotes. It is found closely associated with pfam06512 and pfam00520.	205
403219	pfam11934	DUF3452	Domain of unknown function (DUF3452). This presumed domain is functionally uncharacterized. This domain is found in bacteria and eukaryotes. This domain is typically between 124 to 150 amino acids in length. This domain is found associated with pfam01858, pfam01857. This domain has a single completely conserved residue W that may be functionally important.	131
403220	pfam11935	DUF3453	Domain of unknown function (DUF3453). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 239 to 261 amino acids in length.	217
403221	pfam11936	DUF3454	Domain of unknown function (DUF3454). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is about 60 amino acids in length. This domain is found associated with pfam00066, pfam00008, pfam06816, pfam07684, pfam00023.	63
314761	pfam11937	DUF3455	Protein of unknown function (DUF3455). This family of proteins are functionally uncharacterized. This protein is found in bacteria and eukaryotes. Proteins in this family are typically between 174 to 251 amino acids in length.	142
403222	pfam11938	DUF3456	TLR4 regulator and MIR-interacting MSAP. This family of proteins, found from plants to humans, is PRAT4 (A and B), a Protein Associated with Toll-like receptor 4. The Toll family of receptors - TLRs - plays an essential role in innate recognition of microbial products, the first line of defense against bacterial infection. PRAT4A influences the subcellular distribution and the strength of TLR responses and alters the relative activity of each TLR. PRAT4B regulates TLR4 trafficking to the cell surface and the extent of its expression there. TLR4 recognizes lipopolysaccharide (LPS), one of the most immuno-stimulatory glycolipids constituting the outer membrane of the Gram-negative bacteria. This family has also been described as a SAP-like MIR-interacting protein family.	137
403223	pfam11939	NiFe-hyd_HybE	[NiFe]-hydrogenase assembly, chaperone, HybE. Members of this family are chaperones for the assembly of [NiFe] hydrogenases, in the family of HybE, which is specific for hydrogenase-2 of Escherichia coli. Members often have an additional N-terminal rubredoxin domain.	147
403224	pfam11940	DUF3458	Domain of unknown function (DUF3458) Ig-like fold. This presumed domain is functionally uncharacterized. This domain is found in bacteria, archaea and eukaryotes. The domain has an Ig-like fold. This domain is found associated with pfam01433.	95
403225	pfam11941	DUF3459	Domain of unknown function (DUF3459). This presumed domain is functionally uncharacterized. This domain is found in bacteria. This domain is about 110 amino acids in length. This domain is found associated with pfam00128, pfam02922.	92
403226	pfam11942	Spt5_N	Spt5 transcription elongation factor, acidic N-terminal. This is the very acidic N-terminal region of the early transcription elongation factor Spt5. The Spt5-Spt4 complex regulates early transcription elongation by RNA polymerase II and has an imputed role in pre-mRNA processing via its physical association with mRNA capping enzymes. The actual function of this N-terminal domain is not known although it is dispensable for binding to Spt4.	97
403227	pfam11943	DUF3460	Protein of unknown function (DUF3460). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are about 70 amino acids in length. This protein has a conserved WDK sequence motif.	58
403228	pfam11944	DUF3461	Protein of unknown function (DUF3461). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are about 130 amino acids in length. This protein has two conserved sequence motifs: KFK and HLE.	124
403229	pfam11945	WASH_WAHD	WAHD domain of WASH complex. This domain forms part of the WASH-complex of domains and proteins that activates the Arp2/3 complex, see pfam04062. The Arp2/3 complex regulates endocytosis, sorting, and trafficking within the cell. The WAHD domain attaches to the FAM21 proteins via its N-terminal residues and to the microtubules via its C-terminal residues.	286
403230	pfam11946	DUF3463	Domain of unknown function (DUF3463). This presumed domain is functionally uncharacterized. This domain is found in bacteria and archaea. This domain is about 140 amino acids in length. This domain is found associated with pfam04055. This domain has two conserved sequence motifs: CTPWG and PCYL, plus a highly conserved CxxCxxHC motif.	134
403231	pfam11947	DUF3464	Protein of unknown function (DUF3464). This family of proteins are functionally uncharacterized. This protein is found in bacteria and eukaryotes. Proteins in this family are typically between 137 to 196 amino acids in length.	150
403232	pfam11948	DUF3465	Protein of unknown function (DUF3465). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 131 to 151 amino acids in length. This protein has a conserved HWTH sequence motif.	124
403233	pfam11949	DUF3466	Protein of unknown function (DUF3466). This family of proteins are functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 564 to 612 amino acids in length.	604
403234	pfam11950	DUF3467	Protein of unknown function (DUF3467). This family of proteins are functionally uncharacterized. This protein is found in bacteria, archaea and viruses. Proteins in this family are typically between 101 to 118 amino acids in length.	92
403235	pfam11951	Fungal_trans_2	Fungal specific transcription factor domain. Members of this family are likely to be transcription factors. This protein is found in fungi. Proteins in this family are typically between 454 to 826 amino acids in length. This protein is found associated with pfam00172.	384
403236	pfam11952	XTBD	XRN-Two Binding Domain, XTBD. XTBD is a family of eukaryotic proteins that act as an XRN2-binding module. XRN2 is an essential exoribonuclease in eukaryotes that processes and degrades a number of different substrates. XTBD is found on a number of different proteins to link them to XRN, such as PAXT-1.	85
403237	pfam11953	DUF3470	Domain of unknown function (DUF3470). This presumed domain is functionally uncharacterized. This domain is found in bacteria. This domain is about 50 amino acids in length. This domain is found associated with pfam00037. This domain has a single completely conserved residue N that may be functionally important.	42
403238	pfam11954	DUF3471	Domain of unknown function (DUF3471). This presumed domain is functionally uncharacterized. This domain is found in bacteria, archaea and eukaryotes. This domain is typically between 98 to 114 amino acids in length. This domain is found associated with pfam00144.	94
403239	pfam11955	PORR	Plant organelle RNA recognition domain. This family, which was previously known as DUF860, has been shown to be a component of group II intron ribonucleoprotein particles in maize chloroplasts. The domain is required for the splicing of the introns with which it associates, and promotes splicing in the context of a heterodimer with the RNase III-domain protein RNC1. All of the members are predicted to localize to mitochondria or chloroplasts. It seems likely that most PORR proteins function in organellar RNA metabolism.	328
403240	pfam11956	KCNQC3-Ank-G_bd	Ankyrin-G binding motif of KCNQ2-3. Interactions with ankyrin-G are crucial to the localization of voltage-gated sodium channels (VGSCs) at the axon initial segment and for neurons to initiate action potentials. This conserved 9-amino acid motif ((V/A)P(I/L)AXXE(S/D)D) is required for ankyrin-G binding and functions to localize sodium channels to a variety of 'excitable' membrane domains both inside and outside of the nervous system. This motif has also been identified in the potassium channel 6TM proteins KCNQ2 and KCNQ3, that correspond to the M channels that exert a crucial influence over neuronal excitability. KCNQ2/KCNQ3 channels are preferentially localized to the surface of axons both at the axonal initial segment and more distally, and this axonal initial segment targeting of surface KCNQ channels is mediated by these ankyrin-G binding motifs of KCNQ2 and KCNQ3. KCNQ3 is a major determinant of M channel localization to the AIS, rather than KCNQ2. Phylogenetic analysis reveals that anchor motifs evolved sequentially in chordates (NaV channel) and jawed vertebrates (KCNQ2/3).	101
403241	pfam11957	efThoc1	THO complex subunit 1 transcription elongation factor. The THO complex plays a role in coupling transcription elongation to mRNA export. It is composed of subunits THP2, HPR1, THO2 and MFT1. The THO complex is a nuclear complex that is required for transcription elongation through genes containing tandemly repeated DNA sequences. The THO complex is also part of the TREX (TRanscription EXport) complex that is involved in coupling transcription to export of mRNAs to the cytoplasm.	490
403242	pfam11958	DUF3472	Domain of unknown function (DUF3472). This presumed domain is functionally uncharacterized. This domain is found in bacteria, eukaryotes and viruses. This domain is typically between 174 to 190 amino acids in length. This domain has a single completely conserved residue G that may be functionally important.	173
403243	pfam11959	DUF3473	Domain of unknown function (DUF3473). This presumed domain is functionally uncharacterized. This domain is found in bacteria and archaea. This domain is about 130 amino acids in length. This domain is found associated with pfam01522. This domain has two completely conserved residues (P and H) that may be functionally important.	130
403244	pfam11960	DUF3474	Domain of unknown function (DUF3474). This presumed domain is functionally uncharacterized. This domain is found in bacteria and eukaryotes. This domain is typically between 126 to 140 amino acids in length. This domain is found associated with pfam00487.	127
403245	pfam11961	DUF3475	Domain of unknown function (DUF3475). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is about 60 amino acids in length. This domain is found associated with pfam05003.	57
371823	pfam11962	Peptidase_G2	Peptidase_G2, IMC autoproteolytic cleavage domain. This domain is found at the very C-terminus of bacteriophage parallel beta-helical tailspike proteins. It carries the enzymic residues that induce autoproteolytic cleavage to bring about maturation of the folding process of the helix in a chaperone-like manner. The domain thus mediates the assembly of a large tailspike protein and then releases itself after maturation. These C-terminal regions that autoproteolytically release themselves after maturation are exchangeable between functionally unrelated N-terminal proteins and have been identified in a number of bacteriophage tailspike proteins.	221
152398	pfam11963	DUF3477	Protein of unknown function (DUF3477). This family of proteins is functionally uncharacterized. This protein is found in viruses. Proteins in this family are typically between 246 to 7162 amino acids in length. This protein is found associated with pfam08716, pfam01661, pfam05409, pfam08717, pfam01831, pfam08715, pfam08710.	355
403246	pfam11964	SpoIIAA-like	SpoIIAA-like. These proteins adopt an alpha/beta SpoIIAA-like fold, similar to that found in STAS (pfam01740). They adopt open and closed conformations arising from different arrangements of their alpha-2 and alpha-3 helices. They may be membrane associated and may function as carriers of non-polar compounds.	104
403247	pfam11965	DUF3479	Domain of unknown function (DUF3479). This presumed domain is functionally uncharacterized. This domain is found in bacteria, archaea and eukaryotes. This domain is about 160 amino acids in length. This domain is found associated with pfam02514.	159
403248	pfam11966	SSURE	Fibronectin-binding repeat. Streptococcal surface repeat domain - SSURE - is a protein fragment found to bind to extracellular matrix protein fibronectin but not to collagen or submaxillary mucin in Streptococci. Anti-SSURE antibodies recognized the corresponding protein on the surface of streptococcal cells. The full-length proteins are thus fibronectin-binding surface adhesins.	149
403249	pfam11967	RecO_N	Recombination protein O N terminal. Recombination protein O (RecO) is involved in DNA repair and pfam00470 pathway recombination. This domain forms a beta barrel structure.	80
371825	pfam11968	Bmt2	25S rRNA (adenine(2142)-N(1))-methyltransferase, Bmt2. This entry represents Bmt2 and its homologues. In Saccharomyces cerevisiae, Bmt2 is a nucleolar S-adenosylmethionine-dependent rRNA methyltransferase that is responsible for the N-1-methyl-adenosine base modification of 25S rRNA. It specifically methylates the N1 position of adenine 2142 in 25S rRNA.	221
403250	pfam11969	DcpS_C	Scavenger mRNA decapping enzyme C-term binding. This family consists of several scavenger mRNA decapping enzymes (DcpS) and is the C-terminal region. DcpS is a scavenger pyrophosphatase that hydrolyzes the residual cap structure following 3' to 5' decay of an mRNA. The association of DcpS with 3' to 5' exonuclease exosome components suggests that these two activities are linked and there is a coupled exonucleolytic decay-dependent decapping pathway. The C-terminal domain contains a histidine triad (HIT) sequence with three histidines separated by hydrophobic residues. The central histidine within the DcpS HIT motif is critical for decapping activity and defines the HIT motif as a new mRNA decapping domain, making DcpS the first member of the HIT family of proteins with a defined biological function.	114
371826	pfam11970	GPR_Gpa2_C	G protein-coupled glucose receptor regulating Gpa2 C-term. GPR1 is one of six proteins required for glucose-triggered adenylate cyclase activation, and is a G protein-coupled receptor responsible for the activation of adenylate cyclase through Gpa2 - heterotrimeric G protein alpha subunit, part of the glucose-detection pathway. The protein contains seven predicted transmembrane domains, a third cytoplasmic loop and a cytoplasmic tail. This family is the conserved C-terminal domain of the member proteins.	76
314792	pfam11971	CAMSAP_CH	CAMSAP CH domain. This domain is the N-terminal CH domain from the CAMSAP proteins.	85
403251	pfam11972	HTH_13	HTH DNA binding domain. This is a helix-turn-helix DNA binding domain.	54
403252	pfam11973	NQRA_SLBB	NQRA C-terminal domain. This family consists of the C-terminal domain of several bacterial Na(+)-translocating NADH-quinone reductase subunit A (NQRA) proteins. The Na(+)-translocating NADH: ubiquinone oxidoreductase (Na(+)-NQR) generates an electrochemical Na(+) potential driven by aerobic respiration.	51
403253	pfam11974	MG1	Alpha-2-macroglobulin MG1 domain. This is the N-terminal MG1 domain from alpha-2-macroglobulin.	102
403254	pfam11975	Glyco_hydro_4C	Family 4 glycosyl hydrolase C-terminal domain. 	168
403255	pfam11976	Rad60-SLD	Ubiquitin-2 like Rad60 SUMO-like. The small ubiquitin-related modifier SUMO-1 is a Ub/Ubl family member, and although SUMO-1 shares structural similarity to Ub, SUMO's cellular functions remain distinct insomuch as SUMO modification alters protein function through changes in activity, cellular localization, or by protecting substrates from ubiquitination. Rad60 family members contain functionally enigmatic, integral SUMO-like domains (SLDs). Despite their divergence from SUMO, each Rad60 SLD interacts with a subset of SUMO pathway enzymes: SLD2 specifically binds the SUMO E2 conjugating enzyme (Ubc9), whereas SLD1 binds the SUMO E1 (Fub2, also called Uba2) activating and E3 (Pli1, also called Siz1 and Siz2) specificity enzymes. Structural analysis of Structure 2uyz reveals a mechanistic basis for the near-synonymous roles of Rad60 and SUMO in survival of genotoxic stress and suggests unprecedented DNA-damage-response functions for SLDs in regulating SUMOylation. The Rad60 branch of this family is also known as RENi (Rad60-Esc2-Nip45), and biologically it should be two distinct families SUMO and RENi (Rad60-Esc2-Nip45).	72
403256	pfam11977	RNase_Zc3h12a	Zc3h12a-like Ribonuclease NYN domain. This domain is found in the Zc3h12a protein which has been shown to be a ribonuclease that controls the stability of a set of inflammatory genes. It has been suggested that this domain belongs to the PIN domain superfamily. This domain has also been identified as part of the NYN domain family.	154
403257	pfam11978	MVP_shoulder	Shoulder domain. This domain is found in the Major Vault Protein and has been called the shoulder domain. This family includes two bacterial proteins, suggesting that some bacteria may possess vault particles.	117
403258	pfam11979	DUF3480	Domain of unknown function (DUF3480). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 350 to 362 amino acids in length. This domain is found associated with pfam01363.	353
403259	pfam11980	DUF3481	C-terminal domain of neuropilin glycoprotein. This domain is found in eukaryotes at the C-terminus of neuropilins. It represents the transmembrane region of these transmembrane glycoproteins, that are predominantly co-receptors for another class of proteins known as semaphorins. The domain is found associated with pfam00754, pfam00431, pfam00629.	82
403260	pfam11981	DUF3482	Domain of unknown function (DUF3482). This presumed domain is functionally uncharacterized. This domain is found in bacteria and eukaryotes. This domain is typically between 289 to 301 amino acids in length. This domain is found associated with pfam01926. The central region of these proteins contains a hydrophobic region that is similar to pfam05433.	290
403261	pfam11982	DUF3483	Domain of unknown function (DUF3483). This presumed domain is functionally uncharacterized. This domain is found in bacteria. This domain is about 230 amino acids in length. This domain is found associated with pfam02754.	217
403262	pfam11983	DUF3484	Membrane-attachment and polymerization-promoting switch. This family is the C-terminal region of essential streptococcal FtsA proteins and their homologs. It acts as an intra-molecular switch, triggered by ATP, to promote polymerization of the whole protein and to attach it to the membrane. FtsA is essential for the formation of the septum that divides fully-grown cells into two daughter cells at cell-division. FtsA anchors the constricting FtsZ ring to the membrane.	65
403263	pfam11984	DUF3485	Protein of unknown function (DUF3485). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 223 to 526 amino acids in length. This protein is found associated with pfam09721.	195
403264	pfam11985	DUF3486	Protein of unknown function (DUF3486). This family of proteins is functionally uncharacterized. This protein is found in bacteria and viruses. Proteins in this family are about 190 amino acids in length.	179
288812	pfam11986	PB1-F2	Influenza A Proapoptotic protein. PB1-F2 is a protein found in almost all known strains of Influenza A virus - a negative sense ssRNA Orthomyxovirus. It originates from translation of the viral polymerase gene in an alternative reading frame. PB1-F2 consists of two independent structural domains, two closely neighboring short helices at the N-terminus, and an extended C-terminal helix. Although the protein has originally been described to induce apoptosis, it has now been shown that PB1-F2 more likely acts as an apoptosis promoter in concert with other apoptosis-inducing agents. PB1-F2 promotes apoptosis by localising to the mitochondria where it destabilizes the membrane. This will cause release of cytochrome C which activates the caspase cascade of apoptosis through the endogenous pathway. In this way it acts like the Bcl-2 protein family which are physiological apoptotic regulators in cells.	87
403265	pfam11987	IF-2	Translation-initiation factor 2. IF-2 is a translation initiator in each of the three main phylogenetic domains (Eukaryotes, Bacteria and Archaea). IF2 interacts with formylmethionine-tRNA, GTP, IF1, IF3 and both ribosomal subunits. Through these interactions, IF2 promotes the binding of the initiator tRNA to the A site in the smaller ribosomal subunit and catalyzes the hydrolysis of GTP following initiation-complex formation.	105
403266	pfam11988	Dsl1_N	Retrograde transport protein Dsl1 N terminal. Dsl1 is a peripheral membrane protein required for transport between the Golgi and the endoplasmic reticulum. It is localized to the ER membrane, and in vitro it specifically binds to coatomer, the major component of the protein coat of COPI vesicles. It is comprised primarily of alpha helical bundles. It complexes with another subunit of the Dsl1p complex called Tip20 which forms heterodimers by pairing the N termini of each protein. A central disorganized region between the N and C termini of Dsl1 contains binding sites for coatomer. The C-terminus of Dsl1 contains a binding site to the Sec39 subunit of the Dsl1p complex.	350
371836	pfam11989	Dsl1_C	Retrograde transport protein Dsl1 C terminal. Dsl1 is a peripheral membrane protein required for transport between the Golgi and the endoplasmic reticulum. It is localized to the ER membrane, and in vitro it specifically binds to coatomer, the major component of the protein coat of COPI vesicles. Binding sites for coatomer are found on a disorganized region between the C and N termini of Dsl1. The C terminal domain is involved in binding to the Sec39 subunit of the Dsl1p complex. The N terminal complexes with another subunit of the Dsl1p complex called Tip20 which forms heterodimers by pairing the N termini of each protein.	194
403267	pfam11990	DUF3487	Protein of unknown function (DUF3487). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 121 to 136 amino acids in length. This protein has a conserved RLN sequence motif.	114
403268	pfam11991	Trp_DMAT	Tryptophan dimethylallyltransferase. This family of proteins represents tryptophan dimethylallyltransferase (EC:2.5.1.34), which catalyzes the first step of ergot alkaloid biosynthesis. Ergot alkaloids, which are produced by endophyte fungi, can enhance plant host fitness, but can also cause toxicosis in livestock that feed on the host plants. This protein is found in bacteria and eukaryotes. Proteins in this family are typically between 390 to 465 amino acids in length.	356
403269	pfam11992	DUF3488	Domain of unknown function (DUF3488). This presumed domain is functionally uncharacterized. This domain is found in bacteria. This domain is typically between 323 to 339 amino acids in length. This domain is found associated with pfam01841. This domain has a conserved PLW sequence motif. This domain contains 6 transmembrane helices.	339
403270	pfam11993	Ribosomal_S4Pg	Ribosomal S4P (gammaproteobacterial). This family of proteins are ribosomal SSU S4 p proteins. This protein is found in gamma-proteobacteria. Proteins in this family are typically between 162 to 178 amino acids in length.	158
403271	pfam11994	DUF3489	Protein of unknown function (DUF3489). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 84 to 211 amino acids in length. This protein has a single completely conserved residue W that may be functionally important.	68
403272	pfam11995	DUF3490	Domain of unknown function (DUF3490). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is about 160 amino acids in length. This domain is found associated with pfam00225. This domain has two conserved sequence motifs: EVE and ESA.	161
314813	pfam11996	DUF3491	Protein of unknown function (DUF3491). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 286 to 3225 amino acids in length. This protein is found associated with pfam04488.	946
403273	pfam11997	DUF3492	Domain of unknown function (DUF3492). This presumed domain is functionally uncharacterized. This domain is found in bacteria, archaea and eukaryotes. This domain is typically between 259 to 282 amino acids in length. This domain is found associated with pfam00534. This domain has two conserved sequence motifs: GGVS and EHGIY.	278
403274	pfam11998	DUF3493	Protein of unknown function (DUF3493). This family of proteins is functionally uncharacterized. This protein is found in bacteria and eukaryotes. Proteins in this family are typically between 79 to 331 amino acids in length.	73
403275	pfam11999	DUF3494	Protein of unknown function (DUF3494). This family of proteins is functionally uncharacterized. This protein is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 243 to 678 amino acids in length. This protein has a single completely conserved residue G that may be functionally important.	202
403276	pfam12000	Glyco_trans_4_3	Glycosyl transferase family 4 group. This domain is found associated with pfam00534.	168
403277	pfam12001	DUF3496	Domain of unknown function (DUF3496). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is about 110 amino acids in length.	109
403278	pfam12002	MgsA_C	MgsA AAA+ ATPase C terminal. The MgsA protein possesses DNA-dependent ATPase and ssDNA annealing activities. MgsA contributes to the recovery of stalled replication forks and therefore prevents genomic instability caused by aberrant DNA replication. Additionally, MgsA may play a role in chromosomal segregation. This is consistent with a report that MgsA co-localizes with the replisome and affects chromosome segregation. This domain represents the C terminal region of MgsA.	158
403279	pfam12004	DUF3498	Domain of unknown function (DUF3498). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 433 to 538 amino acids in length. This domain is found associated with pfam00616, pfam00168. This domain has two conserved sequence motifs: DLQ and PLSFQNP.	465
403280	pfam12005	DUF3499	Protein of unknown function (DUF3499). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 125 to 163 amino acids in length.	119
403281	pfam12006	DUF3500	Protein of unknown function (DUF3500). This family of proteins is functionally uncharacterized. This protein is found in bacteria and eukaryotes. Proteins in this family are typically between 335 to 438 amino acids in length. This protein has a conserved GHH sequence motif. This protein has two completely conserved G residues that may be functionally important.	294
403282	pfam12007	DUF3501	Protein of unknown function (DUF3501). This family of proteins is functionally uncharacterized. This protein is found in bacteria and archaea. Proteins in this family are about 200 amino acids in length. The structure of protein BPSS1837 from B. pseudomallei has been solved. This protein contains two domains, domain I (1:31, 46:81) is a helical domain, domain II (32:45,82-193) is a mainly beta protein with a beta barrel. According to crystal contacts the protein probably functions as a dimer. The gene neighborhood analysis suggests that this protein may be functionally related to rubrerythrin and ferredoxin. The wedge surface between the two domains might be functionally important. The fold of this protein could best be described as a circularly permuted C2-like fold (details derived from TOPSAN).	187
403283	pfam12008	EcoR124_C	Type I restriction and modification enzyme - subunit R C terminal. This enzyme has been characterized and shown to belong to a new family of the type I class of restriction and modification enzymes. This family is involved in bacterial defense by making double strand breaks in specific double stranded DNA sequences, e.g. that of invading bacteriophages. EcoR124 is made up of three subunits, HsdR, HsdS and HsdM. The R subunit has ATPase and restriction endonuclease activity. This domain is the C terminal of the R subunit.	232
403284	pfam12009	Telomerase_RBD	Telomerase ribonucleoprotein complex - RNA binding domain. Telomeres in most organisms are comprised of tandem simple sequence repeats. The total length of telomeric repeat sequence at each chromosome end is determined in a balance of sequence loss and sequence addition. One major influence on telomere length is the enzyme telomerase. It is a reverse transcriptase that adds these simple sequence repeats to chromosome ends by copying a template sequence within the RNA component of the enzyme. The RNA binding domain of telomerase - TRBD - is made up of twelve alpha helices and two short beta sheets. How telomerase and associated regulatory factors physically interact and function with each other to maintain appropriate telomere length is poorly understood. It is known however that TRBD is involved in formation of the holoenzyme (which performs the telomere extension) in addition to recognition and binding of RNA.	127
403285	pfam12010	DUF3502	Domain of unknown function (DUF3502). This presumed domain is functionally uncharacterized. This domain is found in bacteria. This domain is about 140 amino acids in length. This domain is found associated with pfam01547.	131
288834	pfam12011	NPH-II	RNA helicase NPH-II. RNA helicase NPH-II or I8 is found in Poxviridae. It is essential for viral replication and plays an important role during transcription of early mRNAs, presumably by preventing R-loop formation behind the elongating RNA polymerase. It acts as NTP-dependent helicase that catalyzes unidirectional unwinding of 3'tailed duplex RNAs. It might also play a role in the export of newly synthesized mRNA chains out of the core into the cytoplasm and is required for propagation of viral particles.	168
403286	pfam12012	DUF3504	Domain of unknown function (DUF3504). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 156 to 173 amino acids in length.	163
371847	pfam12013	OrsD	Orsellinic acid/F9775 biosynthesis cluster protein D. This family of proteins is functionally uncharacterized. This protein is found in eukaryotes. Proteins in this family are typically between 247 to 1018 amino acids in length. Family members include orsellinic acid/F9775 biosynthesis cluster protein D (orsD) from Emericella nidulans. The orsD gene is part of the cluster that encodes components for the biosynthesis of orsellinic acid, as well as biosynthesis of the cathepsin K inhibitors F9775 A and F9775 B, but the function of orsD is unknown. OrsD contains two segments that are likely to be C2H2 zinc binding domains.	114
403287	pfam12014	DUF3506	Domain of unknown function (DUF3506). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 131 to 148 amino acids in length. This domain has a conserved KLTGD sequence motif.	134
403288	pfam12015	DUF3507	Domain of unknown function (DUF3507). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is about 180 amino acids in length. This domain has a conserved ENL sequence motif.	182
403289	pfam12016	Stonin2_N	Stonin 2. Stonin 2 is involved in clathrin mediated endocytosis. It binds to Eps15 by its highly conserved NPF motif. The complex formed has been shown to directly associate with the clathrin adaptor complex AP-2, and to localize to clathrin-coated pits (CCPs). In addition, stonin2 was recently identified as a specific sorting adaptor for synaptotagmin, and may thus regulate synaptic vesicle recycling.	338
288840	pfam12017	Tnp_P_element	Transposase protein. Proteins in this family are transposases found in insects. This region is about 230 amino acids in length and is found associated with pfam05485.	219
403290	pfam12018	FAP206	Domain of unknown function. This domain of about 280 residues is found in eukaryotes. There are two conserved sequence motifs: GFC and GLL. This family is also known as UPF0704. This domain is found in FAP206, a protein associated with cilia and flagella. In the ciliate Tetrahymena, the cilium has radial spokes, each of which is a macromolecular complex essential for motility. A triplet of three radial spokes, RS1, RS2, and RS3, is repeated every 96 nm along the doublet microtubule. Each spoke has a distinct base that docks to the doublet and is linked to different inner dynein arms. Knockout of the FAP206 gene results in slow cell motility and the 96-nm repeats lack RS2 and dynein c. FAP206 is probably part of the front prong and docks RS2 and dynein c to the microtubule.	271
403291	pfam12019	GspH	Type II transport protein GspH. GspH is involved in bacterial type II export systems. Like all pilins, GspH has an N-terminal alpha helix. This helix is followed by nine beta strands forming two beta sheets, one of five antiparallel strands and one of four antiparallel strands. GspH is a minor pseudopilin; it is expressed much less than other pseudopilins in the type II secretion pilus (major pilins). The function and localization of minor pseudopilins are still to be fully unraveled. It has been suggested that some minor pseudopilins may assemble either into the base or the tip of pili, or both. They function as initiators or regulators of pilus biogenesis and dynamics, and/or as adaptors between various pseudopilin components and other members of the T2SS.	108
403292	pfam12020	TAFA	TAFA family. This family of secreted proteins are brain specific and thought to be chemokines. These proteins are found in vertebrates. Proteins in this family are typically between 94 to 133 amino acids in length and contain a number of conserved cysteines.	89
403293	pfam12021	DUF3509	Protein of unknown function (DUF3509). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 92 to 110 amino acids in length. This protein has two completely conserved residues (G and R) that may be functionally important.	87
403294	pfam12022	DUF3510	Domain of unknown function (DUF3510). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is about 130 amino acids in length. This domain is found associated with pfam06148.	129
403295	pfam12023	DUF3511	Domain of unknown function (DUF3511). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is about 50 amino acids in length. This domain has two completely conserved residues (Y and K) that may be functionally important.	45
403296	pfam12024	DUF3512	Domain of unknown function (DUF3512). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 231 to 249 amino acids in length. This domain is found associated with pfam00439.	185
288848	pfam12025	Phage_C	Phage protein C. This family of phage proteins is functionally uncharacterized. Proteins in this family are typically between 68 to 86 amino acids in length.	68
403297	pfam12026	DUF3513	Domain of unknown function (DUF3513). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 192 to 218 amino acids in length. This domain is found associated with pfam00018, pfam08824. This domain has a conserved QPP sequence motif.	207
314839	pfam12027	DUF3514	Protein of unknown function (DUF3514). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 368 to 823 amino acids in length.	256
403298	pfam12028	DUF3515	Protein of unknown function (DUF3515). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 166 to 214 amino acids in length. This protein has a conserved RCG sequence motif.	159
403299	pfam12029	DUF3516	Domain of unknown function (DUF3516). This presumed domain is functionally uncharacterized. This domain is found in bacteria. This domain is typically between 460 to 473 amino acids in length. This domain is found associated with pfam00270, pfam00271.	460
403300	pfam12030	DUF3517	Domain of unknown function (DUF3517). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is about 340 amino acids in length. This domain is found associated with pfam00443.	408
403301	pfam12031	BAF250_C	SWI/SNF-like complex subunit BAF250/Osa. This entry represents the mammalian BAF250a/b and its homolog osa from fruit flies. They are part of the SWI/SNF-like ATP-dependent chromatin remodelling complex that regulates gene expression through regulating nucleosome remodelling. In humans there are two BAF250 isoforms, BAF250a/ARID1a and BAF250b/ARID1b. BAF250a/b may be E3 ubiquitin ligases that target histone H2B.	257
403302	pfam12032	CLIP	Regulatory CLIP domain of proteinases. CLIP is a regulatory domain which controls the proteinase action of various proteins of the trypsin family, e.g. easter and pap2. The CLIP domain remains linked to the protease domain after cleavage of a conserved residue which retains the protein in zymogen form. It is named CLIP because it can be drawn in the shape of a paper clip. It has many disulphide bonds and highly conserved cysteine residues, and so it folds extensively.	54
152468	pfam12033	DUF3519	Protein of unknown function (DUF3519). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 117 to 1154 amino acids in length. This protein has a single completely conserved residue Q that may be functionally important.	104
403303	pfam12034	DUF3520	Domain of unknown function (DUF3520). This presumed domain is functionally uncharacterized. This domain is found in bacteria. This domain is about 180 amino acids in length. This domain is found associated with pfam00092.	182
403304	pfam12036	DUF3522	Protein of unknown function (DUF3522). This family of proteins is functionally uncharacterized. This protein is found in eukaryotes. Proteins in this family are typically between 220 to 787 amino acids in length. This family belongs to the CREST superfamily, which are distant members of the GPCR superfamily.	183
403305	pfam12037	DUF3523	Domain of unknown function (DUF3523). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 257 to 277 amino acids in length. This domain is found associated with pfam00004. This domain has a conserved LER sequence motif.	264
403306	pfam12038	DUF3524	Domain of unknown function (DUF3524). This presumed domain is functionally uncharacterized. This domain is found in bacteria and eukaryotes. This domain is about 170 amino acids in length. This domain is found associated with pfam00534. This domain has two conserved sequence motifs: HENQ and FNS. This domain has a single completely conserved residue S that may be functionally important.	165
152474	pfam12039	DUF3525	Protein of unknown function (DUF3525). This family of proteins is functionally uncharacterized. This protein is found in viruses. Proteins in this family are about 360 amino acids in length.	404
403307	pfam12040	DUF3526	Domain of unknown function (DUF3526). This presumed domain is functionally uncharacterized. This domain is found in bacteria. This domain is typically between 149 to 170 amino acids in length. This domain has a single completely conserved residue P that may be functionally important.	140
403308	pfam12041	DELLA	Transcriptional regulator DELLA protein N terminal. Gibberellins are plant hormones which have great impact on growth signalling. DELLA proteins are transcriptional regulators of growth related proteins which are downregulated when gibberellins bind to their receptor GID1. GID1 forms a complex with DELLA proteins and signals them towards 26S proteasome. The N terminal of DELLA proteins contains conserved DELLA and VHYNP motifs which are important for GID1 binding and proteolysis of the DELLA proteins.	68
314852	pfam12042	RP1-2	Tubuliform egg casing silk strands structural domain. Spiders use fibroins to make silk strands. This family includes tubuliform silk fibroins which are used to protect egg cases. This domain is a structural domain which is found in repeats of up to 20 in many individuals (although this is not necessarily the case). RP1 makes up structural domains in the N terminal while RP2 makes up structural domains in the C terminal.	167
403309	pfam12043	DUF3527	Domain of unknown function (DUF3527). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is about 120 amino acids in length. This domain has a conserved CDCGGWD sequence motif.	350
371862	pfam12044	Metallopep	Putative peptidase family. This family of proteins is functionally uncharacterized. However, it does contain an HEXXH motif characteristic of metallopeptidases. This protein is found mainly in fungi. Proteins in this family are typically between 625 to 773 amino acids in length.	425
403310	pfam12045	DUF3528	Protein of unknown function (DUF3528). This family of proteins is functionally uncharacterized. This protein is found in eukaryotes. Proteins in this family are typically between 185 to 298 amino acids in length. This protein is found associated with pfam00046.	141
403311	pfam12046	CCB1	Cofactor assembly of complex C subunit B. Cofactor maturation pathways such as the CCB system (system IV) for cytochrome c-heme attachment are conserved in all organisms performing oxygenic photosynthesis. The CCB system consists of four proteins, CCB1-4. The four CCBs are well conserved between green algae and plants.	166
403312	pfam12047	DNMT1-RFD	Cytosine specific DNA methyltransferase replication foci domain. This domain is part of a cytosine specific DNA methyltransferase enzyme. It functions non-catalytically to target the protein towards replication foci. This allows the DNMT1 protein to methylate the correct residues. This domain targets DMAP1 and HDAC2 to the replication foci during the S phase of mitosis. They are thought to have some importance in conversion of critical histone lysine moieties.	141
403313	pfam12048	DUF3530	Protein of unknown function (DUF3530). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 272 to 336 amino acids in length. These proteins are distantly related to alpha/beta hydrolases so they may act as enzymes.	307
403314	pfam12049	DUF3531	Protein of unknown function (DUF3531). This family of proteins is functionally uncharacterized. This protein is found in bacteria and eukaryotes. Proteins in this family are typically between 149 to 199 amino acids in length.	139
403315	pfam12051	DUF3533	Protein of unknown function (DUF3533). This family of transmembrane proteins is functionally uncharacterized. This protein is found in bacteria and eukaryotes. Proteins in this family are typically between 393 to 772 amino acids in length.	378
403316	pfam12052	VGCC_beta4Aa_N	Voltage gated calcium channel subunit beta domain 4Aa N terminal. The beta subunit of voltage gated calcium channels is coded for by four genes 1-4. Gene 4 can produce two types of beta4A domain (beta4Aa and beta4Ab) according to how the gene splicing is carried out. This family is part of the beta4Aa N terminal domain. It is made up of an alpha helix and a beta strand. It is thought to regulate the channel properties through protein-protein interactions with non Ca channel proteins.	42
403317	pfam12053	DUF3534	N-terminal of Par3 and HAL proteins. This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is about 150 amino acids in length. This eukaryotic domain is found associated with pfam00595. It has a conserved GILD sequence motif. Family members have been found to be essential for cell polarity establishment and maintenance such as Par3 (partitioning defective) and involved in conversion of histidine into ammonia (a crucial step for forming histamine in humans) such as Histidine ammonia lyase (HAL). This N-terminal domain is found to mediate oligomerization critical for the membrane localization of Par-3. It is also found to possess a self-association capacity via a front-to-back mode in Par-3 and HAL proteins. However, unlike the Par-3 N-terminal domain which self-assembles into a left-handed helical filament, the HAL N-terminal domain does not tend to form a helical filament but rather self-assembles into circular oligomeric particles. This has been suggested to be likely due to the absence of equivalent charged residues that are essential for the longitudinal packing of the Par-3 N-terminal domain filament.	82
403318	pfam12054	DUF3535	Domain of unknown function (DUF3535). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 439 to 459 amino acids in length. This domain is found associated with pfam00271, pfam02985, pfam00176. This domain has two completely conserved residues (P and K) that may be functionally important.	445
403319	pfam12055	DUF3536	Domain of unknown function (DUF3536). This presumed domain is functionally uncharacterized. This domain is found in bacteria and archaea. This domain is typically between 274 to 285 amino acids in length. This domain is found associated with pfam03065.	284
403320	pfam12056	DUF3537	Protein of unknown function (DUF3537). This family of transmembrane proteins are functionally uncharacterized. This protein is found in eukaryotes. Proteins in this family are typically between 427 to 453 amino acids in length.	392
403321	pfam12057	DUF3538	Domain of unknown function (DUF3538). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is about 120 amino acids in length. This domain is found associated with pfam00240. This domain has a conserved SDL sequence motif.	114
403322	pfam12058	DUF3539	Protein of unknown function (DUF3539). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are about 90 amino acids in length. This protein has a conserved NHP sequence motif.	86
403323	pfam12059	DUF3540	Protein of unknown function (DUF3540). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 212 to 238 amino acids in length. This protein has a conserved SCL sequence motif.	199
403324	pfam12060	DUF3541	Domain of unknown function (DUF3541). This presumed domain is functionally uncharacterized. This domain is found in bacteria. This domain is about 230 amino acids in length.	225
403325	pfam12061	NB-LRR	Late blight resistance protein R1. R1 is a gene for resistance to late blight, the most destructive disease in potato cultivation worldwide. The R1 gene belongs to the class of plant genes for pathogen resistance that have a leucine zipper motif, a putative nucleotide binding domain and a leucine-rich repeat domain. This protein is found associated with PF00931.	297
403326	pfam12062	HSNSD	heparan sulfate-N-deacetylase. The proteins in this family are heparan sulfate N-deacetylase enzymes. This protein is found in eukaryotes. This enzyme is often found associated with pfam00685.	491
403327	pfam12063	DUF3543	Domain of unknown function (DUF3543). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 217 to 291 amino acids in length. This domain is found associated with pfam00069. This domain has a single completely conserved residue A that may be functionally important.	251
403328	pfam12064	DUF3544	Domain of unknown function (DUF3544). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 198 to 216 amino acids in length. This domain is found associated with pfam00628, pfam01753, pfam00439, pfam00855.	202
403329	pfam12065	DUF3545	Protein of unknown function (DUF3545). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 60 to 77 amino acids in length. This protein has two completely conserved residues (R and L) that may be functionally important.	58
403330	pfam12066	DUF3546	Domain of unknown function (DUF3546). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 93 to 114 amino acids in length. This domain has two completely conserved Y residues that may be functionally important.	110
403331	pfam12067	Sox17_18_mid	Sox 17/18 central domain. This is the central region of eukaryotic SOX17 and 18 transcription factor proteins. It lies just downstream of the HMG-box family, pfam00505, and is followed by a C-terminal domain.	49
371881	pfam12068	DUF3548	Domain of unknown function (DUF3548). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 184 to 216 amino acids in length. This domain is found associated with pfam00566. This domain is found at the N-terminus of GYP7 proteins.	170
403332	pfam12069	DUF3549	Protein of unknown function (DUF3549). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are about 340 amino acids in length. This protein has a conserved LDE sequence motif.	338
403333	pfam12070	SCAI	Protein SCAI. SCAI is a transcriptional cofactor and tumor suppressor that suppresses MKL1-induced SRF transcriptional activity. It may function in the RHOA-DIAPH1 signal transduction pathway and regulate cell migration through transcriptional regulation of ITGB1.	520
371883	pfam12071	DUF3551	Protein of unknown function (DUF3551). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 79 to 104 amino acids in length. This protein has a single completely conserved residue C that may be functionally important.	77
403334	pfam12072	DUF3552	Domain of unknown function (DUF3552). This presumed domain is functionally uncharacterized. This domain is found in bacteria, archaea and eukaryotes. This domain is about 200 amino acids in length. This domain is found associated with pfam00013, pfam01966. This domain has a single completely conserved residue A that may be functionally important.	201
403335	pfam12073	DUF3553	Protein of unknown function (DUF3553). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are about 60 amino acids in length. This protein has two conserved sequence motifs: GQVQS and TVNF.	48
403336	pfam12074	Gcn1_N	Domain of unknown function (DUF3554). This domain is found in the N-terminal region of Gcn1 protein, which acts as a translation activator that mediates translational control by regulating Gcn2 kinase activity.	350
403337	pfam12075	KN_motif	KN motif. This small motif is found at the N-terminus of Kank proteins and has been called the KN (for Kank N-terminal) motif. This protein is found in eukaryotes. Proteins in this family are typically between 413 to 1202 amino acids in length. This protein is found associated with pfam00023. This protein has two conserved sequence motifs: TPYG and LDLDF. Kank1 was obtained by positional cloning of a tumor suppressor gene in renal cell carcinoma, while the other members were found by homology search. The family is involved in the regulation of actin polymerization and cell motility through signaling pathways containing PI3K/Akt and/or unidentified modulators/effectors.	39
403338	pfam12076	Wax2_C	WAX2 C-terminal domain. This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is about 170 amino acids in length. This domain is found associated with pfam04116. This domain has a conserved LEGW sequence motif. This region has similarity to short chain dehydrogenases.	164
403339	pfam12077	DUF3556	Transmembrane protein of unknown function (DUF3556). This family of transmembrane proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 576 to 592 amino acids in length.	573
371889	pfam12078	DUF3557	Domain of unknown function (DUF3557). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is about 150 amino acids in length.	154
403340	pfam12079	DUF3558	Protein of unknown function (DUF3558). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 177 to 195 amino acids in length.	172
403341	pfam12080	GldM_C	GldM C-terminal domain. This domain is found in bacteria at the C-terminus of the GldM protein. This domain is typically between 169 to 182 amino acids in length. This domain has two completely conserved residues (Y and N) that may be functionally important. GldM is named for the member from Cytophaga johnsonae (Flavobacterium johnsoniae), which is required for a type of rapid gliding motility found in certain members of the Bacteroidetes.	176
403342	pfam12081	GldM_N	GldM N-terminal domain. This domain is found in bacteria at the N-terminus of the GldM protein. This domain is typically between 169 to 182 amino acids in length. This domain has two completely conserved residues (Y and N) that may be functionally important. GldM is named for the member from Cytophaga johnsonae (Flavobacterium johnsoniae), which is required for a type of rapid gliding motility found in certain members of the Bacteroidetes.	178
403343	pfam12083	DUF3560	Domain of unknown function (DUF3560). This presumed domain is functionally uncharacterized. This domain is found in bacteria. This domain is about 120 amino acids in length. This domain has a conserved GHHSE sequence motif.	124
403344	pfam12084	DUF3561	Protein of unknown function (DUF3561). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are about 110 amino acids in length.	107
378806	pfam12085	DUF3562	Protein of unknown function (DUF3562). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 62 to 84 amino acids in length. This protein has two completely conserved residues (A and Y) that may be functionally important.	60
403345	pfam12086	DUF3563	Protein of unknown function (DUF3563). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are about 50 amino acids in length. This protein has conserved AYL and DLE sequence motifs.	57
403346	pfam12087	DUF3564	Protein of unknown function (DUF3564). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 118 to 142 amino acids in length. This protein has a conserved WSRE sequence motif.	120
403347	pfam12088	DUF3565	Protein of unknown function (DUF3565). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 30 to 78 amino acids in length. This protein has two conserved sequence motifs: WVA and CGH.	58
403348	pfam12089	DUF3566	Transmembrane domain of unknown function (DUF3566). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 136 to 304 amino acids in length. This region represents a transmembrane region found at the C-terminus of the proteins.	118
371892	pfam12090	Spt20	Spt20 family. This presumed domain is found in the Spt20 proteins from both human and yeast. The Spt20 protein is part of the SAGA complex which is a large complex mediating histone deacetylation. Yeast Spt20 has been shown to play a role in structural integrity of the SAGA complex as no intact SAGA could be purified in spt20 deletion strains.	155
403349	pfam12091	DUF3567	Protein of unknown function (DUF3567). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are about 90 amino acids in length. This protein has a conserved EIVDK sequence motif.	85
403350	pfam12092	DUF3568	Protein of unknown function (DUF3568). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are about 130 amino acids in length.	124
152528	pfam12093	Corona_NS8	Coronavirus NS8 protein. This family of proteins is functionally uncharacterized. This protein is found in coronaviruses. Proteins in this family are typically between 39 to 121 amino acids in length. This protein has two conserved sequence motifs: EDPCP and INCQ.	126
378811	pfam12094	DUF3570	Protein of unknown function (DUF3570). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 396 to 444 amino acids in length.	417
403351	pfam12095	CRR7	Protein CHLORORESPIRATORY REDUCTION 7. This entry includes proteins from blue-green algae and plants, including CRR7 protein from Arabidopsis. CRR7 is part of the chloroplastic NAD(P)H dehydrogenase complex (NDH Complex) involved in respiration, photosystem I (PSI) cyclic electron transport and CO2 uptake. It is essential for the stable formation of the NDH Complex.	78
403352	pfam12096	DUF3572	Protein of unknown function (DUF3572). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are about 100 amino acids in length.	81
288914	pfam12097	DUF3573	Protein of unknown function (DUF3573). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 372 to 530 amino acids in length.	383
403353	pfam12098	DUF3574	Protein of unknown function (DUF3574). This family of proteins is functionally uncharacterized. This protein is found in bacteria and viruses. Proteins in this family are typically between 144 to 163 amino acids in length. This protein has a conserved TPRF sequence motif.	103
403354	pfam12099	DUF3575	Protein of unknown function (DUF3575). This family of proteins are functionally uncharacterized. This family is only found in bacteria. Proteins in this family are typically between 187 to 236 amino acids in length.	178
403355	pfam12100	DUF3576	Domain of unknown function (DUF3576). This presumed domain is functionally uncharacterized. This domain is found in bacteria. This domain is about 100 amino acids in length. This domain has a single completely conserved residue G that may be functionally important.	101
403356	pfam12101	DUF3577	Protein of unknown function (DUF3577). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 143 to 307 amino acids in length.	133
403357	pfam12102	DUF3578	Domain of unknown function (DUF3578). This presumed domain is functionally uncharacterized. This domain is found in bacteria and archaea. This domain is typically between 177 to 191 amino acids in length.	183
403358	pfam12103	Lipl32	Surface lipoprotein of Spirochaetales order. Lipl32 is an outer membrane surface lipoprotein of Leptospira like bacteria.	178
403359	pfam12104	Tcell_CD4_C	T cell CD4 receptor C terminal region. This domain is the C terminal domain of the CD4 T cell receptor. The C terminal domain is the cytoplasmic domain which relays the signal for T cell activation. This process involves co-receptor internalisation. This domain is involved in binding to the N terminal of Lck co-receptor in a Zn2+ clasp structure.	28
403360	pfam12105	SpoU_methylas_C	SpoU, rRNA methylase, C-terminal. This domain is found in bacteria. This domain is about 60 amino acids in length. This domain is found in association with pfam00588. This domain has a conserved LFE sequence motif. Some members of the Pfam family SpoU_methylase, pfam00588, carry this very distinctive sequence region at their extreme C-terminus. The exact function of this region is not known.	54
314909	pfam12106	Colicin_E5	Colicin E5 ribonuclease domain. Colicin is a protein produced by bacteria with Col plasmids. Its function is to attack E. coli through actions on its inner membrane ion channels or through ribonuclease or deoxyribonuclease actions. The C terminal domain is the ribonuclease domain. It specifically cleaves tRNA anticodons which recognize codons in the form NAY (N:any nucleotide, A:adenosine, Y:pyrimidine) which corresponds to Tyrosine, Histidine, Asparagine and Aspartic Acid. E5-CRD can be referred to as an RNA restriction enzyme that specifically recognizes and cleaves single-stranded GU sequences.	88
152542	pfam12107	VEK-30	Plasminogen (Pg) ligand in fibrinolytic pathway. Pg is an important mediator of angiostatin production in the fibrinolytic pathway. Pg is made up of five subunit kringle molecules (Pg-K1 to Pg-K5), of which the first three make the protein angiostatin. VEK-30 is a domain of the group A streptococcal protein PAM. It binds to Pg-K2 of angiostatin and activates the molecule to mediate its anti-angiogenic effects. VEK-30 binds to angiostatin via a C terminal lysine with argininyl and glutamyl side chain residues known as a 'through space isostere'.	17
403361	pfam12108	SF3a60_bindingd	Splicing factor SF3a60 binding domain. This domain is found in eukaryotes. This domain is about 30 amino acids in length. This domain has a single completely conserved residue Y that may be functionally important. SF3a60 makes up the SF3a complex with SF3a66 and SF3a120. This domain is the binding site of SF3a60 for SF3a120. The SF3a complex is part of the spliceosome, a protein complex involved in splicing mRNA after transcription.	27
403362	pfam12109	CXCR4_N	CXCR4 Chemokine receptor N terminal. CXCR4 and its ligand stromal cell-derived factor-1 (a.k.a. CXCL12) are essential for proper fetal development. CXCR4 is also the major coreceptor for T-tropic strains of human immunodeficiency virus 1 (HIV-1), and SDF-1 inhibits HIV-1 infection. Additionally, SDF-1 and CXCR4 mediate cancer cell migration and metastasis. The N terminal domain of most chemokine receptors is the ligand binding domain and so the N terminal domain of CXCR4 is the binding site for SDF-1.	33
403363	pfam12110	Nup96	Nuclear protein 96. Nup96 (often known by the name of its yeast homolog Nup145C) is part of the Nup84 heptameric complex in the nuclear pore complex. Nup96 complexes with Sec13 in the middle of the heptamer. The function of the heptamer is to coat the curvature of the nuclear pore complex between the inner and outer nuclear membranes. Nup96 is predicted to be an alpha helical solenoid. The interaction between Nup96 and Sec13 is the point of curvature in the heptameric complex.	287
403364	pfam12111	PNPase_C	Polyribonucleotide phosphorylase C terminal. PNPase regulates the expression of small non-coding RNAs that control expression of outer-membrane proteins. The enzyme also affects complex processes, such as the tissue-invasive virulence of Salmonella enterica and the regulation of a virulence-factor secretion system in Yersinia. In Escherichia coli, PNPase is involved in the quality control of ribosomal RNA precursors and is required for growth following cold shock. This family contains the C terminal protomer domain of the PNPase core. The function of the C terminal protomer is to catalyze phosphorolysis through its two active sites.	37
403365	pfam12112	DUF3579	Protein of unknown function (DUF3579). This family of proteins is functionally uncharacterized. This protein is found in bacteria. Proteins in this family are typically between 98 to 126 amino acids in length. This protein has a conserved FRP sequence motif.	87
288929	pfam12113	SVM_signal	SVM protein signal sequence. This region is presumed to be a signal peptide sequence found in Sequence-variable mosaic (SVM) proteins. This domain is found in phytoplasmas. This presumed signal sequence is about 30 amino acids in length.	33
403366	pfam12114	Period_C	Period protein 2/3C-terminal region. This domain is found in eukaryotes. This domain is typically between 164 to 200 amino acids in length. This domain is found associated with pfam08447.	195
314916	pfam12115	Salp15	Salivary protein of 15kDa inhibits CD4+ T cell activation. This is a family of 15kDa salivary proteins from Acari Arachnids that is induced on feeding and assists the parasite to remain attached to its arthropod host. By repressing calcium fluxes triggered by TCR engagement, Salp15 inhibits CD4+ T cell activation. Salp15 shows weak similarity to Inhibin A, a member of the TGF-beta superfamily that inhibits the production of cytokines and the proliferation of T cells.	112
403367	pfam12116	SpoIIID	Stage III sporulation protein D. This stage III sporulation protein is a small DNA-binding family that is essential for gene expression of the mother-cell compartment during sporulation. The domain is found in bacteria and viruses, and is about 40 amino acids in length. It has a conserved RGG sequence motif.	82
288933	pfam12117	DUF3580	Protein of unknown function (DUF3580). This domain is found in viruses, and is about 120 amino acids in length. It is found in association with pfam01057.	121
288934	pfam12118	SprA-related	SprA-related family. This family of bacterial proteins has a conserved HEXXH motif, suggesting they are putative peptidases of zincin fold. Proteins in this family are typically between 234 to 465 amino acids in length. Most members are annotated as being SprA-related.	310
403368	pfam12119	DUF3581	Protein of unknown function (DUF3581). This protein is found in bacteria. Proteins in this family are about 240 amino acids in length.	217
403369	pfam12120	Arr-ms	Rifampin ADP-ribosyl transferase. This protein is found in bacteria. Proteins in this family are typically between 136 to 150 amino acids in length. The opportunistic pathogen Mycobacterium smegmatis is resistant to rifampin because of the presence of a chromosomally encoded rifampin ADP-ribosyltransferase (Arr-ms). Arr-ms is a small enzyme whose activity thus renders rifamycin antibiotics ineffective.	99
403370	pfam12121	DD_K	Dermaseptin. This protein is found in eukaryotes. Proteins in this family are typically between 30 to 76 amino acids in length. This protein is found associated with pfam03032. This domain is part of a dermaseptin protein which is used as an antimicrobial agent. The full protein is almost completely defined in an alpha helical domain. It creates high levels of disorder at the level of the phospholipid head group of bacterial membranes suggesting that it partitions into the bilayer where it severely disrupts membrane packing.	23
403371	pfam12122	Rhomboid_N	Cytoplasmic N-terminal domain of rhomboid serine protease. Rhomboid_N is the N-terminal cytoplasmic domain of the rhomboid intra-membraneous serine protease, otherwise known as Peptidase_S54, pfam01694. This N-terminal domain has similarity to other GlnB-like domains, some of which appear to have a binding role, e.g. to peptidoglycan. It is not clear exactly what the function of this domain is in the protease, but its presence is critical for maintaining a catalytically competent state for the protein.	86
338253	pfam12123	Amidase02_C	N-acetylmuramoyl-l-alanine amidase. This domain is found in bacteria and viruses. This domain is about 50 amino acids in length. This domain is classified with the enzyme classification code EC:3.5.1.28. This domain is the C terminal of the enzyme which hydrolyzes the link between N-acetylmuramoyl residues and L-amino acid residues in certain cell-wall glycopeptides.	44
288939	pfam12124	Nsp3_PL2pro	Coronavirus polyprotein cleavage domain. This domain is found in SARS coronaviruses, and is about 70 amino acids in length. It is found associated with various other coronavirus proteins due to the polyprotein nature of most viral translation. PL2pro is a domain of the non-structural protein nsp3. The domain performs three of the cleavages required to separate the translated polyprotein into its distinct proteins.	66
403372	pfam12125	Beta-TrCP_D	D domain of beta-TrCP. This domain is found in eukaryotes, and is approximately 40 amino acids in length. It is found associated with pfam00646, pfam00400. The protein that contains this domain functions as a ubiquitin ligase. Ubiquitination is required to direct proteins towards the proteasome for degradation. This protein is part of the WD40 class of F box proteins. The D domain of these F box proteins is involved in mediating the dimerization of the protein. Dimerization is necessary to polyubiquitinate substrates so this D domain is vital in directing substrates towards the proteasome for degradation.	39
403373	pfam12126	DUF3583	Protein of unknown function (DUF3583). This domain is found in eukaryotes, and is typically between 302 and 338 amino acids in length. It is found in association with pfam00097 and pfam00643. Most members are promyelocytic leukemia proteins, and this family lies towards the C-terminus.	329
403374	pfam12127	YdfA_immunity	SigmaW regulon antibacterial. This protein is found in bacteria. Proteins in this family are about 330 amino acids in length. The operon from which this protein is derived confers immunity for the host species to a broad range of antibacterial compounds, unlike the specific immunity proteins that are linked to and co-regulated with their antibiotic-synthesis proteins.	314
403375	pfam12128	DUF3584	Protein of unknown function (DUF3584). This protein is found in bacteria and eukaryotes. Proteins in this family are typically between 943 to 1234 amino acids in length. This family contains a P-loop motif suggesting it is a nucleotide binding protein. It may be involved in replication.	1186
403376	pfam12129	Phtf-FEM1B_bdg	Male germ-cell putative homeodomain transcription factor. This domain is found in bacteria and eukaryotes, and is typically between 101 and 140 amino acids in length. Phtf proteins do not display any sequence similarity to known or predicted proteins, but their conservation among species suggests an essential function. The 84 kDa Phtf1 protein is an integral membrane protein, anchored to a cell membrane by six to eight trans-membrane domains, that is associated with a domain of the endoplasmic reticulum (ER) juxtaposed to the Golgi apparatus. It is present during meiosis and spermiogenesis, and, by the end of spermiogenesis, is released from the mature spermatozoon within the residual bodies. Phtf1 enhances the binding of FEM1B -feminisation homolog 1B - to cell membranes. Fem-1 was initially identified in the signaling pathway for sex determination, as well as being implicated in apoptosis, but its biochemical role is still unclear, and neither FEM1B nor PHTF1 is directly implicated in apoptosis in spermatogenesis. It is the ANK domain of FEM1B that is necessary for the interaction with the N-terminal region of Phtf1.	154
403377	pfam12130	DUF3585	Protein of unknown function (DUF3585). This domain is found in eukaryotes. This domain is typically between 135 and 149 amino acids in length and is found associated with pfam00307.	129
403378	pfam12131	DUF3586	Protein of unknown function (DUF3586). This domain is found in eukaryotes. This domain is about 80 amino acids in length and is found associated with pfam08246, and pfam00112.	75
371912	pfam12132	DUF3587	Protein of unknown function (DUF3587). This protein is found in viruses. Proteins in this family are typically between 209 and 248 amino acids in length.	201
403379	pfam12133	Sars6	Open reading frame 6 from SARS coronavirus. This family is found in Coronaviruses. Proteins in this family are typically between 42 to 63 amino acids in length.	62
403380	pfam12134	PRP8_domainIV	PRP8 domain IV core. This domain is found in eukaryotes, and is about 20 amino acids in length. It is found associated with pfam10597, pfam10596, pfam10598, pfam08083, pfam08082, pfam01398, pfam08084. There is a conserved LILR sequence motif. The domain is a selenomethionine domain in a subunit of the spliceosome. The function of PRP8 domain IV is believed to be interaction with the spliceosomal core.	230
403381	pfam12135	Sialidase_penC	Sialidase enzyme penultimate C terminal domain. This domain is found in bacteria and eukaryotes, and is about 30 amino acids in length. The protein from which this domain is found is a sialidase enzyme which is used by virulent bacteria as a toxin. It is the penultimate C terminal domain.	25
152571	pfam12136	RNA_pol_Rpo13	RNA polymerase Rpo13 subunit HTH domain. This domain is found in archaea, and is about 40 amino acids in length. It has a single completely conserved residue E that may be functionally important. It is found in the archaeal DNA dependent RNA polymerase. The domain is a 'helix-turn-helix' (HTH) domain in the Rpo13 subunit of the RNA polymerase. This domain is involved in downstream DNA binding, and the entire subunit has also been implicated in contacting transcription factor II B.	40
403382	pfam12137	RapA_C	RNA polymerase recycling family C-terminal. This domain is found in bacteria. This domain is about 360 amino acids in length. This domain is found associated with pfam00271, pfam00176. The function of this domain is not known, but structurally it forms an alpha-beta fold in nature with a central beta-sheet flanked by helices and loops, the beta-sheet being mainly antiparallel and flanked by four alpha helices, among which the two longer helices exhibit a coiled-coil arrangement.	360
403383	pfam12138	Spherulin4	Spherulation-specific family 4. This protein is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 250 and 398 amino acids in length. There is a conserved NPG sequence motif and there are two completely conserved G residues that may be functionally important. Starvation will often induce spherulation - the production of spores - and this process may involve DNA-methylation. Changes in the methylation of spherulin4 are associated with the formation of spherules, but these changes are probably transient. Methylation of the gene accompanies its transcriptional activation, and spherulin4 mRNA is only detectable in late spherulating cultures and mature spherules. It is a spherulation-specific protein.	238
378819	pfam12139	APS-reductase_C	Adenosine-5'-phosphosulfate reductase beta subunit. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 112 to 142 amino acids in length. This family is found in association with pfam00037, and has a conserved FPIRTT sequence motif. The whole beta subunit has the enzymic properties of EC:1.8.99.2.	82
403384	pfam12140	SLED	SLED domain. The SLED (Scm-Like Embedded Domain) domain is a double-stranded DNA binding domain found in Scml2 which is a member of the Polycomb group of proteins involved in epigenetic gene silencing.	112
403385	pfam12141	DUF3589	Protein of unknown function (DUF3589). This family of proteins is found in eukaryotes. Proteins in this family are typically between 541 and 717 amino acids in length. The function of this family is not known.	485
403386	pfam12142	PPO1_DWL	Polyphenol oxidase middle domain. This domain family is found in bacteria and eukaryotes, and is approximately 50 amino acids in length, and the family is found in association with pfam00264. Most members are annotated as being polyphenol oxidases, and many are from plants or plastids. There is a conserved DWL sequence motif which gives the family its name.	52
371919	pfam12143	PPO1_KFDV	Protein of unknown function (DUF_B2219). This domain family is found in eukaryotes, and is typically between 138 and 152 amino acids in length, and the family is found in association with pfam00264. Many members are plant or plastid polyphenol oxidases, and there is a highly conserved sequence motif: KFDV, from which the name derives. This is the C-terminal domain of these oxidases.	130
403387	pfam12144	Med12-PQL	Eukaryotic Mediator 12 catenin-binding domain. This domain is found in eukaryotes, and is typically between 325 and 354 amino acids in length. Both development and carcinogenesis are driven by signal transduction within the canonical Wnt/beta-catenin pathway through both programmed and unprogrammed changes in gene transcription. Beta-catenin physically and functionally targets this PQL (proline-, glutamine-, leucine-rich) region of the Med12 subunit of Mediator to activate transcription. The beta-catenin transactivation domain binds directly to isolated Med12 and intact Mediator both in vitro and in vivo, and Mediator is recruited to Wnt-responsive genes in a beta-catenin-dependent manner.	209
403388	pfam12145	Med12-LCEWAV	Eukaryotic Mediator 12 subunit domain. This domain is found in eukaryotes, and is typically between 325 and 354 amino acids in length. The function of this particular region of the Mediator subunit Med12 is not known, but there is a conserved sequence motif: LCEWAV, from which the name derives.	466
403389	pfam12146	Hydrolase_4	Serine aminopeptidase, S33. This domain is found in bacteria and eukaryotes and is approximately 110 amino acids in length. It is found in association with pfam00561. The majority of the members in this family carry the exopeptidase active-site residues of Ser-122, Asp-239 and His-269 as in UniProtKB:Q7ZWC2.	237
403390	pfam12147	Methyltransf_20	Putative methyltransferase. This domain is found in bacteria and eukaryotes and is approximately 110 amino acids in length. It is found in association with pfam00561. The family shows homology to methyltransferases.	309
403391	pfam12148	TTD	Tandem tudor domain within UHRF1. TTD, tandem tudor domain within UHRF1 preferentially binds H3 histone tails trimethylated at Lys-9. It specifically recognizes H3 tail peptides with the heterochromatin-associated modification state of trimethylated lysine 9 and unmodified lysine 4 (H3K4me0/K9me3). This domain is found in eukaryotes and is found in association with pfam00097, pfam02182, pfam00628, pfam00240.	154
403392	pfam12149	HSV_VP16_C	Herpes simplex virus virion protein 16 C terminal. This domain is found in viruses, and is about 30 amino acids in length. It is found in association with pfam02232. This domain is the C terminal of the HSV virion protein 16. This protein is a transcription promoter. The C terminal domain is the carboxyl subdomain of the acidic transcriptional activation domain. The protein binds to DNA binding proteins to carry out its function. Such proteins include TATA binding protein, CBP, TBP-binding protein, etc.	26
403393	pfam12150	MFP2b	Cytosolic motility protein. This domain family is found in eukaryotes, and is approximately 50 amino acids in length. These proteins are found in nematodes. They complex with MSP (major sperm protein) to allow motility. Their action is quite similar to the action of bacterial actin molecules.	343
314942	pfam12151	MVL	Mannan-binding protein. This domain family is found in bacteria, and is approximately 40 amino acids in length. There is a single completely conserved residue G that may be functionally important. The domain occurs in two types of proteins. In mannan binding proteins, it forms a homodimeric molecule which complexes into a homo-octamer. In thiamidases it occurs without repeats but in the presence of other domains. MVL is distinct amongst other oligomannoside binding proteins in that it exhibits specificity for certain tetrasaccharides. Each molecule of MVL has four distinct carbohydrate binding sites.	36
403394	pfam12152	eIF_4G1	Eukaryotic translation initiation factor 4G1. This domain is found in eukaryotes, and is about 80 amino acids in length. It is found in association with pfam02854. This domain is part of the protein eIF_4G. It binds to eIF_4E by wrapping around its N terminal to form the eIF_4F complex. This complex binds various eIF_4E-BPs (binding proteins) to regulate initiation of translation.	60
403395	pfam12153	CAP18_C	LPS binding domain of CAP18 (C terminal). This domain family is found in eukaryotes, and is approximately 30 amino acids in length, and the family is found in association with pfam00666. CAP18 is a protein which is derived from rabbit granulocytes. It has two domains, an N terminal DUF and a C terminal Gram negative LPS binding domain. This domain is the C terminal domain.	27
152589	pfam12154	HCMVantigenic_N	Glycoprotein B N-terminal antigenic domain of HCMV. This domain is found in viruses, and is approximately 40 amino acids in length. The domain is found in association with pfam00606. There are two conserved sequence motifs: SVS and TSS. This family is the amino-terminal antigenic domain of glycoprotein B of human cytomegalovirus.	36
152590	pfam12155	NADHdh-2_N	NADH dehydrogenase subunit 2 N-terminal. This domain is found in eukaryotes, and is approximately 90 amino acids in length. It is found associated with pfam00361. All members are annotated as being NADH dehydrogenase subunit 2, and this region is the N-terminus.	88
403396	pfam12156	ATPase-cat_bd	Putative metal-binding domain of cation transport ATPase. This domain is found in bacteria, and is approximately 90 amino acids in length. It is found associated with pfam00403, pfam00122, pfam00702. The cysteine-rich nature and composition suggest this might be a cation-binding domain; most members are annotated as being cation transport ATPases.	86
403397	pfam12157	DUF3591	Protein of unknown function (DUF3591). This domain is found in eukaryotes and is typically between 445 to 462 amino acids in length. Most members are annotated as being transcription initiation factor TFIID subunit 1, and this region is the conserved central portion of these proteins.	455
403398	pfam12158	DUF3592	Protein of unknown function (DUF3592). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea, eukaryotes and viruses. Proteins in this family are typically between 150 and 242 amino acids in length.	138
403399	pfam12159	DUF3593	Protein of unknown function (DUF3593). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 98 and 228 amino acids in length. There is a conserved LHG sequence motif.	88
403400	pfam12160	Fibrinogen_aC	Fibrinogen alpha C domain. This domain family is found in eukaryotes, and is approximately 70 amino acids in length, and the family is found in association with pfam08702. This domain is the C terminal domain of fibrinogen in mammals. The domain lies in the C terminal half of the alpha C region in these proteins. The function of the domain is that of intramolecular and intermolecular interactions to form fibrin.	68
403401	pfam12161	HsdM_N	HsdM N-terminal domain. This domain is found at the N-terminus of the methylase subunit of Type I DNA methyltransferases. This domain family is found in bacteria and archaea, and is typically between 123 and 138 amino acids in length. The family is found in association with pfam02384. Mutations in this region of EcoKI methyltransferase abolish the normally strong preference of this system for methylating hemimethylated substrate. The structure of this domain has been shown to be all alpha-helical.	123
371931	pfam12162	STAT1_TAZ2bind	STAT1 TAZ2 binding domain. This domain family is found in eukaryotes, and is approximately 20 amino acids in length, and the family is found in association with pfam02865, pfam00017, pfam01017, pfam02864. This domain is the C terminal domain of STAT1. This domain binds selectively to the TAZ2 domain of CBP (CREB-binding protein). In this process it becomes a transcriptional activator and can initiate transcription of certain genes.	25
403402	pfam12163	HobA	DNA replication regulator. This family of proteins is found exclusively in epsilon-proteobacteria. Proteins in this family are approximately 180 amino acids in length. The structure of HobA is a modified Rossmann fold consisting of a five-stranded parallel beta-sheet (beta1-5) flanked on one side by alpha-2, alpha-3 and alpha-6 helices and alpha-4 and alpha-5 on the other. The alpha-1 helix is extended away from and has minimal interaction with the globular part of the protein. Four monomers interact to form a tetrameric molecule. Four calcium atoms bind to the tetramer and these binding sites may have functional relevance. The function of HobA is to regulate DNA replication and it does this by binding to DNA-A, but the exact mechanism of how this regulation occurs is purely speculative.	180
403403	pfam12164	SporV_AA	Stage V sporulation protein AA. This domain family is found in bacteria - primarily Firmicutes, and is approximately 90 amino acids in length. There is a single completely conserved residue G that may be functionally important. Most annotation associated with this domain suggests that it is involved in the fifth stage of sporulation, however there is little publication to back this up.	89
403404	pfam12165	Alfin	Alfin. The Alfin family includes PHD finger protein Alfin1 and Alfin1-like proteins. Alfin1 is a histone-binding component that specifically recognizes H3 tails trimethylated on 'Lys-4' (H3K4me3), which marks transcription start sites of virtually all active genes.	126
403405	pfam12166	Piezo_RRas_bdg	Piezo non-specific cation channel, R-Ras-binding domain. This is an extracellular domain at the C-terminus of Piezo, or FAM38 mechanosensitive non-specific cation channel proteins. It seems likely that this region of the Piezo proteins may be responsible for R-Ras recruitment because this region is capable of relocalising R-Ras to the ER in eukaryotes.	419
403406	pfam12167	Arm-DNA-bind_2	Arm DNA-binding domain. This domain is found at the N-terminus of various phage integrases. The domain binds to DNA.	65
403407	pfam12168	DNA_pol3_tau_4	DNA polymerase III subunits tau domain IV DnaB-binding. This domain family is found in bacteria, and is approximately 80 amino acids in length. The family is found in association with pfam00004. Domains I-III are shared between the tau and the gamma subunits, while most of the DnaB-binding Domain IV and all of the alpha-interacting Domain V are unique to tau.	82
403408	pfam12169	DNA_pol3_gamma3	DNA polymerase III subunits gamma and tau domain III. This domain family is found in bacteria, and is approximately 110 amino acids in length. The family is found in association with pfam00004. Domains I-III are shared between the tau and the gamma subunits, while most of the DnaB-binding Domain IV and all of the alpha-interacting Domain V are unique to tau.	143
403409	pfam12170	DNA_pol3_tau_5	DNA polymerase III tau subunit V interacting with alpha. This domain family is found in bacteria, and is approximately 140 amino acids in length. The family is found in association with pfam00004. Domains I-III are shared between the tau and the gamma subunits, while most of the DnaB-binding Domain IV and all of the alpha-interacting Domain V are unique to tau. The extreme C-terminal region of this domain 5 is the part which interacts with the alpha subunit of the DNA polymerase III holoenzyme.	142
403410	pfam12171	zf-C2H2_jaz	Zinc-finger double-stranded RNA-binding. This domain family is found in archaea and eukaryotes, and is approximately 30 amino acids in length. The mammalian members of this group occur multiple times along the protein, joined by flexible linkers, and are referred to as JAZ - dsRNA-binding ZF protein - zinc-fingers. The JAZ proteins are expressed in all tissues tested and localize in the nucleus, particularly the nucleolus. JAZ preferentially binds to double-stranded (ds) RNA or RNA/DNA hybrids rather than DNA. In addition to binding double-stranded RNA, these zinc-fingers are required for nucleolar localization.	26
403411	pfam12172	DUF35_N	Rubredoxin-like zinc ribbon domain (DUF35_N). This domain has no known function and is found in conserved hypothetical archaeal and bacterial proteins. The domain is duplicated in Mycobacterium tuberculosis Rv3521. The structure of a DUF35 representative reveals two long N-terminal helices followed by a rubredoxin-like zinc ribbon domain represented in this family and a C-terminal OB fold domain. Zinc is chelated by the four conserved cysteines in the alignment.	37
152608	pfam12173	BacteriocIIc_cy	Bacteriocin class IIc cyclic gassericin A-like. This class of bacteriocins was previously described as class V. The members include gassericin A, acidocin B and butyrovibriocin AR10, all of which are hydrophobic cyclical structures. The N- and C-termini are covalently linked, and the circular molecule is resistant to several proteases and peptidases. The immunity protein that protects Lactobacillus gasseri from the toxic effects of its bacteriocin, gassericin A, has been identified. It is found to be a small positively-charged hydrophobic peptide of 53 amino acids containing a putative transmembrane segment - a structure unlike that of the more common immunity proteins as found in pfam08951.	91
403412	pfam12174	RST	RCD1-SRO-TAF4 (RST) plant domain. This domain is found in plant RCD1, SRO and TAF4 proteins, hence its name of RST. It is required for interaction with multiple plant transcription factors. Radical-Induced Cell Death1 (RCD1) is an important regulator of stress and hormonal and developmental responses in Arabidopsis thaliana, as is its closest homolog, SRO1 - Similar To RCD-One1. TBP-Associated Factor 4 (TAF4) and TAF4-b are components of the transcription initiation factor complex TFIID.	67
403413	pfam12175	WSS_VP	White spot syndrome virus structural envelope protein VP. This family of proteins is found in viruses. Proteins in this family are approximately 210 amino acids in length. There is a conserved NNT sequence motif. These proteins are structural envelope proteins in viruses. This is the beta barrel C terminal domain. There is a protruding N terminal domain which completes the proteins. Three of four envelope proteins in white spot syndrome virus share sequence homology with each other and are present in this family - VP24, VP26 and VP28. VP19 is the other major envelope protein but shares no sequence homology with the other proteins. These proteins are essential for entry into cells of the crustacean host.	201
403414	pfam12176	MtaB	Methanol-cobalamin methyltransferase B subunit. This family of proteins is found in bacteria and archaea. Proteins in this family are approximately 460 amino acids in length. MtaB folds as a TIM barrel and contains a novel zinc-binding motif. Zinc(II) lies at the bottom of a funnel formed at the C-terminal beta-barrel end and ligates to two cysteinyl sulfurs (Cys-220 and Cys-269) and one carboxylate oxygen (Glu-164). The function of this protein is to catalyze the cleavage of the C-O bond in methanol by an SN2 mechanism. It complexes with MtaA and MtaC to perform this function.	459
403415	pfam12177	Proho_convert	Prohormone convertase enzyme. This domain family is found in eukaryotes, and is approximately 40 amino acids in length. The family is found in association with pfam01483, pfam00082. There are two completely conserved residues (Y and D) that may be functionally important. This protein is the C terminal domain of a prohormone convertase enzyme which targets hormones in dense core secretory granules. This C terminal tail domain is the domain responsible for targeting these dense core secretory granules. The domain adopts an alpha helical structure.	39
403416	pfam12178	INCENP_N	Chromosome passenger complex (CPC) protein INCENP N terminal. This domain family is found in eukaryotes, and is approximately 40 amino acids in length. INCENP is a regulatory protein in the chromosome passenger complex. It is involved in regulation of the catalytic protein Aurora B. It performs this function in association with two other proteins - Survivin and Borealin. These proteins form a tight three-helical bundle. The N terminal domain is the domain involved in formation of this three helical bundle.	33
403417	pfam12179	IKKbetaNEMObind	I-kappa-kinase-beta NEMO binding domain. This domain family is found in eukaryotes, and is approximately 40 amino acids in length. The family is found in association with pfam00069. These proteins are involved in inflammatory reactions. They cause release of NF-kappa-B into the nucleus of inflammatory cells and upregulation of transcription of proinflammatory cytokines. They perform this function by phosphorylating I-kappa-B proteins which are targeted for degradation to release NF-kappa-B. This kinase (I-kappa-kinase-beta) is found in association with IKK-alpha and NEMO (NF-kappa-B essential modulator). This domain is the binding site of IKK-beta for NEMO.	35
403418	pfam12180	EABR	TSG101 and ALIX binding domain of CEP55. This domain family is found in eukaryotes, and is approximately 40 amino acids in length. This domain is the active domain of CEP55. CEP55 is a protein involved in cytokinesis, specifically in abscission of the plasma membrane at the midbody. To perform this function, CEP55 complexes with ESCRT-I (by a Proline rich sequence in its TSG101 domain) and ALIX. This is the domain on CEP55 which binds to both TSG101 and ALIX. It also acts as a hinge between the N and C termini. This domain is called EABR.	34
403419	pfam12181	MogR_DNAbind	DNA binding domain of the motility gene repressor (MogR). This domain family is found in bacteria, and is approximately 150 amino acids in length. MogR is involved in repression of transcription of the flagellar gene in Listeria bacteria. This allows a phenotypical switch from an extracellular bacterium to an intracellular pathogen. MogR binds AT rich flagellar gene promoter regions upstream of the flagellar gene. These regions follow the pattern 5'-TTTTNNNNNAAAA-3'. This domain is the DNA binding domain of MogR.	151
403420	pfam12182	DUF3642	Bacterial lipoprotein. This domain family is found in bacteria, and is approximately 60 amino acids in length. There is a single completely conserved Y residue that may be functionally important. This domain is from a bacterial lipoprotein, a major virulence factor in Gram negative bacteria.	83
403421	pfam12183	NotI	Restriction endonuclease NotI. This family of proteins is found in bacteria. Proteins in this family are typically between 270 and 341 amino acids in length. There is a conserved CPF sequence motif. The type IIP restriction enzyme, NotI, is a homodimer that recognizes the 8 bp DNA sequence 5'-GC/GGCCGC-3' and cleaves both strands of DNA to create 5', 4 base cohesive overhangs.	232
403422	pfam12185	IR1-M	Nup358/RanBP2 E3 ligase domain. This domain family is found in eukaryotes, and is approximately 60 amino acids in length. The family is found in association with pfam00638, pfam00641, pfam00160. There are two conserved sequence motifs: TFFC and EDF. Nup358/RanBP2 is a nucleoporin involved in ubiquitination of many different protein targets from various cellular pathways. It complexes with Ubc9, SUMO-1 and RanGAP1 to perform this function. This is the ligase domain which binds to Ubc9.	59
403423	pfam12186	AcylCoA_dehyd_C	Acyl-CoA dehydrogenase C terminal. This domain family is found in bacteria, and is approximately 110 amino acids in length. The family is found in association with pfam02770, pfam00441, pfam02771. There is a conserved ARRL sequence motif. The C terminal domain is an alpha helical domain. The flavin ring of Acyl-CoA dehydrogenase is buried in the crevice between the two alpha helical domains and the beta-sheet domain of one subunit, and the adenosine pyrophosphate moiety is stretched into the subunit junction of a neighboring subunit, composed of two C terminal domains.	111
288997	pfam12187	VirArc_Nuclease	Viral/Archaeal nuclease. This family of proteins is found in archaea and viruses. Proteins in this family are typically between 211 and 244 amino acids in length. These proteins are nucleases from fusseloviruses and sulfolobus archaea.	149
371951	pfam12188	STAT2_C	Signal transducer and activator of transcription 2 C terminal. This domain family is found in eukaryotes, and is approximately 60 amino acids in length. The family is found in association with pfam02865, pfam00017, pfam01017, pfam02864. There is a conserved DLP sequence motif. STATs are involved in transcriptional regulation and are the only regulators known to be modulated by tyrosine phosphorylation. STAT2 forms a trimeric complex with STAT1 and IRF-9 (Interferon Regulatory Factor 9), on activation of the cell by interferon, which is called ISGF3 (Interferon-stimulated gene factor 3). The C terminal domain of STAT2 contains a nuclear export signal (NES) which allows export of STAT2 into the cytoplasm along with any complexed molecules.	53
288999	pfam12189	VirE1	Single-strand DNA-binding protein. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length. There is a conserved IELE sequence motif. VirE1 is an acidic chaperone protein which binds to VirE2, a ssDNA binding protein. These proteins are virulence factors of the plant pathogens Agrobacteria. VirE1 competes for the ssDNA binding site of VirE2.	63
371952	pfam12190	amfpi-1	Fungal protease inhibitor. This protein family is found in eukaryotes, and is approximately 50 amino acids in length. These proteins are fungal protease inhibitors.	91
403424	pfam12191	stn_TNFRSF12A	tumor necrosis factor receptor stn_TNFRSF12A_TNFR domain. This family of proteins is found in eukaryotes. Proteins in this family are typically between 129 and 184 amino acids in length. This is the stn_TNFRSF12A_TNFR domain from the tumor necrosis factor receptor. The function of this domain is unknown.	129
371954	pfam12192	CBP	Fungal calcium binding protein. This domain is found in eukaryotes, and is approximately 60 amino acids in length. There is a single completely conserved residue C that may be functionally important. This is a calcium binding domain from the fungal protein CBP (calcium binding protein). This protein is a virulence factor with unknown virulence mechanisms. CBP complexes as a highly intertwined homodimer. Each monomer is comprised of four alpha helices which adopt the saposin fold, characteristic of a protein family that binds to membranes and lipids.	76
289003	pfam12193	Sulf_coat_C	Sulfolobus virus coat protein C terminal. This domain family is found in viruses, and is approximately 70 amino acids in length. It is the C terminal of a coat protein in sulfolobus viruses.	69
403425	pfam12194	Ste5_C	Protein kinase Fus3-binding. This domain family is found in eukaryotes, and is approximately 190 amino acids in length. This domain is the penultimate C terminal domain from the protein ste5 which co-catalyzes the phosphorylation of fus3 by ste7. It is involved in the MAPK pathways. This domain is the minimal scaffold domain of ste5. It binds to the mitogen activated protein kinase fus3 before it is phosphorylated.	189
152630	pfam12195	End_beta_barrel	Beta barrel domain of bacteriophage endosialidase. This domain family is found in bacteria and viruses, and is approximately 80 amino acids in length. This domain is the beta barrel domain of bacteriophage endosialidase which represents one of the two sialic acid binding sites of the enzyme. The domain is nested in the beta propeller domain of the endosialidase enzyme. The endosialidase protein complexes to form homotrimeric molecules.	83
403426	pfam12196	hNIFK_binding	FHA Ki67 binding domain of hNIFK. This domain family is found in eukaryotes, and is approximately 40 amino acids in length. The family is found in association with pfam00076. There are two conserved sequence motifs: TPVCTP and LERRKS. This domain is found on the human nucleolar protein hNIFK. It binds to the fork-head-associated domain of human Ki67. High-affinity binding requires sequential phosphorylation by two kinases, CDK1 and GSK3, yielding pThr238, pThr234 and pSer230. This interaction is involved in cell cycle regulation.	40
314977	pfam12197	lci	Bacillus cereus group antimicrobial protein. This domain is found in bacteria, and is approximately 40 amino acids in length. This domain is found in bacillus cereus group bacteria. It is an antimicrobial protein.	42
289006	pfam12198	Tuberculin	Theoretical tuberculin protein. This domain family is found in bacteria, and is approximately 30 amino acids in length. This protein is a theoretical model of the tuberculin protein from Mycobacterium tuberculosis.	34
289007	pfam12199	efb-c	Extracellular fibrinogen binding protein C terminal. This domain family is found in bacteria, and is approximately 70 amino acids in length. There is a conserved VLK sequence motif. It is the C terminal domain of bacterial extracellular fibrinogen binding protein. It contains a helical motif involved in complement regulation. This motif binds to complement and changes its conformation to a form which cannot activate downstream components of the complement cascade.	65
403427	pfam12200	DUF3597	Domain of unknown function (DUF3597). This family of proteins is found in bacteria, eukaryotes and viruses. Proteins in this family are typically between 126 and 281 amino acids in length. The function of this domain is unknown. The structure of this domain has been found to contain five helices with a long flexible loop between helices one and two.	127
371957	pfam12201	bcl-2I13	Bcl2-interacting killer, BH3-domain containing. This is a family of pro-apoptotic Bcl-x proteins, B cell leukaemia/lymphoma 2, or BIKs. BIK proteins rely for their activity upon an intact BH3 domain lying between residues 48 and 80, as in UniProt:Q13323.	155
403428	pfam12202	OSR1_C	Oxidative-stress-responsive kinase 1 C-terminal domain. This domain family is found in eukaryotes, and is approximately 40 amino acids in length. The family is found in association with pfam00069. There is a single completely conserved residue F that may be functionally important. OSR1 is involved in the signalling cascade which activates Na/K/2Cl cotransporter during osmotic stress. This domain is the C terminal domain of OSR1 which recognizes a motif (Arg-Phe-Xaa-Val) on the OSR1-activating protein WNK1.	64
403429	pfam12203	HDAC4_Gln	Glutamine rich N terminal domain of histone deacetylase 4. This domain is found in eukaryotes, and is approximately 90 amino acids in length. The family is found in association with pfam00850. The domain forms an alpha helix which complexes to form a tetramer. The glutamine rich domains have many intra- and inter-helical interactions which are thought to be involved in reversible assembly and disassembly of proteins. The domain is part of histone deacetylase 4 (HDAC4) which removes acetyl groups from histones. This restores their positive charge to allow stronger DNA binding thus restricting transcriptional activity.	91
403430	pfam12204	DUF3598	Domain of unknown function (DUF3598). This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 230 and 398 amino acids in length. These proteins are formed entirely from B sheets which form a barrel structure similar to those seen in the lipocalin superfamily.	267
403431	pfam12205	GIT1_C	G protein-coupled receptor kinase-interacting protein 1 C term. This domain family is found in eukaryotes, and is approximately 120 amino acids in length. The family is found in association with pfam01412, pfam00023, pfam08518. GIT1 plays an important role in cell adhesion, motility, cytoskeletal remodeling and membrane trafficking. To perform this function, it localizes p21-activated kinase (PAK) and PAK-interactive exchange factor to focal adhesions. Its activation is regulated by interaction between its paxillin-binding C terminal and the LD motifs of paxillin. The C terminal folds into a four helix bundle.	116
403432	pfam12206	DUF3599	Domain of unknown function (DUF3599). This family of proteins is found in bacteria. Proteins in this family are approximately 120 amino acids in length. This domain is the phage-like element pbsx protein xkdh.	115
403433	pfam12207	DUF3600	Domain of unknown function (DUF3600). This family of proteins is found in bacteria. Proteins in this family are approximately 230 amino acids in length. This domain is the C terminal of the putative ecf-type sigma factor negative effector.	157
314984	pfam12208	DUF3601	Domain of unknown function (DUF3601). This domain family is found in bacteria, and is approximately 80 amino acids in length.	77
371963	pfam12209	SAC3	Leucine permease transcriptional regulator helical domain. This domain family is found in eukaryotes, and is approximately 80 amino acids in length. The family is found in association with pfam03399. This domain is a helical domain in the middle of leucine permease transcriptional regulator.	77
403434	pfam12210	Hrs_helical	Hepatocyte growth factor-regulated tyrosine kinase substrate. This domain family is found in eukaryotes, and is approximately 100 amino acids in length. The family is found in association with pfam00790, pfam01363, pfam02809. This domain is the helical region of Hrs which forms the core complex of ESCRT with STAM.	95
314987	pfam12211	LMWSLP_N	Low molecular weight S layer protein N terminal. This family of proteins is found in bacteria. Proteins in this family are typically between 328 and 381 amino acids in length. There is a conserved LGDG sequence motif. Clostridial species have a layer of surface proteins surrounding their membrane. This layer is comprised of a high molecular weight protein and a low molecular weight protein. This domain is the N terminal domain of the low molecular weight protein. It is a structural domain.	258
314988	pfam12212	PAZ_siRNAbind	PAZ domain. This entry corresponds to the PAZ domain found in some archaeal argonaute proteins. It is an siRNA binding domain.	127
371965	pfam12213	Dpoe2NT	DNA polymerases epsilon N terminal. This domain is found in eukaryotes, and is approximately 70 amino acids in length. The family is found in association with pfam04042. There is a single completely conserved residue F that may be functionally important. This domain is the N terminal domain of DNA polymerase epsilon subunit B. It forms a primarily alpha helical structure in which four helices are arranged in two hairpins with connecting loops containing beta strands which form a short parallel sheet. DNA polymerase epsilon is required in DNA replication for synthesis of the leading strand. This domain has close structural relation to AAA+ protein C terminal domains.	71
403435	pfam12214	TPX2_importin	Cell cycle regulated microtubule associated protein. This domain is found in eukaryotes. This domain is typically between 127 to 182 amino acids in length. This domain is found associated with pfam06886. This domain is found in the protein TPX2 (a.k.a p100) which is involved in cell cycling. It is only expressed between the start of the S phase and completion of cytokinesis. The microtubule-associated protein TPX2 has been reported to be crucial for mitotic spindle formation. This domain is close to the C terminal of TPX2. The protein importin alpha regulates the activity of TPX2 by binding to the nuclear localization signal in this domain.	127
403436	pfam12215	Glyco_hydr_116N	beta-glucosidase 2, glycosyl-hydrolase family 116 N-term. This domain is found in bacteria, archaea and eukaryotes. This domain is typically between 320 to 354 amino acids in length. This domain is found associated with pfam04685. It is found just after the extreme N-terminus. The N-terminal is thought to be the luminal domain while the C terminal is the cytosolic domain. The catalytic domain of GBA-2 is unknown. The primary catabolic pathway for glucosylceramide is catalysis by the lysosomal enzyme glucocerebrosidase. In higher eukaryotes, glucosylceramide is the precursor of glycosphingolipids, a complex group of ubiquitous membrane lipids. Mutations in the human protein cause motor-neurone defects in hereditary spastic paraplegia. The catalytic nucleophile, identified in UniProtKB:Q97YG8_SULSO, is a glutamine-335 in the downstream family pfam04685.	309
371968	pfam12216	m04gp34like	Immune evasion protein. This protein is found in archaea and viruses. Proteins in this family are typically between 265 to 342 amino acids in length. The proteins in this family are or are related to the m04 encoded protein gp34 of pathogenic microorganisms such as murine cytomegalovirus. m06 and m152 genes are expressed earlier in the intracellular replication phases of these microorganisms' life cycles. They function to inhibit MHC-1 loading and export. gp34 is theorized to prevent immune reactions from NK cells which would ordinarily recognize and attack cells lacking MHC.	267
152652	pfam12217	End_beta_propel	Catalytic beta propeller domain of bacteriophage endosialidase. This domain family is found in bacteria and viruses, and is typically between 443 and 460 amino acids in length. This domain is the highly conserved beta propeller of bacteriophage endosialidase which represents the catalytically active part of the enzymes. This core domain forms stable SDS-resistant trimers. There is a nested beta barrel domain in this domain (pfam12195). The endosialidase protein complexes to form a homotrimeric molecule.	449
314993	pfam12218	End_N_terminal	N terminal extension of bacteriophage endosialidase. This domain family is found in bacteria and viruses, and is approximately 70 amino acids in length. This domain is found in the bacteriophage protein endosialidase. The two N-terminal domains (this domain and the beta propeller) assemble in the compact 'cap' whereas the C-terminal domain forms an extended tail-like structure. The very N-terminal part of the 'cap' region (residues 246 to 312) holds the only alpha-helix of the protein and is presumably the residual part of the deleted N-terminal head-binding domain. The endosialidase protein complexes to form homotrimeric molecules.	67
152654	pfam12219	End_tail_spike	Catalytic domain of bacteriophage endosialidase. This domain family is found in bacteria and viruses, and is approximately 160 amino acids in length. There are two conserved sequence motifs: VSR and YGA. This domain is the C terminal domain of the bacteriophage protein endosialidase. The endosialidase protein forms homotrimeric molecules and this domain complexes into a tail-spike stalk. The stalk region folds in a triple beta-helix that is interrupted by a small triple beta-prism domain. The tail-spike is a multifunctional protein device used by the phage to fulfill the following functions: (i) to adsorb to the bacterial polySia capsule (ii) to de-polymerize the capsule to gain access to the outer bacterial membrane, and finally (iii) to mediate tight adhesion to the membrane, a prerequisite for the initiation of the infection cycle.	160
403437	pfam12220	U1snRNP70_N	U1 small nuclear ribonucleoprotein of 70kDa MW N terminal. This domain is found in eukaryotes. This domain is about 90 amino acids in length. This domain is found associated with pfam00076. This domain is part of U1 snRNP, which is the pre-mRNA binding protein of the penta-snRNP spliceosome complex. It extends over a distance of 180 A from its RNA binding domain, wraps around the core domain of U1 snRNP consisting of the seven Sm proteins and finally contacts U1-C, which is crucial for 5'-splice-site recognition.	90
403438	pfam12221	HflK_N	Bacterial membrane protein N terminal. This domain is found in bacteria. This domain is typically between 65 to 81 amino acids in length. This domain is found associated with pfam01145. This domain is the N terminal of the bacterial membrane protein HflK. HflK complexes with HflC to form a membrane protease which is modulated by the GTPase HflX. The N terminal domain of HflK is the membrane spanning region which anchors the protein in the bacterial membrane.	44
403439	pfam12222	PNGaseA	Peptide N-acetyl-beta-D-glucosaminyl asparaginase amidase A. This family of proteins is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 558 and 775 amino acids in length. There is a conserved TGG sequence motif. PNGase A is a protein which cleaves glycopeptides.	434
403440	pfam12223	DUF3602	Protein of unknown function (DUF3602). This domain family is found in eukaryotes, and is typically between 78 and 89 amino acids in length.	80
403441	pfam12224	Amidoligase_2	Putative amidoligase enzyme. This family of proteins are likely to act as amidoligase enzymes. Proteins in this family are found in conserved gene neighborhoods encoding a glutamine amidotransferase-like thiol peptidase (in proteobacteria) or an Aig2 family cyclotransferase protein (in firmicutes).	235
403442	pfam12225	MTHFR_C	Methylene-tetrahydrofolate reductase C terminal. This family is found in bacteria and archaea, and is approximately 100 amino acids in length. There is a conserved NGPCGG sequence motif. This family is the C terminal of methylene-tetrahydrofolate reductase. This protein reduces FAD using the reducing equivalents from reduced FAD, subsequently reduces tetrahydrofolate. The C terminal of MTHFR contains the FAD binding site and is the catalytic portion of the enzyme.	89
371974	pfam12226	Astro_capsid_p	Turkey astrovirus capsid protein. This family of proteins is found in viruses. Proteins in this family are typically between 241 and 261 amino acids in length. These proteins are capsid proteins from various astrovirus strains.	361
371975	pfam12227	DUF3603	Protein of unknown function (DUF3603). This protein is found in bacteria and eukaryotes. Proteins in this family are about 250 amino acids in length.	214
371976	pfam12228	DUF3604	Protein of unknown function (DUF3604). This family of proteins is found in bacteria. Proteins in this family are typically between 621 and 693 amino acids in length.	591
403443	pfam12229	PG_binding_4	Putative peptidoglycan binding domain. This domain is found associated with the L,D-transpeptidase domain pfam03734. The structure of this domain has been solved and shows a mixed alpha-beta fold composed of nine beta strands and four alpha helices. This domain is usually found to be duplicated. Therefore, it seems likely that this domain acts to bind the two unlinked peptidoglycan chains and bring them into close association so they can be cross linked by the transpeptidase domain (Bateman A pers. observation).	117
403444	pfam12230	PRP21_like_P	Pre-mRNA splicing factor PRP21 like protein. This domain family is found in eukaryotes, and is typically between 212 and 238 amino acids in length. The family is found in association with pfam01805. There are two completely conserved residues (W and H) that may be functionally important. PRP21 is required for assembly of the prespliceosome and it interacts with U2 snRNP and/or pre-mRNA in the prespliceosome. This family also contains proteins similar to PRP21, such as the mammalian SF3a. SF3a also interacts with U2 snRNP from the prespliceosome, converting it to its active form.	213
403445	pfam12231	Rif1_N	Rap1-interacting factor 1 N terminal. This domain family is found in eukaryotes, and is typically between 135 and 146 amino acids in length. Rif1 is a protein which interacts with Rap1 to regulate telomere length. Interaction with telomeres limits their length. The N terminal region contains many HEAT- and ARMADILLO- type repeats. These are helical folds which form extended curved proteins or RNA interface surfaces.	363
403446	pfam12232	Myf5	Myogenic determination factor 5. This domain family is found in eukaryotes, and is approximately 70 amino acids in length. The family is found in association with pfam00010, pfam01586. There is a conserved CSD sequence motif. Myf5 is responsible for directing cells to the skeletal myocyte lineage during development. Myf5 is likely to act in a similar way to the other MRF4 proteins such as MyoD which perform the same function. These are histone acetyltransferases and histone deacetylases which activate and repress genes involved in the myocyte lineage.	71
289037	pfam12233	p12I	Human adult T cell leukemia/lymphoma virus protein. This family of proteins is found in viruses. Proteins in this family are approximately 100 amino acids in length. p12I binds to the immature beta and gamma-c chains of the interleukin-2 receptor retarding their translocation to the plasma membrane. p12I forms dimers which bind to these chains.	99
403447	pfam12234	Rav1p_C	RAVE protein 1 C terminal. This domain family is found in eukaryotes, and is typically between 621 and 644 amino acids in length. This family is the C terminal region of the protein RAVE (regulator of the ATPase of vacuolar and endosomal membranes). Rav1p is involved in regulating the glucose dependent assembly and disassembly of vacuolar ATPase V1 and V0 subunits.	637
403448	pfam12235	FXMRP1_C_core	Fragile X-related 1 protein core C terminal. This domain family is found in eukaryotes, and is typically between 126 and 160 amino acids in length. The family is found in association with pfam05641, pfam00013. This family is the core C terminal region of the fragile X related 1 proteins FXR1P, FXR2 and FMR1. These different proteins have different regions at their very C-terminus. The Glutamine-arginine rich region facilitates protein interactions. This family contains two blocks of RGG repeats that bind to G-quartet sequences in a wide variety of mRNAs.	121
403449	pfam12236	Head-tail_con	Bacteriophage head to tail connecting protein. This family of head-tail connector proteins is found in bacteria and viruses. Proteins in this family are typically between 516 and 555 amino acids in length. This protein is found in Phage T7 and T3 among others.	479
403450	pfam12237	PCIF1_WW	Phosphorylated CTD interacting factor 1 WW domain. This domain family is found in bacteria and eukaryotes, and is approximately 180 amino acids in length. This domain is the WW domain of PCIF1. PCIF1 interacts with phosphorylated RNA polymerase II carboxy-terminal domain (CTD). The WW domain of PCIF1 can directly and preferentially bind to the phosphorylated CTD compared to the unphosphorylated CTD. PCIF1 binds to the hyperphosphorylated RNAP II (RNAP IIO) in vitro and in vivo. Double immunofluorescence labeling in HeLa cells demonstrated that PCIF1 and endogenous RNAP IIO are co-localized in the cell nucleus. Thus, PCIF1 may play a role in mRNA synthesis by modulating RNAP IIO activity.	172
289042	pfam12238	MSA-2c	Merozoite surface antigen 2c. This family of proteins is found in eukaryotes. Proteins in this family are typically between 263 and 318 amino acids in length. There is a conserved SFT sequence motif. MSA-2 is a plasma membrane glycoprotein which can be found in Babesia bovis species.	216
403451	pfam12239	DUF3605	Protein of unknown function (DUF3605). This family of proteins is found in eukaryotes and viruses. Proteins in this family are typically between 161 and 256 amino acids in length.	155
403452	pfam12240	Angiomotin_C	Angiomotin C terminal. This domain family is found in eukaryotes, and is typically between 197 and 211 amino acids in length. This family is the C terminal region of angiomotin. Angiomotin regulates the action of angiogenesis inhibitor angiostatin. The C terminal region of angiomotin appears to be involved in directing the protein chemotactically.	200
403453	pfam12241	Enoyl_reductase	Trans-2-enoyl-CoA reductase catalytic region. This family of trans-2-enoyl-CoA reductases, EC:1.3.1.44, carries the catalytic sites of the enzyme, characterized by the conserved sequence motifs: YNThhhFxK, and YShAPxR. In Euglena where the enzyme has been characterized it catalyzes the reduction of enoyl-CoA to acyl-CoA in an unusual fatty acid pathway in mitochondria. The whole pathway performs a malonyl-CoA independent synthesis of fatty acids leading to accumulation of wax esters, which serve as the sink for electrons stemming from glycolytic ATP synthesis and pyruvate oxidation.	236
403454	pfam12242	Eno-Rase_NADH_b	NAD(P)H binding domain of trans-2-enoyl-CoA reductase. This family carries the region of the enzyme trans-2-enoyl-CoA reductase, EC:1.3.1.44, which binds NAD(P)H. The activity of the enzyme was characterized in Euglena where an unusual fatty acid synthesis pathway in the mitochondria performs a malonyl-CoA independent synthesis of fatty acids leading to accumulation of wax esters, which serve as the sink for electrons stemming from glycolytic ATP synthesis and pyruvate oxidation. The full enzyme catalyzes the reduction of enoyl-CoA to acyl-CoA. The binding site is conserved as GA/CSpGYG, where p is any polar residue.	78
403455	pfam12243	CTK3	CTD kinase subunit gamma CTK3. The C-terminal domain kinase (CTDK-1), is a three-subunit complex comprised of Ctk1, Ctk2, and Ctk3, that plays a key role in regulation of transcription and translation and in coordinating these two processes. Both Ctk2 and Ctk3 are regulated at the level of protein turnover, and are unstable proteins processed through a ubiquitin-proteasome pathway. Their physical interaction is required to protect both subunits from degradation, and both Ctk2 and Ctk3 are required for Ctk1 CTD kinase activation. The mammalian P-TEFb is mirrored by the combined complexes in yeast of the CTDK1 and the Bur1/2.	123
403456	pfam12244	DUF3606	Protein of unknown function (DUF3606). This family of proteins is found in bacteria. Proteins in this family are typically between 58 and 85 amino acids in length. There is a single completely conserved residue G that may be functionally important.	54
403457	pfam12245	Big_3_2	Bacterial Ig-like domain (group 3). This family consists of bacterial domains with an Ig-like fold. Members of this family are found in a variety of bacterial surface proteins.	122
403458	pfam12246	MKT1_C	Temperature dependent protein affecting M2 dsRNA replication. This domain family is found in eukaryotes, and is typically between 231 and 255 amino acids in length. There is a single completely conserved residue P that may be functionally important. MKT1 is required for maintenance of K2 toxin above 30 degrees C in strains with the L-A-HN variant of the L-A double-stranded RNA virus of Saccharomyces cerevisiae. MKT1 is a 93 kDa protein with serine-rich regions and the retroviral protease signature, DTG. This family is the C terminal region of MKT1.	242
403459	pfam12247	MKT1_N	Temperature dependent protein affecting M2 dsRNA replication. This domain family is found in eukaryotes, and is typically between 231 and 255 amino acids in length. There is a single completely conserved residue P that may be functionally important. MKT1 is required for maintenance of K2 toxin above 30 degrees C in strains with the L-A-HN variant of the L-A double-stranded RNA virus of Saccharomyces cerevisiae. MKT1 is a 93 kDa protein with serine-rich regions and the retroviral protease signature, DTG. This family is the N terminal region of MKT1.	84
403460	pfam12248	Methyltransf_FA	Farnesoic acid O-methyl transferase. This domain family is found in bacteria and eukaryotes, and is approximately 110 amino acids in length. Farnesoic acid O-methyl transferase (FAMeT) is the enzyme that catalyzes the formation of methyl farnesoate (MF) from farnesoic acid (FA) in the biosynthetic pathway of juvenile hormone (JH).	101
403461	pfam12249	AftA_C	Arabinofuranosyltransferase A C terminal. This domain family is found in bacteria, and is typically between 179 and 190 amino acids in length. This family is the C terminal region of AftA. The enzyme catalyzes the addition of the first key arabinofuranosyl residue from the sugar donor beta-D-arabinofuranosyl-1-monophosphoryldecaprenol to the galactan domain of the cell wall, thus priming the galactan for further elaboration by the arabinofuranosyltransferases. The C terminal region is predicted to be directed towards the periplasm.	177
403462	pfam12250	AftA_N	Arabinofuranosyltransferase N terminal. This domain family is found in bacteria, and is typically between 430 and 441 amino acids in length. This family is the N terminal region of AftA. The enzyme catalyzes the addition of the first key arabinofuranosyl residue from the sugar donor beta-D-arabinofuranosyl-1-monophosphoryldecaprenol to the galactan domain of the cell wall, thus priming the galactan for further elaboration by the arabinofuranosyltransferases. The N terminal region has been predicted to span 11 transmembrane regions.	424
403463	pfam12251	zf-SNAP50_C	snRNA-activating protein of 50kDa MW C terminal. This domain family is found in eukaryotes, and is typically between 196 and 207 amino acids in length. There is a conserved CEH sequence motif. SNAP50 is part of the snRNA-activating protein complex which activates RNA polymerases II and III. There is a cysteine-histidine cluster which contains two possible zinc finger motifs.	189
403464	pfam12252	SidE	Dot/Icm substrate protein. This family of proteins is found in bacteria. Proteins in this family are typically between 397 and 1543 amino acids in length. This family is the SidE protein in the Dot/Icm pathway of Legionella pneumophila bacteria. There is little literature describing the family.	220
403465	pfam12253	CAF1A	Chromatin assembly factor 1 subunit A. The CAF-1 or chromatin assembly factor-1 consists of three subunits, and this is the first, or A. The A domain is uniquely required for the progression of S phase in mouse cells, independent of its ability to promote histone deposition but dependent on its ability to interact with HP1 - heterochromatin protein 1-rich heterochromatin domains next to centromeres that are crucial for chromosome segregation during mitosis. This HP1-CAF-1 interaction module functions as a built-in replication control for heterochromatin, which, like a control barrier, has an impact on S-phase progression in addition to DNA-based checkpoints.	76
403466	pfam12254	DNA_pol_alpha_N	DNA polymerase alpha subunit p180 N terminal. This domain family is found in eukaryotes, and is approximately 70 amino acids in length. The family is found in association with pfam00136, pfam08996, pfam03104. This family is the N terminal of DNA polymerase alpha subunit p180 protein. The N terminal contains the catalytic region of the alpha subunit.	64
371995	pfam12255	TcdB_toxin_midC	Insecticide toxin TcdB middle/C-terminal region. This domain family is found in bacteria, and is approximately 150 amino acids in length. The family is found in association with pfam03534. This family is the C-terminal-sided middle region of the bacterial insecticide toxin TcdB.	140
403467	pfam12256	TcdB_toxin_midN	Insecticide toxin TcdB middle/N-terminal region. This domain family is found in bacteria and archaea, and is typically between 164 and 180 amino acids in length. The family is found in association with pfam05593. This family is the N-terminal-sided middle region of the bacterial insecticide toxin TcdB. This region appears related to the FG-GAP repeat pfam01839.	181
403468	pfam12257	IML1	Vacuolar membrane-associated protein Iml1. Proteins in this family contain a DEP domain, which is a globular domain of about 80 residues. This entry includes vacuolar membrane-associated protein Iml1 and DEP domain-containing protein 5/DDB_G0279099. In Saccharomyces cerevisiae, Iml1 is a subunit of both the SEA (Seh1-associated) and Iml1 complexes (Iml1-Npr2-Npr3). The SEA complex associates dynamically with the vacuole and is involved in autophagy. The Iml1 complex is required for non-nitrogen-starvation (NNS)-induced autophagy.	278
403469	pfam12258	Microcephalin	Microcephalin protein. This family of proteins is found in eukaryotes. Proteins in this family are typically between 384 and 835 amino acids in length. Microcephalin is involved in determining the size of the brain in animals. It is a protein, which if expressed homozygously causes the organism to have the condition microcephaly. Organisms expressing the mutated form of this protein in a homozygous manner develop a condition called microcephaly - a drastically reduced brain mass and volume. Microcephalin is predicted to contain three BRCA1 C-terminal domains, the first of which is the probable microcephaly mutation site.	391
371998	pfam12259	Baculo_F	Baculovirus F protein. This protein is found in a variety of baculoviruses. It is known as the F protein. Matches to this family are additionally found in some presumed transposons.	606
403470	pfam12260	PIP49_C	Protein-kinase domain of FAM69. This is the C-terminal region of a family of FAM69 proteins from Metazoa and Viridiplantae that are active protein-kinases. The family members have a short transmembrane helix close to the N-terminus, and thereafter are highly enriched with cysteines. FAM69 proteins are localized to the endoplasmic reticulum. Many members also have a short EF-hand, calcium-binding, domain just upstream of the kinase domain. The exact function of the more N-terminal family is uncertain.	189
403471	pfam12261	T_hemolysin	Thermostable hemolysin. This family of proteins is found in bacteria. Proteins in this family are typically between 200 and 228 amino acids in length. T_hemolysin is a pore-forming toxin of bacteria, able to lyse erythrocytes from a number of mammalian species.	171
372000	pfam12262	Lipase_bact_N	Bacterial virulence factor lipase N-terminal. This domain family is found in bacteria, and is typically between 258 and 271 amino acids in length. There are two conserved sequence motifs: DGT and DGWST. This family is the N-terminal region of bacterial virulence factor lipase. The N-terminal region contains a potential signalling sequence.	238
403472	pfam12263	DUF3611	Protein of unknown function (DUF3611). This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 180 and 205 amino acids in length. There are two completely conserved residues (W and G) that may be functionally important.	176
152699	pfam12264	Waikav_capsid_1	Waikavirus capsid protein 1. The rice tungro spherical waikavirus polyprotein is cleaved into 7 proteins, including three capsid proteins, by the tungro spherical virus-type peptidase pfam12381. This family represents the capsid protein 1.	197
403473	pfam12265	CAF1C_H4-bd	Histone-binding protein RBBP4 or subunit C of CAF1 complex. The CAF-1 complex is a conserved heterotrimeric protein complex that promotes histone H3 and H4 deposition onto newly synthesized DNA during replication or DNA repair; specifically it facilitates replication-dependent nucleosome assembly with the major histone H3 (H3.1). This domain is an alpha helix which sits just upstream of the WD40 seven-bladed beta-propeller in the human RbAp46 protein. RbAp46 folds into the beta-propeller and binds histone H4 in a groove formed between this N-terminal helix and an extended loop inserted into blade six.	69
289069	pfam12266	DUF3613	Protein of unknown function (DUF3613). This family of proteins is found in bacteria. Proteins in this family are typically between 94 and 126 amino acids in length.	67
152702	pfam12267	DUF3614	Protein of unknown function (DUF3614). This family of proteins is found in viruses. Proteins in this family are typically between 162 and 495 amino acids in length.	173
372003	pfam12268	DUF3612	Protein of unknown function (DUF3612). This domain family is found in bacteria, and is approximately 180 amino acids in length. The family is found in association with pfam01381.	176
403474	pfam12269	zf-CpG_bind_C	CpG binding protein zinc finger C terminal domain. This domain family is found in eukaryotes, and is approximately 240 amino acids in length. This domain is the zinc finger domain of a CpG binding DNA methyltransferase protein. It contains a CxxC motif which forms the zinc finger and binds to DNA.	233
403475	pfam12270	Cyt_c_ox_IV	Cytochrome c oxidase subunit IV. This family of proteins is found in bacteria. Proteins in this family are approximately 140 amino acids in length. This family is the fourth subunit of the cytochrome c oxidase complex. This subunit does not have a catalytic capacity but instead, is required for assembly and/or stability of the complex.	132
403476	pfam12271	Chs3p	Chitin synthase III catalytic subunit. This family of proteins is found in eukaryotes. Proteins in this family are typically between 288 and 332 amino acids in length. This family is the catalytic domain of chitin synthase III. Chitin is a major component of fungal cell walls and this enzyme is responsible for its formation.	283
403477	pfam12273	RCR	Chitin synthesis regulation, resistance to Congo red. RCR proteins are ER membrane proteins that regulate chitin deposition in fungal cell walls. Although chitin, a linear polymer of beta-1,4-linked N-acetylglucosamine, constitutes only 2% of the cell wall it plays a vital role in the overall protection of the cell wall against stress, noxious chemicals and osmotic pressure changes. Congo red is a cell wall-disrupting benzidine-type dye extensively used in many cell wall mutant studies that specifically targets chitin in yeast cells and inhibits growth. RCR proteins render the yeasts resistant to Congo red by diminishing the content of chitin in the cell wall. RCR proteins probably regulate chitin synthase III and interact directly with ubiquitin ligase Rsp5, and the VPEY motif is necessary for this, via interaction with the WW domains of Rsp5.	113
403478	pfam12274	DUF3615	Protein of unknown function (DUF3615). This domain family is found in bacteria and eukaryotes, and is typically between 86 and 97 amino acids in length. There is a conserved FAE sequence motif. There is a single completely conserved residue F that may be functionally important.	94
403479	pfam12275	DUF3616	Protein of unknown function (DUF3616). This family of proteins is found in bacteria. Proteins in this family are typically between 335 and 392 amino acids in length. There is a conserved GLRGPV sequence motif.	328
403480	pfam12276	DUF3617	Protein of unknown function (DUF3617). This family of proteins is found in bacteria. Proteins in this family are typically between 155 and 179 amino acids in length. There is a single completely conserved residue C that may be functionally important.	133
403481	pfam12277	DUF3618	Protein of unknown function (DUF3618). This domain family is found in bacteria, and is approximately 50 amino acids in length.	47
372010	pfam12278	SDP_N	Sex determination protein N terminal. This family of proteins is found in eukaryotes. Proteins in this family are typically between 168 and 410 amino acids in length. This family is the N terminal end of the sex determination protein of many different animals. It plays a role in the gender determination of around 20% of all animals.	137
403482	pfam12279	DUF3619	Protein of unknown function (DUF3619). This protein is found in bacteria. Proteins in this family are about 140 amino acids in length. This protein has two conserved sequence motifs: AAR and DDLP.	123
403483	pfam12280	BSMAP	Brain specific membrane anchored protein. This family of proteins is found in eukaryotes. Proteins in this family are typically between 285 and 331 amino acids in length. BSMAP has a putative transmembrane domain and is predicted to be a type I membrane glycoprotein.	195
403484	pfam12281	NTP_transf_8	Nucleotidyltransferase. This is a family of bacterial proteins that have a nucleotidyltransferase fold. The fold-prediction is backed up by conservation of three highly characteristic sequence motifs found in all other nucleotidyl transferases: i) pDhDhhh(h/p), where p is a polar residue and h is a hydrophobic residue; ii) upstream of the first, a GG/S; iii) a conserved D/E in a hydrophobic surround. In the classification of nucleotidyltransferases proposed in this is a group XVIII NTP-transferase. Many of these sequences were classified in the COG database as COG5397. The exact function is not known.	208
403485	pfam12282	H_kinase_N	Signal transduction histidine kinase. This domain is found in bacteria. This domain is about 150 amino acids in length. This domain is found associated with pfam07568, pfam08448, pfam02518. This domain has a single completely conserved residue P that may be functionally important. This family is mostly annotated as a histidine kinase involved in signal transduction but there is little published evidence to support this.	139
289084	pfam12283	Protein_K	Bacteriophage protein K. This family of proteins is found in viruses. Proteins in this family are approximately 60 amino acids in length. This family is a protein expressed by bacteriophages which has an unknown function. There is evidence that it is non-essential for in vivo production of a mature phage.	56
403486	pfam12284	HoxA13_N	Hox protein A13 N terminal. This family of proteins is found in eukaryotes. Proteins in this family are typically between 149 and 306 amino acids in length. The family is found in association with pfam00046. This family is the N terminal of the Hox gene protein involved in formation of the digital arch of the hands and feet as well as in correct genital formation. Mutation of the protein is associated with hand-foot-genital syndrome.	120
204871	pfam12285	DUF3621	Protein of unknown function (DUF3621). This family of proteins is found in viruses. Proteins in this family are typically between 49 and 62 amino acids in length. There are two conserved sequence motifs: QPLDLS and EQQ.	49
403487	pfam12286	DUF3622	Protein of unknown function (DUF3622). This family of proteins is found in bacteria. Proteins in this family are typically between 72 and 107 amino acids in length. There is a conserved VSK sequence motif.	71
403488	pfam12287	Caprin-1_C	Cytoplasmic activation/proliferation-associated protein-1 C term. This family of proteins is found in eukaryotes. Proteins in this family are typically between 343 and 708 amino acids in length. This family is the C terminal region of caprin-1. Caprin-1 is a protein involved in regulating cellular proliferation. In mutated phenotypes, the G1 phase of the cell cycle is greatly lengthened, impairing normal proliferation. The C terminal region of caprin-1 contains RGG motifs which are characteristic of RNA binding domains. It is possible that caprin-1 functions through an RNA binding mechanism.	319
403489	pfam12288	CsoS2_M	Carboxysome shell peptide mid-region. This domain family is found in bacteria and eukaryotes, and is approximately 430 amino acids in length. This family is annotated frequently as a carboxysome shell peptide, however there is little publication to confirm this.	420
403490	pfam12289	Rotavirus_VP1	Rotavirus VP1 C-terminal domain. This domain is the C-terminal bracelet domain of the rotavirus VP1 RNA-directed RNA polymerase. It surrounds the exit tunnel for dsRNA produced by replication and for the RNA template for transcription.	312
315053	pfam12290	DUF3802	Protein of unknown function (DUF3802). This family of proteins is found in bacteria. Proteins in this family are typically between 114 and 143 amino acids in length. There is a conserved KNLFD sequence motif.	112
403491	pfam12291	DUF3623	Protein of unknown function (DUF3623). This family of proteins is found in bacteria. Proteins in this family are typically between 261 and 345 amino acids in length.	255
403492	pfam12292	DUF3624	Protein of unknown function (DUF3624). This family of proteins is found in bacteria. Proteins in this family are approximately 90 amino acids in length. There is a conserved GRC sequence motif.	74
403493	pfam12293	T4BSS_DotH_IcmK	Putative outer membrane core complex of type IVb secretion. T4BSS_DotH_IcmK is a family of bacterial transporter proteins from Proteobacteria. DotH is an integral outer membrane component and it may form an outer membrane complex along with DotD and DotC functionally equivalent to secretins. DotH is the strongest candidate for the VirB9 counterpart of other T4BSS systems.	238
403494	pfam12294	DUF3626	Protein of unknown function (DUF3626). This family of proteins is found in bacteria. Proteins in this family are typically between 294 and 374 amino acids in length.	301
403495	pfam12295	Symplekin_C	Symplekin tight junction protein C terminal. This domain family is found in eukaryotes, and is approximately 180 amino acids in length. There is a single completely conserved residue P that may be functionally important. Symplekin has been localized, by light and electron microscopy, to the plaque associated with the cytoplasmic face of the tight junction-containing zone (zonula occludens) of polar epithelial cells and of Sertoli cells of testis. However, both the mRNA and the protein can also be detected in a wide range of cell types that do not form tight junctions. Careful analyses have revealed that the protein occurs in all these diverse cells in the nucleoplasm, and only in those cells forming tight junctions is it recruited, partly but specifically, to the plaque structure of the zonula occludens.	185
403496	pfam12296	HsbA	Hydrophobic surface binding protein A. This protein is found in eukaryotes. Proteins in this family are typically between 171 and 275 amino acids in length. Although the HsbA amino acid sequence suggests that HsbA may be hydrophilic, HsbA adsorbed to hydrophobic PBSA (Polybutylene succinate-co-adipate) surfaces in the presence of NaCl or CaCl2. When HsbA was adsorbed on the hydrophobic PBSA surfaces, it promoted PBSA degradation via the CutL1 polyesterase. CutL1 interacts directly with HsbA attached to the hydrophobic QCM electrode surface. These results suggest that when HsbA is adsorbed onto the PBSA surface, it recruits CutL1, and that when CutL1 is accumulated on the PBSA surface, it stimulates PBSA degradation.	123
403497	pfam12297	EVC2_like	Ellis van Creveld protein 2 like protein. This family of proteins is found in eukaryotes. Proteins in this family are typically between 571 and 1310 amino acids in length. There are two conserved sequence motifs: LPA and ELH. EVC2 is implicated in Ellis van Creveld chondrodysplastic dwarfism in humans. Mutations in this protein can give rise to this congenital condition. LIMBIN is a protein which shares around 80% sequence homology with EVC2 and it is implicated in a similar condition in bovine chondrodysplastic dwarfism.	429
403498	pfam12298	Bot1p	Eukaryotic mitochondrial regulator protein. This family of proteins is found in eukaryotes. Proteins in this family are typically between 168 and 381 amino acids in length. Bot1p localizes to the mitochondria in live cells and cofractionates with purified mitochondrial ribosomes. Bot1p has a novel function in the control of cell respiration by acting on the mitochondrial protein synthesis machinery. Observations also indicate that in fission yeast, alterations of mitochondrial function are linked to changes in cell cycle and cell morphology control mechanisms.	172
372024	pfam12299	DUF3627	Protein of unknown function (DUF3627). This domain family is found in bacteria and viruses, and is approximately 90 amino acids in length. The family is found in association with pfam02498.	93
372025	pfam12300	RhlB	ATP-dependent RNA helicase RhlB. Proteins in this entry are DEAD Box RhlB RNA Helicases found in Xanthomonadaceae bacteria.	181
403499	pfam12301	CD99L2	CD99 antigen like protein 2. This family of proteins is found in eukaryotes. Proteins in this family are typically between 165 and 237 amino acids in length. CD99L2 and CD99 are involved in trans-endothelial migration of neutrophils in vitro and in the recruitment of neutrophils into inflamed peritoneum.	159
372027	pfam12302	DUF3629	Protein of unknown function (DUF3629). This family of proteins is found in eukaryotes. Proteins in this family are typically between 256 and 292 amino acids in length.	218
403500	pfam12304	BCLP	Beta-casein like protein. This protein is found in eukaryotes. Proteins in this family are typically between 216 and 240 amino acids in length. This protein has two conserved sequence motifs: VLR and TRIY. BCLP is associated with cell morphology and regulation of the growth pattern of tumors. It is found in adenocarcinomas of uterine cervical tissues.	184
403501	pfam12305	DUF3630	Protein of unknown function (DUF3630). This family of proteins is found in bacteria. Proteins in this family are approximately 100 amino acids in length. There is a single completely conserved residue D that may be functionally important.	91
403502	pfam12306	PixA	Inclusion body protein. This family of proteins is found in bacteria. Proteins in this family are typically between 173 and 191 amino acids in length. PixA is thought to be specifically produced in Xenorhabdus nematophila. It is an inclusion body protein.	165
403503	pfam12307	DUF3631	Protein of unknown function (DUF3631). This protein is found in bacteria. Proteins in this family are typically between 180 and 701 amino acids in length.	185
403504	pfam12308	Noelin-1	Neurogenesis glycoprotein. This domain family is found in eukaryotes, and is approximately 100 amino acids in length. The family is found in association with pfam02191. There are two conserved sequence motifs: SAQ and VQN. Noelin-1 is a glycoprotein which is secreted mainly by postmitotic neurogenic tissues in the developing central and peripheral nervous systems, first appearing after neural tube closure. It is likely that it forms large multimeric complexes. It has a divergent function in neurogenesis. In animal caps neuralized by expression of noggin, co-expression of Noelin-1 causes expression of neuronal differentiation markers several stages before neurogenesis normally occurs in this tissue. Finally, only secreted forms of the protein can activate sensory marker expression, while all forms of the protein can induce early neurogenesis.	100
403505	pfam12309	KBP_C	KIF-1 binding protein C terminal. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 365 and 621 amino acids in length. There is a conserved LLP sequence motif. KBP is a binding partner for KIF1Balpha that is a regulator of its transport function and thus represents a type of kinesin interacting protein.	360
403506	pfam12310	Elf-1_N	Transcription factor protein N terminal. This domain family is found in eukaryotes, and is approximately 110 amino acids in length. The family is found in association with pfam00178. There is a conserved PAVIVE sequence motif. Elf-1 is an immune cell specific transcription factor. It is found in T cells, B cells, megakaryocytes, and mast cells and is involved in the control of transcription for various immune proteins. These include IL-2, GM-CSF, IL-5, IL-2 receptor alpha chain, and CD4 in T cells, IgH, blk, and lyn in B cells, TdT in T and B cells, IL-3 in megakaryocytes, and SCL and Fc-epsilon-RI alpha chain in mast cells.	109
403507	pfam12311	DUF3632	Protein of unknown function (DUF3632). This domain family is found in eukaryotes, and is approximately 170 amino acids in length. There is a conserved ALE sequence motif.	185
372036	pfam12312	NeA_P2	Nepovirus subgroup A polyprotein. This family of proteins is found in viruses. Proteins in this family are typically between 259 and 1110 amino acids in length. The family is found in association with pfam03688, pfam03689, pfam03391. This family is one of the polyproteins expressed by Nepoviruses in subgroup A.	175
372037	pfam12313	NPR1_like_C	NPR1/NIM1 like defense protein C terminal. This family of proteins is found in eukaryotes. Proteins in this family are typically between 251 and 588 amino acids in length. The family is found in association with pfam00023, pfam00651. There are two conserved sequence motifs: LENRV and DLN. NPR1 (NIM1) is a defense protein in many plant species.	204
403508	pfam12314	IMCp	Inner membrane complex protein. This domain is found in bacteria and eukaryotes. This domain is about 120 amino acids in length. This family is the inner membrane complex of parasitic organisms. This is a cytoskeletal structure associated with the pellicle of these parasites.	87
403509	pfam12315	DA1-like	Protein DA1. Proteins in this family include protein DA1 and its homologs. In Arabidopsis thaliana, DA1 is an ubiquitin receptor that limits final seed and organ size by restricting the period of cell proliferation. It may act maternally to control seed mass.	214
403510	pfam12316	Dsh_C	Segment polarity protein dishevelled (Dsh) C terminal. This domain family is found in eukaryotes, and is typically between 177 and 207 amino acids in length. The family is found in association with pfam00778, pfam02377, pfam00610, pfam00595. The segment polarity gene dishevelled (dsh) is required for pattern formation of the embryonic segments. It is involved in the determination of body organisation through the Wingless pathway (analogous to the Wnt-1 pathway).	211
403511	pfam12317	IFT46_B_C	Intraflagellar transport complex B protein 46 C terminal. This family of proteins is found in eukaryotes. Proteins in this family are typically between 298 and 416 amino acids in length. IFT46 is a flagellar protein of complex B. Like all IFT proteins, it is required for transport of IFT particles into the flagella.	203
403512	pfam12318	FAD-SLDH	Membrane bound FAD containing D-sorbitol dehydrogenase. This family of proteins is found in bacteria. Proteins in this family are typically between 168 and 189 amino acids in length. There is a conserved ALM sequence motif. This family is a membrane protein (FAD-SLDH) involved in oxidation of D-sorbitol to L-sorbose.	160
403513	pfam12319	TryThrA_C	Tryptophan-Threonine-rich plasmodium antigen C terminal. This protein is found in eukaryotes. Proteins in this family are typically between 254 and 536 amino acids in length. This family is the C terminal of a surface antigen of malarial Plasmodium species. It is currently being targeted for use as part of a subunit vaccine against Plasmodium falciparum, the main species involved in causing human malaria.	214
403514	pfam12320	SbcD_C	Type 5 capsule protein repressor C-terminal domain. This domain is found in bacteria and archaea. This domain is about 90 amino acids in length. This domain is found associated with pfam00149. SbcD works in complex with SbcC (SbcDC) which is a transcription regulator. It down-regulates transcription of arl and mgr to inhibit type 5 capsule protein production. It acts as part of the SOS pathway of bacteria.	96
315081	pfam12321	DUF3634	Protein of unknown function (DUF3634). This family of proteins is found in bacteria. Proteins in this family are typically between 103 and 114 amino acids in length.	98
289120	pfam12322	T4_baseplate	T4 bacteriophage base plate protein. This protein is found in viruses. Proteins in this family are typically between 208 and 249 amino acids in length. This protein has a single completely conserved residue S that may be functionally important. This family includes the two base plate proteins in T4 bacteriophages. These are gp51 and gp26, encoded by late genes.	132
403515	pfam12323	HTH_OrfB_IS605	Helix-turn-helix domain. This is the N terminal helix-turn-helix domain of Transposase_2 pfam01385.	47
372044	pfam12324	HTH_15	Helix-turn-helix domain of alkylmercury lyase. Alkylmercury lyase (EC:4.99.1.2) cleaves the carbon-mercury bond of organomercurials such as phenylmercuric acetate. This is the N terminal helix-turn-helix domain associated with pfam03243.	74
403516	pfam12325	TMF_TATA_bd	TATA element modulatory factor 1 TATA binding. This is the C-terminal conserved coiled coil region of a family of TATA element modulatory factor 1 proteins conserved in eukaryotes. The proteins bind to the TATA element of some RNA polymerase II promoters and repress their activity by competing with the binding of TATA binding protein. TMF1_TATA_bd is the most conserved part of the TMFs. TMFs are evolutionarily conserved golgins that bind Rab6, a ubiquitous ras-like GTP-binding Golgi protein, and contribute to Golgi organisation in animal and plant cells. The Rab6-binding domain appears to be the same region as this C-terminal family.	115
403517	pfam12326	EOS1	N-glycosylation protein. This family is not required for survival of S.cerevisiae, but its deletion leads to heightened sensitivity to oxidative stress. It appears to be involved in N-glycosylation, and resides in the endoplasmic reticulum.	160
403518	pfam12327	FtsZ_C	FtsZ family, C-terminal domain. This family includes the bacterial FtsZ family of proteins. Members of this family are involved in polymer formation. FtsZ is the polymer-forming protein of bacterial cell division. It is part of a ring in the middle of the dividing cell that is required for constriction of cell membrane and cell envelope to yield two daughter cells. FtsZ is a GTPase, like tubulin. FtsZ can polymerize into tubes, sheets, and rings in vitro and is ubiquitous in eubacteria and archaea.	95
372048	pfam12328	Rpp20	Rpp20 subunit of nuclear RNase MRP and P. The nuclear RNase P of Saccharomyces cerevisiae is made up of at least nine protein subunits; Pop1, Pop3, Pop4, Pop5, Pop6, Pop7, Pop8, Rpr2 and Rpp1. Many of these subunits seem to be present also in the RNase MRP, with the exception of Rpr2 (Rpp21) which is unique to RNase P. Human nuclear RNase P and MRP appear to contain at least 10 protein subunits, Rpp14, Rpp20, Rpp21, Rpp25, Rpp29, Rpp30, Rpp38, Rpp40, hPop1 and hPop5, although there is recent evidence that not all of these subunits are shared between P and MRP. Archaeal RNase P has at least four protein subunits homologous to eukaryotic RNase P/MRP proteins. In the yeast RNase P, Pop6 and Pop7 (the Rpp20 homolog) interact with each other and they are both interaction partners of Pop4; in the human MRP Rpp25 and Rpp20 interact with each other and Rpp25 binds to Rpp29 (Pop4).	118
372049	pfam12329	TMF_DNA_bd	TATA element modulatory factor 1 DNA binding. This is the middle region of a family of TATA element modulatory factor 1 proteins conserved in eukaryotes that contains at its N-terminal section a number of leucine zippers that could potentially form coiled coil structures. The whole proteins bind to the TATA element of some RNA polymerase II promoters and repress their activity by competing with the binding of TATA binding protein. TMFs are evolutionarily conserved golgins that bind Rab6, a ubiquitous ras-like GTP-binding Golgi protein, and contribute to Golgi organisation in animal and plant cells.	74
403519	pfam12330	Haspin_kinase	Haspin like kinase domain. This family represents the haspin-like kinase domains.	369
403520	pfam12331	DUF3636	Protein of unknown function (DUF3636). This domain family is found in eukaryotes, and is approximately 160 amino acids in length.	148
403521	pfam12333	Ipi1_N	Rix1 complex component involved in 60S ribosome maturation. This domain family is found in eukaryotes, and is typically between 91 and 105 amino acids in length. This family is the N terminal of Ipi1, a component of the Rix1 complex which works in conjunction with Rea1 to mature the 60S ribosome.	101
372053	pfam12334	rOmpB	Rickettsia outer membrane protein B. This domain family is found in bacteria, and is approximately 220 amino acids in length. The family is found in association with pfam03797. This family is the middle region of one of the outer membrane proteins of Rickettsia which is involved in adhesion to eukaryotic cells for uptake.	223
403522	pfam12335	SBF2	Myotubularin protein. This domain family is found in eukaryotes, and is approximately 220 amino acids in length. The family is found in association with pfam02141, pfam03456, pfam03455. This family is the middle region of SBF2, a member of the myotubularin family. Myotubularin-related proteins have been suggested to work in phosphoinositide-mediated signalling events that may also convey control of myelination. Mutations of SBF2 are implicated in Charcot-Marie-Tooth disease.	227
403523	pfam12336	SOXp	SOX transcription factor. This domain family is found in eukaryotes, and is approximately 80 amino acids in length. The family is found in association with pfam00505. There are two conserved sequence motifs: KKDK and LPG. This family is made up of SOX transcription factors. These are involved in upregulation of nestin, a neural promoter.	88
289134	pfam12337	DUF3637	Protein of unknown function (DUF3637). This domain family is found in viruses, and is approximately 70 amino acids in length. The family is found in association with pfam00073, pfam08935.	67
403524	pfam12338	RbcS	Ribulose-1,5-bisphosphate carboxylase small subunit. This domain family is found in eukaryotes, and is approximately 40 amino acids in length. The family is found in association with pfam00101. There is a conserved APF sequence motif. There are two completely conserved residues (L and P) that may be functionally important. This family is the small subunit of ribulose-1,5-bisphosphate carboxylase.	45
403525	pfam12339	DNAJ_related	DNA-J related protein. This domain family is found in bacteria, and is approximately 130 amino acids in length. The family is found in association with pfam00226. There is a conserved YYLD sequence motif. Mostof the sequences in this family are annotated as DNA-J related proteins but there is little publication to back this up.	120
403526	pfam12340	DUF3638	Protein of unknown function (DUF3638). This domain family is found in eukaryotes, and is approximately 230 amino acids in length. There are two conserved sequence motifs: LLE and NMG.	225
403527	pfam12341	Mcl1_mid	Minichromosome loss protein, Mcl1, middle region. Mcl1_mid, or the middle domain of minichromosome loss protein 1, is the domain that lies between a 7-bladed beta-propeller at the N-terminus, family WD40 pfam00400 etc, and a Homeobox (HMG) domain, pfam00505, at the C-terminus. The full length proteins with all three domains are referred to as DNA polymerase alpha accessory factor Mcl1, but the exact function of this domain is not known.	288
152777	pfam12342	DUF3640	Protein of unknown function (DUF3640). This family of proteins is found in viruses. Proteins in this family are typically between 25 and 211 amino acids in length.	26
403528	pfam12343	DEADboxA	Cold shock protein DEAD box A. This domain family is found in bacteria, and is typically between 68 and 89 amino acids in length. The family is found in association with pfam00270, pfam00271, pfam03880. This family is the C terminal region of DEAD box A, a protein expressed under conditions of cold shock which is involved in various cellular processes such as transcription, translation and DNA recombination.	67
403529	pfam12344	UvrB	Ultra-violet resistance protein B. This domain family is found in bacteria, archaea and eukaryotes, and is approximately 40 amino acids in length. The family is found in association with pfam00271, pfam02151, pfam04851. There are two conserved sequence motifs: YAD and RRR. This family is the C terminal region of the UvrB protein which conveys mutational resistance against UV light to various different species.	43
403530	pfam12345	DUF3641	Protein of unknown function (DUF3641). This domain family is found in bacteria and eukaryotes, and is approximately 140 amino acids in length. The family is found in association with pfam04055. This family consists of proteins which are commonly annotated as Radical SAM domains but there is little annotation to back this up.	135
372060	pfam12346	HJURP_mid	Holliday junction recognition protein-associated repeat. Vertebrate Holliday junction recognition proteins carry an SCM3 domain at their N-terminus as do the eukaryotic fungi, but they also carry this central, conserved region. The function of this family is not known. Further downstream there is also a repeated domain, also of unknown function. Investigation of Scm3 and associated proteins is likely to be directly relevant to understanding the mechanism of HJURP-mediated CENP-A chromatin assembly at human centromeres.	115
403531	pfam12347	HJURP_C	Holliday junction regulator protein family C-terminal repeat. Although this family is conserved in the Holliday junction regulator, HJURP, proteins in higher eukaryotes, alongside an Scm3, pfam10384, family, its exact function is not known. The C-terminal region of Scm3 proteins has been evolving rapidly, and this short repeat at the C-terminal end can be present in up to two copies in the higher eukaryotes.	60
403532	pfam12348	CLASP_N	CLASP N terminal. This region is found at the N terminal of CLIP-associated proteins (CLASPs). CLASPs are widely conserved microtubule plus-end-tracking proteins that regulate the stability of dynamic microtubules. In yeast, Drosophila, and Xenopus, a single CLASP orthologue is present. In mammals, a second paralogue (CLASP2) exists which has some functional overlap with CLASP1.	227
403533	pfam12349	Sterol-sensing	Sterol-sensing domain of SREBP cleavage-activation. Sterol regulatory element-binding proteins (SREBPs) are membrane-bound transcription factors that promote lipid synthesis in animal cells. They are embedded in the membranes of the endoplasmic reticulum (ER) in a helical hairpin orientation and are released from the ER by a two-step proteolytic process. Proteolysis begins when the SREBPs are cleaved at Site-1, which is located at a leucine residue in the middle of the hydrophobic loop in the lumen of the ER. Upon proteolytic processing SREBP can activate the expression of genes involved in cholesterol biosynthesis and uptake. SCAP stimulates cleavage of SREBPs via fusion of their two C-termini. This domain is the transmembrane region that traverses the membrane eight times and is the sterol-sensing domain of the cleavage protein. WD40 domains are found towards the C-terminus.	153
403534	pfam12350	CTK3_C	CTD kinase subunit gamma CTK3 C-terminus. The C-terminal domain kinase (CTDK-1), is a three-subunit complex comprised of Ctk1, Ctk2, and Ctk3, that plays a key role in regulation of transcription and translation and in coordinating these two processes. Both Ctk2 and Ctk3 are regulated at the level of protein turnover, and are unstable proteins processed through a ubiquitin-proteasome pathway. Their physical interaction is required to protect both subunits from degradation, and both Ctk2 and Ctk3 are required for Ctk1 CTD kinase activation. The mammalian P-TEFb is mirrored by the combined complexes in yeast of the CTDK1 and the Bur1/2. It is not clear what independent function this C-terminal domain has.	62
372065	pfam12351	Fig1	Ca2+ regulator and membrane fusion protein Fig1. During the mating process of yeast cells, two Ca2+ influx pathways become activated. The resulting elevation of cytosolic free Ca2+ activates downstream signaling factors that promote long term survival of unmated cells. Fig1 is a regulator of the low affinity Ca2+ influx system (LACS), and is also required for efficient membrane fusion during yeast mating.	181
289148	pfam12352	V-SNARE_C	Snare region anchored in the vesicle membrane C-terminus. Within the SNARE proteins interactions in the C-terminal half of the SNARE helix are critical to the driving of membrane fusion; whereas interactions in the N-terminal half of the SNARE domain are important for promoting priming or docking of the vesicle pfam05008.	66
403535	pfam12353	eIF3g	Eukaryotic translation initiation factor 3 subunit G. This domain family is found in eukaryotes, and is approximately 130 amino acids in length. The family is found in association with pfam00076. This family is subunit G of the eukaryotic translation initiation factor 3. Subunit G is required for eIF3 integrity.	126
289150	pfam12354	Internalin_N	Bacterial adhesion/invasion protein N terminal. This domain family is found in bacteria, and is approximately 60 amino acids in length. The family is found in association with pfam00560, pfam08191, pfam09479. There are two completely conserved residues (I and F) that may be functionally important. Internalin mediates bacterial adhesion and invasion of epithelial cells in the human intestine through specific interaction with its host cell receptor E-cadherin. This family is the N terminal of internalin, the cap domain of the protein. The cap domain is conserved between different internalin types. The cap domain does not interact with E cadherin, therefore its function is presumably structural: capping the hydrophobic core.	50
315106	pfam12355	Dscam_C	Down syndrome cell adhesion molecule C terminal. This domain family is found in eukaryotes, and is approximately 120 amino acids in length. The family is found in association with pfam00047, pfam07679, pfam00041. The Down syndrome cell adhesion molecule (Dscam) belongs to a family of cell membrane molecules involved in the differentiation of the nervous system. This is the C terminal cytoplasmic tail region of Dscam.	118
403536	pfam12356	BIRC6	Baculoviral IAP repeat-containing protein 6. BIRC6 is an anti-apoptotic protein which can regulate cell death by controlling caspases and by acting as an E3 ubiquitin-protein ligase.	175
403537	pfam12357	PLD_C	Phospholipase D C terminal. This domain family is found in eukaryotes, and is approximately 70 amino acids in length. The family is found in association with pfam00168, pfam00614. There is a conserved FPD sequence motif. This family is the C terminal of phospholipase D. PLD is a major plant lipid-degrading enzyme which is involved in signal transduction.	69
403538	pfam12358	DUF3644	Protein of unknown function (DUF3644). This domain family is found in bacteria, and is typically between 65 and 80 amino acids in length.	71
403539	pfam12359	DUF3645	Protein of unknown function (DUF3645). This domain family is found in eukaryotes, and is approximately 40 amino acids in length. There is a conserved HPD sequence motif.	33
403540	pfam12360	Pax7	Paired box protein 7. This domain family is found in eukaryotes, and is approximately 40 amino acids in length. The family is found in association with pfam00046, pfam00292. Pax7 belongs to a family of genes that encode paired-box-containing transcription factors involved in the control of developmental processes. Pax7 has a distinct role in the specification of myogenic satellite cells.	45
403541	pfam12361	DBP	Duffy-antigen binding protein. This family of proteins is found in eukaryotes. Proteins in this family are typically between 449 and 1061 amino acids in length. The family is found in association with pfam05424. There are two conserved sequence motifs: NKNGG and QKHDF. This family is part of the Duffy-antigen binding protein of Plasmodium spp. This protein is an antigen on these parasites which enables them to invade erythrocytes.	156
403542	pfam12362	DUF3646	DNA polymerase III gamma and tau subunits C terminal. This domain family is found in bacteria, and is approximately 120 amino acids in length. The family is found in association with pfam00004. The proteins in this family are frequently annotated as the gamma and tau subunits of DNA polymerase III, however there is little accompanying literature to back this up.	114
372073	pfam12363	Phage_TAC_12	Phage tail assembly chaperone protein, TAC. This is a family of phage tail assembly chaperone proteins from Siphoviridae phages.	111
372074	pfam12364	DUF3648	Protein of unknown function (DUF3648). This family of proteins is found in eukaryotes and viruses. Proteins in this family are typically between 53 and 3115 amino acids in length. There are two completely conserved residues (A and F) that may be functionally important.	141
403543	pfam12365	DUF3649	Protein of unknown function (DUF3649). This domain family is found in bacteria and eukaryotes, and is approximately 30 amino acids in length.	26
372076	pfam12366	Casc1	Cancer susceptibility candidate 1. This domain family is found in eukaryotes, and is typically between 216 and 263 amino acids in length. Casc1 has many SNPs associated with cancer susceptibility.	240
403544	pfam12367	PFO_beta_C	Pyruvate ferredoxin oxidoreductase beta subunit C terminal. This domain family is found in bacteria and archaea, and is approximately 70 amino acids in length. The family is found in association with pfam02775. There are two completely conserved residues (A and G) that may be functionally important. PFO is involved in carbon dioxide fixation via a reductive TCA cycle. It forms a heterodimer (alpha/beta). The beta subunit has binding motifs for Fe-S clusters and thiamine pyrophosphate.	63
403545	pfam12368	Rhodanese_C	Rhodanase C-terminal. Rhodanase_C is found as the domain-extension to Rhodanase enzyme in some members of the Rhodanase family. Rhodanase is pfam00581.	63
403546	pfam12369	GnHR_trans	Gonadotropin hormone receptor transmembrane region. This domain family is found in eukaryotes, and is approximately 70 amino acids in length. The family is found in association with pfam00560, pfam00001. There are two completely conserved C residues that may be functionally important. This family contains the transmembrane region of the follicle-stimulating hormone and luteinizing hormone receptors, the two major gonadotropin hormone receptors. These receptors are G protein coupled receptors involved in development and maturation of germ cells in both sexes. The transmembrane region is conserved between the two different receptors while the extracellular ligand binding domains are less well conserved.	68
372078	pfam12371	TMEM131_like	Transmembrane protein 131-like. TMEM131_like is a family of bacterial, plant and other metazoa transmembrane proteins. Many of the members are multi-pass transmembrane proteins.	84
403547	pfam12372	DUF3652	Huntingtin protein region. This domain family is found in eukaryotes, and is approximately 40 amino acids in length. The family is found in association with pfam02985. This family is in the middle region of the Huntingtin protein associated with Huntington's disease. The protein is of unknown function, however it is known that a polyglutamine (CAG) repeat in the gene coding for it results in the development of Huntington's disease.	41
372080	pfam12373	Msg2_C	Major surface glycoprotein 2 C terminal. This domain family is found in eukaryotes, and is approximately 30 amino acids in length. The family is found in association with pfam02349. This family is the C terminal of major surface glycoprotein 2 of virulent bacteria. It is a virulence factor antigen.	30
403548	pfam12374	Dmrt1	Double-sex mab3 related transcription factor 1. This domain family is found in eukaryotes, and is typically between 61 and 73 amino acids in length. The family is found in association with pfam00751. This family is a transcription factor involved in sex determination. The proteins in this family contain a zinc finger-like DNA-binding motif, DM domain.	72
372082	pfam12375	DUF3653	Phage protein. This family of proteins is found in bacteria and viruses. Proteins in this family are typically between 112 and 194 amino acids in length.	66
289169	pfam12376	DUF3654	Protein of unknown function (DUF3654). This family of proteins is found in eukaryotes. Proteins in this family are typically between 193 and 612 amino acids in length.	138
315124	pfam12377	DuffyBP_N	Duffy binding protein N terminal. This domain family is found in eukaryotes, and is approximately 70 amino acids in length. The family is found in association with pfam05424. This family contains the N-terminus of the Duffy receptor binding domain.	67
315125	pfam12378	CytadhesinP1	Trypsin-sensitive surface-exposed protein. This domain family is found in bacteria, and is typically between 67 and 79 amino acids in length. This family contains trypsin-sensitive surface-exposed proteins called cytadhesins. Cytadhesins are virulence factor proteins which mediate attachment of bacterial cells to host cells for invasion.	72
403549	pfam12379	DUF3655	Protein of unknown function (DUF3655). This domain family is found in viruses, and is approximately 70 amino acids in length. The family is found in association with pfam08716, pfam01661, pfam05409, pfam06471, pfam08717, pfam06478, pfam09401, pfam06460, pfam08715, pfam08710.	151
289173	pfam12380	Peptidase_C62	Gill-associated viral 3C-like peptidase. This is the peptidase of gill-associated virus (GAV), a positive-stranded RNA virus of prawns; it has been called yellow head virus protease and gill-associated virus 3C-like peptidase. The GAV cysteine protease is predicted to be the key enzyme in the processing of the GAV replicase polyprotein precursors, pp1a and pp1ab. This protease employs a Cys(2968)-His(2879) catalytic dyad.	284
152816	pfam12381	Peptidase_C3G	Tungro spherical virus-type peptidase. This is the protease for self-cleavage of the positive single-stranded polyproteins of a number of plant viral genomes. The protease activity of the polyprotein is at the C-terminal end, adjacent to the putative RNA polymerase.	231
152817	pfam12382	Peptidase_A2E	Retrotransposon peptidase. This is a small family of fungal retroviral aspartyl peptidases.	137
289174	pfam12383	SARS_3b	Severe acute respiratory syndrome coronavirus 3b protein. This family of proteins is found in viruses. Proteins in this family are typically between 32 and 154 amino acids in length. This family contains the SARS coronavirus 3b protein which is predominantly localized in the nucleolus, and induces G0/G1 arrest and apoptosis in transfected cells.	153
152819	pfam12384	Peptidase_A2B	Ty3 transposon peptidase. Ty3 is a gypsy-type, retrovirus-like, element found in the budding yeast. The Ty3 aspartyl protease is required for processing of the viral polyprotein into its mature species.	177
403550	pfam12385	Peptidase_C70	Papain-like cysteine protease AvrRpt2. This is a family of cysteine proteases, found in actinobacteria, proteobacteria and firmicutes. Papain-like cysteine proteases play a crucial role in plant-pathogen/pest interactions. On entering the host they act on non-self substrates, thereby manipulating the host to evade proteolysis. AvrRpt2 from Pseudomonas syringae pv. tomato DC3000 triggers resistance to P. syringae-2-dependent defense responses, including hypersensitive cell death, by cleaving the Arabidopsis RIN4 protein which is monitored by the cognate resistance protein RPS2.	143
403551	pfam12386	Peptidase_C71	Pseudomurein endo-isopeptidase Pei. This peptidase has the catalytic triad C-H-D at the C-terminal end, a triad similar to that in thiol proteases and animal transglutaminases. It catalyzes the in vitro lysis of M. marburgensis cells under reducing conditions and exhibits characteristics of metal-activated peptidases.	149
289175	pfam12387	Peptidase_C74	Pestivirus NS2 peptidase. The pestivirus NS2 peptidase is responsible for single cleavage between NS2 and NS3 of the bovine viral diarrhea virus polyprotein, a cleavage that is correlated with cytopathogenicity. The peptidase is activated by its interaction with 'J-domain protein interacting with viral protein'.	200
403552	pfam12388	Peptidase_M57	Dual-action HEIGH metallo-peptidase. The catalytic triad for this family of proteases is HE-H-H, which in many members is in the sequence motif HEIGH.	211
372085	pfam12389	Peptidase_M73	Camelysin metallo-endopeptidase. 	196
403553	pfam12390	Se-cys_synth_N	Selenocysteine synthase N terminal. This domain family is found in bacteria, and is approximately 40 amino acids in length. The family is found in association with pfam03841. There is a single completely conserved residue P that may be functionally important. This family is the N terminal region of selenocysteine synthase which catalyzes the conversion of seryl-tRNA(Sec) into selenocysteyl-tRNA(Sec).	40
403554	pfam12391	PCDO_beta_N	Protocatechuate 3,4-dioxygenase beta subunit N terminal. This domain family is found in bacteria, and is approximately 40 amino acids in length. The family is found in association with pfam00775. There are two completely conserved residues (Y and R) that may be functionally important. This family is the N terminal region of the beta subunit of protocatechuate 3,4-dioxygenase. This enzyme utilizes a mononuclear, non-heme Fe3+ centre to catalyze metabolic cellular reactions.	32
403555	pfam12392	DUF3656	Collagenase. This domain family is found in bacteria, archaea and eukaryotes, and is approximately 120 amino acids in length. The family is found in association with pfam01136.	102
152828	pfam12393	Dr_adhesin	Dr family adhesin. This domain family is found in bacteria, and is approximately 20 amino acids in length. The family is found in association with pfam04619. This family is the Dr-family adhesin expressed by uropathogenic E. coli.	21
403556	pfam12394	DUF3657	Protein FAM135. This domain family is found in eukaryotes, and is approximately 60 amino acids in length. The family is found in association with pfam05057.	64
403557	pfam12395	DUF3658	Protein of unknown function. This domain family is found in bacteria, and is approximately 110 amino acids in length. The family is found in association with pfam08874. There are two completely conserved residues (D and R) that may be functionally important.	107
403558	pfam12396	DUF3659	Protein of unknown function (DUF3659). This domain family is found in bacteria and eukaryotes, and is approximately 70 amino acids in length.	59
403559	pfam12397	U3snoRNP10	U3 small nucleolar RNA-associated protein 10. This domain family is found in eukaryotes, and is approximately 120 amino acids in length. The family is found in association with pfam08146. This family is the protein associated with U3 snoRNA which is involved in the processing of pre-rRNA.	116
403560	pfam12398	DUF3660	Receptor serine/threonine kinase. This domain family is found in eukaryotes, and is approximately 40 amino acids in length. The family is found in association with pfam00954, pfam01453, pfam00069, pfam08276. There is a conserved ELPL sequence motif.	42
403561	pfam12399	BCA_ABC_TP_C	Branched-chain amino acid ATP-binding cassette transporter. This domain family is found in bacteria, archaea and eukaryotes, and is approximately 30 amino acids in length. The family is found in association with pfam00005. There is a conserved AYLG sequence motif. This family is the C terminal of an ATP dependent branched-chain amino acid transporter.	23
403562	pfam12400	STIMATE	STIMATE family. STIMATE is an ER-resident multi-transmembrane protein that serves as a positive regulator of Ca(2+) influx in vertebrates. It interacts with the ER-resident Ca2+ sensor protein STIM1 to promote the STIM1 conformational switch. This entry also includes budding yeast YPL162C.	124
403563	pfam12401	DUF3662	Protein of unknown function (DUF3662). This domain family is found in bacteria, and is approximately 120 amino acids in length. The family is found in association with pfam00498.	115
403564	pfam12402	nlz1	NocA-like zinc-finger protein 1. This domain family is found in eukaryotes, and is typically between 42 and 57 amino acids in length. There is a conserved GAY sequence motif. There is a single completely conserved residue G that may be functionally important. Nlz1 self-associated via its C-terminus, interacted with Nlz2, and bound to histone deacetylases.	58
403565	pfam12403	Pax2_C	Paired-box protein 2 C terminal. This domain family is found in eukaryotes, and is approximately 110 amino acids in length. The family is found in association with pfam00292. This family is the C terminal of the paired-box protein 2 which is a transcription factor involved in embryonic development and organogenesis.	112
403566	pfam12404	DUF3663	Peptidase. This domain family is found in bacteria, and is approximately 80 amino acids in length. The family is found in association with pfam00883. There is a conserved WAF sequence motif.	77
289191	pfam12406	DUF3664	Surface protein. This family of proteins is found in eukaryotes. Proteins in this family are typically between 131 and 312 amino acids in length.	99
403567	pfam12407	Abdominal-A	Homeobox protein. This domain family is found in eukaryotes, and is approximately 30 amino acids in length. The family is found in association with pfam00046. This family is a homeobox protein involved in differentiation of embryonic cells to form the abdominal region.	24
403568	pfam12408	DUF3666	Ribose-5-phosphate isomerase. This domain family is found in bacteria, and is approximately 50 amino acids in length. The family is found in association with pfam02502. There are two completely conserved residues (D and F) that may be functionally important.	46
403569	pfam12409	P5-ATPase	P5-type ATPase cation transporter. This domain family is found in eukaryotes, and is typically between 110 and 126 amino acids in length. The family is found in association with pfam00122, pfam00702. P-type ATPases comprise a large superfamily of proteins, present in both prokaryotes and eukaryotes, that transport inorganic cations and other substrates across cell membranes.	125
289195	pfam12410	rpo30_N	Poxvirus DNA dependent RNA polymerase 30kDa subunit. This family of proteins is found in viruses. Proteins in this family are typically between 193 and 259 amino acids in length. The family is found in association with pfam01096. There are two conserved sequence motifs: GIEYSKD and LRY. This family is the N terminal of the 30 kDa subunit of poxvirus DNA-d-RNA-pol. It has structural similarity to the eukaryotic transcriptional elongation factor SII.	135
403570	pfam12411	Choline_sulf_C	Choline sulfatase enzyme C terminal. This domain family is found in bacteria, eukaryotes and viruses, and is approximately 60 amino acids in length. The family is found in association with pfam00884. There are two completely conserved residues (R and W) that may be functionally important. This family is the C terminal of choline sulfatase, the enzyme responsible for catalyzing the conversion of choline-O-sulfate and, at a lower rate, phosphorylcholine, into choline.	53
403571	pfam12412	DUF3667	Protein of unknown function (DUF3667). This domain family is found in bacteria and eukaryotes, and is approximately 50 amino acids in length. There is a single completely conserved residue P that may be functionally important.	45
403572	pfam12413	DLL_N	Homeobox protein distal-less-like N terminal. This domain family is found in eukaryotes, and is approximately 80 amino acids in length. The family is found in association with pfam00046. This family is the N terminal of a homeobox protein involved in embryonic development and adult neural regeneration.	83
403573	pfam12414	Fox-1_C	Calcitonin gene-related peptide regulator C terminal. This domain family is found in eukaryotes, and is typically between 69 and 99 amino acids in length. The family is found in association with pfam00076. This family is the C terminal of Fox-1, a protein involved in the regulation of calcitonin gene-related peptide to mediate the neuron-specific splicing pattern. Fox-1, with Fox-2, functions to repress exon 4 inclusion.	95
289200	pfam12415	rpo132	Poxvirus DNA dependent RNA polymerase. This domain family is found in viruses, and is approximately 30 amino acids in length. The family is found in association with pfam04566, pfam00562, pfam04567, pfam04560, pfam04565. This family is the second largest subunit of the poxvirus DNA dependent RNA polymerase. It has structural similarity to the second-largest RNA polymerase subunits of eubacteria, archaebacteria, and eukaryotes.	32
403574	pfam12416	DUF3668	Cep120 protein. This family includes the Cep120 protein which is associated with centriole structure and function.	226
403575	pfam12417	DUF3669	Zinc finger protein. This domain family is found in eukaryotes, and is typically between 64 and 80 amino acids in length.	66
403576	pfam12418	AcylCoA_DH_N	Acyl-CoA dehydrogenase N terminal. This domain family is found in bacteria and eukaryotes, and is approximately 30 amino acids in length. The family is found in association with pfam02770, pfam00441, pfam02771. This family is one of the enzymes involved in AcylCoA interaction in beta-oxidation.	32
403577	pfam12419	DUF3670	SNF2 Helicase protein. This domain family is found in bacteria, archaea and eukaryotes, and is approximately 140 amino acids in length. The family is found in association with pfam00271, pfam00176. Most of the proteins in this family are annotated as SNF2 helicases but there is little accompanying literature to confirm this.	136
372098	pfam12420	DUF3671	Protein of unknown function. This domain family is found in eukaryotes, and is typically between 96 and 116 amino acids in length.	114
289206	pfam12421	DUF3672	Fibronectin type III protein. This domain family is found in bacteria and viruses, and is typically between 126 and 146 amino acids in length. The family is found in association with pfam09327, pfam00041. There are two completely conserved G residues that may be functionally important. Many of the proteins in this family are annotated as fibronectin type III however there is little accompanying literature to confirm this.	133
403578	pfam12422	Condensin2nSMC	Condensin II non structural maintenance of chromosomes subunit. This domain family is found in eukaryotes, and is approximately 150 amino acids in length. This family is part of a non-SMC subunit of condensin II which is involved in maintenance of the structural integrity of chromosomes. Condensin II is made up of SMC (structural maintenance of chromosomes) and non-SMC subunits. The non-SMC subunits bind to the catalytic ends of the SMC subunit dimer. The condensin holocomplex is able to introduce superhelical tension into DNA in an ATP hydrolysis-dependent manner, resulting in the formation of positive supercoils in the presence of topoisomerase I and of positive knots in the presence of topoisomerase II.	148
403579	pfam12423	KIF1B	Kinesin protein 1B. This domain family is found in eukaryotes, and is approximately 50 amino acids in length. The family is found in association with pfam00225, pfam00498. KIF1B is an anterograde motor for transport of mitochondria in axons of neuronal cells.	43
403580	pfam12424	ATP_Ca_trans_C	Plasma membrane calcium transporter ATPase C terminal. This domain family is found in eukaryotes, and is approximately 60 amino acids in length. The family is found in association with pfam00689, pfam00122, pfam00702, pfam00690. There is a conserved QTQ sequence motif. This family is the C terminal of a calcium transporting ATPase located in the plasma membrane.	47
338347	pfam12425	DUF3673	Protein of unknown function (DUF3673). This domain family is found in eukaryotes, and is approximately 50 amino acids in length.	53
289211	pfam12426	DUF3674	RNA dependent RNA polymerase. This domain family is found in viruses, and is approximately 40 amino acids in length. There is a conserved MFNLKF sequence motif. There are two completely conserved residues (E and P) that may be functionally important.	41
372102	pfam12427	DUF3665	Branched-chain amino acid aminotransferase. This domain family is found in bacteria, and is typically between 23 and 35 amino acids in length. The family is found in association with pfam01063. There is a conserved TRT sequence motif.	22
403581	pfam12428	DUF3675	Protein of unknown function (DUF3675). This domain family is found in eukaryotes, and is approximately 120 amino acids in length. The family is found in association with pfam00097. There are two completely conserved residues (R and L) that may be functionally important.	119
338349	pfam12429	DUF3676	Protein of unknown function (DUF3676). This domain family is found in eukaryotes, and is approximately 230 amino acids in length.	230
403582	pfam12430	ABA_GPCR	Abscisic acid G-protein coupled receptor. This domain family is found in eukaryotes, and is typically between 177 and 216 amino acids in length. This family is part of the abscisic acid (ABA) G-protein coupled receptor. ABA is a stress hormone in plants.	186
403583	pfam12431	CitT	Transcriptional regulator. This domain family is found in bacteria, and is approximately 30 amino acids in length. The family is found in association with pfam00072. There is a single completely conserved residue G that may be functionally important. CitT is a transcriptional regulator which allows transcription of the citM gene which codes for the secondary transporter in the Mg-citrate transport complex.	30
403584	pfam12432	DUF3677	Protein of unknown function (DUF3677). This domain family is found in eukaryotes, and is approximately 80 amino acids in length.	81
403585	pfam12433	PV_NSP1	Parvovirus non-structural protein 1. This family of proteins is found in viruses. Proteins in this family are typically between 109 and 668 amino acids in length. Parvoviral NSPs regulate host gene expression through histone acetylation.	71
289219	pfam12434	Malate_DH	Malate dehydrogenase enzyme. This domain family is found in bacteria, and is approximately 30 amino acids in length. The family is found in association with pfam00390, pfam03949, pfam01515. There is a conserved AAL sequence motif. There is a single completely conserved residue R that may be functionally important. Malate dehydrogenase is one of the enzymes involved in the citric acid cycle in mitochondria. It converts malate to oxaloacetate using NAD as a cofactor.	28
372107	pfam12435	DUF3678	Protein of unknown function (DUF3678). This domain family is found in eukaryotes, and is approximately 40 amino acids in length.	35
403586	pfam12436	USP7_ICP0_bdg	ICP0-binding domain of Ubiquitin-specific protease 7. This domain is one of two C-terminal domains on the much longer ubiquitin-specific proteases. This particular one is found to interact with the herpesvirus 1 trans-acting transcriptional protein ICP0/VMW110.	246
403587	pfam12437	GSIII_N	Glutamine synthetase type III N terminal. This domain family is found in bacteria and eukaryotes, and is approximately 160 amino acids in length. The family is found in association with pfam00120. This family is the N terminal region of glutamine synthetase type III which is one of the enzymes responsible for generation of glutamine through conversion of glutamate to glutamine by the incorporation of ammonia (NH3).	160
315165	pfam12438	DUF3679	Protein of unknown function (DUF3679). This domain family is found in bacteria, and is approximately 60 amino acids in length.	56
403588	pfam12439	GDE_N	Glycogen debranching enzyme N terminal. This domain family is found in bacteria and archaea, and is typically between 218 and 229 amino acids in length. The family is found in association with pfam06202. Glycogen debranching enzyme catalyzes the debranching of amylopectin in glycogen. This is done by transferring three glucose subunits of glycogen from one parallel chain to another. This has the effect of enabling the glucose residues to become more accessible for glycolysis.	209
403589	pfam12440	MAGE_N	Melanoma associated antigen family N terminal. This domain family is found in eukaryotes, and is typically between 82 and 96 amino acids in length. The family is found in association with pfam01454. This family is the N terminal of various melanoma associated antigens. These are tumor rejection antigens which are expressed on HLA-A1 of tumor cells and they are recognized by cytotoxic T lymphocytes (CTLs).	90
372110	pfam12441	CopG_antitoxin	CopG antitoxin of type II toxin-antitoxin system. CopG antitoxin is a member of a type II toxin-antitoxin system family found in bacteria and archaea. Most antitoxins encoded by the relBE and parDE loci belong to the MetJ/Arc/CopG family of dimeric proteins which bind DNA through N-terminal ribbon-helix-helix (RHH) motifs. The toxin for CopG proteins falls into the family BrnT_toxin, pfam04365.	79
403590	pfam12442	DUF3681	Protein of unknown function (DUF3681). This family of proteins is found in eukaryotes. Proteins in this family are typically between 112 and 212 amino acids in length. There is a single completely conserved residue G that may be functionally important.	95
403591	pfam12443	AKNA	AT-hook-containing transcription factor. This domain family is found in eukaryotes, and is approximately 110 amino acids in length. This family contains a transcription factor which regulates the expression of the costimulatory molecules on lymphocytes.	96
403592	pfam12444	Sox_N	Sox developmental protein N terminal. This domain family is found in eukaryotes, and is typically between 69 and 88 amino acids in length. The family is found in association with pfam00505. There are two conserved sequence motifs: YDW and PVR. This family contains Sox8, Sox9 and Sox10 proteins which have structural similarity. Sox proteins are involved in developmental processes.	76
372114	pfam12445	FliC	Flagellin protein. This domain family is found in bacteria, and is typically between 125 and 147 amino acids in length. The family is found in association with pfam00669, pfam00700. There are two completely conserved G residues that may be functionally important. This family is the flagellin motor protein which confers motility to bacterial cells.	127
289231	pfam12446	DUF3682	Protein of unknown function (DUF3682). This domain family is found in eukaryotes, and is typically between 125 and 136 amino acids in length.	129
403593	pfam12447	DUF3683	Protein of unknown function (DUF3683). This domain family is found in bacteria, and is approximately 120 amino acids in length. The family is found in association with pfam02754, pfam01565, pfam02913.	114
403594	pfam12448	Milton	Kinesin associated protein. This domain family is found in eukaryotes, and is typically between 143 and 173 amino acids in length. The family is found in association with pfam04849. This family is a region of the protein milton. Milton recruits the heavy chain of kinesin to mitochondria to allow the motor movement function of kinesin.	168
403595	pfam12449	DUF3684	Protein of unknown function (DUF3684). This domain family is found in eukaryotes, and is typically between 1072 and 1090 amino acids in length.	1093
403596	pfam12450	vWF_A	von Willebrand factor. This domain family is found in bacteria, and is approximately 100 amino acids in length. The family is found in association with pfam00092. There are two conserved sequence motifs: STF and DVD. There are two completely conserved residues (E and N) that may be functionally important. In hemostasis, platelet adhesion to the damaged vessel wall is mediated by several proteins, including von Willebrand factor. In solution vWF becomes immobilized via its A3 domain on the fibrillar collagen of the vessel wall and acts as an intermediary between collagen and the platelet receptor glycoprotein Ibalpha (GPIbalpha), which is the only platelet receptor that does not require prior activation for bond formation.	94
403597	pfam12451	VPS11_C	Vacuolar protein sorting protein 11 C terminal. This domain family is found in eukaryotes, and is approximately 50 amino acids in length. Vps 11 is one of the evolutionarily conserved class C vacuolar protein sorting genes (c-vps: vps11, vps16, vps18, and vps33), whose products physically associate to form the c-vps protein complex required for vesicle docking and fusion.	44
403598	pfam12452	DUF3685	Protein of unknown function (DUF3685). This domain family is found in bacteria and eukaryotes, and is approximately 190 amino acids in length. There are two completely conserved residues (L and D) that may be functionally important.	192
403599	pfam12453	PTP_N	Protein tyrosine phosphatase N terminal. This domain family is found in eukaryotes, and is approximately 30 amino acids in length. The family is found in association with pfam00041. There is a single completely conserved residue L that may be functionally important. This family consists of various protein tyrosine phosphatase haematopoietic receptors, e.g. CD45, which dephosphorylate growth stimulating proteins. This limits growth signalling in haematopoietic cells.	26
372120	pfam12454	Ecm33	GPI-anchored cell wall organization protein. This domain family is found in eukaryotes, and is approximately 40 amino acids in length. Ecm33 is an essential cell wall component and is important for cell wall integrity.	40
403600	pfam12455	Dynactin	Dynein associated protein. This domain family is found in eukaryotes, and is approximately 280 amino acids in length. The family is found in association with pfam01302. There is a single completely conserved residue E that may be functionally important. Dynactin has been associated with Dynein, a motor protein which is involved in organelle transport, mitotic spindle assembly and chromosome segregation. Dynactin anchors Dynein to specific subcellular structures.	286
403601	pfam12456	hSac2	Inositol phosphatase. This domain family is found in eukaryotes, and is approximately 120 amino acids in length. The family is found in association with pfam02383. hSac2 functions as an inositol polyphosphate 5-phosphatase.	110
403602	pfam12457	TIP_N	Tuftelin interacting protein N terminal. This domain family is found in eukaryotes, and is typically between 99 and 114 amino acids in length. The family is found in association with pfam08697, pfam01585. There are two completely conserved residues (G and F) that may be functionally important. TIP is involved in enamel assembly by interacting with one of the major proteins responsible for biomineralisation of enamel - tuftelin.	93
403603	pfam12458	DUF3686	ATPase involved in DNA repair. This domain family is found in bacteria, and is approximately 450 amino acids in length. There are two conserved sequence motifs: DVF and SPNGED.	446
403604	pfam12459	DUF3687	D-Ala-teichoic acid biosynthesis protein. This family of proteins is found in bacteria. Proteins in this family are approximately 50 amino acids in length. There are two completely conserved residues (L and Y) that may be functionally important.	43
403605	pfam12460	MMS19_C	RNAPII transcription regulator C-terminal. MMS19 is required for both nucleotide excision repair (NER) and RNA polymerase II (RNAP II) transcription. This C-terminal domain, along with the N-terminal, MMS19_N, form part of a silencing complex in fission yeast that contains Dos2, Rik1, Mms19 and Cdc20 (the catalytic subunit of DNA polymerase-epsilon). This complex regulates RNA polymerase II (RNA Pol II) activity in heterochromatin and is required for DNA replication and heterochromatin assembly.	423
315187	pfam12461	DUF3688	Protein of unknown function (DUF3688). This domain family is found in bacteria and viruses, and is typically between 79 and 104 amino acids in length. There is a conserved YRW sequence motif. There is a single completely conserved residue Y that may be functionally important.	727
403606	pfam12462	Helicase_IV_N	DNA helicase IV / RNA helicase N terminal. This domain family is found in bacteria, and is approximately 170 amino acids in length. This family is found in bacterial DNA helicase IV, at the N-terminus of pfam00580.	164
403607	pfam12463	DUF3689	Protein of unknown function (DUF3689). This family of proteins is found in eukaryotes. Proteins in this family are typically between 399 and 797 amino acids in length.	309
403608	pfam12464	Mac	Maltose acetyltransferase. This domain family is found in bacteria, archaea and eukaryotes, and is approximately 50 amino acids in length. The family is found in association with pfam00132. Mac uses acetyl-CoA as acetyl donor to acetylate cytoplasmic maltose.	52
403609	pfam12465	Pr_beta_C	Proteasome beta subunits C terminal. This domain family is found in eukaryotes, and is approximately 40 amino acids in length. The family is found in association with pfam00227. There is a conserved GTT sequence motif. There is a single completely conserved residue Y that may be functionally important. This family includes the C terminal of the beta-type subunits of the proteasome, a multimeric complex that degrades proteins into peptides as part of the MHC class I-mediated Ag-presenting pathway.	35
372129	pfam12466	GDH_N	Glutamate dehydrogenase N terminal. This domain family is found in bacteria, and is approximately 60 amino acids in length. The family is found in association with pfam05088. There is a conserved ALR sequence motif. Glutamate dehydrogenase (GDH) is a homohexameric, mitochondrial enzyme that reversibly catalyzes the oxidative deamination of L-glutamate to 2-oxoglutarate using either NADP(H) or NAD(H) with comparable efficacy.	95
289250	pfam12467	CMV_1a	Cucumber mosaic virus 1a protein family. This domain family is found in viruses, and is typically between 156 and 171 amino acids in length. The family is found in association with pfam01443, pfam01660. 1a protein is the major virulence factor of the cucumber mosaic virus (CMV). The Ns strain of CMV causes necrotic lesions to Nicotiana spp. while other strains cause systemic mosaic. The determinant of the pathogenesis of these different strains is the specific amino acid residue at position 461 of the 1a protein.	184
403610	pfam12468	TTSSLRR	Type III secretion system leucine rich repeat protein. This domain family is found in bacteria, and is approximately 50 amino acids in length. There are two completely conserved residues (Y and W) that may be functionally important. This family consists of leucine-rich repeat proteins involved in type III secretion.	54
403611	pfam12469	DUF3692	CRISPR-associated protein. This domain family is found in bacteria and archaea, and is typically between 101 and 138 amino acids in length. The proteins in this family are frequently annotated as CRISPR-associated proteins however there is little accompanying literature to confirm this.	112
403612	pfam12470	SUFU_C	Suppressor of Fused Gli/Ci N terminal binding domain. This domain family is found in eukaryotes, and is typically between 192 and 219 amino acids in length. The family is found in association with pfam05076. There is a conserved HGRHFT sequence motif. This family is the C terminal domain of the Suppressor of Fused protein (Su(fu)). Su(fu) is a repressor of the Gli and Ci transcription factors of the Hedgehog signalling cascade. It functions by binding these proteins and preventing their translocation to the nucleus. The C terminal domain is only found in eukaryotic Su(fu) proteins; it is not present in bacterial homologs. The C terminal domain binds to the N terminal of Gli/Ci while the N terminal of Su(fu) binds to the C terminal of Gli/Ci. This dual binding mechanism is likely an evolutionary advancement in this signalling cascade which is not present in bacterial homologs.	215
403613	pfam12471	GTP_CH_N	GTP cyclohydrolase N terminal. This domain family is found in bacteria and eukaryotes, and is approximately 190 amino acids in length. This family is the N terminal of GTP cyclohydrolase, the rate limiting enzyme in the synthesis of tetrahydrobiopterin.	193
372132	pfam12472	DUF3693	Phage related protein. This domain family is found in bacteria and viruses, and is approximately 60 amino acids in length.	60
403614	pfam12473	DUF3694	Kinesin protein. This domain family is found in eukaryotes, and is typically between 131 and 151 amino acids in length. The family is found in association with pfam00225, pfam00498. There is a single completely conserved residue W that may be functionally important.	149
403615	pfam12474	PKK	Polo kinase kinase. This domain family is found in eukaryotes, and is approximately 140 amino acids in length. The family is found in association with pfam00069. Polo-like kinase 1 (Plx1) is essential during mitosis for the activation of Cdc25C, for spindle assembly, and for cyclin B degradation. This family is Polo kinase kinase (PKK) which phosphorylates Polo kinase and Polo-like kinase to activate them. PKK is a serine/threonine kinase.	140
289258	pfam12475	Amdo_NSP	Amdovirus non-structural protein. This domain family is found in viruses, and is approximately 50 amino acids in length. This family contains proteins of each of the four types of Amdovirus non-structural protein.	54
403616	pfam12476	DUF3696	Protein of unknown function (DUF3696). This domain family is found in bacteria and archaea, and is approximately 50 amino acids in length.	53
403617	pfam12477	TraW_N	Sex factor F TraW protein N terminal. This domain family is found in bacteria, and is approximately 30 amino acids in length. There is a single completely conserved residue G that may be functionally important. The traW gene of the E. coli K-12 sex factor, F, encodes one of the numerous proteins required for conjugative transfer of this plasmid.	30
372135	pfam12478	DUF3697	Ubiquitin-associated protein 2. This domain family is found in eukaryotes, and is approximately 30 amino acids in length. The family is found in association with pfam00627. There are two conserved sequence motifs: AVEMPG and QFG.	32
372136	pfam12479	DUF3698	Protein of unknown function (DUF3698). This domain family is found in eukaryotes, and is typically between 89 and 105 amino acids in length.	101
403618	pfam12480	DUF3699	Protein of unknown function (DUF3699). This domain family is found in eukaryotes, and is approximately 80 amino acids in length.	71
403619	pfam12481	DUF3700	Aluminium induced protein. This domain family is found in eukaryotes, and is approximately 120 amino acids in length. There are two conserved sequence motifs: YGL and LRDR. This family is related to GATase enzyme domains.	228
372139	pfam12482	DUF3701	Phage integrase protein. This domain family is found in bacteria, and is approximately 100 amino acids in length. The family is found in association with pfam00589.	88
403620	pfam12483	GIDE	E3 Ubiquitin ligase. This domain family is found in bacteria, archaea and eukaryotes, and is typically between 150 and 163 amino acids in length. There is a single completely conserved residue E that may be functionally important. GIDE is an E3 ubiquitin ligase which is involved in inducing apoptosis.	160
372141	pfam12484	PE_PPE_C	Polymorphic PE/PPE proteins C terminal. This domain family is found in bacteria, and is approximately 90 amino acids in length. The family is found in association with pfam00823. There is a conserved SVP sequence motif. There is a single completely conserved residue W that may be functionally important. The proteins in this family are PE/PPE proteins implicated in immunostimulation and virulence.	80
403621	pfam12485	SLY	Lymphocyte signaling adaptor protein. This domain family is found in eukaryotes, and is typically between 144 and 156 amino acids in length. The family is found in association with pfam07647, pfam07653. There is a conserved LGKK sequence motif. SLY contains a Src homology 3 domain and a sterile alpha motif, suggesting that it functions as a signaling adaptor protein in lymphocytes.	154
403622	pfam12486	VasL	Type VI secretion system, EvfB, or VasL. EvfB or VasL is a domain found on many Gram-negative proteins with an ImpA_N domain at the N-terminus. These proteins are expressed from the pathogenicity locus that forms the bacterial type VI secretion system. The exact function of VasL is not known. One E. coli member is annotated as being EvfB, though the E. coli equivalent of ImpA would be expected to be EvfG. It is possible that in many bacteria what is a single protein in one species, e.g. E. coli, is a fusion of two genes in others, which would explain an ImpA at the N-terminus and a VasL at the C-terminus.	147
403623	pfam12487	DUF3703	Protein of unknown function (DUF3703). This family of proteins is found in bacteria. Proteins in this family are typically between 113 and 135 amino acids in length.	109
403624	pfam12488	DUF3704	Protein of unknown function (DUF3704). This domain family is found in eukaryotes, and is approximately 30 amino acids in length.	27
403625	pfam12489	ARA70	Nuclear coactivator. This domain family is found in eukaryotes, and is typically between 127 and 138 amino acids in length. This family is ARA70, a nuclear coactivator which interacts with peroxisome proliferator-activated receptor gamma (PPARgamma) to regulate transcription. The addition of the PPARgamma ligand (prostaglandin J2) enhances this interaction.	131
403626	pfam12490	BCAS3	Breast carcinoma amplified sequence 3. This domain family is found in eukaryotes, and is typically between 229 and 245 amino acids in length. The proteins in this family have been shown to be proto-oncogenes implicated in the development of breast cancer.	240
403627	pfam12491	ApoB100_C	Apolipoprotein B100 C terminal. This domain family is found in eukaryotes, and is approximately 60 amino acids in length. There are two conserved sequence motifs: QLS and LIDL. ApoB100 has an essential role in the assembly and secretion of triglyceride-rich lipoproteins and in lipid transport.	57
289275	pfam12493	DUF3709	Protein of unknown function (DUF3709). This domain family is found in bacteria, and is approximately 30 amino acids in length. There are two conserved sequence motifs: RCLMK and LIEL.	33
372148	pfam12494	DUF3695	Protein of unknown function (DUF3695). This family of proteins is found in eukaryotes. Proteins in this family are typically between 157 and 192 amino acids in length. There is a single completely conserved residue D that may be functionally important.	95
403628	pfam12495	Vip3A_N	Vegetative insecticide protein 3A N terminal. This family of proteins is found in bacteria. Proteins in this family are typically between 170 and 789 amino acids in length. The family is found in association with pfam02018. Vip3A represents a novel class of proteins insecticidal to lepidopteran insect larvae.	177
403629	pfam12496	BNIP2	Bcl2-/adenovirus E1B nineteen kDa-interacting protein 2. This domain family is found in eukaryotes, and is typically between 119 and 133 amino acids in length. There is a conserved HGGY sequence motif. This family is Bcl2-/adenovirus E1B nineteen kDa-interacting protein 2. It interacts with pro- and anti- apoptotic molecules in the cell.	135
403630	pfam12497	ERbeta_N	Estrogen receptor beta. This domain family is found in eukaryotes, and is approximately 110 amino acids in length. The family is found in association with pfam00104, pfam00105. There is a conserved IPS sequence motif. There are two completely conserved residues (Y and W) that may be functionally important. ERbeta binds estrogens with an affinity similar to that of ERalpha, and activates expression of reporter genes containing estrogen response elements in an estrogen-dependent manner. ERbeta acts as a transcription factor once bound to its ligand and it can dimerize with ERalpha.	114
403631	pfam12498	bZIP_C	Basic leucine-zipper C terminal. This family of proteins is found in eukaryotes. Proteins in this family are typically between 174 and 411 amino acids in length. The family is found in association with pfam00170. There is a conserved KVK sequence motif. There is a single completely conserved residue K that may be functionally important. Various bZIP proteins have been found and shown to play a role in seed-specific gene expression. bZIP binds to the alpha-globulin gene promoter, but not to promoters of other major storage genes such as glutelin, prolamin and albumin.	122
403632	pfam12499	DUF3707	Pherophorin. This domain family is found in eukaryotes, and is typically between 147 and 160 amino acids in length. The proteins in this family are frequently annotated as pherophorins however there is little accompanying literature to confirm this.	139
403633	pfam12500	TRSP	TRSP domain C-terminus to PRTase_2. This domain occurs C-terminal to PRTase_2 and has highly conserved GXXE and TRSP signatures. It is found in bacteria. These genes are found in the biosynthetic operon associated with the Ter stress response operon and are predicted to be involved in the biosynthesis of a ribo-nucleoside involved in stress response.	128
403634	pfam12501	DUF3708	Phosphate ATP-binding cassette transporter. This domain family is found in bacteria, and is typically between 143 and 173 amino acids in length. The family is found in association with pfam00528. There is a single completely conserved residue P that may be functionally important.	165
403635	pfam12502	DUF3710	Protein of unknown function (DUF3710). This family of proteins is found in bacteria. Proteins in this family are typically between 237 and 284 amino acids in length. There are two conserved sequence motifs: DLG and DGPRW.	177
289285	pfam12503	CMV_1a_C	Cucumber mosaic virus 1a protein C terminal. This domain family is found in viruses, and is approximately 90 amino acids in length. The family is found in association with pfam01443, pfam01660. There is a conserved GLG sequence motif. 1a protein is the major virulence factor of the cucumber mosaic virus (CMV). The Ns strain of CMV causes necrotic lesions to Nicotiana spp. while other strains cause systemic mosaic. The determinant of the pathogenesis of these different strains is the specific amino acid residue at position 461 of the 1a protein.	84
403636	pfam12505	DUF3712	Protein of unknown function (DUF3712). This domain family is found in eukaryotes, and is approximately 130 amino acids in length.	125
289287	pfam12506	DUF3713	Protein of unknown function (DUF3713). This family of proteins is found in bacteria. Proteins in this family are typically between 92 and 1225 amino acids in length. There is a single completely conserved residue S that may be functionally important.	115
403637	pfam12507	HCMV_UL139	Human Cytomegalovirus UL139 protein. This family of proteins is found in eukaryotes and viruses. Proteins in this family are approximately 140 amino acids in length. UL139 product shared sequence homology with human CD24, a signal transducer modulating B-cell activation responses, and the sequences in the G1c variant of UL139 contained a specific attachment site of prokaryotic membrane lipoprotein lipid.	100
403638	pfam12508	Transposon_TraM	Conjugative transposon, TraM. Proteins in this entry are designated TraM and are found in a proposed transfer region of a class of conjugative transposon found in the Bacteroides lineage.	194
403639	pfam12509	DUF3715	Protein of unknown function (DUF3715). This domain family is found in eukaryotes, and is approximately 170 amino acids in length.	150
403640	pfam12510	Smoothelin	Smoothelin cytoskeleton protein. This domain family is found in eukaryotes, and is approximately 50 amino acids in length. The family is found in association with pfam00307. Smoothelin is a cytoskeletal protein specifically expressed in differentiated smooth muscle cells and has been shown to co-localize with smooth muscle alpha actin.	50
403641	pfam12511	DUF3716	Protein of unknown function (DUF3716). This domain family is found in eukaryotes, and is approximately 60 amino acids in length.	59
403642	pfam12512	DUF3717	Protein of unknown function (DUF3717). This family of proteins is found in bacteria. Proteins in this family are typically between 75 and 117 amino acids in length. There is a conserved AIN sequence motif. There are two completely conserved residues (L and Y) that may be functionally important.	65
403643	pfam12513	SUV3_C	Mitochondrial degradosome RNA helicase subunit C terminal. This domain family is found in bacteria and eukaryotes, and is approximately 50 amino acids in length. The family is found in association with pfam00271. The yeast mitochondrial degradosome (mtEXO) is an NTP-dependent exoribonuclease involved in mitochondrial RNA metabolism. mtEXO is made up of two subunits: an RNase (DSS1) and an RNA helicase (SUV3). These co-purify with mitochondrial ribosomes.	47
378871	pfam12514	DUF3718	Protein of unknown function (DUF3718). This domain family is found in bacteria and viruses, and is approximately 70 amino acids in length. There is a single completely conserved residue C that may be functionally important.	66
403644	pfam12515	CaATP_NAI	Ca2+-ATPase N terminal autoinhibitory domain. This domain family is found in eukaryotes, and is approximately 50 amino acids in length. The family is found in association with pfam00689, pfam00122, pfam00702, pfam00690. There is a conserved RRFR sequence motif. There are two completely conserved residues (F and W) that may be functionally important. This family is the N terminal autoinhibitory domain of an endosomal Ca2+-ATPase.	45
403645	pfam12516	DUF3719	Protein of unknown function (DUF3719). This domain family is found in eukaryotes, and is approximately 70 amino acids in length. There is a conserved HLR sequence motif. There are two completely conserved residues (W and H) that may be functionally important.	65
338388	pfam12517	DUF3720	Protein of unknown function (DUF3720). This domain family is found in eukaryotes, and is approximately 100 amino acids in length. There are two completely conserved A residues that may be functionally important.	99
403646	pfam12518	DUF3721	Protein of unknown function. This domain family is found in bacteria and eukaryotes, and is approximately 30 amino acids in length. There is a conserved WMPC sequence motif. There are two completely conserved residues (A and C) that may be functionally important.	33
403647	pfam12519	MDM10	Mitochondrial distribution and morphology protein 10. MDM10 is a family of eukaryotic proteins that forms a subunit of the SAM complex for biogenesis of beta-barrel proteins, though not porins, into the outer mitochondrial membrane.	434
372162	pfam12520	DUF3723	Protein of unknown function (DUF3723). This family of proteins is found in eukaryotes. Proteins in this family are typically between 374 and 1069 amino acids in length. There is a conserved LGF sequence motif.	504
152955	pfam12521	DUF3724	Protein of unknown function (DUF3724). This domain family is found in viruses, and is approximately 20 amino acids in length. The family is found in association with pfam00073. There is a single completely conserved residue Y that may be functionally important.	23
315236	pfam12522	UL73_N	Cytomegalovirus glycoprotein N terminal. This domain family is found in viruses, and is approximately 30 amino acids in length. The family is found in association with pfam03554. This family is an envelope glycoprotein of human cytomegalovirus (HCMV).	27
152957	pfam12523	DUF3725	Protein of unknown function (DUF3725). This domain family is found in viruses, and is approximately 70 amino acids in length. The family is found in association with pfam01577. There is a conserved FLE sequence motif.	74
315237	pfam12524	GlyL_C	dsDNA virus glycoprotein L C terminal. This domain family is found in viruses, and is typically between 55 and 80 amino acids in length. The family is found in association with pfam05259. This family is the C terminal of glycoprotein L from various types of double stranded DNA viruses (dsDNA).	65
403648	pfam12525	DUF3726	Protein of unknown function (DUF3726). This domain family is found in bacteria and eukaryotes, and is approximately 80 amino acids in length. There is a single completely conserved residue E that may be functionally important.	74
372164	pfam12526	DUF3729	Protein of unknown function (DUF3729). This family of proteins is found in viruses. Proteins in this family are typically between 145 and 1707 amino acids in length. The family is found in association with pfam01443, pfam01661, pfam05417, pfam01660, pfam00978. There is a single completely conserved residue L that may be functionally important.	115
403649	pfam12527	DUF3727	Protein of unknown function (DUF3727). This domain family is found in bacteria and eukaryotes, and is approximately 100 amino acids in length.	97
403650	pfam12528	T2SSppdC	Type II secretion prepilin peptidase dependent protein C. 	81
403651	pfam12529	Xylo_C	Xylosyltransferase C terminal. This domain family is found in eukaryotes, and is typically between 169 and 183 amino acids in length. The family is found in association with pfam02485. There is a single completely conserved residue G that may be functionally important. Xylosyltransferases are enzymes involved in the biosynthesis of the glycosaminoglycan linker region in proteoglycans.	181
403652	pfam12530	DUF3730	Protein of unknown function (DUF3730). This domain family is found in eukaryotes, and is typically between 220 and 262 amino acids in length.	227
403653	pfam12531	DUF3731	DNA-K related protein. This domain family is found in bacteria, and is approximately 250 amino acids in length. There are two conserved sequence motifs: RPG and WRR. The proteins in this family are frequently annotated as DNA-K related proteins however there is little accompanying literature to confirm this.	247
403654	pfam12532	DUF3732	Protein of unknown function (DUF3732). This domain family is found in bacteria and eukaryotes, and is typically between 180 and 198 amino acids in length. There is a conserved DQP sequence motif.	184
403655	pfam12533	Neuro_bHLH	Neuronal helix-loop-helix transcription factor. This domain family is found in eukaryotes, and is approximately 80 amino acids in length. The family is found C-terminal to pfam00010. There is a single completely conserved residue W that may be functionally important. Neuronal basic helix-loop-helix (bHLH) transcription factors such as neuroD and neurogenin have been shown to play important roles in neuronal development.	122
403656	pfam12534	Pannexin_like	Pannexin-like TM region of LRRC8. Pannexin_like is a family of the four transmembrane domains of metazoan leucine-rich-repeat-containing 8 proteins. These four TMs associate into hexamers resulting in homo- or heteromeric channels that connect the cytosol to the extracellular space. The family is found in association with pfam00560.	342
403657	pfam12535	Nudix_N	Hydrolase of X-linked nucleoside diphosphate N terminal. This family of proteins is found in eukaryotes. Proteins in this family are typically between 847 and 5344 amino acids in length. These enzymes hydrolyze the molecular motif of a nucleoside diphosphate linked to some other moiety, X.	54
403658	pfam12536	DUF3734	Patatin phospholipase. This domain family is found in bacteria, and is approximately 110 amino acids in length. The family is found in association with pfam01734. There are two completely conserved residues (F and G) that may be functionally important. The proteins in this family are frequently annotated as patatin family phospholipases however there is little accompanying literature to confirm this.	106
403659	pfam12537	GPHR_N	The Golgi pH Regulator (GPHR) Family N-terminal. GPHR_N is the N-terminal 5TM region of the Golgi pH regulator proteins in eukaryotes. It plays vital roles in the transport of newly synthesized proteins from the Golgi to the plasma membrane, in the glycosylation of proteins along the exocytic pathway and the structural organisation of the Golgi apparatus.	68
372173	pfam12538	FtsK_SpoIIIE_N	DNA transporter. This domain family is found in bacteria, and is typically between 107 and 121 amino acids in length. The family is found in association with pfam01580. The FtsK/SpoIIIE family of DNA transporters are responsible for translocating missegregated chromosomes after the completion of cell division.	115
403660	pfam12539	Csm1	Chromosome segregation protein Csm1/Pcs1. Saccharomyces cerevisiae Csm1 is part of the monopolin complex. Csm1 forms a complex with Mde4 and promotes monoorientation during meiosis. Csm1 also plays a mitotic role in DNA replication. This family also contains the Schizosaccharomyces pombe homolog to Csm1, Pcs1. Pcs1 forms a complex with Mde4 and acts in the central kinetochore domain to clamp microtubule binding sites together. The two complexes (Csm1/Lrs4 and Pcs1/Mde4) contribute to the prevention of merotelic attachment.	84
403661	pfam12540	DUF3736	Protein of unknown function (DUF3736). This domain family is found in eukaryotes, and is typically between 135 and 160 amino acids in length.	138
403662	pfam12541	DUF3737	Protein of unknown function (DUF3737). This family of proteins is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 281 and 297 amino acids in length.	274
403663	pfam12542	CWC25	Pre-mRNA splicing factor. This domain family is found in eukaryotes, and is approximately 100 amino acids in length. The family is found in association with pfam10197. There is a single completely conserved residue Y that may be functionally important. Cwc25 has been identified to associate with pre-mRNA splicing factor Cef1/Ntc85, a component of the Prp19-associated complex (NTC) involved in spliceosome activation. Cwc25 is neither tightly associated with NTC nor required for spliceosome activation, but is required for the first catalytic reaction.	98
403664	pfam12543	DUF3738	Protein of unknown function (DUF3738). This family of proteins is found in bacteria. Proteins in this family are typically between 251 and 457 amino acids in length.	188
378876	pfam12544	LAM_C	Lysine-2,3-aminomutase. This domain family is found in bacteria, archaea and eukaryotes, and is typically between 111 and 127 amino acids in length. The family is found in association with pfam04055. LAM catalyzes the interconversion of L-alpha-lysine and L-beta-lysine, which proceeds by migration of the amino group from C2 to C3 concomitant with cross-migration of the 3-pro-R hydrogen of L-alpha-lysine to the 2-pro-R position of L-beta-lysine.	127
372178	pfam12545	DUF3739	Filamentous haemagglutinin family outer membrane protein. This domain family is found in bacteria, and is approximately 110 amino acids in length. The family is found in association with pfam05860.	111
403665	pfam12546	Cryptochrome_C	Blue/Ultraviolet sensing protein C terminal. This domain family is found in eukaryotes, and is typically between 113 and 125 amino acids in length. The family is found in association with pfam03441, pfam00875. Cryptochromes are blue/ultraviolet-A light sensing photoreceptors involved in regulating various growth and developmental responses in plants.	121
403666	pfam12547	ATXN-1_C	Capicua transcriptional repressor modulator. This family of proteins is found in eukaryotes. Proteins in this family are typically between 49 and 781 amino acids in length. There is a conserved IQT sequence motif. ATXN1 directly binds Capicua and modulates Capicua repressor activity in Drosophila and mammalian cells. The polyglutamine expanded mutant type of ATXN-1 does not bind Capicua with as high affinity as wild-type ATXN-1. It is associated with spinocerebellar ataxia type 1 (SCA1).	50
403667	pfam12548	DUF3740	Sulfatase protein. This domain family is found in eukaryotes, and is typically between 144 and 173 amino acids in length. The family is found in association with pfam00884.	139
403668	pfam12549	TOH_N	Tyrosine hydroxylase N terminal. This domain family is found in eukaryotes, and is approximately 30 amino acids in length. There is a single completely conserved residue G that may be functionally important. Tyrosine hydroxylase converts L-tyrosine to L-DOPA in the catecholamine synthesis pathway.	25
403669	pfam12550	GCR1_C	Transcriptional activator of glycolytic enzymes. This domain family is found in eukaryotes, and is approximately 80 amino acids in length. This family activates the transcription of glycolytic enzymes.	80
403670	pfam12551	PHBC_N	Poly-beta-hydroxybutyrate polymerase N terminal. This domain family is found in bacteria and eukaryotes, and is approximately 50 amino acids in length. The family is found in association with pfam07167, pfam00561. There is a single completely conserved residue W that may be functionally important. PHBC is the third enzyme of the poly-beta-hydroxybutyrate biosynthetic pathway.	41
403671	pfam12552	DUF3741	Protein of unknown function (DUF3741). This domain family is found in eukaryotes, and is approximately 50 amino acids in length.	45
372185	pfam12553	DUF3742	Protein of unknown function (DUF3742). This domain family is found in bacteria, and is approximately 50 amino acids in length. There is a single completely conserved residue Y that may be functionally important.	114
403672	pfam12554	MOZART1	Mitotic-spindle organizing gamma-tubulin ring associated. The name MOZART is derived from letters of 'mitotic-spindle organizing proteins associated with a ring of gamma-tubulin'. This family operates as part of the gamma-tubulin ring complex, gamma-TuRC, one of the complexes necessary for chromosome segregation. This complex is located at centrosomes and mediates the formation of bipolar spindles in mitosis; it consists of six subunits. However, unlike the other four known subunits, this family does not carry the conserved 'Spc97-Spc98' GCP domain, so the TUBCGP nomenclature cannot be used for it. MOZART1 is required for gamma-TuRC recruitment to centrosomes.	45
403673	pfam12555	TPPK_C	Thiamine pyrophosphokinase C terminal. This domain family is found in bacteria, and is approximately 50 amino acids in length. The proteins in this family catalyze the pyrophosphorylation of thiamine in yeast and synthesize thiamine pyrophosphate (TPP), a thiamine coenzyme.	50
403674	pfam12556	CobS_N	Cobaltochelatase CobS subunit N terminal. This domain family is found in bacteria, and is approximately 40 amino acids in length. The family is found in association with pfam07728. There are two completely conserved residues (P and F) that may be functionally important. This family is the N terminal of the CobS subunit of cobaltochelatase. Cobaltochelatase belongs to the AAA+ superfamily of proteins. CobS and CobT form a chaperone like complex.	33
403675	pfam12557	Co_AT_N	Cob(I)alamin adenosyltransferase N terminal. This domain family is found in bacteria and eukaryotes, and is approximately 20 amino acids in length. The family is found in association with pfam02572. Cob(I)alamin adenosyltransferase adenosylates Co(I) in an ATP-dependent manner in the conversion of aquacobalamin to its coenzyme form. This is the third step in this process, after two steps involved in the reduction of Co(III) to Co(I).	23
403676	pfam12558	DUF3744	ATP-binding cassette cobalt transporter. This domain family is found in bacteria, and is approximately 70 amino acids in length. The family is found in association with pfam00005. There is a conserved REP sequence motif. There is a single completely conserved residue P that may be functionally important. The proteins in this family are frequently annotated as ABC Cobalt transporters however there is little accompanying literature to confirm this.	73
372187	pfam12559	Inhibitor_I10	Serine endopeptidase inhibitors. This family includes both microviridins and marinostatins. It seems likely that in both cases it is the C-terminus which becomes the active inhibitor after post-translational modifications of the full-length pre-peptide. It is the ester linkages within the key 12-residue region that circularize the molecule, giving it its inhibitory conformation.	64
403677	pfam12560	RAG1_imp_bd	RAG1 importin binding. This region of RAG1 is responsible for binding to importin alpha.	287
403678	pfam12561	TagA	ToxR activated gene A lipoprotein. This domain family is found in bacteria, and is approximately 140 amino acids in length. The family is found in association with pfam10462. There is a conserved GAG sequence motif. This family is a bacterial lipoprotein.	99
289339	pfam12562	DUF3746	Protein of unknown function (DUF3746). This domain family is found in viruses, and is approximately 40 amino acids in length. The family is found in association with pfam04595.	37
403679	pfam12563	Hemolysin_N	Hemolytic toxin N terminal. This domain family is found in bacteria, and is approximately 190 amino acids in length. The family is found in association with pfam07968, pfam00652. This family is a bacterial virulence factor - hemolysin - which forms pores in erythrocytes and causes them to lyse.	192
403680	pfam12564	TypeIII_RM_meth	Type III restriction/modification enzyme methylation subunit. This domain family is found in bacteria, and is approximately 60 amino acids in length. The family is found in association with pfam01555. There are two completely conserved residues (F and S) that may be functionally important. This family is a bacterial phage resistance protein. It functions in a type III restriction/modification enzyme complex. It is part of the methylation subunit of the complex. It binds DNA and methylates it.	56
403681	pfam12565	DUF3747	Protein of unknown function (DUF3747). This family of proteins is found in bacteria. Proteins in this family are typically between 215 and 413 amino acids in length. There is a conserved DSNGYS sequence motif.	171
403682	pfam12566	DUF3748	Protein of unknown function (DUF3748). This domain family is found in bacteria and eukaryotes, and is approximately 120 amino acids in length.	119
403683	pfam12567	CD45	Leukocyte receptor CD45. This family of proteins is found in eukaryotes. Proteins in this family are typically between 77 and 1130 amino acids in length. The family is found in association with pfam00041. CD45 plays a critical role in T-cell receptor (TCR)-mediated signaling. CD45 interacts with SKAP55 which is a transcriptional activator of IL-2.	59
403684	pfam12568	DUF3749	Acetyltransferase (GNAT) domain. This domain family is found in bacteria, and is approximately 40 amino acids in length. The proteins in this family are acetyltransferases of the GNAT family.	128
403685	pfam12569	NARP1	NMDA receptor-regulated protein 1. This domain family is found in eukaryotes, and is approximately 40 amino acids in length. The family is found in association with pfam07719, pfam00515. There is a single completely conserved residue L that may be functionally important. NARP1 is the mammalian homolog of a yeast N-terminal acetyltransferase that regulates entry into the G(0) phase of the cell cycle.	514
403686	pfam12570	DUF3750	Protein of unknown function (DUF3750). This family of proteins is found in bacteria. Proteins in this family are typically between 175 and 265 amino acids in length.	129
403687	pfam12571	DUF3751	Phage tail-collar fibre protein. This domain family is found in bacteria and viruses, and is approximately 160 amino acids in length. There are two completely conserved residues (K and W) that may be functionally important. The members are annotated as being putative phage tail or tail-collar proteins.	149
403688	pfam12572	DUF3752	Protein of unknown function (DUF3752). This domain family is found in eukaryotes, and is typically between 140 and 163 amino acids in length.	150
403689	pfam12573	OxoDH_E1alpha_N	2-oxoisovalerate dehydrogenase E1 alpha subunit N terminal. This domain family is found in bacteria, and is approximately 40 amino acids in length. The family is found in association with pfam00676. There are two conserved sequence motifs: VPEP and RPG. This family is the alpha subunit of the E1 component of 2-oxoisovalerate dehydrogenase. This is the enzyme complex responsible for metabolism of pyruvate, 2-oxoglutarate, branched chain 2-oxo acids and acetoin. The E1 component is a heterotetramer of alpha2beta2. The homodimerized beta subunits are flanked by two alpha subunits in a 'vise' structure.	41
403690	pfam12574	120_Rick_ant	120 KDa Rickettsia surface antigen. This domain family is found in bacteria, and is approximately 40 amino acids in length. This family is a Rickettsia surface antigen of 120 KDa which may be used as an antigen for immune response against the bacterial species.	238
289352	pfam12575	Pox_EPC_I2-L1	Poxvirus entry protein complex L1 and I2. Pox_EPC_I2-L1 family of proteins is found in poxviruses. Proteins in this family are approximately 70 amino acids in length. There is a conserved YLK sequence motif.	71
403691	pfam12576	DUF3754	Protein of unknown function (DUF3754). This domain family is found in bacteria, archaea and eukaryotes, and is typically between 135 and 166 amino acids in length. There is a single completely conserved residue P that may be functionally important.	136
403692	pfam12577	PPARgamma_N	PPAR gamma N-terminal region. Peroxisome proliferator-activated receptors (PPAR) are nuclear hormone receptors that control the expression of genes involved in lipid homeostasis in mammals. This sequence region is found at the N-terminus of these proteins. The family is found in association with pfam00104, pfam00105. It is not clear if this region is a separate protein domain.	79
403693	pfam12578	3-PAP	Myotubularin-associated protein. This domain family is found in eukaryotes, and is typically between 115 and 138 amino acids in length. Myotubularin is a dual-specific phosphatase that dephosphorylates phosphatidylinositol 3-phosphate and phosphatidylinositol (3,5)-bisphosphate. 3-PAP is a catalytically inactive member of the myotubularin gene family, which coprecipitates lipid phosphatidylinositol 3-phosphate-3-phosphatase activity from lysates of human platelets.	128
403694	pfam12579	DUF3755	Protein of unknown function (DUF3755). This domain family is found in eukaryotes, and is approximately 40 amino acids in length. There is a single completely conserved residue N that may be functionally important.	34
403695	pfam12580	TPPII	Tripeptidyl peptidase II. This domain family is found in bacteria and eukaryotes, and is approximately 190 amino acids in length. The family is found in association with pfam00082. Tripeptidyl peptidase II (TPPII) is a crucial component of the proteolytic cascade acting downstream of the 26S proteasome in the ubiquitin-proteasome pathway. It is an amino peptidase belonging to the subtilase family removing tripeptides from the free N-terminus of oligopeptides.	187
315289	pfam12581	DUF3756	Protein of unknown function (DUF3756). This domain family is found in viruses, and is approximately 40 amino acids in length.	41
403696	pfam12582	DUF3757	Protein of unknown function (DUF3757). This family of proteins is found in bacteria. Proteins in this family are typically between 94 and 154 amino acids in length.	122
403697	pfam12583	TPPII_N	Tripeptidyl peptidase II N terminal. This domain family is found in bacteria and eukaryotes, and is approximately 190 amino acids in length. The family is found in association with pfam00082. Tripeptidyl peptidase II (TPPII) is a crucial component of the proteolytic cascade acting downstream of the 26S proteasome in the ubiquitin-proteasome pathway. It is an amino peptidase belonging to the subtilase family removing tripeptides from the free N-terminus of oligopeptides.	136
403698	pfam12584	TRAPPC10	Trafficking protein particle complex subunit 10, TRAPPC10. This domain forms part of the TRAPP complex for mediating vesicle docking and fusion in the Golgi apparatus. The fungal version is referred to as Trs130, and an alternative vertebrate alias is TMEM1.	147
403699	pfam12585	DUF3759	Protein of unknown function (DUF3759). This family of proteins is found in eukaryotes. Proteins in this family are typically between 107 and 132 amino acids in length. There is a single completely conserved residue H that may be functionally important.	91
315294	pfam12586	DUF3760	Protein of unknown function (DUF3760). This domain family is found in eukaryotes, and is typically between 46 and 64 amino acids in length.	44
403700	pfam12587	DUF3761	Protein of unknown function (DUF3761). This family of proteins is found in bacteria. Proteins in this family are typically between 100 and 157 amino acids in length.	87
403701	pfam12588	PSDC	Phosphatidylserine decarboxylase. This domain family is found in bacteria and eukaryotes, and is approximately 140 amino acids in length. The family is found in association with pfam02666. Phosphatidylserine decarboxylase (PSD) is an important enzyme in the synthesis of phosphatidylethanolamine in both prokaryotes and eukaryotes.	140
403702	pfam12589	WBS_methylT	Methyltransferase involved in Williams-Beuren syndrome. This domain family is found in eukaryotes, and is typically between 72 and 83 amino acids in length. The family is found in association with pfam08241. This family is made up of S-adenosylmethionine-dependent methyltransferases. The proteins are deleted in Williams-Beuren syndrome (WBS), a complex developmental disorder with multisystemic manifestations including supravalvular aortic stenosis (SVAS) and a specific cognitive phenotype.	81
403703	pfam12590	Acyl-thio_N	Acyl-ATP thioesterase. This domain family is found in bacteria and eukaryotes, and is typically between 120 and 131 amino acids in length. The family is found in association with pfam01643. The plant acyl-acyl carrier protein (ACP) thioesterases (TEs) have roles in fatty acid synthesis.	131
153025	pfam12591	DUF3762	Protein of unknown function (DUF3762). This domain family is found in viruses, and is approximately 80 amino acids in length. The family is found in association with pfam05533.	80
403704	pfam12592	DUF3763	Protein of unknown function (DUF3763). This domain family is found in bacteria, and is approximately 60 amino acids in length. The family is found in association with pfam07728. There is a single completely conserved residue F that may be functionally important.	55
289369	pfam12593	McyA_C	Microcystin synthetase C terminal. This domain family is found in bacteria, and is approximately 40 amino acids in length. The family is found in association with pfam08242, pfam00501. There is a conserved YAN sequence motif. Microcystins form a large family of small cyclic heptapeptides harbouring extensive modifications in amino acid residue composition and functional group chemistry. These peptide hepatotoxins contain a range of non-proteinogenic amino acids and unusual peptide bonds, and are typically N-methylated. They are synthesized on large enzyme complexes consisting of non-ribosomal peptide synthetases and polyketide synthases. This family is made up of the C terminal of microcystin synthetase, one of the proteins involved in this synthesis pathway.	43
403705	pfam12594	DUF3764	Protein of unknown function (DUF3764). This family of proteins is found in bacteria. Proteins in this family are typically between 89 and 101 amino acids in length.	84
403706	pfam12595	Rhomboid_SP	Rhomboid serine protease. This domain family is found in eukaryotes, and is approximately 210 amino acids in length. The family is found in association with pfam01694. Rhomboid is a seven-transmembrane spanning protein that resides in the Golgi and acts as a serine protease to cleave Spitz.	216
257152	pfam12596	Tnp_P_element_C	87kDa Transposase. This domain family is found in eukaryotes, and is typically between 78 and 110 amino acids in length. The family is found in association with pfam05485. There are two completely conserved residues (D and G) that may be functionally important. This family is an 87kDa transposase protein which catalyzes both the precise and imprecise excision of a nonautonomous P transposable element.	107
403707	pfam12597	DUF3767	Protein of unknown function (DUF3767). This family of proteins is found in eukaryotes. Proteins in this family are typically between 112 and 199 amino acids in length.	101
403708	pfam12598	TBX	T-box transcription factor. This domain family is found in eukaryotes, and is typically between 77 and 89 amino acids in length. The family is found in association with pfam00907. There are two completely conserved residues (S and P) that may be functionally important. T-box genes encode transcription factors involved in morphogenesis and organogenesis of vertebrates and invertebrates.	83
403709	pfam12599	DUF3768	Protein of unknown function (DUF3768). This family of proteins is found in bacteria. Proteins in this family are typically between 108 and 129 amino acids in length. There are two conserved sequence motifs: NDP and RVLT.	83
403710	pfam12600	DUF3769	Protein of unknown function (DUF3769). This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 560 and 931 amino acids in length.	443
289375	pfam12601	Rubi_NSP_C	Rubivirus non-structural protein. This domain family is found in viruses, and is approximately 70 amino acids in length. The family is found in association with pfam05407. The rubella virus (RUB) nonstructural (NS) protein (NSP) ORF encodes a protease that cleaves the NSP precursor (240 kDa) at a single site to produce two products.	66
315304	pfam12602	FinO_N	Fertility inhibition protein N terminal. This domain family is found in bacteria, and is typically between 62 and 102 amino acids in length. The family is found in association with pfam04352. The FinOP (fertility inhibition) system of F-like plasmids consists of an antisense RNA (FinP) and a 22 kDa protein (FinO) which act in concert to prevent the translation of TraJ, the positive regulator of the transfer operon.	62
289377	pfam12603	DUF3770	Protein of unknown function (DUF3770). This domain family is found in viruses, and is approximately 250 amino acids in length. The family is found in association with pfam04196.	235
403711	pfam12604	gp37_C	Tail fibre protein gp37 C terminal. This domain family is found in bacteria and viruses, and is typically between 49 and 166 amino acids in length. The family is found in association with pfam03906. In T-even phages, gp37 and gp38 are components of the tail fibre that are critical for phage-host interaction.	156
403712	pfam12605	CK1gamma_C	Casein kinase 1 gamma C terminal. This domain family is found in eukaryotes, and is typically between 54 and 99 amino acids in length. The family is found in association with pfam00069. CK1gamma is a membrane-bound member of the CK1 family. Gain-of-function and loss-of-function experiments show that CK1gamma is both necessary and sufficient to transduce LRP6 signalling in vertebrates and Drosophila cells.	99
403713	pfam12606	RELT	tumor necrosis factor receptor superfamily member 19. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 49 and 288 amino acids in length. There are two completely conserved residues (K and Y) that may be functionally important. The members of tumor necrosis factor receptor (TNFR) superfamily have been designated as the "guardians of the immune system" due to their roles in immune cell proliferation, differentiation, activation, and death (apoptosis). The messenger RNA of RELT is especially abundant in hematologic tissues such as spleen, lymph node, and peripheral blood leukocytes as well as in leukemias and lymphomas. RELT is able to activate the NF-kappaB pathway and selectively binds tumor necrosis factor receptor-associated factor 1.	42
403714	pfam12607	DUF3772	Protein of unknown function (DUF3772). This domain family is found in bacteria, and is approximately 60 amino acids in length. The family is found in association with pfam00924.	63
289381	pfam12608	T4bSS_IcmS	Type IVb secretion, IcmS, effector-recruitment. This is a family of Gram-negative bacterial proteins involved in the Dot/Icm type IVb transport system. Members are small acidic cytoplasmic proteins required for Dot/Icm-dependent activities. Binary complexes of IcmW-IcmS and of IcmS-LvgA have been consistently reported, suggestive of the binary WXG100 system. The IcmW-IcmS complex may play a role in recruitment of effector proteins to the transport apparatus.	92
403715	pfam12609	DUF3774	Wound-induced protein. This family of proteins is found in eukaryotes. Proteins in this family are typically between 81 and 97 amino acids in length. The proteins in the family are often annotated as wound-induced proteins however there is little accompanying literature to confirm this.	77
403716	pfam12610	SOCS	Suppressor of cytokine signalling. This domain family is found in bacteria and eukaryotes, and is approximately 60 amino acids in length. The family is found in association with pfam07525, pfam00017. The suppressors of cytokine signaling (SOCS) family play important roles in regulating a variety of signal transduction pathways that are involved in immunity, growth and development of organisms.	58
403717	pfam12611	Flagellar_put	Putative flagellar. Proteins in this entry are encoded in a subset of bacterial flagellar operons, generally between genes designated flgD and flgE, in species as diverse as Bacillus halodurans and various other Firmicutes, Geobacter sulfurreducens, and Bdellovibrio bacteriovorus.	24
403718	pfam12612	TFCD_C	Tubulin folding cofactor D C terminal. This domain family is found in eukaryotes, and is typically between 182 and 199 amino acids in length. The family is found in association with pfam02985. There is a single completely conserved residue R that may be functionally important. Tubulin folding cofactor D does not co-polymerize with microtubules either in vivo or in vitro, but instead modulates microtubule dynamics by sequestering beta-tubulin from GTP-bound alphabeta-heterodimers in microtubules.	186
403719	pfam12613	FliC_SP	Flagellin structural protein. This domain family is found in bacteria, and is approximately 60 amino acids in length. The family is found in association with pfam00669, pfam00700. This family is the bacterial flagellin structural protein. It is involved with cell motility.	53
403720	pfam12614	RRF_GI	Ribosome recycling factor. This family of proteins is found in bacteria and viruses. Proteins in this family are approximately 130 amino acids in length. There are two conserved sequence motifs: LPS and LKR. Overproduction of ribosome recycling factor (RRF) reduces tna operon expression and increases the rate of cleavage of TnaC-tRNA(2)(Pro), relieving the growth inhibition associated with plasmid-mediated tnaC overexpression.	126
403721	pfam12615	TraD_N	F sex factor protein N terminal. This domain family is found in bacteria, and is typically between 96 and 107 amino acids in length. The family is found in association with pfam10412. TraD is a cytoplasmic membrane protein with possible DNA binding domains. It is part of the bacterial F sex factor complex.	93
403722	pfam12616	DUF3775	Protein of unknown function (DUF3775). This domain family is found in bacteria, and is approximately 80 amino acids in length. There is a single completely conserved residue G that may be functionally important.	69
403723	pfam12617	LdpA_C	Iron-Sulfur binding protein C terminal. This domain family is found in bacteria and eukaryotes, and is typically between 179 and 201 amino acids in length. The family is found in association with pfam00037. LdpA (light-dependent period) plays a role in controlling the redox state in cyanobacteria to modulate its circadian clock. LdpA is a protein with Iron-Sulfur cluster-binding motifs.	183
372226	pfam12618	DUF3776	Protein of unknown function (DUF3776). This domain family is found in eukaryotes, and is approximately 100 amino acids in length.	76
403724	pfam12619	MCM2_N	Mini-chromosome maintenance protein 2. This domain family is found in eukaryotes, and is typically between 138 and 153 amino acids in length. The family is found in association with pfam00493. Mini-chromosome maintenance (MCM) proteins are essential for DNA replication. These proteins use ATPase activity to perform this function.	148
403725	pfam12620	DUF3778	Protein of unknown function (DUF3778). This domain family is found in eukaryotes, and is typically between 48 and 61 amino acids in length. There is a conserved LRF sequence motif.	64
403726	pfam12621	PHM7_ext	Extracellular tail, of 10TM putative phosphate transporter. This PHM7_ext family is found in plants and fungi. It represents the C-terminal extracellular domain of the putative phosphate transporter, PHM7. The three N-terminal TMS are found in family RSN1_TM, pfam02714; the cytoplasmic domain is pfam14703, and the remaining 7TM region is in pfam02714.	84
403727	pfam12622	NpwBP	mRNA biogenesis factor. The full-length Wbp11 proteins carry several copies of a PPGPPP motif throughout their length. This motif is thought to be necessary for folding of the molecule as it helps to bind the WW domain, Wbp11, pfam09429. This domain together with Wbp11 may function as components of an mRNA factory in the nucleus.	47
403728	pfam12623	Hen1_L	RNA repair, ligase-Pnkp-associating, region of Hen1. This domain is the N-terminal region of the bacterial Hen1 protein. This protein forms a stable hetero-tetramer with Pnkp. The hetero-tetramer was able to repair transfer RNAs cleaved by ribotoxins in vitro. This domain provides the ligase activity of the hetero-tetramer.	230
403729	pfam12624	Chorein_N	N-terminal region of Chorein or VPS13. Although mutations in the full-length vacuolar protein sorting 13A (VPS13A) protein in vertebrates lead to the disease of chorea-acanthocytosis, the exact function of any of the regions within the protein is not yet known. This region is the proposed leucine zipper at the N-terminus. The full-length protein is a transmembrane protein with a presumed role in vesicle-mediated sorting and intracellular protein transport.	109
403730	pfam12625	Arabinose_bd	Arabinose-binding domain of AraC transcription regulator, N-term. AraC is a bacterial transcriptional regulatory protein with a DNA-binding domain at the C-terminus, HTH_AraC, pfam00165, and this dimerization domain which harbours the arabinose-binding pocket at the N-terminus. AraC positively and negatively regulates expression of the proteins required for the uptake and catabolism of the sugar L-arabinose.	185
403731	pfam12626	PolyA_pol_arg_C	Polymerase A arginine-rich C-terminus. The C-terminus of polymerase A in E. coli is arginine-rich and is necessary for full functioning of the enzyme.	116
403732	pfam12627	PolyA_pol_RNAbd	Probable RNA and SrmB-binding site of polymerase A. This region encompasses much of the RNA and SrmB binding motifs on polymerase A.	64
403733	pfam12628	Inhibitor_I71	Falstatin, cysteine peptidase inhibitor. This family of peptidase inhibitors is expressed from plasmodial protozoal species. Falstatin is found to be a potent reversible inhibitor of the P. falciparum cysteine proteases falcipain-2 and falcipain-3, as well as other parasite- and non-parasite-derived cysteine proteases, but is only a relatively weak inhibitor of the P. falciparum cysteine proteases falcipain-1 and dipeptidyl aminopeptidase 1. Thus, P. falciparum requires expression of falstatin to limit proteolysis by certain host or parasite cysteine proteases during erythrocyte invasion.	173
289401	pfam12629	Pox_polyA_pol_C	Poxvirus poly(A) polymerase C-terminal domain. This domain is found at the C-terminus of the pox virus PolyA polymerase protein.	199
403734	pfam12630	Pox_polyA_pol_N	Poxvirus poly(A) polymerase N-terminal domain. This domain is found at the N-terminus of the pox virus Poly(A) polymerase protein. According to SCOP this domain contains a helix-hairpin-helix motif.	108
403735	pfam12631	MnmE_helical	MnmE helical domain. The tRNA modification GTPase MnmE consists of three domains. An N-terminal domain, a helical domain and a GTPase domain which is nested within the helical domain. This family represents the helical domain.	326
403736	pfam12632	Vezatin	Myosin-binding motif of peroxisomes. Vezatin is a peroxisome transmembrane receptor that is involved in membrane-membrane and cell-cell adhesions. In the movement of peroxisomes it binds to class V and class VIIa myosins to guide the organelle through the microtubules and allow pathogens to internalize themselves into host cells. Vezatin is crucial for spermatozoan production. In mouse cells it interacts with the cadherin-catenin complex bridging it to the C-terminal FERM domain of myosin VIIA.	242
403737	pfam12633	Adenyl_cycl_N	Adenylate cyclase NT domain. 	198
372234	pfam12634	Inp1	Inheritance of peroxisomes protein 1. Inp1 is a family of peripheral membrane proteins of peroxisomes. Inp1p binds Pex25p, Pex30p, and Vps1p, all of which are involved in controlling peroxisome division. The levels of Inp1p vary with the cell cycle, and Inp1 acts as a factor that retains peroxisomes in cells and controls peroxisome division. Inp1p promotes the retention of peroxisomes in mother cells and buds of budding yeast by attaching peroxisomes to as-yet-unidentified cortical structures.	137
403738	pfam12635	DUF3780	Protein of unknown function (DUF3780). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 189 and 206 amino acids in length. There are two conserved sequence motifs: PEERWWL and GWR. This family is found in a very sporadic set of bacterial species, suggesting that it may have been horizontally transferred. One protein is annotated as plasmid borne.	184
403739	pfam12636	DUF3781	Protein of unknown function (DUF3781). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 82 and 98 amino acids in length. There are two conserved sequence motifs: GKNWY and ITA.	72
403740	pfam12637	TSCPD	TSCPD domain. This family of proteins is found in bacteria, archaea and viruses. The domain is found in isolation in many proteins where it has a conserved C-terminal motif TSCPD after which the domain is named. Most copies of the domain possess 4 conserved cysteines that may be part of an Iron-sulfur cluster. This domain is found at the C-terminus of some ribonucleoside-diphosphate reductase enzymes.	74
403741	pfam12638	Staygreen	Staygreen protein. This family of proteins has been implicated in chlorophyll degradation. Intriguingly, members of this family are also found in non-photosynthetic bacteria.	147
403742	pfam12639	Colicin-DNase	DNase/tRNase domain of colicin-like bacteriocin. Colicin-like bacteriocins are complex structures with an N-terminal beta-barrel translocation domain (pfam09000), a long double-alpha-helical receptor-binding domain (pfam11570) and this C-terminal RNAse/DNase domain with endonuclease activity. Their competitor bacteriocidal action is by a process that involves binding to a surface receptor, entering the cell, and, finally, killing it. The lethal action of colicin E3 is a specific cleavage in the ribosomal decoding A site. The crystal structure of colicin E3 reveals a Y-shaped molecule with the receptor binding domain forming a 100 Angstrom long stalk and the two globular heads of the translocation domain and this catalytic domain comprising the two arms.	96
403743	pfam12640	UPF0489	UPF0489 domain. This family is probably an enzyme which is related to the Arginase family.	161
403744	pfam12641	Flavodoxin_3	Flavodoxin domain. This family represents a flavodoxin domain.	159
403745	pfam12642	TpcC	Conjugative transposon protein TcpC. This family of proteins are annotated as conjugative transposon protein TcpC. The transfer clostridial plasmid (tcp) locus is part of some conjugative antibiotic resistance and virulence plasmids. TcpC was one of five genes whose products had low-level sequence identity to Tn916 proteins, having similarity to ORF13 homologs from Tn916, Tn5397, and CW459tet. This family of proteins is found in bacteria. Proteins in this family are typically between 302 and 351 amino acids in length.	230
403746	pfam12643	MazG-like	MazG-like family. This family of short proteins are distantly related to the MazG enzyme. This suggests that these proteins are enzymes that catalyze a related reaction.	84
372238	pfam12644	DUF3782	Protein of unknown function (DUF3782). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 91 and 186 amino acids in length.	78
403747	pfam12645	HTH_16	Helix-turn-helix domain. This domain appears to be a helix-turn-helix domain suggesting that this might be a transcriptional regulatory protein. Some members of this family are annotated as conjugative transposon domains.	65
403748	pfam12646	DUF3783	Domain of unknown function (DUF3783). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, archaea and eukaryotes, and is approximately 60 amino acids in length.	56
403749	pfam12647	RNHCP	RNHCP domain. This family of proteins is found in bacteria. Proteins in this family are typically between 94 and 143 amino acids in length. There is a conserved RNHCP sequence motif.	85
403750	pfam12648	TcpE	TcpE family. This family of proteins includes TcpE, a conjugative transposon membrane protein. This family of proteins is found in bacteria. Proteins in this family are typically between 122 and 168 amino acids in length.	104
403751	pfam12650	DUF3784	Domain of unknown function (DUF3784). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 96 and 110 amino acids in length.	94
289422	pfam12651	RHH_3	Ribbon-helix-helix domain. This short bacterial protein contains a ribbon-helix-helix domain that is likely to be DNA-binding.	44
403752	pfam12652	CotJB	CotJB protein. CotJ is a sigma E-controlled operon involved in the spore coat of Bacillus subtilis. This protein has been identified as a spore coat protein.	76
403753	pfam12653	DUF3785	Protein of unknown function (DUF3785). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 140 amino acids in length. These proteins share two CXXC motifs suggesting these are zinc binding proteins. This protein is found in clostridia in an operon with three signalling proteins, suggesting this protein may be a DNA-binding transcription regulator downstream of an as yet unknown signalling pathway (Bateman A pers obs).	136
403754	pfam12654	DUF3786	Domain of unknown function (DUF3786). This presumed domain is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 201 and 257 amino acids in length. Some proteins also contain an iron-sulfur cluster.	176
403755	pfam12655	DUF3787	Domain of unknown function (DUF3787). This family of proteins is functionally uncharacterized. This family of proteins is found in Clostridia. Proteins in this family are approximately 60 amino acids in length. There is a conserved TAAW sequence motif that may be functionally important.	52
372242	pfam12656	G-patch_2	G-patch domain. Yeast Spp2, a G-patch protein and spliceosome component, interacts with the ATP-dependent DExH-box splicing factor Prp2. As this interaction involves the G-patch sequence in Spp2 and is required for the recruitment of Prp2 to the spliceosome before the first catalytic step of splicing, it is proposed that Spp2 might be an accessory factor that confers spliceosome specificity on Prp2.	61
403756	pfam12657	TFIIIC_delta	Transcription factor IIIC subunit delta N-term. In humans there are six subunits of transcription factor IIIC, and this one is the 90 kDa subunit; whereas in fungi the complex resolves into nine different subunits and this is No. 9 in yeasts. The whole subunit is involved in RNA polymerase III-mediated transcription. It is possible that this N-terminal domain interacts with TFIIIC subunit 8.	174
403757	pfam12658	Ten1	Telomere capping, CST complex subunit. Stn1 and Ten1 are DNA-binding proteins with specificity for telomeric DNA substrates and both protect chromosome termini from unregulated resection and regulate telomere length. Stn1 complexes with Ten1 and Cdc13 to function as a telomere-specific replication protein A (RPA)-like complex. These three interacting proteins associate with the telomeric overhang in budding yeast, whereas a single protein known as Pot1 (protection of telomeres-1) performs this function in fission yeast, and a two-subunit complex consisting of POT1 and TPP1 associates with telomeric ssDNA in humans. S.pombe has Stn1- and Ten1-like proteins that are essential for chromosome end protection. Stn1 orthologues exist in all species that have Pot1, whereas Ten1-like proteins can be found in all fungi. Fission yeast Stn1 and Ten1 localize at telomeres in a manner that correlates with the length of the ssDNA overhang, suggesting that they specifically associate with the telomeric ssDNA. Two separate protein complexes are required for chromosome end protection in fission yeast. Protection of telomeres by multiple proteins with OB-fold domains is conserved in eukaryotic evolution. Ten1 is one of the three components of the CST complex, which, in conjunction with the Shelterin complex helps protect telomeres from attack by DNA-repair mechanisms.	115
403758	pfam12659	Stn1_C	Telomere capping C-terminal wHTH. This domain consists of tandem winged helix-turn-helix motifs. Stn1 and Ten1 are DNA-binding proteins with specificity for telomeric DNA substrates and both protect chromosome termini from unregulated resection and regulate telomere length. Stn1 complexes with Ten1 and Cdc13 to function as a telomere-specific replication protein A (RPA)-like complex. These three interacting proteins associate with the telomeric overhang in budding yeast, whereas a single protein known as Pot1 (protection of telomeres-1) performs this function in fission yeast, and a two-subunit complex consisting of POT1 and TPP1 associates with telomeric ssDNA in humans. S.pombe has Stn1- and Ten1-like proteins that are essential for chromosome end protection. Stn1 orthologues exist in all species that have Pot1, whereas Ten1-like proteins can be found in all fungi. Fission yeast Stn1 and Ten1 localize at telomeres in a manner that correlates with the length of the ssDNA overhang, suggesting that they specifically associate with the telomeric ssDNA. Two separate protein complexes are required for chromosome end protection in fission yeast. Protection of telomeres by multiple proteins with OB-fold domains is conserved in eukaryotic evolution.	119
372246	pfam12660	zf-TFIIIC	Putative zinc-finger of transcription factor IIIC complex. This zinc-finger domain is at the very C-terminus of a number of different TFIIIC subunit proteins. This domain might be involved in protein-DNA and/or protein-protein interactions.	87
403759	pfam12661	hEGF	Human growth factor-like EGF. hEGF, or human growth factor-like EGF, domains have six conserved residues disulfide-bonded into the characteristic 'ababcc' pattern. They are involved in growth and proliferation of cells, in proteins of the Notch/Delta pathway, neuregulin and selectins. hEGFs are also found in mosaic proteins with four-disulfide laminin EGFs such as aggrecan and perlecan. The core fold of the EGF domain consists of two small beta-hairpins packed against each other. Two major structural variants have been identified based on the structural context of the C-terminal Cys residue of disulfide 'c' in the C-terminal hairpin: hEGFs and cEGFs. In hEGFs the C-terminal thiol resides in the beta-turn, resulting in shorter loop-lengths between the Cys residues of disulfide 'c', typically C[8-9]XC. These shorter loop-lengths are also typical of the four-disulfide EGF domains, laminin and integrin. Tandem hEGF domains have six linking residues between terminal cysteines of adjacent domains. hEGF domains may or may not bind calcium in the linker region. hEGF domains with the consensus motif CXD4X[F,Y]XCXC are hydroxylated exclusively in the Asp residue.	22
403760	pfam12662	cEGF	Complement Clr-like EGF-like. cEGF, or complement Clr-like EGF, domains have six conserved cysteine residues disulfide-bonded into the characteristic pattern 'ababcc'. They are found in blood coagulation proteins such as fibrillin, Clr and Cls, thrombomodulin, and the LDL receptor. The core fold of the EGF domain consists of two small beta-hairpins packed against each other. Two major structural variants have been identified based on the structural context of the C-terminal cysteine residue of disulfide 'c' in the C-terminal hairpin: hEGFs and cEGFs. In cEGFs the C-terminal thiol resides on the C-terminal beta-sheet, resulting in long loop-lengths between the cysteine residues of disulfide 'c', typically C[10+]XC. These longer loop-lengths may have arisen by selective cysteine loss from a four-disulfide EGF template such as laminin or integrin. Tandem cEGF domains have five linking residues between terminal cysteines of adjacent domains. cEGF domains may or may not bind calcium in the linker region. cEGF domains with the consensus motif CXN4X[F,Y]XCXC are hydroxylated exclusively on the asparagine residue.	24
403761	pfam12663	DUF3788	Protein of unknown function (DUF3788). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 137 and 149 amino acids in length. This family may be distantly related to RelE proteins.	128
315357	pfam12664	DUF3789	Protein of unknown function (DUF3789). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 40 amino acids in length. There are two completely conserved residues (V and C) that may be functionally important.	32
403762	pfam12666	PrgI	PrgI family protein. This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 116 and 146 amino acids in length. This protein is found in an operon that is part of a Type IV secretion system.	91
403763	pfam12667	NigD_N	NigD-like N-terminal OB domain. This family of proteins is functionally uncharacterized. This family of proteins is found in Bacteroides species. Proteins in this family are typically between 234 and 260 amino acids in length. These proteins possess an N-terminal lipoprotein attachment site. The family includes NigD a protein found in the Nig operon that encodes a bacteriocin called nigrescin. It has been suggested that NigD may be the immunity protein for nigrescin (NigC) because it is directly downstream. This domain has an OB fold.	66
403764	pfam12668	DUF3791	Protein of unknown function (DUF3791). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 71 and 125 amino acids in length.	60
403765	pfam12669	P12	Virus attachment protein p12 family. This family of proteins are related to Virus attachment protein p12 from the African swine fever virus. The family appears to contain an N-terminal signal peptide followed by a short cysteine rich region. The cysteine rich region is extremely variable and it is possible that only the N-terminal region is homologous.	45
403766	pfam12670	DUF3792	Protein of unknown function (DUF3792). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 130 amino acids in length. These proteins are integral membrane proteins.	110
403767	pfam12671	Amidase_6	Putative amidase domain. 	161
403768	pfam12672	DUF3793	Protein of unknown function (DUF3793). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 187 and 211 amino acids in length. There are two conserved sequence motifs: PHE and LGYP.	171
403769	pfam12673	DUF3794	Domain of unknown function (DUF3794). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 90 amino acids in length. The family is found in association with pfam01476.	80
403770	pfam12674	Zn_ribbon_2	Putative zinc ribbon domain. This domain appears to be a zinc binding DNA-binding domain.	76
403771	pfam12675	DUF3795	Protein of unknown function (DUF3795). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 99 and 171 amino acids in length. This protein is likely to be zinc binding given the conserved cysteines.	81
403772	pfam12676	DUF3796	Protein of unknown function (DUF3796). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 120 amino acids in length.	112
289447	pfam12677	DUF3797	Domain of unknown function (DUF3797). This presumed domain is functionally uncharacterized. This domain family is found in bacteria and viruses, and is approximately 50 amino acids in length. There is a conserved CGN sequence motif.	48
403773	pfam12678	zf-rbx1	RING-H2 zinc finger domain. There are 8 cysteine/histidine residues which are proposed to be the conserved residues involved in zinc binding. The protein, of which this domain is the conserved region, participates in diverse functions relevant to chromosome metabolism and cell cycle control.	55
403774	pfam12679	ABC2_membrane_2	ABC-2 family transporter protein. This family is related to the ABC-2 membrane transporter family.	281
403775	pfam12680	SnoaL_2	SnoaL-like domain. This family contains a large number of proteins that share the SnoaL fold.	101
403776	pfam12681	Glyoxalase_2	Glyoxalase-like domain. This domain is related to the Glyoxalase domain pfam00903.	118
403777	pfam12682	Flavodoxin_4	Flavodoxin. This is a family of flavodoxins. Flavodoxins are electron transfer proteins that carry a molecule of non-covalently bound FMN.	155
403778	pfam12683	DUF3798	Protein of unknown function (DUF3798). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 247 and 417 amino acids in length. Most of the proteins in this family have an N-terminal lipoprotein attachment site. These proteins have distant similarity to periplasmic ligand binding families such as pfam02608, which suggests that this family has a similar role.	271
403779	pfam12684	DUF3799	PDDEXK-like domain of unknown function (DUF3799). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and viruses. Proteins in this family are typically between 265 and 420 amino acids in length. It appears that these proteins are distantly related to the PDDEXK superfamily and so these domains are likely to be nucleases. This family has a C-terminal cysteine cluster similar to that found in pfam01930.	228
403780	pfam12685	SpoIIIAH	SpoIIIAH-like protein. Stage III sporulation protein AH (SpoIIIAH) is a protein that is involved in forespore engulfment. It forms a channel with SpoIIIAH that is open on the forespore end and closed (or gated) on the mother cell end. This allows sigma-E-directed gene expression in the mother-cell compartment of the sporangium to trigger the activation of sigma-G forespore-specific gene expression by a pathway of intercellular signaling. This family of proteins is found in bacteria, archaea and eukaryotes and so must have a wider function than in sporulation. Proteins in this family are typically between 174 and 223 amino acids in length.	195
403781	pfam12686	DUF3800	Protein of unknown function (DUF3800). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea, eukaryotes and viruses. Proteins in this family are typically between 215 and 302 amino acids in length. There is a DE motif at the N-terminus and a QXXD motif at the C-terminus that may be functionally important.	112
403782	pfam12687	DUF3801	Protein of unknown function (DUF3801). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 158 and 187 amino acids in length. This family includes the PcfB protein.	131
403783	pfam12688	TPR_5	Tetratrico peptide repeat. BH0479 of Bacillus halodurans is a hypothetical protein which contains a tetratrico peptide repeat (TPR) structural motif. The TPR motif is often involved in mediating protein-protein interactions. This protein is likely to function as a dimer. The first 48 amino acids are not present in the clone construct. This Pfam entry includes tetratricopeptide-like repeats not detected by the pfam00515, pfam07719, pfam07720 and pfam07221 models.	119
372256	pfam12689	Acid_PPase	Acid Phosphatase. This family contains phosphatase enzymes and other proteins of the HAD superfamily. It includes MDP-1 which is a eukaryotic magnesium-dependent acid phosphatase.	169
403784	pfam12690	BsuPI	Intracellular proteinase inhibitor. This is a bacterial domain which has been named BsuPI in Bacillus subtilis. This domain is found in Bacillus subtilis ipi, where it has been suggested to regulate the major intracellular proteinase (ISP-1) activity in vivo. The structure of proteins in this family adopts a beta barrel topology.	100
403785	pfam12691	Minor_capsid_3	Minor capsid protein from bacteriophage. This family is from one of three adjacent genes, all of which are involved in formation of the minor phage capsid.	117
403786	pfam12692	Methyltransf_17	S-adenosyl-L-methionine methyltransferase. This domain is found in bacterial proteins. The structure of the proteins in this family suggests that they function as a methyltransferase.	160
289463	pfam12693	GspL_C	GspL periplasmic domain. This domain is the periplasmic domain of the GspL/EpsL family proteins. These proteins are involved in type II secretion systems.	158
403787	pfam12694	MoCo_carrier	Putative molybdenum carrier. The structure of proteins in this family contains central beta strands with flanking alpha helices. The structure is similar to that of a molybdenum cofactor carrier protein.	145
315383	pfam12695	Abhydrolase_5	Alpha/beta hydrolase family. This family contains a diverse range of alpha/beta hydrolase enzymes.	164
403788	pfam12696	TraG-D_C	TraM recognition site of TraD and TraG. This family includes both TraG and TraD as well as VirD4 proteins. TraG is essential for DNA transfer in bacterial conjugation. These proteins are thought to mediate interactions between the DNA-processing (Dtr) and the mating pair formation (Mpf) systems. This domain interacts with the relaxosome component TraM via the latter's tetramerisation domain. TraD is a hexameric ring ATPase that forms the cytoplasmic face of the conjugative pore.	125
403789	pfam12697	Abhydrolase_6	Alpha/beta hydrolase family. This family contains alpha/beta hydrolase enzymes of diverse specificity.	212
403790	pfam12698	ABC2_membrane_3	ABC-2 family transporter protein. This family is related to the ABC-2 membrane transporter family pfam01061.	345
403791	pfam12699	phiKZ_IP	phiKZ-like phage internal head proteins. Phage internal head proteins (IP) are proteins that are encoded by a bacteriophage and assembled into the mature virion inside the capsid head. The most analogous characterized IP proteins are those of bacteriophage T4, which are known to be proteolytically processed during phage maturation, and then subsequently injected into the host cell during infection. The phiKZ_IP family consists of internal head proteins encoded by phiKZ-like phages. Each phage encodes three to six members of this family. Members of the family reside in the head and are cleaved during phage maturation to separate an N-terminal propeptide from a C-terminal domain. The C-terminal domain remains in the mature capsid. The N-terminal propeptide domain is either mostly or completely removed from the mature capsid. In one case, an unrelated polypeptide is embedded in the propeptide and also remains in the mature capsid. The phiKZ-like IP proteins are not discernibly homologous to the T4 IP proteins, and it is not known if the phiKZ-like IP proteins are injected into the host cell, or have some other function within the head. The alignment and HMM model exclude most of the propeptide region, but include the cleavage sites. The first 100 residues, including the cleavage sites, constitute the most conservative part of the seed alignment.	323
403792	pfam12700	HlyD_2	HlyD family secretion protein. This family is related to pfam00529.	413
403793	pfam12701	LSM14	Scd6-like Sm domain. The Scd6-like Sm domain is found in Scd6p from S. cerevisiae, Rap55 from the newt Pleurodeles walt, and its orthologs from fungi, animals, plants and apicomplexans. The domain is also found in Dcp3p and the human EDC3/FLJ21128 protein where it is fused to the Rossmanoid YjeF-N domain. In addition both EDC3 and Scd6p are found fused to the FDF domain.	75
403794	pfam12702	Lipocalin_3	Lipocalin-like. This is a family of proteins of 115 residues on average. The family has two highly conserved tryptophan residues. The fold is very similar to the lipocalin-like fold from several comparable structures.	92
403795	pfam12703	ptaRNA1_toxin	Toxin of toxin-antitoxin type 1 system. This family is the toxin of a type 1 toxin-antitoxin system which is found in a relatively widespread range of bacterial species. The species distribution suggests frequent horizontal gene transfer. In a type 1 system, as characterized for the plasmid-encoded E. coli hok/sok system, the toxin-encoding stable mRNA encodes a protein which rapidly leads to cell death unless the translation is suppressed by a short-lived small RNA. The plasmid-encoded module prevents the growth of plasmid-free offspring, thus ensuring the persistence of the plasmid in the population. Plasmid-free cells arising after cell-division will be killed because the stable mRNA toxin is present while the comparably unstable anti-toxin is rapidly degraded. Where the system is transcribed chromosomally, the mechanism is poorly understood.	73
403796	pfam12704	MacB_PCD	MacB-like periplasmic core domain. This family represents the periplasmic core domain found in a variety of ABC transporters. The structure of this family has been solved for the MacB protein. Some structural similarity was found to the periplasmic domain of the AcrB multidrug efflux transporter.	209
403797	pfam12705	PDDEXK_1	PD-(D/E)XK nuclease superfamily. Members of this family belong to the PD-(D/E)XK nuclease superfamily.	249
403798	pfam12706	Lactamase_B_2	Beta-lactamase superfamily domain. This family is part of the beta-lactamase superfamily and is related to pfam00753.	196
403799	pfam12707	DUF3804	Protein of unknown function (DUF3804). This family is approximately 130 residues. Dali search indicates this protein carries a NTF2-fold with a hydrophobic cavity as a structural homolog to 1JB2, 2R4I, 3FSD and 2UX0. In this hydrophobic cavity, Arg 118 provides the H-bonding force to hold a PEG molecule from crystallisation. The interface interaction suggests that the biomolecule of PMN2A_0505 is a dimer. Two members of the family are annotated as putative EF-Tu domain 2 but there is no match to this family so this is likely to be a false assignment. There are two highly conserved tryptophan residues towards the C-terminal end of the family.	128
403800	pfam12708	Pectate_lyase_3	Pectate lyase superfamily protein. This family of proteins possesses a beta helical structure like Pectate lyase. This family is most closely related to glycosyl hydrolase family 28.	213
403801	pfam12709	Kinetocho_Slk19	Central kinetochore-associated. This is a family of proteins integrally involved in the central kinetochore. Slk19 is a yeast member and it may play an important role in the timing of nuclear migration. It may also participate, directly or indirectly, in the maintenance of centromeric tensile strength during mitotic stagnation, for instance during activation of checkpoint controls, when cells need to preserve nuclear integrity until cell cycle progression can be resumed.	77
403802	pfam12710	HAD	haloacid dehalogenase-like hydrolase. 	187
403803	pfam12712	DUF3805	Domain of unknown function (DUF3805). This family represents the N-terminal domain of the structure. In two related Bacteroides species the gene lies immediately upstream from a putative ATP binding component of an ATP transporter and a putative histidinol phosphatase. The structure of this domain is strikingly similar to the N-terminal structure of 1tui, also of unknown function. The domain carries four conserved tryptophan residues.	152
403804	pfam12713	DUF3806	Domain of unknown function (DUF3806). This family represents the C-terminal domain of the structure. In two related Bacteroides species the gene lies immediately upstream from a putative ATP binding component of an ATP transporter and a putative histidinol phosphatase. The structure of this domain is strikingly similar to the N-terminal structure of 1ma7 whose C-terminal domain is a phage integrase, pfam00589.	86
315397	pfam12714	TILa	TILa domain. This cysteine rich domain occurs alongside the TIL pfam01826 domain and is likely to be a distantly related relative.	54
403805	pfam12715	Abhydrolase_7	Abhydrolase family. This is a family of probable bacterial abhydrolases.	387
403806	pfam12716	Apq12	Nuclear pore assembly and biogenesis. This is a family of conserved fungal proteins involved in nuclear pore assembly. Apq12 is an integral membrane protein of the nuclear envelope (NE) and endoplasmic reticulum. Its absence leads to a partial block in mRNA export and cold-sensitive defects in the growth and localization of a subset of nucleoporins, particularly those asymmetrically localized to the cytoplasmic fibrils. The defects in nuclear pore assembly appear to be due to defects in regulating membrane fluidity.	53
403807	pfam12717	Cnd1	non-SMC mitotic condensation complex subunit 1. The three non-SMC (structural maintenance of chromosomes) subunits of the mitotic condensation complex are Cnd1-3. The whole complex is essential for viability and the condensing of chromosomes in mitosis.	162
403808	pfam12718	Tropomyosin_1	Tropomyosin like. This family is a set of eukaryotic tropomyosins. Within the yeast Tpm1 and Tpm2, biochemical and sequence analyses indicate that Tpm2p spans four actin monomers along a filament, whereas Tpm1p spans five. Despite its shorter length, Tpm2p can compete with Tpm1p for binding to F-actin. Over-expression of Tpm2p in vivo alters the axial budding of haploids to a bipolar pattern, and this can be partially suppressed by co-over-expression of Tpm1p. This suggests distinct functions for the two tropomyosins, and indicates that the ratio between them is important for correct morphogenesis. The family also contains higher eukaryote Tpm3 members.	142
403809	pfam12719	Cnd3	Nuclear condensing complex subunits, C-term domain. The Cnd1-3 proteins are the three non-SMC (structural maintenance of chromosomes) proteins that go to make up the mitotic condensation complex along with the two SMC protein families, XCAP-C and XCAP-E, (or in the case of fission yeast, Cut3 and Cut14). The five-member complex seems to be conserved from yeasts to vertebrates. This domain is the C-terminal, cysteine-rich domain of Cnd3. The complex shuttles between the nucleus, during mitosis, and the cytoplasm during the rest of the cycle. Thus this family is made up of the C-termini of XCAP-Gs, Ycg1 and Ycs5 members.	286
403810	pfam12720	DUF3807	Protein of unknown function (DUF3807). This is a family of conserved fungal proteins of unknown function.	178
403811	pfam12721	RHIM	RIP homotypic interaction motif. RIP proteins are receptor-interacting serine/threonine-protein kinases or cell death proteins. This interacting domain is involved in virus recognition. The RHIM domain is necessary for the recruitment of RIP and RIP3 by the IFN-inducible protein DNA-dependent activator of IRFs (DAI), also known as DLM-1 or Z-DNA binding protein (ZBP1). Both the RIP kinases contribute to DAI-induced NF-kappaB activation. RIP3 undergoes auto phosphorylation on binding to DAI.	52
403812	pfam12722	Hid1	High-temperature-induced dauer-formation protein. Hid1 (high-temperature-induced dauer-formation protein 1) represents proteins of approximately 800 residues long and is conserved from fungi to humans. It contains up to seven potential transmembrane domains separated by regions of low complexity. Functionally it might be involved in vesicle secretion or be an inter-cellular signalling protein or be a novel insulin receptor.	804
372272	pfam12723	DUF3809	Protein of unknown function (DUF3809). This family of proteins is functionally uncharacterized. This family of proteins is found in Deinococci bacteria. Proteins in this family are typically between 117 and 157 amino acids in length.	136
403813	pfam12724	Flavodoxin_5	Flavodoxin domain. This is a family of flavodoxins. Flavodoxins are electron transfer proteins that carry a molecule of non-covalently bound FMN.	144
403814	pfam12725	DUF3810	Protein of unknown function (DUF3810). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 333 and 377 amino acids in length. There is a conserved HEXXH sequence motif that is characteristic of metallopeptidases. This family may therefore belong to an as yet uncharacterized family of peptidase enzymes.	318
403815	pfam12726	SEN1_N	SEN1 N terminal. This domain is found at the N terminal of the helicase SEN1. SEN1 is a Pol II termination factor for noncoding RNA genes. The N terminal of SEN1, unlike the C terminal, is not required for growth.	744
403816	pfam12727	PBP_like	PBP superfamily domain. This family belongs to the periplasmic binding domain superfamily. It is often associated with a helix-turn-helix domain.	193
403817	pfam12728	HTH_17	Helix-turn-helix domain. This domain is a DNA-binding helix-turn-helix domain.	51
403818	pfam12729	4HB_MCP_1	Four helix bundle sensory module for signal transduction. This family is a four helix bundle that operates as a ubiquitous sensory module in prokaryotic signal-transduction. The 4HB_MCP is always found between two predicted transmembrane helices indicating that it detects only extracellular signals. In many cases the domain is associated with a cytoplasmic HAMP domain suggesting that most proteins carrying the bundle might share the mechanism of transmembrane signalling which is well-characterized in E. coli chemoreceptors.	181
403819	pfam12730	ABC2_membrane_4	ABC-2 family transporter protein. This family is related to the ABC-2 membrane transporter family pfam01061.	179
372275	pfam12731	Mating_N	Mating-type protein beta 1. This domain is found in some fungi and is the C-terminus of a homeodomain-containing transcription factor protein involved in mating.	95
403820	pfam12732	YtxH	YtxH-like protein. This family of proteins is found in bacteria. Proteins in this family are typically between 100 and 143 amino acids in length. The N-terminal region is the most conserved. Proteins in this family are functionally uncharacterized.	71
403821	pfam12733	Cadherin-like	Cadherin-like beta sandwich domain. This domain is found in several bacterial, metazoan and chlorophyte algal proteins. A profile-profile comparison recovered the cadherin domain and a comparison of the predicted structure of this domain with the crystal structure of the cadherin showed a congruent seven stranded secondary structure. The domain is widespread in bacteria and seen in the firmicutes, actinobacteria, certain proteobacteria, bacteroides and chlamydiae with an expansion in Clostridium. In contrast, it is limited in its distribution in eukaryotes suggesting that it was derived through lateral transfer from bacteria. In prokaryotes, this domain is widely fused to other domains such as FNIII (Fibronectin Type III), TIG, SLH (S-layer homology), discoidin, cell-wall-binding repeat domain and alpha-amylase-like glycohydrolases. These associations are suggestive of a carbohydrate-binding function for this cadherin-like domain. In animal proteins it is associated with an ATP-grasp domain.	89
403822	pfam12734	CYSTM	Cysteine-rich TM module stress tolerance. The members of this family are short cysteine-rich membrane proteins that most probably dimerize together to form a transmembrane sulfhydryl-lined pore. The CYSTM module is always present at the extreme C-terminus of the protein in which it is present. Furthermore, like the yeast prototypes, the majority of the proteins also possess a proline/glutamine-rich segment upstream of the CYSTM module that is likely to form a polar, disordered head in the cytoplasm. The presence of an atypical well-conserved acidic residue at the C-terminal end of the TM helix suggests that this might interact with a positively charged moiety in the lipid head group. Consistently across the eukaryotes, the different versions of the CYSTM module appear to have roles in stress-response or stress-tolerance, and, more specifically, in resistance to deleterious substances, implying that these might be general functions of the whole family.	37
403823	pfam12735	Trs65	TRAPP trafficking subunit Trs65. This family is one of the subunits of the TRAPP Golgi trafficking complex. TRAPP subunits are found in two different sized complexes, TRAPP I and TRAPP II. While both complexes contain the same seven subunits, Bet3p, Bet5p, Trs20p, Trs23p, Trs31p, Trs33p and Trs85p, with TRAPPC human equivalents, TRAPP II has the additional three subunits, Trs65p, Trs120p and Trs130p. While it has been implicated in cell wall biogenesis and stress response, the role of Trs65 in TRAPP II is supported by the findings that the protein co-localizes with Trs130p, and deletion of TRS65 in yeast leads to a conditional lethal phenotype if either one of the other TRAPP II-specific subunits is modified. Furthermore, the trs65 mutant has reduced Ypt31/32p guanine nucleotide exchange, GEF, activity.	309
403824	pfam12736	CABIT	Cell-cycle sustaining, positive selection,. The 'CABIT' domain (for 'cysteine-containing, all- in Themis') is found in a newly identified gene family that has three mammalian homologs (Themis, Icb1 and 9130404H23Rik) that encode proteins with two CABIT domains and a highly conserved proline-rich region. In contrast, Fam59A, Fam59B and related proteins from mammals to cnidarians, including the insect Serrano proteins, have a single copy of the CABIT domain, a proline-rich region and often a C-terminal SAM (sterile-motif) domain. Multiple-sequence alignment has predicted that the CABIT domain adopts an all-strand structure with at least 12 strands, ie a dyad of six-stranded beta-barrel units. The CABIT domain contains a nearly absolutely conserved cysteine residue which is likely to be central to its function. CABIT domain proteins function downstream of tyrosine kinase signalling and interact with GRB2.	261
372279	pfam12737	Mating_C	C-terminal domain of homeodomain 1. Mating in fungi is controlled by the loci that determine the mating type of an individual, and only individuals with differing mating types can mate. Basidiomycete fungi have evolved a unique mating system, termed tetrapolar or bifactorial incompatibility, in which mating type is determined by two unlinked loci; compatibility at both loci is required for mating to occur. The multi-allelic tetrapolar mating system is considered to be a novel innovation that could have only evolved once, and is thus unique to the mushroom fungi. This domain is C-terminal to the homeodomain transcription factor region.	412
403825	pfam12738	PTCB-BRCT	twin BRCT domain. This is a BRCT domain that appears in duplicate in most member sequences. BRCT domains are peptide- and phosphopeptide-binding modules. BRCT domains are present in a number of proteins involved in DNA checkpoint controls and DNA repair.	63
403826	pfam12739	TRAPPC-Trs85	ER-Golgi trafficking TRAPP I complex 85 kDa subunit. This family is one of the subunits of the TRAPP Golgi trafficking complex. TRAPP subunits are found in two different sized complexes, TRAPP I and TRAPP II, and this Trs85 is in the smaller complex. TRAPP I, but not TRAPP II, functions in ER-Golgi transport. Trs85p was reported to function in the cytosol-to-vacuole targeting pathway, suggesting a role for this subunit in autophagy as well as in secretion. The overall architecture of TRAPP I shows the other components to be Bet3p (TRAPPC3), Bet5p (TRAPPC1), Trs20p (TRAPPC2), Trs23p (TRAPPC4), Trs31p (TRAPPC5), Trs33p (TRAPPC6a and b) and Trs85p.	392
403827	pfam12740	Chlorophyllase2	Chlorophyllase enzyme. This family consists of several chlorophyllase and chlorophyllase-2 (EC:3.1.1.14) enzymes. Chlorophyllase (Chlase) is the first enzyme involved in chlorophyll (Chl) degradation and catalyzes the hydrolysis of an ester bond to yield chlorophyllide and phytol. The family includes both plant and Amphioxus members.	254
403828	pfam12741	SusD-like	SusD and RagB outer membrane lipoprotein. This is a family of SusD-like proteins, one member of which, BT1043, is an outer membrane lipoprotein involved in host glycan metabolism. The structures of this and SusD-homologs in the family are dominated by tetratrico peptide repeats that may facilitate association with outer membrane beta-barrel transporters required for glycan uptake. The structure of BT1043 complexed with N-acetyllactosamine reveals that recognition is mediated via hydrogen bonding interactions with the reducing end of beta-N-acetylglucosamine, suggesting a role in binding glycans liberated from the mucin polypeptide. Mammalian distal gut bacteria have an expanded capacity to utilize glycans. In the absence of dietary sources, some species rely on host-derived mucosal glycans. The ability of Bacteroides thetaiotaomicron, a prominent human gut symbiont, to forage host glycans contributes to both its ability to persist within an individual host and its ability to be transmitted naturally to new hosts at birth.	495
403829	pfam12742	Gryzun-like	Gryzun, putative Golgi trafficking. Members of this family are involved in Golgi trafficking.	56
403830	pfam12743	ESR1_C	Oestrogen-type nuclear receptor final C-terminal. This is the very C-terminal region of a subfamily of nuclear receptors that includes oestrogen receptors and other subfamily 3 group A members. The actual function of this region is not known, but the domain is absent from all the other types of nuclear receptors. Oestrogen receptors modulate AP-1-dependent transcription through two distinct mechanisms: via protein-protein interactions on DNA; and via non-genomic actions. The mechanism used depends on the cellular localization of the receptor. In addition to the more extensively studied cross-talk on DNA, additional non-genomic actions might be very important in target tissues in which membrane-associated ERs are found. These non-genomic actions probably contribute to the overall physiological responses mediated by ligand-bound ERs and might possibly be mediated via this C-terminal domain.	40
403831	pfam12744	ATG19_autophagy	Autophagy protein Atg19, Atg8-binding. Autophagy is generally known as a process involved in the degradation of bulk cytoplasmic components that are non-specifically sequestered into an autophagosome, where they are sequestered into double-membrane vesicles and delivered to the degradative organelle, the lysosome/vacuole, for breakdown and eventual recycling of the resulting macromolecules. In contrast to autophagy, however, the Cvt pathway is a highly selective process that involves the sequestration of at least two specific cargos that are resident vacuolar hydrolases, aminopeptidase I (Ape1) and alpha-mannosidase (Ams1). These proteins are sequestered within a double-membrane vesicle, termed a Cvt vesicle. The Cvt vesicle is fairly consistent in size, and is much smaller than the autophagosome, being 140-160 nm in diameter. The prApe1 is sequestered within either Cvt vesicles or autophagosomes, depending on the nutrient conditions, and delivered to the vacuole. Autophagy and the Cvt pathway are topologically and mechanistically similar and share most of the same machinery. The Ape1 complex is ultimately enwrapped within either Cvt vesicles or autophagosomes at the perivacuolar PAS. The receptor protein Atg19 binds to the Ape1 complex through the prApe1 propeptide to form the Cvt complex in the cytosol. In the absence of Atg19, prApe1 can form an Ape1 complex, but does not localize at the PAS. Atg19 is a peripheral membrane protein with differing binding sites for both Ape1 and Ams1. In the yeast proteins, the Atg8-binding region comprises the very C-terminal residues.	246
403832	pfam12745	HGTP_anticodon2	Anticodon binding domain of tRNAs. This is an HGTP_anticodon binding domain, found largely on Gcn2 proteins which bind tRNA to down regulate translation in certain stress situations.	261
403833	pfam12746	GNAT_acetyltran	GNAT acetyltransferase. Many of the members are annotated as being Zwittermicin A resistance proteins, whereas others are listed as being GNAT acetyltransferases. The family has similarities to the GNAT acetyltransferase family.	239
372287	pfam12747	DdrB	DdrB-like protein. This family includes the Deinococcus DdrB protein which is a ssDNA binding protein. This family also includes some possibly distantly related cyanobacterial proteins. However, these are not strongly supported. The structure of DdrB is known.	126
315429	pfam12749	Metallothio_Euk	Eukaryotic metallothionein. This is a family of eukaryotic metallothioneins.	66
403834	pfam12750	Maff2	Maff2 family. This family of short membrane proteins is related to the protein Maff2. Maff2 lies just outside the direct repeats of a tetracycline resistance transposable element. This protein may contain transmembrane helices.	70
403835	pfam12751	Vac7	Vacuolar segregation subunit 7. Vac7 is localized at the vacuole membrane, a location which is consistent with its involvement in vacuole morphology and inheritance. Vac7 has been shown to function as an upstream regulator of the Fab1 lipid kinase pathway. The Fab1 lipid pathway is important for correct regulation of membrane trafficking events.	382
403836	pfam12752	SUZ	SUZ domain. The SUZ domain is a conserved RNA-binding domain found in eukaryotes and enriched in positively charged amino acids. It was first characterized in the C. elegans protein Szy-20 where it has been shown to bind RNA and allow their localization to the centrosome. Warning: the domain has a compositionally biased character.	56
403837	pfam12753	Nro1	Nuclear pore complex subunit Nro1. In fission yeast, this protein is a positive regulator of the stability of Sre1N, the sterol regulatory element-binding protein which is an ER membrane-bound transcription factor that controls adaptation to low oxygen-growth. In addition, the fission yeast Nro1 is a direct inhibitor of a protein that inhibits Sre1N degradation, Ofd1 (an oxoglutamate deoxygenase). The outcome of this reactivity is that Ofd1 acts as an oxygen sensor that regulates the binding of Nro1 to Ofd1 to control the stability of Sre1N. Solution of the structure of Nro1 reveals it to be made up of a number of TPR coils. TPR proteins are composed of three to 16 tandem peptide repeat motifs of 34 amino acids with degenerate sequence. The helical pairs adopt a helix-turn-helix anti-parallel arrangement with interacting helices. In general, TPR motifs are stacked together so that helix A from TPRn is packed between helix B from TPRn and helix A from TPRn+1. In Nro1, the 12 alpha helices forming the six TPR motifs are organized as follows from N-terminus to C-terminus - TPR1A, TPR1B, TPR2A, TPR2B, TPR3A, TPR3B, TPR4A, TPR4B, TPR5A, TPR5B, TPR6A, and TPR6B with the C-terminal helix (hC) running above the sixth TPR motif with an angle of approx 45 degrees with TPR6A and TPR6B. The corresponding TPR structural motifs are longer (50 residues) than are canonical ones (34 amino acids) and are organized into two subdomains - Nro1-N (residues 55-225) and Nro1-C (residues 226-393). The Nro1/Etti protein plays a role in nuclear import suggesting that it is residues 4-19 that are interacting with Ofd1.	414
372291	pfam12754	Blt1	Blt1 N-terminal domain. During size-dependent cell cycle transitions controlled by the ubiquitous cyclin-dependent kinase Cdk1, Blt1 has been shown to co-localize with Cdr2 in the medial interphase nodes, as well as with Mid1 which was previously shown to localize to similar interphase structures. Physical interactions between Blt1-Mid1, Blt1-Cdr2 and Cdr2-Mid1 were detected, indicating that medial cortical nodes are formed by the ordered, Cdr2-dependent assembly of multiple interacting proteins during interphase. This domain shows similarity to ubiquitin family proteins.	150
403838	pfam12755	Vac14_Fab1_bd	Vacuolar 14 Fab1-binding region. Vac14 is a scaffold for the Fab1 kinase complex, a complex that allows for the dynamic interconversion of PI3P and PI(3,5)P2p (phosphoinositide phosphate (PIP) lipids, that are generated transiently on the cytoplasmic face of selected intracellular membranes). This interconversion is regulated by at least five proteins in yeast: the lipid kinase Fab1p, lipid phosphatase Fig4p, the Fab1p activator Vac7p, the Fab1p inhibitor Atg18p, and Vac14p, a protein required for the activity of both Fab1p and Fig4p. This domain appears to be the one responsible for binding to Fab1. The full length Vac14 in yeasts is likely to be a protein carrying a succession of HEAT repeats, most of which have now degenerated. This regulatory system is crucial for the proper functioning of the mammalian nervous system.	97
403839	pfam12756	zf-C2H2_2	C2H2 type zinc-finger (2 copies). This family contains two copies of a C2H2-like zinc finger domain.	98
403840	pfam12757	Eisosome1	Eisosome protein 1. Eisosome protein 1 is required for normal formation of eisosomes, large cytoplasmic protein assemblies that localize to specialized domains on plasma membrane and mark the site of endocytosis.	125
372295	pfam12758	DUF3813	Protein of unknown function (DUF3813). This is an uncharacterized family of Bacillus proteins.	60
289525	pfam12759	HTH_Tnp_IS1	InsA C-terminal domain. This short domain is found at the C-terminus of the InsA protein. This domain contains a helix-turn-helix domain.	46
403841	pfam12760	Zn_Tnp_IS1595	Transposase zinc-ribbon domain. This zinc binding domain is found in a range of transposase proteins such as ISSPO8, ISSOD11, ISRSSP2 etc. It is likely a zinc-binding beta ribbon domain that could bind the DNA.	46
403842	pfam12761	End3	Actin cytoskeleton-regulatory complex protein END3. Endocytosis is accomplished through the sequential recruitment at endocytic sites of proteins that drive cargo sorting, membrane invagination and vesicle release. End3p is part of the coat module protein complex Pan1, along with Pan1p, Sla1p, and Sla2p. The proteins in this complex are regulated by phosphorylation events. End3p also regulates the cortical actin cytoskeleton. The subunits of the Pan1 complex are homologous to mammalian intersectin.	200
403843	pfam12762	DDE_Tnp_IS1595	ISXO2-like transposase domain. This domain probably functions as an integrase that is found in a wide variety of transposases, including ISXO2.	153
289529	pfam12763	EF-hand_4	Cytoskeletal-regulatory complex EF hand. This is an EF-hand family from the N-terminal of actin cytoskeleton-regulatory complex END3 and similar proteins from fungi and closely related species.	104
403844	pfam12764	Gly-rich_Ago1	Glycine-rich region of argonaut. This domain is often found at the very N-terminal of argonaut-like proteins.	103
403845	pfam12765	Cohesin_HEAT	HEAT repeat associated with sister chromatid cohesion. This HEAT repeat is found most frequently in sister chromatid cohesion proteins such as Nipped-B. HEAT repeats are found tandemly repeated in many proteins, and they appear to serve as flexible scaffolding on which other components can assemble.	42
403846	pfam12766	Pyridox_oxase_2	Pyridoxamine 5'-phosphate oxidase. Pyridoxamine 5'-phosphate oxidase catalyzes the oxidation of pyridoxamine-5-P (PMP) and pyridoxine-5-P (PNP) to pyridoxal-5-P (PLP), the terminal step in the de novo biosynthesis of PLP in Escherichia coli and part of the salvage pathway of this coenzyme in both E. coli and mammalian cells. This region is the flavoprotein FMN-binding domain.	99
403847	pfam12767	SAGA-Tad1	Transcriptional regulator of RNA polII, SAGA, subunit. The yeast SAGA complex is a multifunctional coactivator that regulates transcription by RNA polymerase II. It is formed of five major modular subunits and shows a high degree of structural conservation to human TFTC and STAGA. The complex can also be conceived of as consisting of two histone-fold-containing core subunits, and this family is one of these. As a family it is likely to carry binding regions for interactions with a number of the other components of the complex.	135
403848	pfam12768	Rax2	Cortical protein marker for cell polarity. Diploid yeast cells repeatedly polarise and bud from their poles, due probably to the presence of highly stable membrane markers, and Rax2 is one such marker. It is inherited immutably at the cell cortex for multiple generations, and has a half-life exceeding several generations. The persistent inheritance of cortical protein markers would provide a means of coupling a cell's history with the future development of a precise morphogenetic form. Both Rax1 and Rax2 localize to the distal pole as well as to the division site and they interact both with each other and with Bud8p and Bud9p in the establishment and/or maintenance of the cortical markers for bipolar budding. Thus Rax2 is likely to control cell polarity during vegetative growth, and in fission yeast this is done by regulating the localization of for3p.	211
403849	pfam12769	PNTB_4TM	4TM region of pyridine nucleotide transhydrogenase, mitoch. PNTB_4TM is the region upstream of family PNTB, pfam02233, that carries four of this transporter's transmembrane regions. PNTB is the beta-subunit of pyridine nucleotide transhydrogenase. This family forms part of the Proton-translocating Transhydrogenase (PTH) Family.	84
378942	pfam12770	CHAT	CHAT domain. These proteins appear to be related to peptidases in peptidase clan CD that includes the caspases. This domain has been termed the CHAT domain for Caspase HetF Associated with Tprs. This family has been identified as a sister group to the separins.	289
403850	pfam12771	SusD-like_2	Starch-binding associating with outer membrane. SusD is a secreted starch-binding protein with an N-terminal lipid tail that allows it to associate with the outer membrane.	413
403851	pfam12772	GHBP	Growth hormone receptor binding. Growth hormone receptor binding protein is produced either by proteolysis of the GHR (growth hormone receptor) at the cell surface thereby releasing its extracellular domain, the GHBP (growth hormone-binding protein), or, in rodents, by alternative processing of the GHR transcript. The sheddase proteolytic enzyme responsible for the cleavage is TACE (tumor necrosis factor-alpha-converting enzyme). Growth hormone (GH) binding to GH receptor (GHR) is the initial step that leads to the physiological functions of the hormone. The biological effects of GHBP are determined by the serum levels of growth hormone (GH), which can vary. Low levels of GH can result in a dwarf phenotype and have been positively correlated with an increased life expectancy. High levels of GH can lead to gigantism or a clinical syndrome termed acromegaly and have been implicated in diabetic eye and kidney damage.	303
403852	pfam12773	DZR	Double zinc ribbon. This family consists of a pair of zinc ribbon domains.	45
403853	pfam12774	AAA_6	Hydrolytic ATP binding site of dynein motor region D1. The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D1 unit of the motor and contains the hydrolytic ATP binding site.	327
403854	pfam12775	AAA_7	P-loop containing dynein motor region D3. The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D3 and is an ATP binding site.	181
403855	pfam12776	Myb_DNA-bind_3	Myb/SANT-like DNA-binding domain. This presumed domain appears to be related to other Myb/SANT like DNA binding domains. In particular pfam10545 seems most related. This family is greatly expanded in plants and appears in several proteins annotated as transposon proteins.	96
289543	pfam12777	MT	Microtubule-binding stalk of dynein motor. The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This family is the region between D4 and D5 and is the two predicted alpha-helical coiled coil segments that form the stalk supporting the ATP-sensitive microtubule binding component.	344
403856	pfam12778	PXPV	PXPV repeat (3 copies). This short repeat is found in multiple copies in a variety of Burkholderia proteins. The function of this region is unknown.	22
403857	pfam12779	YXWGXW	YXWGXW repeat (2 copies). This short repeat contains the motif YXWXXGXW where X can be any amino acid. It is generally found in 2-5 copies in short secreted bacterial proteins. Its function is as yet unknown.	26
403858	pfam12780	AAA_8	P-loop containing dynein motor region D4. The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D4 ATP-binding region of the motor.	259
403859	pfam12781	AAA_9	ATP-binding dynein motor region D5. The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D5 ATP-binding region of the motor, but has lost its P-loop.	217
372312	pfam12782	Innate_immun	Invertebrate innate immunity transcript family. The immune response of the purple sea urchin appears to be more complex than previously believed in that it uses immune-related gene families homologous to vertebrate Toll-like and NOD/NALP-like receptor families as well as C-type lectins and a rudimentary complement system. In addition, the species also produces this unusual family of mRNAs, also known as 185/333, which is strongly upregulated in response to pathogen challenge.	291
403860	pfam12783	Sec7_N	Guanine nucleotide exchange factor in Golgi transport N-terminal. The full-length Sec7 functions proximally in the secretory pathway as a protein binding scaffold for the coat protein complexes COPII-COPI. The COPII-COPI-protein switch is necessary for maturation of the vesicular-tubular cluster, VTC, intermediate compartments for Golgi compartment biogenesis. This N-terminal domain however does not appear to be binding either of the COP or the ARF.	154
403861	pfam12784	PDDEXK_2	PD-(D/E)XK nuclease family transposase. Members of this family belong to the PD-(D/E)XK nuclease superfamily. These proteins are transposase proteins.	227
372315	pfam12785	VESA1_N	Variant erythrocyte surface antigen-1. This family represents the N-terminal of the variant erythrocyte surface antigen 1, versions a and b, of Babesia. Babesia bovis is a tick-borne, intra-erythrocytic, protozoal parasite of cattle that shares many lifestyle parallels with the most virulent of the human malarial parasites, Plasmodium falciparum. Babesia uses antigenic variation to establish consistent infections of long duration. The two variants of VESA1, a and b, are expressed from different but closely related genes, and variation is achieved through the involvement of a segmental gene conversion mechanism and low-frequency epigenetic in situ switching of transcriptional activity from the VESA1 gene-pair to a possible other gene pair.	457
193262	pfam12786	GBV-C_env	GB virus C genotype envelope. This is the envelope protein from the ssRNA GB virus genotype C.	413
403862	pfam12787	EcsC	EcsC protein family. Proteins in this family are related to EcsC from B. subtilis. This protein is found in an operon with EcsA and EcsB which are components of an ABC transport system. The function of this protein is unknown.	245
403863	pfam12788	YmaF	YmaF family. This family of proteins contains 6 HXH motifs and is named after the B. subtilis YmaF protein. It seems likely that these are involved in metal binding. The function of this protein is unknown.	97
372316	pfam12789	PTR	Phage tail repeat like. This family largely contains proteins from the eukaryote Trichomonas vaginalis. These proteins contain multiple HXH repeats. Some proteins in this family are annotated as having phage tail repeats. The function of this family is unknown.	60
403864	pfam12790	T6SS-SciN	Type VI secretion lipoprotein, VasD, EvfM, TssJ, VC_A0113. One of the virulence mechanisms of E. coli is the production of toxins which it produces from dedicated machineries called secretion systems. Seven secretion systems have been described, which assemble from 3 to up to more than 20 subunits. These secretion systems derive from or have co-evolved with bacterial organelles such as ABC transporters (type I), type IV pili (type II), flagella (type III), or conjugative machines (type IV). The type VI secretion system (T6SS) is present in most pathogens that have contact with animals, plants, or humans. SciN is a lipoprotein tethered to the outer membrane and expressed in the periplasm of E. coli and is essential for T6S-dependent secretion of the Hcp-like SciD protein and for biofilm formation.	120
403865	pfam12791	RsgI_N	Anti-sigma factor N-terminus. The heat shock genes in B. subtilis can be classified into several groups according to their regulation, and the sigma gene, sigI, of Bacillus subtilis belongs to the group IV heat-shock response genes and has many orthologues in the bacterial phylum Firmicutes. Regulation of sigma factor I is carried out by RsgI from the same operon, and this N-terminal cytoplasmic portion of RsgI ('upstream' of the single transmembrane helix) has been shown to interact directly with Sigma-I.	53
403866	pfam12792	CSS-motif	CSS motif domain associated with EAL. This family with its characteristic highly conserved CSS sequence motif is found N-terminal to the EAL, pfam00563, domain in many cyclic diguanylate phosphodiesterases.	209
403867	pfam12793	SgrR_N	Sugar transport-related sRNA regulator N-term. Small, non-coding RNA molecules play important regulatory roles in a variety of physiological processes in bacteria. SgrR_N is the N-terminus of a family of proteins which regulate the transcription of these sRNAs, in particular SgrS. SgrR_N contains a helix-turn-helix motif characteristic of winged-helix DNA-binding transcriptional regulators. SgrS is a small RNA required for recovery from glucose-phosphate stress in bacteria. In examining the regulation of sgrR expression it was found that SgrR negatively auto-regulates its own transcription in the presence and absence of stress, and thus SgrR coordinates the response to glucose-phosphate stress by binding specifically to sgrS promoter DNA.	115
403868	pfam12794	MscS_TM	Mechanosensitive ion channel inner membrane domain 1. The small mechanosensitive channel, MscS, is a part of the turgor-driven solute efflux system that protects bacteria from lysis in the event of osmotic shock. The MscS protein alone is sufficient to form a functional mechanosensitive channel gated directly by tension in the lipid bilayer. The MscS proteins are heptamers of three transmembrane subunits with seven converging M3 domains, and this domain is one of the inner membrane domains.	333
403869	pfam12795	MscS_porin	Mechanosensitive ion channel porin domain. The small mechanosensitive channel, MscS, is a part of the turgor-driven solute efflux system that protects bacteria from lysis in the event of osmotic shock. The MscS protein alone is sufficient to form a functional mechanosensitive channel gated directly by tension in the lipid bilayer. The MscS proteins are heptamers of three transmembrane subunits with seven converging M3 domains, and this MscS_porin is towards the N-terminal of the molecules. The high concentration of negative charges at the extracellular entrance of the pore helps select the cations for efflux.	235
403870	pfam12796	Ank_2	Ankyrin repeats (3 copies). 	91
403871	pfam12797	Fer4_2	4Fe-4S binding domain. This superfamily includes proteins containing domains which bind to iron-sulfur clusters. Members include bacterial ferredoxins, various dehydrogenases, and various reductases. Structure of the domain is an alpha-antiparallel beta sandwich.	22
403872	pfam12798	Fer4_3	4Fe-4S binding domain. This superfamily includes proteins containing domains which bind to iron-sulfur clusters. Members include bacterial ferredoxins, various dehydrogenases, and various reductases. Structure of the domain is an alpha-antiparallel beta sandwich.	15
403873	pfam12799	LRR_4	Leucine Rich repeats (2 copies). Leucine rich repeats are short sequence motifs present in a number of proteins with diverse functions and cellular locations. These repeats are usually involved in protein-protein interactions. Each Leucine Rich Repeat is composed of a beta-alpha unit. These units form elongated non-globular structures. Leucine Rich Repeats are often flanked by cysteine rich domains.	44
403874	pfam12800	Fer4_4	4Fe-4S binding domain. This superfamily includes proteins containing domains which bind to iron-sulfur clusters. Members include bacterial ferredoxins, various dehydrogenases, and various reductases. Structure of the domain is an alpha-antiparallel beta sandwich.	17
403875	pfam12801	Fer4_5	4Fe-4S binding domain. Superfamily includes proteins containing domains which bind to iron-sulfur clusters. Members include bacterial ferredoxins, various dehydrogenases, and various reductases. Structure of the domain is an alpha-antiparallel beta sandwich.	48
403876	pfam12802	MarR_2	MarR family. The Mar proteins are involved in the multiple antibiotic resistance, a non-specific resistance system. The expression of the mar operon is controlled by a repressor, MarR. A large number of compounds induce transcription of the mar operon. This is thought to be due to the compound binding to MarR, and the resulting complex stops MarR binding to the DNA. With the MarR repression lost, transcription of the operon proceeds. The structure of MarR is known and shows MarR as a dimer with each subunit containing a winged-helix DNA binding motif.	61
289567	pfam12803	G-7-MTase	mRNA (guanine-7-)methyltransferase (G-7-MTase). The Sendai virus RNA-dependent RNA polymerase complex, which consists of L and P proteins, participates in the synthesis of viral mRNAs that possess a methylated cap structure. The N-terminal of the L protein acts as the RNA-dependent RNA polymerase part of the molecule, family Paramyx_RNA_pol, pfam00946. This domain is the C-terminal part of the L protein and it catalyzes cap methylation through its mRNA (guanine-7-)methyltransferase (G-7-MTase) activity.	317
403877	pfam12804	NTP_transf_3	MobA-like NTP transferase domain. This family includes the MobA protein (Molybdopterin-guanine dinucleotide biosynthesis protein A). The family also includes a wide range of other NTP transferase domain.	159
403878	pfam12805	FUSC-like	FUSC-like inner membrane protein yccS. This family has similarities to the fusaric acid resistance protein family. The proteins are lodged in the inner membrane.	284
403879	pfam12806	Acyl-CoA_dh_C	Acetyl-CoA dehydrogenase C-terminal like. This domain would appear to be the very C-terminal region of many bacterial acetyl-CoA dehydrogenases.	127
403880	pfam12807	eIF3_p135	Translation initiation factor eIF3 subunit 135. Translation initiation factor eIF3 is a multi-subunit protein complex required for initiation of protein biosynthesis in eukaryotic cells. The complex promotes ribosome dissociation, the binding of the initiator methionyl-tRNA to the 40 S ribosomal subunit, and mRNA recruitment to the ribosome. The protein product from TIF31 genes in yeast is p135 which associates with the eIF3 but does not seem to be necessary for protein translation initiation.	168
315477	pfam12808	Mto2_bdg	Micro-tubular organizer Mto1 C-term Mto2-binding region. The C-terminal region of the micro-tubular organizer protein 1 (mto1) is the binding domain for attachment to Mto2p. The full-length Mto1 protein is required for microtubule nucleation from non-spindle pole body MTOCs in fission yeast. The interaction of Mto2p with this region of Mto1 is critical for anchoring the cytokinetic actin ring to the medial region of the cell and for proper coordination of mitosis with cytokinesis.	52
403881	pfam12809	Metallothi_Euk2	Eukaryotic metallothionein. This is a family of eukaryotic metallothioneins.	69
403882	pfam12810	Gly_rich	Glycine rich protein. This family of proteins is greatly expanded in Trichomonas vaginalis. The proteins are composed of several glycine rich motifs interspersed through the sequence. Although many proteins in the family have been annotated by similarity, given the biased composition of the sequences these annotations are unlikely to be functionally relevant.	257
403883	pfam12811	BaxI_1	Bax inhibitor 1 like. The Bax-inhibitor-1 region of the receptor molecules is conserved from bacteria to humans.	235
372324	pfam12812	PDZ_1	PDZ-like domain. PDZ domains are found in diverse signalling proteins in bacteria, yeasts, plants, insects and vertebrates. This is a family of PDZ-like domains from bacteria, plants and fungi.	78
372325	pfam12813	XPG_I_2	XPG domain containing. This family is largely of fungal proteins and is related to the XP-G protein family.	249
403884	pfam12814	Mcp5_PH	Meiotic cell cortex C-terminal pleckstrin homology. The PH domain of these largely fungal proteins is necessary for the cortical localization of the protein during meiosis, since the overall function of the protein is to anchor dynein at the cell cortex during the horsetail phase. During prophase I of fission yeast, horsetail nuclear movement occurs, and this starts when all the telomeres become bundled at the spindle pole body - SPB. Subsequent to this, the nucleus undergoes a dynamic oscillation, resulting in elongated nuclear morphology. Horsetail nuclear movement is thought to be predominantly due to the pulling of astral microtubules that link the SPB to cortical microtubule-attachment sites at the opposite end of the cell; the pulling force is believed to be provided by cytoplasmic dynein and dynactin.	119
372327	pfam12815	CTD	Spt5 C-terminal nonapeptide repeat binding Spt4. The C-terminal domain of the transcription elongation factor protein Spt5 is necessary for binding to Spt4 to form the functional complex that regulates early transcription elongation by RNA polymerase II. The complex may be involved in pre-mRNA processing through its association with mRNA capping enzymes. This CTD domain carries a regular nonapeptide repeat that can be present in up to 18 copies, as in S. pombe. The repeat has a characteristic TPA motif.	71
403885	pfam12816	Vps8	Golgi CORVET complex core vacuolar protein 8. Vps8 is one of the Golgi complex components necessary for vacuolar sorting. Eukaryotic cells contain a highly dynamic endo-membrane system, in which individual organelles keep their identity despite continuous vesicle generation and fusion. Vesicles that bud from a donor membrane are targeted and delivered to each individual organelle, where they release their cargo after fusion with the acceptor membrane. Vps8 is the core component of the endosomal tethering complex CORVET (class C core vacuole/endosome tethering). Vps8 co-operates with Vps21-GTP to mediate endosomal clustering in a reaction that is dependent on Vps3. Vps8 is the only CORVET subunit that is enriched on late endosomes, suggesting that it is a marker for the maturation of late endosomes. Late endosomes form intralumenal vesicles, and the resulting multivesicular bodies fuse with the vacuole to release their cargoes.	194
289579	pfam12818	Tegument_dsDNA	dsDNA viral tegument protein. This is a family of tegument proteins from double-stranded DNA herpesvirus and related viral species.	277
403886	pfam12819	Malectin_like	Carbohydrate-binding protein of the ER. Malectin is a membrane-anchored protein of the endoplasmic reticulum that recognizes and binds Glc2-N-glycan. The domain is found on a number of plant receptor kinases.	328
403887	pfam12820	BRCT_assoc	Serine-rich domain associated with BRCT. This domain is found on BRCA1 proteins.	164
403888	pfam12821	ThrE_2	Threonine/Serine exporter, ThrE. ThrE_2 is a family of membrane proteins involved in the export of threonine and serine. L-threonine, L-serine are both substrates for the exporter. The exporter exhibits nine-ten predicted transmembrane-spanning helices with long charged C and N termini and an amphipathic helix present within the N-terminus. L-Threonine can be made by the amino acid-producing bacterium Corynebacterium glutamicum, but the potential for amino acid formation can be considerably improved by reducing its intracellular degradation into glycine and increasing its export by this exporter. Members of the family are found in Bacteria, Archaea, and the fungal kingdoms, and the family can exist either as a single long polypeptide chain or as two short polypeptides. All family members show an extended hydrophilic N-terminal domain with weak sequence similarity to portions of hydrolases (proteases, peptidases, and glycosidases); this suggests that since this region is cytoplasmic to the membrane it may be generating the transport substrate, so may imply that threonine may not be the primary substrate and the ThrE has a subsidiary function.	129
403889	pfam12822	ECF_trnsprt	ECF transporter, substrate-specific component. Energy-coupling factor (ECF) transporters consist of a substrate-specific component (known as the S component), and an energy-coupling module. The substrate-binding component is a small integral membrane protein which captures specific substrates and forms an active transporter in the presence of the energy-coupling AT module.	156
403890	pfam12823	DUF3817	Domain of unknown function (DUF3817). This domain is of unknown function. It is sometimes found adjacent to pfam07690 and pfam03176 which are both transporter domains.	89
403891	pfam12824	MRP-L20	Mitochondrial ribosomal protein subunit L20. This family is the essential mitochondrial ribosomal protein subunit L20 of fungi.	165
403892	pfam12825	DUF3818	Domain of unknown function in PX-proteins (DUF3818). This domain is found on proteins carrying a PX domain. Its function is unknown.	335
403893	pfam12826	HHH_2	Helix-hairpin-helix motif. The HhH domain of DisA, a bacterial checkpoint control protein, is a DNA-binding domain.	64
403894	pfam12827	Peroxin-22	Peroxisomal biogenesis protein family. Peroxin-22 is an integral peroxisomal membrane protein family. The N-terminus is in the matrix and the C-terminus is in the cytosol. The N-terminus carries a 25-amino acid peroxisome membrane-targeting signal. It interacts with the ubiquitin-conjugating peripheral peroxisomal membrane enzyme Pex4p anchoring it at the peroxisomal membrane. Both Pex proteins are involved at the same stage of peroxisome biogenesis.	109
403895	pfam12828	PXB	PX-associated. This domain is associated with the PX domain.	131
403896	pfam12829	Mhr1	Transcriptional regulation of mitochondrial recombination. This family is involved in the transcriptional regulation of recombination in the mitochondria.	82
403897	pfam12830	Nipped-B_C	Sister chromatid cohesion C-terminus. This domain lies towards the C-terminus of nipped-B or sister chromatid cohesion proteins.	177
403898	pfam12831	FAD_oxidored	FAD dependent oxidoreductase. This family of proteins contains FAD dependent oxidoreductases and related proteins.	420
372338	pfam12832	MFS_1_like	MFS_1 like family. This family contains proteins related to the MFS superfamily.	362
403899	pfam12833	HTH_18	Helix-turn-helix domain. 	81
372339	pfam12834	Phage_int_SAM_2	Phage integrase, N-terminal. This is a family of DNA-binding prophage integrases. It is found largely in Proteobacteria.	91
372340	pfam12835	Integrase_1	Integrase. This is a family of DNA-binding prophage integrases found in Proteobacteria.	149
403900	pfam12836	HHH_3	Helix-hairpin-helix motif. The HhH domain is a short DNA-binding domain.	65
403901	pfam12837	Fer4_6	4Fe-4S binding domain. This superfamily includes proteins containing domains which bind to iron-sulfur clusters. Members include bacterial ferredoxins, various dehydrogenases, and various reductases. Structure of the domain is an alpha-antiparallel beta sandwich.	24
403902	pfam12838	Fer4_7	4Fe-4S dicluster domain. Superfamily includes proteins containing domains which bind to iron-sulfur clusters. Members include bacterial ferredoxins, various dehydrogenases, and various reductases. Structure of the domain is an alpha-antiparallel beta sandwich. Domain contains two 4Fe4S clusters.	52
403903	pfam12840	HTH_20	Helix-turn-helix domain. This domain represents a DNA-binding Helix-turn-helix domain found in transcriptional regulatory proteins.	61
403904	pfam12841	YvrJ	YvrJ protein family. This family of short proteins are related to B. subtilis YvrJ protein. None of the members of this family have been functionally characterized.	36
403905	pfam12842	DUF3819	Domain of unknown function (DUF3819). This is an uncharacterized domain that is found on the CCR4-Not complex component Not1. Not1 is a global regulator of transcription that affects genes positively and negatively and is thought to regulate transcription factor TFIID.	143
403906	pfam12843	QSregVF_b	Putative quorum-sensing-regulated virulence factor. QSregVF_b is a family of short Pseudomonas proteins that are potential virulence factors. The structure of UniProtKB:Q9HY15 a secreted protein has been solved and deposited as Structure 3npd, from pfam13652. It is predicted that these two adjacent proteins form a single transcriptional unit based on the prediction that together they interact with their adjacent protein PotD, which is the putrescine-binding periplasmic protein in the polyamine uptake system comprising PotABCD. These two adjacent proteins are predicted to be quorum-sensing-regulated virulence factors.	66
403907	pfam12844	HTH_19	Helix-turn-helix domain. Members of this family contains a DNA-binding helix-turn-helix domain. This family contains many example antitoxins from bacterial toxin-antitoxin systems. These antitoxins are likely to be DNA-binding domains.	64
403908	pfam12845	TBD	TBD domain. The Tbk1/Ikki binding domain (TBD) is a 40 amino acid domain able to bind kinases and has been found to be essential for poly(I:C)-induced IRF activation. The domain is found in SINTBAD, TANK and NAP1 proteins. This domain is predicted to form an alpha-helix with residues essential for kinase binding clustering on one side.	55
315512	pfam12846	AAA_10	AAA-like domain. This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins.	362
403909	pfam12847	Methyltransf_18	Methyltransferase domain. Protein in this family function as methyltransferases.	151
403910	pfam12848	ABC_tran_Xtn	ABC transporter. This domain is an extension of some members of pfam00005 and other ABC-transporter families.	85
403911	pfam12849	PBP_like_2	PBP superfamily domain. This domain belongs to the periplasmic binding protein superfamily.	270
403912	pfam12850	Metallophos_2	Calcineurin-like phosphoesterase superfamily domain. Members of this family are part of the Calcineurin-like phosphoesterase superfamily.	153
372343	pfam12851	Tet_JBP	Oxygenase domain of the 2OGFeDO superfamily. A double-stranded beta helix (DSBH) fold domain of the 2-oxoglutarate (2OG)-Fe(II)-dependent dioxygenase (2OGFeDO) superfamily found in various eukaryotes, bacteria and bacteriophages. Members of this family catalyze nucleic acid modifications, such as thymidine hydroxylation during base J synthesis in kinetoplastids, and the conversion of 5 methyl-cytosine (5-mC) to 5-hydroxymethyl-cytosine (hmC), or further oxidation to 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC). Metazoan TET proteins contain a cysteine-rich region inserted into the core of the DSBH fold. Vertebrate TET proteins are oncogenes that are mutated in various myeloid cancers. Fungal and algal versions of this family are linked to a predicted transposase and show lineage-specific expansions.	166
403913	pfam12852	Cupin_6	Cupin. This is a family of bacterial and eukaryotic proteins that belong to the Cupin superfamily. Some of the proteins in this family are annotated as being members of the AraC family of transcription factors, in which case this domain corresponds to the ligand binding domain.	184
372344	pfam12853	NADH_u_ox_C	C-terminal of NADH-ubiquinone oxidoreductase 21 kDa subunit. This family is the C-terminal domain of NADH-ubiquinone oxidoreductase 21 kDa subunits from fungi.	89
403914	pfam12854	PPR_1	PPR repeat. This family matches additional variants of the PPR repeat that were not captured by the model for pfam01535. The exact function is not known.	34
403915	pfam12855	Ecl1	Life-span regulatory factor. This family is involved in the chronological life-span of S. cerevisiae. Over-expression leads to an extended viability of wild-type strains, indicating a role in regulation.	171
403916	pfam12856	ANAPC9	Anaphase-promoting complex subunit 9. Apc9 is one of the subunits of the anaphase-promoting complex, or cyclosome, which is essential for regulating entry into anaphase and exit from mitosis. The APC is a ubiquitin-protein ligase complex. All APC subunits are members of the cullin family proteins, which bind to a ring-finger subunit via a conserved cullin domain. The APC is made up of four parts, the third of which is a tetratricopeptide repeat arm (TPR) that contains Apc9.	112
403917	pfam12857	TOBE_3	TOBE-like domain. The TOBE domain (Transport-associated OB) always occurs as a dimer as the C-terminal strand of each domain is supplied by the partner. Probably involved in the recognition of small ligands such as molybdenum and sulfate. Found in ABC transporters immediately after the ATPase domain.	59
403918	pfam12859	ANAPC1	Anaphase-promoting complex subunit 1. Apc1 is the largest of the subunits of the anaphase-promoting complex or cyclosome. The anaphase-promoting complex is a multiprotein subunit E3 ubiquitin ligase complex that controls segregation of chromosomes and exit from mitosis in eukaryotes. Infection of human fibroblasts with human cytomegalovirus (HCMV) leads to cell cycle dysregulation, which is associated with the inactivation of the anaphase-promoting complex.	120
403919	pfam12860	PAS_7	PAS fold. The PAS fold corresponds to the structural domain that has previously been defined as PAS and PAC motifs. The PAS fold appears in archaea, eubacteria and eukarya.	115
403920	pfam12861	zf-ANAPC11	Anaphase-promoting complex subunit 11 RING-H2 finger. Apc11 is one of the subunits of the anaphase-promoting complex or cyclosome. The APC subunits are cullin family proteins with ubiquitin ligase activity. Polyubiquitination marks proteins for degradation by the 26S proteasome and is carried out by a cascade of enzymes that includes ubiquitin-activating enzymes (E1s), ubiquitin-conjugating enzymes (E2s), and ubiquitin ligases (E3s). Apc11 acts as an E3 enzyme and is responsible for recruiting E2s to the APC and for mediating the subsequent transfer of ubiquitin to APC substrates in vivo. In Saccharomyces cerevisiae this RING-H2 finger protein defines the minimal ubiquitin ligase activity of the APC, and the integrity of the RING-H2 finger is essential for budding yeast cell viability.	85
403921	pfam12862	ANAPC5	Anaphase-promoting complex subunit 5. Apc5 is a subunit of the anaphase-promoting complex/cyclosome (APC/C) which is a multi-subunit ubiquitin ligase that mediates the proteolysis of cell cycle proteins in mitosis and G1. Although it does not harbour a classical RNA binding domain, Apc5 binds the poly(A) binding protein (PABP), which directly binds the internal ribosome entry site (IRES) of growth factor 2 mRNA. PABP was found to enhance IRES-mediated translation, whereas Apc5 over-expression counteracted this effect. In addition to its association with the APC/C complex, Apc5 binds much heavier complexes and co-sediments with the ribosomal fraction. The N-terminus of Afi1 serves to stabilize the union between Apc4 and Apc5, both of which lie towards the bottom-front of the APC. This region of the Apc5 member proteins carries a TPR-like motif.	91
403922	pfam12863	DUF3821	Domain of unknown function (DUF3821). This is a domain largely confined to sequences from Methanomicrobiales found on putative lipases. The function is not known.	202
403923	pfam12864	DUF3822	Protein of unknown function (DUF3822). This is a family of uncharacterized bacterial proteins. However, structural-similarity searches indicate the family takes on an actin-like ATPase fold.	241
403924	pfam12866	DUF3823	Protein of unknown function (DUF3823). This is a family of uncharacterized proteins from Bacteroidetes. It has characteristic DN and DR sequence-motifs. The function is not known.	92
403925	pfam12867	DinB_2	DinB superfamily. The DinB family are an uncharacterized family of potential enzymes. The structure of these proteins is composed of a four helix bundle.	128
372351	pfam12868	DUF3824	Domain of unknown function (DUF3824). This is a repeating domain found in fungal proteins. It is proline-rich, and the function is not known.	145
378982	pfam12869	tRNA_anti-like	tRNA_anti-like. This is a family of bacterial, archaeal and viral proteins that is related to the tRNA_anti family pfam01336. The major characteristic of families like tRNA_anti is their OB-fold, and many of them bind DNA.	162
403926	pfam12870	DUF4878	Domain of unknown function (DUF4878). This is a family of putative lipoproteins from bacteria. The family is probably related to the NTF2-like transpeptidase family.	112
403927	pfam12871	PRP38_assoc	Pre-mRNA-splicing factor 38-associated hydrophilic C-term. This domain is a hydrophilic region found at the C-terminus of plant and metazoan pre-mRNA-splicing factor 38 proteins. The function is not known.	98
403928	pfam12872	OST-HTH	OST-HTH/LOTUS domain. A predicted RNA-binding domain found in insect Oskar and vertebrate TDRD5/TDRD7 proteins that nucleate or organize structurally related ribonucleoprotein (RNP) complexes, the polar granule and nuage, is poorly understood. The domain adopts the winged helix-turn-helix fold and binds RNA with a potential specificity for dsRNA. In eukaryotes this domain is often combined in the same polypeptide with protein-protein or lipid interaction domains that might play a role in anchoring these proteins to specific cytoskeletal structures. Thus, proteins with this domain might have a key role in the recognition and localization of dsRNA, including miRNAs, rasiRNAs and piRNAs hybridized to their targets. In other cases, this domain is fused to ubiquitin-binding, E3 ligase and ubiquitin-like domains indicating a previously under-appreciated role for ubiquitination in regulating the assembly and stability of nuage-like RNP complexes. Both bacteria and eukaryotes encode a conserved family of proteins that combines this predicted RNA-binding domain with a previously uncharacterized RNase domain belonging to the superfamily that includes the 5'->3' nucleases, PIN and NYN domains.	65
403929	pfam12873	DUF3825	Domain of unknown function (DUF3825). Potential uncharacterized enzymatic domain associated with bacterial pfam12872 domains. Has conserved residues suggestive of an enzymatic role probably related to RNA metabolism.	239
403930	pfam12874	zf-met	Zinc-finger of C2H2 type. This is a zinc-finger domain with the CxxCx(12)Hx(6)H motif, found in multiple copies in a wide range of proteins from plants to metazoans. Some member proteins, particularly those from plants, are annotated as being RNA-binding.	25
403931	pfam12875	DUF3826	Protein of unknown function (DUF3826). This is a putative sugar-binding family.	186
372355	pfam12876	Cellulase-like	Sugar-binding cellulase-like. This is a putative cellulase family. The structure is a TIM-barrel.	355
403932	pfam12877	DUF3827	Domain of unknown function (DUF3827). This family contains the human KIAA1549 protein which has been found to be fused to the BRAF gene in many cases of pilocytic astrocytomas. The fusion is due mainly to a tandem duplication of 2 Mb at 7q34. Although nothing is known about the function of the human KIAA1549 protein, the BRAF protein is a well characterized oncoprotein. It is a serine/threonine protein kinase which is implicated in MAP/ERK signalling, a critical pathway for the regulation of cell division, differentiation and secretion.	677
403933	pfam12878	SICA_beta	SICA extracellular beta domain. The SICA (schizont-infected cell agglutination) proteins of P. knowlesi, one of the variant antigen gene families, are associated with parasitic virulence. These proteins are comprised of multiple domains, with the extracellular domains occurring at different frequencies. There can be between 1 and 10 copies of this cysteine-rich domain.	172
372358	pfam12879	SICA_C	SICA C-terminal inner membrane domain. The SICA (schizont-infected cell agglutination) proteins of P. knowlesi, one of the variant antigen gene families, are associated with parasitic virulence. These proteins are comprised of multiple domains, with the extracellular domains occurring at different frequencies. The C-terminal domain is thought to remain in the erythrocyte, found in juxtaposition to the single transmembrane domain. To date, all full length proteins contain a single copy of this domain.	136
403934	pfam12881	NUT	NUT protein. This family includes the NUT protein. The gene encoding for NUT protein (Nuclear Testis protein) is found fused to BRD3 or BRD4 genes, in some aggressive types of carcinoma, due to chromosomal translocations. Proteins of the BRD family contain two bromodomains that bind transcriptionally active chromatin through associations with acetylated histones H3 and H4. Such proteins are crucial for the regulation of cell cycle progression. On the other hand, little is known about NUT protein. NUT is known to have a Nuclear Export Sequence (NES) as well as a Nuclear localization Signal (NLS), both located towards the C-terminal end of the protein. A fused NUT-GFP protein showed either cytoplasmic or nuclear localization, suggesting that it is subject to nuclear/cytoplasmic shuttling. Consistent with this possibility, treatment with leptomycin B an inhibitor of CRM1-dependent nuclear export resulted in re-distribution of NUT-GFP to the nucleus. Inspection of NUT revealed a C-terminal sequence similar to known nuclear export sequences (NES) which are often regulated by phosphorylation. This family carries some natively unstructured sequence.	717
403935	pfam12883	DUF3828	Protein of unknown function (DUF3828). This is a family of bacterial proteins of unknown function.	122
403936	pfam12884	TORC_N	Transducer of regulated CREB activity, N-terminus. This family includes the N terminal region of TORC proteins. TORC (Transducer of regulated CREB activity) is a protein family of coactivators that enhances the activity of CRE-dependent transcription via a phosphorylation-independent interaction with the bZIP DNA binding/dimerization domain of CREB (cAMP Response Element-Binding). The proteins display a highly conserved predicted N-terminal coiled-coil domain and an invariant sequence matching a protein kinase A (PKA) phosphorylation consensus sequence (RKXS). The coiled-coil structure interacts with the bZIP domain of CREB. This interaction may occur via ionic bonds because it is disrupted under high-salt conditions. In addition to CREB-binding, the N-terminal region plays a role in the tetramer formation of TORCs, but the physiological function of the multimeric complex has not been clarified yet.	63
403937	pfam12885	TORC_M	Transducer of regulated CREB activity middle domain. This family includes the region between the N and C-terminus of TORC proteins. TORC (Transducer of regulated CREB activity) is a protein family of coactivators that enhances the activity of CRE-dependent transcription via a phosphorylation-independent interaction with the bZIP DNA binding/dimerization domain of CREB (cAMP Response Element-Binding). Although the C- and N- terminal domains of these proteins have been well characterized, no functional role has yet been assigned to the central region.	160
403938	pfam12886	TORC_C	Transducer of regulated CREB activity, C-terminus. This family includes the C terminal region of TORC proteins. TORC (Transducer of regulated CREB activity) is a protein family of coactivators that enhances the activity of CRE-dependent transcription via a phosphorylation-independent interaction with the bZIP DNA binding/dimerization domain of CREB (cAMP Response Element-Binding). The C-terminus region is negatively charged, resembling the transcription activation domains. When this domain, from all three human TORC proteins, was expressed as fusion proteins with the DNA-binding domain of GAL4 (GAL4-BD), and tested for induction of a minimal promoter linked to GAL4-binding sites (UAS-GAL4), UAS-GAL4 was potently induced by GAL4-BD fusions containing the C-terminal portion of all three human TORCs.	75
403939	pfam12887	SICA_alpha	SICA extracellular alpha domain. The SICA (schizont-infected cell agglutination) proteins of P. knowlesi, one of the variant antigen gene families, are associated with parasitic virulence. These proteins are comprised of multiple domains, with the extracellular domains occurring at different frequencies. This domain is typically found at the N-terminus, with 1 or 2 copies per protein. The domain is cysteine-rich and similar to pfam12878.	187
403940	pfam12888	Lipid_bd	Lipid-binding putative hydrolase. This is a small family of lipid-binding proteins found in Bacteroidetes.	140
403941	pfam12889	DUF3829	Protein of unknown function (DUF3829). This is a small family of proteins from several bacterial species, whose function is not known. It may, however, be related to the GvpL_GvpF family of proteins, pfam06386.	283
315550	pfam12890	DHOase	Dihydro-orotase-like. This is a small family of dihydro-orotase-like proteins from various bacteria.	142
403942	pfam12891	Glyco_hydro_44	Glycoside hydrolase family 44. This is a family of bacterial glycoside hydrolases formerly known as cellulase family J, and now known as Cel44A. It is one of the major enzymatic components of the cellulosome of Clostridium thermocellum strain F1 and of many other Firmicutes.	234
403943	pfam12892	FctA	Spy0128-like isopeptide containing domain. The FCT and equivalent region genes of Streptococcus pyogenes and other related bacteria encode surface proteins that include fibronectin- and collagen-binding proteins and the serological markers known as T antigens. Some of these proteins give rise to pilus-like appendages. The FctA family is found in many Firmicutes and related bacteria. In S. pyogenes, the pili have a role in bacterial adherence and colonisation of human tissues. Members of this family have a conserved N-terminal lysine and C-terminal asparagine that can form a covalent isopeptide bond.	113
403944	pfam12893	Lumazine_bd_2	Putative lumazine-binding. This is a family of uncharacterized proteins. However, the family belongs to the NTF2-like superfamily of various enzymes, and some of the members of the family are putative dehydrogenases.	116
403945	pfam12894	ANAPC4_WD40	Anaphase-promoting complex subunit 4 WD40 domain. Apc4 contains an N-terminal propeller-shaped WD40 domain. The N-terminus of Afi1 serves to stabilize the union between Apc4 and Apc5, both of which lie towards the bottom-front of the APC.	91
403946	pfam12895	ANAPC3	Anaphase-promoting complex, cyclosome, subunit 3. Apc3, otherwise known as Cdc27, is one of the subunits of the anaphase-promoting complex or cyclosome. The anaphase-promoting complex is a multiprotein subunit E3 ubiquitin ligase complex that controls segregation of chromosomes and exit from mitosis in eukaryotes. The protein members of this family contain TPR repeats just as those of Apc7 do, and it appears that these TPR units bind the C-termini of the APC co-activators CDH1 and CDC20.	82
403947	pfam12896	ANAPC4	Anaphase-promoting complex, cyclosome, subunit 4. Apc4 is one of the larger of the subunits of the anaphase-promoting complex or cyclosome. This family represents the long domain downstream of the WD40 repeat/s that are present on the Apc4 subunits. The anaphase-promoting complex is a multiprotein subunit E3 ubiquitin ligase complex that controls segregation of chromosomes and exit from mitosis in eukaryotes. Results in C. elegans show that the primary essential role of the spindle assembly checkpoint is not in the chromosome segregation process itself but rather in delaying anaphase onset until all chromosomes are properly attached to the spindle. The APC/C is likely to be required for all metaphase-to-anaphase transitions in a multicellular organism.	203
403948	pfam12897	Aminotran_MocR	Alanine-glyoxylate amino-transferase. These proteins catalyze the reversible transfer of an amino group from the amino acid substrate to an acceptor alpha-keto acid. They require pyridoxal 5'-phosphate (PLP) as a cofactor to catalyze this reaction. Trans-amination reactions are of central importance in amino acid metabolism and in links to carbohydrate and fat metabolism. This class of aminotransferases acts as dimers in a head-to-tail configuration.	419
403949	pfam12898	Stc1	Stc1 domain. The domain contains 8 conserved cysteines that may bind to zinc. In S. pombe this protein acts as a protein linker which links the chromatin modifying CLRC complex to RNAi by tethering it to the RITS complex. The region is reported as a LIM domain here, but has a slightly different arrangement of its CxxC pairs from the Pfam LIM domain pfam00412, hence why it is not part of that family. The tandem zinc-finger structure could mediate protein-protein interactions.	78
403950	pfam12899	Glyco_hydro_100	Alkaline and neutral invertase. This is a family of bacterial and plant alkaline and neutral invertases, EC:3.2.1.26, previously known as Invertase_neut pfam04853.	429
403951	pfam12900	Pyridox_ox_2	Pyridoxamine 5'-phosphate oxidase. Pyridoxamine 5'-phosphate oxidase is a FMN flavoprotein that catalyzes the oxidation of pyridoxamine-5-P (PMP) and pyridoxine-5-P (PNP) to pyridoxal-5-P (PLP). This entry contains several pyridoxamine 5'-phosphate oxidases, and related proteins.	133
403952	pfam12901	SUZ-C	SUZ-C motif. The SUZ-C domain is a conserved motif found in one or more copies in several RNA-binding proteins. It is always found at the C-terminus of the protein and appears to be required for localization of the protein to specific subcellular structures. It was first characterized in the C. elegans protein Szy-20, which localizes to the centrosome. It is widely distributed in eukaryotes.	33
403953	pfam12902	Ferritin-like	Ferritin-like. This is a family of bacterial ferritin-like substances that also includes a C-terminal domain of VioB, polyketide synthase enzymes, that make up one of the key components of the violacein biosynthesis pathway. Violacein is a purple-coloured, broad-spectrum antibacterial pigment.	222
403954	pfam12903	DUF3830	Protein of unknown function (DUF3830). This is a family of bacterial and archaeal proteins, the structure for one of whose members has been characterized. Structure 3kop probably adopts a new hexameric form compared to previous structures. The putative active site is near the domain interface. 3kop is most closely related structurally to Structure 1zx8, where the potential active site is located near residues E51 and Y53 (conserved in 1zx8). Beyond the two residues above, the other residues are not conserved. Also the shape of the active site differs from that of 1zx8. Structure 1zx8 belongs to family DUF369, pfam04126, which is part of the cyclophilin-like clan.	144
403955	pfam12904	Collagen_bind_2	Putative collagen-binding domain of a collagenase. This domain is likely to be the collagen-binding domain of a family of bacterial collagenase enzymes. It is the C-terminal part of the Structure 3kzs (information derived from TOPSAN).	92
403956	pfam12905	Glyco_hydro_101	Endo-alpha-N-acetylgalactosaminidase. Virulence of pathogenic organisms such as the Gram-positive Streptococcus pneumoniae is largely determined by the ability to degrade host glycoproteins and to metabolize the resultant carbohydrates. This family is the enzymatic region, EC:3.2.1.97, of the cell surface proteins that specifically cleave Gal-beta-1,3-GalNAc-alpha-Ser/Thr (T-antigen, galacto-N-biose), the core 1 type O-linked glycan common to mucin glycoproteins. This reaction is exemplified by the S. pneumoniae protein Endo-alpha-N-acetylgalactosaminidase, where Asp764 is the catalytic nucleophile-base and Glu796 the catalytic proton donor.	273
403957	pfam12906	RINGv	RING-variant domain. 	47
403958	pfam12907	zf-met2	Zinc-binding. This is a small family of metazoan zinc-binding proteins.	38
403959	pfam12910	PHD_like	Antitoxin of toxin-antitoxin, RelE / RelB, TA system. This domain appears to be the N-terminus of the RelB antitoxin of toxin-antitoxin stability system or prevent-host death system. Together RelE toxin and the RelB antitoxin form a non-toxic complex. Although toxin-antitoxin gene cassettes were first found in plasmids, it is clear that these loci are abundant in free-living prokaryotes, including many pathogenic bacteria, and these toxin-antitoxin loci provide a control mechanism that helps free-living prokaryotes cope with nutritional stress.	139
403960	pfam12911	OppC_N	N-terminal TM domain of oligopeptide transport permease C. Oligopeptide permeases (Opp) have been identified in numerous gram-negative and -positive bacteria. These transport systems belong to the superfamily of highly conserved ATP-binding cassette transporters. Typically, Opp importers comprise a complex of five proteins. The oligopeptide-binding protein OppA is responsible for the capture of peptides from the external medium. Two integral highly hydrophobic membrane spanning proteins, OppB and OppC, form a channel through the membrane used for peptide translocation. This N-terminal domain appears to be the first TM domain of the molecule.	53
403961	pfam12912	N_NLPC_P60	NLPC_P60 stabilizing domain, N term. This domain, at the N-terminus, appears to be the stabilizing domain for the structure from Desulfovibrio vulgaris DVU_0896, Structure 3m1u, which is a four-domain protein. The next domain is an SH3b1, the third an SH3b2 and the last, the C-terminal region, the catalytic domain of the cysteine-peptidase type, ie family NLPC_P60, pfam00877 (details derived from TOPSAN).	106
403962	pfam12913	SH3_6	SH3 domain (SH3b1 type). This domain appears to be an SH3 domain of the SH3b1-type, and is just C-terminal to an N-terminal domain that is probably the stabilizing domain for the structure from Desulfovibrio vulgaris DVU_0896, Structure 3m1u, which is a four-domain protein. The next domain is an SH3b2 and the last, the C-terminal region, is the catalytic domain of the cysteine-peptidase type, ie family NLPC_P60, pfam00877 (details derived from TOPSAN).	51
403963	pfam12914	SH3_7	SH3 domain of SH3b2 type. This domain appears to be an SH3 domain of the SH3b2-type, and is the second SH3 domain to be found, downstream of an N-terminal domain that is probably the stabilizing domain, for the structure from Desulfovibrio vulgaris DVU_0896, Structure 3m1u, which is a four-domain protein. The last, the C-terminal region, is the catalytic domain of the cysteine-peptidase type, ie family NLPC_P60, pfam00877 (details derived from TOPSAN).	46
403964	pfam12915	DUF3833	Protein of unknown function (DUF3833). This is a family of uncharacterized proteins found in Proteobacteria.	163
315571	pfam12916	DUF3834	Protein of unknown function (DUF3834). This family is likely to be related to solute-binding lipo-proteins.	201
403965	pfam12917	HD_2	HD containing hydrolase-like enzyme. This is a family of bacterial and archaeal hydrolases.	182
403966	pfam12918	TcdB_N	TcdB toxin N-terminal helical domain. This is a short helical bundle domain found associated with the catalytic domain of the TcdB toxin from C. difficile. The function of this domain is unknown, but it may be involved in substrate recognition.	66
372382	pfam12919	TcdA_TcdB	TcdA/TcdB catalytic glycosyltransferase domain. This domain represents the N-terminal glycosyltransferase from a set of toxins found in some bacteria. This domain in TcdB glycosylates the host RhoA protein.	382
372383	pfam12920	TcdA_TcdB_pore	TcdA/TcdB pore forming domain. This family represents the most conserved region within the C. difficile Toxin A and Toxin B pore forming region.	626
372384	pfam12921	ATP13	Mitochondrial ATPase expression. ATP13 is necessary for the expression of subunit 9 of mitochondrial ATPase. The protein has a basic amino terminal signal sequence that is cleaved upon import into mitochondria.	114
403967	pfam12922	Cnd1_N	non-SMC mitotic condensation complex subunit 1, N-term. The three non-SMC (structural maintenance of chromosomes) subunits of the mitotic condensation complex are Cnd1-3. The whole complex is essential for viability and the condensing of chromosomes in mitosis. This is the conserved N-terminus of the subunit 1.	164
403968	pfam12923	RRP7	Ribosomal RNA-processing protein 7 (RRP7). RRP7 is an essential protein in yeast that is involved in pre-rRNA processing and ribosome assembly. It is speculated to be required for correct assembly of rpS27 into the pre-ribosomal particle.	119
403969	pfam12924	APP_Cu_bd	Copper-binding of amyloid precursor, CuBD. This short domain, part of the extra-cellular N-terminus of the amyloid precursor protein, APP, can bind both copper and zinc, CuBD. The structure of Cu2+-bound CuBD reveals that the metal ligands are His147, His151, Tyr168 and two water molecules, which are arranged in a square pyramidal geometry. The structure of Cu+-bound CuBD is almost identical to the Cu2+-bound structure except for the loss of one of the water ligands. The geometry of the site is unfavourable for Cu+, thus providing a mechanism by which CuBD could readily transfer Cu ions to other proteins.	56
403970	pfam12925	APP_E2	E2 domain of amyloid precursor protein. The E2 domain is the largest of the conserved domains of the amyloid precursor protein. The structure of E2 consists of two coiled-coil sub-structures connected through a continuous helix, and bears an unexpected resemblance to the spectrin family of protein structures. E2 can reversibly dimerize in solution, and the dimerization occurs along the longest dimension of the molecule in an antiparallel orientation, which enables the N-terminal substructure of one monomer to pack against the C-terminal substructure of a second monomer. The high degree of conservation of residues at the putative dimer interface suggests that the E2 dimer observed in the crystal could be physiologically relevant. Heparan sulfate proteoglycans, the putative ligands for the precursor present in extracellular matrix, bind to E2 at a conserved and positively charged site near the dimer interface.	190
403971	pfam12926	MOZART2	Mitotic-spindle organizing gamma-tubulin ring associated. FAM128A and FAM128B proteins have been re-named MOZART2A and B. The name MOZART is derived from letters of 'mitotic-spindle organizing proteins associated with a ring of gamma-tubulin'. This family operates as part of the gamma-tubulin ring complex, gamma-TuRC, one of the complexes necessary for chromosome segregation. This complex is located at centrosomes and mediates the formation of bipolar spindles in mitosis; it consists of six subunits. However, unlike the other four known subunits, the MOZART proteins, both 1 and 2, do not carry the conserved 'Spc97-Spc98' GCP domain, so the TUBCGP nomenclature cannot be used for it. The exact function of MOZART2 is not clear.	90
403972	pfam12927	DUF3835	Domain of unknown function (DUF3835). This is a C-terminal domain conserved in fungi.	73
403973	pfam12928	tRNA_int_end_N2	tRNA-splicing endonuclease subunit sen54 N-term. This is an N-terminal family of archaeal and metazoan sen54 proteins that forms one of the tRNA-splicing endonuclease subunits.	69
403974	pfam12929	Mid1	Stretch-activated Ca2+-permeable channel component. MID1 is a yeast Saccharomyces cerevisiae gene encoding a plasma membrane protein required for Ca2+ influx induced by the mating pheromone, alpha-factor. Mid1 protein plays a crucial role in supplying Ca2+ during the mating process. Mid1 is composed of 548-amino-acid residues with four hydrophobic regions named H1, H2, H3 and H4, and two cysteine-rich regions (C1 and C2) at the C-terminus. This family contains the H3, H4, C1 and C2 regions. suggesting that H1 is a signal sequence responsible for the alpha-factor-induced Mid1 delivery to the plasma membrane. The region from H1 to H3 is required for the localization of Mid1 in the plasma and ER membranes. Trafficking of Mid1-GFP to the plasma membrane is dependent on the N-glycosylation of Mid1 and the transporter protein Sec12. This finding suggests that the trafficking of Mid1-GFP to the plasma membrane requires a Sec12-dependent pathway from the ER to the Golgi, and that Mid1 is recruited via a Sec6- and Sec7-independent pathway from the Golgi to the plasma membrane.	430
403975	pfam12930	DUF3836	Family of unknown function (DUF3836). Family of uncharacterized proteins found in Bacteroidales species.	121
403976	pfam12931	Sec16_C	Sec23-binding domain of Sec16. Sec16 is a multi-domain vesicle coat protein. The C-terminal region is the part that binds to Sec23, a COPII vesicle coat protein. This association is part of the transport vesicle coat structure.	279
403977	pfam12932	Sec16	Vesicle coat trafficking protein Sec16 mid-region. Sec16 is a multi-domain vesicle coat protein. This central region is the functional part of the molecules and thus is vital for the family's role in mediating the movement of protein-cargo between the organelles of the secretory pathway.	118
403978	pfam12933	FTO_NTD	FTO catalytic domain. This domain is the catalytic AlkB-like domain from the FTO protein. This domain catalyzes a demethylase activity with a preference for 3-methylthymidine.	275
403979	pfam12934	FTO_CTD	FTO C-terminal domain. This domain is found at the C-terminus of the FTO protein which was shown to be associated with increased BMI and obesity risk in humans. The N-terminal domain of this protein is a DNA demethylase and this domain is found to associate with the N-terminal domain in the crystal structure. This domain is alpha helical with three helices that form a bundle.	167
315590	pfam12935	Sec16_N	Vesicle coat trafficking protein Sec16 N-terminus. Sec16 is a multi-domain vesicle coat protein. The overall function of Sec16 is in mediating the movement of protein-cargo between the organelles of the secretory pathway. Over-expression of truncated mutants of only the N-terminus is lethal, and this portion does not appear to be essential for function so may act as a stabilizing region.	236
403980	pfam12936	Kri1_C	KRI1-like family C-terminal. The yeast member of this family (Kri1p) is found to be required for 40S ribosome biogenesis in the nucleolus. This is the C-terminal domain of the family.	89
403981	pfam12937	F-box-like	F-box-like. This is an F-box-like family.	45
372400	pfam12938	M_domain	M domain of GW182. 	240
403982	pfam12939	DUF3837	Domain of unknown function (DUF3837). A small, compact all-alpha helical domain of unknown function. This domain is currently only found in Clostridiales species.	92
315595	pfam12940	RAG1	Recombination-activation protein 1 (RAG1), recombinase. This family is one of the two different components of the RAG1-RAG2 V(D)J recombinase complex. The RAG complex, consisting of two RAG1 and two RAG2 proteins, is a multi-protein complex that mediates DNA cleavage during V(D)J (variable-diversity-joining) recombination. RAG1 mediates DNA-binding to the conserved recombination signal sequences (RSS). Many of the proteins in this family are fragments. Solution of the structure of the complex of RAG1 and RAG2 shows that each protein dimerizes with itself and each pair then complexes together to form the RAG1-RAG2 V(D)J recombinase enzyme. The different structural elements in RAG1 for UniProtKB:P15919 are: an N-terminal nonamer-binding domain from residues 391-459; a dimerization and DNA-binding domain from 459-515; an extended pre-RNase H domain from 515-588; the catalytic RNase H domain from 588-719; a ZnC2 domain from 719-791; a ZnH2 domain from 791-962; and a three-helix C-terminal domain from 962-1008.	653
289693	pfam12941	HCV_NS5a_C	HCV NS5a protein C-terminal region. This is a family of proteins found in the hepatitis C virus. This family contains the C-terminal region of the NS5A protein. The molecular function of the non-structural 5a protein is uncertain. The NS5a protein is phosphorylated when expressed in mammalian cells. It is thought to interact with the dsRNA-dependent (interferon-inducible) kinase PKR.	242
289694	pfam12942	Archaeal_AmoA	Archaeal ammonia monooxygenase subunit A (AmoA). This is an archaeal family that contains ammonia monooxygenase subunit A. Ammonia monooxygenase is an enzyme that oxidizes ammonia to nitrite and nitrate, thus playing a significant role in the nitrogen cycle. Ammonia-oxidising archaea (AOA) are widespread in marine environments.	183
372401	pfam12943	DUF3839	Protein of unknown function (DUF3839). This is a family of uncharacterized proteins that are found in Trichomonas.	242
403983	pfam12944	HAV_VP	Hepatitis A virus viral protein VP. This is a family of the viral protein found in hepatitis A viruses. HAV is unique among picornaviruses in targeting the liver.	169
403984	pfam12945	YcgR_2	Flagellar protein YcgR. This domain is found N terminal to pfam07238. Proteins which contain YcgR domains are known to interact with the flagellar switch-complex proteins FliG and FliM. This interaction results in a reduction of torque generation and induces CCW motor bias. This family contains members not captured by pfam07317.	85
403985	pfam12946	EGF_MSP1_1	MSP1 EGF domain 1. This EGF-like domain is found at the C-terminus of the malaria parasite MSP1 protein. MSP1 is the merozoite surface protein 1. This domain is part of the C-terminal fragment that is proteolytically processed from the rest of the protein and is left attached to the surface of the invading parasite.	37
403986	pfam12947	EGF_3	EGF domain. This family includes a variety of EGF-like domain homologs. This family includes the C-terminal domain of the malaria parasite MSP1 protein.	36
403987	pfam12948	MSP7_C	MSP7-like protein C-terminal domain. MSP7 is a protein family from the malaria parasite that has been found to be associated with processed fragments from the MSP1 protein in a complex involved in red blood cell invasion.	125
403988	pfam12949	HeH	HeH/LEM domain. This is a HeH domain. HeH domains form helix-extended loop-helix (HeH) structures. This domain is closely related to pfam03020 and pfam02037.	35
403989	pfam12950	TaqI_C	TaqI-like C-terminal specificity domain. This domain is found at the C-terminus of the TaqI protein and is involved in DNA-binding and substrate recognition.	119
403990	pfam12951	PATR	Passenger-associated-transport-repeat. This Autotransporter-associated beta strand repeat model represents a core 32-residue region of a class of bacterial protein repeat found in one to 30 copies per protein. Most proteins with a copy of this repeat have domains associated with membrane autotransporters (pfam03797). The repeats occur with a periodicity of 60 to 100 residues. A pattern of sequence conservation is that every second residue is well-conserved across most of the domain. These repeats are likely to have a beta-helical structure. This repeat plays a role in the efficient transport of autotransporter virulence factors to the bacterial surface during growth and infection. The repeat is always associated with the passenger domain of the autotransporter. For these reasons it has been coined the Passenger-associated Transport Repeat (PATR). The mechanism by which the PATR motif promotes transport is uncertain but it is likely that the conserved glycines (see HMM Logo) are required for flexibility of folding and that this folding drives secretion. Autotransporters that contain PATR(s) associate with distinct virulence traits such as subtilisin (S8) type protease domains and polymorphic outer-membrane protein repeats, whilst SPATE (S6) type protease and lipase-like autotransporters do not tend to contain PATR motifs.	28
403991	pfam12952	DUF3841	Domain of unknown function (DUF3841). This presumed domain is around 190 amino acids in length. As yet no function has been given to any member of the family.	178
403992	pfam12953	DUF3842	Domain of unknown function (DUF3842). This short protein is found mainly in firmicute bacteria. It is functionally uncharacterized.	130
403993	pfam12954	DUF3843	Protein of unknown function (DUF3843). A family of uncharacterized proteins found by clustering human gut metagenomic sequences.	409
403994	pfam12955	DUF3844	Domain of unknown function (DUF3844). This presumed domain is found in fungal species. It contains 8 largely conserved cysteine residues. This domain is found in proteins that are thought to be found in the endoplasmic reticulum.	104
403995	pfam12956	DUF3845	Domain of Unknown Function with PDB structure. Member Structure 3GF6 has statistically significant similarity to the TNF-like jelly roll fold, which may indicate an immunomodulatory function or a bioadhesion role.	221
403996	pfam12957	DUF3846	Domain of unknown function (DUF3846). A family of uncharacterized proteins found by clustering human gut metagenomic sequences. This domain is found associated with a pfam07275-like domain. This suggests that this family may also be involved in evading host restriction.	92
403997	pfam12958	DUF3847	Protein of unknown function (DUF3847). A family of uncharacterized proteins found by clustering human gut metagenomic sequences.	81
403998	pfam12959	DUF3848	Protein of unknown function (DUF3848). A family of uncharacterized proteins found by clustering human gut metagenomic sequences. This domain is frequently seen with DUF3849.	93
403999	pfam12960	DUF3849	Protein of unknown function (DUF3849). A family of uncharacterized proteins found by clustering human gut metagenomic sequences. This domain is frequently seen with DUF3848.	124
404000	pfam12961	DUF3850	Domain of Unknown Function with PDB structure (DUF3850). The search results from NCBI sequence alignment indicate a conserved domain belonging to the ASCH superfamily. Dali search results show that the protein is structurally similar to the PUA domain, suggesting it may be involved in RNA recognition. It has been reported that the deletion of PUA genes results in impaired growth (RluD) and competitive disadvantage (TruB) in Escherichia coli. Suggestions have been put forward that, apart from their usual catalytic role, certain PUS enzymes (e.g. TruB) may also act as chaperones for RNA folding. The interface interaction indicates that the biomolecule of protein NP_809782.1 should be a dimer.	77
289714	pfam12962	DUF3851	Protein of unknown function (DUF3851). A family of uncharacterized proteins found by clustering human gut metagenomic sequences.	126
404001	pfam12963	DUF3852	Protein of unknown function (DUF3852). A family of uncharacterized proteins found by clustering human gut metagenomic sequences. This domain is frequently seen with DUF3848.	107
372411	pfam12964	DUF3853	Protein of unknown function (DUF3853). A family of uncharacterized proteins found by clustering human gut metagenomic sequences.	96
404002	pfam12965	DUF3854	Domain of unknown function (DUF3854). A family of uncharacterized proteins found by clustering human gut metagenomic sequences. This domain is likely to be related to the Toprim domain.	124
404003	pfam12966	AtpR	N-ATPase, AtpR subunit. Membrane protein with three predicted transmembrane segments, two of which contain conserved Arg residues. AtpR genes are found in the N-ATPase (archaeal-type F1-Fo-ATPase) operons and are predicted to interact with the conserved Glu/Asp residues in the c subunits, regulating the assembly and/or function of the membrane-embedded ring of 'c' (proteolipid) subunits (pfam00137).	86
404004	pfam12967	DUF3855	Domain of Unknown Function with PDB structure (DUF3855). Family based on orphan protein (TM0875) from Thermotoga maritima that has been structurally determined as Structure 1O22. The TM0875 gene of Thermotoga maritima encodes a hypothetical protein NP_228683 of unknown function. Analysis of TM0875 genomic context reveals the presence of MMT1 (a predicted Co/Zn/Cd cation transporter) and an inactive homolog of metal-dependent proteases. 1O22 shows weak structural similarity with the phosphoribosylformylglycinamidine synthase 1t4a (Dali Z-scr=4.6), the yggU protein (PDB structure:1n91; with DALI Z-scr=3), and with the thioesterase superfamily member (PDB structure 2cy9 - found using FATCAT), even though they have very low sequence identity.	157
404005	pfam12968	DUF3856	Domain of Unknown Function (DUF3856). TPR-like protein. The 2hr2 structure belongs to the SCOP all alpha class, TPR-like superfamily, CT2138-like family. A DALI search gives hits with the putative peptidyl-prolyl isomerase 2fbn (Z=16), the SGTA protein (Z=16), the PLCR protein 2qfc (Z=16), a putative FK506-binding protein (Structure 1qz2-A; DALI Z-score 15.3; RMSD 2.9; 16% sequence identity within 132 superimposed residues), and with the tetratricopeptide repeats of the protein phosphatase 5 (Structure 2bug; DALI Z-score 15.1; RMSD 2.5; 19% sequence identity within 117 superimposed residues).	142
404006	pfam12969	DUF3857	Domain of Unknown Function with PDB structure (DUF3857). This family is based on the first domain of the PDB structure 3KD4 (residues 1-228). It is structurally similar to domains in other hydrolases, e.g. M1 family aminopeptidase (3ebi, Z=10, rmsd 3.6A for 152 CA, seq id 12%), despite lack of any significant sequence similarity.	131
404007	pfam12970	DUF3858	Domain of Unknown Function with PDB structure (DUF3858). This family is based on the third domain of the PDB structure 3KD4 (residues 410-525). It is structurally similar to part of neuropilin-2 (Z=4.6, rmsd 3.6A for 83 CA, 7% seq id). This domain and the second domain appear to be part of peptide-n-glycanase (1x3w, 2g9f).	116
404008	pfam12971	NAGLU_N	Alpha-N-acetylglucosaminidase (NAGLU) N-terminal domain. Alpha-N-acetylglucosaminidase is a lysosomal enzyme required for the stepwise degradation of heparan sulfate. Mutations in the alpha-N-acetylglucosaminidase (NAGLU) gene can lead to Mucopolysaccharidosis type IIIB (MPS IIIB; or Sanfilippo syndrome type B), characterized by neurological dysfunction but relatively mild somatic manifestations. The structure shows that the enzyme is composed of three domains. This N-terminal domain has an alpha-beta fold.	81
404009	pfam12972	NAGLU_C	Alpha-N-acetylglucosaminidase (NAGLU) C-terminal domain. Alpha-N-acetylglucosaminidase is a lysosomal enzyme required for the stepwise degradation of heparan sulfate. Mutations in the alpha-N-acetylglucosaminidase (NAGLU) gene can lead to Mucopolysaccharidosis type IIIB (MPS IIIB; or Sanfilippo syndrome type B), characterized by neurological dysfunction but relatively mild somatic manifestations. The structure shows that the enzyme is composed of three domains. This C-terminal domain has an all alpha helical fold.	258
404010	pfam12973	Cupin_7	ChrR Cupin-like domain. Members of this family are part of the cupin superfamily. This family includes the transcriptional activator ChrR.	91
404011	pfam12974	Phosphonate-bd	ABC transporter, phosphonate, periplasmic substrate-binding protein. This is a family of periplasmic proteins which are part of the transport system for alkylphosphonate uptake.	241
404012	pfam12975	DUF3859	Domain of unknown function (DUF3859). This short domain is functionally uncharacterized.	127
315622	pfam12976	DUF3860	Domain of Unknown Function with PDB structure (DUF3860). A protein family created to cover Structure 2OD5. 2OD5 is a hypothetical protein (JCVI_PEP_1096688149193) from an environmental metagenome (unidentified marine microbe).	92
404013	pfam12977	DUF3861	Domain of Unknown Function with PDB structure (DUF3861). The 3cjl structure is likely a representative of a new fold with some resemblance to 3-helical bundle folds such as the serum albumin-like fold of SCOP. No significant hits reported by a Dali search. This protein is the first structural representative of a small (about 60 proteins) family of proteins that are found among proteo- and enterobacteria (REF http://www.topsan.org/Proteins/JCSG/3CJL).	88
289729	pfam12978	DUF3862	Domain of Unknown Function with PDB structure (DUF3862). Structure 3D4E shares structural similarity with beta-lactamase inhibitory proteins (BLIP), which already include 1XXM, 1S0W, 1JTG, 2G2U, 2G2W, 2B5R, and 3due. All of these structures are involved in beta-lactamase inhibitor complexes. (REF http://www.topsan.org/Proteins/JCSG/3d4e)	159
372415	pfam12979	DUF3863	Domain of Unknown Function with PDB structure (DUF3863). Domain based on 1-364 domain of Structure 3LM3 which is encoded by the BDI_3119 gene from Parabacteroides distasonis atcc 8503.	349
289731	pfam12980	DUF3864	Domain of Unknown Function with PDB structure (DUF3864). Domain based on 366-449 domain of Structure 3LM3 which is encoded by the BDI_3119 gene from Parabacteroides distasonis atcc 8503.	80
289732	pfam12981	DUF3865	Domain of Unknown Function with PDB structure (DUF3865). Family based on Structure 3B5P, encoded by ZP_00108531 from the nitrogen-fixing cyanobacterium Nostoc punctiforme PCC 73102, a CADD-like protein of unknown function. Superposition between protein structures encoded by CT610 from Chlamydia trachomatis (Structure 1rwc), pyrroloquinoline quinone synthase C (PqqC, Structure 1otv) and ZP_00108531 revealed that putative active sites in CT610 and ZP_00108531 are identical.	224
404014	pfam12982	DUF3866	Protein of unknown function (DUF3866). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 352 and 374 amino acids in length.	317
404015	pfam12983	DUF3867	Protein of unknown function (DUF3867). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 190 amino acids in length.	185
404016	pfam12984	DUF3868	Domain of unknown function, B. Theta Gene description (DUF3868). Based on Bacteroides thetaiotaomicron gene BT_1065, a putative uncharacterized protein. As seen in gene expression experiments (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE2231), it appears to be upregulated in the presence of host or other bacterial species vs when in culture.	102
404017	pfam12985	DUF3869	Domain of unknown function (DUF3869). A family based on the N-terminal domain of 3KOG, which shows weak but consistent remote homology with adhesive families such as immunoglobulins and cadherins, suggesting it might form an attachment module.	97
404018	pfam12986	DUF3870	Domain of unknown function (DUF3870). A family based on the C-terminal domain of 3KOG which shows structural similarity to pore-forming proteins, suggesting it may have a lytic function.	94
404019	pfam12987	DUF3871	Domain of unknown function, B. Theta Gene description (DUF3871). Based on Bacteroides thetaiotaomicron gene BT_2984, a putative uncharacterized protein. As seen in gene expression experiments (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE2231), it appears to be upregulated in the presence of host or other bacterial species vs when in culture.	318
404020	pfam12988	DUF3872	Domain of unknown function, B. Theta Gene description (DUF3872). Based on Bacteroides thetaiotaomicron gene BT_2593, a conserved protein found in a conjugate transposon. As seen in gene expression experiments (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE2231), it appears to be upregulated in the presence of host or other bacterial species vs when in culture.	113
372420	pfam12989	DUF3873	Domain of unknown function, B. Theta Gene description (DUF3873). Based on Bacteroides thetaiotaomicron gene BT_2286, a putative uncharacterized protein. As seen in gene expression experiments (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE2231), it appears to be upregulated in the presence of host or other bacterial species vs when in culture.	68
404021	pfam12990	DUF3874	Domain of unknown function from B. Theta Gene description (DUF3874). Based on Bacteroides thetaiotaomicron gene BT_4228, a putative uncharacterized protein. As seen in gene expression experiments (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE2231), it appears to be upregulated in the presence of host or other bacterial species vs when in culture.	71
404022	pfam12991	DUF3875	Domain of unknown function, B. Theta Gene description (DUF3875). Based on Bacteroides thetaiotaomicron gene BT_4769, a conserved protein found in a conjugate transposon. As seen in gene expression experiments (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE2231), it appears to be upregulated in the presence of host or other bacterial species vs when in culture.	50
404023	pfam12992	DUF3876	Domain of unknown function, B. Theta Gene description (DUF3876). Based on Bacteroides thetaiotaomicron gene BT_0092, a conserved protein found in a conjugate transposon. As seen in gene expression experiments (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE2231), it appears to be upregulated in the presence of host or other bacterial species vs when in culture.	91
404024	pfam12993	DUF3877	Domain of unknown function, E. rectale Gene description (DUF3877). Based on Eubacterium rectale gene EUBREC_0237. As seen in gene expression experiments (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE14737), it appears to be upregulated in the presence of Bacteroides thetaiotaomicron vs when isolated in culture.	173
404025	pfam12994	DUF3878	Domain of unknown function, E. rectale Gene description (DUF3878). Based on Eubacterium rectale gene EUBREC_0973. As seen in gene expression experiments (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE14737), it appears to be upregulated in the presence of Bacteroides thetaiotaomicron vs when isolated in culture.	300
289746	pfam12995	DUF3879	Domain of unknown function, E. rectale Gene description (DUF3879). Based on Eubacterium rectale gene EUBREC_1343. As seen in gene expression experiments (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE14737), it appears to be upregulated in the presence of Bacteroides thetaiotaomicron vs when isolated in culture.	179
404026	pfam12996	DUF3880	DUF based on E. rectale Gene description (DUF3880). Based on Eubacterium rectale gene EUBREC_3218. As seen in gene expression experiments (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE14737), it appears to be upregulated in the presence of Bacteroides thetaiotaomicron vs when isolated in culture.	78
404027	pfam12997	DUF3881	Domain of unknown function, E. rectale Gene description (DUF3881). Based on Eubacterium rectale gene EUBREC_3695. As seen in gene expression experiments (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE14737), it appears to be upregulated in the presence of Bacteroides thetaiotaomicron vs when isolated in culture.	283
404028	pfam12998	ING	Inhibitor of growth proteins N-terminal histone-binding. Histones undergo numerous post-translational modifications, including acetylation and methylation, at residues which are then probable docking sites for various chromatin remodelling complexes. Inhibitor of growth proteins (INGs) specifically bind to residues that have been thus modified. INGs carry a well-characterized C-terminal PHD-type zinc-finger domain, binding with lysine 4-tri-methylated histone H3 (H3K4me3), as well as this N-terminal domain that binds unmodified H3 tails. Although these two regions can bind histones independently, together they increase the apparent association of the ING for the H3 tail.	100
372423	pfam12999	PRKCSH-like	Glucosidase II beta subunit-like. The sequences found in this family are similar to a region found in the beta-subunit of glucosidase II, which is also known as protein kinase C substrate 80K-H (PRKCSH). The enzyme catalyzes the sequential removal of two alpha-1,3-linked glucose residues in the second step of N-linked oligosaccharide processing. The beta subunit is required for the solubility and stability of the heterodimeric enzyme, and is involved in retaining the enzyme within the endoplasmic reticulum.	176
372424	pfam13000	Acatn	Acetyl-coenzyme A transporter 1. The mouse Acatn is a 61 kDa hydrophobic protein with six to 10 transmembrane domains. It appears to promote 9-O-acetylation in gangliosides.	543
404029	pfam13001	Ecm29	Proteasome stabilizer. The proteasome consists of two subunits, and the capacity of the proteasome to degrade protein depends crucially on the interaction between these two subunits. This interaction is affected by a wide range of factors including metabolites, such as ATP, and proteasome-associated proteins such as Ecm29. Ecm29 stabilizes the interaction between the two subunits.	494
404030	pfam13002	LDB19	Arrestin_N terminal like. This is a family of proteins related to the Arrestin_N terminal family.	183
404031	pfam13004	BACON	Putative binding domain, N-terminal. The BACON (Bacteroidetes-Associated Carbohydrate-binding Often N-terminal) domain is an all-beta domain found in diverse architectures, principally in combination with carbohydrate-active enzymes and proteases. These architectures suggest a carbohydrate-binding function which is also supported by the nature of BACON's few conserved amino-acids. The phyletic distribution of BACON and other data tentatively suggest that it may frequently function to bind mucin. Further work with the characterized structure of a member of glycoside hydrolase family 5 enzyme, Structure 3ZMR, has found no evidence for carbohydrate-binding for this domain.	61
404032	pfam13005	zf-IS66	zinc-finger binding domain of transposase IS66. This is a zinc-finger region of the N-terminus of the insertion element IS66 transposase.	46
404033	pfam13006	Nterm_IS4	Insertion element 4 transposase N-terminal. This family represents the N-terminal region of proteins carrying the transposase enzyme, DDE_Tnp_1 (that was Transposase_11), pfam01609, at the C-terminus. The full-length members are Insertion Element 4, IS4. Within the collection of E.coli strains, ECOR, the number of IS4 elements varies from zero to 14, with an average of 5 copies/strain.	95
404034	pfam13007	LZ_Tnp_IS66	Transposase C of IS166 homeodomain. This is a leucine-zipper-like or homeodomain-like region of transposase TnpC of insertion element IS66.	68
372427	pfam13008	zf-Paramyx-P	Zinc-binding domain of Paramyxoviridae V protein. The Paramyxoviridae, which include such respiroviruses as para-influenzae and measles, produce phosphoproteins - protein P - that are integral to the polymerase transcription-replication complex. Protein P consists of two functionally distinct moieties, an N-terminal PNT, and a C-terminal PCT. The P gene region transcribes proteins from all three ORFs, and the V protein consists of the PNT moiety and a more C-terminal 2-zinc-binding domain. This conserved region consists of the two-zinc-binding section sandwiched between beta sheets 6 and 7 of the overall V protein. It is the binding of this core domain of V protein with the DDB1 protein (part of the ubiquitin-ligase complex) of eukaryotes which represents the key element of the virus-host protein interaction. In the Henipavirus family which includes Nipah and Hendra viruses, the V protein is able to block IFN (interferon) signalling by preventing IFN-induced STAT phosphorylation and nuclear translocation. The P gene of morbillivirus is co-transcriptionally edited leading to a V protein being produced.	45
404035	pfam13009	Phage_Integr_2	Putative phage integrase. This family is found in association with IS elements.	323
289758	pfam13010	pRN1_helical	Primase helical domain. This alpha helical domain is found in a set of bacterial plasmid replication proteins. The domain is found to the C-terminus of the primase/polymerase domain. Mutants of this domain are defective in template binding, dinucleotide formation and conformation change prior to DNA extension.	138
289759	pfam13011	LZ_Tnp_IS481	leucine-zipper of insertion element IS481. This is the upstream region of the conjoined ORF AB of insertion element 481. The significance of IS481 in the detection of Bordetella pertussis is discussed in. The B portion of the ORF AB carries the transposase activity in family rve, pfam00665.	85
404036	pfam13012	MitMem_reg	Maintenance of mitochondrial structure and function. This is C-terminal to the Mov24 region of the yeast proteasomal subunit Rpn11 and seems likely to regulate the mitochondrial fission and tubulation processes, ie the outer mitochondrial membrane proteins. This function appears to be unrelated to the proteasome activity of the N-terminal region.	72
404037	pfam13013	F-box-like_2	F-box-like domain. The F-box domain has a role in mediating protein-protein interactions in a variety of contexts, such as polyubiquitination, transcription elongation, centromere binding and translational repression.	107
404038	pfam13015	PRKCSH_1	Glucosidase II beta subunit-like protein. The sequences found in this family are similar to a region found in the beta-subunit of glucosidase II, which is also known as protein kinase C substrate 80K-H (PRKCSH). The enzyme catalyzes the sequential removal of two alpha-1,3-linked glucose residues in the second step of N-linked oligosaccharide processing. The beta subunit is required for the solubility and stability of the heterodimeric enzyme, and is involved in retaining the enzyme within the endoplasmic reticulum. The beta-subunit confers substrate specificity for di- and monoglucosylated glycans on the glucose-trimming activity of the alpha-subunit.	154
404039	pfam13016	Gliadin	Cys-rich Gliadin N-terminal. This is a cysteine-rich N-terminal region of gliadin and avenin plant proteins. The exact function is not known.	75
404040	pfam13017	Maelstrom	piRNA pathway germ-plasm component. Maelstrom is a germ-plasm component protein, that is shown to be functionally involved in the piRNA pathway. It is conserved throughout Eukaryota, though it appears to have been lost from all examined teleost fish species. The domain architecture shows that it is coupled with several DNA- and RNA- related domains such as HMG box, SR-25-like and HDAC_interact domains. Sequence analysis and fold recognition have found a distant similarity between Maelstrom domain and the DnaQ 3'-5' exonuclease family with the RNase H fold (Exonuc_X-T, pfam00929); notably, that the Maelstrom domains from basal eukaryotes contain the conserved 3'-5' exonuclease active site residues (Asp-Glu-Asp-His-Asp, DEDHD). However, the animal and some amoeba maelstrom contain another set of conserved residues (Glu-His-His-Cys-His-Cys, EHHCHC). This evolutionary link together with structural examinations leads to the hypothesis that Maelstrom domains may have a potential nuclease-transposase activity or RNA-binding ability that may be implicated in piRNA biogenesis. A protein function evolution mode, namely "active site switch", has been proposed, in which the amoeba Maelstrom domains are the possible evolutionary intermediates due to their harbouring of the specific characteristics of both 3'-5' exonuclease and Maelstrom domains.	212
404041	pfam13018	ESPR	Extended Signal Peptide of Type V secretion system. This conserved domain is called ESPR for Extended Signal Peptide Region. It is present at the N-terminus of the signal peptides of proteins belonging to the Type V secretion systems, including the autotransporters (T5aSS), TpsA exoproteins of the two-partner system (T5bSS) and trimeric autotransporters (TAAs). So far, the ESPR is present only in Gram-negative bacterial proteins originating from the classes Beta- and Gamma-proteobacteria. ESPR severely impairs inner membrane translocation, suggesting that it adopts a particular conformation or it interacts with a cytoplasmic or inner membrane co-factor, prior to exportation. Deletion of ESPR causes mis-folding of the TAAs passenger domain in the periplasm, substantially impairing its translocation across the outer membrane.	24
372432	pfam13019	Telomere_Sde2	Telomere stability and silencing. Sde2 has been identified in fission yeast as an important factor in telomere formation and maintenance. This is a more N-terminal domain on these nuclear proteins, and is essential for telomeric silencing and genomic stability.	165
404042	pfam13020	DUF3883	Domain of unknown function (DUF3883). This domain is uncharacterized. It is found on restriction endonucleases.	91
404043	pfam13021	DUF3885	Domain of unknown function (DUF3885). A putative Rac prophage DNA binding protein. This domain family is found in bacteria, and is approximately 40 amino acids in length. There is a conserved YDDRG sequence motif. There is a single completely conserved residue D that may be functionally important.	38
404044	pfam13022	HTH_Tnp_1_2	Helix-turn-helix of insertion element transposase. This is a family of largely phage proteins which are likely to be helix-turn-helix insertion elements.	122
404045	pfam13023	HD_3	HD domain. HD domains are metal dependent phosphohydrolases.	163
338588	pfam13024	DUF3884	Protein of unknown function (DUF3884). This family of proteins is functionally uncharacterized. However several proteins are annotated as Tagatose 1,6-diphosphate aldolase, but evidence to support this could not be found. This family of proteins is found in bacteria. Proteins in this family are typically between 61 and 106 amino acids in length. There are two completely conserved residues (Y and F) that may be functionally important.	73
315655	pfam13025	DUF3886	Protein of unknown function (DUF3886). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 90 amino acids in length. There are two completely conserved L residues that may be functionally important.	68
404046	pfam13026	DUF3887	Protein of unknown function (DUF3887). This domain family is found in bacteria and archaea, and is approximately 90 amino acids in length.	91
404047	pfam13027	DUF3888	Protein of unknown function (DUF3888). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 111 and 149 amino acids in length.	87
404048	pfam13028	DUF3889	Protein of unknown function (DUF3889). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 110 amino acids in length. There are two completely conserved residues (A and Y) that may be functionally important.	84
289775	pfam13029	DUF3890	Domain of unknown function (DUF3890). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 70 amino acids in length.	84
404049	pfam13030	DUF3891	Protein of unknown function (DUF3891). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are approximately 250 amino acids in length.	215
404050	pfam13031	DUF3892	Protein of unknown function (DUF3892). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 87 and 104 amino acids in length.	70
404051	pfam13032	DUF3893	Domain of unknown function (DUF3893). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is typically between 123 and 144 amino acids in length. There is a single completely conserved residue E that may be functionally important.	288
372437	pfam13033	DUF3894	Protein of unknown function (DUF3894). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 66 and 79 amino acids in length. There are two conserved sequence motifs: FNIC and MALLNLT.	54
372438	pfam13034	DUF3895	Protein of unknown function (DUF3895). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 110 amino acids in length. There are two completely conserved residues (Y and L) that may be functionally important.	76
315663	pfam13035	DUF3896	Protein of unknown function (DUF3896). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length.	61
404052	pfam13036	LpoB	Peptidoglycan-synthase activator LpoB. This is a family of Gram-negative bacterial outer membrane lipoproteins. LpoB is required for the function of the major peptidoglycan synthase enzyme PBP1B. It interacts with PBP1B protein via the UvrB-like non-catalytic domain on that protein. LpoB has a 54-aa-long flexible N-terminal stretch followed by a globular domain with similarity to the N-terminal domain of the prevalent periplasmic protein TolB. The long, flexible N-terminal region of LpoB enables it to span the periplasm and reach its docking site in PBP1B. Peptidoglycan is the essential polymer within the sacculus that surrounds the cytoplasmic membrane of bacteria.	147
404053	pfam13037	DUF3898	Domain of unknown function (DUF3898). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 90 amino acids in length. There are two conserved sequence motifs: DFG and FEKG.	89
404054	pfam13038	DUF3899	Domain of unknown function (DUF3899). Putative Tryptophanyl-tRNA synthetase.	83
372442	pfam13039	DUF3900	Protein of unknown function (DUF3900). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 360 amino acids in length.	249
379027	pfam13040	Fur_reg_FbpB	Fur-regulated basic protein B. This family of proteins is regulated by the ferric uptake regulator protein Fur. This family represses expression of the lutABC operon encoding iron sulfur-containing enzymes necessary for growth on lactate.	39
404055	pfam13041	PPR_2	PPR repeat family. This repeat has no known function. It is about 35 amino acids long and is found in up to 18 copies in some proteins. The family appears to be greatly expanded in plants and fungi. The repeat has been called PPR.	50
289787	pfam13042	DUF3902	Protein of unknown function (DUF3902). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 170 amino acids in length. There is a conserved LGI sequence motif.	161
404056	pfam13043	DUF3903	Domain of unknown function (DUF3903). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 40 amino acids in length.	40
289789	pfam13044	Fusion_F0	Fusion glycoprotein F0, Isavirus. Fusion between viral and cellular membranes is mediated by viral membrane fusion glycoproteins. This entry represents fusion glycoprotein F0 from the infectious salmon anemia virus (ISAV). The precursor protein F0 is proteolytically cleaved to F1 and F2, which are held together by disulphide bridges.	436
315671	pfam13045	DUF3905	Protein of unknown function (DUF3905). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 110 amino acids in length.	84
289791	pfam13046	DUF3906	Protein of unknown function (DUF3906). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 70 amino acids in length. There is a conserved EKK sequence motif.	64
379028	pfam13047	DUF3907	Protein of unknown function (DUF3907). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 160 amino acids in length. There is a conserved AYTG sequence motif.	146
315673	pfam13048	DUF3908	Protein of unknown function (DUF3908). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and viruses. Proteins in this family are approximately 140 amino acids in length. There is a single completely conserved residue Y that may be functionally important.	134
404057	pfam13049	DUF3910	Protein of unknown function (DUF3910). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 100 amino acids in length.	93
404058	pfam13050	DUF3911	Protein of unknown function (DUF3911). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 80 amino acids in length.	77
404059	pfam13051	DUF3912	Protein of unknown function (DUF3912). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 80 amino acids in length.	92
315677	pfam13052	DUF3913	Protein of unknown function (DUF3913). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length.	57
372448	pfam13053	DUF3914	Protein of unknown function (DUF3914). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 110 amino acids in length. There are two conserved sequence motifs: KFDIR and DLW.	89
315678	pfam13054	DUF3915	Protein of unknown function (DUF3915). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 120 amino acids in length.	126
404060	pfam13055	DUF3917	Protein of unknown function (DUF3917). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 90 amino acids in length.	71
289801	pfam13056	DUF3918	Protein of unknown function (DUF3918). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 40 amino acids in length. There are two completely conserved residues (G and R) that may be functionally important.	43
404061	pfam13057	DUF3919	Protein of unknown function (DUF3919). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 251 and 262 amino acids in length. There is a conserved YLNG sequence motif.	227
372451	pfam13058	DUF3920	Protein of unknown function (DUF3920). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 140 amino acids in length.	126
404062	pfam13059	DUF3922	Protein of unknown function (DUF3922). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 87 and 98 amino acids in length.	79
289805	pfam13060	DUF3921	Protein of unknown function (DUF3921). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length.	58
404063	pfam13061	DUF3923	Protein of unknown function (DUF3923). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 80 amino acids in length.	65
289807	pfam13062	DUF3924	Protein of unknown function (DUF3924). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length.	62
372453	pfam13063	DUF3925	Protein of unknown function (DUF3925). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 70 amino acids in length.	65
372454	pfam13064	DUF3927	Protein of unknown function (DUF3927). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and viruses. Proteins in this family are approximately 50 amino acids in length. There is a conserved SVL sequence motif. There is a single completely conserved residue D that may be functionally important.	53
372455	pfam13065	DUF3928	Protein of unknown function (DUF3928). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 100 amino acids in length.	95
372456	pfam13066	DUF3929	Protein of unknown function (DUF3929). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 70 amino acids in length.	64
404064	pfam13067	DUF3930	Protein of unknown function (DUF3930). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 51 and 67 amino acids in length.	51
372458	pfam13068	DUF3932	Protein of unknown function (DUF3932). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 80 amino acids in length.	81
404065	pfam13069	DUF3933	Protein of unknown function (DUF3933). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length.	53
289815	pfam13070	DUF3934	Protein of unknown function (DUF3934). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 40 amino acids in length. There are two conserved sequence motifs: GTG and SKG.	40
372460	pfam13071	DUF3935	Protein of unknown function (DUF3935). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 90 amino acids in length. There are two conserved sequence motifs: FVF and LGV.	70
289817	pfam13072	DUF3936	Protein of unknown function (DUF3936). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 40 amino acids in length. There is a conserved GKAW sequence motif. There is a single completely conserved residue G that may be functionally important.	37
372461	pfam13073	DUF3937	Protein of unknown function (DUF3937). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 80 amino acids in length.	72
372462	pfam13074	DUF3938	Protein of unknown function (DUF3938). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 130 amino acids in length.	98
289820	pfam13075	DUF3939	Protein of unknown function (DUF3939). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 150 amino acids in length.	133
315692	pfam13076	Fur_reg_FbpA	Fur-regulated basic protein A. This family of proteins is regulated by the ferric uptake regulator protein Fur. This family does not regulate the lutABC operon encoding iron sulfur-containing enzymes necessary for growth on lactate.	36
404066	pfam13077	DUF3909	Protein of unknown function (DUF3909). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 110 amino acids in length.	108
289823	pfam13078	DUF3942	Protein of unknown function (DUF3942). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 130 amino acids in length.	137
404067	pfam13079	DUF3916	Protein of unknown function (DUF3916). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 170 amino acids in length. There is a single completely conserved residue S that may be functionally important.	147
404068	pfam13080	DUF3926	Protein of unknown function (DUF3926). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 46 and 63 amino acids in length. There is a single completely conserved residue P that may be functionally important.	43
289826	pfam13081	DUF3941	Domain of unknown function (DUF3941). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 30 amino acids in length. There is a conserved YSK sequence motif.	24
289827	pfam13082	DUF3931	Protein of unknown function (DUF3931). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 80 amino acids in length.	66
404069	pfam13083	KH_4	KH domain. 	73
404070	pfam13084	DUF3943	Domain of unknown function (DUF3943). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 110 amino acids in length.	108
404071	pfam13085	Fer2_3	2Fe-2S iron-sulfur cluster binding domain. The 2Fe-2S ferredoxin family has a general core structure consisting of beta(2)-alpha-beta(2), which is a beta-grasp type fold. The domain is around one hundred amino acids with four conserved cysteine residues to which the 2Fe-2S cluster is ligated.	106
404072	pfam13086	AAA_11	AAA domain. This family of domains contains a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins.	248
404073	pfam13087	AAA_12	AAA domain. This family of domains contains a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins.	196
404074	pfam13088	BNR_2	BNR repeat-like domain. This family of proteins contains BNR-like repeats suggesting these proteins may act as sialidases.	280
404075	pfam13089	PP_kinase_N	Polyphosphate kinase N-terminal domain. Polyphosphate kinase (Ppk) catalyzes the formation of polyphosphate from ATP, with chain lengths of up to a thousand or more orthophosphate molecules.	106
404076	pfam13090	PP_kinase_C	Polyphosphate kinase C-terminal domain. Polyphosphate kinase (Ppk) catalyzes the formation of polyphosphate from ATP, with chain lengths of up to a thousand or more orthophosphate molecules. This C-terminal domain has a structure similar to phospholipase D.	172
404077	pfam13091	PLDc_2	PLD-like domain. 	132
404078	pfam13092	CENP-L	Kinetochore complex Sim4 subunit Fta1. CENP-L is one of the components that assembles onto the CENP-A-nucleosome distal (CAD) centromere. The centromere, which is the basic element of chromosome inheritance, is epigenetically determined in mammals. CENP-A, the centromere-specific histone H3 variant, assembles an array of nucleosomes and it is this that seems to be the prime candidate for specifying centromere identity. CENP-A nucleosomes directly recruit a proximal CENP-A nucleosome associated complex (NAC) comprised of CENP-M, CENP-N and CENP-T, CENP-U(50), CENP-C and CENP-H. Assembly of the CENP-A NAC at centromeres is dependent on CENP-M, CENP-N and CENP-T. Additionally, there are seven other subunits which make up the CENP-A-nucleosome distal (CAD) centromere, CENP-K, CENP-L, CENP-O, CENP-P, CENP-Q, CENP-R and CENP-S, also assembling on the CENP-A NAC. Fta1 is the equivalent component of the fission yeast Sim4 complex.	158
404079	pfam13093	FTA4	Kinetochore complex Fta4 of Sim4 subunit, or CENP-50. Fission yeast has three kinetochore protein complexes. Two complexes, Sim4 and Ndc80-MIND-Spc7 (NMS), are constitutive components, whereas the third complex, DASH, is transiently associated with kinetochores only in mitosis and is required for precise chromosome segregation. The Sim4 complex functions as a loading dock for the DASH complex. Sim4 consists of a number of different proteins including Ftas 1-7 and Dad1.	199
404080	pfam13094	CENP-Q	CENP-Q, a CENPA-CAD centromere complex subunit. CENP-Q is one of the components that assembles onto the CENPA-nucleosome distal (CAD) centromere. The centromere, which is the basic element of chromosome inheritance, is epigenetically determined in mammals. CENP-A, the centromere-specific histone H3 variant, assembles an array of nucleosomes and it is this that seems to be the prime candidate for specifying centromere identity. CENPA nucleosomes directly recruit a proximal CENPA-nucleosome-associated complex (NAC) comprised of CENP-M, CENP-N and CENP-T, CENP-U(50), CENP-C and CENP-H. Assembly of the CENPA NAC at centromeres is dependent on CENP-M, CENP-N and CENP-T. Additionally, there are seven other subunits which make up the CENPA-nucleosome distal (CAD) centromere, CENP-K, CENP-L, CENP-O, CENP-P, CENP-Q, CENP-R and CENP-S, also assembling on the CENP-A NAC. Fta7 is the equivalent component of the fission yeast Sim4 complex.	158
404081	pfam13095	FTA2	Kinetochore Sim4 complex subunit FTA2. Fission yeast has three kinetochore protein complexes. Two complexes, Sim4 and Ndc80-MIND-Spc7 (NMS), are constitutive components, whereas the third complex, DASH, is transiently associated with kinetochores only in mitosis and is required for precise chromosome segregation. The Sim4 complex functions as a loading dock for the DASH complex. Sim4 consists of a number of different proteins including Ftas 1-7 and Dad1. The equivalent higher eukaryotic protein is CENP-P. The centromere, which is the basic element of chromosome inheritance, is epigenetically determined in mammals. CENP-A, the centromere-specific histone H3 variant, assembles an array of nucleosomes and it is this that seems to be the prime candidate for specifying centromere identity. CENP-A nucleosomes directly recruit a proximal CENP-A nucleosome associated complex (NAC) comprised of CENP-M, CENP-N and CENP-T, CENP-U(50), CENP-C and CENP-H. Assembly of the CENP-A NAC at centromeres is dependent on CENP-M, CENP-N and CENP-T. Additionally, there are seven other subunits which make up the CENP-A-nucleosome distal (CAD) centromere, CENP-K, CENP-L, CENP-O, CENP-P, CENP-Q, CENP-R and CENP-S, also assembling on the CENP-A NAC.	204
289841	pfam13096	CENP-P	CENP-A-nucleosome distal (CAD) centromere subunit, CENP-P. CENP-P is one of the components that assembles onto the CENP-A-nucleosome distal (CAD) centromere. The centromere, which is the basic element of chromosome inheritance, is epigenetically determined in mammals. CENP-A, the centromere-specific histone H3 variant, assembles an array of nucleosomes and it is this that seems to be the prime candidate for specifying centromere identity. CENP-A nucleosomes directly recruit a proximal CENP-A nucleosome associated complex (NAC) comprised of CENP-M, CENP-N and CENP-T, CENP-U(50), CENP-C and CENP-H. Assembly of the CENP-A NAC at centromeres is dependent on CENP-M, CENP-N and CENP-T. Additionally, there are seven other subunits which make up the CENP-A-nucleosome distal (CAD) centromere, CENP-K, CENP-L, CENP-O, CENP-P, CENP-Q, CENP-R and CENP-S, also assembling on the CENP-A NAC. Fta7 is the equivalent component of the fission yeast Sim4 complex.	177
404082	pfam13097	CENP-U	CENP-A nucleosome associated complex (NAC) subunit. CENP-U is one of the components that assembles onto the CENP-A-nucleosome associated complex (NAC). The centromere, which is the basic element of chromosome inheritance, is epigenetically determined in mammals. CENP-A, the centromere-specific histone H3 variant, assembles an array of nucleosomes and it is this that seems to be the prime candidate for specifying centromere identity. CENP-A nucleosomes directly recruit a proximal CENP-A nucleosome associated complex (NAC) comprised of CENP-M, CENP-N and CENP-T, CENP-U(50), CENP-C and CENP-H. Assembly of the CENP-A NAC at centromeres is dependent on CENP-M, CENP-N and CENP-T. Additionally, there are seven other subunits which make up the CENP-A-nucleosome distal (CAD) centromere, CENP-K, CENP-L, CENP-O, CENP-P, CENP-Q, CENP-R and CENP-S, also assembling on the CENP-A NAC. FTA4 is the equivalent component of the fission yeast Sim4 complex.	175
379034	pfam13098	Thioredoxin_2	Thioredoxin-like domain. 	103
289844	pfam13099	DUF3944	Domain of unknown function (DUF3944). This short domain is sometimes found N terminal to pfam03981.	35
404083	pfam13100	OstA_2	OstA-like protein. This is a family of OstA-like proteins that are related to pfam03968.	158
404084	pfam13101	DUF3945	Protein of unknown function (DUF3945). A family of uncharacterized proteins found by clustering human gut metagenomic sequences. This is a C-terminal repeated region.	59
404085	pfam13102	Phage_int_SAM_5	Phage integrase SAM-like domain. A family of uncharacterized proteins found by clustering human gut metagenomic sequences. This family appears related to the N-terminal domain of phage integrases.	99
404086	pfam13103	TonB_2	TonB C terminal. This family contains TonB members that are not captured by pfam03544.	85
289849	pfam13104	DUF3956	Protein of unknown function (DUF3956). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 40 amino acids in length.	45
404087	pfam13105	DUF3959	Protein of unknown function (DUF3959). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 260 amino acids in length.	241
372478	pfam13106	DUF3961	Domain of unknown function (DUF3961). This presumed domain is functionally uncharacterized. This domain family is found in bacteria and viruses, and is approximately 40 amino acids in length.	39
289852	pfam13107	DUF3964	Protein of unknown function (DUF3964). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 110 amino acids in length. There are two conserved sequence motifs: FYF and AFW.	109
404088	pfam13108	DUF3969	Protein of unknown function (DUF3969). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 110 amino acids in length.	102
404089	pfam13109	AsmA_1	AsmA-like C-terminal region. This family is similar to the C-terminal of the AsmA protein of E. coli.	213
404090	pfam13110	DUF3966	Protein of unknown function (DUF3966). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 58 and 86 amino acids in length.	42
404091	pfam13111	DUF3962	Protein of unknown function (DUF3962). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 233 and 796 amino acids in length. There is a conserved FSY sequence motif.	397
289857	pfam13112	DUF3965	Protein of unknown function (DUF3965). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 380 amino acids in length.	291
372483	pfam13113	DUF3970	Protein of unknown function (DUF3970). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length. There is a conserved NPKY sequence motif.	55
404092	pfam13114	RecO_N_2	RecO N terminal. This entry contains members that are not captured by pfam11967.	71
404093	pfam13115	YtkA	YtkA-like. 	86
404094	pfam13116	DUF3971	Protein of unknown function. Some members of this family are related to the AsmA family proteins.	288
404095	pfam13117	Cag12	Cag pathogenicity island protein Cag12. This is a Proteobacterial family of Cag pathogenicity island proteins.	92
372487	pfam13118	DUF3972	Protein of unknown function (DUF3972). This is a Proteobacterial family of unknown function. Some of the proteins in this family are annotated as being kinesin-like proteins.	125
404096	pfam13119	DUF3973	Domain of unknown function (DUF3973). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 40 amino acids in length. There is a conserved YCI sequence motif.	40
372489	pfam13120	DUF3974	Domain of unknown function (DUF3974). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 130 amino acids in length.	126
404097	pfam13121	DUF3976	Domain of unknown function (DUF3976). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 40 amino acids in length.	40
289867	pfam13122	DUF3977	Protein of unknown function (DUF3977). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 80 amino acids in length.	77
372491	pfam13123	DUF3978	Protein of unknown function (DUF3978). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 150 amino acids in length.	144
289869	pfam13124	DUF3963	Protein of unknown function (DUF3963). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 42 and 85 amino acids in length. There is a conserved DIQKW sequence motif.	40
404098	pfam13125	DUF3958	Protein of unknown function (DUF3958). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 120 amino acids in length. There are two conserved sequence motifs: RLF and TWH.	107
315729	pfam13126	DUF3975	Protein of unknown function (DUF3975). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 90 amino acids in length.	80
404099	pfam13127	DUF3955	Protein of unknown function (DUF3955). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 68 and 87 amino acids in length. There are two completely conserved residues (G and E) that may be functionally important.	59
315731	pfam13128	DUF3954	Protein of unknown function (DUF3954). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and viruses. Proteins in this family are approximately 60 amino acids in length.	49
404100	pfam13129	DUF3953	Protein of unknown function (DUF3953). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 47 and 76 amino acids in length.	40
315733	pfam13130	DUF3952	Domain of unknown function (DUF3952). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 110 amino acids in length. There is a conserved VMSAS sequence motif.	101
315734	pfam13131	DUF3951	Protein of unknown function (DUF3951). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 56 and 71 amino acids in length. There is a conserved YTP sequence motif.	52
404101	pfam13132	DUF3950	Domain of unknown function (DUF3950). This presumed domain is functionally uncharacterized. This domain family is found in bacteria and viruses, and is approximately 30 amino acids in length. There is a conserved NFS sequence motif.	30
315735	pfam13133	DUF3949	Protein of unknown function (DUF3949). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 69 and 87 amino acids in length.	60
315736	pfam13134	DUF3948	Protein of unknown function (DUF3948). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 40 amino acids in length.	35
289880	pfam13135	DUF3947	Protein of unknown function (DUF3947). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 80 amino acids in length.	91
372493	pfam13136	DUF3984	Protein of unknown function (DUF3984). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 393 and 442 amino acids in length.	325
289882	pfam13137	DUF3983	Protein of unknown function (DUF3983). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and viruses. Proteins in this family are approximately 40 amino acids in length. There is a conserved AWRN sequence motif.	34
404102	pfam13138	DUF3982	Protein of unknown function (DUF3982). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 47 and 73 amino acids in length. There are two conserved sequence motifs: EKL and EIP.	35
372494	pfam13139	DUF3981	Domain of unknown function (DUF3981). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 110 amino acids in length.	115
289885	pfam13140	DUF3980	Domain of unknown function (DUF3980). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 90 amino acids in length.	87
404103	pfam13141	DUF3979	Protein of unknown function (DUF3979). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 120 amino acids in length.	115
372496	pfam13142	DUF3960	Domain of unknown function (DUF3960). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is typically between 72 and 89 amino acids in length.	89
404104	pfam13143	DUF3986	Protein of unknown function (DUF3986). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 100 amino acids in length.	87
404105	pfam13144	ChapFlgA	Chaperone for flagella basal body P-ring formation. ChapFlgA is a family similar to the SAF family, and includes chaperones for flagellar basal-body proteins and pilus-assembly proteins, FlgA, RcpB and CpaB. ChapFlgA is necessary for the formation of the P-ring of the flagellum, FlgI, which sits in the peptidoglycan layer of the outer membrane of the bacterium. FlgA plays an auxiliary role in P-ring assembly.	122
404106	pfam13145	Rotamase_2	PPIC-type PPIASE domain. 	121
404107	pfam13146	TRL	TRL-like protein family. This family includes the TRL protein that is found in a locus that includes several tRNAs. The function of this protein is not known. The proteins in this family usually have a lipoprotein attachment site at their N-terminus.	77
404108	pfam13148	DUF3987	Protein of unknown function (DUF3987). A family of uncharacterized proteins found by clustering human gut metagenomic sequences.	365
404109	pfam13149	Mfa_like_1	Fimbrillin-like. A family of putative fimbrillin proteins found by clustering human gut metagenomic sequences. Analysis of structural comparisons shows this family to be part of the FimbA (CL0450) superfamily of adhesin components or fimbrillins.	244
404110	pfam13150	DUF3989	Protein of unknown function (DUF3989). A family of uncharacterized proteins found by clustering human gut metagenomic sequences.	86
404111	pfam13151	DUF3990	Protein of unknown function (DUF3990). A family of uncharacterized proteins found by clustering human gut metagenomic sequences.	151
289896	pfam13152	DUF3967	Protein of unknown function (DUF3967). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 173 and 249 amino acids in length.	35
404112	pfam13153	DUF3985	Protein of unknown function (DUF3985). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 50 amino acids in length.	44
404113	pfam13154	DUF3991	Protein of unknown function (DUF3991). This family of proteins is often associated with family Toprim, pfam01751.	73
404114	pfam13155	Toprim_2	Toprim-like. This is a family of Toprim-like proteins.	87
404115	pfam13156	Mrr_cat_2	Restriction endonuclease. Prokaryotic family found in type II restriction enzymes containing the hallmark (D/E)-(D/E)XK active site. Presence of catalytic residues implicates this region in the enzymatic cleavage of DNA.	127
372502	pfam13157	DUF3992	Protein of unknown function (DUF3992). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 98 and 122 amino acids in length. There is a single completely conserved residue T that may be functionally important.	88
372503	pfam13158	DUF3993	Protein of unknown function (DUF3993). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 160 amino acids in length.	118
372504	pfam13159	DUF3994	Domain of unknown function (DUF3994). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is typically between 97 and 111 amino acids in length.	99
404116	pfam13160	DUF3995	Protein of unknown function (DUF3995). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 138 and 149 amino acids in length. There are two completely conserved residues (W and P) that may be functionally important.	124
315755	pfam13161	DUF3996	Protein of unknown function (DUF3996). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 172 and 203 amino acids in length.	154
404117	pfam13162	DUF3997	Protein of unknown function (DUF3997). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 140 amino acids in length.	107
404118	pfam13163	DUF3999	Protein of unknown function (DUF3999). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 440 and 470 amino acids in length. There is a single completely conserved residue D that may be functionally important.	421
404119	pfam13164	Diedel	Diedel. Diedel (die) was identified as an insect immune response protein. It is up-regulated after a septic injury and may act as a negative regulator of the JAK/STAT signalling pathway. Its homologs can be found in Drosophila and Acyrtosiphon pisum. Interestingly, the orthologues of the die gene are present in the genome of insect DNA viruses of the Baculoviridae and Ascoviridae families. The viral homologs suppress the immune deficiency (IMD) pathway in Drosophila.	75
404120	pfam13165	SCIFF	Six-cysteine peptide SCIFF. Members of this protein family are essentially universal in the class Clostridia and therefore highly abundant in the human gut microbiome. This short peptide is designated SCIFF, for Six Cysteines in Forty-Five residues. It is a presumed ribosomal natural product precursor, always found associated with a yet-uncharacterized radical SAM protein that resembles other peptide modification radical SAM enzymes and is designated SCIFF radical SAM maturase.	43
372509	pfam13166	AAA_13	AAA domain. This family of domains contains a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. This family includes the PrrC protein that is thought to be the active component of the anticodon nuclease.	712
404121	pfam13167	GTP-bdg_N	GTP-binding GTPase N-terminal. This is the N-terminal region of GTP-binding HflX-like proteins. The full-length members bind and interact with the 50S ribosome and are GTPases, hydrolysing GTP/GDP/ATP/ADP. This N-terminal region is necessary for stability of the whole protein.	87
289912	pfam13168	Poxvirus_B22R_C	Poxvirus B22R protein C-terminal. This is the highly conserved C-terminal region of poxvirus proteins from, e.g., Fowlpox virus, Myxoma virus, Lumpy skin disease virus, Variola virus and other members of the Poxviridae family of double-stranded, no-RNA stage poxviruses.	195
289913	pfam13169	Poxvirus_B22R_N	Poxvirus B22R protein N-terminal. This is the highly conserved N-terminal region of poxvirus proteins from, e.g., Fowlpox virus, Myxoma virus, Lumpy skin disease virus, Variola virus and other members of the Poxviridae family of double-stranded, no-RNA stage poxviruses.	88
404122	pfam13170	DUF4003	Protein of unknown function (DUF4003). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 327 and 345 amino acids in length.	296
404123	pfam13171	DUF4004	Protein of unknown function (DUF4004). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 210 amino acids in length.	196
404124	pfam13173	AAA_14	AAA domain. This family of domains contains a P-loop motif that is characteristic of the AAA superfamily.	129
404125	pfam13174	TPR_6	Tetratricopeptide repeat. 	33
404126	pfam13175	AAA_15	AAA ATPase domain. This family of domains contains a P-loop motif that is characteristic of the AAA superfamily.	392
372510	pfam13176	TPR_7	Tetratricopeptide repeat. 	36
404127	pfam13177	DNA_pol3_delta2	DNA polymerase III, delta subunit. DNA polymerase III, delta subunit (EC 2.7.7.7) is required, along with the delta' subunit, for the assembly of the processivity factor beta(2) onto primed DNA in the DNA polymerase III holoenzyme-catalyzed reaction. The delta subunit is also known as HolA.	161
404128	pfam13178	DUF4005	Protein of unknown function (DUF4005). This is a C-terminal region of plant IQ-containing putative calmodulin-binding proteins.	97
404129	pfam13179	DUF4006	Family of unknown function (DUF4006). This is a family of short, approx 65 residue-long, bacterial proteins of unknown function.	63
404130	pfam13180	PDZ_2	PDZ domain. 	74
404131	pfam13181	TPR_8	Tetratricopeptide repeat. 	33
404132	pfam13182	DUF4007	Protein of unknown function (DUF4007). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 284 and 326 amino acids in length. This domain is found associated with pfam01507 in some proteins, suggesting a functional link.	287
404133	pfam13183	Fer4_8	4Fe-4S dicluster domain. Superfamily includes proteins containing domains which bind to iron-sulfur clusters. Members include bacterial ferredoxins, various dehydrogenases, and various reductases. Structure of the domain is an alpha-antiparallel beta sandwich. Domain contains two 4Fe4S clusters.	65
404134	pfam13184	KH_5	NusA-like KH domain. 	69
404135	pfam13185	GAF_2	GAF domain. 	137
404136	pfam13186	SPASM	Iron-sulfur cluster-binding domain. This domain occurs as an additional C-terminal iron-sulfur cluster binding domain in many radical SAM domain, pfam04055 proteins. The domain occurs in a number of proteins that modify a protein to become an active enzyme, or a peptide to become a ribosomal natural product. The domain is named SPASM because it occurs in the maturases of Subtilosin, PQQ, Anaerobic Sulfatases, and Mycofactocin.	66
404137	pfam13187	Fer4_9	4Fe-4S dicluster domain. 	50
404138	pfam13188	PAS_8	PAS domain. 	65
404139	pfam13189	Cytidylate_kin2	Cytidylate kinase-like family. This family includes enzymes related to cytidylate kinase.	176
404140	pfam13190	PDGLE	PDGLE domain. This short presumed domain is usually found on its own. However, it is also found associated with pfam01891 suggesting it may have a role in cobalt uptake. The domain is named after a short motif found within many members of the family.	89
404141	pfam13191	AAA_16	AAA ATPase domain. This family of domains contains a P-loop motif that is characteristic of the AAA superfamily.	166
404142	pfam13192	Thioredoxin_3	Thioredoxin domain. 	71
404143	pfam13193	AMP-binding_C	AMP-binding enzyme C-terminal domain. This is a small domain that is found C terminal to pfam00501. It has a central beta sheet core that is flanked by alpha helices.	76
404144	pfam13194	DUF4010	Domain of unknown function (DUF4010). This is a family of putative membrane proteins found in archaea and bacteria. It is sometimes found C terminal to pfam02308.	209
404145	pfam13195	DUF4011	Protein of unknown function (DUF4011). This family of proteins is found in archaea and bacteria. Many members are annotated as being putative DNA helicase-related proteins.	164
404146	pfam13196	DUF4012	Protein of unknown function (DUF4012). This is a family of uncharacterized proteins found in archaea and bacteria.	144
404147	pfam13197	DUF4013	Protein of unknown function (DUF4013). This is a family of uncharacterized proteins that is found in archaea and bacteria.	167
338629	pfam13198	DUF4014	Protein of unknown function (DUF4014). This is a bacterial and viral family of uncharacterized proteins.	72
404148	pfam13199	Glyco_hydro_66	Glycosyl hydrolase family 66. This family is a set of glycosyl hydrolase enzymes including cycloisomaltooligosaccharide glucanotransferase (EC:2.4.1.-) and dextranase (EC:3.2.1.11) activities.	557
404149	pfam13200	DUF4015	Putative glycosyl hydrolase domain. This domain is related to other known glycosyl hydrolases suggesting this domain is also involved in carbohydrate break down.	313
404150	pfam13201	PCMD	Putative carbohydrate metabolism domain. This domain has been suggested to participate in carbohydrate metabolism. Structural evidence indicates that it might be a carbohydrate binding domain, with or without enzymatic activity. In particular, it has been hypothesized that it might act as a glycoside hydrolase.	238
404151	pfam13202	EF-hand_5	EF hand. 	25
404152	pfam13203	DUF2201_N	Putative metallopeptidase domain. This domain, found in various hypothetical bacterial proteins, has no known function. However, it is related to pfam01435.	271
404153	pfam13204	DUF4038	Protein of unknown function (DUF4038). A family of putative cellulases.	320
404154	pfam13205	Big_5	Bacterial Ig-like domain. 	106
404155	pfam13206	VSG_B	Trypanosomal VSG domain. This family represents the B-type variant surface glycoproteins from trypanosomal parasites. This family is related to pfam00913.	354
404156	pfam13207	AAA_17	AAA domain. 	136
404157	pfam13208	TerB_N	TerB N-terminal domain. The TerB_N domain is found N-terminal to TerB, and TerB_C containing proteins. It has a predominantly alpha-helical structure and contains an absolutely conserved glutamate. The presence of a conserved acidic residue suggests that it might chelate metal like TerB. These proteins occur in a two-gene operon containing an AAA+ ATPase and SF-II DNA helicase suggesting a role in stress-response or phage defense.	203
289952	pfam13209	DUF4017	Protein of unknown function (DUF4017). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length.	60
404158	pfam13210	DUF4018	Domain of unknown function (DUF4018). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 190 amino acids in length.	198
404159	pfam13211	DUF4019	Protein of unknown function (DUF4019). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 130 and 183 amino acids in length. There is a single completely conserved residue E that may be functionally important.	104
372518	pfam13212	DUF4020	Domain of unknown function (DUF4020). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is typically between 176 and 195 amino acids in length.	174
289956	pfam13213	DUF4021	Protein of unknown function (DUF4021). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length. There is a conserved YGM sequence motif.	46
289957	pfam13214	DUF4022	Protein of unknown function (DUF4022). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 73 and 85 amino acids in length.	77
372519	pfam13215	DUF4023	Protein of unknown function (DUF4023). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 40 amino acids in length. There is a conserved KLP sequence motif.	35
315802	pfam13216	DUF4024	Protein of unknown function (DUF4024). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 40 amino acids in length. There is a conserved RDE sequence motif.	35
315803	pfam13217	DUF4025	Protein of unknown function (DUF4025). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length. There is a conserved EGT sequence motif.	50
404160	pfam13218	DUF4026	Protein of unknown function (DUF4026). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 450 amino acids in length. The family is found in association with pfam10077.	320
404161	pfam13219	DUF4027	Protein of unknown function (DUF4027). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 40 amino acids in length. There is a conserved CLGGF sequence motif.	36
289962	pfam13220	DUF4028	Protein of unknown function (DUF4028). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 67 and 93 amino acids in length. There are two conserved sequence motifs: IVKI and YVKKWF.	65
372521	pfam13221	DUF4029	Protein of unknown function (DUF4029). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 95 and 119 amino acids in length.	96
289964	pfam13222	DUF4030	Protein of unknown function (DUF4030). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 164 and 197 amino acids in length.	142
404162	pfam13223	DUF4031	Protein of unknown function (DUF4031). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and viruses. Proteins in this family are typically between 91 and 130 amino acids in length. There is a conserved HYD sequence motif.	75
404163	pfam13224	DUF4032	Domain of unknown function (DUF4032). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 170 amino acids in length. The family is found in association with pfam06293.	163
404164	pfam13225	DUF4033	Domain of unknown function (DUF4033). This presumed domain is functionally uncharacterized. This domain family is found in bacteria and eukaryotes, and is approximately 80 amino acids in length.	88
404165	pfam13226	DUF4034	Domain of unknown function (DUF4034). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 280 amino acids in length. There is a conserved PRW sequence motif.	274
404166	pfam13227	DUF4035	Protein of unknown function (DUF4035). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and viruses. Proteins in this family are typically between 67 and 93 amino acids in length.	55
404167	pfam13228	DUF4037	Domain of unknown function (DUF4037). This presumed domain is functionally uncharacterized. This domain family is found in bacteria and eukaryotes, and is approximately 100 amino acids in length. There is a single completely conserved residue P that may be functionally important.	101
404168	pfam13229	Beta_helix	Right handed beta helix region. This region contains a parallel beta helix region that shares some similarity with Pectate lyases.	157
404169	pfam13230	GATase_4	Glutamine amidotransferases class-II. This family captures members that are not found in pfam00310.	272
404170	pfam13231	PMT_2	Dolichyl-phosphate-mannose-protein mannosyltransferase. This family contains members that are not captured by pfam02366.	160
315815	pfam13232	Complex1_LYR_1	Complex1_LYR-like. This is a family of proteins carrying the LYR motif of family Complex1_LYR, pfam05347, likely to be involved in Fe-S cluster biogenesis in mitochondria.	99
404171	pfam13233	Complex1_LYR_2	Complex1_LYR-like. This is a family of proteins carrying the LYR motif of family Complex1_LYR, pfam05347, likely to be involved in Fe-S cluster biogenesis in mitochondria.	79
404172	pfam13234	rRNA_proc-arch	rRNA-processing arch domain. Mtr4 is the essential RNA helicase, and is an exosome-activating cofactor. This arch domain is carried in Mtr4 and Ski2 (the cytosolic homolog of Mtr4). The arch domain is required for proper 5.8S rRNA processing, and appears to function independently of canonical helicase activity.	266
404173	pfam13236	CLU	Clustered mitochondria. The CLU domain (CLUstered mitochondria) is a eukaryotic domain found in proteins from fungi, protozoa, plants to humans. It is required for correct functioning of the mitochondria and mitochondrial transport although the exact function of the domain is unknown. In Dictyostelium the full-length protein is required for a very late step in fission of the outer mitochondrial membrane suggesting that mitochondria are transported along microtubules, as in mammalian cells, rather than along actin filaments, as in budding yeast. Disruption of the protein impaired cytokinesis and caused mitochondria to cluster at the cell centre. It is likely that CLU functions in a novel pathway that positions mitochondria within the cell based on their physiological state. Disruption of the CLU pathway may enhance oxidative damage, alter gene expression, cause mitochondria to cluster at microtubule plus ends, and lead eventually to mitochondrial failure.	225
404174	pfam13237	Fer4_10	4Fe-4S dicluster domain. This family includes proteins containing domains which bind to iron-sulfur clusters. Members include bacterial ferredoxins, various dehydrogenases, and various reductases. The structure of the domain is an alpha-antiparallel beta sandwich.	56
404175	pfam13238	AAA_18	AAA domain. 	128
404176	pfam13239	2TM	2TM domain. This short region contains two transmembrane alpha helices that are found associated with a wide range of other domains. This domain may be involved in cell lysis or peptidoglycan turnover.	80
404177	pfam13240	zinc_ribbon_2	zinc-ribbon domain. This family consists of a single zinc ribbon domain, i.e. half of a pair as in family DZR, pfam12773.	21
404178	pfam13241	NAD_binding_7	Putative NAD(P)-binding. This domain is found in fungi, plants, archaea and bacteria.	104
404179	pfam13242	Hydrolase_like	HAD-hydrolase-like. 	75
404180	pfam13243	SQHop_cyclase_C	Squalene-hopene cyclase C-terminal domain. Squalene-hopene cyclase, EC:5.4.99.17, catalyzes the cyclisation of squalene into hopene in bacteria. This reaction is part of a cationic cyclisation cascade, which is homologous to a key step in cholesterol biosynthesis. This family is the C-terminal half of the molecule.	319
404181	pfam13244	DUF4040	Domain of unknown function (DUF4040). 	65
404182	pfam13245	AAA_19	AAA domain. 	136
404183	pfam13246	Cation_ATPase	Cation transport ATPase (P-type). This domain is found in cation transport ATPases, including phospholipid-transporting ATPases, calcium-transporting ATPases, and sodium-potassium ATPases.	91
404184	pfam13247	Fer4_11	4Fe-4S dicluster domain. Superfamily includes proteins containing domains which bind to iron-sulfur clusters. Members include bacterial ferredoxins, various dehydrogenases, and various reductases. Structure of the domain is an alpha-antiparallel beta sandwich. Domain contains two 4Fe4S clusters.	99
404185	pfam13248	zf-ribbon_3	zinc-ribbon domain. This family consists of a single zinc ribbon domain, i.e. half of a pair as in family DZR, pfam12773.	25
404186	pfam13249	SQHop_cyclase_N	Squalene-hopene cyclase N-terminal domain. Squalene-hopene cyclase, EC:5.4.99.17, catalyzes the cyclisation of squalene into hopene in bacteria. This reaction is part of a cationic cyclisation cascade, which is homologous to a key step in cholesterol biosynthesis. This family is the N-terminal domain.	290
404187	pfam13250	DUF4041	Domain of unknown function (DUF4041). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, archaea and viruses, and is approximately 60 amino acids in length. The family is found in association with pfam10544.	55
404188	pfam13251	DUF4042	Domain of unknown function (DUF4042). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is approximately 180 amino acids in length.	182
404189	pfam13252	DUF4043	Protein of unknown function (DUF4043). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and viruses. Proteins in this family are typically between 369 and 424 amino acids in length. There is a single completely conserved residue G that may be functionally important.	352
404190	pfam13253	DUF4044	Protein of unknown function (DUF4044). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 42 and 56 amino acids in length. There is a single completely conserved residue M that may be functionally important.	33
404191	pfam13254	DUF4045	Domain of unknown function (DUF4045). This presumed domain is functionally uncharacterized. This domain family is found in bacteria and eukaryotes, and is typically between 384 and 430 amino acids in length.	426
404192	pfam13255	DUF4046	Protein of unknown function (DUF4046). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 64 and 331 amino acids in length.	90
315838	pfam13256	DUF4047	Domain of unknown function (DUF4047). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 130 amino acids in length. There are two conserved sequence motifs: TEA and FPKT.	125
372535	pfam13257	DUF4048	Domain of unknown function (DUF4048). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is typically between 228 and 257 amino acids in length.	252
289998	pfam13258	DUF4049	Domain of unknown function (DUF4049). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is typically between 310 and 324 amino acids in length.	333
404193	pfam13259	DUF4050	Protein of unknown function (DUF4050). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 109 and 173 amino acids in length. There are two conserved sequence motifs: IPL and FLVD.	124
372537	pfam13260	DUF4051	Protein of unknown function (DUF4051). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length.	52
315842	pfam13261	DUF4052	Protein of unknown function (DUF4052). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 220 amino acids in length.	217
404194	pfam13262	DUF4054	Protein of unknown function (DUF4054). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and viruses. Proteins in this family are typically between 120 and 152 amino acids in length.	105
404195	pfam13263	PHP_C	PHP-associated. This is a subunit, probably the alpha, of bacterial and eukaryotic DNA polymerase III, associated with the PHP domain, pfam02811.	56
404196	pfam13264	DUF4055	Domain of unknown function (DUF4055). This presumed domain is functionally uncharacterized. This domain family is found in bacteria and viruses, and is approximately 140 amino acids in length.	135
372540	pfam13265	DUF4056	Protein of unknown function (DUF4056). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 355 and 380 amino acids in length.	266
404197	pfam13266	DUF4057	Protein of unknown function (DUF4057). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 279 and 322 amino acids in length.	299
379094	pfam13267	DUF4058	Protein of unknown function (DUF4058). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 244 and 264 amino acids in length.	252
404198	pfam13268	DUF4059	Protein of unknown function (DUF4059). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 70 amino acids in length. There is a conserved DKT sequence motif.	72
372542	pfam13269	DUF4060	Protein of unknown function (DUF4060). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 80 amino acids in length. There are two conserved sequence motifs: VEVV and SYVAT.	73
404199	pfam13270	DUF4061	Domain of unknown function (DUF4061). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is approximately 90 amino acids in length. There is a conserved AFG sequence motif.	88
404200	pfam13271	DUF4062	Domain of unknown function (DUF4062). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, archaea and eukaryotes, and is approximately 80 amino acids in length. There is a conserved SST sequence motif.	78
315853	pfam13272	Holin_2-3	Putative 2/3 transmembrane domain holin. Holin_2-3 is a family of putative holins with 2 or 3 transmembrane segments. It consists of many proteobacterial proteins ranging in size from about 70 aas to 120 aas. They have 2 or 3 predicted TMSs. Although some are annotated as phage proteins or holins, none appears to be functionally characterized.	106
404201	pfam13273	DUF4064	Protein of unknown function (DUF4064). 	102
404202	pfam13274	DUF4065	Protein of unknown function (DUF4065). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea and viruses. Proteins in this family are typically between 155 and 202 amino acids in length.	108
404203	pfam13275	S4_2	S4 domain. The S4 domain is a small domain consisting of 60-65 amino acid residues that was detected in the bacterial ribosomal protein S4.	65
404204	pfam13276	HTH_21	HTH-like domain. This domain contains a predicted helix-turn-helix suggesting a DNA-binding function.	59
404205	pfam13277	YmdB	YmdB-like protein. This family of putative phosphoesterases contains the B. subtilis protein YmdB.	251
404206	pfam13279	4HBT_2	Thioesterase-like superfamily. This family contains a wide variety of enzymes, principally thioesterases. These enzymes are part of the Hotdog fold superfamily.	121
404207	pfam13280	WYL	WYL domain. WYL is a Sm-like SH3 beta-barrel fold containing domain. It is a member of the WYL-like superfamily, named for three conserved amino acids found in a subset of the superfamily. However, these residues are not strongly conserved throughout the family. Rather, the conservation pattern includes four basic residues and a position often occupied by a cysteine, which are predicted to line a ligand-binding groove typical of the Sm-like SH3 beta-barrels. A WYL domain protein (sll7009) is a negative regulator of the I-D CRISPR-Cas system in Synechocystis sp. It is predicted to be a ligand-sensing domain that could bind negatively charged ligands, such as nucleotides or nucleic acid fragments, to regulate CRISPR-Cas and other defense systems such as the abortive infection AbiG system.	173
404208	pfam13281	DUF4071	Domain of unknown function (DUF4071). This domain is found at the N-terminus of many serine-threonine kinase-like proteins.	372
404209	pfam13282	DUF4070	Domain of unknown function (DUF4070). This is a bacterial domain often found at the C-terminus of Radical_SAM methylases.	142
372547	pfam13283	NfrA_C	Bacteriophage N adsorption protein A C-term. The function of this domain is unknown but it is found at the C-terminus of bacteriophage N4 adsorption protein A, in association with an N-terminal region of TPR repeats.	173
404210	pfam13284	DUF4072	Domain of unknown function (DUF4072). This short domain is normally found at the very N-terminus of Hydrolases, pfam00702.	47
290024	pfam13285	DUF4073	Domain of unknown function (DUF4073). This family is frequently found at the C-terminus of bacterial proteins carrying the family, Metallophos pfam00149.	157
404211	pfam13286	HD_assoc	Phosphohydrolase-associated domain. This domain is found on bacterial and archaeal metal-dependent phosphohydrolases.	91
404212	pfam13287	Fn3_assoc	Fn3 associated. 	59
404213	pfam13288	DXPR_C	DXP reductoisomerase C-terminal domain. This is the C-terminal domain of the 1-deoxy-D-xylulose-5-phosphate reductoisomerase enzyme. This domain forms a left handed super-helix.	114
404214	pfam13289	SIR2_2	SIR2-like domain. This family of proteins is related to the sirtuins.	141
404215	pfam13290	CHB_HEX_C_1	Chitobiase/beta-hexosaminidase C-terminal domain. 	67
404216	pfam13291	ACT_4	ACT domain. ACT domains bind to amino acids and regulate associated enzyme domains. These ACT domains are found at the C-terminus of the RelA protein.	79
404217	pfam13292	DXP_synthase_N	1-deoxy-D-xylulose-5-phosphate synthase. This family contains 1-deoxyxylulose-5-phosphate synthase (DXP synthase), an enzyme which catalyzes the thiamine pyrophosphate-dependent acyloin condensation reaction between carbon atoms 2 and 3 of pyruvate and glyceraldehyde 3-phosphate, to yield 1-deoxy-D-xylulose-5-phosphate, a precursor in the biosynthetic pathway to isoprenoids, thiamine (vitamin B1), and pyridoxol (vitamin B6).	273
404218	pfam13293	DUF4074	Domain of unknown function (DUF4074). This family is found at the C-terminal of Homeobox proteins in Metazoa.	62
372549	pfam13294	DUF4075	Domain of unknown function (DUF4075). The members of this family are putative mature parasite-infected erythrocyte surface antigen protein from Bacillus spp.	79
290034	pfam13295	DUF4077	Domain of unknown function (DUF4077). This is the N-terminal region of methyl-accepting chemotaxis proteins from Bacillus spp. The function is not known.	176
404219	pfam13296	T6SS_Vgr	Putative type VI secretion system Rhs element Vgr. This is a family of putative type VI secretion system Rhs element Vgr proteins from Proteobacteria.	108
404220	pfam13297	Telomere_Sde2_2	Telomere stability C-terminal. This short C-terminal domain is found in higher eukaryotes further downstream from the Sde2 family, pfam13019. It is found in all Sde2-related proteins except those from fission yeast, fly, and mosquito. Its exact function in telomere formation and maintenance has not yet been established.	60
404221	pfam13298	LigD_N	DNA polymerase Ligase (LigD). This is the N-terminal region of ATP-dependent DNA ligase.	103
404222	pfam13299	CPSF100_C	Cleavage and polyadenylation factor 2 C-terminal. This family lies at the C-terminus of many fungal and plant cleavage and polyadenylation specificity factor subunit 2 proteins. The exact function of the domain is not known, but is likely to function as a binding domain for the protein within the overall CPSF complex.	161
404223	pfam13300	DUF4078	Domain of unknown function (DUF4078). This family is found from fungi to humans, but its exact function is not known.	86
404224	pfam13301	DUF4079	Protein of unknown function (DUF4079). This is an uncharacterized family of proteins.	141
379112	pfam13302	Acetyltransf_3	Acetyltransferase (GNAT) domain. This domain catalyzes N-acetyltransferase reactions.	139
404225	pfam13303	PTS_EIIC_2	Phosphotransferase system, EIIC. The bacterial phosphoenolpyruvate: sugar phosphotransferase system (PTS) is a multi-protein system involved in the regulation of a variety of metabolic and transcriptional processes. The sugar-specific permease of the PTS consists of three domains (IIA, IIB and IIC). The IIC domain catalyzes the transfer of a phosphoryl group from IIB to the sugar substrate.	327
404226	pfam13304	AAA_21	AAA domain, putative AbiEii toxin, Type IV TA system. Several members are annotated as being of the abortive phage resistance system, in which case the family would be acting as the toxin for a type IV toxin-antitoxin resistance system.	303
404227	pfam13305	WHG	WHG domain. This presumed domain is around 80 amino acids in length. It is found to the C-terminus of a DNA-binding helix-turn-helix domain. This domain may be involved in binding to an as yet unknown ligand that allows a transcriptional regulation response to that molecule. The domain is named WHG after three conserved residues near the C-terminus of the domain.	102
404228	pfam13306	LRR_5	Leucine rich repeats (6 copies). This family includes a number of leucine rich repeats. This family contains a large number of BSPA-like surface antigens from Trichomonas vaginalis.	127
404229	pfam13307	Helicase_C_2	Helicase C-terminal domain. This domain is found at the C-terminus of DEAD-box helicases.	166
404230	pfam13308	YARHG	YARHG domain. This presumed extracellular domain is about 70 amino acids in length. It is named YARHG after a conserved motif in the sequence. This domain is associated with peptidases and bacterial kinase proteins. Its molecular function is unknown.	84
404231	pfam13309	HTH_22	HTH domain. This domain is a helix-turn-helix domain that is likely to act as a DNA-binding domain.	63
404232	pfam13310	Virulence_RhuM	Virulence protein RhuM family. There are currently no experimental data for members of this group or their homologs. However, these proteins are implicated in virulence/pathogenicity because RhuM is encoded in the SPI-3 pathogenicity island in Salmonella typhimurium.	252
404233	pfam13311	DUF4080	Protein of unknown function (DUF4080). A family of uncharacterized proteins found by clustering human gut metagenomic sequences.	188
404234	pfam13312	DUF4081	Domain of unknown function (DUF4081). This domain is often found N-terminal to the GNAT acetyltransferase domain, pfam00583 and FR47, pfam08445.	105
404235	pfam13313	DUF4082	Domain of unknown function (DUF4082). This family appears to be a parallel beta-helix repeated region that sits between successive Cadherin domains, pfam00028.	143
372557	pfam13314	DUF4083	Domain of unknown function (DUF4083). This is a family of very short, approximately 60 residue, proteins from Firmicutes, that are all putatively annotated as being MutT/Nudix. However, the characteristic Nudix motif of GX(5)EX(7)REUXEE is absent.	57
372558	pfam13315	DUF4085	Protein of unknown function (DUF4085). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 101 and 269 amino acids in length.	206
372559	pfam13316	DUF4087	Protein of unknown function (DUF4087). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 140 and 280 amino acids in length. There is a conserved RCGW sequence motif.	94
404236	pfam13317	DUF4088	Protein of unknown function (DUF4088). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 258 and 300 amino acids in length.	226
372560	pfam13318	DUF4089	Protein of unknown function (DUF4089). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length.	50
404237	pfam13319	DUF4090	Protein of unknown function (DUF4090). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 90 amino acids in length.	84
404238	pfam13320	DUF4091	Domain of unknown function (DUF4091). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, archaea and eukaryotes, and is approximately 70 amino acids in length. There is a single completely conserved residue G that may be functionally important.	66
372562	pfam13321	DUF4084	Domain of unknown function (DUF4084). This family of Firmicute proteins is frequently associated with the EAL, GGDEF and PAS families, pfam00563, pfam00990, and pfam00989. The exact function is not known.	304
290061	pfam13322	DUF4092	Domain of unknown function (DUF4092). This family is found in Proteobacteria. The function is not known.	176
404239	pfam13323	HPIH	N-terminal domain with HPIH motif. This family is found in fungi on proteins carrying the PAS, pfam00989, domain. There is a well-conserved characteristic HPIH motif, but the function is not known.	152
404240	pfam13324	GCIP	Grap2 and cyclin-D-interacting. GCIP, or Grap2 and cyclin-D-interacting protein, is found in eukaryotes, and in the human protein CCNDBP1, residues 149-190 constitute a helix-loop-helix domain, residues 190-240 an acidic region, and 240-261 a leucine zipper domain. GCIP interacts with full-length Grap2 protein and with the COOH-terminal unique and SH3 domains (designated QC domain) of Grap2. It is potentially involved in the regulation of cell differentiation and proliferation through Grap2 and cyclin D-mediated signalling pathways. In mice, it is involved in G1/S-phase progression of hepatocytes, which in older animals is associated with the development of liver tumors. In vitro it acts as an inhibitory HLH protein, for example, blocking transcription of the HNF-4 promoter. In its function as a cyclin D1-binding protein it is able to reduce CDK4-mediated phosphorylation of the retinoblastoma protein and to inhibit E2F-mediated transcriptional activity. GCIP has also been shown to interact physically with Rad (Ras associated with diabetes), Rad being important in regulating cellular senescence.	261
404241	pfam13325	MCRS_N	N-terminal region of micro-spherule protein. This domain is found in plants and higher eukaryotes, and is the N-terminal region of micro-spherule proteins which repress the transactivation activities of Nrf1 (p45 nuclear factor-erythroid 2 (p45 NF-E2)-related factor 1). In conjunction with DIPA the full-length protein acts as a transcription repressor. The exact function of the region is not known.	199
404242	pfam13326	PSII_Pbs27	Photosystem II Pbs27. This family of proteins contains Pbs27, a highly conserved component of photosystem II. Pbs27 is comprised of four helices arranged in a right handed up-down-up-down fold, with a less ordered region located at the N-terminus.	134
404243	pfam13327	T3SS_LEE_assoc	Type III secretion system subunit. This is a family of bacterial putative type III secretion apparatus proteins associated with the locus of enterocyte effacement (LEE).	162
404244	pfam13328	HD_4	HD domain. HD domains are metal dependent phosphohydrolases.	157
404245	pfam13329	ATG2_CAD	Autophagy-related protein 2 CAD motif. The Atg2 protein, an integral membrane protein, is required for a range of functions including the regulation of autophagy in conjunction with the Atg1-Atg13 complex. Atg2 binds Atg9. The precise function of this region, with its characteristic highly conserved CAD sequence motif, is not known.	154
404246	pfam13330	Mucin2_WxxW	Mucin-2 protein WxxW repeating region. This family is a repeating region found on mucins 2 and 5. The function is not known, but the repeat can be present in up to 32 copies, as in an uncharacterized protein from Branchiostoma floridae. The region carries a highly conserved WxxW sequence motif and also has at least six well conserved cysteine residues.	84
404247	pfam13331	DUF4093	Domain of unknown function (DUF4093). This domain lies at the C-terminus of primase proteins carrying the TOPRIM, pfam01751, domain. The exact function of the domain is not known.	85
404248	pfam13332	Fil_haemagg_2	Haemagglutinin repeat. 	169
372570	pfam13333	rve_2	Integrase core domain. 	52
404249	pfam13334	DUF4094	Domain of unknown function (DUF4094). This domain is found in plant proteins that often carry a galactosyltransferase domain, pfam01762, at their C-terminus.	91
404250	pfam13335	Mg_chelatase_C	Magnesium chelatase, subunit ChlI C-terminal. This is a family of the C-terminal of putative bacterial magnesium chelatase subunit ChlI proteins. Most members have the associated pfam01078.	93
404251	pfam13336	AcetylCoA_hyd_C	Acetyl-CoA hydrolase/transferase C-terminal domain. This family contains several enzymes which take part in pathways involving acetyl-CoA. Acetyl-CoA hydrolase EC:3.1.2.1 catalyzes the formation of acetate from acetyl-CoA, CoA transferase (CAT1) EC:2.8.3.- produces succinyl-CoA, and acetate-CoA transferase EC:2.8.3.8 utilizes acyl-CoA and acetate to form acetyl-CoA.	154
404252	pfam13337	Lon_2	Putative ATP-dependent Lon protease. This is a family of proteins that are annotated as ATP-dependent Lon proteases.	450
404253	pfam13338	AbiEi_4	Transcriptional regulator, AbiEi antitoxin. AbiEi_4 is the cognate antitoxin of the type IV toxin-antitoxin 'innate immunity' bacterial abortive infection (Abi) system that protects bacteria from the spread of a phage infection. The Abi system is activated upon infection with phage to abort the cell thus preventing the spread of phage through viral replication. There are some 20 or more Abis, and they are predominantly plasmid-encoded lactococcal systems. TA, toxin-antitoxin, systems on plasmids function by killing cells that lose the plasmid upon division. AbiE phage resistance systems function as novel Type IV TAs and are widespread in bacteria and archaea. The cognate antitoxin is pfam13338.	49
404254	pfam13339	AATF-Che1	Apoptosis antagonizing transcription factor. The N-terminal and leucine-zipper region of the apoptosis antagonizing transcription factor-Che1.	130
404255	pfam13340	DUF4096	Putative transposase of IS4/5 family (DUF4096). 	76
404256	pfam13341	RAG2_PHD	RAG2 PHD domain. This domain is found at the C-terminus of the RAG2 protein. The structure of this domain has been shown bound to histone H3 trimethylated at lysine 4 (H3K4me3).	78
404257	pfam13342	Toprim_Crpt	C-terminal repeat of topoisomerase. 	60
404258	pfam13343	SBP_bac_6	Bacterial extracellular solute-binding protein. This family includes bacterial extracellular solute-binding proteins.	247
404259	pfam13344	Hydrolase_6	Haloacid dehalogenase-like hydrolase. This family is part of the HAD superfamily.	101
404260	pfam13346	ABC2_membrane_5	ABC-2 family transporter protein. This family is related to the ABC-2 membrane transporter family pfam01061.	206
404261	pfam13347	MFS_2	MFS/sugar transport protein. This family is part of the major facilitator superfamily of membrane transport proteins.	427
404262	pfam13349	DUF4097	Putative adhesin. This has a putative all-beta structure with a twenty-residue repeat with a highly conserved repeating GD, gly-asp, motif. It may form part of a bacterial adhesin.	247
404263	pfam13350	Y_phosphatase3	Tyrosine phosphatase family. This family is closely related to the pfam00102 and pfam00782 families.	243
404264	pfam13351	DUF4099	Protein of unknown function (DUF4099). A family of uncharacterized proteins found by clustering human gut metagenomic sequences. The C-terminal repeat region of this family is DUF4098, pfam13345.	80
372575	pfam13352	DUF4100	Protein of unknown function (DUF4100). This is a family of uncharacterized proteins found in Physcomitrella.	207
404265	pfam13353	Fer4_12	4Fe-4S single cluster domain. This family includes proteins containing domains which bind to iron-sulfur clusters. Members include bacterial ferredoxins, various dehydrogenases, and various reductases. The structure of the domain is an alpha-antiparallel beta sandwich.	138
404266	pfam13354	Beta-lactamase2	Beta-lactamase enzyme family. This family is closely related to Beta-lactamase, pfam00144, the serine beta-lactamase-like superfamily, which contains the distantly related pfam00905 and PF00768 D-alanyl-D-alanine carboxypeptidase.	201
404267	pfam13355	DUF4101	Protein of unknown function (DUF4101). This is a family of uncharacterized proteins, and is sometimes found in combination with pfam00226.	117
404268	pfam13356	Arm-DNA-bind_3	Arm DNA-binding domain. This DNA-binding domain is found at the N-terminus of a wide variety of phage integrase proteins.	78
404269	pfam13358	DDE_3	DDE superfamily endonuclease. This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction.	145
404270	pfam13359	DDE_Tnp_4	DDE superfamily endonuclease. This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction.	156
404271	pfam13360	PQQ_2	PQQ-like domain. This domain contains several repeats of the PQQ repeat.	233
404272	pfam13361	UvrD_C	UvrD-like helicase C-terminal domain. This domain is found at the C-terminus of a wide variety of helicase enzymes. This domain has a AAA-like structural fold.	498
372578	pfam13362	Toprim_3	Toprim domain. The toprim domain is found in a wide variety of enzymes involved in nucleic acid manipulation.	96
404273	pfam13363	BetaGal_dom3	Beta-galactosidase, domain 3. This is the second domain of the five-domain beta-galactosidase enzyme that altogether catalyzes the hydrolysis of beta(1-3) and beta(1-4) galactosyl bonds in oligosaccharides as well as the inverse reaction of enzymatic condensation and trans-glycosylation. This domain has an Ig-like fold.	65
404274	pfam13364	BetaGal_dom4_5	Beta-galactosidase jelly roll domain. This domain is found in beta galactosidase enzymes. It has a jelly roll fold.	111
404275	pfam13365	Trypsin_2	Trypsin-like peptidase domain. This family includes trypsin-like peptidase domains.	142
404276	pfam13366	PDDEXK_3	PD-(D/E)XK nuclease superfamily. Members of this family belong to the PD-(D/E)XK nuclease superfamily.	117
404277	pfam13367	PrsW-protease	Protease prsW family. This is a family of putative peptidases, possibly belonging to the MEROPS M79 family. PrsW appears to be a member of a widespread family of membrane proteins that includes at least one previously known protease. PrsW appears to be responsible for Site-1 cleavage of the RsiW anti-sigma factor, the cognate anti-sigma factor, and it senses antimicrobial peptides that damage the cell membrane and other agents that cause cell envelope stress. The three acidic residues, E75, E76 and E95 in Aflv_1074, appear to be crucial since their mutation to alanine renders the protein inactive. Based on predictions of the bioinformatics programme TMHMM it is likely that these residues are located on the extracytoplasmic face of PrsW placing them in a position to act as a sensor for cell envelope stress.	195
404278	pfam13368	Toprim_C_rpt	Topoisomerase C-terminal repeat. This domain is repeated up to five times to form the C-terminal region of bacterial topoisomerase immediately downstream of the zinc-finger motif.	48
404279	pfam13369	Transglut_core2	Transglutaminase-like superfamily. 	155
404280	pfam13370	Fer4_13	4Fe-4S single cluster domain of Ferredoxin I. Fer4_13 is a ferredoxin I from sulfate-reducing bacteria. Chemical sequence analysis suggests that this characteristic [4Fe-4S] cluster sulfur environment is widely distributed among ferredoxins.	58
404281	pfam13371	TPR_9	Tetratricopeptide repeat. 	73
404282	pfam13372	Alginate_exp	Alginate export. This domain forms an 18-stranded beta-barrel pore which is likely to act as an alginate export channel.	394
404283	pfam13373	DUF2407_C	DUF2407 C-terminal domain. This is a family of proteins found in fungi. The function is not known. There is a characteristic GFDRL sequence motif.	141
404284	pfam13374	TPR_10	Tetratricopeptide repeat. 	42
404285	pfam13375	RnfC_N	RnfC Barrel sandwich hybrid domain. This domain is part of the barrel sandwich hybrid superfamily. It is found at the N-terminus of the RnfC Electron transport complex protein. It appears to be most related to the N-terminal NQRA domain (pfam05896).	101
404286	pfam13376	OmdA	Bacteriocin-protection, YdeI or OmpD-Associated. This is a family of archaeal and bacterial proteins predicted to be periplasmic. YdeI is important for resistance to polymyxin B in broth and for bacterial survival in mice upon oral, but not intraperitoneal inoculation, suggesting a role for YdeI in the gastrointestinal tract of mice. Production of the ydeI gene is regulated by the Rcs (regulator of capsule synthesis) phospho-relay system pathway independently of RcsA, and additionally transcription of the protein is regulated by the stationary-phase sigma factor, RpoS (sigma-S). YdeI confers protection against cationic AMPs (Antimicrobial peptides) or bacteriocins in conjunction with the general porin Omp, thus justifying its name of OmdA, for OmpD-Associated protein.	60
404287	pfam13377	Peripla_BP_3	Periplasmic binding protein-like domain. This domain is found in a variety of transcriptional regulatory proteins. It is related to bacterial periplasmic binding proteins, although this domain is unlikely to be found in the periplasm. This domain likely acts to bind a small molecule ligand that the DNA-binding domain responds to.	160
404288	pfam13378	MR_MLE_C	Enolase C-terminal domain-like. This domain appears at the C-terminus of many of the proteins that carry the MR_MLE_N pfam02746 domain. EC:4.2.1.40.	205
404289	pfam13379	NMT1_2	NMT1-like family. This family is closely related to the pfam09084 family.	254
404290	pfam13380	CoA_binding_2	CoA binding domain. This domain has a Rossmann fold and is found in a number of proteins including succinyl CoA synthetases, malate and ATP-citrate ligases.	116
404291	pfam13382	Adenine_deam_C	Adenine deaminase C-terminal domain. This family represents a C-terminal region of the adenine deaminase enzyme.	168
372586	pfam13383	Methyltransf_22	Methyltransferase domain. This family appears to be a methyltransferase domain.	252
404292	pfam13384	HTH_23	Homeodomain-like domain. 	50
404293	pfam13385	Laminin_G_3	Concanavalin A-like lectin/glucanases superfamily. This domain belongs to the Concanavalin A-like lectin/glucanases superfamily.	151
404294	pfam13386	DsbD_2	Cytochrome C biogenesis protein transmembrane region. 	199
404295	pfam13387	DUF4105	Domain of unknown function (DUF4105). This is a family of uncharacterized bacterial proteins. There is a highly conserved histidine residue and a well-conserved NCT motif.	166
404296	pfam13388	DUF4106	Protein of unknown function (DUF4106). This family of proteins are found in large numbers in the Trichomonas vaginalis proteome. The function of this protein is unknown.	431
372588	pfam13389	DUF4107	Protein of unknown function (DUF4107). This family of putative proteins are found in Trichomonas vaginalis in large numbers. The function of this protein is unknown.	167
290126	pfam13390	DUF4108	Protein of unknown function (DUF4108). This family of putative proteins are found in Trichomonas vaginalis in large numbers. The function of this protein is unknown.	145
404297	pfam13391	HNH_2	HNH endonuclease. 	64
404298	pfam13392	HNH_3	HNH endonuclease. This is a zinc-binding loop of Fold group 7 as found in endo-deoxy-ribonucleases and HNH nucleases.	46
404299	pfam13393	tRNA-synt_His	Histidyl-tRNA synthetase. This is a family of class II aminoacyl-tRNA synthetase-like and ATP phosphoribosyltransferase regulatory subunits.	305
404300	pfam13394	Fer4_14	4Fe-4S single cluster domain. 	117
404301	pfam13395	HNH_4	HNH endonuclease. This HNH nuclease domain is found in CRISPR-related proteins.	54
404302	pfam13396	PLDc_N	Phospholipase_D-nuclease N-terminal. This family is often found at the very N-terminus of proteins from the phospholipase_D-nuclease family, PLDc, pfam00614. However, a large number of members are full-length within this family.	43
404303	pfam13397	RbpA	RNA polymerase-binding protein. RbpA is a family of bacterial RNA polymerase-binding proteins. RbpA acts as a transcription factor by binding to the sigma subunit of RNA polymerase.	104
404304	pfam13398	Peptidase_M50B	Peptidase M50B-like. This is a family of bacterial and plant peptidases in the same family as MEROPS:M50B.	194
404305	pfam13399	LytR_C	LytR cell envelope-related transcriptional attenuator. This family appears at the C-terminus of members of the LytR_cpsA_psr, pfam03816, family.	87
404306	pfam13400	Tad	Putative Flp pilus-assembly TadE/G-like. This is an N-terminal domain on a family of putative Flp pilus-assembly proteins. The exact function is not known. The Flp-pilus biogenesis genes include the Tad genes, and some members of this family are putatively assigned as being TadG.	47
379165	pfam13401	AAA_22	AAA domain. 	129
404307	pfam13402	Peptidase_M60	Peptidase M60, enhancin and enhancin-like. This family of peptidases contains a zinc metallopeptidase motif (HEXXHX(8,28)E) and possesses mucinase activity. It includes the viral enhancins as well as enhancin-like peptidases from bacterial species. Enhancins are a class of metalloproteases found in some baculoviruses that enhance viral infection by degrading the peritrophic membrane (PM) of the insect midgut. Bacterial enhancins are found to be cytotoxic when compared to viral enhancin, however, suggesting that the bacterial enhancins do not enhance infection in the same way as viral enhancin. Bacterial enhancins may have evolved a distinct biochemical function. These bacterial domains are peptidases targeting host glycoproteins and thus probably play an important role in successful colonisation of both vertebrate mucosal surfaces and the invertebrate digestive tract by both mutualistic and pathogenic microbes. This family has been augmented by a merge with the sequences in the Enhancin Pfam family.	268
404308	pfam13403	Hint_2	Hint domain. This domain is found in inteins.	147
404309	pfam13404	HTH_AsnC-type	AsnC-type helix-turn-helix domain. 	41
404310	pfam13405	EF-hand_6	EF-hand domain. 	30
404311	pfam13406	SLT_2	Transglycosylase SLT domain. This family is related to the SLT domain pfam01464.	292
404312	pfam13407	Peripla_BP_4	Periplasmic binding protein domain. This domain is found in a variety of bacterial periplasmic binding proteins.	256
404313	pfam13408	Zn_ribbon_recom	Recombinase zinc beta ribbon domain. This short bacterial protein contains a zinc ribbon domain that is likely to be DNA-binding. This domain is found in site specific recombinase proteins. This family appears most closely related to pfam04606.	57
404314	pfam13409	GST_N_2	Glutathione S-transferase, N-terminal domain. This family is closely related to pfam02798.	68
404315	pfam13410	GST_C_2	Glutathione S-transferase, C-terminal domain. This domain is closely related to pfam00043.	67
404316	pfam13411	MerR_1	MerR HTH family regulatory protein. 	66
404317	pfam13412	HTH_24	Winged helix-turn-helix DNA-binding. 	45
404318	pfam13413	HTH_25	Helix-turn-helix domain. This domain is a helix-turn-helix domain that probably binds to DNA.	62
315977	pfam13414	TPR_11	TPR repeat. 	42
404319	pfam13415	Kelch_3	Galactose oxidase, central domain. 	49
404320	pfam13416	SBP_bac_8	Bacterial extracellular solute-binding protein. This family includes bacterial extracellular solute-binding proteins.	279
404321	pfam13417	GST_N_3	Glutathione S-transferase, N-terminal domain. 	75
404322	pfam13418	Kelch_4	Galactose oxidase, central domain. 	49
404323	pfam13419	HAD_2	Haloacid dehalogenase-like hydrolase. 	178
404324	pfam13420	Acetyltransf_4	Acetyltransferase (GNAT) domain. 	153
404325	pfam13421	Band_7_1	SPFH domain-Band 7 family. 	211
404326	pfam13422	DUF4110	Domain of unknown function (DUF4110). This is a family that is found predominantly at the C-terminus of Kelch-containing proteins. However, the exact function of this region is not known.	95
404327	pfam13423	UCH_1	Ubiquitin carboxyl-terminal hydrolase. 	308
315987	pfam13424	TPR_12	Tetratricopeptide repeat. 	77
404328	pfam13425	O-antigen_lig	O-antigen ligase like membrane protein. 	461
404329	pfam13426	PAS_9	PAS domain. 	102
404330	pfam13427	DUF4111	Domain of unknown function (DUF4111). Although the exact function of this domain is not known it frequently appears downstream of the family Nucleotidyltransferase, pfam01909. It is also found in species associated with methicillin-resistant bacteria.	102
404331	pfam13428	TPR_14	Tetratricopeptide repeat. 	39
404332	pfam13429	TPR_15	Tetratricopeptide repeat. 	279
404333	pfam13430	DUF4112	Domain of unknown function (DUF4112). This family has several highly conserved GD sequence-motifs of unknown function. The family is found in bacteria, archaea and fungi.	104
404334	pfam13431	TPR_17	Tetratricopeptide repeat. 	34
404335	pfam13432	TPR_16	Tetratricopeptide repeat. This family is found predominantly at the C-terminus of transglutaminase enzyme core regions.	68
404336	pfam13433	Peripla_BP_5	Periplasmic binding protein domain. This domain is found in a variety of bacterial periplasmic binding proteins.	363
404337	pfam13434	K_oxygenase	L-lysine 6-monooxygenase (NADPH-requiring). This is a family of Rossmann fold oxidoreductases that catalyze the NADPH-dependent hydroxylation of lysine at the N6 position, EC:1.14.13.59.	341
372604	pfam13435	Cytochrome_C554	Cytochrome c554 and c-prime. This family is a tetra-haem cytochrome involved in the oxidation of ammonia. It is found in both phototrophic and denitrifying bacteria.	84
404338	pfam13436	Gly-zipper_OmpA	Glycine-zipper domain. 	44
404339	pfam13437	HlyD_3	HlyD family secretion protein. This is a family of largely bacterial haemolysin translocator HlyD proteins.	104
404340	pfam13438	DUF4113	Domain of unknown function (DUF4113). Although the function is not known this domain occurs almost invariably at the very C-terminus of the IMS family DNA-polymerase repair proteins, IMS, pfam00817.	51
404341	pfam13439	Glyco_transf_4	Glycosyltransferase Family 4. 	170
404342	pfam13440	Polysacc_synt_3	Polysaccharide biosynthesis protein. 	293
404343	pfam13441	Gly-zipper_YMGG	YMGG-like Gly-zipper. 	45
404344	pfam13442	Cytochrome_CBB3	Cytochrome C oxidase, cbb3-type, subunit III. 	67
404345	pfam13443	HTH_26	Cro/C1-type HTH DNA-binding domain. This is a helix-turn-helix domain that probably binds to DNA.	63
404346	pfam13444	Acetyltransf_5	Acetyltransferase (GNAT) domain. This family contains proteins with N-acetyltransferase functions.	102
404347	pfam13445	zf-RING_UBOX	RING-type zinc-finger. This zinc-finger is a typical RING-type of plant ubiquitin ligases.	40
404348	pfam13446	RPT	A repeated domain in UCH-protein. This is a repeated domain found in de-ubiquitinating proteins. Its exact function is not known although it is likely to be involved in the binding of the Ubps in the complex with Rsp5 and Rup1.	59
290183	pfam13447	Multi-haem_cyto	Seven times multi-haem cytochrome CxxCH. This domain carries up to seven CxxCH repeated sequence motifs, characteristic of multi-haem cytochromes.	269
404349	pfam13448	DUF4114	Domain of unknown function (DUF4114). This is a repeated domain that is found towards the C-terminal of many different types of bacterial proteins. There are highly conserved glutamate and aspartate residues suggesting that this domain might carry enzymic activity.	86
404350	pfam13449	Phytase-like	Esterase-like activity of phytase. This is a repeated domain that carries several highly conserved Glu and Asp residues indicating the likelihood that the domain incorporates the enzymic activity of the PLC-like phospho-diesterase part of the proteins.	284
404351	pfam13450	NAD_binding_8	NAD(P)-binding Rossmann-like domain. 	68
404352	pfam13451	zf-trcl	Probable zinc-ribbon domain. This is a probable zinc-binding domain with two CxxC sequence motifs, found in various families of bacteria.	48
404353	pfam13452	MaoC_dehydrat_N	N-terminal half of MaoC dehydratase. It is clear from the structures of bacterial members of MaoC dehydratase, pfam01575, that the full-length functional dehydratase enzyme is made up of two structures that dimerize to form a whole. Divergence of the N- and C- monomers in higher eukaryotes has led to two distinct domains, this one and MaoC_dehydratas. However, in order to function as an enzyme both are required together.	132
404354	pfam13453	zf-TFIIB	Transcription factor zinc-finger. 	41
404355	pfam13454	NAD_binding_9	FAD-NAD(P)-binding. 	155
372608	pfam13455	MUG113	Meiotically up-regulated gene 113. This is a family of fungal proteins found to be up-regulated in meiosis.	73
404356	pfam13456	RVT_3	Reverse transcriptase-like. This domain is found in plants and appears to be part of a retrotransposon.	123
404357	pfam13457	SH3_8	SH3-like domain. 	74
404358	pfam13458	Peripla_BP_6	Periplasmic binding protein. This family includes a diverse range of periplasmic binding proteins.	342
404359	pfam13459	Fer4_15	4Fe-4S single cluster domain. 	66
404360	pfam13460	NAD_binding_10	NAD(P)H-binding. 	183
404361	pfam13462	Thioredoxin_4	Thioredoxin. 	165
404362	pfam13463	HTH_27	Winged helix DNA-binding domain. 	68
404363	pfam13464	DUF4115	Domain of unknown function (DUF4115). This short domain is often found at the C-terminus of proteins containing a helix-turn-helix domain. The function of this domain is unknown.	68
404364	pfam13465	zf-H2C2_2	Zinc-finger double domain. 	26
404365	pfam13466	STAS_2	STAS domain. The STAS (after Sulphate Transporter and AntiSigma factor antagonist) domain is found in the C-terminal region of Sulphate transporters and bacterial antisigma factor antagonists. It has been suggested that this domain may have a general NTP binding function.	80
404366	pfam13467	RHH_4	Ribbon-helix-helix domain. This short bacterial protein contains a ribbon-helix-helix domain that is likely to be DNA-binding.	65
404367	pfam13468	Glyoxalase_3	Glyoxalase-like domain. This domain is related to the Glyoxalase domain pfam00903.	175
404368	pfam13469	Sulfotransfer_3	Sulfotransferase family. 	216
404369	pfam13470	PIN_3	PIN domain. Members of this family of bacterial domains are predicted to be RNases (from similarities to 5'-exonucleases).	117
404370	pfam13471	Transglut_core3	Transglutaminase-like superfamily. This family includes uncharacterized proteins that are related to the transglutaminase like domain pfam01841.	117
404371	pfam13472	Lipase_GDSL_2	GDSL-like Lipase/Acylhydrolase family. This family of presumed lipases and related enzymes are similar to pfam00657.	176
379208	pfam13473	Cupredoxin_1	Cupredoxin-like domain. The cupredoxin-like fold consists of a beta-sandwich with 7 strands in 2 beta-sheets, which is arranged in a Greek-key beta-barrel.	104
404372	pfam13474	SnoaL_3	SnoaL-like domain. This family contains a large number of proteins that share the SnoaL fold.	121
404373	pfam13475	DUF4116	Domain of unknown function (DUF4116). 	49
404374	pfam13476	AAA_23	AAA domain. 	190
404375	pfam13477	Glyco_trans_4_2	Glycosyl transferase 4-like. 	139
404376	pfam13478	XdhC_C	XdhC Rossmann domain. This entry is the rossmann domain found in the Xanthine dehydrogenase accessory protein.	143
404377	pfam13479	AAA_24	AAA domain. This AAA domain is found in a wide variety of presumed phage proteins.	199
404378	pfam13480	Acetyltransf_6	Acetyltransferase (GNAT) domain. This family contains proteins with N-acetyltransferase functions.	143
404379	pfam13481	AAA_25	AAA domain. This AAA domain is found in a wide variety of presumed DNA repair proteins.	195
404380	pfam13482	RNase_H_2	RNase_H superfamily. 	165
404381	pfam13483	Lactamase_B_3	Beta-lactamase superfamily domain. This family is part of the beta-lactamase superfamily and is related to pfam00753.	160
404382	pfam13484	Fer4_16	4Fe-4S double cluster binding domain. 	65
290220	pfam13485	Peptidase_MA_2	Peptidase MA superfamily. 	247
372615	pfam13486	Dehalogenase	Reductive dehalogenase subunit. This family is most frequently associated with a Fer4 iron-sulfur cluster towards the C-terminal region.	288
404383	pfam13487	HD_5	HD domain. HD domains are metal dependent phosphohydrolases.	64
404384	pfam13488	Gly-zipper_Omp	Glycine zipper. 	46
404385	pfam13489	Methyltransf_23	Methyltransferase domain. This family appears to be a methyltransferase domain.	162
404386	pfam13490	zf-HC2	Putative zinc-finger. This is a putative zinc-finger found in some anti-sigma factor proteins.	34
404387	pfam13491	FtsK_4TM	4TM region of DNA translocase FtsK/SpoIIIE. 4TM_FtsK is the integral membrane domain of the FtsK DNA translocases. During sporulation in Bacillus subtilis, the SpoIIIE protein is believed to form a translocation pore at the leading edge of the nearly closed septum. The E. coli FtsK protein is homologous to SpoIIIE, and can free chromosomes trapped in vegetative septa.	171
379221	pfam13492	GAF_3	GAF domain. 	129
404388	pfam13493	DUF4118	Domain of unknown function (DUF4118). This domain is found in a wide variety of bacterial signalling proteins. It is likely to be a transmembrane domain involved in ligand sensing.	107
404389	pfam13494	DUF4119	Domain of unknown function, B. Theta Gene description (DUF4119). Based on Bacteroides thetaiotaomicron gene BT_0594, a putative uncharacterized protein. As seen in gene expression experiments (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE2231), it appears to be upregulated in the presence of host or vs when in culture.	92
404390	pfam13495	Phage_int_SAM_4	Phage integrase, N-terminal SAM-like domain. 	84
404391	pfam13496	DUF4120	Domain of unknown function (DUF4120). Based on Bacteroides thetaiotaomicron gene BT_2585, a putative uncharacterized protein. As seen in gene expression experiments (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE2231), it appears to be upregulated in the presence of host or vs when in culture.	95
404392	pfam13497	DUF4121	Domain of unknown function (DUF4121). Based on Bacteroides thetaiotaomicron gene BT_2588, a putative uncharacterized protein. As seen in gene expression experiments (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE2231), it appears to be upregulated in the presence of host or vs when in culture.	264
404393	pfam13498	DUF4122	Domain of unknown function (DUF4122). Based on Bacteroides thetaiotaomicron gene BT_2607, a putative uncharacterized protein. As seen in gene expression experiments (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE2231), it appears to be upregulated in the presence of host or vs when in culture.	220
404394	pfam13499	EF-hand_7	EF-hand domain pair. 	67
404395	pfam13500	AAA_26	AAA domain. This domain is found in a number of proteins involved in cofactor biosynthesis such as dethiobiotin synthase and cobyric acid synthase. This domain contains a P-loop motif.	198
404396	pfam13501	SoxY	Sulfur oxidation protein SoxY. This domain is found in the sulfur oxidation protein SoxY. It is closely related to the Desulfoferrodoxin family pfam01880. Dissimilatory oxidation of thiosulfate is carried out by the ubiquitous sulfur-oxidizing (Sox) multi-enzyme system. In this system, SoxY plays a key role, functioning as the sulfur substrate-binding protein that offers its sulfur substrate, which is covalently bound to a conserved C-terminal cysteine, to another oxidizing Sox enzyme. The structure of this domain shows an Ig-like fold.	109
404397	pfam13502	AsmA_2	AsmA-like C-terminal region. This family is similar to the C-terminal of the AsmA protein of E. coli.	233
404398	pfam13503	DUF4123	Domain of unknown function (DUF4123). This presumed domain is functionally uncharacterized. It is about 120 amino acids in length and contains several conserved motifs that may be functionally important. This domain is sometimes associated with the FHA domain.	128
404399	pfam13505	OMP_b-brl	Outer membrane protein beta-barrel domain. This domain is found in a wide range of outer membrane proteins. This domain assumes a membrane bound beta-barrel fold.	175
404400	pfam13506	Glyco_transf_21	Glycosyl transferase family 21. This is a family of ceramide beta-glucosyltransferases - EC:2.4.1.80.	174
404401	pfam13507	GATase_5	CobB/CobQ-like glutamine amidotransferase domain. This family captures members that are not found in pfam00310, pfam07685 and pfam13230.	260
404402	pfam13508	Acetyltransf_7	Acetyltransferase (GNAT) domain. This domain catalyzes N-acetyltransferase reactions.	83
404403	pfam13509	S1_2	S1 domain. The S1 domain was originally identified as a repeat motif in the ribosomal S1 protein. It was later identified in a wide range of proteins. The S1 domain has an OB-fold structure. The S1 domain is involved in nucleic acid binding.	59
404404	pfam13510	Fer2_4	2Fe-2S iron-sulfur cluster binding domain. The 2Fe-2S ferredoxin family have a general core structure consisting of beta(2)-alpha-beta(2) which is a beta-grasp type fold. The domain is around one hundred amino acids with four conserved cysteine residues to which the 2Fe-2S cluster is ligated. This cluster appears within sarcosine oxidase proteins.	82
404405	pfam13511	DUF4124	Domain of unknown function (DUF4124). This presumed domain is found in a variety of bacterial proteins. It is found at the N-terminus, associated with other domains such as the SLT domain and glutaredoxin domains in some proteins. The function of this domain is unknown, but it may have an Ig-like fold.	53
404406	pfam13512	TPR_18	Tetratricopeptide repeat. 	145
404407	pfam13513	HEAT_EZ	HEAT-like repeat. The HEAT repeat family is related to armadillo/beta-catenin-like repeats (see pfam00514). These EZ repeats are found in subunits of cyanobacterial phycocyanin lyase and other proteins and probably carry out a scaffolding role.	53
404408	pfam13514	AAA_27	AAA domain. This domain is found in a number of double-strand DNA break proteins. This domain contains a P-loop motif.	207
404409	pfam13515	FUSC_2	Fusaric acid resistance protein-like. 	126
404410	pfam13516	LRR_6	Leucine Rich repeat. 	24
404411	pfam13517	VCBS	Repeat domain in Vibrio, Colwellia, Bradyrhizobium and Shewanella. This domain of about 100 residues is found in multiple (up to 35) copies in long proteins from several species of Vibrio, Colwellia, Bradyrhizobium, and Shewanella (hence the name VCBS) and in smaller copy numbers in proteins from several other bacteria. The large protein size and repeat copy numbers, species distribution, and suggested activities of several member proteins suggest a role for this domain in adhesion (TIGR).	61
404412	pfam13518	HTH_28	Helix-turn-helix domain. This helix-turn-helix domain is often found in transposases and is likely to be DNA-binding.	52
404413	pfam13519	VWA_2	von Willebrand factor type A domain. 	103
404414	pfam13520	AA_permease_2	Amino acid permease. 	427
404415	pfam13521	AAA_28	AAA domain. 	163
404416	pfam13522	GATase_6	Glutamine amidotransferase domain. This domain is a class-II glutamine amidotransferase domain found in a variety of enzymes, such as asparagine synthetase and glutamine--fructose-6-phosphate transaminase.	130
404417	pfam13523	Acetyltransf_8	Acetyltransferase (GNAT) domain. This domain catalyzes N-acetyltransferase reactions.	145
404418	pfam13524	Glyco_trans_1_2	Glycosyl transferases group 1. 	93
404419	pfam13525	YfiO	Outer membrane lipoprotein. This outer membrane lipoprotein carries a TPR-like region towards its N-terminal. YfiO in E. coli is one of three outer membrane lipoproteins that form a multicomponent YaeT complex in the outer membrane of Gram-negative bacteria that is involved in the targeting and folding of beta-barrel outer membrane proteins. YfiO is the only essential lipoprotein component of the complex. It is required for the proper assembly and/or targeting of outer membrane proteins to the outer membrane. Through its interactions with NlpB it maintains the functional integrity of the YaeT complex.	200
404420	pfam13526	DUF4125	Protein of unknown function (DUF4125). 	197
404421	pfam13527	Acetyltransf_9	Acetyltransferase (GNAT) domain. This domain catalyzes N-acetyltransferase reactions.	124
404422	pfam13528	Glyco_trans_1_3	Glycosyl transferase family 1. 	321
379241	pfam13529	Peptidase_C39_2	Peptidase_C39 like family. 	139
404423	pfam13530	SCP2_2	Sterol carrier protein domain. 	103
404424	pfam13531	SBP_bac_11	Bacterial extracellular solute-binding protein. This family includes bacterial extracellular solute-binding proteins.	224
404425	pfam13532	2OG-FeII_Oxy_2	2OG-Fe(II) oxygenase superfamily. 	191
404426	pfam13533	Biotin_lipoyl_2	Biotin-lipoyl like. 	50
404427	pfam13534	Fer4_17	4Fe-4S dicluster domain. This family includes proteins containing domains which bind to iron-sulfur clusters. Members include bacterial ferredoxins, various dehydrogenases, and various reductases. The structure of the domain is an alpha-antiparallel beta sandwich.	61
316093	pfam13535	ATP-grasp_4	ATP-grasp domain. This family includes a diverse set of enzymes that possess ATP-dependent carboxylate-amine ligase activity.	160
404428	pfam13536	EmrE	Putative multidrug resistance efflux transporter. This is a membrane protein family whose members are purported to be related to the Drug/Metabolite Transporter (DMT) Superfamily. Members are all uncharacterized.	259
404429	pfam13537	GATase_7	Glutamine amidotransferase domain. This domain is a class-II glutamine amidotransferase domain found in a variety of enzymes such as asparagine synthetase and glutamine-fructose-6-phosphate transaminase.	123
404430	pfam13538	UvrD_C_2	UvrD-like helicase C-terminal domain. This domain is found at the C-terminus of a wide variety of helicase enzymes. This domain has a AAA-like structural fold.	51
404431	pfam13539	Peptidase_M15_4	D-alanyl-D-alanine carboxypeptidase. This family resembles VanY, pfam02557, which is part of the peptidase M15 family.	67
404432	pfam13540	RCC1_2	Regulator of chromosome condensation (RCC1) repeat. 	30
404433	pfam13541	ChlI	Subunit ChlI of Mg-chelatase. 	121
404434	pfam13542	HTH_Tnp_ISL3	Helix-turn-helix domain of transposase family ISL3. 	51
404435	pfam13543	KSR1-SAM	SAM like domain present in kinase suppressor RAS 1. 	129
404436	pfam13545	HTH_Crp_2	Crp-like helix-turn-helix domain. This family represents a crp-like helix-turn-helix domain that is likely to bind DNA.	69
404437	pfam13546	DDE_5	DDE superfamily endonuclease. This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction.	270
404438	pfam13547	GTA_TIM	GTA TIM-barrel-like domain. This domain is found in the gene transfer agent protein. An unusual system of genetic exchange exists in the purple nonsulfur bacterium Rhodobacter capsulatus. DNA transmission is mediated by a small bacteriophage-like particle called the gene transfer agent (GTA) that transfers random 4.5-kb segments of the producing cell's genome to recipient cells, where allelic replacement occurs. The genes involved in this process appear to be found widely in bacteria. According to the SUPERFAMILY database this domain has a TIM barrel fold.	299
404439	pfam13548	DUF4126	Domain of unknown function (DUF4126). 	174
404440	pfam13549	ATP-grasp_5	ATP-grasp domain. This family includes a diverse set of enzymes that possess ATP-dependent carboxylate-amine ligase activity.	222
404441	pfam13550	Phage-tail_3	Putative phage tail protein. This putative domain is found in the large gene transfer agent protein. These produce defective phage like particles. This domain is similar to other phage-tail protein families.	164
379256	pfam13551	HTH_29	Winged helix-turn helix. This helix-turn-helix domain is often found in transferases and is likely to be DNA-binding.	64
404442	pfam13552	DUF4127	Protein of unknown function (DUF4127). This family of uncharacterized bacterial proteins are about 500 amino acids in length.	493
404443	pfam13553	FIIND	Function to find. The function to find (FIIND) was initially discovered in two proteins, NLRP1 (aka NALP1, CARD7, NAC, DEFCAP) and CARD8 (aka TUCAN, Cardinal). NLRP1 is a member of the Nod-like receptor (NLR) protein superfamily and is involved in apoptosis and inflammation. To date, it is the only NLR protein known to have a FIIND domain. The FIIND domain is also present in the CARD8 protein where, like in NLRP1, it is followed by a C-terminal CARD domain. Both proteins are described to form an "inflammasome", a macro-molecular complex able to process caspase 1 and activate pro-IL1beta. The FIIND domain is present in only a very small subset of the kingdom of life, comprising primates, rodents (mouse, rat), carnivores (dog) and a few more, such as horse. The function of this domain is yet to be determined. Publications describing the newly discovered NLRP1 protein failed to identify it as a separate domain; for example, it was taken as part of the adjacent leucine rich repeat domain (LRR). Upon discovery of CARD8 it was noted that the N-terminal region shared significant sequence identity with an undescribed region in NLRP1. Before getting its final name, FIIND, this domain was termed NALP1-associated domain (NAD).	251
404444	pfam13554	DUF4128	Bacteriophage related domain of unknown function. The three-dimensional structure of NP_888769.1, Structure 2L25, reveals a tail terminator protein gpU fold, which suggests that the protein could have a bacteriophage origin.	127
404445	pfam13555	AAA_29	P-loop containing region of AAA domain. 	61
404446	pfam13556	HTH_30	PucR C-terminal helix-turn-helix domain. This helix-turn-helix domain is often found at the C-terminus of PucR-like transcriptional regulators such as Bacillus subtilis pucR and is likely to be DNA-binding.	55
404447	pfam13557	Phenol_MetA_deg	Putative MetA-pathway of phenol degradation. 	238
404448	pfam13558	SbcCD_C	Putative exonuclease SbcCD, C subunit. Possible exonuclease SbcCD, C subunit, on AAA proteins.	90
404449	pfam13559	DUF4129	Domain of unknown function (DUF4129). This presumed domain is found at the C-terminus of proteins that contain a transglutaminase core domain. The function of this domain is unknown. The domain has a conserved TXXE motif.	70
404450	pfam13560	HTH_31	Helix-turn-helix domain. This domain is a helix-turn-helix domain that probably binds to DNA.	64
404451	pfam13561	adh_short_C2	Enoyl-(Acyl carrier protein) reductase. 	236
404452	pfam13562	NTP_transf_4	Sugar nucleotidyl transferase. This is a probable sugar nucleotidyl transferase family.	147
404453	pfam13563	2_5_RNA_ligase2	2'-5' RNA ligase superfamily. This family contains proteins related to pfam02834. These proteins are likely to be enzymes, but they may not share the RNA ligase activity.	152
404454	pfam13564	DoxX_2	DoxX-like family. This family of uncharacterized proteins are related to DoxX pfam07681.	103
404455	pfam13565	HTH_32	Homeodomain-like domain. 	73
404456	pfam13566	DUF4130	Domain of unknown function (DUF4130). 	157
379269	pfam13567	DUF4131	Domain of unknown function (DUF4131). This domain is frequently found to the N-terminus of the Competence domain, pfam03772.	165
404457	pfam13568	OMP_b-brl_2	Outer membrane protein beta-barrel domain. This domain is found in a wide range of outer membrane proteins. This domain assumes a membrane bound beta-barrel fold.	175
404458	pfam13569	DUF4132	Domain of unknown function (DUF4132). This domain might be involved in the biosynthesis of the molybdopterin cofactor in E.coli.	179
404459	pfam13570	PQQ_3	PQQ-like domain. 	38
404460	pfam13571	DUF4133	Domain of unknown function (DUF4133). Based on Bacteroides thetaiotaomicron gene BT_0094, a putative uncharacterized protein. As seen in gene expression experiments (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE2231), it appears to be upregulated in the presence of host or vs when in culture.	91
404461	pfam13572	DUF4134	Domain of unknown function (DUF4134). Based on Bacteroides thetaiotaomicron gene BT_0095, a putative uncharacterized protein. As seen in gene expression experiments (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE2231), it appears to be upregulated in the presence of host or vs when in culture.	92
404462	pfam13573	SprB	SprB repeat. This repeat occurs several times in SprB, a cell surface protein involved in gliding motility in the bacterium Flavobacterium johnsoniae.	37
372637	pfam13574	Reprolysin_2	Metallo-peptidase family M12B Reprolysin-like. This zinc-binding metallo-peptidase has the characteristic binding motif HExxGHxxGxxH of Reprolysin-like peptidases of family M12B.	193
404463	pfam13575	DUF4135	Domain of unknown function (DUF4135). This presumed domain is functionally uncharacterized. This domain family is found in bacteria and archaea, and is approximately 380 amino acids in length. The family is found in association with pfam05147. This domain may be involved in synthesis of a lantibiotic compound.	367
404464	pfam13576	Pentapeptide_3	Pentapeptide repeats (9 copies). 	48
404465	pfam13577	SnoaL_4	SnoaL-like domain. This family contains a large number of proteins that share the SnoaL fold.	125
404466	pfam13578	Methyltransf_24	Methyltransferase domain. This family appears to be a methyltransferase domain.	106
404467	pfam13579	Glyco_trans_4_4	Glycosyl transferase 4-like domain. 	158
404468	pfam13580	SIS_2	SIS domain. SIS (Sugar ISomerase) domains are found in many phosphosugar isomerases and phosphosugar binding proteins. SIS domains are also found in proteins that regulate the expression of genes involved in synthesis of phosphosugars.	138
404469	pfam13581	HATPase_c_2	Histidine kinase-like ATPase domain. 	126
404470	pfam13582	Reprolysin_3	Metallo-peptidase family M12B Reprolysin-like. This zinc-binding metallo-peptidase has the characteristic binding motif HExxGHxxGxxH of Reprolysin-like peptidases of family M12B.	122
404471	pfam13583	Reprolysin_4	Metallo-peptidase family M12B Reprolysin-like. This zinc-binding metallo-peptidase has the characteristic binding motif HExxGHxxGxxH of Reprolysin-like peptidases of family M12B.	203
404472	pfam13584	BatD	Oxygen tolerance. This family of proteins carries up to three membrane spanning regions and is involved in tolerance to oxygen in Bacteroides spp.	95
404473	pfam13585	CHU_C	C-terminal domain of CHU protein family. The function of this C-terminal domain is not known; there are several conserved tryptophan and asparagine residues.	85
404474	pfam13586	DDE_Tnp_1_2	Transposase DDE domain. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis.	90
404475	pfam13588	HSDR_N_2	Type I restriction enzyme R protein N-terminus (HSDR_N). This family consists of a number of N terminal regions found in type I restriction enzyme R (HSDR) proteins. Restriction and modification (R/M) systems are found in a wide variety of prokaryotes and are thought to protect the host bacterium from the uptake of foreign DNA. Type I restriction and modification systems are encoded by three genes: hsdR, hsdM, and hsdS. The three polypeptides, HsdR, HsdM, and HsdS, often assemble to give an enzyme (R2M2S1) that modifies hemimethylated DNA and restricts unmethylated DNA.	110
404476	pfam13589	HATPase_c_3	Histidine kinase-, DNA gyrase B-, and HSP90-like ATPase. This family represents, additionally, the structurally related ATPase domains of histidine kinase, DNA gyrase B and HSP90.	135
404477	pfam13590	DUF4136	Domain of unknown function (DUF4136). This domain is found in bacterial lipoproteins. The function is not known.	112
404478	pfam13591	MerR_2	MerR HTH family regulatory protein. 	84
404479	pfam13592	HTH_33	Winged helix-turn helix. This helix-turn-helix domain is often found in transferases and is likely to be DNA-binding.	60
404480	pfam13593	SBF_like	SBF-like CPA transporter family (DUF4137). These family members are 7TM putative membrane transporter proteins. The family is similar to the SBF family of bile-acid symporters, pfam01758.	313
404481	pfam13595	DUF4138	Domain of unknown function (DUF4138). Based on Bacteroides thetaiotaomicron gene BT_4780, a putative uncharacterized protein. As seen in gene expression experiments (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE2231), it appears to be upregulated in the presence of host or vs when in culture.	244
404482	pfam13596	PAS_10	PAS domain. 	106
404483	pfam13597	NRDD	Anaerobic ribonucleoside-triphosphate reductase. 	554
404484	pfam13598	DUF4139	Domain of unknown function (DUF4139). This family is usually found at the C-terminus of proteins.	317
404485	pfam13599	Pentapeptide_4	Pentapeptide repeats (9 copies). 	78
404486	pfam13600	DUF4140	N-terminal domain of unknown function (DUF4140). This family is often found at the N-terminus of its member proteins, with DUF4139, pfam13598, at the C-terminus.	99
404487	pfam13601	HTH_34	Winged helix DNA-binding domain. 	80
404488	pfam13602	ADH_zinc_N_2	Zinc-binding dehydrogenase. 	132
404489	pfam13603	tRNA-synt_1_2	Leucyl-tRNA synthetase, Domain 2. This is a family of the conserved region of Leucine-tRNA ligase or Leucyl-tRNA synthetase, EC:6.1.1.4.	184
404490	pfam13604	AAA_30	AAA domain. This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. There is a Walker A and Walker B.	191
404491	pfam13605	DUF4141	Domain of unknown function (DUF4141). Based on Bacteroides thetaiotaomicron gene BT_4772, a putative uncharacterized protein. As seen in gene expression experiments (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE2231), it appears to be upregulated in the presence of host or vs when in culture.	53
404492	pfam13606	Ank_3	Ankyrin repeat. Ankyrins are multifunctional adaptors that link specific proteins to the membrane-associated, spectrin-actin cytoskeleton. This repeat-domain is a 'membrane-binding' domain of up to 24 repeated units, and it mediates most of the protein's binding activities.	30
404493	pfam13607	Succ_CoA_lig	Succinyl-CoA ligase like flavodoxin domain. This domain contains the catalytic domain from Succinyl-CoA ligase alpha subunit and other related enzymes. A conserved histidine is involved in phosphoryl transfer.	136
290339	pfam13608	Potyvirid-P3	Protein P3 of Potyviral polyprotein. This is the P3 protein section of the Potyviridae polyproteins. The function is not known except that the protein is essential to viral survival.	452
404494	pfam13609	Porin_4	Gram-negative porin. 	310
404495	pfam13610	DDE_Tnp_IS240	DDE domain. This DDE domain is found in a wide variety of transposases including those found in IS240, IS26, IS6100 and IS26.	138
372647	pfam13611	Peptidase_S76	Serine peptidase of plant viral polyprotein, P1. This family is the P1 protein of the Potyviridae polyproteins that is a serine peptidase at the N-terminus. The catalytic triad in the genome polyprotein of ssRNA positive-strand Brome streak mosaic rymovirus, is His-311, Asp-322 and Ser-355.	119
404496	pfam13612	DDE_Tnp_1_3	Transposase DDE domain. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contains three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction.	154
404497	pfam13613	HTH_Tnp_4	Helix-turn-helix of DDE superfamily endonuclease. This domain is the probable DNA-binding region of transposase enzymes, necessary for efficient DNA transposition. Most of the members derive from the IS superfamily IS5 and rather fewer from IS4.	53
404498	pfam13614	AAA_31	AAA domain. This family includes a wide variety of AAA domains including some that have lost essential nucleotide binding residues in the P-loop.	177
404499	pfam13616	Rotamase_3	PPIC-type PPIASE domain. Rotamases increase the rate of protein folding by catalyzing the interconversion of cis-proline and trans-proline.	116
404500	pfam13617	Lipoprotein_19	YnbE-like lipoprotein. This family includes lipoproteins similar to E. coli YnbE. Proteins in this family are typically 60 amino acids in length and contain an N-terminal lipid attachment site, which has been included in the alignment to increase sensitivity. The specific function of these proteins is unknown.	34
404501	pfam13618	Gluconate_2-dh3	Gluconate 2-dehydrogenase subunit 3. This family corresponds to subunit 3 of the Gluconate 2-dehydrogenase enzyme that catalyzes the conversion of gluconate to 2-dehydro-D-gluconate EC:1.1.99.3.	134
404502	pfam13619	KTSC	KTSC domain. This short domain is named after Lysine tRNA synthetase C-terminal domain. It is found at the C-terminus of some Lysyl tRNA synthetases as well as a single domain in bacterial proteins. The domain is about 60 amino acids in length and contains a reasonably conserved YXY motif in the centre of the sequence. The function of this domain is unknown but it could be an RNA binding domain.	58
404503	pfam13620	CarboxypepD_reg	Carboxypeptidase regulatory-like domain. 	81
404504	pfam13621	Cupin_8	Cupin-like domain. This cupin like domain shares similarity to the JmjC domain.	251
404505	pfam13622	4HBT_3	Thioesterase-like superfamily. This family contains a wide variety of enzymes, principally thioesterases. These enzymes are part of the Hotdog fold superfamily.	240
404506	pfam13623	SurA_N_2	SurA N-terminal domain. This domain is found at the N-terminus of the chaperone SurA. It is a helical domain of unknown function. The C-terminus of the SurA protein folds back and forms part of this domain also but is not included in the current alignment.	139
404507	pfam13624	SurA_N_3	SurA N-terminal domain. This domain is found at the N-terminus of the chaperone SurA. It is a helical domain of unknown function. The C-terminus of the SurA protein folds back and forms part of this domain also but is not included in the current alignment.	162
404508	pfam13625	Helicase_C_3	Helicase conserved C-terminal domain. This domain family is found in a wide variety of helicases and helicase-related proteins.	121
404509	pfam13627	LPAM_2	Prokaryotic lipoprotein-attachment site. In prokaryotes, membrane lipoproteins are synthesized with a precursor signal peptide, which is cleaved by a specific lipoprotein signal peptidase (signal peptidase II). The peptidase recognizes a conserved sequence and cuts upstream of a cysteine residue to which a glyceride-fatty acid lipid is attached.	23
404510	pfam13628	DUF4142	Domain of unknown function (DUF4142). This is a bacterial family of unknown function.	138
404511	pfam13629	T2SS-T3SS_pil_N	Pilus formation protein N terminal region. 	72
404512	pfam13630	SdpI	SdpI/YhfL protein family. This family of proteins includes the SdpI and YhfL proteins from B. subtilis. The SdpI protein is a multipass integral membrane protein that protects toxin-producing cells from being killed. Killing is mediated by the exported toxic protein SdpC, an extracellular protein that induces the synthesis of an immunity protein.	71
379304	pfam13631	Cytochrom_B_N_2	Cytochrome b(N-terminal)/b6/petB. 	169
404513	pfam13632	Glyco_trans_2_3	Glycosyl transferase family group 2. Members of this family of prokaryotic proteins include putative glucosyltransferases, which are involved in bacterial capsule biosynthesis.	194
404514	pfam13634	Nucleoporin_FG	Nucleoporin FG repeat region. This family includes a number of FG repeats that are found in nucleoporin proteins. This family includes the yeast nucleoporins Nup116, Nup100, Nup49, Nup57 and Nup145.	90
404515	pfam13635	DUF4143	Domain of unknown function (DUF4143). This domain is almost always found C-terminal to an ATPase core family.	160
404516	pfam13636	Methyltranf_PUA	RNA-binding PUA-like domain of methyltransferase RsmF. Methyltranf_PUA is the second of two C-terminal domains found on bacterial methyltransferase RsmF that modifies the 16S ribosomal RNA. It has some structural similarity to the RNA-binding PUA domains suggesting that it is involved in RNA recognition. It lies downstream of the catalytic centre of this methyltransferase, family pfam01189.	50
372654	pfam13637	Ank_4	Ankyrin repeats (many copies). 	54
404517	pfam13638	PIN_4	PIN domain. Members of this family of bacterial domains are predicted to be RNases (from similarities to 5'-exonucleases).	131
404518	pfam13639	zf-RING_2	Ring finger domain. 	44
404519	pfam13640	2OG-FeII_Oxy_3	2OG-Fe(II) oxygenase superfamily. This family contains members of the 2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily.	94
404520	pfam13641	Glyco_tranf_2_3	Glycosyltransferase like family 2. Members of this family of prokaryotic proteins include putative glucosyltransferase, which are involved in bacterial capsule biosynthesis.	230
404521	pfam13642	DUF4144	Protein structure with unknown function. A family based on the three-dimensional structure of YP_926445.1 (Structure 2L6O).	95
404522	pfam13643	DUF4145	Domain of unknown function (DUF4145). This domain is found in a variety of restriction endonuclease enzymes. The exact function of this domain is uncertain.	88
404523	pfam13644	DKNYY	DKNYY family. This family represents a group of proteins found enriched in fusobacteria. These proteins contain many repeats of a DKNXXYY motif. The repeats are spaced at about 35 amino acid residues intervals. These proteins are likely to be associated with the membrane. The specific function of these proteins is unknown.	150
404524	pfam13645	YkuD_2	L,D-transpeptidase catalytic domain. This family is related to pfam03734.	169
404525	pfam13646	HEAT_2	HEAT repeats. This family includes multiple HEAT repeats.	88
404526	pfam13647	Glyco_hydro_80	Glycosyl hydrolase family 80 of chitosanase A. This is a small family of bacterial chitosanases. These have lysozyme-like activity.	307
404527	pfam13648	Lipocalin_4	Lipocalin-like domain. 	89
404528	pfam13649	Methyltransf_25	Methyltransferase domain. This family appears to be a methyltransferase domain.	97
404529	pfam13650	Asp_protease_2	Aspartyl protease. This family consists of predicted aspartic proteases, typically from 180 to 230 amino acids in length, in MEROPS clan AA. This model describes the well-conserved 121-residue C-terminal region. The poorly conserved, variable length N-terminal region usually contains a predicted transmembrane helix.	90
404530	pfam13651	EcoRI_methylase	Adenine-specific methyltransferase EcoRI. This methylase recognizes the double-stranded sequence GAATTC, causes specific methylation on A-3 on both strands, and protects the DNA from cleavage by the EcoRI endonuclease.	343
404531	pfam13652	QSregVF	Putative quorum-sensing-regulated virulence factor. This is a family of short ~14 kDa proteins from Pseudomonas. The structure of UniProtKB:Q9HY15, a secreted protein, has been solved and deposited as Structure 3npd. It comprises one structural domain with five beta-strands and five alpha-helices. Various comparative structural prediction methods plus its genomic location point to the protein forming a functional dimer with its adjacent genomic partner, UniProtKB:Q9HY14, in pfam12843. Together these might be regulated by the other product from the PotABCD operon, namely the putrescine-binding periplasmic protein UniProtKB:Q9HY16, which has been implicated in quorum-sensing. QSregVF is certainly up-regulated in quorum-sensing, and is predicted to be a virulence factor.	110
404532	pfam13653	GDPD_2	Glycerophosphoryl diester phosphodiesterase family. This family also includes glycerophosphoryl diester phosphodiesterases as well as agrocinopine synthase; the similarity to GDPD has been noted. This family appears to have weak but not significant matches to mammalian phospholipase C pfam00388, which suggests that this family may adopt a TIM barrel fold.	30
404533	pfam13654	AAA_32	AAA domain. This family includes a wide variety of AAA domains including some that have lost essential nucleotide binding residues in the P-loop.	514
404534	pfam13655	RVT_N	N-terminal domain of reverse transcriptase. This domain is found at the N-terminus of bacterial reverse transcriptases.	83
404535	pfam13656	RNA_pol_L_2	RNA polymerase Rpb3/Rpb11 dimerization domain. The two eukaryotic subunits Rpb3 and Rpb11 dimerize to form a platform onto which the other subunits of the RNA polymerase assemble (D/L in archaea). The prokaryotic equivalent of the Rpb3/Rpb11 platform is the alpha-alpha dimer. The dimerization domain of the alpha subunit/Rpb3 is interrupted by an insert domain (pfam01000). Some of the alpha subunits also contain iron-sulphur binding domains (pfam00037). Rpb11 is found as a continuous domain. Members of this family include: alpha subunit from eubacteria, alpha subunits from chloroplasts, Rpb3 subunits from eukaryotes, Rpb11 subunits from eukaryotes, RpoD subunits from archaeal spp, and RpoL subunits from archaeal spp. Many of the members of this family carry only the N-terminal region of Rpb11.	75
404536	pfam13657	Couple_hipA	HipA N-terminal domain. This domain is found to the N-terminus of HipA-like proteins. It is also found in isolation in some proteins.	95
404537	pfam13660	DUF4147	Domain of unknown function (DUF4147). This domain is frequently found at the N-terminus of proteins carrying the glycerate kinase-like domain MOFRL, pfam05161.	231
379319	pfam13661	2OG-FeII_Oxy_4	2OG-Fe(II) oxygenase superfamily. This family contains members of the 2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily.	93
404538	pfam13662	Toprim_4	Toprim domain. The toprim domain is found in a wide variety of enzymes involved in nucleic acid manipulation.	83
404539	pfam13663	DUF4148	Domain of unknown function (DUF4148). 	62
404540	pfam13664	DUF4149	Domain of unknown function (DUF4149). 	101
404541	pfam13665	DUF4150	Domain of unknown function (DUF4150). 	108
404542	pfam13667	ThiC-associated	ThiC-associated domain. This domain is most frequently found at the N-terminus of the ThiC family of proteins, pfam01964. The function is not known.	71
404543	pfam13668	Ferritin_2	Ferritin-like domain. This family contains ferritins and other ferritin-like proteins such as members of the DPS family and bacterioferritins.	138
404544	pfam13669	Glyoxalase_4	Glyoxalase/Bleomycin resistance protein/Dioxygenase superfamily. 	109
404545	pfam13670	PepSY_2	Peptidase propeptide and YPEB domain. This region is likely to have a protease inhibitory function (personal obs:C Yeats). The name is derived from Peptidase & Bacillus subtilis YPEB.	83
404546	pfam13671	AAA_33	AAA domain. This family of domains contain only a P-loop motif, that is characteristic of the AAA superfamily. Many of the proteins in this family are just short fragments so there is no Walker B motif.	143
404547	pfam13672	PP2C_2	Protein phosphatase 2C. Protein phosphatase 2C is a Mn++ or Mg++ dependent protein serine/threonine phosphatase.	209
404548	pfam13673	Acetyltransf_10	Acetyltransferase (GNAT) domain. This family contains proteins with N-acetyltransferase functions such as Elp3-related proteins.	128
404549	pfam13675	PilJ	Type IV pili methyl-accepting chemotaxis transducer N-term. This domain is found on many type IV pili methyl-accepting chemotaxis transducer proteins where there is also a HAMP signature towards the C-terminus.	112
404550	pfam13676	TIR_2	TIR domain. This is a family of bacterial Toll-like receptors.	119
404551	pfam13677	MotB_plug	Membrane MotB of proton-channel complex MotA/MotB. This is the MotB member of the E.coli MotA/MotB proton-channel complex that forms the stator of the bacterial membrane flagellar motor. Key residues act as a plug to prevent premature proton flow. The plug is in the periplasm just C-terminal to the MotB TM, consisting of an amphipathic alpha helix flanked by Pro-52 and Pro-65. In addition to the Pro residues, Ile-58, Tyr-61, and Phe-62 are also essential for plug function.	58
372668	pfam13678	Peptidase_M85	NFkB-p65-degrading zinc protease. This family of bacterial metallo-peptidases is thought to compromise the inflammatory response by degrading p65 thereby down-regulating the NF-kappaB signalling pathway. NF-kappa-B is a pleiotropic transcription factor which is present in almost all cell types and is involved in many biological processes such as inflammation, immunity, differentiation, cell growth, tumorigenesis and apoptosis. NF-kappa-B is a homo- or heterodimeric complex formed by the Rel-like domain-containing proteins RELA/p65, RELB, NFKB1/p105, NFKB1/p50, REL and NFKB2/p52; and the heterodimeric p65-p50 complex appears to be the most abundant one.	251
379330	pfam13679	Methyltransf_32	Methyltransferase domain. This family appears to be a methyltransferase domain.	138
404552	pfam13680	DUF4152	Protein of unknown function (DUF4152). This family of proteins is functionally uncharacterized. This family of proteins is found in archaea. Proteins in this family are approximately 230 amino acids in length. The structure of PF2046 from Pyrococcus furiosus has been solved. It shows an RNaseH like fold that conserves critical catalytic residues. This suggests that these proteins may cleave nucleic acid.	224
404553	pfam13681	PilX	Type IV pilus assembly protein PilX C-term. This family is likely to be the C-terminal region of type IV pilus assembly PilX or PilW proteins.	92
404554	pfam13682	CZB	Chemoreceptor zinc-binding domain. The chemoreceptor zinc-binding domain (CZB) is found in bacterial signal transduction proteins - most frequently receptors involved in chemotaxis and motility, but also in c-di-GMP signalling and nitrate/nitrite-sensing. Originally discovered in the cytoplasmic chemoreceptor TlpD from Helicobacter pylori, it is often found C-terminal to the MCPsignal domain in cytoplasmic chemoreceptor proteins. The CZB domain contains a core sequence motif, Hxx[WFYL]x21-28Cx[LFMVI]Gx[WFLVI]x18-27HxxxH. The highly-conserved H-C-H-H residues of this motif are believed to coordinate zinc; mutating the latter two histidines of the motif to alanines abolishes Zn binding. This domain binds zinc with high affinity, with a Kd in the femtomolar range. This domain has been shown in E. coli to be a zinc sensor that regulates the catalytic activity of pfam00990.	64
404555	pfam13683	rve_3	Integrase core domain. 	67
404556	pfam13684	Dak1_2	Dihydroxyacetone kinase family. This is the kinase domain of the dihydroxyacetone kinase family.	313
404557	pfam13685	Fe-ADH_2	Iron-containing alcohol dehydrogenase. 	251
404558	pfam13686	DrsE_2	DsrE/DsrF/DrsH-like family. DsrE is a small soluble protein involved in intracellular sulfur reduction. The family also includes YrkE proteins.	156
404559	pfam13687	DUF4153	Domain of unknown function (DUF4153). Members of this family are annotated as putative inner membrane proteins.	216
372673	pfam13688	Reprolysin_5	Metallo-peptidase family M12. 	191
404560	pfam13689	DUF4154	Domain of unknown function (DUF4154). This family of proteins is found in bacteria. Proteins in this family are typically between 172 and 207 amino acids in length. Many members are annotated as valyl-tRNA synthetase but this could not be confirmed.	140
404561	pfam13690	CheX	Chemotaxis phosphatase CheX. CheX is very closely related to the CheC chemotaxis phosphatase, but it dimerizes in a different way, via a continuous beta sheet between the subunits. CheC and CheX both dephosphorylate CheY, although CheC requires binding of CheD to achieve the activity of CheX. The ability of bacteria to modulate their swimming behaviour in the presence of external chemicals (nutrients and repellents) is one of the most rudimentary behavioural responses known, but the individual components are very sensitively tuned.	94
404562	pfam13691	Lactamase_B_4	tRNase Z endonuclease. This is a family of tRNase Z enzymes that are closely related structurally to the Lactamase_B family members. tRNase Z is the endonuclease that is involved in tRNA 3'-end maturation through removal of the 3'-trailer sequences from tRNA precursors. The fission yeast Schizosaccharomyces pombe contains two candidate tRNase Zs encoded by two essential genes. The first, trz1, is targeted to the nucleus and has an SV40 nuclear localization signal at its N-terminus, consisting of four consecutive arginine and lysine residues between residues 208 and 211 (KKRK) that is critical for the NLS function. The second, trz2, is targeted to the mitochondria, with an N-terminal mitochondrial targeting signal within the first 38 residues.	63
404563	pfam13692	Glyco_trans_1_4	Glycosyl transferases group 1. 	138
404564	pfam13693	HTH_35	Winged helix-turn-helix DNA-binding. 	70
404565	pfam13694	Hph	Sec63/Sec62 complex-interacting family. This is a family of closely related Hph proteins that are integral endoplasmic reticulum (ER) membrane proteins required for yeast survival under environmental stress conditions. They interact with several subunits of the Sec63/Sec62 complex that mediates post-translational translocation of proteins into the ER. Cells with mutant Hph1 and Hph2 proteins revealed phenotypes resembling those of mutants defective for vacuolar proton ATPase (V-ATPase) activity. The yeast V-ATPase is a multisubunit complex whose function, structure, and assembly have been well characterized. Cells with impaired V-ATPase activity fail to acidify the vacuole, cannot grow at alkaline pH, and are sensitive to high concentrations of extracellular calcium.	187
404566	pfam13695	zf-3CxxC	Zinc-binding domain. This is a family with several pairs of CxxC motifs possibly representing a multiple zinc-binding region. Only one pair of cysteines is associated with a highly conserved histidine residue.	95
404567	pfam13696	zf-CCHC_2	Zinc knuckle. This is a zinc-binding domain of the form CxxCxxxGHxxxxC from a variety of different species.	21
404568	pfam13698	DUF4156	Domain of unknown function (DUF4156). The function of this family is unknown but members are annotated as putative lipoprotein outer membrane proteins.	89
404569	pfam13699	DUF4157	Domain of unknown function (DUF4157). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, archaea and eukaryotes, and is approximately 80 amino acids in length. This domain contains an HEXXH motif that is characteristic of many families of metallopeptidases. However, no peptidase activity has been shown for this domain.	79
404570	pfam13700	DUF4158	Domain of unknown function (DUF4158). The exact function of this domain is not clear, but it frequently occurs as an N-terminal region of transposase 3 or IS3 family of insertion elements.	166
404571	pfam13701	DDE_Tnp_1_4	Transposase DDE domain group 1. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis.	433
404572	pfam13702	Lysozyme_like	Lysozyme-like. 	165
404573	pfam13704	Glyco_tranf_2_4	Glycosyl transferase family 2. Members of this family of prokaryotic proteins include putative glucosyltransferases.	97
404574	pfam13705	TRC8_N	TRC8 N-terminal domain. This region is found at the N-terminus of the TRC8 protein. TRC8 is an E3 ubiquitin-protein ligase also known as RNF139. This region contains 12 transmembrane domains. This region has been suggested to contain a sterol sensing domain. It has been found that TRC8 protein levels are sterol responsive and that it binds and stimulates ubiquitylation of the endoplasmic reticulum anchor protein INSIG.	491
404575	pfam13707	RloB	RloB-like protein. This family includes the RloB protein that is found within a bacterial restriction modification operon. This family includes the AbiLii protein that is found as part of a plasmid encoded phage abortive infection mechanism. Deletion within abiLii abolished the phage resistance. The family includes some proteins annotated as CRISPR Csm2 proteins.	152
404576	pfam13708	DUF4942	Domain of unknown function (DUF4942). The function of this family is not known.	187
404577	pfam13709	DUF4159	Domain of unknown function (DUF4159). Members of this family are hypothetical proteins.	191
404578	pfam13710	ACT_5	ACT domain. ACT domains bind to amino acids and regulate associated enzyme domains. These ACT domains are found at the C-terminus of the RelA protein.	61
404579	pfam13711	DUF4160	Domain of unknown function (DUF4160). 	61
404580	pfam13712	Glyco_tranf_2_5	Glycosyltransferase like family. Members of this family of prokaryotic proteins include putative glucosyltransferases, which are involved in bacterial capsule biosynthesis.	210
404581	pfam13713	BRX_N	Transcription factor BRX N-terminal domain. The BREVIS RADIX (BRX) domain was characterized as being a transcription factor in plants regulating the extent of cell proliferation and elongation in the growth zone of the root. BRX is rate limiting for auxin-responsive gene-expression by mediating cross-talk with the brassino-steroid pathway. BRX has a ubiquitous, although quantitatively variable role in modulating the growth rate in both the root and the shoot. This family features a short region, also alpha-helical, N-terminal to the repeated alpha-helices of family BRX, pfam08381. BRX is expressed in the vasculature and is rate-limiting for transcriptional auxin action.	37
404582	pfam13714	PEP_mutase	Phosphoenolpyruvate phosphomutase. This domain includes the enzyme Phosphoenolpyruvate phosphomutase (EC:5.4.2.9). This protein has been characterized as catalyzing the formation of a carbon-phosphorus bond by converting phosphoenolpyruvate (PEP) to phosphonopyruvate (P-Pyr). This enzyme has a TIM barrel fold.	241
404583	pfam13715	CarbopepD_reg_2	CarboxypepD_reg-like domain. This domain family is found in bacteria, archaea and eukaryotes, and is approximately 90 amino acids in length. The family is found in association with pfam07715 and pfam00593.	88
404584	pfam13716	CRAL_TRIO_2	Divergent CRAL/TRIO domain. This family includes divergent members of the CRAL-TRIO domain family. This family includes ECM25 that contains a divergent CRAL-TRIO domain identified by Gallego and colleagues.	140
404585	pfam13717	zinc_ribbon_4	zinc-ribbon domain. This family consists of a single zinc ribbon domain, ie half of a pair as in family DZR, pfam12773.	36
404586	pfam13718	GNAT_acetyltr_2	GNAT acetyltransferase 2. This domain has N-acetyltransferase activity. It has a GCN5-related N-acetyltransferase (GNAT) fold.	228
316258	pfam13719	zinc_ribbon_5	zinc-ribbon domain. This family consists of a single zinc ribbon domain, ie half of a pair as in family DZR, pfam12773.	37
404587	pfam13720	Acetyltransf_11	Udp N-acetylglucosamine O-acyltransferase; Domain 2. This is domain 2, or the C-terminal domain, of Udp N-acetylglucosamine O-acyltransferase. This enzyme is a zinc-dependent enzyme that catalyzes the deacetylation of UDP-3-O-((R)-3-hydroxymyristoyl)-N-acetylglucosamine to form UDP-3-O-(R-hydroxymyristoyl)glucosamine and acetate.	82
404588	pfam13721	SecD-TM1	SecD export protein N-terminal TM region. This domain appears to be the first transmembrane region of the SecD export protein. SecD is directly involved in protein secretion and important for the release of proteins that have been translocated across the cytoplasmic membrane.	103
404589	pfam13722	CstA_5TM	5TM C-terminal transporter carbon starvation CstA. CstA_5TM is the last five transmembrane regions of the peptide transporter carbon starvation family CstA.	118
404590	pfam13723	Ketoacyl-synt_2	Beta-ketoacyl synthase, N-terminal domain. 	226
404591	pfam13724	DNA_binding_2	DNA-binding domain. This domain, often found on ovate proteins, binds to single-stranded and double-stranded DNA. Binding to DNA is not sequence-specific.	48
404592	pfam13725	tRNA_bind_2	Possible tRNA binding domain. This domain, found at the C-terminus of tRNA(Met) cytidine acetyltransferase, may be involved in tRNA-binding.	231
404593	pfam13726	Na_H_antiport_2	Na+-H+ antiporter family. This family includes integral membrane proteins, some of which are Na+-H+ antiporters.	88
379351	pfam13727	CoA_binding_3	CoA-binding domain. 	175
404594	pfam13728	TraF	F plasmid transfer operon protein. TraF protein undergoes proteolytic processing associated with export. The 19 amino acids at the amino terminus of the polypeptides appear to constitute a typical membrane leader peptide - not included in this family, while the remainder of the molecule is predicted to be primarily hydrophilic in character. F plasmid TraF and TraH are required for F pilus assembly and F plasmid transfer, and they are both localized to the outer membrane in the presence of the complete F transfer region, especially TraV, the putative anchor.	224
404595	pfam13729	TraF_2	F plasmid transfer operon, TraF, protein. 	268
404596	pfam13730	HTH_36	Helix-turn-helix domain. 	55
404597	pfam13731	WxL	WxL domain surface cell wall-binding. The WxL motif appears in two or three copies in these bacterial proteins and confers a cell surface localization function. It seems likely that this region is the cell wall-binding domain of gram-positive bacteria, and may interact with the peptidoglycan.	205
404598	pfam13732	DUF4162	Domain of unknown function (DUF4162). This domain is found at the C-terminus of bacterial ABC transporter proteins. The function is not known.	82
404599	pfam13733	Glyco_transf_7N	N-terminal region of glycosyl transferase group 7. This is the N-terminal half of a family of galactosyltransferases from a wide range of Metazoa with three related galactosyltransferases activities, all three of which are possessed by one sequence in some cases. EC:2.4.1.90, N-acetyllactosamine synthase; EC:2.4.1.38, Beta-N-acetylglucosaminyl-glycopeptide beta-1,4- galactosyltransferase; and EC:2.4.1.22 Lactose synthase. Note that N-acetyllactosamine synthase is a component of Lactose synthase along with alpha-lactalbumin, in the absence of alpha-lactalbumin EC:2.4.1.90 is the catalyzed reaction.	133
404600	pfam13734	Inhibitor_I69	Spi protease inhibitor. This family includes the inhibitor Spi and the pro-peptides of streptopain (SpeB). SpeB is produced as a 43 kDa pre-pro-protein, which is secreted via the recently described Sec secretory pathway Exportal. There is tight coupling between this inhibitor and its associated protease: the gene for the inhibitor Spi is located directly downstream from the gene for the streptococcal cysteine protease SpeB, and the sequence of the inhibitor is very similar to that of the SpeB propeptide. This is an example of an inhibitor molecule that is a structural homolog of the cognate propeptide, and is genetically linked to the protease gene.	98
404601	pfam13735	tRNA_NucTran2_2	tRNA nucleotidyltransferase domain 2 putative. 	149
404602	pfam13737	DDE_Tnp_1_5	Transposase DDE domain. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis.	112
404603	pfam13738	Pyr_redox_3	Pyridine nucleotide-disulphide oxidoreductase. 	296
404604	pfam13739	DUF4163	Domain of unknown function (DUF4163). The structure of this domain is an alpha-beta two-layer sandwich, identified from a Fervidobacterium nodosum Rt17-B1 like protein. The function is not known except that it is found in association with Heat-shock cognate 70kd protein 44kd ATPase, pfam11738.	94
404605	pfam13740	ACT_6	ACT domain. ACT domains bind to amino acids and regulate associated enzyme domains.	76
404606	pfam13741	MRP-S25	Mitochondrial ribosomal protein S25. This is the family of fungal 37S mitochondrial ribosomal S25 proteins.	221
404607	pfam13742	tRNA_anti_2	OB-fold nucleic acid binding domain. This family contains OB-fold domains that bind to nucleic acids.	95
404608	pfam13743	Thioredoxin_5	Thioredoxin. 	176
404609	pfam13744	HTH_37	Helix-turn-helix domain. Members of this family contain a DNA-binding helix-turn-helix domain.	80
404610	pfam13746	Fer4_18	4Fe-4S dicluster domain. This family includes proteins containing domains which bind to iron-sulfur clusters. Members include bacterial ferredoxins, various dehydrogenases, and various reductases. The structure of the domain is an alpha-antiparallel beta sandwich.	114
404611	pfam13747	DUF4164	Domain of unknown function (DUF4164). This is a family of short, approx 100 residue-long, bacterial proteins of unknown function. There are several conserved LE/LD sequence pairs.	89
404612	pfam13748	ABC_membrane_3	ABC transporter transmembrane region. This family represents a unit of six transmembrane helices.	237
404613	pfam13749	HATPase_c_4	Putative ATP-dependent DNA helicase recG C-terminal. This domain may well interact selectively and non-covalently with ATP, adenosine 5'-triphosphate, a universally important coenzyme and enzyme regulator.	88
404614	pfam13750	Big_3_3	Bacterial Ig-like domain (group 3). This family consists of bacterial domains with an Ig-like fold. Members of this family are found in a variety of bacterial surface proteins.	157
404615	pfam13751	DDE_Tnp_1_6	Transposase DDE domain. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis.	125
404616	pfam13752	DUF4165	Domain of unknown function (DUF4165). 	119
404617	pfam13753	SWM_repeat	Putative flagellar system-associated repeat. This family appears to be a repeated unit that can occur up to 29 times in these outer membrane proteins. It is putatively associated with a novel flagellar system.	87
404618	pfam13754	Big_3_4	Domain of unknown function. This is a family of uncharacterized Clostridiales proteins.	105
404619	pfam13755	Sensor_TM1	Sensor N-terminal transmembrane domain. This domain is found at the N-terminus of the sensor component of the two-component regulatory system. It includes a transmembrane region and part of the periplasmic region, which is likely to be involved in stimulus sensing.	68
404620	pfam13756	Stimulus_sens_1	Stimulus-sensing domain. This domain is found in the periplasmic region of the sensor component of the two-component regulatory system. The periplasmic region is likely to be involved in stimulus sensing.	110
404621	pfam13757	VIT_2	Vault protein inter-alpha-trypsin domain. Inter-alpha-trypsin inhibitors (ITIs) consist of one light chain and a variable set of heavy chains. ITIs play a role in extracellular matrix (ECM) stabilisation and tumor metastasis as well as in plasma protease inhibition. The vault protein inter-alpha-trypsin (VIT) domain described here is found to the N-terminus of a von Willebrand factor type A domain (pfam00092) in ITI heavy chains (ITIHs) and their precursors.	78
372710	pfam13758	Prefoldin_3	Prefoldin subunit. This family includes prefoldin subunits that are not detected by pfam02996.	99
404622	pfam13759	2OG-FeII_Oxy_5	Putative 2OG-Fe(II) oxygenase. This family has structural similarity to the 2OG-Fe(II) oxygenase superfamily.	101
404623	pfam13761	DUF4166	Domain of unknown function (DUF4166). This domain is often found at the C-terminus of proteins containing pfam03435.	177
372712	pfam13762	MNE1	Mitochondrial splicing apparatus component. MNE1 is a novel component of the mitochondrial splicing apparatus responsible for the processing of a COX1 group I intron in yeast. Yeast cells lacking MNE1 are deficient in intron splicing in the gene encoding the Cox1 subunit of cytochrome oxidase but do contain wild-type levels of the bc1 complex.	141
404624	pfam13763	DUF4167	Domain of unknown function (DUF4167). 	75
404625	pfam13764	E3_UbLigase_R4	E3 ubiquitin-protein ligase UBR4. This is a family of E3 ubiquitin ligase enzymes.	807
404626	pfam13765	PRY	SPRY-associated domain. SPRY and PRY domains occur on PYRIN proteins. Their function is not known.	49
404627	pfam13767	DUF4168	Domain of unknown function (DUF4168). 	78
372716	pfam13768	VWA_3	von Willebrand factor type A domain. 	155
404628	pfam13769	Virulence_fact	Virulence factor. This domain is found in conserved virulence factors. It is often found in association with pfam02985 and pfam08712.	81
404629	pfam13770	DUF4169	Domain of unknown function (DUF4169). 	54
404630	pfam13771	zf-HC5HC2H	PHD-like zinc-binding domain. The members of this family are annotated as containing PHD domain, but the zinc-binding region here is not typical of PHD domains. The conformation here is a well-conserved cysteine-histidine rich region spanning 90 residues, where the Cys and His are arranged as HxxC(31)CxxC(6)CxxCxxxxCxxxxHxxC (21)CxxH.	88
372719	pfam13772	AIG2_2	AIG2-like family. This family is found in bacteria and metazoa.	83
404631	pfam13773	DUF4170	Domain of unknown function (DUF4170). 	68
404632	pfam13774	Longin	Regulated-SNARE-like domain. Longin is one of the approximately 26 components required for transporting proteins from the ER to the plasma membrane, via the Golgi apparatus. It is necessary for the steps of the transfer from the ER to the Golgi complex. Longins are the only R-SNAREs that are common to all eukaryotes, and they are characterized by a conserved N-terminal domain with a profilin-like fold called a longin domain.	78
404633	pfam13775	DUF4171	Domain of unknown function (DUF4171). This short family is frequently found at the N-terminus of Homeobox proteins.	127
404634	pfam13776	DUF4172	Domain of unknown function (DUF4172). The family is often found in association with pfam02661.	82
404635	pfam13777	DUF4173	Domain of unknown function (DUF4173). This domain of unknown function contains multiple predicted transmembrane domains.	188
404636	pfam13778	DUF4174	Domain of unknown function (DUF4174). This domain of unknown function is found in a putative tumor suppressor gene and in a ligand for the urokinase-type plasminogen activator receptor, which plays a role in cellular migration and adhesion.	120
404637	pfam13779	DUF4175	Domain of unknown function (DUF4175). 	818
404638	pfam13780	DUF4176	Domain of unknown function (DUF4176). 	74
404639	pfam13781	DoxX_3	DoxX-like family. This family of uncharacterized proteins is related to DoxX, pfam07681.	101
404640	pfam13782	SpoVAB	Stage V sporulation protein AB. This family of proteins is required for sporulation.	109
404641	pfam13783	DUF4177	Domain of unknown function (DUF4177). 	65
404642	pfam13784	Fic_N	Fic/DOC family N-terminal. This domain is found at the N-terminus of the Fic/DOC family, pfam02661.	79
404643	pfam13785	DUF4178	Domain of unknown function (DUF4178). 	143
404644	pfam13786	DUF4179	Domain of unknown function (DUF4179). 	93
404645	pfam13787	HXXEE	Protein of unknown function with HXXEE motif. This domain contains an HXXEE motif, another conserved histidine and a YXPG motif. Its function is unknown.	100
404646	pfam13788	DUF4180	Domain of unknown function (DUF4180). 	108
372726	pfam13789	DUF4181	Domain of unknown function (DUF4181). 	108
372727	pfam13790	SR1P	SR1 protein. This family of proteins is encoded by the dual function SR1 RNA. SR1 is a sRNA which regulates arginine metabolism; it also encodes a short protein that binds to glyceraldehyde-3-phosphate dehydrogenase (GapA) and stabilizes the gapA operon mRNAs.	37
404647	pfam13791	Sigma_reg_C	Sigma factor regulator C-terminal. This family is the C-terminal domain of a sigma factor regulator, this may represent a sensory domain.	156
404648	pfam13793	Pribosyltran_N	N-terminal domain of ribose phosphate pyrophosphokinase. This family is frequently found N-terminal to the Pribosyltran, pfam00156.	117
205967	pfam13794	MiaE_2	tRNA-(MS[2]IO[6]A)-hydroxylase (MiaE)-like. 	185
404649	pfam13795	HupE_UreJ_2	HupE / UreJ protein. These proteins contain many conserved histidines that may be involved in nickel binding.	153
404650	pfam13796	Sensor	Putative sensor. This family is often found at the N-terminus of proteins containing pfam07730 and pfam02518. The N-termini of proteins containing these two domains often function in stimulus sensing.	179
404651	pfam13797	Post_transc_reg	Post-transcriptional regulator. This family includes post-transcriptional regulators.	83
404652	pfam13798	PCYCGC	Protein of unknown function with PCYCGC motif. This domain contains a PCYCGC motif and four other conserved cysteines. Its function is unknown.	153
404653	pfam13799	DUF4183	Domain of unknown function (DUF4183). This domain of unknown function contains a highly conserved ING motif.	74
404654	pfam13800	Sigma_reg_N	Sigma factor regulator N-terminal. This domain is found near the N-terminus of a sigma factor regulator. The N-terminus is responsible for interaction with the sigma factor.	89
404655	pfam13801	Metal_resist	Heavy-metal resistance. This is a metal-binding protein which is involved in resistance to heavy-metal ions. The protein forms a four-helix hooked hairpin, consisting of two long alpha helices each flanked by a shorter alpha helix. It binds a metal ion in a type-2 like centre. It contains two copies of an LTXXQ motif.	119
404656	pfam13802	Gal_mutarotas_2	Galactose mutarotase-like. This family is found N-terminal to glycosyl-hydrolase domains, and appears to be similar to the galactose mutarotase superfamily.	67
404657	pfam13803	DUF4184	Domain of unknown function (DUF4184). This domain of unknown function contains several highly conserved histidines.	230
290518	pfam13804	HERV-K_env_2	Retro-transcribing viruses envelope glycoprotein. This family comes from human endogenous retrovirus K envelope glycoproteins.	169
404658	pfam13805	Pil1	Eisosome component PIL1. In the budding yeast, S. cerevisiae, Pil1 and another cytoplasmic protein, Lsp1, together form large immobile assemblies at the plasma membrane that mark sites for endocytosis, called eisosomes. Endocytosis functions to recycle plasma membrane components, to regulate cell-surface expression of signalling receptors and to internalize nutrients in all eukaryotic cells.	265
404659	pfam13806	Rieske_2	Rieske-like [2Fe-2S] domain. 	103
379381	pfam13807	GNVR	G-rich domain on putative tyrosine kinase. This domain is found between two families, Wzz, pfam02706 and CbiA pfam01656. There is a highly conserved GNVR sequence motif which characterizes this domain. The function is not known.	82
404660	pfam13808	DDE_Tnp_1_assoc	DDE_Tnp_1-associated. This domain is frequently found N-terminal to the transposase, IS family DDE_Tnp_1, pfam01609 and its relatives.	88
404661	pfam13809	Tubulin_2	Tubulin like. Many of the residues conserved in Tubulin, pfam00091, are also highly conserved in this family.	337
404662	pfam13810	DUF4185	Domain of unknown function (DUF4185). 	308
404663	pfam13811	DUF4186	Domain of unknown function (DUF4186). 	109
316342	pfam13812	PPR_3	Pentatricopeptide repeat domain. This family matches additional variants of the PPR repeat that were not captured by the model for pfam01535. In the case of the Arabidopsis protein UniProtKB:Q66GI4, the repeated helices in this N-terminal region, of protein-only RNase P (PRORP) enzymes, form the pentatricopeptide repeat (PPR) domain which enhances pre-tRNA binding affinity. PRORP enzymes process precursor tRNAs in human mitochondria and in all tRNA-using compartments of Arabidopsis thaliana.	63
372734	pfam13813	MBOAT_2	Membrane bound O-acyl transferase family. 	84
404664	pfam13814	Replic_Relax	Replication-relaxation. This family includes proteins which are essential for plasmid replication and plasmid DNA relaxation.	190
372735	pfam13815	Dzip-like_N	Iguana/Dzip1-like DAZ-interacting protein N-terminal. The DAZ gene-product - Deleted in Azoospermia - and a closely related sequence are required early in germ-cell development in order to maintain germ-cell populations. This family is the N-terminal region that is the only part of the protein in some fungi and lower metazoa.	118
404665	pfam13816	Dehydratase_hem	Haem-containing dehydratase. This family includes aldoxime dehydratase, EC:4.99.1.5. This is a haem-containing enzyme, which catalyzes the dehydration of aldoximes to their corresponding nitrile. It also includes phenylacetaldoxime dehydratase, EC:4.99.1.7. This haem-containing enzyme catalyzes the dehydration of Z-phenylacetaldoxime to phenylacetonitrile. The enzyme forms an elliptic beta barrel, composed of eight beta-strands, flanked by alpha-helices.	307
404666	pfam13817	DDE_Tnp_IS66_C	IS66 C-terminal element. 	39
372737	pfam13820	Nucleic_acid_bd	Putative nucleic acid-binding region. This is a family of putative nucleic acid-binding proteins. Several members are annotated as being nuclear receptor coactivator 6 proteins but this could not be confirmed.	143
404667	pfam13821	DUF4187	Domain of unknown function (DUF4187). This family is found at the very C-terminus of proteins that carry a G-patch domain, pfam01585. The domain is short and cysteine-rich.	50
404668	pfam13822	ACC_epsilon	Acyl-CoA carboxylase epsilon subunit. This family includes the epsilon subunits of propionyl-CoA carboxylase, EC:6.4.1.3, and acetyl-CoA carboxylase, EC:6.4.1.2. These enzymes are involved in the biosynthesis of long-chain fatty acids. The epsilon subunit is necessary for an efficient interaction between the alpha and beta subunits of these enzymes.	60
404669	pfam13823	ADH_N_assoc	Alcohol dehydrogenase GroES-associated. This short domain is frequently found at the N-terminus of the alcohol dehydrogenase GroES-like domain, pfam08240.	23
404670	pfam13824	zf-Mss51	Zinc-finger of mitochondrial splicing suppressor 51. Mss51 regulates the expression of cytochrome oxidase, so this domain is probably DNA-binding.	54
372740	pfam13825	Paramyxo_PNT	Paramyxovirus structural protein V/P N-terminus. This family consists of several Paramyxoviridae structural protein P and V sequences. From a structural point of view, P is the best-characterized protein of the replicative complex. P is organized into two moieties that are functionally and structurally distinct: a C-terminal moiety (PCT) and an N-terminal moiety (PNT). PCT is the most conserved in sequence and contains all regions required for virus transcription, whereas PNT, which is poorly conserved, provides several additional functions required for replication. P protein plays a crucial role in the enzyme by positioning L onto the N/RNA template through an interaction with the C-terminal domain of N. Without P, L is not functional. The N, P, and L proteins of SeV and measles and mumps viruses are functionally equivalent. However, sequence identity between proteins from these viruses is limited, and the viruses have been placed in different genera (Respirovirus, Morbillivirus, and Rubulavirus, respectively). SeV P protein (568 aa) is a modular protein with distinct functional domains. The N-terminal part of P (PNT) is a chaperone for N and prevents it from binding to non-viral RNA in the infected cell.	309
404671	pfam13826	DUF4188	Domain of unknown function (DUF4188). 	117
404672	pfam13827	DUF4189	Domain of unknown function (DUF4189). This domain of unknown function contains six well-conserved cysteine residues.	97
404673	pfam13828	DUF4190	Domain of unknown function (DUF4190). This integral membrane domain is functionally uncharacterized. One of the membrane helices contains two GXXG motifs that are usually associated with dimerization.	61
404674	pfam13829	DUF4191	Domain of unknown function (DUF4191). 	219
404675	pfam13830	DUF4192	Domain of unknown function (DUF4192). 	318
404676	pfam13831	PHD_2	PHD-finger. PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3.	34
404677	pfam13832	zf-HC5HC2H_2	PHD-zinc-finger like domain. 	108
404678	pfam13833	EF-hand_8	EF-hand domain pair. 	54
404679	pfam13834	DUF4193	Domain of unknown function (DUF4193). This domain of unknown function contains four conserved cysteines and a conserved histidine, including a CXXXXH motif.	98
404680	pfam13835	DUF4194	Domain of unknown function (DUF4194). 	165
404681	pfam13836	DUF4195	Domain of unknown function (DUF4195). This family is found at the N-terminus of metazoan proteins that carry PHD-like zinc-finger domains. The function is not known.	183
404682	pfam13837	Myb_DNA-bind_4	Myb/SANT-like DNA-binding domain. This presumed domain appears to be related to other Myb/SANT-like DNA binding domains. In particular pfam10545 seems most related. This family is greatly expanded in plants and appears in several proteins annotated as transposon proteins.	84
404683	pfam13838	Clathrin_H_link	Clathrin-H-link. This short domain is found on clathrins, and often appears on proteins directly downstream from the Clathrin-link domain pfam09268.	66
404684	pfam13839	PC-Esterase	GDSL/SGNH-like Acyl-Esterase family found in Pmr5 and Cas1p. The PC-Esterase family is comprised of Cas1p, the Homo sapiens C7orf58, Arabidopsis thaliana PMR5 and a group of plant freezing resistance/cold acclimatization proteins typified by Arabidopsis thaliana ESKIMO1, animal FAM55D proteins, and animal FAM113 proteins. The PC-Esterase family has features that are both similar and different from the canonical GDSL/SGNH superfamily. The members of this family are predicted to have Acyl esterase activity and predicted to modify cell-surface biopolymers such as glycans and glycoproteins. The Cas1p protein has a Cas1_AcylT domain, in addition, with the opposing acyltransferase activity. The C7orf58 family has an ATP-Grasp domain fused to the PC-Esterase and is the first identified secreted tubulin-tyrosine ligase like enzyme in eukaryotes. The plant family with PMR5, ESK1, TBL3 etc have an N-terminal C rich potential sugar binding domain followed by PC-Esterase domain.	281
404685	pfam13840	ACT_7	ACT domain. The ACT domain is a structural motif of 70-90 amino acids that functions in the control of metabolism, solute transport and signal transduction. They are thus found in a variety of different proteins in a variety of different arrangements. In mammalian phenylalanine hydroxylase the domain forms no contacts but promotes an allosteric effect despite the apparent lack of ligand binding.	65
404686	pfam13841	Defensin_beta_2	Beta defensin. The beta defensins are antimicrobial peptides implicated in the resistance of epithelial surfaces to microbial colonisation.	30
404687	pfam13842	Tnp_zf-ribbon_2	DDE_Tnp_1-like zinc-ribbon. This zinc-ribbon domain is frequently found at the C-terminal of proteins derived from transposable elements.	31
372752	pfam13843	DDE_Tnp_1_7	Transposase IS4. 	352
404688	pfam13844	Glyco_transf_41	Glycosyl transferase family 41. This family of glycosyltransferases includes O-linked beta-N-acetylglucosamine (O-GlcNAc) transferase, an enzyme which catalyzes the addition of O-GlcNAc to serine and threonine residues. In addition to its function as an O-GlcNAc transferase, human OGT also appears to proteolytically cleave the epigenetic cell-cycle regulator HCF-1.	543
404689	pfam13845	Septum_form	Septum formation. This domain is found in a protein which is predicted to play a role in septum formation during cell division.	227
372754	pfam13846	DUF4196	Domain of unknown function (DUF4196). This is a short region of ccdc82_homologs that is conserved from Schizo. pombe up to humans. The function is not known.	116
404690	pfam13847	Methyltransf_31	Methyltransferase domain. This family appears to have methyltransferase activity.	150
404691	pfam13848	Thioredoxin_6	Thioredoxin-like domain. 	184
404692	pfam13850	ERGIC_N	Endoplasmic Reticulum-Golgi Intermediate Compartment (ERGIC). This family is the N-terminal of ERGIC proteins, ER-Golgi intermediate compartment clusters, otherwise known as Ervs, and is associated with family COPIIcoated_ERV, pfam07970.	91
404693	pfam13851	GAS	Growth-arrest specific micro-tubule binding. This family is the highly conserved central region of a number of metazoan proteins referred to as growth-arrest proteins. In mouse, Gas8 is predominantly a testicular protein, whose expression is developmentally regulated during puberty and spermatogenesis. In humans, it is absent in infertile males who lack the ability to generate gametes. The localization of Gas8 in the motility apparatus of post-meiotic gametocytes and mature spermatozoa, together with the detection of Gas8 also in cilia at the apical surfaces of epithelial cells lining the pulmonary bronchi and Fallopian tubes suggests that the Gas8 protein may have a role in the functioning of motile cellular appendages. Gas8 is a microtubule-binding protein localized to regions of dynein regulation in mammalian cells.	200
404694	pfam13852	DUF4197	Protein of unknown function (DUF4197). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 228 and 249 amino acids in length.	199
404695	pfam13853	7tm_4	Olfactory receptor. The members of this family are transmembrane olfactory receptors.	278
404696	pfam13854	Kelch_5	Kelch motif. The kelch motif was initially discovered in Kelch. In this protein there are six copies of the motif. It has been shown that Drosophila ring canal kelch protein is related to Galactose Oxidase for which a structure has been solved. The kelch motif forms a beta sheet. Several of these sheets associate to form a beta propeller structure as found in pfam00064, pfam00400 and pfam00415.	41
404697	pfam13855	LRR_8	Leucine rich repeat. 	61
404698	pfam13856	Gifsy-2	ATP-binding sugar transporter from pro-phage. Members of this short family are putative ATP-binding sugar transporter-like protein.	98
404699	pfam13857	Ank_5	Ankyrin repeats (many copies). 	56
404700	pfam13858	DUF4199	Protein of unknown function (DUF4199). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 167 and 182 amino acids in length.	156
372760	pfam13859	BNR_3	BNR repeat-like domain. This family of proteins contains BNR-like repeats suggesting these proteins may act as sialidases.	302
404701	pfam13860	FlgD_ig	FlgD Ig-like domain. This domain has an immunoglobulin like beta sandwich fold. It is found in the FlgD protein the flagellar hook capping protein. The structure for this domain shows that it is inserted within a TUDOR like beta barrel domain.	78
404702	pfam13861	FLgD_tudor	FlgD Tudor-like domain. This domain has a tudor domain-like beta barrel fold. It is found in the FlgD protein the flagellar hook capping protein. The structure for this domain shows that it contains a nested Ig-like domain within it. However in some firmicute proteins this inserted domain is absent such as Q67K21.	136
404703	pfam13862	BCIP	p21-C-terminal region-binding protein. This family of p21-binding proteins is important as a modulator of p21 activity. The domain binds the C-terminal region of p21 in a ternary complex with CDK2, which results in inhibition of the kinase activity of CDK2.	206
404704	pfam13863	DUF4200	Domain of unknown function (DUF4200). This family is found in eukaryotes. It is a coiled-coil domain of unknown function.	119
404705	pfam13864	Enkurin	Calmodulin-binding. This is a family of apparent calmodulin-binding proteins found at high levels in the testis and vomeronasal organ and at lower levels in certain other tissues. Enkurin is a scaffold protein that binds PI3 kinase to sperm transient receptor potential (canonical) (TRPC) channels. The mammalian transient receptor potential (canonical) channels are the primary candidates for the Ca(2+) entry pathway activated by the hormones, growth factors, and neurotransmitters that exert their effect through activation of PLC. Calmodulin binds to the C-terminus of all TRPC channels, and dissociation of calmodulin from TRPC4 results in profound activation of the channel.	96
404706	pfam13865	FoP_duplication	C-terminal duplication domain of Friend of PRMT1. Fop, or Friend of Prmt1, proteins are conserved from fungi and plants to vertebrates. There is little that is actually conserved except for this C-terminal LDXXLDAYM region (where X is any amino acid). The Fop proteins themselves are nuclear proteins localized to regions with low levels of DAPI, with a punctate/speckle-like distribution. Fop is a chromatin-associated protein and it co-localizes with facultative heterochromatin. It is critical for oestrogen-dependent gene activation.	80
404707	pfam13866	zf-SAP30	SAP30 zinc-finger. SAP30 is a subunit of the histone deacetylase complex, and this domain is a zinc-finger. Solution of the structure shows a novel fold comprising two beta-strands and two alpha-helices with the zinc organising centre showing remote resemblance to the treble clef motif. In silico analysis of the structure revealed a highly conserved surface dominated by basic residues. NMR-based analysis of potential ligands for the SAP30 zn-finger motif indicated a strong preference for nucleic acid substrates. The zinc-finger of SAP30 probably functions as a double-stranded DNA-binding motif, thereby expanding the known functions of both SAP30 and the mammalian Sin3 co-repressor complex.	72
404708	pfam13867	SAP30_Sin3_bdg	Sin3 binding region of histone deacetylase complex subunit SAP30. This C-terminal domain of the SAP30 proteins appears to be the binding region for Sin3.	53
404709	pfam13868	TPH	Trichohyalin-plectin-homology domain. This family is a mixture of two different families of eukaryotic proteins. Trichoplein or mitostatin, was first defined as a meiosis-specific nuclear structural protein. It has since been linked with mitochondrial movement. It is associated with the mitochondrial outer membrane, and over-expression leads to reduction in mitochondrial motility whereas lack of it enhances mitochondrial movement. The activity appears to be mediated through binding the mitochondria to the actin intermediate filaments (IFs). The family is in the trichohyalin-plectin-homology domain.	352
404710	pfam13869	NUDIX_2	Nucleotide hydrolase. Nudix hydrolases are found in all classes of organism and hydrolyze a wide range of organic pyrophosphates, including nucleoside di- and triphosphates, di-nucleoside and diphospho-inositol polyphosphates, nucleotide sugars and RNA caps, with varying degrees of substrate specificity.	188
404711	pfam13870	DUF4201	Domain of unknown function (DUF4201). This is a family of coiled-coil proteins from eukaryotes. The function is not known.	177
404712	pfam13871	Helicase_C_4	C-terminal domain on Strawberry notch homolog. Strawberry notch proteins carry DExD/H-box groups upstream of this domain. The function of this domain is not known. These proteins promote the expression of diverse targets, potentially through interactions with transcriptional activator or repressor complexes.	271
404713	pfam13872	AAA_34	P-loop containing NTP hydrolase pore-1. 	301
404714	pfam13873	Myb_DNA-bind_5	Myb/SANT-like DNA-binding domain. This presumed domain appears to be related to other Myb/SANT like DNA binding domains. This family is greatly expanded in arthropods and higher eukaryotes.	78
404715	pfam13874	Nup54	Nucleoporin complex subunit 54. This is the human Nup54 subunit of the nucleoporin complex, equivalent to Nup57 of yeast. Nup54, Nup58 and Nup62 all have similar affinities for importin-beta. It seems likely that they are the only FG-repeat nucleoporins of the central channel, and as such they would form a zone of equal affinity spanning the central channel. The diffusion of importin-beta import complexes through the central channel may be a stochastic process as the affinities are similar, whereas movement from cytoplasmic fibrils to the central channel and from the central channel to the nuclear basket would be facilitated by the subtle differences in affinity between them.	139
404716	pfam13875	DUF4202	Domain of unknown function (DUF4202). This family of proteins is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 187 and 205 amino acids in length. There are two conserved sequence motifs: LED and KMS. The function of these proteins is unknown, although many are incorrectly annotated as glutamyl tRNA synthetases.	183
404717	pfam13876	Phage_gp49_66	Phage protein (N4 Gp49/phage Sf6 gene 66) family. This family of phage proteins is functionally uncharacterized. The family includes bacteriophage Sf6 gene 66 as well as phage N4 GP49 protein. Proteins in this family are typically between 87 and 154 amino acids in length. There is a conserved NGF sequence motif.	77
404718	pfam13877	RPAP3_C	Potential Monad-binding region of RPAP3. This domain is found at the C-terminus of RNA-polymerase II-associated proteins. These proteins bind to Monad and are involved in regulating apoptosis. They contain TPR-repeats towards the N-terminus.	89
404719	pfam13878	zf-C2H2_3	zinc-finger of acetyl-transferase ESCO. 	40
404720	pfam13879	KIAA1430	KIAA1430 homolog. This is a family of KIAA1430 homologs. The function is not known.	97
404721	pfam13880	Acetyltransf_13	ESCO1/2 acetyl-transferase. 	69
372780	pfam13881	Rad60-SLD_2	Ubiquitin-2 like Rad60 SUMO-like. 	111
404722	pfam13882	Bravo_FIGEY	Bravo-like intracellular region. This is the very C-terminal intracellular region of neural adhesion molecule L1 proteins that are also known as Bravo or NrCAM. It lies upstream of the IG and Fn3 domains and has the highly conserved motif FIGEY. The function is not known.	84
404723	pfam13883	Pyrid_oxidase_2	Pyridoxamine 5'-phosphate oxidase. 	167
404724	pfam13884	Peptidase_S74	Chaperone of endosialidase. This is the very C-terminal, chaperone, domain of the bacteriophage protein endosialidase. It releases itself, via the serine-lysine dyad at the N-terminus, from the remainder of the end-tail-spike. Cleavage occurs after the threonine which is the final residue of the End-tail-spike family, pfam12219. The endosialidase protein forms homotrimeric molecules in bacteriophages. The catalytic dyad allows this portion of the molecule to be cleaved from the more N-terminal region such that the latter can fold and presumably bind to DNA.	56
404725	pfam13885	Keratin_B2_2	Keratin, high sulfur B2 protein. 	45
404726	pfam13886	DUF4203	Domain of unknown function (DUF4203). This is the N-terminal region of 7tm proteins. The function is not known.	200
404727	pfam13887	MRF_C1	Myelin gene regulatory factor -C-terminal domain 1. This domain is found just downstream of Peptidase_S74, pfam13884. The function is not known.	36
404728	pfam13888	MRF_C2	Myelin gene regulatory factor C-terminal domain 2. This domain is found further downstream of Peptidase_S74, pfam13884, and MRF_C1, pfam13887. The function is not known.	135
404729	pfam13889	Chromosome_seg	Chromosome segregation during meiosis. The proteins come from eukaryotes, plants and animals, and are necessary for chromosome segregation during meiosis.	55
404730	pfam13890	Rab3-GTPase_cat	Rab3 GTPase-activating protein catalytic subunit. This family is the probable catalytic subunit of the GTPase activating protein that has specificity for Rab3 subfamily (RAB3A, RAB3B, RAB3C and RAB3D). It is likely to convert active Rab3-GTP to the inactive form Rab3-GDP. Rab3 proteins are involved in regulated exocytosis of neurotransmitters and hormones. The Rab3 GTPase-activating complex is a heterodimer composed of RAB3GAP and RAB3-GAP150. This complex interacts with DMXL2.	159
404731	pfam13891	zf-C3Hc3H	Potential DNA-binding domain. This domain is likely to be the DNA-binding domain of chromatin re-modelling proteins and helicases.	62
404732	pfam13892	DBINO	DNA-binding domain. DBINO is a DNA-binding domain found on global transcription activator SNF2L1 proteins and chromatin re-modelling proteins.	130
372791	pfam13893	RRM_5	RNA recognition motif. (a.k.a. RRM, RBD, or RNP domain). The RRM motif is probably diagnostic of an RNA binding protein. RRMs are found in a variety of RNA binding proteins, including various hnRNP proteins, proteins implicated in regulation of alternative splicing, and protein components of snRNPs. The motif also appears in a few single stranded DNA binding proteins.	125
404733	pfam13894	zf-C2H2_4	C2H2-type zinc finger. This family contains a number of divergent C2H2 type zinc fingers.	24
404734	pfam13895	Ig_2	Immunoglobulin domain. This domain contains immunoglobulin-like domains.	76
404735	pfam13896	Glyco_transf_49	Glycosyl-transferase for dystroglycan. This glycosyl-transferase brings about the glycosylation of the alpha-dystroglycan subunit. Dystroglycan is an integral member of the skeletal muscular dystrophin glycoprotein complex, which links dystrophin to proteins in the extracellular matrix.	326
404736	pfam13897	GOLD_2	Golgi-dynamics membrane-trafficking. Sec14-like Golgi-trafficking domain. The GOLD domain is always found combined with lipid- or membrane-association domains.	133
404737	pfam13898	DUF4205	Domain of unknown function (DUF4205). The proteins in this family are uncharacterized but often named FAM188B.	348
404738	pfam13899	Thioredoxin_7	Thioredoxin-like. Thioredoxins are small enzymes that participate in redox reactions, via the reversible oxidation of an active centre disulfide bond.	82
404739	pfam13901	zf-RING_9	Putative zinc-RING and/or ribbon. This is a family of cysteine-rich proteins. Many members also carry a pleckstrin-homology domain, pfam00169.	201
404740	pfam13902	R3H-assoc	R3H-associated N-terminal domain. This family is found at the N-terminus of R3H, pfam01424, domain-containing proteins. The function is not known.	117
372799	pfam13903	Claudin_2	PMP-22/EMP/MP20/Claudin tight junction. Members of this family are claudins, that form tight junctions between cells.	191
404741	pfam13904	DUF4207	Domain of unknown function (DUF4207). This family is found in eukaryotes; it has several conserved tryptophan residues. The function is not known.	249
404742	pfam13905	Thioredoxin_8	Thioredoxin-like. Thioredoxins are small enzymes that participate in redox reactions, via the reversible oxidation of an active centre disulfide bond.	94
404743	pfam13906	AA_permease_C	C-terminus of AA_permease. This is the C-terminus of AA-permease enzymes that is not captured by the models pfam00324 and pfam13520.	51
404744	pfam13907	DUF4208	Domain of unknown function (DUF4208). This domain is found at the C-terminus of chromodomain-helicase-DNA-binding proteins. The exact function of the domain is undetermined.	93
404745	pfam13908	Shisa	Wnt and FGF inhibitory regulator. Shisa is a transcription factor-type molecule that physically interacts with immature forms of the Wnt receptor Frizzled and the FGF receptor within the endoplasmic reticulum to inhibit their post-translational maturation and trafficking to the cell surface.	175
404746	pfam13909	zf-H2C2_5	C2H2-type zinc-finger domain. 	25
404747	pfam13910	DUF4209	Domain of unknown function (DUF4209). This short domain is found in bacteria and eukaryotes, though not in yeasts or Archaea. It carries a highly conserved RNxxxHG sequence motif.	89
372807	pfam13911	AhpC-TSA_2	AhpC/TSA antioxidant enzyme. This family contains proteins related to alkyl hydro-peroxide reductase (AhpC) and thiol specific antioxidant (TSA).	113
404748	pfam13912	zf-C2H2_6	C2H2-type zinc finger. 	27
404749	pfam13913	zf-C2HC_2	zinc-finger of a C2HC-type. This family contains a number of divergent C2H2 type zinc fingers.	25
404750	pfam13914	Phostensin	Phostensin PP1-binding and SH3-binding region. Phostensin has been identified as a PP1 regulatory protein binding PP1 at the KISF motif. The domain also appears to carry an incomplete SH3-binding domain PxRxP further upstream. It is likely that Phostensin targets PP1 to the F-actin cytoskeleton. Phostensin binds to actin and decreases the elongation and depolymerization rates of actin filament pointed ends.	142
404751	pfam13915	DUF4210	Domain of unknown function (DUF4210). This short domain is found in fungi, plants and animals, and the proteins appear to be necessary for chromosome segregation during meiosis.	66
404752	pfam13916	Phostensin_N	PP1-regulatory protein, Phostensin N-terminal. Phostensin has been identified as a PP1 regulatory protein binding protein. This domain is N-terminal to the PP1- and SH3-binding regions though may carry an additional SH3-binding motif. It is likely that Phostensin targets PP1 to the F-actin cytoskeleton. Phostensin binds to actin and decreases the elongation and depolymerization rates of actin filament pointed ends.	86
404753	pfam13917	zf-CCHC_3	Zinc knuckle. The zinc knuckle is a zinc binding motif composed of the following CX2CX4HX4C where X can be any amino acid. The motifs are mostly from retroviral gag proteins (nucleocapsid). Prototype structure is from HIV. Also contains members involved in eukaryotic gene regulation, such as C. elegans GLH-1. Structure is an 18-residue zinc finger.	41
372814	pfam13918	PLDc_3	PLD-like domain. 	177
404754	pfam13919	ASXH	Asx homology domain. A conserved alpha helical domain with a characteristic LXXLL motif. The LXXLL motif is detected in diverse transcription factors, coactivators and corepressors and is implicated in mediating interactions between them. The ASXH domain is found in animals, fungi and plants and is predicted to play a role in mediating contact between transcription factors and chromatin-associated complexes. In Drosophila Asx and Human ASXL1, the ASXH domain is predicted to mediate interactions with the Calypso and BAP1 deubiquitinases (DUBs) which further belong to the UCHL5/UCH37 clade of DUBs.	128
404755	pfam13920	zf-C3HC4_3	Zinc finger, C3HC4 type (RING finger). 	50
372817	pfam13921	Myb_DNA-bind_6	Myb-like DNA-binding domain. This family contains the DNA binding domains from Myb proteins, as well as the SANT domain family.	60
316444	pfam13922	PHD_3	PHD domain of transcriptional enhancer, Asx. This is the DNA-binding domain on the additional sex combs-like 1 proteins. The Asx protein acts as an enhancer of trithorax and polycomb in displaying bidirectional homoeotic phenotypes in Drosophila, suggesting that it is required for maintenance of both activation and silencing of Hox genes. Asx is required for normal adult haematopoiesis and its function depends on its cellular context.	68
404756	pfam13923	zf-C3HC4_2	Zinc finger, C3HC4 type (RING finger). 	40
404757	pfam13924	Lipocalin_5	Lipocalin-like domain. This family includes domains distantly related to lipocalins. However, they do contain the important GXW motif in the first strand. The proteins in this family include aln5, which is involved in biosynthesis of alnumycin. The family also includes the ZFK protein from Trypanosoma brucei which is a protein kinase. This domain is at the C-terminus of that protein. The domain is also found as the C-terminal domain in StiJ, a protein involved in producing stigmatellin. This domain has been assumed to catalyze a final cyclisation reaction.	140
404758	pfam13925	Katanin_con80	con80 domain of Katanin. The con80 domain of katanin is the C-terminal region of the protein that binds to the N-terminal domain of katanin-p60, the catalytic ATPase. The complex associates with a specific subregion of the mitotic spindle leading to increased microtubule disassembly and targeting of p60 to the spindle poles. The assembly and function of the mitotic spindle requires the activity of a number of microtubule-binding proteins. Katanin, a heterodimeric microtubule-severing ATPase, is found localized at mitotic spindle poles. A proposed model is that katanin is targeted to spindle poles through a combination of direct microtubule binding by the p60 subunit and through interactions between the WD40 domain and an unknown protein.	153
404759	pfam13926	DUF4211	Domain of unknown function (DUF4211). 	139
404760	pfam13927	Ig_3	Immunoglobulin domain. This family contains immunoglobulin-like domains.	79
404761	pfam13928	Flocculin_t3	Flocculin type 3 repeat. This repeat is found in the Flocculation protein FLO9 close to its C-terminus.	44
404762	pfam13929	mRNA_stabil	mRNA stabilisation. This domain is an mRNA stabilisation factor.	288
404763	pfam13930	Endonuclea_NS_2	DNA/RNA non-specific endonuclease. 	133
404764	pfam13931	Microtub_bind	Kinesin-associated microtubule-binding. This domain binds to microtubules.	129
404765	pfam13932	GIDA_assoc	GidA associated domain. The GidA associated domain is a domain that has been identified at the C-terminus of protein GidA. It consists of several helices, the last three being rather short and forming a small bundle. GidA is a tRNA modification enzyme found in bacteria and mitochondria. Based on mutational analysis this domain has been suggested to be implicated in binding of the D-stem of tRNA and to be responsible for the interaction with protein MnmE. Structures of GidA in complex with either tRNA or MnmE are missing. Reported to bind to Pfam family MnmE, pfam12631.	212
404766	pfam13933	HRXXH	Putative peptidase family. This family of putative peptidases are closely related to the M35 family pfam02102. In this family the metal binding HEXXH motif is replaced with HRXXH. The exact function of these proteins is unknown. Members of this family are found to be fungal allergens.	244
404767	pfam13934	ELYS	Nuclear pore complex assembly. ELYS (embryonic large molecule derived from yolk sac) is conserved from fungi such as Aspergillus nidulans and Schizosaccharomyces pombe to human. It is important for the assembly of the nuclear pore complex.	219
379401	pfam13935	Ead_Ea22	Ead/Ea22-like protein. This family contains phage proteins and bacterial proteins that are likely to represent integrated phage proteins. This family includes the Lambda phage Ea22 early protein as well as the Bacteriophage P22 Ead protein.	139
404768	pfam13936	HTH_38	Helix-turn-helix domain. This helix-turn-helix domain is often found in transferases and is likely to be DNA-binding.	44
404769	pfam13937	DUF4212	Domain of unknown function (DUF4212). This family includes several putative integral membrane proteins.	77
404770	pfam13938	DUF4213	Putative heavy-metal chelation. This domain of unknown function has an enolase N-terminal domain-like fold. Its genomic context suggests that it may have a role in anaerobic vitamin B12 biosynthesis. This domain is often found at the N-terminus of proteins containing DUF364, pfam04016. The structure of UniProtKB:B8FUJ5, Structure 3l5o, suggests that the whole protein has this enolase N-terminal-like fold and a Rossmann-like C-terminal domain. Structural and bioinformatic analyses reveal partial similarities to Rossmann-like methyltransferases, with residues from the enolase-like fold combining to form a unique active site that is likely to be involved in the condensation or hydrolysis of molecules implicated in the synthesis of flavins, pterins or other siderophores. The protein may be playing a role in heavy-metal chelation.	73
404771	pfam13939	TisB_toxin	Toxin TisB, type I toxin-antitoxin system. TisB (toxicity-induced by SOS B) is an SOS-induced toxic peptide. It is a hydrophobic membrane-spanning protein which inhibits cell growth. Its expression is inhibited by the antisense RNA IstR-1, which acts as an antitoxin.	28
404772	pfam13940	Ldr_toxin	Toxin Ldr, type I toxin-antitoxin system. This family includes the Ldr (long direct repeat) toxins. In Escherichia coli there are four Ldr toxins, LdrA, LdrB, LdrC and LdrD. These toxins inhibit cell growth, decrease cell viability and cause nucleoid condensation. LdrD expression is inhibited by the antisense RNA RdlD, which functions as an antitoxin.	35
404773	pfam13941	MutL	MutL protein. This small family includes GlmL/MutL from Clostridium tetanomorphum and Clostridium cochlearium. GlmL is located between the genes for the two subunits, epsilon (GlmE) and sigma (GlmS), of the coenzyme-B12-dependent glutamate mutase (methylaspartate mutase), the first enzyme in a pathway of glutamate fermentation. Members show significant sequence similarity to the hydantoinase branch of the hydantoinase/oxoprolinase family.	448
404774	pfam13942	Lipoprotein_20	YfhG lipoprotein. This family includes the YfhG protein from E. coli. Members of this family have an N-terminal lipoprotein attachment site. The members of this family are functionally uncharacterized.	175
404775	pfam13943	WPP	WPP domain. 	100
404776	pfam13944	Calycin_like	Calycin-like beta-barrel domain. 	121
372833	pfam13945	NST1	Salt tolerance down-regulator. NST1 is a family of proteins that seem to be involved, directly or indirectly, in the salt sensitivity of some cellular functions in yeast. It does this without affecting sodium accumulation. It negatively affects salt-tolerance through an interaction with the splicing factor Msl1p. This interaction stresses the importance of efficient RNA processing under salt stress conditions.	186
404777	pfam13946	DUF4214	Domain of unknown function (DUF4214). This domain is found on a variety of different proteins including transferases, and allergen V5/Tpx-1 related proteins.	72
404778	pfam13947	GUB_WAK_bind	Wall-associated receptor kinase galacturonan-binding. This cysteine-rich GUB_WAK_bind domain is the extracellular part of this serine/threonine kinase that binds to the cell-wall pectins.	63
290659	pfam13948	DUF4215	Domain of unknown function (DUF4215). The function of this family is unknown.	47
404779	pfam13949	ALIX_LYPXL_bnd	ALIX V-shaped domain binding to HIV. The binding of the LYPxL motif of late HIV p6Gag and EIAV p9Gag to this domain is necessary for viral budding. This domain is generally central between an N-terminal Bro1 domain, pfam03097, and a C-terminal proline-rich domain. The retroviruses thus used this domain to hijack the ESCRT system of the cell.	295
404780	pfam13952	DUF4216	Domain of unknown function (DUF4216). This DUF is sometimes found at the C-terminal end of proteins carrying a Transposase_21 domain, pfam02992.	72
404781	pfam13953	PapC_C	PapC C-terminal domain. The PapC C-terminal domain is a structural domain found at the C-terminus of the E. coli PapC protein. Pili are assembled using the chaperone usher system. In E.coli this is composed of the chaperone PapD and the usher PapC. This domain represents the C-terminal domain from PapC and its homologs. This domain has a beta-sandwich structure similar to the plug domain of PapC.	66
404782	pfam13954	PapC_N	PapC N-terminal domain. The PapC N-terminal domain is a structural domain found at the N-terminus of the E. coli PapC protein. Pili are assembled using the chaperone usher system. In E.coli this is composed of the chaperone PapD and the usher PapC. This domain represents the N-terminal domain from PapC and its homologs. This domain is involved in substrate binding.	146
372839	pfam13955	Fst_toxin	Toxin Fst, type I toxin-antitoxin system. Fst (faecalis plasmid stabilization toxin), also known as RNA I, is a toxic peptide. Its N-terminus forms a transmembrane alpha helix, its C-terminus is disordered and is likely to be cytosolic. Its translation is inhibited by the antisense RNA, RNA II, which acts as an antitoxin.	21
206126	pfam13956	Ibs_toxin	Toxin Ibs, type I toxin-antitoxin system. The Ibs (induction brings stasis) proteins are a family of toxic peptides. Their expression is inhibited by the Sib antisense RNAs, which act as antitoxins.	19
404783	pfam13957	YafO_toxin	Toxin YafO, type II toxin-antitoxin system. YafO is a toxin which inhibits protein synthesis. It acts as a ribosome-dependent mRNA interferase. It forms part of a type II toxin-antitoxin system, where the YafN protein acts as an antitoxin. This domain forms complexes with yafN antitoxins containing pfam02604.	101
404784	pfam13958	ToxN_toxin	Toxin ToxN, type III toxin-antitoxin system. ToxN acts as a toxin, it is part of a type III toxin-antitoxin system. It acts as a ribosome independent endoribonuclease. It interacts with, and is inhibited by, the RNA antitoxin, ToxI. Three ToxN monomers bind to three ToxI monomers to create a trimeric ToxN-ToxI complex.	155
404785	pfam13959	DUF4217	Domain of unknown function (DUF4217). This short domain is found at the C-terminus of many helicase proteins.	61
404786	pfam13960	DUF4218	Domain of unknown function (DUF4218). 	112
404787	pfam13961	DUF4219	Domain of unknown function (DUF4219). This domain is very short and is found at the N-terminal of many Gag-pol polyprotein and related proteins. There is a highly conserved YxxWxxxM sequence motif.	27
404788	pfam13962	PGG	Domain of unknown function. The PGG domain is named for the highly conserved sequence motif found at the start of the domain. The function is not known.	114
404789	pfam13963	Transpos_assoc	Transposase-associated domain. 	74
404790	pfam13964	Kelch_6	Kelch motif. 	50
404791	pfam13965	SID-1_RNA_chan	dsRNA-gated channel SID-1. This is a family of proteins that are transmembrane dsRNA-gated channels. They passively transport dsRNA into cells and do not act as ATP-dependent pumps. They are required for systemic RNA interference. This family of proteins belongs to the CREST superfamily, whose members are distantly related to GPCRs.	590
404792	pfam13966	zf-RVT	zinc-binding in reverse transcriptase. This domain would appear to be a zinc-binding region of a putative reverse transcriptase.	84
404793	pfam13967	RSN1_TM	Late exocytosis, associated with Golgi transport. This family represents the first three transmembrane regions of 11-TM proteins involved in vesicle transport. In S. cerevisiae these proteins are members of the yeast facilitator superfamily and are integral membrane proteins localized to the cell periphery, in particular to the bud-neck region. The distribution is consistent with a role in late exocytosis which is in agreement with the proteins' ability to substitute for the function of Sro7p, required for the sorting of the protein Enap1 into Golgi-derived vesicles destined for the cell surface.	158
404794	pfam13968	DUF4220	Domain of unknown function (DUF4220). This family is found in plants and is often associated with DUF294, pfam04578.	316
372852	pfam13969	Pab87_oct	Pab87 octamerisation domain. This domain was first characterized as the C-terminal domain of Pab87 serine protease from Pyrococcus abyssi. The domain is reported to play a crucial role in Pab87 octamerisation and active site compartmentalisation. Its up-and-down 8-stranded beta-barrel 3D structure is reminiscent of the one found in lipocalins.	96
404795	pfam13970	DUF4221	Domain of unknown function (DUF4221). This family of bacterial proteins contains highly conserved asparagine and cysteine residues. The function is not known.	310
404796	pfam13971	Mei4	Meiosis-specific protein Mei4. This family of meiosis specific proteins is required for correct meiotic chromosome segregation and recombination. It is required for meiotic DNA double-strand break (DSB) formation.	340
404797	pfam13972	TetR	Bacterial transcriptional repressor. This family of bacterial transcriptional repressors is characterized by the short approximately 50 amino acid stretch of residues constituting the helix-turn-helix DNA binding motif, around the YRFhY motif. The target proteins that are repressed are involved in the transcriptional control of multi-drug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity. The regulatory network in which TetR itself is involved is in being released in the presence of tetracycline, binding to the target operator, and repressing tetA transcription.	143
372855	pfam13973	DUF4222	Domain of unknown function (DUF4222). This short protein is likely to be of phage origin. For example it is found in the Enterobacteria phage YYZ-2008. It is largely found in enteric bacteria. The molecular function of this protein is unknown.	53
404798	pfam13974	YebO	YebO-like protein. This short protein is uncharacterized. It seems likely to be of phage origin as it is found in Enterobacteria phage HK022 Gp20 and Enterobacteria phage HK97 Gp15. The protein is also found in a variety of enteric bacteria.	80
404799	pfam13975	gag-asp_proteas	gag-polyprotein putative aspartyl protease. This family of putative aspartyl proteases is found predominantly in retroviral proteins.	92
372857	pfam13976	gag_pre-integrs	GAG-pre-integrase domain. This domain is found associated with retroviral insertion elements and lies just upstream of the integrase region on the polyproteins.	67
404800	pfam13977	TetR_C_6	BetI-type transcriptional repressor, C-terminal. This family comprises the C-terminal portion of proteins that belong to the TetR family of transcriptional regulators. The C-terminus represents the regulatory region, and does not include the DNA binding helix-turn-helix domain. The target proteins that are repressed are involved in the transcriptional control of multi-drug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity. One of the target proteins is BetI, an osmoprotectant which controls the choline-glycine betaine pathway in E.coli.	115
404801	pfam13978	DUF4223	Protein of unknown function (DUF4223). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length. These proteins are likely to be lipoproteins (attachment site currently included in alignment).	54
372858	pfam13979	SopA_C	SopA-like catalytic domain. This domain is found in the E. coli Type III secretion effector proteins SopA and NleL. These proteins have been shown to act as E3 ubiquitin ligase enzymes. This domain contains the active site cysteine residue.	166
404802	pfam13980	UPF0370	Uncharacterized protein family (UPF0370). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 70 amino acids in length. There is a conserved DWP sequence motif.	61
404803	pfam13981	SopA	SopA-like central domain. This domain is found in the E. coli Type III secretion effector proteins SopA and NleL. These proteins have been shown to act as E3 ubiquitin ligase enzymes.	126
372861	pfam13982	YbfN	YbfN-like lipoprotein. This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 110 amino acids in length. Members of this family are lipoproteins.	88
404804	pfam13983	YsaB	YsaB-like lipoprotein. This family of proteins is functionally uncharacterized. These proteins are related to E.coli YsaB. This family of proteins is found in bacteria. Proteins in this family are approximately 100 amino acids in length. These proteins are lipoproteins.	75
404805	pfam13984	MsyB	MsyB protein. The MsyB protein has been found to be able to restore protein export defects caused by a temperature-sensitive secY or secA mutation. However, its exact molecular function is still unknown, but it may play a role in protein export. Proteins in this family are approximately 120 amino acids in length. This family of proteins is found in bacteria.	120
404806	pfam13985	YbgS	YbgS-like protein. This family of proteins is functionally uncharacterized. The family includes the YbgS protein from E. coli. This family of proteins is found in bacteria. Proteins in this family are approximately 130 amino acids in length. Some members of this family are annotated as homeobox protein, but this annotation cannot be verified.	120
404807	pfam13986	DUF4224	Domain of unknown function (DUF4224). This presumed domain is functionally uncharacterized. This domain family is found in bacteria and viruses, and is approximately 50 amino acids in length. The protein is likely to be of phage origin and is found as protein Gp02 in the Xylella phage Xfas53.	45
404808	pfam13987	YedD	YedD-like protein. This family of proteins related to the YedD protein is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 140 amino acids in length. These proteins are lipoproteins.	106
404809	pfam13988	DUF4225	Protein of unknown function (DUF4225). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 182 and 282 amino acids in length.	163
404810	pfam13989	YejG	YejG-like protein. The YejG protein family is a group of functionally uncharacterized proteins related to Escherichia coli yejG. This family of proteins is found in bacteria. Proteins in this family are approximately 110 amino acids in length.	106
404811	pfam13990	YjcZ	YjcZ-like protein. This family of proteins is functionally uncharacterized. The family includes the YjcZ protein from E. coli. This family of proteins is found in enteric bacteria. Proteins in this family are approximately 300 amino acids in length. There are two conserved sequence motifs: FGD and MPR.	272
404812	pfam13991	BssS	BssS protein family. The BssS protein family is a group of proteins that are involved in regulation of biofilm formation. Proteins in this family are approximately 80 amino acids in length.	72
404813	pfam13992	YecR	YecR-like lipoprotein. The YecR-like family of lipoproteins includes the YecR protein from E. coli. This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and viruses. Proteins in this family are approximately 110 amino acids in length.	73
404814	pfam13993	YccJ	YccJ-like protein. The YccJ-like family of proteins includes the E. coli YccJ protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 80 amino acids in length.	67
372868	pfam13994	PgaD	PgaD-like protein. This family includes the PgaD protein from E. coli. The homopolymer poly-beta-1,6-N-acetyl-D-glucosamine (beta-1,6-GlcNAc; PGA) serves as an adhesin for the maintenance of biofilm structural stability in eubacteria. The pgaABCD operon is required for its synthesis and export. It has been shown that PgaD is essential for this process.	148
404815	pfam13995	YebF	YebF-like protein. The YebF-like protein family appears to be a group of colicin immunity proteins. As well as YebF the family includes cmi, the colicin M immunity protein. This domain family is found in bacteria, and is approximately 80 amino acids in length. The alignment contains two conserved cysteine residues that form a disulphide bond in the solved structure.	89
404816	pfam13996	YobH	YobH-like protein. The YobH-like protein family includes the YobH protein from E. coli, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 80 amino acids in length. There are two conserved sequence motifs: GYG and GLGL.	70
404817	pfam13997	YqjK	YqjK-like protein. The YqjK-like protein family includes the E. coli YqjK protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 100 amino acids in length. There is a single completely conserved residue R that may be functionally important.	72
404818	pfam13998	MgrB	MgrB protein. The MgrB protein is a short lipoprotein. The mgrB gene has a Mg2+-responsive promoter. Deletion of mgrB results in a potent increase in PhoP-regulated transcription. The PhoQ/PhoP signaling system responds to low magnesium and the presence of certain cationic antimicrobial peptides. Over-expression of mgrB decreased transcription at both high and low concentrations of magnesium. Localization and bacterial two-hybrid studies suggest that MgrB resides in the inner-membrane and interacts directly with PhoQ. This domain family is found in bacteria, and is approximately 40 amino acids in length. There are two conserved sequence motifs: CDQ and GIC.	29
404819	pfam13999	MarB	MarB protein. The MarB protein is found in the multiple antibiotic resistance (mar) locus in Escherichia coli. The MarB protein is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 70 amino acids in length. There is a conserved GSDKSD sequence motif.	63
290707	pfam14000	Packaging_FI	DNA packaging protein FI. This family includes the lambda phage DNA-packaging protein FI. Proteins in this family are typically between 124 and 140 amino acids in length. There is a conserved EEE sequence motif.	131
404820	pfam14001	YdfZ	YdfZ protein. This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 70 amino acids in length. There is a conserved YDRNRN sequence motif. The E. coli protein has been shown to bind selenium.	64
404821	pfam14002	YniB	YniB-like protein. The YniB-like protein family includes the E. coli YniB protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 180 amino acids in length. This family of proteins are integral membrane proteins.	166
404822	pfam14003	YlbE	YlbE-like protein. The YlbE-like protein family includes the B. subtilis protein YlbE, which is functionally uncharacterized. This family of cytosolic proteins is found in bacteria. Proteins in this family are approximately 80 amino acids in length. There is a conserved WYR sequence motif.	61
372877	pfam14004	DUF4227	Protein of unknown function (DUF4227). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 80 amino acids in length.	71
404823	pfam14005	YpjP	YpjP-like protein. The YpjP-like protein family includes the B. subtilis YpjP protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 200 amino acids in length.	133
379412	pfam14006	YqzL	YqzL-like protein. The YqzL-like protein family includes the B. subtilis YqzL protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 50 amino acids in length.	40
379413	pfam14007	YtpI	YtpI-like protein. The YtpI-like protein family includes the B. subtilis YtpI protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 73 and 101 amino acids in length.	87
404824	pfam14008	Metallophos_C	Iron/zinc purple acid phosphatase-like protein C. This domain is found at the C-terminus of Purple acid phosphatase proteins.	63
404825	pfam14009	DUF4228	Domain of unknown function (DUF4228). This domain is found in plants. The function is not known.	148
404826	pfam14010	PEPcase_2	Phosphoenolpyruvate carboxylase. This family of phosphoenolpyruvate carboxylases is based on sequences not picked up by the model for PEPcase, PF00311. Most of the family members are from Archaea.	496
404827	pfam14011	ESX-1_EspG	EspG family. This family of proteins contains the EspG1, EspG2 and EspG3 proteins from M. tuberculosis. These proteins are involved in the ESAT-6 secretion system 1 (ESX-1) of Mycobacterium tuberculosis which is important for virulence and intercellular spread. Proteins in this family are typically between 254 and 295 amino acids in length.	241
404828	pfam14012	DUF4229	Protein of unknown function (DUF4229). This family of integral membrane proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 95 and 122 amino acids in length.	67
404829	pfam14013	MT0933_antitox	MT0933-like antitoxin protein. This family of proteins contains the MT0933 protein, which has been identified as an antitoxin to protein MT0934. This family of proteins is found in bacteria. Proteins in this family are typically between 61 and 90 amino acids in length.	49
404830	pfam14014	DUF4230	Protein of unknown function (DUF4230). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 203 and 228 amino acids in length.	134
404831	pfam14015	DUF4231	Protein of unknown function (DUF4231). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea, eukaryotes and viruses. Proteins in this family are typically between 148 and 288 amino acids in length.	104
404832	pfam14016	DUF4232	Protein of unknown function (DUF4232). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 177 and 242 amino acids in length. Many members of this family are lipoproteins.	130
404833	pfam14017	DUF4233	Protein of unknown function (DUF4233). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 122 and 147 amino acids in length. Proteins in this family are integral membrane proteins.	106
404834	pfam14018	DUF4234	Domain of unknown function (DUF4234). This presumed integral membrane protein domain is functionally uncharacterized. This domain family is found in bacteria and archaea, and is approximately 70 amino acids in length.	69
404835	pfam14019	DUF4235	Protein of unknown function (DUF4235). This family of integral membrane proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 88 and 119 amino acids in length.	77
404836	pfam14020	DUF4236	Protein of unknown function (DUF4236). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and viruses. Proteins in this family are typically between 69 and 402 amino acids in length.	55
404837	pfam14021	TNT	Tuberculosis necrotizing toxin. This is the C-terminal domain secreted by Mycobacterium tuberculosis (Mtb). It induces necrosis of infected cells to evade immune responses. Mtb utilizes the protein CpnT to kill human macrophages by secreting its C-terminal domain (CTD), named tuberculosis necrotizing toxin (TNT) that induces necrosis. It acts as a NAD+ glycohydrolase which hydrolyzes the essential cellular coenzyme NAD+ in the cytosol of infected macrophages resulting in necrotic cell death. CpnT transports its toxic CTD from the cell surface of M. tuberculosis by proteolytic cleavage, where the toxin is cleaved to induce host cell death. Structural analysis determined that the TNT core contains only six beta-strands as opposed to seven found in all known NAD+-utilizing toxins, and is significantly smaller, with only two short alpha-helices and two 3/10 helices. Furthermore, the putative NAD+ binding pocket identified Q822, Y765 and R757 as residues possibly involved in NAD+-binding and hydrolysis based on similar positions of catalytic amino acids of ADP-ribosylating toxins. The glutamine 822 residue was found to be highly conserved among TNT homologs.	84
404838	pfam14022	DUF4238	Protein of unknown function (DUF4238). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 274 and 374 amino acids in length.	279
372881	pfam14023	DUF4239	Protein of unknown function (DUF4239). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 254 and 270 amino acids in length.	212
404839	pfam14024	DUF4240	Protein of unknown function (DUF4240). This presumed domain is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 169 and 263 amino acids in length. This domain is often associated with the WGR domain pfam05406.	128
404840	pfam14025	DUF4241	Protein of unknown function (DUF4241). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 205 and 315 amino acids in length. There is a conserved GDG sequence motif at the C-terminus.	187
404841	pfam14026	DUF4242	Protein of unknown function (DUF4242). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 90 and 170 amino acids in length. There is a single completely conserved residue C that may be functionally important.	74
404842	pfam14027	DUF4243	Protein of unknown function (DUF4243). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 348 and 477 amino acids in length.	336
404843	pfam14028	Lant_dehydr_C	Lantibiotic biosynthesis dehydratase C-term. Lant_dehydr_C is the C-terminal domain of a family of dehydratases that are involved in the biosynthesis of lantibiotics. While the extensive N-terminal domain, pfam04738, is involved in the serine-threonine glutamylation step of the synthetic process, this C-terminal domain, once thought to be a separate domain from the dehydratase enzymic activity, is necessary for the final glutamate-elimination step in the generation of the lantibiotic. Lantibiotics are a class of peptide antibiotic that contains one or more thioether bonds.	289
404844	pfam14029	DUF4244	Protein of unknown function (DUF4244). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 66 and 95 amino acids in length. There is a conserved EYA sequence motif.	50
404845	pfam14030	DUF4245	Protein of unknown function (DUF4245). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 188 and 235 amino acids in length.	166
404846	pfam14031	D-ser_dehydrat	Putative serine dehydratase domain. This domain is found at the C-terminus of yeast D-serine dehydratase. Structures have been solved for two bacterial members of this family. The yeast protein has been shown to be a zinc-dependent enzyme.	96
404847	pfam14032	PknH_C	PknH-like extracellular domain. This domain is functionally uncharacterized. It is found as the periplasmic domain of the bacterial protein kinase PknH. The domain is also found in isolation in numerous proteins, for example the lipoproteins lpqQ, lprH, lppH and lpqA from M. tuberculosis. This family of proteins is found in bacteria. Proteins in this family are typically between 214 and 268 amino acids in length. There are two completely conserved C residues that are likely to form a disulphide bond. A second pair of cysteines is less well conserved but probably forms a second disulphide bond. It seems likely that this domain functions to bind some as yet unknown ligand.	187
404848	pfam14033	DUF4246	Protein of unknown function (DUF4246). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and fungi. Proteins in this family are typically between 392 and 644 amino acids in length.	421
379432	pfam14034	Spore_YtrH	Sporulation protein YtrH. This family of proteins is involved in sporulation. It may contribute to the formation and stability of the thick peptidoglycan layer between the two membranes of the spore, known as the cortex. In Bacillus subtilis its expression is regulated by sigma-E.	99
404849	pfam14035	YlzJ	YlzJ-like protein. The YlzJ-like protein family includes the B. subtilis YlzJ protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 61 and 72 amino acids in length. There are two completely conserved residues (L and G) that may be functionally important.	65
379434	pfam14036	YlaH	YlaH-like protein. The YlaH-like protein family includes the B. subtilis YlaH protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 100 amino acids in length. There is a conserved LGFA sequence motif.	76
372886	pfam14037	YoqO	YoqO-like protein. The YoqO-like protein family includes the B. subtilis YoqO protein, which is functionally uncharacterized. This family of proteins is found in bacteria and viruses. Proteins in this family are approximately 120 amino acids in length. There are two completely conserved residues (I and Y) that may be functionally important.	116
372887	pfam14038	YqzE	YqzE-like protein. The YqzE-like protein family includes the B. subtilis YqzE protein, which is functionally uncharacterized. It is a part of the ComG operon, which is regulated by the competence transcription factor ComK. This family of proteins is found in bacteria. Proteins in this family are typically between 49 and 66 amino acids in length.	53
404850	pfam14039	YusW	YusW-like protein. The YusW-like protein family includes the B. subtilis YusW protein, which is functionally uncharacterized. This family of proteins is found in bacteria, and is approximately 90 amino acids in length.	91
404851	pfam14040	DNase_NucA_NucB	Deoxyribonuclease NucA/NucB. Members of this family act as deoxyribonucleases.	113
404852	pfam14041	Lipoprotein_21	LppP/LprE lipoprotein. The family includes putative lipoproteins LppP and LprE from species of Mycobacterium. LppP is required for optimal growth of M. tuberculosis.	86
404853	pfam14042	DUF4247	Domain of unknown function (DUF4247). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 143 and 271 amino acids in length.	122
379438	pfam14043	WVELL	WVELL protein. This family includes the B. subtilis YfjH protein, which is functionally uncharacterized. This is not a homolog of E. coli YfjH, a synonym for IscX, which belongs to pfam04384. This family of proteins is found in bacteria. Proteins in this family are approximately 90 amino acids in length and contain a highly conserved WVELL motif.	74
404854	pfam14044	NETI	NETI protein. This family includes the B. subtilis YebG protein, which is functionally uncharacterized. This is not a homolog of E. coli YebG, which belongs to pfam07130. This family of proteins is found in bacteria. Proteins in this family are typically between 42 and 66 amino acids in length and contain a conserved NETI motif.	56
404855	pfam14045	YIEGIA	YIEGIA protein. This family includes the B. subtilis YphB protein, which is functionally uncharacterized. Its expression is regulated by the sporulation transcription factor sigma-F, however it is not essential for sporulation or germination. This is not a homolog of E. coli YphB, which belongs to pfam01263. This family of proteins is found in bacteria. Proteins in this family are typically between 276 and 300 amino acids in length and contain a conserved YIEGIA motif.	282
404856	pfam14046	NR_Repeat	Nuclear receptor repeat. This is a repeat domain involved in dimerization of nuclear receptor proteins and in transcriptional regulation in general. It contains a Leu-Xaa-Xaa-Leu-Leu motif which has been characterized for the orphan nuclear receptor Dax-1, which represses the constitutively expressed protein Ad4BP/SF-1. The LXXLL motif plays an important role in binding of Dax-1 to Ad4BP/SF-1. The domain is subject to structure determination by the Joint Center of Structural Genomics.	47
404857	pfam14047	DCR	Dppa2/4 conserved region. This domain has been characterized in the finding of a developmental pluripotency associated gene (Dppa) in the lower vertebrate Xenopus laevis. Previous to this discovery, Dppa genes were known only in higher vertebrates. The domain is subject to structure determination by the Joint Center of Structural Genomics.	67
404858	pfam14048	MBD_C	C-terminal domain of methyl-CpG binding protein 2 and 3. CpG-methylation is a frequently occurring epigenetic modification of vertebrate genomes resulting in transcriptional repression. This domain was found at the C-terminus of the methyl-CpG-binding domain (MBD) containing proteins MBD2 and MBD3, the latter was shown to not bind directly to methyl-CpG DNA but rather interact with components of the NuRD/Mi2 complex, an abundant deacetylase complex. The domain is subject to structure determination by the Joint Center of Structural Genomics.	93
404859	pfam14049	Dppa2_A	Dppa2/4 conserved region in higher vertebrates. Developmental pluripotency associated genes (Dppa) in lower vertebrates have remained undetected until the discovery of a Dppa homolog in Xenopus laevis, reporting a new domain termed Dppa2/4 conserved region (DCR). In higher vertebrate Dppa proteins the DCR domain is located next to the here-reported domain. The domain is subject to structure determination by the Joint Center of Structural Genomics.	85
404860	pfam14050	Nudc_N	N-terminal conserved domain of Nudc. The N-terminus of nuclear distribution gene C homolog (NUDC) proteins contains a highly conserved region consisting of a predicted three helix bundle. In the human homolog this segment has been targeted for structure determination by the Joint Center for Structural Genomics. NUDC forms a complex with other NUD proteins and is involved in several cellular division activities. Recently it was shown that NUDC regulates platelet-activating factor (PAF) acetylhydrolase with PAF being a pro-inflammatory secondary lipidic messenger.	60
404861	pfam14051	Requiem_N	N-terminal domain of DPF2/REQ. This putative domain has been detected on the human DPF2 protein and was subsequently targeted for structure determination by the Joint Center for Structural Genomics (JCSG). Possibly, the C-terminus extends by 30 amino acids and forms a separate domain. DPF2 interacts with estrogen related receptor alpha (Err-alpha), an orphan receptor which acts as a regulator in energy metabolism. It was also identified as an adaptor molecule that links nuclear factor kappa-light-chain-enhancer of activated B cells (NF-kappa-B) dimer RelB/p52 and switch/sucrose-nonfermentable (SWI/SNF) chromatin remodeling factor.	67
404862	pfam14052	Caps_assemb_Wzi	Capsule assembly protein Wzi. Many bacteria are covered in a layer of surface-associated polysaccharide called the capsule. These capsules can be divided into four groups depending upon the organisation of genes responsible for capsule assembly, the assembly pathway and regulation. This family plays a role in group 1 capsule biosynthesis. It is likely to be involved in the later stages of capsule assembly. It is likely to consist of a beta-barrel structure.	392
404863	pfam14053	DUF4248	Domain of unknown function (DUF4248). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 73 and 86 amino acids in length.	66
404864	pfam14054	DUF4249	Domain of unknown function (DUF4249). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 279 and 365 amino acids in length. There are two completely conserved residues (C and G) that may be functionally important.	257
404865	pfam14055	NVEALA	NVEALA protein. This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 75 and 92 amino acids in length. There is a conserved NVEALA sequence motif.	64
404866	pfam14056	DUF4250	Domain of unknown function (DUF4250). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length. There are two completely conserved residues (N and R) that may be functionally important.	55
404867	pfam14057	GGGtGRT	GGGtGRT protein. This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are approximately 330 amino acids in length and contain many highly conserved residues including a GGGtGRT motif.	326
404868	pfam14058	PcfK	PcfK-like protein. The PcfK-like protein family includes the Enterococcus faecalis PcfK protein, which is functionally uncharacterized. This family of proteins is found in bacteria and viruses. Proteins in this family are typically between 137 and 257 amino acids in length. There are two completely conserved residues (D and L) that may be functionally important.	137
404869	pfam14059	DUF4251	Domain of unknown function (DUF4251). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 164 and 196 amino acids in length.	132
404870	pfam14060	DUF4252	Domain of unknown function (DUF4252). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 154 and 182 amino acids in length.	123
404871	pfam14061	Mtf2_C	Polycomb-like MTF2 factor 2. Mammalian Polycomb-like gene MTF2/PCL2 forms a complex with Polycomb repressive complex-2 (PRC2) and collaborates with PRC1 to achieve repression of Hox gene expression. The human MTF2 gene is expressed in three splicing variants, each of them contains the short C-terminal domain defined here. The domain is subject to structure determination by the Joint Center of Structural Genomics.	48
404872	pfam14062	DUF4253	Domain of unknown function (DUF4253). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 110 amino acids in length.	109
404873	pfam14063	DUF4254	Protein of unknown function (DUF4254). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 195 and 207 amino acids in length.	144
404874	pfam14064	HmuY	HmuY protein. HmuY is a novel heme-binding protein that recruits heme from host carriers and delivers it to its cognate outer-membrane transporter, the TonB-dependent receptor HmuR. This family of proteins is found in bacteria. Proteins in this family are typically between 214 and 278 amino acids in length.	155
404875	pfam14065	DUF4255	Protein of unknown function (DUF4255). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 190 and 320 amino acids in length.	176
404876	pfam14066	DUF4256	Protein of unknown function (DUF4256). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 190 amino acids in length.	173
404877	pfam14067	LssY_C	LssY C-terminus. This domain is found at the C-terminus of Legionella LssY proteins, which may be a part of the type I secretion system. This domain is functionally uncharacterized. This domain is found in bacteria, and is typically between 182 and 195 amino acids in length. It is often found in association with pfam09335 and PF01569. There are two completely conserved residues (P and W) that may be functionally important.	188
404878	pfam14068	YuiB	Putative membrane protein. This family of bacterial proteins is functionally uncharacterized. Proteins in this family are approximately 100 amino acids in length. There is a conserved FGIGF sequence motif, and many members are putative membrane proteins.	101
404879	pfam14069	SpoVIF	Stage VI sporulation protein F. The sporulation-specific SpoVIF (YjcC) protein of Bacillus subtilis is essential for the development of heat-resistant spores. Its expression is governed by SigK.	72
404880	pfam14070	YjfB_motility	Putative motility protein. This family of proteins is regulated in B. subtilis by SigD, and is likely to be involved in motility or flagellin production. Proteins in this family are approximately 60 amino acids in length, and contain two highly conserved asparagine residues.	57
404881	pfam14071	YlbD_coat	Putative coat protein. This is a family of putative bacterial coat proteins. Proteins in this family are approximately 140 amino acids in length.	127
404882	pfam14072	DndB	DNA-sulfur modification-associated. This is a family of bacterial proteins likely to be necessary for binding to DNA and recognising the modification sites. Members are found in bacteria, archaea and on viral plasmids, and are typically between 354 and 474 amino acids in length. There is a conserved DGQHR sequence motif.	337
404883	pfam14073	Cep57_CLD	Centrosome localization domain of Cep57. The CLD or centrosome localization domain of Cep57 is found at the N-terminus, and lies approximately between residues 58 and 239. This region lies within the first alpha-helical coiled-coil segment of Cep57, and localizes to the centrosome internally to gamma-tubulin, suggesting that it is either on both centrioles or on a centromatrix component. This N-terminal region can also multimerize with the N-terminus of other Cep57 molecules. The C-terminal part, Family Cep57_MT_bd, pfam06657, is the microtubule-binding region of Cep57.	178
372902	pfam14074	DUF4257	Protein of unknown function (DUF4257). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 120 amino acids in length.	80
404884	pfam14075	UBN_AB	Ubinuclein conserved middle domain. Ubinuclein 1 and 2 (UBN1, UBN2) are members of a histone chaperone complex involved in the formation of a certain type of facultative heterochromatin, called senescence-associated heterochromatin foci (SAHF). The domain described here is conserved in many eukaryotes such as human, rat, drosophila, and zebra-fish and has been targeted for protein structure determination by the Joint Center for Structural Genomics.	211
404885	pfam14076	DUF4258	Domain of unknown function (DUF4258). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 95 and 124 amino acids in length.	70
404886	pfam14077	WD40_alt	Alternative WD40 repeat motif. WD repeats are short subdomains of about 40 amino acids and fold into 4 antiparallel beta hairpins. This domain here has been detected on the C-terminus of WD repeat-containing protein 18 during target selection by the Joint Center for Structural Genomics.	48
404887	pfam14078	DUF4259	Domain of unknown function (DUF4259). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 118 and 145 amino acids in length.	130
404888	pfam14079	DUF4260	Domain of unknown function (DUF4260). This family of integral membrane proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 114 and 126 amino acids in length. There is a conserved GLK sequence motif.	112
404889	pfam14080	DUF4261	Domain of unknown function (DUF4261). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 80 amino acids in length.	77
404890	pfam14081	DUF4262	Domain of unknown function (DUF4262). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and viruses. Proteins in this family are typically between 147 and 227 amino acids in length. Some members may be incorrectly annotated as the KatG protein.	127
404891	pfam14082	DUF4263	Domain of unknown function (DUF4263). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea, eukaryotes and viruses. Proteins in this family are typically between 244 and 403 amino acids in length.	163
290790	pfam14083	PGDYG	PGDYG protein. This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 150 amino acids in length. There is a conserved PGDYG motif.	101
404892	pfam14084	DUF4264	Protein of unknown function (DUF4264). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length.	52
404893	pfam14085	DUF4265	Domain of unknown function (DUF4265). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 139 and 168 amino acids in length.	111
404894	pfam14086	DUF4266	Domain of unknown function (DUF4266). This presumed lipoprotein domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 50 amino acids in length.	50
404895	pfam14087	DUF4267	Domain of unknown function (DUF4267). This family of integral membrane proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 126 and 142 amino acids in length.	110
404896	pfam14088	DUF4268	Domain of unknown function (DUF4268). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 151 and 387 amino acids in length.	138
372908	pfam14089	KbaA	KinB-signalling pathway activation in sporulation. This family of small proteins is found in the membrane and is necessary for kinase KinB signalling during sporulation. There is a conserved GFF sequence motif. The initiation of sporulation in Bacillus subtilis is dependent on the phosphorylation of the Spo0A transcription factor mediated by the phospho-relay and by two major kinases, KinA and KinB.	179
404897	pfam14090	HTH_39	Helix-turn-helix domain. This helix-turn-helix domain is often found in phage proteins and is likely to be DNA-binding.	70
404898	pfam14091	DUF4269	Domain of unknown function (DUF4269). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 176 and 187 amino acids in length. There is a conserved KTE sequence motif.	151
404899	pfam14092	DUF4270	Domain of unknown function (DUF4270). This family of lipoproteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 444 and 534 amino acids in length.	442
404900	pfam14093	DUF4271	Domain of unknown function (DUF4271). This family of integral membrane proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 221 and 326 amino acids in length.	207
404901	pfam14094	DUF4272	Domain of unknown function (DUF4272). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 221 and 399 amino acids in length.	207
404902	pfam14096	DUF4274	Domain of unknown function (DUF4274). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 80 amino acids in length.	76
404903	pfam14097	SpoVAE	Stage V sporulation protein AE1. Members of this family are all described as putative stage V sporulation protein AE, although this could not be confirmed. Proteins in this family are approximately 190 amino acids in length.	179
404904	pfam14098	SSPI	Small, acid-soluble spore protein I. This family of proteins is putatively assigned as a small, acid-soluble spore protein 1. Proteins in this family are approximately 70 amino acids in length. There is a conserved LPGLGV sequence motif.	65
404905	pfam14099	Polysacc_lyase	Polysaccharide lyase. This family includes heparin lyase I, EC:4.2.2.7. Heparin lyase I depolymerizes heparin by cleaving the glycosidic linkage next to an iduronic acid moiety. The structure of heparin lyase I consists of a beta-jelly roll domain with a long, deep substrate-binding groove and an unusual thumb domain containing many basic residues extending from the main body of the enzyme. This family also includes glucuronan lyase, EC:4.2.2.14. The structure of glucuronan lyase is a beta-jelly roll.	213
404906	pfam14100	PmoA	Methane oxygenase PmoA. This family is a putative methane oxygenase.	272
316612	pfam14101	DUF4275	Domain of unknown function (DUF4275). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 140 amino acids in length.	139
404907	pfam14102	Caps_synth_CapC	Capsule biosynthesis CapC. This family of proteins plays a role in capsule biosynthesis. They are essential for gamma-polyglutamic acid (PGA) production.	119
404908	pfam14103	DUF4276	Domain of unknown function (DUF4276). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 190 and 224 amino acids in length. There is a single completely conserved residue E that may be functionally important.	186
404909	pfam14104	DUF4277	Domain of unknown function (DUF4277). This presumed domain is functionally uncharacterized. This domain family is found in bacteria and archaea, and is approximately 110 amino acids in length. There is a conserved NGLGF sequence motif.	109
404910	pfam14105	DUF4278	Domain of unknown function (DUF4278). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and viruses. Proteins in this family are typically between 58 and 136 amino acids in length. There is a single completely conserved residue R that may be functionally important.	56
404911	pfam14106	DUF4279	Domain of unknown function (DUF4279). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 134 and 145 amino acids in length.	116
404912	pfam14107	DUF4280	Domain of unknown function (DUF4280). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 129 and 456 amino acids in length. There is a single completely conserved residue C that may be functionally important.	109
404913	pfam14108	DUF4281	Domain of unknown function (DUF4281). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 147 and 232 amino acids in length. There are two completely conserved residues (W and P) that may be functionally important.	127
404914	pfam14109	GldH_lipo	GldH lipoprotein. Members of this protein family are predicted lipoproteins, exclusive to the Bacteroidetes phylum. Proteins in this family are typically between 155 and 167 amino acids in length. Members include GldH, a protein linked to a type of rapid surface gliding motility found in certain Bacteroidetes, such as Flavobacterium johnsoniae and Cytophaga hutchinsonii. Gliding motility appears closely linked to chitin utilization in the model species Flavobacterium johnsoniae. Not all Bacteroidetes with members of this protein family may have gliding motility.	129
404915	pfam14110	DUF4282	Domain of unknown function (DUF4282). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 93 and 155 amino acids in length. There is a single completely conserved residue E that may be functionally important.	86
404916	pfam14111	DUF4283	Domain of unknown function (DUF4283). This domain family is found in plants, and is approximately 100 amino acids in length. Considering the very diverse range of other domains it is associated with it is possible that this domain is a binding/guiding region. There are two highly conserved tryptophan residues.	145
404917	pfam14112	DUF4284	Immunity protein 22. A predicted immunity protein with an alpha+beta fold and conserved tryptophan, tyrosine and acidic residues. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, which usually contains toxin domains of the ColD/E5, Tox-REase-4, Ntox49 or Ntox14 families. The domain is also found in heterogeneous polyimmunity loci.	121
404918	pfam14113	Tae4	Type VI secretion system (T6SS), amidase effector protein 4. Tae4 is a new form of toxin-antitoxin system protein for a type VI secretion system, T6SS. T6SS has roles in interspecies interactions, as well as higher order host-infection, by injecting effector proteins into the periplasmic compartment of the recipient cells of closely related species. Pseudomonas aeruginosa produces at least three effector proteins to other cells and thus has three specific cognate immunity proteins to protect itself. Tae4, or type VI amidase effector 4, in Enterobacter cloacae has a cognate Tai4 or type VI amidase immunity 4 protein. The immunity protein is Tai4, pfam16695.	114
404919	pfam14114	DUF4286	Domain of unknown function (DUF4286). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 100 and 112 amino acids in length.	95
316626	pfam14115	YuzL	YuzL-like protein. The YuzL-like protein family includes the B. subtilis YuzL protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 50 amino acids in length.	41
404920	pfam14116	YyzF	YyzF-like protein. The YyzF-like protein family includes the B. subtilis YyzF protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length.	48
404921	pfam14117	DUF4287	Domain of unknown function (DUF4287). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 70 and 180 amino acids in length.	60
290824	pfam14118	YfzA	YfzA-like protein. The YfzA-like protein family includes the B. subtilis YfzA protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 100 amino acids in length.	90
379479	pfam14119	DUF4288	Domain of unknown function (DUF4288). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 120 amino acids in length.	84
379480	pfam14120	YhzD	YhzD-like protein. The YhzD-like protein family includes the B. subtilis YhzD protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length. There is a conserved GKL sequence motif.	61
404922	pfam14121	Porin_10	Putative porin. This family of membrane beta-barrel proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 655 and 722 amino acids in length. SRI_1264 is identified by Gene3D as a membrane bound beta-barrel. These sequences are putative porins.	602
316632	pfam14122	YokU	YokU-like protein, putative antitoxin. The YokU-like protein family includes the B. subtilis YokU protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 90 amino acids in length. There are two conserved CXXC sequence motifs. This is likely to be a family of bacterial antitoxins, as the sequence bears remote homology to the RelE fold family.	87
404923	pfam14123	DUF4290	Domain of unknown function (DUF4290). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 200 and 221 amino acids in length. There are two conserved sequence motifs: EYGR and KLWD.	172
404924	pfam14124	DUF4291	Domain of unknown function (DUF4291). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 190 and 214 amino acids in length. There are two conserved sequence motifs: VYQAY and RMTW.	180
404925	pfam14125	DUF4292	Domain of unknown function (DUF4292). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 243 and 287 amino acids in length.	207
404926	pfam14126	DUF4293	Domain of unknown function (DUF4293). This family of integral membrane proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 136 and 154 amino acids in length.	153
404927	pfam14127	DUF4294	Domain of unknown function (DUF4294). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 192 and 226 amino acids in length.	149
404928	pfam14128	DUF4295	Domain of unknown function (DUF4295). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 50 amino acids in length. There are two completely conserved residues (K and Y) that may be functionally important.	47
404929	pfam14129	DUF4296	Domain of unknown function (DUF4296). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 90 amino acids in length.	87
404930	pfam14130	DUF4297	Domain of unknown function (DUF4297). This presumed domain is functionally uncharacterized. This domain family is found in bacteria and archaea, and is typically between 207 and 221 amino acids in length.	212
404931	pfam14131	DUF4298	Domain of unknown function (DUF4298). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 94 and 105 amino acids in length. There are two completely conserved residues (Y and D) that may be functionally important.	87
404932	pfam14132	DUF4299	Domain of unknown function (DUF4299). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 275 and 313 amino acids in length. There are two conserved sequence motifs: RGF and DAY. There are two completely conserved residues (P and D) that may be functionally important.	301
404933	pfam14133	DUF4300	Domain of unknown function (DUF4300). This family of lipoproteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 281 and 303 amino acids in length. There are two conserved sequence motifs: NCR and PYQ.	252
404934	pfam14134	DUF4301	Domain of unknown function (DUF4301). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 505 and 516 amino acids in length.	508
404935	pfam14135	DUF4302	Domain of unknown function (DUF4302). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 344 and 443 amino acids in length. There are two completely conserved residues (R and L) that may be functionally important.	234
404936	pfam14136	DUF4303	Domain of unknown function (DUF4303). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 169 and 192 amino acids in length.	153
404937	pfam14137	DUF4304	Domain of unknown function (DUF4304). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 154 and 223 amino acids in length.	114
404938	pfam14138	COX16	Cytochrome c oxidase assembly protein COX16. This family represents homologs of COX16 which has been shown to be involved in assembly of cytochrome oxidase. Proteins in this family are typically between 106 and 134 amino acids in length.	79
404939	pfam14139	YpzG	YpzG-like protein. The YpzG-like protein family includes the B. subtilis YpzG protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 50 amino acids in length. There is a conserved QVNG sequence motif.	49
290845	pfam14140	YpzI	YpzI-like protein. The YpzI-like protein family includes the B. subtilis YpzI protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 40 amino acids in length.	42
372925	pfam14141	YqzM	YqzM-like protein. The YqzM-like protein family includes the B. subtilis YqzM protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 40 amino acids in length.	40
290847	pfam14142	YrzO	YrzO-like protein. The YrzO-like protein family includes the B. subtilis YrzO protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 50 amino acids in length.	46
372926	pfam14143	YrhC	YrhC-like protein. The YrhC-like protein family includes the B. subtilis YrhC protein, which is functionally uncharacterized. YrhC is on the same operon as the MccA and MccB genes, which are involved in the conversion of methionine to cysteine. Expression of this operon is repressed in the presence of sulphate or cysteine. This family of proteins is found in bacteria. Proteins in this family are approximately 80 amino acids in length.	72
404940	pfam14144	DOG1	Seed dormancy control. This family of plant proteins appears to be a highly specific controller of seed dormancy.	76
404941	pfam14145	YrhK	YrhK-like protein. The YrhK-like protein family includes the B. subtilis YrhK protein, which is functionally uncharacterized. Its expression is under the control of the motility sigma factor sigma-D. This domain family is found in bacteria, archaea and eukaryotes, and is approximately 60 amino acids in length.	57
316654	pfam14146	DUF4305	Domain of unknown function (DUF4305). This family includes the B. subtilis YdiK protein, which is functionally uncharacterized. This is not a homolog of E. coli YdiK, which belongs to pfam01594. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length.	37
290852	pfam14147	Spore_YhaL	Sporulation protein YhaL. This family of proteins is involved in sporulation. In B. subtilis its expression is regulated by the early mother-cell-specific transcription factor sigma-E.	52
290853	pfam14148	YhdB	YhdB-like protein. The YhdB-like protein family includes the B. subtilis YhdB protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 57 and 82 amino acids in length. There are two conserved sequence motifs: LMVRT and FLHAY.	71
404942	pfam14149	YhfH	YhfH-like protein. The YhfH-like protein family includes the B. subtilis YhfH protein, which is functionally uncharacterized. Its expression is repressed by the Spx paralogue MgsR, which regulates genes involved in stress response. This family of proteins is found in bacteria. Proteins in this family are typically between 42 and 53 amino acids in length.	37
372929	pfam14150	YesK	YesK-like protein. The YesK-like protein family includes the B. subtilis YesK protein, which is functionally uncharacterized. Its expression is regulated by the sporulation-specific sigma factor sigma-E. This family of proteins is found in bacteria. Proteins in this family are approximately 100 amino acids in length.	81
372930	pfam14151	YfhD	YfhD-like protein. The YfhD-like protein family includes the B. subtilis YfhD protein, which is functionally uncharacterized. Its expression is regulated by the sporulation-specific sigma factor sigma-F. This family of proteins is found in bacteria. Proteins in this family are approximately 50 amino acids in length. There is a single completely conserved residue E that may be functionally important.	59
372931	pfam14152	YfhE	YfhE-like protein. The YfhE-like protein family includes the B. subtilis YfhE protein, which is functionally uncharacterized. Its expression may be regulated by the sigma factor sigma-B, which regulates the expression of stress-response proteins. This family of proteins is found in bacteria. Proteins in this family are approximately 40 amino acids in length. There is a conserved QEV sequence motif.	36
372932	pfam14153	Spore_coat_CotO	Spore coat protein CotO. Bacillus spores are protected by a protein shell consisting of over 50 different polypeptides, known as the coat. This family of proteins has an important morphogenetic role in coat assembly; it is involved in the assembly of at least 5 different coat proteins including CotB, CotG, CotS, CotSA and CotW. It is likely to act at a late stage of coat assembly.	180
316659	pfam14154	DUF4306	Domain of unknown function (DUF4306). This family includes the B. subtilis YjdJ protein, which is functionally uncharacterized. This is not a homolog of E. coli YjdJ, which belongs to pfam00583. This family of proteins is found in bacteria. Proteins in this family are typically between 95 and 152 amino acids in length.	88
404943	pfam14155	DUF4307	Domain of unknown function (DUF4307). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 132 and 153 amino acids in length. There is a single completely conserved residue C that may be functionally important.	111
404944	pfam14156	AbbA_antirepres	Antirepressor AbbA. This family inactivates the repressor AbrB, which represses genes switched on during the transition from the exponential to the stationary phase of growth. It binds to AbrB and prevents it from binding to DNA.	63
404945	pfam14157	YmzC	YmzC-like protein. The YmzC-like protein family includes the B. subtilis YmzC protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 58 and 91 amino acids in length. There is a conserved ELR sequence motif.	58
404946	pfam14158	YndJ	YndJ-like protein. The YndJ-like protein family includes the B. subtilis YndJ protein, which is functionally uncharacterized. This family is found in bacteria and archaea, and is typically between 222 and 269 amino acids in length. There are two completely conserved G residues that may be functionally important.	260
404947	pfam14159	CAAD	CAAD domains of cyanobacterial aminoacyl-tRNA synthetase. This domain is present in aminoacyl-tRNA synthetases (aaRSs), enzymes that couple tRNAs to their cognate amino acids. aaRSs from cyanobacteria containing the CAAD (for cyanobacterial aminoacyl-tRNA synthetases appended domain) protein domains are localized in the thylakoid membrane. The domain bears two putative transmembrane helices and is present in glutamyl-, isoleucyl-, leucyl-, and valyl-tRNA synthetases, the latter of which has probably recruited the domain more than once during evolution.	85
404948	pfam14160	FAM110_C	Centrosome-associated C-terminus. This is the C-terminus of a family of proteins that colocalize with the centrosome/microtubule organisation centre in interphase and at the spindle poles in mitosis.	113
404949	pfam14161	FAM110_N	Centrosome-associated N-terminus. This is the N-terminus of a family of proteins that colocalize with the centrosome/microtubule organisation centre in interphase and at the spindle poles in mitosis.	107
316666	pfam14162	YozD	YozD-like protein. The YozD-like protein family includes the B. subtilis YozD protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length.	57
404950	pfam14163	SieB	Super-infection exclusion protein B. This family includes superinfection exclusion proteins. These proteins prevent the growth of superinfecting phage which are insensitive to repression. It aborts lytic development of superinfecting phage.	147
372939	pfam14164	YqzH	YqzH-like protein. The YqzH-like protein family includes the B. subtilis YqzH protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length.	64
316669	pfam14165	YtzH	YtzH-like protein. The YtzH-like protein family includes the B. subtilis YtzH protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 90 amino acids in length. There is a conserved DIL sequence motif.	86
372940	pfam14166	YueH	YueH-like protein. The YueH-like protein family includes the B. subtilis YueH protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 80 amino acids in length.	79
372941	pfam14167	YfkD	YfkD-like protein. The YfkD-like protein family includes the B. subtilis YfkD protein, which is functionally uncharacterized. Its expression is regulated by the sigma factor sigma-B, which regulates the expression of stress-response proteins, and by the forespore-specific sigma factor sigma-G. This family of proteins is found in bacteria. Proteins in this family are typically between 254 and 265 amino acids in length.	232
404951	pfam14168	YjzC	YjzC-like protein. The YjzC-like protein family includes the B. subtilis YjzC protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length.	55
372943	pfam14169	YdjO	Cold-inducible protein YdjO. This family includes the B. subtilis YdjO protein, which is functionally uncharacterized. This is not a homolog of E. coli YdjO. B. subtilis YdjO is cold-inducible. Its expression is induced by the extracytoplasmic function sigma factor sigma-W. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length.	59
404952	pfam14171	SpoIISA_toxin	Toxin SpoIISA, type II toxin-antitoxin system. SpoIISA is a toxin which causes lysis of vegetatively growing cells. It forms part of a type II toxin-antitoxin system, where the SpoIISB protein, pfam14185, acts as an antitoxin. It is a transmembrane protein, with a cytoplasmic domain accounting for approximately two-thirds of the protein. The structure of the cytoplasmic domain resembles that of the GAF domains, pfam01590. SpoIISB binds to the cytoplasmic domain of SpoIISA with high affinity.	236
404953	pfam14172	DUF4309	Domain of unknown function (DUF4309). This family includes the B. subtilis YjgB protein, which is functionally uncharacterized. This is not a homolog of E. coli YjgB. Expression of B. subtilis YjgB is regulated by the alternative transcription factor sigma-B. This family is found in bacteria, and is approximately 140 amino acids in length.	130
372946	pfam14173	ComGG	ComG operon protein 7. This family is required for DNA-binding during transformation of competent bacterial cells.	95
379493	pfam14174	YycC	YycC-like protein. The YycC-like protein family includes the B. subtilis YycC protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 50 amino acids in length. There is a conserved HIL sequence motif.	50
404954	pfam14175	YaaC	YaaC-like Protein. The YaaC-like protein family includes the B. subtilis YaaC protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 320 and 333 amino acids in length.	313
404955	pfam14176	YxiJ	YxiJ-like protein. The YxiJ-like protein family includes the B. subtilis YxiJ protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 120 amino acids in length.	110
372949	pfam14177	YkyB	YkyB-like protein. The YkyB-like protein family includes the B. subtilis YkyB protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 150 amino acids in length. There are two conserved sequence motifs: NRHAKTA and HLG.	135
290882	pfam14178	YppF	YppF-like protein. The YppF-like protein family includes the B. subtilis YppF protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length. There is a conserved LLDF sequence motif.	59
372950	pfam14179	YppG	YppG-like protein. The YppG-like protein family includes the B. subtilis YppG protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 115 and 181 amino acids in length. There are two completely conserved residues (F and G) that may be functionally important.	101
404956	pfam14181	YqfQ	YqfQ-like protein. The YqfQ-like protein family includes the B. subtilis YqfQ protein, also known as VrrA, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 146 and 237 amino acids in length. There are two conserved sequence motifs: QYGP and PKLY.	166
372951	pfam14182	YgaB	YgaB-like protein. The YgaB-like protein family includes the B. subtilis YgaB protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 90 amino acids in length.	76
404957	pfam14183	YwpF	YwpF-like protein. The YwpF-like protein family includes the B. subtilis YwpF protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 146 and 167 amino acids in length. There is a conserved IIN sequence motif.	134
404958	pfam14184	YrvL	Regulatory protein YrvL. YrvL prevents expression and activity of the YrvI sigma factor. It may function as an anti-sigma factor.	121
290888	pfam14185	SpoIISB_antitox	Antitoxin SpoIISB, type II toxin-antitoxin system. Members of this family act as antitoxins. They bind to the SpoIISA toxin, pfam14171. They are disordered proteins which adopt structure only when bound to SpoIISA.	55
404959	pfam14186	Aida_C2	Cytoskeletal adhesion. This is the C-terminal domain of the axin-interacting protein family, and is a distinct version of the C2 domain. This domain is critical for interactions with the cytoskeleton in the context of cellular adhesion points.	139
404960	pfam14187	DUF4310	Domain of unknown function (DUF4310). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 214 and 231 amino acids in length.	208
404961	pfam14188	DUF4311	Domain of unknown function (DUF4311). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 260 amino acids in length.	212
404962	pfam14189	DUF4312	Domain of unknown function (DUF4312). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 99 and 118 amino acids in length.	84
404963	pfam14190	DUF4313	Domain of unknown function (DUF4313). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 136 and 171 amino acids in length.	103
404964	pfam14191	YodL	YodL-like. The YodL-like protein family includes the B. subtilis YodL protein, which is functionally uncharacterized. This domain family is found in bacteria, and is approximately 100 amino acids in length. There are two completely conserved residues (Y and D) that may be functionally important.	101
404965	pfam14192	DUF4314	Domain of unknown function (DUF4314). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is typically between 56 and 93 amino acids in length.	69
404966	pfam14193	DUF4315	Domain of unknown function (DUF4315). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 90 amino acids in length.	79
404967	pfam14194	Cys_rich_VLP	Cysteine-rich VLP. This presumed domain is functionally uncharacterized. This domain family is found in bacteria and eukaryotes, and is approximately 60 amino acids in length. It contains 6 conserved cysteines and a conserved VLP sequence motif.	57
404968	pfam14195	DUF4316	Domain of unknown function (DUF4316). This domain is functionally uncharacterized. This domain is found in bacteria, and is typically between 56 and 95 amino acids in length.	44
404969	pfam14196	ATC_hydrolase	L-2-amino-thiazoline-4-carboxylic acid hydrolase. This family of enzymes catalyzes the conversion of L-2-amino-delta2-thiazoline-4-carboxylic acid (L-ATC) to N-carbamoyl-L-cysteine. It cleaves the carbon-sulphur bond in the ring structure of L-ATC to produce N-carbamoyl-L-cysteine.	145
372959	pfam14197	Cep57_CLD_2	Centrosome localization domain of PPC89. The N-terminal region of the fission yeast spindle pole body protein PPC89 has low similarity to the human Cep57 protein. The CLD or centrosome localization domain of Cep57 and PPC89 is found at the N-terminus. This region localizes to the centrosome internally to gamma-tubulin, suggesting that it is either on both centrioles or on a centromatrix component. This N-terminal region can also multimerize with the N-terminus of other Cep57 molecules. The C-terminal part, Family Cep57_MT_bd, pfam06657, is the microtubule-binding region of Cep57 and PPC89.	67
404970	pfam14198	TnpV	Transposon-encoded protein TnpV. This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and viruses. Proteins in this family are typically between 114 and 125 amino acids in length.	112
404971	pfam14199	DUF4317	Domain of unknown function (DUF4317). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 225 and 451 amino acids in length. There is a single completely conserved residue P that may be functionally important.	370
404972	pfam14200	RicinB_lectin_2	Ricin-type beta-trefoil lectin domain-like. 	89
404973	pfam14201	DUF4318	Domain of unknown function (DUF4318). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 80 amino acids in length. There is a single completely conserved residue F that may be functionally important.	75
404974	pfam14202	TnpW	Transposon-encoded protein TnpW. This family of proteins is found in bacteria. Proteins in this family are typically between 54 and 75 amino acids in length. There is a single completely conserved residue G that may be functionally important.	35
404975	pfam14203	TTRAP	Putative transposon-transfer assisting protein. TTRAP is a family of small bacterial proteins largely from Clostridium difficile. From comparative and other structural studies of the Structure 2L7K, UniProtKB:Q18AW3, it has been suggested that this family is required for interacting with other proteins in order to facilitate the transfer of the transposon CTn4 between different bacterial species. Structure 2L7K comprises an alpha-helical fold of four alpha-helices leading to the production of two clefts, the larger of which displays two highly conserved residues in close proximity, Glu-8 and Lys-48. The gene concerned is part of an operon within transposon CTn4, and is expressed alongside a putative DNA primase, a DNA topoisomerase and conjugal transfer proteins.	62
404976	pfam14204	Ribosomal_L18_c	Ribosomal L18 C-terminal region. This domain is the C-terminal end of ribosomal L18/L5 proteins.	93
404977	pfam14205	Cys_rich_KTR	Cysteine-rich KTR. This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and viruses. Proteins in this family are approximately 60 amino acids in length. There are 4 conserved cysteines and a conserved KTR sequence motif.	54
404978	pfam14206	Cys_rich_CPCC	Cysteine-rich CPCC. This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea, eukaryotes and viruses. Proteins in this family are typically between 68 and 104 amino acids in length. There are six conserved cysteines and a conserved CPCC sequence motif.	75
404979	pfam14207	DpnD-PcfM	DpnD/PcfM-like protein. This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 57 and 153 amino acids in length. There are two completely conserved residues (E and A) that may be functionally important.	46
404980	pfam14208	DUF4320	Domain of unknown function (DUF4320). This family of proteins is found in bacteria. Proteins in this family are typically between 120 and 131 amino acids in length. There are two completely conserved residues (G and Y) that may be functionally important.	117
404981	pfam14209	DUF4321	Domain of unknown function (DUF4321). This family of proteins is functionally uncharacterized. It is found in bacteria, and is approximately 50 amino acids in length.	48
290912	pfam14210	DUF4322	Domain of unknown function (DUF4322). This presumed domain is functionally uncharacterized. This domain family is found in archaea, and is approximately 60 amino acids in length. There is a conserved QTV sequence motif.	66
404982	pfam14213	DUF4325	STAS-like domain of unknown function (DUF4325). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea, eukaryotes and viruses. Proteins in this family are typically between 99 and 341 amino acids in length. This domain is distantly related to the STAS domain.	63
404983	pfam14214	Helitron_like_N	Helitron helicase-like domain at N-terminus. This family is found in Helitrons, recently recognized eukaryotic transposons that are predicted to amplify by a rolling-circle mechanism. In many instances a protein-coding gene is disrupted by their insertion.	197
404984	pfam14215	bHLH-MYC_N	bHLH-MYC and R2R3-MYB transcription factors N-terminal. This is the N-terminal region of a family of MYB and MYC transcription factors. The DNA-binding HLH domain is further downstream, pfam00010. Members of the MYB and MYC family regulate the biosynthesis of phenylpropanoids in several plant species (DOI:10.1007/s11295-009-0232-y).	121
404985	pfam14216	DUF4326	Domain of unknown function (DUF4326). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea, eukaryotes and viruses. Proteins in this family are typically between 100 and 162 amino acids in length. There are two completely conserved residues (P and C) that may be functionally important.	82
404986	pfam14217	DUF4327	Domain of unknown function (DUF4327). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 80 amino acids in length.	67
404987	pfam14218	COP23	Circadian oscillating protein COP23. This family includes the circadian oscillating protein COP23 from Cyanothece sp. (strain PCC 8801). The levels of this peripheral membrane protein display a circadian oscillation.	138
404988	pfam14219	DUF4328	Domain of unknown function (DUF4328). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 218 and 342 amino acids in length.	166
404989	pfam14220	DUF4329	Domain of unknown function (DUF4329). This domain is functionally uncharacterized. It is found in bacteria and eukaryotes, and is approximately 130 amino acids in length. It is often found in association with pfam05593 and pfam03527. There is a single completely conserved residue D and a highly conserved HTH motif which may be functionally important.	111
404990	pfam14221	DUF4330	Domain of unknown function (DUF4330). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 165 and 177 amino acids in length. There is a single completely conserved residue G that may be functionally important.	167
404991	pfam14222	MOR2-PAG1_N	Cell morphogenesis N-terminal. This family is the conserved N-terminal region of proteins that are involved in cell morphogenesis.	547
404992	pfam14223	Retrotran_gag_2	gag-polypeptide of LTR copia-type. This family is found in plants and fungi, and contains LTR-polyproteins, or retrotransposons of the copia-type.	138
404993	pfam14224	DUF4331	Domain of unknown function (DUF4331). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 223 and 526 amino acids in length. There is a conserved FPY sequence motif.	414
404994	pfam14225	MOR2-PAG1_C	Cell morphogenesis C-terminal. This family is the conserved C-terminal region of proteins that are involved in cell morphogenesis.	252
404995	pfam14226	DIOX_N	non-haem dioxygenase in morphine synthesis N-terminal. This is the highly conserved N-terminal region of proteins with 2-oxoglutarate/Fe(II)-dependent dioxygenase activity.	118
372971	pfam14228	MOR2-PAG1_mid	Cell morphogenesis central region. This family is the conserved central region of proteins that are involved in cell morphogenesis.	1114
404996	pfam14229	DUF4332	Domain of unknown function (DUF4332). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 134 and 356 amino acids in length. This domain contains helix-hairpin-helix motifs.	122
404997	pfam14230	DUF4333	Domain of unknown function (DUF4333). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 140 and 255 amino acids in length. There are two completely conserved C residues that may be functionally important.	76
404998	pfam14231	GXWXG	GXWXG protein. This domain is found in bacteria and eukaryotes, and is approximately 60 amino acids in length. There is a conserved GXWXG motif. This domain is frequently found at the N-terminus of pfam14232.	59
404999	pfam14232	DUF4334	Domain of unknown function (DUF4334). This domain family is found in bacteria and eukaryotes, and is approximately 60 amino acids in length. This domain is frequently found at the C-terminus of pfam14231.	55
405000	pfam14233	DUF4335	Domain of unknown function (DUF4335). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 204 and 480 amino acids in length. There are two completely conserved residues (G and D) that may be functionally important.	184
405001	pfam14234	DUF4336	Domain of unknown function (DUF4336). 	321
405002	pfam14235	DUF4337	Domain of unknown function (DUF4337). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 187 and 201 amino acids in length. There is a single completely conserved residue Q that may be functionally important.	153
372975	pfam14236	DUF4338	Domain of unknown function (DUF4338). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 206 and 475 amino acids in length.	231
405003	pfam14237	DUF4339	Domain of unknown function (DUF4339). This domain is found in bacteria, archaea and eukaryotes, and is approximately 50 amino acids in length. There are two completely conserved residues (G and W) that may be functionally important.	50
405004	pfam14238	DUF4340	Domain of unknown function (DUF4340). This domain is found in bacteria, and is typically between 183 and 196 amino acids in length.	183
379519	pfam14239	RRXRR	RRXRR protein. This domain is found in bacteria, eukaryotes and viruses, and is approximately 180 amino acids in length. It contains a conserved RRXRR motif. It is often found in association with pfam01844.	173
405005	pfam14240	YHYH	YHYH protein. This domain family is found in bacteria, eukaryotes and viruses, and is typically between 141 and 198 amino acids in length. There is a conserved YHYH sequence motif.	187
405006	pfam14242	DUF4342	Domain of unknown function (DUF4342). This family of proteins is found in bacteria. Proteins in this family are typically between 97 and 206 amino acids in length. There is a single completely conserved residue P that may be functionally important.	79
405007	pfam14243	DUF4343	Domain of unknown function (DUF4343). This domain family is found in bacteria, eukaryotes and viruses, and is typically between 127 and 142 amino acids in length.	172
405008	pfam14244	Retrotran_gag_3	gag-polypeptide of LTR copia-type. This family is found in plants and fungi, and contains pol polyprotein-like retroelements or retrotransposons of the copia-type. It is a short domain at the very start of these polypeptides.	48
290944	pfam14245	Pilin_PilA	Type IV pilin PilA. This family consists of proteins which form type IV pili. In M. xanthus these pili are required for social motility.	136
405009	pfam14246	TetR_C_7	AefR-like transcriptional repressor, C-terminal region. This family comprises the C-terminal domain of transcriptional regulators of the TetR family. It includes the AefR transcriptional regulator from P. syringae. It is found in association with pfam00440.	119
405010	pfam14247	DUF4344	Putative metallopeptidase. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 247 and 291 amino acids in length. There is a conserved EED sequence motif. This is a putative metallopeptidase.	214
405011	pfam14248	DUF4345	Domain of unknown function (DUF4345). This family of proteins is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 125 and 141 amino acids in length. There is a single completely conserved residue E that may be functionally important.	119
405012	pfam14249	Tocopherol_cycl	Tocopherol cyclase. This family contains tocopherol cyclases. These enzymes are involved in the synthesis of tocopherols and tocotrienols (vitamin E).	332
405013	pfam14250	AbrB-like	AbrB-like transcriptional regulator. This family of DNA-binding proteins is likely to act as a transcriptional regulator. This family does not include E.coli AbrB, which belongs to pfam05145.	67
405014	pfam14251	DUF4346	Domain of unknown function (DUF4346). This family of proteins is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 127 and 502 amino acids in length. There are two conserved sequence motifs: LDP and DHA. Many members of this family have been annotated as dihydropteroate synthases, however no experimental evidence can be found for this and MJ0107 has been shown not to possess dihydropteroate synthase activity.	118
405015	pfam14252	DUF4347	Domain of unknown function (DUF4347). This domain family is found in bacteria and eukaryotes, and is approximately 160 amino acids in length. There are two completely conserved residues (C and G) that may be functionally important.	164
405016	pfam14253	AbiH	Bacteriophage abortive infection AbiH. This family of proteins confers resistance to bacteriophage.	249
405017	pfam14254	DUF4348	Domain of unknown function (DUF4348). Two structures have been solved from this DUF, Structure 4mjf and Structure 3sbu. TOPSAN records that both proteins are the only structural representatives of Pfam PF14254, DUF4348. There are no other significant hits in FFAS. DUF4348 has ~200 proteins, all from Bacteroidetes, and all with a single domain architecture with just one DUF4348 domain. There appears to be a possible gene duplication in the protein as the N-terminal domain (approx residues 25-174) and C-terminal domain (approx residues 175-286) superimpose quite well with ~1.9A r.m.s.d. and ~30% sequence identity.	230
405018	pfam14255	Cys_rich_CPXG	Cysteine-rich CPXCG. This family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length. There are 5 conserved cysteines which occur in a CPXCG motif and a DCXXCCXP motif.	49
405019	pfam14256	YwiC	YwiC-like protein. The YwiC-like protein family includes the B. subtilis YwiC protein, which is functionally uncharacterized. This domain family is found in bacteria, and is approximately 130 amino acids in length. There is a single completely conserved residue G that may be functionally important.	124
405020	pfam14257	DUF4349	Domain of unknown function (DUF4349). This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 282 and 353 amino acids in length. There is a single completely conserved residue D that may be functionally important.	213
405021	pfam14258	DUF4350	Domain of unknown function (DUF4350). This domain family is found in bacteria, archaea and eukaryotes, and is approximately 70 amino acids in length.	69
405022	pfam14260	zf-C4pol	C4-type zinc-finger of DNA polymerase delta. In fission yeast this zinc-finger domain appears to be the region of Pol3 that binds directly to the B-subunit, Cdc1. Pol delta is a hetero-tetrameric enzyme comprising four evolutionarily well-conserved proteins: the catalytic subunit Pol3 and three smaller subunits Cdc1, Cdc27 and Cdm1.	70
405023	pfam14261	DUF4351	Domain of unknown function (DUF4351). This domain is found in bacteria, and is approximately 60 amino acids in length.	59
405024	pfam14262	Cthe_2159	Carbohydrate-binding domain-containing protein Cthe_2159. Cthe_2159 from Clostridium thermocellum is the first representative of a novel family of cellulose and/or acid-sugar binding beta-helix proteins that share structural similarities with polysaccharide lyases.	264
372985	pfam14263	DUF4354	Domain of unknown function (DUF4354). Several members of this family are annotated as being ATP/GTP-binding site motif A (P-loop) proteins, but this could not be confirmed. The one structure solved for this family, Structure 3NRF, exhibits an immunoglobulin-like beta-sandwich fold. Crystal packing suggests that a tetramer is a significant oligomerization state, and a disulfide bridge is formed between Cys 125 at the C-terminal end of the monomer, and Cys 69.	125
405025	pfam14264	Glucos_trans_II	Glucosyl transferase GtrII. This family includes glucosyl transferase II from the Shigella phage SfII, which mediates seroconversion of S. flexneri when the phage is integrated into the host chromosome.	307
405026	pfam14265	DUF4355	Domain of unknown function (DUF4355). This family of proteins is found in bacteria and viruses. Proteins in this family are typically between 180 and 214 amino acids in length.	119
405027	pfam14266	YceG_bac	Putative component of 'biosynthetic module'. YceG is a family of proteins found in bacteria. Proteins in this family are approximately 540 amino acids in length. YceG is an additional gene-product encoded in the Ter operon and might thus be part of a 'biosynthetic module' encoding certain enzymes.	480
405028	pfam14267	DUF4357	Domain of unknown function (DUF4357). This domain family is found in bacteria and archaea, and is approximately 60 amino acids in length. There are two completely conserved residues (G and W) that may be functionally important.	54
405029	pfam14268	YoaP	YoaP-like. The YoaP-like domain is found at the C-terminus of the B. subtilis YoaP protein. It is found in bacteria and archaea, and is approximately 40 amino acids in length. The family is found in association with pfam00583. There is a single completely conserved residue A that may be functionally important.	41
372989	pfam14269	Arylsulfotran_2	Arylsulfotransferase (ASST). 	301
405030	pfam14270	DUF4358	Domain of unknown function (DUF4358). This domain family is found in bacteria, and is approximately 110 amino acids in length.	97
405031	pfam14271	DUF4359	Domain of unknown function (DUF4359). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 130 amino acids in length. There are two completely conserved residues (P and S) that may be functionally important.	103
405032	pfam14272	Gly_rich_SFCGS	Glycine-rich SFCGS. This family of proteins is found in bacteria. Proteins in this family are approximately 120 amino acids in length. There are a number of highly conserved motifs including an SFCGSGGAGA motif.	113
405033	pfam14273	DUF4360	Domain of unknown function (DUF4360). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 200 and 228 amino acids in length. There is a conserved GCP sequence motif near the N-terminus.	178
405034	pfam14274	DUF4361	Domain of unknown function (DUF4361). 	141
372994	pfam14275	DUF4362	Domain of unknown function (DUF4362). This family of proteins is found in bacteria. Proteins in this family are typically between 93 and 146 amino acids in length. There is a conserved IRIV sequence motif.	93
405035	pfam14276	DUF4363	Domain of unknown function (DUF4363). This family of proteins is found in bacteria. Proteins in this family are approximately 120 amino acids in length.	107
405036	pfam14277	DUF4364	Domain of unknown function (DUF4364). This family of proteins is found in bacteria and archaea. Proteins in this family are approximately 180 amino acids in length.	162
405037	pfam14278	TetR_C_8	Transcriptional regulator C-terminal region. This domain is a tetracycline repressor, domain 2, or C-terminus.	103
405038	pfam14279	HNH_5	HNH endonuclease. This domain is related to other HNH domain families such as pfam01844, suggesting that these proteins have a nucleic acid cleaving function.	56
405039	pfam14280	DUF4365	Domain of unknown function (DUF4365). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, eukaryotes and viruses. Proteins in this family are typically between 182 and 530 amino acids in length. There is a single completely conserved residue D that may be functionally important.	141
405040	pfam14281	PDDEXK_4	PD-(D/E)XK nuclease superfamily. Members of this family belong to the PD-(D/E)XK nuclease superfamily.	178
405041	pfam14282	FlxA	FlxA-like protein. This family includes FlxA from E. coli. The expression of FlxA is regulated by the FliA sigma factor, a transcription factor specific for class 3 flagellar operons. However FlxA is not required for flagellar function or formation.	101
405042	pfam14283	DUF4366	Domain of unknown function (DUF4366). This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 227 and 387 amino acids in length.	144
405043	pfam14284	PcfJ	PcfJ-like protein. The PcfJ-like protein family includes the E. faecalis PcfJ protein, which is functionally uncharacterized. It is found in bacteria and viruses, and is typically between 159 and 170 amino acids in length. There is a conserved HCV sequence motif.	146
379540	pfam14285	DUF4367	Domain of unknown function (DUF4367). This family of proteins is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 229 and 435 amino acids in length.	110
405044	pfam14286	DHHW	DHHW protein. This family of proteins is found in bacteria. Proteins in this family are typically between 366 and 404 amino acids in length. There is a conserved DHHW motif. There is some distant homology to the Lipase_GDSL_2 family.	385
405045	pfam14287	DUF4368	Domain of unknown function (DUF4368). This domain family is found in bacteria, and is approximately 70 amino acids in length. The family is found in association with pfam00239 and pfam07508. There is a single completely conserved residue G that may be functionally important.	64
405046	pfam14288	FKS1_dom1	1,3-beta-glucan synthase subunit FKS1, domain-1. The FKS1_dom1 domain is likely to be the 'Class I' region just N-terminal to the first set of transmembrane helices that is involved in 1,3-beta-glucan synthesis itself. This family is found on proteins with family Glucan_synthase, pfam02364.	111
405047	pfam14289	DUF4369	Domain of unknown function (DUF4369). This domain family is found in bacteria, and is approximately 110 amino acids in length. The family is found in association with pfam00578.	92
372997	pfam14290	DUF4370	Domain of unknown function (DUF4370). 	237
405048	pfam14291	DUF4371	Domain of unknown function (DUF4371). 	236
405049	pfam14292	SusE	SusE outer membrane protein. This family includes the SusE outer membrane protein from Bacteroides thetaiotaomicron. This protein has a role in starch utilisation, but is not essential for growth on starch.	106
405050	pfam14293	YWFCY	YWFCY protein. This family is found in bacteria, and is approximately 60 amino acids in length. There is a conserved YWFCY motif. It is often found in association with pfam02534.	60
405051	pfam14294	DUF4372	Domain of unknown function (DUF4372). This domain family is found in bacteria, and is approximately 80 amino acids in length. The family is found in association with pfam01609. There is a single completely conserved residue G that may be functionally important.	74
405052	pfam14295	PAN_4	PAN domain. 	51
405053	pfam14296	O-ag_pol_Wzy	O-antigen polysaccharide polymerase Wzy. This family includes O-antigen polysaccharide polymerases. These enzymes link O-units via a glycosidic linkage to form a long O-antigen. These enzymes vary in specificity and sequence.	468
405054	pfam14297	DUF4373	Domain of unknown function (DUF4373). This domain is found in bacteria, eukaryotes and viruses, and is approximately 90 amino acids in length.	89
405055	pfam14298	DUF4374	Domain of unknown function (DUF4374). This family of proteins is found in bacteria. Proteins in this family are typically between 406 and 466 amino acids in length.	427
405056	pfam14299	PP2	Phloem protein 2. Phloem protein 2 (PP2) is one of the most abundant and enigmatic proteins in the phloem sap. PP2 is translocated in the assimilate stream where its lectin activity or RNA-binding properties can exert effects over long distances.	152
405057	pfam14300	DUF4375	Domain of unknown function (DUF4375). This family of proteins is found in bacteria. Proteins in this family are typically between 156 and 204 amino acids in length. There is a single completely conserved residue G that may be functionally important.	116
405058	pfam14301	DUF4376	Domain of unknown function (DUF4376). This domain family is found in bacteria and viruses, and is approximately 110 amino acids in length.	102
405059	pfam14302	DUF4377	Domain of unknown function (DUF4377). This domain family is found in bacteria and archaea, and is approximately 80 amino acids in length.	76
405060	pfam14303	NAM-associated	No apical meristem-associated C-terminal domain. This domain is found in a number of different types of plant proteins including NAM-like proteins.	164
405061	pfam14304	CSTF_C	Transcription termination and cleavage factor C-terminal. The C-terminal section of CSTF proteins is a discrete structure that is crucial for mRNA 3'-end processing. This domain interacts with Pcf11 and possibly PC4, thus linking CstF2 to transcription, transcriptional termination, and cell growth.	41
291003	pfam14305	ATPgrasp_TupA	TupA-like ATPgrasp. A member of the ATP-grasp fold predicted to be involved in the biosynthesis of cell surface polysaccharides such as the O-antigen in proteobacteria, the capsule in firmicutes and the polyglutamate chain of teichuronopeptide.	241
405062	pfam14306	PUA_2	PUA-like domain. This PUA like domain is found at the N-terminus of ATP-sulfurylase enzymes.	159
405063	pfam14307	Glyco_tran_WbsX	Glycosyltransferase WbsX. Members of this family are found within O-antigen biosynthesis clusters in Gram negative bacteria, where they are predicted to function as glycosyltransferases.	312
405064	pfam14308	DnaJ-X	X-domain of DnaJ-containing. In certain plant and yeast proteins, the DnaJ-1 proteins have a three-domain structure. The x-domain lies between the N-terminal DnaJ and the C-terminal Z domains. The exact function is not known.	205
405065	pfam14309	DUF4378	Domain of unknown function (DUF4378). 	157
405066	pfam14310	Fn3-like	Fibronectin type III-like domain. This domain has a fibronectin type III-like structure. It is often found in association with pfam00933 and pfam01915. Its function is unknown.	71
405067	pfam14311	DUF4379	Probable Zinc-ribbon domain. This domain is found in bacteria, eukaryotes and viruses, and is approximately 60 amino acids in length. It contains a CXXCXH motif and a CPXC motif.	54
316802	pfam14312	FG-GAP_2	FG-GAP repeat. 	49
316803	pfam14313	Soyouz_module	N-terminal region of Paramyxovirinae phosphoprotein (P). The soyouz module moiety is the N-terminal region of the phosphoprotein (P) from the subfamily Paramyxovirinae of the family Paramyxoviridae viruses. The main genera in this subfamily include the rubulaviruses, avulaviruses, respiroviruses, henipaviruses, and morbilliviruses, all of which are enveloped viruses with a non-segmented, negative, single-stranded RNA genome encapsidated by the nucleoprotein (N) within a helical nucleocapsid.	58
316804	pfam14314	Methyltrans_Mon	Virus-capping methyltransferase. This is the methyltransferase region of the Mononegavirales single-stranded RNA viral RNA polymerase enzymes. This region is involved in the mRNA-capping of the virion particles.	685
373010	pfam14315	DUF4380	Domain of unknown function (DUF4380). This family of proteins is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 288 and 372 amino acids in length. There are two completely conserved residues (G and E) that may be functionally important.	295
405068	pfam14316	DUF4381	Domain of unknown function (DUF4381). This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 158 and 180 amino acids in length.	145
405069	pfam14317	YcxB	YcxB-like protein. The YcxB-like protein family includes the B. subtilis YcxB protein, which is a functionally uncharacterized transmembrane protein. This family of proteins is found in bacteria, and is approximately 60 amino acids in length.	61
405070	pfam14318	Mononeg_mRNAcap	Mononegavirales mRNA-capping region V. This V domain of L RNA-polymerase carries a new motif, GxxTx(n)HR, that is essential for mRNA cap formation. Nonsegmented negative-sense (NNS) RNA viruses, Mononegavirales, cap their mRNA by an unconventional mechanism. Specifically, 5'-monophosphate mRNA is transferred to GDP derived from GTP through a reaction that involves a covalent intermediate between the large polymerase protein L and mRNA. The V region is essential for this process.	241
405071	pfam14319	Zn_Tnp_IS91	Transposase zinc-binding domain. This domain is likely to be a zinc-binding domain. It is found at the N-terminus of transposases belonging to the IS91 family.	92
291018	pfam14320	Paramyxo_PCT	Phosphoprotein P region PCT disordered. The N-terminal half of the phosphoprotein P of the Paramyxovirinae viruses. The very first 60 residues have been built as the family Soyouz-module, pfam14313. The remaining part of the region, here, is disordered, and is liable to induced folding under the right physiological conditions. The region undergoes an unstructured-to-structured transition upon binding to Measles virus tail, C, unstructured region.	311
405072	pfam14321	DUF4382	Domain of unknown function (DUF4382). This family is found in bacteria and archaea, and is typically between 142 and 161 amino acids in length.	149
405073	pfam14322	SusD-like_3	Starch-binding associating with outer membrane. SusD is a secreted polysaccharide-binding protein with an N-terminal lipid moiety that allows it to associate with the outer membrane. SusD probably mediates xyloglucan-binding prior to xyloglucan transport in the periplasm for degradation. This domain is found N-terminal to pfam07980.	185
405074	pfam14323	GxGYxYP_C	GxGYxYP putative glycoside hydrolase C-terminal domain. This family carries a characteristic sequence motif, GxGYxYP, and is a putative glycoside hydrolase. This domain is found in association with pfam16216. Associated families are sugar-processing domains.	226
405075	pfam14324	PINIT	PINIT domain. The PINIT domain is a protein domain that is found in PIAS proteins. The PINIT domain is about 180 amino acids in length.	133
379556	pfam14325	DUF4383	Domain of unknown function (DUF4383). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 137 and 164 amino acids in length.	125
405076	pfam14326	DUF4384	Domain of unknown function (DUF4384). This presumed domain is functionally uncharacterized. This domain family is found in bacteria and archaea, and is approximately 80 amino acids in length.	81
405077	pfam14327	CSTF2_hinge	Hinge domain of cleavage stimulation factor subunit 2. The hinge domain of cleavage stimulation factor subunit 2 proteins, CSTF2, is necessary for binding to the subunit CstF-77 within the polyadenylation complex and subsequent nuclear localization. This suggests that nuclear import of a pre-formed CSTF complex is an essential step in polyadenylation. Accurate and efficient polyadenylation is essential for transcriptional termination, nuclear export, translation, and stability of eukaryotic mRNAs. CSTF2 is an important regulatory subunit of the polyadenylation complex.	81
405078	pfam14328	DUF4385	Domain of unknown function (DUF4385). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea, eukaryotes and viruses. Proteins in this family are typically between 149 and 163 amino acids in length.	143
405079	pfam14329	DUF4386	Domain of unknown function (DUF4386). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 214 and 245 amino acids in length.	210
405080	pfam14330	DUF4387	Domain of unknown function (DUF4387). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea and eukaryotes. Proteins in this family are approximately 110 amino acids in length. There is a conserved RSKN sequence motif.	98
405081	pfam14331	ImcF-related_N	ImcF-related N-terminal domain. This domain is found in bacterial ImcF (intracellular multiplication and human macrophage-killing) proteins. It is found to the N-terminus of the ImcF-related domain, pfam06761.	258
405082	pfam14332	DUF4388	Domain of unknown function (DUF4388). This domain family is found in bacteria, and is typically between 102 and 135 amino acids in length.	106
405083	pfam14333	DUF4389	Domain of unknown function (DUF4389). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 104 and 223 amino acids in length. There is a single completely conserved residue R that may be functionally important.	76
405084	pfam14334	DUF4390	Domain of unknown function (DUF4390). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 192 and 203 amino acids in length.	158
405085	pfam14335	DUF4391	Domain of unknown function (DUF4391). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 220 and 257 amino acids in length.	219
405086	pfam14336	DUF4392	Domain of unknown function (DUF4392). This family of proteins is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 282 and 585 amino acids in length. There are two completely conserved G residues that may be functionally important.	289
405087	pfam14337	DUF4393	Domain of unknown function (DUF4393). This family of proteins is found in bacteria, archaea and viruses. Proteins in this family are typically between 254 and 285 amino acids in length.	192
405088	pfam14338	Mrr_N	Mrr N-terminal domain. This domain is found at the N-terminus of the Mrr restriction endonuclease catalytic domain, pfam04471. Fold recognition analysis predicts that it is a diverged member of the winged helix variant of helix turn helix proteins. It may play a role in DNA sequence recognition.	82
405089	pfam14339	DUF4394	Domain of unknown function (DUF4394). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 262 and 476 amino acids in length.	228
405090	pfam14340	DUF4395	Domain of unknown function (DUF4395). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 142 and 168 amino acids in length. There are two completely conserved C residues that may be functionally important.	130
405091	pfam14341	PilX_N	PilX N-terminal. This domain is found at the N-terminus of the PilX prepilin-like proteins which are involved in type 4 fimbrial biogenesis.	51
405092	pfam14342	DUF4396	Domain of unknown function (DUF4396). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 167 and 310 amino acids in length.	141
405093	pfam14343	PrcB_C	PrcB C-terminal. This domain is found at the C-terminus of Treponema denticola PrcB. PrcB interacts with the PrtP protease (dentilisin) and is required for the stability of the protease complex.	56
405094	pfam14344	DUF4397	Domain of unknown function (DUF4397). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, archaea and eukaryotes, and is approximately 120 amino acids in length.	115
405095	pfam14345	GDYXXLXY	GDYXXLXY protein. This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 171 and 199 amino acids in length. It contains a conserved GDYXXLXY motif.	155
405096	pfam14346	DUF4398	Domain of unknown function (DUF4398). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 127 and 269 amino acids in length.	78
405097	pfam14347	DUF4399	Domain of unknown function (DUF4399). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 135 and 1079 amino acids in length.	91
405098	pfam14348	DUF4400	Domain of unknown function (DUF4400). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 209 and 249 amino acids in length. There is a single completely conserved residue P that may be functionally important.	204
405099	pfam14349	SprA_N	Motility related/secretion protein. This domain is found repeated three times in the N-terminal half of the gliding motility-related SprA proteins. The role of this domain in motility is uncertain. It is also found in proteins required for secretion.	509
405100	pfam14350	Beta_protein	Beta protein. This family includes the beta protein from Bacteriophage T4. Beta protein prevents the gop protein from killing the bacterial host cell.	340
405101	pfam14351	DUF4401	Domain of unknown function (DUF4401). This family of proteins is found in bacteria. Proteins in this family are typically between 357 and 735 amino acids in length. The family is found in association with pfam09925. There is a single completely conserved residue K that may be functionally important.	308
405102	pfam14352	DUF4402	Domain of unknown function (DUF4402). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 155 and 182 amino acids in length.	129
405103	pfam14353	CpXC	CpXC protein. This presumed domain is functionally uncharacterized. This domain is found in bacteria and archaea, and is typically between 122 and 134 amino acids in length. It contains four conserved cysteines forming two CpXC motifs.	121
405104	pfam14354	Lar_restr_allev	Restriction alleviation protein Lar. This family includes the restriction alleviation protein Lar encoded by the Rac prophage of Escherichia coli. This protein modulates the activity of the Escherichia coli restriction and modification system.	60
405105	pfam14355	Abi_C	Abortive infection C-terminus. This domain is found at the C-terminus of the Lactococcus lactis abortive infection protein Abi-859. This protein confers bacteriophage resistance.	83
405106	pfam14356	DUF4403	Domain of unknown function (DUF4403). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 455 and 518 amino acids in length. There is a single completely conserved residue W that may be functionally important.	425
405107	pfam14357	DUF4404	Domain of unknown function (DUF4404). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 90 amino acids in length. There are two completely conserved residues (P and G) that may be functionally important.	83
405108	pfam14358	DUF4405	Domain of unknown function (DUF4405). This presumed domain is functionally uncharacterized. This domain family is found in bacteria and archaea, and is approximately 50 amino acids in length. There are two conserved histidines that may be functionally important. This family is N-terminally truncated compared to other members of the clan.	65
405109	pfam14359	DUF4406	Domain of unknown function (DUF4406). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and viruses. Proteins in this family are typically between 98 and 145 amino acids in length.	90
405110	pfam14360	PAP2_C	PAP2 superfamily C-terminal. This family is closely related to the C-terminal region of PAP2.	74
405111	pfam14361	RsbRD_N	RsbT co-antagonist protein rsbRD N-terminal domain. This domain is found at the N-terminus of a number of anti-sigma-factor antagonist proteins including B. subtilis RsbRD. These proteins are negative regulators of the general stress transcription factor sigma(B). It is found in association with pfam01740.	104
405112	pfam14362	DUF4407	Domain of unknown function (DUF4407). This family of proteins is found in bacteria. Proteins in this family are typically between 366 and 597 amino acids in length. There is a single completely conserved residue R that may be functionally important.	296
405113	pfam14363	AAA_assoc	Domain associated at C-terminal with AAA. This domain is found in association with the AAA family, pfam00004.	96
405114	pfam14364	DUF4408	Domain of unknown function (DUF4408). This domain is found at the N-terminus of members of the DUF761 family, pfam05553. Many members are plant proteins.	33
405115	pfam14365	Neprosin_AP	Neprosin activation peptide. Pitcher plants are insectivorous and secrete a digestive fluid into the pitcher. This fluid contains a mixture of enzymes including peptidases. One of these is neprosin, characterized from the pitcher plant Nepenthes ventrata. This peptidase is of unknown catalytic type and is unaffected by standard peptidase inhibitors. Unusually, activity is directed towards prolyl bonds, but unlike most peptidase that cleave after proline, there is no restriction on sequence length or position of the proline residue. The peptidase is secreted and is presumed to possess an N-terminal activation peptide. This domain corresponds to the presumed activation peptide.	119
379584	pfam14366	DUF4410	Domain of unknown function (DUF4410). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 187 and 238 amino acids in length.	119
405116	pfam14367	DUF4411	Domain of unknown function (DUF4411). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 153 and 170 amino acids in length. There is a single completely conserved residue D that may be functionally important.	159
405117	pfam14368	LTP_2	Probable lipid transfer. The members of this family are probably involved in lipid transfer. The family has several highly conserved cysteines, paired in various ways.	91
405118	pfam14369	zinc_ribbon_9	zinc-ribbon. 	35
405119	pfam14370	Topo_C_assoc	C-terminal topoisomerase domain. This domain is found at the C-terminal of topoisomerase and other similar enzymes.	68
405120	pfam14371	DUF4412	Domain of unknown function (DUF4412). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, archaea and eukaryotes, and is typically between 75 and 104 amino acids in length.	191
405121	pfam14372	DUF4413	Domain of unknown function (DUF4413). This domain is part of an RNase-H fold section of longer proteins some of which are transposable elements possibly of the Pong type, since some members are putative Tam3 transposases.	100
405122	pfam14373	Imm_superinfect	Superinfection immunity protein. This family includes the E. coli bacteriophage T4 superinfection immunity (imm) protein. When E. coli is sequentially infected with two T-even type bacteriophage the DNA of the superinfecting phage is excluded from the host, into the periplasmic space. The immunity protein plays a role in this process.	42
405123	pfam14374	Ribos_L4_asso_C	60S ribosomal protein L4 C-terminal domain. This family is found at the very C-terminus of 60S ribosomal L4 proteins.	74
405124	pfam14375	Cys_rich_CWC	Cysteine-rich CWC. This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 74 and 102 amino acids in length. It contains eight conserved cysteines, including a conserved CWC sequence motif.	50
405125	pfam14376	Haem_bd	Haem-binding domain. This domain contains a potential haem-binding motif, CXXCH. This family is found in association with pfam00034 and pfam03150.	133
405126	pfam14377	DUF4414	Domain of unknown function (DUF4414). This family is frequently found on DNA binding proteins of the URE-B1 type and on ligases.	32
405127	pfam14378	PAP2_3	PAP2 superfamily. 	190
405128	pfam14379	Myb_CC_LHEQLE	MYB-CC type transfactor, LHEQLE motif. This family is found towards the C-terminus of Myb-CC type transcription factors, and carries a highly conserved LHEQLE sequence motif.	47
405129	pfam14380	WAK_assoc	Wall-associated receptor kinase C-terminal. This WAK_assoc domain is cysteine-rich and lies C-terminal to the binding domain, GUB_WAK_bind, pfam13947.	96
405130	pfam14381	EDR1	Ethylene-responsive protein kinase Le-CTR1. EDR1 regulates disease resistance and ethylene-induced senescence, and is also involved in stress response signalling and cell death regulation.	199
405131	pfam14382	ECR1_N	Exosome complex exonuclease RRP4 N-terminal region. ECR1_N is an N-terminal region of the exosome complex exonuclease RRP proteins. It is a G-rich domain which structurally is a rudimentary single hybrid fold with a permuted topology.	36
405132	pfam14383	VARLMGL	DUF761-associated sequence motif. This family is found frequently at the N-terminus of family DUF3741, pfam12552.	32
405133	pfam14384	BrnA_antitoxin	BrnA antitoxin of type II toxin-antitoxin system. BrnA is family of antitoxins that neutralizes the toxin BrnT, pfam04365. It consists of 3 alpha-helices and a C-terminal ribbon-helix-helix DNA binding domain. As in other toxin-antitoxin systems, BrnA negatively autoregulates the brnTA operon and has higher affinity for the DNA operator when complexed with BrnT. It dimerizes with two molecules of its toxin BrnT.	67
405134	pfam14385	DUF4416	Domain of unknown function (DUF4416). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 176 and 187 amino acids in length. There is a conserved DPG sequence motif.	162
405135	pfam14386	DUF4417	Domain of unknown function (DUF4417). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and viruses. Proteins in this family are typically between 220 and 340 amino acids in length. There is a single completely conserved residue G that may be functionally important.	182
405136	pfam14387	DUF4418	Domain of unknown function (DUF4418). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 132 and 150 amino acids in length.	118
405137	pfam14388	DUF4419	Domain of unknown function (DUF4419). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, eukaryotes and viruses. Proteins in this family are typically between 348 and 454 amino acids in length.	302
405138	pfam14389	Lzipper-MIP1	Leucine-zipper of ternary complex factor MIP1. This leucine-zipper is towards the N-terminus of MIP1 proteins. These proteins, here largely from plants, are subunits of the TORC2 (rictor-mTOR) protein complex controlling cell growth and proliferation. The leucine-zipper is likely to be the region that interacts with plant MADS-box factors.	82
405139	pfam14390	DUF4420	Putative PD-(D/E)XK family member, (DUF4420). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 310 and 334 amino acids in length. Advanced homology-detection methods supported with superfamily-wide domain architecture and horizontal gene transfer analyses have established this family to be a member of the PD-(D/E)XK superfamily.	310
405140	pfam14391	DUF4421	Domain of unknown function (DUF4421). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 336 and 370 amino acids in length.	298
405141	pfam14392	zf-CCHC_4	Zinc knuckle. The zinc knuckle is a zinc binding motif composed of the following CX2CX4HX4C where X can be any amino acid. This particular family is found in plant proteins.	49
405142	pfam14393	DUF4422	Domain of unknown function (DUF4422). This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 255 and 371 amino acids in length.	219
405143	pfam14394	DUF4423	Domain of unknown function (DUF4423). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, and is approximately 170 amino acids in length.	168
373045	pfam14395	COOH-NH2_lig	Phage phiEco32-like COOH.NH2 ligase-type 2. A family of COOH-NH2 ligases/GCS superfamily found in the neighborhood of YheC/D-like ATP-grasp and the CotE family of proteins in the firmicutes. Contextual analysis suggests that it might be involved in cell wall modification and spore coat biosynthesis.	253
405144	pfam14396	CFTR_R	Cystic fibrosis TM conductance regulator (CFTR), regulator domain. 	213
405145	pfam14397	ATPgrasp_ST	Sugar-transfer associated ATP-grasp. A member of the ATP-grasp fold predicted to be involved in the biosynthesis of cell surface polysaccharides.	278
405146	pfam14398	ATPgrasp_YheCD	YheC/D like ATP-grasp. A member of the ATP-grasp fold predicted to be involved in the modification/biosynthesis of spore-wall and capsular proteins.	256
405147	pfam14399	BtrH_N	Butirosin biosynthesis protein H, N-terminal. BtrH_N is the N-terminus of the acyl carrier protein:aminoglycoside acyltransferase BtrH. Alternatively it can be referred to as butirosin biosynthesis protein H. BtrH transfers the unique (S)-4-amino-2-hydroxybutyrate (AHBA) side chain, which protects the antibiotic butirosin from several common resistance mechanisms. Butirosin, an aminoglycoside antibiotic produced by Bacillus circulans, exhibits improved antibiotic properties over its parent molecule and retains bactericidal activity toward many aminoglycoside-resistant strains. Butirosin is unique in carrying the AHBA side-chain. BtrH transfers the AHBA from the acyl carrier protein BtrI to the parent aminoglycoside ribostamycin as a gamma-glutamylated dipeptide.	134
405148	pfam14400	Transglut_i_TM	Inactive transglutaminase fused to 7 transmembrane helices. A family of inactive transglutaminases fused to seven transmembrane helices. The transglutaminase domain is predicted to be extracellularly located. Members of this family are associated in gene neighborhoods with a pepsin-like peptidase and an ATP-grasp of the RimK-family. The ATP-grasp is predicted to modify the 7TM protein or a cofactor that interacts with it.	161
405149	pfam14401	RLAN	RimK-like ATPgrasp N-terminal domain. An uncharacterized alpha+beta fold domain that is mostly fused to a RimK-like ATP-grasp and is found in bacteria and euryarchaea. Members of this family are almost always associated in gene neighborhoods with a GNAT-like acetyltransferase fused to a papain-like peptidase. Additionally M20-like peptidases, GCS2, 4Fe-4S Ferredoxins, a distinct metal-sulfur cluster protein and ribosomal proteins are found in the gene neighborhoods. Contextual analysis suggests a role for these in peptide biosynthesis.	150
405150	pfam14402	7TM_transglut	7 transmembrane helices usually fused to an inactive transglutaminase. A family of seven transmembrane helices fused to an inactive transglutaminase domain. The transglutaminase domain is predicted to be extracellularly located. Members of this family are associated in gene neighborhoods with a pepsin-like peptidase and an ATP-grasp of the RimK-family. The ATP-grasp is predicted to modify the 7TM protein or a cofactor that interacts with it.	248
405151	pfam14403	CP_ATPgrasp_2	Circularly permuted ATP-grasp type 2. Circularly permuted ATP-grasp prototyped by Roseiflexus RoseRS_2616 that is associated in gene neighborhoods with a GCS2-like COOH-NH2 ligase, alpha/beta hydrolase fold peptidase, GAT-II -like amidohydrolase, and M20 peptidase. Members of this family are predicted to be involved in the biosynthesis of small peptides.	378
405152	pfam14404	Strep_pep	Ribosomally synthesized peptide in Streptomyces species. A ribosomally synthesized peptide related to microviridin and marinostatin, usually in the gene neighborhood of one or more RimK-like ATP-grasp. The gene-context suggests that it is further modified by the ATP-grasp. The peptide is predicted to function in a defensive or developmental role, or as an antibiotic.	63
405153	pfam14406	Bacteroid_pep	Ribosomally synthesized peptide in Bacteroidetes. Ribosomally synthesized peptide that is usually in the gene neighborhood of a RimK-like ATP-grasp, and an ABC ATPase fused to a papain-like domain. It is often present in multiple tandem gene copies. The gene contexts suggest that it is modified by the ATP-grasp as in the biosynthesis of microviridin and marinostatin. They might function in defense or development or as peptide antibiotics.	52
373050	pfam14407	Frankia_peptide	Ribosomally synthesized peptide prototyped by Frankia Franean1_4349. Ribosomally synthesized peptide linked to cyclases in chloroflexi. It may have a link to cyclic nucleotide signaling.	62
405154	pfam14408	Actino_peptide	Ribosomally synthesized peptide in actinomycetes. Ribosomally synthesized peptide that is usually in the gene neighborhood of a RimK-like ATP-grasp and an aspartyl-O-methylase. Gene contexts suggest that it is further modified by the ATP-grasp and the methylase. It might function in defense or development, or as a peptide antibiotic.	58
291106	pfam14409	Herpeto_peptide	Ribosomally synthesized peptide in Herpetosiphon. Ribosomally synthesized peptide that is usually in the gene neighborhood of a RimK-like ATP-grasp, and an ABC ATPase fused to a papain-like domain. It is often present in multiple tandem gene copies. Gene contexts suggest that it is modified by the ATP-grasp. It might function in defense or development, or as a peptide antibiotic.	56
405155	pfam14410	GH-E	HNH/ENDO VII superfamily nuclease with conserved GHE residues. A predicted nuclease of the HNH/EndoVII superfamily of the treble clef fold which is closely related to the NucA-like family. The name is derived from the conserved G, H and E residues. It is found in several bacterial polymorphic toxin systems. Some GH-E members preserve the conserved cysteines of the treble-clef suggesting that they might represent potential evolutionary intermediates from a classical HNH domain to the derived NucA-like form.	70
405156	pfam14411	LHH	A nuclease of the HNH/ENDO VII superfamily with conserved LHH. LHH is a predicted nuclease of the HNH/ENDO VII superfamily of the treble clef fold. The name is derived from the conserved motif, LHH. It is found in bacterial polymorphic toxin systems and functions as a toxin module. Like WHH and AHH, the LHH nuclease contains 4 conserved histidines, of which the first is predicted to bind a metal ion and the other three are involved in activation of a water molecule for hydrolysis.	76
405157	pfam14412	AHH	A nuclease family of the HNH/ENDO VII superfamily with conserved AHH. AHH is a predicted nuclease of the HNH/ENDO VII superfamily of the treble clef fold. The name is derived from the conserved motif, AHH. It is found in bacterial polymorphic toxin systems and functions as a toxin module. Like WHH and LHH, the AHH nuclease contains 4 conserved histidines, of which the first is predicted to bind a metal ion and the other three are involved in activation of a water molecule for hydrolysis.	113
405158	pfam14413	Thg1C	Thg1 C terminal domain. Thg1 polymerases contain an additional region of conservation C-terminal to the core palm domain that comprises 5 helices and two strands. This region has several well-conserved charged residues, including a basic residue found towards the end of the first helix of this unit that might contribute to the Thg1-specific active site. This C-terminal module of Thg1 is predicted to form a helical bundle that functions equivalently to the fingers of the other nucleic acid polymerases, probably in interacting with the template HtRNA.	117
405159	pfam14414	WHH	A nuclease of the HNH/ENDO VII superfamily with conserved WHH. WHH is a predicted nuclease of the HNH/ENDO VII superfamily of the treble clef fold. The name is derived from the conserved motif WHH. It is found in bacterial polymorphic toxin systems and functions as a toxin module. WHH is the shortest version of HNH nuclease families. Like AHH and LHH, the WHH nuclease contains 4 conserved histidines, of which the first is predicted to bind a metal ion and the other three are involved in activation of a water molecule for hydrolysis.	43
405160	pfam14415	DUF4424	Domain of unknown function (DUF4424). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 310 and 361 amino acids in length.	303
405161	pfam14416	PMR5N	PMR5 N terminal Domain. The plant family with PMR5, ESK1, TBL3 etc have a N-terminal C rich predicted sugar binding domain followed by the PC-Esterase (acyl esterase) domain.	55
405162	pfam14417	MEDS	MEDS: MEthanogen/methylotroph, DcmR Sensory domain. MEDS is prototyped by DcmR and is likely to function with the PocR domain in certain organisms in sensing hydrocarbon derivatives. The MEDS domain occurs fused to Histidine Kinase and as stand-alone version. Sequence analysis shows that it is a catalytically inactive version of the P-loop NTPase domain of the RecA superfamily.	160
405163	pfam14418	OHA	OST-HTH Associated domain. OHA occurs with OST-HTH.	74
405164	pfam14419	SPOUT_MTase_2	AF2226-like SPOUT RNA Methylase fused to THUMP. SPOUT superfamily RNA methylase fused to RNA binding THUMP domain.	173
405165	pfam14420	Clr5	Clr5 domain. This domain is found at the N-terminus of the Clr5 protein which has been shown to be involved in silencing in fission yeast. This domain has been found to often be associated with proteins that contain ankyrin repeats and large regions of disordered sequence.	54
405166	pfam14421	LmjF365940-deam	A distinct subfamily of CDD/CDA-like deaminases. A distinct branch of the CDD/CDA-like deaminases prototyped by Leishmania LmjF36.5940. Members of this family are widely distributed across several microbial eukaryotes such as kinetoplastids, chlorophyte algae, stramenopiles and the alveolate Perkinsus. Domain architectures suggest that these proteins might possess mRNA editing or DNA mutagenizing activity.	197
405167	pfam14423	Imm5	Immunity protein Imm5. A predicted Immunity protein, with an all-alpha fold, present in bacterial polymorphic toxin systems as an immediate neighbor of the toxin.	186
405168	pfam14424	Toxin-deaminase	The BURPS668_1122 family of deaminases. A member of the nucleic acid/nucleotide deaminase superfamily prototyped by Burkholderia BURPS668_1122. Members of this family are found as toxins in polymorphic toxin systems in a wide range of bacteria and in the eukaryote Perkinsus. Members of this family typically possess a DxE catalytic motif in Helix-2 of the core fold instead of the more common C[H]xE motif. The Perkinsus versions are predicted to be inactive.	135
373060	pfam14425	Imm3	Immunity protein Imm3. A predicted Immunity protein, with a mostly all-alpha fold, present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene.	119
339223	pfam14426	Imm2	Immunity protein Imm2. A predicted Immunity protein, with a mostly all-alpha fold, present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene.	60
373061	pfam14427	Pput2613-deam	Pput_2613-like deaminase. A member of the nucleic acid/nucleotide deaminase superfamily prototyped by Pseudomonas Pput_2613. Members of this family are predicted to function as toxins in bacterial polymorphic toxin systems.	118
373062	pfam14428	SCP1201-deam	SCP1.201-like deaminase. A member of the nucleic acid/nucleotide deaminase superfamily prototyped by Streptomyces SCP1.201. Members of this family are predicted to function as toxins in bacterial polymorphic toxin systems.	136
405169	pfam14429	DOCK-C2	C2 domain in Dock180 and Zizimin proteins. The Dock180/Dock1 and Zizimin proteins are atypical GTP/GDP exchange factors for the small GTPases Rac and Cdc42 and are implicated in cell migration and phagocytosis. Across all Dock180 proteins, two regions are conserved: a C-terminal region termed CZH2 or DHR2 (or the Dedicator of cytokinesis), and CZH1/DHR1, which contains a new family of the C2 domain.	186
405170	pfam14430	Imm1	Immunity protein Imm1. A predicted immunity protein, with an alpha+beta fold and a conserved C-terminal tryptophan residue. The protein is present in a wide range of bacteria in polymorphic toxin systems as an immediate gene neighbor of the toxin gene.	125
373065	pfam14431	YwqJ-deaminase	YwqJ-like deaminase. A member of the nucleic acid/nucleotide deaminase superfamily prototyped by Bacillus YwqJ. Members of this family are present in a wide phyletic range of bacteria and a few basidiomycetes. Bacterial versions are predicted to function as toxins in bacterial polymorphic toxin systems.	134
373066	pfam14432	DYW_deaminase	DYW family of nucleic acid deaminases. A family of nucleic acid deaminases prototyped by the plant PPR DYW proteins that are implicated in chloroplast and mitochondrial RNA transcript maturation by numerous C to U editing events. The name derives from the DYW motif present at the C-terminus of the classical plant PPR DYW deaminases. Members of this family are present in bacteria, plants, Naegleria, and fungi. Plants and Naegleria show lineage-specific expansions of this family. Members of the classical DYW family contain an additional C-terminal metal-binding cluster composed of 2 histidines and a CxC motif and are often fused to PPR repeats. Ascomycete versions, which are independent lateral transfers, contain a large insert within the domain and are often fused to ankyrin repeats. Bacterial versions are predicted to function as toxins in polymorphic toxin systems.	125
405171	pfam14433	SUKH-3	SUKH-3 immunity protein. This family belongs to the SUKH superfamily and functions as immunity proteins in bacterial toxin systems.	140
405172	pfam14434	Imm6	Immunity protein Imm6. A predicted immunity protein, with an alpha+beta fold (mostly alpha helices). The protein is present in polymorphic toxin systems as an immediate gene neighbor of the toxin gene.	121
405173	pfam14435	SUKH-4	SUKH-4 immunity protein. This family belongs to the SUKH superfamily and functions as immunity proteins in bacterial toxin systems.	140
405174	pfam14436	EndoU_bacteria	Bacterial EndoU nuclease. This is a bacterial version of EndoU nuclease. It is found at the C-terminal region of polymorphic toxin proteins.	129
405175	pfam14437	MafB19-deam	MafB19-like deaminase. A member of the nucleic acid/nucleotide deaminase superfamily prototyped by Neisseria MafB19. Members of this family are present in a wide phyletic range of bacteria and are predicted to function as toxins in bacterial polymorphic toxin systems.	138
405176	pfam14438	SM-ATX	Ataxin 2 SM domain. This SM domain is found in Ataxin-2.	78
405177	pfam14439	Bd3614-deam	Bd3614-like deaminase. A member of the nucleic acid/nucleotide deaminase superfamily prototyped by Bdellovibrio Bd3614. They are typified by a distinct N-terminal globular domain. The Bdellovibrio version occurs in a predicted operon with a 23S rRNA G2445-modifying methylase suggesting that it might be involved in RNA editing.	113
405178	pfam14440	XOO_2897-deam	Xanthomonas XOO_2897-like deaminase. A member of the nucleic acid/nucleotide deaminase superfamily prototyped by Xanthomonas XOO_2897. Members of this family are present in a wide phyletic range of bacteria and are predicted to function as toxins in bacterial polymorphic toxin systems. The Xanthomonas XOO_2897 lacks an immunity protein and is predicted to be deployed against its eukaryotic host.	101
405179	pfam14441	OTT_1508_deam	OTT_1508-like deaminase. A member of the nucleic acid/nucleotide deaminase superfamily prototyped by Orientia OTT_1508. Members of this family are present in a wide phyletic range of bacteria, including several intracellular parasites and eukaryotes such as fungi, Leishmania, Selaginella, and some apicomplexa. In bacteria, these deaminases are predicted to function as toxins in bacterial polymorphic toxin systems. Versions in intracellular bacteria lack immunity proteins and are likely to be deployed against their eukaryotic hosts. Eukaryotic versions are predicted to function as nucleic acid (either DNA or RNA) deaminases. Among eukaryotes, some fungi show lineage-specific expansions of this family. Many fungal versions are fused to a distinct N-terminal globular domain. Various fungal versions are fused to domains involved in chromatin function. Apicomplexan versions are fused to tRNA guanine transglycosylase domain.	66
405180	pfam14442	Bd3614_N	Bd3614-like deaminase N-terminal. This is a globular domain that occurs N-terminal to the Bd3614-like deaminases, which are predicted to be involved in RNA editing.	92
405181	pfam14443	DBC1	DBC1. DBC1 and its homologs from diverse eukaryotes are a catalytically inactive version of the Nudix hydrolase (MutT) domain. DBC1 is predicted to bind NAD metabolites and regulate the activity of SIRT1 or related deacetylases by sensing the soluble products or substrates of the NAD-dependent deacetylation reaction.	123
405182	pfam14444	S1-like	S1-like. S1-like RNA binding domain found in DBC1.	58
379611	pfam14445	Prok-RING_2	Prokaryotic RING finger family 2. RING finger family found sporadically in bacteria and archaea, and associated with other components of the ubiquitin-based signaling and degradation system, including ubiquitin and the E1 and E2 proteins. The bacterial versions contain transmembrane helices.	56
405183	pfam14446	Prok-RING_1	Prokaryotic RING finger family 1. RING finger family found sporadically in bacteria and archaea, and associated in gene neighborhoods with other components of the ubiquitin-based signaling and degradation system, including ubiquitin, the E1 and E2 proteins and the JAB-like metallopeptidase. The bacterial versions contain transmembrane helices.	52
405184	pfam14447	Prok-RING_4	Prokaryotic RING finger family 4. RING finger family domain found sporadically in bacteria. The finger is fused to an N-terminal alpha-helical domain, ROT/Trove-like repeats and a C-terminal TerD domain. The architecture suggests a possible role in an RNA-processing complex.	46
373075	pfam14448	Nuc_N	Nuclease N terminal. This is a conserved short region that is found in many bacterial polymorphic toxin proteins. It is often located before C-terminal nuclease domains.	58
379612	pfam14449	PT-TG	Pre-toxin TG. PT-TG is a conserved region found in many bacterial toxin proteins. It could function as a linker that links N-terminal secretion-related domain and C-terminal toxin domain. It contains a TG motif.	68
405185	pfam14450	FtsA	Cell division protein FtsA. FtsA is essential for bacterial cell division, and co-localizes to the septal ring with FtsZ. It has been suggested that the interaction of FtsA-FtsZ has arisen through coevolution in different bacterial strains. The FtsA protein contains two structurally related actin-like ATPase domains which are also structurally related to the ATPase domains of HSP70 (see PF00012). FtsA has a SHS2 domain PF02491 inserted into the RnaseH fold PF02491.	168
405186	pfam14451	Ub-Mut7C	Mut7-C ubiquitin. This member of the ubiquitin superfamily is found at the N-terminus of Mut7-C like RNAses, suggestive of an RNA-binding role.	81
405187	pfam14452	Multi_ubiq	Multiubiquitin. A ubiquitin superfamily domain that is often present in multiple tandem copies in the same polypeptide. Members of this family are associated in gene neighborhoods, or on occasions fused to, bacterial homologs of components of ubiquitin-dependent modification system such as the E1, E2 and JAB metallopeptidase enzymes and a distinct metal-binding domain. The E2/UBC fold domain appears to be inactive. The JAB domain in these operons is usually fused to the E1 domain.	69
405188	pfam14453	ThiS-like	ThiS-like ubiquitin. A member of the ubiquitin superfamily that is often fused to the ThiF-like (E1)- ubiquitin activating enzyme and is present in gene neighborhoods with components of the thiamine biosynthesis pathway.	57
405189	pfam14454	Prok_Ub	Prokaryotic Ubiquitin. A Ubiquitin-superfamily protein that is present across several bacterial lineages, and found in gene neighborhoods with components of the ubiquitin modification system such as the E1, E2 and JAB proteins, and a novel alpha-helical protein, which is predicted to be enzymatic.	63
405190	pfam14455	Metal_CEHH	Predicted metal binding domain. A predicted metal-binding domain that is found in gene-neighborhood associations with genes encoding components of the bacterial homologs of the ubiquitin modification pathway including the E1, E2, JAB metallopeptidase and ubiquitin proteins. The domain is characterized by a conserved motif with a CxxxxxEYHxxxxH signature.	177
373077	pfam14456	alpha-hel2	Alpha-helical domain 2. An alpha-helical domain found in gene neighborhoods encoding genes containing bacterial homologs of components of the ubiquitin modification pathway such as the E1, E2, Ub and JAB peptidase proteins.	303
405191	pfam14457	Prok-E2_A	Prokaryotic E2 family A. A member of the E2/UBC superfamily of proteins found in several bacteria. The active site residues are very similar to the eukaryotic E2 proteins. Members of this family are usually fused to E1 and JAB domains C-terminal to the E2 domain. The protein is usually in the gene neighborhood of a gene encoding a distinct metallobetalactamase family protein.	162
405192	pfam14459	Prok-E2_C	Prokaryotic E2 family C. A divergent member of the E2/UBC superfamily of proteins found in bacteria. Members of the family contain a conserved cysteine in place of the histidine of the classical E2/UBC proteins. Members of this family are usually fused to an E1 domain at their C-terminus. The protein is usually in the gene neighborhood of a gene encoding a JAB peptidase and another encoding a predicted metal binding domain.	131
405193	pfam14460	Prok-E2_D	Prokaryotic E2 family D. A member of the E2/UBC superfamily of proteins found in several bacteria. Members of this family lack the conserved histidine of the classical E2-fold. However, they have an absolutely conserved histidine carboxyl-terminal to the conserved cysteine. Members of this family are usually present in a conserved gene neighborhood with genes encoding members of the Ub modification pathway such as the E1, Ub and JAB proteins. These neighborhoods also contain a gene encoding a rapidly diverging alpha-helical protein.	169
405194	pfam14461	Prok-E2_B	Prokaryotic E2 family B. A member of the E2/UBC superfamily of proteins found in several bacteria. The active site residues are similar to the eukaryotic E2 proteins but lack the conserved asparagine. Members of this family are usually fused to an E1 domain at the C-terminus. The protein is usually in the gene neighborhood of a gene encoding a member of the pol-beta nucleotidyltransferase superfamily. Many of the operons in this family are in ICE-like mobile elements and plasmids.	133
405195	pfam14462	Prok-E2_E	Prokaryotic E2 family E. A member of the E2/UBC superfamily of proteins found in diverse bacteria. Analysis of the active site residues suggests that members of this family are inactive as they lack the characteristic catalytic residues of the E2 enzymes. They are usually fused to or in the neighborhood of a multi/poly ubiquitin domain protein. Other proteins of the ubiquitin modification pathway such as the E1 and JAB proteins are also found in its gene neighborhood along with a distinct predicted metal-binding protein.	119
405196	pfam14463	E1-N	E1 N-terminal domain. An uncharacterized alpha/beta domain fused to E1 proteins. This protein is usually present in gene neighborhoods with genes encoding a JAB protein and a predicted metal-binding protein. In related E1 proteins, the E1-N domain is replaced by an E2/UBC superfamily domain.	151
405197	pfam14464	Prok-JAB	Prokaryotic homologs of the JAB domain. These are metalloenzymes that function as the ubiquitin isopeptidase/deubiquitinase in the ubiquitin-based signaling and protein turnover pathways in eukaryotes. Prokaryotic JAB domains are predicted to have a similar role in their cognates of the ubiquitin modification pathway. The domain is widely found in bacteria, archaea and phages where they are present in several gene contexts in addition to those that correspond to the prokaryotic cognates of the eukaryotic Ub pathway. Other contexts in which JAB domains are present include gene neighbor associations with ubiquitin fold domains in cysteine and siderophore biosynthesis, and phage tail morphogenesis, where they are shown or predicted to process the associated ubiquitin. A distinct family, the RadC-like JAB domains are widespread in bacteria and are predicted to function as nucleases. In halophilic archaea the JAB domain shows strong gene-neighborhood associations with a nucleotidyltransferase suggesting a role in nucleotide metabolism.	113
405198	pfam14465	NFRKB_winged	NFRKB Winged Helix-like. This domain covers regions 370-495 of human nuclear factor related to kappaB binding (NFRKB) protein.	102
405199	pfam14466	PLCC	PLAT/LH2 and C2-like Ca2+-binding lipoprotein. A small family of bacterial proteins, found in several Bacteroides species. Structure determination (NMR and Xray) shows an immunoglobulin beta-barrel fold. Multiple homologs have been found in human gut metagenomics data sets. Structural experimentation shows it to share features with two well-established protein architectures in the SCOP database, i.e., C2 (calcium/lipid-binding domain) of the Pfam PF00168 and PLAT/LH2 (lipase/lipoxygenase domain) of the Pfam PF01477. The C2 and PLAT/LH2 domains bind Ca2+ in their functions of targeting proteins to cell-membranes; this domain is also shown to bind Ca2+ as well as to be a novel fold.	128
405200	pfam14467	DUF4426	Domain of unknown function (DUF4426). Members of this entry are found mostly in g-proteobacteria, especially in Vibrio. Strangely enough, there seems to be one eukaryotic homolog in Nematostella vectensis (NEMVEDRAFT_v1g226006), where the PA0388-like domain is fused with a domain homologous to the Methionine biosynthesis protein MetW (see below). In several Pseudomonas species, but also in Vibrio vulnificus and Azotobacter vinelandii PA0388 homologs are genomic neighbors of Nucleoside 5-triphosphatase RdgB (dHAPTP, dITP, XTP-specific) (EC 3.6.1.15) and Methionine biosynthesis protein MetW. On the other hand, in most Vibrio species it appears as a part of a conserved operon involved in possible response to stress.	119
291157	pfam14468	DUF4427	Protein of unknown function (DUF4427). This domain is often found at the C-terminal of proteins with pfam10899 domain, for instance in STY1911 protein from a multiple drug resistant Salmonella enterica serovar Typhi CT18.	132
405201	pfam14469	AKAP28	28 kDa A-kinase anchor. 28 kDa AKAP (AKAP28) is highly enriched in human airway axonemes. The mRNA for AKAP28 is up-regulated as primary airway cells differentiate and is specifically expressed in tissues containing cilia and/or flagella. Homologs of AKAP28 are present in all animals and in some, including mice, the AKAP28-like domain is preceded by another uncharacterized domain.	121
405202	pfam14470	bPH_3	Bacterial PH domain. Proteins in this family are distantly related to PH domains.	94
405203	pfam14471	DUF4428	Domain of unknown function (DUF4428). This putative zinc finger domain is found in uncharacterized bacterial proteins.	51
405204	pfam14472	DUF4429	Domain of unknown function (DUF4429). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, archaea and viruses, and is approximately 90 amino acids in length. This domain is often found in two tandem copies.	95
405205	pfam14473	RD3	RD3 protein. RD3 is a human protein that is found preferentially expressed in the retina. Mutations in RD3 cause Leber Congenital Amaurosis type 12.	127
405206	pfam14474	RTC4	RTC4-like domain. This presumed domain is found in the RTC4 protein from yeasts. In Saccharomyces cerevisiae, Cdc13 binds telomeric DNA to recruit telomerase and to "cap" chromosome ends. RTC4 was identified in a screen to identify novel proteins and pathways that cap telomeres, or that respond to uncapped telomeres. This domain is also found in proteins that contain a DNA-binding myb domain.	121
405207	pfam14475	Mso1_Sec1_bdg	Sec1-binding region of Mso1. Mso1p is a component of the secretory vesicle docking complex whose function is closely associated with that of Sec1p. It is a small hydrophilic protein that is enriched in the microsomal membrane fraction, and this binding domain is towards the N-terminus of Mso1. The yeast Sec1p protein functions in the docking of secretory transport vesicles to the plasma membrane. Mso1p and Sec1p interact at sites of exocytosis and the Mso1p-Sec1p interaction site depends on a functional Rab GTPase Sec4p and its GEF Sec2p. The C-terminal region of Mso1 (not built) assists in targeting Sec1 to the sites of polarised membrane transport.	40
405208	pfam14476	Chloroplast_duf	Petal formation-expressed. The members of this plant family from Arabidopsis thaliana appear to be proteins found in the chloroplast, expressed in the pollen tube during the petal differentiation and expansion stage. The function is not known.	319
316951	pfam14477	Mso1_C	Membrane-polarising domain of Mso1. Mso1p is a component of the secretory vesicle docking complex whose function is closely associated with that of Sec1p. It is a small hydrophilic protein that is enriched in the microsomal membrane fraction. The yeast Sec1p protein functions in the docking of secretory transport vesicles to the plasma membrane. Mso1p and Sec1p interact at sites of exocytosis and the Mso1p-Sec1p interaction site depends on a functional Rab GTPase Sec4p and its GEF Sec2p. This C-terminal region of Mso1 assists in targeting Sec1 to the sites of polarised membrane transport, the SNARES and Sec4.	54
405209	pfam14478	DUF4430	Domain of unknown function (DUF4430). Although this family has overlaps with SLBB, the majority of its sequences are unique. Several family members, eg UniProtKB:A0RGA8, that do not overlap have an LPXTG-cell wall anchor at their C-terminus, a SSF_Family 10_polysaccharide_lyase or Glycosyltransferase structure associated with them in the middle region, as shown by InterPro, as well as this domain at the N-terminus.	72
405210	pfam14479	HeLo	Prion-inhibition and propagation. This N-terminal region, HeLo, has a prion-inhibitory effect in cis on its own prion-forming domain (PFD) and in trans on HET-s prion propagation. The domain is found exclusively in the fungal kingdom. Its structure, as it occurs in the HET-s/HET-S proteins, consists of two bundles of alpha-helices that pack into a single globular domain. The domain boundary determined from its structure and from protease-resistance experiments overlaps with the C-terminal prion-forming domain of HET-s (PF11558). The HeLo domains of HET-s and HET-S are very similar and their few differences (and not the prion-forming domains) determine the compatibility-phenotype of the fungi in which the proteins are expressed. The mechanism of the HeLo domain-function in heterokaryon-incompatibility is still under investigation, however the HeLo domain is found in similar protein architectures as other cell death and apoptosis-inducing domains. The only other HeLo protein to which a function has been associated is LopB from L. maculans. Although its specific role in L. maculans is unknown, LopB- mutants have impaired ability to form lesions on oilseed rape. The HeLo domain is not related to the HET domain (PF06985) which is another domain involved in heterokaryon incompatibility.	167
405211	pfam14480	DNA_pol3_a_NI	DNA polymerase III polC-type N-terminus I. This is the first N-terminal domain, NI domain, of the DNA polymerase III polC subunit A that is found only in Firmicutes. DNA polymerase polC-type III enzyme functions as the 'replicase' in low G + C Gram-positive bacteria. Purine asymmetry is a characteristic of organisms with a heterodimeric DNA polymerase III alpha-subunit constituted by polC which probably plays a direct role in the maintenance of strand-biased gene distribution; since, among prokaryotic genomes, the distribution of genes on the leading and lagging strands of the replication fork is known to be biased. It has been predicted that the N-terminus of polC folds into two globular domains, NI and NII. A predicted patch of electrostatic potential at the surface of this domain suggests a possible involvement in nucleic acid binding. This domain is associated with DNA_pol3_alpha pfam07733 and DNA_pol3_a_NI pfam11490.	72
373090	pfam14481	Fimbrial_PilY2	Type 4 fimbrial biogenesis protein PilY2. Members of this family were experimentally shown to be involved in fimbrial biogenesis, but its exact role appears to be unknown.	105
405212	pfam14484	FISNA	Fish-specific NACHT associated domain. This domain is frequently found associated with the NACHT domain (pfam05729) in fish and other vertebrates.	70
373092	pfam14485	DUF4431	Domain of unknown function (DUF4431). 	48
405213	pfam14486	DUF4432	Domain of unknown function (DUF4432). 	303
405214	pfam14487	DUF4433	Domain of unknown function (DUF4433). This family of proteins is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 201 and 230 amino acids in length. There is a single completely conserved residue E that may be functionally important. This family is distantly similar to pfam01885 suggesting these may be ADP-ribosylases.	201
405215	pfam14488	DUF4434	Domain of unknown function (DUF4434). 	168
405216	pfam14489	QueF	QueF-like protein. This protein is involved in the biosynthesis of queuosine. In some proteins this domain appears to be fused to pfam06508.	81
405217	pfam14490	HHH_4	Helix-hairpin-helix containing domain. This presumed domain contains at least one helix-hairpin-helix motif. This domain is often found in RecD helicases.	91
405218	pfam14491	DUF4435	Protein of unknown function (DUF4435). This presumed domain is functionally uncharacterized. This domain is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 285 and 362 amino acids in length. This domain is sometimes associated with AAA domains.	232
405219	pfam14492	EFG_II	Elongation Factor G, domain II. This domain is found in Elongation Factor G. It shares a similar structure with domain V (pfam00679).	75
405220	pfam14493	HTH_40	Helix-turn-helix domain. This presumed domain is found at the C-terminus of a large number of helicase proteins.	89
379624	pfam14494	DUF4436	Domain of unknown function (DUF4436). This is a family of membrane and transmembrane proteins from mycobacterial and related species. The function is not known.	255
405221	pfam14495	Cytochrom_C550	Cytochrome c-550 domain. This domain is a heme binding cytochrome known as cytochrome c550, or cytochrome c549, or PsbV.	136
405222	pfam14496	NEL	C-terminal novel E3 ligase, LRR-interacting. This NEL or novel E3 ligase domain is found at the C-terminus of bacterial virulence factors. Its sequence is different from those of the eukaryotic HECT and RING-finger E3 ligases, and it subverts the host ubiquitination process. At the N-terminus of the family-members there is a series of LRR repeats, and the NEL domain interacts with the most N-terminal repeat. The key residue for the ligation step is the cysteine, eg found at position 386 in UniProtKB:E7K2H2. The LRR section sequesters this active site until invasion has occurred.	216
405223	pfam14497	GST_C_3	Glutathione S-transferase, C-terminal domain. This domain is closely related to pfam00043.	102
405224	pfam14498	Glyco_hyd_65N_2	Glycosyl hydrolase family 65, N-terminal domain. This domain represents a domain found N-terminal to the glycosyl hydrolase 65 family catalytic domain.	233
379626	pfam14499	DUF4437	Domain of unknown function (DUF4437). This family of proteins is found in bacteria. Proteins in this family are typically between 152 and 283 amino acids in length.	250
405225	pfam14500	MMS19_N	Dos2-interacting transcription regulator of RNA-Pol-II. This domain, along with the C-terminal part, pfam12460, is an essential component of a silencing complex in fission yeast that contains Dos2, Rik1, Mms19 and Cdc20 (the catalytic subunit of DNA polymerase-epsilon). This complex regulates RNA polymerase II (RNA Pol II) activity in heterochromatin and is required for DNA replication and heterochromatin assembly.	258
405226	pfam14501	HATPase_c_5	GHKL domain. This family represents the structurally related ATPase domains of histidine kinase, DNA gyrase B and HSP90.	99
405227	pfam14502	HTH_41	Helix-turn-helix domain. 	48
405228	pfam14503	YhfZ_C	YhfZ C-terminal domain. This domain is often found in association with the helix-turn-helix domain HTH_41 (pfam14502). It includes YhfZ proteins from Escherichia coli and Shigella flexneri.	236
405229	pfam14504	CAP_assoc_N	CAP-associated N-terminal. The function of this domain is unknown, but it is found towards the N-terminus of bacterial proteins carrying the CAP domain, pfam00188. All members that do not otherwise carry an additional Cu_amine_oxidN1, pfam07833, domain are likely to be extracellular as they start with a signal-peptide. Most other non-bacterial proteins with the CAP domain are allergenic.	140
405230	pfam14505	DUF4438	Domain of unknown function (DUF4438). 	256
291192	pfam14506	CppA_N	CppA N-terminal. This is the N-terminal domain of the CppA protein found in species of Streptococcus. CppA is a putative C3-glycoprotein degrading proteinase, involved in pathogenicity. It is often found associated with pfam14507.	123
405231	pfam14507	CppA_C	CppA C-terminal. This is the C-terminal domain of the CppA protein found in species of Streptococcus. CppA is a putative C3-glycoprotein degrading proteinase, involved in pathogenicity. It is often found associated with pfam14506.	97
405232	pfam14508	GH97_N	Glycosyl-hydrolase 97 N-terminal. This N-terminal domain of glycosyl-hydrolase-97 contributes part of the active site pocket. It is also important for contact with the catalytic and C-terminal domains of the whole.	235
405233	pfam14509	GH97_C	Glycosyl-hydrolase 97 C-terminal, oligomerization. Glycosyl-hydrolase-97 is made up of three tightly linked and highly conserved globular domains. The C-terminal domain is found to be necessary for oligomerization of the whole molecule in order to create the active-site pocket and the Ca++-binding site.	97
405234	pfam14510	ABC_trans_N	ABC-transporter extracellular N-terminal. This domain is found at the N-terminus of ABC-transporter proteins from fungi, plants to higher eukaryotes. It would appear to be an extracellular domain.	80
405235	pfam14511	RE_EcoO109I	Type II restriction endonuclease EcoO109I. This is a family of Type II restriction endonucleases.	194
405236	pfam14512	TM1586_NiRdase	Putative TM nitroreductase. Compared with the more traditional NADH oxidase/flavin reductase family, this family is a duplication, consisting of two similar domains arranged as the subunits of the dimeric NADH oxidase/flavin reductase with one conserved active site.	214
405237	pfam14513	DAG_kinase_N	Diacylglycerol kinase N-terminus. This domain is found at the N-terminus of diacylglycerol kinases.	107
405238	pfam14514	TetR_C_9	Transcriptional regulator, TetR, C-terminal. This family comprises proteins that belong to the TetR family of transcriptional regulators. This family features the C-terminal region of these sequences, which does not include the N-terminal helix-turn-helix.	130
405239	pfam14515	HOASN	Haem-oxygenase-associated N-terminal helices. This domain represents a pair of alpha helices, which are found at the N-terminus of some Haem-oxygenase globular domains.	92
373105	pfam14516	AAA_35	AAA-like domain. This family of proteins is part of the AAA superfamily.	331
405240	pfam14517	Tachylectin	Tachylectin. This family of lectins binds N-acetylglucosamine and N-acetylgalactosamine and may be involved in innate immunity. It has a five-bladed beta-propeller structure with five carbohydrate-binding sites, one per beta sheet.	230
405241	pfam14518	Haem_oxygenas_2	Iron-containing redox enzyme. The CADD, Chlamydia protein associating with death domains, crystal structure reveals a dimer of seven-helical bundles. Each bundle contains a di-iron centre adjacent to an internal cavity that forms an active site similar to that of methane mono-oxygenase hydrolase.	178
405242	pfam14519	Macro_2	Macro-like domain. This domain is an ADP-ribose binding module. It is found in a number of yeast proteins.	290
405243	pfam14520	HHH_5	Helix-hairpin-helix domain. 	57
405244	pfam14521	Aspzincin_M35	Lysine-specific metallo-endopeptidase. This is the catalytic region of aspzincins, a group of lysine-specific metallo-endopeptidases in the MEROPS:M35 family. They exhibit the following active-site architecture. The active site is composed of two helices and a loop region and includes the HExxH and GTxDxxYG motifs. In UniProt:P81054, His117, His121 and Asp130 coordinate to the catalytic zinc ligands. An electrostatically negative region composed of Asp154 and Glu157 attracts a positively charged Lys side chain of a substrate in a specific manner.	145
405245	pfam14522	Cytochrome_C7	Cytochrome c7 and related cytochrome c. This family includes cytochromes c7 and c7-type. In cytochromes c7 all three haems are bis-His co-ordinated. In c7-type the last haem is His-Met co-ordinated.	62
405246	pfam14523	Syntaxin_2	Syntaxin-like protein. This domain includes syntaxin-like domains including from the Vam3p protein.	101
405247	pfam14524	Wzt_C	Wzt C-terminal domain. This domain is found at the C-terminus of the Wzt protein. The crystal structure of C-Wzt(O9a) reveals a beta sandwich with an immunoglobulin-like topology that contains the O-antigenic polysaccharide binding pocket. This domain is often associated with the ABC-transporter domain.	143
405248	pfam14525	AraC_binding_2	AraC-binding-like domain. This domain is related to the AraC ligand binding domain pfam02311.	173
405249	pfam14526	Cass2	Integron-associated effector binding protein. This family contains Cass2 from Vibrio cholerae, an integron-associated protein that has been shown to bind cationic drug compounds with submicromolar affinity. Cass2 has been proposed to be representative of a larger family of independent effector-binding proteins associated with lateral gene transfer within Vibrio and other closely-related species.	149
405250	pfam14527	LAGLIDADG_WhiA	WhiA LAGLIDADG-like domain. This domain is found within the sporulation regulator WhiA. It is a LAGLIDADG superfamily like domain.	93
405251	pfam14528	LAGLIDADG_3	LAGLIDADG-like domain. This domain is part of the LAGLIDADG superfamily.	82
405252	pfam14529	Exo_endo_phos_2	Endonuclease-reverse transcriptase. This domain represents the endonuclease region of retrotransposons from a range of bacteria, archaea and eukaryotes. These are enzymes largely from class EC:2.7.7.49.	118
405253	pfam14530	DUF4439	Domain of unknown function (DUF4439). This domain has a ferritin-like fold.	131
405254	pfam14531	Kinase-like	Kinase-like. This family includes the pseudokinases ROP2 and ROP8 from Toxoplasma gondii. These proteins have a typical bilobed protein kinase fold, but lack catalytic activity.	288
316998	pfam14532	Sigma54_activ_2	Sigma-54 interaction domain. 	138
405255	pfam14533	USP7_C2	Ubiquitin-specific protease C-terminal. This C-terminal domain on many long ubiquitin-specific proteases has no known function.	205
405256	pfam14534	DUF4440	Domain of unknown function (DUF4440). 	107
405257	pfam14535	AMP-binding_C_2	AMP-binding enzyme C-terminal domain. This is a small domain that is found C terminal to pfam00501. It has a central beta sheet core that is flanked by alpha helices.	96
405258	pfam14536	DUF4441	Domain of unknown function (DUF4441). This family is largely made up of uncharacterized proteins from the Ciliophora. The function is not known.	114
405259	pfam14537	Cytochrom_c3_2	Cytochrome c3. 	79
405260	pfam14538	Raptor_N	Raptor N-terminal CASPase like domain. This domain is found at the N-terminus of the Raptor protein. It has been identified to have a CASPase like structure. It conserves the characteristic cys/his dyad of the caspases suggesting it may have a peptidase activity.	152
405261	pfam14539	DUF4442	Domain of unknown function (DUF4442). This family of proteins is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 139 and 165 amino acids in length. There is a conserved PYF sequence motif. There is a single completely conserved residue N that may be functionally important.	131
405262	pfam14540	NTF-like	Nucleotidyltransferase-like. Structural comparisons with Structure 1kny indicate that this N-terminal domain resembles a nucleotidyltransferase fold.	117
405263	pfam14541	TAXi_C	Xylanase inhibitor C-terminal. The N- and C-termini of the members of this family are jointly necessary for creating the catalytic pocket necessary for cleaving xylanase. Phytopathogens produce xylanase that destroys plant cells, so its destruction through proteolysis is vital for plant-survival.	160
405264	pfam14542	Acetyltransf_CG	GCN5-related N-acetyl-transferase. This family of GCN5-related N-acetyl-transferases bind both CoA and acetyl-CoA. They are characterized by highly conserved glycine, a cysteine residue in the acetyl-CoA binding site near the acetyl group, their small size compared with other GNATs and a lack of an obvious substrate-binding site. It is proposed that they transfer an acetyl group from acetyl-CoA to one or more unidentified aliphatic amines via an acetyl (cysteine) enzyme intermediate. The substrate might be another macromolecule.	80
405265	pfam14543	TAXi_N	Xylanase inhibitor N-terminal. The N- and C-termini of the members of this family are jointly necessary for creating the catalytic pocket necessary for cleaving xylanase. Phytopathogens produce xylanase that destroys plant cells, so its destruction through proteolysis is vital for plant-survival.	172
373123	pfam14544	DUF4443	Domain of unknown function (DUF4443). This is a family of archaeal proteins. The domain is a putative gyrase domain.	112
405266	pfam14545	DBB	Dof, BCAP, and BANK (DBB) motif. The DBB domain is named from the Drosophila (Downstream of FGFR - Dof, also known as Heartbroken or Stumps) protein, the BANKS and BCAP, both signalling in B-cell pathway, proteins. This domain defines a minimal region required for mediating Dof dimerization. Since this domain can interact both with itself and with a region in the C-terminal part of the molecule, it may mediate either intermolecular or intramolecular interactions. Mutants lacking this domain disrupt FGFR signal transduction and fibroblast growth-factor signalling.	139
405267	pfam14547	Hydrophob_seed	Hydrophobic seed protein. This domain has a four-helix bundle structure. It contains four disulfide bonds, of which three function to keep the C- and N-terminal parts of the molecule in place.	85
405268	pfam14549	P22_Cro	DNA-binding transcriptional regulator Cro. Bacteriophage P22 Cro protein represses genes normally expressed in early phage development and is necessary for the late stage of lytic growth. It does this by binding to the OL and OR operator-regions normally used by the repressor protein for lysogenic maintenance.	60
405269	pfam14550	Peptidase_S78_2	Putative phage serine protease XkdF. This domain is largely found on phage proteins. In a number of cases the domain is associated with a SAM-dependent methyltransferase. Members are serine peptidases.	120
405270	pfam14551	MCM_N	MCM N-terminal domain. This family contains the N-terminal domain of MCM proteins.	93
405271	pfam14552	Tautomerase_2	Tautomerase enzyme. 	82
405272	pfam14553	YqbF	YqbF, hypothetical protein domain. This N-terminal domain is found in Bacillus and related spp. The function is not known.	43
405273	pfam14554	VEGF_C	VEGF heparin-binding domain. This short domain is found at the C-terminus of VEGF. It has been shown to have heparin binding activity.	49
405274	pfam14555	UBA_4	UBA-like domain. 	42
373128	pfam14556	AF2331-like	AF2331-like. AF2331-like is a 11-kDa orphan protein of unknown function from Archaeoglobus fulgidus. The structure consists of an alpha + beta fold formed by an unusual homodimer, where the two core beta-sheets are interdigitated, containing strands alternating from both subunits. AF2331 contains multiple negatively charged surface clusters and is located on the same operon as the basic protein AF2330. It is suggested that AF2331 and AF2330 may form a charge-stabilized complex in vivo, though the role of the negatively charged surface clusters is not clear.	90
317018	pfam14557	AphA_like	Putative AphA-like transcriptional regulator. Members of this family are putative transcriptional regulators that appear to be related to the pfam03551 family. This family includes AphA-like members.	174
405275	pfam14558	TRP_N	ML-like domain. This domain is distantly similar to pfam02221 and conserves its pattern of conserved cysteines. This suggests that this domain may be involved in lipid binding.	137
405276	pfam14559	TPR_19	Tetratricopeptide repeat. 	68
405277	pfam14560	Ubiquitin_2	Ubiquitin-like domain. This entry contains ubiquitin-like domains.	83
405278	pfam14561	TPR_20	Tetratricopeptide repeat. 	90
373131	pfam14562	Endonuc_BglI	Restriction endonuclease BglI. This restriction endonuclease binds DNA as a dimer. BglI recognizes and cleaves the interrupted DNA sequence GCCNNNNNGGC and cleaves between the fourth and fifth unspecified base pair to produce 3' overhanging ends.	285
405279	pfam14563	DUF4444	Domain of unknown function (DUF4444). This domain family is found in bacteria, and is approximately 40 amino acids in length. There is a conserved LIPL sequence motif. There are two completely conserved G residues that may be functionally important.	41
405280	pfam14564	Membrane_bind	Membrane binding. This family includes the C-terminal domain of Dictyostelium discoideum Calcium-dependent cell adhesion molecule 1, which has an immunoglobulin-like fold. It tethers the protein to the cell membrane.	109
405281	pfam14565	IL22	Interleukin 22 IL-10-related T-cell-derived-inducible factor. Interleukin-22 is distantly related to interleukin (IL)-10, and is produced by activated T cells. IL-22 is a ligand for CRF2-4, a member of the class II cytokine receptor family.	139
405282	pfam14566	PTPlike_phytase	Inositol hexakisphosphate. Inositol hexakisphosphate, often called phytate, is found in abundance in seeds and acts as an inorganic phosphate reservoir. Phytases are phosphatases that hydrolyze phytate to less-phosphorylated myo-inositol derivatives and inorganic phosphate. The active-site sequence (HCXXGXGR) of the phytase identified from the gut micro-organism Selenomonas ruminantium forms a loop (P loop) at the base of a substrate binding pocket that is characteristic of protein tyrosine phosphatases (PTPs). The depth of this pocket is an important determinant of the substrate specificity of PTPs. In humans this enzyme is thought to aid bone mineralization and salvage the inositol moiety prior to apoptosis.	157
405283	pfam14567	SUKH_5	SMI1-KNR4 cell-wall. Members of this family are related to the SMI1/KNR4-like or SUKH superfamily of proteins.	133
405284	pfam14568	SUKH_6	SMI1-KNR4 cell-wall. Members of this family are related to the SMI1/KNR4-like or SUKH superfamily of proteins.	120
405285	pfam14569	zf-UDP	Zinc-binding RING-finger. This RING/U-box type zinc-binding domain is frequently found in the catalytic subunit (irx3) of cellulose synthase. The enzymic class is EC:2.4.1.12, whereby the synthase removes the glucose from UDP-glucose and adds it to the growing cellulose, thereby releasing UDP. The domain-structure is treble-clef like (Structure 1weo).	75
405286	pfam14570	zf-RING_4	RING/Ubox like zinc-binding domain. 	47
405287	pfam14571	Di19_C	Stress-induced protein Di19, C-terminal. C-terminal domain of Di19, a protein that increases the sensitivity of plants to environmental stress, such as salinity, drought, osmotic stress and cold. The protein is also induced by an increased supply of stress-related hormones such as abscisic acid ABA and ethylene. There is a zinc-finger at the N-terminus, zf-Di19, pfam05605.	102
405288	pfam14572	Pribosyl_synth	Phosphoribosyl synthetase-associated domain. This family includes several examples of enzymes from class EC:2.7.6.1, phosphoribosyl-pyrophosphate transferase.	184
373139	pfam14573	PP-binding_2	Acyl-carrier. 	96
405289	pfam14574	DUF4445	Domain of unknown function (DUF4445). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 525 and 664 amino acids in length. The family is found in association with pfam00111.	261
405290	pfam14575	EphA2_TM	Ephrin type-A receptor 2 transmembrane domain. Epha2_TM represents the left-handed dimer transmembrane domain of the EphA2 receptor. This domain oligomerizes and is important for the active signalling process.	76
405291	pfam14576	SEO_N	Sieve element occlusion N-terminus. Sieve element occlusion (SEO) proteins, or forisomes, are phloem proteins which accumulate during sieve element differentiation. This domain represents the N-terminus of SEO proteins.	287
405292	pfam14577	SEO_C	Sieve element occlusion C-terminus. Sieve element occlusion (SEO) proteins, or forisomes, are phloem proteins which accumulate during sieve element differentiation. This domain represents the C-terminus of SEO proteins.	232
405293	pfam14578	GTP_EFTU_D4	Elongation factor Tu domain 4. Elongation factor Tu consists of several structural domains, and this is usually the fourth.	86
405294	pfam14579	HHH_6	Helix-hairpin-helix motif. The HHH domain is a short DNA-binding domain.	88
405295	pfam14580	LRR_9	Leucine-rich repeat. 	175
405296	pfam14581	SseB_C	SseB protein C-terminal domain. This family consists of several SseB proteins which appear to be found exclusively in Enterobacteria. SseB is known to enhance serine-sensitivity in Escherichia coli and is part of the Salmonella pathogenicity island 2 (SPI-2) translocon. This presumed domain is found at the C-terminus of SseB proteins.	106
405297	pfam14582	Metallophos_3	Metallophosphoesterase, calcineurin superfamily. Members of this family are part of the Calcineurin-like phosphoesterase superfamily.	259
291262	pfam14583	Pectate_lyase22	Oligogalacturonate lyase. This is a family of oligogalacturonate lyases, referred to more generally as pectate lyase family 22. These proteins fold into 7-bladed beta-propellers.	386
405298	pfam14584	DUF4446	Protein of unknown function (DUF4446). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 165 and 176 amino acids in length.	150
405299	pfam14585	CagY_I	CagY type 1 repeat. This repeat is found at the N-terminus of the CagY proteins - part of the CAG pathogenicity island - and involved in delivery of the protein CagA into host cells.	65
291265	pfam14586	MHC_I_2	Class I Histocompatibility antigen, NKG2D ligand, domains 1 and 2. Members of this family are known as retinoic-acid-inducible proteins. They are ligands for the activating immunoreceptor NKG2D, which is widely expressed on natural killer cells, T cells, and macrophages.	174
405300	pfam14587	Glyco_hydr_30_2	O-Glycosyl hydrolase family 30. 	367
405301	pfam14588	YjgF_endoribonc	YjgF/chorismate_mutase-like, putative endoribonuclease. YjgF_Endoribonuc is a putative endoribonuclease. The structure is of beta-alpha-beta-alpha-beta(2) domains common both to bacterial chorismate mutase and to members of the YjgF family. These proteins form trimers with a three-fold symmetry with three closely-packed beta-sheets. The YjgF family is a large, widely distributed family of proteins of unknown biochemical function that are highly conserved among eubacteria, archaea and eukaryotes.	148
373147	pfam14589	NrfD_2	Polysulfide reductase. Bacterial polysulfide reductase is an integral membrane protein complex responsible for quinone-coupled reduction of polysulfide, a process important in extreme environments such as deep-sea vents and hot springs. Polysulfides are a class of compounds composed of chains of sulfur atoms, which in their simplest form are present as an anion with general formula Sn(2-). In nature, polysulfides are found in particularly high concentrations in extreme volcanic or geothermically active environments. Here, the reduction and oxidation of polysulfides are vital processes for many bacteria and are essential steps in the global sulfur cycle. In particular, the reduction of polysulfide to hydrogen sulfide in these environments is usually linked to energy-generating respiratory processes, supporting growth of many microorganisms, particularly hyperthermophiles.	281
317044	pfam14590	DUF4447	Domain of unknown function (DUF4447). This family of proteins is found in bacteria. Proteins in this family are approximately 170 amino acids in length.	166
373148	pfam14591	AF0941-like	AF0941-like. Members of this family are of unknown function.	113
405302	pfam14592	Chondroitinas_B	Chondroitinase B. This family includes chondroitinases. These enzymes cleave the glycosaminoglycan dermatan sulfate.	426
405303	pfam14593	PH_3	PH domain. 	103
405304	pfam14594	Sipho_Gp37	Siphovirus ReqiPepy6 Gp37-like protein. This family includes numerous phage proteins from Siphoviruses. The function of this protein is uncertain, but it is related to pfam06605. In Rhodococcus phage ReqiPepy6 this protein is called Gp37.	337
405305	pfam14595	Thioredoxin_9	Thioredoxin. 	129
291272	pfam14596	STAT6_C	STAT6 C-terminal. This family represents the C-terminus of mammalian STAT6 (Signal transducer and activator of transcription 6), it contains an LXXLL motif which binds to NCOA1 (Nuclear receptor coactivator 1).	193
373150	pfam14597	Lactamase_B_5	Metallo-beta-lactamase superfamily. This is a small family of putative metal-dependent hydrolases.	199
405306	pfam14598	PAS_11	PAS domain. This family includes the PAS-B domain of NCOA1 (Nuclear receptor coactivator 1), which binds to an LXXLL motif in the C-terminal region of STAT6 (Signal transducer and activator of transcription 6).	109
405307	pfam14599	zinc_ribbon_6	Zinc-ribbon. This is a typical zinc-ribbon finger, with each pair of zinc-ligands coming from more-or-less either side of two knuckles. It is found in eukaryotes.	59
373153	pfam14600	CBM_5_12_2	Cellulose-binding domain. This C-terminal domain belongs to the CAZy family of carbohydrate-binding domains that are associated with glycosyl-hydrolases. It is suggested to bind cellulose.	62
405308	pfam14601	TFX_C	DNA_binding protein, TFX, C-term. This is the C-terminal region of TFX-like DNA-binding proteins.	83
405309	pfam14602	Hexapep_2	Hexapeptide repeat of succinyl-transferase. 	33
405310	pfam14603	hSH3	Helically-extended SH3 domain. This domain is the 70 C-terminal residues of ADAP - Adhesion and de-granulation promoting adapter protein. It shows homology to SH3 domains; however, conserved residues of the fold are absent. It thus represents an altered SH3 domain fold. An N-terminal, amphipathic, helix makes extensive contacts to residues of the regular SH3 domain fold thereby creating a composite surface with unusual surface properties. The domain can no longer bind conventional proline-rich peptides. There are key phosphorylation sites within the two hSH3 domains and it would appear that binding at these sites does not materially affect the folding of these regions although the equilibrium towards the unfolded state may be slightly altered. The binding partners of the hSH3 domains are still unknown.	89
405311	pfam14604	SH3_9	Variant SH3 domain. 	49
373156	pfam14605	Nup35_RRM_2	Nup53/35/40-type RNA recognition motif. 	53
405312	pfam14606	Lipase_GDSL_3	GDSL-like Lipase/Acylhydrolase family. 	178
405313	pfam14607	GxDLY	N-terminus of Esterase_SGNH_hydro-type. This domain lies upstream of SGNH hydrolase, but its function is not known. There is a highly conserved GxDLY sequence-motif.	146
405314	pfam14608	zf-CCCH_2	RNA-binding, Nab2-type zinc finger. This is an unusual zinc-finger family, and is represented by fingers 5-7 of Nab2. Nab2 ZnF5-7 are zinc-fingers of the type C-x8-C-x5-C-x3-H. Nab2 ZnFs function in the generation of export-competent mRNPs. Nab2 is a conserved polyadenosine-RNA-binding Zn finger protein required for both mRNA export and polyadenylation regulation and becomes attached to the mRNP after splicing and during or immediately after polyadenylation. The three ZnFs, 5-7, have almost identical folds and, most unusually, associate with one another to form a single coherent structural unit. ZnF5-7 bind to eight consecutive adenines, and chemical shift perturbations identify residues on each finger that interact with RNA.	18
405315	pfam14609	GCP5-Mod21	gamma-Tubulin ring complex non-core subunit mod21. GCP5-Mod21 is a non-core subunit of the larger gamma-tubulin ring complex that effects microtubule nucleation from both centrosomal and non-centrosomal sites. This subunit, unlike GCP2 and GCP3 and others, is not thought to be essential for viability in the fission yeast, and may not be expressed in very high concentrations. Fission yeast can form a large gamma-Tubulin complex C similar to that found in higher eukaryotes and this complex is important for maintaining normal levels of microtubule nucleation in vivo.	676
405316	pfam14610	DUF4448	Protein of unknown function (DUF4448). This is a family of predicted membrane glycoproteins from fungi. However there appears, visually, to be some similarity with the family of GPI-anchored fungal proteins, pfam10342.	185
405317	pfam14611	SLS	Mitochondrial inner-membrane-bound regulator. SLS is a fungal domain found bound to the mitochondrial inner-membrane. It reacts physically with fungal Kar2p to promote translocation across the endoplasmic-reticulum membrane. This action appeared to be mediated via the promotion of the Sec63p-mediated activation of Kar2p's ATPase activity. This indicates that the Sls1p protein is a GrpE-like protein in the endoplasmic reticulum. In S.cerevisiae the SLS1 gene (ScSLS1) is not essential but is also involved in ERAD and folding.	197
405318	pfam14612	Ino80_Iec3	IEC3 subunit of the Ino80 complex, chromatin re-modelling. This is a family of fungal chromatin re-modelling proteins found in one of the chromatin-central complexes, Ino80. The function was identified in Schizosaccharomyces pombe but there is no orthologue in S. cerevisiae.	231
405319	pfam14613	DUF4449	Protein of unknown function (DUF4449). This is a fungal DUF of unknown function.	156
405320	pfam14614	DUF4450	Domain of unknown function (DUF4450). This is a family of bacterial proteins of unknown function.	213
405321	pfam14615	Rsa3	Ribosome-assembly protein 3. This is a family of 60S ribosome-assembly proteins, from fungi.	46
405322	pfam14616	DUF4451	Domain of unknown function (DUF4451). This is a family of fungal proteins up-regulated during meiosis.	120
373164	pfam14617	CMS1	U3-containing 90S pre-ribosomal complex subunit. This is a family of fungal and plant CMS1-like proteins. The family has similarity to the DEAD-box helicases.	250
405323	pfam14618	DUF4452	Domain of unknown function (DUF4452). This fungal family has no known function. However, it is rich in paired, as CXXC, cysteines and histidines, but these do not fall in the conformation that might suggest zinc-binding.	169
405324	pfam14619	SnAC	Snf2-ATP coupling, chromatin remodelling complex. This domain appears to play a crucial role in chromatin remodelling for yeast SWI/SNF. It binds histones. It is required for mobilising nucleosomes and lies within the catalytic subunit of the yeast SWI/SNF. It is found to be universally conserved.	71
405325	pfam14620	YPEB	YpeB sporulation. YPEB is a protein that is necessary for the functioning of SleB during spore-cortex hydrolysis.	361
405326	pfam14621	RFX5_DNA_bdg	RFX5 DNA-binding domain. RFX5 and RFXAP reveals molecular details associated with MHCII gene expression.	219
405327	pfam14622	Ribonucleas_3_3	Ribonuclease-III-like. Members of this family are involved in rDNA transcription and rRNA processing. They probably also cleave a stem-loop structure at the 3' end of U2 snRNA to ensure formation of the correct U2 3' end; they are involved in polyadenylation-independent transcription termination. Some members may be mitochondrial ribosomal protein subunit L15, others may be 60S ribosomal protein L3.	128
405328	pfam14623	Vint	Hint-domain. This short domain is a conserved region of intein-containing proteins from lower eukaryotes.	165
405329	pfam14624	Vwaint	VWA / Hh protein intein-like. VWA-Hint proteins carry this conserved domain of around 300 residues, now named the Vwaint domain. Such proteins do not seem to have a signal peptide for secretion. Generally, this domain lies between the N-terminal VWA domain and the more C-terminal 'Vint'-type Hint domain. The exact function of this domain is not known.	70
405330	pfam14625	Lustrin_cystein	Lustrin, cysteine-rich repeated domain. This repeated domain is found in proteins from lower eukaryotes in lustrin, perlucin, pearl nacre, and other similar protein-types. Each repeat lies between Kunitz-BPTI repeats, in certain species, which are also cysteine-rich. The cysteines may form the disulfide bonds observed for other members of this superfamily.	43
405331	pfam14626	RNase_Zc3h12a_2	Zc3h12a-like Ribonuclease NYN domain. This family is found to be a divergent form of the NYN-domain- containing RNAse family.	122
405332	pfam14627	DUF4453	Domain of unknown function (DUF4453). This short domain is found only on a small subgroup of proteins from Gram-negative Proteobacteria that also carry a YARHG domain, pfam13308. They carry three conserved tryptophan and three conserved cysteine residues.	107
405333	pfam14628	DUF4454	Domain of unknown function (DUF4454). This C-terminal domain is found only on a small subgroup of proteins from Gram-positive Clostridiales that also carry a YARHG domain, pfam13308.	210
405334	pfam14629	ORC4_C	Origin recognition complex (ORC) subunit 4 C-terminus. This entry represents the C-terminus of origin recognition complex subunit 4.	212
405335	pfam14630	ORC5_C	Origin recognition complex (ORC) subunit 5 C-terminus. This entry represents the C-terminus of origin recognition complex subunit 5.	260
405336	pfam14631	FancD2	Fanconi anaemia protein FancD2 nuclease. The Fanconi anaemia protein FancD2 is a nuclease necessary for the repair of DNA interstrand-crosslinks.	1345
405337	pfam14632	SPT6_acidic	Acidic N-terminal SPT6. The N-terminus of SPT6 is highly acidic. The full SPT6 protein is a transcription regulator, but the exact function of this acidic region is not certain.	88
405338	pfam14633	SH2_2	SH2 domain. 	206
405339	pfam14634	zf-RING_5	zinc-RING finger domain. 	43
291309	pfam14635	HHH_7	Helix-hairpin-helix motif. 	104
405340	pfam14636	FNIP_N	Folliculin-interacting protein N-terminus. This is the N-terminus of folliculin-interacting proteins.	79
405341	pfam14637	FNIP_M	Folliculin-interacting protein middle domain. This is the middle domain of folliculin-interacting proteins.	226
405342	pfam14638	FNIP_C	Folliculin-interacting protein C-terminus. This is the C-terminus of folliculin-interacting proteins. This region is responsible for binding to folliculin.	189
258777	pfam14639	YqgF	Holliday-junction resolvase-like of SPT6. The YqgF domain of SPT6 proteins is homologous to the E.coli RuvC but its putative catalytic site lacks the carboxylate side chains critical for coordinating magnesium ions that mediate phosphodiester bond-cleavage.	150
405343	pfam14640	TMEM223	Transmembrane protein 223. 	172
405344	pfam14641	HTH_44	Helix-turn-helix DNA-binding domain of SPT6. This helix-turn-helix represents the first of two DNA-binding domains on the SPT6 proteins.	115
405345	pfam14642	FAM47	FAM47 family. The function of this Chordate family of proteins is not known.	257
405346	pfam14643	DUF4455	Domain of unknown function (DUF4455). This domain family is found in bacteria and eukaryotes, and is approximately 480 amino acids in length. There are two completely conserved residues (W and P) that may be functionally important.	469
405347	pfam14644	DUF4456	Domain of unknown function (DUF4456). This domain family is found in bacteria and eukaryotes, and is approximately 210 amino acids in length. There is a single completely conserved residue E that may be functionally important.	209
405348	pfam14645	Chibby	Chibby family. This family includes the eukaryotic chibby proteins. These proteins inhibit the wingless/Wnt pathway by binding to beta-catenin and inhibiting beta-catenin-mediated transcriptional activation. Chibby is Japanese for small, and is named after the RNAi phenotype seen in Drosophila.	114
405349	pfam14646	MYCBPAP	MYCBP-associated protein family. This family of eukaryotic proteins includes the mammalian MYCBP-associated proteins. These proteins may be involved in synaptic processes and may have a role in spermatogenesis.	429
405350	pfam14647	FAM91_N	FAM91 N-terminus. 	306
405351	pfam14648	FAM91_C	FAM91 C-terminus. 	393
405352	pfam14649	Spatacsin_C	Spatacsin C-terminus. This family includes the C-terminus of spatacsin.	295
405353	pfam14650	FAM75	FAM75 family. 	382
405354	pfam14651	Lipocalin_7	Lipocalin / cytosolic fatty-acid binding protein family. Lipocalins are transporters for small hydrophobic molecules, such as lipids, steroid hormones, bilins, and retinoids. The family also encompasses the enzyme prostaglandin D synthase (EC:5.3.99.2).	128
405355	pfam14652	DUF4457	Domain of unknown function (DUF4457). This family of proteins is found in eukaryotes. It is found repeated several times in the vertebrate KIAA0556 proteins.	325
405356	pfam14653	IGFL	Insulin growth factor-like family. This family includes the insulin growth factor-like proteins. These proteins are potential ligands for the IGFLR1 cell membrane receptor.	83
405357	pfam14654	Epiglycanin_C	Mucin, catalytic, TM and cytoplasmic tail region. This family represents the non-tandem repeat domain including cleavage site, the transmembrane helix domain, and the cytoplasmic tail of epiglycanin and related mucins.	100
405358	pfam14655	RAB3GAP2_N	Rab3 GTPase-activating protein regulatory subunit N-terminus. This family includes the N-terminus of the Rab3 GTPase-activating protein non-catalytic subunit. Rab3 GTPase-activating protein is a GTPase activating protein with specificity for Rab3 subfamily.	416
405359	pfam14656	RAB3GAP2_C	Rab3 GTPase-activating protein regulatory subunit C-terminus. This family includes the C-terminus of the Rab3 GTPase-activating protein non-catalytic subunit. Rab3 GTPase-activating protein is a GTPase activating protein with specificity for Rab3 subfamily.	595
405360	pfam14657	Arm-DNA-bind_4	Arm DNA-binding domain. This family includes AP2-like domains found in a variety of phage integrase proteins. These domains bind to Arm DNA sites.	44
405361	pfam14658	EF-hand_9	EF-hand domain. 	66
405362	pfam14659	Phage_int_SAM_3	Phage integrase, N-terminal SAM-like domain. This domain is found in a variety of phage integrase proteins.	55
405363	pfam14660	DUF4458	Domain of unknown function (DUF4458). This domain is found in tandem repeats on the N-terminus of secreted LRR proteins from human-associated Bacteroidetes. Domain boundaries are based on the JCSG-solved 3D structure of JCSG target SP16667A (BT_0210).	108
405364	pfam14661	HAUS6_N	HAUS augmin-like complex subunit 6 N-terminus. This family includes the N-terminus of HAUS augmin-like complex subunit 6. The HAUS augmin-like complex contributes to mitotic spindle assembly, maintenance of chromosome integrity and completion of cytokinesis.	227
405365	pfam14662	KASH_CCD	Coiled-coil region of CCDC155 or KASH. This coiled-coil region is found in the central part of KASH or Klarsicht/ANC-1/Syne/homology proteins. KASH proteins are meiosis-specific proteins that localize at telomeres and interact with SUN1, thus being implicated in meiotic chromosome dynamics and homolog pairing.	191
405366	pfam14663	RasGEF_N_2	Rapamycin-insensitive companion of mTOR RasGEF_N domain. Rictor appears to serve as a scaffolding protein that is important for maintaining mTORC2 integrity. The mammalian target of rapamycin (mTOR) is a conserved Ser/Thr kinase that forms two functionally distinct complexes, mTORC1 and mTORC2, important for nutrient and growth-factor signalling. This region is the more conserved central section that may include several individual domains. Rictor can be inhibited in the short-term by rapamycin.	107
405367	pfam14664	RICTOR_N	Rapamycin-insensitive companion of mTOR, N-term. Rictor appears to serve as a scaffolding protein that is important for maintaining mTORC2 integrity. The mammalian target of rapamycin (mTOR) is a conserved Ser/Thr kinase that forms two functionally distinct complexes, mTORC1 and mTORC2, important for nutrient and growth-factor signalling. This region is the N-terminal conserved section that may include several individual domains. Rictor can be inhibited in the short-term by rapamycin.	372
405368	pfam14665	RICTOR_phospho	Rapamycin-insensitive companion of mTOR, phosphorylation-site. Rictor appears to serve as a scaffolding protein that is important for maintaining mTORC2 integrity. The mammalian target of rapamycin (mTOR) is a conserved Ser/Thr kinase that forms two functionally distinct complexes, mTORC1 and mTORC2, important for nutrient- and growth-factor signalling. This short region is the phosphorylation site. Rictor does interact with 14-3-3 in a Thr1135-dependent manner. Rictor can be inhibited by short-term rapamycin treatment showing that Thr1135 is an mTORC1-regulated site.	112
405369	pfam14666	RICTOR_M	Rapamycin-insensitive companion of mTOR, middle domain. Rictor appears to serve as a scaffolding protein that is important for maintaining mTORC2 integrity. The mammalian target of rapamycin (mTOR) is a conserved Ser/Thr kinase that forms two functionally distinct complexes, mTORC1 and mTORC2, important for nutrient and growth-factor signalling. This region is the more conserved central section that may include several individual domains. Rictor can be inhibited in the short-term by rapamycin.	104
405370	pfam14667	Polysacc_synt_C	Polysaccharide biosynthesis C-terminal domain. This family represents the C-terminal integral membrane region of polysaccharide biosynthesis proteins.	142
405371	pfam14668	RICTOR_V	Rapamycin-insensitive companion of mTOR, domain 5. Rictor appears to serve as a scaffolding protein that is important for maintaining mTORC2 integrity. The mammalian target of rapamycin (mTOR) is a conserved Ser/Thr kinase that forms two functionally distinct complexes, mTORC1 and mTORC2, important for nutrient and growth-factor signalling. These long eukaryotic proteins carry several well-conserved domains, and this is No.5.	71
373208	pfam14669	Asp_Glu_race_2	Putative aspartate racemase. This is a small family of vertebrate putative aspartate racemases. The family lies on TOPAZ 1 proteins.	176
405372	pfam14670	FXa_inhibition	Coagulation Factor Xa inhibitory site. This short domain on coagulation enzyme factor Xa is found to be the target for a potent inhibitor of coagulation, TAK-442.	36
405373	pfam14671	DSPn	Dual specificity protein phosphatase, N-terminal half. The active core of the dual specificity protein phosphatase is made up of two globular domains both with the DSP-like fold. This family represents the N-terminal half of the core. These domains are arranged in tandem, and are associated via an extensive interface to form a single globular whole. The conserved PTP signature motif (Cys-[X]5-Arg) that defines the catalytic centre of all PTP-family members is located within the C-terminal domain, family DSPc, pfam00782. Although the centre of the catalytic site is formed from DSPc, two loops from the N-terminal domain, DSPn, also contribute to the catalytic site, facilitating peptide substrate specificity.	138
405374	pfam14672	LCE	Late cornified envelope. This is a family of late cornified envelope proteins that are expressed in skin.	77
405375	pfam14673	DUF4459	Domain of unknown function (DUF4459). This family appears only on sequences from Salmonella spp. These sequences also all carry a YARHG domain, pfam13308.	159
405376	pfam14674	FANCI_S1-cap	FANCI solenoid 1 cap. This is the solenoid 1 cap (S1-cap) domain of the Fanconi anemia group I protein.	53
405377	pfam14675	FANCI_S1	FANCI solenoid 1. This is the solenoid 1 (S1) domain of the Fanconi anemia group I protein.	221
405378	pfam14676	FANCI_S2	FANCI solenoid 2. This is the solenoid 2 (S2) domain of the Fanconi anemia group I protein.	152
405379	pfam14677	FANCI_S3	FANCI solenoid 3. This is the solenoid 3 (S3) domain of the Fanconi anemia group I protein.	217
405380	pfam14678	FANCI_S4	FANCI solenoid 4. This is the solenoid 4 (S4) domain of the Fanconi anemia group I protein.	244
405381	pfam14679	FANCI_HD1	FANCI helical domain 1. This is the helical domain 1 (HD1) of the Fanconi anemia group I protein.	86
405382	pfam14680	FANCI_HD2	FANCI helical domain 2. This is the helical domain 2 (HD2) of the Fanconi anemia group I protein.	237
405383	pfam14681	UPRTase	Uracil phosphoribosyltransferase. This family includes the enzyme uracil phosphoribosyltransferase (EC:2.4.2.9). This enzyme catalyzes the first step of UMP biosynthesis.	204
379667	pfam14682	SPOB_ab	Sporulation initiation phospho-transferase B, C-terminal. Sporulation initiation phospho-transferase B or Spo0B is part of a phospho-relay that initiates sporulation in Bacillus subtilis. Spo0B is a two-domain protein consisting of an N-terminal alpha-helical hairpin domain and a C-terminal alpha/beta domain, represented by this family. Two subunits of Spo0B dimerize by a parallel association of helical hairpins to form a novel four-helix bundle from which the active histidine - involved in the auto-phosphorylation - protrudes. In the phospho-relay, the signal-receptor histidine kinases are dephosphorylated by a common response regulator, Spo0F. Spo0B then takes phosphorylated Spo0F as substrate thereby mediating the transfer of a phosphoryl group to Spo0A, the ultimate transcription factor.	113
405384	pfam14683	CBM-like	Polysaccharide lyase family 4, domain III. CBM-like is domain III of rhamnogalacturonan lyase (RG-lyase). The full-length protein specifically recognizes and cleaves alpha-1,4 glycosidic bonds between l-rhamnose and d-galacturonic acids in the backbone of rhamnogalacturonan-I, a major component of the plant cell wall polysaccharide, pectin. This domain possesses a jelly roll beta-sandwich fold structurally homologous to carbohydrate binding modules (CBMs), and it carries two sulfate ions and a hexa-coordinated calcium ion.	157
405385	pfam14684	Tricorn_C1	Tricorn protease C1 domain. This domain is the C1 core domain of tricorn protease. This is a mixed alpha-beta domain.	70
405386	pfam14685	Tricorn_PDZ	Tricorn protease PDZ domain. This domain is the PDZ domain of tricorn protease.	88
405387	pfam14686	fn3_3	Polysaccharide lyase family 4, domain II. FnIII-like is domain II of rhamnogalacturonan lyase (RG-lyase). The full-length protein specifically recognizes and cleaves alpha-1,4 glycosidic bonds between l-rhamnose and d-galacturonic acids in the backbone of rhamnogalacturonan-I, a major component of the plant cell wall polysaccharide, pectin. This domain displays an immunoglobulin-like or more specifically Fibronectin-III type fold and shows highest structural similarity to the C-terminal beta-sandwich subdomain of the pro-hormone/propeptide processing enzyme carboxypeptidase gp180 from duck. It serves to assist in producing the deep pocket, with domain III, into which the substrate fits.	74
405388	pfam14687	DUF4460	Domain of unknown function (DUF4460). This domain family is found in eukaryotes, and is typically between 103 and 119 amino acids in length. There is a conserved HPD sequence motif. There are two completely conserved residues (N and F) that may be functionally important.	104
405389	pfam14688	DUF4461	Domain of unknown function (DUF4461). This domain family is found in eukaryotes, and is approximately 310 amino acids in length.	306
405390	pfam14689	SPOB_a	Sensor_kinase_SpoOB-type, alpha-helical domain. Sporulation initiation phospho-transferase B or Spo0B is part of a phospho-relay that initiates sporulation in Bacillus subtilis. Spo0B is a two-domain protein consisting of an N-terminal alpha-helical hairpin domain and a C-terminal alpha/beta domain. Two subunits of Spo0B dimerize by a parallel association of helical hairpins to form a novel four-helix bundle from which the active histidine - involved in the auto-phosphorylation - protrudes. In the phospho-relay, the signal-receptor histidine kinases are dephosphorylated by a common response regulator, Spo0F. Spo0B then takes phosphorylated Spo0F as substrate thereby mediating the transfer of a phosphoryl group to Spo0A, the ultimate transcription factor. The exact function of this alpha-helical domain is not known; it does not always occur just as the N-terminal domain of SPOB_ab, pfam14682. SCOP describes this domain as a histidine kinase-like fold lacking the kinase ATP-binding site.	60
379669	pfam14690	zf-ISL3	zinc-finger of transposase IS204/IS1001/IS1096/IS1165. 	47
405391	pfam14691	Fer4_20	Dihydropyrimidine dehydrogenase domain II, 4Fe-4S cluster. Domain II of the enzyme dihydropyrimidine dehydrogenase binds FAD. Dihydropyrimidine dehydrogenase catalyzes the first and rate-limiting step of pyrimidine degradation by converting pyrimidines to the corresponding 5,6-dihydro compounds. This domain carries two Fe4-S4 clusters.	113
405392	pfam14692	DUF4462	Domain of unknown function (DUF4462). This domain family is found in eukaryotes, and is approximately 30 amino acids in length.	28
405393	pfam14693	Ribosomal_TL5_C	Ribosomal protein TL5, C-terminal domain. This family contains the C-terminal domain of ribosomal protein TL5. The N-terminal domain, which binds to 5S rRNA, is contained in family Ribosomal_L25p, pfam01386. Full length (N- and C-terminal domain) homologs of TL5 are also known as CTC proteins. TL5 or CTC are not found in Eukarya or Archaea. In some Bacteria, including E. coli, this ribosomal subunit occurs as a single domain protein (named Ribosomal subunit L25), where the only domain is homologous to TL5 N-terminal domain (hence included in family pfam01386). The function of the C-terminal domain of TL5 is at present unknown.	84
405394	pfam14694	LINES_N	Lines N-terminus. This family represents the N-terminus of protein lines. In Drosophila this protein is involved in embryonic segmentation and may function as a transcriptional regulator.	350
405395	pfam14695	LINES_C	Lines C-terminus. This family represents the C-terminus of protein lines. In Drosophila this protein is involved in embryonic segmentation and may function as a transcriptional regulator.	37
405396	pfam14696	Glyoxalase_5	Hydroxyphenylpyruvate dioxygenase, HPPD, N-terminal. This domain is one of two barrel-shaped regions that together form the active enzyme, 4-hydroxyphenylpyruvic acid dioxygenase, EC:1.13.11.27. As can be deduced from the disposition of the various Glyoxalase families, _2, _3 and _4 in Pfam, pfam00903, pfam12681, pfam13468, pfam13669, these two regions are similar enough to be indicative of a gene-duplication event. At the individual sequence level slight differences in conformation have given rise to slightly different functions. In the case of UniProt:P80064, 4-hydroxyphenylpyruvic acid dioxygenase catalyzes the formation of homogentisate from 4-hydroxyphenylpyruvate, and the pyruvate part of the HPPD substrate (4-hydroxyphenylpyruvate), derived from L-tyrosine, and the O2 molecule occupy the three free coordination sites of the catalytic iron atom in the C-terminal domain. In plants and photosynthetic bacteria, the tyrosine degradation pathway is crucial because homogentisate, a tyrosine degradation product, is a precursor for the biosynthesis of photosynthetic pigments, such as quinones or tocopherols.	138
405397	pfam14697	Fer4_21	4Fe-4S dicluster domain. Superfamily includes proteins containing domains which bind to iron-sulfur clusters. Members include bacterial ferredoxins, various dehydrogenases, and various reductases. Structure of the domain is an alpha-antiparallel beta sandwich. Domain contains two 4Fe4S clusters.	59
405398	pfam14698	ASL_C2	Argininosuccinate lyase C-terminal. This domain is found at the C-terminus of argininosuccinate lyase.	68
405399	pfam14699	hGDE_N	N-terminal domain from the human glycogen debranching enzyme. This domain is found at the very N-terminus of eukaryotic variants of the glycogen debranching enzyme (GDE), where it is immediately followed by the aldolase-like domain. The eukaryotic GDE performs two functions: 4-alpha-D-glucanotransferase, EC:2.4.1.25, and Amylo-alpha-1,6-glucosidase, EC:3.2.1.33, performed, respectively, by the N- and C-terminal halves of the eukaryotic GDE enzyme. The domain is involved in the glucosyltransferase activity, probably as a substrate-binding module (by analogy with other glucosyltransferases).	87
405400	pfam14700	RPOL_N	DNA-directed RNA polymerase N-terminal. This is the N-terminal domain of DNA-directed RNA polymerase. This domain has a role in interaction with regions of upstream promoter DNA and the nascent RNA chain, leading to the processivity of the enzyme. In order to make mRNA transcripts the RNA polymerase undergoes a transition from the initiation phase (which only makes short fragments of RNA) to an elongation phase. This domain undergoes a structural change in the transition from initiation to elongation phase. The structural change results in abolition of the promoter binding site, creation of a channel accommodating the heteroduplex in the active site and formation of an exit tunnel which the RNA transcript passes through after peeling off the heteroduplex.	286
405401	pfam14701	hDGE_amylase	Glycogen debranching enzyme, glucanotransferase domain. This is a glucanotransferase catalytic domain of the eukaryotic variant of the glycogen debranching enzyme (GDE). The eukaryotic GDEs perform two functions: 4-alpha-D-glucanotransferase, EC:2.4.1.25, and Amylo-alpha-1,6-glucosidase, EC:3.2.1.33, performed, respectively, by the N- and C-terminal halves of eukaryotic GDE enzymes. The domain is a catalytic domain responsible for the glucanotransferase function. It belongs to the alpha-amylase clan and is predicted to have a structure of an 8-stranded alpha/beta barrel (TIM barrel) where strands are interrupted by long loops and additional mini-domains. In most other amylases, the catalytic domain is followed by a beta-barrel substrate binding domain, but presence of such a domain cannot be verified in the human (and other eukaryotic) GDE enzymes.	439
405402	pfam14702	hGDE_central	Central domain of human glycogen debranching enzyme. This is a central domain of the eukaryotic variant of the glycogen debranching enzyme (GDE). The eukaryotic GDE performs two functions: 4-alpha-D-glucanotransferase, EC:2.4.1.25, and Amylo-alpha-1,6-glucosidase, EC:3.2.1.33, performed, respectively, by the N- and C-terminal halves of the eukaryotic GDE enzyme. This central domain follows the glucanotransferase domain and precedes the glucosidase (GDE_N) domain. It is very likely that the current definition contains two or more domains. By analogy with bacterial GDEs, this domain should be involved in substrate-binding for the N-terminal glucanotransferase, the C-terminal glucosidase, or both.	242
405403	pfam14703	PHM7_cyt	Cytosolic domain of 10TM putative phosphate transporter. PHM7_cyt is the predicted cytosolic domain of integral membrane proteins, such as yeast PHM7 and TM63A_HUMAN TRANSMEMBRANE PROTEIN 63A. This domain usually precedes the 7TM region, pfam02714, and follows a RSN1_TM, pfam13967. Fold recognition programs consistently and with high significance predict this domain to be distantly homologous to RNA binding proteins from the RRM clan.	163
405404	pfam14704	DERM	Dermatopontin. Members of this family mediate cell adhesion via cell surface integrin binding. They also induce haemagglutination and aggregation of amebocytes.	149
405405	pfam14705	Costars	Costars. This domain is found both alone and at the C-terminus of actin-binding Rho-activating protein (ABRA). It binds to actin, and in muscle regulates the actin cytoskeleton and cell motility. It has a winged helix-like fold consisting of three alpha-helices and four antiparallel beta strands. Unlike typical winged helix proteins it does not bind to DNA, but contains a hydrophobic groove which may be responsible for interaction with other proteins.	75
405406	pfam14706	Tnp_DNA_bind	Transposase DNA-binding. This domain occurs at the C-terminus of transposases including E. coli tnpA. TnpA encodes a transposase and an inhibitor protein, the inhibitor only differs from the transposase by the absence of the N-terminal 55 amino acids, which includes most of this domain. This domain consists of alpha helices and turns, and functions as a DNA-binding domain.	58
405407	pfam14707	Sulfatase_C	C-terminal region of aryl-sulfatase. 	122
405408	pfam14709	DND1_DSRM	double strand RNA binding domain from DEAD END PROTEIN 1. A C-terminal domain in human dead end protein 1 (DND1_HUMAN) homologous to double strand RNA binding domains (PF00035, PF00333).	80
405409	pfam14710	Nitr_red_alph_N	Respiratory nitrate reductase alpha N-terminal. This is the N-terminal tail of the respiratory nitrate reductase alpha chain. The nitrate reductase complex is a dimer of heterotrimers each consisting of an alpha, beta and gamma chain. The N-terminal tail of the alpha chain interacts with the beta chain and contributes to the stability of the heterotrimer.	37
405410	pfam14711	Nitr_red_bet_C	Respiratory nitrate reductase beta C-terminal. This domain occurs near the C-terminus of the respiratory nitrate reductase beta chain. The nitrate reductase complex is a dimer of heterotrimers each consisting of an alpha, beta and gamma chain. This domain plays a role in the interactions between subunits and shielding of the Fe-S clusters.	81
405411	pfam14712	Snapin_Pallidin	Snapin/Pallidin. This family of proteins includes Snapin, a protein associated with the SNARE complex, which mediates synaptic vesicle docking and fusion. It also includes the yeast snapin-like protein SNN1, which is a part of a complex involved in endosomal cargo sorting. The family also includes pallidin, a component of a complex involved in biogenesis of lysosome-related organelles.	89
405412	pfam14713	DUF4464	Domain of unknown function (DUF4464). This family of proteins is found in eukaryotes. Proteins in this family are typically between 224 and 241 amino acids in length. There is a conserved YID sequence motif.	229
405413	pfam14714	KH_dom-like	KH-domain-like of EngA bacterial GTPase enzymes, C-terminal. The KH-like domain at the C-terminus of the EngA subfamily of essential bacterial GTPases has a unique domain structure position. The two adjacent GTPase domains (GD1 and GD2), two domains of family MMR_HSR1, pfam01926, pack at either side of the C-terminal domain. This C-terminal domain resembles a KH domain but is missing the distinctive RNA recognition elements. Conserved motifs of the nucleotide binding site of GD1 are integral parts of the GD1-KH domain interface, suggesting the interactions between these two domains are directly influenced by the GTP/GDP cycling of the protein. In contrast, the GD2-KH domain interface is distal to the GDP binding site of GD2. This family has not been added to the KH clan since SCOP classifies it separately due to its missing the key KH motif/fold.	79
405414	pfam14715	FixP_N	N-terminal domain of cytochrome oxidase-cbb3, FixP. This is the N-terminal domain of FixP, the cytochrome oxidase type-cbb3. The exact function is not known.	47
405415	pfam14716	HHH_8	Helix-hairpin-helix domain. 	67
405416	pfam14717	DUF4465	Domain of unknown function (DUF4465). A large family of uncharacterized proteins mostly from human gut Bacteroides, but also some environmental and water bacteria (Planctomycetes) as well as metagenomic samples. Most proteins from this family are secreted or located on the outer surface and may participate in cell-cell interactions or cell-nutrient interactions. This function is supported by a solved structure of a Bacteroides ovatus homolog, which adopts a galactose binding (jelly-roll) beta barrel structure.	170
405417	pfam14718	SLT_L	Soluble lytic murein transglycosylase L domain. Soluble lytic murein transglycosylase (SLT) consists of three domains, an N-terminal U domain, an L domain (linker domain) and a C-terminal domain (C). The L domain may be involved in the interaction of the enzyme with peptidoglycan.	67
405418	pfam14719	PID_2	Phosphotyrosine interaction domain (PTB/PID). 	184
405419	pfam14720	NiFe_hyd_SSU_C	NiFe/NiFeSe hydrogenase small subunit C-terminal. This domain is found at the C-terminus of hydrogenase small subunits including periplasmic [NiFeSe] hydrogenase small subunit, uptake hydrogenase small subunit and periplasmic [NiFe] hydrogenase small subunit. This C-terminal domain binds two of the three iron-sulfur clusters in this enzyme.	79
405420	pfam14721	AIF_C	Apoptosis-inducing factor, mitochondrion-associated, C-term. This C-terminal domain appears to be a dimerization domain of the mitochondrial apoptosis-inducing factor 1 protein. The domain also appears at the C-terminus of FAD-dependent pyridine nucleotide-disulfide oxidoreductases. Apoptosis inducing factor (AIF) is a bifunctional mitochondrial flavoprotein critical for energy metabolism and induction of caspase-independent apoptosis. On reduction with NADH, AIF undergoes dimerization and forms tight, long-lived FADH2-NAD charge-transfer complexes proposed to be functionally important.	129
405421	pfam14722	KRAP_IP3R_bind	Ki-ras-induced actin-interacting protein-IP3R-interacting domain. This family includes the N-terminus of the actin-interacting protein sperm-specific antigen 2, or KRAP (Ki-ras-induced actin-interacting protein). This region is found to be the residues that interact with inositol 1,4,5-trisphosphate receptor (IP3R). KRAP was first localized as a membrane-bound form with extracellular regions suggesting it might be involved in the regulation of filamentous actin and signals from the outside of the cells. It has now been shown to be critical for the proper subcellular localization and function of IP3R. Inositol 1,4,5-trisphosphate receptor functions as the Ca2+ release channel on specialized endoplasmic reticulum membranes, so the subcellular localization of IP3R is crucial for its proper function.	143
405422	pfam14723	SSFA2_C	Sperm-specific antigen 2 C-terminus. This family includes the C-terminus of the actin-interacting protein sperm-specific antigen 2.	170
405423	pfam14724	mit_SMPDase	Mitochondrial-associated sphingomyelin phosphodiesterase. The GO annotation for this family indicates that it is a single-pass membrane protein, and it appears to be found in mitochondrial membranes. Sphingolipids play important roles in regulating cellular responses, and although mitochondria contain sphingolipids, direct regulation of their levels in mitochondria or mitochondria-associated membranes is mostly unclear. Sphingomyelin phosphodiesterases catalyze the hydrolysis of sphingomyelin to ceramide and phosphocholine, and these metabolites are involved in signalling pathways.	765
405424	pfam14725	DUF4466	Domain of unknown function (DUF4466). 	307
405425	pfam14726	RTTN_N	Rotatin, an armadillo repeat protein, centriole functioning. Rotatin and its homologs such as Ana3 in Drosophila are found to be essential for centriole function. A deficiency of rotatin in mice leads to randomized heart tube looping, defects in embryonic turning, and abnormal expression of HNF3beta, lefty, and nodal. Thus it is required for left-right and axial patterning. Ana3 - the Drosophila homolog - is present in centrioles and basal bodies, is required for the structural integrity of both centrioles and basal bodies and for centriole cohesion. Rotatin also localizes to centrioles and basal bodies and appears to be essential for cilia function. This family represents the N-terminal domain.	97
405426	pfam14727	PHTB1_N	PTHB1 N-terminus. This family includes the N-terminus of PTHB1 protein. This protein forms a part of the BBSome complex, which is required for ciliogenesis.	413
405427	pfam14728	PHTB1_C	PTHB1 C-terminus. This family includes the C-terminus of PTHB1 protein. This protein forms a part of the BBSome complex, which is required for ciliogenesis.	370
291399	pfam14729	DUF4467	Domain of unknown function with cystatin-like fold (DUF4467). Large family of predicted lipoproteins from Gram-positive bacteria. Experimentally determined structure shows a cystatin-like fold, allowing us to classify this family in the NTF2 clan, despite lack of any detectable sequence similarity between members of this family and other families in this clan.	94
405428	pfam14730	DUF4468	Domain of unknown function (DUF4468) with TBP-like fold. A large family of (predicted) secreted proteins with unknown functions from human gut and oral cavity. Typically forms an N-terminal domain with FMN binding domain at the C-terminus. Experimentally determined 3D structure of this domain shows a variant of a TATA box binding-like fold, but no detectable sequence similarity to other proteins with this fold.	88
291401	pfam14731	Staphopain_pro	Staphopain proregion. This domain is the proregion of the cysteine protease staphopain. Like many papain type peptidases, staphopain is synthesized as an inactive precursor and cleavage of the proregion is required for activation. This proregion has a half-barrel or barrel-sandwich hybrid fold. The proregion blocks the active site cleft of the mature enzyme on one side of the nucleophilic cysteine.	169
405429	pfam14732	UAE_UbL	Ubiquitin/SUMO-activating enzyme ubiquitin-like domain. This is the C-terminal domain of ubiquitin-activating enzyme and SUMO-activating enzyme 2. It is structurally similar to ubiquitin. This domain is involved in E1-SUMO-thioester transfer to the SUMO E2 conjugating protein.	88
405430	pfam14733	ACDC	AP2-coincident C-terminal. This family is found at the C-terminus of apicomplexan proteins containing the AP2 domain (pfam00847).	89
405431	pfam14734	DUF4469	Domain of unknown function (DUF4469) with IG-like fold. A C-terminal domain in a large family of (predicted) secreted proteins with unknown functions from human gut Bacteroides.	101
405432	pfam14735	HAUS4	HAUS augmin-like complex subunit 4. This family includes HAUS augmin-like complex subunit 4. The HAUS augmin-like complex contributes to mitotic spindle assembly, maintenance of chromosome integrity and completion of cytokinesis.	235
405433	pfam14736	N_Asn_amidohyd	Protein N-terminal asparagine amidohydrolase. This family of enzymes catalyzes the deamidation of N-terminal asparagines in peptides and proteins to aspartic acid.	267
405434	pfam14737	DUF4470	Domain of unknown function (DUF4470). This family is conserved from fungi to Metazoa and includes plants. The function is not known, but several members have zinc-finger domain, zf-MYND, pfam01753, at their very C-terminus. Others are also associated with DUF1279, pfam06916.	97
405435	pfam14738	PaaSYMP	Solute carrier (proton/amino acid symporter), TRAMD3 or PAT1. PAT1 (proton amino acid transporter 1), also known as TRAMD3 of AAT-1, is the molecular correlate of the intestinal imino acid carrier. It is a proton-amino acid co-transporter having a stoichiometry of 1:1. Due to its mechanism, PAT1 activity increases at acidic pH, which correlates well with the acidic micro-climate close to the brush-border in the intestine. Glycine, proline, and alanine are the preferred substrates of the transporter. The maximum velocity is similar for the three substrates. All substrates are transported with low affinity, showing Km values in the range of 2-10 mM. The transporter does not discriminate between L- and D-isoforms of these amino acids; in addition, beta-alanine is transported with similar affinity as alpha-alanine. Similar to the IMINO transporter, the amino acid analog MeAIB is recognized by PAT1. The transporter is strongly expressed in the small intestine, colon, kidney, and brain.	153
405436	pfam14739	DUF4472	Domain of unknown function (DUF4472). This family is specific to the Chordates. Some members also carry Kinesin-motor domains at their N-terminus, Kinesin, pfam00225.	106
405437	pfam14740	DUF4471	Domain of unknown function (DUF4471). This family is conserved from fungi to Metazoa and includes plants. The function is not known, but several members have zinc-finger domain, zf-MYND, pfam01753, at their very C-terminus. Others are also associated with DUF1279, pfam06916. This domain is more C-terminal in many members to DUF4470, pfam14737.	303
373261	pfam14741	GH114_assoc	N-terminal glycosyl-hydrolase-114-associated domain. This short domain is also a very small family found at the N-terminus of GH114, glycosyl-hydrolases.	126
405438	pfam14742	GDE_N_bis	N-terminal domain of (some) glycogen debranching enzymes. This domain is found at the N-terminus of some glycogen debranching enzymes and is usually followed by the GDE_C (PF06202); in this sense it is analogous (but probably not homologous) to the GDE_N (PF12439). Its exact function is unknown.	193
405439	pfam14743	DNA_ligase_OB_2	DNA ligase OB-like domain. This domain has an OB-like fold, but does not appear to be related to pfam03120. It is found at the C-terminus of the ATP dependent DNA ligase domain pfam01068.	60
405440	pfam14744	WASH-7_mid	WASH complex subunit 7. This family is the central, conserved region of proteins that form subunit 7 of the WASH complex. In species such as Drosophila this protein is the only component of the 'complex'. This complex is a nucleation promoting factor necessary for the activation of Arp2/3 that nucleates and organizes actin filaments by associating with a pre-existing filament to induce the assembly of a branching filament. WASH thus effectively nucleates actin on endosomes.	346
405441	pfam14745	WASH-7_N	WASH complex subunit 7, N-terminal. This family is the conserved N-terminal region of proteins that form subunit 7 of the WASH complex. In species such as Drosophila this protein is the only component of the 'complex'. This complex is a nucleation promoting factor necessary for the activation of Arp2/3 that nucleates and organizes actin filaments by associating with a pre-existing filament to induce the assembly of a branching filament. WASH thus effectively nucleates actin on endosomes.	566
405442	pfam14746	WASH-7_C	WASH complex subunit 7, C-terminal. This family is the conserved C-terminal region of proteins that form subunit 7 of the WASH complex. In species such as Drosophila this protein is the only component of the 'complex'. This complex is a nucleation promoting factor necessary for the activation of Arp2/3 that nucleates and organizes actin filaments by associating with a pre-existing filament to induce the assembly of a branching filament. WASH thus effectively nucleates actin on endosomes. The C-terminus is predicted to include a transmembrane region.	175
373266	pfam14747	DUF4473	Domain of unknown function (DUF4473). This short family is largely confined to Caenorhabditis proteins. The function is not known. There are two well-conserved aspartate residues.	78
405443	pfam14748	P5CR_dimer	Pyrroline-5-carboxylate reductase dimerization. Pyrroline-5-carboxylate reductase consists of two domains, an N-terminal catalytic domain (pfam03807) and a C-terminal dimerization domain. This is the dimerization domain.	105
405444	pfam14749	Acyl-CoA_ox_N	Acyl-coenzyme A oxidase N-terminal. Acyl-coenzyme A oxidase consists of three domains. An N-terminal alpha-helical domain, a beta sheet domain (pfam02770) and a C-terminal catalytic domain (pfam01756). This entry represents the N-terminal alpha-helical domain.	120
405445	pfam14750	INTS2	Integrator complex subunit 2. This family of proteins are subunits of the integrator complex involved in snRNA transcription and processing.	1048
405446	pfam14751	DUF4474	Domain of unknown function (DUF4474). Domain found at the N-terminus of a few families of uncharacterized Clostridia proteins. Typically followed by a proline-rich domain or other kinds of repeats.	239
405447	pfam14752	RBP_receptor	Retinol binding protein receptor. Proteins in this family function as retinol binding protein receptors.	602
405448	pfam14753	FAM221	Protein FAM221A/B. This family of proteins is found in eukaryotes. Proteins in this family are typically between 99 and 305 amino acids in length.	195
291424	pfam14754	IFR3_antag	Papain-like auto-proteinase. The replicase polyproteins of the Nidoviruses such as, porcine arterivirus PRRSV, equine arterivirus EAV, human coronavirus 229E, and severe acute respiratory syndrome coronavirus (SARS-CoV), are predicted to be cleaved into 14 non-structural proteins (nsps) by the nsp4 main proteinase pfam05579 and three accessory proteinases residing in nsp1-alpha, nsp1-beta and nsp2. This family is the two nsp1 proteins that together act in a papain-like way to separate off the rest of the various functional domains of the polyprotein. Once inside the host cell, this nsp1 interferes with the regulation of interferon, thereby enabling the virus to replicate.	249
373271	pfam14755	NSP2_middle	Middle region of RNA-arterivirus nonstructural protein 2 (nsp2). This domain represents the middle region of nsp2 of the RNA-arteriviruses, such as porcine arterivirus PRRSV and equine arterivirus EAV, C-terminal to the peptidase C33 family catalytic domain.	148
258893	pfam14756	Pdase_C33_assoc	Peptidase_C33-associated domain. The nsps or non-structural protein subunits of the arteriviral polyproteins such as porcine arterivirus PRRSV and equine arterivirus EAV are auto-cleaved into functional units. The function of this particular domain is not known.	147
373272	pfam14757	NSP2-B_epitope	Immunogenic region of nsp2 protein of arterivirus polyprotein. This domain is in a non-essential part of the nsp2 (non-structural protein) subunit section of the arterivirus polyprotein. This domain carries seven small sequence-regions that are predicted to be potential B-cell epitopes.	272
373273	pfam14758	NSP2_assoc	Non-essential region of nsp2 of arterivirus polyprotein. This non-essential region of the nsp2 subunit of the arterivirus polyprotein of viruses such as porcine arterivirus PRRSV and equine arterivirus EAV may offer immunogenic surfaces to B-cells. It is associated with Peptidase_C33, pfam05412.	198
405449	pfam14759	Reductase_C	Reductase C-terminal. This domain occurs at the C-terminus of various reductase enzymes, including putidaredoxin reductase, ferredoxin reductase, 3-phenylpropionate/cinnamic acid dioxygenase ferredoxin--NAD(+) reductase component, benzene 1,2-dioxygenase system ferredoxin--NAD(+) reductase subunit, rhodocoxin reductase, biphenyl dioxygenase system ferredoxin--NAD(+) reductase component, rubredoxin-NAD(+) reductase and toluene 1,2-dioxygenase system ferredoxin--NAD(+) reductase component. In putidaredoxin reductase this domain is involved in dimerization. In the FAD-containing NADH-ferredoxin reductase (BphA4) it is responsible for interaction with the Rieske-type [2Fe-2S] ferredoxin (BphA3).	83
405450	pfam14760	Rnk_N	Rnk N-terminus. This domain occurs at the N-terminus of Rnk, an RNA polymerase-interacting protein of the GreA/GreB family (pfam01272). It has a coiled coil structure.	41
405451	pfam14761	HPS3_N	Hermansky-Pudlak syndrome 3. This domain is at the N-terminus of these vertebrate proteins. This region carries the clathrin-binding motif LLDFE at residues 172-176 in human HPS3. There is also reference to a human Mendelian disease at MIM:614072.	211
405452	pfam14762	HPS3_Mid	Hermansky-Pudlak syndrome 3, middle region. This domain is downstream of the N-terminus of these vertebrate proteins. This region carries a number of tyrosine sorting motifs and one of two di-leucine sorting boxes at residues 542-548 as well as a peroxisomal matrix targeting motif at residues 614-623 in human HPS3. There is also reference to a human Mendelian disease at MIM:614072.	387
405453	pfam14763	HPS3_C	Hermansky-Pudlak syndrome 3, C-terminal. This domain is downstream of the mid domain family, pfam14762, of these vertebrate proteins. This region carries a number of tyrosine sorting motifs and the second of two di-leucine sorting boxes at residues 711-717 as well as the ER membrane-retention signal KKPL at residues 1000-1003 in human HPS3. There is also reference to a human Mendelian disease at MIM:614072.	350
405454	pfam14764	SPG48	AP-5 complex subunit, vesicle trafficking. This family would appear to be the second of the two larger subunits of the fifth Adaptor-Protein complex, AP-5. Adaptor protein (AP) complexes facilitate the trafficking of cargo from one membrane compartment of the cell to another by recruiting other proteins to particular types of vesicles. AP-5 is involved in trafficking proteins from endosomes towards other membranous compartments. There are genetic links between AP-5 and hereditary spastic paraplegia, a group of human genetic disorders characterized by progressive spasticity in the lower limbs.	118
405455	pfam14765	PS-DH	Polyketide synthase dehydratase. This is the dehydratase domain of polyketide synthases. Structural analysis shows these DH domains are double hotdogs in which the active site contains a histidine from the N-terminal hotdog and an aspartate from the C-terminal hotdog. Studies have uncovered that a substrate tunnel formed between the DH domains may be essential for loading substrates and unloading products.	294
405456	pfam14766	RPA_interact_N	Replication protein A interacting N-terminal. This family of proteins represents the N-terminal domain of replication protein A (RPA) interacting protein. RPA interacting protein is involved in the import of RPA into the nucleus. The N-terminal domain is responsible for interaction with importin beta.	38
405457	pfam14767	RPA_interact_M	Replication protein A interacting middle. This family of proteins represents the middle domain of replication protein A (RPA) interacting protein. RPA interacting protein is involved in the import of RPA into the nucleus. This domain is responsible for interaction with RPA.	82
405458	pfam14768	RPA_interact_C	Replication protein A interacting C-terminal. This family of proteins represents the C-terminal domain of replication protein A (RPA) interacting protein. RPA interacting protein is involved in the import of RPA into the nucleus. The C-terminal domain is a putative zinc finger.	79
405459	pfam14769	CLAMP	Flagellar C1a complex subunit C1a-32. This family represents one small subunit, C1a-32, of the C1a projection (the seventh projection of flagella). Numerous studies have indicated that each of the seven projections associated with the central pair of microtubules in flagella plays a distinct role in regulating eukaryotic ciliary/flagellar motility. The C1a projection is a complex of proteins including PF6, C1a-86, C1a-34, C1a-32, C1a-18, and calmodulin. The C1a projection is involved in modulating flagellar beat frequency and this is mediated via the C1a-34, C1a-32, and C1a-18 sub-complex by modulating the activity of both the inner and outer dynein arms.	97
405460	pfam14770	TMEM18	Transmembrane protein 18. The function of this family is not known, however it is predicted to be a three-pass membrane protein.	118
405461	pfam14771	DUF4476	Domain of unknown function (DUF4476). 	91
405462	pfam14772	NYD-SP28	Sperm tail. NYD-SP28 is expressed in a development-dependent manner, localized in spermatogenic cell cytoplasm and human spermatozoa tail. It is post-translationally modified during sperm capacitation and ultimately contributes to the success of fertilisation.	101
405463	pfam14773	VIGSSK	Helicase-associated putative binding domain, C-terminal. The function of this short, serine-rich C-terminal region is not known. However, as it is frequently found at the very C-terminus of P-loop containing nucleoside triphosphate hydrolases, it might possibly be a binding domain.	62
405464	pfam14774	FAM177	FAM177 family. This family of proteins is found in eukaryotes. Proteins in this family are typically between 134 and 205 amino acids in length.	117
405465	pfam14775	NYD-SP28_assoc	Sperm tail C-terminal domain. NYD-SP28 is expressed in a development-dependent manner, localized in spermatogenic cell cytoplasm and human spermatozoa tail. It is post-translationally modified during sperm capacitation and ultimately contributes to the success of fertilisation. This short region is found at the very C-terminus of members of family NYD-SP28, pfam14772.	60
405466	pfam14776	UNC-79	Cation-channel complex subunit UNC-79. This family is a component of a cation-channel complex.	525
405467	pfam14777	BBIP10	Cilia BBSome complex subunit 10. The BBSome (so-named after the association with Bardet-Biedl syndrome) is a complex of 8 subunits that lies at the base of the flagellar microtubule structure. The precise function of all the individual components in cilia formation is unclear, however they function to promote loading of cargo to the ciliary axoneme. BBIP10 localizes to the primary cilium, and is present exclusively in ciliated organisms. It is required for cytoplasmic microtubule polymerization and acetylation, two functions not shared with any other BBSome subunits. BBIP10 physically interacts with HDAC6. BBSome-bound BBIP10 may therefore function to couple acetylation of axonemal microtubules and ciliary membrane growth. The primary cilium, a slim microtubule-based organelle that projects from the surface of vertebrate cells has crucial roles in vertebrate development and human genetic diseases. Cilia are required for the response to developmental signals, and evidence is accumulating that the primary cilium is specialized for Hedgehog (Hh) signal transduction. Formation of cilia, in turn, is regulated by other signalling pathways, possibly including the planar cell polarity pathway. The connections between cilia and developmental signalling have begun to clarify the basis of human diseases associated with ciliary dysfunction.	55
405468	pfam14778	ODR4-like	Olfactory receptor 4-like. In C.elegans, odr-4 and odr-8 are required for localising a subset of odorant GPCRs to the cilia of olfactory neurons. Olfactory receptors (ORs) are synthesized in endoplasmic reticulum of the olfactory neurons, trafficked to the cell surface membrane and transported to the tip of the olfactory cilium, where they bind with odorants. Various accessory proteins are required for proper targeting of different ORs to the cell membrane. ODR-4 was the first accessory protein to be described.	368
405469	pfam14779	BBS1	Ciliary BBSome complex subunit 1. The BBSome (so-named after the association with Bardet-Biedl syndrome) is a complex of 8 subunits that lies at the base of the flagellar microtubule structure. The precise function of all the individual components in cilia formation is unclear, however they function to promote loading of cargo to the ciliary axoneme. The primary cilium, a slim microtubule-based organelle that projects from the surface of vertebrate cells has crucial roles in vertebrate development and human genetic diseases. Cilia are required for the response to developmental signals, and evidence is accumulating that the primary cilium is specialized for Hedgehog (Hh) signal transduction. Formation of cilia, in turn, is regulated by other signalling pathways, possibly including the planar cell polarity pathway. The connections between cilia and developmental signalling have begun to clarify the basis of human diseases associated with ciliary dysfunction. BBS1 predominantly localizes to the basal body and/or transitional zone of ciliated cells. It has been found in a heptameric complex with BBS2, BBS5, BBS7, BBS8, and BBS9, termed the BBSome. Mutations in BBS1 can lead to retinal inadequacy.	254
405470	pfam14780	DUF4477	Domain of unknown function (DUF4477). 	187
405471	pfam14781	BBS2_N	Ciliary BBSome complex subunit 2, N-terminal. The BBSome (so-named after the association with Bardet-Biedl syndrome) is a complex of 8 subunits that lies at the base of the flagellar microtubule structure. The precise function of all the individual components in cilia formation is unclear, however they function to promote loading of cargo to the ciliary axoneme. The primary cilium, a slim microtubule-based organelle that projects from the surface of vertebrate cells has crucial roles in vertebrate development and human genetic diseases. Cilia are required for the response to developmental signals, and evidence is accumulating that the primary cilium is specialized for Hedgehog (Hh) signal transduction. Formation of cilia, in turn, is regulated by other signalling pathways, possibly including the planar cell polarity pathway. The connections between cilia and developmental signalling have begun to clarify the basis of human diseases associated with ciliary dysfunction. BBS2 is one of the three Bardet-Biedl syndrome subunits that is required for leptin receptor signalling in the hypothalamus, and BBS2 and 4 are also required for the localization of somatostatin receptor 3 and melanin-concentrating hormone receptor 1 into neuronal cilia.	107
405472	pfam14782	BBS2_C	Ciliary BBSome complex subunit 2, C-terminal. The BBSome (so-named after the association with Bardet-Biedl syndrome) is a complex of 8 subunits that lies at the base of the flagellar microtubule structure. The precise function of all the individual components in cilia formation is unclear, however they function to promote loading of cargo to the ciliary axoneme. The primary cilium, a slim microtubule-based organelle that projects from the surface of vertebrate cells has crucial roles in vertebrate development and human genetic diseases. Cilia are required for the response to developmental signals, and evidence is accumulating that the primary cilium is specialized for Hedgehog (Hh) signal transduction. Formation of cilia, in turn, is regulated by other signalling pathways, possibly including the planar cell polarity pathway. The connections between cilia and developmental signalling have begun to clarify the basis of human diseases associated with ciliary dysfunction. BBS2 is one of the three Bardet-Biedl syndrome subunits that is required for leptin receptor signalling in the hypothalamus, and BBS2 and 4 are also required for the localization of somatostatin receptor 3 and melanin-concentrating hormone receptor 1 into neuronal cilia.	429
405473	pfam14783	BBS2_Mid	Ciliary BBSome complex subunit 2, middle region. The BBSome (so-named after the association with Bardet-Biedl syndrome) is a complex of 8 subunits that lies at the base of the flagellar microtubule structure. The precise function of all the individual components in cilia formation is unclear, however they function to promote loading of cargo to the ciliary axoneme. The primary cilium, a slim microtubule-based organelle that projects from the surface of vertebrate cells has crucial roles in vertebrate development and human genetic diseases. Cilia are required for the response to developmental signals, and evidence is accumulating that the primary cilium is specialized for Hedgehog (Hh) signal transduction. Formation of cilia, in turn, is regulated by other signalling pathways, possibly including the planar cell polarity pathway. The connections between cilia and developmental signalling have begun to clarify the basis of human diseases associated with ciliary dysfunction. BBS2 is one of the three Bardet-Biedl syndrome subunits that is required for leptin receptor signalling in the hypothalamus, and BBS2 and 4 are also required for the localization of somatostatin receptor 3 and melanin-concentrating hormone receptor 1 into neuronal cilia.	108
405474	pfam14784	ECSIT_C	C-terminal domain of the ECSIT protein. This family represents the C-terminal domain of the evolutionarily conserved signaling intermediate in Toll pathway protein, an adapter protein of the Toll-like and IL-1 receptor signaling pathway, which is involved in the activation of NF-kappa-B via MAP3K1. This domain is missing in isoform 2. Fold recognition suggests that this domain may be distantly homologous to the pleckstrin homology domain.	131
405475	pfam14785	MalF_P2	Maltose transport system permease protein MalF P2 domain. This is the second periplasmic domain (P2 domain) of the maltose transport system permease protein MalF.	164
405476	pfam14786	Death_2	Tube Death domain. This Tube-Death domain has an insertion between helices 2 and 3, and a C-terminal tail compared with the Death domain of Pelle proteins in Drosophila. The two N-terminal Death domains of the serine/threonine kinase Pelle and the adaptor protein Tube interact to form a six-helix bundle fold arranged in an open-ended linear array with plastic interfaces mediating their interactions. This interaction leads to the nuclear translocation of the transcription factor Dorsal and activation of zygotic patterning genes during Drosophila embryogenesis, and is assisted by the significant and indispensable contacts in the heterodimer contributed by the insertion and C-terminal tail described above.	137
373297	pfam14787	zf-CCHC_5	GAG-polyprotein viral zinc-finger. 	36
405477	pfam14788	EF-hand_10	EF hand. 	50
405478	pfam14789	THDPS_M	Tetrahydrodipicolinate N-succinyltransferase middle. This is the middle domain of 2,3,4,5-tetrahydropyridine-2,6-dicarboxylate N-succinyltransferase.	41
339376	pfam14790	THDPS_N	Tetrahydrodipicolinate N-succinyltransferase N-terminal. This is the N-terminal domain of 2,3,4,5-tetrahydropyridine-2,6-dicarboxylate N-succinyltransferase.	167
405479	pfam14791	DNA_pol_B_thumb	DNA polymerase beta thumb. The catalytic region of DNA polymerase beta is split into three domains. An N-terminal fingers domain, a central palm domain and a C-terminal thumb domain. This entry represents the thumb domain.	63
405480	pfam14792	DNA_pol_B_palm	DNA polymerase beta palm. The catalytic region of DNA polymerase beta is split into three domains. An N-terminal fingers domain, a central palm domain and a C-terminal thumb domain. This entry represents the palm domain.	110
405481	pfam14793	DUF4478	Domain of unknown function (DUF4478). This domain is found in bacteria, and is approximately 110 amino acids in length. It is found in association with pfam03641 and pfam11892.	109
405482	pfam14794	DUF4479	Domain of unknown function (DUF4479). This domain family is found in bacteria, and is approximately 70 amino acids in length. The family is found in association with pfam01588.	71
373300	pfam14795	Leucyl-specific	Leucine-tRNA synthetase-specific domain. This short region is found only in leucyl-tRNA synthetases. It is flexibly linked to the enzyme-core by beta-ribbon structures.	56
405483	pfam14796	AP3B1_C	Clathrin-adaptor complex-3 beta-1 subunit C-terminal. This domain lies at the C-terminus of the clathrin-adaptor protein complex-3 beta-1 subunit. The AP-3 complex is associated with the Golgi region of the cell as well as with more peripheral structures. The AP-3 complex may be directly involved in trafficking to lysosomes or alternatively it may be involved in another pathway, but that mis-sorting in that pathway may indirectly lead to defects in pigment granules.	147
373302	pfam14797	SEEEED	Serine-rich region of AP3B1, clathrin-adaptor complex. This short low-complexity, highly serine-rich region lies on clathrin-adaptor complex 3 beta-1 subunit proteins, between family Adaptin_N, pfam01602 and a C-terminal domain, AP3B1_C,pfam14796.	125
405484	pfam14798	Ca_hom_mod	Calcium homeostasis modulator. This family of proteins control cytosolic calcium concentration. They are transmembrane proteins which may be pore-forming ion channels.	250
405485	pfam14799	FAM195	FAM195 family. 	98
405486	pfam14800	DUF4481	Domain of unknown function (DUF4481). 	293
405487	pfam14801	GCD14_N	tRNA methyltransferase complex GCD14 subunit N-term. This is the N-terminal domain of GCD14, itself a subunit of the tRNA methyltransferase complex that is required for 1-methyladenosine modification and maturation of initiator methionyl-tRNA. The exact function of the N-terminus is not known but it is necessary for maintaining the overall folding and for full enzymatic activity.	51
405488	pfam14802	TMEM192	TMEM192 family. The function of this family of transmembrane proteins is unknown. In vertebrates, proteins in this family are located in the lysosomal membrane and late endosome. In Arabidopsis, a member of this family has been found to weakly interact with FRIGIDA, a determinant of flowering time.	234
405489	pfam14803	Nudix_N_2	Nudix N-terminal. This domain occurs at the N-terminus of several Nudix (Nucleoside Diphosphate linked to X) hydrolases.	32
405490	pfam14804	Jag_N	Jag N-terminus. This domain is found at the N-terminus of proteins containing pfam13083 and pfam01424, including the jag proteins.	50
405491	pfam14805	THDPS_N_2	Tetrahydrodipicolinate N-succinyltransferase N-terminal. This is the N-terminal domain of 2,3,4,5-tetrahydropyridine-2,6-dicarboxylate N-succinyltransferase.	67
405492	pfam14806	Coatomer_b_Cpla	Coatomer beta subunit appendage platform. This family is found at the C-terminus of the coatomer beta subunit proteins (Beta-coat proteins). It is a platform domain on the appendage that carries a highly conserved tryptophan.	128
405493	pfam14807	AP4E_app_platf	Adaptin AP4 complex epsilon appendage platform. This domain is found at the C terminal of clathrin-adaptor epsilon subunit, and at the C-terminus of the appendage on the platform domain.	100
405494	pfam14808	TMEM164	TMEM164 family. This family of proteins is found in eukaryotes. Proteins in this family are typically between 214 and 330 amino acids in length. There are two conserved sequence motifs: LNPCH and DPF.	250
373310	pfam14809	TGT_C1	C1 domain of tRNA-guanine transglycosylase dimerization. This short region of the tRNA-guanine transglycosylase enzyme acts as the dimerization domain of the whole protein.	70
405495	pfam14810	TGT_C2	Patch-forming domain C2 of tRNA-guanine transglycosylase. Domain C2 of tRNA-guanine transglycosylase is formed by a four-stranded anti-parallel beta-sheet lined with two alpha helices. It has conserved basic residues on the surface of the beta-sheets as does the C-terminal domain PUA, pfam01472. The catalytic domain, TGT has conserved basic residues on the outer surface of the N-terminal three-stranded beta sheet, which closes the barrel, and it is postulated that these basic residues from the three domains form a continuous, positively charged patch to which the tRNA binds.	70
405496	pfam14811	TPD	Protein of unknown function TPD sequence-motif. This is a family of eukaryotic proteins of unknown function. A few members have an associated zinc-finger domain. All members carry a highly conserved TPD sequence-motif.	138
405497	pfam14812	PBP1_TM	Transmembrane domain of transglycosylase PBP1 at N-terminal. This is the N-terminal, transmembrane, domain of the transglycosylases (penicillin-binding proteins), the multi-domain membrane proteins essential for cell wall synthesis that are targeted by penicillin antibiotics. The TM domain is a single helix, several of whose residues lie in close proximity to hydrophobic residues in the TGT domain. The TM helix seems to be necessary for stabilizing the protein-membrane interaction, and the resulting orientation limits the interaction between PBP1b and lipid II in the membrane in a 2D lateral diffusion fashion.	85
405498	pfam14813	NADH_B2	NADH dehydrogenase 1 beta subcomplex subunit 2. This family represents an accessory subunit of the mitochondrial membrane respiratory chain NADH dehydrogenase (Complex I), that is believed not to be involved in catalysis.	69
405499	pfam14814	UB2H	Bifunctional transglycosylase second domain. UB2H is the second domain of the transglycosylases, or penicillin-binding proteins (PBP1bs), the multi-domain membrane proteins essential for cell wall synthesis that are targeted by penicillin antibiotics. The exact function of the UB2H domain is uncertain, but it may act as the binding component of PBP1b with different binding partners, or it may participate in the regulation between DNA repair and/or synthesis and cell wall formation during the bacterial cell cycle.	85
405500	pfam14815	NUDIX_4	NUDIX domain. 	115
405501	pfam14816	FAM178	Family of unknown function, FAM178. 	373
405502	pfam14817	HAUS5	HAUS augmin-like complex subunit 5. This family includes HAUS augmin-like complex subunit 5. The HAUS augmin-like complex contributes to mitotic spindle assembly, maintenance of chromosome integrity and completion of cytokinesis.	642
405503	pfam14818	DUF4482	Domain of unknown function (DUF4482). This family is found in eukaryotes, and is approximately 140 amino acids in length. The family is found in association with pfam11365.	138
405504	pfam14819	QueF_N	Nitrile reductase, 7-cyano-7-deazaguanine-reductase N-term. The QueF monomer is made up of two ferredoxin-like domains aligned together with their beta-sheets that have additional embellishments. This subunit is composed of a three-stranded beta-sheet and two alpha-helices. QueF reduces a nitrile bond to a primary amine. The two monomer units together create suitable substrate-binding pockets.	110
373317	pfam14820	SPRR2	Small proline-rich 2. This family of small proteins is rich in proline, cysteine and glutamate. They contain a tandemly repeated nonamer, PKCPEPCPP. They are components of the cornified envelope of keratinocytes.	68
405505	pfam14821	Thr_synth_N	Threonine synthase N-terminus. This domain is found at the N-terminus of many threonine synthase enzymes.	79
405506	pfam14822	Vasohibin	Vasohibin. This family of proteins function as angiogenesis inhibitors in animals.	245
405507	pfam14823	Sirohm_synth_C	Sirohaem biosynthesis protein C-terminal. This domain is the C-terminus of a multifunctional enzyme which catalyzes the biosynthesis of sirohaem. Both of the catalytic activities of this enzyme (precorrin-2 dehydrogenase EC:1.3.1.76) and sirohydrochlorin ferrochelatase (EC:4.99.1.4) are located in the N-terminal domain of this enzyme, pfam13241.	66
405508	pfam14824	Sirohm_synth_M	Sirohaem biosynthesis protein central. This is the central domain of a multifunctional enzyme which catalyzes the biosynthesis of sirohaem. Both of the catalytic activities of this enzyme (precorrin-2 dehydrogenase EC:1.3.1.76) and sirohydrochlorin ferrochelatase (EC:4.99.1.4) are located in the N-terminal domain of this enzyme, pfam13241.	25
405509	pfam14825	DUF4483	Domain of unknown function (DUF4483). This family of proteins is found in eukaryotes. Proteins in this family are typically between 203 and 326 amino acids in length. There is a single completely conserved residue N that may be functionally important.	157
405510	pfam14826	FACT-Spt16_Nlob	FACT complex subunit SPT16 N-terminal lobe domain. The FACT or facilitator of chromatin transcription complex binds to and alters the properties of nucleosomes. This family represents the N-terminal lobe of the NTD, or N-terminal domain, and acts as a protein-protein interaction domain presumably with partners outside of the FACT complex. Knockout of the whole NTD domain, 1-450 residues in UniProt:P32558, in yeast serves to render the cells sensitive to DNA replication stress but is not lethal. The C-terminal half of NTD is structurally similar to aminopeptidases, and the most highly conserved surface residues line a cleft equivalent to the aminopeptidase substrate-binding site, family peptidase_M24, pfam00557.	160
405511	pfam14827	dCache_3	Double sensory domain of two-component sensor kinase. Cache_3 is the periplasmic sensor domain of the sensor histidine kinase DcuS of E. coli. This domain forms one of the components of the two-component signalling system that allows bacteria to adapt to changing environments. The ability of bacteria to monitor and adapt to their environment is crucial to their survival, and two-component signal transduction systems mediate most of these adaptive responses. One component is a histidine kinase sensor - this domain - most commonly part of a homodimeric transmembrane sensor protein, and the second component is a cytoplasmic response regulator. The two components interact in tandem through a phospho-transfer cascade.	238
405512	pfam14828	Amnionless	Amnionless. The amnionless protein forms a complex with cubilin. This complex is necessary for vitamin B12 uptake.	442
373324	pfam14829	GPAT_N	Glycerol-3-phosphate acyltransferase N-terminal. GPAT_N is the N-terminal domain of glycerol-3-phosphate acyltransferases, and it forms a four-helix bundle. Glycerol-3-phosphate (1)-acyltransferase (G3PAT) catalyzes the incorporation of an acyl group from either acyl-acyl carrier proteins or acyl-CoAs into the sn-1 position of glycerol 3-phosphate to yield 1-acylglycerol-3-phosphate. G3PATs can either be selective, preferentially using the unsaturated fatty acid, oleate (C18:1), as the acyl donor, or non-selective, using either oleate or the saturated fatty acid, palmitate (C16:0), at comparable rates. The differential substrate-specificity for saturated versus unsaturated fatty acids seen within this enzyme family has been implicated in the sensitivity of plants to chilling temperatures. The exact function of this domain is not known. It lies upstream of family Acyltransferase, pfam01553.	76
405513	pfam14830	Haemocyan_bet_s	Haemocyanin beta-sandwich. This antiparallel beta sandwich domain occurs in mollusc haemocyanins. Each mollusc haemocyanin contains several globular oxygen binding functional units. Each unit consists of an alpha-helical copper binding domain (pfam00264) and an antiparallel beta sandwich domain.	103
405514	pfam14831	DUF4484	Domain of unknown function (DUF4484). This domain is found, in a few members, at the C-terminus of family Avl9, pfam09794. The function is not known.	183
373327	pfam14832	Tautomerase_3	Putative oxalocrotonate tautomerase enzyme. 4-oxalocrotonate tautomerase enzyme is involved in the anthranilate synthase pathway.	136
405515	pfam14833	NAD_binding_11	NAD-binding of NADP-dependent 3-hydroxyisobutyrate dehydrogenase. 3-Hydroxyisobutyrate is a central metabolite in the valine catabolic pathway, and is reversibly oxidized to methylmalonate semi-aldehyde by a specific dehydrogenase belonging to the 3-hydroxyacid dehydrogenase family. The reaction is NADP-dependent and this region of the enzyme binds NAD. The NAD-binding domain of 6-phosphogluconate dehydrogenase adopts an alpha helical structure.	122
405516	pfam14834	GST_C_4	Glutathione S-transferase, C-terminal domain. GST conjugates reduced glutathione to a variety of targets including S-crystallin from squid, the eukaryotic elongation factor 1-gamma, the HSP26 family of stress-related proteins and auxin-regulated proteins in plants. Stringent starvation proteins in E. coli are also included in the alignment but are not known to have GST activity. The glutathione molecule binds in a cleft between N and C-terminal domains. The catalytically important residues are proposed to reside in the N-terminal domain.	117
405517	pfam14835	zf-RING_6	zf-RING of BARD1-type protein. The RING domain of the breast and ovarian cancer tumor-suppressor BRCA1 interacts with multiple cognate proteins, including the RING protein BARD1. Proper function of the BRCA1 RING domain is critical, as evidenced by the many cancer-predisposing mutations found within this domain. A dimer is formed between the RING domains of BRCA1 and BARD1. The BRCA1-BARD1 structure provides a model for its ubiquitin ligase activity, illustrates how the BRCA1 RING domain can be involved in associations with multiple protein partners and provides a framework for understanding cancer-causing mutations at the molecular level. The corresponding BRCA1-RING domain is on family zf-C3HC4_2, pfam13923.	65
405518	pfam14836	Ubiquitin_3	Ubiquitin-like domain. This ubiquitin-like domain is found in several ubiquitin carboxyl-terminal hydrolases and in gametogenetin-binding protein.	88
405519	pfam14837	INTS5_N	Integrator complex subunit 5 N-terminus. This family of proteins represents the N-terminus of subunit 5 of the integrator complex involved in snRNA transcription and processing.	208
405520	pfam14838	INTS5_C	Integrator complex subunit 5 C-terminus. This family of proteins represents the C-terminus of subunit 5 of the integrator complex involved in snRNA transcription and processing.	693
405521	pfam14839	DOR	DOR family. This family of proteins regulate autophagy and gene transcription.	206
405522	pfam14840	DNA_pol3_delt_C	Processivity clamp loader gamma complex DNA pol III C-term. This domain lies at the C-terminus of the delta subunit of the DNA polymerase III clamp loader gamma complex. Within the complex the several C-terminal domains, of gamma, delta and delta', form a helical scaffold, on which the rest of the subunits are hung. The gamma complex, an AAA+ ATPase, is the bacterial homolog of the eukaryotic replication factor C that loads the sliding clamp (beta, homologous to PCNA) onto DNA.	125
405523	pfam14841	FliG_M	FliG middle domain. This is the middle domain of the flagellar rotor protein FliG.	76
405524	pfam14842	FliG_N	FliG N-terminal domain. This is the N-terminal domain of the flagellar rotor protein FliG.	101
405525	pfam14843	GF_recep_IV	Growth factor receptor domain IV. This is the fourth extracellular domain of receptor tyrosine protein kinases. Interaction between this domain and the furin-like domain (pfam00757) regulates the binding of ligands to the receptor L domains (pfam01030).	132
405526	pfam14844	PH_BEACH	PH domain associated with Beige/BEACH. This PH domain is found in proteins containing the Beige/BEACH domain (pfam02138), it immediately precedes the Beige/BEACH domain.	99
373334	pfam14845	Glycohydro_20b2	beta-acetyl hexosaminidase like. 	133
405527	pfam14846	DUF4485	Domain of unknown function (DUF4485). This family is found in eukaryotes, and is approximately 90 amino acids in length.	81
405528	pfam14847	Ras_bdg_2	Ras-binding domain of Byr2. This domain is the binding/interacting region of several protein kinases, such as the Schizosaccharomyces pombe Byr2. Byr2 is a Ser/Thr-specific protein kinase acting as mediator of signals for sexual differentiation in S. pombe by initiating a MAPK module, which is a highly conserved element in eukaryotes. Byr2 is activated by interacting with Ras, which then translocates the molecule to the plasma membrane. Ras proteins are key elements in intracellular signaling and are involved in a variety of vital processes such as DNA transcription, growth control, and differentiation. They function like molecular switches cycling between GTP-bound 'on' and GDP-bound 'off' states.	95
405529	pfam14848	HU-DNA_bdg	DNA-binding domain. 	123
405530	pfam14849	YidC_periplas	YidC periplasmic domain. This is the periplasmic domain of YidC, a bacterial membrane protein which is required for the insertion and assembly of inner membrane proteins.	267
405531	pfam14850	Pro_dh-DNA_bdg	DNA-binding domain of Proline dehydrogenase. This domain lies at the N-terminus of bifunctional proline-dehydrogenases and is found to bind DNA.	113
405532	pfam14851	FAM176	FAM176 family. Members of the FAM176 family regulate autophagy and apoptosis.	145
405533	pfam14852	Fis1_TPR_N	Fis1 N-terminal tetratricopeptide repeat. The mitochondrial fission protein Fis1 consists of two tetratricopeptide repeats. This domain is the N-terminal tetratricopeptide repeat.	33
405534	pfam14853	Fis1_TPR_C	Fis1 C-terminal tetratricopeptide repeat. The mitochondrial fission protein Fis1 consists of two tetratricopeptide repeats. This domain is the C-terminal tetratricopeptide repeat.	53
373342	pfam14854	LURAP	Leucine rich adaptor protein. This family of proteins activate the canonical NF-kappa-B pathway, promote proinflammatory cytokine production and promote the antigen presenting and priming functions of dendritic cells.	117
405535	pfam14855	PapJ	Pilus-assembly fibrillin subunit, chaperone. PapJ is part of the Pap pilus assembly complex that plays an auxiliary role by ensuring the proper integration of PapA into the fimbrial shaft. PapA is the major shaft protein of the pilus.	187
405536	pfam14856	Hce2	Pathogen effector; putative necrosis-inducing factor. The domain corresponds to the mature part of the Ecp2 effector protein from the tomato pathogen Cladosporium fulvum. Effectors are low molecular weight proteins that are secreted by bacteria, oomycetes and fungi to manipulate their hosts and adapt to their environment. Ecp2 is a 165 amino acid secreted protein that was originally identified as a virulence factor in C. fulvum, since disruption reduces virulence of the fungus on tomato plants. We have recently determined that Ecp2 is a member of a novel, widely distributed and highly diversified within the fungal kingdom multigene superfamily, which we have designated Hce2, for Homologs of C. fulvum Ecp2 effector. Although Ecp2 is present in most organisms as a small secreted protein, the mature part of this protein can be found fused to other protein domains, including the fungal Glycoside Hydrolase family 18, Glyco_hydro_18 pfam00704 and other, unknown, protein domains. The intrinsic function of Ecp2 remains unknown but it is postulated that it is a necrosis-inducing factor in plants that serves pathogenicity on the host.	103
405537	pfam14857	TMEM151	TMEM151 family. This family of proteins is found in eukaryotes. Proteins in this family are typically between 338 and 558 amino acids in length.	424
405538	pfam14858	DUF4486	Domain of unknown function (DUF4486). This domain family is found in eukaryotes, and is typically between 542 and 565 amino acids in length.	542
258996	pfam14859	Colicin_M	Colicin M. Colicin M is a toxin produced by, and active against, Escherichia coli. It catalyzes the hydrolysis of lipid I and lipid II peptidoglycan intermediates, therefore inhibiting peptidoglycan biosynthesis and leading to lysis of the bacterial cells.	269
405539	pfam14860	DrrA_P4M	DrrA phosphatidylinositol 4-phosphate binding domain. This domain binds to phosphatidylinositol 4-phosphate. It is found in Legionella pneumophila DrrA, a protein involved in the redirection of endoplasmic reticulum-derived vesicles to the Legionella-containing vacuoles.	103
405540	pfam14861	Antimicrobial21	Plant antimicrobial peptide. This family includes plant antimicrobial peptides. They adopt an alpha-helical hairpin fold stabilized by two disulphide bonds.	30
373348	pfam14862	Defensin_big	Big defensin. Big defensins are antimicrobial peptides. They consist of a hydrophobic N-terminal half, which is active against Gram-positive bacteria, and a cationic C-terminal half, which is active against Gram-negative bacteria. The C-terminal half adopts a beta-defensin-like structure.	55
405541	pfam14863	Alkyl_sulf_dimr	Alkyl sulfatase dimerization. This domain is found in alkyl sulfatases such as the Pseudomonas aeruginosa SDS hydrolase, where it acts as a dimerization domain	138
405542	pfam14864	Alkyl_sulf_C	Alkyl sulfatase C-terminal. This domain is found at the C-terminus of alkyl sulfatases. Together with the N-terminal catalytic domain, this domain forms a hydrophobic chute and may recruit hydrophobic substrates.	124
373349	pfam14865	Macin	Macin. The macins are antimicrobial proteins. They form a disulphide-stabilized alpha-beta motif.	60
259003	pfam14866	Toxin_38	Potassium channel toxin. This family includes scorpion potassium channel toxins.	55
291529	pfam14867	Lantibiotic_a	Lantibiotic alpha. Lantibiotics are two-component lanthionine-containing peptide antibiotics active on Gram-positive bacteria.	29
405543	pfam14868	DUF4487	Domain of unknown function (DUF4487). This family of proteins is found in eukaryotes. Proteins in this family are typically between 209 and 938 amino acids in length. There is a conserved WCF sequence motif. There is a single completely conserved residue W that may be functionally important.	555
405544	pfam14869	DUF4488	Domain of unknown function (DUF4488). In most members this domain covers almost the whole sequence, but a few member-sequences also carry a TonB_C domain, PF03544. This domain has a lipocalin fold.	122
405545	pfam14870	PSII_BNR	Photosynthesis system II assembly factor YCF48. YCF48 is one of several assembly factors of the photosynthesis system II. The photosynthesis system II occurs in Cyanobacteria that are Gram-negative bacteria performing oxygenic photosynthesis. One of the three membranes surrounding these bacteria is the inner thylakoid membrane (TM) system that is localized within the cell and houses the large pigment-protein complexes of the photosynthetic electron transfer chain, i.e. Photosystem (PS) II, PSI, the cytochrome b6f complex, and the ATP synthase. YCF48 is necessary for efficient assembly and repair of the PSII. YCF48 is found predominantly in the thylakoid membrane. It is a BNR repeat protein.	304
405546	pfam14871	GHL6	Hypothetical glycosyl hydrolase 6. GHL6 is a family of hypothetical glycoside hydrolases.	135
405547	pfam14872	GHL5	Hypothetical glycoside hydrolase 5. GHL5 is a family of hypothetical glycoside hydrolases.	803
405548	pfam14873	BNR_assoc_N	N-terminal domain of BNR-repeat neuraminidase. This domain is usually found at the N-terminus of the BNR-repeat neuraminidase protein family.	149
373355	pfam14874	PapD-like	Flagellar-associated PapD-like. This domain is a putative PapD periplasmic pilus chaperone protein family.	102
405549	pfam14875	PIP49_N	N-term cysteine-rich ER, FAM69. The FAM69 family of cysteine-rich type II transmembrane proteins localize to the endoplasmic reticulum (ER) in cultured cells, probably via N-terminal di-arginine motifs. These proteins carry at least 14 luminal cysteines which are conserved in all FAM69s. There are currently few indications of the involvement of FAM69 members in human diseases. It would appear that FAM69 proteins are predicted to have a protein kinase structure and function. Analysis of three-dimensional structure models and conservation of the classic catalytic motifs of protein kinases in four of human FAM69 proteins suggests they might have retained catalytic phosphotransferase activity. An EF-hand Ca2+-binding domain, inserted within the structure of the kinase domain, suggests they function as Ca2+-dependent kinases (unpublished).	157
291538	pfam14876	RSF	Respiratory growth transcriptional regulator. This is a family of transcriptional regulators that determine the transition from fermentative activity to growth on glycerol.	380
405550	pfam14877	mIF3	Mitochondrial translation initiation factor. This is a family of mitochondrial initiation factors IF3.	169
405551	pfam14879	DUF4489	Domain of unknown function (DUF4489). 	139
405552	pfam14880	COX14	Cytochrome oxidase c assembly. COX14 plays an essential role in cytochrome oxidase assembly. The COX14 product is a low-molecular weight membrane protein of mitochondria, but it is not a subunit of cytochrome oxidase. Orthology-prediction methods have identified the vertebrate C12orf62 orthologues to be orthologues of the yeast COX14.	59
373359	pfam14881	Tubulin_3	Tubulin domain. This family includes the tubulin alpha, beta and gamma chains, as well as the bacterial FtsZ family of proteins. Misato from Drosophila and Dml1p from fungi are descendants of an ancestral tubulin-like protein, and exhibit regions with similarity to members of a GTPase family that includes eukaryotic tubulin and prokaryotic FtsZ. Dml1p and Misato have been co-opted into a role in mtDNA inheritance in yeast, and into a cell division-related mechanism in flies, respectively. Dml1p might additionally function in the partitioning of the mitochondrial organelle itself, or in the segregation of chromosomes, thereby explaining its essential requirement. This domain is subject to extensive post-translational modifications.	180
405553	pfam14882	PHINT_rpt	Phage-integrase repeat unit. This repeat family is found on phage-integrase proteins in up to 15 copies. The function is not known.	54
405554	pfam14883	GHL13	Hypothetical glycosyl hydrolase family 13. GHL13 is a family of hypothetical glycoside hydrolases.	325
405555	pfam14884	EFF-AFF	Type I membrane glycoproteins cell-cell fusogen. EFF-AFF was first identified when EFF1 mutants were found to block cell fusion in all epidermal and vulval epithelia in the worm. However, fusion between the anchor cell and the utse syncytium that establishes a continuous uterine-vulval tube proceeds normally in eff-1 mutants and thus Aff1 was established as necessary for this and the fusion of heterologous cells in C. elegans. The transmembrane forms of FF proteins, like most viral fusogens, possess an N-terminal signal sequence followed by a long extracellular portion, a predicted transmembrane domain, and a short intracellular tail. A striking conservation in the position and number of all 16 cysteines in the extracellular portion of FF proteins from different nematode species suggests that these proteins are folded in a similar 3D structure that is essential for their fusogenic activity. C. elegans AFF-1 and EFF-1 proteins are essential for developmental cell-to-cell fusion and can merge insect cells. Thus FFs comprise an ancient family of cellular fusogens that can promote fusion when expressed on a viral particle.	471
405556	pfam14885	GHL15	Hypothetical glycosyl hydrolase family 15. GHL15 is a family of hypothetical glycoside hydrolases.	272
405557	pfam14886	FAM183	FAM183A and FAM183B related. The function of this family of metazoan sequences is not known.	106
373364	pfam14887	HMG_box_5	HMG (high mobility group) box 5. Nucleolar transcription factor/upstream binding factor contains six HMG box domains. This is the fifth HMG box domain in these proteins. This domain has lost DNA-binding ability.	84
405558	pfam14888	PBP-Tp47_c	Penicillin-binding protein Tp47 domain C. Domain C is the largest domain in this unusual penicillin-binding protein (PBP), Tp47. This domain is mainly characterized by an immunoglobulin fold with two opposing beta-sheets that form the typical barrel-like structure. In contrast to the classical immunoglobulin fold, however, this has an additional beta-strand inserted after strand 3. Also, the strands are connected by rather large loops. Helices are inserted between strands 2 and 3 and between strands 4 and 5. Domain C interacts with domain B via a surface that has a slightly concave, goblet-like shape. Tp47 is unusual in that it displays beta-lactamase activity, and thus it does not fit the classical structural and mechanistic paradigms for PBPs, and thus Tp47 appears to represent a new class of PBP.	158
405559	pfam14889	PBP-Tp47_a	Penicillin-binding protein Tp47 domain a. This is the first domain in this unusual penicillin-binding protein (PBP), Tp47. It is mainly composed of beta-strands and is sequentially non-contiguous. The first three domains in Tp47 interact with each other through intimate domain-domain interfaces. Domain A contacts domain B through its N-terminal segment. Domain A also interacts tightly with domain C. Tp47 is unusual in that it displays beta-lactamase activity, and thus it does not fit the classical structural and mechanistic paradigms for PBPs, and thus Tp47 appears to represent a new class of PBP.	161
405560	pfam14890	Intein_splicing	Intein splicing domain. Inteins are segments of protein which excise themselves from a precursor protein and mediate the rejoining of the remainder of the precursor (the extein). Most inteins consist of a splicing domain which is split into two segments by a homing endonuclease domain. This domain represents the splicing domain.	382
405561	pfam14891	Peptidase_M91	Effector protein. This family of proteins contains an HEXXH motif, typical of zinc metallopeptidases. The family includes the E. coli effector protein NleD, which cleaves and inactivates c-Jun N-terminal kinase (JNK).	173
405562	pfam14892	DUF4490	Domain of unknown function (DUF4490). This family of proteins is found in eukaryotes. Proteins in this family are typically between 101 and 220 amino acids in length. In mice, a member of this family whose expression is induced by p53 may play a role in DNA damage response.	99
405563	pfam14893	PNMA	PNMA. The PNMA family includes paraneoplastic antigens Ma 1, 2 and 3, found in the serum of patients with paraneoplastic neurological disorders. The family also includes modulator of apoptosis 1, which has a role in death receptor-dependent apoptosis.	327
405564	pfam14894	Lsm_C	Lsm C-terminal. This domain is found at the C-terminus of archaeal Lsm (like-Sm) proteins.	57
405565	pfam14895	PPPI_inhib	Protein phosphatase 1 inhibitor. This family of proteins interacts with and inhibits the phosphatase activity of protein phosphatase 1 (PP1) complexes.	342
405566	pfam14896	Arabino_trans_C	EmbC C-terminal domain. Arabinosyltransferase is involved in arabinogalactan (AG) biosynthesis pathway in mycobacteria. AG is a component of the macromolecular assembly of the mycolyl-AG-peptidoglycan complex of the cell wall. This enzyme has important clinical applications as it is believed to be the target of the antimycobacterial drug Ethambutol. This domain represents the C-terminal extracellular domain that is likely to bind to carbohydrate.	384
405567	pfam14897	EpsG	EpsG family. This family of proteins are related to the EpsG protein from B. subtilis. These proteins are likely glycosyl transferases belonging to the membrane protein GT-C clan.	323
405568	pfam14898	DUF4491	Domain of unknown function (DUF4491). This family of proteins is found in bacteria. Proteins in this family are typically between 94 and 107 amino acids in length. There is a conserved EYY sequence motif.	90
405569	pfam14899	DUF4492	Domain of unknown function (DUF4492). This family of proteins is found in bacteria. Proteins in this family are approximately 80 amino acids in length. The function of these proteins is unknown.	63
405570	pfam14900	DUF4493	Domain of unknown function (DUF4493). This family of proteins is found in bacteria. Proteins in this family are typically between 264 and 710 amino acids in length. Many of these proteins have a lipid attachment site suggesting they are lipoproteins.	221
405571	pfam14901	Jiv90	Cleavage inducing molecular chaperone. Jiv90 is a fragment of the DnaJ protein in eukaryotes and in J-domain protein interacting with viral protein (Jiv) located in the N terminal region of the pestivirus viral polypeptide. The viral protein interacts stably with non structural (NS) protein NS2, causing a conformational change in NS2-NS3 and stimulates NS2-NS3 cleavage in trans. Cleavage of NS2-NS3 increases cytopathogenicity and consequently aids viral replication. Jiv therefore acts as a regulating cofactor for NS2 auto-protease. The efficient release of NS3 from the viral polypeptide by Jiv is considered crucial to the pestivirus cytopathogenicity. In eukaryotes, it usually lies 40 residues downstream of DnaJ family pfam00226. However, the function in eukaryotes is still unknown.	89
405572	pfam14902	DUF4494	Domain of unknown function (DUF4494). This family of proteins is found in bacteria. Proteins in this family are typically between 154 and 172 amino acids in length. There are two conserved sequence motifs: VDA and EAE. There is a single completely conserved residue E that may be functionally important.	139
405573	pfam14903	WG_beta_rep	WG containing repeat. This repeat contains an N-terminal WG repeat motif. The extent of the repeat is poorly defined. This repeat may form a beta solenoid structure (Bateman A pers. obs.).	35
405574	pfam14904	FAM86	Family of unknown function. Function of this protein family is not known.	94
405575	pfam14905	OMP_b-brl_3	Outer membrane protein beta-barrel family. This family includes proteins annotated as TonB dependent receptors. But it is also likely to contain other membrane beta barrel proteins of other functions.	407
405576	pfam14906	DUF4495	Domain of unknown function (DUF4495). This domain family is found in eukaryotes, and is typically between 322 and 336 amino acids in length. There are two conserved sequence motifs: QMW and DLW. Proteins in this family vary in length from 793 to 1184 amino acids.	318
405577	pfam14907	NTP_transf_5	Uncharacterized nucleotidyltransferase. This family is likely to be an uncharacterized group of nucleotidyltransferases.	249
405578	pfam14908	DUF4496	Domain of unknown function (DUF4496). This domain family is found in eukaryotes, and is typically between 134 and 154 amino acids in length. Proteins in this family vary in length between 264 and 772 amino acid residues.	89
405579	pfam14909	SPATA6	Spermatogenesis-assoc protein 6. This domain family is found in eukaryotes, and is approximately 140 amino acids in length. The family has similarity to the motor domain of kinesin related proteins and with the Caenorhabditis elegans neural calcium sensor protein (NCS-2).	139
405580	pfam14910	MMS22L_N	S-phase genomic integrity recombination mediator, N-terminal. MMS22L (Methyl methanesulfonate-sensitivity protein 22-like) is found in yeast, plants and vertebrates, and is integrally concerned with DNA forking and repair mechanisms during replication. MMS22L complexes with TONSL and this complex accumulates at regions of ssDNA associated with distressed replication forks or at processed DNA breaks. Its depletion results in high levels of endogenous DNA double-strand breaks caused by an inability to complete DNA synthesis after replication fork collapse. Thus the complex mediates recovery from replication stress and homologous recombination in vertebrates, yeasts and plants. This family is the more N-terminal region of the proteins.	708
405581	pfam14911	MMS22L_C	S-phase genomic integrity recombination mediator, C-terminal. MMS22L (Methyl methanesulfonate-sensitivity protein 22-like) is found in yeast, plants and vertebrates, and is integrally concerned with DNA forking and repair mechanisms during replication. MMS22L complexes with TONSL and this complex accumulates at regions of ssDNA associated with distressed replication forks or at processed DNA breaks. Its depletion results in high levels of endogenous DNA double-strand breaks caused by an inability to complete DNA synthesis after replication fork collapse. Thus the complex mediates recovery from replication stress and homologous recombination in vertebrates, yeasts and plants. This family is the more C-terminal region of the proteins.	373
405582	pfam14912	THEG	Testicular haploid expressed repeat. This repeat is the only conserved part of the THEG proteins from vertebrate spermatids. Both human and mouse THEG are specifically expressed in the nucleus of haploid male germ cells and are involved in the regulation of nuclear functions. Although the differential gene expression of THEG in spermatid-Sertoli cell co-culture supports the relevance of germ cell-Sertoli cell interaction for gene regulation during spermatogenesis, THEG was not found to be essential for spermatogenesis in mice.	59
405583	pfam14913	DPCD	DPCD protein family. This protein is found in eukaryotes and a mutation in this protein is thought to cause Primary Ciliary Dyskinesia (PCD). This protein is 203 amino acids in length, 23 kDa in size and its function remains unknown. The gene that encodes this protein is a candidate gene for PCD and is expressed during ciliogenesis. PCD affects the airways and reproductive organs, and probing Northern blots show DPCD expression in humans is highest in the testes. Additionally, there is no indication of major splice variants.	190
405584	pfam14914	LRRC37AB_C	LRRC37A/B like protein 1 C-terminal domain. This family represents the C-terminal domain of the putative Leucine Rich Repeat Containing protein 37A or protein 37B (LRRC37A/B) found in eukaryotes. The Leucine Rich Repeats (LRR) lies in the central region. The gene that encodes this protein is found in the chromosomal position 17q11.2, and its microdeletion results in the disease, neurofibromatosis type-1 (NF1). The function of the protein, LRRC37B is unknown, however experimental data shows expression in the aorta, heart, skeletal muscle, liver and brain during gestation.	147
405585	pfam14915	CCDC144C	CCDC144C protein coiled-coil region. This family includes the human protein CCDC144C and the ankyrin repeat domain-containing protein 26-like 1 found in eukaryotes. Its function remains unknown, however, it is known to contain a coiled-coil domain which corresponds to this region. The ankyrin repeat which features in this protein is a common amino acid motif.	305
373383	pfam14916	CCDC92	Coiled-coil domain of unknown function. This domain family is found in eukaryotes, and is approximately 60 amino acids in length. The function is not known and the proteins carry no other domains.	57
405586	pfam14917	CCDC74_C	Coiled coil protein 74, C terminal. This is a C-terminal conserved domain of coiled-coil proteins from vertebrates. The function is not known. Expression levels in humans are elevated in breast cancer.	121
405587	pfam14918	MTBP_N	MDM2-binding. MTBP, or MDM2-binding protein, binds to MDM2. The MDM2 protein, through its interaction with p53, plays an important role in the regulation of the G1 checkpoint of the cell cycle. MTBP promotes MDM2-mediated ubiquitination and degradation of p53 and also MDM2 stabilisation in an MDM2 RING finger-dependent manner. MTBP differentially regulates the E3 ubiquitin ligase activity of MDM2 towards two of its most critical targets (itself and p53) and in doing so significantly contributes to MDM2-dependent p53 homeostasis in unstressed cells. MTBP inhibits cancer cell migration by interacting with a protein involved in cell motility. This motility protein is alpha-actinin-4 (ACTN4). It is unclear which regions of MTBP interact with which binding-partner. See PF14919, PF14920.	254
405588	pfam14919	MTBP_mid	MDM2-binding. MTBP, or MDM2-binding protein, binds to MDM2. The MDM2 protein, through its interaction with p53, plays an important role in the regulation of the G1 checkpoint of the cell cycle. MTBP promotes MDM2-mediated ubiquitination and degradation of p53 and also MDM2 stabilisation in an MDM2 RING finger-dependent manner. MTBP differentially regulates the E3 ubiquitin ligase activity of MDM2 towards two of its most critical targets (itself and p53) and in doing so significantly contributes to MDM2-dependent p53 homeostasis in unstressed cells. MTBP inhibits cancer cell migration by interacting with a protein involved in cell motility. This motility protein is alpha-actinin-4 (ACTN4). It is unclear which regions of MTBP interact with which binding-partner. See PF14918, PF14920.	339
405589	pfam14920	MTBP_C	MDM2-binding. MTBP, or MDM2-binding protein, binds to MDM2. The MDM2 protein, through its interaction with p53, plays an important role in the regulation of the G1 checkpoint of the cell cycle. MTBP promotes MDM2-mediated ubiquitination and degradation of p53 and also MDM2 stabilisation in an MDM2 RING finger-dependent manner. MTBP differentially regulates the E3 ubiquitin ligase activity of MDM2 towards two of its most critical targets (itself and p53) and in doing so significantly contributes to MDM2-dependent p53 homeostasis in unstressed cells. MTBP inhibits cancer cell migration by interacting with a protein involved in cell motility. This motility protein is alpha-actinin-4 (ACTN4). It is unclear which regions of MTBP interact with which binding-partner. See PF14918, PF14919.	257
405590	pfam14921	APCDDC	Adenomatosis polyposis coli down-regulated 1. The domain is duplicated in most members of this family. APCDD is directly regulated by the beta-catenin/Tcf complex, and its elevated expression promotes proliferation of colonic epithelial cells in vitro and in vivo. APCDD1 has an N-terminal signal-peptide and a C-terminal transmembrane region. The domain is rich in cysteines, there being up to 12 such residues, a structural motif important for interaction between Wnt ligands and their receptors. APCDD1 is expressed in a broad repertoire of cell types, indicating that it may regulate a diverse range of biological processes controlled by Wnt signalling.	234
405591	pfam14922	FWWh	Protein of unknown function. This is a family of eukaryotic proteins. Most members carry a highly distinctive, conserved sequence motif of FWWh, where h represents a hydrophobic residue. The function of the family is not known.	150
405592	pfam14923	CCDC142	Coiled-coil protein 142. The function of this coiled-coil domain-containing family is not known. It is found in eukaryotes.	455
405593	pfam14924	DUF4497	Protein of unknown function (DUF4497). This domain family is found in eukaryotes, and is typically between 107 and 123 amino acids in length. There are two completely conserved G residues that may be functionally important.	107
405594	pfam14925	HPHLAWLY	Domain of unknown function. Members of this family carry two distinct, highly conserved sequence motifs, CPPPLYYTHL and HPHLAWLY. The family is found in eukaryotes, and the function is not known. This family lies at the C-terminus of members.	641
405595	pfam14926	DUF4498	Domain of unknown function (DUF4498). This family of proteins is found in eukaryotes. Proteins in this family are typically between 203 and 308 amino acids in length.	245
405596	pfam14927	Neurensin	Neurensin. The neurensin family includes the neuronal membrane proteins neurensin-1 and neurensin-2. Neurensin-1 plays a role in neurite extension.	132
405597	pfam14928	S_tail_recep_bd	Short tail fibre protein receptor-binding domain. This domain is a receptor binding domain found on bacteriophage short tail fibre proteins. It contains a zinc-binding site and a potential lipopolysaccharide-binding site.	93
405598	pfam14929	TAF1_subA	TAF RNA Polymerase I subunit A. TATA box binding protein associated factor RNA Polymerase I subunit A is found in eukaryotes and is encoded by the gene TAF1A in humans. Its function is to aid transcription of DNA into RNA by binding to the promoter at the -10 TATA box site. It is a component of the transcription factor SL1/TIF-IB complex, involved in PIC assembly (pre-initiation complex) during RNA polymerase I-dependent transcription. The rate of PIC formation depends on the rate of association of this protein. This protein also stabilizes nucleolar transcription factor 1/UBTF on rDNA.	365
405599	pfam14930	Qn_am_d_aII	Quinohemoprotein amine dehydrogenase, alpha subunit domain II. This is the second domain of the alpha subunit of quinohemoprotein amine dehydrogenase.	107
405600	pfam14931	IFT20	Intraflagellar transport complex B, subunit 20. IFT20 is subunit 20 of the intraflagellar transport complex B. The intraflagellar transport complex assembles and maintains eukaryotic cilia and flagella. IFT20 is localized to the Golgi complex and is anchored there by the Golgi polypeptide, GMAP210, whereas all other subunits except IFT172 localize to cilia and the peri-basal body or centrosomal region at the base of cilia. IFT20 accompanies Golgi-derived vesicles to the point of exocytosis near the basal bodies where the other IFT polypeptides are present, and where the intact IFT particle is assembled in association with the inner surface of the cell membrane. Passage of the IFT complex then follows, through the flagellar pore recognition site at the transition region, into the ciliary compartment. There also appears to be a role of intraflagellar transport (IFT) polypeptides in the formation of the immune synapse in non ciliated cells. The flagellum, in addition to being a sensory and motile organelle, is also a secretory organelle. A number of IFT components are expressed in haematopoietic cells, which have no cilia, indicating an unexpected role of IFT proteins in immune synapse-assembly and intracellular membrane trafficking in T lymphocytes; this suggests that the immune synapse could represent the functional homolog of the primary cilium in these cells.	109
405601	pfam14932	HAUS-augmin3	HAUS augmin-like complex subunit 3. This domain is subunit three of the augmin complex found from Drosophila to humans. The HAUS-augmin complex is made up of eight subunits. The augmin complex interacts with gamma-TuRC, and attenuation of this interaction severely impairs spindle MT generation. Furthermore, we provide evidence that human augmin plays critical and non-redundant roles in the kinetochore-MT attachment and also central spindle formation during anaphase in human cells. The HAUS complex is required for mitotic spindle assembly and for maintenance of centrosome integrity.	261
373400	pfam14933	CEP19	CEP19-like protein. This family includes the centrosomal protein of 19 kDa found in eukaryotes. In humans, it is encoded for by the gene CEP19 which is also known as C3orf34. These proteins localize in the centrosomes. Centrosomes are dynamic organelles that assemble around the centrioles. They organize the microtubule cytoskeleton and mitotic spindle apparatus and are required for cell division and cell migration. C3orf34 localizes near the centrosome in early interphase, to spindle poles during mitosis, and to distinct foci oriented towards the midbody at telophase.	150
405602	pfam14934	DUF4499	Domain of unknown function (DUF4499). This family contains a protein found in eukaryotes. Transmembrane protein C10orf57 is encoded for by the gene chromosome 10 open reading frame 57 (C10orf57) located in chromosomal position 10q22.3. The exact function of this protein is still unknown, however it is thought to be an integral membrane protein. The protein sequence is 123 amino acids in length and has a mass of approximately 14.2 kDa. The family also includes some longer proteins that possess an N-terminal dehydrogenase domain, pfam01073.	88
405603	pfam14935	TMEM138	Transmembrane protein 138. This family of proteins is found in eukaryotes and members are approximately 160 amino acids in length. There are two conserved sequence motifs: YYY and DPR. This transmembrane protein belongs to a family found in eukaryotes and is involved in the biogenesis and degradation of ciliated cells. Mutations in this protein cause the disease Joubert syndrome (JBTS) where the cilia become non-motile. Ciliopathy can be severe since cilia provide the cell with large amounts of information through signals. Ciliopathy can affect cell behaviour as the appropriate signals between the cell and its environment are not made, which can affect cell survival.	119
405604	pfam14936	p53-inducible11	tumor protein p53-inducible protein 11. TP53 is a tumor suppressor gene, when switched on it suppresses tumor development by inducing stable growth arrest or cell apoptosis. The tumor protein TP53 inducible protein 11 encoded for by the gene TP53I11, has a protein sequence of 189 amino acids in length and 21 kDa in mass. The role of this protein is thought to negatively regulate cell proliferation in response to stress, and therefore suppress tumor formation.	182
405605	pfam14937	DUF4500	Domain of unknown function (DUF4500). This family is found in eukaryotes. The function of this protein remains unknown. The gene which encodes for this protein is named chromosome 6 open reading frame 162 (C6orf162) and is found between the chromosomal positions 6q15-q16.1. It is thought that this protein may be an important part of membrane function.	81
405606	pfam14938	SNAP	Soluble NSF attachment protein, SNAP. The soluble NSF attachment protein (SNAP) proteins are involved in vesicular transport between the endoplasmic reticulum and Golgi apparatus. They act as adaptors between SNARE (integral membrane SNAP receptor) proteins and NSF (N-ethylmaleimide-sensitive factor). They are structurally similar to TPR repeats.	273
405607	pfam14939	DCAF15_WD40	DDB1-and CUL4-substrate receptor 15, WD repeat. DCAFs, Ddb1- and Cul4-associated factors, are substrate receptors for the Cul4-Ddb1 Ubiquitin Ligase. There are 18 different factors, the majority of which are WD40-repeat-proteins.	203
405608	pfam14940	TMEM219	Transmembrane 219. This protein belongs to a family found in eukaryotes. Proteins in this family are typically between 240 and 315 amino acids in length. The domains in this family vary in length from 202 to 249 amino acids. Its exact function remains unknown, however, it is thought to have a role as a transmembrane protein. More specifically, it is possible that this transmembrane protein may have a role as an insulin-like growth factor binding protein 3-receptor (IGFBP-3R). This receptor binds to the ligand, insulin growth factor 3, which is a p53-induced, apoptosis factor important for cancer prevention.	236
405609	pfam14941	OAF	Transcriptional regulator, Out at first. This family of proteins is found in eukaryotes. Proteins in this family are typically between 198 and 332 amino acids in length. The domains in this family vary in length from 239 to 242 amino acids. The gene, OAF (out at first), which encodes this protein, has a promoter which may help mediate regulation of neighboring genes. An alternative name for this protein is HCV NS5A-transactivated protein 13 target protein 2, which stands for Hepatitis C virus nonstructural 5A-transactivated protein 13 target protein 2. NS5A inhibits double-stranded-RNA-activated protein kinase (PKR) activity, which is thought to allow Hepatitis C Virus replication to continue in the presence of an alpha interferon (IFN)-induced antiviral response.	242
405610	pfam14942	Muted	Organelle biogenesis, Muted-like protein. The protein is a coiled-coil protein and belongs to a family found in eukaryotes. It undergoes alternative splicing forming two isoforms. The larger isoform is 187 amino acids long in protein sequence length and 21 kDa in mass. The smaller isoform is 110 amino acids long in protein sequence length and 12 kDa in mass. This protein associates with other proteins in order to form biogenesis of lysosome-related organelles complex-1 BLOC1 complex. BLOC-1 is required for the normal biogenesis of specialized organelles of the endosomal-lysosomal system.	141
405611	pfam14943	MRP-S26	Mitochondrial ribosome subunit S26. This family of proteins corresponds to mitochondrial ribosomal subunit S26 in eukaryotes.	169
405612	pfam14944	TCRP1	Tongue Cancer Chemotherapy Resistant Protein 1. This family of proteins are found in eukaryotes. Tongue Cancer Chemotherapy Resistant-associated Protein 1 (TCRP1) is resistant to the chemotherapy drug, cisplatin, which induces apoptosis in tumor cells. There is suggestion that TCRP1 can be targeted to reverse chemotherapy resistance. The precise mechanism of TCRP1 inducing resistance against chemotherapy is still not clear, but it is thought that TCRP1 alters cell signalling pathways affecting apoptosis or DNA repair capacity. Proteins in this family are typically between 194 and 235 amino acids in length.	243
405613	pfam14945	LLC1	Normal lung function maintenance, Low in Lung Cancer 1 protein. This protein is part of a family found in eukaryotes. It is 137 amino acids long in protein sequence length and mass is approximately 15.7 kDa. The protein is present in the normal lung epithelium, but absent or downregulated in most primary non-small lung cancers. The gene is known as Low in Lung Cancer 1 (LLC1). This protein is thought to have a role in the maintenance of normal lung function and its absence may lead to lung tumorigenesis.	118
405614	pfam14946	DUF4501	Domain of unknown function (DUF4501). This family of proteins is found in eukaryotes. Proteins in this family are typically between 167 and 308 amino acids in length. The exact function of this protein remains unknown, but it is thought to be a single-pass membrane protein. This family contains many highly conserved cysteine residues.	177
405615	pfam14947	HTH_45	Winged helix-turn-helix. This winged helix-turn-helix domain contains an extended C-terminal alpha helix which is responsible for dimerization of this domain.	77
405616	pfam14948	RESP18	RESP18 domain. This domain is found in the glucocorticoid-responsive protein regulated endocrine-specific protein 18 (RESP18) and in the N-terminal extracellular region of receptor-type tyrosine-protein phosphatases containing the protein-tyrosine phosphatase receptor IA-2 domain (pfam11548).	77
405617	pfam14949	ARF7EP_C	ARF7 effector protein C-terminus. This family represents the C-terminus of the ARF7 effector protein (ARF7EP). ARF7EP interacts with ADP-ribosylation factor-like protein 14 and unconventional myosin-Ie and through this interaction controls movement of MHC-II-containing vesicles along the actin cytoskeleton in dendritic cells. It contains a conserved CXCXXXXCXXCXXXCXXCXXXXCXXXCXC motif in its C-terminal half.	102
405618	pfam14950	DUF4502	Domain of unknown function (DUF4502). This family of proteins is found in eukaryotes. Proteins in this family are typically between 181 and 876 amino acids in length.	351
405619	pfam14951	DUF4503	Domain of unknown function (DUF4503). This family of proteins is found in eukaryotes. Proteins in this family are typically between 313 and 876 amino acids in length.	391
405620	pfam14952	zf-tcix	Putative treble-clef, zinc-finger, Zn-binding. This domain resembles the zinc-binding domain of prokaryotic topoisomerases, family DNA_ligase_ZBD pfam03119. The function of the eukaryotic proteins it is carried on is not known.	42
405621	pfam14953	DUF4504	Domain of unknown function (DUF4504). This family of proteins is found in eukaryotes. Proteins in this family are typically between 253 and 329 amino acids in length. There are two conserved sequence motifs: LLGYP and SFS.	254
373420	pfam14954	LIX1	Limb expression 1. This entry represents the limb expression 1 (LIX1) family.	242
405622	pfam14955	MRP-S24	Mitochondrial ribosome subunit S24. This family of proteins corresponds to mitochondrial ribosomal subunit S24 in eukaryotes.	135
405623	pfam14956	DUF4505	Domain of unknown function (DUF4505). This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 166 and 225 amino acids in length.	178
405624	pfam14957	BORG_CEP	Cdc42 effector. The Cdc42 effector (CEP) or binder of Rho GTPases (BORG) proteins are involved in the organisation of the actin cytoskeleton. They may function as negative regulators of Rho GTPase signaling.	118
405625	pfam14958	DUF4506	Domain of unknown function (DUF4506). This domain family is found in eukaryotes, and is approximately 140 amino acids in length.	140
405626	pfam14959	GSAP-16	gamma-Secretase-activating protein C-term. GSAP, or gamma-secretase-activating protein, also known as PION, regulates gamma-secretase activity. The holo-protein is a large, approx 850 residue protein that is rapidly cleaved to an active 16 kDa C-terminal fragment that is the stable, predominant form. GSAP is expressed in inclusion bodies and is important in brain function. It dramatically and selectively increases neurotoxic beta-Amyloid production in the brain through a mechanism involving its interactions with both gamma-secretase and its substrate, the amyloid precursor protein C-terminal fragment (APP-CTF). Accumulation of neurotoxic beta-Amyloid is a major hallmark of Alzheimer's disease. Formation of beta-Amyloid is catalyzed by gamma-secretase, a protease with numerous substrates that catalyzes the intra-membrane cleavage of integral membrane proteins such as Notch receptors and APP (beta-amyloid precursor protein). The secondary structure of GSAP is largely alpha-helical, lacking well-defined tertiary structure. GSAP represents a type of gamma-secretase regulator that directs enzyme specificity by interacting with a specific substrate.	108
373426	pfam14960	ATP_synth_reg	ATP synthase regulation. Members of this family are subunits of mitochondrial ATP synthase (F-ATPase) and vacuolar ATPase (V-ATPase). In F-ATPase, this subunit regulates mitochondrial ATP synthase population.	50
405627	pfam14961	BROMI	Broad-minded protein. Broad-minded protein (BROMI) interacts with cell cycle-related kinase (CCRK), together these proteins regulate ciliary membrane and axonemal growth.	1290
373428	pfam14962	AIF-MLS	Mitochondria localization Sequence. This family contains a protein found in eukaryotes. Proteins in this family are typically between 240 and 613 amino acids in length. The family is found in association with pfam07992. This protein family is an N-terminal domain for the mitochondrial localization sequence for an apoptosis-inducing factor. The protein is also known as Corneal endothelium-specific protein 1 or as Ovary-specific acidic protein. It is thought to be important for membrane function and is expressed in the ovary and corneal endothelium.	192
405628	pfam14963	CAML	Calcium signal-modulating cyclophilin ligand. Calcium signal-modulating cyclophilin ligand was originally identified in a screen for cyclophilin B-interacting proteins. It is likely to be involved in calcium signalling. It has also been shown to interact with many other signalling molecules including proto-oncogene tyrosine-protein kinase LCK, tumor necrosis factor receptor superfamily member 13B and EGFR.	269
405629	pfam14964	DUF4507	Domain of unknown function (DUF4507). This family of proteins is found in eukaryotes. Proteins in this family are typically between 346 and 434 amino acids in length.	359
405630	pfam14965	BRI3BP	Negative regulator of p53/TP53. This family of transmembrane proteins is found in eukaryotes. Proteins in this family are typically between 213 and 245 amino acids in length. It is found in various tissues, including the brain, liver and kidneys. It was first discovered as a functional unknown gene, murine brain I3 (BRI3). This protein is also known as HCCRBP-1 and it plays a role in tumorigenesis, as it binds to an oncogene, HCCR-1, and acts as a negative regulator of p53/TP53 tumor suppressor. BRI3BP induces tumorigenesis by activating protein kinase C (PKC) activity but decreasing the pro-apoptotic PKC-alpha and PKC-delta isoform levels. BRI3BP is over-expressed in many tumors.	180
405631	pfam14966	DNA_repr_REX1B	DNA repair REX1-B. This family of proteins includes Chlamydomonas reinhardtii REX1-B (Required for Excision 1-B) which is involved in a light-independent DNA repair pathway.	94
405632	pfam14967	FAM70	FAM70 protein. This family of proteins is found in eukaryotes. Proteins in this family are typically between 241 and 349 amino acids in length. The function of this family is unknown.	325
405633	pfam14968	CCDC84	Coiled coil protein 84. The function of this coiled-coil domain-containing family is not known. It is found in eukaryotes.	328
405634	pfam14969	DUF4508	Domain of unknown function (DUF4508). This family of proteins is found in eukaryotes. Proteins in this family are typically between 117 and 253 amino acids in length.	96
405635	pfam14970	DUF4509	Domain of unknown function (DUF4509). This family of proteins is found in eukaryotes. Proteins in this family are typically between 212 and 449 amino acids in length. There is a conserved WLL sequence motif.	187
405636	pfam14971	DUF4510	Domain of unknown function (DUF4510). This family of proteins is found in eukaryotes. Proteins in this family are typically between 242 and 452 amino acids in length. There are two conserved sequence motifs: LEA and WMD.	153
405637	pfam14972	Mito_morph_reg	Mitochondrial morphogenesis regulator. This family of proteins regulate mitochondrial morphogenesis via a mechanism which is independent of mitofusins and dynamin-related protein 1.	162
405638	pfam14973	TINF2_N	TERF1-interacting nuclear factor 2 N-terminus. This is the N-terminus of TERF1-interacting nuclear factor 2. It is required for the formation of the shelterin complex. The shelterin complex is involved in the protection and maintenance of telomeres.	143
405639	pfam14974	P_C10	Protein C10. The function of this protein family is unknown. Mutations in protein C (C12orf57) are implicated in the pathogenesis of colobomatous microphthalmia.	103
405640	pfam14975	DUF4512	Domain of unknown function (DUF4512). This family of proteins is found in eukaryotes. Proteins in this family are typically between 74 and 104 amino acids in length. There are two completely conserved residues (C and P) that may be functionally important.	103
405641	pfam14976	FAM72	FAM72 protein. This family of proteins is found in eukaryotes. Proteins in this family are typically between 145 and 264 amino acids in length. The function of this family is unknown.	145
405642	pfam14977	FAM194	FAM194 protein. This family is found in eukaryotes, and is approximately 210 amino acids in length. There is a conserved YPSG sequence motif. The function of this family is unknown.	196
405643	pfam14978	MRP-63	Mitochondrial ribosome protein 63. This family of proteins is present in the intact 55S subunit of the mitochondrial ribosome. It is not known if it belongs to the 28S or to the 39S subunit.	89
405644	pfam14979	TMEM52	Transmembrane 52. This family of transmembrane proteins is found in eukaryotes. Proteins in this family are typically between 160 and 236 amino acids in length. There is a conserved LLCG sequence motif. The function of this family is unknown.	143
317403	pfam14980	TIP39	TIP39 peptide. 	51
317404	pfam14981	FAM165	FAM165 family. This family of proteins known as FAM165 are found in eukaryotes. Members of this family are as yet uncharacterized. Proteins in this family are typically short membrane proteins between 55 and 70 amino acids in length.	50
291643	pfam14982	UPF0731	UPF0731 family. The UPF0731 family of uncharacterized proteins is found in mammals.	78
291644	pfam14983	DUF4513	Domain of unknown function (DUF4513). This family of uncharacterized proteins is found in chordates.	132
405645	pfam14984	CD24	CD24 protein. 	52
373447	pfam14985	TM140	TM140 protein family. This family of uncharacterized membrane proteins are called transmembrane protein 140. They are found in mammals.	180
373448	pfam14986	DUF4514	Domain of unknown function (DUF4514). This family of uncharacterized proteins are found in mammals.	60
405646	pfam14987	NADHdh_A3	NADH dehydrogenase 1 alpha subcomplex subunit 3. This family of proteins are accessory subunits of the mitochondrial membrane respiratory chain NADH dehydrogenase (Complex I). This subunit is not believed to be catalytic.	78
405647	pfam14988	DUF4515	Domain of unknown function (DUF4515). This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 198 and 469 amino acids in length. There are two completely conserved L residues that may be functionally important.	206
405648	pfam14989	CCDC32	Coiled-coil domain containing 32. This family of proteins is found in eukaryotes. Proteins in this family are typically between 160 and 188 amino acids in length. The gene that encodes this protein is C15orf57 but its protein product is called Protein CCDC32 (Coiled-coil domain containing 32). The exact function of this protein is still unknown.	150
405649	pfam14990	DUF4516	Domain of unknown function (DUF4516). This family of proteins is found in eukaryotes. Proteins in this family are typically between 56 and 69 amino acids in length.	46
405650	pfam14991	MLANA	Protein melan-A. 	117
405651	pfam14992	TMCO5	TMCO5 family. The TMCO5 family includes human transmembrane and coiled-coil domain-containing proteins 5A and 5B.	281
405652	pfam14993	Neuropeptide_S	Neuropeptide S precursor protein. 	65
405653	pfam14994	TSGA13	Testis-specific gene 13 protein. This family of uncharacterized proteins are found in chordates. In humans this gene is found to be expressed specifically in the testes.	273
405654	pfam14995	TMEM107	Transmembrane protein. This family of proteins is found in eukaryotes. Proteins in this family are typically between 138 and 164 amino acids in length. There are two completely conserved residues (H and E) that may be functionally important and four transmembrane helices. The domains in this family vary in length from 124 to 126 amino acids. The precise function of the protein family is still unknown.	123
405655	pfam14996	RMP	Retinal Maintenance. RMP is encoded for by a gene, C8orf37. Mutations in the gene cause two types of retinal dystrophies: cone-rod dystrophy type 16 (CORD16) and retinitis pigmentosa type 64 (RP64). CORD16 affects the cone receptors which detect red, green or blue wavelengths of light and RP64 affects the cone receptors first and then the rod receptors. Both of these affect the photo-receptors in the eye leading to colour blindness or blindness respectively.	154
405656	pfam14997	CECR6_TMEM121	CECR6/TMEM121 family. This family includes Cat eye syndrome critical region protein 6, a protein which has been identified in a screen for candidate genes for the developmental disorder Cat Eye Syndrome (CES). It also includes the TMEM121 transmembrane proteins. The function of this family is unknown.	194
405657	pfam14998	Ripply	Transcription Regulator. The precise function of this family is not clear, but it is thought to play a role in somitogenesis, development and transcriptional repression. Ripply is also known by an alternative name, Bowline. Bowline, is an associate protein of the transcriptional co-repressor XGrg-4. This family contains two conserved sequence motifs: WRPW and FPVQATI. The WRPW motif is thought to be required for binding to tle/groucho proteins. Ripply3 is also known as Down Syndrome Critical Region Protein 6 homolog. This family of proteins is found in eukaryotes. Proteins in this family are typically between 109 and 154 amino acids in length.	85
373461	pfam14999	Shadoo	Shadow of prion protein, neuroprotective. This protein family is a Prion-like protein and its function is neuroprotective and similar to PrP(C)-like. Shadoo is mainly expressed in the brain, and highly expressed in the hippocampus, the area of the brain which co-ordinates memory as well as spatial memory and navigation. This protein may also alter the biological actions of normal and abnormal Prion Protein (PrP) which lead to lethal neurodegenerative diseases. This family of proteins is found in eukaryotes. Proteins in this family are approximately 150 amino acids in length, of which the first 90 are alanine rich.	133
405658	pfam15000	TUSC2	tumor suppressor candidate 2. This family of proteins are candidate tumor suppressors.	111
405659	pfam15001	AP-5_subunit_s1	AP-5 complex subunit sigma-1. This family of proteins are subunits of the adaptor protein complex AP-5.	191
405660	pfam15002	ERK-JNK_inhib	ERK and JNK pathways, inhibitor. This coiled-coil domain, CCDC134, is a secretory protein that inhibits Mitogen activated protein kinase (MAPK) pathways such as Raf-1/MEK/ERK and JNK/SAPK but not p38. CCDC134 is widely expressed in normal adult tissues, tumor tissues and cell lines, which shows its importance in cell signal transduction pathways, transcription regulation and therefore cell survival. Additionally, CCDC134 is known to bind to a transcription adaptor, hADA2a, which forms part of the general control nonderepressible 5 (GCN5) histone acetyltransferase complex. Acetylation usually 'switches genes on' for transcription. Moreover, knocking out CCDC134 suppressed hADA2a-induced cell apoptosis activity and G1/S cell cycle arrest suggesting its importance in cell survival. This family of proteins is found in eukaryotes. Proteins in this family are typically between 188 and 257 amino acids in length. This family is a coiled-coil domain containing protein 134 (CCDC134) whereby the coiled-coil domain is a ubiquitous motif involved in oligomerization.	197
373465	pfam15003	HAUS2	HAUS augmin-like complex subunit 2. This family of proteins is found in eukaryotes. Proteins in this family are typically between 203 and 291 amino acids in length. HAUS augmin-like complex subunit 2 is alternatively called centrosomal protein of 27 kDa (CEP27). It is localized in the microtubule organising centre, the centrosome. These microtubules are part of the cytoskeleton and give the cell its shape, provide it with a platform for motility and are crucial for mitosis. This protein is part of the HAUS augmin-like complex. This interacts with the gamma-tubulin ring complex (gamma-TuRC) which is required for spindle generation. HAUS2 may also increase the tension between spindle and kinetochore allowing for chromosome segregation during mitosis. This protein is involved in mitotic spindle assembly, maintenance of centrosome integrity and completion of cytokinesis.	191
405661	pfam15004	MYEOV2	Myeloma-overexpressed-like. This family of proteins is found in eukaryotes. It includes human myeloma-overexpressed gene 2 protein. Proteins in this family are typically between 45 and 74 amino acids in length. There are two conserved sequence motifs: MKP and DEMF. The function of this family is unknown.	57
405662	pfam15005	IZUMO	Izumo sperm-egg fusion, Ig domain-associated. This IZUMO family is a domain just upstream of the immunoglobulin domain on Izumo proteins in higher eukaryotes. The actual function of this region of the Izumo proteins is not known. The full-length protein is a molecule with a single immunoglobulin (Ig) domain. It is thought that Izumo proteins bind to putative Izumo receptors on the oocyte. Izumo is not detectable on the surface of fresh sperm but becomes exposed only after an exocytotic process, the acrosome reaction, has occurred. Studies have shown that knock-out mice (Izumo-/- males) were sterile despite normal mating behaviour and ejaculation, indicating the importance of the protein in fertilisation. There are cysteine residues thought to form a disulphide bridge. Izumo is a typical type I membrane glycoprotein with one immunoglobulin-like domain and a putative N-glycoside link motif (Asn 204). There is a conserved GCL sequence motif. Izumo expression has been found to be testis-specific.	142
373468	pfam15006	DUF4517	Domain of unknown function (DUF4517). The function of this protein remains unknown. This family of proteins is found in eukaryotes and are typically between 160 and 182 amino acids in length.	152
405663	pfam15007	CEP44	Centrosomal spindle body, CEP44. CEP44 is a coiled coil domain found localized in the centrosome and spindle poles.	127
405664	pfam15008	DUF4518	Domain of unknown function (DUF4518). The precise function of this protein family is unknown but it is thought to be involved in apoptosis regulation.	263
405665	pfam15009	TMEM173	Transmembrane protein 173. Transmembrane protein 173, also known as stimulator of interferon genes protein (STING), is a transmembrane adaptor protein which is involved in innate immune signalling processes. It induces expression of type I interferons (IFN-alpha and IFN-beta) via the NF-kappa-B and IRF3 pathways in response to non-self cytosolic RNA and dsDNA.	293
405666	pfam15010	FAM131	Putative cell signalling. The precise function of this protein family is unknown, however studies have shown it undergoes Protein N-myristoylation; a type of lipid modification in eukaryotic and viral proteins. Protein N-myristoylation is usually an irreversible co-translational protein modification which is useful in cell signal transduction pathways. This indicates that FAM131 may have some sort of role in cell signalling due to its ability to be myristoylated. This family of proteins is found in eukaryotes and are typically between 257 and 361 amino acids in length.	278
405667	pfam15011	CK2S	Casein Kinase 2 substrate. It is suggested that CK2S (C10orf109) is important in the regulation of cancer cell proliferation. Studies have indicated that CK2S is the downstream target of a protein kinase, casein kinase 2 (CK2), which is upregulated in cancer cells. CK2S has been found to be upregulated in cancer cells. The precise mechanism of CK2 targeting CK2S is not well characterized. It is found to be localized in the nucleus and cytoplasm. This family of proteins is found in eukaryotes. Proteins in this family are typically between 160 and 221 amino acids in length. There is a single completely conserved residue P that may be functionally important.	158
373474	pfam15012	DUF4519	Domain of unknown function (DUF4519). This family of proteins is found in eukaryotes. Proteins in this family are typically between and 59 amino acids in length. There are two conserved sequence motifs: KET and VLP. There is a single completely conserved residue P that may be functionally important.	55
405668	pfam15013	CCSMST1	CCSMST1 family. This family of proteins was discovered in a screen of Bos taurus placental ESTs. The B. taurus member of this family was named cattle cerebrum and skeletal muscle-specific transcript 1. This family of proteins is found in eukaryotes. Proteins in this family are typically between 97 and 157 amino acids in length. There is a single completely conserved residue D that may be functionally important. The function of this family is unknown.	74
405669	pfam15014	CLN5	Ceroid-lipofuscinosis neuronal protein 5. 	301
373477	pfam15015	NYD-SP12_N	Spermatogenesis-associated, N-terminal. NYD-SP12, also known as SPATA16, is a germ-cell specific participant in the Golgi apparatus, and its expression is confined to spermatogenic epithelium, not being found in interstitial cells. Computer analysis of the protein-sequence showed that NYD-SP12 contains a cluster of phosphorylation sites for protein kinase C as well as for cyclic nucleotide-dependent protein kinases. It is postulated that since the mutation of some Golgi apparatus' proteins are responsible for male infertility that NYD-SP12 might play a role in modification and sorting of acrosomal enzymes. OMIM:102530.	564
405670	pfam15016	DUF4520	Domain of unknown function (DUF4520). This family of proteins is found in eukaryotes. Proteins in this family are typically between 197 and 638 amino acids in length. This is the C-terminal domain of the member proteins.	85
405671	pfam15017	WRNPLPNID	Putative WW-binding domain and destruction box. This short conserved region is a putative destruction-box, with its RxxLxxI sequence motif, though the homology is not absolute. The domain occurs on a number of tumorigenic proteins, on some RNA-binding proteins and serine-threonine regulatory proteins. The second less well-conserved motif, WITPS, is a potential WW domain ligand-binding motif for recruiting proteins to their substrates. WW domains bind tightly to short proline-containing peptides that are typically in regions of native disordered polypeptide, as this family is as it lies between a PIN domain and a zinc-binding domain.	61
405672	pfam15018	InaF-motif	TRP-interacting helix. This highly conserved motif is thought to be a transmembrane helix that binds to transient receptor potential (TRP) calcium channel. It is known that proline-rich proteins inactivate tannins found in food compounds, and it is putatively thought that PRR24 does too. This is important since tannins often inhibit the uptake of iron. InaF is a protein required for TRP calcium channel function in Drosophila. TRP-related channels have been suggested to mediate store-operated calcium entry, important for Ca2+ homeostasis in a wide variety of cell types. The amino acid sequence of PRR-24 contains two completely conserved Y residues that may be functionally important. This domain family is found in eukaryotes, and is approximately 40 amino acids in length.	35
405673	pfam15019	C9orf72-like	C9orf72-like protein family. The precise function of this family is unknown but members have been found to be localized in the cytoplasm of brain tissue. Defects in the gene, C9orf72, are the cause of frontotemporal dementia and/or amyotrophic lateral sclerosis (FTDALS) which is an autosomal dominant neurodegenerative disorder. The disorder is caused by a large expansion of a GGGGCC hexa-nucleotide within the first C9orf72 intron located between the first and the second non-coding exons. The expansion leads to the loss of transcription of one of the two transcripts encoding isoform 1 and to the formation of nuclear RNA foci. This domain family is found in eukaryotes, and is typically between 230 and 250 amino acids in length. There is a single completely conserved residue F that may be functionally important.	230
405674	pfam15020	CATSPERD	Cation channel sperm-associated protein subunit delta. The CATSPER (cation channel of sperm) complex is a tetrameric complex consisting of CATSPER1, CATSPER2, CATSPER3 and CATSPER4, it functions as an alkalinisation-activated calcium channel. This complex requires several auxiliary subunits, including CATSPERD. CATSPERD is essential for the cation channel function and may play a role in channel assembly or transport.	727
405675	pfam15021	DUF4521	Protein of unknown function (DUF4521). This family of vertebrate proteins is functionally uncharacterized. The family includes the Chromosome 20 protein C20orf196.	198
405676	pfam15022	DUF4522	Protein of unknown function (DUF4522). This family of proteins is functionally uncharacterized. This family of proteins is found in mammals. In human this protein is known as C4orf36.	117
405677	pfam15023	DUF4523	Protein of unknown function (DUF4523). This family of proteins is functionally uncharacterized. This family of proteins is found in mammals.	166
405678	pfam15024	Glyco_transf_18	Glycosyltransferase family 18. Enzymes belonging to glycosyltransferase family 18 (alpha-1,6-mannosylglycoprotein 6-beta-N-acetylglucosaminyltransferase) contribute to the creation of branches in complex-type N-glycans. This domain is responsible for the catalytic activity of the enzyme.	557
405679	pfam15025	DUF4524	Domain of unknown function (DUF4524). This family of proteins is found in eukaryotes. Proteins in this family are typically between 197 and 638 amino acids in length. This is the N-terminal domain of the member proteins. The human gene is from C5orf34.	145
373487	pfam15027	DUF4525	Domain of unknown function (DUF4525). This domain is found in eukaryotes. It is often found at the N-terminus of glycosyltransferase family 18 enzymes (pfam15024). It is also found in coiled-coil domain-containing protein 126.	137
373488	pfam15028	PTCRA	Pre-T-cell antigen receptor. The pre-T-cell antigen receptor (pre-TCR), expressed by immature thymocytes, has a pivotal role in early T-cell development, including TCR beta-selection, survival and proliferation of CD4(-)CD8(-) double-negative thymocytes, and subsequent alpha/beta T-cell lineage differentiation. This protein contains an immunoglobulin domain.	127
405680	pfam15029	TMEM174	Transmembrane protein 174. This family of proteins is found in chordates and includes the human integral membrane protein TMEM174 protein.	235
405681	pfam15030	DUF4527	Protein of unknown function (DUF4527). This family of proteins is functionally uncharacterized. This family of proteins is found in vertebrates.	276
405682	pfam15031	DUF4528	Domain of unknown function (DUF4528). This family of proteins is found in eukaryotes. Proteins in this family are typically between 95 and 154 amino acids in length. This family includes Human C15orf61.	126
405683	pfam15032	DUF4529	Protein of unknown function (DUF4529). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. The proteins contain a conserved VLPPLK sequence motif.	402
405684	pfam15033	Kinocilin	Kinocilin protein. This family of kinocilin proteins is found in vertebrates. In mouse it has been shown that this protein is expressed primarily in the kinocilium of sensory cells in the inner ear.	123
291693	pfam15034	KRTAP7	KRTAP type 7 family. This family of keratin associated proteins is found in vertebrates.	84
405685	pfam15035	Rootletin	Ciliary rootlet component, centrosome cohesion. 	189
405686	pfam15036	IL34	Interleukin 34. 	157
405687	pfam15037	IL17_R_N	Interleukin-17 receptor extracellular region. This domain is found at the N-terminus (extracellular region) of interleukin-17 receptor C and Interleukin-17 receptor E. This is the presumed ligand-binding domain. Human putative interleukin-17 receptor E-like consists only of this domain.	388
405688	pfam15038	Jiraiya	Jiraiya. Jiraiya inhibits bone morphogenetic protein (BMP) signaling during embryogenesis. The human member of this family is TMEM221.	170
405689	pfam15039	DUF4530	Domain of unknown function (DUF4530). This family of proteins is found in eukaryotes. Proteins in this family are typically around 140 amino acids in length. The human member of this family is C19orf69.	113
373499	pfam15040	Humanin	Humanin family. This family of proteins is found exclusively in humans. Humanin is a short anti-apoptotic peptide that interacts with Bax.	24
405690	pfam15041	DUF4531	Domain of unknown function (DUF4531). This family of uncharacterized proteins is found in mammals. This family includes the human protein C19orf71.	184
405691	pfam15042	LELP1	Late cornified envelope-like proline-rich protein 1. This family of uncharacterized proteins is found in mammals.	106
405692	pfam15043	CNRIP1	CB1 cannabinoid receptor-interacting protein 1. This family of proteins interacts with cannabinoid receptor 1 (CNR1) and attenuates CNR1-mediated tonic inhibition of voltage-gated calcium channels.	152
405693	pfam15044	CLU_N	Mitochondrial function, CLU-N-term. CLU_N is the N-terminal domain of the Clueless protein, also known as TIF31-like in other organisms. The function of this domain is not known. This family is found in association with pfam13236.	79
405694	pfam15045	Clathrin_bdg	Clathrin-binding box of Aftiphilin, vesicle trafficking. Aftiphilin forms a stable complex with p200 and gamma-synergin. This family contains a clathrin box, with two identified clathrin-binding motifs. This family of proteins is found in eukaryotes.	80
405695	pfam15046	DUF4532	Protein of unknown function (DUF4532). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes.	279
405696	pfam15047	DUF4533	Protein of unknown function (DUF4533). This family of proteins is functionally uncharacterized. This family of proteins is found in mammals. This family includes two human proteins: C12orf60 and C12orf69.	226
405697	pfam15048	OSTbeta	Organic solute transporter subunit beta protein. 	122
405698	pfam15049	DUF4534	Protein of unknown function (DUF4534). This family of proteins is functionally uncharacterized. This family of proteins is found in mammals. Proteins in this family are typically between 170 and 190 amino acids in length. The protein includes the human integral membrane TMEM217 protein.	163
405699	pfam15050	SCIMP	SCIMP protein. This family contains the SCIMP proteins, which are transmembrane adaptor proteins involved in major histocompatibility complex class II signaling.	132
373510	pfam15051	FAM198	FAM198 protein. This family of proteins is found in eukaryotes. The function of this family is unknown. Murine FAM198B is downregulated by FGFR signalling.	327
405700	pfam15052	TMEM169	TMEM169 protein family. This domain is thought to be structured transmembrane helices and includes the intermediary cytoplasmic domain. It is found in eukaryotes, and is approximately 130 amino acids in length.	130
405701	pfam15053	Njmu-R1	Njmu-R1-like protein family. This protein family is thought to have a role in spermatogenesis. This family of proteins is found in eukaryotes. In humans, it is found in chromosome 17 open reading frame 75 (C17orf75). Proteins in this family are typically between 217 and 399 amino acids in length.	345
405702	pfam15054	DUF4535	Domain of unknown function (DUF4535). This family includes the uncharacterized protein C7orf73 that is found in eukaryotes. Members are generally less than 100 residues in length. Although the precise function of the domain is still unknown, members have a predicted N-terminal signal peptide sequence which suggests they are short secreted peptides.	45
405703	pfam15055	DUF4536	Domain of unknown function (DUF4536). This domain family is thought to be a transmembrane helix. It is found in eukaryotes, and is approximately 50 amino acids in length. In humans, it is located in the chromosomal position, C9orf123. The family contains the uncharacterized Sch. pombe protein TAM6 which is found in the mitochondrion.	47
405704	pfam15056	NRN1	Neuritin protein family. The domain family Neuritin1 (NRN1) is a GPI-anchored protein expressed in post-mitotic-differentiating neurons in the developing nervous system. NRN1 is a glutamate and neurotrophin receptor target encoding a neuronal protein that functions extracellularly to modulate neurite outgrowth (OMIM:607409). This family of proteins is found in eukaryotes. Proteins in this family are typically between and 158 amino acids in length.	89
405705	pfam15057	DUF4537	Domain of unknown function (DUF4537). The function of this domain family is unknown. It is found in eukaryotes, and is typically between 119 and 141 amino acids in length. In humans, it is found in the chromosomal position C11orf16.	122
373517	pfam15058	Speriolin_N	Speriolin N-terminus. This family represents the N-terminus of the sperm centrosome protein speriolin.	200
405706	pfam15059	Speriolin_C	Speriolin C-terminus. This family represents the C-terminus of the sperm centrosome protein speriolin.	148
405707	pfam15060	PPDFL	Differentiation and proliferation regulator. Pancreatic progenitor cell differentiation and proliferation factor-like protein (PPDFL) is alternatively named Exocrine differentiation and proliferation factor-like protein. PPDFL regulates exocrine cell fate. This protein is highly expressed in exocrine progenitor cells which eventually differentiate to form exocrine pancreatic cells.	110
405708	pfam15061	DUF4538	Domain of unknown function (DUF4538). This protein family is thought to be a transmembrane helix. Its function remains unknown. This family of proteins is found in eukaryotes. Proteins in this family are typically between 58 and 87 amino acids in length.	56
405709	pfam15062	ARL6IP6	Haemopoietic lineage transmembrane helix. ADP-ribosylation factor-like protein 6-interacting protein 6 (ARP6) is a transmembrane helix present in the J2E erythro-leukaemic cell line, but not its myeloid variants. In tissues, ARL-6 mRNA was most abundant in brain and kidney. While ARL-6 protein was predominantly cytosolic, it is known to bind to SEC61-beta subunit of a protein conducting channel SEC61p.	86
405710	pfam15063	TC1	Thyroid cancer protein 1. Thyroid cancer protein 1 (TC1) is thought to decrease apoptosis and increase cell proliferation. It is found to be expressed in thyroid papillary carcinoma. This suggests its importance in thyroid cancer. The molecular mechanism of TC1 involves up-regulating cell signalling through the ERK-1/2 signalling pathway, and it positively regulates the transition between the G1 and S phases of the cell cycle. It is thought to positively regulate the Wnt/beta-catenin signalling pathway by interacting with its repressor. In humans, it is located in the chromosomal position, C8orf4. This family of proteins is found in eukaryotes and contains a conserved NIF sequence motif.	74
373523	pfam15064	CATSPERG	Cation channel sperm-associated protein subunit gamma. This family represents the gamma subunit of the CATSPER, or cation channel sperm-associated protein complex. The complex appears only to be expressed in the flagellum of sperm. The complex is activated at alkaline intracellular pH, and being restricted to the flagellum is the mediating calcium channel.	971
405711	pfam15065	NCU-G1	Lysosomal transcription factor, NCU-G1. NCU-G1 is a set of highly conserved nuclear proteins rich in proline with a molecular weight of approximately 44 kDa. Especially high levels are detected in human prostate, liver and kidney. NCU-G1 is a dual-function family capable of functioning as a transcription factor as well as a nuclear receptor co-activator by stimulating the transcriptional activity of peroxisome proliferator-activated receptor-alpha (PPAR-alpha).	356
405712	pfam15066	CAGE1	Cancer-associated gene protein 1 family. CAGE-1 is a family of proteins overexpressed in tumor tissues compared with surrounding tissues. CAGE-1 gene showed testis-specific expression among normal tissues and displayed wide expression in a variety of cancer cell lines and cancer tissues. CAGE-1 is predominantly expressed during post-meiotic stages. It localizes to the acrosomal matrix and acrosomal granule showing it to be a component of the acrosome of mammalian spermatids and spermatozoa.	528
405713	pfam15067	FAM124	FAM124 family. The exact function of this protein family remains unknown. This family of proteins is found in eukaryotes. Proteins in this family are approximately 480 amino acids in length. There is a conserved LFL sequence motif.	235
405714	pfam15068	FAM101	FAM101 family. This protein family includes the actin regulators, Refilin A and B, however the exact function of this protein family remains unknown. Refilin is thought to stabilize peri-nuclear actin filament bundles, important in fibroblasts. Refilin is important as changes in localization and shape in the nucleus plays a role in cellular and developmental processes.	208
405715	pfam15069	FAM163	FAM163 family. This protein family is alternatively named Neuroblastoma-derived secretory proteins. Highly expressed in neuroblastoma compared to other tissues, suggesting that it may be used as a marker for metastasis in bone marrow.	163
405716	pfam15070	GOLGA2L5	Putative golgin subfamily A member 2-like protein 5. The function of the GOLGA2L5 protein family remains unknown. This family of proteins is thought to be found in the Golgi apparatus of eukaryotes. Proteins in this family are typically between and 840 amino acids in length.	523
405717	pfam15071	TMEM220	Transmembrane family 220, helix. Transmembrane 220 (TMEM220) is a domain of unknown function. It is thought to be a transmembrane helix. The length of this protein is typically between 150 and 160 amino acids. In humans, it is found in the chromosomal position 17p13.1.	99
405718	pfam15072	DUF4539	Domain of unknown function (DUF4539). This family of proteins is found in eukaryotes. Proteins in this family are typically between 230 and 625 amino acids in length.	85
405719	pfam15073	DUF4540	Domain of unknown function (DUF4540). This family of proteins is found in eukaryotes. Proteins in this family are typically between 109 and 302 amino acids in length. In humans, it is found in the chromosomal position, C7orf72.	128
405720	pfam15074	DUF4541	Domain of unknown function (DUF4541). This family of proteins is found in eukaryotes. Proteins in this family are typically between 100 and 163 amino acids in length. There is a conserved KLHRDDR sequence motif. There is a single completely conserved residue Y that may be functionally important. In humans, the gene is found in the chromosomal location, C5orf49.	92
405721	pfam15075	DUF4542	Domain of unknown function (DUF4542). This family of proteins is found in eukaryotes. Proteins in this family are typically between 123 and 173 amino acids in length. There is a conserved IPPYN sequence motif. The gene that encodes this protein in humans, is found in the chromosomal position, C17orf98.	132
373535	pfam15076	DUF4543	Domain of unknown function (DUF4543). This family of proteins is found in eukaryotes. Proteins in this family are typically between and 90 amino acids in length. The human member of this family is C17orf67.	74
405722	pfam15077	MAJIN	Membrane-anchored junction protein. Membrane-anchored junction protein (MAJIN) is a meiosis-specific telomere-associated protein involved in meiotic telomere attachment to the nucleus inner membrane, a crucial step for homologous pairing and synapsis. It is a component of the MAJIN-TERB1-TERB2 complex, which promotes telomere cap exchange by mediating attachment of telomeric DNA to the inner nuclear membrane and replacement of the protective cap of telomeric chromosomes.	241
405723	pfam15078	DUF4545	Domain of unknown function (DUF4545). This family of proteins is found in eukaryotes. Proteins in this family are typically between and 417 amino acids in length. The human member of this family is C1orf141.	465
373538	pfam15079	Tsc35	Testis-specific protein 35. Tsc35 (also referred to in the literature as Tsc24) is essential for spermatogenesis in mammalian male reproduction. It is expressed in the testis from day 35 onwards in mice.	199
373539	pfam15080	DUF4547	Domain of unknown function (DUF4547). This family of proteins is found in eukaryotes. Proteins in this family are typically between 144 and 206 amino acids in length. The human member of this family is C3orf43.	196
373540	pfam15081	DUF4548	Domain of unknown function (DUF4548). This family of proteins is found in eukaryotes. Proteins in this family are typically between and 178 amino acids in length. The human member of this family is C1orf105.	167
405724	pfam15082	DUF4549	Domain of unknown function (DUF4549). This family of proteins is found in eukaryotes. Proteins in this family are typically between 143 and 1871 amino acids in length. The human member of this family is C6orf183.	142
373542	pfam15083	Colipase-like	Colipase-like. This is a family of colipase-like proteins.	90
405725	pfam15084	DUF4550	Domain of unknown function (DUF4550). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is approximately 100 amino acids in length. This domain contains an N-terminal HXE motif.	95
405726	pfam15085	NPFF	Neuropeptide FF. 	109
405727	pfam15086	UPF0542	Uncharacterized protein family UPF0542. This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. There is a conserved LSWKL sequence motif. This family includes human protein C5orf43.	67
405728	pfam15087	DUF4551	Protein of unknown function (DUF4551). This family of proteins is functionally uncharacterized. This family of proteins is found in metazoa. This family includes human protein C12orf56.	571
405729	pfam15088	NADH_dh_m_C1	NADH dehydrogenase [ubiquinone] 1 subunit C1, mitochondrial. 	49
405730	pfam15089	DUF4552	Domain of unknown function (DUF4552). This family of proteins is functionally uncharacterized. This family of proteins is found in vertebrates. Proteins in this family are typically between 425 and 649 amino acids in length.	425
405731	pfam15090	DUF4553	Domain of unknown function (DUF4553). This family of proteins is functionally uncharacterized. This family of proteins is found in vertebrates. This family includes the human protein C10orf12.	474
405732	pfam15091	DUF4554	Domain of unknown function (DUF4554). This family of proteins is functionally uncharacterized. This family of proteins is found in some vertebrates. This family includes human protein C11orf80.	456
405733	pfam15092	UPF0728	Uncharacterized protein family UPF0728. This family of proteins is functionally uncharacterized. This family of proteins is found in metazoa. There is a conserved GPY sequence motif.	88
405734	pfam15093	DUF4555	Domain of unknown function (DUF4555). This family of proteins is functionally uncharacterized. This family of proteins is found in metazoa. This family includes the human protein C7orf31.	284
405735	pfam15094	DUF4556	Domain of unknown function (DUF4556). This family of proteins is functionally uncharacterized. This family of proteins is found in vertebrates. This family includes human protein C1orf127.	215
405736	pfam15095	IL33	Interleukin 33. 	266
405737	pfam15096	G6B	G6B family. 	220
405738	pfam15097	Ig_J_chain	Immunoglobulin J chain. 	134
405739	pfam15098	TMEM89	TMEM89 protein family. The function of this family of transmembrane proteins, TMEM89, has not, as yet, been determined. Members of this family are as yet uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are approximately 159 amino acids in length.	131
405740	pfam15099	PIRT	Phosphoinositide-interacting protein family. The function of this family, PIRT, is not known, however it is predicted to be a multi-pass membrane protein. This family of proteins is thought to have a role in positively regulating TRPV1 channel activity via phosphatidylinositol 4,5-bisphosphate (PIP2). This family of proteins is found in eukaryotes. Proteins in this family are located in the cell membrane. Proteins in this family are approximately 140 amino acids in length.	131
405741	pfam15100	TMEM187	TMEM187 protein family. The function of this family, TMEM187, is not known, however it is predicted to be a multi-pass membrane protein. Members of this family are as yet uncharacterized. This protein family is alternatively named ITBA1. This family of proteins is found in eukaryotes. Proteins in this family are typically between 239 and 267 amino acids in length.	244
405742	pfam15101	TERB2	Telomere-associated protein TERB2. TERB2 is a meiosis-specific telomere-associated protein involved in meiotic telomere attachment to the nucleus inner membrane, a crucial step for homologous pairing and synapsis.	207
405743	pfam15102	TMEM154	TMEM154 protein family. The function of this family of transmembrane proteins has not, as yet, been determined. However, it is thought to be a therapeutic target for ovine lentivirus infection. This family of proteins is found in eukaryotes and members are typically between 138 and 320 amino acids in length.	153
405744	pfam15103	G0-G1_switch_2	G0/G1 switch protein 2. This family of proteins regulates apoptosis by binding to Bcl-2 and preventing the formation of the anti-apoptotic BAX-BCL2 heterodimers.	105
405745	pfam15104	DUF4558	Domain of unknown function (DUF4558). This family of proteins is found in eukaryotes. Proteins in this family are typically between 78 and 121 amino acids in length. One member is annotated as being a flagellar associated protein.	86
405746	pfam15105	TMEM61	TMEM61 protein family. The function of this family of transmembrane proteins has not, as yet, been determined. Members of this family remain uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 150 and 211 amino acids in length.	182
405747	pfam15106	TMEM156	TMEM156 protein family. The function of this family of transmembrane proteins, TMEM156, has not, as yet, been determined. Members of this family are as yet uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are approximately 310 amino acids in length. In humans, the gene encoding this protein is located in the chromosomal position, 4p14.	226
405748	pfam15107	FAM216B	FAM216B protein family. The function of this family of proteins, FAM216B, has not, as yet, been determined. Members of this family are as yet uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are approximately 150 amino acids in length. In humans, the gene encoding this protein is located in the position, C13orf30.	103
373566	pfam15108	TMEM37	Voltage-dependent calcium channel gamma-like subunit protein family. This family of transmembrane proteins, TMEM37, has a role in stabilizing the calcium channel in an inactivated (closed) state. It is a subunit of the L-type calcium channels. This family of proteins is found in eukaryotes. Proteins in this family are approximately 210 amino acids in length.	182
405749	pfam15109	TMEM125	TMEM125 protein family. The function of this family of transmembrane proteins, TMEM125, has not, as yet, been determined. Members of this family are as yet uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 55 and 232 amino acids in length.	109
373568	pfam15110	TMEM141	TMEM141 protein family. The function of this family of transmembrane proteins, TMEM141, has not, as yet, been determined. Members of this family remain uncharacterized. TMEM141 protein family is found in eukaryotes. Proteins in this family are typically between 103 and 124 amino acids in length. There are two completely conserved residues (C and W) that may be functionally important.	91
405750	pfam15111	TMEM101	TMEM101 protein family. The function of this family of transmembrane proteins, TMEM101, has not, as yet, been determined. Members of this family remain uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 127 and 257 amino acids in length.	249
405751	pfam15112	DUF4559	Domain of unknown function (DUF4559). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. This family includes human protein CXorf38.	311
405752	pfam15113	TMEM117	TMEM117 protein family. The function of this family of transmembrane proteins has not, as yet, been determined. Members of this family are as yet uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 181 and 504 amino acids in length.	410
405753	pfam15114	UPF0640	Uncharacterized protein family UPF0640. This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 70 and 80 amino acids in length. There are two conserved sequence motifs: PGK and YRFLP.	66
405754	pfam15115	HDNR	Domain of unknown function with conserved HDNR motif. This family of proteins is found in eukaryotes. Proteins in this family are typically between 117 and 219 amino acids in length. There is a conserved HDNR sequence motif. The function is not known.	174
405755	pfam15116	CD52	CAMPATH-1 antigen. 	41
373575	pfam15117	UPF0697	Uncharacterized protein family UPF0697. This family of uncharacterized proteins is found in vertebrates. Proteins in this family are typically around 100 amino acids in length.	98
373576	pfam15118	DUF4560	Domain of unknown function (DUF4560). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 66 and 78 amino acids in length. There are two conserved sequence motifs: FCK and RTL.	64
405756	pfam15119	APOC4	Apolipoprotein C4. 	94
405757	pfam15120	SPACA9	Sperm acrosome-associated protein 9. This family of proteins found in eukaryotes represents sperm acrosome-associated protein 9 (SPACA9, previously known as C9orf9 or MAST). Sperm acrosome-associated protein 9 has been suggested to form a complex with calcium-binding proteins calreticulin and caldendrin localized to the acrosome. Despite this, no known protein interaction motifs have been identified in MAST/SPACA9.	164
405758	pfam15121	TMEM71	TMEM71 protein family. The function of this family, TMEM71, is not known, however it is predicted to be a transmembrane protein. This family of proteins is found in eukaryotes and located in the cell membrane. Proteins in this family vary between 41 and 291 amino acids in length.	150
405759	pfam15122	TMEM206	TMEM206 protein family. The function of this family of transmembrane proteins, TMEM206, has not, as yet, been determined. Members of this family remain uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are approximately 350 amino acids in length.	296
405760	pfam15123	DUF4562	Domain of unknown function (DUF4562). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. There is a conserved HRYQNPW sequence motif. This family includes the human protein C4orf45.	112
373581	pfam15124	FANCD2OS	FANCD2 opposite strand protein. This family of proteins of unknown function gets its name from its position in the mammalian genome: Fanconi anemia group D2 protein opposite strand transcript protein.	175
405761	pfam15125	TMEM238	TMEM238 protein family. The function of this family of transmembrane proteins, TMEM238, has not, as yet, been determined. Members of this family are as yet uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 61 and 153 amino acids in length.	64
405762	pfam15127	SmAKAP	Small membrane A-kinase anchor protein. SmAKAP is a small membrane-bound PKA-RI-specific protein kinase A-anchoring protein, referred to as small membrane-AKAP. It is probably tethered to the plasma membrane through a dual acylation of its N-terminal Met-Gly-Cys- motif (myristoylation and palmitoylation, respectively). It specifically targets PKA-RI isoforms to the plasma membrane. It localizes to plasma membranes, is enriched at cell-cell junctions and associates with filopodia.	97
405763	pfam15128	T_cell_tran_alt	T-cell leukemia translocation-altered. This family of proteins is required for osteoclastogenesis.	92
405764	pfam15129	FAM150	FAM150 family. This family of proteins known as FAM150 is found in eukaryotes. Members of this family are as yet uncharacterized. Proteins in this family are approximately 143 amino acids in length. The function of this family has not, as yet, been determined, however it is predicted to be a secretory protein family.	124
405765	pfam15130	DUF4566	Domain of unknown function (DUF4566). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. This family includes human protein C6orf62.	226
373586	pfam15131	DUF4567	Domain of unknown function (DUF4567). This family of proteins is functionally uncharacterized. This family of proteins is found in some mammals.	75
405766	pfam15132	DUF4568	Domain of unknown function (DUF4568). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes.	194
405767	pfam15133	DUF4569	Domain of unknown function (DUF4569). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. This family includes human protein CXorf21.	304
405768	pfam15134	DUF4570	Domain of unknown function (DUF4570). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes.	110
405769	pfam15135	UPF0515	Uncharacterized protein UPF0515. This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. There are two conserved sequence motifs: PLT and HSC.	271
405770	pfam15136	UPF0449	Uncharacterized protein family UPF0449. This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. There is a conserved LPTRP sequence motif.	99
405771	pfam15137	DUF4571	Domain of unknown function (DUF4571). This family of proteins is functionally uncharacterized. This family of proteins is found in vertebrates. This family includes human protein C21orf62.	214
405772	pfam15138	Syncollin	Syncollin. This family has a role in zymogen granule exocytosis.	112
405773	pfam15139	DUF4572	Domain of unknown function (DUF4572). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 160 and 220 amino acids in length.	195
405774	pfam15140	DUF4573	Domain of unknown function (DUF4573). This family of proteins is found in eukaryotes. Proteins in this family are typically approximately 360 amino acids in length.	176
405775	pfam15141	DUF4574	Domain of unknown function (DUF4574). This family of proteins is found in eukaryotes. Proteins in this family are typically between and 86 amino acids in length.	87
405776	pfam15142	INCA1	INCA1. This family of proteins inhibits cyclin-dependent kinase activity.	178
291801	pfam15143	DUF4575	Domain of unknown function (DUF4575). This family of uncharacterized proteins is found in eukaryotes.	129
373598	pfam15144	DUF4576	Domain of unknown function (DUF4576). This family of uncharacterized proteins is found in eukaryotes.	88
405777	pfam15145	DUF4577	Domain of unknown function (DUF4577). The function of this family of proteins has not, as yet, been determined. Members of this family are as yet uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically 128 amino acids in length.	128
405778	pfam15146	FANCAA	Fanconi anemia-associated. This family of proteins plays a role in the Fanconi anemia-associated DNA damage response.	437
405779	pfam15147	DUF4578	Domain of unknown function (DUF4578). This family of proteins is found in eukaryotes. Proteins in this family are typically between 44 and 137 amino acids in length.	126
405780	pfam15148	Apolipo_F	Apolipoprotein F. 	198
405781	pfam15149	CATSPERB	Cation channel sperm-associated protein subunit beta protein family. The function of this family of transmembrane proteins, CATSPERB, has not, as yet, been determined. However, it is thought to play a role in sperm hyperactivation by associating with CATSPER1. This family of proteins is found in eukaryotes. Proteins in this family are typically between 220 and 1107 amino acids in length.	520
317555	pfam15150	PMAIP1	Phorbol-12-myristate-13-acetate-induced. This family carries a BH3 domain between residues 23 and 40.	54
405782	pfam15151	RGCC	Response gene to complement 32 protein family. This family of proteins is found in eukaryotes. Proteins in this family are typically between 44 and 130 amino acids in length. There is a conserved KLGDT sequence motif.	127
405783	pfam15152	Kisspeptin	Kisspeptin. 	76
405784	pfam15153	CYTL1	Cytokine-like protein 1. The function of this family of proteins, CYTL1, has not, as yet, been determined. However it is thought to be a secretory protein expressed in CD34+ haemopoietic cells. This family of proteins is found in eukaryotes. Proteins in this family are typically between 134 and 145 amino acids in length. There are two conserved sequence motifs: PPTCYSR and DDC.	127
317559	pfam15155	MRFAP1	MORF4 family-associated protein1. This family of proteins is found in eukaryotes. Proteins in this family are typically between and 127 amino acids in length.	119
405785	pfam15156	CLN6	Ceroid-lipofuscinosis neuronal protein 6. This family of proteins is found in eukaryotes. Proteins in this family are typically between 190 and 310 amino acids in length.	280
405786	pfam15157	IQCJ-SCHIP1	Fusion protein IQCJ-SCHIP1 with IQ-like motif. This is a family of eukaryotic fusion proteins. It bridges two adjacent genes that encode distinct proteins, IQCJ, a novel IQ motif containing protein and SCHIP1, a schwannomin interacting protein. It contains a unique calmodulin-binding IQ motif at the N-terminus not shared with its shorter isoform SCHIP1, suggesting a distinctive function for this protein. It is localized to cytoplasm and actin-rich regions, and in differentiated PC12 cells is seen in neurite extensions. The exact physiological function is unclear.	150
405787	pfam15158	DUF4579	Domain of unknown function (DUF4579). This family of proteins is found in eukaryotes. Proteins in this family are typically between 192 and 239 amino acids in length. The human member of this family is C8orfK29.	184
405788	pfam15159	PIG-Y	Phosphatidylinositol N-acetylglucosaminyltransferase subunit Y. This family of proteins represents subunit Y of the GPI-N-acetylglucosaminyltransferase (GPI-GnT) complex. It may regulate activity of the complex by binding the catalytic subunit, PIG-A.	70
373611	pfam15160	SASRP1	Spermatogenesis-associated serine-rich protein 1. Spermatogenesis-associated serine-rich protein 1 is a serine-rich protein differentially expressed during spermatogenesis.	236
317565	pfam15161	Neuropep_like	Neuropeptide-like. This family contains putative neuropeptides.	61
405789	pfam15162	DUF4580	Domain of unknown function (DUF4580). This family of proteins is found in eukaryotes. Proteins in this family are typically between 63 and 185 amino acids in length.	162
405790	pfam15163	Meiosis_expr	Meiosis-expressed. This family of proteins is essential for spermiogenesis.	75
373614	pfam15164	WBS28	Williams-Beuren syndrome chromosomal region 28 protein homolog. WBS28 is an integral membrane family. These proteins have been identified as being linked to Williams-Beuren syndrome, OMIM:194050. This family of proteins is found in eukaryotes, and members are typically 266 amino acids in length.	266
405791	pfam15165	REC114-like	Meiotic recombination protein REC114-like. REC114-like members are necessary for meiotic DNA double-strand break formation. It functions in conjunction with Mei4. This family of proteins is found in eukaryotes. Proteins in this family are typically between 43 and 259 amino acids in length.	239
291823	pfam15167	DUF4581	Domain of unknown function (DUF4581). This family of proteins is found in eukaryotes. Proteins in this family are typically 131 amino acids in length.	131
405792	pfam15168	TRIQK	Triple QxxK/R motif-containing protein family. TRIQK member-proteins share a characteristic triple repeat of the sequence QXXK/R, as well as a hydrophobic C-terminal region. Xenopus and mouse triqk genes are broadly expressed throughout embryogenesis, and mtriqk is also generally expressed in mouse adult tissues. TRIQK proteins are localized to the endoplasmic reticulum membrane. This family is found in eukaryotes and members are typically between and 86 amino acids in length.	79
405793	pfam15169	DUF4564	Domain of unknown function (DUF4564). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. This family includes the human protein C17orf62.	184
373618	pfam15170	CaM-KIIN	Calcium/calmodulin-dependent protein kinase II inhibitor. CaM-KIIN is the inhibitor of Calcium/calmodulin-dependent protein kinase II (CaMKII). CaMKII plays a central part in long-term potentiation, which underlies some forms of learning and memory. CaM-KIIN is a natural, specific inhibitor of CaMKII. This family is found in eukaryotes.	79
373619	pfam15171	Spexin	Neuropeptide secretory protein family, NPQ, spexin. Spexin, alternatively named NPQ, is a peptide hormone and is derived from a pro-hormone. This family of proteins has a role in inducing stomach wall contraction and is expressed in the submucosal layer of the mouse oesophagus and stomach. Spexin, like most peptide hormones, is a ligand for G-protein coupled receptors. Spexin is also thought to have a role in controlling arterial blood pressure as well as salt and water balance.	90
405794	pfam15172	Prolactin_RP	Prolactin-releasing peptide. 	45
405795	pfam15173	FAM180	FAM180 family. This family of proteins is found in eukaryotes. Proteins in this family are typically between 117 and 182 amino acids in length. There are two conserved sequence motifs: ELAS and DFE. The function of this family is unknown.	137
291830	pfam15174	PRNT	Prion-related protein testis-specific. PRNT is a family of prion-related proteins expressed in the testis. This family of proteins is found in eukaryotes. Proteins in this family are typically between 52 and 94 amino acids in length.	51
373622	pfam15175	SPATA24	Spermatogenesis-associated protein 24. This family of proteins bind to DNA and to TBP (TATA box binding protein), TATA-binding protein (TBP)-related protein 2 (TRF2) and several polycomb factors. It is likely to function as a transcription regulator.	170
405796	pfam15176	LRR19-TM	Leucine-rich repeat family 19 TM domain. LRR19-TM is the single-span transmembrane region of LRRC19, a leucine-rich repeat protein family. LRRC19 functions as a transmembrane receptor inducing pro-inflammatory cytokines. This suggests its role in innate immunity. This family of proteins is found in eukaryotes.	101
405797	pfam15177	IL28A	Interleukin-28A. The protein family, Interleukin-28A, plays an important role in modulating the immune system. This protein family is induced by viral infection and interacts with a class II receptor. This family of proteins is found in eukaryotes. Proteins in this family are typically between 145 and 195 amino acids in length.	156
405798	pfam15179	Myc_target_1	Myc target protein 1. This family of proteins is regulated by the c-Myc oncoprotein. It regulates the expression of several other c-Myc target genes.	193
405799	pfam15180	NPBW	Neuropeptides B and W. The function of this family, NPBW, which includes Neuropeptides B and W, is thought to be involved in activating G-protein coupled receptors, GPR7 and GPR8. It is thought to play a regulatory role in the organisation of neuroendocrine signals accessing the anterior pituitary gland. It is predicted that this effect will stimulate the increase in water-drinking and food-intake. This suggests it plays a role in the hypothalamic response to stress. This family of proteins is found in eukaryotes.	113
373627	pfam15181	SMRP1	Spermatid-specific manchette-related protein 1. This family of proteins, SMRP1, is thought to have a role in spermatogenesis and may be involved in differentiation or function of ciliated cells. This family of proteins is found in eukaryotes. Proteins in this family are typically approximately 260 amino acids in length.	262
405800	pfam15182	OTOS	Otospiralin. This family of proteins, Otospiralin, has a role in maintaining the neurosensory epithelium of the inner ear. This family of proteins is found in eukaryotes. Proteins in this family are approximately 90 amino acids in length.	69
405801	pfam15183	MRAP	Melanocortin-2 receptor accessory protein family. This family is thought to be involved in cell trafficking. It is required for MC2R expression in certain cell types, suggesting that it is involved in the processing, trafficking or function of MC2R. MRAP may be involved in the intracellular trafficking pathways in adipocyte cells. This family of proteins is found in eukaryotes. Proteins in this family are typically between 47 and 205 amino acids in length.	89
291840	pfam15184	TOM6p	Mitochondrial import receptor subunit TOM6 homolog. TOMM6 forms part of the pre-protein translocase complex of the outer mitochondrial membrane (TOM complex). This family of proteins is found in eukaryotes. Proteins in this family are typically between 43 and 74 amino acids in length.	74
291841	pfam15185	BMF	Bcl-2-modifying factor, apoptosis. BMF is thought to play a role in inducing apoptosis. It is thought to bind to Bcl-2 proteins. This family of proteins is found in eukaryotes. Proteins in this family are typically between 75 and 190 amino acids in length. There are two conserved sequence motifs: GNA and DQF.	224
405802	pfam15186	TEX13	Testis-expressed sequence 13 protein family. The function of this family of proteins has not, as yet, been determined. However, members are thought to be encoded for by spermatogonially-expressed, germ-cell-specific genes. This family of proteins is found in eukaryotes. Proteins in this family are typically between 177 and 384 amino acids in length. There are two conserved sequence motifs: FIN and LAL.	148
373630	pfam15187	Augurin	Oesophageal cancer-related gene 4. Augurin is alternatively named oesophageal cancer-related gene 4 protein. The function of this family of transmembrane proteins is to induce the senescence of oligodendrocyte and neural precursor cells, characterized by G1 arrest, RB1 dephosphorylation and accelerated CCND1 and CCND3 proteasomal degradation. Augurin has been found to stimulate the release of ACTH via the release of hypothalamic CRF. This family of proteins is found in eukaryotes. Proteins in this family are typically 145 amino acids in length.	115
405803	pfam15188	CCDC-167	Coiled-coil domain-containing protein 167. The function of this family of coiled-coil domains, has not, as yet, been determined. Members of this family remain uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between and 103 amino acids in length.	82
405804	pfam15189	MEIOC	Meiosis-specific coiled-coil domain-containing protein MEIOC. This family of proteins is found in eukaryotes. In humans, it is encoded for on the chromosomal position C17orf104.	162
405805	pfam15190	TMEM251	Transmembrane protein 251. This family of proteins, also known as UPF0694, is found in eukaryotes. Proteins in this family are around 135 amino acids in length. In humans, it is found on the chromosomal position, C14orf109.	128
405806	pfam15191	Synaptonemal_3	Synaptonemal complex central element protein 3. 	85
405807	pfam15192	TMEM213	TMEM213 family. This family of proteins is found in eukaryotes. Proteins in this family are typically between and 154 amino acids in length. The function of this family is unknown.	79
405808	pfam15193	FAM24	FAM24 family. This family of proteins is found in eukaryotes. Proteins in this family are typically between 87 and 101 amino acids in length. There are two conserved sequence motifs: FDLRT and CLY. The function of this family is unknown.	69
373637	pfam15194	TMEM191C	TMEM191C family. This family of proteins is found in eukaryotes. Proteins in this family are typically between and 302 amino acids in length. There are two conserved sequence motifs: QDC and RLF. The function of this family is unknown.	121
405809	pfam15195	TMEM210	TMEM210 family. This family of proteins is found in eukaryotes. Proteins in this family are typically between and 149 amino acids in length. The function of this family is unknown.	112
259330	pfam15196	Harakiri	Activator of apoptosis harakiri. 	94
291852	pfam15198	Dexa_ind	Dexamethasone-induced. 	90
373639	pfam15199	DAOA	D-amino acid oxidase activator. 	82
405810	pfam15200	KRTDAP	Keratinocyte differentiation-associated. 	76
291855	pfam15201	Rod_cone_degen	Progressive rod-cone degeneration. This family of proteins is involved in vision.	54
317593	pfam15202	Adipogenin	Adipogenin. This family of proteins is involved in the stimulation of adipocyte differentiation and development.	79
405811	pfam15203	TMEM95	TMEM95 family. This family of proteins is found in eukaryotes. Proteins in this family are typically between 102 and 231 amino acids in length. There is a conserved LGG sequence motif. The function of this family is unknown.	151
373642	pfam15204	KKLCAg1	Kita-kyushu lung cancer antigen 1. This is a family of cancer antigens.	85
405812	pfam15205	PLAC9	Placenta-specific protein 9. This family of proteins was identified as being enriched in placenta.	74
405813	pfam15206	FAM209	FAM209 family. This family of proteins is found in eukaryotes. Proteins in this family are typically between and 170 amino acids in length. The function of this family is unknown.	148
405814	pfam15207	TMEM240	TMEM240 family. This family of proteins is found in eukaryotes. Proteins in this family are typically between 54 and 175 amino acids in length. The function of this family is unknown.	174
405815	pfam15208	Rab15_effector	Rab15 effector. This family of proteins has a role in receptor recycling from the endocytic recycling compartment.	237
405816	pfam15209	IL31	Interleukin 31. 	127
373648	pfam15210	SFTA2	Surfactant-associated protein 2. 	59
405817	pfam15211	CXCL17	VEGF co-regulated chemokine 1. 	89
291866	pfam15212	SPATA19	Spermatogenesis-associated protein 19, mitochondrial. 	130
405818	pfam15213	CDRT4	CMT1A duplicated region transcript 4 protein. 	137
373651	pfam15214	PXT1	Peroxisomal testis-specific protein 1. This family of proteins is testis-specific.	50
291869	pfam15215	FDC-SP	Follicular dendritic cell secreted peptide. 	65
405819	pfam15216	TSLP	Thymic stromal lymphopoietin. 	125
405820	pfam15217	TSC21	TSC21 family. This family of proteins is testis-specific.	180
373654	pfam15218	SPATA25	Spermatogenesis-associated protein 25. This family of proteins may be involved in spermatogenesis.	222
317606	pfam15219	TEX12	Testis-expressed 12. 	96
291874	pfam15220	HILPDA	Hypoxia-inducible lipid droplet-associated. This family of proteins stimulate intracellular lipid accumulation, function as autocrine growth factors and enhance cell growth.	63
373655	pfam15221	LEP503	Lens epithelial cell protein LEP503. This protein may be involved in lens epithelial cell differentiation.	69
259356	pfam15222	KAR	Kidney androgen-regulated. The function of this family is unknown.	105
405821	pfam15223	DUF4584	Domain of unknown function (DUF4584). This family of proteins is found in eukaryotes. Proteins in this family are approximately 835 amino acids in length. The family is found in association with pfam02437.	418
405822	pfam15224	SCRG1	Scrapie-responsive protein 1. This protein family has an important function in acting against the prion protein, Scrapie. This family of proteins is found in eukaryotes. Proteins in this family are approximately 98 amino acids in length.	76
405823	pfam15225	IL32	Interleukin 32. 	100
373659	pfam15226	HPIP	HCF-1 beta-propeller-interacting protein family. HPIP is a small cellular polypeptide that binds to the beta-propeller domain of HCF-1. HPIP regulates HCF-1 activity by modulating its subcellular localization. HCF-1 is a cellular protein required by VP16 to activate the herpes simplex virus immediate-early genes. VP16 is a component of the viral tegument and, after release into the cell, binds to HCF-1 and translocates to the nucleus to form a complex with the POU domain protein Oct-1 and a VP16-responsive DNA sequence. HPIP-mediated export may provide the pool of cytoplasmic HCF-1 required for import of virion-derived VP16 into the nucleus.	133
405824	pfam15227	zf-C3HC4_4	zinc finger of C3HC4-type, RING. This is a family of primate-specific Ret finger protein-like (RFPL) zinc-fingers of the C3HC4 type. Ret finger protein-like proteins are primate-specific target genes of Pax6, a key transcription factor for pancreas, eye and neocortex development. This domain is likely to be DNA-binding. This zinc-finger domain together with the RDM domain, pfam11002, forms a large zinc-finger structure of the RING/U-Box superfamily. RING-containing proteins are known to exert an E3 ubiquitin protein ligase activity with the zinc-finger structure being mandatory for binding to the E2 ubiquitin-conjugating enzyme.	42
405825	pfam15228	DAP	Death-associated protein. 	95
405826	pfam15229	POM121	POM121 family. 	234
405827	pfam15230	SRRM_C	Serine/arginine repetitive matrix protein C-terminus. This domain is found near to the C-terminus of Serine/arginine repetitive matrix proteins 3 and 4.	67
405828	pfam15231	VCX_VCY	Variable charge X/Y family. The variable charge X/Y (VCX/VCY) family of proteins has members on the human X and Y chromosomes, is expressed in male germ cells and may play a role in spermatogenesis or in sex ratio distortion.	133
405829	pfam15232	DUF4585	Domain of unknown function (DUF4585). The function of this protein domain family is yet to be characterized. It is putatively thought to lie in the C-terminal domain of the DNA nucleotide repair protein, Xeroderma pigmentosa complementation group A (XPA). The function of XPA is to bind to DNA and repair any mismatched base pairs. This domain family is often found in eukaryotes, and is approximately 70 amino acids in length. There is a conserved DPE sequence motif. In humans, this protein is encoded for in the chromosomal position, Chromosome 5 open reading frame 65. Mutations in the gene lead to myelodysplastic syndromes, where there is inefficient stem cell production in the bone marrow. This suggests that the protein may have a role in forming blood cells.	71
405830	pfam15233	SYCE1	Synaptonemal complex central element protein 1. This family of proteins includes synaptonemal complex central element protein 1, a component of the synaptonemal complex involved in meiosis, and synaptonemal complex central element protein 1-like, which may be involved in meiosis.	147
405831	pfam15234	LAT	Linker for activation of T-cells. 	233
405832	pfam15235	GRIN_C	G protein-regulated inducer of neurite outgrowth C-terminus. This represents the C-terminus of the G protein-regulated inducer of neurite outgrowth proteins.	126
405833	pfam15236	CCDC66	Coiled-coil domain-containing protein 66. This protein family, named Coiled-coil domain-containing protein 66 (CCDC) refers to a protein domain found in eukaryotes, and is approximately 160 amino acids in length. CCDC66 protein is detected mainly in the inner segments of photoreceptors in many vertebrates including mice and humans. It has been found in dogs, that a mutation in the CCDC66 gene causes generalized progressive retinal atrophy (gPRA). This shows that the protein encoded for by this gene is vital for healthy vision and guards against photoreceptor cell degeneration. The structure of CCDC66 proteins includes a heptad repeat pattern which contains at least one coiled-coil domain. There are at least two or more alpha-helices which form a cable-like structure.	152
405834	pfam15237	PTRF_SDPR	PTRF/SDPR family. This family of proteins includes muscle-related coiled-coil protein (MURC), protein kinase C delta-binding protein (PRKCDBP), polymerase I and transcript release factor (PTRF) and serum deprivation-response protein (SDPR). MURC activates the Rho/ROCK pathway. PRKCDBP appears to act as an immune potentiator. PTRF is involved in caveolae formation and function. SDPR is involved in the targeting of protein kinase Calpha to caveolae.	250
405835	pfam15238	FAM181	FAM181. This family of proteins is found in eukaryotes. Proteins in this family are typically between 256 and 426 amino acids in length.	283
405836	pfam15239	DUF4586	Domain of unknown function (DUF4586). This protein family, refers to a domain of unknown function. The precise role of this protein domain remains to be elucidated. This family of proteins is found in eukaryotes and are typically between 256 and 320 amino acids in length. There is a single completely conserved residue, phenylalanine (F), that may be functionally important. In humans, the protein is found in the position, chromosome 4 open reading frame 47.	294
405837	pfam15240	Pro-rich	Proline-rich. This family includes several eukaryotic proline-rich proteins.	167
405838	pfam15241	Cylicin_N	Cylicin N-terminus. This is the N-terminus of cylicin proteins, which may play a role in spermatid differentiation.	107
405839	pfam15242	FAM53	Family of FAM53. The FAM53 protein family refers to a family of proteins, which bind to a transcriptional regulator that modulates cell proliferation. It is known to be highly important in neural tube development. It is found in eukaryotes and is typically between 303 and 413 amino acids in length.	304
405840	pfam15243	ANAPC15	Anaphase-promoting complex subunit 15. This is a component of the anaphase promoting complex/cyclosome.	87
405841	pfam15244	HSD3	Spermatogenesis-associated protein 7, or HSD3. Spermatogenesis-associated protein HSD3 also goes by the name of spermatogenesis-associated protein 7 or SPAT7. The family carries a single transmembrane domain. It functions in several tissues, and is expressed in the developing and mature mouse retina; it is expressed in multiple retinal layers in the adult mouse retina. Mutations lead to LCA disease, or Leber congenital amaurosis, which results in a number of retinal dystrophies. The disease phenotype is characterized by severe visual loss at birth, nystagmus, a variety of fundus changes, and minimal or absent recordable responses on the electroretinogram (ERG).	413
405842	pfam15245	VGLL4	Transcription cofactor vestigial-like protein 4. These proteins act as transcriptional enhancer factor (TEF-1) cofactors.	210
405843	pfam15246	NCKAP5	Nck-associated protein 5, Peripheral clock protein. NCKAP5 is short for Nck-associated protein 5, which is also known as the Peripheral clock protein. NCKAP5 is a protein family, which interacts with the SH3-containing region of the adaptor protein Nck. Nck is a protein that interacts with receptor tyrosine kinases and guanine nucleotide exchange factor Sos. The role of Nck can be thought of as similar to Grb2. The role of NCKAP5 is to assist Nck with its adaptor protein role.	309
405844	pfam15247	SLBP_RNA_bind	Histone RNA hairpin-binding protein RNA-binding domain. This family represents the RNA-binding domain of histone RNA hairpin-binding protein.	69
405845	pfam15248	DUF4587	Domain of unknown function (DUF4587). This protein family is a domain of unknown function. The precise function of this protein domain remains to be elucidated. This domain family is found in eukaryotes, and is typically between 64 and 79 amino acids in length. There are two conserved sequence motifs: QNAQ and HHH. In humans, it is found in the position, chromosome 21 open reading frame 58.	74
405846	pfam15249	GLTSCR1	Conserved region of unknown function on GLTSCR protein. This domain family is found in eukaryotes, and is typically between 105 and 124 amino acids in length. It is found on glioma tumor suppressor candidate region gene proteins.	102
405847	pfam15250	Raftlin	Raftlin. This family of proteins plays a role in the formation and/or maintenance of lipid rafts.	448
405848	pfam15251	DUF4588	Domain of unknown function (DUF4588). This family of proteins is found in eukaryotes. Proteins in this family are typically between 200 and 274 amino acids in length. There is a conserved LYK sequence motif. There is a single completely conserved residue A that may be functionally important.	238
405849	pfam15252	DUF4589	Domain of unknown function (DUF4589). This protein family is a domain of unknown function. The precise function of the protein domain remains to be elucidated. This family of proteins is found in eukaryotes and are typically between 215 and 293 amino acids in length. The protein contains two conserved sequence motifs: SSS and KST.	245
405850	pfam15253	STIL_N	SCL-interrupting locus protein N-terminus. 	404
405851	pfam15254	CCDC14	Coiled-coil domain-containing protein 14. This protein family, Coiled-coil domain-containing protein 14 (CCDC14) is a domain of unknown function. This family of proteins is found in eukaryotes. Proteins in this family are typically between 301 and 912 amino acids in length.	862
405852	pfam15255	CAP-ZIP_m	WASH complex subunit CAP-Z interacting, central region. This domain is found on WASH complex subunits FAM21 and CAP-ZIP proteins, as well as on VPEF (vaccinia virus penetration factor). This family of proteins is found in eukaryotes. Proteins in this family are typically between 305 and 1321 amino acids in length. The exact function of this region is not known.	125
405853	pfam15256	SPATIAL	SPATIAL. SPATIAL (stromal protein associated with thymii and lymph node) proteins may be involved in spermatid differentiation.	199
405854	pfam15257	DUF4590	Domain of unknown function (DUF4590). This family of proteins remains to be characterized and is a domain of unknown function. This domain family is found in eukaryotes, and is approximately 120 amino acids in length. There are two conserved sequence motifs: CCE and PCY. In humans, the gene encoding this protein lies in the position, chromosome 1 open reading frame 173.	106
405855	pfam15258	FAM222A	Protein family of FAM222A. This protein family, FAM222A, is a domain of unknown function. This family of proteins is found in eukaryotes, and its members are typically between 411 and 562 amino acids in length. In humans, the gene encoding this protein domain lies in the position, chromosome 12 open reading frame 34.	528
405856	pfam15259	GTSE1_N	G-2 and S-phase expressed 1. This family is the N-terminus of GTSE1 proteins. GTSE-1 (G2 and S phase-expressed-1) protein is specifically expressed during S and G2 phases of the cell cycle. It is mainly localized to the microtubules and when overexpressed delays the G2 to M transition. The full protein negatively regulates p53 transactivation function, protein levels, and p53-dependent apoptosis. This domain family is found in eukaryotes, and is approximately 140 amino acids in length. There is a conserved FDFD sequence motif.	145
405857	pfam15260	FAM219A	Protein family FAM219A. This protein family, FAM219A is a domain of unknown function. This protein family has been found in eukaryotes. Proteins in this family are typically between 144 and 191 amino acids in length. There are two conserved sequence motifs: QLL and LDE.	124
405858	pfam15261	DUF4591	Domain of unknown function (DUF4591). This protein family is a domain of unknown function. It is found in eukaryotes, and is approximately 120 amino acids in length. In humans, the gene encoding this protein lies in the position chromosome 11 open reading frame 63.	126
405859	pfam15262	DUF4592	Domain of unknown function (DUF4592). This protein family is a domain of unknown function, which lies to the N-terminus of the protein. This domain family is found in eukaryotes, and is typically between 114 and 130 amino acids in length. There are two completely conserved residues (L and A) that may be functionally important. In humans, the gene that encodes this protein lies in the position, chromosome 2 open reading frame 55.	132
373696	pfam15264	TSSC4	Tumor suppressing sub-chromosomal transferable candidate 4. This family of proteins is expressed from a gene cluster where in humans the TSSC4 gene is not imprinted. This same cluster is associated with the Beckwith-Wiedemann syndrome. This domain family is found in eukaryotes, and is typically between 120 and 147 amino acids in length. There is a conserved YSL sequence motif.	126
405860	pfam15265	FAM196	FAM196 family. This protein family is a domain of unknown function. This family of proteins is found in eukaryotes and are typically between 441 and 534 amino acids in length.	491
405861	pfam15266	DUF4594	Domain of unknown function (DUF4594). This protein family is a domain of unknown function. The protein family is found in eukaryotes, and is typically between 170 and 183 amino acids in length. In humans, the gene encoding this protein lies in the position, chromosome 15 open reading frame 52.	174
405862	pfam15268	Dapper	Dapper. This is a family of signalling proteins. They act in a diverse range of signaling pathways and have a range of binding partners. They act as homo- and heterodimers.	710
259402	pfam15269	zf-C2H2_7	Zinc-finger. This is a family of eukaryotic zinc-fingers.	54
405863	pfam15270	ACI44	Metallo-carboxypeptidase inhibitor. ACI44, a metallo-carboxypeptidase inhibitor, is one member of a battery of selective inhibitors protecting roundworms of the genus Ascaris, common parasites of the human gastrointestinal tract, from host enzymes and the immune system.	58
373701	pfam15271	BBP1_N	Spindle pole body component BBP1, Mps2-binding protein. This N-terminal domain of BBP1, a spindle pole body component, interacts directly, though transiently, with the polo-box domain of Cdc5p. Full-length BBP1 localizes at the cytoplasmic side of the central plaque periphery of the spindle pole body (SPB) and plays an important role in inserting a duplication plaque into the nuclear envelope and assembling a functional inner plaque. Although not a membrane protein itself, BBP1 binds to Mps2 as well as to Spc29 and the half-bridge protein Kar1, thus providing a model for how the SPB core is tethered within the nuclear envelope and to the half-bridge.	151
405864	pfam15272	BBP1_C	Spindle pole body component BBP1, C-terminal. This C-terminal domain of BBP1, a spindle pole body component, carries coiled-coils that are necessary for the localization of BBP1 to the spindle pole body (SPB). Although not a membrane protein itself, BBP1 binds to Mps2 as well as to Spc29 and the half-bridge protein Kar1, thus providing a model for how the SPB core is tethered within the nuclear envelope and to the half-bridge.	183
405865	pfam15273	NHS	NHS-like. This family of proteins includes Nance-Horan syndrome protein (NHS).	641
405866	pfam15274	MLIP	Muscular LMNA-interacting protein. MLIP is a Muscle-enriched A-type Lamin-interacting Protein, an innovation of amniotes, and is expressed ubiquitously and most abundantly in heart, skeletal, and smooth muscle. MLIP interacts directly and co-localizes with lamin A and C in the nuclear envelope. MLIP also co-localizes with promyelocytic leukemia (PML) bodies within the nucleus. PML, like MLIP, is only found in amniotes, suggesting that a functional link between the nuclear envelope and PML bodies may exist through MLIP.	253
405867	pfam15275	PEHE	PEHE domain. This domain was first identified in drosophila MSL1 (male-specific lethal 1). In drosophila it binds to the histone acetyltransferase males-absent on the first protein (MOF) and to protein male-specific lethal-3 (MSL3).	127
405868	pfam15276	PP1_bind	Protein phosphatase 1 binding. This domain contains a protein phosphatase 1 (PP1) binding site.	63
405869	pfam15277	Sec3-PIP2_bind	Exocyst complex component SEC3 N-terminal PIP2 binding PH. This is the N-terminal domain of fungal and eukaryotic Sec3 proteins. Sec3 is a component of the exocyst complex that is involved in the docking of exocytic vesicles with fusion sites on the plasma membrane. This N-terminal domain contains a cryptic pleckstrin homology (PH) fold, and all six positively charged lysine and arginine residues in the PH domain predicted to bind the PIP2 head group are conserved. The exocyst complex is essential for many exocytic events, by tethering vesicles at the plasma membrane for fusion. In fission yeast, polarised exocytosis for growth relies on the combined action of the exocyst at cell poles and myosin-driven transport along actin cables.	81
259411	pfam15278	Sec3_C_2	Sec3 exocyst complex subunit. This small Sec3 C-terminal domain family is based around the fission yeast protein, and is rather shorter than the budding yeast/vertebrate domain family Sec3_C, pfam09763. In fact it is only this coiled-coil region that they carry in common. The full length fission yeast, UniProtKB:Q10324, protein Sec3 is redundant with Exo70 for viability and for the localization of other exocyst subunits, suggesting that these components act as exocyst tethers at the plasma membrane. Sec3, Exo70 and Sec5 are transported by the myosin V Myo52 along actin cables. The exocyst holo-complex, including Sec3 and Exo70, is present on exocytic vesicles, which can reach cell poles by either myosin-driven transport or random walk.	86
405870	pfam15279	SOBP	Sine oculis-binding protein. SOBP is associated with syndromic and nonsyndromic intellectual disability. It carries a zinc-finger of the zf-C2H2 type at the N-terminus, and a highly characteristic C-terminal PhPhPhPhPhPh motif. The deduced 873-amino acid protein contains an N-terminal nuclear localization signal (NLS), followed by 2 FCS-type zinc finger motifs, a proline-rich region (PR1), a putative RNA-binding motif region, and a C-terminal NLS embedded in a second proline-rich motif. SOBP is expressed in various human tissues, including developing mouse brain at embryonic day 14. In postnatal and adult mouse brain SOBP is expressed in all neurons, with intense staining in the limbic system. Highest expression is in layer V cortical neurons, hippocampus, pyriform cortex, dorsomedial nucleus of thalamus, amygdala, and hypothalamus. Postnatal expression of SOBP in the limbic system corresponds to a time of active synaptogenesis. The family is also referred to as Jackson circler, JXC1. In seven affected siblings from a consanguineous Israeli Arab family with mental retardation, anterior maxillary protrusion, and strabismus, mutations were found in this protein.	321
405871	pfam15280	BORA_N	Protein aurora borealis N-terminus. This family of proteins is required for the activation of the protein kinase Aurora-A.	205
405872	pfam15281	Consortin_C	Consortin C-terminus. Consortin is a trans-Golgi network cargo receptor involved in targeting connexins to the plasma membrane.	113
405873	pfam15282	BMP2K_C	BMP-2-inducible protein kinase C-terminus. This family represents the C-terminus of BMP2K and related proteins.	255
405874	pfam15283	DUF4595	Domain of unknown function (DUF4595) with porin-like fold. Large family of predicted secreted proteins mostly from CFG group, but also from Burkholderia, Pseudomonas and Streptomyces. Function of these proteins is not known. A 3D structure of a representative of this family from Bacteroides uniformis was solved by JCSG and deposited to PDB as 4ghb. There is some overlap with RHS-repeat (PF05593) family despite lack of obvious repeats in the structure.	195
317659	pfam15284	PAGK	Phage-encoded virulence factor. PAGK represents a new family of virulence factors that is translocated into the host cytoplasm via bacterial outer membrane vesicles (OMV). Members are small proteins composed of about 70 amino acids. In Salmonella they are secreted independently of the SPI-2 type-III secretion system, T3SS. The OMV functions as a vehicle for transferring virulence determinants to the cytoplasm of the infected host cell. OMVs are released from the cell envelopes of Gram-negative bacteria and comprise a variety of outer membrane and periplasmic constituents, including proteins, phospholipids, lipopolysaccharides, and DNA.	64
373713	pfam15285	BH3	Beclin-1 BH3 domain, Bcl-2-interacting. The BH3 domain is a short motif known to bind to Bcl-xLs. This interaction is important in apoptosis.	23
373714	pfam15286	Bcl-2_3	Apoptosis regulator M11, B cell 2 leukaemia/lymphoma like. pfam02180. Bcl-2_3 is a small family of eukaryotic proteins associated with autophagy. The family is found in association with pfam00452.	126
405875	pfam15287	KRBA1	KRBA1 family repeat. KRBA1 is a short repeating motif found in mammalian proteins. It is characterized by a highly conserved sequence of residues, SSPLxxLxxCLK. The function of the repeat, which can be present in up to seven copies, is unknown as is the function of the full length proteins.	43
405876	pfam15288	zf-CCHC_6	Zinc knuckle. This Zinc knuckle is found in FAM90A mammalian proteins.	36
405877	pfam15289	RFXA_RFXANK_bdg	Regulatory factor X-associated C-terminal binding domain. This C-terminal domain of Regulatory factor X-associated protein binds to RFXANK, the Ankyrin-repeat regulatory factor X proteins. RFXA is part of the RFX complex. Mutants of either RFXAP or RFXANK protein fail to bind to each other. RFX5 binds only to the RFXANK-RFXAP scaffold and not to either protein alone, and neither the scaffold nor RFX5 alone can bind DNA. The binding of the RFXANK-RFXAP scaffold to RFX5 leads to a conformational change in the latter that exposes the DNA-binding domain of RFX5. The DNA-binding domain of RFX5 anchors the RFX complex to MHC class II X and S promoter boxes.	122
405878	pfam15290	Syntaphilin	Golgi-localized syntaxin-1-binding clamp. Syntaphilin or Syntabulin is a family of eukaryotic proteins. Syntaphilin binds to syntaxin-1 thereby inhibiting SNARE complex formation by absorbing free syntaxin-1. So it is a syntaxin-1 clamp that controls SNARE assembly.	308
405879	pfam15291	Dermcidin	Dermcidin, antibiotic peptide. Dermcidin is a family of peptides produced in the sweat to protect against pathogenic Gram-positive bacteria.	84
405880	pfam15292	Treslin_N	Treslin N-terminus. This family represents the N-terminus of treslin, a checkpoint regulator which plays a role in DNA replication preinitiation complex formation.	793
405881	pfam15293	NUFIP2	Nuclear fragile X mental retardation-interacting protein 2. 	596
405882	pfam15294	Leu_zip	Leucine zipper. This family includes Leucine zipper transcription factor-like protein 1 (LZTFL1) and Leucine zipper protein 2 (LUZP2).	276
405883	pfam15295	CCDC50_N	Coiled-coil domain-containing protein 50 N-terminus. 	126
405884	pfam15296	Codanin-1_C	Codanin-1 C-terminus. This domain is found near to the C-terminus of codanin-1.	119
405885	pfam15297	CKAP2_C	Cytoskeleton-associated protein 2 C-terminus. This family includes the C-terminus of CKAP2 and CKAP2L. CKAP2 is a microtubule associated protein which stabilizes microtubules.	346
405886	pfam15298	AJAP1_PANP_C	AJAP1/PANP C-terminus. This family includes the C-terminus of adherens junction-associated protein 1 (AJAP1) and of PILR-associating neural protein (PANP). AJAP1 inhibits cell adhesion and migration. PANP is a ligand for the immune inhibitory receptor paired immunoglobulin-like type 2 receptor alpha.	204
405887	pfam15299	ALS2CR8	Amyotrophic lateral sclerosis 2 chromosomal region candidate gene 8. This domain is found in amyotrophic lateral sclerosis 2 chromosomal region candidate gene 8 protein.	216
405888	pfam15300	INT_SG_DDX_CT_C	INTS6/SAGE1/DDX26B/CT45 C-terminus. This domain is found at the C-terminus of integrator complex subunit 6 (INTS6), sarcoma antigen 1 (SAGE1), protein DDX26B (DDX26B) and members of the cancer/testis antigen family 45.	62
405889	pfam15301	SLAIN	SLAIN motif-containing family. The SLAIN motif containing family is named after the presence of a SLAIN motif in SLAIN1. They are a family of microtubule plus-end tracking proteins.	435
405890	pfam15302	P33MONOX	P33 mono-oxygenase. This family of proteins contains a flavine-containing mono-oxygenase motif. It may have a role in the regulation of neuronal survival, differentiation and axonal outgrowth.	291
405891	pfam15303	RNF111_N	E3 ubiquitin-protein ligase Arkadia N-terminus. This domain is found at the N-terminus of E3 ubiquitin-protein ligase Arkadia.	273
405892	pfam15304	AKAP2_C	A-kinase anchor protein 2 C-terminus. This family includes the C-terminus of A-kinase anchor protein 2 (AKAP2). It includes the site where the regulatory subunits (RII) of protein kinase AII binds.	346
405893	pfam15305	IFT43	Intraflagellar transport protein 43. Intraflagellar transport protein 43 (IFT43) is a subunit of the IFT complex A (IFT-A) machinery of primary cilia.	136
405894	pfam15306	LIN37	LIN37. LIN37 is a component of the DREAM (or LINC) complex which represses cell cycle-dependent genes in quiescent cells and plays a role in the cell cycle-dependent activation of G2/M genes.	156
405895	pfam15307	SPACA7	Sperm acrosome-associated protein 7. SPACA7 is a family of eukaryotic proteins expressed in the testes. Proteins in this family are typically between 104 and 195 amino acids in length. There is a conserved DEIL sequence motif. The function is not known.	108
405896	pfam15308	CEP170_C	CEP170 C-terminus. This family includes the C-terminus of centrosomal protein of 170 kDa (CEP170).	667
405897	pfam15309	ALMS_motif	ALMS motif. This domain is found at the C-terminus of Alstrom syndrome protein 1 (ALMS1), KIAA1731 and C10orf90.	131
405898	pfam15310	VAD1-2	Vitamin A-deficiency (VAD) rat model signalling. VAD1-2 is a family of proteins found in eukaryotes. The family is expressed in testes and is involved in signalling during spermatogenesis.	249
405899	pfam15311	HYLS1_C	Hydrolethalus syndrome protein 1 C-terminus. 	89
405900	pfam15312	JSRP	Junctional sarcoplasmic reticulum protein. JSRP, junctional sarcoplasmic reticulum protein 1, or junctional-face membrane protein of 45 kDa homolog, is a family of eukaryotic proteins. The family localizes to the junctional face membrane of the skeletal muscle sarcoplasmic reticulum (SR); it colocalizes with its Ca2+-release channel (the ryanodine receptor), and interacts with calsequestrin and the skeletal-muscle dihydro-pyridine receptor Cav1. It is key for the functional expression of voltage-dependent Ca2+ channels.	63
405901	pfam15313	HEXIM	Hexamethylene bis-acetamide-inducible protein. HEXIM is a transcriptional regulator that functions as a general RNA polymerase II transcription inhibitor. In cooperation with 7SK snRNA it sequesters P-TEFb in a large inactive 7SK snRNP complex preventing RNA polymerase II phosphorylation and subsequent transcriptional elongation. HEXIM may also regulate NF-kappa-B, ESR1, NR3C1 and CIITA-dependent transcriptional activity.	135
405902	pfam15314	PRAP	Proline-rich acidic protein 1, pregnancy-specific uterine. PRAP, or proline-rich acidic protein 1, is a family of eukaryotic proteins. PRAP is abundantly expressed in the epithelial cells of the human liver, kidney, gastrointestinal tract, and cervix. It is significantly down-regulated in hepatocellular carcinoma and right colon adenocarcinoma compared with the respective adjacent normal tissues. It is also expressed in the epithelial cells of the mouse and rat gastrointestinal tracts, and in the pregnant mouse uterus. PRAP plays an important role in maintaining normal growth suppression.	45
405903	pfam15315	FRG2	Facioscapulohumeral muscular dystrophy candidate 2. This family of proteins is found in eukaryotes. The family is localized close to the D4Z4 repeats on chromosome 4 and 10 that are associated with the autosomal dominant facioscapulohumeral muscular dystrophy (FSHD). FRG2 are transcriptionally upregulated in FSHD myoblast cultures suggesting involvement in the pathogenesis of FSHD.	182
405904	pfam15316	MDFI	MyoD family inhibitor. Members of this family inhibit the transactivation activity of the MyoD family of myogenic factors. They affect axin-mediated regulation of the Wnt and JNK signaling pathways, and regulate expression from viral promoters.	168
405905	pfam15317	Lbh	Cardiac transcription factor regulator, Developmental protein. The family of proteins are cardiac transcription regulators, named Lbh, short for Limb, bud and heart. They regulate embryological development in the heart. More specifically, in humans, they may act as transcriptional activators in MAPK signaling pathway to mediate cellular functions. This family of proteins is found in eukaryotes. Proteins in this family are typically between 92 and 116 amino acids in length.	88
373745	pfam15318	Bclt	Putative Bcl-2 like protein of testis. This family of proteins is found in eukaryotes. The family may represent a set of Bcl-2-like proteins involved in apoptosis, see UniProt:Q9BQM9.	175
405906	pfam15319	RHINO	RAD9, RAD1, HUS1-interacting nuclear orphan protein. RHINO, or RAD9, RAD1, HUS1-interacting nuclear orphan, is a family of eukaryotic proteins. Under genotoxic stresses such as ionizing radiation during the S phase, RHINO plays a role in DNA damage response signalling. It is recruited to sites of DNA damage through interaction with the 9-1-1 cell-cycle checkpoint response complex and TOPBP1 in a ATR-dependent (ataxia telangiectasia and Rad3-related) manner. It is required for the progression of the G1 to S phase transition of breast cancer cells, and it is known to play a role in the stimulation of CHEK1 phosphorylation. It interacts with RAD9A, RAD18, TOPBP1 and UBE2N.	244
405907	pfam15320	RAM	mRNA cap methylation, RNMT-activating mini protein. This family of proteins is found in eukaryotes. Proteins in this family are typically between 102 and 154 amino acids in length. There is a single completely conserved residue D that may be functionally important. RAM is a family of eukaryotic proteins that are an obligate component of the mammalian cap methyltransferase, RNMT (RNA guanine-7 methyltransferase). RAM consists of an N-terminal RNMT-activating domain and a C-terminal RNA-binding domain. Either RAM or RNMT independently have rather weak binding affinity for RNA, but together their RNA affinity is significantly increased. RAM is necessary for efficient cap methylation, maintaining mRNA expression levels, for mRNA translation and for cell viability.	83
405908	pfam15321	ATAD4	ATPase family AAA domain containing 4. ATAD4 is a family of proteins found in eukaryotes. The family is also known as PRR15L, or proline-rich 15-like. ATAD4 is expressed almost exclusively in post-mitotic cells both during foetal development and in adult tissues, such as the intestinal epithelium and the testis. Its expression in mouse and human gastrointestinal tumors is linked, directly or indirectly, to the disruption of the Wnt signaling pathway.	98
405909	pfam15322	PMSI1	Protein missing in infertile sperm 1, putative. This family of proteins is found in eukaryotes. Proteins in this family are typically between 249 and 341 amino acids in length.	309
405910	pfam15323	Ashwin	Developmental protein. This family of proteins is found in eukaryotes. These proteins have an important role to play in developmental biology, particularly embryogenesis. They play an important role in cell survival and axial patterning. They are also thought to be a crucial subunit in the tRNA splicing ligase complex. Proteins in this family are typically between 141 and 232 amino acids in length. There are two conserved sequence motifs: HPE and PQR.	220
405911	pfam15324	TALPID3	Hedgehog signalling target. TALPID3 is a family of eukaryotic proteins that are targets for Hedgehog signalling. Mutations in this gene noticed first in chickens lead to multiple abnormalities of development.	1253
405912	pfam15325	MRI	Modulator of retrovirus infection. MRI, or modulator of retrovirus infection, is a family of eukaryotic proteins that regulate the activity of the proteasome in the uncoating of retroviruses.	104
405913	pfam15326	TEX15	Testis expressed sequence 15. TEX15 is a family of eukaryotic proteins that is required for chromosomal synapsis and meiotic recombination. TEX15 regulates the loading of DNA repair proteins onto sites of double-stranded-breaks and, thus, its absence causes a failure in meiotic recombination. Two polymorphisms in the TEX15 gene could be considered the genetic risk factors for spermatogenic failure in the Chinese Han population.	234
405914	pfam15327	Tankyrase_bdg_C	Tankyrase binding protein C terminal domain. This protein domain family is found at the C-terminal end of the Tankyrase binding protein in eukaryotes. The precise function of this protein is still unknown. However, it is known to interact with the enzyme tankyrase, a telomeric poly(ADP-ribose) polymerase, by binding to it. Tankyrase catalyzes poly(ADP-ribose) chain formation onto proteins. More specifically, the protein binds to the ankyrin domain in tankyrase. The protein domain is approximately 170 amino acids in length and contains two conserved sequence motifs: FPG and LKA.	166
405915	pfam15328	GCOM2	Putative GRINL1B complex locus protein 2. This protein family is named Putative GRINL1B complex locus protein 2. GRINL1B is short for: glutamate receptor, ionotropic, N-methyl D-aspartate-like 1B. The name indicates what sort of receptor it is thought to be, a ligand gated ion channel specific to the neurotransmitter Glutamate. This family of proteins is found in eukaryotes. Proteins in this family are typically between 325 and 463 amino acids in length. The protein is thought to be the product of a pseudogene with a role in helping assemble a gene transcription unit.	216
405916	pfam15330	SIT	SHP2-interacting transmembrane adaptor protein, SIT. SIT, or SHP2-interacting transmembrane adaptor protein, is a disulfide-linked dimer that regulates human T Cell activation.	114
405917	pfam15331	TP53IP5	Cellular tumor antigen p53-inducible 5. TP53IP5 suppresses cell growth, and its intracellular location and expression change in a cell-cycle-dependent manner.	218
405918	pfam15332	LIME1	Lck-interacting transmembrane adapter 1. LIME1 is a family of eukaryotic transmembrane adaptors. It plays an important role in linking BCR stimulation to B-cell activation and is expressed in primary B cells. LIME localizes to lipid rafts in T cells in response to TCR stimulation, and is phosphorylated by Lck and recruits signalling molecules such as Lck, PI3K, Grb2, Gads, and SHP-2. LIME acts as the transmembrane adaptor linking BCR-induced membrane-proximal signalling to B-cell activation.	224
405919	pfam15333	TAF1D	TATA box-binding protein-associated factor 1D. TAF1D is a family of eukaryotic proteins that are members of the SL1 complex. The SL1 complex includes TBP and TAF1A, TAF1B and TAF1C, and plays a role in RNA polymerase I transcription. Alternative names have included 'JOSD3, Josephin domain containing 3'.	222
405920	pfam15334	AIB	Aurora kinase A and ninein interacting protein. AIB is a family of eukaryotic proteins necessary for the adequate functioning of Aurora-A, a protein involved in chromosome alignment, centrosome maturation, mitotic spindle assembly and aspects of tumorigenesis. AIB is likely to act as a regulator of Aurora-A activity.	326
405921	pfam15335	CAAP1	Caspase activity and apoptosis inhibitor 1. CAAP1, or caspase activity and apoptosis inhibitor 1, is a family of eukaryotic proteins involved in the regulation of apoptosis. It modulates a caspase-10 dependent mitochondrial caspase-3/9 feedback amplification loop.	62
405922	pfam15336	Auts2	Autism susceptibility gene 2 protein. Auts2, or FBRSL2, Fibrosin-1-like protein 2, is a family of eukaryotic proteins associated both with a susceptibility to autism and with influencing the number of corpora lutea produced by breeding sows.	217
405923	pfam15337	Vasculin	Vascular protein family Vasculin-like 1. GC-rich promoter-binding protein 1-like 1 or Vasculin-like protein family 1, is likely to be a transcription factor. The domain family is found in eukaryotes, and is approximately 90 amino acids in length.	94
373764	pfam15338	TPIP1	p53-regulated apoptosis-inducing protein 1. TPIP1 is a family of eukaryotic proteins whose expression is induced by wild-type p53. Ectopically expressed TPIP1, which is localized within mitochondria, leads to apoptotic cell death through dissipation of mitochondrial A(psi)m. Phosphorylation of p53 Ser-46 regulates the transcriptional activation of TPIP1, thereby mediating p53-dependent apoptosis.	123
405924	pfam15339	Afaf	Acrosome formation-associated factor. Afaf is a family of single pass type I membrane proteins. Afaf is a vesicle factor derived from the early endosome trafficking pathway that is involved in the biogenesis of the acrosome on the maturing spermatozoon head.	198
405925	pfam15340	COPR5	Cooperator of PRMT5 family. COPR5 is a family of histone H4-binding proteins expressed in the nucleus. It interacts with the N-terminus of histone H4 thereby mediating the association between histone H4 and PRMT5, the Janus kinase-binding protein 1 that catalyzes the formation of symmetric dimethyl-arginine residues in proteins. COPR5 is specifically required for histone H4 'Arg-3' methylation mediated by PRMT5, but not histone H3 'Arg-8' methylation, suggesting that it modulates the substrate specificity of PRMT5. This family of proteins is found in eukaryotes.	151
405926	pfam15341	SLX9	Ribosome biogenesis protein SLX9. SLX9 is present in pre-ribosomes from an early stage and is implicated in the processing events that remove the ITS1 spacer sequences. In eukaryotes, biogenesis of ribosomes starts in the nucleolus with transcription by RNA polymerase I of a large precursor RNA molecule, called 35S pre-rRNA in yeast, in which the 18S, 5.8S, and 25S mature rRNAs reside, while RNA polymerase III transcribes a 3'-extended pre-5S rRNA. The 35S precursor also contains external transcribed spacer elements (5' and 3'-ETS) at either end as well as internal transcribed spacers (ITS1 and ITS2) that separate the mature sequences.	118
405927	pfam15342	FAM212	FAM212 family. This domain family is found in eukaryotes, and is approximately 60 amino acids in length.	60
405928	pfam15343	DEPP	Decidual protein induced by progesterone family. DEPP is a family of proteins expressed in various tissues, including pancreas, placenta, ovary, testis and kidney. High levels are found during the first trimester. Its expression is induced by progesterone, testosterone and, to a much lower extent, oestrogen. The family is alternatively known as fasting-induced gene protein, FIG.	185
405929	pfam15344	FAM217	FAM217 family. This family of proteins is found in eukaryotes. Proteins in this family are typically between 329 and 507 amino acids in length. There is a conserved YPDFLP sequence motif.	230
405930	pfam15345	TMEM51	Transmembrane protein 51. This family of proteins is found in eukaryotes. Proteins in this family are typically between 233 and 253 amino acids in length.	237
405931	pfam15346	ARGLU	Arginine and glutamate-rich 1. ARGLU, arginine and glutamate-rich 1 protein family, is required for the oestrogen-dependent expression of ESR1 target genes. It functions in cooperation with MED1. The family of proteins is found in eukaryotes.	151
405932	pfam15347	PAG	Phosphoprotein associated with glycosphingolipid-enriched. PAG, or Cbp/PAG (Csk binding protein/phospho-protein associated with glycosphingolipid-enriched microdomains) is a transmembrane family that has a negative regulatory role in T-cell activation through being an adapter for C-terminal Src kinase, Csk. This family of proteins is found in eukaryotes.	429
405933	pfam15348	GEMIN8	Gemini of Cajal bodies-associated protein 8. GEMIN8 proteins are found in the nuclear bodies called gems (Gemini of Cajal bodies) that are often in proximity to Cajal (coiled) bodies themselves. They are also found in the cytoplasm. The family is part of the SMN (survival motor neurone) complex that plays an essential role in spliceosomal snRNP assembly in the cytoplasm and is required for pre-mRNA splicing in the nucleus. GEMIN8 binds directly to SMN1 and mediates the interaction of the GEMIN6-GEMIN7 heterodimer.	231
405934	pfam15349	DCA16	DDB1- and CUL4-associated factor 16. DCA16 is a family of eukaryotic proteins that interacts with DDB1 and CUL4A. The family may function as a substrate receptor for the CUL4-DDB1 E3 ubiquitin-protein ligase complex.	167
405935	pfam15350	ETAA1	Ewing's tumor-associated antigen 1 homolog. This family of proteins is found in eukaryotes, where members are expressed at high levels in the brain, liver, kidney and Ewing tumor cell lines. Proteins in this family are typically between 648 and 898 amino acids in length.	819
405936	pfam15351	JCAD	Junctional protein associated with coronary artery disease. JCAD is a component of VE-cadherin-based cell-cell junctions in endothelial cells. The cell-cell or adherens junction is an adhesion complex that plays a crucial role in the organisation and function of epithelial and endothelial cellular sheets. These junctions join the actin cytoskeleton to the plasma membrane to form adhesive contacts between cells or between cells and extracellular matrix. The junctions also mediate both cell adhesion and cell-signalling. JCAD localizes close to the apical membrane in epithelial cells. This family is found in eukaryotes.	1358
405937	pfam15352	K1377	Susceptibility to monomelic amyotrophy. This family of proteins is associated with a susceptibility to monomelic amyotrophy.	981
405938	pfam15353	HECA	Headcase protein family homolog. HECA was characterized first in Drosophila where it regulates the proliferation and differentiation of cells during adult morphogenesis. In humans, HECA affects cell cycle progression and proliferation in head and neck cancer cells. It slows down cell division of oral squamous cell carcinoma cells and may thereby act as a tumor-suppressor in head and neck cancers.	101
259486	pfam15354	KAAG1	Kidney-associated antigen 1. KAAG1, kidney-associated antigen 1, or RU2AS (RU2 antisense gene protein) has been found in mammals. It is expressed in testis and kidney, and, at lower levels, in urinary bladder and liver. It is expressed by a high proportion of tumors of various histologic origin, including melanomas, sarcomas and colorectal carcinomas.	84
405939	pfam15355	Chisel	Stretch-responsive small skeletal muscle X protein, Chisel. The murine X-linked gene Chisel (Csl/Smpx) is selectively expressed in cardiac and skeletal muscle cells. It localizes to the costameric cytoskeleton of muscle cells through its association with focal adhesion proteins, where it may participate in regulating the dynamics of actin through the Rac1/p38 kinase pathway. Thus it is implicated in the maintenance of muscle integrity and in responses to biomechanical stress.	86
292000	pfam15356	SPR1	Psoriasis susceptibility locus 2. SPR1 is psoriasis susceptibility locus 2 protein family.	114
373780	pfam15357	SEEK1	Psoriasis susceptibility 1 candidate 1. This family is considered a candidate for susceptibility to psoriasis.	149
405940	pfam15358	TSKS	Testis-specific serine kinase substrate. TSKS, testis-specific serine kinase substrate, is expressed in the testis and is downregulated in cancerous testicular tissue, in comparison with adjacent normal tissue. TSKS expression is very low to undetectable in seminoma, teratocarcinoma, embryonal, and Leydig cell tumors, while high in testicular tissue adjacent to tumors which contain pre-malignant carcinoma in situ. Recently it has been shown in human testis to be localized to the equatorial segment of ejaculated human sperm. The finding of a TSKS family member in mature sperm suggests that this family of kinases might play a role in sperm function. TSKS is localized during spermiogenesis to the centrioles of post-meiotic spermatids, where it reaches its greatest concentration during the period of flagellogenesis.	556
405941	pfam15359	CDV3	Carnitine deficiency-associated protein 3. This family of proteins is found in eukaryotes. Proteins in this family are typically between 128 and 251 amino acids in length. CDV3 is also known as TPP36 - tyrosine-phosphorylated protein 36. The function is not known.	125
405942	pfam15360	Apelin	APJ endogenous ligand. Apelin is among the most potent stimulators of cardiac contractility known. The apelin-APJ signaling pathway is an important novel mediator of cardiovascular control. Apelin is an adipokine secreted by adipocytes where it is co-expressed with apelin receptor (APJ) in adipocytes. It suppresses adipogenesis through MAPK kinase/ERK dependent pathways and prevents lipid droplet fragmentation, thereby inhibiting basal lipolysis through AMP kinase dependent enhancement of perilipin expression. It also inhibits hormone-stimulated acute lipolysis through decreasing perilipin phosphorylation. Apelin induces a decrease of free fatty acid release via its dual inhibition on adipogenesis and lipolysis. As a vaso-active and vascular cell growth-regulating peptide Apelin is a target of the BMP pathway, the TGF-beta/bone morphogenic protein (BMP) system - a major pathway for angiogenesis.	55
405943	pfam15361	RIC3	Resistance to inhibitors of cholinesterase homolog 3. RIC3 is a protein associated with nicotinic acetylcholine receptors (nAChRs), neurotransmitter-gated ion channels expressed at the neuromuscular junction and within the central and peripheral nervous systems. It can enhance functional expression of multiple nAChR subtypes. RIC3 promotes functional expression of homomeric alpha-7 and alpha-8 nicotinic acetylcholine receptors at the cell surface.	146
405944	pfam15362	Enamelin	Enamelin. ENAMELIN is involved in the mineralisation and structural organisation of enamel. It is necessary for the extension of enamel during the secretory stage of dental enamel formation. The proteins are expressed in teeth, particularly in odontoblasts, ameloblasts and cementoblasts.	907
405945	pfam15363	DUF4596	Domain of unknown function (DUF4596). This domain family is found in eukaryotes, and is approximately 50 amino acids in length. There is a conserved ELET sequence motif. There are two completely conserved residues (S and E) that may be functionally important.	46
405946	pfam15364	PAXIP1_C	PAXIP1-associated-protein-1 C term PTIP binding protein. This protein domain family is the C-terminal domain of PAXIP1-associated-protein-1, which also goes by the name PTIP-associated protein 1. This family of proteins is found in eukaryotes. The function of this protein is to localize at the site of DNA damage and form foci with PTIP at the DNA break point. Furthermore, studies have shown that depletion of PA1 increases cellular sensitivity to ionizing radiation. Proteins in this family are typically between 122 and 254 amino acids in length.	132
405947	pfam15365	PNRC	Proline-rich nuclear receptor coactivator motif. The PNRC family, proline-rich nuclear receptor coactivator, is found in eukaryotes. Studies in S. pombe show that the proteins carrying this motif are mRNA decapping proteins. In addition, this motif is found in two intrinsically disordered Saccharomyces cerevisiae decapping enhancers, Edc1 and Edc2, which show limited sequence conservation with human PNRC2. This motif in the N-terminal domain serves two purposes: it enhances the activity of the catalytic domain by recognizing part of the mRNA cap structure (i.e. activation motif), and secondly, it directly interacts with the decapping activator Dcp1. Mutation in the (YAG) sequence led to loss of the ability to activate the decapping complex. Hence the activity of the family members involved in mRNA processing mechanisms depends on the YAG activation motif that is 11-13 residues N-terminal of a conserved LPxP Dcp1 interaction motif.	19
373789	pfam15366	DUF4597	Domain of unknown function (DUF4597). This family of proteins is found in eukaryotes. Proteins in this family are typically between 63 and 76 amino acids in length. There is a conserved TPPTPT sequence motif.	62
405948	pfam15367	CABS1	Calcium-binding and spermatid-specific protein 1. CABS1 is a family of proteins found in eukaryotes. It is also known as NYD-SP26. It binds calcium and is specifically expressed in the elongate spermatids and then localized into the principal piece of flagella of matured spermatozoa.	397
405949	pfam15368	BioT2	Spermatogenesis family BioT2. BioT2 is a family of eukaryotic proteins expressed only in the testes. BioT2 is found abundantly in five types of murine cancer cell lines, suggesting it plays a role in testes development as well as tumorigenesis.	168
405950	pfam15369	KIAA1328	Uncharacterized protein KIAA1328. The function of this protein family remains uncharacterized. This family of proteins is found in eukaryotes.	325
405951	pfam15370	DUF4598	Domain of unknown function (DUF4598). This family of proteins is found in eukaryotes. Proteins in this family are typically between 159 and 251 amino acids in length.	111
405952	pfam15371	DUF4599	Domain of unknown function (DUF4599). The function of this family of eukaryotic proteins is not known.	88
405953	pfam15372	DUF4600	Domain of unknown function (DUF4600). 	128
405954	pfam15373	DUF4601	Domain of unknown function (DUF4601). This protein family is a domain of unknown function, which is found in eukaryotes. In humans, the gene encoding this protein is found in the position, chromosome 19 open reading frame 45.	437
405955	pfam15374	CCDC71L	Coiled-coil domain-containing protein 71L. The protein family, Coiled-coil domain-containing protein 71L, is a domain of unknown function, which is found in eukaryotes.	393
405956	pfam15375	DUF4602	Domain of unknown function (DUF4602). This family of proteins is found in eukaryotes. Proteins in this family are typically between 173 and 294 amino acids in length. This family includes Human C1orf131.	132
405957	pfam15376	DUF4603	Domain of unknown function (DUF4603). This protein family is a domain of unknown function. In particular, this domain lies at the C-terminal end of a protein found in eukaryotes.	1293
405958	pfam15377	DUF4604	Domain of unknown function (DUF4604). This protein family is a domain of unknown function, which is found in eukaryotes. Proteins in this family are typically between 141 and 174 amino acids in length and contain a conserved LSF sequence motif.	170
405959	pfam15378	DUF4605	Domain of unknown function (DUF4605). This protein family is a domain of unknown function, which is found in eukaryotes. Proteins in this family are typically between 82 and 137 amino acids in length.	59
405960	pfam15379	DUF4606	Domain of unknown function (DUF4606). This domain family is found in eukaryotes, and is approximately 100 amino acids in length.	103
373803	pfam15380	DUF4607	Domain of unknown function (DUF4607). This family of proteins is found in eukaryotes. Proteins in this family are typically between 207 and 359 amino acids in length.	264
405961	pfam15382	DUF4609	Domain of unknown function (DUF4609). This family of proteins is found in eukaryotes. Proteins in this family are typically between 70 and 139 amino acids in length.	68
405962	pfam15383	TMEM237	Transmembrane protein 237. This protein family is found in eukaryotes. The function of this protein is to aid the production of new cilia in ciliogenesis. Mutations in the protein cause a disease, named Joubert syndrome type 14 (JBTS14) and also affect cell signalling using the Wnt pathway. Proteins in this family are typically between 203 and 512 amino acids in length. There are two completely conserved G residues that may be functionally important.	248
405963	pfam15384	PAXX	PAXX, PAralog of XRCC4 and XLF, also called C9orf142. PAXX is a set of eukaryotic proteins that belong to the XRCC4 superfamily of DNA-double-strand break-repair proteins. PAXX interacts directly with DSB-repair protein Ku and is recruited to DNA-damage sites in cells thus functioning with XRCC4 and XLF to bring about DSB repair and cell survival in response to DSB-inducing agents.	195
405964	pfam15385	SARG	Specifically androgen-regulated gene protein. This family of proteins is found in eukaryotes. The function of this protein is still unknown, but it is thought to be an androgen receptor. Protein expression is up-regulated in the presence of androgens, but not in the presence of glucocorticoids. SARG tends to be highly expressed in prostate tissue. Proteins in this family are typically between 340 and 587 amino acids in length. There is a conserved EETI sequence motif.	567
405965	pfam15386	Tantalus	Drosophila Tantalus-like. An alpha+beta fold domain found in metazoan proteins such as Drosophila Tantalus. Drosophila Tantalus binds the chromatin protein Additional sex combs (Asx) and also binds DNA in vitro.	53
405966	pfam15387	DUF4611	Domain of unknown function (DUF4611). This family of proteins is found in eukaryotes. Proteins in this family are typically between 71 and 100 amino acids in length. There is a conserved AKR sequence motif.	96
405967	pfam15388	FAM117	Protein Family FAM117. This protein family is a domain of unknown function found in eukaryotes. Proteins in this family are typically between 269 and 453 amino acids in length. There are two conserved sequence motifs: RRT and TQT.	309
405968	pfam15389	DUF4612	Domain of unknown function (DUF4612). This protein family is a domain of unknown function, which is found in eukaryotes. Proteins in this family are typically between 109 and 323 amino acids in length.	111
405969	pfam15390	WDCP	WD repeat and coiled-coil-containing protein family. This family includes WD repeat and coiled-coil-containing protein (WDCP, previously known as C2orf44), which is found in eukaryotes and consists of around 721 amino acids. The N-terminal contains two WD (tryptophan-aspartic acid) repeats (WD1 and WD2). WD repeats may be involved in a range of biological functions including apoptosis, transcriptional regulation and signal transduction. The C-terminal contains a proline-rich sequence (PPRLPQR), and is predicted to have leucine-rich coiled coil region (CC). WDCP was identified in a proteomic screen to find signalling components that interact with Hck (hematopoietic cell kinase), a non-receptor tyrosine kinase. WDCP was shown to bind tightly and specifically to the SH3 domain of Hck in U937 human monocytic cells. WDCP was also shown to exist as an oligomer when expressed in mammalian cells. While the function of WDCP is unknown, it has been identified in a gene fusion event with anaplastic lymphoma kinase (ALK) in colorectal cancer patients.	684
405970	pfam15391	DUF4614	Domain of unknown function (DUF4614). This domain family is found in eukaryotes, and is approximately 180 amino acids in length. There is a conserved EALT sequence motif.	176
405971	pfam15392	Joubert	Joubert syndrome-associated. This family of proteins is a domain of unknown function, which is found in eukaryotes. However, mutations in the gene lead to Joubert syndrome, indicating that the protein the gene encodes is vital for correct ciliogenesis.	280
405972	pfam15393	DUF4615	Domain of unknown function (DUF4615). This protein family is a domain of unknown function, which is found in eukaryotes. Proteins in this family are typically between 161 and 229 amino acids in length. There is a single completely conserved residue F that may be functionally important.	132
373816	pfam15394	DUF4616	Domain of unknown function (DUF4616). This protein family is a domain of unknown function found at the C-terminal domain of the proteins. This protein family is found in eukaryotes. Proteins in this family are typically between 166 and 538 amino acids in length.	491
405973	pfam15395	DUF4617	Domain of unknown function (DUF4617). This family of proteins is found in eukaryotes. Proteins in this family are typically between 702 and 1745 amino acids in length.	1082
405974	pfam15396	FAM60A	Protein Family FAM60A. FAM60A is a family of proteins found in eukaryotes. It is known to be a cell cycle protein that binds to the promoter of a gene transcription repressor complex, named SIN4-HDAC complex. This means that FAM60A has an important role to play in 'switching on' gene expression. Proteins in this family are typically between 179 and 324 amino acids in length.	207
405975	pfam15397	DUF4618	Domain of unknown function (DUF4618). This family of proteins is found in eukaryotes. Proteins in this family are typically between 238 and 363 amino acids in length. There are two conserved sequence motifs: EYP and KCTPD.	258
405976	pfam15398	DUF4619	Domain of unknown function (DUF4619). This family of proteins is found in eukaryotes. Proteins in this family are typically between 128 and 299 amino acids in length.	296
405977	pfam15399	DUF4620	Domain of unknown function (DUF4620). 	113
405978	pfam15400	TEX33	Testis-expressed sequence 33 protein family. This family of proteins is found in eukaryotes. Proteins in this family are typically between 147 and 280 amino acids in length. There are two conserved sequence motifs: NIRH and SYT. The function is not known.	138
405979	pfam15401	TAA-Trp-ring	Tryptophan-ring motif of head of Trimeric autotransporter adhesin. TAA-head_Trp-ring is the tryptophan-ring motif of some Gram-negative Enterobacteriaceae. The Trp-ring folds into a beta-meander type on the top of the head domain of its trimeric autotransporter adhesin proteins. In conjunction with the GIN domain it is thought to be the region of the head that adheres to fibronectin.	65
373822	pfam15402	Spc7_N	N-terminus of kinetochore NMS complex subunit Spc7. 	917
405980	pfam15403	HiaBD2	HiaBD2_N domain of Trimeric autotransporter adhesin (GIN). HiaBD2_N may represent the GIN domain of the Head region of TAAs - trimeric autotransporter adhesins. Not all TAAs carry this domain; however, in those that do, the GIN in combination with the Trp-ring domain is necessary for adhesion to fibronectin in the host cell.	52
405981	pfam15404	PH_4	Pleckstrin homology domain. This Pleckstrin homology domain is found in some fungal species.	181
405982	pfam15405	PH_5	Pleckstrin homology domain. This Pleckstrin homology domain is found in some fungal species.	135
373825	pfam15406	PH_6	Pleckstrin homology domain. This Pleckstrin homology domain is found in some fungal species.	112
405983	pfam15407	Spo7_2_N	Sporulation protein family 7. Spo7_2 constitutes a different set of fungal and related species from those found in Spo7. This domain is found in general at the N-terminus. In many members the domain is associated with a Pleckstrin-homology - PH - domain.	64
405984	pfam15409	PH_8	Pleckstrin homology domain. This Pleckstrin homology domain is found in some fungal species.	89
405985	pfam15410	PH_9	Pleckstrin homology domain. This Pleckstrin homology domain is found in some fungal species.	118
405986	pfam15411	PH_10	Pleckstrin homology domain. This Pleckstrin homology domain is found in some fungal species.	120
405987	pfam15412	Nse4-Nse3_bdg	Binding domain of Nse4/EID3 to Nse3-MAGE. This family includes Nse4 and EID3 members, that bind over this region to the Nse3 pocket, in MAGE family pfam01454.	56
405988	pfam15413	PH_11	Pleckstrin homology domain. This Pleckstrin homology domain is found in some fungal species.	105
405989	pfam15414	DUF4621	Protein of unknown function (DUF4621). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 350 amino acids in length.	329
405990	pfam15415	Mfa_like_2	Fimbrillin-like. This family of proteins is found in bacteria. Proteins in this family are typically between 348 and 360 amino acids in length. Analysis of structural comparisons shows this family to be part of the FimbA (CL0450) superfamily of adhesin components or fimbrillins.	312
405991	pfam15416	DUF4623	Domain of unknown function (DUF4623). This family of proteins is found in bacteria. Proteins in this family are approximately 470 amino acids in length. There are two conserved sequence motifs: HLL and RYL.	448
405992	pfam15417	DUF4624	Domain of unknown function (DUF4624). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 150 amino acids in length.	132
405993	pfam15418	DUF4625	Domain of unknown function (DUF4625). This family contains a likely bacterial Ig-like fold, suggesting it may be a family of lipoproteins.	131
405994	pfam15419	LNP1	Leukemia NUP98 fusion partner 1. This family of proteins includes leukemia NUP98 fusion partner 1. The gene encoding this protein is involved in a chromosomal translocation with the NUP98 locus in a form of T-cell acute lymphoblastic leukemia.	177
405995	pfam15420	Abhydrolase_9_N	Alpha/beta-hydrolase family N-terminus. This is the N-terminal transmembrane domain of a family of alpha/beta hydrolases which may function as lipases. The C-terminal domain (pfam10081) is the catalytic domain.	208
405996	pfam15421	Polysacc_deac_3	Putative polysaccharide deacetylase. 	423
405997	pfam15423	FLYWCH_N	FLYWCH-type zinc finger-containing protein. This family is the N-terminus of some FLYWCH-zinc-finger proteins, found in eukaryotes. The family is found in association with pfam04500. There are two conserved sequence motifs: EQE and QEPS.	107
405998	pfam15424	ODAM	Odontogenic ameloblast-associated family. 	264
405999	pfam15425	DUF4627	Domain of unknown function (DUF4627). This family of proteins is found in bacteria. Proteins in this family are approximately 230 amino acids in length. There is a conserved WYK sequence motif.	195
373837	pfam15427	S100PBPR	S100P-binding protein. S100PBPR is a family of proteins found in eukaryotes, and localized to cell nuclei where S100P is also present, and the two proteins co-immunoprecipitate. S100P is a member of the S100 family of calcium-binding proteins and there have been several recent reports of its over-expression in pancreatic ductal adenocarcinoma. In situ hybridisation shows S100PBPR transcripts to be found in islet cells but not duct cells of the healthy pancreas. An interaction between S100P and S100PBPR may be involved in early pancreatic cancer.	386
406000	pfam15428	Imm26	Immunity protein 26. A predicted immunity protein with mostly all-beta fold and several conserved hydrophobic residues. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the Tox-URI1 or Tox-HNH family. The protein is also found in heterogeneous poly-immunity loci.	101
406001	pfam15429	DUF4628	Domain of unknown function (DUF4628). This family of proteins is found in eukaryotes. Proteins in this family are typically between 152 and 673 amino acids in length.	274
406002	pfam15430	SVWC	Single domain von Willebrand factor type C. SVWC is a family of single-domain von Willebrand factor type C proteins from lower eukaryotes. The canonical pattern of most von Willebrand factor type C (VWC) domains is of ten cysteines, however this family, largely but not exclusively of arthropod proteins, contains only eight. SVWC family proteins respond to environmental challenges, such as bacterial infection and nutritional status. They also are involved in anti-viral immunity, and all of these functions seem linked to SVWC expression being induced by Dicer2.	66
292071	pfam15431	TMEM190	Transmembrane protein 190. 	133
406003	pfam15432	Sec-ASP3	Accessory Sec secretory system ASP3. Sec-ASP3 is a family of bacterial proteins involved in the Sec secretory system. The family forms part of the accessory SecA2/SecY2 system specifically required to export GspB, a serine-rich repeat cell-wall glycoprotein adhesin encoded upstream in the same operon.	123
406004	pfam15433	MRP-S31	Mitochondrial 28S ribosomal protein S31. MRP-S31 is the mitochondrial 28S ribosomal subunit S31. This family of proteins is found in eukaryotes. Proteins in this family are typically between 246 and 395 amino acids in length. There are two conserved sequence motifs: RHFMELV and GLSKN.	301
406005	pfam15434	FAM104	Family 104. This family of proteins is found in eukaryotes. Proteins in this family are typically between 113 and 185 amino acids in length. There is a conserved SLQ sequence motif.	109
406006	pfam15435	UNC119_bdg	UNC119-binding protein C5orf30 homolog. UNC119_bdg is a family of eukaryotic proteins that probably plays a role in trafficking of proteins, via interaction with unc119 family cargo adapters. The family may play a role in ciliary membrane localization.	198
406007	pfam15436	PGBA_N	Plasminogen-binding protein pgbA N-terminal. PGBA_N is an N-terminal family of bacterial proteins that bind plasminogen. This activity was identified in Helicobacter pylori where it is thought to contribute to the virulence of this bacterium. Both PgbA and PgbB are surface-exposed proteins that mediate binding to plasminogen such that it can be converted into plasmin in the presence of a Pg activator.	217
292077	pfam15437	PGBA_C	Plasminogen-binding protein pgbA C-terminal. PGBA_C is a C-terminal family of bacterial proteins that bind plasminogen. This activity was identified in Helicobacter pylori where it is thought to contribute to the virulence of this bacterium. Both PgbA and PgbB are surface-exposed proteins that mediate binding to plasminogen such that it can be converted into plasmin in the presence of a plasminogen activator.	84
373844	pfam15438	Phyto-Amp	Antigenic membrane protein of phytoplasma. Phyto-Amp is a family of phytopathogenic wall-less bacterial antigenic membrane proteins. The bacteria are limited to the phloem and pose a major threat to agriculture worldwide. They are transmitted in a persistent, propagative manner by phloem-sucking Hemipteran insects. Phytoplasma membrane proteins are in direct contact with hosts and are assumed to be involved in determining vector specificity. Phyto-Amp is thought to be one family of proteins that mediates such specificity. The proteins appear to be encoded by circular extrachromosomal elements, at least one of which is a plasmid.	147
406008	pfam15439	NYAP_N	Neuronal tyrosine-phosphorylated phosphoinositide-3-kinase adapter. NYAP_N is an N-terminal family of eukaryotic proteins that are substrates of tyrosine kinase in the brain. When first identified, the family members were referred to as unconventional myosin XVI, or Myr 8. However, proteins have now been identified as being integrally involved in neuronal function and morphogenesis. The family is involved in both the activation of phosphoinositide 3-kinase (PI3K) and the recruitment of the downstream effector WAVE complex to the close vicinity of PI3K; it also appears to regulate the brain size and neurite outgrowth in mice.	381
406009	pfam15440	THRAP3_BCLAF1	THRAP3/BCLAF1 family. This family includes thyroid hormone receptor-associated protein 3 (THRAP3), which is a spliceosome component and a subunit of the TRAP complex which plays a role in pre-mRNA splicing and in mRNA decay. It also includes the transcriptional repressor Bcl-2-associated transcription factor 1 (BCLAF1).	614
406010	pfam15441	ARHGEF5_35	Rho guanine nucleotide exchange factor 5/35. This family includes Rho guanine nucleotide exchange factor 5 and Rho guanine nucleotide exchange factor 35.	488
406011	pfam15442	DUF4629	Domain of unknown function (DUF4629). This domain family is found in eukaryotes, and is approximately 150 amino acids in length. There are two conserved sequence motifs: MHML and LGKK.	150
373849	pfam15443	DUF4630	Domain of unknown function (DUF4630). This family of proteins is found in eukaryotes. Proteins in this family are typically between 124 and 286 amino acids in length.	156
373850	pfam15444	TMEM247	Transmembrane protein 247. This family of transmembrane proteins is found in eukaryotes. Proteins in this family are typically between 197 and 222 amino acids in length. The function of this family is unknown.	211
373851	pfam15445	ATS	acidic terminal segments, variant surface antigen of PfEMP1. ATS is the intracellular and relatively conserved acidic terminal segment of the Plasmodium falciparum erythrocyte membrane protein-1 (PfEMP1). This domain appears to be present in all variants of the highly polymorphic PfEMP1 proteins.	446
406012	pfam15446	zf-PHD-like	PHD/FYVE-zinc-finger like domain. This family appears to be a combination domain of several consecutive zinc-binding regions.	170
406013	pfam15447	NTS	N-terminal segments of PfEMP1. This family, the N-terminal segment, is the most variable part of the variant surface antigen family of Plasmodium falciparum, the erythrocyte membrane protein-1 (PfEMP1) proteins. PfEMP1 is an important target for protective immunity and is implicated in the pathology of malaria through its ability to adhere to host endothelial receptors. A structural and functional study of the N-terminal domain of PfEMP1 from the VarO variant comprising the N-terminal segment (NTS) and the first DBL domain (DBL1alpha1), shows this region is directly implicated in rosetting. NTS, previously thought to be a structurally independent component of PfEMP1, forms an integral part of the DBL1alpha domain that is found to be the important heparin-binding site. This family is closely associated with PFEMP, pfam03011, and Duffy_binding, pfam05424.	36
406014	pfam15448	NTS_2	N-terminal segments of P. falciparum erythrocyte membrane protein. NTS_2 is a family of the most variable part of the variant surface antigen family of Plasmodium falciparum, the erythrocyte membrane protein-1 (PfEMP1). However, in this group of proteins conservation is high. PfEMP1 is an important target for protective immunity and is implicated in the pathology of malaria through its ability to adhere to host endothelial receptors.	50
406015	pfam15449	Retinal	Retinal protein. This family of proteins is found in the photoreceptor cells of the retina. Mutations of the gene encoding this protein have been associated with retinal disorders such as retinitis pigmentosa and late-onset progressive retinal atrophy. The function of this family of proteins is unknown, but it is likely to be important in the development and function of the retina.	1293
406016	pfam15450	CCDC154	Coiled-coil domain-containing protein 154. CCDC154 is an osteopetrosis-related protein that suppresses cell proliferation by inducing G2/M arrest.	526
373857	pfam15451	DUF4632	Domain of unknown function (DUF4632). This family of proteins is found in eukaryotes. Proteins in this family are typically between 59 and 190 amino acids in length.	71
406017	pfam15452	NYAP_C	Neuronal tyrosine-phosphorylated phosphoinositide-3-kinase adapter. NYAP_C is a C-terminal family of eukaryotic proteins that are substrates of tyrosine kinase in the brain. When first identified, the family members were referred to as unconventional myosin XVI, or Myr 8. However, proteins have now been identified as being integrally involved in neuronal function and morphogenesis. The family is involved in both the activation of phosphoinositide 3-kinase (PI3K) and the recruitment of the downstream effector WAVE complex to the close vicinity of PI3K; it also appears to regulate the brain size and neurite outgrowth in mice.	261
406018	pfam15453	Pilt	Protein incorporated later into Tight Junctions. Pilt is a family of eukaryotic tight junction-proteins that binds to guanylate-kinase. Pilt is a component of TJs (Tight junctions) rather than AJs (Adherens junctions). The protein is incorporated into TJs after TJ strands are formed, thereby suggesting the name Pilt for 'protein incorporated later into TJs'. Pilt binds to the guanylate-kinase region of hDlg otherwise known as Disk large homolog.	362
406019	pfam15454	LAMTOR	Late endosomal/lysosomal adaptor and MAPK and MTOR activator. LAMTOR is a family of eukaryotic proteins that have otherwise been referred to as Lipid raft adaptor protein p18, Late endosomal/lysosomal adaptor and MAPK and MTOR activator 1, and Protein associated with DRMs and endosomes. It is found to be one of three small proteins constituting the Rag complex or Ragulator that interact with each other, localize to endosomes and lysosomes, and play positive roles in the MAPK pathway. The complex does this by interacting with the Rag GTPases, recruiting them to lysosomes, and bringing about mTORC1 activation.	69
317808	pfam15455	Pro-rich_19	Proline-rich 19. This family includes proline-rich protein 19.	363
406020	pfam15456	Uds1	Up-regulated During Septation. Uds1 is a domain family found mostly in fungi, and is typically between 120 and 138 amino acids in length. The GO annotation for the S.pombe protein describes the protein as barrier septum assembly involved in cell cycle cytokinesis, GO:0071937. Many of the uncharacterized members are listed as being involucrin repeat proteins, but this cannot be substantiated.	120
406021	pfam15457	HopW1-1	Type III T3SS secreted effector HopW1-1/HopPmaA. HopW1-1 is a family of bacterial modular P. syringae Avr effectors that induce accumulation of the signal molecule salicylic acid (SA) and the transcripts of HWI1 (HOPW1-1-INDUCED GENE1) in Arabidopsis. Thus HopW1-1 elicits a resistance response in Arabidopsis.	321
406022	pfam15458	NTR2	Nineteen complex-related protein 2. NTR2 or Nineteen complex-related protein 2 is a family of largely fungal and plant proteins that form a complex with the DExD/H-box RNA helicase Prp43. Along with NTR1 it is an accessory factor of Prp43 in catalyzing spliceosome disassembly. Disassembly of the spliceosome after completion of the splicing reaction is necessary for recycling of splicing factors to promote efficient splicing. NTR2 and NTR1 associate with a post-splicing complex containing the excised intron and the spliceosomal U2, U5, and U6 snRNAs, that supports a link with a late stage in the pre-mRNA splicing process.	310
406023	pfam15459	RRP14	60S ribosome biogenesis protein Rrp14. RRP14 is a family of nucleolar 60S ribosomal biogenesis proteins from eukaryotes. RRP14 functions in ribosome synthesis as it is required for the maturation of both small and large subunit rRNAs and it helps to prevent premature cleavage of the pre-rRNA at site C2. It also plays a role in cell polarity and/or spindle positioning.	63
406024	pfam15460	SAS4	Something about silencing, SAS, complex subunit 4. SAS4 is a family of largely fungal silencing regulators. This silencing is mediated by chromatin. SAS4 specifically silences the yeast mating-type genes HML and HMR. SAS4 is found to be one subunit of a complex, the SAS complex, that interacts with chromatin assembly factor Asf1p, and asf1 mutants show silencing defects similar to mutants in the SAS complex. Thus, ASF1-dependent chromatin-assembly may mediate the role of the SAS complex in silencing. Co-expression of Sas2, SAS4, and Sas5 in Escherichia coli leads to formation of a stable SAS complex that acetylates histones. SAS4 is essential for the acetyltransferase activity of Sas2, and Sas5 is also important.	99
406025	pfam15461	BCD	Beta-carotene 15,15'-dioxygenase. This is a family of bacterial and archaeal proteins that catalyzes or regulates the conversion of beta-carotene to retinal. Characterization of BCD proteins shows them to cleave beta-carotene at its central double bond (15,15') to yield two molecules of all-trans-retinal. However, the oxygen atom of retinal originated not from water but from molecular oxygen, suggesting that the enzyme was a beta-carotene 15,15'-dioxygenase, rather than a mono-oxygenase that catalyzes the same biochemical reaction.	264
406026	pfam15462	Barttin	Bartter syndrome, infantile, with sensorineural deafness (Barttin). Barttin is a family of mammalian proteins that are chloride ion channel beta-subunits crucial for renal Cl- re-absorption and inner ear K+ secretion. Bartter syndrome is a term covering a heterogeneous group of autosomal recessive salt-losing nephropathies that are caused by disturbed transepithelial sodium chloride re-absorption in the distal nephron. Mutations in the BCD proteins lead to sensorial deafness.	223
406027	pfam15463	ECM11	Extracellular mutant protein 11. ECM11 is a family of largely fungal proteins. ECM11 interacts with Cdc6, an essential protein involved in the initiation of DNA replication, and is a nuclear protein involved in maintaining chromatin structure. It was previously identified as a protein involved in yeast cell wall biogenesis and organisation, but is also found to be required in meiosis where its function is related to DNA replication and crossing-over.	133
406028	pfam15464	DUF4633	Domain of unknown function (DUF4633). This family of proteins is found in eukaryotes. Proteins in this family are typically between 94 and 123 amino acids in length.	114
406029	pfam15465	DUF4634	Domain of unknown function (DUF4634). This family of proteins is found in eukaryotes. Proteins in this family are typically between 98 and 133 amino acids in length.	131
406030	pfam15466	DUF4635	Domain of unknown function (DUF4635). This family of proteins is found in eukaryotes. Proteins in this family are typically between 120 and 154 amino acids in length. There are two conserved sequence motifs: LEQ and DLE.	134
406031	pfam15467	SGIII	Secretogranin-3. Secretogranin_3 is a family of vertebrate proteins that is one of the granin family. Granins are rich in acidic amino acids, exhibit aggregation at low pH, and possess a high capacity for calcium binding. Because granins are restricted in their localization to secretory granules of neuroendocrine cells, two interesting characteristics of their sorting mechanisms have been observed. These are, first, that they aggregate on low pH/high calcium concentrations and second that two of them carry an N-terminal disulfide loop, mutations in which lead to mis-sorting. Thus, granins are thought to be essential for the sorting of secretory proteins at the trans-Golgi network. Chromogranin A (CgA) binds to SGIII in secretory granules of endocrine cells. SGIII directly binds to cholesterol components of the secretory granule membrane and targets CgA to secretory granules in pituitary and pancreatic endocrine cells. Mutations in the SGIII gene may influence the risk of obesity through possible regulation of hypothalamic neuropeptide secretion.	449
373871	pfam15468	DUF4636	Domain of unknown function (DUF4636). This family of proteins is found in eukaryotes. Proteins in this family are typically between 196 and 244 amino acids in length.	243
406032	pfam15469	Sec5	Exocyst complex component Sec5. This conserved Sec5 family of eukaryotic proteins does not represent the Sec5-Ral binding site.	186
406033	pfam15470	DUF4637	Domain of unknown function (DUF4637). This family of proteins is found in eukaryotes. Proteins in this family are typically between 142 and 178 amino acids in length.	164
373874	pfam15471	TMEM171	Transmembrane protein family 171. This family of proteins is found in eukaryotes. TMEM171 is also known as parturition-related protein 2. Proteins in this family are typically between 242 and 326 amino acids in length.	317
406034	pfam15472	DUF4638	Domain of unknown function (DUF4638). This family of proteins is found in eukaryotes. Proteins in this family are typically between 240 and 272 amino acids in length.	262
406035	pfam15473	PCNP	PEST, proteolytic signal-containing nuclear protein family. PCNP is a PEST-containing nuclear protein that is ubiquitinated by NIRF, a Np95/ICBP90-like RING finger protein. PEST sequences, which are rich in proline (P), glutamic acid (E), serine (S) and threonine (T), are found in a number of short-lived proteins, such as transcription factors and cell cycle-associated proteins. Their function is generally controlled by proteolysis, mostly via ubiquitin-mediated degradation. Thus, NIRF and PCNP are a ubiquitin ligase and its substrate, respectively, that may constitute a novel signalling pathway with some relation to cell proliferation.	156
406036	pfam15474	MU117	Meiotically up-regulated gene family. This protein was identified as being up-regulated during meiosis in S.pombe. This family of proteins is found largely in plants and fungi. Proteins in this family are typically between 128 and 920 amino acids in length.	104
406037	pfam15475	UPF0444	Transmembrane protein C12orf23, UPF0444. This family of proteins is found in eukaryotes. Proteins in this family are typically between 94 and 119 amino acids in length.	91
406038	pfam15476	SAP25	Histone deacetylase complex subunit SAP25. SAP25 is a family of proteins found in eukaryotes. SAP25 is a core component of the mSin3 co-repressor complex whose subcellular location is regulated by PML. mSin3, the transcriptional co-repressor, is associated with histone deacetylases (HDACs) and is utilized by many DNA-binding transcriptional repressors. SAP25 is a nucleo-cytoplasmic shuttling protein that is actively exported from the nucleus by a CRM1-dependent mechanism. It binds to the PAH1 domain of mSin3A, associates with the mSin3A-HDAC complex in vivo, and represses transcription when tethered to DNA.	202
406039	pfam15477	SMAP	Small acidic protein family. This domain family is found in eukaryotes, and is approximately 70 amino acids in length. There is a single completely conserved residue G that may be functionally important.	73
406040	pfam15478	LKAAEAR	Family of unknown function with LKAAEAR motif. This family of proteins is found in eukaryotes. Proteins in this family are typically between 119 and 235 amino acids in length. There is a conserved LKAAEAR sequence motif.	137
406041	pfam15479	DUF4639	Domain of unknown function (DUF4639). This family of proteins is found in eukaryotes. Proteins in this family are typically between 161 and 601 amino acids in length.	580
406042	pfam15480	DUF4640	Domain of unknown function (DUF4640). This family of proteins is found in eukaryotes. Proteins in this family are typically between 99 and 306 amino acids in length.	292
406043	pfam15481	CPG4	Chondroitin proteoglycan 4. CPG4 is a domain family found in nematodes of one of nine core chondroitin proteoglycans. Vertebrates produce multiple chondroitin sulfate proteoglycans that play important roles in development and tissue mechanics. In the nematode Caenorhabditis elegans, the chondroitin chains lack sulfate but nevertheless play essential roles in embryonic development and vulval morphogenesis. CPG4 has the largest predicted mass of the C. elegans CPGs at 84 kDa. The majority of its 35 predicted glycosaminoglycan attachment sites reside in the COOH-terminal half of the protein, of which four sites were confirmed by DTT modification. The family is rich in conserved cysteines.	94
406044	pfam15482	CCER1	Coiled-coil domain-containing glutamate-rich protein family 1. This is a family of coiled-coil family proteins found in eukaryotes. Proteins in this family are typically between 160 and 397 amino acids in length.	213
406045	pfam15483	DUF4641	Domain of unknown function (DUF4641). This family of proteins is found in eukaryotes. Proteins in this family are typically between 201 and 519 amino acids in length.	443
406046	pfam15484	DUF4642	Domain of unknown function (DUF4642). This family of proteins is found in eukaryotes. Proteins in this family are typically between 115 and 196 amino acids in length.	155
406047	pfam15485	DUF4643	Domain of unknown function (DUF4643). This family of proteins is found in eukaryotes. Proteins in this family are typically between 254 and 462 amino acids in length.	263
406048	pfam15486	DUF4644	Domain of unknown function (DUF4644). This family of proteins is found in eukaryotes. Proteins in this family are typically between 143 and 191 amino acids in length.	161
406049	pfam15487	FAM220	FAM220 family. This protein family is a domain of unknown function which is found in eukaryotes. Proteins in this family are typically between 217 and 277 amino acids in length. There are two completely conserved residues (S and L) that may be functionally important.	275
406050	pfam15488	DUF4645	Domain of unknown function (DUF4645). This family of proteins is found in eukaryotes. Proteins in this family are typically between 200 and 298 amino acids in length.	294
406051	pfam15489	CTC1	CST, telomere maintenance, complex subunit CTC1. CTC1 is one of the three components of the CST complex that assists Shelterin to protect the ends of telomeres from attack by DNA-repair mechanisms. Mutations in human CTC1 have been recognized as contributing to cerebroretinal microangiopathy.	1139
373893	pfam15490	Ten1_2	Telomere-capping, CST complex subunit. Ten1_2 is a family of primarily plant and vertebrate telomere-capping proteins that is evolutionarily related to the mostly fungal family of Ten1, pfam12658.	117
406052	pfam15491	CTC1_2	CST, telomere maintenance, complex subunit CTC1. CTC1 is one of the three components of the CST complex that assists Shelterin to protect the ends of telomeres from attack by DNA-repair mechanisms. This family largely represents sequences from plant species.	287
406053	pfam15492	Nbas_N	Neuroblastoma-amplified sequence, N terminal. Nbas_N is an N-terminal family of metazoan sequences. This domain lies at the N-terminal of several WD40-containing proteins. The human protein is over-expressed in neuroblastoma cells.	282
406054	pfam15493	YrpD	Domain of unknown function, YrpD. This family of proteins is found in bacteria. Proteins in this family are typically between 236 and 351 amino acids in length. The member from Bacillus subtilis, UniProtKB:O05411, is named YrpD.	203
406055	pfam15494	SRCR_2	Scavenger receptor cysteine-rich domain. SRCR_2 is a scavenger receptor cysteine-rich domain family found largely on vertebrate sequences up-stream of the trypsin-like transmembrane serine protease, Spinesin.	99
406056	pfam15495	Fimbrillin_C	Major fimbrial subunit protein type IV, Fimbrillin, C-terminal. Fimbrillin_C is a C-terminal family of major fimbrial subunit protein type IV proteins largely from Bacillus species. The family is associated with family P_gingi_FimA, pfam06321.	83
406057	pfam15496	DUF4646	Domain of unknown function (DUF4646). This is a family of proteins largely from fungi. The function is not known.	120
406058	pfam15497	SNAPc19	snRNA-activating protein complex subunit 19, SNAPc subunit 19. SNAPc19 is a family of proteins found in eukaryotes. It is one of the five core components of the snRNA-activating protein complex or SNAPc that helps direct the nucleation of RNA polymerases II and III. The core RNA polymerase II snRNA promoters consist of a single essential element, the proximal sequence element (PSE), whereas the core RNA polymerase III snRNA promoters consist of both a PSE and a TATA box. The SNAPc binds to the PSE of both of these. SNAPc recognizes the PSE sequence common to all human snRNA genes, irrespective of polymerase specificity. SNAPc is also known as the PSE transcription factor (PTF) or PSE-binding protein (PBP). The human SNAP19 and SNAP45 subunits are dispensable for transcription in vitro and are not as widely conserved as the other three, SNAP190, SNAP43 and SNAP50, suggesting that these vertebrate-specific SNAPc subunits may have adapted specialized regulatory roles for snRNA gene transcription.	88
373901	pfam15498	Dendrin	Nephrin and CD2AP-binding protein, Dendrin. Dendrin is a family of eukaryotic proteins found in the podocytes of the kidneys. Dendrin, originally identified in telencephalic dendrites, is a constituent of the slit diaphragm, SD, complex of podocytes, where it directly binds to nephrin and CD2AP. Kidney podocytes and their slit diaphragms (SDs) form the final barrier to urinary protein loss. SD proteins also participate in intracellular signalling pathways. Dendrin appears to prevent programmed cell death (apoptosis) through its binding to nephrin. The SD protein nephrin serves as a component of a signalling complex that directly links podocyte junctional integrity to actin cytoskeletal dynamics. Thus, dendrin is identified as an SD family with proapoptotic signalling properties that accumulates in the podocyte nucleus in response to glomerular injury.	656
406059	pfam15499	Peptidase_C98	Ubiquitin-specific peptidase-like, SUMO isopeptidase. Peptidase_C98 is a small family of SUMO - small ubiquitin-related modifier - isopeptidases found in eukaryotes. Reversible attachment of SUMO is an essential protein modification in all eukaryotic cells. The family neither binds nor cleaves ubiquitin, but is a potent SUMO isopeptidase, and the invariant residues required for SUMO binding and cleavage, in UniProtKB:Q5W0Q7, are Cys-236, His-456 and Asp-472, all of which are fully conserved in the family. Member proteins are low-abundance proteins that colocalize with coilin in Cajal bodies. Peptidase_C98 depletion does not affect global sumoylation, but causes striking coilin mis-localization and impairs cell proliferation, functions that are not dependent on the catalytic activity. Thus, Peptidase_C98 represents a third type of SUMO protease, with essential functions in Cajal body biology.	272
259631	pfam15500	Ntox1	Putative RNase-like toxin, toxin_1. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses an alpha+beta fold and conserved cysteine, histidine and glutamate residues that is usually exported by the Photorhabdus virulence cassette (PVC)-type export system.	96
406060	pfam15501	MDM1	Nuclear protein MDM1. This family of proteins is present in the nucleus. The function of MDM1 is not known.	515
406061	pfam15502	MPLKIP	M-phase-specific PLK1-interacting protein. 	62
406062	pfam15503	PPP1R35_C	Protein phosphatase 1 regulatory subunit 35 C-terminus. This is the C-terminus of protein phosphatase 1 regulatory subunit 35. This protein interacts with and inhibits the serine/threonine-protein phosphatase PPP1CA.	144
406063	pfam15504	DUF4647	Domain of unknown function (DUF4647). This family of proteins is found in eukaryotes. Proteins in this family are typically between 282 and 480 amino acids in length.	465
373907	pfam15505	DUF4648	Domain of unknown function (DUF4648). This family of proteins is found in eukaryotes. Proteins in this family are typically between 115 and 207 amino acids in length.	80
292144	pfam15506	OCC1	OCC1 family. The human member of this family, overexpressed in colon carcinoma 1 protein, has been shown to be overexpressed in several colon carcinomas.	61
406064	pfam15507	DUF4649	Domain of unknown function (DUF4649). This family of Firmicute sequences has members that are annotated as ribose-phosphate pyrophosphokinase; however there is no evidence for this attribution. Member proteins are all shorter than 100 residues in length.	68
406065	pfam15508	NAAA-beta	beta subunit of N-acylethanolamine-hydrolyzing acid amidase. NAAA-beta is a family of vertebrate sequences that form the beta subunit of vertebrate N-acylethanolamine-hydrolyzing acid amidase, a member of the choloylglycine hydrolase acid ceramidase family. The alpha subunit is represented by family CBAH, pfam02275.	63
406066	pfam15509	DUF4650	Domain of unknown function (DUF4650). This family of vertebrate proteins lies to the C-terminus of Ubiquitin-specific peptidase-like protein family peptidase_C98, pfam15499. It might be acting as the exosite for the peptidase.	519
373910	pfam15510	CENP-W	CENP-W protein. CENP-W is a family of vertebrate kinetochore proteins that associates directly with CENP-T. CENP-W members are histone-fold proteins. The histone fold region is critical for binding to centromeric DNA. Importantly, the CENP-T-W complex does not directly associate with CENP-A, but with histone H3 in the centromere region. CENP-T and -W form a hetero-tetramer with CENP-S and -X and bind to a ~100 bp region of nucleosome-free DNA forming a nucleosome-like structure. The DNA-CENP-T-W-S-X complex is likely to be associated with histone H3-containing nucleosomes rather than with CENP-nucleosomes.	88
373911	pfam15511	CENP-T_C	Centromere kinetochore component CENP-T histone fold. CENP-T is a family of vertebrate kinetochore proteins that associates directly with CENP-W. The N-terminus of CENP-T proteins interacts directly with the Ndc80 complex in the outer kinetochore. Importantly, the CENP-T-W complex does not directly associate with CENP-A, but with histone H3 in the centromere region. CENP-T and -W form a hetero-tetramer with CENP-S and -X and bind to a ~100 bp region of nucleosome-free DNA forming a nucleosome-like structure. The DNA-CENP-T-W-S-X complex is likely to be associated with histone H3-containing nucleosomes rather than with CENP-nucleosomes. This domain is the C-terminal histone fold domain of CENP-T, which associates with chromatin.	108
406067	pfam15512	CAF-1_p60_C	Chromatin assembly factor complex 1 subunit p60, C-terminal. CAF-1_p60_C is a family of vertebrate proteins that is involved in chromatin assembly. CAF-1_p60 is one of the three subunits of the CAF-1 complex, and this domain binds to the C-terminal region of CAF-1_p150, family pfam12253. The N-terminal part of the CAF-1_p60 proteins is a WD-repeat structure, pfam00400.	177
406068	pfam15513	DUF4651	Domain of unknown function (DUF4651). A family of short, secreted proteins specific to the Streptococcus genus, with distant homologs, not recognized by this HMM, found in other cocci. In all sequenced genomes, proteins from this family appear in a conserved genomic context with a thioredoxin, tRNA synthase and tRNA binding protein, but the functional implication of this is unclear.	61
292152	pfam15514	ThaI	Restriction endonuclease ThaI. This family of restriction endonucleases belongs to the PD-(D/E)XK superfamily. It cuts the recognition site CG^CG leaving blunt ends.	202
406069	pfam15515	MvaI_BcnI	MvaI/BcnI restriction endonuclease family. This family of proteins includes the restriction endonucleases MvaI and BcnI. These enzymes both function as monomers. MvaI cleaves the sequence CC/WGG, where W is an A or a T nucleotide, leaving sticky ends. BcnI cleaves the sequence CC/SGG, where S is G or C, leaving sticky ends.	226
373914	pfam15516	BpuSI_N	BpuSI N-terminal domain. This is the N-terminal (nuclease) domain of the BpuSI restriction endonuclease.	168
406070	pfam15517	TBPIP_N	TBP-interacting protein N-terminus. This is the N-terminal restriction endonuclease-like domain found in several archaeal TATA-binding protein (TBP)-interacting proteins.	100
373916	pfam15518	L_protein_N	L protein N-terminus. This endonuclease domain is found at the N-terminus of many bunyavirus L proteins.	93
406071	pfam15519	RBM39linker	linker between RRM2 and RRM3 domains in RBM39 protein. A conserved linker between the second and the third RRM domain in human RBM39 (CAPER) protein, also present in other RNA binding proteins, especially those involved in RNA splicing. This linker was implicated in interactions with ESR1 and ESR2. Preliminary results from JCSG suggest that this is a structured domain with a well defined fold.	84
292158	pfam15520	Ntox10	Novel toxin 10. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses an alpha+beta fold that is usually exported by the type 2 secretion system.	193
373918	pfam15521	Ntox11	Novel toxin 11. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin contains two structural domains, an N-terminal alpha/beta domain and a C-terminal all-beta domain. The domain contains conserved GxR, RxxxoH GxE and GxxH motifs and a conserved histidine residue. In bacterial polymorphic toxin systems, the toxin is usually exported by the Photorhabdus virulence cassette (PVC)-type export system.	256
259653	pfam15522	Ntox14	Novel toxin 14. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses an alpha+beta fold that is usually exported by the Photorhabdus virulence cassette (PVC)-type export system.	218
406072	pfam15523	Ntox16	Novel toxin 16. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses an all-alpha helical fold and conserved (DNE)xxH motif and arginine residue. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 6, or Photorhabdus virulence cassette (PVC)-type secretion system.	85
406073	pfam15524	Ntox17	Novel toxin 17. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses a mostly all-beta fold and a conserved ExD motif and a histidine residue. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 7 or TcdB/TcaC-type secretion system.	98
406074	pfam15525	DUF4652	Domain of unknown function (DUF4652). This family of uncharacterized proteins from Clostridia and Bacilli classes has an unusual structure of three beta propeller repeats that do not form a barrel, as in well known 6-, 7- etc beta propeller barrels, but instead are stacked in a three-layer beta-sheet sandwich. The function of all the proteins from this family is unknown.	193
406075	pfam15526	Ntox21	Novel toxin 21. Bacterial genomes and plasmids encode a variety of peptide and protein toxins that mediate inter-bacterial competition. Bacteriocins are diffusible proteins that parasitize cell-envelope proteins to enter and kill bacteria. Contact-dependent growth inhibition (CDI) is one mechanism of inter-bacterial competition. Novel Toxin 21 (alternatively 16S rRNA endonuclease CdiA) belongs to a family of prokaryotic polymorphic toxin systems implicated in intra-specific conflicts. This RNase toxin found in bacterial polymorphic toxin systems, is proposed to adopt the BECR (Barnase-EndoU-ColicinE5/D-RelE) fold, with two conserved lysine residues and [DS]xDxxxH, RxG[ST] and RxxD motifs. In bacterial polymorphic toxin systems, the toxin is usually exported by the type 2, type 4, type 5 or type 7 secretion systems. This is also referred to as the E. cloacae CdiAC. The CdiAC proteins carry a variety of sequence-diverse C-terminal domains, which represent a collection of distinct toxins. Many CdiA-CT toxins have nuclease activities. In accord with the structural homology, CdiA-CT cleaves 16S rRNA at the same site as colicin E3 and this nuclease activity is responsible for growth inhibition.	71
406076	pfam15527	Ntox22	Bacterial toxin 22. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses a mostly beta fold and two conserved histidines, two aspartates and a glutamate residue. In bacterial polymorphic toxin systems, the toxin is usually exported by the type 5 secretion system.	129
373924	pfam15528	Ntox23	Bacterial toxin 23. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses an all-beta fold and conserved ND and DxxR motifs and a histidine residue. In bacterial polymorphic toxin systems, the toxin is exported by the type 2 or TcdB/TcaC secretion system.	190
406077	pfam15529	Ntox24	Bacterial toxin 24. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses an all-beta fold and conserved ND and DxxR motifs and a histidine residue. In bacterial polymorphic toxin systems, the toxin is exported by the type 2 or TcdB/TcaC secretion system. Interestingly, the toxin is also found in type-II toxin-antitoxin systems.	96
373925	pfam15530	Ntox25	Bacterial toxin 25. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses a mostly all-beta fold and conserved FGPY motif and a histidine residue. In bacterial polymorphic toxin systems, the toxin is exported by the type 2 or type 5 secretion system.	167
406078	pfam15531	Ntox27	Bacterial toxin 27. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses an alpha+beta fold and conserved aspartate and glutamate residues, and an RxW motif. In bacterial polymorphic toxin systems, the toxin is exported by the type 2 or type 7 secretion systems.	130
373927	pfam15532	Ntox30	Bacterial toxin 30. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses an all-beta fold and two conserved histidines present in an RxH and THIP motif. The domain additionally has a highly conserved arginine residue. In bacterial polymorphic toxin systems, the toxin is usually exported by the type 2, type 6 or type 7 secretion systems.	103
292170	pfam15533	Ntox33	Bacterial toxin 33. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses an alpha+beta fold and [DN]xHxxK and DxxxD motifs. It is usually exported by the Type 2 secretory system.	65
406079	pfam15534	Ntox35	Bacterial toxin 35. A predicted RNase toxin found in bacterial polymorphic toxin systems that is proposed to adopt the BECR (Barnase-EndoU-ColicinE5/D-RelE) fold, and contains a conserved histidine residue and a KH motif. In bacterial polymorphic toxin systems, the toxin is usually exported by the type 2 secretion system.	77
373929	pfam15535	Ntox37	Bacterial toxin 37. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses an all-beta fold and a conserved glutamate residue, and [KR] and Hx[DH] motifs. In bacterial polymorphic toxin systems, the toxin is exported by the type 2 or type 7 secretion systems.	64
292173	pfam15536	Ntox3	Bacterial toxin 3. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses an all-beta fold and conserved aspartate, arginine, histidine and cysteine residues that is usually exported by the Photorhabdus virulence cassette (PVC)-type export system.	133
292174	pfam15537	Ntox43	Bacterial toxin 43. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses an alpha+beta fold with two conserved histidine residues. In bacterial polymorphic toxin systems, the toxin is usually exported by the type 2 or TcdB/TcaC-type secretion system. An example of this, the Pseudomonas RhsT-C, has been experimentally characterized.	127
406080	pfam15538	Ntox46	Bacterial toxin 46. A predicted toxin domain found in bacterial polymorphic toxin systems. The toxin possesses an alpha+beta fold with a conserved glutamine residue and a [KR]STxxPxxDxx[ST] motif. In bacterial polymorphic toxin systems, the toxin is exported by the type 2 or type 6 secretion system.	157
406081	pfam15539	CAF1-p150_C2	CAF1 complex subunit p150, region binding to CAF1-p60 at C-term. CAF1-p150_C2 is part of the binding region of the CAF1 complex p150 subunit to the p60 subunit. The CAF1 complex is essential in human cells for the de novo deposition of histones H3 and H4 at the DNA replication fork.	288
406082	pfam15540	Ntox47	Bacterial toxin 47. A predicted RNase toxin found in bacterial polymorphic toxin systems that is proposed to adopt the BECR (Barnase-EndoU-ColicinE5/D-RelE) fold, and contains two conserved aspartates, a glutamate, a histidine and an arginine residue and an RT motif. In bacterial polymorphic toxin systems, the toxin is usually exported by the type 2, type 6 or type 7 secretion system.	111
292178	pfam15541	Ntox4	Bacterial toxin 4. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses an alpha+beta fold that is usually exported by the Photorhabdus virulence cassette (PVC)-type export system.	109
406083	pfam15542	Ntox50	Bacterial toxin 50. A predicted RNase toxin found in bacterial polymorphic toxin systems that is proposed to adopt the BECR (Barnase-EndoU-ColicinE5/D-RelE) fold, and contains two conserved histidine, a serine, two lysine, and a threonine residue and a HxVP motif. In bacterial polymorphic toxin systems, the toxin is usually exported by the type 2, type 6, type 7, and MuF-type secretion systems.	93
292180	pfam15543	Ntox5	Bacterial toxin 5. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses an alpha+beta fold that is usually exported by the Photorhabdus virulence cassette (PVC)-type export system.	142
373932	pfam15544	Ntox6	Bacterial toxin 6. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses an alpha+beta fold that is usually exported by the Photorhabdus virulence cassette (PVC)-type export system.	279
292182	pfam15545	Ntox8	Bacterial toxin 8. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses an alpha+beta fold and HxR and HxxxH motifs and is usually exported by the type 2 and type 6 secretion system.	74
317880	pfam15546	DUF4653	Domain of unknown function (DUF4653). This family of proteins is found in eukaryotes. Proteins in this family are typically between 93 and 229 amino acids in length.	229
406084	pfam15547	DUF4654	Domain of unknown function (DUF4654). This family of proteins is found in eukaryotes. Proteins in this family are typically between 145 and 169 amino acids in length. There is a conserved IDC sequence motif.	137
406085	pfam15548	DUF4655	Domain of unknown function (DUF4655). This family of proteins is found in eukaryotes. Proteins in this family are typically between 533 and 570 amino acids in length.	534
406086	pfam15549	PGC7_Stella	PGC7/Stella/Dppa3 domain. The domain belongs to a fast evolving family known only from the placental mammals. The PGC7/Stella/Dppa3 protein protects imprinted regions from demethylation post-fertilization. This suggests that it might bind methylated DNA sequences directly. The conserved core includes a positively charged helical segment and a C-terminal CXCXXC motif that is predicted to chelate a metal ion. Most placental mammals contain 3-6 paralogs of this domain family. The CXCXXC motif is also conserved in a subset of fungal MBD4-like proteins.	166
406087	pfam15550	Draxin	Draxin. This family of proteins inhibit Wnt signaling and act as chemorepulsive axon guidance molecules.	319
406088	pfam15551	DUF4656	Domain of unknown function (DUF4656). This family of proteins is found in eukaryotes. Proteins in this family are typically between 286 and 398 amino acids in length.	361
406089	pfam15552	DUF4657	Domain of unknown function (DUF4657). This family of proteins is found in eukaryotes. Proteins in this family are typically between 305 and 370 amino acids in length.	294
406090	pfam15553	TEX19	Testis-expressed protein 19. This family of proteins is expressed in testis.	159
406091	pfam15554	FSIP1	FSIP1 family. 	399
406092	pfam15555	DUF4658	Domain of unknown function (DUF4658). This family of proteins is found in eukaryotes. Proteins in this family are typically between 129 and 161 amino acids in length.	123
406093	pfam15556	Zwint	ZW10 interactor. This family of proteins is found in eukaryotes. Proteins in this family are typically between 127 and 281 amino acids in length.	252
373941	pfam15557	CAF1-p150_N	CAF1 complex subunit p150, region binding to PCNA. CAF1-p150_N is part of the N-terminus of the CAF1 complex p150 subunit that binds to PCNA - proliferating cell nuclear antigen. The PCNA mediates the connection between CAF-1 and the DNA replication fork. The CAF1 complex is essential in human cells for the de novo deposition of histones H3 and H4 at the DNA replication fork.	230
406094	pfam15558	DUF4659	Domain of unknown function (DUF4659). This family of proteins is found in eukaryotes. Proteins in this family are typically between 427 and 674 amino acids in length. There are two completely conserved residues (D and I) that may be functionally important.	374
406095	pfam15559	DUF4660	Domain of unknown function (DUF4660). This family of proteins is found in eukaryotes. Proteins in this family are typically between 93 and 189 amino acids in length.	107
292197	pfam15560	Imm12	Immunity protein 12. A predicted immunity protein with an alpha+beta fold and several conserved charged and hydrophobic residues. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the Tox-URI2 family. The protein is also found in heterogeneous poly-immunity loci.	138
406096	pfam15561	Imm15	Immunity protein 15. A predicted immunity protein with an alpha+beta fold and several conserved polar and hydrophobic residues. Proteins containing this domain are present in heterogeneous poly-immunity loci in polymorphic toxin systems.	160
406097	pfam15562	Imm17	Immunity protein 17. A predicted immunity protein with two transmembrane helices, and a WxW motif and a conserved arginine between the two helices. Proteins containing this domain are present in heterogeneous poly-immunity loci in polymorphic toxin systems.	60
406098	pfam15563	Imm19	Immunity protein 19. A predicted immunity protein with an alpha+beta fold and a conserved HxxRN motif. Proteins containing this domain are present in heterogeneous poly-immunity loci in polymorphic toxin systems.	227
259695	pfam15564	Imm25	Immunity protein 25. A predicted immunity protein with an alpha+beta fold. Proteins containing this domain are present in heterogeneous poly-immunity loci of polymorphic toxin systems.	131
406099	pfam15565	Imm30	Immunity protein 30. A predicted immunity protein with a mostly alpha-helical fold and a conserved DxG motif. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the Tox-SHH family of HNH/Endonuclease VII fold nucleases.	96
292202	pfam15566	Imm32	Immunity protein 32. A predicted immunity protein with an alpha+beta fold and a conserved histidine residue. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the Ntox12, Ntox37 or Ntox7 families.	54
379730	pfam15567	Imm35	Immunity protein 35. A predicted immunity protein with an alpha+beta fold and a conserved tryptophan residue. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a protease domain such as Tox-PL1 and Ntox40. In some instances, it is also fused to a papain-like toxin, ADP-ribosyl glycohydrolase and a S8-like peptidase. Based on these associations the domain is likely to be a protease inhibitor.	84
259699	pfam15568	Imm39	Immunity protein 39. A predicted immunity protein with an alpha+beta fold and conserved GR, and GxK motifs. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the Tox-URI2 family of nucleases.	131
373946	pfam15569	Imm40	Immunity protein 40. A predicted immunity protein with an alpha+beta fold and conserved phenylalanine and tryptophan residues and a GGD motif. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the Ntox19 family.	93
406100	pfam15570	Imm43	Immunity protein 43. A predicted immunity protein with an alpha+beta fold with conserved tryptophan, proline, aspartate, serine and arginine residues. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the Tox-AHH family of HNH/Endonuclease VII fold nucleases. The gene for this toxin is also found in heterogeneous poly-immunity loci.	124
292206	pfam15571	Imm44	Immunity protein 44. A predicted immunity protein with an alpha+beta fold. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the Tox-URI1, Tox-URI2 or Tox-ParBL1 families. The gene for this toxin is also found in heterogeneous poly-immunity loci that show variations in structure even between closely related strains.	126
406101	pfam15572	Imm45	Immunity protein 45. A predicted immunity protein with an alpha+beta fold and a conserved C-terminal tryptophan residue. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the Tox-ColE3 family.	95
406102	pfam15573	Imm47	Immunity protein 47. A predicted immunity protein with an alpha+beta fold and a conserved KxGDxxK motif. Proteins containing this domain are present in heterogeneous poly-immunity loci in polymorphic toxin systems.	258
317899	pfam15574	Imm48	Immunity protein 48. A predicted immunity protein with an all alpha-helical fold and a conserved HRG motif. Proteins containing this domain are present in heterogeneous poly-immunity loci in polymorphic toxin systems.	123
406103	pfam15575	Imm49	Immunity protein 49. A predicted immunity protein with an all alpha-helical fold and a conserved proline residue. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the Tox-REase-1 or Tox-REase-6 families.	212
373950	pfam15576	DUF4661	Domain of unknown function (DUF4661). This family of proteins is found in eukaryotes. Proteins in this family are typically between 281 and 302 amino acids in length.	253
406104	pfam15577	Spc7_C2	Spc7_C2. Spc7_C2 is a short family at the C-terminus of fungal Spc7 proteins. The Ndc80-MIND-Spc7 complex plays a role in kinetochore function during late meiotic prophase and throughout the mitotic cell cycle. The N-terminal region of Spc7 co-localizes with the mitotic spindle, and it has been argued that Spc7 has the potential to associate with spindle microtubules and that this association is regulated by the C-terminal part of the Spc7 protein. However, this family represents only the conserved region towards the end of the C-terminus; the majority of the C-terminal part is in family Spc7, pfam08317.	62
406105	pfam15578	DUF4662	Domain of unknown function (DUF4662). This family of proteins is found in eukaryotes. Proteins in this family are approximately 290 amino acids in length.	268
406106	pfam15579	Imm52	Immunity protein 52. A predicted immunity protein with an alpha+beta fold and conserved tryptophan and phenylalanine residues, and a GT motif. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the Tox-REase-5 family.	102
373954	pfam15580	Imm53	Immunity protein 53. A predicted immunity protein with an alpha+beta fold and a conserved tryptophan, and WE and PGW motifs. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the Ntox24 or Ntox10 families.	90
292215	pfam15581	Imm58	Immunity protein 58. A predicted immunity protein with an alpha+beta fold and YxxxD, WxG, KxxxE motifs. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene.	109
292216	pfam15582	Imm65	Immunity protein 65. A predicted immunity protein with an alpha+beta fold and a conserved YxC motif. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, which usually contains toxin domains of the Tox-JAB1 family. The immunity protein typically contains a signal peptide and a lipobox.	321
406107	pfam15583	Imm68	Immunity protein 68. A predicted immunity protein with an alpha+beta fold and a conserved glutamate residue. The domain is often fused to one or more immunity domains in poly-immunity proteins.	152
292218	pfam15584	Imm72	Immunity protein 72. A predicted immunity protein with a mostly all-beta fold and GxxE, WxDxRY motifs and a glutamate residue. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, which usually contains toxin domains of the Ntox48 family. This domain is often fused to the Imm71 immunity domain.	81
406108	pfam15585	Imm7	Immunity protein 7. A predicted immunity protein with an alpha+beta fold and a conserved GxaG motif. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a Tox-REase-3 domain.	130
406109	pfam15586	Imm8	Immunity protein 8. A predicted immunity protein with an alpha+beta fold and a conserved WEa (a: aromatic) motif. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the Ntox7 family.	114
406110	pfam15587	Imm9	Immunity protein 9. A predicted immunity protein with an alpha+beta fold and a conserved lysine residue. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the Tox-URI2 family. The protein is also found in heterogeneous poly-immunity loci.	165
379733	pfam15588	Imm10	Immunity protein 10. A predicted immunity protein with a mostly all-beta fold and a conserved arginine residue. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a Pput_2613 deaminase domain. The protein is also found in heterogeneous poly-immunity loci.	104
406111	pfam15589	Imm21	Immunity protein 21. A predicted immunity protein with an alpha+beta fold and conserved WxG and YxxxC motif. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the NGO1392-family of HNH/Endonuclease VII fold nucleases.	156
373957	pfam15590	Imm27	Immunity protein 27. A predicted immunity protein with an alpha+beta fold and a conserved aspartate and GGxP motif. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the Ntox10 or Tox-ParB families.	67
292225	pfam15591	Imm31	Immunity protein 31. A predicted immunity protein with a mostly all-beta fold and a conserved GxS motif. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the Ntox17 or Ntox7 families.	73
406112	pfam15592	Imm41	Immunity protein 41. A predicted immunity protein with an alpha+beta fold and a conserved SF motif and tryptophan residue. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the Ntox21, Ntox29 or Tox-ART-RSE-like ADP-ribosyltransferase families.	108
406113	pfam15593	Imm42	Immunity protein 42. A predicted immunity protein with an alpha+beta fold. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the Ntox18 family.	162
406114	pfam15594	Imm50	Immunity protein 50. A predicted immunity protein with an all-beta fold. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the Tox-HHH or Ntox24 families.	118
406115	pfam15595	Imm51	Immunity protein 51. A predicted immunity protein with an alpha+beta fold and a conserved tryptophan and Dx[DE] motif. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the Tox-RES or Tox-URI1 families. Proteins containing this domain are present in heterogeneous poly immunity loci in polymorphic toxin systems.	105
373959	pfam15596	Imm57	Immunity protein 57. A predicted immunity protein with a mostly alpha-helical fold and conserved aspartate and cysteine residues and an SE motif. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, usually containing a domain of the LD-peptidase or Tox-Caspase families.	111
406116	pfam15597	Imm59	Immunity protein 59. A predicted immunity protein with an alpha+beta fold and a conserved [DE]R motif. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, which usually contains toxin domains of the Ntox13 or Ntox40 families. In some proteins this domain is fused to the Imm38, pfam15599 immunity domain.	100
406117	pfam15598	Imm61	Immunity protein 61. A predicted immunity protein with an alpha+beta fold and a conserved arginine. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, which usually contains toxin domains of the Ntox40 family.	153
406118	pfam15599	Imm63	Immunity protein 63. A predicted immunity protein with an alpha+beta fold and a conserved E+G and ExxY motifs. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, which usually contains toxin domains of the Ntox40, Tox-CdiAC and Tox-ARC families. The protein is also found in poly-immunity loci in polymorphic toxin systems.	83
406119	pfam15600	Imm64	Immunity protein 64. A predicted immunity protein with an alpha+beta fold and a conserved DxEA motif and arginine residue. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, which usually contains toxin domains of the Tox-ColD family.	207
406120	pfam15601	Imm70	Immunity protein 70. A predicted immunity protein with an alpha+beta fold and conserved tyrosine and tryptophan residues. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, which usually contains toxin domains of the Tox-REase-10 family.	131
379737	pfam15602	Imm71	Immunity protein 71. A predicted immunity protein with a mostly alpha-helical fold and conserved arginine and phenylalanine residues. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, which usually contains toxin domains of the Ntox48 family. This domain is often fused to the Imm72 immunity domain.	158
406121	pfam15603	Imm74	Immunity protein 74. A predicted immunity protein with an alpha+beta fold. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, which usually contains toxin domains of the Tox-ARC family. This domain is also found in heterogeneous poly-immunity loci.	80
406122	pfam15604	Ntox15	Novel toxin 15. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses a mostly all-alpha helical fold and a conserved HxxD motif. In bacterial polymorphic toxin systems, the toxin is usually exported by the type 2, type 6, type 7 or Photorhabdus virulence cassette (PVC)-type secretion systems. This is shown to be a type IV secretion system protein that behaves as DNase.	154
373963	pfam15605	Ntox28	Bacterial toxin 28. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses an all alpha-helical fold and conserved aspartate and glutamate residues, and K[DE] and [DN]HxxE motifs. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 5 or type 7 secretion system.	104
292240	pfam15606	Ntox34	Bacterial toxin 34. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses an all-alpha helical fold and conserved lysine and cysteine residues, and GNxxD and WxCxH motifs. In bacterial polymorphic toxin systems, the toxin is exported by the type 2 or type 6 secretion system.	80
406123	pfam15607	Ntox44	Bacterial toxin 44. A predicted RNase toxin found in bacterial polymorphic toxin systems. The toxin possesses an all-alpha-helical fold with conserved DxK, GNxxxG, and DxxxD motifs. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 6 or type 7 secretion systems.	94
406124	pfam15608	PELOTA_1	PELOTA RNA binding domain. This RNA binding Pelota domain is at the C-terminus of a PRTase family. These PRTase+Pelota genes are found in the biosynthetic operon associated with the Ter stress-response operon and are predicted to be involved in the biosynthesis of a ribo-nucleoside involved in stress response.	79
406125	pfam15609	PRTase_2	Phosphoribosyl transferase. This PRTase family, and C-terminal TRSP domain, are related to OPRTases, and are predicted to use Orotate as substrate. These genes are found in the biosynthetic operon associated with the Ter stress-response operon and are predicted to be involved in the biosynthesis of a ribo-nucleoside involved in stress response.	189
379741	pfam15610	PRTase_3	PRTase ComF-like. This PRTase family is related to the ComF PRTases. These genes are found in the smaller biosynthetic operon associated with the Ter stress-response operon and are predicted to be involved in the biosynthesis of a ribo-nucleoside involved in stress-response.	265
406126	pfam15611	EH_Signature	EH_Signature domain. This domain with a strongly conserved glutamate at the N-terminus and a histidine at the C-terminus, is found in a SWI2/SNF2 four gene operon. Its strict-neighborhood association with SWI2/SNF2 ATPase strongly suggests a function in conjunction with it. The other genes in the operon are a OmpA protein and a TM protein. This has a DNA related function along with the TerY-P triad.	347
406127	pfam15612	WHIM1	WSTF, HB1, Itc1p, MBD9 motif 1. A conserved alpha helical motif that along with the WHIM2 and WHIM3 motifs, and the DDT domain comprise an alpha helical module found in diverse eukaryotic chromatin proteins. Based on the Ioc3 structure, this module is inferred to interact with nucleosomal linker DNA and the SLIDE domain of ISWI proteins. The resulting complex forms a protein ruler that measures out the spacing between two adjacent nucleosomes. The conserved basic residue in WHIM1 is involved in packing with the DDT motif. The module shows a great domain architectural diversity and is often combined with other modified histone peptide recognising and DNA binding domains, some of which discriminate methylated DNA.	46
406128	pfam15613	WSD	Williams-Beuren syndrome DDT (WSD), D-TOX E motif. This family represents the combined alpha-helical module found in diverse eukaryotic chromatin proteins. Based on the Ioc3 structure, the N-terminus of this module is inferred to interact with nucleosomal linker DNA and the SLIDE domain of ISWI proteins. The resulting complex forms a protein ruler that measures out the spacing between two adjacent nucleosomes. The acidic residue from the GxD signature at the N-terminus is a major determinant of the interaction between the ISWI and WHIM motifs. The N-terminal portion also contacts the inter-nucleosomal linker DNA. The module shows a great domain architectural diversity and is often combined with other modified histone peptide recognizing and DNA binding domains, some of which discriminate methylated DNA. The WSD module constitutes the inter-nucleosomal linker DNA binding site in the major groove of DNA, and was first identified as WSD, the D-TOX E motif of plant homeodomains homologous with the mutant transcription factor causing Williams-Beuren syndrome in association with the DDT-domain.	69
406129	pfam15615	TerB_C	TerB-C domain. TerB-C occurs C-terminal of TerB in TerB-N containing proteins. This domain displays multiple conserved acidic residues (TerBC). The presence of conserved acidic residues in both TerB-N and TerB-C suggests that they, like the TerB domain, might also chelate metals. These two domains may also occur together in the same protein independently of TerB.	143
406130	pfam15616	TerY_C	TerY-C metal binding domain. TerY-C is found C-terminal to TerY-like vWA domains in some proteins. It has 8 conserved metal chelating cysteines or histidines. It occasionally occurs as solos.	129
406131	pfam15617	C-C_Bond_Lyase	C-C_Bond_Lyase of the TIM-Barrel fold. This family of TIM-Barrel fold C-C bond lyase is related to citrate-lyase. These genes are found in the biosynthetic operon, with other enzymatic domains, associated with the Ter stress response operon and are predicted to be involved in the biosynthesis of a ribo-nucleoside involved in stress response.	320
406132	pfam15619	Lebercilin	Ciliary protein causing Leber congenital amaurosis disease. Lebercilin is a family of eukaryotic ciliary proteins. Mutations in the gene, LCA5, are implicated in the disease Leber congenital amaurosis. In photoreceptors, lebercilin is uniquely localized at the cilium that bridges the inner and outer segments. Lebercilin functions as an integral element of selective protein transport through photoreceptor cilia. Lebercilin specifically interacts with the intraflagellar transport (IFT), and disruption of IFT can lead to Leber congenital amaurosis.	186
406133	pfam15620	CENP-C_mid	Centromere assembly component CENP-C middle DNMT3B-binding region. CENP-C is a component of the centromere assembly complex in eukaryotes. CENP-C recruits the DNA methyltransferases DNMT3B, in order to establish the necessary epigenetic DNA-methylation essential for maintenance of chromatin structure and genomic stability. This middle region of CENP-C is the binding-domain for DNMT3B. Binding of CENP-C and DNMT3B to DNA occurs at both centromeric and peri-centromeric satellite repeats. CENP-C and DNMT3B regulate the histone code in these regions.	259
406134	pfam15621	PROL5-SMR	Proline-rich submaxillary gland androgen-regulated family. SMR is a family of proteins found in eukaryotes. The family of SMR proteins is expressed in the submaxillary gland. SMR members may play a role in protection or detoxification.	102
406135	pfam15622	CENP_C_N	Kinetochore assembly subunit CENP-C N-terminal. CENP-C is a vertebrate family that forms a core component of the centromeric chromatin. On depletion of CENP-C proper formation of both centromeres and kinetochores is prevented. The N-terminal of CENP-C is necessary for recruitment of some but not all components of the Mis12 complex of the kinetochore.	287
406136	pfam15623	CT47	Cancer/testis gene family 47. CT47 is a family of proteins found in eukaryotes. Proteins in this family are typically between 262 and 291 amino acids in length. There is a conserved HIL sequence motif. The function of this family is not known.	278
406137	pfam15624	Mif2_N	Kinetochore CENP-C fungal homolog, Mif2, N-terminal. Mif2_N is a family of fungal proteins homologous to mammalian CENP-C. On depletion of CENP-C proper formation of both centromeres and kinetochores is prevented. The N-terminal of CENP-C is necessary for recruitment of some but not all components of the Mis12 complex of the kinetochore.	133
406138	pfam15625	CC2D2AN-C2	CC2D2A N-terminal C2 domain. Many ciliary proteins are involved in ciliogenesis and implicated in ciliopathies. A recent study has shown that many of them contain various new versions of C2 domains which are predicted to mediate membrane localizations for Y-shaped linkers of transition zone of cilia. This is the first C2 domain of ciliary CC2D2A proteins which also have another C2 domain (CC2D2AC-C2) and a new inactive transglutaminase-like peptidase domain (CC2D2A-TGL).	176
373975	pfam15626	mono-CXXC	single CXXC unit. This is a solo version of the zf-CXXC domain with a conserved CXXCXXCX(n)C, zinc-binding motif. This is, thus far, only detected in the plant lineage in diverse chromatin proteins. Structural comparisons show that the mono-CXXC is homologous to the structural-zinc binding domain of medium chain dehydrogenases. The regular zf-CXXC domain binds nonmethyl-CpG dinucleotides.	53
406139	pfam15627	CEP76-C2	CEP76 C2 domain. Many ciliary proteins are involved in ciliogenesis and implicated in ciliopathies. A recent study has shown that many of them contain various new versions of C2 domains which are predicted to mediate membrane localizations for Y-shaped linkers of transition zone of cilia. This is the new C2 domain that is contained by ciliary CEP76 proteins.	154
373977	pfam15628	RRM_DME	RRM in Demeter. This is a predicted RRM-fold domain present at the C-terminus of Demeter-like glycosylases. These proteins are involved in DNA demethylation in plants where they catalyze removal of the 5mC base and subsequently cleave the backbone through lyase activity. Orthologs of Demeter are present in plants and stramenopiles. The RRM fold domain is predicted to facilitate interaction of the catalytic domain with ssDNA or regulatory RNA.	102
406140	pfam15629	Perm-CXXC	Permuted single zf-CXXC unit. This is a permuted version of a single unit of the zf-CXXC domain that is detected in the Demeter-like proteins of land plants. Structural comparisons show that the mono-CXXC is homologous to the structural-zinc binding domain of medium chain dehydrogenases. The classical zf-CXXC domain binds nonmethyl-CpG dinucleotides.	32
406141	pfam15630	CENP-S	CENP-S protein. CENP-S is a family of vertebrate and fungal kinetochore component proteins. CENP-S complexes with CENP-X to form a stable CENP-T-W-S-X heterotetramer.	76
406142	pfam15631	Imm-NTF2-2	NTF2 fold immunity protein. A predicted immunity protein of the NTF2 fold. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, which usually contains toxin domains of the Tox-NucA family. This domain is also fused to ankyrin repeats and the pfam14025.	72
406143	pfam15632	ATPgrasp_Ter	ATP-grasp in the biosynthetic pathway with Ter operon. This ATP-grasp family is related to carbamoyl phosphate synthetase. These genes are found in the biosynthetic operon associated with the Ter stress response operon and are predicted to be involved in the biosynthesis of a ribo-nucleoside involved in stress response.	131
406144	pfam15633	Tox-ART-HYD1	HYD1 signature containing ADP-ribosyltransferase. A predicted toxin of the ADP-ribosyltransferase superfamily present in bacterial polymorphic toxin systems. The domain has characteristic histidine, tyrosine and aspartate residues that comprise the active site. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 6, or type 7 secretion system.	97
292267	pfam15634	Tox-ART-HYE1	HYE1 signature containing ADP-ribosyltransferase. A predicted toxin of the ADP-ribosyltransferase superfamily present in bacterial polymorphic toxin systems. The domain has characteristic histidine, tyrosine and glutamate residues that comprise the active site.	282
373981	pfam15635	Tox-GHH2	GHH signature containing HNH/Endo VII superfamily nuclease toxin 2. A predicted toxin of the HNH/Endonuclease VII fold present in bacterial polymorphic toxin systems with a characteristic s[AGP]HH signature motif. In bacterial polymorphic toxin systems, the toxin is exported by the type 2 or type secretion system.	112
406145	pfam15636	Tox-GHH	GHH signature containing HNH/Endo VII superfamily nuclease toxin. A predicted toxin of the HNH/Endonuclease VII fold present in bacterial polymorphic toxin systems with a characteristic sG[HQ]H signature motif. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 6, type 7 or TcdB/TcaC-type secretion system. The metazoan teneurin proteins possess an inactive version of this domain at their C-terminus.	78
373983	pfam15637	Tox-HNH-HHH	HNH/Endo VII superfamily nuclease toxin with a HHH motif. A predicted toxin of the HNH/Endonuclease VII fold present in bacterial polymorphic toxin systems with characteristic conserved s[GD]xxR and HHH motifs. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 5, type 6, type 7 or Photorhabdus virulence cassette (PVC)-type secretion system.	103
406146	pfam15638	Tox-MPTase2	Metallopeptidase toxin 2. A zincin-like metallopeptidase domain found in bacterial polymorphic toxin systems.	196
406147	pfam15639	Tox-MPTase3	Metallopeptidase toxin 3. A zincin-like metallopeptidase domain found in bacterial polymorphic toxin systems.	137
406148	pfam15640	Tox-MPTase4	Metallopeptidase toxin 4. A zincin-like metallopeptidase domain found in bacterial polymorphic toxin systems.	132
373985	pfam15641	Tox-MPTase5	Metallopeptidase toxin 5. A zincin-like metallopeptidase domain found in bacterial polymorphic toxin systems.	110
292272	pfam15642	Tox-ODYAM1	Toxin in Odyssella and Amoebophilus. A predicted all-alpha fold toxin present in bacterial polymorphic toxin systems of the endosymbionts Odyssella and Amoebophilus.	385
317949	pfam15643	Tox-PL-2	Papain fold toxin 2. A papain fold toxin domain found in bacterial polymorphic toxin systems.	102
406149	pfam15644	Gln_amidase	Papain fold toxin 1, glutamine deamidase. A papain fold toxin domain found in bacterial polymorphic toxin systems. In these systems they might function either as a releasing peptidase or toxin. In Shigella flexneri, UniProtKB:Q8VSD5, this protein is expressed from a plasmid, and delivered into the host via the type III secretion system where it deamidates the glutamine residue at position 100 in ubiquitin-conjugating enzyme E2, UBC13, to a glutamic acid residue. Invasion of host cells by pathogens normally invokes an acute inflammatory response through activating the TRAF6-mediated signalling pathway. UBC13 helps to activate TRAF6. Thus deamidation of UBC13 results in the dampening of the inflammatory response. The key glutamine deamidase activity is mediated by a cys-his-glu triad, present in all members of the family.	112
406150	pfam15645	Tox-PLDMTX	Dermonecrotoxin of the Papain-like fold. A papain fold toxin domain found in bacterial polymorphic toxin systems.	142
373987	pfam15646	Tox-REase-2	Restriction endonuclease fold toxin 2. A predicted toxin of the restriction endonuclease fold present in bacterial polymorphic toxin systems. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 7 or PrsW-peptidase dependent secretion system.	129
373988	pfam15647	Tox-REase-3	Restriction endonuclease fold toxin 3. A predicted toxin of the restriction endonuclease fold present in bacterial polymorphic toxin systems. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 6, type 7 or PrsW-peptidase dependent secretion system.	102
406151	pfam15648	Tox-REase-5	Restriction endonuclease fold toxin 5. A predicted toxin of the restriction endonuclease fold present in bacterial polymorphic toxin systems. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 5, type 6, or PrsW-peptidase dependent secretion system. Versions of this domain are also found in caudoviruses.	96
373990	pfam15649	Tox-REase-7	Restriction endonuclease fold toxin 7. A predicted toxin of the restriction endonuclease fold present in bacterial polymorphic toxin systems. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 5, type 6, or type 7 secretion system.	86
379747	pfam15650	Tox-REase-9	Restriction endonuclease fold toxin 9. A predicted toxin of the restriction endonuclease fold present in bacterial polymorphic toxin systems. In bacterial polymorphic toxin systems, the toxin is exported by the type 2 or type 7 secretion system.	87
406152	pfam15651	Tox-SGS	Salivary gland secreted protein domain toxin. An alpha+beta fold domain with four conserved cysteine residues and a conserved [DE]xx[ND] motif. This domain is mainly present at the C-terminus of RHS repeats containing proteins in insects and crustaceans. Although no bacterial homologs have been identified, the domain architecture suggests an origin from bacterial polymorphic toxin systems.	96
379748	pfam15652	Tox-SHH	HNH/Endo VII superfamily toxin with a SHH signature. A predicted toxin of the HNH/Endonuclease VII fold present in bacterial polymorphic toxin systems with two conserved histidine residues. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 5, type 6 or type 7 secretion system.	97
406153	pfam15653	Tox-URI2	URI fold toxin 2. A predicted toxin of the URI nuclease fold present in bacterial polymorphic toxin systems. In bacterial polymorphic toxin systems, the toxin is exported by the type 2 or type 6 secretion system.	86
406154	pfam15654	Tox-WTIP	Toxin with a conserved tryptophan and TIP tripeptide motif. A predicted toxin domain with two membrane spanning alpha helices and RxxR, Wx[ST]IP motifs. The domain is present in bacterial polymorphic toxin systems. The toxin is usually exported by the type 2 or Photorhabdus virulence cassette (PVC)-type secretion system.	74
406155	pfam15655	Imm-NTF2	NTF2 fold immunity protein. A predicted immunity protein of the NTF2 fold. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbor of the toxin gene, which usually contains toxin domains of the Tox-JAB-2 family.	129
317959	pfam15656	Tox-HDC	Toxin with a H, D/N and C signature. A predicted alpha/beta fold peptidase domain with a strongly conserved triad of a histidine, aspartate/asparagine and cysteine residues that are predicted to comprise the active site of the predicted peptidase. Proteins bearing this predicted toxin domain are particularly common in both intracellular and extracellular pathogens.	130
373994	pfam15657	Tox-HNH-EHHH	HNH/Endo VII superfamily nuclease toxins. A predicted toxin of the HNH/Endonuclease VII fold present in bacterial polymorphic toxin systems with a characteristic conserved [ED]H motif and two histidine residues. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 5, type 6, type 7 or Photorhabdus virulence cassette (PVC)-type secretion system.	69
317961	pfam15658	Latrotoxin_C	Latrotoxin C-terminal domain. A toxin domain present in arthropod alphaproteobacterial, gammaproteobacterial endosymbionts and also at the C-termini of the latrotoxins of the black widow spider. The domain is characterized by a conserved, hydrophobic helix and is predicted to associate with the cell membrane.	137
406156	pfam15659	Toxin-JAB1	JAB-like toxin 1. 	86
317963	pfam15660	Imm75	Putative Immunity protein 75. This family is highly conserved suggesting it might derive from a phage protein. Members are less than 90 residues in length, and the function is not known.	84
406157	pfam15661	CF222	C6orf222, uncharacterized family. This family of proteins is found in eukaryotes. Proteins in this family are typically between 618 and 652 amino acids in length.	648
317965	pfam15662	SPATA3	Spermatogenesis-associated protein 3 family. The SPATA3 family of proteins is expressed significantly in testis and faintly in epididymis in the ten tissues of testis, ovary, spleen, kidney, lung, heart, brain, epididymis, liver and skeletal muscle in mouse. Members are not expressed in the eight other tissues. This suggests that SPATA3 plays potential roles in spermatogenesis cell apoptosis or spermatogenesis.	191
406158	pfam15663	zf-CCCH_3	Zinc-finger containing family. zf-CCCH_3 family is found in eukaryotes, and is typically between 155 and 169 amino acids in length.	110
406159	pfam15664	TMEM252	Transmembrane protein 252 family. This family of proteins is found in eukaryotes. Proteins in this family are typically between 152 and 182 amino acids in length. The function is not known.	139
406160	pfam15665	FAM184	Family with sequence similarity 184, A and B. The function of FAM184 is not known.	211
406161	pfam15666	HGAL	Germinal center-associated lymphoma. HGAL is a family of mammalian sequences typically between 104 and 179 amino acids in length. Members were discovered in a search for proteins precipitating diffuse large B-cell lymphomas. HGAL interacts with the cytoskeleton and aids the activity of interleukin-6 on cell migration. It also modulates the RhoA signalling pathway.	88
406162	pfam15667	GDWWSH	Protein of unknown function with motif GDWWSH. This family of proteins is found in eukaryotes. Proteins in this family are typically between 135 and 289 amino acids in length. There are three conserved sequence motifs: GDWWSH, RSDF and KRHG.	238
406163	pfam15668	DUF4663	Domain of unknown function (DUF4663). This family of proteins is found in eukaryotes. Proteins in this family are typically between 289 and 334 amino acids in length. There are two completely conserved residues (W and G) that may be functionally important.	335
406164	pfam15669	CCDC24	Coiled-coil domain-containing protein 24 family. This family of proteins is found in eukaryotes. Proteins in this family are typically between 187 and 319 amino acids in length. There are two completely conserved residues (G and P) that may be functionally important.	195
406165	pfam15670	Spem1	Spermatid maturation protein 1. Spem1 is a family of mammalian proteins. Proteins are exclusively expressed in the cytoplasm of the last three steps of spermiogenesis in the mouse testis, and male mice deficient in Spem1 are completely infertile because of deformed sperm.	254
406166	pfam15671	PRR18	Proline-rich protein family 18. This family of proteins is found in eukaryotes. Proteins in this family are typically between 117 and 297 amino acids in length. The function is not known but there are many highly conserved proline residues.	264
406167	pfam15672	Mucin15	Cell-membrane associated Mucin15. Mucin15 is a family of vertebrate mucins associated with the cell-membrane. The function is not known. Members of the family are typically between 284 and 335 amino acids in length.	315
374006	pfam15673	Ciart	Circadian-associated transcriptional repressor. Circadian-associated transcriptional repressor (Ciart or Chrono) is a negative regulatory component of the circadian clock. It functions as a transcriptional repressor, modulating BMAL1-CLOCK activity. It also regulates metabolic pathways such as the glucocorticoid response triggered by behavioral stress.	278
406168	pfam15674	CCDC23	Coiled-coil domain-containing protein 23. This family of proteins is found in eukaryotes. Proteins in this family are typically between 66 and 78 amino acids in length. There are two completely conserved residues (K and E) that may be functionally important.	57
406169	pfam15675	CLLAC	CLLAC-motif containing domain. This short domain is found in chordates. It carries a highly conserved CLLAC sequence motif. The function is not known.	30
406170	pfam15676	S6OS1	Six6 opposite strand transcript 1 family. This family of proteins is found in eukaryotes. Proteins in this family are typically between 114 and 587 amino acids in length. The function is not known.	557
374010	pfam15677	CEND1	Cell cycle exit and neuronal differentiation protein 1. This family of neuron-specific proteins may have a role in the differentiation of neuroblastoma cells and neuronal precursors. It is involved in development of the cerebellum.	143
406171	pfam15678	SPICE	Centriole duplication and mitotic chromosome congression. SPICE is a family of proteins found in chordates. It localizes to spindle microtubules in mitosis and to centrioles throughout the cell cycle. Deletion of SPICE compromises the architecture of spindles, the integrity of the spindle pole and the process of aligning chromosomes on the spindle (chromosome congression).	408
406172	pfam15679	DUF4665	Domain of unknown function (DUF4665). This family of proteins is found in eukaryotes. Proteins in this family are typically between 45 and 100 amino acids in length.	99
406173	pfam15680	OFCC1	Orofacial cleft 1 candidate gene 1 protein. This family of proteins is found in eukaryotes. Proteins in this family are typically between 125 and 276 amino acids in length.	109
406174	pfam15681	LAX	Lymphocyte activation family X. LAX is a family of proteins is found in chordates. LAX is membrane-associates and expressed in B cells, T cells, and other lymphoid-specific cell types. It down-regulates antigen-receptor signalling in T cells by inhibiting TCR-mediated p38 MAPK activation.	350
374015	pfam15682	Mustang	Musculoskeletal, temporally activated-embryonic nuclear protein 1. Mustang is a family of short, approx 80 residue, proteins found in chordates. It localizes to the nucleus and specifically, spatially in mesenchymal cells of the developing limbs and tail as well as in the fracture callus, especially in periosteal osteoprogenitor cells, proliferating chondrocytes, and young active osteoblasts. It is highly expressed during embryogenesis and inactivated in most adult tissues with the exception of skeletal muscle and tendon where is is acutely and differentially expressed during bone regeneration.	75
406175	pfam15683	TDRP	Testis development-related protein. TDRP is a family of proteins found in chordates. It is predominantly expressed in the testis and is distributed in both the cytoplasm and the nuclei of spermatogenic cells. It may act as a nuclear factor with an important role in spermatogenesis.	145
406176	pfam15684	AROS	Active regulator of SIRT1, or 40S ribosomal protein S19-binding 1. AROS is a family of chordate proteins active in the nucleolus. It has a stretch of polylysines at the N-terminus and in the middle regions and it localizes to the nucleus and especially the nucleolus in high concentrations. It binds to the 40S ribosomal protein RPS19, which is implicated in erythropoiesis. AROS is an active regulator of Sirtuin (SIRT1), an NAD+-dependent deacetylase protein that plays a role in cell survival and hormonal signalling, and AROS regulates the activity of SIRT1 by enhancing SIRT1-mediated de-acetylation of p53 and thus regulates growth of the cell.	128
406177	pfam15685	GGN	Gametogenetin. GGN is a family of proteins largely found in mammals. It reacts with POG in the maturation of sperm and is expressed virtually only in the testis. It is found to be associated with the intracellular membrane, binds with GGNBP1 and may be involved in vesicular trafficking.	639
406178	pfam15686	LYRIC	Lysine-rich CEACAM1 co-isolated protein family. LYRIC is a family of proteins found in eukaryotes. It is a type-1b membrane protein with a single transmembrane domain and localizes to the endoplasmic reticulum and the nuclear envelope. It is also found in the nucleolus, suggesting functional relationships between these two cellular compartments. It is found to be colocalized with tight junction proteins ZO-1 and occludin in polarised epithelial cells, suggesting that LYRIC is part of the tight junction complex. LYRIC has been shown to promote tumor cell migration and invasion by activating the transcription factor NF-kappaB.	419
406179	pfam15687	NRIP1_repr_1	Nuclear receptor-interacting protein 1 repression 1. This domain is the first (N-terminal) repression domain of nuclear receptor-interacting protein 1.	308
406180	pfam15688	NRIP1_repr_2	Nuclear receptor-interacting protein 1 repression 2. This domain is the second repression domain of nuclear receptor-interacting protein 1.	331
406181	pfam15689	NRIP1_repr_3	Nuclear receptor-interacting protein 1 repression 3. This domain is the third repression domain of nuclear receptor-interacting protein 1.	88
406182	pfam15690	NRIP1_repr_4	Nuclear receptor-interacting protein 1 repression 4. This domain is the fourth (C-terminal) repression domain of nuclear receptor-interacting protein 1.	311
406183	pfam15691	PPP1R32	Protein phosphatase 1 regulatory subunit 32. PPP1R32 is a family of eukaryotic proteins thought to be involved in the interactome of protein phosphatase-1.	418
406184	pfam15692	NKAP	NF-kappa-B-activating protein. NKAP is a family of eukaryotic proteins that interacts with NF-kappa-B. It is a nuclear regulator of TNF- and IL-1-induced NF-kappa-B activation. NKAP does not interact with RIP in mammalian cells. This family is often found in association with pfam06047.	84
406185	pfam15693	Med26_C	Mediator complex subunit 26 C-terminal. Med26_C is the C-terminal domain of subunit 26 of the Mediator complex in eukaryotes. Med19 and Med26 act synergistically to mediate the interaction between REST (a Kruppel-type zinc finger transcription factor that binds to a 21-bp RE1 silencing element present in over 900 human genes) and Mediator. The C-terminal domain is critical and sufficient for its assembly into Mediator and its interaction with Pol II. The most highly conserved C-terminal amino acids are critical for these interactions because deletion of the last eight amino acids from the Med26 C-terminus disrupted binding to Mediator and Pol II.	182
406186	pfam15694	Med26_M	Mediator complex subunit 26 middle domain. Med26_M is the middle domain of subunit 26 of Mediator. Med19 and Med26 act synergistically to mediate the interaction between REST (a Kruppel-type zinc finger transcription factor that binds to a 21-bp RE1 silencing element present in over 900 human genes) and Mediator.	255
292323	pfam15695	HERV-K_REC	Rec (regulator of expression encoded by corf) of HERV-K-113. REC is a family of rec proteins from the HERV-K viral polyprotein family. Rec is a functional homolog of Rev and Rex, and binds to an RNA element, the Rec-responsive element (RcRE), in the 3'LTR of HTDV/HERV-K transcripts. Thus Rec mediates nuclear export of RNA by binding to its responsive element, RcRE, present in a transcript. The human small glutamine-rich tetratricopeptide repeat-containing protein (hSGT), which controls mitotic processes and is a checkpoint protein during pro-metaphase, is found to be a Rec-interacting partner. Binding of Rec to hSGT interferes with the latter's role as a negative regulator of the androgen receptor, leading to enhanced androgen receptor activity. HERV-K(HML-2) elements benefit from this enhanced activity, as this leads to a vicious cycle that can result in increased cell proliferation, an inhibition of apoptosis, and eventually tumorigenesis.	87
406187	pfam15696	RAD51_interact	RAD51 interacting motif. This motif interacts with RAD51.	39
406188	pfam15697	DUF4666	Domain of unknown function (DUF4666). This family of proteins is found in plants. Proteins in this family are typically between 103 and 140 amino acids in length. There are two conserved sequence motifs: LQRS and FRR.	109
406189	pfam15698	Phosphatase	Phosphatase. Members of this family have phosphatase activity.	256
406190	pfam15699	NPR1_interact	NPR1 interacting. This family of proteins interacts via a motif at the C-terminus with the regulatory protein NPR1.	108
406191	pfam15700	DUF4667	Domain of unknown function (DUF4667). This family of proteins is found in fungi. Proteins in this family are typically between 172 and 313 amino acids in length.	231
406192	pfam15701	DUF4668	Domain of unknown function (DUF4668). This family of proteins is found in eukaryotes. Proteins in this family are typically between 142 and 211 amino acids in length.	162
406193	pfam15702	HPS6	Hermansky-Pudlak syndrome 6 protein. 	778
406194	pfam15703	LAT2	Linker for activation of T-cells family member 2. 	177
406195	pfam15704	Mt_ATP_synt	Mitochondrial ATP synthase subunit. This plant mitochondrial ATP synthase subunit may be the equivalent of the mitochondrial ATP synthase d subunit.	188
406196	pfam15705	TMEM132D_N	Mature oligodendrocyte transmembrane protein, TMEM132D, N-term. TMEM132D_N is the N-terminal family of chordate proteins implicated in panic disorder. TMEM132D is a single-pass transmembrane protein that is highly expressed in the cortical regions of the human and mouse brain. The function is still unknown. It may act as a cell-surface marker for oligodendrocyte differentiation. Additionally, as it may be most strongly expressed in neurons and it colocalizes with actin filaments, TMEM132D may be implicated in neuronal sprouting and connectivity in brain regions important for anxiety-related behaviour.	131
406197	pfam15706	TMEM132D_C	Mature oligodendrocyte transmembrane protein, TMEM132D, C-term. TMEM132D_C is the C-terminal family of chordate proteins implicated in panic disorder. TMEM132D is a single-pass transmembrane protein that is highly expressed in the cortical regions of the human and mouse brain. The function is still unknown. It may act as a cell-surface marker for oligodendrocyte differentiation. Additionally, as it may be most strongly expressed in neurons and it colocalizes with actin filaments, TMEM132D may be implicated in neuronal sprouting and connectivity in brain regions important for anxiety-related behaviour.	84
406198	pfam15707	MCCD1	Mitochondrial coiled-coil domain protein 1. This is a family of uncharacterized proteins known as mitochondrial coiled-coil domain protein 1.	90
406199	pfam15708	PRR20	Proline-rich protein family 20. This family of proteins is found in eukaryotes. Proteins in this family are typically between 73 and 221 amino acids in length. There is a conserved AYV sequence motif.	221
406200	pfam15709	DUF4670	Domain of unknown function (DUF4670). This family of proteins is found in eukaryotes. Proteins in this family are typically between 373 and 763 amino acids in length.	522
406201	pfam15710	DUF4671	Domain of unknown function (DUF4671). This family of proteins is found in eukaryotes. Proteins in this family are typically between 385 and 652 amino acids in length.	678
406202	pfam15711	ILEI	Interleukin-like EMT inducer. ILEI is a family of proteins found in vertebrates. It is heavily involved in the transition from epithelial to mesenchymal tissue (EMT) during embryonic development, cancer progression, metastasis, and chronic inflammation/fibrosis. ILEI is upregulated exclusively at the level of translation, and abnormal ILEI expression, i.e. cytoplasmic over-expression instead of vesicular localization, is associated with EMT in human cancerous tissue. In order to induce and maintain the EMT of hepatocytes in a TGF-beta-independent fashion, ILEI needs the cooperation of oncogenic Ras.	89
406203	pfam15712	NPAT_C	NPAT C-terminus. 	685
406204	pfam15713	PTPRCAP	Protein tyrosine phosphatase receptor type C-associated. 	150
406205	pfam15714	SpoVT_C	Stage V sporulation protein T C-terminal, transcription factor. SpoVT_C is the C-terminal part of the stage V sporulation protein T, a transcription factor involved in endospore formation in Gram-positive bacteria such as Bacillus subtilis. Sporulation is induced by conditions of environmental stress to protect the genome. SpoVT behaves as a tetramer that shows an overall significant distortion mediated by electrostatic interactions. Two monomers dimerize via the highly charged N-terminal AbrB-like domains, family pfam04014, to form swapped-hairpin beta-barrels. These asymmetric dimers then form tetramers through the formation of mixed helix bundles between their C-terminal domains. The C-termini themselves fold as GAF (cGMP-specific and cGMP-stimulated phosphodiesterases, Anabaena adenylate cyclases, and Escherichia coli FhlA) domains.	128
406206	pfam15715	PAF	PCNA-associated factor. 	131
406207	pfam15716	DUF4672	Domain of unknown function (DUF4672). This family of proteins is found in eukaryotes. Proteins in this family are typically between 165 and 199 amino acids in length.	173
406208	pfam15717	PCM1_C	Pericentriolar material 1 C-terminus. 	612
406209	pfam15718	MNR	Protein moonraker. Protein moonraker is a centriolar satellite component involved in centriole duplication. It promotes centriole duplication by localizing WDR62 to the centrosome.	933
406210	pfam15719	DUF4674	Domain of unknown function (DUF4674). This family of proteins is found in eukaryotes. Proteins in this family are typically between 126 and 221 amino acids in length.	191
406211	pfam15720	DUF4675	Domain of unknown function (DUF4675). This family of proteins is found in eukaryotes. Proteins in this family are approximately 190 amino acids in length.	198
406212	pfam15721	ANXA2R	Annexin-2 receptor. This family of proteins acts as annexin-2 receptors.	190
406213	pfam15722	FAM153	FAM153 family. This family of proteins is found in eukaryotes. Proteins in this family are typically between 109 and 289 amino acids in length.	114
406214	pfam15723	MqsR_toxin	Motility quorum-sensing regulator, toxin of MqsA. MqsR_toxin is a family of bacterial toxins that act as an mRNA interferase. MqsR is the gene most highly upregulated in E. coli persister cells and it plays an essential role in biofilm regulation and cell signalling. It forms part of a bacterial toxin-antitoxin TA system, and as expected for a TA system, the expression of the MqsR toxin leads to growth arrest, while co-expression with its antitoxin, MqsA, rescues the growth arrest phenotype. In addition, MqsR associates with MqsA to form a tight, non-toxic complex and both MqsA alone and the MqsR:MqsA2:MqsR complex bind and regulate the mqsR promoter. The structure of MqsR shows that it is a member of the RelE/YoeB family of bacterial RNases that are structurally and functionally characterized bacterial toxins.	96
406215	pfam15724	TMEM119	TMEM119 family. This family of proteins is found in eukaryotes. Proteins in this family are typically between 217 and 283 amino acids in length.	252
292353	pfam15725	RCDG1	Renal cancer differentiation gene 1 protein. This family includes human protein C4orf46, also known as renal cancer differentiation gene 1 protein (RCDG1).	83
292354	pfam15726	DUF4677	Domain of unknown function (DUF4677). This family of proteins is found in eukaryotes. Proteins in this family are typically between 157 and 195 amino acids in length.	198
406216	pfam15727	DUF4678	Domain of unknown function (DUF4678). This family of proteins is found in eukaryotes. Proteins in this family are typically between 318 and 395 amino acids in length.	380
406217	pfam15728	DUF4679	Domain of unknown function (DUF4679). This family of proteins is found in eukaryotes. Proteins in this family are typically between 213 and 412 amino acids in length.	399
406218	pfam15729	ALS2CR11	Amyotrophic lateral sclerosis 2 candidate 11. This family of proteins is found in eukaryotes. Proteins in this family are typically between 286 and 727 amino acids in length.	418
406219	pfam15730	DUF4680	Domain of unknown function (DUF4680). This family of proteins is found in eukaryotes. Proteins in this family are typically between 65 and 178 amino acids in length. There are two conserved sequence motifs: VISRM and ENE.	144
292359	pfam15731	MqsA_antitoxin	Antitoxin component of bacterial toxin-antitoxin system, MqsA. MqsA_antitoxin is a family of prokaryotic proteins that act as antidotes to the mRNA interferase MqsR. It has a zinc-binding domain at the very N-terminus, indicating its DNA-binding capacity. MqsR_toxin is a family of bacterial toxins that act as an mRNA interferase. MqsR is the gene most highly upregulated in E. coli persister cells and it plays an essential role in biofilm regulation and cell signalling. It forms part of a bacterial toxin-antitoxin TA system, and as expected for a TA system, the expression of the MqsR toxin leads to growth arrest, while co-expression with its antitoxin, MqsA, rescues the growth arrest phenotype. In addition, MqsR associates with MqsA to form a tight, non-toxic complex and both MqsA alone and the MqsR:MqsA2:MqsR complex bind and regulate the mqsR promoter. The structure of MqsR shows that it is a member of the RelE/YoeB family of bacterial RNases that are structurally and functionally characterized bacterial toxins.	131
374059	pfam15732	DUF4681	Domain of unknown function (DUF4681). This family of proteins is found in eukaryotes. Proteins in this family are typically between 101 and 127 amino acids in length.	127
406220	pfam15733	DUF4682	Domain of unknown function (DUF4682). This domain family is found in eukaryotes, and is typically between 152 and 183 amino acids in length. The family is found in association with pfam00566. There is a conserved NHLL sequence motif.	122
406221	pfam15734	MIIP	Migration and invasion-inhibitory. This family of proteins binds to insulin-like growth factor binding protein 2 (IGFBP-2) and inhibits the invasion of glioma cells.	337
406222	pfam15735	DUF4683	Domain of unknown function (DUF4683). This domain family is found in eukaryotes, and is typically between 384 and 400 amino acids in length.	391
406223	pfam15736	DUF4684	Domain of unknown function (DUF4684). This family of proteins is found in eukaryotes. Proteins in this family are typically between 531 and 1277 amino acids in length.	365
406224	pfam15737	DUF4685	Domain of unknown function (DUF4685). This domain family is found in eukaryotes, and is typically between 106 and 131 amino acids in length. There are two conserved sequence motifs: SGE and VRF.	117
406225	pfam15738	YafQ_toxin	Bacterial toxin of type II toxin-antitoxin system, YafQ. YafQ is a family of bacterial toxin ribonucleases of type II toxin-antitoxin systems. The E. coli gene is expressed from the dinB operon. The cognate antitoxin for the E. coli protein is DinJ, in family RelB_antitoxin, pfam02604.	88
406226	pfam15739	TSNAXIP1_N	Translin-associated factor X-interacting N-terminus. This domain is found at the N-terminus of translin-associated factor X-interacting protein, a protein which may play a role in spermatogenesis.	104
406227	pfam15740	PPP1R26_N	Protein phosphatase 1 regulatory subunit 26 N-terminus. This domain represents the N-terminus of protein phosphatase 1 regulatory subunit 26.	872
406228	pfam15741	LRIF1	Ligand-dependent nuclear receptor-interacting factor 1. This family of proteins interacts with the retinoic acid receptor RARalpha and inhibits its ligand-dependent transcriptional activation.	739
406229	pfam15742	DUF4686	Domain of unknown function (DUF4686). This family of proteins is found in eukaryotes. Proteins in this family are typically between 498 and 775 amino acids in length. There is a conserved DLK sequence motif.	384
406230	pfam15743	SPATA1_C	Spermatogenesis-associated C-terminus. This domain family is found in eukaryotes, and is approximately 150 amino acids in length. There is a single completely conserved residue E that may be functionally important.	149
406231	pfam15744	UPF0492	Uncharacterized protein family UPF0492. This family of proteins is found in eukaryotes. Proteins in this family are typically between 78 and 408 amino acids in length.	364
406232	pfam15745	AP1AR	AP-1 complex-associated regulatory protein. 	275
374072	pfam15746	TMEM215	TMEM215 family. This family of proteins is found in eukaryotes. Proteins in this family are approximately 230 amino acids in length.	225
374073	pfam15747	DUF4687	Domain of unknown function (DUF4687). This family of proteins is found in eukaryotes. Proteins in this family are typically between 76 and 140 amino acids in length.	120
406233	pfam15748	CCSAP	Centriole, cilia and spindle-associated. This family of microtubule-binding proteins may play a role in embryonic brain development and cilia beating.	255
406234	pfam15749	MRNIP	MRN-interacting protein. This family is found in eukaryotes. Family members include MRN complex-interacting protein (MRNIP), which plays a role in preventing the accumulation of damaged DNA in cells. It associates with the MRE11-RAD50-NBS1 (MRN) damage-sensing complex and is rapidly recruited to sites of DNA damage. Phosphorylation of a serine promotes nuclear localization of MRNIP.	100
406235	pfam15750	UBZ_FAAP20	Ubiquitin-binding zinc-finger. This domain is the ubiquitin-binding zinc-finger of the Fanconi anemia-associated protein of 20 kDa.	35
406236	pfam15751	FANCA_interact	FAAP20 FANCA interaction domain. This domain is found at the N-terminus of Fanconi anemia-associated protein of 20 kDa (FAAP20), where it is responsible for interaction with Fanconi anemia group A protein (FANCA).	108
406237	pfam15752	DUF4688	Domain of unknown function (DUF4688). This family of proteins is found in eukaryotes. Proteins in this family are typically between 331 and 596 amino acids in length.	400
406238	pfam15753	BLOC1S3	Biogenesis of lysosome-related organelles complex 1 subunit 3. Proteins in this family are components of the biogenesis of lysosome-related organelles complex-1 (BLOC-1).	168
406239	pfam15754	SPESP1	Sperm equatorial segment protein 1. 	318
406240	pfam15755	DUF4689	Domain of unknown function (DUF4689). This family of proteins is found in eukaryotes. Proteins in this family are typically between 202 and 224 amino acids in length.	223
406241	pfam15756	DUF4690	Domain of unknown function (DUF4690). This family of proteins is found in eukaryotes. Proteins in this family are typically between 100 and 122 amino acids in length. There are two conserved sequence motifs: LGPGAI and LRKF.	96
406242	pfam15757	Amelotin	Amelotin. This ameloblast-specific family of proteins may play a role in dental enamel formation.	194
406243	pfam15758	HRCT1	Histidine-rich carboxyl terminus protein 1. 	77
406244	pfam15759	TMEM108	TMEM108 family. This family of proteins is found in eukaryotes. Proteins in this family are typically between 258 and 575 amino acids in length.	511
406245	pfam15760	DLEU7	Leukemia-associated protein 7. 	194
406246	pfam15761	IMUP	Immortalisation up-regulated protein. This family of proteins is found in eukaryotes. Proteins in this family are approximately 100 amino acids in length. There are two conserved sequence motifs: GDPK and KKPK.	101
406247	pfam15762	DUF4691	Domain of unknown function (DUF4691). This family of proteins is found in eukaryotes. Proteins in this family are typically between 71 and 317 amino acids in length.	179
406248	pfam15763	DUF4692	Domain of unknown function (DUF4692). This family of proteins is found in eukaryotes. Proteins in this family are approximately 170 amino acids in length.	167
406249	pfam15764	DUF4693	Domain of unknown function (DUF4693). This family of proteins is found in eukaryotes. Proteins in this family are typically between 238 and 436 amino acids in length.	284
406250	pfam15765	DUF4694	Domain of unknown function (DUF4694). This family of proteins is found in eukaryotes. Proteins in this family are typically between 154 and 217 amino acids in length. There is a conserved SSGY sequence motif.	155
374090	pfam15766	DUF4695	Domain of unknown function (DUF4695). This family of proteins is found in eukaryotes. Proteins in this family are typically between 109 and 206 amino acids in length. There is a conserved RFKTQP sequence motif.	107
406251	pfam15767	DUF4696	Domain of unknown function (DUF4696). This family of proteins is found in eukaryotes. Proteins in this family are typically between 599 and 780 amino acids in length. There is a conserved AFP sequence motif.	583
406252	pfam15768	CC190	Coiled-coil domain-containing protein 190. This family of proteins is found in eukaryotes. Proteins in this family are typically between 234 and 297 amino acids in length.	269
406253	pfam15769	DUF4698	Domain of unknown function (DUF4698). This family of proteins is found in eukaryotes. Proteins in this family are typically between 464 and 550 amino acids in length.	488
406254	pfam15770	DUF4699	Domain of unknown function (DUF4699). This family of proteins is found in eukaryotes. Proteins in this family are typically between 303 and 319 amino acids in length.	310
406255	pfam15771	IHO1	Interactor of HORMAD1 protein 1. Interactor of HORMAD1 protein 1 (IHO1, previously known as coiled-coil domain-containing protein 36 or DUF4700) is required for DNA double-strand breaks (DSBs) formation in unsynapsed regions during meiotic recombination. It is thought to function, in collaboration with SPO11-auxiliary proteins MEI4 and REC114, through the formation of DSB-promoting recombinosomes on chromatin at the onset of meiosis.	576
406256	pfam15772	UPF0688	UPF0688 family. This family of proteins is found in eukaryotes. Proteins in this family are typically between 176 and 243 amino acids in length.	232
406257	pfam15773	DUF4701	Domain of unknown function (DUF4701). This family of proteins is found in eukaryotes. Proteins in this family are typically between 111 and 520 amino acids in length.	502
406258	pfam15774	DUF4702	Domain of unknown function (DUF4702). This family of proteins is found in eukaryotes. Proteins in this family are typically between 346 and 637 amino acids in length.	399
406259	pfam15775	DUF4703	Domain of unknown function (DUF4703). This family of proteins is found in eukaryotes. Proteins in this family are typically between 149 and 210 amino acids in length.	186
374100	pfam15776	PRR22	Proline-rich protein family 22. This family of proteins is found in eukaryotes. Proteins in this family are typically between 217 and 420 amino acids in length.	366
406260	pfam15777	Anti-TRAP	Tryptophan RNA-binding attenuator protein inhibitory protein. 	59
406261	pfam15778	UNC80	Cation channel complex component UNC80. UNC80 is a family of proteins found in eukaryotes, and is typically between 193 and 224 amino acids in length. NALCN and UNC80 form a complex in mouse brain, both being tyrosine-phosphorylated; this phosphorylation can be inhibited by PP1. NALCN is the cation channel activated by the substance P receptor, and the coupling from receptor to channel is facilitated by UNC80 and Src kinases rather than by a G-protein.	187
406262	pfam15779	LRRC37	Leucine-rich repeat-containing protein 37 family. This domain family is found in eukaryotes, and is approximately 70 amino acids in length. The function of this protein is unknown but it is likely to be upregulated by androgen.	73
406263	pfam15780	ASH	Abnormal spindle-like microcephaly-assoc'd, ASPM-SPD-2-Hydin. The ASH domain, or N-terminal domain of abnormal spindle-like microcephaly-associated protein, is found in proteins associated with cilia, flagella, the centrosome and the Golgi complex. The domain is also found in Hydin and OCRL, whose deficiencies are associated with hydrocephalus and Lowe oculocerebrorenal syndrome (OCRL), respectively. The fact that human ASPM protein carries an ASH domain indicates possible roles for ASPM in sperm flagella or in the cilia of ependymal cells. The presence of ASH in centrosomal and ciliary proteins indicates that ASPM may possess roles not only in mitotic spindle regulation, but also in ciliary and flagellar function.	98
406264	pfam15781	ParE-like_toxin	ParE-like toxin of type II bacterial toxin-antitoxin system. 	87
406265	pfam15782	GREB1	Gene regulated by oestrogen in breast cancer. GREB1 (gene regulated by estrogen in breast cancer 1) was first identified as an oestrogen-regulated gene expressed in breast cancer. Its exact function is not known but its expression is regulated by the coordinated binding of oestrogen-receptors to distal sites interacting with Pol II to activate gene transcription from core promoters located at a considerable distance from the greb1 gene.	1925
406266	pfam15783	FSIP2	Fibrous sheath-interacting protein 2. FSIP2, fibrous sheath-interacting protein 2, is the C-terminal portion of a family of proteins found in mammals. The function is not known but the domain appears to be repeated up to 10 times in some members.	876
406267	pfam15784	GPS2_interact	G-protein pathway suppressor 2-interacting domain. GPS2_interact is the more N-terminal domain of two co-repressor protein-families found in vertebrates. The domain is found in NCoR and SMRT proteins; N-CoR (nuclear receptor co-repressor) and SMRT (silencing mediator for retinoid and thyroid receptors) are related corepressors that mediate transcriptional repression by unliganded nuclear receptors and other classes of transcriptional repressors. GPS2 is a stoichiometric subunit of the N-CoR-HDAC3 complex. GPS2 links the complex to membrane receptor-related intracellular JNK (c-Jun amino-terminal kinase) signalling pathways.	89
406268	pfam15785	SMG1	Serine/threonine-protein kinase smg-1. SMG1 is a family of eukaryotic proteins. In humans this family acts as an mRNA-surveillance protein. In C. elegans, SMG1, a phosphatidylinositol kinase-related protein kinase, is a key regulator of growth. Loss of SMG1 leads to hyperactive responses to injury and subsequent growth that continues out of control. It has an antagonistic role to mTOR signalling in these worms and possibly also in higher eukaryotes.	613
406269	pfam15786	PET117	PET assembly of cytochrome c oxidase, mitochondrial. PET117 is a family of eukaryotic proteins found from fungi and plants to human. It is likely to be involved in the assembly of cytochrome C oxidase, and is found in the mitochondrion.	66
406270	pfam15787	DUF4704	Domain of unknown function (DUF4704). This domain of unknown function is found in eukaryotes on neurobeachin proteins.	262
406271	pfam15788	DUF4705	Domain of unknown function (DUF4705). DUF4705 is a family of repeated domains that is found in eukaryotes. It can occur up to 10 times in any one sequence. The repeat is rich in glycine and proline residues.	52
374113	pfam15789	Hyr1	Hyphally regulated cell wall GPI-anchored protein 1. The Hyr1 family is a repeated domain found up to 39 times in a range of fungal and vertebrate proteins. Hyr1 is a hypha-specific protein.	41
406272	pfam15790	EP400_N	E1A-binding protein p400, N-terminal. EP400_N is a family of eukaryote proteins. The exact function of this domain is not known. This family is largely low-complexity residues.	490
374115	pfam15791	DMRT-like	Doublesex-and mab-3-related transcription factor C1 and C2. DMRT-like is a C-terminal domain found on eukaryotic proteins for doublesex-and mab-3-related transcription factors C1 and C2. This is not the DM DNA-binding region. The family is all disorder and low-complexity.	119
406273	pfam15792	LAS2	Lung adenoma susceptibility protein 2. LAS2 is a family of eukaryotic proteins. Deletion of LAS2 is observed in approx. 40% of human lung adenocarcinomas, suggesting that loss of function of LAS2 may be a key step for promoting lung tumorigenesis.	75
406274	pfam15793	FAM35_C	Protein family FAM35, C-terminal. FAM35_C is a family of proteins found in eukaryotes. The function is not known.	174
406275	pfam15794	CCDC106	Coiled-coil domain-containing protein 106. CCDC106, coiled-coil domain-containing protein 106, is a family of eukaryote proteins. Yeast two-hybrid screening has identified CCDC106 as a p53-interacting partner. CCDC106 is a negative regulator of p53 and may be involved in tumorigenesis in some cancers by promoting the degradation of p53 protein and inhibiting its transactivity.	223
406276	pfam15795	Spec3	Ectodermal ciliogenesis protein. Spec3 is a family of eukaryotic membrane proteins. In the sea urchin, Spec3 is expressed predominantly during ectodermal ciliogenesis.	85
406277	pfam15796	KELK	KELK-motif containing domain of MRCK Ser/Thr protein kinase. KELK is a domain of eukaryotic proteins found in serine/threonine-protein kinase MRCK-type proteins. The region is low-complexity, but it is not a predicted disordered-binding domain. The name comes from a highly conserved sequence motif within the domain. The function is not known.	79
406278	pfam15797	DUF4706	Domain of unknown function (DUF4706). This domain family is found in eukaryotes, and is approximately 110 amino acids in length.	103
406279	pfam15798	PRAS	Proline-rich AKT1 substrate 1. This domain family is found in eukaryotes, and is typically between 117 and 132 amino acids in length. It is a proline-rich family that can be phosphorylated by AKT, and in the phosphorylated state binds to 14-3-3. The AKT signalling pathway contributes to regulation of apoptosis after a variety of cell death stimuli, and PRAS is found to be a substrate. PRAS plays an important role in regulating cell survival downstream of the PI3-K/Akt pathway after re-perfusion injury after transient focal cerebral ischemia. Copper/zinc-SOD (SOD1), a cytosolic isoenzyme of superoxide dismutase, SOD, is highly protective against ischemia and re-perfusion injury after transient focal cerebral ischemia, and SOD1 thus contributes to the inhibition of direct oxidation of PRAS and the activation of its signalling pathway. PRAS is also an mTOR binding partner, and PRAS phosphorylation by AKT and its association with 14-3-3, a cytosolic anchor protein, are crucial for insulin to stimulate mTOR (mammalian target of rapamycin).	123
406280	pfam15799	CCD48	Coiled-coil domain-containing protein 48. This family of proteins is found in eukaryotes. Proteins in this family are typically between 161 and 575 amino acids in length.	579
406281	pfam15800	CiPC	Clock interacting protein circadian. CiPC is a family of proteins found in eukaryotes. The protein was identified in sheep as a gene-orthologue involved in regulation of the circadian clock. Proteins in this family are typically between 220 and 400 amino acids in length.	329
406282	pfam15801	zf-C6H2	zf-MYND-like zinc finger, mRNA-binding. zf-C6H2 is an unusual zinc-finger similar to zf-MYND, pfam01753. This zinc-finger is found at the N-terminus of Pfam families Exo_endo_phos pfam03372 and Peptidase_M24 pfam00557. The domain is missing in prokaryotic methionine aminopeptidases, and is a unique type of zinc-finger domain. It consists of a C2-C2 zinc-finger motif similar to the RING finger family followed by a C2H2 motif similar to zinc-fingers involved in RNA-binding. In yeast the domain chelates zinc in a 2:1 ratio. The domain is found in yeast, plants and mammals. The domain is necessary for the association of the methionine aminopeptidase with the ribosome and the normal processing of the peptidase.	46
406283	pfam15802	DCAF17	DDB1- and CUL4-associated factor 17. DCAF17, DDB1- and CUL4-associated factor 17, is a family of proteins found in eukaryotes. It may function as a substrate-receptor for CUL4-DDB1 E3 ubiquitin-protein ligase complex. Mutations in the human protein, otherwise known as C2orf37, are responsible for Woodhouse-Sakati Syndrome. Woodhouse-Sakati Syndrome is a rare autosomal recessive multi-systemic disorder characterized by hypogonadism, alopecia, diabetes mellitus, mental retardation, and extrapyramidal syndrome.	474
406284	pfam15803	zf-SCNM1	Zinc-finger of sodium channel modifier 1. zf-SCNM1 is a C2H2 type zinc-finger conserved in eukaryotes found at the N-terminus of SCNM1, sodium channel modifier protein 1. Phylogenetic analysis of these zinc finger sequences places SCNM1 within the U1C subfamily of RNA binding proteins that is commonly found in RNA-processing proteins, suggesting that SCNM1 is involved in splicing activities.	27
406285	pfam15804	CCDC168_N	Coiled-coil domain-containing protein 168. CCDC168_N is the N-terminal region of eukaryotic coiled-coil proteins 168 family. There are up to 17, on average 6, copies of this repeat in most members.	205
406286	pfam15805	SCNM1_acidic	Acidic C-terminal region of sodium channel modifier 1 SCNM1. SCNM1_acidic is the C-terminal acidic region of eukaryotic sodium channel modifier protein 1. Deletion of this region affects the splicing and normal activity of the sodium channel Nav1.6 from gene Scn8a. SCNM1 sits within the U1C subfamily of RNA binding proteins that is commonly found in RNA-processing proteins, suggesting that SCNM1 is involved in splicing activities. SCNM1 and LUC7L2 associate with the mammalian spliceosomal subunit U1 snRNP.	47
406287	pfam15806	DUF4707	Domain of unknown function (DUF4707). This family of proteins is found in eukaryotes. The function is not known.	438
406288	pfam15807	MAP17	Membrane-associated protein 117 kDa, PDZK1-interacting protein 1. MAP17 is a family of proteins found in eukaryotes. It is a small non-glycosylated two-pass membrane protein, that is overexpressed in many tumors of different origins, including carcinomas.	117
406289	pfam15808	BCOR	BCL-6 co-repressor, non-ankyrin-repeat region. BCOR is a domain family found in eukaryotes, and is approximately 220 amino acids in length. This domain lies just upstream of the ankyrin-repeat region at the C-terminus of BCL-6 co-repressor proteins. The function of this region is not known.	218
406290	pfam15809	STG	Simian taste bud-specific gene product family. STG was first isolated from rhesus monkey taste buds. The exact function of STG is not known, but it has been implicated in follicular lymphomas, though not with psoriasis at least in a Swedish population despite lying close to the PSOR1 gene-locus.	240
406291	pfam15810	CCDC117	Coiled-coil domain-containing protein 117. CCDC117 is a family of coiled-coil proteins found in eukaryotes. Proteins in this family are typically between 203 and 279 amino acids in length. There is a conserved MELV sequence motif. The function is not known.	142
374135	pfam15811	SVIP	Small VCP/p97-interacting protein. SVIP, small VCP/p97-interacting protein, is a family of proteins found in eukaryotes. SVIP was identified by yeast two-hybrid screening to be an interactive partner of VCP/p97. Mammalian VCP/p97 and its yeast counterpart Cdc48p participate in the formation of organelles, including the endoplasmic reticulum (ER), Golgi apparatus, and nuclear envelope. Over-expression of SVIP caused the formation of large vacuoles that seemed to be derived from the ER. The family has two putative coiled-coil regions and contains proteins of approximately 80 amino acids in length.	77
374136	pfam15812	MREG	Melanoregulin. Melanoregulin is a family of proteins found in eukaryotes. It is a putative membrane fusion regulator. MREG forms a complex with peripherin-2. It is required for lysosome maturation and plays a role in intracellular trafficking. It is a negative regulator of melanosome intercellular transfer and it regulates intercellular melanosome transfer through palmitoylation.	148
406292	pfam15813	DUF4708	Domain of unknown function (DUF4708). This family of proteins is found in eukaryotes.	274
406293	pfam15814	FAM199X	Protein family FAM199X. This family of proteins is found in eukaryotes. The function of FAM199X is not known.	320
406294	pfam15815	MKRN1_C	E3 ubiquitin-protein ligase makorin-1, C-terminal. MKRN1_C is the very C-terminus of E3 ubiquitin-protein ligase makorin-1, or MKRN1, a family of eukaryotic putative ribonucleoproteins with a distinctive array of zinc-finger motifs. MKRN1 plays an important role in modulating the homeostasis of telomere-length through a dynamic balance involving the stability of the protein hTERT. MKRN1 has been shown to be a transcriptional co-regulator and an E3 ligase. It functions simultaneously as a differentially negative regulator of p53 and p21, preferentially leading cells to p53-dependent apoptosis by suppressing p21. The exact function of the C-terminal region has not been determined.	87
406295	pfam15816	TMEM82	Transmembrane protein 82. TMEM82 is a family of proteins found in eukaryotes. The function is not known.	298
374141	pfam15817	TMEM40	Transmembrane protein 40 family. TMEM40 is a family of eukaryotic membrane proteins.	120
406296	pfam15818	CCDC73	Coiled-coil domain-containing protein 73 family. CCDC73 is a family of eukaryotic coiled-coil containing proteins. The function is not known. The alternative name is sarcoma antigen NY-SAR-79.	1050
406297	pfam15819	Fibin	Fin bud initiation factor homolog. Fibin is a family of eukaryotic proteins expressed in the lateral plate mesoderm of presumptive pectoral fin bud regions. It acts as a signal molecule for the expression of Tbx5, a gene involved in the specification of fore-limb identity. Fibin is found to be expressed in cerebellum, skeletal muscle and many other embryonic as well as adult mouse tissues, suggesting roles in both embryogenesis and in adult life. Although Fibin is routed through the endoplasmic reticulum (ER) no significant evidence for secretion is found. Fibin is post-translationally modified and forms dimers when expressed heterologously and its expression is regulated by a number of cellular signalling pathways.	189
292448	pfam15820	ECSCR	Endothelial cell-specific chemotaxis regulator. ECSCR, endothelial cell-specific chemotaxis regulator, is a family of proteins found in eukaryotes. It is also known as ARIA for apoptosis regulator through modulating IAP expression. It is a cell surface protein that regulates endothelial chemotaxis and tube formation, and interacts with filamin A. Filamin A anchors transmembrane proteins to the actin cytoskeleton becoming a scaffold for various signalling proteins. ECSCR is also known to interact with and regulate the function of several endothelial transmembrane molecules. It has been shown to play a role in angiogenesis, a complex process involving the migration, proliferation, and lumen formation of blood vessels by endothelial cells. ECSCR appears also to regulate endothelial apoptosis, probably through modulating proteasomal degradation of cIAP-1 and cIAP-2 in endothelial cells.	104
406298	pfam15821	DUF4709	Domain of unknown function (DUF4709). This domain family is found in eukaryotes, and is approximately 110 amino acids in length. There is a conserved QQL sequence motif.	109
318115	pfam15822	MISS	MAPK-interacting and spindle-stabilizing protein-like. MISS is a family of eukaryotic MAPK-interacting and spindle-stabilizing protein-like proteins. MISS is rich in prolines and has four potential MAPK-phosphorylation sites, a MAPK-docking site, a PEST sequence (PEST motif) and a bipartite nuclear localization signal. The endogenous protein accumulates during mouse meiotic maturation and is found as discrete dots on the MII spindle. MISS is the first example of a physiological MAPK-substrate that is stabilized in MII that specifically regulates MII spindle integrity during the CSF arrest.	238
406299	pfam15823	UPF0524	UPF0524 of C3orf70. UPF0524 is a family of proteins found in eukaryotes. Proteins in this family are typically between 183 and 250 amino acids in length. The function is not known.	239
374146	pfam15824	SPATA9	Spermatogenesis-associated protein 9. SPATA9, spermatogenesis-associated protein 9, or testis development protein NYD-SP16, is a family of eukaryotic proteins associated with sperm production. It is highly expressed in human testis and contains one transmembrane domain. Its localization indicates it is likely to play an important role in testicular development and spermatogenesis and may be an important factor in male infertility.	253
374147	pfam15825	FAM25	FAM25 family. FAM25 is a family of proteins found in eukaryotes. Proteins in this family are typically between 54 and 95 amino acids in length. There is a conserved GEK sequence motif. The function is not known.	65
374148	pfam15826	PUMA	Bcl-2-binding component 3, p53 upregulated modulator of apoptosis. PUMA (p53 upregulated modulator of apoptosis) is a family of eukaryotic proteins that are a target for activation by p53. The proteins contain BH3 domains and are induced in cells after p53 activation. They bind to Bcl-2, localize to the mitochondria to induce cytochrome c release, and activate the rapid induction of apoptosis.	189
406300	pfam15827	UPF0730	UPF0730 unknown protein family. UPF0730 is a family of proteins found in eukaryotes. Proteins in this family are typically between 51 and 156 amino acids in length.	46
406301	pfam15828	DUF4710	Domain of unknown function (DUF4710). This family of proteins is found in eukaryotes. Proteins in this family are typically between 60 and 150 amino acids in length.	75
406302	pfam15829	DUF4711	Domain of unknown function (DUF4711). This family of proteins is found in eukaryotes. Proteins in this family are typically between 130 and 288 amino acids in length.	217
406303	pfam15830	DUF4712	Domain of unknown function (DUF4712). This family of proteins is found in eukaryotes. Proteins in this family are typically between 133 and 267 amino acids in length.	250
406304	pfam15831	DUF4713	Domain of unknown function (DUF4713). This family of proteins is found in eukaryotes. Proteins in this family are typically between 68 and 91 amino acids in length. Members are single-pass membrane proteins.	56
406305	pfam15832	FAM27	FAM27 D and E protein family. FAM27 is a family of proteins found in eukaryotes. Proteins in this family are typically between 57 and 131 amino acids in length.	92
406306	pfam15833	DUF4714	Domain of unknown function (DUF4714). This family of proteins is found in eukaryotes. Proteins in this family are typically between 143 and 164 amino acids in length.	149
406307	pfam15834	THEG4	Testis highly expressed protein 4. THEG4, testis highly expressed protein 4, is a family of proteins found in eukaryotes. Proteins in this family are typically between 152 and 232 amino acids in length.	201
374157	pfam15835	DUF4715	Domain of unknown function (DUF4715). This family of proteins is found in eukaryotes. Proteins in this family are approximately 150 amino acids in length. The proteins are described as coiled-coil domain-containing protein ENSP00000299415-like.	139
406308	pfam15836	SSTK-IP	SSTK-interacting protein, TSSK6-activating co-chaperone protein. SSTK-IP, SSTK-interacting protein or TSSK6-activating co-chaperone, is a family of proteins found in eukaryotes. SSTK-IP directly binds to HSP70, is found associated with HSP70 and HSP90 in cells, and facilitates HSP90-dependent enzymatic activation of SSTK. SSTK is a small serine/threonine kinase expressed post-meiotically and essential for male fertility along with two other serine threonine kinases. SSTK is one of the smallest protein kinases, consisting only of N- and C-lobes of a kinase catalytic domain, and forms stable associations with heat shock protein (HSP) 70 and 90. SSTK-IP, its interacting protein, thus represents the first germ cell-specific co-chaperone and protein kinase that requires the HSP90 machinery for catalytic activation.	125
406309	pfam15837	DUF4716	Domain of unknown function (DUF4716). This domain family is found in eukaryotes, and is approximately 60 amino acids in length.	60
406310	pfam15838	DUF4717	Domain of unknown function (DUF4717). This family of proteins is found in eukaryotes. Proteins in this family are typically between 103 and 139 amino acids in length. There are two conserved sequence motifs: LLLL and CFNLAS.	72
406311	pfam15839	TEX29	Testis-expressed sequence 29 protein. TEX29, testis-expressed sequence 29 protein, is a family of proteins found in eukaryotes. Proteins in this family are typically between 39 and 150 amino acids in length.	69
406312	pfam15840	ARL17	ADP-ribosylation factor-like protein 17. ARL17 is a family of proteins found in primates. Proteins in this family are typically between 82 and 130 amino acids in length. Members of this family are also referred to as NBR2 or neighbor of BRCA1 gene 2.	61
374162	pfam15841	TMEM239	Transmembrane protein 239 family. This family of proteins is found in primates. Proteins in this family are typically between 152 and 198 amino acids in length.	155
374163	pfam15842	DUF4718	Domain of unknown function (DUF4718). This family of proteins is found in eukaryotes. Proteins in this family are typically between 130 and 224 amino acids in length.	183
374164	pfam15843	DUF4719	Domain of unknown function (DUF4719). This family of proteins is found in eukaryotes. Proteins in this family are typically between 67 and 240 amino acids in length.	207
374165	pfam15844	TMCCDC2	Transmembrane and coiled-coil domain-containing protein 2. This family of proteins is found in primates. Proteins in this family are approximately 180 amino acids in length.	171
406313	pfam15845	NICE-1	Cysteine-rich C-terminal 1 family. NICE-1 is a family of proteins found in primates. Proteins in this family are typically between 51 and 105 amino acids in length.	89
374167	pfam15846	DUF4720	Domain of unknown function (DUF4720). This family of proteins is found in vertebrates. Proteins in this family are typically between 101 and 117 amino acids in length.	94
406314	pfam15847	Loricrin	Major keratinocyte cell envelope protein. Loricrin is a family of major keratinocyte cell envelope proteins found in primates. It acts as an important epidermal barrier, and is initially expressed in the granular layer comprising 70% of the total protein mass of the cornified layer. Expression of Loricrin is regulated by TNF-alpha via a c-Jun N-terminal kinase-dependent pathway.	312
374169	pfam15848	DUF4721	Domain of unknown function (DUF4721). This domain family is found in primates.	107
292477	pfam15849	DUF4722	Domain of unknown function (DUF4722). This family of proteins is found in vertebrates. Proteins in this family are typically between 86 and 203 amino acids in length.	167
406315	pfam15851	DUF4723	Domain of unknown function (DUF4723). This family of proteins is found in mammals. There are a number of conserved cysteines but it is unlikely to be a zinc-finger family.	81
406316	pfam15852	DUF4724	Domain of unknown function (DUF4724). This family of proteins is found in mammals. There is a conserved KVKPL sequence motif.	93
406317	pfam15854	DUF4725	Domain of unknown function (DUF4725). This family of proteins is found in vertebrates. Proteins in this family are approximately 80 amino acids in length.	80
318139	pfam15855	DUF4726	Domain of unknown function (DUF4726). This family of proteins is found in vertebrates. Proteins in this family are typically between 40 and 110 amino acids in length.	101
406318	pfam15856	DUF4727	Domain of unknown function (DUF4727). This family of proteins is found in vertebrates. There are a number of conserved cysteines, but the domain is not a zinc-finger.	216
406319	pfam15858	LCE6A	Late cornified envelope protein 6A family. LCE6A is a family of proteins found in mammals. It was identified in a large-scale screening experiment as being involved in the barrier function of the epidermis.	81
406320	pfam15859	DEC1	Deleted in esophageal cancer 1 family. DEC1 is a family of proteins found in primates. The protein has been identified as being deleted in oesophageal cancers so is also referred to as candidate tumor suppressor CTS9. Proteins in this family are approximately 70 amino acids in length.	70
374176	pfam15860	DUF4728	Domain of unknown function (DUF4728). This family of arthropod proteins is functionally uncharacterized.	91
406321	pfam15861	partial_CstF	Partial cleavage stimulation factor domain. Partial_CstF domain is a protein domain that occurs in proteins from apicomplexan parasites. Currently (as of 2012), little is known about the function of this domain. However, it is homologous to the amino-terminal part of the cleavage stimulation factor, which is thought to be involved with mRNA maturation in mammals.	62
406322	pfam15862	Coilin_N	Coilin N-terminus. 	138
406323	pfam15863	EELM2	Extended EGL-27 and MTA1 homology domain. EELM2, the extended EGL-27 and MTA1 homology domain is a protein domain that occurs in proteins from apicomplexan parasites. Part of the EELM2 domain is homologous to the ELM2 domain, but is 'extended' in that its boundaries (the region of conservation) are longer than in the ELM2 domain. Currently (as of 2012), little is known about the function of this domain. However, some proteins that contain an EELM2 domain also contain a PHD finger domain, which is thought to be involved in chromatin remodelling. This suggests an associated role for the EELM2 domain.	170
406324	pfam15864	PglL_A	Protein glycosylation ligase. PglL_A is a pilin glycosylation ligase domain found in Gram negative bacteria. PglL protein O-oligosaccharyltransferases differ from the wider Wzy_C family, pfam04932, which contains both WaaL O-antigen ligases, in its substrate-specificity. PglL O-oligosaccharyltransferases (O-OTase) transfer oligosaccharide to serine or threonine in a protein. A further indication that the genes identified are PglL rather than WaaL homologs is that they are not located within lipopolysaccharide biosynthetic loci. The specific pilin glycosylation ligases are a subset of the more general bacterial protein O-oligosaccharyltransferases.	26
406325	pfam15865	Fanconi_A_N	Fanconi anaemia group A protein N-terminus. 	333
406326	pfam15866	DUF4729	Domain of unknown function (DUF4729). This family of proteins is functionally uncharacterized. This family of proteins is found in insects. Proteins in this family are typically between 238 and 666 amino acids in length.	208
406327	pfam15867	Dynein_attach_N	Dynein attachment factor N-terminus. This family represents the N-terminus of a dynein arm attachment factor which is required for dynein arm assembly and cilia motility.	68
406328	pfam15868	MBF2	Transcription activator MBF2. MBF2 activates transcription via its interaction with TFIIA. In Bombyx mori, it has been found to form a complex with MBF1 and the DNA-binding regulator FTZ-F1.	90
406329	pfam15869	TolB_like	TolB-like 6-blade propeller-like. 	295
406330	pfam15870	EloA-BP1	ElonginA binding-protein 1. This domain family is found in eukaryotes, and is typically between 144 and 167 amino acids in length.	162
406331	pfam15871	JMY	Junction-mediating and -regulatory protein. JMY, Junction-mediating and -regulatory protein is also a WASP homolog-associated protein with actin, membranes and microtubules. This middle region is the coiled-coil region that putatively binds microtubules to the scaffold. This ability to interact with microtubules plays a role in membrane tubulation.	298
318153	pfam15872	SRTM1	Serine-rich and transmembrane domain-containing protein 1. This family of proteins is found in eukaryotes. Proteins in this family are approximately 100 amino acids in length.	103
292498	pfam15873	DUF4730	Domain of unknown function (DUF4730). This family of proteins is found in eukaryotes. Proteins in this family are approximately 60 amino acids in length.	55
406332	pfam15874	Il2rg	Putative Interleukin 2 receptor, gamma chain. This family of proteins is found in eukaryotes. Proteins in this family are typically between 137 and 197 amino acids in length.	92
406333	pfam15875	DUF4731	Domain of unknown function (DUF4731). This family of proteins is found in eukaryotes. Proteins in this family are typically between 37 and 78 amino acids in length.	75
374189	pfam15876	DUF4732	Domain of unknown function (DUF4732). This family of proteins is found in eukaryotes. Proteins in this family are typically between 107 and 201 amino acids in length.	112
406334	pfam15877	TMEM232	Transmembrane protein family 232. This family of proteins is found in eukaryotes. The function is not known.	452
406335	pfam15878	DUF4733	Domain of unknown function (DUF4733). This family of proteins is found in eukaryotes. Proteins in this family are typically between 73 and 99 amino acids in length.	91
406336	pfam15879	MWFE	NADH-ubiquinone oxidoreductase MWFE subunit. MWFE is a short subunit of NADH-ubiquinone oxidoreductase found in eukaryotes. It is necessary for the activity of NADH-ubiquinone oxidoreductase complex I in mitochondria. This subunit is essential for the assembly and function of the enzyme. MWFE is found to be phosphorylated, e.g. in rat heart mitochondria. The short family includes much of a signal peptide.	55
406337	pfam15880	NDUFV3	NADH dehydrogenase [ubiquinone] flavoprotein 3, mitochondrial. 	35
406338	pfam15881	DUF4734	Domain of unknown function (DUF4734). This domain family is found in species of Drosophila, and is approximately 90 amino acids in length. The family is found in association with pfam07707.	91
406339	pfam15882	DUF4735	Domain of unknown function (DUF4735). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 288 and 410 amino acids in length. There are two completely conserved C residues that may be functionally important. In mammals this protein family is thyroid-specific.	290
374196	pfam15883	DUF4736	Domain of unknown function (DUF4736). This family of proteins is functionally uncharacterized. This family of proteins is found in insects. Proteins in this family are typically between 186 and 228 amino acids in length.	186
406340	pfam15884	QIL1	Protein QIL1. This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 111 and 169 amino acids in length.	77
406341	pfam15886	CBM39	Carbohydrate binding domain (family 32). This domain is found at the N-terminus of beta-1,3-glucan-binding proteins involved in recognition of invading micro-organisms. It often co-occurs with pfam00722 (Glycosyl hydrolases family 16). It recognizes and binds to a triple-helical beta-1,3-glucan structure.	107
406342	pfam15887	Peptidase_Mx	Putative zinc-binding metallo-peptidase. This family has a highly conserved HHExxH motif with a highly conserved ED pairing downstream. HExxH is indicative of a zinc-binding metallo-peptidase.	240
374199	pfam15888	FOG_N	Folded gastrulation N-terminus. This is the N-terminal domain of the folded gastrulation protein. Folded gastrulation is required for morphogenic movements during gastrulation and nervous system development. It may act as a secreted signal and activate the G protein alpha subunit. This domain may be the G protein-coupled receptor ligand.	112
406343	pfam15889	DUF4738	Domain of unknown function (DUF4738). Family of uncharacterized proteins found in CFB group of bacteria, mostly from Bacteroides and Prevotella genera present in human gut and oral cavity, respectively. JCSG target SP13584B, the experimentally determined structure consists of two WD40-like beta sheet repeats forming a beta sandwich.	134
406344	pfam15890	Peptidase_Mx1	Putative zinc-binding metallo-peptidase. This family is a putative zinc-binding metallo-peptidase. There are two highly conserved motifs, HHExxH and ED. HExxH with ED is indicative of zinc-binding metallo-peptidases.	238
406345	pfam15891	Nuc_deoxyri_tr2	Nucleoside 2-deoxyribosyltransferase like. 	105
406346	pfam15892	BNR_4	BNR repeat-containing family member. BNR_4 is a family which carries the unique sequence motif SxDxGxTW which is so characteristic of the repeats of the BNR family, pfam02012. It is unclear whether or not this unit is repeated throughout the sequences of this family, but if it is then the family is likely to be bacterial neuraminidase.	272
406347	pfam15893	DUF4739	Domain of unknown function (DUF4739). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is typically between 138 and 167 amino acids in length.	235
406348	pfam15894	SgrT	Inhibitor of glucose uptake transporter SgrT. 	53
374204	pfam15895	CAAX_1	CAAX box cerebral protein 1. CAAX_1 is a family of primate proteins. CAAX refers to the highly characteristic C-terminal residues, a cysteine and two aliphatic residues followed by any residue, a C-terminal tetrapeptide recognition motif called the Ca1a2X box. This motif on substrates is recognized by prenyltransferases that then attach an isoprenoid lipid (a process termed prenylation), one of the many post-translational modifications that occur in cells. The function of the prenylated family is not known.	209
406349	pfam15897	DUF4741	Domain of unknown function (DUF4741). 	169
406350	pfam15898	PRKG1_interact	cGMP-dependent protein kinase interacting domain. This domain is found at the C-terminus of protein phosphatase 1 regulatory subunits 12A, 12B and 12C. In protein phosphatase 1 regulatory subunit 12A it has been found to bind to cGMP-dependent protein kinase 1 via a leucine zipper motif located at the C-terminus of this domain.	101
406351	pfam15899	BNR_6	BNR-Asp box repeat. This BNR repeat is found in proteins such as human sortilin. The model complements family BNR_5.	14
406352	pfam15901	Sortilin_C	Sortilin, neurotensin receptor 3, C-terminal. Sortilin_C is the C-terminal cytoplasmic tail of sortilin, a Vps10p domain-containing family of proteins. Most sortilin is expressed within intracellular compartments, where it chaperones diverse ligands, including proBDNF and acid hydrolases. The sortilin cytoplasmic tail is homologous to mannose 6-phosphate receptor and is required for the intracellular trafficking of cargo proteins via interactions with distinct adaptor molecules. In addition to mediating lysosomal targeting of specific acid hydrolases, the sortilin cytoplasmic tail also directs trafficking of BDNF to the secretory pathway in neurons, where it can be released in response to depolarisation to modulate cell survival and synaptic plasticity.	164
406353	pfam15902	Sortilin-Vps10	Sortilin, neurotensin receptor 3. Sortilin, also known in mammals as neurotensin receptor-3, is the archetypical member of a Vps10-domain (Vps10-D) that binds neurotrophic factors and neuropeptides. This domain constitutes the entire luminal part of Sortilin and is activated in the trans-Golgi network by enzymatic propeptide cleavage. The structure of the domain has been determined as a ten-bladed propeller, with up to 9 BNR or beta-hairpin turns in it. The mature receptor binds various ligands, including its own propeptide (Sort-pro), neurotensin, the pro-forms of nerve growth factor-beta (NGF) and brain-derived neurotrophic factor (BDNF), lipoprotein lipase (LpL), apolipoprotein AV and the receptor-associated protein (RAP).	443
406354	pfam15903	PL48	Filopodia upregulated, FAM65. PL48 is associated with cytotrophoblast and lineage-specific HL-60 cell differentiation. The N-terminal part of the family is found to induce the formation of filopodia. It is found in vertebrates.	346
406355	pfam15904	LIP1	LKB1 serine/threonine kinase interacting protein 1. LIP1 is a protein found in eukaryotes. It represents the N-terminus of a leucine-rich-repeat protein that is implicated in Peutz-Jeghers syndrome. LIP1 interacts with the TGF-beta-regulated transcription factor SMAD4 to form a LKB1-LIP1-SMAD4 ternary complex. Mutations in SMAD4 lead to juvenile polyposis, suggesting a mechanistic link between these two diseases.	88
406356	pfam15905	HMMR_N	Hyaluronan mediated motility receptor N-terminal. HMMR_N is the N-terminal region of eukaryotic hyaluronan-mediated motility receptor proteins. The protein is functionally associated with BRCA1 and thus predicted to be a common, low-penetrance breast cancer candidate.	331
318178	pfam15906	zf-NOSIP	Zinc-finger of nitric oxide synthase-interacting protein. 	75
406357	pfam15907	Itfg2	Integrin-alpha FG-GAP repeat-containing protein 2. Members of this family are annotated as being integrin-alpha FG-GAP repeat-containing protein 2.	332
406358	pfam15908	HMMR_C	Hyaluronan mediated motility receptor C-terminal. HMMR_C is the C-terminal region of eukaryotic hyaluronan-mediated motility receptor proteins. The protein is functionally associated with BRCA1 and thus predicted to be a common, low-penetrance breast cancer candidate.	157
406359	pfam15909	zf-C2H2_8	C2H2-type zinc ribbon. This family carries three zinc-fingers in tandem.	98
374216	pfam15910	V-set_2	ICOS V-set domain. This family contains divergent V-set ig domains found in the ICOS protein.	113
406360	pfam15911	WD40_3	WD domain, G-beta repeat. 	57
406361	pfam15912	VIR_N	Virilizer, N-terminal. VIR_N is the conserved N-terminus of the protein virilizer, necessary for male and female viability and required for the production of eggs capable of embryonic development.	265
406362	pfam15913	Furin-like_2	Furin-like repeat, cysteine-rich. The furin-like cysteine rich region has been found in a variety of proteins from eukaryotes that are involved in the mechanism of signal transduction by receptor tyrosine kinases, which involves receptor aggregation.	102
406363	pfam15914	FAM193_C	FAM193 family C-terminal. This domain family is found in eukaryotes, and is approximately 60 amino acids in length. This C-terminal region of these proteins carries the most conserved residues.	56
406364	pfam15915	BAT	GAF and HTH_10 associated domain. GAF-HTH_assoc domain is always found between GAF-2 and HTH_10 domains on bacterio-opsin activator proteins. The exact function is not known.	156
406365	pfam15916	DUF4743	Domain of unknown function (DUF4743). This presumed domain is functionally uncharacterized. This domain family is found in bacteria and eukaryotes, and is approximately 150 amino acids in length. The family is found in association with pfam00293.	119
406366	pfam15917	PIEZO	Piezo. This domain is found in proteins belonging to the piezo family. Piezo proteins are components of cation channels. This domain is found in association with pfam12166.	163
374223	pfam15918	DUF4744	Domain of unknown function (DUF4744). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 81 and 415 amino acids in length.	66
406367	pfam15919	HicB_lk_antitox	HicB_like antitoxin of bacterial toxin-antitoxin system. This is a family of HicB-like antitoxins.	123
406368	pfam15920	WHAMM-JMY_N	N-terminal of Junction-mediating and WASP homolog-associated. WHAMM-JMY_N is the very N-terminus of WHAMM and JMY proteins. The function of this conserved region is not known; there are two highly conserved tryptophan residues.	49
318193	pfam15921	CCDC158	Coiled-coil domain-containing protein 158. CCDC158 is a family of proteins found in eukaryotes. The function is not known.	1112
292544	pfam15922	YjeJ	YjeJ-like. YjeJ is a family of bacterial proteins. The domains and proteins in this family vary in length from 283 to 284 amino acids. The function is not yet known. All proteins are from Gammaproteobacteria.	283
374226	pfam15923	DUF4745	Domain of unknown function (DUF4745). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is approximately 180 amino acids in length.	133
406369	pfam15924	ALG11_N	ALG11 mannosyltransferase N-terminus. 	208
374228	pfam15925	SOSSC	SOSS complex subunit C. SOSS complex subunit C is a component of the SOSS complex, a single-stranded DNA binding complex involved in genomic stability, double-stranded break repair and ataxia telangiectasia-mutated-dependent signaling pathways.	95
406370	pfam15926	RNF220	E3 ubiquitin-protein ligase RNF220. This family represents the central region of the E3 ubiquitin-protein ligase RNF220.	246
406371	pfam15927	Casc1_N	Cancer susceptibility candidate 1 N-terminus. This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is approximately 200 amino acids in length. The family is found in association with pfam12366. There are two completely conserved residues (N and W) that may be functionally important.	201
406372	pfam15928	DUF4746	Domain of unknown function (DUF4746). This presumed domain is functionally uncharacterized. This domain is found in eukaryotes, and is typically between 247 and 324 amino acids in length. The family is found in association with pfam00085.	290
374232	pfam15929	Myofilin	Myofilin. Myofilin is an insect muscle protein found in thick muscle filaments.	146
318201	pfam15930	YdiH	Domain of unknown function. YdiH is a family of proteins found in bacteria. Proteins in this family are typically between 62 and 80 amino acids in length. The function is not known.	62
406373	pfam15931	DUF4747	Domain of unknown function (DUF4747). This family of proteins is found in bacteria. Proteins in this family are typically between 263 and 305 amino acids in length.	257
406374	pfam15932	DUF4748	Domain of unknown function (DUF4748). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 114 and 139 amino acids in length.	51
406375	pfam15933	RnlB_antitoxin	Antitoxin to bacterial toxin RNase LS or RnlA. RnlB_antitoxin, formerly known as yfjO, has been found to be the antidote protein to RNase LS or RnlA in E. coli. Bacterial toxin-antitoxin systems consist of a stable toxin and an unstable antitoxin. In this case, a novel type II system, RnlA is the stable toxin that causes inhibition of cell growth and rapidly degrades T4 late mRNAs to prevent their expression, and this is neutralized by the activity of the unstable antitoxin RnlB.	94
318204	pfam15934	Yuri_gagarin	Yuri gagarin. The yuri gagarin protein is found in Drosophila, where it plays roles in spermatogenesis.	234
406376	pfam15935	RnlA_toxin	RNase LS, bacterial toxin. RnlA_toxin is an RNase LS and a putative toxin of a bacterial toxin-antitoxin pair. Toxin-antitoxin systems consist of a stable toxin and an unstable antitoxin. In this case, a novel type II system, RnlA is the stable toxin that causes inhibition of cell growth and rapidly degrades T4 late mRNAs to prevent their expression, and this is neutralized by the activity of the unstable antitoxin RnlB.	87
406377	pfam15936	DUF4749	Domain of unknown function (DUF4749). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is typically between 121 and 170 amino acids in length. It is usually found in association with pfam00595 (PDZ) and pfam00412 (LIM), and often contains the conserved Zasp-like motif (IPR006643).	96
374238	pfam15937	PrlF_antitoxin	prlF antitoxin for toxin YhaV_toxin. PrlF_antitoxin is a family of bacterial antitoxins that neutralizes the toxin YhaV. PrlF is labile and forms a homodimer that then binds to the YhaV toxin thereby neutralising its ribonuclease activity. Alone, it can also act as a transcription factor. The YhaV/PrlF complex binds the prlF-yhaV operon, probably regulating its expression negatively. Over-expression of PrlF leads to increased doubling time.	95
406378	pfam15938	DUF4750	Domain of unknown function (DUF4750). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 76 and 92 amino acids in length. There are two completely conserved W residues that may be functionally important.	52
292561	pfam15939	YmcE_antitoxin	Putative antitoxin of bacterial toxin-antitoxin system. YmcE_antitoxin is the putative antitoxin for the supposed bacterial toxin GnsA, UniProtKB:P0AC92, family pfam08178.	76
406379	pfam15940	YjcB	Family of unknown function. This family of proteins is found in bacteria. Proteins in this family are approximately 90 amino acids in length.	90
374241	pfam15941	FidL_like	FidL-like putative membrane protein. FidL-like is a family of bacterial proteins that are purported to be membrane proteins.	90
374242	pfam15942	DUF4751	Domain of unknown function (DUF4751). This family of proteins is found in bacteria. Proteins in this family are approximately 140 amino acids in length.	121
406380	pfam15943	YdaS_antitoxin	Putative antitoxin of bacterial toxin-antitoxin system, YdaS/YdaT. YdaS_antitoxin is a family of putative bacterial antitoxins, neutralising the toxin YdaT, family pfam06254.	65
339554	pfam15944	DUF4752	Domain of unknown function (DUF4752). This family of proteins is found in bacteria and viruses. Proteins in this family are typically between 90 and 105 amino acids in length. There is a conserved GLA sequence motif.	84
374244	pfam15946	DUF4754	Domain of unknown function (DUF4754). This family of proteins is found in bacteria. Proteins in this family are approximately 80 amino acids in length.	80
318214	pfam15947	DUF4755	Domain of unknown function (DUF4755). This family of proteins is found in bacteria. Proteins in this family are approximately 160 amino acids in length.	129
406381	pfam15948	DUF4756	Domain of unknown function (DUF4756). This family of proteins is found in bacteria. Proteins in this family are approximately 160 amino acids in length.	158
406382	pfam15949	DUF4757	Domain of unknown function (DUF4757). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is typically between 145 and 166 amino acids in length. The family is found in association with pfam00412. There are two completely conserved residues (W and L) that may be functionally important.	167
406383	pfam15950	DUF4758	Putative sperm flagellar membrane protein. 	124
406384	pfam15951	MITF_TFEB_C_3_N	MITF/TFEB/TFEC/TFE3 N-terminus. This domain is found at the N-terminus of several transcription factors including microphthalmia-associated transcription factor, transcription factor EB, transcription factor EC and transcription factor E3.	153
374249	pfam15952	ESM4	Enhancer of split M4 family. This family of proteins includes enhancer of split M4, enhancer of split M2 and enhancer of split MAlpha. These proteins are part of the Notch signaling pathway.	174
292575	pfam15953	PDU_like	Putative propanediol utilisation. This family of proteins is found in bacteria. Proteins in this family are approximately 160 amino acids in length.	153
374250	pfam15955	Cuticle_4	Cuticle protein. 	74
374251	pfam15956	DUF4760	Domain of unknown function (DUF4760). This family of proteins is found in bacteria, archaea and viruses. Proteins in this family are typically between 147 and 190 amino acids in length. There is a single completely conserved residue R that may be functionally important.	143
374252	pfam15957	Comm	Commissureless. Commissureless regulates Roundabout (Robo) levels and as a result controls axon guidance across the embryo midline.	110
318223	pfam15958	DUF4761	Domain of unknown function (DUF4761). This family of proteins is found in bacteria. Proteins in this family are approximately 110 amino acids in length.	105
406385	pfam15959	DUF4762	Domain of unknown function (DUF4762). This family of proteins is found in bacteria. Proteins in this family are approximately 70 amino acids in length. There is a conserved TTC sequence motif.	61
406386	pfam15960	DUF4763	Domain of unknown function (DUF4763). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 237 and 332 amino acids in length. There are two completely conserved residues (C and R) that may be functionally important.	236
406387	pfam15961	DUF4764	Domain of unknown function (DUF4764). 	798
374256	pfam15962	DUF4765	Domain of unknown function (DUF4765). This domain family is found in bacteria, and is approximately 90 amino acids in length.	1128
406388	pfam15963	Myb_DNA-bind_7	Myb DNA-binding like. 	85
318229	pfam15964	CCCAP	Centrosomal colon cancer autoantigen protein family. CCCAP is a family of proteins found in eukaryotes. CCCAP is also known as SDCCAG8, serologically defined colon cancer antigen 8. It is associated with the centrosome.	703
406389	pfam15965	zf-TRAF_2	TRAF-like zinc-finger. 	93
406390	pfam15966	F-box_4	F-box. 	115
406391	pfam15967	Nucleoporin_FG2	Nucleoporin FG repeated region. Nucleoporin_FG2, or nucleoporin p58/p45, is a family of chordate nucleoporins. The proteins carry many repeats of the FG sequence motif.	598
292590	pfam15968	RexB	Membrane-anchored ion channel, Abi component. RexB is a family of anti-lambda phage inner-membrane ion-channels with four transmembrane domains. On infection by phage, a phage protein-DNA complex is produced as a replication or recombination intermediate which activates RexA. RexA is an intracellular sensor that activates the membrane-anchored RexB. At least two RexA proteins are needed to activate one RexB protein. Activation opens the ion-channel leading to a drop in membrane potential, the outcome of which is the death of the host cell but also the cessation or abortion of the phage infection. RexA-RexB is one of the most well characterized bacterial abortive infection systems, or Abis.	139
374261	pfam15969	RexA	Intracellular sensor of Lambda phage, Abi component. RexA is a family of bacterial anti-phage proteins. It forms one partner in the two-component abortive infection system, Abi, of E. coli in partnership with RexB, a membrane-anchored ion-channel. Two RexA are needed to activate one RexB, and activation causes opening of the channel, the efflux of cations, a drop in cellular levels of ATP and subsequent death of the host cell and abortion of the phage infecting process which requires ATP.	235
292592	pfam15970	HicB-like_2	HicB_like antitoxin of bacterial toxin-antitoxin system. This is a family of HicB-like antitoxins.	81
374262	pfam15971	Mannosyl_trans4	DolP-mannose mannosyltransferase. This family catalyzes the transfer of mannose from DolP-mannose to the N-linked tetrasaccharide bound to the S-layer glycoprotein to form a pentasaccharide.	163
374263	pfam15972	Unpaired	Unpaired protein. Unpaired protein activates the JAK pathway.	177
374264	pfam15973	DUF4766	Domain of unknown function (DUF4766). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is typically between 106 and 128 amino acids in length. There is a conserved KVI sequence motif.	115
406392	pfam15974	Cadherin_tail	Cadherin C-terminal cytoplasmic tail, catenin-binding region. Cadherin_tail is the cytoplasmic domain at the C-terminus of cadherin proteins. This domain binds p120 catenin, an action critical for the surface stability of cadherin-catenin cell-cell adhesion complexes.	134
406393	pfam15975	Flot	Flotillin. Flotillin is a family of lipid-membrane-associated proteins found in bacteria, archaea and eukaryotes. The family is found in association with pfam01145, another integral membrane-associated domain. Flotillins in vertebrates are associated with sphingolipids and cholesterol-enriched membrane microdomains known as lipid-rafts. These rafts along with other membrane components are important in cell-signalling. Flotillins in other organisms have roles in viral pathogenesis, endocytosis, and membrane shaping.	121
406394	pfam15976	CooC_C	CS1-pili formation C-terminal. CooC_C is a highly conserved C-terminal domain on fimbrial outer membrane usher proteins like TcfC. The protein is required for CS1 pilus formation.	93
406395	pfam15977	HTH_46	Winged helix-turn-helix DNA binding. 	68
379756	pfam15978	TnsD	Tn7-like transposition protein D. TnsD is a family of putative Tn7-like transposition proteins type D.	360
406396	pfam15979	Glyco_hydro_115	Glycosyl hydrolase family 115. Glyco_hydro_115 is a family of glycoside hydrolases likely to have the activity of xylan a-1,2-glucuronidase, EC:3.2.1.131, or a-(4-O-methyl)-glucuronidase EC:3.2.1.-.	334
406397	pfam15980	ComGF	Putative Competence protein ComGF. ComGF is a family of putative bacterial competence proteins.	99
292603	pfam15981	EAV_GP5	Envelope glycoprotein GP 5 of equine arteritis virus. EAV_GP5 is a domain family found in equine arteritis virus envelope. It is approximately 80 amino acids in length and is found in association with pfam00951.	80
374269	pfam15982	TMEM135_C_rich	N-terminal cysteine-rich region of Transmembrane protein 135. TMEM135_C_rich is a family of putative peroxisomal membrane proteins found in eukaryotes. This is the highly conserved N-terminal region that has several highly conserved cysteine residues. The domain is associated with family Tim17, pfam02466.	134
374270	pfam15983	DUF4767	Domain of unknown function (DUF4767). This domain family is found in bacteria, and is approximately 140 amino acids in length. There is a single completely conserved residue Q that may be functionally important.	138
406398	pfam15984	Collagen_mid	Bacterial collagen, middle region. Collagen_mid is the conserved central region of bacterial collagen triple helix repeat proteins.	192
406399	pfam15985	KH_6	KH domain. KH motifs bind RNA in vitro. Auto-antibodies to Nova, a KH domain protein, cause para-neoplastic opsoclonus ataxia.	47
406400	pfam15989	DUF4768	Domain of unknown function (DUF4768). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 116 and 180 amino acids in length. There is a conserved FFFGQY sequence motif.	87
406401	pfam15990	UPF0767	UPF0767 family. This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between and 92 amino acids in length. There are two conserved sequence motifs: IGYN and SPSL.	83
406402	pfam15991	G_path_suppress	G-protein pathway suppressor. This family of proteins inhibits G-protein- and mitogen-activated protein kinase-mediated signal transduction.	273
406403	pfam15992	DUF4769	Domain of unknown function (DUF4769). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 291 and 501 amino acids in length.	256
406404	pfam15993	Fuseless	Fuseless. This family includes Drosophila fuseless protein and contains four WXGXW motifs. Fuseless is a transmembrane protein which regulates pre-synaptic calcium channels.	299
406405	pfam15994	DUF4770	Domain of unknown function (DUF4770). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is typically between 169 and 182 amino acids in length. There is a single completely conserved residue L that may be functionally important.	181
406406	pfam15995	DUF4771	Domain of unknown function (DUF4771). This domain family is found in eukaryotes, and is approximately 160 amino acids in length. There is a conserved RYGK sequence motif.	159
406407	pfam15996	PNISR	Arginine/serine-rich protein PNISR. 	178
374280	pfam15997	DUF4772	Domain of unknown function (DUF4772). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is typically between 107 and 124 amino acids in length. There is a single completely conserved residue V that may be functionally important.	112
406408	pfam15998	DUF4773	Domain of unknown function (DUF4773). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is approximately 120 amino acids in length.	118
406409	pfam15999	DUF4774	Domain of unknown function (DUF4774). This presumed domain is functionally uncharacterized. This domain family is found in bacteria, eukaryotes and viruses, and is approximately 50 amino acids in length.	57
406410	pfam16000	CARMIL_C	CARMIL C-terminus. This domain is found near to the C-terminus of leucine-rich repeat-containing proteins in the CARMIL family. In leucine-rich repeat-containing protein 16A (LRRC16A) it includes the region responsible for interaction with F-actin-capping protein subunit alpha-2 (CAPZA2).	287
406411	pfam16001	DUF4775	Domain of unknown function (DUF4775). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 308 and 484 amino acids in length.	456
406412	pfam16002	Headcase	Headcase protein. This domain is found in Drosophila Headcase protein and the human Headcase protein homolog. In humans, it may have a role in some cancers.	194
406413	pfam16003	DUF4776	Domain of unknown function (DUF4776). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is typically between 444 and 485 amino acids in length. There is a conserved TLR sequence motif.	502
406414	pfam16004	EFTUD2	116 kDa U5 small nuclear ribonucleoprotein component N-terminus. 	76
406415	pfam16005	MOEP19	KH-like RNA-binding domain. MOEP19 is a family of mammalian KH-like RNA-binding motifs. The family is expressed during early embryogenesis. It appears to effect an early form of molecular asymmetry within the murine oocyte cytoplasm. The family marks a defined cortical cytoplasmic domain in oocytes and provides evidence for mammalian oocyte polarity and a form of pre-patterning that persists in zygotes and early embryos through the morula stage.	85
406416	pfam16006	NUSAP	Nucleolar and spindle-associated protein. This family of microtubule-associated proteins has a role in spindle microtubule organisation.	277
374290	pfam16007	DUF4777	Domain of unknown function (DUF4777). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is approximately 70 amino acids in length.	66
374291	pfam16008	DUF4778	Domain of unknown function (DUF4778). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 321 and 791 amino acids in length. There is a single completely conserved residue P that may be functionally important.	289
406417	pfam16009	DUF4779	Domain of unknown function (DUF4779). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 234 and 351 amino acids in length.	160
406418	pfam16010	CDH-cyt	Cytochrome domain of cellobiose dehydrogenase. CDH-cyt is the cytochrome domain, at the N-terminus, of cellobiose dehydrogenase. CDH-cyt folds as a beta sandwich with the topology of the antibody Fab V(H) domain and binds iron. The haem iron is ligated by Met83 and His181 in UniProtKB:Q01738.	177
406419	pfam16011	CBM9_2	Carbohydrate-binding family 9. CBM9_2 is a family of putative endoxylanase-like proteins that belong to the Carbohydrate-binding family 9.	199
406420	pfam16012	DUF4780	Domain of unknown function (DUF4780). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is typically between 132 and 144 amino acids in length. There is a single completely conserved residue W that may be functionally important.	177
406421	pfam16013	DUF4781	Domain of unknown function (DUF4781). This presumed domain is functionally uncharacterized. This domain family is found in bacteria and eukaryotes, and is typically between 288 and 306 amino acids in length.	308
406422	pfam16014	SAP130_C	Histone deacetylase complex subunit SAP130 C-terminus. 	406
406423	pfam16015	Promethin	Promethin. 	96
406424	pfam16016	DUF4782	Domain of unknown function (DUF4782). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is approximately 150 amino acids in length. The family is found in association with pfam02893.	147
406425	pfam16017	BTB_3	BTB/POZ domain. 	106
406426	pfam16018	Anillin_N	Anillin N-terminus. This domain is found towards the N-terminus of anillin. In mammalian anillin this domain is repeated. This domain overlaps with the region responsible for nuclear localization of anillin.	86
406427	pfam16019	CSRNP_N	Cysteine/serine-rich nuclear protein N-terminus. This presumed domain is found at the N-terminus of cysteine/serine-rich nuclear proteins. These proteins act as transcriptional activators.	217
406428	pfam16020	Deltameth_res	Deltamethrin resistance. This presumed domain is found in the deltamethrin-resistance protein prag01 from Culex pipiens pallens.	49
406429	pfam16021	PDCD7	Programmed cell death protein 7. 	306
406430	pfam16022	DUF4783	Domain of unknown function (DUF4783). This family of proteins is found in bacteria. Proteins in this family are approximately 130 amino acids in length. There is a single completely conserved residue F that may be functionally important. Recent structures show this domain has an NTF2 fold.	102
406431	pfam16023	DUF4784	Domain of unknown function (DUF4784). This is a family of uncharacterized proteins from Bacteroidetes.	409
406432	pfam16024	DUF4785	Domain of unknown function (DUF4785). This family of proteins is found in bacteria. Proteins in this family are typically between 392 and 442 amino acids in length.	376
406433	pfam16025	CALM_bind	Calcium-dependent calmodulin binding. This domain is found at the N-terminus of centriolar coiled-coil protein of 110 kDa (CCP110), where it binds calmodulin. Binding of calmodulin to this domain is calcium dependent.	78
406434	pfam16026	MIEAP	Mitochondria-eating protein. This domain is found at the C-terminus of mitochondria-eating proteins. This family of proteins regulates mitochondrial quality. They have a role in the degradation of damaged mitochondrial proteins and in the degradation of damaged mitochondria.	198
406435	pfam16027	DUF4786	Domain of unknown function (DUF4786). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 209 and 353 amino acids in length.	162
406436	pfam16028	SLC3A2_N	Solute carrier family 3 member 2 N-terminus. This domain is found at the N-terminus of solute carrier family 3 member 2 proteins (4F2 cell-surface antigen heavy chain).	77
406437	pfam16029	DUF4787	Domain of unknown function (DUF4787). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is approximately 70 amino acids in length.	62
406438	pfam16030	GD_N	Serine protease gd N-terminus. This domain is found at the N-terminus of the serine protease gd (gastrulation defective) in insects.	108
406439	pfam16031	TonB_N	TonB N-terminal region. TonB_N is a short domain found just downstream of the cytoplasmic-membrane anchor at the N-terminus of TonB proteins. The exact function is not known.	132
406440	pfam16032	DUF4788	Domain of unknown function (DUF4788). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is approximately 230 amino acids in length. There is a single completely conserved residue D that may be functionally important.	229
406441	pfam16033	DUF4789	Domain of unknown function (DUF4789). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is typically between 87 and 100 amino acids in length. There is a conserved GPC sequence motif. There are two completely conserved C residues that may be functionally important.	95
406442	pfam16034	JAKMIP_CC3	JAKMIP CC3 domain. This domain is found at the C-terminus of proteins belonging to the JAKMIP family (Janus kinase and microtubule-interacting proteins) and is predicted to be a coiled coil. It interacts with the Janus family kinases Tyk2 and Jak1.	199
406443	pfam16035	Chalcone_2	Chalcone isomerase like. 	203
406444	pfam16036	Chalcone_3	Chalcone isomerase-like. Chalcone_3 is a family of largely bacterial members.	165
406445	pfam16037	DUF4790	Domain of unknown function (DUF4790). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 134 and 191 amino acids in length. There is a single completely conserved residue C that may be functionally important.	93
406446	pfam16038	TMIE	TMIE protein. This family of proteins includes the mammalian transmembrane inner ear expressed protein. Its function is unknown.	85
406447	pfam16039	DUF4791	Domain of unknown function (DUF4791). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 189 and 203 amino acids in length. There are two conserved sequence motifs: PLPL and LGN. There is a single completely conserved residue N that may be functionally important.	162
406448	pfam16040	DUF4792	Domain of unknown function (DUF4792). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is approximately 70 amino acids in length.	71
406449	pfam16041	DUF4793	Domain of unknown function (DUF4793). This domain family is found in bacteria and eukaryotes, and is approximately 110 amino acids in length. There are two completely conserved C residues that may be functionally important.	108
406450	pfam16042	DUF4794	Domain of unknown function (DUF4794). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is typically between 74 and 92 amino acids in length.	76
406451	pfam16043	DUF4795	Domain of unknown function (DUF4795). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 285 and 978 amino acids in length.	181
406452	pfam16044	DUF4796	Domain of unknown function (DUF4796). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 194 and 289 amino acids in length. There is a single completely conserved residue C that may be functionally important.	189
406453	pfam16045	LisH_2	LisH. 	28
406454	pfam16046	FAM76	FAM76 protein. This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 233 and 341 amino acids in length.	298
292666	pfam16047	Antimicrobial22	Frog antimicrobial peptide. This family includes the antimicrobial peptides Grahamin and Nigrocin which are secreted from frog skin.	21
292667	pfam16048	Antimicrobial23	Frog antimicrobial peptide. This family includes antimicrobial peptides such as Ranacyclin which are secreted from frog skin.	17
292668	pfam16049	Antimicrobial24	Frog antimicrobial peptide. This family includes antimicrobial peptides such as Aurein-5 and Caerin 2 which are secreted from frog skin.	25
406455	pfam16050	CDC73_N	Paf1 complex subunit CDC73 N-terminal. CDC73_N is the N-terminal region of the members of CDC73_C, pfam05179. CDC73 forms part of the Paf1 post-initiation complex. The exact function within the complex is not known.	302
406456	pfam16051	DUF4797	Domain of unknown function (DUF4797). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is approximately 40 amino acids in length. There is a conserved SGLPT sequence motif. There are two completely conserved residues (P and G) that may be functionally important.	43
406457	pfam16053	MRP-S34	Mitochondrial 28S ribosomal protein S34. 	128
406458	pfam16054	TMEM72	Transmembrane protein family 72. This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 145 and 275 amino acids in length.	153
406459	pfam16055	DUF4798	Domain of unknown function (DUF4798). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 80 and 365 amino acids in length. There is a single completely conserved residue H that may be functionally important.	103
318308	pfam16056	DUF4799	Domain of unknown function (DUF4799). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 362 and 1493 amino acids in length.	375
406460	pfam16057	DUF4800	Domain of unknown function (DUF4800). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is approximately 310 amino acids in length. The family is found in association with pfam02138, pfam00400. There is a conserved RDN sequence motif.	254
406461	pfam16058	Mucin-like	Mucin-like. This domain is found repeated at the C-terminus (C-tail) of bile salt-activated lipase, where it is O-glycosylated.	100
406462	pfam16059	DUF4801	Domain of unknown function (DUF4801). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is approximately 50 amino acids in length. The family is found in association with pfam00907.	52
406463	pfam16060	DUF4802	Domain of unknown function (DUF4802). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is approximately 70 amino acids in length. There are two conserved sequence motifs: CRC and YFDC.	65
406464	pfam16061	DUF4803	Domain of unknown function (DUF4803). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 351 and 686 amino acids in length. There is a conserved RRY sequence motif.	255
406465	pfam16062	DUF4804	Domain of unknown function (DUF4804). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 238 and 504 amino acids in length.	447
374340	pfam16063	DUF4805	Domain of unknown function (DUF4805). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 244 and 363 amino acids in length. There is a conserved WEL sequence motif.	265
406466	pfam16064	DUF4806	Domain of unknown function (DUF4806). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is approximately 80 amino acids in length.	86
406467	pfam16065	DUF4807	Domain of unknown function (DUF4807). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 171 and 270 amino acids in length. There is a conserved STLGG sequence motif.	126
374343	pfam16066	DUF4808	Domain of unknown function (DUF4808). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is typically between 106 and 135 amino acids in length.	121
406468	pfam16067	DUF4809	Domain of unknown function (DUF4809). This family of proteins is found in bacteria. Proteins in this family are typically between 120 and 137 amino acids in length. There is a conserved GGCNAC sequence motif.	129
406469	pfam16068	DUF4810	Domain of unknown function (DUF4810). This family of proteins is found in bacteria. Proteins in this family are typically between 117 and 134 amino acids in length. There is a conserved PES sequence motif. It is a putative lipoprotein.	84
406470	pfam16069	DUF4811	Domain of unknown function (DUF4811). This family of proteins is found in bacteria. Proteins in this family are typically between 188 and 241 amino acids in length. There is a single completely conserved residue Y that may be functionally important.	154
406471	pfam16070	TMEM132	Transmembrane protein family 132. This presumed domain is found in members of the TMEM132 family. TMEM132A may be involved in embryonic and postnatal brain development. TMEM132D may be a marker for oligodendrocyte differentiation.	344
406472	pfam16071	DUF4812	Domain of unknown function (DUF4812). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is approximately 100 amino acids in length. The family is found in association with pfam03791, pfam03790. There are two completely conserved residues (H and I) that may be functionally important.	65
318322	pfam16072	DUF4813	Domain of unknown function (DUF4813). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 345 and 672 amino acids in length.	291
406473	pfam16073	SAT	Starter unit:ACP transacylase in aflatoxin biosynthesis. SAT is the N-terminal starter unit:ACP transacylase of the aflatoxin biosynthesis pathway. SAT selects the hexanoyl starter unit from a pair of specialized fungal fatty acid synthase subunits (HexA/HexB) and transfers it onto the polyketide synthase A acyl-carrier protein to prime polyketide chain elongation. The family is found in association with pfam02801, pfam00109, pfam00550, pfam00975, pfam00698.	239
406474	pfam16074	PilW	Type IV Pilus-assembly protein W. PilW is a family of putative type IV pilus-assembly proteins. PilW is one of the component proteins of the pilus biogenesis process whereby pilus fibers are assembled in the periplasm, emerge onto the cell surface and are there stabilized, to allow bacterial attachment to host cells. PilW is an outer-membrane protein necessary for both the functionality of fibers and their stabilisation.	125
374348	pfam16075	DUF4815	Domain of unknown function (DUF4815). 	570
406475	pfam16076	Acyltransf_C	Acyltransferase C-terminus. This domain is found at the C-terminus of several different acyltransferases including 1-acyl-sn-glycerol-3-phosphate acyltransferase, acyl-CoA:lysophosphatidylglycerol acyltransferase 1 and lysocardiolipin acyltransferase 1.	73
406476	pfam16077	Spaetzle	Spaetzle. This family of proteins are nerve growth factor-like ligands required in the pathway that establishes the dorsal-ventral pattern of the embryo. They form a cystine knot structure.	93
406477	pfam16078	2-oxogl_dehyd_N	2-oxoglutarate dehydrogenase N-terminus. This domain is found at the N-terminus of 2-oxoglutarate dehydrogenases.	41
406478	pfam16079	Phage_holin_5_2	Phage holin family Hol44, in holin superfamily V. Phage_holin_V_2 is a family of small hydrophobic proteins with three transmembrane domains of the Hol44 family. These proteins are produced by double-stranded DNA bacteriophages that use an endolysin-holin strategy to achieve lysis of their hosts. The endolysins are peptidoglycan-degrading enzymes that are usually accumulated in the cytosol until access to the cell wall substrate is provided by the holin membrane lesion. Full activity of the endolysin Lys44 from oenophage fOg44 requires sudden ion-nonspecific dissipation of the proton motive force, undertaken by the fOg44 holin during phage-infection.	66
406479	pfam16080	Phage_holin_2_3	Bacteriophage holin family HP1. Phage_holin_2_3 is a family of small hydrophobic phage proteins called holins with one transmembrane domain. Holins are produced by double-stranded DNA bacteriophages that use an endolysin-holin strategy to achieve lysis of their hosts. The endolysins are peptidoglycan-degrading enzymes that are usually accumulated in the cytosol until access to the cell wall substrate is provided by the holin membrane lesion.	56
292699	pfam16081	Phage_holin_7_1	Mycobacterial 2 TMS Phage Holin (M2 Hol) Family. Phage_holin_7_1 is a family of two transmembrane mycobacteriophage holins, small hydrophobic proteins that effect lysis of host mycobacterial cells in conjunction with a mycobacteria-specific lysin, lysB. The product of lysB gene targets the mycobacteria outer membrane, the last barrier to bacteriophage release.	139
406480	pfam16082	Phage_holin_2_4	Bacteriophage holin family, superfamily II-like. Phage_holin_2_4 is a family of small hydrophobic phage proteins called holins with one transmembrane domain. Holins are produced by double-stranded DNA bacteriophages that use an endolysin-holin strategy to achieve lysis of their hosts. The endolysins are peptidoglycan-degrading enzymes that are usually accumulated in the cytosol until access to the cell wall substrate is provided by the holin membrane lesion.	76
406481	pfam16083	Phage_holin_3_3	LydA holin phage, holin superfamily III. Phage_holin_3_3 is a family of small hydrophobic holin proteins with one or more transmembrane domains. Holins are encoded within the genomes of Gram-positive and Gram-negative bacteria as well as those of the bacteriophages of these organisms. Their primary function appears to be transport of murein hydrolases across the cytoplasmic membrane to the cell wall where these enzymes hydrolyze the cell wall polymer as a prelude to cell lysis. When chromosomally encoded, these enzymes are therefore autolysins. Holins may also facilitate leakage of electrolytes and nutrients from the cell cytoplasm, thereby promoting cell death. Some may catalyze export of nucleases. LydA and lydB are encoded on the dar operon. The phenotype of a rapid lysis in the absence of active LydB suggests that this protein might be an antagonist of the holin LydA.	78
292702	pfam16084	LydB	LydA-holin antagonist. LydB is a family of proteins that are antagonistic to the lysing action of holin LydA.	147
318333	pfam16085	Phage_holin_3_5	Bacteriophage holin Hol, superfamily III. Phage_holin_6_2 is a family of holins classified as 1.E.20 in the TC database. The hol gene (PRF9) product (117 aas) of Pseudomonas aeruginosa PAO1 exhibits a hydrophobicity profile similar to holins of P2 and phiCTX phages with two peaks of hydrophobicity that might correspond to either one or two TMSs. Hol functions in conjunction with the lytic enzyme, Lys, a glycosyl hydrolase that breaks-up the murein in the bacterial cell-wall, causing lysis of the cell and hence entry of phage particles. Several members are annotated as pyocin R2_PP when encoded on the chromosome.	113
374353	pfam16086	DUF4816	Domain of unknown function (DUF4816). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 178 and 456 amino acids in length. There is a conserved WKP sequence motif.	43
406482	pfam16087	DUF4817	Domain of unknown function (DUF4817). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 109 and 322 amino acids in length. There are two completely conserved residues (G and R) that may be functionally important.	54
406483	pfam16088	BORCS7	BLOC-1-related complex sub-unit 7. This is a family of unknown function found in eukaryotes. Family members include BORCS7 (BLOC-1-related complex sub-unit 7) also known as Diaskedin (from the Ancient Greek diaskedazo, meaning to disperse) or C10orf32. It constitutes sub-unit 7 of the BORC complex (BLOC-one-related complex). BORC is a multisubunit complex that regulates the positioning of lysosomes at the cell periphery, and consequently affects cell migration. BORC associates with the lysosomal membrane, where it functions to recruit the small GTPase Arl8. This initiates a series of interactions that promote the microtubule-guided transport of lysosomes toward the cell periphery.	103
406484	pfam16089	DUF4818	Domain of unknown function (DUF4818). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 176 and 214 amino acids in length. There is a single completely conserved residue W that may be functionally important.	109
406485	pfam16090	DUF4819	Domain of unknown function (DUF4819). This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is typically between 82 and 99 amino acids in length.	84
374358	pfam16091	DUF4820	Domain of unknown function (DUF4820). This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 320 and 483 amino acids in length. There are two conserved sequence motifs: WSLP and RPLPW.	226
406486	pfam16092	DUF4821	Domain of unknown function (DUF4821). 	264
406487	pfam16093	PAC4	Proteasome assembly chaperone 4. PAC4 or proteasome assembly chaperone 4 protein promotes assembly of the 20S proteasome. It interacts with PSMG3. It associates with alpha subunits of the 20S proteasome. At the very C-terminus is a crucial HbYX or hydrophobic-tyrosine-X sequence motif that, in proteasome activators, opens the 20S proteasome entry pore.	72
406488	pfam16094	PAC1	Proteasome assembly chaperone 1. PAC1 is a family of eukaryotic proteasome assembly chaperone 1 proteins that promote assembly of the core 20S proteasome as part of a heterodimer with PAC2.	282
406489	pfam16095	COR	C-terminal of Roc, COR, domain. The C-terminal of Roc domain, COR, along with Roc functions as the putative regulator of kinase activity. It functions as a proper GTP-binding protein with a low GTPase activity somehow stimulating the kinase activity.	196
406490	pfam16096	FXR_C1	Fragile X-related 1 protein C-terminal region 2. FXR_C1 is a small highly conserved region of the C-terminus of Fragile X-related proteins 1 and 2, FXR1, FXR2. The family is found in association with pfam05641, pfam00013. This family is immediately C-terminal to the core C terminal region, PF12235, and contains at least one block of RGG repeats that bind to G-quartet sequences in a wide variety of mRNAs.	75
406491	pfam16097	FXR_C3	Fragile X-related 1 protein C-terminal region 3. FXR_C3 is a small highly conserved region at the very C-terminus of Fragile X-related proteins 1 and 2, FXR1, FXR2. The family is found in association with pfam05641, pfam00013, PF16096.	68
406492	pfam16098	FXMR_C2	Fragile X-related mental retardation protein C-terminal region 2. FXMR_C2 is a small highly conserved region at the very C-terminus of Fragile X-related proteins FMR1. The family is found in association with pfam05641, pfam00013, PF16096.	86
406493	pfam16099	RMI1_C	Recq-mediated genome instability protein 1, C-terminal OB-fold. RMI1_C is a C-terminal oligo-nucleotide binding domain of Recq-mediated genome instability proteins. This domain interacts with RMI2-OB folds to make up the RMI core complex. The RMI core interface is crucial for BLM, Bloom syndrome, dissolvasome assembly and may have additional cellular roles as a docking hub for other proteins.	136
406494	pfam16100	RMI2	RecQ-mediated genome instability protein 2. RMI2 is a eukaryotic family of OB3 oligo-nucleotide-binding proteins. It is an essential component of the RMI complex that plays a vital role in the processing of homologous recombination intermediates in order to limit DNA-crossover-formation in cells.	124
406495	pfam16101	PRIMA1	Proline-rich membrane anchor 1. 	122
406496	pfam16102	ACTH_assoc	ACTH-associated domain. ACTH_assoc is the low-complexity region immediately adjacent to the highly conserved binding motif of the ACTH_domain, pfam00976. The exact function is not known.	28
406497	pfam16103	DUF4822	Domain of unknown function (DUF4822). A lipocalin-like domain found in functionally uncharacterized bacterial proteins, often as a repeat of two domains. Proteins with this domain are found in a wide range of bacteria and are often annotated as S-layer proteins, but the origin of this annotation is not clear.	121
292722	pfam16104	FPRL1_inhibitor	Formyl peptide receptor-like 1 inhibitory protein. This family consists of several formyl peptide receptor-like 1 inhibitory proteins from Staphylococcus aureus. These are secreted proteins that block the formyl peptide receptor-like 1 found in neutrophils, monocytes, B cells, and NK cells; and inhibit the binding of chemoattractants (such as formylated peptides) to FPRL1, which initiate phagocyte mobilization towards the infection site.	105
374367	pfam16105	DUF4823	Domain of unknown function (DUF4823). This family consists of hypothetical lipoproteins around 210 residues in length and is mainly found in various Pseudomonas species. The function of this family is unknown.	141
406498	pfam16106	DUF4824	Domain of unknown function (DUF4824). This family consists of several hypothetical lipoproteins around 270 residues in length and is mainly found in Pseudomonas species. The function of this family is unknown.	253
406499	pfam16107	DUF4825	Domain of unknown function (DUF4825). This domain forms the N-terminal, extracellular domain of some homologs of Staph BlaR1 proteases, where it replaces the penicillin-binding domain of BlaR1. It is also found in many uncharacterized proteins in a broad range of bacteria. Its association with BlaR1 homologs suggests it may be involved in substrate-, possibly antibiotic-binding, but this prediction has not been verified experimentally.	95
339611	pfam16108	DUF4826	Domain of unknown function (DUF4826). This family consists of uncharacterized proteins around 150 residues in length and is mainly found in various Shewanella species. The function of this protein is unknown.	124
406500	pfam16109	DUF4827	Domain of unknown function (DUF4827). This family consists of uncharacterized proteins around 200 residues in length and is mainly found in various Bacteroides species. Distant homology prediction algorithms consistently suggest a homology between this family and FKBP-type peptidyl-prolyl cis-trans isomerases (PF00254), but this relation is as yet not confirmed. The function of this family is unknown.	177
406501	pfam16110	DUF4828	Domain of unknown function (DUF4828). This family consists of uncharacterized proteins around 120 residues in length and is mainly found in various Enterococcus and Lactobacillus species. The function of this family is unknown.	79
379768	pfam16111	DUF4829	Domain of unknown function (DUF4829). This family consists of several uncharacterized proteins around 150 residues in length and is mainly found in various Clostridium species. The function of this family is unknown.	117
406502	pfam16112	DUF4830	Domain of unknown function (DUF4830). This family consists of several uncharacterized proteins around 150 residues in length and is mainly found in Clostridium, Eubacterium, and Ruminococcus species. The function of this family is unknown.	84
406503	pfam16113	ECH_2	Enoyl-CoA hydratase/isomerase. This family contains a diverse set of enzymes including: enoyl-CoA hydratase, naphthoate synthase, carnitine racemase, 3-hydroxybutyryl-CoA dehydratase and dodecanoyl-CoA delta-isomerase. This family differs from pfam00378 in the structure of its C-terminus.	329
406504	pfam16114	Citrate_bind	ATP citrate lyase citrate-binding. This is the citrate-binding domain of ATP citrate lyase. This domain has a Rossmann fold.	177
406505	pfam16115	DUF4831	Domain of unknown function (DUF4831). This family consists of several uncharacterized proteins around 350 residues in length and is mainly found in various Bacteroides species. The function of this family is unknown.	318
406506	pfam16116	DUF4832	Domain of unknown function (DUF4832). This family consists of uncharacterized proteins around 200 residues in length and is mainly found in various Bacteroides and Capnocytophaga species. The function of this family is unknown. Distant homology analysis suggests a possible similarity of proteins from this family to TIM barrel glycoside hydrolases and, subsequently, its involvement in carbohydrate metabolism. The domain lies downstream of glycosyl hydrolases 42, suggesting that as a domain it might represent the carbohydrate-binding region of the enzyme.	208
406507	pfam16117	DUF4833	Domain of unknown function (DUF4833). This family consists of uncharacterized proteins around 170 residues in length and is mainly found in various Parabacteroides and Bacteroides species. The function of this family is unknown.	136
406508	pfam16118	DUF4834	Domain of unknown function (DUF4834). This family consists of uncharacterized proteins around 90 residues in length and is mainly found in various Parabacteroides and Bacteroides species. Proteins in this family are characterized by a strongly conserved KDEGEYVD motif at the C-terminus and a very divergent N-terminus. The function of this family is unknown.	91
406509	pfam16119	DUF4835	Domain of unknown function (DUF4835). This family consists of uncharacterized proteins of around 300 residues in length and is mainly found in bacteria from the Cytophaga-Flavobacteria-Bacteroides (CFB) group, both environmental and from human microbiome. The function of this family is unknown.	276
406510	pfam16120	DUF4836	Domain of unknown function (DUF4836). This family consists of several uncharacterized proteins around 520 residues in length and is mainly found in various Bacteroides species. The function of this family is unknown.	474
406511	pfam16121	40S_S4_C	40S ribosomal protein S4 C-terminus. This domain is found at the C-terminus of 40S ribosomal protein S4.	48
406512	pfam16122	40S_SA_C	40S ribosomal protein SA C-terminus. This domain is found at the C-terminus of 40S ribosomal protein SA.	96
406513	pfam16123	HAGH_C	Hydroxyacylglutathione hydrolase C-terminus. This domain is found at the C-terminus of hydroxyacylglutathione hydrolase enzymes. Substrate binding occurs at the interface between this domain and the catalytic domain (pfam00753).	82
406514	pfam16124	RecQ_Zn_bind	RecQ zinc-binding. This domain is the zinc-binding domain of ATP-dependent DNA helicase RecQ.	64
406515	pfam16125	DUF4837	Domain of unknown function (DUF4837). This family consists of uncharacterized proteins around 350 residues in length and is mainly found in various Bacteroides species. The function of this family is unknown.	292
406516	pfam16126	DUF4838	Domain of unknown function (DUF4838). This family consists of several uncharacterized proteins found in various Bacteroides and Chloroflexus species. The function of this family is unknown.	263
406517	pfam16127	DUF4839	Domain of unknown function (DUF4839). This family consists of uncharacterized proteins around 300 residues in length and is mainly found in various Clostridium species. The function of this family is unknown.	122
406518	pfam16128	DUF4840	Domain of unknown function (DUF4840). This family consists of uncharacterized proteins around 220 residues in length and is mainly found in various Bacteroides species. The function of this family is unknown.	141
406519	pfam16129	DUF4841	Domain of unknown function (DUF4841). This domain is found at the N-terminus of several uncharacterized proteins found in various Bacteroides species. The solved structure of one of them (BACOVA_00967) from Bacteroides ovatus shows a small beta barrel with an immunoglobulin-like fold. The DUF4841 domain shows weak overlap with the DUF4114 family, suggesting a possible distant relation. The function of this domain is unknown.	69
406520	pfam16130	DUF4842	Domain of unknown function (DUF4842). This domain is found at the C-terminus of a large number of uncharacterized proteins with broad phylogenetic distribution, which includes human gut Bacteroides, g-proteobacteria (Vibrio and Shewanella) and also spirochetes from the Leptospira genus. The solved structure of Bacteroides ovatus protein BACOVA_00967 shows a large beta barrel with an immunoglobulin-like fold and significant structural similarity to the collagen-binding domain of adhesin from S. aureus (1amx), but with several additional long loops and secondary structure elements. The function of this domain is unknown.	201
406521	pfam16131	Torus	Torus domain. This domain is found in pre-mRNA-splicing factor CWC2. It includes a CCCH-type zinc finger.	109
374383	pfam16132	DUF4843	Domain of unknown function (DUF4843). This family consists of uncharacterized proteins around 220 residues in length and is mainly found in various Bacteroides species. Distant homology analysis suggests a distant relation between this family and other families of proteins with immunoglobulin-like folds, which are often involved in substrate binding. However, the specific function of this family is unknown. There is distant homology to the Calx-beta family pfam03160.	163
406522	pfam16133	DUF4844	Domain of unknown function (DUF4844). This family consists of short uncharacterized proteins found mostly in different strains of Acinetobacter baumannii, but also in several Shewanella species and in some bacteria from the CFB group. The solved structure of the ABAYE3784 protein from Acinetobacter baumannii AYE shows a five helical bundle with a very strong structural similarity to a bromodomain. However, the specific function of proteins from the DUF4844 family is unknown.	117
406523	pfam16134	THOC2_N	THO complex subunit 2 N-terminus. This family represents the N-terminus of THO complex subunit 2.	617
406524	pfam16135	Jas	TPL-binding domain in jasmonate signalling. The Jas domain is a short region of sequence characterized by IxCxCx(12)HAG found in plant transcriptional repressors. This motif appears to bind to the Groucho/Tup1-type co-repressor TOPLESS (TPL) and TPL-related proteins (TPRs). This binding is a crucial step in the jasmonate signalling pathway, involved in plant disease and defense.	48
406525	pfam16136	NINJA_B	Putative nuclear localization signal. NINJA proteins are Novel INteractor of JAZ proteins found in plants. NINJA proteins act as a transcriptional repressor, the activity of which is mediated by a functional TPL-binding EAR repression motif upstream from this domain.	114
406526	pfam16137	DUF4845	Domain of unknown function (DUF4845). This family consists of uncharacterized proteins around 120 residues in length and is mainly found in various Pseudomonas species. Distant homology analysis suggests that proteins from this family are related to pilin type IV proteins from the Bundlin (PF05307) family; however, this prediction is not confirmed by any experimental evidence.	85
406527	pfam16138	DUF4846	Domain of unknown function (DUF4846). This family consists of uncharacterized proteins around 260 residues in length and is mainly found in various Bacteroides species. The function of this family is unknown.	239
406528	pfam16139	DUF4847	Domain of unknown function (DUF4847). This uncharacterized domain has a lipocalin fold.	142
406529	pfam16140	DUF4848	Domain of unknown function (DUF4848). A small family of uncharacterized proteins around 310 residues in length and found in various Bacteroides species. The function of this family is unknown.	216
406530	pfam16141	DUF4849	Putative glycoside hydrolase Family 18, chitinase_18. This DUF is likely to be a form of glycosyl hydrolase from CAZy family 18, possibly chitinase 18. This would have the EC number of EC:3.2.1.14.	318
292760	pfam16142	DUF4850	Domain of unknown function (DUF4850). This family consists of several uncharacterized proteins around 250 residues in length and is mainly found in various Acinetobacter species. The function of this family is unknown.	184
374391	pfam16143	DUF4851	Domain of unknown function (DUF4851). This family consists of several uncharacterized proteins around 250 residues in length and is mainly found in various Desulfovibrio species. The function of this family is unknown.	195
406531	pfam16144	DUF4852	Domain of unknown function (DUF4852). This family consists of several uncharacterized proteins around 350 residues in length and is mainly found in various Parabacteroides, Bacteroides and Porphyromonas species. The function of this family is unknown.	121
406532	pfam16145	DUF4853	Domain of unknown function (DUF4853). This family consists of uncharacterized proteins around 220 residues in length and is mainly found in various Actinomyces species. The function of this family is unknown.	135
406533	pfam16146	DUF4854	Domain of unknown function (DUF4854). This family consists of uncharacterized proteins found in firmicutes and high GC Gram+ bacteria associated with human and animal guts. The function of this family is unknown.	105
406534	pfam16147	DUF4855	Domain of unknown function (DUF4855). This family consists of uncharacterized proteins around 400 residues in length and is mainly found in various Bacteroides species. Several proteins are annotated as glycerophosphodiester phosphodiesterases, but the origin of this annotation is not clear.	313
406535	pfam16148	DUF4856	Domain of unknown function (DUF4856). This family consists of uncharacterized proteins around 400 residues in length and is mainly found in various Bacteroides and Prevotella species. The function of this family is unknown.	358
406536	pfam16149	DUF4857	Domain of unknown function (DUF4857). This family consists of uncharacterized proteins around 340 residues in length and is mainly found in various Bacteroides species. The function of this family is unknown.	270
406537	pfam16150	DUF4858	Domain of unknown function (DUF4858). This family consists of uncharacterized proteins around 190 residues in length and is mainly found in various Bacteroides species. The function of this family is unknown.	182
406538	pfam16151	DUF4859	Domain of unknown function (DUF4859). This family consists of uncharacterized proteins around 340 residues in length and is mainly found in various Bacteroides and Prevotella species. The function of this family is unknown.	116
406539	pfam16152	DUF4860	Domain of unknown function (DUF4860). This family consists of uncharacterized proteins around 160 residues in length and is mainly found in various Eubacterium and Clostridium species. The function of this family is unknown.	98
406540	pfam16153	DUF4861	Domain of unknown function (DUF4861). This family consists of uncharacterized proteins around 400 residues in length and is mainly found in various Bacteroides species. The function of this family is unknown. However, in many instances the domain lies upstream of a glycosyl hydrolase family, usually family 88, so it might be involved in carbohydrate binding.	382
406541	pfam16154	DUF4862	Domain of unknown function (DUF4862). This family consists of uncharacterized proteins around 300 residues in length and is mainly found in various high GC Gram+ bacteria, but also in several pathogenic and non-pathogenic enterobacteria (Salmonella, E. coli). Distant homology analysis suggests this could be a branch of the Xylose isomerase-like TIM barrel family, but this prediction is currently not confirmed by experiment.	291
406542	pfam16155	DUF4863	Domain of unknown function (DUF4863). This family consists of uncharacterized proteins around 150 residues in length and is mainly found in various delta-proteobacteria, but also in several fungal species. Distant homology analysis suggests proteins from this family have a cupin-like fold and may be related to a group of lyases involved in the metabolism of benzoate. A few proteins from this family are annotated as p-hydroxylaminobenzoate lyases, NbaB, and this proposed function matches well their phylogenetic distribution, but there seems to be no direct experimental verification of this function; therefore at this point we call it a DUF.	153
406543	pfam16156	DUF4864	Domain of unknown function (DUF4864). This family consists of uncharacterized proteins around 120 residues in length and is mainly found in various Anabaena and Nostoc species. Distant homology analysis suggests this family is related to NTF2-like proteins and specifically to proteins that bind small molecules. HMM partly overlaps with Tol_Tol_Ttg2 (PF05494) involved in Toluene tolerance and lumazine binding family (PF12870) and these families should form a clan.	101
406544	pfam16157	DUF4865	Domain of unknown function (DUF4865). This family consists of uncharacterized proteins around 180 residues in length and is mainly found in various Bacillus species. Distant homology and fold prediction suggest proteins from this family would have a ferredoxin dimeric fold and specifically be related to the putative mono-oxygenase ydhR family PF08803; however, this prediction has not been verified by experiment.	181
406545	pfam16158	N_BRCA1_IG	Ig-like domain from next to BRCA1 gene. Domain present between positions 365-485 in the human next to BRCA1 gene 1 protein Q14596 (NBR1_HUMAN). Distant homology and fold prediction analysis suggests this domain has an immunoglobulin-like fold and is distantly homologous to domains involved in cell adhesion such as CARDB (PF07705). A JCSG construct was crystallized, confirming the domain boundaries.	101
406546	pfam16159	FOXP-CC	FOXP coiled-coil domain. This domain, approximately 60-70 residues in length, is mainly found in Forkhead box proteins in various Mammalia species. It is a coiled-coil domain, which modulates the dimeric associations of FOXP transcription factors. Several key disease mutations, for instance those found in the IPEX syndrome, are located in this domain.	68
406547	pfam16160	DUF4866	Domain of unknown function (DUF4866). This family consists of uncharacterized proteins around 250 residues in length and is mainly found in various human gut Firmicute species and abundant in human gut metagenomic datasets. The function of this family is unknown.	246
406548	pfam16161	DUF4867	Domain of unknown function (DUF4867). This family consists of uncharacterized proteins around 220 residues in length and is mainly found in various human gut Firmicutes and a few eubacteria species. It is also amply represented in human gut metagenomic datasets. Distant homology analysis and marginal HMM overlaps suggest this family is a distant homolog of Ureidoglycolate hydrolase pfam04115, but this prediction is not verified by experiment, therefore the function of this family is still unknown.	199
406549	pfam16162	DUF4868	Domain of unknown function (DUF4868). This family consists of uncharacterized proteins around 320 residues in length and is found in a phylogenetically broad range of bacteria associated with the human gut microbiome. A member of this family from Lactobacillus casei CRL 705 is part of the gene cluster involved in synthesis of bacteriocin toxin, but the specific function of this family is unknown.	186
406550	pfam16163	DUF4869	Domain of unknown function (DUF4869). This family consists of uncharacterized proteins around 150 residues in length. Its members are found in human gut Firmicutes and are also abundant in human gut metagenomics datasets. The function of this family is unknown.	128
406551	pfam16164	VWA_N2	VWA N-terminal. This domain is found in von Willebrand factor proteins, where it is found to the N-terminus of the first VWA domain (pfam00092).	79
406552	pfam16165	Ferlin_C	Ferlin C-terminus. This domain is found at the C-terminus of proteins belonging to the ferlin family, including dysferlin, myoferlin, otoferlin and fer-1-like proteins.	154
374405	pfam16166	TIC20	Chloroplast import apparatus Tic20-like. Chloroplast function requires the import of nuclear encoded proteins from the cytoplasm across the chloroplast double membrane. This is accomplished by two protein complexes, the Toc complex located at the outer membrane and the Tic complex located at the inner membrane. The Toc complex recognizes specific proteins by a cleavable N-terminal sequence and is primarily responsible for translocation through the outer membrane, while the Tic complex translocates the protein through the inner membrane. This entry represents Tic20, a core member of the Tic complex. This protein is deeply embedded in the inner envelope membrane and is thought to function as a protein-conducting component of the Tic complex.	177
406553	pfam16167	DUF4871	Domain of unknown function (DUF4871). This family consists of uncharacterized proteins around 170 residues in length and is mainly found in various Bacillus species (B. cereus, B. thuringiensis and B. anthracis). The solved structure of B. anthracis homologs has a variant of the Greek-key beta barrel fold, making the DUF4870 family a member of a large group of bacterial immunoglobulin like domains, but the functional consequences of this classification remain unknown.	128
406554	pfam16168	AIDA	Adhesin of bacterial autotransporter system, probable stalk. The AIDA repeat is found on bacterial autotransporter proteins. As the repeat is short and occurs multiple times, it is likely to be the region of the transporter that acts as the stalk between the beta-barrel inserted into the membrane and the N-terminal head domain.	57
406555	pfam16169	DUF4872	Domain of unknown function (DUF4872). Members of this family are often found in the gene neighborhood, or fused to, non-ribosomal peptide synthetases.	173
406556	pfam16170	DUF4873	Domain of unknown function (DUF4873). This family consists mostly of short uncharacterized proteins found in various high GC Gram positive bacteria, primarily Mycobacterium species. However in some proteins, such as for instance Rv0943c proteins from Mycobacterium tuberculosis H37Rv, DUF4873 domain is found at the C-terminus, following the flavin-binding monooxygenase-like domain pfam00743, which is why probably many proteins with DUF4873 domains are annotated as monooxygenases. However these functions are not confirmed experimentally and the function of DUF4873 domain is still unknown.	91
406557	pfam16171	CENP-T_N	Centromere kinetochore component CENP-T N-terminus. CENP-T is a family of vertebrate kinetochore proteins that associates directly with CENP-W. The N-terminus of CENP-T proteins interacts directly with the Ndc80 complex in the outer kinetochore. Importantly, the CENP-T-W complex does not directly associate with CENP-A, but with histone H3 in the centromere region. CENP-T and -W form a hetero-tetramer with CENP-S and -X and bind to a ~100 bp region of nucleosome-free DNA forming a nucleosome-like structure. The DNA-CENP-T-W-S-X complex is likely to be associated with histone H3-containing nucleosomes rather than with CENP-nucleosomes. This family represents the N-terminus of CENP-T.	378
406558	pfam16172	DOCK_N	DOCK N-terminus. This family is found near to the N-terminus of dedicator of cytokinesis (DOCK) proteins, between the variant SH3 domain (pfam07653) and the C2 domain (pfam14429).	378
406559	pfam16173	DUF4874	Domain of unknown function (DUF4874). This presumed domain is functionally uncharacterized. This domain family is found in bacteria and eukaryotes, and is typically between 161 and 175 amino acids in length. There is a conserved WGE sequence motif.	162
406560	pfam16174	IHABP4_N	Intracellular hyaluronan-binding protein 4 N-terminal. IHABP4_N is the N-terminal region of intracellular hyaluronan-binding protein 4-like and SERPINE1 mRNA binding protein 1-like proteins. This region carries nuclear localization sites, and may also be involved in the binding to some of the partners in the translational machinery.	155
292793	pfam16175	DUF4875	Domain of unknown function (DUF4875). Small protein family, with members present in a few proteobacteria, mostly Desulfovibrio species, but also in a Vibrio phage vB, suggesting a possible phage origin. The experimentally determined structure shows a fold reminiscent of a thioesterase/thiol ester dehydrase-isomerase fold, but the functional consequences of this similarity are not clear.	127
406561	pfam16176	T-box_assoc	T-box transcription factor-associated. This domain lies downstream of the T-box in many eukaryotic T-box proteins. The exact function is not known.	226
406562	pfam16177	ACAS_N	Acetyl-coenzyme A synthetase N-terminus. This domain is found at the N-terminus of many acetyl-coenzyme A synthetase enzymes.	55
406563	pfam16178	Anoct_dimer	dimerization domain of Ca+-activated chloride-channel, anoctamin. This family appears to be the cytoplasmic domain of the calcium-activated chloride-channel, anoctamin, protein. It is responsible for creating the homodimeric architecture of the chloride-channel proteins.	224
406564	pfam16179	RHD_dimer	Rel homology dimerization domain. The Rel homology domain (RHD) is composed of two structural domains, an N-terminal DNA_binding domain (pfam00554) and a C-terminal dimerization domain. This is the dimerization domain.	103
374415	pfam16180	RelB_leu_zip	RelB leucine zipper. This domain is a leucine zipper found in RelB transcription factors.	84
406565	pfam16181	RelB_transactiv	RelB transactivation domain. This domain is the transactivation domain of the transcription factor RelB.	181
406566	pfam16182	AbLIM_anchor	Putative adherens-junction anchoring region of AbLIM. AbLIM_anchor is a domain lying between the LIM actin-binding and the villin-head domain of actin-binding LIM proteins. It is likely that this domain is involved in anchoring abLIMs to circumferential actin bundles in specific cell types.	324
406567	pfam16183	Kinesin_assoc	Kinesin-associated. 	171
406568	pfam16184	Cadherin_3	Cadherin-like. 	111
406569	pfam16185	MTABC_N	Mitochondrial ABC-transporter N-terminal five TM region. MTABC_N is the N-terminal five transmembrane helices of eukaryotic mitochondrial ABC-transporters.	244
406570	pfam16186	Arm_3	Atypical Arm repeat. This atypical Arm repeat appears at the very C-terminus of eukaryotic proteins such as importin subunit alpha-2, as the last of the repeating units.	48
406571	pfam16187	Peptidase_M16_M	Middle or third domain of peptidase_M16. Peptidase_M16_M is the third domain of peptidase_M16 in eukaryotes of the insulin-degrading-enzyme type. Insulin-degrading enzymes - insulysin - are zinc metallopeptidases that metabolize several bioactive peptides, including insulin and the amyloid-beta-peptide. The tertiary structure of insulin-degrading enzymes resembles a clamshell composed of four structurally similar domains arranged to enclose a large central chamber. Substrates must enter the chamber, and it is likely that a hinge-like conformational change allows substrate binding and product release. Triphosphates are found to dock between the inner surfaces of the non-catalytic domains three and four.	283
406572	pfam16188	Peptidase_M24_C	C-terminal region of peptidase_M24. This is a short region at the C-terminus of a number of metallo-peptidases of the M24 family.	63
406573	pfam16189	Creatinase_N_2	Creatinase/Prolidase N-terminal domain. 	159
406574	pfam16190	E1_FCCH	Ubiquitin-activating enzyme E1 FCCH domain. This domain is found in the ubiquitin-activating E1 family enzymes.	69
406575	pfam16191	E1_4HB	Ubiquitin-activating enzyme E1 four-helix bundle. This domain is found in the ubiquitin-activating E1 family enzymes.	64
406576	pfam16192	PMT_4TMC	C-terminal four TMM region of protein-O-mannosyltransferase. PMT_4TMC is the C-terminal four membrane-pass region of protein-O-mannosyltransferases and similar enzymes.	198
406577	pfam16193	AAA_assoc_2	AAA C-terminal domain. AAA_assoc_2 is found at the C-terminus of a relatively small set of AAA domains in proteins ranging from archaeal to fungi, plants and mammals.	80
406578	pfam16195	UBA2_C	SUMO-activating enzyme subunit 2 C-terminus. 	93
406579	pfam16197	KAsynt_C_assoc	Ketoacyl-synthetase C-terminal extension. KAsynt_C_assoc represents the very C-terminus of a subset of proteins from the keto-acyl-synthetase 2 family. It is found in proteins ranging from bacteria to human.	111
406580	pfam16198	TruB_C_2	tRNA pseudouridylate synthase B C-terminal domain. This C-terminal region is found on a subset of TruB_B protein family members pfam01509. It is found from bacteria and archaea to fungi, plants and human.	65
406581	pfam16199	Radical_SAM_C	Radical_SAM C-terminal domain. This domain is found as a C-terminal extension to a subset of Radical_SAM domains. It is found in archaeal, bacterial, fungal, plant and human proteins.	83
406582	pfam16200	Band_7_C	C-terminal region of band_7. This domain is found on a subset of proteins as a C-terminal extension of the Band_7 family, pfam01145. It is found in proteins from bacteria to fungi, plants and mammals.	63
406583	pfam16201	NopRA1	Nucleolar pre-ribosomal-associated protein 1. This family is found on the long vertebrate and plant nucleolar proteins that also carry Npa1, pfam11707.	196
406584	pfam16202	BLM_N	N-terminal region of Bloom syndrome protein. BLM_N is the very N-terminal region of chordate Bloom syndrome proteins. The exact function is not known.	368
406585	pfam16203	ERCC3_RAD25_C	ERCC3/RAD25/XPB C-terminal helicase. This is the C-terminal helicase domain of ERCC3, RAD25 and XPB helicases.	248
406586	pfam16204	BDHCT_assoc	BDHCT-box associated domain on Bloom syndrome protein. This family is found on Bloom syndrome-associated DEAD-box helicases in higher eukaryotes. It lies between the BDHCT, and DEAD-box families, pfam08072 and pfam00270.	223
406587	pfam16205	Ribosomal_S17_N	Ribosomal_S17 N-terminal. This short N-terminal region is found in a number of higher eukaryotic ribosomal subunit 17 proteins.	67
374431	pfam16206	Mon2_C	C-terminal region of Mon2 protein. Mon2 proteins are found from fungi to plants and humans; Mon2 is a scaffold protein involved in multiple aspects of endomembrane trafficking. This C-terminal region is essential for Mon2 activity.	827
406588	pfam16207	RAWUL	RAWUL domain RING finger- and WD40-associated ubiquitin-like. The RAWUL domain is found at the C-terminus of poly-comb group RING finger proteins. It is a ubiquitin-like domain. RAWUL binds directly to PUFD, a domain on BCOR proteins (BCL6 corepressor). BCOR has emerged as an important player in development and health.	66
406589	pfam16208	Keratin_2_head	Keratin type II head. 	160
406590	pfam16209	PhoLip_ATPase_N	Phospholipid-translocating ATPase N-terminal. PhoLip_ATPase_N is found at the N-terminus of a number of phospholipid-translocating ATPases. It is found in higher eukaryotes.	67
406591	pfam16210	Keratin_2_tail	Keratin type II cytoskeletal 1 tail. 	135
406592	pfam16211	Histone_H2A_C	C-terminus of histone H2A. 	35
406593	pfam16212	PhoLip_ATPase_C	Phospholipid-translocating P-type ATPase C-terminal. PhoLip_ATPase_C is found at the C-terminus of a number of phospholipid-translocating ATPases. It is found in higher eukaryotes.	249
406594	pfam16213	DCB	dimerization and cyclophilin-binding domain of Mon2. DCB is the N-terminal domain of Mon2- and GIG1-like proteins from metazoa. Mon2 and BIG1 like proteins play an important role in the cytoplasm-to-vacuole transport pathway and are required for Golgi homeostasis.	172
318454	pfam16214	AC_N	Adenylyl cyclase N-terminal extracellular and transmembrane region. This family covers the N-terminal extracellular region and the first transmembrane 5-6 pass region of adenylate cyclase.	415
406595	pfam16215	DUF4876	Protein of unknown function (DUF4876). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 392 and 433 amino acids in length. There is a conserved NNS sequence motif.	186
406596	pfam16216	GxGYxYP_N	GxGYxY sequence motif in domain of unknown function N-terminal. This domain is found in bacteria, archaea and eukaryotes, and is typically between 213 and 231 amino acids in length. This domain is found in association with pfam14323.	215
406597	pfam16217	M64_N	Peptidase M64 N-terminus. This domain is found at the N-terminus of IgA Peptidase M64. Its function is unknown.	115
406598	pfam16218	Peptidase_C101	Peptidase family C101. This is a family of cysteine-peptidases that is conserved in vertebrates. The key residues as found in human OTULIN are Asp126, Cys129, His339 and Asn341.	264
406599	pfam16219	DUF4879	Domain of unknown function (DUF4879). A family of short bacterial proteins of phage origin, exemplified by protein SPBc2p013 from Bacillus phage SPBc2, found in various Bacillus and Pseudomonas species. The structure shows unexpected structural similarity to Greek key beta barrels from the E-set domain, especially to domains involved in carbohydrate and protein-protein binding. However, the functional consequences of this similarity are not confirmed.	123
406600	pfam16220	DUF4880	Domain of unknown function (DUF4880). This domain can be found at the N-terminus of uncharacterized proteins from various Rhodopseudomonas and Pseudomonas species, often, but not always, followed by the iron siderophore sensor protein family (FecR, PF04773). The function of this domain is unknown.	43
379798	pfam16221	HTH_47	winged helix-turn-helix. HTH_47 is an example of a circularly permuted winged helix-turn-helix domain. HTH_47 is found at the very C-terminus of DUF2172, which is structurally similar to M28-peptidases but lacking one of the key zinc-binding residues.	77
339666	pfam16222	DUF4881	Domain of unknown function (DUF4881). This small family consists of several uncharacterized proteins around 200 residues in length and is mainly found in various Desulfovibrio species. The function of this protein is unknown.	180
374442	pfam16223	DUF4882	Domain of unknown function (DUF4882). This small family consists of several uncharacterized proteins around 325 residues in length and is mainly found in various Acinetobacter species. The function of this family is unknown.	267
406601	pfam16224	DUF4883	Domain of unknown function (DUF4883). This family consists of several uncharacterized proteins around 160 residues in length and is mainly found in various Clostridium species. The function of this family is unknown.	118
406602	pfam16225	DUF4884	Domain of unknown function (DUF4884). This family consists of several uncharacterized proteins around 90 residues in length and is mainly found in various Bacteroides species. The function of this family is unknown.	49
292843	pfam16226	DUF4885	Domain of unknown function (DUF4885). This family consists of several uncharacterized proteins around 390 residues in length and is mainly found in various Bacillus subtilis species. This family is predicted to be functional in biosynthesis of rhizocticins and antifungal phosphonate oligopeptides, but the specific function of this family is still unknown.	325
406603	pfam16227	DUF4886	Domain of unknown function (DUF4886). This domain is mainly found in uncharacterized proteins around 290 residues in length and is mainly found in various Bacteroides species. It has a curved central beta sheet flanked by helices. Distant homolog analysis showed it has a similarity with the GDSL-like Lipase/Acylhydrolase family. The function of this domain is still unknown.	250
374444	pfam16228	DUF4887	Domain of unknown function (DUF4887). This family consists of uncharacterized proteins around 210 residues in length and is mainly found in various Staphylococcus species. The function of this family is unknown.	176
292846	pfam16229	DUF4888	Domain of unknown function (DUF4888). This family consists of uncharacterized proteins around 190 residues in length and is mainly found in various Staphylococcus species. The function of this family is unknown.	141
406604	pfam16230	DUF4889	Domain of unknown function (DUF4889). This family consists of uncharacterized proteins around 110 residues in length and is mainly found in various Staphylococcus aureus species. The function of this family is unknown.	71
406605	pfam16231	DUF4890	Domain of unknown function (DUF4890). This family consists of uncharacterized proteins around 200 residues in length and is mainly found in various Bacteroides species. The function of this family is unknown.	109
406606	pfam16232	DUF4891	Domain of unknown function (DUF4891). This family consists of uncharacterized proteins around 140 residues in length and is mainly found in various Bacteroides species. The function of this family is unknown.	92
406607	pfam16233	DUF4893	Domain of unknown function (DUF4893). This family consists of uncharacterized proteins around 200 residues in length and is mainly found in various Pseudomonas species. The function of this family is unknown.	171
406608	pfam16234	DUF4892	Domain of unknown function (DUF4892). This family consists of uncharacterized proteins around 270 residues in length and is mainly found in various Pseudomonas aeruginosa species. The function of this family is unknown.	182
406609	pfam16235	DUF4894	Domain of unknown function (DUF4894). A small family of uncharacterized proteins around 180 residues in length and found in various Thermotoga species. The function of this family is unknown.	122
406610	pfam16236	DUF4895	Domain of unknown function (DUF4895). A small family of uncharacterized proteins around 250 residues in length and found in various Thermotoga species. The function of this family is unknown.	218
406611	pfam16237	DUF4896	Domain of unknown function (DUF4896). A small family of uncharacterized proteins around 50 or 570 residues in length and found in various Thermotoga species. The function of this family is unknown.	44
406612	pfam16238	DUF4897	Domain of unknown function (DUF4897). A small family of uncharacterized proteins around 200 residues in length and found in various Thermotoga species. The function of this family is unknown.	152
406613	pfam16239	DUF4898	Domain of unknown function (DUF4898). A small family of uncharacterized proteins around 100 residues in length and found in various Sulfolobus species. The function of this family is unknown.	82
406614	pfam16240	DUF4899	Domain of unknown function (DUF4899). A small family of uncharacterized proteins around 340 residues in length and found in various Thermotoga and Thermosipho species. The function of this family is unknown.	283
292858	pfam16241	DUF4900	Domain of unknown function (DUF4900). This family consists of uncharacterized proteins around 600 residues in length and is mainly found in various Thermotoga and Fervidobacterium species. The function of this family is unknown.	89
406615	pfam16242	Pyrid_ox_like	Pyridoxamine 5'-phosphate oxidase like. This domain, approximately 140 residues in length, is mainly found in general stress proteins in various Xanthomonas species. It is composed of a six-stranded antiparallel beta-barrel flanked by five alpha-helices and can bind to FMN and FAD, suggesting that it may help the bacteria to react against the oxidative stress induced by the defense mechanisms of the plant.	149
374451	pfam16243	Sm_like	Sm_like domain. This domain, approximately 150 residues, is mainly found in several uncharacterized proteins in various Prochlorococcus and Synechococcus species. The crystal structure of ECX21941 reveals unexpected similarity to Sm/LSm proteins, which are important RNA-binding proteins, despite no detectable sequence similarity. The specific function of this family is unknown, but the structure analysis of ECX21941 indicates nucleic acid-binding capabilities and suggests a role in RNA and/or DNA processing.	85
406616	pfam16244	DUF4901	Domain of unknown function (DUF4901). This family consists of uncharacterized proteins around 470 residues in length and is mainly found in various Bacillus subtilis species. The function of this family is unknown.	228
406617	pfam16245	DUF4902	Domain of unknown function (DUF4902). A family of uncharacterized proteins around 140 residues in length and found in various Acidithiobacillaceae and Acinetobacter species. It may be functional in extreme acidophile Acidithiobacillus ferrooxidans, but the specific function of this family is unknown.	118
406618	pfam16246	DUF4903	Domain of unknown function (DUF4903). A small family of uncharacterized proteins around 210 residues in length and found in various Bacteroides and Prevotella species. The function of this family is unknown.	190
292864	pfam16247	DUF4904	Domain of unknown function (DUF4904). This domain, approximately 130 residues in length, is mainly found in several uncharacterized proteins around 340 residues in Actinobacteria, Cyanobacteria and Metazoa species. It is mainly composed of antiparallel beta sheets and has a cystatin-like fold, but the specific function of this family is unknown.	127
379805	pfam16248	DUF4905	Domain of unknown function (DUF4905). A small family of uncharacterized proteins around 270 residues in length and found in various Cytophagales, Sphingobacteriaceae and Ignavibacteriaceae species. The function of this family is unknown.	81
406619	pfam16249	DUF4906	Domain of unknown function (DUF4906). A family of uncharacterized proteins around 300 residues in length and found in various Bacteroides species. The function of this family is unknown.	203
406620	pfam16250	DUF4907	Domain of unknown function (DUF4907). A family of uncharacterized proteins around 110 residues in length and found in various Bacteroides species. The function of this family is unknown.	65
406621	pfam16251	NAR	Nucleic acid-binding domain (NAR). This domain, approximately 100 residues in length, is mainly found in Orf1a polyproteins in severe acute respiratory syndrome coronavirus. The global domain of the NAR represents a new fold, with a parallel four-strand beta-sheet holding two alpha-helices of three and four turns that are oriented antiparallel to the beta-strands and a group of residues form a positively charged patch on the protein surface as the binding site responsible for binding affinity for nucleic acids.	129
318482	pfam16252	DUF4908	Domain of unknown function (DUF4908). A small family of uncharacterized proteins around 260 residues in length and found in various Caulobacter and Brevundimonas species. The function of this family is unknown.	221
292870	pfam16253	DUF4909	Domain of unknown function (DUF4909). This family of proteins is found in bacteria. Proteins in this family are approximately 160 amino acids in length. Several members are associated with vancomycin virulence in Staph. aureus in some way. These proteins are all lipoproteins, carrying the characteristic prokaryotic membrane-attachment site at their N-termini.	127
406622	pfam16254	DUF4910	Domain of unknown function (DUF4910). 	339
318484	pfam16255	Lipase_GDSL_lke	GDSL-like Lipase/Acylhydrolase. 	202
406623	pfam16256	DUF4911	Domain of unknown function (DUF4911). This family consists of uncharacterized proteins around 75 residues in length and is mainly found in various Thermotoga species. The function of this family is unknown.	57
406624	pfam16257	UxaE	tagaturonate epimerase. This family consists of uncharacterized proteins around 500 residues in length and is mainly found in various Bacteria species, such as Thermotoga, Paenibacillus and Rhodothermus. A newly recognized enzyme from the galacturonate utilization pathway in T. maritima with tagaturonate epimerase activity.	475
406625	pfam16258	DUF4912	Domain of unknown function (DUF4912). This family consists of uncharacterized proteins around 160 residues in length and is mainly found in various Clostridium species. The function of this family is unknown.	117
406626	pfam16259	DUF4913	Domain of unknown function (DUF4913). This family consists of uncharacterized proteins around 150 residues in length and is mainly found in various Arthrobacter species. This family may function in enabling the growth of Arthrobacter sp. strain JBH1 with nitroglycerin as the sole source of carbon and nitrogen.	105
406627	pfam16260	DUF4914	Domain of unknown function (DUF4914). This family consists of uncharacterized proteins around 630 residues in length and is mainly found in various Thermotoga, Thermoanaerobacter and Carboxydibrachium species. The function of this family is unknown.	606
406628	pfam16261	DUF4915	Domain of unknown function (DUF4915). This family consists of uncharacterized proteins around 370 residues in length and is mainly found in various species, such as Shewanella, Rheinheimera, Saccharophagus, Leptolyngbya and so on. It contains several TPR repeat-containing proteins. The function of this family is unknown.	314
406629	pfam16262	DUF4916	Domain of unknown function (DUF4916). This domain family consists of uncharacterized proteins around 175 residues in length and is mainly found in various Streptomyces species. The function of this family is unknown. This family is related to the NUDIX hydrolases.	169
374458	pfam16263	DUF4917	Domain of unknown function (DUF4917). This family consists of uncharacterized proteins around 340 residues in length and is mainly found in various Burkholderia and Brucella species. The function of this family is unknown.	311
339675	pfam16264	SatD	SatD family (SatD). This family consists of uncharacterized proteins around 220 residues in length and is mainly found in various Streptococcus species. This family is involved in acid resistance.	211
406630	pfam16265	DUF4918	Domain of unknown function (DUF4918). This family consists of uncharacterized proteins around 230 residues in length and is mainly found in various Listeria species. The function of this family is unknown.	224
406631	pfam16266	DUF4919	Domain of unknown function (DUF4919). This family consists of uncharacterized proteins around 230 residues in length and is mainly found in various Bacteroides and Prevotella species. The function of this family is unknown.	184
406632	pfam16267	DUF4920	Domain of unknown function (DUF4920). This family consists of uncharacterized proteins around 190 residues in length and is mainly found in various Bacteroides species. The function of this family is unknown.	85
406633	pfam16268	DUF4921	Domain of unknown function (DUF4921). This family consists of uncharacterized proteins around 450 residues in length and is mainly found in various Corynebacterium species. Several proteins are predicted as galactose-1-phosphate uridylyltransferases. The function of this family is unknown.	425
406634	pfam16269	DUF4922	Domain of unknown function (DUF4922). This family consists of uncharacterized proteins around 310 residues in length and is mainly found in various Bacteroides and Parabacteroides species. Several members are annotated as putative glycosyltransferases, but the specific function of this family is still unknown.	188
406635	pfam16270	DUF4923	Domain of unknown function (DUF4923). This family consists of uncharacterized proteins around 200 residues in length and is mainly found in various Bacteroides and Parabacteroides species. The function of this family is unknown.	175
406636	pfam16271	DUF4924	Domain of unknown function (DUF4924). This family consists of uncharacterized proteins around 180 residues in length and is mainly found in various Parabacteroides and Bacteroides species. The function of this family is unknown.	179
406637	pfam16272	DUF4925	Domain of unknown function (DUF4925). This family consists of uncharacterized proteins around 400 residues in length and is mainly found in various Bacteroides species. The function of this family is unknown.	339
406638	pfam16273	NuDC	Nuclear distribution C domain. This domain, approximately 40-50 residues in length, is mainly found in nuclear migration proteins in various Mammalia species. It may play a role not only in mitosis and cytokinesis, but also in interkinetic nuclear migration and neuronal migration during neocortical development.	64
406639	pfam16274	Qua1	Qua1 domain. This domain, approximately 40 residues in length, is mainly found in KH-domain containing, RNA-binding, signal transduction-associated protein 1 from yeast to human. It forms a homodimer composed of a perpendicular interaction of two helical hairpins, and the Qua1 domain is sufficient for homodimerization which is required for the regulation of alternative splicing.	52
406640	pfam16275	SF1-HH	Splicing factor 1 helix-hairpin domain. This domain, approximately 100 residues in length, is mainly found in splicing factor 1 from yeast to human. It is a helix-hairpin domain, which forms a secondary, hydrophobic interface with U2AF65(UHM) to lock the orientation of the two subunits, which is essential for cooperative formation of the ternary SF1-U2AF65-RNA complex. The domain contains a highly conserved SPSP motif at its C-terminus, and phosphorylation of the SPSP motif induces a disorder-to-order transition within a novel SF1/U2AF65 interface, indicating a phosphorylation-dependent control of pre-mRNA splicing factors.	114
406641	pfam16276	NPM1-C	Nucleophosmin C-terminal domain. This domain, approximately 50 residues in length, is mainly found in Nucleophosmin proteins in mammalia species. Nucleophosmin, a nucleocytoplasmic shuttling protein, is associated with cancer and involved in several cellular functions, such as ribosome maturation and export, centrosome duplication, and response to stress stimuli. This domain has a three-helix bundle which can bind G-quadruplex DNA and the interaction involves helices H1 and H2 of the NPM1-C domain mainly through electrostatic contacts with G-quadruplex phosphates, indicating a crucial role in rescuing its function in leukemia.	47
406642	pfam16277	DUF4926	Domain of unknown function (DUF4926). This family consists of uncharacterized proteins around 70 residues in length and is mainly found in various Caulobacter, Microcystis and Cyanothece species. The function of this family is unknown.	58
406643	pfam16278	zf-C2HE	C2HE / C2H2 / C2HC zinc-binding finger. zf-C2HE is an unusual zinc-binding domain found in fungi, plants and metazoa. It is often found at the C-terminus of HIT-domain-containing proteins, pfam01230. In fungi the fourth ligand is a Glu, in plants it is Cys and in metazoans it is usually a His. The fourth ligand is often mutated in neurodegenerative disease-states.	60
406644	pfam16279	DUF4927	Domain of unknown function (DUF4927). This family, around 80 residues, consists of uncharacterized and nuclear receptor coactivator 2 proteins and is mainly found in mammalia species. The specific function of this family is still unknown.	89
374468	pfam16280	DUF4928	Domain of unknown function (DUF4928). This family consists of uncharacterized proteins around 330 residues in length and is mainly found in various Bacteria species, such as Enterobacteriales, Clostridiales, Actinomycetales and so on. The function of this family is unknown.	306
406645	pfam16282	SANT_DAMP1_like	SANT/Myb-like domain of DAMP1. This domain, approximately 90 residues, is mainly found in DNA methyltransferase 1-associated protein 1 (DAMP1) that plays an important role in development and maintenance of genome integrity in various mammalia species. It mainly consists of tandem repeats of three alpha-helices that are arranged in a helix-turn-helix motif and shows a structural similarity with SANT domain and Myb DNA-binding domain, indicating it contains a putative DNA binding site.	80
406646	pfam16283	DUF4929	Domain of unknown function (DUF4929). This family consists of uncharacterized proteins around 400 residues in length and is mainly found in various species, such as Bacteroides, Capnocytophaga and Prevotella. The function of this family is unknown.	366
374471	pfam16284	DUF4930	Domain of unknown function (DUF4930). A small family of uncharacterized proteins around 150 residues in length and found in various Staphylococcus aureus species. The function of this family is unknown.	144
406647	pfam16285	DUF4931	Domain of unknown function (DUF4931). This family consists of uncharacterized proteins around 270 residues in length and is mainly found in various Bacillus cereus species. Some members of this family are annotated as Galactose-1-phosphate uridylyltransferases, but the specific function of this family is unknown.	245
406648	pfam16286	DUF4932	Domain of unknown function (DUF4932). This family consists of uncharacterized proteins around 460 residues in length and is mainly found in various Bacteroides species, such as Bacteroides fragilis, Bacteroides sp. and so on. Several members are annotated as putative metalloproteases, but the specific function of this family is unknown.	330
406649	pfam16287	DUF4933	Domain of unknown function (DUF4933). This family consists of uncharacterized proteins around 450 residues in length and is mainly found in various species, such as Bacteroides and Parabacteroides. Several members are annotated as putative transmembrane proteins, but the specific function of this family is unknown.	386
406650	pfam16288	DUF4934	Domain of unknown function (DUF4934). This family consists of uncharacterized proteins around 400 residues in length and is mainly found in various Bacteroides species, such as Bacteroides fragilis and Bacteroides sp. The function of this family is unknown.	102
406651	pfam16289	DUF4935	Domain of unknown function (DUF4935). This family consists of uncharacterized proteins around 350 residues in length and is mainly found in various species, such as Prevotella, Pseudomonas, Leptospira and so on. The function of this family is unknown.	171
406652	pfam16290	DUF4936	Domain of unknown function (DUF4936). This family consists of uncharacterized proteins around 100 residues in length and is mainly found in various Burkholderiales species, such as Herbaspirillum, Cupriavidus, Ralstonia and so on. The function of this family is unknown.	87
406653	pfam16291	DUF4937	Domain of unknown function (DUF4937). This family consists of uncharacterized proteins around 120 residues in length and is mainly found in various Bacillus species, such as Bacillus subtilis and Bacillus amyloliquefaciens. Several members are annotated as ydbC, but the specific function of this family is unknown.	89
292908	pfam16292	DUF4938	Domain of unknown function (DUF4938). This small family consists of several uncharacterized proteins around 300 residues in length and is mainly found in various Chloroflexus, Comamonas, Delftia, Rubrivivax and Roseiflexus species. Several members are annotated as cyanophycin synthetases, but the function of this family is unknown.	302
406654	pfam16293	zf-C2H2_9	C2H2 type zinc-finger (1 copy). 	57
406655	pfam16294	RSB_motif	RNPS1-SAP18 binding (RSB) motif. The RSB motif on the Acinus protein is the core around which the ASAP complex is built. The apoptosis and splicing-associated protein complex, ASAP, is made up of three proteins, SAP18 (Sin3-associated protein of 18 kDa), RNA-binding protein S1 (RNPS1) and apoptotic chromatin inducer in the nucleus (Acinus). The ASAP complex appears to be an assembly of proteins at the interface between transcription, splicing and NMD, acting as a hub in the network of protein-interactions that regulate gene-expression.	91
292911	pfam16295	TetR_C_10	Tetracycline repressor, C-terminal all-alpha domain. 	132
406656	pfam16296	TM_PBP2_N	N-terminal of TM subunit in PBP-dependent ABC transporters. This family mainly consists of Transmembrane subunit (TM) found in Periplasmic Binding Protein (PBP)-dependent ATP-Binding Cassette (ABC) transporters which generally bind type 2 PBPs, such as Binding-protein-dependent transport systems inner membrane component and Maltose transport permease MalF. It is around 580 residues in length and is mainly found in various species, such as Thermotoga, Dictyoglomus, Thermosipho, Fervidobacterium, Mesotoga and so on. The function of this family is unknown.	78
406657	pfam16297	DUF4939	Domain of unknown function (DUF4939). This family consists of uncharacterized proteins around 110 residues in length and is mainly found in various mammalian species. LDOC1, a member of this family and a novel MZF-1-interacting protein, inhibits NF-kappaB activation and is associated with cancer and some other diseases. However, the specific function of this family is still unknown.	114
406658	pfam16298	DUF4940	Domain of unknown function (DUF4940). This family consists of several uncharacterized proteins around 250 residues in length and is mainly found in various Thermotoga species. The function of this family is unknown.	206
406659	pfam16299	DUF4941	Domain of unknown function (DUF4941). This family consists of several uncharacterized proteins around 300 residues in length and is mainly found in various Thermotoga species. The function of this family is unknown.	265
406660	pfam16300	WD40_4	Type of WD40 repeat. Most members of this family form part of the 7-bladed beta-propeller at the N-terminus of coronin proteins.	44
339683	pfam16301	DUF4943	Domain of unknown function (DUF4943). This small family consists of several uncharacterized proteins around 170 residues in length and is mainly found in various Bacteroides species. The function of this family is unknown.	150
292918	pfam16302	DUF4944	Domain of unknown function (DUF4944). This family consists of uncharacterized proteins around 160 residues in length and is mainly found in various Bacillus species. The function of this family is unknown.	128
292919	pfam16303	DUF4945	Domain of unknown function (DUF4945). This small family consists of uncharacterized proteins around 140 residues in length and is mainly found in various Bacteroides species, such as Bacteroides fragilis and Bacteroides sp. The function of this family is unknown.	115
318518	pfam16304	DUF4946	Domain of unknown function (DUF4946). This small family consists of uncharacterized proteins around 180 residues in length and is mainly found in various Pseudomonas species, especially in Pseudomonas aeruginosa. The function of this family is unknown.	152
406661	pfam16305	DUF4947	Domain of unknown function (DUF4947). This small family consists of uncharacterized proteins around 220 residues in length and is mainly found in various Streptococcus mutans species. The function of this family is unknown.	169
374480	pfam16306	DUF4948	Domain of unknown function (DUF4948). This small family consists of uncharacterized proteins around 200 residues in length and is mainly found in various Bacteroides, Paraprevotella, Parabacteroides and Alistipes species. The function of this family is unknown.	171
292923	pfam16307	DUF4949	Domain of unknown function (DUF4949). This small family consists of uncharacterized proteins around 140 residues in length and is mainly found in various Legionella pneumophila and longbeachae species. The function of this family is unknown.	107
292924	pfam16308	DUF4950	Domain of unknown function (DUF4950). This family consists of several uncharacterized proteins around 250 residues in length and is mainly found in various Enterococcus faecalis species. The function of this family is unknown.	191
339686	pfam16309	DUF4951	Domain of unknown function (DUF4951). This family consists of several uncharacterized proteins around 125 residues in length and is mainly found in various Acinetobacter baumannii species. The function of this family is unknown.	83
374481	pfam16310	DUF4952	Domain of unknown function (DUF4952). This family consists of several uncharacterized proteins around 150 residues in length and is mainly found in various Leptospira, Pseudomonas, Stenotrophomonas and Desulfovibrio species. The function of this family is unknown.	77
406662	pfam16311	TMEM100	Transmembrane protein 100. This family of proteins is found in eukaryotes. Proteins in this family are approximately 130 amino acids in length. There is some apparent similarity with the phosphoinositide-interacting protein family PIRT, pfam15099, because those proteins are also transmembrane proteins.	132
406663	pfam16312	Oberon_cc	Coiled-coil region of Oberon. Oberon_cc is the coiled-coil region of Oberon proteins from plants. Oberon is necessary for maintenance and/or establishment of both the shoot and root apical meristems in Arabidopsis. Most Oberon proteins carry a PHD finger domain, pfam07227 and this coiled-coil domain. Oberon proteins mediate the TMO7 (the direct target of MP) expression through modification of, or binding to, chromatin at the TMO7 locus. TMO7 stands for the target of Monopteros 7 (or Auxin response factor 7).	129
406664	pfam16313	DUF4953	Met-zincin. This is a family of uncharacterized proteins that carry the highly characteristic met-zincin motif HExxHxxGxxH, the extended zinc-binding domain of metallopeptidases.	319
406665	pfam16314	DUF4954	Domain of unknown function (DUF4954). This family consists of uncharacterized proteins around 660 residues in length and is mainly found in various Bacteroides species. The function of this protein is unknown.	653
406666	pfam16315	DUF4955	Domain of unknown function (DUF4955). This family consists of uncharacterized proteins around 850 residues in length and is mainly found in various Bacteroides species. The function of this protein is unknown.	149
406667	pfam16316	DUF4956	Domain of unknown function (DUF4956). This family consists of uncharacterized proteins around 220 residues in length and is mainly found in various Bacteroides species. The function of this protein is unknown.	169
406668	pfam16317	Glyco_hydro_99	Glycosyl hydrolase family 99. This domain, around 350 residues, is mainly found in some uncharacterized proteins from bacteroides to human. Some proteins in this family, annotated as endo-alpha-mannosidases cleave mannoside linkages internally within an N-linked glycan chain, short circuiting the classical N-glycan biosynthetic pathway. This domain reveals a (beta-alpha)(8) barrel fold in which the catalytic centre is present in a long substrate-binding groove, consistent with cleavage within the N-glycan chain, providing a foundation upon which to develop new enzyme inhibitors targeting the hijacking of N-glycan synthesis in viral disease and cancer.	342
406669	pfam16318	DUF4957	Domain of unknown function (DUF4957). This family consists of uncharacterized proteins around 150 residues in length and is mainly found in various Bacteroides and Prevotella species. The function of this protein is unknown.	141
406670	pfam16319	DUF4958	Domain of unknown function (DUF4958). This family consists of uncharacterized proteins around 720 residues in length and is mainly found in various Bacteroides species. The function of this protein is unknown.	731
406671	pfam16320	Ribosomal_L12_N	Ribosomal protein L7/L12 dimerization domain. This is the N-terminal dimerization domain of ribosomal protein L7/L12.	48
406672	pfam16321	Ribosom_S30AE_C	Sigma 54 modulation/S30EA ribosomal protein C-terminus. This domain often occurs at the C-terminus of proteins containing pfam02482.	53
406673	pfam16322	Tub_N	Tubby N-terminal. Tub_N is the N-terminal region of Tubby proteins. It carries a nuclear localization signal and is able to activate transcription.	200
406674	pfam16323	DUF4959	Domain of unknown function (DUF4959). This family consists of uncharacterized proteins around 400 residues in length and is mainly found in various Bacteroides, Pedobacter and Parabacteroides species. Several proteins are annotated as Galactose-binding like proteins, but the specific function of this protein is unknown.	106
406675	pfam16324	DUF4960	Domain of unknown function (DUF4960). This family consists of uncharacterized proteins around 460 residues in length and is mainly found in various Bacteroides species. The function of this protein is unknown.	253
406676	pfam16325	Peptidase_U32_C	Peptidase family U32 C-terminal domain. This domain is found at the C-terminus of many members of Peptidase family U32 (pfam01136).	80
406677	pfam16326	ABC_tran_CTD	ABC transporter C-terminal domain. This domain is found at the C-terminus of ABC transporters. It has a coiled coil structure with an atypical 3(10)-helix in the alpha-hairpin region. It is involved in DNA_binding.	69
406678	pfam16327	CcmF_C	Cytochrome c-type biogenesis protein CcmF C-terminal. This C-terminal region of CcmF, one of the cytochrome c-type biogenesis proteins, is associated at the C-terminal with Cytochrome_C_asm family pfam01578. It is possible that it is this domain which delivers reductant to haem on CcmE.	323
292944	pfam16328	DUF4961	Domain of unknown function (DUF4961). This small family consists of several uncharacterized proteins around 350 residues in length and is mainly found in various Bacteroides species. The function of this protein is unknown.	317
318535	pfam16329	Pestivirus_E2	Pestivirus envelope glycoprotein E2. 	372
406679	pfam16330	MukB_hinge	MukB hinge domain. The hinge domain of chromosome partition protein MukB is responsible for dimerization and is also involved in protein-DNA interactions and conformational flexibility.	167
406680	pfam16331	TolA_bind_tri	TolA binding protein trimerisation. This is the N-terminal domain of the YbgF protein. YbgF binds to TolA. This domain mediates trimerisation.	72
406681	pfam16332	DUF4962	Domain of unknown function (DUF4962). This family consists of uncharacterized proteins around 870 residues in length and is mainly found in various Bacteroides species. The function of this protein is unknown.	476
406682	pfam16334	DUF4964	Domain of unknown function (DUF4964). This family consists of uncharacterized proteins around 840 residues in length and is mainly found in various Bacteroides species. Several proteins in this family are annotated as Glutaminases, but the function of this protein is unknown.	87
406683	pfam16335	DUF4965	Domain of unknown function (DUF4965). This family consists of uncharacterized proteins around 840 residues in length and is mainly found in various Bacteroides species. Several proteins in this family are annotated as Glutaminases, but the function of this protein is unknown.	174
406684	pfam16338	DUF4968	Domain of unknown function (DUF4968). This family consists of uncharacterized proteins around 830 residues in length and is mainly found in various Bacteroides species. Several proteins in this family are annotated as alpha-glucosidases, but the function of this protein is unknown.	90
318542	pfam16339	DUF4969	Domain of unknown function (DUF4969). This small family consists of several uncharacterized proteins around 540 residues in length and is mainly found in various Bacteroides species. The function of this protein is unknown.	79
406685	pfam16341	DUF4971	Domain of unknown function (DUF4971). This small family consists of uncharacterized proteins around 370 residues in length and is mainly found in various Bacteroides species. The function of this protein is unknown.	139
292954	pfam16342	DUF4972	Domain of unknown function (DUF4972). This family consists of uncharacterized proteins around 490 residues in length and is mainly found in various Bacteroides species. The function of this protein is unknown.	128
406686	pfam16343	DUF4973	Domain of unknown function (DUF4973). This family consists of uncharacterized proteins around 340 residues in length and is mainly found in various Bacteroides and Prevotella species. The function of this protein is unknown.	130
406687	pfam16344	DUF4974	Domain of unknown function (DUF4974). This family consists of uncharacterized proteins around 340 residues in length and is mainly found in various Bacteroides and Parabacteroides species. The function of this protein is unknown.	70
406688	pfam16346	DUF4975	Domain of unknown function (DUF4975). This family consists of uncharacterized proteins around 500 residues in length and is mainly found in various Bacteroides species. Several proteins in this family are annotated as Glycosyl hydrolases, but the function of this protein is unknown.	176
406689	pfam16347	DUF4976	Domain of unknown function (DUF4976). This family consists of uncharacterized proteins around 530 residues in length and is mainly found in various Bacteroides species. Several proteins in this family are annotated as Arylsulfatases, but the function of this protein is unknown.	103
406690	pfam16348	Corona_NSP4_C	Coronavirus nonstructural protein 4 C-terminus. This is the C-terminal domain of the coronavirus nonstructural protein 4 (NSP4). NSP4 is a membrane-spanning protein which is thought to anchor the viral replication-transcription complex (RTC) to modified endoplasmic reticulum membranes. This predominantly alpha-helical domain may be involved in protein-protein interactions.	92
406691	pfam16349	DUF4978	Domain of unknown function (DUF4978). This family consists of uncharacterized proteins around 540 residues in length and is mainly found in various Bacteroides and Prevotella species. Several proteins in this family are annotated as Glycoside hydrolases, but the function of this protein is unknown.	172
406692	pfam16350	FAO_M	FAD dependent oxidoreductase central domain. This domain occurs in several FAD dependent oxidoreductases, such as Sarcosine dehydrogenase and Dimethylglycine dehydrogenase. It is situated between the DAO domain (pfam01266) and the GCV_T domain (pfam01571).	56
406693	pfam16351	DUF4979	Domain of unknown function (DUF4979). This family consists of uncharacterized proteins around 450 residues in length and is mainly found in various Bacteroides species. The function of this protein is unknown.	158
406694	pfam16352	DUF4980	Domain of unknown function (DUF4980). This family consists of uncharacterized proteins around 610 residues in length and is mainly found in various Bacteroides species. The function of this protein is unknown.	104
406695	pfam16353	DUF4981	Domain of unknown function (DUF4981). This family consists of uncharacterized proteins around 1000 residues in length and is mainly found in various Bacteroides species. The function of this protein is unknown.	90
406696	pfam16355	DUF4982	Domain of unknown function (DUF4982). This family is found in the C-terminal of uncharacterized proteins and beta-galactosidases around 680 residues in length from various Bacteroides species. The function of this protein is unknown.	62
374498	pfam16356	DUF4983	Domain of unknown function (DUF4983). This family consists of uncharacterized proteins around 600 residues in length and is mainly found in various Bacteroides species. The function of this protein is unknown.	93
374499	pfam16357	PepSY_TM_like_2	Putative PepSY_TM-like. This is a family of bacterial proteins with three PepSY-like TM regions.	197
406697	pfam16358	RcsF	RcsF lipoprotein. The RcsF lipoprotein is a component of the Rcs signaling system. It activates the Rcs system by transmitting signals from the cell surface to the histidine kinase RcsC.	110
406698	pfam16359	RcsD_ABL	RcsD-ABL domain. This domain is part of the RcsD histidine kinase. It recognizes the effector domain of RcsB.	103
406699	pfam16360	GTP-bdg_M	GTP-binding GTPase Middle Region. This family is located between the N-terminal domain and MMR_HSR1 50S ribosome-binding GTPase of GTP-binding HflX-like proteins. The full-length members bind and interact with the 50S ribosome and are GTPases, hydrolysing GTP/GDP/ATP/ADP. The function of this region is unknown.	79
406700	pfam16361	Peptidase_S8_N	N-terminal of Subtilase family. This is the N-terminal of Peptidase_S8 of subtilase family. It is around 100 residues in length from various Bacteroides species. The function of this family is unknown.	142
406701	pfam16362	YaiA	YaiA protein. This family of proteins is found in Enterobacteriaceae, where they are immediately downstream of a Shikimate kinase.	63
406702	pfam16363	GDP_Man_Dehyd	GDP-mannose 4,6 dehydratase. 	327
406703	pfam16364	Antigen_C	Cell surface antigen C-terminus. This repeated domain is found at the C-terminus of cell surface antigens. In the Streptococcus mutans antigen I/II there are three repeats of this domain, a cleft between the first two of these forms a binding site for the human salivary agglutinin (SAG).	171
292975	pfam16365	EutK_C	Ethanolamine utilization protein EutK C-terminus. This is the C-terminal domain of the ethanolamine utilization protein EutK. It is a helix-turn-helix domain and is predicted to bind to nucleic acids.	55
406704	pfam16366	CEBP_ZZ	Cytoplasmic polyadenylation element-binding protein ZZ domain. This ZZ-type zinc finger domain binds zinc via two conserved histidines in the C-terminal part of the domain.	56
406705	pfam16367	RRM_7	RNA recognition motif. 	94
406706	pfam16368	CEBP1_N	Cytoplasmic polyadenylation element-binding protein 1 N-terminus. This is the N-terminal domain of cytoplasmic polyadenylation element-binding protein 1.	307
406707	pfam16369	GH43_C	C-terminal of Glycosyl hydrolases family 43. This is the C-terminal of Glycosyl hydrolases family 43. It is around 100 residues in length from various Bacteroides species. The function of this family is unknown.	106
406708	pfam16370	MetallophosC	C terminal of Calcineurin-like phosphoesterase. This is the C-terminal of Calcineurin-like phosphoesterases. It is around 150 residues in length from various Bacteroides species. The function of this family is unknown.	156
406709	pfam16371	MetallophosN	N terminal of Calcineurin-like phosphoesterase. This is the N-terminal of Calcineurin-like phosphoesterases. It is around 150 residues in length from various Bacteroides species. The function of this family is unknown.	73
406710	pfam16372	DUF4984	Domain of unknown function (DUF4984). This domain is around 150 residues long and is located in the C-terminal of some uncharacterized proteins in various Bacteroides and Prevotella species. The function of this domain remains unknown.	163
406711	pfam16373	DUF4985	Domain of unknown function. This family around 100 residues locates in the C-terminal of some uncharacterized proteins in various Bacteroides and Prevotella species. The function of this family remains unknown.	114
406712	pfam16374	CIF	Cycle inhibiting factor (CIF). Cycle inhibiting factors (Cif) are bacterial effectors that interfere with the eukaryotic cell cycle. CIF induces an irreversible cell cycle arrest upon injection into the host cell. CIF blocks degradation of cyclin-dependent kinase inhibitors p21 and p27, inducing their accumulation in the cell. The x-ray crystal structure of Cif reveals it to be a divergent member of a superfamily of enzymes including cysteine proteases and acetyltransferases.	138
406713	pfam16375	DUF4986	Domain of unknown function. This family around 150 residues locates in the C-terminal of some uncharacterized proteins in various Bacteroides and Bacillus species. The function of this family remains unknown.	84
292986	pfam16376	fragilysinNterm	N-terminal domain of fragilysin. N-terminal domain of fragilysin, an extracellular metalloprotease toxin, which is the primary virulence factor of B. fragilis, an opportunistic pathogen of the human gut. The N-terminal domain of fragilysin inhibits fragilysin and is cleaved in a mature, virulent form.	144
406714	pfam16377	DUF4987	Domain of unknown function. This family around 150 residues locates in the C-terminal of some uncharacterized proteins in various Bacteroides and Prevotella species. The function of this family remains unknown.	145
406715	pfam16378	DUF4988	Domain of unknown function. This family around 200 residues locates in the N-terminal of some uncharacterized proteins in various Bacteroides and Alistipes species. The function of this family remains unknown. The N-terminus of this model has been clipped by ~30 residues as it was capturing parts of collagen sequences, pfam01391.	181
339721	pfam16379	DUF4989	Domain of unknown function (DUF4989). This family around 300 residues locates in the N-terminal of some uncharacterized proteins in various Bacteroides and Alistipes species. The function of this family remains unknown. This entry contains a duplication of a DUF1735-like domain.	293
406716	pfam16380	DUF4990	Domain of unknown function. This family around 150 residues locates in the C-terminal of some uncharacterized proteins in various Bacteroides species. The function of this family remains unknown.	142
406717	pfam16381	Coatomer_g_Cpla	Coatomer subunit gamma-1 C-terminal appendage platform. Coatomer_g_Cpla is the very C-terminal domain of the eukaryotic Coatomer subunit gamma-1 proteins. It acts as a platform domain to the C-terminal appendage. It carries one single protein/protein interaction site, which is the binding site for ARFGAP2 or ADP-ribosylation factor GTPase-activating protein. COPI-coated vesicles mediate retrograde transport from the Golgi back to the ER and intra-Golgi transport. The gamma-COPI is part of one of two subcomplexes that make up the heptameric coatomer complex along with the beta, delta and zeta subunits.	114
406718	pfam16383	DUF4992	Domain of unknown function. This family around 150 residues locates in the N-terminal of some uncharacterized proteins in various Bacteroides and Prevotella species. The function of this family remains unknown.	182
406719	pfam16384	DUF4993	Domain of unknown function. This family around 350 residues locates in the C-terminal of some uncharacterized proteins in various Bacteroides species. The function of this family remains unknown.	366
406720	pfam16385	DUF4994	Domain of unknown function. This family around 100 residues locates in the C-terminal of some uncharacterized proteins in various Bacteroides and Prevotella species. The function of this family remains unknown.	98
339724	pfam16386	DUF4995	Domain of unknown function. This family around 100 residues locates in the N-terminal of some uncharacterized proteins and glucuronyl hydrolases in various Bacteroides species. The function of this family remains unknown.	73
406721	pfam16387	DUF4996	Domain of unknown function. This family around 100 residues locates in the N-terminal of some glycerophosphoryl diester phosphodiesterases and uncharacterized proteins in various Bacteroides and Prevotella species. The function of this family remains unknown.	102
406722	pfam16389	DUF4998	Domain of unknown function. This family around 200 residues locates in the N-terminal of some uncharacterized proteins in various Bacteroides and Parabacteroides species. The function of this family remains unknown.	199
406723	pfam16390	DUF4999	Domain of unknown function. This family around 75 residues locates in the N-terminal of F5/8 type C domain proteins and some uncharacterized proteins in various Bacteroides species. The function of this family remains unknown.	76
406724	pfam16391	DUF5000	Domain of unknown function. This family around 200 residues locates in the C-terminal of some uncharacterized proteins in various Bacteroides and Parabacteroides species. The function of this family remains unknown.	149
406725	pfam16392	DUF5001	Domain of unknown function. This family around 100 residues locates in the C-terminal of some uncharacterized proteins in various Bacteroides and Parabacteroides species. The function of this family remains unknown.	86
406726	pfam16394	DUF5003	Domain of unknown function (DUF5003). This small family of proteins is functionally uncharacterized. This family is found in bacteroides. Proteins in this family are typically between 500 and 650 amino acids in length.	316
406727	pfam16395	DUF5004	Domain of unknown function (DUF5004). This small family of proteins is functionally uncharacterized. This family is found in bacteroides. Proteins in this family are typically around 150 amino acids in length.	145
293005	pfam16396	DUF5005	Domain of unknown function (DUF5005). This small family of proteins is functionally uncharacterized. This family is found in bacteroides. Proteins in this family are typically around 440 amino acids in length.	436
406728	pfam16397	DUF5006	Domain of unknown function (DUF5006). This small family of proteins is functionally uncharacterized. This family is found in bacteroides. Proteins in this family are around 600 amino acids in length.	263
406729	pfam16398	DUF5007	Domain of unknown function (DUF5007). This small family of proteins is functionally uncharacterized. This family is found in Bacteroides and Sphingobacterium. The members in this family are around 350 residues in length.	287
406730	pfam16399	Aquarius_N	Intron-binding protein aquarius N-terminus. This family represents the N-terminus of intron-binding protein aquarius, a splicing factor which links excision of introns from pre-mRNA with snoRP assembly.	790
374519	pfam16400	DUF5008	Domain of unknown function (DUF5008). This small family of proteins is functionally uncharacterized. This family is found in Bacteroides, Paraprevotella, and Sphingobacterium. The members in this family are around 550 residues in length.	101
293010	pfam16401	DUF5009	Domain of unknown function (DUF5009). This small family of proteins is functionally uncharacterized. This family is mainly found in various Bacteroides species. The members in this family are around 470 residues in length.	260
406731	pfam16402	DUF5010	Domain of unknown function (DUF5010). This small family of proteins is functionally uncharacterized. This family is found in bacteroides. Proteins in this family are around 600 amino acids in length.	341
406732	pfam16403	DUF5011	Domain of unknown function (DUF5011). This small family of proteins is functionally uncharacterized. This family is found in Bacteroides, Prevotella, and Parabacteroides. Proteins in this family are around 230 amino acids in length.	71
374522	pfam16404	DUF5012	Domain of unknown function (DUF5012). This small family of proteins is functionally uncharacterized. This family is found in various Bacteroides species. Proteins in this family are around 230 amino acids in length.	125
406733	pfam16405	DUF5013	Domain of unknown function (DUF5013). This small family of proteins is functionally uncharacterized. This family is found in various Bacteroides and Parabacteroides species. Proteins in this family are around 400 amino acids in length.	145
406734	pfam16406	DUF5014	Domain of unknown function (DUF5014). This small family of proteins is functionally uncharacterized. This family is found in various Bacteroides species. Proteins in this family are around 630 amino acids in length.	90
406735	pfam16407	PKD_2	PKD-like family. This is a PKD-like family of proteins found in various Bacteroides species.	157
406736	pfam16408	DUF5016	Domain of unknown function (DUF5016). This family of proteins is functionally uncharacterized. This family is found in various Bacteroides species. Proteins in this family are around 660 amino acids in length.	125
406737	pfam16409	DUF5017	Domain of unknown function (DUF5017). This family of proteins is functionally uncharacterized. This family is found in various Bacteroides and Prevotella species. Proteins in this family are around 350 amino acids in length.	182
406738	pfam16410	DUF5018	Domain of unknown function (DUF5018). This family of proteins is functionally uncharacterized. This family is found in various Bacteroides and Alistipes species. Proteins in this family are around 600 amino acids in length.	355
406739	pfam16411	SusF_SusE	Outer membrane protein SusF_SusE. SusE and SusF are two outer membrane proteins composed of tandem starch specific carbohydrate binding modules (CBMs) with no enzymatic activity. They are likely to play an important role in starch metabolism in Bacteroides. It has been speculated that they could compete for starch in the human intestinal tract by sequestering starch at the bacterial surface and away from competitors. SusE has higher affinity for starch compared to SusF.	165
406740	pfam16412	DUF5020	Domain of unknown function (DUF5020). This family of proteins is functionally uncharacterized. This family is found in various Bacteroides species. Proteins in this family are around 235 amino acids in length.	212
406741	pfam16413	Mlh1_C	DNA mismatch repair protein Mlh1 C-terminus. This is the C-terminal domain of DNA mismatch repair protein Mlh1, these proteins belong to the MutL family. This domain forms part of the endonuclease active site.	257
406742	pfam16414	NPC1_N	Niemann-Pick C1 N-terminus. This is the N-terminal domain of Niemann-Pick C1 family proteins. This family of proteins mediates transport of cholesterol from the intestinal lumen to enterocytes. This domain contains a cholesterol-binding pocket.	238
406743	pfam16415	CNOT1_CAF1_bind	CCR4-NOT transcription complex subunit 1 CAF1-binding domain. This is the CAF1-binding domain of CCR4-NOT transcription complex. It adopts a MIF4G (middle portion of eIF4G) fold.	225
406744	pfam16416	GUN4_N	ARM-like repeat domain, GUN4-N terminal. GUN4_N is the ARM-repeat like N-terminal domain of GUN4 proteins. It contains five helices arranged in an alternating antiparallel pattern that resembles ARM or HEAT repeats, though the functional importance of this poorly conserved domain in Gun4 is not currently known.	82
406745	pfam16417	CNOT1_TTP_bind	CCR4-NOT transcription complex subunit 1 TTP binding domain. This is the TTP binding domain of CCR4-NOT transcription complex subunit 1. It adopts a MIF4G (middle portion of eIF4G) fold.	183
406746	pfam16418	CNOT1_HEAT	CCR4-NOT transcription complex subunit 1 HEAT repeat. This domain is a HEAT repeat found in CCR4-NOT transcription complex subunit 1.	146
406747	pfam16419	CNOT1_HEAT_N	CCR4-NOT transcription complex subunit 1 HEAT repeat. This domain is a HEAT repeat found in fungal CCR4-NOT transcription complex subunit 1 at the N-terminus of PF16418.	224
406748	pfam16420	ATG7_N	Ubiquitin-like modifier-activating enzyme ATG7 N-terminus. This is the N-terminal domain of Ubiquitin-like modifier-activating enzyme ATG7. In Arabidopsis this domain binds the E2 enzymes ATG10 and ATG3.	309
406749	pfam16421	E2F_CC-MB	E2F transcription factor CC-MB domain. This is the coiled coil (CC) - marked box (MB) domain of E2F transcription factors. This domain forms a heterodimer with the corresponding domain of the DP transcription factor, the heterodimer binds the C-terminus of retinoblastoma protein.	94
406750	pfam16422	COE1_DBD	Transcription factor COE1 DNA-binding domain. 	227
406751	pfam16423	COE1_HLH	Transcription factor COE1 helix-loop-helix domain. This is the helix-loop-helix domain of transcription factor COE1. It is responsible for dimerization.	44
406752	pfam16424	DUF5021	Domain of unknown function (DUF5021). This family consists of Prepilin-type cleavage/methylation N-terminal domain proteins around 200 residues in length and is mainly found in various Eubacterium species. The function of this family is unknown.	158
293034	pfam16425	DUF5022	Domain of unknown function (DUF5022). This family consists of several uncharacterized proteins around 350 in length and is mainly found in various Firmicutes species. The function of this family is unknown.	287
406753	pfam16426	DUF5023	Domain of unknown function (DUF5023). This family consists of several uncharacterized proteins around 300 residues in length and is mainly found in various Eubacterium species. The function of this family is unknown.	197
406754	pfam16427	DUF5024	Domain of unknown function (DUF5024). This family consists of several uncharacterized proteins around 150 or 200 in length and is mainly found in various Bacteroides and Parabacteroides species. The function of this family is unknown.	104
406755	pfam16428	DUF5025	Domain of unknown function (DUF5025). This family consists of several uncharacterized proteins around 200 in length and is mainly found in various Parabacteroides species. The function of this family is unknown.	161
374540	pfam16429	DUF5026	Domain of unknown function (DUF5026). This family consists of several uncharacterized proteins around 100 residues in length and is mainly found in various Clostridiales species. The function of this family is unknown.	82
374541	pfam16430	DUF5027	Domain of unknown function (DUF5027). This family consists of several uncharacterized proteins around 180 in length and is mainly found in various Clostridiales species. The function of this family is unknown.	187
293040	pfam16431	DUF5028	Domain of unknown function (DUF5028). This family consists of several uncharacterized proteins around 200 in length and is mainly found in Eubacterium and Clostridium. The function of this family is unknown.	177
293041	pfam16432	DUF5029	Domain of unknown function (DUF5029). This family consists of several uncharacterized proteins around 550 in length and is mainly found in Bacteroides fragilis and sp. The function of this family is unknown.	210
406756	pfam16433	DUF5030	Domain of unknown function (DUF5030). This family consists of several uncharacterized proteins around 300 in length and is mainly found in various Bacteroides species. The function of this family is unknown.	307
406757	pfam16434	DUF5031	Domain of unknown function (DUF5031). This family consists of several uncharacterized proteins around 380 in length and is mainly found in Bacteroides fragilis and sp. The function of this family is unknown.	415
406758	pfam16435	DUF5032	Domain of unknown function (DUF5032). This family consists of several uncharacterized proteins around 270 in length and is mainly found in various Bacteroides and Parabacteroides species. The function of this family is unknown.	259
406759	pfam16436	DUF5033	Domain of unknown function (DUF5033). This family consists of several uncharacterized proteins around 200 in length and is mainly found in various Bacteroides species. The function of this family is unknown.	178
406760	pfam16437	DUF5034	Domain of unknown function (DUF5034). This family consists of several uncharacterized proteins around 190 residues in length and is mainly found in various Bacteroides species. The function of this family is unknown.	169
406761	pfam16438	DUF5035	Domain of unknown function (DUF5035). This family consists of several uncharacterized proteins around 170 residues in length and is mainly found in various Bacteroides species. The function of this family is unknown.	145
406762	pfam16439	DUF5036	Domain of unknown function (DUF5036). This family consists of several uncharacterized proteins around 240 residues in length and is mainly found in various Bacteroides and Parabacteroides species. The function of this family is unknown.	225
293049	pfam16440	DUF5037	Domain of unknown function (DUF5037). This family consists of several uncharacterized proteins around 270 residues in length and is mainly found in various Clostridiales species. The function of this family is unknown.	242
406763	pfam16441	DUF5038	Domain of unknown function (DUF5038). This family consists of several uncharacterized proteins around 200 residues in length and is mainly found in various Clostridiales species. The function of this family is unknown.	144
406764	pfam16442	DUF5039	Domain of unknown function (DUF5039). This family consists of several uncharacterized proteins around 240 residues in length and is mainly found in various Bacteroides species. The function of this family is unknown.	203
406765	pfam16443	DUF5040	Domain of unknown function (DUF5040). This family consists of several uncharacterized proteins around 260 residues in length and is mainly found in various Bacteroides species. The function of this family is unknown.	227
406766	pfam16444	DUF5041	Domain of unknown function (DUF5041). This family consists of several uncharacterized proteins around 230 residues in length and is mainly found in various Bacteroidales species. The function of this family is unknown.	192
406767	pfam16445	DUF5042	Domain of unknown function (DUF5042). This family consists of several uncharacterized proteins around 460 residues in length and is mainly found in various Bacteroides species. The function of this family is unknown.	434
406768	pfam16446	DUF5043	Domain of unknown function (DUF5043). This family consists of several uncharacterized proteins around 200 residues in length and is mainly found in various Bacteroides species. The function of this family is unknown.	155
374548	pfam16447	DUF5044	Domain of unknown function (DUF5044). This family consists of several uncharacterized proteins around 220 residues in length and is mainly found in various Clostridiales species. The function of this family is unknown.	178
406769	pfam16448	LapD_MoxY_N	LapD/MoxY periplasmic domain. This domain is the N-terminal periplasmic domain of the LapD and MoxY receptor proteins.	124
374549	pfam16449	MatB	Fimbrillin MatB. This is a family of fimbrial proteins.	168
406770	pfam16450	Prot_ATP_ID_OB	Proteasomal ATPase OB/ID domain. This is the interdomain (ID) or oligonucleotide binding (OB) domain of proteasomal ATPase.	56
406771	pfam16451	Spike_NTD	Spike glycoprotein N-terminal domain. The N-terminal domain of the coronavirus spike glycoprotein functions as a receptor binding domain. It binds carcinoembryonic antigen-related cell adhesion molecule 1.	298
406772	pfam16452	Phage_CI_C	Bacteriophage CI repressor C-terminal domain. The C-terminal domain of the CI repressor functions in oligomer formation.	102
406773	pfam16453	IQ_SEC7_PH	PH domain. This PH domain is found in IQ motif and SEC7 domain-containing proteins.	135
406774	pfam16454	PI3K_P85_iSH2	Phosphatidylinositol 3-kinase regulatory subunit P85 inter-SH2 domain. This domain is found between the two SH2 domains in phosphatidylinositol 3-kinase regulatory subunit P85. It forms a complex with the adaptor-binding domain of phosphatidylinositol 4,5-bisphosphate 3-kinase catalytic subunit alpha.	161
406775	pfam16455	UBD	Ubiquitin-binding domain. This ubiquitin-binding domain is found in ubiquitin domain-containing proteins.	102
406776	pfam16456	YmgD	YmgD protein. This family of proteins is found in bacteria. Proteins in this family are approximately 110 amino acids in length.	82
406777	pfam16457	PH_12	Pleckstrin homology domain. 	129
406778	pfam16458	Beta-prism_lec	Beta-prism lectin. This beta-prism fold lectin is the C-terminal domain of the Vibrio cholerae cytolytic pore-forming toxin hemolysin. It binds to N-glycans with a heptasaccharide GlcNAc4Man3 core (NGA2).	129
406779	pfam16459	Phage_TAC_13	Phage tail assembly chaperone, TAC. This family represents the phage-tail assembly chaperone proteins from a small set of Siphoviridae from Gammaproteobacteria. TACs are required for the morphogenesis of all long-tailed phages. The proposed function for the TAC is to coat the tape-measure protein to prevent it from forming unproductive complexes or precipitating before the tail tube protein has been incorporated.	98
293069	pfam16460	Phage_TTP_11	Phage tail tube, TTP, lambda-like. This family represents the phage-tail-tube protein from a set of Siphoviridae from Gammaproteobacteria. Tail tube proteins polymerize with the assistance of the Tail-tip complex, a tape measure protein and two chaperones. Infectivity of host is delivered through the tube.	137
406780	pfam16461	Phage_TTP_12	Lambda phage tail tube protein, TTP. This family represents the phage-tail-tube protein from a set of Siphoviridae from Gammaproteobacteria. Tail tube proteins polymerize with the assistance of the Tail-tip complex, a tape measure protein and two chaperones. Infectivity of host is delivered through the tube.	134
318627	pfam16462	Phage_TAC_14	Phage tail assembly chaperone protein, TAC. This is a family of Siphoviridae phage tail assembly chaperone proteins.	113
318628	pfam16463	Phage_TTP_13	Phage tail tube protein family. This is a small family of Siphoviridae phage tail tube proteins. The tube protein polymerizes to form the shaft through which the infecting DNA passes into the host.	137
406781	pfam16464	DUF5045	Domain of unknown function (DUF5045). This family consists of N-terminal of several uncharacterized proteins around 260 residues in length and is mainly found in various Bacteroides and Parabacteroides species. The function of this family is unknown.	85
406782	pfam16465	DUF5046	Domain of unknown function (DUF5046). This small family consists of C-terminal of several uncharacterized proteins around 500 residues in length and is mainly found in various Faecalibacterium species. The function of this family is unknown. This family has distant similarity to WD40 repeats.	286
293075	pfam16466	DUF5047	Domain of unknown function (DUF5047). This family consists of N-terminal of several uncharacterized proteins and peptidases around 360 residues in length and is mainly found in various Streptomyces species. The function of this family is unknown.	135
406783	pfam16467	DUF5048	Domain of unknown function (DUF5048). This family consists of C-terminal of several uncharacterized proteins around 500 residues in length and is mainly found in various Faecalibacterium and Clostridium species. The function of this family is unknown.	104
379845	pfam16468	DUF5049	Domain of unknown function (DUF5049). This family consists of some uncharacterized proteins around 60 residues in length and is mainly found in various Lactobacillus and Selenomonas species. The function of this family is unknown.	57
406784	pfam16469	NPA	Nematode polyprotein allergen ABA-1. The nematode polyprotein allergen ABA-1 is a lipid-binding protein comprising multiple tandem repeats of this domain.	116
406785	pfam16470	S8_pro-domain	Peptidase S8 pro-domain. This domain is the pro-domain of several peptidases belonging to family S8.	77
406786	pfam16471	JIP_LZII	JNK-interacting protein leucine zipper II. This is the second leucine zipper domain (LZII) of several JNK-interacting proteins (JIP). It interacts with the small GTP-binding protein ARF6.	56
406787	pfam16472	DUF5050	Domain of unknown function (DUF5050). 	283
406788	pfam16473	DUF5051	3' exoribonuclease, RNase T-like. This is a highly divergent 3' exoribonuclease family. The proteins constitute a typical RNase fold, where the active site residues form a magnesium catalytic centre. The protein of the solved structure readily cleaves 3' overhangs in a time-dependent manner. It is similar to DEDD-type RNases and is an unusual ATP-binding protein that binds ATP and dATP. It forms a dimer in solution and both protomers in the asymmetric unit bind a magnesium ion through Asp-6 in UniProtKB:P9WJ73.	177
406789	pfam16474	KIND	Kinase non-catalytic C-lobe domain. The KIND domain (kinase non-catalytic C-lobe domain) evolved from a catalytic protein kinase fold and functions as an interaction domain. In SPIRE1 (protein spire homolog 1) this domain interacts with FMN2 (formin-2).	194
406790	pfam16475	DUF5052	Domain of unknown function (DUF5052). This family consists of uncharacterized proteins around 200 residues in length and is mainly found in various Firmicutes species. The function of this family is unknown.	199
406791	pfam16476	DUF5053	Domain of unknown function (DUF5053). This family consists of C-terminal of uncharacterized proteins around 100 residues in length and is mainly found in various Prevotella species. The function of this family is unknown.	59
406792	pfam16477	DUF5054	Domain of unknown function (DUF5054). This family consists of Glycosyl hydrolase family 38 proteins around 700 residues in length and is mainly found in various Clostridium and Rhizobium species. The function of this family is unknown.	287
374565	pfam16478	DUF5055	Domain of unknown function (DUF5055). This family consists of several uncharacterized proteins around 100 residues in length and is mainly found in butyrate-producing bacteriums. The function of this family is unknown.	105
406793	pfam16479	DUF5056	Domain of unknown function (DUF5056). This family consists of uncharacterized proteins around 360 residues in length and is mainly found in various Bacteroides species. The function of this family is unknown.	93
406794	pfam16480	DUF5057	Domain of unknown function (DUF5057). This family consists of C-terminal of uncharacterized proteins and F5/8 type C domain proteins around 360 residues in length and is mainly found in various Firmicutes species. The function of this family is unknown.	353
406795	pfam16481	DUF5058	Domain of unknown function (DUF5058). This family consists of uncharacterized proteins around 250 residues in length and is mainly found in various Firmicutes species. The function of this family is unknown.	222
374568	pfam16482	Staufen_C	Staufen C-terminal domain. This is the C-terminal domain of Staufen proteins. It consists of an N-terminal Staufen-swapping motif (SSM) comprising two alpha helices, connected by a linker region to a dsRNA-binding-like domain ('RBD'). The 'RBD' has the fold of a functional dsRNA-binding domain, but lacks the residues required to bind RNA. This domain is responsible for dimerization, the SSM from one molecule interacts with the 'RBD' of another.	110
406796	pfam16483	Glyco_hydro_64	Beta-1,3-glucanase. Family 64 glycoside hydrolases have beta-1,3-glucanase activity.	370
406797	pfam16484	CPT_N	Carnitine O-palmitoyltransferase N-terminus. This domain is found at the N-terminus of carnitine O-palmitoyltransferases. It functions as a regulatory domain and is linked to the catalytic domain (pfam00755) via two transmembrane regions.	47
406798	pfam16485	PLN_propep	Protealysin propeptide. This propeptide is cleaved during maturation of protealysin. Before cleavage it interacts with the catalytic domain, blocking the active site.	43
406799	pfam16486	ArgoN	N-terminal domain of argonaute. ArgoN is the N-terminal domain of argonaute proteins in eukaryotes. ArgoN is composed of an antiparallel four-stranded beta sheet core that has two alpha helices positioned along one face of the sheet and an extended beta strand towards its N-terminus. The core fold of the N domain most closely resembles the catalytic domain of replication-initiator protein Rep. The N domain is linked to the PAZ domain via linker 1 region, and together these three regions are designated the PAZ-containing lobe of argonaute.	86
406800	pfam16487	ArgoMid	Mid domain of argonaute. The ArgoMid domain is found to be part of the Piwi-lobe of the argonaute proteins. It is composed of a parallel four-stranded beta-sheet core surrounded by four alpha-helices and two additional short alpha-helices. It most closely resembles the amino terminal tryptic core of the E.coli lactose repressor. There is an extensive interface between the Mid and the Piwi domains. The conserved C-terminal half of the Mid has extensive interactions with Piwi, with a deep basic pocket on the surface of the Mid adjacent to the interface with Piwi. The Mid carries a binding pocket for the 5' phosphate overhang of the guide strand of DNA. The N, Mid, and Piwi domains form a base upon which the PAZ domain sits, resembling a duck. The 5' phosphate and the U1 base are held in place by a conserved network of interactions from protein residues of the Mid and Piwi domains in order to place the guide uniquely in the proper position observed in all Argonaute-RNA complexes.	84
406801	pfam16488	ArgoL2	Argonaute linker 2 domain. ArgoL2 is the second linker domain in eukaryotic argonaute proteins. It starts with two alpha-helices aligned orthogonally to each other followed by a beta-strand involved in linking the two lobes, the PAZ lobe and the Piwi lobe of argonaute to each other. Linker 2 together with the N, PAZ and L1 domains form a compact global fold. Numerous residues from Piwi, L1 and L2 linkers direct the path of the phosphate backbone of nucleotides 7-9, thus allowing DNA-slicing.	47
406802	pfam16489	GAIN	GPCR-Autoproteolysis INducing (GAIN) domain. The GAIN is a domain of alpha-helices and beta-strands that is found in cell-adhesion GPCRs and precedes the GPS motif where the autoproteolysis occurs, family, pfam01825. The full GAIN domain, comprises the GPS and the GAIN, in cell-adhesion GPCRs, and is the functional unit for autoproteolysis. The GPS motif at the end of the GAIN domain is an ancient domain that exists in primitive ancestor organisms, and the full GAIN + GPS is conserved in all cell-adhesion GPCRs and all PKD1-related proteins.	205
406803	pfam16490	Oxidoreduct_C	Putative oxidoreductase C terminal domain. This is the putative C-terminal domain of a bacterial oxidoreductase. It lies C-terminal to family GFO_IDH_MocA pfam01408 in some members.	278
406804	pfam16491	Peptidase_M48_N	CAAX prenyl protease N-terminal, five membrane helices. The five N-terminal transmembrane alpha-helices of peptidase_M48 family proteins including the CAAX prenyl proteases reside completely within the membrane of the endoplasmic reticulum.	179
406805	pfam16492	Cadherin_C_2	Cadherin cytoplasmic C-terminal. Cadherin_C_2 is the cytoplasmic C-terminal domain of some proto-cadherins. It is this region of the cadherins that allows cell-adhesion and the essential feature of metazoan multicellularity. Cadherins are cell-surface receptors that function in cell adhesion, cell polarity, and tissue morphogenesis.	84
406806	pfam16493	Meis_PKNOX_N	N-terminal of Homeobox Meis and PKNOX1. Meis_PKNOX_N is a family found at the N-terminus of Meis, Myeloid ecotropic viral integration site, transcription regulators and PKNOX1 regulators, PBX/knotted 1 homeobox 1, homeobox proteins.	86
406807	pfam16494	Na_Ca_ex_C	C-terminal extension of sodium/calcium exchanger domain. Na_Ca_ex_C is a region of the higher eukaryote sodium/calcium exchanger domain that extends toward the C-terminal, and is cytoplasmic.	134
406808	pfam16495	SWIRM-assoc_1	SWIRM-associated region 1. Much of the higher eukaryote SWI/SNF complex subunit SMARCC2 proteins is of low-complexity and or disordered. However, there are several short regions that are quite highly conserved. This is one of these regions. The function of the individual regions is not known.	84
406809	pfam16496	SWIRM-assoc_2	SWIRM-associated domain at the N-terminal. Much of the higher eukaryote SWI/SNF complex subunit SMARCC2 proteins is of low-complexity and or disordered. However, there are several short regions that are quite highly conserved. This is one of these regions. The function of the individual regions is not known.	410
406810	pfam16497	MHC_I_3	MHC-I family domain. 	180
406811	pfam16498	SWIRM-assoc_3	SWIRM-associated domain at the C-terminal. Much of the higher eukaryote SWI/SNF complex subunit SMARCC2 proteins is of low-complexity and or disordered. However, there are several short regions that are quite highly conserved. This is one of these regions. The function of the individual regions is not known.	65
374582	pfam16499	Melibiase_2	Alpha galactosidase A. 	284
406812	pfam16500	Cyclin_N2	N-terminal region of cyclin_N. Cyclin_N2 is found upstream of the family Cyclin_N, pfam00134. The exact function of this region of cyclins is not certain.	135
406813	pfam16501	SCAPER_N	S phase cyclin A-associated protein in the endoplasmic reticulum. SCAPER_N is a short highly conserved region close to the N-terminus. SCAPER is localized to the endoplasmic reticulum and is a substrate for cyclin A/Cdk2. It associates with cyclin A and localizes to the ER. One theory suggests that SCAPER functions to create a local high concentration of cyclin A2 in the cytoplasm. Alternatively, SCAPER might be acting to sequester a portion of cellular cyclin A2 that could then be readily available for nuclear translocation, which may be needed for exit from G0 phase.	98
406814	pfam16502	DUF5059	Domain of unknown function (DUF5059). This domain is found fused to a copper-binding protein at the C-terminus, family Copper-bind, pfam00127. Its function is not known, and it is found in the Halobacteriaceae family in Archaea.	620
406815	pfam16503	zn-ribbon_14	Zinc-ribbon. This is a family of zinc-ribbons largely from eukaryotes that lie at the C-terminus of cytoplasmic tRNA adenylyltransferase 1 proteins. Most of these proteins carry an ATP-binding domain towards the N-terminus.	32
406816	pfam16504	SP24	Putative virion membrane protein of plant and insect virus. SP24, or structural protein of 24kD, is a family of putative virion membrane proteins of plant and insect viruses. These viruses are ssRNA positive-strand viruses, with no DNA stage. The family corresponds to the central region of the ORF3 of insect chroparaviruses and negeviruses and plant cileviruses, higreviruses and blunerviruses. It contains four transmembrane regions. Chronic bee paralysis virus (CBPV) is one of the more common member virions. SP24 is probably one of the major structural components of the virions.	147
293114	pfam16505	Emaravirus_P4	P4 movement protein of Emaravirus, and the 30K superfamily. Emaravirus_P4 is composed of movement proteins of the genus of negative-strand RNA viruses Emaravirus (related to the family Bunyaviridae), which infect plants. P4 is a movement protein of the 30K superfamily.	349
406817	pfam16506	DiSB-ORF2_chro	Putative virion glycoprotein of insect viruses. DiSB-ORF2_chro corresponds to a short conserved region at the N-terminus of putative glycoproteins from chroparaviruses. It carries two putative disulfide bridges. No similarity can be found with any other glycoproteins outside this region.	210
406818	pfam16507	BLM10_mid	Proteasome-substrate-size regulator, mid region. The ordered regions of the yeast BLM10 or PA200 (human homolog), full-length protein encode 32 HEAT repeat (HR)-like modules, each comprising two helices joined by a turn, with adjacent repeats connected by a linker. Whereas a standard HEAT repeat is composed of ~50 residues, the BLM10 HEAT repeats are highly variable. The length of helices ranges from 8 to 35 residues, turns range from 2 to 87 residues, and linkers range from 1 to 88 residues, with the longest linker, between HR21 and HR22, containing additional secondary structures (two strands and three helices). BLM10_mid is the middle ordered region of the three in BLM10. BLM10 is found to surround the proteasome entry pore in the 1.2 MDa complex of proteasome and BLM10 to form a largely closed dome that is expected to restrict access of potential substrates. Thus Blm10 and PA200 are predominantly nuclear and stimulate the degradation of model peptides, although they do not appear to stimulate the degradation of proteins, recognize ubiquitin, or utilize ATP.	499
406819	pfam16508	NIBRIN_BRCT_II	Second BRCT domain on Nijmegen syndrome breakage protein. 	114
374591	pfam16509	KORA	TrfB plasmid transcriptional repressor. KORA is a family of Gram-negative bacterial proteins that act as global repressors of genes involved in plasmid replication, conjugative transfer and stable inheritance in the IncP group of plasmids. KORA operates as a symmetric dimer, and contacts the DNA via the helix-turn-helix region at the N-terminus.	84
293119	pfam16510	P22_portal	Phage P22-like portal protein. The portal protein of P22 and similar Podoviridae tail phages is a dodecameric structure consisting of a hip (2), a leg(1) and a barrel(3). DNA viruses such as bacteriophages and herpesviruses deliver their genome into and out of the capsid through large proteinaceous assemblies, known as portal proteins. Domains 1 and 3 are mostly helical and form the majority of the DNA-translocating channel. Domain 2 adopts an alpha-beta-fold characterized by two sheets of eight beta-strands, which cross each other to form a beta-barrel-like structure.	668
406820	pfam16511	FERM_f0	N-terminal or F0 domain of Talin-head FERM. FERM_f0 forms a stable globular structure. The fold is an ubiquitin-like fold joined to the f1 domain in a novel fixed orientation by an extensive charged interface. It is required for maximal integrin-activation, by interacting with other FA components. No binding partner has yet been found for it.	82
406821	pfam16512	RhoGAP-FF1	p190-A and -B Rho GAPs FF domain. RhoGAP-FF1 is the FF domain of the Rho GTPase activating proteins (GAPs). These are the key proteins that make the switch between the active guanosine-triphosphate-bound form of Rho guanosine triphosphatases (GTPases) and the inactive guanosine-diphosphate-bound form. Rho guanosine triphosphatases (GTPases) are a family of proteins with key roles in the regulation of actin cytoskeleton dynamics. The RhoGAP-FF1 region contains the FF domain that has been implicated in binding to the transcription factor TFII-I; and phosphorylation of Tyr308 within the first FF domain inhibits this interaction. The RhoGAPFF1 domain constitutes the first solved structure of an FF domain that lacks the first of the two highly conserved Phe residues, but the substitution of Phe by Tyr does not affect the domain fold.	80
374594	pfam16514	NADH-UOR_E	putative NADH-ubiquinone oxidoreductase chain E. This putative NADH-ubiquinone oxidoreductase chain E family is found in Epsilonproteobacteria, chiefly in Helicobacter pylori. All proteins in the family are less than 100 residues in length.	74
406822	pfam16515	HIP1_clath_bdg	Clathrin-binding domain of Huntingtin-interacting protein 1. HIP1_clath_bdg is the coiled-coil region of Huntingtin-interacting proteins 1. It carries a highly conserved HADLLRKN sequence motif at its N-terminus which effects the binding of HIP1R to clathrin light-chain EED regulatory site. This binding then stimulates clathrin lattice assembly. Huntingtin-interacting protein 1 (HIP1) is an obligate binding partner for Huntingtin, and loss of this interaction triggers the cascade of events that results in the apoptosis of neuronal cells and the onset of Huntington's disease. Clathrin light-chain binds to a flexible coiled-coil domain in HIP1 and induces a compact state that is refractory to actin binding.	93
406823	pfam16516	CC2-LZ	Leucine zipper of domain CC2 of NEMO, NF-kappa-B essential modulator. CC2-LZ is a leucine-zipper domain associated with the CC2 coiled-coil region of NF-kappa-B essential modulator, NEMO. It plays a regulatory role, along with the very C-terminal zinc-finger; it contains a ubiquitin-binding domain (UBD) and represents one region that contributes to NEMO oligomerization. NEMO itself is an integral part of the IkappaB kinase complex and serves as a molecular switch via which the NF-kappaB signalling pathway is regulated.	100
406824	pfam16517	Nore1-SARAH	Novel Ras effector 1 C-terminal SARAH (Sav/Rassf/Hpo) domain. The Nore1-SARAH, C-terminal, domain of Nore1, the tumor-suppressor, a novel Ras effector, has a characteristic coiled-coil structure. It is a small helical module that is important in signal-transduction networks. The recombinant SARAH domain of Nore1 crystallizes as an anti-parallel homodimer with representative characteristics of coiled coils. The central function of the SARAH domain seems to be the mediation of homo- and hetero-oligomerization between SARAH domain-containing proteins. Nore1 forms homo- and hetero complexes through its C-terminal SARAH (Sav/Rassf/Hpo) domain.	39
406825	pfam16518	GrlR	T3SS negative regulator, GrlR. GrlR is a family of proteobacterial type III secretion system negative regulators. Structurally, GrlR consists of a typical beta-barrel fold with eight beta-strands containing an internal hydrophobic cavity and a plug-like loop on one side of the barrel. Strong hydrophobic interactions between the two beta-barrels maintain its dimeric architecture. A unique surface-exposed EDED (Glu-Asp-Glu-Asp) motif is identified to be critical for GrlA-GrlR interaction and for the repressive activity of GrlR. The locus of enterocyte effacement (LEE) is essential for virulence of enterohaemorrhagic Escherichia coli (EHEC) and enteropathogenic E. coli (EPEC). It encodes some 20 genes including an overall regulator ler and two others, GrlR and GrlA, that form the type three secretion system for infection. GrlR complexes with GrlA to repress expression of ler. GrlA is found in family pfam00462.	112
406826	pfam16519	TRPM_tetra	Tetramerisation domain of TRPM. TRPM7_tetra is a short anti-parallel coiled-coil tetramerisation domain of the transient receptor potential cation channel subfamily M member proteins 1-8. It is held together by extensive core packing and interstrand polar interactions. Transient receptor potential (TRP) channels comprise a large family of tetrameric cation-selective ion channels that respond to diverse forms of sensory input. The presence of cytoplasmic domains that direct channel assembly appears to be a feature of many voltage-gated ion channel superfamily members.	55
293128	pfam16520	BDV_M	ssRNA-binding matrix protein of Bornaviridae. BDV_M is a family of matrix proteins from negative-strand Bornaviridae viruses. Its most stable oligomeric form is a tetramer, and it lies beneath the viral envelope where it associates with the inner layer of the viral membrane. It bridges the gap between the nucleocapsid and the viral envelope thereby imparting structural integrity and individual form to the virus particle. Borna disease virus (BDV) is a neurotropic enveloped RNA virus causing a noncytolytic, persistent infection of the central nervous system in mammals. The order to which this virus belongs, Mononegavirales, also contains the Ebola, mumps, rabies and measles viruses, amongst other highly infectious agents.	103
406827	pfam16521	Myosin-VI_CBD	Myosin VI cargo binding domain. Myosin-VI_CBD is a C-terminal family that allows unconventional myosin-VI to recognize and select its binding cargoes. Several adaptor proteins have been reported to interact specifically with the CBD, thus defining the specific subcellular functions of myosin VI. The crystal structure determination of the myosin VI CBD/Dab2 (an endocytic adaptor protein Disabled-2 that is a cargo) complex shows that the Myosin-VI_CBD forms a cargo-induced dimer, suggesting that the motor undergoes monomer-to-dimer conversion that is dependent upon cargo binding. In the absence of cargo myosin VI exists as a stable monomer. This cargo binding-mediated monomer-to-dimer conversion mechanism adopted by myosin VI may be shared by other unconventional myosins, such as myosin VII and myosin X.	90
406828	pfam16522	FliS_cochap	Flagellar FLiS export co-chaperone, HP1076. FliS_cochap is a family of largely Campylobacterales proteins that are co-chaperones for FliS, one of the type III secretion system flagellar chaperones. The HP1076 (Flis_cochap) and FliS complex together prevents premature polymerization of flagellins and is critical for flagellar assembly and bacterial colonisation. The HP1076 shows co-chaperone activity that promotes protein folding of FliS with mutations in the flagellin binding pocket and enhances the chaperone activity of FliS.	131
406829	pfam16523	betaPIX_CC	betaPIX coiled coil. betaPIX_CC is the very C-terminal coiled-coil region of betaPIX or p21-activated kinase interacting exchange factor proteins. The coiled-coil runs from residues 589-646 in UniProtKB:G31IU6, and the PDZ-binding site is the final eight residues immediately downstream. The coiled-coil trimerizes and thus exposes three potential PDZ-binding surfaces, although only one of these is maximally used. One of the C-terminal ends of the coiled-coil forms an extensive beta-sheet interaction with the Shank PDZ, while the other two ends are not involved in ligand binding and form random coils. Thus the coiled-coil domain allows multimerisation of betaPIX that is vital for its physiological functions. betaPIX and the Shank/ProSAP protein form a complex that acts as a protein scaffold for integrating signalling pathways and regulating postsynaptic structure.	87
406830	pfam16524	RisS_PPD	Periplasmic domain of Sensor histidine kinase RisS. RisS_PPD is the periplasmic domain of the sensor histidine kinase RisS. It is purported to be the region of the kinase that senses the pH of the environs.	105
406831	pfam16525	MHB	Haemophore, haem-binding. MHB is a coiled-coil molecule that binds free haem in mycobacterial cytoplasm to deliver it to membrane proteins for shuttling through the membrane.	75
406832	pfam16526	CLZ	C-terminal leucine zipper domain of cyclic nucleotide-gated channels. The CLZ domain is the C-terminal leucine-zipper domain of cyclic nucleotide-gated channel proteins. The CLZ domains form homotypic trimers in solution thus constraining the channel of the CNGs to contain three cyclic nucleotide-gated subunits, CNGA. The CLZ domains formed homotypic parallel 3-helix coiled-coil domains, consistent with their proposed role in regulating subunit assembly.	70
318680	pfam16527	CpxA_peri	Two-component sensor protein CpxA, periplasmic domain. CpxA_peri is the periplasmic domain-family of the Gram-negative Gammaproteobacteria two-component signalling system, Cpx. It represents the recognition-site for sensing specific envelope stress signals. The fold that the domain-core of CpxA_peri conforms to is a PAS fold. The domain senses the environmental change and triggers a signal transduction to the cytoplasmic domain. As well as the PAS-core, there is a C-terminal tail that is necessary for ligand-sensing and binding to CpxP, a CpxA-associated and a regulatory protein.	134
406833	pfam16528	Exo84_C	Exocyst component 84 C-terminal. Exo84_C is the C-terminal helical region of the exocyst component Exo84. This region resembles a cullin-repeat, a multi-helical bundle. The exocyst is a large complex that is required for tethering vesicles at the final stages of the exocytic pathway in all eukaryotes. Exocyst subunits are composed of mostly helical modules strung together into long rods.	203
374606	pfam16529	Ge1_WD40	WD40 region of Ge1, enhancer of mRNA-decapping protein. Ge1_WD40 is the N-terminal region of Ge-1 or enhancer of mRNA-decapping proteins. WD40-repeat regions are involved in protein-protein interactions.	329
406834	pfam16530	IHHNV_capsid	Infectious hypodermal and haematopoietic necrosis virus, capsid. IHHNV_capsid is the single capsid protein of infectious hypodermal and haematopoietic necrosis virus, found particularly in shrimp densovirus. Densoviruses are a subfamily of the parvoviruses. The capsid protein has an eight-stranded anti-parallel beta-barrel 'jelly roll' motif similar to that found in many icosahedral viruses, including other parvoviruses. The N-terminal portion of the IHHNV coat protein adopts a 'domain-swapped' conformation relative to its twofold-related neighbor. The loops connecting the strands of the structurally conserved jelly roll motif differ considerably in structure and length from those of other parvoviruses. IHHNV was first reported as a highly lethal disease of juvenile shrimp in 1983, and has only one type of capsid protein that lacks the phospholipase A2 activity that has been implicated as a requirement during parvoviral host cell infection. The structure of recombinant virus-like particles, composed of 60 copies of the 37.5-kDa coat protein is the smallest parvoviral capsid protein reported thus far. The small size of the PstDNV capsid protein makes the system attractive as a model for studying assembly mechanisms of icosahedral virus capsids.	323
406835	pfam16531	SAS-6_N	Centriolar protein SAS N-terminal. SAS-6_N is the N-terminal domain of the SAS-6 centriolar protein, both in C.elegans and in humans. The N-terminal domain is the region through which the 9 rod-shaped homodimers that SAS-6 forms on oligomerization interact with each other. Proper functioning of the centriole requires this correct oligomerization.	91
406836	pfam16532	Phage_tail_NK	Sf6-type phage tail needle knob or tip of some Caudovirales. Phage_tail_NK is the globular tip protein of some tailed bacteriophages. Tailed bacteriophage virions deliver DNA to susceptible cells after adsorbing to specific receptors on the surface of the bacteria. In the Gram-negative bacteria these receptors are surface proteins or polysaccharides. In the phage Sf6-type needle, this distal tip folds into a knob with a TNF-like fold, similar to the fibre knobs of bacteriophage PRD1 and Adenovirus. It contains three bound L-glutamate molecules that bind tightly in the crevices between the trimers of this trimeric tip.	152
406837	pfam16533	SOAR	STIM1 Orai1-activating region. SOAR is the Orai1-activating region of STIM1, where STIM1 are calcium sensors in the endoplasmic reticulum. As the store of calcium is depleted the calcium sensor in the ER activates Orai1, a Ca2+-release-activated Ca2+ (CRAC) channel, in the plasma membrane. The SOAR region, which runs from residues 340-443 on UniProtKB:Q13586, forms a dimer, and is essential for oligomerization of the whole of STIM1.	98
406838	pfam16534	ULD	Ubiquitin-like oligomerization domain of SATB. ULD is an N-terminal oligomerization domain of SATB or special AT-rich sequence-binding proteins. SATBs are global chromatin organizers and regulators of gene expression that are essential for T-cell development, breast cancer tumor growth and metastasis. SATBs assemble into a tetramer via the ULD domain, and the tetramerisation of SATBs is essential for recognising specific DNA sequences (such as multiple AT-rich DNA fragments). Thus, SATBs may regulate gene expression directly by binding to various promoters and upstream regions and thereby influencing promoter activity.	108
406839	pfam16535	T3SSipB	Type III cell invasion protein SipB. T3SSipB is a family of pathogenic Gram-negative bacterial proteins that invade human intestinal cells via the type III secretion system translocators. T3SSipB represents the coiled-coil region of the proteins and is shown to be homologous in activity to the pore-forming toxins of other Gram-negative pathogens, such as colicin Ia.	155
406840	pfam16536	PNKP-ligase_C	PNKP adenylyltransferase domain, C-terminal region. This is a short unique anti-parallel two-helical module with an extended tail peptide. It packs tightly against an extended peptide segment, residues 489-501 in UniProtKB:A3DJ38, near the N-terminus of the NTase domain, pfam16542. PNKP (polynucleotide 5'-kinase/3'-phosphatase) is the end-healing and end-sealing component of an RNA-repair system present in diverse bacteria from ten different phyla. RNA breakage by site-specific 'ribotoxins' is an ancient mechanism by which microbes respond to cellular stress and distinguish self from non-self. Ribotoxins are trans-esterifying endonucleases that generate 5'-OH and 2',3' cyclic phosphate termini. Repair of this type of RNA damage is feasible via sequential enzymatic end-healing and end-sealing steps. The exact function of this C-terminal region is unclear; however, the conformation of the bundles changes on transfer of a PO4 from ATP to AMP.	60
406841	pfam16537	T2SSB	Type II secretion system protein B. This is the B protein from some operons of bacterial secretion systems of type II. The exact function of the B protein is not known, though in the case of Vibrio cholerae there is a fusion protein between proteins A and B that includes an AAA domain, a PG_binding domain as well as this domain at the C-terminus. Many of the other species have no A or B domain genes in this operon. The type II secretion pathway is conserved in Gram-negative bacteria that are prevalent in bacterial pathogens of plants (Pseudomonas fluorescens, Erwinia or Xanthomonas species), animals (Aeromonas hydrophila) and humans (Klebsiella oxytoca, Pseudomonas aeruginosa, Vibrio cholerae or Legionella pneumophila). Typical type II secretion systems (T2SSs) are encoded by a set of 12 to 16 gsp (general secretion pathway) genes organized into large operons including the conserved 'core' genes denoted C to O and in some bacterial species, as indicated above, extra gsp genes such as gspAB, gspN or gspS. A different nomenclature is used for Pseudomonas T2SSs, so the B gene is referred to as the P protein.	60
406842	pfam16538	FlgT_C	Flagellar assembly protein T, C-terminal domain. FlgT_C is the C-terminal domain of a family of flagellar proteins that make up part of the basal body of the flagellum. The flagellum is a large macromolecular assembly composed of three major parts: the basal body, the hook, and the filament. The basal body has two unique ring structures, the T ring and the H ring. FlgT is required to form and stabilize both ring structures. FlgT_C is not essential but stabilizes the H-ring structure.	74
406843	pfam16539	FlgT_M	Flagellar assembly protein T, middle domain. FlgT_M is the middle region of a family of flagellar proteins that make up part of the basal body of the flagellum. The flagellum is a large macromolecular assembly composed of three major parts: the basal body, the hook, and the filament. The basal body has two unique ring structures, the T ring and the H ring. FlgT is required to form and stabilize both ring structures. FlgT-N and FlgT-M are thought to be involved in the H-ring and the T-ring formation, respectively, and FlgT-M is also required for the stable association of FlgT with the basal body.	163
406844	pfam16540	MKLP1_Arf_bdg	Arf6-interacting domain of mitotic kinesin-like protein 1. This family is a C-terminal region of mitotic kinesin-like proteins that is necessary for the interaction with the small GTPase Arf6. MKLP1 is a Flemming body-localising protein essential for cytokinesis, so its interaction with Arf6 shows how Arf6 is involved in cytokinesis. The Arf6-MKLP1 complex plays a crucial role in cytokinesis by connecting the microtubule bundle and membranes at the cleavage plane.	107
406845	pfam16541	AltA1	Alternaria alternata allergen 1. AltA1 is a family of fungal allergens. It shows a unique beta-barrel comprising 11 beta-strands. There is structural evidence for the location of IgE antibody-binding epitopes. The crystal structure will allow efforts to promote immunotherapy for patients allergic to Alternaria species.	104
406846	pfam16542	PNKP_ligase	PNKP adenylyltransferase domain, ligase domain. PNKP_ligase is a classical ligase nucleotidyltransferase module of bacteria. PNKP (polynucleotide 5'-kinase/3'-phosphatase) is the end-healing and end-sealing component of an RNA-repair system present in diverse bacteria from ten different phyla. RNA breakage by site-specific 'ribotoxins' is an ancient mechanism by which microbes respond to cellular stress and distinguish self from non-self. Ribotoxins are trans-esterifying endonucleases that generate 5'-OH and 2',3' cyclic phosphate termini. Repair of this type of RNA damage is feasible via sequential enzymatic end-healing and end-sealing steps.	315
406847	pfam16543	DFRP_C	DRG Family Regulatory Proteins, Tma46. DFRP_C is a family of eukaryotic translation machinery-associated protein 46 proteins that are the binding partner for the highly conserved Developmentally Regulated GTP-binding (DRG) GTPases. Thus this family is referred to as DRG Family Regulatory Proteins (DFRP). Binding of this DFRP modulates the function of the GTPase.	89
406848	pfam16544	STAR_dimer	Homodimerization region of STAR domain protein. This family is the homodimerization domain of quaking proteins. Quaking-dimer is a helix-turn-helix dimer with an additional helix in the turn region. Dimerization is required for adequate RNA-binding. Quaking is a prototypical member of the STAR (signal transducer and activator of RNA) protein family, which plays key roles in post-transcriptional gene regulation by controlling mRNA translation, stability and splicing. STAR_dimer is the homodimerization domain, Qua1 of the STAR domain of a series of proteins referred to as STAR/GSG, or Signal Transduction and Activation of RNA/GRP33, Sam68, GLD-1 family. These are conserved in higher eukaryotes and are RNA-binding transcriptional regulators. The STAR domain is a KH domain flanked by two homologous regions, Qua1 and Qua2. Qua1, this family, is the homodimerization domain, and the KH plus Qua2 is the RNA-binding region.	51
406849	pfam16545	CCM2_C	Cerebral cavernous malformation protein, harmonin-homology. CCM2_HHD is a folded-helical region of a family of vertebral proteins, mutations in which cause cerebral cavernous malformations (CCMs). These malformations are congenital vascular anomalies of the central nervous system that can result in haemorrhagic stroke, seizures, recurrent headaches, and focal neurologic deficits. This domain is structurally homologous to the N-terminal domain of harmonin, so it is named the CCM2 harmonin-homology domain or CCM2_HHD. This protein is often called Malcavernin.	91
406850	pfam16546	SGTA_dimer	Homodimerization domain of SGTA. SGTA_dimer is a short N-terminal domain at the start of SGTA or small glutamine-rich tetratricopeptide repeat-containing proteins. It is the homodimerization domain of the SGTA, a heat-shock protein (HSP) co-chaperone involved in the targeting of tail-anchor membrane proteins to the endoplasmic reticulum. This N-terminal homodimerization domain mediates the association with a single copy of Get4 or Get5 proteins, providing a link to the rest of the GET pathway.	64
406851	pfam16547	BLM10_N	Proteasome-substrate-size regulator, N-terminal. The ordered regions of the yeast BLM10 or PA200 (human homolog), full-length protein encode 32 HEAT repeat (HR)-like modules, each comprising two helices joined by a turn, with adjacent repeats connected by a linker. Whereas a standard HEAT repeat is composed of ~50 residues, the BLM10 HEAT repeats are highly variable. The length of helices ranges from 8 to 35 residues, turns range from 2 to 87 residues, and linkers range from 1 to 88 residues, with the longest linker, between HR21 and HR22, containing additional secondary structures (two strands and three helices). BLM10_N is the N-terminal ordered region of the three in BLM10. BLM10 is found to surround the proteasome entry pore in the 1.2 MDa complex of proteasome and BLM10 to form a largely closed dome that is expected to restrict access of potential substrates. BLM10 and PA200 are predominantly nuclear and stimulate the degradation of model peptides, although they do not appear to stimulate the degradation of proteins, recognize ubiquitin, or utilize ATP.	81
406852	pfam16548	FlgT_N	Flagellar assembly protein T, N-terminal domain. FlgT_N is the N-terminal domain of a family of flagellar proteins that make up part of the basal body of the flagellum. The flagellum is a large macromolecular assembly composed of three major parts: the basal body, the hook, and the filament. The basal body has two unique ring structures, the T ring and the H ring. FlgT is required to form and stabilize both ring structures. FlgT-N contributes to the construction of the H-ring structure, and adopts a two-layer alpha-beta sandwich architecture composed of a four-stranded anti-parallel beta-sheet and two alpha helices.	87
406853	pfam16549	T2SSS_2	Type II secretion system (T2SS) pilotin, S protein. T2S_S is the S protein or pilotin of the bacterial Gram-negative secretion system in Vibrio and some E.coli and Shigella. It is given the suffix _2 to distinguish it from the PulS_OutS family of pilotins from Klebsiella and Dickeya, etc. AspS is functionally equivalent and yet structurally unrelated to the pilotins found in Klebsiella and other bacteria. AspS binds to a specific targeting sequence in the Vibrio-type secretins, enhancing the kinetics of secretin assembly; homologs of AspS are found in all species of Vibrio as well as those few strains of Escherichia and Shigella that have acquired a Vibrio-type T2SS. PulS is the Klebsiella pilotin, found in PulS_OutS, pfam09691. Not all species with a type II secretion system have this pilotin or S protein.	104
406854	pfam16550	RPN13_C	UCH-binding domain. RPN13_C is a family of all-helical domains that forms the binding-surface for the proteasome-ubiquitin-receptor protein Rpn13 to UCH37, one of the three de-ubiquitinating enzymes of the proteasome.	106
406855	pfam16551	Quaking_NLS	Putative nuclear localization signal of quaking. Quaking_NLS is the very C-terminal region of quaking proteins that is purported to be the nuclear localization signal.	30
406856	pfam16552	OAM_alpha	D-ornithine 4,5-aminomutase alpha-subunit. OAM_alpha is the 12.8kDa, alpha subunit of d-ornithine 4,5-aminomutase, or OAM, an enzyme that converts d-ornithine to 2,4-diaminopentanoic acid by way of radical propagation from an adenosylcobalamin to a pyridoxal 5'-phosphate cofactor. OAM is an alpha2-beta2 heterodimer comprising two strongly associating subunits. The packing of the alpha subunits against the beta helps to form the substrate and co-factor binding-regions.	107
406857	pfam16553	PUFD	BCORL-PCGF1-binding domain. PUFD is the minimal domain at the C-terminus of BCORL (BCL6 corepressor) that is needed for binding and giving specificity to some of the PCGF proteins, polycomb-group RING finger homologs. PUFD binds to the RAWUL (RING finger- and WD40-associated ubiquitin-like) domain of the particular PCGF PCGF1, pfam16207. Polycomb group proteins form repressive complexes (PRC) that mediate epigenetic modifications of histones. In humans there are many different PCGF homologs whose functions all vary, but the direct binding partner of PCGF1 is BCOR. BCOR has emerged as an important player in development and health.	110
406858	pfam16554	OAM_dimer	dimerization domain of d-ornithine 4,5-aminomutase. This family is the short dimerization domain of the enzyme D-ornithine 4,5-aminomutase. It sits between the TIM-barrel pfam09043 and pfam02310. The enzyme is an alpha2-beta2-heterodimer that converts D-ornithine to 2,4-diaminopentanoic acid by way of radical propagation from an adenosylcobalamin to a pyridoxal 5'-phosphate cofactor.	78
406859	pfam16555	GramPos_pilinD1	Gram-positive pilin subunit D1, N-terminal. GramPos_pilinD1 is the first subunit domain of Gram-positive pilins from Strep.pneumoniae. There are three major pilin subunits that form the polymeric backbone of the pilin from S. pneumoniae, constructed of three Ig-like, CnaB, domains along with a crucial N-terminal domain, D1. The three IG-like domains are stabilized by internal Lys-Asn isopeptide bonds, but this N-terminal domain makes few contacts with the rest of the molecule due to the different orientation of its G beta-strand. Strand G of D1 also carries the YPKN motif that provides the essential Lys residue for the sortase-mediated intermolecular linkages along the pilus shaft. Gram-positive pili are formed from a single chain of covalently linked subunit proteins (pilins), usually comprising an adhesin at the distal tip, a major pilin that forms the polymer shaft and a minor pilin that mediates cell wall anchoring at the base.	161
406860	pfam16556	IL17R_fnIII_D1	Interleukin-17 receptor, fibronectin-III-like domain 1. IL17R_fnIII_D1 is the first of two fibronectin 3-like domains on interleukin-17 receptor proteins A and B. The two fnIII domains are linked and together bind two molecules of IL-17 at one of its receptor-binding interfaces. This allows the other interface to bind to another receptor, thus allowing the IL-17 family of homodimeric cytokines to coordinate two different receptors.	154
406861	pfam16557	CUTL	CUT1-like DNA-binding domain of SATB. CUTL is part of the N-terminal region of SATB proteins, special AT-rich sequence-binding proteins that are global chromatin organizers and gene expression regulators essential for T-cell development and breast cancer tumor growth and metastasis. CUTL carries a DNA-binding region just as CUT domains do.	71
406862	pfam16558	AZUL	Amino-terminal Zinc-binding domain of ubiquitin ligase E3A. The AZUL or amino-terminal zinc-binding domain of ubiquitin E3a ligase is found in eukaryotes, and is an unusual zinc-finger domain. The final cysteine is usually mutated in Angelman syndrome patients. It is likely that AZUL plays a role in Ube3A substrate-recognition.	59
406863	pfam16559	GIT_CC	GIT coiled-coil Rho guanine nucleotide exchange factor. GIT-CC is the coiled-coil region of GIT (G protein-coupled receptor kinase-interacting) proteins. This coiled-coil region is the surface that associates with the equivalent binding-region on beta-PIX, or p21-activated kinase-interacting exchange factor proteins. Both GIT and PIX complex together to form a scaffold for the formation of multi-protein assemblies. On its own the GIT-CC region assembles into a parallel two-stranded CC in the asymmetric unit. Similarly the PIX coiled-coil region assembles into a trimer. At least in vitro the two regions associate together into a stable heteropentameric complex that consists of one PIX trimer and one GIT dimer.	66
406864	pfam16560	SAPI	Putative mobile pathogenicity island. SAPI is a family of putative Gram-positive mobile pathogenicity island proteins. SAPIs are responsible for many superantigen-related diseases in humans as they carry two or more superantigens.	213
406865	pfam16561	AMPK1_CBM	Glycogen recognition site of AMP-activated protein kinase. AMPK1_CBM is a family found in close association with AMPKBI pfam04739. The surface of AMPK1_CBM reveals a carbohydrate-binding pocket.	85
406866	pfam16562	HECW_N	N-terminal domain of E3 ubiquitin-protein ligase HECW1 and 2. HECW_N is a domain on E3 ubiquitin-protein ligases that lies upstream of the C2 domain; its function is not clearly understood, except perhaps to determine the substrate spectrum of the ligase.	118
406867	pfam16563	P66_CC	Coiled-coil and interaction region of P66A and P66B with MBD2. This family is a short coiled-coil interaction region on the transcriptional repressors P66A and P66B. The P66A and B, or alpha and beta, complex with MBDs or methyl-binding domain-containing proteins via a coiled-coil region on each. This P66-MBD2 complex forms part of an assembly with NuRD, nucleosome remodelling and deacetylation protein. MBD2-NuRD binds methylated DNA and regulates transcription of eg, the foetal beta-globin gene during development.	37
406868	pfam16564	MBDa	p55-binding region of Methyl-CpG-binding domain proteins MBD. MBDa is a second MBD domain of Methyl-CpG-binding domain proteins. It is a region implicated in binding the RbAp46/48 (retinoblastoma protein-associated protein) homolog p55, which is one of the components of the MBD2-NuRD complex. The MBD2-NuRD complex is a nucleosome remodelling and deacetylation complex.	69
406869	pfam16565	MIT_C	Phospholipase D-like domain at C-terminus of MIT. MIT_C is the C-terminal domain of MIT-containing proteins, pfam04212. It contains an unanticipated phospholipase d fold (PLD fold) that binds avidly to phosphoinositide-containing membranes. It is conserved in eukaryotes, though not fungi and plants, and some bacteria.	137
406870	pfam16566	CREPT	Cell-cycle alteration and expression-elevated protein in tumor. CREPT (Cell-cycle alteration and expression-elevated protein in tumor) is a family of eukaryotic transcriptional regulators that promote the binding of RNA-polymerase to the CYCLIN D1, CCDN1, promoter and other genes involved in the cell-cycle. It promotes the formation of a chromatin loop in the CYCLIN D1 gene, and is preferentially expressed in a range of different human tumors.	147
293175	pfam16567	CagD	Pathogenicity island component CagD. CagD is a tightly conserved family of proteins found in the pathogenic strains of Helicobacter species. It is one of some 30 proteins, produced from the genomic insert termed the pathogenicity island, required for the type IV secretion system - T4SS - that delivers CagA oncoprotein toxin into the host cell. CagD is a covalent dimer in which each monomer folds as a single domain composed of five beta-strands and three alpha-helices. CagD partially associates with the inner membrane, where it may be exposed to the periplasmic space; this may indicate that CagD is released into the supernatant during host cell infection in order then to bind to the host cell surface, or to be incorporated into the pilus structure.	205
406871	pfam16568	Sam68-YY	Tyrosine-rich domain of Sam68. Sam68-YY is a short tyrosine-rich domain on Src-associated in mitosis, 68 kDa protein (Sam68), a protein that regulates TCF-1 alternative splicing. It is a crucial binding-partner of the APC-Arm domain that forms a superhelix with a positively charged groove, the surface-residues of which groove form numerous interactions with Sam68-YY to fix it in a bent conformation. APC-Arm is the armadillo repeat domain of the tumor-suppressor protein adenomatous polyposis coli or APC. APC plays important roles in Wnt signalling and other cellular processes.	55
406872	pfam16569	GramPos_pilinBB	Gram-positive pilin backbone subunit 2, Cna-B-like domain. GramPos_pilinBB is one of the major backbone units of Gram-positive pili, such as those from S.pneumoniae. There are three major pilin subunits that form the polymeric backbone of the pilin from S. pneumoniae, constructed of three transthyretin-like, CnaB, domains along with a crucial N-terminal domain, D1. The three Cna-B like domains are stabilized by internal Lys-Asn isopeptide bonds. Gram-positive pili are formed from a single chain of covalently linked subunit proteins (pilins), usually comprising an adhesin at the distal tip, a major pilin that forms the polymer shaft and a minor pilin that mediates cell wall anchoring at the base.	116
406873	pfam16570	GramPos_pilinD3	Gram-positive pilin backbone subunit 3, Cna-B-like domain. GramPos_pilinD3 is one of the major backbone units of Gram-positive pili, such as those from S.pneumoniae. There are three major pilin subunits that form the polymeric backbone of the pilin from S. pneumoniae, constructed of three transthyretin-like, CnaB, domains along with a crucial N-terminal domain, D1. The three Cna-B like domains are stabilized by internal Lys-Asn isopeptide bonds. Gram-positive pili are formed from a single chain of covalently linked subunit proteins (pilins), usually comprising an adhesin at the distal tip, a major pilin that forms the polymer shaft and a minor pilin that mediates cell wall anchoring at the base.	141
406874	pfam16571	FBP_C	FBP C-terminal treble-clef zinc-finger. FBP_C is a family from the C terminal end of fibronectin-binding proteins. It forms an extended four-cysteine zinc-finger with a unique structural fold. Fibronectin-binding proteins bind to elongation factor G - EF-G, which is mediated by the zinc-finger binding to the C-terminus of EF-G. FBPs release ribosomes by competing with them for EF-G.	155
406875	pfam16572	HlyD_D4	Long alpha hairpin domain of cation efflux system protein, CusB. HlyD_D4 is the long alpha-hairpin domain in the centre of CusB or HlyD proteins. CusB and HlyD proteins are membrane fusion proteins of the CusCFBA copper efflux system in E.coli and related bacteria. Efflux systems of this resistance-nodulation-division group - RND - have been developed to excrete poisonous metal ions, and in E.coli the only one that deals with silver and copper is the CusA transporter. The transporter CusA works in conjunction with a periplasmic component that is a membrane fusion protein, eg CusB, and an outer-membrane channel component CusC in a CusABC complex driven by import of protons. HlyD_D4 is thought to interact with the alpha-helical tunnels of the corresponding outer-membrane channels, ie the periplasmic domain of CusC.	54
406876	pfam16573	CLP1_N	N-terminal beta-sandwich domain of polyadenylation factor. This family is the short N-terminal domain of the pre-mRNA cleavage complex II protein Clp1. Clp1 function involves some degree of adenine or guanine nucleotide-binding and participates in the 3'-end-processing of mRNAs in eukaryotes.	92
406877	pfam16574	CEP209_CC5	Coiled-coil region of centrosome protein CEP290. CEP290 and similar centrosomal proteins carry a number of coiled-coil regions, and this is the fifth along the length of the protein. It is thought that the proteins are involved in cilia biosynthesis.	128
406878	pfam16575	CLP1_P	mRNA cleavage and polyadenylation factor CLP1 P-loop. CLP1_P is the P-loop carrying domain of Clp1 mRNA cleavage and polyadenylation factor, Clp1, proteins in eukaryotes. Clp1 is essential for 3'-end processing of mRNAs. This region carries the P-loop suggesting it is the region that binds adenine or guanine nucleotide.	187
406879	pfam16576	HlyD_D23	Barrel-sandwich domain of CusB or HlyD membrane-fusion. HlyD_D23 is the combined domains 2 and 3 of the membrane-fusion proteins CusB and HlyD, which forms a barrel-sandwich. CusB and HlyD proteins are membrane fusion proteins of the CusCFBA copper efflux system in E.coli and related bacteria. The whole molecule hinges between D2 and D3. Efflux systems of this resistance-nodulation-division group - RND - have been developed to excrete poisonous metal ions, and in E.coli the only one that deals with silver and copper is the CusA transporter. The transporter CusA works in conjunction with a periplasmic component that is a membrane fusion protein, eg CusB, and an outer-membrane channel component CusC in a CusABC complex driven by import of protons.	214
318725	pfam16577	UBA_5	UBA domain. UBA_2 is a domain found on eukaryotic ubiquitin-interacting proteins. Sequestosome 1/p62 has recently been shown to interact with polyubiquitinated proteins through its UBA domain. This domain selectively binds K63-polyubiquitinated proteins.	62
406880	pfam16578	IL17R_fnIII_D2	Interleukin 17 receptor D. IL17R_fnIII_D2 is the second extracellular fibronectin III-like domain on interleukin17-receptor-D molecules. The exact ligands of IL17R-D are not known.	105
406881	pfam16579	AdenylateSensor	Adenylate sensor of SNF1-like protein kinase. AdenylateSensor is a family found at the C-terminus of SNF1-like protein kinases and other protein-kinases.	118
293188	pfam16580	Astro_capsid_p2	C-terminal tail of astrovirus capsid projection or spike. Astro_capsid_p2 is a family of turkey astroviral spike projections. These are globular domains on the surface of the viral capsid. Astroviruses cause diarrhoea in a variety of mammals and birds, and are small, non-enveloped, single-stranded RNA viruses. The spike carries three conserved patches on its surface which could be candidates for avian receptor-binding sites.	245
406882	pfam16581	HIGH_NTase1_ass	Cytidyltransferase-related C-terminal region. This domain is found as the C-terminal portion of some HIGH_NTase1 proteins. The exact function is not known.	205
406883	pfam16582	TPP_enzyme_M_2	Middle domain of thiamine pyrophosphate. TPP_enzyme_M_2 is the middle domain of thiamine pyrophosphate in sequences not captured by pfam00205. This enzyme is necessary for the first step of the biosynthesis of menaquinone, or vitamin K2, an important cofactor in electron transport in bacteria.	207
406884	pfam16583	ZirS_C	Zinc-regulated secreted antivirulence protein C-terminal domain. ZirS_C is the C-terminal domain of ZirS, zinc-regulated secreted protein, that is part of a type V-like secretion system. The domain adopts a bacterial Ig-like fold. This domain interacts with its transporter ZirT, and ZirS also interacts directly with ZirU, the third component of this antivirulence complex. ZirT is the zinc-regulated transporter through which ZirS is secreted.	141
293192	pfam16584	LolA_2	Outer membrane lipoprotein carrier protein LolA. LolA_2 is a family of Bacteroidetes outer membrane lipoprotein carrier protein LolA-like proteins. The exact function is not known.	152
406885	pfam16585	Lipocalin_8	Lipocalin-like domain. 	135
406886	pfam16586	DUF5060	Domain of unknown function (DUF5060). This is the N-terminal domain of a putative glycoside hydrolase, DUF4038. It is found in a number of different bacterial orders.	70
406887	pfam16587	DUF5061	17 kDa common-antigen outer membrane protein. This is a bacterial domain of 17 kDa common-antigen proteins.	82
374649	pfam16588	zf-C2H2_10	C2H2 zinc-finger. 	23
406888	pfam16589	BRCT_2	BRCT domain, a BRCA1 C-terminus domain. This BRCT domain, a BRCA1 C-terminus region, is found on many RAP1 proteins, usually at the very N-terminus. In human at least, the function of a BRCT is to contribute to the heterogeneity of the telomere DNA length, but that may not be its general function, which remains unknown.	84
406889	pfam16590	ESP	Exocrine gland-secreting peptide. ESP is a family of largely rodent exocrine gland-secreting peptides that are produced by the male extraorbital lacrimal gland to be secreted into the tear fluid. Other mice including females detect these peptides through receptors in the vomeronasal organ, and the receptors report information on mouse-strain, sex and species. The peptides are short, all carrying an N-terminal signal-peptide to indicate they are for secretion which accounts for much of the common conservation.	91
406890	pfam16591	HBM	Helical bimodular sensor domain. The HBM sensor domain has been identified primarily in bacterial chemoreceptors but is also present on histidine kinases. Characteristic features of this domain are its size of approximately 250 amino acids and its location in the bacterial periplasm. The McpS chemoreceptor of Pseudomonas putida KT2440 was found to possess an HBM sensor domain and its 3D structure in complex with physiologically relevant ligands has been reported. This domain is composed of 2 long and 4 short helices that form two modules each composed of a 4-helix bundle. The McpS chemoreceptor mediates chemotaxis towards a number of organic acids. Both modules of the McpS HBM domain contain a ligand binding site. Chemo-attractants bind to each of these sites and their binding was shown to trigger a chemotactic response. This domain is primarily found in different proteobacteria but also in archaea. Interestingly, amino acids in both ligand binding sites showed a high degree of conservation, suggesting that members of this family sense similar ligands.	245
406891	pfam16592	Cas9_REC	REC lobe of CRISPR-associated endonuclease Cas9. The REC lobe of Cas9 - the CRISPR-associated endonuclease Cas9 - includes the REC1 and REC2 domains. REC1 forms an elongated, alpha-helical structure consisting of 25 alpha helices and two beta-sheets, whereas REC2, inserted within REC1, adopts a six-helix bundle structure. The REC lobe and the NUC lobe of Cas9 fold to present a positively charged groove at their interface which accommodates the negatively charged sgRNA:target DNA heteroduplex. The CRISPR (clustered regularly interspaced short palindromic repeat)-Cas system occurs naturally in bacteria as a defense against invasion by phages or other mobile genetic elements. Cas9 is targeted to specific genomic locations by sgRNAs, or single guide RNAs, so that it can complex with invading DNA, cleave it and render it inactive.	526
406892	pfam16593	Cas9-BH	Bridge helix of CRISPR-associated endonuclease Cas9. Cas9-BH is the bridge helix between the NUC and the REC lobes of Cas9 - the CRISPR-associated endonuclease Cas9. The REC lobe and the NUC lobe of Cas9 fold to present a positively charged groove at their interface which accommodates the negatively charged sgRNA:target DNA heteroduplex. The CRISPR (clustered regularly interspaced short palindromic repeat)-Cas system occurs naturally in bacteria as a defense against invasion by phages or other mobile genetic elements. Cas9 is targeted to specific genomic locations by sgRNAs, or single guide RNAs, so that it can complex with invading DNA, cleave it and render it inactive.	33
406893	pfam16594	ATP-synt_Z	Putative AtpZ or ATP-synthase-associated. This is a family of short highly conserved plant proteins that might be associated with ATP-synthase atp operon.	53
406894	pfam16595	Cas9_PI	PAM-interacting domain of CRISPR-associated endonuclease Cas9. Cas9_PI is a family found at the C-terminal of bacterial type II CRISPR system Cas9 endonuclease. This domain adopts a novel protein fold that is unique to the Cas9 family. It is positioned in the structure-DNA-complex to recognize the PAM sequence on the non-complementary DNA strand of the crRNA. The PAM is the protospacer-adjacent motif on DNA. See family CRISPR-DR2, Rfam:RF01315. Cas9 carries two nuclease domains, HNH and RuvC, which cleave the DNA strands that are complementary and non-complementary to the 20 nucleotide guide sequence in crRNAs, respectively.	264
406895	pfam16596	MFMR_assoc	Disordered region downstream of MFMR. This is a conserved region of disorder, identified with the MobiDB database, found in plants immediately to the C-terminus of the MFMR domain.	136
406896	pfam16597	Thyroglob_assoc	Thyroglobulin_1 repeat associated disordered domain. This domain of conserved disorder lies almost invariably between the two repeated Thyroglobulin_1 domains, pfam00086.	61
406897	pfam16598	Edc3_linker	Linker region of enhancer of mRNA-decapping protein 3. This region is located between the LSM14 pfam12701 (Lsm) and FDF pfam09532 domains of the enhancer of mRNA-decapping protein 3. This region is predicted to be natively unstructured. Its precise functional role is not known.	94
406898	pfam16599	PTN13_u3	Unstructured linker region on PTN13 protein between PDZ. This natively unstructured region lies between the first two PDZ domains on long eukaryotic tyrosine-protein phosphatase non-receptor type 13 proteins. The function is not known. However, since each of the PDZ domains binds with a different protein it is likely to be a linker region allowing flexibility between the PDZs.	191
406899	pfam16600	Caskin1-CID	Caskin1 CASK-interaction domain. The Caskin1 protein interacts with the CASK protein via this region. CASK and Caskin1 are synaptic scaffolding proteins. The binding motif on human Caskin1 is EEIWVLRK. A similar motif is found on protein MINT1 and protein TIAM1, both shown to be able to bind to CASK through the motif. MINT1 and TIAM1 are not part of this family. This region is predicted to be natively unstructured.	55
406900	pfam16601	NPF	Rabosyn-5 repeating NPF sequence-motif. NPF is a natively unstructured but well-conserved region found in eukaryotic proteins of the Rabenosyn-5 type, wherein the sequence motif asparagine-proline-phenylalanine followed by several glutamates and aspartates is repeated up to four times along the sequence. NPF lies between the two Rab-binding domains, for Rab-4 and Rab-5, at the C-terminal end of these proteins. Rabosyn-5 (or rabenosyn) is also involved in cell-polarity determination in developing wing epithelia of Drosophila, when the NPF-motif may be implicated. These NPF motifs create a region of strong positive surface potential which appears to bind Eps15 homology, EH or EF-hand, domains on proteins involved in vesicle trafficking.	188
406901	pfam16602	USP19_linker	Linker region of USP19 deubiquitinase. This region is generally located between a CS domain pfam04969 and the enzymatic UCH domain pfam00582 of USP19 deubiquitinases. This region is predicted to be natively unstructured. Its precise functional role is not known.	121
406902	pfam16605	LSM_int_assoc	LSM-interacting associated unstructured. LSM_int_assoc is a family found largely on eukaryotic SART3 proteins just upstream of their C-terminal LSM-interacting domain. This region is natively unstructured.	60
406903	pfam16606	zf-C2H2_assoc	Unstructured conserved, between two C2H2-type zinc-fingers. This domain is found on a set of eukaryotic Zinc finger protein 536 transcriptional regulator proteins sandwiched between zf-C2H2, pfam00096 and zf-H2C2_2 pfam13465. It is not conserved between other pairs of the zinc-fingers on these sequences. It is natively unstructured, and its function is not known. The proteins recognize and bind 2 copies of the core DNA sequence 5'-CCCCCA-3'.	80
406904	pfam16607	CYLD_phos_site	Phosphorylation region of CYLD, unstructured. CYLD_phos_site is a natively unstructured region on a subset of tumor-suppressor and de-ubiquitinating enzyme CYLD proteins in eukaryotes. It lies between the second pair of CAP_GLY domains, pfam01302, on these proteins. This region of CYLD, being unstructured, carries a number of serine residues which, in response to cellular stimuli, become phosphorylated. This transient phosphorylation-state induces ubiquitination of TRAF2, a ubiquitin ligase that catalyzes both self-ubiquitination and the ubiquitination of specific target molecules involved in signal transduction.	165
406905	pfam16608	TNRC6-PABC_bdg	TNRC6-PABC binding domain. TNRC6-PABC_bdg is a natively unstructured region on the higher eukaryote TNRC6 subset of GW182 proteins that carries the binding motif for the interaction with Polyadenylate-binding protein 1, PABC. TNRC6 are trinucleotide repeat-containing gene 6 proteins required for miRNA-mediated gene silencing that are localized to the P bodies (processing bodies). P bodies are cytoplasmic mRNP aggregates that are involved in general mRNA translation repression and decay, including nonsense-mediated decay. Thus GW182 proteins are essential for microRNA-mediated translational repression and deadenylation in animal cells being a major component of miRISCs. The interaction motif that binds to PABC is ShNWPPEFHPGVPWKGLQ. This region lies between a Q-rich region and the RRM, or RNA-recognition motif, pfam13893.	290
406906	pfam16609	SH3-RhoG_link	SH3-RhoGEF linking unstructured region. This family of natively unstructured but conserved residues from higher eukaryotes is found to lie between an SH3 pfam00018 and the RhoGEF, pfam00621, domains. It is serine-rich and likely to be acidic and natively unstructured.	261
406907	pfam16610	dbPDZ_assoc	Unstructured region between two PDZ domains on Dlg5. dbPDZ_assoc is found on higher eukaryote Dlg5, Disks large homolog 5, proteins, lying between the second pair of PDZ domains. The sequence is natively unstructured but may just be long extensions of the PDZs on these sequences in this position. The function is not known.	81
406908	pfam16611	RGS12_us2	Unstructured region between RBD and GoLoco. RGS12_us2 is a region of Regulator of G-protein signalling 12 proteins that is natively unstructured and lies between an RBD domain and a GoLoco motif, pfam02196 and pfam02188. The function is not known.	72
406909	pfam16612	RGS12_usC	C-terminal unstructured region of RGS12. RGS12_usC is a region of Regulator of G-protein signalling 12 proteins that is natively unstructured and lies at the very C-terminus. It has a highly conserved central section. The function is not known.	138
406910	pfam16613	RGS12_us1	Unstructured region of RGS12. RGS12_us1 is a region of Regulator of G-protein signalling 12 proteins that is natively unstructured and lies N-terminal to other such regions in UniProt:E1BPP4. It is very glycine-rich, and the function is not known.	114
374671	pfam16614	RhoGEF67_u2	Unstructured region two on RhoGEF 6 and 7. RhoGEF67_u2 is a region of natively unstructured residues on Rho guanine nucleotide exchange factor 6 and 7 proteins. The function is not known. It lies after the PH domain and before the C-terminal coiled-coil.	109
406911	pfam16615	RhoGEF67_u1	Unstructured region one on RhoGEF 6 and 7. RhoGEF67_u1 is a region of natively unstructured residues on Rho guanine nucleotide exchange factor 6 and 7 proteins. The function is not known. It lies between the CH and the SH3 domains.	47
406912	pfam16616	PHC2_SAM_assoc	Unstructured region on Polyhomeotic-like protein 1 and 2. PHC2_SAM_assoc is a natively unstructured region on Polyhomeotic-like proteins 1 and 2, that lies immediately upstream of the SAM domain, pfam00536. The function is not known.	123
406913	pfam16617	INTAP	Intersectin and clathrin adaptor AP2 binding region. INTAP is a natively unstructured region of intersectin 1 proteins, lying between the first pair of SH3 domains, that binds to the clathrin adaptor AP2. This binding forms an intersectin-AP2 complex that functions as an important regulator of clathrin-mediated SV recycling in synapses.	115
406914	pfam16618	SH3-WW_linker	Linker region between SH3 and WW domains on ARHGAP12. SH3-WW_linker is a natively unstructured region on Rho-GTPase activating factor 12 proteins that lies between the SH3 and the WW domains. It is found in higher eukaryotes, and the function is not known.	196
406915	pfam16619	SUIM_assoc	Unstructured region C-term to UIM in Ataxin3. SUIM_assoc is a natively unstructured region on Ataxin 3 proteins that lies immediately C-terminal to the second UIM domain linking it to a third when present. The function is not known. It is rich in glutamine residues.	60
374677	pfam16620	23ISL	Unstructured linker between I-set domains 2 and 3 on MYLCK. 23ISL is a natively unstructured region lying between the second and third I-set domains on higher eukaryotic myosin light chain kinase (MYLCK) proteins. The function is not known. It carries a highly conserved TSSTITLQ sequence motif which might be a binding domain.	162
406916	pfam16621	NECFESHC	SH3 terminal domain of 2nd SH3 on Neutrophil cytosol factor 1. NECFESHC is the C-terminal domain of the second SH3 domain found on neutrophil cytosol factor 1 or p47phox proteins in higher eukaryotes. It is not unstructured, as illustrated by structure 1ng2.	50
406917	pfam16622	zf-C2H2_11	zinc-finger C2H2-type. Zinc-finger of C2H2 type found in higher eukaryotes.	29
406918	pfam16623	WW_FCH_linker	Unstructured linker region on GAS7 protein. WW_FCH_linker is a natively unstructured region on GAS7 or Growth arrest-specific protein 7 higher eukaryote proteins. It lies between the WW and the FCH domains. The function is not known but it carries a highly conserved TINCVTFP sequence motif which might be a binding domain.	92
406919	pfam16624	zf-C2H2_assoc2	Unstructured region upstream of a zinc-finger. zf-C2H2_assoc2 is a short region of natively unstructured sequence immediately upstream of a C2H2-type zinc-finger on eukaryotic Zinc-finger proteins 592 and 800. The function is not known.	95
406920	pfam16625	ISET-FN3_linker	Unstructured linking region I-set and fnIII on Brother of CDO. ISET-FN3_linker is a short section of natively unstructured sequence on Biregional cell adhesion molecule-related/down-regulated by oncogenes (Cdon) binding proteins or Brother of CDO. It is found in higher eukaryotes and lies between the second I-set and the first fnIII domains, pfam07679 and pfam00041. The function is not known.	65
374683	pfam16626	Papilin_u7	Linking region between Kunitz_BPTI and I-set on papilin. Papilin_u7 is a conserved region of natively unstructured residues on proteoglycan-like sulfated glycoprotein - papilin - in higher eukaryotes. It links the Kunitz_BPTI, pfam00014, and I-set, pfam07679, domains. The function is not known.	92
406921	pfam16627	BRX_assoc	Unstructured region between BRX_N and BRX domain. BRX_assoc is a short stretch of plant transcription regulator proteins carrying the BRX domain that is natively unstructured. It connects the BRX_N and BRX domains in plant transcription regulators. The function is not known.	70
374685	pfam16628	Mac_assoc	Unstructured region on maltose acetyltransferase. Mac_assoc is a region of natively unstructured residues on fungal maltose acetyltransferase proteins. It lies just upstream of the Mac, pfam12464, domain linking it with the upstream Zn_clus, pfam00172, the Zn(2)-Cys(6) binuclear cluster. The function of this region is not known.	185
406922	pfam16629	Arm_APC_u3	Armadillo-associated region on APC. Arm_APC_u3 is a semi-unstructured region lying immediately downstream of the armadillo fold before the beta-catenin binding motifs, APC_crr, pfam05923, on APC or adenomatous polyposis coli proteins in higher eukaryotes. The function is not known.	293
406923	pfam16630	APC_u5	Unstructured region on APC between 1st and 2nd catenin-bdg motifs. APC_u5 is a short region of natively unstructured sequence lying between the first and the second 15-residue beta-catenin binding motifs, APC_15aa, pfam05972, on APC or adenomatous polyposis coli proteins in higher eukaryotes. The function is not known.	100
406924	pfam16631	TUTF7_u4	Unstructured region 4 on terminal uridylyltransferase 7. TUTF7_u4 is the fourth natively unstructured region found on a set of higher eukaryote Terminal uridylyltransferase 7 proteins. The function is not known. The region is rich in arginine and lysine.	88
406925	pfam16632	Caskin-tail	C-terminal region of Caskin. This region is found at the C-terminus of Caskin proteins. Caskins are CASK-binding synaptic scaffolding proteins. Part of this region is predicted to be in coiled-coil conformation. Its function is not known.	61
406926	pfam16633	APC_u9	Unstructured region on APC between 1st two cysteine-rich regions. APC_u9 is a short region of natively unstructured sequence lying between the first and second APC_crr, pfam05923, domains on APC or adenomatous polyposis coli proteins in higher eukaryotes. The function is not known.	89
406927	pfam16634	APC_u13	Unstructured region on APC between APC_crr and SAMP. APC_u13 is a short region of natively unstructured sequence lying between the fourth cysteine-rich region, APC_crr, pfam05923, and the SAMP, pfam05924, domains on APC or adenomatous polyposis coli proteins in higher eukaryotes. The function is not known.	54
406928	pfam16635	APC_u14	Unstructured region on APC between SAMP and APC_crr. APC_u14 is a short region of natively unstructured sequence lying between the second SAMP, pfam05924, and the fifth cysteine-rich region, APC_crr, pfam05923, on APC or adenomatous polyposis coli proteins in higher eukaryotes. The function is not known.	94
406929	pfam16636	APC_u15	Unstructured region on APC between APC_crr regions 5 and 6. APC_u15 is a short region of natively unstructured sequence lying between the fifth and sixth cysteine-rich, APC_crr, pfam05923, domains on APC or adenomatous polyposis coli proteins in higher eukaryotes. The function is not known.	81
406930	pfam16637	zf-C2H2_assoc3	Putative zinc-finger between two C2H2 zinc-fingers on Patz. zf-C2H2_assoc3 is a partially unstructured region on Patz or POZ-, AT hook-, and zinc finger-containing proteins of higher eukaryotes. It lies between the two C2H2-type zinc-fingers towards the C-terminus of these proteins and may well be an unusual zinc-finger itself.	74
406931	pfam16638	Tristanin_u2	Unstructured region on methyltransferase between zinc-fingers. Tristanin_u2 is a region of natively unstructured sequence on tristanin like or PR domain zinc finger protein 10s found in higher eukaryotes. It lies between two C2H2-type zinc-fingers. The function is not known.	121
406932	pfam16639	Apocytochr_F_N	Apocytochrome F, N-terminal. This is the N-terminal domain of cytochrome f. It is a soluble lumen-side domain.	154
406933	pfam16640	Big_3_5	Bacterial Ig-like domain (group 3). This family consists of bacterial domains with an Ig-like fold.	90
406934	pfam16641	CLIP1_ZNF	CLIP1 zinc knuckle. This zinc knuckle domain is found tandemly repeated at the C-terminal of the cytoplasmic linker protein CLIP1 (CLIP170). It forms a complex with the CAP-Gly domain of Dynactin.	17
406935	pfam16642	KCNQ2_u3	Unstructured region on Potassium channel subunit alpha KvLQT2. KCNQ2_u3 is a region of natively unstructured sequence on potassium voltage-gated channel subfamily KQT member 2 proteins from higher eukaryotes. It lies between families KCNQ_channel, pfam03520, and KCNQC3-Ank_bd, pfam11956. The function is not known.	96
406936	pfam16643	cNMPbd_u2	Unstructured region on cNMP-binding protein. cNMPbd_u2 is a natively unstructured region on a set of higher eukaryote cyclic nucleotide-binding domain-containing proteins. It lies between the second cNMP_binding, pfam00027, and the F-box, pfam00646, domains. The function is not known but there is a highly conserved DPDPFL sequence motif.	163
406937	pfam16644	NEXCaM_BD	Regulatory region of Na+/H+ exchanger NHE binds to calmodulin. NEXCaM_BD is a coiled-coil domain found as part of the regulatory, C-terminal region of the 12-14 TM sodium/proton exchangers (NHEs) of the solute carrier 9 (SLC9) family in all animal kingdoms. The C-lobe of CaM binds the first alpha-helix of the NHE, or NEXCaM_BD region, and the N-lobe of CaM binds the second helix of NEXCaM_BD.	110
318786	pfam16645	PHtD_u1	Unstructured region on Pneumococcal histidine triad protein. PHtD_u1 is a natively unstructured region on Pneumococcal histidine triad proteins lying between the first two Strep_his_triad domains so far identified, pfam04270. The function is not known but it does not carry the characteristic histidine triad.	56
406938	pfam16646	AXIN1_TNKS_BD	Axin-1 tankyrase binding domain. This is the N-terminal domain tankyrase binding domain of Axin-1.	75
406939	pfam16647	GCSF	Granulocyte colony-stimulating factor. GCSF is a family of higher eukaryotic granulocyte colony-stimulating factor proteins. Granulocyte colony-stimulating factors are cytokines that are involved in haematopoiesis. They control the production, differentiation and function of white blood cell granulocytes. GCSF binds to the extracellular Ig-like and CRH domain of its receptor GCSFR, thereby triggering the receptor to homodimerize. Homodimerization results in activation of Janus tyrosine kinase-signal transducers and other activators of transcription (JAK-STAT)-type signalling cascades.	149
406940	pfam16648	Calpain_u2	Unstructured region on Calpain-3. Calpain_u2 is a region of natively unstructured sequence that lies between the Calpain_III, pfam01067, and the first EF-hand, pfam13833, domains on higher eukaryote calpain-3 proteins. The function is not known.	68
406941	pfam16649	IL23	Interleukin 23 subunit alpha. This family, interleukin 23 subunit alpha, is a heterodimer consisting of a 40 kDa subunit - p40 - that is shared with IL12 and a unique 19 kDa subunit - p19. IL23 is a pro-inflammatory cytokine that binds to adnectins and thus plays a key role in the pathogenesis of several autoimmune and inflammatory diseases. IL23 signalling on the cell membrane works through the interaction of four proteins, two of which are shared with the IL12-receptor complex; signalling through the cell membrane involves the combined aggregation of at least two receptor components and then the subsequent activation of the Jak/Tyk tyrosine kinases and the family of STAT transcription factors.	158
293256	pfam16650	SPEG_u2	Unstructured region on SPEG complex protein. SPEG_u2 is a region of natively unstructured but conserved sequence on Striated muscle-specific serine/threonine-protein kinase proteins in higher eukaryotes. It lies between two I-set immunoglobulin, pfam07679, domains. The function is not known.	57
406942	pfam16652	PH_13	Pleckstrin homology domain. 	144
406943	pfam16653	Sacchrp_dh_C	Saccharopine dehydrogenase C-terminal domain. This family comprises the C-terminal domain of saccharopine dehydrogenase. In some organisms this enzyme is found as a bifunctional polypeptide with lysine ketoglutarate reductase. The saccharopine dehydrogenase can also function as a saccharopine reductase.	212
406944	pfam16654	DAPDH_C	Diaminopimelic acid dehydrogenase C-terminal domain. This family comprises the C-terminal domain of diaminopimelic acid dehydrogenase. Diaminopimelate dehydrogenase is a NADPH-dependent enzyme that catalyzes the oxidative deamination of meso-2,6-diaminopimelate, which is the direct precursor of L-lysine in bacterial lysine biosynthesis.	154
379867	pfam16655	PhoD_N	PhoD-like phosphatase, N-terminal domain. This domain is found at the N-terminus of proteins in the PhoD family pfam09423.	89
406945	pfam16656	Pur_ac_phosph_N	Purple acid Phosphatase, N-terminal domain. This domain is found at the N-terminus of Purple acid phosphatase proteins.	94
406946	pfam16657	Malt_amylase_C	Maltogenic Amylase, C-terminal domain. This is the C-terminal domain of Maltogenic amylase, an enzyme that hydrolyzes starch material. Maltogenic amylases are central to carbohydrate metabolism.	75
406947	pfam16658	RF3_C	Class II release factor RF3, C-terminal domain. 	129
374703	pfam16660	PHD20L1_u1	PHD finger protein 20-like protein 1. PHD20L1_u1 is a region of natively unstructured but highly conserved sequence on a set of higher eukaryotic PHD finger protein 20-like protein 1 like proteins. The function is not known.	68
406948	pfam16661	Lactamase_B_6	Metallo-beta-lactamase superfamily domain. This family is part of the metallo-beta-lactamase superfamily.	192
406949	pfam16662	FLYWCH_u	FLYWCH-type zinc finger-containing protein 1. FLYWCH_u is a region of natively unstructured but conserved sequence that lies between the FLYWCH zinc-finger domains on FLYWCH-type zinc finger-containing protein 1 proteins in higher eukaryotes. The function is not known but the N- and C-termini are likely to be part of the zinc-finger domains specific to the eukaryotes.	62
406950	pfam16663	MAGI_u1	Unstructured region on MAGI. MAGI_u1 is a region of natively unstructured but highly conserved sequence on a subset of higher eukaryote membrane-associated guanylate kinase with WW and PDZ domain-containing proteins. The function is not known.	60
406951	pfam16664	STAC2_u1	Unstructured on SH3 and cysteine-rich domain-containing protein 2. STAC2_u1 is a region of natively unstructured but highly conserved sequence between C1_1 pfam00130, and an SH3 domain, eg pfam00018, on SH3 and cysteine-rich domain-containing proteins from higher eukaryotes. The function is not known.	123
406952	pfam16665	NCOA_u2	Unstructured region on nuclear receptor coactivator protein. NCOA_u2 is a region of natively unstructured but highly conserved sequence found on higher eukaryote nuclear receptor coactivator proteins. It lies between a PAS domain, pfam14598 and a steroid receptor coactivator domain, pfam08832. The function is not known.	119
406953	pfam16666	MAGI_u5	Unstructured region on MAGI. MAGI_u5 is a region of natively unstructured but highly conserved sequence on a subset of higher eukaryote membrane-associated guanylate kinase with WW and PDZ domain-containing proteins. The function is not known. This region lies between two PDZ, pfam00595, domains.	99
406954	pfam16667	MPDZ_u10	Unstructured region 10 on multiple PDZ protein. MPDZ_u10 is a region of natively unstructured but highly conserved sequence on multiple-PDZ-containing domain proteins in higher eukaryotes. It lies between two PDZ domains, pfam00595. The function is not known.	65
293273	pfam16668	JLPA	Adhesin from Campylobacter. JLPA is a surface-exposed lipoprotein adhesin that promotes interaction with the host epithelial cells. It is found in the genus Campylobacter, and the structure is an unclosed half beta-barrel fold with a wide hydrophobic concave face; this represents a novel bacterial surface lipoprotein.	352
406955	pfam16669	TTC5_OB	Tetratricopeptide repeat protein 5 OB fold domain. This OB fold domain is located at the C-terminus of Tetratricopeptide repeat protein 5 and is required for effective p53 response.	115
406956	pfam16670	PI-PLC-C1	Phosphoinositide phospholipase C, Ca2+-dependent. PI-PLC-C1 is a family of calcium 2+-dependent phosphatidylinositol-specific phospholipase C1 enzymes from bacteria and fungi. The enzyme classification number is EC:3.1.4.11. This enzyme is involved in part of the myo-inositol phosphate metabolic pathway.	330
293276	pfam16671	ACD	Actin cross-linking domain. This domain is found in Vibrio cholerae RtxA toxin and VgrG1 protein. This domain cross-links to G-actin leading to cytoskeletal changes.	386
406957	pfam16672	LAMTOR5	Ragulator complex protein LAMTOR5. 	88
406958	pfam16673	TRAF_BIRC3_bd	TNF receptor-associated factor BIRC3 binding domain. This domain is found in TNF receptor-associated factor 1 and 2 (TRAF1 and TRAF2), where it binds to Baculoviral IAP repeat-containing protein 3 (BIRC3) (cIAP2).	61
406959	pfam16674	UCH_N	N-terminal of ubiquitin carboxyl-terminal hydrolase 37. UCH_N is a domain found at the N-terminus of ubiquitin carboxyl-terminal hydrolase 37 or 26. The function is not known.	102
406960	pfam16675	FOXO_KIX_bdg	KIX-binding domain of forkhead box O, CR2. FOXO_KIX_bdg is the first part of the region of transcription factor forkhead box O family proteins that binds to the CREB-binding proteins via the KIX domain. Coactivator CBP/p300 is recruited by FOXO3 via the binding of this domain as well as the simultaneous binding of the more C-terminal TAD domain.	76
406961	pfam16676	FOXO-TAD	Transactivation domain of FOXO protein family. TAD is a promiscuous binding domain that mediates the association of the transcription factor FOXO with the coactivator CBP/p300. Both this domain and the FOXO_KIX_bdg family, pfam16675, simultaneously bind the KIX domain of CBP/p300. Coactivator CBP/p300 is recruited by FOXO3 through binding to these two regions. The promiscuity of the TAD is further evidenced by the finding that it also binds the TAZ1 and TAZ2 domains of CBP/p300.	40
406962	pfam16677	GP3_package	DNA-packaging protein gp3. DNA-packaging protein gp3 (terminase small subunit) is involved in DNA packing in bacteriophage. It contains a channel where DNA is bound and passed to DNA-packaging protein gp2 (terminase large subunit).	106
406963	pfam16678	HOIP-UBA	HOIP UBA domain pair. HOIP-UBA is a binding domain on E3 ubiquitin-protein ligase RNF31 like proteins. E3 ubiquitin-protein ligase RNF31 is often referred to as HOIL-1L binding partner. The interaction of HOIL-1L and HOIP is thus via the UBL-UBA interaction. This interaction is important in E3 complex formation and the subsequent activation of NF-kappaB. This family contains two UBA-like domains.	150
406964	pfam16679	CDT1_C	DNA replication factor Cdt1 C-terminal domain. This is the C-terminal domain of DNA replication factor Cdt1. This domain binds the MCM complex.	96
406965	pfam16680	Ig_4	T-cell surface glycoprotein CD3 delta chain. This is an immunoglobulin-like domain. It is found on the T-cell surface glycoprotein CD3 delta chain. CD3delta and CD3epsilon complex together as part of the T-cell receptor complex.	76
406966	pfam16681	Ig_5	Ig-like domain on T-cell surface glycoprotein CD3 epsilon chain. Ig_5 is an immunoglobulin domain found on T-cell surface glycoprotein CD3 epsilon chain. It forms a first-order complex with T-cell surface glycoprotein CD3 delta chain as part of the T-cell receptor complex.	75
374722	pfam16682	MSL2-CXC	CXC domain of E3 ubiquitin-protein ligase MSL2. MSL2-CXC is an autonomously folded domain that binds three zinc ions. It lies on the E3 ubiquitin-protein ligase MSL2 in eukaryotes. The CXC domain critically contributes to the DNA-binding activity of MSL2. It carries 9 invariant cysteines within about a 50 residue region.	55
406967	pfam16683	TGase_elicitor	Transglutaminase elicitor. TGase_elicitor is a family of largely oomycete sequences from plant pathogens that elicit transglutaminase/acyltransferase activity. The enzyme classification is E.C:2.3.2.13. From the presence of sequences from Vibrio spp one can propose a lateral gene transfer event having occurred between bacteria and oomycetes to the probable selective advantage of the pathogen.	361
406968	pfam16684	Telomere_res	Telomere resolvase. Telomere resolvase (protelomerase) catalyzes the conversion of linear double-stranded DNA into hairpin telomeres.	272
374725	pfam16685	zf-RING_10	zinc RING finger of MSL2. zf-RING_10 is an N-terminal domain on E3 ubiquitin-protein ligase msl-2 proteins. The domain binds MSL1 and exhibits ubiquitin E3 ligase activity towards H2B K34, the histone proteins.	70
406969	pfam16686	POT1PC	ssDNA-binding domain of telomere protection protein. POT1PC is the ssDNA-binding domain on a family of fungal telomere protection protein 1 proteins. POT1PC is able to accommodate heterogeneous ssDNA ligands. Pot1 proteins are the proteins responsible for binding to and protecting the 3' single-stranded DNA (ssDNA) overhang at most eukaryotic telomeres.	152
374727	pfam16687	ELYS-bb	beta-propeller of ELYS nucleoporin. ELYS-bb is the N-terminal seven-bladed beta-propeller domain of ELYS nucleoporins in higher eukaryotes. It is required for anchorage of the nucleoporin to the nuclear envelope during cell-division.	488
406970	pfam16688	CNV-Replicase_N	Replicase polyprotein N-term from Coronavirus nsp1. CNV-Replicase_N is the N-terminal domain of a family of ssRNA positive-stranded porcine transmissible gastroenteritis coronaviruses. The domain folds into a six-stranded beta-barrel fold with a long alpha helix on the rim of the barrel. This fold is shared with SARS-CoV nsp1.	108
374728	pfam16689	APC_N_CC	Coiled-coil N-terminus of APC, dimerization domain. APC_N_CC is the N-terminal, coiled-coil dimerization domain of the adenomatosis polyposis coli (APC) tumor-repressor proteins. It plays a key role in the regulation of cellular levels of the oncogene product beta-catenin. Coiled-coil regions are binding repeats that in this case bind to the armadillo repeat region of beta-catenin.	52
406971	pfam16690	MMACHC	Methylmalonic aciduria and homocystinuria type C family. 	216
374730	pfam16691	DUF5062	Domain of unknown function (DUF5062). This family is found in Vibrio spp. The function is not known.	83
406972	pfam16692	Folliculin_C	Folliculin C-terminal domain. This is the C-terminal domain of folliculin. It has guanine nucleotide exchange factor (GEF) activity.	224
293298	pfam16693	Yop-YscD_ppl	Inner membrane component of T3SS, periplasmic domain. Yop-YscD-ppl is the periplasmic domain of Yop proteins like YscD from Proteobacteria. YscD forms part of the inner membrane component of the bacterial type III secretion injectosome apparatus.	254
406973	pfam16694	Cytochrome_P460	Cytochrome P460. 	122
406974	pfam16695	Tai4	Type VI secretion system (T6SS), amidase immunity protein. Tai4 is a new form of autoimmunity protein for a type VI secretion system, T6SS. T6SS has roles in interspecies interactions, as well as higher order host-infection, by injecting effector proteins into the periplasmic compartment of the recipient cells of closely related species. Pseudomonas aeruginosa produces at least three effector proteins to other cells and thus has three specific cognate immunity proteins to protect itself. Tae4, or type VI amidase effector 4, in Enterobacter cloacae has a cognate Tai4 or type VI amidase immunity 4 protein. The effector is Tae4, pfam14113.	91
406975	pfam16696	ZFYVE21_C	Zinc finger FYVE domain-containing protein 21 C-terminus. This is the C-terminal domain of Zinc finger FYVE domain-containing protein 21. It has a PH-like fold and is required for the regulation of focal adhesions and in cell migration.	120
406976	pfam16697	Yop-YscD_cpl	Inner membrane component of T3SS, cytoplasmic domain. Yop-YscD-cpl is the cytoplasmic domain of Yop proteins like YscD from Proteobacteria. YscD forms part of the inner membrane component of the bacterial type III secretion injectosome apparatus.	92
406977	pfam16698	ADAM17_MPD	Membrane-proximal domain, switch, for ADAM17. ADAM17_MPD is the membrane-proximal domain of a family of disintegrin and metalloproteinase domain-containing protein 17 found in metazoan species. ADAM17 is a major sheddase that is responsible for the regulation of a wide range of biological processes, such as cellular differentiation, regeneration, and cancer progression. This MPD region acts as the sheddase switch. PDI or protein-disulfide isomerase interacts with ADAM17 to down-regulate its enzymatic activity. The interaction is directly with the MPD, the region of dimerization and substrate recognition, where it catalyzes an isomerisation of disulfide bridges within the thioredoxin motif CXXC. This isomerisation results in a major structural change between an active, open state and an inactive, closed state of the MPD. This change is thought to act as a molecular switch, allowing a global reorientation of the extracellular domains in ADAM17 and regulating its shedding activity.	62
406978	pfam16699	CSTF1_dimer	Cleavage stimulation factor subunit 1, dimerization domain. This family is the dimerization domain, at the N-terminal, of a family of cleavage stimulation factor subunit 1 proteins from eukaryotes. This domain allows for homodimerization such that the functional state of CSTF1 is a heterohexamer. The cleavage stimulation factor (CstF) complex is composed of three subunits and is essential for pre-mRNA 3'-end processing. CstF recognizes U and G/U-rich cis-acting RNA sequence elements and helps to stabilize the cleavage and polyadenylation specificity factor (CPSF) at the polyadenylation site as required for productive RNA cleavage.	57
318831	pfam16700	SNCAIP_SNCA_bd	Synphilin-1 alpha-Synuclein-binding domain. This coiled-coil domain found in Synphilin-1 is responsible for binding to alpha-Synuclein.	45
406979	pfam16701	Ad_Cy_reg	Adenylate cyclase regulatory domain. This domain regulates the activity of Actinobacterial adenylate cyclase in a pH-dependent manner, allowing activation at acidic pH.	185
406980	pfam16702	DUF5063	Domain of unknown function (DUF5063). 	164
406981	pfam16703	DUF5064	Domain of unknown function (DUF5064). This is found in Pseudomonas species. Several members are annotated as being acetyl-CoA carboxylase alpha subunit, but this could not be confirmed.	117
293309	pfam16704	Rab_bind	Rab binding domain. This coiled-coil domain, found in GRIP and coiled-coil domain-containing protein 2 and RANBP2-like and GRIP domain-containing protein, has been shown to bind to Rab in GRIP and coiled-coil domain-containing protein 2.	65
406982	pfam16705	NUDIX_5	NUDIX, or N-terminal NPxY motif-rich, region of KRIT. NUDIX_5 is found in higher eukaryotes at the N-terminus of KRIT1 or Krev interaction trapped proteins. NUDIX_5 carries three NPxY-like motifs, and it is found to bind the integrin cytoplasmic-associated protein 1 ICAP1. In the absence of KRIT1 ICAP1 binds via its C-terminal PH/PTB fold domain to the integrin beta-1 cytoplasmic tail. Binding of KRIT1 to ICAP1 via NUDIX_5 out-competes the binding of ICAP1 to integrin cytoplasmic tails such that ICAP1 is sequestered in the nucleus. Integrin activation is thus prevented.	169
406983	pfam16706	Izumo-Ig	Izumo-like Immunoglobulin domain. Izumo-Ig is the immunoglobulin domain on Izumo proteins from higher eukaryotes. Izumo is a typical type I membrane glycoprotein with one immunoglobulin-like domain and a putative N-glycoside link motif - glycosylation site. The full-length protein is a molecule with a single immunoglobulin (Ig) domain. It is thought that Izumo proteins bind to putative Izumo receptors on the oocyte. Izumo is not detectable on the surface of fresh sperm but becomes exposed only after an exocytotic process, the acrosome reaction, has occurred. Studies have shown that knock-out mice (Izumo-/- males) were sterile despite normal mating behaviour and ejaculation, indicating the importance of the protein in fertilisation. There is a conserved GCL sequence motif. Izumo expression has been found to be testis-specific.	86
406984	pfam16707	CagS	Cag pathogenicity island protein S of Helicobacter pylori. CagS is a family of proteins from the pathogenicity island of Helicobacter pylori. The gene lies just downstream of the cluster whose protein-products resemble those of the Vibrio proteins that form the structural core of T4SS. The exact function of CagS is not known.	196
406985	pfam16708	LppA	Lipoprotein confined to pathogenic Mycobacterium. This is a family of lipoproteins found only in pathogenic mycobacteria. These pathogenic lipoproteins may play a role in host-pathogen interactions. Lipoproteins localized to the cell-envelope of pathogenic bacteria are major determinants of virulence. The proteins are localized to the cell-surface via an N-terminal lipidation carried out by a transferase - pro-lipoprotein diacylglyceryl transferase Lgt - which attaches a diacylglyceride molecule to a sulfur atom from a crucial cysteine, and a consecutively acting lipoprotein signal peptidase LspA that cleaves the signal peptide just before the modified cysteine. When the peptidase is inactivated the pathogen has difficulty in replicating inside macrophages.	153
406986	pfam16709	SCAB-IgPH	Fused Ig-PH domain of plant-specific actin-binding protein. This family is a fused Ig and PH domain found on plant-specific actin-binding proteins or SCABs. SCAB proteins bind, bundle and stabilize actin filaments and regulate stomatal movement. The Ig-PH fusion domain is at the C-terminus. This domain has the N-terminal Ig beta-sandwich fold consisting of two antiparallel beta-sheets built from strands beta1 and beta2 and strands beta3-beta6, respectively. The C-terminus of the fused domains adopts the PH fold, of seven beta-strands, beta7-beta13 and two alpha-helices, alpha1 and alpha2 arranged into a beta-barrel. The Ig and PH domains appear to be truly fused together into an integral structure which displays a few conserved patches on the surface, particularly of the PH part. The canonical phosphoinositide-binding pocket of the classic PH domain is degenerate in this fused one, and the charge on the pocket suggest that the Ig-PH domain contains a non-canonical binding site for inositol phosphates. There are a handful of bacterial members at low threshold but they are missing the PH part of the fused domain, and appear to match little else.	98
293315	pfam16710	CTXphi_pIII-N1	N-terminal N1 domain of Vibrio phage CTXphi pIII. CTXphi_pIII-N1 is the N-terminal domain, N1, of the pIII protein of the CTXphi bacteriophage of Vibrio cholerae. CTXphi is a ssDNA Inovirus. pIII is a minor coat protein. This domain interacts directly with the C-terminus of TolA, a periplasmic protein of Vibrio cholerae itself as part of the infection mechanism.	111
374744	pfam16711	SCAB-ABD	Actin-binding domain of plant-specific actin-binding protein. SCAB-ABD is the actin-binding domain of plant-specific actin-binding proteins or SCABs. SCAB proteins bind, bundle and stabilize actin filaments and regulate stomatal movement. The Ig-PH fusion domain is at the C-terminus. The ABD is structurally independent from the first coiled-coil, CC1, domain which is also involved in binding; the CC1 is likely to function as a dimerization module.	41
406987	pfam16712	SCAB_CC	Coiled-coil regions of plant-specific actin-binding protein. SCAB_CC is the two coiled-coil, dimerization domains of plant-specific actin-binding proteins or SCABs, CC1 and CC2, both of which contribute independently to dimerization. CC1 is also required for actin binding, indicating that SCAB1 is a bivalent actin cross-linker, since CC1 adopts an antiparallel helical hairpin that further dimerizes into a four-helix bundle. SCAB proteins bind, bundle and stabilize actin filaments and regulate stomatal movement.	168
293318	pfam16713	EAGR_box	Enriched in aromatic and glycine Residues box. The Enriched in Aromatic and Glycine Residues (EAGR) box is found in proteins from Mycoplasma, often tandemly repeated, and may have a role in cell motility.	34
406988	pfam16714	TyrRSs_C	Tyrosyl-tRNA synthetase C-terminal domain. This domain is found at the C-terminus of fungal tyrosyl-tRNA synthetases. It binds to group I introns.	120
406989	pfam16715	CDPS	Cyclodipeptide synthase. This family of proteins includes enzymes involved in the synthesis of cyclodipeptides using aminoacyl-tRNAs as substrates, including cyclo(L-leucyl-L-leucyl) synthase, cyclo(L-tyrosyl-L-tyrosyl) synthase and cyclo(L-leucyl-L-phenylalanyl) synthase. They are structurally similar to class Ic aminoacyl-tRNA synthetases (aaRSs).	220
406990	pfam16716	BST2	Bone marrow stromal antigen 2. 	91
406991	pfam16717	RAC_head	Ribosome-associated complex head domain. The RAC head domain is involved in ribosome binding.	87
374750	pfam16718	IFS	Immunity factor for SPN. Immunity factor for SPN (IFS) binds to and inhibits the SPN toxin.	164
406992	pfam16719	SAWADEE	SAWADEE domain. The SAWADEE domain, found in plant homeobox proteins, has a pair of tandem tudor-like folds that bind chromatin.	126
374752	pfam16720	Albumin_I_a	Albumin I chain a. The albumin I protein, a hormone-like peptide, stimulates kinase activity upon binding a membrane bound 43 kDa receptor. This domain represents the a chain.	48
293326	pfam16721	zf-H3C2	Zinc-finger like, probable DNA-binding. This is a family of probably DNA-binding zinc-fingers found on Gag-Pol polyproteins from mouse retroviruses. Added to clan to resolve overlaps with zf-H2C2, but neither are true members.	96
406993	pfam16722	SAPIS-gp6	Pathogenicity island protein gp6 in Staphylococcus. SAPIS-gp6 is a family of proteins produced from the pathogenicity island SAPI1 in pathogenic Staphylococcus aureus. This is a mobile genetic element that carries genes for several superantigen toxins. SAPIS-gp6 is a dimeric protein produced from the pathogenicity island with a helix-loop-helix motif similar to that of bacteriophage scaffolding proteins. It is thought to determine the size of the capsids of distribution of the SAPI1 genome as it acts as an internal scaffolding protein during capsid size determination.	72
374754	pfam16723	DUF5065	Domain of unknown function (DUF5065). This family is found in Bacillus species. The function is not known.	156
406994	pfam16724	T4-gp15_tss	T4-like virus Myoviridae tail sheath stabilizer. T4-gp15_tss is the tail-sheath-stabilizer or tail-terminator protein of T4-like myoviridae phage. It forms a hexamer. It simultaneously forms the binding site for attachment of the capsid to the tail as gp15 binds to gp14 and gp13, the neck proteins, and completes the tail as it binds to the top of the tail via hexamer gp3 and the C-terminal domain of gp18 located in the last ring of the contractile tail sheath.	238
406995	pfam16725	Nucleolin_bd	Nucleolin binding domain. This domain adopts a three helix fold resembling part of a winged helix motif. It binds nucleolin.	70
406996	pfam16726	OCRL_clath_bd	Inositol polyphosphate 5-phosphatase clathrin binding domain. This domain is a clathrin binding domain found at the N-terminus of inositol polyphosphate 5-phosphatase OCRL. It has a PH domain-like fold.	101
406997	pfam16727	REV1_C	DNA repair protein REV1 C-terminal domain. This is the C-terminal domain of DNA repair protein REV1. It interacts with REV7, POLN, POLK and POLI.	91
293333	pfam16728	DUF5066	Domain of unknown function (DUF5066). 	213
406998	pfam16729	DUF5067	Domain of unknown function (DUF5067). 	125
406999	pfam16730	DnaGprimase_HBD	DnaG-primase C-terminal, helicase-binding domain. DnaG-primase_C is the C-terminal of a set of eubacterial DnaG primases that are a single-stranded DNA (ssDNA)-dependent RNA polymerase responsible for the synthesis of oligonucleotide primers needed for the replication of DNA. It interacts with helicase at the replication fork.	118
407000	pfam16731	GARP	Glutamic acid/alanine-rich protein of Trypanosoma. GARP, or glutamic acid/alanine-rich protein, is one of a subset of major surface molecules on Trypanosoma species. They are all surface-orientated, immunodominant, and highly charged. GARP is interesting as its expression coincides with the loss and gain of variant surface glycoprotein (VSG) molecules in the tsetse vector. It has an extended helical bundle structure that is homologous to the core surface structure of VSG, suggesting that it might replace the bloodstream VSG as the trypanosomes differentiate inside the tsetse vector after a blood-meal.	193
407001	pfam16732	ComP_DUS	Type IV minor pilin ComP, DNA uptake sequence receptor. ComP-DUS is the DNA-uptake sequence receptor of pathogenic Proteobacteria. ComP is one of three minor (low abundance) type IV pilins in pathogenic Neisseria species (with PilV and PilX). These modulate Tfp-mediated properties without affecting Tfp biogenesis. ComP plays a prominent role in competence at the level of DNA uptake. ComP is exposed on the surface of Neisseria filaments, and it is this that recognizes homotypic DNA through genus-specific DNA uptake sequence (DUS) motifs.	81
407002	pfam16733	NRho	Rhomboid N-terminal domain. This is the N-terminal domain of rhomboid protease.	69
407003	pfam16734	Pilin_GH	Type IV pilin-like G and H, putative. Pilin_GH is a family from Cyanobacteria. All the proteins are putatively annotated as being general secretion pathway proteins G and H, and are likely to be pilins of the type IV secretory pathway.	111
407004	pfam16735	MYO10_CC	Unconventional myosin-X coiled coil domain. This coiled coil domain is found in unconventional myosin-X and is responsible for dimerization.	52
407005	pfam16736	sCache_like	Single Cache-like. This entry represents the N-terminal Cache-like domain of the alkaline phosphatase synthesis sensor protein PhoR. It covers part of the PAS-like fold that shares a central five-stranded beta-sheet of identical topology to other PAS domains.	114
407006	pfam16737	PHF12_MRG_bd	PHD finger protein 12 MRG binding domain. This domain found in PHD finger protein 12 binds to the MRG domain of Mortality factor 4-like protein 1.	39
407007	pfam16738	CBM26	Starch-binding module 26. CBM26 is a carbohydrate-binding module that binds starch.	68
407008	pfam16739	CARD_2	Caspase recruitment domain. In the probable ATP-dependent RNA helicase DDX58 this CARD domain is found near the N-terminus and interacts with the C-terminal domain.	90
407009	pfam16740	SKA2	Spindle and kinetochore-associated protein 2. Spindle and kinetochore-associated protein 2 (SKA2) interacts with the N-termini of SKA1 and SKA3 and forms the Ska complex. This is a microtubule binding complex required for chromosome segregation.	110
407010	pfam16741	mRNA_decap_C	mRNA-decapping enzyme C-terminus. The C-terminal domain of mRNA-decapping enzyme in Metazoa is responsible for trimerisation.	43
407011	pfam16742	IL17R_D_N	N-terminus of interleukin 17 receptor D. IL17R_D_N is found in higher eukaryotes. The function of this N-terminal domain is not known.	122
407012	pfam16743	PliI	Periplasmic lysozyme inhibitor of I-type lysozyme. 	121
407013	pfam16744	Zf_RING	KIAA1045 RING finger. 	72
407014	pfam16745	RsgA_N	RsgA N-terminal domain. This domain is found at the N-terminus of RsgA domains. It has an OB fold.	54
407015	pfam16746	BAR_3	BAR domain of APPL family. BAR_12 is the BAR coiled-coil domain at the N-terminus of APPL or adaptor protein containing PH domain, PTB domain, and leucine zipper motif proteins in higher eukaryotes. This BAR domain contains four helices whereas the other classical BAR domains contain only three helices. The first three helices form an antiparallel coiled-coil, while the fourth helix is unique to APPL1. BAR domains take part in many varied biological processes such as fission of synaptic vesicles, endocytosis, regulation of the actin cytoskeleton, transcriptional repression, cell-cell fusion, apoptosis, secretory vesicle fusion, and tissue differentiation.	235
407016	pfam16747	Adhesin_E	Surface-adhesin protein E. Adhesin E plays a role in pathogenesis. It binds to host proteins including plasminogen, vitronectin and laminin.	125
407017	pfam16748	INSC_LBD	Inscuteable LGN-binding domain. This is the LGN-binding domain (LBD) of the inscuteable homolog protein. It interacts with the TPR motifs of G-protein-signaling modulator 2 (GPSM2) (LGN) and stabilizes LGN.	44
374774	pfam16749	Arteri_nsp7a	Arterivirus nonstructural protein 7 alpha. Nonstructural protein 7 alpha is likely to have a role in viral RNA synthesis.	128
407018	pfam16750	HK_sensor	Sensor domain of 2-component histidine kinase. HK_sensor is the sensor domain found at the N-terminus of the integral membrane two-component system sensor histidine kinase proteins in bacteria.	110
407019	pfam16751	RsdA_SigD_bd	Anti-sigma-D factor RsdA to sigma factor binding region. RsdA_SigD_bd is a domain at the N-terminus of anti-sigma-D factor RsdA proteins. It binds to the -35 promoter binding domain of sigma-D. The complex formed regulates the transcriptional expression of the bacterium.	46
407020	pfam16752	TBCC_N	Tubulin-specific chaperone C N-terminal domain. This N-terminal domain of tubulin-specific chaperone C has a spectrin-like fold and binds to tubulin.	115
407021	pfam16753	Tipalpha	TNF-alpha-Inducing protein of Helicobacter. Tipalpha is secreted from H. pylori as dimers and enters the gastric cells. It binds to DNA via the positively charged surface-patch formed between the two monomers of the crystal structure by the loop between helices alpha1 and alpha2. Each monomer consists of a helical domain and a mixed domain.	150
407022	pfam16754	Pesticin	Bacterial toxin homolog of phage lysozyme, C-term. This is the C-terminal activator domain of pesticin, a hydrolase enzyme secreted by Yersinia pestis and other Gammaproteobacteria to kill related bacteria occupying the same ecological niche. It is referred to as a bacteriocin and it leads to the hydrolysis of peptidoglycan. Its immunity protein is Pim. Pesticin carries an elongated N-terminal translocation domain, an intermediate receptor binding domain, and a C-terminal activity domain with structural analogy to lysozyme homologs. The full-length protein is toxic to bacteria when taken up to the target site via the outer or the inner membrane. The receptor domain is necessary for the close contact with the outer membrane; the N-terminal is a type of translocational, TonB box; the C-terminal domain is the death-delivering domain.	152
407023	pfam16755	NUP214	Nucleoporin or Nuclear pore complex subunit NUP214=Nup159. NUP214 is a family of nucleoporins or nuclear pore complex subunit 214 in vertebrates and 159 in yeast found in eukaryotes. It participates in allowing family 2 of DEAD-box ATPases Dbp5/DDX19 to localize to the nuclear pore complex where it takes part in mRNA export and re-modelling. NUP214 helps to regulate DEAD-box ATPase activity.	359
407024	pfam16756	PALB2_WD40	Partner and localizer of BRCA2 WD40 domain. This domain is found at the C-terminus of partner and localizer of BRCA2 (PALB2). It is a seven-bladed WD40-type beta-propeller. It binds to the N-terminus of BRCA2.	351
407025	pfam16757	Fucosidase_C	Alpha-L-fucosidase C-terminal domain. The C-terminal domain of Structure 1hl8 is constructed of eight anti-parallel-strands packed into two-sheets of five and three strands, respectively, forming a two-layer-sandwich containing a Greek key motif.	90
407026	pfam16758	UL141	Herpes-like virus membrane glycoprotein UL141. UL141 is a family of glycoproteins from herpesvirus species. At its N-terminus it carries an Ig-like beta-sandwich domain, which binds to the cysteine-rich region of TRAIL-R2, a family of tumor necrosis factor receptor proteins. UL141 is both necessary and sufficient to retain TRAIL receptors in the ER, thereby preventing their cell surface expression and it is also necessary and sufficient to inhibit cell surface expression of CD155.	191
407027	pfam16759	LIG3_BRCT	DNA ligase 3 BRCT domain. The BRCT domain of DNA ligase 3 (LIG3) binds to the C-terminal BRCT domain of the scaffolding protein X-ray repair cross-complementing protein 1 (XRCC1) and mediates homo- and heterodimerization.	78
407028	pfam16760	CBM53	Starch/carbohydrate-binding module (family 53). 	75
407029	pfam16761	Clr2_transil	Transcription-silencing protein, cryptic loci regulator Clr2. Clr2_transil is a domain carrying the first and second of three regions on Clr2 that are necessary for transcriptional silencing by the protein. Clr2 is a protein in the SHREC complex that is a crucial factor required for heterochromatin formation and it plays a major role in mating-type and rDNA silencing. The third region is family pfam10383.	68
407030	pfam16762	RHH_6	Ribbon-helix-helix domain. This ribbon-helix-helix domain binds to DNA and may be a part of a toxin-antitoxin system.	77
407031	pfam16763	Spidroin_N	Major ampullate spidroin 1, spider silk protein 1, N-term. Spidroin is produced by a number of arachnids. Spidroins are made up of repetitive segments flanked by conserved non-repetitive domains, and this domain is the conserved non-repetitive region. Aggregation to form the rigid silk occurs due to association at the repetitive regions, and the N-terminal domain is necessary to prevent premature aggregation during storage before extrusion. This N-terminal region inhibits precocious aggregation and then accelerates and directs self-assembly as the pH is lowered along the extrusion duct.	125
407032	pfam16764	Sharpin_PH	Sharpin PH domain. This PH domain is found at the N-terminus of sharpin and is involved in dimerization.	113
293370	pfam16765	Pim	Pesticin immunity protein. Pim is the immunity protein produced by Yersinia pestis and other Gammaproteobacteria to protect themselves against the bacteriostatic activity of the toxin pesticin, pfam16754.	98
407033	pfam16766	CID_GANP	Binding region of GANP to ENY2. CID is a domain on higher eukaryotic germinal-centre associated nuclear protein, or GANP, that binds to the transcription and mRNA export factor ENY2. The complex of these two proteins forms part of the TREX-2 complex that links transcription with nuclear messenger RNA export.	71
407034	pfam16767	KinB_sensor	Sensor domain of alginate biosynthesis sensor protein KinB. KinB_sensor is the N-terminal sensor domain of histidine kinase from Pseudomonas species. The domain is the extracellular sensing domain, and is a four-helical bundle.	120
407035	pfam16768	NupH_GANP	Nucleoporin homology of Germinal-centre associated nuclear protein. NupH_GANP is the nucleoporin-homology domain at the N-terminus of human GANP or germinal-centre associated nuclear proteins. GANP is part of the TREX-2 complex that links transcription with nuclear messenger RNA export, and it associates with the mRNP particle through the interaction of the NupH_GANP with NXF1, the export factor. This attachment mediates efficient delivery of mRNPs to nuclear pore complexes.	292
407036	pfam16769	MCM3AP_GANP	MCM3AP domain of GANP. MCM3AP_GANP is the C-terminal domain of germinal centre-associated proteins, GANPs in higher eukaryotes. GANP forms part of the TREX-2 complex which in higher eukaryotes requires the MCM3AP domain of GANP to facilitate its localization to the Nuclear pore complex and nuclear envelope. TREX-2 complex links transcription with nuclear messenger RNA export.	717
407037	pfam16770	RTT107_BRCT_5	Regulator of Ty1 transposition protein 107 BRCT domain. This is the fifth BRCT domain of regulator of Ty1 transposition protein 107 (RTT107). It is involved in binding phosphorylated histone H2A.	91
407038	pfam16771	RTT107_BRCT_6	Regulator of Ty1 transposition protein 107 BRCT domain. This is the sixth BRCT domain of regulator of Ty1 transposition protein 107 (RTT107). It is involved in binding phosphorylated histone H2A.	107
407039	pfam16772	TERF2_RBM	Telomeric repeat-binding factor 2 Rap1-binding motif. This domain, found in telomeric repeat-binding factor 2, binds to the C-terminus of repressor activator protein 1 (RAP1) (telomeric repeat-binding factor 2-interacting protein 1).	41
374792	pfam16773	Phage_SSB	Lactococcus phage single-stranded DNA binding protein. This single-stranded DNA binding protein is found in Lactococcus phage. It can stimulate RecA-mediated homologous recombination. Its structure is a variation of the typical oligonucleotide/oligosaccharide binding-fold of single-stranded DNA binding proteins.	117
407040	pfam16774	Baseplate	Baseplate protein. This protein is a structural component of the phage baseplate in Siphoviridae.	157
407041	pfam16775	ZoocinA_TRD	Target recognition domain of lytic exoenzyme. ZoocinA_TRD is a domain found downstream of various lytic enzymes, such as peptidase M23 and phage lysins. The domain is composed of strands of antiparallel beta sheet with one short alpha helix at the C-terminal end.	106
407042	pfam16776	INPP5B_PH	Type II inositol 1,4,5-trisphosphate 5-phosphatase PH domain. 	144
374794	pfam16777	RHH_7	Transcriptional regulator, RHH-like, CopG. RHH_7 is a ribbon-helix-helix protein family expressed by Helicobacter species. These proteins bind to specific DNA sequences with high affinity and usually act as repressors. Many are putatively named CopG.	74
407043	pfam16778	Phage_tail_APC	Phage tail assembly chaperone protein. Phage_tail_APC is a family of general phage tail assembly chaperone proteins from double-stranded DNA viruses with no RNA stage, many of which are unclassified.	60
407044	pfam16779	DMP12	DNA-mimic protein. This is a family of DNA-mimic proteins expressed by Neisseria species. In its monomeric form DMP12 interacts with the Neisseria dimeric form of the bacterial histone-like protein HU. HU proteins promote the assembly of higher-order DNA-protein structures. The interaction between DMP12 and HU protein may be instrumental in controlling the stability of the nucleoid in Neisseria as DMP12 prevents Neisseria HU protein from being digested by trypsin.	115
407045	pfam16780	AIMP2_LysRS_bd	AIMP2 lysyl-tRNA synthetase binding domain. This is the lysyl-tRNA synthetase binding domain of aminoacyl tRNA synthase complex-interacting multifunctional protein 2 (AIMP2).	47
407046	pfam16781	DUF5068	Domain of unknown function (DUF5068). This family is expressed by Firmicutes. The function is not known.	185
407047	pfam16782	SIL1	Nucleotide exchange factor SIL1. This family consists of fungal SIL1 nucleotide-exchange factor proteins. It interacts with Hsp70 (heat-shock protein of 70 kDa) Bip.	289
407048	pfam16783	FANCM-MHF_bd	FANCM to MHF binding domain. FANCM-MHF_bd is a structured region on Fanconi anaemia complementation group protein M that binds to a two-histone-fold-containing protein complex MHF. MHF binds double-strand DNA, stimulates the DNA-binding activity of FANCM, and contributes to the targeting of FANCM to chromatin.	115
293389	pfam16784	HNHc_6	Putative HNHc nuclease. This family is found in Gammaproteobacteria. It may be an HNH-like nuclease. The shorter matches are likely to be from phage proteins whereas the longer members are probably from the bacterial genomes.	203
407049	pfam16785	SMBP	Small metal-binding protein. This histidine-rich protein binds metal ions.	111
407050	pfam16786	RecA_dep_nuc	Recombination enhancement, RecA-dependent nuclease. REF is a family of P1-like phage RecA-dependent nucleases. It does not appear to act as a positive RecA regulator. It is a new kind of enzyme, a RecA-dependent nuclease.	102
407051	pfam16787	NDC10_II	Centromere DNA-binding protein complex CBF3 subunit, domain 2. NDC10_II is a the second of five domains on the Kluyveromyces lactis Ndc10 protein. Each subunit of the Ndc10 dimer binds a separate fragment of DNA, suggesting that Ndc10 stabilizes a DNA loop at the centromere.	313
407052	pfam16788	ATF7IP_BD	ATF-interacting protein binding domain. ATF7IP-BD is a short conserved region of activating transcription factor 7-interacting protein 1 found in higher eukaryotes. This domain appears to bind several key proteins such as TFIIE-alpha and TFIIE-beta as well as the transcriptional regulator Sp1, which are part of the transcriptional machinery.	215
374803	pfam16789	YscO-like	YscO-like protein. This family of proteins is similar to the type III secretion protein YscO. The family includes Chlamydia trachomatis CT670 which is found in a type III secretion gene cluster. CT670 interacts with CT671, a putative YscP homolog and CT670 and CT671 may form a chaperone-effector pair.	160
407053	pfam16790	Phage_clamp_A	Bacteriophage clamp loader A subunit. This is the A subunit of bacteriophage DNA clamp loader required for loading of sliding clamps onto chromosomal DNA. These clamps are involved in processivity of DNA replication.	144
407054	pfam16791	Connexin40_C	Connexin 40 C-terminal domain. This is the C-terminal domain of connexin 40. It interacts with the C-terminal and cytoplasmic loop domains of connexin 43 and with the cytoplasmic loop of connexin 40.	106
407055	pfam16792	Caudo_bapla16	Phage tail base-plate attachment protein of Caudovirales ORF16. Caudo_bapla16 is a family of ORF16 tail-phage P2-like proteins that forms part of the base-plate at the tip of the phage tail. The whole base-plate complex is involved in host recognition and attachment, and consists of several proteins derived from consecutive open-reading-frames. This central domain is expressed from ORF16 in the lactococcal P2-phage and forms a trimer.	372
374806	pfam16793	RepB_primase	RepB DNA-primase from phage plasmid. RepB_primase is a DNA-primase produced by P4-like phages. It is a zinc-independent primase unlike Pri-type primases. It takes up a dumbbell shape consisting of an N-terminal catalytic domain separated by a long alpha-helix plus tether and a C-terminal helical-bundle domain. Primases are necessary for phage replication. RepBprime primases such as in this family recognize both ssiA and ssiB, ie only 1 single-stranded primase initiation site on each strand, independently of each other and then synthesize primers that are elongated by DNA polymerase III. The phage is thus replicated exclusively in leading strand mode.	230
407056	pfam16794	fn3_4	Fibronectin-III type domain. 	101
407057	pfam16795	Phage_integr_3	Archaeal phage integrase. Catalyzes cleavage and ligation of DNA.	162
407058	pfam16796	Microtub_bd	Microtubule binding. This motor homology domain binds microtubules and lacks an ATP-binding site.	143
407059	pfam16797	Fungal_KA1	Fungal kinase associated-1 domain. This domain is found at the C-terminus of several fungal kinases.	115
374811	pfam16798	DUF5069	Domain of unknown function (DUF5069). 	134
407060	pfam16799	VGPC1_C	C-terminal membrane-localization domain of ion-channel, VCN1. VCN1_C is the short C-terminal region of voltage-gated proton channel 1 proteins in higher eukaryotes. The domain is necessary for achieving the dimeric architecture, in which two monomers form a dimer via parallel alpha-helical coiled-coil interaction, but it is also essential for localising the protein to an intracellular membrane.	48
374812	pfam16800	Endopep_inhib	IseA DL-endopeptidase inhibitor. This domain functions as a DL-endopeptidase inhibitor.	150
407061	pfam16801	MSL1_dimer	dimerization domain of Male-specific-Lethal 1. MSL1_dimer is the short coiled dimerization domain of higher eukaryotic MSL1, part of the MSL or Male-Specific Lethal complex. This complex regulates the dosage compensation of the male X chromosome in Drosophila and other eukaryotes. The structure of the MSL1/MSL2 core shows that two MSL2 subunits bind to a dimer formed by two molecules of MSL1. MSL1 is a substrate for MSL2 E3 ubiquitin ligase activity.	37
407062	pfam16802	DUF5070	Domain of unknown function (DUF5070). 	154
407063	pfam16803	DRE2_N	Fe-S cluster assembly protein DRE2 N-terminus. This is the N-terminal domain of the fungal Fe-S cluster assembly protein DRE2.	129
407064	pfam16804	DUF5071	Domain of unknown function (DUF5071). 	119
374815	pfam16805	Trans_coact	Phage late-transcription coactivator. This family of proteins is found in Caudovirales. It is a late-transcription coactivator which interacts with the host RNA polymerase forming a part of the initiation complex.	69
293411	pfam16806	ExsD	Antiactivator protein ExsD. The antiactivator protein ExsD represses the transcriptional activator ExsA. ExsA activates expression of type III secretion system genes. Repression of ExsA by ExsD is relieved by the secretion chaperone ExsC.	237
407065	pfam16807	DUF5072	Domain of unknown function (DUF5072). 	112
407066	pfam16808	PKcGMP_CC	Coiled-coil N-terminus of cGMP-dependent protein kinase. PKcGMP_CC is the N-terminal coiled-coil, dimerization, domain of cGMP-protein kinases.	35
293414	pfam16809	NleF_casp_inhib	NleF caspase inhibitor. Binds to and inhibits caspase-9, caspase-8 and caspase-4, thereby preventing caspase-induced apoptosis in the host cell.	145
374818	pfam16810	RXLR	RXLR phytopathogen effector protein, Avirulence activity. RXLR is a family of phytopathogen avirulence or effector proteins. RXLR proteins are defined by a secretion signal peptide - not in this family - followed by a conserved N-terminal domain with the sequence motif RXLR (Arg-Xaa-Leu-Arg) consensus sequence. The RXLR part is required for translocation inside plant cells, although it appears to be dispensable for the biochemical activity of the effectors when expressed directly inside host cells. The effector activity resides in the C-terminal part of the family, which activates effector-triggered immunity in plants that carry a corresponding resistance (R) protein. The C-terminal region exhibits a fold that appears to be able to evolve to outwit the host as the latter tries to acquire new immunity.	138
374819	pfam16811	TAtT	TRAP transporter T-component. TAtT is a family of one component, the T-component, of a sub-set of TRAP-Ts or Tripartite ATP-independent periplasmic transporters. TRAP-Ts are bacterial transport systems implicated in the import of small molecules into the cytoplasm in bacteria. They are all periplasmic lipoproteins. TatT consists of a 13-alpha-helical fold containing cryptic tetratricopeptide repeat motifs (cTPRs) and encompassing a pore, ie is a water-soluble trimer whose protomers are each perforated by a pore. It forms a complex with a P component, and a putative ligand-binding cleft of TatPT aligns with the pore of TatT. Family TatPT is represented by some members of pfam03480.	263
318916	pfam16812	AdHead_fibreRBD	C-terminal head domain of the fowl adenovirus type 1 long fibre. AdHead_fibreRBD is a C-terminal part of the head domain of the dsDNA viruses, no RNA stage, Adenovirus. This is a globular head domain with an anti-parallel beta-sandwich fold formed by two four-stranded beta-sheets with the same overall topology as human adenovirus fibre heads. This C-terminal domain is the receptor-binding domain of the avian adenovirus long fibre.	207
407067	pfam16813	Cas_St_Csn2	CRISPR-associated protein Csn2 subfamily St. Cas_St_Csn2 is a family of Csn2 CRISPR-associated (Cas) proteins found in Firmicutes, largely Streptococcus and Enterococcus. CRISPR-associated (Cas) proteins are the main executioners of the process whereby prokaryotes acquire immunity against foreign genetic material. Cas allow short segments of this DNA, called spacer, to become incorporated into chromosomal loci as clustered regularly interspaced short palindromic repeats or CRISPRs; the resulting encoded RNAs are then processed into small fragments that guide the silencing of the invading genetic elements. Thus Cas are involved in the acquisition of new spacers. This family of St_Csn2 is longer than the canonical Csn2, pfam09711 through the addition of a large C-terminal domain. The central domain present in both families appears to be a channel that selectively interacts with dsDNA.	325
293419	pfam16814	Read-through	Read-through domain. The Enterobacteria phage minor coat protein A1 is a C-terminally extended version of the coat protein formed when ribosomes read-through a leaky stop codon. This is the C-terminal read-through domain of A1.	182
407068	pfam16815	HRI1	Protein HRI1. This fungal protein interacts with Sec72 and Hrr25; its function is not yet known.	229
407069	pfam16816	DotD	DotD protein. The DotD protein is a component of the Dot/Icm type IVB secretion system. It is involved in the outer membrane targeting of DotH.	120
407070	pfam16817	DUF5073	Domain of unknown function (DUF5073). This domain of unknown function is a membrane protein found in Mycobacterium.	121
407071	pfam16818	SLM4	Protein SLM4. The fungal protein SLM4 (EGO3, GSE1) is a component of the GSE complex and the EGO (TOR) complex. The GSE complex is required for trafficking GAP1 out of the endosome. The EGO complex is involved in the regulation of autophagy and cell growth. SLM4 is required for the integrity and function of the EGO complex.	157
407072	pfam16819	DUF5074	Domain of unknown function (DUF5074). This family of proteins from Bacteroidetes is found with a PKD domain at the N-terminus. Several members are annotated as putative quinoprotein alcohol dehydrogenase-like proteins but this could not be confirmed.	112
407073	pfam16820	PKD_3	PKD-like domain. This PKD-like family is found in various Bacteroidetes species.	68
374824	pfam16821	C_Hendra	C protein from hendra and measles viruses. This is a family of C proteins from a number of Morbillivirus species.	153
407074	pfam16822	ALGX	SGNH hydrolase-like domain, acetyltransferase AlgX. ALGX is a family found in bacteria. The domain demonstrates catalytic activity similar to that of the SGNH hydrolase-like domain, with the typical Ser-His-Asp triad found in this enzyme. Alginate is an exopolysaccharide that contributes to biofilm formation. ALGX is secreted into the biofilm and is responsible for the acetylation of biofilm polymers that help protect them from host destruction.	266
374826	pfam16823	PilZ_2	Atypical PilZ domain, cyclic di-GMP receptor. PilZ_2 is a family of cyclic di-GMP receptors found in Proteobacteria plant pathogens. PilZ_2 forms a tetramer that adopts a novel 'house-like' construct, with a central pillar domain of the four vertical alpha3 helices, a roof-top domain made up of the eight inclined alpha2 and alpha4 helices, and four corner-stone domains making up the PilZ domain. Cyclic-di-GMP is a universal secondary messenger molecule extensively involved in regulating bacterial pathogenicity, and its downstream receptor appears to be this PilZ domain.	136
407075	pfam16824	CBM_26	C-terminal carbohydrate-binding module. CBM_26 is a family of bacterial carbohydrate-binding modules frequently found at the C-terminus of enzymes. The combination is not unusual as the CBMs function to bring the relevant polysaccharide into close proximity to the active site.	125
407076	pfam16825	DUF5075	IGP family C-type lectin domain. This C-type lectin domain is present in the IGP 'invariant glycoprotein' family of proteins from Trypanosoma and Leishmania.	173
407077	pfam16826	DUF5076	Domain of unknown function (DUF5076). 	84
374829	pfam16827	zf-HC3	zinc-finger. This is a family of putative zinc-fingers from Actinobacteriales.	67
407078	pfam16828	GAGBD	GAG-binding domain on surface antigen. GAGBD is a domain on the surface antigen of the swine pathogen Streptococcus suis and related species. This domain expresses three clusters of basic residues, largely lysines, that are critical for heparin-binding and cell adhesion during bacterium-host cell adhesion. The GAGBD domain binds to the host cell surface glycosaminoglycans or GAGs of the Streptococcus.	152
293434	pfam16829	ATR13	Avirulence protein ATR13, RxLR effector. ATR13 is expressed by the plant pathogen oomycete Hyaloperonospora. Such phytopathogenic oomycetes like the one that infects Arabidopsis, Hyaloperonospora arabidopsidis (Hpa), grow intercellularly, forming parasitic structures called haustoria. Haustoria play a role in feeding and suppression of host defense systems. A whole range of pathogen proteins, called effectors, are secreted across this haustorial membrane, a subset of which are further translocated across the plant plasma membrane by an unknown mechanism that is present in both plants and animals. ATR13 is an RxLR effector from the downy mildew oomycete, and is a very dynamic protein. It contains two surface-exposed patches of polymorphism, one of which is involved in the specific recognition by host R-genes. The R-gene-products detect the presence of the infection by recognising the effector proteins. Once detected, the host R-genes trigger apoptosis of the host cell. The R-gene-products carry a specific motif, RxLR, that recognizes the effector proteins.	101
407079	pfam16830	NBD94	Nucleotide-Binding Domain 94 of RH. NBD94 is a domain on one of the reticulocyte binding protein homolog family or RH proteins expressed by the malaria parasite merozoite. RH proteins recognize erythrocytes and are important in virulence. This domain has been shown to exhibit selective binding to ATP and ADP. Binding of ATP or ADP induces nucleotide-dependent structural changes in the C-terminal hinge-region of NBD94 that directly impact on the ability of the RH to bind to the red blood cells.	91
318930	pfam16831	CssAB	CS6 fimbrial subunits A and B, Coli surface antigen 6. CssAB is a family of CS6 pilins from E.coli, including both subunits A and B. It acts as a colonisation factor for the enterotoxigenic species of E.coli to mediate bacterial attachment to the small intestinal epithelium. Both subunits in the fibre bind to receptors on epithelial cells, and CssB, but not CssA, specifically recognizes the extracellular matrix protein fibronectin.	129
318931	pfam16832	EKLF_TAD1	Erythroid krueppel-like transcription factor, transactivation 1. This family is the first part of the minimal transactivation domain of erythroid-specific transcription factor EKLF in craniates. EKLF plays an important role in red blood cell development; it is post-translationally modified by ubiquitin on several lysine residues, and its turnover in the cell is regulated by ubiquitin-mediated degradation. In the first 90 residues at the N-terminus EKLF carries a minimal transactivation or TAD domain that is highly acidic. This minimal TAD of EKLF can be further subdivided into two independent domains EKLF_TAD1 (residues 1-40) and EKLF_TAD2 (residues 51-90), pfam16833, that are both capable of independently activating transcription. TAD1 is able to form a non-covalent interaction with ubiquitin. Both TAD1 and TAD2 are highly acidic and carry a PEST (sequence rich in proline, glutamic acid, serine, and threonine) region. Deletion of either PEST domain significantly slows down degradation of EKLF by ubiquitin. The minimal TAD has an overlapping activation/degradation function that is critical for the role of EKLF in red blood cell development.	27
407080	pfam16833	EKLF_TAD2	Erythroid krueppel-like transcription factor, transactivation 2. This family is the second part of the minimal transactivation domain of erythroid-specific transcription factor EKLF in craniates. EKLF plays an important role in red blood cell development; it is post-translationally modified by ubiquitin on several lysine residues, and its turnover in the cell is regulated by ubiquitin-mediated degradation. In the first 90 residues at the N-terminus EKLF carries a minimal transactivation or TAD domain that is highly acidic. This minimal TAD of EKLF can be further subdivided into two independent domains EKLF_TAD1 (residues 1-40), pfam16832, and EKLF_TAD2 (residues 51-90) that are both capable of independently activating transcription. Both TAD1 and TAD2 are highly acidic and carry a PEST (sequence rich in proline, glutamic acid, serine, and threonine) region. Deletion of either PEST domain significantly slows down degradation of EKLF by ubiquitin. The minimal TAD has an overlapping activation/degradation function that is critical for the role of EKLF in red blood cell development.	27
407081	pfam16834	CSM2	Shu complex component Csm2, DNA-binding. CSM2 is one of the components of the yeast Shu complex that maintains genomic stability during replication. CSM2 complexes first with Psy3, and their L2 loops confer the DNA-binding activity to the Shu complex. The Shu complex binds to recombination sites and is required for Rad51 assembly and function during meiosis. The heterodimer of Psy3-Csm2 stabilizes the Rad51-single-stranded DNA complex independently of nucleotide cofactor because Psy3-Csm2 is a structural mimic of the Rad51-dimer.	203
407082	pfam16835	SF3A2	Pre-mRNA-splicing factor SF3a complex subunit 2 (Prp11). SF3A2 is one of the components of the SF3a splicing factor complex of the mature U2 snRNP (small nuclear ribonucleoprotein particle). In yeast, SF3a shows a bifurcated assembly structure of three subunits, Prp9 (subunit 3), Prp11 (subunit 2) and Prp21 (subunit 1), with Prp21 wrapping around Prp11.	92
407083	pfam16836	PSY3	Shu complex component Psy3, DNA-binding description. PSY3 is one of the components of the yeast Shu complex that maintains genomic stability during replication. Psy3 complexes first with Csm2, and their L2 loops confer the DNA-binding activity to the Shu complex. The Shu complex binds to recombination sites and is required for Rad51 assembly and function during meiosis. The heterodimer of Psy3-Csm2 stabilizes the Rad51-single-stranded DNA complex independently of nucleotide cofactor because Psy3-Csm2 is a structural mimic of the Rad51-dimer.	216
407084	pfam16837	SF3A3	Pre-mRNA-splicing factor SF3A3, of SF3a complex, Prp9. SF3A3 is one of the components of the SF3a splicing factor complex of the mature U2 snRNP (small nuclear ribonucleoprotein particle). In yeast, SF3a shows a bifurcated assembly structure of three subunits, Prp9 (subunit 3), Prp11 (subunit 2) and Prp21 (subunit 1). Prp9 and Prp21 were not thought to interact with each other but the alpha1 helix of Prp9 does make important contacts with the SURP2 domain of Prp21, thus the two do interact via a bidentate-binding mode. Prp9 harbours a major binding site for stem-loop IIa of U2 snRNA.	77
407085	pfam16838	Caud_tail_N	Caudoviral major tail protein N-terminus. This is the N-terminal domain of the major tail protein, or knob protein, from Caudovirales.	120
407086	pfam16839	Antimicrobial25	Nematode antimicrobial peptide. This family of antimicrobial peptides is found in nematodes.	54
407087	pfam16840	ACTL7A_N	Actin-like protein 7A N-terminus. The N-terminus of actin-like protein 7A is required for interaction with testin (TES).	65
407088	pfam16841	CBM60	Ca-dependent carbohydrate-binding module xylan-binding. CBM60 is a family of xylan-binding modules found in conjunction with xylanase enzymes in many bacterial species that attack plant cell walls. Xylan is the major hemicellulose component of most plant cell walls, and is one of the most complex carbohydrates targeted by CBMs. CBM60 modules are evolutionarily related to CBM36 domains as both show circular permutation in the beta-barrel folds. CBM60 targets xylan but is also able to bind cellulose and galactan and thus contribute towards breakdown of the plant cell wall. Recognition of the ligand is conferred primarily through the polar interactions of O2 (oxygen) and O3 of a single sugar with a protein-bound calcium ion.	93
407089	pfam16842	RRM_occluded	Occluded RNA-recognition motif. This family is an unusual, usually C-terminal, RNA-recognition motif found in fungi. In yeast it is the fourth RRM domain on the essential splicing factor Prp24. Structurally, it has a non-canonical RRM fold in which the expected beta-alpha-beta-beta-alpha-beta RRM-fold is flanked by N- and C-terminal alpha-helices. These two additional flanking alpha-helices occlude the beta-sheet face. The electropositive surface thereby presented is an alternative RNA-binding surface that allows both binding and unwinding of the U6 small nuclear RNA's internal stem loop, at least in vitro.	79
407090	pfam16843	Get5_bdg	Binding domain to Get4 on Get5, Golgi to ER traffic protein. Get5_bdg is the binding domain at the N-terminus of Get5, or Golgi to ER traffic protein 5, in yeast, that binds to Get4. Together with Get3, this tripartite complex is involved in the insertion of tail-anchored proteins in the ER membrane.	53
407091	pfam16844	DIMCO_N	Dinitrogenase iron-molybdenum cofactor, N-terminal. DIMCO_N is the N-terminal domain of the gamma (Y) subunit of nitrogenase. An alternative name is NafY_N, for nitrogenase accessory factor Y N-terminal. This region is negatively charged and appears to be necessary for recognising and interacting with the apo state of dinitrogenase. The full-length NafY protein facilitates the transfer of iron-molybdenum cofactor, or FeMo-co, into apodinitrogenase by binding to both. The C-terminal region, family Nitro_FeMo-Co, pfam02579, is the part that binds to the cofactor, and the N-terminus binds to apodinitrogenase. Nitrogenase is the bacterial enzyme responsible for nitrogen fixation by catalyzing the reduction of nitrogen gas (N2) to ammonium in an ATP-dependent manner. It has two components, dinitrogenase and dinitrogenase reductase.	91
407092	pfam16845	SQAPI	Aspartic acid proteinase inhibitor. SQAPI, aspartic acid inhibitor first isolated from squash, inhibits a wide range of aspartic proteinases. This particular family of PAAPIs (proteinaceous aspartic acid inhibitors) seems to have evolved quite recently from an ancestral cystatin. Structurally it consists of a four-stranded anti-parallel beta-sheet gripping an alpha-helix in much the same manner that a hand grips a tennis racket. The unstructured N-terminus and the loop connecting beta-strands 1 and 2 are important for pepsin inhibition, but the loop connecting strands 3 and 4 is not.	83
407093	pfam16846	Cep3	Centromere DNA-binding protein complex CBF3 subunit B. Cep3 is one of the major components of the CBF3. It dimerizes and in so doing forms a large central channel that is large enough to accommodate duplex B-form DNA. The dimerization region is followed by a linker to the zinc-finger domain at the C-terminus. The CBF3 complex is an essential core component of the budding yeast kinetochore and is required for the centromeric localization of all other kinetochore proteins. Cep3 is the only component with DNA-binding properties.	507
293452	pfam16847	AvrPtoB_bdg	Avirulence AvrPtoB, BAK1-binding domain. AvrPtoB_bdg is a binding region on a family of bacterial plant pathogenic proteins. Type III effector proteins are injected into plants by bacteria when they are under attack, eg Pseudomonas syringae when attacking tomato. AvrPtoB is one such effector that suppresses the plants' PAMP-triggered innate immunity. PAMPs are pathogen/microbe-associated molecular patterns that are detected as non-self by a host. AvrPtoB suppresses this response by binding to BAK1, a kinase that acts with several pattern recognition receptors to activate defense signalling. AvrPtoB_bdg is the region of AvrPtoB that binds to BAK1 thereby preventing its kinase activity after the perception of flagellin.	91
407094	pfam16848	SoDot-IcmSS	Substrate of the Dot/Icm secretion system, putative. This is a family of putative substrates of the Dot/Icm type IVA secretion system from Legionella species.	177
293454	pfam16849	Glyco_transf_88	Glycosyltransferase family 88. This is a family of type A glycosyltransferases found in Legionella. It acts as a virulence factor by the glucosylation of EF1A (elongation factor 1A) thereby blocking protein synthesis in the host cell.	423
374843	pfam16850	Inhibitor_I66	Peptidase inhibitor I66. This family of serine protease inhibitors has a beta-trefoil fold and inhibits trypsin and chymotrypsin.	146
407095	pfam16851	Stomagen	Stomagen. Stomagen (epidermal patterning factor-like protein 9) acts as a positive regulator of stomatal development.	50
293457	pfam16852	HHV-1_VABD	Herpes viral adaptor-to-host cellular mRNA binding domain. HHV-1_VABD is the short region of the Herpes simplex 1 virus' specific signature adaptor protein that binds to the cellular mRNA export factor such as mouse REF.	42
407096	pfam16853	CDC13_N	Cell division control protein 13 N-terminus. This domain is found at the N-terminus of fungal cell division control protein 13 (CDC13). It has an OB type fold. It is involved in dimerization of CDC13 and in interaction of CDC13 with the catalytic subunit of DNA polymerase alpha, Pol1.	208
407097	pfam16854	VPS53_C	Vacuolar protein sorting-associated protein 53 C-terminus. This is the C-terminal domain of fungal vacuolar protein sorting-associated protein 53.	203
407098	pfam16855	Soc	Small outer capsid protein. This protein attaches to and stabilizes the bacteriophage capsid.	74
407099	pfam16856	CDC4_D	Cell division control protein 4 dimerization domain. This is the dimerization domain (D domain) of fungal cell division control protein 4.	51
374848	pfam16857	RNA_pol_inhib	RNA polymerase inhibitor. This bacteriophage protein inhibits the bacterial host RNA polymerase by interacting with the RpoC subunit and inhibiting the formation of a promoter complex.	47
407100	pfam16858	CNDH2_C	Condensin II complex subunit CAP-H2 or CNDH2, C-term. CNDH2_C is the C-terminal domain of the H2 subunit of the condensin II complex, found in eukaryotes but not fungi. Eukaryotes carry at least two condensin complexes, I and II, each made up of five subunits. The functions of the two complexes are collaborative but non-overlapping. CI appears to be functional in G2 phase in the cytoplasm beginning the process of chromosomal lateral compaction while the CII are concentrated in the nucleus, possibly to counteract the activity of cohesion at this stage. In prophase, CII contributes to axial shortening of chromatids while CI continues to bring about lateral chromatid compaction, during which time the sister chromatids are joined centrally by cohesins. There appears to be just one condensin complex in fungi. CI and CII each contain SMC2 and SMC4 (structural maintenance of chromosomes) subunits, then CI has non-SMC CAP-D2 (CND1), CAP-G (CND3), and CAP-H (CND2). CII has, in addition to the two SMCs, CAP-D3, CAPG2 and CAP-H2. All four of the CAP-D and CAP-G subunits have degenerate HEAT repeats, whereas the CAP-H are kleisins or SMC-interacting proteins (ie they bind directly to the SMC subunits in the complex). The SMC molecules are each long with a small hinge-like knob at the free end of a longish strand, articulating with each other at the hinge. Each strand ends in a knob-like head that binds to one or other end of the CAP-H subunit. The HEAT-repeat containing D and G subunits bind side-by-side between the ends of the H subunit. Activity of the various parts of the complex seem to be triggered by extensive phosphorylations, eg, entry of the complex, in Sch.pombe, into the nucleus during mitosis is promoted by Cdk1 phosphorylation of SMC4/Cut3; and it has been shown that Cdk1 phosphorylates CAP-D3 at Thr1415 in He-La cells thus promoting early stage chromosomal condensation by CII.	284
407101	pfam16859	TetR_C_11	Bacterial transcriptional repressor C-terminal. This family of bacterial transcriptional repressors is characterized by the short approximately 50 amino acid stretch of residues constituting the helix-turn-helix DNA binding motif, around the YRFhY motif. The target proteins that are repressed are involved in the transcriptional control of multi-drug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity. Another target protein is BetI, an osmoprotectant which controls the choline-glycine betaine pathway in E.coli.	113
407102	pfam16860	CX9C	CHCH-CHCH-like Cx9C, IMS import disulfide relay-system,. CX9C is the first half of a twin Cx9C motif in eukaryotic proteins. The function of this motif is to import nuclear-encoded mitochondrial intermembrane-space-proteins into the IMS (intermembrane space), as these latter lack a mitochondrial targeting sequence. The Cx9C proteins have a disulfide-bonded alpha-hairpin conformation. Cx9C-containing proteins are thus putative substrates for the Mia40-dependent thiol-disulfide exchange mechanism that carries out an oxidative folding process resulting in the proteins being trapped in the IMS.	42
407103	pfam16861	Carbam_trans_C	Carbamoyltransferase C-terminus. This domain is found in NodU from Rhizobium, CmcH from Nocardia lactamdurans and the bifunctional carbamoyltransferase TobZ from Streptoalloteichus tenebrarius. NodU, a Rhizobium nodulation protein involved in the synthesis of nodulation factors, has 6-O-carbamoyltransferase-like activity. CmcH is involved in cephamycin (antibiotic) biosynthesis and has 3-hydroxymethylcephem carbamoyltransferase activity, EC:2.1.3.7 catalyzing the reaction: Carbamoyl phosphate + 3-hydroxymethylceph-3-EM-4-carboxylate <=> phosphate + 3-carbamoyloxymethylcephem. TobZ functions as an ATP carbamoyltransferase and tobramycin carbamoyltransferase. These proteins contain two domains; this is the smaller, C-terminal, domain.	169
407104	pfam16862	Glyco_hydro_79C	Glycosyl hydrolase family 79 C-terminal beta domain. This domain is found at the C-terminus of glycosyl hydrolase family 79 proteins. Its function is not yet known.	103
407105	pfam16863	NtCtMGAM_N	N-terminal barrel of NtMGAM and CtMGAM, maltase-glucoamylase. NtCtMGAM_N is a beta-barrel-like structure just N-terminal to the catalytic domain of maltase-glucoamylase in eukaryotes. It contributes to the architecture of the substrate-binding site, by donating a loop that comes into close contact with two regions in the catalytic domain thereby creating the site. This family is frequently found at the N-terminus of Glycosyl hydrolase 31, pfam01055, to which it contributes as above.	111
407106	pfam16864	dimerization2	dimerization domain. This domain, found in methyltransferases, functions as a dimerization domain.	87
407107	pfam16865	GST_C_5	Glutathione S-transferase, C-terminal domain. Leishmania major and Trypanosoma cruzi glutathione-S-transferase (GST) has undergone gene duplication, diversification, and gene fusion leading to a four-domain enzyme which contains two repeats of a GST N-terminal domain followed by a GST C-terminal domain.	108
407108	pfam16866	PHD_4	PHD-finger. 	64
407109	pfam16867	DMSP_lyase	Dimethylsulfoniopropionate lyase. Breaks down dimethylsulfoniopropionate (DMSP) into acrylate and dimethyl sulfide.	163
407110	pfam16868	NMT1_3	NMT1-like family. 	289
407111	pfam16869	CNDH2_M	PF16858. CNDH2_M is the middle domain of the H2 subunit of the condensin II complex, found in eukaryotes but not fungi. Eukaryotes carry at least two condensin complexes, I and II, each made up of five subunits. The functions of the two complexes are collaborative but non-overlapping. CI appears to be functional in G2 phase in the cytoplasm beginning the process of chromosomal lateral compaction while the CII are concentrated in the nucleus, possibly to counteract the activity of cohesion at this stage. In prophase, CII contributes to axial shortening of chromatids while CI continues to bring about lateral chromatid compaction, during which time the sister chromatids are joined centrally by cohesins. There appears to be just one condensin complex in fungi. CI and CII each contain SMC2 and SMC4 (structural maintenance of chromosomes) subunits, then CI has non-SMC CAP-D2 (CND1), CAP-G (CND3), and CAP-H (CND2). CII has, in addition to the two SMCs, CAP-D3, CAPG2 and CAP-H2. All four of the CAP-D and CAP-G subunits have degenerate HEAT repeats, whereas the CAP-H are kleisins or SMC-interacting proteins (ie they bind directly to the SMC subunits in the complex). The SMC molecules are each long with a small hinge-like knob at the free end of a longish strand, articulating with each other at the hinge. Each strand ends in a knob-like head that binds to one or other end of the CAP-H subunit. The HEAT-repeat containing D and G subunits bind side-by-side between the ends of the H subunit. Activity of the various parts of the complex seem to be triggered by extensive phosphorylations, eg, entry of the complex, in Sch.pombe, into the nucleus during mitosis is promoted by Cdk1 phosphorylation of SMC4/Cut3; and it has been shown that Cdk1 phosphorylates CAP-D3 at Thr1415 in He-La cells thus promoting early stage chromosomal condensation by CII. This region represents the disordered section of CNDH2 between the N- and the C-termini.	127
407112	pfam16870	OxoGdeHyase_C	2-oxoglutarate dehydrogenase C-terminal. OxoGdeHyase_C is a family found immediately C-terminal to Transket_pyr, pfam02779. It is found at the C-terminus of 2-oxoglutarate dehydrogenase.	151
407113	pfam16871	DUF5077	Domain of unknown function (DUF5077). This family is found at the N-terminal of DUF3472, pfam00958.	189
407114	pfam16872	putAbiC	Putative phage abortive infection protein. Several members are annotated as putative phage abortive infection proteins.	80
374859	pfam16873	AbiGii_2	Putative abortive phage resistance protein AbiGii toxin. AbiGii is a family of putative type IV toxin-antitoxin system toxins. The AbiG abortive phage resistance protein affects lactococcal bacteriophages phiP335 and phiQ30 but not the other P335 phage species. AbiGii toxin appears to confer resistance to phages by a mechanism of abortive infection that acts by interfering with phage RNA synthesis. The cognate anti-toxin is found in pfam10899.	397
407115	pfam16874	Glyco_hydro_36C	Glycosyl hydrolase family 36 C-terminal domain. This domain is found at the C-terminus of many family 36 glycoside hydrolases. It has a beta-sandwich structure with a Greek key motif.	78
407116	pfam16875	Glyco_hydro_36N	Glycosyl hydrolase family 36 N-terminal domain. This domain is found at the N-terminus of many family 36 glycoside hydrolases. It has a beta-supersandwich fold.	256
407117	pfam16876	Lipin_mid	Lipin/Ned1/Smp2 multi-domain protein middle domain. This is a middle domain of lipins. Overall the enzyme acts as a magnesium-dependent phosphatidate phosphatase enzyme that catalyzes the conversion of phosphatidic acid to diacylglycerol during triglyceride, phosphatidylcholine and phosphatidylethanolamine biosynthesis. EC:5.2.1.8.	95
407118	pfam16877	DUF5078	Domain of unknown function (DUF5078). This family of unknown function is found in Mycobacterium spp.	119
407119	pfam16878	SIX1_SD	Transcriptional regulator, SIX1, N-terminal SD domain. SIX1_SD is a family of eukaryotic proteins, and it is found N-terminal to the Homeobox domain. As a transcription factor it lacks intrinsic activation domains and thus needs to bind to the EYA family of co-factors in order to mediate transcriptional activation. It is the SD domain that is necessary for this protein-protein interaction, binding to the C-terminal region of EYA - Eyes absent homolog proteins.	110
407120	pfam16879	Sin3a_C	C-terminal domain of Sin3a protein. Sin3a_C is a family of eukaryotic species. It is found at the C-terminus of the co-repressor Sin3a, and downstream of family Sin3_corepress, pfam08295.	281
407121	pfam16880	EHD_N	N-terminal EH-domain containing protein. EHD_N is a short domain that lies at the very N-terminus of many dynamins and EF-hand domain-containing proteins.	33
374865	pfam16881	LIAS_N	N-terminal domain of lipoyl synthase of Radical_SAM family. LIAS_N is found as the N-terminal domain of the Radical_SAM family in the members that are lipoyl synthase enzymes, particularly the mitochondrial ones in metazoa but also those in bacteria.	97
374866	pfam16882	DUF5079	Domain of unknown function (DUF5079). This protein is believed to be involved in the type VII secretion system.	241
293488	pfam16883	DUF5080	Domain of unknown function (DUF5080). This protein is believed to be involved in the type VII secretion system.	204
407122	pfam16884	ADH_N_2	N-terminal domain of oxidoreductase. N-terminal region of oxidoreductase and prostaglandin reductase and alcohol dehydrogenase.	108
407123	pfam16885	CAC1F_C	Voltage-gated calcium channel subunit alpha, C-term. CAC1F_C is the C-terminal region of voltage-gated calcium channel subunit alpha in higher eukaryotes. The exact function of this domain is not known. This region lies immediately downstream from the CDB motif, pfam08673.	348
407124	pfam16886	ATP-synt_ab_Xtn	ATPsynthase alpha/beta subunit N-term extension. ATP-synt_ab_Xtn is an extension of the alpha-beta catalytic subunit of VATA or V-type proton ATPase catalytic subunit at the N-terminal end. It is found from bacteria to humans, and was not modelled in family ATP-synt_ab, pfam00006.	120
318977	pfam16887	DUF5081	Domain of unknown function (DUF5081). This protein is believed to be involved in the type VII secretion system.	230
407125	pfam16888	DUF5082	Domain of unknown function (DUF5082). This protein is believed to be involved in the type VII secretion system.	122
407126	pfam16889	Hepar_II_III_N	Heparinase II/III N-terminus. This is the N-terminal domain of heparinase II/III proteins. It is a toroid-like domain.	344
374868	pfam16890	DUF5083	Domain of unknown function (DUF5083). This protein is believed to be involved in the type VII secretion system.	157
407127	pfam16891	STPPase_N	Serine-threonine protein phosphatase N-terminal domain. This family is often found at the N-terminus of Metallophos family, in serine-threonine protein phosphatases.	48
407128	pfam16892	CHS5_N	Chitin biosynthesis protein CHS5 N-terminus. This domain is found at the N-terminus of fungal chitin biosynthesis protein CHS5. It functions as a dimerization domain.	48
407129	pfam16893	fn3_2	Fibronectin type III domain. This fibronectin type III domain is found in fungal chitin biosynthesis protein CHS5 where, together with the neighboring BRCT domain (pfam00533), it binds to the Arf1 GTPase.	89
318984	pfam16894	DUF5084	Domain of unknown function (DUF5084). This protein is believed to be involved in the type VII secretion system.	130
374871	pfam16895	DUF5085	Domain of unknown function (DUF5085). This protein is believed to be involved in the type VII secretion system.	139
407130	pfam16896	PGDH_C	Phosphogluconate dehydrogenase (decarboxylating) C-term. PGDH_C is the C-terminal domain of putative bacterial phosphogluconate dehydrogenase proteins.	153
407131	pfam16897	MMR_HSR1_Xtn	C-terminal region of MMR_HSR1 domain. MMR_HSR1_Xtn is the C-terminal region of some members of the MMR_HSR1 family.	105
407132	pfam16898	TOPRIM_C	C-terminal associated domain of TOPRIM. TOPRIM_C is found as the C-terminal extension of the TOPRIM domain, pfam01751 in metazoa.	127
407133	pfam16899	Cyclin_C_2	Cyclin C-terminal domain. Cyclins contain two domains of similar all-alpha fold, this family corresponds with the C-terminal domain of some cyclins including cyclin C and cyclin H.	98
407134	pfam16900	REPA_OB_2	Replication protein A OB domain. Replication protein A contains two OB domains in its DNA binding region. This is the second of the OB domains.	98
407135	pfam16901	DAO_C	C-terminal domain of alpha-glycerophosphate oxidase. DAO_C is the C-terminal region of alpha-glycerophosphate oxidase.	126
407136	pfam16902	Type2_restr_D3	Type-2 restriction enzyme D3 domain. This is the D3 domain of type-2 restriction enzyme. These enzymes contain an N-terminal recognition domain and a C-terminal catalytic domain. The recognition domain consists of the D1, D2 and D3 domains.	69
407137	pfam16903	Capsid_N	Major capsid protein N-terminus. This is the N-terminal domain of the major capsid protein in several dsDNA viruses.	196
407138	pfam16904	PurL_C	Phosphoribosylformylglycinamidine synthase II C-terminus. This is the C-terminal domain of phosphoribosylformylglycinamidine synthase II in Thermotoga and related species.	94
407139	pfam16905	GPHH	Voltage-dependent L-type calcium channel, IQ-associated. GPHH is a sequence motif found in this short domain on voltage-dependent L-type calcium channel proteins in eukaryotes. The domain is closely associated with the IQ-domain, pfam08763.	54
407140	pfam16906	Ribosomal_L26	Ribosomal proteins L26 eukaryotic, L24P archaeal. Ribosomal_L26 is a family of the 50S and the 60S ribosomal proteins from eukaryotes - L26 - and archaea - L25.	114
407141	pfam16907	Caskin-Pro-rich	Proline rich region of Caskin proteins. This proline rich region is found in Caskin proteins. Caskins are CASK-binding synaptic scaffolding proteins. This region is predicted to be natively unstructured. Its function is not known.	91
407142	pfam16908	VPS13	Vacuolar sorting-associated protein 13, N-terminal. VPS13 is a family of eukaryotic vacuolar sorting-associated 13 proteins that lies just downstream from Chorein_N family, pfam12624. The exact function of this domain is not known.	230
407143	pfam16909	VPS13_C	Vacuolar-sorting-associated 13 protein C-terminal. VPS13_C is a family of eukaryotic vacuolar sorting-associated 13 proteins that lies at the C-terminus of the members. The exact function of this domain is not known.	175
407144	pfam16910	VPS13_mid_rpt	Repeating coiled region of VPS13. This repeat is a family of repeating regions of eukaryotic vacuolar sorting-associated 13 proteins. This repeating region shares a common core element that includes a well-conserved P-x4-P-x13-17-G sequence. The exact function of this repeat is not known.	236
407145	pfam16911	PapA_C	Phthiocerol/phthiodiolone dimycocerosyl transferase C-terminus. 	120
407146	pfam16912	Glu_dehyd_C	Glucose dehydrogenase C-terminus. 	211
293518	pfam16913	PUNUT	Purine nucleobase transmembrane transport. PUNUT is a family of largely plant and fungal purine transporters. Most members are 10-pass transmembrane proteins, and they belong to the drug/metabolite transporter (dmt) superfamily. The plant vascular system transports nucleobases and their derivatives such as cytokinins and caffeine by a common H+-coupled high-affinity purine transport system; the PUNUT family members carry out this transport.	321
374885	pfam16914	TetR_C_12	Bacterial transcriptional repressor C-terminal. This domain is found at the C-terminus of a small group of bacterial TetR transcriptional regulator proteins.	105
319001	pfam16915	Eryth_link_C	Annelid erythrocruorin linker subunit C-terminus. This domain is found in linker subunits of the erythrocruorin respiratory complex in annelid worms.	120
407147	pfam16916	ZT_dimer	dimerization domain of Zinc Transporter. ZT_dimer is the dimerization region of the whole molecule of zinc transporters since the full-length members form a homodimer during activity. The domain lies within the cytoplasm and exhibits an overall structural similarity with the copper metallochaperone Hah1 UniProtKB:O00244, exhibiting an open alpha-beta domain with two alpha helices (H1 and H2) aligned on one side and a three-stranded mixed beta-sheet (S1 to S3) on the other side. The N-terminal part of the members is the Cation_efflux family, pfam01545.	73
407148	pfam16917	BPL_LplA_LipB_2	Biotin/lipoate A/B protein ligase family. 	182
407149	pfam16918	PknG_TPR	Protein kinase G tetratricopeptide repeat. This domain is found at the C-terminus of protein kinase G and contains a tetratricopeptide repeat (TPR).	340
407150	pfam16919	PknG_rubred	Protein kinase G rubredoxin domain. This rubredoxin domain is found at the N-terminus of protein kinase G, and is essential for kinase activity.	139
407151	pfam16920	TPKR_C2	Tyrosine-protein kinase receptor C2 Ig-like domain. In the tyrosine-protein kinase receptor NTRK1 this domain interacts with beta-nerve growth factor NGF.	45
407152	pfam16921	Tex_YqgF	Tex protein YqgF-like domain. This is the YqgF-like domain of the bacterial Tex protein, which is involved in transcriptional processes.	125
407153	pfam16922	SLD5_C	DNA replication complex GINS protein SLD5 C-terminus. The C-terminal domain of DNA replication complex GINS protein SLD5 is important in the assembly of the GINS complex, a complex which is involved in initiation of DNA replication and progression of DNA replication forks.	57
407154	pfam16923	Glyco_hydro_63N	Glycosyl hydrolase family 63 N-terminal domain. This is a family of eukaryotic enzymes belonging to glycosyl hydrolase family 63. They catalyze the specific cleavage of the non-reducing terminal glucose residue from Glc(3)Man(9)GlcNAc(2). Mannosyl oligosaccharide glucosidase EC:3.2.1.106 is the first enzyme in the N-linked oligosaccharide processing pathway. This family represents the N-terminal beta sandwich domain.	221
407155	pfam16924	DpaA_N	Dipicolinate synthase subunit A N-terminal domain. 	115
407156	pfam16925	TetR_C_13	Bacterial transcriptional repressor C-terminal. 	113
293531	pfam16926	HisKA_4TM	Archaeal 4TM region of histidine kinase. This N-terminal region of histidine-kinases consists of 4xTMs and is found in Archaea.	164
407157	pfam16927	HisKA_7TM	N-terminal 7TM region of histidine kinase. HisKA_7TM is an N-terminal region consisting of seven transmembrane domains found in Archaea and some bacteria. It is always found associated with histidine kinase.	221
293533	pfam16928	Inj_translocase	DNA/protein translocase of phage P22 injectosome. Inj_translocase is a family of putative phage translocases that are involved in the injectosome mechanism. Phage P22 of Salmonella typhimurium ejects four proteins, gp7, gp16, gp20 and gp26, from the phage virion into the bacterial cell after adsorption. These four proteins may play a role in DNA ejection.	217
407158	pfam16929	Asp2	Accessory Sec system GspB-transporter. Asp2 is a family of the SecA2/Y2 accessory Sec secretory system of Gram-positive bacteria. It is specific for large serine-rich repeat, cell-wall-anchored, glycoproteins such as GspB. Export of GspB requires the three Asp1-Asp3 proteins. Asp2, in conjunction with Asp3, probably acts as a chaperone in the early stage of GspB transport.	505
374891	pfam16930	Porin_5	Putative porin. 	535
407159	pfam16931	Phage_holin_8	Putative phage holin. 	122
407160	pfam16932	T4SS_TraI	Type IV secretory system, conjugal DNA-protein transfer. T4SS_TraI is a family of putative Gram-negative, largely Proteobacterial, type IV conjugal DNA-Protein transfer or VirB secretory pathway (IVSP) proteins.	211
407161	pfam16933	PelG	Putative exopolysaccharide Exporter (EPS-E). PelG is a family of putative exopolysaccharide transporters like PelG. Most members carry twelve transmembrane regions. The family also contains fusion proteins with glycosyl transferase group 1, which are putative flippase transporters.	451
293539	pfam16934	Mersacidin	Two-component Enterococcus faecalis cytolysin (EFC). Mersacidin is a cytolysin, a lantibiotic produced by Gram-positive bacteria. The cytolysin is a 'pseudohaemolysin' which produces haemolysis on blood agar plates, but not in broth culture. Mersacidin is one of the type B lantibiotics (lanthionine-containing antibiotics) that contain post-translationally modified amino acids and cyclic ring structures. Mersacidin attacks the cell wall precursor lipid II, thereby inhibiting cell-wall synthesis.	68
407162	pfam16935	Hol_Tox	Putative Holin-like Toxin (Hol-Tox). Hol_Tox is a family of small proteins (34-48aas) with a single TM region. Members can exhibit antibacterial activity against Gram-positive bacteria but not against Gram-negative bacteria.	60
407163	pfam16936	Holin_9	Putative holin. This is a family of putative holins from Actinobacteria with three TM regions.	78
407164	pfam16937	T3SS_HrpK1	Type III secretion system translocator protein, HrpF. T3SS_HrpK1 is a family of putative Type III secretion system pore-forming bacterial proteins. These allow transfer of pathogenic material from bacterial cytoplasm into the plant host cytoplasm.	256
407165	pfam16938	Phage_holin_Dp1	Putative phage holin Dp-1. Phage_holin_Dp1 is a family of putative phage-holins from Gram-positive bacteria, largely Firmicutes, with two probable TMSs. The family shows lytic activity.	62
293544	pfam16939	Porin_6	Putative porin. Porin_6 is a family of putative porins from Leptospira species.	282
374895	pfam16940	Tic110	Chloroplast envelope transporter. Tic110 is a family of chloroplast envelope proteins. Some are involved in protein translocation and others are neurotransmitter receptor, cys loop, ligand-gated ion channel or LIC proteins.	573
407166	pfam16941	CymA	Putative cyclodextrin porin. 	341
293547	pfam16942	CclA_1	Putative cyclic bacteriocin. This is a family of short proteins from Gram- putatively from the carnocyclin A family of bacteriocins.	103
374896	pfam16943	T4SS_CagC	Cag pathogenicity island, type IV secretory system. T4SS_CagC is a family of putative pathogenicity island, type IV, conjugal DNA-protein transfer, secretory system proteins from Gram-negative bacteria.	119
407167	pfam16944	KCH	Fungal potassium channel. KCH is a family of fungal proteins carrying three transmembrane domains. It is a member of the fungal potassium channel family of transporters, and includes a pair of homologous sequences that localize to distinct zones of the yeast plasma membrane and are induced during the response to mating pheromones. Together KCH1 and KCH2 promote low-affinity K+ uptake and are essential for K+-dependent activation of HACS - a high-affinity Ca2+ influx system that activates calcineurin and is essential for cell survival - in S. cerevisiae cells responding to mating pheromones.	251
407168	pfam16945	Phage_r1t_holin	Putative Lactococcus lactis phage r1t holin. Phage_r1t_holin is a family of putative phage r1t holins from Lactococcus. These holins carry two hydrophobic putative TMs separated by a short beta-turn region.	70
319021	pfam16946	Porin_OmpG_1_2	OMPG-porin 1 family. Porin_OmpG_1_2 is a family of putative porins of the OmpG-type. These are channels without solute specificity.	294
407169	pfam16947	Ferredoxin_N	N-terminal region of 4Fe-4S ferredoxin iron-sulfur binding. Ferredoxin_N is a short domain that is often found at the N-terminus of 4Fe-4S ferredoxin iron-sulfur binding domain proteins from Archaea and a few bacteria.	65
374899	pfam16949	ABC_tran_2	Putative ATP-binding cassette. This is a family of putative two component ABC exporters. This is the membrane protein of approximately 573 residues and twelve transmembrane domains. It is encoded adjacent to an ATPase.	542
293556	pfam16951	MaAIMP_sms	Putative methionine and alanine importer, small subunit. MaAIMP_sms is a family of hypothetical proteins from Proteobacteria that are purported to be small subunits of a methionine and alanine importer.	60
407170	pfam16952	Gln-synt_N_2	Glutamine synthetase N-terminal domain. 	112
407171	pfam16953	PRORP	Protein-only RNase P. PRORPs (protein-only RNase P) are a class of RNA processing enzymes that catalyze maturation of the 5' end of precursor tRNAs in Eukaryotes. Arabidopsis thaliana contains PRORP enzymes (PRORP1, PRORP2 and PRORP3) where PRORP1 localizes to mitochondria as well as chloroplasts, while PRORP2 and PRORP3 are found in the nucleus. In humans and most other metazoans, mt-RNase P is composed of three protein subunits (mitochondrial RNase P proteins 1-3; MRPP1-3), homologs to the Arabidopsis thaliana PRORP1-3. This domain corresponds to the metallonuclease domain of PRORPs. PRORP1 has 22% sequence identity to the human homolog MRPP3. PRORP1 crystal structure shows a V-shaped tripartite structure with a C-terminal metallonuclease domain of the NYN (N4BL1, YacP-like nuclease) family, with a typical and functional two-metal-ion catalytic site that has conserved aspartate residues.	241
407172	pfam16954	HRG	Haem-transporter, endosomal/lysosomal, haem-responsive gene. HRG1 is a family of conserved, membrane-bound permeases that reside in distinct intracellular compartments and bind and transport haem in metazoa. These proteins carry four transmembrane domains, 4xTMs, modelled here in two pairs, the two N-terminal and the two more C-terminal.	51
374902	pfam16955	OFeT_1	Ferrous iron uptake permease, iron-lead transporter. OFeT_1 is a family of conserved archaeal membrane proteins that are putative oxidase-dependent Fe2+ transporters.	206
339867	pfam16956	Porin_7	Putative general bacterial porin. 	274
407173	pfam16957	Mal_decarbox_Al	Malonate decarboxylase, alpha subunit, transporter. Mal_decarbox_Al is a family of Na+-transporting carboxylic acid decarboxylases.	547
407174	pfam16958	PRP9_N	Pre-mRNA-splicing factor PRP9 N-terminus. This is the N-terminal domain of pre-mRNA-splicing factor PRP9.	149
407175	pfam16959	Collectrin	Renal amino acid transporter. Collectrin is a single-pass transmembrane protein that is homologous to the C-terminal region of human angiotensin-converting enzyme 2, ACE2, found in Peptidase_M2 pfam01401. Collectrin is critical for normal amino acid reabsorption in the kidney.	154
319033	pfam16960	HpuA	Haemoglobin-haptoglobin utilisation, porphyrin transporter. HpuA is a family of Neisseria spp proteins from the hpuAB operon, which are putative porphyrin transporters.	313
374905	pfam16961	OmpA_like	Putative OmpA-OmpF-like porin family. This is a family of putative OmpA-OmpF-like porins from Bacteroidetes.	197
407176	pfam16962	ABC_export	Putative ABC exporter. This is a family of putative ABC_exporters from Firmicutes.	533
407177	pfam16963	PelD_GGDEF	PelD GGDEF domain. This degenerate GGDEF domain is found at the C-terminus of PelD, a membrane-bound c-di-GMP-specific receptor. It contains an RXXD motif resembling the allosteric inhibition site found in diguanylate cyclases. In PelD this RXXD motif binds to dimeric c-di-GMP.	123
407178	pfam16964	TadF	Putative tight adherence pilin protein F. TadF is a family of proteins from the tad locus that is part of the type IV bacterial secretory system.	176
374907	pfam16965	CSG2	Ceramide synthase regulator. CSG2 is an integral membrane protein with up to 10 transmembrane segments that, when over-expressed, localizes to the endoplasmic reticulum. CSG2 is a family of fungal transmembrane proteins that regulate mannosyl phosphorylinositol ceramide synthase and are thereby implicated in calcium homoeostasis in the cell.	396
407179	pfam16966	Porin_8	Porin-like glycoporin RafY. This is a family of Gram-negative Gammaproteobacteria putative raffinose-like porins.	363
407180	pfam16967	TcfC	E-set like domain. TcfC is a family of bacterial fimbrial proteins. These sit in the outer bacterial membrane surrounding the RcpA proteins of the fimbrial shaft. This family is from Gamma-proteobacteria. This domain represents an immunoglobulin like E-set domain.	68
407181	pfam16968	TadZ_N	Pilus assembly protein TadZ N-terminal. TadZ_N is the N-terminal region of the Flp pilus assembly protein TadZ, which carries an AAA, ATPase domain immediately downstream, AAA_31, pfam13614. The domain is an example of a signal-transduction-response receiver. It is localized to the cytoplasmic side of the inner bacterial cell-membrane, contacting also with both tadA and RcpC.	129
407182	pfam16969	SRP68	RNA-binding signal recognition particle 68. SRP68 is a family that is part of the SRP or signal recognition particle complex. This complex, consisting of six proteins and a 7SL-RNA is necessary for guiding the emerging proteins designed for the membrane towards the translocation pore. SRP68 forms a stable heterodimer with SRP72, a protein with a TPR repeat. Specific RNA-binding of SRP68 is mediated by the N-terminal domain of approximately 200 residues of this family.	561
339870	pfam16970	FimA	Type-1 fimbrial protein, A. FimA is a family of Gram-negative fimbrial component A proteins that form part of the pili. There are usually up to 1000 copies of this subunit in one pilus that form a helically wound rod onto which the tip fibrillum (FimF, FimG, FimH) is attached. Pilus subunits are translocated from the cytoplasm to the periplasm via the general secretory pathway SecYEG.	145
407183	pfam16971	RcpB	Rough colony protein B, tight adherence - tad - subunit. RcpB is part of the Tad operon of proteins. The Tad (tight adherence) macromolecular transport system, present in many bacterial and archaeal species, represents an ancient and major new subtype of type II secretion. The three Rcp proteins (RcpA, RcpB, and RcpC) and TadD, a putative lipoprotein, are localized to the bacterial outer membrane.	168
374913	pfam16972	TipE	Na+ channel auxiliary subunit TipE. TipE appears to be a family of insect Na+ channel auxiliary subunit proteins.	486
407184	pfam16973	FliN_N	Flagellar motor switch protein FliN N-terminal. FliN is one of three proteins that form a switch-complex at the base of the basal body of the flagellum; the switch regulates the flagellum-motor.	50
374915	pfam16974	NAR2	High-affinity nitrate transporter accessory. NAR2 is a family of plant proteins with a C-terminal transmembrane region that is an essential accessory for high-affinity nitrate uptake. This family works together with NRT2, a 12xTM family of proteins that is part of family MFS_1, pfam07690. NAR2 is also involved in the repression of lateral root initiation in response to high ratios of sucrose to nitrogen in the medium. Therefore the two component-system of NAR2 and NRT2 itself is likely to be involved in the signalling pathway that integrates nutritional cues for the regulation of lateral root architecture. The functional unit of the high-affinity nitrate influx complex is likely to be a tetramer, in Arabidopsis, made up of two subunits each of NRT2.1 and NAR2.1.	173
407185	pfam16975	UPAR_LY6_2	Ly6/PLAUR domain-containing protein 6, Lypd6. UPAR_LY6_2 is a family of higher eukaryotic proteins expressed in neurons. It modulates nicotinic acetylcholine receptors by selectively increasing Ca2+-influx through this ion channel. The family carries an LU protein domain - about 80 amino acids long characterized by a conserved pattern of 10 cysteine residues. The family is a positive feedback regulator of Wnt/beta-catenin signalling, eg for patterning of the mesoderm and neuroectoderm in zebrafish gastrulation, where Lypd6 is GPI-anchored to the plasma-membrane and interacts with the Wnt receptor Frizzled8 and the co-receptor Lrp6.	105
407186	pfam16976	RcpC	Flp pilus assembly protein RcpC/CpaB. RcpC is a family of Gram-negative proteins expressed from the tight-adherence tad locus. RcpC is an auxiliary protein that sits in the inner membrane and interacts with TadB and TadZ, an AAA ATPase. A recent study has identified two tandem beta-clip domains in RcpC95. beta-Clip domains are known to interact with carbohydrate moieties in other systems, such as SAF.	115
407187	pfam16977	ApeC	C-terminal domain of apextrin. ApeC domain was first identified from two apextrin-like proteins (ALP) of the amphioxus Branchiostoma japonicum. Our functional studies show that amphioxus ALP1 and ALP2 are important anti-bacterial effectors, and that the apeC domain of the ALP1/2 mediates the bacterial recognition by binding to bacterial muramyl dipeptide (MDP). Further analysis shows that the apeC domain is present in various proteins from cnidarians, molluscs, arthropods, hemichordates, echinoderms and amphioxus. The apeC domain is also found to form different domain combinations with other domains (in press).	205
407188	pfam16978	CRIM	SAPK-interacting protein 1 (Sin1), middle CRIM domain. CRIM is a domain in the middle of Sin1 that is important in the substrate recognition of TORC2. It is conserved from yeast to humans. TOR is a serine/threonine-specific protein kinase and forms functionally distinct protein complexes referred to as TORC1 and TORC2.	137
407189	pfam16979	SIN1_PH	SAPK-interacting protein 1 (Sin1), Pleckstrin-homology. SIN1_PH is a pleckstrin-homology domain found at the C-terminus of SIN1. It is conserved from yeast to humans. PH-domains are involved in intracellular signalling or as constituents of the cytoskeleton. SIN1 (SAPK-interacting protein 1) plays an essential role in signal transduction, and the PH domain is involved in lipid and membrane binding.	104
407190	pfam16980	CitMHS_2	Putative citrate transport. CitMHS is a family of putative citrate transporters, belonging to the Na+/H+ antiporter NhaD-like permease superfamily.	440
293586	pfam16981	Chi-conotoxin	chi-Conotoxin or t superfamily. Chi-conotoxin is a family of cone snail venom bioactive peptides of the chi-conopeptide class. These conopeptides show a unique ability, highly selectively and non-competitively, to inhibit the noradrenaline transporter. They show an unusual cysteine-stabilized scaffold that presents a gamma-turn in an optimized conformation for high affinity interactions with the noradrenaline transporter.	60
407191	pfam16982	Flp1_like	Putative Flagellin, Flp1-like, domain. 	48
407192	pfam16983	MFS_MOT1	Molybdate transporter of MFS superfamily. MFS_MOT1 is a family of molybdenate transporters. Molybdenum is an essential element that is taken up into the cell in the oxyanion molybdate. Molybdenum is used in the form of molybdopterin-cofactor, which participates in the active site of enzymes involved in key reactions of carbon, nitrogen, and sulfur metabolism.	111
319052	pfam16984	Grp7_allergen	Group 7 allergen. 	180
407193	pfam16985	DUF5086	Domain of unknown function (DUF5086). 	118
374920	pfam16986	CzcE	Heavy-metal resistance protein CzcE. CzcE is involved in heavy-metal resistance. It binds copper, which induces a conformational change.	80
407194	pfam16987	KIX_2	KIX domain. This KIX domain is an activator-binding domain.	83
407195	pfam16988	Vps36-NZF-N	Vacuolar protein sorting 36 NZF-N zinc-finger domain. The vacuolar protein sorting 36 NZF-N zinc-finger domain interacts with the C-terminus of vacuolar protein sorting 28.	65
407196	pfam16989	T6SS_VasJ	Type VI secretion, EvfE, EvfF, ImpA, BimE, VC_A0119, VasJ. T6SS_VasJ is a family from Gram-negative bacteria that forms a component of the type VI pathogenic secretion system. In the case of the Escherichia coli RS218 strain UniProtKB:G8IRL4, EvfF, it represents expression of the full-length gene; whereas it is just the C-terminal part of EvfE, UniProtKB:G8IRL3. The N-terminal part of these sequences is in family ImpA_N, pfam06812.	254
293595	pfam16990	CBM_35	Carbohydrate binding module (family 35). This is a mannan-specific carbohydrate binding domain, previously known as the X4 module. Unlike other carbohydrate binding modules, binding to substrate causes a conformational change.	119
407197	pfam16991	SIR4_SID	Sir4 SID domain. This is the Sir2 interaction domain (SID domain) of silent information regulator 4 (Sir4).	159
379914	pfam16992	RNA_pol_RpbG	DNA-directed RNA polymerase, subunit G. RNA_pol_RpbG is a family of archaeal and fungal subunit G of DNA-directed RNA polymerase.	119
407198	pfam16993	Asp1	Accessory Sec system protein Asp1. Asp1, along with SecY2, SecA2, and other proteins forms part of the accessory secretory protein system. The system is involved in the export of serine-rich glycoproteins important for virulence in a number of Gram-positive species, including Streptococcus gordonii and Staphylococcus aureus. This protein family is assigned to transport rather than glycosylation function, but the specific molecular role is unknown. Asp1 is predicted to be cytosolic.	522
407199	pfam16994	Glyco_trans_4_5	Glycosyl-transferase family 4. 	167
407200	pfam16995	tRNA-synt_2_TM	Transmembrane region of lysyl-tRNA synthetase. tRNA-synt_2_TM is a family from the N-terminal region of tRNA-synthase-2, with 6xTMs. The presence of this region indicates that the protein is anchored in the membrane. The family is found in Actinobacteria.	215
407201	pfam16996	Asp4	Accessory secretory protein Sec Asp4. Asp4 and Asp5 are putative accessory components of the SecY2 channel of the SecA2-SecY2 mediated export system, but they are not present in all SecA2-SecY2 systems. This family of Asp4 is found in Firmicutes.	55
407202	pfam16997	Wap1	Wap1 domain. The Wap1 domain is found at the C-terminus of fungal Wpl1 proteins (also known as Rad61). These proteins are members of the cohesin complex. The Wap1 domain binds to the ATPase domain of Smc3.	373
339877	pfam16998	17kDa_Anti_2	17 kDa outer membrane surface antigen. 17kDa_Anti_2 is a surface protein that is found in several Proteobacteria species.	111
339878	pfam16999	V-ATPase_G_2	Vacuolar (H+)-ATPase G subunit. This family represents vacuolar (H+)-ATPase G subunit from several bacterial and archaeal species. Subunit G is a component of the peripheral stalk of the ATPase complex.	104
407203	pfam17000	Asp5	Accessory secretory protein Sec, Asp5. Asp4 and Asp5 are putative accessory components of the SecY2 channel of the SecA2-SecY2 mediated export system, but they are not present in all SecA2-SecY2 systems. This family of Asp5 is found in Firmicutes.	71
407204	pfam17001	T3SS_basalb_I	Type III secretion basal body protein I, YscI, HrpB, PscI. T3SS_basalb_I represents a family of Gram-negative type III secretion basal body proteins I. It is the inner rod protein of the secreted needle. YscI is suggested to form a rod that allows substrate passage across the inner membrane of the needle protein YscF through it.	94
374928	pfam17002	DUF5089	Domain of unknown function (DUF5089). This is a family of microsporidial-specific proteins of unknown function. There is distant homology to synaptosomal-associated 25 family proteins.	193
374929	pfam17003	Actin_micro	Putative actin-like family. This is a family of microsporidial-specific proteins of unknown function. There is distant homology to the Actin family.	350
374930	pfam17004	SRP_TPR_like	Putative TPR-like repeat. This is a family of microsporidial sequences that are likely to fold into a TPR-like structure. Many sequences are annotated as being signal recognition proteins.	109
407205	pfam17005	WD40_like	WD40-like domain. This is a family of proteins which have weak homology to the WD40 repeat family. Members are largely from microsporidia and related species.	301
374931	pfam17006	DUF5087	Domain of unknown function (DUF5087). This is a family of microsporidial sequences of unknown function.	292
374932	pfam17007	HTH_micro	HTH-like. This is a family of microsporidial sequences whose function is not known. It is possible that the proteins are DNA-binding as there is distant homology to helix-turn-helix families at the N-terminus.	454
374933	pfam17008	DUF5088	Domain of unknown function (DUF5088). This is a family of microsporidial sequences of unknown function.	184
374934	pfam17009	DUF5090	Domain of unknown function (DUF5090). This is a microsporidial-specific family of proteins of unknown function. The family is likely to be of four transmembrane domains.	187
374935	pfam17010	DUF5092	Domain of unknown function (DUF5092). This is a family of microsporidial-specific sequences of unknown function. There is one transmembrane domain towards the C-terminus.	145
374936	pfam17011	DUF5093	Domain of unknown function (DUF5093). This is a family of microsporidial sequences that may be distantly related to RRP7, pfam12923, ribosomal-RNA-processing protein 7.	131
374937	pfam17012	DUF5091	Domain of unknown function (DUF5091). This is a family of microsporidial-specific sequences of unknown function.	147
374938	pfam17013	Acetyltransf_15	Putative acetyl-transferase. This is a family of microsporidial proteins which may be distantly related to acetyl-transferase.	210
374939	pfam17014	Mad3_BUB1_I_2	Putative Mad3/BUB1 like region 1 protein. This family of microsporidial sequences may be related to the Mad3_BUB1_I family pfam08311.	128
374940	pfam17015	DUF5094	Domain of unknown function (DUF5094). This family of largely microsporidial-specific proteins is of unknown function. However there may be distant homology to family Csm1, pfam12539.	178
374941	pfam17016	DUF5095	Domain of unknown function (DUF5095). This is a family of microsporidial-specific sequences. The function is not known and there is no distant homology to any Pfam families so far.	229
374942	pfam17017	zf-C2H2_aberr	Aberrant zinc-finger. This is a family of largely microsporidia-specific proteins with an aberrant zinc-finger motif of Cx(4)C2H repeated.	165
374943	pfam17018	MICSWaP	Spore wall protein. This is a family of microsporidial spore-wall proteins.	193
374944	pfam17019	DUF5096	Domain of unknown function (DUF5096). This is a family of microsporidial sequences of unknown function. There is a well conserved Asp residue towards the C-terminus which may be functional.	192
407206	pfam17020	DUF5097	Domain of unknown function (DUF5097). This is a family of microsporidia-specific proteins of unknown function. There is the possibility of very distant homology to the WAC domain.	119
374945	pfam17021	Mei5_like	Putative double-strand recombination repair-like. This is a family of microsporidia-specific sequences with homology to the double-strand recombination repair protein family, Mei5 pfam10376.	118
374946	pfam17022	PTP2	Polar tube protein 2 from Microsporidia. PTP2 is a family of microsporidial polar-tube protein 2 sequences. Humans can be infected with the unicellular eukaryote Microsporidia which are obligate intracellular parasites that produce resistant spores. To initiate entry into a new host cell a unique motile process is formed by a sudden extrusion of the polar tube protein from the spore. There are a series of conserved cysteine residues.	216
407207	pfam17023	DUF5098	Domain of unknown function (DUF5098). This is a family of microsporidia-specific sequences with no known function. There is a very characteristic NPW sequence motif at the very C-terminus.	461
374947	pfam17024	DMAP1_like	Putative DMAP1-like. This is a family of microsporidia-specific sequences that may have distant homology to the family DMAP1, pfam05499.	113
374948	pfam17025	DUF5099	Domain of unknown function (DUF5099). This is a family of microsporidia-specific sequences of unknown function.	109
374949	pfam17026	zf-RRPl_C4	Putative ribonucleoprotein zinc-finger of C4 type. This is a family of largely microsporidia-specific proteins. One member is annotated as being a ribonucleoprotein. The family carries two pairs of CxxC residues suggesting that there is DNA-binding.	108
374950	pfam17027	Bromo_TP_like	Histone-fold protein. This is a family of microsporidia-specific sequences that have distant homology to the Bromo_TP family, pfam07524.	119
407208	pfam17028	8TM_micro	8TM Microsporidial transmembrane domain. This is a family of largely microsporidial-specific proteins that carry eight transmembrane regions, in two blocks of four. Such an arrangement of TMs suggests a transporter function of some kind. There is a highly conserved NFLNW sequence-motif at the C-terminus which might be of functional importance.	259
374951	pfam17029	DUF5100	Domain of unknown function (DUF5100). This is a family of microsporidia-specific sequences of unknown function.	126
374952	pfam17030	Beta_lactamase3	Putative beta-lactamase-like family. This is a family derived from microsporidia-specific proteins. There is homology to the beta-lactamase domain.	213
374953	pfam17031	DUF5101	Domain of unknown function (DUF5101). This is a family of short microsporidia-specific proteins of unknown function.	99
407209	pfam17032	zinc_ribbon_15	zinc-ribbon family. This zinc-ribbon region is found on a set of largely microsporidia-specific proteins.	73
407210	pfam17033	Peptidase_M99	Carboxypeptidase controlling helical cell shape catalytic. This is the peptidase domain of a D,L-carboxypeptidase. The active site residues are Arg86, Glu222 and the metal ligands, in the peptidase domain, are Gln46, Glu49 and His128 in UniProtKB:O25708. The protein binds many zinc ions and a calcium ion and there are other metal binding sites. The catalytic activity is the release of m-Dpm from the peptide muramyl-Ala-gamma-D-Glu-m-Dpm; this is probably the precursor of the cell wall cross-linking peptide.	227
374955	pfam17034	zinc_ribbon_16	Zinc-ribbon like family. This family is found at the C-terminus of WD40 repeat structures in eukaryotes.	125
407211	pfam17035	BET	Bromodomain extra-terminal - transcription regulation. The BET, or bromodomain extra-terminal domain, is found on bromodomain proteins that play key roles in development, cancer progression and virus-host pathogenesis. It interacts with NSD3, JMJD6, CHD4, GLTSCR1, and ATAD5 all of which are shown to impart a pTEFb-independent transcriptional activation function on the bromodomain proteins.	64
407212	pfam17036	CBP_BcsS	Cellulose biosynthesis protein BcsS. This is a family of bacterial cellulose biosynthesis proteins. Cellulose is necessary for biofilm formation in bacteria. (Roemling U. and Galperin M.Y. "Bacterial cellulose biosynthesis. Diversity of operons and subunits" (manuscript in preparation)).	145
374957	pfam17037	CBP_BcsO	Cellulose biosynthesis protein BcsO. This is a family of bacterial cellulose biosynthesis proteins. Cellulose is necessary for biofilm formation in bacteria. (Roemling U. and Galperin M.Y. "Bacterial cellulose biosynthesis. Diversity of operons and subunits" (manuscript in preparation)).	208
407213	pfam17038	CBP_BcsN	Cellulose biosynthesis protein BcsN. This is a family of bacterial cellulose biosynthesis proteins. Cellulose is necessary for biofilm formation in bacteria. (Roemling U. and Galperin M.Y. "Bacterial cellulose biosynthesis. Diversity of operons and subunits" (manuscript in preparation)).	186
407214	pfam17039	Glyco_tran_10_N	Fucosyltransferase, N-terminal. This is the N-terminal domain of a family of fucosyltransferases. This enzyme transfers fucose from GDP-Fucose to GlcNAc in an alpha1,3 linkage. This family is known as glycosyltransferase family 10. The N-terminal domain is the likely binding-region for the fucose-like substrate (manuscript in publication).	109
407215	pfam17040	CBP_CCPA	Cellulose-complementing protein A. This is a family of bacterial cellulose-complementing protein A proteins necessary for cellulose biosynthesis. Cellulose is necessary for biofilm formation in bacteria. (Roemling U. and Galperin M.Y. "Bacterial cellulose biosynthesis. Diversity of operons and subunits" (manuscript in preparation)).	121
407216	pfam17041	SasG_E	E domain. This short domain is about 50 amino acids in length. Its structure shows that it is composed of two beta sheets each of three strands. This domain is found associated with the pfam07501 domain and it has structural similarity with that domain although it is somewhat shorter. The E domain forms part of a rod like structure.	48
407217	pfam17042	DUF1357_C	Putative nucleotide-binding of sugar-metabolising enzyme. This conserved region is found in proteins of unknown function in a range of Proteobacteria as well as the Gram-positive Oceanobacillus iheyensis. Structural analysis of the whole protein indicates the N- and C-termini interacting to produce a binding-interface in which a threonate-ADP complex is bound, suggesting that a sugar binding site is on the N-terminal domain, pfam07005, and a nucleotide binding site is in the C-terminal domain here (manuscript in preparation).	163
407218	pfam17043	MAT1-1-2	Mating type protein 1-1-2 of unknown function. MAT1-1-2 is a family of proteins present in Sordariomycetes. They are encoded by the MAT1-1-2 gene which is present in the mating types of Sordariomycetes. The most famous representative of this family is Neurospora crassa. MAT1-1-2 is the generic nomenclature of all mating-type genes encoding proteins with HPG (also termed PPF) domain. This gene and its domain were first identified in Podospora anserina (its name in this species is SMR1) and Neurospora crassa (its name in this species is mat A-2) by Debuchy et al (1993). HPG was the first name proposed for the domain found in MAT1-1-2 proteins, based on the most conserved residues (histidine, proline and glycine). PPF was a second denomination proposed by Kanematsu et al (2007) for the same domain but these authors identified different conserved residues (proline, proline and phenylalanine). The function of this domain is not yet known.	147
319104	pfam17044	BPTA	Borrelial persistence in ticks protein A. BPTA is a family of proteins that are found in Borrelia species. The function is not known.	196
407219	pfam17045	CEP63	Centrosomal protein of 63 kDa. CEP63 is a family of eukaryotic proteins involved in centriole activity.	268
407220	pfam17046	Ses_B	SesB domain on fungal death-pathway protein. SesB is a short conserved domain found on fungal proteins that are part of the cell death or heterokaryon incompatibility pathway.	22
293652	pfam17047	SMP_LBD	Synaptotagmin-like mitochondrial-lipid-binding domain. SMP is a proposed lipid-binding module, ie a synaptotagmin-like mitochondrial-lipid-binding domain found in eukaryotes. The SMP domain has a beta-barrel structure like protein modules in the tubular-lipid-binding (TULIP) superfamily. It dimerizes to form an approximately 90-Angstrom-long cylinder traversed by a channel lined entirely with hydrophobic residues. The following two C2 domains then form arched structures flexibly linked to the SMP domain. The SMP domain is a lipid-binding domain that links the ER with other lipid bilayer-membranes within the cell.	180
407221	pfam17048	Ceramidse_alk_C	Neutral/alkaline non-lysosomal ceramidase, C-terminal. This family represents C-terminal domain of a group of neutral/alkaline ceramidases found in both bacteria and eukaryotes. The EC classification is EC:3.5.1.23. The enzyme hydrolyzes ceramide to generate sphingosine and fatty acid. The enzyme plays a regulatory role in a variety of physiological events in eukaryotes and also functions as an exotoxin in particular bacteria. This C-terminal tail of the enzyme is highly conserved across all species and may play a role in the interaction of the enzyme with the plasma membranes. The tail is also vital for the stabilisation of the enzyme as a whole.	165
407222	pfam17049	AEP1	ATPase expression protein 1. ATPase expression protein 1 (AEP1) is a yeast mitochondrial protein. It is essential for the expression of subunit 9 of mitochondrial ATP synthase.	396
407223	pfam17050	AIM5	Altered inheritance of mitochondria 5. AIM5 is a fungal mitochondrial inner membrane protein. It is a component of the mitochondrial inner membrane organising system (MINOS/MitOS), which promotes normal mitochondrial morphology.	60
293656	pfam17051	COA2	Cytochrome C oxidase assembly factor 2. 	86
407224	pfam17052	CAF20	Cap associated factor. In eukaryotes, the translation of mRNA is initiated by the binding of eIF4F complex, which is composed of eIF4E, eIF4A and eIF4G proteins. eIF4E-binding proteins (4E-BPs) are involved in translational regulation through their interaction with eIF4E. There are two eIF4E-binding proteins (4E-BPs) found in S. cerevisiae, Caf20 and Eap1. This entry represents Caf20 (also known as p20), which competes with eIF4G for binding to eIF4E and interferes with the formation of the eIF4F complex, hence inhibiting translation. It is needed for the induction of pseudohyphal growth in response to nitrogen limitation.	151
407225	pfam17053	GEP5	Genetic interactor of prohibitin 5. Genetic interactor of prohibitin 5 (GEP5), also known as required for respiratory growth protein5 (RRG5), has been shown to interact with prohibitin ring complexes in the mitochondrial inner membrane that regulate cell proliferation as well as the dynamics and function of mitochondria. It is required for mitochondrial genome maintenance and is essential for respiratory growth.	214
293659	pfam17054	JUPITER	Microtubule-Associated protein Jupiter. Jupiter is a microtubule-associated protein that binds to all microtubule populations in Drosophila.	208
293660	pfam17055	VMR2	Viral matrix protein M2. M2 is a viral transmembrane protein which forms a proton-selective ion channel that is needed for the efficient release of the viral genome during virus entry. Once attached to the cell surface, the virion enters the cells by endocytosis. Acidification of the endosome triggers M2 ion channel activity. It also plays a role in the secretory pathways of viral proteins. It elevates the intravesicular pH of normally acidic compartments, such as the trans-Golgi network. It seems that M2 protein ion channel activity can affect the status of the conformational form of cleaved HA during intracellular transport.	235
407226	pfam17056	KRE1	Killer toxin-resistance protein 1. The killer toxin-resistance protein 1 family are GPI-anchored plasma membrane proteins, found in yeast. They are involved in 1,6-beta-glucan formation and in the assembly and architecture of the cell wall. They also act as plasma membrane receptors for the yeast K1 viral toxin, and are involved in subsequent lethal channel formation. The family also includes Pga1 proteins, which have a role in oxidative stress response and in adhesion and biofilm formation.	66
293662	pfam17057	B3R	Poxviridae B3 protein. This is a viral protein. Its function is unknown.	123
407227	pfam17058	MBR1	Mitochondrial biogenesis regulation protein 1. In yeast this protein participates in mitochondrial biogenesis and stress response. It also seems to affect NAM7 function, possibly at the level of mRNA turnover.	208
293664	pfam17059	MGTL	MgtA leader peptide. MgtL is a bacterial leader peptide that makes mgtA transcription sensitive to intracellular proline levels. When the levels of proline are low, this peptide cannot be translated and stem-loop 'C' forms in the mgtA 5'UTR, which enables the transcription of the downstream mgtA gene.	17
407228	pfam17060	MPS2	Monopolar spindle protein 2. MPS2 is a fungal transmembrane protein that is a component of the spindle pole body (SPB), required for the insertion of the nascent SPB into the nuclear envelope and for the proper execution of SPB duplication. It seems that Mps2-Spc24 interaction may contribute to the localization of Spc24 and other kinetochore components to the inner plaque of the SPB.	340
407229	pfam17061	PARM	PARM. Human PARM-1 is a mucin-like, androgen-regulated transmembrane protein that is present in most tissues, with high levels in the heart, kidney and placenta. It has been shown to be induced and expressed in prostate after castration and may have a role in cell proliferation and immortalisation in prostate cancer.	296
407230	pfam17062	Osw5	Outer spore wall 5. In fungi the outermost cape of the spore wall is made up of a polymer that contains cross-linked amino acid dityrosine, which is important for the stress resistance of the spore. The OSW family of proteins have been implicated in assembly of this protective dityrosine coat. OSW5 null mutant spores show an enhanced spore wall permeability and vulnerability to beta glucanase digestion. The proteins are predicted to be integral membrane proteins.	70
293668	pfam17063	Psm4	Phenol-soluble modulin alpha 4 peptide. Psma4 is a methicillin-resistant Staphylococcus aureus (MRSA) protein that may recruit, activate and induce the lysis of human neutrophils. It stimulates the secretion of IL-8 and also has haemolytic activity during MRSA infection.	20
407231	pfam17064	QVR	Sleepless protein. In Drosophila QUIVER (also known as SLEEPLESS protein) is required for homoeostatic regulation of sleep under normal conditions and following sleep deprivation. It is a novel potassium channel subunit that modulates the Shaker potassium channel, which regulates sleep.	85
293670	pfam17065	UPF0669	Putative cytokine, C6ORF120. C6orf120 is a secreted protein that promotes cell cycle progression of CD4(+) T-cells, not hepatocytes. In humans it has its main role in tunicamycin-induced CD4(+) T apoptosis that may be associated with endoplasmic reticulum stress. This suggests that it might be a new cytokine with immunoregulatory function that is selective for CD4+ T cells. It is mainly expressed in hepatocytes and cells in germinal centre of lymph nodes.	185
407232	pfam17066	RITA	RBPJ-interacting and tubulin associated protein. RITA is a highly conserved protein that binds to tubulin and shuttles between the cytoplasm and nucleus. It is responsible for export of RBP-J/CBF-1 from the nucleus, which modulates Notch-mediated transcription.	267
293672	pfam17067	RPS31	Ribosomal protein S31e. RPS31, Ubi3 precursor, which is part of mature 60S and 40S ribosomal subunits. It seems that linear ubiquitin fusion to Rps31 and its subsequent cleavage are required for the efficient production and functional integrity of 40S ribosomal subunits.	99
293673	pfam17068	RRG8	Required for respiratory growth protein 8 mitochondrial. RRG8 is a mitochondrial protein that plays an important role in maintenance of mtDNA; it is required for respiratory activity and for maintenance and expression of the mitochondrial genome.	279
293674	pfam17069	RSRP	Arginine/Serine-Rich protein 1. RSRP1 is an eukaryotic protein family. Its function is unknown.	299
319115	pfam17070	Thx	30S ribosomal protein Thx. Thx forms part of the 30S ribosomal subunit. It fits into a cavity between multiple RNA elements in the top of the 30S subunit head and stabilizes the organisation of these elements.	27
293676	pfam17071	Capsid_VP7	Outer capsid protein VP7. Outer capsid protein VP7 is a reoviral protein that interacts with VP4 to form the outer icosahedral capsid. Outer capsids are involved directly in viral host interactions.	276
407233	pfam17072	Spike_torovirin	Torovirinae spike glycoprotein. The spike glycoprotein is a corona viral transmembrane protein that mediates the binding of virions to the host cell receptor and is involved in membrane fusion. The torovirinae spike proteins appear distinct from other coronaviridae spike proteins, such as human SARS coronavirus.	1271
319116	pfam17073	SafA	Two-component-system connector protein. SafA is a bacterial transmembrane protein family that connects the signal transduction between the two component systems EvgS/EvgA and PhoQ/PhoP. SafA interacts with PhoQ, leading to the PhoQ/PhoP system activation in response to acid stress conditions.	64
293679	pfam17074	Darcynin	Darcynin, domain of unknown function. Darcynin is a bacterial protein family. Its function is unknown.	127
374974	pfam17075	RRT14	Regulator of rDNA transcription protein 14. Regulator of rDNA transcription protein 14 (RRT14) is a nucleolar protein that is involved in ribosome biogenesis.	196
293681	pfam17076	SBE2	SBE2, cell-wall formation. 	820
293682	pfam17077	Msap1	Mitotic spindle associated protein SHE1. She1 seems to be related to the spindle integrity function of the Dam1 complex. She1 is a dynein regulator and limits dynein offloading by gating the recruitment of dynactin to the astral microtubule plus end. Aurora B phosphorylates She1, modulating its potency against dynein.	330
293683	pfam17078	SHE3	SWI5-dependent HO expression protein 3. SWI5-dependent HO expression protein 3 (She3) is an RNA-binding protein that binds specific mRNAs, including the mRNA of Ash1, which is involved in cell-fate determination. She3 acts as an adapter protein that docks the myosin motor Myo4p onto an Ash1-She2p ribonucleoprotein complex. She3 seems to bind to Myo4p and She2p via different domains.	228
293684	pfam17079	SOTI	Male-specific protein scotti. Soti is a post-meiotically transcribed gene that is required in late spermiogenesis for normal spermatid individualisation. Besides, it is expressed in primary spermatocytes and round spermatids.	101
374975	pfam17080	SepA	Multidrug Resistance efflux pump. SepA is a drug efflux protein that is involved in bacterial multidrug resistance. It is predicted to have four transmembrane domains.	144
407234	pfam17081	SOP4	Suppressor of PMA 1-7 protein. SOP4 is a family of fungal ER membrane proteins that regulate the quality control and intracellular transport of Pma1-7, a mutant plasma membrane ATPase.	209
293687	pfam17082	Spc29	Spindle Pole Component 29. Spc29 is a component of the Spc-110 subcomplex and is required for the SPB (Spindle pole body) duplication. Spc29 acts as a linker between the central plaque component Spc42 to the inner plaque component Spc110.	245
407235	pfam17083	Swm2	Nucleolar protein Swm2. The nucleolar protein SWM2 (Synthetic With MUD-2-delta protein2) constitutes a yeast protein family. SWM2 is a nonessential gene whose function is unknown, but it encodes a protein that binds Tgs1, an enzyme responsible for 2,2,7-trimethylguanosine (TMG) capping of small nuclear (sn) RNAs implicated in pre-mRNA splicing.	130
407236	pfam17084	TDA11	Topoisomerase I damage affected protein 11. Tda11 is a fungal protein family. The function is unknown.	465
293690	pfam17085	UCMA	Unique cartilage matrix associated protein. UCMA is a secreted cartilage-specific protein located in chromosome 2 that is predominantly expressed in resting chondrocytes. It is secreted into the extracellular matrix as an uncleaved precursor and shows the same restricted distribution pattern in cartilage as UCMA mRNA. This protein is proteolytically processed and contains tyrosine sulfates. It seems to be involved in the negative control of osteogenic differentiation of osteochondrogenic precursor cells in peripheral zones of foetal cartilage.	134
374978	pfam17086	HV_small_capsid	Small capsid protein of Herpesviridae. This is a family of herpes-type viral small capsid proteins.	77
293692	pfam17087	HHV-5_US34A	Herpesvirus US34A protein family. Proteins in this human cytomegalovirus (HHV-5) family contain a transmembrane domain.	64
407237	pfam17088	YCF90	Uncharacterized protein family. Ycf90 is an algal protein located in chloroplasts. Its function is unknown.	388
293694	pfam17089	YjbT	Uncharacterized protein family. This is a family of bacterial proteins. The function is unknown.	92
374980	pfam17090	Ytca	Uncharacterized protein family. This is a family of bacterial transmembrane proteins. The function is unknown.	62
293696	pfam17091	Tail_VII	Inovirus G7P protein. Tail virion protein 7P is a viral transmembrane protein that interacts with the packaging signal of the viral genome, leading to the initiation of the concomitant virion assembly-budding process in the host inner membrane.	40
407238	pfam17092	PCB_OB	Penicillin-binding protein OB-like domain. 	109
293698	pfam17093	PBP_N	Penicillin-binding protein N-terminus. This domain occurs at the N-terminus of some penicillin-binding proteins in Caulobacter species.	138
293699	pfam17094	UPF0715	Uncharacterized protein family (UPF0715). This is a family of Bacilli transmembrane proteins. The function is unknown.	115
407239	pfam17095	CAMSAP_CC1	Spectrin-binding region of Ca2+-Calmodulin. CAMSAP_CC1 is the conserved region on calmodulin-regulated spectrin-associated proteins in eukaryotes that binds spectrin. CAMSAPs are vertebrate microtubule-binding proteins, representatives of a family of cytoskeletal proteins that arose in animals. This conserved CC1 region binds to both spectrin and Ca2+/calmodulin in vitro, although the binding of Ca2+/calmodulin inhibited the binding of spectrin. CC1 appears to be a functional region of CAMSAP1 that links spectrin-binding to neurite outgrowth.	59
407240	pfam17096	AIM3	Altered inheritance of mitochondria protein 3. AIM3 is a family of fungal proteins that are described as altered inheritance of mitochondria protein 3 proteins.	85
407241	pfam17097	Kre28	Spindle pole body component. In Saccharomyces cerevisiae Kre28 and Spc105 form a kinetochore microtubule binding complex, which bridges between centromeric heterochromatin and kinetochore MAPs (microtubule associated proteins, such as Bim1, Bik1 and Slk19) and motors (Cin8, Kar3). It may be regulated by sumoylation.	360
407242	pfam17098	Wtap	WTAP/Mum2p family. The Wtap family includes female-lethal(2)D from Drosophila and pre-mRNA-splicing regulator WTAP from mammals. The former is required for female-specific splicing of Sex-lethal RNA, and the latter is a regulatory subunit of the RNA N6-methyladenosine methyltransferase. The family also includes the yeast Mum2p protein which is part of the Mis complex.	155
293704	pfam17099	TrpP	Tryptophan transporter TrpP. TrpP is a bacterial transmembrane protein that is probably involved in tryptophan uptake. Its expression is regulated by tryptophan-activated RNA-binding regulatory protein (TRAP).	169
407243	pfam17100	NACHT_N	N-terminal domain of NWD NACHT-NTPase. This is an N-terminal domain on putative NWD NACHT proteins, signal transducing ATPases which undergo ligand-induced oligomerization.	220
407244	pfam17101	Stealth_CR1	Stealth protein CR1, conserved region 1. Stealth_CR1 is the first of several highly conserved regions on stealth proteins in metazoa and bacteria. There are up to four CR regions on all member proteins. CR1 carries a well-conserved IDVVYT sequence-motif. The domain is found in tandem with CR2, CR3 and CR4 on both potential metazoan hosts and pathogenic eubacterial species that are capsular polysaccharide phosphotransferases. The CR domains appear on eukaryotic proteins such as GNPTAB, N-acetylglucosamine-1-phosphotransferase subunits alpha/beta. Horizontal gene-transfer seems to have occurred between host and bacteria of these sequence-regions in order for the bacteria to evade detection by the host innate immune system.	29
407245	pfam17102	Stealth_CR3	Stealth protein CR3, conserved region 3. Stealth_CR3 is the third of several highly conserved regions on stealth proteins in metazoa and bacteria. There are up to four CR regions on all member proteins. The domain is found in tandem with CR1, CR2 and CR4 on both potential metazoan hosts and pathogenic eubacterial species that are capsular polysaccharide phosphotransferases. The CR domains appear on eukaryotic proteins such as GNPTAB, N-acetylglucosamine-1-phosphotransferase subunits alpha/beta. Horizontal gene-transfer seems to have occurred between host and bacteria of these sequence-regions in order for the bacteria to evade detection by the host innate immune system.	49
407246	pfam17103	Stealth_CR4	Stealth protein CR4, conserved region 4. Stealth_CR4 is the fourth highly conserved region on stealth proteins in metazoa and bacteria. There are four CR regions on mammalian members. CR4 carries a well-conserved CLND sequence-motif. The domain is found in tandem with CR1, CR2 and CR3 on both potential metazoan hosts and on pathogenic eubacterial species that are capsular polysaccharide phosphotransferases. The CR domains also appear on eukaryotic proteins such as GNPTAB, N-acetylglucosamine-1-phosphotransferase subunits alpha/beta. Horizontal gene-transfer seems to have occurred between host and bacteria of these sequence-regions in order for the bacteria to evade detection by the host innate immune system.	56
407247	pfam17104	DUF5102	Domain of unknown function (DUF5102). This is a family of fungal sequences of no known function.	292
407248	pfam17105	BRD4_CDT	C-terminal domain of bromodomain protein 4. BRD4_CDT is the short highly conserved C-terminal domain of certain bromodomain proteins, notably Brd4. The Brd4 CTD interacts with the cyclin T1 and Cdk9 subunits of positive transcription elongation factor b (pTEFb) complex. Brd4 displaces negative regulators, the HEXIM1 and 7SKsnRNA complex, from pTEFb, thereby transforming it into an active form that can phosphorylate RNA pol II.	44
374990	pfam17106	NACHT_sigma	Sigma domain on NACHT-NTPases. NACHT_sigma is a short conserved region found on NACHT-NTPases. The function of this domain is not known.	42
407249	pfam17107	SesA	N-terminal domain on NACHT_NTPase and P-loop NTPases. This is a family of fungal N-terminal domains that appear at the N-terminus of P-loop NTPases, NACHT-NTPases and Ankyrin or WD repeat proteins. The exact function is not known.	122
374992	pfam17108	HET-S	N-terminal small S protein of HET, non-prionic. HET-S is an N-terminal domain on various fungal STAND proteins. The function is not known exactly.	23
407250	pfam17109	Goodbye	fungal STAND N-terminal Goodbye domain. The Goodbye domain is an N-terminal domain on certain fungal STAND proteins. The exact function is not known.	120
407251	pfam17110	TFB6	Subunit 11 of the general transcription factor TFIIH. TFB6 is a family of fungal proteins that form the 11th subunit of the general transcription factor TFIIH. TFB6 facilitates the dissociation of Ssl2 helicase from TFIIH after the initiation of transcription.	170
374995	pfam17111	Helo_like_N	Fungal N-terminal domain of STAND proteins. Helo_like is a family of predicted fungal STAND NTPases. The exact function is not known.	209
407252	pfam17112	Tom6	Mitochondrial import receptor subunit Tom6, fungal. Tom6 is the Tom6 subunit of the protein translocase complex TOM in fungi. This complex of the outer membrane of mitochondria is the entry gate for the vast majority of precursor proteins that are encoded by nuclear DNA, synthesized in the cytosol and imported into the mitochondria. Tom6 and Tom7 together play a role in the assembly, stability and dynamics of the TOM complex.	45
374997	pfam17113	AmpE	Regulatory signalling modulator protein AmpE. AmpE is a family of bacterial regulatory proteins. AmpE in conjunction with AmpD sense the effect of beta-lactam on peptidoglycan synthesis and relay this signal to AmpR. AmpR regulates the production of beta-lactamase.	284
374998	pfam17114	Nod1	Gef2-related medial cortical node protein Nod1. This is a small family of fungal proteins that are involved in cytokinesis, the last stage of the cell-division cycle. Nod1 co-localizes with Gef2 - RhoGEF - in the contractile ring and its precursor cortical nodes. Nod1 and Gef2 interact through this C-terminal region of each, this interaction being important for their localization.	145
407253	pfam17115	Toast_rack_N	N-terminal domain of toast_rack, DUF2154. This short domain lies at the N-terminus of DUF2154, pfam09922, hereafter named Toast_rack from its structural resemblance. The function of both domains is unknown though DUF2154 is proposed to be a cell-adhesion protein.	92
407254	pfam17116	DUF5103	Domain of unknown function (DUF5103). This is a family of Bacteroidetes proteins of unknown function.	288
407255	pfam17117	DUF5104	Domain of unknown function (DUF5104). This is a family of proteins of unknown function from gut microbes.	107
407256	pfam17118	DUF5105	Domain of unknown function (DUF5105). This is a family of Firmicutes proteins of unknown function. There is one structure, Structure 4r4g, a lipoprotein, whose N-terminus is represented by DUF4352, pfam11611.	189
407257	pfam17119	MMU163	Mitochondrial protein up-regulated during meiosis. This is a family of fungal mitochondrial proteins of unknown function.	253
375001	pfam17120	Zn_ribbon_17	Zinc-ribbon, C4HC2 type. 	57
375002	pfam17121	zf-C3HC4_5	Zinc finger, C3HC4 type (RING finger). 	51
293727	pfam17122	zf-C3H2C3	Zinc-finger. 	35
407258	pfam17123	zf-RING_11	RING-like zinc finger. 	29
375004	pfam17124	ThiJ_like	ThiJ/PfpI family-like. This is a family of fungal and bacterial ThiJ/PfpI-like proteins.	188
407259	pfam17125	Methyltr_RsmF_N	N-terminal domain of 16S rRNA methyltransferase RsmF. This is the N-terminal domain of the RsmF methyl transferase. RsmF is a multi-site-specific methyltransferase that is responsible for the synthesis of three modifications on cytidines in 16S ribosomal RNA. The N-terminus is critical for stabilizing the catalytic core of the enzyme.	88
407260	pfam17126	RsmF_methylt_CI	RsmF rRNA methyltransferase first C-terminal domain. This is the first of two distinct C-terminal domains of the 16S rRNA methyltransferase RsmF. It is necessary for stabilizing the catalytic core, pfam01189.	61
407261	pfam17127	DUF5106	Domain of unknown function (DUF5106). This domain, found in Bacteroidetes proteins, is frequently associated with a putative thiol-disulfide oxidoreductase domain, pfam13905. The function of this domain is not known.	153
407262	pfam17128	DUF5107	Domain of unknown function (DUF5107). This family is found in a range of different bacterial species. In many proteins it lies N-terminal to a TPR-repeat region at the C-terminus.	300
407263	pfam17129	Peptidase_M99_C	C-terminal domain of metallo-carboxypeptidase. C-terminal immunoglobulin-like domain of helical cell shape-determining peptidoglycan hydrolases, a metallo-carboxypeptidase. The structural elements of this domain form a Ca2+ binding-channel, the Ca2+ being co-ordinated by six ligand-atoms.	100
407264	pfam17130	Peptidase_M99_m	beta-barrel domain of carboxypeptidase M99. This is the central, beta-barrel, domain of the metallo-carboxypeptidase that maintains helical cell-shape in Helicobacter. It shows a novel fold. It has a highly positively charged surface which contributes to a high overall isoelectric point. A calcium-binding channel is formed from residues in the C-terminal Ig-like domain in conjunction with some of the long side-chains of residues from strands beta-14 and beta-18 of this domain.	73
407265	pfam17131	LolA_like	Outer membrane lipoprotein-sorting protein. This is likely to be a family of outer-membrane lipoprotein-sorting proteins.	182
407266	pfam17132	Glyco_hydro_106	alpha-L-rhamnosidase. 	873
407267	pfam17133	DUF5108	Domain of unknown function (DUF5108). This is a family of Bacteroidetes proteins. The domain lies upstream of a Fasciclin family, pfam02469.	216
407268	pfam17134	DUF5109	Domain of unknown function (DUF5109). This is a family of Gram-positive Bacteroidetes and Firmicutes proteins. It lies just C-terminal to a putative glycosyl-hydrolase family, DUF4434, pfam14488. It is likely to be some form of binding or recognition domain.	114
407269	pfam17135	Ribosomal_L18	Ribosomal protein 60S L18 and 50S L18e. This is a family of ribosomal proteins, 60S L18 from eukaryotes and 50S L18e from Archaea.	184
407270	pfam17136	ribosomal_L24	Ribosomal proteins 50S L24/mitochondrial 39S L24. This is the family of bacterial 50S ribosomal subunit proteins L24. It also carries some mitochondrial 39S L24 proteins.	60
407271	pfam17137	DUF5110	Domain of unknown function (DUF5110). This domain is likely to be a carbohydrate-binding domain of some description as it is found immediately C-terminal to the glycosyl-hydrolase family Glyco_hydro_31, pfam01055.	72
407272	pfam17138	DUF5111	Domain of unknown function (DUF5111). This family is found immediately downstream of SusE, a putative starch-processing family, pfam14292. It is possible that this domain represents a substrate-binding site.	121
407273	pfam17139	DUF5112	Domain of unknown function (DUF5112). This domain is frequently found upstream of family HATPase_c pfam000251.	266
407274	pfam17140	DUF5113	Domain of unknown function (DUF5113). This domain is frequently found downstream of family HATPase_c pfam000251 in duplicate.	162
407275	pfam17141	DUF5114	Domain of unknown function (DUF5114). This family lies further downstream of DUF5111, pfam17138, on proteins from Bacteroidetes that also carry a SusE family, pfam14292.	88
407276	pfam17142	DUF5115	Domain of unknown function (DUF5115). 	258
407277	pfam17144	Ribosomal_L5e	Ribosomal large subunit proteins 60S L5, and 50S L18. This family contains the large 60S ribosomal L5 proteins from Eukaryota and the 50S L18 proteins from Archaea. It has been shown that the amino terminal 93 amino acids of Rat Rpl5 are necessary and sufficient to bind 5S rRNA in vitro, suggesting that the entire family has a function in rRNA binding.	162
407278	pfam17145	DUF5119	Domain of unknown function (DUF5119). This is a family of uncharacterized Bacteroidia sequences.	193
407279	pfam17146	PIN_6	PIN domain of ribonuclease. This is a PIN domain found largely in eukaryotes.	87
407280	pfam17147	PFOR_II	Pyruvate:ferredoxin oxidoreductase core domain II. PFOR_II is a core domain of the anaerobic enzyme pyruvate:ferredoxin oxidoreductase and is necessary for inter-subunit contacts in conjunction with domains I and IV.	102
407281	pfam17148	DUF5117	Domain of unknown function (DUF5117). This domain may fall upstream of a met-zincin domain.	189
319167	pfam17149	CHASE5	Periplasmic sensor domain found in signal transduction proteins. CHASE5 is a conserved periplasmic sensor domain found in histidine kinases, diguanylate cyclases/phosphodiesterases and methyl-accepting chemotaxis proteins. In Pseudomonas aeruginosa, CHASE5 is the sensor domain in the c-di-GMP phosphodiesterase BifA that regulates biofilm formation and in sensor kinase AruS that regulates arginine degradation pathways. These results suggest that CHASE5 might bind arginine or a related compound.	108
407282	pfam17150	CHASE6_C	C-terminal domain of two-partite extracellular sensor domain. CHASE6 was originally described as a two-partite extracellular (periplasmic) sensor domain found in histidine kinases and HD-GYP-type c-di-GMP-specific phosphodiesterases and assigned to COG4250 in the COG database. Subsequently, its N-terminal part has been described as a separate DICT (DIguanylate Cyclases and Two-component systems) domain (pfam10069) (Aravind L., Iyer LM, Anantharaman V. (2010) Natural history of sensor domains in the bacterial signalling systems. In: Sensory Mechanisms in Bacteria: Molecular Aspects of Signal Recognition (Spiro S, Dixon R, eds), pp. 1-38. Caister Academic Press, Norfolk, UK). The current entry contains only the C-terminal part of the original CHASE6 domain, which is found primarily in cyanobacteria.	80
407283	pfam17151	CHASE7	Periplasmic sensor domain. CHASE7 is a conserved periplasmic sensor domain found in histidine kinases and diguanylate cyclases/phosphodiesterases, including the diguanylate cyclase DgcQ (YedQ) that regulates biofilm formation and motility in Escherichia coli (Hengge R. et al. (2015) [A systematic naming system for GGDEF- and EAL-containing c-di-GMP turnover proteins in Escherichia coli K-12]. J.Bacteriol., in preparation).	187
407284	pfam17152	CHASE8	Periplasmic sensor domain. CHASE8 is a conserved periplasmic sensor domain found in histidine kinases, diguanylate cyclases/phosphodiesterases and methyl-accepting chemotaxis proteins, including the diguanylate cyclase DgcN (YfiN) that regulates biofilm formation and motility in Escherichia coli (Hengge R. et al. (2015) [A systematic naming system for GGDEF- and EAL-containing c-di-GMP turnover proteins in Escherichia coli K-12]. J.Bacteriol., in preparation). In Pseudomonas aeruginosa, CHASE8 is the sensor domain in the diguanylate cyclase TpbB that regulates biofilm formation by controlling the levels of extracellular DNA.	102
319171	pfam17153	CHASE9	Periplasmic sensor domain, extracellular. CHASE9 is a conserved extracellular (periplasmic) sensor domain found in histidine kinases, diguanylate cyclases/phosphodiesterases, methyl-accepting chemotaxis proteins, adenylate cyclases and protein serine phosphatases, including the c-di-GMP phosphodiesterase PdeI (YliE) of Escherichia coli (Hengge R. et al. (2015) [A systematic naming system for GGDEF- and EAL-containing c-di-GMP turnover proteins in Escherichia coli K-12]. J.Bacteriol., in preparation).	116
407285	pfam17154	GAPES3	Gammaproteobacterial periplasmic sensor domain. GAPES3 (GAmmaproteobacterial PEriplasmic Sensor) domain is a periplasmic sensor domain found in diguanylate cyclases/phosphodiesterases, including the c-di-GMP phosphodiesterases PdeK (YhjK) of Escherichia coli (Hengge R. et al. (2015) [A systematic naming system for GGDEF- and EAL-containing c-di-GMP turnover proteins in Escherichia coli K-12]. J.Bacteriol., in preparation) and HmsP of Yersinia pestis.	121
319173	pfam17155	GAPES1	Gammaproteobacterial periplasmic sensor domain. GAPES1 (GAmmaproteobacterial PEriplasmic Sensor) domain is a periplasmic sensor domain found in diguanylate cyclases and methyl-accepting chemotaxis proteins, including the diguanylate cyclase DgcJ (YeaJ) that regulates biofilm formation and motility in Escherichia coli (Hengge R. et al. (2015) 'A systematic naming system for GGDEF- and EAL-containing c-di-GMP turnover proteins in Escherichia coli K-12'. J.Bacteriol., in preparation).	274
319174	pfam17156	GAPES2	Gammaproteobacterial periplasmic sensor domain. GAPES2 (GAmmaproteobacterial PEriplasmic Sensor) domain is a periplasmic sensor domain found in diguanylate cyclases, including the diguanylate cyclase DgcI (YliF) of Escherichia coli (Hengge R. et al. (2015) [A systematic naming system for GGDEF- and EAL-containing c-di-GMP turnover proteins in Escherichia coli K-12]. J.Bacteriol., in preparation). It contains three conserved Cys residues that might participate in thiol-disulfide exchange.	204
407286	pfam17157	GAPES4	Gammaproteobacterial periplasmic sensor domain. GAPES4 (GAmmaproteobacterial PEriplasmic Sensor) domain is a periplasmic sensor domain found in various GGDEF- and EAL-containing proteins. In Escherichia coli, GAPES4 forms the N-terminal domain of the regulatory protein CsrD (YhdA) (Hengge R. et al. (2015) 'A systematic naming system for GGDEF- and EAL-containing c-di-GMP turnover proteins in Escherichia coli K-12'. J.Bacteriol., in preparation), which contains enzymatically inactive GGDEF and EAL domains and controls the degradation of two non-coding RNAs, CsrB and CsrC. In Vibrio cholerae, GAPES4-containing protein MshH (Q9KUW1_VIBCH) inhibits biofilm formation, apparently acting through the glucose-specific enzyme IIA (Q9KTD8, pfam00358).	98
407287	pfam17158	MASE4	Membrane-associated sensor, integral membrane domain. MASE4 (Membrane-Associated SEnsor) is an integral membrane sensor domain found in various GGDEF domain proteins, including a functional diguanylate cyclase DgcT (YcdT) and the enzymatically inactive CdgI (YeaI) of Escherichia coli (Hengge R. et al. (2015) 'A systematic naming system for GGDEF- and EAL-containing c-di-GMP turnover proteins in Escherichia coli K-12'. J.Bacteriol., in preparation). In the Shiga toxin-producing enteroaggregative E. coli O104:H4, which caused the outbreak of the haemolytic uraemic syndrome in Germany in 2011, MASE4-containing diguanylate cyclase DgcX, UniProtKB:B7LBD9_ECO55, was highly expressed, ensuring strong biofilm formation.	239
407288	pfam17159	MASE3	Membrane-associated sensor domain. MASE3 (Membrane-Associated SEnsor) is an integral membrane sensor domain of unknown specificity found in histidine kinases, diguanylate cyclases and protein phosphatases in various bacteria and archaea.	226
319178	pfam17160	DUF5124	Domain of unknown function (DUF5124). 	100
407289	pfam17161	DUF5123	Domain of unknown function (DUF5123). 	116
407290	pfam17162	DUF5118	Domain of unknown function (DUF5118). This domain falls upstream of a met-zincin domain.	50
407291	pfam17163	DUF5125	Domain of unknown function (DUF5125). 	193
407292	pfam17164	DUF5122	Domain of unknown function (DUF5122) beta-propeller. 	36
407293	pfam17165	DUF5121	Domain of unknown function (DUF5121). 	111
407294	pfam17166	DUF5126	Domain of unknown function (DUF5126). This domain lies C-terminal to DUF4959, pfam16323.	102
407295	pfam17167	Glyco_hydro_36	Glycosyl hydrolase 36 superfamily, catalytic domain. This is the catalytic region of the superfamily of enzymes referred to as GH36. UniProtKB:Q76IQ9 is a chitobiose phosphorylase that catalyzes the reversible phosphorolysis of chitobiose into alpha-GlcNAc-1-phosphate and GlcNAc with inversion of the anomeric configuration. The full-length enzyme comprises a beta sandwich domain and an (alpha/alpha)(6) barrel domain. The alpha-helical barrel component of the domain, this family, is the catalytic region.	425
407296	pfam17168	DUF5127	Domain of unknown function (DUF5127). 	226
407297	pfam17169	NRBF2_MIT	MIT domain of nuclear receptor-binding factor 2. This is the MIT (microtubule interaction and trafficking) domain of nuclear receptor-binding factor 2 - NRBF2 - in higher eukaryotes. It is a coiled-coil region at the N-terminus of pfam08961. NRBF2 plays an essential role in autophagy, the cellular pathway that degrades long-lived proteins and other cytoplasmic contents through lysosomes. NRBF2 binds Atg14L - a Beclin-binding protein - directly via the MIT domain and enhances Atg14L-linked Vps34 kinase (a class III phosphatidylinositol-3 kinase) activity and autophagy induction.	83
407298	pfam17170	DUF5128	6-bladed beta-propeller. This family is a 6-bladed beta-propeller structure of unknown function. There is a highly conserved FDxxG motif which might be important.	321
407299	pfam17171	GST_C_6	Glutathione S-transferase, C-terminal domain. This domain is closely related to pfam00043.	64
407300	pfam17172	GST_N_4	Glutathione S-transferase N-terminal domain. This domain is homologous to pfam02798.	97
375028	pfam17173	DUF5129	Domain of unknown function (DUF5129). 	337
379937	pfam17174	DUF5130	Domain of unknown function (DUF5130). 	136
407301	pfam17175	MOLO1	Modulator of levamisole receptor-1. MOLO1 is a one-pass transmembrane protein that contains a single extracellular globular domain. It is a positive regulator of levamisole-sensitive acetylcholine receptors in Caenorhabditis elegans. These receptors are Cys-loop ligand-gated ion channels, and the MOLO1 domain is an auxiliary subunit of the gated channel. The proteins carry a Rossmann fold.	119
407302	pfam17176	tRNA_bind_3	tRNA-binding domain. This domain, found at the C-terminus of tRNA(Met) cytidine acyltransferase, may be involved in tRNA-binding. This family represents the tRNA-binding domain proteins not captured by pfam13725.	119
407303	pfam17177	PPR_long	Pentatricopeptide-repeat region of PRORP. Pentatricopeptide repeat (PPR) proteins are a large family of modular RNA-binding proteins which mediate several aspects of gene expression primarily in organelles but also in the nucleus. PPR_long is the region of Arabidopsis protein-only RNase P (PRORP) enzyme that consists of up to eleven alpha-helices. PRORPs are a class of RNA processing enzymes that catalyze maturation of the 5' end of precursor tRNAs in Eukaryotes. All PPR proteins contain tandemly repeated sequence motifs (the PPR motifs) which can vary in number. The series of helix-turn-helix motifs formed by PPR motifs throughout the protein produces a superhelix with a central groove that allows the protein to bind RNA. Proteins containing PPR motifs are known to have roles in transcription, RNA processing, splicing, stability, editing, and translation. Over a decade after the discovery of PPR proteins, the super-helical structure was confirmed. The protein-only mitochondrial RNase P crystal structure from Arabidopsis thaliana (PRORP1) confirmed the role of its PPR motifs in pre-tRNA binding and suggests it has evolved independently from other RNase P proteins that rely on catalytic RNA.	212
407304	pfam17178	MASE5	Membrane-associated sensor. MASE5 is a family of bacterial membrane-associated sensor domains. It is an integral membrane sensor domain found in various GGDEF domain proteins, including a diguanylate cyclase DgcY (EcSMS35_1716) from multidrug-resistant environmental isolate Escherichia coli SMS-3-5 (Hengge R. et al. (2015) [A systematic naming system for GGDEF- and EAL-containing c-di-GMP turnover proteins in Escherichia coli K-12]. J.Bacteriol., in preparation).	192
407305	pfam17179	Fer4_22	4Fe-4S dicluster domain. 	95
407306	pfam17180	zf-3CxxC_2	Zinc-binding domain. 	74
407307	pfam17181	EPF	Epidermal patterning factor proteins. EPF is a family of plant epidermal cell growth factors. It is a signalling peptide that determines the spacing and separation of the development of stomatal cells in the upper epidermis of plant leaf cells.	45
407308	pfam17182	OSK	OSK domain. This entry represents the OSK domain defined by Jeske and colleagues. The domain is related to SGNH hydrolases but lacks the active site residues. The domain binds to RNA.	202
375036	pfam17183	Blt1_C	Get5 carboxyl domain. During size-dependent cell cycle transitions controlled by the ubiquitous cyclin-dependent kinase Cdk1, Blt1 has been shown to co-localize with Cdr2 in the medial interphase nodes, as well as with Mid1 which was previously shown to localize to similar interphase structures. Physical interactions between Blt1-Mid1, Blt1-Cdr2 and Cdr2-Mid1 were detected, indicating that medial cortical nodes are formed by the ordered, Cdr2-dependent assembly of multiple interacting proteins during interphase. This entry corresponds to the C-terminal dimerization domain.	51
407309	pfam17184	Rit1_C	Rit1 N-terminal domain. This domain is the N-terminal domain from the enzyme (EC:2.4.2.-) which modifies exclusively the initiator tRNA in position 64 using 5'-phosphoribosyl-1'-pyrophosphate as the modification donor. As the initiator tRNA participates both in the initiation and elongation of translation, the 2'-O-ribosyl phosphate modification discriminates the initiator tRNAs from the elongator tRNAs. The N-terminal domain is the most conserved region of the protein.	272
407310	pfam17185	NlpE_C	NlpE C-terminal OB domain. This family represents a bacterial outer membrane lipoprotein that is necessary for signalling by the Cpx pathway. This pathway responds to cell envelope disturbances and increases the expression of periplasmic protein folding and degradation factors. While the molecular function of the NlpE protein is unknown, it may be involved in detecting bacterial adhesion to abiotic surfaces. In Escherichia coli and Salmonella typhi, NlpE is also known to confer copper tolerance in copper-sensitive strains of Escherichia coli, and may be involved in copper efflux and delivery of copper to copper-dependent enzymes. This domain is found at the C-terminus of the NlpE protein.	90
407311	pfam17186	Lipocalin_9	Lipocalin-like domain. This family contains the members of the old Pfam family DUF2006. Structural characterization of a family member (from DUF2006 now merged into this family) has revealed a lipocalin-like fold with domain duplication. This entry represents the C-terminal domain of the pair.	130
407312	pfam17187	Svf1_C	Svf1-like C-terminal lipocalin-like domain. Family of proteins that are involved in survival during oxidative stress. This entry corresponds to the C-terminal domain of a pair of lipocalin domains.	163
407313	pfam17188	MucB_RseB_C	MucB/RseB C-terminal domain. Members of this family are regulators of the anti-sigma E protein RseD.	98
407314	pfam17189	Glyco_hydro_30C	Glycosyl hydrolase family 30 beta sandwich domain. 	63
407315	pfam17190	RecG_N	RecG N-terminal helical domain. This four helical bundle domain is found at the N-terminus of bacterial RecG proteins.	89
407316	pfam17191	RecG_wedge	RecG wedge domain. This DNA-binding domain has an OB-fold with large elaborations.	162
407317	pfam17192	MukF_M	MukF middle domain. The kicA and kicB genes are found upstream of mukB. It has been suggested that the kicB gene encodes a killing factor and the kicA gene codes for a protein that suppresses the killing function of the kicB gene product. It was also demonstrated that KicA and KicB can function as a post-segregational killing system, when the genes are transferred from the E. coli chromosome onto a plasmid.	161
407318	pfam17193	MukF_C	MukF C-terminal domain. This presumed domain is found at the C-terminus of the MukF protein.	158
407319	pfam17194	AbiEi_3_N	Transcriptional regulator, AbiEi antitoxin N-terminal domain. AbiEi_3 is the cognate antitoxin of the type IV toxin-antitoxin 'innate immunity' bacterial abortive infection (Abi) system that protects bacteria from the spread of a phage infection. The Abi system is activated upon infection with phage to abort the cell thus preventing the spread of phage through viral replication. There are some 20 or more Abis, and they are predominantly plasmid-encoded lactococcal systems. TA, toxin-antitoxin, systems on plasmids function by killing cells that lose the plasmid upon division. AbiE phage resistance systems function as novel Type IV TAs and are widespread in bacteria and archaea. The cognate antitoxin is pfam13338.	93
407320	pfam17195	DUF5132	Protein of unknown function (DUF5132). Proteins in this family are uncharacterized, but have been identified as members of a gene cluster for the synthesis of Ansamitocin.	47
407321	pfam17196	DUF5133	Protein of unknown function (DUF5133). This protein of unknown function is part of the Borrelidin synthesis genomic cluster. Borrelidin is a polyketide antibiotic.	65
407322	pfam17197	DUF5134	Domain of unknown function (DUF5134). Proteins in this family are uncharacterized, but have been identified as members of a gene cluster for the synthesis of the tetramic-acid antibiotic streptolydigin, which inhibits bacterial RNA polymerase (RNAP).	157
379943	pfam17198	AveC_like	Spirocyclase AveC-like. AveC catalyzes the stereospecific spiroketalization of a dihydroxy-ketone polyketide intermediate in the biosynthetic pathway of Avermectin, a potent antiparasitic agent. Additionally, it has a unique dehydration activity that serves to determine the regiospecific saturation pattern for spiroketal diversity. MeiC, the counterpart in the biosynthesis of AVE-like meilingmycin, also has spirocyclase activity, but lacks the dehydratase activity.	229
407323	pfam17199	DUF5136	Protein of unknown function (DUF5136). Sequences in this family have been identified in Micromonospora as part of the genomic cluster for the synthesis of dynemicin, an enediyne antitumor antibiotic.	28
407324	pfam17200	sCache_2	Single Cache domain 2. This entry represents the single Cache domain 2 (sCache_2), which contains the long N-terminal helix domain.	154
407325	pfam17201	Cache_3-Cache_2	Cache 3/Cache 2 fusion domain. The Cache_3-Cache_2 domain likely originated as a fusion of sCache_3 and sCache_2 domains.	292
407326	pfam17202	sCache_3_3	Single cache domain 3. 	107
375046	pfam17203	sCache_3_2	Single cache domain 3. 	140
375047	pfam17204	Sid-5	Sid-5 family. SID-5 is a C. elegans endosome-associated protein that is required for efficient systemic RNA interference (RNAi).	76
407327	pfam17205	PSI_integrin	Integrin plexin domain. This short disulphide rich domain is found at the N-terminus of integrin beta chains.	48
319224	pfam17206	SeqA_N	SeqA protein N-terminal domain. The binding of SeqA protein to hemimethylated GATC sequences is important in the negative modulation of chromosomal initiation at oriC, and in the formation of SeqA foci necessary for Escherichia coli chromosome segregation. SeqA tetramers are able to aggregate or multimerize in a reversible, concentration-dependent manner. Apart from its function in the control of DNA replication, SeqA may also be a specific transcription factor. This short domain mediates dimerization.	36
407328	pfam17207	MCM_OB	MCM OB domain. This family contains an OB-fold found within MCM proteins. This domain contains an insertion at the zinc binding motif.	126
407329	pfam17208	RBR	RNA binding Region. 	59
407330	pfam17209	Hfq	Hfq protein. 	64
407331	pfam17210	SdrD_B	SdrD B-like domain. This family corresponds to the B-like domain from the SdrD protein. This domain has three calcium binding sites within a greek key beta sandwich fold.	112
407332	pfam17211	VHL_C	VHL box domain. This domain represents the short C-terminal alpha helical domain from the VHL protein.	49
407333	pfam17212	Tube	Tail tubular protein. This family includes the tail tubular gp11 protein from bacteriophage T7.	169
407334	pfam17213	Hydin_ADK	Hydin Adenylate kinase-like domain. This domain found in the Hydin protein is homologous to adenylate kinases.	202
407335	pfam17214	KH_7	KH domain. 	68
407336	pfam17215	Rrp44_S1	S1 domain. This domain corresponds to the S1 domain found at the C-terminus of ribonucleases such as yeast Rrp44.	87
375054	pfam17216	Rrp44_CSD1	Rrp44-like cold shock domain. 	148
407337	pfam17217	UPA	UPA domain. The UPA domain is conserved in UNC5, PIDD, and Ankyrins. It has a beta sandwich structure.	140
407338	pfam17218	CBX7_C	CBX family C-terminal motif. This motif is found at the C-terminus of CBX family proteins. It is bound by the RAWUL domain of the RING1B protein.	33
407339	pfam17219	YAF2_RYBP	Yaf2/RYBP C-terminal binding motif. This motif is found in the Yaf2 and RYBP proteins that are homologous parts of the PRC1 complex. This motif forms a beta hairpin structure when it binds to the RAWUL domain pfam16207.	33
375058	pfam17220	DUF5137	Protein of unknown function (DUF5137). This is a family of uncharacterized yeast proteins.	78
407340	pfam17221	COMMD1_N	COMMD1 N-terminal domain. This helical domain is found at the N-terminus of COMMD1.	102
407341	pfam17222	Peptidase_C107	Viral cysteine endopeptidase C107. This is a family of viral cysteine endopeptidases that process RNA polyproteins. Site-directed mutagenesis suggests that H1434 and C1539 form the catalytic dyad.	314
375060	pfam17223	CPCFC	Cuticle protein CPCFC. This entry contains cuticle proteins with a CX(5)C motif, although some members have a CX(7)C motif. In Anopheles gambiae, mRNA for this protein is most abundant immediately following ecdysis in larvae, pupae and adults, and is localized primarily in epidermis that secretes hard cuticle, sclerites, setae, head capsules, appendages and spermatheca. EM immunolocalization studies have shown that the protein is present in the endocuticle of legs and antennae. CPCFC is found throughout the Hexapoda and in several classes of Crustacea.	17
407342	pfam17224	DUF5300	Domain of unknown function (DUF5300). This small family of proteins found in Clostridiales is functionally uncharacterized. Proteins in this family are around 130 amino acids in length. Based on NMR structure 2MCA, it forms a beta-sandwich structure consisting of two 4-stranded antiparallel b-strands. The structure is very similar to glutamine glutamyltransferases (1l9n) and peptide transporters (5a9h).	98
407343	pfam17225	DUF5301	Domain of unknown function (DUF5301). This small family of proteins is functionally uncharacterized. It is found mainly in Firmicutes. Proteins in this family are around 130 amino acids in length. Based on NMR structure 2MCT, it forms an alpha/beta structure with a 6-stranded antiparallel b-sheet flanked by a single alpha helix. The only protein with a similar structure is a putative lipoprotein (PDB code 4R7R).	97
407344	pfam17226	MTA_R1	MTA R1 domain. The R1 domain is found in the MTA1 protein and its homologs. The domain is composed of 4 alpha helices. It has been shown to bind to the RBBP4 protein. The MTA proteins contain a second partial copy of this domain called R2. The R2 domain is matched by this model for some proteins.	79
407345	pfam17227	DUF5302	Family of unknown function (DUF5302). Family of unknown function found in Actinobacteria with highly conserved motif of FRRKSG found at the C-terminus.	52
375063	pfam17228	SGP	Sulphur globule protein. Sulphur globules are membrane-bounded intracellular globules, used by purple sulphur bacteria to transiently store sulphur during the oxidisation of reduced sulphur compounds. This proteobacterial family contains structural proteins of these sulphur globules, and includes sulphur globule protein CV1 (SgpA) and sulphur globule protein CV2 (SgpB).	96
407346	pfam17229	DUF5303	Region of unknown function (DUF5303). This disordered region of unknown function shows similarity to the N-terminal region of SMG1.	106
407347	pfam17230	DUF5304	Family of unknown function (DUF5304). This family of unknown function is found in Actinobacteria.	149
407348	pfam17231	DUF5305	Family of unknown function (DUF5305). This family consists of several hypothetical proteins of unknown function.	215
407349	pfam17232	DUF5306	Family of unknown function (DUF5306). This family of unknown function is found mainly in plants.	82
407350	pfam17233	DUF5308	Family of unknown function (DUF5308). This family of uncharacterized fungal proteins is primarily found in Ascomycota.	162
407351	pfam17234	MPM1	Mitochondrial peculiar membrane protein 1. This family contains mitochondrial peculiar membrane proteins, found predominantly in Saccharomycetales.	172
407352	pfam17235	STD1	STD1/MTH1. This family of proteins includes the known homologs STD1 (also known as MSN3) and MTH1. Both STD1 and MTH1 are involved in modulating the expression of glucose-regulated genes in yeast, but have been shown to function by slightly different methods. It has been suggested that both STD1 and MTH1 are required to repress the hexose transporter genes in low glucose conditions. STD1 has also been shown to stimulate SNF1 kinase through interaction with the catalytic domain of SNF1, antagonising auto-inhibition and promoting an active conformation of the kinase.	213
407353	pfam17236	DUF5309	Family of unknown function (DUF5309). This is a family of uncharacterized proteins found in viruses and bacteria.	280
407354	pfam17237	DUF5310	Family of unknown function (DUF5310). This uncharacterized family of proteins contains members that are found mainly in fungi.	44
407355	pfam17238	DUF5311	Family of unknown function (DUF5311). This is a family of proteins which is mostly found in Streptophyta. At the C-terminus of this family, the Nucleoporin Nup120/160 family, pfam11715, is often present.	194
375073	pfam17239	DUF5312	Family of unknown function (DUF5312). This is a family of unknown function, mostly found in Spirochaeta.	553
375074	pfam17240	DUF5313	Family of unknown function (DUF5313). This is a family of unknown function, found mostly in Actinobacteria and composed of trans-membrane proteins.	123
407356	pfam17241	DUF5314	Family of unknown function (DUF5314). This is a family of unknown function usually preceded by the GAG-pre-integrase domain pfam13976.	154
407357	pfam17242	DUF5315	Disordered region of unknown function (DUF5315). This is a family of unknown function found mostly in Saccharomycetales.	77
407358	pfam17243	POTRA_TamA_1	POTRA domain TamA domain 1. This family represents the POTRA domain found in the membrane insertase TamA.	74
407359	pfam17244	CDC24_OB3	Cell division control protein 24, OB domain 3. This family contains OB-fold domains that bind to nucleic acids. The family includes a domain found in Cell division control protein 24 (Cdc24). Cdc24 plays an essential role in the progression of normal DNA replication and is required to maintain genomic integrity. Cdc24 has been reported to interact with replication factor C (RFC) as well as proliferating cell nuclear antigen (PCNA), and has been suggested to act as a target for the regulation of damage repair DNA synthesis.	207
407360	pfam17245	CDC24_OB2	Cell division control protein 24, OB domain 2. This family contains OB-fold domains that bind to nucleic acids. The family includes a domain found in Cell division control protein 24 (Cdc24). Cdc24 plays an essential role in the progression of normal DNA replication and is required to maintain genomic integrity. Cdc24 has been reported to interact with replication factor C (RFC) as well as proliferating cell nuclear antigen (PCNA), and has been suggested to act as a target for the regulation of damage repair DNA synthesis.	129
407361	pfam17246	CDC24_OB1	Cell division control protein 24, OB domain 1. This family contains OB-fold domains that bind to nucleic acids. The family includes a domain found in Cell division control protein 24 (Cdc24). Cdc24 plays an essential role in the progression of normal DNA replication and is required to maintain genomic integrity. Cdc24 has been reported to interact with replication factor C (RFC) as well as proliferating cell nuclear antigen (PCNA), and has been suggested to act as a target for the regulation of damage repair DNA synthesis.	118
407362	pfam17247	DUF5316	Family of unknown function (DUF5316). This is a family of unknown function mainly found in Firmicutes. Might contain multiple trans-membrane sequences.	74
407363	pfam17248	DUF5317	Family of unknown function (DUF5317). This is a family of unknown function found mainly in Bacteria. Members of this family have multiple trans-membrane domains with the majority typically constituted of 4 trans-membrane regions.	150
407364	pfam17249	DUF5318	Family of unknown function (DUF5318). This family of unknown function is mostly found in Actinobacteria.	131
407365	pfam17250	NDUFB11	NADH-ubiquinone oxidoreductase 11 kDa subunit. Complex I of the respiratory chain is a proton-pumping, NADH ubiquinone oxidoreductase that oxidizes NADH in the electron transport pathway. Plants contain the series of 14 highly conserved complex I subunits found in other eukaryotic and related prokaryotic enzymes.	86
407366	pfam17251	Pom	Protochlamydia outer membrane protein. This family represents an outer membrane protein found in environmental chlamydia. The protein shows porin function.	279
407367	pfam17252	DUF5319	Family of unknown function (DUF5319). This is a family of unknown function mostly found in Actinobacteria.	121
407368	pfam17253	DUF5320	Family of unknown function (DUF5320). A number of members of this family have a coiled-coil domain at the C-terminus.	98
407369	pfam17254	DUF5321	Family of unknown function (DUF5321). This is a family of unknown function. Most of the members seem to carry one trans-membrane region.	160
407370	pfam17255	DUF5322	Family of unknown function (DUF5322). This is a family of unknown function. The uncharacterized family is mainly found in Bacteria and consists of two putative trans-membrane domains.	133
407371	pfam17256	ANAPC16	Anaphase Promoting Complex Subunit 16. The Anaphase-promoting complex/cyclosome (APC/C) is a 1.5 megadalton ubiquitin ligase assembly comprising 19 subunits. This multifunctional ubiquitin-protein ligase targets different substrates for ubiquitylation and therefore regulates a variety of cellular processes such as cell division, differentiation, genome stability, energy metabolism, cell death, autophagy as well as carcinogenesis. The APC/C complex contains two sub-complexes, the Platform and the Arc Lamp. The Arc Lamp, which mediates transient association with regulators and ubiquitination substrates, contains the small subunits APC16, CDC26, APC13, and tetratricopeptide repeat (TPR) proteins. APC16 is a conserved subunit of the APC/C. APC16 was found in association with tandem-affinity-purified mitotic checkpoint complex protein complexes. APC16 is a bona fide subunit of human APC/C. It is present in APC/C complexes throughout the cell cycle. The phenotype of APC16-depleted cells copies depletion of other APC/C subunits, and APC16 is important for APC/C activity towards mitotic substrates. APC16 sequence homologs can be identified in metazoans, but not fungi, by four conserved primary sequence stretches.	80
375084	pfam17257	DUF5323	Family of unknown function (DUF5323). This family of proteins found in Eukaryota, has no known function.	62
407372	pfam17258	DUF5324	Family of unknown function (DUF5324). This is a family of unknown function, mostly found in Actinobacteria. Most of the family members contain one trans-membrane domain.	220
407373	pfam17259	DUF5325	Family of unknown function (DUF5325). This is a family of unknown function mainly found in Bacilli. Members of this family are predicted to have trans-membrane domains.	61
407374	pfam17260	DUF5326	Family of unknown function (DUF5326). This is a family of unknown function mostly found in Actinobacteria. Many of the family members are predicted to contain two trans-membrane domains.	70
407375	pfam17261	DUF5327	Family of unknown function (DUF5327). This bacterial family of proteins has no known function and is mostly found in Bacilli.	97
407376	pfam17262	DUF5328	Family of unknown function (DUF5328). This family of unknown function can be found in Bacteria and Archaea. Some of the proteins in this family are annotated in UniProt as putative DNA repair proteins.	114
407377	pfam17263	DUF5329	Family of unknown function (DUF5329). This is a bacterial family of proteins with unknown function.	93
407378	pfam17264	DUF5330	Family of unknown function (DUF5330). This is a family of unknown function which is mostly found in Bacteria.	65
407379	pfam17265	DUF5331	Family of unknown function (DUF5331). This bacterial family of unknown function can be found in Cyanobacteria.	113
407380	pfam17266	DUF5332	Family of unknown function (DUF5332). This family of uncharacterized proteins is mostly found in Chromadorea.	148
407381	pfam17267	DUF5333	Family of unknown function (DUF5333). This family of uncharacterized proteins is mostly found in Alphaproteobacteria.	110
339984	pfam17268	DUF5334	Family of unknown function (DUF5334). This is a family of unknown function which is found mainly in Proteobacteria.	71
407382	pfam17269	DUF5335	Family of unknown function (DUF5335). This bacterial family of proteins has no known function.	110
407383	pfam17270	DUF5336	Family of unknown function (DUF5336). This Actinobacterial family of proteins has no known function. Most of the family members are predicted to have 4 trans-membrane regions.	115
407384	pfam17271	Usher_TcfC	TcfC Usher-like barrel domain. This is the presumed beta barrel domain from the usher-like TcfC family of proteins.	422
407385	pfam17272	DUF5337	Family of unknown function (DUF5337). This family of unknown function is found in Rhodobacterales. Most members are predicted to have 2 trans-membrane regions.	74
407386	pfam17273	DUF5338	Family of unknown function (DUF5338). This is a family of unknown function which can be found mostly in Proteobacteria.	70
407387	pfam17274	DUF5339	Family of unknown function (DUF5339). This is a family of unknown function that can be found mostly in Proteobacteria. Some of the family members are predicted to contain a coiled coil region.	70
407388	pfam17275	DUF5340	Family of unknown function (DUF5340). This family of unknown function can be found in Cyanobacteria.	70
407389	pfam17276	DUF5341	Family of unknown function (DUF5341). This is a family of unknown function, which can be found mostly in Ascomycota.	161
407390	pfam17277	DUF5342	Family of unknown function (DUF5342). This family of no known function is found in Bacilli.	69
407391	pfam17278	DUF5343	Family of unknown function (DUF5343). This is a family of unknown function which is found in Bacteria and Archaea.	138
407392	pfam17279	DUF5344	Family of unknown function (DUF5344). This is a Bacterial family of unknown function. Most of the members of this family are predicted to contain a coiled-coil region.	87
407393	pfam17280	DUF5345	Family of unknown function (DUF5345). This is a family of unknown function. It is found mostly in Bacteria. Members of this family are predicted to contain 2 trans-membrane regions.	77
407394	pfam17281	DUF5346	Family of unknown function (DUF5346). This family of unknown function is found in Nematoda.	102
407395	pfam17282	DUF5347	Family of unknown function (DUF5347). This family of unknown function is found in Bacteria, mainly in Proteobacteria.	102
407396	pfam17283	Zn_ribbon_SprT	SprT-like zinc ribbon domain. This family represents a domain found in eukaryotes and prokaryotes. The domain contains a characteristic motif of the zinc ribbon. This family includes the bacterial SprT protein.	38
407397	pfam17284	Spermine_synt_N	Spermidine synthase tetramerisation domain. This domain represents the N-terminal tetramerization domain from spermidine synthase.	53
407398	pfam17285	PRMT5_TIM	PRMT5 TIM barrel domain. This domain corresponds to the N-terminal TIM barrel domain from PRMT5 proteins.	248
407399	pfam17286	PRMT5_C	PRMT5 oligomerization domain. 	173
407400	pfam17287	POTRA_3	POTRA domain. This POTRA domain is found in ShlB-like proteins.	56
375107	pfam17288	Terminase_3C	Terminase RNAseH like domain. 	154
407401	pfam17289	Terminase_6C	Terminase RNaseH-like domain. 	153
407402	pfam17290	Arena_ncap_C	Arenavirus nucleocapsid C-terminal domain. This domain represents the C-terminal domain that contains 3'-5' exoribonuclease activity involved in suppressing interferon induction. This domain has an RNaseH-like fold.	177
407403	pfam17291	M60-like_N	N-terminal domain of M60-like peptidases. This accessory domain has a jelly roll topology.	107
407404	pfam17292	POB3_N	POB3-like N-terminal PH domain. This domain is found at the N-terminus of POB3 and related proteins.	93
407405	pfam17293	Arm-DNA-bind_5	Arm DNA-binding domain. This domain is the N-terminal Arm DNA-binding domain found in various tyrosine recombinases.	87
407406	pfam17294	Lipoprotein_22	Uncharacterized lipoprotein family. The proteins in this family all have an N-terminal lipoprotein attachment motif. No member of this family has been functionally characterized.	166
407407	pfam17295	DUF5348	Domain of unknown function (DUF5348). 	69
340012	pfam17296	ArenaCapSnatch	Arenavirus cap snatching domain. This domain represents the N-terminal domain of the Arenavirus polymerase that is involved in cap snatching during transcription initiation.	171
407408	pfam17297	PEPCK_N	Phosphoenolpyruvate carboxykinase N-terminal domain. Phosphoenolpyruvate carboxykinase catalyzes the formation of phosphoenolpyruvate by decarboxylation of oxaloacetate.	218
407409	pfam17298	DUF5349	Family of unknown function (DUF5349). This is a family of unknown function found in Saccharomycetaceae.	362
407410	pfam17299	DUF5350	Family of unknown function (DUF5350). This family is found in Euryarchaeota, predominantly in Methanomicrobia and Archaeoglobi. No known function for this family has been demonstrated.	57
407411	pfam17300	FIN1	Filament protein FIN1. Fin1 is a kinetochore protein, predicted to contain two putative coiled-coil regions at its C-terminus. It is present in a filamentous structure associated with the spindle and spindle pole in dividing cells during anaphase. Fin1 is a substrate of S-phase cyclin-dependent kinase (CDK). It binds to PP1, creating the Fin1-PP1 complex, which is recruited onto kinetochores, promoting spindle assembly checkpoint (SAC) disassembly during anaphase. This is an important step in cell division since the kinetochore is the docking site for the spindle assembly checkpoint that monitors defects in chromosome attachment and blocks anaphase onset. Fin1 has two RXXS/T sequences, S377 (RVTS) and S526 (RKVS), that can be phosphorylated. Upon phosphorylation, interaction with other proteins such as Bmh1 and Bmh2 is promoted. However, de-phosphorylation during anaphase promotes the kinetochore recruitment of Fin1-PP1.	240
407412	pfam17301	LpqV	Putative lipoprotein LpqV. This is a family of cell surface proteins found in Mycobacterium with no known function.	117
407413	pfam17302	DUF5351	Family of unknown function (DUF5351). This family of unknown function is found in Bacillales.	29
407414	pfam17303	DUF5352	Family of unknown function (DUF5352). This is a family of unknown function found mostly in Eukaryota.	165
407415	pfam17304	DUF5353	Family of unknown function (DUF5353). This is a family of unknown function found mostly in Fungi. Members of this family are predicted to contain 2 trans-membrane regions.	68
375117	pfam17305	DUF5354	Family of unknown function (DUF5354). This family of unknown function is found mostly in Metazoa.	124
407416	pfam17306	DUF5355	Family of unknown function (DUF5355). This family of unknown function is found in Saccharomycetales.	331
407417	pfam17307	Smim3	Small integral membrane protein 3. This domain family can be found in Smim3 proteins (Small integral membrane protein 3) also known as NID67 (NGF-induced differentiation clone 67). It is a primary response gene, hypothesized to be involved in forming or regulating ion channels in neuronal differentiation. It is strongly induced by NGF (Nerve Growth Factor) and FGF (Fibroblast Growth Factor), both of which cause these cells to differentiate. The amino acid sequence of NID67 is strongly conserved among rat, mouse and human. This family of small membrane proteins is only 60 amino acids long and analysis of the predicted peptide sequence reveals a stretch of 29 hydrophobic and uncharged residues which very likely comprise a trans-membrane region.	60
375119	pfam17308	Corazonin	Pro-corazonin. This domain family is found in Corazonin proteins in Drosophila and other Arthropods. Corazonin (Crz) is a neuropeptide with a wide spectrum of biological functions in diverse insect groups. It was first discovered due to its myostimulatory activities on the heart muscle of Periplaneta americana and the hyper-neural muscle of Carausius morosus. In Drosophila melanogaster, Crz plays diverse roles ranging from a regulator of insulin producing cells in the brain to roles specific to tissues, life stages, and gender.	134
407418	pfam17309	DUF5356	Family of unknown function (DUF5356). This is a family of unknown function found in Chromadorea.	135
375121	pfam17310	DUF5357	Family of unknown function (DUF5357). This is a family of unknown function found in Cyanobacteria. Most of the family members are predicted to have several trans-membrane regions.	319
407419	pfam17311	DUF5358	Family of unknown function (DUF5358). This family of unknown function is found in Proteobacteria.	161
407420	pfam17312	Helveticin_J	Bacteriocin helveticin-J. Bacteriocins are biologically active proteins or protein complexes that display a bactericidal mode of action towards closely related species. Bacteriocins produced by lactic acid bacteria are grouped into different classes. Class III of bacteriocins includes large heat-labile proteins. Lactobacillus helveticus 481 produces a 37-kDa bacteriocin called helveticin J which is a representative for Class III bacteriocins.	310
407421	pfam17313	DUF5359	Family of unknown function (DUF5359). This is a family of unknown function found in Bacillales. Most of the family members are predicted to have one trans-membrane region.	56
407422	pfam17314	DUF5360	Family of unknown function (DUF5360). This is a family of unknown function. It is present in Bacteria and most of the family members are predicted to have 4 trans-membrane regions.	127
407423	pfam17315	FMP23	Found in Mitochondrial Proteome. FMP23 gene encodes a putative mitochondrial protein involved in iron-copper homoeostasis. It was observed to be induced in response to ATX1 deletion and high copper conditions.	119
407424	pfam17316	PET10	Petite colonies protein 10. This family of proteins found in yeast does not have a clear function but is predicted to be involved in lipid metabolism.	254
407425	pfam17317	MFA1_2	Mating hormone A-factor 1&2. The polypeptides encoded by the MFa1 and MFa2 genes are precursors of 36 and 38 amino acids, respectively. These mating pheromones secreted by S. cerevisiae a-cells, exhibit a single amino acid residue difference (the MFa1 gene product contains a valine instead of the leucine coded for by MFa2 at position 6 of the mature a-factor). The most significant feature of the primary a-factor gene products is the presence of a specific C-terminal motif, found in all known farnesylated proteins, representing a signal for modification of polypeptides with an isoprenoid group. In the case of both a-factor precursors, this specific sequence of amino acids is -CVIA. However, the general motif is referred to as a CAAX box, since the consensus sequence of amino acids present at the C-terminus of isoprenylated proteins consists of an invariable cysteine (C) residue followed by two aliphatic (A) amino acids and ending in a carboxyl-terminal residue of almost any (X) type. The specific CAAX sequence has also been shown to target the peptide for either farnesylation or geranylgeranylation.	34
407426	pfam17318	DUF5361	Family of unknown function (DUF5361). This is a family of unknown function found in Bacteria.	36
340035	pfam17319	DUF5362	Family of unknown function (DUF5362). This is a family of unknown function found in Bacteria. Most of the family members are predicted to have 2 trans-membrane regions.	94
407427	pfam17320	DUF5363	Family of unknown function (DUF5363). This is a family of unknown function found in Gammaproteobacteria.	54
407428	pfam17321	Vac17	Vacuole-related protein 17. Vac17 serves as an adaptor protein recruiting vacuole vesicles to the actin cable tracks by its dual interaction with Vac8 and the Myo2 motor protein. It is directly phosphorylated by Cdk1. Vac17 plays an important role in vacuole inheritance and segregation in cell division.	445
407429	pfam17322	DUF5364	Family of unknown function (DUF5364). This family of unknown function is found in Saccharomycetales.	185
407430	pfam17323	ToxS	Trans-membrane regulatory protein ToxS. Gram-negative bacteria such as Vibrio cholerae require the production of a number of virulence factors during infection. ToxS, a member of this domain family, is required for ToxR activity. The ToxR and ToxS regulatory proteins are considered to be at the root of the V. cholerae virulence regulon, called the ToxR regulon. ToxS serves as a mediator of ToxR function, perhaps by influencing its stability and/or capacity to dimerize, hence ToxS plays an important function in transcriptional activation of Vibrio cholerae virulence genes.	147
340040	pfam17324	BLI1	BLOC-1 interactor 1. In yeast BLOC-1 consists of six subunits localized to the endosomes. In the absence of BLOC-1 subunits, the balance between recycling and degradation of selected cargoes is impaired. This family contains BLI1 (BLOC-1 interactor 1) protein, a subunit of the BLOC-1 complex which mediates endosomal maturation.	111
407431	pfam17325	SPG4	Stationary phase protein 4. Saccharomyces cerevisiae responds to and copes with starvation by ceasing growth and entering a non-proliferating state referred to as stationary phase. Expression of SPG4 has been shown to be higher in stressed cells and stationary phase cells compared to active cells. It is not required for growth on non-fermentable carbon sources.	109
407432	pfam17326	DUF5365	Family of unknown function (DUF5365). This is a family of unknown function found in Bacillaceae.	116
407433	pfam17327	AHL_synthase	Acyl homoserine lactone synthase. Members of this family are involved in quorum sensing processes. In Gram-negative bacteria, N-acylhomoserine lactones (AHLs) act as signals. As the bacterial density increases, AHLs accumulate, and once they reach a critical level (quorum), they interact with cognate receptor proteins, which then affect target gene expression. Some AHLs are synthesized by LuxM (AHL synthase) and homologs (VanM and opaM). LuxM enzymes use S-adenosyl-methionine (SAM) as one of their two substrates and are capable of using either acyl-acyl-carrier-protein (acyl-ACP) or acyl-coenzyme A (acyl-CoA) as the other substrate. VanM, the LuxM homolog, produces two autoinducers, C6HSL and 3OC6HSL. Both autoinducers are detected by the VanN receptor. The autoinducer HAI-1 is synthesized by the cytoplasmic enzyme LuxM.	376
407434	pfam17328	DUF5366	Family of unknown function (DUF5366). This is a family of unknown function, found in Bacillales. Members of the family are predicted to have between 4 and 5 trans-membrane regions.	158
407435	pfam17329	DUF5367	Family of unknown function (DUF5367). This bacterial family of proteins of unknown function is predicted to contain 3 or 4 trans-membrane regions.	98
407436	pfam17330	SWC7	SWR1 chromatin-remodelling complex, subunit Swc7. The SWR1 complex is involved in chromatin-remodelling by promoting the ATP-dependent exchange of histone H2A for the H2A variant HTZ1 in Saccharomyces cerevisiae or H2AZ in mammals. The SWR1 chromatin-remodelling complex is composed of at least 14 subunits and has a molecular mass of about 1.2 to 1.5 MDa. In S. cerevisiae there are core conserved subunits (ATPase: Swr1; RuvB-like: Rvb1 and Rvb2; Actin: Act1; Actin-related: Arp4 and Arp6; YEATS protein: Yaf9) and non-conserved subunits (Vps71 (Swc6), Vps72 (Swc2), Swc3, Swc4, Swc5, Swc7, Bdf1). Seven of the SWR1 subunits are involved in maintaining complex integrity and H2AZ histone replacement activity: Swr1, Swc2, Swc3, Arp6, Swc5, Yaf9 and Swc6. Arp4 is required for the association of Bdf1, Yaf9, and Swc4 and Arp4 is also required for SWR1 H2AZ histone replacement activity in vitro. Furthermore the N-terminal region of the ATPase Swr1 provides the platform upon which Bdf1, Swc7, Arp4, Act1, Yaf9 and Swc4 associate. It also contains an additional H2AZ-H2B specific binding site, distinct from the binding site of the Swc2 subunit. In eukaryotes the deposition of variant histones into nucleosomes by the chromatin-remodelling complexes such as the SWR1 and INO80 complexes has many crucial functions including the control of gene regulation and expression, checkpoint regulation, DNA replication and repair, telomere maintenance and chromosomal segregation and as such represents a critical component of pathways that maintain genomic integrity. This entry represents the subunit Swc7, the smallest subunit of the SWR1 complex. Swc7 is not required for H2AZ binding. It associates with the N-terminus of Swr1, and the association of Bdf1 requires Swc7, Yaf9, and Arp4.	98
407437	pfam17331	GFD1	GFD1 mRNA transport factor. Following transcription, mRNA is processed, packaged into messenger ribonucleoprotein (mRNP) particles, and transported through nuclear pores (NPCs) to the cytoplasm. Gfd1 is one of several factors that, although not essential for mRNA export, enhances the efficiency of the process, either by facilitating integration of different steps in the gene expression pathway or by increasing the rate of key steps. Gfd1 localizes to the cytoplasm and nuclear rim. It interacts with a number of components of the mRNA export machinery in yeast. Most notably, Gfd1 interacts with the Dbp5-activating protein, Gle1, the cytoplasmic nucleoporin Nup42/Rip1, the putative RNA helicase, Dbp5, and a protein implicated in mRNA export, Zds1. Gfd1 forms a complex with Nab2 both in vitro and in vivo in which Gfd1 binds to the N-terminal domain of Nab2. The crystal structure, together with complementary NMR data, indicated that residues 126-150 of Gfd1 form a single alpha-helix that binds primarily to helix 2 of Nab2-N. Gfd1 functions to co-ordinate Dbp5 and Gle1 to facilitate the removal of Nab2 from mRNPs at the cytoplasmic face of nuclear pores.	22
407438	pfam17332	pXO2-11	Uncharacterized protein pXO2-11. This is a protein of unknown function found in Firmicutes and predicted to contain 2 trans-membrane regions.	89
375136	pfam17333	DEFB136	Beta-defensin 136. Beta-defensins are small cationic peptides that have triple-stranded beta-sheet structure. They are characterized by the presence of multiple cysteine residues (forming three distinctive intramolecular disulfide bridges) and a highly similar tertiary structure known as the defensin motif. All beta-defensin genes encode a precursor peptide that consists of a hydrophobic, leucine-rich signal sequence, a pro-sequence, and a mature six-cysteine defensin motif at the carboxy terminus. They exhibit broad-spectrum antimicrobial properties and contribute to mucosal immune responses at epithelial sites. Several beta-defensins family members have been shown to play essential roles in sperm maturation and fertility in rats, mice and humans. In addition to the wide spectrum of antimicrobial activity, mammalian beta-defensins have been reported to have other roles in the immune system, such as the chemotactic ability for immature dendritic cells and memory T-cells via chemokine receptor-6 demonstrated by human beta-defensin-2. This entry contains beta-defensins such as DEFB136, the mouse homolog Defb42, and Ostricacin-3.	51
407439	pfam17334	CsgA	Sigma-G-dependent sporulation-specific SASP protein. Curli are extracellular functional amyloids that are assembled by enteric bacteria during biofilm formation and host colonization. The csg (curli specific gene) operon encodes major structural and accessory proteins that are required for curli production. The csgBAC operon encodes the major and minor curli fiber components, CsgA and CsgB, respectively. CsgA is secreted to the extracellular milieu as an unfolded protein and forms amyloid polymers upon interacting with the CsgB nucleator. CsgA is comprised of five imperfect repeating units with highly conserved glutamine and asparagine residues that are important for amyloid formation. Each repeating unit is predicted to form a strand-loop-strand motif. In vitro, CsgC inhibits CsgA amyloid formation at substoichiometric concentrations and maintains CsgA in a non-beta-sheet rich conformation, making CsgC an efficient and selective amyloid inhibitor.	83
407440	pfam17335	IES5	Ino80 complex subunit 5. The INO80 chromatin remodeling complex is known to be related to DNA repair in yeast, mammals, and plants. In yeast, the INO80 complex is recruited to the DSBs (DNA double-strand breaks) through the direct interaction of its Nhp10 (non-histone protein 10) or Arp4 (Actin-Related Protein) subunits with phosphorylated histone H2A. However, the ortholog of yeast Nhp10 does not exist in mammals. The Nhp10 module consists of Nhp10, Ies1, Ies3, and Ies5. These yeast-specific subunits cross-link to the N-terminus of Ino80 and form a stable complex in-vitro, which helps high-affinity targeting of INO80 to nucleosome-binding.	110
407441	pfam17336	DUF5368	Family of unknown function (DUF5368). This is a family of unknown function found in Proteobacteria and predicted to contain 2 trans-membrane regions.	111
407442	pfam17337	Gal_GalNac_35kD	Galactose-inhibitable lectin 35 kDa subunit. The role of the cell surface D-galactose (Gal)/N-Acetyl-D-galactosamine (GalNAc) lectin in the adhesion process has been demonstrated in Entamoeba histolytica, a protozoan parasite that causes amebiasis in humans. The Gal/GalNAc lectin is a heterotrimeric protein complex. It is composed of a 260 kDa heterodimer of trans-membrane disulphide-linked heavy 170 kDa subunit and glycosylphosphatidylinositol (GPI)-anchored light 31 kDa/35 kDa subunits. The light subunits are non-covalently associated with an intermediate subunit of 150 kDa. Inhibition of expression of 35 kDa subunit of Gal/GalNAc lectin inhibits the cytotoxic and cytopathic activity of E. histolytica, but no decrease in adherence capacity to mammalian cells was evident. Interestingly, a carbohydrate-binding activity has been reported for the 35 kDa light subunit of the lectin molecules of the closely related Entamoeba invadens. This entry is related to the light subunit where this domain of unknown function is present. The light subunit consists of several polypeptide chains with considerable antigenic homology. The two light (31/35 kDa) subunits of the lectin are present in two isoforms: the 31 kDa isoform is glycosylphosphatidylinositol (GPI) anchored; and the 35 kDa isoform is more highly glycosylated.	225
375140	pfam17338	GP88	Gene 88 protein. This family of unknown function is found in Bacteria.	231
407443	pfam17339	DUF5369	Family of unknown function (DUF5369). This is a family of unknown function found in Chromadorea.	107
407444	pfam17340	DUF5370	Family of unknown function (DUF5370). This is a family of unknown function found in Bacillaceae.	63
340057	pfam17341	DUF5371	Family of unknown function (DUF5371). This is a family of unknown function found in Euryarchaeota.	65
407445	pfam17342	DUF5372	Family of unknown function (DUF5372). This family of unknown function is found in Bacteria.	78
407446	pfam17343	DUF5373	Family of unknown function (DUF5373). This family of unknown function is found in Caenorhabditis. Members of this family are predicted to contain 4 trans-membrane regions.	182
407447	pfam17344	DUF5374	Family of unknown function (DUF5374). This is a family of unknown function found in Pasteurellaceae.	40
340061	pfam17345	DUF5375	Family of unknown function (DUF5375). This is a family of unknown function found in Enterobacteriaceae.	106
375143	pfam17346	DUF5376	Family of unknown function (DUF5376). This is a family of unknown function found in Bacteria.	129
407448	pfam17347	DUF5377	Family of unknown function (DUF5377). This is a family of unknown function found in Pasteurellaceae.	96
407449	pfam17349	DUF5378	Family of unknown function (DUF5378). This is a family of unknown function which is found in Mycoplasmataceae. Family members are predicted to contain 7 trans-membrane regions.	282
340065	pfam17350	DUF5379	Family of unknown function (DUF5379). This family of unknown function is found in Methanobacteria and Methanococci. Family members are predicted to have 3 trans-membrane regions.	90
375145	pfam17351	DUF5380	Family of unknown function (DUF5380). This is a family of unknown function found in Rhabditida.	85
340067	pfam17352	MFS18	Male Flower Specific protein 18. This domain family is found on MFS18 protein from Maize. MFS18 mRNA accumulates in the glumes and in anther walls, paleas and lemmas of mature florets. It is particularly associated with the vascular bundle in the glumes and encodes a polypeptide of 12 kDa, rich in glycine, proline and serine that has similarities with other plant structural proteins. There is no known function of this domain family in Maize or other Poaceae.	97
407450	pfam17353	DUF5381	Family of unknown function (DUF5381). This is a family of unknown function found in Bacillales.	169
375147	pfam17354	DUF5382	Family of unknown function (DUF5382). This is a family of unknown function found in Caenorhabditis.	418
407451	pfam17355	DUF5383	Family of unknown function (DUF5383). This is a family of unknown function found in Bacillales. Members of this family are predicted to contain one trans-membrane region.	124
407452	pfam17356	PBSX_XtrA	Phage-like element PBSX protein XtrA. This is a family of unknown function found in Bacilli.	64
375150	pfam17357	FIT1_2	Facilitator Of Iron Transport 1 and 2. Fit proteins (facilitator of iron transport) found on the Saccharomyces cerevisiae cell wall are mannoproteins implicated in the transport of siderophore-bound iron. This domain family can be found in FIT1 and FIT2 proteins in Saccharomycetaceae. The FIT1-3 cell wall mannoproteins are attached to the beta-glucan layer through a GPI (glycosylphosphatidylinositol) anchor. They are very rich in serine and threonine residues (40-50 % serine and threonine) and bear several short repeats of 6-7 amino acids. The exact domain function is unknown.	86
407453	pfam17358	DUF5384	Family of unknown function (DUF5384). This is a family of unknown function found in Proteobacteria.	145
407454	pfam17359	DUF5385	Family of unknown function (DUF5385). This is a family of unknown function found in Mycoplasmataceae. Family members are predicted to have one trans-membrane region.	217
375152	pfam17360	DUF5386	Family of unknown function (DUF5386). This is a family of unknown function found in Chromadorea.	170
340076	pfam17361	DUF5387	Family of unknown function (DUF5387). This is a family of unknown function found in Strongyloides.	222
375153	pfam17362	pXO2-34	Family of unknown function. This is a family of unknown function found in Bacilli.	79
375154	pfam17363	DUF5388	Family of unknown function (DUF5388). This is a family of unknown function found in Lactobacillales.	70
407455	pfam17364	DUF5389	Family of unknown function (DUF5389). This is a family of unknown function found in Pasteurellaceae. Family members are predicted to have 3 trans-membrane regions.	104
375156	pfam17365	DUF5390	Family of unknown function (DUF5390). This is a family of unknown function found in Caenorhabditis.	141
375157	pfam17366	AGA2	A-agglutinin-binding subunit Aga2. The wall of Saccharomyces cerevisiae consists of mannoproteins, beta-glucans, and a small amount of chitin. Mannoproteins include Aga2p where this domain family is found. There are two main display systems for yeast, the agglutinin system and the flocculin system. The S. cerevisiae sexual agglutinins facilitate the mating between two types of cells, a and alpha. a-Agglutinin consists of two subunits, encoded by two unlinked genes, AGA1 and AGA2. The cell surface adhesion protein (Aga2), enhances agglutination between a and alpha cells. Optimal binding includes interactions of the alpha-agglutinin binding pocket with the Aga2p terminal carboxyl group. This O-mannosylated glycopeptide is doubly disulfide linked to Aga1p. The Aga2p half-cystines near the ends of the peptide are linked to two Aga1p Cys residues separated by only two residues. This closeness of the disulfide bonds stabilizes the alpha/beta structure in Aga2p.	58
407456	pfam17367	NiFe_hyd_3_EhaA	NiFe-hydrogenase-type-3 Eha complex subunit A. Energy-converting [NiFe] hydrogenases are membrane-bound enzymes with a six-subunit core: the large and small hydrogenase subunits, plus two hydrophilic proteins and two integral membrane proteins. Their large and small subunits show little sequence similarity to other [NiFe] hydrogenases, except for key conserved residues coordinating the active site and [FeS] cluster. Energy-converting [NiFe] hydrogenases function as ion pumps, catalyzing the reduction of ferredoxin with H2 driven by the proton-motive force or the sodium-ion-motive force. Eha and Ehb hydrogenases contain extra subunits in addition to those shared by other energy-converting [NiFe] hydrogenases (or [NiFe]-hydrogenase-3-type). Eha contains a 6[4Fe-4S] polyferredoxin, a 10[4Fe-4S] polyferredoxin, ten other predicted integral membrane proteins (EhaA, EhaB, EhaC, EhaD, EhaE, EhaF, EhaG, EhaI, EhaK, EhaL) and four hydrophobic subunits (EhaM, EhaR, EhaS, EhaT). Eha and Ehb catalyze the reduction of low-potential redox carriers (e.g. ferredoxins or polyferredoxins), which then might function as electron donors to oxidoreductases. Based on sequence similarity and genome context analysis, other organisms such as Methanopyrus kandleri, Methanocaldococcus jannaschii, and Methanothermobacter marburgensis also encode Eha-like [NiFe]-hydrogenase-3-type complexes and have very similar eha operon structure. This domain family can be found on the small membrane proteins that are predicted to be the EhaA trans-membrane subunits of multisubunit membrane-bound [NiFe]-hydrogenase Eha complexes.	94
407457	pfam17368	YwcE	Spore morphogenesis and germination protein YwcE. The ywcE gene codes for a holin-like protein that localizes to the cell and spore membranes. It is expressed at the onset of sporulation and transcription is repressed during growth by the transition-state regulator AbrB. YwcE is an 83-residue protein with three trans-membrane domains and a highly charged C-terminal tail. Moreover, YwcE has a dual start motif, which plays a role in the regulation of class I or class II holins. It is likely to have the N-terminus on the outside of the membrane and the C-terminus in the cytoplasm. This domain family is found in YwcE proteins in Bacilli.	85
407458	pfam17369	DUF5391	Family of unknown function (DUF5391). This is a family of unknown function found in Bacilli. Family members are predicted to have 4 trans-membrane regions.	135
375160	pfam17370	DUF5392	Family of unknown function (DUF5392). This is a family of unknown function found in Bacilli. Family members are predicted to have 2 trans-membrane regions.	139
407459	pfam17371	DUF5393	Family of unknown function (DUF5393). This is a family of unknown function found in Trypanosomatidae.	666
407460	pfam17372	DUF5394	Family of unknown function (DUF5394). This is a family of unknown function found in Rickettsiales.	205
407461	pfam17373	DUF5395	Family of unknown function (DUF5395). This is a family of unknown function found in Archaea and Bacteria.	81
340089	pfam17374	DUF5396	Family of unknown function (DUF5396). This is a family of unknown function found in Mycoplasma.	947
340090	pfam17375	DUF5397	Family of unknown function (DUF5397). This is a family of unknown function found in Proteobacteria.	64
375163	pfam17376	DUF5398	Family of unknown function (DUF5398). This is a family of unknown function found in Chlamydiales.	80
340092	pfam17377	DUF5399	Family of unknown function (DUF5399). This is a family of unknown function found in Chlamydiales.	134
407462	pfam17378	REC104	Meiotic recombination protein REC104. REC104 is one of several meiosis specific genes required for generating meiotic DSBs (double strand breaks). It is suggested that Rec102 and Rec104 directly promote DSB formation as part of a multiprotein complex with Spo11. Rec102 and Rec104 are mutually dependent for proper sub-cellular localization, and share a requirement for Spo11 and Ski8 for their recruitment to meiotic chromosomes. Moreover, Rec102 is required for Rec104 to accumulate to normal steady-state levels and to be properly phosphorylated. It is likely that Rec102 and Rec104 move freely in and out of the nucleus but are most stably sequestered there only when they can form a complex on chromosomes. This domain family is found on Rec104 proteins in yeast.	182
407463	pfam17379	DUF5400	Family of unknown function (DUF5400). This is a family of unknown function found in Methanobacteria and Methanococci. Members of this family are predicted to contain 4 trans-membrane regions.	100
375164	pfam17380	DUF5401	Family of unknown function (DUF5401). This is a family of unknown function found in Chromadorea.	722
375165	pfam17381	Svs_4_5_6	Seminal vesicle secretory protein 4/5/6. There are seven major proteins involved in murine seminal vesicle secretion (SVS1-7). Mouse Svs2-Svs6 genes evolved by gene duplication and belong to the same gene family. This domain family is found in SVS4/5 and 6. SVS4 is a basic, thermostable, secretory protein synthesized by rat seminal vesicle epithelium under strict androgen transcriptional control. This protein has potent nonspecies-specific immunomodulatory, anti-inflammatory, and pro-coagulant activities that have been shown to be located in the N-terminal region of Svs4 (fragment 1-70). The N-terminal segment has a high amino-acid sequence similarity with the C-terminal segment 34-66 of uteroglobin, a rabbit steroid-inducible, cytokine-like, multifunctional, secreted protein. Furthermore, SVS4 acts as a sperm capacitation inhibitor, by interacting with SVS3 and SVS2.	91
340097	pfam17382	ycf70	Uncharacterized protein ycf70. This is a family of unknown function found in Poaceae.	89
375166	pfam17383	kleA_kleC	Uncharacterized KorC regulated protein A. This is a family of unknown function found in Proteobacteria.	76
407464	pfam17384	DUF150_C	RimP C-terminal SH3 domain. This family represents the C-terminal domain from RimP.	70
407465	pfam17385	LBP_M	Lacto-N-biose phosphorylase central domain. The gene which codes for this protein in gut-bacteria is located in a novel putative operon for galactose metabolism. The protein appears to be a carbohydrate-processing phosphorolytic enzyme (EC:2.4.1.211), unlike either glycoside hydrolases or glycoside lyase. Intestinal colonisation by bifidobacteria is important for human health, especially in pediatrics, because colonisation seems to prevent infection by some pathogenic bacteria that cause diarrhoea or other illnesses. The operon seems to be involved in intestinal colonisation by bifidobacteria mediated by metabolism of mucin sugars. In addition, it may also resolve the question of the nature of the bifidus factor in human milk as the lacto-N-biose structure found in milk oligosaccharides.	221
407466	pfam17386	LBP_C	Lacto-N-biose phosphorylase C-terminal domain. The gene which codes for this protein in gut-bacteria is located in a novel putative operon for galactose metabolism. The protein appears to be a carbohydrate-processing phosphorolytic enzyme (EC:2.4.1.211), unlike either glycoside hydrolases or glycoside lyase. Intestinal colonisation by bifidobacteria is important for human health, especially in pediatrics, because colonisation seems to prevent infection by some pathogenic bacteria that cause diarrhoea or other illnesses. The operon seems to be involved in intestinal colonisation by bifidobacteria mediated by metabolism of mucin sugars. In addition, it may also resolve the question of the nature of the bifidus factor in human milk as the lacto-N-biose structure found in milk oligosaccharides.	53
407467	pfam17387	Glyco_hydro_59M	Glycosyl hydrolase family 59 central domain. 	116
407468	pfam17388	GP24_25	Tail assembly protein Gp24 and Gp25. Bacteriophages (viruses of bacteria) use a specialized organelle called a tail to deliver their genetic material and proteins across the cell envelope during infection. In phages the most complex part of these contractile injection systems, the base-plate, is responsible for coordinating host recognition or other environmental signals with sheath contraction. In T4 phage, 15 different proteins encoded by Gene Products (Gps), make up the base-plate and proximal region of the tail tube. The base-plate is divided into inner, intermediate and peripheral regions. Gp25 is located in the inner region of the base-plate. It interacts with Gp53 connecting the core bundle to the central hub and the tube, stabilizing the entire assembly. Gp25 has a structurally conserved loop (residues 47-49), mediating the interaction between LysM (residues 46-82 in Gp53) and the core bundle. Orthologues of Gp25 contain an EPR motif (Glu-Pro-Arg, residues 85-87 of Gp25), which interacts with the core bundle and points towards the region of the Gp27-Gp48 interface. In summary, Gp25 plays a critical role in sheath assembly and contraction. This domain family is found on Gp24 and Gp25 Mycobacterium phages.	132
407469	pfam17389	Bac_rhamnosid6H	Bacterial alpha-L-rhamnosidase 6 hairpin glycosidase domain. This family consists of bacterial rhamnosidase A and B enzymes. L-Rhamnose is abundant in biomass as a common constituent of glycolipids and glycosides, such as plant pigments, pectic polysaccharides, gums or biosurfactants. Some rhamnosides are important bioactive compounds. For example, terpenyl glycosides, the glycosidic precursor of aromatic terpenoids, act as important flavouring substances in grapes. Other rhamnosides act as cytotoxic rhamnosylated terpenoids, as signal substances in plants or play a role in the antigenicity of pathogenic bacteria.	340
379972	pfam17390	Bac_rhamnosid_C	Bacterial alpha-L-rhamnosidase C-terminal domain. This family consists of bacterial rhamnosidase A and B enzymes. L-Rhamnose is abundant in biomass as a common constituent of glycolipids and glycosides, such as plant pigments, pectic polysaccharides, gums or biosurfactants. Some rhamnosides are important bioactive compounds. For example, terpenyl glycosides, the glycosidic precursor of aromatic terpenoids, act as important flavouring substances in grapes. Other rhamnosides act as cytotoxic rhamnosylated terpenoids, as signal substances in plants or play a role in the antigenicity of pathogenic bacteria.	78
407470	pfam17391	Urocanase_N	Urocanase N-terminal domain. 	127
407471	pfam17392	Urocanase_C	Urocanase C-terminal domain. 	196
340108	pfam17393	DUF5402	Family of unknown function (DUF5402). This is a family of unknown function found in Methanobacteria and Methanococci.	119
340109	pfam17394	KleE	Uncharacterized KleE stable inheritance protein. This domain family of unknown function is found in Proteobacteria. Family members are predicted to contain two trans-membrane regions.	108
407472	pfam17395	DUF5403	Family of unknown function (DUF5403). This is a family of unknown function found in Actinobacteria.	96
407473	pfam17396	DUF1611_N	Domain of unknown function (DUF1611_N) Rossmann-like domain. 	93
340112	pfam17397	DUF5404	Family of unknown function (DUF5404). This is a family of unknown function found in Chordata. This domain is located downstream of the N-terminal of Fip1 (pfam05182). The Tsx gene resides at the X-inactivation centre and was once thought to encode a protein expressed in testis. However, this was disputed upon further analysis. ORF and immunostaining analysis concluded that Tsx may be non-coding. The Tsx long transcript is abundantly expressed in meiotic germ cells, embryonic stem cells, and brain. In vertebrates, Fip1 is the evolutionary precursor of eutherian Tsx, hence its location upstream from the Tsx gene.	145
407474	pfam17398	NolB	Nodulation protein NolB. This domain family of unknown function is found in Rhizobiales. Family members are involved in Nodulation (nodule development in plants).	151
407475	pfam17399	DUF5405	Domain of unknown function (DUF5405). This domain family is found in Enterobacteriaceae. This protein may have a phage origin being found in bacteriophage P2. The majority of proteins have a conserved cysteine residue close to their C-terminus which may have functional significance.	94
407476	pfam17400	DUF5406	Family of unknown function (DUF5406). This is a family of unknown function found in Bacteria.	114
340116	pfam17401	DUF5407	Family of unknown function (DUF5407). This is a family of unknown function found in Chlamydiales.	74
340117	pfam17402	DUF5408	Family of unknown function (DUF5408). This is a family of unknown function found in Helicobacteraceae. Family members are predicted to contain one trans-membrane region.	63
407477	pfam17403	Nrap_D2	Nrap protein PAP/OAS-like domain. Members of this family are nucleolar RNA-associated proteins (Nrap) which are highly conserved from yeast (Saccharomyces cerevisiae) to human. In the mouse, Nrap is ubiquitously expressed and is specifically localized in the nucleolus. Nrap is a large nucleolar protein (of more than 1000 amino acids). Nrap appears to be associated with ribosome biogenesis by interacting with pre-rRNA primary transcript.	148
407478	pfam17404	Nrap_D3	Nrap protein domain 3. Members of this family are nucleolar RNA-associated proteins (Nrap) which are highly conserved from yeast (Saccharomyces cerevisiae) to human. In the mouse, Nrap is ubiquitously expressed and is specifically localized in the nucleolus. Nrap is a large nucleolar protein (of more than 1000 amino acids). Nrap appears to be associated with ribosome biogenesis by interacting with pre-rRNA primary transcript.	160
407479	pfam17405	Nrap_D4	Nrap protein nucleotidyltransferase domain 4. Members of this family are nucleolar RNA-associated proteins (Nrap) which are highly conserved from yeast (Saccharomyces cerevisiae) to human. In the mouse, Nrap is ubiquitously expressed and is specifically localized in the nucleolus. Nrap is a large nucleolar protein (of more than 1000 amino acids). Nrap appears to be associated with ribosome biogenesis by interacting with pre-rRNA primary transcript.	201
407480	pfam17406	Nrap_D5	Nrap protein PAP/OAS1-like domain 5. Members of this family are nucleolar RNA-associated proteins (Nrap) which are highly conserved from yeast (Saccharomyces cerevisiae) to human. In the mouse, Nrap is ubiquitously expressed and is specifically localized in the nucleolus. Nrap is a large nucleolar protein (of more than 1000 amino acids). Nrap appears to be associated with ribosome biogenesis by interacting with pre-rRNA primary transcript.	158
407481	pfam17407	Nrap_D6	Nrap protein domain 6. Members of this family are nucleolar RNA-associated proteins (Nrap) which are highly conserved from yeast (Saccharomyces cerevisiae) to human. In the mouse, Nrap is ubiquitously expressed and is specifically localized in the nucleolus. Nrap is a large nucleolar protein (of more than 1000 amino acids). Nrap appears to be associated with ribosome biogenesis by interacting with pre-rRNA primary transcript.	128
407482	pfam17408	MCD_N	Malonyl-CoA decarboxylase N-terminal domain. This family consists of several eukaryotic malonyl-CoA decarboxylase (MLYCD) proteins. Malonyl-CoA, in addition to being an intermediate in the de novo synthesis of fatty acids, is an inhibitor of carnitine palmitoyltransferase I, the enzyme that regulates the transfer of long-chain fatty acyl-CoA into mitochondria, where they are oxidized. After exercise, malonyl-CoA decarboxylase participates with acetyl-CoA carboxylase in regulating the concentration of malonyl-CoA in liver and adipose tissue, as well as in muscle. Malonyl-CoA decarboxylase is regulated by AMP-activated protein kinase (AMPK).	85
407483	pfam17409	MoaF_C	MoaF C-terminal domain. MoaF protein is essential for the production of the monoamine-inducible 30kDa protein in Klebsiella. It is necessary for reconstituting organoautotrophic growth in Ralstonia eutropha. It is conserved in Proteobacteria and some lower eukaryotes. The operon regulating the Moa genes is responsible for molybdenum cofactor biosynthesis. This entry corresponds to the C-terminal domain.	113
407484	pfam17410	Stevor	Subtelomeric Variable Open Reading frame. The parasite protein STEVOR (Subtelomeric Variable Open Reading frame) is an erythrocyte-binding protein recognizing Glycophorin C on the red blood cell (RBC) surface. The cytoplasmic domain of STEVOR is shown to interact with the ankyrin complex at the erythrocyte skeleton. It is phosphorylated by protein kinase A (PKA) at a specific serine residue (S324). The N-terminal semi-conserved region of Stevor that is present in this domain is shown to specifically bind to a chymotrypsin-resistant RBC receptor. The expression of STEVOR in multiple parasite stages including merozoites suggests that STEVOR mediates multiple distinct functions in the parasitic infectious cycle.	275
407485	pfam17411	SmaI	Type II site-specific deoxyribonuclease. Family members of this domain are Type II site-specific deoxyribonuclease EC=3.1.21.4. The endonuclease SmaI recognizes and cleaves the sequence CCCGGG on DNA, yielding a blunt-end scission. It has been used for the diagnosis of neurogenic muscle weakness, ataxia and retinitis pigmentosa disease or Leigh's disease. Due to its specificity in recognizing the cleavage site, it is used in Leigh's disease to specifically eliminate the mutant mitochondrial DNA (mtDNA), which coexists with the wild-type mtDNA (heteroplasmy). Only the mutant mtDNA, but not the wild-type mtDNA, is selectively restricted by the enzyme. By delivering the SmaI gene fused to a mitochondrial targeting sequence, specific elimination of the mutant mtDNA was demonstrated, resulting in restoration of both the normal intracellular ATP level and normal mitochondrial membrane potential. The same strategy has also been demonstrated in retinitis pigmentosa (NARP), where a mutant mitochondrial DNA carrying a T8993G transversion has been targeted by using SmaI enzymes.	241
407486	pfam17412	VraX	Family of unknown function. This domain family is found in VraX proteins from Staphylococcus aureus. The vraX gene belongs to the vra operon together with the vraA gene encoding a long chain fatty acid-CoA ligase, which is up-regulated in VISA (vancomycin-intermediate S. aureus). The gene product, a 55-amino-acid protein, is upregulated in the stress response to cell wall-active antibiotics and other surface-interactive molecules. VraX harbors a putative phosphorylation site, and could therefore be involved in regulatory processes within the cell. However, no exact function has been demonstrated.	55
340128	pfam17413	VirB7	Outer membrane lipoprotein virB7. The type IV secretion systems (T4SSs) are ancestrally related to bacterial conjugation machines and are able to translocate proteins and/or protein-DNA complexes to the extracellular milieu or the host interior, in many cases contributing to the ability of the bacterial pathogen to colonize the host and evade its immune system. In the plant pathogen Agrobacterium tumefaciens, the T4SS allows the bacterium to transfer a segment of its tumor inducing (Ti-) plasmid DNA into plant cells causing crown gall tumor disease. Proteins in the virB and virD operons catalyze processing of the T-DNA and its transfer to plants. The VirB proteins assemble a secretion apparatus spanning both bacterial membranes to allow transfer of DNA and protein substrates into plant cells. VirB7 and VirB8, along with VirB6, VirB9 and VirB10, are the core components of the Agrobacterium DNA translocation apparatus. Structural studies with the Escherichia coli plasmid pKM101 VirB homologs showed that three proteins, TraN (VirB7 homolog), TraO (VirB9) and TraF (VirB10), form a hetero-tetradecameric structure with 14-fold symmetry forming an outer membrane channel through which the substrates pass. VirB7 stabilizes VirB9, and in its absence bacteria do not accumulate VirB9, preventing assembly of the secretion machine. Members of the VirB7 family are typically 45-65 residues long, becoming 15-20 residues shorter after removal of the N-terminal signal sequence and covalent attachment to lipid molecules.	35
407487	pfam17414	MatP_C	MatP C-terminal ribbon-helix-helix domain. This family, many of whose members are YcbG, organizes the macrodomain Ter of the chromosome of bacteria such as E. coli. In these bacteria, insulated macrodomains influence the segregation of sister chromatids and the mobility of chromosomal DNA. Organisation of the Terminus region (Ter) into a macrodomain relies on the presence of a 13 bp motif called matS repeated 23 times in the 800-kb-long domain. MatS sites are the main targets in the E. coli chromosome of YcbG or MatP (macrodomain Ter protein). MatP accumulates in the cell as a discrete focus that co-localizes with the Ter macrodomain. The effects of MatP inactivation reveal its role as the main organizer of the Ter macrodomain: in the absence of MatP, DNA is less compacted, the mobility of markers is increased, and segregation of the Ter macrodomain occurs early in the cell cycle. A specific organisational system is required in the Terminus region for bacterial chromosome management during the cell cycle. This entry represents the C-terminal ribbon-helix-helix domain.	60
407488	pfam17415	NigD_C	NigD-like C-terminal beta sandwich domain. This family of proteins is functionally uncharacterized. This family of proteins is found in Bacteroides species. Proteins in this family are typically between 234 and 260 amino acids in length. These proteins possess an N-terminal lipoprotein attachment site. The family includes NigD a protein found in the Nig operon that encodes a bacteriocin called nigrescin. It has been suggested that NigD may be the immunity protein for nigrescin (NigC) because it is directly downstream. This entry represents the C-terminal beta-sandwich domain of NigD.	120
407489	pfam17416	Glycoprot_B_PH1	Herpesvirus Glycoprotein B. This domain has a PH-like fold.	210
407490	pfam17417	Glycoprot_B_PH2	Herpesvirus Glycoprotein B PH-like domain. This domain corresponds to the second PH-like domain in herpesvirus glycoprotein B.	97
407491	pfam17418	SdpA	Sporulation delaying protein SdpA. Spore formation by the bacterium Bacillus subtilis is an elaborate developmental process that is triggered by nutrient limitation. Cells that have entered the pathway to sporulate produce and export a killing factor and a signaling protein that act cooperatively to block sister cells from sporulating and to cause them to lyse. The sporulating cells feed on the nutrients thereby released, which allows them to keep growing rather than to complete morphogenesis. Entry into sporulation is governed by the regulatory protein Spo0A (master regulator of sporulation). Upon Spo0A phosphorylation, it represses the expression of abrB, a negative regulator of skfABCEFGH and sdpAB, leading to the transcriptional activation of sdpAB operon. The production of SdpAB is essential for the SDP toxin. SDP is a 42-amino-acid, ribosomally synthesized AMP which contains a disulfide bond between two cysteine residues located at the N-terminus. SDP acts by rapidly collapsing the proton motive force thereby inducing autolysin mediated lysis on neighboring species and non-biofilm producing B. subtilis cells (which do not produce SdpI) to respond by moving away, while autolysis would release nutrients that can be readily used to promote biofilm growth. SdpAB proteins are required to produce SDP from SdpC33-203. This domain family is found in SdpA proteins, which are predicted to be 158-amino-acid proteins and are suggested to be primarily cytoplasmic.	142
407492	pfam17419	MauJ	Methylamine utilization protein MauJ. This domain family is found in MauJ proteins. The exact function of the MauJ proteins is unknown but thought to be involved in methylamine utilization. MauJ is predicted to be a cytoplasmic protein.	282
340135	pfam17420	Gp17	Superinfection exclusion protein, bacteriophage P22. Bacteriophages infect host cells by injecting their genome through the cell wall. To this end, tailed bacteriophages have evolved complex tail machines that extend from a unique capsid vertex, providing both an attachment point to the host surface, and a channel for genome-ejection through the cell envelope. Family members of this domain are putative gp17 proteins involved in the genome delivery tail machine in Enterobacteria phage P22 and Salmonella phage ViI. Gp17 found in other bacteriophages such as SPP1 (siphophage SPP1, a lytic Bacillus subtilis phage) has been identified as a tail completion protein adopting an alpha/beta fold, and found to be located at the interface between the head-to-tail connector and the tail of bacteriophage SPP1.	98
340136	pfam17421	DUF5409	Family of unknown function (DUF5409). This domain of unknown function is found in Poxviridae.	88
375184	pfam17422	DUF5410	Family of unknown function (DUF5410). This is a family of unknown function found in Rickettsia.	353
407493	pfam17423	SwrA	Swarming motility protein. This domain family is found in Bacillus. Members of this family are SwrA proteins involved in swarming motility (a multicellular movement of hyper-flagellated cells on a surface). SwrA is a key transcription factor facilitating this cascade. It acts synergistically with DegU to drive the fla/che operon encoding flagella components, chemotaxis constituents and the alternative sigma factor sigmaD, which is regarded as the primary event in the development of motility. LonA protease of Bacillus subtilis inhibits SwrA by proteolytically restricting its accumulation. SwrA does not contain any known DNA binding domain, and it has been shown to interact with the N-terminal domain of DegU. Anecdotally, in most laboratory strains, e.g. 168, the swrA coding sequence contains a nucleotide insertion that prematurely interrupts its reading frame, causing a non-swarming phenotype strain.	116
407494	pfam17424	DUF5411	Family of unknown function (DUF5411). This is a family of unknown function found in Bacteria.	134
407495	pfam17425	Arylsulfotran_N	Arylsulfotransferase Ig-like domain. This family consists of several bacterial Arylsulfotransferase proteins. Arylsulfotransferase (ASST) transfers a sulfate group from phenolic sulfate esters to a phenolic acceptor substrate. This domain has an Ig-like fold.	89
407496	pfam17426	Putative_G5P	Putative Gamma DNA binding protein G5P. This domain family is found in Gammaproteobacterial proteins. Members of the family are predicted to be G5P DNA binding proteins. Homologous proteins are found in pfam02303.	107
340142	pfam17427	Phi29_Phage_SSB	Phage Single-stranded DNA-binding protein. DNA replication of phi29 and related phages takes place via a strand displacement mechanism, a process that generates large amounts of single-stranded DNA (ssDNA). Consequently, phage-encoded ssDNA-binding proteins (SSBs) are essential proteins during phage phi29-like DNA replication. Single-stranded DNA-binding proteins (SSBs) destabilize double-stranded DNA (dsDNA) and bind without sequence specificity, but selectively and cooperatively, to single-stranded DNA (ssDNA) conferring a regular structure to it, which is recognized and exploited by a variety of enzymes involved in DNA replication, repair and recombination. Phage phi29 protein p5 is the SSB protein active during phi29 DNA replication. It protects ssDNA against nuclease degradation and greatly stimulates dNTP incorporation during the phi29 DNA replication process. Binding of the SSB to ssDNA prevents non-productive binding of the viral DNA polymerase to ssDNA, and allows the release of DNA polymerase molecules that are already titrated by the ssDNA. This effect would be of particular importance in phi29-like DNA replication systems, where large amounts of ssDNA are generated and SSB binding to ssDNA could favor efficient re-usage of templates. This domain family is found in SSB proteins in phage phi-29; homologs are found in pfam00436.	123
407497	pfam17428	DUF5412	Family of unknown function (DUF5412). This is a family of unknown function found in Bacteria. Members of this family have one or two predicted trans-membrane regions.	118
407498	pfam17429	GP70	Gene 70 protein. This family of unknown function is found in Mycobacterium phage and Actinobacteria.	54
407499	pfam17431	ypmT	Uncharacterized ypmT. This is a family of unknown function found in Bacillus.	62
407500	pfam17432	DUF3458_C	Domain of unknown function (DUF3458_C) ARM repeats. This presumed domain is functionally uncharacterized. This domain is found in bacteria, archaea and eukaryotes.	320
407501	pfam17433	Glyco_hydro_49N	Glycosyl hydrolase family 49 N-terminal Ig-like domain. Family of dextranase (EC 3.2.1.11) and isopullulanase (EC 3.2.1.57). Dextranase hydrolyzes alpha-1,6-glycosidic bonds in dextran polymers. This domain corresponds to the N-terminal Ig-like fold.	186
340149	pfam17434	DUF5413	Family of unknown function (DUF5413). This is a family of unknown function found in Bradyrhizobiaceae. Family members contain 3 or 4 predicted trans-membrane regions.	133
407502	pfam17435	DUF5414	Family of unknown function (DUF5414). This is a family of unknown function found in Chlamydiales. Family members have a known structure.	183
340151	pfam17436	DUF5415	Family of unknown function (DUF5415). This is a family of unknown function found in Enterococcus.	66
407503	pfam17437	DUF5416	Family of unknown function (DUF5416). This is a family of unknown function found in Campylobacteria.	173
407504	pfam17438	DUF5417	Family of unknown function (DUF5417). This is a family of unknown function found in Proteobacteria.	91
340154	pfam17439	DUF5418	Family of unknown function (DUF5418). This is a family of unknown function found in Methanocaldococcus jannaschii. Family members have three predicted trans-membrane regions.	151
407505	pfam17440	Thiol_cytolys_C	Thiol-activated cytolysin beta sandwich domain. This domain has an immunoglobulin-like fold. It is found at the C-terminus of the thiol-activated cytolysin protein.	102
375192	pfam17441	DUF5419	Family of unknown function (DUF5419). This is a family of unknown function found in Rhodopseudomonas.	54
340157	pfam17442	U62_UL91	Functional domain of U62 and UL91 proteins. Human herpesvirus 6A (HHV-6A) and HHV-6B are classified as roseoloviruses and are highly prevalent in the human population. Roseolovirus reactivation in an immunocompromised host can cause severe pathologies. Human cytomegalovirus (HCMV) is responsible for significant diseases in the developing fetus as well as in an immunocompromised host. During their productive cycle, herpesviruses have a regulated temporal cascade of gene expression that can be divided into three general stages: immediate-early (IE), early (E), and late (L). Following viral DNA replication, late viral genes that mainly encode structural proteins start to be transcribed, ultimately leading to the assembly and release of infectious particles. This domain family is found in Human herpesvirus 6A and 6B (HHV-6A/B) as well as HCMV. Family members are shown to be involved in late gene expression such as UL91 in Human Cytomegalovirus. This functional domain is located at the N-terminus (amino acids 1-71) of full-length UL91. It has been found to suffice for transcriptional activation of true-late genes within the nucleus of infected cells. In other words, UL91 is fully functional as a 71-aa N-terminal polypeptide, and this small 71-aa polypeptide contains all protein-protein interaction motifs crucial to mediate transcriptional activation.	65
340158	pfam17443	pXO2-72	Uncharacterized protein pXO2-72. This is a family of unknown function found in Bacilli.	62
340159	pfam17444	yhdX	Uncharacterized protein YhdX. This is a family of unknown function found in Bacillus.	33
340160	pfam17445	Mfa1	Mating factor A1. Many pathogenic fungi undergo morphological changes in order to infect their hosts. The Ustilago maydis pathogenic cycle starts when two mating compatible haploid yeast cells recognize each other via a pheromone-receptor system which is encoded by two sets of genes a and b. The a locus (a1 and a2) controls the cell fusion by encoding an intercellular recognition system consisting of precursors (mfa1 and mfa2) and receptors (pra1 and pra2) of lipopeptide pheromones. The open reading frame codes for a 42-amino acid precursor, which is processed to a shorter peptide of 13 amino acids. The terminal CAAX motif is typical of farnesylated fungal pheromones, in which the last three amino acids are removed during farnesylation of the cysteine residue. This terminal cysteine is known to be O-methylated in several fungal pheromones. Mating leads to the formation of a dikaryon filament, whose apical tip differentiates into a specialized structure for plant penetration known as the appressorium. Once inside the plant, U. maydis proliferates, inducing the formation of tumors and eventually develops into diploid spores. This mating process requires cross-talk between cAMP and mitogen-activated protein kinase (MAPK) signaling. Upstream regulation of a locus has been demonstrated where Hos2 (Histone deacetylases (HDACs) plant homolog) directly regulates the expression of U. maydis mating-type genes downstream of the cAMP-PKA pathway. Furthermore, pheromone recognition blocks cell cycle progression in U. maydis cells in order to prepare mating partners for conjugation where cells undergo arrest in G2 phase. This entry relates to the domain found in Mfa1 proteins in Ustilago maydis and U. hordei.	43
407506	pfam17446	ltuA	Late transcription unit A protein. This is a domain of unknown function found in Chlamydia.	46
407507	pfam17447	ykpC	Uncharacterized protein YkpC. This is a family of unknown function found in Bacillus.	42
340163	pfam17448	yqaH	Uncharacterized protein YqaH. This is a family of unknown function found in Bacillus.	88
340164	pfam17449	yrzK	Uncharacterized protein YrzK. This is a family of unknown function found in Bacillus.	54
407508	pfam17450	Melibiase_2_C	Alpha galactosidase A C-terminal beta sandwich domain. 	86
407509	pfam17451	Glyco_hyd_101C	Glycosyl hydrolase 101 beta sandwich domain. Virulence of pathogenic organisms such as the Gram-positive Streptococcus pneumoniae is largely determined by the ability to degrade host glycoproteins and to metabolize the resultant carbohydrates. This family is the enzymatic region, EC:3.2.1.97, of the cell surface proteins that specifically cleave Gal-beta-1,3-GalNAc-alpha-Ser/Thr (T-antigen, galacto-N-biose), the core 1 type O-linked glycan common to mucin glycoproteins. This reaction is exemplified by a S. pneumoniae protein, where Asp764 is the catalytic nucleophile-base and Glu796 the catalytic proton donor. This domain represents the C-terminal beta sandwich domain.	111
340167	pfam17452	YnfE	Uncharacterized protein YnfE. This is a family of unknown function found in Bacillus.	78
340168	pfam17453	YhdK	Sigma-M inhibitor protein. This is a domain of unknown function found in Sigma M inhibitor proteins YhdK. In Bacillus subtilis, the sigM (yhdM) gene is required for growth and survival after salt stress. Expression of sigM is positively autoregulated and is controlled by growth phase and medium composition. SigM-dependent transcription is regulated by the products of both the yhdL and the yhdK genes, which are co-transcribed with the sigM gene. The small hydrophobic protein YhdK appears to interact with the trans-membrane domain of YhdL, suggesting some specific role for YhdK in the anti-sigma function of YhdL.	96
375196	pfam17454	Bee_toxin	Honey bee toxin. Bee venom contains a variety of peptides such as melittin, apamin, adolapin and mast cell degranulating peptide. Bee venom has been used in the treatment of major neurodegenerative disorders, including Alzheimer's Disease, Parkinson's Disease, Epilepsy, Multiple Sclerosis and Amyotrophic Lateral Sclerosis. Secondary structure analysis of apamin, mast cell degranulating peptide, tertiapin and secapin have been studied. The predicted structure for mast cell degranulating peptide is almost spherical with the eight positive centers evenly distributed over the surface. It has also been suggested that these four peptides share a common folding pattern, which is centred on a beta-turn covalently linked to an alpha-helical segment by two disulphide links. It is further suggested that apamin, mast cell degranulating peptide and tertiapin form a single molecular class. This domain family is found in apamin, mast cell degranulating peptide and tertiapin. Apamin, the most widely studied member of this family has been shown to be a selective blocker of small-conductance Ca2+-activated K+ (KCa2.X or SK) channels.	49
407510	pfam17455	LtuB	Late transcription unit B protein. This is a family of unknown function which is specific to Chlamydia late transcription unit B protein.	79
340171	pfam17456	TcpS	Toxin-coregulated pilus protein S. The toxin-coregulated pilus (TCP) and cholera toxin (CT) are two main virulence factors produced by V. cholerae, which allow the bacterium to colonize and establish an infection in a host and to cause the physical symptoms of the disease, respectively. Increased expression of the TCP, a type IV pilus expressed by the tcp operon (tcpABQCRDSTEF) located on the Vibrio pathogenicity island (VPI), has been associated with enhanced attachment and is essential for colonization of the intestinal epithelium. This domain of unknown function is found in TcpS proteins in Vibrionaceae such as Vibrio cholerae.	152
407511	pfam17457	DUF5420	Family of unknown function (DUF5420). This is a domain of unknown function found in Gammaproteobacteria such as Haemophilus influenzae.	185
340173	pfam17458	DUF5421	Family of unknown function (DUF5421). This is a domain of unknown function found in Chlamydia.	284
340174	pfam17459	DUF5422	Family of unknown function (DUF5422). This is a family of unknown function found in Chlamydia. Members of this family have 1-4 predicted trans-membrane regions.	153
340175	pfam17460	RP854	Uncharacterized protein RP854. This is a family of unknown function found in Rickettsia. Members of this family are predicted to have one trans-membrane region.	212
375197	pfam17461	DUF5423	Family of unknown function (DUF5423). This is a domain of unknown function found in Chlamydia. Family members have 4 predicted trans-membrane regions.	348
340177	pfam17462	DUF5424	Family of unknown function (DUF5424). This is a family of unknown function specific to Rickettsia amblyommii.	175
340178	pfam17463	Gp79	Gene Product 79. This is a domain of unknown function found in Mycobacterium phage. Family members include the full Gp79 protein found in Mycobacteriophage L5. Mycobacteriophage L5 is a phage isolated from Mycobacterium smegmatis. It forms stable lysogens in M. smegmatis and has a broad host range among the pathogenic mycobacteria. L5 encodes gene products (gp) toxic to the host M. smegmatis. Expression of gp79 interferes with the cell membrane or cell-wall synthesis of M. smegmatis, leading to altered cell morphology. It also has a bactericidal effect on E. coli. The N-terminal segment of gp79 (amino acids 1-41) shares sequence similarity with the signal peptide of the D-alanyl-D-alanine carboxypeptidase of Bacillus licheniformis. This enzyme removes C-terminal D-alanyl residues from sugar-peptide cell-wall precursors and is also a penicillin-binding protein (PBP). The homology of the hydrophobic N-terminal part of gp79 to a PBP signal peptide may indicate an interaction of gp79 with proteins or metabolites involved in the peptidoglycan synthesis of M. smegmatis.	51
340179	pfam17464	Pns11_12	Non-structural protein 11 and 12. This is a domain of unknown function found in Phytoreovirus. Family members include the Rice dwarf virus Pns11 and Pns12. Rice dwarf virus (RDV) is an icosahedral, double-layered particle. The viral genome consists of 12 segmented dsRNAs that encode seven structural (P1, P2, P3, P5, P7, P8 and P9) and five non-structural (Pns4, Pns6, Pns10, Pns11 and Pns12) proteins. Pns11 is known to bind nucleic acids and Pns12 is a phosphorylated protein. The non-structural proteins Pns6, Pns11 and Pns12 of RDV are the major constituents of the matrix of viral inclusions in which the assembly of progeny virions and the synthesis of viral RNA are thought to occur.	205
340180	pfam17465	Putative_CCL4	Chemokine-like protein, HHV-6 U83 gene product. Human herpesvirus 6A (HHV-6A) and HHV-6B are classified as roseoloviruses and are highly prevalent in the human population. Roseolovirus reactivation in an immunocompromised host can cause severe pathologies. HHV6 A/B encode two putative chemokine receptors and a chemokine-like protein. The HHV6 U83 gene encodes a CC chemokine, which functions as a highly selective and efficacious agonist for the human CCR2 receptor both in respect of signal transduction and the ability to induce chemotaxis. Homologs of the U83 gene products are found in Human cytomegalovirus encoded chemokines vCXC1 and vCXC2. HHV-6 CCL4 contains a region with the CC/CX3C chemokine motif and a glycosaminoglycan (GAG)-binding epitope, BBXB (B being a basic residue), found right before the third Cys residue, which very likely forms a disulfide bridge back to the first Cys of the protein. This gene is the only HHV-6A/B divergent gene that is specific for these viruses. The U83 chemokine gene is distinct between HHV-6A and HHV-6B strains, encoding up to 13% amino acid differences. The HHV-6A (U83A) and HHV-6B (U83B) chemokines have distinct specificities which determine chemoattraction or diversion of different leukocyte subsets for infection or immune evasion, thus an early component of cellular tropism as well as mediator of innate immunity. U83 also has a varied gene structure, with N-terminal length variation determining production of the encoded mature secreted chemokine, coupled with control by cell-directed splicing which truncates the chemokine gene early in replication to encode an antagonist. The long active form of U83A has a unique broad specificity for receptors CCR1, CCR4, CCR5, CCR6 and CCR8 present on plasmacytoid and myeloid dendritic and monocyte/macrophage antigen presenting cells, as well as both TH1 and TH2 skin homing lymphocytes and NK cells; it is also amongst the highest affinity ligands for CCR5 and inhibits HIV-1 binding at this coreceptor. U83A can both block and divert human chemokine action while occupying the human chemokine receptors.	97
340181	pfam17466	NinD	Family of unknown function. This is a family of unknown function found in Enterobacteria phage P22 and Enterobacteria phage lambda.	57
340182	pfam17467	E7R	Viral Protein E7. This domain family is found in Vaccinia and Variola viruses. Family members include the E7R gene product. Vaccinia virus (VV) is a large double-stranded DNA virus that replicates in the cytoplasm of infected cells. Many viruses express proteins that are modified by myristic acid. Myristic acid is a 14-carbon fatty acid that is cotranslationally transferred to the penultimate glycine residue found within the consensus sequence MGXXX(S/T/A/C/N) (where X is any amino acid) at the amino terminus of target proteins. E7R proteins in Vaccinia virus have been shown to be myristylated. The expressed E7R protein has also been found to reside within mature infectious virions.	60
340183	pfam17468	Gp52	Phage protein Gp52. This domain of unknown function is found in Mycobacterium phage.	61
340184	pfam17469	Gp68	Phage protein Gp68. This is a domain of unknown function found in Mycobacterium phage.	78
340185	pfam17470	Gp45_2	Phage protein Gp45.2. This is a domain of unknown function found in Myoviridae.	58
340186	pfam17471	Gp63	Hypothetical phage protein Gp63. This is a family of unknown function found in Mycobacterium.	73
340187	pfam17472	DUF5425	Family of unknown function (DUF5425). This is a family of unknown function found in Borreliella burgdorferi.	76
340188	pfam17473	DUF5426	Family of unknown function (DUF5426). This is a family of unknown function found in Mycoplasma.	137
340189	pfam17474	U71	Tegument protein UL11 homolog. Human herpesvirus 6A (HHV-6A) and HHV-6B are classified as roseoloviruses and are highly prevalent in the human population. Roseolovirus reactivation in an immunocompromised host can cause severe pathologies. During their productive cycle, herpesviruses have a regulated temporal cascade of gene expression that can be divided into three general stages: immediate-early (IE), early (E), and late (L). Following viral DNA replication, late viral genes that mainly encode structural proteins start to be transcribed, ultimately leading to the assembly and release of infectious particles. This domain family is found in tegument protein UL11 homolog (U71) in HHV-6A/B. It is a myristylated virion protein which is expressed at the early stage of the lytic cycle.	52
340190	pfam17475	Binary_toxB_2	Clostridial binary toxin B/anthrax toxin PA domain 2. This domain forms the middle beta sandwich domain in anthrax toxin.	218
375198	pfam17476	Binary_toxB_3	Clostridial binary toxin B/anthrax toxin PA domain 3. This entry represents the beta-grasp domain in anthrax protective antigen.	102
407512	pfam17477	Rota_VP4_MID	Rotavirus VP4 membrane interaction domain. This entry represents the VP4 membrane interaction domain.	225
340193	pfam17478	VP4_helical	Rotavirus VP4 helical domain. 	291
407513	pfam17479	DUF3048_C	Protein of unknown function (DUF3048) C-terminal domain. Some members in this bacterial family of proteins are annotated as YerB. However currently no function is known. This entry represents the C-terminal domain.	114
340195	pfam17480	AlphaC_C	Alpha C protein C terminal. The alpha C protein (ACP) is found in Streptococcus and acts as an invasin which plays a role in the internalisation and translocation of the organism across human epithelial surfaces. Group B Streptococcus is the leading cause of diseases including bacterial pneumonia, sepsis and meningitis. The N terminal of ACP is associated with virulence and forms a beta sandwich and a three helix bundle. This entry is the C-terminal domain of ACP. The C-terminal domain (45 amino acids) contains an LPXTG peptidoglycan-anchoring motif characteristic of cell-wall anchored surface proteins.	71
407514	pfam17481	Phage_sheath_1N	Phage tail sheath protein beta-sandwich domain. This entry represents the N-terminal beta sandwich domain found in a variety of phage tail sheath proteins.	99
407515	pfam17482	Phage_sheath_1C	Phage tail sheath C-terminal domain. This entry represents the C-terminal domain in a variety of phage tail sheath proteins.	104
407516	pfam17483	TbpB_C	C-lobe handle domain of Tf-binding protein B. Bacterial lipoproteins represent a large group of specialized membrane proteins that perform a variety of functions including maintenance and stabilization of the cell envelope, protein targeting and transit to the outer membrane, membrane biogenesis, and cell adherence. Pathogenic Gram-negative bacteria within the Neisseriaceae and Pasteurellaceae families rely on a specialized uptake system, characterized by an essential surface receptor complex that acquires iron from host transferrin (Tf) and transports the iron across the outer membrane. They have an iron uptake system composed of surface exposed lipoprotein, Tf-binding protein B (TbpB), and an integral outer-membrane protein, Tf-binding protein A (TbpA), that together function to extract iron from the host iron binding glycoprotein (Tf). TbpB is a bilobed (N and C lobe) lipid-anchored protein with each lobe consisting of an eight-stranded beta barrel flanked by a handle domain made up of four (N lobe) or eight (C lobe) beta strands. TbpB extends from the outer membrane surface by virtue of an N-terminal peptide region that is anchored to the outer membrane by fatty acyl chains on the N-terminal cysteine and is involved in the initial capture of iron-loaded Tf. This domain family is found in the handle domain of the C lobe (domain C) of TbpB proteins. It consists of a squashed six-stranded beta sheet flanked by two antiparallel beta strands and has no supporting alpha helix as in the N lobe.	99
407517	pfam17484	TbpB_A	N-Lobe handle Tf-binding protein B. Bacterial lipoproteins represent a large group of specialized membrane proteins that perform a variety of functions including maintenance and stabilization of the cell envelope, protein targeting and transit to the outer membrane, membrane biogenesis, and cell adherence. Pathogenic Gram-negative bacteria within the Neisseriaceae and Pasteurellaceae families rely on a specialized uptake system, characterized by an essential surface receptor complex that acquires iron from host transferrin (Tf) and transports the iron across the outer membrane. They have an iron uptake system composed of surface exposed lipoprotein, Tf-binding protein B (TbpB), and an integral outer-membrane protein, Tf-binding protein A (TbpA), that together function to extract iron from the host iron binding glycoprotein (Tf). TbpB is a bilobed (N and C lobe) lipid-anchored protein with each lobe consisting of an eight-stranded beta barrel flanked by a handle domain made up of four (N lobe) or eight (C lobe) beta strands. TbpB extends from the outer membrane surface by virtue of an N-terminal peptide region that is anchored to the outer membrane by fatty acyl chains on the N-terminal cysteine and is involved in the initial capture of iron-loaded Tf. The 4-residue conserved LSAC motif found at the amino terminus of TbpB represents a prototypical lipobox, with the cysteine residue serving as the first amino acid in the mature protein which is subsequently modified by the addition of a diacyl glycerol. A second conserved motif of interest is located two amino acids downstream of the LSAC site. This region consists of four glycine residues in tandem. Deletion of the conserved polyglycine motif has significant negative effects on growth in certain conditions, while mutational analysis revealed that the LSAC motif constituting the lipobox of TbpB is necessary for lipidation and hence tethering of TbpB to the bacterial surface. This domain family is found on the N-terminal region of TbpB proteins, which comprises the N lobe handle consisting of a four-stranded antiparallel beta sheet held together by a short surface-exposed alpha helix. Tf-binding activity primarily resides in the TbpB N lobe.	136
340200	pfam17485	SatRNA_48	Satellite RNA 48 kDa protein. Satellite RNAs (satRNAs) are short RNA molecules, usually <1,500 nt, that depend on cognate helper viruses for replication, encapsidation, movement, and transmission, but most share little or no sequence homology to the helper viruses. In contrast, satellite viruses are satRNAs that encode and are encapsidated in their own capsid proteins (CPs). Members of this family are nonstructural proteins of 48 kDa in size which have been shown to be involved in the replication of the sat-RNA. They are found in tomato black ring virus (TBRV).	299
340201	pfam17486	Cys_Knot_tox	Cystine knot toxins. This family is found in Araneae (spiders) and family members are venomous peptides with 4 disulfide bonds. Cystine knot toxins (CKTs) are small, compact molecules cross-linked by three to five disulfide bonds and are often the key contributors to the activity and potency of the venom. While these disulfide-rich peptides can adopt a number of different structural motifs, three of the most observed structural scaffold motifs are the inhibitor cystine knot (ICK), the disulfide-directed beta-hairpin (DDH) and the Kunitz motif. These venomous peptides mainly act on membrane proteins in electro-excitable cell membranes by modulating voltage-activated sodium (NaV), calcium (CaV), and potassium (KV) channels, acid-sensing ion channels (ASICs), transient receptor potential (TRP) channels, and mechanosensitive channels (MSCs).	70
407518	pfam17487	RPS12	Ribosomal protein S12. This is a family of unknown function. Family members are ribosomal proteins found in the mitochondria (RPS12). Homologous RPS12 proteins in bacterial ribosomes participate in stabilizing the second base pair of the codon-anticodon duplex in the A site and are likely to be critical for the fidelity of the decoding process. A similar role can be anticipated for this protein in mitochondrial ribosomes. This has been shown where the product of edited RPS12 mRNA translation represented a component of the mitoribosome's small subunit.	87
407519	pfam17488	Herpes_glycoH_C	Herpesvirus glycoprotein H C-terminal domain. Herpesvirus glycoprotein H (gH) is a virion associated envelope glycoprotein. Complex formation between gH and gL has been demonstrated in both virions and infected cells. This entry represents the C-terminal domain.	135
407520	pfam17489	Tnp_22_trimer	L1 transposable element trimerization domain. This entry represents the trimerization domain.	43
407521	pfam17490	Tnp_22_dsRBD	L1 transposable element dsRBD-like domain. This entry represents the double stranded RNA-binding-like domain.	65
340206	pfam17491	m_DGTX_Dc1a_b_c	Spider Toxins mu-diguetoxin-1 a, b and c. This family has members that are 56-59 residue mu-diguetoxin-1 toxins, which have been isolated from the weaving spider, Diguetia canities. These toxins were isolated as a result of their potent insect paralytic activities, designated mu-DGTX-Dc1a to -Dc1c (formerly DTX9.2, DTX11 and DTX12). A family member, beta-Diguetoxin-Dc1a (Dc1a), has been structurally characterized and shown to have disulfide bonds which form a classical inhibitor cysteine knot (ICK) motif in which the Cys13-Cys26 and Cys20-Cys40 disulfide bonds and the intervening sections of the polypeptide backbone form a 23-residue ring that is pierced by the Cys25-Cys54 disulfide bond. This ICK motif is commonly found in spider toxins, and this particular scaffold provides these peptides (so-called knottins) with an unusually high degree of chemical, thermal and biological stability. Dc1a contains an additional disulfide bond (Cys42-Cys52) that appears to serve as a molecular staple which limits the flexibility of a disordered serine-rich hairpin loop. The extended N-terminus of Dc1a along with an unusually large loop between Cys26 and Cys40 enables the formation of an N-terminal three-stranded antiparallel beta-sheet that is not found in any other knottin. The molecular surface of Dc1a contains a relatively uniform distribution of charged residues; moreover, there are no distinct clusters of hydrophobic residues that might mediate an interaction with lipid bilayers.	55
340207	pfam17492	D_CNTX	Delta Ctenitoxins. This family includes peptides isolated from Phoneutria such as delta-ctenitoxins. Members of the CNTX-Pn1a family and its paralogs (delta-CNTX-Pn1b through delta-CNTX-Pn1e) of Phoneutria toxins have complex effects on sodium channels but their primary effect appears to be an inhibition of channel inactivation, a pharmacology similar to that of the delta-atracotoxins and delta-conotoxins. Orthologous toxins such as delta-CNTX-Pr1/PK1 and Pn2 are also family members, some of which act by blocking the calcium channels. Delta-CNTX-Pn1a and delta-CNTX-PN2a are 48-amino-acid polypeptides, with 5 disulfide bridges. The latter has a complex pharmacology that results in inhibition of NaV channel inactivation and a hyperpolarizing shift in the channel activation potential.	48
340208	pfam17493	DUF5428	Family of unknown function (DUF5428). This is a family of unknown function found in Betanecrovirus.	63
340209	pfam17494	DUF5429	Family of unknown function (DUF5429). This is a family of unknown function.	76
375203	pfam17495	DUF5430	Family of unknown function (DUF5430). This is a family of unknown function found in Feline immunodeficiency virus.	106
407522	pfam17496	DUF5431	Family of unknown function (DUF5431). This is a family of unknown function found in Enterobacteriaceae.	70
340212	pfam17497	DUF5432	Family of unknown function (DUF5432). This is a family of unknown function found in Orthopoxvirus.	74
340213	pfam17498	DUF5433	Family of unknown function (DUF5433). This is a family of unknown function found in Orthopoxviruses.	67
375204	pfam17499	Pilosulin	Ant venom peptides. Members of this family are found in Myrmecia pilosula and represent a group of peptides that display cytotoxic, hypotensive, histamine-releasing and antimicrobial activities. Pilosulins constitute the major allergens of the venom of Myrmecia pilosula (Myrmeciinae). Pilosulin 1 is a long linear peptide (57 amino acids) and displays haemolytic and cytolytic activities. Pilosulins 3, 4, and 5 are a group of homo- and heterodimeric peptides. Pilosulin 1 is expressed in the venom sac of ants in the form of a propeptide (112 kDa) which undergoes extensive post-translational modification. It is proposed to give rise to a family of six homologous C-terminal peptide sub-sequences containing between 27 and 56 amino acid residues in the final venom. Furthermore, it is found to form random coils and have minimal secondary structure. However, in increasingly hydrophobic conditions, approximately one-third of the peptide forms alpha-helix secondary structures. Studies on human erythrocytes and lymphocytes, show that Pilosulin 1 is highly lytic towards leukocytes and that the NH2-terminus (20 N-terminal residues) of Pilosulin 1 is critical for its cytotoxic activity and antimicrobial activities. Another family member Pilosulin 3, is a heterodimer of Pilosulin 3a and Pilosulin 3b linked in anti-parallel fashion through 2 disulfide bridges. This peptide is the most abundant peptide found in native venom.	74
340215	pfam17500	Colicin_K	Colicin-K immunity protein. Colicins are bacterial toxins produced by Escherichia coli strains and are active against E. coli or related strains. These bacterial antibiotic toxins play an important role in the E. coli colonization of environmental niches. Members of this family are Colicin K peptides which require TolA, TolB, TolQ, and TolR proteins for translocation across the periplasm and binding to the outer membrane receptor. Colicin K uses the Tsx nucleoside-specific receptor for binding at the cell surface, the OmpA protein for translocation through the outer membrane, and the TolABQR proteins for the transit through the periplasm. The N-terminal domain interacts with components of its import machinery, including the TolB and TolQ proteins.	96
340216	pfam17501	Viral_RdRp_C	Viral RNA-directed RNA polymerase. This is the C-terminal of RNA-directed RNA polymerase (Protein A) found in Alphanodaviruses such as Flock House Virus (FHV). FHV is a positive-stranded RNA virus with a bipartite genome of RNAs, RNA1 and RNA2. RNA1 encodes protein A, which is the catalytic subunit of the RNA-dependent RNA polymerase (RdRp) and functions as the sole viral replicase protein responsible for RNA replication. FHV protein A also possesses a terminal nucleotidyl transferase (TNTase) activity, which is able to restore the nucleotide loss at the 3'-end initiation site of RNA template to rescue RNA synthesis initiation. It has also been reported that FHV protein A replicates viral RNA in concert with the mitochondrial outer membrane and other viral or cellular factors and mediates the formation of viral RNA replication complexes and small spherules by inducing membrane rearrangement. This domain is also found in B1 proteins which are encoded by the subgenomic RNA3 during FHV replication. The function of translated B1 protein is poorly defined, but may be important for maintenance of RNA replication.	101
340217	pfam17502	DUF5434	Family of unknown function (DUF5434). This is a family of unknown function found in Varicellovirus.	189
407523	pfam17503	DUF5435	Family of unknown function (DUF5435). This is a family of unknown function found in Varicellovirus.	208
340219	pfam17504	DUF5436	Family of unknown function (DUF5436). This is a family of unknown function found in Orthopoxvirus.	79
375205	pfam17505	DUF5437	Family of unknown function (DUF5437). This is a family of unknown function found in Alphabaculovirus.	60
340221	pfam17506	DUF5438	Family of unknown function (DUF5438). This is a family of unknown function found in Orthopoxvirus.	71
340222	pfam17507	DUF5439	Family of unknown function (DUF5439). This is a family of unknown function found in Orthopoxvirus.	75
340223	pfam17508	MccV	Microcin V bacteriocin. Family members are bacterial microcin-V peptides MccV, also known as colicin V. MccV was the first antibiotic substance reported to be produced by E. coli. This antibacterial agent was initially named colicin V (ColV). However, on account of several characteristics (low molecular mass, non-inducible production, and dedicated export system), it became classified within the microcins. The structural gene cvaC encodes the 103-aa MccV precursor. The dedicated export system of MccV has been well characterized and involves two genes that form the second operon. The MccV protein has an N-terminal double glycine motif which precedes the cleavage site for the precursor protein.	104
375206	pfam17509	DUF5440	Family of unknown function (DUF5440). This is a family of unknown function found in bacteria.	93
375207	pfam17510	Gp44	Mycobacterium phage hypothetical protein Gp44.1. This is a family with unknown function. Family members are hypothetical proteins found in Mycobacterium phages.	107
340226	pfam17511	Mobilization_B	Mobilization protein B. This is a family of unknown function found in Bacteria. Family members include Mobilization protein B (MobB). MobB contains a putative membrane-spanning domain, and might be involved in anchoring or presenting MobA, and the covalently-linked plasmid DNA, to the conjugative pore for subsequent export. In agreement with this, MobB has been shown to be associated with the membrane. Deletion of the membrane-spanning domain disrupts this association and decreases the frequency of both type IV transport and plasmid mobilization. MobB is one out of three proteins encoded by RSF1010 that are required for its mobilization along with MobA and MobC. MobB encoded by the broad-host-range plasmid R1162 is required for its efficient transfer by conjugation. The C-terminal half of the protein contains a membrane domain essential for transfer, while the other, functionally active region of MobB, identified by mutagenesis, is at the N-terminal end. One mutation affecting this region inhibits replication, suggesting that this part of the protein is contacting and sequestering the relaxase-linked primase. A model has been proposed in which MobB molecules are anchored in the membrane at one end and engage the relaxase at the other. This arrangement is suggested to increase the transfer frequency by raising the probability of contact between the relaxase and the membrane-embedded coupling protein for type IV secretion.	136
340227	pfam17512	Sh_2	Metapneumovirus Small hydrophobic protein. This family is found in SH (small hydrophobic) proteins present in Metapneumovirus such as the Avian metapneumovirus (AMPV), a paramyxovirus that has three membrane proteins (G, F, and SH). Among them, the SH protein is a small type II integral membrane protein. It is located in both the plasma membrane as well as within intracellular compartments. AMPV type C- SH protein localizes in the endoplasmic reticulum (ER), Golgi, and cell surface, and is transported through ER-Golgi secretory pathway. AMPV SH protein is modified by N-linked glycans and can be released into the extracellular environment. Furthermore, it has been shown that glycosylated AMPV SH proteins form homodimers through cysteine-mediated disulfide bonds.	174
375208	pfam17513	DUF5441	Family of unknown function (DUF5441). This is a family of unknown function found in Mastadenoviruses.	189
340229	pfam17514	DUF5442	Family of unknown function (DUF5442). This is a family of unknown function found in Chironomus.	107
340230	pfam17515	CPV_Polyhedrin	Cypovirus polyhedrin protein. This family is found in polyhedrin proteins of Cypoviruses. These viruses possess a single capsid layer with turrets and are commonly embedded in crystalline occlusion bodies called polyhedra, which are formed in the cell cytoplasm and mainly composed of a single virus-encoded protein, polyhedrin. Cypoviruses have been classified into 21 distinct types. Within each type the amino acid sequences of polyhedrins are highly conserved, whilst between types there is little conservation. Structural analysis and comparison of the different polyhedrins reveals five variable regions: the N-terminal loop, connections between secondary structures (H2 and H3, beta-E and beta-F, beta-F and beta-G, beta-G and beta-H), and the C-terminal loop, which are designated V1-V5 respectively. V2 forms a cap at one end of the protein and is subdivided across two sections of the polypeptide, V2n and V2c. Differences in these regions give each polyhedrin its characteristic appearance. The base domain (residues 74-110) is a region that is neither required for proper folding of the protein, nor for crystal assembly, but fine-tunes the crystal, 'locking-down' the structure, often in conjunction with NTPs. This region is also implicated in virion recognition and packaging.	241
407524	pfam17516	ProQ_C	ProQ C-terminal domain. This domain is found at the C-terminus of many ProQ proteins.	51
407525	pfam17517	IgGFc_binding	IgGFc binding protein. This domain is found at the N terminal of human IgGFc-binding protein and has been shown to confer IgG Fc binding activity. It may play a role in immune protection and inflammation in the intestines of primates.	292
340233	pfam17518	DUF5443	Family of unknown function (DUF5443). This is a family of unknown function found in Mycoplasma.	344
407526	pfam17519	DUF5444	Family of unknown function (DUF5444). This is a family of unknown function found in Enterobacterales.	62
340235	pfam17520	DUF5445	Family of unknown function (DUF5445). This is a family of unknown function found in Enterobacteriaceae.	52
340236	pfam17521	Secapin	Honey bee peptides. Family members are bee venom peptides such as Secapin. Mature secapin is composed of 25 amino acid residues that contain a disulfide link. Secapin has been demonstrated to act as a potent neurotoxin. In Apis mellifera secapin exhibits anti-bacterial activity and induces inflammation and pain with anti-fibrinolytic, anti-elastolytic, and anti-microbial activities. Secapin shares a common folding pattern with apamin, mast cell degranulating peptide and tertiapin; it is centred on a beta-turn covalently linked to an alpha-helical segment by one disulphide link (two disulphide links in the other peptides).	45
340237	pfam17522	DUF5446	Family of unknown function (DUF5446). This is a family of unknown function found in Bacillales.	72
340238	pfam17523	MPS-4	MinK-related peptide, potassium channel accessory sub-unit protein 4. MinK-related peptides (MiRPs or KCNEs) are single-transmembrane proteins that associate with pore-forming ion-channel sub-units to form stable complexes with channel properties markedly distinct from those of the isolated pore-forming sub-units. MPS-4 is expressed exclusively in the C. elegans nervous system and is essential for neuronal excitability.	78
407527	pfam17524	CnrY	anti-sigma factor CnrY. This family is found in alpha and beta proteobacteria. Family members include anti-sigma factor CnrY from Cupriavidus metallidurans. Sigma factors are multi-domain sub-units of bacterial RNA polymerase (RNAP) that play critical roles in transcription initiation, including the recognition and opening of promoters as well as the initial steps in RNA synthesis. They also control a wide variety of adaptive responses such as morphological development and the management of stress. A recurring theme in sigma factor control is their sequestration by anti-sigma factors that occlude their RNAP-binding determinants. CnrH, controls cobalt and nickel resistance in Cupriavidus metallidurans. CnrH is regulated by a complex of two transmembrane proteins: the periplasmic sensor CnrX and the anti-sigma CnrY. At rest, CnrH is sequestered by CnrY whose 45-residue-long cytosolic domain is one of the shortest anti-sigma domains. Upon Ni(II) or Co(II) ions detection by CnrX in the periplasm, CnrH is released between CnrH and the cytosolic domain of CnrY (CnrYc). The CnrH/CnrYC complex displays an unexpected structural similarity to the anti-sigma NepR in complex with its antagonist PhyR, whereas NepR shares no sequence similarity with CnrY. Crystal structure of CnrH/CnrY shows that CnrYC residues 3-19 are folded as a well-defined alpha-helix. The peptide further extends along the hydrophobic groove of sigma 2 with no canonical structure except for a short helical turn spanning residues 24-28. CnrY has a hydrophobic knob made of V4, W7 and L8 side chains protruding into sigma 4 hydrophobic pocket and contributing to the interface. In vivo investigation of CnrY function pinpoints part of the hydrophobic knob as a hotspot in CnrH inhibitory binding.	98
340240	pfam17525	DUF5447	Family of unknown function (DUF5447). This is a family of unknown function found in Pseudomonas.	92
340241	pfam17526	DUF5448	Family of unknown function (DUF5448). This is a family of unknown function found in Gammaproteobacteria.	118
340242	pfam17527	ALP	Phage ALP protein. During the course of infection of Escherichia coli by bacteriophage T4, transcription of viral late genes does not take place unless template DNA contains hydroxymethyl cytosine (hmCyt), a modification normally effected by virus-encoded enzymes. Bacteriophage T4 Alc protein acts as a site-specific termination factor participating in shutting off host transcription after infection of E. coli, while the bacteriophage T4 transcription is protected from the action of Alc by overall substitution of cytosine with 5-hydroxymethyl cytosine in T4 DNA. Based on genetic studies, Alc is thought to bind directly to the beta sub-unit dispensable region 1 (bDR1) of E. coli RNAP. However, immune-isolation experiments show that Alc binds both core and sigma 70-holoenzyme of RNAP.	177
407528	pfam17528	DUF5449	Family of unknown function (DUF5449). This is a family of unknown function found in Lactobacillus.	174
407529	pfam17529	DUF5450	Family of unknown function (DUF5450). This is a family of unknown function found in Giardia intestinalis.	161
340245	pfam17530	NS3	Non-structural protein NS3. This is a family of proteins found in Densoviruses. Members of this family such as NS3 found in Junonia coenia have been shown to be involved in viral DNA replication. Generation of deletion mutants and replicative cycle analysis show that NS3 is required for viral DNA replication. Bioinformatics analysis of Bombyx mori densovirus protein NS3 shows that it has two putative zinc-finger motifs, 6 putative N glycosylation sites, and 4 putative phosphorylation sites.	245
340246	pfam17531	O_Spanin_T7	outer-membrane spanin sub-unit. This family contains members of the outer membrane spanin sub-unit protein (o-spanin), found in Enterobacteria phage T7. Spanins are lytic proteins that act on bacterial outer-membrane by disrupting it, allowing progeny virions to spread. O-spanin acts together with inner membrane spanin sub-unit (i-spanin) to form the spanin complex necessary for function.	33
340247	pfam17532	DUF5451	Family of unknown function (DUF5451). This is a family of unknown function found in Epstein-Barr virus.	148
340248	pfam17533	DUF5452	Family of unknown function (DUF5452). This is a family of unknown function found in Mycoplasmataceae.	169
340249	pfam17534	DUF5453	Family of unknown function (DUF5453). This is a family of unknown function found in Mycoplasma. Family members have 4 predicted trans-membrane regions.	186
340250	pfam17535	DUF5454	Family of unknown function (DUF5454). This is a family of unknown function found in Mycoplasma.	221
340251	pfam17536	Mx_ML	Matrix and Matrix long proteins N-terminal. This entry represents the N-terminal fragment of family members such as the Matrix (Mx) and Matrix protein long (ML) proteins. They are found in Thogoto virus (THOV), a tick-transmitted orthomyxovirus with a genome consisting of six single-stranded RNA segments that encode seven structural proteins. Matrix proteins of the family Orthomyxoviridae are major structural components of the viral capsid, located below the viral lipid membrane and provide protection for viral ribonucleoproteins (vRNPs). They serve as a major participant during the processes of virus invasion and budding. Furthermore, they play specific roles throughout the viral life cycle, usually by interacting with other viral components or host cellular proteins. ML protein, an extended version of the viral M protein, is a viral IFN antagonist. ML is essential for virus growth and pathogenesis in an IFN-competent host. In the presence of ML the activation and/or action of the interferon regulatory factor-3 (IRF-3) is severely affected. This effect depends on direct interaction of ML with the transcription factor IIB (TFIIB). ML suppresses IRF-7 in a similar manner as it suppresses IRF-3. Studies have revealed that ML associates with IRF-7 and prevents IRF-7 dimerization and interaction with TRAF6. Structural analysis revealed that N-terminal fragment of M protein (MN) undergoes conformational changes that result in specific, pH-dependent inter-molecular interactions. Comparison of THOV MN and influenza A virus (IAV) MN region, showed low sequence identity. However, superimposition of the two structures in neutral condition, showed that both matrix proteins contain nine helices connected with same topology. Since the matrix layer of IAV disassembles in acidic endosome at the beginning of infection and repacks in the neutral cytoplasm, a change of pH might be a key regulator for the capsid assembly/disassembly transition during these processes. Hence, pH-dependent conformational transition model was studied in THOV MN, where interactions such as hydrogen bonds and hydrophobic interactions are suggested to be involved in THOV matrix assembly.	149
407530	pfam17537	DUF5455	Family of unknown function (DUF5455). This is a family of unknown function found in Proteobacteria. Family members contain three predicted trans-membrane regions.	102
407531	pfam17538	C_LFY_FLO	DNA Binding Domain (C-terminal) Leafy/Floricaula. This family consists of various plant development proteins which are homologs of floricaula (FLO) and Leafy (LFY) proteins which are floral meristem identity proteins. Mutations in the sequences of these proteins affect flower and leaf development. LFY is a plant-specific transcription factor (TF) essential for flower development. It is one of the few master regulators of flower development, as it integrates environmental and endogenous signals to orchestrate the whole floral network. Transcription factors such as LFY, recognize short DNA motifs primarily through their DNA-binding domain. Upon binding to short stretches of DNA called cis-elements or TF binding sites (TFBS), they regulate gene expression. This entry represents the DNA binding domain found in C-terminal of LFY proteins in plants. Structure-function studies have demonstrated that LFY binds semi-palindromic 19-bp DNA elements through its highly conserved C-terminal DBD, a unique helix-turn-helix fold that by itself dimerizes on DNA.	169
340254	pfam17539	DUF5456	Family of unknown function (DUF5456). This is a family of unknown function found in Bacteroides.	152
375213	pfam17540	DUF5457	Family of unknown function (DUF5457). This is a family of unknown function found in Bacteria. Family members have one predicted trans-membrane region.	89
407532	pfam17541	DUF5458	Family of unknown function (DUF5458). This is a family of unknown function found in Bacteroidetes.	430
340257	pfam17542	RP853	Uncharacterized RP853. This is a family of unknown function found in Rickettsia. Family members are predicted to contain one trans-membrane region.	317
340259	pfam17544	DUF5460	Family of unknown function (DUF5460). This is a family of unknown function found in Rickettsia. Family members are predicted to contain one trans-membrane region.	375
340260	pfam17545	DUF5461	Family of unknown function (DUF5461). This is a family of unknown function found in viruses.	93
340261	pfam17546	Defb50	Beta Defensin 50. Beta-defensins are small cationic antimicrobial peptides. Family members such as beta-defensin 50 (Defb50) have poor antimicrobial activity in their oxidized form, but this improves under reduced conditions.	50
407533	pfam17547	DUF5462	Family of unknown function (DUF5462). This is a family of unknown function found in Gammaproteobacteria.	157
340263	pfam17548	p6	Histone-like Protein p6. Family members such as protein p6 from Bacillus subtilis phage phi29 bind double-stranded DNA, forming a large nucleoprotein complex all along the viral genome, and have been proposed to be an architectural protein with a global role in genome organization. P6 is also involved in viral transcriptional control, repressing the C2 early promoter located at the right DNA end, and together with the viral regulatory protein p4, repressing early promoters A2b/A2c and activating late promoter A3.	76
340264	pfam17549	Phage_Gp17	Gene Product 17. Family members such as protein 17 (gene product 17/gp17) found in Bacillus phage phi29 are involved in DNA replication and in pulling the phage DNA into the cell during the injection process.	140
340265	pfam17550	PsaF	Family of unknown function. This is a family of unknown function found in Yersinia pestis.	162
340266	pfam17551	DUF5463	Family of unknown function (DUF5463). This is a family of unknown function found in Yersinia pestis.	32
340267	pfam17552	DUF5464	Family of unknown function (DUF5464). This is a family of unknown function found in Bacteriophages.	51
340268	pfam17553	DUF5465	Family of unknown function (DUF5465). This is a family of unknown function found in Enterobacteria phage T7.	19
340269	pfam17554	DUF5466	Family of unknown function (DUF5466). This is a family of unknown function found in Enterobacteria phage T7.	57
407534	pfam17555	DUF5467	Family of unknown function (DUF5467). This is a family of unknown function found in Bacteria. Family members have 5 predicted trans-membrane regions.	274
340271	pfam17556	MIT_LIKE_ACTX	MIT-like atracotoxin family. This family includes peptides such as the Atracotoxin-Hvf17. It is a non-toxic peptide isolated from the venom of Blue Mountains funnel-web spider Hadronyche versuta. It does not function like classical funnel-web spider atracotoxins to modulate mammalian or insect voltage-gated ion channel function since it lacks insecticidal activity and fails to affect vas deferens smooth muscle or skeletal muscle contractility. This peptide has ten conserved cysteine residues similar to AVIT family members such as MIT1. Due to the lack of the AVIT N-terminal four residues and lack of functional similarity to the AVIT family, the Atracotoxin-Hvf17 is classified as MIT-like atracotoxin.	68
375216	pfam17557	Conotoxin_I2	I2-superfamily conotoxins. Conotoxins (or conopeptides) are the peptidic components of the venoms of marine cone snails (genus Conus). They are classified in one of three ways: gene superfamily, cysteine framework or pharmacological family. Several distinct cysteine frameworks have been described in conotoxins. Members of this family display a XI cysteine pattern (C-C-CC-CC-C-C) and belong to the I2- superfamily conotoxins. Family members such as Kappa-conotoxin ViTx and Kappa-conotoxin SrXIA inhibit voltage gated potassium channels (Kv).	38
375217	pfam17558	AGH	Androgenic gland hormone. This family contains members such as the Androgenic gland hormone (AGH) of the woodlouse, Armadillidium vulgare. AGH is a heterodimeric glycopeptide synthesized and secreted from androgenic glands. It is responsible for sex differentiation in crustaceans and contains 4 disulfide bonds.	121
340275	pfam17560	Megourin	Aphid Megourins. This family is found in the vetch aphid Megoura viciae with members such as Megourin 1, 2 and 3. Megourins are antimicrobial peptides that act against Gram-positive bacteria and fungi.	63
407535	pfam17561	DUF5469	Family of unknown function (DUF5469). This is a family of unknown function found in Bacteroidetes. Family members have one predicted trans-membrane region.	148
340277	pfam17562	Styelin	Styelin A-E. This is a family of antimicrobial peptides found in Styela clava (Sea squirt). Family members such as Styelin A and B are two alpha-helical phenylalanine-rich antimicrobial peptides effective against a panel of Gram-negative and Gram-positive bacteria. Styelin contains unusual amino acids such as dihydroxyarginine, dihydroxylysine, 6-bromotryptophan, and 3,4-dihydroxyphenylalanine which are important for the antimicrobial activity at high salt concentrations.	59
340278	pfam17563	Cu	Cupiennin. Cupiennin are small cationic alpha-helical peptides from the venom of the ctenid spider Cupiennius salei which are characterized by high bactericidal as well as hemolytic activities. Family members such as cupiennin 1a exert both cytolytic and antibacterial effects. The cytolytic activity of the cupiennin peptides depends primarily on the amphipathic N-terminus, which is capable of inserting into the membrane, and is modulated by the C-terminus via electrostatic interactions with the cell surface.	27
340279	pfam17564	DUF5470	Family of unknown function (DUF5470). This is a family of unknown function found in viruses.	73
340280	pfam17565	DUF5471	Family of unknown function (DUF5471). This is a family of unknown function found in Enterobacteria phage T7.	70
340281	pfam17566	DUF5472	Family of unknown function (DUF5472). This is a family of unknown function found in Human papillomavirus type 11.	73
340282	pfam17567	DUF5473	Family of unknown function (DUF5473). This is a family of unknown function found in Human adenovirus.	106
340283	pfam17568	DUF5474	Family of unknown function (DUF5474). This is a family of unknown function found in Saccharomycetales.	77
340284	pfam17569	DUF5475	Family of unknown function (DUF5475). This is a family of unknown function found in Alphabaculovirus.	81
375218	pfam17570	DUF5476	Family of unknown function (DUF5476). This is a family of unknown function found in Podoviridae.	61
340286	pfam17571	DUF5477	Family of unknown function (DUF5477). This is a family of unknown function found in Podoviridae.	77
340287	pfam17572	DUF5478	Family of unknown function (DUF5478). This is a family of unknown function found in Alphabaculovirus.	86
340288	pfam17573	GA-like	GA-like domain. This domain is found in bacterial cell surface proteins. It is related to the GA domain that forms a three helix bundle.	50
340289	pfam17574	TA_inhibitor	Inhibitor of toxin/antitoxin system (Gp4.5). This is a family of prokaryotic toxin-antitoxin (TA) systems inhibitors, found in Podoviridae such as Enterobacteria phage T7. Family members such as Gene product 4.5 have been shown to neutralize TA-system-mediated abortive infection by inhibiting the Lon protease activity, thus preventing antitoxin degradation and toxin activation.	89
340290	pfam17575	DUF5479	Family of unknown function (DUF5479). This is a family of unknown function found in Kappa-papillomavirus.	101
340291	pfam17576	DUF5480	Family of unknown function (DUF5480). This is a family of unknown function found in Podoviridae.	71
340292	pfam17577	ETM	ECORI-T site protein ETM. This is a family of unknown function found in Alphabaculovirus.	109
340293	pfam17578	DUF5481	Family of unknown function (DUF5481). This is a family of unknown function found in Myoviridae.	103
340294	pfam17579	DUF5482	Family of unknown function (DUF5482). This is a family of unknown function found in Saccharomycetales.	159
340295	pfam17580	GBR_NSP5	Group B Rotavirus Non-structural protein 5. Family members such as non-structural protein 5 (NSP5), are found in Group B rotaviruses (GBR). Group B rotavirus (GBR) is genetically and antigenically distinct from Group A rotavirus (GAR). Hence phylogenetic studies have been carried out and show that the C-terminal region of NSP5, which is conserved among GAR and critical for its function for viroplasm-like structure formation in cells, was also conserved in GBR NSP5.	176
340296	pfam17581	DUF5483	Family of unknown function (DUF5483). This is a family of unknown function found in Saccharomycetaceae.	441
340297	pfam17582	UL20	Cytomegalovirus UL20. This family has members such as the human cytomegalovirus glycoprotein UL20. UL20 is a type I trans-membrane glycoprotein with an immunoglobulin-like ectodomain that is highly polymorphic among HCMV strains.	304
340298	pfam17583	DUF5484	Family of unknown function (DUF5484). This is a family of unknown function found in Myoviridae.	43
340299	pfam17584	comS	Bacillus Competence protein S. ComS is crucial for competence development as it prevents proteolytic degradation of ComK, the key transcriptional activator of all genes required for the uptake and integration of DNA. This family includes members of the Bacillus comS proteins.	44
340300	pfam17585	Phage_Arf	Accessory recombination function protein. Family members are found in Caudovirales such as Salmonella virus P22. Family members have a recombination accessory function.	47
340301	pfam17586	DUF5485	Family of unknown function (DUF5485). This is a family of unknown function found in Alphabaculovirus.	56
340302	pfam17587	Dmd	Discriminator of mRNA degradation. This family includes Dmd peptides from T4 phages. Dmd can suppress the toxicities of toxins such as LsoA (an endoribonuclease toxin expressed by E. coli). Crystal structure analysis shows that Dmd is inserted into the deep groove between the N-terminal repeated domain (NRD) and the Dmd-binding domain (DBD) of LsoA. Site-directed mutagenesis of Dmd revealed the conserved residues (W31 and N40) are necessary for LsoA binding and the toxicity suppression.	60
340303	pfam17588	DUF5486	Family of unknown function (DUF5486). This is a family of unknown function found in Myoviridae.	53
407536	pfam17589	DUF5487	Family of unknown function (DUF5487). This is a family of unknown function found in Myoviridae.	66
340305	pfam17590	DUF5488	Family of unknown function (DUF5488). This is a family of unknown function found in Orthopoxvirus.	70
340306	pfam17591	UL41A	Herpesvirus UL41A. Members of this family are found in Human cytomegalovirus. No known function has been reported.	78
340307	pfam17592	DUF5489	Family of unknown function (DUF5489). This is a family of unknown function found in Alphafusellovirus.	78
340308	pfam17593	DUF5490	Family of unknown function (DUF5490). This is a family of unknown function found in Myoviridae.	62
340309	pfam17594	GP57	Phage Tail fiber assembly helper protein. Gene product 57 (Gp57) is a chaperone for the phage short tail fiber protein; it acts as a molecular chaperone of gp12, increasing its folding efficacy and production efficiency.	75
340310	pfam17595	DUF5491	Family of unknown function (DUF5491). This is a family of unknown function found in Myoviridae.	68
340311	pfam17596	DUF5492	Family of unknown function (DUF5492). This is a family of unknown function found in Alphabaculovirus.	80
340312	pfam17597	DUF5493	Family of unknown function (DUF5493). This is a family of unknown function found in viruses.	82
340313	pfam17598	DUF5494	Family of unknown function (DUF5494). This is a family of unknown function found in viruses.	84
340314	pfam17599	DUF5495	Family of unknown function (DUF5495). This is a family of unknown function found in Myoviridae.	87
340315	pfam17600	DUF5496	Family of unknown function (DUF5496). This is a family of unknown function found in Myoviridae.	87
340316	pfam17601	DUF5497	Family of unknown function (DUF5497). This is a family of unknown function found in Alphabaculovirus.	89
340317	pfam17602	DUF5498	Family of unknown function (DUF5498). This is a family of unknown function found in Myoviridae.	96
340318	pfam17603	DUF5499	Family of unknown function (DUF5499). This is a family of unknown function found in Myoviridae.	97
340319	pfam17604	DUF5500	Family of unknown function (DUF5500). This is a family of unknown function found in Herpesvirus.	98
340320	pfam17605	DUF5501	Family of unknown function (DUF5501). This is a family of unknown function found in Alphabaculovirus.	107
340321	pfam17606	DUF5502	Family of unknown function (DUF5502). This is a family of unknown function found in Listeria.	87
340322	pfam17607	DUF5503	Family of unknown function (DUF5503). This is a family of unknown function found in Enterobacteriaceae.	116
340323	pfam17608	DUF5504	Family of unknown function (DUF5504). This is a family of unknown function found in Lactobacillus. Family members have 4 predicted trans-membrane regions.	124
340324	pfam17609	HCMV_UL124	Family of unknown function. This is a family of unknown function found in beta-herpesvirus. Family members such as UL124 are predicted membrane glycoproteins with one predicted trans-membrane region.	126
340325	pfam17610	DUF5505	Family of unknown function (DUF5505). This is a family of unknown function found in Alphabaculovirus.	156
340326	pfam17611	DUF5506	Family of unknown function (DUF5506). This is a family of unknown function found in Fowl aviadenovirus.	161
340327	pfam17612	DUF5507	Family of unknown function (DUF5507). This is a family of unknown function found in Escherichia.	160
340328	pfam17613	motB	Modifier of transcription. Family members are transcription regulation-related proteins found in Myoviridae such as Enterobacteria phage T4.	162
340329	pfam17614	FPV060	Viral CC-type chemokine. Family members found in Fowlpox virus are CC chemokine-like proteins. Fpv060 contains the conserved pattern of four cysteine residues similar to the CC chemokine family. Fpv060 also contains more cysteines in the mature protein, than cellular chemokines and one predicted trans-membrane region. In vitro studies show N-terminal glycosylation and show that Fpv060 from Fowl pox virus is much larger and has many more cysteine residues than host chemokines and viral homologs.	188
375219	pfam17615	C166	Family of unknown function. Family members found in Fuselloviridae are predicted to play a role in virus function.	171
340331	pfam17616	US6	Viral unique short region 6. This family has members such as US6 found in HCMV (Human cytomegalovirus). US6 is a unique short region glycoprotein found in the ER. It blocks the binding of ATP by TAP1 (Transporter associated with Antigen Processing 1) through a conformational change and subsequently inhibits TAP-mediated peptide translocation to the ER. It also down-regulates only MHC class I. Inhibition of TAP by US6 has been shown to require residues 89 to 108 of the HCMV US6 luminal domain, whereas sequences that flank this region stabilize the binding of the viral protein to TAP. Residues 81 to 90 and the C-terminal 39 residues of HCMV US6 may also contribute to the stabilization of the interaction between US6 and TAP.	161
340332	pfam17617	US10	Viral unique short region 10. This family contains US10 proteins found in HCMV (Human cytomegalovirus). US10 is a unique short region trans-membrane glycoprotein found in the endoplasmic reticulum (ER). It down-regulates cell surface expression of HLA-G, but not that of classical class I MHC molecules. Despite binding to classical class I MHC molecules and delaying their trafficking, it does not affect their steady-state cell surface levels. US10 contains a tri-leucine motif in the cytoplasmic tail which is responsible for down-regulation of HLA-G.	161
407537	pfam17618	SL4P	Uncharacterized Strongylid L4 protein. Family members are predicted non-classically secreted proteins found in Ancylostoma ceylanicum. Homologs are found in strongylids A. ceylanicum, N. americanus, H. contortus and Angiostrongylus cantonensis, where the corresponding genes in A. cantonensis are expressed in L4 larvae. Thus the family members found in A. ceylanicum have been named strongylid L4 proteins (SL4Ps). Although SL4Ps do not resemble any domains of known function, they do have a conspicuous number of charged residues (both acidic and basic) in their N-terminal, most highly conserved regions.	88
407538	pfam17619	SCVP	Secreted clade V proteins. Family members are found in strongylid parasites (A. ceylanicum, N. americanus, H. contortus and Heterorhabditis bacteriophora) and in related non-parasitic clade V species (C. elegans, Caenorhabditis briggsae and P. pacificus), hence the name secreted clade V proteins (SCVPs). In A. ceylanicum, the encoded 150 residue proteins are predicted to be classically secreted.	97
340335	pfam17620	ORF45	Family of unknown function. Family members found in alphabaculoviruses such as orf45 have been implicated in late gene expression when linked to orf41.	191
340336	pfam17621	DUF5508	Family of unknown function (DUF5508). This is a family of unknown function found in Enterobacteriaceae.	263
340337	pfam17622	UL16	Viral unique long protein 16. This family contains members such as UL16 found in the human cytomegalovirus (HCMV). It is an immunoevasin which subverts NKG2D-mediated immune responses by retaining a select group of diverse NKG2D ligands inside the cell. UL16 is a heavily glycosylated 50 kDa type I trans-membrane glycoprotein. The ectodomain folds into a modified version of a variable (V-type) immunoglobulin (Ig)-like domain. The N-terminal plug region (amino acids 27-50) is covalently linked to the Ig-like core with a disulfide bond. UL16 protein utilizes a three-stranded beta-sheet to engage the alpha-helical surface of the MHC class I-like MICB platform domain. Residues at the center of this beta-sheet mimic a central binding motif employed by the structurally unrelated C-type lectin-like NKG2D to facilitate engagement of diverse NKG2D ligands.	204
340338	pfam17623	B277	Family of unknown function. This is a family of unknown function, however family members such as B277 have been suggested to play a role in viral function.	277
340339	pfam17624	US30	Family of unknown function. This is a family of unknown function found in Cytomegalovirus. One of the family members US30 is a putative membrane glycoprotein with one predicted trans-membrane region.	282
340340	pfam17625	DUF5509	Family of unknown function (DUF5509). This is a family of unknown function found in Baculoviridae.	362
340341	pfam17626	IncF	Inclusion membrane protein F. The chlamydial inclusion membrane is extensively modified by the insertion of type III secreted effector proteins. These inclusion membrane proteins (Incs) are exposed to the cytosol and share a common structural feature of a long, bi-lobed hydrophobic domain but little or no primary amino acid sequence similarity. This family has members such as the IncF proteins found in Chlamydia trachomatis. IncF is enriched at the point of contact of RBs (reticulate bodies) with the inclusion membrane. It is expressed early in the developmental cycle and interacts with many other Inc proteins, like Ct058 or Ct850, which are expressed later during the cycle. Thus, IncF could act as an interaction node for Inc proteins. IncF consists of 104 amino acids, of which the 38 N-terminal amino acids encoding the signal sequence for the type III system and the 12 C-terminal amino acids may be localized in the host cell cytoplasm, suggesting that IncF or other small Incs interact with other Inc proteins through their trans-membrane domain. It has been shown to be capable of homo-oligomerization and also displayed self-interacting properties.	104
340342	pfam17627	IncE	Inclusion membrane protein E. The chlamydial inclusion membrane is extensively modified by the insertion of type III secreted effector proteins. These inclusion membrane proteins (Incs) have two major characteristics: an N-terminal type III secretion signal that is necessary for their secretion out of the bacterium and a hydrophobic region consisting of at least two trans-membrane helices that allows insertion into the inclusion membrane. Generally, both the N- and C-terminal regions of the Inc are exposed to the host cell cytosol. This family has members such as the IncE (also known as CT116) proteins found in Chlamydia trachomatis. IncE interacts with retromer-associated sorting nexins (SNXs), directly binding the PX domains of SNX5/6. It is expressed within the first 2 hours of C. trachomatis infection. IncE region 101-132 is the binding site for SNX5/6, causing re-localization of SNX5/6 from endosomes to the inclusion membrane. IncE101-132 expression was shown to be sufficient to maintain CI-MPR (Cation-Independent Mannose-6-Phosphate Receptor) in retromer-containing compartments, thereby disrupting efficient CI-MPR trafficking to the trans-Golgi. It has been suggested that SNX5/6 bind directly to IncE independently of phosphoinositides and that the predicted IncE C-terminal beta-hairpin is required. IncE-mediated sequestration of retromer SNX-BAR proteins may promote Golgi fragmentation, a process that facilitates lipid acquisition by C. trachomatis and enhances progeny production.	132
340343	pfam17628	IncD	Inclusion membrane protein D. The chlamydial inclusion membrane is extensively modified by the insertion of type III secreted effector proteins. These inclusion membrane proteins (Incs) have two major characteristics: an N-terminal type III secretion signal that is necessary for their secretion out of the bacterium and a hydrophobic region consisting of at least two trans-membrane helices that allows insertion into the inclusion membrane. Generally, both the N- and C-terminal regions of the Inc are exposed to the host cell cytosol. This family has members such as the IncD proteins found in Chlamydia trachomatis. This C. trachomatis effector protein IncD has been shown to recruit the lipid transfer protein CERT to the inclusion membrane by directly interacting with CERT PH domain, which mediates the FFAT motif-dependent recruitment of the ER-resident protein VAPB (vesicle-associated membrane protein-associated protein) to the inclusion.	141
340344	pfam17629	DUF5510	Family of unknown function (DUF5510). This is a family of unknown function found in Rickettsia. Family members are predicted to have 2 or 3 trans-membrane regions.	62
340345	pfam17630	DUF5511	Family of unknown function (DUF5511). This is a family of unknown function found in Bacillus.	69
340346	pfam17631	DUF5512	Family of unknown function (DUF5512). This is a family of unknown function found in Bacillus.	139
340347	pfam17632	DUF5513	Family of unknown function (DUF5513). This is a family of unknown function found in Bacillus.	91
340348	pfam17633	DUF5514	Family of unknown function (DUF5514). This is a family of unknown function found in Bacillus.	142
340349	pfam17634	Gp67	Gene product 67. This is a family of unknown function found in Myoviridae such as Enterobacteria phages. Family members such as Gp67 are prohead core (scaffold) proteins.	80
407539	pfam17635	DUF5515	Family of unknown function (DUF5515). This is a family of unknown function found in SARS coronavirus.	70
340351	pfam17636	UL21a	Viral Unique Long protein 21a. Members of this family, such as UL21a found in Human cytomegalovirus (HCMV), are required for HCMV to establish efficient productive infection. UL21a is a short-lived cytoplasmic protein that facilitates HCMV replication. It has also been shown to be responsible for APC1, APC4 and APC5 degradation.	123
375222	pfam17637	DUF5516	Family of unknown function (DUF5516). This is a family of unknown function found in T7 viruses.	37
340353	pfam17638	UL42	HCMV UL42. Family members include UL42 proteins found in Human cytomegalovirus (HCMV). UL42 has two Pro-Pro-X-Tyr (PPxY) sequences, a hydrophobic region at the C-terminus and no N-terminal signal peptide. These features are shared with herpes simplex virus (HSV) UL56. UL42 has a putative C-terminal trans-membrane region. HCMV UL42 interacts with Itch, a member of the Nedd4 family of ubiquitin E3 ligases, through its PY motifs as observed in HSV UL56, suggestive of a regulatory function.	125
340354	pfam17639	DUF5517	Family of unknown function (DUF5517). This is a family of unknown function found in Fuselloviridae. Structural analysis suggests a role in viral assembly.	100
340355	pfam17640	UL17	Uncharacterized UL17. This is a family of unknown function found in beta-herpesviruses such as Human cytomegalovirus (HCMV).	102
407540	pfam17641	ASPRs	Ancylostoma-associated secreted protein related genes. This family includes members encoded by ASP-related genes, which are distant homologs of ASPs (Ancylostoma-associated secreted proteins). ASPs are a diverse set of secreted cysteine-rich proteins pfam00188. ASPRs, on the other hand, are predicted to be secreted, with one ASPR in Heligmosomoides bakeri shown to be secreted by parasitic adults. Thus, like ASPs, ASPRs are suggested to comprise an important element of hookworm infection in vivo.	118
407541	pfam17642	TssD	Hemolysin coregulated protein Hcp (TssD). Type VI secretion systems (T6SSs) are toxin delivery systems. Each is a multiprotein complex requiring numerous core proteins (Tss proteins) including cytoplasmic, transmembrane, and outer membrane components. The needle or tube apparatus is comprised of a phage-like complex, similar to the T4 contractile bacteriophage tail, which is thought to be anchored to the membrane by a trans-envelope complex. These tube and trans-envelope sub-assemblies are linked via TssK. This entry comprises family members such as the inner tube protein Hcp (TssD). Hcp proteins form hexamers that stack to form the inner tube/needle structure of the puncturing device. Other functions have also been described for Hcp proteins, for example, some Hcp proteins have been shown to have a chaperone function in that they bind to and stabilize effectors. In addition, there are evolved Hcp proteins that have the Hcp domain at the N-terminal half of the protein and a toxic effector function present in the C-terminal portion of the protein.	127
407542	pfam17643	TssR	Type VI secretion system, TssR. Type VI secretion systems (T6SSs) are toxin delivery systems. Each is a multiprotein complex requiring numerous core proteins (Tss proteins) including cytoplasmic, transmembrane, and outer membrane components. The needle or tube apparatus is comprised of a phage-like complex, similar to the T4 contractile bacteriophage tail, which is thought to be anchored to the membrane by a trans-envelope complex. This entry relates to TssR family members. TssR proteins have no predicted TM regions.	745
375225	pfam17644	30K_MP_core	Core domain of 30K viral movement proteins. This entry represents the core domain found in viral movement proteins (MP) of the 30K type. The core domain is conserved among 30K MPs, which share the same predicted secondary structure consisting of 1 alpha-helix and 7 predicted beta-strands. The only sequence feature common to all 30K MPs is a short region between beta-strands 1 and 2, which contains several conserved hydrophobic positions, and a nearly-invariant aspartate which constitutes the sequence signature of the superfamily. This signature aspartate has a conserved, essential role in viral cell-to-cell movement.	138
375226	pfam17645	Amdase	Arylmalonate decarboxylase. This entry contains members such as the arylmalonate decarboxylases (AMDase; EC 4.1.1.76), which belong to the family of carboxy-lyases (EC 4.1). AMDases are capable of decarboxylating a range of alpha-disubstituted malonic acid derivatives to enantiopure products without the need for any cofactor. AMDases are members of the widespread Asp/Glu racemase family pfam01177 together with aspartate (EC 5.1.1.13) and glutamate racemases (EC 5.1.1.3), hydantoin racemases (EC 5.1.99.5) and maleate isomerases (EC 5.2.1.1).	217
375227	pfam17646	Zemlya	Closterovirus 1a polyprotein central region. This family represents an alignment of the Zemlya region of closteroviruses. The alignment of the 1a polyprotein of the Closteroviridae family members revealed that this region was not conserved in other genera. The homologs of the Zemlya region are not found in other viral or cellular proteins. This region is named the Zemlya region (zemlya is the Russian word for earth), meaning that its conserved amino acid sequence represents a solid ground within the highly variable central region of the 1a polyprotein. It is composed of four predicted alpha-helices, alphaA to alphaD, and contains three conserved positions: i) a strictly conserved glutamate (E) in helix alphaA (E1291 in Beet yellows closterovirus (BYV)); ii) a strictly conserved proline (P1380) in alphaD; and iii) a conserved basic position (arginine or lysine; R1384 in BYV). The presence of a conserved proline in helix alphaD is noteworthy because prolines are strongly disfavoured in helices; this proline most probably induces a kink in the helix. Functional studies have suggested that most of the Zemlya region targets the ER and remodels the ER membranes. More specifically, deletion analysis and substitutions of the conserved hydrophobic amino acid residues suggest a role of the putative amphipathic helix 1368-1385 (alphaD) in the formation of globules. Hence it was proposed that this specific region in the 1a protein may be involved in the biogenesis of closterovirus.	106
407543	pfam17647	DUF5518	Family of unknown function (DUF5518). This is a family of unknown function found in Archaea. Family members have multiple predicted trans-membrane regions.	118
407544	pfam17648	DUF5519	Family of unknown function (DUF5519). This is a family of unknown function.	96
407545	pfam17649	VPS38	Vacuolar protein sorting 38. The class III phosphatidylinositol-3-kinase (PI3K) known as Vps34 (vacuolar protein sorting 34, encoded by PIK3C3) regulates intracellular membrane trafficking in endocytic sorting, cytokinesis and autophagy. Vps34 forms complexes with other proteins: Vps15 (encoded by PIK3R4, known as p150 in mammalian cells), Vps30 (encoded by VPS30/ATG6 in yeast, equivalent to mammalian Beclin 1, encoded by BECN1) and either Vps38 (UVRAG) or Atg14 (ATG14L). This family includes members such as Vps38 found in Saccharomyces cerevisiae. Vps38 is characteristic of complex II and essential for vacuolar protein sorting. In mammalian cells, complex II is also involved in autophagy, receptor degradation and cytokinesis as well as signaling, recycling and lysosomal tubulation. Independently from complex I and II, Beclin 1 and UVRAG also play separate roles in endosome function and neuron viability. In complex I, Vps38/UVRAG is substituted with Atg14/ATG14L. Although the N-terminal domains of Vps30, Vps38 and Atg14 differ, the overall similarity of their domain organizations suggests that these proteins may have evolved from a common ancestor.	425
407546	pfam17650	RACo_linker	RACo linker region. This family includes reductive activator of CoFeSP (RACo) proteins. Structural analysis of RACo indicates that it contains 4 regions: the N-terminal region pfam00111 (residues 3-94) binding the [2Fe-2S] cluster, a linker region (residues 95-125), the middle region (residues 126-206), and the large C-terminal domain pfam14574 (residues 207-630). This entry pertains to the linker region. The linker region is only present in RACE (reductive activases for corrinoid enzymes) protein sequences with the N-terminal [2Fe-2S] cluster family pfam00111 and is absent in the RamA-like RACE proteins, suggesting that the linker domain and the N-terminal domain form one functional unit.	86
407547	pfam17651	Raco_middle	RACo middle region. This family includes reductive activator of CoFeSP (RACo) proteins. Structural analysis of RACo indicates that it contains 4 regions: the N-terminal region pfam00111 (residues 3-94) binding the [2Fe-2S] cluster, a linker region (residues 95-125), the middle region (residues 126-206), and the large C-terminal domain pfam14574 (residues 207-630). This entry pertains to the middle region. This region contains residues in its alpha-helices (H6 and H7) that mediate dimerization with subdomain I of the C-terminal domain.	163
407548	pfam17652	Glyco_hydro81C	Glycosyl hydrolase family 81 C-terminal domain. Family of eukaryotic beta-1,3-glucanases. Within the Aspergillus fumigatus protein, two perfectly conserved Glu residues (E550 or E554) have been proposed as putative nucleophiles of the active site of the Engl1 endoglucanase, while the proton donor would be D475. The endo-beta-1,3-glucanase activity is essential for efficient spore release. This entry represents the helical C-terminal domain.	349
407549	pfam17653	DUF5522	Family of unknown function (DUF5522). This is a family of unknown function. Family members are found in Bacteria and Eukaryotes. In algae, this family is found on the N-terminal of diphthamide synthase family pfam01902 and is predicted to be a membrane transporter belonging to the ATP-binding cassette (ABC) family. In nematoda, it is found on the N-terminal side of ribosomal protein family L14 pfam00238. It is also found on the N-terminal of Cob(I)alamin adenosyltransferase pfam01923 in other eukaryotes. Family members found in Homo sapiens have been shown to be highly expressed in 15 high-grade neuroendocrine tumor cell lines and YAP1-positive small-cell lung cancer cell lines as well as being up-regulated in two human multiple myeloma cell lines in response to the selective nuclear export inhibitor KPT-276. The HMM profile of this family reveals 4 highly conserved cysteines.	48
407550	pfam17654	Trnau1ap	Selenocysteine tRNA 1 associated proteins. This entry represents the C-terminal region of Selenocysteine tRNA 1 associated proteins (Trnau1ap, also known as Secp43). Family members found in Eukaryotes have been shown to serve an essential role in the synthesis of selenoproteins, which have critical functions in numerous biological processes. Selenium deficiency results in a variety of diseases, including cardiac disease. Trnau1ap proteins harbor RNA recognition motifs (RRM) pfam00076 and a Tyr-rich region found at the C-terminus. The Tyr-rich region (amino acids 185-225) is conserved among several mammals, including human, chimp, dog, cattle, mouse and rat. Furthermore, constitutive deletion of exons corresponding to the Tyr-rich region in mouse resulted in embryonic lethality.	101
407551	pfam17655	IRK_C	Inward rectifier potassium channel C-terminal domain. This cytoplasmic C-terminal domain has an Ig fold.	174
407552	pfam17656	ChapFlgA_N	FlgA N-terminal domain. Presumed domain found N-terminal to the SAF-like domain in FlgA proteins.	76
407553	pfam17657	DNA_pol3_finger	Bacterial DNA polymerase III alpha subunit finger domain. 	166
407554	pfam17658	DUF5520	Family of unknown function (DUF5520). This is a family of unknown function found in Mammalia.	338
407555	pfam17659	DUF5521	Family of unknown function (DUF5521). This is a family of unknown function found in Eukaryota. Family members include the human CXorf57. High-throughput sequencing used to identify genes driving tumorigenesis in Avian leukosis virus (ALV)-induced B-cell lymphomas, showed CXorf57 as the 10th most frequently targeted common integration site by ALV. ALV induces B-cell lymphoma and other neoplasms in chickens by integrating within or near cancer genes and perturbing their expression. CXorf57 encodes a protein that has a conserved putative replication factor A protein 1 domain. Several proteins with this domain have been shown to be involved in recognition of DNA damage for nucleotide excision repair. CXorf57 contains 24 unique integration sites that are spaced throughout the gene and in no preferred orientation.	848
407556	pfam17660	BTRD1	Bacterial tandem repeat domain 1. This short domain is found in a wide variety of bacterial proteins.	50
407557	pfam17661	DUF5523	Family of unknown function (DUF5523). This is a family of unknown function found in Eukaryotes. Family members such as the human CC1D2A protein carry the domain architecture where pfam15625 and pfam00168 are found at the C-terminal region of this family. However, other family members do not carry either of the above mentioned Pfam families.	255
407558	pfam17662	DUF5524	Family of unknown function (DUF5524). This is a family of unknown function found in Metazoa.	290
407559	pfam17663	DUF5525	Family of unknown function (DUF5525). This is a family of unknown function found in Chordata.	1017
407560	pfam17664	DUF5526	Family of unknown function (DUF5526). This is a family of unknown function found in Metazoa.	168
407561	pfam17665	DUF5527	Family of unknown function (DUF5527). This is a family of unknown function found in Chordata.	139
407562	pfam17666	DUF5528	Family of unknown function (DUF5528). This is a family of unknown function found in Chordata.	152
407563	pfam17667	Pkinase_fungal	Fungal protein kinase. This domain appears to be a variant of the protein kinase domain that is found in a variety of fungal species.	386
407564	pfam17668	Acetyltransf_17	Acetyltransferase (GNAT) domain. 	109
375243	pfam17669	DUF5529	Family of unknown function (DUF5529). This is a family of unknown function found in Chordata.	186
407565	pfam17670	DUF5530	Family of unknown function (DUF5530). This is a family of unknown function found in Chordata.	141
407566	pfam17671	DUF5531	Family of unknown function (DUF5531). This is a family of unknown function found in Mammalia. Family members have one or several predicted trans-membrane regions.	151
375246	pfam17672	DUF5589	Family of unknown function (DUF5589). This is a family of unknown function found in Mammalia. Family members contain one or several predicted trans-membrane regions.	89
407567	pfam17673	DUF5532	Family of unknown function (DUF5532). This is a family of unknown function found in mammals.	92
407568	pfam17674	HHH_9	HHH domain. 	70
407569	pfam17675	APG6_N	Apg6 coiled-coil region. In yeast, 15 Apg proteins coordinate the formation of autophagosomes. Autophagy is a bulk degradation process induced by starvation in eukaryotic cells. Apg6/Vps30p has two distinct functions in the autophagic process, either associated with the membrane or in a retrieval step of the carboxypeptidase Y sorting pathway.	131
407570	pfam17676	Peptidase_S66C	LD-carboxypeptidase C-terminal domain. Muramoyl-tetrapeptide carboxypeptidase hydrolyses a peptide bond between a di-basic amino acid and the C-terminal D-alanine in the tetrapeptide moiety in peptidoglycan. This cleaves the bond between an L- and a D-amino acid. The function of this activity is in murein recycling. This family also includes the microcin c7 self-immunity protein. This family corresponds to Merops family S66.	120
407571	pfam17677	Glyco_hydro38C2	Glycosyl hydrolases family 38 C-terminal beta sandwich domain. This domain is found at the C-terminal end of various glycosyl hydrolases belonging to family 38. The domain has a beta sandwich fold.	73
407572	pfam17678	Glyco_hydro_92N	Glycosyl hydrolase family 92 N-terminal domain. This domain is found at the N-terminus of family 92 glycosyl hydrolase proteins.	231
375249	pfam17679	Dip	gp37/Dip protein. This protein is found in the giant phage phi KZ. This protein has been shown to bind to RNAse E of the host P. aeruginosa and to inhibit the RNA degradation machinery of the bacterium.	273
407573	pfam17680	FlgO	FlgO protein. This entry represents the FlgO protein. Mutation of this protein in Vibrio cholerae has been shown to reduce motility. FlgO is an outer membrane protein that localizes throughout the membrane and not at the flagellar pole. Although FlgO and FlgP do not specifically localize to the flagellum, they are required for flagellar stability. Proteins in this family mostly contain an N-terminal lipoprotein attachment motif.	130
407574	pfam17681	GCP_N_terminal	Gamma tubulin complex component N-terminal. This is the N-terminal domain found in components of the gamma-tubulin complex proteins (GCPs). Family members include spindle pole body (SPB) components such as Spc97 and Spc98 which function as the microtubule-organizing center in yeast. Furthermore, family members such as human GCP4 (Gamma-tubulin complex component 4) have been structurally elucidated. Functional studies have shown that the N-terminal domain defines the functional identity of GCPs, suggesting that all GCPs are incorporated into the helix of gamma-tubulin small complexes (gTURCs) via lateral interactions between their N-terminal domains. Thereby, they define the direct neighbors and position the GCPs within the helical wall of gTuRC. Sequence alignment of human GCPs based on the GCP4 structure helped delineate conserved regions in the N- and C-terminal domains. In addition to the conserved sequences, the N-terminal domains carry specific insertions of various sizes depending on the GCP, i.e. internal insertions or N-terminal extensions. These insertions may equally contribute to the function of individual GCPs as they have been implicated in specific interactions with regulatory or structural proteins. For instance, GCP6 carries a large internal insertion phosphorylated by Plk4 and containing a domain of interaction with keratins, whereas the N-terminal extension of GCP3 interacts with the recruitment protein MOZART1.	294
407575	pfam17682	Tau95_N	Tau95 Triple barrel domain. TFIIIC1 is a multisubunit DNA binding factor that serves as a dynamic platform for assembly of pre-initiation complexes on class III genes. This entry represents the tau 95 subunit which holds a key position in TFIIIC, exerting both upstream and downstream influence on the TFIIIC-DNA complex by rendering the complex more stable. Once bound to tDNA-intragenic promoter elements, TFIIIC directs the assembly of TFIIIB on the DNA, which in turn recruits the RNA polymerase III (pol III) and activates multiple rounds of transcription.	115
407576	pfam17683	TFIIF_beta_N	TFIIF, beta subunit N-terminus. Accurate transcription in vivo requires at least six general transcription initiation factors, in addition to RNA polymerase II. Transcription initiation factor IIF (TFIIF) is a tetramer of two beta subunits associated with two alpha subunits, which interacts directly with RNA polymerase II. The beta subunit of TFIIF is required for recruitment of RNA polymerase II onto the promoter.	104
407577	pfam17684	SCAB-PH	PH domain of plant-specific actin-binding protein. This family is a PH domain found on plant-specific actin-binding proteins or SCABs. SCAB proteins bind, bundle and stabilize actin filaments and regulate stomatal movement. The Ig-PH fusion domain is at the C-terminus. This domain adopts the PH fold, of seven beta-strands, beta7-beta13, and two alpha-helices, alpha1 and alpha2, arranged into a beta-barrel. The canonical phosphoinositide-binding pocket of the classic PH domain is degenerate in this fused one, and the charge on the pocket suggests that the Ig-PH domain contains a non-canonical binding site for inositol phosphates.	108
375254	pfam17685	DUF5533	Family of unknown function (DUF5533). This is a family of unknown function found in chordata. Family members have multiple predicted transmembrane regions.	139
407578	pfam17686	DUF5534	Family of unknown function (DUF5534). This is a family of unknown function found in mammals. Family members have one or several predicted trans-membrane regions.	183
375256	pfam17687	DUF5535	Family of unknown function (DUF5535). This is a family of unknown function found in mammals.	105
407579	pfam17688	DUF5536	Family of unknown function (DUF5536). This is a family of unknown function found in mammals.	185
407580	pfam17689	Arabino_trans_N	Arabinosyltransferase concanavalin like domain. Arabinosyltransferase is involved in arabinogalactan (AG) biosynthesis pathway in mycobacteria. AG is a component of the macromolecular assembly of the mycolyl-AG-peptidoglycan complex of the cell wall. This enzyme has important clinical applications as it is believed to be the target of the antimycobacterial drug Ethambutol.	155
375258	pfam17690	DUF5537	Family of unknown function (DUF5537). This is a family of unknown function found in chordata.	160
407581	pfam17691	Croc_4	Contingent replication of cDNA 4. Family members are 18-kDa serine/threonine-rich polypeptides containing a P-loop motif and an SH3-binding region with phosphorylation sites for a variety of protein kinases (cdc2, CDK2, MAPK, CDK5, protein kinase C, Ca(2+)/calmodulin protein kinase 2, casein kinase 2) involved in cell proliferation and differentiation. Functional studies revealed that expression is associated with proliferating and migrating cells in developing brain. Furthermore, it has been suggested that CROC-4 participates in brain-specific c-fos signaling pathways involved in cellular remodeling of brain architecture. C1orf61 expression was also found associated with the progression of liver disease as well as human embryogenesis. It was shown to be up-regulated in hepatic cirrhosis tissues and further up-regulated in primary hepatocellular carcinoma tumors where it was suggested to play a role as a tumor activator.	79
407582	pfam17692	DUF5538	Family of unknown function (DUF5538). This is a family of unknown function found in primates.	130
375261	pfam17693	DUF5539	Family of unknown function (DUF5539). This is a family of unknown function found in primates.	201
375262	pfam17694	DUF5540	Family of unknown function (DUF5540). This is a family of unknown function found in primates.	80
407583	pfam17695	DUF5541	Family of unknown function (DUF5541). This is a family of unknown function found in primates.	116
407584	pfam17696	DUF5542	Family of unknown function (DUF5542). This is a family of unknown function. The C-terminal has a strongly conserved WxxxW motif.	68
375265	pfam17697	DUF5543	Family of unknown function (DUF5543). This is a family of unknown function found in primates.	118
375266	pfam17698	DUF5544	Family of unknown function (DUF5544). This is a family of unknown function found in primates.	125
375267	pfam17699	DUF5545	Family of unknown function (DUF5545). This is a family of unknown function found in primates.	221
407585	pfam17700	DUF5546	Family of unknown function (DUF5546). This is a family of unknown function found in primates.	135
407586	pfam17701	DUF5547	Family of unknown function (DUF5547). This is a family of unknown function found in mammals.	123
407587	pfam17702	DUF5548	Family of unknown function (DUF5548). This is a family of unknown function found in primates.	177
407588	pfam17703	DUF5549	Family of unknown function (DUF5549). This is a family of unknown function found in mammals.	200
375272	pfam17704	DUF5550	Family of unknown function (DUF5550). This is a family of unknown function found in primates.	123
375273	pfam17705	DUF5551	Family of unknown function (DUF5551). This is a family of unknown function found in primates.	156
375274	pfam17706	DUF5552	Family of unknown function (DUF5552). This is a family of unknown function found in primates.	217
407589	pfam17707	DUF5553	Family of unknown function (DUF5553). This is a family of unknown function found in primates.	223
407590	pfam17708	Gasdermin_C	Gasdermin PUB domain. The precise function of this protein is unknown. A deletion/insertion mutation is associated with an autosomal dominant non-syndromic hearing impairment form. In addition, this protein has also been found to contribute to acquired etoposide resistance in melanoma cells. This family also includes the gasdermin protein	174
375277	pfam17709	DUF5554	Family of unknown function (DUF5554). This is a family of unknown function found in primates.	75
407591	pfam17710	DUF5555	Family of unknown function (DUF5555). This is a family of unknown function found in mammals.	295
375279	pfam17711	DUF5556	Family of unknown function (DUF5556). This is a family of unknown function found in primates.	179
375280	pfam17712	DUF5557	Family of unknown function (DUF5557). This is a family of unknown function found in primates.	90
407592	pfam17713	DUF5558	Family of unknown function (DUF5558). This is a family of unknown function found in Homo sapiens.	129
375282	pfam17714	DUF5559	Family of unknown function (DUF5559). This is a family of unknown function found in primates.	194
375283	pfam17715	DUF5560	Family of unknown function (DUF5560). This is a family of unknown function found in primates.	165
407593	pfam17716	DUF5561	Family of unknown function (DUF5561). This is a family of unknown function found in eukaryota.	255
407594	pfam17717	DUF5562	Family of unknown function (DUF5562). This is a family of unknown function found in mammals. Family members have one or several predicted transmembrane regions.	166
375286	pfam17718	DUF5563	Family of unknown function (DUF5563). This is a family of unknown function found in chordata.	192
407595	pfam17719	DUF5564	Family of unknown function (DUF5564). This is a family of unknown function found in chordata.	98
407596	pfam17720	DUF5565	Family of unknown function (DUF5565). This is a family of unknown function found in bacteria and eukaryotes.	324
407597	pfam17721	DUF5566	Family of unknown function (DUF5566). This is a family of unknown function found in chordata.	233
407598	pfam17722	DUF5567	Family of unknown function (DUF5567). This is a family of unknown function found in chordata.	234
407599	pfam17723	RHH_8	Ribbon-Helix-Helix transcriptional regulator family. This family of proteins are likely to be transcriptional regulators that have an N-terminal ribbon-helix-helix domain. Although some members of the family are annotated as CopG, this family does not include that protein.	119
407600	pfam17724	DUF5568	Family of unknown function (DUF5568). This is a family of unknown function found in chordata.	203
407601	pfam17725	YBD	YAP binding domain. TEA domain transcription factors contain an N-terminal TEA domain pfam01285 and a C-terminal YAP binding domain (YBD). This entry corresponds to the YBD that binds to the oncoproteins YAP and TAZ. The structure of the YBD shows that it has an Ig-like beta sandwich fold.	206
407602	pfam17726	DpnI_C	Dam-replacing HTH domain. Dam-replacing protein (DRP) is a restriction endonuclease that is flanked by pseudo-transposable small repeat elements. The replacement of Dam-methylase by DRP allows phase variation through slippage-like mechanisms in several pathogenic isolates of Neisseria meningitidis. This domain represents the C-terminal HTH domain.	69
407603	pfam17727	CtsR_C	CtsR C-terminal dimerization domain. This family consists of several Firmicute transcriptional repressor of class III stress genes (CtsR) proteins. CtsR of L. monocytogenes negatively regulates the clpC, clpP and clpE genes belonging to the CtsR regulon. This entry corresponds to the C-terminal dimerization domain.	72
407604	pfam17728	BsuBI_PstI_RE_N	BsuBI/PstI restriction endonuclease HTH domain. This family represents the C-terminus of bacterial enzymes similar to type II restriction endonucleases BsuBI and PstI (EC:3.1.21.4). The enzymes of the BsuBI restriction/modification (R/M) system recognize the target sequence 5'CTGCAG and are functionally identical with those of the PstI R/M system.	140
407605	pfam17729	DUF5569	Family of unknown function (DUF5569). This is a family of unknown function found in mammals.	227
407606	pfam17730	Centro_C10orf90	Centrosomal C10orf90. This is the N-terminal region found on proteins encoded by C10orf90. Most of the family members carry an ALMS motif (pfam15309) at their C-terminus. siRNA-mediated functional analysis suggests that the C10orf90 encoded proteins have a role in centrosomal functions.	512
407607	pfam17731	DUF5570	Family of unknown function (DUF5570). This is a family of unknown function found in chordata. Family members contain a transmembrane region in the C-terminal and have been shown to be localized to the Golgi apparatus.	123
407608	pfam17732	DUF5571	Family of unknown function (DUF5571). This is a family of unknown function found in chordata. Family members carry a zinc finger family on the N-terminal pfam15663.	351
407609	pfam17733	DUF5572	Family of unknown function (DUF5572). This is a family of unknown function found in eukaryotes. Family members carry a highly conserved KPWE sequence at the C-terminal.	48
375300	pfam17734	Spt46	Spermatogenesis-associated protein 46. This family is found in chordata. Functional characterization studies showed that the deletion of Spata46 in mice resulted in subfertility with abnormal sperm head shape and a failure of sperm-egg fusion. Spata46 has also been shown to localize to the nuclear membrane by a transmembrane region in the N-terminal.	228
407610	pfam17735	BslA	Biofilm surface layer A. This family includes members such as BslA (previously called YuaB). Secreted BslA from Bacillus subtilis has been shown to form surface layers around the biofilm, self-assembling at interfaces of B. subtilis biofilms and forming an elastic film. Structural analysis revealed that BslA consists of an Ig-type fold with the addition of an unusual, extremely hydrophobic cap region. The hydrophobic cap exhibits physicochemical properties similar to the hydrophobic surface found in fungal hydrophobins; thus, BslA is defined as a member of a class of bacterially produced hydrophobins.	121
407611	pfam17736	Ig_C17orf99	C17orf99 Ig domain. This Ig domain is found in tandem in the uncharacterized human protein C17orf99, which is found across mammalian species.	95
375302	pfam17737	Ig_C19orf38	Ig domain in C19orf38 (HIDE1). This entry represents an Ig domain found in the uncharacterized human protein C19orf38, which is found across mammals. Family members have one predicted transmembrane region which is not included in this entry. The C19orf38 protein is also known as HIDE1 after Highly expressed in Immature DEndritic cell transcript 1 protein.	91
407612	pfam17738	DUF5575	Family of unknown function (DUF5575). This is a family of unknown function found in chordates.	307
375304	pfam17739	DUF5576	Family of unknown function (DUF5576). This is a family of unknown function found in Hominidae.	134
407613	pfam17740	DUF5577	Family of unknown function (DUF5577). This is a family of unknown function found in Metazoa.	334
407614	pfam17741	DUF5578	Family of unknown function (DUF5578). This is a family of unknown function found in Eukaryotes.	268
375307	pfam17742	DUF5579	Family of unknown function (DUF5579). This is a family of unknown function found in chordates. Family members carry one predicted transmembrane region.	202
407615	pfam17743	DUF5580	Family of unknown function (DUF5580). This is a family of unknown function found in metazoa.	547
407616	pfam17744	DUF5581	Family of unknown function (DUF5581). This is a family of unknown function found in chordates.	315
407617	pfam17745	Ydr279_N	Ydr279p protein triple barrel domain. RNases H are enzymes that specifically hydrolyse RNA when annealed to a complementary DNA and are present in all living organisms. In yeast RNase H2 is composed of a complex of three proteins (Rnh2Ap, Ydr279p and Ylr154p), this family represents the homologues of Ydr279p. It is not known whether non yeast proteins in this family fulfil the same function. This domain corresponds to the N-terminal triple barrel domain.	66
407618	pfam17746	SfsA_N	SfsA N-terminal OB domain. This family contains sugar fermentation stimulation proteins, which are probably regulatory factors involved in maltose metabolism. This domain corresponds to the N-terminal OB fold.	66
407619	pfam17747	VID27_PH	VID27 PH-like domain. This region has been predicted to contain a PH-like domain.	109
407620	pfam17748	VID27_N	VID27 N-terminal region. This region may contain a PH domain.	173
407621	pfam17749	MIP-T3_C	Microtubule-binding protein MIP-T3 C-terminal region. This protein, which interacts with both microtubules and TRAF3 (tumour necrosis factor receptor-associated factor 3), is conserved from worms to humans. The N-terminal region is the microtubule binding domain and is well-conserved; the C-terminal 100 residues, also well-conserved, constitute the coiled-coil region which binds to TRAF3. The central region of the protein is rich in lysine and glutamic acid and carries KKE motifs which may also be necessary for tubulin-binding, but this region is the least well-conserved.	154
407622	pfam17750	Reo_sigmaC_M	Reovirus sigma C capsid protein triple beta spiral. This short region forms a triple beta spiral structural motif.	40
407623	pfam17751	SKICH	SKICH domain. The SKICH domains of SKIP and PIPP mediate plasma membrane localisation. The functions of the SKICH domains of NDP52 and CALCOCO1 are not known.	102
407624	pfam17752	BLF1	Burkholderia lethal factor 1. This family includes members such as BLF1 (Burkholderia lethal factor 1) also known as BPSL1549. BLF1 is a potent toxin from Burkholderia pseudomallei causing melioidosis. BLF1 interacts with the human translation factor eIF4A causing deamidation of Gln339 to Glu. Thereby, reducing endogenous host cell protein synthesis and triggering increased stress granule formation, which is associated with translational blocks. Structural analysis of BLF1 revealed an alpha/beta fold comprising a sandwich of two mixed beta-sheets surrounded by loops and alpha-helices, where the beta-sheet core of the catalytic pocket is structurally similar to that of the deamidase domain of CNF1 pfam05785.	213
407625	pfam17753	Ig_mannosidase	Ig-fold domain. This Ig-like fold domain is found in mannosidase enzymes.	78
407626	pfam17754	TetR_C_14	MftR C-terminal domain. This domain is found at the C-terminus of TetR like transcription factors including the Mycofactin biosynthesis transcription factor.	112
407627	pfam17755	UvrA_DNA-bind	UvrA DNA-binding domain. 	110
407628	pfam17756	RET_CLD1	RET Cadherin like domain 1. RET is a single transmembrane-spanning receptor tyrosine kinase (RTK) that plays critical roles in the development of vertebrates. Structural analysis indicates that RET contains four consecutive cadherin-like domains (CLD). This entry relates to the first CLD at the N-terminus. Several regions within RET-CLD1 have been shown to be important for ligand-coreceptor binding. CLD1 and CLD2 have a distinctive clamshell shape and CLD1 is essential for CLD2 folding. CLD1 contains 2 sites for GDNF receptor alpha 1 binding.	125
407629	pfam17757	UvrB_inter	UvrB interaction domain. This domain is found in the UvrB protein where it interacts with the UvrA protein.	91
407630	pfam17758	Prot_ATP_OB_N	Proteasomal ATPase OB N-terminal domain. This is the N-terminal oligonucleotide binding (OB) domain of proteasomal ATPase.	62
407631	pfam17759	tRNA_synthFbeta	Phenylalanyl tRNA synthetase beta chain CLM domain. This domain corresponds to the catalytic like domain (CLM) in the beta chain of phe tRNA synthetase.	214
407632	pfam17760	UvrA_inter	UvrA interaction domain. This domain found in UvrA proteins interacts with the UvrB protein.	109
407633	pfam17761	DUF1016_N	DUF1016 N-terminal domain. This family may include an HTH domain.	136
407634	pfam17762	HTH_ParB	HTH domain found in ParB protein. 	52
407635	pfam17763	Asparaginase_C	Glutaminase/Asparaginase C-terminal domain. This domain is found at the C-terminus of asparaginase enzymes.	114
407636	pfam17764	PriA_3primeBD	3' DNA-binding domain (3'BD). This domain represents the N-terminal DNA-binding domain found in the PriA protein. The 3'BD has been shown to bind the 3' end of the leading-strand arm of replication fork structures.	96
407637	pfam17765	MLTR_LBD	MmyB-like transcription regulator ligand binding domain. This domain is found in a family of actinobacterial transcription factors. The structure shows it has a PAS domain like fold and it is bound to Myristic acid.	168
407638	pfam17766	fn3_6	Fibronectin type-III domain. This FN3 like domain is found at the C-terminus of cucumisin proteins.	98
407639	pfam17767	NAPRTase_N	Nicotinate phosphoribosyltransferase (NAPRTase) N-terminal domain. Nicotinate phosphoribosyltransferase (EC:2.4.2.11) is the rate limiting enzyme that catalyses the first reaction in the NAD salvage synthesis. This is the N-terminal domain of the enzyme.	124
407640	pfam17768	RecJ_OB	RecJ OB domain. This OB-fold is found in RecJ proteins where it binds to ssDNA.	107
407641	pfam17769	PurK_C	Phosphoribosylaminoimidazole carboxylase C-terminal domain. This entry represents the C-terminal domain of the PurK enzyme.	56
407642	pfam17770	RNase_J_C	Ribonuclease J C-terminal domain. This domain is found at the C-terminus of Ribonuclease J proteins. Its function is unknown, but deletion of this domain causes dissociation to monomers.	102
407643	pfam17771	ADAM_CR_2	ADAM cysteine-rich domain. This cysteine rich domain is found in a variety of ADAM like peptidases. This domain is distantly related to pfam08516.	69
407644	pfam17772	zf-MYST	MYST family zinc finger domain. This zinc finger domain is found in the MYST family of histone acetyltransferases.	55
407645	pfam17773	UPF0176_N	UPF0176 acylphosphatase like domain. This domain is found at the N-terminus of UPF0176 family proteins. It adopts a fold similar to the pfam00708 family.	92
407646	pfam17774	YlmH_RBD	Putative RNA-binding domain in YlmH. This domain adopts an RRM like fold and is found in the B. subtilis YlmH cell division protein.	84
407647	pfam17775	UPF0225	UPF0225 domain. This entry represents an NTF2-like domain found in bacterial proteins.	99
407648	pfam17776	NLRC4_HD2	NLRC4 helical domain HD2. This entry represents a helical domain found in the NLRC4 protein and NOD2 protein.	122
407649	pfam17777	RL10P_insert	Insertion domain in 60S ribosomal protein L10P. This domain is found in prokaryotic and archaeal ribosomal L10 protein.	71
407650	pfam17778	BLACT_WH	Beta-lactamase associated winged helix domain. This winged helix domain is found at the C-terminus of some beta lactamase enzymes.	46
407651	pfam17779	NOD2_WH	NOD2 winged helix domain. This winged helix domain is found in the NOD2 protein. Its molecular function is not known.	57
407652	pfam17780	OCRE	OCRE domain. The OCtamer REpeat (OCRE) has been annotated as a 42-residue sequence motif with 12 tyrosine residues in the spliceosome trans-regulatory elements RBM5 and RBM10 (RBM [RNA-binding motif]), which are known to regulate alternative splicing of Fas and Bcl-x pre-mRNA transcripts. The structure of the domain consists of an anti-parallel arrangement of six beta strands.	51
407653	pfam17781	RPN1_RPN2_N	RPN1/RPN2 N-terminal domain. This domain is found at the N-terminus of the 26S proteasome regulatory subunits RPN1 and RPN2. The domain is formed by an array of alpha helices.	301
407654	pfam17782	DprA_WH	DprA winged helix domain. This winged helix domain is found in the DprA protein.	61
407655	pfam17783	CvfB_WH	CvfB-like winged helix domain. This winged helix domain is found in RNA-binding proteins such as CvfB.	58
407656	pfam17784	Sulfotransfer_4	Sulfotransferase domain. This family of proteins is distantly related to sulfotransferase enzymes. This protein in S. mansoni has been shown to be involved in resistance to oxamniquine and to have sulfotransferase activity.	214
407657	pfam17785	PUA_3	PUA-like domain. This PUA-like domain is found at the N-terminus of SAM-dependent methyltransferases.	64
407658	pfam17786	Mannosidase_ig	Mannosidase Ig/CBM-like domain. This domain corresponds to domain 4 in the structure of Bacteroides thetaiotaomicron beta-mannosidase, BtMan2A. This domain has an Ig-like fold.	91
407659	pfam17787	PH_14	PH domain. This entry corresponds to the PH domain found at the N-terminus of phospholipase C enzymes.	131
407660	pfam17788	HypF_C	HypF Kae1-like domain. This domain is found in the HypF protein. In the structure it is one of the two subdomains of the Kae1 domain.	99
407661	pfam17789	MG4	Macroglobulin domain MG4. This domain is MG4 found in complement C3 and C5 proteins.	95
407662	pfam17790	MG1	Macroglobulin domain MG1. This entry represents the N-terminal macroglobulin domain found in complement proteins C3, C4 and C5.	101
407663	pfam17791	MG3	Macroglobulin domain MG3. This entry corresponds to the MG3 domain found in complement components C3, C4 and C5.	83
407664	pfam17792	ThiD2	ThiD2 family. This domain functions as a ThiD protein and is called the ThiD2 family. The domain is associated with the ThiE domain in some proteins.	124
407665	pfam17793	AHD	ANC1 homology domain (AHD). This entry corresponds to the ANC1 homology domain (AHD) found in AF-9.	61
407666	pfam17794	Vault_2	Major Vault Protein repeat domain. This short domain is found repeated numerous times in the Major Vault Protein. This entry is related to pfam01505.	60
407667	pfam17795	Vault_3	Major Vault Protein Repeat domain. This domain is found in the Major Vault Protein.	62
407668	pfam17796	Vault_4	Major Vault Protein repeat domain. 	61
407669	pfam17797	RL	RL domain. The RRM-like (RL) domain is found in the N-terminal region of the polyA polymerase PAPD1. It contributes to PAPD1 dimerization and has a fold similar to RNP-type RBDs.	71
407670	pfam17798	TRIF-NTD	TRIF N-terminal domain. The N-terminal domain of TRIF/TICAM-1 has a structure that consists of eight antiparallel helices. This domain is believed to be involved in self-regulation of TRIF by interacting with its TIR domain.	157
407671	pfam17799	RRM_Rrp7	Rrp7 RRM-like N-terminal domain. This domain corresponds to the N-terminal RNA-binding domain found in the Rrp7 protein. It has an RRM-like fold with a circular permutation.	160
407672	pfam17800	NPL	Nucleoplasmin-like domain. 	88
407673	pfam17801	Melibiase_C	Alpha galactosidase C-terminal beta sandwich domain. This domain is found at the C-terminus of alpha galactosidase enzymes.	74
407674	pfam17802	SpaA	Prealbumin-like fold domain. This entry contains a prealbumin-like domain from a wide variety of bacterial surface proteins. This entry corresponds to domain 1 and domain 3 of SpaA from Corynebacterium diphtheriae. Some members of this family contain an isopeptide bond.	72
407675	pfam17803	Cadherin_4	Bacterial cadherin-like domain. This entry contains numerous bacterial cadherin-like domains found in extracellular proteins.	71
407676	pfam17804	TSP_NTD	Tail specific protease N-terminal domain. The N-terminal domain of tail specific proteases has a novel fold composed of 10 alpha helices.	187
407677	pfam17805	AsnC_trans_reg2	AsnC-like ligand binding domain. This entry contains an AsnC-like ligand binding domain.	86
375343	pfam17806	SO_alpha_A3	Sarcosine oxidase A3 domain. This short domain is found in Heterotetrameric Sarcosine Oxidase's alpha A3 domain. This domain binds to FMN in sarcosine oxidase. This domain is related to pfam04324 but lacks its iron binding cysteine residues.	87
407678	pfam17807	zf-UBP_var	Variant UBP zinc finger. This domain is found in ubiquitin C-terminal hydrolase enzymes and is related to the pfam02148 domain. However, it has an altered pattern of zinc binding residues.	64
375345	pfam17808	fn3_PAP	Fn3-like domain from Purple Acid Phosphatase. This entry represents an Fn3-like domain found at the N-terminus of purple acid phosphatase enzymes.	118
375346	pfam17809	UPA_2	UPA domain. The UPA domain is conserved in UNC5, PIDD, and Ankyrins. It has a beta sandwich structure.	131
407679	pfam17810	Arg_decarb_HB	Arginine decarboxylase helical bundle domain. This entry represents a helical bundle domain that is found between the two enzymatic domains of the arginine decarboxylases.	84
407680	pfam17811	JHD	Jumonji helical domain. This 4-helix bundle domain is associated with the Jumonji domain pfam02373.	104
407681	pfam17812	RET_CLD3	RET Cadherin like domain 3. RET is a single transmembrane-spanning receptor tyrosine kinase (RTK) that plays critical roles in the development of vertebrates. Structural analysis indicates that RET contains four consecutive cadherin-like domains (CLD). This entry relates to CLD3. Classical cadherin calcium-coordinating motifs can be found between CLD2 and CLD3.	114
407682	pfam17813	RET_CLD4	RET Cadherin like domain 4. RET is a single transmembrane-spanning receptor tyrosine kinase (RTK) that plays critical roles in the development of vertebrates. Structural analysis indicates that the ligand-binding RET ectodomain (RET-ECD) contains four consecutive cadherin-like domains (CLD1-CLD4) followed by a membrane-proximal cysteine-rich domain (CRD). This entry relates to CLD4 which is required for CRD folding.	104
375350	pfam17814	LisH_TPL	LisH-like dimerisation domain. TOPLESS (TPL) proteins have a highly conserved N-terminal domain containing a lissencephaly homologous (LisH) dimerization motif.	30
407683	pfam17815	PDZ_3	PDZ domain. This entry contains the second PDZ domain from plant peptidases such as Deg2. This domain is involved in cage assembly.	145
407684	pfam17816	PDZ_4	PDZ domain. This entry represents a PDZ domain that is found in the CPAF protein from Chlamydia trachomatis.	114
407685	pfam17817	PDZ_5	PDZ domain. This entry corresponds to PDZ domains found in neurabin and spinophilin proteins. The PDZ domain in spinophilin mediates its interaction with protein phosphatase PP1.	73
407686	pfam17818	KCT2	Keratinocyte-associated gene product. This entry includes Keratinocyte-associated transmembrane protein 2 found in humans. Functional studies show that KCP2 localizes to the endoplasmic reticulum, consistent with a role in protein biosynthesis, and has a functional KKxx retrieval signal at its cytosolic C-terminus.	187
407687	pfam17819	DUF5582	Family of unknown function (DUF5582). This is a family of unknown function found in chordata.	146
407688	pfam17820	PDZ_6	PDZ domain. This entry represents the PDZ domain from a wide variety of proteins.	54
375357	pfam17821	DUF5583	Family of unknown function (DUF5583). This is a family of unknown function found in chordata.	129
407689	pfam17822	DUF5584	Family of unknown function (DUF5584). This is a family of unknown function found in chordata.	230
407690	pfam17823	DUF5585	Family of unknown function (DUF5585). This is a family of unknown function found in chordata.	506
407691	pfam17824	DUF5586	Family of unknown function (DUF5586). This is a family of unknown function found in chordata.	404
407692	pfam17825	DUF5587	Family of unknown function (DUF5587). This is a family of unknown function found in chordata.	1440
375362	pfam17826	DUF5588	Family of unknown function (DUF5588). This is a family of unknown function found in chordata.	362
407693	pfam17827	PrmC_N	PrmC N-terminal domain. This entry corresponds to the N-terminal alpha helical domain of the HemK protein. HemK is a methyltransferase enzyme that carries out the methylation of the N5 nitrogen of the glutamine found in the conserved GGQ motif of class-1 release factors.	71
407694	pfam17828	FAS_N	N-terminal domain in fatty acid synthase subunit beta. This entry represents the N-terminal domain found in fatty acid synthase proteins.	127
407695	pfam17829	GH115_C	Glycosyl hydrolase family 115 C-terminal domain. This domain is found at the C-terminus of glycosyl hydrolase family 115 proteins. This domain has a beta-sandwich fold.	172
407696	pfam17830	STI1	STI1 domain. This entry corresponds to the STI1 domain that is found in two copies in the Sti1 protein.	55
375365	pfam17831	PDH_E1_M	Pyruvate dehydrogenase E1 component middle domain. This entry represents one of the thiamin diphosphate-binding domains found in pyruvate dehydrogenase E1 component.	230
407697	pfam17832	Pre-PUA	Pre-PUA-like domain. This Pre-PUA-like domain is found in a wide variety of proteins including the eukaryotic translation initiation factor 2D, where it is found at the N-terminus.	86
407698	pfam17833	UPF0113_N	UPF0113 Pre-PUA domain. 	81
407699	pfam17834	GHD	Beta-sandwich domain in beta galactosidase. This entry corresponds to a beta sandwich like domain found in glycosyl hydrolase family 35 beta galactosidase enzymes.	72
407700	pfam17835	NOG1_N	NOG1 N-terminal helical domain. This domain is found at the N-terminus of NOG1 GTPase proteins.	160
407701	pfam17836	PglD_N	PglD N-terminal domain. This alpha/beta domain is found at the N-terminus of proteins such as PglD. This domain binds a UDP-sugar substrate.	78
407702	pfam17837	4PPT_N	4'-phosphopantetheinyl transferase N-terminal domain. This entry represents the N-terminal domain from 4'-phosphopantetheinyl transferase enzymes. This domain is structurally related to the pfam01648 domain with which it forms a pseudodimeric arrangement.	68
407703	pfam17838	PH_16	PH domain. 	122
407704	pfam17839	CNP_C_terminal	C-terminal domain of cyclic nucleotide phosphodiesterase. This is the C-terminal domain found in the Listeria monocytogenes Lmo2642 cyclic nucleotide phosphodiesterase. The auxiliary C-terminal domain consists of five alpha-helices forming a long helical bundle, and is connected to the catalytic domain pfam00149 by two loop segments. It is suggested that this auxiliary domain of Lmo2642 might confer functional specificity to the protein through interactions with unknown factors or by taking part in substrate recognition.	108
407705	pfam17840	Tugs	Tethering Ubl4a to BAGS domain. This is the C-terminal domain of Ubiquitin-like protein 4A, an ortholog of yeast Get5. In budding yeast, GET proteins directly mediate the insertion of newly synthesized TA proteins into endoplasmic reticulum membranes. Similarly, mammalian BAG6, Ubl4a, and SGTA make up a trimeric complex that binds TA proteins post-translationally and then loads them onto the cytosolic ATPase TRC40, which in turn targets them to the endoplasmic reticulum. Structural studies show that this C-terminal TUGS domain of Ubl4a is essential for BAG6 tethering. Given that BAG6 mediates oligomeric complex formation of Ubl4a, TRC35, and TRC40 (mammalian counterparts of Get5, Get4, and Get3, respectively), the C-terminal TUGS domain might be crucial for supporting BAG6-mediated Ubl4a-TRC35 complex formation in humans as an alternative to the direct Get5-Get4 interaction in yeast.	47
407706	pfam17841	Bep_C_terminal	BID domain of Bartonella effector protein (Bep). This entry is the BID (Bep intracellular delivery) domain located at the C-terminus of Bartonella effector proteins (Beps). It functions as a secretion signal in a subfamily of protein substrates of bacterial type IV secretion (T4S) systems. It mediates transfer of (1) relaxases and the attached DNA during bacterial conjugation, and (2) numerous Beps during protein transfer into host cells infected by pathogenic Bartonella species. Crystal structures of several representative BID domains show a conserved fold characterized by a compact, antiparallel four-helix bundle topped with a hook.	97
407707	pfam17842	dsRBD2	Double-stranded RNA binding domain 2. This domain is found in HEN1 proteins from Arabidopsis. Structural characterization reveals that the small RNA substrate binds to two double-stranded RNA (dsRNA)-specific binding domains, dsRBD1 and dsRBD2. This entry relates to dsRBD2 which together with dsRBD1 forms a strong grip on the duplex region of the small RNA substrate, and these interactions help position the other duplex terminus towards the MTase domain pfam13847.	147
407708	pfam17843	MycE_N	MycE methyltransferase N-terminal. This is the N-terminal domain found in MycE from the mycinamicin biosynthetic pathway. MycE is a tetramer of a two-domain polypeptide, comprising a C-terminal catalytic MT domain and an N-terminal auxiliary domain, which is important for quaternary assembly and for substrate binding.	110
407709	pfam17844	SCP_3	Bacterial SCP ortholog. This domain is found in MSMEG_5817 gene product from M. smegmatis. It has been shown to be vital for mycobacterial survival within host macrophages. Crystal structure revealed a Rossmann-like fold alpha/beta two-layer sandwich forming a highly hydrophobic interface cavity and with high structural homology to the SCP family. Hence, it has been suggested that this domain may be involved in the interaction of apolar ligands through its hydrophobic cavity. Alanine-scanning mutagenesis of the hydrophobic cavity of MSMEG_5817 protein demonstrated that the conserved Val82 residue plays an important role in ligand binding.	93
407710	pfam17845	FbpC_C_terminal	FbpC C-terminal regulatory nucleotide binding domain. Most functional ABC transporters are composed of at least four sub-units: two trans-membrane (TM) domains where the transport process takes place and two cytoplasmic nucleotide binding domains (NBDs) providing the energy required for active transport. This entry is one of the two NBDs found at the C-terminal domain of FbpC, ferric iron uptake transporter, from Neisseria gonorrhoeae. The C-terminal regulatory domain adopts two OB-folds per monomer. These are similar in topology to those seen in the NBD (nucleotide binding domain) from the maltose uptake ABC transporter, MalK. However, FbpC does not open as far as MalK when ATP is removed from their respective closed structures. This difference was suggested to be due to the substantial domain swap in the regulatory domain of FbpC.	55
375377	pfam17846	XRN_M	Xrn1 helical domain. This helical domain is part of the Xrn1 catalytic core. Xrn1 is a cytoplasmic 5'-3' exonuclease that degrades decapped mRNAs.	442
375378	pfam17847	GlcV_C_terminal	Glucose ABC transporter C-terminal domain. This is the C-terminal domain found at the ATPase subunit of the glucose ABC transporter from Sulfolobus solfataricus. Overall, the C-terminal domain (residues 243-353) contains only beta-strands, which form an elongated barrel-shaped structure composed of two parts. This entry represents the upper part which includes a three-stranded anti-parallel beta-sheet and two small anti-parallel beta-strands. The overall structure of this domain is very similar to that of the C-terminal domain of MalK from T. litoralis; however, the function of the C-terminal domain in GlcV is not clear.	61
407711	pfam17848	zf-ACC	Acetyl-coA carboxylase zinc finger domain. Acetyl-coA carboxylase (ACC) is a central metabolic enzyme that catalyzes the committed step in fatty acid biosynthesis: biotin-dependent conversion of acetyl-coA to malonyl-coA. In bacteria this protein contains a small zinc finger domain.	26
407712	pfam17849	OB_Dis3	Dis3-like cold-shock domain 2 (CSD2). This domain has an OB fold and is found in the Dis3l2 protein. This domain along with CSD1 binds to RNA.	77
407713	pfam17850	CysA_C_terminal	CysA C-terminal regulatory domain. ABC (ATP-binding cassette) transporters share a common architecture comprising two variable hydrophobic transmembrane domains (TMDs) that form the translocation pathway and two conserved hydrophilic ABC-ATPases that hydrolyze ATP. This is the C-terminal regulatory domain found at the ATPase subunit of CysA, a putative sulfate ABC transporter from Alicyclobacillus acidocaldarius. The regulatory domain of CysA is built up of an elongated beta-barrel composed of two beta-sandwiches that form a common hydrophobic core.	43
407714	pfam17851	GH43_C2	Beta xylosidase C-terminal Concanavalin A-like domain. This domain is found to the C-terminus of the pfam04616 domain. This domain adopts a concanavalin A-like fold.	203
407715	pfam17852	Dynein_AAA_lid	Dynein heavy chain AAA lid domain. This entry corresponds to the extension domain of AAA domain 5 in the dynein heavy chain. This domain is composed of 8 alpha helices.	126
407716	pfam17853	GGDEF_2	GGDEF-like domain. This domain is distantly related to the GGDEF domain, suggesting these may be diguanylate cyclase enzymes.	116
407717	pfam17854	FtsK_alpha	FtsK alpha domain. FtsK is a DNA translocase that coordinates chromosome segregation and cell division in bacteria. In addition to its role as activator of XerCD site-specific recombination, FtsK can translocate double-stranded DNA (dsDNA) rapidly and directionally and reverse direction. FtsK can be split into three domains called alpha (this entry), beta and gamma. The alpha and beta domains contain the core ATPase machinery of the DNA translocase.	101
407718	pfam17855	MCM_lid	MCM AAA-lid domain. This entry represents the AAA-lid domain found in MCM proteins.	86
407719	pfam17856	TIP49_C	TIP49 AAA-lid domain. This family consists of the C-terminal region of several eukaryotic and archaeal RuvB-like 1 (Pontin or TIP49a) and RuvB-like 2 (Reptin or TIP49b) proteins. The N-terminal domain contains the pfam00004 domain. In zebrafish, the liebeskummer (lik) mutation causes development of hyperplastic embryonic hearts. lik encodes Reptin, a component of a DNA-stimulated ATPase complex. Beta-catenin and Pontin, a DNA-stimulated ATPase that is often part of complexes with Reptin, are in the same genetic pathways. The Reptin/Pontin ratio serves to regulate heart growth during development, at least in part via the beta-catenin pathway. TBP-interacting protein 49 (TIP49) was originally identified as a TBP-binding protein, and two related proteins are encoded by individual genes, tip49a and b. Although the function of this gene family has not been elucidated, they are supposed to play a critical role in nuclear events because they interact with various kinds of nuclear factors and have DNA helicase activities. TIP49a has been suggested to act as an autoantigen in some patients with autoimmune diseases.	66
375385	pfam17857	AAA_lid_1	AAA+ lid domain. This domain represents the AAA lid domain from dynein heavy chain D3.	100
375386	pfam17858	Defensin_int	Platypus intermediate defensin-like peptide. This entry represents a defensin like peptide identified in the platypus genome. Structurally it resembles the beta defensins. The peptide was found to display potent antimicrobial activity against Staphylococcus aureus and Pseudomonas aeruginosa.	44
375387	pfam17859	Pelovaterin	Pelovaterin. The pelovaterin peptide is a major intracrystalline peptide found in turtle eggshell. The global fold of pelovaterin is similar to that of human beta-defensins. Pelovaterin exhibits strong antimicrobial activity against two pathogenic gram-negative bacteria, Pseudomonas aeruginosa and Proteus vulgaris.	42
375388	pfam17860	Defensin_RK-1	RK-1-like defensin. This family includes RK-1 a defensin like peptide from rabbit kidney. The family also includes some rat alpha defensins.	34
375389	pfam17861	Laterosporulin	Laterosporulin defensin-like peptide. This entry corresponds to a bacteriocin from the bacterium Brevibacillus laterosporus called laterosporulin. This peptide has a defensin-like structure.	49
407720	pfam17862	AAA_lid_3	AAA+ lid domain. This entry represents the alpha helical AAA+ lid domain that is found to the C-terminus of AAA domains.	45
407721	pfam17863	AAA_lid_2	AAA lid domain. This entry represents the alpha helical AAA+ lid domain that is found to the C-terminus of AAA domains.	73
380039	pfam17864	AAA_lid_4	RuvB AAA lid domain. The RuvB protein makes up part of the RuvABC revolvasome which catalyses the resolution of Holliday junctions that arise during genetic recombination and DNA repair. Branch migration is catalysed by the RuvB protein that is targeted to the Holliday junction by the structure specific RuvA protein. This entry contains the AAA lid domain that is found to the C-terminus of the AAA domain.	74
407722	pfam17865	AAA_lid_5	Midasin AAA lid domain. This entry represents the alpha helical AAA+ lid domain that is found to the C-terminus of AAA domains. This lid domain is found in midasin proteins.	104
407723	pfam17866	AAA_lid_6	AAA lid domain. This entry represents the alpha helical AAA+ lid domain that is found to the C-terminus of AAA domains.	60
407724	pfam17867	AAA_lid_7	Midasin AAA lid domain. This entry represents the alpha helical AAA+ lid domain that is found to the C-terminus of AAA domains. This lid domain is found in midasin proteins.	106
407725	pfam17868	AAA_lid_8	AAA lid domain. This entry represents the alpha helical AAA+ lid domain that is found to the C-terminus of AAA domains.	72
407726	pfam17869	Cys_box	Anosmin cysteine rich domain. This is the Cys-box (cysteine-rich) domain found on the N-terminal of anosmin-1 proteins. It is suggested that the Cys-box domain may resemble the cysteine-rich region of the insulin-like growth factor receptor. Family members are found in chordates.	82
407727	pfam17870	Insulin_TMD	Insulin receptor trans-membrane segment. This entry represents the trans-membrane domain (TMD) found in insulin receptor proteins. The TMD of the insulin receptor is within the beta-subunit and contains 23 amino acids. Mutations in the TMD were shown to have effects on receptor biosynthetic processing and kinase activation. Substitution of the entire TMD of the insulin receptor (IR) resulted in constitutive kinase activation in vitro, while replacing the TMD with that of glycophorin A inhibited insulin action. Structural studies show that TMD contains a helix and a kink when it is purified in dodecylphosphocholine (DPC) micelles. The residues 942-948 preceding the TMD have a propensity to be a short helix and may interact with membrane.	47
407728	pfam17871	AAA_lid_9	AAA lid domain. This entry represents the alpha helical AAA+ lid domain that is found to the C-terminus of AAA domains.	104
407729	pfam17872	AAA_lid_10	AAA lid domain. This entry represents the alpha helical AAA+ lid domain that is found to the C-terminus of AAA domains.	99
375396	pfam17873	Rep_1B	Replicase polyprotein 1ab. This entry relates to a regulatory domain found in replicase polyprotein 1ab found in Arterivirus. Structural studies of arterivirus helicase (nsp10), indicate that this domain undergoes conformational changes on substrate binding. Besides the large conformational change, it is suggested that the regions at the surface of domain 1B not directly involved in DNA binding may become flexible. For example, domain 1B residues Arg95, Gly125 and Ala131 become disordered after DNA binding. Together with domains 1A and 2A it forms a nucleic acid-binding channel where the single-stranded part of the DNA substrate is bound to.	53
407730	pfam17874	TPR_MalT	MalT-like TPR region. This entry contains a series of TPR repeats.	336
407731	pfam17875	RPA43_OB	RPA43 OB domain in RNA Pol I. This is the OB domain found in RPA43 proteins (DNA-directed RNA polymerase I subunit RPA43, also known as A43) in yeast. Functional analysis of RNA polymerase I shows that subunits A14 and A43 form the heterodimer A14/43, which is distantly related to Rpb4/7 in Pol II and C17/25 in Pol III. Crystal structure analysis shows that the A43-A14 heterodimer forms the stalk that provides a platform for initiation factors and interacts with newly synthesized RNA.	111
407732	pfam17876	CSD2	Cold shock domain. Crystallographic structure analysis of E. coli wild-type RNase II revealed that the amino-terminal region starts with an alpha-helix followed by two consecutive five-stranded anti-parallel beta-barrels, identified as cold-shock domains (CSD1 and CSD2). This entry relates to CSD2 which lacks the typical sequence motifs RNPI and RNPII but contributes to RNA binding.	74
407733	pfam17877	Dis3l2_C_term	DIS3-like exonuclease 2 C terminal. This is the C-terminal S1 domain found in Dis3L2 proteins. Dis3L2 belongs to the RNase II/R 3-5 exonuclease superfamily, which includes the catalytic subunit of the RNA exosome in yeast and in humans.	87
407734	pfam17878	ssDBP	Single-stranded DNA-binding protein. Family members include single-stranded DNA binding protein encoded by the filamentous Pseudomonas bacteriophage Pf3.	72
375400	pfam17879	DNA_ligase_C	DNA ligase C-terminal domain. This is the C-terminal domain found in ATP-Dependent DNA Ligase from Bacteriophage T7. This domain has no ligase activity, however together with the N-terminal domain they bind to double-stranded DNA consistent with the idea that the DNA-binding site is between the domains in the intact protein. Furthermore, although the fold of domain 2 is very similar to a number of other proteins that bind single-stranded DNA, this domain does not bind to single-stranded DNA but instead has a high affinity for double-stranded DNA.	109
407735	pfam17880	Yos9_DD	Yos9 dimerization domain. This is the dimerization domain (DD) found in Yos9 proteins in yeast. Structural analysis revealed that this domain contributes to self association of Yos9. The overall fold of the domain can be classified as an alpha-beta-roll architecture, comprising two alpha-helices and seven beta-strands.	128
407736	pfam17881	DUF5590	Domain of unknown function (DUF5590). This is a domain of unknown function found in bacterial proteins.	45
407737	pfam17882	SBD	OAA-family lectin sugar binding domain. This domain is found in agglutinin family of lectins. Oscillatoria agardhii agglutinin (OAA)- family lectins comprise either one or two homologous domains, with a single domain possessing two glycan binding sites. OAA is one of the lectins with anti-HIV activity. This sugar binding domain is also found in Pseudomonas fluorescens agglutinin (PFA) and myxobacterial hemagglutinin (MBHA), where MBHA contains two sugar-binding domains (i.e. 4 sugar binding sites), whereas OAA and PFA are single-domain proteins (i.e. 2 sugar binding sites).	76
407738	pfam17883	MBG	MBG domain. This domain is found in a variety of bacterial extracellular proteins. Although initially described as having a divergent Ig fold this domain has a novel topology that is like a mirror image of the beta grasp fold. Hence the name of Mirror Beta Grasp (MBG) domain.	99
407739	pfam17884	DUF5591	Domain of unknown function (DUF5591). This is a domain of unknown function found in archaeal tRNA-guanine transglycosylase (EC:2.4.2.48) and in archaeosine synthase (EC:2.6.1.97) proteins.	149
407740	pfam17885	Smoa_sbd	Styrene monooxygenase A putative substrate binding domain. This domain is found in the 46 kDa FAD-specific styrene epoxidase (SMOA) protein, which comprises part of the styrene monooxygenase (SMO) two-component flavoprotein monooxygenase enzyme. Structural analysis indicates that the SMOA monomer comprises two globular domains spanned by a long alpha-helix. This domain contains a putative substrate binding site.	108
407741	pfam17886	ArsA_HSP20	HSP20-like domain found in ArsA. This domain is found at the C-terminus of ArsA like proteins. This domain is related to HSP20.	63
407742	pfam17887	Jak1_Phl	Jak1 pleckstrin homology-like domain. This entry is for the pleckstrin homology-like (PHL) subdomain found in Jak1 proteins. JAK1 is a member of the Janus kinase (JAK) family of non-receptor tyrosine kinases that are activated in response to cytokines and interferons. PHL (residues 283-419) together with the N-terminal ubiquitin-like subdomain (residues 36-111) and an acyl-coenzyme A binding protein-like subdomain (residues 148-282), associate into a canonical tri-lobed FERM domain.	145
407743	pfam17888	Carm_PH	Carmil pleckstrin homology domain. This is a non-canonical pleckstrin homology (PH) domain connected to a 16-leucine-rich repeat domain found in CARMIL (CP Arp2/3 complex myosin-I linker) proteins. The PH domain is interconnected with an N-terminal helix (N-helix), residues 10-20 and a C-terminal linker (Linker), residues 129-147 in mouse F-actin-uncapping protein LRRC16A. Structural and functional studies indicate that the PH domain is involved in direct binding to the PM (plasma membrane) and that an HD (helical domain) is responsible for antiparallel dimerization and enhancement of CARMIL's membrane-binding activity. Furthermore, it appears that CARMIL's PH domain mediates non-specific binding to the membrane, in contrast to other PH domains that bind polyphosphorylated phosphatidylinositides, which are thought to function as signalling lipids.	94
407744	pfam17889	NLRC4_HD	NLRC4 helical domain. This is a helical domain found in NLRC4, Nucleotide-binding and oligomerization domain-like receptor (NLR) proteins. Structural and functional studies indicate that the helical domain HD2 repressively contacted a conserved and functionally important alpha-helix of the NBD (nucleotide binding domain) in mouse NLR family CARD domain-containing protein 4. Furthermore, the HD2 domain was shown to cap the N-terminal side of the LRR (leucine-rich repeat) domain via extensive interactions. Other family members carrying this domain include baculoviral IAP repeat-containing protein 1 (Birc1) also known as neuronal apoptosis inhibitory protein (Naip).	106
407745	pfam17890	WW_like	Peptidoglycan hydrolase LytB WW-like domain. Structural analysis revealed that the catalytic domain of LytB consists of three structurally independent modules: SH3b, WW domain-like, and the glycoside hydrolase family 73 (GH73). This entry is the WW like domain found in endo-beta-N-acetylglucosaminidase LytB from Streptococcus pneumoniae. Functional analysis show that the deletion of both SH3b and WW modules almost completely abolished the activity of LytB. Furthermore, it was shown that the SH3b and WW modules are indispensable for LytB in cell separation.	53
407746	pfam17891	FluMu_N	Mu-like prophage FluMu N-terminal domain. Structural analysis of HI1506 (also known as Mu-like prophage FluMu protein gp35) from Haemophilus influenzae show that HI1506 consists of two structured domains connected by an unstructured 30 amino acid loop. This entry is the N-terminal domain which comprises a three-stranded antiparallel beta-sheet packed against an alpha-helix.	49
407747	pfam17892	Cadherin_5	Cadherin-like domain. 	98
380048	pfam17893	Cas9_b_hairpin	CRISPR-associated endonuclease Cas9 beta-hairpin domain. This is the beta-hairpin domain found in Cas9 proteins from Actinobacteria. The beta-hairpin domain is not conserved in all type II-C Cas9 proteins. The beta-hairpin in Cas9 from Streptococcus pyogenes blocks the HNH domain active site.	90
407748	pfam17894	Cas9_Topo	Topo homology domain in CRISPR-associated endonuclease Cas9. This is the Topo-homology domain found in Cas9 proteins from Actinobacteria. This domain bears structural similarity to a domain found in topoisomerase II.	71
407749	pfam17895	Dicer_N	Giardia Dicer N-terminal domain. This is the N-terminal domain found in dicer proteins from Giardia intestinalis. The N-terminal region (i.e. the platform domain) forms a flat surface composed of antiparallel beta sheet and three alpha helices. This surface contains a large positively charged region that could interact directly with the negatively charged phosphodiester backbone of the modeled dsRNA helix.	144
375411	pfam17896	Nsp2a_N	Replicase polyprotein 1a N-terminal domain. This is the N-terminal domain found in Replicase polyprotein 1a (also known as non-structural protein 2a-Nsp2a). Family members are found in Gammacoronaviruses.	358
407750	pfam17897	VCPO_N	Vanadium chloroperoxidase N-terminal domain. This is the N-terminal domain found in Vanadium chloroperoxidase proteins found in fungi.	215
407751	pfam17898	GerD	Spore germination GerD central core domain. This is the central core domain found in GerD from a thermophilic Bacillus. GerD plays a critical role in nutrient receptor-mediated spore germination in Bacillus species. The crystal structure of GerD reveals this domain as a trimeric superhelical rope fold. Alterations in GerD structure have profound effects on spores' nutrient germination.	114
407752	pfam17899	Peptidase_M61_N	Peptidase M61 N-terminal domain. This domain is found at the N-terminus of pfam05299 and has a beta sandwich-like fold with similarity to the baculovirus p35 protein.	168
407753	pfam17900	Peptidase_M1_N	Peptidase M1 N-terminal domain. This domain is found at the N-terminus of aminopeptidases from the M1 family.	186
375414	pfam17901	EF-hand_12	EF-hand fold domain. This domain is found in Dd-STATa, a STAT protein (Signal transducer and activator of transcription A) which transcriptionally regulates cellular differentiation in Dictyostelium discoideum. The EF-hand domains are predicted to contain several basic residues that lie close to the DNA backbone.	93
407754	pfam17902	SH3_10	SH3 domain. This entry represents an SH3 domain.	65
407755	pfam17903	KH_8	Krr1 KH1 domain. This entry represents the first KH domain in the KRR1 protein. Krr1 is a ribosomal assembly factor. The KH1 domain is a divergent KH domain that lacks the RNA-binding GXXG motif and is involved in binding another assembly factor, Kri1.	81
407756	pfam17904	KH_9	FMRP KH0 domain. This entry corresponds to the KH0 domain from the FMRP protein. This is a divergent KH domain that was discovered through solving the structure of an N-terminal fragment of the FMRP protein. KH0 does not have the canonical G-X-X-G motif between helices A and B. It has been suggested that this domain may be involved in RNA binding.	85
407757	pfam17905	KH_10	GLD-3 KH domain 5. This entry corresponds to KH5 of the GLD-3 protein. The 4 KH domains KH2 to KH5 form a proteolytically stable structure.	95
407758	pfam17906	HTH_48	HTH domain in Mos1 transposase. The N-terminal domain of the Mos1 Mariner transposase comprises two HTH domains. This HTH domain binds in the DNA major groove to the transposon's inverted repeats.	50
407759	pfam17907	AWS	AWS domain. This entry represents the AWS (associated with SET domain) domain. This is a zinc binding domain. The full AWS domain contains 8 cysteines. This entry represents the N-terminal part of the domain, with the C-terminal part interwoven with the SET domain.	39
407760	pfam17908	APAF1_C	APAF-1 helical domain. This domain represents the C-terminal alpha helical domain of the apoptotic Apaf-1 protein.	135
375422	pfam17909	Htr2	Htr2 transmembrane domain. Archaebacterial photoreceptors mediate phototaxis by regulating cell motility through two-component signaling cascades like those found in chemotaxis signaling chains of enteric bacteria. The photoreceptor sensory rhodopsin II from N. pharaonis (NpSRII) in complex with its cognate transducer NpHtrII serves as a system for transmembrane signal transfer. This entry is for the transmembrane domain of the transducer HtrII. Studies suggest that conformation changes of the NpSRII/NpHtrII complex may be crucial for the mechanism of signal propagation spanning the membrane domain and feeding into the HAMP domain. Furthermore, HtrII in H. salinarum not only transmits the signal from the photoreceptor SRII but also operates as a chemoreceptor.	61
407761	pfam17910	FeoB_Cyto	FeoB cytosolic helical domain. FeoB is a G-protein coupled membrane protein essential for Fe(II) uptake in prokaryotes. In the structures, a canonical G-protein domain (G domain) is followed by a helical bundle domain (S-domain) which is represented by this entry.	90
407762	pfam17911	Ski2_N	Ski2 N-terminal region. This region is the N-terminal extended region found in the Ski2 protein. The Ski complex is a conserved multiprotein assembly required for the cytoplasmic functions of the exosome, including RNA turnover, surveillance, and interference. Ski2, Ski3, and Ski8 assemble in a tetramer with 1:1:2 stoichiometry.	134
407763	pfam17912	OB_MalK	MalK OB fold domain. This entry corresponds to one of two OB-fold domains found in the MalK transport protein.	53
407764	pfam17913	FHA_2	FHA domain. This entry represents a divergent FHA domain which in PNK binds to phosphorylated segment of XRCC1.	97
407765	pfam17914	HopA1	HopA1 effector protein family. This family includes the HopA1 effector protein from Pseudomonas syringae. Structurally this protein has an alpha + beta fold. The effector protein HopA1 was shown to affect the EDS1 complex by binding EDS1 directly and activating the immune response signaling pathway.	170
375426	pfam17915	zf_Rg	Reverse gyrase zinc finger. This is the N-terminal zinc finger domain present in reverse gyrase proteins. Most reverse gyrases conserve the N-terminal zinc finger of the zinc ribbon type, pointing to a crucial function of this domain. Structure of Thermotoga maritima reverse gyrase elucidates that the N-terminal zinc finger firmly attaches the H1 (helicase 1) domain to the topoisomerase domain contributing to double-strand DNA (dsDNA) binding.	49
407766	pfam17916	LID	LIM interaction domain (LID). LIM-homeodomain (LIM-HD) proteins are transcription factors that are critical in the development of many cell types and tissues. The activity of LIM-HD proteins is dependent on the essential cofactor LIM domain-binding protein 1 (Ldb1). This entry represents a 30-residue LIM interaction domain (LID) that binds to the LIM domains of all LIM-HD and closely related LIM-only (LMO) proteins. Isl1 (Insulin gene enhancer protein 1) and Isl2 each contain a LID-like sequence in their C-terminal regions, the Lhx3-binding domain (LBD), which binds the LIM domains Lhx3 and Lhx4.	29
407767	pfam17917	RT_RNaseH	RNase H-like domain found in reverse transcriptase. DNA polymerase and ribonuclease H (RNase H) activities allow reverse transcriptases to convert the single-stranded retroviral RNA genome into double-stranded DNA, which is integrated into the host chromosome during infection. This entry represents the RNase H like domain.	104
407768	pfam17918	TetR_C_15	Tetracyclin repressor-like, C-terminal domain. TetR family regulators are involved in the transcriptional control of multidrug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity. The TetR proteins identified in over multiple genera of bacteria and archaea share a common helix-turn-helix (HTH) structure in their DNA-binding domain. However, TetR proteins can work in different ways: they can bind a target operator directly to exert their effect (e.g. TetR binds Tet(A) gene to repress it in the absence of tetracycline), or they can be involved in complex regulatory cascades in which the TetR protein can either be modulated by another regulator or TetR can trigger the cellular response. This entry represents the C-terminal domain found in a number of different TetR transcription regulator proteins including SlmA proteins found in E. coli. Unlike other TetR proteins, SlmA functions not as a transcription regulator but rather as an NO (nucleoid occlusion) factor. TetR regulates the expression of the membrane-associated tetracycline resistance protein, TetA, which exports the tetracycline antibiotic out of the cell before it can attach to the ribosomes and inhibit protein synthesis. TetR blocks transcription from the genes encoding both TetA and TetR in the absence of antibiotic. The C-terminal domain is multi-helical and is interlocked in the homodimer with the helix-turn-helix (HTH) DNA-binding domain.	108
407769	pfam17919	RT_RNaseH_2	RNase H-like domain found in reverse transcriptase. 	100
407770	pfam17920	TetR_C_16	Tetracyclin repressor-like, C-terminal domain. TetR family regulators are involved in the transcriptional control of multidrug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity. The TetR proteins identified in over multiple genera of bacteria and archaea share a common helix-turn-helix (HTH) structure in their DNA-binding domain. However, TetR proteins can work in different ways: they can bind a target operator directly to exert their effect (e.g. TetR binds Tet(A) gene to repress it in the absence of tetracycline), or they can be involved in complex regulatory cascades in which the TetR protein can either be modulated by another regulator or TetR can trigger the cellular response. This entry represents the C-terminal domain found in a number of different TetR transcription regulator proteins found in Actinobacteria. TetR regulates the expression of the membrane-associated tetracycline resistance protein, TetA, which exports the tetracycline antibiotic out of the cell before it can attach to the ribosomes and inhibit protein synthesis. TetR blocks transcription from the genes encoding both TetA and TetR in the absence of antibiotic. The C-terminal domain is multi-helical and is interlocked in the homodimer with the helix-turn-helix (HTH) DNA-binding domain.	107
407771	pfam17921	Integrase_H2C2	Integrase zinc binding domain. This zinc binding domain is found in a wide variety of integrase proteins.	58
407772	pfam17922	TetR_C_17	Tetracyclin repressor-like, C-terminal domain. TetR family regulators are involved in the transcriptional control of multidrug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity. The TetR proteins identified in over multiple genera of bacteria and archaea share a common helix-turn-helix (HTH) structure in their DNA-binding domain. However, TetR proteins can work in different ways: they can bind a target operator directly to exert their effect (e.g. TetR binds Tet(A) gene to repress it in the absence of tetracycline), or they can be involved in complex regulatory cascades in which the TetR protein can either be modulated by another regulator or TetR can trigger the cellular response. This entry represents the C-terminal domain present in Yfir transcription regulator proteins found in Bacillus subtilis. TetR regulates the expression of the membrane-associated tetracycline resistance protein, TetA, which exports the tetracycline antibiotic out of the cell before it can attach to the ribosomes and inhibit protein synthesis. TetR blocks transcription from the genes encoding both TetA and TetR in the absence of antibiotic. The C-terminal domain is multi-helical and is interlocked in the homodimer with the helix-turn-helix (HTH) DNA-binding domain.	100
375433	pfam17923	TetR_C_18	Tetracyclin repressor-like, C-terminal domain. TetR family regulators are involved in the transcriptional control of multidrug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity. The TetR proteins identified in over multiple genera of bacteria and archaea share a common helix-turn-helix (HTH) structure in their DNA-binding domain. However, TetR proteins can work in different ways: they can bind a target operator directly to exert their effect (e.g. TetR binds Tet(A) gene to repress it in the absence of tetracycline), or they can be involved in complex regulatory cascades in which the TetR protein can either be modulated by another regulator or TetR can trigger the cellular response. TetR regulates the expression of the membrane-associated tetracycline resistance protein, TetA, which exports the tetracycline antibiotic out of the cell before it can attach to the ribosomes and inhibit protein synthesis. TetR blocks transcription from the genes encoding both TetA and TetR in the absence of antibiotic. The C-terminal domain is multi-helical and is interlocked in the homodimer with the helix-turn-helix (HTH) DNA-binding domain. This entry represents the C-terminal domain present in TetR transcriptional regulators found in proteobacteria.	113
407773	pfam17924	TetR_C_19	Tetracyclin repressor-like, C-terminal domain. TetR family regulators are involved in the transcriptional control of multidrug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity. The TetR proteins identified in over multiple genera of bacteria and archaea share a common helix-turn-helix (HTH) structure in their DNA-binding domain. However, TetR proteins can work in different ways: they can bind a target operator directly to exert their effect (e.g. TetR binds Tet(A) gene to repress it in the absence of tetracycline), or they can be involved in complex regulatory cascades in which the TetR protein can either be modulated by another regulator or TetR can trigger the cellular response. TetR regulates the expression of the membrane-associated tetracycline resistance protein, TetA, which exports the tetracycline antibiotic out of the cell before it can attach to the ribosomes and inhibit protein synthesis. TetR blocks transcription from the genes encoding both TetA and TetR in the absence of antibiotic. The C-terminal domain is multi-helical and is interlocked in the homodimer with the helix-turn-helix (HTH) DNA-binding domain. This entry represents the C-terminal domain present in the transcriptional regulator heme-regulated transporter regulator (HrtR), which senses and binds a heme molecule as its physiological effector to regulate the expression of the heme-efflux system responsible for heme homeostasis in L. lactis.	117
407774	pfam17925	TetR_C_20	Tetracyclin repressor-like, C-terminal domain. TetR family regulators are involved in the transcriptional control of multidrug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity. The TetR proteins identified in over multiple genera of bacteria and archaea share a common helix-turn-helix (HTH) structure in their DNA-binding domain. However, TetR proteins can work in different ways: they can bind a target operator directly to exert their effect (e.g. TetR binds Tet(A) gene to repress it in the absence of tetracycline), or they can be involved in complex regulatory cascades in which the TetR protein can either be modulated by another regulator or TetR can trigger the cellular response. TetR regulates the expression of the membrane-associated tetracycline resistance protein, TetA, which exports the tetracycline antibiotic out of the cell before it can attach to the ribosomes and inhibit protein synthesis. TetR blocks transcription from the genes encoding both TetA and TetR in the absence of antibiotic. The C-terminal domain is multi-helical and is interlocked in the homodimer with the helix-turn-helix (HTH) DNA-binding domain. This entry represents the C-terminal domain present in the transcriptional regulator KstR that regulates a large set of genes responsible for cholesterol catabolism. This is important for Mycobacterium tuberculosis during infection, both at an early stage in the macrophage phagosome and later within the necrotic granuloma.	107
407775	pfam17926	TetR_C_21	Tetracyclin repressor-like, C-terminal domain. TetR family regulators are involved in the transcriptional control of multidrug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity. The TetR proteins identified in over multiple genera of bacteria and archaea share a common helix-turn-helix (HTH) structure in their DNA-binding domain. However, TetR proteins can work in different ways: they can bind a target operator directly to exert their effect (e.g. TetR binds Tet(A) gene to repress it in the absence of tetracycline), or they can be involved in complex regulatory cascades in which the TetR protein can either be modulated by another regulator or TetR can trigger the cellular response. TetR regulates the expression of the membrane-associated tetracycline resistance protein, TetA, which exports the tetracycline antibiotic out of the cell before it can attach to the ribosomes and inhibit protein synthesis. TetR blocks transcription from the genes encoding both TetA and TetR in the absence of antibiotic. The C-terminal domain is multi-helical and is interlocked in the homodimer with the helix-turn-helix (HTH) DNA-binding domain. This entry represents the C-terminal domain present in the TetR transcriptional repressor found in Streptomyces coelicolor A3. Family members include HTH-type transcriptional repressor sco4008, which is suggested to be a transcriptional repressor of sco4007 responsible for the multidrug resistance system in S. coelicolor A3.	111
407776	pfam17927	Ins134_P3_kin_N	Inositol 1,3,4-trisphosphate 5/6-kinase pre-ATP-grasp domain. This family consists of several inositol 1, 3, 4-trisphosphate 5/6-kinase proteins. Inositol 1,3,4-trisphosphate is at a branch point in inositol phosphate metabolism. It is dephosphorylated by specific phosphatases to either inositol 3,4-bisphosphate or inositol 1,3-bisphosphate. Alternatively, it is phosphorylated to inositol 1,3,4,6-tetrakisphosphate or inositol 1,3,4,5-tetrakisphosphate by inositol trisphosphate 5/6-kinase. This entry represents the N-terminal pre-ATP-grasp domain.	80
407777	pfam17928	TetR_C_22	Tetracyclin repressor-like, C-terminal domain. TetR family regulators are involved in the transcriptional control of multidrug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity. The TetR proteins identified in over multiple genera of bacteria and archaea share a common helix-turn-helix (HTH) structure in their DNA-binding domain. However, TetR proteins can work in different ways: they can bind a target operator directly to exert their effect (e.g. TetR binds Tet(A) gene to repress it in the absence of tetracycline), or they can be involved in complex regulatory cascades in which the TetR protein can either be modulated by another regulator or TetR can trigger the cellular response. TetR regulates the expression of the membrane-associated tetracycline resistance protein, TetA, which exports the tetracycline antibiotic out of the cell before it can attach to the ribosomes and inhibit protein synthesis. TetR blocks transcription from the genes encoding both TetA and TetR in the absence of antibiotic. The C-terminal domain is multi-helical and is interlocked in the homodimer with the helix-turn-helix (HTH) DNA-binding domain. This entry represents the C-terminal domain present in the TetR transcriptional repressor found in sco1712 proteins from Streptomyces coelicolor, which act as regulators of antibiotic production.	113
407778	pfam17929	TetR_C_34	Tetracyclin repressor-like, C-terminal domain. TetR family regulators are involved in the transcriptional control of multidrug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity. The TetR proteins identified in over multiple genera of bacteria and archaea share a common helix-turn-helix (HTH) structure in their DNA-binding domain. However, TetR proteins can work in different ways: they can bind a target operator directly to exert their effect (e.g. TetR binds Tet(A) gene to repress it in the absence of tetracycline), or they can be involved in complex regulatory cascades in which the TetR protein can either be modulated by another regulator or TetR can trigger the cellular response. TetR regulates the expression of the membrane-associated tetracycline resistance protein, TetA, which exports the tetracycline antibiotic out of the cell before it can attach to the ribosomes and inhibit protein synthesis. TetR blocks transcription from the genes encoding both TetA and TetR in the absence of antibiotic. The C-terminal domain is multi-helical and is interlocked in the homodimer with the helix-turn-helix (HTH) DNA-binding domain. This entry represents the C-terminal domain present in putative TetR family transcriptional regulators found in bacteria.	120
407779	pfam17930	LpxI_N	LpxI N-terminal domain. This entry represents the N-terminal domain of the LpxI enzyme that is involved in biosynthesis of lipid A. Specifically, it carries out the hydrolysis of UDP-2,3-diacylglucosamine. This step is carried out by either LpxI or LpxH. This domain has a Rossmann fold.	129
407780	pfam17931	TetR_C_23	Tetracyclin repressor-like, C-terminal domain. This is a C-terminal domain present in putative TetR family transcriptional regulators found in bacteria.	127
407781	pfam17932	TetR_C_24	Tetracyclin repressor-like, C-terminal domain. TetR family regulators are involved in the transcriptional control of multidrug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity. The TetR proteins identified in over multiple genera of bacteria and archaea share a common helix-turn-helix (HTH) structure in their DNA-binding domain. However, TetR proteins can work in different ways: they can bind a target operator directly to exert their effect (e.g. TetR binds Tet(A) gene to repress it in the absence of tetracycline), or they can be involved in complex regulatory cascades in which the TetR protein can either be modulated by another regulator or TetR can trigger the cellular response. TetR regulates the expression of the membrane-associated tetracycline resistance protein, TetA, which exports the tetracycline antibiotic out of the cell before it can attach to the ribosomes and inhibit protein synthesis. TetR blocks transcription from the genes encoding both TetA and TetR in the absence of antibiotic. The C-terminal domain is multi-helical and is interlocked in the homodimer with the helix-turn-helix (HTH) DNA-binding domain. This entry represents the C-terminal domain present in family members such as HTH-type transcriptional repressor KstR2 as well as fatty acid metabolism regulator proteins. In Mycobacterium smegmatis, KstR2 is involved in cholesterol catabolism, while YsiA in Bacillus subtilis is involved in fatty acid degradation.	114
407782	pfam17933	TetR_C_25	Tetracyclin repressor-like, C-terminal domain. TetR family regulators are involved in the transcriptional control of multidrug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity. The TetR proteins identified in over multiple genera of bacteria and archaea share a common helix-turn-helix (HTH) structure in their DNA-binding domain. However, TetR proteins can work in different ways: they can bind a target operator directly to exert their effect (e.g. TetR binds Tet(A) gene to repress it in the absence of tetracycline), or they can be involved in complex regulatory cascades in which the TetR protein can either be modulated by another regulator or TetR can trigger the cellular response. TetR regulates the expression of the membrane-associated tetracycline resistance protein, TetA, which exports the tetracycline antibiotic out of the cell before it can attach to the ribosomes and inhibit protein synthesis. TetR blocks transcription from the genes encoding both TetA and TetR in the absence of antibiotic. The C-terminal domain is multi-helical and is interlocked in the homodimer with the helix-turn-helix (HTH) DNA-binding domain. This entry represents the C-terminal domain present in Rv1219c of Mycobacterium tuberculosis. Structural studies indicate that the helix alpha 10 of the C-terminal end of Rv1219c forms a long arm feature, a feature which is unique in Rv1219c compared to some other members of the TetR family. Furthermore, it has been shown that substrate binding occurs in the C-terminal regulatory domain of Rv1219c.	106
407783	pfam17934	TetR_C_26	Tetracyclin repressor-like, C-terminal domain. This entry represents the C-terminal domain present in putative HTH-type transcriptional regulator. Family members are found in bacilli.	109
407784	pfam17935	TetR_C_27	Tetracyclin repressor-like, C-terminal domain. This is the C-terminal domain present in putative TetR transcriptional regulators.	106
407785	pfam17936	Big_6	Bacterial Ig domain. This domain is found in a wide variety of extracellular bacterial proteins often in multiple tandem copies.	83
407786	pfam17937	TetR_C_28	Tetracyclin repressor-like, C-terminal domain. TetR family regulators are involved in the transcriptional control of multidrug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity. The TetR proteins identified in over multiple genera of bacteria and archaea share a common helix-turn-helix (HTH) structure in their DNA-binding domain. However, TetR proteins can work in different ways: they can bind a target operator directly to exert their effect (e.g. TetR binds Tet(A) gene to repress it in the absence of tetracycline), or they can be involved in complex regulatory cascades in which the TetR protein can either be modulated by another regulator or TetR can trigger the cellular response. TetR regulates the expression of the membrane-associated tetracycline resistance protein, TetA, which exports the tetracycline antibiotic out of the cell before it can attach to the ribosomes and inhibit protein synthesis. TetR blocks transcription from the genes encoding both TetA and TetR in the absence of antibiotic. The C-terminal domain is multi-helical and is interlocked in the homodimer with the helix-turn-helix (HTH) DNA-binding domain. This entry represents the C-terminal domain present in CgmR (C. glutamicum multidrug-responsive transcriptional repressor), previously called CGL2612 protein. CgmR (CGL2612) from Corynebacterium glutamicum is a multidrug-resistance-related transcription factor belonging to the TetR family. It regulates expression of the immediately upstream gene cgmA (cgl2611) by binding to the operator cgmO in the cgmA promoter. The cgmA gene encodes a permease belonging to the major facilitator superfamily, a protein family composed of bacterial multidrug exporters, and the pair of CgmR and CgmA confers multidrug resistance on C. glutamicum.	97
407787	pfam17938	TetR_C_29	Tetracyclin repressor-like, C-terminal domain. This domain is found in the C-terminal region of putative TetR-family regulatory proteins.	119
407788	pfam17939	TetR_C_30	Tetracyclin repressor-like, C-terminal domain. TetR family regulators are involved in the transcriptional control of multidrug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity. The TetR proteins identified in over multiple genera of bacteria and archaea share a common helix-turn-helix (HTH) structure in their DNA-binding domain. However, TetR proteins can work in different ways: they can bind a target operator directly to exert their effect (e.g. TetR binds Tet(A) gene to repress it in the absence of tetracycline), or they can be involved in complex regulatory cascades in which the TetR protein can either be modulated by another regulator or TetR can trigger the cellular response. TetR regulates the expression of the membrane-associated tetracycline resistance protein, TetA, which exports the tetracycline antibiotic out of the cell before it can attach to the ribosomes and inhibit protein synthesis. TetR blocks transcription from the genes encoding both TetA and TetR in the absence of antibiotic. The C-terminal domain is multi-helical and is interlocked in the homodimer with the helix-turn-helix (HTH) DNA-binding domain. This entry represents the C-terminal domain present in the Pseudomonas aeruginosa PsrA which regulates the fadBA5 beta-oxidation operon. Functional analysis of PsrA indicated its importance in regulating beta-oxidative enzymes. It has also been suggested that PsrA, a member of the TetR family of repressors, could affect global gene expression including activation of rpoS.	113
407789	pfam17940	TetR_C_31	Tetracyclin repressor-like, C-terminal domain. This is the C-terminal domain found in putative transcriptional regulator, TetR family proteins.	107
407790	pfam17941	PP_kinase_C_1	Polyphosphate kinase C-terminal domain 1. Polyphosphate kinase (Ppk) catalyses the formation of polyphosphate from ATP, with chain lengths of up to a thousand or more orthophosphate molecules. This C1-terminal domain has a structure similar to phospholipase D. It is one of two closely related carboxy-terminal domains (C1 and C2 domains). Both the C1 and C2 domains (residues 322-502 and 503-687, respectively) consist of a seven-stranded mixed beta-sheet flanked by five alpha-helices. However, the structural topology and relative orientations of the helices to the beta-sheet in these two domains are different. The C1 and C2 domains are highly conserved in the PPK family. Some of the residues previously shown to be crucial for the enzyme catalytic activity are located in these two domains.	167
407791	pfam17942	Morc6_S5	Morc6 ribosomal protein S5 domain 2-like. This domain is found in MORC6 proteins in eukaryotes. Arabidopsis microrchidia (MORC) ATPase family proteins are conserved among plants and animals and are involved in transcriptional silencing. In Arabidopsis, MORC6/DMS11 was reported to function in the condensation of pericentromeric heterochromatin, thereby facilitating transcriptional silencing. Further studies demonstrate that MORC6 and its homologs MORC1 and MORC2 form a complex which associates with SUVH9, required for Pol V occupancy in the RdDM (RNA-directed DNA methylation) pathway.	139
407792	pfam17943	HOCHOB	Homeobox-cysteine loop-homeobox. This domain is considered a double homeodomain, termed HOCHOB, present in the C. elegans genome. Family members include CEH-91 and CEH-93 that share extended sequence similarity with each other upstream of their typical HDs (Homeodomains). CEH-92, another family member, has three copies of this domain. The domain consists of two divergent HDs that are separated by a linker of about 17 residues. The linker has a number of conserved positions, two of which are cysteine residues suggesting that they could be involved in metal binding. Hence, the name HOCHOB (Homeobox-cysteine loop-homeobox). Furthermore, there are two conserved histidine residues, one in each HD (in CEH-91 displaced by two positions), and there is also a conserved aspartic acid. It is speculated that the HOCHOB domain is an evolutionary novelty that is derived from two HDs and may have gained metal-binding capacity.	120
407793	pfam17944	Arg_decarbox_C	Arginine decarboxylase C-terminal helical extension. This small three helical domain is found at the C-terminus of the arginine decarboxylase enzyme.	50
407794	pfam17945	Crystall_4	Beta/Gamma crystallin. This is the C-terminal domain found in mucin glycoproteins such as secreted protease of C1 esterase inhibitor from EHEC (StcE). This domain adopts a beta/gamma crystallin fold and has been shown to be dispensable for substrate binding. Furthermore, deletion analysis suggests that lack of the C-terminal domain results in impaired association with the cell surface.	90
407795	pfam17946	RecC_C	RecC C-terminal domain. This entry corresponds to the C-terminal domain of the RecC protein. This domain has a PD(D/E)XK like fold. Deleting this domain eliminates RecD assembly within the RecBCD complex.	224
407796	pfam17947	4HB	Four helical bundle domain. This domain is found in elongation factor 3A where it packs against the bottom of the concave face of the HEAT domain.	78
407797	pfam17948	DnaT	DnaT DNA-binding domain. This domain is found in E. coli primosomal protein 1 (Pp1); the PP1 domain (residues 84-153) can bind to different types of ssDNA, which is fundamental for its physiological substrate binding. Functional analysis indicates that both the N- and C-termini are essential for the cooperative effect in binding ssDNA. The ssDNA-bound complex displays a spiral filament assembly that is adopted by many proteins that are involved in DNA replication, such as DnaA, RecA and PriB. This domain is similar to pfam08585 except that it contains an extra loop at the N-terminus (84-99). Structural analysis indicates that this extra loop might be essential for the stabilisation of the three-helix bundle.	71
375443	pfam17949	PND	FANCM pseudonuclease domain. This entry represents the pseudonuclease domain (PND) from the FANCM protein. This domain is part of the PD(D/E)XK superfamily but does not appear to have a full set of catalytic residues.	125
407798	pfam17950	SpmSyn_N	S-adenosylmethionine decarboxylase N-terminal. This is the N-terminal domain found in human spermine synthase (EC 2.5.1.22). The N-terminal domain, which forms the major part of the dimerization interface, shows a considerable structural similarity to the AdoMetDC-like fold (S-adenosylmethionine decarboxylase, the enzyme that forms the aminopropyl donor substrate), pfam02675. Deletion of the N-terminal domain led to a complete loss of spermine synthase activity, suggesting that dimerization may be required for activity. The N-terminal domain (amino acids 1-117) includes seven beta-strands and two alpha-helices.	96
407799	pfam17951	FAS_meander	Fatty acid synthase meander beta sheet domain. This domain is found in fungal fatty acid synthase beta chain proteins.	146
407800	pfam17952	Cas6_N	Cas6 N-terminal domain. The CRISPR-Cas system is a prokaryotic defense mechanism against foreign genetic elements. The key elements of this defense system are the Cas proteins and the CRISPR RNA. Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes. CRISPRs appear to provide acquired resistance against mobile genetic elements (viruses, transposable elements and conjugative plasmids). CRISPR clusters contain sequences complementary to antecedent mobile elements and target invading nucleic acids. CRISPR clusters are transcribed and processed into CRISPR RNA (crRNA). The defense reaction is divided into three stages. In the adaptation stage, the invader DNA is cleaved, and a piece of it is selected to be integrated as a new spacer into the CRISPR locus, where it is stored as an identity tag for future attacks by this invader. During the second stage (the expression stage), the CRISPR RNA (pre-crRNA) is transcribed and subsequently processed into the mature crRNAs. In the third stage (the interference stage), Cas proteins, together with crRNAs, identify and degrade the invader. The CRISPR-Cas systems have been sorted into three major classes. In CRISPR-Cas types I and III, the mature crRNA is generally generated by a member of the Cas6 protein family. Whereas in system III the Cas6 protein acts alone, in some class I systems it is part of a complex of Cas proteins known as Cascade (CRISPR-associated complex for antiviral defense). This entry represents the N-terminal domain of Cas6 proteins. The Cas6 protein is an endoribonuclease necessary for crRNA production whereas the additional Cas proteins that form the Cascade complex are needed for crRNA stability. Structural analysis of Sulfolobus solfataricus P2 Cas6 (SsCas6) proteins indicates that SsCas6 is able to bind and cleave the nonstructured RNA by stabilizing an otherwise unstable duplex of two base pairs near the cleavage site, leading to an inline conformation around the scissile phosphate necessary for its breakage.	115
407801	pfam17953	Csm4_C	CRISPR Csm4 C-terminal domain. Clustered, regularly interspaced, short palindromic repeat (CRISPR) loci play a pivotal role in the prokaryotic host defense system against invading genetic materials. The CRISPR loci are transcribed to produce CRISPR RNAs (crRNAs), which form interference complexes with CRISPR-associated (Cas) proteins to target the invading nucleic acid for degradation. The interference complex of the type III-A CRISPR-Cas system is composed of five Cas proteins (Csm1-Csm5) and a crRNA, and targets invading DNA. This entry represents the C-terminal domain found in Csm4. Csm4 structurally resembles Cmr3, a component of the type III-B CRISPR-Cas interference complex. Studies indicate that the Csm3-Csm4 complex binds single-stranded RNA in a non-sequence-specific manner. Structural analysis shows that Csm3 and Csm4 have one and two ferredoxin-like folds (also known as an RRM-like fold), respectively. The long beta-hairpin inserted into the C-terminal ferredoxin-like fold of Csm4 is well-conserved in the Cmr3 structure. The corresponding beta-hairpin of Cmr3 binds the D1 domain of Cmr2, as observed in the Cmr2-Cmr3 complex structure. Furthermore, it is suggested that the hairpin of Csm4 is responsible for the interaction with Csm1 (ortholog of Cmr2).	91
407802	pfam17954	Pirin_C_2	Quercetinase C-terminal cupin domain. Experiments on the YhhW protein show that it has quercetinase activity. This entry represents the C-terminal cupin domain from the two cupin domains that make up the protein. This domain is usually associated with pfam02678.	86
407803	pfam17955	Cas6b_N	Cas6b N-terminal domain. Clustered, regularly interspaced, short palindromic repeat (CRISPR) loci play a pivotal role in the prokaryotic host defense system against invading genetic materials. The CRISPR loci are transcribed to produce CRISPR RNAs (crRNAs), which form interference complexes with CRISPR-associated (Cas) proteins to target the invading nucleic acid for degradation. Four Cas proteins (Cas5, Cas6b, Cas7 and Cas8b) are proposed to form a Type I-B Cascade complex that mediates the antiviral defense. This is the N-terminal domain found in Cas6b proteins. Cas6b is a member of Cas6 RNA processing endoribonucleases found in bacteria and archaea whose RNA substrates have a wide range of structural features. Cocrystal structures of Cas6 from Methanococcus maripaludis (MmCas6b) bound with its repeat RNA revealed a dual-site binding structure and a cleavage site conformation poised for phosphodiester bond breakage.	104
407804	pfam17956	NAPRTase_C	Nicotinate phosphoribosyltransferase C-terminal domain. This domain is found at the C-terminus of some Nicotinate phosphoribosyltransferase enzymes. The function of this domain is uncertain.	111
407805	pfam17957	Big_7	Bacterial Ig domain. This entry represents a bacterial ig-like domain that is found in glycosyl hydrolase enzymes.	67
407806	pfam17958	EF-hand_13	EF-hand domain. This entry represents an EF-hand domain found in one of the regulatory B subunits of PP2A.	90
407807	pfam17959	EF-hand_14	EF-hand domain. This EF-hand domain is found at the N-terminus of the human glutaminase enzyme.	90
407808	pfam17960	TIG_plexin	TIG domain. This entry represents a TIG or IPT domain (Ig domain shared by Plexins and Transcription factors) found in plexins.	89
407809	pfam17961	Big_8	Bacterial Ig domain. This entry represents a bacterial Ig-fold domain that is found in a wide range of bacterial cell surface adherence proteins.	100
407810	pfam17962	bMG6	Bacterial macroglobulin domain 6. This macroglobulin domain is found in bacterial alpha 2 macroglobulin proteins. It adopts an Ig-like beta sandwich fold.	112
407811	pfam17963	Big_9	Bacterial Ig domain. This entry represents a wide variety of bacterial Ig domains.	90
407812	pfam17964	Big_10	Bacterial Ig domain. This entry represents a bacterial Ig-like domain found associated with transpeptidase domains.	182
407813	pfam17965	MucBP_2	Mucin binding domain. This domain is found in bacterial cell surface proteins that interact with mucins. The archetypal member of this family is the Mub-R5 B1 domain. This domain has a beta-grasp fold.	75
407814	pfam17966	Mub_B2	Mub B2-like domain. This entry corresponds to the Mub B2 domain. This domain is related to the Mub B1 domain pfam17965. This domain may be involved in mucin binding. This domain is often found associated with the related pfam17965 in bacterial cell surface proteins.	70
407815	pfam17967	Pullulanase_N2	Pullulanase N2 domain. This domain is found close to the N-terminus of the Klebsiella starch debranching pullulanase enzyme. The structure of the domain is a beta sandwich fold.	112
407816	pfam17968	Tlr3_TMD	Toll-like receptor 3 trans-membrane domain. Toll-like receptor (TLR) 3 is an endosomal TLR that mediates immune responses against viral infections upon activation by its ligand double-stranded RNA, a replication intermediate of most viruses. TLR3 is expressed widely in the body and activates both the innate and adaptive immune systems. This entry represents the Toll-like receptor 3 trans-membrane domain which has been shown to form dimers and trimers with different surfaces for helix-helix interaction.	33
407817	pfam17969	Ldt_C	L,D-transpeptidase C-terminal domain. This is the C-terminal domain found in L,D-transpeptidase (Ldt) homologues from E. coli. Three of these enzymes (YbiS, ErfK, YcfS) have been shown to cross-link Braun's lipoprotein to the peptidoglycan (PG), while the other two (YnhG, YcbB) form direct meso-diaminopimelate (DAP-DAP, or 3-3) cross-links within the PG. Family members include erfK (ldtA), ybiS (ldtB), ycfS (ldtC), and ynhG (ldtE).	67
407818	pfam17970	bMG1	Bacterial Alpha-2-macroglobulin MG1 domain. Alpha-2-macroglobulins (A2Ms) are plasma proteins that trap and inhibit a broad range of proteases and are major components of the eukaryotic innate immune system. However, A2M-like proteins were identified in pathogenically invasive bacteria and species that colonize higher eukaryotes. Bacterial A2Ms are located in the periplasm where they are believed to provide protection to the cell by trapping external proteases through a covalent interaction with an activated thioester. This domain is found on the N-terminal region in A2Ms in bacteria. Structure analysis of Salmonella enterica ser A2Ms (SA-A2Ms) show that they are composed of 13 domains, all of which fold as variants of beta sandwiches with the exception of the TED, which consists of 14 alpha helices. Most of the beta sandwich domains appear to serve a structural role and are referred to as the macroglobulin-like (MG) domains. This is the MG1 domain which is the farthest from the body of the structure. It is normally anchored to the inner membrane in vivo and connected to MG2 by a flexible linker.	105
375455	pfam17971	LIFR_D2	Leukemia inhibitory factor receptor D2 domain. This is the D2 domain in cytokine-binding module 1 (CBM1) found in Leukemia inhibitory factor receptor (LIFR) and OSM receptors (OSMR). LIFR has an extracellular region with a modular structure containing two cytokine-binding modules (CBM) separated by an Ig-like domain and followed by three membrane-proximal fibronectin type-III (FNIII) domains. The D2 domain in CBM1 shows structural similarity to the corresponding CBM domains of both gp130 and IL-6Ralpha because it contains conserved structural features like the WSXWS motif. The WSXWS motif in cytokine receptors is a molecular switch involved in receptor activation.	114
407819	pfam17972	bMG5	Bacterial Alpha-2-macroglobulin MG5 domain. Alpha-2-macroglobulins (A2Ms) are plasma proteins that trap and inhibit a broad range of proteases and are major components of the eukaryotic innate immune system. However, A2M-like proteins were identified in pathogenically invasive bacteria and species that colonize higher eukaryotes. Bacterial A2Ms are located in the periplasm where they are believed to provide protection to the cell by trapping external proteases through a covalent interaction with an activated thioester. This domain is found on the N-terminal region in A2Ms in bacteria. Structure analysis of Salmonella enterica ser A2Ms (SA-A2Ms) show that they are composed of 13 domains, all of which fold as variants of beta sandwiches with the exception of the TED, which consists of 14 alpha helices. Most of the beta sandwich domains appear to serve a structural role and are referred to as the macroglobulin-like (MG) domains. This is the MG5 domain.	127
407820	pfam17973	bMG10	Bacterial Alpha-2-macroglobulin MG10 domain. Alpha-2-macroglobulins (A2Ms) are plasma proteins that trap and inhibit a broad range of proteases and are major components of the eukaryotic innate immune system. However, A2M-like proteins were identified in pathogenically invasive bacteria and species that colonize higher eukaryotes. Bacterial A2Ms are located in the periplasm where they are believed to provide protection to the cell by trapping external proteases through a covalent interaction with an activated thioester. This domain is found on the C-terminal region in A2Ms in bacteria. Structure analysis of Salmonella enterica ser A2Ms (SA-A2Ms) show that they are composed of 13 domains, all of which fold as variants of beta sandwiches with the exception of the TED, which consists of 14 alpha helices. Most of the beta sandwich domains appear to serve a structural role and are referred to as the macroglobulin-like (MG) domains. This is the MG10 domain. MG10 is markedly different from the other MG domains in that it has more beta strands and an alpha helix. The position of MG10 is stabilized by, in addition to other hydrogen bonds, the formation of a beta sheet with MG9.	128
407821	pfam17974	GalBD_like	Galactose-binding domain-like. Proteins containing a galactose-binding domain-like fold can be found in several different protein families, in both eukaryotes and prokaryotes. The common function of these domains is to bind to specific ligands, such as cell-surface-attached carbohydrate substrates for galactose oxidase and sialidase, phospholipids on the outer side of the mammalian cell membrane for coagulation factor Va, membrane-anchored ephrin for the Eph family of receptor tyrosine kinases, and a complex of broken single-stranded DNA and DNA polymerase beta for XRCC1. The structure of the galactose-binding domain-like members consists of a beta-sandwich, in which the strands making up the sheets exhibit a jellyroll fold.	190
407822	pfam17975	RNR_Alpha	Ribonucleotide reductase alpha domain. This is the alpha helical domain of ribonucleotide reductases. Family members include Ribonucleotide reductase (RNR, EC:1.17.4.1) which catalyse the reductive synthesis of deoxyribonucleotides from their corresponding ribonucleotides. It provides the precursors necessary for DNA synthesis. RNRs divide into three classes on the basis of their metallocofactor usage. This domain is found in Class II. Class II RNRs, found in bacteria, bacteriophage, algae and archaea, use coenzyme B12 (adenosylcobalamin, AdoCbl). Many organisms have more than one class of RNR present in their genomes. Ribonucleotide reductase is an oligomeric enzyme composed of a large sub-unit (700 to 1000 residues) and a small sub-unit (300 to 400 residues) - class II RNRs are less complex, using the small molecule B12 in place of the small chain. Some family members carry ATP cone domain which acts as a functional regulator. Competitive binding of ATP and dATP to an N-terminal ATP-cone domain determines enzyme activity. As the ratio of dATP to ATP increases above a certain threshold, the enzyme activity is turned off. Substrate nucleotides are recognized by relatively simple H-bonding interactions at the N-terminus of one or more alpha helices. In the monomeric class II RNR, the effector binds in a pocket formed by helices in a 130 amino acid insertion which constitutes this domain.	101
407823	pfam17976	zf-RING_12	RING/Ubox like zinc-binding domain. This is a RING zinc finger domain found in parkin proteins. Parkin consists of a ubiquitin-like (Ubl) domain and a 60-amino acid linker followed by this domain RING0 and three additional zinc finger domains characteristic of the RBR family. RING0 binds two coordinated zinc atoms at each extremity of the domain with a hairpin. Deletion of RING0 massively derepressed parkin activity supporting the role of RING0 in autoinhibition, point mutations in RING0 (Phe146 to Ala) or RING2 (Phe463 to Ala) both increased parkin activity. The REP (repressor element of parkin) and RING0 domains play a preeminent role in repressing parkin ligase activity through their interactions with RING1 and RING2, respectively.	73
375459	pfam17977	zf-RING_13	RING/Ubox like zinc-binding domain. This is a zinc binding domain found in nidovirus helicase. It includes 12 or 13 conserved Cys/His residues. Amino acid substitutions in ZBD or the adjacent spacer that connects it to the downstream domain can profoundly affect EAV (equine arteritis virus) helicase activity and RNA synthesis, with most replacements of conserved Cys or His residues yielding replication-negative virus phenotypes.	68
407824	pfam17978	zf-RING_14	RING/Ubox like zinc-binding domain. This is a RING zinc finger domain found in parkin proteins. Parkin consists of a ubiquitin-like (Ubl) domain and a 60-amino acid linker followed by RING0 and three additional zinc finger domains characteristic of the RBR family. This entry relates to RING1 zinc binding domain. The RING1 domain displays the C3HC4 cross-brace motif characteristic of RING domains. The N-terminal Ubl domain binds to RING1.	91
407825	pfam17979	zf-CRD	Cysteine rich domain with multizinc binding regions. This is a cysteine rich domain which contains four zinc-binding regions (binding five zinc ions). Family members include human CHFR which interacts with Poly(ADP-ribose) (PAR) through a 20-amino acid PAR binding zinc finger region (PBZ) found at the C-terminal end of this cysteine-rich domain (i.e. the 4th region towards the C-terminal). CHFR lacking PBZ does not co-localize with nuclear PAR foci in interphase cells and cannot rescue antephase checkpoint function despite retaining autoubiquitination activity. Hence it has been suggested that the CHFR-PAR interaction is an important part of the antephase checkpoint and could form part of the checkpoint sensor for cellular stress and microtubule poisons or be required for proper localization of CHFR. The PBZ region of CHFR contains two adenine binding sites.	158
407826	pfam17980	ADD_DNMT3	Cysteine rich ADD domain in DNMT3. This is a cysteine-rich domain termed ADD (ATRX-DNMT3-DNMT3L, AD-DATRX) found in DNMT3A proteins. The ADD domains of the DNMT3 family have a decisive role in blocking DNMT activity in the areas of the genome with chromatin containing methylated H3K4. Furthermore, the ADD domain of DNMT3A (ADD-3A) competes with the chromodomain (CD) of heterochromatin protein 1 alpha (HP1alpha, CDHP1alpha) for binding to the H3 tail. The DNA methyltransferase (DNMT) 3 family members DNMT3A and DNMT3B and the DNMT3-like non-enzymatic regulatory factor DNMT3L, are involved in de-novo establishment of DNA methylation patterns in early mammalian development.	56
407827	pfam17981	ADD_ATRX	Cysteine Rich ADD domain. This is a cysteine-rich domain termed ADD (ATRX-DNMT3-DNMT3L, AD-DATRX) found in ATRX proteins. Chromatin-associated human protein ATRX was originally identified because mutations in the ATRX gene cause a severe form of syndromal X-linked mental retardation called ATR-X syndrome. Mutations or knockdown of ATRX expression cause diverse effects, including altered patterns of DNA methylation, a telomere-dysfunction phenotype, aberrant chromosome segregation, premature sister chromatid separation and changes in gene expression. ATRX localizes predominantly to large, tandemly repeated regions (such as telomeres, centromeres and ribosomal DNA) associated with heterochromatin, and studies show that it directs H3.3 deposition to pericentric and telomeric heterochromatin. The ADD domain of ATRX, in which most syndrome-causing mutations occur, engages the N-terminal tail of histone H3 through two rigidly oriented binding pockets, one for unmodified Lys4 and the other for di- or trimethylated Lys9. Mutations in the ATRX ADD domain cause mislocalization of ATRX protein to heterochromatin, and this may contribute to understanding the underlying etiology of ATRX syndrome. Structure analysis of the ADD domain of ATRX revealed that it contains a PHD zinc-finger domain packed against a GATA-like zinc finger. Same structure is also found in the DNMT3 DNA methyltransferases and DNMT3L.	56
407828	pfam17982	C5HCH	NSD Cys-His rich domain. This is an NSD-specific Cys-His rich region (C5HCH) domain. Family members include NSD3 (nuclear receptor SET domain-containing) proteins. This domain is located on the C-terminal of NSD1, 2 and 3 proteins. C5HCH domain lies adjacent to the fifth plant homeodomain (PHD5). The PHD5-C5HCH module of NSD3 (PHD5-C5HCHNSD3) recognizes the H3 N-terminal peptide containing unmodified K4 and trimethylated K9. Moreover, it has been reported that the PHD5-C5HCH module of NSD1 (PHD5-C5HCH) was the sole region required for tight binding of the NUP98-NSD1 fusion protein to the HoxA9 gene promoter, implicating that PHD5-C5HCH might have chromatin targeting ability.	50
375465	pfam17983	Tryp_inh	Trypsin inhibitors 1,2 and 3. This domain is found in trypsin inhibitors 1, 2 and 3 from Chenopodiaceae. Structure analysis of S. oleracea trypsin inhibitor III (SOTI-III), in complex with bovine pancreatic trypsin, shows a knottin-like cystine-bridge topology.	33
375466	pfam17984	TERT_thumb	Telomerase reverse transcriptase thumb DNA binding domain. The catalytic subunit of telomerase is structurally similar to retroviral reverse transcriptases, viral RNA polymerases and, to a lesser extent, the bacteriophage B-family DNA polymerases. Like its structural homologs, the core catalytic subunit of telomerase, TERT, contains the fingers, palm and thumb domains required for nucleic acid and nucleotide associations as well as catalysis. The four major TERT domains: the RNA binding domain (TRBD); the fingers domain, implicated in nucleotide binding and processivity; the palm domain, which contains the active site of the enzyme; and the thumb domain, implicated in DNA binding and processivity are organized into a ring configuration similar to that observed for the substrate-free enzyme. This is the thumb domain found in Tribolium castaneum telomerase catalytic subunit, TERT. Contacts between TERT and the DNA substrate are mostly mediated via backbone interactions with the thumb loop and helix. The thumb helix sits in the minor groove of the RNA-DNA heteroduplex, making extensive contacts with the phosphodiester backbone and the ribose groups of the RNA-DNA hybrid.	191
407829	pfam17985	SipA_VBS	SipA vinculin binding site. This motif family includes the three vinculin binding sites found in the Shigella SipA/IpaA protein. The family also includes some proteins from Chlamydia species.	22
407830	pfam17986	EKAL	EMP3-KAHRP-like N-terminal domain. This is the N-terminal domain which is found at the N terminus of the erythrocyte cytoskeleton-associated protein (EMP3) and the knob-associated histidine-rich protein (KAHRP). KAHRP found in protozoan parasites such as Plasmodium falciparum, is involved in both rigidifying the host cell and the formation of cytoadherent knob structures.	53
407831	pfam17987	PMT2_N	Phosphoethanolamine N-methyltransferase 2 N-terminal. This is the N-terminal vestigial domain found in Haemonchus contortus phosphoethanolamine N-methyltransferases 2 (PMT2). Structural analysis reveals changes leading to loss of function in the vestigial domains of the nematode PMT.	124
375470	pfam17988	VEGFR-2_TMD	VEGFR-2 Transmembrane domain. This is a transmembrane domain (TMD) of vascular endothelial growth factor receptor 2 which regulates blood vessel homeostasis. Transmembrane signalling by receptor tyrosine kinases (RTKs) requires specific orientation of the intracellular kinase domains in active receptor dimers. Two mutants in VEGFR-2 TMD showed constitutive kinase activity, suggesting that precise TMD orientation is mandatory for kinase activation. Scanning mutagenesis and structural analysis indicated that introducing two polar amino acids in distinct positions of the TMD (G770E/F777E and I771E/L778E mutations) reorients transmembrane helices and leads to stable dimer formation. Therefore, it has been suggested that the transition between the inactive and the active dimeric state of VEGFR-2 implicates alternative dimeric TMD conformations.	35
407832	pfam17989	ALP_N	Actin like proteins N terminal domain. This is the N-terminal domain found in archaeal actin homolog Ta0583 found in thermophilic archaeon Thermoplasma acidophilum. Structural analysis indicates that the fold of Ta0583 contains the core structure of actin, indicating that it belongs to the actin/Hsp70 superfamily of ATPases. Furthermore, Ta0583 co-crystallized with ADP shows that the nucleotide binds at the interface between the subdomains of Ta0583 in a manner similar to that of actin. It has been suggested that Ta0583 might function in the cellular organisation of T. acidophilum. Other family members include ParM, another actin-like protein found in Staphylococcus aureus. Crystal structure co-ordinates revealed that this protein is most structurally related to the chromosomally encoded Actin-like proteins (Alp) Ta0583 from the archaea Thermoplasma acidophilum. Furthermore, biophysical analyses have suggested that ParM filaments undergo a treadmilling-like mechanism of motion in vitro similar to that of F-actin. The recruitment of ParM to the segrosome complex was shown to be required for the conversion of static ParM filaments to a dynamic form proficient for active segregation and facilitated by the C-terminus of ParR.	148
407833	pfam17990	LodA_N	L-Lysine epsilon oxidase N-terminal. This is the N-terminal domain found in antimicrobial protein (LodA) with lysine-epsilon oxidase activity (EC 1.4.3.20) which is produced by gram-negative marine bacteria such as Marinomonas mediterranea. The enzyme, previously named marinocine, catalyzes the oxidative deamination of l-lysine into 6-semialdehyde 2-aminoadipic acid, ammonia, and hydrogen peroxide (H2O2). Orthologous proteins have been detected in other bacterial genera, where they participate in biofilm development and dispersal. It has been shown that M. mediterranea LodA and its homologues induce cell death in the microcolonies formed in the process of biofilm development due to the hydrogen peroxide generated by their enzymatic activity. Moreover, cells dispersed from the biofilm by means of this mechanism show a phenotypic variation in growth and biofilm formation. The active form of LodA containing the quinonic cofactor is generated intracellularly only in the presence of LodB, suggesting that the latter protein is involved in this process.	215
407834	pfam17991	Thioredoxin_10	Thioredoxin like C-terminal domain. This is the C-terminal thioredoxin like domain found in Rv2874 in the pathogenic bacterium Mycobacterium tuberculosis. Structure analysis of Rv2874-C shows the presence of a C-terminal domain formed by the 128 residues Thr568-Gly695. These residues form a jelly-roll structure in which two antiparallel beta-sheets sandwich a hydrophobic core. This domain is combined with a second domain with a carbohydrate-binding module (CBM) fold.	142
407835	pfam17992	Agarase_CBM	Agarase CBM like domain. This is the N-terminal CBM-like domain in exo-beta-agarase proteins (EC:3.2.1.81) found in the marine microbe Saccharophagus degradans. This enzyme catalyzes a critical step in the metabolism of agarose by S. degradans through cleaving agarose oligomers into neoagarobiose products that can be further processed into monomers. The CBM-like domain is structurally very similar to some CBM families. A loop in the CBM-like domain is involved in forming the roof of the active site channel. The contribution of the CBM-like domain to formation of the active site of the enzyme supports a role in substrate recognition explaining the exo-mode of beta-agarase action.	178
407836	pfam17993	HA70_C	Haemagglutinin 70 C-terminal domain. This is the C-terminal domain found in hemagglutinin component such as HA70 found in Clostridium botulinum. HA is a component of the large botulinum neurotoxin complex and is critical for its oral toxicity. HA plays multiple roles in toxin penetration in the gastrointestinal tract, including protection from the digestive environment, binding to the intestinal mucosal surface, and disruption of the epithelial barrier. HA consists of three different proteins, designated HA70 (also known as HA3), HA33 (HA1), and HA17 (HA2) based on molecular mass. HA70 consists of three domains (D1-3). The D1 and D2 domains, which adopt similar structures, mediate the trimerization of HA70 with each protomer. The D3 domain, sitting at the tip of the trimer, is composed of two similar jelly-roll-like beta-sandwich structures. Furthermore, crystal structures of HA70 in a complex with alpha2,3- or alpha2,6-SiaLac (alpha2,6-sialyllactose), show that alpha2,3- and alpha2,6-SiaLac bound to the same region in the D3 domain of HA70. This domain is the D3 domain found in HA3/HA70 which has been shown to be involved in binding to carbohydrate of glycoproteins from epithelial cells in the infection process.	135
407837	pfam17994	Glft2_N	Galactofuranosyltransferase 2 N-terminal. This is the N-terminal beta-barrel domain found in the polymerizing galactofuranosyltransferase GlfT2 (Rv3808c). This enzyme synthesizes the bulk of the galactan portion of the mycolyl-arabinogalactan complex, which is the largest component of the mycobacterial cell wall such as in Mycobacterium tuberculosis. The N-terminal domain contains two short helices preceding a 10-stranded beta-sandwich with jelly roll topology.	140
407838	pfam17995	GH101_N	Endo-alpha-N-acetylgalactosaminidase N-terminal. This is the N-terminal domain found in Streptococcus pneumoniae endo-alpha-N-acetylgalactosaminidase (EC:3.2.1.97), a cell surface-anchored glycoside hydrolase from family GH101 involved in the breakdown of mucin type O-linked glycans. This is a twisted beta-sandwich domain composed of two sheets of six and seven antiparallel beta-strands. The domain appears to be missing the extended metal and carbohydrate-binding loops.	180
407839	pfam17996	CE2_N	Carbohydrate esterase 2 N-terminal. This is the N-terminal beta-sheet domain with jelly roll topology found in CE2 acetyl-esterase from the bacterium Clostridium thermocellum. This enzyme displays dual activities: it catalyses the deacetylation of plant polysaccharides and also potentiates the activity of its appended cellulase catalytic module through its noncatalytic cellulose binding function. This N-terminal jelly-roll domain appears to extend the substrate/cellulose binding cleft of the catalytic domain in C. thermocellum.	108
375474	pfam17997	Cry1Ac_D5	Insecticidal delta-endotoxin CryIA(c) domain 5. This domain is found in the protoxins portion of insecticidal proteins (parasporins, or Cry proteins) such as those from Bacillus thuringiensis (Bt) Cry1Ac. The protoxin portion comprise a proteolytically labile C-terminal segment (sometimes referred to as the protoxin domain). This is domain V in Cry1Ac from B. thuringiensis. One of the four protoxin domains (D-IV through D-VII). Domains V and VII are beta-rolls (similar to D-II or D-III) that closely resemble carbohydrate-binding modules (CBM) found in sugar hydrolases, however, it is difficult to guess which particular carbohydrates (if any) may serve as their ligands because residues on the putative sugar-binding interfaces are conserved neither in sequence nor in local structure. Structural analysis indicate that there are putative disulfide crosslinking at the dimer interface mediated by cysteines within 783-823 region of this domain which together with other cysteines creates a three-dimensional network of cross-links across the crystal which may play a role in stabilizing mature Bt Cry1Ac.	173
407840	pfam17998	AgI_II_C2	Cell surface antigen I/II C2 terminal domain. This is the second domain (C2) located in the C-terminal region found in antigen I/II type adhesin protein AspA from S. pyogenes. Together with C3, these two domains form an elongated structure, each domain adopts the DEv-IgG fold. Similar to the classical IgG folds, it is comprised of two major antiparallel beta-sheets, designated ABED and CFG. For the C2-domain, there are two additional strands on the CFG sheet. Furthermore, sheets ABED and CFG are interconnected by several cross-connecting loops and one alpha-helix (DH1). The side chains of D982 and N996 in the C2-domain are involved in hydrogen bonding with the side chains of R1264 and N1295 in the C3 domain. Main chain hydrogen bonding can also be observed between S992 in C2 and N1189/G1191 in C3, furthermore stabilizing the interaction between the domains. The C2 domain contains one bound metal ion, modeled as Ca2+, and both the C2- and C3-domains are stabilized by conserved isopeptide bonds, which connect the beta-sheets of the central DEv-IgG motifs. Other members of this family include Major cell-surface adhesin PAc from Streptococcus mutans and SspB from Streptococcus gordonii.	180
407841	pfam17999	PulA_N1	Pullulanase N1-terminal domain. This is the N-terminal domain found in debranching enzyme such as Pullulanase (PulA) from Anoxybacillus sp. LM18-11. The PulA structure comprises four domains (N1, N2, A, and C). This is the N1 domain which has been identified as a carbohydrate-binding motif. Two maltotriose or maltotetraose molecules were found between the N1 domain and a loop of the A domain in the PulA-maltotriose or PulA-maltotetraose structures. These carbohydrates are bound in a parallel binding mode close to each other and form hydrogen bonds. The sugar moieties bound to the N1 domain are not immediately adjacent to the active site, but the enzyme might use N1 binding to attract and grab the substrate. Functional analysis indicates that N1 is important for catalytic activity and thermostability in addition to assisting substrate binding. The structure of the N1 domain reveals a classic distorted beta-jelly roll fold consisting of two anti-parallel beta-sheets, forming a concave and a convex surface. On the concave side of N1 domain there is a cleft to accommodate two molecules of maltotriose or maltotetraose.	85
407842	pfam18000	Top6b_C	Type 2 DNA topoisomerase 6 subunit B C-terminal domain. This is the C-terminal domain found in archaeal type 2 DNA topoisomerase 6 subunit B (EC:5.99.1.3). This region is a small helix-two turns-helix (H2TH) domain inserted between the GHKL and transducer domains which adopts an immunoglobulin-like fold. Mutation analysis of this C-terminal domain showed that the overall activity of the mutant mesophilic methanogen M. mazei Top6B (MmT6) is modestly reduced but its relative activity on different substrates is not affected. Due to the similarity of the B subunit's CTD to known protein- and carbohydrate-binding modules, it has been suggested that it could regulate topo VI spatially, perhaps by localizing the enzyme to a specific subcellular region or functional partner.	113
375477	pfam18001	Il13Ra_Ig	Interleukin-13 receptor subunit alpha Ig-like domain. This is the N-terminal Ig-like domain found in IL-13Ralpha1 type two cytokine complex. The IL-13Ralpha1 contains an extra N-terminal Ig-like domain not found in other receptors of the common gamma-chain subfamily. The extra N-terminal IL-13Ralpha1 Ig-like domain contacts the dorsal surfaces of both IL-4 and IL-13. Mutational studies show that the deletion of this domain affects the binding of IL-13 to IL-13Ralpha1.	95
407843	pfam18002	T6_Ig_like	T6 antigen Ig like domain. This is the N-terminal immunoglobulin-like domain. Family members carrying this domain include Trypsin-resistant surface T6 protein found in Streptococcus pyogenes.	142
407844	pfam18003	DUF3823_C	Domain of unknown function (DUF3823_C). This is a family of uncharacterized proteins from Bacteroidetes. This domain has an Ig-like fold.	104
407845	pfam18004	RPN2_C	26S proteasome regulatory subunit RPN2 C-terminal domain. This is the C-terminal domain found in S. cerevisiae Rpn2 (26S proteasome regulatory subunit RPN2) as well as other eukaryotic species. A study revealed that the C-terminal 52 residues of the Rpn2 C-terminal domain are responsible for mediating interactions with the ubiquitin-binding subunit Rpn13. Furthermore, the extreme C-terminal 20 or 21 residues of Rpn2 (926-945 or 925-945) of S. cerevisiae were shown to be equally effective at binding Rpn13. Multiple sequence alignments indicate that Rpn2 orthologs are highly conserved in this C-terminal region and share characteristic acidic, aromatic, and proline residues, suggesting a common function. In the structure of Rpn2 from S. cerevisiae, this region is exposed and disordered, and is thus accessible for associating with Rpn13. The Rpn2 binding surface of human Rpn13 has been mapped by nuclear magnetic resonance titration to one surface of its Pru domain.	159
407846	pfam18005	eIF3m_C_helix	eIF3 subunit M, C-terminal helix. This is the C-terminal helix domain found in Eukaryotic translation initiation factor 3 subunit M (eIF3m). In mammalian eIF3a, the C-terminal helix following the PCI domain is involved in interactions with other core subunits.	29
407847	pfam18006	SepRS_C	O-phosphoseryl-tRNA synthetase C-terminal domain. The SSHS domain is mainly found in Archaea. The domain makes up part of the anticodon binding domain at the C-terminal of O-phosphoserine--tRNA(Cys) ligase.	31
407848	pfam18007	DUF5593	Domain of unknown function (DUF5593). This is a domain of unknown function found in Corynebacteriales.	93
407849	pfam18008	Bac_RepA_C	Replication initiator protein A C-terminal domain. This is the C-terminal domain (CTD) that can be found in the conserved replication initiator, RepA, essential for staphylococcal propagation. RepA CTD shares the strongest structural homology with the Enterococcus faecalis DnaD CTD, yet the two perform distinct functions. RepA CTD shows strong sequence homology between RepA_N plasmids in genus-specific clusters, suggesting that it may perform host-specific functions necessary for replication. The RepA CTD interacts with the host DnaG primase, which binds the replicative helicase. Structural data indicate that the RepA CTD exists as a monomeric entity, flexibly tethered to the DNA-bound NTD.	94
375484	pfam18009	Fer4_23	4Fe-4S iron-sulfur cluster binding domain. This is the C-terminal domain found in Deinococcus radiodurans protein DR2241 (a Ribosomal protein S2-related protein). This domain has been shown to harbour the sequence motifs CxxC and CxxxC which bind a [4Fe-4S] iron-sulphur cluster. Together with the preceding domain, it is heavily involved in the tetramer formation.	82
375485	pfam18010	HTH_49	Cry35Ab1 HTH C-terminal domain. This is the C-terminal domain found in Bacillus thuringiensis protein Cry35Ab1 (an insecticidal protein). The domain has three helices held in a hydrophobic core, the first two form a typical helix-loop-helix whilst the third helix is perpendicular. The domain is structurally homologous to BinB and proteins containing similar beta-trefoil lectin-like domains. The domain is not required for insecticidal activity or for immuno-reactivity and its function is likely to be to bind to WCR brush border membrane vesicles.	29
407850	pfam18011	Catalase_C	C-terminal domain found in long catalases. This domain is found at the C-terminus of a variety of large catalase enzymes from bacteria. Structurally it is related to class I glutamine amidotransferase domains. The precise molecular function of this domain is uncertain.	150
407851	pfam18012	PH_17	PH domain. This entry represents the C-terminal part of the split PH domain from syntrophin proteins.	59
407852	pfam18013	Phage_lysozyme2	Phage tail lysozyme. This domain has a lysozyme like fold. It is found in the tail protein of various phages probably giving them the ability to degrade the host cell wall peptidoglycan layer.	139
407853	pfam18014	Acetyltransf_18	Acetyltransferase (GNAT) domain. This entry represents a likely acetyltransferase enzyme that is related to pfam06852.	123
407854	pfam18015	Acetyltransf_19	Acetyltransferase (GNAT) domain. This entry represents a likely acetyltransferase enzyme that is related to pfam13302.	113
407855	pfam18016	SAM_3	SAM domain (Sterile alpha motif). 	65
407856	pfam18017	SAM_4	SAM domain (Sterile alpha motif). This entry corresponds to a SAM domain that is found at the N-terminus of the human C19orf47 protein.	84
407857	pfam18018	DNA_pol_D_N	DNA polymerase delta subunit OB-fold domain. The eukaryotic DNA polymerase delta (Pol delta) participates in genome replication, homologous recombination, DNA repair and damage tolerance. Human Pol delta consists of four subunits: p125, p50, p66 and p12. The first three subunits correspond to the three subunits of S. cerevisiae Pol delta. p50 serves as a scaffold for the assembly of Pol delta by interacting simultaneously with all of the other three subunits. This entry corresponds to the OB fold domain found in the p50 subunit.	129
407858	pfam18019	HD_6	HD domain. This HD domain is found at the N-terminus of Cas3 enzymes fused to a helicase domain. This domain is sometimes found as a separate protein. It acts as a nuclease that cleaves ssDNA.	211
407859	pfam18020	TIG_2	TIG domain found in plexin. This entry represents a TIG domain found in plexin proteins. TIG domains have an Ig-like fold.	94
407860	pfam18021	Agglutinin_C	Agglutinin C-terminal. This is the C-terminal domain of the alpha chain found in Marasmius Oreades lectin protein (MOA) which binds specifically with Gal.alpha(1,3)Gal-containing sugar epitopes. The enzymatic activity of the MOA may be associated with this domain. The domain has an alpha/beta-fold, which features a central six-stranded, mostly anti-parallel, beta-sheet flanked by three alpha-helices, and a short two-stranded beta-sheet.	93
407861	pfam18022	Lectin_C_term	Ricin-type beta-trefoil lectin C-terminal domain. This is the C-terminal domain of the beta chain found in Polyporus squamosus lectin protein (PSL). PSL binds specifically to glycans terminating with the sequence: Neu5Ac.alpha2-6Gal.beta. The C-terminal domain is not involved in the binding to the Neu5Ac.alpha2-6Gal.beta. The C-terminal domain is characterized by a central five-stranded beta-sheet that is flanked by three alpha-helices and topped by a short strand. It shows high fold similarity to its closest relative, the Gal.alpha1-3Gal-binding agglutinin from the mushroom Marasmius oreades agglutinin (MOA).	102
375495	pfam18023	FKBP_N_2	BDBT FKBP like N-terminal. This is the N-terminal domain of the beta chain found in Drosophila melanogaster protein BDBT (a FK506-Binding Protein) which stimulates the DBT circadian function. The domain contains the DBT-binding site. The domain is structurally homologous to the peptidyl prolyl isomerase (PPIase) regions of FK506-binding proteins despite low sequence homology. BDBT is structurally related to the immunophilin FKBP51 and it shares a common domain organization consisting of PPIase-like and TPR domains with noncanonical immunophilins such as FKBP38 or FKBPL.	113
407862	pfam18024	HTH_50	Helix-turn-helix domain. The TyrR protein of Haemophilus influenzae is a 36-kD transcription factor whose major function is to control the expression of genes important in the biosynthesis and transport of aromatic amino acids. This entry represents the C-terminal helix-turn-helix DNA-binding domain of TyrR and related proteins.	50
407863	pfam18025	FucT_N	Alpha-(1,3)-fucosyltransferase FucT N-terminal domain. This is the N-terminal domain of the alpha chain found in Helicobacter pylori Fucosyltransferase protein which is involved in the production of Lewis x trisaccharide, a major component of lipopolysaccharide. The N-terminal domain contains the catalyst base, Glu-95 which is equivalent to the Asp-100 of other members of the glycosyltransferases-B family. The domain contains the pocket where LacNAc binds. The domain is composed of 2-10 heptad repeats and a conserved N-terminal alpha-beta-alpha motif which has little sequence similarity to the conserved N-terminal motif in other glycosyltransferases.	92
407864	pfam18026	Exog_C	Endo/exonuclease (EXOG) C-terminal domain. This is the C-terminal domain found in EndoG-like mitochondrial endo/exonuclease (EXOG) proteins in higher eukaryotes. Evolutionary conserved mitochondrial nucleases are involved in programmed cell death and normal cell proliferation in lower and higher eukaryotes. It has been proposed that during metazoan evolution duplication of an ancestral nuclease gene could have generated the paralogous EndoG- and EXOG-protein subfamilies in higher eukaryotes, thereby maintaining the full endo/exonuclease activity found in mitochondria of lower eukaryotes. Family members include the human EXOG, a dimeric mitochondrial enzyme that displays 5'-3' exonuclease activity and further differs from EndoG in substrate specificity. This C-terminal domain is predicted to fold into a coiled-coil structure. Deletion of the domain led to a pronounced reduction in EXOG activity, revealing that the presence and most likely the proper positioning of this domain in EXOG proteins is crucial for its enzymatic activity.	49
407865	pfam18027	Pepdidase_M14_N	Cytosolic carboxypeptidase N-terminal domain. This entry corresponds to the N-terminal domain of cytosolic carboxypeptidases. The N-terminal domain folds into a nine-stranded antiparallel beta sandwich. This domain is specific to CCP proteins and is absent in other carboxypeptidases. It has been hypothesized that the N-terminal domain might contribute to folding, might have a regulatory function and/or might be involved in binding other proteins.	107
375500	pfam18028	Zmiz1_N	Zmiz1 N-terminal tetratricopeptide repeat domain. This is the N-terminal domain found in Zmiz1 proteins (Zinc finger MIZ domain-containing protein 1). Zmiz1 is a direct Notch1 cofactor that heterogeneously regulates Notch target genes. Zmiz1 directly interacts with the RAM1 domain of Notch1 through this N-terminal tetratricopeptide repeat (TPR) domain. Furthermore, it has been shown that Zmiz1 and Notch1 cooperatively recruit each other to chromatin through direct interaction via the N-terminal TPR domain resulting in a slight increase in activating histone marks and decrease of repressive histone marks. Functional analysis indicates that the N-terminal domain of Zmiz1 is important for driving Myc transcription and proliferation indirectly.	93
407866	pfam18029	Glyoxalase_6	Glyoxalase-like domain. This entry comprises a diverse set of domains related to the Glyoxalase domain. The exact specificity of these proteins is uncertain.	111
407867	pfam18030	Rimk_N	RimK PreATP-grasp domain. This is the N-terminal domain found in Escherichia coli RimK proteins (Ribosomal protein S6-L-glutamate ligase). This domain precedes the ATP-grasp domain pfam08443.	94
407868	pfam18031	UCH_C	Ubiquitin carboxyl-terminal hydrolases. This is the C-terminal domain found in eukaryotic UCH37 proteins (also known as Ubiquitin carboxyl-terminal hydrolase isozyme L5, UCHL5). UCH37 is a subunit of two complexes: INO80, which performs ATP-dependent sliding of nucleosomes for transcriptional regulation and DNA repair, and the 26S proteasome, which performs ATP-dependent proteolysis of polyubiquitylated proteins in the cytosol and nucleus. Recruitment to the proteasome is mediated by the C-terminal domain of RPN13 (also known as ADRM1). Recruitment to INO80 is mediated by the N-terminal domain of NFRKB. Structural and biochemical analysis reveal that RPN13 and NFRKB make similar interactions with the UCH37 C-terminal domain but have very different interactions with the catalytic UCH domain that are activating in the case of RPN13 and highly inhibitory in the case of NFRKB.	46
407869	pfam18032	FRP	Photoprotection regulator fluorescence recovery protein. This family includes fluorescence recovery protein (FRP) domain, which is found in Synechocystis sp. PCC 6803 substr. Kazusa. FRP causes the dissociation of the orange carotenoid protein (OCP) from the phycobilisomes by interacting with the C-terminal domain of OCP, accelerating the conversion of the active red OCP to the inactive orange form. A patch of residues (W50, D54, H53, and R60), contributed by both chains of the FRP dimer cause the acceleration of the OCPr to OCPo conversion. Mutation of the absolutely conserved amino acids (R60) affect the activity of FRP.	99
407870	pfam18033	SpuA_C	SpuA C-terminal. This is the C-terminal beta sandwich domain found in Streptococcus pneumoniae SpuA proteins. SpuA is a large multimodular cell wall-attached enzyme involved in the degradation of glycogen.	93
407871	pfam18034	Bac_GH3_C	Bacterial Glycosyl hydrolase family 3 C-terminal domain. This is the C-terminal domain of the glycoside hydrolase family 3 (pfam00933).	137
407872	pfam18035	Bap31_Bap29_C	Bap31/Bap29 cytoplasmic coiled-coil domain. Bap31 is a polytopic integral protein of the endoplasmic reticulum membrane and a substrate of caspase-8. Bap31 is cleaved within its cytosolic domain, generating pro-apoptotic p20 Bap31. This entry represents the cytoplasmic domain which forms a heterodimeric coiled-coil with Bap29. Bap29 and Bap31 are homologous to each other and this entry includes both proteins.	52
407873	pfam18036	Ubiquitin_4	Ubiquitin-like domain. This domain has a ubiquitin-like fold. It is found in a diverse range of proteins including the 25KDa U11/U12 component.	89
407874	pfam18037	Ubiquitin_5	Ubiquitin-like domain. This entry includes N-terminal ubiquitin-like domain from proteins such as NEDD8 ultimate buster protein.	96
407875	pfam18038	FERM_N_2	FERM N-terminal domain. This entry represents the FERM N-terminal domain found in focal adhesion kinases.	96
407876	pfam18039	UBA_6	UBA-like domain. This entry represents a UBA-like domain found at the N-terminus of ribonuclease ZC3 proteins.	42
407877	pfam18040	BPA_C	beta porphyranase A C-terminal. This is the C-terminal domain found in Bacteroides plebeius of proteins such as beta-porphyranase A (BPA), a beta-galactanase that cleaves the beta-1,4 glycosidic bond. Porphyranase degrade red seaweed glycans. This domain adopts a beta sandwich shape.	95
407878	pfam18041	MapZ_EC1	MapZ extracellular domain 1. This is the extracellular domain 1 (MapZextra1) found in Streptococcus pneumoniae cell division site positioning protein MapZ. MapZ ensures accurate placement of the bacterial division site. The domain is a rigid four alpha-helices with two flexible linkers. The N-terminal end of MapZextra1 is connected to the trans-membrane segment of MapZ whilst the C-terminal is linked to MapZextra2 via a serine rich linker (SRL). The highly conserved residues are not accessible at the surface but are directly involved in many inter-helices interactions allowing for rigidity.	129
407879	pfam18042	ORF_12_N	ORF 12 gene product N-terminal. This is the N-terminal domain of Streptomyces clavuligerus ORF 12 gene product, which is directly involved in biosynthesis of Clavulanic acid (CA). The N-terminal domain consists of one four-stranded antiparallel beta-sheet surrounded by four alpha-helices and folds similarly to steroid isomerases and polyketide cyclases (PKTC). However, the N-terminal domain has no apparent polar-lined active-site pocket or conserved catalytic residues analogous to those of either the steroid isomerases or PKTCs. ORF 12 has 2 binding sites, CA is able to bind both to an active site and between the two domains. The active site pocket is lined by residues from both the N- and the C-terminal domains (His88, Ser173, Thr209, Ser234, Ser278, Met383, Phe374, Ala376 and Phe385). The C-2 carboxylate of CA is positioned deep in the pocket, making electrostatic interactions with Lys375, Arg418 and Lys89. Lys89 is located in strand beta-2 of the N-terminal domain and may also be involved in the beta-lactam core-binding region, including the residues Lys89 as well as Arg418.	92
407880	pfam18043	T4_Rnl2_C	T4 RNA ligase 2 C-terminal. This is the C-terminal domain of Enterobacteria phage T4 RNA ligase 2 (Rnl2). The C-terminal domain consists of a four-helix bundle. The C-terminal domain is thought to be critical for binding the Rnl2 to adenylylated nicked duplex. This is known as step 2 of the ligation pathway.	82
407881	pfam18044	zf-CCCH_4	CCCH-type zinc finger. This short zinc binding domain has the pattern of three cysteines and one histidine to coordinate the zinc ion. This domain is found in a wide variety of proteins such as E3 ligases.	22
407882	pfam18045	ISP3_C	ISP3 C-terminal. This is the C-terminal domain of ISP3 protein, which plays a role in asexual daughter cell formation, for example in T.gondii. The domain consists of a seven-stranded antiparallel beta-sandwich bordered on one end by an interstrand loop (open end) and capped at the other end by an amphipathic C-terminal helix (closed end). The loop between beta 5 and beta 6 is extended and variable. The domain adopts a pleckstrin homology (PH) fold, despite having negligible sequence similarity. PH domains are often found in proteins that support protein-lipid interactions and play a role in mediating membrane localization through IP binding. However, the phospholipid-binding properties of PH domains are not conserved in ISP3. Unlike PH domains, ISP3 is cysteine rich. The cysteine-rich nature of the ISP3s and the number of surface-exposed cysteines may result in redox instability and may also facilitate higher order multimerization. There are no disulfide bonds in ISP3 unlike in ISP1. It is worth noting that ISP1 and ISP3 share low sequence identity but contain the same secondary core elements.	110
407883	pfam18046	FKBP26_C	FKBP26_C-terminal. This is the C-terminal domain (CTD) of Methanocaldococcus jannaschii peptidyl-prolyl cis/trans isomerase FKBP26. FKBP26 mediates protein folding. CTD has an alpha/beta-sandwich structure composed of three alpha-helices and a three-stranded, mixed-orientation beta-sheet. The CTD domain is responsible for dimerization of FKBP26 through inter-subunit antiparallel pairing of the third beta-strands of the two CTDs. The CTD dimer forms a continuous, six-stranded mixed beta-sheet. A CTD-like structure is also found in the phylogenetically conserved NifU proteins (HIRIP5 in mouse and humans).	72
407884	pfam18047	PatG_D	PatG Domain. This is a domain found in PatG proteins, which are involved in processing the precursor peptide to yield the cyclic Patellamide. PatG can be found in Prochloron sp.	111
407885	pfam18048	TRAF6_Z2	TNF receptor-associated factor 6 zinc finger 2. This domain is the second of three zinc fingers of Homo sapiens TNF receptor associated factor 6 (TRAF6). TRAF6 mediates Lys63 (K63)-linked polyubiquitination for nuclear factor-kappaB activation. The first three residues and the last Cys of finger 1 form a classical type I beta-turn.	27
407886	pfam18049	DNA_pol_P_Exo	DNA polymerase nu pseudo-exo. This domain, known as the Pseudo-exo domain, is found in DNA polymerase nu protein in species such as Homo sapiens. Residues 192-416 of Pol nu form a degenerate 3'-5'-exonuclease domain, which deviates from the equivalent.	212
407887	pfam18050	Cyclophil_like2	Cyclophilin-like family. This entry represents a family of cyclophilin-like proteins found in a range of bacterial species.	114
407888	pfam18051	RPN1_C	26S proteasome non-ATPase regulatory subunit RPN1 C-terminal. This is the C-terminal domain found in RPN1 proteins (26S proteasome non-ATPase regulatory subunit 2). The 26S proteasome holocomplex consists of a 28-subunit barrel-shaped core particle (CP) in the center capped at the top and bottom by 19-subunit regulatory particles (RPs). The CP forms the catalytic chamber and the RP is formed from two subcomplexes known as the lid and the base. The lid comprises nine Rpn subunits in yeast (Rpn3/5/6/7/8/9/11/12/15) and the base comprises three Rpn subunits (Rpn1/2/13) and six ATPases (Rpt1-6).	54
407889	pfam18052	Rx_N	Rx N-terminal domain. This entry represents the N-terminal domain found in many plant resistance proteins. This domain has been predicted to be a coiled-coil, however the structure shows that it adopts a four helical bundle fold.	93
407890	pfam18053	GyrB_insert	DNA gyrase B subunit insert domain. This is the insert domain found in DNA gyrase B subunit proteins. Studies indicate that the insert has two functions, acting as a steric buttress to pre-configure the primary DNA-binding site, and serving as a relay that may help coordinate communication between different functional domains.	167
407891	pfam18054	CEL_III_C	CEL-III C-terminal. This is the C-terminal domain found in Cucumaria echinata CEL-III protein which is a lectin that exhibits both hemolytic and hemagglutinating activity. The domain is responsible for oligomerization and insertion of CEL-III into the erythrocyte membrane. The domain is composed of an eight-stranded beta sandwich and two alpha helices; the latter change conformation upon binding to the cell surface carbohydrates.	154
407892	pfam18055	RPN6_N	26S proteasome regulatory subunit RPN6 N-terminal domain. This is the N-terminal domain found in RPN6 proteins (26S proteasome regulatory subunit). The 26S proteasome holocomplex consists of a 28-subunit barrel-shaped core particle (CP) in the center capped at the top and bottom by 19-subunit regulatory particles (RPs). The CP forms the catalytic chamber and the RP is formed from two subcomplexes known as the lid and the base. The lid comprises nine Rpn subunits in yeast (Rpn3/5/6/7/8/9/11/12/15) and the base comprises three Rpn subunits (Rpn1/2/13) and six ATPases (Rpt1-6). Phosphorylation of Rpn6 enhances proteasome ATPase activity and promotes the formation of doubly capped (30S) proteasome, hence accelerating the degradation of short-lived proteins.	117
407893	pfam18056	PBP3	Penicillin Binding Protein 3 Domain. This domain belongs to peptidoglycan synthesis regulatory factor 3 (PBP3) from Streptococcus pneumoniae. Peptidoglycan synthesis regulatory factors are known as Penicillin binding proteins (PBP) and are membrane-associated enzymes that perform critical functions in the bacterial cell division process. The domain contributes residues to the active site such as Arg 278.	110
407894	pfam18057	DUF5594	Domain of unknown function (DUF5594). This domain was first discovered in BPSL1050, a highly immunoreactive protein found in Burkholderia pseudomallei. The domain's structure consists of three helical regions which pack onto an antiparallel beta sheet, formed by four strands. The beta sheet is solvent exposed on one side and packs tightly against the three helices on the other side, generating a network of hydrophobic and aromatic interactions that contribute to tight packing of the protein. It is thought that the small loop L1, the main loop L2, and part of helix alpha-3 extending until Leu120 are the three main immunogenic sequences.	115
407895	pfam18058	SbsC_C	SbsC C-terminal domain. This is the C-terminal domain found in Bacterial Cell Surface Layer Protein SbsC which can be found in species such as Geobacillus stearothermophilus. The C-terminal domain is the third and last triple-helical bundle and adopts a canonical coiled-coil structure. A similar overall arrangement of antiparallel triple-helical bundles has been found in the cytoskeletal protein spectrin (2SPC).	132
407896	pfam18059	Csd3_N	Csd3 N-terminal. Csd3 (also known as HdpA) is a bi-functional enzyme with delta,delta-endopeptidase activity and delta,delta-carboxypeptidase activity. The N-terminal domain is also known as domain 1 and is composed of an alpha/beta fold consisting of a five-stranded antiparallel beta-sheet and three short alpha-helices. Domain 1 blocks the active-site cleft of the LytM domain, with the protruding helix alpha-3 contributing to the Zn2+ coordination sphere. The fold of domain 1 is very remotely related to monellin/cystatin superfamily proteins, some of which act as inhibitors of cysteine peptidases.	83
407897	pfam18060	F_actin_bund_C	F actin bundling C terminal. This is the C-terminal domain found in 34 kDa F actin bundling protein. ABP34 is a calcium regulated actin binding protein that cross links actin filaments into bundles. Residues 216-244 in the C-domain are part of the strongest actin-binding sites (residue 193 residue 254) and have conserved sequences with the actin-binding regions of alpha-actinin and ABP120.	87
407898	pfam18061	CRISPR_Cas9_WED	CRISPR-Cas9 WED domain. This domain, known as the wedge (WED) domain, is found in Cas9 proteins which are present in Staphylococcus aureus. Cas9 cleaves double-stranded DNA targets with a protospacer adjacent motif (PAM) and complementarity to the guide RNA. The Cas9 WED domain has a fold comprising a twisted five-stranded beta sheet flanked by four alpha helices, and is responsible for the recognition of the distorted repeat: anti-repeat duplex. WED domains are responsible for the recognition of single-guide RNA scaffolds.	126
407899	pfam18062	RE_AspBHI_N	Restriction endonuclease AspBHI N-terminal. This is the N-terminal domain found in modification-dependent restriction endonuclease proteins such as AspBHI, which can be found in Azoarcus sp. AspBHI is a homo-tetrameric protein that recognizes 5-methylcytosine in the double-strand DNA sequence context of (C/T) (C/G) (5mC) nucleotide (C/G) and cleaves the two strands at a fixed distance (N12/N16) 3 to the modified cytosine. The N-terminal domain is responsible for DNA-recognition and resembles an SRA-like 5-methylcytosine binding domain in structure and function.	185
407900	pfam18063	BB_PF	Beta barrel Pore-forming domain. This domain is found in Monalysin Pore-forming Toxin which is a type of beta-barrel pore-forming toxin protein found in Pseudomonas entomophila. Monalysin forms a stable doughnut-like 18-mer complex composed of two disk-shaped nonamers to form a pore. The domain is composed of a central twisted beta-sheet composed of three antiparallel beta-strands (beta 3, beta 6, and beta 8/9) and flanked by the pore-forming segment and the C-terminal region on either side. The pore-forming domain (residues 102-170) is located between strands beta 3 and beta 6, and is formed from two antiparallel beta-strands connected by three alpha-helices, alpha 3, alpha 4, and alpha 5. The C-terminal region forms a long alpha-helix followed by a small hairpin and a short alpha-helix.	204
407901	pfam18064	ParB_C	Centromere-binding protein ParB C-terminal. This is the C-terminal domain found in centromere-binding protein ParB, which is used for stable segregation. The C-terminal domain has a ribbon-helix helix (RHH) motif with a C-terminal loop (residues 119-128) following helix alpha-2. The domain forms a dimer with the C-terminal of the beta chain. The function of the C-terminal domain is to bind to DNA.	47
375529	pfam18065	PatG_C	PatG C-terminal. This is the C-terminal domain of Prochloron sp. PatG, which processes the precursor peptide to yield the cyclic Patellamide. The C-terminal domain of PatG is 56% structurally homologous to the C-terminal domain of PatA.	115
407902	pfam18066	Phage_ABA_S	Phage ABA sandwich domain. This domain is found in a prophage protein BC1872 found in B. cereus. The domain forms a three-layer beta/alpha/beta sandwich. Three alpha-helices, alpha1, alpha2, and alpha3, are sandwiched on one side by three-stranded antiparallel beta-sheet (beta3, beta4, and beta5) and on the other by a beta-hairpin (beta1 and beta2).	89
407903	pfam18067	Lipase_C	Lipase C-terminal domain. This domain is found in Archaeoglobus fulgidus lipase (AFL). The domain consists of a layer of seven beta-sheet. When the domain is combined with the proximal domain, which is also a layer of seven beta-sheet, they form a beta sandwich. The combination of these two domains is known as the C-terminal domain. It is likely that the C-terminal domain plays an important role in substrate specificity and catalytic efficiency, and also contributes partly to AFL's stability.	96
407904	pfam18068	Npun_R1517	Npun R1517. This domain belongs to NPun R1517 which is found in Nostoc punctiforme. Studies indicate that Npun R1517 is encoded by orphan gene 29. Npun R1517 adopts a sleigh-shaped structure with a two-stranded antiparallel beta-sheet forming the floor of the sleigh, a HTH forming the seat and a HTH forming the front of the sleigh.	74
407905	pfam18069	DR2241	DR2241 stabilising domain. This is the middle domain found in DR2241, a multi-domain protein with an N-terminal cobalamin (vitamin B12) chelatase domain. DR2241 is found in D. radiodurans. The middle domain has four alpha-helices (alpha7-alpha10) in contact with the N-terminal domains and C-terminal domain and five anti-parallel beta-strands with strand order 12354 at the outer side of one monomer. The middle domain, as well as the C-terminal domain, are heavily involved in the tetramer stabilisation.	111
407906	pfam18070	Cas9_PI2	CRISPR-Cas9 PI domain. This domain is found in Cas9 proteins which can be found in Staphylococcus aureus. Cas9 cleaves double-stranded DNA targets with a protospacer adjacent motif (PAM) and complementarity to the guide RNA. When this domain is combined with the C-terminal domain it is called the Pam-interacting (PI) domain. The Cas9 orthologs from different microbes have highly divergent sequences but their PI domains share a conserved core fold and recognize distinct PAM sequences.	59
407907	pfam18071	HMW1C_N	HMW1C N-terminal. This is the N-terminal domain found in Actinobacillus pleuropneumoniae HMW1C (ApHMW1C). HMW1 adhesin is an N-linked glycoprotein that mediates adherence to respiratory epithelium through N-glycosylation of protein acceptor sites and O-glycosylation of sugar acceptor sites. The N-terminal domain forms an all alpha domain (AAD) when combined with the domain spanning from residue 154 to residue 245. The AAD interacts extensively with the C-terminal GT-B fold in order to create a unique groove with potential to accommodate the acceptor protein.	143
407908	pfam18072	FGAR-AT_linker	Formylglycinamide ribonucleotide amidotransferase linker domain. This is the linker domain found in Formylglycinamide ribonucleotide amidotransferase (FGAR-AT), also known as Phosphoribosylformylglycinamidine synthase (EC:6.3.5.3), PurL and formylglycinamidine ribonucleotide (FGAM) synthase. This enzyme catalyzes the ATP-dependent conversion of formylglycinamide ribonucleotide (FGAR) and glutamine to formylglycinamidine ribonucleotide (FGAM), ADP, Pi, and glutamate in the fourth step of the purine biosynthetic pathway. The structure analysis of Salmonella typhimurium FGAR-AT reveals that this linker domain is made up of a long hydrophilic belt with an extended conformation.	50
407909	pfam18073	Rubredoxin_2	Rubredoxin metal binding domain. This is the C-terminal rubredoxin metal binding domain found in lipopolysaccharide (LPS) assembly protein B (LapB). Rubredoxin proteins form small non-heme iron binding sites that use four cysteine residues to coordinate a single metal ion in a tetrahedral environment. Rubredoxins are most commonly found in bacterial systems, but have also been found in eukaryotes. The key features of these rubredoxin-like domains are the extended loops or 'knuckles' and the tetracysteine mode of iron binding. Structural analysis of LapB from Escherichia coli shows that the rubredoxin metal binding domain is intimately bound to the TPR motifs and that this association to the TPR motifs is essential to LPS regulation and growth in vivo. Other family members include RadA proteins which play a role in DNA damage repair. In E. coli, a protein known as RadA (or Sms) participates in the recombinational repair of radiation-damaged DNA in a process that uses an undamaged DNA strand in one DNA duplex to fill a DNA strand gap in a homologous sister DNA duplex. RadA carries a zinc finger at the N-terminal domain.	28
407910	pfam18074	PriA_C	Primosomal protein N C-terminal domain. This is the C-terminal domain found in PriA DNA helicase, a multifunctional enzyme that mediates the process of restarting prematurely terminated DNA replication reactions in bacteria. The C-terminal domain (CTD) bears similarity to the S10 subunit which binds branched rRNA within the bacterial ribosome. The C-terminal domain is part of the helicase domain of PriA proteins. It acts together with the 3' DNA-binding domain to form a site for binding ssDNA-binding protein (SSB).	96
407911	pfam18075	FtsX_ECD	FtsX extracellular domain. This is the extracellular domain (ECD) found in FtsX enzyme, a homolog of the transmembrane PG-hydrolase regulator. The FtsX extracellular domain binds the PG peptidase Rv2190c/RipC N-terminal segment, causing a conformational change that activates the enzyme, leading to PG hydrolysis in Mycobacterium tuberculosis. Structural analysis of FtsX ECD reveals a fold containing two lobes connected by a flexible hinge. Mutations in the hydrophobic cleft between the lobes showed reduction in RipC binding in vitro and inhibition of FtsX function in Mycobacterium smegmatis.	94
407912	pfam18076	FGAR-AT_N	Formylglycinamide ribonucleotide amidotransferase N-terminal. This is the N-terminal domain found in Formylglycinamide ribonucleotide amidotransferase (FGAR-AT), also known as Phosphoribosylformylglycinamidine synthase (EC:6.3.5.3), PurL and formylglycinamidine ribonucleotide (FGAM) synthase. This enzyme catalyzes the ATP-dependent conversion of formylglycinamide ribonucleotide and glutamine to formylglycinamidine ribonucleotide, ADP, Pi, and glutamate in the fourth step of the purine biosynthetic pathway.	115
407913	pfam18077	DUF5595	Domain of unknown function (DUF5595). This domain is found in Nude C 80 (Ndc80) proteins which can be found in species such as Homo sapiens. Ndc80 protein complexes are a core component of the end-on attachment sites for kinetochore microtubules. Ndc80 is also known as Hec1, for highly expressed in cancer 1.	73
407914	pfam18078	Thioredoxin_11	Thioredoxin-like SNTX domain. This domain is found in pore-forming toxin stonustoxin (SNTX), a lethal venom found in Synanceia horrida. Because of the thioredoxin-like nature of the domain, it is referred to as the THX domain. The THX domain is comprised of a five-stranded beta-sheet and shares greatest structural similarity with Saccharomyces cerevisiae mitochondrial THX3. It is thought that THX domain plays a purely structural role.	126
407915	pfam18079	AglB_L1	Archaeal glycosylation protein B long peripheral domain. This domain is found in Archaeal Glycosylation B protein (AglB-Long) in A. fulgidus. When the domain, known as peripheral l (Pl), is combined with the central core (CC) and insertion (IS) sub-units, they form the C-terminal domain. It is thought that the C-terminal domain may contribute toward the increased thermal stability of the AglB proteins in the hyper-thermophilic.	88
407916	pfam18080	Gal_mutarotas_3	Galactose mutarotase-like fold domain. This domain is found in endo-alpha-N-acetylgalactosaminidase present in Streptococcus pneumoniae. Endo-alpha-N-acetylgalactosaminidase is a cell surface-anchored glycoside hydrolase involved in the breakdown of mucin type O-linked glycans. The domain, known as domain 2, exhibits strong structural similarity to the galactose mutarotase-like fold but lacks the active site residues. Domains, found in a number of glycoside hydrolases, structurally similar to domain 2 confer stability to the multidomain architectures.	243
407917	pfam18081	FANC_SAP	Fanconi anemia-associated nuclease SAP domain. This domain is found in Fanconi-anemia-associated nuclease 1 (FAN1) present in Pseudomonas aeruginosa. FAN1 is a nuclease associated with Fanconi anemia (FA), an autosomal recessive genetic disorder caused by defects in FA genes responsible for processing DNA inter-strand cross-links (ICLs). The domain, known as the SAP domain, helps to augment the overall protein DNA interaction by interacting with the 3' and 5' ends of the template strand. Support of the pre-nick segment binding is crucial as multiple mutations in this domain resulted in hypersensitivity to a cross-linking agent in the SAP domain of Caenorhabditis elegans' FAN1. The helix-hairpin-helix of the SAP recognize three consecutive phosphate groups (C19, A20 and A21) at the 3' end of the template via the basic residues K116, K135 and K117.	51
407918	pfam18082	DUF5596	Domain of unknown function (DUF5596). This domain belongs to polcalcin from the weed Chenopodium album, recombinant Che a 3 (rChe a 3). Polcalcins occur in pollen as highly cross-reactive allergenic molecules. The three-dimensional structure of rChe a 3 resembles an alpha-helical fold that is essentially identical with that of the two EF-hand allergens from birch pollen, Bet v 4, and timothy grass pollen, Phl p 7.	129
407919	pfam18083	PutA_N	Proline utilization A N-terminal domain. This domain is found in Proline utilization A (PutA) proteins present in Geobacter sulfurreducens. PutA are bifunctional peripheral membrane flavoenzymes that catalyze the oxidation of l-proline to l-glutamate and couple the oxidation of imported proline to the reduction of membrane-associated quinones. This domain is located at the N-terminus and is referred to as the alpha domain. The hydrocarbon tail of Zwittergent 3-12 binds to an exposed hydrophobic patch of the alpha domain which contains aromatic and nonpolar residues. The domain may be involved in membrane association.	113
407920	pfam18084	ARTD15_N	ARTD15 N-terminal domain. This is the N-terminal domain of poly ADP-ribose polymerase (PARP16 also known as ARTD15) present in Homo sapiens. ARTDs catalyse the formation of branched or unbranched chains of ADP-ribose units on protein side chains. The N-terminal domain of ARTD15 does not share any obvious sequence similarity with the regulatory domains of ARTD1-4. The N-terminal domain arrangement in both ARTD15 and ARTD1-3 suggests a regulatory role through different mechanisms.	82
407921	pfam18085	Mak_N_cap	Maltokinase N-terminal cap domain. Glycogen is a central energy storage molecule in bacteria and the metabolic pathways associated with its biosynthesis and degradation are crucial for maintaining cellular energy homeostasis. In mycobacteria, the GlgE pathway involves the combined action of trehalose synthase (TreS), maltokinase (Mak) and maltosyltransferase (GlgE). The N-terminal lobe can be divided into two subdomains: a cap N-terminal subdomain comprising the first 88 amino acid residues. This entry is for the cap N-terminal domain found in mycobacterial maltokinase (Mak), (EC:2.7.1.175). The N-terminal cap subdomain and the C-terminal lobe are predominantly acidic, the intermediate subdomain is enriched in positively charged residues. A structural search with only the first 88 amino acid residues of Mak, corresponding to the N-terminal cap subdomain of maltokinases, unveiled a resemblance with proteins displaying the cystatin fold and a remote similarity with the N-terminal domain of the serine/threonine protein kinase GCN2. Conservation of the cap subdomain in maltokinases (including the bifunctional TreS-Mak enzymes), in particular of the residues in the proximity of the P-loop, together with the potential flexibility of this region, are compatible with regulatory functions for this subdomain. Hence it is hypothesized that the N-terminal cap subdomain plays a central role in modulation of Mak enzymatic activity.	88
407922	pfam18086	PPIP5K2_N	Diphosphoinositol pentakisphosphate kinase 2 N-terminal domain. This is the N-terminal domain found in the Inositol hexakisphosphate and diphosphoinositol-pentakisphosphate kinase 2 (PPIP5K2), EC:2.7.4.24. Structure analysis of human PPIP5K2 indicates that this region forms an alpha-beta-alpha domain. PPIP5K2 is one of the mammalian PPIP5K isoforms responsible for synthesis of diphosphoinositol polyphosphates (inositol pyrophosphates; PP-InsPs), regulatory molecules that function at the interface of cell signaling and organismic homeostasis.	90
407923	pfam18087	RuBisCo_chap_C	Rubisco Assembly chaperone C-terminal domain. This is the C-terminal domain, also known as the beta domain, of Rubisco Assembly Chaperone protein (Raf1). Raf1 is necessary for rubisco to catalyze the rate-limiting step of carbon fixation through carboxylating the five-carbon sugar substrate ribulose-1,5-bisphosphate. The beta domain's primary function is dimerization, which is critical for Raf1 to achieve the necessary avidity for complex formation with RbcL (the large subunit of Rubisco) assembly intermediates. The beta domain is also involved, to a small extent, in binding to RbcL with use of the lustiness near the beta domain's conserved top surface.	138
407924	pfam18088	Glyco_H_20C_C	Glycoside Hydrolase 20C C-terminal domain. This is the C-terminal domain of Glycoside hydrolase 20 C (GH20C) present in S. pneumoniae. GH20C possesses the ability to hydrolyze the beta-linkages joining either N-acetylglucosamine or N-acetylgalactosamine to a wide variety of aglycon residues. The C-terminal domain, commonly known as Domain III, is important in dimerization as it forms the primary interface of the dimer. However, there is presently no evidence supporting dimerization as being necessary for catalysis. Domain III is unusual among structurally characterized GH20 enzymes but in GH20 enzymes possessing domain III, dimerization seems to be a conserved feature.	194
407925	pfam18089	DAPG_hydrolase	DAPG hydrolase PhiG domain. This domain is found in 2,4-diacetylphloroglucinol hydrolase PhiG present in Pseudomonas fluorescens. 2,4-diacetylphloroglucinol hydrolase is the gene product of PhiG that is responsible for cleaving toxic 2,4-diacetylphloroglucinol (DAPG). The small N-terminal region of the domain is involved in dimerization through hydrogen bonding of the dimer interface. The C-terminal catalytic region resembles the tetracenomycin aromatase/cyclase and has a Bet v1-like fold. DAPG PhiG is the first discovered hydrolase whose catalytic domain belongs to the Bet v1-like fold, rather than the classical alpha/beta-fold hydrolases.	221
375543	pfam18090	SoPB_HTH	Centromere-binding protein HTH domain. This domain is found in centromere-binding protein (SopB). SopB displays an intriguing range of DNA-binding properties essential for partition; it binds the centromere to form a partition complex, which recruits NTPase (SopA), and it also inhibits SopA polymerization. The domain has a helix-turn-helix (HTH) structure and is thought to be the specific DNA-binding domain mainly through residues from the recognition helix, alpha 3, of the HTH. The domain has shown structural similarity to the DNA-binding domains of P1 ParB and KorB.	75
407926	pfam18091	E3_UbLigase_RBR	E3 Ubiquitin Ligase RBR C-terminal domain. This is the C-terminal domain of HOIP present in Homo sapiens. HOIP synthesizes the linear ubiquitin chains that help control innate immunity and inflammation. This region has an RBR domain which catalyzes the transfer of ubiquitin onto a substrate.	90
407927	pfam18092	DraK_HK_N	DraK Histidine Kinase N-terminal domain. This is the N-terminal domain found in DraK Histidine Kinase (HK) present in Streptomyces coelicolor. Activation of the DraK HK leads to autophosphorylation of its kinase domain in the cytoplasm and subsequent transphosphorylation of DraR promoting blue-pigmented polyketide actinorhodin (ACT) production. The N-terminal domain is known as the sensor (or input) domain and undergoes a conformational change resulting in phosphorylation of a conserved histidine in its cytoplasmic kinase domain.	85
407928	pfam18093	Trm5_N	tRNA methyltransferase 5 N-terminal domain. This is the N-terminal domain of tRNA methyltransferase 5 (Trm5) present in Methanocaldococcus jannaschii. Trm5 catalyzes the methyl transfer from S-adenosyl methionine (AdoMet) to N1 of G37. This domain, also known as the D1 domain, contacts the tertiary core (elbow) region of the tRNA L shape in a ternary complex of the enzyme with tRNA and AdoMet.	47
407929	pfam18094	DNA_pol_B_N	DNA polymerase beta N-terminal domain. This is the N-terminal domain of DNA polymerase beta present in Homo sapiens. DNA polymerase beta is a repair enzyme that has a key role in the base excision repair of simple DNA lesions.	103
407930	pfam18095	PAS_12	UPF0242 C-terminal PAS-like domain. This domain is found at the C-terminus of proteins of the UPF0242 family. This domain is related to the PAS domain pfam13426.	153
407931	pfam18096	Thump_like	THUMP domain-like. This is a domain of unknown function found in bacteria.	77
407932	pfam18097	Vta1_C	Vta1 C-terminal domain. This is the C-terminal domain of Vta1 proteins pfam04652. Structural and functional analysis indicate that this C-terminal domain promotes the ATP-dependent double ring assembly of Vps4. Furthermore, it has been shown that it is necessary and sufficient for protein dimerization. Mutations in Lys-299 and Lys-302 completely abolished the ability of Vta1 to stimulate the ATPase activity of Vps4 while mutation in Lys-322 had no effect.	38
407933	pfam18098	RPN5_C	26S proteasome regulatory subunit RPN5 C-terminal domain. This is the C-terminal domain of the 26S proteasome regulatory subunit RPN5 proteins. This helical domain can be found adjacent to pfam01399. The 26S proteasome is the major ATP-dependent protease in eukaryotes. Three subcomplexes form this degradation machine: the lid, the base, and the core. The helices found at the C terminus of each lid subunit form a helical bundle that directs the ordered self-assembly of the lid subcomplex. This domain, which comprises the tail of RPN5, is important, along with the tail of Rpn9, for Rpn12 binding to the lid.	32
407934	pfam18099	DUF5010_C	DUF5010 C-terminal domain. This domain is found at the end of a family of putative glycosyl hydrolases pfam16402. This domain is likely to function as a carbohydrate binding domain due to its similarity with pfam03422.	112
407935	pfam18100	PDE4_UCR	Phosphodiesterase 4 upstream conserved regions (UCR). This is the upstream conserved region (UCR) found in Phosphodiesterase 4 (PDE4) enzymes. PDE4 is a contributor to intracellular signalling and an important drug target. The four members of this enzyme family (PDE4A to -D) are functional dimers in which each subunit contains two upstream conserved regions (UCR), UCR1 and -2, which precede the C-terminal catalytic domain pfam00233. Due to alternative promoters/start sites and variable mRNA splicing, transcription from the four PDE4 genes results in the expression of more than 25 different isoforms of PDE4. Each isoform has a unique N-terminal region that determines its specific subcellular localization by mediating interactions with scaffolding proteins. The isoforms are further classified into long, short, and supershort forms based on the presence or absence of two upstream conserved regions (UCRs, known as UCR1 and UCR2). Long splice variants contain both UCR1 and UCR2, short variants lack UCR1, and the supershort forms of PDE4 additionally lack part of UCR2. The extent to which UCRs are present determines critical functional differences between the isoforms. Phosphorylation by protein kinase A (PKA) at a conserved site on UCR1 activates all long PDE4 isoforms. Mutation and deletion studies have shown that long forms of PDE4 are dimeric, with key dimerization interactions mediated by UCR1 and UCR2, and that the C-terminal half of UCR2 could play a negative regulatory role.	119
407936	pfam18101	Pan3_PK	Pan3 Pseudokinase domain. This is a pseudokinase (PK) domain found in PAB-dependent poly(A)-specific ribonuclease subunit pan3. PAN3 proteins contain three prominent regions: an unstructured N-terminal region (N-term), a central PK domain, and a highly conserved C-terminal domain (C-term). The PAN3 PK domain has retained its ATP binding capacity, and this function is required for mRNA degradation in vivo. Analysis of Pan3 amino acid sequences shows that, despite retaining the general structural characteristics of protein kinases, the PK domain has substitutions in all the conserved motifs that are critical for kinase activity, such as in the catalytic VAIK and HRD motifs and in the Mg2+ binding DFG motif. However, the PAN3 PK domain has been shown to bind ATP. Furthermore, similar to other kinases, the ATP-binding site is located in the cleft between the N- and C-lobes of the kinase fold; however, the ATP-binding pocket is wider than that of typical kinases.	138
407937	pfam18102	DTC	Deltex C-terminal domain. This is the C-terminal domains found in members of the Deltex family of proteins which comprises five members (DTX1, 2, 3, 4, and 3L). This conserved C-terminal region of about 150 residues of the Deltex family, is preceded by a RING E3 ligase domain in four of the members. Crystal structure of the Deltex C-terminal (DTC) domain reveals a fold composed of a central beta-sheet lined with two long parallel alpha-helices.	133
375552	pfam18103	SH3_11	Retroviral integrase C-terminal SH3 domain. This is the carboxy-terminal domain (CTD) found in retroviral integrase, an essential retroviral enzyme that binds both termini of linear viral DNA and inserts them into a host cell chromosome. The CTD adopts an SH3-like fold. Each CTD makes contact with the phosphodiester backbone of both viral DNA molecules, essentially crosslinking the structure.	63
407938	pfam18104	Tudor_2	Jumonji domain-containing protein 2A Tudor domain. This is the tudor domain found in histone demethylase Jumonji domain-containing protein 2A (JMJD2A). Structure and function analysis indicate that this domain can recognize equally well two unrelated histone peptides, H3K4me3 and H4K20me3, by means of two very different binding mechanisms. JMJD2 also known as KDM4, is a conserved iron (II)-dependent jumonji-domain demethylase subfamily that is essential during development. Vertebrate KDM4A-C proteins contain a conserved double tudor domain (DTD).	35
407939	pfam18105	PGM1_C	PGM1 C-terminal domain. This is the C-terminal domain found in PGM1 present in Streptomyces cirratus. PGM1 is a gene product that links precursor peptides together to form the antibiotic Pheganomycin.	53
407940	pfam18106	Rol_Rep_N	Rolling Circle replication initiation protein N-terminal domain. This is the N-terminal domain of the Rolling Circle Replication Initiator Protein (Rep) from Geobacillus stearothermophilus. This protein acts on plasmids from family pT181 to initiate replication, recruit a helicase to the site of initiation and terminate replication after DNA synthesis. These proteins possess a unique active site and a catalytically essential metal ion is bound in a distinct manner from other rolling circle Reps.	91
407941	pfam18107	HTH_ABP1_N	Fission yeast centromere protein N-terminal domain. This domain is found in the fission yeast centromere protein (Abp1) in species such as Schizosaccharomyces pombe. The domain, referred to as Domain 1, is DNA-binding and makes up half of the N-terminal region.	61
407942	pfam18108	QSOX_Trx1	QSOX Trx-like domain. This domain is found in Quiescin sulfhydryl oxidase (QSOX), an oxidoreductase present in Homo sapiens capable of both generating and transferring disulfide modules within a single polypeptide. The domain is thioredoxin-like, hence referred to as Trx1 domain. Trx1 domain has a di-cysteine motif (Cys-X-X-Cys) which is related to the redox-active domains of protein disulfide isomerase. The Trx1 domain is responsible for intramolecular disulfide transfer through the di-cysteine motif.	108
407943	pfam18109	Fer4_24	Ferredoxin I 4Fe-4S cluster domain. This domain is found in Ferredoxin I (FdI), an iron-sulfur ([Fe-S]) cluster-containing protein, present in species such as Azotobacter vinelandii. [Fe-S] proteins participate in electron transfer, catalytic, regulatory, and structural functions. The FdI cluster exhibits a pH-dependent reduction potential and reversible protonation in the reduced state.	35
407944	pfam18110	BRCC36_C	BRCC36 C-terminal helical domain. This is the C-terminal domain of BRCC36, a Zn2+ dependent deubiquitinating enzyme, present in Camponotus floridanus. BRCC36 hydrolyzes lysine linked ubiquitin chains as part of macromolecular complexes that participate in either interferon signalling or DNA-damage recognition. The domain consists of 2 non canonical helices. The domain interacts hydrophobically with helices alpha 4 and alpha 5 of KIAA0157 in the form of a coiled coil helical bundle. This interaction helps establish the association of BRCC36 with KIAA0157, a pseudo-DUB MPN- protein that is essential for the activity of BRCC36.	81
407945	pfam18111	RPGR1_C	Retinitis pigmentosa G-protein regulator interacting C-terminal. This is the C-terminal domain of retinitis pigmentosa G-protein regulator (RPGR) interacting protein-1 present in Homo sapiens. A mutation in RPGR interacting protein-1 can be observed in the eye disease Leber congenital amaurosis. The domain is commonly known as the RPGR-interacting domain (RID) and is thought to have a C2-like fold.	166
407946	pfam18112	Zn-C2H2_12	Autophagy receptor zinc finger-C2H2 domain. This domain is found in calcium-binding and coiled-coil domain 2/NDP52 (CALCOCO2/NDP52) found in Homo sapiens. CALCOCO2/NDP52 is a ubiquitin-binding autophagy receptor involved in the selective autophagic degradation of invading pathogens. This domain is a typical C2H2-type zinc finger which specifically recognizes mono-ubiquitin or poly-ubiquitin chains. The overall ubiquitin-binding mode utilizes the C-terminal alpha-helix to interact with the solvent-exposed surface of the central beta-sheet of ubiquitin, similar to that observed in the RABGEF1/Rabex-5 or POLN/Pol-eta zinc finger.	27
407947	pfam18113	Rbx_binding	Rubredoxin binding C-terminal domain. This is the C-terminal domain found in rubredoxin reductase (RdxR) present in Pseudomonas aeruginosa. RdxR are important in prokaryotes as they allow for the metabolism of inert n-alkanes and RdxR is also crucial for archaea and anaerobic bacteria in the response to oxidative stress. This domain is known to recognize and bind to rubredoxin.	71
407948	pfam18114	Suv3_N	Suv3 helical N-terminal domain. This is the N-terminal domain of Suv3 present in Homo sapiens. Suv3 is an NTP-dependent RNA/DNA helicase that is necessary for the degradation of mature mtRNAs. Suv3 has been found to interact in vitro with polynucleotide phosphorylase.	117
407949	pfam18115	Tudor_3	DNA repair protein Crb2 Tudor domain. This is the tudor domain found in DNA repair protein crb2. Structural and functional studies of Crb2 and its mammalian homologue 53BP1 indicate that the conserved tandem-Tudor domain of 53BP1 and Crb2 preferentially interacts with H4K20me2, though it also binds to H4K20me1. Furthermore, despite low amino acid sequence similarity, Crb2 is structurally related to 53BP1 in having two tudor domains and a conserved dimethyllysine-binding pocket, and that, like 53BP1, it directly binds H4-K20me2.	49
407950	pfam18116	SNX17_FERM_C	Sorting Nexin 17 FERM C-terminal domain. This is the C-terminal domain of sorting nexin 17 (SNX17) present in Homo sapiens. SNX17 localizes to early endosomes where it directly binds NPX(Y/F) motifs in the target receptors to mediate their rates of endocytic internalization, recycling, or degradation. The domain is known as terminal band 4.1/ezrin/radixin/moesin (FERM) domain. The FERM domain binds directly to the common motif, NPX(Y/F), in the cytoplasmic region of its target proteins.	109
407951	pfam18117	EDS1_EP	Enhanced disease susceptibility 1 protein EP domain. This is the C-terminal domain found in the enhanced disease susceptibility 1 (EDS1) protein present in Arabidopsis thaliana. EDS1 controls the post-infection basal resistance layer. This highly conserved domain is known as the EP domain and its interface consists of hydrophobic interactions, salt bridges, and an extensive hydrogen bonding network.	214
407952	pfam18118	PRC2_HTH_1	Polycomb repressive complex 2 tri-helical domain. This domain can be found in the Polycomb repressive complex 2 (PRC2) present in Homo sapiens. Polycomb complexes maintain repressive chromatin states by silencing gene expression. PRC2 does this by methylating lysine 27 of histone H3. This domain makes up part of the N-lobe which is involved in regulation.	101
407953	pfam18119	RIG-I_C	RIG-I receptor C-terminal domain. This is the C-terminal domain of Innate Immune Pattern-Recognition Receptor RIG-I present in Homo sapiens. RIG-I is a key cytosolic pattern-recognition receptor of the vertebrate innate immune system that forms the first line of defense against RNA viral infection. RNA binding to RIG-I is mediated both by the C-terminal domain and by the helicase domain. The C-terminal domain specifically binds the 5'triphosphate end with a 10-fold higher affinity compared to 5'OH-dsRNA.	139
407954	pfam18120	DUF5597	Domain of unknown function (DUF5597). This is the C-terminal domain of xyloglucan utilization locus (XyGUL) present in Cellvibrio japonicus. XyGUL is required for xyloglucan utilization. It is also the C-terminal domain of PF02449 and PF01301.	130
407955	pfam18121	TFA2_Winged_2	TFA2 Winged helix domain 2. This is the second winged helix domain found in TFA2 proteins present in Saccharomyces cerevisiae. In form 2, the domain interacts directly with Rad3, a DNA helicase.	61
407956	pfam18122	APC1_C	Anaphase-promoting complex sub unit 1 C-terminal domain. This is the C-terminal domain of chain A, also known as sub-unit 1, found in anaphase-promoting complex (APC/C) present in Homo sapiens. APC/C is a ubiquitin ligase that controls chromosome segregation and mitotic exit.	158
407957	pfam18123	FGFR3_TM	Fibroblast growth factor receptor 3 transmembrane domain. This transmembrane (TM) domain is found in Fibroblast growth factor receptor 3 (FGFR3) present in Homo sapiens. Fibroblast growth factors transduce diverse biochemical signals by lateral dimerization in the plasma membrane, followed by receptor auto-phosphorylation and stimulation of downstream signalling cascades. In FGFR3, TM domains associate in a parallel fashion in a left-handed dimer via an extended heptad motif. The N-terminal parts of the TM dimer most likely act as anchors positioning the TM domain in the detergent head group region. The charged residues flanking the TM helix on both termini apparently have a profound destabilizing effect on the FGFR3 dimer, but in the absence of ligand, the TM domain interactions stabilize the FGFR3 dimer.	31
407958	pfam18124	Kindlin_2_N	Kindlin-2 N-terminal domain. This is the N-terminal domain (K2-N) of Kindlin-2 protein present in Homo sapiens. Kindlin-2 is a regulator of heterodimeric integrin adhesion receptors that promotes integrin activation. Activation depends on binding of the N-terminal domain to the integrin beta cytoplasmic tail (CT), which disrupts the receptor's association with alpha-CT and triggers the conformational transitions in the receptor. K2-N contains a conserved positively charged surface that binds to membrane enriched with negatively charged phosphatidylinositol-(4,5)-bisphosphate (PIP2). K2-N is also very similar to the homologous kindlin-1 F0.	89
375573	pfam18125	RlmM_FDX	RlmM ferredoxin-like domain. This domain is found in Ribosomal methyltransferase RlmM (YdgE) present in E. coli. RlmM catalyzes the S-adenosyl methionine (AdoMet)-dependent 2'O methylation of C2498 in 23S ribosomal RNA. The domain is ferredoxin-like and forms part of the THUMP domain which binds RNA. THUMP domains typically have low sequence similarity.	71
407959	pfam18126	Mitoc_mL59	Mitochondrial ribosomal protein mL59. This domain is a protein found in mitochondrial ribosome 54S large sub unit present in species such as Saccharomyces cerevisiae. The domain used to be referred to as MRP25 but is now called mL59 protein. mL59 is known to partially stabilize a change in the rRNA path prior to helix 82-ES1 ultimately leading to the stabilization of the phosphate backbone of the tRNA acceptor stem. It is worth noting that the domain is encoded in the nucleus and imported from the cytoplasm.	129
407960	pfam18127	DUF5598	Domain of unknown function (DUF5598). This is the N-terminal domain found in Nicotinamide phosphoribosyltransferase (NAMPT) present in Homo sapiens. NAMPT captures nicotinamide (NAM) and replenishes the nicotinamide adenine dinucleotide (NAD+) pool during ADP-ribosylation and transferase reactions.	98
375576	pfam18128	HydF_dimer	Hydrogen maturase F dimerization domain. This domain is found in Hydrogen maturase F (HydF) present in Thermotoga neapolitana. HydF is a GTPase, containing an FeS cluster-binding motif, that is able to activate HydA so that HydA can drive the reversible reduction of protons to molecular H2. This domain, referred to as domain II, is responsible for HydF dimerization through the formation of a continuous beta-sheet comprising eight beta-strands from two monomers.	99
407961	pfam18129	SH3_12	Xrn1 SH3-like domain. This is the C-terminal SH3-like domain which can be found in the exoribonuclease Xrn1. Xrn1 is a 175 kDa processive exoribonuclease that is conserved from yeast to mammals which targets cytoplasmic RNA substrates marked by a 5' monophosphate for processive 5'-to-3' degradation. The Sh3-like domain in Xrn1 lacks the canonical SH3 residues normally involved in binding proline-rich peptide motifs and instead engages in non-canonical interactions with the catalytic domain. Additionally it is essential in maintaining the structural integrity of Xrn1, since partial truncation of this domain in yeast Xrn1 yields an inactive protein. There is a long loop projecting from the SH3-like domain that contacts the PAZ/Tudor domain, occluding the functional surface that binds RNA or peptide motifs containing methylated arginines, respectively, in canonical PAZ and Tudor domain.	68
407962	pfam18130	ATPgrasp_N	ATP-grasp N-terminal domain. This is the N-terminal domain found in BL00235 present in Bacillus licheniformis. BL00235 is an ATP-grasp superfamily protein that catalyzes the formation of an alpha-peptide bond between two L-amino acids in an ATP-dependent manner. BL00235 has a highly restricted substrate specificity: the N-terminal substrate is confined to L-methionine and L-leucine, while the C-terminal substrates include small residues such as L-alanine, L-serine, L-threonine and L-cysteine.	81
407963	pfam18131	KN17_SH3	KN17 SH3-like C-terminal domain. This is the C-terminal domain of human KIN17 protein. Overexpression of KIN17 modifies the nuclear morphology and inhibits S-phase progression, thus blocking cell growth as part of the response to genotoxics. The C-terminal domain binds to RNA and is generally well conserved. The domain has structural similarity with various SH3-like domains, although it lacks similarities in both primary sequence and charge distribution.	53
407964	pfam18132	Tyosinase_C	Tyrosinase C-terminal domain. This is the C-terminal domain of tyrosinase present in Aspergillus oryzae. Tyrosinase is a dinuclear copper monooxygenase/oxidase that plays a crucial role in melanin pigment biosynthesis. The C-terminal domain is referred to as the shielding domain as it prohibits substrate access to the enzyme-active site and blocks the oxidase/oxygenase activity to avoid undesirable intracellular reactions of highly reactive quinonoid products. This means the domain may play an important role in regulating the enzyme activity. Two of the three cysteines (Cys522 and Cys525) that play a significant role in the copper incorporation process belong to the C-terminal domain.	122
407965	pfam18133	HydF_tetramer	Hydrogen maturase F tetramerization domain. This is the C-terminal domain found in Hydrogen maturase F (HydF) present in Thermotoga neapolitana. HydF is a GTPase, containing an FeS cluster-binding motif, that is able to activate HydA so that HydA can drive the reversible reduction of protons to molecular H2. This domain is known as domain III, and is primarily responsible for homotetramer formation. Notable interactions between the two FeS cluster-binding domains are those between beta-2 strands, the initial part of the long loop that connects strand beta-2 to strand beta-3, and the loop that connects strand beta-1 to helix alpha-3. There are three highly conserved cysteine residues (Cys-302, Cys-353, and Cys-356) that represent the FeS cluster-binding site, which forms a superficial pocket.	118
407966	pfam18134	AGS_C	Adenylyl/Guanylyl and SMODS C-terminal sensor domain. Predicted to function as a sensor domain, sensing nucleotides or nucleotide derivatives generated by bacterial adenylyl/guanylyl cyclase domains. The sensing of ligands by AGS-C is predicted to activate effectors deployed by a class of conflict systems which are reliant on the production and sensing of the nucleotide second messengers.	129
407967	pfam18135	Type_ISP_C	Type ISP C-terminal specificity domain. This is the C-terminal domain of Type ISP restriction-modification enzyme LLaBIII present in Lactococcus lactis subsp. cremoris. Type ISP restriction-modification (RM) enzymes provide a potent defence against infection by foreign and bacteriophage DNA. This domain interacts extensively with DNA and is known as the target recognition domain (TRD). TRD works by recognising 6/7 base pairs of asymmetric sequence.	342
407968	pfam18136	DNApol_Exo	DNA mitochondrial polymerase exonuclease domain. This domain belongs to human mitochondrial DNA polymerase (Pol-gamma). Pol-gamma has a catalytic subunit, Pol gamma-A, which possesses both polymerase and proofreading exonuclease activities and an accessory subunit, Pol gamma-B, which accelerates polymerization rate and suppresses exonuclease activity. This domain is the exonuclease domain of the catalytic subunit, Pol gamma-A.	282
407969	pfam18137	ORC_WH_C	Origin recognition complex winged helix C-terminal. This is the C-terminal winged-helix (WH) DNA-binding domain of the origin recognition complex present in Drosophila melanogaster. The WH domain is responsible for recognizing origin sequences.	131
407970	pfam18138	bacHORMA_1	Bacterial HORMA domain family 1. Family of bacterial HORMA domains found in conserved genome contexts with Pch2/TRIP13 P-loop NTPases. Acts as a 'third component' in broad class of conflict systems reliant on the production of second messenger nucleotide or nucleotide derivatives. Together with Pch2/TRIP13, could act as co-effectors or in regulation of other effectors of the systems.	169
407971	pfam18139	LSDAT_euk	SLOG in TRPM. Family in the SLOG superfamily, found in several eukaryotic channels including diverse ciliate channels and the TRPM class of animal ion channels. Positioned near the N-terminus of all TRPM channels, it is predicted to play a regulatory role for the channel in potentially recognizing a universal nucleotide or nucleotide-derived ligand.	266
407972	pfam18140	PCC_BT	Propionyl-coenzyme A carboxylase BT domain. This domain is found in Propionyl-coenzyme A carboxylase (PCC), present in Roseobacter denitrificans. PCC is a mitochondrial biotin-dependent enzyme that is essential for the catabolism of certain amino acids, cholesterol, and fatty acids with an odd number of carbon atoms. Since this domain mediates biotin carboxylase-carboxyltransferase interactions it is referred to as the BT domain. The BT domain is located between biotin carboxylase and the biotin carboxyl carrier protein domains. The BT domain shares some structural similarity with the pyruvate carboxylase tetramerization domain of pyruvate carboxylase.	126
407973	pfam18141	DUF5599	Domain of unknown function (DUF5599). This domain is found in UPF 1 present in Homo sapiens. UPF 1 is involved in the initiation of nonsense-mediated decay, the process of degradation of transcripts containing premature termination codons in order to control the quality of mRNA. The domain is referred to as 1B.	93
407974	pfam18142	SLATT_fungal	SMODS and SLOG-associating 2TM effector domain. The SLATT domain contains two transmembrane helices. SLATT domains are generally predicted to function in bacteria as pore-forming effectors in a class of conflict systems which are reliant on the production of second messenger nucleotide or nucleotide derivatives. SLATT domains are predicted to initiate cell suicide responses upon their activation. The role of this fungal family is not yet understood, although the expansion of the family in many fungal lineages points to a potential role in conflict.	121
407975	pfam18143	HAD_SAK_2	HAD domain in Swiss Army Knife RNA repair proteins. Family of HAD domain phosphoesterases observed in large eukaryotic proteins with predicted role in RNA repair, the so-called 'Swiss Army Knife' repair proteins. May be involved in phosphate group removal during RNA re-ligation.	143
407976	pfam18144	SMODS	Second Messenger Oligonucleotide or Dinucleotide Synthetase domain. Nucleotide synthetase enzyme of the DNA polymerase beta superfamily. Experimental studies have demonstrated cGAMP synthetase activity in the Vibrio cholerae DncV protein, a member of the SMODS family. The diversity inherent to the SMODS family suggests members of the family could generate a range of nucleotides, cyclic and/or linear. The nucleotide second messengers generated by the SMODS domains are predicted to activate effectors in a class of conflict systems reliant on the production and sensing of the nucleotide second messengers.	164
407977	pfam18145	SAVED	SMODS-associated and fused to various effectors sensor domain. Predicted to function as a sensor domain, sensing nucleotides or nucleotide derivatives generated by SMODS and other nucleotide synthetase domains. The sensing of ligands by SAVED is predicted to activate effectors deployed by a class of conflict systems which are reliant on the production and sensing of the nucleotide second messengers.	189
407978	pfam18146	CinA_KH	Damage-inducible protein CinA KH domain. This domain is found in competence-induced protein A (CinA) present in Thermus thermophilus. CinA is important in the horizontal transfer of genes via competence and may also participate in the pyridine nucleotide cycle, which recycles products formed by non-redox uses of NAD. This domain has a KH-type fold and contains the absolutely conserved Glu-187, which stabilizes the binding of Mg2+ and hence polarizes the P=O bond for hydrolysis. A major feature of the T. thermophilus CinA structure is the asymmetry in the dimer, which is caused by contact between a KH-type domain on the opposite chain and the bound ADP-ribose. This has the effect of closing the active site, allowing additional recognition of ADP-ribose by residues from the KH-type domain.	73
407979	pfam18147	Suv3_C_1	Suv3 C-terminal domain 1. This domain is found in Suv3 present in Homo sapiens. Suv3 is an NTP-dependent RNA/DNA helicase that is necessary for the degradation of mature mtRNAs. Suv3 has been found to interact in vitro with polynucleotide phosphorylase. This domain makes up part of the C-terminal domain.	41
375589	pfam18148	RGS_DHEX	Regulator of G-protein signalling DHEX domain. This domain is found in RGS9 (Class C) regulator of G-protein signalling (RGS) protein present in Mus musculus. RGS proteins attenuate heterotrimeric G-protein signalling by enhancing the intrinsic GTPase activity of G-alpha subunits and are vital for proper signal transduction kinetics. The domain is referred to as DEP helical extension (DHEX) because it is located next to N-terminal Dishevelled/Egl-10/Pleckstrin homology (DEP) domain. Both the DEP and DHEX domains are necessary, but not sufficient, to bind anchoring proteins such as RGS9 anchor protein. DHEX has no close structural homologs.	100
407980	pfam18149	Helicase_PWI	N-terminal helicase PWI domain. This domain is found in spliceosomal RNA helicase Brr2. Brr2 is required for the assembly of a catalytically active spliceosome on a messenger RNA precursor. The domain is found in the N-terminal region and is non-canonically PWI-like. The PWI-like domain is thought to be involved in protein-protein interactions.	111
407981	pfam18150	DUF5600	Domain of unknown function (DUF5600). This domain can be found in EH-domain-containing ATPase 2 (EHD2) present in Mus musculus. The domain is helical in nature and has extensive contacts with the G-domain.	107
407982	pfam18151	DUF5601	Domain of unknown function (DUF5601). This domain is found in the catalytic core RABEX-5 present in Homo sapiens. RABEX, also known as Rab GTPase exchange factors, regulate endocytic trafficking through activation of the Rab families RAB5, RAB21 and RAB22. The domain is helical in nature.	65
407983	pfam18152	DAHP_snth_FXD	DAHP synthase ferredoxin-like domain. This domain is found in 3-Deoxy-d-arabino-heptulosonate-7-phosphate synthase (DAHPS) present in Thermotoga maritima. DAHPS catalyzes the first reaction of the aromatic biosynthetic pathway in bacteria, fungi, and plants, the condensation of PEP and E4P with the formation of DAHP. The domain is ferredoxin-like and is thought to play a critical role in feedback regulation of the enzyme.	67
407984	pfam18153	S_2TMBeta	SMODS-associating 2TM, beta-strand rich effector domain. Predicted sensor/effector coupled domain which occurs in conserved genome contexts with the SMODS nucleotide synthetase. In addition to the predicted pore-forming 2TM region, the domain contains seven predicted beta-strands, suggestive of a lipocalin-like beta-barrel structure which could act as the sensor which activates the pore-forming effector response.	180
407985	pfam18154	pPIWI_RE_REase	REase associating with pPIWI_RE. Restriction endonuclease (REase) domain family, found in a conserved three-gene island that also contains a DinG-type helicase and the pPIWI_RE module. This three gene island is predicted to form a conflict system which targets R-loop formation of invasive plasmids during plasmid replication.	118
407986	pfam18155	pPIWI_RE_Z	pPIWI RE three-gene island domain Z. Poorly-understood domain observed N-terminal to DinG-type helicase, which is part of a conserved three-gene island also containing a REase domain and the pPIWI_RE module. This three gene island is predicted to form a conflict system which targets R-loop formation of invasive plasmids during plasmid replication.	166
407987	pfam18156	pPIWI_RE_Y	pPIWI_RE three-gene island domain Y. Poorly-understood domain observed N-terminal to restriction endonuclease (REase) domain, which is part of a conserved three-gene island also containing a DinG-type helicase and the pPIWI_RE module. This three gene island is predicted to form a conflict system which targets R-loop formation of invasive plasmids during plasmid replication.	144
407988	pfam18157	MID_pPIWI_RE	MID domain of pPIWI_RE. MID domain of the pPIWI_RE PIWI/Argonaute module. pPIWI_RE is found in a conserved three-gene island that also contains a DinG-type helicase and an REase nuclease. This three gene island is predicted to form a conflict system which targets R-loop formation of invasive plasmids during plasmid replication.	142
407989	pfam18158	AidB_N	Adaptive response protein AidB N-terminal domain. This is the N-terminal domain of Adaptive response protein AidB present in E. coli. AidB is upregulated in response to small doses of DNA-methylating agents and initiates a response that mitigates the mutagenic and cytotoxic effects of DNA methylation. Tetramer formation is thought to be carried out by the N-terminal domain.	156
407990	pfam18159	S_4TM	SMODS-associating 4TM effector domain. Predicted pore-forming effector domain found in conserved genome contexts with diverse nucleotide synthetases including the SMODS synthetases. Predicted to function as a pore-forming effector in a class of conflict systems reliant on the production of second messenger nucleotide or nucleotide derivatives. S-4TM domains are predicted to initiate cell suicide responses upon their activation.	291
407991	pfam18160	SLATT_5	SMODS and SLOG-associating 2TM effector domain family 5. The SLATT domain contains two transmembrane helices. SLATT domains are generally predicted to function as pore-forming effectors in a class of conflict systems which are reliant on the production of second messenger nucleotide or nucleotide derivatives. SLATT domains are predicted to initiate cell suicide responses upon their activation. This SLATT family contains an additional C-terminal alpha-helix, and strictly associates with a reverse transcriptase domain, part of a predicted retroelement with diversity-generating potential.	190
407992	pfam18161	ISP1_C	ISP1 C-terminal. This is the C-terminal domain of ISP1 protein, which plays a role in asexual daughter cell formation, such as in Toxoplasma gondii. The domain consists of a seven-stranded antiparallel beta-sandwich bordered on one end by an inter-strand loop (open end) and capped at the other end by an amphipathic C-terminal helix (closed end). The domain adopts a pleckstrin homology (PH) fold, despite having negligible sequence similarity. PH domains are often found in proteins that support protein-lipid interactions and play a role in mediating membrane localization through IP binding. However, the phospholipid-binding properties of PH domains are not conserved in TgISP1. Unlike PH domains, ISP1 is cysteine rich. The cysteine-rich nature of the ISPs and the number of surface-exposed cysteines may result in redox instability and may also facilitate higher order multimerization. A disulfide bond between beta 2 and beta 3 is likely a structural feature of ISP1, as both cysteines appear broadly conserved.	107
407993	pfam18162	Arc_C	Arc C-lobe. This is the C-terminal domain of Arc protein present in Rattus norvegicus. The Arc protein modulates the trafficking of AMPA-type glutamate receptors. This domain's tertiary structure is similar to the capsid domain of HIV gag protein. The domain is thought to have evolved from the capsid domain of Ty3/Gypsy retrotransposon.	83
407994	pfam18163	LD_cluster2	SLOG cluster2. Family in the SLOG superfamily, observed associating with distinct effector domains including the patatin lipase or a protein containing one enzymatically active and one inactive copy of the TIR domain.	262
407995	pfam18164	GNAT_C	GNAT-like C-terminal domain. This is the C-terminal domain found in N-acyltransferase (NAT) proteins present in Actinoplanes teichomyceticus. In this organism, NAT proteins are responsible for N-acylation in the synthesis of the antibiotic teicoplanin. The C-terminal domain undergoes a substantial conformational change upon binding to Acyl-CoA. The C-terminal domain is considered Gcn5-related N-acetyltransferase like (GNAT-like) but differs from the canonical GNAT fold in that it lacks the first beta strand and has an additional four alpha helices.	141
407996	pfam18165	pP_pnuc_1	Predicted pPIWI-associating nuclease. Predicted nuclease effector domain associating with prokaryotic PIWI-centered conflict systems.	135
407997	pfam18166	pP_pnuc_2	Predicted pPIWI-associating nuclease. Predicted nuclease effector domain associating with prokaryotic PIWI-centered conflict systems.	122
407998	pfam18167	Sa_NUDIX	SMODS-associated NUDIX domain. NUDIX domain with distinctive features in the substrate-interacting region observed associating with SMODS domain synthetases. Predicted to cleave nucleotide diphosphate bonds, potentially to regulate flux through the pore formed by fused 2TM module.	199
407999	pfam18168	PPL5	Prim-pol family 5. Family of prim-pol enzymes currently known only in kinetoplastids.	321
408000	pfam18169	SLATT_6	SMODS and SLOG-associating 2TM effector domain 6. The SLATT domain contains two transmembrane helices. SLATT domains are generally predicted to function as pore-forming effectors in a class of conflict systems which are reliant on the production of second messenger nucleotide or nucleotide derivatives. SLATT domains are predicted to initiate cell suicide responses upon their activation. This SLATT family associates with a SMODS nucleotide synthetase domain fused to the predicted AGS-C sensor domain. It is sometimes further coupled to R-M systems.	176
408001	pfam18171	LSDAT_prok	SLOG in TRPM, prokaryote. Family in the SLOG superfamily, fused to or operonically associating with SLATT domain in diverse prokaryotes. Predicted to function as ligand sensor in conjunction with the SLATT transmembrane domain.	194
408002	pfam18172	LepB_GAP_N	LepB GAP domain N-terminal subdomain. This is a subdomain of a Rab GTPase-activating protein (GAP) effector from Legionella pneumophila. This GAP modulates Rab enzymes that act as molecular switches in regulating vesicular transport in eukaryotic cells. This N-terminal subdomain belongs to the GAP domain of the protein. The catalytic arginine finger (Arg444) is located within this subdomain and it is the only arginine residue required for GAP activity.	189
375605	pfam18173	bacHORMA_2	Bacterial HORMA domain 2. Family of bacterial HORMA domains found in conserved genome contexts with Pch2/TRIP13 P-loop NTPases. Acts as a 'third component' in broad class of conflict systems reliant on the production of second messenger nucleotide or nucleotide derivatives. Together with Pch2/TRIP13, could act as co-effectors or in regulation of other effectors of the systems.	166
408003	pfam18174	HU-CCDC81_bac_1	CCDC81-like prokaryotic HU domain 1. First of two HU domains found in bacterial proteins typically fused to a C-terminal transmembrane helix and an extracellular peptidoglycan-binding domain. The HU domains in many of these proteins are predicted to function in tethering the nucleoid to the cell envelope.	59
408004	pfam18175	HU-CCDC81_bac_2	CCDC81-like prokaryotic HU domain 2. Second of two HU domains found in bacterial proteins typically fused to a C-terminal transmembrane helix and an extracellular peptidoglycan-binding domain. The HU domains in many of these proteins are predicted to function in tethering the nucleoid to the cell envelope.	70
408005	pfam18176	KptA_kDCL	KptA in kinetoplastid DICER domain. KptA ADP-ribosyltransferase domain observed in kinetoplastid DICER-like (DCL) enzymes. Appears to have lost most residues required for catalyzing phospho-transfer to NAD; however, several positively-charged residues implicated in RNA end recognition remain well-conserved.	158
408006	pfam18177	La_HTH_kDCL	La HTH in kinetoplastid DICER domain. Winged HTH domain family observed in kinetoplastid DICER-like (DCL) enzymes, situated N-terminal to the KptA_kDCL domain pfam18176.	89
408007	pfam18178	TPALS	TIR- and PNP-associating SLOG family. Family in the SLOG superfamily associating with predicted TIR- and PNP-like effector domains. Members of this family are predicted to function as sensors of nucleotide or nucleotide-derived ligands, which are likely processed or modified by the associating effectors. Often co-occur genomically with the bacterial HORMA and Pch2/TRIP13 domains.	232
408008	pfam18179	SUa-2TM	SMODS- and Ubiquitin system-associated 2TM effector domain. Predicted pore-forming effector domain observed exclusively in conserved genome contexts with the SMODS nucleotide synthetases and the bacterial Ubiquitin conjugation systems.	278
408009	pfam18180	LD_cluster3	SLOG cluster3 family. Family in the SLOG superfamily, observed to associate with a predicted effector protein containing one enzymatically active and one inactive copy of the TIR domain.	170
408010	pfam18181	SLATT_1	SMODS and SLOG-associating 2TM effector domain 1. The SLATT domain contains two transmembrane helices. SLATT domains are generally predicted to function as pore-forming effectors in a class of conflict systems which are reliant on the production of second messenger nucleotide or nucleotide derivatives. SLATT domains are predicted to initiate cell suicide responses upon their activation. This SLATT family is often C-terminally fused to the SLATT_3 family, and is typically operonically linked to either inactive TIR domains or SLOG domains which could act as regulators of the SLATT channels. In relatively rare instances, it is genomically linked as a standalone domain to the RelA/SpoT nucleotide synthetase and the predicted NA37/YejK sensor domain.	122
408011	pfam18182	mCpol	minimal CRISPR polymerase domain. Minimal version of the CRISPR polymerase domain. Predicted to generate cyclic nucleotides, potentially sensed by CARF domains which in turn activate various effector domains, including HEPN RNases. CARF sensors and effectors are found in conserved genome contexts. Part of a broader class of conflict systems reliant on the production of second messenger nucleotide or nucleotide derivatives. Implicates CRISPR polymerase of the Type III CRISPR/Cas systems in a nucleotide synthetase functional role.	114
408012	pfam18183	SLATT_2	SMODS and SLOG-associating 2TM effector domain 2. The SLATT domain contains two transmembrane helices. SLATT domains are generally predicted to function as pore-forming effectors in a class of conflict systems which are reliant on the production of second messenger nucleotide or nucleotide derivatives. SLATT domains are predicted to initiate cell suicide responses upon their activation. This SLATT family is the only prokaryotic SLATT family to exist as a standalone domain, with no as-yet discernable genome associations.	192
408013	pfam18184	SLATT_3	SMODS and SLOG-associating 2TM effector domain 3. The SLATT domain contains two transmembrane helices. SLATT domains are generally predicted to function as pore-forming effectors in a class of conflict systems which are reliant on the production of second messenger nucleotide or nucleotide derivatives. SLATT domains are predicted to initiate cell suicide responses upon their activation. This SLATT family is always N-terminally fused to the SLATT_1 family, and is typically operonically linked to either inactive TIR domains or SLOG domains which could act as regulators of the SLATT channels.	156
408014	pfam18185	STALD	Sir2- and TIR-associating SLOG family. Family in the SLOG superfamily, associating with predicted Sir2- and TIR-like effector domains. Members of this family are predicted to function as sensors of nucleotide or nucleotide-derived ligands, which are likely processed or modified by the associating effectors.	207
408015	pfam18186	SLATT_4	SMODS and SLOG-associating 2TM effector domain family 4. The SLATT domain contains two transmembrane helices. SLATT domains are generally predicted to function as pore-forming effectors in a class of conflict systems which are reliant on the production of second messenger nucleotide or nucleotide derivatives. SLATT domains are predicted to initiate cell suicide responses upon their activation. This SLATT family is often coupled to the SMODS nucleotide synthetase and is sometimes further embedded in other conflict systems like CRISPR/Cas or R-M systems.	165
408016	pfam18187	RIF5_SNase_1	TbRIF5 SNase domain 1. Staphylococcus nuclease (SNase) domain family found in the Trypanosoma brucei TbRIF5 protein, which could contribute to the processing of dsRNA targets.	166
408017	pfam18188	PPL4	Prim-pol 4. Family of prim-pol enzymes with predicted roles in RNA processing and repair, potentially acting independently of a DNA template. One member of the family is fused to kinetoplastid DICER-like protein 1 (DCL1).	159
408018	pfam18189	RIF5_SNase_2	TbRIF5 SNase domain 2. Staphylococcus nuclease (SNase) domain family found in the Trypanosoma brucei TbRIF5 protein, which could contribute to the processing of dsRNA targets.	179
375619	pfam18190	Plk4_PB1	Polo-like Kinase 4 Polo Box 1. This domain is found in Polo-like kinase 4 (Plk4) present in Drosophila melanogaster. Plk4 is a conserved component in the duplication pathway of centrioles which is needed to prevent chromosomal instability. The domain is Polo Box 1 (PB1) and has a pseudo-symmetric dimerization interface across PB1-PB1.	107
408019	pfam18191	PnpCD_PnpD_N	Hydroquinone 1,2-dioxygenase large subunit N-terminal. This is the N-terminal domain of the alpha subunit, known as PnpD, of Hydroquinone 1,2-dioxygenase (PnpCD) present in Pseudomonas sp. strain WBC-3. PnpCD is the key enzyme in the degradation pathway of pollutant para-nitrophenol (PNP). The N-terminal domain residues Trp-76 and Phe-79 are indispensable in the formation of the active site pocket. The N-terminal domain also plays a vital role in formation of the heterotetrameric structure. Structural homologs of the N-terminal domain exhibit the nature to bind nucleic acids but due to the steric effect of the C-terminal domain, this N-terminal domain cannot bind nucleic acids.	151
408020	pfam18192	DNTTIP1_dimer	DNTTIP1 dimerisation domain. This is the N-terminal domain of DNTTIP1, a protein that forms part of a novel histone deacetylase complex present in Homo sapiens. Histone deacetylase complexes comprise DNTTIP1, histone deacetylase (HDAC) and the repressor protein MIDEAS. The acetylation of histone tails plays a critical role in determining the accessibility of chromatin to transcriptional regulators and RNA polymerase complexes. This N-terminal domain is responsible for dimerization of histone deacetylase 1 (HDAC1). The N-terminal domain also interacts with and mediates the assembly of the HDAC1-MIDEAS complex.	69
408021	pfam18193	Fibrillin_U_N	Fibrillin 1 unique N-terminal domain. This is the N-terminal domain of human fibrillin-1. Fibrillin is a primary constituent of microfibrils in the extracellular matrix of many elastic and non-elastic connective tissues. This domain, known as the fibrillin unique N-terminal (FUN) domain, constitutes the minimal interaction site for the fibrillin C terminus. The FUN domain has homologs in the human proteins LTBP-1L/2 and VWCE, which are also associated with a C-terminal EGF-like domain.	37
408022	pfam18194	Xrn1_D3	Exoribonuclease 1 Domain-3. This domain is found in 5' to 3' exoribonuclease 1 (XRN1) present in Kluyveromyces lactis. XRN1 is involved in transcription, RNA metabolism, and RNA interference. This domain, known as D3, is the third of four domains located far from the active site. These four domains may help to stabilize the N-terminal segment of Xrn1 for catalysis.	71
408023	pfam18195	GatD_N	GatD N-terminal domain. This is the N-terminal domain of GatD protein present in Pyrococcus abyssi. Two GatD and two GatE associate to form a tetramer complex. The tetramer complex is able to mature Glutamic acid-tRNA Glutamine into Glutamine-tRNA Glutamine, a necessary step in the translation of proteins. The N-terminal domain is involved in anchoring GatD to GatE in order to form the tetramer.	53
408024	pfam18196	Cdh1_DBD_1	Chromodomain helicase DNA-binding domain 1. This domain can be found in chromodomain helicase DNA-binding protein 1 (Chd1) present in Saccharomyces cerevisiae. Chd1 proteins have been associated with the efficient assembly and spacing of nucleosomes. The domain consists of four helices, alpha helix 1-4, and can be divided into regions SANT, HL1 and the beta-linker. The HL1 region comprises some 40 residues specific to budding yeast that are unlikely to form such a prominent feature or perform a conserved function in other species. The domain itself forms part of the DNA-binding domain. Basic residues on the alpha-1 helix are thought to be important for DNA interaction.	120
380135	pfam18197	DUF5602	Domain of unknown function (DUF5602). This domain is found in TTHB210 protein present in Thermus thermophilus. TTHB210 is a Sigma-E factor regulated gene product that forms a homodecamer. This domain is chain G and can be classified with chains A, C, E and I based on its folds.	50
408025	pfam18198	AAA_lid_11	Dynein heavy chain AAA lid domain. This family represents the AAA lid domain found near the C-terminal region of dynein heavy chain.	156
408026	pfam18199	Dynein_C	Dynein heavy chain C-terminal domain. This family represents the C-terminal domain of dynein heavy chain. This domain is a complex structure comprising six alpha-helices and an incomplete six-stranded antiparallel beta-barrel. The shape of this domain is distinctively flat, spreading over the AAA1, AAA5 and AAA6 domain.	303
408027	pfam18200	Big_11	Bacterial Ig-like domain. This presumed domain is found repeated in bacterial cell surface proteins.	79
408028	pfam18201	PIH1_CS	PIH1 CS-like domain. This domain is found in yeast PIH1 and its homologues. This domain consists of a seven-stranded beta sandwich with the topology of a CS domain, a structural motif also found in Hsp90 co-chaperones such as p23/Sba1 and Sgt1.	100
408029	pfam18202	TQ	T-Q ester bond containing domain. This domain is found in gram positive bacterial surface proteins. It contains a very unusual isopeptide bond between a conserved N-terminal threonine residue on the first beta strand of the Ig-like fold and a glutamine residue in the final strand of the domain.	125
408030	pfam18203	IPTL-CTERM	IPTL-CTERM motif. This entry represents a predicted C-terminal sorting motif.	28
408031	pfam18204	PGF-CTERM	PGF-CTERM motif. 	23
408032	pfam18205	VPDSG-CTERM	VPDSG-CTERM motif. The PEP-CTERM/exosortase system has been previously identified through in silico analysis. This entry describes a PEP-CTERM-like variant C-terminal protein sorting signal, as found at the C terminus of twenty otherwise unrelated proteins in Verrucomicrobiae bacterium DG1235. The variant motif, VPDSG, seems an intermediate between the VPEP motif of typical exosortase systems and the classical LPXTG of sortase in Gram-positive bacteria.	26
408033	pfam18206	Porphyrn_cat_1	Porphyranase catalytic subdomain 1. This domain is found in porphyranase protein present in Bacteroides plebeius. Porphyranase breaks down porphyran during digestion of red seaweed glycans. It is worth noting that red seaweed glycans contain sulfate esters that are absent in terrestrial plants. This domain makes up part of the catalytic domain of the porphyranase protein.	105
408034	pfam18207	LIFR_N	Leukemia inhibitory factor receptor N-terminal domain. This domain can be found in leukemia inhibitory factor receptor (LIFR). LIFR is a cell surface receptor that mediates the actions of LIF and other interleukin-6 type cytokines through the formation of signalling complexes with gp130. This is the N-terminal domain, referred to as domain 1, which contains conserved disulfide bonds between Cys-10 to Cys-20 and Cys-37 to Cys-45.	75
408035	pfam18208	NES_C_h	Nicking enzyme C-terminal middle helical domain. This domain is found in nicking enzyme in S. aureus. NES initiates and terminates the transfer of plasmids that variously confer resistance to a range of drugs, including vancomycin and gentamicin. This domain is found in the C-terminal region of NES. The C-terminal region is required for conjugation and significantly impacts the catalytic activity of the N-terminal relaxase.	109
408036	pfam18209	ESF1	Embryo surrounding factor 1. This domain is Embryo surrounding factor 1 (ESF1) protein present in Arabidopsis thaliana. Maternally contributed central cell ESF1 peptides play an important role in suspensor formation and pro-embryo development. The biological activity of ESF1 depends on structural topology, which is stabilized by disulfide bonds.	56
408037	pfam18210	Knl1_RWD_C	Knl1 RWD C-terminal domain. This domain is found in Knl1, a sub-unit of the KMN network, present in Homo sapiens. The KMN network is the core of the outer kinetochore which is responsible for microtubule binding/stabilization and controls the spindle assembly checkpoint. This domain is the second of two RING finger, WD repeat, DEAD-like helicase (RWD) domains. The tandem RWD domains mediate kinetochore targeting of the microtubule-binding subunits by interacting with the Mis12 complex. The Mis12 complex is a KMN sub-complex that tethers directly onto the underlying chromatin layer.	98
408038	pfam18211	Csm1_B	Csm1 subunit domain B. This domain is found in the Csm1 subunit of the Csm complex found in Thermococcus onnurineus. Csm is a type III-A CRISPR-Cas system, which is an RNA-guided immune defense mechanism that detects and destroys foreign DNA or RNA. This domain is known as domain A and is positioned side by side with domain C. Both domain A and domain C adopt the BABBA topology. Domain A interacts primarily with domain B.	94
408039	pfam18212	ZNRF_3_ecto	ZNRF-3 Ectodomain. This domain is found in ZNRF-3 protein present in Danio rerio. ZNRF-3 is a transmembrane E3 ubiquitin ligase that antagonizes Wnt signalling, the signalling system used to mediate Rspo protein actions. ZNRF3 and RNF43, alongside the Rspo proteins, have emerged as a system with therapeutic potential for a number of pathological processes. This domain is known as the ectodomain.	104
408040	pfam18213	SUB1_ProdP9	SUB1 protease Prodomain ProdP9. This domain is the bound prodomain fragment ProdP9 of the SUB1 protein present in Plasmodium falciparum. SUB1 is a serine protease that processes a subset of parasite proteins that play indispensable roles in egress and invasion. The C-terminal stalk of ProdP9 binds in the active site groove in a substrate-like manner and is truncated at the N terminus as a result of the chymotrypsin digestion step used during purification. ProdP9 is structurally similar to MIC5 from Toxoplasma gondii, despite low sequence identity.	80
408041	pfam18214	STATa_Ig	STATa Immunoglobulin-like domain. This domain is found in signal transducer and activator of transcription A protein (STATa) present in Dictyostelium discoideum (dd). STATa is responsible for transcriptionally regulating cellular differentiation in Dictyostelium discoideum. ddSTATa is the only non-metazoan known to employ SH2 domain signaling. This domain adopts an Immunoglobulin-like fold.	122
408042	pfam18215	Rtt106_N	Histone chaperone Rtt106 N-terminal domain. This is the N-terminal domain of Rtt106 in Saccharomyces cerevisiae. Rtt106 is a histone chaperone that contributes to the deposition of newly synthesized acetylated Histone 3 Lysine 56 (H3K56ac) carrying H3-H4 complex on replicating DNA. The N-terminal domain of Rtt106 homodimerizes and interacts with H3-H4 independently of acetylation.	45
375643	pfam18216	N_formyltrans_C	N-formyltransferase dimerization C-terminal domain. This is the C-terminal domain of N-formyltransferase found in Francisella tularensis. N-formylated sugars are observed on O-antigens of pathogenic Gram-negative bacteria. This C-terminal domain is responsible for dimerization. In particular, the beta hairpin motif present in the domain helps create a subunit-subunit interface. The dimeric interface is characterized by a hydrophobic patch formed by Ile 195, Leu 197, Val 201, Met 203, Ile 207, Phe 223, Val 231, Val 233, Leu 235, and Leu 237 from both monomers.	52
408043	pfam18217	Zap1_zf2	Zap1 zinc finger 2. This domain can be found in Zap1 present in Saccharomyces cerevisiae. Zap1 regulates S. cerevisiae which mediates the transcription of genes encoding uptake vacuolar transporters. This domain corresponds to zinc finger 2 (zf2) which has been shown to be a constitutive transcriptional activator. The interaction between the two zinc fingers stabilizes Zn(II) binding.	24
408044	pfam18218	Spa1_C	Lantibiotic immunity protein Spa1 C-terminal domain. This is the C-terminal domain found in SpaI present in Bacillus subtilis. SpaI is an immunity lipoprotein that protects the Gram-positive bacteria against their own lantibiotics, in this case subtilin. SpaI together with the ABC transporter SpaFEG protects the membrane from subtilin insertion.	99
408045	pfam18219	SidC_N	SidC N-terminal domain. This is the N-terminal domain of SidC present in Legionella pneumophila. SidC appears to be involved in modulating mammalian trafficking by promoting the communication between ER-derived vesicles and the Legionella containing vacuole. The N-terminal domain (SidC-N) has a novel fold with 4 potential subdomains. SidC-N does not show structural similarity to any known protein domain in the protein data bank.	466
408046	pfam18220	BspA_v	Adhesin BspA variable domain. This domain is found in BspA protein present in Streptococcus agalactiae. BspA is an antigen I/II family polypeptide that confers adhesion linked to pathogenesis in group B Streptococcus. This domain is referred to as the variable domain (BspA-V). BspA-V is responsible for binding to scavenger receptor gp340. BspA-V adopts a fold that is distinct from those of other AgI/II family polypeptide variable domains.	150
375648	pfam18221	MU2_FHA	Mutator 2 Fork head associated domain. This is the N-terminal forkhead-associated (FHA) domain found in Drosophila mutator 2 (MU2) protein. FHA domains are generally phosphothreonine (pThr) specific-binding domains and are present in DNA repair and checkpoint proteins. However, phosphothreonine binding is not conserved in the MU2 active site pocket due to the absence of three key residues needed for pThr binding. Dimerization is conserved between FHA domains of Drosophila MU2 and human MDC1, albeit through different interfaces. The MU2 FHA domain dimerizes via beta-sheet 2, whereas the MDC1 FHA domain dimerizes via the opposite beta-sheet 1.	94
408047	pfam18222	PilN_bio_d	PilN biogenesis protein dimerization domain. This domain is found in PilN type IV pilus biogenesis protein present in Thermus thermophilus. PilN is an integral inner membrane protein needed for the formation of type IV pilus. This domain forms a dimer which is mediated by symmetric contacts between residues in alpha-1, beta-1, beta-3 and alpha-3.	102
408048	pfam18223	PilJ_C	Pili PilJ C-terminal domain. This is the C-terminal domain of PilJ, a Type IV pilin found in gram-positive Clostridium difficile. Incorporation of PilJ into pili exposes the C-terminal domain of PilJ to create a novel interaction surface. This C-terminal domain is not observed in other Type IV pilin proteins.	95
408049	pfam18224	ToxB_N	ToxB N-terminal domain. This is the N-terminal domain of ToxB found in Pyrenophora tritici-repentis. This domain is crucial for toxin activity. There are only two amino acid differences between ToxB and toxb, an inactive homolog of ToxB. These two differences are a Val at position 3 in ToxB compared to a Thr in toxb, and an Ala at position 12 in ToxB compared to a Val in toxb. AvrPiz-t, a secreted avirulence protein produced by the rice blast fungus, is a structural homolog to ToxB.	61
408050	pfam18225	AbfS_sensor	Sensor histidine kinase (AbfS) sensor domain. This is the sensor domain of sensor histidine kinase (AbfS) present in Cellvibrio japonicus. AbfS forms part of the AbfR/S two-component system which is needed to activate the expression of the suite of enzymes that remove the numerous side chains from xylan. The overall fold of the sensor domain is that of a classical Per-Arnt-Sim domain.	65
408051	pfam18226	QslA_E	LasR-specific antiactivator QslA chain E. This domain is chain E of QslA present in Pseudomonas aeruginosa. QslA is an antiactivator which binds to the transcription factor LasR, disrupting its dimerization and preventing LasR from binding to target DNA. Chain E interacts with chain F of QslA and forms the dimerization interface. Chain E also interacts with chain A of ligand-binding domain in LasR.	71
408052	pfam18227	LepB_GAP_C	LepB GAP domain C-terminal subdomain. This subdomain is found in the Rab1 GTPase-activating protein (GAP) domain of GAP LepB present in Legionella pneumophila. LepB inactivates Rab1, a key regulator of the secretory vesicular trafficking machinery, by acting as a GTPase-activating protein. LepB is also an antagonist of DrrA, which promotes Rab1. LepB acts by an atypical RabGAP mechanism that is reminiscent of classical GAPs. This is the C-terminal subdomain of the GAP domain and consists of an unusual fold.	80
375655	pfam18228	CdiI_N	CdiI N-terminal domain. This is the N-terminal domain of Contact-dependent growth inhibition immunity (CdiI) proteins present in Enterobacter cloacae. CdiI proteins neutralize CdiA-CT toxins to protect toxin-producing cells from auto-inhibition. Structural homology searches reveal that Enterobacter cloacae CdiI is most similar to the Whirly family of single-stranded DNA-binding proteins.	108
408053	pfam18229	GcnA_N	N-acetyl-beta-D-glucosaminidase N-terminal domain. This is the N-terminal domain found in N-acetyl-beta-D-glucosaminidase (GcnA) present in Streptococcus gordonii. GcnA is a family 20 glycosidase that cleaves N-acetyl-beta-D-glucosamine and N-acetyl-beta-D-galactosamine from 4-methylumbelliferylated substrates. Similar N-terminal domains have been observed in all family 20 glycosidases although the number of beta-sheet strands may vary from five.	78
408054	pfam18230	Glyc_hyd_38C_2	Glycosyl hydrolases family 38 C-terminal sub-domain. This is a subdomain found in the C-terminal region of Golgi alpha-mannosidase II present in Drosophila melanogaster. These proteins are important in glycoprotein processing and are thought to cleave mannosidic bonds through a double displacement mechanism involving a reaction intermediate. This subdomain is found at the C-terminal of Glycosyl hydrolases family 38 C-terminal domain.	89
375658	pfam18231	DUF5603	Domain of unknown function (DUF5603). This domain is found in the C-terminal region of free serine kinase (SerK) in the hyperthermophilic archaeon Thermococcus kodakarensis. SerK converts ADP and l-serine (Ser) into AMP and O-phospho-l-serine (Sep), which is a precursor of l-cysteine. The domain is not conserved in the ParB/Srx family. The differences between SerK and the other members of the ParB/Srx family is concentrated in the C-terminal region, which may include residues involved in the Sep binding.	105
375659	pfam18232	Chalcone_N	Chalcone isomerase N-terminal domain. This is the N-terminal domain of chalcone isomerase present in Eubacterium ramulus. Chalcone isomerase is involved in the degradation pathway of flavone naringenin.	102
408055	pfam18233	Cdc13_OB4_dimer	Cdc13 OB4 dimerization domain. This domain is found in Cdc13 proteins in several Candida species. The Cdc13-Stn1-Ten1 complex is crucial for telomere protection. This domain is the C-terminal OB4 domain and is responsible for dimerization. Dimerization of Cdc13 is important for high-affinity DNA binding.	114
408056	pfam18234	VioE	Violacein biosynthetic enzyme VioE. This domain is VioE present in Chromobacterium violaceum. VioE plays a key role in the biosynthesis of violacein. Violacein has potential medical applications as an antibacterial, trypanocidal, anti-ulcerogenic and anti-cancer drug. VioE forms a homodimer with a chiefly hydrophobic interface between the two VioE monomers. The fact that VioE adopts a fold normally associated with lipoprotein carrier proteins may be due to VioE for binding the hydrophobic polyethylene glycol.	182
408057	pfam18235	OST_P2	Oligosaccharyltransferase Peripheral 2 domain. This is a domain found in the C-terminal region of STT3 present in P. furiosus. STT3 is an Oligosaccharyltransferase which catalyzes the transfer of a heptasaccharide, containing one hexouronate and two pentose residues, onto peptides in an Asn-X-Thr/Ser-motif-dependent manner. This domain, known as the Peripheral 2 (P2) domain, encircles the central core domain.	133
408058	pfam18236	AGO_N	Argonaute N domain. This is the N domain often found in the N-terminal region of Argonaute (AGO) present in Kluyveromyces polyspora. AGO forms part of the RNA-induced silencing complex that mediates the gene silencing pathway, RNA interference. The N domain blocks the nucleic acid-binding channel and prevents propagation of guide-target pairing beyond position 16.	122
375664	pfam18237	Tk-SP_N-pro	Tk-SP N-propeptide domain. This is the N-propeptide domain found in Tk-SP, a subtilisin-like serine protease from Thermococcus kodakaraensis. The beta sheet of this domain packs tightly to the two nearly parallel alpha helices 2 and 3 located at the surface of the subtilisin domain. Gln105 and Asp107 of the N-propeptide domain also bind to the N-termini of these two alpha-helices to form helix caps.	67
408059	pfam18238	LnmK_N_HDF	LnmK N-terminal Hot Dog Fold domain. This domain is found in LnmK and is present in Streptomyces atroolivaceus. LnmK is a bifunctional acyltransferase/decarboxylase (AT/DC) that catalyzes first self-acylation using methylmalonyl-CoA as a substrate and subsequently trans-acylation of the methylmalonyl group to the phosphopantetheinyl group of the LnmL acyl carrier protein. LnmK is a homodimer composed of two monomeric double-hot-dog folds (DHDF). This domain is the N-terminal hot dog fold.	176
408060	pfam18239	HA1	Hemagglutinin I. This domain is hemagglutinin I (HA1) present in Physarum polycephalum. Although the physiological function of the secreted HA1 remains to be established, HA1 recognizes cell wall polysaccharides of E. coli. The beta-sandwich fold of HA1, composed of two up and down beta-sheets, is conserved among other legume lectin-like proteins. The up and down beta-sheet region is a minimal carbohydrate recognition domain.	90
408061	pfam18240	PSII_Pbs31	Photosystem II Psb31 protein. This domain is Psb31, an extrinsic protein found in photosystem II (PSII) present in Chaetoceros gracilis. Photosystem II (PSII) is a multisubunit, membrane protein complex located in the thylakoid membranes of oxygenic photosynthetic organisms from cyanobacteria to higher plants. The four helices in the N-terminal domain are arranged in an up-down-up-down fold and are similar in structure to PsbQ protein in Spinach, despite low sequence homology.	93
408062	pfam18241	AvrM-A	Flax-rust effector AvrM-A. This domain is found in AvrM-A present in Melampsora lini. AvrM-A is a natural variant of AvrM which is a secreted effector protein that can internalize into plant cells in the absence of the pathogen and bind to phosphoinositides. AvrM results in effector-triggered immunity. This domain makes up part of the C-terminal region, which is highly conserved in AvrM. The domain is required for M-dependent effector-triggered immunity.	147
375669	pfam18242	LupA	Legionella ubiquitin-specific protease A domain. This domain is found in Legionella ubiquitin-specific protease A (LupA). LupA removes a ubiquitin modification from LegC3 which inactivates the cognate effector. This domain is typical of eukaryotic ubiquitin proteases involved in deconjugation of ubiquitin or ubiquitin-like proteins from their targets.	178
408063	pfam18243	BfiI_DBD	Metal-independent restriction enzyme BfiI DNA binding domain. This domain is found in the metal-independent restriction enzyme BfiI present in Bacillus firmus. This domain is found in the C-terminal of the protein and is responsible for DNA binding. The domain exhibits a beta-barrel-like structure similar to the effector DNA-binding domain of the Mg2+ dependent restriction enzyme EcoRII and to the B3-like DNA-binding domain of plant transcription factors.	164
408064	pfam18244	CttA_N	Cellulose-binding protein CttA N-terminal domain. This is the N-terminal domain of cellulose-binding protein CttA present in Ruminococcus flavefaciens. CttA mediates attachment of the bacterial substrate via two carbohydrate-binding modules. The domain is known as the X-module and lacks a true hydrophobic core. Unlike the X-modules in other types of CohE-XDoc complexes it does not contribute to the binding surface. This X-module appears to serve as an extended spacer, which separates the cellulose-binding modules at the N terminus of CttA and the bacterial cell wall. The domain does not share structural similarity with other known X-modules from cellulolytic bacteria but does show similarity to G5-1 module of StrH from S. pneumoniae.	71
408065	pfam18245	XRN1_DBM	5-3 exonuclease XRN1 DCP1-binding motif. This domain is found in the 5'-3' exonuclease (XRN1) present in Drosophila melanogaster. XRN1 degrades deadenylated mRNA that has recently been decapped by decapping enzyme 2 (DCP2). DCP2 associates with decapping activators DCP1 and EDC4. The direct interaction between DCP1 and XRN1 couples mRNA decapping to 5' exonucleolytic degradation. This domain is responsible for binding to DCP1. In particular, the helical C-terminal region of the domain contributes to the binding affinity and the specificity of the interaction.	26
408066	pfam18246	OST_IS	Oligosaccharyltransferase Insert domain. This is a domain found in STT3 present in P. furiosus. In P. furiosus, STT3 is an Oligosaccharyltransferase which catalyzes the transfer of a heptasaccharide, containing one hexouronate and two pentose residues, onto peptides in an Asn-X-Thr/Ser-motif-dependent manner. This domain is inserted into the central core (CC) domain and hence is referred to as the Insert (IS) domain. This IS domain contains a disulphide bond.	83
375674	pfam18247	AvrM_N	Flax-rust effector AvrM N-terminal domain. This is the N-terminal domain found in AvrM present in Melampsora lini. AvrM is a secreted effector protein that can internalize into plant cells in the absence of pathogens, binds to phosphoinositides and results in effector-triggered immunity. This domain is related to the WY domain core in oomycete effectors.	65
375675	pfam18248	RalF_SCD	RalF C-terminal Sec-7 capping domain. This is the C-terminal domain of RalF protein present in Legionella pneumophila. RalF is secreted into host cytosol via the Dot/Icm type IV transporter where it acts to recruit ADP-ribosylation factor (Arf) to pathogen-containing phagosomes in the establishment of a replicative organelle. This domain forms a cap over the active site in the Sec7 domain and so is referred to as the Sec7-capping domain.	147
375676	pfam18249	Ca_bind_SSO6904	Calcium binding protein SSO6904. This domain is SSO6904 present in Sulfolobus solfataricus. SSO6904 is a calcium binding protein thought to have a weak affinity for other cations such as Mg2+ and Zn2+. The structure of SSO6904 is similar to that of saposin-fold proteins. Saposin proteins are membrane-interacting glycoproteins required for the hydrolysis of certain sphingolipids by specific lysosomal hydrolases.	90
408067	pfam18250	Tgi2PP	Effector immunity protein Tgi2PP. This domain is Tgi2PP found in Pseudomonas protegens. Tgi2PP is part of the Tge2PP-Tgi2PP effector-immunity pair secreted by the type VI secretion system (T6SS). Tgi2PP interacts predominantly by hydrogen bonding and hydrophobic interactions with Tge2PP via the insertion of the beta-sheet core of Tgi2PP into the substrate-binding groove of Tge2PP. Tgi2PP contains a similar topology to the periplasmic E. coli colicin M immunity protein.	46
375678	pfam18251	Defensin_5	Fungal defensin Copsin. This domain is Copsin present in Coprinopsis cinerea. Copsin is a defensin that interferes with peptidoglycan synthesis and has a CS-alpha-beta fold. Copsin is stabilized by a unique connectivity of six cysteine bonds in contrast to most other CS-alpha-beta defensins which are linked by three or four disulfide bonds.	39
408068	pfam18252	Cu_bind_CorA	Copper(I)-binding protein CorA. This domain is found in CorA present in Methylomicrobium album. CorA is a copper repressible surface associated copper(I)-binding protein. CorA can bind one copper ion per protein molecule. The overall fold of CorA is similar to M. capsulatus protein MopE, including the unique copper(I)-binding site and most of the secondary structure elements.	175
408069	pfam18253	HipN	Hsp70-interacting protein N N-terminal domain. This is the N-terminal domain, known as HipN, found in Hsp70-interacting protein (Hip) present in Rattus norvegicus. Hip cooperates with the chaperone Hsp70 in protein folding and prevention of aggregation and may delay substrate release by slowing ADP dissociation from Hsp70. HipN is responsible for N-terminal homo-dimerization which is necessary so that the Hip dimer can interact with Hsp70 molecules.	42
408070	pfam18254	HMw1_D2	HMW1 domain 2. This domain is found in Actinobacillus pleuropneumoniae HMW1C (ApHMW1C). HMW1 adhesin is an N-linked glycoprotein that mediates adherence to respiratory epithelium through N-glycosylation of protein acceptor sites and O-glycosylation of sugar acceptor sites. This domain forms an all alpha domain (AAD) when combined with the N-terminal domain. The AAD interacts extensively with the C-terminal GT-B fold in order to create a unique groove with the potential to accommodate the acceptor protein.	88
408071	pfam18255	SAM_DrpA	DNA processing protein A sterile alpha motif domain. This is the N-terminal domain found in DNA processing protein A (DprA) present in Streptococcus pneumoniae. DprA has recently been discovered to be a transformation-dedicated RecA loader. Transformation is believed to play a major role in genetic plasticity. This domain is known as the sterile alpha motif (SAM) domain. DprAs are able to form a type of dimer through SAM-SAM interactions, also known as N/N interactions.	62
375683	pfam18256	HscB_4_cys	Co-chaperone HscB tetracysteine metal binding motif. This is the N-terminal domain of human co-chaperone protein HscB (hHscB). This domain is capable of binding a metal ion through its tetracysteine metal binding motif. The metal atom is coordinated by a set of four cysteine residues (Cys41, Cys44, Cys58 and Cys61) on opposed beta-hairpins. Although the N-domain lacks any recognizable secondary structure elements, it has several distant structural homologs including C-4 zinc finger domains and rubredoxin.	27
408072	pfam18257	DsbG_N	Disulfide isomerase DsbG N-terminal. This is the N-terminal domain found in DsbG, a protein disulfide isomerase present in the periplasm of Helicobacter pylori. The formation of correct disulfide bonds is critical in the folding process of many secretory and membrane proteins in bacteria. Non-native disulfides are corrected by the isomerase DsbC, and, to a lesser extent, by DsbG. The N-terminal domain is involved in dimerization. The dimer interface of Helicobacter pylori's DsbG is stabilized by hydrophobic interactions and hydrogen bonds involving alpha 1, beta-3 to beta-4 loop, beta-4 and beta-4 to alpha-2 loop. This pattern of dimerization is similar to that of E. coli's DsbG.	90
408073	pfam18258	IL4_i_Ig	Interleukin-4 inducing immunoglobulin-binding domain. This domain is found in Interleukin-4 inducing protein alpha-1 (IPSE/alpha-1) present in Schistosoma mansoni, a parasite of humans. IPSE/alpha-1 triggers the release of IL-4 from basophils in the liver which is a major site of egg deposition during S. mansoni infection. This domain adopts a beta gamma-crystallin fold that is stabilized by three disulfide bonds within the domain (23/26, 59/93, and 111/121). The domain is involved in immunoglobulin binding.	89
408074	pfam18259	CBM65_1	Carbohydrate binding module 65 domain 1. This domain is found in the non-catalytic carbohydrate binding module 65B (CBM65B) present in Eubacterium cellulosolvens. CBMs are present in plant cell wall degrading enzymes and are responsible for targeting, which enhances catalysis. CBM65s display higher affinity for oligosaccharides, such as cellohexaose, and particularly polysaccharides than cellotetraose, which fully occupies the core component of the substrate binding cleft. The concave surface presented by beta-sheet 2 comprises the beta-glucan binding site in CBM65s. C6 of all the backbone glucose moieties makes extensive hydrophobic interactions with the surface tryptophans of CBM65s. Three out of the four surface Trp are highly conserved. The conserved metal ion site typical of CBMs is absent in this CBM65 family.	113
408075	pfam18260	Nab2p_Zf1	Nuclear polyadenylated RNA-binding 2 protein CCCH zinc finger 1. This domain is found in nuclear polyadenylated RNA-binding 2 protein (Nab2p) present in Saccharomyces cerevisiae. Nab2p is a major family of Poly A-binding proteins whose interactions are thought to be crucial for the control of poly(A) tail length. This domain is the first of seven CCCH zinc fingers which are responsible for polyadenosine RNA binding. When combined with the next three zinc fingers (Zf1-4), these four zinc fingers together may bind RNA in the 3' to 5' direction.	26
408076	pfam18261	Rpn9_C	Rpn9 C-terminal helix. This is the C-terminal domain found in Rpn9 present in Saccharomyces cerevisiae. Rpn9 is one of six PCI-domain-containing proteins that form the lid of the proteasome for ATP-dependent unfolding and hydrolysis of the polypeptide. Rpn9's C-terminal domain is not necessary for lid assembly with the exception of Rpn12, where the domain's absence prevents the association of Rpn12.	33
408077	pfam18262	PhetRS_B1	Phe-tRNA synthetase beta subunit B1 domain. This is the N-terminal domain found in human cytosolic phenylalanyl tRNA synthetase beta subunit.	83
408078	pfam18263	MCM6_C	MCM6 C-terminal winged-helix domain. The minichromosome maintenance (Mcm) complex is the replicative helicase in eukaryotic species, that plays essential roles in the initiation and elongation phases of DNA replication. During late M and early G(1), the Mcm complex is loaded onto chromatin to form prereplicative complex in a Cdt1-dependent manner. This entry represents the C-terminal domain of human Mcm6 which is the Cdt1 binding domain (CBD). The structure of CBD exhibits a typical winged helix fold that is generally involved in protein-nucleic acid interaction. The CBD failed to interact with DNA in experiments. The CBD-Cdt1 interaction involves the helix-turn-helix motif of CBD.	107
408079	pfam18264	preSET_CXC	CXC domain. This domain is found to the N-terminus of the SET domain in the EZH2 protein. It is a zinc binding domain.	32
408080	pfam18265	Nas2_N	Nas2 N-terminal domain. Nas2 is a proteasome assembly chaperone. Nas2 bivalently binds the proteasome Rpt5 subunit. The Nas2 N-terminal helical domain masks the Rpt1-interacting surface of Rpt5.	79
408081	pfam18266	Ncstrn_small	Nicastrin small lobe. This domain is part of the protein Nicastrin, a component of gamma secretase present in Homo sapiens. Gamma-secretase is thought to contribute to Alzheimer's disease development by generating beta-amyloid peptides. This domain is known as the small lobe which forms the 'lid'. The lid is an extended surface loop that covers the hydrophilic pocket that is thought to be responsible for substrate recruitment. On substrate binding, the large lobe is thought to rotate relative to the small lobe.	169
408082	pfam18267	Rubredoxin_C	Rubredoxin NAD+ reductase C-terminal domain. This is the C-terminal domain of NADH rubredoxin oxidoreductase present in Clostridium acetobutylicum. The majority of obligatory anaerobes detoxify micro-aerobic environments by consuming O2 via H2O-forming NADH oxidase. This enzyme offers an alternate reaction pathway for scavenging of O2 and reactive oxygen species, wherein the reducing equivalent is obtained from NADH.	70
408083	pfam18268	Hit1_C	Hit1 C-terminal. This domain is found in Hit1 protein (Hit1p) present in Saccharomyces cerevisiae. Hit1p contributes to C/D small nucleolar RNPs (snoRNPs) stability and pre-RNA maturation kinetics by associating with U3 snoRNA precursors and influencing its 3'-end processing. Snu13p-Rsa1p-Hit1p heterotrimer binds C/D snoRNAs. C/D snoRNAs are essential for the biogenesis of ribosomes and spliceosomes. The domain adopts a Pac-Hit fold that forms a claw which locks the alpha-1 helix of Rsa1p, while the Rsa1p alpha-2 helix packs against the exposed surface of the Hit1p Pac-Hit domain alpha-3 helix.	82
408084	pfam18269	T3SS_ATPase_C	T3SS EscN ATPase C-terminal domain. This is the C-terminal domain of the EscN protein family of ATPases that form part of the Type III secretion system (T3SS) present in Escherichia coli. T3SS is a macromolecular complex that creates a syringe-like apparatus extending from the bacterial cytosol across three membranes to the eukaryotic cytosol. This process is essential for pathogenicity. EscN is a functionally unique ATPase that provides an inner-membrane recognition gate for the T3SS chaperone-virulence effector complexes as well as a potential source of energy for their subsequent secretion. The C-terminal domain of T3SS ATPases mediates binding with multiple contact points along the chaperone.	70
375697	pfam18270	Evf	Virulence factor Evf. This domain is found in Erwinia virulence factor (Evf) present in the Drosophila pathogen, Erwinia carotovora. Evf is able to bind to model membranes containing negatively charged phospholipids and to promote their aggregation. Palmitic acid covalently binds to the completely conserved Cys209. The structure of Evf is unlike any virulence factors known to date.	235
408085	pfam18271	GH131_N	Glycoside hydrolase 131 catalytic N-terminal domain. This is the N-terminal domain found in glycoside hydrolase family 131 (GH131A) protein observed in Coprinopsis cinerea. GH131A exhibits bifunctional exo-beta-1,3-/-1,6- and endo-beta-1,4 activity toward beta-glucan. This domain is catalytic in nature though the catalytic mechanism of C. cinerea GH131A is different from that of typical glycosidases that use a pair of carboxylic acid residues as the catalytic residues. In the case of GH131A, Glu98 and His218 may form a catalytic dyad and Glu98 may activate His218 during catalysis.	251
408086	pfam18272	ssDNA_TraI_N	single-stranded DNA binding TraI N-terminal subdomain. This is a subdomain found in TraI present in E. coli. TraI is a conjugative relaxase that forms part of the Type IV secretion system. This subdomain, referred to as 2A, is located in the N-terminal region of the translocation signal (TSA) domain. TSA is known to reside in a larger ssDNA-binding domain.	53
408087	pfam18273	T3RM_EcoP15I_C	Type III R-M EcoP15I C-terminal domain. This domain is found in the Type III restriction-modification EcoP15I complex, present in E. coli. The Mod subunits function as a dimer with ModA recognizing the DNA and ModB methylating the target adenine base. This domain is found in the C-terminal of the ModB subunit.	97
375701	pfam18274	V_ATPase_prox	Vacuolar ATPase Subunit I N-terminal proximal lobe. This domain is found in the cytoplasmic N-terminal domain of vacuolar ATP synthase subunit I present in Meiothermus ruber. Subunit I is a homolog of subunit A which associates with the membrane-bound complex of eukaryotic vacuolar H+-ATPase (V-ATPase) acidification machinery. The domain forms the proximal lobe that caps one end of the alpha helix bundle, with the distal lobe capping the other end. Although the two lobes exhibit a similar motif, the molecular nature of the coupling with the identical stalks is thought to be dissimilar.	52
408088	pfam18275	His_Me_b4a2	His-Me finger endonuclease beta4-alpha2 domain. This domain is found in Hpy991 present in Helicobacter pylori. Hpy991 is a beta-beta-alpha-Me restriction endonuclease that recognizes the CGWCG target sequence and cleaves both DNA strands with a stagger that leads to 5'-recessed ends in the cleavage products. This domain is the first of two beta4-alpha2 repeats found after the N-terminal domain. The two repeats have low overall sequence similarity but are readily identified by a structural comparison. Both repeats contain two CXXC motifs that map to the first beta-hairpin and the first alpha-helix. The four cysteine residues coordinate a structurally bound Zn2+ ion tetrahedrally. The major groove is in contact with the first repeat, with the beta-hairpin 2 inserting deeply into the groove.	52
408089	pfam18276	TcA_TcB_BD	Tc toxin complex TcA C-terminal TcB-binding domain. This domain is found in the C-terminal region of the Tc toxin TcA, present in Photorhabdus luminescens. Tc Toxin complexes bind to the cell surface, are endocytosed and perforate the host endosomal membrane by forming channels that translocate toxic enzymes into the host. This domain is responsible for binding to toxin TcB. Binding of TcA to TcB/TcC opens the beta-propeller gate.	287
408090	pfam18277	AbrB_C	AbrB C-terminal domain. This is the C-terminal domain of AbrB protein from Bacillus subtilis. AbrB is a transition state regulator. Functions of AbrB include biofilm formation, antibiotic production, competence development, extracellular enzyme production, motility, and sporulation. The C-terminal domain is responsible for multimerization and, to a lesser extent than the N-terminal domain, also contributes to DNA binding.	37
408091	pfam18278	RANK_CRD_2	Receptor activator of the NF-KB cysteine-rich repeat domain 2. This domain is found in the receptor activator of the NF-KB (RANK) present in Mus musculus. RANK and its cognate ligand RANKL play a role in bone remodelling, immune function and mammary gland development in conjunction with various cytokines and hormones. The binding of RANKL to RANK causes trimerisation of the receptor, which activates the signalling pathway and results in osteoclastogenesis from progenitor cells and the activation of mature osteoclasts. This domain is the second of four cysteine rich pseudo-repeat domains (CRDs) and so is known as CRD2. RANK moves via a hinge region between CRD2 and CRD3 to make close contact with RANKL.	41
375706	pfam18279	zf-WRNIP1_ubi	Werner helicase-interacting protein 1 ubiquitin-binding domain. This domain is found in the Werner helicase-interacting protein 1 present in Homo sapiens. The domain is a zinc finger responsible for ubiquitin binding and has a zinc-coordinating B-B-A fold. WRNIP1 UBZ binds ubiquitin in a similar manner to Rad18 UBZ.	21
375707	pfam18280	AadA_C	Aminoglycoside adenyltransferase C-terminal domain. This is the C-terminal domain of aminoglycoside (3'')(9) adenyltransferase (AadA) present in Salmonella enterica. AadA acts as a monomer to catalyse the magnesium-dependent transfer of adenosine monophosphate from ATP to the two chemically dissimilar drugs streptomycin and spectinomycin.	104
408092	pfam18281	BILBO1_N	BILBO1 N-terminal domain. This is the N-terminal domain of BILBO1 present in Trypanosoma brucei. BILBO1 is a flagella pocket collar component. Depletion of BILBO1 prevents flagella pocket and flagella pocket collar biogenesis and leads to cell death. This domain has a ubiquitin-like fold and has a conserved patch of four aromatic residues (Phe-12, Trp-71, Tyr-87, and Phe-89) and three basic residues (Lys-15, Lys-60, and Lys-62).	92
375709	pfam18282	RAP80_UIM	RAP80 N-terminal ubiquitin interaction motif. This is the N-terminal domain found in RAP80 protein present in Homo sapiens. RAP80 is fundamental for protein recruitment in the DNA damage response. The N-terminal domain is a ubiquitin-interacting motif (UIM). RAP80 is involved in multivalent recognition of polyUb chains through its N-terminal domain.	57
408093	pfam18283	CBM77	Carbohydrate binding module 77. This domain is the non-catalytic carbohydrate binding module 77 (CBM77) present in Ruminococcus flavefaciens. CBMs fulfil a critical targeting function in plant cell wall depolymerisation. In CBM77, a cluster of conserved basic residues (Lys1092, Lys1107 and Lys1162) confer calcium-independent recognition of homogalacturonan.	108
408094	pfam18284	DNA_meth_N	DNA methylase N-terminal domain. This is the N-terminal domain of DNA methylase (pfam00145). Family members include Modification methylase EcoRII (EC:2.1.1.37) and DNA-cytosine methyltransferase.	57
408095	pfam18285	LuxT_C	Tetracycline repressor LuxT C-terminal domain. This is the C-terminal domain of LuxT. LuxT is a tetracycline repressor family regulator identified in Vibrio alginolyticus which may play a role in the fine-tuning of the virulence via quorum sensing (QS).	87
408096	pfam18286	T3SS_ExsE	Type III secretion system ExsE. This domain is found in ExsE present in Pseudomonas aeruginosa. ExsE forms part of the ExsACDE signaling cascade which acts as an important regulatory switch that ensures timely expression of the Type III secretion system (T3SS) and so plays a critical role in facilitating infection. Prior to host-cell contact, the T3SS is inactive and ExsE and Type III Secretion Chaperone (ExsC) form a stable complex. ExsC forms a compact homodimer and ExsE wraps around one face of this dimer.	46
375714	pfam18287	Hfx_Cass5	Integron Cassette Protein Hfx_Cass5. This domain forms part of the integron cassette protein Hfx_CASS5 present in Vibrio cholerae. The structure of Hfx is a tetramer built from two domain-swapped dimers.	80
408097	pfam18288	FAA_hydro_N_2	Fumarylacetoacetase N-terminal domain 2. This domain is found in the N-terminal region of Fumarylacetoacetate (FAA) hydrolase (pfam01557). Family members of this domain include Pseudogulbenkiania ferrooxidans and Cupriavidus gilardii.	78
408098	pfam18289	HU-CCDC81_euk_2	CCDC81 eukaryotic HU domain 2. This is the second of two HU domains found in the CCDC81-like proteins. CCDC81 has been experimentally linked to the centrosome; eukaryotic CCDC81 HU domains are predicted to function in protein-protein interactions in centrosome organization and potentially contribute to cargo-binding in conjunction with Dynein-VII. A striking lineage-specific expansion of the domain is observed in birds, where the HU domains could function in recognition of non-self molecules.	75
408099	pfam18290	Nudix_hydro	Nudix hydrolase domain. This domain is found just before the N-terminal region of nucleoside diphosphate-linked moiety (Nudix) hydrolases (pfam00293). Nudix hydrolases catalyze the hydrolysis of nucleoside diphosphates which are often toxic metabolic intermediates and signalling molecules.	80
408100	pfam18291	HU-HIG	HU domain fused to wHTH, Ig, or Glycine-rich motif. Rapidly-diverging family of HU domains predominantly observed in the bacteroidetes lineage with a predicted role in recognition and possible interception of the DNA of parasitic elements, a counter-conflict strategy preventing incorporation of these elements into the host genome.	125
408101	pfam18292	ZIP4_domain	Zinc transporter ZIP4 domain. This domain is found in ZRT1-IRT1-like protein 4 (ZIP4) present in Homo sapiens and Mus musculus. ZIP4 is a zinc transporter that allows uptake of the essential nutrient zinc. The domain is found before the N-terminal of ZIP Zinc transporter domain (pfam02535).	167
408102	pfam18293	Caprin-1_dimer	Caprin-1 dimerization domain. This domain is found in human Caprin-1 protein. Caprin-1 plays a role in many important biological processes, including cellular proliferation, innate immune response and synaptic plasticity. This domain is found in the highly conserved homologous region 1(HR1) and is responsible for the tight homodimerization of Caprin-1.	116
408103	pfam18294	Pept_S41_N	Peptidase S41 N-terminal domain. This domain is found in the N-terminal region of proteins carrying the peptidase S41 domain (pfam03572) in Bacteroidetes.	49
408104	pfam18295	Pdase_M17_N2	M17 aminopeptidase N-terminal domain 2. This domain is found in the N-terminal region of M17 aminopeptidase (pfam00883) present in Homo sapiens and Mus musculus. M17 aminopeptidases are Zn-dependent exopeptidases that catalyse the removal of unsubstituted amino acid residues from the N-terminus of peptides.	121
408105	pfam18296	MID_MedPIWI	MID domain of medPIWI. MID domain of the medPIWI PIWI/Argonaute module. medPIWI is the core globular domain of the Med13 protein. Med13 is one member of the CDK8 subcomplex of the Mediator transcriptional coactivator complex. The medPIWI module in Med13 is predicted to bind double-stranded nucleic acids, triggering the experimentally-observed conformational switch in the CDK8 subcomplex which regulates the Mediator complex.	191
408106	pfam18297	NFACT-R_2	NFACT protein RNA binding domain. NFACT-R RNA binding family found in bacteria fused to the ThiI domain as a variant of the canonical tRNA 4-thiouridylation pathway.	104
408107	pfam18298	NusG_add	NusG additional domain. This domain is found in Thermotoga maritima NusG, which interacts with RNA polymerase and other proteins to form multi-component complexes that modulate transcription. This domain is referred to as Domain II and is an additional domain inserted into the N-terminal domain.	109
408108	pfam18299	R2K_2	ATP-grasp domain, R2K clade family 2. Family of ATP-grasp enzymes belonging to the R2K clade, wherein one of the absolutely-conserved lysine residues has migrated to the RAGYNA domain which is a part of the core ATP-grasp module. This family is predicted to catalyze peptide ligation reactions on protein substrates in biological conflict contexts, probably between bacteriophages and their hosts.	147
408109	pfam18300	DUF5604	Domain of unknown function (DUF5604). This domain is often found in the N-terminal region of proteins carrying the SET domain (pfam00856), such as the SETDB1 protein present in Homo sapiens. SETDB1 is a histone methyltransferase that suppresses gene expression and modulates heterochromatin formation through H3K9me2/3.	58
408110	pfam18301	preATP-grasp_3	pre ATP-grasp 3 domain. This domain is found just before the N-terminal of the ATP grasp 3 domain (pfam02655). The domain is carried by species such as Azospirillum brasilense and Methylobacter tundripaludum.	76
375729	pfam18302	CPSase_C	Carbamoyl phosphate synthetase C-terminal domain. This is the C-terminal domain found after the MGS domain (pfam02142) in human carbamoyl phosphate synthetase. Carbamoyl phosphate synthetase catalyzes the first step of ammonia detoxification to urea.	14
408111	pfam18303	Saf_2TM	SAVED-fused 2TM effector domain. Predicted pore-forming effector domain directly fused to predicted SAVED sensor domain. Binding of a ligand via the SAVED sensor is predicted to activate the Saf-2TM and initiate a cell suicide response. Component of a class of conflict systems reliant on the production of second messenger nucleotide or nucleotide derivative.	152
375731	pfam18304	SabA_adhesion	SabA N-terminal extracellular adhesion domain. This is the N-terminal extracellular adhesion domain of Sialic acid binding adhesin (SabA) present in Helicobacter pylori. The N-terminal domain of SabA functions as a sugar-binding adhesion domain with conserved disulfide bonds. Notably, these amino acid residues are not only conserved among SabA orthologs but also between SabA and BabA.	299
408112	pfam18305	DNA_pol_A_exoN	3' to 5' exonuclease C-terminal domain. This domain is found just after the C-terminal region of the HRDC domain (pfam00570) in 3'-5' exonuclease proteins (pfam01612). The domain is carried by species such as Streptomyces griseoaurantiacus and Streptomyces albulus.	87
408113	pfam18306	LDcluster4	SLOG cluster4 family. Family in the SLOG superfamily, observed as a standalone domain with little informative genome context, although related families in the SLOG superfamily are predicted to function in diverse conflict contexts.	152
408114	pfam18307	Tfb2_C	Transcription factor Tfb2 (p52) C-terminal domain. This is the C-terminal domain of Transcription factor Tfb2 present in Saccharomyces cerevisiae. Tfb2 is referred to as p52 in humans. The interaction between p8-Tfb5 and p52-Tfb2 has a key role in the maintenance of the transcription factor TFIIH architecture and TFIIH's function in the nucleotide-excision repair (NER) pathway. The C-terminal domain of Tfb2 is thought to have a crucial role in DNA repair.	68
408115	pfam18308	GGA_N-GAT	GGA N-GAT domain. This domain is found in the N-terminal region of the GGA and Tom1 (GAT) domain in Golgi-localizing gamma-adaptin ARF-binding protein 1 (GGA1) present in Homo sapiens. The GAT domain is the key region in GGA that interacts with ARF. ARF plays a crucial role in docking adaptor proteins to membranes. This domain is referred to as N-GAT and it interacts extensively with ARF.	39
408116	pfam18309	Ago_PAZ	Argonaute PAZ domain. This PAZ domain is found in argonaute present in Thermus thermophilus. Argonaute has a central role in the RNA interference pathway by mediating the maturation of small interfering RNA (siRNA) through initial degradation of the passenger strand, followed by guide-strand-mediated sequence-specific cleavage of target mRNA. The nucleic-acid-binding channel is thought to be positioned between the PAZ and PIWI domain.	88
408117	pfam18310	DUF5605	Domain of unknown function (DUF5605). This domain is found in the C-terminal region of proteins carrying pfam16586 and pfam13204. The C-terminal domain is carried by species such as Bacteroides vulgatus.	73
408118	pfam18311	Rrp40_N	Exosome complex exonuclease Rrp40 N-terminal domain. This is the N-terminal domain of Rrp40 of the exosome complex present in Saccharomyces cerevisiae. The RNA exosome complex is responsible for degrading RNA molecules in the 3' to 5' direction. Rrp40 is a 'cap' protein and binds the RNase PH barrel on the opposite side from the S1/KH ring. The N-terminal domain of Rrp44 forms a long beta-hairpin that is wedged in between Rrp41-Rrp42 and approaches the N terminus of the cap protein Rrp4.	47
408119	pfam18312	ScsC_N	Copper resistance protein ScsC N-terminal domain. This is the N-terminal domain found in Copper resistance protein ScsC present in Proteus mirabilis. ScsC is a powerful disulfide isomerase that is able to refold and reactivate the scrambled disulfide form of the model substrate RNase A. The protein has a thioredoxin 4 domain (pfam13462) but, unlike other characterized proteins in this family, it is trimeric. The N-terminal domain is responsible for trimerization of ScsC which is needed for isomerase activity.	30
408120	pfam18313	TLP1_add_C	Thiolase-like protein type 1 additional C-terminal domain. This domain is found in thiolase-like protein type 1 (TLP1) present in Mycobacterium smegmatis. Thiolase enzymes are acetyl-coenzyme A acetyltransferases which convert two units of acetyl-CoA to acetoacetyl CoA in the mevalonate pathway. This domain is deemed an additional C-terminal region, much like the SCP2-thiolase present in mammals which has an additional C-terminal domain termed the sterol carrier protein-2 (SCP2). However, the additional C-terminal domain in TLP1 folds differently to the traditional SCP2-fold observed in mammalian SCP2-thiolase. The topology of the C-terminal domain of TLP1 is reminiscent of single strand nucleic acid binding proteins.	82
408121	pfam18314	FAS_I_H	Fatty acid synthase type I helical domain. This domain is found in the fatty acid synthase (FAS) complex present in species such as Mycobacterium smegmatis and Thermomyces lanuginosus. FAS is a homo-hexameric enzyme that catalyzes synthesis of fatty acid precursors of mycolic acids. This domain is composed of dimerization module 1 (DM1) and four-helix bundle (4HB), both of which are conserved parts of the acetyl transferase.	203
408122	pfam18315	VCH_CASS14	Integron cassette protein VCH_CASS1 chain. This domain is a chain that forms part of the integron cassette protein VCH_CASS14 present in Vibrio cholerae. In each monomer lies a deep binding pocket for small molecule substrates formed by helices alpha-1 and alpha-2 and residues from the central four strands of the beta-sheet. The pocket is extensively lined with hydrophobic side chains.	97
408123	pfam18316	S-l_SbsC_C	S-layer protein SbsC C-terminal domain. This domain is found in the crystalline bacterial cell-surface layer (S-layer) protein SbsC present in Geobacillus stearothermophilus. S-layers are a common feature of archaeal cell envelopes. SbsC is an oblique lattice forming protein. This domain, termed Domain 9, is located at the C-terminal region of SbsC. The C-terminal region comprises the self-assembly domain responsible for the formation of the crystalline array.	85
408124	pfam18317	SDH_C	Shikimate 5'-dehydrogenase C-terminal domain. This domain is found in the C-terminal region of Shikimate 5'-dehydrogenase (SDH) present in Methanocaldococcus jannaschii. SDH catalyses the NADPH-dependent reduction of 3-dehydroshikimate to shikimate in the shikimate pathway. The domain is found just after the C-terminal domain (pfam01488) which is responsible for NADP binding.	31
408125	pfam18318	Gln-synt_C-ter	Glutamine synthetase C-terminal domain. This domain is found in type III glutamine synthetase present in Bacteroides fragilis. Glutamine synthetases (GS) are large oligomeric enzymes that catalyze the condensation of ammonium and glutamate to form glutamine, the principal source of nitrogen for protein and nucleic acid synthesis. This domain is located in the C-terminal end of the protein.	118
408126	pfam18319	PriA_CRR	PriA DNA helicase Cys-rich region (CRR) domain. This is a cys-rich region (CRR) domain found in PriA DNA helicases. In bacteria, the replication restart process is orchestrated by the PriA DNA helicase, which identifies replication forks via structure-specific DNA binding and interactions with fork-associated ssDNA-binding proteins (SSBs). The CRR region which is embedded within the C-terminal helicase lobe has been identified to bind two Zn2+ ions. This 50-residue insertion forms a structure on the surface of the helicase core in which two Zn2+ ions are coordinated by invariant Cys residues. Biochemical experiments have shown that sequence changes to Zn2+-binding Cys residues in the PriA CRR can eliminate helicase, but not ATPase, activity and can block assembly of PriB onto DNA-bound PriA, implicating the CRR in multiple functions in PriA.	27
408127	pfam18320	Csc2	Csc2 Crispr. The Csc2 Crispr family of proteins forms a core RNA recognition motif-like domain, flanked by three peripheral insertion domains: a lid domain, a Zinc-binding domain and a helical domain. The CRISPR-Cas system is possibly a mechanism of defence against invading pathogens and plasmids that functions analogously to the RNA interference (RNAi) systems in eukaryotes.	298
408128	pfam18321	3HCDH_RFF	3-hydroxybutyryl-CoA dehydrogenase reduced Rossmann-fold domain. This domain is found in 3-hydroxybutyryl-CoA dehydrogenase present in E. coli. 3-hydroxybutyryl-CoA dehydrogenase catalyzes the second step in the biosynthesis of n-butanol from acetyl-CoA, in which acetoacetyl-CoA is reduced to 3-hydroxybutyryl-CoA. This domain is a reduced Rossmann-fold domain and, unlike the first Rossmann-fold domain, it is missing the catalytic residues and an NAD(H) binding cleft.	69
408129	pfam18322	CLIP_1	Serine protease Clip domain PPAF-2. This domain is found in Prophenoloxidase-activating factor (PPAF)-II present in the beetle Holotrichia diomphalia. PPAF-II is indispensable for the generation of the active phenoloxidase leading to melanization, a major defense mechanism of insects. This domain is the clip domain and it is thought to tightly associate with regions I-III of the serine protease-like (SPL) domain. The clip domain is a protein-interaction module that plays an essential role in the binding and activation of PO76s via its central cleft.	52
408130	pfam18323	CSN5_C	Cop9 signalosome subunit 5 C-terminal domain. The COP9 (Constitutive photomorphogenesis 9) signalosome (CSN), a large multiprotein complex that resembles the 19S lid of the 26S proteasome, plays a central role in the regulation of the E3-cullin RING ubiquitin ligases (CRLs). The catalytic activity of the CSN complex is carried by subunit 5 (CSN5), also known as c-Jun activation domain-binding protein-1 (Jab1). This entry is the C-terminal domain found in CSN5 proteins. CSN5, whose two C-terminal helices form an antiparallel hairpin, inserts its final C-terminal helix (helix II) into the central CSN6 framework at the core of the bundle. Deletion of the C-terminal helices has a pronounced effect on CSN integrity.	82
408131	pfam18324	TT1725	Hypothetical protein TT1725. This is the hypothetical protein TT1725 found in Thermus thermophilus HB8. The sequence is conserved in three predicted prokaryotic proteins with unknown functions, including Deinococcus radiodurans, Stigmatella aurantiaca, and Mycobacterium leprae. The presence of positively-charged residues in the alpha-1 helix suggests this region binds to a protein with a negatively charged region or to nucleic acids.	110
408132	pfam18325	Fas_alpha_ACP	Fatty acid synthase subunit alpha Acyl carrier domain. This is the acyl carrier domain (ACP) found in fatty acid synthase subunit alpha (FAS2) EC:2.3.1.86. The fungal type I fatty acid synthase (FAS) is a 2.6 MDa multienzyme complex, catalyzing all necessary steps for the synthesis of long acyl chains. To be catalytically competent, the FAS must be activated by a posttranslational modification of the central acyl carrier domain (ACP) by an intrinsic phosphopantetheine transferase (PPT).	162
375753	pfam18326	RFX5_N	RFX5 N-terminal domain. This is the N-terminal domain of regulatory factor X (RFX)-5 protein of the RFX complex present in Homo sapiens. The RFX complex is made up of RFX5, RFXAP and RFXB. The complex is involved in the regulation of the expression of the major histocompatibility complex class II (MHCII) gene products. These gene products are essential for the initiation and regulation of the mammalian immune response. The N-terminal domain of RFX5 is responsible for homodimerization of RFX5 which promotes folding of the C-terminal domain of RFXAP. The folding of RFXAP results in the formation of a potential binding site for RFXB to bind to the MHCII promoter.	59
408133	pfam18327	PRODH	Proline utilization A proline dehydrogenase N-terminal domain. This is the N-terminal domain found in Proline utilization A (PutA) proteins. Proline utilization A (PutA) is a flavoprotein that has mutually exclusive roles as a transcriptional repressor of the put regulon and a membrane-associated enzyme that catalyzes the oxidation of proline to glutamate. The N-terminal region carries the flavoenzyme proline dehydrogenase (PRODH) domain which catalyzes the 2-electron oxidation of proline with the concomitant reduction of a flavin cofactor.	48
408134	pfam18328	PfaD_N	Fatty acid synthase subunit PfaD N-terminal domain. This domain is found in the N-terminal region of PfaD, an enoyl reductase enzyme present in Bacillus subtilis. PfaD plays a role in the biosynthesis of polyunsaturated fatty acids. The domain is typically found just before the N-terminal region of a nitronate monooxygenase domain (pfam03060).	62
408135	pfam18329	SGBP_B_XBD	Surface glycan-binding protein B xyloglucan binding domain. This is the C-terminal domain found in the surface glycan-binding protein-B (SGBP-B) protein found in Bacteroides ovatus. SGBP-B is a cell-surface-localized, xyloglucan-specific binding protein. The C-terminal domain mediates xyloglucan binding. The domain displays similarity to the C-terminal beta-sandwich domain of many GH13 enzymes.	178
408136	pfam18330	Lig_C	Ligase Pab1020 C-terminal region. This is the C-terminal region of RNA ligase Pab1020 present in Pyrococcus abyssi. Pab1020 catalyzes the nucleotidylation of oligo-ribonucleotides in an ATP-dependent reaction. This region contains both a dimerization domain and a C-terminal domain.	125
408137	pfam18331	PKHD_C	PKHD-type hydroxylase C-terminal domain. This is the C-terminal domain found in PKHD-type hydroxylase enzymes. Family members are found mostly in Bacteria and carry the 2OG-Fe(II) oxygenase superfamily pfam13640.	43
408138	pfam18332	XRN1_D1	Exoribonuclease Xrn1 D1 domain. This domain can be found in 5' to 3' exoribonuclease 1 (XRN1) which belong to a family of conserved enzymes in eukaryotes and have important functions in transcription, RNA metabolism, and RNA interference. Xrn1 in fungi and animals is primarily cytosolic and is involved in degradation of decapped mRNAs, nonsense mediated decay, microRNA decay and is essential for proper development. The Xrn1 homolog in Drosophila, known as Pacman, is required for male fertility. This domain (D1) along with 3 other domains, make up a 510-residue segment following the conserved regions found in XRNs but they are only present in XRN1 and are absent in Rat1/XRN2. The amino acid sequences of these four domains contain an excess of basic residues, suggesting that these domains might help in binding the RNA substrate. Mutational studies carried out in D1 domain show that the mutant forms had dramatically reduced nuclease activity towards ssDNA substrate indicating that domain D1 is required for Xrn1 nuclease activity.	192
408139	pfam18333	ssDNA_DBD	Non-canonical single-stranded DNA-binding domain. This domain is found in ThermoDBP, a non-canonical single-stranded DNA-binding protein in Thermoproteales. Single-stranded DNA-binding proteins are needed for DNA metabolism, sequestering and protecting transiently formed ssDNA during DNA replication and recombination, detecting DNA damage and recruiting repair proteins. The outer edge of the ssDNA-binding cleft, formed by this domain, has a strongly positive electrostatic surface potential because of the conserved basic residues R49, K54, R65, R80, R86, R90, K97, and R112.	106
408140	pfam18334	XRN1_D2_D3	Exoribonuclease Xrn1 D2/D3 domain. This domain can be found in 5' to 3' exoribonuclease 1 (XRN1) which belong to a family of conserved enzymes in eukaryotes and have important functions in transcription, RNA metabolism, and RNA interference. Xrn1 in fungi and animals is primarily cytosolic, involved in degradation of decapped mRNAs, nonsense mediated decay, microRNA decay and is essential for proper development. The Xrn1 homolog in Drosophila, known as Pacman, is required for male fertility. This entry relates to domain 2 and 3 combined which can be found in the 510-residue C-terminal extension found in XRN1 and not in XRN2/Rat1. Domain D2 is formed by two stretches of Xrn1, residues 915-960 and 1134-1151. The presence of domain (D3) is suggested based on structure. This domain is formed by residues 979-1109, in the insert of domain D2. It is suggested that domains D2-D4 may help maintain domain D1 pfam18332 in the correct conformation, thereby indirectly stabilising the conformation of the N-terminal segment pfam03159.	87
408141	pfam18335	SH3_13	ATP-dependent RecD-like DNA helicase SH3 domain. This is an SH3 (SRC homology domain 3) domain found in RecD helicases (EC 3.6.4.12) that belong to the bacterial Superfamily 1B (SF1B). This superfamily of helicases translocates in a 5'-3' direction and is required for a range of cellular activities across all domains of life. Structural analysis indicates that the extension of the 5'-tail of the unwound DNA duplex induces a large conformational change in the RecD subunit, which is transferred through the RecC subunit to activate the nuclease domain of the RecB subunit. The process involves this SH3 domain, which binds to a region of the RecB subunit. Studies of RecD in E. coli also revealed that the SH3 domain interacts with the ssDNA tail in a location different to that normally occupied by a peptide in canonical eukaryotic SH3 domains, thus retaining the potential to bind peptide at the same time as the ssDNA tail.	65
408142	pfam18336	Tudor_FRX1	Fragile X mental retardation Tudor domain. This is the N-terminal Tudor domain (Tud1) found in Fragile X mental retardation syndrome-related protein 1 (Fxr1). The Tud1 domain forms a canonical Tudor barrel. It is usually found in tandem with Agenet domain pfam05641.	49
408143	pfam18337	Tudor_RapA	RapA N-terminal Tudor like domain. This is one of two Tudor-like domains found in the N-terminal region of RapA proteins. RapA is an abundant RNAP-associated protein of 110-kDa molecular weight with ATPase activity. It forms a stable complex with the RNAP core enzyme, but not with the holoenzyme. The ATPase activity of RapA increases upon its binding to RNAP. The N-terminal region of RapA contains two copies of a Tudor-like domain, both folded as a highly bent antiparallel beta-sheet. This fold is also found in transcription factor NusG, ribosomal protein L24, human SMN (survival of motor neuron) protein, mammalian DNA repair factor 53BP1, putative fission yeast DNA repair factor Crb2 and bacterial transcription-repair coupling factor known as Mfd. The functional roles of the N-terminal region homologs in these proteins suggest that the Tudor-like domains of RapA may interact with both nucleic acids and RNAP.	62
408144	pfam18338	BppL_N	Lower baseplate protein N-terminal domain. This domain is found in the N-terminal region of the receptor-binding protein of bacteriophage TP901-1, which infects Lactococcus lactis. The receptor-binding protein of phage TP901-1 is termed the lower baseplate protein (BppL) and is trimeric in nature. The N-terminal domain of BppL plugs into the upper baseplate protein (BppU).	25
408145	pfam18339	Tudor_1_RapA	RapA N-terminal Tudor like domain 1. This is one of two Tudor-like domains found in the N-terminal region of RapA proteins. RapA is an abundant RNAP-associated protein of 110-kDa molecular weight with ATPase activity. It forms a stable complex with the RNAP core enzyme, but not with the holoenzyme. The ATPase activity of RapA increases upon its binding to RNAP. The N-terminal region of RapA contains two copies of a Tudor-like domain, both folded as a highly bent antiparallel beta-sheet. This fold is also found in transcription factor NusG, ribosomal protein L24, human SMN (survival of motor neuron) protein, mammalian DNA repair factor 53BP1, putative fission yeast DNA repair factor Crb2 and bacterial transcription-repair coupling factor known as Mfd. The functional roles of the N-terminal region homologs in these proteins suggest that the Tudor-like domains of RapA may interact with both nucleic acids and RNAP.	51
408146	pfam18340	TraI_2B	DNA relaxase TraI 2B/2B-like domain. This is the 2B and 2B-like sub-domain found in TraI (EC:5.99.1.2) a relaxase of F-family plasmids. It contains four domains; a trans-esterase domain that executes the nicking and covalent attachment of the T-strand to the relaxase, a vestigial helicase domain (carrying the 2B/2B-like sub-domain) that operates as an ssDNA-binding domain, an active 5' to 3' helicase domain, and a C-terminal domain that functions as a recruitment platform for relaxosome components. The 2B sub-domains in TraI are formed by residues 625-773 in the vestigial helicase domain and residues 1255-1397 in the active helicase domain. The 2B/2B-like sub-domain interacts with ssDNA where it contributes to the surface area where ssDNA bind. In other words the ssDNA-binding site is located in a groove between the 2B and 2B-like parts of the sub-domain. The sub-domain parts appear to act as clamps holding the ssDNA in place, resulting in the ssDNA being completely surrounded by protein. In previous studies, the 2B/2B-like sub-domain of the TraI vestigial helicase domain has been identified as translocation signal A (TSA) since it contains sequences essential for the recruitment of TraI to the T4S system. Thus, the 2B/2B-like sub-domain plays two major roles in relaxase function: (1) interacting with the DNA and possibly promoting high processivity and (2) mediating recruitment of the relaxosome to the T4S system.	79
408147	pfam18341	PSA_CBD	PSA endolysin C-terminal cell wall binding domain. This is the C-terminal domain of bacteriophage PSA endolysin. The C-terminal domain is the cell wall-binding domain (CBD) which is composed of two structurally homologous subdomains. CBD comprises two copies of a beta-barrel-like fold, which are held together by means of swapped beta-strands. The observed structure of the CBD sub-domains from Listeriaphage endolysin (N-acetyl-muramoyl-l-alanine amidase) could be the result of either a gene duplication during evolution of the CBD or the pick-up of another functionally equivalent coding sequence, followed by swapping of the respective ancestral leading beta-strands.	51
408148	pfam18342	LytB_WW	Endo-beta-N-acetylglucosaminidase LytB WW domain. This domain can be found in endo-beta-N-acetylglucosaminidase LytB (EC 3.2.1.96) of S. pneumoniae and other gram positive bacteria. Comparative analysis revealed that the second all-beta module derived from the WW-like segments is structurally similar to the chitin binding domain of S. marcescens chitinase ChiB, implying a peptide binding function for this module.	66
408149	pfam18343	SH3_14	Dda helicase SH3 domain. This is a Src homology-3 (SH3) like beta-barrel domain which can be found in Dda enzyme. Dda is a phage T4 SF1B helicase. The Dda SH3 domain contains two insertions (compared to RecD2), a second beta-ribbon that is referred to as the hook and a beta-ribbon/two-helix substructure that is referred to as the tower. The tower region within the domain is rigidly connected to domain 2A in Dda and appears to be specifically designed for the task of supporting the extended pin. Hence, it is suggested that 2A and this SH3 domain move as one unit during the ATP-driven translocation of ssDNA while maintaining contact with the pin. In this scenario, the pin-tower interaction can be considered as an additional transmission site that serves to more efficiently couple the energy from ATP binding and hydrolysis to the unwinding of dsDNA.	134
408150	pfam18344	CBM32	Carbohydrate binding module family 32. This domain is found in GH84C present in Clostridium perfringens. GH84C is a beta-N-acetylglucosaminidase. This domain is a family 32 carbohydrate binding module (CBM) which preferentially recognizes the non-reducing terminus of N-acetyllactosamine.	64
408151	pfam18345	zf_CCCH_4	Zinc finger domain. This is a zinc finger domain found in Zinc finger CCCH-type with G patch domain-containing proteins such as ZIP. Functional studies indicate that ZIP specifically targets EGFR and represses its transcription, and that the zinc finger and the coiled-coil domains are central to that process.	19
408152	pfam18346	SH3_15	Mind bomb SH3 repeat domain. This is the REP domain of Mind bomb, which serves as the substrate recognition domain for a second, membrane-distal epitope of the Notch ligand (C-box). Although the first Mib repeat of REP may play a dominant role in ligand binding, the two repeats appear to cooperate in the engagement of the Jag1 tail. Mind bomb (Mib) proteins are large, multi-domain E3 ligases that promote ubiquitination of the cytoplasmic tails of Notch ligands. Structural and functional analyses show that Mib1 contains two independent substrate recognition domains that engage two distinct epitopes from the cytoplasmic tail of the ligand Jagged1, one in the intracellular membrane proximal region and the other near the C terminus. REP domains have a five-stranded anti-parallel twisted beta sheet topology similar to that of SH3 domains.	67
408153	pfam18347	DUF5606	Domain of unknown function (DUF5606). This is a domain of unknown function found at the N-terminal region of bacterial proteins.	46
408154	pfam18348	SH3_16	Bacterial dipeptidyl-peptidase Sh3 domain. This is the first of two N-terminal bacterial SH3 (SH3b) domains found in bacterial dipeptidyl-peptidases VI such as gamma-D-glutamyl-L-diamino acid endopeptidases. The first SH3b domain plays an important role in defining substrate specificity by contributing to the formation of the active site, such that only murein peptides with a free N-terminal alanine are allowed.	49
375776	pfam18349	Paz_1	PAZ domain. This is a Paz domain found in Argonaute proteins from Aquifex aeolicus bacteria. The PAZ core fold in Aquifex aeolicus bacteria Ago (Aa-Ago) proteins is closely related to the human hAgo1 PAZ domain. Structural and functional studies of Aa-Ago indicate that conformational rearrangement of the PAZ domain may be critical for the catalytic cycle of Argonaute and the RNA-induced silencing complex.	93
375777	pfam18350	SH3_17	Restriction endonuclease SH3 domain. This is an N-terminal beta-barrel domain found in the Hpy99I protomer region. Hpy99I is a type II restriction endonuclease (REase) found in the gastric pathogen Helicobacter pylori. The beta-barrel domain has the SH3-domain fold. Deletion of the beta-barrel domain drastically reduced the activity of Hpy99I.	53
408155	pfam18351	Ago_N_1	Fungal Argonaute N-terminal domain. The AGO (Argonaute) proteins have four domains: an N-terminal domain, the PAZ domain, the MID domain and the PIWI domain. This entry is for the N-terminal domain. The N-terminal domain of AGOs is the most variable domain. Compared with prokaryotic Argonautes, KpAGO (Kluyveromyces polysporus Argonaute) has numerous surface-exposed insertion segments, with a cluster of conserved insertions re-positioning the N domain, contributing to the formation of nucleic-acid-binding channel to enable full propagation of the 3' end of the guide RNA guide-target pairing. The guide strand is used by the RISC complex to specify interactions with target RNAs. If sequence complementarity between guide and target is extensive, AGO catalyses cleavage, resulting in slicing of the target RNA.	95
408156	pfam18352	Gp138_N	Phage protein Gp138 N-terminal domain. This domain is found in the N-terminal domain of gene product 138 (gp138) in an unidentified bacteriophage. Gp138 is thought to be involved in the process of opening the host cell membrane during infection. The domain has an OB-fold with an intramolecular disulfide bond between C114 and C120.	98
375780	pfam18353	PG_isomerase_N	Phosphoglucose isomerase N-terminal domain. This domain is found in the N-terminal region of glucose-6-phosphate isomerase-like protein, just before the phospho-glucose isomerase C-terminal SIS domain (pfam10432).	110
375781	pfam18354	SH3_18	CarS bacterial SH3 domain. This is an SH3 domain found in antirepressor proteins such as CarS from Myxococcus xanthus. The CarS antirepressor recognizes and neutralizes its cognate repressors to turn on a photo-inducible promoter. CarS physically interacts with the MerR-type winged-helix DNA-binding domain of these repressors, leading to activation of the carB operon. Structural studies of CarS from M. xanthus reveal a beta-barrel fold akin to that in SH3 domains. However, it diverges from the typical SH3 domain fold in the lengths and conformations of the connecting loops. Functional analysis reveals that the SH3 domain-like fold in the antirepressor CarS mimics operator DNA in sequestering the repressor DNA recognition helix to activate transcription.	85
375782	pfam18355	DUF5607	Domain of unknown function (DUF5607). 	65
408157	pfam18356	DUF5608	Domain of unknown function (DUF5608). 	56
408158	pfam18357	DtxR	Diphtheria toxin repressor SH3 domain. This is an SH3 domain which can be found in diphtheria toxin repressor (DtxR) proteins. DtxR from Corynebacterium diphtheriae regulates the expression of the gene on corynebacteriophages that encodes diphtheria toxin (DT). Other genes regulated by DtxR include those that encode proteins involved in siderophore-mediated iron uptake.	78
408159	pfam18358	Tudor_4	Histone methyltransferase Tudor domain. This is a Tudor domain found in histone-lysine N-methyltransferase SETDB1 proteins (EC:2.1.1.43), also known as Eggless in Drosophila. In Drosophila, SetdB1 (Egg) is important for oogenesis and the silencing of chromosome 4.	50
408160	pfam18359	TUDOR_5	Histone methyltransferase Tudor domain 1. This is the first TUDOR domain found in SETDB1 enzymes (EC:2.1.1.43) in Homo sapiens, also known as Eggless in Drosophila. In Drosophila, SetdB1 (Egg) is important for oogenesis and the silencing of chromosome 4. SET domain, bifurcated 1 (SETDB1) is a histone methyltransferase (HMT) that methylates lysine 9 on histone H3 (H3K9). The enzymatic activity of SETDB1, in association with MBD1-containing chromatin-associated factor 1 (MCAF1), converts H3K9me2 to H3K9me3 and represses subsequent transcription. SETDB1 is amplified in cancers such as melanoma and lung cancer, and increased expression of SETDB1 promotes tumorigenesis in a zebrafish melanoma model. In addition, SETDB1 is required for endogenous retrovirus silencing during early embryogenesis, inhibition of adipocyte differentiation, and differentiation of mesenchymal cells into osteoblasts. The tandem Tudor domains in the N-terminal region are involved in protein-protein interactions. The second Tudor domain is pfam18385.	53
408161	pfam18360	hnRNP_Q_AcD	Heterogeneous nuclear ribonucleoprotein Q acidic domain. This is an acidic sequence segment domain found in the splicing factor SYNCRIP (hnRNP Q) which is involved in viral replication, neural morphogenesis, modulation of circadian oscillation and the regulation of the cytidine deaminase APOBEC1. This domain is a self-folding globular domain with an all alpha-helix architecture with negatively charged surface areas. Additionally it contains a large hydrophobic cavity and a positively charged surface area as potential epitopes for inter-molecular interactions.	70
375787	pfam18361	ssDBP_DBD	Single stranded DNA-binding protein ss DNA binding domain. This domain is found in the N-terminal of ThermoDBP, a single stranded DNA binding protein found in Thermoproteus tenax. ThermoDBP binds specifically to ssDNA with low sequence specificity. This domain is responsible for ssDNA binding. Conserved motif 'LIYWIRSDR' is located at the C-terminal end of the domain and is thought to participate in ssDNA binding.	136
408162	pfam18362	THB	Tri-helix bundle domain. This domain can be found in the myosin-binding motif (m-domain) region present in myosin-binding protein C (MyBP-C). MyBP-C is a sarcomeric assembly protein necessary for the regulation of sarcomere structure and function. The MyBP-C family of proteins consists mainly of modules with immunoglobulin (Ig) or fibronectin folds. This domain exhibits a three-helix bundle fold and there is a known actin-binding motif, LK(R/K)XK positioned in the third helix (alpha3), similar to that found in villin and related proteins.	34
408163	pfam18363	PI_PP_I	Phosphoinositide phosphatase insertion domain. This domain is found in the effector protein SidP present in Legionella longbeachae. SidP functions as a Phosphoinositide-3-phosphatase specifically hydrolyzing Phosphoinositide(3)P, referred to as PI(3)P, and PI(3,5)P2. The domain is inserted into the N-terminal portion of the catalytic domain and is referred to as the appendage or insertion (I) domain.	94
408164	pfam18364	Molybdopterin_N	Molybdopterin oxidoreductase N-terminal domain. This is the N-terminal domain of pfam00384 found in a number of molybdopterin-containing oxidoreductases such as dimethyl sulfoxide/trimethylamine N-oxide reductase, also known as DMSO reductase (EC:1.7.2.3, EC:1.8.5.3).	41
408165	pfam18365	PI_PP_C	Phosphoinositide phosphatase C-terminal domain. This domain is found in the C-terminal region of effector protein SidP present in Legionella longbeachae. SidP functions as a PI-3-phosphatase specifically hydrolyzing PI(3)P and PI(3,5)P2. This C-terminal domain is rich with glutamate residues.	125
408166	pfam18366	zf_ZIC	Zic proteins zinc finger domain. This is the ZF1 (Zinc Finger 1) domain found in Zic family proteins found in Eukaryotes. In humans, there are five members of the Zic family that are involved in human congenital anomalies. One of them, ZIC3, causes X-linked heterotaxy (HTX1), which is a left-right axis disturbance that manifests as variable combinations of heart malformation, altered lung lobation, splenic abnormality and gastrointestinal malrotation. Zic family proteins contain multiple zinc finger domains (ZFD), which are generally composed of five tandemly repeated C2H2 zinc finger (ZF) motifs. Sequence comparison analysis reveals that this N-terminal ZF (ZF1) domain of the Zic zinc finger domains is unique in that it possesses more amino acid residues (6-38 amino acids) between the two cysteine residues of the C2H2 motif compared to Gli and Glis ZF1s or any of the other ZFs (ZF2-5) in the Gli/Glis/Zic superfamily of proteins. Mutations in cysteine 253 (C253S) or histidine 286 (H286R) in ZIC3 ZF1, which are found in heterotaxy patients, result in extranuclear localization of the mutant ZIC3 protein. Furthermore, mutations in the evolutionarily conserved amino acid residues (C253, W255, C268, H281 and H286) of ZF1 generally impair nuclear localization.	45
408167	pfam18367	Rv2175c_C	Rv2175c C-terminal domain of unknown function. This is the C-terminal domain of unknown function found in actinomycetes such as M. tuberculosis Rv2175c. Rv2175c has DNA binding activity and possesses a winged helix-turn-helix fold; furthermore, it has been identified as a substrate of the PknL kinase.	56
408168	pfam18368	Ig_GlcNase	Exo-beta-D-glucosaminidase Ig-fold domain. This domain can be found in the glycoside hydrolase family 2 subfamily of beta-glucosaminidases (EC:3.2.1.165) such as CsxA from Amycolatopsis orientalis, which has exo-beta-D-glucosaminidase (exo-chitosanase) activity. It has an immunoglobulin-like topology.	104
408169	pfam18369	PKS_DE	Polyketide synthase dimerisation element domain. This is the dimerisation element domain found in bacterial modular polyketide synthase ketoreductases. The dimerization element (DE) domain is N-terminal to the KR domain pfam08659. DE domain is necessary for KR function, presumably because the dimeric DE orients the KR domains for optimal activity within a module.	45
408170	pfam18370	RGI_lyase	Rhamnogalacturonan I lyases beta-sheet domain. This is the beta-sheet domain found in rhamnogalacturonan (RG) lyases, which are responsible for an initial cleavage of the RG type I (RG-I) region of plant cell wall pectin. Polysaccharide lyase family 11 carrying this domain, such as YesW (EC:4.2.2.23) and YesX (EC:4.2.2.24), cleave glycoside bonds between rhamnose and galacturonic acid residues in RG-I through a beta-elimination reaction. Other family members carrying this domain are hemagglutinin A, lysine gingipain (Kgp) and Chitinase C (EC:3.2.1.14).	86
408171	pfam18371	FAD_SOX	Flavin adenine dinucleotide (FAD)-dependent sulfhydryl oxidase. This is a flavin adenine dinucleotide (FAD) binding domain found in Quiescin sulfhydryl oxidases (QSOX) (EC:1.8.3.2). QSOX is a multi-domain disulfide catalyst that is localized primarily to the Golgi apparatus and secreted fluids and has attracted attention due to its over-production in tumors. Structural studies indicate that the closure of the Trx1 domain over the FAD-binding site may enhance the active-site chemistry for disulfide formation.	104
408172	pfam18372	I-EGF_1	Integrin beta epidermal growth factor like domain 1. This is the I-EGF 1 domain found in several integrin betas such as integrin beta 1-7. Structural analysis reveals integrin epidermal growth factor-like (I-EGF) domains 1 and 2. EGF1 lacks one disulfide (C2-C4) relative to the integrin EGF 2, 3, and 4 domains; this allows the C-terminal end of EGF1 to flex remarkably relative to its N-terminal end.	29
408173	pfam18373	Spectrin_like	Spectrin like domain. Desmoplakin (DP) is an integral part of desmosomes, where it links desmosomal cadherins to the intermediate filaments. The N-terminal region of DP contains a plakin domain common to members of the plakin family. Plakin domains contain multiple copies of spectrin repeats (SRs) pfam00435. Spectrin repeats (SRs) consist of three alpha-helices (A, B, and C) that form an antiparallel triple-helical bundle. This entry describes SR6 which has a divergent structure relative to the other SRs. SR6 shows significant deviations in helices A and B where they are significantly shorter than in other repeats. Structural comparison revealed that SR6 is more similar to other three-helix-bundle proteins, including target of Myb1 and the syntaxin Habc domain, than to other SR proteins. Due to these differences with other spectrin repeats, this region is termed spectrin-like repeat.	78
408174	pfam18374	Enolase_like_N	Enolase N-terminal domain-like. This is the N-terminal domain found in o-succinylbenzoate synthase (OSBS) enzymes (EC:4.2.1.113). Like other members of the enolase superfamily, OSBS enzymes are composed of a C-terminal catalytic (beta/alpha)7beta-barrel domain pfam00113 and an N-terminal capping domain with an alpha+beta fold that is found in the enolase superfamily. This domain is different from other enolase super family N-terminal domains such as pfam03952. This actino-bacterial N-terminal domain lacks the prototypical first two helices of the enolase capping domains. Structural analysis of T. fusca OSBS reveals that this is compensated for with an extra helix appended to the C-terminus.	51
408175	pfam18375	CDH1_2_SANT_HL1	CDH1/2 SANT-Helical linker 1. CHD1 is an ATP-dependent chromatin-remodelling factor and plays an important role in regulating nucleosome assembly and mobilization. CHD1 consists of a double chromodomain, an SNF2-related ATPase domain, and a C-terminal DNA-binding domain. The DNA-binding domain contains SANT (Swi3, Ada2, N-CoR, TFIIIB) and SLIDE (SANT-like ISWI) domains in its C-terminal region. SANT domains are structurally related to Myb-like domains, which are common motifs found in chromatin interacting proteins. Deletion of individual SANT or SLIDE domains in CHD1 does not significantly affect nucleosome binding, but combined deletion of both domains severely compromises binding, suggesting that the SANT-SLIDE motif recognizes DNA/nucleosomes as a single cooperative unit. SANT sequences of Chd1 proteins are the most distantly related group of sequences relative to other SANT/Myb sequences, and are more diverse than other SANT proteins. The SANT and SLIDE regions are well conserved in both Chd1 and ISWI (imitation switch) remodelling enzymes. This domain comprises the SANT region and the helical linker region 1 (HL1).	90
408176	pfam18376	MDD_C	Mevalonate 5-diphosphate decarboxylase C-terminal domain. Mevalonate diphosphate decarboxylase (EC:4.1.1.33) catalyzes the ATP dependent decarboxylation of mevalonate 5-diphosphate (MVAPP) to form isopentenyl 5-diphosphate. The reaction is required for production of polyisoprenoids and sterols from acetyl-CoA. This entry represents the C-terminal domain of the mevalonate 5-diphosphate decarboxylase enzyme which is a member of the GHMP kinase superfamily.	186
408177	pfam18377	FERM_F2	FERM F2 acyl-CoA binding protein-like domain. This is an F2 lobe domain consisting of an acyl-CoA binding protein fold found in FERM region of Jak-family tyrosine kinases. Multidomain JAK molecules interact with receptors through their FERM and SH2-like domains, triggering a series of phosphorylation events, resulting in the activation of their kinase domains. Overall, the FERM region maintains the typical three-lobed architecture, with an F1 lobe consisting of a ubiquitin-like fold, an F2 lobe consisting of an acyl-CoA binding protein fold, and an F3 lobe consisting of a pleckstrin-homology (PH) fold. JAK1 FERM-F2 domain has been shown to act as the interaction site for the IFNLR1 box1 motif (PxxLxF) of class II cytokine receptors which is essential for kinase activation.	131
408178	pfam18378	Nup188_C	Nuclear pore protein NUP188 C-terminal domain. This is the C-terminal domain of Nup188. It is a right-handed arc-shaped superhelical structure built from 19 helices that form 6 helical repeats, which are stacked in regular order. The first helical pair (alpha1 and alpha2) forms a HEAT repeat followed by 5 ARM repeats.	371
408179	pfam18379	FERM_F1	FERM F1 ubiquitin-like domain. This is an F1 lobe domain consisting of a ubiquitin like fold found in FERM region of Jak-family tyrosine kinases. Multidomain JAK molecules interact with receptors through their FERM and SH2-like domains, triggering a series of phosphorylation events, resulting in the activation of their kinase domains. Overall, the FERM region maintains the typical three-lobed architecture, with an F1 lobe consisting of a ubiquitin-like fold, an F2 lobe consisting of an acyl-CoA binding protein fold, and an F3 lobe consisting of a pleckstrin-homology (PH) fold.	96
408180	pfam18380	GEN1_C	Holliday junction resolvase Gen1 C-terminal domain. This is the C-terminal domain found in GEN1 resolvase. It is composed of three-strand antiparallel beta sheets and four alpha helices. GEN1 protein, a member of the XPG/Rad2 family of structure-selective endonucleases, is specialized for the cleavage of Holliday junction recombination intermediates. Structural comparison indicates that the C-terminal domain is similar to a series of chromobox homology proteins. Functional analysis indicates that the chromodomain provides an additional DNA binding site necessary for efficient HJ cleavage, and its truncation severely hampers GEN1's catalytic activity.	104
408181	pfam18381	YcaO_C	YcaO cyclodehydratase C-terminal domain. This is the proline-rich C-terminal domain found in ribosomal protein S12 methylthiotransferase accessory factor YcaO. It has been shown to be involved in both C protein recognition and cyclodehydration. The C-terminal domain resembles a tetratricopeptide repeat that mediates dimerization.	172
408182	pfam18382	Formin_GBD_N	Formin N-terminal GTPase-binding domain. This is the N-terminal GTPase-binding domain (GBD) of formins also known as formin homology domain-containing proteins (FHOD) pfam02181. This GBD is recruited by Rac and Ras GTPases in cells and plays an essential role for FHOD1-mediated actin remodelling and transcriptional activation, localizes to specific GTPases in cells, and binds to GTPases in vitro. It exhibits structural similarity to the ubiquitin superfold as found, for example, in the Ras-binding domains of c-Raf1 or PI3 kinase, but contains an unusual loop that inserts into the first FH3 repeat.	99
408183	pfam18383	IFT81_CH	Intraflagellar transport 81 calponin homology domain. This is the N-terminal domain found in IFT81 proteins. Crystal structure analysis revealed that IFT81-N adopts the fold of a calponin homology (CH) domain with structural similarity to the kinetochore complex component NDC80 with microtubule (MT)-binding properties. Functional analysis shows that IFT74 and IFT81 form a tubulin-binding module required for ciliogenesis. It is suggested that IFT81-N binds the globular domain of tubulin to provide specificity, and IFT74-N recognizes the beta-tubulin tail to increase affinity.	123
375810	pfam18384	zf_CCCH_5	Unkempt Zinc finger domain 1 (Znf1). This is the CCCH zinc finger 1 domain found in the Unkempt N-terminal region. Unkempt is an evolutionarily conserved RNA-binding protein that regulates translation of its target genes and is required for the establishment of the early bipolar neuronal morphology. It carries six CCCH zinc fingers (ZnFs) forming two compact clusters, ZnF1-3 and ZnF4-6, that recognize distinct trinucleotide RNA substrates. These clusters recognize an unexpectedly short stretch of RNA sequence (only three consecutive ribonucleotides) with a varying degree of specificity. ZnF1-3 binds to the UUA motif of RNA substrates.	40
408184	pfam18385	Tiam_CC_Ex	T-lymphoma invasion and metastasis CC-Ex domain. This is the CC and Ex subdomains found in PH-CC-Ex globular domain from Tiam1 and Tiam2 proteins (T-lymphoma invasion and metastasis). The CC subdomain forms an antiparallel coiled coil with two long alpha-helices, together with the C-terminal Ex subdomain they form a small globular domain comprising three alpha-helices. The CC subdomain of the Tiam2 PHCCEx domain follows the C-terminal alpha1 helix of the PH pfam00169 subdomain through a four-residue linker.	98
408185	pfam18386	ROQ_II	Roquin II domain. The ROQ domain is composed of three subdomains, I, II and III. This entry describes the second domain, ROQ II. Structural analysis reveals similarity of domain II to the helix-turn-helix (HTH) fold. Mutagenesis and biochemical studies show that the HTH fold in domain II contributes to binding dsRNA at the 5' arm.	56
408186	pfam18387	zf_C2H2_ZHX	Zinc-fingers and homeoboxes C2H2 finger domain. This is a C2H2 zinc-finger domain found in ZHX proteins such as ZHX1. ZHXs are multidomain proteins comprising two C2H2 zinc finger motifs and five homeodomains. Both homeodomains and zinc fingers are short protein modules involved in protein-DNA and/or protein-protein interactions; they are frequently associated with roles in transcriptional regulation. All members of the ZHX family are reported to be able to form both homo- and heterodimers via the region containing homeodomain 1. ZHX1 is a transcriptional repressor which is ubiquitously expressed. It interacts with nuclear factor Y subunit A (NFYA) and DNA methyl transferase 3B (DNMT3B) for its repression activity. Changes in expression profiles of the rat ZHX1 ortholog have been associated with glomerular disease. In addition to the five homeodomains, ZHX1 contains two N-terminal C2H2 zinc fingers; it forms homodimers via a homeodomain and can also form heterodimers with ZHX3.	53
408187	pfam18388	Atg29_N	Atg29 N-terminal domain. This is the N-terminal domain found in fungal Atg proteins such as Atg29. In yeast, the induction of autophagy begins at a single perivacuolar site that is proximal to the vacuole, called the phagophore assembly site (PAS). Atg17-Atg29-Atg31 complex (Atg1 complex) formation is a prerequisite for PAS assembly. Functional analysis indicates that the N-terminal half of Atg29 can bind Atg31.	54
408188	pfam18389	TrmO_C	TrmO C-terminal domain. This domain is found at the C-terminus of TrmO tRNA methyltransferase proteins. This domain has a RelE fold.	65
408189	pfam18390	GlgX_C	Glycogen debranching enzyme C-terminal domain. This is the C-terminal domain of the glycogen debranching enzyme GlgX. GlgX hydrolyzes alpha-1,6-glycosidic linkages of phosphorylase-limit dextrin containing only three or four glucose subunits produced by glycogen phosphorylase. Sequence analysis suggests that GlgX is a debranching enzyme belonging to the glycoside hydrolase GH-13 family in the CAZy database.	85
408190	pfam18391	CHIP_TPR_N	CHIP N-terminal tetratricopeptide repeat domain. This is the N-terminal tetratricopeptide repeat (TPR) domain found in C terminus of Hsp70 interacting proteins (CHIP). The TPR domain of CHIP binds directly to EEVD motifs located at the C termini of Hsc/Hsp70 and Hsp90.	83
408191	pfam18392	CSN7a_helixI	COP9 signalosome complex subunit 7a helix I domain. This is the C-terminal helix I domain found in COP9 signalosome complex subunit 7a. The helix from CSN7 (helix I) contacts CSN6 helices I and II at the base of the bundle, nearest the PCI ring.	50
408192	pfam18393	MotY_N	MotY N-terminal domain. The bacterial flagellar motor is a rotary motor complex composed of various proteins. MotX and MotY are essential for the Na+-driven flagellar motor motility of Vibrio, Shewanella and Aeromonas species. MotY is the main component for T-ring formation, and absence of MotY completely disrupts T-ring formation. This is the N-terminal domain of MotY which is shown to be essential for motility and responsible for the interaction with both MotX and the basal body. Functional analysis suggests that MotY-N connects the basal body to MotX and that the PomA/PomB complex associates with MotX to form the functional stator complex around the rotor. MotY-N alone does not associate strongly with the basal body, but the partial T-ring structure made of the MotY-N/MotX complex is sufficient to allow at least a few PomA/PomB stator complexes to be incorporated into the motor.	146
408193	pfam18394	TBK1_CCD1	TANK-binding kinase 1 coiled-coil domain 1. This is a coiled-coil domain found in TANK-binding kinase 1 (TBK1); it comprises one of two coiled-coil domains found in the scaffold dimerization region. TBK1 is a serine/threonine kinase and a noncanonical member of the IKK family implicated in diverse cellular functions, including innate immune response as well as tumorigenesis and development. Deletion of the coiled-coil 1 region in TBK1 leads to a severe impairment in TBK1 function even upon over-expression.	256
408194	pfam18395	Cas3_C	Cas3 C-terminal domain. This is the C-terminal domain of Cas3 proteins. The C-terminal domain (CTD) is shown to completely wrap ssDNA inside the helicase. Deletion of the CTD (aa 819-924) reduced CRISPR interference. It is suggested that the CTD regulates the N-terminal HD nuclease activity by functioning as a substrate filter.	107
408195	pfam18396	TBK1_ULD	TANK binding kinase 1 ubiquitin-like domain. This is the ubiquitin-like domain (ULD) found in TANK-binding kinase 1 (TBK1). TBK1 is a serine/threonine kinase and a noncanonical member of the IKK family implicated in diverse cellular functions, including innate immune response as well as tumorigenesis and development. It has been reported that the ULD of TBK1 regulates kinase activity, playing an important role in signaling and mediating interactions with other molecules in the IFN pathway. Deletion of ULD indicates that it is required for the kinase domain to form an enzymatically active conformation. TBK1 ULD has a ubiquitin-like structure and an Ile44 hydrophobic patch, which is conserved among ULDs and IKK and IKK-related proteins. This hydrophobic patch is involved in ULD-SDD interactions in TBK1 and other IKK and IKK-related proteins.	88
408196	pfam18397	IKBKB_SDD	IQBAL scaffold dimerization domain. This is the C-terminal scaffold dimerization domain (SDD) found in inhibitor of nuclear factor kappa-B kinase subunit beta IKBKB (EC:2.7.11.10). IKK2, also known as IKBKB, is one of the core components of IKB kinases (IKK). IKB kinase (IKK) is an enzyme that quickly becomes active in response to diverse stresses on a cell. The SDD consists primarily of two long alpha-helices.	275
408197	pfam18398	CLIP_SPH_mas	Clip-domain serine protease homolog masquerade. The clip domain is a structural/regulatory unit in many arthropod serine proteases. The clip domain super-family also includes serine protease homologs (SPHs). This entry describes clip domains in the SPHs (CLIP subfamily A), which belong to group-3. SPHs usually carry between 1 and 5 clip domains. One of the most prominent family members is masquerade (mas). Deletion in Drosophila models leads to defects in somatic muscle attachment and in the formation of the nervous system during embryogenesis.	33
408198	pfam18399	CLIP_SPH_Scar	Clip-domain serine protease homolog Scarface. The clip domain is a structural/regulatory unit in many arthropod serine proteases. The clip domain super-family also includes serine protease homologs (SPHs). This entry describes clip domains in the SPHs (CLIP subfamily A), which belong to group-3. SPHs usually carry between 1 and 5 clip domains. The most prominent family members carrying this clip domain are the Scarface proteins in Drosophila, which bear an inactive catalytic site, representing a subgroup of serine protease homologues (SPH). Loss-of-function induces defects in JNK-controlled morphogenetic events such as embryonic dorsal closure and adult male terminalia rotation.	66
408199	pfam18400	Thioredoxin_12	Thioredoxin-like domain. This is one of four TRXL (thioredoxin-like) domains found in UDP-glucose:glycoprotein glucosyltransferase (UGGT).	185
408200	pfam18401	Thioredoxin_13	Thioredoxin-like domain. This is the second out of four TRXL (thioredoxin-like) domains found in UDP-glucose:glycoprotein glucosyltransferase (UGGT).	136
408201	pfam18402	Thioredoxin_14	Thioredoxin-like domain. This is the third out of four TRXL (thioredoxin-like) domains found in UDP-glucose:glycoprotein glucosyltransferase (UGGT).	248
408202	pfam18403	Thioredoxin_15	Thioredoxin-like domain. This is the fourth TRXL (thioredoxin-like) domain found in UDP-glucose:glycoprotein glucosyltransferase (UGGT).	204
408203	pfam18404	Glyco_transf_24	Glucosyltransferase 24. This is the catalytic domain found in UDP-glucose:glycoprotein glucosyltransferase (UGGT). This domain belongs to glucosyltransferase 24 family (GT24) A-type domain. The GT domain displays the expected glycosyltransferase type A (GT-A) fold.	268
408204	pfam18405	Serine_protease	Gammaproteobacterial serine protease. This family includes serine proteases such as L. pneumophila effector Lpg1137. Lpg1137 is a serine protease that targets the mitochondria-associated ER membrane (MAM) and degrades STX17 (syntaxin 17), a SNARE implicated in macroautophagy/autophagy as well as mitochondria dynamics and membrane trafficking in fed cells. Lpg1137 has a sequence (-Gly-Leu-Ser68-Gly-Gly-) that matches the consensus sequence for the active site of serine proteases (Gly-X-Ser-X-Gly/Ala, where X is any residue). It exhibits proteolytic activity toward STX17 in vitro, whereas an active site mutant in which Ser68 is replaced by Ala does not. Expressed Lpg1137 localizes to the MAM and mitochondria, in addition to the cytosol, and binds to STX17.	283
408205	pfam18406	DUF1281_C	Ferredoxin-like domain in Api92-like protein. This domain has a ferredoxin like fold. It is often found to the C terminus of pfam06924.	87
408206	pfam18407	GNAT_like	GCN5-related N-acetyltransferase like domain. This is a domain with a GCN5-related N-acetyltransferase (GNAT) fold which can be found in Rv1692 phosphatases. Crystal structure of Rv1692 indicates that this C-terminal extension, which is absent in other characterized HADSF members, resembles a small GCN5-related N-acetyltransferase (GNAT) fold. Furthermore, it is fused to the HADSF catalytic domain pfam13242. Functional studies indicate that this GNAT region is not likely to be involved in acetyl group transfer using AcCoA and SucCoA; it could nonetheless be a regulatory domain. Furthermore, it is suggested that this GNAT domain is required for the solubility of the HADSF fold of Rv1692 and is potentially needed for the structural integrity of this enzyme.	60
408207	pfam18408	zf_Hakai	C2H2 Hakai zinc finger domain. This is the C2H2 zinc finger domain found in E3 ubiquitin ligase Hakai. Hakai targets tyrosine-phosphorylated E-cadherin. It carries a Tyr(P)-binding domain, coined the HYB domain for Hakai phosphotyrosine (Tyr(P)) binding. HYB domain structure illustrates that it forms a zinc-coordinated homodimer in an antiparallel, intertwined configuration, utilizing residues from the Tyr(P)-binding region of two Hakai monomers. The C-terminal region of the HYB domain, which harbors the atypical zinc-coordination motif and key residues involved in the Tyr(P) interaction, plays an important role in the dimerization observed in the HYB domain.	32
408208	pfam18409	Plk4_PB2	Polo-like Kinase 4 Polo Box 2. This Polo box (PB) domain is found in Polo-like kinase 4 (Plk4) present in Drosophila melanogaster. Plk4 is a conserved component in the duplication pathway of centrioles which is needed to prevent chromosomal instability. Plk4 localizes to centrioles in M/G1. Structural analysis reveals two tandem, homodimerized polo boxes, PB1-PB2, that form a winged architecture. This domain is PB2; together with PB1 (pfam18190), it is required for binding the centriolar protein Asterless (Asl) as well as for robust centriole targeting. In other words, the PB1-PB2 cassette collectively binds Asl and affords robust centriole localization, optimally positioning the kinase domain for trans-autophosphorylation.	109
408209	pfam18410	BTHB	Basic tilted helix bundle domain. This domain is found on the N-terminal region of FKBPs such as FKBP25 and in the core region of E3 ubiquitin ligase HectD1. It adopts a compact 5-helix bundle, hence termed BTHB (Basic Tilted Helix Bundle) domain. In FKBP25, it has been suggested to have a role in regulating the association state of nucleosomes by interacting with nucleolin. Moreover, this basic domain in FKBP25 forms alternative complexes with other chromatin-related proteins, such as the HDAC1, HDAC2, and the transcriptional regulator YY1, the DNA binding activity of which is enhanced on binding FKBP25. Structural analysis of this fold suggests that the DNA binding properties of FKBP25 and HectD1 are presented by the conserved basic region.	72
408210	pfam18411	Annexin_like	Annexin-like domain. This annexin-like domain can be found in astrotactin 2 (Astn-2), an integral membrane perforin-like protein linked to the planar cell polarity pathway in hair cells. The annexin-like domain is closest in fold to repeat three of human annexin V and similarly binds calcium, yet shares no sequence homology with it. Notably, this ASTN-2 annexin-like domain is closer in structure to human annexin repeat 3 than human annexin repeat 3 is to repeat 1. Annexin-like domains are known for their capacity to remodel membranes, triggered by calcium binding, and have also been suggested to be involved in the formation of pores in membranes; both are possible biological roles of the ASTN-2 annexin-like domain.	93
408211	pfam18412	Wza_C	Outer-membrane lipoprotein Wza C-terminal domain. This is the C-terminal domain found in Wza, an integral outer membrane lipoprotein, which is essential for group 1 capsule export in Escherichia coli. The domain is exposed on the cell surface and is suggested to mimic antimicrobial peptide pore formation.	30
408212	pfam18413	Neuraminidase	Neuraminidase-like domain. This is a neuraminidase-like domain, which is structurally homologous to neuraminidases. It can be found in the TcA subunit in tripartite Tc toxin complexes of bacterial pathogens. Functional analysis suggests that the neuraminidase-like domain acts as an electrostatic lock that opens at high or low pH values.	171
375840	pfam18414	zf_C2H2_10	C2H2 type zinc-finger. This is a C2H2 zinc finger domain which can be found in optineurin (optic neuropathy inducing protein) and NF-kappa-B essential modulator (NEMO); furthermore, it can be found in kinase TBK1, a member of the IKK (inhibitor of nuclear factor kappa-B kinase) family. The C-terminal region, which carries the zinc finger domain, constitutes the regulatory domain of NEMO, as it receives the activation signal from upstream molecules, and subsequently transmits this activation to the kinases bound to the N-terminal domain. The isolated NEMO zinc finger is thought to be involved in protein-protein rather than protein-DNA interaction.	26
408213	pfam18415	HKR_ArcB_TM	Histidine kinase receptor ArcB trans-membrane domain. Histidine kinase receptors (HKRs) are part of a two-component system, in which an HKR in the bacterial inner membrane transmits a signal to a response regulator located in the cytoplasm. This is a trans-membrane domain (TM) found in ArcB (class 2, aerobic respiratory control sensor). ArcB has two TM helices connected by a short periplasmic loop. TM domain structures suggest a loose helical packing which provides an inherent flexibility in the TM domains, and that this is perhaps essential to the mechanism of signal transduction across the membrane.	75
408214	pfam18416	GbpA_2	N-acetylglucosamine binding protein domain 2. This domain can be found in N-acetylglucosamine binding protein (GbpA) from Vibrio cholerae, a bacterial pathogen that colonizes the chitinous exoskeleton of zooplankton as well as the human gastrointestinal tract. GbpA binds to GlcNAc oligosaccharides. Structural comparison shows that there are distant structural similarities between domain 2 of GbpA and the beta-domain of the flagellin protein p5. It is suggested that this domain interacts with the bacterial surface, and functions to project an alginate binding domain of the protein from the cell surface.	102
408215	pfam18417	LodA_C	L-lysine epsilon oxidase C-terminal domain. This is the C-terminal domain of L-Lysine epsilon-oxidase (LodA, EC 1.4.3.20), an enzyme which catalyses the oxidative deamination of free L-lysine into L-2-aminoadipate 6-semialdehyde, ammonia and hydrogen peroxide.	144
408216	pfam18418	AnkUBD	Ankyrin ubiquitin-binding domain. This is an Ankyrin repeat domain found in TRABID (also known as Ubiquitin thioesterase ZRANB1) (EC:3.4.19.12). In TRABID, the first ankyrin repeat spans residues 260-290 and is connected to the second repeat residues 313-340 by a long linker that packs against what would correspond to the concave surface in an extended ankyrin-repeat structure. Ankyrin-repeat domains mediate protein interactions through a variety of surfaces. The ankyrin domain of TRABID interacts with ubiquitin, hence it is referred to as the ankyrin ubiquitin-binding domain, or AnkUBD.	96
408217	pfam18419	ATP-grasp_6	ATP-grasp-like domain. Glutathione biosynthesis is achieved in most organisms via a conserved two-step approach relying on the capacity of two independent and unrelated ligases to perform peptide synthesis coupled to ATP hydrolysis. In a first and rate-limiting step, gamma-glutamylcysteine ligase (gamma-ECL) (or GshA; EC:6.3.2.2) uses l-glutamate and l-cysteine to form gamma-glutamylcysteine (gamma-EC), which, in a second step, is condensed with glycine to glutathione by glutathione synthetase (GS) (or GshB; EC:6.3.2.3). However, several pathogenic and free-living bacteria carry out glutathione biosynthesis based on a single enzyme that catalyzes both the gamma-ECL and the GS reactions. Such bifunctional glutathione-synthesizing enzymes have been termed gamma-GCS-GS or GshF. Hybrid GshF contains a typical gamma-proteobacterial gamma-ECL fused to an ATP-grasp-like domain. The ATP-grasp-like module is responsible for the ensuing formation of glutathione from gamma-glutamylcysteine and glycine. The ATP-grasp-like domain has an antiparallel beta-sheet in the GshF structures in contrast to all structurally characterized members of the ATP-grasp superfamily.	54
375846	pfam18420	CSN4_RPN5_eIF3a	CSN4/RPN5/eIF3a helix turn helix domain. Cullin-RING E3 ubiquitin ligases (CRLs) are regulated by the eight-subunit COP9 signalosome (CSN). Enzymatically, CSN functions as an isopeptidase that removes the ubiquitin-like activator NEDD8 from CRLs, but it can also bind deneddylated CRLs and maintain them in an inactive state. The CSN subunits CSN1, CSN2, CSN3, CSN4, CSN7 and CSN8, share a common domain composition: an N-terminal array of tandem alpha-helical tetratricopeptide/-like repeats, a 34 residue motif, followed by a PCI domain, which encompasses a WH subdomain, a linker, and one or two alpha-helices at the C-terminus. This entry describes the C-terminal helices found on CSN4. The two helices from CSN4 (helices I and II) form a brace roughly perpendicular to the bundle axis in contact with the three C-terminal helices of CSN6. CSN5, whose two C-terminal helices form an antiparallel hairpin, inserts its final C-terminal helix (helix II) into the central CSN6 framework at the core of the bundle. Both CSN1 and CSN4 are dependent on the presence of their C-terminal helix (CSN1 isoform-2 residues: 466-527; and CSN4: 364-406) for integration into CSN. COP9 signalosome shares common architecture with the 26S proteasome lid and eIF3 where the 19S lid subunit RPN5 and the eIF3 core subunit eIF3a share significant structural similarity with CSN4.	42
408218	pfam18421	Peptidase_M23_N	Peptidase family M23 N-terminal domain. This is the N-terminal domain of Peptidase M23 pfam01551 mostly found in proteobacteria.	73
408219	pfam18422	TNFR_16_TM	Tumor necrosis factor receptor member 16 trans-membrane domain. This is the helical trans-membrane domain found in tumor necrosis factor receptor superfamily member 16 (also known as p75 neurotrophin receptor, and nerve growth factor receptor-NGFR). p75 performs prominent biological functions such as the induction of cell death, and it demonstrates several other activities, like survival, axonal growth, and cell migration. The trans-membrane (TM) domain of p75 stabilizes the receptor dimers through a disulfide bond, essential for NGF signalling. Structural and mutational analysis indicates that Cys257 plays the key role in this stabilisation process. Furthermore, although the p75-C257A mutant is still capable of forming dimers and binding to NGF, it is unable to transduce the signals triggered by NGF binding in some cell signalling paradigms.	38
408220	pfam18423	zf_CopZ	Zinc binding domain. This is an N-terminal domain containing a mononuclear metal center for zinc binding found in copper chaperone CopZ proteins.	62
408221	pfam18424	a_DG1_N2	Alpha-Dystroglycan N-terminal domain 2. This is the second N-terminal domain found in alpha-Dystroglycan (DG). The murine skeletal muscle N-terminal alpha-DG region, contains two autonomous domains; the first identified as an Ig-like and the second resembling ribosomal RNA-binding proteins. This domain is similar to the small subunit ribosomal protein S6 of Thermus thermophilus (S6 domain). It is suggested that the S6 domain may be of functional relevance for LARGE (like-acetylglucosaminyltransferase) recognition along the alpha-DG maturation pathway.	123
408222	pfam18425	CspB_prodomain	Csp protease B prodomain. Csp proteases (Csps) and the subtilase protease family Subtilases are serine proteases that contain a catalytic triad in the order of Asp, His and Ser. Structure analysis reveals that Csps are subtilisin-like proteases with two distinctive functional features: a central jellyroll domain and a retained prodomain. The prodomain adopts a similar fold to the prodomains of related subtilisin-like proteases with the C-terminal region extending deep into the catalytic cleft. However, unlike the majority of subtilisin-like proteases, the prodomain stays bound to the subtilase domain via a network of interactions that result in tighter prodomain binding relative to other subtilases. Finally the prodomain acts as both an intramolecular chaperone and an inhibitor of CspB protease activity.	89
408223	pfam18426	Tli4_C	Tle cognate immunity protein 4 C-terminal domain. T6SS bacteria employ toxic effectors to inhibit rival cells and concurrently use effector cognate immunity proteins to protect their sibling cells. The effector and immunity pairs (E-I pairs) endow the bacteria with a great advantage in niche competition. This is the C-terminal domain of Tli4. The Tle cognate immunity proteins (Tlis) can directly disable the transported Tle protein and thereby mediate the self-protection process. The Tle-Tli effector-immunity (E-I) pairs confer substantial advantage to the donor cell during interbacterial competition. Tli4 displays a two-domain structure, in which a large lobe and a small lobe form a crab claw-like conformation. Tli4 uses this crab claw to grasp the cap domain of Tle4, especially the lid2 region, which prevents the interfacial activation of Tle4 and thus causes enzymatic dysfunction of Tle4. Structural comparison indicates similarity between this C-terminal domain of Tli4 and Tsi3, which is the cognate immunity protein of the effector protein Tse3 in P. aeruginosa PDB:4n7s.	161
408224	pfam18427	DDR_swiveling	DD-reactivating factor swiveling domain. AdoCbl-dependent diol dehydratase (DD) (EC 4.2.1.28) is one of the enzymes that catalyzes the conversion of 1,2-propanediol, 1,2-ethanediol, and glycerol to the corresponding aldehydes. A DD-reactivating factor (DDR) is responsible for the rapid reactivation of the inactivated holoDD in the presence of AdoCbl, ATP, and Mg2+. DDR exists as a dimer of heterodimer (alpha-beta)2. The alpha subunit has four domains: ATPase domain, swiveling domain, linker domain, and insert domain. The beta subunit, composed of a single domain, has a similar fold to the beta subunit of diol dehydratase (DD). This entry describes the swiveling domain of DDR, which structurally connects the beta subunit and the ATPase domain of the other alpha subunit. Furthermore, the beta subunit moves with the swiveling domain while the linker domain acts as a flat spring or a hinge for the domain movement of the swiveling domain.	162
408225	pfam18428	BRCT_3	BRCA1 C Terminus (BRCT) domain. Brca1 C-terminal (BRCT) domains are common protein-protein interaction regions in proteins involved in the DNA damage response and DNA repair. For example, 53BP1, which plays multiple roles in mammalian DNA damage repair, has a C-terminal tandem BRCT domain (BRCT2), which in its orthologs, Saccharomyces cerevisiae Rad9p and Schizosaccharomyces pombe Crb2, mediates binding to the equivalents of gammaH2AX. Structural and functional studies indicate that the 53BP1-BRCT2 domain is a competent binding module for phosphorylated peptides with a clear specificity for the DNA-damage marker gammaH2AX, and in isolation from other parts of 53BP1 is sufficient for localization to sites of DNA damage in cells associated with gammaH2AX.	102
408226	pfam18429	DUF5609	Domain of unknown function (DUF5609). This is a probable HAD-like (haloalkanoate dehalogenase) domain found in bacterial phosphoserine phosphatases.	65
408227	pfam18430	DBD_HTH	Putative DNA-binding domain. This is a putative DNA-binding protein dimerization domain found in bacterial proteins such as CD3330 (a transposon-related DNA-binding protein from Clostridium difficile). Crystal structure analysis suggests that the CD3330 N-terminus is involved in dimer formation and also in crystal packing but the C-terminus is open to solvent.	35
408228	pfam18431	RNAse_A_bac	Bacterial CdiA-CT RNAse A domain. Contact-dependent growth inhibition (CDI) is an important mechanism of inter-bacterial competition found in many Gram-negative pathogens. CDI+ cells express cell-surface CdiA proteins that bind neighboring bacteria and deliver C-terminal toxin domains (CdiA-CT) to inhibit target-cell growth. Structure analysis of CdiA-CT shows that it adopts the same fold (with two beta-sheets forming an overall kidney shape) as angiogenin and other RNase A paralogs, but the toxin does not share sequence similarity with these nucleases and lacks the characteristic disulfide bonds of the superfamily. Furthermore, structural comparison analysis identified human angiogenin, Rana pipiens protein P-30 (onconase) and mouse pancreatic ribonuclease (RNase 1) as the closest structural homologs of CdiA-CT.	113
408229	pfam18432	ECD	Extracellular Cadherin domain. This is an extracellular cadherin (EC) domain which can be found at the N-terminal region of Protocadherin 15 (Pcdh15). Pcdh15 features exceptionally long extracellular domains containing 11 ECs. These repeats are structurally similar, but not identical in sequence, often featuring linkers with conserved calcium-binding sites that confer mechanical strength to them.	110
408230	pfam18433	DUF5610	Domain of unknown function (DUF5610). This is a domain of unknown function found in bacterial proteins.	114
408231	pfam18434	Kazal_3	Kazal-type serine protease inhibitor domain. Kazal domain found in factor I-like modules (FIMs) region on the carboxyl-terminal of complement component C7 proteins. Complement component C7 is a subunit of the membrane attack complex (MAC), a fundamental machinery in the mammalian innate immunity. KAZAL domains are common in serine protease inhibitors.	49
408232	pfam18435	EstA_Ig_like	Esterase Ig-like N-terminal domain. This is an N-terminal immunoglobulin (Ig)-like domain found in esterases such as EstA. Analysis of the EstA structure confirms that it is a member of the alpha/beta hydrolase family, with a conserved Ser-Asp-His catalytic triad. The Ig-like domain presumably plays a role in the multimerization of EstA into an unusual hexameric structure. Additionally, it may also participate in the catalysis of EstA by guiding the substrate to the active site.	120
408233	pfam18436	HECW1_helix	Helical box domain of E3 ubiquitin-protein ligase HECW1. This is a region of 109 amino acids found in HECW1 proteins in Eukaryotes. Polymorphisms in the same region in the C. elegans homologue affect C. elegans behavioural avoidance of a lawn of Pseudomonas aeruginosa.	67
408234	pfam18437	Nup54_C	Nup54 C-terminal interacting domain. The mammalian nuclear pore complex (NPC) conducts nucleocytoplasmic transport and contains multiple copies of nucleoporins (nups). This is the C-terminal interacting domain found on Nup54. Nup45 is a splice variant of Nup58 with an identical alpha-helical region. Nup54 along with Nup62 and Nup58 are essential for nuclear transport. The C-terminal part of the alpha-helical region of Nup54 interacts with a C-terminal part of the alpha-helical region of Nup58. Interestingly, this region appears in two distinct conformations: a single helix and a helix-loop-helix, termed 'straight' and 'bent'. Whereas the straight conformer consists of a 34 residues long alpha helix (residues 460-493), the bent conformer is composed of two alpha helices, each 13 residues long, connected by a central loop (N helix, residues 460-472; C helix, residues 477-489).	39
408235	pfam18438	Glyco_hydro_38	Glycosyl hydrolases family 38 C-terminal domain 1. The enzymatic hydrolysis of alpha-mannosides is catalyzed by glycoside hydrolases (GH), termed alpha-mannosidases. Streptococcal (Sp) GH38 alpha-mannosidase is active on N-glycans and possibly O-glycans. SpGH38 structure can be considered as five domains: an N-terminal alpha/beta-domain, a three-helix bundle and three predominantly beta-sheet domains. This is the first of the three beta-sheet domains found in GH38, termed Beta-1. Structural analysis indicates that the beta-1 domain bows outward from the protein core, is involved in dimer interactions whilst also forming a lid 'above' and somewhat into the active centre of its dimer.	111
408236	pfam18439	zf_UBZ	Ubiquitin-Binding Zinc Finger. This is the ubiquitin-binding zinc finger (UBZ) domain found in DNA polymerase eta (EC:2.7.7.7). It is important in the recruitment of the polymerase to the stalled replication machinery in translesion synthesis. The UBZ domain adopts a classical C2H2 zinc-finger structure characterized by a beta-beta-alpha fold.	32
408237	pfam18440	GlcNAc-1_reg	Putative GlcNAc-1 phosphotransferase regulatory domain. The Golgi enzyme UDP-GlcNAc-lysosomal enzyme N-acetylglucosamine-1-phosphotransferase (GlcNAc-1-phosphotransferase), an alpha2beta2gamma2 hexamer, mediates the initial step in the addition of the mannose 6-phosphate targeting signal on newly synthesized lysosomal enzymes. GNPTAB encodes the alpha and beta subunits of GlcNAc-1-phosphotransferase, and mutations in this gene cause the lysosomal storage disorders mucolipidosis II and III alpha-beta. The alpha-beta subunits contain three identifiable domains separated by so-called spacer regions. This domain is part of the first spacer region, Spacer-1. Studies indicate that GlcNAc-1 lacking spacer-1 exhibits enhanced phosphorylation of several non-lysosomal glycoproteins, while the phosphorylation of lysosomal acid hydrolases is not altered. In view of these effects on the maturation and function of GlcNAc-1, it is suggested to rename 'spacer-1' the 'regulatory-1' domain.	88
408238	pfam18441	Hen1_Lam_C	Hen1 La-motif C-terminal domain. RNA silencing is a conserved regulatory mechanism in fungi, plants and animals that regulates gene expression and defence against viruses and transgenes. A conserved S-adenosyl-l-methionine-dependent RNA methyltransferase, HUA ENHANCER 1 (HEN1), and its homologues are responsible for 2'-O-methylation on the 3' terminal nucleotide of microRNAs and small interfering RNAs (siRNAs). The 2'-O-methylation protects miRNAs and siRNAs from 3'-end uridylation and 3'-to-5' exonuclease-mediated degradation. This domain lies on the C-terminal region of the La-motif domain found in HEN1.	136
375868	pfam18442	G2BR	E3 gp78 Ube2g2-binding region (G2BR). The activity of RING finger ubiquitin ligases (E3) is dependent on their ability to facilitate transfer of ubiquitin from ubiquitin-conjugating enzymes (E2) to substrates. The G2BR domain within the E3 gp78 binds selectively and with high affinity to the E2 Ube2g2. Binding to the G2BR results in conformational changes in Ube2g2 that affect ubiquitin loading. The Ube2g2-G2BR interaction also causes a 50-fold increase in affinity between the E2 and RING finger. Hence, the Ube2g2-binding region (G2BR) is required for the function of gp78. In yeast, Ubc7p, the ortholog of Ube2g2, is recruited by Cue1p to the ER membrane. Cue1p directly binds Ubc7p through a stretch of 50 aa domain analogous to G2BR, i.e. suggesting that this domain which activates ERAD and Hrd1p stimulating ubiquitylation, might be the yeast equivalent of the G2BR domain.	26
408239	pfam18443	Tli4_N	Tle cognate immunity protein 4 N-terminal domain. T6SS bacteria employ toxic effectors to inhibit rival cells and concurrently use effector cognate immunity proteins to protect their sibling cells. The effector and immunity pairs (E-I pairs) endow the bacteria with a great advantage in niche competition. This is the N-terminal domain of Tli4. The Tle cognate immunity proteins (Tlis) can directly disable the transported Tle protein and thereby mediate the self-protection process. The Tle-Tli effector-immunity (E-I) pairs confer substantial advantage to the donor cell during interbacterial competition. Tli4 displays a two-domain conformation (domains I and II) and contains 17 beta-strands and four helices. These two domains pack into a crab claw-like conformation functioning as an inhibitor of Tle4. Both domains adopt an alpha+beta architecture. Domain I features a central antiparallel beta-sheet sandwiched by two helices and a short antiparallel beta-sheet. This entry comprises the N-terminal domain I found in Tli4 proteins.	149
408240	pfam18444	RRM_9	RNA recognition motif. The Mex67-Mtr2 complex (TAP-p15 or NXF1-NXT1 in metazoans) is the principal mRNA export factor in Saccharomyces cerevisiae. Mex67 is a member of the NXF family of proteins and has conserved homologs through eukaryotes from yeast to humans. Although sequence conservation is poor between S. cerevisiae Mex67 and Homo sapiens NXF1, they do show functional complementarity. Mex67 and TAP/NXF1 are modular proteins that contain four structural domains: an N-terminal RNA recognition motif (RRM), a leucine rich repeat (LRR) domain, a nuclear transport factor 2-like domain (NTF2L) and an ubiquitin-associated domain (UBA). This entry describes the N-terminal RNA recognition motif (RRM) found in Mex67 proteins.	70
375871	pfam18445	zf_PR_Knuckle	PR zinc knuckle motif. This is a zinc knuckle motif found in PRDM4 (Schwann cell factor 1, SC-1), a member of the PR protein family. PRDM4 is a transcriptional regulator that has been implied in transduction of nerve growth factor signals via the p75 neurotrophin receptor and in cell growth arrest. The short motif is also present in several other PR proteins including human PRDM6 (PRISM), PRDM7, PRDM9 (meisetz), PRDM10 (tristanin), PRDM11, and PRDM15. The conservation of cysteine and histidine residues suggested that this 20 amino acid motif binds zinc, hence the name 'PR zinc knuckle' to distinguish it from the longer (30 amino acid) C2H2-like zinc fingers that are located C-terminally of the PR domain. The PR zinc knuckle fold is similar to that of Gag-knuckles (a beta-hairpin providing two zinc ligands followed by a short helix or a loop providing the other two zinc ligands) and zinc ribbons (two beta-hairpins, each providing two zinc ligands).	38
408241	pfam18446	DUF5611	Domain of unknown function (DUF5611). This is a domain of unknown function. Studies of the TA0095 gene product indicate that this 96-residue hypothetical protein from Thermoplasma acidophilum is a member of the COG4004 orthologous group of unknown function found in Archaea bacteria. The structure displays an alpha/beta two-layer sandwich architecture formed by three alpha-helices and five beta-strands. Furthermore, structural homologs indicate that the TA0095 structure belongs to the TBP-like fold.	103
408242	pfam18447	FN3_7	Fibronectin type III domain. This domain is found in Interleukin-7 receptor subunit alpha (IL-7Ralpha), which together with IL-7 forms a complex crucial to several signalling cascades leading to the development and homeostasis of T and B cells. IL-7Ralpha carries a 219 residue ectodomain on the N-terminal region which is crucial for T and B-cell development. Mutations in the IL-7Ralpha ectodomain inhibit T and B cell development, resulting in patients with a form of severe combined immunodeficiency (SCID). The ectodomain folds into two fibronectin type III (FNIII) domains connected by a 310-helical linker. This entry comprises the first of the two FNIII domains, D1, while the D2 domain is pfam00041. In the D1 domain of IL-7Ralpha, a disulfide bond (C22R-C37R) conserved among cytokine receptor class I (CRH I) family members bridges two beta strands.	97
408243	pfam18448	CBM46	Carbohydrate binding domain. Carbohydrate active enzymes (CAZYmes) that target recalcitrant polysaccharides are modular enzymes containing noncatalytic carbohydrate-binding modules (CBMs) that direct enzymes to their cognate substrate, thus potentiating catalysis. The structure of Bacillus halodurans endo-beta-1,4-glucanase B (Cel5B) reveals that CBM46 is tightly associated with the catalytic module and, dependent on the glucan presented to the enzyme, can contribute directly to substrate binding or play a targeting role in directing the enzyme to regions of the plant cell wall rich in the polysaccharide hydrolyzed by the enzyme. The CBM46 domain displays a classic beta-sandwich jelly roll fold. Against beta-1,3-1,4-glucans CBM46 domain participates in productive substrate binding and thus plays a direct role in the hydrolytic activity of the enzyme.	105
408244	pfam18449	Endotoxin_C2	Delta endotoxin. This domain (D-VI) can be found in Bacillus thuringiensis (Bt) insecticidal protein Cry1Ac. Full length structural analysis reveals that Cry1Ac contains seven distinct domains (DI-DVII): the three canonical toxin core domains (D-I through D-III) and four protoxin domains (D-IV through D-VII). Cry1Ac is sickle-shaped with the toxic core as handle and the protoxin domains as the blade. Domains IV and VI are alpha-bundles that resemble structural/interaction domains such as spectrin (PDB ID: 1CUN) or bacterial fibrinogen-binding complement inhibitor.	63
408245	pfam18450	zf_C2H2_6	Zinc Finger domain. This is a C2H2 type zinc finger domain which can be found in Zinc finger and BTB domain-containing protein 21 (ZBTB21).	28
408246	pfam18451	CdiA_C	Contact-dependent growth inhibition CdiA C-terminal domain. Contact-dependent growth inhibition (CDI) systems encode polymorphic toxin/immunity proteins that mediate competition between neighboring bacterial cells. CDI is mediated by the CdiB/CdiA family of two-partner secretion proteins. This domain represents the C-terminal of CdiA proteins (CdiA-CT), which contains the CDI toxin activity. The C-terminal nuclease domain forms a stable complex with its cognate immunity protein. It is also sufficient to inhibit growth when expressed in E. coli cells, consolidating the idea that they constitute the functional CDI toxin. The CdiA-CT C-terminal domains are structurally similar to type IIS restriction endonucleases suggesting that the toxins have metal-dependent DNase activity.	81
408247	pfam18452	Ig_6	Immunoglobulin domain. This is an immunoglobulin domain which can be found in Interleukin-18 receptor alpha (IL-18Ra). IL-18Ra ectodomain folds into three immunoglobulin (Ig)-like domains, similar to IL-1 receptors. Each domain comprises a two-layer sandwich of six to nine beta-strands and contains at least one intra-domain disulfide bond.	50
375879	pfam18453	C4bp_oligo	Oligomerization domain of C4b-binding protein alpha. This is the C-terminal oligomerization domain found in C4b-binding protein (C4BP), which contains 14 cysteines that form 7 intermolecular disulfide bridges. C4BP is a plasma glycoprotein complex of 570 kDa, which is mainly produced in the liver. C4BP is the major inhibitor of complement activation, the major isoform consists of 7alpha and one beta-chain where each alpha-chain comprises eight complement control domain proteins (CCPs) and a C-terminal oligomerization domain. This domain carries a stretch of 42-45 amino acids that has been shown to be required for ring formation.	49
408248	pfam18454	Mtd_N	Major tropism determinant N-terminal domain. This is the N-terminal domain of major tropism determinant (Mtd), a retroelement-encoded receptor-binding protein. Mtd-N forms a three-fold symmetric beta-prism. This resembles the pseudo three-fold-symmetric beta-prisms of monocot lectins, but lacks residues in these lectins identified as binding carbohydrates. The beta-prism and beta-sandwich domains reinforce overall trimeric assembly and therefore may have indirect roles in stabilizing the backbone of the variable region.	37
375881	pfam18455	GBR2_CC	Gamma-aminobutyric acid type B receptor subunit 2 coiled-coil domain. This is the intracellular coiled-coil domain found in Gamma-aminobutyric acid type B receptor subunit 2 (GBR2). The coiled-coil complex between the GABAB receptor subunits GBR1 and GBR2 is responsible for facilitating the surface transport of the intact receptor. Disruption of the hydrophobic coiled-coil interface with single mutations in either subunit impairs surface expression of GBR1, confirming that the coiled-coil interaction is required to inactivate the adjacent ER retention signal of GBR1.	39
408249	pfam18456	CmlA_N	Diiron non-heme beta-hydroxylase N-terminal domain. This is the N-terminal domain found in Diiron non-heme beta-hydroxylase (CmlA). CmlA catalyzes beta-hydroxylation of the precursor molecule l-p-aminophenylalanine (l-PAPA) to form l-p-aminophenylserine. Structural analysis indicates that the N-terminal domain facilitates dimerization and has a mixed alpha-beta topology. Furthermore, a projecting 'dimerization arm' (residues 108-146) from the N-terminal domain of CmlA mediates the interaction between the monomers.	232
408250	pfam18457	PUD1_2	Up-Regulated in long-lived daf-2. This entry includes C. elegans PUD-1 and PUD-2, two proteins up-regulated in daf-2(loss-of-function) (PUD), are homologous 17-kD proteins with similar beta-sandwich folds that further associate with each other into a V-shaped heterodimer.	172
408251	pfam18458	XPB_DRD	Xeroderma pigmentosum group B helicase damage recognition domain. This domain is found in the N-terminal region of xeroderma pigmentosum group B (XPB) helicase present in Archaeoglobus fulgidus. XPB is essential for transcription, nucleotide excision repair, and TFIIH functional assembly. The domain is a damage recognition domain (DRD) which allows XPB to unwind damaged DNA as needed for nucleotide excision repair.	57
408252	pfam18459	PCSK9_C1	Proprotein convertase subtilisin-like/kexin type 9 C-terminal domain. This entry represents a subdomain found in the C-terminal cysteine/histidine-rich domain (CRD) of PCSK9 (also known as neural apoptosis-regulated convertase, NARC-1). PCSK9 has been shown to regulate circulating LDL-R levels by controlling LDL-R degradation. Furthermore, numerous mutations in the PCSK9 gene have been identified and associated with hypercholesterolemia (gain of function) or hypocholesterolemia (loss of function). The fully folded CRD, shows structural similarity to the resistin homotrimer, a small cytokine associated with obesity and diabetes. The C-terminal domain from PCSK9 consists of three, three-stranded beta-subdomains arranged in a pseudothreefold, and each of the subdomains in the CRD of PCSK9 consists of three structurally conserved disulfide bonds.	83
408253	pfam18460	HetR_C	Heterocyst differentiation regulator C-terminal Hood domain. This is the C-terminal hood domain found in Heterocyst differentiation control protein (HetR). HetR-C binds to PatS peptide. Two PatS6 peptides bind to the lateral clefts of HetR-Hood domain, and trigger significant conformational changes of the flap domain, resulting in dissociation of the auxiliary alpha-helix and eventually release of HetR from the DNA major groove and termination of transcription.	79
408254	pfam18461	Atypical_Card	Atypical caspase recruitment domain. The N-terminal effector domain found in NLRC5. It adopts a six alpha-helix bundle with a general death fold. Structure and sequence analysis of the NLRC5-N indicate that it possesses a fold similar to the one of the death-fold domains; however, it displays significant differences in the number of core alpha-helices and their relative orientation. Hence, it is suggested that NLRC5 belongs to the caspase recruitment domain (CARD) subfamily as an atypical CARD.	95
408255	pfam18462	DUF5612	Domain of unknown function (DUF5612). This is a domain of unknown function which is mostly found at the C-terminal of ACT domains such as pfam01842.	143
408256	pfam18463	PCSK9_C3	Proprotein convertase subtilisin-like/kexin type 9 C-terminal domain. This entry represents a subdomain found in the C-terminal cysteine/histidine-rich domain (CRD) of PCSK9 (also known as neural apoptosis-regulated convertase, NARC-1). PCSK9 has been shown to regulate circulating LDL-R levels by controlling LDL-R degradation. Furthermore, numerous mutations in the PCSK9 gene have been identified and associated with hypercholesterolemia (gain of function) or hypocholesterolemia (loss of function). The fully folded CRD, shows structural similarity to the resistin homotrimer, a small cytokine associated with obesity and diabetes. The C-terminal domain from PCSK9 consists of three, three-stranded beta-subdomains arranged in a pseudothreefold, and each of the subdomains in the CRD of PCSK9 consists of three structurally conserved disulfide bonds.	74
408257	pfam18464	PCSK9_C2	Proprotein convertase subtilisin-like/kexin type 9 C-terminal domain. This entry represents a subdomain found in the C-terminal cysteine/histidine-rich domain (CRD) of PCSK9 (also known as neural apoptosis-regulated convertase, NARC-1). PCSK9 has been shown to regulate circulating LDL-R levels by controlling LDL-R degradation. Furthermore, numerous mutations in the PCSK9 gene have been identified and associated with hypercholesterolemia (gain of function) or hypocholesterolemia (loss of function). The fully folded CRD, shows structural similarity to the resistin homotrimer, a small cytokine associated with obesity and diabetes. The C-terminal domain from PCSK9 consists of three, three-stranded beta-subdomains arranged in a pseudothreefold, and each of the subdomains in the CRD of PCSK9 consists of three structurally conserved disulfide bonds.	66
408258	pfam18465	Rieske_3	Rieske 3Fe-4S. This domain is comprised of the iron-sulphur cluster and Rieske subunit found in the large subunit of arsenite oxidase. Arsenite oxidase is a 100 kDa molybdenum- and iron-sulfur-containing protein located on the outer surface of the inner membrane of Gram-negative organisms. The large subunit of arsenite oxidase is similar to other members of the dimethylsulfoxide (DMSO) reductase family of molybdenum enzymes. The large subunit of arsenite oxidase is divided into four domains, with domain I binding the [3Fe-4S] cluster. Domain I, consists of three antiparallel beta sheets and six helices. The [3Fe-4S] cluster is coordinated by the motif Cys21-X2-Cys24-X3-Cys28 near the interface with domains III and IV. A large, flattened funnel-like cavity bounded by domains I, II, and III leads to the molybdenum center pfam00384 located near the center of the molecule.	96
408259	pfam18466	GluRS_N	Glutamate--tRNA ligase N-terminal domain. This is an N-terminal domain of Glutamate--tRNA ligase (GluRS, EC:6.1.1.17). The domain adopts a classical glutathione S-transferase (GST)-like fold and it interacts with tRNA-aminoacylation cofactor ARC1 (Arc1p) N-terminal domain for the formation of aminoacyl-tRNA synthetase (aaRS) complex in yeast.	55
408260	pfam18467	DUF5613	Domain of unknown function (DUF5613). This is a domain of unknown function found in bacteria.	97
408261	pfam18468	Pfk_N	Phosphofructokinase N-terminal domain yeast. This is a phosphofructokinase (Pfk) N-terminal domain found in yeast ATP-dependent 6-phosphofructokinase subunit alpha. ATP-dependent 6-phosphofructokinases (Pfks, EC 2.7.1.11) catalyze the phosphorylation of fructose 6-phosphate (F-6-P) to fructose 1,6-bisphosphate, a key control step of glycolysis in most organisms. The N-terminal domain contains the active site and is related to glyoxalase I (E.C. 4.4.1.5).	98
408262	pfam18469	PH_18	Pleckstrin homology domain. This is a Pleckstrin Homology (PH) domain found on the N-terminal region of the histone chaperone Rtt106 in yeast. Rtt106 binds histone H3 acetylated at lysine 56 (H3K56ac) and facilitates nucleosome assembly during several molecular processes and this N-PH domain is shown to mediate histone binding.	140
408263	pfam18470	Cas9_a	Cas9 alpha-helical lobe domain. This is an alpha-helical lobe domain found in Cas9 proteins. Cas9 enzymes adopt a bilobed architecture composed of a nuclease lobe containing juxtaposed RuvC and HNH nuclease domains and a variable alpha-helical lobe likely to be involved in nucleic acid binding. Amino acid residues located in both the nuclease and alpha-helical lobe clefts are highly conserved within type II-A Cas9 proteins.	221
408264	pfam18471	Ribosomal_L27_C	Ribosomal L27 protein C-terminal domain. This is the C-terminal domain of 54S ribosomal protein L2 (also known as Mitochondrial large ribosomal subunit protein bL27m). bL27 C-terminal region interacts with an expansion segment of 21S rRNA to form part of the central protuberance.	240
408265	pfam18472	HP1451_C	HP1451 C-terminal domain. HP1451 modulates the ATPase activity of HP0525 H. pylori. It is suggested that HP1451 acts as an inhibitory factor of HP0525 to regulate Cag-mediated secretion. It consists of two domains. The two HP1451 domains (referred to as the KH-domain and S8-domain) interact with each other via two salt bridges. The second domain is structurally homologous to other nucleic acid-binding proteins such as the NTD of ribosomal protein S8 and DNaseI.	67
408266	pfam18473	Urease_linker	Urease subunit beta-alpha linker domain. This domain is present in bacterial ureases and corresponds to the gap region between the C-terminus of the beta-chain Urease beta subunit pfam00699 and the N-terminus of the alpha-chain Urease alpha-subunit, N-terminal domain pfam00449. It is suggested that this region is required for the stability of the putative transmembrane beta-barrel, and might be the reason for bacterial urease (B. pasteurii) not being lethal to insects.	34
408267	pfam18474	DUF5614	Family of unknown function (DUF5614). This is an N-terminal domain found in C7orf25 protein UPF0415. It is distantly related to the PD-(D/E)XK nucleases.	221
408268	pfam18475	PIN7	PIN domain. This is a bacterial PIN-like domain of unknown function.	103
408269	pfam18476	PIN_8	PIN like domain. This is a domain of unknown function, suggested to be a member of PIN like domains clan.	228
408270	pfam18477	PIN_9	PIN like domain. This is a domain of unknown function that resembles the PIN like domains. Family members include Ribonuclease VapC9.	113
408271	pfam18478	PIN_10	PIN like domain. This is a bacterial domain of unknown function suggested to resemble PIN like domains.	84
408272	pfam18479	PIN_11	PIN like domain. This is a eukaryotic/eumetazoan PIN like domain found in the C-terminal region of bilateral ZNF451 proteins such as isoform 1 of human ZNF451. ZNF451 was shown to interact with p300 by the PIN-like domain and to negatively regulate TGF-beta signalling in a p300-dependent and sumoylation-independent manner. This domain is suggested to possess a potential active nuclease due to the presence of at least four conserved Asp residues in the predicted active site. Furthermore, it contains several conserved Cys and His residues, which may suggest stabilization of the domain structure with an embedded short zinc-binding loop.	118
408273	pfam18480	DUF5615	Domain of unknown function (DUF5615). This is a domain of unknown function found in potential toxin-antitoxin system component.	77
408274	pfam18481	DUF5616	Domain of unknown function (DUF5616). This domain is found in a number of prokaryotic proteins. It is mostly found fused with the N-terminal domain pfam04256. This C-terminal domain is suggested to be a PIN like domain.	140
408275	pfam18482	Pih1_fungal_CS	Fungal Pih1 CS domain. The Pih1 protein is part of the R2TP complex. The CS domain of Pih1 binds to the unstructured region of Tah1. The C-terminal domain of Pih1 consists of a seven-stranded beta sandwich with the topology of a CS domain, a structural motif also found in Hsp90 cochaperones such as p23/Sba1 and Sgt1.	90
408276	pfam18483	Bact_lectin	Bacterial lectin. This entry primarily matches to legume-like lectin domains found in prokaryotes.	211
408277	pfam18484	CDCA	Cadmium carbonic anhydrase repeat. This domain is the cadmium carbonic anhydrase repeat unit of the beta-carbonic anhydrase of a marine diatom, that uses both zinc and cadmium for catalysis of the reversible hydration of carbon dioxide for use in inorganic carbon acquisition for photosynthesis (thus being a cambialistic enzyme). Compared with alpha- and gamma-carbonic anhydrases that use three histidines to coordinate the zinc-atom, this beta-carbonic anhydrase has two cysteines and one histidine, and rapidly binds cadmium.	184
408278	pfam18485	GST_N_5	Glutathione S-transferase, N-terminal domain. This is the N-terminal (GST-N) domain containing a thioredoxin fold. This domain is found in methionyl-tRNA synthetase (MRS), a multi-tRNA synthetase complex (MSC) component.	74
408279	pfam18486	PUB_1	PNGase/UBA- or UBX-containing domain. This is a PUB domain (PNGase/UBA- or UBX-containing domain), found in E3 ubiquitin-protein ligase RNF31, also known as Ring finger protein 31 and HOIL-1-interacting protein (HOIP) (EC:2.3.2.27). RNF31/HOIP is observed to contribute to inborn human immunity disorders, in which RNF31/HOIP missense mutation at PUB domain gives rise to the de-stabilized LUBAC complex (linear ubiquitin chain assembly complex) and subsequently causes the auto-inflammation and immunodeficiency. In addition, RNF31 is reported to modify ERK and JNK pathways leading to cisplatin resistance. Functional studies indicate that HOIP and OTULIN interact and act as a bimolecular editing pair for linear ubiquitin signals where the HOIP-PUB domain binds to the PUB interacting motif (PIM) of OTULIN and the chaperone VCP/p9. This interaction plays an important role where the HOIP binding to OTULIN is required for the recruitment of OTULIN to the TNF receptor complex and to counteract HOIP-dependent activation of the NF-KB pathway.	64
408280	pfam18487	TSR	Thrombospondin type 1 repeat. This is a thrombospondin type I repeat (TSR) found in properdin. Properdin, also known as factor P (fP), is a glycoprotein constructed from a common pool of structure units or modules, which are homologous to the thrombospondin type 1 repeat, TSR. It is a positive regulator of the complement system that stabilizes the alternative pathway C3-convertase C3bBb. Properdin also inhibits the factor H-mediated cleavage of C3b by factor I. In addition, properdin acts as a pattern recognition molecule capable of identifying and interacting with microbial surfaces, apoptotic cells, and necrotic cells. However, this role of pattern recognition is controversial. Studies indicate that this domain is a TSR variant. It is present at the N-terminal of properdin and has been denoted as TSR-0. It is suggested that the TSR-0 domain of properdin which possesses only the six Cys residues and no CWR-layered motif (which is usually found in other TSR domains) may constitute a truncated TSR domain.	50
408281	pfam18488	WYL_3	WYL domain. Many phytopathogens secrete and/or inject 'effector' proteins inside host cells to modulate cellular processes. Phytopathogens deliver effector proteins inside host plant cells to promote infection. Crystal structures of the effector domains from two oomycete RXLR proteins, Phytophthora capsici AVR3a11 and Phytophthora infestans PexRD2 reveal a core alpha-helical fold (termed the 'WY-domain') which enables functional adaptation of these fast evolving effectors through (i) insertion/deletions in loop regions between alpha-helices, (ii) extensions to the N and C termini, (iii) amino acid replacements in surface residues, (iv) tandem domain duplications, and (v) oligomerization. It is proposed that the core fold provides both a degree of molecular stability and plasticity that enables development/maintenance of effector virulence activities while allowing evasion of recognition by the plant innate immune system during rapid 'arms race' co-evolution.	63
408282	pfam18489	Alpha_Helical	Alpha helical domain. This is the N-terminal domain found in putative tRNA-binding proteins found in archaea. Structural analysis from Pyrococcus horikoshii indicate that it is a helical domain where many conserved residues are found in the first three helices and are mainly located on the inverse side of the putative tRNA-binding site. A structural homology search suggested that this fold prefers to bind proteins/peptides.	127
408283	pfam18490	tRNA_bind_4	tRNA-binding domain. This is the N-terminal domain found in archaeal type-2 serine-tRNA ligase (SerRS) (EC:6.1.1.11). The SerRS N-terminal domain interacts with the extra-arm stem and the outer corner of tRNA specific to Selenocysteine (tRNA-Sec).	159
408284	pfam18491	SRA	SET and RING associated domain. This is the C-terminal domain found in PvuRts1I, a modification-dependent restriction endonuclease that recognizes 5-hydroxymethylcytosine (5hmC) as well as 5-glucosylhydroxymethylcytosine (5ghmC) in double-stranded DNA in bacteria. Structural analysis indicates that it has the typical SRA (SET and RING associated) domain fold (pfam02182).	140
408285	pfam18492	ORF_2_N	Open reading frame 2 N-terminal domain. This is the N-terminal domain found in ORF 2 (open reading frame 2), a protein encoded just downstream of asp (A. sobria serine protease). The ORF 2 N-terminal domain is essential for proper ASP folding. This domain is intrinsically disordered but forms some degree of secondary structure upon binding ASP.	121
408286	pfam18493	DUF5617	Domain of unknown function (DUF5617). This is a C-terminal domain of unknown function found in gammaproteobacteria.	96
408287	pfam18494	Pullulanase_Ins	Pullulanase Ins domain. Pullulanases (pullulan 6-glucanohydrolase, EC 3.2.1.41) are debranching enzymes that are able to hydrolyze the alpha-1,6-glycosidic linkage in pullulan, starch, amylopectin, and related oligosaccharides. Type I pullulanases specifically cleave the alpha-1,6-glycosidic linkages in pullulan and branched oligosaccharides to produce maltotriose and linear oligosaccharides, respectively. Structural analysis of Klebsiella lipoprotein pullulanase (PulA) illustrates that the catalytic core is composed of two major regions: the TIM-barrel domain A and beta-sandwich fold domain C. PulA contains an extra domain, a highly mobile Ins subdomain of unknown function which is inserted into the catalytic TIM-barrel domain A of Klebsiella pullulanases. The Ins subdomain is rich in helical and loop secondary structure. A disulfide bond between Cys491 and Cys506 and two Ca2+ ions presumably stabilize this domain. This insertion is also found in pullulanases from other Gram-negative genera that have a functional T2SS, such as Vibrio, Aeromonas, and Photorhabdus. Functional analysis indicates that this domain is required for PulA secretion via the T2SS.	74
408288	pfam18495	VbhA	Antitoxin VbhA. VbhT is a bacterial Fic protein of the mammalian pathogen B. schoenbuchensis. It is composed of an N-terminal FIC domain and a C-terminal BID domain. FIC domains are known to catalyse adenylylation (also called AMPylation). This entry represents VbhA, an antitoxin that binds the FIC domain (filamentation induced by cyclic AMP) of VbhT and inhibits its activity. It inhibits the adenylylation activity of VbhT by positioning close to the putative ATP-binding site, hence competing with ATP binding.	47
408289	pfam18496	ColG_sub	Collagenase G catalytic helper subdomain. This is the catalytic helper subdomain found in collagenase G from Clostridium histolyticum. This domain is indispensable for proper folding and full peptidase activity.	116
408290	pfam18497	RNase_3_N	Ribonuclease III N-terminal domain. This is the N-terminal domain of eukaryotic ribonuclease 3 (RNase III, EC:3.1.26.3). Structure analysis of Saccharomyces cerevisiae RNase III (Rnt1p):RNA revealed specific contacts between the N-terminal domain (NTD) dimer and the 5' end region of the tetraloop. Deletion of the NTD led to cleavage at alternative site(s), suggesting that this region increases the precision of the cleavage site selectivity.	95
408291	pfam18498	DUF5618	Domain of unknown function (DUF5618). This is a domain of unknown function found in bacteria.	123
408292	pfam18499	Cue1_U7BR	Ubc7p-binding region of Cue1. Cue1p (coupling of ubiquitin conjugation to ER degradation protein 1) is an integral component of yeast endoplasmic reticulum (ER)-associated degradation (ERAD) ubiquitin ligase (E3) complexes. It tethers the ERAD ubiquitin-conjugating enzyme (E2), Ubc7p, to the ER and prevents its degradation, and also activates Ubc7p. This domain represents the Ubc7p-binding region (U7BR) of Cue1p with Ubc7p. The U7BR is an E2-binding domain that includes three alpha-helices that interact extensively with the 'backside' of Ubc7p. U7BR stimulates both RING-independent and RING-dependent ubiquitin transfer from Ubc7p. Moreover, the U7BR enhances ubiquitin-activating enzyme (E1)-mediated charging of Ubc7p with ubiquitin.	52
408293	pfam18500	CadC_C1	CadC C-terminal domain 1. CadC is an integral membrane protein of 512 amino acids comprising an N-terminal cytoplasmic DNA-binding domain, a transmembrane helix, and a C-terminal periplasmic domain. CadC belongs to the ToxR-like regulators that encompass biochemically non-modified one-component systems with similar gross topology, including several low pH-induced transcription regulators. Structural analysis of the C-terminal periplasmic domain indicates that it resembles the sensory domain of a (pH-activated) ToxR-like regulator. Furthermore, it is composed of two subdomains with a cavity at their interface that is suited to accommodate cadaverine, the feedback inhibitor of the Cad system. This is the N-terminal subdomain of the C-terminal periplasmic domain. It is composed of five-stranded beta-sheets.	134
408294	pfam18501	REC1	Alpha helical recognition lobe domain. Cpf1 is an RNA-guided endonuclease of a type V CRISPR-Cas system. Cpf1 adopts a bilobed architecture consisting of an alpha-helical recognition (REC) lobe and a nuclease (NUC) lobe, with the small CRISPR RNAs (crRNAs)-target DNA heteroduplex bound to the positively charged, central channel between the two lobes. The REC lobe consists of the REC1 and REC2 domains where REC1 comprises 13 alpha helices, and REC2 comprises ten alpha helices and two beta strands that form a small antiparallel sheet. This entry represents REC1 domain.	245
408295	pfam18502	Mrpl_C	54S ribosomal protein L8 C-terminal domain. This is the C-terminal domain of mitochondrial 54S ribosomal protein L8.	111
408296	pfam18503	RPN6_C_helix	26S proteasome subunit RPN6 C-terminal helix domain. This is the C-terminal helix domain found in RPN6, a component of the 26S proteasome. The C-terminal helices are essential for lid assembly.	27
375930	pfam18504	Csm1_N	Csm1 N-terminal domain. In the budding yeast Saccharomyces cerevisiae, sister chromatid co-orientation in meiosis I depends on the four-protein monopolin complex (Mam1, Csm1, Lrs4, and Hrr25/casein kinase 1), which localizes to centromeres from meiotic prophase through metaphase I. Csm1 and Lrs4, form a complex that resides in the nucleolus during interphase and relocalizes to centromeres during meiotic prophase, accompanied by phosphorylation of Lrs4. This is the N-terminal domain of Csm1 which forms a coiled-coil.	70
408297	pfam18505	DUF5619	Domain of unknown function (DUF5619). This is a domain of unknown function found in bacteria and archaea.	85
408298	pfam18506	RelB_N	RelB Antitoxin alpha helical domain. This is an alpha helix domain found in the N-terminal region of antitoxin RelB. RelE-RelB (RelBE) is a toxin-antitoxin (TA) protein complex. It is suggested that the toxic action of RelE is counteracted by antitoxin RelB, which wraps around RelE, blocks its active site and sterically prevents binding to the ribosomal A-site. The long N-terminal alpha-helix of the tightly bound antitoxin RelB covers the presumed active site of the toxin RelE that is formed by a central beta-sheet.	46
375933	pfam18507	WW_1	WW domain. This is a WW domain found in histone-lysine N-methyltransferase, H3 lysine-36 specific (EC:2.1.1.43) in Saccharomyces cerevisiae. The WW domain is the simplest natural beta-sheet structure. It is a 35-residue protein module found in signaling and regulatory proteins with two highly conserved tryptophans and a strictly conserved proline.	27
408299	pfam18508	zf_C2H2_13	Zinc finger domain. The SAGA (Spt-Ada-Gcn5-acetyltransferase) complex performs multiple functions in transcription activation including deubiquitinating histone H2B, which is mediated by a subcomplex called the deubiquitinating module (DUBm). The yeast DUBm comprises a catalytic subunit, Ubp8, and three additional subunits, Sgf11, Sus1 and Sgf73, all of which are required for DUBm activity. A portion of the non-globular Sgf73 subunit lies between the Ubp8 catalytic domain and the zinc finger (ZnF)-UBP domain and has been proposed to contribute to deubiquitinating activity by maintaining the catalytic domain in an active conformation. Sgf73 contributes to maintaining both the organization and ubiquitin-binding conformation of Ubp8, thereby contributing to overall DUBm activity. This domain is a Sgf73 fragment in the DUB module. It is a zinc finger (ZnF) domain whose integrity is essential for the incorporation of this subunit into DUBm as well as for the catalytic activity of Ubp8, as either a short deletion or point mutations in Sgf73 zinc-coordinating residues disrupt the association of Sgf73 with the rest of the DUBm.	42
375935	pfam18509	MCR	Magnetochrome domain. Magnetotactic bacteria (MTB) align along the Earth's magnetic field using an organelle called the magnetosome. The magnetosome-associated protein MamP is conserved in all MTB and has a PDZ domain, a small c-type cytochrome domain (the first magnetochrome domain, MCR1), a 17-residue linker and a second magnetochrome domain (MCR2). This entry describes the two tandem magnetochrome domains carrying c-type cytochrome motifs CX2CH.	30
408300	pfam18510	NUC	Nuclease domain. This is a nuclease (NUC) domain found in Cpf1, an RNA-guided endonuclease of a type V CRISPR-Cas system. Structural and functional analysis indicate that this domain is involved in DNA cleavage.	158
408301	pfam18511	F-box_5	F-box. Jasmonates are a family of plant hormones that regulate plant growth, development and responses to stress. COI1 is an F-box protein that functions as the substrate-recruiting module of the Skp1-Cul1-F-box protein (SCF) ubiquitin E3 ligase complex. The role of COI1-mediated JAZ degradation in jasmonate (JA) signaling is analogous to auxin signaling through the receptor F-box protein transport inhibitor response 1 (TIR1), which promotes hormone-dependent turnover of the AUX/IAA transcriptional repressors. The crystal structure of COI1 reveals a TIR1-like overall architecture, with an N-terminal tri-helical F-box motif bound to ASK1 and a C-terminal horseshoe-shaped solenoid domain formed by 18 tandem leucine-rich repeats. This entry represents the N-terminal F-box domain which is also found in other auxin signaling f-box proteins such as AFB1, AFB2 and AFB3.	42
375938	pfam18512	BssB_TutG	Benzylsuccinate synthase beta subunit. Members of this family include benzylsuccinate synthase beta subunit found in bacteria. BssB acts as a regulator of activation and may additionally be involved in regulating access to the enzyme's active site. It adopts a fold similar to that of a high potential iron-sulfur protein (HiPIP) and resembles the single small subunit of HPAD, which is known as the HpdC or HPADgamma in that system.	66
408302	pfam18513	Pro_sub2	Prodomain subtilisin 2. Plasmodium subtilisin 2 (Sub2) is a multidomain protein that plays an important role in malaria infection. This domain is a conserved region of the inhibitory prodomain of Sub2 from Plasmodium falciparum, termed prosub2 which has structural similarity to bacterial and mammalian subtilisin-like prodomains.	88
408303	pfam18514	Get5_C	Get5 C-terminal domain. Tail-anchored trans-membrane proteins are targeted to membranes post-translationally. The proteins Get4 and Get5 form an obligate complex that catalyzes the transfer of tail-anchored proteins destined to the endoplasmic reticulum from Sgt2 to the cytosolic targeting factor Get3. This is the carboxyl domain of Get5 (Get5-C), a homodimerization domain, resulting in a heterotetrameric Get4/Get5 complex.	38
408304	pfam18515	Rh5	Rh5 coiled-coil domain. This is a helical coiled-coil domain found in reticulocyte-binding protein homolog 5 (RH5), a Plasmodium falciparum protein essential for erythrocyte invasion.	255
408305	pfam18516	RuvC_1	RuvC nuclease domain. This is a RuvC nuclease domain found in type V CRISPR-associated protein Cas12a (Cpf1), used for genome editing applications. These proteins carry out endoribonuclease activity for processing their own guide RNAs and RNA-guided DNase activity for target DNA cleavage. The C-terminal region of Cas12a carries the RuvC domain, NUC domain pfam18510 and the arginine-rich bridge helix (BH). Both the NUC and BH domains are nested in the RuvC domain. Mutations in the RuvC domain impair cleavage of both strands in a target DNA duplex, while a mutation in the NUC domain impaired target strand cleavage only. This indicates that the DNA nuclease active sites are located at the interface of the RuvC and NUC domains and that cleavage of the non-target DNA strand by the RuvC domain is a prerequisite for target strand cleavage by the NUC domain.	412
408306	pfam18517	LZ3wCH	Leucine zipper with capping helix domain. This domain is found at the C-terminal region of Hop2 and Mnd1 proteins. In meiotic DNA recombination, the Hop2-Mnd1 complex promotes Dmc1-mediated single-stranded DNA (ssDNA) invasion into homologous chromosomes to form a synaptic complex. Hop2 (for homologous pairing; also known as TBPIP) is expressed specifically during meiosis, as is Mnd1 (for meiotic nuclear divisions 1). The C-terminal region of both Hop2 and Mnd1 folds into three alpha-helices that are interrupted by two short non-helical regions. These alpha-helices of the two proteins together form a parallel coiled coil that provides the major interface for heterodimer formation. The non-helical regions form substantially kinked junctions between adjacent leucine zippers: the LZ1-LZ2 and LZ2-LZ3 junctions. This domain is the C-terminal segment of Hop2 and Mnd1 which folds back onto the C-terminal leucine zipper (LZ3) to form a helical bundle-like structure, hence designated LZ3wCH (for LZ3 with capping helices). The LZ3wCH region plays a role in interacting with the Dmc1 nucleofilament.	55
408307	pfam18518	TcA_RBD	TcA receptor binding domain. Tc toxin complexes are virulence factors of many bacteria such as the plague pathogen Yersinia pestis. Tc toxins are composed of TcA, TcB and TcC subunits. TcA forms a large bell-shaped pentameric structure and enters the membrane like a syringe, forming a translocation channel through which the cytotoxic domain is probably transported into the cytoplasm. TcA has four receptor-binding domains. This domain is one of 4 receptor binding domains found in TcA. All four domains have an immunoglobulin (Ig)-like beta-sandwich fold of two sheets with antiparallel beta-strands. The domains are structurally reminiscent of the receptor-binding domains of the diphtheria and anthrax toxins.	130
408308	pfam18519	Sgf11_N	SAGA-associated factor 11 N-terminal domain. The SAGA (Spt-Ada-Gcn5-Acetyltransferase) transcriptional co-activator is a protein complex that regulates inducible yeast genes by performing multiple functions including acetylating core histones, recruiting the RNA polymerase II preinitiation complex, and deubiquitinating histone H2B. The deubiquitinating activity of SAGA resides in a distinct sub-complex called the deubiquitinating module (DUBm), which consists of four proteins that are conserved across eukaryotes: Ubp8, Sgf11, Sus1 and Sgf73. The DUBm proteins are organized into two lobes around the globular domains of Ubp8. In SAGA, Sus1 binds to Sgf11 by wrapping around this N-terminal domain of Sgf11, forming a stable dimer.	39
375946	pfam18520	Spc110_C	Spindle pole body component 110 C-terminal domain. This is the C-terminal domain found in Spc110 proteins. Spc110 is a spindle pole body component (SPB) protein. The N-terminus is shown to bind to gamma-tubulin small complex (g-TuSC) while this C-terminal domain is essential for calmodulin-binding. The C-terminus of Spc110 is anchored to the SPB via a conserved PACT domain.	52
375947	pfam18521	TAD2	Transactivation domain 2. This is a N-terminal transactivation domain (TAD) domain 2 found in p53 proteins. In p53 two TAD domains are found termed TAD1 (residues 1-39) and TAD2 (residues 40-61), both of which have been shown to be able to independently activate gene transcription and are intrinsically disordered protein domains that adopt a helical conformation for at least part of their length when bound. This inherent flexibility allows the TADs to adapt to and bind a broad range of proteins. This entry describes TAD2 which can independently interact with Taz2 domain of the histone acetyltransferase p300. It has also been shown to bind to OB-fold domain of replication protein 70 A (RPA) as well as the pleckstrin homology (PH) domain of the p62 and Tfb1 subunits of human and yeast TFIIH.	25
408309	pfam18522	DUF5620	Domain of unknown function (DUF5620). This is a domain of unknown function predicted to be a carbohydrate binding module.	119
408310	pfam18523	Sld3_N	Sld3 N-terminal domain. Sld3 is conserved in yeast and fungi, and treslin, also known as Ticrr, has been identified as the functional counterpart of Sld3 in metazoans. Yeast Sld3 and its metazoan counterpart treslin are the hub proteins mediating protein associations critical for formation of the helicase. This entry represents the N-terminal domain of Sld3 which is shown to bind to the N-terminal domain of Sld7.	116
408311	pfam18524	HPIP_like	High potential iron-sulfur protein like. This is a C-terminal domain found in 4-hydroxyphenylacetate decarboxylase small subunit (EC:4.1.1.83), which catalyzes the last reaction in the fermentative production of p-cresol from tyrosine. The C-terminal domain [4Fe-4S] cluster bears structural similarity to high-potential iron-sulfur proteins (HiPIPs). HiPIPs have an N-terminal extension of 20-40 residues, so the structural similarity is limited to their Fe/S cluster-binding scaffold. Furthermore, despite the weak amino acid sequence identity, the cluster binding motifs are remarkably similar: H/CX2CX12-13CX16-17C for the gamma-subunit and CX2CX13-19CX14-19C for HiPIPs.	40
408312	pfam18525	Cas9_C	Cas9 C-terminal domain. This is the C-terminal domain of Cas9 enzymes found in actinobacteria.	110
408313	pfam18526	DB_JBP1	Thymine dioxygenase JBP1 DNA-binding domain. The J-binding protein 1 (JBP1) is essential for biosynthesis and maintenance of DNA base-J (beta-d-glucosyl-hydroxymethyluracil). Base-J and JBP1 are confined to some pathogenic protozoa and are absent from higher eukaryotes, prokaryotes and viruses. JBP1 recognizes J-containing DNA (J-DNA) through the DNA-Binding JBP1 domain (DB-JBP1), which binds to J-DNA with approximately the same affinity and specificity as full-length JBP1. Structure analysis of DB-JBP1 revealed a helix-turn-helix variant fold, a 'helical bouquet' with a 'ribbon' helix encompassing the amino acids responsible for DNA binding. Mutation of a single residue (Asp525) in the ribbon helix abrogates specificity toward J-DNA.	164
408314	pfam18527	STT3_PglB_C	STT3/PglB C-terminal beta-barrel domain. Asparagine-linked glycosylation is a post-translational modification of proteins containing the conserved sequence motif Asn-X-Ser/Thr. The attachment of oligosaccharides is implicated in diverse processes such as protein folding and quality control, organism development or host-pathogen interactions. The reaction is catalysed by oligosaccharyltransferase (OST), a membrane protein complex located in the endoplasmic reticulum. The central, catalytic enzyme of OST is the STT3 subunit, which has homologues in bacteria and archaea. Structural analysis of a bacterial OST, undecaprenyl-diphosphooligosaccharide protein glycotransferase EC:2.4.99.19 (PglB) protein, revealed two domains: a transmembrane domain and a periplasmic domain. This entry represents the C-terminal periplasmic beta-barrel domain.	79
408315	pfam18528	Ret2_MD	RNA editing 3' terminal uridylyl transferase 2 middle domain. Post-transcriptional RNA editing in Trypanosomatids (pathogenic protozoa) is catalyzed by a large multiprotein complex, the editosome. A key editosome enzyme, RNA editing terminal uridylyl transferase 2 (TUTase 2; RET2) catalyzes the uridylate addition reaction. RET2 structure consists of three domains: the N-terminal domain (NTD), the middle domain (MD) and the C-terminal domain (CTD). This MD domain is mainly composed of six helices and a four-stranded antiparallel beta-sheet. Structural comparison reveals that the fold of this MD is topologically similar to the binding domains of several RNA-binding proteins such as the RNA-binding domain of the U1A spliceosomal protein, the RRM domain of the human La protein and the CTD of an archaeal CCA-adding enzyme. The CTD of the archaeal CCA-adding enzyme has been shown to bind double-stranded tRNA stem substrate through the alpha-helical regions. Hence it is suggested that this domain might be an RNA-binding domain.	93
408316	pfam18529	MIX	Mitochondrial membrane-anchored proteins. MIX forms an all alpha-helical fold comprising seven alpha-helices that fold into a single domain. The distribution of helices is similar to a number of scaffold proteins, namely HEAT repeats, 14-3-3, and tetratricopeptide repeat proteins, suggesting that MIX mediates protein-protein interactions.	151
408317	pfam18530	Swi6_N	Swi6 N-terminal domain. This is a putative DNA binding domain; it comprises four alpha helices and five beta strands arranged in a mixed alpha/beta fold.	108
408318	pfam18531	Polo_box_2	Polo box domain. In metazoans, Plk4 kinases control daughter centriole assembly. Plk4 homologs have an N-terminal kinase domain, a C-terminal polo box, and a central domain termed the 'cryptic polo box' (CPB) that has been shown to dimerize, to be sufficient for centriole localization and to be required for Plk4 to promote centriole assembly. Probable serine/threonine-protein kinase zyg-1 (EC:2.7.11.1) (ZYG-1) is a Plk4 homolog found in C. elegans. The crystal structure of the CPB of C. elegans ZYG-1 reveals that it forms a Z-shaped dimer containing an intermolecular beta-sheet with an extended basic surface patch. Electrostatic interactions between the basic patch on the ZYG-1 CPB dimer and the SPD-2 acidic region dock ZYG-1 onto centrioles to promote new centriole assembly. ZYG-1 CPB contains two tandem polo boxes (PB1 and PB2), each containing a six-stranded beta-sheet with an alpha-helix packed against one side.	112
408319	pfam18532	DUF5621	Domain of unknown function (DUF5621). This is a domain of unknown function found in gammaproteobacteria.	139
408320	pfam18533	DUF5622	Domain of unknown function (DUF5622). This is a domain of unknown function found in archaea-specific ribosomal proteins such as L46a which is suggested to directly bind to rRNA in the ribosome.	66
408321	pfam18534	HBD	Helical bundle domain. Lpg0393 is a Legionella pneumophila effector protein. Structure analysis reveals that it has two domains. The N-terminal domain is a Vps9-like domain, which is structurally most similar to the catalytic core of human Rabex-5 that activates the endosomal Rab proteins Rab5, Rab21 and Rab22. The C-terminal domain is a helical bundle domain. The C-terminal helical bundle of Lpg0393 corresponds to the N-terminal helical bundle of Rabex-5, but it lacks an obvious region that corresponds to the membrane-binding motif of Rabex-5. One possibility may be that Lpg0393 localization to endosomes depends on an unknown Legionella effector.	84
408322	pfam18535	Gal11_ABD1	Gal11 activator-binding domain (ABD1). This is the activator-binding domain (ABD1) found in Gal11/med15 proteins. Structural analysis indicates that it binds to the central activator domain (cAD) of Gcn4. Mutation of the Gal11-ABD1 W196 residue abolishes binding to the Gcn4 cAD.	81
408323	pfam18536	DUF5623	Domain of unknown function (DUF5623). This is a domain of unknown function found in proteobacteria.	119
408324	pfam18537	CODH_A_N	Carbon monoxide dehydrogenase subunit alpha N-terminal domain. Acetyl-coenzyme A (CoA) synthase/carbon monoxide dehydrogenase (ACS/CODH) is a bifunctional enzyme that catalyzes the reversible reduction of CO2 to CO (CODH activity). This entry is for the N-terminal domain found in ACS/CODH subunit alpha.	83
408325	pfam18538	DUF5624	Domain of unknown function (DUF5624). This is a domain of unknown function found mainly in bacteria.	129
408326	pfam18539	DUF5625	Domain of unknown function (DUF5625). This is a domain of unknown function found in proteobacteria.	130
408327	pfam18540	DUF5626	Domain of unknown function (DUF5626). This is a domain of unknown function mostly found in firmicutes.	120
408328	pfam18541	RuvC_III	RuvC endonuclease subdomain 3. Cas9 proteins are abundant across the bacterial kingdom, but vary widely in both sequence and size. All known Cas9 enzymes contain an HNH domain that cleaves the DNA strand complementary to the guide RNA sequence (target strand), and a RuvC nuclease domain required for cleaving the noncomplementary strand (non-target strand), yielding double-strand DNA breaks (DSBs). The crystal structures of type II-A and II-C Cas9 proteins highlight the features in Cas9 enzymes that support their function as RNA-guided endonucleases. Cas9 enzymes adopt a bilobed architecture composed of a nuclease lobe containing juxtaposed RuvC and HNH nuclease domains and a variable alpha-helical lobe likely to be involved in nucleic acid binding. The RuvC domain forms the structural core of the nuclease lobe, a six-stranded beta sheet surrounded by four alpha helices, with all three conserved subdomains (I, II, III) contributing catalytic residues to the active site.	160
408329	pfam18542	TFIIB_C_1	Transcription factor IIB C-terminal module 1. In the pathogenic trypanosome, Trypanosoma brucei, transcription factor IIB (tTFIIB) is essential for spliced leader (SL) RNA gene transcription and cell viability, but has a highly divergent primary sequence in comparison to TFIIB in other eukaryotes. Structure analysis of the C-terminal region of trypanosome TFIIB reveals two closely packed helical modules followed by a C-terminal extension of 32 aa. The trypanosome-specific region comprises the second helical module and the C-terminal extension. Both helical modules contain the canonical 5-helix cyclin fold characteristic of TFIIB proteins. This domain is mostly found in Trypanosomatidae.	98
408330	pfam18543	ID	Intracellular delivery domain. This is a C-terminal domain found in BepA proteins from Bartonella henselae. It is a type IV secretion system (T4SS) effector protein. BepA from Bartonella henselae is composed of an N-terminal Fic domain and a C-terminal Bartonella intracellular delivery (ID) domain, the latter being responsible for T4SS-mediated translocation into host cells. The ID domain of BepA mediates inhibition of apoptosis and exhibits an OB (oligonucleotide/oligosaccharide binding)-fold.	55
408331	pfam18544	Polo_box_3	Polo box domain. In metazoans, Plk4 kinases control daughter centriole assembly. Plk4 homologs have an N-terminal kinase domain, a C-terminal polo box, and a central domain termed the 'cryptic polo box' (CPB) that has been shown to dimerize, to be sufficient for centriole localization and to be required for Plk4 to promote centriole assembly. Probable serine/threonine-protein kinase zyg-1 (EC:2.7.11.1) (ZYG-1) is a Plk4 homolog found in C. elegans. The crystal structure of the CPB of C. elegans ZYG-1 reveals that it forms a Z-shaped dimer containing an intermolecular beta-sheet with an extended basic surface patch. Electrostatic interactions between the basic patch on the ZYG-1 CPB dimer and the SPD-2 acidic region dock ZYG-1 onto centrioles to promote new centriole assembly. ZYG-1 CPB contains two tandem polo boxes (PB1 and PB2), each containing a six-stranded beta-sheet with an alpha-helix packed against one side. This entry represents PB2.	97
408332	pfam18545	HalOD1	Halobacterial output domain 1. HalOD1 (Halobacterial output domain 1) is a protein domain that is specific for haloarchaea and their viruses. It is found in a stand-alone version and also in combination with Response_reg pfam00072 and other domains (Galperin et al., 2018, Phyletic distribution and lineage-specific domain architectures of archaeal two-component signal transduction systems).	75
408333	pfam18546	MetOD1	Metanogen output domain 1. MetOD1 (Metanogen output domain 1) is a protein domain that is found in euryarchaeal classes Methanobacteria and Methanomicrobia, either in stand-alone form or in combination with the Response_reg pfam00072 domain (Galperin et al., 2018, Phyletic distribution and lineage-specific domain architectures of archaeal two-component signal transduction systems).	143
375973	pfam18547	HalOD2	Halobacterial output domain 2. HalOD2 (Halobacterial output domain 2) is a protein domain that is found in haloarchaea in combination with the Response_reg pfam00072 domain (Galperin et al., 2018, Phyletic distribution and lineage-specific domain architectures of archaeal two-component signal transduction systems).	52
375974	pfam18548	MetOD2	Metanogen output domain 2. MetOD2 (Metanogen output domain 2) is found in euryarchaeal class Methanomicrobia, usually in combination with the Response_reg pfam00072 domain (Galperin et al., 2018, Phyletic distribution and lineage-specific domain architectures of archaeal two-component signal transduction systems).	88
408334	pfam18549	NitrOD1	Nitrosopumilus output domain 1. NitrOD1 (Nitrosopumilus output domain 1) is found in thaumarchaea, either in stand-alone form or in combination with the Response_reg pfam00072 domain (Galperin et al., 2018, Phyletic distribution and lineage-specific domain architectures of archaeal two-component signal transduction systems).	68
375976	pfam18550	NitrOD2	Nitrososphaera output domain 2. NitrOD2 (Nitrososphaera output domain 2) is found in thaumarchaea, either in stand-alone form or in combination with the Response_reg pfam00072 domain (Galperin et al., 2018, Phyletic distribution and lineage-specific domain architectures of archaeal two-component signal transduction systems).	104
408335	pfam18551	TackOD1	Thaumarchaeal output domain 1. TackOD1 (Thaumarchaeal output domain 1) is a predicted metal-binding domain found in archaea and in some bacteria. It contains 11 highly conserved Cys residues, which form 5 CxxC motifs and an HxxC motif. In several instances, it is found in combination with the Response_reg pfam00072 domain (Galperin et al., 2018, Phyletic distribution and lineage-specific domain architectures of archaeal two-component signal transduction systems).	188
408336	pfam18552	PheRS_DBD1	PheRS DNA binding domain 1. This is a DNA-binding fold domain found in the Phenylalanyl-tRNA Synthetase (EC:6.1.1.20) N-terminal region. This domain belongs to a superfamily of 'winged helix' DNA-binding domains. The topology of DBD-1 and DBD-3 closely resembles the topology of the Z-DNA-binding domain Zalpha of double-stranded RNA (dsRNA) adenosine deaminase and other domains from DNA-binding proteins. Mutational analysis indicates that DBD-1, 2 and 3 play critical roles in tRNA-Phe binding and recognition, as shown by the drastic reduction of aminoacylation activity seen upon removal of the N-terminal domains.	59
408337	pfam18553	PheRS_DBD3	PheRS DNA binding domain 3. This is a DNA-binding fold domain found in the Phenylalanyl-tRNA Synthetase N-terminal region. This domain belongs to a superfamily of 'winged helix' DNA-binding domains. The topology of DBD-1 and DBD-3 closely resembles the topology of the Z-DNA-binding domain Zalpha of double-stranded RNA (dsRNA) adenosine deaminase and other domains from DNA-binding proteins. Mutational analysis indicates that DBD-1, 2 and 3 play critical roles in tRNA-Phe binding and recognition, as shown by the drastic reduction of aminoacylation activity seen upon removal of the N-terminal domains.	57
408338	pfam18554	PheRS_DBD2	PheRS DNA binding domain 2. This is a DNA-binding fold domain found in the Phenylalanyl-tRNA Synthetase N-terminal region. Mutational analysis indicates that DBD-1, 2 and 3 play critical roles in tRNA-Phe binding and recognition, as shown by the drastic reduction of aminoacylation activity seen upon removal of the N-terminal domains. DBD-2 and DBD-3 constitute large insertions sequentially included between two neighboring antiparallel strands of the DBD-1 domain. Moreover, DBD-3 (pfam18553) is a domain insertion into DBD-2.	33
408339	pfam18555	MobL	MobL relaxases. This family includes members of relaxase enzymes. These enzymes initiate bacterial conjugation contributing to the spread of antibiotic resistance. These MobL relaxases are found mainly in Firmicutes. It is suggested that MobL type relaxases play a prominent role in horizontal gene transfer in Firmicutes bacteria. Family members carry a stretch of relaxase motif III 'HUH' sequence that is characteristic of the HUH endonuclease superfamily essential for enzymatic activity.	387
408340	pfam18556	TetR_C_35	Bacterial Tetracyclin repressor, C-terminal domain. This is the C-terminal tetracyclin repressor domain found in bacteria. Family members include TetR family transcriptional regulators Rv3249c and Rv1816 found in Mycobacterium tuberculosis. Palmitic acid (a fatty acid) and isopropyl laurate (a fatty acid ester), were identified as binding ligands to Rv3249c and Rv1816 respectively. Similar to other TetR family regulators, these proteins are alpha-helical dimeric proteins consisting of a smaller N-terminal DNA-binding domain and a larger C-terminal regulatory domain.	105
408341	pfam18557	NepR	Anti-sigma factor NepR. The general stress response sigma factor in alphaproteobacteria, sigma EcfG is inactivated by the anti-sigma factor NepR, which is itself regulated by the response regulator PhyR. NepR forms two helices that extend over the surface of the PhyR subdomains. Homology modeling and comparative analysis of NepR, PhyR and sigmaEcfG mutants indicate that NepR contacts both proteins with the same determinants, showing sigma factor mimicry at the atomic level. This entry represents NepR domains found in alphaproteobacteria.	33
408342	pfam18558	HTH_51	Helix-turn-helix domain. This is helix turn helix domain found in polyketide synthases (PKSs) in fungi. They are multidomain enzymes that biosynthesize a wide range of natural products. Family members include citrinin polyketide synthases which contain a C-methyltransferase (CMeT) domain pfam08242 that adds one or more S-adenosylmethionine (SAM)-derived methyl groups to the carbon framework.	90
408343	pfam18559	Exop_C	Galactose-binding domain-like. This is the C-terminal domain found in ExoP (exo-1,3/1,4-beta-glucanase) from Pseudoalteromonas. This domain contains a beta-sandwich fold which is common in glycosyl hydrolases (GH7, 11, 12 and 16) and in some 23 carbohydrate-binding modules. It is suggested that the main role of this domain is to provide structural stability necessary for ExoP activity, however no substrate-binding role has been shown.	155
408344	pfam18560	Lectin_like	Lectin like domain. This is a lectin like domain found in Cwp84, a surface-located cysteine protease (a member of the C1A cysteine protease family, also known as papain proteases) responsible for the maturation of the SlpA precursor protein which has been implicated in the degradation of extracellular matrix proteins such as fibronectin, laminin and vitronectin. Structural comparison indicates that this domain is similar to carbohydrate-binding domains.	157
408345	pfam18561	Regnase_1_C	Endoribonuclease Regnase 1/ ZC3H12 C-terminal domain. This is the C-terminal domain found in regnase-1, an RNase that directly cleaves mRNAs of inflammatory genes such as IL-6 and IL-12p40, and negatively regulates cellular inflammatory responses. The C-terminal domain is composed of three alpha helices and resembles ubiquitin associated protein 1 in structure.	44
408346	pfam18562	CIDR1_gamma	Cysteine-Rich Interdomain Region 1 gamma. Rosetting is the capacity of infected RBCs to bind uninfected RBCs, which is consistently associated with severe malaria in African children. The rosette-forming PfEMP1 adhesins, namely IT4/R29, Palo Alto 89F5 VarO, 3D7/PF13_0003 and IT4/var60, belong to a specific sub-group called groupA/UpsA var genes and all four present a specific Duffy Binding-Like and Cysteine-Rich Interdomain Region (DBL1alpha1-CIDR1gamma) double domain Head region found at the extracellular region of PfEMP1. This entry represents the CIDR1gamma domain which increases the binding affinity to VarO (Palo Alto VarO parasites).	52
408347	pfam18563	TubC_N	TubC N-terminal docking domain. This is the N-terminal docking domain found in TubC proteins from the tubulysin polyketide synthase and nonribosomal polypeptide synthetase (PKS-NRPS) system, which binds to C-terminal docking domain of TubB.	52
408348	pfam18564	Glyco_hydro_5_C	Glycoside hydrolase family 5 C-terminal domain. This is the C-terminal domain of endo-glycoceramidase II (EGC), a membrane-associated family 5 glycosidase pfam00150. The C-terminal domain assumes a beta-sandwich fold, which resembles that of many carbohydrate-binding modules.	86
408349	pfam18565	Glyco_hydro2_C5	Glycoside hydrolase family 2 C-terminal domain 5. Domain 5 is found in dimeric beta-D-galactosidase from Paracoccus sp. 32d, which contributes to stabilization of the functional dimer. It is suggested that the location of this domain 5, may be one of the factors responsible for the creation of a functional dimer and cold-adaptation of this enzyme.	103
408350	pfam18566	Ldi	Linalool dehydratase/isomerase. This (alpha,alpha)6 barrel fold domain is found in linalool dehydratase/isomerase (Ldi) EC:4.2.1.127. An enzyme found in the betaproteobacterium Castellaniella defragrans 65Phen that mineralizes monoterpenes coupled to anaerobic denitrification. The periplasmic enzyme reversibly catalyzes the isomerisation from the primary alkenol geraniol into the tertiary alkenol (S)-linalool and its dehydration to beta-myrcene. Each monomer is built up of a classical (alpha,alpha)6 barrel fold composed of six inner helices. Structural data of Ldi revealed the terpene binding site between two monomers inside a hydrophobic channel, and three catalytic clusters involved in catalysis.	307
408351	pfam18567	TIR_3	Toll/interleukin-1 receptor domain. This is a Toll/interleukin-1 receptor (TIR) domain found in the N-terminal region of B-cell adaptor for phosphoinositide 3-kinase (BCAP). BCAP functions in linking the B-cell receptor (BCR) and the co-receptor CD19 to the activation of PI3K via interaction with the SH2 domains on the regulatory p85 subunit. BCAP TIR associates with the MAL/TIRAP adaptor and the TIR domains of Toll-like receptors (TLRs).	131
408352	pfam18568	COS	TRIM C-terminal subgroup One Signature domain. This domain is found in the C-terminal region of the TRIM subgroup C-1 proteins such as E3 ubiquitin-protein ligase Midline-1 protein which is required for the proper development during embryogenesis. Mutations of MID1 are associated with X-linked Opitz G syndrome, characterized by midline anomalies. This domain is also found in MURF1-3 proteins that do not contain the FNIII and B30.2 domains. MURF1-3 proteins are also associated with microtubules. Deletion of the COS domain does not affect MID1 dimerization but disrupts the localization to the microtubules.	52
408353	pfam18569	Thioredoxin_16	Thioredoxin-like domain. This is a thioredoxin like domain found in AIMP2 proteins (Aminoacyl tRNA synthetase complex interacting multifunctional protein 2). Aimp2 is a component of human multi-tRNA synthetase complex (MSC). MSC is a macromolecular protein complex consisting of nine different ARSs and three ARS-interacting multifunctional proteins (AIMPs).	93
408354	pfam18570	Nup54_57_C	NUP57/Nup54 C-terminal domain. The nuclear pore complex (NPC) constitutes the sole gateway for bidirectional nucleocytoplasmic transport. NPCs are formed by multiple copies of 34 distinct proteins, termed nucleoporins (nups). In yeast, the channel nups Nsp1, Nup49, and Nup57 constitute part of the central transport channel and form the diffusion barrier with their disordered phenylalanine-glycine (FG) repeats. Structural studies of yeast Nup57 indicate that it contains left-handed coiled-coil domains (CCD1-3). This entry represents the third CCD located at the C-terminal region of Nup57 which is composed of five heptad repeats. Nup57 in yeast is the equivalent to human Nup54 pfam13874.	29
408355	pfam18571	VWA_3_C	von Willebrand factor type A C-terminal domain. This is the C-terminal domain of von Willebrand factor type A pfam13768.	47
408356	pfam18572	T6PP_N	Trehalose-6-phosphate phosphatase N-terminal helical bundle domain. This is the N-terminal domain found in trehalose-6-phosphate phosphatase (T6PP, EC 3.1.3.12) from parasitic nematodes such as Brugia malayi. In the model nematode Caenorhabditis elegans, T6PP is essential for survival due to the toxic effect(s) of the accumulation of trehalose 6-phosphate. T6PP has also been shown to be essential in Mycobacterium tuberculosis. The N-terminal domain composed of a three-helix bundle is similar in topology to the Microtubule Interacting and Transport (MIT) domains of the Vps4-like ATPases from Sulfolobus acidocaldarius. MIT domains are protein-interacting domains typically associated with multivesicular body formation, cytokinetic abscission, or viral budding. Mutational analysis indicates that deletion or mutation of the MIT-like domain is highly destabilizing to the enzyme.	98
408357	pfam18573	BclA_C	BclA C-terminal domain. This is the C-terminal domain of BclA (Bacillus collagen-like protein of anthracis) which is expressed on spores of Bacillus species. Trimers of the C-terminal domain (CTD) form the tips of the spore's hair-like nap and are the immunodominant target of vertebrate antibodies and drive trimerization. Structural analysis indicates that the C-terminal region of the peptide folds into an all-beta structure with a jelly-fold topology, similar to the first human complement C1q, a member of the tumor necrosis factor (TNF)-like family. The C-terminal globular domain has been shown to be located on the exterior of the exosporium, and therefore is critical in determining the immunogenicity of the spore in a mammalian host.	127
408358	pfam18574	zf_C2HC_14	C2HC Zinc finger domain. This is a zinc finger domain together with a linker region found in RNF125, a small protein (25kD) that contains a RING domain, three zinc fingers (ZnFs) and a ubiquitin interacting motif (UIM). The C2HC ZnF plays an essential role in the interaction of RNF125 with the E2 UbcH5a, which originates from the requirement of the C2HC-ZnF for the structural stability of the RING domain. A mutation at one of the contact residues in the C2HC-ZnF, a highly conserved M112, resulted in the loss of ubiquitin ligase activity. Furthermore, mutations at the Zn2+ chelating cysteine residues, C100 and C103 of this domain resulted in a loss of activity.	33
408359	pfam18575	HAMP_N3	HAMP N-terminal domain 3. Aer2 soluble receptor from Pseudomonas aeruginosa contains three successive HAMP domains in the N-terminal region. HAMP domains are widespread prokaryotic signaling modules. This entry is the third N-terminal HAMP domain (HAMP3). HAMP3 adopts a conformation resembling Af1503, with only minor differences in helical tilt and orientation. The basic construction of each HAMP domain consists of a monomeric unit of two parallel alpha helices (AS1 and AS2) joined by an elongated connector of 12-14 residues, which form a parallel four-helix bundle.	43
408360	pfam18576	HTH_52	Helix-turn-helix domain. This is a helix turn helix domain found in bacilli.	64
408361	pfam18577	ASTN_2_hairpin	Astrotactin-2 C-terminal beta-hairpin domain. This is a beta-hairpin domain found at the C-terminal region of astrotactin 2 proteins (ASTN-2). ASTN-2 is an integral membrane perforin-like protein linked to the planar cell polarity pathway in hair cells. It consists of multiple polypeptide folds: a perforin-like domain, a minimal epidermal growth factor-like module, a fibronectin type III domain Fn (III) and an annexin-like domain as well as the beta hairpin domain which packs across the fibronectin domain.	47
408362	pfam18578	Raf1_N	Rubisco accumulation factor 1 alpha helical domain. This is the N-terminal alpha helical domain found in Rubisco accumulation factor1 (Raf1). Raf1 from Arabidopsis thaliana consists of an N-terminal alpha-domain, a flexible linker segment and a C-terminal beta-sheet domain that mediates dimerization. The alpha-domains mediate the majority of functionally important contacts with RbcL (Rubisco large subunits) by bracketing each RbcL dimer at the top and bottom. The alpha-domain alone is essentially inactive.	106
408363	pfam18579	Raf1_HTH	Rubisco accumulation factor 1 helix turn helix domain. This is helix turn helix domain found in alpha helical region of Rubisco accumulation factor1 (Raf1). Raf1 from Arabidopsis thaliana consists of an N-terminal alpha-domain, a flexible linker segment and a C-terminal beta-sheet domain that mediates dimerization. The alpha-domains mediate the majority of functionally important contacts with RbcL (Rubisco large subunits) by bracketing each RbcL dimer at the top and bottom. The alpha-domain alone is essentially inactive.	61
408364	pfam18580	Sun2_CC2	SUN2 coiled coil domain 2. LINC complexes are formed by coupling of KASH (Klarsicht, ANC-1, and Syne/Nesprin Homology) and SUN (Sad1 and UNC-84) proteins from the inner and outer nuclear membranes (INM and ONM, respectively). The formation of LINC complexes by KASH and SUN proteins at the nuclear envelope (NE) establishes the physical linkage between the cytoskeleton and nuclear lamina, which is instrumental for the mechanical force transmission from the cytoplasm to the nuclear interior, and is essential for cellular processes such as nuclear positioning and migration, centrosome-nucleus anchorage, and chromosome dynamics. SUN2 possesses two coiled-coil domains (CC1 and CC2). These coiled-coil domains are also believed to act as rigid spacers to delineate the distance between the ONM and INM of the NE. Furthermore, the two coiled-coil domains of SUN2 have been indicated to be able to directly modulate SUN domain activity and regulate the subsequent interactions between the SUN and KASH domains. CC2 forms a three-helix bundle to lock the SUN domain in an inactive conformation acting as an inhibitory component. Structure-based sequence analysis demonstrated that several Gly residues are located in the flexible linker regions between the three helices which would ideally provide the breaks/turns in CC2 for three-helix bundle formation. The last helix alpha3 of CC2 (that is immediately connected to the SUN domain) has been shown to be an essential segment for promoting SUN domain trimerization in the SUN-KASH complex structure.	58
408365	pfam18581	SYCP2_ARLD	Synaptonemal complex 2 armadillo-repeat-like domain. Synaptonemal complex protein 2 (SYCP2) N-terminal region contains two separate subdomains an ARLD (armadillo-repeat-like domain) and an SLD (Spt16M-like domain). The ARLD domain belongs to the armadillo-repeat protein family. Armadillo-repeat units often form a superhelix, which typically provides a platform for many protein partners that transduce Wnt signaling, such as beta-catenin. The ARLD of mouse SYCP2 was found to associate with different protein partners, including CENP J and CENP F. ARLD structure is highly similar to that of the 'required for cell differentiation (RCD-1)' protein.	171
408366	pfam18582	HZS_alpha	Hydrazine synthase alpha subunit middle domain. The crystal structure of hydrazine synthase multiprotein complex isolated from the anammox organism Kuenenia stuttgartiensis implies a two-step mechanism for hydrazine synthesis: a three-electron reduction of nitric oxide to hydroxylamine at the active site of the gamma-subunit and its subsequent condensation with ammonia, yielding hydrazine in the active centre of the alpha-subunit. The alpha-subunit consists of three domains: an N-terminal domain which includes a six-bladed beta-propeller, a middle domain binding a pentacoordinated c-type haem (haem alphaI) and a C-terminal domain which harbours a bis-histidine-coordinated c-type haem (haem alphaII). This entry represents the middle domain of subunit alpha of hydrazine synthase (HZS).	98
408367	pfam18583	Arnt_C	Aminoarabinose transferase C-terminal domain. ArnT is a member of the GT-C family of glycosyltransferases, and it has a similar fold to a bacterial oligosaccharyltransferase (OST) from Campylobacter lari (PglB) and to an archaeal OST from Archaeoglobus fulgidus (AglB). This entry represents the C-terminal periplasmic domain of Arnt proteins.	103
408368	pfam18584	SYCP2_SLD	Synaptonemal complex 2 Spt16M-like domain. Synaptonemal complex protein 2 (SYCP2) N-terminal region contains two separate subdomains an ARLD (armadillo-repeat-like domain) and an SLD (Spt16M-like domain). The SLD structure is highly similar to the middle domain of the histone chaperone FACT. It consists of a twisted ten-stranded beta-sheet flanked by two helices. Since the SLD domain structurally resembles Spt16M, which is known as the well-recognized histone protein H2A-H2B; it is speculated that the SLD may be involved in chromatin binding.	111
408369	pfam18585	zf-CCCH_6	Chromatin remodeling factor Mit1 C-terminal Zn finger 2. The Snf2/Hdac Repressive Complex (SHREC) is the fission yeast nucleosome remodeling and deacetylation (NuRD) equivalent and plays a major role in transcriptional gene silencing (TGS) within S. pombe heterochromatin. SHREC consists of the chromatin remodeler Mit1, the HDAC Clr3, and Clr1 and Clr2 proteins. The Mit1 C terminus contains two zinc binding motifs (CCHC zinc fingers) at the C-terminus onto which the alpha helices from both Clr1 and Mit1 pack. The Mit1 chromatin remodeler uses its C-terminal domain to intimately bind to the N-terminal half of Clr1 to integrate into the SHREC complex. This entry represents the second C-terminal zinc-binding domain found on Mit-1.	53
376012	pfam18586	zf-CCCH_7	Chromatin remodeling factor Mit1 C-terminal Zn finger 1. The Snf2/Hdac Repressive Complex (SHREC) is the fission yeast nucleosome remodeling and deacetylation (NuRD) equivalent and plays a major role in transcriptional gene silencing (TGS) within S. pombe heterochromatin. SHREC consists of the chromatin remodeler Mit1, the HDAC Clr3, and Clr1 and Clr2 proteins. The Mit1 C terminus contains two zinc binding motifs (CCHC zinc fingers) at the C-terminus onto which the alpha helices from both Clr1 and Mit1 pack. The Mit1 chromatin remodeler uses its C-terminal domain to intimately bind to the N-terminal half of Clr1 to integrate into the SHREC complex. This entry represents the first C-terminal zinc-binding domain found on Mit-1.	90
408370	pfam18587	PLL	PTX/LNS-Like (PLL) domain. Adhesion G protein-coupled receptors (aGPCRs) play critical roles in diverse neurobiological processes including brain development, synaptogenesis, and myelination. The aGPCR GPR56/ADGRG1 regulates both oligodendrocyte and cortical development. The N-terminal domain of GPR56 has low sequence identity and a fold that likely diverged from the PTX and LNS domains. It also has a conserved motif (HphiC91xxWxxxxG) that was identified among canonical PTX domains. Thus, it is termed the Pentraxin/Laminin/neurexin/sex-hormone-binding-globulin-Like (PLL) domain. Truncation-based analyses suggest that the regions of GPR56 responsible for binding TG2 and collagen III are within the PLL domain, most likely in the surface-exposed conserved patch. Furthermore, it is suggested that the conserved patch of the PLL domain mediates an essential function in CNS myelination.	134
408371	pfam18588	WcbI	Polysaccharide biosynthesis enzyme WcbI. Capsular polysaccharides (CPSs) are protective structures on the surfaces of many Gram-negative bacteria. wcbI is one of several genes in the CPS biosynthetic cluster whose deletion leads to significant attenuation of the pathogen. Structural analysis and biophysical assays suggest that WcbI functions as an acetyltransferase enzyme but it requires another functional module to carry out this function. WcbI adopts a predominantly helical fold where the N-terminal 100 amino acids form a ligand-binding domain and binds tightly to coenzyme A and its derivative acetyl-CoA.	207
408372	pfam18589	ObR_Ig	Obesity receptor immunoglobulin like domain. This is the immunoglobulin-like domain (IGD) found in obesity receptors (ObR). ObR is a single membrane-spanning receptor belonging to the class I cytokine receptor family. All isoforms have an identical extracellular part consisting of six domains: an N-terminal domain (NTD), two CRH domains (CRH1 and CRH2), an immunoglobulin-like domain (IGD), and two additional membrane-proximal fibronectin type III (FN III) domains. ObR activation depends on the CRH2, IGD, and FN III domains, however the CRH2 domain is the major leptin-binding determinant in the receptor. The IGD and membrane-proximal domains have no detectable affinity for the ligand, but are nonetheless indispensable for receptor activation. Deletion of the IGD results in a receptor with wild-type affinity for leptin, but completely devoid of biological activity.	105
408373	pfam18590	IMP2_N	Immune Mapped Protein 2 (IMP2) N-terminal domain. Immune Mapped Protein 2 (IMP2) N-terminal domain which is conserved across both IMP1 and IMP2 families. It is suggested that the globular domain likely contributes to a shared function, hence it is termed 'IMP1-like domain'.	87
408374	pfam18591	IMP2_C	Immune Mapped Protein 2 (IMP2) C-terminal domain. Immune Mapped Protein 2 (IMP2) C-terminal domain.	63
408375	pfam18592	Tho1_MOS11_C	Tho1/MOS11 C-terminal domain. THO is a multi-protein complex involved in the formation of messenger ribonuclear particles (mRNPs) by coupling transcription with mRNA processing and export. Some studies show that Tho1, like Sub2, can assemble onto the nascent mRNA during transcription and that Tho1 and Sub2 can provide alternative pathways for mRNP biogenesis in the absence of a functional THO complex. This is the C-terminal domain found in Tho1 and MOS11 proteins. The C-terminal region of Tho1 from Saccharomyces cerevisiae, adopts a helical fold similar to that of the WHEP RNA-binding domains of metazoan aminoacyl-tRNA synthetases.	37
408376	pfam18593	CdiI_2	CdiI immunity protein. Contact-dependent growth inhibition (CDI) is an important mechanism of inter-bacterial competition found in many Gram-negative pathogens. CDI+ cells express cell-surface CdiA proteins that bind neighboring bacteria and deliver C-terminal toxin domains (CdiA-CT) to inhibit target-cell growth. CDI+ bacteria also produce CdiI immunity proteins, which specifically neutralize cognate CdiA-CT toxins to prevent self-inhibition. Structure analysis of CdiI immunity protein from Yersinia kristensenii shows that it is composed of eight alpha-helices packed together to form a nearly spherical structure with weak structural homology to a putative TetR family transcriptional repressor. The CdiI protein fits into the curved cavity of the CdiA-CTYkris toxin domain where it most likely neutralizes toxin activity by blocking access to RNA substrates. This domain is mostly found in gammaproteobacteria.	91
408377	pfam18594	Sas6_CC	Sas6/XLF/XRCC4 coiled-coil domain. This is a coiled-coil domain found at the C-terminal of spindle assembly abnormal protein 6 (Sas6). The highly conserved protein SAS-6 constitutes the center of the cartwheel assembly that scaffolds centrioles early in their biogenesis. Structural analysis of Sas6 shows that, similar to XLF and XRCC4, it forms a parallel coiled-coil dimer.	30
408378	pfam18595	DHR10	Designed helical repeat protein 10 domain. Repeat proteins composed of multiple tandem copies of a modular structure unit1 are widespread in nature and have critical roles in molecular recognition, signaling, and other essential biological processes. This entry describes a MazG related domain also designated as Designed helical repeat protein 10 (DHR10). This domain is also found at the N-terminal region of Nuf2 proteins pfam03800.	117
408379	pfam18596	Sld7_C	Sld7 C-terminal domain. This is an alpha helical domain found at the C-terminal region of Sld7 proteins. Yeast Sld3 and its metazoan counterpart treslin are the hub proteins mediating protein associations critical for formation of the replicative helicase at the replication origins of chromosomes. Sld7 forms a complex with Sld3 throughout the cell cycle, and associates with and dissociates from origins in an Sld3-dependent manner and is thought to regulate the function of Sld3. Structural analysis of S. cerevisiae Sld7 indicates that two Sld7 molecules form a homodimer using their C-terminal domains.	77
408380	pfam18597	SH3_19	Myosin X N-terminal SH3 domain. This is the N-terminal Sh3 domain found in myosin X. Myosin X is essential for neuritogenesis, wound healing, cancer metastasis and some pathogenic infections. Myosin X is required for filopodia formation and extension.	52
408381	pfam18598	TetR_C_36	Tetracyclin repressor-like, C-terminal domain. This is a C-terminal TetR regulatory domain found in QsdR proteins (quorum-sensing degradation regulation).	111
408382	pfam18599	LCIB_C_CA	Limiting CO2-inducible proteins B/C beta carbonic anhydrases. Limiting CO2-inducible B protein (LCIB)-LCIC complex plays an important role in the microalgal CO2-concentrating mechanisms (CCMs). LCIB and homologs (LCIB1-4 and LCIC) structurally resemble beta carbonic anhydrases (b-CAs) with striking similarities in overall fold, zinc-binding motif, and especially putative active site architecture.	222
408383	pfam18600	Ezh2_MCSS	MCSS domain. Polycomb repressive complex 2 (PRC2) carries out the methylation of lysine 27 of histone H3, a hallmark of repressive chromatin. Three core subunits make up the catalytic core of PRC2; the SET domain containing EZH2, the zinc-finger containing SUZ12 and the WD40 repeat protein EED. The complex forms a compact arrangement of three lobes. The middle lobe largely comprises two domains that mark the beginning of the carboxy (C)-terminal region of EZH2 (MCSS and SANT2) and the helical, C-terminal, component of the Suz12 Vefs domain. This entry describes the MCSS (also known as SANT2L) domain. There is one zinc binding (Zn1Cys3His1) which is formed solely by MCSS.	53
408384	pfam18601	EZH2_N	EZH2 N-terminal domain. Polycomb repressive complex 2 (PRC2) carries out the methylation of lysine 27 of histone H3, a hallmark of repressive chromatin. Three core subunits make up the catalytic core of PRC2; the SET domain containing EZH2, the zinc-finger containing SUZ12 and the WD40 repeat protein EED. The complex forms a compact arrangement of three lobes. This is the N-terminal domain of EZH2.	79
408385	pfam18602	Rap1a	Rap1a immunity proteins. The structures of the immunity proteins, Rap1a, responsible for the inhibition and neutralization of Ssp1 endopeptidase, revealed two distinct folds. The structure of the Ssp1-Rap1a complex revealed a tightly bound heteromeric assembly with two effector molecules flanking a Rap1a dimer. The Rap1a subunit displays a compact globular structure constructed from five alpha-helices that assemble to form the highly stable symmetric dimer.	86
408386	pfam18603	LAL_C2	L-amino acid ligase C-terminal domain 2. l-amino-acid ligases (LALs; EC 6.3.2.28) were discovered to be ATP-grasp superfamily enzymes that catalyze the formation of an alpha-peptide bond between two l-amino acids in an ATP-dependent manner. The members of this family share a common structural architecture that consists of three domains referred to as the A-domain, B-domain and C-domain. The C domain can be further divided into the C1-subdomain and the C2-subdomain. This entry represents the C2 subdomain.	78
408387	pfam18604	PreAtp-grasp	Pre ATP-grasp domain. This is a pre-ATP-grasp domain region found at the N-terminus of pfam02222 in Pheganomycin (PGM1).	92
376031	pfam18605	PikAIV_N	Narbonolide/10-deoxymethynolide synthase PikA4 N-terminal domain. Polyketide synthase (PKS) catalyzes the biosynthesis of polyketides, which are structurally and functionally diverse natural products in microorganisms and plants. Type I modular PKSs are the large, multifunctional enzymes responsible for the production of a diverse family of structurally rich and often biologically active natural products. The efficiency of acyl transfer at the interfaces of the individual PKS proteins is thought to be governed by helical regions, termed docking domains (dd), located at the C-terminus of the upstream and N-terminus of the downstream polypeptide chains. This entry represents the N-terminal coiled-coil domain found in PikAIV (module 6) proteins from the Pik PKS system in bacteria. This N-terminal PKS docking domain (KS-side docking domain, KSdd) exhibits a coiled-coil motif and the dimer presents a small hydrophobic patch, sometimes flanked by charged residues, as a narrow binding groove where the ACPdd terminal helix can bind.	30
408388	pfam18606	HTH_53	Zap helix turn helix N-terminal domain. Zinc-finger antiviral protein (ZAP) is a host factor that specifically inhibits the replication of certain viruses, such as HIV-1, by targeting viral mRNA for degradation. This domain is a helix turn helix domain found at the N-terminal region constituting the top cockpit layer of the protein.	62
408389	pfam18607	HTH_54	ParA helix turn helix domain. The accurate segregation of DNA is essential for the faithful inheritance of genetic information. Segregation of the prototypical P1 plasmid par system requires two proteins, ParA and ParB, and a centromere. When bound to ATP, ParA mediates segregation by interacting with centromere-bound ParB, but when bound to ADP, ParA fulfills a different function: DNA-binding transcription autoregulation. ParA consists of an elongated N-terminal alpha-helix which mediates dimerization, a winged-HTH and a Walker-box containing C-domain. This entry describes the N-terminal alpha helix domain combined with the winged HTH region.	92
408390	pfam18608	XAF1_C	XIAP-associated factor 1 C-terminal domain. XIAP-associated factor 1 (XAF1) is a 301-amino-acid interferon (IFN)-inducible pro-apoptotic protein. The XIAP binding region within XAF1, XIAP RING binding site, is located at the C-terminal portion of XAF1. This entry represents the C-terminal region which is functionally identified as XIAP RING-binding domain of XAF1.	51
408391	pfam18609	SAM_Exu	Exuperantia SAM-like domain. Exuperantia (Exu) is associated with localization of bicoid (bcd) mRNA and required for its localization at the anterior pole of the oocyte. Crystal structure of Exu reveals a dimeric assembly with each monomer consisting of a 3'-5' EXO-like domain and a sterile alpha motif (SAM)-like domain. The SAM-like domain interacts with its target RNA as a homodimer and is required for RNA binding activity.	73
408392	pfam18610	Peripla_BP_7	Periplasmic binding protein domain. Treponema pallidum, the bacterium that causes syphilis, is an obligate human parasite. T. pallidum lacks the machinery for the de novo synthesis of many key nutrients therefore it acquires these nutrients from its human host. MglB-2 from T. pallidum has been shown to act as the ligand-binding element of an ABC transporter for D-glucose. The overall fold of MglB-2 resembles those of LBPs (Ligand-binding proteins sometimes called 'Periplasmic Binding Proteins') that serve as receptors for nutrients and cofactors in bacterial ABC transporters. Furthermore, structural analysis of MglB-2 found in Treponema pallidum shows it to be one of the founding members of a family of proteins related to the 'Type I' or 'Cluster B' LBPs. This domain can also be found on the C-terminal region of pfam13407.	71
408393	pfam18611	IL3Ra_N	IL-3 receptor alpha chain N-terminal domain. Interleukin-3 (IL-3) is an activated T cell product that bridges innate and adaptive immunity and contributes to several immunopathologies. Structure of IL-3 receptor alpha chain (IL3Ra) in complex with the anti-leukemia antibody CSL362 reveals that the N-terminal domain (NTD), a domain also present in the granulocyte-macrophage colony-stimulating factor (GM-CSF), contains the CSL362 binding epitope. Furthermore, NTD of IL3Ra adopts a typical fibronectin type III (FnIII) fold.	74
408394	pfam18612	Bac_A_amyl_C	Bacterial Alpha amylase C-terminal domain. This is a bacterial alpha amaylase C-terminal domain found mostly in bacilli.	69
408395	pfam18613	TrkA_TMD	Tyrosine kinase receptor A trans-membrane domain. This receptor consists of 796 amino acids and can be divided in the extracellular ligand-binding domain, the trans-membrane domain, and the intracellular tyrosine kinase domain.This domain is the TMD of TrkA which has shown to be involved in the interaction with amyloid precursor protein (APP).	22
408396	pfam18614	RNase_II_C_S1	RNase II-type exonuclease C-terminal S1 domain. This entry describes the C-terminal S1 domain found in type 2 RNase exonucleases. DrR63 proteins from Deinococcus radiodurans are RNase II-type enzymes (DrII). Structure analysis of DrII indicates that it has an N-terminal HTH domain which interacts with a flexible loop that connects two beta-strands from the conserved C-terminal S1 domain, forming a beta-wing fold common in wHTH domains.	59
408397	pfam18615	SMYLE_N	Short myomegalin-like EB1 binding proteins, N-terminal domain. This N-terminal region is found in SMYLE (for short myomegalin-like EB1 binding protein). It includes the SMYLE homology (SmyH) domain found in the first 100 residues at the N terminus. This conserved SmyH domain is required and sufficient for PKA scaffolding protein AKAP9, and the pericentrosomal protein CDK5RAP2 binding.	388
408398	pfam18616	CdiI_3	CDI immunity proteins. Contact-dependent growth inhibition (CDI) is a widespread mechanism of bacterial competition. CDI+ bacteria deliver the toxic C-terminal region of contact-dependent inhibition A proteins (CdiA-CT) into neighboring target bacteria and produce CDI immunity proteins (CdiI) which bind CdiA-CT domains and neutralize their toxic activity to protect against self-inhibition. CdiI immunity proteins are also variable and only neutralize their cognate CdiA-CT toxins. Structure analysis of CdiI from Escherichia coli 536 (EC536) shows that it is composed of a single domain and that it blocks the interaction with substrate, strongly suggesting that the immunity protein occludes the nuclease active site.	94
408399	pfam18617	Nup214_FG	Nucleoporin Nup214 phenylalanine-glycine (FG) domain. CRM1 is the major nuclear export receptor. During translocation through the nuclear pore, transport complexes transiently interact with phenylalanine-glycine (FG) repeats of multiple nucleoporins. On the cytoplasmic side of the nuclear pore, CRM1 tightly interacts with the nucleoporin Nup214. Nup214 binds to N- and C-terminal regions of CRM1, thereby clamping CRM1 in a closed conformation and stabilizing the export complex. This entry represents an FG repeat region within the C terminus of Nup214 which is required for its interaction with CRM1.	62
408400	pfam18618	HP0268	HP0268. HP0268 is a small, characterized protein that is conserved in H. pylori strains and consists of 80 amino acid residues with a molecular weight of approximately 9.5 kDa. HP0268 has nicking endonuclease and RNase activities, both of which are specific for a single-strand of nucleotides. It is structurally similar to small MutS-related (SMR) domains, that can be categorized roughly into three subfamilies according to their arrangement in the domain architecture. HP0268 falls into subfamily 3 that is found as stand-alone type proteins. It is proposed that HP0268 has become an evolutionary intermediate between RNases and nicking endonucleases during H. pylori adaptation to the extremely acidic environment of the stomach.	80
408401	pfam18619	GAIN_A	GPCR-Autoproteolysis-INducing (GAIN) subdomain A. GPR56 is a cell-surface G protein-coupled receptor (GPCR) which belongs to the adhesion G protein-coupled receptor (aGPCR) family, a large family of chimeric proteins that have both adhesion and signaling functions and play critical roles in diverse neurobiological processes including brain development, synaptogenesis, and myelination. This entry represents GPCR-Autoproteolysis-INducing (GAIN) subdomain A, including PLL-GAIN linker (F161-D175) region.	48
408402	pfam18620	DUF5627	Family of unknown function (DUF5627). This is a domain of unknown function found in bacteria.	133
408403	pfam18621	DUF5628	Family of unknown function (DUF5628). This is a domain of unknown function found in Actinobacteria.	110
408404	pfam18622	HTH_55	RctB helix turn helix domain. RctB is a highly conserved 75.3 kD protein (658 residues), which is unique to the Vibrionaceae. The first 500 amino acids of RctB are sufficient to mediate oriCII-based replication and its C-terminal 165 residues may mediate regulatory processes. RctB contains at least three DNA binding winged-helix-turn-helix motifs, and mutations within any of these severely compromise biological activity. This entry describes domain 1 located at the N-terminal region of RctB proteins. Mutational analysis shows that it binds oriCII DNA, and that this function is critical for the capacity of RctB to mediate oriCII-based replication.	107
408405	pfam18623	TnsE_C	TnsE C-terminal domain. The bacterial transposon Tn7 facilitates horizontal transfer by directing transposition into actively replicating DNA with the element-encoded protein TnsE. Structural analysis of the C-terminal domain of TnsE identified a central V-shaped loop that toggles between two distinct conformations. It is suggested that a conformational change within the C-terminal domain of TnsE underlies target site selection by regulating stable engagement of the target DNA while providing a signal for activating transposition.	145
408406	pfam18624	CdiI_4	CDI immunity protein. Contact-dependent growth inhibition (CDI) is a mechanism of inter-cellular competition in which Gram-negative bacteria exchange polymorphic toxins using type V secretion systems. Structure analysis of the CDI toxin from Escherichia coli NC101 reveals that it has moderate structural homology to Whirly-like proteins found in plastids, but appears to lack the characteristic Whirly RNA-binding site.	104
408407	pfam18625	EspB_PE	ESX-1 secreted protein B PE domain. The ESX-1 secretion system is an important virulence determinant in Mycobacterium tuberculosis. ESX-1 secreted protein B (EspB) contains putative PE (Pro-Glu) and PPE (Pro-Pro-Glu) domains, and a C-terminal domain, which is processed by MycP1 protease during secretion. This domain represents the PE domain located at the N-terminal region of EspB which carries the conserved YxxxD/E secretion motif.	78
408408	pfam18626	Gln_deamidase_2	Glutaminase. Protein glutaminase (PG, EC 3.5.1.44) can deamidate glutamine residues in proteins to glutamate residues. This entry represents the mature PG enzyme which bears partial homology to factor XIII-like Transglutaminase (TG), especially its Cys-His-Asp catalytic triad. A similar triad (Cys-His-Asn) is also shared by some cysteine proteases such as papain and actinidin. The mature PG is a monomer enzyme consisting of 185 amino acid residues.	106
408409	pfam18627	PgdA_N	Peptidoglycan GlcNAc deacetylase N-terminal domain. This is the N-terminal and middle domain found in Streptococcus pneumoniae peptidoglycan GlcNAc deacetylase (SpPgdA). PgdA protects the Gram-positive bacterial cell wall from host lysozymes by deacetylating peptidoglycan GlcNAc residues. It is a member of the family 4 carbohydrate esterases (CE-4).	218
408410	pfam18628	P2_N	Viral coat protein P2 N-terminal domain. P2 (30.2 kDa) is the major outer-coat protein of the marine lipid-containing bacteriophage PM2. Each sub-unit of P2 is composed of two beta barrel jelly rolls, disposed normal to the surface of the capsid, which lend pseudo-6-fold symmetry to the molecules, facilitating their close packing within the capsid. There is a Ca2+ ion located between the two beta barrels of P2 that helps stabilize the molecular organization of PM2. This entry represents the N-terminal jelly roll domain of P2.	127
408411	pfam18629	DUF5629	Family of unknown function (DUF5629). This is a domain of unknown function found in hypothetical proteins from Pseudomonas aeruginosa.	98
408412	pfam18630	Peptidase_M60_C	Peptidase M60 C-terminal domain. This is the C-terminal domain (CTD) of M60-peptidases pfam13402. It can also be found at the C-terminal region of gingipain B (RgpB) from P. gingivalis. It was found to possess a typical Ig-like fold encompassing seven antiparallel beta-strands organized in two beta-sheets, packed into a beta-sandwich structure that can spontaneously dimerize through C-terminal strand swapping. Translocation of gingipains from the periplasm across the OM is dependent on the conserved CTD, which appears to be important for secretion of the proteins and in particular, truncation of the last few C-terminal residues of this domain leads to accumulation of gingipains in the periplasm. Subsequently, the T9SS targeting signal was demonstrated to reside within the last 22 residues at the C-terminus of the CTD. During gingipain translocation across the OM, the CTD is cleaved off by PorU.	65
408413	pfam18631	Cucumopine_C	Cucumopine synthase C-terminal helical bundle domain. McbB from Marinactinospora thermotolerans is an enzyme that catalyzes the Pictet-Spengler (PS) reaction of L-tryptophan and oxaloacetaldehyde to produce the betaC scaffold of marinacarbolines. This is the C-terminal domain composed of 5 bundled alpha helices. It is weakly similar to the signal transduction histidine-protein kinase BarA from E. coli and the DNA endonuclease I-MsoI from Monomastix sp.	141
408414	pfam18632	DUF5630	Family of unknown function (DUF5630). This is a domain of unknown function mostly found in Legionella.	218
408415	pfam18633	zf-CCCH_8	Zinc-finger antiviral protein (ZAP) zinc finger domain 3. Zinc-finger antiviral protein (ZAP) is a host factor that specifically inhibits the replication of certain viruses, such as HIV-1, by targeting viral mRNA for degradation. N-terminal domain of ZAP is the major functional domain which contains four zinc-finger motifs. This entry represents the third zinc finger type CCCH.	28
408416	pfam18634	RXLR_WY	RXLR phytopathogen effector protein WY-domain. Filamentous plant pathogens cause devastating diseases of crops. Phytophthora infestans, the Irish potato famine pathogen, facilitates disease on its hosts by delivering effector proteins that modulate host cell processes to the benefit of the parasite, a strategy used by many biotrophic plant pathogens. The Phytophthora infestans RXLR-type effector PexRD54 binds potato ATG8 via its ATG8 family-interacting motif (AIM) and perturbs host-selective autophagy. The N-terminal region of PexRD54 contains 5 tandem WY domains. The WY domain is a conserved structural unit consisting of three alpha-helices and two characteristic hydrophobic amino acids, frequently W (Trp) and Y (Tyr), which contribute to a stable hydrophobic core. Deletion analysis shows that the WY domains of PexRD54 are dispensable for ATG8CL binding, suggesting an alternative function for these domains.	51
408417	pfam18635	EpCAM_N	Epithelial cell adhesion molecule N-terminal domain. EpCAM (epithelial cell adhesion molecule), a stem and carcinoma cell marker, is a cell surface protein involved in homotypic cell-cell adhesion via intercellular oligomerization and proliferative signalling via proteolytic cleavage. Structure analysis indicates that it is composed of three domains: N-domain, Thyroglobulin type-1A (TY) domain and the C-terminal domain. This entry represents the small and compact disulphide-rich N-terminal domain of 39 amino-acid residues.	33
408418	pfam18636	Sld7_N	Mitochondrial morphogenesis protein SLD7 N-terminal domain. The initiation of eukaryotic chromosomal DNA replication requires the formation of an active replicative helicase at the replication origins of chromosomes. Yeast Sld3 and its metazoan counterpart treslin are the hub proteins mediating protein associations critical for formation of the helicase. The Sld7 protein interacts with Sld3, and the complex formed is thought to regulate the function of Sld3. Although Sld7 is a non-essential DNA replication protein that is found in only a limited range of yeasts, its depletion slowed the growth of cells and caused a delay in the S phase. Structure analysis indicates that the N-terminal domain of Sld7 binds to the N-terminal region of Sld3.	122
408419	pfam18637	AUDH_Cupin	Aldos-2-ulose dehydratase/isomerase (AUDH) Cupin domain. The enzyme aldos-2-ulose dehydratase/isomerase (AUDH) participates in carbohydrate secondary metabolism, catalyzing the conversion of glucosone and 1,5-d-anhydrofructose to the secondary metabolites cortalcerone and microthecin, respectively. Crystal structure analysis revealed that the enzyme subunit is built up of three domains, an N-terminal seven-bladed propeller, a bicupin and a C-terminal lectin domain. This entry describes the second Cupin domain (residues 574-739) composed of two antiparallel sheets that build up the jellyroll sandwich fold formed from four and five beta-strands. This cupin domain in AUDH is found to contain a zinc binding site where the metal site is located at the bottom of the cleft formed by the beta-sandwich, as observed in many cupins.	156
408420	pfam18638	CyRPA	Cysteine-Rich Protective Antigen 6 bladed domain. Plasmodium falciparum Cysteine-Rich Protective Antigen (PfCyRPA) is a 42.8 kDa protein of 362 residues with a predicted N-terminal secretion signal. It is part of a multi-protein complex including the PfRH5-interacting protein PfRipr and the reticulocyte binding-like homologous protein PfRH5, which binds to the erythrocyte receptor basigin. PfRH5, PfCyRPA, and PfRipr colocalize during parasite invasion at the junction between merozoites and erythrocytes. The complex seems to be required both for triggering Ca2+ release and establishment of tight junctions. PfCyRPA adopts a 6-bladed beta-propeller structure with similarity to the classic sialidase fold, but it has no sialidase activity and fulfills a purely non-enzymatic function. Each blade of the propeller is constructed by a four-stranded anti-parallel beta-sheet.	315
408421	pfam18639	Longin_2	Yeast longin domain. This is a longin domain which is found in the N-terminal region of Lst4 proteins in yeast. Lst4 is the Fnip1/2 orthologue found in mammals. Lst4 forms a complex with Lst7 and are targeted to the vacuole when the cells are starved of carbon, and to a lesser extent nitrogen. Lst4 and Fnip1/2 belong to the DENN family of proteins which comprise an N-terminal longin domain, commonly found in a variety of trafficking proteins, and a C-terminal DENN domain. This domain is made up of a core five-strand beta-sheet, with one short alpha-helix.	159
408422	pfam18640	LepB_N	LepB N-terminal domain. Rab GTPases constitute the largest family of small GTP-binding proteins that act as molecular switches in regulating vesicular transport in eukaryotic cells. LepB is a Rab GTPase-activating protein (GAP) effector found in Legionella pneumophila. This entry represents the N-terminal domain which is followed by a GAP domain.	183
408423	pfam18641	LidA_Long_CC	LidA long coiled-coil domain. LidA, another Rab1-interacting bacterial effector protein, is translocated by Legionella into the host cytosol at the beginning of infection, and it localizes to the Legionella-containing vacuole (LCV) at the cytosolic surface. It has been shown that tight interaction with Rab1 allows LidA to facilitate the Legionella targeting factor (DrrA/SidM)-catalyzed release of Rab1 from GDP dissociation inhibitors (GDI). The base of the protein is formed by two antiparallel coiled-coil structures forming a long coiled-coil domain. This region of LidA interacts with switch and interswitch regions of Rab1 the nucleotide binding pocket of Rab8a, hence blocking access to the GDP/GTP-binding site to a great extent.	177
408424	pfam18642	IMPa_helical	Immunomodulating metalloprotease helical domain. IMPa is an immunomodulator metalloprotease that belongs to the peptidase M60 family pfam13402. This entry represents the helical domain found at the N-terminal of the Ig domain.	107
408425	pfam18643	RE_BsaWI	BsaWI restriction endonuclease type 2. Type II restriction endonucleases recognize short 4-8 bp nucleotide sequences and cleave phosphodiester bonds within or close to their target site. BsaWI restriction endonuclease from the thermophilic bacterium Bacillus stearothermophilus W1718 belongs to a group of restriction endonucleases that share a CCGG motif within their target sites, termed 'CCGG-family'. However, the R-(D/E)R motif residues, which are supposed to recognize CCGG from the major groove side, are poorly ordered and located far away from the DNA bases. BsaWI contacts the CCGG tetranucleotide from the minor groove side. It is folded into two domains: an N-terminal helical domain and a C-terminal catalytic domain. Furthermore, it carries a PDXKXE motif at the putative active site.	105
408426	pfam18644	Phage_int_SAM_6	Phage integrase SAM-like domain. Xer recombinases are members of the tyrosine site-specific recombinase superfamily, a large group of enzymes that catalyze DNA breakage and rejoining using a conserved tyrosine nucleophile. Tyrosine recombinases promote various programmed DNA rearrangements including the monomerization of phage, plasmid and chromosome multimers, resolution of hairpin telomeres, and the movement of virulence and antibiotic resistance carrying integrative mobile genetic elements. Structural analysis of Helicobacter pylori XerH indicates that this N-terminal domain consisting of six alpha-helices contacts the DNA using a four-helix bundle.	132
408427	pfam18645	DUF5631	Family of unknown function (DUF5631). This is an alpha helical domain found at the C-terminal region of the hypothetical protein Rv3899c from Mycobacterium tuberculosis which is conserved across mycobacteria.	96
408428	pfam18646	DUF5632	Family of unknown function (DUF5632). This is an alpha-beta-alpha domain found at the N-terminal region of Rv3899c, a hypothetical protein from Mycobacterium tuberculosis which is conserved across mycobacteria.	80
408429	pfam18647	Fungal_lectin_2	Alpha-galactosyl-binding fungal lectin. This domain can be found in alpha-galactosyl binding Lyophyllum decastes lectin (LDL). It is composed of a five-stranded anti-parallel beta-sheet and two alpha-helices and contains conserved cysteines responsible for disulfide bridges. The protein with the highest similarity is ginkbilobin-2, a protein with apparent anti-fungal properties isolated from the seeds of the ginkgo biloba tree. Homologous sequences can be divided into two groups, where the proteins in the group with closest homology to LDL only consist of a single LDL-like domain. In the second group of sequences, the LDL-like domain is found at the C-terminal end of a larger domain with homology to members of the Ser, Gly, Asn, His consensus sequence (SGNH)-hydrolase family, which is part of the Gly, Asp, Ser, Leu motif-esterase/lipase superfamily.	102
408430	pfam18648	ADPRTs_Tse2	Tse2 ADP-ribosyltransferase toxins. Tse2 from P. aeruginosa has structural features similar to ADP-ribosylating toxins. It is a cytoactive toxin secreted by a type six secretion apparatus of Pseudomonas aeruginosa and found mostly in gamma proteobacteria. It naturally attacks a target in the cytoplasm of bacterial cells. Structural analysis shows similarity between Tse2 and nicotinamide adenine dinucleotide (NAD)-dependent enzymes from bacteria, notably the mono-ADP-ribosyltransferase toxins (ADPRTs). Furthermore, it revealed that the Tse2 active site is occluded upon binding the cognate immunity protein Tsi2. The abrogation of toxicity for the R14A, S80A, and H122A mutant Tse2 proteins indicates the importance of these amino acids in the mechanism of Tse2 toxicity and, given their conservation with NAD-reactive enzymes, also supports their assignment as being involved in a catalytic reaction.	155
408431	pfam18649	EcpB_C	EcpB C-terminal domain. This is an immunoglobulin like domain found at the C-terminal region of EcpB. It is a periplasmic chaperone which, along with EcpE, helps assemble the E. coli common pilus (ECP) EcpA and EcpD subunits. The C-terminal domain is predicted to contain residues that might be involved in binding a C-terminal carboxylate anchor.	72
408432	pfam18650	IMPa_N_2	Immunomodulating metalloprotease N-terminal domain. PA0572 of P. aeruginosa is an inhibitor of PSGL-1, also known as an immunomodulating metalloprotease of P. aeruginosa (IMPa). IMPa prevents neutrophil extravasation and thereby protects P. aeruginosa from neutrophil attack. It belongs to the peptidase M60 family pfam13402. This entry represents the N-terminal alpha/beta-fold domain.	200
408433	pfam18651	CshA_NR2	Surface adhesin CshA non-repetitive domain 2. The multifunctional fibrillar adhesin CshA, which mediates binding to both host molecules and other microorganisms, is an important determinant of colonization by Streptococcus gordonii, an oral commensal and opportunistic pathogen of animals and humans. CshA binds the high-molecular-weight glycoprotein fibronectin (Fn) via an N-terminal non-repetitive region, and this protein-protein interaction has been proposed to promote S. gordonii colonization at multiple sites within the host. This 259-kDa polypeptide is organized in the form of a leader peptide (residues 1-41), a non-repetitive region (residues 42-778), 17 repeat domains (R1-R17, each about 101-aa residues), and a C-terminal cell wall anchor. The non-repetitive Fn-binding region of CshA in turn is composed of three distinct domains, designated as non-repetitive domain 1 (NR1, CshA(42-222)), non-repetitive domain 2 (NR2, CshA(223-540)), and non-repetitive domain 3 (NR3, CshA(582-814)). The NR2 domain of CshA is shown to adopt a globular structure with a lectin-like fold and a ligand-binding site on its surface with structural homologues identified as those involved in binding carbohydrates or glycoproteins.	266
408434	pfam18652	Adhesin_P1_N	Adhesin P1 N-terminal domain. The cariogenic bacterium Streptococcus mutans uses adhesin P1 to adhere to tooth surfaces, extracellular matrix components, and other bacteria. The N terminus forms a stabilizing scaffold by wrapping behind the base of P1's elongated stalk and physically 'locking' it into place. It is suggested that the N terminus has a pronounced impact on P1 immunogenicity, antigenicity, folding, stability, and adherent function.	106
408435	pfam18653	Arcadin_1	Arcadin 1. Arcadin-1 is encoded by the arcade gene cluster, which also encodes crenactin. Crenactin is a filament-forming protein from the crenarchaeon Pyrobaculum calidifontis which shows exceptional similarity to eukaryotic F-actin. Arcadin-1, on the other hand, does not seem to be related to any known eukaryotic actin binding proteins, nor does it affect crenactin polymerisation.	111
408436	pfam18654	LegC3_N	LegC3 N-terminal coiled-coil domain. LegC3 is an effector protein secreted by Legionella pneumophila which is believed to act by inhibiting vacuolar fusion. The N-terminal domain of LegC3 is composed of a long discontinuous coiled-coil capped by an antiparallel four-helix bundle that is rigidly connected to the coiled-coil segment. The features responsible for fusion inhibition are located mainly within the N-terminal domain, as this domain alone was sufficient to inhibit vacuole fusion, and that this domain remains associated with vacuoles.	296
408437	pfam18655	SHIRT	SHIRT domain. The SHIRT domain is found in a range of presumed bacterial adhesin proteins.	82
408438	pfam18656	DUF5633	Family of unknown function (DUF5633). This entry represents a 40 residue repeat that is often found in tandem in a small set of bacterial cell surface proteins. The function of this region is not known.	41
408439	pfam18657	YDG	YDG domain. This presumed domain is found in a wide variety of bacterial cell surface proteins. This domain has a highly conserved YDG motif near its N-terminus. This domain is likely related to the pfam17883 domain.	82
408440	pfam18658	zf-C2H2_12	Zinc-finger C2H2-type. This is a zinc finger domain C2H2 type which can be found in SPIN1 docking protein (SPIN-DOC) and Epm2a-interacting protein 1 (Epm2aip1). SPIN-DOC is a Spindlin1 (SPIN1) regulator that directly binds and strongly disrupts its histone methylation reading ability, causing it to disassociate from chromatin. Epm2aip1 is a glycogen synthase (GS)-associated protein. In the absence of Epm2aip1, the sensitivity of the liver to insulin, in which GS is a principal actor, is impaired.	64
408441	pfam18659	CelTOS	Cell-traversal protein for ookinetes and sporozoites. Cell-traversal protein for ookinetes and sporozoites (CelTOS) is a conserved protein that is essential for traversal of malaria parasites in both the mosquito vector and human host and is therefore critical for malaria transmission and disease pathogenesis. It specifically binds phosphatidic acid commonly present within the inner leaflet of plasma membranes, and potently disrupts liposomes composed of phosphatidic acid by forming pores. CelTOS resembles class I viral membrane fusion glycoproteins and a bacterial pore-forming toxin with roles in membrane binding and disruption. CelTOS forms an alpha helical dimer that resembles a tuning fork. Structure analysis indicates that it has a distinct structural architecture with two subdomains that independently resemble membrane binding and/or disrupting proteins and could simultaneously act during disruption.	116
408442	pfam18660	Tsi6	Tsi6. Tsi6 inhibits the NADase activity of Tse6, an integral membrane toxin from Pseudomonas aeruginosa. The Tsi6 immunity protein adopts an all-alpha-helical fold that binds to a surface of Tse6.	83
408443	pfam18661	AvrLm4-7	Avirulence Effector AvrLm4-7. AvrLm4-7 is found in Leptosphaeria maculans, an ascomycete fungus in the dothideomycete group which is responsible for stem canker (blackleg) of Brassica napus (oilseed rape, OSR) and other crucifers. AvrLm4-7 is one of six avirulence genes which encodes a small secreted protein strongly over-expressed at the onset of plant infection. This gene confers a dual recognition specificity by two distinct resistance genes of OSR, Rlm4 and Rlm7 and loss of AvrLm4 avirulence was demonstrated to be associated with a strong fitness cost. Structure and functional analysis of AvrLm4-7 protein show that it contains the motifs RAWG and RYRE, part of a well-structured protein region held together by disulfide bridges. Mutations in the RAWG motif or in the RYRE motif (especially mutations in both motifs) almost abolished the translocation of AvrLm4-7 into cells. Furthermore, loss of recognition of AvrLm4-7 by Rlm4 is caused by the mutation of a single glycine to an arginine residue located in a loop of the protein.	86
408444	pfam18662	HTH_56	Cch helix turn helix domain. Staphylococcal Cassette Chromosome, or SCC elements, are a family of genomic islands found in S. aureus and closely related species. SCC elements that carry the mecA gene are called SCCmec and render S. aureus methicillin-resistant, creating the MRSA strains. Cch, the self-loading helicase encoded by SCCmec type IV, belongs to the pre-sensor II insert clade of AAA+ ATPases, as do the archaeal and eukaryotic MCM-family replicative helicases. The N-terminal domain carries pfam06048. The central domain (residues 157-438) contains an AAA+ ATPase fold. This domain is found at the C-terminal region; it is a winged helix-turn-helix (WH) domain typical of many dsDNA-binding proteins.	110
376087	pfam18663	Pallilysin	Pallilysin beta barrel domain. The Treponema pallidum protein, Tp0751 (also known as pallilysin), possesses adhesive properties and has been previously reported to mediate attachment to the host extracellular matrix components laminin, fibronectin, and fibrinogen. Tp0751 adopts an eight-stranded beta-barrel with a profile of short conserved regions consistent with a non-canonical lipocalin fold. Lipocalins, along with fatty acid-binding proteins and avidins, are members of the calycin superfamily, which is defined by the distinct features of a central beta-barrel and a key structural signature consisting of three short conserved regions (SCR1, SCR2, and SCR3). However, Tp0751 does not contain all three conserved regions, hence it is considered an outlier to canonical lipocalins. In SCR1, there is a GxW motif.	118
408445	pfam18664	CdiA_C_tRNase	CdiA C-terminal tRNase domain. This entry represents the C-terminal tRNase domain of CdiA a type II toxin/immunity protein complex which can be found in B. pseudomallei isolate E479. The C-terminal tRNase domain has an alpha/beta-fold characteristic of PD(D/E)XK nucleases. The PD(D/E)XK superfamily includes most restriction endonucleases and other enzymes involved in DNA recombination and repair.	116
408446	pfam18665	TetR_C_37	Tetracyclin repressor-like, C-terminal domain. IcaR belongs to the tetracycline repressor (TetR) family of proteins, which are involved in a wide variety of gene regulations. It binds to a 42 bp region immediately upstream of the icaA gene. This entry represents the C-terminal domain which is involved in dimerization.	117
408447	pfam18666	CBM64	Carbohydrate-binding module 64. Spirochaeta thermophila secretes seven glycoside hydrolases for plant biomass degradation that carry a carbohydrate-binding module 64 (CBM64) appended at the C-terminus. CBM64 adsorbs to various beta1-4-linked pyranose substrates and shows high affinity for cellulose. Structure analysis indicates a jelly-roll-like fold corresponding to a surface-binding type A CBM.	74
408448	pfam18667	BppU_IgG	Baseplate upper protein immunoglobulin like domain. This is a beta-sandwich immunoglobulin fold domain, which resembles the plexin-A2 C-terminal domain in structure. In baseplate upper protein (BppU, also known as ORF48) trimer, this domain plays part in surrounding the Dit hexameric core.	96
408449	pfam18668	Tail_spike_N	Tail spike TSP1/Gp66 receptor binding N-terminal domain. Bacteriophages recognize and bind to their hosts with the help of receptor-binding proteins (RBPs) that emanate from the phage particle in the form of fibers or tailspikes. RBPs of podovirus G7C tailspikes gp63.1 and gp66 are essential for infection of its natural host bacterium E. coli 4s. Gp63.1 and gp66 form a stable complex, in which the N-terminal part of gp66 serves as an attachment site for gp63.1 and anchors the gp63.1-gp66 complex to the G7C tail. The two N-terminal domains show 70% sequence identity to the N-terminal region of the CBA120 phage tailspike 1 (orf210, TSP1). The N-terminal domain of TSP1 is the virion head binding domain that interfaces with the phage baseplate. The N-terminal domain can be further divided into two subdomains, each beginning with an alpha-helix followed by an anti-parallel beta-sandwich. Subdomain two folds similarly to the chitin binding domain of Chitinase from Bacillus circulans.	70
408450	pfam18669	Trp_ring	Trimeric autotransporter adhesin Trp ring domain. Autotransporters are synthesized as precursor proteins with three functional domains, namely, an N-terminal signal peptide, an internal passenger domain, and a C-terminal pore-forming translocator domain. The C-terminal translocator domain is embedded in the outer membrane and facilitates delivery of the internal passenger domain to the bacterial surface. In conventional autotransporters, the C-terminal translocator domain contains approximately 300 amino acids and is monomeric. In contrast, in trimeric autotransporters, the translocator domain contains 60 to 70 amino acids and forms trimers in the outer membrane. This entry represents a Trp-ring domain which is found in the translocator region of H. influenzae Hia autotransporter, an adhesive protein that promotes adherence to respiratory epithelial cells. Trp-ring domains appear to be crucial repeated modular units in Hia, both in the general architecture of the passenger domain and in the structure of the binding domains.	49
408451	pfam18670	V_ATPase_I_N	V-type ATPase subunit I, N-terminal domain. Vacuolar H+-ATPase (V-ATPase) is a ubiquitous multi-subunit proton pump that acidifies a wide variety of intracellular compartments, which in turn affects many biological processes, including membrane trafficking, protein degradation and coupled transport of small molecules and pH homeostasis. Subunit 'a' of V0 (the functional domain responsible for proton transport) sector is highly conserved across eukaryotic species and exists in multiple isoforms. It is the largest subunit of V-ATPases and partitioned almost equally into an N-terminal cytosolic domain and a C-terminal integral membrane. Structure analysis of the N-terminal cytosolic domain from the Meiothermus ruber subunit 'I' homolog of subunit a shows that it is composed of a curved long central alpha-helix bundle capped on both ends by two lobes with similar alpha/beta architecture.	90
408452	pfam18671	4HPAD_g_N	4-Hydroxyphenylacetate decarboxylase subunit gamma N-terminal. 4-Hydroxyphenylacetate decarboxylase (4-HPAD) is a heterotetramer consisting of catalytic beta-subunit harboring the putative glycyl/thiyl dyad and a distinct small gamma-subunit with two [4Fe-4S] clusters (EC:4.1.1.83). The gamma-subunit is proposed to be involved in the regulation of the oligomeric state and catalytic activity of the enzyme and it comprises two domains with some amino acid sequence identity that are structurally related by a pseudo-2-fold symmetry indicating a gene duplication origin. This entry represents the N-terminal domain which binds one [4Fe-4S] cluster through His3, Cys6, Cys19, and Cys36.	31
376096	pfam18672	DUF5634_N	Family of unknown function (DUF5634). This is an N-terminal domain of unknown function found in Deltaproteobacteria.	93
408453	pfam18673	IrmA	interleukin receptor mimic protein A. The E. coli interleukin [IL] receptor mimic protein A (IrmA), is a small (13 kDa) Uropathogenic E. coli (UPEC) protein that was originally identified in a large reverse genetic screen as a broadly protective vaccine antigen. It has a fibronectin III (FNIII)-like fold that forms a domain-swapped dimer with structural mimicry to the binding domain of the IL-2 receptor (IL-2R), the IL-4 receptor (IL-4R) and, to a lesser extent, the IL-10 receptor (IL-10R). IrmA binds to all three cytokines, with the greatest affinity observed for IL-4. It is suggested that IrmA may contribute to manipulation of the innate immune response during UPEC infection.	106
408454	pfam18674	TarS_C1	TarS beta-glycosyltransferase C-terminal domain 1. Beta-glycosyltransferase TarS is an enzyme responsible for the glycosylation of wall teichoic acid polymers of the S. aureus cell wall, a process that has been shown to be specifically responsible for methicillin resistance in MRSA. It contains a trimerization domain composed of tandem carbohydrate binding motifs. The two C-terminally localized regions composed of a series of beta-sheets participate in an extensive trimerization interface and they assume an immunoglobulin-like fold. It is suggested that both carbohydrate binding domains may be involved in polyRboP binding; however, unlike pullulanase, the CBMs of TarS are involved in the formation of an extensive trimerization interface.	148
408455	pfam18675	HepII_C	Heparinase II C-terminal domain. Heparinase II (HepII) is an 85-kDa dimeric enzyme that depolymerizes both heparin and heparan sulfate glycosaminoglycans. The protein is composed of three domains: an N-terminal alpha-helical domain, a central two-layered beta-sheet domain, and a C-terminal domain forming a two-layered beta-sheet. The C-terminal domain contains nine beta-strands packed together in a manner resembling a beta-barrel.	88
408456	pfam18676	MBG_2	MBG domain (YGX type). This domain is found in a variety of bacterial extracellular proteins. This domain is related to the MBG domain (pfam17883), but it replaces the characteristic YDG motif close to the N-terminus with a YGX motif.	72
408457	pfam18677	ArnB_C	Archaellum regulatory network B, C-terminal domain. This is the C-terminal domain found in archaeal proteins that carry a von Willebrand factor type A domain such as ArnB from Sulfolobus acidocaldarius. ArnB is involved in negative regulation of the archaellum (former: archaeal flagellum), and the C-terminal domain is phosphorylated by ArnC and ArnD on serine and threonine residues affecting docking of ArnA.	72
408458	pfam18678	AOC_like	Allene oxide cyclase barrel like domain. This is an allene oxide cyclase barrel like domain found in spirotetronate cyclases such as AbyU, a Diels-Alderase enzyme. It is comprised of two eight-stranded antiparallel beta-barrels.	122
408459	pfam18679	HTH_57	ThcOx helix turn helix domain. This is a winged helix turn helix domain which is found in cyanobactin oxidase ThcOx N-terminal region. The oxidase converts thiazolines to thiazoles.	107
408460	pfam18680	SPECT1	Plasmodium host cell traversal SPECT1. This domain is found in SPECT1 (sporozoite microneme protein essential for cell traversal). It is formed of a four alpha-helix bundle with a 'hook'-like feature at one end. These helices are in parallel or antiparallel alignment.	180
408461	pfam18681	DUF5634	Family of unknown function (DUF5634). This is a domain of unknown function mostly found in bacilli.	95
408462	pfam18682	PilA4	Pilin A4. This domain is found in the major pilin protein PilA. PilA4 binds to PilMNO forming a complex with a well-defined platform linking the cytoplasmic PilM protein to pilus subunits in the periplasm. Structure analysis indicates that it is comprised of one alpha-helix and 4 beta-sheets.	82
408463	pfam18683	ChiW_Ig_like	Chitinase W immunoglobulin-like domain. This is an immunoglobulin like domain found in ChiW, a chitinase with high activity towards various chitins. ChiW has a multi-modular architecture composed of six domains to function efficiently on the cell surface: a right-handed beta-helix domain (carbohydrate-binding module family 54, CBM-54), a Gly-Ser-rich loop, 1st immunoglobulin-like (Ig-like) fold domain, 1st beta/alpha-barrel catalytic domain (glycoside hydrolase family 18, GH-18), 2nd Ig-like fold domain and 2nd beta/alpha-barrel catalytic domain (GH-18).	106
408464	pfam18684	PlyB_C	Pleurotolysin B C-terminal domain. This is a trefoil C-terminal beta-rich domain found in PlyB, one of the components of pleurotolysin (Ply) pore-forming protein. Ply is a membrane attack complex/perforin-like family (MACPF) protein consisting of two components, PlyA and PlyB. PlyB and PlyA act together to form relatively small and regular pores in liposomes. The PlyB C-terminal trefoil sits on top of the PlyA dimer.	173
408465	pfam18685	DUF5635	Family of unknown function (DUF5635). This is a domain of unknown function which is found at the C-terminal region of pfam13749 in actinobacteria.	86
376110	pfam18686	DUF5636	Family of unknown function (DUF5636). This is a domain of unknown function mostly found in gammaproteobacteria.	193
376111	pfam18687	DUF5637	Family of unknown function (DUF5637). This is a domain of unknown function found in predicted cysteine knot peptides.	33
408466	pfam18688	DUF5638	Family of unknown function (DUF5638). This is a domain of unknown function found in Legionella.	104
376113	pfam18689	PriX	Primase X. This domain is found in non-catalytic subunit of the archaeal eukaryotic-type primase, PriX. Detailed sequence analysis combined with structural analysis of a truncated PriX protein from the hyperthermophilic archaeon Sulfolobus solfataricus shows that, PriX is essential for the survival of the organism and that it is homologous to the C-terminal domain of archaeal and eukaryotic large primase subunits PriL. Highly conserved PriX homologues are present in many members of the phylum Crenarchaeota.	99
408467	pfam18690	DUF5639	Family of unknown function (DUF5639). This is a domain of unknown function which is mainly found in Deinococcus-Thermus. Some family members can be found in the C-terminal region of pfam01565.	82
408468	pfam18691	Cdc13_OB2	Cell division control protein 13, OB2 domain. Cdc13 is an essential yeast protein required for telomere length regulation and genome stability. Cdc13, like a number of single-stranded telomere binding proteins, consists of several oligonucleotide-oligosaccharide binding (OB) folds. These folds potentially arise from evolutionary gene duplication and are involved in multiple functions, including nucleic acid and protein binding and Cdc13 dimerization. This entry represents the OB2 domain, second OB-fold counting from the N terminus of Cdc13. Biochemical assays indicate OB2 is not involved in telomeric DNA or Stn1 binding. However, disruption of the OB2 dimer in full-length Cdc13 affects Cdc13-Stn1 association, leading to telomere length deregulation, increased temperature sensitivity, and Stn1 binding defects. Hence it is suggested that the dimerization of the OB2 domain of Cdc13 is required for proper Cdc13, Stn1, Ten1 (CST) assembly and productive telomere capping.	111
408469	pfam18692	DUF5640	Family of unknown function (DUF5640). This domain is found in proteins of unknown function. It is composed of eight antiparallel beta strands and carries a G-X-W motif in the N-terminal region. The conserved glycine-X-tryptophan (G-X-W) motif is characteristic for the lipocalin family.	83
408470	pfam18693	TRAM_2	TRAM domain. This is a C-terminal TRAM (after TRM2, a family of uridine methylases, and MiaB) domain found in the methylthiotransferases RimO enzymes that catalyze the conversion of aspartate to 2-methylthio-aspartate (msD) in the S12 protein near the decoding center in prokaryotic ribosomes. The TRAM domain in RimO, contains five anti-parallel beta-strands and docks on the surface of the Radical-SAM domain at the distal edge of its open TIM-barrel from its conserved [4Fe-4S] cluster.	63
408471	pfam18694	TDP43_N	Transactive response DNA-binding protein N-terminal domain. This domain can be found at the N-terminal region of transactive response DNA-binding protein 43 kDa (TDP-43), an RNA transporting and processing protein whose aberrant aggregates are implicated in neurodegenerative diseases. TDP-43 N-terminal domain has been shown to play an important role in the aggregation of TDP-43 monomers and its loss of function affects the RNA metabolic levels. Secondary structure of the N-terminal domain consists of six beta-strands and it resembles axin 1.	74
408472	pfam18695	cPLA2_C2	Cytosolic phospholipases A2 C2-domain. Cytosolic phospholipases A2 (cPLA2s) consist of a family of calcium-sensitive enzymes that function to generate lipid second messengers through hydrolysis of membrane-associated glycerophospholipids. In humans, the cPLA2 family contains six isoforms. Structural information of full length cPLA2alpha apo form, shows that it is composed of two domains; an N-terminal Ca2 + binding C2 domain and a C-terminal alpha/beta hydrolase core. This entry describes the N-terminal Ca2+ binding C2 domain which is composed of an eight-stranded antiparallel beta-sandwich consisting of two four-stranded beta-sheets. C2 domains are present in many lipid-binding proteins including Copines, CAPRI and Rabphilin-3A all of which are involved in membrane trafficking.	111
408473	pfam18696	SMP_C2CD2L	Synaptotagmin-like, mitochondrial and lipid-binding domain. This is a lipid transport domain found in phospholipid transfer proteins such as C2CD2L-like (also known as TMEM24). The TMEM24-SMP domain is shown to bind glycerolipids with a preference for phosphatidylinositol (PI). The bound PI is then transferred to the plasma membrane (PM) where it is converted to phosphatidylinositol-4,5-bisphosphate [PI(4,5)P2] to replenish pools of this lipid hydrolyzed during glucose-stimulated signaling. PI(4,5)P2 is required for Ca2+-dependent exocytosis; hence, the SMP domain of TMEM24 is essential for sustaining the intracellular Ca2+ oscillations that trigger bursts of insulin granule release and hence insulin secretion. The SMP domain belongs to a superfamily of lipid/hydrophobic ligand-binding domains called TULIP (tubular lipid-binding proteins); it adopts the TULIP fold with two alpha helices and a highly curved antiparallel beta sheet forming a cornucopia-like structure.	152
408474	pfam18697	MLVIN_C	Murine leukemia virus (MLV) integrase (IN) C-terminal domain. This is the C-terminal domain (CTD) which can be found in murine leukemia virus (MLV) integrase (IN) proteins. The MLV IN C-terminal domain interacts with the bromo and extraterminal (BET) proteins through the ET domain. This interaction provides a structural basis for global in vivo integration-site preferences and disruption of this interaction through truncation mutations affects the global targeting profile of MLV. The CTD consists of an SH3 fold followed by a long unstructured tail.	83
408475	pfam18698	HisK_sensor	Histidine kinase sensor domain. The Bacillus subtilis ResD-ResE two-component (TC) regulatory system activates genes involved in nitrate respiration in response to oxygen limitation or nitric oxide (NO). The sensor kinase ResE activates the response regulator ResD through phosphorylation, which then binds to the regulatory region of genes involved in anaerobiosis to activate their transcription. In other words, ResE is involved in sensing signals related to the redox state of the cells. ResE is composed of an N-terminal signal input domain and a C-terminal catalytic domain. The N-terminal domain contains two transmembrane subdomains and a large extra-cytoplasmic loop. Mutational analysis indicate that cytoplasmic ResE lacking the transmembrane segments and the extra-cytoplasmic loop retains the ability to sense oxygen limitation and NO, which leads to transcriptional activation of ResDE-dependent genes. Having said that, it is also proposed that the extra-cytoplasmic region may serve as a second signal-sensing subdomain. This suggests that the extracytoplasmic region could contribute to amplification of ResE activity leading to the robust activation of genes required for anaerobic metabolism in B. subtilis. This entry represents the extracytoplasmic subdomain. Family members also include SrrB found in S. aureus that is similar to ResE of B. subtilis.	126
408476	pfam18699	MRPL52	Mitoribosomal protein mL52. Members of this family include the mammalian mitoribosomal protein mL52, which is found in the 39S subunit. mL52 has no homologues in yeast.	91
408477	pfam18700	Castor1_N	Cytosolic arginine sensor for mTORC1 subunit 1 N-terminal domain. CASTOR1 (Cytosolic arginine sensor for mTORC1 subunit 1) has been identified as the cytosolic arginine sensor for the mTORC1 pathway. In the absence of arginine, CASTOR1 binds to GATOR2 and inhibits mTORC1 signaling; whereas in the presence of arginine, CASTOR1 interacts with arginine and no longer associates with GATOR2. The arginine sits in a pocket between the N-terminal domain (NTD) and the C-terminal domain (CTD) of CASTOR1. The CASTOR1-NTD on the opposite side of the arginine-binding site was identified to mediate direct physical interaction with its downstream effector GATOR2, via GATOR2 subunit Mios.	61
408478	pfam18701	DUF5641	Family of unknown function (DUF5641). This presumed domain is found in a range of retrotransposon polyproteins.	94
408479	pfam18702	DUF5642	Domain of unknown function (DUF5642). This is a domain of unknown function found in actinobacteria.	186
408480	pfam18703	MALT1_Ig	MALT1 Ig-like domain. This is an Immunoglobulin like domain which can be found in the mucosa-associated lymphoid tissue lymphoma translocation 1 (MALT1) paracaspase. Malt1 is a key component of the Carma1/Bcl10/MALT1 signalosome and is critical for NF-kB signaling in multiple contexts. The MALT1 C-terminal Ig domain is suggested to recruit key factors to promote NF-kB activation. It is also proposed to undergo Lys63-linked ubiquitylation via TRAF6 in potentially nine different lysines to recruit the IKK complex.	138
408481	pfam18704	Chromo_2	Chromatin organization modifier domain 2. Chromodomains serve as chromatin-targeting modules, general protein interaction elements as well as dimerization sites. They are found in many chromatin-associated proteins that bind modified histone tails for chromatin targeting. Chromodomains often recognize modified lysines through their aromatic cage thus targeting proteins to chromatin. Family members such as GEN1 carry a chromodomain which directly contacts DNA and its truncation severely hampers GEN1's catalytic activity. The chromodomain allows GEN1 to correctly position itself against DNA molecules, and without the chromodomain, GEN1's ability to cut DNA was severely impaired. The GEN1 chromodomain was found to be distantly related to the CDY chromodomains and chromobox proteins, particularly to the chromo-shadow domains of CBX1, CBX3 and CBX5. Furthermore, it is conserved from yeast (Yen1) to humans with the only exception being the Caenorhabditis elegans GEN1, which has a much smaller protein size of 443 amino acids compared to yeast Yen1 (759 aa) or human GEN1 (908 aa).	62
408482	pfam18705	DUF5643	Family of unknown function (DUF5643). This is an immunoglobulin-like domain found in bacteria.	117
408483	pfam18706	ISPD_C	D-ribitol-5-phosphate cytidylyltransferase C-terminal domain. This domain is located at the C-terminal region of ISPD (isoprenoid synthase domain containing protein, EC:2.7.7.40), pfam01128. Structural homologs can be found in two distinct alpha/beta protein families including the seven-stranded NAD(P) (H)-dependent short-chain dehydrogenases/reductases and five-stranded response regulator proteins involved in bacterial sensing systems.	169
408484	pfam18707	IL2RB_N1	Interleukin-2 receptor subunit beta N-terminal domain 1. IL-2Rbeta is a member of the class I cytokine receptor superfamily. It carries a cytokine-binding homology region, which is divided in two fibronectin type-III (FN-III) domains termed D1 and D2. Each domain contains seven beta-strands that form a sandwich of two antiparallel beta-sheets. The N-terminal D1 domain of IL-2Rbeta includes two highly conserved disulfide bridges. This entry describes D1 of the N-terminal region of IL2Rbeta.	92
408485	pfam18708	MapZ_C2	MapZ extracellular C-terminal domain 2. In the pneumococcus cell division, MapZ (Midcell Anchored Protein Z) locates at the division site before FtsZ and guides septum positioning. MapZ forms ring structures at the cell equator and moves apart as the cell elongates, therefore behaving as a permanent beacon of division sites. MapZ then positions the FtsZ-ring through direct protein-protein interactions. Structural analysis indicates that it displays a bi-modular structure composed of two subdomains separated by a flexible serine-rich linker. The extracellular C-terminal domain carries a conserved patch of amino acids which plays a crucial function in binding peptidoglycan and positioning MapZ at the cell equator.	94
408486	pfam18709	DLP_helical	Dynamin-like helical domain. This helical domain is found in bacterial proteins such as labile enterotoxin output A (LeoA), a large GTPase (64.2 kDa) with a putative involvement in membrane vesicle (MV) secretion in Escherichia coli. The crystal structure of LeoA reveals a fold with all the hallmarks of a dynamin-like protein (DLP).	344
408487	pfam18710	ComR_TPR	ComR tetratricopeptide. In Gram-positive bacteria, cell-to-cell communication mainly relies on extracellular signaling peptides. ComR is a member of the RNPP family, which positively controls competence for natural DNA transformation in streptococci. It is directly activated by the binding of its associated pheromone XIP. The crystal structure analysis of ComR shows that it contains an N-terminal helix-turn-helix (HTH), DNA binding domain (DBD) and a C-terminal tetratricopeptide repeat (TPR) domain. The TPR domain is composed of 11 alpha-helices forming 5 TPR motifs followed by an additional C-terminal alpha-helix 16 called CAP. The pheromone XIP binding site is found in the TPR region. Biochemical and mutational analysis indicate that, if the interacting XIP is accepted it can then trigger the conformational change of the TPR domain to open the DBD-TPR interface to allow dimer formation that is required to bind DNA.	224
408488	pfam18711	TxDE	Toxoflavin-degrading enzyme. This domain is found in toxoflavin-degrading enzymes such as toxoflavin lyase (TflA) also known as toxoflavin-degrading enzyme (TxDE). TflA/TxDE is structurally similar to the vicinal oxygen chelate superfamily of metalloenzymes, despite the lack of apparent sequence identity.	55
408489	pfam18712	DUF5644	Family of unknown function (DUF5644). This is a domain of unknown function found at the C-terminal region of Helicobacterial proteins of unknown function.	109
408490	pfam18713	DUF5645	Domain of unknown function (DUF5645). This is a domain of unknown function found in Diptera. Some family members carry pfam08445 on their C-terminal.	126
408491	pfam18714	PI-TkoII_IV	DNA polymerase II intein Domain IV. This domain can be found in the hyperthermophilic archaeon Thermococcus kodakaraensis Pol-2 intein. It is suggested to be a potential DNA binding domain.	150
408492	pfam18715	Phage_spike	Phage spike trimer. Bacteriophages penetrate the host cell membrane using their tail to inject genetic material into the host. In this penetration process, they use central spike domain located beneath their baseplate. The spike domain folds as a trimeric iron-binding structure. This entry contains three copies of the repeat unit.	53
408493	pfam18716	VATC	Vms1-associating treble clef domain. Treble clef fold domain found at C-terminus of many, but not all, Vms1/ANKZF1-like proteins.	43
408494	pfam18717	CxC4	CxC4 like cysteine cluster associated with KDZ transposases. A predicted Zinc chelating domain present N-terminal to the KDZ transposase domain.	129
408495	pfam18718	CxC5	CxC5 like cysteine cluster associated with KDZ transposases. A predicted Zinc chelating domain present N-terminal to the KDZ transposase domain.	117
408496	pfam18719	ArlS_N	ArlS sensor domain. This entry represents the N-terminal extracellular sensor domain of the ArlS protein from S. aureus.	127
376143	pfam18720	EGF_Tenascin	Tenascin EGF domain. This entry represents the EGF-like domains found in tenascin proteins.	29
408497	pfam18721	CxC6	CxC6 like cysteine cluster associated with KDZ transposases. A predicted Zinc chelating domain inserted into the core of the KDZ transposase domain.	66
408498	pfam18722	MazG_C	MazG C-terminal domain. An alpha+beta fold domain found C-terminal to the MazG superfamily pyrophosphatase domain. The domain has a conserved DxYRxHDxxH motif indicative of catalytic activity. Based on its broader context in DNA modification, it is proposed to function as a nucleotide kinase.	190
408499	pfam18723	aGPT-Pplase1	alpha-glutamyl/putrescinyl thymine pyrophosphorylase clade 1. An alpha helical domain related to the alpha-helical DNA glycosylases, predicted to catalyze the in situ synthesis of hypermodified bases such as alpha-glutamyl, putrescinyl thymine, 5-(2-aminoethoxy)methyluridine or 5-(2-aminoethyl)uridine. The enzyme is predicted to utilize a high-energy pyrophosphate DNA base intermediate which is subject to a nucleophilic attack by the modifying moiety. Members of this clade are found in phages with hypermodified bases and eukaryotes such as fungi and stramenopiles.	280
408500	pfam18724	aGPT-Pplase2	Alpha-glutamyl/putrescinyl thymine pyrophosphorylase clade 2. An alpha helical domain related to the alpha-helical DNA glycosylases, predicted to catalyze the in situ synthesis of hypermodified bases such as alpha-glutamyl, putrescinyl thymine, 5-(2-aminoethoxy)methyluridine or 5-(2-aminoethyl)uridine. The enzyme is predicted to utilize a high-energy pyrophosphate DNA base intermediate which is subject to a nucleophilic attack by the modifying moiety. Mainly found in caudoviruses and prophages.	229
408501	pfam18725	HEPN_SAV2148	SAV2148-like HEPN. SAV2148-like HEPN nuclease domain.	216
408502	pfam18726	HEPN_SAV_6107	SAV_6107-like HEPN. SAV_6107-like HEPN.	98
376150	pfam18727	ALMS_repeat	Alstrom syndrome repeat. This entry contains a single repeat unit of approximately 47 AA. It is found in Alstrom syndrome protein 1 (ALMS1) and homologs.	47
408503	pfam18728	HEPN_AbiV	AbiV. AbiV-like HEPN	157
408504	pfam18729	HEPN_STY4199	STY4199-like HEPN. STY4199-like HEPN nuclease domain.	282
408505	pfam18730	HEPN_Cthe2314	Cthe_2314-like HEPN. Cthe_2314-like HEPN.	173
408506	pfam18731	HEPN_Swt1	Swt1-like HEPN. Swt1-like HEPN. This HEPN domain might have a role in binding and sensing unspliced pre-mRNAs that are specifically targeted by the Swt1 nuclease at the nuclear envelope.	116
376155	pfam18732	HEPN_AbiA_CTD	HEPN like, Abia C-terminal domain. AbiA-CTD-like HEPN nuclease. Fused to Reverse Transcriptase ; in operon with R-M system.	132
408507	pfam18733	HEPN_LA2681	LA2681-like HEPN. LA2681-like HEPN nuclease.	207
408508	pfam18734	HEPN_AbiU2	AbiU2. AbiU2-like HEPN	193
408509	pfam18735	HEPN_RiboL-PSP	RiboL-PSP-HEPN. RiboL-PSP-HEPN. Fused to endoRNase L-PSP ; in operon with ParB.	191
408510	pfam18736	pEK499_p136	HEPN pEK499 p136. pEK499_p136-like HEPN.	150
408511	pfam18737	HEPN_MAE_28990	MAE_28990/MAE_18760-like HEPN. HEPN-like nuclease. MAE_28990 In operon with a ParB nuclease and DNA methylase genes. MAE_18760-like HEPN found fused to HEPN/RES-NTD1, HEPN/Toprim-NTD1, Schlafen and a novel beta rich domain. In operon with ParA/Soj ATPase of SIMIBI-type GTPase fold.	211
408512	pfam18738	HEPN_DZIP3	DZIP3/ hRUL138-like HEPN. DZIP3/ hRUL138-like HEPN nuclease. Fusion to TPR, Zn-ribbon, RING, Ankyrin, CARD, NACHT ATPase, DEATH and LRR in various animal lineages.	144
408513	pfam18739	HEPN_Apea	Apea-like HEPN. Apea-like HEPN nuclease. In epsilonproteobacteria embedded in R-M operons.	99
408514	pfam18740	EC042_2821	EC042_2821-like REase. REase Fold Fused to HEPN (EC042_2821) and an N-terminal wHTH in some.	188
408515	pfam18741	MTES_1575	REase_MTES_1575. Vsr REase Fold. Fused to HEPN (SWT1/Abi2 family), along with Transglutaminase and wHTH.	96
408516	pfam18742	DpnII-MboI	REase_DpnII-MboI. REase Fold fused to DpnII/MboI-NTD.	150
408517	pfam18743	AHJR-like	REase_AHJR-like. REase Fold fused to HEPN(DUF86) pfam01934.	124
376166	pfam18744	SNAD1	Secreted Novel AID/APOBEC-like Deaminase 1. A family of secreted AID/APOBEC like deaminases found sporadically across vertebrates.	208
408518	pfam18745	SNAD2	Secreted Novel AID/APOBEC-like Deaminase 2. A family of secreted AID/APOBEC like deaminases found in ray-finned fishes.	211
408519	pfam18746	aGPT-Pplase3	Alpha-glutamyl/putrescinyl thymine pyrophosphorylase clade 3. An alpha helical domain related to the alpha-helical DNA glycosylases, predicted to catalyze the in situ synthesis of hypermodified bases such as alpha-glutamyl, putrescinyl thymine, 5-(2-aminoethoxy)methyluridine or 5-(2-aminoethyl)uridine. The enzyme is predicted to utilize a high-energy pyrophosphate DNA base intermediate which is subject to a nucleophilic attack by the modifying moiety. Mainly found in bacterial mobile operons.	279
408520	pfam18747	Ploopntkinase2	P-loop Nucleotide Kinase2. A P-loop Nucleotide Kinase predicted to be involved in modified base biosynthesis.	298
408521	pfam18748	Ploopntkinase1	P-loop Nucleotide Kinase1. A P-loop Nucleotide Kinase predicted to be involved in modified base biosynthesis.	196
376171	pfam18749	SNAD3	Secreted Novel AID/APOBEC-like Deaminase 3. A family of AID/APOBEC like deaminases found in vertebrates that were derived from secreted versions of the family.	379
408522	pfam18750	SNAD4	Secreted Novel AID/APOBEC-like Deaminase 4. A family of secreted AID/APOBEC like deaminases found only in sponges that often shows lineage-specific expansions.	104
408523	pfam18751	Ploopntkinase3	P-loop Nucleotide Kinase3. A P-loop Nucleotide Kinase predicted to be involved in modified base biosynthesis.	184
376174	pfam18752	DAAD	Dictyosteliid AID/APOBEC-like Deaminase. A family of secreted AID/APOBEC-like deaminases found in dictyosteliids that often shows lineage-specific expansions.	291
408524	pfam18753	Nmad2	Nucleotide modification associated domain 2. A beta-strand rich domain containing a conserved cysteine and charged residues predicted to play a role in modified DNA base biosynthesis.	202
408525	pfam18754	Nmad3	Nucleotide modification associated domain 3. An alpha+beta fold domain with highly conserved HxD and D motifs suggestive of enzymatic function and predicted to be involved in modified nucleotide biosynthesis.	244
408526	pfam18755	RAMA	Restriction Enzyme Adenine Methylase Associated. An alpha+beta fold domain associated with restriction enzymes across prokaryotes and fused to JAB deubiquitinases, and chromatin proteins in a wide range of eukaryotes. The domain is predicted to function as a modified-DNA reader domain.	108
408527	pfam18756	Nmad4	Nucleotide modification associated domain 4. An alpha+beta fold domain typically associated with DNA methylases and likely to be involved in modified nucleotide biosynthesis.	87
408528	pfam18757	Nmad5	Nucleotide modification associated domain 5. An alpha+beta fold domain associated with DNA base modifying genes in prokaryotes, and likely to be involved in modified DNA base biosynthesis.	205
408529	pfam18758	KDZ	Kyakuja-Dileera-Zisupton transposase. A transposase family with an RNaseH catalytic domain, often fused to DNA binding domains such as SAP or cysteine cluster domains. KDZ transposases are widely present in fungi, metazoa, chlorophytes and haptophytes. Fungal versions are often associated with a TET/JBP family of dioxygenases.	218
408530	pfam18759	Plavaka	Plavaka transposase. A transposase with an RNaseH catalytic domain that often has a histone binding BAM/BAH domain at the C-terminus and is sometimes associated with TET/JBP family of dioxygenases in fungi.	320
408531	pfam18760	ART-PolyVal	ADP-Ribosyltransferase in polyvalent proteins. A family of ADP-Ribosyltransferases found in polyvalent proteins of phages and conjugative elements. These are in turn related to the Tox-ART-HYD2 group of ADP-Ribosyltransferases that are seen in polymorphic toxin systems and in toxin-antitoxin systems. These are predicted to modify host proteins.	136
408532	pfam18761	Heliorhodopsin	Heliorhodopsin. Heliorhodopsins, distantly related to type-1 rhodopsins, are embedded in the membrane with their N termini facing the cell cytoplasm, an orientation that is opposite to that of type-1 or type-2 rhodopsins. Heliorhodopsins show photocycles that are longer than one second, which is suggestive of light-sensory activity. Heliorhodopsin photocycles accompany retinal isomerization and proton transfer, as in type-1 and type-2 rhodopsins, but protons are never released from the protein.	242
408533	pfam18762	Kinase-PolyVal	Serine/Threonine/Tyrosine Kinase found in polyvalent proteins. A family of protein kinases found in polyvalent proteins of phages and prophages that although preserving their active site residues for ATP-binding and phosphotransfer appear to have lost the C-terminal subdomain characteristic of this superfamily.	160
408534	pfam18763	ddrB-ParB	ddrB-like ParB superfamily domain. A member of the ParB/sulfiredoxin superfamily of proteins found in polyvalent proteins prototyped by the version in the phage P1 ddRB protein. These proteins are predicted to function as nucleases.	124
408535	pfam18764	nos_propeller	Nitrous oxide reductase propeller repeat. Nitrous oxide reductases usually contain a seven-bladed beta-propeller domain with external short alpha-helices. This entry represents a single blade of the propeller, with imperfect alpha-helix, usually at the C-terminus of the repeat region.	71
408536	pfam18765	Polbeta	Polymerase beta, Nucleotidyltransferase. A member of the nucleotidyltransferase fold found in polymorphic toxins (NTox45) and polyvalent proteins.	93
408537	pfam18766	SWI2_SNF2	SWI2/SNF2 ATPase. A SWI2/SNF2 ATPase found in polyvalent proteins.	223
408538	pfam18767	AID	Activation induced deaminase. The activation induced deaminase is a vertebrate-specific member of the classical AID/APOBEC cytosine deaminases that is involved in antibody diversification.	90
408539	pfam18768	HTH_Bact	Helix-turn-helix bacterial domain. The bacterial PlcR helix-turn-helix transcription factor includes five TPR units of different lengths. This entry represents the central, medium-sized HTHs repeat.	210
408540	pfam18769	APOBEC1	APOBEC1. APOBEC1 deaminates cytosine both in RNA and ssDNA and has roles in both mRNA editing and ssDNA mutagenesis as part of the defense against retroviruses and genomic retrotransposons.	101
408541	pfam18770	Arm_vescicular	Armadillo tether-repeat of vesicular transport factor. Armadillo-like tether-repeat of general vesicular transport factor. This entry contains a single copy of the repeat unit.	60
408542	pfam18771	APOBEC3	APOBEC3. APOBEC3 deaminases act as restriction factors in the innate response to retroviruses and various retroelements.	135
408543	pfam18772	APOBEC2	APOBEC2. APOBEC2 is a highly conserved (slow-evolving) family of AID/APOBECs found in most vertebrates including cartilaginous fishes. APOBEC2 is poorly understood in terms of its molecular functions and substrate specificity.	174
408544	pfam18773	Importin_rep	Importin 13 repeat. Importin 13 has a spiralic structure containing repeats structurally similar to HEAT repeats. It serves as receptor for nuclear localization signals (NLS) in cargo substrates, mediating docking of the importin/substrate complex to the nuclear pore complex (NPC). It contains several repeats structurally similar to the HEAT repeat. This Pfam entry represents a single repeat unit.	40
408545	pfam18774	APOBEC4_like	APOBEC4-like -AID/APOBEC-deaminase. Cnidarian and Algal homologs of the APOBEC4-like AID/APOBEC-like deaminases characterized by a distinct Zn chelating site involving residues from the conserved loops 1 and 3.	131
408546	pfam18775	APOBEC4	APOBEC4. A member of the AID/APOBEC family of cytosine deaminases. The biological function of APOBEC4 is poorly understood. However, it is widely conserved across vertebrates.	74
408547	pfam18776	Hexapep_loop	Hexapeptide repeat including loop. This entry contains a single hexapeptide repeat unit including a loop between two strands.	24
408548	pfam18777	CRM1_repeat	Chromosome region maintenance or exportin repeat. Chromosome region maintenance 1 or exportin 1 mediates the nuclear transport of proteins bearing a leucine-rich nuclear export signal (NES). It contains helical repeats that are structurally similar to HEAT repeats, but share little sequence similarity with them. N-terminal, C-terminal and central repeats show slightly different structural arrangements, with N- and C-terminal repeats interacting with each other. This entry represents the central repeats of CRM1.	37
408549	pfam18778	NAD1	Novel AID APOBEC clade 1. A distinct family of AID/APOBEC-like deaminases found in ray-finned fishes, the coelacanth, amphibians, lizards, and marsupials.	175
408550	pfam18779	LRR_RI_capping	Capping Ribonuclease inhibitor Leucine Rich Repeat. Leucine-rich repeats are composed of a beta-alpha unit. This repeat unit is found as capping unit (N- or C- terminal of the repeat region) of Ribonuclease Inhibitors.	30
408551	pfam18780	HNH_repeat	Homing endonuclease repeat. Homing endonucleases are found in bacteria and viruses and catalyze the hydrolysis of genomic DNA within the cells that synthesize them. This entry represents a single repeat unit.	23
408552	pfam18781	Phage_spike_2	Phage spike trimer. This Pfam entry includes some phage spike repeats that fail to be detected with the pfam18715 model.	78
408553	pfam18782	NAD2	Novel AID APOBEC clade 2. A distinct family of AID/APOBEC deaminases found only in Amphibians.	179
408554	pfam18783	IPU_b_solenoid	Isopullulanase beta-solenoid repeat. IPU and dextranase repeat unit includes three (or one long and one short) parallel beta-strands. The repeat region as a whole folds into a beta-helix, known as beta-solenoid.	33
408555	pfam18784	CRM1_repeat_2	CRM1 / Exportin repeat 2. Chromosome region maintenance 1 / Exportin 1 mediates the nuclear transport of proteins bearing a leucine-rich nuclear export signal (NES). It contains helical repeats that are structurally similar to HEAT repeats, but share little sequence similarity with them. N-, C-terminal and central repeats show slightly different structural arrangements, with N- and C-terminal repeats interacting with each other. This Pfam entry includes some CRM1 repeats that fail to be detected with the pfam18777 model.	68
408556	pfam18785	Inv-AAD	Invertebrate-AID/APOBEC-deaminase. Classical AID/APOBEC-like deaminases found in lophotrochozoans, echinoderms and cnidarians.	129
408557	pfam18786	Importin_rep_2	Importin 13 repeat. Importin 13 serves as receptor for nuclear localization signals (NLS) in cargo substrates. It mediates docking of the importin/substrate complex to the nuclear pore complex (NPC). It contains several repeats structurally similar to the HEAT repeat. This Pfam entry represents a single repeat unit.	44
408558	pfam18787	CRM1_repeat_3	CRM1 / Exportin repeat 3. Chromosome region maintenance 1 / Exportin 1 mediates the nuclear transport of proteins bearing a leucine-rich nuclear export signal (NES). It contains helical repeats that are structurally similar to HEAT repeats, but share little sequence similarity with them. N-, C-terminal and central repeats show slightly different structural arrangements, with N- and C-terminal repeats interacting with each other. This Pfam entry includes some CRM1 repeats that fail to be detected with the PF18777 model.	51
408559	pfam18788	DarA_N	Defence against restriction A N-terminal. This is an alpha and beta fold domain. It has a conserved aspartate, and an asparagine residue followed by a basic residue in a Nx+ motif. This predicted structural domain is mainly found in polyvalent proteins of phages/prophages. The P1 hdf protein, a solo version of the domain, and the Phage P1 DarA protein that contains this domain are components of the phage P1 head. The domain might be involved in a counter-restriction activity.	101
408560	pfam18789	DarA_C	Defence against restriction A C-terminal. This is a mostly alpha-helical domain found in polyvalent proteins of phages and prophages. In Phage P1, the DarA protein is a component of the phage P1 head.	70
408561	pfam18790	KfrB	KfrB protein. This is an alpha and beta domain found in polyvalent proteins of conjugative element or often in the neighbourhood of one. The KfrB domain has been speculated to play a role in discrimination of self from non-self in plasmid conjugation systems.	61
408562	pfam18791	Transp_inhibit	Transport inhibitor response 1 protein domain. The F-box protein Transport inhibitor response 1 (TIR1) is a receptor for auxin, triggering an auxin-enhanced and ubiquitin-mediated degradation of substrates. The targets are recruited via interaction with the leucine-rich repeat region of the protein. This Pfam entry represents a specific unit of the LRR region, including an insertion of one short alpha-helix in the loop between the beta-strand and the following helix. It shares some sequence homology with a unit with similar structure of Coronatine-insensitive protein 1.	47
408563	pfam18792	UspA1_rep	Ubiquitous surface protein adhesin repeat. The UspA1 head domain is globally similar to other structures of TAA head domains and comprises a trimeric left-handed parallel beta-roll, which is formed from 14 repeating 14-16 residue segments that form a ladder of beta-strand coils. This Pfam entry represents a single repeat unit of the beta-ladder.	13
408564	pfam18793	nos_propeller_2	Nitrous oxide reductase propeller repeat 2. Nitrous oxide reductases usually contain a seven-bladed beta-propeller domain with external short alpha-helices. This entry represents a single blade of the propeller, without alpha- helical insertion.	70
408565	pfam18794	HSM3_C	DNA mismatch repair protein HSM3, C terminal domain. Hsm3 is a proteasome-dedicated chaperone that forms a base precursor, Hsm3-Rpt1-Rpt2-Rpn1. Hsm3 consists of 23 alpha-helices forming 11 repeats similar to HEAT repeats. This entry includes the last 5 repeats at the C terminal.	177
408566	pfam18795	HSM3_N	DNA mismatch repair protein HSM3, N terminal domain. Hsm3 is a proteasome-dedicated chaperone that forms a base precursor, Hsm3-Rpt1-Rpt2-Rpn1. Hsm3 consists of 23 alpha-helices forming 11 repeats similar to the HEAT repeats. This entry includes the first 5 repeats at the N-terminal.	237
408567	pfam18796	LPD1	Large polyvalent protein-associated domain 1. This is an alpha helical domain with a conserved ExxARxxE motif that is found in polyvalent proteins of both conjugative elements and phages and prophages.	78
408568	pfam18797	APC_rep	Adenomatous polyposis coli (APC) repeat. Adenomatous polyposis coli contains an armadillo repeat and uses its highly conserved surface groove to recognize the APC-binding region (ABR) of Asef. This entry represents a single repeat unit of the Armadillo region.	74
408569	pfam18798	LPD3	Large polyvalent protein-associated domain 3. This domain is predicted to adopt an alpha and beta fold. The secondary structure arrangement suggests it to be a member of the BECR fold. The domain is found in polyvalent proteins of both conjugative elements and phages/prophages.	110
408570	pfam18799	LPD5	Large polyvalent protein-associated domain 5. This domain is predicted to be an enzymatic alpha beta domain. It is often found N-terminal to a metallopeptidase domain in polyvalent proteins. The domain contains a conserved aspartate, lysine, and two arginine residues.	146
408571	pfam18800	Atthog	Attenuator of Hedgehog. Attenuator of Hedgehog is an integral membrane protein of the tetraspan family that functions as a negative regulator of Hedgehog signaling.	141
408572	pfam18801	RapH_N	Response regulator aspartate phosphatase H, N terminal. Rap proteins consist of an N-terminal 3-helix bundle and a tetratricopeptide domain. This entry represents the conserved region of the C-terminal bundle.	62
408573	pfam18802	CxC1	CxC1 like cysteine cluster associated with KDZ transposases. A predicted Zinc chelating domain present N-terminal to the KDZ transposase domain.	104
408574	pfam18803	CxC2	CxC2 like cysteine cluster associated with KDZ transposases. A predicted Zinc chelating domain present N-terminal to the KDZ transposase domain.	107
408575	pfam18804	CxC3	CxC3 like cysteine cluster associated with KDZ transposases. A predicted Zinc chelating domain present N-terminal to the KDZ transposase domain.	113
408576	pfam18805	LRR_10	Leucine-rich repeat. This Pfam entry includes some LRRs that fail to be detected with the pfam00560 model. This entry represents two repeat units.	67
408577	pfam18806	Importin_rep_3	Importin 13 repeat. Importin 13 serves as receptor for nuclear localization signals (NLS) in cargo substrates. It mediates docking of the importin/substrate complex to the nuclear pore complex (NPC). It contains several repeats structurally similar to the HEAT repeat. This Pfam entry represents a single repeat unit.	75
408578	pfam18807	TTc_toxin_rep	Tripartite Tc toxins repeat. Tripartite Tc toxin complexes of bacterial pathogens perforate the host membrane and translocate toxic enzymes into the host cell. These structures undergo a transition between a prepore to a pore state and they are mainly constituted by closed beta-layer repeats. This Pfam entry includes a single repeat unit.	45
408579	pfam18808	Importin_rep_4	Importin repeat. The importin subunit beta-3 has a superhelical structure composed of tandem repeats structurally similar to HEAT repeats. This Pfam entry includes a single repeat unit.	90
408580	pfam18809	PBECR1	phage-Barnase-EndoU-ColicinE5/D-RelE like nuclease1. A predicted endoRNase of the Barnase-EndoU-ColicinE5/D-RelE like nuclease fold found in polyvalent proteins of phages and conjugative elements. The predicted active site contains conserved histidine and threonine residues.	108
408581	pfam18810	PBECR2	phage-Barnase-EndoU-ColicinE5/D-RelE like nuclease2. A predicted endoRNase of the Barnase-EndoU-ColicinE5/D-RelE like nuclease fold found in polyvalent proteins of phages. The predicted active site contains conserved arginine and threonine residues.	120
408582	pfam18811	DPPIV_rep	Dipeptidyl peptidase IV (DPP IV) low complexity region. Dipeptidyl peptidase IV includes a helical N-terminal region, the pfam00930 domain and the pfam00326 domain, comprising the active site. This Pfam entry represents a sequence that can be repeated in the low complexity region between the helical N-terminus and the DPPIV_N domain.	21
408583	pfam18812	PBECR3	phage-Barnase-EndoU-ColicinE5/D-RelE like nuclease3. A predicted endoRNase of the Barnase-EndoU-ColicinE5/D-RelE like nuclease fold found in polyvalent proteins of phages and conjugative elements. The predicted active site contains conserved histidine and threonine residues.	116
408584	pfam18813	PBECR4	phage-Barnase-EndoU-ColicinE5/D-RelE like nuclease4. A predicted endoRNase of the Barnase-EndoU-ColicinE5/D-RelE like nuclease fold found in polyvalent proteins of phages and conjugative elements. The predicted active site contains a conserved histidine residue.	185
408585	pfam18814	PBECR5	phage-Barnase-EndoU-ColicinE5/D-RelE like nuclease5. A predicted endoRNase of the Barnase-EndoU-ColicinE5/D-RelE like nuclease fold found in polyvalent proteins of phages and conjugative elements. The predicted active site contains conserved histidine and threonine residues.	237
408586	pfam18815	AFP_2	Bacterial antifreeze protein repeat. This family of proteins is involved in stopping the formation of ice crystals at low temperatures. The structure folds as a Ca(2+)-bound parallel beta-helix with an extensive array of ice-like surface waters that are anchored via hydrogen bonds directly to the polypeptide backbone and adjacent side chains.	52
408587	pfam18816	Importin_rep_5	Importin repeat. The importin subunit beta-3 has a superhelical structure composed of tandem repeats structurally similar to HEAT repeats. This Pfam entry includes a single repeat unit and includes sequences not captured by pfam18808.	52
408588	pfam18817	HEAT_UF	Repeat of uncharacterized protein PH0542. Repeat found in PH0542 showing some sequence similarity to HEAT repeat.	45
408589	pfam18818	MPTase-PolyVal	Metallopeptidase superfamily domain. A predicted endoRNase of the Barnase-EndoU-ColicinE5/D-RelE like nuclease fold found in polyvalent proteins of phages and conjugative elements. The predicted active site contains conserved histidine and threonine residues.	124
408590	pfam18819	MuF_C	Phage MuF-C-terminal domain. A predicted endoRNase of the Barnase-EndoU-ColicinE5/D-RelE like nuclease fold found in polyvalent proteins of phages and conjugative elements and also fused to the MuF domain, a structural component of the phage head. The predicted active site contains conserved histidine and serine residues.	103
376239	pfam18820	BD_b_sandwich	Bdellovibrio Beta-sandwich. A beta-sandwich domain exclusively found at the N-terminal of CHROMO domains in many Bdellovibrio proteins.	131
408591	pfam18821	LPD7	Large polyvalent protein-associated domain 7. This domain contains conserved aspartate and phenylalanine residues. It is widely present in polyvalent proteins and gene neighbourhoods of conjugative elements. This domain is also known as PTox1.	92
408592	pfam18822	CdvA	CdvA-like coiled-coil domain. A coiled coil region domain related to the CdvA-like proteins.	123
408593	pfam18823	InPase	Inorganic Pyrophosphatase. A type I Inorganic Pyrophosphatase family domain that is found in polyvalent proteins.	135
408594	pfam18824	LPD11	Large polyvalent protein-associated domain 11. This is an alpha-helical domain with conserved hydrophobic residues. It is found in polyvalent proteins of conjugative elements.	69
408595	pfam18825	LPD13	Large polyvalent protein-associated domain 13. This is an alpha and beta domain that is found in polyvalent proteins of both conjugative elements and phages/prophages.	139
408596	pfam18826	bVLRF1	bacteroidetes VLRF1 release factor. Archaeo-eukaryotic release factor domain family belonging to the VLRF1 clade observed primarily in the bacteroidetes bacterial lineage. Contains a conserved glutamine residue in the release factor catalytic loop, suggesting it functions as an active peptidyl-tRNA hydrolase at the ribosome.	143
408597	pfam18827	LPD14	Large polyvalent protein-associated domain 14. This is an alpha-helical domain with a conserved glutamate residue that is mainly found in polyvalent proteins of prophages.	136
408598	pfam18828	LPD15	Large polyvalent-protein-associated domain 15. This is a predicted enzymatic alpha and beta domain. It is found at the N-terminus of polyvalent proteins of conjugative elements.	99
408599	pfam18829	Importin_rep_6	Importin repeat 6. The importin subunit beta-3 has a superhelical structure composed of tandem repeats structurally similar to HEAT repeats. This Pfam entry represents two consecutive repeat units and includes sequences captured by pfam18808.	110
408600	pfam18830	LPD16	Large polyvalent protein-associated domain 16. This is an alpha and beta fold domain that is mainly found in polyvalent proteins of conjugative elements.	82
408601	pfam18831	LRR_11	Leucine-rich repeat. This Pfam entry includes some LRRs that fail to be detected with the pfam00560 model. This entry represents one repeat unit.	29
408602	pfam18832	LPD18	Large polyvalent protein-associated domain 18. This is a mostly all beta domain which contains conserved acidic residues. It is mainly found in polyvalent proteins of conjugative elements.	86
408603	pfam18833	TPR_22	Tetratricopeptide repeat. This Pfam entry includes outlying Tetratricopeptide-like repeats (two repeat units) that are not matched by pfam00515.	92
408604	pfam18834	LPD22	Large polyvalent protein associated domain 22. This is a predicted enzymatic alpha-helical domain with highly conserved aspartate residues. The domain is found in polyvalent proteins of phage and prophage genes and is often the immediate neighbour of a lysozyme gene.	98
408605	pfam18835	Beta_helix_2	Beta helix repeat of Inulin fructotransferase. This region contains a right-handed parallel beta helix repeat unit found in Inulin fructotransferase. This Pfam entry includes sequences not found by pfam13229.	68
408606	pfam18836	B_solenoid_ydck	Beta solenoid repeat from YDCK. The crystal structure of YDCK from Salmonella cholerae includes a beta-solenoid repeat. This Pfam entry includes a single repeat unit of YDCK repeat.	18
408607	pfam18837	LRR_12	Leucine-rich repeat. This Pfam entry includes some LRRs that fail to be detected with the pfam00560 model. This entry represents one repeat unit.	30
408608	pfam18838	LPD23	Large polyvalent protein associated domain 23. This is an alpha-helical domain that is usually N-terminal to a metallopeptidase domain. The domain is found in polyvalent proteins of both conjugative elements and phages/prophages.	58
408609	pfam18839	LPD24	Large polyvalent protein associated domain 24. This is an all-beta domain that is mostly seen in polyvalent proteins of conjugative elements.	70
408610	pfam18840	LPD25	Large polyvalent protein associated domain 25. This is an alpha and beta fold domain found in polyvalent proteins of conjugative elements.	99
408611	pfam18841	B_solenoid_dext	Beta solenoid repeat from Dextranase. The crystal structures of Dex49A from Penicillium minioluteum and of ATCC9642 isopullulanase from Aspergillus niger include beta-solenoid repeats, sharing structural similarities. This Pfam entry includes a single repeat unit of the repeat regions.	33
408612	pfam18842	LPD26	Large polyvalent protein associated domain 26. This is a small alpha-helical domain with two acidic residues conserved in a predicted loop between two of its helices. The domain is mainly found in polyvalent proteins of conjugative elements.	57
408613	pfam18843	LPD28	Large polyvalent protein associated domain 28. This is a beta strand rich domain that lacks strongly conserved polar residues. It is fast diverging and is found in polyvalent proteins of conjugative elements.	96
408614	pfam18844	baeRF_family2	Bacterial archaeo-eukaryotic release factor family 2. Bacterial family of the archaeo-eukaryotic release factor superfamily. Likely to play roles in biological conflicts or regulation under stress conditions at the ribosome. This family contains a well-conserved 'FP' motif in the catalytic loop.	149
408615	pfam18845	baeRF_family3	Bacterial archaeo-eukaryotic release factor family 3. Bacterial family of the archaeo-eukaryotic release factor superfamily. Likely to play roles in biological conflicts or regulation under stress conditions at the ribosome.	168
408616	pfam18846	baeRF_family5	Bacterial archaeo-eukaryotic release factor family 5. Bacterial family of the archaeo-eukaryotic release factor superfamily. Likely to play roles in biological conflicts or regulation under stress conditions at the ribosome. This family unusually lacks the fusion to the C-terminal Pelota domain.	132
408617	pfam18847	LPD29	Large polyvalent protein associated domain 29. This is an alpha and beta fold domain with conserved polar residues that is found in polyvalent proteins of conjugative elements.	91
408618	pfam18848	baeRF_family6	Bacterial archaeo-eukaryotic release factor family 6. Bacterial family of the archaeo-eukaryotic release factor superfamily. Likely to play roles in biological conflicts or regulation under stress conditions at the ribosome.	149
408619	pfam18849	baeRF_family7	Bacterial archaeo-eukaryotic release factor family 7. Bacterial family of the archaeo-eukaryotic release factor superfamily. Likely to play roles in biological conflicts or regulation under stress conditions at the ribosome. Consistent with this, many members of the family associate on the genome with HPF-like ribosome hibernation factor.	144
408620	pfam18850	LPD30	Large polyvalent protein associated domain 30. This is an alpha and beta fold domain that is found in polyvalent proteins of conjugative elements.	121
408621	pfam18851	baeRF_family8	Bacterial archaeo-eukaryotic release factor family 8. Bacterial family of the archaeo-eukaryotic release factor superfamily. Likely to play roles in biological conflicts or regulation under stress conditions at the ribosome.	141
408622	pfam18852	LPD34	Large polyvalent protein associated domain 34. This is a predicted enzymatic alpha and beta fold domain with a large, prominent helix with conserved glutamate residue and several additional conserved residues including the motifs HTxN and SN. The domain is associated with polyvalent proteins of firmicute conjugative elements.	213
408623	pfam18853	LPD37	Large polyvalent protein associated domain 37. This is an alpha and beta fold domain that is found in polyvalent proteins that are likely to be phage/prophage-derived.	244
408624	pfam18854	baeRF_family10	Bacterial archaeo-eukaryotic release factor family 10. Bacterial family of the archaeo-eukaryotic release factor superfamily. Likely to play roles in biological conflicts or regulation under stress conditions at the ribosome.	140
408625	pfam18855	baeRF_family11	Bacterial archaeo-eukaryotic release factor family 11. Bacterial family of the archaeo-eukaryotic release factor superfamily. Likely to play roles in biological conflicts or regulation under stress conditions at the ribosome.	139
408626	pfam18856	baeRF_family12	Bacterial archaeo-eukaryotic release factor family 12. Bacterial family of the archaeo-eukaryotic release factor superfamily. Likely to play roles in biological conflicts or regulation under stress conditions at the ribosome.	138
408627	pfam18857	LPD38	Large polyvalent protein associated domain 38. This is an alpha and beta fold domain found in polyvalent proteins of phages and prophages.	189
408628	pfam18858	LPD39	Large polyvalent protein associated domain 39. This is a predicted enzymatic alpha-helical domain that is associated with polyvalent proteins of phages and prophages.	196
408629	pfam18859	acVLRF1	Actinobacteria/chloroflexi VLRF1 release factor. Archaeo-eukaryotic release factor domain family belonging to the VLRF1 clade, observed primarily in the actinobacteria and chloroflexi bacterial lineages. Contains a conserved glutamine residue in the release factor catalytic loop, suggesting it functions as an active peptidyl-tRNA hydrolase at the ribosome.	130
408630	pfam18860	AbiJ_NTD3	AbiJ N-terminal domain 3. Alpha + beta domain. Found fused to AbiJ-like HEPN. Fused to other domains presumably involved in defense.	167
408631	pfam18861	PTP_tm	Transmembrane domain of protein tyrosine phosphatase, receptor type J. Protein tyrosine phosphatases (PTPs) are known to be signaling molecules that regulate a variety of cellular processes, including cell growth, differentiation, mitotic cycle, and oncogenic transformation. PTP receptor type J possesses an extracellular region containing five fibronectin type III repeats, the transmembrane region included in this Pfam entry, and an intracytoplasmic catalytic domain.	161
408632	pfam18862	ApeA_NTD1	ApeA N-terminal domain 1. Mostly beta strands. Fused to HEPN (Apea). Several conserved aromatic residues, abundant but poorly conserved.	421
408633	pfam18863	AbiJ_NTD4	AbiJ N-terminal domain 4. Alpha + beta. Found fused to AbiJ-like HEPN and heat repeats.	153
408634	pfam18864	AbiTii	AbiTii. Alpha + beta domain. Found fused to the N-terminus of the c2405 family of HEPN domains and in few cases to Ymh.	186
408635	pfam18865	AbiJ_NTD5	AbiJ N-terminal domain 5. Mostly alpha helical. Found fused to AbiJ-like HEPN, and to other domains presumably involved in defense.	93
408636	pfam18866	CxC7	CxC7 like cysteine cluster associated with KDZ transposases. A predicted Zinc chelating domain present N-terminal to the KDZ transposase domain.	64
408637	pfam18867	HEPN-like_int	HEPN-like integron domain. This is a HEPN-like nuclease. Part of mobile integron element. The integron cassettes are known to be activated by stress conditions, thereby allowing swapping of genetic material that might be of adaptive value. Hence it is hypothesized that the HEPN domains present in some integron cassettes contribute to the stress response by functioning as RNases that induce dormancy by probably inhibiting translation and thus enabling survival of harsh conditions.	151
408638	pfam18868	zf-C2H2_3rep	Zinc finger C2H2-type, 3 repeats. This Pfam entry includes three instances of the Zinc finger C2H2-type. They contain a short beta hairpin and an alpha helix (beta/beta/alpha structure), where a single zinc atom is held in place by Cys(2)His(2) (C2H2) residues in a tetrahedral array.	126
408639	pfam18869	HEPN_RnaseLS	RnaseLS-like HEPN. RnaseLS-like HEPN.	124
408640	pfam18870	HEPN_RES_NTD1	HEPN/RES N-terminal domain 1. Mostly alpha helical. Fused to HEPN (MAE_28990 superfamily), RES domain, a potential RNase found in various toxin systems.	131
408641	pfam18871	HEPN_Toprim_N	HEPN/Toprim N-terminal domain 1. Alpha + beta domain. Fused to two distinct HEPN families: MAE_28990 and ERFG_01251 families, TOPRIM and a Mrr-like REase domain.	225
408642	pfam18872	Daz	Daz repeat. This short repeat is found in the Daz proteins in a varying number of copies. The molecular function of these repeats is unknown.	21
408643	pfam18873	Sgo0707_N1	Sgo0707 N-terminal domain. This domain is found at the N-terminus of the cell surface Sgo0707 protein. This domain is called the N1 domain and is involved in host colonisation. The largest domain, N1, comprises a putative binding cleft with a single cysteine located in its centre and exhibits an unexpected structural similarity to the variable domains of the streptococcal Antigen I/II adhesins.	265
408644	pfam18874	QPE	QPE domain. This short presumed domain is found in a small set of gram positive organisms in cell surface proteins with an N-terminal collagen binding domain. We have named this domain QPE after the most conserved sequence motif.	45
408645	pfam18875	AF4_int	AF4 interaction motif. This short motif found in the AF4 protein interacts with AF9.	15
408646	pfam18876	AF-4_C	AF-4 proto-oncoprotein C-terminal region. This family consists of AF4 (Proto-oncogene AF4) and FMR2 (Fragile X E mental retardation syndrome) nuclear proteins. These proteins have been linked to human diseases such as acute lymphoblastic leukaemia and mental retardation. The family also contains a Drosophila AF4 protein homologue Lilliputian which contains an AT-hook domain. Lilliputian represents a novel pair-rule gene that acts in cytoskeleton regulation, segmentation and morphogenesis in Drosophila.	262
408647	pfam18877	SSSPR-51	SSSPR-51 domain. This repeat domain is designated SSSPR-51, Streptococcal and Staphylococcal Surface Protein Repeat of size 51. These repeats are homologous to the listerial repeats of pfam13461, but shorter on average by about 8 amino acids.	48
408648	pfam18878	PPE-PPW	PPE-PPW subfamily C-terminal region. This entry represents the C-terminal region of a subfamily of PPE proteins known as the PPW subfamily. The PPW refers to three conserved residues found in the sequence alignment. The region also contains a second conserved motif GFGT.	48
408649	pfam18879	EspA_EspE	EspA/EspE family. This family of proteins includes Mycobacterium tuberculosis EspA and EspE proteins.	84
408650	pfam18880	UDI	Uracil-DNA glycosylase inhibitor. Uracil-DNA glycosylase inhibitor (UGI), found in Bacillus subtilis phage, is an inhibitor that inactivates the host uracil-DNA glycosylase (UDG), also known as uracil-DNA N-glycosylase (UNG). UDG is a highly conserved enzyme responsible for the initiation of uracil-base excision repair (1). UGI forms a tight non-covalent bond to UDG, completely inhibiting it from binding to DNA by inserting its beta-1 strand into the conserved DNA-binding groove of the enzyme (2). In complex with UDG, UGI folds into an alpha-beta-alpha sandwich structure formed by five-stranded antiparallel beta-strands and two helices (3).	82
408651	pfam18881	DUF5646	Family of unknown function (DUF5646). This is a family of unknown function. Family members include the archaeal homolog of the bacterial RelB, a toxin-antitoxin system which is activated during amino acid starvation. This family is mostly found in thermococcus archaea.	69
408652	pfam18882	DUF5647	Family of unknown function (DUF5647). This is a family of unknown function. Family members include the hypothetical protein TTHC002 from Thermus thermophilus. It has an alpha-beta-alpha-beta(3) structure and forms a dimer with a single beta-sheet, folded in a barrel-like shape.	79
408653	pfam18883	AC_1	Autochaperone Domain Type 1. This entry represents the autochaperone domain of type 1 (AC-1) in the Type Va Secretion System (T5aSS). Autotransporters (ATs) belong to a family of modular proteins secreted by the Type V, subtype a, secretion system (T5aSS) and considered as an important source of virulence factors in lipopolysaccharidic diderm bacteria (archetypical Gram-negative bacteria). The AC of type 1 with beta-fold appears as a prevalent and conserved structural element exclusively associated to beta-helical AT passenger.	113
408654	pfam18884	TSP3_bac	Bacterial TSP3 repeat. This entry contains a novel bacterial thrombospondin type 3 repeat which differs from the typical consensus by containing a glutamate in place of one of the calcium binding aspartate residues.	22
408655	pfam18885	DUF5648	Repeat of unknown function (DUF5648). This entry represents a repeat of approximately 40 residues in length. It is often associated with enzymatic domains in bacterial cell surface proteins. This entry may represent a beta-propeller repeat, although most proteins only possess three repeats rather than the expected 6-8 copies.	39
408656	pfam18886	DUF5649	Repeats of unknown function (DUF5649). This entry represents a series of potential beta-helix repeats found in a variety of putative bacterial adhesin proteins.	68
408657	pfam18887	MBG_3	MBG domain. This entry corresponds to an MBG (mirror beta grasp) domain. It is found in a variety of bacterial cell surface proteins.	72
408658	pfam18888	DUF5650	Repeat of unknown function (DUF5650). This entry represents a repeating region found in filamentous hemagglutinin proteins from various bacteria. This entry may contain a beta helix structure.	55
408659	pfam18889	Beta_helix_3	Beta helix repeat. This entry contains a 30 residue repeat found in a variety of bacterial cell surface proteins. This repeat is related to pfam14262, meaning that it has a beta-helix structure. The sequence repeat is quite glycine rich.	20
408660	pfam18890	FANCL_d2	FANCL UBC-like domain 2. This entry represents the second of three UBC-like domains found in the FANCL protein, which is the catalytic E3 ubiquitin ligase subunit of the FA complex (Fanconi anaemia). Eight subunits of the Fanconi anaemia gene products form a multisubunit nuclear complex which is required for mono-ubiquitination of a downstream FA protein, FANCD2.	91
408661	pfam18891	FANCL_d3	FANCL UBC-like domain 3. This entry represents the third of three UBC-like domains found in the FANCL protein, which is the catalytic E3 ubiquitin ligase subunit of the FA complex (Fanconi anaemia). Eight subunits of the Fanconi anaemia gene products form a multisubunit nuclear complex which is required for mono-ubiquitination of a downstream FA protein, FANCD2.	97
408662	pfam18892	DUF5651	Family of unknown function (DUF5651). This entry represents a probable zinc binding domain found at the C-terminus of proteins from some Firmicute bacteria. The function of these proteins is unknown.	54
408663	pfam18893	DUF5652	Family of unknown function (DUF5652). This entry represents a protein containing two transmembrane helices. Many of these proteins are found in organisms in the Candidate Phyla Radiation.	71
408664	pfam18894	PhageMetallopep	Putative phage metallopeptidase. This entry represents a probable metallopeptidase found in a variety of phage and bacterial proteomes.	141
408665	pfam18895	T4SS_pilin	Type IV secretion system pilin. This entry represents likely Type IV secretion system pilins.	71
408666	pfam18896	SLT_3	Lysozyme like domain. This entry represents a lysozyme like domain found in candidate phyla radiation bacteria. The domain contains several conserved cysteine and histidine residues suggesting that it may bind to zinc.	89
408667	pfam18897	DUF5653	Family of unknown function (DUF5653). This entry contains a group of bacterial proteins of no known function. The proteins are approximately 230 amino acids in length.	194
408668	pfam18898	DUF5654	Family of unknown function (DUF5654). This family of proteins is found in bacteria, archaea, eukaryotes and viruses. Proteins in this family are typically between 79 and 98 amino acids in length. The region contains two predicted transmembrane helices. The eukaryotic examples are found in the foraminiferan Reticulomyxa filosa, which contains several proteins with this family.	71
408669	pfam18899	DUF5655	Domain of unknown function (DUF5655). This family of proteins is found in bacteria, archaea, eukaryotes and viruses. Proteins in this family are typically between 122 and 304 amino acids in length.	110
408670	pfam18900	DUF5656	Protein of unknown function (DUF5656). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 237 and 274 amino acids in length. These proteins are likely to be integral membrane proteins.	240
408671	pfam18901	DUF5657	Family of unknown function (DUF5657). This family of small integral membrane proteins is found in bacteria. Proteins in this family are approximately 80 amino acids in length.	61
408672	pfam18902	DUF5658	Domain of unknown function (DUF5658). This family of proteins is found in bacteria, archaea and viruses. Proteins in this family are typically between 101 and 135 amino acids in length. There is a completely conserved aspartate and a conserved EXNP motif that may be functionally important.	82
408673	pfam18903	DUF5659	Domain of unknown function (DUF5659). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 78 and 90 amino acids in length.	75
408674	pfam18904	DUF5660	Domain of unknown function (DUF5660). This presumed domain is functionally uncharacterized. This domain family is found in bacteria and fungi, and is approximately 110 amino acids in length.	109
408675	pfam18905	DUF5661	Protein of unknown function (DUF5661). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea and viruses. Proteins in this family are typically between 89 and 148 amino acids in length.	71
408676	pfam18906	Phage_tube_2	Phage tail tube protein. This family consists of tube proteins which polymerise to form phage tails.	253
408677	pfam18907	DUF5662	Family of unknown function (DUF5662). This family of proteins is found in bacteria, archaea, eukaryotes and viruses. Proteins in this family are typically between 175 and 193 amino acids in length. Many proteins in this family are annotated as catalase, but this could not be verified.	157
408678	pfam18908	DUF5663	Protein of unknown function (DUF5663). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 102 and 113 amino acids in length.	86
408679	pfam18909	DUF5664	Siphovirus protein of unknown function (DUF5664). This family of proteins is found predominantly in siphoviruses. Proteins in this family are typically between 117 and 208 amino acids in length.	99
408680	pfam18910	DUF5665	Domain of unknown function (DUF5665). This entry represents a functionally uncharacterized family of integral membrane proteins. This protein family is found in bacteria, and is approximately 60 amino acids in length. There are several conserved glycines in the first transmembrane helix that may be functionally important.	56
408681	pfam18911	PKD_4	PKD domain. This entry is composed of PKD domains found in bacterial surface proteins.	81
408682	pfam18912	DZR_2	Double zinc ribbon domain. This domain family is found in bacteria, archaea and eukaryotes, and is approximately 60 amino acids in length. The family is found in association with pfam00156. This entry corresponds to two zinc ribbon motifs. This domain is found at the N-terminus of the ComF operon protein 3.	56
408683	pfam18913	FBPase_C	Fructose-1-6-bisphosphatase, C-terminal domain. This entry represents the C-terminal domain of Fructose-1-6-bisphosphatase enzymes. According to ECOD this domain has a Rossmann-like fold.	125
408684	pfam18914	DUF5666	Domain of unknown function (DUF5666). This presumed domain is functionally uncharacterized. This domain family is found in bacteria and archaea, and is approximately 60 amino acids in length. This domain is likely to adopt an OB-fold based on similarity to other families.	60
408685	pfam18915	DUF5667	Domain of unknown function (DUF5667). This presumed domain is functionally uncharacterized. This domain family is found in bacteria and archaea, and is typically between 95 and 113 amino acids in length.	102
408686	pfam18916	Lycopene_cyc	Lycopene cyclase. 	92
408687	pfam18917	DUF5668	Domain of unknown function (DUF5668). This entry is composed of two transmembrane helices that are often found in two or three copies in a protein. The members of this family are functionally uncharacterized. This domain family is found in bacteria and archaea, and is approximately 40 amino acids in length. This entry is often associated with pfam09922, a putative adhesive domain that adopts a beta helix fold.	42
408688	pfam18918	DUF5669	Family of unknown function (DUF5669). This is a family of unknown function. Family members are mostly found in gammaproteobacteria.	76
408689	pfam18919	DUF5670	Family of unknown function (DUF5670). This family of proteins is found in bacteria and archaea. Proteins in this family are approximately 50 amino acids in length. There is a single completely conserved residue W that may be functionally important. These proteins contain two transmembrane helices.	43
408690	pfam18920	DUF5671	Domain of unknown function (DUF5671). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are typically between 168 and 339 amino acids in length. These proteins are likely to be integral membrane proteins.	131
408691	pfam18921	Cyanophycin_syn	Cyanophycin synthase-like N-terminal domain. This domain is found at the N-terminus of cyanophycin synthase proteins and related enzymes from bacteria and archaea. It is approximately 120 amino acids in length. The family is found in association with pfam08245, pfam02875. This domain is found in isolation in some proteins.	115
408692	pfam18922	DUF5672	Protein of unknown function (DUF5672). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea, eukaryotes and viruses. Proteins in this family are typically between 260 and 408 amino acids in length. This entry corresponds to a region of about 200 amino acids in length with multiple conserved motifs. There are two conserved sequence motifs: GAP and NGG. In some proteins this domain is found associated with various glycosyl transferase enzyme domains, suggesting this domain has a related role in glycan biosynthesis.	165
408693	pfam18923	DUF5673	Domain of unknown function (DUF5673). This presumed domain is functionally uncharacterized. This domain family is found in bacteria and archaea, and is approximately 90 amino acids in length. The domain is usually found C-terminal to a pair of transmembrane helices.	66
408694	pfam18924	DUF5674	Protein of unknown function (DUF5674). This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and archaea. Proteins in this family are approximately 110 amino acids in length.	109
408695	pfam18925	DUF5675	Family of unknown function (DUF5675). This presumed domain is found in bacteria, archaea, alveolata and caudoviruses. Proteins in this family are typically between 133 and 179 amino acids in length.	114
408696	pfam18926	DUF5676	2TM family of unknown function (DUF5676). This family of presumed integral membrane proteins is found in bacteria and archaea. Proteins in this family are approximately 90 amino acids in length and contain two predicted transmembrane helices.	82
408697	pfam18927	CrtO	Glycosyl-4,4'-diaponeurosporenoate acyltransferase. This family of proteins is found in certain bacterial lineages. In Staphylococcus this protein is known to be Glycosyl-4,4'-diaponeurosporenoate acyltransferase, an enzyme involved in the final step of synthesis of staphyloxanthin, an orange pigment found in most S. aureus strains. Proteins in this family are typically between 157 and 184 amino acids in length. Members of the family contain an EXXH motif that may be functionally important.	119
408698	pfam18928	DUF5677	Family of unknown function (DUF5677). This family of proteins is found in bacteria, archaea, eukaryotes and viruses. Proteins in this family are typically between 250 and 347 amino acids in length. These proteins contain a conserved RXXXE motif and an invariant histidine that may be functionally important.	163
408699	pfam18929	DUF5678	Family of unknown function (DUF5678). This presumed domain family is found in bacteria and archaea. Proteins in this family are typically between 64 and 76 amino acids in length.	50
408700	pfam18930	DUF5679	Domain of unknown function (DUF5679). This family of domains is found in bacteria, archaea, eukaryotes and viruses. Proteins in this family are typically between 48 and 68 amino acids in length. These domains contain four conserved cysteines, suggesting that this domain is zinc binding.	40
408701	pfam18931	DUF5680	Domain of unknown function (DUF5680). This family of presumed domains is found in bacteria and archaea. Proteins in this family are typically between 152 and 220 amino acids in length. In some of the proteins this domain is associated with an N-terminal HTH domain, suggesting that they are transcriptional regulators. This suggests that this may be a previously unidentified ligand binding domain.	104
408702	pfam18932	DUF5681	Family of unknown function (DUF5681). This domain family is found in bacteria, archaea and viruses, and is typically between 75 and 86 amino acids in length. There is a conserved SGNP sequence motif. There are two completely conserved G residues that may be functionally important.	77
408703	pfam18933	PsbP_2	PsbP-like protein. This family is related to the PsbP family.	184
408704	pfam18934	DUF5682	Family of unknown function (DUF5682). This is a family of unknown function, mostly found in bacteria.	729
408705	pfam18935	DUF5683	Family of unknown function (DUF5683). This is a domain of unknown function found mostly in bacteria.	151
408706	pfam18936	DUF5684	Family of unknown function (DUF5684). This is a family of unknown function mostly found in bacteria. Some family members can be found at the N-terminal region of pfam00717.	80
408707	pfam18937	DUF5685	Family of unknown function (DUF5685). This is a family of unknown function mostly found in bacteria.	271
408708	pfam18938	aRib	Atypical Rib domain. This entry contains atypical Rib (aRib) domains. These are found in a variety of bacterial cell surface proteins. These proteins share a conserved motif with the Rib domain (YPDXXD). The structure of the aRib domain has been solved from two proteins, the SrpA adhesin and the GspB adhesin. In these proteins this domain has been termed the unique domain due to its lack of similarity to any other known structures at the time. The aRib domain from SrpA has been shown to mediate a dimer interaction. This family has been added to the E-set clan based on its similarity to the Rib domain, although it does not contain the Ig fold.	71
408709	pfam18939	DUF5686	Family of unknown function (DUF5686). This is a family of unknown function, mostly found in bacteria. Family members can be found at the C-terminal region of pfam13715.	666
408710	pfam18940	DUF5687	Family of unknown function (DUF5687). This is a family of unknown function mainly found in bacteria.	484
408711	pfam18941	DUF5688	Family of unknown function (DUF5688). This is a family of unknown function found in bacteria.	290
408712	pfam18942	DUF5689	Family of unknown function (DUF5689). This is a domain of unknown function. It is mostly found in bacteria and can be present in multiple copies.	220
408713	pfam18943	DUF5690	Family of unknown function (DUF5690). This is a family of unknown function mostly found in bacteria.	382
408714	pfam18944	DUF5691	Family of unknown function (DUF5691). This is a family of unknown function. Some family members overlap with pfam13646.	209
408715	pfam18945	VipB_2	EvpB/VC_A0108, tail sheath gpW/gp25-like domain. EvpB is a family of Gram-negative probable type VI secretion system components of the tail sheath. They have been known as COG:COG3517. These sheath-components, of which there are many copies in the sheath, are also variously referred to as VipA/VipB and TssB/TssC. On contact with another bacterial cell the sheath contracts and pushes the puncturing device and tube through the cell envelope and punches the target bacterial cell. This entry represents the gpW/gp25-like domain.	112
408716	pfam18946	Apex	GpV Apex motif. This entry represents a short motif found at the C-terminus of Phage gpV proteins. These proteins act as a spike for piercing the host membrane. The apex motif contains a conserved HXH motif that coordinates an iron ion.	23
408717	pfam18947	HAMP_2	HAMP domain. 	67
408718	pfam18948	DUF5692	Family of unknown function (DUF5692). This is a family of unknown function mostly found in bacteria.	304
408719	pfam18949	DUF5693	Family of unknown function (DUF5693). This is a family of unknown function found in bacteria.	608
408720	pfam18950	DUF5694	Family of unknown function (DUF5694). This is a family of unknown function, mostly found in bacteria.	185
408721	pfam18951	DUF5695	Family of unknown function (DUF5695). This is a family of unknown function mainly found in fungi and bacteria.	857
408722	pfam18952	DUF5696	Family of unknown function (DUF5696). This is a family of unknown function with some overlap with clan family members of CL0058.	608
408723	pfam18953	SAP_new25	SAP domain-containing new25. This family includes Schizosaccharomyces specific SAP domain containing proteins such as gene product new25. The SAP (SAF-A/B, Acinus and PIAS) motif is a DNA/RNA binding domain found in diverse nuclear and cytoplasmic proteins. For instance, the SAP domain of SUMO E3 ligase PIAS1 from human is shown to bind an A/T-rich DNA.	51
408724	pfam18954	DUF5697	Family of unknown function (DUF5697). This entry likely contains an N-terminal helix-turn-helix motif and is therefore likely to be a transcriptional regulator.	161
408725	pfam18955	DUF5698	Domain of unknown function (DUF5698). This family is functionally uncharacterized. This family is found in bacteria and archaea, and is approximately 60 amino acids in length and contains two probable transmembrane helices. This entry is found in association with pfam10035. The C-terminal transmembrane helix contains a GXXXGXXXG motif that is characteristic of transmembrane helices that dimerise.	58
408726	pfam18956	DUF5699	Family of unknown function (DUF5699). 	61
408727	pfam18957	RibLong	Long Rib domain. This entry represents the Long Rib domain that is closely related to the pfam08428 Rib domain but has a conserved insertion. These domains are found in bacterial cell surface proteins.	93
408728	pfam18958	DUF5700	Putative zinc dependent peptidase (DUF5700). This entry represents a group of putative zinc dependent peptidases that have the characteristic HEXXH motif. This family is most related to pfam10026.	278
408729	pfam18959	DUF5701	Family of unknown function (DUF5701). This is a family of unknown function mostly found in bacteria.	193
408730	pfam18960	DUF5702	Family of unknown function (DUF5702). This family is mostly found in bacteria.	276
408731	pfam18961	DUF5703_N	Domain of unknown function (DUF5703). This is an N-terminal domain of unknown function mostly found in bacteria. It is possible that this domain might be a putative glycoside hydrolase. This family belongs to the Galactose Mutarotase-like superfamily.	287
408732	pfam18962	Por_Secre_tail	Secretion system C-terminal sorting domain. Species that include Porphyromonas gingivalis, Fibrobacter succinogenes, Flavobacterium johnsoniae, Cytophaga hutchinsonii, Gramella forsetii, Prevotella intermedia, and Salinibacter ruber have on average twenty or more copies of this C-terminal domain, associated with sorting to the outer membrane and covalent modification. This domain targets proteins to type IX secretion systems and is secreted then cleaved off by a C-terminal signal peptidase. Based on similarity to other families it is likely that this domain adopts an immunoglobulin like fold.	72
408733	pfam18963	DUF5703	Family of unknown function (DUF5703). This family includes members mostly found in Actinobacteria. Some family members are thought to be dihydroorotate dehydrogenases, however there is no evidence to support the relationship to pfam01180.	51
408734	pfam18964	DUF5704	Family of unknown function (DUF5704). This entry is mainly found in Firmicutes and has a notable number of highly conserved aromatic amino acids. The entry is found towards the C-terminus of pfam02368 in some of the members, and some members contain two copies.	185
408735	pfam18965	DUF5705	Family of unknown function (DUF5705). 	1052
408736	pfam18966	Lipoprotein_23	Uncharacterized lipoprotein. This entry includes members found in Actinobacteria (mostly Streptomycetaceae, Micromonosporales and Pseudonocardiales). Some members are annotated as lipoproteins.	179
408737	pfam18967	DUF5706	Family of unknown function (DUF5706). 	107
408738	pfam18968	DUF5707	Family of unknown function (DUF5707). 	98
408739	pfam18969	DUF5708	Family of unknown function (DUF5708). This family includes members in Actinobacteria. All family members contain two transmembrane segments.	60
408740	pfam18970	DUF5709	Family of unknown function (DUF5709). 	49
408741	pfam18971	CagA_N	CagA protein. The Helicobacter pylori type IV secretion effector CagA is a major bacterial virulence determinant and critical for gastric carcinogenesis. X-ray crystallographic analysis of the N-terminal CagA fragment (residues 1-876) revealed that the region has a structure comprised of three discrete domains. Domain I constitutes a mobile CagA N terminus, while Domain II tethers CagA to the plasma membrane by interacting with membrane phosphatidylserine. Domain III interacts intramolecularly with the intrinsically disordered C-terminal region, and this interaction potentiates the pathogenic scaffold/hub function of CagA.	876
408742	pfam18972	Wheel	Cns1/TTC4 Wheel domain. The wheel domain is found at the C-terminus of yeast Cns1 and human TTC4 proteins. The structure of the domain shows an overall fold consisting of a twisted five-stranded beta sheet surrounded by several alpha helices. The Hsp90 chaperone machinery in eukaryotes comprises a number of distinct accessory factors. Cns1 is one of the few essential co-chaperones in yeast. Cns1 is important for maintaining translation elongation, specifically chaperoning the elongation factor eEF2. In this context, Cns1 interacts with the novel co-factor Hgh1 and forms a quaternary complex together with eEF2 and Hsp90.	114
408743	pfam18973	CBL	Putative Chitin binding like. This family includes host-selective toxins such as SnTox1 found in S. nodorum. SnTox1 is a necrotrophic effector that contains 6 cysteine residues, a common feature for some fungal avirulence effectors such as the Avr and ECP effectors from Cladosporium fulvum. The high content of cysteine residues and high stability suggest that SnTox1 may function in the plant apoplastic space which is abundant in plant defense components. Protein sequence analysis indicates that SnTox1 contains a C-terminal chitin binding (CB) like motif. Three-dimensional (3D) structure-based sequence alignment suggested that the putative CB motif in SnTox1 was more similar to those of plant-specific ChtBDs than to Avr4 proteins, which are related to invertebrate ChtBDs. Furthermore, SnTox1 contained all secondary-structure-related residues including the strictly conserved b-strand-forming 'CCS' motif found only in plant-specific ChtBD1 proteins.	48
408744	pfam18974	DUF5710	Domain of unknown function (DUF5710). This is a domain of unknown function which can be found in DNA primases such as TraC.	44
408745	pfam18975	DUF5711	Family of unknown function (DUF5711). This is a family of unknown function mostly found in bacteria and archaea. Some members contain WD repeats.	344
408746	pfam18976	DUF5712	Family of unknown function (DUF5712). This is a family of unknown function mainly found in Bacteroidetes.	292
408747	pfam18977	DUF5713	Family of unknown function (DUF5713). This is a family of unknown function, mainly found in bacteria.	107
408748	pfam18978	DUF5714	Family of unknown function (DUF5714). This is a family of unknown function, mainly found in bacteria. It is distantly related to Pfam family pfam09719, which is a heme binding cytochrome. This domain is found associated with other domains such as the Radical SAM domain and a methyltransferase.	173
408749	pfam18979	DUF5715	Family of unknown function (DUF5715). This is a family of unknown function, mainly found in bacteria.	170
408750	pfam18980	DUF5716_C	Family of unknown function (DUF5716) C-terminal. This is a C-terminal domain found in bacterial sequences of unknown function.	295
408751	pfam18981	InlK_D3	Internalin K domain (D3/D4). This domain is found at the elbow of internalin surface proteins, used by the bacteria to invade mammalian cells. This domain has an Ig-like fold.	75
408752	pfam18982	DUF5716	Family of unknown function (DUF5716). This is a family of unknown function, mostly found in bacteria.	434
408753	pfam18983	DUF5717	Family of unknown function (DUF5717) C-terminal. This is a C-terminal domain of a family of unknown function found in bacteria.	306
408754	pfam18984	DUF5717_N	Family of unknown function (DUF5717) N-terminal. This is the N-terminal domain found in sequences of unknown function in bacteria.	875
408755	pfam18985	DUF5718	Family of unknown function (DUF5718). This is a family of unknown function, mostly found in bacteria.	250
408756	pfam18986	DUF5719	Family of unknown function (DUF5719). This is a family of unknown function, mostly found in bacteria.	319
408757	pfam18987	DUF5720	Family of unknown function (DUF5720). This is a family of unknown function, mostly found in bacteria.	100
408758	pfam18988	DUF5721	Family of unknown function (DUF5721). This is a family of unknown function, mostly found in Firmicutes.	149
408759	pfam18989	DUF5722	Family of unknown function (DUF5722). This is a family of unknown function mainly found in bacteria.	394
408760	pfam18990	DUF5723	Family of unknown function (DUF5723). This is a family of unknown function mainly found in bacteria.	377
408761	pfam18991	DUF5724	Family of unknown function (DUF5724). This is a family of unknown function mainly found in bacteria.	342
408762	pfam18992	DUF5725	Family of unknown function (DUF5725). A highly conserved domain of unknown function found in Platyhelminthes.	158
408763	pfam18993	Rv0078B	Rv0078B-related antitoxin. Putative antitoxin protein according to TASmania database.	63
408764	pfam18994	Prophage_tailD1	Prophage endopeptidase tail N-terminal domain. This domain represents the N-terminal domain of prophage tail proteins that are probably acting as endopeptidases. This domain has a RIFT related fold.	84
408765	pfam18995	PRT6_C	Proteolysis_6 C-terminal. This is the C-terminal domain mainly found in E3 ubiquitin ligases. Proteolysis 6 (PRT6) encodes a ubiquitin E3 ligase belonging to the N-end rule pathway of targeted protein degradation, which is a specialized subset of the ubiquitin proteasome system. In Arabidopsis, at least two N-recognins (E3 ubiquitin ligases) with different substrate specificities exist, namely PROTEOLYSIS1 (PRT1) and PRT6.	444
408766	pfam18996	DUF5726	Family of unknown function (DUF5726). This family is found in various Platyhelminthes, with many of the sequences annotated as being the polyprotein of the transposon Ty3-I Gag-Pol. However, the genomic location of other members suggests that this region is not always found in a transposon. The family members are characterized by a highly conserved DxDxDxCC motif. The function of this family is currently unknown.	90
408767	pfam18997	DUF5727	Family of unknown function (DUF5727). A conserved domain of unknown function found in Platyhelminthes, closely related to glycosylphosphatidylinositol (GPI)-anchored protein GP50, known as Diagnostic antigen GP50. GP50 is a family of highly expressed antigens exclusive to tapeworms.	192
408768	pfam18998	Flg_new_2	Divergent InlB B-repeat domain. This family of domains are found in bacterial cell surface proteins. They are often found in tandem array. This domain is closely related to pfam09479.	74
408769	pfam18999	DUF5728	Family of unknown function (DUF5728). This is a highly conserved domain of unknown function found in Platyhelminthes, with many of the sequences annotated as being the small conductance calcium-activated potassium channel protein. However, the location suggests that this domain belongs to the intracellular amino terminus of these transmembrane proteins, immediately next to the calcium-activated SK potassium channel.	188
408770	pfam19000	DUF5729	Family of unknown function (DUF5729). This is a highly conserved domain of unknown function found in Platyhelminthes, with many of the sequences annotated as being the small conductance calcium-activated potassium channel protein. However, the location suggests that this domain belongs to the intracellular amino terminus of these transmembrane proteins, near the calcium-activated SK potassium channel.	141
408771	pfam19001	DUF5730	Family of unknown function (DUF5730). A highly conserved domain of unknown function found in Platyhelminthes, in which the first 21 residues correspond to a transmembrane domain.	182
408772	pfam19002	DUF5731	Family of unknown function (DUF5731). A highly conserved domain of unknown function found in Platyhelminthes.	108
408773	pfam19003	DUF5732	Family of unknown function (DUF5732). This family is found in various Platyhelminthes, with many of the sequences annotated as being the STARP-like antigen. This highly conserved domain is located next to the basic helix-loop-helix motif found in this antigen. The function of this family is currently unknown.	83
408774	pfam19004	DUF5733	Family of unknown function (DUF5733). A highly conserved domain of unknown function found in Platyhelminthes.	115
408775	pfam19005	DUF5734	Family of unknown function (DUF5734). This is a conserved domain found in various Platyhelminthes. The function of this family is still unknown.	83
408776	pfam19006	DUF5735	Family of unknown function (DUF5735). A highly conserved domain of unknown function found in different Platyhelminthes.	138
408777	pfam19007	DUF5736	Family of unknown function (DUF5736). A highly conserved domain of unknown function found in various Platyhelminthes.	111
408778	pfam19008	DUF5737	Family of unknown function (DUF5737). A highly conserved domain of unknown function found in various Platyhelminthes.	99
408779	pfam19009	DUF5738	Family of unknown function (DUF5738). A highly conserved domain of unknown function found in various Platyhelminthes.	104
408780	pfam19010	DUF5739	Family of unknown function (DUF5739). A highly conserved domain of unknown function found in various Platyhelminthes.	114
408781	pfam19011	DUF5740	Family of unknown function (DUF5740). This family is found in various Platyhelminthes, with many of the sequences annotated as being part of death domain-containing proteins. This highly conserved domain is located at the amino terminus. The function of this family is currently unknown.	94
408782	pfam19012	DUF5741	Family of unknown function (DUF5741). This coiled-coil is found in various Platyhelminthes, with many of the sequences annotated as being uveal autoantigen with coiled-coil. This region is one of the numerous coiled-coil regions which constitute this antigen. The function is currently unknown.	82
408783	pfam19013	DUF5742	Family of unknown function (DUF5742). A highly conserved domain of unknown function found in various Platyhelminthes.	111
408784	pfam19014	DUF5743	Family of unknown function (DUF5743). A highly conserved domain of unknown function found in various Platyhelminthes.	186
408785	pfam19015	DUF5744	Family of unknown function (DUF5744). This is a highly conserved domain found in various Platyhelminthes. Its function is currently unknown.	81
408786	pfam19016	DUF5745	Domain of unknown function (DUF5745). This is a domain of unknown function found in Platyhelminthes. It shows homology with the calponin homology (CH) domain.	59
408787	pfam19017	DUF5746	Domain of unknown function (DUF5746). This is a highly conserved domain found in various Platyhelminthes. Its function is currently unknown, with some of the sequences annotated as being Palmitoyltransferase. This highly conserved domain is located at the amino terminus, next to a DHHC domain (named for its signature tetrapeptide Asp-His-His-Cys).	127
408788	pfam19018	Vanin_C	Vanin C-terminal domain. This domain is found at the C terminus of Vanin 1 and related proteins.	165
408789	pfam19019	Phlebo_G2_C	Phlebovirus glycoprotein G2 C-terminal domain. This family consists of several Phlebovirus glycoprotein G2 C-terminal Ig-like domains.	171
408790	pfam19020	Ta1207	Ta1207 family. The function of this family is unknown. The protein forms a homopentameric complex in T. acidophilum. Each protein is composed of two structurally similar domains.	276
408791	pfam19021	DUF5747	Family of unknown function (DUF5747). The function of this protein family is unknown.	197
408792	pfam19022	DUF5748	Family of unknown function (DUF5748). The function of this family of euryarchaeal proteins is unknown.	101
408793	pfam19023	DUF5749	Family of unknown function (DUF5749). The function of this family of euryarchaeal proteins is unknown. The structure of this protein forms a beta barrel.	78
408794	pfam19024	DUF5750	Family of unknown function (DUF5750). The function of this family of proteins is unknown.	91
408795	pfam19025	DUF5751	Family of unknown function (DUF5751). The function of this archaeal family is unknown.	116
408796	pfam19026	HYPK_UBA	HYPK UBA domain. This entry represents the UBA domain found at the C-terminus of the HYPK protein and its homologues. This domain in HYPK mediates a protein interaction with the Naa15 C-terminus.	41
408797	pfam19027	DUF5752	Family of unknown function (DUF5752). This family includes the OrfY protein from the hyperthermophilic archaeon Thermoproteus tenax. This protein co-occurs with the treS/P protein in an operon regulating the synthesis of trehalose. The structure of this protein shows it contains an internal duplication.	208
408798	pfam19028	TSP1_spondin	Spondin-like TSP1 domain. This entry represents a sub-type of TSP1 domains that have an alternative disulphide binding pattern compared to the canonical TSP1 domain.	52
408799	pfam19029	DUF883_C	DUF883 C-terminal glycine zipper region. This family corresponds to the C-terminal presumed transmembrane helix found in DUF883 proteins. The helix contains a glycine zipper motif suggestive of dimerisation.	30
408800	pfam19030	TSP1_ADAMTS	Thrombospondin type 1 domain. This subfamily of thrombospondin type 1 repeats is mainly found in ADAMTS proteins.	55
408801	pfam19031	Intu_longin_1	First Longin domain of INTU, CCZ1 and HPS4. This entry is specific to the first Longin domain of the HerMon (Hermansky-Pudlak syndrome and MON1/CCZ1) family, including protein sequences of INTU, CCZ1 and HPS4 families. The Mon1/Ccz1 complex (MC1) is the GDP/GTP exchange factor (GEF) for the Rab GTPase Ypt7/Rab7 during vesicular trafficking. The Hps1/Hps4 complex (BLOC-3) is a Rab32 and Rab38 GEF and is required for biogenesis of melanosomes and platelet dense granules. Inturned (INTU) and Fuzzy (FUZ) proteins interact as members of the ciliogenesis and planar polarity effector (CPLANE) complex that controls recruitment of intraflagellar transport machinery to the basal body of primary cilia.	112
408802	pfam19032	Intu_longin_2	Intu longin-like domain 2. This entry represents a longin-like domain found in Intu and related proteins.	119
408803	pfam19033	Intu_longin_3	Intu longin-like domain 3. This entry represents a longin-like domain found in Intu and related proteins.	97
408804	pfam19034	RnlA-toxin_C	RNase LS, bacterial toxin C terminal. RnlA toxin is an RNase LS and a putative toxin of a bacterial toxin-antitoxin pair. Toxin-antitoxin systems consist of a stable toxin and an unstable antitoxin. In this case, a novel type II system, RnlA is the stable toxin that causes inhibition of cell growth and rapidly degrades T4 late mRNAs to prevent their expression, and this is neutralized by the activity of the unstable antitoxin RnlB.	125
408805	pfam19035	TSP1_CCN	CCN3 Nov like TSP1 domain. This entry represents a sub-type of TSP1 domains found in matricellular CCN proteins that have an alternative disulphide binding pattern compared to the canonical TSP1 domains.	44
408806	pfam19036	Fuz_longin_1	First Longin domain of FUZ, MON1 and HPS1. This entry is specific to the first Longin domain of the HerMon (Hermansky-Pudlak syndrome and MON1-CCZ1) family, including protein sequences of FUZ, MON1 and HPS1 families. The Mon1/Ccz1 complex (MC1) is the GDP/GTP exchange factor (GEF) for the Rab GTPase Ypt7/Rab7 during vesicular trafficking. The Hps1/Hps4 complex (BLOC-3) is a Rab32 and Rab38 GEF and is required for biogenesis of melanosomes and platelet dense granules. Inturned (INTU) and Fuzzy (FUZ) proteins interact as members of the ciliogenesis and planar polarity effector (CPLANE) complex that controls recruitment of intraflagellar transport machinery to the basal body of primary cilia.	125
408807	pfam19037	Fuz_longin_2	Second Longin domain of FUZ, MON1 and HPS1. This entry represents a longin-like domain found in Fuz and related proteins. This entry is specific to the second Longin domain of the HerMon (Hermansky-Pudlak syndrome and MON1-CCZ1) family, including protein sequences of FUZ, MON1 and HPS1 families. The Mon1/Ccz1 complex (MC1) is the GDP/GTP exchange factor (GEF) for the Rab GTPase Ypt7/Rab7 during vesicular trafficking. The Hps1/Hps4 complex (BLOC-3) is a Rab32 and Rab38 GEF and is required for biogenesis of melanosomes and platelet dense granules. Inturned (INTU) and Fuzzy (FUZ) proteins interact as members of the ciliogenesis and planar polarity effector (CPLANE) complex that controls recruitment of intraflagellar transport machinery to the basal body of primary cilia.	98
408808	pfam19038	Fuz_longin_3	Third Longin domain of FUZ, MON1 and HPS1. This entry represents a longin-like domain found in Fuz and related proteins. This entry is specific to the third Longin domain of the HerMon (Hermansky-Pudlak syndrome and MON1-CCZ1) family, including protein sequences of FUZ, MON1 and HPS1 families. The Mon1/Ccz1 complex (MC1) is the GDP/GTP exchange factor (GEF) for the Rab GTPase Ypt7/Rab7 during vesicular trafficking. The Hps1/Hps4 complex (BLOC-3) is a Rab32 and Rab38 GEF and is required for biogenesis of melanosomes and platelet dense granules. Inturned (INTU) and Fuzzy (FUZ) proteins interact as members of the ciliogenesis and planar polarity effector (CPLANE) complex that controls recruitment of intraflagellar transport machinery to the basal body of primary cilia.	103
408809	pfam19039	ASK_PH	ASK kinase PH domain. This PH-like domain is found in the regulatory region of ASK1 and related kinase proteins. This domain is found adjacent to the kinase domain.	97
408810	pfam19040	SGNH	SGNH domain (fused to AT3 domains). This entry includes SGNH domains that are found fused to membrane domains from the AT3 families pfam01757.	235
408811	pfam19041	CBP30	Nuclear cap binding complex subunit CBP30. This entry represents the CBP30 component of the trypanosome nuclear cap binding complex. Trypanosomes have a different cap 4 structure for mRNAs. CBP30 is part of the complex that recognizes this cap.	299
408812	pfam19042	CBP110	Nuclear cap binding complex subunit CBP110. This entry represents the CBP110 component of the trypanosome nuclear cap binding complex. Trypanosomes have a different cap 4 structure for mRNAs. CBP110 is part of the complex that recognizes this cap.	987
408813	pfam19043	CBP66	Nuclear cap binding complex subunit CBP66. This entry represents the CBP66 component of the trypanosome nuclear cap binding complex. Trypanosomes have a different cap 4 structure for mRNAs. CBP66 is part of the complex that recognizes this cap.	583
408814	pfam19044	P-loop_TraG	TraG P-loop domain. This entry represents the P-loop domain found in the TraG conjugation protein.	413
408815	pfam19045	Ligase_CoA_2	Ligase-CoA domain. This domain is related to pfam00549 and adopts a flavodoxin fold.	162
408816	pfam19046	GM130_C	GM130 C-terminal binding motif. This entry represents the C-terminal motif from the GM130 protein that is bound by the GRASP65 PDZ domain pfam04495.	46
408817	pfam19047	HOOK_N	HOOK domain. This domain is found at the N-terminus of HOOK proteins.	151
408818	pfam19048	SidE_mART	SidE mono-ADP-ribosyltransferase domain. This domain found in the SidE bacterial effector protein mediates the mono-ADP-ribosylation of ubiquitin, which is then ligated to host proteins by the pfam12252 domain.	313
408819	pfam19049	SidE_DUB	SidE DUB domain. This entry represents the N-terminal deubiquitinating domain from the SidE protein. The SidE protein is a bacterial effector protein that can ubiquitinate and presumably deubiquitinate host proteins.	173
408820	pfam19050	PhoD_2	PhoD related phosphatase. This entry contains a domain that is presumed to be a phosphatase enzyme based on its similarity to pfam09423.	543
408821	pfam19051	GFO_IDH_MocA_C2	Oxidoreductase family, C-terminal alpha/beta domain. This entry represents a domain found at the C-terminus of a variety of oxidoreductase enzymes. The domain is related to pfam02894.	254
408822	pfam19052	BRINP	BMP/retinoic acid-inducible neural-specific protein. This entry represents the BMP/retinoic acid-inducible neural-specific protein (BRINP) family, including BRINP1/2/3. They are predominantly and widely expressed in both the central nervous system (CNS) and peripheral nervous system (PNS). They inhibit neuronal cell proliferation by negative regulation of the cell cycle G1/S transition.	448
408823	pfam19053	EccD	EccD-like transmembrane domain. This entry represents an integral membrane component of the ESX type VII secretion systems, EccD. This region includes 11 predicted transmembrane alpha helices.	392
408824	pfam19054	DUF5753	Domain of unknown function (DUF5753). This entry represents a putative ligand binding domain found in bacterial transcription regulators that have an N-terminal HTH domain pfam13560.	178
408825	pfam19055	ABC2_membrane_7	ABC-2 type transporter. 	409
408826	pfam19056	WD40_2	WD40 repeated domain. This entry contains an array of WD40 repeats found in RhoGEF proteins.	487
408827	pfam19057	PH_19	PH domain. This entry contains a PH domain found in RhoGEF proteins.	146
408828	pfam19058	DUF5754	Family of unknown function (DUF5754). This is a family of uncharacterized proteins of unknown function found in viruses.	49
408829	pfam19059	DUF5755	Family of unknown function (DUF5755). This family of unknown function appears to be primarily restricted to mimiviridae and phycodnaviridae. This entry may be found at the N terminus of longer proteins, which contain a C-terminal DNA polymerase family domain and a DNA polymerase family B exonuclease domain.	92
408830	pfam19060	DUF5756	Family of unknown function (DUF5756). This family of unknown function is predominantly found in Phycodnaviridae.	66
408831	pfam19061	DUF5757	Family of unknown function (DUF5757). This is a family of uncharacterized proteins of unknown function found in viruses. It is thought to be part of the early transcription factor large subunit.	94
408832	pfam19062	DUF5758	Family of unknown function (DUF5758). This is a family of uncharacterized proteins of unknown function found in viruses, as well as in bacteria. It is predicted to be a pentapeptide repeat-containing protein.	105
408833	pfam19063	DUF5759	Family of unknown function (DUF5759). This is a family of uncharacterized proteins of unknown function found in viruses.	71
408834	pfam19064	DUF5760	Family of unknown function (DUF5760). This is a family of uncharacterized proteins of unknown function found in Phycodnaviridae and Mimiviridae.	86
408835	pfam19065	DUF5761	Family of unknown function (DUF5761). This is a family of uncharacterized proteins of unknown function found in viruses.	62
408836	pfam19066	DUF5762	Family of unknown function (DUF5762). This is a family of uncharacterized proteins of unknown function found in viruses. It is inferred from homology to be a membrane-component.	69
408837	pfam19067	DUF5763	Family of unknown function (DUF5763). This is a family of uncharacterized proteins of unknown function found predominantly in viruses. However, some matches with predicted proteins from Archaea and Eukaryotes were also found.	39
408838	pfam19068	DUF5764	Family of unknown function (DUF5764). This is a family of uncharacterized proteins of unknown function found in viruses, particularly in Conferred and Mummified.	134
408839	pfam19069	DUF5765	Family of unknown function (DUF5765). This is a family of proteins of unknown function found in viruses, which are thought to be membrane proteins.	91
408840	pfam19070	DUF5766	Family of unknown function (DUF5766). This is a family of uncharacterized proteins of unknown function found in viruses.	77
408841	pfam19071	DUF5767	Family of unknown function (DUF5767). This is a family of uncharacterized proteins of unknown function found in viruses.	85
408842	pfam19072	DUF5768	Family of unknown function (DUF5768). This is a family of uncharacterized proteins of unknown function found in viruses.	123
408843	pfam19073	DUF5769	Family of unknown function (DUF5769). This is a family of uncharacterized proteins of unknown function found in Mimiviridae.	190
408844	pfam19074	DUF5770	Family of unknown function (DUF5770). This is a family of uncharacterized proteins of unknown function found in Iridoviridae.	131
408845	pfam19075	DUF5771	Family of unknown function (DUF5771). This is a family of uncharacterized proteins of unknown function found in viruses which apparently has N-acetyltransferase activity.	69
408846	pfam19076	CshA_repeat	Surface adhesin CshA repetitive domain. Repeat domain from surface fibrillar adhesin CshA. This domain forms several tandem repeats with high sequence identity. CshA is found primarily in streptococci species, but also in other Gram+ bacteria.	97
408847	pfam19077	Big_13	Bacterial Ig-like domain. Presumed domain found as tandem repeats of high sequence identity in bacterial cell surface proteins.	102
408848	pfam19078	Big_12	Bacterial Ig-like domain. Presumed domain found as tandem repeats of high sequence identity in bacterial cell surface proteins.	102
408849	pfam19079	CFSR	Collagen-flanked surface repeat. Repeat flanked by collagen-like CXX motifs found in bacterial cell surface proteins.	49
408850	pfam19080	DUF5772	Family of unknown function (DUF5772). This is a family of proteins of unknown function found in viruses which appears to contain transmembrane proteins inferred from homology.	84
408851	pfam19081	Ig_7	Ig-like domain CHU_C associated. Presumed Ig-like domain found as tandem repeats in proteins with the gliding motility-associated C-terminal domain CHU_C.	84
408852	pfam19082	DUF5773	Family of unknown function (DUF5773). This is a family of uncharacterized proteins of unknown function found in Phycodnaviridae.	175
408853	pfam19083	DUF5774	Family of unknown function (DUF5774). This is a family of uncharacterized proteins of unknown function found in viruses.	123
408854	pfam19084	DUF5775	Family of unknown function (DUF5775). This is a family of uncharacterized proteins of unknown function, found in phycodnaviruses. The protein contains a predicted N-terminal transmembrane helix.	81
408855	pfam19085	Choline_bind_2	Choline-binding repeat. This entry contains a pair of presumed choline-binding repeats that are often found adjacent to pfam01473.	38
408856	pfam19086	Terpene_syn_C_2	Terpene synthase family 2, C-terminal metal binding. 	198
408857	pfam19087	DUF5776	Domain of unknown function (DUF5776). Presumed stalk domain found in bacterial surface proteins forming tandem repeats with high sequence identity. This domain is also associated with other known bacterial surface protein stalks and adhesive domains.	67
408858	pfam19088	TUTase	TUTase nucleotidyltransferase domain. This nucleotidyltransferase domain is found in TUTase enzymes.	333
408859	pfam19089	DUF5777	Membrane bound beta barrel domain (DUF5777). This entry contains integral membrane beta barrel proteins.	247
408860	pfam19090	DUF5778	Family of unknown function (DUF5778). Family of unknown function predominantly found in Halobacteria.	126
408861	pfam19091	DUF5779	Family of unknown function (DUF5779). Family of unknown function predominantly found in Halobacteria.	96
408862	pfam19092	DUF5780	Family of unknown function (DUF5780). This entry adopts a Greek-key beta sandwich topology, with long loops in some of the members. While a structure is known, the function of this domain is yet to be determined.	109
408863	pfam19093	DUF5781	Family of unknown function (DUF5781). Family of unknown function predominantly found in Halobacteria.	243
408864	pfam19094	DUF5782	Family of unknown function (DUF5782). Family of unknown function predominantly found in Halobacteria.	75
408865	pfam19095	DUF5783	Family of unknown function (DUF5783). Family of unknown function predominantly found in Halobacteria.	105
408866	pfam19096	DUF5784	Family of unknown function (DUF5784). Family of unknown function predominantly found in Halobacteria.	329
408867	pfam19097	Snu56_snRNP	Snu56-like U1 small nuclear ribonucleoprotein component. This family is a component of the U1 snRNP particle, which recognizes and binds the 5'-splice site of pre-mRNA. Together with other non-snRNP factors, U1 snRNP forms the spliceosomal commitment complex, which targets pre-mRNA to the splicing pathway.	434
408868	pfam19098	DUF5785	Family of unknown function (DUF5785). Family of unknown function predominantly found in Halobacteria.	98
408869	pfam19099	DUF5786	Family of unknown function (DUF5786). Family of unknown function predominantly found in Halobacteria.	55
408870	pfam19100	DUF5787	Family of unknown function (DUF5787). Family of unknown function predominantly found in Halobacteria.	264
408871	pfam19101	DUF5788	Family of unknown function (DUF5788). This is a family of proteins of unknown function predominantly found in Halobacteria.	132
408872	pfam19102	DUF5789	Family of unknown function (DUF5789). This is a family of proteins of unknown function predominantly found in Halobacteria.	74
408873	pfam19103	DUF5790	Family of unknown function (DUF5790). This is a family of proteins of unknown function found predominantly in Halobacteria.	126
408874	pfam19104	DUF5791	Family of unknown function (DUF5791). This is a family of proteins of unknown function predominantly found in Halobacteria.	124
408875	pfam19105	DUF5792	Family of unknown function (DUF5792). This family contains a domain of unknown function found in prasinoviruses.	152
408876	pfam19106	DUF5793	Family of unknown function (DUF5793). This is a family of proteins of unknown function predominantly found in Halobacteria.	157
408877	pfam19107	DUF5794	Family of unknown function (DUF5794). This is a family of proteins of unknown function predominantly found in Halobacteria.	128
408878	pfam19108	DUF5795	Family of unknown function (DUF5795). Family of unknown function predominantly found in Halobacteria.	74
408879	pfam19109	DUF5796	Family of unknown function (DUF5796). Family of proteins of unknown function predominantly found in Halobacteria.	139
408880	pfam19110	DUF5797	Family of unknown function (DUF5797). This is a family of proteins of unknown function predominantly found in Halobacteria.	163
408881	pfam19111	DUF5798	Family of unknown function (DUF5798). Family of unknown function predominantly found in Halobacteria.	89
408882	pfam19112	VanA_C	Vanillate O-demethylase oxygenase C-terminal domain. This domain is found in a wide variety of oxygenases such as Vanillate O-demethylase oxygenase and Toluene-4-sulfonate monooxygenase.	196
408883	pfam19113	DUF5799	Family of unknown function (DUF5799). This is a family of proteins of unknown function predominantly found in Halobacteria.	148
408884	pfam19114	EsV_1_7_cys	EsV-1-7 cysteine-rich motif. The EsV-1-7 repeat is a cysteine-rich motif of unknown function. The motif was originally identified in the Ectocarpus "immediate upright" protein, which has an EsV-1-7 domain that contains five EsV-1-7 repeats. The name is derived from the Ectocarpus virus EsV-1 protein EsV-1-7, which possesses six EsV-1-7 repeats. Ectocarpus has a large family of EsV-1-7 domain proteins with between one and 19 copies of the motif (C-X4-C-X16-C-X2-H-X12). In addition to brown algae, EsV-1-7 domain proteins have been found in eustigmatophytes, oomycetes, cryptophytes, two families of green algae (Coccomyxaceae and Selenastraceae) and also in viral genomes, such as Emiliania huxleyi virus PS401 and Pithovirus sibericum. Based on this unusual distribution, it has been proposed that EsV-1-7 domain genes have been exchanged between lineages by horizontal gene transfer during evolution [1,2].	35
408885	pfam19115	DUF5800	Family of unknown function (DUF5800). This is a family of proteins of unknown function predominantly found in Halobacteria.	64
408886	pfam19116	DUF5801	Domain of unknown function (DUF5801). This entry contains a presumed domain that is found as tandem repeats in a number of bacterial proteins.	150
408887	pfam19117	Mim2	Mitochondrial import 2. This entry, together with pfam08219, forms the Mim1/Mim2 complex, which is specific to fungi. This complex is responsible for the assembly and/or insertion of a subset of mitochondrial outer membrane proteins, including subunits of the main mitochondrial outer membrane translocase.	45
408888	pfam19118	DUF5802	Family of unknown function (DUF5802). Family of unknown function predominantly found in Halobacteria.	113
408889	pfam19119	DUF5803	Family of unknown function (DUF5803). Family of unknown function predominantly found in Halobacteria.	196
408890	pfam19120	DUF5804	Family of unknown function (DUF5804). Family of unknown function predominantly found in Halobacteria and Methanomicrobia.	108
408891	pfam19121	DUF5805	Family of unknown function (DUF5805). This is a family of proteins of unknown function predominantly found in Halobacteria.	67
408892	pfam19122	DUF5806	Family of unknown function (DUF5806). This is a family of proteins of unknown function predominantly found in Halobacteria and Methanomicrobia.	148
408893	pfam19123	DUF5807	Family of unknown function (DUF5807). This is a family of proteins of unknown function found in Halobacteria.	106
408894	pfam19124	DUF5808	Family of unknown function (DUF5808). This is a family of proteins of unknown function predominantly found in Firmicutes but also in Actinobacteria and Halobacteria. Members of this family are thought to be DUF1648 domain-containing proteins as they are membrane-components.	26
408895	pfam19125	DUF5809	Family of unknown function (DUF5809). This is a family of proteins of unknown function predominantly found in Halobacteria.	130
408896	pfam19126	DUF5810	Family of unknown function (DUF5810). This is a family of proteins of unknown function predominantly found in Halobacteria, but also in Eukaryotes. This family contains some members that are predicted as C2H2-type domain-containing proteins.	66
408897	pfam19127	Choline_bind_3	Choline-binding repeat. Pair of presumed choline-binding repeats often found adjacent to pfam01473.	48
408898	pfam19128	DUF5811	Family of unknown function (DUF5811). This is a family of proteins of unknown function predominantly found in Halobacteria.	103
408899	pfam19129	DUF5812	Family of unknown function (DUF5812). This is a family of unknown function predominantly found in Halobacteria.	140
408900	pfam19130	DUF5813	Family of unknown function (DUF5813). This is a family of unknown function found predominantly in Halobacteria.	143
408901	pfam19131	DUF5814	Family of unknown function (DUF5814). This is a family of proteins that are thought to have helicase activity, predominantly found in Halobacteria and Methanomicrobia.	148
408902	pfam19132	DUF5815	Family of unknown function (DUF5815). This is a family of unknown function predominantly found in Halobacteria.	155
408903	pfam19133	DUF5816	Family of unknown function (DUF5816). This is a family of proteins predominantly found in Halobacteria. They are thought to be GNAT (Gcn5-related N-acetyltransferases) family acetyltransferases.	72
408904	pfam19134	DUF5817	Family of unknown function (DUF5817). This is a family of proteins predominantly found in Halobacteria. They are thought to be the replication protein H.	51
408905	pfam19135	DUF5818	Protein of unknown function (DUF5818). This is a family of uncharacterized proteins.	57
408906	pfam19136	DUF5819	Family of unknown function (DUF5819). This is a family of uncharacterized proteins.	168
408907	pfam19137	DUF5820	Family of unknown function (DUF5820). This is a family of unknown function predominantly found in Halobacteria.	117
408908	pfam19138	DUF5821	Family of unknown function (DUF5821). This is a family of proteins of unknown function predominantly found in Halobacteria.	217
408909	pfam19139	DUF5822	Family of unknown function (DUF5822). This is a family of proteins of unknown function predominantly found in Halobacteria. This family includes some members which are thought to be peptidoglycan binding proteins.	38
408910	pfam19140	DUF5823	Family of unknown function (DUF5823). This is a family of uncharacterized proteins.	178
408911	pfam19141	DUF5824	Family of unknown function (DUF5824). This family contains a domain of unknown function, which is predominantly found in Phycodnaviridae and Caudovirales viruses.	127
408912	pfam19142	DUF5825	Family of unknown function (DUF5825). This is a family of uncharacterized proteins.	180
408913	pfam19143	Omp85_2	OMP85 superfamily. This entry represents the membrane spanning beta barrel domain of various Omp85 superfamily related proteins that often have POTRA and Patatin domains. This family contains mainly flavobacterial proteins.	351
408914	pfam19144	DUF5826	Family of unknown function (DUF5826). This family of unknown function is mostly found in prasinoviruses and is likely to represent membrane proteins with two transmembrane regions.	68
408915	pfam19145	DUF5827	Family of unknown function (DUF5827). This is a family of proteins of unknown function predominantly found in Halobacteria.	87
408916	pfam19146	DUF5828	Family of unknown function (DUF5828). This is a family of proteins of unknown function predominantly found in Halobacteria.	175
408917	pfam19147	DUF5829	Family of unknown function (DUF5829). This is a family of uncharacterized proteins.	266
408918	pfam19148	DUF5830	Family of unknown function (DUF5830). This is a family of proteins predominantly found in Halobacteria. Some members include the MarR family transcriptional regulator.	115
408919	pfam19149	DUF5831	Family of unknown function (DUF5831). This family of unknown function is found mostly in prasinoviruses.	72
408920	pfam19150	DUF5832	Family of unknown function (DUF5832). This entry contains proteins of unknown function found predominantly in the Phycodnaviridae, Mimiviridae, Marseilleviridae and Iridoviridae virus families.	81
408921	pfam19151	Sublancin	Sublancin. This family represents sublancin, a small bacteriocin active against Gram-positive bacteria. This family appears to be restricted to Bacilli. Sublancin was thought to be a lantibiotic but was later shown to be an S-linked glycopeptide. Glycosylation is essential for its antimicrobial activity. Sublancin is biosynthesized as a precursor peptide bearing an N-terminal leader peptide, and a C-terminal core peptide that is converted into the mature peptide. Sublancin comprises two alpha helices and a well-defined inter-helical loop. Sublancin inhibits B. cereus spore outgrowth, after the germination stage, approximately 1000-fold better than it inhibits exponential growth of the same cells and inhibits B. subtilis strain ATCC6633 and B. megaterium strain 14581.	57
408922	pfam19152	DUF5834	Family of unknown function (DUF5834). This family represents an uncharacterized protein that is associated with the thiocillin biosynthetic gene cluster.	159
408923	pfam19153	DUF5835	Family of unknown function (DUF5835). This family represents an uncharacterized protein that is associated with the biosynthetic gene cluster for the bacteriocin salivaricin CRL 1328. The salivaricin CRL 1328 biosynthetic gene cluster is similar to the previously described gene cluster for the bacteriocin ABP118.	53
408924	pfam19154	DUF5836	Family of unknown function (DUF5836). This family represents the induction peptide AbpIP, which regulates the biosynthesis of the bacteriocin ABP-118 in Lactobacillus salivarius subsp. salivarius UCC118.	38
408925	pfam19155	DUF5837	Family of unknown function (DUF5837). This family of unknown function is associated with the tenuecyclamide A biosynthetic gene cluster.	63
408926	pfam19156	DUF5838	Family of unknown function (DUF5838). This family of unknown function is associated with the biosynthetic gene cluster for anacyclamide A10. The family appears to be restricted to Cyanobacteria. Some matches also contain a methyltransferase domain at the N terminus.	285
408927	pfam19157	DUF5839	Family of unknown function (DUF5839). This family of unknown function is associated with the biosynthetic gene cluster for glycocin F. The family appears to be restricted to Firmicutes.	87
408928	pfam19158	DUF5840	Family of unknown function (DUF5840). This family contains uncharacterized proteins. It also contains the anacyclamide synthesis protein AcyE. Cyanobactins are small, cyclic peptides found in cyanobacteria. They are ribosomally synthesized and post-translationally modified. Cyanobactin biosynthesis clusters contain 7-12 genes. Anacyclamides are a type of cyanobactin produced in strains of the cyanobacterium Anabaena. AcyE is a 49-amino-acid protein with N-terminal homology to the peptide precursor proteins in the other cyanobactin pathways. The core peptide of AcyE is cleaved during post-translational processing of the precursor peptide.	49
408929	pfam19159	DUF5841	Family of unknown function (DUF5841). This family of unknown function is associated with the biosynthetic gene cluster for enterocin A.	48
408930	pfam19160	SPARK	SPARK. This entry is typically found as an extracellular domain of plant receptor-like kinases, many of which play a role in signalling during plant-fungal symbiosis. The precise function of this entry is unknown.	161
408931	pfam19161	DUF5843	Family of unknown function (DUF5843). This is a family of uncharacterized proteins of unknown function predominantly found in viruses.	196
408932	pfam19162	DUF5844	Family of unknown function (DUF5844). This is a family of uncharacterized proteins of unknown function found in Iridoviridae.	109
408933	pfam19163	DUF5845	Family of unknown function (DUF5845). This is a family of uncharacterized proteins of unknown function found in viruses.	80
408934	pfam19164	DUF5846	Family of unknown function (DUF5846). This is a family of uncharacterized proteins of unknown function predominantly found in Mimiviridae.	116
408935	pfam19165	DUF5847	Family of unknown function (DUF5847). This is a family of uncharacterized proteins of unknown function predominantly found in Mimiviridae.	407
408936	pfam19166	DUF5848	Family of unknown function (DUF5848). This is a family of uncharacterized proteins of unknown function predominantly found in viruses. This family is also found in Bacteria and Fungi.	64
408937	pfam19167	DUF5849	Family of unknown function (DUF5849). This is a family of uncharacterized proteins of unknown function predominantly found in viruses.	200
408938	pfam19168	DUF5850	Family of unknown function (DUF5850). This is a family of uncharacterized proteins of unknown function predominantly found in Iridoviridae. The family contains a conserved motif towards the C-terminus of these proteins. This motif contains a central RGD sequence which suggests these viral proteins may bind to integrins.	133
408939	pfam19169	DUF5851	Family of unknown function (DUF5851). This is a family of uncharacterized proteins of unknown function predominantly found in Mimiviridae.	353
408940	pfam19170	DUF5852	Family of unknown function (DUF5852). This is a family of uncharacterized proteins of unknown function predominantly found in Iridoviridae.	158
408941	pfam19171	DUF5853	Family of unknown function (DUF5853). This is a family of uncharacterized proteins of unknown function predominantly found in Phycodnaviridae.	136
408942	pfam19172	DUF5854	Family of unknown function (DUF5854). This is a family of uncharacterized proteins of unknown function predominantly found in Mimiviridae.	161
408943	pfam19173	DUF5855	Family of unknown function (DUF5855). This is a family of uncharacterized proteins of unknown function predominantly found in Phycodnaviridae.	183
408944	pfam19174	DUF5856	Family of unknown function (DUF5856). This is a family of uncharacterized proteins of unknown function predominantly found in viruses. This family was also found in Bacteria.	94
408945	pfam19175	DUF5857	Family of unknown function (DUF5857). This is a family of uncharacterized proteins of unknown function predominantly found in viruses.	287
408946	pfam19176	DUF5858	Family of unknown function (DUF5858). This is a family of uncharacterized proteins of unknown function predominantly found in Marseillevirus.	61
408947	pfam19177	DUF5859	Family of unknown function (DUF5859). This is a family of uncharacterized proteins of unknown function predominantly found in viruses.	162
408948	pfam19178	DUF5860	Family of unknown function (DUF5860). This is a family of uncharacterized proteins of unknown function predominantly found in viruses.	167
408949	pfam19179	DUF5861	Family of unknown function (DUF5861). This is a family of uncharacterized proteins of unknown function found in viruses. This family also includes proteins found in eukaryotes which are thought to be E3 ubiquitin-protein ligase.	116
408950	pfam19180	DUF5862	Family of unknown function (DUF5862). This is a family of uncharacterized proteins of unknown function predominantly found in Ascoviridae. This family also includes uncharacterized proteins found in Bacteria.	68
408951	pfam19181	DUF5863	Family of unknown function (DUF5863). This is a family of uncharacterized proteins of unknown function predominantly found in viruses.	169
408952	pfam19182	DUF5864	Family of unknown function (DUF5864). This is a family of uncharacterized proteins of unknown function predominantly found in Mimiviridae.	136
408953	pfam19183	DUF5865	Family of unknown function (DUF5865). This is a family of uncharacterized proteins of unknown function predominantly found in viruses.	210
408954	pfam19184	DUF5866	Family of unknown function (DUF5866). This is a family of uncharacterized proteins of unknown function predominantly found in Mimiviridae.	74
408955	pfam19185	DUF5867	Family of unknown function (DUF5867). This is a family of uncharacterized proteins of unknown function predominantly found in viruses.	274
408956	pfam19186	DUF5868	Family of unknown function (DUF5868). This is a family of uncharacterized proteins of unknown function predominantly found in Mimiviridae.	193
408957	pfam19187	HTH_PafC	PafC helix-turn-helix domain. This entry is an N-terminal HTH domain found in the PafC protein. Transcriptional activator PafBC is responsible for upregulating the majority of genes induced by DNA damage.	115
408958	pfam19188	AGRB_N	Adhesion GPCR B N-terminal region. This region is found at the N-terminus of various adhesion G-protein coupled receptor B proteins. This region contains 10 cysteine residues that probably form disulphide bonds.	177
408959	pfam19189	Mtf2	Mtf2 family. This family appears to be distantly related to PPR repeats.	196
408960	pfam19190	BACON_2	Viral BACON domain. This family represents a distinct class of BACON domains found in crAss-like phages, the most common viral family in the human gut, in which they are found in tail fiber genes. This suggests they may play a role in phage-host interactions.	91
408961	pfam19191	HEF_HK	HEF_HK domain. This is a dimerization and histidine phosphotransfer (DHp) domain found in Histidine Kinases (HK). This domain belongs to the His_Kinase_A (CL0025) clan. HK domain architectures typically contain DHp domains adjacent to GHKL domains, such as HATPase_c (pfam02518) and HATPase_c_3 (pfam13589) which comprise the ATP-binding regions.	67
408962	pfam19192	Response_reg_2	Response receiver domain. This is a receiver domain (REC) commonly found in the same gene-neighbourhood of its cognate HK. Together they comprise a Two-Component System (TCS). There is a high degree of specificity among REC and DHp domains in a cognate pair. This domain is related to the Response_reg (pfam00072) domain. We found that pfam19191 and this domain show a high degree of linkage in the same gene-neighbourhoods across several bacterial lineages. This implies that they comprise a TCS.	176
408963	pfam19193	Tectonin	Tectonin domain. This entry represents proteins homologous to tectonin. This protein adopts a 6 bladed beta propeller structure.	214
408964	pfam19194	DUF5869	Family of unknown function (DUF5869). This is a family of uncharacterized proteins of unknown function found in Marseillevirus.	159
408965	pfam19195	DUF5870	Family of unknown function (DUF5870). This is a family of uncharacterized proteins of unknown function predominantly found in viruses.	488
408966	pfam19196	DUF5871	Family of unknown function (DUF5871). This is a family of uncharacterized proteins of unknown function predominantly found in Phycodnaviridae.	136
408967	pfam19197	DUF5872	Family of unknown function (DUF5872). This is a family of uncharacterized proteins of unknown function predominantly found in viruses and Bacteria.	115
408968	pfam19198	RsaA_NTD	RsaA N-terminal domain. This entry represents the N-terminal domain of the RsaA S-layer protein from Caulobacter crescentus. This domain binds to lipopolysaccharide.	174
408969	pfam19199	Phage_coatGP8	Phage major coat protein, Gp8. 	68
408970	pfam19200	DUF871_N	DUF871 N-terminal domain. This family consists of several conserved hypothetical proteins from bacteria and archaea. The function of this family is unknown.	235
408971	pfam19201	DUF5873	Family of unknown function (DUF5873). This is a family of uncharacterized proteins of unknown function predominantly found in Phycodnaviridae.	109
408972	pfam19202	DUF5874	Family of unknown function (DUF5874). This is a family of uncharacterized proteins of unknown function predominantly found in Phycodnaviridae. This entry includes proteins that are also found in Fungi.	74
408973	pfam19203	DUF5875	Family of unknown function (DUF5875). This is a family of uncharacterized proteins of unknown function predominantly found in Iridoviridae.	228
408974	pfam19204	DUF5876	Family of unknown function (DUF5876). This is a family of uncharacterized proteins of unknown function predominantly found in Iridoviridae.	561
408975	pfam19205	DUF5877	Family of unknown function (DUF5877). This is a family of uncharacterized proteins of unknown function predominantly found in Iridoviridae.	609
408976	pfam19206	DUF5878	Family of unknown function (DUF5878). This is a family of uncharacterized proteins of unknown function found in viruses.	148
408977	pfam19207	DUF5879	Family of unknown function (DUF5879). This is a family of uncharacterized proteins of unknown function predominantly found in viruses.	273
408978	pfam19208	DUF5880	Family of unknown function (DUF5880). This is a family of uncharacterized proteins of unknown function predominantly found in Phycodnaviridae. This family also includes uncharacterized proteins found in Archaea and Eukaryotes.	97
408979	pfam19209	CoV_S1_C	Coronavirus spike glycoprotein S1, C-terminal. This entry represents a domain found at the C-terminus of the Coronavirus S1 protein. It is found across a range of alpha, beta and gamma coronaviruses. This small all beta stranded domain is known as subdomain 2 in the structure of the porcine epidemic diarrhea virus spike protein.	58
408980	pfam19211	CoV_NSP2_N	Coronavirus replicase NSP2, N-terminal. This entry corresponds to the N-terminal region of coronavirus non-structural protein 2. NSP2 is encoded by ORF1a/1ab and proteolytically released from the pp1a/1ab polyprotein. The function of this protein is uncertain. This region contains numerous conserved and semi-conserved cysteine residues.	204
408981	pfam19212	CoV_NSP2_C	Coronavirus replicase NSP2, C-terminal. This entry corresponds to a presumed domain found at the C-terminus of Coronavirus non-structural protein 2 (NSP2). NSP2 is encoded by ORF1a/1ab and proteolytically released from the pp1a/1ab polyprotein. The function of NSP2 is uncertain. This presumed domain is found in two copies in some viral NSP2 proteins. This domain is found in both alpha and betacoronaviruses.	156
408982	pfam19213	CoV_NSP6	Coronavirus replicase NSP6. This entry represents proteins found in Coronaviruses and includes the Non-structural Protein 6 (NSP6). Coronaviruses encode large replicase polyproteins which are proteolytically processed by viral proteases to generate mature Nonstructural Proteins (NSPs). NSP6 is a membrane protein containing 6 transmembrane domains with a large C-terminal tail. NSP6 from the avian coronavirus, infectious bronchitis virus (IBV) and the mouse hepatitis virus (MHV) have been shown to localize to the ER and to generate autophagosomes. Coronavirus NSP6 proteins have also been shown to limit autophagosome expansion. This may favour coronavirus infection by reducing the ability of autophagosomes to deliver viral components to lysosomes for degradation. NSP6 from IBV, MHV and severe acute respiratory syndrome coronavirus (SARS-CoV) have also been found to activate autophagy.	261
408983	pfam19214	CoV_S2_C	Coronavirus spike glycoprotein S2, intravirion. This entry represents the cysteine rich intravirion region found at the C-terminus of coronavirus spike proteins (S). These cysteine residues are targets for palmitoylation, necessary for efficient S incorporation into virions and S-mediated membrane fusion.	42
408984	pfam19215	CoV_NSP15_C	Coronavirus replicase NSP15, uridylate-specific endoribonuclease. This entry represents the C-terminal domain of coronavirus non-structural protein 15 (NSP15 or nsp15). NSP15 is encoded by ORF1a/1ab and proteolytically released from the pp1a/1ab polyprotein. This domain exhibits endoribonuclease activity designated EndoU, highly conserved in all known CoVs and is part of the replicase-transcriptase complex that plays important roles in virus replication and transcription. NSP15 is a Uridylate-specific endoribonuclease that cleaves the 5'-polyuridines from negative-sense viral RNA, termed PUN RNA either upstream or downstream of uridylates, at GUU or GU to produce molecules with 2',3'-cyclic phosphate ends. PUN RNA is a CoV MDA5-dependent pathogen-associated molecular pattern (PAMP).	154
408985	pfam19216	CoV_NSP15_M	Coronavirus replicase NSP15, middle domain. This entry represents the non-catalytic middle domain from coronavirus non-structural protein 15 (NSP15). NSP15 is encoded by ORF1a/1ab and proteolytically released from the pp1a/1ab polyprotein. This domain is formed by ten beta strands organized into three beta hairpins.	90
408986	pfam19217	CoV_NSP4_N	Coronavirus replicase NSP4, N-terminal. This is the N-terminal domain of the coronavirus nonstructural protein 4 (NSP4). NSP4 is encoded by ORF1a/1ab and proteolytically released from the pp1a/1ab polyprotein. NSP4 is a membrane-spanning protein which is thought to anchor the viral replication-transcription complex to modified endoplasmic reticulum membranes. This N-terminal region represents the membrane spanning region, covering four transmembrane regions.	351
408987	pfam19218	CoV_NSP3_C	Coronavirus replicase NSP3, C-terminal. This family represents the C-terminal region of non-structural protein NSP3 (also known as nsp3). NSP3 is the product of ORF1a. It is found in human SARS coronavirus polyprotein 1a and 1ab, and in related coronavirus polyproteins. It is a multifunctional protein comprising up to 16 different domains and regions. NSP3 binds to viral RNA, nucleocapsid protein, as well as other viral proteins and participates in polyprotein processing.	464
408988	pfam19219	CoV_NSP15_N	Coronavirus replicase NSP15, N-terminal oligomerisation. This is the N-terminal domain of the coronavirus nonstructural protein 15 (NSP15), which is encoded by ORF1a/1ab and proteolytically released from the pp1a/1ab polyprotein. NSP15 is a nidoviral RNA uridylate-specific endoribonuclease (NendoU) carrying a C-terminal catalytic domain belonging to the EndoU family. The SARS-CoV-2 NendoU monomers assemble into a double-ring hexamer, generated by a dimer of trimers. The hexamer is stabilized by the interactions of the N-terminal oligomerization domain.	61
408989	pfam19220	Crescentin	Crescentin protein. This entry represents a bacterial equivalent to Intermediate Filament proteins, named crescentin, whose cytoskeletal function is required for the vibrioid and helical shapes of Caulobacter crescentus. Without crescentin, the cells adopt a straight-rod morphology. Crescentin has characteristic features of IF proteins including the ability to assemble into filaments in vitro without energy or cofactor requirements. In vivo, crescentin forms a helical structure that colocalizes with the inner cell curvatures beneath the cytoplasmic membrane.	401
408990	pfam19221	MELT	MELT motif. The outer kinetochore protein scaffold KNL1 is essential for error-free chromosome segregation during mitosis and meiosis. A critical feature of KNL1 is an array of repeats containing MELT-like motifs. When phosphorylated, these motifs form docking sites for the BUB1-BUB3 dimer that regulates chromosome biorientation and the spindle assembly checkpoint. This entry mainly represents vertebrate proteins although MELT motifs are found much more widely.	26
408991	pfam19222	Noda_Vmethyltr	Nodavirus Vmethyltransferase. This entry represents a family of nodavirus proteins that is homologous to pfam01660. These proteins are likely methyltransferases involved in mRNA capping.	148
408992	pfam19223	Chropara_Vmeth	Chroparavirus methyltransferase. This entry represents a family of chroparavirus proteins that is homologous to pfam01660. These proteins are likely methyltransferases involved in mRNA capping.	319
408993	pfam19224	pATOM36	pATOM36 family. This entry represents the trypanosome Peripherally associated ATOM36 protein which has been shown to complement a deletion of the Mim1/Mim2 complex in yeast. The integral MOM protein, peripheral archaic translocase of the outer membrane 36 (pATOM36), in analogy to the MIM complex, is involved in the assembly and/or membrane insertion of a small subset of MOM proteins including subunits of the main trypanosomal outer membrane protein translocase (ATOM complex).	270
408994	pfam19225	Spo16	Spo16 protein. This entry represents proteins related to yeast Spo16. Spo16 forms a complex with Zip2 and Zip4. Zip2 and Spo16, form a meiosis-specific XPF-ERCC1-like complex. The recombinant Zip2 XPF domain together with Spo16 preferentially binds branched DNA structures, such as D loops and HJs. The Spo16 protein contains a C-terminal HHH motif.	163
408995	pfam19226	DisA	DisA glycoprotein. This entry corresponds to a putative viral glycoprotein.	363
408996	pfam19227	Salyut	Salyut domain. This entry represents the Salyut domain found in the replicase of all viruses of the viral family Tymoviridae, composed of viruses that infect plants and insects. It is located within a long, hypervariable, Proline-rich hinge region located between the Alphavirus-like methyltransferase (pfam01660) and the pfam01443. It is located 150-20aa downstream of the Iceberg region that forms the C-terminus of Vmethyltransf. The function of this family is unknown.	51
275365	sd00001	TSP3	Calcium-binding Thrombospondin type 3 (TSP3) repeat. TSP3 repeats of the vertebrate thrombospondin (TSP)-1,-2,-3,-4 and TSP-5/also known as COMP (cartilage oligomeric matrix protein), and related proteins. These short aspartate-rich repeats are a continuous series of calcium binding sites that can be divided into two sequence motifs: N-type and C-type. N-type and C-type motifs are distinguished by their sequence length, calcium ion binding, and their interactions with water molecules. C-type motifs are higher affinity binding sites compared to N-type motifs.	59
275366	sd00002	TSP3	Calcium-binding Thrombospondin type 3 (TSP3) repeat. TSP3 repeats of the vertebrate thrombospondin (TSP)-1,-2,-3,-4 and TSP-5/also known as COMP (cartilage oligomeric matrix protein), and related proteins. These short aspartate-rich repeats are a continuous series of calcium binding sites that can be divided into two sequence motifs: N-type and C-type. N-type and C-type motifs are distinguished by their sequence length, calcium ion binding, and their interactions with water molecules. C-type motifs are higher affinity binding sites compared to N-type motifs.	59
275367	sd00003	TSP3_1C	Calcium-binding Thrombospondin type 3 (TSP3) repeat; C-type motif 1C. TSP3 repeats of the vertebrate thrombospondin (TSP)-1,-2,-3,-4 and TSP-5/also known as COMP (cartilage oligomeric matrix protein), and related proteins. These short aspartate-rich repeats are a continuous series of calcium binding sites that can be divided into two sequence motifs: N-type and C-type. N-type and C-type motifs are distinguished by their sequence length, calcium ion binding, and their interactions with water molecules. C-type motifs are higher affinity binding sites compared to N-type motifs. The first TSP3 repeat 1C deviates from the canonical C-type calcium binding repeat in containing an insert relative to the other C-type repeats, however, the residues of the interrupted halves are positioned identically to C-type repeats without the insert.	35
276811	sd00004	PPR	Pentatricopeptide repeat, an RNA-binding module. The Pentatricopeptide repeat (PPR) is a 35-residue repeat motif that forms two anti-parallel alpha helices and binds single-stranded RNA in a sequence-specific and modular manner. It is present in a large family of RNA-binding proteins that are found in protists, fungi, and metazoans, but are most abundant in the mitochondria and chloroplasts of terrestrial plants. PPR proteins function in many aspects of RNA metabolism, including splicing, editing, degradation, and translation. They contain between 2 and 30 PPR repeats, organized into a hairpin of alpha helices. Proteins containing only arrays of PPR repeats that are 35 amino acids in length are called P class proteins. The second type of PPR proteins, called PLS class, contain additional C-terminal endonuclease or RNA editing domains and a distinct PPR architecture of triplet repeats alternating between a typical PPR, a longer PPR and a short PPR of 31 residues.	100
276810	sd00005	TPR	Tetratricopeptide repeat. The Tetratricopeptide repeat (TPR) typically contains 34 amino acids and is found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans. It is present in a variety of proteins including those involved in chaperone, cell-cycle, transcription, and protein transport complexes. The number of TPR motifs varies among proteins. Those containing 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein. It has been proposed that TPR proteins preferentially interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes.	60
276809	sd00006	TPR	Tetratricopeptide repeat. The Tetratricopeptide repeat (TPR) typically contains 34 amino acids and is found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans. It is present in a variety of proteins including those involved in chaperone, cell-cycle, transcription, and protein transport complexes. The number of TPR motifs varies among proteins. Those containing 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein. It has been proposed that TPR proteins preferentially interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes.	97
276808	sd00008	TPR_YbbN	C-terminal Tetratricopeptide repeat (TPR) region of YbbN and similar motifs. The Tetratricopeptide repeat (TPR) typically contains 34 amino acids and is found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans. It is present in a variety of proteins including those involved in chaperone, cell-cycle, transcription, and protein transport complexes. YbbN is a thioredoxin-like protein containing two tandem TPR repeats at the C-terminus, separated by two alpha helices. Its N-terminal thioredoxin-like domain is not a functional oxidoreductase. It functions in heat stress response and DNA synthesis as a chaperone or co-chaperone.	171
276807	sd00010	SLR	Sel1-like repeat. Sel1-like repeats (SLRs) share similar alpha-helical conformations with Tetratricopeptide repeats (TPRs), but with different consensus sequence lengths and superhelical topologies. SLRs contain 36 to 44 amino acids and are present in bacteria and eukaryotes but not in archaea. SLR proteins are involved in a variety of functions, and many serve as adaptor proteins for the assembly of macromolecular complexes. The SLR family was named after the Caenorhabditis elegans Sel1 protein which is predicted to fold into 11 SLRs, a transmembrane domain, and an N-terminal signal sequence. The human Sel1L protein contains an additional fibronectin type-II domain and an N-terminal PEST sequence. Its downregulation is associated with the development of breast and pancreatic carcinomas.	133
276806	sd00016	Apc5	Tetratricopeptide repeat (TPR)-like motif of Apc5 and similar motifs. Apc5 is a subunit of the anaphase-promoting complex/cyclosome (APC/C) which is a multi-subunit ubiquitin ligase that mediates the proteolysis of cell cycle proteins in mitosis and G1. Apc5 binds the poly(A) binding protein (PABP), which directly binds the internal ribosome entry site (IRES) of growth factor 2 mRNA. PABP was found to enhance IRES-mediated translation, whereas Apc5 over-expression counteracted this effect. In addition to its association with the APC/C complex, Apc5 binds much heavier complexes and co-sediments with the ribosomal fraction. The N-terminus of Afi1 serves to stabilize the union between Apc4 and Apc5, both of which lie towards the bottom-front of the APC. This model represents the Tetratricopeptide repeat (TPR)-like motif region of Apc5.	98
275368	sd00017	ZF_C2H2	Zinc finger, C2H2 type. The C2H2 zinc finger is a classical zinc finger domain. C2H2-type zinc fingers are ubiquitous; more than 1% of all mammalian proteins are predicted to contain at least one zinc finger. They often function as DNA or protein binding structural motifs, such as in eukaryotic transcription factors, and therefore they play important roles in cellular processes such as development, differentiation, and oncosuppression. C2H2 zinc finger proteins contain from 1 to more than 30 zinc finger repeats.	78
275369	sd00018	ZF_C2H2	Zinc finger, C2H2 type. The C2H2 zinc finger is a classical zinc finger domain. C2H2-type zinc fingers are ubiquitous; more than 1% of all mammalian proteins are predicted to contain at least one zinc finger. They often function as DNA or protein binding structural motifs, such as in eukaryotic transcription factors, and therefore they play important roles in cellular processes such as development, differentiation, and oncosuppression. C2H2 zinc finger proteins contain from 1 to more than 30 zinc finger repeats.	24
275370	sd00019	ZF_C2H2	Zinc finger, C2H2 type. The C2H2 zinc finger is a classical zinc finger domain. C2H2-type zinc fingers are ubiquitous; more than 1% of all mammalian proteins are predicted to contain at least one zinc finger. They often function as DNA or protein binding structural motifs, such as in eukaryotic transcription factors, and therefore they play important roles in cellular processes such as development, differentiation, and oncosuppression. C2H2 zinc finger proteins contain from 1 to more than 30 zinc finger repeats.	49
275371	sd00020	ZF_C2H2	Zinc finger, C2H2 type. The C2H2 zinc finger is a classical zinc finger domain. C2H2-type zinc fingers are ubiquitous; more than 1% of all mammalian proteins are predicted to contain at least one zinc finger. They often function as DNA or protein binding structural motifs, such as in eukaryotic transcription factors, and therefore they play important roles in cellular processes such as development, differentiation, and oncosuppression. C2H2 zinc finger proteins contain from 1 to more than 30 zinc finger repeats.	46
275375	sd00025	zf-RanBP2	RanBP2-type zinc finger. The zf-RanBP2 domain represents a new superfamily of C2C2-type zinc finger motifs, which is characterized by the conserved sequence pattern W-X-C-X(2,4)-C-X(3)-N-X(6)-C-X(2)-C. They fold into a structure composed of two orthogonal beta-hairpin strands that sandwich a single Zn2+ ion coordinated with four cysteine residues. zf-RanBP2 domains are mainly found in eukaryotic proteins and some exist in bacteria and archaea. According to different binding partners, the superfamily can be classified into several families. For instance, the E3 SUMO-protein ligase RanBP2-like family binds Ran, the nuclear protein localization protein 4 homolog (NPL4)-like family binds ubiquitin, and the zinc finger Ran-binding domain-containing protein 2 (ZRANB2)-like family binds single-stranded RNA (ssRNA). Most superfamily members contain one copy of zf-RanBP2, but some contain several zf-RanBP2 domains.	293
275376	sd00029	zf-RanBP2	RanBP2-type zinc finger. The zf-RanBP2 domain represents a new superfamily of C2C2-type zinc finger motifs, which is characterized by the conserved sequence pattern W-X-C-X(2,4)-C-X(3)-N-X(6)-C-X(2)-C. They fold into a structure composed of two orthogonal beta-hairpin strands that sandwich a single Zn2+ ion coordinated with four cysteine residues. zf-RanBP2 domains are mainly found in eukaryotic proteins and some exist in bacteria and archaea. According to different binding partners, the superfamily can be classified into several families. For instance, the E3 SUMO-protein ligase RanBP2-like family binds Ran, the nuclear protein localization protein 4 homolog (NPL4)-like family binds ubiquitin, and the zinc finger Ran-binding domain-containing protein 2 (ZRANB2)-like family binds single-stranded RNA (ssRNA). Most superfamily members contain one copy of zf-RanBP2, but some contain several zf-RanBP2 domains.	74
275377	sd00030	zf-RanBP2	RanBP2-type zinc finger. The zf-RanBP2 domain represents a new superfamily of C2C2-type zinc finger motifs, which is characterized by the conserved sequence pattern W-X-C-X(2,4)-C-X(3)-N-X(6)-C-X(2)-C. They fold into a structure composed of two orthogonal beta-hairpin strands that sandwich a single Zn2+ ion coordinated with four cysteine residues. zf-RanBP2 domains are mainly found in eukaryotic proteins and some exist in bacteria and archaea. According to different binding partners, the superfamily can be classified into several families. For instance, the E3 SUMO-protein ligase RanBP2-like family binds Ran, the nuclear protein localization protein 4 homolog (NPL4)-like family binds ubiquitin, and the zinc finger Ran-binding domain-containing protein 2 (ZRANB2)-like family binds single-stranded RNA (ssRNA). Most superfamily members contain one copy of zf-RanBP2, but some contain several zf-RanBP2 domains.	60
275378	sd00031	LRR_1	leucine-rich repeats. A leucine-rich repeat (LRR) is a structural protein motif of 20-30 amino acids that is unusually rich in the hydrophobic amino acid leucine. The conserved eleven-residue sequence motif (LxxLxLxxN/CxL) within the LRRs corresponds to the beta-strand and adjacent loop regions, whereas the remaining parts of the repeats are variable. LRRs fold together to form a solenoid protein domain, termed leucine-rich repeat domain. Leucine-rich repeats are usually involved in protein-protein interactions.	110
275379	sd00032	LRR_2	leucine rich repeats. A leucine-rich repeat (LRR) is a structural protein motif of 20-30 amino acids that is unusually rich in the hydrophobic amino acid leucine. The conserved eleven-residue sequence motif (LxxLxLxxN/CxL) within the LRRs corresponds to the beta-strand and adjacent loop regions, whereas the remaining parts of the repeats are variable. LRRs fold together to form a solenoid protein domain, termed leucine-rich repeat domain. Leucine-rich repeats are usually involved in protein-protein interactions.	205
275380	sd00033	LRR_RI	leucine-rich repeats, ribonuclease inhibitor (RI)-like subfamily. A leucine-rich repeat (LRR) is a structural protein motif of 20-30 amino acids that is unusually rich in the hydrophobic amino acid leucine. The conserved eleven-residue sequence motif (LxxLxLxxN/CxL) within the LRRs corresponds to the beta-strand and adjacent loop regions, whereas the remaining parts of the repeats are variable. LRRs fold together to form a solenoid protein domain, termed leucine-rich repeat domain. Leucine-rich repeats are usually involved in protein-protein interactions.	238
275381	sd00034	LRR_AMN1	leucine-rich repeats, antagonist of mitotic exit network protein 1-like subfamily. A leucine-rich repeat (LRR) is a structural protein motif of 20-30 amino acids that is unusually rich in the hydrophobic amino acid leucine. The conserved eleven-residue sequence motif (LxxLxLxxN/CxL) within the LRRs corresponds to the beta-strand and adjacent loop regions, whereas the remaining parts of the repeats are variable. LRRs fold together to form a solenoid protein domain, termed leucine-rich repeat domain. Leucine-rich repeats are usually involved in protein-protein interactions.	212
275382	sd00035	LRR_NTF	leucine-rich repeats, nuclear transport factor-like subfamily. A leucine-rich repeat (LRR) is a structural protein motif of 20-30 amino acids that is unusually rich in the hydrophobic amino acid leucine. The conserved eleven-residue sequence motif (LxxLxLxxN/CxL) within the LRRs corresponds to the beta-strand and adjacent loop regions, whereas the remaining parts of the repeats are variable. LRRs fold together to form a solenoid protein domain, termed leucine-rich repeat domain. Leucine-rich repeats are usually involved in protein-protein interactions.	144
275383	sd00036	LRR_3	leucine-rich repeats. A leucine-rich repeat (LRR) is a structural protein motif of 20-30 amino acids that is unusually rich in the hydrophobic amino acid leucine. The conserved eleven-residue sequence motif (LxxLxLxxN/CxL) within the LRRs corresponds to the beta-strand and adjacent loop regions, whereas the remaining parts of the repeats are variable. LRRs fold together to form a solenoid protein domain, termed leucine-rich repeat domain. Leucine-rich repeats are usually involved in protein-protein interactions.	142
275384	sd00037	PASTA	PASTA domain. PASTA domain is found at the C-termini of several penicillin-binding proteins (PBPs) and bacterial serine/threonine kinases. It is a small globular domain consisting of 3 beta-sheets and an alpha-helix. The name PASTA is derived from PBP and Serine/Threonine kinase Associated domain.	126
276965	sd00038	Kelch	Kelch repeat. Kelch repeats are 44 to 56 amino acids in length and form a four-stranded beta-sheet corresponding to a single blade of five to seven bladed beta propellers. The Kelch superfamily is a large evolutionary conserved protein family whose members are present throughout the cell and extracellularly, and have diverse activities. Kelch repeats are often in combination with other domains, like BTB and BACK or F-box domains.	140
293791	sd00039	7WD40	WD40 repeats in seven bladed beta propellers. The WD40 repeat is found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing, and cytoskeleton assembly. It typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40. Between the GH and WD dipeptides lies a conserved core. It forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel beta-sheet. The WD40 sequence repeat originally described in literature forms the first three strands of one blade and the last strand in the next blade. The C-terminal WD40 repeat completes the blade structure of the N-terminal WD40 repeat to create the closed ring propeller-structure. The residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands, allowing them to bind either stably or reversibly.	293
293790	sd00041	GyrA-ParC_C	beta-pinwheel repeat found at the C-terminus of GyrA, ParC, and similar proteins. Beta-pinwheel repeats are found at the C-terminus of both DNA gyrase subunit A and ParC, a subunit of topoisomerase IV (topo IV). DNA gyrase, a type IIA topoisomerase is a GyrA2GyrB2 heterotetramer which introduces negative supercoiling into the circular bacterial chromosome. Topo IV, a type IIB topoisomerase, is a ParC2ParE2 tetramer, which primarily relaxes positive supercoils and mediates topological unlinking of entangled DNA segments such as catenanes. The GyrA C-terminal repeat region, referred to as the C-terminal domain or CTD, binds DNA nonspecifically; it is thought to constrain a positive supercoil by wrapping a DNA duplex around its surface, upon strand passage, this wrap is converted into two negative supercoils. All known gyrase CTDs have 6 bladed beta-pinwheels, the topo IV CTD in various organisms is more variable and includes both 3-bladed and 8-bladed pinwheels.	253
293789	sd00042	LVIVD	LVIVD repeat. LVIVD repeats are mainly found in bacterial and archaeal cell surface proteins, many of them hypothetical. Structurally, LVIVD repeats have been predicted to form a beta-propeller, with each repeat forming one four-stranded anti-parallel beta-sheet blade.	120
293788	sd00043	ARM	armadillo repeat. Armadillo (ARM)/beta-catenin-like repeats are approximately 40 amino acid long, tandemly repeated sequence motif, first identified in the Drosophila segment polarity gene armadillo. These repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified. ARM repeats are related to HEAT repeats.	117
293787	sd00044	HEAT	HEAT repeats. The canonical HEAT repeat consists of two helices forming a helical hairpin. HEAT repeats are found in a diverse family of proteins, including the four proteins from which its name is based: Huntingtin, Elongation factor 3, the PR65/A subunit of protein phosphatase 2A (PP2A), and the lipid kinase TOR (target of rapamycin). The HEAT repeat family is related to armadillo (ARM)/beta-catenin-like repeats.	181
293786	sd00045	ANK	ankyrin repeats. Ankyrin repeats are one of the most abundant repeat motifs, and generally function as scaffolds for protein-protein interactions in processes including cell cycle, transcriptional regulation, signal transduction, vesicular trafficking, and inflammatory response. Although predominantly found in eukaryotic proteins, they are also found in some bacterial and viral proteins.  Less is known of their physiological roles in prokaryotes. Some bacterial ANK proteins play key roles in microbial pathogenesis by mimicking or manipulating host function(s). The pathogen Providencia alcalifaciens N-formyltransferase ankyrin repeats function in small molecule binding and allosteric control. Ankyrin-repeat proteins have been associated with a number of human diseases.	98
293785	sd00046	FHA_bHelix	beta-helical repeat found in filamentous hemagglutinin and related adhesins and CdiA family proteins. This model contains ten copies of an approximately 20-residue repeat found in two-partner secretion (TPS) proteins, including the filamentous hemagglutinin (FHA) family of adhesins and CdiA family proteins. These repeats form a right-handed beta-helical structure, and are found in large secreted proteins from a number of plant and animal pathogens. FHA family adhesins bind to various types of cells and may contribute to attachment, aggregation, and pathogenesis. CdiA proteins are involved in contact-dependent growth inhibition (CDI).	209
128322	smart00002	PLP	Myelin proteolipid protein (PLP or lipophilin). 	60
128323	smart00003	NH	Neurohypophysial hormones. Vasopressin/oxytocin gene family.	78
197463	smart00004	NL	Domain found in Notch and Lin-12. The Notch protein is essential for the proper differentiation of the Drosophila ectoderm. This protein contains 3 NL domains.	38
214467	smart00005	DEATH	DEATH domain, found in proteins involved in cell death (apoptosis). Alpha-helical domain present in a variety of proteins with apoptotic functions. Some (but not all) of these domains form homotypic and heterotypic dimers.	88
128326	smart00006	A4_EXTRA	amyloid A4. amyloid A4 precursor of Alzheimer's disease	165
214468	smart00008	HormR	Domain present in hormone receptors. 	70
197466	smart00010	small_GTPase	Small GTPase of the Ras superfamily; ill-defined subfamily. SMART predicts Ras-like small GTPases of the ARF, RAB, RAN, RAS, and SAR subfamilies. Others that could not be classified in this way are predicted to be members of the small GTPase superfamily without predictions of the subfamily.	166
214469	smart00012	PTPc_DSPc	Protein tyrosine phosphatase, catalytic domain, undefined specificity. Protein tyrosine phosphatases. Homologues detected by this profile and not by those of "PTPc" or "DSPc" are predicted to be protein phosphatases with a similar fold to DSPs and PTPs, yet with unpredicted specificities.	105
214470	smart00013	LRRNT	Leucine rich repeat N-terminal domain. 	33
214471	smart00014	acidPPc	Acid phosphatase homologues. 	116
197470	smart00015	IQ	Calmodulin-binding motif. Short calmodulin-binding motif containing conserved Ile and Gln residues.	23
214472	smart00017	OSTEO	Osteopontin. Osteopontin is an acidic phosphorylated glycoprotein of about 40 Kd which is abundant in the mineral matrix of bones and which binds tightly to hydroxyapatite. It is suggested that osteopontin might function as a cell attachment factor and could play a key role in the adhesion of osteoclasts to the mineral matrix of bone	287
197472	smart00018	PD	P or trefoil or TFF domain. Proposed role in renewal and pathology of mucous epithelia.	46
128335	smart00019	SF_P	Pulmonary surfactant proteins. Pulmonary surfactant associated proteins promote alveolar stability by lowering the surface tension at the air-liquid interface in the peripheral air spaces. SP-C, a component of surfactant, is a highly hydrophobic peptide of 35 amino acid residues which is processed from a larger precursor protein. SP-C is post-translationally modified by the covalent attachment of two palmitoyl groups on two adjacent cysteines	191
214473	smart00020	Tryp_SPc	Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.	229
197474	smart00021	DAX	Domain present in Dishevelled and axin. Domain of unknown function.	83
214474	smart00022	PLAc	Cytoplasmic phospholipase A2, catalytic subunit. Cytosolic phospholipases A2 hydrolyse arachidonyl phospholipids. Family includes phospholipases B isoforms.	549
128339	smart00023	COLIPASE	Colipase. Colipase is a protein that functions as a cofactor for pancreatic lipase, with which it forms a stoichiometric complex. It also binds to the bile-salt covered triacylglycerol interface thus allowing the enzyme to anchor itself to the water-lipid interface. Colipase is a small protein of approximately 100 amino-acid residues with five conserved disulfide bonds.	95
214475	smart00025	Pumilio	Pumilio-like repeats. Pumilio-like repeats that bind RNA.	36
128341	smart00026	EPEND	Ependymins. Ependymins are the predominant proteins in the cerebrospinal fluid (CSF) of teleost fish. They have been implicated in the neurochemistry of memory and neuronal regeneration. They are glycoproteins of about 200 amino acids that can bind calcium. Four cysteines are conserved that probably form disulfide bonds.	191
197477	smart00027	EH	Eps15 homology domain. Pair of EF hand motifs that recognise proteins containing Asn-Pro-Phe (NPF) sequences.	96
197478	smart00028	TPR	Tetratricopeptide repeats. Repeats present in 4 or more copies in proteins. Contain a minimum of 34 amino acids each and self-associate via a "knobs and holes" mechanism.	34
214476	smart00029	GASTRIN	gastrin / cholecystokinin / caerulein family. This family gathers small proteins of about 100-130 amino acids that act as hormones, among them gastrin, cholecystokinin and preprocaerulein which stimulate gastric, biliary, and pancreatic secretion and smooth muscle contraction.	14
128345	smart00030	CLb	CLUSTERIN Beta chain. 	206
214477	smart00031	DED	Death effector domain. 	79
214478	smart00032	CCP	Domain abundant in complement control proteins; SUSHI repeat; short complement-like repeat (SCR). The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. A missense mutation in seventh CCP domain causes deficiency of the b subunit of factor XIII.	56
214479	smart00033	CH	Calponin homology domain. Actin binding domains present in duplicate at the N-termini of spectrin-like proteins (including dystrophin, alpha-actinin). These domains cross-link actin filaments into bundles and networks. A calponin homology domain is predicted in yeast Cdc24p.	101
214480	smart00034	CLECT	C-type lectin (CTL) or carbohydrate-recognition domain (CRD). Many of these domains function as calcium-dependent carbohydrate binding modules.	124
128350	smart00035	CLa	CLUSTERIN alpha chain. 	216
214481	smart00036	CNH	Domain found in NIK1-like kinases, mouse citron and yeast ROM1, ROM2. 	302
128352	smart00037	CNX	Connexin homologues. Connexin channels participate in the regulation of signaling between developing and differentiated cell types.	34
197483	smart00038	COLFI	Fibrillar collagens C-terminal domain. Found at C-termini of fibrillar collagens: Ephydatia muelleri procollagen EMF1alpha, vertebrate collagens alpha(1)III, alpha(1)II, alpha(2)V etc.	232
128354	smart00039	CRF	corticotropin-releasing factor. 	40
128355	smart00040	CSF2	Granulocyte-macrophage colony-stimulating factor (GM-CSF). GM-CSF stimulates the development of and the cytotoxic activity of white blood cells.	121
214482	smart00041	CT	C-terminal cystine knot-like domain (CTCK). The structures of transforming growth factor-beta (TGFbeta), nerve growth factor (NGF), platelet-derived growth factor (PDGF) and gonadotropin all form 2 highly twisted antiparallel pairs of beta-strands and contain three disulphide bonds. The domain is non-globular and little is conserved among these presumed homologues except for their cysteine residues. CT domains are predicted to form homodimers.	82
214483	smart00042	CUB	Domain first found in C1r, C1s, uEGF, and bone morphogenetic protein. This domain is found mostly among developmentally-regulated proteins. Spermadhesins contain only this domain.	102
214484	smart00043	CY	Cystatin-like domain. Cystatins are a family of cysteine protease inhibitors that occur mainly as single domain proteins. However some extracellular proteins such as kininogen, His-rich glycoprotein and fetuin also contain these domains.	107
214485	smart00044	CYCc	Adenylyl- / guanylyl cyclase, catalytic domain. Present in two copies in mammalian adenylyl cyclases. Eubacterial homologues are known. Two residues (Asn, Arg) are thought to be involved in catalysis. These cyclases have important roles in a diverse range of cellular processes.	194
214486	smart00045	DAGKa	Diacylglycerol kinase accessory domain (presumed). Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. DAG can be produced from the hydrolysis of phosphatidylinositol 4,5-bisphosphate (PIP2) by a phosphoinositide-specific phospholipase C and by the degradation of phosphatidylcholine (PC) by a phospholipase C or the concerted actions of phospholipase D and phosphatidate phosphohydrolase. This domain might either be an accessory domain or else contribute to the catalytic domain. Bacterial homologues are known.	160
214487	smart00046	DAGKc	Diacylglycerol kinase catalytic domain (presumed). Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. DAG can be produced from the hydrolysis of phosphatidylinositol 4,5-bisphosphate (PIP2) by a phosphoinositide-specific phospholipase C and by the degradation of phosphatidylcholine (PC) by a phospholipase C or the concerted actions of phospholipase D and phosphatidate phosphohydrolase. This domain is presumed to be the catalytic domain. Bacterial homologues are known.	124
214488	smart00047	LYZ2	Lysozyme subfamily 2. Eubacterial enzymes distantly related to eukaryotic lysozymes.	147
128363	smart00048	DEFSN	Defensin/corticostatin family. Cysteine-rich domains that lyse bacteria, fungi and enveloped viruses by forming multimeric membrane-spanning channels.	29
214489	smart00049	DEP	Domain found in Dishevelled, Egl-10, and Pleckstrin. Domain of unknown function present in signalling proteins that contain PH, rasGEF, rhoGEF, rhoGAP, RGS, PDZ domains. DEP domain in Drosophila dishevelled is essential to rescue planar polarity defects and induce JNK signalling (Cell 94, 109-118).	77
214490	smart00050	DISIN	Homologues of snake disintegrins. Snake disintegrins inhibit the binding of ligands to integrin receptors. They contain a 'RGD' sequence, identical to the recognition site of many adhesion proteins. Molecules containing both disintegrin and metalloprotease domains are known as ADAMs.	75
128366	smart00051	DSL	delta serrate ligand. 	63
214491	smart00052	EAL	Putative diguanylate phosphodiesterase. Putative diguanylate phosphodiesterase, present in a variety of bacteria.	242
197491	smart00053	DYNc	Dynamin, GTPase. Large GTPases that mediate vesicle trafficking. Dynamin participates in the endocytic uptake of receptors, associated ligands, and plasma membrane following an exocytic event.	240
197492	smart00054	EFh	EF-hand, calcium binding motif. EF-hands are calcium-binding motifs that occur at least in pairs. Links between disease states and genes encoding EF-hands, particularly the S100 subclass, are emerging. Each motif consists of a 12 residue loop flanked on either side by a 12 residue alpha-helix. EF-hands undergo a conformational change upon binding calcium ions.	29
214492	smart00055	FCH	Fes/CIP4 homology domain. Alignment extended from original report. Highly alpha-helical. Also known as the RAEYL motif or the S. pombe Cdc15 N-terminal domain.	87
214493	smart00057	FIMAC	factor I membrane attack complex. 	68
214494	smart00058	FN1	Fibronectin type 1 domain. One of three types of internal repeat within the plasma protein, fibronectin. Found also in coagulation factor XII, HGF activator and tissue-type plasminogen activator. In t-PA and fibronectin, this domain type contributes to fibrin-binding.	45
128373	smart00059	FN2	Fibronectin type 2 domain. One of three types of internal repeat within the plasma protein, fibronectin. Also occurs in coagulation factor XII, 2 type IV collagenases, PDC-109, and cation-independent mannose-6-phosphate and secretory phospholipase A2 receptors. In fibronectin, PDC-109, and the collagenases, this domain contributes to collagen-binding function.	49
214495	smart00060	FN3	Fibronectin type 3 domain. One of three types of internal repeat within the plasma protein, fibronectin. The tenth fibronectin type III repeat contains a RGD cell recognition sequence in a flexible loop between 2 strands. Type III modules are present in both extracellular and intracellular proteins.	83
214496	smart00061	MATH	meprin and TRAF homology. 	95
214497	smart00062	PBPb	Bacterial periplasmic substrate-binding proteins. bacterial proteins, eukaryotic ones are in PBPe	219
214498	smart00063	FRI	Frizzled. Drosophila melanogaster frizzled mediates signalling that polarises a precursor cell along the anteroposterior axis. Homologues of the N-terminal region of frizzled exist either as transmembrane or secreted molecules. Frizzled homologues are reported to be receptors for the Wnt growth factors. (Not yet in MEDLINE: the FRI domain occurs in several receptor tyrosine kinases [Xu, Y.K. and Nusse, Curr. Biol. 8 R405-R406 (1998); Masiakowski, P. and Yanopoulos, G.D., Curr. Biol. 8, R407 (1998)].	113
214499	smart00064	FYVE	Protein present in Fab1, YOTB, Vac1, and EEA1. The FYVE zinc finger is named after four proteins where it was first found: Fab1, YOTB/ZK632.12, Vac1, and EEA1. The FYVE finger has been shown to bind two Zn2+ ions. The FYVE finger has eight potential zinc coordinating cysteine positions. The FYVE finger is structurally related to the PHD finger and the RING finger. Many members of this family also include two histidines in a motif R+HHC+XCG, where + represents a charged residue and X any residue. The FYVE finger functions in the membrane recruitment of cytosolic proteins by binding to phosphatidylinositol 3-phosphate (PI3P), which is prominent on endosomes. The R+HHC+XCG motif is critical for PI3P binding.	68
214500	smart00065	GAF	Domain present in phytochromes and cGMP-specific phosphodiesterases. Mutations within these domains in PDE6B result in autosomal recessive inheritance of retinitis pigmentosa.	149
214501	smart00066	GAL4	GAL4-like Zn(II)2Cys6 (or C6 zinc) binuclear cluster DNA-binding domain. Gal4 is a positive regulator for the gene expression of the galactose-induced genes of S. cerevisiae. Is present only in fungi.	43
128381	smart00067	GHA	Glycoprotein hormone alpha chain homologues. Also called gonadotropins. Glycoprotein hormones consist of two glycosylated chains (alpha and beta) of similar topology.	87
214502	smart00068	GHB	Glycoprotein hormone beta chain homologues. Also called gonadotropins. Glycoprotein hormones consist of two glycosylated chains (alpha and beta) of similar topology.	107
214503	smart00069	GLA	Domain containing Gla (gamma-carboxyglutamate) residues. A hyaluronan-binding domain found in proteins associated with the extracellular matrix, cell adhesion and cell migration.	65
128384	smart00070	GLUCA	Glucagon like hormones. 	27
128385	smart00071	Galanin	Galanin. Galanin is a neuropeptide that controls various biological activities: it regulates the release of growth hormone, inhibits the release of insulin and somatostatin, contracts smooth muscle of the gastrointestinal and genitourinary tract and may be involved in the control of adrenal secretion	103
214504	smart00072	GuKc	Guanylate kinase homologues. Active enzymes catalyze ATP-dependent phosphorylation of GMP to GDP. Structure resembles that of adenylate kinase. So-called membrane-associated guanylate kinase homologues (MAGUKs) do not possess guanylate kinase activities; instead at least some possess protein-binding functions.	174
197502	smart00073	HPT	Histidine Phosphotransfer domain. Contains an active histidine residue that mediates phosphotransfer reactions. Domain detected only in eubacteria. This alignment is an extension to that shown in the Cell structure paper.	92
214505	smart00075	HYDRO	Hydrophobins. 	76
197503	smart00076	IFabd	Interferon alpha, beta and delta. Interferons produce antiviral and antiproliferative responses in cells. They are classified into five groups, all of them related but gamma-interferon.	117
128390	smart00077	ITAM	Immunoreceptor tyrosine-based activation motif. Motif that may be dually phosphorylated on tyrosine that links antigen receptors to downstream signalling machinery.	21
214506	smart00078	IlGF	Insulin / insulin-like growth factor / relaxin family. Family of proteins including insulin, relaxin, and IGFs. Insulin decreases blood glucose concentration.	66
197504	smart00079	PBPe	Eukaryotic homologues of bacterial periplasmic substrate binding proteins. Prokaryotic homologues are represented by a separate alignment: PBPb	133
197505	smart00080	LIF_OSM	leukemia inhibitory factor. OSM, Oncostatin M	157
214507	smart00082	LRRCT	Leucine rich repeat C-terminal domain. 	51
197507	smart00084	NMU	Neuromedin U. Neuromedin U (NmU) is a vertebrate peptide which stimulates uterine smooth muscle contraction and causes selective vasoconstriction. Like most other active peptides, it is proteolytically processed from a larger precursor protein. The mature peptides are 8 (NmU-8) to 25 (NmU-25) residues long and C-terminally amidated. The sequence of the C-terminal extremity of NmU is extremely well conserved in mammals, birds and amphibians.	25
214508	smart00085	PA2c	Phospholipase A2. 	117
197509	smart00086	PAC	Motif C-terminal to PAS motifs (likely to contribute to PAS structural domain). PAC motif occurs C-terminal to a subset of all known PAS motifs. It is proposed to contribute to the PAS domain fold.	43
128398	smart00087	PTH	Parathyroid hormone. 	36
214509	smart00088	PINT	motif in proteasome subunits, Int-6, Nip-1 and TRIP-15. Also called the PCI (Proteasome, COP9, Initiation factor 3) domain. Unknown function.	88
214510	smart00089	PKD	Repeats in polycystic kidney disease 1 (PKD1) and other proteins. Polycystic kidney disease 1 protein contains 14 repeats, present elsewhere such as in microbial collagenases.	79
214511	smart00090	RIO	RIO-like kinase. 	237
214512	smart00091	PAS	PAS domain. PAS motifs appear in archaea, eubacteria and eukarya. Probably the most surprising identification of a PAS domain was that in EAG-like K+-channels.	67
128403	smart00092	RNAse_Pc	Pancreatic ribonuclease. 	123
214513	smart00093	SERPIN	SERine Proteinase INhibitors. 	359
214514	smart00094	TR_FER	Transferrin. 	332
128406	smart00095	TR_THY	Transthyretin. 	121
128407	smart00096	UTG	Uteroglobin. 	69
128408	smart00097	WNT1	found in Wnt-1. 	305
214515	smart00098	alkPPc	Alkaline phosphatase homologues. 	419
128410	smart00099	btg1	tob/btg1 family. The tob/btg1 is a family of proteins that inhibit cell proliferation.	108
197516	smart00100	cNMP	Cyclic nucleotide-monophosphate binding domain. Catabolite gene activator protein (CAP) is a prokaryotic homologue of eukaryotic cNMP-binding domains, present in ion channels, and cNMP-dependent kinases.	120
128412	smart00101	14_3_3	14-3-3 homologues. 14-3-3 homologues mediate signal transduction by binding to phosphoserine-containing proteins. They are involved in growth factor signalling and also interact with MEK kinases.	244
214516	smart00102	ADF	Actin depolymerisation factor/cofilin -like domains. Severs actin filaments and binds to actin monomers.	127
214517	smart00103	ALBUMIN	serum albumin. 	187
197517	smart00104	ANATO	Anaphylatoxin homologous domain. C3a, C4a and C5a anaphylatoxins are protein fragments generated enzymatically in serum during activation of complement molecules C3, C4, and C5. They induce smooth muscle contraction. These fragments are homologous to a three-fold repeat in fibulins.	35
214518	smart00105	ArfGap	Putative GTP-ase activating proteins for the small GTPase, ARF. Putative zinc fingers with GTPase activating proteins (GAPs) towards the small GTPase, Arf. The GAP of ARD1 stimulates GTPase hydrolysis for ARD1 but not ARFs.	119
128417	smart00107	BTK	Bruton's tyrosine kinase Cys-rich motif. Zinc-binding motif containing conserved cysteines and a histidine. Always found C-terminal to PH domains (but not all PH domains are followed by BTK motifs). The crystal structure shows this motif packs against the PH domain. The PH+Btk module pair has been called the Tec homology (TH) region.	36
214519	smart00108	B_lectin	Bulb-type mannose-specific lectin. 	114
197519	smart00109	C1	Protein kinase C conserved region 1 (C1) domains (Cysteine-rich domains). Some bind phorbol esters and diacylglycerol. Some bind RasGTP. Zinc-binding domains.	50
128420	smart00110	C1Q	Complement component C1q domain. Globular domain found in many collagens and eponymously in complement C1q. When part of full length proteins these domains form a 'bouquet' due to the multimerization of heterotrimers. The C1q fold is similar to that of tumour necrosis factor.	135
128421	smart00111	C4	C-terminal tandem repeated domain in type 4 procollagens. Duplicated domain in C-terminus of type 4 collagens. Mutations in alpha-5 collagen IV are associated with X-linked Alport syndrome.	114
214520	smart00112	CA	Cadherin repeats. Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. Cadherin domains occur as repeats in the extracellular regions which are thought to mediate cell-cell contact when bound to calcium.	81
128423	smart00113	CALCITONIN	calcitonin. This family is formed by calcitonin, the calcitonin gene-related peptide, and amylin. They are short polypeptide hormones.	38
128424	smart00114	CARD	Caspase recruitment domain. Motif contained in proteins involved in apoptotic signalling. Mediates homodimerisation. Structure consists of six antiparallel helices arranged in a topology homologous to the DEATH and the DED domain.	88
214521	smart00115	CASc	Caspase, interleukin-1 beta converting enzyme (ICE) homologues. Cysteine aspartases that mediate programmed cell death (apoptosis). Caspases are synthesised as zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologues.	241
214522	smart00116	CBS	Domain in cystathionine beta-synthase and other proteins. Domain present in all 3 forms of cellular life. Present in two copies in inosine monophosphate dehydrogenase, of which one is disordered in the crystal structure. A number of disease states are associated with CBS-containing proteins including homocystinuria, Becker's and Thomsen disease.	49
214523	smart00119	HECTc	Domain Homologous to E6-AP Carboxyl Terminus with. E3 ubiquitin-protein ligases. Can bind to E2 enzymes.	328
214524	smart00120	HX	Hemopexin-like repeats. Hemopexin is a heme-binding protein that transports heme to the liver. Hemopexin-like repeats occur in vitronectin and some matrix metalloproteinases family (matrixins). The HX repeats of some matrixins bind tissue inhibitor of metalloproteinases (TIMPs).	45
197525	smart00121	IB	Insulin growth factor-binding protein homologues. High affinity binding partners of insulin-like growth factors.	75
128430	smart00125	IL1	Interleukin-1 homologues. Cytokines with various biological functions. Interleukin 1 alpha and beta are also known as hematopoietin and catabolin.	147
128431	smart00126	IL6	Interleukin-6 homologues. Family includes granulocyte colony-stimulating factor (G-CSF) and myelomonocytic growth factor (MGF). IL-6 is also known as B-cell stimulatory factor 2.	154
128432	smart00127	IL7	Interleukin-7 and interleukin-9 family. IL-7 is a cytokine that acts as a growth factor for early lymphoid cells of both B- and T-cell lineages. IL-9 is a multifunctional cytokine that, although originally described as a T-cell growth factor, its function in T-cell response remains unclear.	146
214525	smart00128	IPPc	Inositol polyphosphate phosphatase, catalytic domain homologues. Mg(2+)-dependent/Li(+)-sensitive enzymes.	306
214526	smart00129	KISc	Kinesin motor, catalytic domain. ATPase. Microtubule-dependent molecular motors that play important roles in intracellular transport of organelles and in cell division.	335
214527	smart00130	KR	Kringle domain. Named after a Danish pastry. Found in several serine proteases and in ROR-like receptors. Can occur in up to 38 copies (in apolipoprotein(a)). Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides.	83
197529	smart00131	KU	BPTI/Kunitz family of serine protease inhibitors. Serine protease inhibitors. One member of the family is encoded by an alternatively-spliced form of Alzheimer's amyloid beta-protein.	53
214528	smart00132	LIM	Zinc-binding domain present in Lin-11, Isl-1, Mec-3. Zinc-binding domain family. Some LIM domains bind protein partners via tyrosine-containing motifs. LIM domains are found in many key regulators of developmental pathways.	54
214529	smart00133	S_TK_X	Extension to Ser/Thr-type protein kinases. 	64
214530	smart00134	LU	Ly-6 antigen / uPA receptor -like domain. Three-fold repeated domain in urokinase-type plasminogen activator receptor; occurs singly in other GPI-linked cell-surface glycoproteins (Ly-6 family, CD59, thymocyte B cell antigen, Sgp-2). Topology of these domains is similar to that of snake venom neurotoxins.	85
214531	smart00135	LY	Low-density lipoprotein-receptor YWTD domain. Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin.	43
214532	smart00136	LamNT	Laminin N-terminal domain (domain VI). N-terminal domain of laminins and laminin-related protein such as Unc-6/netrins.	238
214533	smart00137	MAM	Domain in meprin, A5, receptor protein tyrosine phosphatase mu (and others). Likely to have an adhesive function. Mutations in the meprin MAM domain affect noncovalent associations within meprin oligomers. In receptor tyrosine phosphatase mu-like molecules the MAM domain is important for homophilic cell-cell interactions.	161
214534	smart00138	MeTrc	Methyltransferase, chemotaxis proteins. Methylates methyl-accepting chemotaxis proteins to form gamma-glutamyl methyl ester residues.	264
214535	smart00139	MyTH4	Domain in Myosin and Kinesin Tails. Domain present twice in myosin-VIIa, and also present in 3 other myosins.	152
128445	smart00140	NGF	Nerve growth factor (NGF or beta-NGF). NGF is important for the development and maintenance of the sympathetic and sensory nervous systems.	106
197537	smart00141	PDGF	Platelet-derived and vascular endothelial growth factors (PDGF, VEGF) family. Platelet-derived growth factor is a potent activator for cells of mesenchymal origin. PDGF-A and PDGF-B form AA and BB homodimers and an AB heterodimer. Members of the VEGF family are homologues of PDGF.	83
214536	smart00142	PI3K_C2	Phosphoinositide 3-kinase, region postulated to contain C2 domain. Outlier of C2 family.	100
197539	smart00143	PI3K_p85B	PI3-kinase family, p85-binding domain. Region of p110 PI3K that binds the p85 subunit.	78
197540	smart00144	PI3K_rbd	PI3-kinase family, Ras-binding domain. Certain members of the PI3K family possess Ras-binding domains in their N-termini. These regions show some similarity (although not highly significant similarity) to Ras-binding RA domains (unpublished observation).	108
214537	smart00145	PI3Ka	Phosphoinositide 3-kinase family, accessory domain (PIK domain). PIK domain is conserved in all PI3 and PI4-kinases. Its role is unclear but it has been suggested to be involved in substrate presentation.	184
214538	smart00146	PI3Kc	Phosphoinositide 3-kinase, catalytic domain. Phosphoinositide 3-kinase isoforms participate in a variety of processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, and apoptosis. These homologues may be either lipid kinases and/or protein kinases: the former phosphorylate the 3-position in the inositol ring of inositol phospholipids. The ataxia telangiectasia-mutated gene product, the targets of rapamycin (TOR) and the DNA-dependent kinase have not been found to possess lipid kinase activity. Some of this family possess PI-4 kinase activities.	240
214539	smart00147	RasGEF	Guanine nucleotide exchange factor for Ras-like small GTPases. 	242
197543	smart00148	PLCXc	Phospholipase C, catalytic domain (part); domain X. Phosphoinositide-specific phospholipases C. These enzymes contain 2 regions (X and Y) which together form a TIM barrel-like structure containing the active site residues. Phospholipase C enzymes (PI-PLC) act as signal transducers that generate two second messengers, inositol-1,4,5-trisphosphate and diacylglycerol. The bacterial enzyme appears to be a homologue of the mammalian PLCs.	143
128454	smart00149	PLCYc	Phospholipase C, catalytic domain (part); domain Y. Phosphoinositide-specific phospholipases C. These enzymes contain 2 regions (X and Y) which together form a TIM barrel-like structure containing the active site residues. Phospholipase C enzymes (PI-PLC) act as signal transducers that generate two second messengers, inositol-1,4,5-trisphosphate and diacylglycerol. The bacterial enzyme appears to be a homologue of the mammalian PLCs.	115
197544	smart00150	SPEC	Spectrin repeats. 	101
128456	smart00151	SWIB	SWI complex, BAF60b domains. 	77
128457	smart00152	THY	Thymosin beta actin-binding motif. 	37
128458	smart00153	VHP	Villin headpiece domain. 	36
197545	smart00154	ZnF_AN1	AN1-like Zinc finger. Zinc finger at the C-terminus of An1, a ubiquitin-like protein in Xenopus laevis.	39
197546	smart00155	PLDc	Phospholipase D. Active site motifs. Phosphatidylcholine-hydrolyzing phospholipase D (PLD) isoforms are activated by ADP-ribosylation factors (ARFs). PLD produces phosphatidic acid from phosphatidylcholine, which may be essential for the formation of certain types of transport vesicles or may be constitutive vesicular transport to signal transduction pathways. PC-hydrolysing PLD is a homologue of cardiolipin synthase, phosphatidylserine synthase, bacterial PLDs, and viral proteins. Each of these appears to possess a domain duplication which is apparent by the presence of two motifs containing well-conserved histidine, lysine, aspartic acid, and/or asparagine residues which may contribute to the active site. An E. coli endonuclease (nuc) and similar proteins appear to be PLD homologues but possess only one of these motifs. The profile contained here represents only the putative active site regions, since an accurate multiple alignment of the repeat units has not been achieved.	28
197547	smart00156	PP2Ac	Protein phosphatase 2A homologues, catalytic domain. Large family of serine/threonine phosphatases, that includes PP1, PP2A and PP2B (calcineurin) family members.	271
197548	smart00157	PRP	Major prion protein. The prion protein is a major component of scrapie-associated fibrils in Creutzfeldt-Jakob disease, kuru, Gerstmann-Straussler syndrome and bovine spongiform encephalopathy.	218
128463	smart00159	PTX	Pentraxin / C-reactive protein / pentaxin family. This family forms a discoid pentameric structure. Human serum amyloid P demonstrates calcium-mediated ligand-binding.	206
197549	smart00160	RanBD	Ran-binding domain. Domain of approximately 150 residues that stabilises the GTP-bound form of Ran (the Ras-like nuclear small GTPase).	130
128465	smart00162	SAPA	Saposin/surfactant protein-B A-type DOMAIN. Present as four and three degenerate copies, respectively, in prosaposin and surfactant protein B. Single copies in acid sphingomyelinase, NK-lysin amoebapores and granulysin. Putative phospholipid membrane binding domains.	34
214540	smart00164	TBC	Domain in Tre-2, BUB2p, and Cdc16p. Probable Rab-GAPs. Widespread domain present in Gyp6 and Gyp7, thereby giving rise to the notion that it performs a GTP-activator activity on Rab-like GTPases.	216
197551	smart00165	UBA	Ubiquitin associated domain. Present in Rad23, SNF1-like kinases. The newly-found UBA in p62 is known to bind ubiquitin.	37
197552	smart00166	UBX	Domain present in ubiquitin-regulatory proteins. Present in FAF1 and Shp1p.	77
128469	smart00167	VPS9	Domain present in VPS9. Domain present in yeast vacuolar sorting protein 9 and other proteins.	117
214541	smart00173	RAS	Ras subfamily of RAS small GTPases. Similar in fold and function to the bacterial EF-Tu GTPase. p21Ras couples receptor Tyr kinases and G protein receptors to protein kinase cascades	164
197554	smart00174	RHO	Rho (Ras homology) subfamily of Ras-like small GTPases. Members of this subfamily of Ras-like small GTPases include Cdc42 and Rac, as well as Rho isoforms.	174
197555	smart00175	RAB	Rab subfamily of small GTPases. Rab GTPases are implicated in vesicle trafficking.	164
128473	smart00176	RAN	Ran (Ras-related nuclear proteins) /TC4 subfamily of small GTPases. Ran is involved in the active transport of proteins through nuclear pores.	200
128474	smart00177	ARF	ARF-like small GTPases; ARF, ADP-ribosylation factor. Ras homologues involved in vesicular transport. Activator of phospholipase D isoforms. Unlike Ras proteins they lack cysteine residues at their C-termini and therefore are unlikely to be prenylated. ARFs are N-terminally myristoylated. Contains ATP/GTP-binding motif (P-loop).	175
197556	smart00178	SAR	Sar1p-like members of the Ras-family of small GTPases. Yeast SAR1 is an essential gene required for transport of secretory proteins from the endoplasmic reticulum to the Golgi apparatus.	184
214542	smart00179	EGF_CA	Calcium-binding EGF-like domain. 	39
214543	smart00180	EGF_Lam	Laminin-type epidermal growth factor-like domain. 	46
214544	smart00181	EGF	Epidermal growth factor-like domain. 	35
214545	smart00182	CULLIN	Cullin. 	143
128480	smart00183	NAT_PEP	Natriuretic peptide. Atrial natriuretic peptides are vertebrate hormones important in the overall control of cardiovascular homeostasis and sodium and water balance in general.	24
214546	smart00184	RING	Ring finger. E3 ubiquitin-protein ligase activity is intrinsic to the RING domain of c-Cbl and is likely to be a general function of this domain; Various RING fingers exhibit binding activity towards E2 ubiquitin-conjugating enzymes (Ubc's)	40
214547	smart00185	ARM	Armadillo/beta-catenin-like repeats. Approx. 40 amino acid repeat. Tandem repeats form superhelix of helices that is proposed to mediate interaction of beta-catenin with its ligands. Involved in transducing the Wingless/Wnt signal. In plakoglobin arm repeats bind alpha-catenin and N-cadherin.	41
214548	smart00186	FBG	Fibrinogen-related domains (FReDs). Domain present at the C-termini of fibrinogen beta and gamma chains, and a variety of fibrinogen-related proteins, including tenascin and Drosophila scabrous.	212
197563	smart00187	INB	Integrin beta subunits (N-terminal portion of extracellular region). Portion of beta integrins that lies N-terminal to their EGF-like repeats. Integrins are cell adhesion molecules that mediate cell-extracellular matrix and cell-cell interactions. They contain both alpha and beta subunits. Beta integrins are proposed to have a von Willebrand factor type-A "insert" or "I" -like domain (although this remains to be confirmed).	423
128485	smart00188	IL10	Interleukin-10 family. Interleukin-10 inhibits the synthesis of a number of cytokines, including IFN-gamma, IL-2, IL-3, TNF and GM-CSF produced by activated macrophages and by helper T cells.	137
128486	smart00189	IL2	Interleukin-2 family. Interleukin-2 is a cytokine produced by T-helper cells in response to antigenic or mitogenic stimulation. This protein is required for T-cell proliferation and other activities crucial to the regulation of the immune response.	154
197564	smart00190	IL4_13	Interleukins 4 and 13. Interleukins-4 and -13 are cytokines involved in inflammatory and immune responses. IL-4 stimulates B and T cells.	138
214549	smart00191	Int_alpha	Integrin alpha (beta-propeller repeats). Integrins are cell adhesion molecules that mediate cell-extracellular matrix and cell-cell interactions. They contain both alpha and beta subunits. Alpha integrins are proposed to contain a domain containing a 7-fold repeat that adopts a beta-propeller fold. Some of these domains contain an inserted von Willebrand factor type-A domain. Some repeats contain putative calcium-binding sites. The 7-fold repeat domain is homologous to a similar domain in phosphatidylinositol-glycan-specific phospholipase D.	57
197566	smart00192	LDLa	Low-density lipoprotein receptor domain class A. Cysteine-rich repeat in the low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. The N-terminal type A repeats in LDL receptor bind the lipoproteins. Other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement. Mutations in the LDL receptor gene cause familial hypercholesterolemia.	33
128490	smart00193	PTN	Pleiotrophin / midkine family. Heparin-binding domain family.	80
214550	smart00194	PTPc	Protein tyrosine phosphatase, catalytic domain. 	259
214551	smart00195	DSPc	Dual specificity phosphatase, catalytic domain. 	138
214552	smart00197	SAA	Serum amyloid A proteins. Serum amyloid A proteins are induced during the acute-phase response. Secondary amyloidosis is characterised by the extracellular accumulation in tissues of SAA proteins. SAA proteins are apolipoproteins.	103
214553	smart00198	SCP	SCP / Tpx-1 / Ag5 / PR-1 / Sc7 family of extracellular domains. Human glioma pathogenesis-related protein GliPR and the plant pathogenesis-related protein represent functional links between plant defense systems and human immune system. This family has no known function.	144
197570	smart00199	SCY	Intercrine alpha family (small cytokine C-X-C) (chemokine CXC). Family of cytokines involved in cell-specific chemotaxis, mediation of cell growth, and the inflammatory response.	59
214554	smart00200	SEA	Domain found in sea urchin sperm protein, enterokinase, agrin. Proposed function of regulating or binding carbohydrate sidechains.	121
197571	smart00201	SO	Somatomedin B -like domains. Somatomedin-B is a peptide, proteolytically excised from vitronectin, that is a growth hormone-dependent serum factor with protease-inhibiting activity.	43
214555	smart00202	SR	Scavenger receptor Cys-rich. The sea urchin egg peptide speract contains 4 repeats of SR domains that contain 6 conserved cysteines. May bind bacterial antigens in the protein MARCO.	101
197573	smart00203	TK	Tachykinin family. Tachykinins are a group of biologically active peptides which excite neurons, evoke behavioral responses, are potent vasodilators and contract (directly or indirectly) many smooth muscles. These peptides are synthesized as longer precursors and then processed to peptides from ten to twelve residues long.	11
214556	smart00204	TGFB	Transforming growth factor-beta (TGF-beta) family. Family members are active as disulphide-linked homo- or heterodimers. TGFB is a multifunctional peptide that controls proliferation, differentiation, and other functions in many cell types.	102
128501	smart00205	THN	Thaumatin family. The thaumatin family gathers proteins related to plant pathogenesis. The thaumatin family includes very basic members with extracellular and vacuolar localization. Thaumatin itself is a potent sweet-tasting protein. Several members of this family display significant in vitro activity of inhibiting hyphal growth or spore germination of various fungi probably by a membrane permeabilizing mechanism.	218
128502	smart00206	NTR	Tissue inhibitor of metalloproteinase family. Form complexes with metalloproteinases, such as collagenases, and irreversibly inactivate them.	172
214557	smart00207	TNF	Tumour necrosis factor family. Family of cytokines that form homotrimeric or heterotrimeric complexes. TNF mediates mature T-cell receptor-induced apoptosis through the p75 TNF receptor.	125
214558	smart00208	TNFR	Tumor necrosis factor receptor / nerve growth factor receptor repeats. Repeats in growth factor receptors that are involved in growth factor binding. TNF/TNFR	39
214559	smart00209	TSP1	Thrombospondin type 1 repeats. Type 1 repeats in thrombospondin-1 bind and activate TGF-beta.	53
214560	smart00210	TSPN	Thrombospondin N-terminal -like domains. Heparin-binding and cell adhesion domain of thrombospondin	184
214561	smart00211	TY	Thyroglobulin type I repeats. The N-terminal region of human thyroglobulin contains 11 type-1 repeats. TY repeats are proposed to be inhibitors of cysteine proteases and binding partners of heparin.	46
214562	smart00212	UBCc	Ubiquitin-conjugating enzyme E2, catalytic domain homologues. Proteins destined for proteasome-mediated degradation may be ubiquitinated. Ubiquitination follows conjugation of ubiquitin to a conserved cysteine residue of UBC homologues. This pathway functions in regulating many fundamental processes required for cell viability. TSG101 is one of several UBC homologues that lacks this active site cysteine.	145
214563	smart00213	UBQ	Ubiquitin homologues. Ubiquitin-mediated proteolysis is involved in the regulated turnover of proteins required for controlling cell cycle progression	72
214564	smart00214	VWC	von Willebrand factor (vWF) type C domain. 	59
214565	smart00215	VWC_out	von Willebrand factor (vWF) type C domain. 	67
214566	smart00216	VWD	von Willebrand factor (vWF) type D domain. Von Willebrand factor contains several type D domains: D1 and D2 are present within the N-terminal propeptide whereas the remaining D domains are required for multimerisation.	163
197580	smart00217	WAP	Four-disulfide core domains. 	47
128514	smart00218	ZU5	Domain present in ZO-1 and Unc5-like netrin receptors. Domain of unknown function.	104
197581	smart00219	TyrKc	Tyrosine kinase, catalytic domain. Phosphotransferases. Tyrosine-specific kinase subfamily.	257
214567	smart00220	S_TKc	Serine/Threonine protein kinases, catalytic domain. Phosphotransferases. Serine or threonine-specific kinase subfamily.	254
214568	smart00221	STYKc	Protein kinase; unclassified specificity. Phosphotransferases. The specificity of this class of kinases can not be predicted. Possible dual-specificity Ser/Thr/Tyr kinase.	258
214569	smart00222	Sec7	Sec7 domain. Domain named after the S. cerevisiae SEC7 gene product, which is required for proper protein transport through the Golgi. The domain facilitates guanine nucleotide exchange on the small GTPases, ARFs (ADP ribosylation factors).	189
128519	smart00223	APPLE	APPLE domain. Four-fold repeat in plasma kallikrein and coagulation factor XI. Factor XI apple 3 mediates binding to platelets. Factor XI apple 1 binds high-molecular-mass kininogen. Apple 4 in factor XI mediates dimer formation and binds to factor XIIa. Mutations in apple 4 cause factor XI deficiency, an inherited bleeding disorder.	79
128520	smart00224	GGL	G protein gamma subunit-like motifs. 	63
197585	smart00225	BTB	Broad-Complex, Tramtrack and Bric a brac. Domain in Broad-Complex, Tramtrack and Bric a brac. Also known as POZ (poxvirus and zinc finger) domain. Known to be a protein-protein interaction motif found at the N-termini of several C2H2-type transcription factors as well as Shaw-type potassium channels. Known structure reveals a tightly intertwined dimer formed via interactions between N-terminal strand and helix structures. However in a subset of BTB/POZ domains, these two secondary structures appear to be missing. Be aware SMART predicts BTB/POZ domains without the beta1- and alpha1-secondary structures.	97
197586	smart00226	LMWPc	Low molecular weight phosphatase family. 	134
128523	smart00227	NEBU	The Nebulin repeat is present also in Las1. Tandem arrays of these repeats are known to bind actin.	31
214570	smart00228	PDZ	Domain present in PSD-95, Dlg, and ZO-1/2. Also called DHR (Dlg homologous region) or GLGF (relatively well conserved tetrapeptide in these domains). Some PDZs have been shown to bind C-terminal polypeptides; others appear to bind internal (non-C-terminal) polypeptides. Different PDZs possess different binding specificities.	85
214571	smart00229	RasGEFN	Guanine nucleotide exchange factor for Ras-like GTPases; N-terminal motif. A subset of guanine nucleotide exchange factor for Ras-like small GTPases appear to possess this domain N-terminal to the RasGef (Cdc25-like) domain. The recent crystal structure of Sos shows that this domain is alpha-helical and plays a "purely structural role" (Nature 394, 337-343).	127
128526	smart00230	CysPc	Calpain-like thiol protease family. Calpain-like thiol protease family (peptidase family C2). Calcium activated neutral protease (large subunit).	318
214572	smart00231	FA58C	Coagulation factor 5/8 C-terminal domain, discoidin domain. Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes.	139
214573	smart00232	JAB_MPN	JAB/MPN domain. Domain in Jun kinase activation domain binding protein and proteasomal subunits. Domain at Mpr1p and Pad1p N-termini. Domain of unknown function.	135
214574	smart00233	PH	Pleckstrin homology domain. Domain commonly found in eukaryotic signalling proteins. The domain family possesses multiple functions including the abilities to bind inositol phosphates, and various proteins. PH domains have been found to possess inserted domains (such as in PLC gamma, syntrophins) and to be inserted within other domains. Mutations in Bruton's tyrosine kinase (Btk) within its PH domain cause X-linked agammaglobulinaemia (XLA) in patients. Point mutations cluster into the positively charged end of the molecule around the predicted binding site for phosphatidylinositol lipids.	102
214575	smart00234	START	in StAR and phosphatidylcholine transfer protein. putative lipid-binding domain in StAR and phosphatidylcholine transfer protein	205
214576	smart00235	ZnMc	Zinc-dependent metalloprotease. Neutral zinc metallopeptidases. This alignment represents a subset of known subfamilies. Highest similarity occurs in the HExxH zinc-binding site/ active site.	139
197593	smart00236	fCBD	Fungal-type cellulose-binding domain. Small four-cysteine cellulose-binding domain of fungi	34
197594	smart00237	Calx_beta	Domains in Na-Ca exchangers and integrin-beta4. Domain in Na-Ca exchangers and integrin subunit beta4 (and some cyanobacterial proteins)	90
197595	smart00238	BIR	Baculoviral inhibition of apoptosis protein repeat. Domain found in inhibitor of apoptosis proteins (IAPs) and other proteins. Acts as a direct inhibitor of caspase enzymes.	71
214577	smart00239	C2	Protein kinase C conserved region 2 (CalB). Ca2+-binding motif present in phospholipases, protein kinases C, and synaptotagmins (among others). Some do not appear to contain Ca2+-binding sites. Particular C2s appear to bind phospholipids, inositol polyphosphates, and intracellular proteins. Unusual occurrence in perforin. Synaptotagmin and PLC C2s are permuted in sequence with respect to N- and C-terminal beta strands. SMART detects C2 domains using one or both of two profiles.	101
214578	smart00240	FHA	Forkhead associated domain. Found in eukaryotic and prokaryotic proteins. Putative nuclear signalling domain.	52
214579	smart00241	ZP	Zona pellucida (ZP) domain. ZP proteins are responsible for sperm-adhesion to the zona pellucida. ZP domains are also present in multidomain transmembrane proteins such as glycoprotein GP2, uromodulin and TGF-beta receptor type III (betaglycan).	252
214580	smart00242	MYSc	Myosin. Large ATPases. ATPase; molecular motor. Muscle contraction consists of a cyclical interaction between myosin and actin. The core of the myosin structure is similar in fold to that of kinesin.	677
128539	smart00243	GAS2	Growth-Arrest-Specific Protein 2 Domain. GROWTH-ARREST-SPECIFIC PROTEIN 2 Domain	73
214581	smart00244	PHB	prohibitin homologues. prohibitin homologues	160
214582	smart00245	TSPc	tail specific protease. tail specific protease	192
128542	smart00246	WH2	Wiskott Aldrich syndrome homology region 2. Wiskott Aldrich syndrome homology region 2 / actin-binding motif	18
214583	smart00247	XTALbg	Beta/gamma crystallins. Beta/gamma crystallins	82
197603	smart00248	ANK	ankyrin repeats. Ankyrin repeats are about 33 amino acids long and occur in at least four consecutive copies. They are involved in protein-protein interactions. The core of the repeat seems to be an helix-loop-helix structure.	30
214584	smart00249	PHD	PHD zinc finger. The plant homeodomain (PHD) finger is a C4HC3 zinc-finger-like motif found in nuclear proteins thought to be involved in epigenetics and chromatin-mediated transcriptional regulation. The PHD finger binds two zinc ions using the so-called 'cross-brace' motif and is thus structurally related to the RING finger and the FYVE finger. It is not yet known if PHD fingers have a common molecular function. Several reports suggest that it can function as a protein-protein interaction domain and it was recently demonstrated that the PHD finger of p300 can cooperate with the adjacent BROMO domain in nucleosome binding in vitro. Other reports suggesting that the PHD finger is a ubiquitin ligase have been refuted as these domains were RING fingers misidentified as PHD fingers.	47
197605	smart00250	PLEC	Plectin repeat. 	38
128547	smart00251	SAM_PNT	SAM / Pointed domain. A subfamily of the SAM domain	82
214585	smart00252	SH2	Src homology 2 domains. Src homology 2 domains bind phosphotyrosine-containing polypeptides via 2 surface pockets. Specificity is provided via interaction with residues that are distinct from the phosphotyrosine. Only a single occurrence of a SH2 domain has been found in S. cerevisiae.	84
128549	smart00253	SOCS	suppressors of cytokine signalling. suppressors of cytokine signalling	43
214586	smart00254	ShKT	ShK toxin domain. ShK toxin domain	33
214587	smart00255	TIR	Toll - interleukin 1 - resistance. 	140
197608	smart00256	FBOX	A Receptor for Ubiquitination Targets. 	41
197609	smart00257	LysM	Lysin motif. 	44
128554	smart00258	SAND	SAND domain. 	73
128555	smart00259	ZnF_A20	A20-like zinc fingers. A20- (an inhibitor of cell death)-like zinc fingers. The zinc finger mediates self-association in A20. These fingers also mediate IL-1-induced NF-kappaB activation.	26
214588	smart00260	CheW	Two component signalling adaptor domain. 	138
214589	smart00261	FU	Furin-like repeats. 	45
214590	smart00262	GEL	Gelsolin homology domain. Gelsolin/severin/villin homology domain. Calcium-binding and actin-binding. Both intra- and extracellular domains.	90
197612	smart00263	LYZ1	Alpha-lactalbumin / lysozyme C. 	127
214591	smart00264	BAG	BAG domains, present in regulator of Hsp70 proteins. BAG domains, present in Bcl-2-associated athanogene 1 and silencer of death domains	79
128561	smart00265	BH4	BH4 Bcl-2 homology region 4. 	27
128562	smart00266	CAD	Domains present in proteins implicated in post-mortem DNA fragmentation. 	74
128563	smart00267	GGDEF	diguanylate cyclase. Diguanylate cyclase, present in a variety of bacteria.	163
214592	smart00268	ACTIN	Actin. ACTIN subfamily of ACTIN/mreB/sugarkinase/Hsp70 superfamily	373
197615	smart00269	BowB	Bowman-Birk type proteinase inhibitor. 	55
214593	smart00270	ChtBD1	Chitin binding domain. 	38
197617	smart00271	DnaJ	DnaJ molecular chaperone homology domain. 	60
197618	smart00272	END	Endothelin. 	22
214594	smart00273	ENTH	Epsin N-terminal homology (ENTH) domain. 	127
128570	smart00274	FOLN	Follistatin-N-terminal domain-like. Follistatin-N-terminal domain-like, EGF-like. Region distinct from the kazal-like sequence	24
214595	smart00275	G_alpha	G protein alpha subunit. Subunit of G proteins that contains the guanine nucleotide binding site	342
214596	smart00276	GLECT	Galectin. Galectin - galactose-binding lectin	128
197621	smart00277	GRAN	Granulin. 	51
197622	smart00278	HhH1	Helix-hairpin-helix DNA-binding motif class 1. 	20
197623	smart00279	HhH2	Helix-hairpin-helix class 2 (Pol1 family) motifs. 	36
197624	smart00280	KAZAL	Kazal type serine protease inhibitors. Kazal type serine protease inhibitors and follistatin-like domains.	46
214597	smart00281	LamB	Laminin B domain. 	127
214598	smart00282	LamG	Laminin G domain. 	132
214599	smart00283	MA	Methyl-accepting chemotaxis-like domains (chemotaxis sensory transducer). Thought to undergo reversible methylation in response to attractants or repellants during bacterial chemotaxis.	262
128580	smart00284	OLF	Olfactomedin-like domains. 	255
197628	smart00285	PBD	P21-Rho-binding domain. Small domains that bind Cdc42p- and/or Rho-like small GTPases. Also known as the Cdc42/Rac interactive binding (CRIB).	36
128582	smart00286	PTI	Plant trypsin inhibitors. 	29
214600	smart00287	SH3b	Bacterial SH3 domain homologues. 	63
197630	smart00288	VHS	Domain present in VPS-27, Hrs and STAM. Unpublished observations. Domain of unknown function.	133
214601	smart00289	WR1	Worm-specific repeat type 1. Worm-specific repeat type 1. Cysteine-rich domain apparently unique (so far) to C. elegans. Often appears with KU domains. About 3 dozen worm proteins contain this domain.	38
197632	smart00290	ZnF_UBP	Ubiquitin Carboxyl-terminal Hydrolase-like zinc finger. 	50
197633	smart00291	ZnF_ZZ	Zinc-binding domain, present in Dystrophin, CREB-binding protein. Putative zinc-binding domain present in dystrophin-like proteins, and CREB-binding protein/p300 homologues. The ZZ in dystrophin appears to bind calmodulin. A missense mutation of one of the conserved cysteines in dystrophin results in a patient with Duchenne muscular dystrophy.	44
214602	smart00292	BRCT	breast cancer carboxy-terminal domain. 	78
214603	smart00293	PWWP	domain with conserved PWWP motif. conservation of Pro-Trp-Trp-Pro residues	63
128590	smart00294	4.1m	putative band 4.1 homologues' binding motif. 	19
214604	smart00295	B41	Band 4.1 homologues. Also known as ezrin/radixin/moesin (ERM) protein domains. Present in myosins, ezrin, radixin, moesin, protein tyrosine phosphatases. Plasma membrane-binding domain. These proteins play structural and regulatory roles in the assembly and stabilization of specialized plasma membrane domains. Some PDZ domain containing proteins bind one or more of this family. Now includes JAKs.	201
197636	smart00297	BROMO	bromo domain. 	107
214605	smart00298	CHROMO	Chromatin organization modifier domain. 	55
128594	smart00299	CLH	Clathrin heavy chain repeat homology. 	140
197638	smart00300	ChSh	Chromo Shadow Domain. 	61
214606	smart00301	DM	Doublesex DNA-binding motif. 	54
128597	smart00302	GED	Dynamin GTPase effector domain. 	92
197639	smart00303	GPS	G-protein-coupled receptor proteolytic site domain. Present in latrophilin/CL-1, sea urchin REJ and polycystin.	49
197640	smart00304	HAMP	HAMP (Histidine kinases, Adenylyl cyclases, Methyl binding proteins, Phosphatases) domain. 	53
197641	smart00305	HintC	Hint (Hedgehog/Intein) domain C-terminal region. Hedgehog/Intein domain, C-terminal region. Domain has been split to accommodate large insertions of endonucleases.	46
197642	smart00306	HintN	Hint (Hedgehog/Intein) domain N-terminal region. Hedgehog/Intein domain, N-terminal region. Domain has been split to accommodate large insertions of endonucleases.	100
214607	smart00307	ILWEQ	I/LWEQ domain. Thought to possess an F-actin binding function.	200
214608	smart00308	LH2	Lipoxygenase homology 2 (beta barrel) domain. 	105
197643	smart00309	PAH	Pancreatic hormones / neuropeptide F / peptide YY family. Pancreatic hormone is a regulator of pancreatic and gastrointestinal functions.	36
197644	smart00310	PTBI	Phosphotyrosine-binding domain (IRS1-like). 	99
214609	smart00311	PWI	PWI, domain in splicing factors. 	74
214610	smart00312	PX	PhoX homologous domain, present in p47phox and p40phox. Eukaryotic domain of unknown function present in phox proteins, PLD isoforms, a PI3K isoform.	105
214611	smart00313	PXA	Domain associated with PX domains. unpubl. observations	176
214612	smart00314	RA	Ras association (RalGDS/AF-6) domain. RasGTP effectors (in cases of AF6, canoe and RalGDS); putative RasGTP effectors in other cases. Kalhammer et al. have shown that not all RA domains bind RasGTP. Predicted structure similar to that determined, and that of the RasGTP-binding domain of Raf kinase. Predicted RA domains in PLC210 and nore1 found to bind RasGTP. Included outliers (Grb7, Grb14, adenylyl cyclases etc.)	90
214613	smart00315	RGS	Regulator of G protein signalling domain. RGS family members are GTPase-activating proteins for heterotrimeric G-protein alpha-subunits.	118
197648	smart00316	S1	Ribosomal protein S1-like RNA-binding domain. 	72
214614	smart00317	SET	SET (Su(var)3-9, Enhancer-of-zeste, Trithorax) domain. Putative methyl transferase, based on outlier plant homologues	124
214615	smart00318	SNc	Staphylococcal nuclease homologues. 	137
128614	smart00319	TarH	Homologues of the ligand binding domain of Tar. Homologues of the ligand binding domain of the wild-type bacterial aspartate receptor, Tar.	135
197651	smart00320	WD40	WD40 repeats. Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.	40
214616	smart00321	WSC	present in yeast cell wall integrity and stress response component proteins. Domain present in WSC proteins, polycystin and fungal exoglucanase	95
197652	smart00322	KH	K homology RNA-binding domain. 	68
214617	smart00323	RasGAP	GTPase-activator protein for Ras-like GTPases. All alpha-helical domain that accelerates the GTPase activity of Ras, thereby "switching" it into an "off" position. Improved domain limits from structure.	344
214618	smart00324	RhoGAP	GTPase-activator protein for Rho-like GTPases. GTPase activator proteins towards Rho/Rac/Cdc42-like small GTPases. Better domain limits and outliers.	174
214619	smart00325	RhoGEF	Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases. Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases. Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. Improved coverage.	180
214620	smart00326	SH3	Src homology 3 domains. Src homology 3 (SH3) domains bind to target proteins through sequences containing proline and hydrophobic amino acids. Pro-containing polypeptides may bind to SH3 domains in 2 different binding orientations.	56
214621	smart00327	VWA	von Willebrand factor (vWF) type A domain. VWA domains in extracellular eukaryotic proteins mediate adhesion via metal ion-dependent adhesion sites (MIDAS). Intracellular VWA domains and homologues in prokaryotes have recently been identified. The proposed VWA domains in integrin beta subunits have recently been substantiated using sequence-based methods.	175
214622	smart00328	BPI1	BPI/LBP/CETP N-terminal domain. Bactericidal permeability-increasing protein (BPI) / Lipopolysaccharide-binding protein (LBP) / Cholesteryl ester transfer protein (CETP) N-terminal domain	225
128624	smart00329	BPI2	BPI/LBP/CETP C-terminal domain. Bactericidal permeability-increasing protein (BPI) / Lipopolysaccharide-binding protein (LBP) / Cholesteryl ester transfer protein (CETP) C-terminal domain	202
214623	smart00330	PIPKc	Phosphatidylinositol phosphate kinases. 	342
214624	smart00331	PP2C_SIG	Sigma factor PP2C-like phosphatases. 	193
214625	smart00332	PP2Cc	Serine/threonine phosphatases, family 2C, catalytic domain. The protein architecture and deduced catalytic mechanism of PP2C phosphatases are similar to the PP1, PP2A, PP2B family of protein Ser/Thr phosphatases, with which PP2C shares no sequence similarity.	252
197660	smart00333	TUDOR	Tudor domain. Domain of unknown function present in several RNA-binding proteins. 10 copies in the Drosophila Tudor protein. The initial proposal that the survival motor neuron gene product contains a Tudor domain is corroborated by more recent database search techniques such as PSI-BLAST (unpublished).	57
197661	smart00335	ANX	Annexin repeats. 	53
197662	smart00336	BBOX	B-Box-type zinc finger. 	42
214626	smart00337	BCL	BCL (B-Cell lymphoma); contains BH1, BH2 regions. (BH1, BH2, (BH3 (one helix only)) and not BH4(one helix only)). Involved in apoptosis regulation	100
197664	smart00338	BRLZ	basic region leucine zipper. 	65
214627	smart00339	FH	FORKHEAD. FORKHEAD, also known as a "winged helix"	89
128634	smart00340	HALZ	homeobox associated leucine zipper. 	44
128635	smart00341	HRDC	Helicase and RNase D C-terminal. Hypothetical role in nucleic acid binding. Mutations in the HRDC domain cause human disease.	81
197666	smart00342	HTH_ARAC	helix_turn_helix, arabinose operon control protein. 	84
197667	smart00343	ZnF_C2HC	zinc finger. 	17
214628	smart00344	HTH_ASNC	helix_turn_helix ASNC type. AsnC: an autogenously regulated activator of asparagine synthetase A transcription in Escherichia coli.	108
197669	smart00345	HTH_GNTR	helix_turn_helix gluconate operon transcriptional repressor. 	60
214629	smart00346	HTH_ICLR	helix_turn_helix isocitrate lyase regulation. 	91
197670	smart00347	HTH_MARR	helix_turn_helix multiple antibiotic resistance protein. 	101
128642	smart00348	IRF	interferon regulatory factor. interferon regulatory factor, also known as tryptophan pentad repeat	107
214630	smart00349	KRAB	krueppel associated box. 	61
214631	smart00350	MCM	minichromosome maintenance proteins. 	509
128645	smart00351	PAX	Paired Box domain. 	125
197673	smart00352	POU	Found in Pit-Oct-Unc transcription factors. 	75
197674	smart00353	HLH	helix loop helix domain. 	53
197675	smart00354	HTH_LACI	helix_turn_helix lactose operon repressor. 	70
197676	smart00355	ZnF_C2H2	zinc finger. 	23
214632	smart00356	ZnF_C3H1	zinc finger. 	27
214633	smart00357	CSP	Cold shock protein domain. RNA-binding domain that functions as a RNA-chaperone in bacteria and is involved in regulating translation in eukaryotes. Contains sub-family of RNA-binding domains in the Rho transcription termination factor.	64
214634	smart00358	DSRM	Double-stranded RNA binding motif. 	67
214635	smart00359	PUA	Putative RNA-binding Domain in PseudoUridine synthase and Archaeosine transglycosylase. 	76
214636	smart00360	RRM	RNA recognition motif. 	73
214637	smart00361	RRM_1	RNA recognition motif. 	70
214638	smart00363	S4	S4 RNA-binding domain. 	60
214639	smart00364	LRR_BAC	Leucine-rich repeats, bacterial type. 	20
197684	smart00365	LRR_SD22	Leucine-rich repeat, SDS22-like subfamily. 	22
197685	smart00367	LRR_CC	Leucine-rich repeat - CC (cysteine-containing) subfamily. 	26
197686	smart00368	LRR_RI	Leucine rich repeat, ribonuclease inhibitor type. 	28
197687	smart00369	LRR_TYP	Leucine-rich repeats, typical (most populated) subfamily. 	24
197688	smart00370	LRR	Leucine-rich repeats, outliers. 	24
197689	smart00380	AP2	DNA-binding domain in plant proteins such as APETALA2 and EREBPs. 	64
214640	smart00382	AAA	ATPases associated with a variety of cellular activities. AAA - ATPases associated with a variety of cellular activities. This profile/alignment only detects a fraction of this vast family. The poorly conserved N-terminal helix is missing from the alignment.	148
197691	smart00384	AT_hook	DNA binding domain with preference for A/T rich regions. Small DNA-binding motif first described in the high mobility group non-histone chromosomal protein HMG-I(Y).	13
214641	smart00385	CYCLIN	domain present in cyclins, TFIIB and Retinoblastoma. A helical domain present in cyclins and TFIIB (twice) and Retinoblastoma (once). A protein recognition domain functioning in cell-cycle and transcription control.	83
214642	smart00386	HAT	HAT (Half-A-TPR) repeats. Present in several RNA-binding proteins. Structurally and sequentially thought to be similar to TPRs.	33
214643	smart00387	HATPase_c	Histidine kinase-like ATPases. Histidine kinase-, DNA gyrase B-, phytochrome-like ATPases.	111
214644	smart00388	HisKA	His Kinase A (phosphoacceptor) domain. Dimerisation and phosphoacceptor domain of histidine kinases.	66
197696	smart00389	HOX	Homeodomain. DNA-binding factors that are involved in the transcriptional regulation of key developmental processes	57
214645	smart00390	GoLoco	LGN motif, putative GEFs specific for G-alpha GTPases. GEF specific for Galpha_i proteins	23
128673	smart00391	MBD	Methyl-CpG binding domain. Methyl-CpG binding domain, also known as the TAM (TTF-IIP5, ARBP, MeCP1) domain	77
214646	smart00392	PROF	Profilin. Binds actin monomers, membrane polyphosphoinositides and poly-L-proline.	129
214647	smart00393	R3H	Putative single-stranded nucleic acids-binding domain. 	79
197697	smart00394	RIIa	RIIalpha, Regulatory subunit portion of type II PKA R-subunit. RIIalpha, Regulatory subunit portion of type II PKA R-subunit. Contains dimerisation interface and binding site for A-kinase-anchoring proteins (AKAPs).	38
197698	smart00396	ZnF_UBR1	Putative zinc finger in N-recognin, a recognition component of the N-end rule pathway. Domain is involved in recognition of N-end rule substrates in yeast Ubr1p	71
197699	smart00397	t_SNARE	Helical region found in SNAREs. All alpha-helical motifs that form twisted and parallel four-helix bundles in target soluble N-ethylmaleimide-sensitive factor (NSF) attachment protein (SNAP) receptor proteins. This motif found in "Q-SNAREs".	66
197700	smart00398	HMG	high mobility group. 	70
197701	smart00399	ZnF_C4	c4 zinc finger in nuclear hormone receptors. 	70
128681	smart00400	ZnF_CHCC	zinc finger. 	55
214648	smart00401	ZnF_GATA	zinc finger binding to DNA consensus sequence [AT]GATA[AG]. 	52
214649	smart00404	PTPc_motif	Protein tyrosine phosphatase, catalytic domain motif. 	105
214650	smart00406	IGv	Immunoglobulin V-Type. 	81
214651	smart00407	IGc1	Immunoglobulin C-Type. 	75
197706	smart00408	IGc2	Immunoglobulin C-2 Type. 	63
214652	smart00409	IG	Immunoglobulin. 	85
214653	smart00410	IG_like	Immunoglobulin like. IG domains that cannot be classified into one of IGv1, IGc1, IGc2, IG.	85
197709	smart00411	BHL	bacterial (prokaryotic) histone like domain. 	90
128690	smart00412	Cu_FIST	Copper-Fist. binds DNA only in the presence of copper or silver	39
197710	smart00413	ETS	erythroblast transformation specific domain. variation of the helix-turn-helix motif	87
197711	smart00414	H2A	Histone 2A. 	106
214654	smart00415	HSF	heat shock factor. 	105
128694	smart00417	H4	Histone H4. 	74
197713	smart00418	HTH_ARSR	helix_turn_helix, Arsenical Resistance Operon Repressor. 	66
128696	smart00419	HTH_CRP	helix_turn_helix, cAMP Regulatory protein. 	48
197714	smart00420	HTH_DEOR	helix_turn_helix, Deoxyribose operon repressor. 	53
197715	smart00421	HTH_LUXR	helix_turn_helix, Lux Regulon. lux regulon (activates the bioluminescence operon)	58
197716	smart00422	HTH_MERR	helix_turn_helix, mercury resistance. 	70
214655	smart00423	PSI	domain found in Plexins, Semaphorins and Integrins. 	47
128701	smart00424	STE	STE like transcription factors. 	111
214656	smart00425	TBOX	Domain first found in the mice T locus (Brachyury) protein. 	190
128703	smart00426	TEA	TEA domain. 	68
197718	smart00427	H2B	Histone H2B. 	97
128705	smart00428	H3	Histone H3. 	105
214657	smart00429	IPT	ig-like, plexins, transcription factors. 	90
214658	smart00430	HOLI	Ligand binding domain of hormone receptors. 	163
128708	smart00431	SCAN	leucine rich region. 	113
197721	smart00432	MADS	MADS domain. 	59
214659	smart00433	TOP2c	TopoisomeraseII. Eukaryotic DNA topoisomerase II, GyrB, ParE	594
214660	smart00434	TOP4c	DNA Topoisomerase IV. Bacterial DNA topoisomerase IV, GyrA, ParC	444
214661	smart00435	TOPEUc	DNA Topoisomerase I (eukaryota). DNA Topoisomerase I (eukaryota), DNA topoisomerase V, Vaccinia virus topoisomerase, Variola virus topoisomerase, Shope fibroma virus topoisomerase	391
214662	smart00436	TOP1Bc	Bacterial DNA topoisomerase I ATP-binding domain. Extension of TOPRIM in Bacterial DNA topoisomerase I and III, Eukaryotic DNA topoisomerase III, reverse gyrase beta subunit	89
214663	smart00437	TOP1Ac	Bacterial DNA topoisomerase I DNA-binding domain. Bacterial DNA topoisomerase I and III, Eukaryotic DNA topoisomerase III, reverse gyrase alpha subunit	259
128715	smart00438	ZnF_NFX	Repressor of transcription. 	20
214664	smart00439	BAH	Bromo adjacent homology domain. 	121
128717	smart00440	ZnF_C2C2	C2C2 Zinc finger. Nucleic-acid-binding motif in transcriptional elongation factor TFIIS and RNA polymerases.	40
128718	smart00441	FF	Contains two conserved F residues. A novel motif that often accompanies WW domains. Often contains two conserved Phe (F) residues.	55
214665	smart00442	FGF	Acidic and basic fibroblast growth factor family. Mitogens that stimulate growth or differentiation of cells of mesodermal or neuroectodermal origin. The family play essential roles in patterning and differentiation during vertebrate embryogenesis, and have neurotrophic activities.	126
197727	smart00443	G_patch	glycine rich nucleic binding domain. A predicted glycine rich nucleic binding domain found in the splicing factor 45, SON DNA binding protein and D-type Retrovirus- polyproteins.	47
214666	smart00444	GYF	Contains conserved Gly-Tyr-Phe residues. Proline-binding domain in CD2-binding protein. Contains conserved Gly-Tyr-Phe residues.	56
214667	smart00445	LINK	Link (Hyaluronan-binding). 	94
197729	smart00446	LRRcap	occurring C-terminal to leucine-rich repeats. A motif occurring C-terminal to leucine-rich repeats in "sds22-like" and "typical" LRR-containing proteins.	19
214668	smart00448	REC	cheY-homologous receiver domain. CheY regulates the clockwise rotation of E. coli flagellar motors. This domain contains a phosphoacceptor site that is phosphorylated by histidine kinase homologues.	55
214669	smart00449	SPRY	Domain in SPla and the RYanodine Receptor. Domain of unknown function. Distant homologues are domains in butyrophilin/marenostrin/pyrin homologues.	122
197731	smart00450	RHOD	Rhodanese Homology Domain. An alpha beta fold found duplicated in the Rhodanese protein. The cysteine-containing, enzymatically active version of the domain is also found in the CDC25 class of protein phosphatases and a variety of proteins such as sulfide dehydrogenases and stress proteins such as Senescence specific protein 1 in plants, PspE and GlpE in bacteria and cyanide and arsenate resistance proteins. Inactive versions with a loss of the cysteine are also seen in Dual specificity phosphatases, ubiquitin hydrolases from yeast and in sulfuryltransferases. These are likely to play a role in protein interactions.	100
197732	smart00451	ZnF_U1	U1-like zinc finger. Family of C2H2-type zinc fingers, present in matrin, U1 small nuclear ribonucleoprotein C and other RNA-binding proteins.	35
214670	smart00452	STI	Soybean trypsin inhibitor (Kunitz) family of protease inhibitors. 	172
197734	smart00453	WSN	Worm-specific (usually) N-terminal domain. 	69
197735	smart00454	SAM	Sterile alpha motif. Widespread domain in signalling and nuclear proteins. In EPH-related tyrosine kinases, appears to mediate cell-cell initiated signal transduction via the binding of SH2-containing proteins to a conserved tyrosine that is phosphorylated. In many cases mediates homodimerisation.	68
128731	smart00455	RBD	Raf-like Ras-binding domain. 	70
197736	smart00456	WW	Domain with 2 conserved Trp (W) residues. Also known as the WWP or rsp5 domain. Binds proline-rich polypeptides.	33
214671	smart00457	MACPF	membrane-attack complex / perforin. 	195
214672	smart00458	RICIN	Ricin-type beta-trefoil. Carbohydrate-binding domain formed from presumed gene triplication.	118
128735	smart00459	Sorb	Sorbin homologous domain. First found in the peptide hormone sorbin and later in the ponsin/ArgBP2/vinexin family of proteins.	50
214673	smart00460	TGc	Transglutaminase/protease-like homologues. Transglutaminases are enzymes that establish covalent links between proteins. A subset of transglutaminase homologues appear to catalyse the reverse reaction, the hydrolysis of peptide bonds. Proteins with this domain are both extracellular and intracellular, and it is likely that the eukaryotic intracellular proteins are involved in signalling events.	68
214674	smart00461	WH1	WASP homology region 1. Region of the Wiskott-Aldrich syndrome protein (WASp) that contains point mutations in the majority of patients with WAS. Unknown function. Ena-like WH1 domains bind polyproline-containing peptides, and Homer contains a WH1 domain.	106
214675	smart00462	PTB	Phosphotyrosine-binding domain, phosphotyrosine-interaction (PI) domain. PTB/PI domain structure similar to those of pleckstrin homology (PH) and IRS-1-like PTB domains.	134
214676	smart00463	SMR	Small MutS-related domain. 	80
197740	smart00464	LON	Found in ATP-dependent protease La (LON). N-terminal domain of the ATP-dependent protease La (LON), present also in other bacterial ORFs.	92
214677	smart00465	GIYc	GIY-YIG type nucleases (URI domain). 	84
197742	smart00466	SRA	SET and RING finger associated domain. Domain of unknown function in SET domain containing proteins and in Deinococcus radiodurans DRA1533.	155
197743	smart00467	GS	GS motif. An approx. 30 amino acid motif that precedes the kinase domain in types I and II TGF beta receptors. Mutation of two or more of the serines or threonines in the TTSGSGSG of TGF-beta type I receptor impairs phosphorylation and signaling activity.	30
128744	smart00468	PreSET	N-terminal to some SET domains. A Cys-rich putative Zn2+-binding domain that occurs N-terminal to some SET domains. Function is unknown. Unpublished.	98
128745	smart00469	WIF	Wnt-inhibitory factor-1 like domain. Occurs as extracellular domain in metazoan Ryk receptor tyrosine kinases. C. elegans Ryk is required for cell-cuticle recognition. WIF-1 binds to Wnt and inhibits its activity.	136
214678	smart00470	ParB	ParB-like nuclease domain. Plasmid RK2 ParB preferentially cleaves single-stranded DNA. ParB also nicks supercoiled plasmid DNA preferably at sites with potential single-stranded character, like AT-rich regions and sequences that can form cruciform structures. ParB also exhibits 5'-->3' exonuclease activity.	89
214679	smart00471	HDc	Metal dependent phosphohydrolases with conserved 'HD' motif. Includes eukaryotic cyclic nucleotide phosphodiesterases (PDEc). This profile/HMM does not detect HD homologues in bacterial glycine aminoacyl-tRNA synthetases (beta subunit).	124
197746	smart00472	MIR	Domain in ryanodine and inositol trisphosphate receptors and protein O-mannosyltransferases. 	57
214680	smart00473	PAN_AP	divergent subfamily of APPLE domains. Apple-like domains present in Plasminogen, C. elegans hypothetical ORFs and the extracellular portion of plant receptor-like protein kinases. Predicted to possess protein- and/or carbohydrate-binding functions.	78
214681	smart00474	35EXOc	3'-5' exonuclease. 3'-5' exonuclease proofreading domain present in DNA polymerase I, Werner syndrome helicase, RNase D and other enzymes	172
214682	smart00475	53EXOc	5'-3' exonuclease. 	259
128752	smart00476	DNaseIc	deoxyribonuclease I. Deoxyribonuclease I catalyzes the endonucleolytic cleavage of double-stranded DNA. The enzyme is secreted outside the cell and also involved in apoptosis in the nucleus.	276
214683	smart00477	NUC	DNA/RNA non-specific endonuclease. prokaryotic and eukaryotic double- and single-stranded DNA and RNA endonucleases also present in phosphodiesterases	210
214684	smart00478	ENDO3c	endonuclease III. includes endonuclease III (DNA-(apurinic or apyrimidinic site) lyase), alkylbase DNA glycosidases (Alka-family) and other DNA glycosidases	149
214685	smart00479	EXOIII	exonuclease domain in DNA-polymerase alpha and epsilon chain, ribonuclease T and other exonucleases. 	169
214686	smart00480	POL3Bc	DNA polymerase III beta subunit. 	345
197753	smart00481	POLIIIAc	DNA polymerase alpha chain like domain. DNA polymerase alpha chain like domain, incl. family of hypothetical proteins	67
214687	smart00482	POLAc	DNA polymerase A domain. 	207
214688	smart00483	POLXc	DNA polymerase X family. includes vertebrate polymerase beta and terminal deoxynucleotidyltransferases	334
214689	smart00484	XPGI	Xeroderma pigmentosum G I-region. domain in nucleases	73
214690	smart00485	XPGN	Xeroderma pigmentosum G N-region. domain in nucleases	99
214691	smart00486	POLBc	DNA polymerase type-B family. DNA polymerase alpha, delta, epsilon and zeta chain (eukaryota), DNA polymerases in archaea, DNA polymerase II in E. coli, mitochondrial DNA polymerases and virus DNA polymerases	474
214692	smart00487	DEXDc	DEAD-like helicases superfamily. 	201
214693	smart00488	DEXDc2	DEAD-like helicases superfamily. 	289
197757	smart00490	HELICc	helicase superfamily c-terminal domain. 	82
214694	smart00491	HELICc2	helicase superfamily c-terminal domain. 	142
214695	smart00493	TOPRIM	topoisomerases, DnaG-type primases, OLD family nucleases and RecR proteins. 	75
214696	smart00494	ChtBD2	Chitin-binding domain type 2. 	49
197760	smart00495	ChtBD3	Chitin-binding domain type 3. 	41
128772	smart00496	IENR2	Intron-encoded nuclease repeat 2. Short helical motif of unknown function (unpublished results).	17
197761	smart00497	IENR1	Intron encoded nuclease repeat motif. Repeat of unknown function, but possibly DNA-binding via helix-turn-helix motif (Ponting, unpublished).	53
214697	smart00498	FH2	Formin Homology 2 Domain. FH proteins control rearrangements of the actin cytoskeleton, especially in the context of cytokinesis and cell polarisation. Members of this family have been found to interact with Rho-GTPases, profilin and other actin-associated proteins. These interactions are mediated by the proline-rich FH1 domain, usually located in front of FH2 (but not listed in SMART). Despite this cytosolic function, vertebrate formins have been assigned functions within the nucleus. A set of Formin-Binding Proteins (FBPs) has been shown to bind FH1 with their WW domain.	392
214698	smart00499	AAI	Plant lipid transfer protein / seed storage protein / trypsin-alpha amylase inhibitor domain family. 	79
128776	smart00500	SFM	Splicing Factor Motif, present in Prp18 and Pr04. 	44
128777	smart00501	BRIGHT	BRIGHT, ARID (A/T-rich interaction domain) domain. DNA-binding domain containing a helix-turn-helix structure	93
128778	smart00502	BBC	B-Box C-terminal domain. Coiled coil region C-terminal to (some) B-Box domains	127
214699	smart00503	SynN	Syntaxin N-terminal domain. Three-helix domain that (in Sso1p) slows the rate of its reaction with the SNAP-25 homologue Sec9p	117
128780	smart00504	Ubox	Modified RING finger domain. Modified RING finger domain, without the full complement of Zn2+-binding ligands. Probable involvement in E2-dependent ubiquitination.	63
214700	smart00505	Knot1	Knottins. Knottins, representing plant lectins/antimicrobial peptides, plant proteinase/amylase inhibitors, plant gamma-thionins and arthropod defensins.	45
214701	smart00506	A1pp	Appr-1"-p processing enzyme. Function determined by Martzen et al. Extended family detected by reciprocal PSI-BLAST searches (unpublished results, and Pehrson & Fuji).	133
214702	smart00507	HNHc	HNH nucleases. 	52
214703	smart00508	PostSET	Cysteine-rich motif following a subset of SET domains. 	17
197766	smart00509	TFS2N	Domain in the N-terminus of transcription elongation factor S-II (and elsewhere). 	75
128786	smart00510	TFS2M	Domain in the central regions of transcription elongation factor S-II (and elsewhere). 	102
128787	smart00511	ORANGE	Orange domain. This domain confers specificity among members of the Hairy/E(SPL) family.	45
214704	smart00512	Skp1	Found in Skp1 protein family. Family of Skp1 (kinetochore protein required for cell cycle progression) and elongin C (subunit of RNA polymerase II transcription factor SIII) homologues.	104
128789	smart00513	SAP	Putative DNA-binding (bihelical) motif predicted to be involved in chromosomal organisation. 	35
214705	smart00515	eIF5C	Domain at the C-termini of GCD6, eIF-2B epsilon, eIF-4 gamma and eIF-5. 	83
214706	smart00516	SEC14	Domain in homologues of a S. cerevisiae phosphatidylinositol transfer protein (Sec14p). Domain in homologues of a S. cerevisiae phosphatidylinositol transfer protein (Sec14p) and in RhoGAPs, RhoGEFs and the RasGAP, neurofibromin (NF1). Lipid-binding domain. The SEC14 domain of Dbl is known to associate with G protein beta/gamma subunits.	158
197769	smart00517	PolyA	C-terminal domain of Poly(A)-binding protein. Present also in Drosophila hyperplastics discs protein. Involved in homodimerisation (either directly or indirectly)	64
214707	smart00518	AP2Ec	AP endonuclease family 2. These endonucleases play a role in DNA repair. Cleave phosphodiester bonds at apurinic or apyrimidinic sites	273
128794	smart00520	BASIC	Basic domain in HLH proteins of MYOD family. 	91
128795	smart00521	CBF	CCAAT-Binding transcription Factor. 	62
214708	smart00523	DWA	Domain A in dwarfin family proteins. 	109
197770	smart00524	DWB	Domain B in dwarfin family proteins. 	171
197771	smart00525	FES	iron-sulphur binding domain in DNA-(apurinic or apyrimidinic site) lyase (subfamily of ENDO3). 	21
197772	smart00526	H15	Domain in histone families 1 and 5. 	66
197773	smart00527	HMG17	domain in high mobility group proteins HMG14 and HMG17. 	88
128801	smart00528	HNS	Domain in histone-like proteins of HNS family. 	46
197774	smart00529	HTH_DTXR	Helix-turn-helix diphtheria tox regulatory element. iron dependent repressor	95
197775	smart00530	HTH_XRE	Helix-turn-helix XRE-family like proteins. 	56
128804	smart00531	TFIIE	Transcription initiation factor IIE. 	147
214709	smart00532	LIGANc	Ligase N family. 	441
214710	smart00533	MUTSd	DNA-binding domain of DNA mismatch repair MUTS family. 	308
197777	smart00534	MUTSac	ATPase domain of DNA mismatch repair MUTS family. 	185
197778	smart00535	RIBOc	Ribonuclease III family. 	129
197779	smart00536	AXH	domain in Ataxins and HMG containing proteins. unknown function	116
214711	smart00537	DCX	Domain in the Doublecortin (DCX) gene product. Tandemly-repeated domain in doublin, the Doublecortin gene product. Proposed to bind tubulin. Doublecortin (DCX) is mutated in human X-linked neuronal migration defects.	89
197780	smart00538	POP4	A domain found in a protein subunit of human RNase MRP and RNase P ribonucleoprotein complexes and archaeal proteins. 	92
214712	smart00539	NIDO	Extracellular domain of unknown function in nidogen (entactin) and hypothetical proteins. 	152
128813	smart00540	LEM	in nuclear membrane-associated proteins. LEM, domain in nuclear membrane-associated proteins, including lamina-associated polypeptide 2 and emerin.	44
128814	smart00541	FYRN	FY-rich domain, N-terminal region. is sometimes closely juxtaposed with the C-terminal region (FYRC), but sometimes is far distant. Unknown function, but occurs frequently in chromatin-associated proteins.	44
197781	smart00542	FYRC	FY-rich domain, C-terminal region. is sometimes closely juxtaposed with the N-terminal region (FYRN), but sometimes is far distant. Unknown function, but occurs frequently in chromatin-associated proteins.	86
214713	smart00543	MIF4G	Middle domain of eukaryotic initiation factor 4G (eIF4G). Also occurs in NMD2p and CBP80. The domain is rich in alpha-helices and may contain multiple alpha-helical repeats. In eIF4G, this domain binds eIF4A, eIF3, RNA and DNA. Ponting (TiBS) "Novel eIF4G domain homologues" (in press)	200
214714	smart00544	MA3	Domain in DAP-5, eIF4G, MA-3 and other proteins. Highly alpha-helical. May contain repeats and/or regions similar to MIF4G domains Ponting (TIBS) "Novel eIF4G domain homologues" in press	113
128818	smart00545	JmjN	Small domain found in the jumonji family of transcription factors. To date, this domain always co-occurs with the JmjC domain (although the reverse is not true).	42
214715	smart00546	CUE	Domain that may be involved in binding ubiquitin-conjugating enzymes (UBCs). CUE domains also occur in two proteins of the IL-1 signal transduction pathway, tollip and TAB2. Ponting (Biochem. J.) "Proteins of the Endoplasmic reticulum" (in press)	43
197784	smart00547	ZnF_RBZ	Zinc finger domain. Zinc finger domain in Ran-binding proteins (RanBPs), and other proteins. In RanBPs, this domain binds RanGDP.	25
214716	smart00548	IRO	Motif in Iroquois-class homeodomain proteins (only). Unknown function. 	18
197785	smart00549	TAFH	TAF homology. Domain in Drosophila nervy, CBFA2T1, human TAF105, human TAF130, and Drosophila TAF110. Also known as nervy homology region 1 (NHR1).	92
128823	smart00550	Zalpha	Z-DNA-binding domain in adenosine deaminases. Helix-turn-helix-containing domain. Also known as Zab.	68
214717	smart00551	ZnF_TAZ	TAZ zinc finger, present in p300 and CBP. 	79
214718	smart00552	ADEAMc	tRNA-specific and double-stranded RNA adenosine deaminase (RNA-specific editase). 	374
197786	smart00553	SEP	Domain present in Saccharomyces cerevisiae Shp1, Drosophila melanogaster eyes closed gene (eyc), and vertebrate p47. 	93
214719	smart00554	FAS1	Four repeated domains in the Fasciclin I family of proteins, present in many other contexts. 	97
128828	smart00555	GIT	Helical motif in the GIT family of ADP-ribosylation factor GTPase-activating proteins. Helical motif in the GIT family of ADP-ribosylation factor GTPase-activating proteins, and in yeast Spa2p and Sph1p (CPP; unpublished results). In p95-APP1 the N-terminal GIT motif might be involved in binding PIX.	31
214720	smart00557	IG_FLMN	Filamin-type immunoglobulin domains. These form a rod-like structure in the actin-binding cytoskeleton protein, filamin. The C-terminal repeats of filamin bind beta1-integrin (CD29).	93
214721	smart00558	JmjC	A domain family that is part of the cupin metalloenzyme superfamily. Probable enzymes, but of unknown functions, that regulate chromatin reorganisation processes (Clissold and Ponting, in press).	58
128831	smart00559	Ku78	Ku70 and Ku80 are 70kDa and 80kDa subunits of the Lupus Ku autoantigen. This is a single stranded DNA- and ATP-dependent helicase that has a role in chromosome translocation. This is a domain of unknown function C-terminal to its von Willebrand factor A domain, that also occurs in bacterial hypothetical proteins.	140
214722	smart00560	LamGL	LamG-like jellyroll fold domain. 	133
214723	smart00561	MBT	Present in Drosophila Scm, l(3)mbt, and vertebrate SCML2. Present in Drosophila Scm, l(3)mbt, and vertebrate SCML2. These proteins are involved in transcriptional regulation.	96
197791	smart00562	NDK	Enzymes that catalyze nonsubstrate specific conversions of nucleoside diphosphates to nucleoside triphosphates. These enzymes play important roles in bacterial growth, signal transduction and pathogenicity.	135
214724	smart00563	PlsC	Phosphate acyltransferases. Function in phospholipid biosynthesis and have either glycerolphosphate, 1-acylglycerolphosphate, or 2-acylglycerolphosphoethanolamine acyltransferase activities. Tafazzin, the product of the gene mutated in patients with Barth syndrome, is a member of this family.	118
128836	smart00564	PQQ	beta-propeller repeat. Beta-propeller repeat occurring in enzymes with pyrrolo-quinoline quinone (PQQ) as cofactor, in Ire1p-like Ser/Thr kinases, and in prokaryotic dehydrogenases.	33
128837	smart00567	EZ_HEAT	E-Z type HEAT repeats. Present in subunits of cyanobacterial phycocyanin lyase, and other proteins. Probable scaffolding role.	30
214725	smart00568	GRAM	domain in glucosyltransferases, myotubularins and other putative membrane-associated proteins. 	60
197794	smart00569	L27	domain in receptor targeting proteins Lin-2 and Lin-7. 	53
197795	smart00570	AWS	associated with SET domains. subdomain of PRESET	50
214726	smart00571	DDT	domain in different transcription and chromosome remodeling factors. 	63
128842	smart00572	DZF	domain in DSRM or ZnF_C2H2 domain containing proteins. 	246
214727	smart00573	HSA	domain in helicases and associated with SANT domains. 	73
214728	smart00574	POX	domain associated with HOX domains. 	140
128845	smart00575	ZnF_PMZ	plant mutator transposase zinc finger. 	28
128846	smart00576	BTP	Bromodomain transcription factors and PHD domain containing proteins. subdomain of archaeal histone-like transcription factors	77
214729	smart00577	CPDc	catalytic domain of ctd-like phosphatases. 	148
214730	smart00579	FBD	domain in FBox and BRCT domain containing plant proteins. 	72
197798	smart00580	PUG	domain in protein kinases, N-glycanases and other nuclear proteins. 	57
128850	smart00581	PSP	proline-rich domain in spliceosome associated proteins. 	54
214731	smart00582	RPR	domain present in proteins, which are involved in regulation of nuclear pre-mRNA. 	124
214732	smart00583	SPK	domain in SET and PHD domain containing proteins and protein kinases. 	114
214733	smart00584	TLDc	domain in TBC and LysM domain containing proteins. 	165
128854	smart00586	ZnF_DBF	Zinc finger in DBF-like proteins. 	49
214734	smart00587	CHK	ZnF_C4 and HLH domain containing kinases domain. subfamily of choline kinases	196
128856	smart00588	NEUZ	domain in neuralized proteins. 	123
128857	smart00589	PRY	associated with SPRY domains. 	52
214735	smart00591	RWD	domain in RING finger and WD repeat containing proteins and DEXDc-like helicases subfamily related to the UBCc domain. 	107
197800	smart00592	BRK	domain in transcription and CHROMO domain helicases. 	45
214736	smart00593	RUN	domain involved in Ras-like GTPase signaling. 	64
214737	smart00594	UAS	UAS domain. 	122
214738	smart00595	MADF	subfamily of SANT domain. 	89
128863	smart00596	PRE_C2HC	PRE_C2HC domain. 	69
214739	smart00597	ZnF_TTF	zinc finger in transposases and transcription factors. 	91
214740	smart00602	VPS10	VPS10 domain. 	612
128866	smart00603	LCCL	LCCL domain. 	85
214741	smart00604	MD	MD domain. 	145
214742	smart00605	CW	CW domain. 	94
128869	smart00606	CBD_IV	Cellulose Binding Domain Type IV. 	129
128870	smart00607	FTP	eel-Fucolectin Tachylectin-4 Pentraxin-1 Domain. 	151
214743	smart00608	ACR	ADAM Cysteine-Rich Domain. 	137
197803	smart00609	VIT	Vault protein Inter-alpha-Trypsin domain. 	130
214744	smart00611	SEC63	Domain of unknown function in Sec63p, Brr2p and other proteins. 	312
128874	smart00612	Kelch	Kelch domain. 	47
214745	smart00613	PAW	domain present in PNGases and other hypothetical proteins. present in several copies in proteins with unknown function in C. elegans	89
214746	smart00614	ZnF_BED	BED zinc finger. DNA-binding domain in chromatin-boundary-element-binding proteins and transposases	50
128877	smart00615	EPH_lbd	Ephrin receptor ligand binding domain. 	177
214747	smart00630	Sema	semaphorin domain. 	390
214748	smart00631	Zn_pept	Zn_pept domain. 	277
214749	smart00632	Aamy_C	Aamy_C domain. 	81
214750	smart00633	Glyco_10	Glycosyl hydrolase family 10. 	263
214751	smart00634	BID_1	Bacterial Ig-like domain (group 1). 	92
214752	smart00635	BID_2	Bacterial Ig-like domain 2. 	81
214753	smart00636	Glyco_18	Glyco_18 domain. 	334
214754	smart00637	CBD_II	CBD_II domain. 	92
214755	smart00638	LPD_N	Lipoprotein N-terminal Domain. 	574
214756	smart00639	PSA	Paramecium Surface Antigen Repeat. 	62
214757	smart00640	Glyco_32	Glycosyl hydrolases family 32. 	437
128889	smart00641	Glyco_25	Glycosyl hydrolases family 25. 	109
214758	smart00642	Aamy	Alpha-amylase domain. 	166
214759	smart00643	C345C	Netrin C-terminal Domain. 	114
214760	smart00644	Ami_2	Ami_2 domain. 	126
214761	smart00645	Pept_C1	Papain family cysteine protease. 	175
214762	smart00646	Ami_3	Ami_3 domain. 	113
214763	smart00647	IBR	In Between Ring fingers. the domain occurs between pairs of RING fingers	64
197818	smart00648	SWAP	Suppressor-of-White-APricot splicing regulator. domain present in regulators which are responsible for pre-mRNA splicing processes	54
197819	smart00649	RL11	Ribosomal protein L11/L12. 	132
128898	smart00650	rADc	Ribosomal RNA adenine dimethylases. 	169
197820	smart00651	Sm	snRNP Sm proteins. small nuclear ribonucleoprotein particles (snRNPs) involved in pre-mRNA splicing	67
128900	smart00652	eIF1a	eukaryotic translation initiation factor 1A. 	83
214764	smart00653	eIF2B_5	domain present in translation initiation factor eIF2B and eIF5. 	110
128902	smart00654	eIF6	translation initiation factor 6. 	200
214765	smart00656	Amb_all	Amb_all domain. 	190
128904	smart00657	RPOL4c	DNA-directed RNA-polymerase II subunit. 	118
197821	smart00658	RPOL8c	RNA polymerase subunit 8. subunit of RNA polymerase I, II and III	143
128906	smart00659	RPOLCX	RNA polymerase subunit CX. present in RNA polymerase I, II and III	44
197822	smart00661	RPOL9	RNA polymerase subunit 9. 	52
214766	smart00662	RPOLD	RNA polymerases D. DNA-directed RNA polymerase subunit D and bacterial alpha chain	224
214767	smart00663	RPOLA_N	RNA polymerase I subunit A N-terminus. 	295
214768	smart00664	DoH	Possible catecholamine-binding domain present in a variety of eukaryotic proteins. A predominantly beta-sheet domain present as a regulatory N-terminal domain in dopamine beta-hydroxylase, mono-oxygenase X and SDR2. Its function remains unknown at present (Ponting, Human Molecular Genetics, in press).	148
214769	smart00665	B561	Cytochrome b-561 / ferric reductase transmembrane domain. Cytochrome b-561 recycles ascorbate for the generation of norepinephrine by dopamine-beta-hydroxylase in the chromaffin vesicles of the adrenal gland. It is a transmembrane heme protein with the two heme groups being bound to conserved histidine residues. A cytochrome b-561 homologue, termed Dcytb, is an iron-regulated ferric reductase in the duodenal mucosa. Other homologues of these are also likely to be ferric reductases. SDR2 is proposed to be important in regulating the metabolism of iron in the onset of neurodegenerative disorders.	129
214770	smart00666	PB1	PB1 domain. Phox and Bem1p domain, present in many eukaryotic cytoplasmic signalling proteins. The domain adopts a beta-grasp fold, similar to that found in ubiquitin and Ras-binding domains. A motif, variously termed OPR, PC and AID, represents the most conserved region of the majority of PB1 domains, and is necessary for PB1 domain function. This function is the formation of PB1 domain heterodimers, although not all PB1 domain pairs associate.	81
128913	smart00667	LisH	Lissencephaly type-1-like homology motif. Alpha-helical motif present in Lis1, treacle, Nopp140, some katanin p60 subunits, muskelin, tonneau, LEUNIG and numerous WD40 repeat-containing proteins. It is suggested that LisH motifs contribute to the regulation of microtubule dynamics, either by mediating dimerisation, or else by binding cytoplasmic dynein heavy chain or microtubules directly.	34
128914	smart00668	CTLH	C-terminal to LisH motif. Alpha-helical motif of unknown function.	58
214771	smart00670	PINc	Large family of predicted nucleotide-binding domains. From similarities to 5'-exonucleases, these domains are predicted to be RNases. PINc domains in nematode SMG-5 and yeast NMD4p are predicted to be involved in RNAi.	111
214772	smart00671	SEL1	Sel1-like repeats. These represent a subfamily of TPR (tetratricopeptide repeat) sequences.	36
214773	smart00672	CAP10	Putative lipopolysaccharide-modifying enzyme. 	256
197827	smart00673	CARP	Domain in CAPs (cyclase-associated proteins) and X-linked retinitis pigmentosa 2 gene product. 	38
197828	smart00674	CENPB	Putative DNA-binding domain in centromere protein B, mouse jerky and transposases. 	66
128920	smart00675	DM11	Domains in hypothetical proteins in Drosophila including 2 in CG15241 and CG9329. 	164
128921	smart00676	DM10	Domains in hypothetical proteins in Drosophila, C. elegans and mammals. Occurs singly in some nucleoside diphosphate kinases. 	104
128922	smart00678	WWE	Domain in Deltex and TRIP12 homologues. Possibly involved in regulation of ubiquitin-mediated proteolysis. 	73
128923	smart00679	CTNS	Repeated motif present between transmembrane helices in cystinosin, yeast ERS1p, mannose-P-dolichol utilization defect 1, and other hypothetical proteins. Function unknown, but likely to be associated with the glycosylation machinery.	32
197829	smart00680	CLIP	Clip or disulphide knot domain. Present in horseshoe crab proclotting enzyme N-terminal domain, Drosophila Easter and silkworm prophenoloxidase-activating enzyme.	52
214774	smart00682	G2F	G2 nidogen domain and fibulin. 	227
128926	smart00683	DM16	Repeats in sea squirt COS41.4, worm R01H10.6, fly CG1126 etc. 	55
128927	smart00684	DM15	Tandem repeat in fly CG14066 (La related protein), human KIAA0731 and worm R144.7. Unknown function. 	39
128928	smart00685	DM14	Repeats in fly CG4713, worm Y37H9A.3 and human FLJ20241. 	59
128929	smart00686	DM13	Domain present in fly proteins (CG14681, CG12492, CG6217), worm H06A10.1 and Arabidopsis thaliana MBG8.9. 	108
128930	smart00688	DM7	Domain of unknown function in Drosophila CG15332, CG15333 and CG18293. 	95
214775	smart00689	DM6	Cysteine-rich domain currently specific to Drosophila. 	157
214776	smart00690	DM5	Domain of unknown function, currently peculiar to Drosophila. 	102
128933	smart00692	DM3	Zinc finger domain in CG10631, C. elegans LIN-15B and human P52rIPK. 	59
214777	smart00693	DysFN	Dysferlin domain, N-terminal region. Domain of unknown function present in yeast peroxisomal proteins, dysferlin, myoferlin and hypothetical proteins. Due to an insertion of a dysferlin domain within a second dysferlin domain we have chosen to predict these domains in two parts: the N-terminal region and the C-terminal region.	62
128935	smart00694	DysFC	Dysferlin domain, C-terminal region. Domain of unknown function present in yeast peroxisomal proteins, dysferlin, myoferlin and hypothetical proteins. Due to an insertion of a dysferlin domain within a second dysferlin domain we have chosen to predict these domains in two parts: the N-terminal region and the C-terminal region.	34
197831	smart00695	DUSP	Domain in ubiquitin-specific proteases. 	88
128937	smart00696	DM9	Repeats found in Drosophila proteins. 	71
214778	smart00697	DM8	Repeats found in several Drosophila proteins. 	93
197832	smart00698	MORN	Possible plasma membrane-binding motif in junctophilins, PIP-5-kinases and protein kinases. 	22
214779	smart00700	JHBP	Juvenile hormone binding protein domains in insects. The juvenile hormone exerts pleiotropic functions during insect life cycles and its binding proteins regulate these functions.	224
128941	smart00701	PGRP	Animal peptidoglycan recognition proteins homologous to Bacteriophage T3 lysozyme. The bacteriophage molecule, but not its moth homologue, has been shown to have N-acetylmuramoyl-L-alanine amidase activity. One member of this family, Tag7, is a cytokine.	142
214780	smart00702	P4Hc	Prolyl 4-hydroxylase alpha subunit homologues. Mammalian enzymes catalyse hydroxylation of collagen, for example. Prokaryotic enzymes might catalyse hydroxylation of antibiotic peptides. These are 2-oxoglutarate-dependent dioxygenases, requiring 2-oxoglutarate and dioxygen as cosubstrates and ferrous iron as a cofactor.	165
214781	smart00703	NRF	N-terminal domain in C. elegans NRF-6 (Nose Resistant to Fluoxetine-4) and NDG-4 (resistant to nordihydroguaiaretic acid-4). Also present in several other worm and fly proteins.	110
197836	smart00704	ZnF_CDGSH	CDGSH-type zinc finger. Function unknown. 	38
128945	smart00705	THEG	Repeats in THEG (testicular haploid expressed gene) and several fly proteins. 	20
214782	smart00706	TECPR	Beta propeller repeats in Physarum polycephalum tectonins, Limulus lectin L-6 and animal hypothetical proteins. 	35
128947	smart00707	RPEL	Repeat in Drosophila CG10860, human KIAA0680 and C. elegans F26H9.2. 	26
214783	smart00708	PhBP	Insect pheromone/odorant binding protein domains. 	103
128949	smart00709	Zpr1	Duplicated domain in the epidermal growth factor- and elongation factor-1alpha-binding protein Zpr1. Also present in archaeal proteins. 	160
214784	smart00710	PbH1	Parallel beta-helix repeats. The tertiary structures of pectate lyases and rhamnogalacturonase A show a stack of parallel beta strands that are coiled into a large helix. Each coil of the helix represents a structural repeat that, in some homologues, can be recognised from sequence information alone. Conservation of asparagines might be connected with asparagine-ladders that contribute to the stability of the fold. Proteins containing these repeats most often are enzymes with polysaccharide substrates.	23
197839	smart00711	TDU	Short repeats in human TONDU, fly vestigial and other proteins. Unknown function.	16
197840	smart00712	PUR	DNA/RNA-binding repeats in PUR-alpha/beta/gamma and in hypothetical proteins from spirochetes and the Bacteroides-Cytophaga-Flexibacter bacteria. 	63
128953	smart00713	GYR	Motif of unknown function with conserved Gly, Tyr, Arg tripeptide in Drosophila proteins. 	18
197841	smart00714	LITAF	Possible membrane-associated motif in LPS-induced tumor necrosis factor alpha factor (LITAF), also known as PIG7, and other animal proteins. 	67
128955	smart00715	LA	Domain in the RNA-binding Lupus La protein; unknown function. 	80
197842	smart00717	SANT	SANT SWI3, ADA2, N-CoR and TFIIIB'' DNA-binding domains. 	49
214785	smart00718	DM4_12	DM4/DM12 family of domains in Drosophila melanogaster proteins of unknown function. 	95
197843	smart00719	Plus3	Short conserved domain in transcriptional regulators. Plus3 domains occur in the Saccharomyces cerevisiae Rtf1p protein, which interacts with Spt6p, and in parsley CIP, which interacts with the bZIP protein CPRF1.	109
214786	smart00720	calpain_III	calpain_III domain. 	143
214787	smart00721	BAR	BAR domain. 	239
214788	smart00722	CASH	Domain present in carbohydrate binding proteins and sugar hydrolases. 	153
128962	smart00723	AMOP	Adhesion-associated domain present in MUC4 and other proteins. 	154
214789	smart00724	TLC	TRAM, LAG1 and CLN8 homology domains. Protein domain with at least 5 transmembrane alpha-helices. Lag1p and Lac1p are essential for acyl-CoA-dependent ceramide synthesis, TRAM is a subunit of the translocon and the CLN8 gene is mutated in Northern epilepsy syndrome. The family may possess multiple functions such as lipid trafficking, metabolism, or sensing. Trh homologues possess additional homeobox domains.	205
214790	smart00725	NEAT	NEAr Transporter domain. 	123
197845	smart00726	UIM	Ubiquitin-interacting motif. Present in proteasome subunit S5a and other ubiquitin-associated proteins.	20
128966	smart00727	STI1	Heat shock chaperonin-binding motif. 	41
214791	smart00728	ChW	Clostridial hydrophobic, with a conserved W residue, domain. 	46
214792	smart00729	Elp3	Elongator protein 3, MiaB family, Radical SAM. This superfamily contains MoaA, NifB, PqqE, coproporphyrinogen III oxidase, biotin synthase and MiaB families, and includes a representative in the eukaryotic elongator subunit, Elp-3. Some members of the family are methyltransferases.	216
214793	smart00730	PSN	Presenilin, signal peptide peptidase, family. Presenilin 1 and presenilin 2 are polytopic membrane proteins, whose genes are mutated in some individuals with Alzheimer's disease. Distant homologues, present in eukaryotes and archaea, also contain conserved aspartic acid residues which are predicted to contribute to catalysis. At least one member of this family has been shown to possess signal peptide peptidase activity.	249
214794	smart00731	SprT	SprT homologues. Predicted to have roles in transcription elongation. Contains a conserved HExxH motif, indicating a metalloprotease function.	146
128971	smart00732	YqgFc	Likely ribonuclease with RNase H fold. YqgF proteins are likely to function as an alternative to RuvC in most bacteria, and could be the principal Holliday junction resolvases in low-GC Gram-positive bacteria. In Spt6p orthologues, the catalytic residues are substituted, indicating that they lack enzymatic functions.	99
197848	smart00733	Mterf	Mitochondrial termination factor repeats. Human mitochondrial termination factor is a DNA-binding protein that acts as a transcription termination factor. Six repeats occur in human mTERF, that also are present in numerous plant proteins.	31
128973	smart00734	ZnF_Rad18	Rad18-like CCHC zinc finger. Yeast Rad18p functions with Rad5p in error-free post-replicative DNA repair. This zinc finger is likely to bind nucleic acids.	24
128974	smart00735	ZM	ZASP-like motif. Short motif (26 amino acids) present in an alpha-actinin-binding protein, ZASP, and similar molecules.	26
214795	smart00736	CADG	Dystroglycan-type cadherin-like domains. Cadherin-homologous domains present in metazoan dystroglycans and alpha/epsilon sarcoglycans, yeast Axl2p and in a very large protein from magnetotactic bacteria. Likely to bind calcium ions.	97
214796	smart00737	ML	Domain involved in innate immunity and lipid metabolism. ML (MD-2-related lipid-recognition) is a novel domain identified in MD-1, MD-2, GM2A, Npc2 and multiple proteins of unknown function in plants, animals and fungi. These single-domain proteins were predicted to form a beta-rich fold containing multiple strands, and to mediate diverse biological functions through interacting with specific lipids.	119
197850	smart00738	NGN	In Spt5p, this domain may confer affinity for Spt4p. It possesses an RNP-like fold.	106
128978	smart00739	KOW	KOW (Kyprides, Ouzounis, Woese) motif. Motif in ribosomal proteins, NusG, Spt5p, KIN17 and T54.	28
197851	smart00740	PASTA	PASTA domain. 	67
214797	smart00741	SapB	Saposin (B) Domains. Present in multiple copies in prosaposin and in pulmonary surfactant-associated protein B. In plant aspartic proteinases, a saposin domain is circularly permuted. This causes the prediction algorithm to predict two such domains, where only one is truly present.	76
128981	smart00742	Hr1	Rho effector or protein kinase C-related kinase homology region 1 homologues. Alpha-helical domain found in vertebrate PRK1 and yeast PKC1 protein kinases C. The HR1 in rhophilin binds RhoGTP; those in PRK1 bind RhoA and RhoB. Also called RBD (Rho-binding domain).	57
214798	smart00743	Agenet	Tudor-like domain present in plant sequences. Domain in plant sequences with possible chromatin-associated functions.	59
128983	smart00744	RINGv	The RING-variant domain is a C4HC3 zinc-finger like motif found in a number of cellular and viral proteins. Some of these proteins have been shown both in vivo and in vitro to have ubiquitin E3 ligase activity. The RING-variant domain is reminiscent of both the RING and the PHD domains and may represent an evolutionary intermediate. To describe this domain the term PHD/LAP domain has been used in the past. Extended description: The RING-variant (RINGv) domain contains a C4HC3 zinc-finger-like motif similar to the PHD domain, while some of the spacing between the Cys/His residues follows a pattern somewhat closer to that found in the RING domain. The RINGv domain, similar to the RING, PHD and LIM domains, is thought to bind two zinc ions co-ordinated by the highly conserved Cys and His residues. RING variant domain: C-x(2)-C-x(10-45)-C-x(1)-C-x(7)-H-x(2)-C-x(11-25)-C-x(2)-C. As opposed to a PHD: C-x(1-2)-C-x(7-13)-C-x(2-4)-C-x(4-5)-H-x(2)-C-x(10-21)-C-x(2)-C. Classical RING domain: C-x(2)-C-x(9-39)-C-x(1-3)-H-x(2-3)-C-x(2)-C-x(4-48)-C-x(2)-C.	49
197854	smart00745	MIT	Microtubule Interacting and Trafficking molecule domain. 	77
214799	smart00746	TRASH	metallochaperone-like domain. 	39
128986	smart00747	CFEM	eight cysteine-containing domain present in fungal extracellular membrane proteins. 	65
214800	smart00748	HEPN	Higher Eukaryotes and Prokaryotes Nucleotide-binding domain. 	113
197856	smart00749	BON	bacterial OsmY and nodulation domain. 	61
214801	smart00750	KIND	kinase non-catalytic C-lobe domain. It is an interaction domain identified as being similar to the C-terminal protein kinase catalytic fold (C lobe). Its presence at the N terminus of signalling proteins and the absence of the active-site residues in the catalytic and activation loops suggest that it folds independently and is likely to be non-catalytic. The occurrence of KIND only in metazoa implies that it has evolved from the catalytic protein kinase domain into an interaction domain possibly by keeping the substrate-binding features	176
128990	smart00751	BSD	domain in transcription factors and synapse-associated proteins. 	51
214802	smart00752	HTTM	Horizontally Transferred TransMembrane Domain. Sequence analysis of vitamin K dependent gamma-carboxylases (VKGC) revealed the presence of a novel domain, HTTM (Horizontally Transferred TransMembrane) in its N-terminus. In contrast to most known domains, HTTM contains four transmembrane regions. Its occurrence in eukaryotes, bacteria and archaea is more likely caused by horizontal gene transfer than by early invention. The conservation of VKGC catalytic sites indicates an enzymatic function also for the other family members.	271
214804	smart00754	CHRD	A domain in the BMP inhibitor chordin and in microbial proteins. 	118
197860	smart00755	Grip	golgin-97, RanBP2alpha, Imh1p and p230/golgin-245. 	46
214805	smart00756	VKc	Family of likely enzymes that includes the catalytic subunit of vitamin K epoxide reductase. Bacterial homologues are fused to members of the thioredoxin family of oxidoreductases.	142
214806	smart00757	CRA	CT11-RanBPM. protein-protein interaction domain present in crown eukaryotes (plants, animals, fungi)	99
214807	smart00758	PA14	domain in bacterial beta-glucosidases, other glycosidases, glycosyltransferases, proteases, amidases, yeast adhesins, and bacterial toxins. 	136
128998	smart00759	Flu_M1_C	Influenza Matrix protein (M1) C-terminal domain. This region is thought to be a second domain of the M1 matrix protein.	95
197863	smart00760	Bac_DnaA_C	Bacterial dnaA protein helix-turn-helix domain. Could be involved in DNA-binding.	69
214808	smart00761	HDAC_interact	Histone deacetylase (HDAC) interacting. This domain is found on transcriptional regulators. It forms interactions with histone deacetylases.	102
214809	smart00762	Cog4	COG4 transport protein. This region is found in yeast oligomeric golgi complex component 4 which is involved in ER to Golgi and intra Golgi transport.	324
214810	smart00763	AAA_PrkA	PrkA AAA domain. This is a family of PrkA bacterial and archaeal serine kinases approximately 630 residues long. This is the N-terminal AAA domain.	361
129003	smart00764	Citrate_ly_lig	Citrate lyase ligase C-terminal domain. Proteins of this family contain the C-terminal domain of citrate lyase ligase EC:6.2.1.22.	182
129004	smart00765	MANEC	The MANEC domain, formerly called MANSC. This domain, comprising 8 conserved cysteines, is found in the N terminus of higher multicellular animal membrane and extracellular proteins. It is postulated that this domain may play a role in the formation of protein complexes involving various protease activators and inhibitors. It is possible that some of the cysteine residues in the MANSC domain form structurally important disulfide bridges. All of the MANSC-containing proteins contain predicted transmembrane regions and signal peptides. It has been proposed that the MANSC domain in HAI-1 might function through binding with hepatocyte growth factor activator and matriptase.	93
197866	smart00766	DnaG_DnaB_bind	DNA primase DnaG DnaB-binding. DnaG_DnaB_bind defines a domain of primase required for functional interaction with DnaB that attracts primase to the replication fork. DnaG_DnaB_bind is responsible for the interaction between DnaG and DnaB.	125
214811	smart00767	DCD	DCD is a plant specific domain in proteins involved in development and programmed cell death. The domain is shared by several proteins in the Arabidopsis and the rice genomes, which otherwise show a different protein architecture. Biological studies indicate a role of these proteins in phytohormone response, embryo development and programmed cell death by pathogens or ozone.	132
197867	smart00768	X8	Possibly involved in carbohydrate binding. The X8 domain, which may be involved in carbohydrate binding, is found in an Olive pollen antigen as well as at the C terminus of family 17 glycosyl hydrolases. It contains 6 conserved cysteine residues which presumably form three disulfide bridges.	85
214812	smart00769	WHy	Water Stress and Hypersensitive response. 	100
214813	smart00770	Zn_dep_PLPC	Zinc dependent phospholipase C (alpha toxin). This domain conveys a zinc dependent phospholipase C activity (EC 3.1.4.3). It is found in a monomeric phospholipase C of Bacillus cereus as well as in the alpha toxin of Clostridium perfringens and Clostridium bifermentans, which is involved in haemolysis and cell rupture. It is also found in a lecithinase of Listeria monocytogenes, which is involved in breaking the 2-membrane vacuoles that surround the bacterium. Structure information: PDB 1ca1.	241
129010	smart00771	ZipA_C	ZipA, C-terminal domain (FtsZ-binding). C-terminal domain of ZipA, a component of cell division in E.coli. It interacts with the FtsZ protein in one of the initial steps of septum formation. The structure of this domain is composed of three alpha-helices and a beta-sheet consisting of six antiparallel beta-strands.	131
214814	smart00773	WGR	Proposed nucleic acid binding domain. This domain is named after its most conserved central motif. It is found in a variety of polyA polymerases as well as in molybdate metabolism regulators (e.g. in E.coli) and other proteins of unknown function. The domain is found in isolation in some proteins and is between 70 and 80 residues in length. It is proposed that it may be a nucleic acid binding domain.	84
214815	smart00774	WRKY	DNA binding domain. The WRKY domain is a DNA binding domain found in one or two copies in a superfamily of plant transcription factors. These transcription factors are involved in the regulation of various physiological programs that are unique to plants, including pathogen defense, senescence and trichome development. The domain is a 60 amino acid region that is defined by the conserved amino acid sequence WRKYGQK at its N-terminal end, together with a novel zinc-finger-like motif. It binds specifically to the DNA sequence motif (T)(T)TGAC(C/T), which is known as the W box. The invariant TGAC core is essential for function and WRKY binding.	59
197870	smart00775	LNS2	This domain is found in Saccharomyces cerevisiae protein SMP2, proteins with an N-terminal lipin domain and phosphatidylinositol transfer proteins. SMP2 is involved in plasmid maintenance and respiration. Lipin proteins are involved in adipose tissue development and insulin resistance.	157
214816	smart00776	NPCBM	This novel putative carbohydrate binding module (NPCBM) domain is found at the N-terminus of glycosyl hydrolase family 98 proteins. 	145
214817	smart00777	Mad3_BUB1_I	Mad3/BUB1 homology region 1. Proteins containing this domain are checkpoint proteins involved in cell division. This region has been shown to be essential for the binding of BUB1 and MAD3 to CDC20p.	124
129016	smart00778	Prim_Zn_Ribbon	Zinc-binding domain of primase-helicase. This region represents the zinc binding domain. It is found in the N-terminal region of the bacteriophage P4 alpha protein, which is a multifunctional protein with origin recognition, helicase and primase activities.	37
197872	smart00780	PIG-X	PIG-X / PBN1. Mammalian PIG-X and yeast PBN1 are essential components of glycosylphosphatidylinositol-mannosyltransferase I. These enzymes are involved in the transfer of sugar molecules.	203
129018	smart00782	PhnA_Zn_Ribbon	PhnA Zinc-Ribbon. This protein family includes an uncharacterised member designated phnA in Escherichia coli, part of a large operon associated with alkylphosphonate uptake and carbon-phosphorus bond cleavage. This protein is not related to the characterised phosphonoacetate hydrolase designated PhnA.	47
129019	smart00783	A_amylase_inhib	Alpha amylase inhibitor. Alpha amylase inhibitor inhibits mammalian alpha-amylases specifically, by forming a tight stoichiometric 1:1 complex with alpha-amylase. The inhibitor has no action on plant and microbial alpha amylases.	69
214818	smart00784	SPT2	SPT2 chromatin protein. This entry includes the Saccharomyces cerevisiae protein SPT2 which is a chromatin protein involved in transcriptional regulation.	106
129021	smart00785	AARP2CN	AARP2CN (NUC121) domain. This domain is the central domain of AARP2. It is weakly similar to the GTP-binding domain of elongation factor TU.	83
129022	smart00786	SHR3_chaperone	ER membrane protein SHR3. This family of proteins are membrane localised chaperones that are required for correct plasma membrane localisation of amino acid permeases (AAPs). Shr3 prevents AAPs from aggregating and assists in their correct folding. In the absence of Shr3, AAPs are retained in the ER.	196
197874	smart00787	Spc7	Spc7 kinetochore protein. This domain is found in cell division proteins which are required for kinetochore-spindle association.	312
197875	smart00788	Adenylsucc_synt	Adenylosuccinate synthetase. Adenylosuccinate synthetase plays an important role in purine biosynthesis, by catalyzing the GTP-dependent conversion of IMP and aspartic acid to AMP. Adenylosuccinate synthetase has been characterized from various sources ranging from Escherichia coli (gene purA) to vertebrate tissues. In vertebrates, two isozymes are present - one involved in purine biosynthesis and the other in the purine nucleotide cycle. The crystal structure of adenylosuccinate synthetase from E. coli reveals that the dominant structural element of each monomer of the homodimer is a central beta-sheet of 10 strands. The first nine strands of the sheet are mutually parallel with right-handed crossover connections between the strands. The 10th strand is antiparallel with respect to the first nine strands. In addition, the enzyme has two antiparallel beta-sheets, comprised of two strands and three strands each, 11 alpha-helices and two short 3/10-helices. Further, it has been suggested that the similarities in the GTP-binding domains of the synthetase and the p21ras protein are an example of convergent evolution of two distinct families of GTP-binding proteins. Structures of adenylosuccinate synthetase from Triticum aestivum and Arabidopsis thaliana, when compared with the known structures from E. coli, reveal that the overall fold is very similar to that of the E. coli protein.	417
129025	smart00789	Ad_cyc_g-alpha	Adenylate cyclase G-alpha binding domain. This fungal domain is found in adenylate cyclase and interacts with the alpha subunit of heterotrimeric G proteins.	51
129026	smart00790	AFOR_N	Aldehyde ferredoxin oxidoreductase, N-terminal domain. Enzymes of the aldehyde ferredoxin oxidoreductase (AOR) family contain a tungsten cofactor and a 4Fe4S cluster and catalyse the interconversion of aldehydes to carboxylates. This family includes AOR, formaldehyde ferredoxin oxidoreductase (FOR) and glyceraldehyde-3-phosphate ferredoxin oxidoreductase (GAPOR), all isolated from hyperthermophilic archaea; carboxylic acid reductase found in clostridia; and hydroxycarboxylate viologen oxidoreductase from Proteus vulgaris, the sole member of the AOR family containing molybdenum. GAPOR may be involved in glycolysis, but the functions of the other proteins are not yet clear. AOR has been proposed to be the primary enzyme responsible for oxidising the aldehydes that are produced by the 2-keto acid oxidoreductases.	199
129027	smart00791	Agglutinin	Amaranthus caudatus agglutinin or amaranthin is a lectin from the ancient South American crop, amaranth grain. Although its biological function is unknown, it has a high binding specificity for the methyl-glycoside of the T-antigen, found linked to serine or threonine residues of cell surface glycoproteins. The protein is a homodimer, with each monomer consisting of two beta-trefoil domains.	139
197876	smart00792	Agouti	Agouti protein. The agouti protein regulates pigmentation in the mouse hair follicle producing a black hair with a subapical yellow band. A highly homologous protein agouti signal protein (ASIP) is present in humans and is expressed at highest levels in adipose tissue where it may play a role in energy homeostasis and possibly human pigmentation.	124
214819	smart00793	AgrB	Accessory gene regulator B. The accessory gene regulator (agr) of Staphylococcus aureus is the central regulatory system that controls the gene expression for a large set of virulence factors. The agr locus consists of two transcripts: RNAII and RNAIII. RNAII encodes four genes (agrA, B, C, and D) whose gene products assemble a quorum sensing system. At low cell density, the agr genes are continuously expressed at basal levels. A signal molecule, autoinducing peptide (AIP), produced and secreted by the bacteria, accumulates outside of the cells. When the cell density increases and the AIP concentration reaches a threshold, it activates the agr response, i.e. activation of secreted protein gene expression and subsequent repression of cell wall-associated protein genes. AgrB and AgrD are essential for the production of the autoinducing peptide which functions as a signal for quorum sensing. AgrB is a transmembrane protein. AgrB is involved in the proteolytic processing of AgrD and may act both as a proteolytic enzyme and as a transporter facilitating the export of the processed AgrD peptide.	184
129030	smart00794	AgrD	Staphylococcal AgrD protein. This family consists of several AgrD proteins from many Staphylococcus species. The agr locus was initially described in Staphylococcus aureus as an element controlling the production of exoproteins implicated in virulence. Its pattern of action has been shown to be complex, upregulating certain extracellular toxins and enzymes expressed post-exponentially and repressing some exponential-phase surface components. AgrD encodes the precursor of the autoinducing peptide (AIP). The AIP derived from AgrD by the action of AgrB interacts with AgrC in the membrane to activate AgrA, which upregulates transcription both from promoter P2, amplifying the response, and from P3, initiating the production of a novel effector: RNAIII. In S. aureus, delta-hemolysin is the only translation product of RNAIII and is not involved in the regulatory functions of the transcript, which is therefore the primary agent for modulating the expression of other operons controlled by agr.	45
129031	smart00795	Agro_virD5	Agrobacterium VirD5 protein. The virD operon in Agrobacterium encodes a site-specific endonuclease, and a number of other poorly characterised products. This family represents the VirD5 protein.	780
214820	smart00796	AHS1	Allophanate hydrolase subunit 1. This domain represents subunit 1 of allophanate hydrolase (AHS1).	201
214821	smart00797	AHS2	Allophanate hydrolase subunit 2. This domain represents subunit 2 of allophanate hydrolase (AHS2).	280
214822	smart00798	AICARFT_IMPCHas	AICARFT/IMPCHase bienzyme. This is a family of bifunctional enzymes catalysing the last two steps in de novo purine biosynthesis. The bifunctional enzyme is found in both prokaryotes and eukaryotes. The second last step is catalysed by 5-aminoimidazole-4-carboxamide ribonucleotide formyltransferase (AICARFT); this enzyme catalyses the formylation of AICAR with 10-formyl-tetrahydrofolate to yield FAICAR and tetrahydrofolate. The last step is catalysed by IMP (Inosine monophosphate) cyclohydrolase (IMPCHase), cyclizing FAICAR (5-formylaminoimidazole-4-carboxamide ribonucleotide) to IMP.	311
214823	smart00799	DENN	Domain found in a variety of signalling proteins, always encircled by uDENN and dDENN. The DENN domain is found in a variety of signalling proteins involved in Rab-mediated processes or regulation of MAPKs signalling pathways. The DENN domain is always encircled on both sides by more divergent domains, called uDENN (for upstream DENN) and dDENN (for downstream DENN). The function of the DENN domain remains to date unclear, although it appears to represent a good candidate for a GTP/GDP exchange activity.	183
214824	smart00800	uDENN	Domain always found upstream of DENN domain, found in a variety of signalling proteins. The uDENN domain is part of the tripartite DENN domain. It is always found upstream of the DENN domain itself, which is found in a variety of signalling proteins involved in Rab-mediated processes or regulation of MAPKs signalling pathways. The DENN domain is always encircled on both sides by more divergent domains, called uDENN (for upstream DENN) and dDENN (for downstream DENN). The function of the DENN domain remains to date unclear, although it appears to represent a good candidate for a GTP/GDP exchange activity.	89
129037	smart00801	dDENN	Domain always found downstream of DENN domain, found in a variety of signalling proteins. The dDENN domain is part of the tripartite DENN domain. It is always found downstream of the DENN domain itself, which is found in a variety of signalling proteins involved in Rab-mediated processes or regulation of MAPKs signalling pathways. The DENN domain is always encircled on both sides by more divergent domains, called uDENN (for upstream DENN) and dDENN (for downstream DENN). The function of the DENN domain remains to date unclear, although it appears to represent a good candidate for a GTP/GDP exchange activity.	69
214825	smart00802	UME	Domain in UVSB PI-3 kinase, MEI-41 and ESR-1. Characteristic domain in UVSB PI-3 kinase, MEI-41 and ESR-1. Found in nucleolar proteins. Associated with FAT, FATC, PI3_PI4_kinase modules.	107
129039	smart00803	TAF	TATA box binding protein associated factor. TAFs (TATA box binding protein associated factors) are part of the transcription initiation factor TFIID multimeric protein complex. TFIID is composed of the TATA box binding protein (TBP) and a number of TAFs. The TAFs provide binding sites for many different transcriptional activators and co-activators that modulate transcription initiation by Pol II. TAF proteins adopt a histone-like fold.	65
197882	smart00804	TAP_C	C-terminal domain of vertebrate Tap protein. The vertebrate Tap protein is a member of the NXF family of shuttling transport receptors for the nuclear export of mRNA. Its most C-terminal domain is important for binding to FG repeat-containing nuclear pore proteins (FG-nucleoporins) and is sufficient to mediate shuttling. This domain forms a compact four-helix fold related to that of a UBA domain.	63
197883	smart00805	AGTRAP	Angiotensin II, type I receptor-associated protein. This family consists of several angiotensin II, type I receptor-associated protein (AGTRAP) sequences. AGTRAP is known to interact specifically with the C-terminal cytoplasmic region of the angiotensin II type 1 (AT(1)) receptor to regulate different aspects of AT(1) receptor physiology. The function of this family is unclear.	159
214826	smart00806	AIP3	Actin interacting protein 3. Aip3p/Bud6p is a regulator of cell and cytoskeletal polarity in Saccharomyces cerevisiae that was previously identified as an actin-interacting protein. Actin-interacting protein 3 (Aip3p) localizes at the cell cortex where cytoskeleton assembly must be achieved to execute polarized cell growth, and deletion of AIP3 causes gross defects in cell and cytoskeletal polarity. Aip3p localization is mediated by the secretory pathway; mutations in early- or late-acting components of the secretory apparatus lead to Aip3p mislocalization.	426
214827	smart00807	AKAP_110	A-kinase anchor protein 110 kDa. This family consists of several mammalian protein kinase A anchoring protein 3 (PRKA3) or A-kinase anchor protein 110 kDa (AKAP 110) sequences. Agents that increase intracellular cAMP are potent stimulators of sperm motility. Anchoring inhibitor peptides, designed to disrupt the interaction of the cAMP-dependent protein kinase A (PKA) with A kinase-anchoring proteins (AKAPs), are potent inhibitors of sperm motility. PKA anchoring is a key biochemical mechanism controlling motility. AKAP110 shares compartments with both RI and RII isoforms of PKA and may function as a regulator of both motility- and head-associated functions such as capacitation and the acrosome reaction.	851
197885	smart00808	FABD	F-actin binding domain (FABD). FABD is the F-actin binding domain of Bcr-Abl and its cellular counterpart c-Abl. The Bcr-Abl tyrosine kinase causes different forms of leukemia in humans. Depending on its position within the cell, Bcr-Abl differentially affects cellular growth. The FABD forms a compact left-handed four-helix bundle in solution.	126
197886	smart00809	Alpha_adaptinC2	Adaptin C-terminal domain. Adaptins are components of the adaptor complexes which link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. Gamma-adaptin is a subunit of the golgi adaptor. Alpha adaptin is a heterotetramer that regulates clathrin-bud formation. The carboxyl-terminal appendage of the alpha subunit regulates translocation of endocytic accessory proteins to the bud site. This Ig-fold domain is found in alpha, beta and gamma adaptins and consists of a beta-sandwich containing 7 strands in 2 beta-sheets in a greek-key topology. The adaptor appendage contains an additional N-terminal strand.	104
129046	smart00810	Alpha-amyl_C2	Alpha-amylase C-terminal beta-sheet domain. This entry represents the beta-sheet domain that is found in several alpha-amylases, usually at the C-terminus. This domain is organised as a five-stranded anti-parallel beta-sheet.	61
214828	smart00811	Alpha_kinase	Alpha-kinase family. This family is a novel family of eukaryotic protein kinase catalytic domains, which have no detectable similarity to conventional kinases. The family contains myosin heavy chain kinases and Elongation Factor-2 kinase and a bifunctional ion channel. This family is known as the alpha-kinase family. The structure of the kinase domain revealed unexpected similarity to eukaryotic protein kinases in the catalytic core as well as to metabolic enzymes with ATP-grasp domains.	198
214829	smart00812	Alpha_L_fucos	Alpha-L-fucosidase. O-Glycosyl hydrolases (EC 3.2.1.-) are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in 'clans'. Family 29 encompasses alpha-L-fucosidases, which is a lysosomal enzyme responsible for hydrolyzing the alpha-1,6-linked fucose joined to the reducing-end N-acetylglucosamine of the carbohydrate moieties of glycoproteins. Deficiency of alpha-L-fucosidase results in the lysosomal storage disease fucosidosis.	384
214830	smart00813	Alpha-L-AF_C	Alpha-L-arabinofuranosidase C-terminus. This entry represents the C terminus (approximately 200 residues) of bacterial and eukaryotic alpha-L-arabinofuranosidase. This catalyses the hydrolysis of non-reducing terminal alpha-L-arabinofuranosidic linkages in L-arabinose-containing polysaccharides.	189
129050	smart00814	Alpha_TIF	Alpha trans-inducing protein (Alpha-TIF). Alpha-TIF (VP16) from Herpes Simplex virus is an essential tegument protein involved in the transcriptional activation of viral immediate early (IE) promoters (alpha genes) during the lytic phase of viral infection. VP16 associates with cellular transcription factors to enhance transcription rates, including the general transcription factor TFIIB and the transcriptional coactivator PC4. The N-terminal residues of VP16 confer specificity for the IE genes, while the C-terminal residues are responsible for transcriptional activation. Within the C-terminal region are two activation regions that can independently and cooperatively activate transcription. VP16 forms a transcriptional regulatory complex with two cellular proteins, the POU-domain transcription factor Oct-1 and the cell-proliferation factor HCF-1. VP16 is an alpha/beta protein with an unusual fold. Other transcription factors may have a similar topology.	356
214831	smart00815	AMA-1	Apical membrane antigen 1. Apical membrane antigen 1 (AMA-1) is a Plasmodium asexual blood-stage antigen. It has been suggested that positive selection operates on the AMA-1 gene in regions coding for antigenic sites.	239
129052	smart00816	Amb_V_allergen	Amb V Allergen. Amb V is an Ambrosia sp (ragweed) pollen allergen. Amb t V has been shown to contain a C-terminal helix as the major T cell epitope. Free sulphydryl groups also play a major role in the T cell recognition of cross-reactive T cell epitopes within these related allergens.	45
214832	smart00817	Amelin	Ameloblastin precursor (Amelin). This family consists of several mammalian Ameloblastin precursor (Amelin) proteins. Matrix proteins of tooth enamel consist mainly of amelogenin but also of non-amelogenin proteins, which, although their volumetric percentage is low, have an important role in enamel mineralisation. One of the non-amelogenin proteins is ameloblastin, also known as amelin and sheathlin. Ameloblastin (AMBN) is one of the enamel sheath proteins which is thought to have a role in determining the prismatic structure of growing enamel crystals.	411
197891	smart00818	Amelogenin	Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth. They seem to regulate formation of crystallites during the secretory stage of tooth enamel development and are thought to play a major role in the structural organisation and mineralisation of developing enamel. The extracellular matrix of the developing enamel comprises two major classes of protein: the hydrophobic amelogenins and the acidic enamelins. Circular dichroism studies of porcine amelogenin have shown that the protein consists of 3 discrete folding units: the N-terminal region appears to contain beta-strand structures, while the C-terminal region displays characteristics of a random coil conformation. Subsequent studies on the bovine protein have indicated the amelogenin structure to contain a repetitive beta-turn segment and a "beta-spiral" between Gln112 and Leu138, which sequester a (Pro, Leu, Gln) rich region. The beta-spiral offers a probable site for interactions with Ca2+ ions. Mutations in the human amelogenin gene (AMGX) cause X-linked hypoplastic amelogenesis imperfecta, a disease characterised by defective enamel. A 9bp deletion in exon 2 of AMGX results in the loss of codons for Ile5, Leu6, Phe7 and Ala8, and replacement by a new threonine codon, disrupting the 16-residue (Met1-Ala16) amelogenin signal peptide.	165
214833	smart00822	PKS_KR	This enzymatic domain is part of bacterial polyketide synthases. It catalyses the first step in the reductive modification of the beta-carbonyl centres in the growing polyketide chain. It uses NADPH to reduce the keto group to a hydroxy group.	180
214834	smart00823	PKS_PP	Phosphopantetheine attachment site. Phosphopantetheine (or pantetheine 4' phosphate) is the prosthetic group of acyl carrier proteins (ACP) in some multienzyme complexes where it serves as a 'swinging arm' for the attachment of activated fatty acid and amino-acid groups.	86
214835	smart00824	PKS_TE	Thioesterase. Peptide synthetases are involved in the non-ribosomal synthesis of peptide antibiotics. Next to the operons encoding these enzymes, in almost all cases, are genes that encode proteins that have similarity to the type II fatty acid thioesterases of vertebrates. There are also modules within the peptide synthetases that also share this similarity. With respect to antibiotic production, thioesterases are required for the addition of the last amino acid to the peptide antibiotic, thereby forming a cyclic antibiotic. Thioesterases (non-integrated) have molecular masses of 25-29 kDa.	212
214836	smart00825	PKS_KS	Beta-ketoacyl synthase. The structure of beta-ketoacyl synthase is similar to that of the thiolase family and also chalcone synthase. The active site of beta-ketoacyl synthase is located between the N and C-terminal domains.	298
214837	smart00826	PKS_DH	Dehydratase domain in polyketide synthase (PKS) enzymes. 	167
214838	smart00827	PKS_AT	Acyl transferase domain in polyketide synthase (PKS) enzymes. 	298
214839	smart00828	PKS_MT	Methyltransferase in polyketide synthase (PKS) enzymes. 	224
214840	smart00829	PKS_ER	Enoylreductase. Enoylreductase in Polyketide synthases.	287
214841	smart00830	CM_2	Chorismate mutase type II. Chorismate mutase catalyses the conversion of chorismate to prephenate in the pathway of tyrosine and phenylalanine biosynthesis. This enzyme is negatively regulated by tyrosine, tryptophan and phenylalanine.	79
214842	smart00831	Cation_ATPase_N	Cation transporter/ATPase, N-terminus. This entry represents the conserved N-terminal region found in several classes of cation-transporting P-type ATPases, including those that transport H+, Na+, Ca2+, Na+/K+, and H+/K+. In the H+/K+- and Na+/K+-exchange P-ATPases, this domain is found in the catalytic alpha chain. In gastric H+/K+-ATPases, this domain undergoes reversible sequential phosphorylation inducing conformational changes that may be important for regulating the function of these ATPases.	75
214843	smart00832	C8	This domain contains 8 conserved cysteine residues. Not all of the conserved cysteines have been included in the alignment model. It is found in disease-related proteins including von Willebrand factor, Alpha tectorin, Zonadhesin and Mucin.	76
214844	smart00833	CobW_C	Cobalamin synthesis protein cobW C-terminal domain. CobW proteins are generally found proximal to the trimeric cobaltochelatase subunit CobN, which is essential for vitamin B12 (cobalamin) biosynthesis. They contain a P-loop nucleotide-binding loop in the N-terminal domain and a histidine-rich region in the C-terminal portion suggesting a role in metal binding, possibly as an intermediary between the cobalt transport and chelation systems. CobW might be involved in cobalt reduction leading to cobalt(I) corrinoids. This entry represents the C-terminal domain found in CobW, as well as in P47K, a Pseudomonas chlororaphis protein needed for nitrile hydratase expression.	92
197903	smart00834	CxxC_CXXC_SSSS	Putative regulatory protein. CxxC_CXXC_SSSS represents a region of about 41 amino acids found in a number of small proteins in a wide range of bacteria. The region usually begins with the initiator Met and contains two CxxC motifs separated by 17 amino acids. One protein in this entry has been noted as a putative regulatory protein, designated FmdB. Most proteins in this entry have a C-terminal region containing highly degenerate sequence.	41
214845	smart00835	Cupin_1	Cupin. This family represents the conserved barrel domain of the 'cupin' superfamily ('cupa' is the Latin term for a small barrel). This family contains 11S and 7S plant seed storage proteins, and germins. Plant seed storage proteins provide the major nitrogen source for the developing plant.	146
214846	smart00836	DALR_1	DALR anticodon binding domain. This all alpha helical domain is the anticodon binding domain of Arginyl tRNA synthetase. This domain is known as the DALR domain after characteristic conserved amino acids.	122
129070	smart00837	DPBB_1	Rare lipoprotein A (RlpA)-like double-psi beta-barrel. Rare lipoprotein A (RlpA) contains a conserved region that has the double-psi beta-barrel (DPBB) fold. The function of RlpA is not well understood, but it has been shown to act as a prc mutant suppressor in Escherichia coli. The DPBB fold is often an enzymatic domain. The members of this family are quite diverse, and if catalytic this family may contain several different functions. Another example of this domain is found in the N terminus of pollen allergen.	87
197906	smart00838	EFG_C	Elongation factor G C-terminus. This domain includes the carboxyl terminal regions of Elongation factor G, elongation factor 2 and some tetracycline resistance proteins and adopts a ferredoxin-like fold.	85
214847	smart00839	ELFV_dehydrog	Glutamate/Leucine/Phenylalanine/Valine dehydrogenase. Glutamate, leucine, phenylalanine and valine dehydrogenases are structurally and functionally related. They contain a Gly-rich region containing a conserved Lys residue, which has been implicated in the catalytic activity, in each case a reversible oxidative deamination reaction.	102
214848	smart00840	DALR_2	This DALR domain is found in cysteinyl-tRNA-synthetases. 	56
214849	smart00841	Elong-fact-P_C	Elongation factor P, C-terminal. These nucleic acid binding domains are predominantly found in elongation factor P, where they adopt an OB-fold, with five beta-strands forming a beta-barrel in a Greek-key topology.	57
214850	smart00842	FtsA	Cell division protein FtsA. FtsA is essential for bacterial cell division, and co-localizes to the septal ring with FtsZ. It has been suggested that the interaction of FtsA-FtsZ has arisen through coevolution in different bacterial strains.	187
197911	smart00843	Ftsk_gamma	This domain directs oriented DNA translocation and forms a winged helix structure. Mutated proteins with substitutions in the FtsK gamma DNA-recognition helix are impaired in DNA binding.	63
197912	smart00844	GA	GA module. The protein G-related albumin-binding (GA) module is composed of three alpha helices. This module is found in a range of bacterial cell surface proteins. The GA module from the Peptostreptococcus magnus albumin-binding protein (PAB) shows a strong affinity for albumin.	60
197913	smart00845	GatB_Yqey	GatB domain. This domain is found in GatB and proteins related to bacterial Yqey. It is about 140 amino acid residues long. This domain is found at the C terminus of GatB which transamidates Glu-tRNA to Gln-tRNA. The function of this domain is uncertain. It does however suggest that Yqey and its relatives have a role in tRNA metabolism.	147
214851	smart00846	Gp_dh_N	Glyceraldehyde 3-phosphate dehydrogenase, NAD binding domain. GAPDH is a tetrameric NAD-binding enzyme involved in glycolysis and gluconeogenesis. The N-terminal domain is a Rossmann NAD(P) binding fold.	149
214852	smart00847	HA2	Helicase associated domain (HA2). This presumed domain is about 90 amino acid residues in length. It is found in a diverse set of RNA helicases. Its function is unknown; however, it seems likely to be involved in nucleic acid binding.	82
214853	smart00848	Inhibitor_I29	Cathepsin propeptide inhibitor domain (I29). This domain is found at the N-terminus of some C1 peptidases such as Cathepsin L where it acts as a propeptide. There are also a number of proteins that are composed solely of multiple copies of this domain such as the peptidase inhibitor salarin. This family is classified as I29 by MEROPS. Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact direct with proteinases using a simple noncovalent lock and key mechanism; while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties.	57
214854	smart00849	Lactamase_B	Metallo-beta-lactamase superfamily. Apart from the beta-lactamases a number of other proteins contain this domain. These proteins include thiolesterases, members of the glyoxalase II family, that catalyse the hydrolysis of S-D-lactoyl-glutathione to form glutathione and D-lactic acid and a competence protein that is essential for natural transformation in Neisseria gonorrhoeae and could be a transporter involved in DNA uptake. Except for the competence protein these proteins bind two zinc ions per molecule as cofactor.	177
197918	smart00850	LytTR	LytTr DNA-binding domain. This domain is found in a variety of bacterial transcriptional regulators. The domain binds to a specific DNA sequence pattern.	96
214855	smart00851	MGS	MGS-like domain. This domain composes the whole protein of methylglyoxal synthetase and the domain is also found in Carbamoyl phosphate synthetase (CPS) where it forms a regulatory domain that binds to the allosteric effector ornithine. This family also includes inosicase. The known structures in this family show a common phosphate binding site.	91
214856	smart00852	MoCF_biosynth	Probable molybdopterin binding domain. This domain is found in a variety of proteins involved in biosynthesis of molybdopterin cofactor. The domain is presumed to bind molybdopterin. The structure of this domain is known, and it forms an alpha/beta structure. In the known structure of Gephyrin this domain mediates trimerisation.	138
214857	smart00853	MutL_C	MutL C terminal dimerisation domain. MutL and MutS are key components of the DNA repair machinery that corrects replication errors. MutS recognises mispaired or unpaired bases in a DNA duplex and in the presence of ATP, recruits MutL to form a DNA signaling complex for repair. The N terminal region of MutL contains the ATPase domain and the C terminal is involved in dimerisation.	140
214858	smart00854	PGA_cap	Bacterial capsule synthesis protein PGA_cap. This protein is a putative poly-gamma-glutamate capsule biosynthesis protein found in bacteria. Poly-gamma-glutamate is a natural polymer that may be involved in virulence and may help bacteria survive in high salt concentrations. It is a surface-associated protein.	239
214859	smart00855	PGAM	Phosphoglycerate mutase family. Phosphoglycerate mutase (PGAM) and bisphosphoglycerate mutase (BPGM) are structurally related enzymes that catalyse reactions involving the transfer of phospho groups between the three carbon atoms of phosphoglycerate. Both enzymes can catalyse three different reactions with different specificities: the isomerization of 2-phosphoglycerate (2-PGA) to 3-phosphoglycerate (3-PGA) with 2,3-diphosphoglycerate (2,3-DPG) as the primer of the reaction, the synthesis of 2,3-DPG from 1,3-DPG with 3-PGA as a primer and the degradation of 2,3-DPG to 3-PGA (phosphatase activity). In mammals, PGAM is a dimeric protein with two isoforms, the M (muscle) and B (brain) forms. In yeast, PGAM is a tetrameric protein.	158
214860	smart00856	PMEI	Plant invertase/pectin methylesterase inhibitor. This domain inhibits pectin methylesterases (PMEs) and invertases through formation of a non-covalent 1:1 complex. It has been implicated in the regulation of fruit development, carbohydrate metabolism and cell wall extension. It may also be involved in inhibiting microbial pathogen PMEs. It has been observed that it is often expressed as a large inactive preprotein. It is also found at the N-termini of PMEs predicted from DNA sequences, suggesting that both PMEs and their inhibitors are expressed as a single polyprotein and subsequently processed. It has two disulphide bridges and is mainly alpha-helical.	148
214861	smart00857	Resolvase	Resolvase, N terminal domain. The N-terminal domain of the resolvase family contains the active site and the dimer interface. The extended arm at the C-terminus of this domain connects to the C-terminal helix-turn-helix domain of resolvase.	148
214862	smart00858	SAF	This domain family includes a range of different proteins, such as antifreeze proteins, flagellar FlgA proteins, and CpaB pilus proteins.	63
214863	smart00859	Semialdhyde_dh	Semialdehyde dehydrogenase, NAD binding domain. The semialdehyde dehydrogenase family is found in N-acetyl-glutamine semialdehyde dehydrogenase (AgrC), which is involved in arginine biosynthesis, and aspartate-semialdehyde dehydrogenase, an enzyme involved in the biosynthesis of various amino acids from aspartate. This family is also found in yeast and fungal Arg5,6 protein, which is cleaved into the enzymes N-acetyl-gamma-glutamyl-phosphate reductase and acetylglutamate kinase. These are also involved in arginine biosynthesis. All proteins in this entry contain a NAD binding region of semialdehyde dehydrogenase.	123
214864	smart00860	SMI1_KNR4	SMI1 / KNR4 family. Proteins in this family are involved in the regulation of 1,3-beta-glucan synthase activity and cell-wall formation.	127
214865	smart00861	Transket_pyr	Transketolase, pyrimidine binding domain. Transketolase (TK) catalyzes the reversible transfer of a two-carbon ketol unit from xylulose 5-phosphate to an aldose receptor, such as ribose 5-phosphate, to form sedoheptulose 7-phosphate and glyceraldehyde 3-phosphate. This enzyme, together with transaldolase, provides a link between the glycolytic and pentose-phosphate pathways. TK requires thiamine pyrophosphate as a cofactor. In most sources where TK has been purified, it is a homodimer of approximately 70 Kd subunits. TK sequences from a variety of eukaryotic and prokaryotic sources show that the enzyme has been evolutionarily conserved. In the peroxisomes of methylotrophic yeast Hansenula polymorpha, there is a highly related enzyme, dihydroxy-acetone synthase (DHAS) (also known as formaldehyde transketolase), which exhibits a very unusual specificity by including formaldehyde amongst its substrates.	136
214866	smart00862	Trans_reg_C	Transcriptional regulatory protein, C terminal. This domain is almost always found associated with the response regulator receiver domain. It may play a role in DNA binding.	76
197931	smart00863	tRNA_SAD	Threonyl and Alanyl tRNA synthetase second additional domain. The catalytically active form of threonyl/alanyl tRNA synthetase is a dimer. Within the tRNA synthetase class II dimer, the bound tRNA interacts with both monomers making specific interactions with the catalytic domain, the C-terminal domain, and this SAD domain (the second additional domain). The second additional domain is comprised of a pair of perpendicularly orientated antiparallel beta sheets, of four and three strands, respectively, that surround a central alpha helix that forms the core of the domain.	43
214867	smart00864	Tubulin	Tubulin/FtsZ family, GTPase domain. This domain is found in all tubulin chains, as well as the bacterial FtsZ family of proteins. These proteins are involved in polymer formation. Tubulin is the major component of microtubules, while FtsZ is the polymer-forming protein of bacterial cell division; it is part of a ring in the middle of the dividing cell that is required for constriction of cell membrane and cell envelope to yield two daughter cells. FtsZ and tubulin are GTPases; this entry is the GTPase domain. FtsZ can polymerise into tubes, sheets, and rings in vitro and is ubiquitous in bacteria and archaea.	192
214868	smart00865	Tubulin_C	Tubulin/FtsZ family, C-terminal domain. This domain is found in the tubulin alpha, beta and gamma chains, as well as the bacterial FtsZ family of proteins. These proteins are GTPases and are involved in polymer formation. Tubulin is the major component of microtubules, while FtsZ is the polymer-forming protein of bacterial cell division; it is part of a ring in the middle of the dividing cell that is required for constriction of cell membrane and cell envelope to yield two daughter cells. FtsZ can polymerise into tubes, sheets, and rings in vitro and is ubiquitous in bacteria and archaea. This is the C-terminal domain.	120
214869	smart00866	UTRA	The UbiC transcription regulator-associated (UTRA) domain is a conserved ligand-binding domain. It has a similar fold to HutC/FarR-like bacterial transcription factors of the GntR family. It is believed to modulate activity of bacterial transcription factors in response to binding small molecules.	143
214870	smart00867	YceI	YceI-like domain. E. coli YceI is a base-induced periplasmic protein. The recent structure of a member of this family shows that it binds to polyisoprenoid. The structure consists of an extended, eight-stranded, antiparallel beta-barrel that resembles the lipocalin fold.	166
214871	smart00868	zf-AD	Zinc-finger associated domain (zf-AD). The zf-AD domain, also known as ZAD, forms an atypical treble-cleft-like zinc co-ordinating fold. The zf-AD domain is thought to be involved in mediating dimer formation, but does not bind to DNA.	73
214872	smart00869	Autotransporter	Autotransporter beta-domain. Secretion of protein products occurs by a number of different pathways in bacteria. One of these pathways known as the type IV pathway was first described for the IgA1 protease. The protein component that mediates secretion through the outer membrane is contained within the secreted protein itself, hence the proteins secreted in this way are called autotransporters. This family corresponds to the presumed integral membrane beta-barrel domain that transports the protein. This domain is found at the C-terminus of the proteins it occurs in. The N-terminus contains the variable passenger domain that is translocated across the membrane. Once the passenger domain is exported it is cleaved auto-catalytically in some proteins, in others a different peptidase is used and in some cases no cleavage occurs.	268
214873	smart00870	Asparaginase	Asparaginase, found in various plant, animal and bacterial cells. Asparaginase catalyses the deamination of asparagine to yield aspartic acid and an ammonium ion, resulting in a depletion of free circulatory asparagine in plasma. The enzyme is effective in the treatment of human malignant lymphomas, which have a diminished capacity to produce asparagine synthetase: in order to survive, such cells absorb asparagine from blood plasma; if Asn levels have been depleted by injection of asparaginase, the lymphoma cells die.	323
214874	smart00871	AraC_E_bind	Bacterial transcription activator, effector binding domain. This domain is found in the probable effector binding domain of a number of different bacterial transcription activators and is also present in some DNA gyrase inhibitors. The absence of a HTH motif in the DNA gyrase inhibitors is thought to indicate that these do not bind DNA.	158
214875	smart00872	Alpha-mann_mid	Alpha mannosidase, middle domain. Members of this entry belong to the glycosyl hydrolase family 38. This domain, which is found in the central region, adopts a structure consisting of three alpha helices, in an immunoglobulin/albumin-binding domain-like fold. The domain is predominantly found in the enzyme alpha-mannosidase.	79
214876	smart00873	B3_4	B3/4 domain. This domain is found in tRNA synthetase beta subunits as well as in some non tRNA synthetase proteins.	174
197942	smart00874	B5	tRNA synthetase B5 domain. This domain is found in phenylalanine-tRNA synthetase beta subunits.	68
197943	smart00875	BACK	BTB And C-terminal Kelch. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues.	101
214877	smart00876	BATS	Biotin and Thiamin Synthesis associated domain. Biotin synthase (BioB) catalyses the last step of the biotin biosynthetic pathway. The reaction consists in the introduction of a sulphur atom into dethiobiotin. BioB functions as a homodimer. Thiamin synthesis is a complex process involving at least six gene products (ThiFSGH, ThiI and ThiJ). Two of the proteins required for the biosynthesis of the thiazole moiety of thiamine (vitamin B(1)) are ThiG and ThiH (this entry) and form a heterodimer. Both of these reactions are thought to involve the binding of co-factors, and both function as dimers. This domain therefore may be involved in co-factor binding or dimerisation.	94
197945	smart00877	BMC	Bacterial microcompartments are primitive organelles composed entirely of protein subunits. The prototypical bacterial microcompartment is the carboxysome, a protein shell for sequestering carbon fixation reactions. These proteins form hexameric structures.	75
214878	smart00878	Biotin_carb_C	Biotin carboxylase C-terminal domain. Biotin carboxylase is a component of the acetyl-CoA carboxylase multi-component enzyme which catalyses the first committed step in fatty acid synthesis in animals, plants and bacteria. Most of the active site residues reported in reference are in this C-terminal domain.	107
214879	smart00879	Brix	The Brix domain is found in a number of eukaryotic proteins. Members include SSF proteins from yeast and humans, Arabidopsis thaliana Peter Pan-like protein and several hypothetical proteins.	180
214880	smart00880	CHAD	The CHAD domain is an alpha-helical domain functionally associated with some members of the adenylate cyclase family. It has conserved histidines that may chelate metals.	262
214881	smart00881	CoA_binding	CoA binding domain. This domain has a Rossmann fold and is found in a number of proteins including succinyl CoA synthetases, malate and ATP-citrate ligases.	100
214882	smart00882	CoA_trans	Coenzyme A transferase. Coenzyme A (CoA) transferases belong to an evolutionary conserved family of enzymes catalyzing the reversible transfer of CoA from one carboxylic acid to another. They have been identified in many prokaryotes and in mammalian tissues. The bacterial enzymes are heterodimers of two subunits (A and B) of about 25 Kd each while eukaryotic SCOT consist of a single chain which is colinear with the two bacterial subunits.	212
197951	smart00883	Cpn10	Chaperonin 10 Kd subunit. The chaperonins are 'helper' molecules required for correct folding and subsequent assembly of some proteins. These are required for normal cell growth, and are stress-induced, acting to stabilise or protect disassembled polypeptides under heat-shock conditions. Type I chaperonins present in eubacteria, mitochondria and chloroplasts require the concerted action of 2 proteins, chaperonin 60 (cpn60) and chaperonin 10 (cpn10). The 10 kDa chaperonin (cpn10 - or groES in bacteria) exists as a ring-shaped oligomer of six to eight identical subunits, while the 60 kDa chaperonin (cpn60 - or groEL in bacteria) forms a structure comprising 2 stacked rings, each ring containing 7 identical subunits. These ring structures assemble by self-stimulation in the presence of Mg2+-ATP. The central cavity of the cylindrical cpn60 tetradecamer provides an isolated environment for protein folding whilst cpn-10 binds to cpn-60 and synchronizes the release of the folded protein in an Mg2+-ATP dependent manner. The binding of cpn10 to cpn60 inhibits the weak ATPase activity of cpn60.	93
214883	smart00884	Cullin_Nedd8	Cullin protein neddylation domain. This is the neddylation site of cullin proteins which are a family of structurally related proteins containing an evolutionarily conserved cullin domain. With the exception of APC2, each member of the cullin family is modified by Nedd8 and several cullins function in Ubiquitin-dependent proteolysis, a process in which the 26S proteasome recognises and subsequently degrades a target protein tagged with K48-linked poly-ubiquitin chains. Cullins are molecular scaffolds responsible for assembling the ROC1/Rbx1 RING-based E3 ubiquitin ligases, of which several play a direct role in tumorigenesis. Nedd8/Rub1 is a small ubiquitin-like protein, which was originally found to be conjugated to Cdc53, a cullin component of the SCF (Skp1-Cdc53/CUL1-F-box protein) E3 Ub ligase complex in Saccharomyces cerevisiae, and Nedd8 modification has now emerged as a regulatory pathway of fundamental importance for cell cycle control and for embryogenesis in metazoans. The only identified Nedd8 substrates are cullins. Neddylation results in covalent conjugation of a Nedd8 moiety onto a conserved cullin lysine residue.	68
197953	smart00885	D5_N	D5 N terminal like. This domain is found in D5 proteins of DNA viruses and in bacteriophage P4 DNA primases.	141
214884	smart00886	Dabb	Stress responsive A/B Barrel Domain. The function of this domain is unknown, but it is upregulated in response to salt stress in Populus balsamifera (balsam poplar). It is also found at the C-terminus of a fructose 1,6-bisphosphate aldolase from Hydrogenophilus thermoluteolus. It is found in the pA01 plasmid, which encodes genes for molybdopterin uptake and degradation of the plant alkaloid nicotine. The structure of one has been solved and the domain forms an alpha-beta barrel dimer. Although there is a clear duplication within the domain, it is not obviously detectable in the sequence.	97
214885	smart00887	EB_dh	Ethylbenzene dehydrogenase. Ethylbenzene dehydrogenase is a heterotrimer of three subunits that catalyses the anaerobic degradation of hydrocarbons. The alpha subunit contains the catalytic centre as a Molybdenum cofactor-complex. This removes an electron-pair from the hydrocarbon and passes it along an electron transport system involving iron-sulphur complexes held in the beta subunit and a Haem b molecule contained in the gamma subunit. The electron-pair is then passed to an as yet unknown receiver. The enzyme is found in a variety of different bacteria.	209
214886	smart00888	EF1_GNE	EF-1 guanine nucleotide exchange domain. Translation elongation factors are responsible for two main processes during protein synthesis on the ribosome. EF1A (or EF-Tu) is responsible for the selection and binding of the cognate aminoacyl-tRNA to the A-site (acceptor site) of the ribosome. EF2 (or EF-G) is responsible for the translocation of the peptidyl-tRNA from the A-site to the P-site (peptidyl-tRNA site) of the ribosome, thereby freeing the A-site for the next aminoacyl-tRNA to bind. Elongation factors are responsible for achieving accuracy of translation and both EF1A and EF2 are remarkably conserved throughout evolution. Elongation factor EF1B (also known as EF-Ts or EF-1beta/gamma/delta) is a nucleotide exchange factor that is required to regenerate EF1A from its inactive form (EF1A-GDP) to its active form (EF1A-GTP). EF1A is then ready to interact with a new aminoacyl-tRNA to begin the cycle again. EF1B is more complex in eukaryotes than in bacteria, and can consist of three subunits: EF1B-alpha (or EF-1beta), EF1B-gamma (or EF-1gamma) and EF1B-beta (or EF-1delta). This entry represents the guanine nucleotide exchange domain of the beta (EF-1beta, also known as EF1B-alpha) and delta (EF-1delta, also known as EF1B-beta) chains of EF1B proteins from eukaryotes and archaea. The beta and delta chains have exchange activity, which mainly resides in their homologous guanine nucleotide exchange domains, found in the C-terminal region of the peptides. Their N-terminal regions may be involved in interactions with the gamma chain (EF-1gamma).	88
214887	smart00889	EFG_IV	Elongation factor G, domain IV. Translation elongation factors are responsible for two main processes during protein synthesis on the ribosome. EF1A (or EF-Tu) is responsible for the selection and binding of the cognate aminoacyl-tRNA to the A-site (acceptor site) of the ribosome. EF2 (or EF-G) is responsible for the translocation of the peptidyl-tRNA from the A-site to the P-site (peptidyl-tRNA site) of the ribosome, thereby freeing the A-site for the next aminoacyl-tRNA to bind. Elongation factors are responsible for achieving accuracy of translation and both EF1A and EF2 are remarkably conserved throughout evolution. Elongation factor EF2 (EF-G) is a G-protein. It brings about the translocation of peptidyl-tRNA and mRNA through a ratchet-like mechanism: the binding of GTP-EF2 to the ribosome causes a counter-clockwise rotation in the small ribosomal subunit; the hydrolysis of GTP to GDP by EF2 and the subsequent release of EF2 causes a clockwise rotation of the small subunit back to the starting position. This twisting action destabilises tRNA-ribosome interactions, freeing the tRNA to translocate along the ribosome upon GTP-hydrolysis by EF2. EF2 binding also affects the entry and exit channel openings for the mRNA, widening it when bound to enable the mRNA to translocate along the ribosome. EF2 has five domains. This entry represents domain IV found in EF2 (or EF-G) of both prokaryotes and eukaryotes. The EF2-GTP-ribosome complex undergoes extensive structural rearrangement for tRNA-mRNA movement to occur. Domain IV, which extends from the 'body' of the EF2 molecule much like a lever arm, appears to be essential for the structural transition to take place.	120
197958	smart00890	EKR	Domain of unknown function. EKR is a short, 33 residue, domain found in bacterial and some lower eukaryotic species which lies between a POR (pyruvate ferredoxin/flavodoxin oxidoreductase) and the 4Fe-4S binding domain Fer4. It contains a characteristic EKR sequence motif. The exact function of this domain is not known.	57
214888	smart00891	ERCC4	ERCC4 domain. This entry represents a structural motif found in several DNA repair nucleases, such as Rad1/Mus81/XPF endonucleases, and in ATP-dependent helicases. The XPF/Rad1/Mus81-dependent nuclease family specifically cleaves branched structures generated during DNA repair, replication, and recombination, and is essential for maintaining genome stability. The nuclease domain architecture exhibits remarkable similarity to those of restriction endonucleases.	98
214889	smart00892	Endonuclease_NS	DNA/RNA non-specific endonuclease. A family of bacterial and eukaryotic endonucleases share the following characteristics: they act on both DNA and RNA, cleave double-stranded and single-stranded nucleic acids and require a divalent ion such as magnesium for their activity. A histidine has been shown to be essential for the activity of the Serratia marcescens nuclease. This residue is located in a conserved region which also contains an aspartic acid residue that could be implicated in the binding of the divalent ion.	198
214890	smart00893	ETF	Electron transfer flavoprotein domain. Electron transfer flavoproteins (ETFs) serve as specific electron acceptors for primary dehydrogenases, transferring the electrons to terminal respiratory systems. They can be functionally classified into constitutive, "housekeeping" ETFs, mainly involved in the oxidation of fatty acids (Group I), and ETFs produced by some prokaryotes under specific growth conditions, receiving electrons only from the oxidation of specific substrates (Group II). ETFs are heterodimeric proteins composed of an alpha and beta subunit, and contain an FAD cofactor and AMP. ETF consists of three domains: domains I and II are formed by the N- and C-terminal portions of the alpha subunit, respectively, while domain III is formed by the beta subunit. Domains I and III share an almost identical alpha-beta-alpha sandwich fold, while domain II forms an alpha-beta-alpha sandwich similar to that of bacterial flavodoxins. FAD is bound in a cleft between domains II and III, while domain III binds the AMP molecule. Interactions between domains I and III stabilise the protein, forming a shallow bowl where domain II resides. This entry represents the N-terminal domain of both the alpha and beta subunits from Group I and Group II ETFs.	185
214891	smart00894	Excalibur	Excalibur calcium-binding domain. Extracellular Ca2+-dependent nuclease YokF from Bacillus subtilis and several other surface-exposed proteins from diverse bacteria are encoded in the genomes in two paralogous forms that differ by a ~45 amino acid fragment, which comprises a novel conserved domain. Sequence analysis of this domain revealed a conserved DxDxDGxxCE motif, which is strikingly similar to the Ca2+-binding loop of the calmodulin-like EF-hand domains, suggesting an evolutionary relationship between them. Functions of many of the other proteins in which the novel domain, named Excalibur (extracellular calcium-binding region), is found, as well as a structural model of its conserved motif are consistent with the notion that the Excalibur domain binds calcium. This domain is but one more example of the diversity of structural contexts surrounding the EF-hand-like calcium-binding loop in bacteria. This loop is thus more widespread than hitherto recognised and the evolution of EF-hand-like domains is probably more complex than previously appreciated.	37
214892	smart00895	FCD	This entry represents the C-terminal ligand binding domain of many members of the GntR family. This domain probably binds to a range of effector molecules that regulate the transcription of genes through the action of the N-terminal DNA-binding domain. This domain is found in and that are regulators of sugar biosynthesis operons. Many bacterial transcription regulation proteins bind DNA through a helix-turn-helix (HTH) motif, which can be classified into subfamilies on the basis of sequence similarities. The HTH GntR family has many members distributed among diverse bacterial groups that regulate various biological processes. It was named GntR after the Bacillus subtilis repressor of the gluconate operon. In general, these proteins contain a DNA-binding HTH domain at the N terminus, and an effector binding or oligomerisation domain at the C terminus. The winged-helix DNA-binding domain is well conserved in structure for the whole of the GntR family, and is similar in structure to other transcriptional regulator families. The C-terminal effector-binding and oligomerisation domains are more variable and are consequently used to define the subfamilies. Based on the sequence and structure of the C-terminal domains, the GntR family can be divided into four major groups, as represented by FadR, HutC, MocR and YtrA, as well as some minor groups such as those represented by AraR and PlmA.	123
214893	smart00896	FDX-ACB	Ferredoxin-fold anticodon binding domain. This is the anticodon binding domain found in some phenylalanyl tRNA synthetases. The domain has a ferredoxin fold, consisting of an alpha+beta sandwich with anti-parallel beta-sheets (beta-alpha-beta x2).	93
214894	smart00897	FIST	FIST N domain. The FIST N domain is a novel sensory domain, which is present in signal transduction proteins from Bacteria, Archaea and Eukarya. Chromosomal proximity of FIST-encoding genes to those coding for proteins involved in amino acid metabolism and transport suggest that FIST domains bind small ligands, such as amino acids.	196
214895	smart00898	Fapy_DNA_glyco	Formamidopyrimidine-DNA glycosylase N-terminal domain. This entry represents the catalytic domain of DNA glycosylase/AP lyase enzymes, which are involved in base excision repair of DNA damaged by oxidation or by mutagenic agents. Most damage to bases in DNA is repaired by the base excision repair pathway. These enzymes are primarily from bacteria, and have both DNA glycosylase activity and AP lyase activity. Examples include formamidopyrimidine-DNA glycosylases (Fpg; MutM) and endonuclease VIII (Nei). Formamidopyrimidine-DNA glycosylase (Fpg, MutM) is a trifunctional DNA base excision repair enzyme that removes a wide range of oxidation-damaged bases (N-glycosylase activity) and cleaves both the 3'- and 5'-phosphodiester bonds of the resulting apurinic/apyrimidinic site (AP lyase activity). Fpg has a preference for oxidised purines, excising oxidized purine bases such as 7,8-dihydro-8-oxoguanine (8-oxoG). Its AP (apurinic/apyrimidinic) lyase activity introduces nicks in the DNA strand, cleaving the DNA backbone by beta-delta elimination to generate a single-strand break at the site of the removed base with both 3'- and 5'-phosphates. Fpg is a monomer composed of 2 domains connected by a flexible hinge. The two DNA-binding motifs (a zinc finger and the helix-two-turns-helix motifs) suggest that the oxidized base is flipped out from double-stranded DNA in the binding mode and excised by a catalytic mechanism similar to that of bifunctional base excision repair enzymes. Fpg binds one ion of zinc at the C-terminus, which contains four conserved and essential cysteines. Endonuclease VIII (Nei) has the same enzyme activities as Fpg above, but with a preference for oxidized pyrimidines, such as thymine glycol, 5,6-dihydrouracil and 5,6-dihydrothymine. These proteins contain three structural domains: an N-terminal catalytic core domain, a central helix-two-turn-helix (H2TH) module and a C-terminal zinc finger (see PDB:1K82). The N-terminal catalytic domain and the C-terminal zinc finger straddle the DNA with the long axis of the protein oriented roughly orthogonal to the helical axis of the DNA. Residues that contact DNA are located in the catalytic domain and in a beta-hairpin loop formed by the zinc finger.	115
214896	smart00899	FeoA	This entry represents the core domain of the ferrous iron (Fe2+) transport protein FeoA found in bacteria. This domain also occurs at the C-terminus in related proteins. The transporter Feo is composed of three proteins: FeoA, a small, soluble SH3-domain protein probably located in the cytosol; FeoB, a large protein with a cytosolic N-terminal G-protein domain and a C-terminal integral inner-membrane domain containing two 'Gate' motifs, which likely functions as the Fe2+ permease; and FeoC, a small protein apparently functioning as an [Fe-S]-dependent transcriptional repressor. Feo allows the bacterial cell to acquire iron from its environment.	72
214897	smart00900	FMN_bind	This conserved region includes the FMN-binding site of the NqrC protein as well as the NosR and NirI regulatory proteins. 	86
214898	smart00901	FRG	This domain contains a conserved N-terminal (F/Y)RG motif. It is functionally uncharacterised. 	103
214899	smart00902	Fe_hyd_SSU	Iron hydrogenase small subunit. Many microorganisms, such as methanogenic, acetogenic, nitrogen-fixing, photosynthetic, or sulphate-reducing bacteria, metabolise hydrogen. Hydrogen activation is mediated by a family of enzymes, termed hydrogenases, which either provide these organisms with reducing power from hydrogen oxidation, or act as electron sinks. There are two hydrogenase families that differ functionally from each other: NiFe hydrogenases tend to be more involved in hydrogen oxidation, while iron-only FeFe (Fe only) hydrogenases tend to be more involved in hydrogen production. Fe only hydrogenases show a common core structure, which contains a moiety, deeply buried inside the protein, with an Fe-Fe dinuclear centre, nonproteic bridging, terminal CO and CN- ligands attached to each of the iron atoms, and a dithio moiety, which also bridges the two iron atoms and has been tentatively assigned as a di(thiomethyl)amine. This common core also harbours three [4Fe-4S] iron-sulphur clusters. In FeFe hydrogenases, as in NiFe hydrogenases, the set of iron-sulphur clusters is dispersed regularly between the dinuclear Fe-Fe centre and the molecular surface. These clusters are distant by about 1.2 nm from each other but the [4Fe-4S] cluster closest to the dinuclear centre is covalently bound to one of the iron atoms through a thiolate bridging ligand. The moiety including the dinuclear centre, the thiolate bridging ligand, and the proximal [4Fe-4S] cluster is known as the H-cluster. A channel, lined with hydrophobic amino acid side chains, nearly connects the dinuclear centre and the molecular surface. Furthermore, hydrogen-bonded water molecule sites have been identified at the interior and at the surface of the protein. The small subunit is comprised of alternating random coil and alpha helical structures that encompass the large subunit in a novel protein fold.	52
214900	smart00903	Flavin_Reduct	Flavin reductase like domain. This entry represents the FMN-binding domain found in NAD(P)H-flavin oxidoreductases (flavin reductases), a class of enzymes capable of producing reduced flavin for bacterial bioluminescence and other biological processes. This domain is also found in various other oxidoreductase and monooxygenase enzymes. This domain consists of a beta-barrel with Greek key topology, and is related to the ferredoxin reductase-like FAD-binding domain. The flavin reductases have a different dimerisation mode than that found in the PNP oxidase-like family, which also carries an FMN-binding domain with a similar topology.	147
214901	smart00904	Flavokinase	Riboflavin kinase. Riboflavin is converted into catalytically active cofactors (FAD and FMN) by the actions of riboflavin kinase, which converts it into FMN, and FAD synthetase, which adenylates FMN to FAD. Eukaryotes usually have two separate enzymes, while most prokaryotes have a single bifunctional protein that can carry out both catalyses, although exceptions occur in both cases. While eukaryotic monofunctional riboflavin kinase is orthologous to the bifunctional prokaryotic enzyme, the monofunctional FAD synthetase differs from its prokaryotic counterpart, and is instead related to the PAPS-reductase family. The bacterial FAD synthetase that is part of the bifunctional enzyme has remote similarity to nucleotidyl transferases and, hence, it may be involved in the adenylylation reaction of FAD synthetases. This entry represents riboflavin kinase, which occurs as part of a bifunctional enzyme or a stand-alone enzyme.	124
214902	smart00905	FolB	Dihydroneopterin aldolase. Dihydroneopterin aldolase catalyses the conversion of 7,8-dihydroneopterin to 6-hydroxymethyl-7,8-dihydropterin in the biosynthetic pathway of tetrahydrofolate. In the opportunistic pathogen Pneumocystis carinii, dihydroneopterin aldolase function is expressed as the N-terminal portion of the multifunctional folic acid synthesis protein (Fas). This region encompasses two domains, FasA and FasB, which are 27% amino acid identical. FasA and FasB also share significant amino acid sequence similarity with bacterial dihydroneopterin aldolases. This region consists of two tandem sequences each homologous to folB and which form tetramers.	113
214903	smart00906	Fungal_trans	Fungal specific transcription factor domain. This domain is found in a number of fungal transcription factors including transcriptional activator xlnR, yeast regulatory protein GAL4, and other transcription proteins regulating a variety of cellular and metabolic processes.	93
197975	smart00907	GDNF	GDNF/GAS1 domain. This cysteine rich domain is found in multiple copies in GDNF and GAS1 proteins. GDNF and neurturin (NTN) receptors are potent survival factors for sympathetic, sensory and central nervous system neurons. GDNF and neurturin promote neuronal survival by signaling through similar multicomponent receptors that consist of a common receptor tyrosine kinase and a member of a GPI-linked family of receptors that determines ligand specificity.	86
214904	smart00908	Gal-bind_lectin	Galactoside-binding lectin. Animal lectins display a wide variety of architectures. They are classified according to the carbohydrate-recognition domain (CRD) of which there are two main types, S-type and C-type. Galectins (previously S-lectins) bind exclusively beta-galactosides like lactose. They do not require metal ions for activity. Galectins are found predominantly, but not exclusively in mammals. Their function is unclear. They are developmentally regulated and may be involved in differentiation, cellular regulation and tissue construction.	122
214905	smart00909	Germane	Sporulation and spore germination. The GerMN domain is a region of approximately 100 residues that is found, duplicated, in the Bacillus GerM protein and is implicated in both sporulation and spore germination. The domain is found in a number of different bacterial species both alone and in association with other domains such as Amidase_3 (pfam01520), Gmad1 and Gmad2. It is predicted to have a novel alpha-beta fold.	84
214906	smart00910	HIRAN	The HIRAN protein (HIP116, Rad5p N-terminal) is found in the N-terminal regions of the SWI2/SNF2 proteins typified by HIP116 and Rad5p. HIRAN is found as a standalone protein in several bacteria and prophages, or fused to other catalytic domains, such as a nuclease of the restriction endonuclease fold and TDP1-like DNA phosphoesterases, in the eukaryotes. It has been predicted that this protein functions as a DNA-binding domain that probably recognises features associated with damaged DNA or stalled replication forks.	90
214907	smart00911	HWE_HK	HWE histidine kinase. The HWE domain is found in a subset of two-component system kinases, belonging to the same superfamily as. In. the HWE family was defined by the presence of a conserved H residue and a WXE motif and was limited to members of the proteobacteria. However, many homologues of this domain lack the WXE motif. Furthermore, homologues are found in a wide range of Gram-positive and Gram-negative bacteria as well as in several archaea.	84
214908	smart00912	Haemagg_act	Haemagglutination activity domain. This domain is suggested to be a carbohydrate-dependent haemagglutination activity site. It is found in a range of haemagglutinins and haemolysins.	119
197981	smart00913	IBN_N	Importin-beta N-terminal domain. Members of the importin-beta (karyopherin-beta) family can bind and transport cargo by themselves, or can form heterodimers with importin-alpha. As part of a heterodimer, importin-beta mediates interactions with the pore complex, while importin-alpha acts as an adaptor protein to bind the nuclear localisation signal (NLS) on the cargo through the classical NLS import of proteins. Importin-beta is a helicoidal molecule constructed from 19 HEAT repeats. Many nuclear pore proteins contain FG sequence repeats that can bind to HEAT repeats within importins, which is important for importin-beta mediated transport.	67
197982	smart00914	IDEAL	A short protein domain of unknown function. It is found at the C-terminus of proteins in the UPF0302 family. It is named after the sequence of the most conserved region in some members.	37
214909	smart00915	Jacalin	Jacalin-like lectin domain. This entry represents a mannose-binding lectin domain with a beta-prism fold consisting of three 4-stranded beta-sheets, with an internal pseudo 3-fold symmetry. Some lectins in this group stimulate distinct T- and B-cell functions, such as Jacalin, which binds to the T-antigen and acts as an agglutinin. This domain is found in 1 to 6 copies in lectins. The domain is also found in the salt-stress induced protein from rice and an animal prostatic spermine-binding protein.	128
197984	smart00916	L51_S25_CI-B8	Mitochondrial ribosomal protein L51 / S25 / CI-B8 domain. Proteins containing this domain are located in the mitochondrion and include ribosomal proteins L51 and S25. This domain is also found in mitochondrial NADH-ubiquinone oxidoreductase B8 subunit (CI-B8). It is not known whether all members of this family form part of the NADH-ubiquinone oxidoreductase and whether they are also all ribosomal proteins.	70
214910	smart00917	LeuA_dimer	LeuA allosteric (dimerisation) domain. This is the C-terminal regulatory (R) domain of alpha-isopropylmalate synthase, which catalyses the first committed step in the leucine biosynthetic pathway. This domain is an internally duplicated structure with a novel fold. It comprises two similar units that are arranged such that the two alpha-helices pack together in the centre, crossing at an angle of 34 degrees, sandwiched between the two three-stranded, antiparallel beta-sheets. The overall domain is thus constructed as a beta-alpha-beta three-layer sandwich.	131
214911	smart00918	Lig_chan-Glu_bd	Ligated ion channel L-glutamate- and glycine-binding site. This region, sometimes called the S1 domain, is the luminal domain just upstream of the first, M1, transmembrane region of transmembrane ion-channel proteins, and it binds L-glutamate and glycine. It is found in association with Lig_chan.	62
214912	smart00919	Malic_M	Malic enzyme, NAD binding domain. Malic enzymes (malate oxidoreductases) catalyse the oxidative decarboxylation of malate to form pyruvate.	231
214913	smart00920	MHC_II_alpha	Class II histocompatibility antigen, alpha domain. Class II MHC glycoproteins are expressed on the surface of antigen-presenting cells (APC), including macrophages, dendritic cells and B cells. MHC II proteins present peptide antigens that originate extracellularly from foreign bodies such as bacteria. Proteins from the pathogen are degraded into peptide fragments within the APC, which sequesters these fragments into the endosome so they can bind to MHC class II proteins, before being transported to the cell surface. MHC class II receptors display antigens for recognition by helper T cells (stimulate development of B cell clones) and inflammatory T cells (cause the release of lymphokines that attract other cells to site of infection).	81
197989	smart00921	MHC_II_beta	Class II histocompatibility antigen, beta domain. Class II MHC glycoproteins are expressed on the surface of antigen-presenting cells (APC), including macrophages, dendritic cells and B cells. MHC II proteins present peptide antigens that originate extracellularly from foreign bodies such as bacteria. Proteins from the pathogen are degraded into peptide fragments within the APC, which sequesters these fragments into the endosome so they can bind to MHC class II proteins, before being transported to the cell surface. MHC class II receptors display antigens for recognition by helper T cells (stimulate development of B cell clones) and inflammatory T cells (cause the release of lymphokines that attract other cells to site of infection).	72
214914	smart00922	MR_MLE	Mandelate racemase / muconate lactonizing enzyme, C-terminal domain. Mandelate racemase (MR) and muconate lactonizing enzyme (MLE) are two bacterial enzymes involved in aromatic acid catabolism. They catalyze mechanistically distinct reactions yet they are related at the level of their primary, quaternary (homooctamer) and tertiary structures. This entry represents the C-terminal region of these proteins.	97
197991	smart00923	MbtH	MbtH-like protein. This domain is found in the MbtH protein as well as at the N-terminus of the antibiotic synthesis protein NIKP1. This domain is about 70 amino acids long and contains 3 fully conserved tryptophan residues. Many of the members of this family are found in known antibiotic synthesis gene clusters.	49
214915	smart00924	MgtE_N	MgtE intracellular N domain. This region is the integral membrane part of the eubacterial MgtE family of magnesium transporters. It is presumed to be an intracellular domain, that may be involved in magnesium binding.	105
214916	smart00925	MltA	MltA specific insert domain. This beta barrel domain is found inserted in MltA, a murein-degrading transglycosylase enzyme. This domain may be involved in peptidoglycan binding.	153
197994	smart00926	Molybdop_Fe4S4	Molybdopterin oxidoreductase Fe4S4 domain. The molybdopterin oxidoreductase Fe4S4 domain is found in a number of reductase/dehydrogenase families, which include the periplasmic nitrate reductase precursor and the formate dehydrogenase alpha chain.	55
214917	smart00927	MutH	DNA mismatch repair enzyme MutH. MutS, MutL and MutH are the three essential proteins for initiation of methyl-directed DNA mismatch repair to correct mistakes made during DNA replication in Escherichia coli. MutH cleaves a newly synthesized and unmethylated daughter strand 5' to the sequence d(GATC) in a hemi-methylated duplex. Activation of MutH requires the recognition of a DNA mismatch by MutS and MutL.	100
197996	smart00928	NADH_4Fe-4S	NADH-ubiquinone oxidoreductase-F iron-sulfur binding region. 	46
214918	smart00929	NADH-G_4Fe-4S_3	NADH-ubiquinone oxidoreductase-G iron-sulfur binding region. 	41
197998	smart00930	NIL	This domain is found at the C-terminus of ABC transporter proteins involved in D-methionine transport as well as a number of ferredoxin-like proteins. This domain is likely to act as a substrate binding domain. The domain has been named after a conserved sequence in some members of the family.	76
197999	smart00931	NOSIC	NOSIC (NUC001) domain. This is the central domain in Nop56/SIK1-like proteins.	52
214919	smart00932	Nfu_N	Scaffold protein Nfu/NifU N terminal. This domain is found at the N terminus of NifU and NifU related proteins, and in the human Nfu protein. Both of these proteins are thought to be involved in the assembly of iron-sulphur clusters.	88
214920	smart00933	NurA	NurA nuclease. This family includes NurA, a nuclease from the hyperthermophilic archaeon Sulfolobus acidocaldarius that exhibits both single-stranded endonuclease activity and 5'-3' exonuclease activity on single-stranded and double-stranded DNA.	262
214921	smart00934	OMPdecase	Orotidine 5'-phosphate decarboxylase / HUMPS family. Orotidine 5'-phosphate decarboxylase (OMPdecase) catalyzes the last step in the de novo biosynthesis of pyrimidines, the decarboxylation of OMP into UMP. In higher eukaryotes OMPdecase is part, with orotate phosphoribosyltransferase, of a bifunctional enzyme, while the prokaryotic and fungal OMPdecases are monofunctional proteins.	212
214922	smart00935	OmpH	Outer membrane protein (OmpH-like). This family includes outer membrane proteins such as OmpH among others. Skp (OmpH) has been characterized as a molecular chaperone that interacts with unfolded proteins as they emerge in the periplasm from the Sec translocation machinery.	140
198004	smart00936	PBP5_C	Penicillin-binding protein 5, C-terminal domain. Penicillin-binding protein 5 expressed by E. coli functions as a D-alanyl-D-alanine carboxypeptidase. It is composed of two domains that are oriented at approximately right angles to each other. The N-terminal domain (pfam00768) is the catalytic domain. The C-terminal domain featured in this family is organized into a sandwich of two anti-parallel beta-sheets, and has a relatively hydrophobic surface as compared to the N-terminal domain. Its precise function is unknown; it may mediate interactions with other cell wall-synthesising enzymes, thus allowing the protein to be recruited to areas of active cell wall synthesis. It may also function as a linker domain that positions the active site in the catalytic domain closer to the peptidoglycan layer, to allow it to interact with cell wall peptides.	92
214923	smart00937	PCRF	This domain is found in peptide chain release factors. 	116
198006	smart00938	P-II	Nitrogen regulatory protein P-II. P-II modulates the activity of glutamine synthetase.	102
214924	smart00939	PepX_C	X-Pro dipeptidyl-peptidase C-terminal non-catalytic domain. This domain is found at the C-terminus of cocaine esterase CocE, several glutaryl-7-ACA acylases, and the putative diester hydrolase NonD of Streptomyces griseus (all hydrolases). The domain, which is a beta sandwich, is also found in serine peptidases belonging to MEROPS peptidase family S15: Xaa-Pro dipeptidyl-peptidases. Members of this entry, that are not characterised as peptidases, show extensive low-level similarity to the Xaa-Pro dipeptidyl-peptidases.	214
198008	smart00940	PepX_N	X-Prolyl dipeptidyl aminopeptidase PepX, N-terminal. This N-terminal domain adopts a secondary structure consisting of a helical bundle of eight alpha helices and three beta strands, with the last alpha helix connecting to the first strand of the catalytic domain. The first strand of the N-terminus also forms a small parallel beta sheet with strand five of the catalytic domain. This domain mediates dimerisation of the protein, with two proline residues present in the domain being critical for interaction.	156
214925	smart00941	PYNP_C	Pyrimidine nucleoside phosphorylase C-terminal domain. This domain is found at the C-terminal end of the large alpha/beta domain making up various pyrimidine nucleoside phosphorylases. It has slightly different conformations in different members of this family. For example, in pyrimidine nucleoside phosphorylase (PYNP) there is an added three-stranded anti-parallel beta sheet as compared to other members of the family, such as E. coli thymidine phosphorylase (TP). The domain contains an alpha/beta hammerhead fold and residues in this domain seem to be important in formation of the homodimer.	75
214926	smart00942	PriCT_1	Primase C terminal 1 (PriCT-1). This alpha helical domain is found at the C terminal of primases.	66
214927	smart00943	Prim-Pol	Bifunctional DNA primase/polymerase, N-terminal. Members of this family adopt a structure consisting of a core of antiparallel beta sheets. They are found in various bacterial hypothetical proteins, and have been shown to harbour both primase and polymerase activities.	154
214928	smart00944	Pro-kuma_activ	Pro-kumamolisin, activation domain. This domain is found at the N-terminus of peptidases belonging to MEROPS peptidase family S53 (sedolisin, clan SB). The domain adopts a ferredoxin-like fold, with an alpha+beta sandwich. Cleavage of the domain results in activation of the peptidase.	136
198013	smart00945	ProQ	ProQ/FINO family. This family includes ProQ, which is required for full activation of the osmoprotectant transporter ProP in Escherichia coli.	113
198014	smart00946	ProRS-C_1	Prolyl-tRNA synthetase, C-terminal. Members of this family are predominantly found in prokaryotic prolyl-tRNA synthetase. They contain a zinc binding site, and adopt a structure consisting of alpha helices and antiparallel beta sheets arranged in 2 layers, in a beta-alpha-beta-alpha-beta motif.	67
214929	smart00947	Pro_CA	Carbonic anhydrase. Carbonic anhydrases (CA) are zinc metalloenzymes which catalyze the reversible hydration of carbon dioxide. In Escherichia coli, CA (gene cynT) is involved in recycling carbon dioxide formed in the bicarbonate-dependent decomposition of cyanate by cyanase (gene cynS). By this action, it prevents the depletion of cellular bicarbonate. In photosynthetic bacteria and plant chloroplast, CA is essential to inorganic carbon fixation. Prokaryotic and plant chloroplast CA are structurally and evolutionary related and form a family distinct from the one which groups the many different forms of eukaryotic CA's.	154
198016	smart00948	Proteasome_A_N	Proteasome subunit A N-terminal signature. This domain is conserved in the A subunits of the proteasome complex proteins.	23
198017	smart00949	PAZ	This domain is named PAZ after the proteins Piwi Argonaut and Zwille. This domain is found in two families of proteins that are involved in post-transcriptional gene silencing. These are the Piwi family and the Dicer family, that includes the Carpel factory protein. The function of the domains is unknown but has been suggested to mediate complex formation between proteins of the Piwi and Dicer families by hetero-dimerisation. The three-dimensional structure of this domain has been solved. The PAZ domain is composed of two subdomains. One subdomain is similar to the OB fold, albeit with a different topology. The OB-fold is well known as a single-stranded nucleic acid binding fold. The second subdomain is composed of a beta-hairpin followed by an alpha-helix. The PAZ domain shows low-affinity nucleic acid binding and appears to interact with the 3' ends of single-stranded regions of RNA in the cleft between the two subdomains. PAZ can bind the characteristic two-base 3' overhangs of siRNAs, indicating that although PAZ may not be a primary nucleic acid binding site in Dicer or RISC, it may contribute to the specific and productive incorporation of siRNAs and miRNAs into the RNAi pathway.	138
214930	smart00950	Piwi	This domain is found in the protein Piwi and its relatives. The function of this domain is the dsRNA guided hydrolysis of ssRNA. Determination of the crystal structure of Argonaute reveals that PIWI is an RNase H domain, and identifies Argonaute as Slicer, the enzyme that cleaves mRNA in the RNAi RISC complex. In addition, Mg+2 dependence and production of 3'-OH and 5' phosphate products are shared characteristics of RNaseH and RISC. The PIWI domain core has a tertiary structure belonging to the RNase H family of enzymes. RNase H fold proteins all have a five-stranded mixed beta-sheet surrounded by helices. By analogy to RNase H enzymes which cleave single-stranded RNA guided by the DNA strand in an RNA/DNA hybrid, the PIWI domain can be inferred to cleave single-stranded RNA, for example mRNA, guided by double stranded siRNA.	301
214931	smart00951	QLQ	QLQ is named after the conserved Gln, Leu, Gln motif. QLQ is found at the N-terminus of SWI2/SNF2 protein, which has been shown to be involved in protein-protein interactions. QLQ has been postulated to be involved in mediating protein interactions.	36
214932	smart00952	RAP	This domain is found in various eukaryotic species, particularly in apicomplexans. In Plasmodium falciparum, the domain is found in proteins that are important in various parasite-host cell interactions. It is thought to be an RNA-binding domain.	58
214933	smart00953	RES	RES domain. This presumed protein contains 3 highly conserved polar groups that could form an active site. These are an arginine, glutamate and serine, hence the RES domain. RES is widely distributed in bacteria and is about 150 residues in length.	121
214934	smart00954	RelA_SpoT	Region found in RelA / SpoT proteins. The functions of Escherichia coli RelA and SpoT differ somewhat. RelA produces pppGpp (or ppGpp) from ATP and GTP (or GDP). SpoT degrades ppGpp, but may also act as a secondary ppGpp synthetase. The two proteins are strongly similar. In many species, a single homolog to SpoT and RelA appears responsible for both ppGpp synthesis and ppGpp degradation. (p)ppGpp is a regulatory metabolite of the stringent response, but appears also to be involved in antibiotic biosynthesis in some species.	111
214935	smart00955	RNB	This domain is the catalytic domain of ribonuclease II. 	286
214936	smart00956	RQC	This DNA-binding domain is found in the RecQ helicase among others and has a helix-turn-helix structure. The RQC domain, found only in RecQ family enzymes, is a high affinity G4 DNA binding domain.	92
214937	smart00957	SecA_DEAD	SecA DEAD-like domain. SecA protein binds to the plasma membrane where it interacts with proOmpA to support translocation of proOmpA through the membrane. SecA protein achieves this translocation, in association with SecY protein, in an ATP dependent manner. This domain represents the N-terminal ATP-dependent helicase domain, which is related to the.	380
214938	smart00958	SecA_PP_bind	SecA preprotein cross-linking domain. The SecA ATPase is involved in the insertion and retraction of preproteins through the plasma membrane. This domain has been found to cross-link to preproteins, thought to indicate a role in preprotein binding. The pre-protein cross-linking domain is comprised of two sub domains that are inserted within the ATPase domain.	114
198027	smart00959	Rho_N	Rho termination factor, N-terminal domain. The Rho termination factor disengages newly transcribed RNA from its DNA template at certain, specific transcripts. It is thought that two copies of Rho bind to RNA and that Rho functions as a hexamer of protomers. This domain is found to the N-terminus of the RNA binding domain.	43
214939	smart00960	Robl_LC7	Roadblock/LC7 domain. This family includes proteins that are about 100 amino acids long and have been shown to be related. Members of this family of proteins are associated with both flagellar outer arm dynein and Drosophila and rat brain cytoplasmic dynein. It is proposed that roadblock/LC7 family members may modulate specific dynein functions. This family also includes Golgi-associated MP1 adapter protein and MglB from Myxococcus xanthus, a protein involved in gliding motility. However the family also includes members from non-motile bacteria such as Streptomyces coelicolor, suggesting that the protein may play a structural or regulatory role.	88
198029	smart00961	RuBisCO_small	Ribulose bisphosphate carboxylase, small chain. RuBisCO (ribulose-1,5-bisphosphate carboxylase/oxygenase) is a bifunctional enzyme that catalyses both the carboxylation and oxygenation of ribulose-1,5-bisphosphate (RuBP), thus fixing carbon dioxide as the first step of the Calvin cycle. RuBisCO is the major protein in the stroma of chloroplasts, and in higher plants exists as a complex of 8 large and 8 small subunits. The function of the small subunit is unknown. While the large subunit is coded for by a single gene, the small subunit is coded for by several different genes, which are distributed in a tissue specific manner. They are transcriptionally regulated by light receptor phytochrome, which results in RuBisCO being more abundant during the day when it is required.	96
214940	smart00962	SRP54	SRP54-type protein, GTPase domain. This entry represents the GTPase domain of the 54 kDa SRP54 component, a GTP-binding protein that interacts with the signal sequence when it emerges from the ribosome. SRP54 of the signal recognition particle has a three-domain structure: an N-terminal helical bundle domain, a GTPase domain, and the M-domain that binds the 7s RNA and also binds the signal sequence. The extreme C-terminal region is glycine-rich and lower in complexity and poorly conserved between species. The GTPase domain is evolutionary related to P-loop NTPase domains found in a variety of other proteins.	197
214941	smart00963	SRP54_N	SRP54-type protein, helical bundle domain. This entry represents the N-terminal helical bundle domain of the 54 kDa SRP54 component, a GTP-binding protein that interacts with the signal sequence when it emerges from the ribosome. SRP54 of the signal recognition particle has a three-domain structure: an N-terminal helical bundle domain, a GTPase domain, and the M-domain that binds the 7s RNA and also binds the signal sequence. The extreme C-terminal region is glycine-rich and lower in complexity and poorly conserved between species.	77
214942	smart00964	STAT_int	STAT protein, protein interaction domain. STAT proteins (Signal Transducers and Activators of Transcription) are a family of transcription factors that are specifically activated to regulate gene transcription when cells encounter cytokines and growth factors. STAT proteins also include an SH2 domain.	120
198033	smart00965	STN	Secretin and TonB N terminus short domain. This is a short domain found at the N-terminus of the Secretins of the bacterial type II/III secretory system as well as the TonB-dependent receptor proteins. These proteins are involved in TonB-dependent active uptake of selective substrates.	52
198034	smart00966	SpoVT_AbrB	SpoVT / AbrB like domain. This domain is found in AbrB from Bacillus subtilis. The product of the abrB gene is an ambiactive repressor and activator of the transcription of genes expressed during the transition state between vegetative growth and the onset of stationary phase and sporulation. AbrB is thought to interact directly with the transcription initiation regions of genes under its control. AbrB contains a helix-turn-helix structure, but this domain ends before the helix-turn-helix begins. The product of the B. subtilis gene spoVT is another member of this family and is also a transcriptional regulator. DNA-binding activity in this AbrB homologue requires hexamerisation. Another family member has been isolated from the Sulfolobus solfataricus and has been identified as a homologue of bacterial repressor-like proteins. The Escherichia coli family member SohA or Prl1F appears to be bifunctional and is able to regulate its own expression as well as relieve the export block imposed by high-level synthesis of beta-galactosidase hybrid proteins.	45
214943	smart00967	SpoU_sub_bind	RNA 2'-O ribose methyltransferase substrate binding. This domain is a RNA 2'-O ribose methyltransferase substrate binding domain.	70
214944	smart00968	SMC_hinge	SMC proteins Flexible Hinge Domain. This entry represents the hinge region of the SMC (Structural Maintenance of Chromosomes) family of proteins. The hinge region is responsible for formation of the DNA interacting dimer. It is also possible that the precise structure of it is an essential determinant of the specificity of the DNA-protein interaction.	120
198037	smart00969	SOCS_box	The SOCS box acts as a bridge between specific substrate-binding domains and more generic proteins that comprise a large family of E3 ubiquitin protein ligases.	34
214945	smart00970	s48_45	Sexual stage antigen s48/45 domain. This family contains sexual stage s48/45 antigens from Plasmodium (approximately 450 residues long). These are surface proteins expressed by Plasmodium male and female gametes that have been shown to play a conserved and important role in fertilisation.	116
198039	smart00971	SATase_N	Serine acetyltransferase, N-terminal. The N-terminal domain of serine acetyltransferase has a sequence that is conserved in plants and bacteria.	105
198040	smart00972	SCPU	Spore Coat Protein U domain. This domain is found in a bacterial family of spore coat proteins as well as a family of secreted pili proteins involved in motility and biofilm formation.	59
214946	smart00973	Sec63	Sec63 Brl domain. This domain was named after the yeast Sec63 (or NPL1) (also known as the Brl domain) protein in which it was found. This protein is required for assembly of functional endoplasmic reticulum translocons. Other yeast proteins containing this domain include pre-mRNA splicing helicase BRR2, HFM1 protein and putative helicases.	314
214947	smart00974	T5orf172	This entry represents the putative helicase A859L. 	80
214948	smart00975	Telomerase_RBD	Telomerase ribonucleoprotein complex - RNA binding domain. Telomeres in most organisms are comprised of tandem simple sequence repeats. The total length of telomeric repeat sequence at each chromosome end is determined in a balance of sequence loss and sequence addition. One major influence on telomere length is the enzyme telomerase. It is a reverse transcriptase that adds these simple sequence repeats to chromosome ends by copying a template sequence within the RNA component of the enzyme. The RNA binding domain of telomerase - TRBD - is made up of twelve alpha helices and two short beta sheets. How telomerase and associated regulatory factors physically interact and function with each other to maintain appropriate telomere length is poorly understood. It is known however that TRBD is involved in formation of the holoenzyme (which performs the telomere extension) in addition to recognition and binding of RNA.	136
214949	smart00976	Telo_bind	Telomeric single stranded DNA binding POT1/CDC13. The telomere-binding protein forms a heterodimer in ciliates consisting of an alpha and a beta subunit. This complex may function as a protective cap for the single-stranded telomeric overhang. Alpha subunit consists of 3 structural domains, all with the same beta-barrel OB fold.	137
198045	smart00977	TilS_C	TilS substrate C-terminal domain. This domain is found in the tRNA(Ile) lysidine synthetase (TilS) protein.	69
214950	smart00978	Tim44	Tim44 is an essential component of the machinery that mediates the translocation of nuclear-encoded proteins across the mitochondrial inner membrane. Tim44 is thought to bind phospholipids of the mitochondrial inner membrane both by electrostatic interactions and by penetrating the polar head group region.	147
198047	smart00979	TIFY	This short possible domain is found in a variety of plant transcription factors that contain GATA domains as well as other motifs. Although previously known as the Zim domain this is now called the tify domain after its most conserved amino acids. TIFY proteins can be further classified into two groups depending on the presence (group I) or absence (group II) of a C2C2-GATA domain. Functional annotation of these proteins is still poor, but several screens revealed a link between TIFY proteins of group II and jasmonic acid-related stress response.	36
214951	smart00980	THAP	The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes.	80
214952	smart00981	THUMP	The THUMP domain is named after thiouridine synthases, methylases and PSUSs. The THUMP domain consists of about 110 amino acid residues. The structure of ThiI reveals that the THUMP has a fold unlike that of previously characterised RNA-binding domains. It is predicted that this domain is an RNA-binding domain. The THUMP domain probably functions by delivering a variety of RNA modification enzymes to their targets.	83
198050	smart00982	TRCF	This domain is found in proteins necessary for strand-specific repair in DNA such as TRCF in Escherichia coli. A lesion in the template strand blocks the RNA polymerase complex (RNAP). The RNAP-DNA-RNA complex is specifically recognised by the transcription-repair-coupling factor (TRCF) which releases RNAP and the truncated transcript.	100
214953	smart00983	TPK_B1_binding	Thiamin pyrophosphokinase, vitamin B1 binding domain. Thiamin pyrophosphokinase (TPK) catalyzes the transfer of a pyrophosphate group from ATP to vitamin B1 (thiamin) to form the coenzyme thiamin pyrophosphate (TPP). Thus, TPK is important for the formation of a coenzyme required for central metabolic functions. The structure of thiamin pyrophosphokinase suggests that the enzyme may operate by a mechanism of pyrophosphoryl transfer similar to those described for pyrophosphokinases functioning in nucleotide biosynthesis.	66
214954	smart00984	UDPG_MGDP_dh_C	UDP binding domain. The UDP-glucose/GDP-mannose dehydrogenases are a small group of enzymes which possesses the ability to catalyse the NAD-dependent 2-fold oxidation of an alcohol to an acid without the release of an aldehyde intermediate.	99
214955	smart00985	UBA_e1_C	Ubiquitin-activating enzyme e1 C-terminal domain. This presumed domain found at the C terminus of Ubiquitin-activating enzyme e1 proteins is functionally uncharacterised.	128
214956	smart00986	UDG	Uracil DNA glycosylase superfamily. 	156
214958	smart00988	UreE_N	UreE urease accessory protein, N-terminal domain. UreE is a urease accessory protein. Urease hydrolyses urea into ammonia and carbamic acid.	65
198057	smart00989	V4R	The V4R (vinyl 4 reductase) domain is a predicted small molecular binding domain, that may bind to hydrocarbons. 	61
214959	smart00990	VRR_NUC	This model contains proteins with the VRR-NUC domain. It is associated with members of the PD-(D/E)XK nuclease superfamily, which include the type III restriction modification enzymes, for example StyLTI.	108
214960	smart00991	WHEP-TRS	A conserved domain of 46 amino acids, called WHEP-TRS has been shown to exist in a number of higher eukaryote aminoacyl-transfer RNA synthetases. This domain is present one to six times in the several enzymes. There are three copies in mammalian multifunctional aminoacyl-tRNA synthetase in a region that separates the N-terminal glutamyl-tRNA synthetase domain from the C-terminal prolyl-tRNA synthetase domain, and six copies in the intercatalytic region of the Drosophila enzyme. The domain is found at the N-terminal extremity of the mammalian tryptophanyl-tRNA synthetase and histidyl-tRNA synthetase, and the mammalian, insect, nematode and plant glycyl-tRNA synthetases. This domain could contain a central alpha-helical region and may play a role in the association of tRNA-synthetases into multienzyme complexes.	56
214961	smart00992	YccV-like	Hemimethylated DNA-binding protein YccV like. YccV is a hemimethylated DNA binding protein which has been shown to regulate dnaA gene expression. The structure of one of the hypothetical proteins in this family has been solved and it forms a beta sheet structure with a terminating alpha helix.	98
198061	smart00993	YL1_C	YL1 nuclear protein C-terminal domain. This domain is found in proteins of the YL1 family. These proteins have been shown to be DNA-binding and may be a transcription factor. This domain is found in proteins that are not YL1 proteins.	30
198062	smart00994	zf-C4_ClpX	ClpX C4-type zinc finger. The ClpX heat shock protein of Escherichia coli is a member of the universally conserved Hsp100 family of proteins, and possesses a putative zinc finger motif of the C4 type. This presumed zinc binding domain is found at the N-terminus of the ClpX protein. ClpX is an ATPase which functions both as a substrate specificity component of the ClpXP protease and as a molecular chaperone. The molecular function of this domain is now known.	39
214962	smart00995	AD	Anticodon-binding domain. This domain of approximately 100 residues is conserved from plants to humans. It is frequently found in association with Lsm domain-containing proteins.	90
214963	smart00996	AdoHcyase	S-adenosyl-L-homocysteine hydrolase. 	426
198065	smart00997	AdoHcyase_NAD	S-adenosyl-L-homocysteine hydrolase, NAD binding domain. 	162
198066	smart00998	ADSL_C	Adenylosuccinate lyase C-terminus. Adenylosuccinate lyase catalyses two steps in the synthesis of purine nucleotides: the conversion of succinylaminoimidazole-carboxamide ribotide into aminoimidazole-carboxamide ribotide (the fifth step of de novo IMP biosynthesis); the formation of adenosine monophosphate (AMP) from adenylosuccinate (the final step in the synthesis of AMP from IMP). This entry represents the C-terminal, seven alpha-helical, domain of adenylosuccinate lyase.	81
198067	smart00999	Aerolysin	Aerolysin toxin. This family represents the pore forming lobe of aerolysin.	368
214964	smart01000	Aha1_N	Activator of Hsp90 ATPase, N-terminal. This domain is predominantly found in the protein 'Activator of Hsp90 ATPase', it adopts a secondary structure consisting of an N-terminal alpha-helix leading into a four-stranded meandering antiparallel beta-sheet, followed by a C-terminal alpha-helix. The two helices are packed together, with the beta-sheet curving around them. They bind to the molecular chaperone HSP82 and stimulate its ATPase activity.	134
214965	smart01001	AIRC	AIR carboxylase. Members of this family catalyse the decarboxylation of 1-(5-phosphoribosyl)-5-amino-4-imidazole-carboxylate (AIR). This family catalyses the sixth step of de novo purine biosynthesis. Some members of this family contain two copies of this domain.	152
214966	smart01002	AlaDh_PNT_C	Alanine dehydrogenase/PNT, C-terminal domain. Alanine dehydrogenase catalyzes the NAD-dependent reversible reductive amination of pyruvate into alanine.	149
214967	smart01003	AlaDh_PNT_N	Alanine dehydrogenase/PNT, N-terminal domain. Alanine dehydrogenase catalyzes the NAD-dependent reversible reductive amination of pyruvate into alanine.	133
214968	smart01004	ALAD	Delta-aminolevulinic acid dehydratase. This entry represents porphobilinogen (PBG) synthase (PBGS, or 5-aminolaevulinic acid dehydratase, or ALAD), which functions during the second stage of tetrapyrrole biosynthesis. This enzyme catalyses a Knorr-type condensation reaction between two molecules of ALA to generate porphobilinogen, the pyrrolic building block used in later steps. The structure of the enzyme is based on a TIM barrel topology made up of eight identical subunits, where each subunit binds to a metal ion that is essential for activity, usually zinc (in yeast, mammals and certain bacteria) or magnesium (in plants and other bacteria). A lysine has been implicated in the catalytic mechanism. The lack of PBGS enzyme causes a rare porphyric disorder known as ALAD porphyria, which appears to involve conformational changes in the enzyme.	321
214969	smart01005	Ala_racemase_C	Alanine racemase, C-terminal domain. Alanine racemase plays a role in providing the D-alanine required for cell wall biosynthesis by isomerising L-alanine to D-alanine. Proteins containing this domain are found in both prokaryotes and eukaryotes.	124
198074	smart01006	AlcB	Siderophore biosynthesis protein domain. AlcB is the conserved 45 residue region of one of the proteins of a complex which mediates alcaligin biosynthesis in Bordetella and aerobactin biosynthesis in E. coli and other bacteria. The protein appears to catalyse N-acylation of the hydroxylamine group in N-hydroxyputrescine with succinyl CoA - an activated mono-thioester derivative of succinic acid that is an intermediate in the Krebs cycle.	48
214970	smart01007	Aldolase_II	Class II Aldolase and Adducin N-terminal domain. This family includes class II aldolases and adducins which have not been ascribed any enzymatic function.	185
214971	smart01008	Ald_Xan_dh_C	Aldehyde oxidase and xanthine dehydrogenase, a/b hammerhead domain. Aldehyde oxidase catalyses the conversion of an aldehyde in the presence of oxygen and water to an acid and hydrogen peroxide. The enzyme is a homodimer, and requires FAD, molybdenum and two 2FE-2S clusters as cofactors. Xanthine dehydrogenase catalyses the hydrogenation of xanthine to urate, and also requires FAD, molybdenum and two 2FE-2S clusters as cofactors. This activity is often found in a bifunctional enzyme with xanthine oxidase activity too. The enzyme can be converted from the dehydrogenase form to the oxidase form irreversibly by proteolysis or reversibly through oxidation of sulphydryl groups.	107
214972	smart01009	AlkA_N	AlkA N-terminal domain. This domain is found at the N terminus of bacterial AlkA. AlkA (3-methyladenine-DNA glycosylase II) is a base excision repair glycosylase from Escherichia coli. It removes a variety of alkylated bases from DNA, primarily by removing alkylation damage from duplex and single stranded DNA. AlkA flips a 1-azaribose abasic nucleotide out of DNA. This produces a 66 degrees bend in the DNA and a marked widening of the minor groove.	113
214973	smart01010	AMPKBI	5'-AMP-activated protein kinase beta subunit, interaction domain. This region is found in the beta subunit of the 5'-AMP-activated protein kinase complex, and its yeast homologues Sip1, Sip2 and Gal83, which are found in the SNF1 kinase complex. This region is sufficient for interaction of this subunit with the kinase complex, but is not solely responsible for the interaction, and the interaction partner is not known. The isoamylase N-terminal domain is sometimes found in proteins belonging to this family.	100
198079	smart01011	AMP_N	Aminopeptidase P, N-terminal domain. This domain is structurally very similar to the creatinase N-terminal domain. However, little or no sequence similarity exists between the two families.	135
198080	smart01012	ANTAR	ANTAR (AmiR and NasR transcription antitermination regulators) is an RNA-binding domain found in bacterial transcription antitermination regulatory proteins. The majority of the domain consists of a coiled-coil.	55
198081	smart01013	APC2	Anaphase promoting complex (APC) subunit 2. The anaphase promoting complex or cyclosome (APC2) is an E3 ubiquitin ligase which is part of the SCF family of ubiquitin ligases. Ubiquitin ligases catalyse the transfer of ubiquitin from the ubiquitin conjugating enzyme (E2), to the substrate protein.	60
198082	smart01014	ARID	ARID/BRIGHT DNA binding domain. Members of the recently discovered ARID (AT-rich interaction domain) family of DNA-binding proteins are found in fungi and invertebrate and vertebrate metazoans. ARID-encoding genes are involved in a variety of biological processes including embryonic development, cell lineage gene regulation and cell cycle control. Although the specific roles of this domain and of ARID-containing proteins in transcriptional regulation are yet to be elucidated, they include both positive and negative transcriptional regulation and a likely involvement in the modification of chromatin structure. The basic structure of the ARID domain appears to be a series of six alpha-helices separated by beta-strands, loops, or turns, but the structured region may extend to an additional helix at either or both ends of the basic six. Based on primary sequence homology, they can be partitioned into three structural classes: Minimal ARID proteins that consist of a core domain formed by six alpha helices; ARID proteins that supplement the core domain with an N-terminal alpha-helix; and Extended-ARID proteins, which contain the core domain and additional alpha-helices at their N- and C-termini.	88
214974	smart01015	Arfaptin	Arfaptin-like domain. Arfaptin interacts with ARF1, a small GTPase involved in vesicle budding at the Golgi complex and immature secretory granules. The structure of arfaptin shows that upon binding to a small GTPase, arfaptin forms an elongated, crescent-shaped dimer of three-helix coiled-coils. The N-terminal region of ICA69 is similar to arfaptin.	217
214975	smart01016	Arg_tRNA_synt_N	Arginyl tRNA synthetase N terminal dom. This domain is found at the amino terminus of Arginyl tRNA synthetase, also called additional domain 1 (Add-1). It is about 140 residues long and it has been suggested that this domain will be involved in tRNA recognition.	85
214976	smart01017	Arrestin_C	Arrestin (or S-antigen), C-terminal domain. Ig-like beta-sandwich fold. Scop reports duplication with N-terminal domain. Arrestins comprise a family of closely-related proteins that includes beta-arrestin-1 and -2, which regulate the function of beta-adrenergic receptors by binding to their phosphorylated forms, impairing their capacity to activate G(S) proteins; cone photoreceptor C-arrestin (arrestin-X), which could bind to phosphorylated red/green opsins; and Drosophila phosrestins I and II, which undergo light-induced phosphorylation, and probably play a role in photoreceptor transduction.	142
198086	smart01018	B12-binding_2	B12 binding domain. Cobalamin-dependent methionine synthase is a large modular protein that catalyses methyl transfer from methyltetrahydrofolate (CH3-H4folate) to homocysteine. During the catalytic cycle, it supports three distinct methyl transfer reactions, each involving the cobalamin (vitamin B12) cofactor and a substrate bound to its own functional unit. The cobalamin cofactor plays an essential role in this reaction, accepting the methyl group from CH3-H4folate to form methylcob(III)alamin, and in turn donating the methyl group to homocysteine to generate methionine and cob(I)alamin. Methionine synthase is a large enzyme composed of four structurally and functionally distinct modules: the first two modules bind homocysteine and CH3-H4folate, the third module binds the cobalamin cofactor and the C-terminal module binds S-adenosylmethionine. The cobalamin-binding module is composed of two structurally distinct domains: a 4-helical bundle cap domain (residues 651-740 in the Escherichia coli enzyme) and an alpha/beta B12-binding domain (residues 741-896). The 4-helical bundle forms a cap over the alpha/beta domain, which acts to shield the methyl ligand of cobalamin from solvent. Furthermore, in the conversion to the active conformation of this enzyme, the 4-helical cap rotates to allow the cobalamin cofactor to bind the activation domain. The alpha/beta domain is a common cobalamin-binding motif, whereas the 4-helical bundle domain with its methyl cap is a distinctive feature of methionine synthases.	84
214977	smart01019	B3	B3 DNA binding domain. Two DNA binding proteins, RAV1 and RAV2 from Arabidopsis thaliana contain two distinct amino acid sequence domains found only in higher plant species. The N-terminal regions of RAV1 and RAV2 are homologous to the AP2 DNA-binding domain present in a family of transcription factors, while the C-terminal region exhibits homology to the highly conserved C-terminal domain, designated B3, of VP1/ABI3 transcription factors. The AP2 and B3-like domains of RAV1 bind autonomously to the CAACA and CACCTG motifs, respectively, and together achieve a high affinity and specificity of binding. It has been suggested that the AP2 and B3-like domains of RAV1 are connected by a highly flexible structure enabling the two domains to bind to the CAACA and CACCTG motifs in various spacings and orientations.	96
198088	smart01020	B2-adapt-app_C	Beta2-adaptin appendage, C-terminal sub-domain. Members of this family adopt a structure consisting of a 5 stranded beta-sheet, flanked by one alpha helix on the outer side, and by two alpha helices on the inner side. This domain is required for binding to clathrin, and its subsequent polymerisation. Furthermore, a hydrophobic patch present in the domain also binds to a subset of D-phi-F/W motif-containing proteins that are bound by the alpha-adaptin appendage domain (epsin, AP180, eps15).	111
214978	smart01021	Bac_rhodopsin	Bacteriorhodopsin-like protein. The bacterial opsins are retinal-binding proteins that provide light-dependent ion transport and sensory functions to a family of halophilic bacteria. They are integral membrane proteins believed to contain seven transmembrane (TM) domains, the last of which contains the attachment point for retinal (a conserved lysine).	233
214979	smart01022	ASCH	The ASCH domain adopts a beta-barrel fold similar to that of the PUA domain. It is thought to function as an RNA-binding domain during coactivation, RNA-processing and possibly during prokaryotic translation regulation.	99
198091	smart01023	BAF	Barrier to autointegration factor. Barrier-to-autointegration factor (BAF) is an essential protein that is highly conserved in metazoan evolution, and which may act as a DNA-bridging protein. BAF binds directly to double-stranded DNA, to transcription activators, and to inner nuclear membrane proteins, including lamin A filament proteins that anchor nuclear-pore complexes in place, and nuclear LEM-domain proteins that bind to lamin filaments and chromatin. New findings suggest that BAF has structural roles in nuclear assembly and chromatin organization, represses gene expression and might interlink chromatin structure, nuclear architecture and gene regulation in metazoans. BAF can be exploited by retroviruses to act as a host component of pre-integration complexes, which promote the integration of the retroviral DNA into the host chromosome by preventing autointegration of retroviral DNA. BAF might contribute to the assembly or activity of retroviral pre-integration complexes through direct binding to the retroviral proteins p55 Gag and matrix, as well as to DNA.	87
214980	smart01024	BCS1_N	This domain is found at the N terminus of the mitochondrial ATPase BCS1. It encodes the import and intramitochondrial sorting for the protein.	170
214981	smart01025	BEN	The BEN domain is found in diverse animal proteins. Proteins containing BEN domains are BANP/SMAR1, NAC1 and the Drosophila mod(mdg4) isoform C, the chordopoxvirus virosomal protein E5R and several proteins of polydnaviruses. Computational analysis suggests that the BEN domain mediates protein-DNA and protein-protein interactions during chromatin organisation and transcription.	80
214982	smart01026	Beach	Beige/BEACH domain. The BEACH domain was described in the BEIGE protein (D1035670) and in the highly homologous CHS protein. The BEACH domain is usually followed by a series of WD repeats. The function of the BEACH domain is unknown.	280
214983	smart01027	Beta-Casp	Beta-Casp domain. The beta-CASP domain is found C terminal to the beta-lactamase domain in pre-mRNA 3'-end-processing endonuclease. The active site of this enzyme is located at the interface of these two domains.	126
198096	smart01028	Beta-TrCP_D	D domain of beta-TrCP. This domain is found in eukaryotes, and is approximately 40 amino acids in length. It is found associated with F-box domain, WD domain. The protein that contains this domain functions as a ubiquitin ligase. Ubiquitination is required to direct proteins towards the proteasome for degradation. This protein is part of the WD40 class of F box proteins. The D domain of these F box proteins is involved in mediating the dimerisation of the protein. Dimerisation is necessary to polyubiquitinate substrates so this D domain is vital in directing substrates towards the proteasome for degradation.	40
198097	smart01029	BetaGal_dom2	Beta-galactosidase, domain 2. This is the second domain of the five-domain beta-galactosidase enzyme that altogether catalyses the hydrolysis of beta(1-3) and beta(1-4) galactosyl bonds in oligosaccharides as well as the inverse reaction of enzymatic condensation and trans-glycosylation. This domain is made up of 16 antiparallel beta-strands and an alpha-helix at its C terminus. The fold of this domain appears to be unique. In addition, the last seven strands of the domain form a subdomain with an immunoglobulin-like (I-type Ig) fold in which the first strand is divided between the two beta-sheets. In Penicillium spp. this strand is interrupted by a 12-residue insertion which forms an additional edge-strand to the second beta-sheet of the sub-domain. The remainder of the second domain forms a series of beta-hairpins at its N terminus, four strands of which are contiguous with part of the Ig-like sub-domain, forming in total a seven-stranded antiparallel beta-sheet. This domain is associated with family Glyco_hydro_35, which is N-terminal to it, but itself has no metazoan members.	182
214984	smart01030	BHD_1	Rad4 beta-hairpin domain 1. This short domain is found in the Rad4 protein. This domain binds to DNA.	54
214985	smart01031	BHD_2	Rad4 beta-hairpin domain 2. This short domain is found in the Rad4 protein. This domain binds to DNA.	56
198100	smart01032	BHD_3	Rad4 beta-hairpin domain 3. This short domain is found in the Rad4 protein. This domain binds to DNA.	75
198101	smart01033	BING4CT	BING4CT (NUC141) domain. This C terminal domain is found in the BING4 family of nucleolar WD40 repeat proteins.	80
198102	smart01034	BLUF	Sensors of blue-light using FAD. The BLUF domain has been shown to bind FAD in the AppA protein. AppA is involved in the repression of photosynthesis genes in response to blue-light.	92
214986	smart01035	BOP1NT	BOP1NT (NUC169) domain. This N terminal domain is found in BOP1-like WD40 proteins.	264
214987	smart01036	BP28CT	BP28CT (NUC211) domain. This C-terminal domain is found in BAP28-like nucleolar proteins.	151
198105	smart01037	Bet_v_1	Pathogenesis-related protein Bet v I family. This family is named after Bet v 1, the major birch pollen allergen. This protein belongs to family 10 of plant pathogenesis-related proteins (PR-10), cytoplasmic proteins of 15-17 kDa that are widespread among dicotyledonous plants. In recent years, a number of diverse plant proteins with low sequence similarity to Bet v 1 were identified. A classification by sequence similarity yielded several subfamilies related to PR-10. - Pathogenesis-related proteins PR-10: These proteins were identified as major tree pollen allergens in birch and related species (hazel, alder), as plant food allergens expressed in high levels in fruits, vegetables and seeds (apple, celery, hazelnut), and as pathogenesis-related proteins whose expression is induced by pathogen infection, wounding, or abiotic stress. Hyp-1, an enzyme involved in the synthesis of the bioactive naphthodianthrone hypericin in St. John's wort (Hypericum perforatum) also belongs to this family. Most of these proteins were found in dicotyledonous plants. In addition, related sequences were identified in monocots and conifers. - Cytokinin-specific binding proteins: These legume proteins bind cytokinin plant hormones. - (S)-Norcoclaurine synthases: These enzymes catalyse the condensation of dopamine and 4-hydroxyphenylacetaldehyde to (S)-norcoclaurine, the first committed step in the biosynthesis of benzylisoquinoline alkaloids such as morphine. - Major latex proteins and ripening-related proteins: These are proteins of unknown biological function that were first discovered in the latex of opium poppy (Papaver somniferum) and later found to be upregulated during ripening of fruits such as strawberry and cucumber. The occurrence of Bet v 1-related proteins is confined to seed plants with the exception of a cytokinin-binding protein from the moss Physcomitrella patens.	151
214988	smart01038	Bgal_small_N	Beta galactosidase small chain. This domain comprises the small chain of dimeric beta-galactosidases EC:3.2.1.23. This domain is also found in single chain beta-galactosidase.	272
198107	smart01039	BRICHOS	The BRICHOS domain is found in a variety of proteins implicated in dementia, respiratory distress and cancer. Its exact function is unknown; roles that have been proposed for the domain, which is about 100 amino acids long, include (a) targeting of the protein to the secretory pathway, (b) intramolecular chaperone-like function, and (c) assisting the specialised intracellular protease processing system. This C-terminal domain is embedded in the endoplasmic reticulum lumen, and binds to the N-terminal, transmembrane, SP_C, pfam08999 provided that it is in non-helical conformation. Thus the Brichos domain of proSP-C is a chaperone that induces alpha-helix formation of an aggregation-prone TM region.	96
214989	smart01040	Bro-N	BRO family, N-terminal domain. This family includes the N-terminus of baculovirus BRO and ALI motif proteins. The function of BRO proteins is unknown. It has been suggested that BRO-A and BRO-C are DNA binding proteins that influence host DNA replication and/or transcription. This Pfam domain does not include the characteristic invariant alanine, leucine, isoleucine motif of the ALI proteins.	89
214990	smart01041	BRO1	BRO1-like domain. This domain is found in a number of proteins including Rhophilin and BRO1. It is known to have a role in endosomal targeting. ESCRT-III subunit Snf7 binds to a conserved hydrophobic patch in the BRO1 domain that is required for protein complex formation and for the protein-sorting function of BRO1.	381
198110	smart01042	Brr6_like_C_C	Di-sulfide bridge nucleocytoplasmic transport domain. Brr6_like_C_C is the highly conserved C-terminal region of a group of proteins found in fungi. It carries four highly conserved cysteine residues. It is suggested that members of the family interact with each other via di-sulfide bridges to form a complex which is involved in nucleocytoplasmic transport.	134
198111	smart01043	BTAD	Bacterial transcriptional activator domain. Found in the DNRI/REDD/AFSR family of regulators. This region of AFSR along with the C terminal region is capable of independently directing actinorhodin production. This family contains TPR repeats.	145
214991	smart01044	Btz	CASC3/Barentsz eIF4AIII binding. This domain is found on CASC3 (cancer susceptibility candidate gene 3 protein) which is also known as Barentsz (Btz). CASC3 is a component of the EJC (exon junction complex) which is a complex that is involved in post-transcriptional regulation of mRNA in metazoa. The complex is formed by the association of four proteins (eIF4AIII, Barentsz, Mago, and Y14), mRNA, and ATP. This domain wraps around eIF4AIII and stacks against the 5' nucleotide.	106
214992	smart01045	BURP	The BURP domain is found at the C-terminus of several different plant proteins. It was named after the proteins in which it was first identified: the BNM2 clone-derived protein from Brassica napus; USPs and USP-like proteins; RD22 from Arabidopsis thaliana; and PG1beta from Lycopersicon esculentum. This domain is around 230 amino acid residues long. It possesses the following conserved features: two phenylalanine residues at its N-terminus; two cysteine residues; and four repeated cysteine-histidine motifs, arranged as: CH-X(10)-CH-X(25-27)-CH-X(25-26)-CH, where X can be any amino acid. The function of this domain is unknown.	222
198114	smart01046	c-SKI_SMAD_bind	c-SKI Smad4 binding domain. c-SKI is an oncoprotein that inhibits TGF-beta signaling through interaction with Smad proteins. This domain binds to Smad4.	95
214993	smart01047	C1_4	TFIIH C1-like domain. The carboxyl-terminal region of TFIIH is essential for transcription activity. This region binds three zinc atoms through two independent domains. The first contains a C4 zinc finger motif, whereas the second is characterised by a CX(2)CX(2-4)FCADCD motif. The solution structure of the second C-terminal domain revealed homology with the regulatory domain of protein kinase C.	49
214994	smart01048	C6	This domain of unknown function is found in a C. elegans protein. It is presumed to be an extracellular domain. The C6 domain contains six conserved cysteine residues in most copies of the domain. However some copies of the domain are missing cysteine residues 1 and 3 suggesting that these form a disulphide bridge.	98
214995	smart01049	Cache_2	Cache is an extracellular domain that is predicted to have a role in small-molecule recognition in a wide range of proteins. Members include the animal dihydropyridine-sensitive voltage-gated Ca2+ channel; alpha-2delta subunit, and various bacterial chemotaxis receptors. The name Cache comes from CAlcium channels and CHEmotaxis receptors. This domain consists of an N-terminal part with three predicted strands and an alpha-helix, and a C-terminal part with a strand dyad followed by a relatively unstructured region. The N-terminal portion of the (unpermuted) Cache domain contains three predicted strands that could form a sheet analogous to that present in the core of the PAS domain structure. Cache domains are particularly widespread in bacteria, with Vibrio cholerae. The animal calcium channel alpha-2delta subunits might have acquired a part of their extracellular domains from a bacterial source. The Cache domain appears to have arisen from the GAF-PAS fold despite their divergent functions.	91
214996	smart01050	CactinC_cactus	Cactus-binding C-terminus of cactin protein. CactinC_cactus is the C-terminal 200 residues of the cactin protein which are necessary for the association of cactin with IkappaB-cactus as one of the intracellular members of the Rel complex. The Rel (NF-kappaB) pathway is conserved in invertebrates and vertebrates. In mammals, it controls the activities of the immune and inflammatory response genes as well as viral genes, and is critical for cell growth and survival. In Drosophila, the Rel pathway functions in the innate cellular and humoral immune response, in muscle development, and in the establishment of dorsal-ventral polarity in the early embryo. Most members of the family also have a Cactin_mid domain further upstream.	129
198119	smart01051	CAMSAP_CKK	Microtubule-binding calmodulin-regulated spectrin-associated. This is the C-terminal domain of a family of eumetazoan proteins collectively defined as calmodulin-regulated spectrin-associated, or CAMSAP, proteins. CAMSAP proteins carry an N-terminal region that includes the CH domain, a central region including a predicted coiled-coil and this C-terminal, or CKK, domain - defined as being present in CAMSAP, KIAA1078 and KIAA1543. The C-terminal domain is the part of the CAMSAP proteins that binds to microtubules. The domain appears to act by producing inhibition of neurite extension, probably by blocking microtubule function. CKK represents a domain that has evolved with the metazoa. The structure of a murine hypothetical protein from RIKEN cDNA has shown the domain to adopt a mainly beta barrel structure with an associated alpha-helical hairpin.	129
214997	smart01052	CAP_GLY	Cytoskeleton-associated proteins (CAPs) are involved in the organisation of microtubules and transportation of vesicles and organelles along the cytoskeletal network. A conserved motif, CAP-Gly, has been identified in a number of CAPs, including CLIP-170 and dynactins. The crystal structure of Caenorhabditis elegans F53F4.3 protein CAP-Gly domain was recently solved. The domain contains three beta-strands. The most conserved sequence, GKNDG, is located in two consecutive sharp turns on the surface, forming the entrance to a groove.	68
198121	smart01053	CaMBD	Calmodulin binding domain. Small-conductance Ca2+-activated K+ channels (SK channels) are independent of voltage and gated solely by intracellular Ca2+. These membrane channels are heteromeric complexes that comprise pore-forming alpha-subunits and the Ca2+-binding protein calmodulin (CaM). CaM binds to the SK channel through the CaM-binding domain (CaMBD), which is located in an intracellular region of the alpha-subunit immediately carboxy-terminal to the pore. Channel opening is triggered when Ca2+ binds the EF hands in the N-lobe of CaM. The structure of this domain complexed with CaM is known. This domain forms an elongated dimer with a CaM molecule bound at each end; each CaM wraps around three alpha-helices, two from one CaMBD subunit and one from the other.	76
214998	smart01054	CaM_binding	Plant calmodulin-binding domain. The sequences featured in this family are found repeated in a number of plant calmodulin-binding proteins, and are thought to constitute the calmodulin-binding domains. Binding of the proteins to calmodulin depends on the presence of calcium ions. These proteins are thought to be involved in various processes, such as plant defence responses and stolonisation or tuberisation.	115
214999	smart01055	Cadherin_pro	Cadherin prodomain like. Cadherins are a family of proteins that mediate calcium dependent cell-cell adhesion. They are activated through cleavage of a prosequence in the late Golgi. This domain corresponds to the folded region of the prosequence, and is termed the prodomain. The prodomain shows structural resemblance to the cadherin domain, but lacks all the features known to be important for cadherin-cadherin interactions.	87
198124	smart01056	Candida_ALS_N	Cell-wall agglutinin N-terminal ligand-sugar binding. This is likely to be the sugar or ligand binding domain of the yeast alpha-agglutinins.	245
215000	smart01057	Carb_anhydrase	Eukaryotic-type carbonic anhydrase. Carbonic anhydrases are zinc metalloenzymes which catalyse the reversible hydration of carbon dioxide to bicarbonate. CAs have essential roles in facilitating the transport of carbon dioxide and protons in the intracellular space, across biological membranes and in the layers of the extracellular space; they are also involved in many other processes, from respiration and photosynthesis in eukaryotes to cyanate degradation in prokaryotes. There are five known evolutionarily distinct CA families (alpha, beta, gamma, delta and epsilon) that have no significant sequence identity and have structurally distinct overall folds. Some CAs are membrane-bound, while others act in the cytosol; there are several related proteins that lack enzymatic activity. The active site of alpha-CAs is well described, consisting of a zinc ion coordinated through 3 histidine residues and a water molecule/hydroxide ion that acts as a potent nucleophile. The enzyme employs a two-step mechanism: in the first step, there is a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide; in the second step, the active site is regenerated by the ionisation of the zinc-bound water molecule and the removal of a proton from the active site. Beta- and gamma-CAs also employ a zinc hydroxide mechanism, although at least some beta-class enzymes do not have water directly coordinated to the metal ion.	247
215001	smart01058	CarD_TRCF	CarD-like/TRCF domain. CarD is a Myxococcus xanthus protein required for the activation of light- and starvation-inducible genes. This family includes the presumed N-terminal domain. CarD interacts with the zinc-binding protein CarG, to form a complex that regulates multiple processes in Myxococcus xanthus. This family also includes a domain to the N-terminal side of the DEAD helicase of TRCF proteins. TRCF displaces RNA polymerase stalled at a lesion, binds to the damage recognition protein UvrA, and increases the template strand repair rate during transcription. This domain is involved in binding to the stalled RNA polymerase.	99
215002	smart01059	CAT	Chloramphenicol acetyltransferase. Chloramphenicol acetyltransferase (CAT) catalyzes the acetyl-CoA dependent acetylation of chloramphenicol (Cm), an antibiotic which inhibits prokaryotic peptidyltransferase activity. Acetylation of Cm by CAT inactivates the antibiotic. A histidine residue, located in the C-terminal section of the enzyme, plays a central role in its catalytic mechanism. There is a second family of CAT, evolutionarily unrelated to the main family described above. These CATs belong to the bacterial hexapeptide-repeat-containing transferase family. The crystal structure of the type III enzyme from Escherichia coli with chloramphenicol bound has been determined. CAT is a trimer of identical subunits (monomer Mr 25,000) and the trimeric structure is stabilised by a number of hydrogen bonds, some of which result in the extension of a beta-sheet across the subunit interface. Chloramphenicol binds in a deep pocket located at the boundary between adjacent subunits of the trimer, such that the majority of residues forming the binding pocket belong to one subunit while the catalytically essential histidine belongs to the adjacent subunit. His195 is appropriately positioned to act as a general base catalyst in the reaction, and the required tautomeric stabilisation is provided by an unusual interaction with a main-chain carbonyl oxygen.	202
215003	smart01060	Catalase	Catalases are antioxidant enzymes that catalyse the conversion of hydrogen peroxide to water and molecular oxygen, serving to protect cells from its toxic effects. Hydrogen peroxide is produced as a consequence of oxidative cellular metabolism and can be converted to the highly reactive hydroxyl radical via transition metals, this radical being able to damage a wide variety of molecules within a cell, leading to oxidative stress and cell death. Catalases act to neutralise hydrogen peroxide toxicity, and are produced by all aerobic organisms ranging from bacteria to man. Most catalases are mono-functional, haem-containing enzymes, although there are also bifunctional haem-containing peroxidase/catalases that are closely related to plant peroxidases, and non-haem, manganese-containing catalases that are found in bacteria.	373
215004	smart01061	CAT_RBD	CAT RNA binding domain. This RNA binding domain is found at the amino terminus of transcriptional antitermination proteins such as BglG, SacY and LicT. These proteins control the expression of sugar metabolising operons in Gram+ and Gram- bacteria. This domain has been called the CAT (Co-AntiTerminator) domain. It binds as a dimer to a short Ribonucleotidic Anti-Terminator (RAT) hairpin, each monomer interacting symmetrically with both strands of the RAT hairpin. In the full-length protein, CAT is followed by two phosphorylatable PTS regulation domains that modulate the RNA binding activity of CAT. Upon activation, the dimeric proteins bind to RAT targets in the nascent mRNA, thereby preventing abortive dissociation of the RNA polymerase from the DNA template.	55
198130	smart01062	Ca_chan_IQ	Voltage gated calcium channel IQ domain. Voltage gated calcium channels control cellular calcium entry in response to changes in membrane potential. The isoleucine-glutamine (IQ) motif in the voltage gated calcium channel IQ domain interacts with hydrophobic pockets of Ca2+/calmodulin. The interaction regulates two self-regulatory calcium-dependent feedback mechanisms: calcium-dependent inactivation (CDI) and calcium-dependent facilitation (CDF).	31
215005	smart01063	CBM49	Carbohydrate binding domain CBM49. This domain is found at the C terminus of cellulases, and in vitro binding studies have shown it to bind to crystalline cellulose.	84
198132	smart01064	CBM_10	Cellulose or protein binding domain. This domain is found in two distinct sets of proteins with different functions. Those found in aerobic bacteria bind cellulose (or other carbohydrates); but in anaerobic fungi they are protein binding domains, referred to as dockerin domains or docking domains. They are believed to be responsible for the assembly of a multiprotein cellulase/hemicellulase complex, similar to the cellulosome found in certain anaerobic bacteria.	29
215006	smart01065	CBM_2	Starch binding domain. 	88
198134	smart01066	CBM_25	Carbohydrate binding domain. 	83
215007	smart01067	CBM_3	Cellulose binding domain. 	83
215008	smart01068	CBM_X	Putative carbohydrate binding domain. 	62
215009	smart01069	CDC37_C	Cdc37 C terminal domain. Cdc37 is a protein required for the activity of numerous eukaryotic protein kinases. This domain corresponds to the C terminal domain whose function is unclear. It is found C terminal to the Hsp90 chaperone (heat shock protein 90) binding domain pfam08565 and the N terminal kinase binding domain of Cdc37.	93
215010	smart01070	CDC37_M	Cdc37 Hsp90 binding domain. Cdc37 is a molecular chaperone required for the activity of numerous eukaryotic protein kinases. This domain corresponds to the Hsp90 chaperone (heat shock protein 90) binding domain of Cdc37. It is found between the N terminal Cdc37 domain, which is predominantly involved in kinase binding, and the C terminal domain of Cdc37 whose function is unclear.	155
198139	smart01071	CDC37_N	Cdc37 N terminal kinase binding. Cdc37 is a molecular chaperone required for the activity of numerous eukaryotic protein kinases. This domain corresponds to the N terminal domain, which binds predominantly to protein kinases and is found N terminal to the Hsp90 (heat shock protein 90) binding domain. Expression of a construct consisting of only the N-terminal domain of Schizosaccharomyces pombe Cdc37 results in cellular viability. This indicates that interactions with the cochaperone Hsp90 may not be essential for Cdc37 function.	154
215011	smart01072	CDC48_2	Cell division protein 48 (CDC48) domain 2. This domain has a double psi-beta barrel fold and includes VCP-like ATPase and N-ethylmaleimide sensitive fusion protein N-terminal domains. Both the VAT and NSF N-terminal functional domains consist of two structural domains of which this is at the C-terminus. The VAT-N domain found in AAA ATPases is a 185-residue substrate recognition domain.	64
215012	smart01073	CDC48_N	Cell division protein 48 (CDC48) N-terminal domain. This domain has a double psi-beta barrel fold and includes VCP-like ATPase and N-ethylmaleimide sensitive fusion protein N-terminal domains. Both the VAT and NSF N-terminal functional domains consist of two structural domains of which this is at the N-terminus. The VAT-N domain found in AAA ATPases is a 185-residue substrate recognition domain.	82
215013	smart01074	Cdc6_C	CDC6, C terminal. The C terminal domain of CDC6 assumes a winged helix fold, with a five alpha-helical bundle (alpha15-alpha19) structure, backed on one side by three beta strands (beta6-beta8). It has been shown that this domain acts as a DNA-localisation factor, however its exact function is, as yet, unknown. Putative functions include: (1) mediation of protein-protein interactions and (2) regulation of nucleotide binding and hydrolysis. Mutagenesis studies have shown that this domain is essential for appropriate Cdc6 activity.	84
215014	smart01075	CDT1	DNA replication factor CDT1 like. CDT1 is a component of the replication licensing system and promotes the loading of the mini-chromosome maintenance complex onto chromatin. Geminin is an inhibitor of CDT1 and prevents inappropriate re-initiation of replication on an already fired origin. This region of CDT1 binds to Geminin.	164
198144	smart01076	CG-1	CG-1 domains are highly conserved domains of about 130 amino-acid residues. The domains contain a predicted bipartite NLS and are named after a partial cDNA clone isolated from parsley encoding a sequence-specific DNA-binding protein. CG-1 domains are associated with CAMTA proteins (for CAlModulin-binding Transcription Activator), which are transcription factors containing a calmodulin-binding domain and ankyrin (ANK) motifs.	118
198145	smart01077	Cg6151-P	Uncharacterized conserved protein CG6151-P. This is a family of small, less than 200 residue long, proteins which are named as CG6151-P proteins that are conserved from fungi to humans. The function is unknown. The fungal members have a characteristic ICP sequence motif. Some members are annotated as putative clathrin-coated vesicle protein but this could not be defined.	111
198146	smart01078	CGGC	This putative domain contains a quite highly conserved sequence of CGGC in its central region. The domain has many conserved cysteines and histidines suggestive of a zinc binding function.	106
215015	smart01079	CHASE	This domain is found in the extracellular portion of receptor-like proteins - such as serine/threonine kinases and adenylyl cyclases. Predicted to be a ligand binding domain.	176
215016	smart01080	CHASE2	CHASE2 is an extracellular sensory domain, which is present in various classes of transmembrane receptors that are parts of signal transduction pathways in bacteria. Specifically, CHASE2 domains are found in histidine kinases, adenylate cyclases, serine/threonine kinases and predicted diguanylate cyclases/phosphodiesterases. Environmental factors that are recognised by CHASE2 domains are not known at this time.	303
215017	smart01081	CHB_HEX	Putative carbohydrate binding domain. This domain represents the N terminal domain in chitobiases and beta-hexosaminidases EC:3.2.1.52. It is composed of a beta sandwich structure that is similar in structure to the cellulose binding domain of cellulase from Cellulomonas fimi. This suggests that this may be a carbohydrate binding domain.	160
198150	smart01082	CHZ	Histone chaperone domain CHZ. This domain is highly conserved from yeasts to humans and is part of the chaperone protein HIRIP3 in vertebrates which interacts with the H3.3 chaperone HIRA, implicated in histone replacement during transcription. N- and C- termini of Chz family members are relatively divergent but do contain similar acidic stretches rich in Glu/Asp residues, characteristic of all histone chaperones.	38
198151	smart01083	Cir_N	N-terminal domain of CBF1 interacting co-repressor CIR. This is a 45 residue conserved region at the N-terminal end of a family of proteins referred to as CIRs (CBF1-interacting co-repressors). CBF1 (centromere-binding factor 1) acts as a transcription factor that causes repression by binding specifically to GTGGGAA motifs in responsive promoters, and it requires CIR as a co-repressor. CIR binds to histone deacetylase and to SAP30 and serves as a linker between CBF1 and the histone deacetylase complex.	37
198152	smart01084	CKS	Cyclin-dependent kinase regulatory subunit. Cyclin-dependent kinase regulatory subunit.	70
198153	smart01085	CK_II_beta	Casein kinase II regulatory subunit. 	184
198154	smart01086	ClpB_D2-small	C-terminal, D2-small domain, of ClpB protein. This is the C-terminal domain of ClpB protein, referred to as the D2-small domain, and is a mixed alpha-beta structure. Compared with the D1-small domain (included in AAA) it lacks the long coiled-coil insertion, and instead of helix C4 contains a beta-strand (e3) that is part of a three stranded beta-pleated sheet. In Thermus thermophilus the whole protein forms a hexamer with the D1-small and D2-small domains located on the outside of the hexamer, with the long coiled-coil being exposed on the surface. The D2-small domain is essential for oligomerisation, forming a tight interface with the D2-large domain of a neighbouring subunit and thereby providing enough binding energy to stabilise the functional assembly. The domain is associated with two Clp_N at the N-terminus as well as AAA and AAA_2.	90
215018	smart01087	COG6	Conserved oligomeric complex COG6. COG6 is a component of the conserved oligomeric golgi complex, which is composed of eight different subunits and is required for normal golgi morphology and localisation.	598
198156	smart01088	Col_cuticle_N	Nematode cuticle collagen N-terminal domain. The function of this domain is unknown. It is found in the N-terminal region of nematode cuticle collagens. Cuticle is a tough elastic structure secreted by hypodermal cells and is primarily composed of collagen proteins.	53
198157	smart01089	Connexin_CCC	Gap junction channel protein cysteine-rich domain. 	67
215019	smart01090	Copper-fist	Copper fist is an N-terminal domain involved in copper-dependent DNA binding. The domain is named for its resemblance to a fist. It can be found in some fungal transcription factors. These proteins activate the transcription of the metallothionein gene in response to copper. Metallothionein maintains copper levels in yeast. The copper fist domain is similar in structure to metallothionein itself, and on copper binding undergoes a large conformational change, which allows DNA binding.	38
215020	smart01091	CorC_HlyC	Transporter associated domain. This small domain is found at the C-terminus of a family of proteins that also contain the DUF21 domain and two CBS domains. It is also found at the C-terminus of some Na+/H+ antiporters and in CorC, which is involved in magnesium and cobalt efflux. The function of this domain is uncertain, but it might be involved in modulating transport of ion substrates.	78
215021	smart01092	CO_deh_flav_C	CO dehydrogenase flavoprotein C-terminal domain. 	102
198161	smart01093	CP12	CP12 domain. 	72
215022	smart01094	CpcD	CpcD/allophycocyanin linker domain. 	51
198163	smart01095	Cpl-7	Cpl-7 lysozyme C-terminal domain. This domain was originally found in the C-terminal moiety of the Cpl-7 lysozyme encoded by the Streptococcus pneumoniae bacteriophage Cp-7. It is assumed that these repeats represent cell wall binding motifs although no direct evidence has been obtained so far.	42
198164	smart01096	CPSase_L_D3	Carbamoyl-phosphate synthetase large chain, oligomerisation domain. Carbamoyl-phosphate synthase catalyses the ATP-dependent synthesis of carbamyl-phosphate from glutamine or ammonia and bicarbonate. The carbamoyl-phosphate synthase (CPS) enzyme in prokaryotes is a heterodimer of a small and large chain.	124
198165	smart01097	CPSase_sm_chain	Carbamoyl-phosphate synthase small chain, CPSase domain. The carbamoyl-phosphate synthase domain is in the amino terminus of the protein. Carbamoyl-phosphate synthase catalyses the ATP-dependent synthesis of carbamyl-phosphate from glutamine or ammonia and bicarbonate. This important enzyme initiates both the urea cycle and the biosynthesis of arginine and/or pyrimidines. The carbamoyl-phosphate synthase (CPS) enzyme in prokaryotes is a heterodimer of a small and large chain. The small chain promotes the hydrolysis of glutamine to ammonia, which is used by the large chain to synthesise carbamoyl phosphate. The small chain has a GATase domain in the carboxyl terminus.	130
215023	smart01098	CPSF73-100_C	This is the conserved C-terminal region of the pre-mRNA 3'-end-processing and polyadenylation factor proteins CPSF-73 and CPSF-100. The exact function of this domain is not known.	212
198167	smart01099	CPW_WPC	This group of sequences is defined by a domain of about 61 residues in length with six well-conserved cysteine residues and six well-conserved aromatic sites. The domain can be found in tandem repeats, and is known so far only in Plasmodium falciparum. It is named for motifs of CPxxW and (less well conserved) WPC. Its function is unknown.	60
215024	smart01100	CRAL_TRIO_N	CRAL/TRIO, N-terminal domain. 	48
215025	smart01101	CRISPR_assoc	This domain forms an anti-parallel beta strand structure with flanking alpha helical regions. 	215
198170	smart01102	CRM1_C	CRM1 C terminal. CRM1 (also known as Exportin1) mediates the nuclear export of proteins bearing a leucine-rich nuclear export signal (NES). CRM1 forms a complex with the NES-containing protein and the small GTPase Ran. This region forms an alpha-helical structure composed of six helical hairpin motifs that are structurally similar to the HEAT repeat but share little sequence similarity with it.	321
198171	smart01103	CRS1_YhbY	Escherichia coli YhbY is associated with pre-50S ribosomal subunits, which implies a function in ribosome assembly. GFP fused to a single-domain CRM protein from maize localises to the nucleolus, suggesting that an analogous activity may have been retained in plants. A CRM domain containing protein in plant chloroplasts has been shown to function in group I and II intron splicing. In vitro experiments with an isolated maize CRM domain have shown it to have RNA binding activity. These and other results suggest that the CRM domain evolved in the context of ribosome function prior to the divergence of Archaea and Bacteria, that this function has been maintained in extant prokaryotes, and that the domain was recruited to serve as an RNA binding module during the evolution of plant genomes. YhbY has a fold similar to that of the C-terminal domain of translation initiation factor 3 (IF3C), which binds to 16S rRNA in the 30S ribosome.	84
215026	smart01104	CTD	Spt5 C-terminal nonapeptide repeat binding Spt4. The C-terminal domain of the transcription elongation factor protein Spt5 is necessary for binding to Spt4 to form the functional complex that regulates early transcription elongation by RNA polymerase II. The complex may be involved in pre-mRNA processing through its association with mRNA capping enzymes. This CTD domain carries a regular nonapeptide repeat that can be present in up to 18 copies, as in S. pombe. The repeat has a characteristic TPA motif.	121